Table of Contents

  1. Gather
    1.1 Webpage Data with wget
    1.2 Twitter Data with twarc
    1.3 Instagram Data with instagram-scraper
  2. Analyze
  3. Publish

1. Gather

1.1 Webpage Data with wget

wget is a free command-line utility for non-interactive downloading of files from the web. In this workshop, we cover only how to download digital files from public websites, such as government archives.
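As a rough illustration of a non-interactive download, the sketch below assembles a wget command for fetching files of a given type from a public site. The function name `build_wget_command`, its defaults, and the URL `https://example.gov/archive/` are placeholders invented for this example and are not part of the workshop material. The wget options used (`--recursive`, `--level`, `--no-parent`, `--accept`, `--wait`, `--directory-prefix`) are standard GNU Wget flags.

```python
import shlex


def build_wget_command(url, dest_dir="downloads", extensions=("pdf",),
                       wait_seconds=1, depth=2):
    """Return an argv list for a polite, recursive wget download.

    All names and defaults here are illustrative assumptions.
    """
    return [
        "wget",
        "--recursive",                         # follow links from the start page
        f"--level={depth}",                    # limit how deep the recursion goes
        "--no-parent",                         # never climb above the start directory
        f"--accept={','.join(extensions)}",    # keep only these file types
        f"--wait={wait_seconds}",              # pause between requests
        f"--directory-prefix={dest_dir}",      # save everything under dest_dir
        url,
    ]


if __name__ == "__main__":
    # Hypothetical URL; replace it with the public site you want to archive.
    cmd = build_wget_command("https://example.gov/archive/")
    print(shlex.join(cmd))
```

The script only prints the command so you can inspect it first. To run the download, paste the printed line into a terminal or pass the list to `subprocess.run(cmd, check=True)`.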

1.2 Twitter Data with twarc

twarc is a command-line tool and Python library for archiving Twitter JSON data. In addition to collecting tweets, twarc can help you collect users and trends and hydrate tweet IDs. In this workshop, we cover only how to set up twarc and gather Twitter data by search term, hashtag, or account.
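Hydration turns a list of tweet IDs back into full tweet JSON, so it can help to see where such an ID list comes from. The sketch below assumes the collected data is line-oriented JSON (one tweet object per line) and that each tweet carries an `id_str` field, as in Twitter API v1.1 output; other twarc versions may use a different layout. The function name `extract_tweet_ids` and the sample records are invented for illustration.

```python
import json


def extract_tweet_ids(lines):
    """Return unique tweet IDs, in first-seen order, from line-oriented JSON.

    Assumes one tweet object per line with an "id_str" field (an assumption
    about the collected data, not something twarc guarantees for every version).
    """
    ids = []
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        tweet_id = tweet.get("id_str")
        if tweet_id and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


if __name__ == "__main__":
    # Made-up sample records standing in for a collected file.
    sample = [
        '{"id_str": "1001", "full_text": "first"}',
        '{"id_str": "1002", "full_text": "second"}',
        '{"id_str": "1001", "full_text": "first"}',
    ]
    for tweet_id in extract_tweet_ids(sample):
        print(tweet_id)
```

With a real collection you would pass an open file instead of the sample list, for example `extract_tweet_ids(open("tweets.jsonl"))`, where `tweets.jsonl` is a hypothetical file name.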

1.3 Instagram Data with instagram-scraper

instagram-scraper is a command-line application written in Python that scrapes and downloads an Instagram user’s photos, videos, and metadata. In this workshop, we cover only how to set up instagram-scraper and gather Instagram data, such as photos, videos, and metadata from user posts, by account or hashtag.
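The difference between gathering by account and by hashtag can be sketched as a small command builder. This is an assumption-laden example: the flags `--tag`, `--media-metadata`, `--maximum`, and `--destination` are taken from the project's README as best recalled, so confirm them with `instagram-scraper --help` for your installed version. The function name `build_scraper_command`, its defaults, and the targets shown are hypothetical.

```python
import shlex


def build_scraper_command(target, by_hashtag=False, dest_dir="instagram",
                          max_items=50):
    """Return an argv list for instagram-scraper.

    target is an account name, or a hashtag without the leading '#'
    when by_hashtag is True. Flag names are assumed from the README.
    """
    cmd = ["instagram-scraper", target]
    if by_hashtag:
        cmd.append("--tag")                    # treat target as a hashtag
    cmd += [
        "--media-metadata",                    # also save post metadata as JSON
        "--maximum", str(max_items),           # cap the number of items fetched
        "--destination", dest_dir,             # folder for downloaded files
    ]
    return cmd


if __name__ == "__main__":
    # Hypothetical account and hashtag names.
    print(shlex.join(build_scraper_command("example_account")))
    print(shlex.join(build_scraper_command("archives", by_hashtag=True)))
```

The script prints the commands rather than running them, so nothing is downloaded until you execute a printed line yourself.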

2. Analyze

Coming soon…

3. Publish

Coming soon…