Analyzing Twitter Data

Oxford Spring School, 2022

Christopher Barrie

Introduction

Day 1

  • General intro.
  • Counting tweets
    • Applied examples

    • Group work

  • Getting tweets
    • Applied examples

    • Group work

Introduction

Day 2

  • Getting friends/followers
    • Applied examples

    • Group work

  • Getting data outside the API
    • Applied examples

    • Group work

  • Replication task
    • Group work

Introduction

Day 3

  • Group projects
    • 1.5 hrs prep.

    • 1.5 hrs pres.

      • Research question

      • Data you targeted

      • Empirical strategy (e.g., experimental, observational, inferential)

      • Findings

Analyzing Twitter data

Why?

  1. Important site of political communication
  2. Important site of social interaction
  3. It’s very accessible

Especially, since intro. of Academic Research Product Track

Twitter Academic API

  • 10m tweets monthly
  • Full historical archive
  • Authorization requires…

Authorization

Go here

  1. Do you have an academic profile?
  2. Can you describe your research?
    • Will you share with government entity?
    • Is the research non-commercial?
    • Is it conducted at individual level?
    • Will you make non-anon. data available?

What you’ll see

Or a live example…

What you’ll see

This is what the Developer Portal looks like

Getting started…

Don’t forget to Restart R

After we’ve done that…

We’re good to go 🥳

After we’ve done that…

We’re good to go 🥳

But before we do that…

What’s an API anyway?

  • We’re talking about Web APIs
  • Application Programming Interface”
  • Helpful to think about compared to scraping…

Scraping Twitter

Using the Twitter API

Requesting versus scraping

  • Scraping:
    • Extracts visible content from screen

    • Often trial and error as not optimized for serving data ( legibility not readability)

  • API calls
    • Requests content from a “blank” version of page

    • Delivers data in more readily usable format (JSON, csv etc.)

What does it look like?

  • Essentially, a long URL string…

  • TODO Insert example API request here

What is it made of?

  1. A query
  2. An endpoint
  3. Optional parameters

What is it made of?

my_query <- "#BLM lang:EN"

endpoint_url <- "https://api.twitter.com/2/tweets/search/all"

params <- list(
  "query" = my_query,
  "start_time" = "2021-01-01T00:00:00Z",
  "end_time" = "2021-07-31T23:59:59Z",
  "max_results" = 20
)

What is it made of?

my_query <- "#BLM lang:EN"

endpoint_url <- "https://api.twitter.com/2/tweets/search/all"

params <- list(
  "query" = my_query,
  "start_time" = "2021-01-01T00:00:00Z",
  "end_time" = "2021-07-31T23:59:59Z",
  "max_results" = 20
)

params
$query
[1] "#BLM lang:EN"

$start_time
[1] "2021-01-01T00:00:00Z"

$end_time
[1] "2021-07-31T23:59:59Z"

$max_results
[1] 20

But we don’t often do this…

So what we’ll do is …

  • We’ll focus on academictwitteR
  • But we’ll also be thinking later more generally about:
    • APIs

    • research design