Week 5

Computational Sociology

Christopher Barrie

Introduction

  1. Housekeeping
  2. Linked surveys and social media data

Introduction: Surveys

  • Survey background

  • Innovations in digital surveys

  • Linked survey designs

Introduction: Surveys

Three main types of survey sample:

  1. Representative
  2. Experimental
  3. Purposive

Representative Surveys

  • Random sample from a larger population (randomization into sample)

    • National population

      • Internet-based techniques: 1) snail-mailed online survey; 2) non-probability sampling + matching/post-stratification; 3) active sampling from online panel (e.g., YouGov)

    • (Online) subpopulation
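The post-stratification step in technique 2 can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not a full weighting workflow: the population shares, the age groups, and the sample itself are all hypothetical numbers made up for the example.

```r
# Hypothetical population shares by age group (e.g., taken from a census)
population_share <- c(young = 0.30, middle = 0.45, old = 0.25)

# Hypothetical non-probability online sample in which young people are over-represented
sample_data <- data.frame(
  age_group = c(rep("young", 60), rep("middle", 30), rep("old", 10)),
  supports_policy = c(rep(1, 45), rep(0, 15),  # young: 75% support
                      rep(1, 15), rep(0, 15),  # middle: 50% support
                      rep(1, 2),  rep(0, 8)),  # old: 20% support
  stringsAsFactors = FALSE
)

# Share of each age group in the sample
sample_share <- prop.table(table(sample_data$age_group))

# Post-stratification weight for each cell: population share / sample share
cell_weight <- population_share[names(sample_share)] / as.numeric(sample_share)

# Attach the cell weight to every respondent
sample_data$weight <- cell_weight[as.character(sample_data$age_group)]

# Compare the unweighted and the post-stratified estimate
raw_estimate <- mean(sample_data$supports_policy)
weighted_estimate <- weighted.mean(sample_data$supports_policy, sample_data$weight)
```

Here the unweighted estimate is pulled upward by the over-sampled young respondents; the weighted estimate rescales each group to its assumed population share. Real applications typically use many more cells, and often model-based variants.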

Linked Surveys: Guess, Nagler, and Tucker (2019)

Surveying

Gathering social media data

Linking

The data

Some considerations

  1. Observation effects?
  2. Deletion
  3. Response bias
  4. 💸

Introduction: Social media data

How do we get it?

  1. APIs
  2. Web scraping
  3. Private agreement

APIs

  • “Application Programming Interface”

    • A common language, allowing one computer, or piece of software, to speak to another
  • Normally for social science research: “web APIs”

APIs versus web scraping

  • When web scraping:

    • data is optimized for screen legibility

    • not machine legibility

APIs versus web scraping

  • Compare to web scraping, where we:

    • get info already displayed on screen according to location markers/selectors
  • In contrast, with APIs we:

    • request information based on a set of instructions,

    • the logic of which is set by the provider (the platform in question)
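The contrast can be made concrete with a small sketch. The HTML fragment, the CSS class names, and the JSON payload below are hypothetical stand-ins written for this example; they are not the output of any real platform. The sketch assumes the rvest and jsonlite packages are installed.

```r
library(rvest)     # HTML parsing with CSS selectors
library(jsonlite)  # JSON parsing

# Web scraping: the content is laid out for a screen, so we locate it with a selector.
# Hypothetical page fragment; the class names are made up.
page_html <- '<html><body>
  <div class="post">
    <span class="author">@user_a</span>
    <p class="post-text">Hello world</p>
  </div>
</body></html>'

page <- read_html(page_html)
scraped_text <- html_text(html_element(page, "p.post-text"))

# API: the response is already structured for machines, so we read named fields.
# Hypothetical JSON payload in the shape many web APIs return.
api_json <- '{"data": [{"id": "1", "author": "user_a", "text": "Hello world"}]}'
api_result <- fromJSON(api_json)
api_text <- api_result$data$text
```

Both routes recover the same text, but the scraper depends on the page layout staying the same, while the API route depends on the fields the platform chooses to expose.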

Using APIs: pre-packaged

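library(academictwitteR)

# Query the full-archive search endpoint for tweets matching "BLM"
tweetsblm <- get_all_tweets(
  query = "BLM",
  start_tweets = "2020-01-01T00:00:00Z",  # start of the time window (UTC)
  end_tweets = "2020-01-05T00:00:00Z",    # end of the time window (UTC)
  bearer_token = get_bearer(),            # bearer token read from the environment
  file = "data/blmtweets.rds",            # save the resulting data frame as .rds
  data_path = "data/json_data/",          # store the raw JSON responses here
  n = 500                                 # upper limit on tweets collected
)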

Using APIs: user-written

library(httr)

# Endpoint for full-archive tweet search
endpoint_url <- "https://api.twitter.com/2/tweets/search/all"

# Query parameters: search term and time window (UTC)
params <- list(
  "query" = "#happymonday",
  "start_time" = "2021-01-01T00:00:00Z",
  "end_time" = "2021-07-31T23:59:59Z"
)

Using APIs: user-written

https://api.twitter.com/2/tweets/search/all?query=%23happymonday&start_time=2021-01-01T00%3A00%3A00Z&end_time=2021-07-31T23%3A59%3A59Z
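The URL above is the endpoint with the parameter list appended as a percent-encoded query string. A minimal sketch of that step with httr follows. The helper `send_request()` and its `bearer_token` argument are hypothetical names introduced for illustration; the helper is defined but never called here, since sending the request needs valid credentials and network access.

```r
library(httr)

endpoint_url <- "https://api.twitter.com/2/tweets/search/all"

params <- list(
  query = "#happymonday",
  start_time = "2021-01-01T00:00:00Z",
  end_time = "2021-07-31T23:59:59Z"
)

# Build the request URL: httr percent-encodes the values
# ("#" becomes %23, ":" becomes %3A) and joins them with "&"
request_url <- modify_url(endpoint_url, query = params)

# Hypothetical helper (an assumption, not part of the slides): sends the
# request with a bearer token in the Authorization header and parses the JSON
send_request <- function(bearer_token) {
  response <- GET(
    endpoint_url,
    add_headers(Authorization = paste("Bearer", bearer_token)),
    query = params
  )
  stop_for_status(response)  # raise an R error on HTTP errors
  content(response, as = "parsed", type = "application/json")
}
```

Printing `request_url` lets you check the encoded query string against the URL shown on the slide before any request is sent.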

Analyzing social media data: González-Bailón and De Domenico (2021)

A note on computational thinking

This week:

  • We approach problems using combinations of data

    • We combine surveys with social media data

    • We combine social media data with data on audience reach

  • We use approximations to get at a question in the social world

    • e.g., we use measures of centrality and audience reach to infer visibility (w/o directly observing)

    • e.g., we use software solutions (browser extensions) to observe behaviour in the absence of direct observation

  • We adapt non-social-science data to scientific purposes
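The idea of using centrality as a stand-in for visibility can be sketched with a toy network. This is an illustration only, not the procedure used in the cited paper: the edge list is made up, and both the choice of in-degree as the centrality measure and the direction of the edges (from the retweeting account to the retweeted account) are assumptions. The sketch assumes the igraph package is installed.

```r
library(igraph)

# Hypothetical retweet edge list: "from" retweets "to"
retweets <- data.frame(
  from = c("a", "b", "d", "c"),
  to   = c("c", "c", "c", "a")
)

# Build a directed graph from the edge list
g <- graph_from_data_frame(retweets, directed = TRUE)

# In-degree: how many accounts retweeted each account.
# We never observe who actually saw a post; in-degree is a proxy for visibility.
in_degree <- degree(g, mode = "in")

# Account with the highest in-degree in this toy network
most_visible <- names(which.max(in_degree))
```

In this toy network account "c" is retweeted by three others and so ranks as most visible. With real data one would typically compare several centrality measures and combine them with audience-reach data.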

References

González-Bailón, Sandra, and Manlio De Domenico. 2021. “Bots Are Less Central Than Verified Accounts During Contentious Political Events.” Proceedings of the National Academy of Sciences 118 (11). https://doi.org/10.1073/pnas.2013443118.
Guess, Andrew, Jonathan Nagler, and Joshua Tucker. 2019. “Less Than You Think: Prevalence and Predictors of Fake News Dissemination on Facebook.” Science Advances 5 (1). https://doi.org/10.1126/sciadv.aau4586.