QTM 385 - Experimental Methods

Lecture 22 - Survey Experiments for Sensitive Topics

Danilo Freire

Emory University

Hello, everyone! 😉

Notes about assignment 09 and group project

Notes

Assignment 09

  • Some of you reported issues with the table in question 03 in assignment 09
  • The table was coded correctly, but it does not render properly in RStudio Visual Mode
  • It has been fixed 🤓
City     Resume quality  Race   % received call  N
Boston   Low-quality     Black  7.01             542
Boston   Low-quality     White  10.15            542
Boston   High-quality    Black  8.50             541
Boston   High-quality    White  13.12            541
Chicago  Low-quality     Black  5.52             670
Chicago  Low-quality     White  7.16             670
Chicago  High-quality    Black  5.28             682
Chicago  High-quality    White  8.94             682

Group project

  • The simulated datasets and codebooks are available on our GitHub repository
  • https://github.com/danilofreire/qtm385/tree/main/simulated-pap-data
  • Each folder contains a file with the codebook and a description of the dataset, and the data in .csv format
  • The order of the presentations has also been randomised:
  • Wednesday, the 23rd of April: Groups 1, 4, 5 and 6
  • Monday, the 28th of April: Groups 2, 3, 7 and 8
  • Each group will have about 10 minutes to present their projects, with about 5 minutes for questions

Brief recap 📚

Survey experiments

Core components

  • Survey experiments combine random assignment with survey methods to study attitudes
  • Main applications: behavioural economics, psychology, marketing, political behaviour, and public opinion
  • Core design variations:
    • Presence/absence of stimuli
    • Dosage levels of treatment intensity
    • Qualitative variations in treatment content
  • Validation methods:
    • Manipulation checks post-treatment
    • Placebo treatments for specificity testing
    • Non-equivalent outcomes for effect containment
  • Common implementations:
    • Question wording manipulations
    • Vignette designs with randomised attributes
    • Audio or video stimuli

Today’s plan 📅

Survey experiments with a twist

  • What to do when subjects have an incentive to lie?
  • Sensitive topics are often difficult to study
  • Social desirability bias is a common problem in survey research
  • We need special techniques to study these problems
  • List experiments, randomised response technique, endorsement experiments, and conjoint experiments are possible solutions
  • All of these maintain plausible deniability for respondents
  • Software to estimate these models: http://sensitivequestions.org/
  • List experiments measure prevalence indirectly through item counts
  • Randomised response techniques use probability models for anonymity
  • Endorsement experiments assess support without direct attribution
  • Conjoint analysis measures preferences through trade-off scenarios
  • Analysis sometimes requires careful probability weighting

List experiments 📋

What is a list experiment?

  • The logic of list experiments is simple
  • Respondents are presented with a list of items and asked how many they agree with
    • Just how many items, not which ones
  • The list includes a sensitive item (e.g., “I have committed a crime”) and several non-sensitive items
  • The sensitive item is randomly assigned to a subset of respondents
  • The key is to compare the average number of items agreed with between the treatment group (who sees the sensitive item) and the control group (who does not)
  • The difference in means provides an estimate of the prevalence of the sensitive item
  • This method is also known as the item count technique

Example

Measuring prejudice

Now I’m going to read you three things that sometimes make people angry or upset. After I read all three, just tell me HOW MANY of them upset you. (I don’t want to know which ones, just how many.)

  1. the federal government increasing the tax on gasoline
  2. professional athletes getting million-dollar-plus salaries
  3. large corporations polluting the environment

How many, if any, of these things upset you?

Example of a list experiment

Measuring prejudice

Now I’m going to read you four things that sometimes make people angry or upset. After I read all four, just tell me HOW MANY of them upset you. (I don’t want to know which ones, just how many.)

  1. the federal government increasing the tax on gasoline
  2. professional athletes getting million-dollar-plus salaries
  3. large corporations polluting the environment
  4. a Muslim family moving next door to you

How many, if any, of these things upset you?

Some notation

  • Sample of respondents \(N\), where \(T_i = 1\) if respondent \(i\) is in the treatment group and \(T_i = 0\) if in the control group
  • \(J\) is the number of items in the control list, \(J + 1\) is the number of items in the treatment list
  • \(Z_{ij}(t)\) a binary variable denoting respondent \(i\)’s preference for the \(j\)th control item for \(j = 1, \dots , J\) under the treatment status \(t = 0, 1\)
  • \(Y_{i}(0) = \sum_{j=1}^{J} Z_{ij}(0)\) is the potential answer \(i\) would give if asked about the control list
  • \(Y_{i}(1) = \sum_{j=1}^{J+1} Z_{ij}(1)\) is the number of items in the treatment list that respondent \(i\) agrees with
  • The observed response is \(Y_i = Y_i(T_i)\), where \(Y_i(0)\) is in the range of \(\{0, 1, \dots, J\}\) and \(Y_i(1)\) is in the range of \(\{0, 1, \dots, J+1\}\)
  • Now let’s discuss the assumptions…

Assumptions

No design effects

  • First, we need to assume that the addition of the sensitive item does not change the sum of affirmative answers to the control items
  • It is not necessary that respondents answer the control items truthfully, only that each respondent's count of affirmative answers to the control items is unaffected by the presence of the sensitive item
  • This is the no design effects assumption
  • Formally, for each respondent \(i = 1, \dots, N\), we assume:

\[ \sum_{j=1}^{J} Z_{ij}(0) = \sum_{j=1}^{J} Z_{ij}(1) \]

No liars

  • Second, we need to assume that respondents do not lie about the sensitive item
  • That is, all respondents give truthful answers to the sensitive item
  • This is a strong assumption, as you can imagine

\[ Z_{i,J+1}(1) = Z^*_{i,J+1} \]

where \(Z^*_{i,J+1}\) represents a truthful answer to the sensitive item. The treatment effect is

\[\hat{\tau} = \frac{1}{N_1} \sum_{i=1}^{N} T_i Y_i - \frac{1}{N_0} \sum_{i=1}^{N} (1 - T_i) Y_i,\]

where \(N_1 = \sum_{i=1}^{N} T_i\) is the size of the treatment group and \(N_0 = N - N_1\) is the size of the control group
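The estimator \(\hat{\tau}\) above is just a difference in means between the treatment and control lists. Here is a minimal Python sketch on simulated data (the lecture's software is R, and all numbers here — the true prevalence, the control-item probabilities, the sample size — are illustrative assumptions, not estimates from any study):

```python
import random

random.seed(42)

J = 3        # number of control items
PI = 0.20    # true prevalence of the sensitive item (known only in simulation)
N = 10_000   # respondents

ys, ts = [], []
for _ in range(N):
    t = random.random() < 0.5                          # random assignment
    y = sum(random.random() < 0.4 for _ in range(J))   # affirmative control items
    if t and random.random() < PI:                     # truthful sensitive answer
        y += 1                                         # (no-liars assumption)
    ys.append(y)
    ts.append(t)

treated = [y for y, t in zip(ys, ts) if t]
control = [y for y, t in zip(ys, ts) if not t]

# Difference in means recovers the prevalence of the sensitive item
tau_hat = sum(treated) / len(treated) - sum(control) / len(control)
print(round(tau_hat, 3))
```

With this sample size, \(\hat{\tau}\) lands close to the simulated prevalence of 0.20, but note how noisy the estimate is relative to asking a direct question — this is the limited-power issue discussed below.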

More about list experiments

Notes about the design

  • List experiments have several advantages, as they are easy to implement and clear to respondents
  • But they have some issues as well:
    • Limited power to detect small effects
    • Floor effects: If many respondents disagree with all or most items, it is hard to estimate the prevalence of the sensitive item
    • Ceiling effects: If someone agrees with all items, we know for sure they agree with the sensitive item
    • Sample homogeneity: If the sample is too homogeneous, it may be hard to detect differences
  • But there are some solutions for these problems
  • Use items that contradict each other to reduce ceiling and floor effects
    • Example: ask about pro-gun and pro-choice items
  • A weakly informative Bayesian prior and covariates can also be used to reduce ceiling and floor effects
  • You can also add direct questions to the experiment and use a weighted average of the direct question estimate and the list experiment estimate among those who answer “No” to the direct question (Aronow et al. 2015)
  • You can also include multiple sensitive items in the list experiment
  • The main challenge is interpreting the results correctly, so we usually model which covariates predict agreement with the sensitive item

R example

library(list)
data(race)

lm.results <- ictreg(y ~ south + age + male + college, data = race,
                     treat = "treat", J = 3, method = "ml")
summary(lm.results)

Item Count Technique Regression 

Call: ictreg(formula = y ~ south + age + male + college, data = race, 
    treat = "treat", J = 3, method = "ml")

Sensitive item 
                Est.    S.E.
(Intercept) -5.50797 1.02102
south        1.67541 0.55851
age          0.63582 0.16333
male         0.84627 0.49372
college     -0.31538 0.47360

Control items 
                Est.    S.E.
(Intercept)  1.19138 0.14368
south       -0.29202 0.09692
age          0.03323 0.02768
male        -0.25059 0.08194
college     -0.51641 0.08368

Log-likelihood: -1444.394

Number of control items J set to 3. Treatment groups were indicated by '1' and the control group by '0'.

Did hidden Trump voters exist?

Source: Coppock (2017)

Do Russians support the war in Ukraine?

Source: Chapkovski (2022)

Endorsement experiments

What is an endorsement experiment?

  • Endorsement experiments are a type of survey experiment that measure support for a policy or candidate
  • They are a variation of the vignette experiment, where respondents are presented with a scenario and asked to evaluate it
  • Here, the scenario includes an endorsement from a prominent figure or group
  • These experiments are often used to study partisan bias, group identity, or framing effects
  • The main advantage of endorsement experiments is that they can elicit preferences without directly asking about them
  • While these effects are easy to estimate when a single group is involved, estimation becomes more difficult with multiple groups
  • Combining the items is somewhat hard, but it is possible
  • One solution is to use Bayesian hierarchical models, described in this (very technical) paper: Bullock et al (2011)
  • We can also use conjoint experiments, as we’ll see in a bit!

Example

Measuring Support for Militant Groups in Pakistan

  • Militant violence in Pakistan is a serious international problem, yet little is known about who supports militant organisations and why
  • There are strong incentives for locals to falsify information to avoid repercussions
  • Also, asking respondents directly poses safety risks for researchers and participants alike
  • Unlike a direct measure, nonresponse and social desirability biases are minimised since respondents are reacting to the policy and not directly to the group itself (or so it seems! 😅)
  • The authors asked respondents about their level of support for polio vaccinations, curriculum reforms, crime regulations, and border disputes
  • Control group:
The World Health Organization recently announced 
a plan to introduce universal polio vaccination 
across Pakistan. How much do you support
such a plan?
(1) A great deal; (2) A lot; (3) A moderate amount; 
(4) A little; (5) Not at all.
  • Treatment group:
The World Health Organization recently announced 
a plan to introduce universal polio vaccination 
across Pakistan. Pakistani militant groups 
fighting in Kashmir have voiced support for 
this program. How much do you support such a plan?
(1) A great deal; (2) A lot; (3) A moderate amount; 
(4) A little; (5) Not at all.
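The endorsement effect in a design like this is often summarised as the difference in mean support between the treatment (endorsed) and control versions of the question. Here is a hedged Python sketch on simulated 5-point responses — the response distributions below are invented for illustration and are not the study's data:

```python
import random
from statistics import mean

random.seed(1)

# Simulated 5-point answers (1 = "a great deal", 5 = "not at all");
# the weights are illustrative assumptions, not estimates from the study.
control = random.choices([1, 2, 3, 4, 5], weights=[30, 25, 20, 15, 10], k=2000)
treated = random.choices([1, 2, 3, 4, 5], weights=[20, 20, 20, 20, 20], k=2000)

# A positive effect means the endorsement pushed answers toward "not at all",
# i.e. the endorsement *reduced* support for the plan
effect = mean(treated) - mean(control)
print(round(effect, 2))
```

In practice, researchers model the ordinal scale directly (e.g. with ordered or hierarchical models) rather than taking raw means, but the sign and rough size of the difference convey the same idea.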

Results

Did political endorsement influence mass opinion during COVID-19?

Source: Gadarian et al (2021)

Randomised response techniques 🎲

Randomised response

  • This is an old technique (Warner 1965; Boruch 1971) used to reduce social desirability bias
  • The main idea is the following:
    • The respondent rolls a die or uses another randomising device out of view of the enumerator
    • The enumerator does not know the outcome of the randomisation
    • For instance, if 1 shows in the die, the respondent says no
    • If 6 shows, the respondent says yes
    • If any other number shows, the respondent answers truthfully
    • The enumerator sees the answer, but does not know if it is true or random
  • The method works because the probabilities are known, so we can estimate the true proportion of respondents who agree with the sensitive item
  • However, the method is a little confusing to respondents and it is important to either run the experiment in person or provide clear instructions if running it online
  • Believe it or not, this is the most effective method to reduce social desirability bias!
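Under the die-rolling scheme above, answers are a forced “no” with probability 1/6, a forced “yes” with probability 1/6, and truthful otherwise, so \(P(\text{yes}) = 1/6 + (4/6)\,\pi\) and therefore \(\hat{\pi} = (\hat{P}(\text{yes}) - 1/6) / (4/6)\). A minimal Python simulation of this arithmetic (the true prevalence is an assumption of the sketch):

```python
import random

random.seed(7)

PI = 0.30    # true prevalence (known only in simulation)
N = 20_000   # respondents

yes = 0
for _ in range(N):
    die = random.randint(1, 6)   # rolled out of the enumerator's view
    if die == 1:
        answer = False           # forced "no"
    elif die == 6:
        answer = True            # forced "yes"
    else:
        answer = random.random() < PI   # truthful answer
    yes += answer

p_yes = yes / N
# P(yes) = 1/6 + (4/6) * pi  =>  pi = (P(yes) - 1/6) / (4/6)
pi_hat = (p_yes - 1/6) / (4 / 6)
print(round(pi_hat, 3))
```

Each individual answer is uninformative about that respondent, yet the known randomisation probabilities let us back out the aggregate prevalence.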

Xenophobia and anti-semitism in Germany

Source: Krumpal (2012)

Conjoint analysis

Conjoint experiments

  • Conjoint analysis measures preferences for multiple attributes of a product, service, or policy
  • The method aims to understand how different factors influence decision-making
  • It is also used to study trade-offs between different attributes
  • Like the other experiments we discussed, conjoint analysis provides plausible deniability for respondents
  • Why is that? Because respondents are not asked about their preferences directly
  • Instead, they are presented with a series of hypothetical scenarios and asked to choose their preferred option
  • The main advantage of conjoint analysis is that it allows researchers to study complex preferences in a more realistic way
  • Another advantage is that the method is very robust, flexible, and adaptable to various research contexts
  • It also increases the effective sample size significantly, as each respondent evaluates several scenarios
  • However, it can be time-consuming to design and analyse
  • And the software to implement it can be quite difficult to use as well! 😉
  • See my own experiment for an example

Hainmueller et al (2014)

  • Assumptions:
    • No carryover effects
    • No profile-order effects
    • Randomisation of profiles
    • Need to cluster errors because answers are not independent (a single individual chooses many profiles)
  • Survey design tool
  • cregg R package

Source: Hainmueller et al (2014)
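Under these assumptions, the average marginal component effect (AMCE) of an attribute level can be estimated as a difference in choice rates across randomised profiles. Here is a stripped-down Python sketch with a single invented attribute (real analyses, e.g. with cregg, use regression and cluster standard errors by respondent; this sketch omits inference entirely):

```python
import random

random.seed(3)

# Simulated conjoint data: each profile has one attribute, "education",
# randomised over two levels; the outcome is whether the profile was chosen.
# The attribute names and choice probabilities are illustrative assumptions.
profiles = []
for _ in range(5_000):
    level = random.choice(["college", "no college"])
    p_chosen = 0.6 if level == "college" else 0.4
    profiles.append((level, random.random() < p_chosen))

def choice_rate(level):
    """Share of profiles with this attribute level that were chosen."""
    chosen = [c for lv, c in profiles if lv == level]
    return sum(chosen) / len(chosen)

# AMCE of "college" relative to "no college": difference in choice rates
amce = choice_rate("college") - choice_rate("no college")
print(round(amce, 3))
```

Because profiles are randomised, this simple difference is unbiased for the AMCE; the clustering mentioned above matters only for the standard errors, since one respondent rates many profiles.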

Example: Immigration applicants (US)

Example: Presidential candidates (US)

Conclusion

Conclusion

  • How cool are these methods?! 🤓
  • There are many other methods to study sensitive topics
  • Today we saw four of them:
    • List experiments
    • Endorsement experiments
    • Randomised response techniques
    • Conjoint analysis
  • Please remember that these methods are not perfect, and that they have several assumptions
  • They are also areas of active research, so please be mindful that something new may come up 😉
  • If you are interested in learning more about these methods, please check the following R packages:
  • list
  • endorse
  • rr
  • cregg
  • Visit the website http://sensitivequestions.org/ for more information
  • And let me know if you’d like to learn more about these methods! 🤓

…And that’s all for today! 🎉

Thank you for your attention! 🙏