QTM 385 - Experimental Methods

Lecture 22 - Survey Experiments for Sensitive Topics

Danilo Freire

Emory University

Hello, everyone! 😉

Notes about assignment 09 and group project

Notes

Assignment 09

  • Some of you reported issues with the table in question 03 in assignment 09
  • The table was coded correctly, but it does not render properly in RStudio Visual Mode
  • It has been fixed 🤓
City     Resume quality  Race   % received call  N
Boston   Low-quality     Black  7.01             542
Boston   Low-quality     White  10.15            542
Boston   High-quality    Black  8.50             541
Boston   High-quality    White  13.12            541
Chicago  Low-quality     Black  5.52             670
Chicago  Low-quality     White  7.16             670
Chicago  High-quality    Black  5.28             682
Chicago  High-quality    White  8.94             682

Group project

  • The simulated datasets and codebooks are available on our GitHub repository
  • https://github.com/danilofreire/qtm385/tree/main/simulated-pap-data
  • Each folder contains a file with the codebook and a description of the dataset, and the data in .csv format
  • The order of the presentations has also been randomised:
  • Wednesday, the 23rd of April: Groups 1, 4, 5 and 6
  • Monday, the 28th of April: Groups 2, 3, 7 and 8
  • Each group will have about 10 minutes to present their projects, with about 5 minutes for questions

Brief recap 📚

Survey experiments

Core components

  • Survey experiments combine random assignment with survey methods to study attitudes
  • Main applications: behavioural economics, psychology, marketing, political behaviour, and public opinion
  • Core design variations:
    • Presence/absence of stimuli
    • Dosage levels of treatment intensity
    • Qualitative variations in treatment content
  • Validation methods:
    • Manipulation checks post-treatment
    • Placebo treatments for specificity testing
    • Non-equivalent outcomes for effect containment
  • Common implementations:
    • Question wording manipulations
    • Vignette designs with randomised attributes
    • Audio or video stimuli

Today’s plan 📅

Survey experiments with a twist

  • What to do when subjects have an incentive to lie?
  • Sensitive topics are often difficult to study
  • Social desirability bias is a common problem in survey research
  • We need special techniques to study these problems
  • List experiments, randomised response technique, endorsement experiments, and conjoint experiments are possible solutions
  • All of these maintain plausible deniability for respondents
  • Software to estimate these models: http://sensitivequestions.org/
  • List experiments measure prevalence indirectly through item counts
  • Randomised response techniques use probability models for anonymity
  • Endorsement experiments assess support without direct attribution
  • Conjoint analysis measures preferences through trade-off scenarios
  • Analysis sometimes requires careful probability weighting

List experiments 📋

What is a list experiment?

  • The logic of list experiments is simple
  • Respondents are presented with a list of items and asked how many they agree with
    • Just how many items, not which ones
  • The list includes a sensitive item (e.g., “I have committed a crime”) and several non-sensitive items
  • The sensitive item is randomly assigned to a subset of respondents
  • The key is to compare the average number of items agreed with between the treatment group (who sees the sensitive item) and the control group (who does not)
  • The difference in means provides an estimate of the prevalence of the sensitive item
  • This method is also known as the item count technique

Example

Measuring prejudice

Now I’m going to read you three things that sometimes make people angry or upset. After I read all three, just tell me HOW MANY of them upset you. (I don’t want to know which ones, just how many.)

  1. the federal government increasing the tax on gasoline
  2. professional athletes getting million-dollar-plus salaries
  3. large corporations polluting the environment

How many, if any, of these things upset you?

Example of a list experiment

Measuring prejudice

Now I’m going to read you four things that sometimes make people angry or upset. After I read all four, just tell me HOW MANY of them upset you. (I don’t want to know which ones, just how many.)

  1. the federal government increasing the tax on gasoline
  2. professional athletes getting million-dollar-plus salaries
  3. large corporations polluting the environment
  4. a Muslim family moving next door to you

How many, if any, of these things upset you?

Some notation

  • Sample of respondents \(N\), where \(T_i = 1\) if respondent \(i\) is in the treatment group and \(T_i = 0\) if in the control group
  • \(J\) is the number of items in the control list, \(J + 1\) is the number of items in the treatment list
  • \(Z_{ij}(t)\) a binary variable denoting respondent \(i\)’s preference for the \(j\)th control item for \(j = 1, \dots , J\) under the treatment status \(t = 0, 1\)
  • \(Y_{i}(0) = \sum_{j=1}^{J} Z_{ij}(0)\) is the potential answer \(i\) would give if asked about the control list
  • \(Y_{i}(1) = \sum_{j=1}^{J+1} Z_{ij}(1)\) is the number of items in the treatment list that respondent \(i\) agrees with
  • The observed response is \(Y_i = Y_i(T_i)\), where \(Y_i(0)\) is in the range of \(\{0, 1, \dots, J\}\) and \(Y_i(1)\) is in the range of \(\{0, 1, \dots, J+1\}\)
  • Now let’s discuss the assumptions…

Assumptions

No design effects

  • First, we need to assume that the addition of the sensitive item does not change the sum of affirmative answers to the control items
  • It is not necessary that respondents answer the control items truthfully, only that each respondent's count of affirmative answers to the control items is unaffected by the presence of the sensitive item
  • This is the no design effects assumption
  • Formally, for each respondent \(i = 1, \dots, N\), we assume:

\[ \sum_{j=1}^{J} Z_{ij}(0) = \sum_{j=1}^{J} Z_{ij}(1) \]

No liars

  • Second, we need to assume that respondents do not lie about the sensitive item
  • That is, all respondents give truthful answers to the sensitive item
  • This is a strong assumption, as you can imagine

\[ Z_{i,J+1}(1) = Z^*_{i,J+1} \]

where \(Z^*_{i,J+1}\) represents a truthful answer to the sensitive item. The treatment effect is

\[\hat{\tau} = \frac{1}{N_1} \sum_{i=1}^{N} T_i Y_i - \frac{1}{N_0} \sum_{i=1}^{N} (1 - T_i) Y_i,\]

where \(N_1 = \sum_{i=1}^{N} T_i\) is the size of the treatment group and \(N_0 = N - N_1\) is the size of the control group
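The estimator \(\hat{\tau}\) above is just a difference in means between the treatment and control lists. Here is a minimal Python sketch on simulated data (the lecture's software is R, and all numbers here — the true prevalence, the control-item probabilities, the sample size — are illustrative assumptions, not estimates from any study):

```python
import random

random.seed(42)

J = 3        # number of control items
PI = 0.20    # true prevalence of the sensitive item (known only in simulation)
N = 10_000   # respondents

ys, ts = [], []
for _ in range(N):
    t = random.random() < 0.5                          # random assignment
    y = sum(random.random() < 0.4 for _ in range(J))   # affirmative control items
    if t and random.random() < PI:                     # truthful sensitive answer
        y += 1                                         # (no-liars assumption)
    ys.append(y)
    ts.append(t)

treated = [y for y, t in zip(ys, ts) if t]
control = [y for y, t in zip(ys, ts) if not t]

# Difference in means recovers the prevalence of the sensitive item
tau_hat = sum(treated) / len(treated) - sum(control) / len(control)
print(round(tau_hat, 3))
```

With this sample size, \(\hat{\tau}\) lands close to the simulated prevalence of 0.20, but note how noisy the estimate is relative to asking a direct question — this is the limited-power issue discussed below.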

More about list experiments

Notes about the design

  • List experiments have several advantages, as they are easy to implement and clear to respondents
  • But they have some issues as well:
    • Limited power to detect small effects
    • Floor effects: If many respondents disagree with all or most items, it is hard to estimate the prevalence of the sensitive item
    • Ceiling effects: If someone agrees with all items, we know for sure they agree with the sensitive item
    • Sample homogeneity: If the sample is too homogeneous, it may be hard to detect differences
  • But there are some solutions for these problems
  • Use items that contradict each other to reduce ceiling and floor effects
    • Example: ask about pro-gun and pro-choice items
  • A weakly informative Bayesian prior and covariates can also be used to reduce ceiling and floor effects
  • You can also add direct questions to the experiment and use a weighted average of the direct question estimate and the list experiment estimate among those who answer “No” to the direct question (Aronow et al. 2015)
  • You can also include multiple sensitive items in the list experiment
  • The main challenge is interpreting the results correctly, so we usually model which covariates predict agreement with the sensitive item

R example

library(list)
data(race)

lm.results <- ictreg(y ~ south + age + male + college, data = race,
                     treat = "treat", J = 3, method = "ml")
summary(lm.results)

Item Count Technique Regression 

Call: ictreg(formula = y ~ south + age + male + college, data = race, 
    treat = "treat", J = 3, method = "ml")

Sensitive item 
                Est.    S.E.
(Intercept) -5.50797 1.02102
south        1.67541 0.55851
age          0.63582 0.16333
male         0.84627 0.49372
college     -0.31538 0.47360

Control items 
                Est.    S.E.
(Intercept)  1.19138 0.14368
south       -0.29202 0.09692
age          0.03323 0.02768
male        -0.25059 0.08194
college     -0.51641 0.08368

Log-likelihood: -1444.394

Number of control items J set to 3. Treatment groups were indicated by '1' and the control group by '0'.

Did hidden Trump voters exist?

Source: Coppock (2017)

Do Russians support the war in Ukraine?

Source: Chapkovski (2022)

Endorsement experiments

What is an endorsement experiment?

  • Endorsement experiments are a type of survey experiment that measure support for a policy or candidate
  • They are a variation of the vignette experiment, where respondents are presented with a scenario and asked to evaluate it
  • Here, the scenario includes an endorsement from a prominent figure or group
  • These experiments are often used to study partisan bias, group identity, or framing effects
  • The main advantage of endorsement experiments is that they can elicit preferences without directly asking about them
  • While these effects are easy to estimate when a single group is involved, estimation becomes more difficult with multiple groups
  • Combining the items is somewhat hard, but it is possible
  • One solution is to use Bayesian hierarchical models, described in this (very technical) paper: Bullock et al (2011)
  • We can also use conjoint experiments, as we’ll see in a bit!

Example

Measuring Support for Militant Groups in Pakistan

  • Militant violence in Pakistan is a serious international problem, yet little is known about who supports militant organisations and why
  • There are strong incentives for locals to falsify information to avoid repercussions
  • Also, asking respondents directly poses safety risks for researchers and participants alike
  • Unlike a direct measure, nonresponse and social desirability biases are minimised since respondents are reacting to the policy and not directly to the group itself (or so it seems! 😅)
  • The authors asked respondents about their level of support for polio vaccinations, curriculum reforms, crime regulations, and border disputes
  • Control group:
The World Health Organization recently announced 
a plan to introduce universal polio vaccination 
across Pakistan. How much do you support
such a plan?
(1) A great deal; (2) A lot; (3) A moderate amount; 
(4) A little; (5) Not at all.
  • Treatment group:
The World Health Organization recently announced 
a plan to introduce universal polio vaccination 
across Pakistan. Pakistani militant groups 
fighting in Kashmir have voiced support for 
this program. How much do you support such a plan?
(1) A great deal; (2) A lot; (3) A moderate amount; 
(4) A little; (5) Not at all.
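The endorsement effect in a design like this is often summarised as the difference in mean support between the treatment (endorsed) and control versions of the question. Here is a hedged Python sketch on simulated 5-point responses — the response distributions below are invented for illustration and are not the study's data:

```python
import random
from statistics import mean

random.seed(1)

# Simulated 5-point answers (1 = "a great deal", 5 = "not at all");
# the weights are illustrative assumptions, not estimates from the study.
control = random.choices([1, 2, 3, 4, 5], weights=[30, 25, 20, 15, 10], k=2000)
treated = random.choices([1, 2, 3, 4, 5], weights=[20, 20, 20, 20, 20], k=2000)

# A positive effect means the endorsement pushed answers toward "not at all",
# i.e. the endorsement *reduced* support for the plan
effect = mean(treated) - mean(control)
print(round(effect, 2))
```

In practice, researchers model the ordinal scale directly (e.g. with ordered or hierarchical models) rather than taking raw means, but the sign and rough size of the difference convey the same idea.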

Results

Did political endorsement influence mass opinion during COVID-19?

Source: Gadarian et al (2021)

Randomised response techniques 🎲

Randomised response

  • This is an old technique (Warner 1965; Boruch 1971) used to reduce social desirability bias
  • The main idea is the following:
    • The respondent rolls a die or uses another randomising device out of view of the enumerator
    • The enumerator does not know the outcome of the randomisation
    • For instance, if 1 shows in the die, the respondent says no
    • If 6 shows, the respondent says yes
    • If any other number shows, the respondent answers truthfully
    • The enumerator sees the answer, but does not know if it is true or random
  • The method works because the probabilities are known, so we can estimate the true proportion of respondents who agree with the sensitive item
  • However, the method is a little confusing to respondents and it is important to either run the experiment in person or provide clear instructions if running it online
  • Believe it or not, this is the most effective method to reduce social desirability bias!
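Under the die-rolling scheme above, answers are a forced “no” with probability 1/6, a forced “yes” with probability 1/6, and truthful otherwise, so \(P(\text{yes}) = 1/6 + (4/6)\,\pi\) and therefore \(\hat{\pi} = (\hat{P}(\text{yes}) - 1/6) / (4/6)\). A minimal Python simulation of this arithmetic (the true prevalence is an assumption of the sketch):

```python
import random

random.seed(7)

PI = 0.30    # true prevalence (known only in simulation)
N = 20_000   # respondents

yes = 0
for _ in range(N):
    die = random.randint(1, 6)   # rolled out of the enumerator's view
    if die == 1:
        answer = False           # forced "no"
    elif die == 6:
        answer = True            # forced "yes"
    else:
        answer = random.random() < PI   # truthful answer
    yes += answer

p_yes = yes / N
# P(yes) = 1/6 + (4/6) * pi  =>  pi = (P(yes) - 1/6) / (4/6)
pi_hat = (p_yes - 1/6) / (4 / 6)
print(round(pi_hat, 3))
```

Each individual answer is uninformative about that respondent, yet the known randomisation probabilities let us back out the aggregate prevalence.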

Xenophobia and anti-semitism in Germany

Source: Krumpal (2012)

Conjoint analysis

Conjoint experiments

  • Conjoint analysis measures preferences for multiple attributes of a product, service, or policy
  • The method aims to understand how different factors influence decision-making
  • It is also used to study trade-offs between different attributes
  • Like the other experiments we discussed, conjoint analysis provides plausible deniability for respondents
  • Why is that? Because respondents are not asked about their preferences directly
  • Instead, they are presented with a series of hypothetical scenarios and asked to choose their preferred option
  • The main advantage of conjoint analysis is that it allows researchers to study complex preferences in a more realistic way
  • Another advantage is that the method is very robust, flexible, and adaptable to various research contexts
  • It also increases the effective sample size significantly, as each respondent evaluates several scenarios
  • However, it can be time-consuming to design and analyse
  • And the software to implement it can be quite difficult to use as well! 😉
  • See my own experiment for an example

Hainmueller et al (2014)

  • Assumptions:
    • No carryover effects
    • No profile-order effects
    • Randomisation of profiles
    • Need to cluster errors because answers are not independent (a single individual chooses many profiles)
  • Survey design tool
  • cregg R package

Source: Hainmueller et al (2014)
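Under these assumptions, the average marginal component effect (AMCE) of an attribute level can be estimated as a difference in choice rates across randomised profiles. Here is a stripped-down Python sketch with a single invented attribute (real analyses, e.g. with cregg, use regression and cluster standard errors by respondent; this sketch omits inference entirely):

```python
import random

random.seed(3)

# Simulated conjoint data: each profile has one attribute, "education",
# randomised over two levels; the outcome is whether the profile was chosen.
# The attribute names and choice probabilities are illustrative assumptions.
profiles = []
for _ in range(5_000):
    level = random.choice(["college", "no college"])
    p_chosen = 0.6 if level == "college" else 0.4
    profiles.append((level, random.random() < p_chosen))

def choice_rate(level):
    """Share of profiles with this attribute level that were chosen."""
    chosen = [c for lv, c in profiles if lv == level]
    return sum(chosen) / len(chosen)

# AMCE of "college" relative to "no college": difference in choice rates
amce = choice_rate("college") - choice_rate("no college")
print(round(amce, 3))
```

Because profiles are randomised, this simple difference is unbiased for the AMCE; the clustering mentioned above matters only for the standard errors, since one respondent rates many profiles.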

Example: Immigration applicants (US)

Example: Presidential candidates (US)

Conclusion

Conclusion

  • How cool are these methods?! 🤓
  • There are many other methods to study sensitive topics
  • Today we saw four of them:
    • List experiments
    • Endorsement experiments
    • Randomised response techniques
    • Conjoint analysis
  • Please remember that these methods are not perfect, and that they have several assumptions
  • They are also areas of active research, so please be mindful that something new may come up 😉
  • If you are interested in learning more about these methods, please check the following R packages:
  • list
  • endorse
  • rr
  • cregg
  • Visit the website http://sensitivequestions.org/ for more information
  • And let me know if you’d like to learn more about these methods! 🤓

…And that’s all for today! 🎉

Thank you for your attention! 🙏