Two paragraphs (maximum) summarising an experiment that you wish to develop in this course. At a minimum, your summary should include a research question, why the question is important, and a rough outline of how you plan to answer the question.
We’ll be working on a little bit of the task each week during class, and I’ll be posting the week’s assignments on the website. It will be fun! 😊
Brief recap 📚
Last class
Clustering:
Assigns whole groups (e.g. classrooms) to treatment due to practical constraints
Introduces intra-cluster correlation (ICC) which increases variance
Requires cluster-robust standard errors and careful power calculations
The effective sample size is always smaller than the actual sample size, sometimes substantially so!
A few suggestions on how to deal with clustering:
Increase the number of clusters
Increase the number of units per cluster
Use pair-matching (or any type of blocking) to improve precision
Statistical power:
Power = probability of detecting true effects (aim for ≥ 80%)
Influenced by: effect size magnitude, outcome variability, sample size, and significance level
DeclareDesign package enables power simulation through:
Model declaration, treatment effect estimation, design diagnosis across sample sizes
You can use any power calculator to estimate power, but DeclareDesign works with any type of design, which can be difficult to handle using traditional formulas
Power curves show how power changes with sample size
Today’s plan 📋
One-sided non-compliance
One-sided non-compliance
Compliers versus never-takers
Intent-to-treat (ITT) effect versus average treatment effect (ATE)
Complier average causal effect (CACE) is the effect of the treatment on the compliers
Instrumental variables (IV) can be used to estimate the CACE
Two-stage least squares (2SLS) is the most common IV method
In experimental research, compliance is the extent to which participants follow the treatment assignment
Under full compliance, all participants follow the treatment assignment (and that’s what we want!)
Non-compliance occurs when participants do not follow the treatment assignment
In everyday language, compliance and non-compliance have a negative connotation, but in research, they are neutral terms
Non-compliance is a problem because it undermines the internal validity of the study
Today we will examine one-sided non-compliance, which is when units in the treatment group do not receive the treatment
Those in the control group are not affected by this issue
Next class, we will discuss two-sided non-compliance, which is when some people in the treatment group do not receive the treatment and some people in the control group do receive the treatment
This complicates analysis quite a bit, but we have methods to deal with it 🤓
Imagine that you are interested in studying the effect of canvassing on voter turnout
Maybe if you knock on people’s doors and talk to them about the importance of voting, they will be more likely to vote!
You design an experiment where you randomly assign 1000 people to receive canvassing (treatment group) and 1000 people to not receive canvassing (control group)
However… in experiments like this, typically only about 25% of the people in the treatment group are actually canvassed
The rest are not home, refuse to talk, etc.
So we have 250 people treated and 1000 in the control group
What would you do? 🤔
Option 01: Just compare the two groups
As-treated analysis
The first option we have is to just compare the two groups as if nothing had happened
So we would compare the 1000 people who were in the treatment group with the 1000 people who were in the control group
Then calculate the average treatment effect (ATE) as the difference between the two groups as we always do
What do you think? 🤔
Option 01: Just compare the two groups
As-treated analysis
The problem with this approach is that it undermines the internal validity of the study
The comparison no longer isolates the effect of the treatment itself, because most of the treatment group never actually received it
We are assuming that the effect of canvassing is zero for the 750 people who were not canvassed
There might be selection bias in the treatment group
For example, maybe the people who were canvassed are more likely to vote anyway
People who refuse to talk to canvassers might be less likely to vote, and so on
So this approach is not recommended 👎
Option 02: Assume random compliance
As-treated analysis
The second option, related to the first, is to assume that the differences between the two groups are random
In other words, we assume that the people who were canvassed are randomly selected from the treatment group
And the fact that only 25% of the people were canvassed is just bad luck
If this is the case, we can drop the people who were not canvassed from the treatment group and compare the 250 people who were canvassed with the 1000 people in the control group
This would recover the true ATE if the assumption were correct
What do you think? 🤔
Option 02: Assume random compliance
As-treated analysis
The problem with this approach is that we cannot test the assumption
We cannot know if the differences between the two groups are random or not
Most likely they are not!
Unless you can really justify that the differences are random, this approach is not recommended 👎
But if you can justify it (good luck with that! 😂), this is okay!
Option 03: Redefine the ATE
Just give people the choice
The third option is to stick to the random assignment and compare the two groups as if everyone had followed the treatment assignment
Instead of comparing the people who were actually canvassed with those who were not canvassed, we compare the people who were assigned to be canvassed with those who were not assigned to be canvassed
The difference here is semantic:
We would be able to recover the true ATE if we had only given people the choice of whether or not to be canvassed
For instance, rather than analysing the effect of Medicaid on health outcomes, we would be analysing the effect of being offered Medicaid on health outcomes
In this definition, non-compliance is impossible
What do you think?
Option 03: Redefine the ATE
Just give people the choice
The problem with this approach is that it underestimates the treatment effect
The average treatment effect is the difference between the outcome of the people who were actually canvassed and the outcome of the people who were not canvassed
But this analysis compares the outcome of the people who were assigned to be canvassed with the outcome of the people who were not assigned to be canvassed
This is not the same as receiving the treatment or not!
So this approach is not recommended either 👎
Option 04: Instrumental variables (IV)
A clever way to deal with non-compliance
The fourth option is to use instrumental variables (IV) (or two-stage least squares - 2SLS)
The benefit of IV is that it allows us to recover the true effect of the programme instead of only the effect of being offered the programme
The downside is that IV does not allow us to recover the true ATE in the whole population
This is because it only measures the effect of the programme on the compliers, that is, those who took the treatment when it was assigned to them
This is the best approach 👍
But first, some definitions and concepts… 🤓
New definitions and assumptions 🤓
Full compliance
In an ideal experiment, we randomly assign each user to a treatment or a control group
All users in the treatment group experience the treatment, and all users in the control group do not experience the treatment
The table below summarises full compliance:
Random assignment (\(Z\))
Treatment status (\(D\))
Treatment
Treated
Control
Untreated
For the next slides, it is useful to introduce some definitions:
\(Z \in \{0, 1\}\) indicates whether a user was assigned to the treatment or the control group (visited by a canvasser or not)
\(D \in \{0, 1\}\) indicates whether a user was treated (actually heard the message)
\(Y\), as always, is the outcome we care about (voter turnout)
In this case, the treatment effect is the difference between the potential outcomes of the treated and untreated users, as we have seen before
One-sided non-compliance
In the case of one-sided non-compliance, some users in the treatment group do not receive the treatment
The table below summarises the situation:
Random assignment (\(Z\))
Treatment status (\(D\))
Treatment
Treated
Untreated
Control
Untreated
In this case, the quantity \(E[Y \mid Z=1] - E[Y \mid Z=0]\) does not represent the treatment effect anymore
Instead, it represents the effect of being assigned to the treatment group only, i.e., the intent-to-treat (ITT) effect
Let’s formalise this a bit more…
One-sided non-compliance
Notation
Let the experimental assignment of subject \(i\) be \(z_i\)
When \(z_i = 1\), the subject is assigned to the treatment group, and when \(z_i = 0\), the subject is assigned to the control group
Let \(d_i(z)\) represent whether subject \(i\) is actually treated, given the assignment \(z_i\)
To make it short, let’s write \(d_i(z = 1)\) as \(d_i(1)\) and \(d_i(z = 0)\) as \(d_i(0)\)
If a subject receives no treatment when assigned to the control group, we represent them as \(d_i(0) = 0\)
For one-sided non-compliance, \(d_i(0) = 0\) for everyone in the control group, but \(d_i(1)\) can be 0 or 1
If \(d_i(1) = 1\), I would open the door if canvassed, but if \(d_i(1) = 0\), I would not open the door
Compliers and never-takers
Two new groups
In the case of one-sided non-compliance, we have two new groups of analysis
Compliers are those who would take the treatment if assigned to the treatment group and would not take the treatment if assigned to the control group
So, \(d_i(1) = 1\) and \(d_i(0) = 0\)
However, we also have a group of people who would not take the treatment even if assigned to the treatment group
These are the never-takers
For them, \(d_i(1) = d_i(0) = 0\)
Thus, the expression \(ATE \mid d_i(1) = 1\) means the average treatment effect (ATE) for the compliers
Keep in mind that the labels “compliers” and “never-takers” are unrelated to the outcomes \(Y_i\); they depend only on treatment take-up \(d_i(z)\)
It is not always easy to define who is a complier in an experiment
What if canvassing happens at weekends but some people are at home only during the week? Are they compliers or never-takers?
If we canvass them during the week instead, are they compliers or never-takers?
First assumption: Non-interference
The first assumption we need to make is that of non-interference
Non-interference means that whether a subject is treated depends only on the subject’s own treatment group assignment
This assumption is strong, difficult to test, and often violated
The intent-to-treat (\(ITT\)) effect of assignment (\(z\)) on treatment status (\(d\)) is defined as:
\[ ITT_{i, D} = d_i (1) - d_i (0) \]
If everyone complies perfectly, then \(d_i(1)\) will be 1 and \(d_i(0)\) will be 0, so the difference is 1
The average \(ITT_{i, D}\) across all subjects is
\[ITT_D = E[ITT_{i, D}] = E[d_i(1)] - E[d_i(0)]\]
That is, the proportion of people who take the treatment when assigned to the treatment group minus the proportion of people who take the treatment when assigned to the control group
In one-sided non-compliance, \(E[d_i(0)] = 0\) for all subjects, so \(ITT_D = E[d_i(1)] \geq 0\)
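To make this concrete, here is a minimal simulation sketch in base R (all names are illustrative; the 25% complier share mirrors the canvassing example):

```r
# Sketch: under one-sided non-compliance, ITT_D equals the share of compliers
set.seed(42)
n <- 10000
z <- rbinom(n, 1, 0.5)          # random assignment
complier <- rbinom(n, 1, 0.25)  # 25% compliers, as in the canvassing example
d <- z * complier               # never-takers stay untreated even if assigned;
                                # no one in the control group is treated
itt_d <- mean(d[z == 1]) - mean(d[z == 0])
itt_d                           # close to 0.25
```

Note that `mean(d[z == 0])` is exactly zero here, which is precisely what one-sided non-compliance means.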
ITT effect on the outcome
The intent-to-treat effect of \(z_i\) on \(Y_i\) for each subject is:
\[ ITT_{i, Y} = Y_i(z = 1) - Y_i(z = 0) \]
and the average across all subjects is \(ITT_Y = E[ITT_{i, Y}]\)
If we have full compliance, \(ITT_{Y}\) is the same as the average treatment effect (ATE)
If not, \(ITT_{Y}\) is the intent-to-treat (ITT) effect: whether a programme “made a difference” in the outcome, regardless of whether people actually took the treatment
Second assumption: Exclusion restriction
The second assumption we need to make is that of the exclusion restriction
The exclusion restriction means that the only way the treatment assignment (\(z\)) affects the outcome (\(Y\)) is through its effect on whether people actually get the treatment (\(d\))
In other words, subjects who end up untreated have the same potential outcome under either assignment:
\(Y_i(z = 1, d = 0) = Y_i(z = 0, d = 0)\)
And the same is true for treated subjects:
\(Y_i(z = 1, d = 1) = Y_i(z = 0, d = 1)\)
In general:
\(Y_i(z, d) = Y_i(d)\)
This assumption is also strong, and the main reason why we have placebos in science!
CACE and IVs 🤓
Complier average causal effect (CACE)
The effect of the treatment on the compliers
As we cannot correctly estimate the ATE with non-compliance, we focus on the complier average causal effect (CACE)
CACE tries to answer this question: “For those individuals who actually heard the message, what is the effect of the message on their likelihood of voting?”
# Select one-person households that were either pure controls or canvass only
sel <- data1$onetreat == 1 & data1$mailings == 0 & data1$phongotv == 0 & data1$persons == 1
# Verify the number of observations
table(sel)
First stage: \(ITT_d\) with robust standard errors
The intercept of zero in this equation indicates that no one in the control group was contacted, in keeping with the definition of one-sided noncompliance
The coefficient 0.273 indicates that assignment to the treatment group caused 27.3% of the targeted subjects to be treated
In other words, the estimated share of Compliers in the treatment group is 27.3%. The 95% CI suggests that this proportion ranges from 25.0% to 29.6%
# Load the required packages
library(AER)      # For IV
library(sandwich) # For robust SEs

# Box 5.5: ITT_D
# Note that results from this will vary from the book
itt_d_fit <- lm(TREATED ~ ASSIGNED, data = data2)
coeftest(itt_d_fit, vcovHC(itt_d_fit))
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.5358e-14 1.6258e-16 94.464 < 2.2e-16 ***
ASSIGNED 2.7336e-01 1.1733e-02 23.299 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimating the regressions
\(ITT_Y\) with robust standard errors
Here we estimate the ITT of the whole population
Note that we regress the outcome on ASSIGNED, not on TREATED
Actual treatment status can be “endogenous”, that is, related to unobserved factors (\(u_i\)) that affect outcomes, whereas random assignment is not
Those assigned to the treatment group were 3.84 percentage points more likely to vote
The estimated ITT may be a useful thing to know!
If you are conducting an evaluation of a programme, you can use the ITT to assess the programme’s output in relation to its costs
# Box 5.4: ITT with robust SEs
itt_fit <- lm(VOTED ~ ASSIGNED, data = data2)
coeftest(itt_fit, vcovHC(itt_fit))
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.375376 0.006446 58.2344 < 2.2e-16 ***
ASSIGNED 0.038464 0.014479 2.6565 0.007914 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimating the regressions
CACE using 2SLS
Finally, here we estimate the CACE
It is the effect of the treatment on the compliers
So we could just have used the formula: \(CACE = \frac{ITT_Y}{ITT_D} = \frac{0.038464}{0.2734} \approx 0.1407\)
The estimated average treatment effect of the canvassing treatment among Compliers is a 14.07 percentage point increase in the probability of voting
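With a single binary instrument, 2SLS and the Wald ratio \(ITT_Y / ITT_D\) are numerically identical; a simulated sketch in base R (illustrative numbers, not the canvassing data):

```r
# Sketch: 2SLS with one binary instrument reproduces the Wald ratio ITT_Y / ITT_D
set.seed(123)
n <- 10000
z <- rbinom(n, 1, 0.5)                     # random assignment (the instrument)
complier <- rbinom(n, 1, 0.25)
d <- z * complier                          # one-sided non-compliance
y <- 0.38 + 0.14 * d + rnorm(n, sd = 0.1)  # true effect on compliers = 0.14
itt_y <- mean(y[z == 1]) - mean(y[z == 0])
itt_d <- mean(d[z == 1]) - mean(d[z == 0])
cace_wald <- itt_y / itt_d
# Manual 2SLS: first stage regresses d on z, second stage uses the fitted values
d_hat <- fitted(lm(d ~ z))
cace_2sls <- coef(lm(y ~ d_hat))[["d_hat"]]
c(cace_wald, cace_2sls)                    # the two estimates coincide
```

The equivalence holds because, with one binary instrument, the second-stage slope is \(\text{cov}(Y, Z)/\text{cov}(D, Z)\), which is exactly the ratio of the two differences in means.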
We could have estimated the CACE using the ivreg function from the AER package and gotten the same result
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.375376 0.006446 58.2344 <2e-16 ***
TREATED 0.140711 0.052434 2.6836 0.0073 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimating the regressions
Using estimatr
There’s no need to learn how to use ivreg if you don’t want to!
Our familiar estimatr package has a function called iv_robust that does the same thing
The results are the same as before, and we also see that the 95% confidence interval ranges from 0.038 to 0.243, that is, canvassing increases the probability of voting by 3.8 to 24.3 percentage points
The effect is positive and statistically significant
# Box 5.6: CACE
# Load estimatr
library(estimatr)

# CACE with estimatr
cace_fit2 <- iv_robust(VOTED ~ TREATED | ASSIGNED, data = data2)
cace_fit2
Estimate Std. Error t value Pr(>|t|) CI Lower CI Upper
(Intercept) 0.3753764 0.00644539 58.239525 0.000000000 0.36274155 0.3880113
TREATED 0.1407115 0.05241688 2.684469 0.007281396 0.03795877 0.2434642
DF
(Intercept) 7088
TREATED 7088
Designs that anticipate non-compliance 🤓
Large-\(n\) designs
Non-compliance not only prevents us from estimating the true ATE, it also makes CACE estimation more challenging.
While 2SLS is a consistent estimator for the CACE, the estimator becomes much less precise if the proportion of compliers is small
So the first advice is to design experiments with large sample sizes to increase the number of compliers
This is not always feasible, though, as it can be expensive
But if you can, do it! 😊
Placebo designs
A more realistic approach is to anticipate non-compliance and include placebo conditions in the experiment
This is done in two steps:
First, subjects are recruited to the study and assigned to treatment and control groups
Second, given compliance, subjects are randomly allocated to two groups:
The treatment group receives the treatment in the usual way
The placebo group receives a “non-treatment” that is assumed to have no effect on the outcome of interest
For instance, we could have a placebo group that receives a fake canvassing treatment, such as information about the importance of recycling or the benefits of exercise
CACE can be estimated by comparing the outcomes for those given the canvassing treatment and those given the “non-treatment”
Placebo designs
Why does this work?
Because the main problem in one-sided non-compliance is the existence of never-takers
But if we randomise the treatment amongst the compliers, we screen out the never-takers by design
Compliers in the treated state can then be compared directly to compliers in the untreated state, which eliminates the noise generated by the never-takers
Thus, amongst the compliers, we are back to full compliance and can estimate the CACE directly!
This is a very powerful tool in experimental research
Think about it when designing your experiments! 😊
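A simulated sketch of the two-step logic in base R (all numbers are hypothetical, and the placebo is assumed to have exactly zero effect):

```r
# Sketch of a placebo design: only compliers can be contacted, and contacted
# subjects are randomly given the real message or a placebo message
set.seed(99)
n <- 10000
complier <- rbinom(n, 1, 0.25)       # only these open the door
placebo <- rbinom(n, 1, 0.5)         # second randomisation amongst contacts
baseline <- 0.35 + 0.10 * complier   # compliers vote more to begin with
treated <- complier == 1 & placebo == 0
y <- baseline + 0.14 * treated + rnorm(n, sd = 0.1)
# Compare compliers given the real treatment with compliers given the placebo
contacted <- complier == 1
est <- mean(y[contacted & placebo == 0]) - mean(y[contacted & placebo == 1])
est                                  # close to the true effect of 0.14
```

The `baseline` term illustrates why the naive as-treated comparison fails (compliers differ from never-takers to begin with) and why the placebo comparison does not: both placebo arms contain only compliers, so the baseline difference cancels out.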
Partial treatment
Finally, what to do when we have partial treatment?
For instance, a subject interrupts the medical treatment before the end
The easiest and most widely used approach is to classify the partially-treated subject as untreated, estimate the CACE, and then classify the subject as treated and estimate the CACE again
Those two estimates provide bounds for the CACE
The lower bound is the estimate when the subject is classified as treated
The upper bound is the estimate when the subject is classified as untreated
While not perfect, this strategy at least provides a range of possible values for the CACE and allows us to quantify the uncertainty in our estimates
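The bounding strategy above can be sketched in base R (simulated data; the partial-treatment effect of 0.05 and all shares are illustrative assumptions):

```r
# Sketch: bound the CACE by coding partially-treated subjects both ways
set.seed(7)
n <- 10000
z <- rbinom(n, 1, 0.5)
status <- sample(c("full", "partial", "none"), n, replace = TRUE,
                 prob = c(0.2, 0.1, 0.7))
d_full <- as.numeric(z == 1 & status == "full")     # fully treated
d_part <- as.numeric(z == 1 & status == "partial")  # partially treated
y <- 0.40 + 0.15 * d_full + 0.05 * d_part + rnorm(n, sd = 0.1)
# Wald ratio ITT_Y / ITT_D for a given coding of treatment status
wald <- function(d) {
  (mean(y[z == 1]) - mean(y[z == 0])) / (mean(d[z == 1]) - mean(d[z == 0]))
}
cace_lower <- wald(d_full + d_part)  # partial coded as treated: lower bound
cace_upper <- wald(d_full)           # partial coded as untreated: upper bound
c(cace_lower, cace_upper)
```

Coding the partially-treated as treated inflates the denominator \(ITT_D\) and shrinks the ratio (lower bound); coding them as untreated does the opposite (upper bound).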
Conclusion 📚
Conclusion
Non-compliance is a big problem in experimental research
One-sided non-compliance is when units in the treatment group do not receive the treatment
We have seen that we have several options to deal with non-compliance, but the best one is to use instrumental variables (IV)
IV allows us to estimate the complier average causal effect (CACE), which is the effect of the treatment on the compliers
We have also seen that large-\(n\) designs and placebo designs can help anticipate non-compliance
Next class, we will discuss two-sided non-compliance, which is when some people in the treatment group do not receive the treatment and some people in the control group do receive the treatment