
The Experimental Ideal

EC 425/525, Set 2

Edward Rubin

09 April 2019

1 / 42

Prologue

2 / 42


Schedule

Last time

Research basics, our class, and R

Today

Admin: Canvas exists. Updated class website.
Material: The Rubin causal model (not mine), Chapter 2 MHE.
Assignment 1 Install R and RStudio on your computer.
Assignment 2 Take 15 minutes to quietly think about your interests.
Assignment 3 First step of project proposal due April 15th.

Future

Lab: Matrix work, regression, functions, simulation
Long run: Deepen our understanding of, and intuition for, causality and inference.

3 / 42

Review

Research fundamentals

4 / 42


Review

Research fundamentals

Angrist and Pischke provide four fundamental questions for research:

  1. What is the causal relationship of interest?

  2. How would an ideal experiment capture this causal effect of interest?

  3. What is your identification strategy?

  4. What is your mode of inference?

Seemingly straightforward questions can be fundamentally unanswerable.

5 / 42

Review

General research recommendations

More unsolicited advice:

  • Be curious.

  • Ask questions.

  • Attend seminars.

  • Meet faculty (UO + visitors).

  • Focus on learning—especially intuition.

  • Be kind and constructive.

Learning is not always the same as getting good grades.

6 / 42

The experimental ideal

7 / 42

The experimental ideal

What's so great about experiments?

Science widely regards experiments as the gold standard for research.

But why? The costs can be substantial.

Costs

  • slow and expensive
  • heavily regulated by review boards
  • can abstract away from the actual question/setting

Benefits

So the benefits need to be pretty large, right?

8 / 42

The experimental ideal

Example: Hospitals and health

Imagine we want to know the causal effect of hospitals on health.

Research question

Within the population of poor, elderly individuals, does visiting the emergency room for primary care improve health?

Empirical exercise

  1. Collect data on health status and hospital visits.
  2. Summarize health status by hospital-visit group.
9 / 42

The experimental ideal

Example: Hospitals and health

Our empirical exercise from the 2005 National Health Interview Survey:

Group         Sample Size   Mean Health Status   Std. Error
Hospital            7,774                 3.21        0.014
No hospital        90,049                 3.93        0.003

We get a t statistic of 58.9 when testing the difference in the groups' means (3.93 − 3.21 = 0.72).

Conclusion? Hospitals make folks worse. Hospitals make sick people sicker.

Alternative conclusion: Perhaps we're making a mistake in our analysis...
maybe sick people go to hospitals?

10 / 42
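As a sanity check on the comparison above, a two-sample t statistic can be computed directly from each group's mean and standard error. A minimal sketch (in Python, though our labs use R); with the rounded table values it lands near 50, and in any case far beyond conventional critical values:

```python
import math

def t_stat(mean1, se1, mean0, se0):
    # Two-sample t statistic for a difference in group means,
    # built from each group's mean and standard error.
    return (mean1 - mean0) / math.sqrt(se1**2 + se0**2)

# Rounded values from the NHIS table: no-hospital vs. hospital group
t = t_stat(3.93, 0.003, 3.21, 0.014)
print(round(t, 1))
```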

The experimental ideal

Potential outcomes framework

Let's develop a framework to better discuss the problem here.

  • Binary treatment variable (e.g., hospitalized): Di ∈ {0, 1}
  • Outcome for individual i (e.g., health): Yi

This framework has a few names...

  • Neyman potential outcomes framework
  • Rubin causal model
  • Neyman-Rubin "potential outcome"|"causal" "framework"|"model"
11 / 42

The experimental ideal

Potential outcomes framework

Research question: Does Di affect Yi?

For each individual i, there are two potential outcomes (w/ binary Di)

  1. Y1i if Di = 1
    i's health outcome if she went to the hospital

  2. Y0i if Di = 0
    i's health outcome if she did not go to the hospital

The difference between these two outcomes gives us the causal effect of hospital treatment, i.e.,

τi = Y1i − Y0i

12 / 42
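A toy simulation makes this bookkeeping concrete (a Python sketch with made-up numbers; our labs use R). The simulator generates both potential outcomes for every unit, but the observable data contain only Yi = Y0i + Di(Y1i − Y0i):

```python
import random

random.seed(425)

# Each unit carries BOTH potential outcomes -- known only to the simulator
units = [{"y0": random.gauss(4, 1)} for _ in range(5)]
for u in units:
    u["y1"] = u["y0"] + 0.5          # individual effect tau_i = 0.5 (assumed)
    u["d"] = random.randint(0, 1)    # treatment status
    # The observed outcome reveals y1 OR y0, never both:
    u["y"] = u["y0"] + u["d"] * (u["y1"] - u["y0"])

for u in units:
    print(u["d"], round(u["y"], 2))  # tau_i is not recoverable from (d, y)
```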

The experimental ideal

#problems

This simple equation τi = Y1i − Y0i leads us to the fundamental problem of causal inference.

We can never simultaneously observe Y1i and Y0i.

Most of applied econometrics focuses on addressing this simple problem.

Accordingly, our methods try to address the related question

For each Y1i, what is a (reasonably) good counterfactual?

13 / 42

The experimental ideal

Solutions?

Problem We cannot directly calculate τi = Y1i − Y0i.

Proposed solution
Compare outcomes for people who visited the hospital (Y1i | Di = 1)
to outcomes for people who did not visit the hospital (Y0j | Dj = 0).

E[Yi | Di = 1] − E[Yi | Di = 0] gives us the observed difference in health outcomes.

Q This comparison will return an answer, but is it the answer we want?

14 / 42

The experimental ideal

Selection

Q What does E[Yi | Di = 1] − E[Yi | Di = 0] actually tell us?

A First notice that we can write i's outcome Yi as Yi = Y0i + Di(Y1i − Y0i) = Y0i + Di·τi

Now write out our expectation, apply this definition, do creative math.

E[Yi | Di = 1] − E[Yi | Di = 0]
= E[Y1i | Di = 1] − E[Y0i | Di = 0]
= E[Y1i | Di = 1] − E[Y0i | Di = 1]   (average treatment effect on the treated 😀)
  + E[Y0i | Di = 1] − E[Y0i | Di = 0]   (selection bias 😞)

15 / 42
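We can see this decomposition in action with a quick simulation (a Python sketch with fabricated numbers; our labs use R). Hospitals help everyone by τi = 0.5, but sicker people select into treatment, so the naive difference in means comes out negative:

```python
import random

random.seed(525)
n = 100_000

treated, control, effects = [], [], []
for _ in range(n):
    y0 = random.gauss(4, 1)      # health without hospital (Y0)
    y1 = y0 + 0.5                # hospitals HELP: tau_i = +0.5
    d = 1 if y0 < 3.5 else 0     # but sicker people select into treatment
    (treated if d else control).append(y1 if d else y0)
    effects.append(y1 - y0)

ate = sum(effects) / n
naive = sum(treated) / len(treated) - sum(control) / len(control)

print(round(ate, 2))    # true average effect: 0.5
print(round(naive, 2))  # naive difference in means: negative (selection bias)
```

The naive comparison gets even the sign wrong here, exactly because E[Y0i | Di = 1] < E[Y0i | Di = 0].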

The experimental ideal

Selection

The first term is good variation—essentially the answer that we want.
E[Y1i | Di = 1] − E[Y0i | Di = 1]
   = E[Y1i − Y0i | Di = 1]
   = E[τi | Di = 1]
The average causal effect of hospitalization for hospitalized individuals.

The second term is bad variation—preventing us from knowing the answer.
E[Y0i | Di = 1] − E[Y0i | Di = 0]
The difference in the average untreated outcome between the treatment and control groups.

Selection bias The extent to which the "control group" provides a bad counterfactual for the treated individuals.

16 / 42

The experimental ideal

Selection

Angrist and Pischke (MHE, p. 15),

The goal of most empirical economic research is to overcome selection bias, and therefore to say something about the causal effect of a variable like Di.

Q So how do experiments—the gold standard of empirical economic (and scientific) research—accomplish this goal and overcome selection bias?

17 / 42

The experimental ideal

Back to experiments

Q How do experiments overcome selection bias?
A Experiments break the link between potential outcomes and treatment.

In other words: Randomly assigning Di makes Di independent of which outcome we observe (meaning Y1i or Y0i).

Difference in means with random assignment of Di
E[Yi | Di = 1] − E[Yi | Di = 0]
= E[Y1i | Di = 1] − E[Y0i | Di = 0]
= E[Y1i | Di = 1] − E[Y0i | Di = 1]   (from random assignment of Di)
= E[Y1i − Y0i | Di = 1]
= E[τi | Di = 1]
= E[τi]   (random assignment of Di breaks selection bias)

18 / 42
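The same derivation in simulation form (a Python sketch, made-up numbers; our labs use R): when Di is assigned independently of (Y0i, Y1i), the raw difference in means recovers E[τi].

```python
import random

random.seed(2019)
n = 200_000

treated, control = [], []
for _ in range(n):
    y0 = random.gauss(4, 1)        # potential outcome without treatment
    y1 = y0 + 0.5                  # constant treatment effect tau_i = 0.5
    d = random.randint(0, 1)       # RANDOM assignment, independent of (y0, y1)
    (treated if d else control).append(y1 if d else y0)

diff = sum(treated) / len(treated) - sum(control) / len(control)
print(round(diff, 2))  # close to 0.5: difference in means recovers E[tau_i]
```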

The experimental ideal

Randomly assigned treatment

The key to avoiding selection bias: random assignment of treatment
(or as-good-as random assignment, e.g., natural experiments).

Random assignment of treatment gives us E[Y0i | Di = 0] = E[Y0i | Di = 1], meaning the control group's mean now provides a good counterfactual for the treatment group's mean.

In other words, there is no selection bias, i.e.,

Selection bias = E[Y0i | Di = 1] − E[Y0i | Di = 0] = 0
19 / 42

The experimental ideal

Randomly assigned treatment

Additional benefit of randomization:

The average treatment effect is now representative of the population average, rather than the treatment-group average.

E[τi | Di = 1] = E[τi | Di = 0] = E[τi]

20 / 42

The experimental ideal

Example: Training programs

Governments subsidize training programs to assist disadvantaged workers.

Q Do these programs have the desired effects (i.e., increase wages)?

A Observational studies—comparing wage data from participants and non-participants—often find that people who complete these programs actually make lower wages.

Challenges Participants self-select. + Programs target lower-wage workers.

21 / 42

The experimental ideal

Example: Training programs

How do we formalize these concerns in our framework?

Observational program evaluations
E[Wagei | Programi = 1] − E[Wagei | Programi = 0]
= E[Wage1i | Programi = 1] − E[Wage0i | Programi = 1]   (average causal effect of the training program on participants' wages, i.e., τ̄1)
+ E[Wage0i | Programi = 1] − E[Wage0i | Programi = 0]   (selection bias)

If the program attracts/selects individuals who, on average, have lower wages without the program (sort of the point of the program), then we have negative selection bias.

22 / 42

The experimental ideal

Example: Training programs

E[Wagei | Programi = 1] − E[Wagei | Programi = 0]
= E[Wage1i | Programi = 1] − E[Wage0i | Programi = 1]
+ E[Wage0i | Programi = 1] − E[Wage0i | Programi = 0]

So even if the program, on average, has a positive wage effect (in the participant group), i.e., τ̄1 > 0, we will detect a lower effect due to the negative selection bias.

If the bias is sufficiently large (relative to the treatment effect), our estimate will even get the sign of the effect wrong.

Related While observational studies typically found negative program effects, several experiments found positive program effects.

23 / 42

The experimental ideal

Example: The STAR experiment

The Tennessee STAR experiment is a famous/popular example of an experiment that allows us to answer an important social/policy question.

Research question Do classroom resources affect student performance?

  • Statewide(-ish) in Tennessee for the 1985–1986 kindergarten cohort
  • Ran for 4 years with ∼11,600 children. Cost ∼$12 million.

Treatments

  1. Small classes (13–17 students)
  2. Regular classes (22–35 students) plus part-time teacher's aide
  3. Regular classes (22–35 students) plus full-time teacher's aide
24 / 42

The experimental ideal

Example: The STAR experiment

First question Did the randomization balance participants' characteristics across the treatment groups?

Ideally, we would have pre-experiment data on the outcome variable.

Unfortunately, we only have a few demographic attributes.

25 / 42
Table 2.2.1, MHE
Treatment: Class Size

Variable             Small   Regular   Regular + Aide   P-value
Free lunch            0.47      0.48             0.50      0.09
White/Asian           0.68      0.67             0.66      0.26
Age in 1985           5.44      5.43             5.42      0.32
Attrition rate        0.49      0.52             0.53      0.02
K. class size        15.10     22.40            22.80      0.00
K. test percentile   54.70     48.90            50.00      0.00

  • Demographics appear balanced across the three treatment groups.
  • The three groups differ significantly on attrition rate.
  • The randomization generated variation in the treatment (class size).
  • The small-class treatment significantly increased test scores.

26–30 / 42

The experimental ideal

The STAR experiment

The previous table estimated/compared the treatment effects using simple differences in means.

We can make the same comparisons using regressions.

Specifically, we regress our outcome (test percentile) on dummy variables (binary indicator variables) for each treatment group.

31 / 42

The experimental ideal

Example of our three treatment dummies.

  i      yi      Trt1i   Trt2i   Trt3i
  1      y1        1       0       0
  2      y2        1       0       0
  ⋮       ⋮         ⋮       ⋮       ⋮
  p      yp        0       1       0
 p+1    yp+1       0       0       1
  ⋮       ⋮         ⋮       ⋮       ⋮
  N      yN        0       0       1

32 / 42

The experimental ideal

Regression analysis

Assume for the moment that the treatment effect is constant, i.e.,

You'll often hear econometricians say "homogeneous" (vs. "heterogeneous").

Y1i − Y0i = ρ

then we can rewrite Yi = Y0i + Di(Y1i − Y0i) as

Yi = α + Di·ρ + ηi

where α = E[Y0i], ρ = Y1i − Y0i, and ηi = Y0i − E[Y0i].

33 / 42

The experimental ideal

Regression analysis

Yi = α + Di·ρ + ηi

Now write out the conditional expectation of Yi for both levels of Di

E[Yi | Di = 1] = E[α + ρ + ηi | Di = 1] = α + ρ + E[ηi | Di = 1]

E[Yi | Di = 0] = E[α + ηi | Di = 0] = α + E[ηi | Di = 0]

Take the difference...

E[Yi | Di = 1] − E[Yi | Di = 0]
   = ρ + E[ηi | Di = 1] − E[ηi | Di = 0]   (selection bias)

34 / 42
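This algebra is exactly what OLS on a single treatment dummy does: with a constant and one binary regressor, the estimated ρ is numerically identical to the difference in group means. A small check (a Python/numpy sketch with made-up data; our labs use R):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

d = rng.integers(0, 2, size=n)             # randomly assigned treatment dummy
y = 50 + 5.0 * d + rng.normal(0, 10, n)    # outcome with true rho = 5

# OLS of y on a constant and the dummy
X = np.column_stack([np.ones(n), d])
alpha_hat, rho_hat = np.linalg.lstsq(X, y, rcond=None)[0]

diff_in_means = y[d == 1].mean() - y[d == 0].mean()

# With one binary regressor plus a constant, the OLS slope IS the
# difference in group means (up to floating-point error).
print(np.isclose(rho_hat, diff_in_means))  # True
```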

The experimental ideal

Regression analysis

E[Yi | Di = 1] − E[Yi | Di = 0] = ρ + E[ηi | Di = 1] − E[ηi | Di = 0]

Again, our estimate of the treatment effect (ρ) is only going to be as good as our ability to shut down the selection bias.

Selection bias in regression model: E[ηi | Di = 1] − E[ηi | Di = 0]

Selection bias here should remind you a lot of omitted-variable bias.

There is something in our disturbance ηi that is affecting Yi and is also correlated with Di.

In other metrics-y words: Our treatment Di is endogenous.

35 / 42

The experimental ideal

Solutions and covariates

Selection bias in regression model: E[ηi | Di = 1] − E[ηi | Di = 0]

As before, if we randomly assign Di, then selection bias disappears.

Another potential route to identification is to condition on covariates in the hopes that they "take care of" the relationship between Di and whatever is in our disturbance ηi.

Without very clear reasons explaining how you know you've controlled for the "bad variation", clean and convincing identification on this path is going to be challenging.

36 / 42

The experimental ideal

Covariates

That said, covariates can help with two things:

  1. Even experiments may need conditioning/controls: The STAR experiment was random within school—not across schools.

  2. Covariates can soak up unexplained variation—increasing precision.

Now that we've seen regression can analyze experiments, let's estimate the STAR example...

37 / 42
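Point 2 shows up in a small simulation (a Python/numpy sketch with fabricated data; our labs use R): adding a covariate that predicts the outcome shrinks the residual variance, and hence the standard error on the treatment dummy, without changing what the coefficient estimates.

```python
import numpy as np

def ols_se(X, y):
    # Classical (homoskedastic) OLS standard errors.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)
    return np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

rng = np.random.default_rng(525)
n = 5_000
x = rng.normal(0, 1, n)             # covariate that predicts the outcome
d = rng.integers(0, 2, size=n)      # randomly assigned treatment
y = 50 + 5.0 * d + 8.0 * x + rng.normal(0, 3, n)

se_no_ctrl = ols_se(np.column_stack([np.ones(n), d]), y)[1]
se_ctrl = ols_se(np.column_stack([np.ones(n), d, x]), y)[1]

print(se_ctrl < se_no_ctrl)  # True: the covariate soaks up residual variation
```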
Table 2.2.2, MHE

Explanatory variable      (1)       (2)       (3)
Small class              4.82      5.37      5.36
                        (2.19)    (1.26)    (1.21)
Regular + aide           0.12      0.29      0.53
                        (2.23)    (1.13)    (1.09)
White/Asian                                  8.35
                                            (1.35)
Female                                       4.48
                                            (0.63)
Free lunch                                 -13.15
                                            (0.77)
School F.E.                F         T         T

  • The omitted level is Regular (with part-time aide).
  • Results without other controls are very similar to the difference in means.
  • School FEs enforce the experiment's design and increase precision.
  • Additional controls slightly increase precision.

38–41 / 42
