class: center, middle, inverse, title-slide

# Regression Stuff

## EC 607, Set 05

### Edward Rubin

---
class: inverse, middle

# Prologue

---
name: previously

# Schedule

## Last time: Inference and simulation

Let's review using a quote from *MHE*

> We've chosen to start with the .hi[asymptotic approach to inference] because modern empirical work typically leans heavily on the large-sample theory that lies behind robust variance formulas. The .hi[payoff is valid inference under weak assumptions], in particular, a framework that makes sense for our less-than-literal approach to regression models. On the other hand, the .hi[large-sample approach is not without its dangers]...

.grey-light[.small[*MHE*, p. 48 (emphasis added)]]

---
name: schedule

# Schedule

## Today

Regression and causality <br>.note[Read] *MHE* 3.2

## Upcoming

Project, step 1 <br>Assignment \#2

---
name: advice
layout: false
class: clear, middle

.attn[Advice] Make sure you're taking a few minutes for personal health..super[.pink[†]]

.footnote[.pink[†] *health* = physical, mental, and spiritual. Also: Do a better job than I do.]

---
layout: false
class: inverse, middle

# Regression talk
## Saturated models

---
layout: true

# Regression talk
## Saturated models

A .attn[saturated model] is a regression model that includes an indicator variable for each possible value (and, with multiple regressors, each possible combination of values) that the explanatory variables can take.

---
name: saturated

--

For discrete regressors, saturated models are pretty straightforward.

--

.ex[Example] For the relationship between .purple[Wages] and .orange[College Graduation],

--

$$
`\begin{align}
  \color{#6A5ACD}{\text{Wages}_{i}} = \alpha + \beta \, \color{#FFA500}{\mathbb{I}\left\{ \text{College Graduate} \right\}_i} + \varepsilon_i
\end{align}`
$$

---

For multi-valued variables, you need an indicator for each potential value.

--

.ex[Example.sub[2]] Regressing .purple[Wages] on .turquoise[Schooling] `\(\left(\color{#20B2AA}{s_i \in \left\{0,\,1,\,2,\,\ldots,\,T \right\}}\right)\)`.

--

$$
`\begin{align}
  \color{#6A5ACD}{\text{Wages}_{i}} = \alpha + \beta_1 \, \color{#20B2AA}{\mathbb{I}\left\{ s_i = 1 \right\}} + \beta_2 \, \color{#20B2AA}{\mathbb{I}\left\{ s_i = 2 \right\}} + \cdots + \beta_T \, \color{#20B2AA}{\mathbb{I}\left\{ s_i = T \right\}} + \varepsilon_i
\end{align}`
$$

--

Here, `\(\color{#20B2AA}{s_i=0}\)` is our reference level; `\(\beta_j\)` is the effect of `\(\color{#20B2AA}{j}\)` years of schooling (relative to zero years).

--

$$
`\begin{align}
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#20B2AA}{s_i = j} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#20B2AA}{s_i = 0} \right] = \alpha + \beta_j - \alpha = \beta_j
\end{align}`
$$

---
layout: true

# Regression talk
## Saturated models

---

.qa[Q] Why focus on saturated models?

--

.qa[A] .hi-slate[Saturated models perfectly fit the CEF]

--

because the CEF is a linear function of the dummy variables—a special case of the linear CEF theorem.
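---

.note[Aside] A minimal sketch of a saturated model in R (all data here are simulated; the names are hypothetical, not from a real dataset). `factor()` generates the full set of indicators, so the fitted values reproduce the conditional means of the outcome, *i.e.*, the CEF.

```r
# Hedged sketch: simulated wages and a multi-valued schooling variable
library(tidyverse)
set.seed(101)
sat_df = tibble(
  s    = sample(0:4, size = 1e4, replace = TRUE),
  # A nonlinear CEF (a jump at s >= 3) that a saturated model still fits
  wage = 40 + 3 * s + 2 * (s >= 3) + rnorm(1e4, sd = 5)
)
# factor(s) creates an indicator for each value of s (s = 0 is the reference)
sat_reg = lm(wage ~ factor(s), data = sat_df)
# Intercept + beta_j should equal the conditional mean of wage at s = j
coef(sat_reg)[1] + c(0, coef(sat_reg)[-1])
sat_df %>% group_by(s) %>% summarize(avg_wage = mean(wage))
```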
---

If you have multiple explanatory variables, you need .attn[interactions].

--

.ex[Example.sub[3]] Regressing .purple[Wages] on .orange[College Graduation] and .red[Gender].

--

$$
`\begin{align}
  \color{#6A5ACD}{\text{Wages}_{i}} = \alpha &+ \beta_1 \, \color{#FFA500}{\mathbb{I}\left\{ \text{College Graduate} \right\}_i} + \beta_2 \, \color{#fb6107}{\mathbb{I}\left\{ \text{Female} \right\}_i} \\
  &+ \beta_3 \, \color{#FFA500}{\mathbb{I}\left\{ \text{College Graduate} \right\}_i} \times \color{#fb6107}{\mathbb{I}\left\{ \text{Female} \right\}_i} + \varepsilon_i
\end{align}`
$$

--

Here, the uninteracted terms `\(\left( \beta_1\:\&\: \beta_2 \right)\)` are called .attn[main effects]; `\(\beta_3\)` gives the effect of the .attn[interaction].

--

$$
`\begin{align}
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 0},\, \color{#fb6107}{\text{Female}_i = 0} \right] &= \alpha \\
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 1},\, \color{#fb6107}{\text{Female}_i = 0} \right] &= \alpha + \beta_1 \\
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 0},\, \color{#fb6107}{\text{Female}_i = 1} \right] &= \alpha + \beta_2 \\
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 1},\, \color{#fb6107}{\text{Female}_i = 1} \right] &= \alpha + \beta_1 + \beta_2 + \beta_3
\end{align}`
$$

---

The CEF can take on four possible values,

$$
`\begin{align}
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 0},\, \color{#fb6107}{\text{Female}_i = 0} \right] &= \alpha \\
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 1},\, \color{#fb6107}{\text{Female}_i = 0} \right] &= \alpha + \beta_1 \\
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 0},\, \color{#fb6107}{\text{Female}_i = 1} \right] &= \alpha + \beta_2 \\
  \mathop{E}\left[ \color{#6A5ACD}{\text{Wages}_i} \mid \color{#FFA500}{\text{College Graduate}_i = 1},\, \color{#fb6107}{\text{Female}_i = 1} \right] &= \alpha + \beta_1 + \beta_2 + \beta_3
\end{align}`
$$

and the specification of our saturated regression model

$$
`\begin{align}
  \color{#6A5ACD}{\text{Wages}_{i}} = \alpha &+ \beta_1 \, \color{#FFA500}{\mathbb{I}\left\{ \text{College Graduate} \right\}_i} + \beta_2 \, \color{#fb6107}{\mathbb{I}\left\{ \text{Female} \right\}_i} \\
  &+ \beta_3 \, \color{#FFA500}{\mathbb{I}\left\{ \text{College Graduate} \right\}_i} \times \color{#fb6107}{\mathbb{I}\left\{ \text{Female} \right\}_i} + \varepsilon_i
\end{align}`
$$

does not restrict the CEF at all.
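---

.note[Aside] A sketch in R (simulated data again; names hypothetical): with two binary regressors, `grad * female` expands to both main effects plus the interaction, so the fitted values match the four conditional means above.

```r
# Hedged sketch: a saturated two-regressor model
library(tidyverse)
set.seed(102)
int_df = tibble(
  grad   = rep(c(0, 1), times = 5e3),
  female = rep(c(0, 1), each = 5e3),
  wage   = 50 + 6 * grad - 4 * female + 2 * grad * female + rnorm(1e4, sd = 5)
)
# grad * female = grad + female + grad:female (the saturated specification)
lm(wage ~ grad * female, data = int_df)
# Compare the four cell means: alpha; alpha + b1; alpha + b2; alpha + b1 + b2 + b3
int_df %>% group_by(grad, female) %>% summarize(avg_wage = mean(wage), .groups = "drop")
```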
---
layout: true

# Regression talk
## Model specification

---
name: specification

*Saturated models* sit at one extreme of the model-specification spectrum, with *linear, uninteracted models* occupying the opposite extreme.

.pull-left[

.attn[Saturated models]

- Fit CEF `\((+)\)`
- Complex `\((-)\)`
  - Many dummies
  - Many interactions

]

.pull-right[

.attn[Plain, linear models]

- Linear approximations `\((-)\)`
- Simple `\((+)\)`

]

--

Don't forget that there are many options in between—though some make less sense than others (_e.g._, interactions without main effects).

---

.note[Note] Saturated models perfectly fit the CEF regardless of `\(\text{Y}_{i}\)`'s distribution.

--

Continuous, binary (a linear probability model), logged, non-negative—it works for all.

---
layout: false
class: clear, middle

Now back to causality...

---
class: inverse, middle

# Regression and causality

---
layout: true

# Regression and causality
## The return of causality

---
name: causal_reg

We've spent the last few lectures developing properties/understanding of (.hi-slate[1]) the CEF and (.hi-slate[2]) least-squares regression.

Let's return to our main goal of the course...

.qa[Q] When can we actually interpret a regression as .hi[causal]?.super[.pink[†]]

.footnote[.pink[†] *Hint:* There is no ".mono[reg y x, causal]" command in Stata.]

--

.qa[A] A regression is causal when the CEF it approximates is causal.

---

Great... thanks.

.qa[Q] So when is a CEF causal?

--

.qa[A] First, return to the potential-outcomes framework, describing hypothetical outcomes.

> A CEF is causal when it describes .hi[differences in average potential outcomes] for a fixed reference population.

.grey-light[*MHE*, p. 52 (emphasis added)]

--

Let's work through this "definition" of causal CEFs with an example.

---
layout: true

# Regression and causality
## Causal CEFs

---
name: causal_cef

.ex[Example] The (causal) effect of schooling on income.

--

The causal effect of schooling for individual `\(i\)` would tell us how `\(i\)`'s <br>.hi-purple[earnings] `\(\color{#6A5ACD}{\text{Y}_{i}}\)` would change if we varied `\(i\)`'s .hi-pink[level of schooling] `\(\color{#e64173}{s_i}\)`.

--

Previously, we discussed how experiments randomly assign treatment to .it[ensure the variable of interest is independent of potential outcomes].

--

Now we would like to .hi-slate[extend this framework] to

1. variables that take on .hi-slate[more than two values]

2. situations that require us to .hi-slate[hold many covariates constant] in order to achieve a valid causal interpretation

---

The idea of *holding (many) covariates constant* brings us to one of the cornerstones of applied econometrics: the .attn[conditional independence assumption (CIA)] (also called *selection on observables*).

---
layout: true

# Regression and causality
## The conditional independence assumption

---
name: cia

.note[Definition(s)]

- Conditional on some set of covariates `\(\text{X}_{i}\)`, selection bias disappears.

--

- Conditional on `\(\text{X}_{i}\)`, potential outcomes `\(\left( \color{#6A5ACD}{\text{Y}_{0i}},\, \color{#6A5ACD}{\text{Y}_{1i}} \right)\)` are independent of treatment status `\(\left( \color{#e64173}{\text{D}_{i}} \right)\)`.

$$
`\begin{align}
  \def\ci{\perp\mkern-10mu\perp}
  \left\{ \color{#6A5ACD}{\text{Y}_{0i}},\,\color{#6A5ACD}{\text{Y}_{1i}} \right\} \ci \color{#e64173}{\text{D}_{i}} \mid \text{X}_{i}
\end{align}`
$$

--

To see how CIA eliminates selection bias...

Selection bias `\(= \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i},\, \color{#e64173}{\text{D}_{i}= 1} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i},\, \color{#e64173}{\text{D}_{i}= 0} \right]\)`

--

<br>.white[Selection bias] `\(= \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i} \right]\)`

--

<br>.white[Selection bias] `\(= 0\)`
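---

.note[Aside] A quick simulation sketch of this definition (everything here is made up: `x` is the lone confounder, and the true treatment effect is 3 by construction). The raw difference in means picks up selection bias; within values of `x`, it does not.

```r
# Hedged sketch: simulated potential outcomes satisfying the CIA given x
library(tidyverse)
set.seed(103)
cia_df = tibble(
  x  = rep(c(0, 1), each = 5e4),
  d  = rbinom(1e5, size = 1, prob = 0.2 + 0.6 * x),  # treatment depends on x
  y0 = 10 + 5 * x + rnorm(1e5),                      # untreated outcome rises with x
  y1 = y0 + 3,                                       # constant treatment effect of 3
  y  = y0 + d * (y1 - y0)                            # observed outcome
)
# Unconditional difference in means: treatment effect plus selection bias
with(cia_df, mean(y[d == 1]) - mean(y[d == 0]))
# Conditional on x, selection bias disappears (both differences land near 3)
cia_df %>% group_by(x) %>% summarize(diff = mean(y[d == 1]) - mean(y[d == 0]))
```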
---
name: cia_binary

Another way you'll hear CIA: After controlling for some set of variables `\(\text{X}_{i}\)`, treatment assignment is .it[.hi-slate[as good as random]].

--

To see how this assumption.super[.pink[†]] buys us a causal interpretation, write out our old difference in means—but now condition on `\(\text{X}_{i}\)`.

.footnote[.pink[†] Another way to think about econometric assumptions is as requirements.]

--

$$
`\begin{align}
  &\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{\text{D}_{i}=1} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{\text{D}_{i}=0} \right] \\[0.5em]
  &= \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{1i}} \mid \text{X}_{i},\, \color{#e64173}{\text{D}_{i}=1} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i},\, \color{#e64173}{\text{D}_{i}=0} \right] \\[0.5em]
  &= \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{1i}} \mid \text{X}_{i} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i} \right] \\[0.5em]
  &= \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{1i}} - \color{#6A5ACD}{\text{Y}_{0i}} \mid \text{X}_{i} \right]
\end{align}`
$$

where the second equality applies the CIA to drop conditioning on `\(\color{#e64173}{\text{D}_{i}}\)`.

--

Even randomized experiments sometimes need the CIA—_e.g._, the STAR experiment's .it[within-school] randomization.

---
name: cia_multi

Now let's extend this framework to .hi-slate[multi-valued explanatory variables].

--

.ex[Example continued] .pink[Schooling] `\(\left( \color{#e64173}{s_i} \right)\)` takes on integers `\(\in\left\{ 0,\,1,\,\ldots,\, T \right\}\)`.

We want to know the effect of an individual's .pink[schooling] on her .purple[wages] `\(\left( \color{#6A5ACD}{\text{Y}_{i}} \right)\)`.

--

Previously, `\(\color{#6A5ACD}{\text{Y}_{1i}}\)` denoted individual `\(i\)`'s outcome under treatment. Now, `\(\color{#6A5ACD}{\text{Y}_{si}}\)` denotes individual `\(i\)`'s outcome .pink[with] `\(\color{#e64173}{s}\)` .pink[years of schooling].

--

Let each individual have her own function linking .pink[schooling] to .purple[earnings].

$$
`\begin{align}
  \color{#6A5ACD}{\text{Y}_{si}} \equiv \mathop{f_i}(\color{#e64173}{s})
\end{align}`
$$

`\(\mathop{f_i}(\color{#e64173}{s})\)` answers exactly the type of causal question that we want to ask.

---

Extending the CIA to this multi-valued setting...

.center[
`\(\color{#6A5ACD}{\text{Y}_{si}} \ci \color{#e64173}{s_i} \mid \text{X}_{i}\enspace\)` for all `\(\color{#e64173}{s}\)`
]

--

If we apply the CIA to `\(\color{#6A5ACD}{\text{Y}_{si}} \equiv \mathop{f_i}(\color{#e64173}{s})\)`, we define the .it[average causal effect] of a one-year increase in .pink[schooling] as

$$
`\begin{align}
  \mathop{E}\left[ \mathop{f_i}(\color{#e64173}{s}) - \mathop{f_i}(\color{#e64173}{s-1}) \mid \text{X}_{i} \right]
\end{align}`
$$

--

However, the data only contain one realization of `\(f_i(\color{#e64173}{s})\)` per `\(i\)`—we only see `\(f_i(\color{#e64173}{s})\)` evaluated at exactly one value of `\(\color{#e64173}{s}\)` per `\(i\)`, _i.e._, `\(\color{#6A5ACD}{\text{Y}_{i}} = f_i(\color{#e64173}{s_i})\)`.

--

The CIA to the rescue!

--

Conditional on `\(\text{X}_{i}\)`, `\(\color{#6A5ACD}{\text{Y}_{si}}\)` and `\(\color{#e64173}{s_i}\)` are independent.
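---

.note[Aside] A sketch of the `\(f_i(\color{#e64173}{s})\)` notation in R (simulated; the response functions are linear with a common slope of 1.5 purely for illustration). Each `\(i\)` has a full schedule of potential outcomes, but the data reveal only one point on it.

```r
# Hedged sketch: potential-outcome schedules vs. what we actually observe
library(tidyverse)
set.seed(104)
n = 1e3
people = tibble(i = 1:n, a_i = rnorm(n, mean = 20, sd = 2))
# Every individual's full schedule: f_i(s) = a_i + 1.5 s for s = 0, ..., 12
schedules = expand_grid(i = 1:n, s = 0:12) %>%
  left_join(people, by = "i") %>%
  mutate(f_is = a_i + 1.5 * s)
# The data only show Y_i = f_i(s_i): one value of s per person
observed = people %>%
  mutate(s_i = sample(0:12, n, replace = TRUE), y = a_i + 1.5 * s_i)
```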
---

The CIA to the rescue!

Conditional on `\(\text{X}_{i}\)`, `\(\color{#6A5ACD}{\text{Y}_{si}}\)` and `\(\color{#e64173}{s_i}\)` are independent.

$$
`\begin{align}
  &\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = s} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = s-1} \right] \\[0.5em]
  &=\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{si}} \mid \text{X}_{i},\, \color{#e64173}{s_i = s} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{(s-1)i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = s-1} \right] \\[0.5em]
  &=\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{si}} \mid \text{X}_{i} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{(s-1)i}} \mid \text{X}_{i} \right] \\[0.5em]
  &=\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{si}} - \color{#6A5ACD}{\text{Y}_{(s-1)i}} \mid \text{X}_{i} \right] \\[0.5em]
  &=\mathop{E}\left[ \mathop{f_i}(\color{#e64173}{s}) - \mathop{f_i}(\color{#e64173}{s-1}) \mid \text{X}_{i} \right]
\end{align}`
$$

--

With the CIA, a difference in conditional averages has a causal interpretation.

---

.ex[Example] The causal effect of high-school graduation is

--

.big-left[
`\(\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = 11} \right]\)`
]

--

.big-left[
`\(=\mathop{E}\left[ f_i(\color{#e64173}{12}) \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right] - \mathop{E}\left[ f_i(\color{#e64173}{11}) \mid \text{X}_{i},\, \color{#e64173}{s_i = 11} \right]\)`
]

--

.big-left[
`\(=\mathop{E}\left[ f_i(\color{#e64173}{12}) \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right] - \mathop{E}\left[ f_i(\color{#e64173}{11}) \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right]\)` .grey-light[(from CIA)]
]

--

.big-left[
`\(=\mathop{E}\left[ f_i(\color{#e64173}{12}) - f_i(\color{#e64173}{11}) \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right]\)`
]

--

.big-left[
`\(=\)` The average causal effect of graduation .it[for graduates]
]

--

.big-left[
`\(=\mathop{E}\left[ f_i(\color{#e64173}{12}) - f_i(\color{#e64173}{11}) \mid \text{X}_{i} \right]\)` .grey-light[(CIA again)]
]

--

.big-left[
`\(=\)` The (conditional) average causal effect of graduation .it[at] `\(X_{i}\)`
]

---

.qa[Q] What about the .hi-slate[unconditional] average causal effect of graduation?

--

.qa[A] First, remember what we just showed...

--

$$
`\begin{align}
  \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = 11} \right] = \mathop{E}\left[ f_i(\color{#e64173}{12}) - f_i(\color{#e64173}{11}) \mid \text{X}_{i} \right]
\end{align}`
$$

--

Now take the expected value of both sides and apply the LIE.

.big-left[
`\(E\!\bigg(\mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = 12} \right] - \mathop{E}\left[ \color{#6A5ACD}{\text{Y}_{i}} \mid \text{X}_{i},\, \color{#e64173}{s_i = 11} \right] \bigg)\)`
]

.big-left[
`\(=E\! \bigg( \mathop{E}\left[ f_i(\color{#e64173}{12}) - f_i(\color{#e64173}{11}) \mid \text{X}_{i}\right] \bigg)\)`
]

--

.big-left[
`\(=\mathop{E}\left[ f_i(\color{#e64173}{12}) - f_i(\color{#e64173}{11}) \right]\)` .grey-light[(Iterating expectations)]
]

---

.note[Takeaways]

1. Conditional independence gives our parameters .hi[causal interpretations] (eliminating selection bias).

--

2. The interpretation changes slightly—without iterating expectations, we have .hi[conditional average treatment effects].

--

3. The CIA is challenging—you need to know which set of covariates `\(\left( \text{X}_{i} \right)\)` leads to .hi[as-good-as-random residual variation in your treatment].

--

4. The idea of conditioning on observables to match *comparable* individuals introduces us to .hi[matching estimators]—comparing groups of individuals with the same covariate values (sketched on the next slide).
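---

.note[Aside] A matching-flavored sketch in R (simulated data; the true effect is 3 by construction). Compare treated and untreated units .it[within] each covariate cell, then average the cell-level differences.

```r
# Hedged sketch: a stratification/matching-style estimator
library(tidyverse)
set.seed(105)
match_df = tibble(
  x = sample(1:5, 1e4, replace = TRUE),
  d = rbinom(1e4, size = 1, prob = x / 6),  # treatment probability rises with x
  y = 2 * x + 3 * d + rnorm(1e4)            # true treatment effect: 3
)
# Raw difference in means: contaminated by selection on x
with(match_df, mean(y[d == 1]) - mean(y[d == 0]))
# Within-cell differences, then a weighted average across cells
match_df %>%
  group_by(x) %>%
  summarize(diff = mean(y[d == 1]) - mean(y[d == 0]), n_cell = n()) %>%
  summarize(est = weighted.mean(diff, w = n_cell))
```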
---
layout: true

# Regression and causality
## From the CIA to regression

---
name: cia_reg

Conditional independence fits into our regression framework in two ways.

--

1. If we assume `\(f_i(\color{#e64173}{s})\)` is (.hi-slate[A]) linear in `\(\color{#e64173}{s}\)` and (.hi-slate[B]) equal across all individuals except for an additive error, linear regression estimates `\(f(\color{#e64173}{s})\)`.

--

2. If we allow `\(f_i(\color{#e64173}{s})\)` to be nonlinear in `\(\color{#e64173}{s}\)` and heterogeneous across `\(i\)`, regression provides a weighted average of individual-specific differences `\(f_i(\color{#e64173}{s}) - f_i(\color{#e64173}{s-1})\)`..super[.pink[†]]

.footnote[.pink[†] Leads to a matching-style estimator.]

--

Let's start with the 'easier' case: a linear, constant-effects (causal) model.

---

Let `\(f_i(\color{#e64173}{s})\)` be linear in `\(\color{#e64173}{s}\)` and equal across `\(i\)` except for an error term, _e.g._,

$$
`\begin{align}
  f_i(\color{#e64173}{s}) = \alpha + \rho \color{#e64173}{s} + \eta_i \tag{A}
\end{align}`
$$

--

Substitute in our observed value of `\(\color{#e64173}{s_i}\)` and the outcome `\(\color{#6A5ACD}{\text{Y}_{i}}\)`

$$
`\begin{align}
  \color{#6A5ACD}{\text{Y}_{i}} = \alpha + \rho \color{#e64173}{s_i} + \eta_i \tag{B}
\end{align}`
$$

--

While `\(\rho\)` in `\((\text{A})\)` is explicitly causal, regression-based estimates of `\(\rho\)` in `\((\text{B})\)` need not be causal: if `\(\color{#e64173}{s_i}\)` is endogenous (correlated with `\(\eta_i\)`), we face selection/omitted-variable bias.

---

Continuing with our linear, constant-effect causal model...

$$
`\begin{align}
  f_i(\color{#e64173}{s}) = \alpha + \rho \color{#e64173}{s} + \eta_i \tag{A}
\end{align}`
$$

Now impose the conditional independence assumption for covariates `\(\text{X}_{i}\)`.

$$
`\begin{align}
  \eta_i = \text{X}_{i}'\gamma + \nu_i \tag{C}
\end{align}`
$$

where `\(\gamma\)` is a vector of population coefficients from regressing `\(\eta_i\)` on `\(\text{X}_{i}\)`.

--

.note[Note] Defining `\(\gamma\)` by least-squares regression implies

1. `\(\text{X}_{i}\)` is uncorrelated with `\(\nu_i\)` (by construction)
1. `\(\mathop{E}\left[ \eta_i \mid \text{X}_{i} \right] = \text{X}_{i}'\gamma\)`, provided this CEF is linear (_e.g._, with saturated `\(\text{X}_{i}\)`)

---

Now write out the conditional expectation function of `\(f_i(\color{#e64173}{s})\)` on `\(\text{X}_{i}\)` and `\(\color{#e64173}{s_i}\)`.

.big-left[
`\(\mathop{E}\left[ f_i(\color{#e64173}{s}) \mid \text{X}_{i},\, \color{#e64173}{s_i} \right]\)`
]

.big-left[
`\(=\mathop{E}\left[ f_i(\color{#e64173}{s}) \mid \text{X}_{i} \right]\)` .grey-light[(CIA)]
]

--

.big-left[
`\(=\mathop{E}\left[ \alpha + \rho \color{#e64173}{s} + \eta_i \mid \text{X}_{i} \right]\)`
]

--

.big-left[
`\(=\alpha + \rho \color{#e64173}{s} + \mathop{E}\left[ \eta_i \mid \text{X}_{i} \right]\)`
]

--

.big-left[
`\(=\alpha + \rho \color{#e64173}{s} + \text{X}_{i}'\gamma\)` .grey-light[(Least-squares regression)]
]

--

Evaluating at `\(\color{#e64173}{s} = \color{#e64173}{s_i}\)` gives the CEF of `\(\color{#6A5ACD}{\text{Y}_{i}}\)`, which is linear—meaning that the (right.super[.pink[†]]) population regression will be the CEF.

.footnote[
.pink[†] Here, "right" means conditional on `\(\text{X}_{i}\)`.
]
---

Thus, the linear causal (regression) model is

$$
`\begin{align}
  \text{Y}_{i} = \alpha + \rho \color{#e64173}{s_i} + \text{X}_{i}'\gamma + \nu_i
\end{align}`
$$

The residual `\(\nu_i\)` is uncorrelated with

1. `\(\color{#e64173}{s_i}\)` (from the CIA)

2. `\(\text{X}_{i}\)` (from defining `\(\gamma\)` via the regression of `\(\eta_i\)` on `\(\text{X}_{i}\)`)

The coefficient `\(\rho\)` gives the causal effect of `\(\color{#e64173}{s_i}\)` on `\(\color{#6A5ACD}{\text{Y}_{i}}\)`.

---

As Angrist and Pischke note, this .hi[conditional-independence assumption] (.it[a.k.a.] the selection-on-observables assumption) is the cornerstone of modern empirical work in economics—and many other disciplines.

Nearly any empirical application that wants a causal interpretation involves a (sometimes implicit) argument that .hi[conditional on some set of covariates, treatment is as-good-as random].

--

.ex[Part of our job:] Reasoning through the validity of this assumption.

---
layout: true

# Regression and causality
## CIA example

---
name: example

Let's continue with the returns to graduation `\(\left( \text{G}_i \right)\)`. Imagine

1. Women are more likely to graduate.

1. Everyone receives the same return to graduation.

1. Women receive lower wages across the board.

---

First, we need to generate some data.

```r
# Load the tidyverse (for tibble and the pipe)
library(tidyverse)
# Set seed
set.seed(12345)
# Set sample size
n = 1e4
# Generate data: women graduate more often; the return to graduation is 5;
# women's wages are 25 lower across the board
ex_df = tibble(
  female = rep(c(0, 1), each = n/2),
  grad = runif(n, min = female/3, max = 1) %>% round(0),
  wage = 100 - 25 * female + 5 * grad + rnorm(n, sd = 3)
)
```

---

Now we can estimate our naïve regression

$$
`\begin{align}
  \text{Wage}_i = \alpha + \beta \text{Grad}_i + \varepsilon_i
\end{align}`
$$

--

`lm(wage ~ grad, data = ex_df)`

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> </th>
   <th style="text-align:right;"> Coef. </th>
   <th style="text-align:right;"> S.E. </th>
   <th style="text-align:right;"> t stat </th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td style="text-align:left;"> Intercept </td>
   <td style="text-align:right;"> 91.65 </td>
   <td style="text-align:right;"> 0.20 </td>
   <td style="text-align:right;"> 447.70 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Graduate </td>
   <td style="text-align:right;"> -1.59 </td>
   <td style="text-align:right;"> 0.26 </td>
   <td style="text-align:right;"> -6.18 </td>
  </tr>
 </tbody>
</table>

--

Maybe we should have plotted our data...

---
layout: false
class: clear, center

<img src="05-regression-stuff_files/figure-html/ex_plot1-1.svg" style="display: block; margin: auto;" />

---
class: clear, middle

We're still missing something...

---
class: clear, center

<img src="05-regression-stuff_files/figure-html/ex_plot2-1.svg" style="display: block; margin: auto;" />

---
# Regression and causality
## CIA example

Now we can estimate our causal regression

$$
`\begin{align}
  \text{Wage}_i = \alpha + \beta_1 \text{Grad}_i + \beta_2 \text{Female}_i + \varepsilon_i
\end{align}`
$$

--

`lm(wage ~ grad + female, data = ex_df)`
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> </th>
   <th style="text-align:right;"> Coef. </th>
   <th style="text-align:right;"> S.E. </th>
   <th style="text-align:right;"> t stat </th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td style="text-align:left;"> Intercept </td>
   <td style="text-align:right;"> 99.98 </td>
   <td style="text-align:right;"> 0.05 </td>
   <td style="text-align:right;"> 1868.81 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Graduate </td>
   <td style="text-align:right;"> 5.03 </td>
   <td style="text-align:right;"> 0.06 </td>
   <td style="text-align:right;"> 78.23 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:right;"> -25.00 </td>
   <td style="text-align:right;"> 0.06 </td>
   <td style="text-align:right;"> -402.64 </td>
  </tr>
 </tbody>
</table>

---
layout: false

# Table of contents

.pull-left[

### Admin

.smaller[

1. [Last time](#previously)
1. [Schedule](#schedule)
1. [Advice](#advice)

]
]

.pull-right[

### Regression

.smaller[

1. [Saturated models](#saturated)
1. [Model specification](#specification)
1. [Causal regressions](#causal_reg)
1. [Causal CEFs](#causal_cef)
1. [Conditional independence assumption](#cia)
  - [Binary treatment](#cia_binary)
  - [Multi-valued treatment](#cia_multi)
  - [Regression](#cia_reg)
  - [Example](#example)

]
]

---
exclude: true