class: center, middle, inverse, title-slide

.title[
# Econometrics
]
.subtitle[
## Differences-in-Differences
]
.author[
### Mustapha Douch, based on Florian Oswald’s slides
]
.date[
### UniTo ESOMAS 2025-10-23
]

---
layout: true

---

# Recap from last week

* Applied inference tools to regression analysis
* *Standard error* of regression coefficients
* *Statistical significance* of regression coefficients

--

## This Week: ***Differences-in-differences***

* Exploits changes in policy over time that don't affect everyone
* Need to find (or construct) appropriate control group(s)
* *Key assumption:* parallel trends
* *Empirical application*: impact of ***minimum wage*** on ***employment***

---

# From Correlation to Causation

- What do we mean by “causal”?
- Why correlation is not (necessarily) causation
- How randomization turns correlation into causation
- Common pitfalls: confounding, colliders, selection bias, Simpson’s paradox

--

Learning goals:

- Recognise when a correlation is causal (and when it isn’t)
- Draw and reason with simple causal diagrams
- Identify what needs to be adjusted for (and what must not)
- Implement simple causal analyses in R

---

## **Spot the causal claim**

- Coffee drinking is associated with longer lifespan.
- Students who attend office hours get higher grades.
- Areas with more police officers have more recorded crime.
- Sleep deprivation reduces reaction time.

For each:

- Is it causal or correlational?
- If correlational, what else could explain it?

--

**Answers:**

- Coffee: confounding by health behaviours or socioeconomic status.
- Office hours: selection (motivated or struggling students self-select).
- Police and crime: reverse causality or measurement (more police → more detection).
- Sleep deprivation: plausibly causal (supported by experiments).

***Main lesson:*** before analyzing, articulate alternative explanations.

---

## Correlation vs causation

- Correlation: two variables move together. Symmetric, descriptive, unitless; does not imply direction.
- Causation: changing X (while holding other causes fixed, **ceteris paribus**) changes Y.
Asymmetric, requires assumptions or design.

Key idea:

- Correlation becomes causal when we can rule out alternative pathways linking X and Y.

---

## Example: **Ice cream and drownings**

- Across weeks of the year:
  - Ice cream sales and drowning incidents are positively correlated.
  - Is ice cream causing drownings?

--

Confounder:

- Temperature drives both ice cream consumption and swimming exposure.

We’ll simulate and diagnose this.

---

We’ll simulate and diagnose this.

<img src="chapter_did_files/figure-html/plot-icecream-1-1.svg" style="display: block; margin: auto;" />

---

We’ll simulate and diagnose this.

<img src="chapter_did_files/figure-html/plot-icecream-2-1.svg" style="display: block; margin: auto;" />

---

## Adjusting for the confounder

|model                             | estimate| std.error| p.value|
|:---------------------------------|--------:|---------:|-------:|
|Naive: drownings ~ icecream_sales |  0.01619|   0.00089|       0|

---

## Adjusting for the confounder

|model            | estimate| std.error| p.value|
|:----------------|--------:|---------:|-------:|
|Adjusted: + temp |  -0.0022|    0.0029|  0.4588|

---

## **Adjusting for the confounder**

```
## # A tibble: 1 × 4
##   model                             estimate std.error  p.value
##   <chr>                                <dbl>     <dbl>    <dbl>
## 1 Naive: drownings ~ icecream_sales   0.0162  0.000890 4.11e-57
```

```
## # A tibble: 1 × 4
##   model            estimate std.error p.value
##   <chr>               <dbl>     <dbl>   <dbl>
## 1 Adjusted: + temp -0.00217   0.00292   0.459
```

Interpretation:

- Naive slope is positive and significant.
- After adjusting for temperature, the ice-cream coefficient is close to zero and no longer statistically significant (p ≈ 0.46).

Conclusion:

- The initial correlation was spurious, driven by a common cause (temperature).

---

## **When does correlation equal causation?**

- In a well-conducted randomized experiment:
  - Treatment assignment is independent of all prior causes of the outcome.
  - Groups are exchangeable on average.
  - The difference in means estimates the causal effect.

--

We’ll simulate an RCT.
``` r
library(tibble)  # tibble()
library(dplyr)   # if_else()

n <- 1000
tau <- 2  # true average treatment effect (ATE)
d_rct <- tibble(
  W  = rbinom(n, 1, 0.5),        # randomized treatment
  Y0 = rnorm(n, 10, 2),          # potential outcome under control
  Y1 = Y0 + tau,                 # treatment adds tau
  Y  = if_else(W == 1, Y1, Y0)   # observed outcome
)
ate_est <- with(d_rct, mean(Y[W == 1]) - mean(Y[W == 0]))
paste0("Estimated ATE (difference in means): ", round(ate_est, 3))
```

```
## [1] "Estimated ATE (difference in means): 1.786"
```

---

## **Let's look at this graphically**

<img src="chapter_did_files/figure-html/plot-rct-1.svg" style="display: block; margin: auto;" />

---

## **Potential outcomes: a minimal primer**

- Each unit i has two potential outcomes: `\(Y_i(1)\)`, `\(Y_i(0)\)`
- Causal effect for i: `\(\tau_i = Y_i(1) - Y_i(0)\)`
- Fundamental problem: we never observe both for the same i
- Randomization ensures E[Y(0) | W=1] = E[Y(0) | W=0] (and likewise for Y(1)), so:
  - ATE = E[Y | W=1] − E[Y | W=0] under random assignment

In observational data, we seek conditions under which this comparison is still valid (or can be made valid by adjustment).

--

**Note:** The core challenge is missing counterfactuals. Randomization makes the missing counterfactuals ignorable on average; observational studies must emulate this by conditioning on confounders.

---

## The anatomy of **bias**

Main threats:

- Confounding (common causes)
- Reverse causality (Y → X)
- Selection/collider bias (conditioning on a common effect)
- Measurement error and missing data
- Model misspecification

--

**Do you recall our initial examples?**

- Health consciousness affects both coffee and longevity (confounding).
- Sicker people reduce exercise (reverse causality).
- Looking only at hospitalised patients (selection).

We’ll use **causal diagrams (DAGs)** to help us see and fix these.
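---

## **Selection bias** in potential-outcomes notation

A short derivation, offered as a sketch in the notation of the primer (W is treatment, Y(1) and Y(0) are potential outcomes): adding and subtracting E[Y(0) | W=1] splits the naive difference in means into a causal part and a bias part. The labels under the braces are ours.

`$$\mathbb{E}[Y \mid W=1] - \mathbb{E}[Y \mid W=0] = \underbrace{\mathbb{E}[Y(1) - Y(0) \mid W=1]}_{\text{effect on the treated}} + \underbrace{\mathbb{E}[Y(0) \mid W=1] - \mathbb{E}[Y(0) \mid W=0]}_{\text{selection bias}}$$`

Under random assignment the second term is zero, which is why the difference in means is causal in an RCT. Confounding and selection are different reasons for it to be non-zero in observational data.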
---

## **Causal Diagrams (DAGs)**: Core Building Blocks

- Nodes: variables
- Arrows: causal influence
- Acyclic: arrows never loop back

DAGs help us decide *what* to control for to isolate the causal effect (`\(\mathbf{X \to Y}\)`).

### Structural Motifs & Adjustment Rules

<div style="font-size: 0.75em;">

Table: Fundamental Causal Motifs and Adjustment Rules

|Motif                     |Structure |Problem                                                           |Action                                                              |
|:-------------------------|:---------|:-----------------------------------------------------------------|:-------------------------------------------------------------------|
|Fork (Confounder)         |X ← Z → Y |Z creates a spurious correlation between X and Y.                 |**Adjust for Z** (blocks the path).                                 |
|Chain (Mediator)          |X → M → Y |M is the mechanism by which X affects Y.                          |**Do NOT adjust for M** (unless you want only the direct effect).   |
|Collider (Selection Bias) |X → S ← Y |Conditioning on S creates a spurious correlation between X and Y. |**Crucial: Do NOT adjust for S** (adjusting opens the path).        |

</div>

---

## **Causal diagrams (DAGs)**: core building blocks

<img src="chapter_did_files/figure-html/dag-fork-1.svg" style="display: block; margin: auto;" />

---

<img src="chapter_did_files/figure-html/dag-chain-1.svg" style="display: block; margin: auto;" />

---

<img src="chapter_did_files/figure-html/dag-collider-1.svg" style="display: block; margin: auto;" />

---

## **Confounding in practice** (observational study)

Simulated confounded study:

- X (treatment) depends on a pre-treatment risk score Z
- Y depends on both X and Z
- Naive regression is biased; adjustment recovers τ

--

```
## # A tibble: 3 × 2
##   Model               Estimate
##   <chr>                  <dbl>
## 1 Naive: Y ~ X            4.41
## 2 Adjusted: Y ~ X + Z     3.01
## 3 Truth (ATE)             3
```

**Note:** Here the naive estimate overstates the effect because treated units have higher Z (and Z improves Y). After adjusting for Z, the coefficient on X approaches the true τ=3. Adjustment only works if the right Z’s are measured.
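---

## **Confounding in practice**: a sketch in R

The code behind the table above is not shown, so this is only a sketch of one data-generating process that fits the description. The coefficients 1.5 and 2, the sample size and the seed are assumptions, and the estimates will not reproduce 4.41 and 3.01 exactly.

``` r
set.seed(42)   # assumed seed, for reproducibility only
n   <- 5000    # assumed sample size
tau <- 3       # true effect of X on Y, as in the table above

Z <- rnorm(n)                         # pre-treatment risk score
X <- rbinom(n, 1, plogis(1.5 * Z))    # treatment is more likely when Z is high
Y <- tau * X + 2 * Z + rnorm(n)       # outcome depends on both X and Z

naive    <- unname(coef(lm(Y ~ X))["X"])      # ignores the confounder
adjusted <- unname(coef(lm(Y ~ X + Z))["X"])  # blocks the path X <- Z -> Y

round(c(naive = naive, adjusted = adjusted, truth = tau), 2)
```

Because Z raises both the probability of treatment and the outcome, the naive coefficient lies above τ, while the adjusted one is close to it.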
---

## **Graphically**

<img src="chapter_did_files/figure-html/plot-confounding-1.svg" style="display: block; margin: auto;" />

---

## Collider bias: **selection** can create spurious correlations

- X and Y are independent causes of selection S
- Conditioning on S (e.g., only looking at admitted/hired/hospitalised) induces a correlation between X and Y

Simulation:

```
## # A tibble: 2 × 2
##   Sample           Correlation
##   <chr>                  <dbl>
## 1 Unconditional       -0.00226
## 2 Condition on S=1    -0.330
```

---

<img src="chapter_did_files/figure-html/plot-collider-1.svg" style="display: block; margin: auto;" />

**Note:** A and B (the two causes, X and Y on the previous slide) are uncorrelated in the population, but among the selected (S=1), high A tends to coincide with lower B and vice versa. Moral: filtering your data on outcomes or post-treatment variables can manufacture correlations out of thin air.

---

## **Simpson’s paradox**

- An association reverses when aggregating across groups.
- Often due to confounding by group/strata.

Example: UC Berkeley admissions

```
## # A tibble: 2 × 2
##   Gender  rate
##   <fct>  <dbl>
## 1 Male   0.445
## 2 Female 0.304
```

---

## Example: UC Berkeley admissions

**Overall admission rates**

<img src="chapter_did_files/figure-html/plot-ucb1-1.svg" style="display: block; margin: auto;" />

---

**Within departments**

<img src="chapter_did_files/figure-html/plot-ucb2-1.svg" style="display: block; margin: auto;" />

**Why do we see this?**

--

Different application patterns across competitive vs less competitive departments (a confounder). Always examine stratified analyses and consider DAGs to decide on adjustment.
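---

## **Simpson’s paradox**: a sketch in R

A minimal sketch of the two comparisons, assuming the slides draw on R's built-in `UCBAdmissions` table (its pooled rates match the 0.445 and 0.304 shown earlier). The object names are illustrative.

``` r
# Admit x Gender x Dept counts shipped with base R (datasets package)
counts <- UCBAdmissions

# Overall admission rate by gender: collapse over departments first
by_gender <- apply(counts, c(1, 2), sum)                    # Admit x Gender
overall   <- prop.table(by_gender, margin = 2)["Admitted", ]

# Admission rate by gender within each department
by_dept <- prop.table(counts, margin = c(2, 3))["Admitted", , ]

round(overall, 3)
round(by_dept, 2)
```

In the pooled data men are admitted at a higher rate, yet in department A the rate for women is higher. Stratifying by department is what reveals this.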
---

## **Adjustment**: regression as statistical control

If ignorability holds given Z:

- Y(x) ⟂ X | Z
- Then E[Y | do(X=x)] = `\(E_Z\)`[ E[Y | X=x, Z] ]

In practice:

- Fit Y ~ X + Z
- Interpret coefficient on X as adjusted effect (under assumptions)

Caution:

- Include pre-treatment confounders
- Do not control for colliders or post-treatment variables

---

## Hands-on: partialling out a confounder

```
## # A tibble: 1 × 4
##   model estimate std.error p.value
##   <chr>    <dbl>     <dbl>   <dbl>
## 1 Naive     2.69    0.0291       0
```

```
## # A tibble: 1 × 4
##   model    estimate std.error p.value
##   <chr>       <dbl>     <dbl>   <dbl>
## 1 Adjusted     1.98    0.0228       0
```

<img src="chapter_did_files/figure-html/plot-partialling-1.svg" style="display: block; margin: auto;" />

---

## **What should we adjust for?**

- Pre-treatment confounders: Yes
- Mediators: No, if estimating total effect
- Colliders/selection variables: No
- Descendants of colliders: No

---

## **Mini-exercise (in-class)**

Given a study on “exercise (X) and blood pressure (Y)”, with observed variables:

- Age, Sex, Smoking status, BMI, Diet quality, Clinic (site), Location

Tasks:

- Draw a plausible DAG
- Propose an adjustment set

---

## **Plausible DAG & Adjustment Solution**

### The Causal Structure (DAG)

We assume the primary relationship is `\(\mathbf{X \to Y}\)`. The bias comes from multiple **Fork** motifs (Confounders).

A = Age, S = Smoking, D = Diet quality, B = BMI

<img src="chapter_did_files/figure-html/dag-exercise-1.svg" style="display: block; margin: auto;" />

---

# Evaluation methods

* Multiple regression often does not provide causal estimates because of ***selection on unobservables***.

--

* RCTs are one way to solve this problem but they are often impossible to do.

--

* Four main causal evaluation methods used in economics:

    - ***instrumental variables (IV)***,
    - ***propensity-score matching***,
    - ***differences-in-differences (DiD)***, and
    - ***regression discontinuity designs (RDD)***.
--

* These methods are used to identify __causal relationships__ between treatments and outcomes.

--

* In this lecture, we will cover a popular and rigorous program evaluation method: __differences-in-differences__.

--

* Next week we will look at __regression discontinuity designs__.

---

# Differences-in-Differences (DiD)

* Usual starting point: subjects are not randomly allocated to treatment ⚠️

--

## DiD Requirements:

--

* 2 time periods: before and after treatment.

--

* 2 groups:

--

    - ***control group:*** never receives treatment,

--

    - ***treatment group:*** initially untreated and then fully treated.

--

* Under certain assumptions, the control group can be used as the counterfactual for the treatment group

---

# An Example: Minimum Wage and Employment

--

* Imagine you are interested in assessing the __causal__ impact of increasing the minimum wage on (un)employment.

--

* Why is this not that straightforward? What should the control group be?

---

* Why is this not that straightforward? What should the control group be?

<img src="chapter_did_files/figure-html/unnamed-chunk-2-1.svg" style="display: block; margin: auto;" />

**Note:** Changes in the local economy that affect employment (`\(Y\)`) may also affect the political will to raise the minimum wage (`\(X\)`). Such unobserved economy-wide shocks (`\(Z\)`) confound the comparison. Example: A city raises the minimum wage. Employment falls. Was it the new wage law, or was it a regional recession that was going to reduce employment anyway?

---

* Seminal 1994 [paper](http://davidcard.berkeley.edu/papers/njmin-aer.pdf) by prominent labor economists David Card and Alan Krueger entitled "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania"

--

* Estimates the effect of an increase in the minimum wage on the employment rate in the fast-food industry. Why this industry?

--

**Think of** burger and pizza restaurants!
--

- Heavily dominated by low-skilled workers
- Fast-food franchises tend to draw workers from very localized labor markets
- Easy to survey

---

# Institutional Details

* In the US, there is a national minimum wage, but states can depart from it.

--

* April 1, 1992: New Jersey minimum wage increases from $4.25 to $5.05 per hour.

--

* Neighboring Pennsylvania did not change its minimum wage level.

--

.pull-left[
<img src="../img/photos/nj_penn_map.png" width="600px" style="display: block; margin: auto;" />
]

--

.pull-right[
<br>
<br>
Pennsylvania and New Jersey are ***very similar***: similar institutions, similar habits, similar consumers, similar incomes, similar weather, etc.
]

---

# Card and Krueger (1994): Methodology

* Surveyed 410 fast-food establishments in New Jersey (NJ) and eastern Pennsylvania

--

* Timing:

--

    - Survey before NJ MW increase: Feb/March 1992

--

    - Survey after NJ MW increase: Nov/Dec 1992

--

* What comparisons do you think they did?

--

.pull-left[
Let's take a closer look at their data

``` r
# install package that contains the cleaned data
remotes::install_github("b-rodrigues/diffindiff")
# load package
library(diffindiff)
# load data
ck1994 <- njmin
```
]

--

.pull-right[

```
## # A tibble: 6 × 6
##   sheet chain  state        observation   empft emppt
##   <chr> <chr>  <chr>        <chr>         <dbl> <dbl>
## 1 46    bk     Pennsylvania February 1992  30    15
## 2 49    kfc    Pennsylvania February 1992   6.5   6.5
## 3 506   kfc    Pennsylvania February 1992   3     7
## 4 56    wendys Pennsylvania February 1992  20    20
## 5 61    wendys Pennsylvania February 1992   6    26
## 6 62    wendys Pennsylvania February 1992   0    31
```
]

---
class: inverse

# Task 1 (10 minutes)

1. Take a look at the dataset and list the variables. Check the variable definitions with `?njmin`.

1. Tabulate the number of stores by `state` and by survey wave (`observation`). Does it match what's in *Table 1* of the [paper](http://davidcard.berkeley.edu/papers/njmin-aer.pdf)?

1. Create a full-time equivalent (FTE) employees variable called `empfte` equal to `empft` + 0.5*`emppt` + `nmgrs`. `empft` and `emppt` correspond respectively to the number of full-time and part-time employees. `nmgrs` corresponds to the number of managers. This is how Card and Krueger compute their full-time equivalent (FTE) employment variable (p. 775 of the paper).

1. Compute average FTE employment, the average percentage of full-time employees (as a share of FTE employment), and the average starting wage (`wage_st`) by state and by survey wave. Compare your results with *Table 2* of the paper.

1. How different are New Jersey and Pennsylvania's fast-food restaurants before the minimum wage increase?

---

# Card and Krueger DiD: Tabular Results

.center[__Average Employment Per Store Before and After the Rise in NJ Minimum Wage__]

<table class="table table-striped" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Variables </th>
   <th style="text-align:left;"> Pennsylvania </th>
   <th style="text-align:left;"> New Jersey </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> FTE employment before </td>
   <td style="text-align:left;"> <span style=" text-align: c;">23.33</span> </td>
   <td style="text-align:left;"> <span style=" text-align: c;">20.44</span> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> FTE employment after </td>
   <td style="text-align:left;"> <span style=" text-align: c;">21.17</span> </td>
   <td style="text-align:left;"> <span style=" text-align: c;">21.03</span> </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Change in mean FTE employment </td>
   <td style="text-align:left;"> <span style=" font-weight: bold; color: white !important;border-radius: 4px; padding-right: 4px; padding-left: 4px; background-color: rgba(253, 231, 37, 255) !important;text-align: c;">-2.17</span> </td>
   <td style="text-align:left;"> <span style=" font-weight: bold; color: white !important;border-radius: 4px; padding-right: 4px; padding-left: 4px; background-color: rgba(68, 1, 84, 255) !important;text-align: center;">0.59</span> </td>
  </tr>
</tbody>
</table>

--

## DiD Estimate

Differences-in-differences causal estimate: `\(0.59 - (-2.17) = 2.76\)`

--

Interpretation: the minimum wage increase led to an __increase__ in FTE employment per store of 2.76 on average.

--

Yes, the essence of differences-in-differences is _that_ simple! 😀

--

Let's look at these results graphically.

---

# DiD Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-7-1.svg" style="display: block; margin: auto;" />

---

# DiD Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-8-1.svg" style="display: block; margin: auto;" />

---

# DiD Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" />

---

# DiD Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" />

---

# DiD Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-11-1.svg" style="display: block; margin: auto;" />

---

# DiD Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-12-1.svg" style="display: block; margin: auto;" />

---

# What if we had done a naive after/before comparison?

<img src="chapter_did_files/figure-html/unnamed-chunk-13-1.svg" style="display: block; margin: auto;" />

---

# What if we had done a naive after/before comparison?

<img src="chapter_did_files/figure-html/unnamed-chunk-14-1.svg" style="display: block; margin: auto;" />

---

# What if we had done a naive after NJ/PA comparison?

<img src="chapter_did_files/figure-html/unnamed-chunk-15-1.svg" style="display: block; margin: auto;" />

---

# What if we had done a naive after NJ/PA comparison?
<img src="chapter_did_files/figure-html/unnamed-chunk-16-1.svg" style="display: block; margin: auto;" />

---
layout: false
class: title-slide-section-red, middle

# Estimation

---
layout: true

<div class="my-footer"><img src="../img/logo/unito-shield.png" style="height: 60px;"/></div>

---

# DiD in Regression Form

* In practice, DiD is usually estimated on more than 2 periods (4 observations)

* There are more data points before and after the policy change

--

3 ingredients:

--

1. __Treatment dummy variable__: `\(TREAT_s\)` where the `\(s\)` subscript reminds us that the treatment is at the state level

--

1. __Post-treatment periods dummy variables__: `\(POST_t\)` where the `\(t\)` subscript reminds us that this variable varies over time

--

1. __Interaction term between the two__: `\(TREAT_s \times POST_t\)` 👉 the ***coefficient on this term is the DiD causal effect***!

---

# DiD in Regression Form

__Treatment dummy variable__

$$
TREAT_s = \begin{cases}\begin{array}{lcl} 0 \quad \text{if } s = \text{Pennsylvania} \\\ 1 \quad \text{if } s = \text{New Jersey} \end{array}\end{cases}
$$

--

__Post-treatment periods dummy variable__

$$
POST_t = \begin{cases}\begin{array}{lcl} 0 \quad \text{if } t < \text{April 1, 1992} \\\ 1 \quad \text{if } t \geq \text{April 1, 1992} \end{array}\end{cases}
$$

--

__Which observations correspond to `\(TREAT_s \times POST_t = 1\)`?__

--

* Let's put all these ingredients together:

`$$EMP_{st} = \alpha + \beta TREAT_s + \gamma POST_t + \delta(TREAT_s \times POST_t) + \varepsilon_{st}$$`

* `\(\delta\)`: causal effect of the minimum wage increase on employment

---

# Understanding the Regression

`$$EMP_{st} = \color{#d96502}\alpha + \color{#027D83}\beta TREAT_s + \color{#02AB0D}\gamma POST_t + \color{#d90502}\delta(TREAT_s \times POST_t) + \varepsilon_{st}$$`

--

We have the following:

--

`\(\mathbb{E}(EMP_{st} \; | \; TREAT_s = 0, POST_t = 0) = \color{#d96502}\alpha\)`

--

`\(\mathbb{E}(EMP_{st} \; | \; TREAT_s = 0, POST_t = 1) = \color{#d96502}\alpha + \color{#02AB0D}\gamma\)`

--

`\(\mathbb{E}(EMP_{st} \; | \; TREAT_s = 1, POST_t = 0) = \color{#d96502}\alpha + \color{#027D83}\beta\)`

--

`\(\mathbb{E}(EMP_{st} \; | \; TREAT_s = 1, POST_t = 1) = \color{#d96502}\alpha + \color{#027D83}\beta + \color{#02AB0D}\gamma + \color{#d90502}\delta\)`

--

`$$[\mathbb{E}(EMP_{st} \; | \; TREAT_s = 1, POST_t = 1)-\mathbb{E}(EMP_{st} \; | \; TREAT_s = 1, POST_t = 0)] - \\ [\mathbb{E}(EMP_{st} \; | \; TREAT_s = 0, POST_t = 1)-\mathbb{E}(EMP_{st} \; | \; TREAT_s = 0, POST_t = 0)] = \color{#d90502}\delta$$`

---

# Understanding the Regression

`$$EMP_{st} = \color{#d96502}\alpha + \color{#027D83}\beta TREAT_s + \color{#02AB0D}\gamma POST_t + \color{#d90502}\delta(TREAT_s \times POST_t) + \varepsilon_{st}$$`

In table form:

| | Pre mean | Post mean | `\(\Delta\)`(post - pre) |
|:-:|:--:|:--:|:--:|
| Pennsylvania (PA) | `\(\color{#d96502}\alpha\)` | `\(\color{#d96502}\alpha + \color{#02AB0D}\gamma\)` | `\(\color{#02AB0D}\gamma\)` |
| New Jersey (NJ) | `\(\color{#d96502}\alpha + \color{#027D83}\beta\)` | `\(\color{#d96502}\alpha + \color{#027D83}\beta + \color{#02AB0D}\gamma + \color{#d90502}\delta\)` | `\(\color{#02AB0D}\gamma + \color{#d90502}\delta\)` |
| `\(\Delta\)`(NJ - PA) | `\(\color{#027D83}\beta\)` | `\(\color{#027D83}\beta + \color{#d90502}\delta\)` | `\(\color{#d90502}\delta\)` |

--

This table generalizes to other settings by substituting *Pennsylvania* with *Control* and *New Jersey* with *Treatment*

---
class: inverse

# Task 2 (10 minutes)

1. Create a dummy variable, `treat`, equal to `FALSE` if `state` is Pennsylvania and `TRUE` if New Jersey.

1. Create a dummy variable, `post`, equal to `FALSE` if `observation` is February 1992 and `TRUE` otherwise.

1. Estimate the following regression model. Do you obtain the same results as in slide 9?
`$$empfte_{st} = \alpha + \beta treat_s + \gamma post_t + \delta(treat_s \times post_t) + \varepsilon_{st}$$`

---
layout: false
class: title-slide-section-red, middle

# Identifying Assumptions

---
layout: true

<div class="my-footer"><img src="../img/logo/unito-shield.png" style="height: 60px;"/></div>

---

# DiD Crucial Assumption: Parallel Trends

> __Common or parallel trends assumption__: absent any minimum wage increase, Pennsylvania's fast-food employment trend would have been what we should have expected to see in New Jersey.

--

* This assumption states that Pennsylvania's fast-food employment trend between February and November 1992 provides a reliable counterfactual for the employment trend that New Jersey's fast-food industry *would have experienced* had New Jersey not increased its minimum wage.

--

* Impossible to completely validate or invalidate this assumption.

* *Intuitive check:* compare trends before policy change (and after policy change if no expected medium-term effects)

---

# Parallel Trends: Graphically

<img src="chapter_did_files/figure-html/unnamed-chunk-17-1.svg" style="display: block; margin: auto;" />

---

# Checking the parallel trends assumption

<img src="chapter_did_files/figure-html/unnamed-chunk-18-1.svg" style="display: block; margin: auto;" />

---

# Checking the parallel trends assumption

<img src="chapter_did_files/figure-html/unnamed-chunk-19-1.svg" style="display: block; margin: auto;" />

---

# Parallel trends assumption `\(\rightarrow\)` Verified ✅

<img src="chapter_did_files/figure-html/unnamed-chunk-20-1.svg" style="display: block; margin: auto;" />

---

# Parallel trends assumption `\(\rightarrow\)` Verified ✅

<img src="chapter_did_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" />

---

# Parallel trends assumption `\(\rightarrow\)` Not verified ❌

<img src="chapter_did_files/figure-html/unnamed-chunk-22-1.svg" style="display: block; margin: auto;" />

---

# Parallel trends assumption `\(\rightarrow\)` Not verified ❌

<img src="chapter_did_files/figure-html/unnamed-chunk-23-1.svg" style="display: block; margin: auto;" />

---

# Parallel Trends Assumption: [Card and Krueger (2000)](https://inequality.stanford.edu/sites/default/files/media/_media/pdf/Reference%20Media/Card%20and%20Krueger_2000_Policy.pdf)

Here are the actual trends for Pennsylvania and New Jersey

<img src="../img/photos/min_wage_parallel_trends.png" width="600px" style="display: block; margin: auto;" />

--

* Is the common trend assumption likely to be verified?

---

# Parallel Trends Assumption: Formally

Let:

* `\(Y_{ist}^1\)`: fast food employment at restaurant `\(i\)` in state `\(s\)` at time `\(t\)` if there is a high state MW;

--

* `\(Y_{ist}^0\)`: fast food employment at restaurant `\(i\)` in state `\(s\)` at time `\(t\)` if there is a low state MW;

--

These are potential outcomes: you can only observe one of the two.

--

The key assumption underlying DiD estimation is that, in the no-treatment state, restaurant `\(i\)`'s outcome in state `\(s\)` at time `\(t\)` is given by:

`$$\mathbb{E}[Y_{ist}^0|s,t] = \gamma_s + \lambda_t$$`

2 implicit assumptions:

1. ***Selection bias***: relates to fixed state characteristics `\((\gamma)\)`

2. ***Time trend***: same time trend for treatment and control group `\((\lambda)\)`

---

# Parallel Trends Assumption: Formally

Outcomes in the comparison group:

`$$\mathbb{E}[Y_{ist}| s = \text{Pennsylvania},t = \text{Feb}] = \gamma_{PA} + \lambda_{Feb}$$`

--

`$$\mathbb{E}[Y_{ist}|s = \text{Pennsylvania},t = \text{Nov}] = \gamma_{PA} + \lambda_{Nov}$$`

--

$$
`\begin{align}
\mathbb{E}[Y_{ist}&|s = \text{Pennsylvania},t = \text{Nov}] - \mathbb{E}[Y_{ist}| s = \text{Pennsylvania},t = \text{Feb}] \\
&= \gamma_{PA} + \lambda_{Nov} - (\gamma_{PA} + \lambda_{Feb}) \\
&= \lambda_{Nov} - \lambda_{Feb}
\end{align}`
$$

---

# Parallel Trends Assumption: Formally

Outcomes in the comparison group:

`$$\mathbb{E}[Y_{ist}| s = \text{Pennsylvania},t = \text{Feb}] = \gamma_{PA} + \lambda_{Feb}$$`

`$$\mathbb{E}[Y_{ist}|s = \text{Pennsylvania},t = \text{Nov}] = \gamma_{PA} + \lambda_{Nov}$$`

$$
`\begin{align}
\mathbb{E}[Y_{ist}&|s = \text{Pennsylvania},t = \text{Nov}] - \mathbb{E}[Y_{ist}| s = \text{Pennsylvania},t = \text{Feb}] \\
&= \gamma_{PA} + \lambda_{Nov} - (\gamma_{PA} + \lambda_{Feb}) \\
&= \underbrace{\lambda_{Nov} - \lambda_{Feb}}_{\text{time trend}}
\end{align}`
$$

--

`\(\rightarrow\)` the comparison group allows us to estimate the ***time trend***.
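---

# Parallel Trends Assumption: A Simulation Sketch

A minimal sketch of the additive model `\(\gamma_s + \lambda_t\)` plus a treatment effect `\(\delta\)` for New Jersey after the change. All parameter values, the noise level, the seed and the number of stores per cell are hypothetical, chosen only for illustration; this is not the Card and Krueger data.

``` r
set.seed(1)                        # assumed seed
n      <- 500                      # stores per state-period cell (hypothetical)
gamma  <- c(PA = 23, NJ = 20)      # hypothetical fixed state effects
lambda <- c(Feb = 0, Nov = -2)     # hypothetical period effects
delta  <- 2.5                      # hypothetical treatment effect

d <- expand.grid(store = seq_len(n), state = c("PA", "NJ"),
                 period = c("Feb", "Nov"), stringsAsFactors = FALSE)
d$treat <- d$state == "NJ"
d$post  <- d$period == "Nov"
d$y <- unname(gamma[d$state] + lambda[d$period]) +
  delta * (d$treat & d$post) + rnorm(nrow(d), sd = 3)

# Cell means by state and period
cell <- tapply(d$y, list(d$state, d$period), mean)

# The control group's change estimates the time trend lambda_Nov - lambda_Feb
trend_hat <- cell["PA", "Nov"] - cell["PA", "Feb"]

# DiD from cell means, and the same quantity as a regression interaction
did_hat   <- (cell["NJ", "Nov"] - cell["NJ", "Feb"]) - trend_hat
delta_hat <- unname(coef(lm(y ~ treat * post, data = d))["treatTRUE:postTRUE"])

c(time_trend = trend_hat, did_means = did_hat, did_regression = delta_hat)
```

With two groups and two periods the regression is saturated, so the interaction coefficient equals the difference-in-differences of the four cell means exactly. Both estimate `\(\delta\)` only because the simulated time trend is common to the two states.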
---

# Parallel Trends Assumption: Formally

Let `\(\delta\)` denote the true impact of the minimum wage increase:

`$$\mathbb{E}[Y_{ist}^1 - Y_{ist}^0|s,t] = \delta$$`

--

Outcomes in the treatment group:

`$$\mathbb{E}[Y_{ist}|s = \text{New Jersey}, t = \text{Feb}] = \gamma_{NJ} + \lambda_{Feb}$$`

--

`$$\mathbb{E}[Y_{ist}|s = \text{New Jersey}, t = \text{Nov}] = \gamma_{NJ} + \delta + \lambda_{Nov}$$`

--

$$
`\begin{align}
\mathbb{E}[Y_{ist}&|s = \text{New Jersey}, t = \text{Nov}] - \mathbb{E}[Y_{ist}|s = \text{New Jersey}, t = \text{Feb}] \\
&= \gamma_{NJ} + \delta + \lambda_{Nov} - (\gamma_{NJ} + \lambda_{Feb}) \\
&= \delta + \lambda_{Nov} - \lambda_{Feb}
\end{align}`
$$

---

# Parallel Trends Assumption: Formally

Let `\(\delta\)` denote the true impact of the minimum wage increase:

`$$\mathbb{E}[Y_{ist}^1 - Y_{ist}^0|s,t] = \delta$$`

Outcomes in the treatment group:

`$$\mathbb{E}[Y_{ist}|s = \text{New Jersey}, t = \text{Feb}] = \gamma_{NJ} + \lambda_{Feb}$$`

`$$\mathbb{E}[Y_{ist}|s = \text{New Jersey}, t = \text{Nov}] = \gamma_{NJ} + \delta + \lambda_{Nov}$$`

$$
`\begin{align}
\mathbb{E}[Y_{ist}&|s = \text{New Jersey}, t = \text{Nov}] - \mathbb{E}[Y_{ist}|s = \text{New Jersey}, t = \text{Feb}] \\
&= \gamma_{NJ} + \delta + \lambda_{Nov} - (\gamma_{NJ} + \lambda_{Feb}) \\
&= \delta + \underbrace{\lambda_{Nov} - \lambda_{Feb}}_{\text{time trend}}
\end{align}`
$$

---

# Parallel Trends Assumption: Formally

Therefore we have:

$$
`\begin{align}
\mathbb{E}[Y_{ist}&|s = \text{PA},t = \text{Nov}] - \mathbb{E}[Y_{ist}| s = \text{PA},t = \text{Feb}] = \underbrace{\lambda_{Nov} - \lambda_{Feb}}_{\text{time trend}}
\end{align}`
$$

--

$$
`\begin{align}
\mathbb{E}[Y_{ist}&|s = \text{NJ},t = \text{Nov}] - \mathbb{E}[Y_{ist}| s = \text{NJ},t = \text{Feb}] = \delta + \underbrace{\lambda_{Nov} - \lambda_{Feb}}_{\text{time trend}}
\end{align}`
$$

--

$$
`\begin{align}
DD &= \mathbb{E}[Y_{ist}|s = \text{NJ}, t = \text{Nov}] - \mathbb{E}[Y_{ist}|s = \text{NJ}, t = \text{Feb}] \\
& \qquad \qquad - \Big(\mathbb{E}[Y_{ist}|s = \text{PA},t = \text{Nov}] - \mathbb{E}[Y_{ist}| s = \text{PA},t = \text{Feb}]\Big) \\
&= \delta + \lambda_{Nov} - \lambda_{Feb} - (\lambda_{Nov} - \lambda_{Feb}) \\
&= \delta
\end{align}`
$$

---
class: title-slide-final, middle

# END