QTM 385 - Experimental Methods

Lecture 20 - Mediation Analysis

Danilo Freire

Emory University

Hello again! 👋

Brief recap 📚

Last class: Interference

  • We discussed two interesting papers (I hope so 😂!)

  • Paper 1: Munshi (2003)

    • Topic: Identifying network effects in the labour market.
    • Context: Mexican migrants in the U.S.
    • Challenge: Endogeneity and selection bias.
    • Method: Instrumental Variables (IV) using rainfall.
  • Paper 2: Miguel & Kramer (2004)
    • Topic: Identifying treatment impacts with externalities.
    • Context: Deworming programme in Kenyan schools.
    • Challenge: Standard methods underestimate effects due to spillovers (SUTVA violation).
    • Method: Randomised phase-in design.
  • The focus of last class was on clever identification strategies when standard RCT assumptions are violated or RCTs are not feasible (and how to think about them).

Today’s plan 📅

Mediation: Unpacking the Black Box

  • What is mediation analysis? The search for causal mechanisms
  • The classic regression-based approach (and why it’s often problematic)
  • Thinking about mediation with potential outcomes
  • The challenge of complex potential outcomes
  • Can we experimentally manipulate mediators? (Encouragement designs & excludability)
  • A more pragmatic approach: Implicit mediation analysis
    • Adding/subtracting treatment components
  • Examples: Conditional cash transfers, voter turnout mailers
  • Strengths and limitations of different approaches
  • Moving beyond simple ATEs to how effects happen

What is Mediation? 🤔

The Causal Chain: Z → M → Y

  • Experiments often tell us that a treatment (\(Z\)) affects an outcome (\(Y\)).
  • Mediation analysis asks how or why this effect occurs.
  • It investigates the role of intermediate variables (mediators, \(M\)) that lie on the causal path between \(Z\) and \(Y\).
  • The core idea: \(Z\) causes \(M\), and \(M\) causes \(Y\).
  • Examples:
    • Limes (\(Z\)) → Vitamin C intake (\(M\)) → Reduced scurvy (\(Y\))
    • Reserving council seats for women (\(Z\)) → Female incumbency/Changed attitudes (\(M\)) → Election of women later (\(Y\)) (Bhavnani 2009)
  • Goal: Identify the pathways through which \(Z\) transmits its influence to \(Y\)

Why Care About Mechanisms?

  • Scientific Understanding: Moves beyond “black box” descriptions to explanatory theories. How does the world work?
  • Refining Theory: Tests specific theoretical propositions about causal processes
  • Designing Better Interventions: If we know why something works, we might find more efficient or potent ways to achieve the same outcome (e.g., Vitamin C tablets instead of limes)
    • Mediators themselves can have mediators!
  • Generalisability: Understanding the mechanism helps predict if an effect will hold in different contexts where the mediator might operate differently
  • Audiences and policymakers almost always ask about mechanisms!

Types of Mediation

Simple Mediation

Source: Righetti (2023)

Multiple Mediation

Source: Righetti (2023)

The Traditional Approach: Regression 📈

The Three-Equation System

  • A very common approach to estimate mediation is to use a three-equation system:
    1. Regress the mediator (\(M\)) on the treatment (\(Z\)): \(M_i = \alpha_1 + a Z_i + e_{1i}\) Effect of Z on M: \(\hat{a}\)
    2. Regress the outcome (\(Y\)) on the treatment (\(Z\)): \(Y_i = \alpha_2 + c Z_i + e_{2i}\) Total effect of Z on Y: \(\hat{c}\)
    3. Regress the outcome (\(Y\)) on both the treatment (\(Z\)) and the mediator (\(M\)): \(Y_i = \alpha_3 + d Z_i + b M_i + e_{3i}\) Effect of M on Y: \(\hat{b}\); Direct effect of Z on Y: \(\hat{d}\)
  • \(Z\) is randomly assigned (uncorrelated with errors, unbiased estimates of \(Z\) in equations 01 and 02), but \(M\) is not (equation 03). \(M\) is an outcome of \(Z\). So it is an observational study!

Decomposing Effects (Under Strong Assumptions)

  • If we assume constant treatment effects (a, b, c, d are the same for everyone) and certain other conditions hold…
  • The total effect (c) can be decomposed:
    • Total Effect (c): How much \(Y\) changes for a one-unit change in \(Z\). Estimated from Eq. (2).
    • Direct Effect (d): How much \(Y\) changes for a one-unit change in \(Z\), holding \(M\) constant. Estimated from Eq. (3).
    • Indirect Effect (ab): How much of \(Z\)’s effect on \(Y\) is transmitted through \(M\). Calculated as \(\hat{a} \times \hat{b}\) (or \(\hat{c} - \hat{d}\)).
  • This decomposition (\(c = d + ab\)) is the cornerstone of traditional mediation analysis.
  • BUT: This relies heavily on the constant effects assumption.

The Constant Effects Assumption Revisited

  • Remember from heterogeneous effects: causal effects often vary!
  • If \(a_i\) and \(b_i\) vary across individuals, then the average indirect effect is \(E[a_i b_i]\)
  • \(E[a_i b_i] = E[a_i] E[b_i] + Cov(a_i, b_i)\)
  • Simply multiplying the average effect of \(Z\) on \(M\) (\(\hat{a} \approx E[a_i]\)) by the average effect of \(M\) on \(Y\) (\(\hat{b} \approx E[b_i]\)) only gives the average indirect effect if \([Cov(a_i, b_i) = 0]\).
  • This covariance term is generally not zero and not identifiable.
  • So, the simple \(c = d + ab\) decomposition breaks down with heterogeneous effects.

The Problem with Equation (3): Endogeneity of \(M\)

  • Even if we ignore heterogeneous effects for a moment… Equation (3) is problematic: \(Y_i = \alpha_3 + d Z_i + b M_i + e_{3i}\)
  • \(Z\) is random, but \(M\) is an outcome variable. It’s very likely correlated with the error term \(e_{3i}\).
  • Why? Unobserved confounders might affect both M and Y.
  • Example (Bhavnani): Unmeasured local ‘egalitarianism’ (\(e_{3i}\)) might encourage women to run for office (\(M\)) and make voters more likely to elect women (\(Y\)), independently of the reservation policy (\(Z\)).
  • Including \(M\) in the regression is like including a post-treatment variable that is correlated with the error term!

Regression Approach Summary

  • Relies on strong, often implausible assumptions:
    • Constant treatment effects (for \(c = d + ab\))
    • No unobserved confounding between \(M\) and \(Y\) (for unbiased \(\hat{b}\) and \(\hat{d}\))
  • These assumptions are not guaranteed by random assignment of \(Z\) alone.
  • Prone to bias that exaggerates the mediator’s role and downplays the treatment’s direct effect.
  • Use with extreme caution, if at all, for causal mediation claims based solely on \(Z\) randomisation.

A Potential Outcomes View

Defining Potential Outcomes for Mediation

  • Let’s apply our familiar framework. For each individual \(i\):
    • \(Z_i\): Randomly assigned treatment (0 or 1)
    • \(M_i(z)\): Potential value of the mediator if assigned to treatment \(z\)
    • \(Y_i(z)\): Potential value of the outcome if assigned to treatment \(z\)
  • Observed variables:
    • \(M_i = M_i(Z_i)\)
    • \(Y_i = Y_i(Z_i)\)
  • Average Treatment Effect (Total Effect): \(ATE = E[Y_i(1) - Y_i(0)]\)
  • Effect of \(Z\) on \(M\): \(E[M_i(1) - M_i(0)]\)
  • These are estimable from a standard experiment randomising \(Z\).

Expanding Potential Outcomes: \(Y_i(m, z)\)

  • To formally define direct and indirect effects, we need a more complex notation (Imai, Keele, Yamamoto 2010; Robins & Greenland 1992):
  • \(Y_i(m, z)\): Potential outcome for individual \(i\) if their mediator were set to value \(m\) AND they were assigned treatment \(z\).
  • This allows us to think about hypothetical scenarios:
    • What would \(Y\) be if we gave the treatment (\(z=1\)) but somehow forced the mediator to the level it would have taken under control (\(m=M_i(0)\))? This would be \(Y_i(M_i(0), 1)\).
    • These are complex potential outcomes because they involve situations that can never happen in reality (more on that soon!).
  • Linking back: \(Y_i(1) = Y_i(M_i(1), 1)\) and \(Y_i(0) = Y_i(M_i(0), 0)\).

Defining Effects with \(Y(m, z)\)

  • Using this notation, we can define effects more precisely:
  • Total Effect (ATE): \(E[Y_i(1) - Y_i(0)] = E[Y_i(M_i(1), 1) - Y_i(M_i(0), 0)]\)
  • Controlled Direct Effect (CDE): Effect of \(Z\) on \(Y\), holding \(M\) fixed at some level \(m\).
    • \(CDE(m) = E[Y_i(m, 1) - Y_i(m, 0)]\)
  • Natural Direct Effect (NDE): Effect of \(Z\) on \(Y\) if \(M\) were held at the level it would have taken under control.
    • \(NDE = E[Y_i(M_i(0), 1) - Y_i(M_i(0), 0)]\)
  • Natural Indirect Effect (NIE): Effect of \(M\) changing from \(M_i(0)\) to \(M_i(1)\), holding \(Z\) fixed (usually at \(Z=1\) for policy relevance).
    • \(NIE = E[Y_i(M_i(1), 1) - Y_i(M_i(0), 1)]\)
  • Under certain assumptions, \(ATE = NDE + NIE\).

The Challenge: Complex Potential Outcomes

  • Look closely at the definitions of NDE and NIE:
    • NDE involves \(Y_i(M_i(0), 1)\)
    • NIE involves \(Y_i(M_i(0), 1)\)
  • These are “complex” or “counterfactual” potential outcomes.
    • \(Y_i(M_i(0), 1)\) represents the outcome under treatment (\(Z=1\)), but with the mediator at the level it would take under control (\(M=M(0)\)).
  • This state can never happen in reality! If \(Z_i=1\), we observe \(M_i(1)\), not \(M_i(0)\).
  • These potential outcomes are fundamentally unobservable from any single experiment just randomising \(Z\).
  • Now you see why mediation analysis is so tricky and why we don’t do it often!

Fundamental Problem Recap

  • Estimating the theoretically precise Natural Direct Effect (NDE) and Natural Indirect Effect (NIE) requires knowing the values of complex potential outcomes like \(Y_i(M_i(0), 1)\).
  • Since these are unobservable, we cannot estimate NDE and NIE without making strong assumptions.
  • Common assumptions include versions of “sequential ignorability” (Imai et al.), essentially assuming M is “as-if randomised” conditional on Z and pre-treatment covariates – very similar to the assumption needed for the regression approach to be unbiased.
  • Just randomising Z is not sufficient to identify mediation pathways without these extra assumptions.

Can We Address This? 🤔

Ruling Out Mediators (A Modest Goal)

  • What if we can show that \(Z\) has no effect on \(M\)?
  • If \(M_i(1) = M_i(0)\) for all individuals \(i\) (the sharp null hypothesis), then \(M\) cannot possibly mediate the effect of Z.
  • In this case, the complex potential outcomes simplify: \(Y_i(M_i(0), 1) = Y_i(M_i(1), 1) = Y_i(1)\). The NIE becomes zero.
  • How to test this?
    • Estimate the average effect: \(E[M_i(1) - M_i(0)]\). If it’s precisely zero…
    • Test for heterogeneity: Check if \(Var(M_i(1))\) differs from \(Var(M_i(0))\). (As discussed in Lecture 18). If variances are similar and ATE is zero, it lends support to the sharp null.
  • Ruling out mediators is easier than quantifying mediation. Useful for eliminating hypotheses.

Manipulating the Mediator Directly?

  • What if we design an experiment that randomises both \(Z\) and \(M\)? (A factorial design!)
  • Example: 2x2 design
    • Group 1: \(Z=0\), \(M=0\) (Control)
    • Group 2: \(Z=0\), \(M=1\) (Manipulate \(M\) only)
    • Group 3: \(Z=1\), \(M=0\) (Manipulate \(Z\), block \(M\)?)
    • Group 4: \(Z=1\), \(M=1\) (Manipulate \(Z\) and \(M\))
  • This helps estimate Controlled Direct Effects (CDEs). E.g., Effect of \(Z\) holding \(M\) at 0 is (Group 3 - Group 1). Effect of \(M\) holding \(Z\) at 1 is (Group 4 - Group 3).
  • BUT: This still doesn’t directly estimate the NDE or NIE, because \(M\) is set experimentally, not allowed to take its “natural” value \(M_i(z)\). But why is this important?
  • Manipulating \(M\) might be hard or artificial (e.g., giving Vitamin C tablets vs. limes might have different effects beyond Vitamin C).

The Encouragement Design Analogy

  • Often, we can’t directly set \(M\). Instead, we use an “encouragement” \(Z\) to try and influence \(M\).
  • Think back to non-compliance and Instrumental Variables (IV):
    • \(Z\) = Assignment/Encouragement (e.g., offer tutoring)
    • \(M\) = Treatment Received (e.g., actually attend tutoring)
    • \(Y\) = Outcome (e.g., test score)
  • We can use \(Z\) as an instrument for \(M\) to estimate the effect of \(M\) on \(Y\) for Compliers (CACE/LATE).
  • This is sometimes framed as a mediation analysis: \(Z\) affects \(Y\) through its effect on \(M\).
  • Formula: \(CACE_{M \to Y} = \frac{ITT_{Y}}{ITT_{M}} = \frac{E[Y|Z=1] - E[Y|Z=0]}{E[M|Z=1] - E[M|Z=0]}\)

The Excludability Problem for IV/Mediation

  • For the IV estimate of the effect of \(M\) on \(Y\) to be valid, we need the exclusion restriction.
  • In the mediation context, this means \(Z\) must affect \(Y\) only through \(M\). There can be no direct path from \(Z\) to \(Y\) (\(d=0\)).
  • This is a very strong assumption! Often, the encouragement (\(Z\)) might affect the outcome (\(Y\)) through other channels besides the intended mediator (\(M\)).
  • Example (Bhavnani): Seat reservations (\(Z\)) might affect future elections (\(Y\)) not just by creating incumbents (\(M1\)), but also by changing voter attitudes (\(M2\)) or mobilising different voters (\(M3\)).
  • If \(Z\) has a direct effect or affects multiple mediators, the simple IV approach doesn’t isolate the effect through one specific \(M\).
  • Identifying effects through multiple mediators requires multiple encouragements (instruments) with different effects on the mediators – very complex!

A Different Approach: Implicit Mediation 💡

Scaling Back Ambitions: Focus on Treatment Components

  • Given the challenges of formally estimating direct/indirect effects or using IV with strong assumptions…
  • An alternative: Implicit Mediation Analysis.
  • Instead of trying to measure M and model the Z → M → Y pathway explicitly…
  • Focus on the treatment Z itself. Many treatments are “bundles” of different components.
  • Design experiments that add or subtract specific components of the treatment bundle.
  • Compare the effects of these different treatment variations.
  • This implicitly tests the importance of the components (and the mediators they likely affect) without needing to measure M or make strong assumptions about unobservables.

Example: Conditional Cash Transfers (CCTs)

  • CCT programmes give cash to poor families if they meet certain conditions (e.g., school attendance, health check-ups).
  • Potential mediators: Increased income (cash effect), Changed behaviour due to rules (conditionality effect).
  • Implicit Mediation Design: (e.g., Baird et al. 2009)
    • Group 1: Control (no programme)
    • Group 2: Unconditional Cash Transfer (UCT - gets cash, no rules)
    • Group 3: Conditional Cash Transfer (CCT - gets cash + rules)
  • Comparisons:
    • (Group 2 - Group 1): Effect of cash alone.
    • (Group 3 - Group 1): Effect of cash + conditions.
    • (Group 3 - Group 2): Effect of conditions (holding cash constant).
  • This tells us about the importance of the conditions (likely mediator: behaviour change) vs. the cash (likely mediator: income) without directly measuring parental behaviour or income changes and modelling them.

Example: Voter Turnout Postcards (Gerber, Green, Larimer 2008)

  • Famous experiment testing social pressure effects on voting.
  • Treatment “ingredients” gradually added:
    • Group 1: Control (no mail)
    • Group 2: “Civic Duty” mailer (basic encouragement)
    • Group 3: “Hawthorne” mailer (Civic Duty + told they are being studied)
    • Group 4: “Self” mailer (Hawthorne + shown own household’s past voting record)
    • Group 5: “Neighbors” mailer (Self + shown neighbours’ past voting records)
  • Implicit mediators: Sense of duty, being watched, accountability for own record, comparison to neighbours.

Voter Turnout Results (Table 10.2)

Illustrative Turnout Rates:

Group Treatment Components Turnout Effect vs Control
1. Control None 29.7%
2. Civic Duty Encouragement 31.5% +1.8%
3. Hawthorne Encouragement + Monitoring 32.2% +2.5%
4. Self Encouragement + Monitoring + Own Record 34.5% +4.8%
5. Neighbors Encouragement + Monitoring + All Records 37.8% +8.1%

(Based on Gerber, Green, and Larimer 2008)

  • Clear pattern: Adding social pressure components significantly increases turnout.
  • Comparing Group 5 vs 4 suggests disclosing neighbours’ records adds ~3.3% effect.
  • Comparing Group 4 vs 3 suggests disclosing own record adds ~2.3% effect.
  • Tells us which ingredients matter without modelling psychological states.

Interpreting Implicit Mediation

  • This approach identifies which aspects of a complex intervention drive the effect.
  • It suggests the likely importance of different mediating processes (e.g., social comparison is powerful for turnout) without needing to directly measure or model the mediator variable (e.g., feelings of shame/pride).
  • It’s inherently design-based and stays within the clean framework of comparing randomly assigned groups.
  • Very useful for programme evaluation and refinement. What are the active ingredients?

Strengths of Implicit Mediation

  • Avoids bias inherent in regression-based mediation with non-randomised mediators.
  • Stays within the unbiased statistical framework of comparing randomised groups.
  • Lends itself to exploration and discovery of effective treatment variations.
  • Relies on weaker, design-based assumptions rather than statistical assumptions about unobservables.
  • Often more practical and feasible than trying to perfectly manipulate or measure mediators in field settings.

Conclusion

Key Takeaways on Mediation

  • Mediation analysis seeks to understand how a treatment Z affects an outcome Y through an intermediate variable M (causal mechanisms).
  • Traditional regression approaches are widespread but problematic:
    • Rely on constant effects assumption.
    • Suffer from bias due to unobserved confounding between M and Y.
    • Tend to overestimate M’s role and underestimate Z’s direct role.
  • The potential outcomes framework reveals fundamental challenges:
    • Estimating natural direct/indirect effects requires observing complex potential outcomes that are inherently unobservable.
    • Randomising Z alone is insufficient; strong assumptions are needed.
  • Experimentally manipulating M helps estimate controlled effects but doesn’t fully solve the problem and may be artificial.
  • Using Z as an instrument for M requires the strong exclusion restriction (no direct Z->Y path), often violated.
  • Ruling out mediators (showing Z doesn’t affect M) is a more modest but achievable goal.
  • Implicit mediation analysis is a pragmatic, design-based alternative:
    • Varies treatment components experimentally.
    • Compares effects to infer which components (and likely mechanisms) matter.
    • Avoids modelling M directly, relies on fewer assumptions.

Final Thoughts

  • Be critical of causal mediation claims, especially those based solely on standard regression methods after randomising only Z.
  • Ask about the assumptions being made (constant effects? no confounding? exclusion restriction?).
  • Favour design-based approaches where possible.
  • Implicit mediation offers a robust way to gain insights into mechanisms by comparing different versions of a treatment.
  • Understanding mechanisms is crucial, but requires careful thought about identification strategies!

Thanks very much! 😊

See you next time! 👋