Problem Set 3: Causality and Review
EC 421: Introduction to Econometrics
1 Instructions
Due: Upload your answers on Canvas before 11:59 pm (Pacific) on Thursday, 13 June 2025.
Optional: This assignment is optional. If you submit it on Canvas, we will grade it, and it will replace the lowest grade among your assignments. If you do not submit it, your grade will not change.
Important: You must submit your answers as an HTML or PDF file, built from an RMarkdown (.RMD) or Quarto (.qmd) file. Do not submit the .RMD or .qmd file. You will not receive credit for it.
Integrity: If you are suspected of cheating, then you will receive a zero for the assignment and possibly for the course. We may report you to the dean. Cheating includes copying from your classmates, from the internet, and from previous assignments.
Objective: This problem set has three main purposes: (1) reinforce what you learned about causality; (2) review the material from our course; (3) help you prepare for the final.
2 Causality
Suppose we are analyzing a public policy where some individuals (
1 | 1 | 11 | 10 |
2 | 1 | 10 | 8 |
3 | 1 | 12 | 7 |
4 | 1 | 9 | 5 |
5 | 1 | 13 | 9 |
6 | 0 | 5 | 4 |
7 | 0 | 7 | 1 |
8 | 0 | 8 | 3 |
9 | 0 | 6 | 2 |
10 | 0 | 4 | 1 |
[01] Calculate the individual treatment effect for each individual (each
[02] Explain why this dataset would be impossible to observe in “real life”.
[03] Calculate the average treatment effect.
[04] Which data points would we observe in the real world? Briefly explain.
[05] Would we have selection bias if we compared the average
[06] Explain how randomized experiments avoid selection bias.
[07] Suppose even-numbered individuals received the treatment and odd-numbered individuals did not. Would this “randomization” avoid selection bias? Explain.
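To explore the ideas behind [05] and [06], here is a minimal simulation sketch in Python. It does not use the table above: the data are simulated, and every name and number in it is a hypothetical choice made for illustration (`difference_in_means`, `true_effect`, the seed, the distribution parameters, and the selection rule in which units with a higher untreated outcome opt into treatment).

```python
import random

def difference_in_means(y0, y1, treated):
    """Mean observed outcome of treated units minus that of untreated units."""
    # Treated units reveal y1; untreated units reveal y0.
    treated_obs = [b for a, b, d in zip(y0, y1, treated) if d]
    control_obs = [a for a, b, d in zip(y0, y1, treated) if not d]
    return sum(treated_obs) / len(treated_obs) - sum(control_obs) / len(control_obs)

rng = random.Random(421)   # hypothetical seed, fixed for reproducibility
n = 20_000
true_effect = 3.0          # assumed constant treatment effect for every unit

# Simulated potential outcomes (assumed distribution, not the problem's data).
y0 = [rng.gauss(5.0, 2.0) for _ in range(n)]
y1 = [v + true_effect for v in y0]

# Rule 1: treatment assigned by a fair coin flip.
random_assignment = [rng.random() < 0.5 for _ in range(n)]
# Rule 2 (assumed): units with a high untreated outcome select into treatment.
self_selection = [v > 5.0 for v in y0]

estimate_random = difference_in_means(y0, y1, random_assignment)
estimate_selected = difference_in_means(y0, y1, self_selection)

print("true effect:", true_effect)
print("difference in means, random assignment:", round(estimate_random, 3))
print("difference in means, self-selection:", round(estimate_selected, 3))
```

Printing both estimates next to `true_effect` lets you compare the two assignment rules. Changing the selection rule is a quick way to see how the comparison responds.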
3 Time-series review
[08] What are the benefits and drawbacks of using a dynamic model with a lagged outcome variable?
[09] Explain why we care whether time-series data are non-stationary.
[10] Suppose
[11] Continuing the definition of
[12] Imagine you have a regression model that includes a lagged outcome variable. Is OLS unbiased and/or consistent for the coefficients? Explain how your answer depends on whether the disturbance is autocorrelated.
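The setting in [12] can be explored by simulation. The following is a minimal sketch, assuming a simple dynamic model with no other regressors and an AR(1) disturbance. The function names (`simulate_dynamic_model`, `ols_slope`, `lagged_outcome_estimate`), the parameter values, and the seed are all hypothetical choices for this example, not part of the course materials.

```python
import random

def simulate_dynamic_model(n_periods, beta, rho, seed):
    """Simulate y_t = beta * y_{t-1} + u_t, where u_t = rho * u_{t-1} + e_t."""
    rng = random.Random(seed)
    y_prev, u_prev = 0.0, 0.0
    series = []
    for _ in range(n_periods):
        u = rho * u_prev + rng.gauss(0.0, 1.0)  # disturbance (autocorrelated if rho != 0)
        y = beta * y_prev + u                   # outcome depends on its own lag
        series.append(y)
        y_prev, u_prev = y, u
    return series

def ols_slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    return cov_xy / var_x

def lagged_outcome_estimate(series):
    """Regress y_t on y_{t-1} and return the estimated coefficient."""
    return ols_slope(series[:-1], series[1:])

beta = 0.5  # assumed true coefficient on the lagged outcome
no_autocorr = simulate_dynamic_model(20_000, beta, rho=0.0, seed=1)
with_autocorr = simulate_dynamic_model(20_000, beta, rho=0.5, seed=1)

slope_no_autocorr = lagged_outcome_estimate(no_autocorr)
slope_with_autocorr = lagged_outcome_estimate(with_autocorr)

print("true beta:", beta)
print("OLS estimate, rho = 0.0:", round(slope_no_autocorr, 3))
print("OLS estimate, rho = 0.5:", round(slope_with_autocorr, 3))
```

Rerunning with different values of `n_periods` and `rho` shows how the estimate behaves as the sample grows, with and without autocorrelation in the disturbance.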
4 General review
[13] Explain how measurement error (as defined in class) in an explanatory variable affects the OLS estimate of that variable’s coefficient.
[14] Explain how we can sign the bias caused by an omitted variable.
[15] Explain the difference between
[16] Explain why we discuss exogeneity so much.
[17] Define a “standard error” and explain why we need it.
[18] Why does heteroskedasticity matter?
[19] What is a
[20] Define autocorrelation and explain why it matters.
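For [13], a small simulation can help build intuition. The sketch below assumes classical measurement error (mean-zero noise that is independent of the true regressor and of the disturbance), which may differ from the definition used in class. The names (`ols_slope`, `x_true`, `x_noisy`), the parameter values, and the seed are hypothetical.

```python
import random

def ols_slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    return cov_xy / var_x

rng = random.Random(2025)  # hypothetical seed, fixed for reproducibility
n = 20_000
beta = 2.0                 # assumed true slope

# True regressor and outcome: y = 1 + beta * x + disturbance.
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 + beta * x + rng.gauss(0.0, 1.0) for x in x_true]

# Observed regressor: the true value plus independent noise (assumed classical error).
x_noisy = [x + rng.gauss(0.0, 1.0) for x in x_true]

slope_true_x = ols_slope(x_true, y)
slope_noisy_x = ols_slope(x_noisy, y)

print("true beta:", beta)
print("OLS slope using the true regressor:", round(slope_true_x, 3))
print("OLS slope using the mismeasured regressor:", round(slope_noisy_x, 3))
```

Varying the standard deviation of the noise added to `x_true` shows how the size of the measurement error changes the estimate.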