Problem set 2

EC607

Due Saturday, 10 May 2025 by 11:59 PM.

Part 1: Contemplate

1.1 (5 points) It’s that time again. Take 15 minutes to think about (1) your research interests, (2) how you are doing, and (3) what matters to you. No screens. No paper. No nothing. Just sit somewhere (ideally not PLC) quietly and think.

To submit: Where and when did you complete this task?

Part 2: Extend your OLS function

2.1 (5 points) Extend your OLS function from the last problem set to estimate “classic OLS” standard errors (i.e., standard errors that assume homoskedastic and independent disturbances).

As with the first problem set: Do not use lm or other regression functions. You need to program the standard error estimator yourself (you can use matrix operations, sum, mean, etc.).

2.2 (10 points) Now further extend your OLS function to estimate heteroskedasticity-robust standard errors (same rules).

You will want to do one of the following:

  1. Add a new argument to your function that dictates the type of standard error to compute or create a new function that computes the heteroskedasticity-robust standard errors.
  2. Just return both types of standard errors.

Part 3: Simulate with a familiar DGP

3.1 (10 points) Now run a simulation (at least 1,000 iterations) to see how well each of the standard error estimators behaves with a fairly small sample size.

In each iteration draw 25 observations from the DGP from problem 2.1 in the first problem set, i.e.,

\[ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \epsilon_i \]

where

  • \(\beta_0 = 0\), \(\beta_1 = 1\), \(\beta_2 = 2\), and \(\beta_3 = 3\)
  • \(x_{1i} \sim N(0, 1)\)
  • \(x_{2i} \sim N(0, 1)\)
  • \(x_{3i} = u_i + w_i\)
  • \(\epsilon_i = v_i + w_i\)
  • and \(w_i \sim N(0, 1)\), \(u_i \sim N(0, 1)\), and \(v_i \sim N(0, 1)\)

To be clear: within each iteration, use OLS to estimate the regression coefficients, the “classical” standard errors, and the “heteroskedasticity-robust” standard errors.

3.2 (10 points) Using the results of your simulation in 3.1: What percent of the 95% confidence intervals for \(\beta_1\) contain the true value?

3.3 (10 points) Now repeat 3.1 and 3.2 but with 1,000 observations per sample (rather than 25).

3.4 (5 points) How do the confidence intervals’ coverage rates compare (a) across the two standard-error estimators and (b) across the sample sizes? Explain why this is suprising or expected.

3.5 (5 points) What is the average confidence interval width for each of the estimator by sample-size combinations? Does this result match what you would expect? Explain.