1.1 (5 points) It’s that time again. Take 15 minutes to think about (1) your research interests, (2) how you are doing, and (3) what matters to you. No screens. No paper. No nothing. Just sit somewhere (ideally not PLC) quietly and think.
To submit: Where and when did you complete this task?
Answer: Completion.
Part 2: Extend your OLS function
2.1 (5 points) Extend your OLS function from the last problem set to estimate “classic OLS” standard errors (i.e., standard errors that assume homoskedastic and independent disturbances).
As with the first problem set: Do not use lm or other regression functions. You need to program the standard error estimator yourself (you can use matrix operations, sum, mean, etc.).
Answer: Here’s my function…
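For reference, it implements the classical (“spherical”) variance estimator in matrix form,

\[
\widehat{\mathrm{Var}}(\hat{\beta}) = s^2 \left( X'X \right)^{-1},
\qquad
s^2 = \frac{e'e}{n - p},
\]

where \(e = y - X\hat{\beta}\) is the residual vector, \(n\) is the number of observations, and \(p\) is the number of columns of \(X\) (including the intercept).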
```r
# The function
ols_fun = function(X, y) {
  # Create a column of ones
  ones = matrix(1, nrow = nrow(X), ncol = 1)
  # Combine the X matrix with the column of ones
  X_ones = cbind(ones, X)
  # Calculate the coefficients
  b = solve(t(X_ones) %*% X_ones) %*% t(X_ones) %*% y
  # Calculate the residuals
  e = y - X_ones %*% b
  # Calculate the residual variance
  s2 = (t(e) %*% e) / (nrow(X_ones) - ncol(X_ones))
  s2 = s2 |> as.numeric()
  # Spherical VCOV
  b_vcov = s2 * solve(t(X_ones) %*% X_ones)
  # Return the coefficients and standard errors
  data.frame(
    b = b[, 1],
    se = sqrt(diag(b_vcov))
  )
}

# Run the function on cars
ols_fun(
  X = matrix(cars$speed, ncol = 1),
  y = matrix(cars$dist, ncol = 1)
)
```
```
           b        se
1 -17.579095 6.7584402
2   3.932409 0.4155128
```
```r
# Check against fixest
fixest::feols(dist ~ speed, data = cars)
```
2.2 (10 points) Now further extend your OLS function to estimate heteroskedasticity-robust standard errors (same rules).
You will want to do one of the following:

- Add a new argument to your function that dictates the type of standard error to compute, or create a new function that computes the heteroskedasticity-robust standard errors.
- Just return both types of standard errors.
Answer: Updating the function…
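The new piece is the heteroskedasticity-robust “sandwich” estimator (HC0) and its degrees-of-freedom-corrected HC1 version, which the function computes as

\[
\widehat{\mathrm{Var}}_{\mathrm{HC0}}(\hat{\beta})
= \left( X'X \right)^{-1} X' \,\mathrm{diag}\!\left( e_i^2 \right) X \left( X'X \right)^{-1},
\qquad
\widehat{\mathrm{Var}}_{\mathrm{HC1}}(\hat{\beta})
= \frac{n}{n - p}\, \widehat{\mathrm{Var}}_{\mathrm{HC0}}(\hat{\beta}).
\]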
```r
# The function
ols_fun = function(X, y, se_type = 'iid') {
  # Create a column of ones
  ones = matrix(1, nrow = nrow(X), ncol = 1)
  # Combine the X matrix with the column of ones
  X_ones = cbind(ones, X)
  # Define n and p
  n = nrow(X_ones)
  p = ncol(X_ones)
  # Calculate the coefficients
  b = solve(t(X_ones) %*% X_ones) %*% t(X_ones) %*% y
  # Calculate the residuals
  e = y - X_ones %*% b
  # Calculate the residual variance
  s2 = (t(e) %*% e) / (n - p)
  s2 = s2 |> as.numeric()
  # Spherical VCOV
  vcov_sph = s2 * solve(t(X_ones) %*% X_ones)
  # Heteroskedasticity-robust VCOV (the HC0 'sandwich')
  vcov_het =
    solve(t(X_ones) %*% X_ones) %*%
    (t(X_ones) %*% diag(as.vector(e)^2) %*% X_ones) %*%
    solve(t(X_ones) %*% X_ones)
  # Return the coefficients and standard errors
  # (note: se_type is currently unused; all SE types are returned)
  data.frame(
    b = b[, 1],
    se_sph = sqrt(diag(vcov_sph)),
    se_het = sqrt(diag(vcov_het)),
    se_hc1 = sqrt(diag(vcov_het) * n / (n - p))
  )
}

# Run the function on cars
ols_fun(
  X = matrix(cars$speed, ncol = 1),
  y = matrix(cars$dist, ncol = 1)
)
```
Notice that the default heteroskedasticity-robust standard error estimator (in fixest) is the “HC1” estimator.
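As a check, we can ask fixest for its heteroskedasticity-robust standard errors and compare them with the `se_hc1` column (assuming a fixest version that accepts the `vcov` argument):

```r
# Check: fixest's "hetero" VCOV applies the HC1 correction by default
fixest::feols(dist ~ speed, data = cars, vcov = 'hetero')
```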
Part 3: Simulate with a familiar DGP
3.1 (10 points) Now run a simulation (at least 1,000 iterations) to see how well each of the standard error estimators behaves with a fairly small sample size.
In each iteration draw 25 observations from the DGP from problem 2.1 in the first problem set, i.e.,

\[
y_i = 0 + 1\, x_{1,i} + 2\, x_{2,i} + 3\, x_{3,i} + \varepsilon_i,
\]

where \(x_{1,i} \sim N(0, 1)\), \(x_{2,i} \sim N(0, 1)\), \(x_{3,i} = u_i + w_i\), \(\varepsilon_i = v_i + w_i\), and \(w_i \sim N(0, 1)\), \(u_i \sim N(0, 1)\), and \(v_i \sim N(0, 1)\).
To be clear: within each iteration, use OLS to estimate the regression coefficients, the “classical” standard errors, and the “heteroskedasticity-robust” standard errors.
Answer: Run the simulation…
```r
# Load the `tidyverse` (using `pacman`)
library(pacman)
p_load(tidyverse, purrr, furrr, future)
```
```r
# Set the seed
set.seed(12)

# Define the iteration function
iter_fun = function(it, n = 25) {
  # Draw n observations from the population
  sample_df = tibble(
    id = 1:n,
    u = rnorm(n),
    v = rnorm(n),
    w = rnorm(n),
    epsilon = v + w,
    x1 = rnorm(n),
    x2 = rnorm(n),
    x3 = u + w,
    y = 0 + 1 * x1 + 2 * x2 + 3 * x3 + epsilon
  )
  # Now run our function (labeling the parameters and the iteration)
  ols_fun(
    X = sample_df |> select(x1, x2, x3) |> as.matrix(),
    y = sample_df |> select(y) |> as.matrix()
  ) |>
    mutate(
      param = c('beta_0', 'beta_1', 'beta_2', 'beta_3'),
      iteration = it
    )
}

# Run the simulation 10,000 times (in parallel)
plan(multisession, workers = 10)
sim_list = future_map(
  1:1e4, iter_fun,
  .options = furrr_options(seed = TRUE)
)

# Combine the results into a single data frame
sim_df = bind_rows(sim_list) |> as_tibble()
```
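Questions 3.4 and 3.5 also compare results across sample sizes. Since `iter_fun` already takes the sample size as an argument, a sketch of the 1,000-observation run looks like the following (the object names `sim_list_1k` and `sim_df_1k` are my own):

```r
# Re-run the simulation with n = 1,000 observations per iteration
sim_list_1k = future_map(
  1:1e4, iter_fun, n = 1000,
  .options = furrr_options(seed = TRUE)
)
sim_df_1k = bind_rows(sim_list_1k) |> as_tibble()
```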
3.2 (10 points) Using the results of your simulation in 3.1: What percent of the 95% confidence intervals for \(\beta_1\) contain the true value?
Answer: Calculate the coverage rate…
```r
# Filter results to beta_1
b1_df = sim_df |> filter(param == 'beta_1')

# Calculate the lower and upper bounds of the 95% CI
q = qnorm(0.975)
b1_df = b1_df |> mutate(
  lower_sph = b - q * se_sph,
  upper_sph = b + q * se_sph,
  lower_het = b - q * se_het,
  upper_het = b + q * se_het,
  lower_hc1 = b - q * se_hc1,
  upper_hc1 = b + q * se_hc1
)

# Determine coverage (the true beta_1 is 1)
b1_df = b1_df |> mutate(
  covers_sph = (lower_sph < 1) * (upper_sph > 1),
  covers_het = (lower_het < 1) * (upper_het > 1),
  covers_hc1 = (lower_hc1 < 1) * (upper_hc1 > 1)
)

# Coverage rate by estimator
b1_df |> summarise(
  coverage_sph = mean(covers_sph),
  coverage_het = mean(covers_het),
  coverage_hc1 = mean(covers_hc1)
)
```
3.4 (5 points) How do the confidence intervals’ coverage rates compare (a) across the two standard-error estimators and (b) across the sample sizes? Explain why this is surprising or expected.
Answer: The coverage rates in small samples are not great, with the exception of the “classic” OLS standard errors (which are close to 95%, as they should be). Once the sample size increases, the coverage rates improve, especially for the HC1 estimator.
The fact that the “classic” OLS standard errors perform well is actually expected: our DGP does not have heteroskedasticity (the disturbance \(\varepsilon_i = v_i + w_i\) has a variance that does not depend on the regressors). Thus, the classic OLS standard errors are unbiased in our setting.
The estimator that I’ve labeled “het” (the uncorrected HC0 sandwich) is biased but consistent. Our small sample (n = 25) is not large enough to get close to the asymptotic distribution, and it looks like it might not be getting close even with 1,000 observations.
The HC1 estimator scales HC0 by \(n / (n - p)\), which corrects most of that small-sample bias, but it remains less efficient than the “classic” OLS standard errors when the disturbance actually is homoskedastic.
3.5 (5 points) What is the average confidence interval width for each of the estimator-by-sample-size combinations? Does this result match what you would expect? Explain.
Answer: First, let’s check the average width of the confidence intervals for each of the estimators in the 25-observation simulation.
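A sketch of that calculation, reusing the `b1_df` object with the confidence-interval bounds computed in 3.2:

```r
# Average 95% CI width for each estimator (n = 25 simulation)
b1_df |> summarise(
  width_sph = mean(upper_sph - lower_sph),
  width_het = mean(upper_het - lower_het),
  width_hc1 = mean(upper_hc1 - lower_hc1)
)
```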
Given what we have already seen, the widths of the intervals make sense: the estimators with below-nominal coverage rates produce intervals that are too narrow.
The 1,000-observation intervals are narrower than the 25-observation intervals, which is expected: the more information we have, the more precise our estimates should be.