EC 421, Set 8
Prologue
Introduction to time series
Autocorrelation
Writing your own functions.
Everything you do in R involves some sort of function, e.g.,
mean(), lm(), summary(), read_csv(), ggplot(), +
The basic idea in R is doing things to objects with functions.
We write functions to make life easier. Instead of copying and pasting the same line of code a million times, you can write one function.
In R, you use the function() function to write functions.
the_name: The name we are giving to our new function.
arg1: The first argument of our function.
arg2: The second argument of our function.
Let’s write a function that multiplies two numbers. (It needs two arguments.)
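A minimal sketch, using the template above (multiply_two is just an illustrative name):

```r
# Define a new function named multiply_two with two arguments
multiply_two <- function(arg1, arg2) {
  # Multiply the two arguments and return the result
  arg1 * arg2
}

# Try it out: should return 12
multiply_two(3, 4)
```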
Did it work?
… that you tell them.
If you are going to repeat a task (e.g., a simulation), then you have a good situation for writing your own function.
R offers many functions (via its many packages), but you will sometimes find a scenario for which no one has written a function.
Now you know how to write your own.
Time series
Review
Changes to our model/framework.
Our model now has \(\textcolor{#e64173}{t}\) subscripts for time periods.
Dynamic models allow lags of explanatory and/or outcome variables.
We changed our exogeneity assumption to contemporaneous exogeneity, i.e., \(\mathop{\boldsymbol{E}}\left[ u_t \middle| X_t \right] = 0\)
Including lags of outcome variables can lead to biased coefficient estimates from OLS.
Lagged explanatory variables make OLS inefficient.
Autocorrelation
Autocorrelation occurs when our disturbances are correlated over time, i.e., \(\mathop{\text{Cov}} \left( u_t,\, u_s \right) \neq 0\) for \(t\neq s\).
Another way to think about it: the shock from disturbance \(t\) correlates with “nearby” shocks in \(t-1\) and \(t+1\).
Note: Serial correlation and autocorrelation are the same thing.
Why is autocorrelation prevalent in time-series analyses?
Positive autocorrelation: Disturbances \(\left( u_t \right)\) over time
Positive autocorrelation: Outcomes \(\left( y_t \right)\) over time
Negative autocorrelation: Disturbances \(\left( u_t \right)\) over time
Negative autocorrelation: Outcomes \(\left( y_t \right)\) over time
Let’s start with a very common model: a static time-series model whose disturbances exhibit first-order autocorrelation, a.k.a. AR(1):
\[ \begin{align} \text{Births}_t &= \beta_0 + \beta_1 \text{Income}_t + u_t \end{align} \] where \[ \begin{align} \textcolor{#e64173}{u_t} = \textcolor{#e64173}{\rho \, u_{t-1}} + \varepsilon_t \end{align} \] and the \(\varepsilon_t\) are independently and identically distributed (i.i.d.).
Second-order autocorrelation, or AR(2), would be
\[ \begin{align} \textcolor{#6A5ACD}{u_t} = \textcolor{#6A5ACD}{\rho_1 u_{t-1}} + \textcolor{#6A5ACD}{\rho_2 u_{t-2}} + \varepsilon_t \end{align} \]
An AR(p) model/process has a disturbance structure of
\[ \begin{align} u_t = \sum_{j=1}^\textcolor{#20B2AA}{p} \rho_j u_{t-j} + \varepsilon_t \end{align} \]
allowing the current disturbance to be correlated with up to \(\textcolor{#20B2AA}{p}\) of its lags.
For static models or dynamic models with lagged explanatory variables, in the presence of autocorrelation
OLS provides unbiased estimates for the coefficients.
OLS creates biased estimates for the standard errors.
OLS is inefficient.
Recall: Same implications as heteroskedasticity and correlated disturbances.
(Correlated disturbances are actually what we have with autocorrelation.)
However, autocorrelation gets trickier with lagged outcome variables.
Consider a model with one lag of the outcome variable—an ADL(1, 0) model—with AR(1) disturbances
\[ \begin{align} \text{Births}_t = \beta_0 + \beta_1 \text{Income}_t + \beta_2 \text{Births}_{t-1} + u_t \end{align} \] where \[ \begin{align} u_t = \rho u_{t-1} + \varepsilon_t \end{align} \]
Problem: \(\text{Births}_{t-1}\) (a regressor in the model for time \(t\)) and
\(u_{t}\) (the disturbance for time \(t\)) both depend upon \(u_{t-1}\).
I.e., a regressor is correlated with its contemporaneous disturbance.
Q: Why is this a problem?
A: It violates contemporaneous exogeneity, i.e., \(\mathop{\text{Cov}} \left( x_t,\,u_t \right) \neq 0\).
To see this problem, first write out the model for \(t\) and \(t-1\): \[ \begin{align} \text{Births}_t &= \beta_0 + \beta_1 \text{Income}_t + \beta_2 \text{Births}_{t-1} + u_t \\ \text{Births}_{t-1} &= \beta_0 + \beta_1 \text{Income}_{t-1} + \beta_2 \text{Births}_{t-2} + u_{t-1} \end{align} \] and now note that \(u_t = \rho u_{t-1} + \varepsilon_t\). Substituting… \[ \begin{align} \text{Births}_t &= \beta_0 + \beta_1 \text{Income}_t + \beta_2 \textcolor{#6A5ACD}{\text{Births}_{t-1}} + \overbrace{\left( \rho \textcolor{#e64173}{u_{t-1}} + \varepsilon_t \right)}^{u_t} \tag{1} \\ \textcolor{#6A5ACD}{\text{Births}_{t-1}} &= \beta_0 + \beta_1 \text{Income}_{t-1} + \beta_2 \text{Births}_{t-2} + \textcolor{#e64173}{u_{t-1}} \tag{2} \end{align} \]
In \((1)\), we can see that \(u_t\) depends upon (covaries with) \(\textcolor{#e64173}{u_{t-1}}\).
In \((2)\), we can see that \(\textcolor{#6A5ACD}{\text{Births}_{t-1}}\), a regressor in \((1)\), also covaries with \(u_{t-1}\).
∴ This model violates our contemporaneous exogeneity requirement.
Implications: For models with lagged outcome variables and autocorrelated disturbances
The models violate contemporaneous exogeneity.
OLS is biased and inconsistent for the coefficients.
Intuition? Why is OLS inconsistent and biased when we violate exogeneity?
Think back to omitted-variable bias…
\[ \begin{align} y_t &= \beta_0 + \beta_1 x_t + u_t \end{align} \]
When \(\mathop{\text{Cov}} \left( x_t,\, u_t \right)\neq0\), we cannot separate the effect of \(u_t\) on \(y_t\) from the effect of \(x_t\) on \(y_t\). Thus, we get inconsistent estimates for \(\beta_1\).
Similarly,
\[ \begin{align} \text{Births}_t &= \beta_0 + \beta_1 \text{Income}_t + \beta_2 \text{Births}_{t-1} + \overbrace{\left( \rho u_{t-1} + \varepsilon_t \right)}^{u_t} \tag{1} \end{align} \]
we cannot separate the effect of \(u_t\) on \(\text{Births}_t\) from the effect of \(\text{Births}_{t-1}\) on \(\text{Births}_{t}\), because both \(u_t\) and \(\text{Births}_{t-1}\) depend upon \(u_{t-1}\). Thus, \(\textcolor{#e64173}{\hat{\beta}_2}\) is biased (with OLS).
To see how this bias can look, let’s run a simulation.
\[ \begin{align} y_t &= 1 + 2 x_t + 0.5 y_{t-1} + u_t \\ u_t &= 0.9 u_{t-1} + \varepsilon_t \end{align} \]
One (easy) way to generate 100 disturbances from an AR(1) process with \(\rho = 0.9\):
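For example, a sketch using base R’s arima.sim(), which handles the AR recursion (and a burn-in period) for us:

```r
# Simulate 100 AR(1) disturbances with rho = 0.9 (i.i.d. normal innovations)
u <- arima.sim(model = list(ar = 0.9), n = 100)
```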
We are going to run 10,000 iterations with \(T=100\).
Q: Will this simulation tell us about bias or consistency?
A: Bias. We would need to let \(T\rightarrow\infty\) to consider consistency.
Outline of our simulation:
1. Generate \(T=100\) values of \(x_t\)
2. Generate \(T=100\) values of the AR(1) disturbances \(u_t\)
3. Calculate \(y_t = \beta_0 + \beta_1 x_t + \beta_2 y_{t-1} + u_t\)
4. Regress \(y_t\) on \(x_t\) and \(y_{t-1}\); record the estimates
5. Repeat steps 1–4 10,000 times (sketched below)
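A minimal sketch of this simulation in base R (sim_iter is a hypothetical helper; reduce the iteration count for a quick run):

```r
# One iteration: simulate T periods of the DGP, then fit OLS
sim_iter <- function(T = 100) {
  x <- rnorm(T)                                  # step 1: x values
  u <- arima.sim(model = list(ar = 0.9), n = T)  # step 2: AR(1) disturbances
  y <- numeric(T)
  y[1] <- 1 + 2 * x[1] + u[1]                    # step 3: build y (y_0 = 0)
  for (t in 2:T) {
    y[t] <- 1 + 2 * x[t] + 0.5 * y[t - 1] + u[t]
  }
  y_lag <- c(NA, y[-T])                          # first lag of y
  coef(lm(y ~ x + y_lag))                        # step 4: record estimates
}

set.seed(421)
estimates <- replicate(1e4, sim_iter())          # step 5: 10,000 iterations
rowMeans(estimates)                              # compare to (1, 2, 0.5)
```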
Distribution of OLS estimates, \(\hat{\beta}_2\)
\(y_t = 1 + 2 x_t + \textcolor{#e64173}{0.5} y_{t-1} + u_t\)
Distribution of OLS estimates, \(\hat{\beta}_1\)
\(y_t = 1 + \textcolor{#FFA500}{2} x_t + 0.5 y_{t-1} + u_t\)
Testing for autocorrelation
Suppose we have the static model, \[ \begin{align} \text{Births}_t = \beta_0 + \beta_1 \text{Income}_t + u_t \tag{A} \end{align} \] and we want to test for an AR(1) process in our disturbances \(u_t\).
Test for autocorrelation: Test for correlation in the lags of our residuals:
\[ \begin{align} e_t = \textcolor{#e64173}{\rho} e_{t-1} + v_t \end{align} \]
Does \(\textcolor{#e64173}{\hat{\rho}}\) differ significantly from zero?
Familiar idea: Use residuals to learn about disturbances.
Specifically, to test for AR(1) disturbances in the static model \[ \begin{align} \text{Births}_t = \beta_0 + \beta_1 \text{Income}_t + u_t \tag{A} \end{align} \]
we estimate (A) with OLS, save the residuals \(e_t\), and regress \[ e_t = \rho e_{t-1} + v_t \]
For an example, let’s return to our plot of negative autocorrelation.
Negative autocorrelation: Disturbances \(\left( u_t \right)\) over time
Step 1: Estimate the static model \(\left( y_t = \beta_0 + \beta_1 x_t + u_t \right)\) with OLS.
Step 2: Record the residuals \(e_t\) from the regression.
Step 3: Regress the residuals on their first lag (no intercept).
Step 4: t test for the estimated coefficient \(\hat{\rho}\) from step 3.
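A sketch of these four steps in R (y and x are the series from the plot above; object names are illustrative):

```r
library(dplyr)  # lag()
library(broom)  # tidy()

ols <- lm(y ~ x)                 # Step 1: OLS on the static model
e   <- residuals(ols)            # Step 2: record the residuals
ar1_test <- lm(e ~ -1 + lag(e))  # Step 3: regress e on its lag, no intercept
tidy(ar1_test)                   # Step 4: t test on the lag(e) coefficient
```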
# A tibble: 1 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 lag(e) -0.851 0.0535 -15.9 6.88e-29
That’s a very small p-value—much smaller than 0.05.
Reject H0 (H0 was \(\rho=0\), i.e., no autocorrelation).
Step 5: Conclude. Statistically significant evidence of autocorrelation.
What if we wanted to test for AR(3)?
We add more lags of residuals to the regression in Step 3.
We jointly test the significance of the coefficients (i.e., \(\text{LM}\) or \(F\)).
Let’s do it.
Step 1: Estimate the static model \(\left( y_t = \beta_0 + \beta_1 x_t + u_t \right)\) with OLS.
Step 2: Record the residuals \(e_t\) from the regression.
Step 3: Regress the residuals on their first three lags (no intercept).
Note: lag(v, n) from dplyr takes the nth lag of the variable v.
Step 4: Calculate the \(\text{LM} = n\times R_e^2\) test statistic—distributed \(\chi_k^2\).
\(k\) is the number of regressors in the regression in Step 3 (here, \(k=3\)).
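A sketch of steps 3 and 4 in R, continuing with the residuals e from the AR(1) example:

```r
library(dplyr)  # lag()

# Step 3: regress residuals on their first three lags, without an intercept
ar3_test <- lm(e ~ -1 + lag(e) + lag(e, 2) + lag(e, 3))

# Step 4: LM = n * R^2, compared to a chi-squared distribution with k = 3 df
n_used  <- length(e) - 3  # observations remaining after taking three lags
lm_stat <- n_used * summary(ar3_test)$r.squared
pchisq(lm_stat, df = 3, lower.tail = FALSE)  # p-value
```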
Step 5: Conclude.
Recall: Our hypotheses consider the model \[ \begin{align} e_t = \rho_1 e_{t-1} + \rho_2 e_{t-2} + \rho_3 e_{t-3} + v_t \end{align} \]
which we are actually using to learn about the model \[ \begin{align} u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \rho_3 u_{t-3} + \varepsilon_t \end{align} \]
H0: \(\rho_1 = \rho_2 = \rho_3 = 0\) vs. HA: \(\rho_j \neq 0\) for at least one \(j\) in \(\{1,\, 2,\, 3\}\)
Our p-value is less than 0.05.
Reject H0.
Conclude there is statistically significant evidence of autocorrelation.
Recall: OLS is biased and inconsistent when our model has both
a lagged dependent variable
autocorrelated disturbances
Problem: If OLS is biased for \(\beta\), then the residuals \(e_t\) are also biased estimates of \(u_t\).
∴ We can’t apply our nice trick of just using \(e_t\) to learn about \(u_t\).
Solution: The Breusch-Godfrey test includes the other explanatory variables,
\[ \begin{align} e_t = \textcolor{#6A5ACD}{\underbrace{\gamma_0 + \gamma_1 x_{1t} + \gamma_2 x_{2t} + \cdots}_{\text{Explanatory variables (RHS)}}} + \textcolor{#e64173}{\underbrace{\rho_1 e_{t-1} + \rho_2 e_{t-2} + \cdots}_{\text{Lagged residuals}}} + \varepsilon_t \end{align} \]
Specifically, to test for AR(2) disturbances in the ADL(1, 0) model \[ \begin{align} \text{Births}_t = \beta_0 + \beta_1 \text{Income}_t + \beta_2 \text{Births}_{t-1} + u_t \tag{B} \end{align} \]
we estimate \[ e_t = \gamma_0 + \gamma_1 \text{Income}_t + \gamma_2 \text{Births}_{t-1} + \rho_1 e_{t-1} + \rho_2 e_{t-2} + v_t \]
For an example, let’s consider the relationship between monthly presidential approval ratings and oil prices during President George W. Bush’s presidency.
We will estimate the ADL(1, 0) model
\[ \begin{align} \text{Approval}_t = \beta_0 + \beta_1 \text{Approval}_{t-1} + \beta_2 \text{Price}_t + u_t \end{align} \]
Note: We’re ignoring any other violations of exogeneity for the moment.
Monthly presidential approval ratings, 2001–2006
Approval rating vs. its one-month lag, 2001–2006
Approval rating vs. its two-month lag, 2001–2006
Oil prices, 2001–2006
Approval rating vs. oil prices, 2001–2006
Approval rating vs. oil prices, 2001–2006 with quadratic fit
Step 1: Estimate our ADL(1, 0) model with OLS.
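A sketch of this step (approval_df is a hypothetical data frame with columns approve and price_oil):

```r
library(dplyr)  # lag()
library(broom)  # tidy()

# Step 1: estimate the ADL(1, 0) model with OLS
adl_reg <- lm(approve ~ lag(approve) + price_oil, data = approval_df)
tidy(adl_reg)
```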
# A tibble: 3 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 16.2 7.86 2.06 4.40e- 2
2 lag(approve) 0.841 0.0752 11.2 2.17e-16
3 price_oil -0.0410 0.0215 -1.90 6.15e- 2
Step 2: Record residuals from the OLS regression.
Note: We add an NA because we use a lag—the first element is missing.
E.g.,
{1, 2, 3, 4, 5, 6, 7, 8, 9} = x
{?, 1, 2, 3, 4, 5, 6, 7, 8} = lag(x)
{?, ?, 1, 2, 3, 4, 5, 6, 7} = lag(x, 2)
{?, ?, ?, 1, 2, 3, 4, 5, 6} = lag(x, 3)
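In code, step 2 might look like this (a sketch; approval_df and adl_reg carry over from step 1):

```r
library(dplyr)

# Step 2: record residuals, padding with one NA for the observation
# lost to lag(approve) in the regression
approval_df <- approval_df %>% mutate(e = c(NA, residuals(adl_reg)))
```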
Step 3: Regress residuals on an intercept, the explanatory variables, and lagged residuals.
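A sketch of step 3 (the name bg_reg matches the Wald test below):

```r
library(dplyr)

# Step 3: residuals on an intercept, the explanatory variables,
# and two lags of the residuals
bg_reg <- lm(e ~ lag(approve) + price_oil + lag(e) + lag(e, 2),
             data = approval_df)
summary(bg_reg)
```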
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.92474 9.30455 0.852 0.3979
lag(approve) -0.08503 0.09192 -0.925 0.3589
price_oil -0.01690 0.02407 -0.702 0.4854
lag(e) 0.25236 0.14648 1.723 0.0903 .
lag(e, 2) 0.07865 0.14471 0.544 0.5889
Step 4: \(F\) (or \(\text{LM}\)) test for \(\rho_1=\rho_2=0\).
Recall: We can test joint significance using an \(F\) test that compares the restricted (here: \(\rho_1=\rho_2=0\)) and unrestricted models. \[ \begin{align} F_{q,\,n-p} = \dfrac{\left(\text{SSE}_r - \text{SSE}_u\right)\big/q}{\text{SSE}_u\big/\left( n-p \right)} \end{align} \] where \(q\) is the number of restrictions and \(p\) is the number of parameters in our unrestricted model (include the intercept).
We can use the waldtest() function from the lmtest package for this test.
Here, we’re telling waldtest() to test
- bg_reg (our unrestricted model)
- the model that drops lag(e) and lag(e, 2) (our restricted model)
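A sketch of the call itself (bg_reg from step 3):

```r
library(lmtest)

# F test: compare bg_reg against the model that drops the lagged residuals
waldtest(bg_reg, c("lag(e)", "lag(e, 2)"))
```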
Wald test
Model 1: e ~ lag(approve) + price_oil + lag(e) + lag(e, 2)
Model 2: e ~ lag(approve) + price_oil
Res.Df Df F Pr(>F)
1 57
2 59 -2 1.6153 0.2078
Step 5: Conclusion of hypothesis test
With a p-value of \(\sim\) 0.208, we fail to reject the null hypothesis.
However, we tested for a specific type of autocorrelation: AR(2).
We might get different answers with different tests.
The p-value for AR(1) is 0.0896—suggestive of first-order autocorrelation.
Living with autocorrelation
Suppose we believe autocorrelation is present. What do we do?
I’ll give you three options:
1. Misspecification
2. Serial-correlation robust standard errors (a.k.a. Newey-West)
3. FGLS
These solutions should look familiar.
Misspecification with autocorrelation is very similar to our discussion with heteroskedasticity.
By incorrectly specifying your model, you can create autocorrelation.
Omitting variables that are correlated through time will cause your disturbances to be correlated through time.
Example: Suppose births depend upon income and previous births \[ \begin{align} \text{Births}_t = \beta_0 + \beta_1 \text{Births}_{t-1} + \beta_2 \text{Income}_t + u_t \end{align} \]
but we write down the model as only depending upon previous births, i.e., \[ \begin{align} \text{Births}_t = \beta_0 + \beta_1 \text{Births}_{t-1} + v_t \end{align} \]
Then our disturbance \(v_t\) is \[ \begin{align} v_t = \beta_2 \text{Income}_t + u_t \end{align} \]
which is likely autocorrelated, since income is correlated in time.
Note: This autocorrelation has nothing to do with \(u_t\).
“Proof”
\[ \begin{align} v_{t} &= \beta_2 \text{Income}_{t} + u_{t} \\ v_{t-1} &= \beta_2 \text{Income}_{t-1} + u_{t-1} \end{align} \]
\(\mathop{\text{Cov}} \left( v_{t},\,v_{t-1} \right)\)
\(= \mathop{\text{Cov}} \left( \beta_2 \text{Income}_{t} + u_{t},\, \beta_2 \text{Income}_{t-1} + u_{t-1} \right)\)
\(= \textcolor{#e64173}{\mathop{\text{Cov}} \left( \beta_2 \text{Income}_{t},\, \beta_2 \text{Income}_{t-1} \right)} + \mathop{\text{Cov}} \left( \beta_2 \text{Income}_{t},\, u_{t-1} \right) + \mathop{\text{Cov}} \left(u_{t},\, \beta_2 \text{Income}_{t-1}\right) + \mathop{\text{Cov}} \left( u_{t},\, u_{t-1} \right)\)
\(\neq 0\) (in general) even if \(u_t\) is exogenous and without autocorrelation: the highlighted term is nonzero whenever income correlates with its own lag.
We can just focus on estimating consistent standard errors (fixing inference) in the presence of autocorrelation.
As we did with the heteroskedasticity-robust standard errors.
These standard errors are called serial-correlation robust standard errors (or Newey-West standard errors).
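The table below reports both IID and Newey-West standard errors. As a sketch, one way to compute Newey-West standard errors in R uses the sandwich and lmtest packages (approval_df is assumed from before; the lag length of 2 matches the table):

```r
library(dplyr)
library(lmtest)
library(sandwich)

# The ADL(1, 0) model, estimated with OLS
est <- lm(approve ~ lag(approve) + price_oil, data = approval_df)

# Inference with Newey-West (HAC) standard errors, lag length = 2
coeftest(est, vcov. = NeweyWest(est, lag = 2, prewhite = FALSE))
```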
est_iid est_nw
Dependent Var.: approve approve
Constant 16.16* (7.860) 16.16** (5.863)
l(approve) 0.8405*** (0.0752) 0.8405*** (0.0497)
price_oil -0.0410. (0.0215) -0.0410* (0.0176)
_______________ __________________ __________________
S.E. type IID Newey-West (L=2)
Observations 64 64
R2 0.90722 0.90722
Adj. R2 0.90417 0.90417
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-values on the price of oil coefficient: 0.062 vs. 0.023.
If we do not have a lagged outcome variable, then feasible generalized least squares (FGLS) can give us consistent, efficient estimates and valid standard errors.
Let’s start with a simple static model that includes an AR(1) disturbance \(u_t\).
\[ \begin{align} \text{Births}_t &= \beta_0 + \beta_1 \text{Income}_t + u_t \tag{1} \\ u_t &= \rho u_{t-1} + \varepsilon_t \tag{2} \end{align} \]
Now our old trick: Write out \((1)\) for period \(t-1\) (and then multiply by \(\rho\)) \[ \begin{align} \text{Births}_{t-1} &= \beta_0 + \beta_1 \text{Income}_{t-1} + u_{t-1} \tag{3} \\ \rho \text{Births}_{t-1} &= \rho \beta_0 + \rho \beta_1 \text{Income}_{t-1} + \rho u_{t-1} \tag{4} \end{align} \]
And now subtract \((4)\) from \((1)\)…
\[ \begin{align} \text{Births}_t - \rho \text{Births}_{t-1} =& \beta_0 \left( 1 - \rho \right) + \\ & \beta_1 \text{Income}_t - \rho \beta_1 \text{Income}_{t-1} + \\ & u_t - \rho u_{t-1} \end{align} \]
which gives us a very specific dynamic model \[ \begin{align} \text{Births}_t =& \beta_0 \left( 1 - \rho \right) + \rho \text{Births}_{t-1} +\\ & \beta_1 \text{Income}_t - \rho \beta_1 \text{Income}_{t-1} + \textcolor{#e64173}{\underbrace{u_t - \rho u_{t-1}}_{=\varepsilon_t}} \end{align} \]
This transformed model is free of autocorrelation, since \(\varepsilon_t\) is i.i.d.
Q: How do we actually estimate this model? (We don’t know \(\rho\).)
A: FGLS (of course)…
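A minimal sketch of the feasible version, Cochrane-Orcutt style (ts_df with columns births and income is hypothetical):

```r
# Step 1: estimate the static model with OLS; save residuals
ols <- lm(births ~ income, data = ts_df)
e   <- residuals(ols)

# Step 2: estimate rho by regressing residuals on their lag (no intercept)
rho_hat <- coef(lm(e[-1] ~ -1 + e[-length(e)]))[[1]]

# Step 3: quasi-difference the data and re-estimate with OLS
n <- nrow(ts_df)
births_star <- ts_df$births[-1] - rho_hat * ts_df$births[-n]
income_star <- ts_df$income[-1] - rho_hat * ts_df$income[-n]
fgls <- lm(births_star ~ income_star)
coef(fgls)  # the intercept now estimates beta_0 * (1 - rho)
```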
Autocorrelation means the disturbances are correlated through time, \[\mathop{\text{Cov}}\left(u_t, u_s\right) \neq 0 \quad \text{for } t \neq s.\]
For static time-series models, OLS coefficient estimates are unbiased,
but OLS standard errors are biased and OLS is inefficient.
For models with lagged outcome variables and autocorrelated disturbances, OLS coefficient estimates are biased and inconsistent.
Once again, violating assumptions can lead to bad decisions.
Tests for autocorrelation involve regressing residuals on their lags.
In static models: regress residuals on their lags.
In dynamic models with lagged outcomes: use Breusch-Godfrey, i.e., regress residuals on the original regressors plus lagged residuals.
Potential solutions/responses (akin to heteroskedasticity):
Consider whether the model is misspecified.
Use Newey-West standard errors.
For static models with AR disturbances, FGLS can transform the model to remove autocorrelation.