Linear Regression Model and the OLS Estimator

Effect of Heteroskedasticity

Topic of the module

Understand the effect of heteroskedasticity on the sampling distribution of the OLS estimator.


Data generating process (DGP)

Consider \(n\) observations generated from the simple regression model,

$$ \begin{align} Y_i = \beta_{0} + \beta_{1} X_{i} + u_{i}, \end{align} $$

where \(\beta_{0}=1\) is the intercept and \(\beta_{1}\) is the slope parameter.

Furthermore, suppose \(X_{i}\) and \(u_{i}\) are independently normally distributed, i.e.,

$$ \begin{align} X_{i} \sim N\left(0, \sigma_{X}^{2}\right), \;\;\;\;\; u_{i} \sim N\left(0, \sigma_{ui}^{2}\right), \end{align} $$

where \(\sigma_{X}^{2}\) and \(\sigma_{ui}^{2}\) are the variances of \(X_{i}\) and \(u_{i}\), respectively.
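For concreteness, here is a minimal Python sketch of the homoskedastic version of this DGP; the values of \(\beta_{1}\), \(\sigma_{X}\), \(\sigma_{u}\), and the seed are illustrative choices, not values fixed by the module.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

n = 100                      # sample size
beta0, beta1 = 1.0, 0.5      # beta0 = 1 as in the text; beta1 is illustrative
sigma_X, sigma_u = 1.0, 1.0  # illustrative standard deviations

X = rng.normal(loc=0.0, scale=sigma_X, size=n)  # X_i ~ N(0, sigma_X^2)
u = rng.normal(loc=0.0, scale=sigma_u, size=n)  # u_i ~ N(0, sigma_u^2)
Y = beta0 + beta1 * X + u                       # Y_i = beta0 + beta1 * X_i + u_i
```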


Conditions for and Effects of Heteroskedasticity

Heteroskedasticity occurs when \(\sigma_{ui}^{2}\) is not constant but a function of \(X_{i}\), e.g.,

$$ \begin{align} \sigma_{ui}^{2} = \text{exp}\left(\gamma_{0} + \gamma_{1} X_{i}\right), \end{align} $$

which is known as multiplicative heteroskedasticity, where \(\gamma_{0}\) and \(\gamma_{1}\) are parameters that determine the shape of the heteroskedasticity.

Thus, for positive/negative values of \(\gamma_{1}\), the variance of \(u_{i}\) is positively/negatively related to the value of \(X_{i}\).
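A minimal sketch of how errors with this multiplicative form could be drawn, assuming illustrative values for \(\gamma_{0}\) and \(\gamma_{1}\):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

n = 100
gamma0, gamma1 = 0.0, 1.0                   # illustrative shape parameters
X = rng.normal(loc=0.0, scale=1.0, size=n)

sigma2_u = np.exp(gamma0 + gamma1 * X)      # sigma_ui^2 = exp(gamma0 + gamma1 * X_i)
u = rng.normal(loc=0.0, scale=np.sqrt(sigma2_u))  # error variance depends on X_i
```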

Heteroskedasticity affects the standard error of the OLS estimator \(\widehat{\beta}_{1}\) and thus the construction of the standardized OLS estimator \(z_{\widehat{\beta}_{1}}\).

For the construction of the standardized OLS estimator \(z_{\widehat{\beta}_{1}}\), the variance of \(\widehat{\beta}_{1}\), i.e., \(\sigma_{\widehat{\beta}_{1}}^{2}\), has to be estimated.

The variance of \(\widehat{\beta}_1\), i.e., \(\sigma_{\widehat{\beta}_{1}}^{2}\), can be robustly estimated by,

$$ \begin{align} \widehat{\sigma}_{\widehat{\beta}_{1}}^{2} = \frac{1}{n} \times \frac{\frac{1}{n-2}\sum_{i=1}^{n}\left(X_{i} - \overline{X}\right)^{2}\widehat{u}_{i}^{2}}{\left[\frac{1}{n}\sum_{i=1}^{n}\left(X_{i} - \overline{X}\right)^{2}\right]^{2}}, \end{align} $$

where \(\widehat{u}_{i}\) are the residuals of the estimated regression line.

Note, the estimator for \(\sigma_{\widehat{\beta}_{1}}^{2}\) above is robust w.r.t. heteroskedasticity, i.e., it does not rely on the assumption of homoskedasticity.
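A direct transcription of the robust variance formula above into Python could look as follows; the function name is hypothetical.

```python
import numpy as np

def robust_var_beta1(X: np.ndarray, u_hat: np.ndarray) -> float:
    """Heteroskedasticity-robust estimate of the variance of beta1_hat."""
    n = X.shape[0]
    dev = X - X.mean()
    numerator = np.sum(dev**2 * u_hat**2) / (n - 2)   # (1/(n-2)) * sum dev^2 * u_hat^2
    denominator = (np.sum(dev**2) / n) ** 2           # [(1/n) * sum dev^2]^2
    return numerator / (n * denominator)              # leading factor 1/n
```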

However, some statistical software reports estimates of \(\sigma_{\widehat{\beta}_{1}}^{2}\) based on the assumption of homoskedasticity.

The so-called homoskedasticity-only estimator of \(\sigma_{\widehat{\beta}_{1}}^{2}\) is given by,

$$ \begin{align} \widetilde{\sigma}_{\widehat{\beta}_{1}}^{2} = \frac{\frac{1}{n-2}\sum_{i=1}^{n}\widehat{u}_{i}^{2}}{\sum_{i=1}^{n}\left(X_{i} - \overline{X}\right)^{2}}. \end{align} $$
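Analogously, a sketch of the homoskedasticity-only formula (again with a hypothetical function name):

```python
import numpy as np

def ols_var_beta1(X: np.ndarray, u_hat: np.ndarray) -> float:
    """Homoskedasticity-only estimate of the variance of beta1_hat."""
    n = X.shape[0]
    s2 = np.sum(u_hat**2) / (n - 2)          # (1/(n-2)) * sum of squared residuals
    return s2 / np.sum((X - X.mean())**2)    # divide by sum of squared deviations
```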

Illustration

Change the parameters and see the effect on the properties of the OLS estimator \(\widehat{\beta}_{1}\) as an estimator for \(\beta_{1}\).

Parameters

  - Sample size \(n\)
  - Variance of \(X_{i}\)
  - Variance of \(u_{i}\)
  - Heteroskedasticity


Scatter plot (realizations)

The red fitted regression line is based on the regression of,

$$ \begin{align} Y_{i} \;\;\;\;\; \text{on} \;\;\;\;\; X_{i}. \end{align} $$

The scatter plots and the fitted regression lines represent the result for only one simulation. The shaded areas illustrate the range of all fitted regression lines across all simulation outcomes.

Scatter plot (fitted residuals)

The fitted residuals, which estimate the unobserved errors \(u_{i}\), are constructed as,

$$ \begin{align} \widehat{u}_{i} = Y_{i} - \widehat{\beta}_{0} - \widehat{\beta}_{1} X_{i}, \end{align} $$

for only one simulation, where \(\widehat{\beta}_{0}\) and \(\widehat{\beta}_{1}\) are the respective OLS estimates.
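As a sketch, the OLS estimates and the fitted residuals can be computed with the standard closed-form expressions for the simple regression; the function name is hypothetical.

```python
import numpy as np

def ols_fit(X: np.ndarray, Y: np.ndarray):
    """Closed-form OLS estimates for Y_i = beta0 + beta1 * X_i + u_i."""
    dev_X = X - X.mean()
    beta1_hat = np.sum(dev_X * (Y - Y.mean())) / np.sum(dev_X**2)
    beta0_hat = Y.mean() - beta1_hat * X.mean()
    u_hat = Y - beta0_hat - beta1_hat * X    # fitted residuals
    return beta0_hat, beta1_hat, u_hat
```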

Histogram of the OLS estimates \(\widehat{\beta}_{1}\)

Consistency:

As the sample size \(n\) grows, the OLS estimator \(\widehat{\beta}_{1}\) gets closer to \(\beta_{1}\), i.e.,

$$ \begin{align} \widehat{\beta}_{1} \overset{p}{\to} \beta_{1}. \end{align} $$

Histogram of the standardized OLS estimates \(z_{\widehat{\beta}_{1}}\)

Asymptotic Normality:

In the case of heteroskedasticity, the standardized OLS estimator based on robust standard errors \(\widehat{\sigma}_{\widehat{\beta}_{1}}\),

$$ \begin{align} z_{\widehat{\beta}_{1}} &= \frac{\widehat{\beta}_{1} - \beta_{1}}{\widehat{\sigma}_{\widehat{\beta}_{1}}}, \end{align} $$

gets closer to the standard normal distribution \(N\left(0, 1\right)\) as \(n\) grows (see the green histogram).

In contrast, the standardized OLS estimator based on ordinary (homoskedasticity-only) standard errors \(\widetilde{\sigma}_{\widehat{\beta}_{1}}\),

$$ \begin{align} z_{\widehat{\beta}_{1}} &= \frac{\widehat{\beta}_{1} - \beta_{1}}{\widetilde{\sigma}_{\widehat{\beta}_{1}}}, \end{align} $$

does not get closer to the standard normal distribution \(N\left(0, 1\right)\) (see the red histogram).
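As a sketch, both standardized estimates for a single simulated sample can be computed as follows, reusing the hypothetical helpers `ols_fit`, `robust_var_beta1`, and `ols_var_beta1` from the sketches above:

```python
import numpy as np

def standardized_estimates(X: np.ndarray, Y: np.ndarray, beta1_true: float):
    """z-statistics for beta1_hat based on robust and ordinary SEs."""
    _, beta1_hat, u_hat = ols_fit(X, Y)
    # based on robust SEs: approximately N(0, 1) for large n
    z_robust = (beta1_hat - beta1_true) / np.sqrt(robust_var_beta1(X, u_hat))
    # based on ordinary SEs: miscalibrated under heteroskedasticity
    z_ordinary = (beta1_hat - beta1_true) / np.sqrt(ols_var_beta1(X, u_hat))
    return z_robust, z_ordinary
```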

More Details

  1. A realization of the DGP specified above is simulated:
    1. I.i.d. sequences of realizations \(X_{1}, X_{2}, ..., X_{n}\) and \(u_{1}, u_{2}, ..., u_{n}\) are drawn from the distributions of \(X_{i}\) and \(u_{i}\) above.
    2. Based on the sequences \(X_{1}, X_{2}, ..., X_{n}\) and \(u_{1}, u_{2}, ..., u_{n}\), a sequence of observations \(Y_{1}, Y_{2}, ..., Y_{n}\) is constructed using the regression model above.
  2. Based on the sequences of observations \(Y_{1}, Y_{2}, ..., Y_{n}\) and \(X_{1}, X_{2}, ..., X_{n}\), the OLS estimates and standardized OLS estimates for the intercept \(\beta_{0}\) and the slope parameter \(\beta_{1}\) are calculated.
  3. The values of the OLS estimate \(\widehat{\beta}_{1}\) and the standardized OLS estimate \(z_{\widehat{\beta}_{1}}\) are stored.
  4. Steps 1 to 3 are repeated \(10,\!000\) times, resulting in \(10,\!000\) OLS and standardized OLS estimates (a code sketch of the full procedure follows this list).
  5. The distributions of the OLS and standardized OLS estimates are illustrated using histograms.
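The following is a compact Python sketch of steps 1 to 5 under illustrative parameter values, reusing the hypothetical helper functions `ols_fit`, `robust_var_beta1`, and `ols_var_beta1` from the sketches above (the histograms themselves are omitted):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

n, n_sim = 100, 10_000
beta0, beta1 = 1.0, 0.5        # beta0 = 1 as in the text; beta1 is illustrative
sigma_X = 1.0
gamma0, gamma1 = 0.0, 1.0      # multiplicative heteroskedasticity (illustrative)

beta1_hats = np.empty(n_sim)
z_robust = np.empty(n_sim)
z_ordinary = np.empty(n_sim)

for s in range(n_sim):
    # Step 1: simulate one realization of the DGP
    X = rng.normal(0.0, sigma_X, size=n)
    u = rng.normal(0.0, np.sqrt(np.exp(gamma0 + gamma1 * X)))
    Y = beta0 + beta1 * X + u
    # Steps 2 and 3: estimate, standardize, and store
    _, b1, u_hat = ols_fit(X, Y)
    beta1_hats[s] = b1
    z_robust[s] = (b1 - beta1) / np.sqrt(robust_var_beta1(X, u_hat))
    z_ordinary[s] = (b1 - beta1) / np.sqrt(ols_var_beta1(X, u_hat))

# Steps 4 and 5: beta1_hats, z_robust, and z_ordinary can now be
# summarized with histograms.
```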
The figure shows:
The scatter plot of the values of \(x\) and \(y\) and the corresponding fitted regression line estimated by OLS for one particular realization of the underlying DGP.
The red shaded area illustrates the range of all fitted regression lines estimated by OLS across all realizations of the underlying DGP.
Increasing the sample size decreases the range of the fitted regression lines across different realizations, as illustrated by the red shaded area.
Increasing the variance of \(X_i\) also decreases this range.
Increasing the variance of \(u_i\) increases this range.
Introducing heteroskedasticity increases this range as well.
The figure shows:
The scatter plot of the values of \(x\) and the fitted residuals of a simple linear regression model estimated by OLS.
Note, since the estimated parameters include an intercept and a slope coefficient, the fitted residuals have a mean equal to zero and are uncorrelated with \(x\) by construction of the OLS estimator.
Increasing the sample size increases the number of fitted residuals estimated by OLS.
Increasing the variance of \(X_i\) increases the range of the fitted residuals on the x-axis.
Increasing the variance of \(u_i\) increases the vertical spread of the fitted residuals.
Introducing heteroskedasticity leads to a vertical spread of the fitted residuals that varies with the value of \(x\).
The figure shows:
The histogram of the estimated slope coefficient across all realizations of the underlying DGP.
The red vertical dashed line represents the estimated slope coefficient for one particular realization of the underlying DGP.
The green vertical dashed line represents the slope coefficient of the underlying DGP.
An increase of the variance of \(X_i\) lets the OLS estimator \(\widehat{\beta}_{1}\) get closer to \(\beta_{1}\).
An increase of the variance of \(u_i\) lets the OLS estimator \(\widehat{\beta}_{1}\) get further away from \(\beta_{1}\); for a decrease of the variance, the contrary holds.
Introducing heteroskedasticity lets the OLS estimator \(\widehat{\beta}_{1}\) spread further away from \(\beta_{1}\).
The figure shows:
The histogram of the standardized estimated slope coefficient across all realizations of the DGP.
For the standardization, we subtract the slope coefficient of the DGP and divide by an estimate of the standard error of the estimated slope coefficient.
The red vertical dashed line represents the standardized estimated slope coefficient for one particular realization of the DGP.
The green dashed curve represents the pdf of the standard normal distribution.
Note, the two estimators for the variance of the estimated slope coefficient are given above; both are functions of the sample size, the fitted residuals, and the variation of \(X_i\).
By increasing the sample size, the sampling distribution of the standardized estimated slope coefficient gets closer to the standard normal distribution, whose pdf is illustrated by the green dashed curve.
This is a consequence of the central limit theorem.
Note, the sampling distribution of the standardized estimate is stable across sample sizes. Thus, for large \(n\), the standardized estimate can be used to conduct hypothesis tests.
By increasing the variance of \(X_i\), the standardized estimates based on the ordinary as well as the robust SE show more realizations at the extreme ends, i.e., the histogram has fatter tails.
By increasing the variance of \(u_i\), the standardized estimates based on the robust SE show more extreme realizations than those based on the ordinary SE and therefore fit the pdf of \(N(0,1)\) worse.
When heteroskedasticity is introduced, the robust SE outperforms the ordinary SE: the corresponding standardized estimates fit the pdf of \(N(0,1)\) much better.

This module is part of the DeLLFi project of the University of Hohenheim and is funded by the Foundation for Innovation in University Teaching.

Creative Commons License