Properties of the Sample Average as Estimator for the Mean

Continuous Uniform Distribution

Topic of the module

Understand the effect of increasing the sample size \(n\) on the sampling distribution of the sample average as an estimator for the mean.


Data generating process (DGP)

Consider \(n\) observations drawn from the continuous uniform distribution,

$$ \begin{align} Y_i &\sim U_{\left[a,b\right]}, \end{align} $$

where \(a\) and \(b\) are parameters representing the lower and upper bound of the continuous uniform distribution.
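
As a minimal sketch of this DGP, the following Python snippet draws \(n\) independent observations from \(U_{[a,b]}\) using only the standard library. The function name `draw_sample`, the seed, and the parameter values are illustrative assumptions and not part of the module.

```python
import random

def draw_sample(n, a, b, rng):
    """Draw n independent observations from the continuous uniform U[a, b]."""
    return [rng.uniform(a, b) for _ in range(n)]

rng = random.Random(42)  # fixed seed, chosen only for reproducibility
sample = draw_sample(n=5, a=2.0, b=6.0, rng=rng)
print(sample)  # five values, each between the lower and the upper bound
```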


Estimator and Parameter of Interest

We are interested in the sampling properties of the sample average \(\overline{Y}\) given by,

$$ \begin{align} \overline{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_{i}, \end{align} $$

as an estimator for the mean \(\mu=\text{E}\left(Y_{i}\right)\) of the continuous uniform distribution given by,

$$ \begin{align} \mu=\frac{1}{2}\left(a+b\right), \end{align} $$

where \(a\) and \(b\) are the lower and upper bound of the continuous uniform distribution, respectively.
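
To make the two formulas concrete, the sketch below computes the sample average for one simulated sample and compares it with the population mean \(\mu\). The helper names `sample_average` and `uniform_mean` and the chosen values of \(a\), \(b\), and \(n\) are assumptions made for illustration only.

```python
import random

def sample_average(ys):
    """Sample average: (1/n) * sum of the observations."""
    return sum(ys) / len(ys)

def uniform_mean(a, b):
    """Mean of the continuous uniform distribution on [a, b]."""
    return 0.5 * (a + b)

rng = random.Random(0)  # fixed seed for reproducibility
a, b, n = 2.0, 6.0, 1000
ys = [rng.uniform(a, b) for _ in range(n)]

y_bar = sample_average(ys)
mu = uniform_mean(a, b)
print(f"sample average = {y_bar:.3f}, mu = {mu:.3f}")
```

The estimate varies from sample to sample, while \(\mu\) is a fixed number determined by \(a\) and \(b\).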

Illustration

Change the parameters and see the effect on the properties of the sample average \(\overline{Y}\) as an estimator for \(\mu\).

Parameters


Sample size \(n\)


Lower bound \(a\)


Upper bound \(b\)


Histogram (realizations)

The histogram shows the distribution of the realizations of the DGP.

Histogram of the sample average \(\overline{Y}\)
Consistency:

As the sample size \(n\) grows the sample average \(\overline{Y}\) gets closer to \(\mu\), i.e.,

$$ \begin{align} \overline{Y} \overset{p}{\to} \mu. \end{align} $$
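
Convergence in probability means that, for any fixed tolerance \(\varepsilon > 0\), the probability of \(|\overline{Y} - \mu| < \varepsilon\) approaches one as \(n\) grows. The sketch below estimates this probability by simulation for a few sample sizes. The function name `share_within`, the tolerance of 0.05, and the number of repetitions are illustrative assumptions.

```python
import random

def share_within(n, a, b, eps, reps, rng):
    """Share of simulated sample averages with |Y_bar - mu| < eps."""
    mu = 0.5 * (a + b)
    hits = 0
    for _ in range(reps):
        y_bar = sum(rng.uniform(a, b) for _ in range(n)) / n
        if abs(y_bar - mu) < eps:
            hits += 1
    return hits / reps

rng = random.Random(1)  # fixed seed for reproducibility
shares = {
    n: share_within(n, a=0.0, b=1.0, eps=0.05, reps=2000, rng=rng)
    for n in (5, 50, 500)
}
for n, share in shares.items():
    print(f"n = {n:3d}: share within eps = {share:.3f}")
```

The printed shares should increase with \(n\), which is the simulated counterpart of the concentration visible in the histogram.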
Histogram of the standardized sample average \(z_{\overline{Y}}\)
Asymptotic Normality:

As the sample size \(n\) grows the distribution of the standardized sample average,

$$ \begin{align} z_{\overline{Y}} &= \frac{\overline{Y} - \mu}{\sigma_{\overline{Y}}}, \\ z_{\overline{Y}} &= \frac{\overline{Y} - \mu}{\frac{\sigma}{\sqrt{n}}}, \end{align} $$

gets closer to the standard normal distribution \(N\left(0, 1\right)\).
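
One way to check this numerically is to compare the empirical distribution function of simulated standardized sample averages with the standard normal cdf at a few points. The sketch below does so using the mean and variance of the uniform distribution stated under More Details. The function name `standardized_average`, the evaluation points, and the simulation settings are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist

def standardized_average(ys, a, b):
    """Standardize the sample average with the known mu and sigma of U[a, b]."""
    n = len(ys)
    mu = 0.5 * (a + b)
    sigma = (b - a) / math.sqrt(12)
    return (sum(ys) / n - mu) / (sigma / math.sqrt(n))

rng = random.Random(2)  # fixed seed for reproducibility
a, b, n, reps = 0.0, 1.0, 30, 5000
zs = [
    standardized_average([rng.uniform(a, b) for _ in range(n)], a, b)
    for _ in range(reps)
]

# Empirical cdf of the simulated z values versus the N(0, 1) cdf.
for q in (-1.96, 0.0, 1.96):
    empirical = sum(z <= q for z in zs) / reps
    print(f"q = {q:5.2f}: empirical = {empirical:.3f}, normal = {NormalDist().cdf(q):.3f}")
```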

More Details

For the construction of the standardized sample average \(z_{\overline{Y}}\), the mean \(\mu\) as well as the variance \(\sigma^{2}\) are used.

For the continuous uniform distribution the mean is given by,

$$ \begin{align} \mu=\frac{1}{2}\left(a+b\right), \end{align} $$

and the variance is given by,

$$ \begin{align} \sigma^{2}=\frac{1}{12}\left(b-a\right)^{2}, \end{align} $$

where \(a\) and \(b\) are the lower and upper bound of the continuous uniform distribution, respectively.
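
As a short check of the variance formula, assuming \(b > a\) and the density \(1/(b-a)\) on \(\left[a,b\right]\), the second moment and the variance can be worked out as,

$$ \begin{align} \text{E}\left(Y_{i}^{2}\right) &= \int_{a}^{b} \frac{y^{2}}{b-a}\,dy = \frac{b^{3}-a^{3}}{3\left(b-a\right)} = \frac{a^{2}+ab+b^{2}}{3}, \\ \sigma^{2} &= \text{E}\left(Y_{i}^{2}\right) - \mu^{2} = \frac{a^{2}+ab+b^{2}}{3} - \frac{\left(a+b\right)^{2}}{4} = \frac{\left(b-a\right)^{2}}{12}. \end{align} $$

The standard deviation of the sample average then follows as,

$$ \begin{align} \sigma_{\overline{Y}} = \frac{\sigma}{\sqrt{n}} = \frac{b-a}{\sqrt{12\,n}}. \end{align} $$

For example, with the illustrative values \(a=0\), \(b=1\), and \(n=12\), this gives \(\mu = 1/2\) and \(\sigma_{\overline{Y}} = 1/12\), so that \(z_{\overline{Y}} = 12\left(\overline{Y} - 1/2\right)\).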

  1. A realization of the DGP specified above is simulated, i.e., an i.i.d. sequence of realizations \(Y_{1}, Y_{2}, ..., Y_{n}\) is drawn from the distribution of \(Y_{i}\) above.
  2. Based on the sequence of observations \(Y_{1}, Y_{2}, ..., Y_{n}\), the sample average and the standardized sample average are calculated.
  3. The values of the sample average \(\overline{Y}\) and the standardized sample average \(z_{\overline{Y}}\) are stored.
  4. Steps 1 to 3 are repeated \(10,\!000\) times, resulting in \(10,\!000\) sample averages and standardized sample averages.
  5. The distributions of the sample averages and the standardized sample averages are illustrated using histograms.
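
The five steps above can be sketched in Python as follows. This is not the module's own implementation: the function names `simulate` and `text_histogram`, the seed, the parameter values, and the plain-text histogram (used here in place of a plotting library) are all assumptions made for illustration.

```python
import math
import random

def simulate(n, a, b, reps=10_000, seed=0):
    """Repeat steps 1 to 3 `reps` times and return both lists of results."""
    rng = random.Random(seed)
    mu = 0.5 * (a + b)
    sigma = math.sqrt((b - a) ** 2 / 12)
    averages, standardized = [], []
    for _ in range(reps):                              # step 4: repeat
        ys = [rng.uniform(a, b) for _ in range(n)]     # step 1: simulate the DGP
        y_bar = sum(ys) / n                            # step 2: sample average
        z = (y_bar - mu) / (sigma / math.sqrt(n))      # step 2: standardized average
        averages.append(y_bar)                         # step 3: store
        standardized.append(z)
    return averages, standardized

def text_histogram(values, bins=10, width=40):
    """Step 5: a crude equal-width histogram rendered as lines of text."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values coincide
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    return [
        f"{lo + i * step:8.3f} | " + "#" * round(width * c / peak)
        for i, c in enumerate(counts)
    ]

averages, standardized = simulate(n=20, a=0.0, b=1.0)
print("\n".join(text_histogram(standardized)))
```

Calling `text_histogram(averages)` instead shows the sampling distribution of \(\overline{Y}\) itself, centered near \(\mu\).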
The figure shows:
The histogram of the observed values in terms of their relative frequency based on an automatic bin selection procedure for one particular realization of the DGP.
The observed values are approximately evenly distributed between the lower bound \(a\) and the upper bound \(b\).
Changing the sample size only changes the bins of the histogram, whereas the relative frequencies of the observed values stay approximately the same across bins.
Increasing the lower bound \(a\) narrows the range of the observed values for one particular realization of the DGP from the left. Decreasing the lower bound \(a\) widens the range of the observed values to the left.
Increasing/decreasing the upper bound \(b\) widens/narrows the range of the observed values for one particular realization of the DGP on the right.
The figure shows:
The histogram of all estimated sample averages for all realizations of the DGP.
The red vertical dashed line represents the sample average for one particular realization of the DGP.
The green vertical dashed line represents the mean of the underlying DGP.
As the sample size increases, the sample averages concentrate more closely around the mean used to generate the data.
This is a result of the law of large numbers.
Increasing/decreasing the lower bound \(a\) increases/decreases the mean of the underlying DGP and shifts the histogram to the right/left.
Increasing/decreasing the upper bound \(b\) increases/decreases the mean of the underlying DGP and shifts the histogram to the right/left.
The figure shows:
The histogram of all standardized sample averages for all realizations of the DGP.
The red vertical dashed line represents the standardized sample average for one particular realization of the DGP.
The green dashed curve represents the pdf of the standard normal distribution.
As the sample size increases, the sampling distribution of the standardized sample average gets closer to the standard normal distribution, whose pdf is illustrated by the green dashed curve.
This is a result of the central limit theorem.
Changing the lower bound \(a\) does not affect the result of the central limit theorem.
Changing the upper bound \(b\) does not affect the result of the central limit theorem.
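
The reason the bounds do not matter is that a \(U_{[a,b]}\) observation can be written as \(Y_i = a + (b-a)U_i\) with \(U_i \sim U_{[0,1]}\), so \(a\) and \(b\) cancel when the sample average is standardized. The sketch below illustrates this by standardizing the same underlying unit draws under two different pairs of bounds. The function name and the chosen bounds are assumptions for illustration.

```python
import math
import random

def standardized_average_from_unit_draws(us, a, b):
    """Map U[0, 1] draws to U[a, b], then standardize the sample average."""
    n = len(us)
    ys = [a + (b - a) * u for u in us]
    mu = 0.5 * (a + b)
    sigma = (b - a) / math.sqrt(12)
    return (sum(ys) / n - mu) / (sigma / math.sqrt(n))

rng = random.Random(3)  # fixed seed for reproducibility
us = [rng.random() for _ in range(25)]

z_unit = standardized_average_from_unit_draws(us, a=0.0, b=1.0)
z_wide = standardized_average_from_unit_draws(us, a=-3.0, b=7.0)
print(z_unit, z_wide)  # equal up to floating-point rounding
```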

This module is part of the DeLLFi project of the University of Hohenheim and funded by the
Foundation for Innovation in University Teaching

Creative Commons License