Bernoulli Distribution - Sample Average

Properties of the Sample Average as Estimator for the Mean

Bernoulli Distribution

Topic of the module

Understand the effect of increasing the sample size $n$ on the sampling distribution of the sample average as an estimator for the mean.

Data generating process (DGP)

Suppose $n$ observations are drawn from the Bernoulli distribution,

$$ \begin{align} Y_i &\sim \text{Bernoulli}\left(p\right), \end{align} $$

where $p$ is a parameter representing the probability of success of the Bernoulli distribution.

Estimator and parameter of interest

We are interested in the sampling properties of the sample average $\overline{Y}$ given by,

$$ \begin{align} \overline{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_{i}, \end{align} $$

as an estimator for the mean $\mu=\text{E}\left(Y_{i}\right)$ of the Bernoulli distribution, which is given by,

$$ \begin{align} \mu=p, \end{align} $$

where $p$ is the probability of success of the Bernoulli distribution.
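
As a minimal sketch of the DGP and the estimator, the following Python snippet draws one sample and computes its sample average. The helper names `draw_bernoulli_sample` and `sample_average`, the seed, and the values $n=50$ and $p=0.3$ are illustrative assumptions and are not taken from the module.

```python
import random


def draw_bernoulli_sample(n, p, rng):
    """Return n independent Bernoulli(p) draws as a list of 0s and 1s."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def sample_average(sample):
    """Return the sample average (1/n) * sum of the observations."""
    return sum(sample) / len(sample)


rng = random.Random(42)  # fixed seed, chosen only for reproducibility
n, p = 50, 0.3           # illustrative values, not taken from the module
sample = draw_bernoulli_sample(n, p, rng)
y_bar = sample_average(sample)
print(f"n = {n}, p = {p}, sample average = {y_bar:.3f}")
```

Because each $Y_{i}$ is either 0 or 1, the sample average is simply the share of ones in the sample.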

Illustration

Change the parameters and see the effect on the properties of the sample average $\overline{Y}$ as an estimator for $\mu$.

Parameters

Sample size $n$

Probability of success $p$

Bar chart

The bar chart shows the number of ones and zeros of one realization of the DGP.

Histogram of the sample average $\overline{Y}$

Consistency:

As the sample size $n$ grows, the sample average $\overline{Y}$ converges in probability to $\mu$, i.e.,

$$ \begin{align} \overline{Y} \overset{p}{\to} \mu. \end{align} $$
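
Consistency can be checked numerically. The sketch below estimates, for several sample sizes, the share of simulated sample averages that fall within a tolerance $\varepsilon$ of $\mu=p$. The tolerance `eps`, the number of replications `reps`, the sample sizes, and the function names are assumptions made for illustration only.

```python
import random


def sample_average(n, p, rng):
    """Average of n independent Bernoulli(p) draws."""
    return sum(rng.random() < p for _ in range(n)) / n


def share_within(n, p, eps, reps, rng):
    """Fraction of `reps` simulated sample averages with |Y_bar - p| < eps."""
    hits = sum(abs(sample_average(n, p, rng) - p) < eps for _ in range(reps))
    return hits / reps


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
p, eps, reps = 0.3, 0.05, 2000
shares = {n: share_within(n, p, eps, reps, rng) for n in (10, 100, 1000)}
for n, share in shares.items():
    print(f"n = {n:4d}: share within {eps} of mu = {share:.3f}")
```

Under these settings the share should rise toward one as $n$ increases, which is what convergence in probability predicts.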

Histogram of the standardized sample average $z_{\overline{Y}}$

Asymptotic Normality:

As the sample size $n$ grows, the distribution of the standardized sample average,

$$ \begin{align} z_{\overline{Y}} &= \frac{\overline{Y} - \mu}{\sigma_{\overline{Y}}}, \\ z_{\overline{Y}} &= \frac{\overline{Y} - \mu}{\frac{\sigma}{\sqrt{n}}}, \end{align} $$

gets closer to the standard normal distribution $N\left(0, 1\right)$.
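
As a small worked example of the standardization, the sketch below computes $z_{\overline{Y}}$ for a single sample average, using $\mu=p$ and $\sigma^{2}=p\left(1-p\right)$ for the Bernoulli distribution. The function name `standardize` and the input values are hypothetical.

```python
import math


def standardize(y_bar, n, p):
    """Standardized sample average for Bernoulli(p) data."""
    mu = p                              # mean of the Bernoulli distribution
    sigma = math.sqrt(p * (1 - p))      # standard deviation of one observation
    sigma_y_bar = sigma / math.sqrt(n)  # standard deviation of the sample average
    return (y_bar - mu) / sigma_y_bar


# Illustrative values: a sample average of 0.36 from n = 100 draws with p = 0.3.
z = standardize(y_bar=0.36, n=100, p=0.3)
print(f"z = {z:.2f}")
```

With these inputs the sample average lies roughly 1.31 standard deviations above $\mu$.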

More Details

For the construction of the standardized sample average $z_{\overline{Y}}$, the mean $\mu$ as well as the variance $\sigma^{2}$ are used.

For the Bernoulli distribution the mean is given by,

$$ \begin{align} \mu=p, \end{align} $$

and the variance is given by,

$$ \begin{align} \sigma^{2}=p\left(1-p\right), \end{align} $$

where $p$ is the probability of success of the Bernoulli distribution.
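
As a brief check of these two formulas: $Y_{i}$ takes only the values 0 and 1, so $Y_{i}^{2}=Y_{i}$, and therefore,

$$ \begin{align} \mu &= \text{E}\left(Y_{i}\right) = 1 \cdot p + 0 \cdot \left(1-p\right) = p, \\ \sigma^{2} &= \text{E}\left(Y_{i}^{2}\right) - \mu^{2} = p - p^{2} = p\left(1-p\right). \end{align} $$

For i.i.d. observations, the standard deviation of the sample average then follows as,

$$ \begin{align} \sigma_{\overline{Y}} = \frac{\sigma}{\sqrt{n}} = \sqrt{\frac{p\left(1-p\right)}{n}}. \end{align} $$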

1. A realization of the DGP specified above is simulated, i.e., an i.i.d. sequence of realizations $Y_{1}, Y_{2}, \ldots, Y_{n}$ is drawn from the distribution of $Y_{i}$ above.
2. Based on the sequence of observations $Y_{1}, Y_{2}, \ldots, Y_{n}$, the sample average and the standardized sample average are calculated.
3. The values of the sample average $\overline{Y}$ and the standardized sample average $z_{\overline{Y}}$ are stored.
4. Steps 1 to 3 are repeated $10,\!000$ times, resulting in $10,\!000$ sample averages and standardized sample averages.
5. The distributions of the sample averages and standardized sample averages are illustrated using histograms.
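
The simulation procedure described above can be sketched in Python as follows. This is a sketch under assumptions, not the module's actual implementation: the function name `simulate`, the seed, and the values $n=100$ and $p=0.3$ are illustrative, and summary statistics stand in for the histograms.

```python
import math
import random
import statistics


def simulate(n, p, reps, seed=0):
    """Repeat the draw/compute/store steps `reps` times; return both lists."""
    rng = random.Random(seed)
    mu = p
    sigma_y_bar = math.sqrt(p * (1 - p) / n)
    averages, z_values = [], []
    for _ in range(reps):
        # Draw an i.i.d. Bernoulli(p) sample of size n.
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        # Compute the sample average and its standardized version.
        y_bar = sum(sample) / n
        z = (y_bar - mu) / sigma_y_bar
        # Store both values.
        averages.append(y_bar)
        z_values.append(z)
    return averages, z_values


averages, z_values = simulate(n=100, p=0.3, reps=10_000)
print(f"mean of sample averages: {statistics.fmean(averages):.4f}")
print(f"mean of z:               {statistics.fmean(z_values):.3f}")
print(f"variance of z:           {statistics.pvariance(z_values):.3f}")
```

If the standardized sample average is approximately standard normal, the mean of the stored $z_{\overline{Y}}$ values should be close to 0 and their variance close to 1.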

This module is part of the DeLLFi project of the University of Hohenheim and funded by the Foundation for Innovation in University Teaching.