class: center, middle, inverse, title-slide # ECON 3818 ## Chapter 15 ### Kyle Butts ### 03 October 2021 --- exclude: true --- class: clear, middle <!-- Custom css --> <style type="text/css"> /* ------------------------------------------------------- * * !! This file was generated by xaringanthemer !! * * Changes made to this file directly will be overwritten * if you used xaringanthemer in your xaringan slides Rmd * ------------------------------------------------------- */ @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700&display=swap); @import url(https://fonts.googleapis.com/css2?family=Atkinson+Hyperlegible&display=swap); :root { /* Fonts */ --text-font-family: 'Atkinson Hyperlegible'; --text-font-is-google: 1; --text-font-family-fallback: Roboto, -apple-system, BlinkMacSystemFont, avenir next, avenir, helvetica neue, helvetica, Ubuntu, roboto, noto, segoe ui, arial; --text-font-base: sans-serif; --header-font-family: 'Atkinson Hyperlegible'; --header-font-is-google: 1; --header-font-family-fallback: Georgia, serif; --code-font-family: 'Source Code Pro'; --code-font-is-google: 1; --base-font-size: 20px; --text-font-size: 1rem; --code-font-size: 0.9rem; --code-inline-font-size: 1em; --header-h1-font-size: 1.75rem; --header-h2-font-size: 1.6rem; --header-h3-font-size: 1.5rem; /* Colors */ --text-color: #131516; --text-color-light: #555F61; --header-color: #FFF; --background-color: #FFF; --link-color: #107895; --code-highlight-color: rgba(255,255,0,0.5); --inverse-text-color: #d6d6d6; --inverse-background-color: #272822; --inverse-header-color: #f3f3f3; --inverse-link-color: #107895; --title-slide-background-color: #272822; --title-slide-text-color: #d6d6d6; --header-background-color: #FFF; --header-background-text-color: #FFF; } html { font-size: var(--base-font-size); } body { font-family: 
var(--text-font-family), var(--text-font-family-fallback), var(--text-font-base); font-weight: normal; color: var(--text-color); } h1, h2, h3 { font-family: var(--header-font-family), var(--header-font-family-fallback); color: var(--text-color-light); } .remark-slide-content { background-color: var(--background-color); font-size: 1rem; padding: 24px 32px 16px 32px; width: 100%; height: 100%; } .remark-slide-content h1 { font-size: var(--header-h1-font-size); } .remark-slide-content h2 { font-size: var(--header-h2-font-size); } .remark-slide-content h3 { font-size: var(--header-h3-font-size); } .remark-code, .remark-inline-code { font-family: var(--code-font-family), Menlo, Consolas, Monaco, Liberation Mono, Lucida Console, monospace; } .remark-code { font-size: var(--code-font-size); } .remark-inline-code { font-size: var(--code-inline-font-size); color: #000; } .remark-slide-number { color: #107895; opacity: 1; font-size: 0.9em; } a, a > code { color: var(--link-color); text-decoration: none; } .footnote { position: absolute; bottom: 60px; padding-right: 6em; font-size: 0.9em; } .remark-code-line-highlighted { background-color: var(--code-highlight-color); } .inverse { background-color: var(--inverse-background-color); color: var(--inverse-text-color); } .inverse h1, .inverse h2, .inverse h3 { color: var(--inverse-header-color); } .inverse a, .inverse a > code { color: var(--inverse-link-color); } img, video, iframe { max-width: 100%; } blockquote { border-left: solid 5px lightgray; padding-left: 1em; } @page { margin: 0; } @media print { .remark-slide-scaler { width: 100% !important; height: 100% !important; transform: scale(1) !important; top: 0 !important; left: 0 !important; } } /* Modified metropolis */ .clear{ border-top: 0px solid #FAFAFA; } h1 { margin-top: -5px; margin-left: -00px; margin-bottom: 30px; color: var(--text-color-light); font-weight: 200; } h2, h3, h4 { padding-top: -15px; padding-bottom: 00px; color: #1A292C; text-shadow: none; font-weight: 
400; text-align: left; margin-left: 00px; margin-bottom: -10px; } .title-slide .inverse .remark-slide-content { background-color: #FAFAFA; } .title-slide { background-color: #FAFAFA; border-top: 80px solid #FAFAFA; } .title-slide h1 { color: var(--text-color); font-size: 40px; text-shadow: none; font-weight: 400; text-align: left; margin-left: 15px; } .title-slide h2 { margin-top: -15px; color: var(--link-color); text-shadow: none; font-weight: 300; font-size: 35px; text-align: left; margin-left: 15px; } .title-slide h3 { color: var(--text-color-light); text-shadow: none; font-weight: 300; font-size: 25px; text-align: left; margin-left: 15px; margin-bottom: 0px; } .title-slide h3:last-of-type { font-style: italic; font-size: 1rem; } /* Remove orange line */ hr, .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #e5e5e5; color: #e5e5e5; height: 1px; } hr, .mline h1::after { margin: 1em 15px 0 15px; } .title-slide h2::after { margin: 10px 15px 35px 0; } .mline h1::after { margin: 10px 15px 0 15px; } /* turns off slide numbers for title page: https://github.com/gnab/remark/issues/298 */ .title-slide .remark-slide-number { display: none; } /* Custom CSS */ /* More line spacing */ body { line-height: 1.5; } /* Font styling */ .hi { font-weight: 600; } .mono { font-family: monospace; } .ul { text-decoration: underline; } .ol { text-decoration: overline; } .st { text-decoration: line-through; } .bf { font-weight: bold; } .it { font-style: italic; } /* Font Sizes */ .bigger { font-size: 125%; } .huge{ font-size: 150%; } .small { font-size: 95%; } .smaller { font-size: 85%; } .smallest { font-size: 75%; } .tiny { font-size: 50%; } /* Remark customization */ .clear .remark-slide-number { display: none; } .inverse .remark-slide-number { display: none; } .remark-code-line-highlighted { background-color: rgba(249, 39, 114, 0.5); } /* Xaringan tweeks */ .inverse { background-color: #23373B; text-shadow: 0 0 20px #333; /* 
text-shadow: none; */ } .title-slide { background-color: #ffffff; border-top: 80px solid #ffffff; } .footnote { bottom: 1em; font-size: 80%; color: #7f7f7f; } /* Lists */ li { margin-top: 4px; } /* Mono-spaced font, smaller */ .mono-small { font-family: monospace; font-size: 16px; } .mono-small .mjx-chtml { font-size: 103% !important; } .pseudocode, .pseudocode-small { font-family: monospace; background: #f8f8f8; border-radius: 3px; padding: 10px; padding-top: 0px; padding-bottom: 0px; } .pseudocode-small { font-size: 16px; } .remark-code { font-size: 68%; } .remark-inline-code { background: #F5F5F5; /* lighter */ /* background: #e7e8e2; /* darker */ border-radius: 3px; padding: 4px; } /* Super and Subscripts */ .super{ vertical-align: super; font-size: 70%; line-height: 1%; } .sub{ vertical-align: sub; font-size: 70%; line-height: 1%; } /* Subheader */ .subheader{ font-weight: 100; font-style: italic; display: block; margin-top: -25px; margin-bottom: 25px; } /* 2/3 left; 1/3 right */ .more-left { float: left; width: 63%; } .less-right { float: right; width: 31%; } .more-right ~ * { clear: both; } /* 9/10 left; 1/10 right */ .left90 { padding-top: 0.7em; float: left; width: 85%; } .right10 { padding-top: 0.7em; float: right; width: 9%; } /* 95% left; 5% right */ .left95 { padding-top: 0.7em; float: left; width: 91%; } .right05 { padding-top: 0.7em; float: right; width: 5%; } .left5 { padding-top: 0.7em; margin-left: 0em; margin-right: -0.4em; float: left; width: 7%; } .left10 { padding-top: 0.7em; margin-left: -0.2em; margin-right: -0.5em; float: left; width: 10%; } .left30 { padding-top: 0.7em; float: left; width: 30%; } .right30 { padding-top: 0.7em; float: right; width: 30%; } .thin-left { padding-top: 0.7em; margin-left: -1em; margin-right: -0.5em; float: left; width: 27.5%; } /* Example */ .ex { font-weight: 300; color: #555F61 !important; font-style: italic; } .col-left { float: left; width: 47%; margin-top: -1em; } .col-right { float: right; width: 47%; 
margin-top: -1em; } .clear-up { clear: both; margin-top: -1em; } /* Format tables */ table { color: #000000; font-size: 14pt; line-height: 100%; border-top: 1px solid #ffffff !important; border-bottom: 1px solid #ffffff !important; } th, td { background-color: #ffffff; } table th { font-weight: 400; } /* Attention */ .attn { font-weight: 500; color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Note */ .note { font-weight: 300; font-style: italic; color: #314f4f !important; /* color: #cccccc !important; */ font-family: 'Zilla Slab' !important; } /* Question and answer */ .qa { font-weight: 500; /* color: #314f4f !important; */ color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Figure Caption */ .caption { font-size: 0.8888889em; line-height: 1.5; margin-top: 1em; color: #6b7280; } </style> <!-- From xaringancolor --> <div style = "position:fixed; visibility: hidden"> $$ \require{color} \definecolor{purple}{rgb}{0.337254901960784, 0.00392156862745098, 0.643137254901961} \definecolor{navy}{rgb}{0.0509803921568627, 0.23921568627451, 0.337254901960784} \definecolor{ruby}{rgb}{0.603921568627451, 0.145098039215686, 0.0823529411764706} \definecolor{alice}{rgb}{0.0627450980392157, 0.470588235294118, 0.584313725490196} \definecolor{daisy}{rgb}{0.92156862745098, 0.788235294117647, 0.266666666666667} \definecolor{coral}{rgb}{0.949019607843137, 0.427450980392157, 0.129411764705882} \definecolor{kelly}{rgb}{0.509803921568627, 0.576470588235294, 0.337254901960784} \definecolor{jet}{rgb}{0.0745098039215686, 0.0823529411764706, 0.0862745098039216} \definecolor{asher}{rgb}{0.333333333333333, 0.372549019607843, 0.380392156862745} \definecolor{slate}{rgb}{0.192156862745098, 0.309803921568627, 0.309803921568627} \definecolor{cranberry}{rgb}{0.901960784313726, 0.254901960784314, 0.450980392156863} $$ </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { purple: ["{\\color{purple}{#1}}", 1], navy: ["{\\color{navy}{#1}}", 
1], ruby: ["{\\color{ruby}{#1}}", 1], alice: ["{\\color{alice}{#1}}", 1], daisy: ["{\\color{daisy}{#1}}", 1], coral: ["{\\color{coral}{#1}}", 1], kelly: ["{\\color{kelly}{#1}}", 1], jet: ["{\\color{jet}{#1}}", 1], asher: ["{\\color{asher}{#1}}", 1], slate: ["{\\color{slate}{#1}}", 1], cranberry: ["{\\color{cranberry}{#1}}", 1] }, loader: {load: ['[tex]/color']}, tex: {packages: {'[+]': ['color']}} } }); </script> <style> .purple {color: #5601A4;} .navy {color: #0D3D56;} .ruby {color: #9A2515;} .alice {color: #107895;} .daisy {color: #EBC944;} .coral {color: #F26D21;} .kelly {color: #829356;} .jet {color: #131516;} .asher {color: #555F61;} .slate {color: #314F4F;} .cranberry {color: #E64173;} </style> ## Chapter 15: Parameters and Statistics --- # Parameters and Statistics We have discussed using sample data to make inference about the population. In particular, we will use sample .hi.kelly[statistics] to make inference about population .hi.purple[parameters]. A .hi.purple[parameter] is a number that describes the population. In practice, parameters are unknown because we cannot examine the entire population. A .hi.kelly[statistic] is a number that can be calculated from sample data without using any unknown parameters. In practice, we use statistics to estimate parameters. --- # Greek Letters and Statistics .pull-left[ .hi.kelly[Latin Letters] - Latin letters like `\(\bar{x}\)` and `\(s^2\)` are calculations that represent guesses (estimates) at the population values. ] .pull-right[ .hi.purple[Greek Letters] - Greek letters like `\(\mu\)` and `\(\sigma^2\)` represent the truth about the population. 
] The goal for the class is for the Latin letters to be good guesses for the Greek letters: $$ \kelly{\text{Data}} \longrightarrow \kelly{\text{Calculation}} \longrightarrow \kelly{\text{Estimates}} \longrightarrow^{hopefully!} \purple{\text{Truth}} $$ For example, $$ \kelly{X} \longrightarrow \kelly{\frac{1}{n} \sum_{i=1}^n X_i} \longrightarrow \kelly{\bar{x}} \longrightarrow^{hopefully!} \purple{\mu} $$ --- # Examples of Parameters Some parameters of distributions we've encountered are - `\(n\)` and `\(p\)` in `\(X\sim B(n,p)\)` with probability mass function $$ P(X=x)={n \choose x} p^x \left(1-p\right)^{n-x} $$ - `\(a\)` and `\(b\)` in `\(X\sim U(a,b)\)` with probability density function $$ f(x)=\frac{1}{b-a}, \quad a \leq x \leq b $$ - `\(\mu\)` and `\(\sigma^2\)` in `\(X\sim N(\mu,\sigma^2)\)` with probability density function $$ f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} $$ --- # Mean and Variance Two population parameters of particular interest are - the mean, denoted `\(\mu\)`, defined by `\(E(X)\)` - the variance, denoted `\(\sigma^2\)`, defined by `\(E(X^2)-E(X)^2\)` We .hi[do not] observe these. Therefore, we guess using - the sample mean, `\(\bar{X}\)` - the sample variance, `\(s^2\)` Why do we use these as our guesses? --- # Getting the right sample Before we talk about the properties of sample statistics, we need to make sure we have the right sample. We talked about good ways to generate a sample. .hi.it[The right sample is the most important part of any data analysis.] A .hi.kelly[Simple Random Sample] has no bias and has observations that are from the same population. --- # Identically Distributed If every observation is from the same population, we say all of the observations in our sample are .hi.cranberry[identically distributed]. In math, this means for any two observations `\(X_i\)` and `\(X_j\)`, $$ Pr(X_i < x) = Pr(X_j < x) $$ --- # Independent Observations Does observing `\(X_i\)` impact our best guess of `\(X_j\)`? 
Sometimes yes (time series, spatial dependence), but hopefully not. To simplify things, we need to assume .hi.red[independent sample observations], meaning $$ Pr(X_i=a \ \vert \ X_j=b) = Pr(X_i=a) $$ Intuitively, this means that .it[observing] one outcome doesn't help you .it[predict] any other outcome. To summarize, we want an .it[i.i.d.] sample, i.e. sample observations that are .hi.purple[independent and identically distributed]. --- # Sample Statistics are Random Variables For a sample `\(X_1,..., X_n\)` of the random variable `\(X\)`, any function of that sample, `\(\hat{\theta}=g(X_1,...,X_n)\)`, is a .hi.ruby[sample statistic]. For example, `$$\ruby{\bar{X}} = \frac{1}{n} \sum_{i=1}^{n} X_i$$` `$$\ruby{\displaystyle s^2} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$` Because `\(X_1,..., X_n\)` are random variables, any sample statistic `\(\ruby{\hat{\theta}} = g(X_1,...,X_n)\)` is itself a random variable! That means there is some distribution for the values of `\(\ruby{\hat{\theta}}\)`. --- # Sampling Distributions This is one of the most important concepts in the course. One .hi[trial] would consist of the following: - .hi.kelly[Random Sample] - Grab a group of observations from the population - .hi.ruby[Sample Statistic] - Take your particular random sample and calculate a sample statistic (e.g. sample mean) .hi.coral[Sampling Distribution] - Imagine repeatedly grabbing a different group of observations from the population and calculating the sample mean. This is performing many .hi[trials]. The sample means themselves will have a distribution. 
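The trial-and-repeat procedure above can be sketched in a few lines of code (a Python sketch for intuition, not part of the course materials; the `\(N(15, 25)\)` population and sample size of 100 are illustrative assumptions):

```python
import random
import statistics

random.seed(0)  # reproducible draws

def one_trial(n):
    # One trial: grab a random sample of size n from the population
    # (assumed here to be N(mu = 15, sigma^2 = 25)), then compute the
    # sample statistic of interest -- the sample mean.
    sample = [random.gauss(15, 5) for _ in range(n)]
    return statistics.mean(sample)

# Many trials trace out the sampling distribution of the sample mean.
sample_means = [one_trial(n=100) for _ in range(10_000)]

print(statistics.mean(sample_means))      # close to mu = 15
print(statistics.variance(sample_means))  # close to sigma^2 / n = 0.25
```

A histogram of `sample_means` would reproduce the bell-shaped sampling distribution shown in the animations that follow.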
--- class: clear, center <img src="data:image/png;base64,#frame1.png" width="100%" style="display: block; margin: auto;" /> --- class: clear, center <img src="data:image/png;base64,#frame2.png" width="100%" style="display: block; margin: auto;" /> --- class: clear, center .center[ <img style="width:100%;" src="data:image/png;base64,#sample_dist.gif"/> ] --- class: clear, middle, center <img src="data:image/png;base64,#frame1000.png" width="100%" style="display: block; margin: auto;" /> --- # Sample Size The variance of the .it.coral[sampling distribution] depends on the sample size. As `\(n\)` gets larger, each individual .hi[trial] gives a better guess at the mean. Hence, the .coral[sampling distribution] becomes narrower. .center[ <img style="width:80%;" src="data:image/png;base64,#dist_n.gif"/> ] --- class: clear <img src="data:image/png;base64,#sample_dist_diff_n.png" width="100%" style="display: block; margin: auto;" /> --- # Sampling Distributions In the real world, though, we only observe one sample. How does the concept of the .coral[sampling distribution] help us? -- - Since we don't know the true population parameter, our .ruby[sample statistic] is our best guess at the true value. - If we know the .coral[sampling distribution], then we can quantify the uncertainty in our .ruby[sample statistic]. --- # Law of Large Numbers Is `\(\bar{X}\)` actually a good guess for `\(\mu\)`? Under certain conditions, we can use the .hi.purple[Law of Large Numbers (LLN)] to guarantee that `\(\bar{X}\)` approaches `\(\mu\)` as the sample size grows large. -- .hi[Theorem]: Let `\(X_1,X_2,...,X_n\)` be an i.i.d. set of observations with `\(E(X_i) = \mu\)`. Define the sample mean of size `\(n\)` as `\(\bar{X}_n = \frac{1}{n}\sum_{i = 1}^{n}X_i\)`. Then $$ \bar{X}_n \to \mu \quad \text{as} \quad n \to \infty. $$ Intuitively, as we observe a larger and larger sample, we average over randomness and our sample mean approaches the true population mean. 
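The LLN is easy to see in a quick simulation: track the running sample mean as the sample grows (a Python sketch; the `\(N(15, 25)\)` population is an illustrative assumption):

```python
import random
import statistics

random.seed(1)  # reproducible draws

mu, sigma = 15, 5  # assumed (illustrative) population mean and sd

# One long stream of i.i.d. observations; by the LLN the running
# sample mean xbar_n settles down toward mu as n grows.
draws = [random.gauss(mu, sigma) for _ in range(100_000)]

for n in (10, 100, 10_000, 100_000):
    xbar_n = statistics.mean(draws[:n])
    print(f"n = {n:>7}  xbar_n = {xbar_n:.4f}")
```

The printed means wander for small `\(n\)` and hug `\(\mu = 15\)` for large `\(n\)`, mirroring the animation on the next slide.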
--- # Law of Large Numbers .center[ <img style="width: 90%;" src="data:image/png;base64,#lln.gif"/> ] --- # Law of Large Numbers <img src="data:image/png;base64,#sample_dist_diff_n.png" width="100%" style="display: block; margin: auto;" /> --- # Properties of the sample mean .hi[Theorem]: Let `\(X_1,X_2,...,X_n\)` be an i.i.d. sample with `\(E(X_i) = \mu\)` and `\(Var(X_i) = \sigma^2 < \infty\)`. Then `$$E(\bar{X}_n) = \mu$$` `$$Var(\bar{X}_n) = \frac{\sigma^2}{n}$$` Intuitively, we grab many samples from a population. The variance of our sample averages shrinks as we observe more observations per sample. --- # Clicker Question Suppose we sample 100 observations from a distribution with `\(\mu = 15\)` and `\(\sigma^2 = 25\)`. What are `\(E(\bar{X}_{100})\)` and `\(Var(\bar{X}_{100})\)`? <ol type = "a"> <li>\(E(\bar{X}_{100}) = 15\), \(Var(\bar{X}_{100}) = 25\) <li>\(E(\bar{X}_{100}) = 0.15\), \(Var(\bar{X}_{100}) = 0.25\) <li>\(E(\bar{X}_{100}) = 15\), \(Var(\bar{X}_{100}) = 5\) <li>\(E(\bar{X}_{100}) = 15\), \(Var(\bar{X}_{100}) = 0.25\) </ol> --- class: clear ## When is the sample mean Normally Distributed? Although we know the mean and variance of `\(\bar{X}\)`, we generally don't know its distribution function. .hi[Theorem]: Let `\(X_1,X_2,...,X_n\)` be an i.i.d. sample with `\(X_i \sim N(\mu, \sigma^2)\)` for `\(i=1,2,...,n\)`. Then $$ \bar{X}_n \sim N(\mu, \frac{\sigma^2}{n}). $$ Intuitively, if all the observations come from the same normal distribution, then the sample average is also normally distributed and centered at the true mean (but with a much smaller variance). --- # Central Limit Theorem What if `\(X_i\)` are not normally distributed? If the number of observations per sample, `\(n\)`, is large (we will discuss this more later), then the distribution of `\(X_i\)` doesn't matter. The sample mean will be approximately distributed as $$ \bar{X}_n \sim N(\mu, \frac{\sigma^2}{n}). $$
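To see the CLT at work, we can average samples drawn from a distribution that looks nothing like a normal (a Python sketch; the Exponential(1) population is an illustrative choice, not from the slides):

```python
import random
import statistics

random.seed(2)  # reproducible draws

# Population: Exponential(1) -- heavily right-skewed, nothing like a
# normal -- with mu = 1 and sigma^2 = 1.
n = 50
means = [
    statistics.mean(random.expovariate(1) for _ in range(n))
    for _ in range(20_000)
]

# CLT: the sample means are approximately N(mu, sigma^2 / n) = N(1, 0.02),
# even though each individual observation is far from normal.
print(statistics.mean(means))      # close to 1
print(statistics.variance(means))  # close to 1 / 50 = 0.02
```

Plotting a histogram of `means` would show the familiar bell shape, despite the skewed population the observations came from.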