class: center, middle, inverse, title-slide

# ECON 3818
## Chapter 16
### Kyle Butts
### 11 October 2021

---
class: clear, middle

<!-- Custom css -->
<style type="text/css"> /* ------------------------------------------------------- * * !! This file was generated by xaringanthemer !! * * Changes made to this file directly will be overwritten * if you used xaringanthemer in your xaringan slides Rmd * ------------------------------------------------------- */ @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700&display=swap); @import url(https://fonts.googleapis.com/css2?family=Atkinson+Hyperlegible&display=swap); :root { /* Fonts */ --text-font-family: 'Atkinson Hyperlegible'; --text-font-is-google: 1; --text-font-family-fallback: Roboto, -apple-system, BlinkMacSystemFont, avenir next, avenir, helvetica neue, helvetica, Ubuntu, roboto, noto, segoe ui, arial; --text-font-base: sans-serif; --header-font-family: 'Atkinson Hyperlegible'; --header-font-is-google: 1; --header-font-family-fallback: Georgia, serif; --code-font-family: 'Source Code Pro'; --code-font-is-google: 1; --base-font-size: 20px; --text-font-size: 1rem; --code-font-size: 0.9rem; --code-inline-font-size: 1em; --header-h1-font-size: 1.75rem; --header-h2-font-size: 1.6rem; --header-h3-font-size: 1.5rem; /* Colors */ --text-color: #131516; --text-color-light: #555F61; --header-color: #FFF; --background-color: #FFF; --link-color: #107895; --code-highlight-color: rgba(255,255,0,0.5); --inverse-text-color: #d6d6d6; --inverse-background-color: #272822; --inverse-header-color: #f3f3f3; --inverse-link-color: #107895; --title-slide-background-color: #272822; --title-slide-text-color: #d6d6d6; --header-background-color: #FFF; --header-background-text-color: #FFF; } html { font-size: var(--base-font-size); } body { font-family: 
var(--text-font-family), var(--text-font-family-fallback), var(--text-font-base); font-weight: normal; color: var(--text-color); } h1, h2, h3 { font-family: var(--header-font-family), var(--header-font-family-fallback); color: var(--text-color-light); } .remark-slide-content { background-color: var(--background-color); font-size: 1rem; padding: 24px 32px 16px 32px; width: 100%; height: 100%; } .remark-slide-content h1 { font-size: var(--header-h1-font-size); } .remark-slide-content h2 { font-size: var(--header-h2-font-size); } .remark-slide-content h3 { font-size: var(--header-h3-font-size); } .remark-code, .remark-inline-code { font-family: var(--code-font-family), Menlo, Consolas, Monaco, Liberation Mono, Lucida Console, monospace; } .remark-code { font-size: var(--code-font-size); } .remark-inline-code { font-size: var(--code-inline-font-size); color: #000; } .remark-slide-number { color: #107895; opacity: 1; font-size: 0.9em; } a, a > code { color: var(--link-color); text-decoration: none; } .footnote { position: absolute; bottom: 60px; padding-right: 6em; font-size: 0.9em; } .remark-code-line-highlighted { background-color: var(--code-highlight-color); } .inverse { background-color: var(--inverse-background-color); color: var(--inverse-text-color); } .inverse h1, .inverse h2, .inverse h3 { color: var(--inverse-header-color); } .inverse a, .inverse a > code { color: var(--inverse-link-color); } img, video, iframe { max-width: 100%; } blockquote { border-left: solid 5px lightgray; padding-left: 1em; } @page { margin: 0; } @media print { .remark-slide-scaler { width: 100% !important; height: 100% !important; transform: scale(1) !important; top: 0 !important; left: 0 !important; } } /* Modified metropolis */ .clear{ border-top: 0px solid #FAFAFA; } h1 { margin-top: -5px; margin-left: -00px; margin-bottom: 30px; color: var(--text-color-light); font-weight: 200; } h2, h3, h4 { padding-top: -15px; padding-bottom: 00px; color: #1A292C; text-shadow: none; font-weight: 
400; text-align: left; margin-left: 00px; margin-bottom: -10px; } .title-slide .inverse .remark-slide-content { background-color: #FAFAFA; } .title-slide { background-color: #FAFAFA; border-top: 80px solid #FAFAFA; } .title-slide h1 { color: var(--text-color); font-size: 40px; text-shadow: none; font-weight: 400; text-align: left; margin-left: 15px; } .title-slide h2 { margin-top: -15px; color: var(--link-color); text-shadow: none; font-weight: 300; font-size: 35px; text-align: left; margin-left: 15px; } .title-slide h3 { color: var(--text-color-light); text-shadow: none; font-weight: 300; font-size: 25px; text-align: left; margin-left: 15px; margin-bottom: 0px; } .title-slide h3:last-of-type { font-style: italic; font-size: 1rem; } /* Remove orange line */ hr, .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #e5e5e5; color: #e5e5e5; height: 1px; } hr, .mline h1::after { margin: 1em 15px 0 15px; } .title-slide h2::after { margin: 10px 15px 35px 0; } .mline h1::after { margin: 10px 15px 0 15px; } /* turns off slide numbers for title page: https://github.com/gnab/remark/issues/298 */ .title-slide .remark-slide-number { display: none; } /* Custom CSS */ /* More line spacing */ body { line-height: 1.5; } /* Font styling */ .hi { font-weight: 600; } .mono { font-family: monospace; } .ul { text-decoration: underline; } .ol { text-decoration: overline; } .st { text-decoration: line-through; } .bf { font-weight: bold; } .it { font-style: italic; } /* Font Sizes */ .bigger { font-size: 125%; } .huge{ font-size: 150%; } .small { font-size: 95%; } .smaller { font-size: 85%; } .smallest { font-size: 75%; } .tiny { font-size: 50%; } /* Remark customization */ .clear .remark-slide-number { display: none; } .inverse .remark-slide-number { display: none; } .remark-code-line-highlighted { background-color: rgba(249, 39, 114, 0.5); } /* Xaringan tweeks */ .inverse { background-color: #23373B; text-shadow: 0 0 20px #333; /* 
text-shadow: none; */ } .title-slide { background-color: #ffffff; border-top: 80px solid #ffffff; } .footnote { bottom: 1em; font-size: 80%; color: #7f7f7f; } /* Lists */ li { margin-top: 4px; } /* Mono-spaced font, smaller */ .mono-small { font-family: monospace; font-size: 16px; } .mono-small .mjx-chtml { font-size: 103% !important; } .pseudocode, .pseudocode-small { font-family: monospace; background: #f8f8f8; border-radius: 3px; padding: 10px; padding-top: 0px; padding-bottom: 0px; } .pseudocode-small { font-size: 16px; } .remark-code { font-size: 68%; } .remark-inline-code { background: #F5F5F5; /* lighter */ /* background: #e7e8e2; /* darker */ border-radius: 3px; padding: 4px; } /* Super and Subscripts */ .super{ vertical-align: super; font-size: 70%; line-height: 1%; } .sub{ vertical-align: sub; font-size: 70%; line-height: 1%; } /* Subheader */ .subheader{ font-weight: 100; font-style: italic; display: block; margin-top: -25px; margin-bottom: 25px; } /* 2/3 left; 1/3 right */ .more-left { float: left; width: 63%; } .less-right { float: right; width: 31%; } .more-right ~ * { clear: both; } /* 9/10 left; 1/10 right */ .left90 { padding-top: 0.7em; float: left; width: 85%; } .right10 { padding-top: 0.7em; float: right; width: 9%; } /* 95% left; 5% right */ .left95 { padding-top: 0.7em; float: left; width: 91%; } .right05 { padding-top: 0.7em; float: right; width: 5%; } .left5 { padding-top: 0.7em; margin-left: 0em; margin-right: -0.4em; float: left; width: 7%; } .left10 { padding-top: 0.7em; margin-left: -0.2em; margin-right: -0.5em; float: left; width: 10%; } .left30 { padding-top: 0.7em; float: left; width: 30%; } .right30 { padding-top: 0.7em; float: right; width: 30%; } .thin-left { padding-top: 0.7em; margin-left: -1em; margin-right: -0.5em; float: left; width: 27.5%; } /* Example */ .ex { font-weight: 300; color: #555F61 !important; font-style: italic; } .col-left { float: left; width: 47%; margin-top: -1em; } .col-right { float: right; width: 47%; 
margin-top: -1em; } .clear-up { clear: both; margin-top: -1em; } /* Format tables */ table { color: #000000; font-size: 14pt; line-height: 100%; border-top: 1px solid #ffffff !important; border-bottom: 1px solid #ffffff !important; } th, td { background-color: #ffffff; } table th { font-weight: 400; } /* Attention */ .attn { font-weight: 500; color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Note */ .note { font-weight: 300; font-style: italic; color: #314f4f !important; /* color: #cccccc !important; */ font-family: 'Zilla Slab' !important; } /* Question and answer */ .qa { font-weight: 500; /* color: #314f4f !important; */ color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Figure Caption */ .caption { font-size: 0.8888889em; line-height: 1.5; margin-top: 1em; color: #6b7280; } </style> <!-- From xaringancolor --> <div style = "position:fixed; visibility: hidden"> $$ \require{color} \definecolor{purple}{rgb}{0.337254901960784, 0.00392156862745098, 0.643137254901961} \definecolor{navy}{rgb}{0.0509803921568627, 0.23921568627451, 0.337254901960784} \definecolor{ruby}{rgb}{0.603921568627451, 0.145098039215686, 0.0823529411764706} \definecolor{alice}{rgb}{0.0627450980392157, 0.470588235294118, 0.584313725490196} \definecolor{daisy}{rgb}{0.92156862745098, 0.788235294117647, 0.266666666666667} \definecolor{coral}{rgb}{0.949019607843137, 0.427450980392157, 0.129411764705882} \definecolor{kelly}{rgb}{0.509803921568627, 0.576470588235294, 0.337254901960784} \definecolor{jet}{rgb}{0.0745098039215686, 0.0823529411764706, 0.0862745098039216} \definecolor{asher}{rgb}{0.333333333333333, 0.372549019607843, 0.380392156862745} \definecolor{slate}{rgb}{0.192156862745098, 0.309803921568627, 0.309803921568627} \definecolor{cranberry}{rgb}{0.901960784313726, 0.254901960784314, 0.450980392156863} $$ </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { purple: ["{\\color{purple}{#1}}", 1], navy: ["{\\color{navy}{#1}}", 
1], ruby: ["{\\color{ruby}{#1}}", 1], alice: ["{\\color{alice}{#1}}", 1], daisy: ["{\\color{daisy}{#1}}", 1], coral: ["{\\color{coral}{#1}}", 1], kelly: ["{\\color{kelly}{#1}}", 1], jet: ["{\\color{jet}{#1}}", 1], asher: ["{\\color{asher}{#1}}", 1], slate: ["{\\color{slate}{#1}}", 1], cranberry: ["{\\color{cranberry}{#1}}", 1] }, loader: {load: ['[tex]/color']}, tex: {packages: {'[+]': ['color']}} } }); </script> <style> .purple {color: #5601A4;} .navy {color: #0D3D56;} .ruby {color: #9A2515;} .alice {color: #107895;} .daisy {color: #EBC944;} .coral {color: #F26D21;} .kelly {color: #829356;} .jet {color: #131516;} .asher {color: #555F61;} .slate {color: #314F4F;} .cranberry {color: #E64173;} </style>

## Chapter 16: Confidence Intervals

---

# Statistical Inference

Recall, we're interested in estimating some unknown .cranberry[population] parameter `\(\theta\)` using the .kelly[sample] `\(X_1, \dots, X_n\)`.

We can use some estimator `\(\hat{\theta}\)`:

- We can find its bias and its variance
- We can say what it converges to using the Law of Large Numbers and the Central Limit Theorem

However, we don't have `\(n \to \infty\)`. We have a .hi[finite] sample.

- Therefore, we want to construct some belief about how good our estimator is.
- For example, if our sample mean is based on only 5 individuals, our sampling distribution has a large variance. We want to report that.

---

# Example

Say we collect ACT scores for 654 students and calculate `\(\bar{X}_{654}=26.8\)`.
Somehow we also know that the standard deviation of the *population distribution* is `\(\sigma = 7.5\)`.

To visualize:

<img src="data:image/png;base64,#ch16_files/figure-html/68-95-99-1.svg" width="75%" style="display: block; margin: auto;" />

---

# 95% Rule

From the fact that `\(\bar{X}_n \sim N(\mu, \frac{\sigma}{\sqrt{n}})\)`, we have:

`\begin{align*}
P(- z_{\alpha/2} \leq \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \leq z_{\alpha/2}) = 1-\alpha,
\end{align*}`

where `\(- z_{\alpha/2}\)` is the .emph[critical value] with a left-tail probability of `\(\alpha/2\)` (2.5% when `\(\alpha = 0.05\)`). With some math, we see:

--

`\begin{align*}
P(- z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \bar{X}_n - \mu \leq z_{\alpha/2} \frac{\sigma}{\sqrt{n}}) = 1-\alpha
\end{align*}`

--

`\begin{align*}
P(\mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \bar{X}_n \leq \mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}) = 1-\alpha
\end{align*}`

--

That is, `\(100(1-\alpha)\)`% of the time, the sample mean falls between `\(\mu - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\)` and `\(\mu + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\)`.

---

# Central Idea of a Confidence Interval

The interval we defined above involves `\(\mu\)`, which we .emph[do not know!] Instead, let's do the math slightly differently:

`\begin{align*}
P(- z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \bar{X}_n - \mu \leq z_{\alpha/2} \frac{\sigma}{\sqrt{n}}) = 1-\alpha
\end{align*}`

--

`\begin{align*}
P(- \bar{X}_n - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq - \mu \leq - \bar{X}_n + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}) = 1-\alpha
\end{align*}`

--

`\begin{align*}
P(\bar{X}_n - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{X}_n + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}) = 1-\alpha
\end{align*}`

--

Now we are able to characterize some of the uncertainty surrounding `\(\mu\)`.
In particular, we have that in repeated sampling of `\(\bar{X}_n\)` (many samples of size `\(n\)`), `\(\mu\)` will be in the interval

`\begin{align*}
\left[\bar{X}_n - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \bar{X}_n + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right]
\end{align*}`

`\(100(1 - \alpha)\)`% of the time.

---

# Confidence Interval

The 95 part of the 68-95-99.7 rule for Normal distributions says that `\(\bar{X}_{654}\)` is within 2 standard deviations of the mean `\(\mu\)` in 95% of samples. Therefore `\(z_{\alpha/2} \approx 2\)`.

Since `\(\sigma=7.5\)`, the standard deviation of our sampling distribution is `\(\frac{7.5}{\sqrt{654}} \approx 0.3\)` (from the Central Limit Theorem).

This means for .hi[95% of all samples of size 654], the sample mean `\(\bar{X}_{654}\)` is within `\(0.6\)` of the population mean (two standard deviations).

<br>

If we estimate that `\(\mu\)` lies somewhere in the interval from `\(\bar{X}_{654}-0.6\)` to `\(\bar{X}_{654}+0.6\)`, we'll be right for 95% of all possible samples.

---

# Confidence Interval

Plugging in all the information we have leads to:

`$$\big[26.8-0.6, 26.8+0.6\big] = \big[ 26.2, 27.4 \big]$$`

This interval is a .hi.ruby[confidence interval], which says that we are .it[95% confident] that the true mean ACT score, `\(\mu\)`, is between 26.2 and 27.4. .sup[*]

.footnote[.sup[*] This is because we got this interval from a method that captures the population mean for 95% of all possible samples.]

---

# Confidence Intervals

Let's see what I mean. Let's say the population average ACT score is 27. I will draw many samples of size 654 from the distribution `\(N(27, 7.5)\)`.

For each sample, I will calculate `\(\bar{X}_{654}\)` and add and subtract `\(0.6\)` to form my confidence interval `\([\bar{X}_{654}-0.6, \bar{X}_{654}+0.6]\)`.
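The repeated-sampling experiment described here can be sketched in a few lines of Python. This is a minimal sketch, not the code that produced the figure: the number of replications (1,000), the seed, and the function name `simulate_coverage` are assumptions made for illustration, while the population values come from the slide.

```python
import random
from statistics import fmean

# Population values from the slide: mean 27, standard deviation 7.5,
# samples of size 654, and a margin of error of 0.6.
MU, SIGMA, N, MARGIN = 27.0, 7.5, 654, 0.6


def simulate_coverage(reps=1000, seed=3818):
    """Share of intervals [xbar - MARGIN, xbar + MARGIN] that contain MU.

    `reps` and `seed` are illustrative assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size N and compute its sample mean.
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        xbar = fmean(sample)
        # Count the sample as a "hit" if its interval captures the true mean.
        if xbar - MARGIN <= MU <= xbar + MARGIN:
            hits += 1
    return hits / reps


coverage = simulate_coverage()
print(f"Share of intervals containing mu: {coverage:.3f}")
```

Because a margin of 0.6 is slightly more than 1.96 standard deviations of the sampling distribution, the simulated share should land a little above 95%.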
<img src="data:image/png;base64,#ch16_files/figure-html/ci-coverage-1.svg" width="75%" style="display: block; margin: auto;" />

---

# Confidence Intervals

In that example, the confidence interval was `\(\bar{X}_{654} \pm 0.6\)`.

In general, a confidence interval takes the form

`$$\kelly{\text{estimate}} \pm \daisy{\text{margin of error}}$$`

where the .hi.daisy[margin of error] shows how much variability there is in our estimate.

---

# Margins of Error

For a given level of confidence, `\(C\)` (say 95%), the .daisy[margin of error] for our sample mean is:

`$$\daisy{k} = Z_{\frac{1-C}{2}} \cdot \frac{\sigma}{\sqrt{n}}$$`

<br>

- Let's say `\(C = 95\%\)`. We want to capture the middle 95%, so `\(\frac{1-.95}{2} = 2.5\%\)` in each tail.

- `\(Z_{0.025} = 1.96\)`, which is where the `\(\sim 2\)` standard deviations comes from.

- 90% Confidence Interval: `\(\implies Z_{\frac{1-C}{2}} = Z_{.05} = 1.645\)` standard deviations.

---

# Example

Let's determine the .it[exact] margin of error for the previous example:

`$$k=Z_{\frac{1-C}{2}} \cdot \frac{\sigma}{\sqrt{n}}$$`

If we are calculating a 95% confidence interval, where `\(\bar{X}_{654}=26.8\)` and `\(\sigma = 7.5\)`, then

`$$k=Z_{0.025} \cdot \frac{7.5}{\sqrt{654}}$$`

We find `\(Z_{0.025}\)` using the table. `\(Z_{0.025}\)` is the z-score such that `\(P(z>Z_{0.025})=0.025\)`

- Look up 0.025 (or 0.975!) and find the corresponding z-score

---

# Example

Using the z-table, we find that `\(Z_{0.025} = 1.96\)`. This means:

`$$k = 1.96\cdot \frac{7.5}{\sqrt{654}} \approx 0.57$$`

This means our .it[exact] 95% confidence interval is:

`$$[26.8 - 0.57, 26.8 + 0.57] = [26.23, 27.37]$$`

---

# Clicker Question

What Z-score is associated with an 82% confidence interval?

<ol type = "a">
<li>0.92</li>
<li>1.34</li>
<li>0.82</li>
<li>0.79</li>
</ol>

---

# Example

Say high-school freshmen are sampled to see how long they spend on social media per day. The sample mean of `\(n = 50\)` students is `\(\bar{X}_{50} = 2.1\)` hours. The standard deviation of the population is `\(0.5\)` hours.
What is the 90% confidence interval for our estimate of the mean number of hours per day?

---

# Clicker Question

A 95% confidence interval for the mean hours freshmen spent on social media per day was calculated to be [2.5 hours, 3.1 hours] based on a sample mean `\(\bar{X}_{50} = 2.8\)`. The confidence interval was based on a simple random sample (SRS) of size `\(n = 50\)`. The standard deviation of the population is:

<ol type = "a">
<li>0.3</li>
<li>1.96</li>
<li>0.2772</li>
<li>1.0823</li>
</ol>

---

# Margins of Error

So to recap, the margin of error is calculated by:

`$$k=Z_{\frac{1-C}{2}} \cdot \frac{\sigma}{\sqrt{n}}$$`

This means the size of the margin of error is determined by:

- level of confidence
- size of sample (can sometimes control)
- variance (which we can't control)

Discuss with a partner what happens to the margin of error when we:

- increase the level of confidence required
- decrease the sample size
- increase the variance of the population

---

# Margins of Error

Level of confidence

- the higher the level of confidence we want to have, the larger the margin of error

Size of sample

- the larger the sample, the smaller the margin of error

Variance

- the larger the variance, the larger the margin of error

---

# Example

Given a sample mean of 8 from a sample of 36 observations and a population variance of 25, construct 90%, 95%, and 99% confidence intervals for the true value, `\(\mu\)`.

---

# Confidence Intervals

How to .hi[properly] think of a confidence interval:

1. You collect many different samples from the population

2. For each sample, you calculate the mean and the associated confidence interval around that mean

3. You'll have a confidence interval for each sample you collected

4. 
.hi[95% of the confidence intervals you calculated will include the true mean `\(\mu\)`]

---

# Confidence Intervals

*Common misconceptions* about confidence intervals

The CI .hi.ruby[does not] tell you:

- That the true mean is inside the confidence interval
- That the probability the true mean is in your CI is C%

The CI .hi.kelly[does] tell you:

- The range of estimates that contains the true mean C% of the time *in repeated sampling*

---

# Confidence Intervals

<img src="data:image/png;base64,#ch16_files/figure-html/unnamed-chunk-2-1.svg" width="90%" style="display: block; margin: auto;" />

---

# Clicker Question -- Midterm Example

Veterinary researchers at a major university veterinary hospital calculated a 99% confidence interval for the average age of horses admitted for laminitis (a particular foot disease) as 6.3 to 7.4 years. Based on this information we conclude that:

<ol type = "a">
<li> 99% of all horses admitted for laminitis are between 6.3 and 7.4 years old </li>
<li> 99% of the time, the average age of a horse admitted for laminitis will be between 6.3 and 7.4 years </li>
<li> We are 99% confident that the true mean age of horses with laminitis is between 6.3 and 7.4 years old </li>
<li> 99% of all samples of size n=25 will have an average age of horses with laminitis between 6.3 and 7.4 years old. </li>
</ol>
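---

# Example

The margin-of-error formula `\(k = Z_{\frac{1-C}{2}} \cdot \frac{\sigma}{\sqrt{n}}\)` can also be evaluated in code rather than with the z-table. What follows is a minimal sketch in Python using the ACT numbers from the earlier example. The helper name `z_interval` is hypothetical, and `NormalDist().inv_cdf` from the standard library stands in for the table lookup.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Return (margin of error, lower, upper) when sigma is known.

    `z_interval` is a hypothetical helper written for this illustration.
    """
    tail = (1 - confidence) / 2          # probability left in each tail
    z = NormalDist().inv_cdf(1 - tail)   # critical value Z_{(1-C)/2}
    k = z * sigma / sqrt(n)              # margin of error
    return k, xbar - k, xbar + k


# ACT example from the slides: xbar = 26.8, sigma = 7.5, n = 654, C = 95%
k, lower, upper = z_interval(xbar=26.8, sigma=7.5, n=654, confidence=0.95)
print(f"k = {k:.2f}, interval = [{lower:.2f}, {upper:.2f}]")
# prints: k = 0.57, interval = [26.23, 27.37]
```

Changing `confidence` to 0.90 or 0.99 shows how the margin of error shrinks or grows with the level of confidence.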