class: center, middle, inverse, title-slide # ECON 3818 ## Chapter 21 ### Kyle Butts ### 25 August 2021 --- class: clear, middle <!-- Custom css --> <style type="text/css"> /* ------------------------------------------------------- * * !! This file was generated by xaringanthemer !! * * Changes made to this file directly will be overwritten * if you used xaringanthemer in your xaringan slides Rmd * ------------------------------------------------------- */ @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700&display=swap); @import url(https://fonts.googleapis.com/css2?family=Atkinson+Hyperlegible&display=swap); :root { /* Fonts */ --text-font-family: 'Atkinson Hyperelegible'; --text-font-is-google: 1; --text-font-family-fallback: Roboto, -apple-system, BlinkMacSystemFont, avenir next, avenir, helvetica neue, helvetica, Ubuntu, roboto, noto, segoe ui, arial; --text-font-base: sans-serif; --header-font-family: 'Atkinson Hyperelegible' --header-font-is-google: 1; --header-font-family-fallback: Georgia, serif; --code-font-family: 'Source Code Pro'; --code-font-is-google: 1; --base-font-size: 20px; --text-font-size: 1rem; --code-font-size: 0.9rem; --code-inline-font-size: 1em; --header-h1-font-size: 1.75rem; --header-h2-font-size: 1.6rem; --header-h3-font-size: 1.5rem; /* Colors */ --text-color: #131516; --text-color-light: #555F61; --header-color: #FFF; --background-color: #FFF; --link-color: #107895; --code-highlight-color: rgba(255,255,0,0.5); --inverse-text-color: #d6d6d6; --inverse-background-color: #272822; --inverse-header-color: #f3f3f3; --inverse-link-color: #107895; --title-slide-background-color: #272822; --title-slide-text-color: #d6d6d6; --header-background-color: #FFF; --header-background-text-color: #FFF; } html { font-size: var(--base-font-size); } body { font-family: 
var(--text-font-family), var(--text-font-family-fallback), var(--text-font-base); font-weight: normal; color: var(--text-color); } h1, h2, h3 { font-family: var(--header-font-family), var(--header-font-family-fallback); color: var(--text-color-light); } .remark-slide-content { background-color: var(--background-color); font-size: 1rem; padding: 24px 32px 16px 32px; width: 100%; height: 100%; } .remark-slide-content h1 { font-size: var(--header-h1-font-size); } .remark-slide-content h2 { font-size: var(--header-h2-font-size); } .remark-slide-content h3 { font-size: var(--header-h3-font-size); } .remark-code, .remark-inline-code { font-family: var(--code-font-family), Menlo, Consolas, Monaco, Liberation Mono, Lucida Console, monospace; } .remark-code { font-size: var(--code-font-size); } .remark-inline-code { font-size: var(--code-inline-font-size); color: #000; } .remark-slide-number { color: #107895; opacity: 1; font-size: 0.9em; } a, a > code { color: var(--link-color); text-decoration: none; } .footnote { position: absolute; bottom: 60px; padding-right: 6em; font-size: 0.9em; } .remark-code-line-highlighted { background-color: var(--code-highlight-color); } .inverse { background-color: var(--inverse-background-color); color: var(--inverse-text-color); } .inverse h1, .inverse h2, .inverse h3 { color: var(--inverse-header-color); } .inverse a, .inverse a > code { color: var(--inverse-link-color); } img, video, iframe { max-width: 100%; } blockquote { border-left: solid 5px lightgray; padding-left: 1em; } @page { margin: 0; } @media print { .remark-slide-scaler { width: 100% !important; height: 100% !important; transform: scale(1) !important; top: 0 !important; left: 0 !important; } } /* Modified metropolis */ .clear{ border-top: 0px solid #FAFAFA; } h1 { margin-top: -5px; margin-left: -00px; margin-bottom: 30px; color: var(--text-color-light); font-weight: 200; } h2, h3, h4 { padding-top: -15px; padding-bottom: 00px; color: #1A292C; text-shadow: none; font-weight: 
400; text-align: left; margin-left: 00px; margin-bottom: -10px; } .title-slide .inverse .remark-slide-content { background-color: #FAFAFA; } .title-slide { background-color: #FAFAFA; border-top: 80px solid #FAFAFA; } .title-slide h1 { color: var(--text-color); font-size: 40px; text-shadow: none; font-weight: 400; text-align: left; margin-left: 15px; } .title-slide h2 { margin-top: -15px; color: var(--link-color); text-shadow: none; font-weight: 300; font-size: 35px; text-align: left; margin-left: 15px; } .title-slide h3 { color: var(--text-color-light); text-shadow: none; font-weight: 300; font-size: 25px; text-align: left; margin-left: 15px; margin-bottom: 0px; } .title-slide h3:last-of-type { font-style: italic; font-size: 1rem; } /* Remove orange line */ hr, .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #e5e5e5; color: #e5e5e5; height: 1px; } hr, .mline h1::after { margin: 1em 15px 0 15px; } .title-slide h2::after { margin: 10px 15px 35px 0; } .mline h1::after { margin: 10px 15px 0 15px; } /* turns off slide numbers for title page: https://github.com/gnab/remark/issues/298 */ .title-slide .remark-slide-number { display: none; } /* Custom CSS */ /* More line spacing */ body { line-height: 1.5; } /* Font styling */ .hi { font-weight: 600; } .mono { font-family: monospace; } .ul { text-decoration: underline; } .ol { text-decoration: overline; } .st { text-decoration: line-through; } .bf { font-weight: bold; } .it { font-style: italic; } /* Font Sizes */ .bigger { font-size: 125%; } .huge{ font-size: 150%; } .small { font-size: 95%; } .smaller { font-size: 85%; } .smallest { font-size: 75%; } .tiny { font-size: 50%; } /* Remark customization */ .clear .remark-slide-number { display: none; } .inverse .remark-slide-number { display: none; } .remark-code-line-highlighted { background-color: rgba(249, 39, 114, 0.5); } /* Xaringan tweeks */ .inverse { background-color: #23373B; text-shadow: 0 0 20px #333; /* 
text-shadow: none; */ } .title-slide { background-color: #ffffff; border-top: 80px solid #ffffff; } .footnote { bottom: 1em; font-size: 80%; color: #7f7f7f; } /* Lists */ li { margin-top: 4px; } /* Mono-spaced font, smaller */ .mono-small { font-family: monospace; font-size: 16px; } .mono-small .mjx-chtml { font-size: 103% !important; } .pseudocode, .pseudocode-small { font-family: monospace; background: #f8f8f8; border-radius: 3px; padding: 10px; padding-top: 0px; padding-bottom: 0px; } .pseudocode-small { font-size: 16px; } .remark-code { font-size: 68%; } .remark-inline-code { background: #F5F5F5; /* lighter */ /* background: #e7e8e2; /* darker */ border-radius: 3px; padding: 4px; } /* Super and Subscripts */ .super{ vertical-align: super; font-size: 70%; line-height: 1%; } .sub{ vertical-align: sub; font-size: 70%; line-height: 1%; } /* Subheader */ .subheader{ font-weight: 100; font-style: italic; display: block; margin-top: -25px; margin-bottom: 25px; } /* 2/3 left; 1/3 right */ .more-left { float: left; width: 63%; } .less-right { float: right; width: 31%; } .more-right ~ * { clear: both; } /* 9/10 left; 1/10 right */ .left90 { padding-top: 0.7em; float: left; width: 85%; } .right10 { padding-top: 0.7em; float: right; width: 9%; } /* 95% left; 5% right */ .left95 { padding-top: 0.7em; float: left; width: 91%; } .right05 { padding-top: 0.7em; float: right; width: 5%; } .left5 { padding-top: 0.7em; margin-left: 0em; margin-right: -0.4em; float: left; width: 7%; } .left10 { padding-top: 0.7em; margin-left: -0.2em; margin-right: -0.5em; float: left; width: 10%; } .left30 { padding-top: 0.7em; float: left; width: 30%; } .right30 { padding-top: 0.7em; float: right; width: 30%; } .thin-left { padding-top: 0.7em; margin-left: -1em; margin-right: -0.5em; float: left; width: 27.5%; } /* Example */ .ex { font-weight: 300; color: #555F61 !important; font-style: italic; } .col-left { float: left; width: 47%; margin-top: -1em; } .col-right { float: right; width: 47%; 
margin-top: -1em; } .clear-up { clear: both; margin-top: -1em; } /* Format tables */ table { color: #000000; font-size: 14pt; line-height: 100%; border-top: 1px solid #ffffff !important; border-bottom: 1px solid #ffffff !important; } th, td { background-color: #ffffff; } table th { font-weight: 400; } /* Attention */ .attn { font-weight: 500; color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Note */ .note { font-weight: 300; font-style: italic; color: #314f4f !important; /* color: #cccccc !important; */ font-family: 'Zilla Slab' !important; } /* Question and answer */ .qa { font-weight: 500; /* color: #314f4f !important; */ color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Figure Caption */ .caption { font-size: 0.8888889em; line-height: 1.5; margin-top: 1em; color: #6b7280; } </style> <!-- From xaringancolor --> <div style = "position:fixed; visibility: hidden"> $$ \require{color} \definecolor{purple}{rgb}{0.337254901960784, 0.00392156862745098, 0.643137254901961} \definecolor{navy}{rgb}{0.0509803921568627, 0.23921568627451, 0.337254901960784} \definecolor{ruby}{rgb}{0.603921568627451, 0.145098039215686, 0.0823529411764706} \definecolor{alice}{rgb}{0.0627450980392157, 0.470588235294118, 0.584313725490196} \definecolor{daisy}{rgb}{0.92156862745098, 0.788235294117647, 0.266666666666667} \definecolor{coral}{rgb}{0.949019607843137, 0.427450980392157, 0.129411764705882} \definecolor{kelly}{rgb}{0.509803921568627, 0.576470588235294, 0.337254901960784} \definecolor{jet}{rgb}{0.0745098039215686, 0.0823529411764706, 0.0862745098039216} \definecolor{asher}{rgb}{0.333333333333333, 0.372549019607843, 0.380392156862745} \definecolor{slate}{rgb}{0.192156862745098, 0.309803921568627, 0.309803921568627} \definecolor{cranberry}{rgb}{0.901960784313726, 0.254901960784314, 0.450980392156863} $$ </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { purple: ["{\\color{purple}{#1}}", 1], navy: ["{\\color{navy}{#1}}", 
1], ruby: ["{\\color{ruby}{#1}}", 1], alice: ["{\\color{alice}{#1}}", 1], daisy: ["{\\color{daisy}{#1}}", 1], coral: ["{\\color{coral}{#1}}", 1], kelly: ["{\\color{kelly}{#1}}", 1], jet: ["{\\color{jet}{#1}}", 1], asher: ["{\\color{asher}{#1}}", 1], slate: ["{\\color{slate}{#1}}", 1], cranberry: ["{\\color{cranberry}{#1}}", 1] }, loader: {load: ['[tex]/color']}, tex: {packages: {'[+]': ['color']}} } }); </script> <style> .purple {color: #5601A4;} .navy {color: #0D3D56;} .ruby {color: #9A2515;} .alice {color: #107895;} .daisy {color: #EBC944;} .coral {color: #F26D21;} .kelly {color: #829356;} .jet {color: #131516;} .asher {color: #555F61;} .slate {color: #314F4F;} .cranberry {color: #E64173;} </style>

## Chapter 21: Comparing Two Means

---
# Two-Sample Framework

Comparing two populations is one of the most common situations in statistics. These are called .hi.kelly[two-sample problems].

We can divide a population into two groups, A and B

- .ex[Example:] Women vs. Men; Econ Majors vs. Non-Econ Majors; Treated vs. Control

We want to know if they differ along some measurable margin

- .ex[Example:] salary, hours of homework per week, health

This is different from the matched-pairs setup because:

- We have a separate sample for each group and we cannot match the observations

---
# Two-Sample Framework

Consider two groups, A and B. You have the following information for each group:
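<table>
<thead>
<tr> <th>Population/Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> </tr>
</thead>
<tbody>
<tr> <td>A</td> <td>\( \bar{X}_A \)</td> <td>\( \sigma_A \)</td> </tr>
<tr> <td>B</td> <td>\( \bar{X}_B \)</td> <td>\( \sigma_B \)</td> </tr>
</tbody>
</table>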
We use `\(\bar{X}_A\)` and `\(\bar{X}_B\)` to say something about the difference in population means, `\(\mu_A - \mu_B\)`

- Construct a confidence interval for `\(\mu_A-\mu_B\)`
- Test the hypothesis `\(H_0: \mu_A - \mu_B = 0\)`

---
# Conditions for Two-Sample Inference

We use `\(\bar{X}_A\)` and `\(\bar{X}_B\)` to say something about `\(\mu_A - \mu_B\)`

- We have two SRSs from two distinct populations
- The two samples are independent of one another
- We measure the same response variable for both samples
- Both populations are normally distributed
  - In practice, it is enough that the distributions have similar shapes and that the data have no strong outliers.

---
# Distribution of \\( \bar{X}_A\\) and \\(\bar{X}_B\\)

If the sample means satisfy `\(\bar{X}_i \sim N\left(\mu_i, \frac{\sigma^2_i}{n_i}\right)\)` for `\(i \in \{A, B\}\)`, then:

1. `\(\bar{X}_A - \bar{X}_B\)` is normally distributed

--

2. `\(E[\bar{X}_A - \bar{X}_B] = E[\bar{X}_A] - E[\bar{X}_B] = \mu_A - \mu_B\)`

--

3. `\(V[\bar{X}_A - \bar{X}_B] = V[\bar{X}_A] + V[\bar{X}_B] = \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}\)` (by independence)

--

To summarize:

$$ \bar{X}_A - \bar{X}_B \sim N\left(\mu_A-\mu_B, \frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}\right) $$

---
# Distribution of \\( \bar{X}_A\\) and \\(\bar{X}_B\\)

Therefore, when both `\(\sigma^2_A\)` and `\(\sigma^2_B\)` are known:

$$ \frac{(\bar{X}_A-\bar{X}_B) - (\mu_A-\mu_B)}{\sqrt{\frac{\sigma^2_A}{n_A} + \frac{\sigma^2_B}{n_B}}} \sim N(0,1) $$

---
# Distribution of \\( \bar{X}_A\\) and \\(\bar{X}_B\\)

As we mentioned, we don't always know the population variances, `\(\sigma^2_A\)` and `\(\sigma^2_B\)`. If we don't know these values, we can use the sample standard deviations `\(s_A\)` and `\(s_B\)` to estimate `\(\sigma_A\)` and `\(\sigma_B\)`.
The standard error for the difference in sample means is:

$$ SE_{\bar{X}_A-\bar{X}_B}=\sqrt{\frac{s^2_A}{n_A}+\frac{s^2_B}{n_B}} $$

---
# Distribution of \\( \bar{X}_A\\) and \\(\bar{X}_B\\)

Since we estimate the population standard deviations with the sample standard deviations, we should use the `\(\daisy{t}\)`-distribution. The statistic

$$ \frac{(\bar{X}_A-\bar{X}_B) - (\mu_A-\mu_B)}{\sqrt{\frac{s^2_A}{n_A} + \frac{s^2_B}{n_B}}} $$

can be approximated by the `\(\daisy{t}\)`-distribution, where the degrees of freedom is `\(\min\{n_A, n_B\}-1\)`

- Statistical software can be more exact, but the formulas get complicated

---
# Two-Sample Confidence Interval
<div class="subheader alice"> \( \sigma^2 \) Known </div>

A confidence interval for `\(\mu_A-\mu_B\)` with level of confidence `\(C\)`:

$$ (\bar{X}_A-\bar{X}_B) \pm Z^{(1-C)/2} \cdot \sqrt{\frac{\sigma^2_A}{n_A}+\frac{\sigma^2_B}{n_B}} $$

---
# Two-Sample Confidence Interval
<div class="subheader alice"> \( \sigma^2 \) Known </div>

Say we have two groups, athletes and non-athletes, and we're asked to construct a 95\% confidence interval for the difference in GPA, `\(\mu_A-\mu_{NA}\)`
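<table>
<thead>
<tr> <th>Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> <th>Sample Size</th> </tr>
</thead>
<tbody>
<tr> <td>Athletes</td> <td>\( \bar{X}=2.8 \)</td> <td>\( \sigma = 0.4 \)</td> <td>15</td> </tr>
<tr> <td>Non-athletes</td> <td>\( \bar{X}=2.9 \)</td> <td>\( \sigma = 0.5 \)</td> <td>25</td> </tr>
</tbody>
</table>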
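
As a numerical check on this example, here is a minimal Python sketch using only the standard library. The function name `two_sample_z_interval` and its parameters are illustrative choices, not part of the course materials; it assumes both population standard deviations are known.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(mean_a, sigma_a, n_a, mean_b, sigma_b, n_b, level):
    """Confidence interval for mu_A - mu_B when both sigmas are known."""
    # Critical value that leaves (1 - level) / 2 in the upper tail
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Standard deviation of the difference in sample means
    sd_diff = sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)
    diff = mean_a - mean_b
    return diff - z_crit * sd_diff, diff + z_crit * sd_diff


# Athletes vs. non-athletes, values taken from the table above
lower, upper = two_sample_z_interval(2.8, 0.4, 15, 2.9, 0.5, 25, level=0.95)
print(round(lower, 2), round(upper, 2))
```

The interval contains zero, so at the 95\% level these data are consistent with no difference in mean GPA.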
$$ CI= (2.8-2.9) \pm Z^{0.025}\cdot \sqrt{\frac{0.4^2}{15}+\frac{0.5^2}{25}} = [-0.38, 0.18] $$

---
# Two-Sample Confidence Interval
<div class="subheader alice"> \( \sigma^2 \) Unknown </div>

Since we have to estimate `\(\sigma^2\)` for both samples, we need to use the `\(\daisy{t}\)`-distribution, with `\(\min\{n_A, n_B\}-1\)` degrees of freedom, to find the critical value:

$$ (\bar{X}_A-\bar{X}_B) \pm t^{\frac{1-C}{2}}_{\min\{n_A, n_B\}-1} \cdot \sqrt{\frac{s^2_A}{n_A}+\frac{s^2_B}{n_B}} $$

---
# Two-Sample Confidence Interval
<div class="subheader alice"> \( \sigma^2 \) Unknown </div>

We have 2 groups of students, and we're asked to construct a 90\% confidence interval for the difference in test scores, `\(\mu_A-\mu_B\)`
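<table>
<thead>
<tr> <th>Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> <th>Sample Size</th> </tr>
</thead>
<tbody>
<tr> <td>Treated</td> <td>\( \bar{X} = 76 \)</td> <td>\( s = 9 \)</td> <td>60</td> </tr>
<tr> <td>Control</td> <td>\( \bar{X} = 73 \)</td> <td>\( s = 5 \)</td> <td>20</td> </tr>
</tbody>
</table>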
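
The same kind of check works when the variances are unknown. The sketch below is a hedged illustration, not the course's official method: `two_sample_t_interval` is a hypothetical helper, and the critical value 1.729 is assumed to be read from a t-table (upper-tail area 0.05, 19 degrees of freedom) rather than computed, since the Python standard library has no t-distribution.

```python
from math import sqrt


def two_sample_t_interval(mean_a, s_a, n_a, mean_b, s_b, n_b, t_crit):
    """Confidence interval for mu_A - mu_B using sample standard deviations."""
    # Standard error of the difference in sample means
    se = sqrt(s_a**2 / n_a + s_b**2 / n_b)
    diff = mean_a - mean_b
    return diff - t_crit * se, diff + t_crit * se


# Conservative degrees of freedom: min(n_A, n_B) - 1
df = min(60, 20) - 1

# Assumed t-table value for upper-tail area 0.05 with 19 degrees of freedom
T_CRIT = 1.729

# Treated vs. control, values taken from the table above
lower, upper = two_sample_t_interval(76, 9, 60, 73, 5, 20, T_CRIT)
print(df, round(lower, 2), round(upper, 2))
```

Passing the critical value in as an argument keeps the sketch dependency-free; statistical software would typically compute it, often with a less conservative degrees-of-freedom formula.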
--

$$ CI=(76-73) \pm t^{0.05}_{19} \cdot \sqrt{\frac{9^2}{60}+\frac{5^2}{20}} =[0.21, 5.79] $$

---
# Two-Sample Hypothesis Testing
<div class="subheader alice"> \( \sigma^2 \) Known </div>

Researchers are asking college graduates how old they were when they had their first job. Researchers are curious to see if students who attended state schools got jobs earlier in life than those who attended private colleges.
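<table>
<thead>
<tr> <th>Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> <th>Sample Size</th> </tr>
</thead>
<tbody>
<tr> <td>State Colleges</td> <td>\( \bar{X}= 18.19 \)</td> <td>\( \sigma = 3.8 \)</td> <td>20</td> </tr>
<tr> <td>Private Colleges</td> <td>\( \bar{X}= 20.98 \)</td> <td>\( \sigma = 4.2 \)</td> <td>20</td> </tr>
</tbody>
</table>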
Test the following hypothesis at the `\(\alpha=0.05\)` significance level, where A denotes state colleges and B denotes private colleges:

$$ H_0: \mu_A - \mu_B = 0 $$
$$ H_1: \mu_A - \mu_B <0 $$

---
# Two-Sample Hypothesis Testing
<div class="subheader alice"> \( \sigma^2 \) Known </div>

Calculate the p-value using:

$$ P(\bar{X}_A-\bar{X}_B \leq 18.19-20.98 \ \vert \ \mu_A - \mu_B = 0) $$

$$ P\left(\frac{\bar{X}_A-\bar{X}_B-(\mu_A-\mu_B)}{\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}} \leq \frac{18.19-20.98 - (0)}{\sqrt{\frac{3.8^2}{20}+\frac{4.2^2}{20}}} \right) $$

--

<br/>

`\(p\)`-value `\(= P(Z \leq -2.2) = 0.014 \implies \text{reject } H_0\)` because `\(p\)`-value `\(\leq \alpha=0.05\)`

---
# Two-Sample Hypothesis Testing
<div class="subheader alice"> \( \sigma^2 \) Unknown </div>

You want to test how attached individuals are to their friends, and whether that differs between people who volunteer for community service and those who do not.
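<table>
<thead>
<tr> <th>Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> <th>Sample Size</th> </tr>
</thead>
<tbody>
<tr> <td>Service</td> <td>\( \bar{X}= 105.32 \)</td> <td>\( s = 14.68 \)</td> <td>57</td> </tr>
<tr> <td>No Service</td> <td>\( \bar{X}= 96.82 \)</td> <td>\( s = 14.26 \)</td> <td>17</td> </tr>
</tbody>
</table>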
Test the following hypothesis at the `\(\alpha=0.01\)` level, where S denotes the service group and N the no-service group:

$$ H_0: \mu_S - \mu_N = 0 $$
$$ H_1: \mu_S - \mu_N \neq 0 $$

---
# Two-Sample Hypothesis Testing
<div class="subheader alice"> \( \sigma^2 \) Unknown </div>

$$ t = \frac{\bar{X}_S-\bar{X}_N - (\mu_S - \mu_N)}{\sqrt{\frac{s^2_S}{n_S} + \frac{s^2_N}{n_N}}} = \frac{105.32-96.82-(0)}{\sqrt{\frac{14.68^2}{57}+\frac{14.26^2}{17}}} $$

$$ \implies t = \frac{8.5}{3.9677} = 2.142 $$

Look at the t-table, in the row with degrees of freedom `\(= \min\{57, 17\} - 1 = 16\)`. Since `\(t_{16}^{0.025} = 2.12\)` and `\(t_{16}^{0.01} = 2.58\)`, the one-tailed probability is between 0.01 and 0.025, .hi[BUT] it's a two-tailed test, so we need to multiply these probabilities by 2:

`\(0.02 < p\)`-value `\(< 0.05 \implies\)` Do not reject the null at `\(\alpha = 0.01\)`

---
# Review of Chapter 21

In this chapter, we focus on making inferences about the difference between the means of two populations, using a separate sample from each

- Confidence intervals around the difference in means: \\( \bar{X}_A - \bar{X}_B \pm \\) margin of error
- Generally testing \\( H_0: \mu_A-\mu_B = 0 \\)
- You'll be given sample means (\\( \bar{X} \\)), standard deviations (\\( \sigma \\) or \\( s \\)) and sample size (\\( n \\)) of each sample.
<br/>

If you're given `\(\sigma\)`, use the Z-distribution

If you're given `\(s\)`, use the t-distribution (unless .hi[both] samples are large enough)

---
# Calculating Margin of Error with Two Samples

Variances known:

$$ (\bar{X}_A-\bar{X}_B) \pm Z^{\frac{1-C}{2}} \cdot \sqrt{\frac{\sigma^2_A}{n_A}+\frac{\sigma^2_B}{n_B}} $$

<br/>

Variances unknown:

$$ (\bar{X}_A-\bar{X}_B) \pm t^{\frac{1-C}{2}}_{\min\{n_A, n_B\}-1} \cdot \sqrt{\frac{s^2_A}{n_A}+\frac{s^2_B}{n_B}} $$

---
# Calculating the Test Statistic

Variances known:

$$ Z = \frac{\bar{X}_A-\bar{X}_B-(\mu_A-\mu_B)}{\sqrt{\frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}}} $$

<br/>

Variances unknown:

$$ t = \frac{\bar{X}_A - \bar{X}_B - (\mu_A - \mu_B)}{\sqrt{\frac{s^2_A}{n_A}+\frac{s^2_B}{n_B}}} $$

---
# Clicker Question

You're given the following information about the average length of careers in the NFL versus MLB.
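<table>
<thead>
<tr> <th>Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> <th>Sample Size</th> </tr>
</thead>
<tbody>
<tr> <td>NFL</td> <td>\( \bar{X}= 3.3 \)</td> <td>\( s = 2.1 \)</td> <td>20</td> </tr>
<tr> <td>MLB</td> <td>\( \bar{X}= 5.6 \)</td> <td>\( s = 3.5 \)</td> <td>17</td> </tr>
</tbody>
</table>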
You want to construct a 90% confidence interval. Given this information, calculate the margin of error:

<ol type = "a"> <li> 1.6 </li> <li> 1.645 </li> <li> 1.96 </li> </ol>

---
# Midterm Example

Researchers have developed a new drug designed to reduce blood pressure. In an experiment, 21 subjects were randomly assigned to the treatment group and received the experimental drug. The other 23 subjects were assigned to the control group and received a placebo treatment. A summary of these data is:
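<table>
<thead>
<tr> <th>Group</th> <th>Sample Mean</th> <th>Standard Deviation</th> <th>Sample Size</th> </tr>
</thead>
<tbody>
<tr> <td>Treatment</td> <td>\( \bar{X} = 23.48 \)</td> <td>\( s = 8.01 \)</td> <td>21</td> </tr>
<tr> <td>Placebo</td> <td>\( \bar{X} = 18.52 \)</td> <td>\( s = 7.15 \)</td> <td>13</td> </tr>
</tbody>
</table>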
We want to test whether there was any difference in means across these two groups:

<ol type = "a"> <li>State the null and alternative hypotheses </li> <li>Calculate the p-value or a range of p-values </li> <li>Do you reject at the \( \alpha=0.05 \) level? </li> <li>If you were incorrect in part c, what kind of error did you make? </li> </ol>

---
class: clear
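
The test statistics from this chapter's two hypothesis-testing examples can be reproduced with a short Python sketch. This is an illustration under stated assumptions rather than course-provided code: `two_sample_statistic` is a hypothetical helper, the normal p-value comes from the standard library's `statistics.NormalDist`, and the t-based p-value is left to a t-table because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_statistic(mean_a, sd_a, n_a, mean_b, sd_b, n_b, null_diff=0.0):
    """Standardized difference in sample means under H0: mu_A - mu_B = null_diff."""
    # sd_a and sd_b are sigma (variances known) or s (variances unknown)
    se = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b - null_diff) / se


# Variances known: state vs. private colleges, H1: mu_A - mu_B < 0
z = two_sample_statistic(18.19, 3.8, 20, 20.98, 4.2, 20)
p_left = NormalDist().cdf(z)  # left-tail p-value from the standard normal

# Variances unknown: service vs. no service, H1: mu_S - mu_N != 0
t = two_sample_statistic(105.32, 14.68, 57, 96.82, 14.26, 17)
df = min(57, 17) - 1  # conservative degrees of freedom for the t-table lookup

print(round(z, 2), round(p_left, 3))
print(round(t, 3), df)
```

The same function serves both cases because the Z and t statistics share one formula; only the reference distribution used for the p-value changes.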