class: center, middle, inverse, title-slide # ECON 3818 ## Chapter 8 ### Kyle Butts ### 10 August 2021 --- exclude: true --- class: clear, middle <!-- Custom css --> <style type="text/css"> /* ------------------------------------------------------- * * !! This file was generated by xaringanthemer !! * * Changes made to this file directly will be overwritten * if you used xaringanthemer in your xaringan slides Rmd * ------------------------------------------------------- */ @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700&display=swap); :root { /* Fonts */ --text-font-family: Roboto; --text-font-is-google: 1; --text-font-family-fallback: -apple-system, BlinkMacSystemFont, avenir next, avenir, helvetica neue, helvetica, Ubuntu, roboto, noto, segoe ui, arial; --text-font-base: sans-serif; --header-font-family: Roboto; --header-font-is-google: 1; --header-font-family-fallback: Georgia, serif; --code-font-family: 'Source Code Pro'; --code-font-is-google: 1; --base-font-size: 20px; --text-font-size: 1rem; --code-font-size: 0.9rem; --code-inline-font-size: 1em; --header-h1-font-size: 1.75rem; --header-h2-font-size: 1.6rem; --header-h3-font-size: 1.5rem; /* Colors */ --text-color: #131516; --text-color-light: #555F61; --header-color: #FFF; --background-color: #FFF; --link-color: #107895; --code-highlight-color: rgba(255,255,0,0.5); --inverse-text-color: #d6d6d6; --inverse-background-color: #272822; --inverse-header-color: #f3f3f3; --inverse-link-color: #107895; --title-slide-background-color: #272822; --title-slide-text-color: #d6d6d6; --header-background-color: #FFF; --header-background-text-color: #FFF; } html { font-size: var(--base-font-size); } body { font-family: var(--text-font-family), var(--text-font-family-fallback), var(--text-font-base); font-weight: normal; color: var(--text-color); } h1, 
h2, h3 { font-family: var(--header-font-family), var(--header-font-family-fallback); color: var(--text-color-light); } .remark-slide-content { background-color: var(--background-color); font-size: 1rem; padding: 24px 32px 16px 32px; width: 100%; height: 100%; } .remark-slide-content h1 { font-size: var(--header-h1-font-size); } .remark-slide-content h2 { font-size: var(--header-h2-font-size); } .remark-slide-content h3 { font-size: var(--header-h3-font-size); } .remark-code, .remark-inline-code { font-family: var(--code-font-family), Menlo, Consolas, Monaco, Liberation Mono, Lucida Console, monospace; } .remark-code { font-size: var(--code-font-size); } .remark-inline-code { font-size: var(--code-inline-font-size); color: #000; } .remark-slide-number { color: #107895; opacity: 1; font-size: 0.9em; } a, a > code { color: var(--link-color); text-decoration: none; } .footnote { position: absolute; bottom: 60px; padding-right: 6em; font-size: 0.9em; } .remark-code-line-highlighted { background-color: var(--code-highlight-color); } .inverse { background-color: var(--inverse-background-color); color: var(--inverse-text-color); } .inverse h1, .inverse h2, .inverse h3 { color: var(--inverse-header-color); } .inverse a, .inverse a > code { color: var(--inverse-link-color); } img, video, iframe { max-width: 100%; } blockquote { border-left: solid 5px lightgray; padding-left: 1em; } @page { margin: 0; } @media print { .remark-slide-scaler { width: 100% !important; height: 100% !important; transform: scale(1) !important; top: 0 !important; left: 0 !important; } } /* Modified metropolis */ .clear{ border-top: 0px solid #FAFAFA; } h1 { margin-top: -5px; margin-left: -00px; margin-bottom: 30px; color: var(--text-color-light); font-weight: 200; } h2, h3, h4 { padding-top: -15px; padding-bottom: 00px; color: #1A292C; text-shadow: none; font-weight: 400; text-align: left; margin-left: 00px; margin-bottom: -10px; } .title-slide .inverse .remark-slide-content { background-color: 
#FAFAFA; } .title-slide { background-color: #FAFAFA; border-top: 80px solid #FAFAFA; } .title-slide h1 { color: var(--text-color); font-size: 40px; text-shadow: none; font-weight: 400; text-align: left; margin-left: 15px; } .title-slide h2 { margin-top: -15px; color: var(--link-color); text-shadow: none; font-weight: 300; font-size: 35px; text-align: left; margin-left: 15px; } .title-slide h3 { color: var(--text-color-light); text-shadow: none; font-weight: 300; font-size: 25px; text-align: left; margin-left: 15px; margin-bottom: 0px; } .title-slide h3:last-of-type { font-style: italic; font-size: 1rem; } /* Remove orange line */ hr, .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #e5e5e5; color: #e5e5e5; height: 1px; } hr, .mline h1::after { margin: 1em 15px 0 15px; } .title-slide h2::after { margin: 10px 15px 35px 0; } .mline h1::after { margin: 10px 15px 0 15px; } /* turns off slide numbers for title page: https://github.com/gnab/remark/issues/298 */ .title-slide .remark-slide-number { display: none; } /* Custom CSS */ /* More line spacing */ body { line-height: 1.5; } /* Font styling */ .hi { font-weight: 600; } .mono { font-family: monospace; } .ul { text-decoration: underline; } .ol { text-decoration: overline; } .st { text-decoration: line-through; } .bf { font-weight: bold; } .it { font-style: italic; } /* Font Sizes */ .bigger { font-size: 125%; } .huge{ font-size: 150%; } .small { font-size: 95%; } .smaller { font-size: 85%; } .smallest { font-size: 75%; } .tiny { font-size: 50%; } /* Remark customization */ .clear .remark-slide-number { display: none; } .inverse .remark-slide-number { display: none; } .remark-code-line-highlighted { background-color: rgba(249, 39, 114, 0.5); } /* Xaringan tweeks */ .inverse { background-color: #23373B; text-shadow: 0 0 20px #333; /* text-shadow: none; */ } .title-slide { background-color: #ffffff; border-top: 80px solid #ffffff; } .footnote { bottom: 1em; font-size: 
80%; color: #7f7f7f; } /* Lists */ li { margin-top: 4px; } /* Mono-spaced font, smaller */ .mono-small { font-family: monospace; font-size: 16px; } .mono-small .mjx-chtml { font-size: 103% !important; } .pseudocode, .pseudocode-small { font-family: monospace; background: #f8f8f8; border-radius: 3px; padding: 10px; padding-top: 0px; padding-bottom: 0px; } .pseudocode-small { font-size: 16px; } .remark-code { font-size: 68%; } .remark-inline-code { background: #F5F5F5; /* lighter */ /* background: #e7e8e2; /* darker */ border-radius: 3px; padding: 4px; } /* Super and Subscripts */ .super{ vertical-align: super; font-size: 70%; line-height: 1%; } .sub{ vertical-align: sub; font-size: 70%; line-height: 1%; } /* Subheader */ .subheader{ font-weight: 100; font-style: italic; display: block; margin-top: -25px; margin-bottom: 25px; } /* 2/3 left; 1/3 right */ .more-left { float: left; width: 63%; } .less-right { float: right; width: 31%; } .more-right ~ * { clear: both; } /* 9/10 left; 1/10 right */ .left90 { padding-top: 0.7em; float: left; width: 85%; } .right10 { padding-top: 0.7em; float: right; width: 9%; } /* 95% left; 5% right */ .left95 { padding-top: 0.7em; float: left; width: 91%; } .right05 { padding-top: 0.7em; float: right; width: 5%; } .left5 { padding-top: 0.7em; margin-left: 0em; margin-right: -0.4em; float: left; width: 7%; } .left10 { padding-top: 0.7em; margin-left: -0.2em; margin-right: -0.5em; float: left; width: 10%; } .left30 { padding-top: 0.7em; float: left; width: 30%; } .right30 { padding-top: 0.7em; float: right; width: 30%; } .thin-left { padding-top: 0.7em; margin-left: -1em; margin-right: -0.5em; float: left; width: 27.5%; } /* Example */ .ex { font-weight: 300; color: #555F61 !important; font-style: italic; } .col-left { float: left; width: 47%; margin-top: -1em; } .col-right { float: right; width: 47%; margin-top: -1em; } .clear-up { clear: both; margin-top: -1em; } /* Format tables */ table { color: #000000; font-size: 14pt; line-height: 
100%; border-top: 1px solid #ffffff !important; border-bottom: 1px solid #ffffff !important; } th, td { background-color: #ffffff; } table th { font-weight: 400; } /* Attention */ .attn { font-weight: 500; color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Note */ .note { font-weight: 300; font-style: italic; color: #314f4f !important; /* color: #cccccc !important; */ font-family: 'Zilla Slab' !important; } /* Question and answer */ .qa { font-weight: 500; /* color: #314f4f !important; */ color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Figure Caption */ .caption { font-size: 0.8888889em; line-height: 1.5; margin-top: 1em; color: #6b7280; } </style> <!-- From xaringancolor --> <div style = "position:fixed; visibility: hidden"> $$ \require{color} \definecolor{purple}{rgb}{0.337254901960784, 0.00392156862745098, 0.643137254901961} \definecolor{navy}{rgb}{0.0509803921568627, 0.23921568627451, 0.337254901960784} \definecolor{ruby}{rgb}{0.603921568627451, 0.145098039215686, 0.0823529411764706} \definecolor{alice}{rgb}{0.0627450980392157, 0.470588235294118, 0.584313725490196} \definecolor{daisy}{rgb}{0.92156862745098, 0.788235294117647, 0.266666666666667} \definecolor{coral}{rgb}{0.949019607843137, 0.427450980392157, 0.129411764705882} \definecolor{kelly}{rgb}{0.509803921568627, 0.576470588235294, 0.337254901960784} \definecolor{jet}{rgb}{0.0745098039215686, 0.0823529411764706, 0.0862745098039216} \definecolor{asher}{rgb}{0.333333333333333, 0.372549019607843, 0.380392156862745} \definecolor{slate}{rgb}{0.192156862745098, 0.309803921568627, 0.309803921568627} \definecolor{cranberry}{rgb}{0.901960784313726, 0.254901960784314, 0.450980392156863} $$ </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { purple: ["{\\color{purple}{#1}}", 1], navy: ["{\\color{navy}{#1}}", 1], ruby: ["{\\color{ruby}{#1}}", 1], alice: ["{\\color{alice}{#1}}", 1], daisy: ["{\\color{daisy}{#1}}", 1], coral: 
["{\\color{coral}{#1}}", 1], kelly: ["{\\color{kelly}{#1}}", 1], jet: ["{\\color{jet}{#1}}", 1], asher: ["{\\color{asher}{#1}}", 1], slate: ["{\\color{slate}{#1}}", 1], cranberry: ["{\\color{cranberry}{#1}}", 1] }, loader: {load: ['[tex]/color']}, tex: {packages: {'[+]': ['color']}} } }); </script> <style> .purple {color: #5601A4;} .navy {color: #0D3D56;} .ruby {color: #9A2515;} .alice {color: #107895;} .daisy {color: #EBC944;} .coral {color: #F26D21;} .kelly {color: #829356;} .jet {color: #131516;} .asher {color: #555F61;} .slate {color: #314F4F;} .cranberry {color: #E64173;} </style>

## Chapter 8: Producing Data -- Sampling

---

# Population vs. Sample

Recall our definitions of population and sample from earlier in the semester:

- .hi.purple[population]: the entire group about which we want information
- .hi.kelly[sample]: the part of the population for which we collect information

We use information from the sample to draw conclusions about the population as a whole.

To ensure *accurate* inferences about the population, we must follow a sampling design:

- .hi.alice[sampling design]: describes exactly how to choose a sample from the population
- .hi.ruby[sample survey]: describes the population of interest, the sample to be surveyed, and the way in which they are surveyed (the .alice[sampling design]).

---

# Sample Survey Example

<img src="data:image/png;base64,#cpsbls.png" width="70%" style="display: block; margin: auto;" />

---

# Sample Survey Example

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#bls_example.jpg" alt="Graphic from https://twitter.com/crampell/status/1423661884759289858/" width="90%" />
<p class="caption">Graphic from https://twitter.com/crampell/status/1423661884759289858/</p>
</div>

---

# Clicker Question

An online store contacts 1000 customers from its list of customers who have purchased in the last year. In all, 696 of the 1000 say that they are very satisfied with the store's website.
The .hi[population] in this setting is:

<ol type = "a">
<li>all customers who have purchased something in the last year</li>
<li>the 1000 who were contacted</li>
<li>the 696 customers who were very satisfied with the store's website</li>
</ol>

---

# Clicker Question

An opinion poll calls 2000 randomly chosen residential telephone numbers in Portland and asks to speak with an adult member of the household. The interviewer asks, "How many movies have you watched in a movie theatre in the past 12 months?" In all, 831 people respond.

The .hi[sample] in this study is:

<ol type = "a">
<li>all adults living in Portland</li>
<li>all 2000 residential phone numbers called</li>
<li>the 831 people who responded</li>
</ol>

---

# Importance of "good" sampling

In order to draw inferences about the population by using a sample, that sample must be representative. The choice of sample design may hinder representation.

The following are examples of poor sample designs:

- A .hi.cranberry[convenience sample] surveys members of the population that are easiest to reach
- A .hi.cranberry[voluntary response sample] consists of people who choose to participate by responding to a general appeal

We say that a sample is .hi.coral[biased] if it systematically favors certain outcomes.

---

# Sample Bias

> The late film critic Pauline Kael is reported to have said that Nixon couldn't have won because she didn't know anybody who voted for him.
> - Jonah Goldberg

--

Among the people she surveyed (*friends*), the proportion of people who voted for Nixon was 0 (*sample statistic*).

---

# Sample Bias

.pull-left[
<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#survivorship-bias.png" alt="Figure: Bullet holes from returning planes" width="100%" />
<p class="caption">Figure: Bullet holes from returning planes</p>
</div>
]

.pull-right[
In WWII, the US Navy received this diagram that said "bullet holes of the planes that returned."

Where should you put extra armor on the planes?
]

---

# Sample Bias

<img src="data:image/png;base64,#survivorship-bias-2.jpg" width="65%" style="display: block; margin: auto;" />

---

# Sample Bias

<img src="data:image/png;base64,#sample.png" width="70%" style="display: block; margin: auto;" />

---

# Simple Random Samples

The best way to avoid sample bias is to randomly sample.

- .hi.kelly[Random sampling]: the use of chance to select a sample
- .hi.coral[Simple random sample]: consists of `\(n\)` individuals from the population chosen in such a way that every set of `\(n\)` individuals has an equal chance to be the sample actually selected

Examples:

- Flip a coin to choose someone in the sample
- Assign each individual a number and choose the sample by randomly generating a set of numbers

---

# Stratified SRS

We might group individuals into .hi.daisy[strata] if we are confident they are similar. Selecting an SRS within each stratum leads to better representation. This process is called .hi.daisy[stratified SRS].

<br>

.ex[Example:] An SRS from the set of all households in the U.S. may inadvertently select a disproportionate number of single-parent households, leading to an unrepresentative sample.

Solution:

1. Create two .daisy[strata]: one for single-parent households and one for two-parent households
2. Choose an SRS within each stratum.
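
.ex[Sketch:] The two steps above could look roughly like this in Python. This is only an illustration under made-up assumptions: the household list, the 25/75 allocation, the seed, and the helper name `stratified_srs` are all hypothetical and not part of the course materials.

```python
import random


def stratified_srs(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw an SRS of the requested size within each stratum."""
    rng = random.Random(seed)

    # Step 1: create the strata
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)

    # Step 2: choose an SRS (without replacement) within each stratum
    return {
        name: rng.sample(members, n_per_stratum[name])
        for name, members in strata.items()
    }


# Hypothetical population: 1,000 households, one in four is single-parent
households = [
    (i, "single-parent" if i % 4 == 0 else "two-parent") for i in range(1000)
]

# Hypothetical allocation that mirrors the population shares (25% / 75%)
allocation = {"single-parent": 25, "two-parent": 75}

sample = stratified_srs(households, lambda h: h[1], allocation, seed=3818)

for name, chosen in sample.items():
    print(name, len(chosen))
```

Because the sample size within each stratum is fixed in advance, the share of single-parent households in the sample cannot drift away from its population share by chance.
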
---

# Stratified SRS

<img src="data:image/png;base64,#stratified.png" width="50%" style="display: block; margin: auto;" />

---

# Clicker Question

If I randomly choose a sample of 40 from a classroom of 300, what is the probability that any given student is selected into the sample?

<ol type = "a">
<li>0.003</li>
<li>0.13</li>
<li>0.025</li>
</ol>

---

# Inference

Recall that we use samples to give us information about a larger population.

This process of drawing conclusions about a population on the basis of sample data is called .hi.purple[inference].

- Results from sample inference come with .hi.alice[margins of error]: bounds on our inferred estimates of population properties.
- These margins of error decrease as the sample size approaches the population size.

---

# Inference

.hi[Biased sample design will lead to biased inference.]

- With biased inference, we cannot make any trustworthy claims about the population.
- The sample you use will determine your results.

---

# Threats to Valid Inference

A few common sources of bias are:

- .hi[Undercoverage]: when some groups in the population are left out of the process of choosing the sample
- .hi[Oversampling]: when some groups are sampled more often than others in a way that is not representative of the population
- .hi[Nonresponse]: when an individual chosen for the sample can't be contacted or refuses to participate
- .hi[Response Bias]: a systematic pattern of incorrect responses in a sample survey
- .hi[Wording Effect]: a systematic pattern of responses due to poor (or manipulated) wording of survey questions

---

# Clicker Question

Individuals are randomly selected to receive a text message link to an online survey. The survey asks, "Rate your dissatisfaction with Boulder's Soda Tax."
This survey suffers from:

<ol type = "a">
<li>Oversampling</li>
<li>Nonresponse</li>
<li>Wording effect</li>
<li>Both B and C</li>
</ol>

---

# Clicker Question

At a large university, a simple random sample of 5 female professors is selected, and a simple random sample of 10 male professors is selected. The two samples are combined to give an overall sample of 15 professors. The overall sample is:

<ol type = "a">
<li>a simple random sample</li>
<li>biased due to imbalance</li>
<li>a stratified sample</li>
<li>All of the answer options are correct</li>
</ol>

---

# Clicker Question -- Midterm Example

A sociologist studying freshmen at a major university carried out a survey, asking (among other questions) how often students went out per week, how many hours they studied per day, and how many hours they slept at night. The sociologist, who would like a simple random sample but finds it too time-consuming to obtain such a sample, decides to use all students enrolled in his own class. This type of sample:

<ol type = "a">
<li>is a convenience sample.</li>
<li>likely results in undercoverage of certain types of freshmen.</li>
<li>could lead to biased conclusions.</li>
<li>All of the answer options are correct.</li>
</ol>
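
---

# Convenience Sample vs. SRS: A Sketch

A rough Python illustration of how a convenience sample like the one above can mislead. Everything here is hypothetical: the population of 2,000 freshmen, the assumption that the 100 students in the sociologist's class study 4 hours a day while everyone else studies 2, and the seed are invented purely for the example.

```python
import random

# Hypothetical population: (student_id, in_sociologists_class, study_hours_per_day)
# Assumption for illustration only: students in the class study more than the rest.
population = [(i, i < 100, 4.0 if i < 100 else 2.0) for i in range(2000)]


def mean_hours(students):
    """Average daily study hours for a list of students."""
    return sum(s[2] for s in students) / len(students)


# Convenience sample: every student enrolled in the sociologist's own class
convenience = [s for s in population if s[1]]

# Simple random sample of the same size, drawn from all freshmen
rng = random.Random(3818)
srs = rng.sample(population, len(convenience))

pop_mean = mean_hours(population)
conv_mean = mean_hours(convenience)
srs_mean = mean_hours(srs)

print(f"population mean:         {pop_mean:.2f}")
print(f"convenience sample mean: {conv_mean:.2f}")
print(f"SRS mean:                {srs_mean:.2f}")
```

The convenience sample only ever sees students from one class, so its mean reflects that class rather than freshmen as a whole. The SRS can still miss the population mean, but only through chance error, not through a systematic tilt.
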