class: center, middle, inverse, title-slide # ECON 3818 ## Chapter 6 ### Kyle Butts ### 26 September 2021 --- class: clear, middle <!-- Custom css --> <style type="text/css"> /* ------------------------------------------------------- * * !! This file was generated by xaringanthemer !! * * Changes made to this file directly will be overwritten * if you used xaringanthemer in your xaringan slides Rmd * ------------------------------------------------------- */ @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Roboto&display=swap); @import url(https://fonts.googleapis.com/css?family=Source+Code+Pro:400,700&display=swap); @import url(https://fonts.googleapis.com/css2?family=Atkinson+Hyperlegible&display=swap); :root { /* Fonts */ --text-font-family: 'Atkinson Hyperelegible'; --text-font-is-google: 1; --text-font-family-fallback: Roboto, -apple-system, BlinkMacSystemFont, avenir next, avenir, helvetica neue, helvetica, Ubuntu, roboto, noto, segoe ui, arial; --text-font-base: sans-serif; --header-font-family: 'Atkinson Hyperelegible' --header-font-is-google: 1; --header-font-family-fallback: Georgia, serif; --code-font-family: 'Source Code Pro'; --code-font-is-google: 1; --base-font-size: 20px; --text-font-size: 1rem; --code-font-size: 0.9rem; --code-inline-font-size: 1em; --header-h1-font-size: 1.75rem; --header-h2-font-size: 1.6rem; --header-h3-font-size: 1.5rem; /* Colors */ --text-color: #131516; --text-color-light: #555F61; --header-color: #FFF; --background-color: #FFF; --link-color: #107895; --code-highlight-color: rgba(255,255,0,0.5); --inverse-text-color: #d6d6d6; --inverse-background-color: #272822; --inverse-header-color: #f3f3f3; --inverse-link-color: #107895; --title-slide-background-color: #272822; --title-slide-text-color: #d6d6d6; --header-background-color: #FFF; --header-background-text-color: #FFF; } html { font-size: var(--base-font-size); } body { font-family: 
var(--text-font-family), var(--text-font-family-fallback), var(--text-font-base); font-weight: normal; color: var(--text-color); } h1, h2, h3 { font-family: var(--header-font-family), var(--header-font-family-fallback); color: var(--text-color-light); } .remark-slide-content { background-color: var(--background-color); font-size: 1rem; padding: 24px 32px 16px 32px; width: 100%; height: 100%; } .remark-slide-content h1 { font-size: var(--header-h1-font-size); } .remark-slide-content h2 { font-size: var(--header-h2-font-size); } .remark-slide-content h3 { font-size: var(--header-h3-font-size); } .remark-code, .remark-inline-code { font-family: var(--code-font-family), Menlo, Consolas, Monaco, Liberation Mono, Lucida Console, monospace; } .remark-code { font-size: var(--code-font-size); } .remark-inline-code { font-size: var(--code-inline-font-size); color: #000; } .remark-slide-number { color: #107895; opacity: 1; font-size: 0.9em; } a, a > code { color: var(--link-color); text-decoration: none; } .footnote { position: absolute; bottom: 60px; padding-right: 6em; font-size: 0.9em; } .remark-code-line-highlighted { background-color: var(--code-highlight-color); } .inverse { background-color: var(--inverse-background-color); color: var(--inverse-text-color); } .inverse h1, .inverse h2, .inverse h3 { color: var(--inverse-header-color); } .inverse a, .inverse a > code { color: var(--inverse-link-color); } img, video, iframe { max-width: 100%; } blockquote { border-left: solid 5px lightgray; padding-left: 1em; } @page { margin: 0; } @media print { .remark-slide-scaler { width: 100% !important; height: 100% !important; transform: scale(1) !important; top: 0 !important; left: 0 !important; } } /* Modified metropolis */ .clear{ border-top: 0px solid #FAFAFA; } h1 { margin-top: -5px; margin-left: -00px; margin-bottom: 30px; color: var(--text-color-light); font-weight: 200; } h2, h3, h4 { padding-top: -15px; padding-bottom: 00px; color: #1A292C; text-shadow: none; font-weight: 
400; text-align: left; margin-left: 00px; margin-bottom: -10px; } .title-slide .inverse .remark-slide-content { background-color: #FAFAFA; } .title-slide { background-color: #FAFAFA; border-top: 80px solid #FAFAFA; } .title-slide h1 { color: var(--text-color); font-size: 40px; text-shadow: none; font-weight: 400; text-align: left; margin-left: 15px; } .title-slide h2 { margin-top: -15px; color: var(--link-color); text-shadow: none; font-weight: 300; font-size: 35px; text-align: left; margin-left: 15px; } .title-slide h3 { color: var(--text-color-light); text-shadow: none; font-weight: 300; font-size: 25px; text-align: left; margin-left: 15px; margin-bottom: 0px; } .title-slide h3:last-of-type { font-style: italic; font-size: 1rem; } /* Remove orange line */ hr, .title-slide h2::after, .mline h1::after { content: ''; display: block; border: none; background-color: #e5e5e5; color: #e5e5e5; height: 1px; } hr, .mline h1::after { margin: 1em 15px 0 15px; } .title-slide h2::after { margin: 10px 15px 35px 0; } .mline h1::after { margin: 10px 15px 0 15px; } /* turns off slide numbers for title page: https://github.com/gnab/remark/issues/298 */ .title-slide .remark-slide-number { display: none; } /* Custom CSS */ /* More line spacing */ body { line-height: 1.5; } /* Font styling */ .hi { font-weight: 600; } .mono { font-family: monospace; } .ul { text-decoration: underline; } .ol { text-decoration: overline; } .st { text-decoration: line-through; } .bf { font-weight: bold; } .it { font-style: italic; } /* Font Sizes */ .bigger { font-size: 125%; } .huge{ font-size: 150%; } .small { font-size: 95%; } .smaller { font-size: 85%; } .smallest { font-size: 75%; } .tiny { font-size: 50%; } /* Remark customization */ .clear .remark-slide-number { display: none; } .inverse .remark-slide-number { display: none; } .remark-code-line-highlighted { background-color: rgba(249, 39, 114, 0.5); } /* Xaringan tweeks */ .inverse { background-color: #23373B; text-shadow: 0 0 20px #333; /* 
text-shadow: none; */ } .title-slide { background-color: #ffffff; border-top: 80px solid #ffffff; } .footnote { bottom: 1em; font-size: 80%; color: #7f7f7f; } /* Lists */ li { margin-top: 4px; } /* Mono-spaced font, smaller */ .mono-small { font-family: monospace; font-size: 16px; } .mono-small .mjx-chtml { font-size: 103% !important; } .pseudocode, .pseudocode-small { font-family: monospace; background: #f8f8f8; border-radius: 3px; padding: 10px; padding-top: 0px; padding-bottom: 0px; } .pseudocode-small { font-size: 16px; } .remark-code { font-size: 68%; } .remark-inline-code { background: #F5F5F5; /* lighter */ /* background: #e7e8e2; /* darker */ border-radius: 3px; padding: 4px; } /* Super and Subscripts */ .super{ vertical-align: super; font-size: 70%; line-height: 1%; } .sub{ vertical-align: sub; font-size: 70%; line-height: 1%; } /* Subheader */ .subheader{ font-weight: 100; font-style: italic; display: block; margin-top: -25px; margin-bottom: 25px; } /* 2/3 left; 1/3 right */ .more-left { float: left; width: 63%; } .less-right { float: right; width: 31%; } .more-right ~ * { clear: both; } /* 9/10 left; 1/10 right */ .left90 { padding-top: 0.7em; float: left; width: 85%; } .right10 { padding-top: 0.7em; float: right; width: 9%; } /* 95% left; 5% right */ .left95 { padding-top: 0.7em; float: left; width: 91%; } .right05 { padding-top: 0.7em; float: right; width: 5%; } .left5 { padding-top: 0.7em; margin-left: 0em; margin-right: -0.4em; float: left; width: 7%; } .left10 { padding-top: 0.7em; margin-left: -0.2em; margin-right: -0.5em; float: left; width: 10%; } .left30 { padding-top: 0.7em; float: left; width: 30%; } .right30 { padding-top: 0.7em; float: right; width: 30%; } .thin-left { padding-top: 0.7em; margin-left: -1em; margin-right: -0.5em; float: left; width: 27.5%; } /* Example */ .ex { font-weight: 300; color: #555F61 !important; font-style: italic; } .col-left { float: left; width: 47%; margin-top: -1em; } .col-right { float: right; width: 47%; 
margin-top: -1em; } .clear-up { clear: both; margin-top: -1em; } /* Format tables */ table { color: #000000; font-size: 14pt; line-height: 100%; border-top: 1px solid #ffffff !important; border-bottom: 1px solid #ffffff !important; } th, td { background-color: #ffffff; } table th { font-weight: 400; } /* Attention */ .attn { font-weight: 500; color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Note */ .note { font-weight: 300; font-style: italic; color: #314f4f !important; /* color: #cccccc !important; */ font-family: 'Zilla Slab' !important; } /* Question and answer */ .qa { font-weight: 500; /* color: #314f4f !important; */ color: #e64173 !important; font-family: 'Zilla Slab' !important; } /* Figure Caption */ .caption { font-size: 0.8888889em; line-height: 1.5; margin-top: 1em; color: #6b7280; } </style> <!-- From xaringancolor --> <div style = "position:fixed; visibility: hidden"> $$ \require{color} \definecolor{purple}{rgb}{0.337254901960784, 0.00392156862745098, 0.643137254901961} \definecolor{navy}{rgb}{0.0509803921568627, 0.23921568627451, 0.337254901960784} \definecolor{ruby}{rgb}{0.603921568627451, 0.145098039215686, 0.0823529411764706} \definecolor{alice}{rgb}{0.0627450980392157, 0.470588235294118, 0.584313725490196} \definecolor{daisy}{rgb}{0.92156862745098, 0.788235294117647, 0.266666666666667} \definecolor{coral}{rgb}{0.949019607843137, 0.427450980392157, 0.129411764705882} \definecolor{kelly}{rgb}{0.509803921568627, 0.576470588235294, 0.337254901960784} \definecolor{jet}{rgb}{0.0745098039215686, 0.0823529411764706, 0.0862745098039216} \definecolor{asher}{rgb}{0.333333333333333, 0.372549019607843, 0.380392156862745} \definecolor{slate}{rgb}{0.192156862745098, 0.309803921568627, 0.309803921568627} \definecolor{cranberry}{rgb}{0.901960784313726, 0.254901960784314, 0.450980392156863} $$ </div> <script type="text/x-mathjax-config"> MathJax.Hub.Config({ TeX: { Macros: { purple: ["{\\color{purple}{#1}}", 1], navy: ["{\\color{navy}{#1}}", 
1], ruby: ["{\\color{ruby}{#1}}", 1], alice: ["{\\color{alice}{#1}}", 1], daisy: ["{\\color{daisy}{#1}}", 1], coral: ["{\\color{coral}{#1}}", 1], kelly: ["{\\color{kelly}{#1}}", 1], jet: ["{\\color{jet}{#1}}", 1], asher: ["{\\color{asher}{#1}}", 1], slate: ["{\\color{slate}{#1}}", 1], cranberry: ["{\\color{cranberry}{#1}}", 1] }, loader: {load: ['[tex]/color']}, tex: {packages: {'[+]': ['color']}} } }); </script> <style> .purple {color: #5601A4;} .navy {color: #0D3D56;} .ruby {color: #9A2515;} .alice {color: #107895;} .daisy {color: #EBC944;} .coral {color: #F26D21;} .kelly {color: #829356;} .jet {color: #131516;} .asher {color: #555F61;} .slate {color: #314F4F;} .cranberry {color: #E64173;} </style> ## Chapter 6: Two-Way Tables --- # Introduction This chapter discusses the relationship between two categorical variables - To analyze categorical data, we use the .it[counts] or .it[percentages] of individuals that fall into various categories --- # Two-Way Table Typically, published data is grouped in order to save space. This chapter will talk about .hi.purple[Two-Way Tables]
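
<table>
<thead>
<tr><th></th><th>Associate's</th><th>Bachelor's</th><th>Master's</th><th>Doctorate</th></tr>
</thead>
<tbody>
<tr><td>Women</td><td>673</td><td>1,050</td><td>481</td><td>95</td></tr>
<tr><td>Men</td><td>401</td><td>780</td><td>342</td><td>89</td></tr>
</tbody>
</table>
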
This two-way table describes two categorical variables. The sex of an individual is the .hi.kelly[row variable], and the degree attained is the .hi.coral[column variable].

---
# Joint Distribution

The joint distribution is found by dividing each cell by the total count.

`$$P(x,y)=\frac{\text{number of times x and y occur together}}{\text{total number of occurrences}}$$`

Since we have 3,911 people in total, we get the following joint probabilities:
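
<table>
<thead>
<tr><th></th><th>Associate's</th><th>Bachelor's</th><th>Master's</th><th>Doctorate</th></tr>
</thead>
<tbody>
<tr><td>Women</td><td>673</td><td>1,050</td><td>481</td><td>95</td></tr>
<tr><td>Joint</td><td>17%</td><td>27%</td><td>12%</td><td>2%</td></tr>
<tr><td>Men</td><td>401</td><td>780</td><td>342</td><td>89</td></tr>
<tr><td>Joint</td><td>10%</td><td>20%</td><td>9%</td><td>2%</td></tr>
</tbody>
</table>
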
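
The joint percentages in the table can be reproduced with a few lines of code. The following is a minimal Python sketch, not part of the original slides; the variable names (`counts`, `total`, `joint`) are illustrative assumptions.

```python
# Counts from the two-way table (rows: sex, columns: degree attained).
counts = {
    "Women": {"Associate's": 673, "Bachelor's": 1050, "Master's": 481, "Doctorate": 95},
    "Men": {"Associate's": 401, "Bachelor's": 780, "Master's": 342, "Doctorate": 89},
}

# Grand total: every individual described by the table.
total = sum(sum(row.values()) for row in counts.values())

# Joint distribution: each cell count divided by the grand total.
joint = {
    sex: {degree: n / total for degree, n in row.items()}
    for sex, row in counts.items()
}

# Print each joint probability as a percentage.
for sex, row in joint.items():
    for degree, p in row.items():
        print(f"P({sex}, {degree}) = {p:.1%}")
```

Because every individual falls into exactly one cell, the joint probabilities sum to 1.
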
This means the probability of being a woman **and** having a Master's degree is 12%.

---
# Marginal Distribution

A .hi.alice[marginal distribution] is the probability distribution of only one of the random variables.

To calculate it, we look at the distribution of each variable *separately*, using the "Total" column and the "Total" row.

A two-way table has two marginal distributions: the "row" marginal and the "column" marginal.

---
# Row Marginal

In this scenario, the row marginal is the distribution of sex alone:
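
<table>
<thead>
<tr><th></th><th>Associate's</th><th>Bachelor's</th><th>Master's</th><th>Doctorate</th><th>Row Marginal</th></tr>
</thead>
<tbody>
<tr><td>Women</td><td>673</td><td>1,050</td><td>481</td><td>95</td><td>2,299</td></tr>
<tr><td>Joint</td><td>17%</td><td>27%</td><td>12%</td><td>2%</td><td>59%</td></tr>
<tr><td>Men</td><td>401</td><td>780</td><td>342</td><td>89</td><td>1,612</td></tr>
<tr><td>Joint</td><td>10%</td><td>20%</td><td>9%</td><td>2%</td><td>41%</td></tr>
</tbody>
</table>
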
58.8% of individuals in this sample of degree holders are women (2,299 of 3,911).

---
# Column Marginal

In this scenario, the column marginal is the distribution of degrees alone:
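
<table>
<thead>
<tr><th></th><th>Associate's</th><th>Bachelor's</th><th>Master's</th><th>Doctorate</th></tr>
</thead>
<tbody>
<tr><td>Women</td><td>673</td><td>1,050</td><td>481</td><td>95</td></tr>
<tr><td>Joint</td><td>17%</td><td>27%</td><td>12%</td><td>2%</td></tr>
<tr><td>Men</td><td>401</td><td>780</td><td>342</td><td>89</td></tr>
<tr><td>Joint</td><td>10%</td><td>20%</td><td>9%</td><td>2%</td></tr>
<tr><td>Column Marginal</td><td>0.275</td><td>0.468</td><td>0.210</td><td>0.047</td></tr>
</tbody>
</table>
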
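
Both marginals come from summing counts across one variable and dividing by the grand total. The sketch below is a minimal Python illustration, not part of the original slides, and its variable names are assumptions. Its last step divides each cell by its row total, which is how the conditional distributions on the following slides are obtained.

```python
# Counts from the two-way table (rows: sex, columns: degree attained).
counts = {
    "Women": {"Associate's": 673, "Bachelor's": 1050, "Master's": 481, "Doctorate": 95},
    "Men": {"Associate's": 401, "Bachelor's": 780, "Master's": 342, "Doctorate": 89},
}
total = sum(sum(row.values()) for row in counts.values())
degrees = list(counts["Women"])

# Row marginal: distribution of sex alone (row totals / grand total).
row_marginal = {sex: sum(row.values()) / total for sex, row in counts.items()}

# Column marginal: distribution of degree alone (column totals / grand total).
col_marginal = {d: sum(counts[sex][d] for sex in counts) / total for d in degrees}

# Conditional distribution of degree given sex: cell count / row total,
# which equals the joint probability divided by the row marginal.
degree_given_sex = {
    sex: {d: n / sum(row.values()) for d, n in row.items()}
    for sex, row in counts.items()
}

print({sex: f"{p:.1%}" for sex, p in row_marginal.items()})
print({d: f"{p:.1%}" for d, p in col_marginal.items()})
print({d: f"{p:.1%}" for d, p in degree_given_sex["Women"].items()})
```

Each of the three distributions sums to 1, which is a quick check that the totals were computed over the right dimension.
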
The probability of an individual having a Bachelor's degree is 47%.

---
# Conditional Distribution

We can use these tables to back out conditional distributions. Remember:

`$$P(A \vert B)=\frac{P(A \cap B)}{P(B)}$$`

This is the same expression (just different notation) as:

`$$P(x\vert y)=\frac{P(x,y)}{P(y)}$$`

---
# Conditional Distribution

For example, we can calculate the probability of holding each degree, given the individual is a woman:

$$ P(\text{Associate's} \vert \text{Woman}) = \frac{P(\text{Associate's and Woman})}{P(\text{Woman})} = \frac{673/3911}{2299/3911} \approx 29.3\% $$

$$ P(\text{Bachelor's} \vert \text{Woman}) = \frac{P(\text{Bachelor's and Woman})}{P(\text{Woman})} = \frac{1050/3911}{2299/3911} \approx 45.7\% $$

$$ P(\text{Master's} \vert \text{Woman}) = \frac{P(\text{Master's and Woman})}{P(\text{Woman})} = \frac{481/3911}{2299/3911} \approx 20.9\% $$

$$ P(\text{Doctorate} \vert \text{Woman}) = \frac{P(\text{Doctorate and Woman})}{P(\text{Woman})} = \frac{95/3911}{2299/3911} \approx 4.1\% $$

---
# Conditional Distribution

We could also calculate the probability that an individual is a particular sex, given that they hold an Associate's degree:

$$ P(\text{Man} \vert \text{Associate's}) = \frac{P(\text{Man and Associate's})}{P(\text{Associate's})} = \frac{401/3911}{1074/3911} \approx 37.3\% $$

$$ P(\text{Woman} \vert \text{Associate's}) = \frac{P(\text{Woman and Associate's})}{P(\text{Associate's})} = \frac{673/3911}{1074/3911} \approx 62.7\% $$

---
# Joint Probabilities vs. Conditional Probabilities

- Joint probabilities take into account the probability that each event happens on its own

- Conditional probabilities assume that one event has already happened

---
# Joint Probability vs. Conditional Probability: Example

$$ P(\text{work in tech job} \cap \text{live in Boulder}) \text{ vs. } P(\text{work in tech job} \ \vert \ \text{live in Boulder}) $$

- `\(P(\text{work in tech})\)` = work in tech/entire US population = relatively small, let's say `\(7\%\)`

- `\(P(\text{live in Boulder})\)` = Boulder population/entire US population = also small, `\(<1\%\)`

This means the probability of BOTH happening is small, because both events are unlikely relative to the sample space of the entire US population.

- But `\(P(\text{work in tech} \ \vert \ \text{live in Boulder})\)` will be higher, because the sample space is now the Boulder population, which has a greater concentration of high-tech employees

---
# Marginal and Conditional Distributions

Here are some formal definitions:

The .hi[marginal distribution] of one of the categorical variables in a two-way table of counts is the .hi[distribution of that variable among all individuals described by the table].

- the distribution of sex alone, or of degrees alone

A .hi[conditional distribution] of a variable is the .hi[distribution of values of that variable among only individuals who have a given value of the other variable]. There is a separate conditional distribution for each value of the other variable.

- there are two sets of conditional distributions for any two-way table: here, the probability of holding each degree given sex, and the probability of being each sex given the degree held

---
# Clicker Question
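
<table>
<thead>
<tr><th></th><th>First</th><th>Second</th><th>Third</th><th>Crew</th></tr>
</thead>
<tbody>
<tr><td>Alive</td><td>203</td><td>118</td><td>178</td><td>212</td></tr>
<tr><td>Dead</td><td>122</td><td>167</td><td>528</td><td>673</td></tr>
<tr><td>Total</td><td>325</td><td>285</td><td>706</td><td>885</td></tr>
</tbody>
</table>
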
Given this two-way table of counts, what is the probability of survival, given you are a first-class passenger?

<ol type = "a">
<li>9%</li>
<li>62%</li>
<li>1.3%</li>
<li>30%</li>
</ol>

---
class: clear, center, middle

<iframe width="840" height="472.5" src="https://www.youtube.com/embed/sxYrzzy3cq8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

.center[[Source](https://ed.ted.com/lessons/how-statistics-can-be-misleading-mark-liddell)]

---
class: clear, center, middle

<iframe width="840" height="472.5" src="https://www.youtube.com/embed/t-Ci3FosqZs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

.center[[Source](https://www.youtube.com/watch?v=t-Ci3FosqZs)]

---
# Simpson's Paradox

Consider the survival rates for the following groups of accident victims who were taken to the hospital, either by helicopter or by car:
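
<table>
<thead>
<tr><th></th><th>Helicopter</th><th>Car</th></tr>
</thead>
<tbody>
<tr><td>Victim Died</td><td>64</td><td>260</td></tr>
<tr><td>Victim Survived</td><td>136</td><td>840</td></tr>
<tr><td>Total</td><td>200</td><td>1,100</td></tr>
</tbody>
</table>
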
The probability of dying, conditional on being transported by helicopter, is higher than the probability of dying conditional on being transported by car. Does this mean that this (more costly) mode of transportation isn't helping?
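
<table>
<thead>
<tr><th></th><th>Helicopter</th><th>Car</th></tr>
</thead>
<tbody>
<tr><td>Victim Died</td><td>32%</td><td>24%</td></tr>
<tr><td>Victim Survived</td><td>68%</td><td>76%</td></tr>
</tbody>
</table>
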
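
The conditional death rates can be checked directly from the counts. This is a minimal Python sketch, not part of the original slides; the dictionary names are illustrative assumptions.

```python
# Aggregate counts from the helicopter vs. car table.
died = {"Helicopter": 64, "Car": 260}
survived = {"Helicopter": 136, "Car": 840}

# P(died | mode) = deaths for that mode / all victims transported by that mode.
death_rate = {
    mode: died[mode] / (died[mode] + survived[mode])
    for mode in died
}

for mode, rate in death_rate.items():
    print(f"P(died | {mode}) = {rate:.1%}")
```

Each rate conditions on the mode of transport, so the denominators are the column totals (200 and 1,100), not the 1,300 victims overall.
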

---
# Simpson's Paradox

The idea is that there is a confounding variable, the severity of the accident, whose distribution differs between helicopter and car transports.

<div class = "pull-left">
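<table>
<caption>Serious Accidents</caption>
<thead>
<tr><th></th><th>Helicopter</th><th>Car</th></tr>
</thead>
<tbody>
<tr><td>Victim Died</td><td>48</td><td>60</td></tr>
<tr><td>Victim Survived</td><td>52</td><td>40</td></tr>
<tr><td>Total</td><td>100</td><td>100</td></tr>
</tbody>
</table>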
</div> <div class = "pull-right">
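<table>
<caption>Less Serious Accidents</caption>
<thead>
<tr><th></th><th>Helicopter</th><th>Car</th></tr>
</thead>
<tbody>
<tr><td>Victim Died</td><td>16</td><td>200</td></tr>
<tr><td>Victim Survived</td><td>84</td><td>800</td></tr>
<tr><td>Total</td><td>100</td><td>1,000</td></tr>
</tbody>
</table>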
</div> <br/> <div class = "pull-left">
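<table>
<caption>Serious Accidents</caption>
<thead>
<tr><th></th><th>Helicopter</th><th>Car</th></tr>
</thead>
<tbody>
<tr><td>Victim Died</td><td>48%</td><td>60%</td></tr>
<tr><td>Victim Survived</td><td>52%</td><td>40%</td></tr>
</tbody>
</table>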
</div> <div class = "pull-right">
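<table>
<caption>Less Serious Accidents</caption>
<thead>
<tr><th></th><th>Helicopter</th><th>Car</th></tr>
</thead>
<tbody>
<tr><td>Victim Died</td><td>16%</td><td>20%</td></tr>
<tr><td>Victim Survived</td><td>84%</td><td>80%</td></tr>
</tbody>
</table>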
</div>

---
# Simpson's Paradox

<img src="data:image/png;base64,#ch6_files/figure-html/unnamed-chunk-21-1.svg" style="display: block; margin: auto;" />

---
# Clicker Question

From 2010 to 2013, the US median wage increased 1%. Over the same period, however, the median wage decreased within each education subgroup (high school dropouts, high school graduates, some college, bachelor's or more). Which of the following explanations is consistent with Simpson's paradox?

<ol type = "a">
<li>The BLS didn't control for inflation</li>
<li>There are more people with bachelor's degrees (high income people)</li>
<li>The wage of the highest income earner went up even more</li>
<li>There are fewer unemployed people now</li>
</ol>
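
---
# Simpson's Paradox: Checking the Numbers

Returning to the helicopter example, the reversal can be verified numerically. The following is a minimal Python sketch, not part of the original slides; the `data` layout and the `death_rate` helper are illustrative assumptions.

```python
# (died, survived) counts by accident severity and mode of transport.
data = {
    "Serious": {"Helicopter": (48, 52), "Car": (60, 40)},
    "Less serious": {"Helicopter": (16, 84), "Car": (200, 800)},
}


def death_rate(died, survived):
    """Share of victims who died."""
    return died / (died + survived)


# Death rate within each severity group (conditioning on severity).
by_severity = {
    severity: {mode: death_rate(*cell) for mode, cell in modes.items()}
    for severity, modes in data.items()
}

# Pooled death rate: add the counts across severity groups first.
pooled = {}
for mode in ("Helicopter", "Car"):
    died = sum(data[severity][mode][0] for severity in data)
    survived = sum(data[severity][mode][1] for severity in data)
    pooled[mode] = death_rate(died, survived)

for severity, rates in by_severity.items():
    print(severity, {mode: f"{r:.1%}" for mode, r in rates.items()})
print("Pooled", {mode: f"{r:.1%}" for mode, r in pooled.items()})
```

Within each severity group the helicopter death rate is lower, yet the pooled rate is higher, because half of the helicopter transports are serious accidents compared with 100 of the 1,100 car transports.
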