class: center, middle, inverse, title-slide .title[ # Econometrics ] .subtitle[ ## Estimation
Confidence Intervals
Hypothesis Testing ] .author[ ### Florian Oswald ] .date[ ### UniTo ESOMAS 2025-10-14 ]

---
layout: true

<div class="my-footer"><img src="../img/logo/unito-shield.png" style="height: 60px;"/></div>

---

# Estimators (based on SW chapter 3)

* An *estimator* is a function of a sample of data. Because the sample is random, so is the estimator.

* An *estimate* is the numerical result **produced** by an estimator for a given sample. It's a number, not a random variable.

--

* Let us focus on estimating the population mean `\(E(Y)=\mu_Y\)`.

* Given a set of sample data, how could we estimate this quantity?

---

# Estimators for `\(E(Y)\)`

* We already met `\(\bar{y}\)` as a good estimator for `\(E(Y)\)`.

* But it's not the only option: we could also have used the average of the two extremes, `\(\tilde{y} = \frac{\max(y) + \min(y)}{2}\)`, no? Or what about the *last* value `\(y_n\)` in our i.i.d. sample of data `\(\{y_i\}_{i=1}^n\)`?

* Generically, we often write `\(\hat{x}\)` for an estimator of the population quantity `\(x\)`.

--

### What do we care about when choosing an estimator?

1. Unbiasedness
2. Consistency
3. Efficiency

---

# Unbiasedness

* We have already heard that being unbiased simply means that
  `$$E(\hat{x}) = x,$$`
  for example we saw
  `$$E(\bar{y}) = \mu_Y.$$`

* Is the last value `\(y_n\)` an unbiased estimator for `\(\mu_Y\)`?

---

# Consistency

* We also heard already that we like estimators where some form of *law of large numbers* applies.

--

* This is because we acknowledge that there is randomness in **any** estimate we could come up with.

* But ***at least*** we want to use the power of a lot of information (large `\(n\)`) to make sure that this randomness can be made arbitrarily small.

---

# Consistency

* We also heard already that we like estimators where some form of *law of large numbers* applies.
.right-wide[  ]

.left-thin[
<br>

* `\(\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i\)` depends on `\(n\)`

* We wrote `\(\bar{y} \overset{p}{\to} \mu_Y\)`

* Is it also true that `\(y_n\overset{p}{\to} \mu_Y\)`? Why or why not?

]

---

# Efficiency

* If two estimators `\(\hat{x}_1\)` and `\(\hat{x}_2\)`, both unbiased, have variances such that
  `$$var(\hat{x}_1) > var(\hat{x}_2)$$`
  then we say that `\(\hat{x}_2\)` is *more efficient* than `\(\hat{x}_1\)` - because it uses the same amount of data in a more efficient way.

--

* Both `\(\bar{y}\)` and `\(y_n\)` are unbiased. Which is more efficient?

---
class: inverse

# Task: what about this estimator?

Define

`$$\tilde{y} = \frac{1}{n}\left(\frac{1}{2} y_1 + \frac{3}{2} y_2 + \frac{1}{2} y_3 + \dots + \frac{3}{2} y_n \right)$$`

and assume `\(n\)` is an even number.

1. Show that this is unbiased.
2. Show that this is consistent.
3. Show that the sample average is more efficient than this.

---

# BLUE

* The estimator on the previous slide is very similar to the sample mean - it just weights the observations differently.

* We have seen that the main problem of `\(\tilde{y}\)` is that it has a larger variance than `\(\bar{y}\)`.

* In fact, among all unbiased estimators which are weighted averages of the `\(y_i\)`, the sample mean is the most efficient.

* We say `\(\bar{y}\)` is the Best Linear Unbiased Estimator (BLUE).

---

# Sample Variance and Standard Error

* We often don't know the variance of the population either - so we need to estimate it.

* We call `\(s_Y^2\)` the *sample variance*, and it's an estimator for the population variance `\(\sigma_Y^2\)`
  `$$s_Y^2 = \frac{1}{n-1}\sum_{i=1}^n\left(y_i - \bar{y} \right)^2$$`

* This is quite similar to the population variance. What's different?
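---

# Checking Estimator Properties by Simulation

The properties from the preceding slides can be checked numerically. The following is a minimal sketch in base R, not part of the original course code: the population values `mu` and `sigma`, the sample size `n`, the number of replications `reps` and all object names (`sims`, `est_means`, `est_vars`) are illustrative assumptions. It draws many normal samples and compares `\(\bar{y}\)`, `\(y_n\)` and the weighted estimator `\(\tilde{y}\)`, as well as the variance estimator with divisor `\(n-1\)` against the one with divisor `\(n\)`.

```r
# Monte Carlo sketch; all numbers below are illustrative assumptions.
set.seed(1)
mu    <- 2      # assumed population mean
sigma <- 3      # assumed population standard deviation
n     <- 10     # sample size (even, as the weighted estimator requires)
reps  <- 20000  # number of simulated samples

# alternating weights 1/2, 3/2, 1/2, ... of the weighted estimator
w <- rep(c(0.5, 1.5), times = n / 2)

sims <- replicate(reps, {
  y  <- rnorm(n, mean = mu, sd = sigma)
  ss <- sum((y - mean(y))^2)   # sum of squared deviations from ybar
  c(ybar   = mean(y),          # sample mean
    ylast  = y[n],             # last observation only
    ytilde = sum(w * y) / n,   # weighted average from the task slide
    s2_n1  = ss / (n - 1),     # sample variance (divide by n - 1)
    s2_n   = ss / n)           # naive version (divide by n)
})

# average of each estimator across all simulated samples
est_means <- rowMeans(sims)
# sampling variance of the three estimators of the mean
est_vars  <- apply(sims[c("ybar", "ylast", "ytilde"), ], 1, var)

round(est_means, 2)
round(est_vars, 2)
```

In theory, the first three averages should all be close to `mu`, the variance of `ybar` should be close to `\(\sigma^2/n = 0.9\)` and the smallest of the three, and `s2_n` should average around `\(\frac{n-1}{n}\sigma^2 = 8.1\)` rather than `\(\sigma^2 = 9\)`.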
---

# Degrees of Freedom

* We are dividing by `\(n-1\)` instead of `\(n\)` and we subtract `\(\bar{y}\)` instead of `\(\mu_Y\)`.

--

* Well, we use `\(\bar{y}\)` because we don't know `\(\mu_Y\)`.

--

* Let's show where the `\((n-1)\)` comes from.

---
class: inverse

# Task: Show where the `\(n-1\)` sample correction comes from

We want to show that `\(\mathbb{E}\big[(Y_i-\overline{Y})^2\big]\)` is not exactly the same as `\(\mathbb{E}\big[(Y_i-\mu_Y)^2\big]\)`, the latter being the population variance `\(\sigma^2\)`. In particular, show that the expected sum of squared deviations from `\(\overline{Y}\)` is `\((n-1)\sigma^2\)` rather than `\(n\sigma^2\)`:

`$$\mathbb{E}\!\left[\sum_{i=1}^n (Y_i-\overline{Y})^2\right] = n\sigma^2\left(1 - \frac{1}{n}\right) = (n-1)\sigma^2.$$`

* You can see that for small `\(n\)`, 10 say, that makes quite a difference.

---

# Standard error of `\(\bar{y}\)`

* The estimator for the standard *deviation* of `\(\bar{y}\)`, `\(\left( \frac{\sigma_Y}{\sqrt{n}}\right)\)`, has a special name: we call it the **standard error**.

* We write `\(SE(\bar{y}) = \hat{\sigma}_{\bar{y}} = \frac{s_Y}{\sqrt{n}}\)`

* Again, observe that this is identical to `\(\left( \frac{\sigma_Y}{\sqrt{n}}\right)\)` (the std. deviation of `\(\bar{y}\)`) ***except*** for the fact that we cannot use `\(\sigma_Y\)` - because we often don't know it.

* `\(s_Y\)` is the square root of the sample variance `\(s_Y^2\)` we introduced on the previous slides!

---

# Confidence Intervals

* A *Confidence Interval* (CI) for a given population parameter `\(\theta\)` is a *random* interval whose endpoints are statistics (they are random variables).

* A CI contains `\(\theta\)` with a preassigned probability.

* For example, a 95% confidence interval for `\(\theta\)` is constructed such that it contains `\(\theta\)` in 95% of repeated samples.

* The CI is called an **interval estimate** (as opposed to a *point* estimate - which is what the sample mean is).

---

# Confidence Intervals for the Population Mean

* Consider an i.i.d. sample `\(\{Y_i\}\)` with unknown mean `\(\mu_Y\)` and **known** variance `\(\sigma_Y^2\)`.
* Standardizing any random variable `\(X\)` means
  `$$Z = \frac{X - E(X)}{sd(X)}, \qquad E(Z) = 0, \quad Var(Z) = 1$$`
  This guarantees mean 0 and variance 1, but does not imply that `\(Z\)` is normally distributed unless `\(X\)` itself is normal.

* For the sample mean, by the Central Limit Theorem (CLT), for large `\(n\)` we have
  `$$\frac{\bar{Y} - \mu_Y}{\sigma_Y / \sqrt{n}} \approx \mathcal{N}(0,1)$$`
  and if the `\(\{Y_i\}\)` are normally distributed, this relationship holds **exactly** for any `\(n\)`.

---

# Confidence Intervals for the Population Mean

* Thus, the standardized sample mean is approximately standard normal when `\(\sigma_Y\)` is known (exactly so if the `\(\{Y_i\}\)` are normal).

* If `\(\sigma_Y\)` is **unknown**, we replace it with the sample standard deviation `\(s_Y\)` and obtain
  `$$T = \frac{\bar{Y} - \mu_Y}{s_Y / \sqrt{n}} \sim t_{n-1}$$`
  (exactly so if the `\(\{Y_i\}\)` are normal), which converges to `\(\mathcal{N}(0,1)\)` as `\(n \to \infty\)`.

---

# Confidence Intervals for the population mean in large samples

* We want to find two numbers `\(c_L,c_H\)` to form an interval `\(\left[c_L,c_H\right]\)` around `\(\mu\)` with confidence `\(1-\alpha\)`:
  `$$\Pr\left( c_L \leq \mu \leq c_H \right) = 1- \alpha,$$`
  or, writing `\(s\)` for the standard deviation of `\(\bar{y}\)`,
  `$$\Pr\left( \frac{\bar{y} - c_H}{s} \leq \frac{\bar{y} - \mu}{s} \leq \frac{\bar{y} - c_L}{s} \right) = 1- \alpha,$$`

---

# Confidence Intervals for the population mean in large samples

* We said that `\(\frac{\bar{y} - E(Y)}{s}\)` is approximately standard normal. *Large sample* means we will assume it **is** normal.
* Also, we want this interval to be symmetric, so that
  `$$\bar{y} - c_H = -\left(\bar{y} - c_L\right)$$`

* So, to compute this
  `$$\Pr\left( \frac{\bar{y} - c_H}{s} \leq \frac{\bar{y} - \mu}{s} \leq \frac{\bar{y} - c_L}{s} \right) = 1- \alpha,$$`
  we write
  `$$\Phi\left(\frac{\bar{y} - c_L}{s} \right) - \Phi\left(-\frac{\bar{y} - c_L}{s} \right) = 1 - \alpha$$`

---

# Confidence Intervals for the population mean in large samples

Finally, using the symmetry of `\(\Phi\)`,

`$$\Phi\left(\frac{\bar{y} - c_L}{s} \right) - \Phi\left(-\frac{\bar{y} - c_L}{s} \right) = 1 - \alpha$$`

becomes

`$$2\Phi\left(\frac{\bar{y} - c_L}{s} \right) - 1 = 1 - \alpha$$`

or

`$$\Phi\left(\frac{\bar{y} - c_L}{s} \right) = 1 - \frac{\alpha}{2}$$`

---

# Confidence Intervals for the population mean in large samples

* We choose a value for `\(\alpha\)`. Say `\(\alpha = 0.05\)`
  `$$\Phi\left(\frac{\bar{y} - c_L}{s} \right) = 1 - \frac{0.05}{2} = 0.975$$`

* We need to find the value for `\(c_L\)`! Easy - just invert the `\(\Phi\)` function (or look it up in your probability table for the standard normal)
  `$$\frac{\bar{y} - c_L}{s} = \Phi^{-1}\left(0.975\right)$$`
  ( `qnorm(0.975) = 1.959964` `\(\approx 1.96\)`) which we can easily solve for `\(c_L\)` now and get
  `$$c_L = \bar{y} - 1.96 s, \quad c_H = \bar{y} + 1.96 s$$`

---

# What if `\(\sigma^2\)` is not known?!

* Then we just use a consistent estimator for it, i.e. we use `\(SE(\bar{y})\)`!

* Remember that this standard error depends on `\(n\)`, and so will the distribution of our test statistic:
  `$$\frac{\bar{Y} - \mu_Y}{S / \sqrt{n}} \sim t_{n-1}$$`

* For small `\(n\)` this is now `\(t\)` distributed - we have to work with the t-distribution table instead of `\(\Phi\)` as above.

* As a rule of thumb, for any `\(n>30\)` we can use the normal approximation again.

---

# Demo

``` r
library(shinyCLT)
CLT()
```

---

# Packages used in this set of slides

``` r
library(tidyverse)
library(infer)
library(moderndive)
```

---

# Is There Gender Discrimination In Promotions?
<!-- also: https://www.ncbi.nlm.nih.gov/pubmed/12083361 -->

.pull-left[

* An article published in the *Journal of Applied Psychology* in 1974 investigated whether female employees at banks were discriminated against.

* 48 supervisors were given *identical* candidate CVs - identical up to the first name, which was male or female.

* Many similar experiments have been conducted with other groups that can be identified from typical name choice, for example Arabic, Black or Jewish names.

]

--

.pull-right[

``` r
library(moderndive)
promotions
```

```
## # A tibble: 48 × 3
##       id decision gender
##    <int> <fct>    <fct>
##  1     1 promoted male
##  2     2 promoted male
##  3     3 promoted male
##  4     4 promoted male
##  5     5 promoted male
##  6     6 promoted male
##  7     7 promoted male
##  8     8 promoted male
##  9     9 promoted male
## 10    10 promoted male
## # ℹ 38 more rows
```

]

---

# Looking At Promotions

.pull-left[

<img src="hypothesis_files/figure-html/unnamed-chunk-4-1.svg" style="display: block; margin: auto;" />

]

.pull-right[

``` r
promotions %>%
  group_by(gender, decision) %>%
  summarize(n = n()) %>%
  mutate(proportion = n / sum(n))
```

```
## # A tibble: 4 × 4
## # Groups:   gender [2]
##   gender decision     n proportion
##   <fct>  <fct>    <int>      <dbl>
## 1 male   not          3      0.125
## 2 male   promoted    21      0.875
## 3 female not         10      0.417
## 4 female promoted    14      0.583
```

* 87.5% of "men" were promoted.

* 58.3% of "women" were promoted.

* That's a difference of 87.5% - 58.3% = 29.2%.

* Is the 29% advantage for men in this sample **conclusive evidence**?

* In a *hypothetical world* **without gender discrimination**, could we have observed a 29% difference *by chance*?

]

---

# Imposing A Hypothetical World: No Gender Discrimination

.pull-left[

* Suppose we lived in a world without gender discrimination.

* The label `gender` in our dataframe would be meaningless.

* Let's randomly reassign `gender` to each row and see how this affects the result.
* Suppose we have 48 playing cards: 24 red (female) and 24 black (male).

* Shuffle the cards, lay them down in a row, and record `f` if the card is **red**.

]

--

.pull-right[

``` r
bind_cols(promotions,
          dplyr::select(promotions_shuffled,
                        decision_shuffled = decision,
                        gender_shuffled = gender))
```

| id | decision | gender | decision_shuffled | gender_shuffled |
|---:|:---------|:-------|:------------------|:----------------|
| 1 | promoted | male | promoted | female |
| 2 | promoted | male | promoted | female |
| 3 | promoted | male | promoted | male |
| 4 | promoted | male | promoted | female |
| 5 | promoted | male | promoted | male |
| 6 | promoted | male | promoted | male |
| 7 | promoted | male | promoted | male |
| 8 | promoted | male | promoted | female |
| 9 | promoted | male | promoted | male |
| 10 | promoted | male | promoted | female |
| 11 | promoted | male | promoted | male |
| 12 | promoted | male | promoted | female |
| 13 | promoted | male | promoted | female |
| 14 | promoted | male | promoted | male |
| 15 | promoted | male | promoted | male |
| 16 | promoted | male | promoted | male |
| 17 | promoted | male | promoted | male |
| 18 | promoted | male | promoted | female |
| 19 | promoted | male | promoted | female |
| 20 | promoted | male | promoted | male |
| 21 | promoted | male | promoted | female |
| 22 | promoted | female | promoted | male |
| 23 | promoted | female | promoted | female |
| 24 | promoted | female | promoted | male |
| 25 | promoted | female | promoted | male |
| 26 | promoted | female | promoted | male |
| 27 | promoted | female | promoted | male |
| 28 | promoted | female | promoted | female |
| 29 | promoted | female | promoted | female |
| 30 | promoted | female | promoted | male |
| 31 | promoted | female | promoted | male |
| 32 | promoted | female | promoted | female |
| 33 | promoted | female | promoted | female |
| 34 | promoted | female | promoted | female |
| 35 | promoted | female | promoted | female |
| 36 | not | male | not | male |
| 37 | not | male | not | female |
| 38 | not | male | not | female |
| 39 | not | female | not | male |
| 40 | not | female | not | male |
| 41 | not | female | not | male |
| 42 | not | female | not | female |
| 43 | not | female | not | male |
| 44 | not | female | not | female |
| 45 | not | female | not | female |
| 46 | not | female | not | female |
| 47 | not | female | not | female |
| 48 | not | female | not | male |

* Observe how in `promotions_shuffled` we randomly reassigned `gender`.

* The `decision` column is the same!

* What does this now look like?

]

---

# Reshuffled Promotions

.pull-left[

<img src="hypothesis_files/figure-html/unnamed-chunk-7-1.svg" style="display: block; margin: auto;" />

]

.pull-right[

``` r
promotions %>%
  group_by(gender, decision) %>%
  summarize(n = n()) %>%
  mutate(proportion = n / sum(n))
```

```
## # A tibble: 4 × 4
## # Groups:   gender [2]
##   gender decision     n proportion
##   <fct>  <fct>    <int>      <dbl>
## 1 male   not          3      0.125
## 2 male   promoted    21      0.875
## 3 female not         10      0.417
## 4 female promoted    14      0.583
```

``` r
promotions_shuffled %>%
  group_by(gender, decision) %>%
  summarize(n = n()) %>%
  mutate(proportion = n / sum(n))
```

```
## # A tibble: 4 × 4
## # Groups:   gender [2]
##   gender decision     n proportion
##   <fct>  <fct>    <int>      <dbl>
## 1 male   not          6      0.25
## 2 male   promoted    18      0.75
## 3 female not          7      0.292
## 4 female promoted    17      0.708
```

]

---

# Sampling Variation?

.pull-left[

* In the hypothetical world, the difference was only 4.2%.

* But what's the role of *sampling variation*? How representative of that hypothetical world is 4.2%?

* Let's construct the sampling distribution ourselves!

]

--

.pull-right[

1. You need to shuffle a deck of 48 cards, 24 red, 24 black, and lay out card after card in front of you.
2. You do **not** put the cards back into the deck!
3. You could use the function `sample` for example. Look at `?sample` to find out more.
4. Fill in your results into [this shared spreadsheet](https://docs.google.com/spreadsheets/d/118NlPUQjd13XodX7IGri8J-oLgkObdCtDEFri65tWmo/edit?usp=sharing)!

]

---

# Sampling Variation in Reshuffling

.pull-left[

<img src="hypothesis_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" />

]

.pull-right[

* This distribution was created in our **hypothetical** scenario: no discrimination.

* We see how sampling variation affects the difference in promotion rates.
* The red line denotes the *observed difference* in the **real world**.

* Now: How *likely* is it that the red line is part of this **hypothetical** distribution?

]

---

# Recap

.pull-left[

* We just did a **permutation test**. We randomly reshuffled and checked if it makes a difference.

* This is again resampling: bootstrapping is with replacement, permutation is without.

* Bootstrapping: we put the paper slips **back** after recording them.

* Permutation: we took card after card from our deck (*without* putting it back!)

]

--

.pull-right[

* We observed the estimate `\(\hat{p}_m - \hat{p}_f = 29\%\)` in the real world.

* We *tested* whether, in a hypothetical universe with no discrimination, a difference of 29% is *likely* to occur.

* We concluded *rather not*. We tended to **reject** that hypothesis.

* The real question was: is 29% **really** different from zero? What is the role of sampling variation?

]

---
layout: false
class: title-slide-section-red, middle

# Hypothesis Testing Setup

---
layout: true

<div class="my-footer"><img src="../img/logo/unito-shield.png" style="height: 60px;"/></div>

---

# Hypothesis Test Notation and Definitions

.pull-left[

* In hypothesis testing we compare two **competing hypotheses**.

* In our example:
  `$$\begin{align}H_0:& p_m - p_f = 0\\H_A:& p_m - p_f > 0\end{align}$$`

* `\(H_0\)` stands for the **null hypothesis**, where *no effect* is observed. That's our hypothetical world from above.

* `\(H_A\)` or `\(H_1\)` is the **alternative** hypothesis. Here, we have a *one-sided* alternative, saying that `\(p_m > p_f\)`, i.e. women are discriminated against. The *two-sided* formulation is just `\(H_A: p_m - p_f \neq 0\)`

]

--

.pull-right[

* A **test statistic** is a summary statistic which we use to summarise a certain aspect of our sample.
  Here: `\(\hat{p}_m - \hat{p}_f\)`

* The *observed test statistic* is the number we get from our real world sample: `\(\hat{p}_m - \hat{p}_f = 29\%\)`

* The **null distribution** is the sampling distribution of our test statistic, assuming the Null hypothesis is **true**. That's our hypothetical world without discrimination.

* We have seen such a null distribution just above:

]

---

# Null Distribution

.left-wide[

<img src="hypothesis_files/figure-html/unnamed-chunk-11-1.svg" style="display: block; margin: auto;" />

]

.right-thin[

* This **is** the sampling distribution of `\(\hat{p}_m - \hat{p}_f\)`, assuming `\(H_0\)` is true.

* The red line is the *observed* test statistic.

]

---

# P-Value and Significance Level `\(\alpha\)`

.pull-left[

* The **p-value** is the probability of observing a test statistic *at least as extreme* as the one we obtained, assuming `\(H_0\)` is true. 🤔

* How *strong* a piece of evidence is it to observe `\(\hat{p}_m - \hat{p}_f=29\%\)` in a world where `\(p_m - p_f=0\)` is assumed true? Very strong? Not so strong?

* How many samples did we obtain that had a difference of *at least* 29%? Many, or not so many?

* The p-value quantifies this by measuring the probability to the right of the red line in the previous plot.

]

--

.pull-right[

* The **significance level** `\(\alpha\)` is a *cutoff* on the p-value.

* We choose it *before* conducting our hypothesis test. It's common to choose `\(\alpha = 5\%\)`.

* If the p-value falls below the cutoff `\(\alpha\)`, we **reject** the null hypothesis on the grounds that *what we observe is too unlikely to happen* under the Null.

* Small p-value: The red line is *too far* from the center of the Null distribution. Observing the red line would have happened with very small probability only.
]

---
layout: false
class: title-slide-section-red, middle

# Conducting Hypothesis Tests

---
layout: true

<div class="my-footer"><img src="../img/logo/unito-shield.png" style="height: 60px;"/></div>

---
background-image: url(../img/photos/ht.png)
background-size: 800px
background-position: 60% 60%

# Testing with `infer`

---

# `infer` Testing Pipeline

.pull-left[

* Here we follow closely the [infer workflow](https://moderndive.com/9-hypothesis-testing.html#infer-workflow-ht) given in moderndive.

* We augment our previous pipeline with the `hypothesize` function, defining the type of null hypothesis.

* Also, we give a `formula` to `specify()` this time, instead of only a variable name as before.

* We create the Null Distribution by *reshuffling* (deck of cards), and *not* by *resampling* (pennies).

]

--

.pull-right[

``` r
null_distribution <- promotions %>%
  # takes formula, defines success
  specify(formula = decision ~ gender,
          success = "promoted") %>%
  # decisions are independent of gender
  hypothesize(null = "independence") %>%
  # generate 1000 reshufflings of data
  generate(reps = 1000, type = "permute") %>%
  # compute p_m - p_f from each reshuffle
  calculate(stat = "diff in props",
            order = c("male", "female"))
null_distribution
```

```
## Response: decision (factor)
## Explanatory: gender (factor)
## Null Hypothesis: ind...
## # A tibble: 1,000 × 2
##    replicate    stat
##        <int>   <dbl>
##  1         1 -0.0417
##  2         2  0.125
##  3         3 -0.0417
##  4         4 -0.125
##  5         5  0.208
##  6         6 -0.125
##  7         7 -0.125
##  8         8 -0.208
##  9         9  0.208
## 10        10 -0.125
## # ℹ 990 more rows
```

]

---

# Back to Reality: What did we *Observe*?

.pull-left[

* We computed `\(\hat{p}_m - \hat{p}_f\)` from our *real-world* sample before.
``` r
obs_diff_prop <- promotions %>%
  specify(decision ~ gender,
          success = "promoted") %>%
  calculate(stat = "diff in props",
            order = c("male", "female"))
obs_diff_prop
```

```
## Response: decision (factor)
## Explanatory: gender (factor)
## # A tibble: 1 × 1
##    stat
##   <dbl>
## 1 0.292
```

]

.pull-right[

* How does that observed statistic compare to the distribution of **this** test statistic, assuming that `\(H_0\)` is true?

* We **created** that distribution on the previous slide: `null_distribution`.

* Let's confront `null_distribution` with `obs_diff_prop`, and let's compute the p-value!

]

---

# Visualize the Null

.left-wide[

``` r
visualize(null_distribution, bins = 10)
```

<img src="hypothesis_files/figure-html/unnamed-chunk-14-1.svg" style="display: block; margin: auto;" />

]

.right-thin[

* This is the distribution of `\(\hat{p}_m - \hat{p}_f\)` under `\(H_0\)`.

* No Discrimination in that world.

]

---

# Visualize the P-value

.pull-left[

``` r
visualize(null_distribution, bins = 10) +
  shade_p_value(obs_stat = obs_diff_prop,
                direction = "right")
```

<img src="hypothesis_files/figure-html/unnamed-chunk-15-1.svg" style="display: block; margin: auto;" />

]

.pull-right[

* `shade_p_value` shades the p-value region based on `obs_diff_prop`, i.e. 0.29.

* `direction = "right"` represents our one-sided alternative `\(H_A:p_m - p_f > 0\)`

* *more extreme* means *bigger difference* here, hence *more to the right*.

* If `\(H_A:p_m - p_f < 0\)`, we'd set `direction = "left"`

* The red area **is the p-value**!

* Is that a *big* or a *small* area?

]

---

# Obtaining the p-value and Deciding to Reject

.pull-left[

* Obtain the precise p-value with

``` r
p_value <- null_distribution %>%
  get_p_value(obs_stat = obs_diff_prop,
              direction = "right")
p_value
```

```
## # A tibble: 1 × 1
##   p_value
##     <dbl>
## 1   0.034
```

* So, the probability of observing a difference of at least 29% in a world with no discrimination is only 3.4%. That probability is due to sampling variation.
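
The same kind of p-value can be approximated without `infer`. The following is a minimal sketch in base R, not the `infer` implementation: it rebuilds the 48 decisions from the counts in the summary table shown earlier (21 and 14 promotions out of 24 CVs each), reshuffles the gender labels with `sample`, and counts how often the reshuffled difference is at least as large as the observed one. The names `diff_in_props`, `null_stats` and `p_value_by_hand` and the seed are illustrative assumptions; because the reshuffles are random, the result will differ slightly from the number above.

```r
set.seed(42)  # arbitrary seed, only for reproducibility of this sketch

# Rebuild the data from the counts in the summary table
decision <- c(rep("promoted", 21), rep("not", 3),    # 24 "male" CVs
              rep("promoted", 14), rep("not", 10))   # 24 "female" CVs
gender   <- rep(c("male", "female"), each = 24)

# Test statistic: promotion rate of men minus promotion rate of women
diff_in_props <- function(decision, gender) {
  mean(decision[gender == "male"] == "promoted") -
    mean(decision[gender == "female"] == "promoted")
}

# Observed difference in the real-world sample (21/24 - 14/24)
obs_diff <- diff_in_props(decision, gender)

# Null distribution: shuffle the gender labels WITHOUT replacement
null_stats <- replicate(1000, diff_in_props(decision, sample(gender)))

# One-sided p-value: share of reshuffles at least as extreme as observed
# (small tolerance guards against floating point comparisons)
p_value_by_hand <- mean(null_stats >= obs_diff - 1e-12)
p_value_by_hand
```

Calling `sample(gender)` without further arguments returns a random permutation of the labels, which is exactly the "deck of cards" reshuffling: every label is used once and never put back.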
]

--

.pull-right[

* Suppose we had set `\(\alpha = 0.001 = 0.1\%\)`

* Given that the p-value is *greater* than `\(\alpha\)`,

* i.e. 3.4% > 0.1%,

* we would **fail to reject** the null `\(H_0:p_m - p_f = 0\)`.

* The p-value was not sufficiently small to convince us in this case.

* What would have happened, had we set the cutoff `\(\alpha = 0.05 = 5\%\)` instead?

]

---

# Testing Errors

.pull-left[

* Working with probabilities implies that sometimes, we make an error.

* 29% may be *unlikely* under `\(H_0\)`, but that doesn't mean it's *impossible* to occur.

* So, it may happen that we sometimes reject `\(H_0\)`, when in fact it was true.

]

--

.pull-right[

* This is similar to a verdict reached in a court trial:

* In fact, in hypothesis testing:

]

---

# Type I and Type II Errors

.pull-left[

* So, there are even two types of errors to make! 😲

* Type I: We convict an innocent person. We reject a *true* Null.

* Type II: We *fail* to convict a criminal. We *fail* to reject a *false* Null.

* We **choose** the frequency of a Type I error by setting `\(\alpha\)`, called the **significance level**.

]

--

.pull-right[

* The probability of committing a type II error is called `\(\beta\)`. The value `\(1-\beta\)`, i.e. the prob. of *not* making such an error, is called the **power** of a hypothesis test.

* Ideally, `\(\alpha = \beta = 0\)`. However, with random sampling this is impossible. Also, the two error probabilities are inversely related (see next slide).

* So, typically we fix `\(\alpha\)` and try to maximize the power of the test.

* Given a certain frequency of convicting an innocent person, we try to make sure we convict as many true criminals as possible.

]

---

# Type I and II Errors are Inversely related

.left-wide[  ]

.right-thin[

* `\(\hat{\theta}\)` is *some* test statistic.

* `\(f(\hat{\theta}|\theta_0)\)` and `\(f(\hat{\theta}|\theta_A)\)` are Null and Alternative distributions.

* Changing `\(\alpha\)` moves the critical value `\(\hat{\theta}_c\)`.
* This example is fully worked out [here](https://scpoecon.github.io/Econometrics/images/hypothesis.pdf)

]

---
class: title-slide-final, middle

# THANKS

To the amazing [moderndive](https://moderndive.com/) team!

---
class: title-slide-final, middle
background-image: url(../img/logo/esomas.png)
background-size: 250px
background-position: 9% 19%

# END

|                                                                                                            |                                   |
| :--------------------------------------------------------------------------------------------------------- | :-------------------------------- |
| <a href="mailto:florian.oswald@sciencespo.fr">.ScPored[<i class="fa fa-paper-plane fa-fw"></i>]            | florian.oswald@sciencespo.fr      |
| <a href="https://github.com/ScPoEcon/Econometrics-Slides">.ScPored[<i class="fa fa-link fa-fw"></i>]       | Slides                            |
| <a href="https://scpoecon.github.io/Econometrics">.ScPored[<i class="fa fa-link fa-fw"></i>]               | Book                              |
| <a href="http://twitter.com/ScPoEcon">.ScPored[<i class="fa fa-twitter fa-fw"></i>]                        | @ScPoEcon                         |
| <a href="http://github.com/ScPoEcon">.ScPored[<i class="fa fa-github fa-fw"></i>]                          | @ScPoEcon                         |