class: center, middle, inverse, title-slide

# Simple Linear Regression: Estimation
## EC 320: Introduction to Econometrics
### Winter 2022

---
class: inverse, middle

# Housekeeping

- Lab 04 today, Exercise 04 due today.
- Problem Set 2 out, due next Monday.

---
# Last Time

We considered a simple linear regression of `\(Y_i\)` on `\(X_i\)`:

$$ Y_i = \beta_0 + \beta_1X_i + u_i. $$

--

- `\(\beta_0\)` and `\(\beta_1\)` are __population parameters__ that describe the *"true"* relationship between `\(X_i\)` and `\(Y_i\)`.
- __Problem:__ We don't know the population parameters. The best we can do is to estimate them.

---
# Last Time

We derived the OLS estimator by picking estimates that minimize `\(\sum_{i=1}^n \hat{u}_i^2\)`.

- __Intercept:__

$$ \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}. $$

- __Slope:__

$$ `\begin{aligned} \hat{\beta}_1 &= \dfrac{\sum_{i=1}^n (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_{i=1}^n (X_i - \bar{X})^2}. \end{aligned}` $$

We used these formulas to obtain estimates of the parameters `\(\beta_0\)` and `\(\beta_1\)` in a regression of `\(Y_i\)` on `\(X_i\)`.

---
# Last Time

With the OLS estimates of the population parameters, we constructed a regression line:

$$ \hat{Y_i} = \hat{\beta}_0 + \hat{\beta}_1X_i. $$

- `\(\hat{Y_i}\)` are predicted or __fitted__ values of `\(Y_i\)`.
- You can think of `\(\hat{Y_i}\)` as an estimate of the average value of `\(Y_i\)` given a particular value of `\(X_i\)`.

--

OLS still produces prediction errors: `\(\hat{u}_i = Y_i - \hat{Y_i}\)`.

- Put differently, there is a part of `\(Y_i\)` we can explain and a part we cannot: `\(Y_i = \hat{Y_i} + \hat{u}_i\)`.

---
# Review

What is the equation for the regression model estimated below?

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-1-1.svg" style="display: block; margin: auto;" />

---
# Review

The estimated __intercept__ is -9.85. What does this tell us?

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-2-1.svg" style="display: block; margin: auto;" />

---
# Review

The estimated __slope__ is 2.2. How do we interpret it?

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-3-1.svg" style="display: block; margin: auto;" />

---
# Today

## Agenda

1. Highlight important properties of OLS.
2. Discuss goodness of fit: how well does one variable explain another?
3. Units of measurement.

---
class: inverse, middle

# OLS Properties

---
# OLS Properties

The way we selected OLS estimates `\(\hat{\beta}_0\)` and `\(\hat{\beta}_1\)` gives us three important properties:

1. Residuals sum to zero: `\(\sum_{i=1}^n \hat{u}_i = 0\)`.
2. The sample covariance between the independent variable and the residuals is zero: `\(\sum_{i=1}^n X_i \hat{u}_i = 0\)`.
3. The point `\((\bar{X}, \bar{Y})\)` is always on the regression line.

---
# OLS Residuals

Residuals sum to zero: `\(\sum_{i=1}^n \hat{u}_i = 0\)`.

- By extension, the sample mean of the residuals is zero.
- You will prove this in Problem Set 2.

---
# OLS Residuals

The sample covariance between the independent variable and the residuals is zero: `\(\sum_{i=1}^n X_i \hat{u}_i = 0\)`.

- You will prove a version of this in Problem Set 2.

---
# OLS Regression Line

The point `\((\bar{X}, \bar{Y})\)` is always on the regression line.

- Start with the regression line: `\(\hat{Y_i} = \hat{\beta}_0 + \hat{\beta}_1X_i\)`.

--

- Substitute `\(\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}\)`: `\(\hat{Y_i} = \bar{Y} - \hat{\beta}_1 \bar{X} + \hat{\beta}_1X_i\)`.
--

- Plug `\(\bar{X}\)` into `\(X_i\)`:

$$ `\begin{aligned} \hat{Y_i} &= \bar{Y} - \hat{\beta}_1 \bar{X} + \hat{\beta}_1\bar{X} \\ &= \bar{Y}. \end{aligned}` $$

---
class: inverse, middle

# Goodness of Fit

---
# Goodness of Fit

## .hi[Regression 1] *vs.* .hi-green[Regression 2]

- Same slope.
- Same intercept.

**Q:** Which fitted regression line *"explains"*<sup>*</sup> the data better?

.pull-left[
<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-4-1.svg" style="display: block; margin: auto;" />
]

.pull-right[
<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-5-1.svg" style="display: block; margin: auto;" />
]

.footnote[
<sup>*</sup> _Explains_ .mono[=] _fits_.
]

---
# Goodness of Fit

## .hi[Regression 1] *vs.* .hi-green[Regression 2]

The __coefficient of determination__ `\(R^2\)` is the fraction of the variation in `\(Y_i\)` *"explained"* by `\(X_i\)` in a linear regression.

- `\(R^2 = 1 \implies X_i\)` explains _all_ of the variation in `\(Y_i\)`.
- `\(R^2 = 0 \implies X_i\)` explains _none_ of the variation in `\(Y_i\)`.

.pull-left[
.center[ `\(R^2\)` .mono[=] 0.73 ]
<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-6-1.svg" style="display: block; margin: auto;" />
]

.pull-right[
.center[ `\(R^2\)` .mono[=] 0.07 ]
<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-7-1.svg" style="display: block; margin: auto;" />
]

---
# Goodness of Fit

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-8-1.svg" style="display: block; margin: auto;" />

---
# Goodness of Fit

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" />

---
# Goodness of Fit

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-10-1.svg" style="display: block; margin: auto;" />

---
# Explained and Unexplained Variation

Residuals remind us that there are parts of `\(Y_i\)` we can't explain.

$$ Y_i = \hat{Y_i} + \hat{u}_i $$

- Sum the above, divide by `\(n\)`, and use the fact that OLS residuals sum to zero to get `\(\bar{\hat{u}} = 0 \implies \bar{Y} = \bar{\hat{Y}}\)`.

--

__Total Sum of Squares (TSS)__ measures variation in `\(Y_i\)`:

$$ \text{TSS} \equiv \sum_{i=1}^n (Y_i - \bar{Y})^2. $$

- We will decompose this variation into explained and unexplained parts.

---
# Explained and Unexplained Variation

__Explained Sum of Squares (ESS)__ measures the variation in `\(\hat{Y_i}\)`:

$$ \text{ESS} \equiv \sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2. $$

--

**Residual Sum of Squares (RSS)** measures the variation in `\(\hat{u}_i\)`:

$$ \text{RSS} \equiv \sum_{i=1}^n \hat{u}_i^2. $$

--

.hi[Goal:] Show that `\(\text{TSS} = \text{ESS} + \text{RSS}\)`.

---
class: white-slide

**Step 1:** Plug `\(Y_i = \hat{Y_i} + \hat{u}_i\)` into TSS.

`\(\text{TSS}\)`
--
<br> `\(\quad = \sum_{i=1}^n (Y_i - \bar{Y})^2\)`
--
<br> `\(\quad = \sum_{i=1}^n ([\hat{Y_i} + \hat{u}_i] - [\bar{\hat{Y}} + \bar{\hat{u}}])^2\)`
--

**Step 2:** Recall that `\(\bar{\hat{u}} = 0\)` and `\(\bar{Y} = \bar{\hat{Y}}\)`.
`\(\text{TSS}\)`
--
<br> `\(\quad = \sum_{i=1}^n \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right)^2\)`
--
<br> `\(\quad = \sum_{i=1}^n \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right) \left( [\hat{Y_i} - \bar{Y}] + \hat{u}_i \right)\)`
--
<br> `\(\quad = \sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2 + \sum_{i=1}^n \hat{u}_i^2 + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right)\)`

---
class: white-slide

**Step 3:** Notice .hi-purple[ESS] and .hi[RSS].

`\(\text{TSS}\)`
--
<br> `\(\quad = \color{#9370DB}{\sum_{i=1}^n (\hat{Y_i} - \bar{Y})^2} + \color{#e64173}{\sum_{i=1}^n \hat{u}_i^2} + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right)\)`
--
<br> `\(\quad = \color{#9370DB}{\text{ESS}} + \color{#e64173}{\text{RSS}} + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right)\)`

---
class: white-slide

**Step 4:** Simplify.

`\(\text{TSS}\)`
--
<br> `\(\quad = \text{ESS} + \text{RSS} + 2 \sum_{i=1}^n \left( (\hat{Y_i} - \bar{Y})\hat{u}_i \right)\)`
--
<br> `\(\quad = \text{ESS} + \text{RSS} + 2 \sum_{i=1}^n \hat{Y_i}\hat{u}_i - 2 \bar{Y}\sum_{i=1}^n \hat{u}_i\)`
--

**Step 5:** Show that the last two terms are zero. Notice that

`\(\sum_{i=1}^n \hat{Y_i}\hat{u}_i\)`
<br> `\(\quad = \sum_{i=1}^n (\hat{\beta}_0 + \hat{\beta}_1X_i)\hat{u}_i\)`
--
<br> `\(\quad = \hat{\beta}_0 \sum_{i=1}^n \hat{u}_i + \hat{\beta}_1 \sum_{i=1}^n X_i\hat{u}_i\)`
--
<br> `\(\quad = 0\)`
--

Both leftover terms vanish because `\(\sum_{i=1}^n \hat{u}_i = 0\)` and `\(\sum_{i=1}^n X_i\hat{u}_i = 0\)`, so `\(\text{TSS} = \text{ESS} + \text{RSS}\)`.

---
# Goodness of Fit

## Calculating `\(R^2\)`

- `\(R^2 = \frac{\text{ESS}}{\text{TSS}}\)`.
- `\(R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}\)`.

--

`\(R^2\)` is related to the correlation between the actual values of `\(Y\)` and the fitted values of `\(Y\)`.

- One can show that `\(R^2 = (r_{Y, \hat{Y}})^2\)`.

---
# Goodness of Fit

## So what?

In the social sciences, low `\(R^2\)` values are common.

--

Low `\(R^2\)` doesn't mean that an estimated regression is useless.

- In a randomized controlled trial, `\(R^2\)` is usually less than 0.1.

--

High `\(R^2\)` doesn't necessarily mean you have a *"good"* regression.

- Worries about selection bias and omitted variables still apply.

---
class: inverse, middle

# Units of Measurement

---
# Last Time

We ran a regression of crimes per 1000 students on police per 1000 students. We found that `\(\hat{\beta}_0\)` .mono[=] 18.41 and `\(\hat{\beta}_1\)` .mono[=] 1.76.

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-11-1.svg" style="display: block; margin: auto;" />

---
# Last Time

What if we had run a regression of crimes per student on police per 1000 students? What would happen to the slope?

--

<img src="08-Simple_Linear_Regression_Estimation_files/figure-html/unnamed-chunk-12-1.svg" style="display: block; margin: auto;" />

`\(\hat{\beta}_1\)` .mono[=] 0.001756.

---
# Demeaning

## Practice problem

Suppose that, before running a regression of `\(Y_i\)` on `\(X_i\)`, you decided to _demean_ each variable by subtracting its mean from each observation. This gave you `\(\tilde{Y}_i = Y_i - \bar{Y}\)` and `\(\tilde{X}_i = X_i - \bar{X}\)`. You then estimate

$$ \tilde{Y}_i = \beta_0 + \beta_1 \tilde{X}_i + u_i. $$

What will you get for your intercept estimate `\(\hat{\beta}_0\)`?
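--

One way to check your answer is to simulate a small dataset in R and compare the regression on the original variables with the regression on the demeaned variables. The data below are made up purely for illustration; the seed, sample size, and coefficients are arbitrary.

```r
# Simulated data (arbitrary values, for illustration only)
set.seed(320)
x <- rnorm(100, mean = 5, sd = 2)
y <- 3 + 1.5 * x + rnorm(100)

# Regression with the original variables
coef(lm(y ~ x))

# Regression with the demeaned variables
y_tilde <- y - mean(y)
x_tilde <- x - mean(x)
coef(lm(y_tilde ~ x_tilde))  # compare the intercept and slope to the first regression
```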