class: center, middle, inverse, title-slide

# Nonlinear Relationships
## EC 320: Introduction to Econometrics
### Winter 2022

---
class: inverse, middle

# Prologue

---
# Housekeeping

### Final Exam

Review lecture this Wednesday.
- Come prepared with questions.

**Exam:** Friday, March 18 at 10:15am in TYKE 140
- If anything changes, we will announce it immediately on Canvas.

### Lab

Practice problems will be reviewed.

### Poll

Office hours during finals week?

---
class: inverse, middle

# Nonlinear Relationships

---
# Can We Do Better?

`$$(\widehat{\text{Life Expectancy}})_i = 53.96 + 8\times 10^{-4} \cdot \text{GDP}_i$$`

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-2-1.svg" style="display: block; margin: auto;" />

---
# Nonlinear Relationships

Many economic relationships are **nonlinear**.
- *e.g.*, most production functions, profit, diminishing marginal utility, tax revenue as a function of the tax rate, *etc.*

--

## The flexibility of OLS

OLS can accommodate many, but not all, nonlinear relationships.
- The underlying model must be linear-in-parameters.
- Nonlinear transformations of variables are okay.
- Modeling some nonlinear relationships requires advanced estimation techniques, such as *maximum likelihood*.<sup>.pink[†]</sup>

.footnote[
.pink[†] Beyond the scope of this class.
]

---
# Linearity

.hi-green[Linear-in-parameters:] .green[Parameters] enter the model as a weighted sum, where the weights are functions of the variables.
- One of the assumptions required for the unbiasedness of OLS.

.hi-pink[Linear-in-variables:] .pink[Variables] enter the model as a weighted sum, where the weights are functions of the parameters.
- Not required for the unbiasedness of OLS.

--

The standard linear regression model satisfies both properties:

`$$Y_i = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + \dots + \beta_kX_{ki} + u_i$$`

---
# Linearity

Which of the following is .hi-green[linear-in-parameters], .hi-pink[linear-in-variables], or .hi-purple[neither]?

1. `\(Y_i = \beta_0 + \beta_1X_{i} + \beta_2X_{i}^2 + \dots + \beta_kX_{i}^k + u_i\)`
2. `\(Y_i = \beta_0X_i^{\beta_1}v_i\)`
3. `\(Y_i = \beta_0 + \beta_1\beta_2X_{i} + u_i\)`

---
count: false
# Linearity

Which of the following is .hi-green[linear-in-parameters], .hi-pink[linear-in-variables], or .hi-purple[neither]?

1. `\(\color{#007935}{Y_i = \beta_0 + \beta_1X_{i} + \beta_2X_{i}^2 + \dots + \beta_kX_{i}^k + u_i}\)`
2. `\(Y_i = \beta_0X_i^{\beta_1}v_i\)`
3. `\(Y_i = \beta_0 + \beta_1\beta_2X_{i} + u_i\)`

Model 1 is .green[linear-in-parameters], but not linear-in-variables.

---
count: false
# Linearity

Which of the following is .hi-green[linear-in-parameters], .hi-pink[linear-in-variables], or .hi-purple[neither]?

1. `\(\color{#007935}{Y_i = \beta_0 + \beta_1X_{i} + \beta_2X_{i}^2 + \dots + \beta_kX_{i}^k + u_i}\)`
2. `\(\color{#9370DB}{Y_i = \beta_0X_i^{\beta_1}v_i}\)`
3. `\(Y_i = \beta_0 + \beta_1\beta_2X_{i} + u_i\)`

Model 1 is .green[linear-in-parameters], but not linear-in-variables.

Model 2 is .purple[neither].

---
count: false
# Linearity

Which of the following is .hi-green[linear-in-parameters], .hi-pink[linear-in-variables], or .hi-purple[neither]?

1. `\(\color{#007935}{Y_i = \beta_0 + \beta_1X_{i} + \beta_2X_{i}^2 + \dots + \beta_kX_{i}^k + u_i}\)`
2. `\(\color{#9370DB}{Y_i = \beta_0X_i^{\beta_1}v_i}\)`
3. `\(\color{#e64173}{Y_i = \beta_0 + \beta_1\beta_2X_{i} + u_i}\)`

Model 1 is .green[linear-in-parameters], but not linear-in-variables.

Model 2 is .purple[neither].
Model 3 is .pink[linear-in-variables], but not linear-in-parameters.

---
# We're Going to Take Logs

The natural log is the inverse of the exponential function: <br> `\(\quad \log(e^x) = x\)` for all `\(x\)`.

## (Natural) Log Rules

1. Product rule: `\(\log(AB) = \log(A) + \log(B)\)`.

--

2. Quotient rule: `\(\log(A/B) = \log(A) - \log(B)\)`.

--

3. Power rule: `\(\log(A^B) = B \cdot \log(A)\)`.

--

4. Derivative: `\(f(x) = \log(x)\)` .mono[=>] `\(f'(x) = \dfrac{1}{x}\)`.

--

5. `\(\log(e) = 1\)`, `\(\log(1) = 0\)`, and `\(\log(x)\)` is undefined for `\(x \leq 0\)`.

---
# Log-Linear Model

**Nonlinear Model**

`$$Y_i = \alpha e^{\beta_1 X_i}v_i$$`

- `\(Y > 0\)`, `\(X\)` is continuous, and `\(v_i\)` is a multiplicative error term.
- Cannot estimate parameters with OLS directly.

--

**Logarithmic Transformation**

`$$\log(Y_i) = \log(\alpha) + \beta_1 X_i + \log(v_i)$$`

- Redefine `\(\log(\alpha) \equiv \beta_0\)` and `\(\log(v_i) \equiv u_i\)`.

--

**Transformed (Linear) Model**

`$$\log(Y_i) = \beta_0 + \beta_1 X_i + u_i$$`

- *Can* estimate with OLS, but the coefficient interpretation changes.

---
# Log-Linear Model

**Regression Model**

`$$\log(Y_i) = \beta_0 + \beta_1 X_i + u_i$$`

**Interpretation**

- A one-unit increase in the explanatory variable increases the outcome variable by approximately `\(\beta_1\times 100\)` percent, on average.
- *Example:* If `\(\log(\hat{\text{Pay}_i}) = 2.9 + 0.03 \cdot \text{School}_i\)`, then an additional year of schooling increases pay by approximately 3 percent, on average.

---
# Log-Linear Model

**Derivation**

Consider the log-linear model

$$ \log(Y) = \beta_0 + \beta_1 \, X + u $$

and differentiate

$$ \dfrac{dY}{Y} = \beta_1 dX $$

--

A marginal (small) change in `\(X\)` (_i.e._, `\(dX\)`) leads to a `\(\beta_1 dX\)` **proportionate change** in `\(Y\)`.

- Multiply by 100 to get the **percentage change** in `\(Y\)`.

---
# Log-Linear Example

`$$\log(\hat{Y_i}) = 10.02 + 0.73 \cdot \text{X}_i$$`

<img src="16-Nonlinear_Relationships_files/figure-html/log linear plot-1.svg" style="display: block; margin: auto;" />

---
count: false
# Log-Linear Example

`$$\log(\hat{Y_i}) = 10.02 + 0.73 \cdot \text{X}_i$$`

<img src="16-Nonlinear_Relationships_files/figure-html/log linear plot 2-1.svg" style="display: block; margin: auto;" />

---
# Log-Linear Model

**Note:** If you have a log-linear model with a binary indicator variable, the interpretation of the coefficient on that variable changes. Consider

$$ \log(Y_i) = \beta_0 + \beta_1 X_i + u_i $$

for binary variable `\(X\)`.

Interpretation of `\(\beta_1\)`:
- When `\(X\)` changes from 0 to 1, `\(Y\)` will increase by `\(100 \times \left( e^{\beta_1} -1 \right)\)` percent.
- When `\(X\)` changes from 1 to 0, `\(Y\)` will decrease by `\(100 \times \left( 1 - e^{-\beta_1} \right)\)` percent.

---
# Log-Linear Example

Binary explanatory variable: `trained`
- `trained == 1` if the employee received training.
- `trained == 0` if the employee did not receive training.

```r
lm(log(productivity) ~ trained, data = df2) %>% tidy()
```

```
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    9.94     0.0446    223.   0
#> 2 trained        0.557    0.0631      8.83 4.72e-18
```

**Q:** How do we interpret the coefficient on `trained`?

**A.sub[1]:** Trained workers are `\((e^{0.557}-1)\times 100\)` (74.52) percent more productive than untrained workers.

**A.sub[2]:** Untrained workers are `\((1 - e^{-0.557})\times 100\)` (42.7) percent less productive than trained workers.
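
---
# Log-Linear Example

We can verify the exact percent interpretations in R. A minimal sketch, plugging in the coefficient estimate from the regression above:

```r
b1 <- 0.557  # estimated coefficient on trained (from above)

(exp(b1) - 1) * 100   # X: 0 -> 1, productivity rises by ~74.5 percent
(1 - exp(-b1)) * 100  # X: 1 -> 0, productivity falls by ~42.7 percent
b1 * 100              # naive approximation: 55.7 percent (poor here)
```

The approximation `\(\beta_1 \times 100\)` is only reliable when `\(\beta_1\)` is close to zero; for large coefficients like this one, use the exact formulas.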
---
# Log-Log Model

**Nonlinear Model**

`$$Y_i = \alpha X_i^{\beta_1}v_i$$`

- `\(Y > 0\)`, `\(X > 0\)`, and `\(v_i\)` is a multiplicative error term.
- Cannot estimate parameters with OLS directly.

--

**Logarithmic Transformation**

`$$\log(Y_i) = \log(\alpha) + \beta_1 \log(X_i) + \log(v_i)$$`

- Redefine `\(\log(\alpha) \equiv \beta_0\)` and `\(\log(v_i) \equiv u_i\)`.

--

**Transformed (Linear) Model**

`$$\log(Y_i) = \beta_0 + \beta_1 \log(X_i) + u_i$$`

- *Can* estimate with OLS, but the coefficient interpretation changes.

---
# Log-Log Model

**Regression Model**

$$ \log(Y_i) = \beta_0 + \beta_1 \log(X_i) + u_i $$

**Interpretation**

- A one-percent increase in the explanatory variable leads to a `\(\beta_1\)`-percent change in the outcome variable, on average.
- Often interpreted as an elasticity.
- *Example:* If `\(\log(\widehat{\text{Quantity Demanded}}_i) = 0.45 - 0.31 \cdot \log(\text{Income}_i)\)`, then each one-percent increase in income decreases quantity demanded by 0.31 percent.

---
# Log-Log Model

**Derivation**

Consider the log-log model

$$ \log(Y) = \beta_0 + \beta_1 \log(X) + u $$

and differentiate

$$ \dfrac{dY}{Y} = \beta_1 \dfrac{dX}{X} $$

A one-percent increase in `\(X\)` leads to a `\(\beta_1\)`-percent increase in `\(Y\)`.

- Rearrange to show the elasticity interpretation:

$$ \dfrac{dY}{dX} \dfrac{X}{Y} = \beta_1 $$

---
# Log-Log Example

`$$\log(\hat{Y_i}) = 0.01 + 2.99 \cdot \log(\text{X}_i)$$`

<img src="16-Nonlinear_Relationships_files/figure-html/log log plot-1.svg" style="display: block; margin: auto;" />

---
count: false
# Log-Log Example

`$$\log(\hat{Y_i}) = 0.01 + 2.99 \cdot \log(\text{X}_i)$$`

<img src="16-Nonlinear_Relationships_files/figure-html/log log plot 2-1.svg" style="display: block; margin: auto;" />

---
# Linear-Log Model

**Nonlinear Model**

`$$e^{Y_i} = \alpha X_i^{\beta_1}v_i$$`

- `\(X > 0\)` and `\(v_i\)` is a multiplicative error term.
- Cannot estimate parameters with OLS directly.

--

**Logarithmic Transformation**

`$$Y_i = \log(\alpha) + \beta_1 \log(X_i) + \log(v_i)$$`

- Redefine `\(\log(\alpha) \equiv \beta_0\)` and `\(\log(v_i) \equiv u_i\)`.

--

**Transformed (Linear) Model**

`$$Y_i = \beta_0 + \beta_1 \log(X_i) + u_i$$`

- *Can* estimate with OLS, but the coefficient interpretation changes.

---
# Linear-Log Model

**Regression Model**

`$$Y_i = \beta_0 + \beta_1 \log(X_i) + u_i$$`

**Interpretation**

- A one-percent increase in the explanatory variable increases the outcome variable by approximately `\(\beta_1 \div 100\)` units, on average.
- *Example:* If `\((\widehat{\text{Blood Pressure}})_i = 150 - 9.1 \log(\text{Income}_i)\)`, then a one-percent increase in income decreases blood pressure by 0.091 points.

---
# Linear-Log Model

**Derivation**

Consider the linear-log model

$$ Y = \beta_0 + \beta_1 \log(X) + u $$

and differentiate

$$ dY = \beta_1 \dfrac{dX}{X} $$

--

A one-percent increase in `\(X\)` leads to a `\(\beta_1 \div 100\)` **change** in `\(Y\)`.
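
---
# Estimating Logged Models in R

Each transformed model can be estimated with `lm()` by logging variables inside the formula. A minimal sketch, assuming a data frame `df` with strictly positive variables `y` and `x`:

```r
library(dplyr)  # for %>%
library(broom)  # for tidy()

lm(log(y) ~ x,      data = df) %>% tidy()  # log-linear (log-level)
lm(log(y) ~ log(x), data = df) %>% tidy()  # log-log
lm(y ~ log(x),      data = df) %>% tidy()  # linear-log (level-log)
```

The estimates print in the units of the transformed model, so the interpretations above apply directly.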
---
# Linear-Log Example

`$$\hat{Y_i} = 0 + 0.99 \cdot \log(\text{X}_i)$$`

<img src="16-Nonlinear_Relationships_files/figure-html/linear log plot-1.svg" style="display: block; margin: auto;" />

---
count: false
# Linear-Log Example

`$$\hat{Y_i} = 0 + 0.99 \cdot \log(\text{X}_i)$$`

<img src="16-Nonlinear_Relationships_files/figure-html/linear log plot 2-1.svg" style="display: block; margin: auto;" />

---
class: white-slide

.center[**(Approximate) Coefficient Interpretation**]

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Model </th>
   <th style="text-align:left;"> \(\beta_1\) Interpretation </th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;font-style: italic;color: black !important;vertical-align:top;"> Level-level <br> \(Y_i = \beta_0 + \beta_1 X_i + u_i\) </td>
   <td style="text-align:left;font-style: italic;color: black !important;"> \(\Delta Y = \beta_1 \cdot \Delta X\) <br> A one-unit increase in \(X\) leads to a <br> \(\beta_1\)-unit increase in \(Y\) </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;font-style: italic;color: black !important;vertical-align:top;"> Log-level <br> \(\log(Y_i) = \beta_0 + \beta_1 X_i + u_i\) </td>
   <td style="text-align:left;font-style: italic;color: black !important;"> \(\%\Delta Y = 100 \cdot \beta_1 \cdot \Delta X\) <br> A one-unit increase in \(X\) leads to a <br> \(\beta_1 \cdot 100\)-percent increase in \(Y\) </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;font-style: italic;color: black !important;vertical-align:top;"> Log-log <br> \(\log(Y_i) = \beta_0 + \beta_1 \log(X_i) + u_i\) </td>
   <td style="text-align:left;font-style: italic;color: black !important;"> \(\%\Delta Y = \beta_1 \cdot \%\Delta X\) <br> A one-percent increase in \(X\) leads to a <br> \(\beta_1\)-percent increase in \(Y\) </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;font-style: italic;color: black !important;vertical-align:top;"> Level-log <br> \(Y_i = \beta_0 + \beta_1 \log(X_i) + u_i\) </td>
   <td style="text-align:left;font-style: italic;color: black !important;"> \(\Delta Y = (\beta_1 \div 100) \cdot \%\Delta X\) <br> A one-percent increase in \(X\) leads to a <br> \(\beta_1 \div 100\)-unit increase in \(Y\) </td>
  </tr>
 </tbody>
</table>

---
# Can We Do Better?

`$$(\widehat{\text{Life Expectancy}})_i = 53.96 + 8\times 10^{-4} \cdot \text{GDP}_i \quad\quad R^2 = 0.34$$`

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-9-1.svg" style="display: block; margin: auto;" />

---
# Can We Do Better?

`$$\log(\widehat{\text{Life Expectancy}}_i) = 3.97 + 1.3\times 10^{-5} \cdot \text{GDP}_i \quad\quad R^2 = 0.3$$`

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-11-1.svg" style="display: block; margin: auto;" />

---
# Can We Do Better?

`$$\log(\widehat{\text{Life Expectancy}}_i) = 2.86 + 0.15 \cdot \log \left( \text{GDP}_i \right) \quad\quad R^2 = 0.61$$`

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-13-1.svg" style="display: block; margin: auto;" />

---
# Can We Do Better?

`$$(\widehat{\text{Life Expectancy}})_i = -9.1 + 8.41 \cdot \log \left( \text{GDP}_i \right) \quad\quad R^2 = 0.65$$`

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-15-1.svg" style="display: block; margin: auto;" />

---
# Practical Considerations

**Consideration 1:** Do your data take negative numbers or zeros as values?

```r
log(0)
```

```
#> [1] -Inf
```

--

**Consideration 2:** What coefficient interpretation do you want? Unit change? Unit-free percent change?
--

**Consideration 3:** Are your data skewed?

.pull-left[
<img src="16-Nonlinear_Relationships_files/figure-html/skew 1-1.svg" style="display: block; margin: auto;" />
]

.pull-right[
<img src="16-Nonlinear_Relationships_files/figure-html/skew 2-1.svg" style="display: block; margin: auto;" />
]

---
class: inverse, middle

# Quadratic Regression

---
# Quadratic Data

<img src="16-Nonlinear_Relationships_files/figure-html/quad plot-1.svg" style="display: block; margin: auto;" />

---
# Quadratic Regression

**Regression Model**

`$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + u_i$$`

--

**Interpretation**

The sign of `\(\beta_2\)` indicates whether the relationship is convex (.mono[+]) or concave (.mono[-]).

--

The sign of `\(\beta_1\)`?

--

🤷

--

The partial derivative of `\(Y\)` with respect to `\(X\)` is the .hi[marginal effect] of `\(X\)` on `\(Y\)`:

`$$\color{#e64173}{\dfrac{\partial Y}{\partial X} = \beta_1 + 2 \beta_2 X}$$`

- The effect of `\(X\)` depends on the level of `\(X\)`.

---
# Quadratic Regression

```r
lm(y ~ x + I(x^2), data = quad_df) %>% tidy()
```

```
#> # A tibble: 3 × 5
#>   term        estimate std.error statistic   p.value
#>   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
#> 1 (Intercept)    13.2     2.26        5.81 8.30e-  9
#> 2 x              15.7     1.03       15.3  1.99e- 47
#> 3 I(x^2)         -2.50    0.0982    -25.4  2.46e-110
```

.pink[What is the marginal effect of] `\(\color{#e64173}{X}\)` .pink[on] `\(\color{#e64173}{Y}\)`.pink[?]

--

<br>

`\(\widehat{\dfrac{\partial \text{Y}}{\partial \text{X}} } = \hat{\beta}_1 + 2\hat{\beta}_2 X = 15.69 - 4.99X\)`

---
# Quadratic Regression

```r
lm(y ~ x + I(x^2), data = quad_df) %>% tidy()
```

```
#> # A tibble: 3 × 5
#>   term        estimate std.error statistic   p.value
#>   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
#> 1 (Intercept)    13.2     2.26        5.81 8.30e-  9
#> 2 x              15.7     1.03       15.3  1.99e- 47
#> 3 I(x^2)         -2.50    0.0982    -25.4  2.46e-110
```

.pink[What is the marginal effect of] `\(\color{#e64173}{X}\)` .pink[on] `\(\color{#e64173}{Y}\)` .pink[when] `\(\color{#e64173}{X=0}\)`.pink[?]

--

<br>

`\(\widehat{\dfrac{\partial \text{Y}}{\partial \text{X}} }\Bigg|_{\small \text{X}=0} = \hat{\beta}_1 = 15.69\)`

---
# Quadratic Regression

```r
lm(y ~ x + I(x^2), data = quad_df) %>% tidy()
```

```
#> # A tibble: 3 × 5
#>   term        estimate std.error statistic   p.value
#>   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
#> 1 (Intercept)    13.2     2.26        5.81 8.30e-  9
#> 2 x              15.7     1.03       15.3  1.99e- 47
#> 3 I(x^2)         -2.50    0.0982    -25.4  2.46e-110
```

.pink[What is the marginal effect of] `\(\color{#e64173}{X}\)` .pink[on] `\(\color{#e64173}{Y}\)` .pink[when] `\(\color{#e64173}{X=2}\)`.pink[?]

--

<br>

`\(\widehat{\dfrac{\partial \text{Y}}{\partial \text{X}} }\Bigg|_{\small \text{X}=2} = \hat{\beta}_1 + 2\hat{\beta}_2 \cdot (2) = 15.69 - 9.99 = 5.71\)`

---
# Quadratic Regression

```r
lm(y ~ x + I(x^2), data = quad_df) %>% tidy()
```

```
#> # A tibble: 3 × 5
#>   term        estimate std.error statistic   p.value
#>   <chr>          <dbl>     <dbl>     <dbl>     <dbl>
#> 1 (Intercept)    13.2     2.26        5.81 8.30e-  9
#> 2 x              15.7     1.03       15.3  1.99e- 47
#> 3 I(x^2)         -2.50    0.0982    -25.4  2.46e-110
```

.pink[What is the marginal effect of] `\(\color{#e64173}{X}\)` .pink[on] `\(\color{#e64173}{Y}\)` .pink[when] `\(\color{#e64173}{X=7}\)`.pink[?]
--

<br>

`\(\widehat{\dfrac{\partial \text{Y}}{\partial \text{X}} }\Bigg|_{\small \text{X}=7} = \hat{\beta}_1 + 2\hat{\beta}_2 \cdot (7) = 15.69 - 34.96 = -19.27\)`

---
class: white-slide

.center[**Fitted Regression Line**]

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-22-1.svg" style="display: block; margin: auto;" />

---
class: white-slide

.center[**Marginal Effect of X on Y**]

<img src="16-Nonlinear_Relationships_files/figure-html/unnamed-chunk-23-1.svg" style="display: block; margin: auto;" />

---
# Quadratic Regression

**Where does the regression** `\(\hat{Y_i} = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 X_i^2\)` ***turn*?**

- In other words, where is the peak (valley) of the fitted relationship?

--

**Step 1:** Take the derivative and set it equal to zero.

`$$\widehat{\dfrac{\partial \text{Y}}{\partial \text{X}} } = \hat{\beta}_1 + 2\hat{\beta}_2 X = 0$$`

--

**Step 2:** Solve for `\(X\)`.

`$$X = -\dfrac{\hat{\beta}_1}{2\hat{\beta}_2}$$`

--

**Example:** The peak of the previous regression occurs at `\(X = 3.14\)`.
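
---
# Quadratic Regression

The turning point can be computed directly from the fitted coefficients. A minimal sketch, reusing the `quad_df` regression from above:

```r
# Fit the quadratic model and extract the coefficient estimates
b <- coef(lm(y ~ x + I(x^2), data = quad_df))

# Turning point: X = -beta_1 / (2 * beta_2)
-b["x"] / (2 * b["I(x^2)"])  # approximately 3.14, matching the example above
```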