class: center, middle, inverse, title-slide

# What is Econometrics?
## EC 320: Introduction to Econometrics
### Boyoon Chang
### Winter 2022

---
class: inverse, middle

# Prologue

---
# Who am I?

[**Boyoon (Bo) Chang**](https://bchang.me)

- Doctoral student in economics
- Former research associate on the economics team at a law firm
- Focus on applied microeconomics, empirical industrial organization

Where can you find me?

- Office: 420 PLC
- Office hours: MW 14:00-15:00, or by appointment
- Email: [.mono[bchang@uoregon.edu]](mailto:bchang@uoregon.edu) **(use EC 320 in the subject line)**

Your GE: **Kyutaro Matsuzawa**

- Office hours: T/TR 14:00-15:00 via Zoom (info on Canvas)
- Email: [.mono[kyutarom@uoregon.edu]](mailto:kyutarom@uoregon.edu) **(use EC 320 in the subject line)**

---
# Today's Topic

Syllabus

- Course material
- What, when, where, who

Econometrics

- Motivation
- Examples

R

- What is R?
- Why are we using R?
- Getting started with R

---
# Motivation

## Why study econometrics?

1. Develop __skills that employers value__.

--

1. Cultivate __healthy skepticism__.

--

1. Learn about the world using __data__.

???

Government agencies and private firms rely on data to make informed decisions. Requires people who can clean and analyze data, create informative visualizations, and communicate results. .mono[R] facilitates these tasks.

When should we trust the findings of a study? Junk science abounds. Blind faith in science and science denialism are harmful. Econometrics gives us a way to evaluate the quality of evidence. Makes us better citizens.

Historically, high-quality data were scarce. Econometricians had to think rigorously and creatively about how to learn from less-than-ideal data. Result: proliferation of robust methods with falsifiable assumptions.

---
# Motivation

## Why study econometrics?

### Provide answers to important questions

--

- Do minimum wage policies __reduce poverty__?

--

- Does the death penalty __deter violent crime__?
--

- Are recessions __good for your health__?

--

- How will global warming __affect the economy__?

--

- What __explains the gender pay gap__?

---
# Econometrics

Most econometric inquiry concerns one of two distinct goals:

1. .hi-purple[Prediction:] Accurately .purple[predict] or .purple[forecast] an outcome given a set of predictors. .purple[Given what we know about] `\(\color{#6A5ACD}{x}\)`.purple[, what values do we expect] `\(\color{#6A5ACD}{y}\)` .purple[to take?]

1. .hi-green[Causal identification:] .green[Estimate] the effect of an intervention on an outcome. .green[How does] `\(\color{#007935}{y}\)` .green[change when we change] `\(\color{#007935}{x}\)`.green[?]

???

__Prediction examples__

Netflix uses information on users and their choices to provide individualized movie recommendations.

Some states use data on defendants to predict pretrial flight risk.

The Federal Reserve uses economic data to forecast inflation, unemployment, and GDP.

__Causal examples__

Pharmaceutical companies run clinical trials to determine whether new medicines reduce symptoms or cause side effects.

Tech companies use __A/B testing__ to improve user experience (and increase profit).

Economists use __natural experiments__ to better understand how people respond to incentives.

--

The main focus of EC 320 and EC 421 is causal identification.

--

- But...both rely on a common set of statistical techniques.

--

- For those interested, Professor Tim Duy teaches forecasting (EC 422) this Winter.

---
# Econometrics

## Not all relationships are causal

<img src="01-Introduction_files/figure-html/spurious-1.png" style="display: block; margin: auto;" />

---
# Econometrics

## Correlation vs. Causation

Common refrain: _"Correlation doesn't necessarily imply causation!"_

- __Q:__ Why might correlation fail to describe a causal relationship?

--

- __A:__ Omitted-variables bias, selection bias, simultaneity, reverse causality.

--

Correlation can imply causation.

- Requires strong assumptions.
--

- **Real life often violates these assumptions!**
- **Solutions:** Conduct an experiment or find a natural experiment.

---
# Example: *Blue Paradox*

[Recent study](https://www.pnas.org/content/116/12/5319) by UO economist [Grant McDermott](https://grantmcdermott.com) and coauthors.

**Question:** Do commercial fishers preempt fishing bans by increasing their fishing effort before the bans go into effect?

**Motivation**

- Recent conservation efforts seek to preserve aquatic habitat and increase fish stocks.
- Policy lever: Restrict fishing activity in marine protected areas.
- Concern: Preemptive behavior could *decrease* fish stocks.

--

**Data**

- Vessel-level data on fishing effort/intensity.

---
# Example: *Blue Paradox*

**Natural Experiment**

Phoenix Islands Protected Area (PIPA)

- First mentioned on 1 September 2014; implemented 1 January 2015.
- *Treatment group:* PIPA.
- *Control group:* Outlying Kiribati islands.

<img src="figure2.jpg" width="50%" style="display: block; margin: auto;" />

---
# Example: *Blue Paradox*

**Natural Experiment**

Measure the causal effect of the fishing ban by comparing fishing effort in treatment and control regions, before and after the implementation of the policy.

- A *difference-in-differences* comparison.
- .hi[Assumption:] .pink[Parallel trends.] Absent the ban, fishing effort in the treatment and control regions would have followed the same trend.

If we believe this assumption, then the observed change supports a causal interpretation. If not, then the change could reflect other factors and thus fail to isolate the causal effect of the ban.

---
# Example: *Blue Paradox*

**Results**

<img src="figure3.jpg" width="85%" style="display: block; margin: auto;" />

---
# Example: *Blue Paradox*

**Discussion**

Results provide causal evidence that commercial fishers engage in preemptive behavior in response to conservation policy changes.

Results are *consistent* with economic theory, but *cannot prove* that the theory is correct.
- **Science cannot prove anything.**
- Science can .hi[falsify or reject] existing hypotheses or .hi[corroborate] existing evidence.

--

Also...the causal statement rests on a critical assumption.

- Cannot prove that the assumption is true, but can falsify it.
- Failure to falsify `\(\neq\)` assumption is true.

???

As social scientists, we have to be very careful about making definitive statements because, most of the time, they do not hold.

---
class: inverse, middle

# .mono[R]

---
layout: true
# .mono[R]

---
## What is .mono[R]?

According to the [.mono[R] project website](https://www.r-project.org),

> .mono[R] is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

--

What does that mean?

- .mono[R] is __free__ and __open source__.
- .mono[R] executes a variety of statistical techniques and produces beautiful graphs.
- .mono[R] has a vibrant, thriving online community (see [Stack Overflow](https://stackoverflow.com/questions/tagged/r)).

---
## Why are we using .mono[R]?

1. .mono[R] is __free__.

--

1. __.mono[R] is popular__ among economists, political scientists, psychologists, sociologists, geographers, anthropologists, biologists, data scientists, and statisticians.

--

1. __Employers prefer .mono[R]__ over most competing software environments.

--

1. .mono[R] can __adapt to nearly any task__: 'metrics, spatial data analysis, machine learning, web scraping, data cleaning, website building, teaching.

---

<img src="01-Introduction_files/figure-html/statistical languages-1.png" style="display: block; margin: auto;" />

---
layout: false
class: inverse, middle

# .mono[R] + [Examples]

---
# .mono[R] + Regression

```r
# A simple regression
*fit <- lm(mpg ~ 1 + wt, data = mtcars)
# Show the coefficients
coef(summary(fit))
```

```
#>              Estimate Std. Error   t value
#> (Intercept) 37.285126   1.877627 19.857575
#> wt          -5.344472   0.559101 -9.559044
#>                 Pr(>|t|)
#> (Intercept) 8.241799e-19
#> wt          1.293959e-10
```

```r
# A nice, clear table
library(broom)
tidy(fit)
```

```
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    37.3      1.88      19.9  8.24e-19
#> 2 wt             -5.34     0.559     -9.56 1.29e-10
```

---
# .mono[R] + Plotting (w/ .mono[plot])

<img src="01-Introduction_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" />

---
# .mono[R] + Plotting (w/ .mono[plot])

```r
# Load the package that contains the dataset
library(gapminder)
# Create the plot
plot(
  x = gapminder$gdpPercap, y = gapminder$lifeExp,
  xlab = "GDP per capita", ylab = "Life Expectancy"
)
```

---
# .mono[R] + Plotting (w/ .mono[ggplot2])

<img src="01-Introduction_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---
# .mono[R] + Plotting (w/ .mono[ggplot2])

```r
# Load packages (ggthemes provides theme_pander)
library(gapminder); library(dplyr)
library(ggplot2); library(ggthemes)
# Create the plot (met_slate is a color defined in the slides' setup code, not shown)
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.75) +
  scale_x_continuous("GDP per capita", labels = scales::comma) +
  ylab("Life Expectancy") +
  theme_pander(base_size = 17, base_family = "Arial", fc = met_slate)
```

---
# .mono[R] + More plotting (w/ .mono[ggplot2])

<img src="01-Introduction_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---
# .mono[R] + More plotting (w/ .mono[ggplot2])

```r
# Load packages (viridis provides scale_color_viridis)
library(gapminder); library(dplyr)
library(ggplot2); library(ggthemes); library(viridis)
# Create the plot
ggplot(
  data = filter(gapminder, year %in% c(1952, 2002)),
  aes(x = gdpPercap, y = lifeExp, color = continent, group = country)
) +
  geom_path(alpha = 0.25) +
  geom_point(aes(shape = as.character(year), size = pop), alpha = 0.75) +
  scale_x_log10("GDP per capita", labels = scales::comma) +
  ylab("Life Expectancy") +
  scale_shape_manual("Year", values = c(1, 17)) +
  scale_color_viridis("Continent", discrete = TRUE, end = 0.95) +
  guides(size = "none") +
  theme_pander(base_size = 17, base_family = "Arial", fc = met_slate)
```

---
# .mono[R] + Animated plots (w/ .mono[gganimate])

.center[![Gapminder](ex_gganimate.gif)]

---
# .mono[R] + Animated plots (w/ .mono[gganimate])

```r
# The package for animating ggplot2
library(gganimate)
# As before (country_colors ships with the gapminder package)
ggplot(
  data = gapminder %>% filter(continent != "Oceania"),
  aes(gdpPercap, lifeExp, size = pop, color = country)
) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10("GDP per capita", labels = scales::comma) +
  facet_wrap(~continent) +
  theme_pander(base_size = 17, base_family = "Arial", fc = met_slate) +
  theme(panel.border = element_rect(color = "grey90", fill = NA)) +
  # Add gganimate code
  labs(title = "Year: {frame_time}") +
  ylab("Life Expectancy") +
  transition_time(year) +
  ease_aes("linear")
```

---
# .mono[R] + Animated maps (w/ .mono[gganimate])

.center[![](dc_gunshots_map.gif)]

---
class: inverse, center, middle

# Getting Started with .mono[R]

---
layout: true
# Starting .mono[R]

---
## Installation

- Install [.mono[R]](https://www.r-project.org/).
- Install [.mono[RStudio]](https://www.rstudio.com/products/rstudio/download/preview/).
- __Note:__ [All academic workstations at the UO have .mono[R]](https://library.uoregon.edu/library-technology-services/public-info/a-software), but having a copy of .mono[R] on your computer will prove useful for the econometrics sequence and 400-level elective courses.

--

## Resources

- Google and [StackOverflow](https://stackoverflow.com/questions/tagged/r)
- Time
- Your classmates
- Your GE
- Me

---
## .mono[R] basics

.more-left[
1. Everything is an __object__.
1. Every object has a __name__ and __value__.
1. You use __functions__ on these objects.
1. Functions come in __libraries__ (__packages__).
1. .mono[R] will try to __help__ you.
1. .mono[R] has its __quirks__.
]

.less-right[

`foo`

`foo <- 2`

`mean(foo)`

`library(dplyr)`

`?dplyr`

`NA; error; warning`
]
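
---
## .mono[R] basics: a sketch

A minimal sketch of the six points above, assuming only base .mono[R]. The name `foo` follows the slide; `bar`, `msg_warning`, `msg_error`, and the specific values are hypothetical illustrations, and `library(stats)` stands in for `library(dplyr)` so the example runs without installing anything.

```r
# 1-2. Everything is an object; every object has a name and a value.
foo <- 2                  # name: foo, value: 2
bar <- c(1, 2, NA)        # hypothetical vector with one missing value

# 3. You use functions on objects.
mean(foo)

# 4. Functions come in packages; stats ships with R.
library(stats)

# 5. R will try to help you: run ?mean interactively to open its help page.

# 6. R has its quirks: NA propagates unless you ask R to drop it.
mean(bar)                 # NA
mean(bar, na.rm = TRUE)   # 1.5

# A warning lets code continue; an error stops it.
# tryCatch() captures each message here so the script keeps running.
msg_warning <- tryCatch(as.numeric("two"), warning = function(w) conditionMessage(w))
msg_error   <- tryCatch(log("two"), error = function(e) conditionMessage(e))
```

Printing `msg_warning` and `msg_error` shows the text .mono[R] would otherwise display in the console.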