class: center, middle, inverse, title-slide

.title[
# R: Introduction and Review
]
.subtitle[
## EC 421, Set 1(r)
]
.author[
### Edward Rubin
]

---
class: inverse, middle
# Econometrics

.it.slab[Recall:] Applied econometrics, data science, and analytics require:

1. Intuition for the __theory__ behind statistics/econometrics<br>(assumptions, results, strengths, weaknesses).
1. Practical knowledge of how to __apply theoretical methods__ to data.
1. Efficient methods for __working with data__<br>(cleaning, aggregating, joining, visualizing).

__This course__ aims to deepen your knowledge in each of these three areas.

- 1: As before.
- 2–3: __R__

---
class: inverse, middle
# R

---
layout: true
# R

---

## What is R?

To quote the [R project website](https://www.r-project.org):

> R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

--

What does that mean?

- R was created for the statistical and graphical work required by econometrics.
- R has a vibrant, thriving online community. ([Stack Overflow](https://stackoverflow.com/questions/tagged/r))
- Plus it's __free__ and __open source__.

---

## Why are we using R?

1\. R is __free__ and __open source__, saving both you and the university 💰💵💰.

2\. _Related:_ Outside of a small group of economists, private- and public-sector __employers favor R__ over .mono[Stata] and most competing software.

3\. R is very __flexible and powerful__, adaptable to nearly any task, _e.g._, 'metrics, spatial data analysis, machine learning, web scraping, data cleaning, website building, teaching (these slides).

---

## Why are we using R?

4\. _Related:_ R imposes __no limitations__ on your number of observations or variables, or on your memory or processing power. (I'm looking at __you__, .mono[Stata].)

5\. If you put in the work,<sup>†</sup> you will come away with a __valuable and marketable__ tool.

6\. I 💖 __R__

7\. 
R is a nice gateway to (and plays well with) other programming languages (.it[e.g.], Python, SQL, C++, JavaScript).

.footnote[
[†]: Learning R definitely requires time and effort.
]

---

<img src="slides_files/figure-html/statistical languages-1.svg" style="display: block; margin: auto;" />

---
layout: false
class: inverse, middle
# R + Examples

---

# R + Regression

``` r
# A simple regression
*fit = lm(dist ~ 1 + speed, data = cars)
# Show the coefficients
coef(summary(fit))
```

```
#>               Estimate Std. Error   t value     Pr(>|t|)
#> (Intercept) -17.579095  6.7584402 -2.601058 1.231882e-02
#> speed         3.932409  0.4155128  9.463990 1.489836e-12
```

``` r
# A nice, clear table
library(broom)
tidy(fit)
```

```
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)   -17.6      6.76      -2.60 1.23e- 2
#> 2 speed           3.93     0.416      9.46 1.49e-12
```

---

# R + Plotting (w/ .mono[plot])

<img src="slides_files/figure-html/example: plot-1.svg" style="display: block; margin: auto;" />

---

# R + Plotting (w/ .mono[plot])

``` r
# Load the package that contains the dataset
library(gapminder)
# Create the plot
plot(
  x = gapminder$gdpPercap, y = gapminder$lifeExp,
  xlab = "GDP per capita", ylab = "Life Expectancy"
)
```

---

# R + Plotting (w/ .mono[ggplot2])

<img src="slides_files/figure-html/example: ggplot2 v1-1.svg" style="display: block; margin: auto;" />

---

# R + Plotting (w/ .mono[ggplot2])

``` r
# Load packages (ggthemes provides theme_pander)
library(gapminder); library(dplyr); library(ggplot2); library(ggthemes)
# Create the plot
ggplot(data = gapminder, aes(x = gdpPercap, y = lifeExp)) +
geom_point(alpha = 0.75) +
scale_x_continuous("GDP per capita", labels = scales::comma) +
ylab("Life Expectancy") +
theme_pander(base_size = 16)
```

---

# R + More plotting (w/ .mono[ggplot2])

<img src="slides_files/figure-html/example: ggplot2 v2-1.svg" style="display: block; margin: auto;" />

---

# R + More plotting (w/ .mono[ggplot2])

``` r
# Load packages (viridis provides scale_color_viridis)
library(gapminder); library(dplyr); library(ggplot2); library(ggthemes); library(viridis)
# Create the plot
ggplot(
  data = filter(gapminder, year
%in% c(1952, 2002)),
  aes(x = gdpPercap, y = lifeExp, color = continent, group = country)
) +
geom_path(alpha = 0.25) +
geom_point(aes(shape = as.character(year), size = pop), alpha = 0.75) +
scale_x_log10("GDP per capita", labels = scales::comma) +
ylab("Life Expectancy") +
scale_shape_manual("Year", values = c(1, 17)) +
scale_color_viridis("Continent", discrete = TRUE, end = 0.95) +
guides(size = "none") +
theme_pander(base_size = 16)
```

---

# R + Animated plots (w/ .mono[gganimate])

``` r
# The package for animating ggplot2
library(gganimate)
# As before
ggplot(
  data = gapminder %>% filter(continent != "Oceania"),
  aes(gdpPercap, lifeExp, size = pop, color = country)
) +
geom_point(alpha = 0.7, show.legend = FALSE) +
scale_colour_manual(values = country_colors) +
scale_size(range = c(2, 12)) +
scale_x_log10("GDP per capita", labels = scales::comma) +
facet_wrap(~continent) +
theme_pander(base_size = 16) +
theme(panel.border = element_rect(color = "grey90", fill = NA)) +
# Here come the gganimate-specific bits
labs(title = "Year: {frame_time}") +
ylab("Life Expectancy") +
transition_time(year) +
ease_aes("linear")
```

---

# R + Interactive plots (w/ .mono[plotly])
---
class: clear, middle

``` r
# The package for interactive plots
library(plotly)
plot_ly(
  data = gapminder %>% filter(year == 2007),
  x = ~gdpPercap,
  y = ~lifeExp,
  type = "scatter",
  mode = "markers",
  size = ~pop,
  color = ~continent,
  colors = viridis::plasma(n = 5, end = .93),
  # Text shown when hovering over a point
  text = ~paste(
    "Country: ", country,
    "<br>GDP per capita:", scales::dollar(gdpPercap, 1),
    "<br>Life Expectancy:", scales::comma(lifeExp, 1),
    "<br>Population:", scales::comma(pop)
  ),
  hoverinfo = "text",
  sizes = c(5, 100)
) %>% layout(
  title = "Gapminder data in 2007",
  xaxis = list(title = "GDP per capita (log scale)", type = "log"),
  yaxis = list(title = "Life Expectancy")
)
```

---

# R + Maps

``` r
library(leaflet)
leaflet() %>%
  addTiles() %>%
  addMarkers(lng = -123.075, lat = 44.045, popup = "The University of Oregon")
```
---
class: inverse, middle
# Getting started with R

---
layout: true
# Starting R

---

## Installation

- Install [R](https://www.r-project.org/).
- Install [.mono[RStudio]](https://www.rstudio.com/products/rstudio/download/preview/).
- __Optional/Overkill:__ [Git](https://git-scm.com/downloads)
  - Create an account on [GitHub](https://github.com/)
  - Register for a student/educator [discount](https://education.github.com/discount_requests/new).
  - For installation guidance and troubleshooting, check out Jenny Bryan's [website](http://happygitwithr.com/).
- __Note:__ Many UO labs have R installed and ready (helpful in a pinch).

---

## Resources

### Free(-ish)

- Google (which inevitably leads to StackOverflow)
- Time
- ChatGPT, Copilot, and other AI assistants
- [Data services at the UO library](https://library.uoregon.edu/data-services)
- Your classmates
- Your GE and me
- R resources [here](http://edrub.in/ARE212/resources.html) and [here](https://education.rstudio.com/)
- [swirl](https://swirlstats.com/) and [learnr](https://cran.r-project.org/web/packages/learnr/index.html)

### Money

Short online courses, e.g., [DataCamp](https://www.datacamp.com)

---

## Some R basics

You will dive deeper into R in lab, but here are six big points about R:

.more-left[

1. Everything is an __object__.
1. Every object has a __name__ and __value__.
1. You use __functions__ on these objects.
1. Functions come in __libraries__ (__packages__).
1. R will try to __help__ you.
1. R has its __quirks__.

]

.less-right[

`foo`

`foo = 2`

`mean(foo)`

`library(dplyr)`

`?dplyr`

`NA; error; warning`

]

---
exclude: true

## R _vs._ .mono[Stata]

Coming from .mono[Stata], here are a few important changes (benefits):

- Multiple objects and arrays (_e.g._, data frames) can exist in the same workspace (in memory). No more `keep`, `preserve`, `restore`, `snapshot` nonsense!
- (Base) R comes with lots of useful built-in functions, and it provides all the tools necessary for you to build your own functions. 
However, many of the _best_ functions come from external libraries.
- You don't need to `tset` or `xtset` data (you can if you really want... `ts`).

---
layout: false
class: clear, middle

.it.slab[Next:] Metrics review(s)
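
---

# Starting R

## Some R basics: A sketch

The six points from "Some R basics" can be sketched in a few lines of base R. This is a minimal illustration rather than part of the original slides: the names `foo`, `bar`, and `add_one` are hypothetical, and the sketch uses only functions that ship with R.

``` r
# 1-2. Everything is an object, and every object has a name and a value
foo = 2
bar = c(foo, 4, 6)

# 3. You use functions on these objects
bar_mean = mean(bar)

# ...and base R lets you build your own functions
# (add_one is a hypothetical helper, for illustration only)
add_one = function(x) x + 1
bar_plus = add_one(bar)

# 4. Functions come in packages; `stats` ships with R
library(stats)
bar_sd = sd(bar)

# 5. R will try to help you: run ?mean in the console to open its help page

# 6. R has its quirks: a single NA makes the mean NA unless you drop it
quirk = mean(c(1, NA, 3))
no_quirk = mean(c(1, NA, 3), na.rm = TRUE)
```

Printing `quirk` shows `NA`, while `no_quirk` is `2`. That kind of behavior is worth checking whenever your data contain missing values.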