.hi-slate[scales] (*e.g.*, `scale_color_manual()`),

--

and other .hi-slate[options] (*e.g.*, `xlab()`).

--

You .hi-slate[add layers] using the addition sign (`+`).

--

.ex[Example] A time series of problems, `color` defined by money

```{R, gg-fake2, eval = F}
library(ggplot2)
ggplot(
  data = pretend_df,
  aes(x = time, y = problems, color = money)
*) +
*geom_point() +
geom_line()
```

---
layout: true
class: clear, middle

---

Alright, let's build a plot.

We'll use the `economics` dataset that comes with `ggplot2`
(because economics).

---

```{R, view-economics, echo = F, eval = T}
DT::datatable(
  economics,
  fillContainer = FALSE,
  options = list(pageLength = 8)
)
```

---
name: ex-gg

.smaller[Set up the plot.
```{R, gg0, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop))
```
]

---

.smaller[Label the axes.
```{R, gg1, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date")
```
]

---

.smaller[Draw some points.
```{R, gg2, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point()
```
]

---

.smaller[Map the `size` to the median duration of unemployment.
```{R, gg3, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop, size = uempmed)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point()
```
]

---

.smaller[Change the `shape` of the points.
```{R, gg4, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop, size = uempmed)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(shape = 1)
```
]

---

.smaller[Map points' `color` to the median duration of unemployment.
```{R, gg5, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop, size = uempmed)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed))
```
]

---

.smaller[Add some transparency (`alpha`) to our points.
```{R, gg6, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop, size = uempmed)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5)
```
]

---

.smaller[Same size points; all bigger.
```{R, gg7, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3)
```
]

---

.smaller[Change our theme: maybe you're a minimalist (but want slightly larger fonts)?
```{R, gg8, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  theme_minimal(base_size = 14)
```
]

---

.smaller[Want your figure to look like Stata made it?
```{R, gg9, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  ggthemes::theme_stata(base_size = 14)
```
]

---

.smaller[The "pander" theme from the `ggthemes` package.
```{R, gg10, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  ggthemes::theme_pander(base_size = 14)
```
]

---

.smaller[Change (and label) our color scale. .note[Note] `viridis` [is the best](https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html).
```{R, gg11, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  ggthemes::theme_pander(base_size = 14) +
  scale_color_viridis_c("Dur. unemp.")
```
]

---

.smaller[Connect the dots.
```{R, gg12, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_line(color = "grey80") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  ggthemes::theme_pander(base_size = 14) +
  scale_color_viridis_c("Dur. unemp.")
```
]

---

.smaller[How about a smoother?
```{R, gg13, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop)) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  geom_smooth(se = F) +
  ggthemes::theme_pander(base_size = 14) +
  scale_color_viridis_c("Dur. unemp.")
```
]

---

.smaller[The `group` aesthetic separates groups.
```{R, gg14, fig.height = 5}
ggplot(data = economics, aes(x = date, y = unemploy/pop, group = date < ymd(19900101))) +
  ylab("Unemployment rate") +
  xlab("Date") +
  geom_point(aes(color = uempmed), alpha = 0.5, size = 3) +
  geom_smooth(se = F) +
  ggthemes::theme_pander(base_size = 14) +
  scale_color_viridis_c("Dur. unemp.")
```
]

---

.note[Note] The `ymd()` function comes from the `lubridate` package.

---

`ggplot2` knows histograms.

---
name: gg-hist

A histogram.

.smaller[
```{R, gg-hist1, fig.height = 5}
ggplot(data = economics, aes(x = unemploy/pop)) +
  xlab("Unemployment rate") +
  geom_histogram(color = "white", fill = "#e64173") +
  ggthemes::theme_pander(base_size = 14)
```
]

---

Add a horizontal line where count = 0.

.smaller[
```{R, gg-hist2, fig.height = 5}
ggplot(data = economics, aes(x = unemploy/pop)) +
  xlab("Unemployment rate") +
  geom_histogram(color = "white", fill = "#e64173") +
  geom_hline(yintercept = 0) +
  ggthemes::theme_pander(base_size = 14)
```
]

---

`ggplot2` knows densities.

---
name: gg-density

A density plot.

.smaller[
```{R, gg-density1, fig.height = 5}
ggplot(data = economics, aes(x = unemploy/pop)) +
  xlab("Unemployment rate") +
  geom_density(color = NA, fill = "#e64173") +
  geom_hline(yintercept = 0) +
  ggthemes::theme_pander(base_size = 14)
```
]

---

Now with an Epanechnikov kernel!

.smaller[
```{R, gg-density2, fig.height = 5}
ggplot(data = economics, aes(x = unemploy/pop)) +
  xlab("Unemployment rate") +
  geom_density(kernel = "epanechnikov", color = NA, fill = "#e64173") +
  geom_hline(yintercept = 0) +
  ggthemes::theme_pander(base_size = 14)
```
]

---

`ggplot2` itself is incredibly flexible/powerful.

But there are [even more packages](https://www.ggplot2-exts.org/gallery/) that extend its power, _e.g._, `ggthemes`, `gganimate`, `cowplot`, `ggmap`, `ggExtra`, and (of course) `viridis`.
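---

.smaller[A minimal sketch of one such extension (an illustration, not part of the original examples): assuming the `cowplot` package is installed, its `plot_grid()` function can place the earlier histogram and density plot side by side. The object names `gg_hist`, `gg_dens`, and `gg_both` are hypothetical.
```{R, ex-cowplot, eval = F}
library(ggplot2)
# Histogram of the unemployment rate (same data as the earlier slides)
gg_hist <- ggplot(data = economics, aes(x = unemploy/pop)) +
  xlab("Unemployment rate") +
  geom_histogram(color = "white", fill = "#e64173", bins = 30)
# Density of the unemployment rate
gg_dens <- ggplot(data = economics, aes(x = unemploy/pop)) +
  xlab("Unemployment rate") +
  geom_density(color = NA, fill = "#e64173")
# Arrange the two plots in one row, labeled A and B (assumes cowplot is installed)
gg_both <- cowplot::plot_grid(gg_hist, gg_dens, ncol = 2, labels = c("A", "B"))
gg_both
```
]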
---
name: gg-more

Gapminder meets `gganimate`

```{R, ex-gganimate, include = F, cache = T, dev = "png", eval = F}
# The package for animating ggplot2
p_load(gganimate, gapminder)
# As before
gg <- ggplot(
  data = gapminder %>% filter(continent != "Oceania"),
  aes(gdpPercap, lifeExp, size = pop, color = country)
) +
  geom_point(alpha = 0.7, show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  scale_x_log10("GDP per capita", labels = scales::comma) +
  facet_wrap(~continent) +
  theme_pander(base_size = 16) +
  theme(panel.border = element_rect(color = "grey90", fill = NA)) +
  # Here come the gganimate-specific bits
  labs(title = "Year: {frame_time}") +
  ylab("Life Expectancy") +
  transition_time(year) +
  ease_aes("linear")
# Save the animation
anim_save(
  animation = gg,
  filename = "ex_gganimate.gif",
  path = here(),
  width = 10.5,
  height = 7,
  # units = "in",
  # res = 150,
  nframes = 56
)
```

.center[![Gapminder](ex_gganimate.gif)]

---

US births by month since 1933

```{R, ex-new-ts, echo = F, eval = T}
# Load births data; drop totals; create time variable
birth_df <- read_csv("usa_birth_1933_2015.csv") %>%
  janitor::clean_names() %>%
  filter(month != "TOT") %>%
  mutate(
    month = as.numeric(month),
    time = year + (month-1)/12
  )
# Load days of months data
days_df <- read_csv("days_of_month.csv")
# Clean up days
days_lon <- gather(days_df, year, n_days, -Month)
days_lon <- janitor::clean_names(days_lon)
days_lon$year <- as.integer(days_lon$year)
# Join
birth_df <- left_join(
  x = birth_df,
  y = days_lon,
  by = c("year", "month")
)
# Calculate 30-day equivalent births by month
birth_df %<>% mutate(
  births_30day = births / n_days * 30
)
# Shared limits for the fill and color scales
lo <- min(c(birth_df$births, birth_df$births_30day))
hi <- max(c(birth_df$births, birth_df$births_30day))
# Plot new-ish time-series graph of birth rates
ggplot(
  data = birth_df %>% filter(year < 2050),
  aes(
    x = year,
    y = factor(month, labels = month.abb),
    fill =
      births/1e5,
    color = births/1e5
  )
) +
  geom_tile() +
  xlab("Year") +
  ylab("Month") +
  theme_pander(base_family = "Fira Sans Book", base_size = 20) +
  scale_fill_viridis("Births (100K)", option = "magma", limits = c(lo, hi)/1e5) +
  scale_color_viridis("Births (100K)", option = "magma", limits = c(lo, hi)/1e5) +
  theme(
    legend.position = "bottom",
    legend.key.width = unit(1.5, units = "in"),
    legend.key.height = unit(0.2, units = "in"),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    line = element_blank(),
    rect = element_blank(),
    axis.ticks = element_blank()
  )
```

---
layout: true
# ggplot2

---
name: ggsave

## Saving plots

You can save your `ggplot2`-based figures using `ggsave()`.

---

## `ggsave()`

Option 1: By default, `ggsave()` saves the last plot printed to the screen.

```{R, ex-ggsave-1, eval = F}
# Create a simple scatter plot
ggplot(data = fun_df, aes(x = x, y = y)) +
  geom_point()
# Save our simple scatter plot
ggsave(filename = "simple_scatter.pdf")
```

--

.note[Notes]

- This example creates a PDF. Change to `".png"` for PNG, *etc.*
- There are several helpful, optional arguments: `path`, `width`, `height`, `dpi`.

---

## `ggsave()`

Option 2: You can assign your `ggplot()` objects to memory.

```{R, ex-gg-assign, eval = F}
# Create a simple scatter plot named 'gg_points'
gg_points <- ggplot(data = fun_df, aes(x = x, y = y)) +
  geom_point()
```

--

We can then save this figure with the name `gg_points` using `ggsave()`.

```{R, ex-ggsave-2, eval = F}
# Save our simple scatter plot named 'gg_points'
ggsave(
  filename = "simple_scatter.pdf",
  plot = gg_points
)
```

---
layout: false
# Resources

## There's always more `ggplot2`

1. .mono[RStudio]'s [cheat sheet for `ggplot2`](https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf).
1. The `ggplot2` [reference index](https://ggplot2.tidyverse.org/reference/index.html).
1. The `tidyverse` [page](https://ggplot2.tidyverse.org) on `ggplot2`.
1. 