Introduction to ggplot2

The most popular data visualization framework in R

PHAC R Usergroup: Basics of R Summer Camp

2025-07-23

What is ggplot2?

ggplot2 is one of the most downloaded R packages in the world

ggplot2 overview

  • Part of the tidyverse suite of packages, ggplot2 fits well with dplyr syntax and pipe-based workflows

  • “gg” refers to “The Grammar of Graphics,” a 1999 book from Leland Wilkinson that provides a framework for creating statistical graphics in layered components

  • The “2” in ggplot2 reflects that this package supersedes an older package simply called ggplot; both were developed by Hadley Wickham

Anatomy of a plot

https://ggplot2.tidyverse.org/articles/ggplot2.html


Getting started - Data

tail(ggplot_downloads_df)
           date count package
3829 2025-06-25 54021 ggplot2
3830 2025-06-26 50323 ggplot2
3831 2025-06-27 50306 ggplot2
3832 2025-06-28 31488 ggplot2
3833 2025-06-29 33328 ggplot2
3834 2025-06-30 48941 ggplot2
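
For readers who do not have ggplot_downloads_df at hand, the six rows printed above can be rebuilt as a small stand-in data frame. This is only a sketch: the name downloads_tail_df is hypothetical, and storing date as a Date column is an assumption the slides do not confirm. The slides also do not say how the full data set was obtained.

```r
# Stand-in for the last six rows of ggplot_downloads_df
# (values copied from the tail() output above; the object name is hypothetical)
downloads_tail_df <- data.frame(
  date = as.Date(c("2025-06-25", "2025-06-26", "2025-06-27",
                   "2025-06-28", "2025-06-29", "2025-06-30")),  # assumed Date type
  count = c(54021, 50323, 50306, 31488, 33328, 48941),
  package = "ggplot2"
)

# Inspect column types before plotting
str(downloads_tail_df)
```

A data frame with this shape can be passed to ggplot() in place of the full data set to try out the code on the following slides.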

Getting started - Plot

library(ggplot2)

# ggplot2 needs the user to provide at least the following three components: 
# data, mapping, and layer

g2 <- ggplot(ggplot_downloads_df,          # Data
             aes(x = date, y = count)) +   # Mapping
             geom_point()                  # Layer

g2

What makes for a good plot?

  • It is easy to start with minimal viable code for a plot, but how does one make a cleaner, more polished graph?

    • Ironically, a less cluttered plot needs more lines of code
  • By using the remaining components (scales, themes, labels, and more), one can add features to the graph and remove unnecessary elements

  • Apply good data visualization practices

Polishing the plot - Adding another Geom

g3 <- ggplot(ggplot_downloads_df,          # Data
             aes(x = date, y = count)) +   # Mapping
             geom_point() +                # Layer
             geom_smooth()                 # Layer (additional)
  
g3
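
When called without arguments, geom_smooth() picks a smoothing method based on the size of the data and prints a message saying which one it used. A minimal sketch of stating the method and formula explicitly is shown here, using simulated data so that it runs on its own. The names sim_df and p_smooth are hypothetical, and loess is chosen only for illustration; it is not necessarily what the plot in these slides uses.

```r
library(ggplot2)

# Hypothetical data: 200 days of simulated counts with an upward trend
set.seed(1)
sim_df <- data.frame(
  date  = seq(as.Date("2025-01-01"), by = "day", length.out = 200),
  count = 30000 + 100 * seq_len(200) + rnorm(200, sd = 5000)
)

p_smooth <- ggplot(sim_df, aes(x = date, y = count)) +
  geom_point() +
  # Explicit method and formula; se = FALSE hides the confidence band
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

p_smooth
```

Naming the method makes the plot reproducible across data sets of different sizes and silences the message about the default choice.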

Polishing the plot - Scaling layer

library(scales)

g4 <- ggplot(ggplot_downloads_df,          # Data
             aes(x = date, y = count)) +   # Mapping
             geom_point() +                # Layer
             geom_smooth() +               # Layer (additional)
             scale_y_continuous(
               labels = label_number(
                 suffix = "K", scale = 1e-3)
               )                           # Scale
  
g4
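
To see what the scale is doing, the formatter from scales can be tried on its own. label_number() returns a function; the sketch below builds the same formatter used in the plot and applies it to a few round values. The name label_k is hypothetical.

```r
library(scales)

# Same formatter as in scale_y_continuous() above:
# multiply by 1e-3, then append "K"
label_k <- label_number(suffix = "K", scale = 1e-3)

# Apply it to a few round download counts
label_k(c(20000, 40000, 60000))
# "20K" "40K" "60K"
```

Testing a label function on a plain vector like this is a quick way to check axis text before adding it to a plot.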

Polishing the plot - Adding theme layer

g5 <- ggplot(ggplot_downloads_df,          # Data
             aes(x = date, y = count)) +   # Mapping
             geom_point() +                # Layer
             geom_smooth() +               # Layer (additional)
             scale_y_continuous(
               labels = label_number(
                 suffix = "K", scale = 1e-3)
               ) +                         # Scale
             theme_minimal()               # Theme
g5

Polishing the plot - Specifying Labels

g6 <- ggplot(ggplot_downloads_df,          # Data
             aes(x = date, y = count)) +   # Mapping
             geom_point() +                # Layer
             geom_smooth() +               # Layer (additional)
             scale_y_continuous(
               labels = label_number(
                 suffix = "K", scale = 1e-3)
               ) +                         # Scale
             theme_minimal() +             # Theme
             labs(title = "ggplot2's rise in popularity over the past decade",
                  subtitle = "Number of daily downloads from CRAN",
                  x = NULL,
                  y = NULL)                # Labels
g6

Polishing the plot - Modifying Colour

g7 <- ggplot(ggplot_downloads_df,          # Data
             aes(x = date, y = count)) +   # Mapping
             geom_point(colour = "lightgrey") + # Layer
             geom_smooth() +               # Layer (additional)
             scale_y_continuous(
               labels = label_number(
                 suffix = "K", scale = 1e-3)
               ) +                         # Scale
             theme_minimal() +             # Theme
             labs(title = "ggplot2's rise in popularity over the past decade",
                  subtitle = "Number of daily downloads from CRAN",
                  x = NULL,
                  y = NULL)                # Labels
g7

From start to finish

Where to go from here?

  • Experiment with different types of graphs
  • Change default parameters
  • Learn how to incorporate ggplot2 graphs in R Markdown/Quarto outputs
  • Implement reactive ggplot2 objects within Shiny apps
  • Explore interactive plots via the ggiraph or plotly packages (plotly's ggplotly() function converts a ggplot2 object)
  • Compare and contrast ggplot2 with base R or different libraries
  • Match Government of Canada look and feel
  • Make visuals colour-blind friendly and generally compliant with accessibility standards
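
Two of the suggestions above, trying a different type of graph and choosing a colour-blind-friendly palette, can be combined in one short sketch. It uses the mpg data set that ships with ggplot2 instead of the download data, and the viridis palette is just one possible choice, not a recommendation taken from these slides. The name p_box is hypothetical.

```r
library(ggplot2)

# Box plot of highway fuel economy by vehicle class, filled by drive train
p_box <- ggplot(mpg, aes(x = class, y = hwy, fill = drv)) +  # Data and mapping
  geom_boxplot() +                                           # Layer (a different geom)
  scale_fill_viridis_d() +                                   # Scale (viridis discrete palette)
  theme_minimal() +                                          # Theme
  labs(x = NULL,
       y = "Highway miles per gallon",
       fill = "Drive train")                                 # Labels

p_box
```

The structure is the same as in the earlier slides: only the geom, the scale, and the labels change.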

Example plot types

https://ouyanglab.com/covid19dataviz/ggplot2.html


Resources