Lecture Lab 2

Leon Eyrich Jessen

Data Visualisation I

First: Make sure you’re on track!

Lab I Learning Objectives

  • Navigate the RStudio IDE and master the very basics of R
  • Create, edit and run a basic Quarto document
  • Explain why reproducible data analysis is important
  • Describe the components of a reproducible data analysis

Where should you be by now?

  • In a group
  • On Piazza
  • On the RStudio Cloud Server
  • On GitHub

If any of this is not the case, see the Getting Started section on the course site

What is Data Visualisation?

  • Think about Hadley’s and Hans’ lectures, the book chapter and paper you read for today
  • Use the next 5 minutes to think and talk to the person next to you
  • Then, write your thoughts on Piazza in the “LIVE Q&A”

On Data Visualisation

  • Depending on where you read, something along the lines of:
    • The graphical representation of information and data
    • Uses visual elements like charts, graphs and maps
    • Provides an accessible way to see and understand trends, outliers and patterns
  • Data visualisation is your means to summarise and communicate key messages
    • Within research
    • Within industry
  • It is not easy!
    • Anyone can make a plot, but an impactful data visualisation requires real skill

What is wrong here?

Should have been

What’s wrong here?

Should have been

Or maybe even

What is this?

Or this?

Or this?

Or this?

Or this?

Or this?

Or this?

…and then there is this one?!?

…and then there is this one?!?

Or this?

  • Don’t even know where to begin with this one…

In Summary on Data Visualisation

  • Think carefully about exactly what you want to communicate with your visualisation

  • Remove redundant information

  • Be honest and show your data

  • Less is more: do not cram 3 plots into 1

  • Do not make a fancy but information-poor plot

  • Think about colour choice: colours should separate groups clearly, and quite a few people are colour blind

  • Today is meant as an intro; data visualisation will be an integrated part of the rest of the course

ggplot, the Grammar of Graphics

The basic syntax of ggplot

  • In a new code chunk in your Quarto document, input and run:
ggplot(data = my_data,
       mapping = aes(x = v1, y = v2)) +
  geom_point()
  • Define your data

  • Map variables in your data to your visualisation

  • Choose a graphical representation

  • Let us look at that in another way
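
As a minimal sketch of the three steps above, the plot can also be built one step at a time. Here `my_data` is only a stand-in (an assumption for illustration), constructed from the built-in `datasets::cars` data with its columns renamed to `v1` and `v2`; the object name `p` is likewise arbitrary.

```r
library("ggplot2")  # part of the tidyverse

# Assumption: a stand-in for `my_data`, built from the built-in `cars` data
my_data <- data.frame(v1 = datasets::cars$speed,
                      v2 = datasets::cars$dist)

# 1. Define your data
p <- ggplot(data = my_data)

# 2. Map variables in the data to aesthetics of the plot
p <- p + aes(x = v1, y = v2)

# 3. Choose a graphical representation
p <- p + geom_point()

# Printing the object draws the plot
p
```

Running `p` on its own line in a code chunk draws the plot; until a geom is added in step 3, only empty axes would be shown.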

ggplot, the Grammar of Graphics

Example - Scatter plot

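# Load the tidyverse, which includes ggplot2
library("tidyverse")

# Scatter plot of reaction rate against substrate concentration
ggplot(data = datasets::Puromycin,
       mapping = aes(x = conc,
                     y = rate)) +
  geom_point()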
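
One possible extension of the scatter plot, tying back to the point about colour choice: `Puromycin` has a third column, `state`, which can be mapped to colour. The sketch below uses `scale_colour_viridis_d()` from ggplot2, a palette designed to remain readable for people with colour blindness. The `end = 0.8` setting and the object name `p` are illustrative choices, not part of the original example.

```r
library("ggplot2")  # part of the tidyverse

# Same data and x/y mapping as the scatter plot, plus `state` mapped to colour
p <- ggplot(data = datasets::Puromycin,
            mapping = aes(x = conc,
                          y = rate,
                          colour = state)) +
  geom_point() +
  # Discrete viridis palette; `end = 0.8` avoids the palest yellow
  scale_colour_viridis_d(end = 0.8)

p
```

Only the mapping and the colour scale changed; the data and the geom stay the same, which is the idea behind building plots from a grammar.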

Break, and then it is time for exercises…