Understand Data Types in R

Mastering Numbers, Text, Dates, Times, and Logic for Powerful Data Exploration

Masumbuko Semba, Zac Maritin, Emmanuel Mpina

What is R?

  • R is a programming language and environment commonly used for statistical computing and data analysis.

  • It supports various data types. Some of the fundamental data types in R are:

    1. Numeric
    2. Integer
    3. Character
    4. Logical
    5. Factor
    6. Date and time
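
A quick way to see these types side by side is to ask R for the class of one value of each. The sketch below is illustrative only: the sample values and the names `values` and `types` are assumptions, not part of the original slides.

```r
# One illustrative value per fundamental type (values chosen arbitrarily)
values <- list(
  numeric   = 3.14,
  integer   = 42L,
  character = "Nairobi",
  logical   = TRUE,
  factor    = factor("red"),
  date      = Sys.Date()
)

# class() reports how R treats each value
types <- sapply(values, class)
print(types)
```

Date-times created with Sys.time() are left out of this table because they report two classes ("POSIXct" and "POSIXt") rather than one.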

Numeric

  • Represents numbers that can have decimal places. This is R's default type for numbers, stored as double-precision floating point.
  • Examples: 3.14159, 2.71828, -0.693147.
  • Examples in action: Measuring distances, calculating averages, representing scientific constants.
# Example of numeric data type
x <- 5.2
y <- 3.8
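
To make the "calculating averages" use case concrete, here is a minimal sketch that reuses x and y from the example above. The names `total` and `average` are introduced only for illustration.

```r
x <- 5.2
y <- 3.8

# Ordinary arithmetic works on numeric values
total <- x + y
average <- mean(c(x, y))

class(x)       # "numeric"
typeof(x)      # "double": numerics are stored as double precision
is.numeric(x)  # TRUE

# A number typed without a decimal point is still numeric, not integer
class(5)       # "numeric"
```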

Integer

  • Represents whole numbers with no decimal places.
  • Examples: 1L, 10L, -50L. Without the L suffix, R stores 1, 10, and -50 as numeric.
  • Examples in action: Counting objects, representing years, recording scores.
# Example of integer data type
a <- 10L  # The 'L' suffix indicates an integer literal
b <- 3L
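
The effect of the L suffix is easiest to see by comparing classes and a few operations. This is a small sketch that reuses a and b from above; the result names are assumptions added for illustration.

```r
a <- 10L
b <- 3L

class(a)    # "integer"
class(10)   # "numeric": no L suffix, so R stores a double

# Addition, integer division, and modulo keep the integer type
sum_ab       <- a + b    # 13
quotient_ab  <- a %/% b  # 3
remainder_ab <- a %% b   # 1

# Ordinary division always returns a numeric (double)
ratio_ab <- a / b
class(ratio_ab)          # "numeric"
```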

Character

  • Represents a sequence of characters enclosed in single or double quotes.
  • Example: "Hello world!", "Nairobi".
  • Example in action: Storing names, titles, descriptions, and text data from various sources.
# Example of character data type
name <- "John"
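
A few base R functions cover the most common operations on character values. The sketch below reuses name from above; `greeting` and `shout` are hypothetical names added for illustration.

```r
name <- "John"

class(name)   # "character"
nchar(name)   # 4: number of characters in the string

# Single and double quotes create the same value
identical('John', "John")  # TRUE

# Combine and transform text
greeting <- paste("Hello,", name)  # "Hello, John"
shout    <- toupper(name)          # "JOHN"
```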

Logical

  • Logical data type is used for storing Boolean values (TRUE or FALSE).
is_true <- TRUE
is_false <- FALSE

# Comparisons return logical values
2 == 2
[1] TRUE
2 == 3
[1] FALSE
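
Logical values become most useful when a comparison is applied to a whole vector. The sketch below assumes a made-up vector of scores and a pass mark of 50; neither comes from the original slides.

```r
# Hypothetical exam scores, for illustration only
scores <- c(45, 72, 88, 30)

# A comparison on a vector returns one logical value per element
passed <- scores >= 50   # FALSE TRUE TRUE FALSE

# TRUE counts as 1 and FALSE as 0 in arithmetic
n_passed    <- sum(passed)    # 2
share_passed <- mean(passed)  # 0.5

# A logical vector can also select elements
passing_scores <- scores[passed]  # 72 88
```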

Factor

  • Represents a categorical variable with a fixed set of levels, primarily used for nominal data.
  • Example: factor(c("Male", "Female", "Unknown")), factor(color_codes, levels = c("red", "green", "blue")).
  • Example in action: Grouping and analyzing text data based on categories such as gender, sentiment labels, or topic classifications.
# Example of factor data type
gender <- c("Male", "Female", "Male", "Female", "Male")
gender_factor <- factor(gender)
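
Once a factor exists, its levels and counts can be inspected directly. The sketch below reuses gender_factor from above; the `size` factor is a hypothetical addition showing how an explicit level order can be supplied.

```r
gender <- c("Male", "Female", "Male", "Female", "Male")
gender_factor <- factor(gender)

# Levels are sorted alphabetically by default
levels(gender_factor)   # "Female" "Male"

# Count observations per level
gender_counts <- table(gender_factor)   # Female: 2, Male: 3

# Internally, each value is stored as an integer code for its level
gender_codes <- as.integer(gender_factor)   # 2 1 2 1 2

# Hypothetical ordered factor with an explicit level order
size <- factor(c("small", "large", "medium"),
               levels = c("small", "medium", "large"),
               ordered = TRUE)
size[1] < size[2]   # TRUE: "small" comes before "large"
```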

Date and time

  • R has dedicated classes for dates and times: Date for calendar dates, and POSIXct and POSIXlt for date-times.
# Example of date and time data types
today <- Sys.Date()
current_time <- Sys.time()
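
Because dates are stored as a dedicated class rather than as text, R can do arithmetic on them. The sketch below uses fixed, made-up dates (`launch` and `year_end` are hypothetical) so the results do not depend on when the code is run.

```r
today <- Sys.Date()
current_time <- Sys.time()

class(today)          # "Date"
class(current_time)   # "POSIXct" "POSIXt"

# Hypothetical fixed dates, parsed from ISO-format text
launch   <- as.Date("2024-03-15")
year_end <- as.Date("2024-12-31")

# Adding a number to a Date shifts it by that many days
follow_up <- launch + 30                        # "2024-04-14"

# Subtracting two dates gives the gap in days
days_left <- as.numeric(year_end - launch)      # 291

# format() turns a date back into text in a chosen layout
launch_label <- format(launch, "%d/%m/%Y")      # "15/03/2024"
```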