CSV vs XLSX Formats and Importing Data in R

Mastering data import in R

Masumbuko Semba

2024-02-16

Learning Agenda

  1. Get familiar with R and RStudio
  2. Data structure and data types
  3. Reading and writing data in RStudio
  4. Tidying and manipulating data with tidyverse
  5. Plotting and Visualization
  6. Descriptive Statistics
  7. Inferential Statistics
  8. Modelling and simulation
  9. Spatial Handling and Analysis

csv format

  • Definition: CSV (Comma-Separated Values)
  • Structure: Plain text file with data separated by commas
  • Advantages: Lightweight, easy to create and read, widely supported

Note

Disadvantages: Stores no formatting, formulas, or column types, and holds a single table only (no support for multiple sheets)
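To make the "plain text separated by commas" structure concrete, here is a minimal sketch in base R. The file contents (species, lengths, weights) are invented purely for illustration, and the file is written to a temporary location.

```r
# Write a tiny CSV file by hand to show that it is only plain text.
# The column names and values are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("species,length_cm,weight_g",
    "tuna,85.2,6400",
    "sardine,14.5,32"),
  csv_path
)

# Reading the file back as raw text shows the header row and the commas
raw_lines <- readLines(csv_path)
print(raw_lines)

# Splitting the header line on commas recovers the column names
header <- strsplit(raw_lines[1], ",", fixed = TRUE)[[1]]
print(header)
```

Because the file is just text, it can be opened in any text editor, which is a large part of why the format is so widely supported.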

csv format …

  • A csv file is read into R with the read_csv() function
  • The function comes from the readr package
  • The simplest way to get readr is to install the tidyverse package
  • tidyverse is an ecosystem of packages that includes readr
# Install the tidyverse package (needed only once per computer)
install.packages("tidyverse")

Warning

Installing a package is not enough. It must also be loaded in every new R session before its functions and features can be used.
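Once the package is installed and loaded, reading a file might look like the sketch below. It assumes readr is already installed; the file and its contents (site names, depths, temperatures) are hypothetical and created on the fly so the example runs anywhere.

```r
library(readr)  # also attached by library(tidyverse)

# Create a small example file; the data are invented for illustration
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("site,depth_m,temp_c",
    "Pemba,10,27.5",
    "Mafia,25,26.1"),
  csv_path
)

# read_csv() returns a tibble and guesses a type for each column
sites <- read_csv(csv_path, show_col_types = FALSE)
print(sites)
```

With your own data, csv_path would simply be the path to a file on disk, for example a file inside the project folder.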

xlsx format

  • Definition: XLSX (Excel Workbook)
  • Structure: Zipped collection of XML files (Office Open XML) holding data, styles, and multiple sheets
  • Advantages: Rich formatting, supports multiple sheets, formulas, charts
  • Disadvantages: Larger file size, more complex than CSV
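As a rough sketch of how a multi-sheet workbook can be read in R, the example below uses the readxl package. This is an assumption on our part, since the slides do not name a package for xlsx files; readxl is installed together with tidyverse but is not attached by library(tidyverse), so it is loaded explicitly here. The workbook is an example file that ships with readxl.

```r
library(readxl)  # installed with tidyverse, but loaded separately

# readxl ships with example workbooks; this returns the path to one of them
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheets in the workbook, something a csv file cannot have
sheet_names <- excel_sheets(xlsx_path)
print(sheet_names)

# Read one sheet by name into a tibble
mtcars_tbl <- read_excel(xlsx_path, sheet = "mtcars")
print(head(mtcars_tbl))
```

Unlike a csv file, the workbook cannot be read meaningfully as raw text, which is the practical side of "more complex than CSV".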

csv format …

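  • Load the package in R to use its functions
  • You can load a package with the require() function, which returns FALSE with a warning if the package is missing
  • Alternatively, use the library() function, which stops with an error if the package is missing
# Load the package
require(tidyverse)
# or, equivalently when the package is installed
library(tidyverse)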
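The practical difference between the two functions only shows up when a package is missing. The sketch below illustrates this with a deliberately made-up package name (notARealPackage123 is hypothetical and assumed not to be installed).

```r
# A package name that is assumed not to exist, used only for illustration
fake_pkg <- "notARealPackage123"

# require() returns FALSE (normally with a warning) when the package is missing
loaded <- suppressWarnings(
  require(fake_pkg, character.only = TRUE, quietly = TRUE)
)
print(loaded)

# library() stops with an error instead, which we catch here to inspect it
result <- tryCatch(
  {
    library(fake_pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)
print(result)
```

In scripts, library() is often the safer choice, because a missing package stops the script immediately instead of causing a confusing failure further down.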