Dates and Times

Authors: Greg Ridgeway (University of Pennsylvania), Ruth Moyer (University of Pennsylvania)

Published: August 14, 2025

1 Introduction

Working with dates and times is a lot different from working with the more familiar numbers. Months have different numbers of days. Sometimes we count hours of the day up to 12 and then start over. Sometimes we count hours up to 24 and then start over. Some years have 366 days. The world has more than 24 time zones, some of them offset by 30 or 45 minutes. Twice a year we switch clocks for Daylight Saving Time, except in some places like Arizona. Arithmetic, such as adding one month to a date, is poorly defined. Which date is one month after January 31st? Is it February 28th? Or is it March 3rd?

Fortunately, software for working with dates exists to make these tasks easier. Unfortunately, every system seems to make its own design decisions. Excel stores dates as the number of days since January 0, 1900… that’s not a typo… they count from January 0, 1900. Linux systems count seconds since midnight UTC on January 1, 1970. SPSS stores times as the number of seconds since midnight October 14, 1582, the adoption date of the Gregorian calendar. Much of the world did not adopt the calendar in 1582. The American colonies did not adopt the Gregorian calendar until 1752, along with Great Britain. So beware if you are a historian digging through centuries-old data. Aligning dates can become very messy.

R has had a variety of attempts at providing a means for managing dates. We are going to use the lubridate package, which addresses just about everything you might need to do with dates and times.

Do not use as.Date(). lubridate has easier-to-read date formatting, handles dates in different formats more intelligently, and has better date arithmetic.

lubridate is not part of R by default. You will need to install it. Simply run

install.packages("lubridate")

and R will hit the web, download the lubridate package and any supporting packages it needs (and it does need a few), and install them. This is a one-time event. Once you have lubridate on your machine, you will not need to reinstall it every time you need it.
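If you share a script with others, one way to keep the installation a one-time event is to check whether the package is already available before installing. The following is a minimal sketch using base R's requireNamespace(); it assumes an internet connection is available when the package turns out to be missing.

```r
# Install lubridate only if it is not already available on this machine
if (!requireNamespace("lubridate", quietly = TRUE)) {
   install.packages("lubridate")
}
```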

Some of our students, particularly on Macs, have encountered trouble installing some packages for R. R will sometimes try to download the source code for the packages and compile them from scratch on your machine. Sometimes that goes well and other times it requires that you have other tools installed on your machine. An easy solution is to run

install.packages("lubridate", type="mac.binary")

instead, which insists that R find and install a ready-to-use version of the packages.

2 Working with dates

While lubridate is now installed, once per R session you will need to load lubridate. We also load dplyr for the data manipulation functions used below.

library(lubridate)
library(dplyr)

If you close R and restart it, then you’ll need to run these lines again.

Let’s reload the sample of Chicago crime data discussed in the introductory notes, available on the R4Crim GitHub site.

load("chicago crime 20141124-20141209.RData")

Let’s extract five dates from the chicagoCrime dataset.

chicagoCrime |>
   select(Date) |>
   slice(c(1,2500,5000,7500,10000))
                    Date
1 12/09/2014 11:54:00 PM
2 12/05/2014 11:00:00 PM
3 12/02/2014 10:58:00 AM
4 11/28/2014 02:35:00 PM
5 11/24/2014 12:30:00 AM

As you can see, the dates include the date in month/day/year format and the time on a 12-hour AM/PM clock. R has no idea that these values represent dates. You are familiar with this date formatting, but R just thinks they are strings of characters. Use substring() to extract just the date part, the first 10 characters of Date.

textDate <- chicagoCrime |>
   select(Date) |>
   mutate(Date = substring(Date, 1, 10)) |>
   slice(c(1,2500,5000,7500,10000)) |>
   pull(Date)
textDate
[1] "12/09/2014" "12/05/2014" "12/02/2014" "11/28/2014" "11/24/2014"

Now let’s use the mdy() function from the lubridate package to tell R that these are not just strings of characters, but they actually represent months, days, and years.

b <- mdy(textDate)
is(b)
b
[1] "Date"     "oldClass"
[1] "2014-12-09" "2014-12-05" "2014-12-02" "2014-11-28" "2014-11-24"

b now stores those five dates in a format that recognizes the month, day, and year. is(b) tells us that R is storing b as a date. There are different functions for other date formats depending on the ordering of the day, month and year, like dmy() and ymd() and even mdy_hms() for month, day, year, hours, minutes, seconds format.
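As a small illustration of the other parsing functions just mentioned, the sketch below writes the same date in three different orderings and parses each with the matching function. The example strings are made up for illustration and are not taken from the Chicago data.

```r
library(lubridate)

# The same date written three different ways
d1 <- mdy("12/09/2014")   # month/day/year
d2 <- dmy("09/12/2014")   # day/month/year
d3 <- ymd("2014-12-09")   # year-month-day

d1 == d2   # TRUE
d2 == d3   # TRUE

# A date with a time on a 12-hour clock; AM/PM is recognized
t1 <- mdy_hms("12/09/2014 11:54:00 PM")
t1         # "2014-12-09 23:54:00 UTC"
```

Choosing the function whose name matches the order of the pieces in your text is the main decision; the separators between the pieces can vary.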

Now that R knows these are dates, the lubridate package provides a lot of functions to help you work with dates.

year(b)
month(b)
month(b, label=TRUE)
month(b, label=TRUE, abbr=FALSE)
wday(b, label=TRUE)
[1] 2014 2014 2014 2014 2014
[1] 12 12 12 11 11
[1] Dec Dec Dec Nov Nov
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
[1] December December December November November
12 Levels: January < February < March < April < May < June < ... < December
[1] Tue Fri Tue Fri Mon
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

Subtraction will tell you the time between two dates. How many days since December 1, 2014? How many days have passed from the values in b to today? The now() function gives you the date and time, well… right now.

b - mdy("12/01/2014")
Time differences in days
[1]  8  4  1 -3 -7
date(now()) - b
Time differences in days
[1] 3901 3905 3908 3912 3916

When subtracting dates, R will make a good guess for the unit of time to use in the result. Use difftime() if you want to be specific about the unit of time and not leave it up to R to decide.

difftime(b, mdy("12/01/2014"), units = "days")
Time differences in days
[1]  8  4  1 -3 -7
difftime(b, mdy("12/01/2014"), units = "hours")
Time differences in hours
[1]  192   96   24  -72 -168
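Time differences carry their units with them. If you need plain numbers for further calculations, one option is as.numeric() with a units argument, sketched here with two of the dates used above.

```r
library(lubridate)

gap <- mdy("12/09/2014") - mdy("12/01/2014")
gap                                # Time difference of 8 days

# Convert to plain numbers in the unit of your choice
as.numeric(gap, units = "days")    # 8
as.numeric(gap, units = "hours")   # 192
as.numeric(gap, units = "weeks")   # 8/7, about 1.14
```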

We can add time to the dates as well.

b + dyears(1) # adds a duration of 365.25 days, does not increase the year by 1
[1] "2015-12-09 06:00:00 UTC" "2015-12-05 06:00:00 UTC"
[3] "2015-12-02 06:00:00 UTC" "2015-11-28 06:00:00 UTC"
[5] "2015-11-24 06:00:00 UTC"
b + ddays(31)
[1] "2015-01-09" "2015-01-05" "2015-01-02" "2014-12-29" "2014-12-25"
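lubridate distinguishes durations, such as dyears() and ddays(), which are a fixed number of seconds, from periods, such as years(), months(), and days(), which follow the calendar. The sketch below returns to the question from the introduction about the date one month after January 31st and shows how the answer depends on which kind of arithmetic you ask for.

```r
library(lubridate)

jan31 <- mdy("01/31/2025")

# Period arithmetic: February 31 does not exist, so the result is NA
jan31 + months(1)

# %m+% rolls back to the last valid day of the month
jan31 %m+% months(1)    # "2025-02-28"

# Adding 31 days instead lands in March
jan31 + days(31)        # "2025-03-03"

# A period of one year increases the year by 1 and keeps the month and day
mdy("12/09/2014") + years(1)   # "2015-12-09"
```

Neither answer is the correct one in general. Which one you want depends on whether your problem is about calendar months or about elapsed time.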

Now let’s go ahead and create a new column in our Chicago dataset containing properly stored dates.

chicagoCrime <-
   chicagoCrime |>
   mutate(realdate = mdy_hms(Date))

chicagoCrime |>
   select(Date, realdate) |>
   head() # show the dates in the first few rows
                    Date            realdate
1 12/09/2014 11:54:00 PM 2014-12-09 23:54:00
2 12/09/2014 11:45:00 PM 2014-12-09 23:45:00
3 12/09/2014 11:42:00 PM 2014-12-09 23:42:00
4 12/09/2014 11:42:00 PM 2014-12-09 23:42:00
5 12/09/2014 11:40:00 PM 2014-12-09 23:40:00
6 12/09/2014 11:37:00 PM 2014-12-09 23:37:00

lubridate has converted the date and time formats to a more standardized form, one that is easier to use on a computer.
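Once the times are stored as proper date-times, the time-of-day components can be pulled out in the same way as the year and month. The sketch below uses two timestamps typed in by hand in the same format as the Date column, so it runs without the Chicago data.

```r
library(lubridate)

x <- mdy_hms(c("12/09/2014 11:54:00 PM", "12/02/2014 10:58:00 AM"))

hour(x)                 # 23 10
minute(x)               # 54 58
pm(x)                   # TRUE FALSE
floor_date(x, "hour")   # rounds down to 23:00:00 and 10:00:00
```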

The default timezone is Coordinated Universal Time, abbreviated UTC, which for most practical purposes is the same as Greenwich Mean Time. Interestingly, the abbreviation CUT would make more sense in English, but TUC would make more sense in French, so the compromise was to universally abbreviate as UTC. Since all of these crimes occurred in Chicago, let’s explicitly set the timezone to Central Time. The function OlsonNames() will give you a list of all possible time zones you can use.

chicagoCrime <-
   chicagoCrime |>
   mutate(realdate = force_tz(realdate, "America/Chicago"))
chicagoCrime$realdate[1:5]   # show just the first five dates
[1] "2014-12-09 23:54:00 CST" "2014-12-09 23:45:00 CST"
[3] "2014-12-09 23:42:00 CST" "2014-12-09 23:42:00 CST"
[5] "2014-12-09 23:40:00 CST"

Now when printed you can see that the timezone is set to Central Standard Time. R will automatically handle Daylight Saving Time. Note that an August date reports Central Daylight Time.

mdy_hms("8/1/2014 12:00:00") |> 
   force_tz("America/Chicago")
[1] "2014-08-01 12:00:00 CDT"

Note that force_tz() keeps the dates and times the same, but overwrites the timezone. If you want to look up what the date and time would be in a different timezone, then use with_tz().

mdy_hms("8/1/2014 12:00:00") |>
   force_tz("America/Chicago") |> # force to CDT
   with_tz("America/New_York")    # get time in EDT
[1] "2014-08-01 13:00:00 EDT"
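One way to see the difference between the two functions is to compare each result with the original moment in time. In this sketch, with_tz() leaves the instant unchanged and only changes how it is displayed, while force_tz() keeps the clock reading and therefore moves to a different instant.

```r
library(lubridate)

chi <- mdy_hms("8/1/2014 12:00:00", tz = "America/Chicago")

# with_tz(): same instant, different clock reading
ny_same <- with_tz(chi, "America/New_York")
ny_same == chi                             # TRUE
difftime(ny_same, chi, units = "hours")    # 0 hours

# force_tz(): same clock reading, different instant
ny_forced <- force_tz(chi, "America/New_York")
# noon in New York comes one hour before noon in Chicago
difftime(ny_forced, chi, units = "hours")  # -1 hours
```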

We can actually find out when Daylight Saving Time ends. Start at noon on November 1 in Chicago time and step forward in 24-hour increments through the rest of November.

mdy_hms("11/1/2025 12:00:00", tz="America/Chicago") + ddays(0:29)
 [1] "2025-11-01 12:00:00 CDT" "2025-11-02 11:00:00 CST"
 [3] "2025-11-03 11:00:00 CST" "2025-11-04 11:00:00 CST"
 [5] "2025-11-05 11:00:00 CST" "2025-11-06 11:00:00 CST"
 [7] "2025-11-07 11:00:00 CST" "2025-11-08 11:00:00 CST"
 [9] "2025-11-09 11:00:00 CST" "2025-11-10 11:00:00 CST"
[11] "2025-11-11 11:00:00 CST" "2025-11-12 11:00:00 CST"
[13] "2025-11-13 11:00:00 CST" "2025-11-14 11:00:00 CST"
[15] "2025-11-15 11:00:00 CST" "2025-11-16 11:00:00 CST"
[17] "2025-11-17 11:00:00 CST" "2025-11-18 11:00:00 CST"
[19] "2025-11-19 11:00:00 CST" "2025-11-20 11:00:00 CST"
[21] "2025-11-21 11:00:00 CST" "2025-11-22 11:00:00 CST"
[23] "2025-11-23 11:00:00 CST" "2025-11-24 11:00:00 CST"
[25] "2025-11-25 11:00:00 CST" "2025-11-26 11:00:00 CST"
[27] "2025-11-27 11:00:00 CST" "2025-11-28 11:00:00 CST"
[29] "2025-11-29 11:00:00 CST" "2025-11-30 11:00:00 CST"

Looks like by 11am on November 2, 2025, Chicago is back to Central Standard Time.
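The clock reading shifts from noon to 11am because ddays() adds exact 24-hour durations, and the day on which the clocks fall back lasts 25 hours. If you instead want the same clock time on each calendar day, lubridate's period function days() is one option, as this short sketch shows.

```r
library(lubridate)

start <- mdy_hms("11/1/2025 12:00:00", tz = "America/Chicago")

# Duration: exactly 24 hours later, which is 11am after the clocks fall back
start + ddays(1)   # "2025-11-02 11:00:00 CST"

# Period: one calendar day later at the same clock time
start + days(1)    # "2025-11-02 12:00:00 CST"
```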

3 Exercises

  1. At what hour does Daylight Saving Time end? (Hint: Try using dminutes() to add time to the date DST ends)
  2. Thanksgiving occurs on the fourth Thursday in November. On what date will Thanksgiving fall in 2025? Hints:
    • Try listing all dates in November
    • Use wday() to get the weekday
    • find the fourth Thursday
  3. Make a function that takes as input a year and returns the date of Thanksgiving in that year. Here’s a template to start
tday <- function(year)
{

   return( )
}

4 Solutions to the exercises

  1. At what hour does Daylight Saving Time end?
mdy("11/2/2025", tz="America/Chicago") + dminutes(1:180)
  [1] "2025-11-02 00:01:00 CDT" "2025-11-02 00:02:00 CDT"
  [3] "2025-11-02 00:03:00 CDT" "2025-11-02 00:04:00 CDT"
  [5] "2025-11-02 00:05:00 CDT" "2025-11-02 00:06:00 CDT"
  [7] "2025-11-02 00:07:00 CDT" "2025-11-02 00:08:00 CDT"
  [9] "2025-11-02 00:09:00 CDT" "2025-11-02 00:10:00 CDT"
 [11] "2025-11-02 00:11:00 CDT" "2025-11-02 00:12:00 CDT"
 [13] "2025-11-02 00:13:00 CDT" "2025-11-02 00:14:00 CDT"
 [15] "2025-11-02 00:15:00 CDT" "2025-11-02 00:16:00 CDT"
 [17] "2025-11-02 00:17:00 CDT" "2025-11-02 00:18:00 CDT"
 [19] "2025-11-02 00:19:00 CDT" "2025-11-02 00:20:00 CDT"
 [21] "2025-11-02 00:21:00 CDT" "2025-11-02 00:22:00 CDT"
 [23] "2025-11-02 00:23:00 CDT" "2025-11-02 00:24:00 CDT"
 [25] "2025-11-02 00:25:00 CDT" "2025-11-02 00:26:00 CDT"
 [27] "2025-11-02 00:27:00 CDT" "2025-11-02 00:28:00 CDT"
 [29] "2025-11-02 00:29:00 CDT" "2025-11-02 00:30:00 CDT"
 [31] "2025-11-02 00:31:00 CDT" "2025-11-02 00:32:00 CDT"
 [33] "2025-11-02 00:33:00 CDT" "2025-11-02 00:34:00 CDT"
 [35] "2025-11-02 00:35:00 CDT" "2025-11-02 00:36:00 CDT"
 [37] "2025-11-02 00:37:00 CDT" "2025-11-02 00:38:00 CDT"
 [39] "2025-11-02 00:39:00 CDT" "2025-11-02 00:40:00 CDT"
 [41] "2025-11-02 00:41:00 CDT" "2025-11-02 00:42:00 CDT"
 [43] "2025-11-02 00:43:00 CDT" "2025-11-02 00:44:00 CDT"
 [45] "2025-11-02 00:45:00 CDT" "2025-11-02 00:46:00 CDT"
 [47] "2025-11-02 00:47:00 CDT" "2025-11-02 00:48:00 CDT"
 [49] "2025-11-02 00:49:00 CDT" "2025-11-02 00:50:00 CDT"
 [51] "2025-11-02 00:51:00 CDT" "2025-11-02 00:52:00 CDT"
 [53] "2025-11-02 00:53:00 CDT" "2025-11-02 00:54:00 CDT"
 [55] "2025-11-02 00:55:00 CDT" "2025-11-02 00:56:00 CDT"
 [57] "2025-11-02 00:57:00 CDT" "2025-11-02 00:58:00 CDT"
 [59] "2025-11-02 00:59:00 CDT" "2025-11-02 01:00:00 CDT"
 [61] "2025-11-02 01:01:00 CDT" "2025-11-02 01:02:00 CDT"
 [63] "2025-11-02 01:03:00 CDT" "2025-11-02 01:04:00 CDT"
 [65] "2025-11-02 01:05:00 CDT" "2025-11-02 01:06:00 CDT"
 [67] "2025-11-02 01:07:00 CDT" "2025-11-02 01:08:00 CDT"
 [69] "2025-11-02 01:09:00 CDT" "2025-11-02 01:10:00 CDT"
 [71] "2025-11-02 01:11:00 CDT" "2025-11-02 01:12:00 CDT"
 [73] "2025-11-02 01:13:00 CDT" "2025-11-02 01:14:00 CDT"
 [75] "2025-11-02 01:15:00 CDT" "2025-11-02 01:16:00 CDT"
 [77] "2025-11-02 01:17:00 CDT" "2025-11-02 01:18:00 CDT"
 [79] "2025-11-02 01:19:00 CDT" "2025-11-02 01:20:00 CDT"
 [81] "2025-11-02 01:21:00 CDT" "2025-11-02 01:22:00 CDT"
 [83] "2025-11-02 01:23:00 CDT" "2025-11-02 01:24:00 CDT"
 [85] "2025-11-02 01:25:00 CDT" "2025-11-02 01:26:00 CDT"
 [87] "2025-11-02 01:27:00 CDT" "2025-11-02 01:28:00 CDT"
 [89] "2025-11-02 01:29:00 CDT" "2025-11-02 01:30:00 CDT"
 [91] "2025-11-02 01:31:00 CDT" "2025-11-02 01:32:00 CDT"
 [93] "2025-11-02 01:33:00 CDT" "2025-11-02 01:34:00 CDT"
 [95] "2025-11-02 01:35:00 CDT" "2025-11-02 01:36:00 CDT"
 [97] "2025-11-02 01:37:00 CDT" "2025-11-02 01:38:00 CDT"
 [99] "2025-11-02 01:39:00 CDT" "2025-11-02 01:40:00 CDT"
[101] "2025-11-02 01:41:00 CDT" "2025-11-02 01:42:00 CDT"
[103] "2025-11-02 01:43:00 CDT" "2025-11-02 01:44:00 CDT"
[105] "2025-11-02 01:45:00 CDT" "2025-11-02 01:46:00 CDT"
[107] "2025-11-02 01:47:00 CDT" "2025-11-02 01:48:00 CDT"
[109] "2025-11-02 01:49:00 CDT" "2025-11-02 01:50:00 CDT"
[111] "2025-11-02 01:51:00 CDT" "2025-11-02 01:52:00 CDT"
[113] "2025-11-02 01:53:00 CDT" "2025-11-02 01:54:00 CDT"
[115] "2025-11-02 01:55:00 CDT" "2025-11-02 01:56:00 CDT"
[117] "2025-11-02 01:57:00 CDT" "2025-11-02 01:58:00 CDT"
[119] "2025-11-02 01:59:00 CDT" "2025-11-02 01:00:00 CST"
[121] "2025-11-02 01:01:00 CST" "2025-11-02 01:02:00 CST"
[123] "2025-11-02 01:03:00 CST" "2025-11-02 01:04:00 CST"
[125] "2025-11-02 01:05:00 CST" "2025-11-02 01:06:00 CST"
[127] "2025-11-02 01:07:00 CST" "2025-11-02 01:08:00 CST"
[129] "2025-11-02 01:09:00 CST" "2025-11-02 01:10:00 CST"
[131] "2025-11-02 01:11:00 CST" "2025-11-02 01:12:00 CST"
[133] "2025-11-02 01:13:00 CST" "2025-11-02 01:14:00 CST"
[135] "2025-11-02 01:15:00 CST" "2025-11-02 01:16:00 CST"
[137] "2025-11-02 01:17:00 CST" "2025-11-02 01:18:00 CST"
[139] "2025-11-02 01:19:00 CST" "2025-11-02 01:20:00 CST"
[141] "2025-11-02 01:21:00 CST" "2025-11-02 01:22:00 CST"
[143] "2025-11-02 01:23:00 CST" "2025-11-02 01:24:00 CST"
[145] "2025-11-02 01:25:00 CST" "2025-11-02 01:26:00 CST"
[147] "2025-11-02 01:27:00 CST" "2025-11-02 01:28:00 CST"
[149] "2025-11-02 01:29:00 CST" "2025-11-02 01:30:00 CST"
[151] "2025-11-02 01:31:00 CST" "2025-11-02 01:32:00 CST"
[153] "2025-11-02 01:33:00 CST" "2025-11-02 01:34:00 CST"
[155] "2025-11-02 01:35:00 CST" "2025-11-02 01:36:00 CST"
[157] "2025-11-02 01:37:00 CST" "2025-11-02 01:38:00 CST"
[159] "2025-11-02 01:39:00 CST" "2025-11-02 01:40:00 CST"
[161] "2025-11-02 01:41:00 CST" "2025-11-02 01:42:00 CST"
[163] "2025-11-02 01:43:00 CST" "2025-11-02 01:44:00 CST"
[165] "2025-11-02 01:45:00 CST" "2025-11-02 01:46:00 CST"
[167] "2025-11-02 01:47:00 CST" "2025-11-02 01:48:00 CST"
[169] "2025-11-02 01:49:00 CST" "2025-11-02 01:50:00 CST"
[171] "2025-11-02 01:51:00 CST" "2025-11-02 01:52:00 CST"
[173] "2025-11-02 01:53:00 CST" "2025-11-02 01:54:00 CST"
[175] "2025-11-02 01:55:00 CST" "2025-11-02 01:56:00 CST"
[177] "2025-11-02 01:57:00 CST" "2025-11-02 01:58:00 CST"
[179] "2025-11-02 01:59:00 CST" "2025-11-02 02:00:00 CST"

What happens after “2025-11-02 01:59:00 CDT”? The next minute is 01:00:00 CST, so Daylight Saving Time ends when the clock would otherwise strike two in the morning, and the clock falls back to one in the morning instead. If you want R to really do all the work, use lubridate’s dst() function and find the minimum and maximum times before and after the switch back to standard time.

a <- data.frame(date = mdy("11/2/2025", tz="America/Chicago") + dminutes(1:180))
# the latest time that is still in Daylight Saving Time
a |>
   filter(dst(date)) |>
   slice_max(date)
                 date
1 2025-11-02 01:59:00
# the earliest time that is no longer in Daylight Saving Time
a |>
   filter(!dst(date)) |>
   slice_min(date)
                 date
1 2025-11-02 01:00:00
# more succinctly
a |> 
   group_by(dst(date)) |>
   summarize(min(date), max(date))
# A tibble: 2 × 3
  `dst(date)` `min(date)`         `max(date)`        
  <lgl>       <dttm>              <dttm>             
1 FALSE       2025-11-02 01:00:00 2025-11-02 02:00:00
2 TRUE        2025-11-02 00:01:00 2025-11-02 01:59:00
  2. On what date will Thanksgiving fall in 2025?
data.frame(date=mdy("11/1/2025") + ddays(0:29)) |>
   filter(wday(date, label=TRUE) == "Thu") |>
   slice(4)
        date
1 2025-11-27
# or using some base R code
a <- mdy(paste0("11/",1:30,"/2025"))
a[wday(a,label=TRUE)=="Thu"][4]
[1] "2025-11-27"
  3. Make a function that takes as input a year and returns the date of Thanksgiving in that year.
tday <- function(year)
{
   data.frame(date=mdy(paste0("11/1/",year)) + ddays(0:29)) |>
      filter(wday(date, label=TRUE) == "Thu") |>
      slice(4) |>
      pull(date)
}
tday(2025)
[1] "2025-11-27"
lapply(2025:2100, tday)
[[1]]
[1] "2025-11-27"

[[2]]
[1] "2026-11-26"

[[3]]
[1] "2027-11-25"

[[4]]
[1] "2028-11-23"

[[5]]
[1] "2029-11-22"

[[6]]
[1] "2030-11-28"

[[7]]
[1] "2031-11-27"

[[8]]
[1] "2032-11-25"

[[9]]
[1] "2033-11-24"

[[10]]
[1] "2034-11-23"

[[11]]
[1] "2035-11-22"

[[12]]
[1] "2036-11-27"

[[13]]
[1] "2037-11-26"

[[14]]
[1] "2038-11-25"

[[15]]
[1] "2039-11-24"

[[16]]
[1] "2040-11-22"

[[17]]
[1] "2041-11-28"

[[18]]
[1] "2042-11-27"

[[19]]
[1] "2043-11-26"

[[20]]
[1] "2044-11-24"

[[21]]
[1] "2045-11-23"

[[22]]
[1] "2046-11-22"

[[23]]
[1] "2047-11-28"

[[24]]
[1] "2048-11-26"

[[25]]
[1] "2049-11-25"

[[26]]
[1] "2050-11-24"

[[27]]
[1] "2051-11-23"

[[28]]
[1] "2052-11-28"

[[29]]
[1] "2053-11-27"

[[30]]
[1] "2054-11-26"

[[31]]
[1] "2055-11-25"

[[32]]
[1] "2056-11-23"

[[33]]
[1] "2057-11-22"

[[34]]
[1] "2058-11-28"

[[35]]
[1] "2059-11-27"

[[36]]
[1] "2060-11-25"

[[37]]
[1] "2061-11-24"

[[38]]
[1] "2062-11-23"

[[39]]
[1] "2063-11-22"

[[40]]
[1] "2064-11-27"

[[41]]
[1] "2065-11-26"

[[42]]
[1] "2066-11-25"

[[43]]
[1] "2067-11-24"

[[44]]
[1] "2068-11-22"

[[45]]
[1] "2069-11-28"

[[46]]
[1] "2070-11-27"

[[47]]
[1] "2071-11-26"

[[48]]
[1] "2072-11-24"

[[49]]
[1] "2073-11-23"

[[50]]
[1] "2074-11-22"

[[51]]
[1] "2075-11-28"

[[52]]
[1] "2076-11-26"

[[53]]
[1] "2077-11-25"

[[54]]
[1] "2078-11-24"

[[55]]
[1] "2079-11-23"

[[56]]
[1] "2080-11-28"

[[57]]
[1] "2081-11-27"

[[58]]
[1] "2082-11-26"

[[59]]
[1] "2083-11-25"

[[60]]
[1] "2084-11-23"

[[61]]
[1] "2085-11-22"

[[62]]
[1] "2086-11-28"

[[63]]
[1] "2087-11-27"

[[64]]
[1] "2088-11-25"

[[65]]
[1] "2089-11-24"

[[66]]
[1] "2090-11-23"

[[67]]
[1] "2091-11-22"

[[68]]
[1] "2092-11-27"

[[69]]
[1] "2093-11-26"

[[70]]
[1] "2094-11-25"

[[71]]
[1] "2095-11-24"

[[72]]
[1] "2096-11-22"

[[73]]
[1] "2097-11-28"

[[74]]
[1] "2098-11-27"

[[75]]
[1] "2099-11-26"

[[76]]
[1] "2100-11-25"
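Because lapply() returns a list, the output above is long. As an alternative, here is a minimal sketch of an arithmetic version that works on a whole vector of years at once and returns a single Date vector. The name tday_arith is hypothetical and not part of the original solution. It assumes lubridate's make_date() and the week_start argument of wday(), with Sunday counted as day 1.

```r
library(lubridate)

# tday_arith is a hypothetical name for an arithmetic alternative to tday()
tday_arith <- function(year)
{
   nov1 <- make_date(year, 11, 1)
   # with week_start = 7, Sunday is 1 and Thursday is 5
   daysToThu <- (5 - wday(nov1, week_start = 7)) %% 7
   # the first Thursday plus three more weeks is the fourth Thursday
   nov1 + days(daysToThu + 21)
}

tday_arith(2025)        # "2025-11-27"
tday_arith(2025:2028)   # works on a vector of years, returns a Date vector
```

The results agree with the list above for the years checked, but the filtering approach in tday() is easier to read and to adapt to other holidays.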