fastymd is a package for working with Year-Month-Day (YMD) style date objects. It provides extremely fast parsing of character strings and numeric values to date objects as well as fast decomposition of these into their year, month and day components. The underlying algorithms follow the approach of Howard Hinnant for converting Gregorian calendar dates to days since the UNIX epoch and vice versa.
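To give a feel for the arithmetic involved, here is a minimal base R sketch of Hinnant's `days_from_civil` calculation. It is a transcription for illustration only and not the package's implementation; the name `days_from_civil_sketch` is hypothetical.

```r
# Illustrative sketch only (hypothetical helper, not part of fastymd).
# Converts proleptic Gregorian year/month/day to days since 1970-01-01.
days_from_civil_sketch <- function(y, m, d) {
    y   <- y - (m <= 2)                   # treat Jan/Feb as part of the previous year
    era <- y %/% 400                      # 400-year cycle (%/% floors, so negatives work)
    yoe <- y - era * 400                  # year of era, 0..399
    mp  <- ifelse(m > 2, m - 3, m + 9)    # month index with March = 0
    doy <- (153 * mp + 2) %/% 5 + d - 1   # day of (March-based) year, 0..365
    doe <- yoe * 365 + yoe %/% 4 - yoe %/% 100 + doy  # day of era
    era * 146097 + doe - 719468           # shift so that 1970-01-01 is day 0
}

days_from_civil_sketch(1970, 1, 1)
days_from_civil_sketch(2025, 4, 16) == as.numeric(as.Date("2025-04-16"))
```

Shifting the start of the year to March places the leap day at the end of the year, which is what lets the month lengths be handled with a single linear expression.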
The API won’t give any surprises:
library(fastymd)
cdate <- c("2025-04-16", "2025-04-17")
(res <- fymd(cdate))
#> [1] "2025-04-16" "2025-04-17"
res == as.Date(cdate)
#> [1] TRUE TRUE
get_ymd(res)
#>   year month day
#> 1 2025     4  16
#> 2 2025     4  17
fymd(2025, 4, 16) == res[1L]
#> [1] TRUE
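The decomposition direction can be sketched in the same way. The following base R transcription of Hinnant's `civil_from_days` calculation is illustrative only; `civil_from_days_sketch` is a hypothetical name and the code makes no claim to match what `get_ymd()` does internally.

```r
# Illustrative sketch only (hypothetical helper, not part of fastymd).
# Converts days since 1970-01-01 to proleptic Gregorian year/month/day.
civil_from_days_sketch <- function(z) {
    z   <- z + 719468                     # shift epoch to 0000-03-01
    era <- z %/% 146097                   # 400-year cycle
    doe <- z - era * 146097               # day of era, 0..146096
    yoe <- (doe - doe %/% 1460 + doe %/% 36524 - doe %/% 146096) %/% 365
    doy <- doe - (365 * yoe + yoe %/% 4 - yoe %/% 100)  # March-based day of year
    mp  <- (5 * doy + 2) %/% 153          # month index with March = 0
    d   <- doy - (153 * mp + 2) %/% 5 + 1
    m   <- ifelse(mp < 10, mp + 3, mp - 9)
    y   <- yoe + era * 400 + (m <= 2)     # Jan/Feb belong to the next civil year
    data.frame(year = y, month = m, day = d)
}

civil_from_days_sketch(as.numeric(as.Date(c("2025-04-16", "2025-04-17"))))
```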
Invalid dates will return NA and a warning:
fymd(2021, 02, 29) # not a leap year
#> NAs introduced due to invalid month and/or day combinations.
#> [1] NA
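The check behind this behaviour amounts to comparing the day against the length of the given month, with a leap year adjustment for February. A rough base R sketch of such a check follows; `is_valid_ymd_sketch` is a hypothetical helper written for illustration and is not a fastymd function.

```r
# Illustrative sketch only (hypothetical helper, not part of fastymd).
is_valid_ymd_sketch <- function(y, m, d) {
    # Gregorian leap year rule
    leap <- (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
    days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
    ok_month <- !is.na(m) & m >= 1 & m <= 12
    # index with a safe month so that out-of-range months do not error
    max_day <- days_in_month[ifelse(ok_month, m, 1)]
    max_day <- max_day + (ok_month & m == 2 & leap)
    ok_month & d >= 1 & d <= max_day
}

is_valid_ymd_sketch(2021, 2, 29)  # FALSE, 2021 is not a leap year
is_valid_ymd_sketch(2024, 2, 29)  # TRUE
```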
More interesting is the handling of any text that follows a valid date. Consider the following timestamp:
timelt <- as.POSIXlt(Sys.time(), tz = "UTC")
(timestamp <- strftime(timelt, "%Y-%m-%dT%H:%M:%S%z"))
#> [1] "2025-10-03T08:50:17+0000"
By default the time element is ignored:
(res <- fymd(timestamp))
#> [1] "2025-10-03"
res == as.Date(timestamp, tz = "UTC")
#> [1] TRUE
This ignoring of the timestamp is both good and bad. For timestamps it makes
perfect sense, but perhaps you have simple dates and a concern that some are
corrupted. For these we can use the strict argument:
cdate <- "2025-04-16nonsense "
fymd(cdate)
#> [1] "2025-04-16"
fymd(cdate, strict = TRUE)
#> NAs introduced due to invalid date strings.
#> [1] NA
The character method of fymd() parses input strings in a fixed year, month,
day order. These values must be digits but can be separated by any non-digit
character. This is similar in spirit to the fastDate() function in Simon
Urbanek’s fasttime package, using pure text parsing and no system calls for
maximum speed.
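The parsing rule described above can be mimicked in base R with a regular expression. This is only a sketch of the rule, not of the package's parser: `parse_ymd_sketch` is a hypothetical name, the digit-count limits are assumptions, negative years are not handled, and a regular expression will be far slower than a dedicated text parser.

```r
# Illustrative sketch only (hypothetical helper, not part of fastymd).
# Assumed grammar: up to 4 year digits, then 1-2 month digits and 1-2 day
# digits, each separated by exactly one non-digit character.
parse_ymd_sketch <- function(x, strict = FALSE) {
    pattern <- "^([0-9]{1,4})[^0-9]([0-9]{1,2})[^0-9]([0-9]{1,2})"
    # in strict mode nothing may follow the day digits
    if (strict) pattern <- paste0(pattern, "$")
    matches <- regmatches(x, regexec(pattern, x))
    out <- t(vapply(matches, function(p) {
        # p holds the full match plus three capture groups when matched
        if (length(p) == 4L) as.integer(p[2:4]) else rep(NA_integer_, 3L)
    }, integer(3)))
    colnames(out) <- c("year", "month", "day")
    out
}

parse_ymd_sketch(c("2025-04-16", "2025/10/03T08:50:17+0000"))
parse_ymd_sketch("2025-04-16nonsense ", strict = TRUE)
```

The extracted components would still need validating and converting to a day count before they could be used as a date.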
For extremely fast parsing of POSIX style timestamps you will struggle to beat the performance of fasttime. It works fantastically for timestamps that do not need validation and are within the date range supported by the package (currently 1970-01-01 through to the year 2199).
fymd() fills the, admittedly small, niche where you want fast parsing of YMD
strings along with date validation and support for a wider range of dates from
the Proleptic Gregorian calendar (currently we support years in the range
[-9999, 9999]). This additional capability does come with a small performance
penalty but, hopefully, this has been kept to a minimum and the implementation
remains competitive.
library(microbenchmark)
# 1970-01-01 (UNIX epoch) to "2199-01-01"
dates <- seq.Date(from = .Date(0), to = fymd("2199-01-01"), by = "day")
# comparison timings for fymd (character method)
cdates <- format(dates)
(res_c <- microbenchmark(
    fasttime  = fasttime::fastDate(cdates),
    fastymd   = fymd(cdates),
    ymd       = ymd::ymd(cdates),
    lubridate = lubridate::ymd(cdates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min       lq      mean    median        uq      max neval
#>   fasttime  533.670  538.575  577.8399  540.6035  544.7715 2105.967   100
#>    fastymd  817.212  822.011  891.7562  824.8310  828.7085 5767.230   100
#>        ymd 4236.551 4281.931 4395.9070 4313.1145 4336.5280 5953.068   100
#>  lubridate 5486.083 5599.050 6219.0921 5668.7810 7022.4130 8608.015   100
# comparison timings for fymd (numeric method)
ymd <- get_ymd(dates)
(res_n <- microbenchmark(
    fastymd   = fymd(ymd[[1]], ymd[[2]], ymd[[3]]),
    lubridate = lubridate::make_date(ymd[[1]], ymd[[2]], ymd[[3]]),
    check     = "equal"
))
#> Unit: microseconds
#>       expr     min       lq     mean  median       uq      max neval
#>    fastymd 325.480 326.7925 356.3634 328.165 332.5630 1956.998   100
#>  lubridate 676.659 720.7870 839.9302 723.551 726.2465 2585.256   100
# comparison timings for year getter
(res_get_year <- microbenchmark(
    fastymd   = get_year(dates),
    ymd       = ymd::year(dates),
    lubridate = lubridate::year(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean    median       uq       max neval
#>    fastymd  358.942  359.8645  399.9320  360.9220  364.338  3179.119   100
#>        ymd  381.245  394.3240  545.4191  403.5015  478.542  3748.817   100
#>  lubridate 7564.449 7583.4395 8110.1502 7602.2950 8880.130 10426.904   100
# comparison timings for month getter
(res_get_month <- microbenchmark(
    fastymd   = get_month(dates),
    ymd       = ymd::month(dates),
    lubridate = lubridate::month(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean    median        uq       max neval
#>    fastymd  325.219  326.5470  328.9860  327.5240  330.8250   352.291   100
#>        ymd  417.141  425.0265  436.5834  433.5175  442.9205   712.876   100
#>  lubridate 8234.756 8273.7135 9062.5125 8289.4180 9659.8770 38708.926   100
# comparison timings for mday getter
(res_get_mday <- microbenchmark(
    fastymd   = get_mday(dates),
    ymd       = ymd::mday(dates),
    lubridate = lubridate::day(dates),
    check     = "equal"
))
#> Unit: microseconds
#>       expr      min        lq      mean    median        uq       max neval
#>    fastymd  361.417  362.0785  378.6033  364.0725  366.4865  1695.318   100
#>        ymd  421.540  426.5440  434.1789  429.9460  433.9390   790.311   100
#>  lubridate 7541.686 7561.8090 7842.8506 7578.0845 7617.9145 10427.656   100
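To put the character-method timings above on a common scale, the reported medians can be expressed relative to fastymd. The snippet below simply re-enters the median values printed in the first benchmark; it does not rerun anything, and the variable names are illustrative.

```r
# Median timings (microseconds) copied from the character-method benchmark above
medians <- c(
    fasttime  = 540.6035,
    fastymd   = 824.8310,
    ymd       = 4313.1145,
    lubridate = 5668.7810
)

# Ratio of each package's median to the fastymd median
relative <- medians / medians[["fastymd"]]
round(relative, 2)
```

Values below 1 indicate a package that was faster than fastymd on this input and values above 1 a slower one. As with any benchmark, the ratios will vary with hardware and package versions.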