Temporal Data Manipulation with m61r

pv71u98h1

2026-01-12

Introduction

Leveraging the memory efficiency of Base R, the m61r package provides tools for manipulating time series, such as expanding time intervals into discrete slots or performing window-based aggregations.

Temporal Function Reference

In m61r, temporal operations rely on R primitives executed inside mutate() and summarise().

Operation           Base R Expression (m61r)
Cast to Date        ~as.Date(col)
Cast to Datetime    ~as.POSIXct(col)
Datetime Sequence   ~seq(from, to, by="1 day")
Extract Hour        ~as.POSIXlt(col)$hour
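To see what the expressions in the reference table evaluate to, the sketch below runs them in plain Base R on a small character vector, outside of any m61r object. The vector `x` and the fixed `tz = "UTC"` are assumptions made only to keep the result reproducible.

```r
# Base R only; no m61r calls are made here.
x <- c("2025-01-01 08:15:00", "2025-01-02 17:45:00")

d    <- as.Date(x)                       # Date: the time of day is dropped
dt   <- as.POSIXct(x, tz = "UTC")        # POSIXct: seconds since the epoch
days <- seq(d[1], d[2], by = "1 day")    # daily sequence between two dates
hrs  <- as.POSIXlt(dt)$hour              # integer hour of day: 8 17
```

Inside mutate() or summarise(), the same calls would be written as one-sided formulas, as in the table.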

Formatting and Component Extraction

The mutate() and transmutate() methods allow for the creation of derived columns using standard R format() codes.

Component         R Format Code   Example Expression
Year              %Y              year = ~format(dt, "%Y")
Month (numeric)   %m              month = ~format(dt, "%m")
Day of the week   %w (0-6)        weekday = ~format(dt, "%w")
Hour (24h)        %H              hour = ~format(dt, "%H")
Grouping Key      %Y-%m           ym_key = ~format(dt, "%Y-%m")
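One detail worth keeping in mind with these codes is that format() always returns character strings, so a numeric component needs an explicit conversion. The following Base R sketch illustrates this on a single timestamp; the value of `dt` and the `month_num` name are illustrative assumptions.

```r
# Base R only; dt is an arbitrary example timestamp (a Sunday).
dt <- as.POSIXct("2025-03-09 14:05:00", tz = "UTC")

year    <- format(dt, "%Y")      # "2025"
month   <- format(dt, "%m")      # "03" (zero-padded string)
weekday <- format(dt, "%w")      # "0" (%w counts from Sunday = 0)
hour    <- format(dt, "%H")      # "14"
ym_key  <- format(dt, "%Y-%m")   # "2025-03"

month_num <- as.integer(month)   # 3L, usable in arithmetic
```

Zero-padded keys such as "%Y-%m" sort chronologically as plain strings, which is what makes them convenient grouping keys.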

Practical Examples

Component Extraction

raw_data <- data.frame(
  timestamp = as.POSIXct(c("2023-01-01 10:00:00", "2023-01-01 11:30:00")),
  value = c(10, 20)
)
data <- m61r(raw_data)

data$mutate(
  year  = ~format(timestamp, "%Y"),
  hour  = ~as.POSIXlt(timestamp)$hour
)

Interval Expansion

When each record carries a “start” and an “end” time, a powerful way to analyse the data is to “explode” each interval into one row per time slot.

1. Creating Sequences

df_intervals <- data.frame(
  id = 1,
  start = as.POSIXct("2025-01-01 08:00"),
  end   = as.POSIXct("2025-01-01 10:00"),
  load  = 100
)

p <- m61r(df_intervals)
p$mutate(slot = ~Map(function(s, e) seq(s, e, by = "hour"), start, end))

2. Structural Explosion

The explode() method flattens the list-column, duplicating the values of the other columns for each element in the sequence.

p$explode("slot")
p$head()
##   id               start                 end load                slot
## 1  1 2025-01-01 08:00:00 2025-01-01 10:00:00  100 2025-01-01 08:00:00
## 2  1 2025-01-01 08:00:00 2025-01-01 10:00:00  100 2025-01-01 09:00:00
## 3  1 2025-01-01 08:00:00 2025-01-01 10:00:00  100 2025-01-01 10:00:00
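For readers who want to see the mechanics, the two steps can be approximated in plain Base R. This is only a sketch of the idea, not m61r's implementation: the two-row data frame `df` and the names `slots`, `idx` and `exploded` are assumptions introduced for illustration.

```r
# Base R only; a hand-rolled equivalent of "build sequences, then explode".
df <- data.frame(
  id    = 1:2,
  start = as.POSIXct(c("2025-01-01 08:00", "2025-01-01 13:00"), tz = "UTC"),
  end   = as.POSIXct(c("2025-01-01 10:00", "2025-01-01 14:00"), tz = "UTC"),
  load  = c(100, 50)
)

# Step 1: one hourly sequence per row, held in a plain list
slots <- Map(function(s, e) seq(s, e, by = "hour"), df$start, df$end)

# Step 2: repeat each source row once per element of its sequence
idx <- rep(seq_len(nrow(df)), times = lengths(slots))
exploded <- df[idx, ]
exploded$slot <- do.call(c, slots)   # c() on POSIXct values keeps the class
rownames(exploded) <- NULL
```

Here the first interval yields three slots (08:00, 09:00, 10:00) and the second yields two (13:00, 14:00), so `exploded` has five rows. Both endpoints are included because seq() stops at the last value not exceeding `to`.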

Approximate Matching with As-Of Joins

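As-of joins synchronize two time series whose timestamps do not match exactly: each row is paired with the nearest row of the other series in the chosen direction. In the example below, direction = "backward" looks for the latest price recorded before the 09:30 trade.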

prices <- data.frame(
  ts = as.POSIXct("2025-01-01 08:00") + c(0, 3600),
  val = c(10, 12)
)

df_sync <- data.frame(ts = as.POSIXct("2025-01-01 09:30"), event = "Trade")
p_sync <- m61r(df_sync)
p_sync$join_asof(prices, by_x = "ts", by_y = "ts", direction = "backward")
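The matching rule behind a backward as-of join can be sketched in plain Base R with findInterval(). This is an illustration under the usual as-of semantics (take the last lookup row whose timestamp is not later than the event), not a description of how join_asof() is implemented; the `events` and `idx` names are assumptions.

```r
# Base R only; a backward as-of lookup built on findInterval().
prices <- data.frame(
  ts  = as.POSIXct("2025-01-01 08:00", tz = "UTC") + c(0, 3600),
  val = c(10, 12)
)
events <- data.frame(
  ts    = as.POSIXct("2025-01-01 09:30", tz = "UTC"),
  event = "Trade"
)

# findInterval() requires the lookup timestamps in increasing order
prices <- prices[order(prices$ts), ]

# For each event: index of the last price with ts <= event ts (0 if none)
idx <- findInterval(as.numeric(events$ts), as.numeric(prices$ts))
idx[idx == 0] <- NA   # no earlier price available

events$val <- prices$val[idx]
```

With these inputs the 09:30 trade picks up the 09:00 price, so `events$val` is 12. An event earlier than every price would receive NA rather than a match.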

Date Extraction Example

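The m61r object handles Date-class columns in the same way as POSIXct columns. Because as.POSIXlt() stores the year as an offset from 1900, the expression below adds 1900 to recover the calendar year.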

df_dates <- data.frame(
  id = 1:2,
  entry = as.Date(c("2020-01-01", "2021-01-01")),
  exit  = as.Date(c("2020-06-01", "2021-06-01"))
)

p_dates <- m61r(df_dates)
p_dates$mutate(year = ~as.POSIXlt(entry)$year + 1900)
p_dates$head()
##   id      entry       exit year
## 1  1 2020-01-01 2020-06-01 2020
## 2  2 2021-01-01 2021-06-01 2021
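The offsets used by POSIXlt fields are easy to get wrong, so a short Base R check may help. The variable names below are illustrative assumptions; the field conventions (year counted from 1900, month counted from 0) are those documented for POSIXlt.

```r
# Base R only; POSIXlt component conventions on Date input.
entry <- as.Date(c("2020-01-01", "2021-01-01"))
lt <- as.POSIXlt(entry)

yr  <- lt$year + 1900   # calendar year: 2020 2021
mon <- lt$mon + 1       # calendar month: 1 1 (mon is 0-based)
day <- lt$mday          # day of month: 1 1 (already 1-based)

# The same years obtained through format(), converted from character
yr_fmt <- as.integer(format(entry, "%Y"))
```

Both routes give the same years; the format() route avoids the offset arithmetic at the cost of a character-to-integer conversion.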

Time-Based Aggregation

A common task in time-series analysis is to bin data into specific time intervals (e.g., hourly or daily) and compute statistics. The following example demonstrates how to create a “bin” key and use it for grouping.

Hourly Summary Example

# Create a dataset with random timestamps within a day
set.seed(123)
ts_data <- data.frame(
  time = as.POSIXct("2025-01-01 00:00:00") + runif(50, 0, 86400),
  consumption = rnorm(50, 500, 100)
)

p_agg <- m61r(ts_data)

p_agg$mutate(hour_bin = ~format(time, "%Y-%m-%d %H:00"))
p_agg$group_by(~hour_bin)
p_agg$summarise(
  n_obs = ~length(consumption),
  avg_load = ~mean(consumption)
)

p_agg$head(5)
##           hour_bin n_obs avg_load
## 1 2025-01-01 00:00     1 521.5942
## 2 2025-01-01 01:00     2 458.0534
## 3 2025-01-01 02:00     1 461.9529
## 4 2025-01-01 03:00     4 540.9203
## 5 2025-01-01 05:00     4 401.9479
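The same binning logic can be checked by hand in plain Base R on a tiny, deterministic dataset. This sketch uses tapply() rather than m61r's group_by() and summarise(); the five timestamps and consumption values are made up for illustration.

```r
# Base R only; hourly binning with a formatted key and tapply().
ts_data <- data.frame(
  time = as.POSIXct("2025-01-01 00:00:00", tz = "UTC") +
    c(600, 1800, 4500, 7300, 7900),          # offsets in seconds
  consumption = c(500, 520, 480, 450, 470)
)

# Truncate each timestamp to the start of its hour
ts_data$hour_bin <- format(ts_data$time, "%Y-%m-%d %H:00")

n_obs    <- tapply(ts_data$consumption, ts_data$hour_bin, length)
avg_load <- tapply(ts_data$consumption, ts_data$hour_bin, mean)

hourly <- data.frame(
  hour_bin = names(n_obs),
  n_obs    = as.integer(n_obs),
  avg_load = as.numeric(avg_load)
)
```

Here `hourly` has three rows, for 00:00, 01:00 and 02:00, with 2, 1 and 2 observations and means of 510, 480 and 460. As in the output above, where 04:00 is absent, an hour with no observations produces no row, so gaps must be filled separately if a regular grid is needed.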

Conclusion

By staying true to Base R, m61r ensures that your temporal workflows remain portable, fast, and light.