In this vignette, we compare the computation time/memory usage of
dense matrix and sparse Matrix.
We begin with an analysis of the time/memory it takes to create these
objects.  In the atime code below, we allocate a vector for
comparison, and we specify a result function which computes the
length of the object x created by each expression. This means
atime will save length as a function of data size N (in addition
to time and memory).
library(Matrix)
#> package 'Matrix' was built under R version 4.4.3
N_seq <- unique(as.integer(10^seq(0,7,by=0.25)))
vec.mat.result <- atime::atime(
  N=N_seq,
  vector=numeric(N),
  matrix=matrix(0, N, N),
  Matrix=Matrix(0, N, N),
  result=function(x)data.frame(length=length(x)))
plot(vec.mat.result)
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
The plot above shows three panels, one for each unit.
kilobytes is the amount of memory used. We see that Matrix and
vector use the same amount of memory asymptotically, whereas
matrix uses more (larger slope on the log-log plot implies larger
asymptotic complexity class).length is the value returned by the length function. We see that
matrix and Matrix have the same value, whereas vector has
asymptotically smaller length (smaller slope on log-log plot).seconds is the amount of time taken. We see that Matrix is
slower than vector and matrix by a small constant overhead,
which can be seen for small N. We also see that for large N,
Matrix and vector have the same asymptotic time complexity,
which is much faster than matrix.bench::pressAn alternative method to compute asymptotic timings is via
bench::press, which provides functionality for parameterized
benchmarking (similar to atime_grid). Because atime() has special
treatment of the N parameter, the code required for asymptotic
measurement is relatively simple; compare the atime code above to
the bench::press code below, which measures the same asymptotic
quantities (seconds, kilobytes, length).
seconds.limit <- 0.01
done.vec <- NULL
measure.vars <- c("seconds","kilobytes","length")
press_result <- bench::press(N = N_seq, {
  exprs <- function(...)as.list(match.call()[-1])
  elist <- exprs(
    vector=numeric(N),
    matrix=matrix(0, N, N),
    Matrix=Matrix(0, N, N))
  elist[names(done.vec)] <- NA #Don't run exprs which already exceeded limit.
  mark.args <- c(elist, list(iterations=10, check=FALSE))
  mark.result <- do.call(bench::mark, mark.args)
  ## Rename some columns for easier interpretation.
  desc.vec <- attr(mark.result$expression, "description")
  mark.result$description <- desc.vec
  mark.result$seconds <- as.numeric(mark.result$median)
  mark.result$kilobytes <- as.numeric(mark.result$mem_alloc/1024)
  ## Compute length column to measure in addition to time/memory.
  mark.result$length <- NA
  for(desc.i in seq_along(desc.vec)){
    description <- desc.vec[[desc.i]]
    result <- eval(elist[[description]])
    mark.result$length[desc.i] <- length(result)
  }
  ## Set NA time/memory/length for exprs which were not run.
  mark.result[desc.vec %in% names(done.vec), measure.vars] <- NA
  ## If expr went over time limit, indicate it is done.
  over.limit <- mark.result$seconds > seconds.limit
  over.desc <- desc.vec[is.finite(mark.result$seconds) & over.limit]
  done.vec[over.desc] <<- TRUE
  mark.result
})
The bench::press code above is relatively complicated, because it re-implements two functions that are provided by atime:
N values. This keeps overall computation reasonable, even when comparing expressions which have different asymptotic time complexity (such as quadratic for matrix and linear for Matrix in this example).seconds and kilobytes as a function of N (such as length in this example), then atime makes that easy (just provide a result function), whereas it is more complex to implement in bench::press (for loop is required).Below we visualize the results from bench::press,
library(data.table)
(press_long <- melt(
  data.table(press_result),
  measure.vars=measure.vars,
  id.vars=c("N","description"),
  na.rm=TRUE))
| N | description | variable | value | 
|---|---|---|---|
| 1 | vector | seconds | 0.000000e+00 | 
| 1 | matrix | seconds | 0.000000e+00 | 
| 1 | Matrix | seconds | 1.000000e-03 | 
| 3 | vector | seconds | 0.000000e+00 | 
| 3 | matrix | seconds | 0.000000e+00 | 
| ⋮ | ⋮ | ⋮ | ⋮ | 
| 3162277 | Matrix | length | 9.999996e+12 | 
| 5623413 | vector | length | 5.623413e+06 | 
| 5623413 | Matrix | length | 3.162277e+13 | 
| 10000000 | vector | length | 1.000000e+07 | 
| 10000000 | Matrix | length | 1.000000e+14 | 
if(require(ggplot2)){
  gg <- ggplot()+
    ggtitle("bench::press results for comparison")+
    facet_grid(variable ~ ., labeller=label_both, scales="free")+
    geom_line(aes(
      N, value,
      color=description),
      data=press_long)+
    scale_x_log10(limits=c(NA, max(press_long$N*2)))+
    scale_y_log10("")
  if(requireNamespace("directlabels")){
    directlabels::direct.label(gg,"right.polygons")
  }else gg
}
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
We can see that the plot from atime and bench::press are consistent.
Below we estimate the best asymptotic complexity classes:
vec.mat.best <- atime::references_best(vec.mat.result)
plot(vec.mat.best)
#> log-10 transformation introduced infinite values.
The plot above shows that
matrix has time, memory, and length which are all quadratic O(N^2).Matrix has linear O(N) time and memory, but O(N^2) values for
length.vector has time, memory, and length which are all linear O(N).Below we estimate the throughput for some given limits:
vec.mat.pred <- predict(
  vec.mat.best,
  seconds=vec.mat.result$seconds.limit,
  ##kilobytes=1000,#not available on CRAN.
  length=100)
plot(vec.mat.pred)
#> log-10 transformation introduced infinite values.
In the plot above we can see the throughput N for a given limit of
kilobytes, length or seconds. Below we use Matrix as a
reference, and compute the throughput ratio, Matrix to other.
library(data.table)
dcast(vec.mat.pred$prediction[
, ratio := N[expr.name=="Matrix"]/N, by=unit
], unit + unit.value ~ expr.name, value.var="ratio")
| unit | unit.value | Matrix | matrix | vector | 
|---|---|---|---|---|
| length | 1e+02 | 1 | 1.00 | 0.100 | 
| seconds | 1e-02 | 1 | 2709.26 | 0.949 | 
From the table above (matrix column), we can see that the throughput
of Matrix is 100-1000x larger than matrix, for the given limits.
First we show the difference between sparse and dense matrix multiplication, when the matrix has 90% zeros (10% non-zeros).
library(Matrix)
sparse.prop <- 0.9
dense.prop <- 1-sparse.prop
mult.result <- atime::atime(
  N=as.integer(10^seq(1,4,by=0.25)),
  setup={
    m <- matrix(0, N, N)
    set.seed(1)
    w <- rnorm(N)
    N.not.zero <- as.integer(dense.prop*N^2)
    m[sample(N^2, N.not.zero)] <- rnorm(N.not.zero)
    M <- Matrix(m)
  },
  sparse = M %*% w,
  dense = m %*% w,
  result=TRUE)
plot(mult.result)
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
Above we see that sparse is faster than dense, but by constant
factors.
Below we estimate the best asymptotic complexity classes:
mult.best <- atime::references_best(mult.result)
plot(mult.best)
#> log-10 transformation introduced infinite values.
Above we see that both sparse and dense matrix multiplication are
quadratic O(N^2) time (for a quadratic number of non-zero entries).
Finally, we verify below that both methods yield the same result:
library(data.table)
mult.compare <- dcast(
  mult.result$measurements, N ~ expr.name, value.var="result"
)[
, equal := paste(all.equal(as.numeric(dense[[1]]), as.numeric(sparse[[1]])))
, by=N
][]
mult.compare[,.(N,equal)]
| N | equal | 
|---|---|
| 10 | TRUE | 
| 17 | TRUE | 
| 31 | TRUE | 
| 56 | TRUE | 
| 100 | TRUE | 
| ⋮ | ⋮ | 
| 1000 | TRUE | 
| 1778 | TRUE | 
| 3162 | TRUE | 
| 5623 | Numeric: lengths (0, 5623) differ | 
| 10000 | Numeric: lengths (0, 10000) differ | 
We see the only difference is for large N when dense result is NULL (benchmark not run because a smaller N went over the time limit).
Next we show the difference between sparse and dense matrix multiplication, when the matrix has a linear number of non-zeros (asymptotically fewer than in the previous section).
library(Matrix)
mult.result <- atime::atime(
  N=as.integer(10^seq(1,4,by=0.25)),
  setup={
    m <- matrix(0, N, N)
    set.seed(1)
    w <- rnorm(N)
    N.not.zero <- N
    m[sample(N^2, N.not.zero)] <- rnorm(N.not.zero)
    M <- Matrix(m)
  },
  sparse = M %*% w,
  dense = m %*% w,
  result=TRUE)
plot(mult.result)
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
Above we see that sparse is asymptotically faster than dense (different asymptotic slopes).
Below we estimate the best asymptotic complexity classes:
mult.best <- atime::references_best(mult.result)
plot(mult.best)
#> log-10 transformation introduced infinite values.
Above we see that sparse is linear O(N) time whereas dense is
quadratic O(N^2) time (for a linear number of non-zero entries).
Finally, we verify below that both methods yield the same result:
library(data.table)
mult.compare <- dcast(
  mult.result$measurements, N ~ expr.name, value.var="result"
)[
, equal := paste(all.equal(as.numeric(dense[[1]]), as.numeric(sparse[[1]])))
, by=N
][]
mult.compare[,.(N,equal)]
| N | equal | 
|---|---|
| 10 | TRUE | 
| 17 | TRUE | 
| 31 | TRUE | 
| 56 | TRUE | 
| 100 | TRUE | 
| ⋮ | ⋮ | 
| 1000 | TRUE | 
| 1778 | TRUE | 
| 3162 | TRUE | 
| 5623 | Numeric: lengths (0, 5623) differ | 
| 10000 | Numeric: lengths (0, 10000) differ | 
We see the only difference is for large N when dense result is NULL (benchmark not run because a smaller N went over the time limit).
In this section we show how you can code both comparisons at the same time, without repetition. The trick is to first define a list of parameters to vary:
param.list <- list(
  non_zeros=c("N","N^2/10"),
  fun=c("matrix","Matrix")
)
After that we create a grid of expressions to evaluate, by expanding the parameter grid:
(expr.list <- atime::atime_grid(
  param.list,
  Mw=L[[fun]][[non_zeros]]%*%w,
  collapse="\n"))
#> $`Mw non_zeros=N\nfun=Matrix`
#> L[["Matrix"]][["N"]] %*% w
#> 
#> $`Mw non_zeros=N\nfun=matrix`
#> L[["matrix"]][["N"]] %*% w
#> 
#> $`Mw non_zeros=N^2/10\nfun=Matrix`
#> L[["Matrix"]][["N^2/10"]] %*% w
#> 
#> $`Mw non_zeros=N^2/10\nfun=matrix`
#> L[["matrix"]][["N^2/10"]] %*% w
#> 
#> attr(,"parameters")
#>                          expr.name expr.grid non_zeros    fun
#>                             <char>    <char>    <char> <char>
#> 1:      Mw non_zeros=N\nfun=Matrix        Mw         N Matrix
#> 2:      Mw non_zeros=N\nfun=matrix        Mw         N matrix
#> 3: Mw non_zeros=N^2/10\nfun=Matrix        Mw    N^2/10 Matrix
#> 4: Mw non_zeros=N^2/10\nfun=matrix        Mw    N^2/10 matrix
Finally we pass the list of expressions to atime, along with a
setup argument which creates the required list L of input data,
based on the parameters:
mult.result <- atime::atime(
  N=as.integer(10^seq(1,3.5,by=0.25)),
  setup={
    L <- list()
    set.seed(1)
    w <- rnorm(N)
    for(non_zeros in param.list$non_zeros){
      N.not.zero <- as.integer(eval(str2lang(non_zeros)))
      m <- matrix(0, N, N)
      m[sample(N^2, N.not.zero)] <- rnorm(N.not.zero)
      for(fun in param.list$fun){
        L[[fun]][[non_zeros]] <- get(fun)(as.numeric(m), N, N)
      }
    }
  },
  expr.list=expr.list)
plot(mult.result)
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
#> log-10 transformation introduced infinite values.
Below we estimate the best asymptotic complexity classes:
mult.best <- atime::references_best(mult.result)
plot(mult.best)
#> log-10 transformation introduced infinite values.
Below we show an alternative visualization:
only.seconds <- mult.best
only.seconds$measurements <- mult.best$measurements[unit=="seconds"]
only.seconds$plot.references <- mult.best$plot.references[unit=="seconds"]
if(require(ggplot2)){
  plot(only.seconds)+
    facet_grid(non_zeros ~ fun, labeller=label_both)
}
Overall in this vignette we have shown how atime can be used to
demonstrate when sparse matrices can be used for efficient
computations.
We also showed a comparison between atime and bench::press, which
highlighted two areas where atime is more convenient (stopping after
exceeding a time limit, and measuring quantities other than
time/memory as a function of data size N).