Zero-Copy Julia Arrays in R with jlview

Introduction

When working with Julia arrays from R via JuliaCall, every transfer copies data. For a 50,000 x 25,000 Float64 matrix (~9.3 GB), that means allocating 9.3 GB on the R side and spending seconds on the copy. If you are iterating on exploratory analysis or building a pipeline that shuttles arrays back and forth, those copies add up fast.
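The size quoted above follows from the matrix dimensions and the 8 bytes that each Float64 element occupies. A quick check in base R (no Julia required) shows that the 9.3 figure corresponds to binary gigabytes (GiB):

```r
# Footprint of a dense 50,000 x 25,000 Float64 matrix.
n_rows <- 50000
n_cols <- 25000
bytes_per_float64 <- 8

total_bytes <- n_rows * n_cols * bytes_per_float64
total_gib <- total_bytes / 2^30   # binary gigabytes

total_bytes          # 1e+10
round(total_gib, 1)  # 9.3
```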

jlview eliminates that overhead using R’s ALTREP (Alternative Representations) framework. Instead of copying, jlview() returns a lightweight R vector whose data pointer points directly into Julia’s memory. R operations like sum(), subsetting, and colMeans() read from Julia’s buffer without first copying the array into R memory.

Method               Latency      R memory
jlview (zero-copy)   38 ms        0 MB
copy (collect)       2.7 s        9.3 GB
Improvement          72x faster   100% less

Benchmark: 50K x 25K Float64 matrix (9.3 GB)

Getting Started

Install jlview from GitHub:

# install.packages("remotes")
remotes::install_github("tanaylab/jlview")

Before using jlview, initialize the Julia runtime via JuliaCall:

library(jlview)
JuliaCall::julia_setup()

The julia_setup() call is required once per R session. jlview will automatically load its Julia-side support module when you first call jlview().

Dense Arrays

Vectors

Create a Julia vector and wrap it in an ALTREP view:

JuliaCall::julia_command("v = randn(100_000)")
x <- jlview(JuliaCall::julia_eval("v"))

length(x) # 100000
sum(x) # computed directly from Julia memory
x[1:5] # subsetting works as usual

Matrices

Two-dimensional Julia arrays become R matrices with proper dimensions:

JuliaCall::julia_command("M = randn(1000, 500)")
m <- jlview(JuliaCall::julia_eval("M"))

dim(m) # [1] 1000  500
m[1:3, 1:3] # subset rows and columns
colSums(m) # column sums, no copy

Verifying Zero-Copy

You can confirm that x is still an ALTREP wrapper, rather than a materialized R-side copy, by inspecting its internal representation:

.Internal(inspect(x))
# Should show ALTREP wrapper, not a materialized REALSXP
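If you have not read inspect() output for an ALTREP object before, base R's own compact integer sequences are a convenient reference point. The sketch below needs neither Julia nor jlview; it only illustrates how an ALTREP-backed vector and an ordinary vector differ in inspect() output. The exact text printed for a jlview object may look different.

```r
# seq_len() returns an ALTREP "compact" integer sequence in R >= 3.5.
compact <- seq_len(1e6)
# c() builds an ordinary, fully allocated integer vector.
plain <- c(1L, 2L, 3L)

compact_info <- capture.output(.Internal(inspect(compact)))
plain_info <- capture.output(.Internal(inspect(plain)))

# The ALTREP sequence is tagged "(compact)"; the plain vector is not.
any(grepl("compact", compact_info))
any(grepl("compact", plain_info))
```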

Type Handling

jlview supports the following Julia element types:

Julia type   R type      Strategy
Float64      numeric     Direct zero-copy
Int32        integer     Direct zero-copy
Float32      numeric     Convert to Float64 in Julia, then zero-copy
Int64        numeric     Convert to Float64 in Julia, then zero-copy
Int16        integer     Convert to Int32 in Julia, then zero-copy
UInt8        integer     Convert to Int32 in Julia, then zero-copy
Bool         logical     Full copy (layout incompatible)
String[]     character   Full copy (layout incompatible)

The conversion strategy is deliberate. Types like Float32 and Int64 do not have a direct R counterpart with matching memory layout. jlview converts them once on the Julia side into a layout-compatible type (Float64 or Int32), pins the converted array, and then creates a zero-copy view of that. The one-time conversion cost is small compared to copying across runtimes.

For Bool and String arrays, the memory layouts are fundamentally incompatible (Julia Bool is 1 byte, R logical is 4 bytes; Julia strings are GC-managed objects). These fall back to JuliaCall’s standard copy path, and jlview() will emit a warning.
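Two consequences of the table can be checked from plain R. Both are properties of R's own numeric and logical types rather than anything specific to jlview. A double represents integers exactly only up to 2^53, so very large Int64 values may lose precision once converted to Float64. An R logical occupies 4 bytes per element, which is why a buffer of 1-byte Julia Bool values cannot be viewed in place.

```r
# Doubles have a 53-bit significand: 2^53 + 1 is not representable.
largest_exact <- 2^53
(largest_exact + 1) == largest_exact   # TRUE: the +1 is lost

# R logicals use 4 bytes per element (plus a small fixed header).
n <- 1e6
bytes_per_logical <- as.numeric(object.size(logical(n))) / n
round(bytes_per_logical)               # 4
```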

Named Arrays

Julia’s NamedArrays package provides named dimensions. jlview has dedicated functions that preserve these names without triggering ALTREP materialization.

Named Vectors

JuliaCall::julia_command("using NamedArrays")
JuliaCall::julia_command('nv = NamedArray([10.0, 20.0, 30.0], (["a", "b", "c"],))')
x <- jlview_named_vector(JuliaCall::julia_eval("nv"))

names(x) # [1] "a" "b" "c"
x["b"] # 20, still zero-copy for the data

Named Matrices

JuliaCall::julia_command('nm = NamedArray(randn(3, 2), (["r1","r2","r3"], ["c1","c2"]))')
m <- jlview_named_matrix(JuliaCall::julia_eval("nm"))

rownames(m) # [1] "r1" "r2" "r3"
colnames(m) # [1] "c1" "c2"
m["r1", "c2"]

Names are attached atomically during ALTREP construction. This is important because setting names() or dimnames() on an existing ALTREP vector would normally trigger materialization (a full copy), defeating the purpose. By passing names through jlview(..., names = ...) or jlview(..., dimnames = ...), the names are set on the ALTREP object before R ever inspects the data.

Sparse Matrices

Julia’s SparseMatrixCSC maps naturally to R’s dgCMatrix from the Matrix package. jlview_sparse() constructs a dgCMatrix where the nonzero values (x slot) are backed by a zero-copy ALTREP view of Julia’s nzval array.

JuliaCall::julia_command("using SparseArrays")
JuliaCall::julia_command("sp = sprand(Float64, 10000, 5000, 0.01)")
s <- jlview_sparse(JuliaCall::julia_eval("sp"))

class(s) # [1] "dgCMatrix"
dim(s) # [1] 10000  5000
Matrix::nnzero(s)

The row indices (i slot) and column pointers (p slot) require a 1-to-0 index shift (Julia is 1-based, dgCMatrix is 0-based). These are copied and shifted in Julia before being returned to R as plain integer vectors.
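The index shift can be reproduced by hand. The sketch below uses hard-coded vectors standing in for the colptr, rowval, and nzval fields of a small Julia SparseMatrixCSC, so it runs without Julia. It performs the shift in R purely for illustration; jlview does this on the Julia side, and its actual code may differ.

```r
library(Matrix)

# 1-based CSC fields as Julia would store this 3 x 3 matrix:
#   [1 0 0]
#   [0 0 3]
#   [2 0 0]
colptr <- c(1L, 3L, 3L, 4L)  # column start positions, length ncol + 1
rowval <- c(1L, 3L, 2L)      # row index of each stored value
nzval  <- c(1, 2, 3)         # stored values, column by column

# dgCMatrix expects 0-based i and p slots, so subtract 1 from both.
s <- new("dgCMatrix",
         i = rowval - 1L,
         p = colptr - 1L,
         x = nzval,
         Dim = c(3L, 3L))

s[3, 1]  # 2
s[2, 3]  # 3
```

Only the two integer vectors change; the values in nzval are used as they are, which is what makes a zero-copy x slot possible.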

Memory Management

jlview pins Julia arrays in a global dictionary to prevent Julia’s garbage collector from reclaiming them while R holds a reference. This means Julia memory is held as long as the R ALTREP object exists.

Three-Layer Defense

  1. Pinning dictionary – Each array is stored in JlviewSupport.PINNED with a unique ID. The C finalizer on the R ALTREP object calls unpin() when R garbage-collects the wrapper.

  2. GC pressure tracking – jlview tracks total pinned bytes and reports them to R via R_AdjustExternalMemory(). When pinned memory exceeds a threshold (default 10 GB), jlview forces an R gc() to reclaim stale ALTREP objects.

  3. Explicit release – For tight control, call jlview_release() to immediately unpin the array without waiting for R’s GC.
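The pin-and-finalize pattern, together with the byte accounting that the pressure tracking relies on, can be imitated in plain R to see how the pieces fit together. This is an analogy only: registry, pin(), unpin(), and make_handle() are hypothetical names, an environment stands in for the ALTREP wrapper, and the real package does this work in C and Julia.

```r
# Hypothetical registry: id -> pinned object, plus a running byte count.
registry <- new.env()
pinned_bytes <- 0
next_id <- 0

pin <- function(obj) {
  next_id <<- next_id + 1
  id <- as.character(next_id)
  assign(id, obj, envir = registry)
  pinned_bytes <<- pinned_bytes + as.numeric(object.size(obj))
  id
}

unpin <- function(id) {
  if (exists(id, envir = registry, inherits = FALSE)) {
    obj <- get(id, envir = registry)
    pinned_bytes <<- pinned_bytes - as.numeric(object.size(obj))
    rm(list = id, envir = registry)
  }
  invisible(NULL)
}

# The finalizer reads the id from the handle, so it captures nothing else.
release_on_gc <- function(handle) unpin(handle$id)

make_handle <- function(obj) {
  handle <- new.env()
  handle$id <- pin(obj)
  reg.finalizer(handle, release_on_gc)
  handle
}

h <- make_handle(numeric(1000))
length(ls(registry))  # one pinned entry while h is reachable

rm(h)
invisible(gc())       # collecting the handle runs the finalizer
length(ls(registry))
```

In this sketch an explicit release would simply be a direct call to unpin() with the handle's id. Because unpin() ignores ids that are no longer registered, the finalizer running later is harmless.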

Explicit Release

m <- jlview(JuliaCall::julia_eval("randn(10000, 1000)"))
# ... use m ...
jlview_release(m)
# m is now invalid; accessing it will error

Scoped Release

with_jlview() guarantees release even if an error occurs:

result <- with_jlview(JuliaCall::julia_eval("randn(100000)"), {
    c(mean(.x), sd(.x))
})
# .x is automatically released here, result is a plain R vector
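The guarantee is the same one that on.exit() provides in base R. As a rough sketch of the pattern (not the package's implementation; with_resource() and its arguments are hypothetical), a scoped helper registers the release before running user code, so the release runs whether the body returns or throws:

```r
# Hypothetical helper: run body(resource), then always call release(resource).
with_resource <- function(resource, release, body) {
  on.exit(release(resource), add = TRUE)
  body(resource)
}

released <- character(0)
note_release <- function(res) released <<- c(released, res$name)

# Normal return: the result is passed through and the release has run.
ok <- with_resource(list(name = "a", data = c(1, 2, 3)), note_release,
                    function(res) mean(res$data))

# Error in the body: the release still runs before the error propagates.
failed <- tryCatch(
  with_resource(list(name = "b", data = NULL), note_release,
                function(res) stop("boom")),
  error = function(e) "error caught"
)

ok        # 2
released  # "a" "b"
```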

Tuning GC Pressure

# Check current state
jlview_gc_pressure()
# $pinned_bytes
# [1] 80000000
# $threshold
# [1] 10737418240

# Lower the threshold to 500 MB
jlview_set_gc_threshold(500e6)
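The numbers in this output are easy to sanity-check. Assuming the pinned array is a single 10000 x 1000 Float64 matrix and that the default threshold is 10 binary gigabytes (both are inferences from the values shown, not statements from the package), plain arithmetic reproduces them:

```r
# One 10000 x 1000 Float64 matrix at 8 bytes per element.
pinned_bytes <- 10000 * 1000 * 8
format(pinned_bytes, scientific = FALSE)       # "80000000"

# 10 GiB expressed in bytes.
default_threshold <- 10 * 2^30
format(default_threshold, scientific = FALSE)  # "10737418240"

# The 500 MB threshold set above is a decimal value, slightly under 477 MiB.
new_threshold_mib <- 500e6 / 2^20
```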

Copy-on-Write Semantics

jlview objects follow R’s standard copy-on-write (COW) semantics. Read operations (subsetting, aggregation, printing) are zero-copy. Write operations trigger materialization: R allocates a fresh buffer, copies the data from Julia, and the ALTREP wrapper is replaced by a standard R vector.

x <- jlview(JuliaCall::julia_eval("collect(1.0:5.0)"))
y <- x # y and x share Julia memory, no copy
sum(y) # zero-copy read

y[1] <- 999.0 # WRITE: triggers materialization
# y is now a standard R numeric vector (copy of Julia data, modified)
# x still points to Julia memory, unchanged

This is identical to how R treats any shared vector – jlview does not introduce new semantics. The only difference is that before materialization, the backing store is Julia memory instead of R memory.

Serialization

jlview objects can be saved with saveRDS() and restored with readRDS(). On save, the data is materialized into a standard R vector (since Julia memory cannot be serialized). On load, you get back a regular R vector.

x <- jlview(JuliaCall::julia_eval("randn(1000)"))
saveRDS(x, "my_vector.rds")

# In a new session (no Julia needed):
y <- readRDS("my_vector.rds")
class(y) # "numeric" -- a plain R vector

Saved files are therefore self-contained and can be read without Julia, but the zero-copy property is not preserved across save/load cycles.

Known Limitations