llamaR 0.2.2

ragnar integration

Batch embeddings

Context embedding mode

Backend & device selection

Hardware & system

Tests


llamaR 0.2.1

New functions

Bug fixes

Tests


llamaR 0.2.0

Hugging Face integration

New functions

Dependencies


llamaR 0.1.3

GPU and build system improvements

Vulkan GPU support on Windows

CRAN compliance

Dependencies

DESCRIPTION


llamaR 0.1.2

CRAN compliance fixes

Documentation

DESCRIPTION

Packaging


llamaR 0.1.1

R interface — first working release

The full LLM inference cycle is now available from R:

Memory management

The model and context are wrapped as ExternalPtr objects with automatic GC finalizers. The context holds a reference to the model's ExternalPtr, preventing the model from being garbage-collected while the context is still alive.

Generation internals

llama_generate() runs the full pipeline in a single C++ call: prompt tokenization → encode → autoregressive decode loop with a sampler chain → detokenization of generated tokens.

Tests

19 assertions across 7 test blocks, all passing.


llamaR 0.1.0

Initial Release

Dependencies

Known Limitations