Arithmetic routines for native R matrices and big.matrix objects
Frédéric Bertrand, Michael J. Kane, Bryan Lewis, John W. Emerson
https://doi.org/10.32614/CRAN.package.bigalgebra
bigalgebra provides fast linear algebra primitives that operate seamlessly on base matrix objects and bigmemory::big.matrix containers. The package wraps BLAS and LAPACK routines with R-friendly helpers so that vector updates, matrix products, and classic decompositions work the same way in memory or on disk.
Package highlights
- big.matrix workflows – Guidance on creating, sharing, and cleaning up file-backed matrices is collected in the Working with big.matrix Objects vignette.
- Vector kernels – Level 1 BLAS-style helpers such as dset(), dsub() and ddot() extend familiar vector algebra to big.matrix inputs. See the Level 1 BLAS-Style Helpers vignette.
- Matrix products – Wrappers including dgemm() and dsymm() expose Level 3 BLAS routines for dense matrix multiplication with optional file-backed outputs. Explore the Matrix Wrapper Helpers vignette.
- LAPACK decompositions – QR, Cholesky, eigenvalue, and SVD helpers (dgeqrf(), dpotrf(), dgeev(), dgesdd()) bring advanced factorisations to large datasets. Walk through the LAPACK Decompositions vignette.
Package options
The package defines a number of global options whose names begin with bigalgebra. Each option is listed with its default value:
- bigalgebra.temp_pattern (default: matrix_)
- bigalgebra.tempdir (default: tempdir)
- bigalgebra.mixed_arithmetic_returns_R_matrix (default: TRUE)
- bigalgebra.DEBUG (default: FALSE)
The bigalgebra.tempdir option must be a function that returns the path of a temporary directory used to store big matrix results of BLAS and LAPACK operations. The default value is simply the base R tempdir() function.
The bigalgebra.temp_pattern option is a name prefix for the files backing big matrix objects generated by BLAS and LAPACK operations.
The bigalgebra.mixed_arithmetic_returns_R_matrix option determines whether arithmetic operations involving an R matrix or vector and a big.matrix matrix or vector return a big matrix (when the option is FALSE) or a normal R matrix (when it is TRUE).
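As a minimal sketch, these options can be set with base R's options() and read back with getOption(). The directory name bigalgebra-scratch and the prefix result_ are hypothetical values chosen for illustration, and the sketch assumes the package consults the options when an operation runs.

```r
# Hypothetical scratch directory for file-backed results (illustrative name).
scratch_dir <- file.path(tempdir(), "bigalgebra-scratch")
dir.create(scratch_dir, showWarnings = FALSE)

old <- options(
  # Must be a function that returns a directory path, not the path itself.
  bigalgebra.tempdir = function() scratch_dir,
  # Prefix for the file names of generated results (illustrative value).
  bigalgebra.temp_pattern = "result_",
  # FALSE: mixed R matrix / big.matrix arithmetic returns a big.matrix.
  bigalgebra.mixed_arithmetic_returns_R_matrix = FALSE
)

# Read the settings back.
getOption("bigalgebra.tempdir")()
getOption("bigalgebra.temp_pattern")

# Restore the previous values when done.
options(old)
```

Saving the return value of options() and passing it back afterwards keeps the change local to the code that needs it.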
BLAS and LAPACK backends
The package is built, by default, with R’s native BLAS libraries, which use 32-bit signed integer indexing. The default build is limited to vectors of at most 2^31 − 1 entries and matrices with at most 2^31 − 1 rows and 2^31 − 1 columns (note that standard R matrices are limited to 2^31 − 1 total entries).
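To make these limits concrete, the following base R sketch checks a hypothetical 50,000 by 50,000 matrix against the bounds stated above. The dimensions are illustrative only.

```r
# Largest 32-bit signed integer, i.e. 2^31 - 1.
int_max <- .Machine$integer.max
stopifnot(int_max == 2^31 - 1)

# Hypothetical dimensions chosen for illustration.
n_rows <- 50000
n_cols <- 50000

# Each dimension is far below the per-dimension limit ...
dims_ok <- n_rows <= int_max && n_cols <= int_max

# ... but the total entry count (computed in double precision to avoid
# integer overflow) exceeds 2^31 - 1.
total_entries <- as.numeric(n_rows) * as.numeric(n_cols)
total_exceeds <- total_entries > int_max

c(dims_ok = dims_ok, total_exceeds = total_exceeds)
```

Both flags are TRUE here: the row and column counts fit 32-bit indexing individually, while the total number of entries does not.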
The package includes a reference BLAS implementation that supports 64-bit integer indexing, relaxing the limits on vector lengths and on matrix row and column counts. To install the package with the 64-bit reference BLAS implementation, run the following from the command line:
REFBLAS=1 R CMD INSTALL bigalgebra
where bigalgebra is the source package (for example, bigalgebra_0.9.0.tar.gz).
The package may also be built with user-supplied external BLAS and LAPACK libraries, in either 32- or 64-bit varieties. This is an advanced topic that requires additional Makevars modification, and may include adjustment of the low-level calling syntax depending on the library used.
Feel free to contact us for help installing and running the package.
This website, the unit tests, some C code fixes and improvements as well as these examples were created by F. Bertrand.
Maintainer: Frédéric Bertrand frederic.bertrand@lecnam.net.
Installation
You can install the released version of bigalgebra from CRAN with:
install.packages("bigalgebra")
You can install the development version of bigalgebra from GitHub with:
devtools::install_github("fbertran/bigalgebra")
Quick tour of the functionality
The snippets below mirror the worked examples in the vignettes and show how the helpers behave with in-memory and file-backed matrices.
Level 1 BLAS helpers
These helpers cover vector updates, reductions, and element-wise transforms such as the in-place square root provided by dsqrt().
library(bigmemory)
library(bigalgebra)
x <- bigmemory::big.matrix(5, 1, init = 0)
dset(ALPHA = 9, X = x)
dsqrt(X = x)
x[]
#> [1] 3 3 3 3 3
y <- bigmemory::big.matrix(5, 1, init = 1)
dvcal(ALPHA = 0.5, X = x, BETA = 2, Y = y)
y[]
#> [1] 3.5 3.5 3.5 3.5 3.5
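For comparison, the same computation can be sketched with plain R vectors. This assumes, consistent with the output above, that dvcal() performs the update Y := ALPHA * X + BETA * Y; the variable names are illustrative.

```r
# Base R reference for the Level 1 example above (no big.matrix involved).
x_ref <- rep(9, 5)      # what dset(ALPHA = 9, X = x) leaves in x
x_ref <- sqrt(x_ref)    # element-wise square root, as dsqrt() applies in place
y_ref <- rep(1, 5)      # y was initialised with init = 1

alpha <- 0.5
beta  <- 2
y_ref <- alpha * x_ref + beta * y_ref   # assumed dvcal() update rule
y_ref
```

Each entry works out to 0.5 * 3 + 2 * 1 = 3.5, matching the big.matrix result.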
Matrix products with dgemm()
A <- bigmemory::big.matrix(5, 4, init = 1)
B <- bigmemory::big.matrix(4, 4, init = 2)
C <- bigmemory::big.matrix(5, 4, init = 0)
dgemm(A = A, B = B, C = C, ALPHA = 1, BETA = 0)
C[]
#> [,1] [,2] [,3] [,4]
#> [1,] 8 8 8 8
#> [2,] 8 8 8 8
#> [3,] 8 8 8 8
#> [4,] 8 8 8 8
#> [5,] 8 8 8 8
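The result can be cross-checked in base R. The sketch below relies on the standard BLAS definition of DGEMM, C := ALPHA * A %*% B + BETA * C, applied to ordinary matrices with the same contents; the names ending in _ref are illustrative.

```r
# Ordinary R matrices with the same contents as the big.matrix inputs above.
A_ref <- matrix(1, nrow = 5, ncol = 4)
B_ref <- matrix(2, nrow = 4, ncol = 4)
C_ref <- matrix(0, nrow = 5, ncol = 4)

alpha <- 1
beta  <- 0

# Standard DGEMM update: scaled product plus scaled previous contents of C.
C_ref <- alpha * (A_ref %*% B_ref) + beta * C_ref
C_ref
```

Every entry is a sum of four products 1 * 2, which gives the 8s shown in the output above.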
LAPACK decompositions
set.seed(1)
M <- matrix(rnorm(9), 3)
SPD <- crossprod(M)
SPD_big <- as.big.matrix(SPD)
dpotrf(A = SPD_big)
#> [1] 0
chol_factor <- SPD_big[,]
chol_factor[lower.tri(chol_factor)] <- 0
chol_factor
#> [,1] [,2] [,3]
#> [1,] 1.060398 -0.2388263 -0.6138286
#> [2,] 0.000000 1.8082109 0.2222424
#> [3,] 0.000000 0.0000000 0.8294922
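As a sanity check, base R's chol() returns an upper-triangular factor R satisfying t(R) %*% R = SPD. The sketch below rebuilds SPD from the same seed and verifies that identity. It assumes, as the zeroing of the lower triangle above suggests, that dpotrf() stores its factor in the upper triangle, so the two results should be comparable.

```r
# Rebuild the same symmetric positive definite matrix as above.
set.seed(1)
M <- matrix(rnorm(9), 3)
SPD <- crossprod(M)

# Base R Cholesky factor: upper triangular, with t(R_ref) %*% R_ref == SPD.
R_ref <- chol(SPD)

# The strictly lower triangle of the factor is zero.
stopifnot(all(R_ref[lower.tri(R_ref)] == 0))

# Reconstruct SPD from the factor and compare up to floating-point tolerance.
reconstruction_ok <- isTRUE(all.equal(crossprod(R_ref), SPD))
reconstruction_ok
```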
File-backed big.matrix workflows
tmpdir <- tempdir()
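# Note: as the output below shows, init acts as a single fill value here,
# so passing diag(3) starts every cell at 1 rather than creating an identity matrix.
file_big <- filebacked.big.matrix(3, 3, init = diag(3),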
backingpath = tmpdir,
backingfile = "example.bin")
#> Warning in filebacked.big.matrix(3, 3, init = diag(3), backingpath = tmpdir, : No
#> descriptor file given, it will be named example.bin.desc
file_big[1, 3] <- 5
file_big[]
#> [,1] [,2] [,3]
#> [1,] 1 1 5
#> [2,] 1 1 1
#> [3,] 1 1 1
rm(file_big)
gc()
#> used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
#> Ncells 900616 48.1 1699095 90.8 NA 1377768 73.6
#> Vcells 2253326 17.2 8388608 64.0 65536 3247956 24.8
Available vignettes
The full vignette set expands on the topics above and demonstrates how the routines interact:
- Working with big.matrix Objects – Managing shared memory, file backing, and clean-up for large datasets.
- Level 1 BLAS-Style Helpers – Filling vectors, Hadamard products, and reductions on disk-backed data.
- Matrix Wrapper Helpers – Symmetric and general matrix products, including strategies for chaining operations.
- LAPACK Decompositions with bigalgebra – QR, Cholesky, eigenvalue, and SVD workflows for both in-memory and file-backed matrices.