Streaming Kernel PLS in bigPLSR: XX^T and Column-Chunked Variants
Frédéric Bertrand
Cedric, Cnam, Paris
frederic.bertrand@lecnam.net
2025-11-18
Source: vignettes/bigPLSR-kpls-streaming.Rmd
Overview
This vignette documents bigPLSR’s kernel PLS
streaming backends for bigmemory::big.matrix inputs. We
provide two complementary streaming strategies:
- Column-chunked Gram (existing): updates based on per-column blocks to form products involving K = X X^T implicitly.
- Row-chunked XX^T (new): computes a = X^T u by scanning rows in blocks, then emits t = X a, enabling efficient access patterns when n >> p or when the storage layout favors row-contiguous slices (e.g., file-backed subsets).
Both strategies produce the same model up to floating-point
round-off. Selection is automatic (see ?pls_fit) or can be
forced by setting the option bigPLSR.kpls_gram to one of "rows", "cols", or "auto", e.g.
options(bigPLSR.kpls_gram = "rows").
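When the override should apply only to part of a session, the option can be set and then restored. The following is a minimal base-R sketch; whether bigPLSR registers a default value on load is not stated here, so the "auto" fallback is passed to getOption() explicitly as an assumption.

```r
# Force the row-chunked variant and remember the previous setting.
old <- options(bigPLSR.kpls_gram = "rows")

# Read the option back; "auto" is only a fallback if the option is unset.
variant <- getOption("bigPLSR.kpls_gram", default = "auto")
print(variant)

# ... fit the model here with pls_fit(...) ...

# Restore whatever was set before (including "unset").
options(old)
```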
Math sketch
Let X in R^{n x p}, Y in R^{n x m} be centered.
At component h, kernel-PLS uses the NIPALS-like fixed-point update:
1. Start with u in R^n (e.g., a column of Y).
2. Compute a = X^T u.
3. Normalize w = a / ||a||_2.
4. Scores: t = X w.
5. Loadings:
   - p = (X^T t)/(t^T t),
   - q = (Y^T t)/(t^T t).
6. Deflate: X <- X - t p^T, Y <- Y - t q^T, and set u <- Y q.
Coefficients after H components are
beta = W (P^T W)^{-1} Q^T,
yhat = 1 * mu_Y + (x - mu_X) beta.
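The update and the coefficient formula can be checked on a small in-memory example. The sketch below is plain base R on simulated data, not bigPLSR's streaming code; the variable names (X0, Y0, t_h, p_h, q_h) and the choice of the first column of Y as the starting u are illustrative assumptions. With H = p components and a full-rank X, the scores span the column space of X, so beta should agree with ordinary least squares, which serves as a sanity check.

```r
set.seed(1)
n <- 50; p <- 4; m <- 2; H <- 4

# Simulated data (assumption: small enough to hold in memory).
X0 <- matrix(rnorm(n * p), n, p)
Y0 <- X0 %*% matrix(rnorm(p * m), p, m) + 0.1 * matrix(rnorm(n * m), n, m)

# Center X and Y, keeping the means for prediction.
mu_X <- colMeans(X0); mu_Y <- colMeans(Y0)
X <- sweep(X0, 2, mu_X)
Y <- sweep(Y0, 2, mu_Y)

W <- matrix(0, p, H); P <- matrix(0, p, H); Q <- matrix(0, m, H)
u <- Y[, 1]                               # start from a column of Y
for (h in seq_len(H)) {
  a   <- crossprod(X, u)                  # a = X^T u
  w   <- a / sqrt(sum(a^2))               # w = a / ||a||_2
  t_h <- X %*% w                          # scores t = X w
  tt  <- sum(t_h^2)
  p_h <- crossprod(X, t_h) / tt           # p = X^T t / (t^T t)
  q_h <- crossprod(Y, t_h) / tt           # q = Y^T t / (t^T t)
  X   <- X - tcrossprod(t_h, p_h)         # deflate X
  Y   <- Y - tcrossprod(t_h, q_h)         # deflate Y
  u   <- Y %*% q_h                        # next starting vector
  W[, h] <- w; P[, h] <- p_h; Q[, h] <- q_h
}

# beta = W (P^T W)^{-1} Q^T, then yhat = mu_Y + (x - mu_X) beta.
beta <- W %*% solve(crossprod(P, W)) %*% t(Q)
yhat <- sweep(sweep(X0, 2, mu_X) %*% beta, 2, mu_Y, "+")

# Sanity check against least squares on the centered data.
beta_ols <- qr.solve(sweep(X0, 2, mu_X), sweep(Y0, 2, mu_Y))
max(abs(beta - beta_ols))                 # should be close to machine precision
```

With fewer components than the rank of X, beta is a regularized solution and is no longer expected to match the least-squares fit.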
The row-chunked implementation keeps X on disk and performs steps (2) and (4) with two passes over row blocks:
- Pass A (accumulate a): for each block B of rows, update a += B^T u_B.
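- Pass B (emit t): for each block B, write t_B = B a; this equals the score block B w of step (4) up to the scalar factor 1/||a||_2.
Loadings p are accumulated exactly as in Pass A, with t_B in place of u_B, and then divided by t^T t.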
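The two passes can be sketched in base R, with an ordinary in-memory matrix standing in for the on-disk big.matrix. This illustrates the access pattern only and is not the package's C++ code; the block size chunk_rows = 25 and the names blocks, t_vec, p_acc are assumptions made for the example. Pass B here applies the normalized w of step (3), so the scores match t = X w directly.

```r
set.seed(42)
n <- 103; p <- 5
X <- matrix(rnorm(n * p), n, p)
X <- sweep(X, 2, colMeans(X))             # centered, as in the math sketch
u <- rnorm(n)

# Row index blocks of at most chunk_rows rows each.
chunk_rows <- 25L
blocks <- split(seq_len(n), ceiling(seq_len(n) / chunk_rows))

# Pass A: accumulate a = X^T u one row block at a time.
a <- numeric(p)
for (idx in blocks) {
  B <- X[idx, , drop = FALSE]
  a <- a + drop(crossprod(B, u[idx]))     # a += B^T u_B
}
w <- a / sqrt(sum(a^2))                   # normalization needs the full a

# Pass B: emit the scores block by block.
t_vec <- numeric(n)
for (idx in blocks) {
  t_vec[idx] <- drop(X[idx, , drop = FALSE] %*% w)
}

# Loadings: same accumulation as Pass A, with t in place of u.
p_acc <- numeric(p)
for (idx in blocks) {
  p_acc <- p_acc + drop(crossprod(X[idx, , drop = FALSE], t_vec[idx]))
}
p_load <- p_acc / sum(t_vec^2)

# Compare with the unchunked computation.
all.equal(a, drop(crossprod(X, u)))
all.equal(p_load, drop(crossprod(X, t_vec)) / sum(t_vec^2))
```

Only one block of rows and a few length-p or length-n vectors are held at once, which is the point of the row-chunked layout.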
APIs
- C++ entry points (Rcpp):
cpp_kpls_stream_xxt(X_ptr, Y_ptr, ncomp, chunk_rows, chunk_cols, center, return_big)
cpp_kpls_stream_cols(X_ptr, Y_ptr, ncomp, chunk_cols, center, return_big)
- R wrapper:
pls_fit(..., backend = "bigmem", algorithm = "kernelpls", chunk_size, chunk_cols, ...)
pls_fit() chooses the variant via
options(bigPLSR.kpls_gram) or heuristics when
"auto" is set (the default).
When to prefer each variant
- Column-chunked (“cols”): good default; excellent when p is large and access by columns is cheap (typical bigmemory column-major backing).
- Row-chunked XX^T (“rows”): prefer when n >> p, when row access is contiguous (e.g., file-backed partitions), or when you want to minimize repeated column-touching across iterations.
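For comparison with the row-wise scan, the column-wise access pattern can be sketched the same way. The example below shows one plausible reading of "products involving K = X X^T implicitly": K u is accumulated as a sum of per-block terms C (C^T u) without ever forming the n x n matrix K. It is base R on an in-memory matrix; the block size chunk_cols = 8 and all variable names are assumptions, and the package's actual column-chunked kernel may be organized differently.

```r
set.seed(7)
n <- 40; p <- 23
X <- matrix(rnorm(n * p), n, p)
X <- sweep(X, 2, colMeans(X))             # centered columns
u <- rnorm(n)

# Column index blocks of at most chunk_cols columns each.
chunk_cols <- 8L
col_blocks <- split(seq_len(p), ceiling(seq_len(p) / chunk_cols))

# K u = sum over column blocks C of C (C^T u); K itself is never built.
Ku <- numeric(n)
for (jdx in col_blocks) {
  C  <- X[, jdx, drop = FALSE]
  Ku <- Ku + drop(C %*% crossprod(C, u))
}

# Reference: form K = X X^T explicitly (only feasible for small n).
Ku_direct <- drop(tcrossprod(X) %*% u)
all.equal(Ku, Ku_direct)
```

Each iteration touches a contiguous set of columns, which is cheap for column-major storage; the row-chunked variant reaches the same product by scanning rows instead.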
References
- Dayal, B., & MacGregor, J.F. (1997). Improved PLS algorithms. Journal of Chemometrics, 11(1), 73–85.
- Rosipal, R., & Trejo, L.J. (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research, 2, 97–123.
- (and other kernel/logistic/sparse KPLS references in the
kpls_review vignette)