Double RKHS PLS (rkhs_xy): Theory and Usage
Frédéric Bertrand
Cedric, Cnam, Paris
frederic.bertrand@lecnam.net
2025-11-18
Source: vignettes/double-rkhs-pls.Rmd
Overview
We implement a double RKHS variant of PLS, where both the input and the output spaces are endowed with reproducing kernels:
- the input Gram $K_X \in \mathbb{R}^{n \times n}$ with entries $(K_X)_{ij} = k_X(x_i, x_j)$,
- the output Gram $K_Y \in \mathbb{R}^{n \times n}$ with entries $(K_Y)_{ij} = k_Y(y_i, y_j)$.
We use centered Grams $\tilde K_X = H K_X H$ and $\tilde K_Y = H K_Y H$, where $H = I_n - \tfrac{1}{n}\mathbf{1}\mathbf{1}^\top$.
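As an illustration, here is a minimal base-R sketch of the two Gram matrices and their double centering. The rbf_kernel helper, the toy data, and the choice of an RBF kernel on the inputs with a linear kernel on the outputs are assumptions made for this sketch (mirroring the options in the minimal example below); they are not bigPLSR internals.
# Toy data and an illustrative RBF kernel helper (not part of bigPLSR)
set.seed(1)
X <- matrix(rnorm(40 * 3), 40, 3)
Y <- cbind(sin(X[, 1]), X[, 2]^2)
rbf_kernel <- function(A, B, gamma = 0.5) {
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-gamma * d2)
}
K_X <- rbf_kernel(X, X)              # n x n input Gram
K_Y <- tcrossprod(Y)                 # n x n output Gram (linear kernel on Y)
n <- nrow(K_X)
H <- diag(n) - matrix(1 / n, n, n)   # centering matrix H = I - (1/n) 1 1'
Kx_c <- H %*% K_X %*% H              # centered input Gram
Ky_c <- H %*% K_Y %*% H              # centered output Gram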
Operator and Latent Directions
Following the spirit of Kernel PLS Regression II (IEEE TNNLS, 2019), we avoid explicit square roots and form an SPD surrogate operator from the centered Grams $\tilde K_X$ and $\tilde K_Y$, with a small ridge $\epsilon I$ for stability. We compute the first $k$ orthonormal latent directions $t_1, \dots, t_k \in \mathbb{R}^n$ via power iteration with Gram–Schmidt orthogonalization against the previously extracted directions.
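A sketch of the latent-direction extraction, continuing the objects above. The symmetric product used for the surrogate operator here (Kx_c %*% Ky_c %*% Kx_c plus a small ridge) is only an illustrative stand-in; the operator actually formed by rkhs_xy is built from the centered Grams but may differ in detail.
eps <- 1e-8
M <- Kx_c %*% Ky_c %*% Kx_c + eps * diag(n)   # illustrative SPD surrogate operator
k <- 3
T_scores <- matrix(0, n, k)
for (j in seq_len(k)) {
  t_j <- rnorm(n)
  for (it in 1:200) {
    t_j <- M %*% t_j                          # power iteration step
    if (j > 1) {                              # Gram-Schmidt against earlier directions
      T_prev <- T_scores[, 1:(j - 1), drop = FALSE]
      t_j <- t_j - T_prev %*% crossprod(T_prev, t_j)
    }
    t_j <- t_j / sqrt(sum(t_j^2))             # renormalise
  }
  T_scores[, j] <- t_j
}
round(crossprod(T_scores), 6)                 # approximately the identity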
We then solve a small regression of the centered responses on the latent scores $T = [t_1, \dots, t_k]$ and form dual coefficients $\alpha$ so that training predictions satisfy $$ \hat Y \;=\; \tilde K_X \, \alpha + \mathbf{1} \, \bar y^\top . $$
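Continuing the sketch, one way to realise the latent regression and dual coefficients, assuming a ridge-regularised solve against the centered input Gram (the package may form alpha differently):
Y_c <- scale(Y, center = TRUE, scale = FALSE)              # centered responses
B <- crossprod(T_scores, Y_c)                              # latent regression (T orthonormal)
lambda_x <- 1e-6
alpha <- solve(Kx_c + lambda_x * diag(n), T_scores %*% B)  # dual coefficients
Yhat_train <- Kx_c %*% alpha +
  matrix(colMeans(Y), n, ncol(Y), byrow = TRUE)            # training predictions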
Centering for Prediction
Given new inputs $X_*$, define the cross-Gram $$ K_* = K(X_*, X) . $$ To apply training centering to $K_*$, use $$ \tilde K_* \;=\; K_* \;-\; \mathbf{1}\, \bar k_X^\top \;-\; \bar k_* \mathbf{1}^\top \;+\; \mu_X, $$ where:
- $\bar k_X$ is the column mean vector of the (uncentered) training Gram $K_X$,
- $\mu_X$ is its grand mean,
- $\bar k_*$ is the row mean vector of $K_*$ (computed at prediction time).
Predictions then follow the familiar dual form: $$ \hat Y_* \;=\; \tilde K_* \, \alpha + \mathbf{1}_* \, \bar y^\top . $$
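Continuing the sketch, the centered cross-Gram and the dual prediction for new inputs (X_new is illustrative; predict() carries out this computation for rkhs_xy fits):
X_new <- matrix(rnorm(5 * 3), 5, 3)
K_star <- rbf_kernel(X_new, X)                      # cross-Gram K(X_new, X)
k_bar_X <- colMeans(K_X)                            # column means of the training Gram
mu_X <- mean(K_X)                                   # its grand mean
k_bar_star <- rowMeans(K_star)                      # row means of the cross-Gram
K_star_c <- K_star -
  matrix(k_bar_X, nrow(K_star), n, byrow = TRUE) -
  matrix(k_bar_star, nrow(K_star), n) +
  mu_X
Y_new_hat <- K_star_c %*% alpha +
  matrix(colMeans(Y), nrow(K_star), ncol(Y), byrow = TRUE)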
Practical Notes
- Choose the input kernel $k_X$ (e.g., RBF) to reflect nonlinear structure in the inputs. A linear output kernel $k_Y$ already produces numeric outputs in $\mathbb{R}^m$.
- The ridge terms $\lambda_X$ and $\lambda_Y$ stabilize the inversions and dampen numerical noise.
- With algorithm = "rkhs_xy", the package returns:
  - dual_coef,
  - scores (approximately orthonormal),
  - intercept,
  - and uses the centered cross-kernel formula above in predict() (see the sketch after this list).
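For instance, after fitting as in the minimal example below, the returned pieces can be inspected. List-style $ access to these components is assumed here for illustration:
# fit: an rkhs_xy object from pls_fit(), as in the minimal example below
str(fit$dual_coef)               # dual coefficients used with the centered cross-kernel
round(crossprod(fit$scores), 3)  # scores: should be close to the identity
fit$intercept                    # intercept added back by predict()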
Minimal Example
library(bigPLSR)
set.seed(42)
n <- 60; p <- 6; m <- 2
X <- matrix(rnorm(n * p), n, p)
# Nonlinear responses with a little noise
Y <- cbind(sin(X[, 1]) + 0.4 * X[, 2]^2,
           cos(X[, 3]) - 0.3 * X[, 4]^2) + matrix(rnorm(n * m, sd = 0.05), n, m)
# RBF kernel on X, linear kernel on Y, small ridges on both Grams
op <- options(
  bigPLSR.rkhs_xy.kernel_x = "rbf",
  bigPLSR.rkhs_xy.gamma_x = 0.5,
  bigPLSR.rkhs_xy.kernel_y = "linear",
  bigPLSR.rkhs_xy.lambda_x = 1e-6,
  bigPLSR.rkhs_xy.lambda_y = 1e-6
)
on.exit(options(op), add = TRUE)
fit <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy", backend = "arma")
Yhat <- predict(fit, X)
mean((Y - Yhat)^2)
#> [1] 2.619847e-12
References
- Rosipal, R. & Trejo, L. J. (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. JMLR 2:97–123. doi:10.5555/944733.944741.
- Kernel PLS Regression II: Kernel Partial Least Squares Regression by Projecting Both Independent and Dependent Variables into Reproducing Kernel Hilbert Space. IEEE TNNLS (2019). doi:10.1109/TNNLS.2019.2932014.