Overview

We implement a double RKHS variant of PLS, where both the input and the output spaces are endowed with reproducing kernels:

  • $K_X \in \mathbb{R}^{n \times n}$ with entries $[K_X]_{ij} = k_X(x_i, x_j)$,
  • $K_Y \in \mathbb{R}^{n \times n}$ with entries $[K_Y]_{ij} = k_Y(y_i, y_j)$.

We use the centered Gram matrices $\tilde K_X = H K_X H$ and $\tilde K_Y = H K_Y H$, where $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$ is the usual centering matrix.
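As a concrete illustration, these quantities can be formed in a few lines of base R (a minimal sketch, not the package's internal code; the toy data, the RBF width gamma, and the helper rbf_kernel() are assumptions made for the example):

# Minimal sketch of the two Gram matrices and their centering (base R).
set.seed(1)
X <- matrix(rnorm(40 * 3), 40, 3)          # toy inputs
Y <- matrix(rnorm(40 * 2), 40, 2)          # toy multivariate outputs

rbf_kernel <- function(A, B, gamma = 0.5) {
  # squared Euclidean distances, then the Gaussian/RBF kernel
  d2 <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-gamma * d2)
}

K_X <- rbf_kernel(X, X)                    # [K_X]_ij = k_X(x_i, x_j)
K_Y <- tcrossprod(Y)                       # linear output kernel: [K_Y]_ij = y_i' y_j
n   <- nrow(X)
H   <- diag(n) - matrix(1 / n, n, n)       # H = I - (1/n) 1 1'
K_X_c <- H %*% K_X %*% H                   # centered Grams
K_Y_c <- H %*% K_Y %*% H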

Operator and Latent Directions

Following the spirit of Kernel PLS Regression II (IEEE TNNLS, 2019), we avoid explicit matrix square roots and form the SPD surrogate operator
$$ \mathcal{M} \, v \;=\; (K_X + \lambda_x I)^{-1} \, K_X \, K_Y \, K_X \, (K_X + \lambda_x I)^{-1} \, v, $$
with a small ridge $\lambda_x > 0$ for numerical stability. We compute the first $A$ orthonormal latent directions $T = [t_1, \dots, t_A]$ via power iteration with Gram–Schmidt orthogonalization applied to $\mathcal{M}$.
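A hedged sketch of this step, continuing the toy objects from the previous snippet (the ridge lambda_x, the number of components A_comp, and the fixed iteration budget are illustrative choices, and the centered Grams are used in place of $K_X$, $K_Y$, which is an assumption of the sketch):

# Power iteration with Gram-Schmidt deflation on the surrogate operator M.
lambda_x <- 1e-6
A_comp   <- 2
Rinv <- solve(K_X_c + lambda_x * diag(n))       # (K_X + lambda_x I)^{-1}
M    <- Rinv %*% K_X_c %*% K_Y_c %*% K_X_c %*% Rinv
T_mat <- matrix(0, n, A_comp)
for (a in seq_len(A_comp)) {
  t_a <- rnorm(n)
  for (it in 1:200) {                           # fixed budget; a tolerance check also works
    t_a <- drop(M %*% t_a)
    if (a > 1) {                                # orthogonalize against earlier directions
      Tprev <- T_mat[, seq_len(a - 1), drop = FALSE]
      t_a   <- t_a - Tprev %*% crossprod(Tprev, t_a)
    }
    t_a <- t_a / sqrt(sum(t_a^2))               # keep unit norm
  }
  T_mat[, a] <- t_a
}
crossprod(T_mat)                                # ~ identity: the scores are orthonormal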

We then solve a small regression in the latent space,
$$ C \;=\; (T^\top T)^{-1} (T^\top \tilde Y), \qquad \tilde Y \;=\; Y - \mathbf{1}\, \bar y^\top, $$
and form the dual coefficients
$$ \alpha \;=\; U\, C, \qquad U \;=\; (K_X + \lambda_x I)^{-1} T, $$
so that the training predictions satisfy
$$ \hat Y \;=\; \tilde K_X \, \alpha + \mathbf{1}\, \bar y^\top. $$
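Continuing the sketch, the latent regression, dual coefficients, and fitted values follow directly (again a base-R illustration under the same assumptions, not the package code):

# Latent-space regression and dual coefficients.
y_bar <- colMeans(Y)
Y_c   <- sweep(Y, 2, y_bar)                               # Y - 1 y_bar'
C     <- solve(crossprod(T_mat), crossprod(T_mat, Y_c))   # (T'T)^{-1} T' Y_c
U     <- Rinv %*% T_mat                                   # (K_X + lambda_x I)^{-1} T
alpha <- U %*% C                                          # dual coefficients
Y_hat <- sweep(K_X_c %*% alpha, 2, y_bar, "+")            # training predictions
mean((Y - Y_hat)^2)                                       # training error of the sketch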

Centering for Prediction

Given new inputs $X_*$, define the cross-Gram $$ K_* \;=\; K(X_*, X). $$ To apply the training centering to $K_*$, use $$ \tilde K_* \;=\; K_* \;-\; \mathbf{1}_*\, \bar k_X^\top \;-\; \bar k_*\, \mathbf{1}^\top \;+\; \mu_X, $$ where:

  • $\bar k_X = \frac{1}{n} K_X \mathbf{1}$ is the column-mean vector of the (uncentered) training Gram,
  • $\mu_X = \frac{1}{n^2} \mathbf{1}^\top K_X \mathbf{1}$ is its grand mean (added entrywise),
  • $\bar k_*$ is the vector of row means of $K_*$ (computed at prediction time).

Predictions then follow the familiar dual form: $$ \hat Y_* \;=\; \tilde K_*\, \alpha + \mathbf{1}_*\, \bar y^\top. $$
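Putting the two pieces together, here is a hedged sketch of out-of-sample prediction, continuing the toy objects above (X_new is an illustrative matrix of new inputs):

# Centered cross-Gram and dual-form prediction for new inputs.
X_new   <- matrix(rnorm(5 * ncol(X)), 5, ncol(X))
K_star  <- rbf_kernel(X_new, X)                   # n_* x n cross-Gram K(X_*, X)
k_bar_X <- colMeans(K_X)                          # column means of the training Gram
k_bar_s <- rowMeans(K_star)                       # row means of K_* (prediction time)
mu_X    <- mean(K_X)                              # grand mean of K_X
K_star_c <- K_star -
  tcrossprod(rep(1, nrow(K_star)), k_bar_X) -     # 1_* k_bar_X'
  tcrossprod(k_bar_s, rep(1, n)) +                # k_bar_* 1'
  mu_X
Y_star <- sweep(K_star_c %*% alpha, 2, y_bar, "+")  # hat Y_* = tilde K_* alpha + 1 y_bar'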

Practical Notes

  • Choose $k_X$ (e.g., RBF) to reflect nonlinear structure in the inputs. A linear $k_Y$ already produces numeric outputs in $\mathbb{R}^m$.
  • The ridge terms $\lambda_x, \lambda_y$ stabilize the inversions and dampen numerical noise.
  • With algorithm = "rkhs_xy", the package returns (see the short inspection sketch after this list):
    • dual_coef $= \alpha$,
    • scores $= T$ (approximately orthonormal),
    • intercept $= \bar y$,
    • and uses the centered cross-kernel formula above in predict().
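After running the minimal example in the next section, these components can be inspected along the following lines (a hedged sketch: the element names are taken from the list above, and accessing them as plain list elements with `$` is an assumption rather than documented API; check str(fit) in practice):

# Assumes `fit` from the minimal example below; `$` access is an assumption.
str(fit$dual_coef)               # dual coefficients alpha (one column per response)
round(crossprod(fit$scores), 3)  # ~ identity matrix if the scores are orthonormal
fit$intercept                    # stored output mean y_bar used by predict()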

Minimal Example

library(bigPLSR)
set.seed(42)
n <- 60; p <- 6; m <- 2
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(sin(X[,1]) + 0.4 * X[,2]^2,
           cos(X[,3]) - 0.3 * X[,4]^2) + matrix(rnorm(n*m, sd=.05), n, m)

op <- options(
  bigPLSR.rkhs_xy.kernel_x = "rbf",
  bigPLSR.rkhs_xy.gamma_x  = 0.5,
  bigPLSR.rkhs_xy.kernel_y = "linear",
  bigPLSR.rkhs_xy.lambda_x = 1e-6,
  bigPLSR.rkhs_xy.lambda_y = 1e-6
)

fit <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy", backend = "arma")
Yhat <- predict(fit, X)
mean((Y - Yhat)^2)
#> [1] 2.619847e-12

options(op)  # restore the previous options

References

  • Rosipal, R., & Trejo, L. J. (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research, 2, 97–123. doi:10.5555/944733.944741.
  • Kernel PLS Regression II: Kernel Partial Least Squares Regression by Projecting Both Independent and Dependent Variables into Reproducing Kernel Hilbert Space. IEEE Transactions on Neural Networks and Learning Systems (2019). doi:10.1109/TNNLS.2019.2932014.