Partial Least Squares Components for Cox Models with Big Matrices
Source:R/big_pls_cox.R.R
big_pls_cox.RdCompute Partial Least Squares (PLS) components tailored for
Cox proportional hazards models when predictors are stored as a
big.matrix from the bigmemory package.
Usage
big_pls_cox(
X,
time,
status,
ncomp = 2L,
control = survival::coxph.control(),
keepX = NULL
)Arguments
- X
A numeric matrix or a
bigmemory::big.matrixobject containing the predictors.- time
Numeric vector of survival times.
- status
Integer (0/1) vector of event indicators.
- ncomp
Number of latent components to compute.
- control
Optional list passed to
survival::coxph.control.- keepX
Optional integer vector specifying the number of variables to retain (naive sparsity) in each component. A value of zero keeps all predictors. If a single integer is supplied it is recycled across components.
Value
A list with the computed scores, loadings, weights, scaling information and the
fitted Cox model returned by survival::coxph.fit.
Details
The function standardises each predictor column, iteratively builds latent scores using martingale residuals from Cox fits, and deflates the predictors without materialising the full design matrix in memory. Both in-memory and file-backed bigmemory matrices are supported.
References
Maumy, M., Bertrand, F. (2023). PLS models and their extension for big data. Joint Statistical Meetings (JSM 2023), Toronto, ON, Canada.
Maumy, M., Bertrand, F. (2023). bigPLS: Fitting and cross-validating PLS-based Cox models to censored big data. BioC2023 — The Bioconductor Annual Conference, Dana-Farber Cancer Institute, Boston, MA, USA. Poster. https://doi.org/10.7490/f1000research.1119546.1
Bastien, P., Bertrand, F., Meyer, N., & Maumy-Bertrand, M. (2015). Deviance residuals-based sparse PLS and sparse kernel PLS for censored data. Bioinformatics, 31(3), 397–404. doi:10.1093/bioinformatics/btu660
Bertrand, F., Bastien, P., Meyer, N., & Maumy-Bertrand, M. (2014). PLS models for censored data. In Proceedings of UseR! 2014 (p. 152).
Examples
if (requireNamespace("survival", quietly = TRUE)) {
set.seed(1)
X <- matrix(rnorm(100), nrow = 20)
time <- rexp(20)
status <- rbinom(20, 1, 0.5)
fit <- big_pls_cox(X, time, status, ncomp = 2)
str(fit)
}
#> List of 9
#> $ scores : num [1:20, 1:2] 0.315 0.28 -1.226 1.668 -0.372 ...
#> $ loadings: num [1:5, 1:2] 0.6905 0.0875 0.2379 0.2205 -0.6772 ...
#> $ weights : num [1:5, 1:2] 0.681 0.273 0.248 0.117 -0.622 ...
#> $ center : num [1:5] 0.19052 -0.00647 0.1388 0.10174 0.11985
#> $ scale : num [1:5] 0.913 0.871 0.81 1.05 0.911
#> $ cox_fit :List of 10
#> ..$ coefficients : num [1:2] 0.756 0.247
#> ..$ var : num [1:2, 1:2] 0.1286 0.0544 0.0544 0.2571
#> ..$ loglik : num [1:2] -18.5 -15.8
#> ..$ score : num 4.82
#> ..$ iter : int 4
#> ..$ linear.predictors: num [1:20] 0.0329 0.0502 -0.7536 1.4699 -0.5345 ...
#> ..$ residuals : Named num [1:20] -0.26 -0.125 -0.504 0.837 0.921 ...
#> .. ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
#> ..$ means : num [1:2] 7.77e-17 -3.33e-17
#> ..$ method : chr "efron"
#> ..$ class : chr "coxph"
#> $ keepX : int [1:2] 0 0
#> $ time : num [1:20] 0.7632 1.5727 1.8356 0.0372 0.1256 ...
#> $ status : num [1:20] 0 1 0 1 1 1 0 0 0 1 ...
#> - attr(*, "class")= chr "big_pls_cox"