Compute the coefficient vectors after variable selection, for every fitting criterion of a given model. May be used for a step-by-step use of SelectBoost.
Usage
lasso_msgps_all(X, Y, penalty = "enet")
enet_msgps_all(X, Y, penalty = "enet", alpha = 0.5)
alasso_msgps_all(X, Y, penalty = "alasso")
alasso_enet_msgps_all(X, Y, penalty = "alasso", alpha = 0.5)
lasso_cv_glmnet_all_5f(X, Y)
spls_spls_all(X, Y, K.seq = c(1:5), eta.seq = (1:9)/10, fold.val = 5)
varbvs_linear_all(X, Y, include.threshold.list = (1:19)/20)
lasso_cv_glmnet_bin_all(X, Y)
lasso_glmnet_bin_all(X, Y)
splsda_spls_all(X, Y, K.seq = c(1:10), eta.seq = (1:9)/10)
sgpls_spls_all(X, Y, K.seq = c(1:10), eta.seq = (1:9)/10)
varbvs_binomial_all(X, Y, include.threshold.list = (1:19)/20)
Arguments
- X
- A numeric matrix. The predictors matrix. 
- Y
- A numeric vector or a binary factor. The continuous response for the linear-model functions or the 0/1 classification response for the logistic ones. 
- penalty
- A character value to select the penalty term in msgps (Model Selection Criteria via Generalized Path Seeking). Defaults to "enet". "genet" is the generalized elastic net and "alasso" is the adaptive lasso, which is a weighted version of the lasso. 
- alpha
- A numeric value to set the value of \(\alpha\) for the "enet" and "genet" penalties in msgps (Model Selection Criteria via Generalized Path Seeking). 
- K.seq
- A numeric vector. Number of components to test. 
- eta.seq
- A numeric vector. Eta sequence to test. 
- fold.val
- A numeric value. Number of folds to use. 
- include.threshold.list
- A numeric vector. Vector of thresholds to use. 
- K
- A numeric value. Number of folds to use. 
Details
lasso_msgps_all returns the matrix of coefficients
for an optimal linear model estimated by the LASSO estimator and selected
by model selection criteria including Mallows' Cp, bias-corrected AIC (AICc),
generalized cross validation (GCV) and BIC.
The msgps function of the msgps package implements
Model Selection Criteria via Generalized Path Seeking to compute the degrees
of freedom of the LASSO.
enet_msgps_all returns the matrix of coefficients
for an optimal linear model estimated by the ELASTIC NET estimator and selected
by model selection criteria including Mallows' Cp, bias-corrected AIC (AICc),
generalized cross validation (GCV) and BIC.
The msgps function of the msgps package implements
Model Selection Criteria via Generalized Path Seeking to compute the degrees
of freedom of the ELASTIC NET.
alasso_msgps_all returns the matrix of coefficients
for an optimal linear model estimated by the adaptive LASSO estimator and selected
by model selection criteria including Mallows' Cp, bias-corrected AIC (AICc),
generalized cross validation (GCV) and BIC.
The msgps function of the msgps package implements
Model Selection Criteria via Generalized Path Seeking to compute the degrees
of freedom of the adaptive LASSO.
alasso_enet_msgps_all returns the matrix of coefficients
for an optimal linear model estimated by the adaptive ELASTIC NET estimator and selected
by model selection criteria including Mallows' Cp, bias-corrected AIC (AICc),
generalized cross validation (GCV) and BIC.
The msgps function of the msgps package implements
Model Selection Criteria via Generalized Path Seeking to compute the degrees
of freedom of the adaptive ELASTIC NET.
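As an illustration of what these four wrappers have in common, here is a minimal sketch (an assumption about the workflow, not the wrappers' actual code) that fits a single msgps path on toy data and reads off the coefficients selected by each criterion; the four functions essentially differ in the penalty and alpha values passed to msgps.

set.seed(314)
X <- matrix(rnorm(100 * 6), 100, 6)
Y <- drop(X %*% c(3, 1.5, 0, 0, 2, 0) + rnorm(100, sd = 3))
# plain lasso (alpha = 0 default); penalty = "alasso" or alpha > 0 gives the other variants
fit <- msgps::msgps(X, Y, penalty = "enet")
# coefficients at the tuning values chosen by Cp, AICc, GCV and BIC
coef(fit)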
lasso_cv_glmnet_all_5f returns the matrix of coefficients
for a linear model estimated by the LASSO using the lambda.min and lambda.1se
values (the largest lambda within one standard error of the minimum) computed by
5-fold cross-validation. It uses the glmnet and cv.glmnet functions of the glmnet package.
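A minimal sketch of the underlying cross-validation step, assuming a plain gaussian lasso fit (the wrapper may differ in details such as standardisation):

set.seed(314)
X <- matrix(rnorm(100 * 6), 100, 6)
Y <- drop(X %*% c(3, 1.5, 0, 0, 2, 0) + rnorm(100, sd = 3))
cvfit <- glmnet::cv.glmnet(X, Y, nfolds = 5)                        # lasso path + 5-fold CV
cbind(lambda.min = as.matrix(coef(cvfit, s = "lambda.min"))[-1, 1],
      lambda.1se = as.matrix(coef(cvfit, s = "lambda.1se"))[-1, 1]) # drop the intercept
# lasso_cv_glmnet_bin_all follows the same pattern with family = "binomial".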
spls_spls_all returns the matrix of the raw (coef.spls) and bootstrap-corrected
(correct.spls) coefficients
for a linear model estimated by SPLS (sparse partial least squares) with 5-fold cross-validation.
It uses the spls, cv.spls, ci.spls, coef.spls and
correct.spls functions of the spls package.
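A rough sketch of that pipeline on the same kind of toy data (grids shortened; the wrapper's exact arguments may differ):

set.seed(314)
X <- matrix(rnorm(100 * 6), 100, 6)
Y <- drop(X %*% c(3, 1.5, 0, 0, 2, 0) + rnorm(100, sd = 3))
cv  <- spls::cv.spls(X, Y, K = 1:5, eta = (1:9)/10, fold = 5)  # pick K and eta by cross-validation
fit <- spls::spls(X, Y, K = cv$K.opt, eta = cv$eta.opt)
coef(fit)                                                      # raw SPLS coefficients
cis <- spls::ci.spls(fit)                                      # bootstrap confidence intervals
spls::correct.spls(cis)                                        # bootstrap-corrected coefficients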
varbvs_linear_all returns the matrix of the coefficients
for a linear model estimated by varbvs (variational approximation for Bayesian
variable selection in linear regression, family = gaussian), at each of the requested threshold values.
It uses the varbvs, coef and variable.names functions of the varbvs package.
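A minimal sketch of the varbvs step, assuming the X/Z/y interface of varbvs and its coef method; the per-threshold columns of the wrapper's output come from re-applying the requested inclusion thresholds, which is not reproduced here:

set.seed(314)
X <- matrix(rnorm(100 * 6), 100, 6)
Y <- drop(X %*% c(3, 1.5, 0, 0, 2, 0) + rnorm(100, sd = 3))
fit <- varbvs::varbvs(X, Z = NULL, y = Y, family = "gaussian", verbose = FALSE)
coef(fit)   # posterior estimates of the regression coefficients
# varbvs_binomial_all is the analogous wrapper with family = "binomial".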
lasso_cv_glmnet_bin_all returns the matrix of coefficients
for a logistic model estimated by the LASSO using the lambda.min and lambda.1se
values (the largest lambda within one standard error of the minimum) computed by
5-fold cross-validation. It uses the glmnet and cv.glmnet functions of the glmnet package.
lasso_glmnet_bin_all returns the matrix of coefficients
for a logistic model estimated by the LASSO using the AICc_glmnetB and BIC_glmnetB
information criteria. It uses the glmnet function of the glmnet package and the
AICc_glmnetB and BIC_glmnetB functions of the SelectBoost package that were
adapted from the AICc_glmnetB and BIC_glmnetB functions of the rLogistic
(https://github.com/echi/rLogistic) package.
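The idea can be sketched with glmnet alone by scoring the lasso path with a BIC-type criterion; this is only an illustration of the principle, not the AICc_glmnetB/BIC_glmnetB code, which uses its own corrected criteria:

set.seed(314)
X <- matrix(rnorm(100 * 6), 100, 6)
prob <- binomial()$linkinv(drop(X %*% c(3, 1.5, 0, 0, 2, 0)))
Ybin <- rbinom(100, 1, prob)
fit <- glmnet::glmnet(X, Ybin, family = "binomial")           # full lasso path
bic <- deviance(fit) + log(nrow(X)) * fit$df                  # BIC-like score along the path
as.matrix(coef(fit, s = fit$lambda[which.min(bic)]))[-1, 1]   # coefficients at the selected lambda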
splsda_spls_all returns the matrix of the raw (coef.splsda) coefficients
for a binary classification model estimated by SPLSDA (sparse partial least squares discriminant analysis) with
5-fold cross-validation. It uses the splsda, cv.splsda and coef.splsda functions
of the spls package.
sgpls_spls_all returns the matrix of the raw (coef.sgpls) coefficients
for a logistic regression model estimated by SGPLS (sparse generalized partial least squares) with
5-fold cross-validation. It uses the sgpls, cv.sgpls and coef.sgpls functions
of the spls package.
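A hedged sketch of the classification analogue, using the splsda/cv.splsda functions named above; the exact arguments and the optimal-parameter search in the wrappers may differ:

set.seed(314)
X <- matrix(rnorm(100 * 6), 100, 6)
Ybin <- ifelse(drop(X %*% c(3, 1.5, 0, 0, 2, 0) + rnorm(100, sd = 3)) >= 0, 1, 0)
cv  <- spls::cv.splsda(X, Ybin, K = 1:3, eta = (1:9)/10, fold = 5)  # assumed to return K.opt and eta.opt, as cv.spls does
fit <- spls::splsda(X, Ybin, K = cv$K.opt, eta = cv$eta.opt)
coef(fit)                                                           # raw coefficients, as extracted by the wrapper
# sgpls_spls_all proceeds the same way with cv.sgpls and sgpls.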
varbvs_binomial_all returns the matrix of the coefficients
for a logistic model estimated by varbvs (variational approximation for Bayesian
variable selection in logistic regression, family = binomial), at each of the requested threshold values.
It uses the varbvs, coef and variable.names functions of the varbvs package.
References
SelectBoost: a general algorithm to enhance the performance of variable selection methods in correlated datasets, Frédéric Bertrand, Ismaïl Aouadi, Nicolas Jung, Raphael Carapito, Laurent Vallat, Seiamak Bahram, Myriam Maumy-Bertrand, Bioinformatics, 2020. doi:10.1093/bioinformatics/btaa855
See also
glmnet, cv.glmnet, msgps, AICc_BIC_glmnetB, spls, cv.spls, correct.spls, splsda, cv.splsda, sgpls, cv.sgpls, varbvs
Other Variable selection functions:
var_select
Author
Frederic Bertrand, frederic.bertrand@lecnam.net
Examples
set.seed(314)
xran <- matrix(rnorm(100*6),100,6)
beta0 <- c(3,1.5,0,0,2,0)
epsilon <- rnorm(100,sd=3)
yran <- c(xran %*% beta0 + epsilon)
ybin <- ifelse(yran>=0,1,0)
set.seed(314)
lasso_msgps_all(xran,yran)
#>          Cp     AICc      GCV      BIC
#> V1 2.888389 2.888389 2.888389 2.822453
#> V2 1.240134 1.240134 1.240134 1.158685
#> V3 0.294536 0.294536 0.294536 0.197723
#> V4 0.000000 0.000000 0.000000 0.000000
#> V5 1.488073 1.488073 1.488073 1.396131
#> V6 0.000000 0.000000 0.000000 0.000000
set.seed(314)
enet_msgps_all(xran,yran)
#>          Cp     AICc      GCV      BIC
#> V1 2.941541 2.937504 2.941541 2.814043
#> V2 1.442444 1.440943 1.442444 1.393650
#> V3 0.516834 0.515345 0.516834 0.472523
#> V4 0.000000 0.000000 0.000000 0.000000
#> V5 1.629282 1.626506 1.629282 1.540810
#> V6 0.225360 0.224630 0.225360 0.200158
set.seed(314)
alasso_msgps_all(xran,yran)
#>          Cp     AICc      GCV      BIC
#> V1 3.007477 3.006131 3.007813 2.968790
#> V2 1.309948 1.305444 1.311450 1.172197
#> V3 0.278524 0.269215 0.281131 0.000000
#> V4 0.000000 0.000000 0.000000 0.000000
#> V5 1.602220 1.598403 1.603261 1.487032
#> V6 0.000000 0.000000 0.000000 0.000000
set.seed(314)
alasso_enet_msgps_all(xran,yran)
#>          Cp     AICc      GCV      BIC
#> V1 3.007477 3.006131 3.007813 2.968790
#> V2 1.309948 1.305444 1.311450 1.172197
#> V3 0.278524 0.269215 0.281131 0.000000
#> V4 0.000000 0.000000 0.000000 0.000000
#> V5 1.602220 1.598403 1.603261 1.487032
#> V6 0.000000 0.000000 0.000000 0.000000
set.seed(314)
lasso_cv_glmnet_all_5f(xran,yran)
#>      lambda.min lambda.1se
#> [1,]  3.0265634  2.5914804
#> [2,]  1.4632565  0.8834199
#> [3,]  0.5351174  0.0000000
#> [4,]  0.0000000  0.0000000
#> [5,]  1.6827459  1.0760789
#> [6,]  0.2299210  0.0000000
set.seed(314)
spls_spls_all(xran,yran)
#> eta = 0.1 
#> eta = 0.2 
#> eta = 0.3 
#> eta = 0.4 
#> eta = 0.5 
#> eta = 0.6 
#> eta = 0.7 
#> eta = 0.8 
#> eta = 0.9 
#> 
#> Optimal parameters: eta = 0.9, K = 5
#> 10 % completed...
#> 20 % completed...
#> 30 % completed...
#> 40 % completed...
#> 50 % completed...
#> 60 % completed...
#> 70 % completed...
#> 80 % completed...
#> 90 % completed...
#> 100 % completed...
#>      raw_coefs_K.opt_5_eta.opt_0.9
#> [1,]                     3.0415349
#> [2,]                     1.3270025
#> [3,]                     0.4983083
#> [4,]                     0.0000000
#> [5,]                     1.6470041
#> [6,]                     0.2256989
#>      bootstrap_corrected_coefs_K.opt_5_eta.opt_0.9
#> [1,]                                      3.041535
#> [2,]                                      1.327003
#> [3,]                                      0.000000
#> [4,]                                      0.000000
#> [5,]                                      1.647004
#> [6,]                                      0.000000
set.seed(314)
varbvs_linear_all(xran,yran)
#>    coef_varbvs_0.05 coef_varbvs_0.1 coef_varbvs_0.15 coef_varbvs_0.2
#> X1         3.016325        3.016325         3.016325        3.016325
#> X2         1.282165        1.282165         1.282165        1.282165
#> X3         0.000000        0.000000         0.000000        0.000000
#> X4         0.000000        0.000000         0.000000        0.000000
#> X5         1.645791        1.645791         1.645791        1.645791
#> X6         0.000000        0.000000         0.000000        0.000000
#>    coef_varbvs_0.25 coef_varbvs_0.3 coef_varbvs_0.35 coef_varbvs_0.4
#> X1         3.016325        3.016325         3.016325        3.016325
#> X2         1.282165        1.282165         1.282165        1.282165
#> X3         0.000000        0.000000         0.000000        0.000000
#> X4         0.000000        0.000000         0.000000        0.000000
#> X5         1.645791        1.645791         1.645791        1.645791
#> X6         0.000000        0.000000         0.000000        0.000000
#>    coef_varbvs_0.45 coef_varbvs_0.5 coef_varbvs_0.55 coef_varbvs_0.6
#> X1         3.016325        3.016325         3.016325        3.016325
#> X2         1.282165        1.282165         1.282165        1.282165
#> X3         0.000000        0.000000         0.000000        0.000000
#> X4         0.000000        0.000000         0.000000        0.000000
#> X5         1.645791        1.645791         1.645791        1.645791
#> X6         0.000000        0.000000         0.000000        0.000000
#>    coef_varbvs_0.65 coef_varbvs_0.7 coef_varbvs_0.75 coef_varbvs_0.8
#> X1         3.016325        3.016325         3.016325        3.016325
#> X2         1.282165        1.282165         1.282165        1.282165
#> X3         0.000000        0.000000         0.000000        0.000000
#> X4         0.000000        0.000000         0.000000        0.000000
#> X5         1.645791        1.645791         1.645791        1.645791
#> X6         0.000000        0.000000         0.000000        0.000000
#>    coef_varbvs_0.85 coef_varbvs_0.9 coef_varbvs_0.95
#> X1         3.016325        3.016325         3.016325
#> X2         1.282165        1.282165         0.000000
#> X3         0.000000        0.000000         0.000000
#> X4         0.000000        0.000000         0.000000
#> X5         1.645791        1.645791         1.645791
#> X6         0.000000        0.000000         0.000000
set.seed(314)
lasso_cv_glmnet_bin_all(xran,ybin)
#>      lambda.min lambda.1se
#> [1,]  1.0500913  0.9854475
#> [2,]  0.3353787  0.2834710
#> [3,]  0.1756045  0.1187688
#> [4,]  0.0000000  0.0000000
#> [5,]  0.5707957  0.5097384
#> [6,]  0.0000000  0.0000000
set.seed(314)
lasso_glmnet_bin_all(xran,ybin)
#>          AICc      BIC
#> [1,] 1.345775 1.345775
#> [2,] 0.000000 0.000000
#> [3,] 0.000000 0.000000
#> [4,] 0.000000 0.000000
#> [5,] 0.000000 0.000000
#> [6,] 0.000000 0.000000
set.seed(314)
# \donttest{
splsda_spls_all(xran,ybin, K.seq=1:3)
#> 
#> Optimal parameters: eta = 0.9, K = 3
#>    raw_coefs_K.opt_3_eta.opt_0.9
#> x1                     0.9729799
#> x2                     0.4145599
#> x3                     0.3216531
#> x4                     0.0000000
#> x5                     0.6557821
#> x6                     0.0000000
# }
set.seed(314)
# \donttest{
sgpls_spls_all(xran,ybin, K.seq=1:3)
#> 
#> Optimal parameters: eta = 0.4, K = 3
#>    raw_coefs_K.opt_3_eta.opt_0.4
#> x1                    0.52287257
#> x2                    0.23647477
#> x3                    0.16705245
#> x4                   -0.05893075
#> x5                    0.33188549
#> x6                    0.03978646
# }
set.seed(314)
varbvs_binomial_all(xran,ybin)
#>    coef_varbvs_0.05 coef_varbvs_0.1 coef_varbvs_0.15 coef_varbvs_0.2
#> X1       1.35696226       1.3569623        1.3569623       1.3569623
#> X2       0.02417951       0.0000000        0.0000000       0.0000000
#> X3       0.00000000       0.0000000        0.0000000       0.0000000
#> X4       0.00000000       0.0000000        0.0000000       0.0000000
#> X5       0.57262410       0.5726241        0.5726241       0.5726241
#> X6       0.00000000       0.0000000        0.0000000       0.0000000
#>    coef_varbvs_0.25 coef_varbvs_0.3 coef_varbvs_0.35 coef_varbvs_0.4
#> X1        1.3569623       1.3569623        1.3569623       1.3569623
#> X2        0.0000000       0.0000000        0.0000000       0.0000000
#> X3        0.0000000       0.0000000        0.0000000       0.0000000
#> X4        0.0000000       0.0000000        0.0000000       0.0000000
#> X5        0.5726241       0.5726241        0.5726241       0.5726241
#> X6        0.0000000       0.0000000        0.0000000       0.0000000
#>    coef_varbvs_0.45 coef_varbvs_0.5 coef_varbvs_0.55 coef_varbvs_0.6
#> X1        1.3569623       1.3569623        1.3569623       1.3569623
#> X2        0.0000000       0.0000000        0.0000000       0.0000000
#> X3        0.0000000       0.0000000        0.0000000       0.0000000
#> X4        0.0000000       0.0000000        0.0000000       0.0000000
#> X5        0.5726241       0.5726241        0.5726241       0.5726241
#> X6        0.0000000       0.0000000        0.0000000       0.0000000
#>    coef_varbvs_0.65 coef_varbvs_0.7 coef_varbvs_0.75 coef_varbvs_0.8
#> X1        1.3569623       1.3569623        1.3569623        1.356962
#> X2        0.0000000       0.0000000        0.0000000        0.000000
#> X3        0.0000000       0.0000000        0.0000000        0.000000
#> X4        0.0000000       0.0000000        0.0000000        0.000000
#> X5        0.5726241       0.5726241        0.5726241        0.000000
#> X6        0.0000000       0.0000000        0.0000000        0.000000
#>    coef_varbvs_0.85 coef_varbvs_0.9 coef_varbvs_0.95
#> X1         1.356962        1.356962         1.356962
#> X2         0.000000        0.000000         0.000000
#> X3         0.000000        0.000000         0.000000
#> X4         0.000000        0.000000         0.000000
#> X5         0.000000        0.000000         0.000000
#> X6         0.000000        0.000000         0.000000
