Variable selection functions

Compute coefficient vector after variable selection.

lasso_cv_glmnet_bin_min(X, Y)

lasso_cv_glmnet_bin_1se(X, Y)

lasso_glmnet_bin_AICc(X, Y)

lasso_glmnet_bin_BIC(X, Y)

lasso_cv_lars_min(X, Y)

lasso_cv_lars_1se(X, Y)

lasso_cv_glmnet_min(X, Y)

lasso_cv_glmnet_min_weighted(X, Y, priors)

lasso_cv_glmnet_1se(X, Y)

lasso_cv_glmnet_1se_weighted(X, Y, priors)

lasso_msgps_Cp(X, Y, penalty = "enet")

lasso_msgps_AICc(X, Y, penalty = "enet")

lasso_msgps_GCV(X, Y, penalty = "enet")

lasso_msgps_BIC(X, Y, penalty = "enet")

enetf_msgps_Cp(X, Y, penalty = "enet", alpha = 0.5)

enetf_msgps_AICc(X, Y, penalty = "enet", alpha = 0.5)

enetf_msgps_GCV(X, Y, penalty = "enet", alpha = 0.5)

enetf_msgps_BIC(X, Y, penalty = "enet", alpha = 0.5)

lasso_cascade(M, Y, K, eps = 10^-5, cv.fun)

Arguments

X

A numeric matrix. The predictors matrix.

Y

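A binary factor (the 0/1 classification response) for the functions whose name contains bin. A numeric vector (the response) for the linear model functions.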

priors

A numeric vector. Weighting vector for the variable selection, with one weight per predictor. When used with the glmnet estimation function, the weights have the following meanings:

0: the variable is always included in the model
1: neutral weight
Inf: variable is always excluded from the model
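As an illustration of these three weight values, the sketch below passes a weight vector directly to glmnet through its penalty.factor argument. It assumes that this is the mechanism the weighted functions rely on, which this page does not state. The simulated data and the names x, y, w, fit and beta are hypothetical.

```r
# Sketch only: simulated data, and the assumption that `priors` plays the
# role of glmnet's `penalty.factor` argument.
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(200), 40, 5)
  y <- rnorm(40)
  # Variable 1 is never penalised, variable 2 is excluded,
  # variables 3 to 5 keep the neutral weight.
  w <- c(0, Inf, 1, 1, 1)
  fit <- glmnet::glmnet(x, y, penalty.factor = w)
  beta <- as.matrix(fit$beta)     # one column per value of lambda
  print(all(beta[2, ] == 0))      # the excluded variable never enters
  print(beta[1, 1] != 0)          # the unpenalised variable is in from the start
}
```

A very large finite weight, such as the 1000 used in the Examples, is not an exclusion but makes entry of that variable very unlikely.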

penalty

A character value to select the penalty term in msgps (Model Selection Criteria via Generalized Path Seeking). Defaults to "enet". "genet" is the generalized elastic net and "alasso" is the adaptive lasso, which is a weighted version of the lasso.

alpha

A numeric value. Sets the value of \(\alpha\) for the "enet" and "genet" penalties in msgps (Model Selection Criteria via Generalized Path Seeking).

M

A numeric matrix. The transposed predictors matrix.

K

A numeric value. Number of folds to use.

eps

A numeric value. Threshold below which the inferred value of a parameter is set to 0.
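The role of such a threshold can be sketched in base R as follows. This is an illustration only: the coefficient values are made up, and whether lasso_cascade compares absolute values with a strict inequality is an assumption.

```r
# Hypothetical coefficient vector with two numerically negligible entries.
beta <- c(0.8, 3e-7, -0.25, -1e-9, 0)
eps <- 10^-5
# Assumed rule: entries whose absolute value is below eps are set to 0.
beta[abs(beta) < eps] <- 0
beta
```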

cv.fun

A function. Function used to create folds. Used to perform subject-wise cross-validation.
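A fold function that keeps all observations of a subject in the same fold could look like the sketch below. It assumes that cv.fun is called like lars::cv.folds, that is, with the number of observations and the number of folds, and that it returns a list of index vectors. The helper make_subject_folds and the subject labelling are hypothetical and not part of the package.

```r
# Hypothetical helper: builds a fold function for a given subject labelling.
make_subject_folds <- function(subject) {
  function(n, folds) {
    stopifnot(length(subject) == n)
    ids <- unique(subject)
    # Assign each subject, not each observation, to a fold at random.
    fold_of_id <- sample(rep(seq_len(folds), length.out = length(ids)))
    # Return one vector of observation indices per fold.
    unname(split(seq_len(n), fold_of_id[match(subject, ids)]))
  }
}

set.seed(314)
subject <- rep(1:10, each = 3)   # 10 subjects, 3 observations each
cv_subjectwise <- make_subject_folds(subject)
folds <- cv_subjectwise(30, 5)
lengths(folds)
```

Under the stated assumption, cv_subjectwise could be supplied as cv.fun in place of lars::cv.folds.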

Value

A vector of coefficients.

Details

lasso_cv_glmnet_bin_min returns the vector of coefficients for a binary logistic model estimated by the lasso using the lambda.min value computed by 10-fold cross-validation. It uses the glmnet function of the glmnet package.

lasso_cv_glmnet_bin_1se returns the vector of coefficients for a binary logistic model estimated by the lasso using the lambda.1se value (the largest lambda whose cross-validated error is within one standard error of the minimum) computed by 10-fold cross-validation. It uses the glmnet function of the glmnet package.
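The difference between lambda.min and lambda.1se can be sketched in base R with the rule that glmnet documents for its cross-validation output. The lambda grid and the error values below are made up for illustration, and the variable names are hypothetical.

```r
# Hypothetical cross-validation summary over a decreasing lambda grid.
lambda <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cvm    <- c(1.40, 1.10, 0.95, 0.90, 0.93)  # mean CV error per lambda
cvsd   <- c(0.10, 0.09, 0.08, 0.08, 0.09)  # its standard error

i_min <- which.min(cvm)
lambda_min <- lambda[i_min]
# Largest lambda whose mean error stays within one standard error
# of the minimum mean error.
lambda_1se <- max(lambda[cvm <= cvm[i_min] + cvsd[i_min]])
c(lambda_min = lambda_min, lambda_1se = lambda_1se)
```

Here lambda_1se is larger than lambda_min, so the 1se variants apply more shrinkage and tend to return sparser coefficient vectors.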

lasso_glmnet_bin_AICc returns the vector of coefficients for a binary logistic model estimated by the lasso and selected according to the bias-corrected AIC (AICc) criterion. It uses the glmnet function of the glmnet package.

lasso_glmnet_bin_BIC returns the vector of coefficients for a binary logistic model estimated by the lasso and selected according to the BIC criterion. It uses the glmnet function of the glmnet package.
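Selection along a lasso path with an information criterion can be sketched as follows, using the textbook definitions of AIC, AICc and BIC in terms of the deviance. The deviance and model size values are made up, and the exact formulas used inside the package (for instance, whether the intercept counts as a parameter) are not given on this page, so this is an assumption.

```r
n   <- 30                               # number of observations
dev <- c(41.5, 36.2, 33.5, 31.6, 31.5)  # hypothetical deviance along the path
k   <- c(0, 1, 2, 3, 4)                 # hypothetical number of non-zero coefficients

aic  <- dev + 2 * k
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample correction of AIC
bic  <- dev + log(n) * k
# Index of the path step retained by each criterion.
c(AICc = which.min(aicc), BIC = which.min(bic))
```

With these values the BIC retains a smaller model than the AICc, because its penalty per parameter, log(n), exceeds 2 as soon as n is 8 or more.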

lasso_cv_lars_min returns the vector of coefficients for a linear model estimated by the lasso using the lambda.min value computed by 5-fold cross-validation. It uses the lars function of the lars package.

lasso_cv_lars_1se returns the vector of coefficients for a linear model estimated by the lasso using the lambda.1se (lambda.min+1se) value computed by 5-fold cross-validation. It uses the lars function of the lars package.

lasso_cv_glmnet_min returns the vector of coefficients for a linear model estimated by the lasso using the lambda.min value computed by 10-fold cross-validation. It uses the glmnet function of the glmnet package.

lasso_cv_glmnet_min_weighted returns the vector of coefficients for a linear model estimated by the weighted lasso using the lambda.min value computed by 10-fold cross-validation. It uses the glmnet function of the glmnet package.

lasso_cv_glmnet_1se returns the vector of coefficients for a linear model estimated by the lasso using the lambda.1se value (the largest lambda whose cross-validated error is within one standard error of the minimum) computed by 10-fold cross-validation. It uses the glmnet function of the glmnet package.

lasso_cv_glmnet_1se_weighted returns the vector of coefficients for a linear model estimated by the weighted lasso using the lambda.1se value (the largest lambda whose cross-validated error is within one standard error of the minimum) computed by 10-fold cross-validation. It uses the glmnet function of the glmnet package.

lasso_msgps_Cp returns the vector of coefficients for a linear model estimated by the lasso selected using Mallows' Cp. It uses the msgps function of the msgps package.

lasso_msgps_AICc returns the vector of coefficients for a linear model estimated by the lasso selected according to the bias-corrected AIC (AICc) criterion. It uses the msgps function of the msgps package.

lasso_msgps_GCV returns the vector of coefficients for a linear model estimated by the lasso selected according to the generalized cross-validation criterion. It uses the msgps function of the msgps package.

lasso_msgps_BIC returns the vector of coefficients for a linear model estimated by the lasso selected according to the BIC criterion. It uses the msgps function of the msgps package.

enetf_msgps_Cp returns the vector of coefficients for a linear model estimated by the elastic net selected using Mallows' Cp. It uses the msgps function of the msgps package.

enetf_msgps_AICc returns the vector of coefficients for a linear model estimated by the elastic net selected according to the bias-corrected AIC (AICc) criterion. It uses the msgps function of the msgps package.

enetf_msgps_GCV returns the vector of coefficients for a linear model estimated by the elastic net selected according to the generalized cross-validation criterion. It uses the msgps function of the msgps package.

enetf_msgps_BIC returns the vector of coefficients for a linear model estimated by the elastic net selected according to the BIC criterion. It uses the msgps function of the msgps package.

lasso_cascade returns the vector of coefficients for a linear model estimated by the lasso. It uses the lars function of the lars package.

References

selectBoost: a general algorithm to enhance the performance of variable selection methods in correlated datasets, Frédéric Bertrand, Ismaïl Aouadi, Nicolas Jung, Raphael Carapito, Laurent Vallat, Seiamak Bahram, Myriam Maumy-Bertrand, Bioinformatics, 2020. doi:10.1093/bioinformatics/btaa855

Author

Frederic Bertrand, frederic.bertrand@utt.fr

Examples

set.seed(314)
xran=matrix(rnorm(150),30,5)
ybin=sample(0:1,30,replace=TRUE)
yran=rnorm(30)
set.seed(314)
lasso_cv_glmnet_bin_min(xran,ybin)
#> [1] -0.2541407  0.0000000  0.1191938  0.0000000  0.0000000  0.4885167

set.seed(314)
lasso_cv_glmnet_bin_1se(xran,ybin)
#> [1] -0.23703995  0.00000000  0.06427661  0.00000000  0.00000000  0.42415945

set.seed(314)
lasso_glmnet_bin_AICc(xran,ybin)
#> [1] 0.000000 0.000000 0.000000 0.000000 1.138298

set.seed(314)
lasso_glmnet_bin_BIC(xran,ybin)
#> [1] 0.000000 0.000000 0.000000 0.000000 1.138298

set.seed(314)
lasso_cv_lars_min(xran,yran)
#> [1] 0 0 0 0 0

set.seed(314)
lasso_cv_lars_1se(xran,yran)
#> [1] 0 0 0 0 0

set.seed(314)
lasso_cv_glmnet_min(xran,yran)
#> [1] 0 0 0 0 0

set.seed(314)
lasso_cv_glmnet_min_weighted(xran,yran,c(1000,0,0,1,1))
#> [1]  0.00000000 -0.05593766 -0.12916707  0.00000000  0.00000000

set.seed(314)
lasso_cv_glmnet_1se(xran,yran)
#> [1] 0 0 0 0 0

set.seed(314)
lasso_cv_glmnet_1se_weighted(xran,yran,c(1000,0,0,1,1))
#> [1]  0.00000000 -0.05593766 -0.12916707  0.00000000  0.00000000

set.seed(314)
lasso_msgps_Cp(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
lasso_msgps_AICc(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
lasso_msgps_GCV(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
lasso_msgps_BIC(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
enetf_msgps_Cp(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
enetf_msgps_AICc(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
enetf_msgps_GCV(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
enetf_msgps_BIC(xran,yran)
#> V1 V2 V3 V4 V5 
#>  0  0  0  0  0 

set.seed(314)
lasso_cascade(t(xran),yran,5,cv.fun=lars::cv.folds)
#> [1] 0 0 0 0 0