SelectBoost for GAMLSS (stability selection)
Usage
sb_gamlss(
formula,
data,
family,
mu_scope,
sigma_scope = NULL,
nu_scope = NULL,
tau_scope = NULL,
base_sigma = ~1,
base_nu = ~1,
base_tau = ~1,
B = 100,
sample_fraction = 0.7,
pi_thr = 0.6,
k = 2,
direction = c("both", "forward", "backward"),
pre_standardize = FALSE,
use_groups = FALSE,
c0 = 0.5,
engine = c("stepGAIC", "glmnet", "grpreg", "sgl"),
engine_sigma = NULL,
engine_nu = NULL,
engine_tau = NULL,
grpreg_penalty = c("grLasso", "grMCP", "grSCAD"),
sgl_alpha = 0.95,
df_smooth = 6L,
progress = TRUE,
glmnet_alpha = 1,
glmnet_family = c("gaussian", "binomial", "poisson"),
parallel = c("none", "auto", "multisession", "multicore"),
workers = NULL,
trace = TRUE,
corr_func = "cor",
group_fun = SelectBoost::group_func_2,
...
)
Arguments
- formula
Base formula for the location \(\mu\) parameter (in the main model call).
- data
Data frame.
- family
A gamlss.dist family object (e.g., gamlss.dist::NO()).
- mu_scope
Formula of candidate terms for \(\mu\).
- sigma_scope, nu_scope, tau_scope
Formulas of candidate terms for \(\sigma\), \(\nu\), \(\tau\).
- base_sigma, base_nu, base_tau
Optional base (always-included) formulas for \(\sigma\), \(\nu\), \(\tau\).
- B
Number of bootstrap subsamples for stability selection.
- sample_fraction
Fraction of rows per subsample (e.g., 0.7).
- pi_thr
Selection proportion threshold to define “stable” terms (e.g., 0.6).
- k
Penalty weight for stepwise GAIC when engine = "stepGAIC" (default 2).
- direction
Stepwise direction for stepGAIC ("both", "forward", "backward").
- pre_standardize
Logical; standardize numeric predictors before penalized fits.
- use_groups
Logical; use SelectBoost correlation groups during resampling.
- c0
SelectBoost meta-parameter controlling reweighting/thresholding (see vignette).
- engine
Engine for \(\mu\) ("stepGAIC", "glmnet", "grpreg", "sgl").
- engine_sigma, engine_nu, engine_tau
Optional engines for \(\sigma\), \(\nu\), \(\tau\).
- grpreg_penalty
Group penalty for grpreg ("grLasso", "grMCP", "grSCAD").
- sgl_alpha
Alpha for sparse group lasso.
- df_smooth
Degrees of freedom for proxy spline bases (pb()/cs() → splines::bs(df = df_smooth)), used only for the grouped selection design.
- progress
Logical; show a progress bar in sequential runs.
- glmnet_alpha
Elastic-net mixing for glmnet (1 = lasso, 0 = ridge).
- glmnet_family
Family passed to glmnet-based selectors ("gaussian", "binomial", "poisson").
- parallel
Parallel mode ("none", "auto", "multisession", "multicore").
- workers
Integer; number of workers if parallel.
- trace
Logical; print progress messages.
- corr_func
Correlation function passed to SelectBoost::boost.compcorrs.
- group_fun
Grouping function passed to SelectBoost::boost.findgroups.
- ...
Passed to underlying engines (e.g., gamlss::gamlss or glmnet).
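The df_smooth argument describes replacing smooth terms such as pb() or cs() with a splines::bs() basis for the grouped selection design. The following is a minimal sketch of what such a proxy basis looks like; it is not the package's internal code, and the names proxy_basis and group_id are hypothetical, introduced only for illustration.

```r
# Sketch only: a B-spline proxy basis of the kind described for df_smooth.
# proxy_basis and group_id are illustrative names, not package objects.
library(splines)

set.seed(1)
x3 <- rnorm(60)
df_smooth <- 6L

# bs() with df = df_smooth and the default cubic degree returns one
# basis column per degree of freedom (here a 60 x 6 matrix).
proxy_basis <- bs(x3, df = df_smooth)
dim(proxy_basis)

# Assumption: all basis columns of one smooth term share a group label,
# so a grouped penalty can keep or drop the term as a whole.
group_id <- rep("x3", ncol(proxy_basis))
table(group_id)
```

A larger df_smooth gives a more flexible proxy for the smooth term at the cost of more columns per group.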
Value
An object of class "sb_gamlss" with elements:
- final_fit: the final gamlss object.
- final_formula: list of formulas for mu/sigma/nu/tau.
- selection: data.frame of selection counts and proportions.
- B, sample_fraction, pi_thr, k.
- scaler: list with center, scale, vars, response.
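To make the relationship between B, sample_fraction, pi_thr and a table of selection counts and proportions concrete, here is a rough base-R sketch of stability selection. It uses a stand-in selector (|t| > 2 in a linear model) rather than any of the sb_gamlss() engines. Subsampling without replacement, the ">=" comparison against pi_thr, and the column names term, count, prop and stable are all assumptions made for illustration, not the layout of the returned selection element.

```r
# Sketch only: stability selection with a stand-in selector.
# This is not how sb_gamlss() is implemented.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 2 * dat$x1 + rnorm(n)  # only x1 carries signal

B <- 50                 # number of subsamples
sample_fraction <- 0.7  # fraction of rows per subsample
pi_thr <- 0.6           # selection proportion threshold
terms <- c("x1", "x2", "x3")

counts <- setNames(integer(length(terms)), terms)
for (b in seq_len(B)) {
  # Assumption: rows are drawn without replacement.
  idx <- sample(n, size = floor(sample_fraction * n))
  fit_b <- lm(y ~ x1 + x2 + x3, data = dat[idx, ])
  tvals <- summary(fit_b)$coefficients[terms, "t value"]
  picked <- abs(tvals) > 2  # stand-in selection rule
  counts[picked] <- counts[picked] + 1L
}

# Hypothetical column names; the real selection element may differ.
selection <- data.frame(
  term  = terms,
  count = as.integer(counts),
  prop  = as.numeric(counts) / B
)
selection$stable <- selection$prop >= pi_thr
selection
```

In this simulated setting x1 has a strong effect, so it is picked in every subsample and flagged as stable, while the noise terms are picked only occasionally. Raising pi_thr makes the "stable" label stricter; raising B makes the proportions less noisy.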
Examples
set.seed(1)
dat <- data.frame(
y = gamlss.dist::rNO(60, mu = 0),
x1 = rnorm(60),
x2 = rnorm(60),
x3 = rnorm(60)
)
fit <- sb_gamlss(
y ~ 1,
data = dat,
family = gamlss.dist::NO(),
mu_scope = ~ x1 + x2 + gamlss::pb(x3),
B = 8,
pi_thr = 0.6,
trace = FALSE
)
fit$final_formula
#> $mu
#> y ~ 1
#> <environment: 0x1281ba0b0>
#>
#> $sigma
#> ~1
#> <environment: 0x1283126d8>
#>
#> $nu
#> ~1
#> <environment: 0x1283126d8>
#>
#> $tau
#> ~1
#> <environment: 0x1283126d8>
#>