SelectBoost for GAMLSS (stability selection)

Usage

sb_gamlss(
  formula,
  data,
  family,
  mu_scope,
  sigma_scope = NULL,
  nu_scope = NULL,
  tau_scope = NULL,
  base_sigma = ~1,
  base_nu = ~1,
  base_tau = ~1,
  B = 100,
  sample_fraction = 0.7,
  pi_thr = 0.6,
  k = 2,
  direction = c("both", "forward", "backward"),
  pre_standardize = FALSE,
  use_groups = FALSE,
  c0 = 0.5,
  engine = c("stepGAIC", "glmnet", "grpreg", "sgl"),
  engine_sigma = NULL,
  engine_nu = NULL,
  engine_tau = NULL,
  grpreg_penalty = c("grLasso", "grMCP", "grSCAD"),
  sgl_alpha = 0.95,
  df_smooth = 6L,
  progress = TRUE,
  glmnet_alpha = 1,
  glmnet_family = c("gaussian", "binomial", "poisson"),
  parallel = c("none", "auto", "multisession", "multicore"),
  workers = NULL,
  trace = TRUE,
  corr_func = "cor",
  group_fun = SelectBoost::group_func_2,
  ...
)

Arguments

formula: Base formula for the location parameter \(\mu\), as used in the main model call.
data: Data frame.
family: A gamlss.dist family object (e.g., gamlss.dist::NO()).
mu_scope: Formula of candidate terms for \(\mu\).
sigma_scope, nu_scope, tau_scope: Formulas of candidate terms for \(\sigma\), \(\nu\), \(\tau\).
base_sigma, base_nu, base_tau: Optional base (always-included) formulas for \(\sigma\), \(\nu\), \(\tau\).
B: Number of bootstrap subsamples for stability selection.
sample_fraction: Fraction of rows per subsample (e.g., 0.7).
pi_thr: Selection proportion threshold to define “stable” terms (e.g., 0.6).
k: Penalty weight for stepwise GAIC when engine = "stepGAIC" (default 2).
direction: Stepwise direction for stepGAIC ("both", "forward", "backward").
pre_standardize: Logical; standardize numeric predictors before penalized fits.
use_groups: Logical; use SelectBoost correlation groups during resampling.
c0: SelectBoost meta-parameter controlling reweighting/thresholding (see vignette).
engine: Engine for \(\mu\) ("stepGAIC", "glmnet", "grpreg", "sgl").
engine_sigma, engine_nu, engine_tau: Optional engines for \(\sigma\), \(\nu\), \(\tau\).
grpreg_penalty: Group penalty for grpreg ("grLasso", "grMCP", "grSCAD").
sgl_alpha: Alpha for sparse group lasso.
df_smooth: Degrees of freedom for the proxy spline bases that stand in for pb()/cs() terms (splines::bs(df = df_smooth)); used only to build the design for grouped selection.
progress: Logical; show a progress bar in sequential runs.
glmnet_alpha: Elastic-net mixing for glmnet (1 = lasso, 0 = ridge).
glmnet_family: Family passed to glmnet-based selectors ("gaussian", "binomial", "poisson").
parallel: Parallel mode ("none", "auto", "multisession", "multicore").
workers: Integer; number of workers for parallel runs.
trace: Logical; print progress messages.
corr_func: Correlation function passed to SelectBoost::boost.compcorrs.
group_fun: Grouping function passed to SelectBoost::boost.findgroups.
...: Passed to underlying engines (e.g., to gamlss::gamlss, glmnet, etc.).
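
The roles of B, sample_fraction, and pi_thr can be illustrated with a minimal base-R sketch. It is not the sb_gamlss() implementation: stats::lm() and stats::step() stand in for the GAMLSS engines, the toy data and the names toy, picked, and stable are hypothetical, and drawing subsamples without replacement is an assumption.

```r
# Sketch only: a base-R stand-in for the resampling loop, not sb_gamlss() itself.
set.seed(1)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 + rnorm(n)   # only x1 carries signal

B <- 50                  # number of subsamples
sample_fraction <- 0.7   # fraction of rows per subsample
pi_thr <- 0.6            # selection proportion needed to call a term stable
candidates <- c("x1", "x2", "x3")

# One row per subsample, one column per candidate term
picked <- matrix(FALSE, nrow = B, ncol = length(candidates),
                 dimnames = list(NULL, candidates))

for (b in seq_len(B)) {
  # Assumption: rows are drawn without replacement
  rows <- sample(n, size = floor(sample_fraction * n))
  sub <- toy[rows, ]
  null_fit <- lm(y ~ 1, data = sub)
  # Stand-in selector: stepwise search with penalty k = 2 over the candidate scope
  sel <- step(null_fit,
              scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
              direction = "both", k = 2, trace = 0)
  picked[b, ] <- candidates %in% attr(terms(sel), "term.labels")
}

# Selection counts and proportions, then the threshold rule
selection <- data.frame(
  term = candidates,
  count = colSums(picked),
  prop = colMeans(picked)
)
stable <- selection$term[selection$prop >= pi_thr]
print(selection)
print(stable)
```

With signal on x1 only, x1 should reach a selection proportion near 1, while the noise terms should stay well below pi_thr.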

Value

An object of class "sb_gamlss" with elements:

final_fit: the final gamlss object.
final_formula: list of formulas for mu/sigma/nu/tau.
selection: data.frame of selection counts and proportions.
B, sample_fraction, pi_thr, k: the settings used for the run.
scaler: list with center, scale, vars, response.

Examples

set.seed(1)
dat <- data.frame(
  y = gamlss.dist::rNO(60, mu = 0),
  x1 = rnorm(60),
  x2 = rnorm(60),
  x3 = rnorm(60)
)
fit <- sb_gamlss(
  y ~ 1,
  data = dat,
  family = gamlss.dist::NO(),
  mu_scope = ~ x1 + x2 + gamlss::pb(x3),
  B = 8,
  pi_thr = 0.6,
  trace = FALSE
)
fit$final_formula
#> $mu
#> y ~ 1
#> <environment: 0x1281ba0b0>
#> 
#> $sigma
#> ~1
#> <environment: 0x1283126d8>
#> 
#> $nu
#> ~1
#> <environment: 0x1283126d8>
#> 
#> $tau
#> ~1
#> <environment: 0x1283126d8>
#>
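
A further sketch shows candidate terms supplied for \(\sigma\) as well as \(\mu\), followed by a look at the selection table. It uses only arguments documented above. The simulated data and the names dat2 and fit2 are hypothetical, and the guard assumes sb_gamlss() is already available in the session. No output is shown because it depends on the subsamples drawn.

```r
# Sketch only: assumes the package providing sb_gamlss() is attached.
set.seed(2)
n <- 60
dat2 <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Mean depends on x1, spread depends on x2
dat2$y <- rnorm(n, mean = dat2$x1, sd = exp(0.5 * dat2$x2))

if (exists("sb_gamlss") && requireNamespace("gamlss.dist", quietly = TRUE)) {
  fit2 <- sb_gamlss(
    y ~ 1,
    data = dat2,
    family = gamlss.dist::NO(),
    mu_scope = ~ x1 + x2,
    sigma_scope = ~ x1 + x2,   # candidate terms for sigma
    B = 8,
    pi_thr = 0.6,
    trace = FALSE
  )
  str(fit2$selection)          # selection counts and proportions
  print(fit2$final_formula)    # formulas built from the stable terms
}
```

Terms whose selection proportion falls below pi_thr are left out of final_formula, which is why the formulas in the first example stay at their base values.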