SelectBoost for GAMLSS (stability selection)

Usage

sb_gamlss(
  formula,
  data,
  family,
  mu_scope,
  sigma_scope = NULL,
  nu_scope = NULL,
  tau_scope = NULL,
  base_sigma = ~1,
  base_nu = ~1,
  base_tau = ~1,
  B = 100,
  sample_fraction = 0.7,
  pi_thr = 0.6,
  k = 2,
  direction = c("both", "forward", "backward"),
  pre_standardize = FALSE,
  use_groups = FALSE,
  c0 = 0.5,
  engine = c("stepGAIC", "glmnet", "grpreg", "sgl"),
  engine_sigma = NULL,
  engine_nu = NULL,
  engine_tau = NULL,
  grpreg_penalty = c("grLasso", "grMCP", "grSCAD"),
  sgl_alpha = 0.95,
  df_smooth = 6L,
  progress = TRUE,
  glmnet_alpha = 1,
  glmnet_family = c("gaussian", "binomial", "poisson"),
  parallel = c("none", "auto", "multisession", "multicore"),
  workers = NULL,
  trace = TRUE,
  corr_func = "cor",
  group_fun = SelectBoost::group_func_2,
  ...
)

Arguments

formula

Base formula for the location \(\mu\) parameter (in the main model call).

data

Data frame.

family

A gamlss.dist family object (e.g., gamlss.dist::NO()).

mu_scope

Formula of candidate terms for \(\mu\).

sigma_scope, nu_scope, tau_scope

Formulas of candidate terms for \(\sigma\), \(\nu\), \(\tau\).

base_sigma, base_nu, base_tau

Optional base (always-included) formulas for \(\sigma\), \(\nu\), \(\tau\).

B

Number of bootstrap subsamples for stability selection.

sample_fraction

Fraction of rows per subsample (e.g., 0.7).

pi_thr

Selection proportion threshold to define “stable” terms (e.g., 0.6).
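The interplay of B, sample_fraction, and pi_thr can be illustrated with a minimal base-R sketch. It is not the package's internal code: stats::step() on an lm fit stands in for the selection engine, and the toy data, the helper select_once(), and all variable names are hypothetical.

```r
# Illustrative sketch only: stats::step() on lm stands in for the real engine.
set.seed(1)
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 2 * toy$x1 + rnorm(n)   # only x1 carries signal

B <- 20                  # number of subsamples
sample_fraction <- 0.7   # fraction of rows drawn per subsample
pi_thr <- 0.6            # selection proportion needed to call a term stable
candidates <- c("x1", "x2", "x3")

# Hypothetical helper: draw one subsample, run one forward search,
# and report which candidate terms were selected.
select_once <- function(dat) {
  idx <- sample(nrow(dat), size = floor(sample_fraction * nrow(dat)))
  sub <- dat[idx, ]
  sel <- step(lm(y ~ 1, data = sub), scope = ~ x1 + x2 + x3,
              direction = "forward", k = 2, trace = 0)
  candidates %in% attr(terms(sel), "term.labels")
}

hits <- replicate(B, select_once(toy))         # logical matrix, 3 rows by B columns
prop <- setNames(rowMeans(hits), candidates)   # selection proportion per term
stable <- names(prop)[prop >= pi_thr]          # terms meeting the threshold
```

Raising pi_thr makes the stable set more conservative; raising B makes the estimated proportions less noisy at the cost of more fits.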

k

Penalty weight for stepwise GAIC when engine = "stepGAIC" (default 2).

direction

Stepwise direction for stepGAIC ("both", "forward", "backward").
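For context on k: the generalized Akaike information criterion is minus twice the log-likelihood plus k times the degrees of freedom, so k = 2 corresponds to AIC and k = log(n) to BIC. The base-R sketch below checks this on an ordinary lm fit. The helper gaic() is hypothetical and only mirrors the definition; it is not a function from gamlss or from this package.

```r
# Hypothetical helper mirroring the GAIC definition: -2 * logLik + k * df.
gaic <- function(fit, k = 2) {
  ll <- logLik(fit)
  -2 * as.numeric(ll) + k * attr(ll, "df")
}

set.seed(1)
d <- data.frame(x = rnorm(50))
d$y <- 1 + 0.5 * d$x + rnorm(50)
toy_fit <- lm(y ~ x, data = d)

gaic_aic <- gaic(toy_fit, k = 2)             # matches AIC(toy_fit)
gaic_bic <- gaic(toy_fit, k = log(nrow(d)))  # matches BIC(toy_fit)
```

A larger k penalizes each additional term more heavily, so stepwise searches tend to return smaller models.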

pre_standardize

Logical; standardize numeric predictors before penalized fits.

use_groups

Logical; use SelectBoost correlation groups during resampling.

c0

SelectBoost meta-parameter controlling reweighting/thresholding (see vignette).

engine

Engine for \(\mu\) ("stepGAIC", "glmnet", "grpreg", "sgl").

engine_sigma, engine_nu, engine_tau

Optional engines for \(\sigma\), \(\nu\), \(\tau\).

grpreg_penalty

Group penalty for grpreg ("grLasso", "grMCP", "grSCAD").

sgl_alpha

Alpha for sparse group lasso.

df_smooth

Degrees of freedom for proxy spline bases (pb()/cs() → splines::bs(df = df_smooth)), used only to build the grouped selection design.

progress

Logical; show a progress bar in sequential runs.

glmnet_alpha

Elastic-net mixing for glmnet (1 = lasso, 0 = ridge).
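As a reminder of what the mixing parameter does, the sketch below evaluates the elastic-net penalty as documented for glmnet, \(\lambda\,[(1-\alpha)/2\,\|\beta\|_2^2 + \alpha\,\|\beta\|_1]\). The helper enet_penalty() and the coefficient vector are hypothetical and serve only as illustration; how this package forwards glmnet_alpha internally is not shown here.

```r
# Hypothetical helper: elastic-net penalty for a coefficient vector.
enet_penalty <- function(beta, lambda, alpha = 1) {
  lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
}

beta <- c(1.5, 0, -0.5)
pen_lasso <- enet_penalty(beta, lambda = 0.1, alpha = 1)  # 0.1 * sum(abs(beta))
pen_ridge <- enet_penalty(beta, lambda = 0.1, alpha = 0)  # 0.1 * sum(beta^2) / 2
```

Values of alpha strictly between 0 and 1 blend the two penalties, keeping some sparsity while shrinking correlated predictors together.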

glmnet_family

Family passed to glmnet-based selectors ("gaussian", "binomial", "poisson").

parallel

Parallel mode ("none", "auto", "multisession", "multicore").

workers

Integer; number of workers to use when running in parallel.

trace

Logical; print progress messages.

corr_func

Correlation function passed to SelectBoost::boost.compcorrs.

group_fun

Grouping function passed to SelectBoost::boost.findgroups.

...

Passed to underlying engines (e.g., to gamlss::gamlss, glmnet, etc.).

Value

An object of class "sb_gamlss" with elements:

  • final_fit: the final gamlss object.

  • final_formula: list of formulas for mu/sigma/nu/tau.

  • selection: data.frame of selection counts and proportions.

  • B, sample_fraction, pi_thr, k: the corresponding input settings.

  • scaler: list with center, scale, vars, response.

Examples

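set.seed(1)
# Simulate a response that does not depend on any of the predictors
dat <- data.frame(
  y = gamlss.dist::rNO(60, mu = 0),
  x1 = rnorm(60),
  x2 = rnorm(60),
  x3 = rnorm(60)
)
# Stability selection for mu over x1, x2, and a smooth term in x3
fit <- sb_gamlss(
  y ~ 1,
  data = dat,
  family = gamlss.dist::NO(),
  mu_scope = ~ x1 + x2 + gamlss::pb(x3),
  B = 8,
  pi_thr = 0.6,
  trace = FALSE
)
# Formulas retained for each distribution parameter
fit$final_formula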
#> $mu
#> y ~ 1
#> <environment: 0x1281ba0b0>
#> 
#> $sigma
#> ~1
#> <environment: 0x1283126d8>
#> 
#> $nu
#> ~1
#> <environment: 0x1283126d8>
#> 
#> $tau
#> ~1
#> <environment: 0x1283126d8>
#>