selectboost_quantile() adapts the core SelectBoost workflow to sparse
quantile regression.
Usage
selectboost_quantile(
  x,
  y = NULL,
  tau = 0.5,
  B = 50,
  c0_seq = NULL,
  step_num = 0.1,
  group = group_neighbors,
  max_group_size = NULL,
  screen = c("auto", "none", "quantile_rank"),
  screen_size = NULL,
  lambda = NULL,
  tune_lambda = c("none", "cv", "bic"),
  lambda_rule = c("min", "one_se"),
  lambda_factors = NULL,
  lambda_inflation = 1,
  nlambda = 20,
  lambda_min_ratio = 0.05,
  folds = 5,
  repeats = 1,
  subsamples = 1,
  sample_fraction = 0.5,
  complementary_pairs = FALSE,
  selector = quantile_lasso_selector,
  standardize = TRUE,
  eps = 1e-06,
  seed = NULL,
  data = NULL,
  subset = NULL,
  na.action = stats::na.fail,
  verbose = interactive(),
  ...
)
Arguments
- x
  Numeric design matrix or a formula.
- y
  Numeric response vector when x is a matrix.
- tau
  Quantile level in (0, 1). Can be a vector.
- B
  Number of perturbation replicates for each c0 threshold.
- c0_seq
  Optional decreasing sequence of correlation thresholds. When NULL, it is computed from empirical correlation quantiles using step_num.
- step_num
  Step size used to build the default c0 path.
- group
  Grouping rule used to convert the absolute correlation matrix and threshold c0 into a list of neighborhoods, one per variable. Can be a function or the name of one. Functions must accept (abs_corr, c0).
- max_group_size
  Optional cap on the size of each correlation neighborhood. When supplied, only the strongest absolute correlations are retained within each variable's group.
- screen
  Screening rule applied before the SelectBoost loop. "auto" enables tau-aware rank screening when p > n, "none" disables screening, and "quantile_rank" always uses the built-in rank-score screen. Functions must accept (x, y, tau, screen_size).
- screen_size
  Optional number of predictors retained after screening.
- lambda
  Optional lasso penalty supplied to quantreg::rq.fit.lasso(). A scalar applies a common slope penalty, while a full penalty vector can also be supplied. When tau has length greater than one, lambda can also be a list with one entry per tau.
- tune_lambda
  One of "none", "cv", or "bic". When not "none", the package tunes a penalty profile once on the original design and reuses it for all perturbations.
- lambda_rule
  Selection rule used after tuning. "min" takes the best tuning score, while "one_se" applies the one-standard-error rule when tune_lambda = "cv".
- lambda_factors
  Optional positive multipliers applied to the default quantile-lasso penalty profile during tuning.
- lambda_inflation
  Optional multiplier applied after tuning to favor a stronger selection penalty.
- nlambda
  Number of tuning candidates when lambda_factors is NULL.
- lambda_min_ratio
  Smallest tuning multiplier used to generate the default tuning grid.
- folds
  Number of cross-validation folds when tune_lambda = "cv".
- repeats
  Number of repeated fold assignments when tune_lambda = "cv".
- subsamples
  Number of subsample draws used for stability selection. Values greater than one aggregate selection frequencies across subsamples.
- sample_fraction
  Fraction of observations drawn in each subsample when subsamples > 1.
- complementary_pairs
  Should subsamples be generated as complementary pairs?
- selector
  Function used to fit the sparse quantile model. It must accept (x, y, tau, lambda, ...) and return a named coefficient vector including an intercept.
- standardize
  Should the selector be fitted on the SelectBoost-normalized design? When TRUE, columns are centered and scaled to unit Euclidean norm before fitting, matching the original package. When FALSE, perturbations are still generated in the normalized space but mapped back to the original scale before model fitting.
- eps
  Numerical tolerance used to turn coefficients into selections.
- seed
  Optional random seed for reproducible perturbations and tuning.
- data
  Optional data frame used when x is a formula.
- subset
  Optional subset expression used with the formula interface.
- na.action
  Missing-data handler used with the formula interface.
- verbose
  Should the routine report progress?
- ...
  Additional arguments forwarded to selector.
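The group argument accepts any function of (abs_corr, c0) that returns one neighborhood per variable. The following is a minimal base-R sketch of such a rule, not the package's own group_neighbors. The name group_threshold is hypothetical, and two details are assumptions made for illustration: neighborhoods are returned as integer column indices, and each variable is included in its own neighborhood.

```r
# Hypothetical grouping rule (not part of the package): the neighborhood of
# variable j holds every column whose absolute correlation with j is >= c0.
# Assumption: neighborhoods are integer indices and include j itself.
group_threshold <- function(abs_corr, c0) {
  lapply(seq_len(ncol(abs_corr)), function(j) which(abs_corr[, j] >= c0))
}

set.seed(1)
x <- matrix(rnorm(200), nrow = 50, ncol = 4)
x[, 2] <- x[, 1] + 0.1 * rnorm(50)  # make columns 1 and 2 strongly correlated
abs_corr <- abs(cor(x))

# One neighborhood per column of x, evaluated at the threshold c0 = 0.9.
groups <- group_threshold(abs_corr, c0 = 0.9)
```

Under the documented contract, a function like this could presumably be passed as group = group_threshold. Lowering c0 widens the neighborhoods, which is why a decreasing c0 path perturbs progressively larger groups of correlated predictors.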
Value
An object of class "selectboost_quantile" with components:
frequencies, baseline, baseline_standardized, c0_seq, tau, B,
lambda, lambda_tuning, call, and preprocessing metadata.
Details
The workflow has five stages:
- build a centered, unit-norm design as in SelectBoost::boost.normalize(),
- compute correlation neighborhoods along a c0 path,
- fit a directional distribution to each variable's sign-aligned neighborhood in the sample hyperplane,
- draw perturbed predictors from those fitted directional models,
- refit penalized quantile regression and aggregate selection frequencies.
This version keeps the public API stable while separating the internals into explicit preprocessing, grouping, directional perturbation, and tuning stages.
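To make the preprocessing concrete, the following base-R sketch illustrates the normalization and a possible default c0 path. It does not call the package. The normalization follows the description of standardize (centered columns with unit Euclidean norm). The quantile grid used for the c0 path is an assumption: the documentation only states that the path comes from empirical correlation quantiles spaced by step_num.

```r
set.seed(1)
x <- matrix(rnorm(60 * 5), nrow = 60, ncol = 5)

# Center each column, then scale it to unit Euclidean norm.
x_centered <- sweep(x, 2, colMeans(x))
x_norm <- sweep(x_centered, 2, sqrt(colSums(x_centered^2)), "/")

# For centered, unit-norm columns the cross-product equals the
# Pearson correlation matrix, so no separate call to cor() is needed.
abs_corr <- abs(crossprod(x_norm))

# Assumed grid: a decreasing c0 path taken from quantiles of the
# off-diagonal absolute correlations at probabilities 1, 1 - step_num, ..., 0.
step_num <- 0.25
probs <- rev(seq(0, 1, by = step_num))
c0_path <- quantile(abs_corr[upper.tri(abs_corr)], probs, names = FALSE)
```

With this construction a smaller step_num yields a longer c0 path, which is consistent with the formula example using step_num = 0.5 to keep the run short.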
Examples
sim <- simulate_quantile_data(n = 80, p = 12, active = 1:3, seed = 1)
fit <- selectboost_quantile(sim$x, sim$y, tau = 0.5, B = 8, seed = 1)
print(fit)
#> SelectBoost-style quantile regression sketch
#> tau: 0.5
#> perturbation replicates: 8
#> c0 thresholds: 13
#> predictors: 12
#> grouping: group_neighbors
#> screening: none
#> lambda: vector[13], range [0.0000, 0.0000]
#> top mean selection frequencies:
#> x1 x3 x2 x7 x9 x12
#> 0.856 0.702 0.683 0.615 0.615 0.615
summary(fit, threshold = 0.6)
#> Tau: 0.5
#> Stable support threshold: 0.6
#> Selection metric: hybrid
#> Variables above the threshold:
#> [1] "x1"
#> Top summary scores:
#> x1 x3 x2 x12 x9 x10 x7 x5 x4 x6
#> 0.817 0.365 0.245 0.173 0.142 0.112 0.071 0.061 0.000 0.000
dat <- data.frame(y = sim$y, sim$x)
fit_formula <- selectboost_quantile(
  y ~ .,
  data = dat,
  tau = 0.5,
  B = 4,
  step_num = 0.5,
  seed = 1
)