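Repeats variable selection on interval-valued responses. In each of B replicates, a pseudo-response is drawn from each observation's interval [Y_low, Y_high] (uniformly at random, or as the midpoint) and passed to the selector; variable selection frequencies are then tallied across the B replicates.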
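The sampling step described above can be sketched in base R as follows. This is only an illustration of the idea, not the package's internal code; the helper name draw_pseudo_response is hypothetical.

```r
# Hypothetical helper (not part of the package): draw one pseudo-response
# per observation from its interval [Y_low, Y_high].
draw_pseudo_response <- function(Y_low, Y_high,
                                 sample = c("uniform", "midpoint")) {
  sample <- match.arg(sample)
  switch(sample,
    # one independent uniform draw inside each interval
    uniform  = runif(length(Y_low), min = Y_low, max = Y_high),
    # deterministic centre of each interval
    midpoint = (Y_low + Y_high) / 2
  )
}

set.seed(1)
Y_low  <- c(0.10, 0.40, 0.70)
Y_high <- c(0.20, 0.50, 0.80)
y_unif <- draw_pseudo_response(Y_low, Y_high, "uniform")
y_mid  <- draw_pseudo_response(Y_low, Y_high, "midpoint")
```

With "midpoint" the pseudo-response is the same in every replicate, whereas "uniform" produces a fresh draw each time.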

Usage

fastboost_interval(
  X,
  Y_low,
  Y_high,
  func,
  B = 100,
  sample = c("uniform", "midpoint"),
  version = "glmnet",
  use.parallel = FALSE,
  seed = NULL,
  ...
)

Arguments

X

Numeric matrix (n × p).

Y_low, Y_high

Interval bounds in [0,1]. Rows with missing bounds are dropped.

func

A function with signature function(X, y, ...) that returns a named coefficient vector, as in the other selectors (coefficients of non-selected variables are 0).

B

Number of interval resamples.

sample

"uniform" (default) or "midpoint" for drawing pseudo-responses.

version

Ignored (reserved for future use).

use.parallel

Use parallel::mclapply if available.

seed

Optional RNG seed. Scoped via withr::with_seed() so the caller's RNG state is restored afterwards.

...

Additional arguments forwarded to func.
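To make the contract for func concrete, here is a minimal sketch of a custom selector written in base R. The name toy_selector, its threshold argument, and the inclusion of an "(Intercept)" entry in the returned vector are assumptions made for this illustration; they are not taken from the package.

```r
# Hypothetical selector (not part of the package): keep predictors whose
# absolute marginal correlation with y exceeds a threshold, refit them by
# least squares, and return 0 for every predictor that was not kept.
toy_selector <- function(X, y, threshold = 0.3, ...) {
  if (is.null(colnames(X))) colnames(X) <- paste0("x", seq_len(ncol(X)))
  keep  <- abs(drop(cor(X, y))) > threshold
  coefs <- setNames(numeric(ncol(X) + 1L), c("(Intercept)", colnames(X)))
  if (any(keep)) {
    fit <- lm.fit(cbind(1, X[, keep, drop = FALSE]), y)
    coefs[c(TRUE, keep)] <- fit$coefficients
  } else {
    coefs["(Intercept)"] <- mean(y)
  }
  coefs
}

set.seed(1)
X <- matrix(rnorm(200 * 4), 200, 4, dimnames = list(NULL, paste0("x", 1:4)))
y <- plogis(X[, 1] + rnorm(200, sd = 0.3))
out <- toy_selector(X, y)
print(out)
```

A function of this shape could presumably be passed as func, with threshold supplied through the ... argument, provided its return value matches what the other selectors return.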

Value

A list with:

betas

B × (p+1) matrix of coefficients over replicates.

freq

Named vector of selection frequencies for each predictor.
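The relation between betas and freq can be illustrated with a toy matrix. The sketch below assumes that freq is the proportion of replicates in which a predictor's coefficient is nonzero and that the first column of betas holds the intercept; both are assumptions, since the page does not spell out the computation.

```r
# Toy coefficient matrix: 4 replicates, intercept plus 3 predictors
# (made-up numbers, for illustration only).
betas <- rbind(
  c(0.2, 1.1, -0.4, 0.0),
  c(0.1, 0.9,  0.0, 0.0),
  c(0.3, 1.3, -0.6, 0.0),
  c(0.2, 1.0, -0.5, 0.2)
)
colnames(betas) <- c("(Intercept)", "x1", "x2", "x3")

# Share of replicates with a nonzero coefficient, intercept column dropped
freq <- colMeans(betas[, -1, drop = FALSE] != 0)
freq
# x1 = 1, x2 = 0.75, x3 = 0.25
```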

Examples

# Simulate interval-valued responses (Y_low, Y_high) around a beta-distributed Y
set.seed(1)
n <- 120
p <- 6
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("x", 1:p)
mu <- plogis(X[, 1] - 0.5 * X[, 2])  # only x1 and x2 drive the mean
Y <- rbeta(n, mu * 25, (1 - mu) * 25)
Y_low <- pmax(0, Y - 0.05)
Y_high <- pmin(1, Y + 0.05)
# Run the selector on B = 40 pseudo-response draws
fb <- fastboost_interval(
  X, Y_low, Y_high,
  func = function(X, y) betareg_glmnet(X, y, choose = "bic", prestandardize = TRUE),
  B = 40
)
sort(fb$freq, decreasing = TRUE)
#> x1 x2 x3 x4 x5 x6 
#>  1  1  0  0  0  0
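
The seed argument is documented as being scoped so that the caller's RNG state is restored afterwards. What that means in practice can be sketched in base R. The helper with_local_seed below is hypothetical and is not how withr::with_seed() is implemented; it only mimics the behaviour.

```r
# Hypothetical helper (not part of withr or the package): evaluate expr under
# a fixed seed, then put the global RNG state back as it was.
with_local_seed <- function(seed, expr) {
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old <- if (had_seed) get(".Random.seed", envir = globalenv()) else NULL
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old, envir = globalenv())
    } else if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      rm(".Random.seed", envir = globalenv())
    }
  }, add = TRUE)
  set.seed(seed)
  expr  # lazily evaluated here, i.e. after set.seed()
}

set.seed(42)
a <- runif(1)                          # first draw of the caller's stream

set.seed(42)
inner1 <- with_local_seed(1, runif(3)) # seeded work in between
b <- runif(1)                          # caller's stream is undisturbed

inner2 <- with_local_seed(1, runif(3)) # same seed, same draws
```

Here a and b are identical, and so are inner1 and inner2: the seeded block is reproducible without disturbing random numbers drawn elsewhere in the session.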