Fit Survival Models with Stochastic Gradient Descent
Source: R/bigSurvSGD.na.omit.R
bigSurvSGD.na.omit.Rd
Performs stochastic gradient descent optimisation for large-scale survival models after removing observations with missing values.
Usage
bigSurvSGD.na.omit(
formula = survival::Surv(time = time, status = status) ~ .,
data,
norm.method = "standardize",
features.mean = NULL,
features.sd = NULL,
opt.method = "AMSGrad",
beta.init = NULL,
beta.type = "averaged",
lr.const = 0.12,
lr.tau = 0.5,
strata.size = 20,
batch.size = 1,
num.epoch = 100,
b1 = 0.9,
b2 = 0.99,
eps = 1e-08,
inference.method = "plugin",
num.boot = 1000,
num.epoch.boot = 100,
boot.method = "SGD",
lr.const.boot = 0.12,
lr.tau.boot = 0.5,
num.sample.strata = 1000,
sig.level = 0.05,
beta0 = 0,
alpha = NULL,
lambda = NULL,
nlambda = 100,
num.strata.lambda = 10,
lambda.scale = 1,
parallel.flag = FALSE,
num.cores = NULL,
bigmemory.flag = FALSE,
num.rows.chunk = 1e+06,
col.names = NULL,
type = "float"
)
Arguments
- formula
Model formula describing the survival outcome and the set of predictors to include in the optimisation.
- data
Input data set or connection to a big-memory backed design matrix that contains the variables referenced in formula.
- norm.method
Normalization strategy applied to the feature matrix before optimisation, for example centring or standardising columns.
- features.mean
Optional pre-computed column means used when normalising the features so that repeated fits can reuse shared statistics.
- features.sd
Optional pre-computed column standard deviations used in concert with features.mean for scaling the predictors.
- opt.method
Gradient-based optimisation routine to employ, such as vanilla SGD or adaptive methods like AMSGrad or Adam.
- beta.init
Vector of starting values for the regression coefficients supplied when warm-starting the optimisation.
- beta.type
Indicator controlling how beta.init is interpreted, for example whether the coefficients correspond to the original or normalised scale.
- lr.const
Base learning-rate constant used by the stochastic gradient descent routine.
- lr.tau
Learning-rate decay horizon or damping factor that moderates the step size schedule.
- strata.size
Number of observations drawn per stratum when building mini-batches for the optimisation loop.
- batch.size
Total number of observations assembled into each stochastic gradient batch.
- num.epoch
Number of passes over the training data used during the optimisation.
- b1
First exponential moving-average rate used by adaptive methods such as Adam to smooth gradients.
- b2
Second exponential moving-average rate used by adaptive methods to smooth squared gradients.
- eps
Numerical stabilisation constant added to denominators when updating the adaptive moments.
- inference.method
Inference approach requested after fitting, for example naive asymptotics or bootstrap resampling.
- num.boot
Number of bootstrap replicates to draw when inference.method relies on resampling.
- num.epoch.boot
Number of optimisation epochs to run within each bootstrap replicate.
- boot.method
Type of bootstrap scheme to apply, such as ordinary or stratified resampling.
- lr.const.boot
Learning-rate constant used during bootstrap refits.
- lr.tau.boot
Learning-rate decay factor applied during bootstrap refits.
- num.sample.strata
Number of strata sampled without replacement during each bootstrap iteration when stratified resampling is selected.
- sig.level
Significance level used when constructing confidence intervals or hypothesis tests.
- beta0
Optional vector of coefficients under the null hypothesis when performing hypothesis tests.
- alpha
Elastic-net mixing parameter controlling the relative weight of \(\ell_1\) and \(\ell_2\) regularisation penalties.
- lambda
Sequence of regularisation strengths supplied explicitly for penalised estimation.
- nlambda
Number of automatically generated lambda values when a grid is produced internally.
- num.strata.lambda
Number of strata used when tuning lambda via cross-validation or other search procedures.
- lambda.scale
Scale on which the lambda grid is generated, for example logarithmic or linear spacing.
- parallel.flag
Logical flag enabling parallel computation of gradients or bootstrap replicates.
- num.cores
Number of processing cores to use when parallel execution is enabled.
- bigmemory.flag
Logical flag indicating whether intermediate matrices should be stored using bigmemory backed objects.
- num.rows.chunk
Row chunk size to use when streaming data from an on-disk matrix representation.
- col.names
Optional character vector of column names associated with the feature matrix.
- type
Storage type for the big-memory backed data, for example "float" or "double"; the default "float" reduces memory use at some cost in precision.
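The normalisation arguments work together: features.mean and features.sd let repeated fits, or the scoring of new data, reuse the statistics computed on a first data set instead of recomputing them. A minimal base-R sketch of the "standardize" step these arguments describe (illustrative only; the matrices here are hypothetical):

```r
# Hypothetical training matrix: 5 observations, 4 features
set.seed(1)
X_train <- matrix(rnorm(20), nrow = 5, ncol = 4)

# Statistics that play the role of features.mean and features.sd
features.mean <- colMeans(X_train)
features.sd <- apply(X_train, 2, sd)

# First fit: standardise the training features with their own statistics
X_scaled <- scale(X_train, center = features.mean, scale = features.sd)

# Later fit or new data: reuse the stored statistics so both data sets
# are expressed on the same scale
X_new <- matrix(rnorm(8), nrow = 2, ncol = 4)
X_new_scaled <- scale(X_new, center = features.mean, scale = features.sd)
```

After standardisation, each training column has mean 0 and standard deviation 1, while the new data is mapped onto that same scale rather than its own.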
Value
A fitted model object storing the learned coefficients, optimisation metadata, and any requested inference summaries:
- coef
Log-hazard ratios. If no inference is used, a vector of estimated coefficients; if inference is used, a matrix including estimates and confidence intervals. In case of penalisation, a matrix with columns corresponding to the lambda values.
- coef.exp
Exponentiated version of coef (hazard ratios).
- lambda
The lambda value(s) used for penalisation.
- alpha
The alpha value used for penalisation.
- features.mean
Means of the features, if given or calculated.
- features.sd
Standard deviations of the features, if given or calculated.
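The coef and coef.exp components differ only by exponentiation: a hazard ratio is recovered from a log-hazard coefficient as exp(coef). A short base-R illustration with hypothetical coefficient values:

```r
# Hypothetical log-hazard coefficients, as stored in coef
log_hr <- c(sexe = 0.25, Agediag = -0.10)

# coef.exp is the elementwise exponential: the hazard ratios.
# A positive coefficient gives a hazard ratio above 1 (increased hazard);
# a negative coefficient gives a ratio below 1 (decreased hazard).
hr <- exp(log_hr)
```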
See also
bigSurvSGD,
bigscale for constructing normalised design matrices, and
partialbigSurvSGDv0 for partial fitting pipelines.
Examples
# \donttest{
data(micro.censure, package = "bigPLScox")
surv_data <- stats::na.omit(micro.censure[, c("survyear", "DC", "sexe", "Agediag")])
# Increase num.epoch and num.boot for real use
fit <- bigSurvSGD.na.omit(
survival::Surv(survyear, DC) ~ .,
data = surv_data,
norm.method = "standardize",
opt.method = "adam",
batch.size = 16,
num.epoch = 2
)
#> Warning: Strata size times batch size is greater than number of observations.
#> This package resizes them to strata size = 20 and batch size = 4
# }
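A sketch of requesting resampling-based inference through the same interface, assuming the inference.method, num.boot, and num.epoch.boot arguments documented above behave as described (not run; the replicate counts here are far too small for real use and surv_data is the data set prepared in the example above):

```r
# \donttest{
# Hypothetical bootstrap-inference fit; increase num.boot and
# num.epoch.boot substantially for real use.
fit_boot <- bigSurvSGD.na.omit(
survival::Surv(survyear, DC) ~ .,
data = surv_data,
opt.method = "AMSGrad",
inference.method = "bootstrap",
num.boot = 10,
num.epoch.boot = 2
)
# With inference, coef is a matrix including estimates and
# confidence-interval bounds for each predictor
fit_boot$coef
# }
```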