Fit L1-norm SVM
svm.1norm.Rd
SVM with variable selection (clone selection) using the L1-norm penalty (a fast Newton algorithm, NLPSVM, from Fung and Mangasarian).
Arguments
- A
n-by-d data matrix to train (n chips/patients, d clones/genes).
- d
vector of class labels, -1 or 1 (one for each of the n chips/patients).
- k
k-fold for cross-validation (cv); default: k = 5.
- nu
weighting parameter: 1 for easy estimation, 0 for hard estimation; any other value is used directly as nu by the algorithm. Default: 0.
- output
0 = no output, 1 = produce output; default: 0.
- delta
a small tolerance value; default: \(10^{-3}\).
- epsi
tuning parameter epsilon.
- seed
random seed.
- maxIter
maximal number of iterations; default: 700.
Details
k, the k-fold parameter for cv, controls how the data set is divided into test and training sets:

- if k = 0: simply run the algorithm without any correctness calculation; this is the default.
- if k = 1: run the algorithm and calculate correctness on the whole data set.
- if k is any other value less than the number of rows in the data set: divide the data set into test and training sets using the k-fold method.
- if k = number of rows in the data set: use the 'leave-one-out' (loo) method.
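The fold assignment described above can be sketched in base R. This is an illustration only, not the package's internal code; the variable names (`fold`, `test.idx`, `train.idx`) are made up for the example.

```r
# Sketch of a k-fold split of n rows into test/training sets.
set.seed(1)
n <- 20   # number of rows (chips/patients)
k <- 5    # number of folds
# randomly assign each row to one of k folds, as evenly as possible
fold <- sample(rep(seq_len(k), length.out = n))
for (i in seq_len(k)) {
  test.idx  <- which(fold == i)   # rows held out in fold i
  train.idx <- which(fold != i)   # rows used for training in fold i
}
# with k = n each test set has exactly one row: leave-one-out (loo)
```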
Value
a list with components:
- w
coefficients of the hyperplane
- b
intercept of the hyperplane
- xind
the indices of the selected features (genes) in the data matrix.
- epsi
the optimal tuning parameter epsilon.
- iter
number of iterations
- k
k-fold for cv
- trainCorr
for cv: average training correctness.
- testCorr
for cv: average test correctness
- nu
weighting parameter nu.
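The returned hyperplane (`w`, `b`) defines a linear decision rule for classifying new rows of A. The sketch below uses toy values in place of `model$w` and `model$b`, and the sign convention `sign(A %*% w + b)` is an assumption for illustration; consult the package's own prediction routine for the exact rule.

```r
# Illustration only: applying a fitted hyperplane (w, b) to new data.
w <- c(0.8, -0.5)                  # toy coefficients (stand-in for model$w)
b <- 0.1                           # toy intercept    (stand-in for model$b)
A.new <- rbind(c(1, 0),            # two new rows, d = 2 features
               c(-1, 1))
# assumed decision rule: sign of the affine score for each row
pred <- sign(as.vector(A.new %*% w) + b)
```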
References
Fung, G. and Mangasarian, O. L. (2004). A feature selection Newton method for support vector machine classification. Computational Optimization and Applications, 28(2), 185-202.
Examples
set.seed(123)
train <- sim.data(n = 20, ng = 100, nsg = 10, corr = FALSE, seed = 12)
str(train)
#> List of 3
#>  $ x : num [1:100, 1:20] -0.64379 -0.00486 -0.08606 -0.2183 2.45035 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:100] "pos1" "pos2" "pos3" "pos4" ...
#>   .. ..$ : chr [1:20] "1" "2" "3" "4" ...
#>  $ y : Named num [1:20] -1 1 -1 -1 -1 -1 -1 1 1 1 ...
#>   ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
#>  $ seed: num 12
# train data
model <- lpsvm(A = t(train$x), d = train$y, k = 5, nu = 0, output = 0, delta = 10^-3, epsi = 0.001, seed = 12)
print(model)
#>
#> Bias = 178.3869
#> Selected Variables= pos3 neg2 bal10 bal11 bal12 bal21 bal22 bal24 bal30 bal32 bal35 bal46 bal48 bal50 bal51 bal56 bal66 bal71 bal74 bal77 bal81
#> Coefficients:
#> pos3 neg2 bal10 bal11 bal12 bal21 bal22
#> 899.92693 -231.19038 241.69383 101.85423 -250.31708 308.99564 82.03101
#> bal24 bal30 bal32 bal35 bal46 bal48 bal50
#> -571.25835 112.53478 16.61886 678.41629 302.98743 -337.51707 35.23294
#> bal51 bal56 bal66 bal71 bal74 bal77 bal81
#> 643.03151 26.97797 -3.53579 326.13807 22.94525 296.81442 285.79581
#>