An Equivalence between the Lasso and Support Vector Machines
We investigate the relation between two fundamental tools in machine learning and signal processing: the support vector machine (SVM) for classification, and the Lasso technique used in regression. We show that the resulting optimization problems are equivalent, in the following sense. Given any instance of an L2-loss soft-margin (or hard-margin) SVM, we construct a Lasso instance having the same optimal solutions, and vice versa. As a consequence, many existing optimization algorithms for both SVMs and the Lasso can also be applied to the respective other problem instances. The equivalence also allows many known theoretical insights for the SVM and the Lasso to be translated between the two settings. One such implication gives a simple kernelized version of the Lasso, analogous to the kernels used in the SVM setting. Another consequence is that the sparsity of a Lasso solution is equal to the number of support vectors for the corresponding SVM instance, and that one can use screening rules to prune the set of support vectors. Furthermore, we can relate sublinear-time algorithms for the two problems, and give a new such algorithm variant for the Lasso. We also study the regularization paths for both methods.
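The paper's construction is more general than the abstract can convey; as a minimal numerical sketch of the underlying geometry (synthetic `A`, `b`, and the standard constrained Lasso form are assumptions of this illustration, not taken from the paper), the Lasso over the l1 ball can be rewritten over the probability simplex with doubled columns [A, -A]. That simplex problem asks for the closest point to b in the convex hull of the columns, i.e. the polytope-distance problem behind the hard-margin SVM. Solving both forms with Frank-Wolfe gives matching optimal values:

```python
import numpy as np

def frank_wolfe_ls(M, y, lmo, x0, iters=20000):
    """Frank-Wolfe with exact line search for min ||M x - y||^2 over the
    feasible set encoded by the linear minimization oracle `lmo`."""
    x = x0.astype(float)
    for _ in range(iters):
        g = 2.0 * M.T @ (M @ x - y)      # gradient of the quadratic
        d = lmo(g) - x                   # move toward the oracle's vertex
        Md = M @ d
        denom = 2.0 * (Md @ Md)
        gamma = 1.0 if denom == 0.0 else min(1.0, max(0.0, -(g @ d) / denom))
        x = x + gamma * d
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 8))
b = rng.normal(size=30)

# (1) Constrained Lasso: min ||Ax - b||^2  s.t.  ||x||_1 <= 1.
#     The l1-ball oracle returns a signed coordinate vertex.
def lmo_l1(g):
    s = np.zeros_like(g)
    i = np.argmax(np.abs(g))
    s[i] = -np.sign(g[i])
    return s

x = frank_wolfe_ls(A, b, lmo_l1, np.zeros(8))

# (2) Same problem over the probability simplex with doubled columns
#     A2 = [A, -A]: min ||A2 t - b||^2  s.t.  t >= 0, sum(t) = 1.
#     This is the distance from b to the convex hull of the columns of A2,
#     the polytope-distance view that links the Lasso to the hard-margin SVM.
A2 = np.hstack([A, -A])

def lmo_simplex(g):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

t = frank_wolfe_ls(A2, b, lmo_simplex, np.ones(16) / 16)

# Mapping back via x = t[:8] - t[8:] recovers a feasible Lasso point,
# and the two optimal objective values coincide (up to solver accuracy).
f_lasso = np.sum((A @ x - b) ** 2)
f_simplex = np.sum((A2 @ t - b) ** 2)
print(f_lasso, f_simplex)
```

Splitting x into positive and negative parts shows both feasible sets describe the same set of predictions, which is why the two optima agree; the paper's reduction then maps these polytope problems onto SVM instances.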
A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing
The past years have witnessed many dedicated open-source projects that build
and maintain implementations of Support Vector Machines (SVM), parallelized for
GPUs, multi-core CPUs, and distributed systems. Up to this point, no comparable
effort has been made to parallelize the Elastic Net, despite its popularity in
many high-impact applications, including genetics, neuroscience, and systems
biology. The first contribution of this paper is theoretical. We
establish a tight link between two seemingly different algorithms and prove
that Elastic Net regression can be reduced to SVM classification with squared
hinge loss. Our second contribution is to derive a practical algorithm
based on this reduction. The reduction enables us to leverage prior efforts in
speeding up and parallelizing SVMs to obtain a highly optimized, parallel
solver for the Elastic Net and Lasso. With a simple wrapper, consisting of only
11 lines of MATLAB code, we obtain an Elastic Net implementation that naturally
utilizes GPUs and multi-core CPUs. We demonstrate on twelve real-world data
sets that our algorithm yields results identical to those of the popular (and
highly optimized) glmnet implementation while being one to several orders of
magnitude faster.
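The paper's reduction to a squared-hinge SVM is not reproduced here; a related, well-known stepping stone (the augmented-data trick of Zou and Hastie) already reduces the Elastic Net to a plain Lasso, which then connects to SVMs via the Lasso-SVM equivalence. A minimal sketch on synthetic data (the data, penalties, and the simple ISTA solvers below are all illustrative assumptions) verifies the reduction numerically:

```python
import numpy as np

def ista_lasso(A, b, lam, iters=5000):
    """Proximal gradient (ISTA) for min ||Ax - b||^2 + lam * ||x||_1."""
    L = 2.0 * np.linalg.norm(A, 2) ** 2          # gradient Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - 2.0 * A.T @ (A @ x - b) / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft-threshold
    return x

def ista_enet(A, b, lam1, lam2, iters=5000):
    """Direct ISTA for the Elastic Net, with the ridge term in the gradient."""
    L = 2.0 * (np.linalg.norm(A, 2) ** 2 + lam2)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - (2.0 * A.T @ (A @ x - b) + 2.0 * lam2 * x) / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam1 / L, 0.0)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 10))
b = rng.normal(size=40)
lam1, lam2 = 0.5, 0.3                            # l1 and l2 penalty weights

# Elastic Net  min ||Ax - b||^2 + lam2 ||x||^2 + lam1 ||x||_1  is a plain
# Lasso on augmented data: stack sqrt(lam2) * I under A and zeros under b,
# since ||A_aug x - b_aug||^2 = ||Ax - b||^2 + lam2 ||x||^2.
A_aug = np.vstack([A, np.sqrt(lam2) * np.eye(10)])
b_aug = np.concatenate([b, np.zeros(10)])

x_aug = ista_lasso(A_aug, b_aug, lam1)           # Elastic Net via the Lasso
x_direct = ista_enet(A, b, lam1, lam2)           # Elastic Net solved directly
print(np.max(np.abs(x_aug - x_direct)))
```

Because the ridge term makes the objective strongly convex, the minimizer is unique, so the two routes must recover the same coefficients; the paper's wrapper composes this kind of rewriting with an SVM solver instead of ISTA.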
Modified Frank-Wolfe Algorithm for Enhanced Sparsity in Support Vector Machine Classifiers
This work proposes a new algorithm for training a re-weighted L2 Support
Vector Machine (SVM), inspired by the re-weighted Lasso algorithm of Candès
et al. and by the equivalence between the Lasso and the SVM recently shown by
Jaggi. In particular, the margin required for each training vector is set
independently, defining a new weighted SVM model. These weights are selected to
be binary, and they are automatically adapted during the training of the model,
resulting in a variation of the Frank-Wolfe optimization algorithm with
essentially the same computational complexity as the original algorithm. As
shown experimentally, this algorithm is computationally cheaper to apply, since
it requires fewer iterations to converge, and it produces models that have a
sparser representation in terms of support vectors and are more stable with
respect to the selection of the regularization hyper-parameter.
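The binary re-weighting scheme itself is specified only in the paper; as background on why Frank-Wolfe is a natural fit for sparse SVM models at all, a minimal sketch (synthetic data and the classic 2/(k+2) step size are assumptions of this illustration) shows that plain Frank-Wolfe over the simplex touches at most one new vertex per iteration, so k steps yield at most k + 1 nonzero coordinates, i.e. a controlled number of support vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(25, 60))    # 60 candidate atoms / training vectors
b = rng.normal(size=25)

# Vanilla Frank-Wolfe over the probability simplex for min ||A t - b||^2.
# Each iteration shrinks the current iterate and adds weight to a single
# vertex e_i, so early stopping directly bounds the size of the support.
t = np.zeros(60)
t[0] = 1.0                       # start at a vertex
for k in range(20):
    g = 2.0 * A.T @ (A @ t - b)  # gradient
    i = np.argmin(g)             # linear minimization oracle: best vertex
    gamma = 2.0 / (k + 2.0)      # standard diminishing step size
    t = (1.0 - gamma) * t        # shrink existing weights...
    t[i] += gamma                # ...and add mass on one vertex

print(np.count_nonzero(t))       # at most 21 nonzeros after 20 steps
```

The modified algorithm in the abstract keeps this per-iteration sparsity mechanism while adapting binary per-sample weights, which is what further prunes the support-vector set.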
The Parameter Houlihan: a solution to high-throughput identifiability indeterminacy for brutally ill-posed problems
One way to inject knowledge into clinically impactful forecasting is to
use data assimilation, a nonlinear regression that projects data onto a
mechanistic physiologic model rather than onto a generic set of functions such
as neural networks. Such regressions have the advantage of remaining useful
with particularly sparse, non-stationary clinical data. However, physiological models are often
nonlinear and can have many parameters, leading to potential problems with
parameter identifiability, or the ability to find a unique set of parameters
that minimize forecasting error. The identifiability problems can be minimized
or eliminated by reducing the number of parameters estimated, but reducing the
number of estimated parameters also reduces the flexibility of the model and
hence increases forecasting error. We propose a method, the parameter Houlihan,
that combines traditional machine learning techniques with data assimilation,
to select the right set of model parameters to minimize forecasting error while
reducing identifiability problems. The method worked well: for our cohort, the
data assimilation-based glucose forecasts and estimates computed with the
Houlihan-selected parameter sets generally achieved lower forecasting errors
than other parameter selection methods, such as by-hand parameter selection.
Nevertheless, the forecast with the lowest forecast error does not always
accurately represent physiology, although further refinement of the algorithm
provides a path toward improving physiologic fidelity as well. Our hope is that
this methodology represents a first step toward combining machine learning with
data assimilation, and that it provides a lower-threshold entry point for using
data assimilation with clinical data by helping select the right parameters to
estimate.
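The abstract does not spell out the Houlihan procedure itself, so the following is only a hypothetical illustration of the selection problem it addresses: a toy mechanistic model (`a * exp(-b * t) + c`; the model, parameter names, grids, and data are all invented here) whose parameter subsets are each fit on an assimilation window, with unestimated parameters held at nominal values, and then scored by held-out forecast error:

```python
import numpy as np
from itertools import combinations, product

def model(params, t):
    """Toy 'mechanistic' model standing in for a physiologic model."""
    a, b, c = params
    return a * np.exp(-b * t) + c

rng = np.random.default_rng(3)
t_fit = np.linspace(0.0, 2.0, 15)        # assimilation window
t_fore = np.linspace(2.0, 4.0, 15)       # forecast horizon
true = np.array([2.0, 1.3, 0.5])
y_fit = model(true, t_fit) + 0.05 * rng.normal(size=15)   # noisy observations
y_fore = model(true, t_fore)             # ground truth for scoring forecasts
nominal = np.array([1.5, 1.0, 0.8])      # imperfect prior parameter values

def fit_subset(idx):
    """Grid-search the chosen parameter subset (a crude stand-in for the
    nonlinear regression of data assimilation); others stay nominal."""
    grid = np.linspace(0.2, 3.0, 30)
    best, best_err = nominal.copy(), np.inf
    for vals in product(*([grid] * len(idx))):
        p = nominal.copy()
        p[list(idx)] = vals
        err = np.mean((model(p, t_fit) - y_fit) ** 2)
        if err < best_err:
            best, best_err = p, err
    return best

# Score every nonempty subset of parameters by *forecast* error, the
# criterion that drives the selection; fewer estimated parameters means
# fewer identifiability problems, more means a more flexible fit.
scores = {}
for r in range(1, 4):
    for idx in combinations(range(3), r):
        p = fit_subset(idx)
        scores[idx] = np.mean((model(p, t_fore) - y_fore) ** 2)

best_subset = min(scores, key=scores.get)
print(best_subset, scores[best_subset])
```

Exhaustive scoring is feasible only for this three-parameter toy; for the many-parameter physiologic models in the paper, a smarter search over subsets is exactly the kind of gap the proposed method targets.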