Interior Point Methods for Massive Support Vector Machines
We investigate the use of interior point methods for solving quadratic
programming problems with a small number of linear constraints where
the quadratic term consists of a low-rank update to a positive semi-definite
matrix. Several formulations of the support vector machine fit into this
category. An interesting feature of these particular problems is the volume
of data, which can lead to quadratic programs with between 10 and
100 million variables and a dense Q matrix. We use OOQP, an object-oriented
interior point code, to solve these problems because it allows us
to easily tailor the required linear algebra to the application. Our linear
algebra implementation uses a proximal point modification to the underlying
algorithm, and exploits the Sherman-Morrison-Woodbury formula and the Schur
complement to facilitate efficient linear system solution.
Since we target massive problems, the data is stored out-of-core and we
overlap computation and I/O to reduce overhead. Results are reported
for several linear support vector machine formulations demonstrating the
reliability and scalability of the method.
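The Sherman-Morrison-Woodbury step is the heart of the linear algebra here: a
system (D + V V^T) x = b with a cheap-to-invert D and a tall, thin V never
requires forming the dense n x n matrix. A minimal NumPy sketch of that idea,
assuming a positive diagonal D (the function name and the diagonal structure
are illustrative assumptions, not the OOQP implementation):

```python
import numpy as np

def smw_solve(d, V, b):
    """Solve (diag(d) + V @ V.T) x = b without forming the dense matrix.

    Sherman-Morrison-Woodbury:
      x = D^{-1} b - D^{-1} V (I + V^T D^{-1} V)^{-1} V^T D^{-1} b
    Cost is O(n k^2) for an n-vector d and an n x k factor V.
    """
    Dinv_b = b / d                      # D^{-1} b, O(n)
    Dinv_V = V / d[:, None]             # D^{-1} V, O(nk)
    k = V.shape[1]
    S = np.eye(k) + V.T @ Dinv_V        # small k x k capacitance matrix
    y = np.linalg.solve(S, V.T @ Dinv_b)
    return Dinv_b - Dinv_V @ y

# quick check against the dense solve on a small instance
rng = np.random.default_rng(0)
n, k = 2000, 5
d = rng.uniform(1.0, 2.0, n)            # positive diagonal part
V = rng.standard_normal((n, k))         # low-rank update factor
b = rng.standard_normal(n)
x = smw_solve(d, V, b)
assert np.allclose((np.diag(d) + V @ V.T) @ x, b)
```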
Oracle-Based Robust Optimization via Online Learning
Robust optimization is a common framework for optimization under uncertainty
when the problem parameters are not known exactly, but are known to belong to
some given uncertainty set. In the robust optimization
framework the problem solved is a min-max problem where a solution is judged
according to its performance on the worst possible realization of the
parameters. In many cases, a straightforward solution of the robust
optimization problem of a certain type requires solving an optimization problem
of a more complicated type, which in some cases is even NP-hard. For example,
solving a robust conic quadratic program, such as those arising in robust SVM,
under ellipsoidal uncertainty leads in general to a semidefinite program. In this
paper we develop a method for approximately solving a robust optimization
problem using tools from online convex optimization, where in every stage a
standard (non-robust) optimization program is solved. Our algorithms find an
approximate robust solution using a number of calls to an oracle for the
original (non-robust) problem that is inversely proportional to the square of
the target accuracy.
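The primal-dual loop can be made concrete with a toy instance: take a finite
uncertainty set {(A_i, b_i)}, a worst-case least-squares objective
min_x max_i ||A_i x - b_i||^2, multiplicative-weights updates for the
adversary, and an ordinary weighted least-squares solve as the non-robust
oracle. Everything in the sketch (the loss, the update rule, the constants) is
an illustrative assumption; the paper's algorithms and guarantees are more
general:

```python
import numpy as np

def robust_via_mw(As, bs, T=200, eta=0.1):
    """Approximately solve min_x max_i ||A_i x - b_i||^2 over a finite
    uncertainty set: multiplicative weights play the max-player, and a
    non-robust weighted least-squares oracle plays the min-player."""
    m = len(As)
    p = np.ones(m) / m                  # adversary's distribution over scenarios
    xs = []
    for _ in range(T):
        # oracle call: solve the standard (non-robust) weighted problem
        # min_x sum_i p_i ||A_i x - b_i||^2 via its normal equations
        H = sum(pi * Ai.T @ Ai for pi, Ai in zip(p, As))
        g = sum(pi * Ai.T @ bi for pi, Ai, bi in zip(p, As, bs))
        x = np.linalg.solve(H, g)
        xs.append(x)
        # multiplicative weights: upweight scenarios where x does badly
        losses = np.array([np.sum((Ai @ x - bi) ** 2)
                           for Ai, bi in zip(As, bs)])
        p *= np.exp(eta * losses / losses.max())
        p /= p.sum()
    return np.mean(xs, axis=0)          # averaged iterate approximates the robust solution

rng = np.random.default_rng(1)
As = [rng.standard_normal((20, 5)) for _ in range(3)]
bs = [rng.standard_normal(20) for _ in range(3)]
x = robust_via_mw(As, bs)
print(max(np.sum((A @ x - b) ** 2) for A, b in zip(As, bs)))
```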
Training Support Vector Machines Using Frank-Wolfe Optimization Methods
Training a Support Vector Machine (SVM) requires the solution of a quadratic
programming problem (QP) whose computational complexity becomes prohibitively
expensive for large scale datasets. Traditional optimization methods cannot be
directly applied in these cases, mainly due to memory restrictions.
By adopting a slightly different objective function and under mild conditions
on the kernel used within the model, efficient algorithms to train SVMs have
been devised under the name of Core Vector Machines (CVMs). This framework
exploits the equivalence of the resulting learning problem with the task of
computing a Minimal Enclosing Ball (MEB) in a feature space, where data is
implicitly embedded by a kernel function.
In this paper, we improve on the CVM approach by proposing two novel methods
to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast
method to approximate the solution of a MEB problem. In contrast to CVMs, our
algorithms do not require computing the solutions of a sequence of
increasingly complex QPs, and are defined using only analytic optimization
steps. Experiments on a large collection of datasets show that our methods
scale better than CVMs in most cases, sometimes at the price of a slightly
lower accuracy. Like CVMs, the proposed methods can be easily extended to
machine learning problems other than binary classification. Moreover, effective
classifiers are also obtained with kernels that do not satisfy the condition
required by CVMs, so our methods can be applied to a wider set of problems.
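The Frank-Wolfe connection is easy to illustrate. The classic Badoiu-Clarkson
iteration for the MEB, which can be read as a Frank-Wolfe method with step
size 1/(t+1), repeatedly finds the point farthest from the current center and
moves toward it. A sketch in input space (i.e., a linear kernel; a toy
stand-in for the paper's kernelized algorithms, not the proposed methods
themselves):

```python
import numpy as np

def meb_center(X, T=500):
    """Approximate the Minimal Enclosing Ball center of the rows of X with
    the Frank-Wolfe / Badoiu-Clarkson iteration: find the point farthest
    from the current center, then step toward it with weight 1/(t+1)."""
    c = X.mean(axis=0)                          # any interior start works
    for t in range(1, T + 1):
        dists = np.linalg.norm(X - c, axis=1)
        far = X[np.argmax(dists)]               # linear-minimization oracle
        c = c + (far - c) / (t + 1)             # classic 1/(t+1) FW step
    return c, np.max(np.linalg.norm(X - c, axis=1))

rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 3))
center, radius = meb_center(X)
print(center, radius)
```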
Optimistic Robust Optimization With Applications To Machine Learning
Robust Optimization has traditionally taken a pessimistic, or worst-case,
viewpoint of uncertainty, motivated by a desire to find sets of optimal
policies that maintain feasibility under a variety of operating conditions. In
this paper, we explore an optimistic, or best-case, view of uncertainty and show
that it can be a fruitful approach. We show that these techniques can be used
to address a wide variety of problems. First, we apply our methods in the
context of robust linear programming, providing a method for reducing
conservatism in intuitive ways that encode economically realistic modeling
assumptions. Second, we look at problems in machine learning and find that this
approach is strongly connected to the existing literature. Specifically, we
provide a new interpretation for popular sparsity inducing non-convex
regularization schemes. Additionally, we show that successful approaches for
dealing with outliers and noise can be interpreted as optimistic robust
optimization problems. Although many of the problems resulting from our
approach are non-convex, we find that DCA (difference-of-convex algorithm) or
DCA-like optimization approaches can be intuitive and efficient.
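One concrete instance of the DCA angle: the sparsity-inducing capped-l1
penalty min(|x_i|, theta) splits as lam*|x_i| - lam*max(|x_i| - theta, 0), a
difference of convex functions, so each DCA step linearizes the concave part
and solves a lasso-type convex subproblem. A hedged sketch (the capped-l1
choice, the ISTA inner solver, and all constants are illustrative assumptions,
not the paper's formulation):

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def dca_capped_l1(A, b, lam=0.1, theta=0.5, outer=20, inner=200):
    """DCA for min ||Ax - b||^2 + lam * sum_i min(|x_i|, theta), written as
    g(x) - h(x) with g = ||Ax-b||^2 + lam*||x||_1 (convex) and
    h = lam * sum_i max(|x_i| - theta, 0) (convex)."""
    n = A.shape[1]
    x = np.zeros(n)
    L = 2 * np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    for _ in range(outer):
        # linearize h at the current point via a subgradient
        s = lam * np.sign(x) * (np.abs(x) > theta)
        # solve the convex subproblem min g(x) - s @ x with ISTA
        for _ in range(inner):
            grad = 2 * A.T @ (A @ x - b) - s
            x = soft(x - grad / L, lam / L)
        # each outer step is guaranteed not to increase the objective
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((50, 100))
x_true = np.zeros(100); x_true[:5] = 3.0
b = A @ x_true + 0.01 * rng.standard_normal(50)
print(np.nonzero(np.abs(dca_capped_l1(A, b)) > 1e-3)[0])
```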
CoCoA: A General Framework for Communication-Efficient Distributed Optimization
The scale of modern datasets necessitates the development of efficient
distributed optimization methods for machine learning. We present a
general-purpose framework for distributed computing environments, CoCoA, that
has an efficient communication scheme and is applicable to a wide variety of
problems in machine learning and signal processing. We extend the framework to
cover general non-strongly-convex regularizers, including L1-regularized
problems like lasso, sparse logistic regression, and elastic net
regularization, and show how earlier work can be derived as a special case. We
provide convergence guarantees for the class of convex regularized loss
minimization objectives, leveraging a novel approach in handling
non-strongly-convex regularizers and non-smooth loss functions. The resulting
framework has markedly improved performance over state-of-the-art methods, as
we illustrate with an extensive set of experiments on real distributed
datasets.
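The communication pattern the framework targets can be simulated on one
machine: each worker performs substantial local computation on its own data
shard, and only one aggregated vector per worker crosses the network per
round. The sketch below uses plain local gradient steps on ridge regression
with averaging; it is a schematic of this computation/communication trade-off
under assumed settings, not CoCoA's actual local subproblems or convergence
machinery:

```python
import numpy as np

def distributed_ridge(A, b, K=4, rounds=50, local_steps=100, lam=0.1):
    """Schematic communication-efficient training: K workers each run many
    local gradient steps on their shard of the data, then a single averaged
    model is communicated per round (one vector per worker per round)."""
    n, d = A.shape
    shards = np.array_split(np.arange(n), K)      # partition rows across workers
    w = np.zeros(d)
    for _ in range(rounds):
        local = []
        for idx in shards:
            Ak, bk = A[idx], b[idx]
            wk = w.copy()
            # safe step size: 1 / Lipschitz constant of the local gradient
            step = 1.0 / (2 * np.linalg.norm(Ak, 2) ** 2 / len(idx) + lam)
            for _ in range(local_steps):          # cheap, communication-free work
                grad = 2 * Ak.T @ (Ak @ wk - bk) / len(idx) + lam * wk
                wk -= step * grad
            local.append(wk)
        w = np.mean(local, axis=0)                # the only communication step
    return w

rng = np.random.default_rng(4)
A = rng.standard_normal((400, 10))
b = A @ rng.standard_normal(10)
w = distributed_ridge(A, b)
print(np.linalg.norm(A @ w - b))
```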