A Projection-Free Algorithm for Solving Support Vector Machine Models
In this thesis our goal is to solve the dual of the support vector machine (SVM) problem, an example of a smooth convex optimization problem over a polytope. To this end, we apply the conditional gradient (CG) method, providing an explicit solution to its linear programming (LP) subproblem. We also describe the conditional gradient sliding (CGS) method, which can be viewed as an improvement of CG in terms of the number of gradient evaluations. Although CGS attains better complexity bounds than CG, it is not practical because it requires knowledge of the Lipschitz constant and of the number of iterations in advance. To address these issues, we design a new method, conditional gradient sliding with line search (CGS-ls). CGS-ls requires numbers of gradient evaluations and linear optimization calls that match the optimal complexity bounds of CGS. We also compare the performance of our method with CG and CGS numerically, testing them on the dual SVM problem for binary classification of two subsets of the MNIST handwritten-digits dataset.
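The explicit LP solution that makes the conditional gradient method projection-free is easiest to see on the unit simplex, a prototypical polytope (the SVM dual feasible region is a box- and equality-constrained variant of it). The following is a minimal sketch, not the thesis implementation; the quadratic objective and the classical 2/(t+2) step size are illustrative assumptions:

```python
import numpy as np

def frank_wolfe_simplex(grad, dim, n_iters=500):
    """Conditional gradient over the unit simplex.

    The LP subproblem  min_{s in simplex} <grad f(x), s>  has a
    closed-form solution: the vertex e_i with i = argmin_j g_j,
    so no projection is ever needed.
    """
    x = np.full(dim, 1.0 / dim)          # feasible starting point
    for t in range(n_iters):
        g = grad(x)
        i = int(np.argmin(g))            # explicit LP solution: best vertex
        s = np.zeros(dim)
        s[i] = 1.0
        gamma = 2.0 / (t + 2.0)          # classical CG step size
        x = (1.0 - gamma) * x + gamma * s
    return x

# Illustrative smooth convex objective: f(x) = 0.5 * ||x - c||^2
c = np.array([0.1, 0.5, 0.2])
x_hat = frank_wolfe_simplex(lambda x: x - c, dim=3)
```

Because the LP minimizer over a polytope is always a vertex, each iterate is a convex combination of at most t + 1 vertices and remains feasible throughout.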
Variance-Reduced and Projection-Free Stochastic Optimization
The Frank-Wolfe optimization algorithm has recently regained popularity for
machine learning applications due to its projection-free property and its
ability to handle structured constraints. However, in the stochastic learning
setting, it is still relatively understudied compared to the gradient descent
counterpart. In this work, leveraging a recent variance reduction technique, we
propose two stochastic Frank-Wolfe variants which substantially improve
previous results in terms of the number of stochastic gradient evaluations
needed to achieve a given accuracy. In particular, the required number of
stochastic gradient evaluations improves on the previous best bounds both
when the objective function is smooth and strongly convex and when it is
smooth and Lipschitz. The theoretical improvement is also observed in
experiments on real-world datasets for a multiclass classification
application.
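The variance-reduction technique that such stochastic Frank-Wolfe variants build on can be sketched with an SVRG-style estimator: correct a single-component gradient by the difference between that component's gradient at a snapshot point and the snapshot's full gradient. A minimal sketch on a finite-sum least-squares objective (an illustrative assumption, not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite-sum least squares: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)^2
n, d = 100, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def full_grad(x):
    """Exact gradient, computed only occasionally at a snapshot point."""
    return A.T @ (A @ x - b) / n

def vr_grad(x, snapshot, snap_grad, i):
    """SVRG-style variance-reduced estimator: unbiased, and its
    variance shrinks as the iterate x approaches the snapshot."""
    gi_x = A[i] * (A[i] @ x - b[i])
    gi_snap = A[i] * (A[i] @ snapshot - b[i])
    return gi_x - gi_snap + snap_grad

x = rng.standard_normal(d)
snapshot = rng.standard_normal(d)
snap_grad = full_grad(snapshot)
est = np.mean([vr_grad(x, snapshot, snap_grad, i) for i in range(n)], axis=0)
```

Averaged over all components, the estimator recovers the exact gradient, and its per-sample variance vanishes as the iterate and the snapshot both approach the optimum; this is the mechanism that reduces the number of stochastic gradient evaluations needed.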
Semi-proximal Mirror-Prox for Nonsmooth Composite Minimization
We propose a new first-order optimization algorithm to solve high-dimensional
non-smooth composite minimization problems. Typical examples of such problems
have an objective that decomposes into a non-smooth empirical risk part and a
non-smooth regularization penalty. The proposed algorithm, called Semi-Proximal
Mirror-Prox, leverages the Fenchel-type representation of one part of the
objective while handling the other part of the objective via linear
minimization over the domain. The algorithm stands in contrast with more
classical proximal gradient algorithms with smoothing, which require the
computation of proximal operators at each iteration and can therefore be
impractical for high-dimensional problems. We establish the theoretical
convergence rate of Semi-Proximal Mirror-Prox, which matches the optimal
complexity bound on the number of calls to the linear minimization oracle.
We present promising experimental results showing the merits of the approach
in comparison to competing methods.
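The trade-off the abstract describes, handling one part of the objective by linear minimization rather than by a proximal operator, can be made concrete on the nuclear-norm ball (an illustrative domain, not necessarily the paper's): the linear minimization oracle needs only the top singular pair, whereas the proximal operator needs a full SVD. A hedged sketch:

```python
import numpy as np

def lmo_nuclear_ball(G, tau):
    """Linear minimization oracle over {X : ||X||_* <= tau}:
    argmin_X <G, X> = -tau * u1 v1^T, where (u1, v1) is the top
    singular pair of G. Only one singular pair is needed (a full
    SVD is used here for brevity; power iteration would suffice)."""
    U, s, Vt = np.linalg.svd(G)
    return -tau * np.outer(U[:, 0], Vt[0, :])

def prox_nuclear(G, lam):
    """Proximal operator of lam * ||.||_*: requires a full SVD and
    soft-thresholds every singular value, which is what makes
    proximal schemes expensive in high dimension."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt
```

The LMO output is always rank one, so a conditional-gradient-style scheme builds low-rank iterates incrementally, while each proximal step costs a full decomposition.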
Projected gradient descent for non-convex sparse spike estimation
We propose a new algorithm for sparse spike estimation from Fourier
measurements. Based on theoretical results on non-convex optimization
techniques for off-the-grid sparse spike estimation, we present a projected
gradient descent algorithm coupled with a spectral initialization procedure.
Our algorithm makes it possible to estimate the positions of large numbers of
Diracs in 2D from random Fourier measurements. We present, alongside the
algorithm, qualitative theoretical insights explaining its success. This
opens a new direction for practical off-the-grid spike estimation with
theoretical guarantees in imaging applications.
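Projected gradient descent onto a non-convex sparse set can be illustrated by its simplest on-grid analogue, iterative hard thresholding for sparse least squares. The measurement matrix, sparsity level, and zero initialization below are illustrative assumptions; the paper itself works off the grid with Fourier measurements and a spectral initialization:

```python
import numpy as np

def hard_threshold(x, k):
    """Projection onto the non-convex set of k-sparse vectors:
    keep the k largest-magnitude entries, zero out the rest."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-k:]
    out[keep] = x[keep]
    return out

def iht(A, y, k, n_iters=500):
    """Projected gradient descent (iterative hard thresholding)
    for min_x ||A x - y||^2 subject to x being k-sparse."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / largest squared singular value
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = hard_threshold(x - step * A.T @ (A @ x - y), k)
    return x

# Illustrative recovery: a 3-sparse signal from 40 random measurements
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[7, 21, 90]] = [2.0, -2.0, 2.0]
y = A @ x_true
x_hat = iht(A, y, k=3)
```

The projection step is cheap but non-convex, so guarantees hinge on properties of the measurement operator; in the off-the-grid setting the analogous role is played by the spectral initialization and the conditioning results the abstract refers to.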