SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives
In this work we introduce a new optimisation method called SAGA in the spirit
of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient
algorithms with fast linear convergence rates. SAGA improves on the theory
behind SAG and SVRG, with better theoretical convergence rates, and has support
for composite objectives where a proximal operator is used on the regulariser.
Unlike SDCA, SAGA supports non-strongly convex problems directly, and is
adaptive to any inherent strong convexity of the problem. We give experimental
results showing the effectiveness of our method.
Comment: Advances in Neural Information Processing Systems, Nov 2014, Montreal, Canada
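
The update this abstract describes can be illustrated with a small sketch. It is not the authors' code: it assumes least-squares components with an L1 regulariser (a lasso problem), a step size of 1/(3L), and dense Python lists; the names `saga_lasso` and `soft_threshold` are hypothetical.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.| for a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def saga_lasso(A, b, lam, n_steps=20000, seed=0):
    """Sketch of proximal SAGA on
    (1/n) * sum_i 0.5 * (a_i . x - b_i)^2 + lam * ||x||_1.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    # Smoothness constant of the individual components (assumed choice).
    L = max(sum(a * a for a in row) for row in A)
    gamma = 1.0 / (3.0 * L)
    x = [0.0] * d
    # Table of the last gradient seen for each component; at x = 0 the
    # gradient of component i is -b_i * a_i.
    table = [[-b[i] * A[i][k] for k in range(d)] for i in range(n)]
    avg = [sum(table[i][k] for i in range(n)) / n for k in range(d)]
    for _ in range(n_steps):
        j = rng.randrange(n)
        resid = sum(A[j][k] * x[k] for k in range(d)) - b[j]
        g_new = [resid * A[j][k] for k in range(d)]
        # Variance-reduced gradient step followed by the proximal step.
        x = [
            soft_threshold(
                x[k] - gamma * (g_new[k] - table[j][k] + avg[k]),
                gamma * lam,
            )
            for k in range(d)
        ]
        # Keep the running average consistent with the updated table.
        for k in range(d):
            avg[k] += (g_new[k] - table[j][k]) / n
        table[j] = g_new
    return x
```

For instance, `saga_lasso([[1.0], [1.0]], [1.0, 3.0], lam=0.5)` targets a one-dimensional problem whose minimiser is 1.5 analytically, so the returned value should approach that number.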
A Proximal Stochastic Gradient Method with Progressive Variance Reduction
We consider the problem of minimizing the sum of two convex functions: one is
the average of a large number of smooth component functions, and the other is a
general convex function that admits a simple proximal mapping. We assume the
whole objective function is strongly convex. Such problems often arise in
machine learning, known as regularized empirical risk minimization. We propose
and analyze a new proximal stochastic gradient method, which uses a multi-stage
scheme to progressively reduce the variance of the stochastic gradient. While
each iteration of this algorithm has a cost similar to that of the classical stochastic
gradient method (or incremental gradient method), we show that the expected
objective value converges to the optimum at a geometric rate. The overall
complexity of this method is much lower than both the proximal full gradient
method and the standard proximal stochastic gradient method.
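
A minimal sketch of the multi-stage scheme described in this abstract, under stated assumptions: least-squares components, an L1 regulariser, and a snapshot set to the average of the inner iterates of each stage. The function names and the parameters `eta`, `m` and `n_stages` are hypothetical and must be chosen by the caller; this is an illustration, not the authors' implementation.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.| for a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def prox_svrg_lasso(A, b, lam, eta, m, n_stages, seed=0):
    """Sketch of a proximal stochastic gradient method with progressive
    variance reduction on
    (1/n) * sum_i 0.5 * (a_i . x - b_i)^2 + lam * ||x||_1.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])

    def grad_i(i, x):
        resid = sum(A[i][k] * x[k] for k in range(d)) - b[i]
        return [resid * A[i][k] for k in range(d)]

    snapshot = [0.0] * d
    for _ in range(n_stages):
        # One full pass per stage: component gradients at the snapshot.
        # (Stored here for clarity; they could be recomputed on demand.)
        grads = [grad_i(i, snapshot) for i in range(n)]
        full = [sum(g[k] for g in grads) / n for k in range(d)]
        x = list(snapshot)
        running = [0.0] * d
        for _ in range(m):
            i = rng.randrange(n)
            g = grad_i(i, x)
            # The correction term shrinks as x and the snapshot approach
            # the optimum, which reduces the variance of the estimate.
            x = [
                soft_threshold(
                    x[k] - eta * (g[k] - grads[i][k] + full[k]),
                    eta * lam,
                )
                for k in range(d)
            ]
            for k in range(d):
                running[k] += x[k] / m
        # Assumed choice: the next snapshot is the average inner iterate.
        snapshot = running
    return snapshot
```

Each inner iteration touches a single component, so its cost matches that of a plain stochastic gradient step; the only extra work is the full gradient computed once per stage.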
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
Due to their simplicity and excellent performance, parallel asynchronous
variants of stochastic gradient descent have become popular methods to solve a
wide range of large-scale optimization problems on multi-core architectures.
Yet, despite their practical success, support for nonsmooth objectives is still
lacking, making them unsuitable for many problems of interest in machine
learning, such as the Lasso, group Lasso or empirical risk minimization with
convex constraints.
In this work, we propose and analyze ProxASAGA, a fully asynchronous sparse
method inspired by SAGA, a variance reduced incremental gradient algorithm. The
proposed method is easy to implement and significantly outperforms the state of
the art on several nonsmooth, large-scale problems. We prove that our method
achieves a theoretical linear speedup with respect to the sequential version
under assumptions on the sparsity of gradients and block-separability of the
proximal term. Empirical benchmarks on a multi-core architecture illustrate
practical speedups of up to 12x on a 20-core machine.
Comment: Appears in Advances in Neural Information Processing Systems 30 (NIPS 2017), 28 pages
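
The sparsity assumption mentioned in this abstract can be illustrated with a sequential sketch of a sparse proximal SAGA step, where each iteration touches only the coordinates in the support of the sampled row. This is an illustration under assumptions, not ProxASAGA itself: the asynchronous, multi-core part is not shown, and the per-coordinate reweighting `n / n_k` (with `n_k` the number of rows whose support contains coordinate `k`), the step size 1/(5L), and all function names are assumptions made for this example.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t * |.| for a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_prox_saga_lasso(rows, b, d, lam, n_steps=20000, seed=0):
    """Sequential sketch of a sparse proximal SAGA update for the lasso.

    rows: list of dicts {coordinate index: nonzero value}, one per sample.
    """
    rng = random.Random(seed)
    n = len(rows)
    L = max(sum(v * v for v in row.values()) for row in rows)
    gamma = 1.0 / (5.0 * L)  # assumed step size
    # Reweighting so that sparse updates match the dense average
    # gradient and the full regulariser in expectation (assumption).
    counts = [0] * d
    for row in rows:
        for k in row:
            counts[k] += 1
    scale = [n / c if c else 0.0 for c in counts]
    x = [0.0] * d
    # For least squares the stored gradient of sample i is resid_i * a_i,
    # so only the scalar residual needs to be remembered.
    resid_mem = [-b[i] for i in range(n)]
    avg = [0.0] * d
    for i, row in enumerate(rows):
        for k, v in row.items():
            avg[k] += resid_mem[i] * v / n
    for _ in range(n_steps):
        j = rng.randrange(n)
        row = rows[j]
        resid = sum(v * x[k] for k, v in row.items()) - b[j]
        delta = resid - resid_mem[j]
        # Only coordinates in the support of row j are read or written.
        for k, v in row.items():
            step = delta * v + scale[k] * avg[k]
            x[k] = soft_threshold(x[k] - gamma * step, gamma * scale[k] * lam)
        for k, v in row.items():
            avg[k] += delta * v / n
        resid_mem[j] = resid
    return x
```

Because each step only reads and writes the support of one row, updates from different samples rarely collide on sparse data, which is the property an asynchronous variant would rely on.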