Catalyst Acceleration for Gradient-Based Non-Convex Optimization
We introduce a generic scheme to solve nonconvex optimization problems using
gradient-based algorithms originally designed for minimizing convex functions.
Even though these methods may originally require convexity to operate, the
proposed approach allows one to use them on weakly convex objectives, which
covers a large class of non-convex functions typically appearing in machine
learning and signal processing. In general, the scheme is guaranteed to produce
a stationary point with a worst-case efficiency typical of first-order methods,
and when the objective turns out to be convex, it automatically accelerates in
the sense of Nesterov and achieves a near-optimal convergence rate in function
values. These properties are achieved without assuming any knowledge about the
convexity of the objective, by automatically adapting to the unknown weak
convexity constant. We conclude the paper by showing promising experimental
results obtained by applying our approach to incremental algorithms such as
SVRG and SAGA for sparse matrix factorization and for learning neural networks.
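The abstract above describes the mechanism only at a high level, so here is a minimal, editorial sketch of a Catalyst-style wrapper under assumed details: a quadratic proximal term with parameter `kappa` is added so that the subproblem is convex whenever the (unknown) weak convexity constant is below `kappa`, an inner first-order solver (plain gradient descent here, standing in for SVRG or SAGA) approximately minimizes it, and a Nesterov-style extrapolation step follows. All names, parameter choices, and the toy objective are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def catalyst_style_wrapper(grad_f, x0, kappa=1.0, n_outer=50,
                           inner_steps=100, inner_lr=0.01):
    # At each outer iteration, approximately minimize the proximal model
    # f(x) + (kappa / 2) * ||x - y_k||^2, which is convex whenever the
    # (unknown) weak convexity constant of f is below kappa, then extrapolate.
    x, y = x0.copy(), x0.copy()
    alpha = 1.0
    for _ in range(n_outer):
        # Inner solver: plain gradient descent stands in for SVRG or SAGA.
        z = y.copy()
        for _ in range(inner_steps):
            z = z - inner_lr * (grad_f(z) + kappa * (z - y))
        x_prev, x = x, z
        # Nesterov-style extrapolation of the outer sequence.
        alpha_prev = alpha
        alpha = (1.0 + np.sqrt(1.0 + 4.0 * alpha_prev ** 2)) / 2.0
        y = x + ((alpha_prev - 1.0) / alpha) * (x - x_prev)
    return x

# Toy usage on a smooth non-convex objective f(x) = sum_i x_i^2 / (1 + x_i^2).
grad_f = lambda x: 2.0 * x / (1.0 + x ** 2) ** 2
x_out = catalyst_style_wrapper(grad_f, x0=np.ones(5))
```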
Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure
Stochastic optimization algorithms with variance reduction have proven
successful for minimizing large finite sums of functions. Unfortunately, these
techniques are unable to deal with stochastic perturbations of input data,
induced for example by data augmentation. In such cases, the objective is no
longer a finite sum, and the main candidate for optimization is the stochastic
gradient descent method (SGD). In this paper, we introduce a variance reduction
approach for these settings when the objective is composite and strongly
convex. The resulting convergence rate improves on that of SGD, with a typically
much smaller constant factor that depends only on the variance of gradient
estimates induced by perturbations of a single example.
Comment: Advances in Neural Information Processing Systems (NIPS), Dec 2017, Long Beach, CA, United States.
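As an editorial illustration of how such a variance-reduction scheme can be organized, the sketch below keeps a SAGA-style table of per-example gradients and, at each step, samples an example together with a fresh perturbation; the variance of the resulting estimate then stems only from the perturbation on that single example. This is a sketch under assumptions (it omits the composite/proximal part and uses made-up names such as `grad_i` and `perturb`), not the paper's exact method.

```python
import numpy as np

def saga_like_with_perturbations(grad_i, perturb, x0, n, lr=0.01, n_iters=2000):
    # Variance-reduced update for objectives of the form
    # (1/n) * sum_i E_rho[ f_i(x, rho) ], where rho models a random data
    # perturbation (e.g. data augmentation) applied to example i.
    x = x0.copy()
    # Per-example gradient memory, initialized with one perturbation draw each.
    memory = np.stack([grad_i(i, x, perturb()) for i in range(n)])
    mean_mem = memory.mean(axis=0)
    for _ in range(n_iters):
        i = np.random.randint(n)
        g_new = grad_i(i, x, perturb())    # fresh perturbed gradient for example i
        g = g_new - memory[i] + mean_mem   # variance-reduced gradient estimate
        x = x - lr * g
        # Update the memory table and its running mean.
        mean_mem = mean_mem + (g_new - memory[i]) / n
        memory[i] = g_new
    return x

# Toy usage: least squares over 20 examples with additive Gaussian input noise.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad_i = lambda i, x, rho: (A[i] + rho) * ((A[i] + rho) @ x - b[i])
perturb = lambda: 0.1 * rng.normal(size=5)
x_hat = saga_like_with_perturbations(grad_i, perturb, np.zeros(5), n=20)
```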
A Generic Approach for Escaping Saddle points
A central challenge to using first-order methods for optimizing nonconvex
problems is the presence of saddle points. First-order methods often get stuck
at saddle points, greatly deteriorating their performance. Typically, to escape
from saddles one has to use second-order methods. However, most works on
second-order methods rely extensively on expensive Hessian-based computations,
making them impractical in large-scale settings. To tackle this challenge, we
introduce a generic framework that minimizes Hessian based computations while
at the same time provably converging to second-order critical points. Our
framework carefully alternates between a first-order and a second-order
subroutine, using the latter only close to saddle points, and yields
convergence results competitive to the state-of-the-art. Empirical results
suggest that our strategy also enjoys good practical performance.
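To make the alternation concrete, here is an editorial sketch under assumed details: it takes cheap gradient steps while the gradient is large and calls a second-order check only when the gradient becomes small, stepping along a direction of sufficiently negative curvature to escape a saddle or stopping at an approximate second-order critical point. For brevity the sketch uses a dense Hessian eigendecomposition, whereas a framework aiming to minimize Hessian-based computations would replace this with cheaper Hessian-vector-product routines; all thresholds, names, and the toy objective are assumptions.

```python
import numpy as np

def escape_saddles(grad, hess, x0, lr=0.1, eps_g=1e-4, eps_h=1e-3,
                   nc_step=0.1, max_iters=100000):
    x = x0.copy()
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) > eps_g:
            # First-order phase: cheap gradient steps while progress is easy.
            x = x - lr * g
            continue
        # Gradient is small: possible saddle, invoke the second-order subroutine.
        eigvals, eigvecs = np.linalg.eigh(hess(x))
        if eigvals[0] < -eps_h:
            # Sufficient negative curvature: step along it to escape the saddle.
            v = eigvecs[:, 0]
            if g @ v > 0:
                v = -v
            x = x + nc_step * v
        else:
            # Approximate second-order critical point: small gradient and
            # (nearly) positive semidefinite Hessian.
            return x
    return x

# Toy usage: f(x, y) = x^2 + y^4/4 - y^2/2 has a saddle at the origin.
grad = lambda z: np.array([2.0 * z[0], z[1] ** 3 - z[1]])
hess = lambda z: np.array([[2.0, 0.0], [0.0, 3.0 * z[1] ** 2 - 1.0]])
x_min = escape_saddles(grad, hess, np.zeros(2))  # ends near (0, +/-1)
```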