Variance Reduction for Faster Non-Convex Optimization
We consider the fundamental problem in non-convex optimization of efficiently
reaching a stationary point. In contrast to the convex case, in the long
history of this basic problem, the only known theoretical results on
first-order non-convex optimization remain full gradient descent, which
converges in $O(1/\varepsilon)$ iterations for smooth objectives, and
stochastic gradient descent, which converges in $O(1/\varepsilon^2)$ iterations
for objectives that are a sum of smooth functions.
We provide the first improvement in this line of research. Our result is
based on the variance reduction trick recently introduced to convex
optimization, as well as a brand new analysis of variance reduction that is
suitable for non-convex optimization. For objectives that are a sum of $n$
smooth functions, our first-order minibatch stochastic method converges with an
$O(1/\varepsilon)$ rate, and is faster than full gradient descent by a factor
of $\Omega(n^{1/3})$.
We demonstrate the effectiveness of our method on empirical risk
minimization with non-convex loss functions and on training neural networks.
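
The abstract's method builds on the variance reduction trick: a full gradient taken at a periodic snapshot point, corrected per sample, as in SVRG. Below is a minimal sketch of that estimator for $f(x) = \frac{1}{n}\sum_i f_i(x)$; the function names, step size, and epoch length are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def svrg_nonconvex(grad_i, x0, n, eta=0.01, epochs=10, m=None, rng=None):
    """Sketch of a variance-reduced first-order method for
    f(x) = (1/n) * sum_i f_i(x); grad_i(i, x) is the gradient of f_i at x."""
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m  # inner-loop length (an assumption here)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        snapshot = x.copy()
        # Full gradient at the snapshot: the "variance reduction trick".
        full_grad = np.mean([grad_i(i, snapshot) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Unbiased estimator whose variance shrinks as x nears the snapshot.
            v = grad_i(i, x) - grad_i(i, snapshot) + full_grad
            x -= eta * v
    return x
```

The snapshot full gradient is what makes each inner step cheap yet low-variance: the per-sample correction `grad_i(i, x) - grad_i(i, snapshot)` vanishes as the iterate approaches the snapshot.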
Momentum-Based Variance Reduction in Non-Convex SGD
Variance reduction has emerged in recent years as a strong competitor to
stochastic gradient descent in non-convex problems, providing the first
algorithms to improve upon the convergence rate of stochastic gradient descent for
finding first-order critical points. However, variance reduction techniques
typically require carefully tuned learning rates and willingness to use
excessively large "mega-batches" in order to achieve their improved results. We
present a new algorithm, STORM, that does not require any batches and makes use
of adaptive learning rates, enabling simpler implementation and less
hyperparameter tuning. Our technique for removing the batches uses a variant of
momentum to achieve variance reduction in non-convex optimization. On smooth
losses $F$, STORM finds a point $x$ with
$\mathbb{E}[\|\nabla F(x)\|] \le O(1/\sqrt{T} + \sigma^{1/3}/T^{1/3})$ in $T$
iterations with $\sigma^2$ variance in the gradients, matching the optimal rate
but without requiring knowledge of $\sigma$.
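
STORM replaces snapshots and mega-batches with a recursive momentum estimator: each step re-evaluates the same stochastic sample at both the current and the previous iterate. A minimal sketch follows, with a fixed step size `eta` and momentum parameter `a` standing in for STORM's adaptive choices (the `grad` interface is an assumption):

```python
import numpy as np

def storm(grad, x0, steps, eta=0.1, a=0.1, rng=None):
    """Sketch of STORM's momentum-based variance reduction.
    grad(x, seed) returns a stochastic gradient at x for the sample
    determined by seed; eta and a are fixed here, adaptive in the paper."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    d = grad(x, rng.integers(2**32))  # first estimator: plain stochastic gradient
    for _ in range(steps):
        x_prev = x.copy()
        x = x - eta * d
        seed = rng.integers(2**32)
        # Key update: the SAME sample is evaluated at x and x_prev, so no
        # batches or snapshot points are needed for variance reduction.
        d = grad(x, seed) + (1 - a) * (d - grad(x_prev, seed))
    return x
```

Setting `a = 1` recovers plain SGD; the `(1 - a)` term carries forward the accumulated estimate while the paired evaluations keep its error from compounding.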
Efficient Regret Minimization in Non-Convex Games
We consider regret minimization in repeated games with non-convex loss
functions. Minimizing the standard notion of regret is computationally
intractable. Thus, we define a natural notion of regret which permits efficient
optimization and generalizes offline guarantees for convergence to an
approximate local optimum. We give gradient-based methods that achieve optimal
regret, which in turn guarantee convergence to equilibrium in this framework.
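
One concrete reading of such a gradient-based method: control a local notion of regret by running descent on a sliding-window average of the most recent losses before committing to each play. A minimal sketch under that reading; the window size, step size, and fixed inner-loop budget are assumptions (a gradient-norm stopping rule would match the offline guarantee more closely):

```python
import numpy as np

def time_smoothed_ogd(loss_grad, x0, T, w=10, eta=0.05, inner=50):
    """Sketch of time-smoothed online gradient descent.
    loss_grad(t, x) is the gradient of the round-t loss at x."""
    x = np.asarray(x0, dtype=float).copy()
    plays = []
    for t in range(T):
        plays.append(x.copy())  # commit to the round-t play before updating
        window = range(max(0, t - w + 1), t + 1)
        for _ in range(inner):
            # Descend on the average of the last w observed losses.
            g = np.mean([loss_grad(s, x) for s in window], axis=0)
            x = x - eta * g
    return plays
```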