2,631 research outputs found

Stochastic Frank-Wolfe Methods for Nonconvex Optimization

Authors: Poczos Barnabas; Reddi Sashank J.; Smola Alex; Sra Suvrit
Publication date: 29/07/2016

We study Frank-Wolfe methods for nonconvex stochastic and finite-sum optimization problems. Frank-Wolfe methods (in the convex case) have gained tremendous recent interest in the machine learning and optimization communities due to their projection-free property and their ability to exploit structured constraints. However, our understanding of these algorithms in the nonconvex setting is fairly limited. In this paper, we propose nonconvex stochastic Frank-Wolfe methods and analyze their convergence properties. For objective functions that decompose into a finite sum, we leverage ideas from variance reduction techniques for convex optimization to obtain new variance-reduced nonconvex Frank-Wolfe methods that have provably faster convergence than the classical Frank-Wolfe method. Finally, we show that the faster convergence rates of our variance-reduced methods also translate into improved convergence rates for the stochastic setting.
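To make the "projection-free" property concrete, here is a minimal sketch of a generic mini-batch stochastic Frank-Wolfe loop over an l1-ball. It is not the algorithm analyzed in the paper: the constraint set, the step size `2 / (t + 2)` (the classical convex-case choice), the toy nonconvex loss, and all function names are assumptions made for illustration, and no variance reduction is included.

```python
import random


def lmo_l1(grad, radius):
    """Linear minimization oracle: argmin of <grad, s> over ||s||_1 <= radius."""
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        # Put all mass on the largest-magnitude coordinate, opposite in sign.
        s[i] = -radius if grad[i] > 0 else radius
    return s


def stochastic_frank_wolfe(grad_i, n, x0, radius, steps, batch_size, seed=0):
    """Generic mini-batch Frank-Wolfe; x0 must lie inside the l1-ball."""
    rng = random.Random(seed)
    x = list(x0)
    for t in range(steps):
        # Stochastic gradient: average of per-sample gradients on a random batch.
        batch = [rng.randrange(n) for _ in range(batch_size)]
        g = [0.0] * len(x)
        for i in batch:
            gi = grad_i(i, x)
            for k in range(len(x)):
                g[k] += gi[k] / batch_size
        s = lmo_l1(g, radius)
        gamma = 2.0 / (t + 2)  # assumed step size, not taken from the paper
        # Convex combination of feasible points: no projection is needed.
        x = [(1 - gamma) * xk + gamma * sk for xk, sk in zip(x, s)]
    return x


# Hypothetical toy finite sum with a nonconvex per-sample loss
# f_i(x) = r_i^2 / (1 + r_i^2), where r_i = <a_i, x> - b_i.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.5, -0.5, 0.0]


def grad_i(i, x):
    r = sum(a * xk for a, xk in zip(A[i], x)) - b[i]
    scale = 2.0 * r / (1.0 + r * r) ** 2  # derivative of r^2 / (1 + r^2)
    return [scale * a for a in A[i]]


x = stochastic_frank_wolfe(grad_i, n=len(A), x0=[0.0, 0.0],
                           radius=1.0, steps=200, batch_size=2)
```

Each iterate is a convex combination of points in the ball, so feasibility is maintained using only the linear minimization oracle, which for the l1-ball reduces to finding the largest-magnitude gradient coordinate.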

arXiv.org e-Print Archive

Crossref

A Lower Bound for the Optimization of Finite Sums

Authors: Agarwal Alekh; Bottou Leon
Publication date: 03/10/2015

This paper presents a lower bound for optimizing a finite sum of $n$ functions, where each function is $L$-smooth and the sum is $\mu$-strongly convex. We show that no algorithm can reach an error $\epsilon$ in minimizing all functions from this class in fewer than $\Omega(n + \sqrt{n(\kappa-1)}\log(1/\epsilon))$ iterations, where $\kappa=L/\mu$ is a surrogate condition number. We then compare this lower bound to upper bounds for recently developed methods specializing to this setting. When the functions involved in this sum are not arbitrary, but based on i.i.d. random data, we further contrast these complexity results with those for optimal first-order methods that directly optimize the sum. The conclusion we draw is that a lot of caution is necessary for an accurate comparison, and we identify machine learning scenarios where the new methods help computationally.

Comment: Added an erratum; we are currently working on extending the result to randomized algorithms.
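As a rough illustration of how the stated bound scales, the sketch below evaluates the expression inside the Omega, n + sqrt(n(kappa - 1)) log(1/epsilon), for given parameters. The function name and the input checks are assumptions made here; because the Omega notation hides a constant factor, the result indicates growth only and is not an iteration count.

```python
import math


def finite_sum_lower_bound_term(n, L, mu, eps):
    """Evaluate n + sqrt(n * (kappa - 1)) * log(1 / eps), with kappa = L / mu.

    Constants hidden by the Omega notation are ignored, so this is a
    scaling term rather than an exact number of iterations.
    """
    if n < 1 or not (L >= mu > 0) or not (0 < eps < 1):
        raise ValueError("expected n >= 1, L >= mu > 0, and 0 < eps < 1")
    kappa = L / mu
    return n + math.sqrt(n * (kappa - 1)) * math.log(1 / eps)


# Example: the term grows like sqrt(n * kappa) in the ill-conditioned regime.
well_conditioned = finite_sum_lower_bound_term(n=10_000, L=2.0, mu=1.0, eps=1e-6)
ill_conditioned = finite_sum_lower_bound_term(n=10_000, L=1e6, mu=1.0, eps=1e-6)
```

When kappa equals 1 the square-root term vanishes and only the n term remains, which matches the intuition that every function in the sum must be examined at least once.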

arXiv.org e-Print Archive

CiteSeerX