2,631 research outputs found

    Stochastic Frank-Wolfe Methods for Nonconvex Optimization

    We study Frank-Wolfe methods for nonconvex stochastic and finite-sum optimization problems. Frank-Wolfe methods (in the convex case) have gained tremendous recent interest in the machine learning and optimization communities due to their projection-free property and their ability to exploit structured constraints. However, our understanding of these algorithms in the nonconvex setting is fairly limited. In this paper, we propose nonconvex stochastic Frank-Wolfe methods and analyze their convergence properties. For objective functions that decompose into a finite sum, we leverage ideas from variance reduction techniques for convex optimization to obtain new variance-reduced nonconvex Frank-Wolfe methods that have provably faster convergence than the classical Frank-Wolfe method. Finally, we show that the faster convergence rates of our variance-reduced methods also translate into improved convergence rates for the stochastic setting.
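The projection-free property mentioned in the abstract above can be illustrated with a minimal sketch of a mini-batch stochastic Frank-Wolfe loop. This is not the paper's algorithm: the L1-ball constraint, the constant step size `1/sqrt(steps)`, the batch size, and the names `lmo_l1_ball`, `stochastic_frank_wolfe`, and `grad_i` are all assumptions chosen for illustration, and no variance reduction is included.

```python
import math
import random


def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle: a minimizer of <grad, s> over the L1 ball.

    The minimizer is a signed, scaled coordinate vector, so no projection
    is ever needed.
    """
    i = max(range(len(grad)), key=lambda k: abs(grad[k]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        s[i] = -radius if grad[i] > 0 else radius
    return s


def stochastic_frank_wolfe(grad_i, n, x0, steps=200, batch=8, radius=1.0, seed=0):
    """Mini-batch Frank-Wolfe sketch for f(x) = (1/n) * sum_i f_i(x).

    grad_i(i, x) is a hypothetical user-supplied callable returning the
    gradient of the i-th component at x. x0 is assumed to lie in the ball.
    """
    rng = random.Random(seed)
    x = list(x0)
    gamma = 1.0 / math.sqrt(steps)  # assumed constant step size in (0, 1]
    for _ in range(steps):
        # Average the gradients of a uniformly sampled mini-batch.
        g = [0.0] * len(x)
        for _ in range(batch):
            gi = grad_i(rng.randrange(n), x)
            for k in range(len(x)):
                g[k] += gi[k] / batch
        # Move toward the oracle's answer; a convex combination stays feasible.
        s = lmo_l1_ball(g, radius)
        x = [xk + gamma * (sk - xk) for xk, sk in zip(x, s)]
    return x
```

Because each iterate is a convex combination of points inside the constraint set, feasibility is maintained without any projection step, which is the property the abstract highlights.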

    A Lower Bound for the Optimization of Finite Sums

    This paper presents a lower bound for optimizing a finite sum of $n$ functions, where each function is $L$-smooth and the sum is $\mu$-strongly convex. We show that no algorithm can reach an error $\epsilon$ in minimizing all functions from this class in fewer than $\Omega(n + \sqrt{n(\kappa-1)}\log(1/\epsilon))$ iterations, where $\kappa = L/\mu$ is a surrogate condition number. We then compare this lower bound to upper bounds for recently developed methods specializing to this setting. When the functions involved in this sum are not arbitrary, but based on i.i.d. random data, we further contrast these complexity results with those for optimal first-order methods that directly optimize the sum. We conclude that considerable caution is necessary for an accurate comparison, and we identify machine learning scenarios where the new methods help computationally.
    Comment: Added an erratum; we are currently working on extending the result to randomized algorithms.
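To get a feel for how the bound in the abstract above scales, the expression inside the Omega can be evaluated directly. This is only a sketch: the function name `lower_bound_order` is hypothetical, the natural logarithm is assumed, and the constant hidden by the Omega notation is ignored, so the result indicates an order of growth rather than an actual iteration count.

```python
import math


def lower_bound_order(n, L, mu, eps):
    """Evaluate n + sqrt(n * (kappa - 1)) * log(1 / eps) with kappa = L / mu.

    Constants hidden by the Omega notation are dropped (an assumption),
    so only the relative growth in n, kappa, and eps is meaningful.
    """
    if not (0 < mu <= L):
        raise ValueError("expected 0 < mu <= L")
    if not (0 < eps < 1):
        raise ValueError("expected 0 < eps < 1")
    kappa = L / mu
    return n + math.sqrt(n * (kappa - 1.0)) * math.log(1.0 / eps)
```

When `kappa` equals 1 the second term vanishes and the expression reduces to `n`, matching the intuition that each of the `n` functions must be examined at least once.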