3,361 research outputs found

    Accelerating Incremental Gradient Optimization with Curvature Information

    This paper studies an acceleration technique for the incremental aggregated gradient ({\sf IAG}) method through the use of \emph{curvature} information for solving strongly convex finite sum optimization problems. Such problems arise in large-scale learning applications. Our technique utilizes a curvature-aided gradient tracking step to produce accurate gradient estimates incrementally using Hessian information. We propose and analyze two methods built on the new technique, the curvature-aided IAG ({\sf CIAG}) method and the accelerated CIAG ({\sf A-CIAG}) method, which are analogous to the gradient method and Nesterov's accelerated gradient method, respectively. Setting $\kappa$ to be the condition number of the objective function, we prove $R$-linear convergence rates of $1 - \frac{4 c_0 \kappa}{(\kappa+1)^2}$ for the {\sf CIAG} method and $1 - \sqrt{\frac{c_1}{2\kappa}}$ for the {\sf A-CIAG} method, where $c_0, c_1 \leq 1$ are constants inversely proportional to the distance between the initial point and the optimal solution. When the initial iterate is close to the optimal solution, these $R$-linear rates match those of the gradient and accelerated gradient methods, even though {\sf CIAG} and {\sf A-CIAG} operate in an incremental setting with strictly lower computation complexity. Numerical experiments confirm our findings. The source code used for this paper can be found at \url{http://github.com/hoitowai/ciag/}. Comment: 22 pages, 3 figures, 3 tables. Accepted by Computational Optimization and Applications, to appear.
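
    The sketch below illustrates the curvature-aided gradient tracking idea described in the abstract: each stale component gradient is corrected by a Hessian term toward the current iterate before the aggregated step is taken. The function names, the cyclic component order, and the naive recomputation of the aggregated sum are assumptions made for readability, not the authors' implementation (which is in the linked repository).

        # Illustrative sketch of a curvature-aided incremental aggregated gradient (CIAG-style) loop.
        import numpy as np

        def ciag(grads, hessians, x0, step_size, n_epochs):
            """grads[i](x) / hessians[i](x) return the gradient / Hessian of component f_i at x."""
            n = len(grads)
            x = x0.copy()
            x_last = [x0.copy() for _ in range(n)]           # last point at which f_i was visited
            g_last = [grads[i](x0) for i in range(n)]        # stored component gradients
            H_last = [hessians[i](x0) for i in range(n)]     # stored component Hessians
            for k in range(n_epochs * n):
                i = k % n                                    # cyclic component selection
                x_last[i], g_last[i], H_last[i] = x.copy(), grads[i](x), hessians[i](x)
                # Curvature-aided estimate of the full gradient: every stale gradient is
                # extrapolated to the current iterate with its stored Hessian.
                g_est = sum(g_last[j] + H_last[j] @ (x - x_last[j]) for j in range(n))
                x = x - step_size * g_est
            return x

        # Toy usage on a least-squares finite sum, f_i(x) = 0.5 * (a_i^T x - b_i)^2:
        A, b = np.random.default_rng(0).normal(size=(5, 3)), np.ones(5)
        grads = [lambda x, a=A[i], y=b[i]: a * (a @ x - y) for i in range(5)]
        hessians = [lambda x, a=A[i]: np.outer(a, a) for i in range(5)]
        x_min = ciag(grads, hessians, np.zeros(3), step_size=0.02, n_epochs=100)

    A practical implementation would maintain the aggregated sum incrementally (updating only the term of the visited component) rather than recomputing it each iteration, which is what keeps the per-iteration cost close to that of {\sf IAG}.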

    Semistochastic Quadratic Bound Methods

    Partition functions arise in a variety of settings, including conditional random fields, logistic regression, and latent Gaussian models. In this paper, we consider semistochastic quadratic bound (SQB) methods for maximum likelihood inference based on partition function optimization. Batch methods based on the quadratic bound were recently proposed for this class of problems and performed favorably in comparison to state-of-the-art techniques. Semistochastic methods fall in between batch algorithms, which use all the data, and stochastic gradient type methods, which use small random selections at each iteration. We build semistochastic quadratic-bound-based methods and prove both global convergence (to a stationary point) under very weak assumptions and a linear convergence rate under stronger assumptions on the objective. To make the proposed methods faster and more stable, we consider inexact subproblem minimization and batch-size selection schemes. The efficacy of SQB methods is demonstrated via comparison with several state-of-the-art techniques on commonly used datasets. Comment: 11 pages, 1 figure.
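
    The toy sketch below isolates only the "semistochastic" ingredient described above: the search direction is estimated on a mini-batch whose size grows geometrically, so the method interpolates between a purely stochastic and a full-batch algorithm. The growth schedule and all names are assumptions; the paper's method would additionally minimize a quadratic bound on each batch rather than take the plain gradient step used here.

        # Illustrative sketch of a growing-batch ("semistochastic") descent loop.
        import numpy as np

        def semistochastic_descent(batch_grad, n_samples, x0, step_size, n_iters, growth=1.1):
            """batch_grad(x, idx) returns the average gradient over the samples in idx."""
            rng = np.random.default_rng(0)
            x = x0.copy()
            batch = 1.0
            for _ in range(n_iters):
                b = min(int(np.ceil(batch)), n_samples)             # current mini-batch size
                idx = rng.choice(n_samples, size=b, replace=False)  # random selection of samples
                x = x - step_size * batch_grad(x, idx)              # step on the sampled estimate
                batch *= growth                                     # geometric batch-size growth
            return x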

    A Lower Bound for the Optimization of Finite Sums

    This paper presents a lower bound for optimizing a finite sum of $n$ functions, where each function is $L$-smooth and the sum is $\mu$-strongly convex. We show that no algorithm can reach an error $\epsilon$ in minimizing all functions from this class in fewer than $\Omega(n + \sqrt{n(\kappa-1)}\log(1/\epsilon))$ iterations, where $\kappa = L/\mu$ is a surrogate condition number. We then compare this lower bound to upper bounds for recently developed methods specializing to this setting. When the functions involved in this sum are not arbitrary but based on i.i.d. random data, we further contrast these complexity results with those of optimal first-order methods that directly optimize the sum. The conclusion we draw is that a lot of caution is necessary for an accurate comparison, and we identify machine learning scenarios where the new methods help computationally. Comment: Added an erratum; we are currently working on extending the result to randomized algorithms.
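
    For context, the block below places the lower bound next to standard rates from the finite-sum literature, all counted in individual component-gradient evaluations. These reference figures are well-known results quoted for orientation; they are not claims taken from the paper above.

        % Requires amsmath. Illustrative comparison only.
        \begin{align*}
          \text{lower bound (this paper):}\quad
            & \Omega\bigl(n + \sqrt{n(\kappa-1)}\,\log(1/\epsilon)\bigr) \\
          \text{accelerated full-gradient method:}\quad
            & O\bigl(n\sqrt{\kappa}\,\log(1/\epsilon)\bigr) \\
          \text{SAG / SVRG / SAGA:}\quad
            & O\bigl((n+\kappa)\,\log(1/\epsilon)\bigr) \\
          \text{accelerated incremental methods:}\quad
            & O\bigl((n+\sqrt{n\kappa})\,\log(1/\epsilon)\bigr)
        \end{align*}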

    Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods

    Our goal is to improve variance reducing stochastic methods through better control variates. We first propose a modification of SVRG which uses the Hessian to track gradients over time, rather than to recondition, increasing the correlation of the control variates and leading to faster theoretical convergence close to the optimum. We then propose accurate and computationally efficient approximations to the Hessian, using both a diagonal and a low-rank matrix. Finally, we demonstrate the effectiveness of our method on a wide range of problems. Comment: 17 pages, 2 figures, 1 table.
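
    A minimal sketch of the tracking idea, assuming the standard SVRG template: the usual control variate is replaced by a first-order Taylor model of each component gradient around the snapshot, so the correction stays correlated with the current gradient as the iterate moves. The names, the sampling scheme, and the use of exact component Hessians (instead of the diagonal or low-rank approximations mentioned in the abstract) are simplifications for illustration, not the authors' implementation.

        # Illustrative sketch: SVRG inner loop with a Hessian-corrected (gradient-tracking) control variate.
        import numpy as np

        def svrg_hessian_tracked(grads, hessians, x0, step_size, n_epochs, inner_len):
            n = len(grads)
            rng = np.random.default_rng(0)
            x = x0.copy()
            for _ in range(n_epochs):
                x_snap = x.copy()                                 # snapshot point
                g_snap = [grads[i](x_snap) for i in range(n)]     # component gradients at snapshot
                H_snap = [hessians[i](x_snap) for i in range(n)]  # component Hessians at snapshot
                g_full = sum(g_snap) / n
                H_full = sum(H_snap) / n
                for _ in range(inner_len):
                    i = rng.integers(n)
                    d = x - x_snap
                    # Taylor-model control variate: tracks grad f_i(x) as x drifts from the snapshot.
                    cv_i = g_snap[i] + H_snap[i] @ d
                    cv_full = g_full + H_full @ d
                    x = x - step_size * (grads[i](x) - cv_i + cv_full)
            return x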