Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization
We present the first accelerated randomized algorithm for solving linear
systems in Euclidean spaces. One essential problem of this type is the matrix
inversion problem. In particular, our algorithm can be specialized to invert
positive definite matrices in such a way that all iterates (approximate
solutions) generated by the algorithm are positive definite matrices
themselves. This opens the way for many applications in the field of
optimization and machine learning. As an application of our general theory, we
develop the {\em first accelerated (deterministic and stochastic) quasi-Newton
updates}. Our updates lead to provably more aggressive approximations of the
inverse Hessian and to speed-ups over classical non-accelerated rules in
numerical experiments. Experiments with empirical risk minimization show that
our rules can accelerate training of machine learning models.
Comment: 37 pages, 32 figures, 3 algorithms
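
The abstract gives no pseudocode, so the following is a minimal NumPy sketch of the classical, non-accelerated sketch-and-project update for inverting a symmetric positive definite matrix, the kind of rule the paper accelerates. The Gaussian sketching matrix, the sketch size, and the function name are illustrative assumptions of this sketch; the accelerated rule itself is not reproduced here.

```python
import numpy as np

def sketch_and_project_inverse(A, num_iters=500, sketch_size=5, seed=0):
    """Non-accelerated sketch-and-project iteration for approximating the
    inverse of a symmetric positive definite matrix A. Each iterate X stays
    positive definite, mirroring the property highlighted in the abstract."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    X = np.eye(n)  # initial guess for A^{-1}
    for _ in range(num_iters):
        S = rng.standard_normal((n, sketch_size))  # Gaussian sketch (an assumption)
        M = np.linalg.solve(S.T @ A @ S, S.T)      # (S^T A S)^{-1} S^T
        P = S @ M                                  # S (S^T A S)^{-1} S^T
        R = np.eye(n) - P @ A                      # I - S (S^T A S)^{-1} S^T A
        X = P + R @ X @ R.T                        # BFGS-like block update
    return X

# Usage: approximate the inverse of a random SPD matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((20, 20))
A = B @ B.T + 20 * np.eye(20)
X = sketch_and_project_inverse(A)
print(np.linalg.norm(X @ A - np.eye(20)))  # residual should be small
```

The update X ← P + R X Rᵀ writes the new iterate as a sum of two positive semidefinite terms, which is why every iterate remains positive definite as the abstract claims.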
Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods
Our goal is to improve variance reducing stochastic methods through better
control variates. We first propose a modification of SVRG which uses the
Hessian to track gradients over time, rather than to recondition, increasing
the correlation of the control variates and leading to faster theoretical
convergence close to the optimum. We then propose accurate and computationally
efficient approximations to the Hessian, both using a diagonal and a low-rank
matrix. Finally, we demonstrate the effectiveness of our method on a wide range
of problems.
Comment: 17 pages, 2 figures, 1 table
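
As a point of reference for the control-variate idea, here is a minimal sketch of an SVRG-style loop whose control variate tracks per-example gradients through a first-order Taylor expansion with the Hessian, applied to logistic regression. It uses exact per-example Hessians; the objective, step size, and function name are assumptions of this sketch, and the paper's diagonal and low-rank Hessian approximations are omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hessian_tracked_svrg(A, b, lr=0.1, epochs=10, seed=0):
    """SVRG variant whose control variate tracks grad f_i(w) via a Taylor
    expansion around the snapshot, rather than freezing it there.
    A: (n, d) feature matrix; b: (n,) labels in {0, 1}."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        p = sigmoid(A @ w_snap)
        full_grad = A.T @ (p - b) / n
        # Mean Hessian at the snapshot: (1/n) sum_i p_i (1 - p_i) a_i a_i^T.
        mean_hess = (A * (p * (1 - p))[:, None]).T @ A / n
        for _ in range(n):
            i = rng.integers(n)
            a_i = A[i]
            g_i = (sigmoid(a_i @ w) - b[i]) * a_i       # grad f_i(w)
            p_i = p[i]
            g_i_snap = (p_i - b[i]) * a_i               # grad f_i(w_snap)
            h_i = p_i * (1 - p_i) * np.outer(a_i, a_i)  # Hessian of f_i at snapshot
            delta = w - w_snap
            cv = g_i_snap + h_i @ delta                 # tracked control variate
            # Adding back the mean of the control variate keeps the
            # gradient estimate unbiased.
            g = g_i - cv + full_grad + mean_hess @ delta
            w -= lr * g
    return w
```

Because the control variate averages to full_grad + mean_hess @ delta over i, the estimate g stays unbiased, while the Taylor term keeps the control variate correlated with grad f_i(w) even as w moves away from the snapshot, which is the source of the improved variance reduction described in the abstract.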