Global convergence rate analysis of unconstrained optimization methods based on probabilistic models
We present global convergence rates for a line-search method that is based
on random first-order models and directions whose quality is ensured only with
certain probability. We show that, in terms of the order of accuracy, the
evaluation complexity of such a method is the same as that of its counterparts
that use deterministic, accurate models; the use of probabilistic models only
increases the complexity by a constant that depends on the probability of the
models being good. We particularize and improve these results in the convex
and strongly convex cases.
We also analyze a probabilistic cubic regularization variant that allows
approximate probabilistic second-order models and show improved complexity
bounds compared to probabilistic first-order methods; again, as a function of
the accuracy, the probabilistic cubic regularization bounds are of the same
(optimal) order as in the deterministic case.
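As a rough illustration of the kind of method analyzed here, the sketch below
runs an Armijo-type line search along directions supplied by a gradient
estimate that is only accurate with some probability; the names and parameter
values (grad_est, theta, backtrack, and so on) are illustrative placeholders,
not the paper's actual algorithm or constants.

```python
import numpy as np

def probabilistic_line_search(f, grad_est, x0, alpha0=1.0, theta=1e-4,
                              backtrack=0.5, forward=2.0, max_iter=200,
                              tol=1e-6):
    """Line search driven by a random first-order model (illustrative sketch).

    grad_est(x) returns a gradient estimate that may be wrong; a step is
    accepted only if it yields sufficient decrease of the true objective f,
    otherwise the step size is shrunk and the model is redrawn.
    """
    x, alpha = np.asarray(x0, dtype=float), alpha0
    for _ in range(max_iter):
        g = grad_est(x)                       # possibly inaccurate model gradient
        if np.linalg.norm(g) < tol:
            break
        d = -g                                # model-based descent direction
        # Armijo-type sufficient-decrease test against the true objective.
        if f(x + alpha * d) <= f(x) - theta * alpha * np.dot(g, g):
            x = x + alpha * d                 # successful step: move, grow alpha
            alpha = min(forward * alpha, alpha0)
        else:
            alpha = backtrack * alpha         # unsuccessful step: shrink alpha
    return x

# Toy usage: gradients of a quadratic, corrupted with probability 0.2.
rng = np.random.default_rng(0)
f = lambda x: 0.5 * np.dot(x, x)
def grad_est(x):
    g = x.copy()
    if rng.random() > 0.8:                    # a "bad" model draw
        g += rng.normal(scale=1.0, size=x.shape)
    return g
x_star = probabilistic_line_search(f, grad_est, x0=np.ones(5))
```

The structural point mirrors the abstract: an unsuccessful step caused by a
bad model only shrinks the step size, and since a good model occurs with
probability bounded away from zero, progress resumes after a bounded expected
number of trials, which is why the complexity degrades only by a constant
factor.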
A Generic Approach for Escaping Saddle Points
A central challenge to using first-order methods for optimizing nonconvex
problems is the presence of saddle points. First-order methods often get stuck
at saddle points, greatly deteriorating their performance. Typically, to escape
from saddles one has to use second-order methods. However, most works on
second-order methods rely extensively on expensive Hessian-based computations,
making them impractical in large-scale settings. To tackle this challenge, we
introduce a generic framework that minimizes Hessian-based computations while
still provably converging to second-order critical points. Our framework
carefully alternates between a first-order and a second-order subroutine,
using the latter only close to saddle points, and yields convergence results
competitive with the state of the art. Empirical results suggest that our
strategy also enjoys good practical performance.
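A minimal sketch of the alternation the abstract describes, assuming access to
gradient and Hessian oracles: take cheap gradient steps while the gradient is
large, and only near a candidate saddle compute second-order information to
find a negative-curvature escape direction. The full eigendecomposition below
is purely illustrative; a large-scale variant would approximate the smallest
eigenpair with Hessian-vector products, which is the sense in which
Hessian-based computation is minimized.

```python
import numpy as np

def escape_saddle(f, grad, hess, x0, eta=0.1, g_tol=1e-3, curv_tol=1e-3,
                  step=0.1, max_iter=500):
    """Alternate first-order steps with a second-order escape move (sketch).

    Gradient descent runs while the gradient is large; only near a candidate
    saddle (small gradient) do we pay for one Hessian computation and, if
    negative curvature exists, step along the corresponding eigenvector.
    Parameter names and values are illustrative, not from the paper.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) > g_tol:
            x = x - eta * g                   # cheap first-order phase
            continue
        # Near a first-order stationary point: check second-order conditions.
        lam, V = np.linalg.eigh(hess(x))      # one (expensive) Hessian solve
        if lam[0] >= -curv_tol:
            return x                          # approx. second-order critical point
        v = V[:, 0]                           # most negative curvature direction
        # The sign of an eigenvector is arbitrary; pick the descent orientation.
        if f(x + step * v) > f(x - step * v):
            v = -v
        x = x + step * v                      # escape the saddle
    return x
```

Gating the second-order subroutine on a small gradient norm is what keeps the
scheme practical: Hessian information is computed only in the rare iterations
where first-order progress has stalled at a candidate saddle.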
- …