35 research outputs found

Global convergence rate analysis of unconstrained optimization methods based on probabilistic models

Author: Cartis Coralia
Scheinberg Katya
Publication date: 05/01/2017

We present global convergence rates for a line-search method which is based on random first-order models and directions whose quality is ensured only with certain probability. We show that in terms of the order of the accuracy, the evaluation complexity of such a method is the same as its counterparts that use deterministic accurate models; the use of probabilistic models only increases the complexity by a constant, which depends on the probability of the models being good. We particularize and improve these results in the convex and strongly convex case. We also analyze a probabilistic cubic regularization variant that allows approximate probabilistic second-order models and show improved complexity bounds compared to probabilistic first-order methods; again, as a function of the accuracy, the probabilistic cubic regularization bounds are of the same (optimal) order as for the deterministic case.

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Global convergence rate analysis of unconstrained optimization methods based on probabilistic models

Author: Cartis Coralia
Scheinberg Katya
Publication venue: Unspecified

We present global convergence rates for a line-search method which is based on random first-order models and directions whose quality is ensured only with certain probability. We show that in terms of the order of the accuracy, the evaluation complexity of such a method is the same as its counterparts that use deterministic accurate models; the use of probabilistic models only increases the complexity by a constant, which depends on the probability of the models being good. We particularize and improve these results in the convex and strongly convex case. We also analyse a probabilistic cubic regularization variant that allows approximate probabilistic second-order models and show improved complexity bounds compared to probabilistic first-order methods; again, as a function of the accuracy, the probabilistic cubic regularization bounds are of the same (optimal) order as for the deterministic case.

A stochastic cubic regularisation method with inexact function evaluations and random derivatives for finite sum minimisation

Author: Bellavia Stefania
Gurioli Gianmarco
Morini Benedetta
Toint Philippe L.
Publication date: 01/01/2020

Repository of the University of Namur

A Generic Approach for Escaping Saddle points

Author: Bach Francis
Poczos Barnabas
Reddi Sashank J
Salakhutdinov Ruslan
Smola Alexander J
Sra Suvrit
Zaheer Manzil
Publication date: 05/09/2017

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive with the state-of-the-art. Empirical results suggest that our strategy also enjoys good practical performance.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Adaptive Regularization for Nonconvex Optimization Using Inexact Function Values and Randomly Perturbed Derivatives

Author: Bellavia S.
Gurioli G.
Morini B.
Toint Ph. L.
Publication date: 13/05/2020

A regularization algorithm allowing random noise in derivatives and inexact function values is proposed for computing approximate local critical points of any order for smooth unconstrained optimization problems. For an objective function with Lipschitz continuous $p$-th derivative and given an arbitrary optimality order $q \leq p$, it is shown that this algorithm will, in expectation, compute such a point in at most $O\left(\left(\min_{j\in\{1,\ldots,q\}}\epsilon_j\right)^{-\frac{p+1}{p-q+1}}\right)$ inexact evaluations of $f$ and its derivatives whenever $q\in\{1,2\}$, where $\epsilon_j$ is the tolerance for $j$th order accuracy. This bound becomes at most $O\left(\left(\min_{j\in\{1,\ldots,q\}}\epsilon_j\right)^{-\frac{q(p+1)}{p}}\right)$ inexact evaluations if $q>2$ and all derivatives are Lipschitz continuous. Moreover these bounds are sharp in the order of the accuracy tolerances. An extension to convexly constrained problems is also outlined. Comment: 22 pages
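
As an illustration only (not part of the abstract), the first bound can be instantiated for a model of degree $p = 2$, simply by substituting into the exponent $\frac{p+1}{p-q+1}$ stated above. The choice $p = 2$ is an assumption made here for concreteness:

\[
q = 1:\quad \frac{p+1}{p-q+1} = \frac{3}{2}
\;\Longrightarrow\;
O\left(\epsilon_1^{-3/2}\right),
\qquad
q = 2:\quad \frac{p+1}{p-q+1} = \frac{3}{1} = 3
\;\Longrightarrow\;
O\left(\left(\min\{\epsilon_1,\epsilon_2\}\right)^{-3}\right).
\]

Under this substitution, requesting second-order rather than first-order accuracy with the same model degree doubles the exponent in the evaluation bound.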

arXiv.org e-Print Archive

Repository of the University of Namur

A Subsampling Line-Search Method with Second-Order Results

Author: Bergou El-houcine
Diouane Youssef
Kunc Vladimir
Kungurtsev Vyacheslav
Royer Clément W.
Publication date: 23/03/2020

In many contemporary optimization problems such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization such as a trust region or a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. On top of all that, there has been an increasing interest in nonconvex formulations of data-related problems, such as training deep learning models. For such instances, one aims at developing methods that converge to second-order stationary points quickly, i.e., escape saddle points efficiently. This is particularly delicate to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model, and for a sufficiently large sample, a similar amount of reduction holds for the true objective. By using probabilistic reasoning, we can then obtain worst-case complexity guarantees for our framework, leading us to discuss appropriate notions of stationarity in a subsampling context. Our analysis encompasses the deterministic regime, and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice.

arXiv.org e-Print Archive

PolyPublie