35 research outputs found
Global convergence rate analysis of unconstrained optimization methods based on probabilistic models
We present global convergence rates for a line-search method which is based
on random first-order models and directions whose quality is ensured only with
a certain probability. We show that, in terms of the order of the accuracy, the
evaluation complexity of such a method is the same as its counterparts that use
deterministic accurate models; the use of probabilistic models only increases
the complexity by a constant, which depends on the probability of the models
being good. We particularize and improve these results in the convex and
strongly convex cases.
We also analyze a probabilistic cubic regularization variant that allows
approximate probabilistic second-order models and show improved complexity
bounds compared to probabilistic first-order methods; again, as a function of
the accuracy, the probabilistic cubic regularization bounds are of the same
(optimal) order as for the deterministic case.
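The line-search idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the objective `f`, the helper `random_gradient`, the probability `p_good`, and the parameters `gamma`, `theta`, and `alpha_max` are all assumptions made for illustration. The sketch only shows the general pattern of testing sufficient decrease with a gradient model that is accurate only with some probability, then enlarging the step size after a successful step and shrinking it after an unsuccessful one.

```python
import random

def f(x):
    # Hypothetical smooth convex test objective, chosen only for illustration.
    return x[0] ** 2 + 10.0 * x[1] ** 2

def random_gradient(x, rng, p_good=0.8):
    # Hypothetical random first-order model: exact with probability p_good,
    # otherwise an arbitrary vector with no accuracy guarantee.
    if rng.random() < p_good:
        return [2.0 * x[0], 20.0 * x[1]]
    return [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]

def probabilistic_line_search(x0, iters=200, alpha=1.0, gamma=0.5,
                              theta=0.1, alpha_max=10.0, seed=0):
    rng = random.Random(seed)
    x = list(x0)
    history = [f(x)]
    for _ in range(iters):
        g = random_gradient(x, rng)
        trial = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_norm_sq = sum(gi * gi for gi in g)
        # Sufficient decrease is checked with exact function values.
        if f(trial) <= f(x) - theta * alpha * g_norm_sq:
            x = trial                              # successful: accept the step
            alpha = min(alpha_max, alpha / gamma)  # and enlarge the step size
        else:
            alpha *= gamma                         # unsuccessful: shrink it
        history.append(f(x))
    return x, history

x_final, history = probabilistic_line_search([3.0, -4.0])
```

Because a step is accepted only when the exact function value decreases, a bad model can waste an iteration but cannot increase the objective, which is one intuition for why inaccurate models would affect only the constant in the complexity bound.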
A Generic Approach for Escaping Saddle points
A central challenge to using first-order methods for optimizing nonconvex
problems is the presence of saddle points. First-order methods often get stuck
at saddle points, greatly deteriorating their performance. Typically, to escape
from saddles one has to use second-order methods. However, most works on
second-order methods rely extensively on expensive Hessian-based computations,
making them impractical in large-scale settings. To tackle this challenge, we
introduce a generic framework that minimizes Hessian-based computations while
at the same time provably converging to second-order critical points. Our
framework carefully alternates between a first-order and a second-order
subroutine, using the latter only close to saddle points, and yields
convergence results competitive with the state of the art. Empirical results
suggest that our strategy also performs well in practice.
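A minimal sketch of the alternation described in this abstract, under stated assumptions, might look as follows. The two-dimensional test function, the closed-form 2x2 eigen-solver `min_eigpair`, the step rule `sigma * abs(lam)`, and the tolerances `eps_g` and `eps_h` are all hypothetical choices, not the paper's subroutines. The point is only the control flow: take gradient steps while the gradient is large, and consult the Hessian only when the gradient is small.

```python
import math

def f(p):
    # Hypothetical test function: saddle at (0, 0), minima at (0, +/-sqrt(2)).
    x, y = p
    return x * x - y * y + 0.25 * y ** 4

def grad(p):
    x, y = p
    return (2.0 * x, -2.0 * y + y ** 3)

def hess(p):
    _, y = p
    return (2.0, 0.0, -2.0 + 3.0 * y * y)  # entries (a, b, c) of [[a, b], [b, c]]

def min_eigpair(a, b, c):
    # Smallest eigenvalue and a unit eigenvector of a symmetric 2x2 matrix.
    lam = 0.5 * (a + c) - math.hypot(0.5 * (a - c), b)
    if b != 0.0:
        v = (lam - c, b)
    else:
        v = (1.0, 0.0) if a <= c else (0.0, 1.0)
    n = math.hypot(v[0], v[1])
    return lam, (v[0] / n, v[1] / n)

def escape_saddles(p0, eta=0.1, sigma=0.5, eps_g=1e-4, eps_h=1e-3,
                   max_iter=1000):
    p = tuple(p0)
    hess_calls = 0
    for _ in range(max_iter):
        g = grad(p)
        if math.hypot(g[0], g[1]) > eps_g:
            # First-order subroutine: plain gradient step.
            p = (p[0] - eta * g[0], p[1] - eta * g[1])
            continue
        # Gradient is small: only now pay for Hessian information.
        hess_calls += 1
        lam, v = min_eigpair(*hess(p))
        if lam >= -eps_h:
            break  # approximate second-order critical point
        if g[0] * v[0] + g[1] * v[1] > 0.0:
            v = (-v[0], -v[1])  # orient v so it is not an ascent direction
        step = sigma * abs(lam)
        # Second-order subroutine: move along the negative curvature direction.
        p = (p[0] + step * v[0], p[1] + step * v[1])
    return p, hess_calls

p_final, hess_calls = escape_saddles((1.0, 0.0))
```

Started at (1, 0), plain gradient descent on this function would converge to the saddle at the origin, since the second coordinate never moves. The sketch leaves the saddle with a single negative curvature step and evaluates the Hessian only a handful of times.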
Adaptive Regularization for Nonconvex Optimization Using Inexact Function Values and Randomly Perturbed Derivatives
A regularization algorithm allowing random noise in derivatives and inexact
function values is proposed for computing approximate local critical points of
any order for smooth unconstrained optimization problems. For an objective
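function with Lipschitz continuous $p$-th derivative and given an arbitrary
optimality order $q \leq p$, it is shown that this algorithm will, in
expectation, compute such a point in at most
$O\left(\left(\min_{j\in\{1,\ldots,q\}}\epsilon_j\right)^{-\frac{p+1}{p-q+1}}\right)$
inexact evaluations of $f$ and its derivatives whenever $q\in\{1,2\}$, where
$\epsilon_j$ is the tolerance for $j$th order accuracy. This bound becomes at
most
$O\left(\left(\min_{j\in\{1,\ldots,q\}}\epsilon_j\right)^{-\frac{q(p+1)}{p}}\right)$
inexact evaluations if $q>2$ and all derivatives are Lipschitz continuous.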
Moreover these bounds are sharp in the order of the accuracy tolerances. An
extension to convexly constrained problems is also outlined.
A Subsampling Line-Search Method with Second-Order Results
In many contemporary optimization problems such as those arising in machine
learning, it can be computationally challenging or even infeasible to evaluate
an entire function or its derivatives. This motivates the use of stochastic
algorithms that sample problem data, which can jeopardize the guarantees
obtained through classical globalization techniques in optimization such as a
trust region or a line search. Using subsampled function values is particularly
challenging for the latter strategy, which relies upon multiple evaluations. On
top of all that, there has been increasing interest in nonconvex
formulations of data-related problems, such as training deep learning models.
For such instances, one aims at developing methods that converge to
second-order stationary points quickly, i.e., escape saddle points efficiently.
This is particularly delicate to ensure when one only accesses subsampled
approximations of the objective and its derivatives.
In this paper, we describe a stochastic algorithm based on negative curvature
and Newton-type directions that are computed for a subsampling model of the
objective. A line-search technique is used to enforce suitable decrease for
this model, and for a sufficiently large sample, a similar amount of reduction
holds for the true objective. By using probabilistic reasoning, we can then
obtain worst-case complexity guarantees for our framework, leading us to
discuss appropriate notions of stationarity in a subsampling context. Our
analysis encompasses the deterministic regime, and allows us to identify
sampling requirements for second-order line-search paradigms. As we illustrate
through real data experiments, these worst-case estimates need not be satisfied
for our method to be competitive with first-order strategies in practice.
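The subsampling line-search mechanism can be sketched in a heavily simplified form. The sketch below is an assumption-laden illustration rather than the method of the paper: it uses only a subsampled gradient direction (no negative curvature or Newton-type directions), a synthetic one-dimensional least-squares problem, and hypothetical names and parameters (`make_data`, `batch`, `theta`, `alpha0`). What it does show is that the backtracking test is applied to the subsampled model, not to the full objective.

```python
import random

def make_data(n=200, seed=0):
    # Synthetic data (an assumption): targets scattered around slope 2.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        data.append((a, 2.0 * a + rng.gauss(0.0, 0.1)))
    return data

def loss(w, samples):
    return sum(0.5 * (a * w - b) ** 2 for a, b in samples) / len(samples)

def grad(w, samples):
    return sum((a * w - b) * a for a, b in samples) / len(samples)

def subsampled_line_search(data, w0=0.0, iters=100, batch=50,
                           alpha0=4.0, theta=1e-4, seed=1):
    rng = random.Random(seed)
    w = w0
    for _ in range(iters):
        sample = rng.sample(data, batch)  # fresh subsample at every iteration
        g = grad(w, sample)
        alpha = alpha0
        # Backtrack until the subsampled model decreases sufficiently;
        # the lower bound on alpha guards against stalling at tiny gradients.
        while (loss(w - alpha * g, sample)
               > loss(w, sample) - theta * alpha * g * g) and alpha > 1e-10:
            alpha *= 0.5
        w -= alpha * g
    return w

data = make_data()
w_final = subsampled_line_search(data)
```

Decrease of the subsampled model does not by itself guarantee decrease of the full objective. This gap is why a sufficiently large sample size is needed for the reduction to carry over to the true objective.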