
    Global convergence rate analysis of unconstrained optimization methods based on probabilistic models

    We present global convergence rates for a line-search method that is based on random first-order models and directions whose quality is ensured only with certain probability. We show that, in terms of the order of the accuracy, the evaluation complexity of such a method is the same as its counterparts that use deterministic accurate models; the use of probabilistic models only increases the complexity by a constant, which depends on the probability of the models being good. We particularize and improve these results in the convex and strongly convex cases. We also analyze a probabilistic cubic regularization variant that allows approximate probabilistic second-order models and show improved complexity bounds compared to probabilistic first-order methods; again, as a function of the accuracy, the probabilistic cubic regularization bounds are of the same (optimal) order as for the deterministic case.
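    A minimal sketch of the kind of line search the abstract describes, assuming a toy objective and a gradient oracle that is accurate only with some probability; the helper names (noisy_gradient, probabilistic_line_search) and all constants are illustrative, not taken from the paper.

```python
import numpy as np

# Sketch of a line search driven by a probabilistic first-order model: the
# gradient oracle is accurate only with probability p_good, and the step size
# is grown or shrunk depending on whether sufficient decrease holds.
# Names and constants are illustrative, not from the paper.

def noisy_gradient(grad_f, x, p_good=0.8, noise_scale=1.0, rng=None):
    """Return the true gradient with probability p_good, else a perturbed one."""
    rng = np.random.default_rng() if rng is None else rng
    g = grad_f(x)
    if rng.random() > p_good:
        g = g + noise_scale * rng.standard_normal(g.shape)
    return g

def probabilistic_line_search(f, grad_f, x0, alpha0=1.0, theta=1e-4,
                              shrink=0.5, grow=2.0, max_iter=200, tol=1e-6):
    x, alpha = np.asarray(x0, dtype=float), alpha0
    rng = np.random.default_rng(0)
    for _ in range(max_iter):
        g = noisy_gradient(grad_f, x, rng=rng)   # possibly inaccurate model gradient
        d = -g                                   # model-based descent direction
        # Armijo-type sufficient-decrease test on the true function value.
        if f(x + alpha * d) <= f(x) - theta * alpha * np.dot(g, g):
            x = x + alpha * d
            alpha *= grow                        # successful step: be more ambitious
        else:
            alpha *= shrink                      # unsuccessful step: shrink the step size
        if np.linalg.norm(grad_f(x)) < tol:
            break
    return x

if __name__ == "__main__":
    f = lambda x: 0.5 * np.dot(x, x)             # toy strongly convex objective
    grad_f = lambda x: x
    print(probabilistic_line_search(f, grad_f, np.ones(5)))
```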

    A Generic Approach for Escaping Saddle Points

    A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from saddles one has to use second-order methods. However, most works on second-order methods rely extensively on expensive Hessian-based computations, making them impractical in large-scale settings. To tackle this challenge, we introduce a generic framework that minimizes Hessian-based computations while at the same time provably converging to second-order critical points. Our framework carefully alternates between a first-order and a second-order subroutine, using the latter only close to saddle points, and yields convergence results competitive with the state-of-the-art. Empirical results suggest that our strategy also enjoys good practical performance.
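    A minimal sketch of the alternation described above, assuming explicit gradient and Hessian oracles for a toy problem; the thresholds and the function escape_saddle_framework are illustrative placeholders, not the authors' algorithm.

```python
import numpy as np

# Sketch: a cheap first-order subroutine runs while the gradient is large, and
# a Hessian-based subroutine is invoked only near stationary points, to detect
# and escape strict saddles along a negative curvature direction.
# Thresholds and step sizes are illustrative.

def escape_saddle_framework(f, grad_f, hess_f, x0, eta=0.1, step_nc=0.5,
                            eps_g=1e-4, eps_h=1e-3, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) > eps_g:
            x = x - eta * g                      # first-order subroutine
            continue
        # Second-order subroutine, used only when the gradient is small.
        H = hess_f(x)
        lam, V = np.linalg.eigh(H)
        if lam[0] >= -eps_h:
            return x                             # approximate second-order critical point
        v = V[:, 0]                              # most negative curvature direction
        v = v if np.dot(g, v) <= 0 else -v       # keep it a non-ascent direction
        x = x + step_nc * v
    return x

if __name__ == "__main__":
    # Toy nonconvex function with a saddle at the origin: f(x, y) = x^2 - y^2 + y^4.
    f = lambda z: z[0]**2 - z[1]**2 + z[1]**4
    grad_f = lambda z: np.array([2*z[0], -2*z[1] + 4*z[1]**3])
    hess_f = lambda z: np.array([[2.0, 0.0], [0.0, -2.0 + 12*z[1]**2]])
    print(escape_saddle_framework(f, grad_f, hess_f, np.array([1e-6, 1e-6])))
```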

    Adaptive Regularization for Nonconvex Optimization Using Inexact Function Values and Randomly Perturbed Derivatives

    A regularization algorithm allowing random noise in derivatives and inexact function values is proposed for computing approximate local critical points of any order for smooth unconstrained optimization problems. For an objective function with Lipschitz continuous $p$-th derivative and given an arbitrary optimality order $q \leq p$, it is shown that this algorithm will, in expectation, compute such a point in at most $O\left(\left(\min_{j\in\{1,\ldots,q\}}\epsilon_j\right)^{-\frac{p+1}{p-q+1}}\right)$ inexact evaluations of $f$ and its derivatives whenever $q\in\{1,2\}$, where $\epsilon_j$ is the tolerance for $j$-th order accuracy. This bound becomes at most $O\left(\left(\min_{j\in\{1,\ldots,q\}}\epsilon_j\right)^{-\frac{q(p+1)}{p}}\right)$ inexact evaluations if $q>2$ and all derivatives are Lipschitz continuous. Moreover, these bounds are sharp in the order of the accuracy tolerances. An extension to convexly constrained problems is also outlined.
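    A small numerical illustration of how the stated worst-case evaluation counts scale with the accuracy tolerances; the exponents come from the abstract, the constants are suppressed, and the helper evaluation_bound is purely illustrative.

```python
# Illustration of the scaling of the worst-case evaluation bounds with the
# tolerances (constants suppressed). The exponents are taken directly from
# the abstract; everything else is just arithmetic.

def evaluation_bound(eps, p, q):
    """O(eps_min^{-(p+1)/(p-q+1)}) for q in {1, 2}, O(eps_min^{-q(p+1)/p}) for q > 2."""
    eps_min = min(eps[:q])                       # eps = (eps_1, ..., eps_q)
    if q in (1, 2):
        exponent = (p + 1) / (p - q + 1)
    else:
        exponent = q * (p + 1) / p
    return eps_min ** (-exponent)

if __name__ == "__main__":
    # With p = 2 and q = 1 this recovers the familiar O(eps^{-3/2}) scaling of
    # cubic regularization for approximate first-order points.
    print(evaluation_bound((1e-3,), p=2, q=1))        # ~ 1e-3 ** (-1.5)
    print(evaluation_bound((1e-3, 1e-3), p=2, q=2))   # ~ 1e-3 ** (-3)
```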

    A Subsampling Line-Search Method with Second-Order Results

    In many contemporary optimization problems, such as those arising in machine learning, it can be computationally challenging or even infeasible to evaluate an entire function or its derivatives. This motivates the use of stochastic algorithms that sample problem data, which can jeopardize the guarantees obtained through classical globalization techniques in optimization such as a trust region or a line search. Using subsampled function values is particularly challenging for the latter strategy, which relies upon multiple evaluations. On top of that, there has been increasing interest in nonconvex formulations of data-related problems, such as training deep learning models. For such instances, one aims at developing methods that converge to second-order stationary points quickly, i.e., escape saddle points efficiently. This is particularly delicate to ensure when one only accesses subsampled approximations of the objective and its derivatives. In this paper, we describe a stochastic algorithm based on negative curvature and Newton-type directions that are computed for a subsampling model of the objective. A line-search technique is used to enforce suitable decrease for this model, and for a sufficiently large sample, a similar amount of reduction holds for the true objective. By using probabilistic reasoning, we can then obtain worst-case complexity guarantees for our framework, leading us to discuss appropriate notions of stationarity in a subsampling context. Our analysis encompasses the deterministic regime, and allows us to identify sampling requirements for second-order line-search paradigms. As we illustrate through real data experiments, these worst-case estimates need not be satisfied for our method to be competitive with first-order strategies in practice.
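    A minimal sketch of a subsampled line search that switches between a regularized Newton direction and a negative curvature direction, in the spirit of the abstract; sample sizes, thresholds, and the sufficient-decrease test are illustrative placeholders rather than the paper's exact conditions.

```python
import numpy as np

# Sketch for a finite-sum objective f(x) = (1/n) * sum_i f_i(x): each iteration
# draws a subsample, builds a subsampled gradient and Hessian, picks either a
# negative curvature or a regularized Newton direction, and backtracks until
# the subsampled model decreases sufficiently. All constants are illustrative.

def subsampled_step(grads, hesses, x, sample, eps_h=1e-3):
    """Build gradient/Hessian from a subsample and pick a search direction."""
    g = np.mean([grads[i](x) for i in sample], axis=0)
    H = np.mean([hesses[i](x) for i in sample], axis=0)
    lam, V = np.linalg.eigh(H)
    if lam[0] < -eps_h:
        d = V[:, 0]                                  # negative curvature direction
        d = d if np.dot(g, d) <= 0 else -d
    else:
        # Shift by 2*eps_h so the regularized Hessian is positive definite.
        d = np.linalg.solve(H + 2 * eps_h * np.eye(len(x)), -g)
    return g, d

def subsampling_line_search(fs, grads, hesses, x0, sample_size, theta=1e-4,
                            shrink=0.5, max_iter=100, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    n, x = len(fs), np.asarray(x0, dtype=float)
    f_sub = lambda z, s: np.mean([fs[i](z) for i in s])
    for _ in range(max_iter):
        sample = rng.choice(n, size=min(sample_size, n), replace=False)
        g, d = subsampled_step(grads, hesses, x, sample)
        alpha = 1.0
        # Backtrack on the subsampled model until sufficient decrease holds.
        while f_sub(x + alpha * d, sample) > f_sub(x, sample) + theta * alpha * np.dot(g, d):
            alpha *= shrink
            if alpha < 1e-8:
                break
        x = x + alpha * d
    return x

if __name__ == "__main__":
    # Toy finite sum of shifted quadratics; the minimizer is the mean shift.
    shifts = [np.array([1.0, -2.0]), np.array([-3.0, 0.5]), np.array([2.0, 2.0])]
    fs = [lambda x, s=s: 0.5 * np.dot(x - s, x - s) for s in shifts]
    grads = [lambda x, s=s: x - s for s in shifts]
    hesses = [lambda x, s=s: np.eye(2) for s in shifts]
    print(subsampling_line_search(fs, grads, hesses, np.zeros(2), sample_size=2))
```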