Error bounds, quadratic growth, and linear convergence of proximal methods
The proximal gradient algorithm for minimizing the sum of a smooth and a
nonsmooth convex function often converges linearly even without strong
convexity. One common reason is that a multiple of the step length at each
iteration may linearly bound the "error" -- the distance to the solution set.
We explain the observed linear convergence intuitively by proving the
equivalence of such an error bound to a natural quadratic growth condition. Our
approach generalizes to linear convergence analysis for proximal methods (of
Gauss-Newton type) for minimizing compositions of nonsmooth functions with
smooth mappings. We observe incidentally that short step-lengths in the
algorithm indicate near-stationarity, suggesting a reliable termination
criterion. Comment: 35 pages
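As a concrete sketch of the step-length termination idea described above, here is a minimal proximal gradient loop on a toy Lasso instance (the instance, names, and tolerances are ours for illustration, not the paper's experiments); a short step is used as the stopping test.

```python
import numpy as np

def prox_gradient(grad_f, prox_g, x0, step, tol=1e-8, max_iter=20000):
    """Proximal gradient method for min f(x) + g(x).

    Stops when the step length ||x_{k+1} - x_k|| is small: per the
    error-bound viewpoint, a short step certifies near-stationarity.
    """
    x = x0.copy()
    for _ in range(max_iter):
        x_new = prox_g(x - step * grad_f(x), step)
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

# Toy Lasso instance: f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
lam = 0.1

grad_f = lambda x: A.T @ (A @ x - b)
# soft-thresholding = proximal map of t*lam*||.||_1
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - lam * t, 0.0)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L with L = ||A||_2^2

x_star = prox_gradient(grad_f, soft, np.zeros(10), step)
```

On return, `x_star` is (approximately) a fixed point of the proximal gradient map, which is exactly the stationarity certificate the short step provides.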
The restricted strong convexity revisited: Analysis of equivalence to error bound and quadratic growth
Restricted strong convexity is an effective tool for deriving globally
linear convergence rates of descent methods in convex minimization. Recently,
the global error bound and quadratic growth properties have emerged as
competitors. In this paper, with the help of Ekeland's variational principle,
we show the equivalence between these three notions. To deal with convex
minimization over a closed convex set and structured convex optimization, we
propose a group of modified versions and a group of extended versions of these
three notions by using gradient mapping and proximal gradient mapping
separately, and prove that the equivalence for the modified and extended
versions still holds. Based on these equivalences, we establish new
asymptotically linear convergence results for the proximal gradient method.
Finally, we revisit the problem of minimizing the composition of an affine
mapping with a strongly convex differentiable function over a polyhedral set,
and obtain a strengthened property of the restricted strong convex type under
mild assumptions. Comment: 15 pages; accepted in Optimization Letters
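As a hypothetical reference point (not the paper's exact definitions, which may differ in detail), the three properties compared in this abstract are commonly stated, for a smooth convex function f with solution set X* and optimal value f*, as:

```latex
% quadratic growth (some \mu > 0):
f(x) - f^\star \;\ge\; \tfrac{\mu}{2}\,\mathrm{dist}(x, X^\star)^2
% error bound (some \kappa > 0):
\mathrm{dist}(x, X^\star) \;\le\; \kappa\,\|\nabla f(x)\|
% restricted strong convexity (some \nu > 0, \bar{x} = \mathrm{proj}_{X^\star}(x)):
\langle \nabla f(x),\; x - \bar{x} \rangle \;\ge\; \nu\,\mathrm{dist}(x, X^\star)^2
```

The modified and extended versions mentioned in the abstract replace the gradient above with the gradient mapping or proximal gradient mapping, respectively.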
On the R-superlinear convergence of the KKT residuals generated by the augmented Lagrangian method for convex composite conic programming
Due to the possible lack of primal-dual-type error bounds, the superlinear
convergence for the Karush-Kuhn-Tucker (KKT) residuals of the sequence generated
by the augmented Lagrangian method (ALM) for solving convex composite conic
programming (CCCP) has long been an outstanding open question. In this paper,
we aim to resolve this issue by first conducting convergence rate analysis for
the ALM with Rockafellar's stopping criteria under only a mild quadratic growth
condition on the dual of CCCP. More importantly, by further assuming that the
Robinson constraint qualification holds, we establish the R-superlinear
convergence of the KKT residuals of the iterative sequence under
easy-to-implement stopping criteria for the augmented Lagrangian subproblems.
Equipped with this discovery, we gain insightful interpretations on the
impressive numerical performance of several recently developed semismooth
Newton-CG based ALM solvers for solving linear and convex quadratic
semidefinite programming.
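To make the ALM iteration and its KKT residual concrete, here is a toy sketch on an equality-constrained quadratic, with the subproblems solved exactly (the problem, names, and constants are ours for illustration; the paper's convex composite conic setting is far more general):

```python
import numpy as np

def alm_eq(c, A, b, rho=10.0, iters=50):
    """Augmented Lagrangian method for min 0.5*||x - c||^2 s.t. Ax = b.

    Each subproblem min_x L_rho(x, y) is solved exactly (here it is the
    linear system (I + rho*A'A) x = c - A'y + rho*A'b), then the
    multiplier is updated: y <- y + rho*(Ax - b).
    """
    m, n = A.shape
    y = np.zeros(m)
    H = np.eye(n) + rho * A.T @ A
    for _ in range(iters):
        x = np.linalg.solve(H, c - A.T @ y + rho * A.T @ b)
        y = y + rho * (A @ x - b)
    kkt = max(np.linalg.norm(A @ x - b),        # primal feasibility
              np.linalg.norm(x - c + A.T @ y))  # Lagrangian stationarity
    return x, y, kkt

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 6))
b = rng.standard_normal(3)
c = rng.standard_normal(6)
x, y, kkt = alm_eq(c, A, b)
```

With exact subproblem solves the stationarity residual vanishes identically after each multiplier update, so the KKT residual is driven by primal feasibility, which decays linearly here.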
Proximal algorithms for constrained composite optimization, with applications to solving low-rank SDPs
We study a family of (potentially non-convex) constrained optimization
problems with convex composite structure. Through a novel analysis of
non-smooth geometry, we show that proximal-type algorithms applied to exact
penalty formulations of such problems exhibit local linear convergence under a
quadratic growth condition, which the compositional structure we consider
ensures. The main application of our results is to low-rank semidefinite
optimization with Burer-Monteiro factorizations. We precisely identify the
conditions for quadratic growth in the factorized problem via structures in the
semidefinite problem, which could be of independent interest for understanding
matrix factorization.
Randomized Smoothing SVRG for Large-scale Nonsmooth Convex Optimization
In this paper, we consider the problem of minimizing the average of a large
number of nonsmooth and convex functions. Such problems often arise in typical
machine learning problems as empirical risk minimization, but are
computationally very challenging. We develop and analyze a new algorithm that
achieves a robust linear convergence rate, and whose time and gradient
complexities are superior to those of state-of-the-art nonsmooth algorithms and
subgradient-based schemes. Moreover, our algorithm works without any extra error
bound conditions on the objective function as well as the common
strongly-convex condition. We show that our algorithm has wide applications in
optimization and machine learning problems, and demonstrate experimentally that
it performs well on a large-scale ranking problem. Comment: 10 pages, 12 figures. arXiv admin note: text overlap with
arXiv:1103.4296, arXiv:1403.4699 by other authors
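The paper's algorithm combines randomized smoothing with variance reduction; as a simplified, hypothetical illustration of the variance-reduction component alone, here is plain SVRG on a smooth finite sum (the instance, names, and step size are ours, not the paper's):

```python
import numpy as np

def svrg(grads, full_grad, x0, step, epochs=50, m=100, seed=0):
    """SVRG for min (1/n) sum_i f_i(x): each epoch snapshots x_tilde and
    its full gradient mu, then takes m inner steps along the
    variance-reduced direction grad_i(x) - grad_i(x_tilde) + mu.
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    x = x0.copy()
    for _ in range(epochs):
        x_tilde = x.copy()
        mu = full_grad(x_tilde)
        for _ in range(m):
            i = rng.integers(n)
            x = x - step * (grads[i](x) - grads[i](x_tilde) + mu)
    return x

# Smooth finite-sum test problem: f_i(x) = 0.5*(a_i'x - b_i)^2 with a
# consistent right-hand side, so the minimizer interpolates all terms.
rng = np.random.default_rng(2)
A = rng.standard_normal((40, 5))
x_true = rng.standard_normal(5)
b = A @ x_true
grads = [lambda x, a=A[i], bi=b[i]: a * (a @ x - bi) for i in range(40)]
full_grad = lambda x: A.T @ (A @ x - b) / 40
x = svrg(grads, full_grad, np.zeros(5), step=0.01)
```

Because the snapshot gradient is recomputed only once per epoch, each inner step costs two component gradients instead of a full pass, which is the source of the gradient-complexity savings.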
Adaptive restart of accelerated gradient methods under local quadratic growth condition
By analyzing accelerated proximal gradient methods under a local quadratic
growth condition, we show that restarting these algorithms at any frequency
gives a globally linearly convergent algorithm. This result was previously
known only for long enough frequencies. Then, as the rate of convergence
depends on the match between the frequency and the quadratic error bound, we
design a scheme to automatically adapt the frequency of restart from the
observed decrease of the norm of the gradient mapping. Our algorithm has a
better theoretical bound than previously proposed methods for the adaptation to
the quadratic error bound of the objective. We illustrate the efficiency of the
algorithm on a Lasso problem and on a regularized logistic regression problem.
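A minimal sketch of fixed-frequency restarting, assuming a plain FISTA-style accelerated gradient loop on a smooth least-squares objective (the paper's adaptive frequency selection is not implemented here; instance and names are ours):

```python
import numpy as np

def fista_restart(grad, step, x0, restart_every=50, n_iters=500):
    """Accelerated gradient method restarted every `restart_every`
    iterations: momentum is reset (t <- 1, y <- x), which under a
    quadratic growth condition yields global linear convergence.
    """
    x, y, t = x0.copy(), x0.copy(), 1.0
    for k in range(n_iters):
        x_new = y - step * grad(y)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
        if (k + 1) % restart_every == 0:  # fixed-frequency restart
            y, t = x.copy(), 1.0
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
grad = lambda x: A.T @ (A @ x - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L
x = fista_restart(grad, step, np.zeros(10))
```

The adaptive scheme in the paper would instead monitor the norm of the gradient mapping and adjust `restart_every` on the fly.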
Proximal Quasi-Newton Methods for Regularized Convex Optimization with Linear and Accelerated Sublinear Convergence Rates
In [19], a general, inexact, efficient proximal quasi-Newton algorithm for
composite optimization problems has been proposed and a sublinear global
convergence rate has been established. In this paper, we analyze the
convergence properties of this method, both in the exact and inexact setting,
in the case when the objective function is strongly convex. We also investigate
a practical variant of this method by establishing a simple stopping criterion
for the subproblem optimization. Furthermore, we consider an accelerated
variant of the proximal quasi-Newton algorithm, based on FISTA [1]. A similar
accelerated method has been considered in [7], where the convergence rate
analysis relies on very strong, impractical assumptions. We present a modified
analysis while relaxing these assumptions and perform a practical comparison of
the accelerated proximal quasi-Newton algorithm and the regular one. Our
analysis and computational results show that acceleration may not bring any
benefit in the quasi-Newton setting.
The proximal point method revisited
In this short survey, I revisit the role of the proximal point method in
large scale optimization. I focus on three recent examples: a proximally guided
subgradient method for weakly convex stochastic approximation, the prox-linear
algorithm for minimizing compositions of convex functions and smooth maps, and
Catalyst generic acceleration for regularized Empirical Risk Minimization. Comment: 11 pages, submitted to SIAG/OPT Views and News
Convergence of the Forward-Backward Algorithm: Beyond the Worst Case with the Help of Geometry
We provide a comprehensive study of the convergence of the forward-backward
algorithm under suitable geometric conditions leading to fast rates. We present
several new results and collect in a unified view a variety of results
scattered in the literature, often providing simplified proofs. Novel
contributions include the analysis of infinite dimensional convex minimization
problems, allowing the case where minimizers might not exist. Further, we
analyze the relation between different geometric conditions, and discuss novel
connections with a priori conditions in linear inverse problems, including
source conditions, restricted isometry properties and partial smoothness.
Linear convergence of first order methods for non-strongly convex optimization
The standard assumption for proving linear convergence of first order methods
for smooth convex optimization is the strong convexity of the objective
function, an assumption which does not hold for many practical applications. In
this paper, we derive linear convergence rates of several first order methods
for solving smooth non-strongly convex constrained optimization problems, i.e.
involving an objective function with a Lipschitz continuous gradient that
satisfies some relaxed strong convexity condition. In particular, in the case
of smooth constrained convex optimization, we provide several relaxations of
the strong convexity conditions and prove that they are sufficient for getting
linear convergence for several first order methods such as projected gradient,
fast gradient and feasible descent methods. We also provide examples of
functional classes that satisfy our proposed relaxations of strong convexity
conditions. Finally, we show that the proposed relaxed strong convexity
conditions cover important applications, including solving linear systems,
linear programming, and dual formulations of linearly constrained convex
problems. Comment: 36 pages, 4 figures
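To illustrate linear convergence without strong convexity, here is a toy sketch of our own: gradient descent on a consistent but rank-deficient least-squares problem, where the objective is not strongly convex yet a Hoffman-type error bound still yields a linear rate.

```python
import numpy as np

# f(x) = 0.5*||Ax - b||^2 with rank-deficient A: not strongly convex,
# yet gradient descent converges linearly because an error bound
# dist(x, X*) <= kappa * ||grad f(x)|| holds for this problem class.
rng = np.random.default_rng(4)
M = rng.standard_normal((4, 8))
A = np.vstack([M, M])   # duplicated rows -> rank 4 < 8 variables
x_true = rng.standard_normal(8)
b = A @ x_true          # consistent system, so min f = 0

step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L
x = np.zeros(8)
residuals = []
for _ in range(5000):
    x = x - step * A.T @ (A @ x - b)
    residuals.append(np.linalg.norm(A @ x - b))
```

The residual norm decreases geometrically at a rate governed by the ratio of the extreme nonzero singular values of A, which is exactly the constant an error-bound analysis produces in place of a strong convexity modulus.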