Proximal Newton-type methods for minimizing composite functions

Author: Lee Jason D.
Saunders Michael A.
Sun Yuekai
Publication date: 17/03/2014

We generalize Newton-type methods for minimizing smooth functions to handle a sum of two convex functions: a smooth function and a nonsmooth function with a simple proximal mapping. We show that the resulting proximal Newton-type methods inherit the desirable convergence behavior of Newton-type methods for minimizing smooth functions, even when search directions are computed inexactly. Many popular methods tailored to problems arising in bioinformatics, signal processing, and statistical learning are special cases of proximal Newton-type methods, and our analysis yields new convergence results for some of these methods.
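As a rough illustration of the idea in this abstract, the sketch below takes proximal Newton-type steps for a smooth term plus an l1 penalty, under the simplifying assumption that the Hessian is replaced by its diagonal, so that each scaled proximal subproblem has a closed-form soft-thresholding solution. The function names, the unit step size (no line search), and the toy separable quadratic are assumptions made for this example and are not taken from the paper.

```python
def soft_threshold(v, tau):
    """Proximal mapping of tau * |.| evaluated at v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def prox_newton_diag(grad, hess_diag, lam, x0, iters=20):
    """Proximal Newton-type iteration with a diagonal Hessian approximation.

    Each step minimizes g.d + 0.5 * d.H.d + lam * ||x + d||_1 with H diagonal,
    which separates by coordinate and reduces to a scaled soft-threshold.
    A unit step is always taken (assumption: no line search in this sketch).
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        h = hess_diag(x)
        x = [soft_threshold(xi - gi / hi, lam / hi)
             for xi, gi, hi in zip(x, g, h)]
    return x


# Toy problem (hypothetical): f(x) = 0.5 * sum_i a_i * (x_i - b_i)^2
a = [2.0, 4.0]
b = [1.0, -0.1]
lam = 0.5

grad = lambda x: [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]
hess_diag = lambda x: list(a)

solution = prox_newton_diag(grad, hess_diag, lam, [0.0, 0.0])
```

Because the toy objective is separable, the diagonal model is exact and the iteration lands on the minimizer after one step; a general problem would need a full (or quasi-Newton) Hessian model and an inner solver for the subproblem.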

arXiv.org e-Print Archive

Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization

Author: Li Qunwei
Liang Yingbin
Varshney Pramod K.
Zhou Yi
Publication date: 14/05/2017

In many modern machine learning applications, the structure of the underlying mathematical models often yields nonconvex optimization problems. Because general nonconvex problems are intractable, there is a rising need for efficient methods that solve them with certain performance guarantees. In this work, we investigate the accelerated proximal gradient method for nonconvex programming (APGnc). The method compares a usual proximal gradient step with a linear extrapolation step, and accepts the one that has the lower function value to achieve a monotonic decrease. Specifically, under a general nonsmooth and nonconvex setting, we provide a rigorous argument to show that the limit points of the sequence generated by APGnc are critical points of the objective function. Then, by exploiting the Kurdyka-Łojasiewicz (KL) property for a broad class of functions, we establish the linear and sub-linear convergence rates of the function value sequence generated by APGnc. We further propose a stochastic variance reduced APGnc (SVRG-APGnc), and establish its linear convergence under a special case of the KL property. We also extend the analysis to the inexact version of these methods and develop an adaptive momentum strategy that improves the numerical performance. Comment: Accepted in ICML 2017, 9 pages, 4 figures
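The compare-and-accept mechanism described in this abstract can be sketched as follows. This is a simplified reading of the abstract, not the paper's exact algorithm: the update order, the fixed momentum `beta`, the step size `eta`, and the one-dimensional nonconvex test function are all assumptions chosen for illustration.

```python
def soft_threshold(v, tau):
    """Proximal mapping of tau * |.| evaluated at v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def objective(x, lam):
    # Nonconvex smooth part x^4/4 - x^2/2 plus the nonsmooth term lam * |x|.
    return 0.25 * x ** 4 - 0.5 * x ** 2 + lam * abs(x)


def grad_smooth(x):
    return x ** 3 - x


def apg_monotone_sketch(x0, lam=0.1, eta=0.05, beta=0.5, iters=500):
    """Proximal gradient with an extrapolation candidate and a value check."""
    y = x0        # point from which the next proximal gradient step is taken
    x_prev = x0   # previous proximal gradient output
    history = [objective(y, lam)]
    for _ in range(iters):
        # Usual proximal gradient step from y.
        x = soft_threshold(y - eta * grad_smooth(y), eta * lam)
        # Linear extrapolation candidate.
        v = x + beta * (x - x_prev)
        # Keep whichever candidate has the lower objective value.
        y = x if objective(x, lam) <= objective(v, lam) else v
        x_prev = x
        history.append(objective(y, lam))
    return y, history


y_final, values = apg_monotone_sketch(2.0)
```

With the step size below the inverse of the local gradient Lipschitz constant, the proximal gradient candidate never increases the objective, so choosing the better of the two candidates keeps the recorded values nonincreasing.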

arXiv.org e-Print Archive

Minimization of nonsmooth nonconvex functions using inexact evaluations and its worst-case complexity

Author: Gratton S.
Simon E.
Toint Ph. L.
Publication date: 27/02/2019

An adaptive regularization algorithm using inexact function and derivatives evaluations is proposed for the solution of composite nonsmooth nonconvex optimization. It is shown that this algorithm needs at most $O(|\log(\epsilon)|\,\epsilon^{-2})$ evaluations of the problem's functions and their derivatives for finding an $\epsilon$-approximate first-order stationary point. This complexity bound therefore generalizes that provided by [Bellavia, Gurioli, Morini and Toint, 2018] for inexact methods for smooth nonconvex problems, and is within a factor $|\log(\epsilon)|$ of the optimal bound known for smooth and nonsmooth nonconvex minimization with exact evaluations. A practically more restrictive variant of the algorithm with worst-case complexity $O(|\log(\epsilon)|+\epsilon^{-2})$ is also presented. Comment: 19 pages

arXiv.org e-Print Archive

A convergence framework for inexact nonconvex and nonsmooth algorithms and its applications to several iterations

Author: Cheng Lizhi
Jiang Hao
Sun Tao
Zhu Wei
Publication date: 27/11/2018

In this paper, we consider the convergence of an abstract inexact nonconvex and nonsmooth algorithm. We impose a pseudo sufficient descent condition and a pseudo relative error condition on the algorithm, both of which are related to an auxiliary sequence, and we assume that a continuity condition holds. In fact, many classical inexact nonconvex and nonsmooth algorithms satisfy these three conditions. Under a special kind of summability assumption on the auxiliary sequence, we prove that the sequence generated by the general algorithm converges to a critical point of the objective function, provided the objective satisfies the Kurdyka-Łojasiewicz property. The core of the proofs lies in building a new Lyapunov function, whose successive difference provides a bound for the successive difference of the points generated by the algorithm. We then apply our findings to several classical nonconvex iterative algorithms and derive the corresponding convergence results.

arXiv.org e-Print Archive

Composite Convex Optimization with Global and Local Inexact Oracles

Author: Necoara Ion
Sun Tianxiao
Tran-Dinh Quoc
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/02/2020

We introduce new global and local inexact oracle concepts for a wide class of convex functions in composite convex minimization. Such inexact oracles naturally come from primal-dual frameworks, barrier smoothing, inexact computations of gradients and Hessians, and many other situations. We also provide examples showing that the class of convex functions equipped with the new inexact second-order oracles is larger than the standard self-concordant and Lipschitz gradient function classes. Further, we investigate several properties of convex and/or self-concordant functions under the inexact second-order oracles which are useful for algorithm development. Next, we apply our theory to develop inexact proximal Newton-type schemes for general composite convex minimization problems equipped with such inexact oracles. Our theoretical results consist of new optimization algorithms, accompanied by global convergence guarantees, to solve a wide class of composite convex optimization problems. When the first objective term is additionally self-concordant, we establish different local convergence results for our method. In particular, we prove that depending on the choice of accuracy levels of the inexact second-order oracles, we obtain different local convergence rates ranging from $R$-linear and $R$-superlinear to $R$-quadratic. In special cases, where convergence bounds are known, our theory recovers the best known rates. We also apply our settings to derive a new primal-dual method for composite convex minimization problems. Finally, we present some representative numerical examples to illustrate the benefit of our new algorithms. Comment: 28 pages, 6 figures, and 2 tables

arXiv.org e-Print Archive

Efficiency of minimizing compositions of convex functions and smooth maps

Author: Drusvyatskiy Dmitriy
Paquette Courtney
Publication date: 14/08/2017

We consider global efficiency of algorithms for minimizing a sum of a convex function and a composition of a Lipschitz convex function with a smooth map. The basic algorithm we rely on is the prox-linear method, which in each iteration solves a regularized subproblem formed by linearizing the smooth map. When the subproblems are solved exactly, the method has efficiency $\mathcal{O}(\varepsilon^{-2})$, akin to gradient descent for smooth minimization. We show that when the subproblems can only be solved by first-order methods, a simple combination of smoothing, the prox-linear method, and a fast-gradient scheme yields an algorithm with complexity $\widetilde{\mathcal{O}}(\varepsilon^{-3})$. The technique readily extends to minimizing an average of $m$ composite functions, with complexity $\widetilde{\mathcal{O}}(m/\varepsilon^{2}+\sqrt{m}/\varepsilon^{3})$ in expectation. We round off the paper with an inertial prox-linear method that automatically accelerates in presence of convexity.
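To make the prox-linear step concrete, here is a minimal one-dimensional sketch for minimizing |c(x)| with the smooth map c(x) = x^2 - 2. The test problem, the step parameter `t`, and the function name are assumptions for this example. Each step linearizes c at the current point and minimizes |c(x) + c'(x) d| + d^2 / (2t) over d, which in one dimension has the closed form used below.

```python
def prox_linear_step(x, t):
    """One prox-linear step for h(c(x)) with h = |.| and c(x) = x^2 - 2."""
    r = x * x - 2.0   # value of the smooth map c at x
    a = 2.0 * x       # derivative c'(x)
    if a == 0.0:
        # The linearized model is |r| + d^2 / (2t), minimized at d = 0.
        return x
    # Optimality gives d = -t * a * s with s in the subdifferential of |.|,
    # which works out to s = clip(r / (t * a^2), -1, 1).
    s = max(-1.0, min(1.0, r / (t * a * a)))
    return x - t * a * s


x = 1.0
for _ in range(30):
    x = prox_linear_step(x, t=0.5)
```

When the residual is small relative to `t * a**2`, the step reduces to `x - r / a`, a Newton step on c(x) = 0, so the iterates here approach the square root of 2 quickly. The value `t = 0.5` reflects the usual requirement that `t` not exceed the inverse of the product of the Lipschitz constants of h and c' (1 and 2 in this toy case).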

arXiv.org e-Print Archive

A Family of Inexact SQA Methods for Non-Smooth Convex Minimization with Provable Convergence Guarantees Based on the Luo-Tseng Error Bound Property

Author: So Anthony Man-Cho
Yue Man-Chung
Zhou Zirui
Publication date: 26/01/2018

We propose a new family of inexact sequential quadratic approximation (SQA) methods, which we call the inexact regularized proximal Newton (IRPN) method, for minimizing the sum of two closed proper convex functions, one of which is smooth and the other is possibly non-smooth. Our proposed method features strong convergence guarantees even when applied to problems with degenerate solutions while allowing the inner minimization to be solved inexactly. Specifically, we prove that when the problem possesses the so-called Luo-Tseng error bound (EB) property, IRPN converges globally to an optimal solution, and the local convergence rate of the sequence of iterates generated by IRPN is linear, superlinear, or even quadratic, depending on the choice of parameters of the algorithm. Prior to this work, such EB property has been extensively used to establish the linear convergence of various first-order methods. However, to the best of our knowledge, this work is the first to use the Luo-Tseng EB property to establish the superlinear convergence of SQA-type methods for non-smooth convex minimization. As a consequence of our result, IRPN is capable of solving regularized regression or classification problems under the high-dimensional setting with provable convergence guarantees. We compare our proposed IRPN with several empirically efficient algorithms by applying them to the $\ell_1$-regularized logistic regression problem. Experimental results show the competitiveness of our proposed method.

arXiv.org e-Print Archive

A Flexible Coordinate Descent Method

Author: Fountoulakis Kimon
Tappenden Rachael
Publication date: 26/02/2018

We present a novel randomized block coordinate descent method for the minimization of a convex composite objective function. The method uses (approximate) partial second-order (curvature) information, so that the algorithm performance is more robust when applied to highly nonseparable or ill-conditioned problems. We call the method Flexible Coordinate Descent (FCD). At each iteration of FCD, a block of coordinates is sampled randomly, a quadratic model is formed about that block and the model is minimized approximately/inexactly to determine the search direction. An inexpensive line search is then employed to ensure a monotonic decrease in the objective function and acceptance of large step sizes. We present several high probability iteration complexity results to show that convergence of FCD is guaranteed theoretically. Finally, we present numerical results on large-scale problems to demonstrate the practical performance of the method. Comment: 31 pages, 24 figures
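The sample, model, and line-search loop described in this abstract might look roughly like the sketch below. It is a heavily simplified stand-in and not the FCD algorithm itself: blocks have size one, the quadratic model uses only the diagonal curvature entry and is minimized exactly, and the line search accepts any step that does not increase the objective. The test matrix, the penalty weight, and all function names are assumptions.

```python
import random


def soft_threshold(v, tau):
    """Proximal mapping of tau * |.| evaluated at v."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def objective(x, A, b, lam):
    """0.5 * x'Ax - b'x + lam * ||x||_1 for a small dense matrix A."""
    n = len(x)
    quad = 0.5 * sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return quad - sum(b[i] * x[i] for i in range(n)) + lam * sum(abs(v) for v in x)


def coordinate_descent_sketch(A, b, lam, x0, iters=200, seed=0):
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        i = rng.randrange(n)                      # sample one coordinate
        g = sum(A[i][j] * x[j] for j in range(n)) - b[i]
        h = A[i][i]                               # curvature along coordinate i
        # Minimizer of the one-dimensional quadratic model plus the l1 term.
        d = soft_threshold(x[i] - g / h, lam / h) - x[i]
        # Backtracking line search on the true objective (simple decrease).
        f_old = objective(x, A, b, lam)
        step = 1.0
        while step > 1e-10:
            trial = list(x)
            trial[i] = x[i] + step * d
            if objective(trial, A, b, lam) <= f_old:
                x = trial
                break
            step *= 0.5
    return x


A = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]
x_cd = coordinate_descent_sketch(A, b, lam=0.1, x0=[0.0, 0.0])
```

For this quadratic test problem the one-dimensional model is exact, so the full step is normally accepted; the backtracking loop matters when the model only approximates the objective along the sampled block.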

arXiv.org e-Print Archive

Truncated Nonsmooth Newton Multigrid Methods for Block-Separable Minimization Problems

Author: Gräser Carsten
Sander Oliver
Publication date: 25/10/2017

The Truncated Nonsmooth Newton Multigrid (TNNMG) method is a robust and efficient solution method for a wide range of block-separable convex minimization problems, typically stemming from discretizations of nonlinear and nonsmooth partial differential equations. This paper proves global convergence of the method under weak conditions both on the objective functional and on the local inexact subproblem solvers that are part of the method. It also discusses a range of algorithmic choices that allow the algorithm to be customized for many specific problems. Numerical examples are deliberately omitted, because many such examples have already been published elsewhere. Comment: Dedicate the paper to Elias Pippin

arXiv.org e-Print Archive

Composite Optimization by Nonconvex Majorization-Minimization

Author: Geiping Jonas
Moeller Michael
Publication date: 03/09/2018

The minimization of a nonconvex composite function can model a variety of imaging tasks. A popular class of algorithms for solving such problems is majorization-minimization techniques, which iteratively approximate the composite nonconvex function by a majorizing function that is easy to minimize. Most techniques, e.g. gradient descent, utilize convex majorizers in order to guarantee that the majorizer is easy to minimize. In our work we consider a natural class of nonconvex majorizers for these functions, and show that these majorizers are still sufficient for a globally convergent optimization scheme. Numerical results illustrate that by applying this scheme, one can often obtain superior local optima compared to previous majorization-minimization methods, when the nonconvex majorizers are solved to global optimality. Finally, we illustrate the behavior of our algorithm for depth super-resolution from raw time-of-flight data. Comment: 38 pages, 12 figures, accepted for publication in SIIM
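The remark that gradient descent is a majorization-minimization scheme with a convex majorizer can be illustrated with a short sketch. It covers only that classical convex-majorizer case, not the nonconvex majorizers proposed in the paper; the test function log(1 + x^2) and the curvature bound `L = 2` are assumptions chosen for the example. At each iterate the function is bounded above by a quadratic with curvature `L`, and minimizing that quadratic gives the step `x - f'(x) / L`.

```python
import math


def f(x):
    # Smooth nonconvex test function whose second derivative is bounded by 2.
    return math.log(1.0 + x * x)


def f_prime(x):
    return 2.0 * x / (1.0 + x * x)


def mm_gradient_descent(x0, L=2.0, iters=20):
    """Majorization-minimization with the quadratic upper bound
    f(x_k) + f'(x_k) * (x - x_k) + (L / 2) * (x - x_k)^2,
    whose minimizer is the gradient step x_k - f'(x_k) / L."""
    x = x0
    values = [f(x)]
    for _ in range(iters):
        x = x - f_prime(x) / L
        values.append(f(x))
    return x, values


x_mm, mm_values = mm_gradient_descent(3.0)
```

Because each quadratic lies above the function and touches it at the current iterate, minimizing it cannot increase the function value, which is the monotone decrease property that majorization-minimization methods rely on.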

arXiv.org e-Print Archive