Proximal Newton-type methods for minimizing composite functions
We generalize Newton-type methods for minimizing smooth functions to handle a
sum of two convex functions: a smooth function and a nonsmooth function with a
simple proximal mapping. We show that the resulting proximal Newton-type
methods inherit the desirable convergence behavior of Newton-type methods for
minimizing smooth functions, even when search directions are computed
inexactly. Many popular methods tailored to problems arising in bioinformatics,
signal processing, and statistical learning are special cases of proximal
Newton-type methods, and our analysis yields new convergence results for some
of these methods.
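One of the special cases alluded to above is the classical proximal gradient method, obtained by crudely modeling the Hessian of the smooth part as $LI$. A minimal sketch for the $\ell_1$-regularized least-squares problem follows; the problem data, step size, and function names are illustrative, not from the paper:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal mapping of t * ||.||_1 (applied componentwise).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_lasso(A, b, lam, steps=500):
    # Proximal Newton-type iteration with Hessian model L*I, i.e. the
    # classical proximal gradient method, for
    # min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - b)       # gradient of the smooth term
        x = soft_threshold(x - grad / L, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
x_hat = proximal_gradient_lasso(A, b, lam=0.1)
```

A genuine proximal Newton method would replace $LI$ by a (quasi-)Newton model of the smooth term and solve the resulting scaled proximal subproblem, possibly inexactly.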
Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization
In many modern machine learning applications, the structure of the underlying
mathematical model often yields a nonconvex optimization problem. Because
nonconvexity is intractable in general, there is a rising need for efficient
methods that solve general nonconvex problems with performance
guarantees. In this work, we investigate the accelerated proximal gradient
method for nonconvex programming (APGnc). At each iteration, the method
compares a usual proximal gradient step with a linear extrapolation step and
accepts whichever yields the lower function value, ensuring a monotonic
decrease. Specifically,
under a general nonsmooth and nonconvex setting, we provide a rigorous argument
to show that the limit points of the sequence generated by APGnc are critical
points of the objective function. Then, by exploiting the
Kurdyka-{\L}ojasiewicz (\KL) property for a broad class of functions, we
establish the linear and sub-linear convergence rates of the function value
sequence generated by APGnc. We further propose a stochastic variance reduced
APGnc (SVRG-APGnc), and establish its linear convergence under a special case
of the \KL property. We also extend the analysis to the inexact version of
these methods and develop an adaptive momentum strategy that improves the
numerical performance. Comment: Accepted at ICML 2017, 9 pages, 4 figures
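The better-of-two-steps acceptance rule described above can be sketched as follows. The test problem, step size, and helper names are illustrative assumptions, and the sketch omits the paper's inexact and adaptive-momentum variants:

```python
import numpy as np

def prox_l1(v, t):
    # Proximal mapping of t * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def apgnc_sketch(grad, obj, prox, x0, step, iters=300):
    # Monotone accelerated proximal gradient sketch: take both a plain
    # proximal gradient step and an extrapolated (momentum) step, then
    # keep whichever gives the lower objective value.
    x_prev = x = x0.copy()
    t_prev = t = 1.0
    for _ in range(iters):
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)   # extrapolation point
        z = prox(y - step * grad(y), step)            # accelerated step
        v = prox(x - step * grad(x), step)            # plain prox step
        x_prev = x
        x = z if obj(z) <= obj(v) else v              # monotone acceptance
        t_prev, t = t, 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 8))
b = A @ rng.standard_normal(8)
lam = 0.05
F = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))
gradient = lambda x: A.T @ (A @ x - b)
x_star = apgnc_sketch(gradient, F, lambda v, t: prox_l1(v, lam * t),
                      np.zeros(8), 1.0 / np.linalg.norm(A, 2) ** 2)
```

Because the plain proximal gradient step with step size $1/L$ never increases the objective, taking the better of the two candidates preserves monotonicity while still benefiting from momentum when it helps.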
Minimization of nonsmooth nonconvex functions using inexact evaluations and its worst-case complexity
An adaptive regularization algorithm using inexact function and derivatives
evaluations is proposed for the solution of composite nonsmooth nonconvex
optimization. It is shown that this algorithm needs at most
$\mathcal{O}(|\log \epsilon|\,\epsilon^{-2})$ evaluations of the problem's functions and
their derivatives to find an $\epsilon$-approximate first-order stationary
point. This complexity bound therefore generalizes that provided by [Bellavia,
Gurioli, Morini and Toint, 2018] for inexact methods for smooth nonconvex
problems, and is within a factor $|\log \epsilon|$ of the optimal bound known
for smooth and nonsmooth nonconvex minimization with exact evaluations. A
practically more restrictive variant of the algorithm with worst-case
complexity $\mathcal{O}(\epsilon^{-2})$ is also presented. Comment: 19 pages
A convergence framework for inexact nonconvex and nonsmooth algorithms and its applications to several iterations
In this paper, we consider the convergence of an abstract inexact nonconvex
and nonsmooth algorithm. We posit a pseudo sufficient-descent condition and a
pseudo relative-error condition for the algorithm, both related to an auxiliary
sequence, and assume that a continuity condition holds. In
fact, many classical inexact nonconvex and nonsmooth algorithms satisfy these
three conditions. Under a summability assumption on the auxiliary
sequence, we prove that the sequence generated by the general algorithm converges to
a critical point of the objective function, provided the objective satisfies
the Kurdyka-{\L}ojasiewicz property. The core of the proof lies in building a new Lyapunov
function, whose successive differences bound the successive
differences of the points generated by the algorithm. We then apply our
findings to several classical nonconvex iterative algorithms and derive the
corresponding convergence results.
Composite Convex Optimization with Global and Local Inexact Oracles
We introduce new global and local inexact oracle concepts for a wide class of
convex functions in composite convex minimization. Such inexact oracles arise
naturally from primal-dual frameworks, barrier smoothing, inexact
computation of gradients and Hessians, and many other situations. We also
provide examples showing that the class of convex functions equipped with the
new inexact second-order oracles is larger than the standard self-concordant
and Lipschitz-gradient function classes. Further, we investigate several
properties of convex and/or self-concordant functions under inexact
second-order oracles that are useful for algorithm development. Next, we apply
our theory to develop inexact proximal Newton-type schemes for solving
general composite convex minimization problems equipped with such inexact
oracles. Our theoretical results consist of new optimization algorithms,
accompanied by global convergence guarantees, for a wide class of
composite convex optimization problems. When the first objective term is
additionally self-concordant, we establish different local convergence results
for our method. In particular, we prove that, depending on the choice of
accuracy levels of the inexact second-order oracles, we obtain local
convergence rates ranging from $R$-linear and $R$-superlinear to $R$-quadratic.
In special cases where convergence bounds are known, our theory recovers the
best known rates. We also apply our settings to derive a new primal-dual method
for composite convex minimization problems. Finally, we present some
representative numerical examples to illustrate the benefit of our new
algorithms. Comment: 28 pages, 6 figures, and 2 tables
Efficiency of minimizing compositions of convex functions and smooth maps
We consider the global efficiency of algorithms for minimizing the sum of a convex
function and a composition of a Lipschitz convex function with a smooth map.
The basic algorithm we rely on is the prox-linear method, which in each
iteration solves a regularized subproblem formed by linearizing the smooth map.
When the subproblems are solved exactly, the method has efficiency
$\mathcal{O}(\varepsilon^{-2})$, akin to gradient descent for smooth
minimization. We show that when the subproblems can only be solved by
first-order methods, a simple combination of smoothing, the prox-linear method,
and a fast-gradient scheme yields an algorithm with complexity
$\widetilde{\mathcal{O}}(\varepsilon^{-3})$. The technique readily extends to
minimizing an average of composite functions, with an analogous complexity
guarantee holding in
expectation. We round off the paper with an inertial prox-linear method that
automatically accelerates in the presence of convexity.
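For a scalar smooth map composed with the absolute value, the prox-linear subproblem has a closed-form solution, which makes the method easy to sketch. The test function, prox parameter, and names below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def soft_threshold(a, t):
    # Scalar proximal mapping of t * |.|
    return np.sign(a) * max(abs(a) - t, 0.0)

def prox_linear_abs(c, dc, x0, t=0.5, iters=100):
    # Prox-linear sketch for min_x |c(x)| with a smooth scalar map c:
    # each iteration minimizes |c(x) + c'(x) d| + d^2 / (2 t) over the
    # step d, which reduces to soft-thresholding the linearized residual.
    x = x0
    for _ in range(iters):
        a, g = c(x), dc(x)
        if g == 0.0:
            break
        u = soft_threshold(a, t * g * g)   # prox along the linearization
        x = x + (u - a) / g                # d = (u - a) / c'(x)
    return x

# Example: minimize |x^2 - 2|; the iterates approach sqrt(2) from x0 = 1.
x_star = prox_linear_abs(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

The substitution $u = c(x) + c'(x)\,d$ turns the subproblem into a one-dimensional proximal step for $|\cdot|$, so no inner solver is needed in this toy case; in general the regularized subproblem is itself solved by a first-order method, as the abstract describes.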
A Family of Inexact SQA Methods for Non-Smooth Convex Minimization with Provable Convergence Guarantees Based on the Luo-Tseng Error Bound Property
We propose a new family of inexact sequential quadratic approximation (SQA)
methods, which we call the inexact regularized proximal Newton
(IRPN) method, for minimizing the sum of two closed proper convex
functions, one of which is smooth and the other possibly non-smooth. Our
proposed method features strong convergence guarantees even when applied to
problems with degenerate solutions, while allowing the inner minimization to be
solved inexactly. Specifically, we prove that when the problem possesses the
so-called Luo-Tseng error bound (EB) property, IRPN converges
globally to an optimal solution, and the local convergence rate of the sequence
of iterates generated by IRPN is linear, superlinear, or even
quadratic, depending on the choice of parameters of the algorithm. Prior to
this work, the EB property had been used extensively to establish the linear
convergence of various first-order methods. However, to the best of our
knowledge, this work is the first to use the Luo-Tseng EB property to establish
the superlinear convergence of SQA-type methods for non-smooth convex
minimization. As a consequence of our result, IRPN is capable of
solving regularized regression or classification problems in the
high-dimensional setting with provable convergence guarantees. We compare our
proposed IRPN with several empirically efficient algorithms by
applying them to the $\ell_1$-regularized logistic regression problem.
Experimental results show the competitiveness of our proposed method.
A Flexible Coordinate Descent Method
We present a novel randomized block coordinate descent method for the
minimization of a convex composite objective function. The method uses
(approximate) partial second-order (curvature) information, so that the
algorithm's performance is more robust when applied to highly nonseparable or
ill-conditioned problems. We call the method Flexible Coordinate Descent (FCD). At
each iteration of FCD, a block of coordinates is sampled randomly, a quadratic
model is formed about that block and the model is minimized
\emph{approximately/inexactly} to determine the search direction. An
inexpensive line search is then employed to ensure a monotonic decrease in the
objective function and acceptance of large step sizes. We present several high
probability iteration complexity results to show that convergence of FCD is
guaranteed theoretically. Finally, we present numerical results on large-scale
problems to demonstrate the practical performance of the method. Comment: 31 pages, 24 figures
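A stripped-down version of the sample-a-block-then-minimize loop, specialized to a quadratic objective with an exact block solve (FCD itself allows inexact block solves plus a line search; all names and data below are illustrative):

```python
import numpy as np

def block_cd_quadratic(Q, b, blocks, iters=5000, seed=0):
    # Randomized block coordinate descent sketch for
    # min_x 0.5 * x^T Q x - b^T x: sample a block uniformly at random,
    # then minimize the objective over that block exactly (FCD would
    # instead allow an approximate block solve followed by a line search).
    rng = np.random.default_rng(seed)
    x = np.zeros(len(b))
    for _ in range(iters):
        blk = blocks[rng.integers(len(blocks))]
        grad_blk = Q[blk] @ x - b[blk]                     # block gradient
        x[blk] -= np.linalg.solve(Q[np.ix_(blk, blk)], grad_blk)
    return x

rng = np.random.default_rng(1)
M = rng.standard_normal((12, 6))
Q = M.T @ M + np.eye(6)      # symmetric positive definite
b = rng.standard_normal(6)
blocks = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
x_hat = block_cd_quadratic(Q, b, blocks)
```

Here the block quadratic model is the exact restriction of the objective, so no line search is needed; the point of FCD's partial curvature information is that, for ill-conditioned or nonseparable problems, even an approximate block Hessian makes each sampled update far more effective than a plain coordinate gradient step.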
Truncated Nonsmooth Newton Multigrid Methods for Block-Separable Minimization Problems
The Truncated Nonsmooth Newton Multigrid (TNNMG) method is a robust and
efficient solution method for a wide range of block-separable convex
minimization problems, typically stemming from discretizations of nonlinear and
nonsmooth partial differential equations. This paper proves global convergence
of the method under weak conditions both on the objective functional, and on
the local inexact subproblem solvers that are part of the method. It also
discusses a range of algorithmic choices that allow the algorithm to be customized
for many specific problems. Numerical examples are deliberately omitted,
because many such examples have already been published elsewhere. Comment: Dedicated to Elias Pipping
Composite Optimization by Nonconvex Majorization-Minimization
The minimization of a nonconvex composite function can model a variety of
imaging tasks. A popular class of algorithms for solving such problems are
majorization-minimization techniques which iteratively approximate the
composite nonconvex function by a majorizing function that is easy to minimize.
Most techniques, e.g. gradient descent, utilize convex majorizers in order to
guarantee that the majorizer is easy to minimize. In our work we consider a
natural class of nonconvex majorizers for these functions, and show that these
majorizers are still sufficient for a globally convergent optimization scheme.
Numerical results illustrate that by applying this scheme, one can often obtain
superior local optima compared to previous majorization-minimization methods,
when the nonconvex majorizers are solved to global optimality. Finally, we
illustrate the behavior of our algorithm for depth super-resolution from raw
time-of-flight data. Comment: 38 pages, 12 figures, accepted for publication in SIIMS
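For contrast with the nonconvex majorizers studied here, the standard convex-quadratic majorizer that most techniques use (and that recovers gradient descent) can be sketched as follows; the test function and the Lipschitz constant are illustrative assumptions:

```python
import numpy as np

def mm_quadratic(grad, L, x0, iters=200):
    # Majorization-minimization with a convex quadratic surrogate: at x_k
    # the majorizer f(x_k) + <grad f(x_k), x - x_k> + (L/2)||x - x_k||^2
    # upper-bounds an L-smooth f, and its exact minimizer is a gradient
    # step, so f decreases monotonically along the iterates.
    x = x0.copy()
    for _ in range(iters):
        x = x - grad(x) / L    # exact minimizer of the quadratic majorizer
    return x

# Nonconvex example: f(x) = sum_i (x_i^2 + cos(x_i)), whose gradient is
# Lipschitz with constant L = 3 (since f'' = 2 - cos lies in [1, 3]).
f = lambda x: np.sum(x ** 2 + np.cos(x))
g = lambda x: 2.0 * x - np.sin(x)
x0 = np.array([2.0, -1.5])
x_hat = mm_quadratic(g, L=3.0, x0=x0)
```

The paper's point is that the majorizer need not be convex: a tighter nonconvex surrogate, solved to global optimality, can escape the poor local optima that such convex surrogates get stuck in, while the same monotone-descent argument still yields global convergence.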