From error bounds to the complexity of first-order descent methods for convex functions
This paper shows that error bounds can be used as effective tools for
deriving complexity results for first-order descent methods in convex
minimization. In a first stage, this objective led us to revisit the interplay
between error bounds and the Kurdyka-Łojasiewicz (KL) inequality. One can
show the equivalence between the two concepts for convex functions having a
moderately flat profile near the set of minimizers (as those of functions with
Hölderian growth). A counterexample shows that the equivalence is no longer
true for extremely flat functions. This fact reveals the relevance of an
approach based on the KL inequality. In a second stage, we show how KL inequalities
can in turn be employed to compute new complexity bounds for a wealth of
descent methods for convex problems. Our approach is completely original and
makes use of a one-dimensional worst-case proximal sequence in the spirit of
the famous majorant method of Kantorovich. Our result applies to a very simple
abstract scheme that covers a wide class of descent methods. As a byproduct of
our study, we also provide new results for the globalization of KL inequalities
in the convex framework.
Our main results inaugurate a simple methodology: derive an error bound,
compute the desingularizing function whenever possible, identify essential
constants in the descent method and finally compute the complexity using the
one-dimensional worst-case proximal sequence. Our method is illustrated through
projection methods for feasibility problems, and through the famous iterative
shrinkage thresholding algorithm (ISTA), for which we show that the complexity
bound is of the form where the constituents of the bound only depend
on error bound constants obtained for an arbitrary least squares objective with
regularization.
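As a rough illustration of the ISTA iteration mentioned above, here is a minimal pure-Python sketch. It assumes the regularizer is the l1 norm (the usual setting for ISTA; the listing does not state it), and the matrix, data, weight `lam`, and the step size based on the squared Frobenius norm are all hypothetical choices made for this toy example, not values from the paper.

```python
def soft_threshold(v, t):
    """Proximal map of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def residual(A, b, x):
    """Return A x - b for a dense matrix A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) - bi for row, bi in zip(A, b)]


def objective(A, b, x, lam):
    """0.5 * ||A x - b||^2 + lam * ||x||_1."""
    r = residual(A, b, x)
    return 0.5 * sum(ri * ri for ri in r) + lam * sum(abs(xj) for xj in x)


def ista(A, b, lam, n_iter=200):
    """Run ISTA from x = 0; return the last iterate and the objective history."""
    m, n = len(A), len(A[0])
    # Assumption: the squared Frobenius norm upper-bounds the Lipschitz
    # constant of the gradient, so its inverse is a safe (conservative) step.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    history = [objective(A, b, x, lam)]
    for _ in range(n_iter):
        r = residual(A, b, x)
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step on the least-squares term, then shrinkage.
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
        history.append(objective(A, b, x, lam))
    return x, history


# Hypothetical toy problem (not from the paper).
A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
b = [1.0, 0.0, 1.0]
x_hat, history = ista(A, b, lam=0.5)
```

For this toy problem the optimality conditions give the minimizer (0.75, 0), and the objective values in `history` should be non-increasing, which is the descent property the complexity analysis relies on.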
Information-theoretic lower bounds on the oracle complexity of stochastic convex optimization
Relative to the large literature on upper bounds on the complexity of convex
optimization, less attention has been paid to the fundamental hardness of
these problems. Given the extensive use of convex optimization in machine
learning and statistics, gaining an understanding of these complexity-theoretic
issues is important. In this paper, we study the complexity of stochastic
convex optimization in an oracle model of computation. We improve upon known
results and obtain tight minimax complexity estimates for various function
classes.
Sharp Time–Data Tradeoffs for Linear Inverse Problems
In this paper we characterize sharp time-data tradeoffs for optimization
problems used for solving linear inverse problems. We focus on the minimization
of a least-squares objective subject to a constraint defined as the sub-level
set of a penalty function. We present a unified convergence analysis of the
gradient projection algorithm applied to such problems. We sharply characterize
the convergence rate associated with a wide variety of random measurement
ensembles in terms of the number of measurements and structural complexity of
the signal with respect to the chosen penalty function. The results apply to
both convex and nonconvex constraints, demonstrating that a linear convergence
rate is attainable even though the least squares objective is not strongly
convex in these settings. When specialized to Gaussian measurements our results
show that such linear convergence occurs when the number of measurements is
merely 4 times the minimal number required to recover the desired signal at all
(a.k.a. the phase transition). We also achieve a slower but geometric rate of
convergence precisely above the phase transition point. Extensive numerical
results suggest that the derived rates exactly match the empirical performance
- …
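The gradient projection iteration described in this abstract can be sketched as follows. This is only a toy illustration under stated assumptions: the constraint is an l2 ball (one simple example of a sub-level set of a penalty function), the measurements are noise-free, and the small deterministic matrix `A`, the signal `x_true`, and the step size are hypothetical. It does not reproduce the random measurement ensembles or the underdetermined regime analyzed in the paper.

```python
import math


def matvec(A, x):
    """Return A x for a dense matrix A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Return A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def norm(x):
    return math.sqrt(sum(v * v for v in x))


def project_l2_ball(x, radius):
    """Euclidean projection onto {z : ||z||_2 <= radius}."""
    nx = norm(x)
    if nx <= radius:
        return list(x)
    return [v * radius / nx for v in x]


def projected_gradient(A, y, radius, step, n_iter=100):
    """Minimize 0.5 * ||A x - y||^2 over the l2 ball; return all iterates."""
    x = [0.0] * len(A[0])
    iterates = [x]
    for _ in range(n_iter):
        r = [p - yi for p, yi in zip(matvec(A, x), y)]
        grad = rmatvec(A, r)
        # Gradient step followed by projection back onto the constraint set.
        x = project_l2_ball([xj - step * gj for xj, gj in zip(x, grad)], radius)
        iterates.append(x)
    return iterates


# Hypothetical toy instance (not from the paper).
A = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
x_true = [0.6, -0.8]
y = matvec(A, x_true)                      # noise-free measurements
radius = norm(x_true)                      # constraint: penalty no larger than at x_true
step = 1.0 / sum(a * a for row in A for a in row)  # conservative step size

iterates = projected_gradient(A, y, radius, step)
errors = [norm([a - t for a, t in zip(x, x_true)]) for x in iterates]
```

Because this toy matrix has full column rank, the entries of `errors` should shrink geometrically; the paper's point is that a comparable linear rate can hold even without strong convexity, which this small example does not test.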