Model Consistency of Partly Smooth Regularizers
This paper studies least-squares regression penalized with partly smooth
convex regularizers. This class of functions is very large and versatile,
making it possible to promote solutions conforming to some notion of low complexity.
Indeed, they force solutions of variational problems to belong to a
low-dimensional manifold (the so-called model) which is stable under small
perturbations of the function. This property is crucial to make the underlying
low-complexity model robust to small noise. We show that a generalized
"irrepresentable condition" implies stable model selection under small noise
perturbations in the observations and the design matrix, when the
regularization parameter is tuned proportionally to the noise level. This
condition is shown to be almost a necessary condition. We then show that this
condition implies model consistency of the regularized estimator. That is, with
a probability tending to one as the number of measurements increases, the
regularized estimator belongs to the correct low-dimensional model manifold.
This work unifies and generalizes several previous ones, where model
consistency is known to hold for sparse, group sparse, total variation and
low-rank regularizations.
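As a hedged illustration of the "irrepresentable condition" mentioned in this abstract, the sketch below evaluates the classical $\ell_1$ (Lasso) special case only, not the generalized condition for partly smooth regularizers. The function name `irrepresentable_value` and its arguments are hypothetical, introduced for this example. For a design matrix `X`, a candidate support `I` and a sign vector `s` on that support, it returns the largest value of $|X_j^\top X_I (X_I^\top X_I)^{-1} s|$ over columns $j$ outside the support; a value strictly below 1 is the usual sufficient condition for stable support recovery.

```python
import numpy as np

def irrepresentable_value(X, support, signs):
    """Classical l1 irrepresentable quantity (hypothetical helper name).

    Returns max over off-support columns j of
    |X_j^T X_I (X_I^T X_I)^{-1} s|, where I is `support` and s is `signs`.
    Assumes X_I has full column rank and that the support is not all columns.
    """
    support = np.asarray(support)
    X_I = X[:, support]
    off_support = np.setdiff1d(np.arange(X.shape[1]), support)
    # Solve (X_I^T X_I) w = s rather than forming the inverse explicitly.
    w = np.linalg.solve(X_I.T @ X_I, np.asarray(signs, dtype=float))
    return float(np.max(np.abs(X[:, off_support].T @ (X_I @ w))))

# Third column is the sum of the first two, so it correlates with the support.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
print(irrepresentable_value(X, [0, 1], [1, 1]))   # condition violated (value >= 1)
print(irrepresentable_value(X, [0, 1], [1, -1]))  # condition satisfied (value < 1)
```

The same support can pass or fail depending on the sign pattern, which is why the condition is stated in terms of both the support and the signs.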
Sparse Support Recovery with Non-smooth Loss Functions
In this paper, we study the support recovery guarantees of underdetermined
sparse regression using the $\ell_1$-norm as a regularizer and a non-smooth
loss function for data fidelity. More precisely, we focus in detail on the
cases of $\ell_1$ and $\ell_\infty$ losses, and contrast them with the usual
$\ell_2$ loss. While these losses are routinely used to account for either
sparse ($\ell_1$ loss) or uniform ($\ell_\infty$ loss) noise models, a
theoretical analysis of their performance is still lacking. In this article, we
extend the existing theory from the smooth $\ell_2$ case to these non-smooth
cases. We derive a sharp condition which ensures that the support of the vector
to recover is stable to small additive noise in the observations, as long as
the loss constraint size is tuned proportionally to the noise level. A
distinctive feature of our theory is that it also explains what happens when
the support is unstable. While the support is not stable anymore, we identify
an "extended support" and show that this extended support is stable to small
additive noise. To exemplify the usefulness of our theory, we give a detailed
numerical analysis of the support stability/instability of compressed sensing
recovery with these different losses. This highlights different parameter
regimes, ranging from total support stability to progressively increasing
support instability.
Comment: in Proc. NIPS 201
Model Consistency for Learning with Mirror-Stratifiable Regularizers
Low-complexity non-smooth convex regularizers are routinely used to impose
some structure (such as sparsity or low-rank) on the coefficients for linear
predictors in supervised learning. Model consistency then consists in selecting
the correct structure (for instance support or rank) by regularized empirical
risk minimization.
It is known that model consistency holds under appropriate non-degeneracy
conditions. However, such conditions typically fail for highly correlated
designs and it is observed that regularization methods tend to select larger
models.
In this work, we provide the theoretical underpinning of this behavior using
the notion of mirror-stratifiable regularizers. This class of regularizers
encompasses the most well-known in the literature, including the $\ell_1$ or
trace norms. It brings into play a pair of primal-dual models, which in turn
allows one to locate the structure of the solution using a specific dual
certificate.
We also show how this analysis is applicable to optimal solutions of the
learning problem, and also to the iterates computed by a certain class of
stochastic proximal-gradient algorithms.
Comment: 14 pages, 4 figure
Sensitivity Analysis for Mirror-Stratifiable Convex Functions
This paper provides a set of sensitivity analysis and activity identification
results for a class of convex functions with a strong geometric structure, that
we coined "mirror-stratifiable". These functions are such that there is a
bijection between a primal and a dual stratification of the space into
partitioning sets, called strata. This pairing is crucial to track the strata
that are identifiable by solutions of parametrized optimization problems or by
iterates of optimization algorithms. This class of functions encompasses all
regularizers routinely used in signal and image processing, machine learning,
and statistics. We show that this "mirror-stratifiable" structure enjoys a nice
sensitivity theory, allowing us to study stability of solutions of optimization
problems to small perturbations, as well as activity identification of
first-order proximal splitting-type algorithms. Existing results in the
literature typically assume that, under a non-degeneracy condition, the active
set associated to a minimizer is stable to small perturbations and is
identified in finite time by optimization schemes. In contrast, our results do
not require any non-degeneracy assumption: in consequence, the optimal active
set is not necessarily stable anymore, but we are able to track precisely the
set of identifiable strata. We show that these results have crucial implications
when solving challenging ill-posed inverse problems via regularization, a
typical scenario where the non-degeneracy condition is not fulfilled. Our
theoretical results, illustrated by numerical simulations, make it possible to
characterize the instability behaviour of the regularized solutions, by
locating the set of all low-dimensional strata that can be potentially
identified by these solutions.
Convergence of the Forward-Backward Algorithm: Beyond the Worst Case with the Help of Geometry
We provide a comprehensive study of the convergence of the forward-backward
algorithm under suitable geometric conditions leading to fast rates. We present
several new results and collect in a unified view a variety of results
scattered in the literature, often providing simplified proofs. Novel
contributions include the analysis of infinite dimensional convex minimization
problems, allowing the case where minimizers might not exist. Further, we
analyze the relation between different geometric conditions, and discuss novel
connections with a priori conditions in linear inverse problems, including
source conditions, restricted isometry properties and partial smoothness.
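For readers unfamiliar with the forward-backward algorithm discussed in this abstract, here is a minimal sketch on one standard instance, the Lasso problem $\min_x \tfrac12\|Ax-b\|^2 + \lambda\|x\|_1$. The problem choice, the function names and the fixed step size $1/\|A\|^2$ are assumptions made for illustration; they are not taken from the paper. Each iteration takes a gradient (forward) step on the smooth term, then applies the proximal (backward) step of the $\ell_1$ term, which is soft-thresholding.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def forward_backward_lasso(A, b, lam, n_iter=500):
    """Forward-backward iterations for 0.5*||Ax - b||^2 + lam*||x||_1.

    Uses the constant step 1/L, where L = ||A||_2^2 is the Lipschitz
    constant of the gradient of the smooth term. Returns the last
    iterate and the objective value recorded after each iteration.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    objectives = []
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                          # forward (gradient) step
        x = soft_threshold(x - step * grad, step * lam)   # backward (proximal) step
        objectives.append(0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x)))
    return x, np.array(objectives)

# Small synthetic example with a sparse ground truth (hypothetical data).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
x_hat, objectives = forward_backward_lasso(A, A @ x_true, lam=0.5)
print("nonzero entries of the last iterate:", np.flatnonzero(x_hat))
```

With this step size the objective is non-increasing along the iterates. How fast it decreases depends on the geometry of the problem, which is the subject of the abstract.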
Activity Identification and Local Linear Convergence of Forward--Backward-type methods
In this paper, we consider a class of Forward--Backward (FB) splitting
methods that includes several variants (e.g. inertial schemes, FISTA) for
minimizing the sum of two proper convex and lower semi-continuous functions,
one of which has a Lipschitz continuous gradient, and the other is partly
smooth relative to a smooth active manifold. We propose a
unified framework, under which we show that this class of FB-type algorithms
(i) correctly identifies the active manifolds in a finite number of iterations
(finite activity identification), and (ii) then enters a local linear
convergence regime, which we characterize precisely in terms of the structure
of the underlying active manifolds. For simpler problems involving polyhedral
functions, we show finite termination. We also establish and explain why FISTA
(with convergent sequences) locally oscillates and can be slower than FB. These
results may have numerous applications including in signal/image processing,
sparse recovery and machine learning. Indeed, the obtained results explain the
typical behaviour that has been observed numerically for many problems in these
fields such as the Lasso, the group Lasso, the fused Lasso and the nuclear norm
regularization, to name only a few.
Comment: Full length version of the previous short on