Combinatorial Penalties: Which structures are preserved by convex relaxations?
We consider the homogeneous and the non-homogeneous convex relaxations for
combinatorial penalty functions defined on support sets. Our study identifies
key differences in the tightness of the resulting relaxations through the
notion of the lower combinatorial envelope of a set-function along with new
necessary conditions for support identification. We then propose a general
adaptive estimator for convex monotone regularizers, and derive new sufficient
conditions for support recovery in the asymptotic setting
Structured sparsity-inducing norms through submodular functions
Sparse methods for supervised learning aim at finding good linear predictors
from as few variables as possible, i.e., with small cardinality of their
supports. This combinatorial selection problem is often turned into a convex
optimization problem by replacing the cardinality function by its convex
envelope (tightest convex lower bound), in this case the L1-norm. In this
paper, we investigate more general set-functions than the cardinality, that may
incorporate prior knowledge or structural constraints which are common in many
applications: namely, we show that for nondecreasing submodular set-functions,
the corresponding convex envelope can be obtained from its Lovász extension, a
common tool in submodular analysis. This defines a family of polyhedral norms,
for which we provide generic algorithmic tools (subgradients and proximal
operators) and theoretical results (conditions for support recovery or
high-dimensional inference). By selecting specific submodular functions, we can
give a new interpretation to known norms, such as those based on
rank-statistics or grouped norms with potentially overlapping groups; we also
define new norms, in particular ones that can be used as non-factorial priors
for supervised learning
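
The construction described in this abstract can be illustrated with a minimal sketch, which is not the authors' code. It assumes a set-function `F` with `F(empty set) = 0` and evaluates its Lovász extension at the vector of absolute values `|w|`. The function names `lovasz_extension` and `submodular_norm` and the two example set-functions are hypothetical choices made for illustration.

```python
def lovasz_extension(F, w):
    """Evaluate the Lovász extension of a set-function F at a nonnegative vector w.

    Assumes F(frozenset()) == 0. Coordinates are visited in decreasing order
    of w, and each one is weighted by the marginal gain of adding it to the set.
    """
    order = sorted(range(len(w)), key=lambda j: -w[j])
    value, previous, chosen = 0.0, 0.0, []
    for j in order:
        chosen.append(j)
        current = F(frozenset(chosen))
        value += w[j] * (current - previous)
        previous = current
    return value


def submodular_norm(F, w):
    """Hypothetical helper: the Lovász extension of F applied to |w|."""
    return lovasz_extension(F, [abs(x) for x in w])


# Illustrative set-functions (assumptions for this sketch).
def cardinality(S):
    return float(len(S))          # F(S) = |S|


def nonempty(S):
    return float(min(len(S), 1))  # F(S) = min(|S|, 1)


w = [0.5, -2.0, 0.0, 1.5]
l1_value = submodular_norm(cardinality, w)   # sum of absolute values
linf_value = submodular_norm(nonempty, w)    # largest absolute value
```

With the cardinality function the sketch returns the sum of absolute values, consistent with the abstract's remark that the convex envelope of the cardinality is the L1-norm. With `F(S) = min(|S|, 1)` it returns the largest absolute value.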
Low Complexity Regularization of Linear Inverse Problems
Inverse problems and regularization theory is a central theme in contemporary
signal processing, where the goal is to reconstruct an unknown signal from
partial, indirect, and possibly noisy measurements of it. A now standard method
for recovering the unknown signal is to solve a convex optimization problem
that enforces some prior knowledge about its structure. This has proved
efficient in many problems routinely encountered in imaging sciences,
statistics and machine learning. This chapter delivers a review of recent
advances in the field where the regularization prior promotes solutions
conforming to some notion of simplicity/low-complexity. These priors encompass
as popular examples sparsity and group sparsity (to capture the compressibility
of natural signals and images), total variation and analysis sparsity (to
promote piecewise regularity), and low-rank (as natural extension of sparsity
to matrix-valued data). Our aim is to provide a unified treatment of all these
regularizations under a single umbrella, namely the theory of partial
smoothness. This framework is very general and accommodates all low-complexity
regularizers just mentioned, as well as many others. Partial smoothness turns
out to be the canonical way to encode low-dimensional models that can be linear
spaces or more general smooth manifolds. This review is intended to serve as a
one-stop shop toward the understanding of the theoretical properties of the
so-regularized solutions. It covers a large spectrum including: (i) recovery
guarantees and stability to noise, both in terms of $\ell^2$-stability and
model (manifold) identification; (ii) sensitivity analysis to perturbations of
the parameters involved (in particular the observations), with applications to
unbiased risk estimation; (iii) convergence properties of the forward-backward
proximal splitting scheme, that is particularly well suited to solve the
corresponding large-scale regularized optimization problem
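
As a hedged illustration of the forward-backward proximal splitting scheme mentioned in item (iii), the sketch below applies it to one specific low-complexity prior, the L1-norm. It assumes the objective 0.5 * ||A x - y||^2 + lam * ||x||_1. The helper names, the plain-list matrix representation, and the toy data are assumptions for illustration and are not taken from the chapter.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink every entry toward zero by t."""
    return [(1.0 if vi > 0 else -1.0) * max(abs(vi) - t, 0.0) for vi in v]


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def forward_backward(A, y, lam, step, n_iter=200):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by forward-backward splitting.

    `step` should be at most 1 / L, where L is the largest eigenvalue of A^T A.
    """
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)                       # gradient of the smooth term
        forward = [xi - step * g for xi, g in zip(x, grad)]  # forward (gradient) step
        x = soft_threshold(forward, step * lam)           # backward (proximal) step
    return x


# Toy check (assumed data): with A equal to the identity, the minimizer is the
# soft-thresholding of y.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.2]
x_hat = forward_backward(identity, y, lam=1.0, step=1.0)
```

Other regularizers of the kind surveyed in the chapter would replace `soft_threshold` with their own proximal operator; the gradient step on the data-fidelity term stays the same.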
Simultaneous Variable and Covariance Selection with the Multivariate Spike-and-Slab Lasso
We propose a Bayesian procedure for simultaneous variable and covariance
selection using continuous spike-and-slab priors in multivariate linear
regression models where q possibly correlated responses are regressed onto p
predictors. Rather than relying on a stochastic search through the
high-dimensional model space, we develop an ECM algorithm similar to the EMVS
procedure of Rockova & George (2014) targeting modal estimates of the matrix of
regression coefficients and residual precision matrix. Varying the scale of the
continuous spike densities facilitates dynamic posterior exploration and allows
us to filter out negligible regression coefficients and partial covariances
gradually. Our method is seen to substantially outperform regularization
competitors on simulated data. We demonstrate our method with a re-examination
of data from a recent observational study of the effect of playing high school
football on several later-life cognitive, psychological, and socio-economic
outcomes
Structured Bayesian variable selection for multiple correlated response variables and high-dimensional predictors
It is becoming increasingly common to study complex associations between
multiple phenotypes and high-dimensional genomic features in biomedicine.
However, it requires flexible and efficient joint statistical models if there
are correlations between multiple response variables and between
high-dimensional predictors. We propose a structured multivariate Bayesian
variable selection model to identify sparse predictors associated with multiple
correlated response variables. The approach makes use of known structure
information between the multiple response variables and high-dimensional
predictors via a Markov random field (MRF) prior for the latent indicator
variables of the coefficient matrix of a sparse seemingly unrelated regressions
(SSUR). The structure information included in the MRF prior can improve the
model performance (i.e., variable selection and response prediction) compared
to other common priors. In addition, we employ random effects to capture
heterogeneity of grouped samples. The proposed approach is validated by
simulation studies and applied to a pharmacogenomic study which includes
pharmacological profiling and multi-omics data (i.e., gene expression, copy
number variation and mutation) from in vitro anti-cancer drug sensitivity
screening
Model Consistency of Partly Smooth Regularizers
This paper studies least-squares regression penalized with partly smooth
convex regularizers. This class of functions is very large and versatile,
and can promote solutions conforming to some notion of low-complexity.
Indeed, they force solutions of variational problems to belong to a
low-dimensional manifold (the so-called model) which is stable under small
perturbations of the function. This property is crucial to make the underlying
low-complexity model robust to small noise. We show that a generalized
"irrepresentable condition" implies stable model selection under small noise
perturbations in the observations and the design matrix, when the
regularization parameter is tuned proportionally to the noise level. This
condition is shown to be almost a necessary condition. We then show that this
condition implies model consistency of the regularized estimator. That is, with
a probability tending to one as the number of measurements increases, the
regularized estimator belongs to the correct low-dimensional model manifold.
This work unifies and generalizes several previous ones, where model
consistency is known to hold for sparse, group sparse, total variation and
low-rank regularizations
Statistical Learning for Resting-State fMRI: Successes and Challenges
In the absence of external stimuli, fluctuations in cerebral activity can be used to reveal intrinsic structures. Well-conditioned probabilistic models of this so-called resting-state activity are needed to support neuroscientific hypotheses. Exploring two specific descriptions of resting-state fMRI, namely spatial analysis and connectivity graphs, we discuss the progress brought by statistical learning techniques, but also the neuroscientific picture that they paint, and possible modeling pitfalls