362 research outputs found

    Combinatorial Penalties: Which structures are preserved by convex relaxations?

    We consider the homogeneous and the non-homogeneous convex relaxations for combinatorial penalty functions defined on support sets. Our study identifies key differences in the tightness of the resulting relaxations through the notion of the lower combinatorial envelope of a set-function, along with new necessary conditions for support identification. We then propose a general adaptive estimator for convex monotone regularizers, and derive new sufficient conditions for support recovery in the asymptotic setting.

    Structured sparsity-inducing norms through submodular functions

    Sparse methods for supervised learning aim at finding good linear predictors from as few variables as possible, i.e., with small cardinality of their supports. This combinatorial selection problem is often turned into a convex optimization problem by replacing the cardinality function by its convex envelope (tightest convex lower bound), in this case the L1-norm. In this paper, we investigate more general set-functions than the cardinality, which may incorporate prior knowledge or structural constraints common in many applications: namely, we show that for nondecreasing submodular set-functions, the corresponding convex envelope can be obtained from its Lovász extension, a common tool in submodular analysis. This defines a family of polyhedral norms, for which we provide generic algorithmic tools (subgradients and proximal operators) and theoretical results (conditions for support recovery or high-dimensional inference). By selecting specific submodular functions, we can give a new interpretation to known norms, such as those based on rank-statistics or grouped norms with potentially overlapping groups; we also define new norms, in particular ones that can be used as non-factorial priors for supervised learning.
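    The Lovász extension mentioned above is cheap to evaluate with the greedy (Edmonds) formula: sort the coordinates of |w| in decreasing order and accumulate the marginal gains of the set-function. Below is a minimal sketch in Python, assuming a generic set-function F with F(∅) = 0; the helper name lovasz_extension is illustrative, not from the paper.

```python
import numpy as np

def lovasz_extension(F, w):
    """Greedy evaluation of the norm Omega(w) = f(|w|), where f is the
    Lovász extension of a nondecreasing submodular set-function F
    (F maps a frozenset of indices to a real value, F(empty set) = 0)."""
    order = np.argsort(-np.abs(w))       # coordinates of |w|, decreasing
    val, prev, S = 0.0, 0.0, set()
    for k in order:
        S.add(int(k))
        cur = F(frozenset(S))
        val += abs(w[k]) * (cur - prev)  # marginal gain weighted by |w_k|
        prev = cur
    return val

# Example: with F = cardinality, the norm recovers the L1-norm.
card = lambda S: float(len(S))
w = np.array([0.5, -2.0, 1.0])
print(lovasz_extension(card, w))         # 3.5 == ||w||_1
```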

    Low Complexity Regularization of Linear Inverse Problems

    Inverse problems and regularization theory are central themes in contemporary signal processing, where the goal is to reconstruct an unknown signal from partial, indirect, and possibly noisy measurements of it. A now standard method for recovering the unknown signal is to solve a convex optimization problem that enforces some prior knowledge about its structure. This has proved efficient in many problems routinely encountered in imaging sciences, statistics and machine learning. This chapter delivers a review of recent advances in the field where the regularization prior promotes solutions conforming to some notion of simplicity/low complexity. These priors encompass as popular examples sparsity and group sparsity (to capture the compressibility of natural signals and images), total variation and analysis sparsity (to promote piecewise regularity), and low rank (as a natural extension of sparsity to matrix-valued data). Our aim is to provide a unified treatment of all these regularizations under a single umbrella, namely the theory of partial smoothness. This framework is very general and accommodates all the low-complexity regularizers just mentioned, as well as many others. Partial smoothness turns out to be the canonical way to encode low-dimensional models that can be linear spaces or more general smooth manifolds. This review is intended to serve as a one-stop shop for understanding the theoretical properties of the so-regularized solutions. It covers a large spectrum including: (i) recovery guarantees and stability to noise, both in terms of ℓ2-stability and model (manifold) identification; (ii) sensitivity analysis to perturbations of the parameters involved (in particular the observations), with applications to unbiased risk estimation; (iii) convergence properties of the forward-backward proximal splitting scheme, which is particularly well suited to solving the corresponding large-scale regularized optimization problem.
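    To make the forward-backward proximal splitting of item (iii) concrete, here is a minimal sketch of it for the lasso (ℓ1 regularization), where the backward step is the soft-thresholding proximal operator; the function name ista_lasso and the toy data are illustrative, not from the chapter.

```python
import numpy as np

def ista_lasso(X, y, lam, n_iter=500):
    """Forward-backward splitting (ISTA) for
    min_w 0.5 * ||Xw - y||^2 + lam * ||w||_1."""
    tau = 1.0 / np.linalg.norm(X, 2) ** 2     # 1 / Lipschitz constant of grad
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = w - tau * (X.T @ (X @ w - y))                       # forward step
        w = np.sign(z) * np.maximum(np.abs(z) - tau * lam, 0.0) # prox of l1
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 100))
w_true = np.zeros(100); w_true[:5] = 1.0
y = X @ w_true + 0.01 * rng.standard_normal(50)
print(np.flatnonzero(ista_lasso(X, y, lam=0.5)))  # estimated support
```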

    Simultaneous Variable and Covariance Selection with the Multivariate Spike-and-Slab Lasso

    We propose a Bayesian procedure for simultaneous variable and covariance selection using continuous spike-and-slab priors in multivariate linear regression models, where q possibly correlated responses are regressed onto p predictors. Rather than relying on a stochastic search through the high-dimensional model space, we develop an ECM algorithm similar to the EMVS procedure of Rockova & George (2014), targeting modal estimates of the matrix of regression coefficients and the residual precision matrix. Varying the scale of the continuous spike densities facilitates dynamic posterior exploration and allows us to filter out negligible regression coefficients and partial covariances gradually. Our method is seen to substantially outperform regularization competitors on simulated data. We demonstrate our method with a re-examination of data from a recent observational study of the effect of playing high school football on several later-life cognitive, psychological, and socio-economic outcomes.
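    The E-step of such an ECM algorithm computes, for each coefficient, the conditional probability of slab (versus spike) membership under a two-component Laplace mixture, which is standard in the spike-and-slab lasso literature. A minimal sketch under that form; the parameter names are generic notation, not the authors' code.

```python
import numpy as np

def slab_probability(beta, lam0, lam1, theta):
    """P(beta drawn from slab | beta), with Laplace spike (scale lam0,
    lam0 >> lam1, concentrated at zero), Laplace slab (scale lam1),
    and prior slab weight theta."""
    psi1 = 0.5 * lam1 * np.exp(-lam1 * np.abs(beta))  # slab density
    psi0 = 0.5 * lam0 * np.exp(-lam0 * np.abs(beta))  # spike density
    return theta * psi1 / (theta * psi1 + (1 - theta) * psi0)

# Small |beta| is attributed to the spike, large |beta| to the slab.
print(slab_probability(np.array([0.01, 0.5, 2.0]), lam0=20.0, lam1=1.0, theta=0.5))
```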

    Structured Bayesian variable selection for multiple correlated response variables and high-dimensional predictors

    It is becoming increasingly common in biomedicine to study complex associations between multiple phenotypes and high-dimensional genomic features. Such analyses require flexible and efficient joint statistical models when there are correlations both among the multiple response variables and among the high-dimensional predictors. We propose a structured multivariate Bayesian variable selection model to identify sparse predictors associated with multiple correlated response variables. The approach uses known structural information among the multiple response variables and the high-dimensional predictors via a Markov random field (MRF) prior on the latent indicator variables of the coefficient matrix of a sparse seemingly unrelated regressions (SSUR) model. The structural information included in the MRF prior can improve model performance (i.e., variable selection and response prediction) compared to other common priors. In addition, we employ random effects to capture the heterogeneity of grouped samples. The proposed approach is validated by simulation studies and applied to a pharmacogenomic study which includes pharmacological profiling and multi-omics data (i.e., gene expression, copy number variation and mutation) from in vitro anti-cancer drug sensitivity screening.
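    A standard MRF prior of this kind puts unnormalized log-density d·1ᵀγ + e·γᵀGγ on the binary inclusion indicators γ, where G is a known structure graph. A minimal sketch under that assumed form; d, e and G are generic hyperparameter names, not necessarily the paper's.

```python
import numpy as np

def mrf_log_prior(gamma, G, d=-3.0, e=0.5):
    """Unnormalized log MRF prior on binary indicators gamma:
    d * sum(gamma) + e * gamma' G gamma (d < 0 favours sparsity,
    e > 0 rewards jointly selecting linked variables)."""
    gamma = np.asarray(gamma, dtype=float)
    return d * gamma.sum() + e * gamma @ G @ gamma

G = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])               # toy structure graph: 0-1-2 chain
print(mrf_log_prior([1, 1, 0], G))      # -5.0: linked pair rewarded by e
print(mrf_log_prior([1, 0, 1], G))      # -6.0: unlinked pair, no bonus
```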

    Model Consistency of Partly Smooth Regularizers

    This paper studies least-squares regression penalized with partly smooth convex regularizers. This class of functions is very large and versatile, allowing one to promote solutions conforming to some notion of low complexity. Indeed, they force solutions of variational problems to belong to a low-dimensional manifold (the so-called model), which is stable under small perturbations of the function. This property is crucial to making the underlying low-complexity model robust to small noise. We show that a generalized "irrepresentable condition" implies stable model selection under small noise perturbations in the observations and the design matrix, when the regularization parameter is tuned proportionally to the noise level. This condition is shown to be almost necessary. We then show that this condition implies model consistency of the regularized estimator: with probability tending to one as the number of measurements increases, the regularized estimator belongs to the correct low-dimensional model manifold. This work unifies and generalizes several previous ones, where model consistency is known to hold for sparse, group sparse, total variation and low-rank regularizations.
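    For the lasso, the classical irrepresentable condition that this paper generalizes reads ‖X_{Sᶜ}ᵀ X_S (X_Sᵀ X_S)⁻¹ sign(w_S)‖_∞ < 1 and can be checked numerically. A minimal sketch; the helper name is illustrative.

```python
import numpy as np

def irrepresentable(X, support, sign_S):
    """Value of ||X_Sc' X_S (X_S' X_S)^{-1} sign(w_S)||_inf;
    a value < 1 indicates the lasso irrepresentable condition holds
    for this design and sign pattern."""
    S = np.asarray(support)
    Sc = np.setdiff1d(np.arange(X.shape[1]), S)
    XS, XSc = X[:, S], X[:, Sc]
    v = XSc.T @ XS @ np.linalg.solve(XS.T @ XS, sign_S)
    return np.max(np.abs(v))

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))      # well-conditioned random design
print(irrepresentable(X, support=[0, 1, 2], sign_S=np.ones(3)))
```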

    Statistical Learning for Resting-State fMRI: Successes and Challenges

    In the absence of external stimuli, fluctuations in cerebral activity can be used to reveal intrinsic structures. Well-conditioned probabilistic models of this so-called resting-state activity are needed to support neuroscientific hypotheses. Exploring two specific descriptions of resting-state fMRI, namely spatial analysis and connectivity graphs, we discuss the progress brought by statistical learning techniques, but also the neuroscientific picture that they paint, and possible modeling pitfalls.
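    For the connectivity-graph description, a common estimator in this literature, though not necessarily the authors' exact pipeline, is the graphical lasso for sparse inverse covariance, e.g. via scikit-learn; the toy time series below stand in for per-region fMRI signals.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
ts = rng.standard_normal((300, 10))   # 300 time points, 10 brain regions

model = GraphicalLassoCV().fit(ts)    # cross-validated sparsity level
precision = model.precision_          # sparse inverse covariance matrix
edges = np.abs(precision) > 1e-8      # nonzero partial correlations
print(edges.sum() - len(precision))   # count of off-diagonal "edges"
```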
    • …