Theoretical Properties of the Overlapping Groups Lasso
We present two sets of theoretical results on the grouped lasso with overlap
of Jacob, Obozinski and Vert (2009) in the linear regression setting. This
method allows for joint selection of predictors in sparse regression, with
complex structured sparsity over the predictors encoded as a set of groups.
This flexible framework suggests that arbitrarily complex structures can be
encoded with an intricate set of groups. Our results show that this strategy
has unexpected theoretical consequences for the procedure. In
particular, we give two sets of results: (1) finite sample bounds on prediction
and estimation, and (2) asymptotic distribution and selection. Both sets of
results give insight into the consequences of choosing an increasingly complex
set of groups for the procedure, as well as what happens when the set of groups
cannot recover the true sparsity pattern. Additionally, these results
demonstrate the differences and similarities between the grouped lasso
procedure with and without overlapping groups. Our analysis shows that the set
of groups must be chosen with caution: an overly complex set of groups will
damage the analysis.
Comment: 20 pages, submitted to the Annals of Statistics
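To make the kind of penalty at issue concrete, here is a minimal sketch of a group-lasso penalty with overlapping index sets in Python. The function name and toy groups are ours, and this is the direct "shared coefficient" form; Jacob, Obozinski and Vert's formulation instead decomposes the coefficient vector into latent group-supported parts, which this sketch does not implement.

```python
import numpy as np

def overlap_group_penalty(beta, groups, weights=None):
    """Group-lasso penalty sum_g w_g * ||beta_g||_2 where the index
    sets in `groups` may overlap (a coefficient can appear in several
    groups and is then penalized through each of them)."""
    if weights is None:
        weights = [1.0] * len(groups)
    return sum(w * np.linalg.norm(beta[list(g)])
               for g, w in zip(groups, weights))

beta = np.array([1.0, -2.0, 0.0, 3.0])
groups = [[0, 1], [1, 2, 3]]   # index 1 belongs to both groups
pen = overlap_group_penalty(beta, groups)   # sqrt(5) + sqrt(13)
```

Increasing the number and intricacy of the groups adds more norm terms of this form, which is one way to see why an overly complex set of groups changes the behavior of the procedure.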
Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping
We consider the problem of estimating a sparse multi-response regression
function, with an application to expression quantitative trait locus (eQTL)
mapping, where the goal is to discover genetic variations that influence
gene-expression levels. In particular, we investigate a shrinkage technique
capable of capturing a given hierarchical structure over the responses, such as
a hierarchical clustering tree with leaf nodes for responses and internal nodes
for clusters of related responses at multiple granularity, and we seek to
leverage this structure to recover covariates relevant to each
hierarchically-defined cluster of responses. We propose a tree-guided group
lasso, or tree lasso, for estimating such structured sparsity under
multi-response regression by employing a novel penalty function constructed
from the tree. We describe a systematic weighting scheme for the overlapping
groups in the tree-penalty such that each regression coefficient is penalized
in a balanced manner despite the inhomogeneous multiplicity of group
memberships of the regression coefficients due to overlaps among groups. For
efficient optimization, we employ a smoothing proximal gradient method that was
originally developed for a general class of structured-sparsity-inducing
penalties. Using simulated and yeast data sets, we demonstrate that our method
shows a superior performance in terms of both prediction errors and recovery of
true sparsity patterns, compared to other methods for learning a
multivariate-response regression.
Comment: Published at http://dx.doi.org/10.1214/12-AOAS549 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
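The tree-structured groups and the balancing issue can be sketched on a toy example (a simplification of our own, not the paper's weighting scheme, which propagates weights down the tree): each node of a tree over the responses contributes one group containing the responses under it, and for a balanced tree every response falls in the same number of groups, so uniform node weights already penalize each coefficient in a balanced way.

```python
import numpy as np

# Toy balanced tree over 4 responses: one group per node
# (4 leaves, 2 internal nodes, 1 root); every response lies in 3 groups.
tree_groups = [[0], [1], [2], [3], [0, 1], [2, 3], [0, 1, 2, 3]]
weights = [1.0 / 3.0] * len(tree_groups)  # each response's total weight is 1

def tree_penalty(B, groups, weights):
    """Tree-guided group-lasso penalty for a coefficient matrix B
    (covariates x responses): sum over covariates j and tree nodes v of
    w_v * ||B[j, G_v]||_2, with G_v the responses under node v."""
    return sum(w * np.linalg.norm(B[j, list(g)])
               for j in range(B.shape[0])
               for g, w in zip(groups, weights))

B = np.array([[1.0, 0.0, 0.0, 0.0],    # covariate 0: affects response 0 only
              [0.0, 2.0, 2.0, 0.0]])   # covariate 1: affects responses 1, 2
pen = tree_penalty(B, tree_groups, weights)
```

For an unbalanced tree the membership counts differ across responses, which is exactly the inhomogeneous-multiplicity problem the paper's systematic weighting scheme addresses.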
A Constrained Matching Pursuit Approach to Audio Declipping
© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Distributing the Kalman Filter for Large-Scale Systems
This paper derives a \emph{distributed} Kalman filter to estimate a sparsely
connected, large-scale, n-dimensional, dynamical system monitored by a
network of N sensors. Local Kalman filters are implemented on the smaller
(n_l-dimensional, where n_l << n) sub-systems that are obtained after
spatially decomposing the large-scale system. The resulting sub-systems
overlap, which along with an assimilation procedure on the local Kalman
filters, preserves an Lth order Gauss-Markovian structure of the centralized
error processes. The information loss due to the Lth order Gauss-Markovian
approximation is controllable as it can be characterized by a divergence that
decreases as L increases. The order of the approximation, L, leads to a lower
bound on the dimension of the sub-systems, hence providing a criterion for
sub-system selection. The assimilation procedure is carried out on the local
error covariances with a distributed iterate collapse inversion (DICI)
algorithm that we introduce. The DICI algorithm computes the (approximated)
centralized Riccati and Lyapunov equations iteratively with only local
communication and low-order computation. We fuse the observations that are
common among the local Kalman filters using bipartite fusion graphs and
consensus averaging algorithms. The proposed algorithm achieves full
distribution of the Kalman filter that is coherent with the centralized Kalman
filter with an Lth order Gauss-Markovian structure on the centralized
error processes. Nowhere are storage, communication, or computation of
n-dimensional vectors and matrices needed; only n_l-dimensional
vectors and matrices are communicated or used in the computation at the
sensors
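The overlap-plus-fusion idea can be sketched on a toy problem. This is a drastic simplification of our own, not the paper's DICI algorithm: identity dynamics, every state observed locally, and the single shared state fused by plain averaging rather than consensus iterations over a fusion graph.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-state random walk x_{k+1} = x_k + w_k, split into two
# overlapping sub-systems: S1 = states {0, 1}, S2 = states {1, 2}.
subs = [[0, 1], [1, 2]]
Q, R = 0.01, 0.1          # process and measurement noise variances
x = np.zeros(3)           # true state

# Local filter state: estimate and error covariance per sub-system.
est = [np.zeros(2), np.zeros(2)]
P = [np.eye(2), np.eye(2)]

for k in range(50):
    x = x + rng.normal(0.0, np.sqrt(Q), 3)   # true dynamics
    z = x + rng.normal(0.0, np.sqrt(R), 3)   # each state measured locally
    for i, idx in enumerate(subs):
        # Local predict (identity dynamics) + update on local measurements.
        P[i] = P[i] + Q * np.eye(2)
        K = P[i] @ np.linalg.inv(P[i] + R * np.eye(2))
        est[i] = est[i] + K @ (z[idx] - est[i])
        P[i] = (np.eye(2) - K) @ P[i]
    # Fuse the state shared by both sub-systems (global index 1).
    shared = 0.5 * (est[0][1] + est[1][0])
    est[0][1] = est[1][0] = shared

# Assemble the global estimate from the local ones.
x_hat = np.array([est[0][0], est[0][1], est[1][1]])
```

Only 2-dimensional vectors and matrices are stored or communicated here; the paper's contribution is doing the covariance assimilation (the DICI iterations) so that this kind of decomposition stays coherent with the centralized filter.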
High Dimensional Classification with Combined Adaptive Sparse PLS and Logistic Regression
Motivation: The high dimensionality of genomic data calls for the development
of specific classification methodologies, especially to prevent over-optimistic
predictions. This challenge can be tackled by compression and variable
selection, which combined constitute a powerful framework for classification,
as well as data visualization and interpretation. However, currently proposed
combinations lead to unstable and non-convergent methods due to inappropriate
computational frameworks. We hereby propose a stable and convergent approach
for classification in high dimension, based on sparse Partial Least Squares
(sparse PLS). Results: We start by proposing a new solution for the sparse PLS
problem that is based on proximal operators for the case of univariate
responses. Then we develop an adaptive version of the sparse PLS for
classification, which combines iterative optimization of logistic regression
and sparse PLS to ensure convergence and stability. Our results are confirmed
on synthetic and experimental data. In particular we show how crucial
convergence and stability can be when cross-validation is involved for
calibration purposes. Using gene expression data we explore the prediction of
breast cancer relapse. We also propose a multicategorical version of our method
for the prediction of cell types based on single-cell expression data.
Availability: Our approach is implemented in the plsgenomics R package.
Comment: 9 pages, 3 figures, 4 tables + Supplementary Materials: 8 pages, 3
figures, 10 tables
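The proximal-operator view of a sparse PLS step can be sketched as follows. This is a simplified illustration under our own assumptions, not the authors' exact algorithm: the first weight vector for a univariate response is obtained by soft thresholding the covariance X^T y (soft thresholding being the proximal operator of the scaled l1 norm), then renormalizing; the threshold choice below is a hypothetical heuristic.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_1 (componentwise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_pls_direction(X, y, lam):
    """First sparse PLS weight vector for a univariate response y:
    soft-threshold the covariance X^T y, then renormalize."""
    w = soft_threshold(X.T @ y, lam)
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=30)  # only feature 0 is relevant
lam = 0.5 * np.max(np.abs(X.T @ y))                 # heuristic threshold
w = sparse_pls_direction(X, y, lam)
```

In the full method this sparse direction extraction would be alternated with iteratively reweighted logistic regression steps, which is where the convergence and stability questions the abstract emphasizes arise.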