Conditioning of Random Block Subdictionaries with Applications to Block-Sparse Recovery and Regression

Author: Bajwa Waheed U.
Calderbank Robert
Duarte Marco F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

The linear model, in which a set of observations is assumed to be given by a linear combination of columns of a matrix, has long been the mainstay of the statistics and signal processing literature. One particular challenge for inference under linear models is understanding the conditions on the dictionary under which reliable inference is possible. This challenge has attracted renewed attention in recent years since many modern inference problems deal with the "underdetermined" setting, in which the number of observations is much smaller than the number of columns in the dictionary. This paper makes several contributions for this setting when the set of observations is given by a linear combination of a small number of groups of columns of the dictionary, termed the "block-sparse" case. First, it specifies conditions on the dictionary under which most block subdictionaries are well conditioned. This result is fundamentally different from prior work on block-sparse inference because (i) it provides conditions that can be explicitly computed in polynomial time, (ii) the given conditions translate into near-optimal scaling of the number of columns of the block subdictionaries as a function of the number of observations for a large class of dictionaries, and (iii) it suggests that the spectral norm and the quadratic-mean block coherence of the dictionary (rather than the worst-case coherences) fundamentally limit the scaling of dimensions of the well-conditioned block subdictionaries. Second, this paper investigates the problems of block-sparse recovery and block-sparse regression in underdetermined settings. Near-optimal block-sparse recovery and regression are possible for certain dictionaries as long as the dictionary satisfies easily computable conditions and the coefficients describing the linear combination of groups of columns can be modeled through a mild statistical prior.
Comment: 39 pages, 3 figures. A revised and expanded version of the paper published in IEEE Transactions on Information Theory (DOI: 10.1109/TIT.2015.2429632); this revision includes corrections in the proofs of some of the result
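
The claim that "most block subdictionaries are well conditioned" can be probed empirically. The sketch below is not the paper's analysis: it simply samples random sets of blocks and measures how far the eigenvalues of each subdictionary's Gram matrix are from 1. The function name, the contiguous equal-size block layout, and the parameters `block_size`, `k`, and `trials` are assumptions made for illustration.

```python
import numpy as np

def random_block_subdictionary_deviation(Phi, block_size, k, trials=200, seed=0):
    """Sample `trials` random subdictionaries made of k blocks each and
    return, for every sample, max |eigenvalue(Gram) - 1|.

    Assumes (for illustration) that the columns of Phi are grouped into
    contiguous blocks of equal size `block_size`.
    """
    Phi = np.asarray(Phi, dtype=float)
    n_cols = Phi.shape[1]
    if n_cols % block_size != 0:
        raise ValueError("number of columns must be a multiple of block_size")
    n_blocks = n_cols // block_size
    rng = np.random.default_rng(seed)

    deviation = np.empty(trials)
    for t in range(trials):
        # Pick k distinct blocks uniformly at random.
        blocks = rng.choice(n_blocks, size=k, replace=False)
        # Expand block indices into the column indices they cover.
        cols = (blocks[:, None] * block_size + np.arange(block_size)).ravel()
        sub = Phi[:, cols]
        # Eigenvalues of the Gram matrix; all close to 1 means well conditioned.
        eigvals = np.linalg.eigvalsh(sub.T @ sub)
        deviation[t] = np.max(np.abs(eigvals - 1.0))
    return deviation
```

A small deviation for most samples would be consistent with the kind of statement the abstract describes; a large deviation for a few samples is still compatible with a "most subdictionaries" guarantee.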

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst

Multi-task additive models with shared transfer functions based on dictionary learning

Author: Fawzi Alhussein
Frossard Pascal
Sinn Mathieu
Publication date: 22/04/2015

Additive models form a widely popular class of regression models which represent the relation between covariates and response variables as the sum of low-dimensional transfer functions. Besides flexibility and accuracy, a key benefit of these models is their interpretability: the transfer functions provide visual means for inspecting the models and identifying domain-specific relations between inputs and outputs. However, in large-scale problems involving the prediction of many related tasks, learning independently additive models results in a loss of model interpretability, and can cause overfitting when training data is scarce. We introduce a novel multi-task learning approach which provides a corpus of accurate and interpretable additive models for a large number of related forecasting tasks. Our key idea is to share transfer functions across models in order to reduce the model complexity and ease the exploration of the corpus. We establish a connection with sparse dictionary learning and propose a new efficient fitting algorithm which alternates between sparse coding and transfer function updates. The former step is solved via an extension of Orthogonal Matching Pursuit, whose properties are analyzed using a novel recovery condition which extends existing results in the literature. The latter step is addressed using a traditional dictionary update rule. Experiments on real-world data demonstrate that our approach compares favorably to baseline methods while yielding an interpretable corpus of models, revealing structure among the individual tasks and being more robust when training data is scarce. Our framework therefore extends the well-known benefits of additive models to common regression settings possibly involving thousands of tasks
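
For readers unfamiliar with the sparse coding step, the following is a minimal sketch of standard Orthogonal Matching Pursuit. It is not the extension proposed in the paper, whose details the abstract does not give. The function name `omp`, the tolerance `tol`, and the assumption that the columns of `Phi` have unit norm are illustrative choices.

```python
import numpy as np

def omp(Phi, y, k, tol=1e-10):
    """Standard Orthogonal Matching Pursuit (illustrative sketch).

    Greedily selects at most k columns of Phi (assumed unit norm) and
    refits y on the selected columns by least squares after each pick.
    """
    Phi = np.asarray(Phi, dtype=float)
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []
    coef = np.zeros(0)

    for _ in range(k):
        if np.linalg.norm(residual) <= tol:
            break
        # Column most correlated with the current residual.
        correlations = np.abs(Phi.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares refit on all selected columns.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef

    x = np.zeros(Phi.shape[1])
    if support:
        x[support] = coef
    return x, support
```

The least-squares refit is what distinguishes OMP from plain matching pursuit: after each selection the residual is orthogonal to every column chosen so far.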

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Computational Methods for Sparse Solution of Linear Inverse Problems

Author: Tropp Joel A.
Wright Stephen J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2009

The goal of the sparse approximation problem is to approximate a target signal using a linear combination of a few elementary signals drawn from a fixed collection. This paper surveys the major practical algorithms for sparse approximation. Specific attention is paid to computational issues, to the circumstances in which individual methods tend to perform well, and to the theoretical guarantees available. Many fundamental questions in electrical engineering, statistics, and applied mathematics can be posed as sparse approximation problems, making these algorithms versatile and relevant to a plethora of applications
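
One common way to write the sparse approximation problem described above is shown below. The symbols are assumptions, not taken from the abstract: $y$ is the target signal, the columns of $\Phi$ are the elementary signals, $x$ holds the coefficients, and $k$ bounds how many of them may be nonzero.

```latex
\min_{x \in \mathbb{R}^{N}} \; \| y - \Phi x \|_{2}
\quad \text{subject to} \quad \| x \|_{0} \le k
```

Here $\| x \|_{0}$ counts the nonzero entries of $x$. Other formulations exist, for example minimizing the number of nonzeros subject to an error bound.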

CiteSeerX

Caltech Authors

A Multiple Hypothesis Testing Approach to Low-Complexity Subspace Unmixing

Author: Bajwa Waheed U.
Mixon Dustin G.
Publication venue
Publication date: 19/11/2016

Subspace-based signal processing traditionally focuses on problems involving a few subspaces. Recently, a number of problems in different application areas have emerged that involve a significantly larger number of subspaces relative to the ambient dimension. It becomes imperative in such settings to first identify a smaller set of active subspaces that contribute to the observation before further processing can be carried out. This problem of identification of a small set of active subspaces among a huge collection of subspaces from a single (noisy) observation in the ambient space is termed subspace unmixing. This paper formally poses the subspace unmixing problem under the parsimonious subspace-sum (PS3) model, discusses connections of the PS3 model to problems in wireless communications, hyperspectral imaging, high-dimensional statistics and compressed sensing, and proposes a low-complexity algorithm, termed marginal subspace detection (MSD), for subspace unmixing. The MSD algorithm turns the subspace unmixing problem for the PS3 model into a multiple hypothesis testing (MHT) problem and its analysis in the paper helps control the family-wise error rate of this MHT problem at any level $\alpha \in [0,1]$ under two random signal generation models. Some other highlights of the analysis of the MSD algorithm include: (i) it is applicable to an arbitrary collection of subspaces on the Grassmann manifold; (ii) it relies on properties of the collection of subspaces that are computable in polynomial time; and (iii) it allows for linear scaling of the number of active subspaces as a function of the ambient dimension. Finally, numerical results are presented in the paper to better understand the performance of the MSD algorithm.
Comment: Submitted for journal publication; 33 pages, 14 figure
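
The idea of testing each subspace separately can be sketched as follows. This is only an illustration of per-subspace energy thresholding: the abstract gives neither the test statistic nor the thresholds, so the projection-energy statistic, the function name, and the user-supplied `thresholds` are assumptions. In particular, the thresholds here are not the ones the paper derives to control the family-wise error rate.

```python
import numpy as np

def marginal_subspace_detection(y, bases, thresholds):
    """Declare subspace i active when the energy of y projected onto it
    exceeds thresholds[i].

    y          : observation vector in the ambient space (real-valued here)
    bases      : list of matrices with orthonormal columns, one per subspace
    thresholds : scalar or per-subspace thresholds (hypothetical values)
    """
    y = np.asarray(y, dtype=float)
    # Energy of the orthogonal projection of y onto each subspace.
    energies = np.array([np.linalg.norm(U.T @ y) ** 2 for U in bases])
    active = np.flatnonzero(energies > np.asarray(thresholds))
    return active, energies
```

Each subspace is tested independently, so the cost is one small matrix-vector product per subspace.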

arXiv.org e-Print Archive

CiteSeerX

Frame Coherence and Sparse Signal Processing

Author: Bajwa Waheed U.
Calderbank Robert
Mixon Dustin G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

The sparse signal processing literature often uses random sensing matrices to obtain performance guarantees. Unfortunately, in the real world, sensing matrices do not always come from random processes. It is therefore desirable to evaluate whether an arbitrary matrix, or frame, is suitable for sensing sparse signals. To this end, the present paper investigates two parameters that measure the coherence of a frame: worst-case and average coherence. We first provide several examples of frames that have small spectral norm, worst-case coherence, and average coherence. Next, we present a new lower bound on worst-case coherence and compare it to the Welch bound. Later, we propose an algorithm that decreases the average coherence of a frame without changing its spectral norm or worst-case coherence. Finally, we use worst-case and average coherence, as opposed to the Restricted Isometry Property, to garner near-optimal probabilistic guarantees on both sparse signal detection and reconstruction in the presence of noise. This contrasts with recent results that only guarantee noiseless signal recovery from arbitrary frames, and which further assume independence across the nonzero entries of the signal; in a sense, requiring small average coherence replaces the need for such an assumption
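
The quantities named in the abstract are cheap to compute for a given frame. The sketch below uses definitions that are common in this literature but are not spelled out in the abstract, so treat them as assumptions: worst-case coherence is the largest absolute inner product between distinct unit-norm columns, average coherence is the largest absolute row sum of the off-diagonal Gram entries divided by N - 1, and the Welch bound is sqrt((N - M) / (M (N - 1))) for an M x N frame. The function name is hypothetical.

```python
import numpy as np

def coherence_stats(F):
    """Return (worst_case, average, spectral_norm, welch_bound) for a frame.

    F is an M x N matrix whose columns are the frame elements; columns are
    normalized to unit norm before anything is computed.
    """
    F = np.asarray(F)
    F = F / np.linalg.norm(F, axis=0, keepdims=True)
    M, N = F.shape

    gram = F.conj().T @ F
    off_diag = gram - np.diag(np.diag(gram))  # zero out the unit diagonal

    worst_case = np.max(np.abs(off_diag))
    average = np.max(np.abs(off_diag.sum(axis=1))) / (N - 1)
    spectral_norm = np.linalg.norm(F, 2)  # largest singular value
    welch_bound = np.sqrt((N - M) / (M * (N - 1))) if N > M else 0.0
    return worst_case, average, spectral_norm, welch_bound
```

For example, three unit vectors spaced 120 degrees apart in the plane have pairwise inner products of -1/2, so their worst-case coherence equals the Welch bound of 1/2.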

arXiv.org e-Print Archive

CiteSeerX

Crossref