Coding-Theoretic Methods for Sparse Recovery
We review connections between coding-theoretic objects and sparse learning
problems. In particular, we show how seemingly different combinatorial objects
such as error-correcting codes, combinatorial designs, spherical codes,
compressed sensing matrices and group testing designs can be obtained from one
another. The reductions enable one to translate upper and lower bounds on the
parameters attainable by one object to another. We survey some of the
well-known reductions in a unified presentation, and bring some existing gaps
to attention. New reductions are also introduced; in particular, we bring up
the notion of minimum "L-wise distance" of codes and show that this notion
closely captures the combinatorial structure of RIP-2 matrices. Moreover, we
show how this weaker variation of the minimum distance is related to
combinatorial list-decoding properties of codes.
Comment: Added Lemma 34 in the first revision. Original version in Proceedings
of the Allerton Conference on Communication, Control and Computing, September
201
Support Recovery of Sparse Signals
We consider the problem of exact support recovery of sparse signals via noisy
measurements. The main focus is the sufficient and necessary conditions on the
number of measurements for support recovery to be reliable. By drawing an
analogy between the problem of support recovery and the problem of channel
coding over the Gaussian multiple access channel, and exploiting mathematical
tools developed for the latter problem, we obtain an information theoretic
framework for analyzing the performance limits of support recovery. Sharp
sufficient and necessary conditions on the number of measurements in terms of
the signal sparsity level and the measurement noise level are derived.
Specifically, when the number of nonzero entries is held fixed, the exact
asymptotics on the number of measurements for support recovery is developed.
When the number of nonzero entries increases in certain manners, we obtain
sufficient conditions tighter than existing results. In addition, we show that
the proposed methodology can deal with a variety of models of sparse signal
recovery, hence demonstrating its potential as an effective analytical tool.
Comment: 33 page
Compressed sensing using sparse binary measurements: a rateless coding perspective
Compressed Sensing (CS) methods using sparse binary measurement matrices and iterative message-passing recovery procedures have recently been investigated due to their low computational complexity and excellent performance. Drawing much of their inspiration from sparse-graph codes such as Low-Density Parity-Check (LDPC) codes, these studies use analytical tools from modern coding theory to analyze CS solutions. In this paper, we consider and systematically analyze the CS setup inspired by a class of efficient, popular and flexible sparse-graph codes called rateless codes. The proposed rateless CS setup is asymptotically analyzed using tools such as Density Evolution and EXIT charts and fine-tuned using degree distribution optimization techniques.
Discovery of low-dimensional structure in high-dimensional inference problems
Many learning and inference problems involve high-dimensional data such as images, video or genomic data, which cannot be processed efficiently using conventional methods due to their dimensionality. However, high-dimensional data often exhibit an inherent low-dimensional structure; for instance, they can often be represented sparsely in some basis or domain. The discovery of an underlying low-dimensional structure is important for developing more robust and efficient analysis and processing algorithms.
The first part of the dissertation investigates the statistical complexity of sparse recovery problems, including sparse linear and nonlinear regression models, feature selection and graph estimation. We present a framework that unifies sparse recovery problems and construct an analogy to channel coding in classical information theory. We perform an information-theoretic analysis to derive bounds on the number of samples required to reliably recover sparsity patterns independent of any specific recovery algorithm. In particular, we show that sample complexity can be tightly characterized using a mutual information formula similar to channel coding results. Next, we derive major extensions to this framework, including dependent input variables and a lower bound for sequential adaptive recovery schemes, which helps determine whether adaptivity provides performance gains. We compute statistical complexity bounds for various sparse recovery problems, showing our analysis improves upon the existing bounds and leads to intuitive results for new applications.
In the second part, we investigate methods for improving the computational complexity of subgraph detection in graph-structured data, where we aim to discover anomalous patterns present in a connected subgraph of a given graph. This problem arises in many applications such as detection of network intrusions, community detection, detection of anomalous events in surveillance videos or disease outbreaks. Since optimization over connected subgraphs is a combinatorial and computationally difficult problem, we propose a convex relaxation that offers a principled approach to incorporating connectivity and conductance constraints on candidate subgraphs. We develop a novel nearly linear-time algorithm to solve the relaxed problem, establish convergence and consistency guarantees, and demonstrate its feasibility and performance with experiments on real networks.