Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed today do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated {\em structured} sparsity models, which describe the
interdependencies between the nonzero components of a signal, increasing the
interpretability of the results and leading to better recovery performance.
To better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications.
Comment: 30 pages, 18 figures
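As a toy sketch of the kind of convex relaxation this abstract discusses, the block soft-thresholding routine below is the proximal operator of the group-lasso (l1/l2) penalty, a standard convex proxy for group-sparse structure; the function name and toy data are illustrative assumptions, not code from the chapter.

```python
import numpy as np

def prox_group_lasso(x, groups, lam):
    """Block soft-thresholding: proximal operator of the group-lasso
    penalty lam * sum_g ||x_g||_2. Entire groups are either zeroed
    out together or shrunk together, encoding group sparsity."""
    out = np.zeros_like(x, dtype=float)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > lam:
            out[g] = (1.0 - lam / norm) * x[g]
    return out

# Hypothetical signal with two non-overlapping groups.
x = np.array([3.0, 4.0, 0.1, -0.1])
groups = [[0, 1], [2, 3]]
# The first group (norm 5) is shrunk to [2.4, 3.2]; the second
# group (norm ~0.14 < lam) is zeroed out as a whole.
result = prox_group_lasso(x, groups, lam=1.0)
```

Note how sparsity is enforced at the group level rather than entry by entry, which is exactly what distinguishes structured from simple sparsity.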
A measure of individual role in collective dynamics
Identifying key players in collective dynamics remains a challenge in several
research fields, from the efficient dissemination of ideas to drug target
discovery in biomedical problems. The difficulty lies at several levels: how to
single out the role of individual elements in such intermingled systems, or
which is the best way to quantify their importance. Centrality measures
describe a node's importance by its position in a network. The key issue these
measures overlook is that the contribution of a node to the collective behavior
is not uniquely determined by the structure of the system, but results from the
interplay between dynamics and network structure. We show that dynamical
influence explicitly measures how strongly a node's dynamical state affects
collective behavior. For critical spreading, dynamical influence targets nodes
according to their spreading capabilities. For diffusive processes it
quantifies how efficiently real systems may be controlled by manipulating a
single node.
Comment: accepted for publication in Scientific Reports
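For a linearized setting, an influence score of this flavor can be sketched as the leading left eigenvector of the coupling matrix, computed by power iteration; the matrix below and the function name are hypothetical illustrations, not the paper's definition or data.

```python
import numpy as np

def dynamical_influence(A, max_iter=1000, tol=1e-12):
    """Power iteration for the leading left eigenvector of a coupling
    matrix A (i.e., the leading right eigenvector of A.T). For linear
    dynamics x' = A x, this vector weights how strongly each node's
    state shapes the long-time collective mode -- an illustrative
    proxy for a dynamics-aware influence measure."""
    v = np.ones(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(max_iter):
        w = A.T @ v
        w /= np.linalg.norm(w)
        if np.linalg.norm(w - v) < tol:
            break
        v = w
    return v

# Toy weighted directed network (hypothetical data): A[i, j] is the
# strength with which node i drives node j.
A = np.array([[0.0, 1.0, 1.0],
              [0.2, 0.0, 1.0],
              [0.2, 0.2, 0.0]])
v = dynamical_influence(A)
```

Because the score depends on the coupling weights and not only on the unweighted topology, two nodes with identical degree can receive very different influence values.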
Phase Transitions in the Pooled Data Problem
In this paper, we study the pooled data problem of identifying the labels
associated with a large collection of items, based on a sequence of pooled
tests revealing the counts of each label within the pool. In the noiseless
setting, we identify an exact asymptotic threshold on the required number of
tests with optimal decoding, and prove a phase transition between complete
success and complete failure. In addition, we present a novel noisy variation
of the problem, and provide an information-theoretic framework for
characterizing the required number of tests for general random noise models.
Our results reveal that noise can make the problem considerably more difficult,
with strict increases in the scaling laws even at low noise levels. Finally, we
demonstrate similar behavior in an approximate recovery setting, where a given
number of errors is allowed in the decoded labels.
Comment: Accepted to NIPS 201
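The noiseless measurement model described here can be simulated in a few lines: each test draws a random pool of items and reveals the histogram of labels within that pool. The setup below is an illustrative sketch under our own parameter choices, not the paper's code.

```python
import numpy as np

def pooled_tests(labels, num_tests, pool_size, num_labels, rng):
    """Simulate noiseless pooled data measurements: each test pools a
    random subset of items and reports the count of each label in
    the pool (no individual identities are revealed)."""
    n = len(labels)
    pools, counts = [], []
    for _ in range(num_tests):
        pool = rng.choice(n, size=pool_size, replace=False)
        counts.append(np.bincount(labels[pool], minlength=num_labels))
        pools.append(pool)
    return pools, np.array(counts)

rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=10)   # 10 items, 3 possible labels
pools, counts = pooled_tests(labels, num_tests=4, pool_size=5,
                             num_labels=3, rng=rng)
# Each row of `counts` is one test's label histogram and sums to the
# pool size; a decoder must recover `labels` from (pools, counts).
```

A noisy variant of the kind the abstract mentions would perturb each count before it is reported, which is what makes the scaling laws strictly harder.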
Multiresolution analysis in statistical mechanics. I. Using wavelets to calculate thermodynamic properties
The wavelet transform, a family of orthonormal bases, is introduced as a
technique for performing multiresolution analysis in statistical mechanics. The
wavelet transform is a hierarchical technique designed to separate data sets
into sets representing local averages and local differences. Although
one-to-one transformations of data sets are possible, the advantage of the
wavelet transform is as an approximation scheme for the efficient calculation
of thermodynamic and ensemble properties. Even under the most drastic of
approximations, the resulting errors in the values obtained for average
absolute magnetization, free energy, and heat capacity are on the order of 10%,
with a corresponding computational efficiency gain of two orders of magnitude
for a system such as an Ising lattice. In addition, the errors in
the results tend toward zero in the neighborhood of fixed points, as determined
by renormalization group theory.
Comment: 13 pages plus 7 figures (PNG)
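The separation into local averages and local differences can be illustrated with one level of the orthonormal Haar wavelet, the simplest member of the family; this is a generic sketch of the transform, not the paper's implementation.

```python
import numpy as np

def haar_step(data):
    """One level of the orthonormal Haar wavelet transform: split a
    signal into local averages (coarse part) and local differences
    (detail part), each half the original length."""
    data = np.asarray(data, dtype=float)
    avg = (data[0::2] + data[1::2]) / np.sqrt(2.0)
    diff = (data[0::2] - data[1::2]) / np.sqrt(2.0)
    return avg, diff

x = np.array([2.0, 4.0, 6.0, 8.0])
avg, diff = haar_step(x)

# Dropping `diff` keeps only the local averages -- the kind of
# coarse-graining used as an approximation scheme. The full
# transform is one-to-one and exactly invertible:
x_back = np.empty_like(x)
x_back[0::2] = (avg + diff) / np.sqrt(2.0)
x_back[1::2] = (avg - diff) / np.sqrt(2.0)
```

Iterating `haar_step` on the averages yields the hierarchical multiresolution decomposition the abstract refers to.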
Revealing evolutionary constraints on proteins through sequence analysis
Statistical analysis of alignments of large numbers of protein sequences has
revealed "sectors" of collectively coevolving amino acids in several protein
families. Here, we show that selection acting on any functional property of a
protein, represented by an additive trait, can give rise to such a sector. As
an illustration of a selected trait, we consider the elastic energy of an
important conformational change within an elastic network model, and we show
that selection acting on this energy leads to correlations among residues. For
this concrete example and more generally, we demonstrate that the main
signature of functional sectors lies in the small-eigenvalue modes of the
covariance matrix of the selected sequences. However, secondary signatures of
these functional sectors also exist in the extensively studied large-eigenvalue
modes. Our simple, general model leads us to propose a principled method to
identify functional sectors, along with the magnitudes of mutational effects,
from sequence data. We further demonstrate the robustness of these functional
sectors to various forms of selection, and the robustness of our approach to
the identification of multiple selected traits.
Comment: 37 pages, 28 figures
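The basic object of this kind of analysis, the eigenspectrum of the covariance matrix of aligned sequences, can be sketched as follows: one-hot encode the alignment, form the residue covariance matrix, and sort its modes so the small-eigenvalue ones come first. The toy alignment and function name are illustrative assumptions, not the paper's data or method.

```python
import numpy as np

def covariance_spectrum(seqs, alphabet):
    """One-hot encode an alignment of equal-length sequences and
    return eigenvalues/eigenvectors of the residue covariance
    matrix, in ascending order (small-eigenvalue modes first)."""
    idx = {a: i for i, a in enumerate(alphabet)}
    n, L, q = len(seqs), len(seqs[0]), len(alphabet)
    X = np.zeros((n, L * q))
    for s, seq in enumerate(seqs):
        for pos, a in enumerate(seq):
            X[s, pos * q + idx[a]] = 1.0
    C = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(C)  # ascending eigenvalue order
    return vals, vecs

# Tiny hypothetical alignment: 4 sequences of length 3 over {A, B}.
seqs = ["AAB", "ABB", "BAB", "ABA"]
vals, vecs = covariance_spectrum(seqs, "AB")
```

In the abstract's terms, the signature of a functional sector would be sought in the leading entries of the eigenvectors at the low end of `vals`, rather than only in the familiar large-eigenvalue modes.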