
    Structured Sparsity: Discrete and Convex approaches

    Compressive sensing (CS) exploits sparsity to recover sparse or compressible signals from dimensionality-reducing, non-adaptive sensing mechanisms. Sparsity is also used to enhance interpretability in machine learning and statistics applications: while the ambient dimension is vast in modern data analysis problems, the relevant information therein typically resides in a much lower dimensional space. However, many solutions proposed nowadays do not leverage the true underlying structure. Recent results in CS extend the simple sparsity idea to more sophisticated structured sparsity models, which describe the interdependency between the nonzero components of a signal, increasing the interpretability of the results and leading to better recovery performance. In order to better understand the impact of structured sparsity, in this chapter we analyze the connections between the discrete models and their convex relaxations, highlighting their relative advantages. We start with the general group sparse model and then elaborate on two important special cases: the dispersive and the hierarchical models. For each, we present the models in their discrete nature, discuss how to solve the ensuing discrete problems and then describe convex relaxations. We also consider more general structures as defined by set functions and present their convex proxies. Further, we discuss efficient optimization solutions for structured sparsity problems and illustrate structured sparsity in action via three applications. Comment: 30 pages, 18 figure

    A measure of individual role in collective dynamics

    Identifying key players in collective dynamics remains a challenge in several research fields, from the efficient dissemination of ideas to drug target discovery in biomedical problems. The difficulty lies at several levels: how to single out the role of individual elements in such intermingled systems, and what is the best way to quantify their importance. Centrality measures describe a node's importance by its position in a network. The key issue this overlooks is that the contribution of a node to the collective behavior is not uniquely determined by the structure of the system but results from the interplay between dynamics and network structure. We show that dynamical influence measures explicitly how strongly a node's dynamical state affects collective behavior. For critical spreading, dynamical influence targets nodes according to their spreading capabilities. For diffusive processes it quantifies how efficiently real systems may be controlled by manipulating a single node. Comment: accepted for publication in Scientific Report

    Phase Transitions in the Pooled Data Problem

    In this paper, we study the pooled data problem of identifying the labels associated with a large collection of items, based on a sequence of pooled tests revealing the counts of each label within the pool. In the noiseless setting, we identify an exact asymptotic threshold on the required number of tests with optimal decoding, and prove a phase transition between complete success and complete failure. In addition, we present a novel noisy variation of the problem, and provide an information-theoretic framework for characterizing the required number of tests for general random noise models. Our results reveal that noise can make the problem considerably more difficult, with strict increases in the scaling laws even at low noise levels. Finally, we demonstrate similar behavior in an approximate recovery setting, where a given number of errors is allowed in the decoded labels. Comment: Accepted to NIPS 201
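
The noiseless measurement model described in this abstract can be illustrated with a minimal sketch. The function name `pooled_test`, the toy labels, and the choice of pools are all hypothetical and are not taken from the paper; the sketch only shows what a single pooled test reveals, not how labels are decoded.

```python
from collections import Counter

def pooled_test(labels, pool):
    """Return the count of each label among the items in `pool` (noiseless)."""
    return Counter(labels[i] for i in pool)

# Hypothetical toy instance: 6 items, labels drawn from {"A", "B", "C"}.
labels = ["A", "B", "A", "C", "B", "A"]

# Each pool is a list of item indices; the pools here are chosen by hand.
pools = [[0, 1, 2], [2, 3, 4, 5], [0, 5]]

# A test outcome records only how many items of each label are in the pool,
# not which item carries which label.
outcomes = [pooled_test(labels, pool) for pool in pools]
for pool, outcome in zip(pools, outcomes):
    print(pool, dict(outcome))
```

A decoder would have to recover `labels` from `pools` and `outcomes` alone; how many such tests that requires is the question the abstract addresses.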

    Multiresolution analysis in statistical mechanics. I. Using wavelets to calculate thermodynamic properties

    The wavelet transform, a family of orthonormal bases, is introduced as a technique for performing multiresolution analysis in statistical mechanics. The wavelet transform is a hierarchical technique designed to separate data sets into sets representing local averages and local differences. Although one-to-one transformations of data sets are possible, the advantage of the wavelet transform is as an approximation scheme for the efficient calculation of thermodynamic and ensemble properties. Even under the most drastic of approximations, the resulting errors in the values obtained for average absolute magnetization, free energy, and heat capacity are on the order of 10%, with a corresponding computational efficiency gain of two orders of magnitude for a system such as a 4×4 Ising lattice. In addition, the errors in the results tend toward zero in the neighborhood of fixed points, as determined by renormalization group theory. Comment: 13 pages plus 7 figures (PNG
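
The split into local averages and local differences can be sketched with one level of the orthonormal Haar transform, the simplest wavelet. The abstract does not say which wavelet the paper uses, so Haar is an assumption here, and the function names and the four-spin example are illustrative only.

```python
import math

def haar_step(data):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (averages, differences), each half the length of the input.
    """
    if len(data) % 2 != 0:
        raise ValueError("length must be even")
    s = math.sqrt(2.0)
    averages = [(data[i] + data[i + 1]) / s for i in range(0, len(data), 2)]
    differences = [(data[i] - data[i + 1]) / s for i in range(0, len(data), 2)]
    return averages, differences

def haar_inverse_step(averages, differences):
    """Invert haar_step, recovering the original sequence."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(averages, differences):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

# Hypothetical example: a row of four Ising spins.
spins = [1, 1, -1, 1]
avg, diff = haar_step(spins)
restored = haar_inverse_step(avg, diff)
```

Keeping both outputs gives a one-to-one transformation. Discarding `diff` and working only with `avg` is the kind of approximation the abstract refers to, and applying `haar_step` again to `avg` gives the next, coarser level of the hierarchy.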

    Revealing evolutionary constraints on proteins through sequence analysis

    Statistical analysis of alignments of large numbers of protein sequences has revealed "sectors" of collectively coevolving amino acids in several protein families. Here, we show that selection acting on any functional property of a protein, represented by an additive trait, can give rise to such a sector. As an illustration of a selected trait, we consider the elastic energy of an important conformational change within an elastic network model, and we show that selection acting on this energy leads to correlations among residues. For this concrete example and more generally, we demonstrate that the main signature of functional sectors lies in the small-eigenvalue modes of the covariance matrix of the selected sequences. However, secondary signatures of these functional sectors also exist in the extensively studied large-eigenvalue modes. Our simple, general model leads us to propose a principled method to identify functional sectors, along with the magnitudes of mutational effects, from sequence data. We further demonstrate the robustness of these functional sectors to various forms of selection, and the robustness of our approach to the identification of multiple selected traits. Comment: 37 pages, 28 figure