
    Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity

    We prove that the Fourier dimension of any Boolean function with Fourier sparsity $s$ is at most $O(s^{2/3})$. Our proof method yields an improved bound of $\widetilde{O}(\sqrt{s})$ assuming a conjecture of Tsang et al.~\cite{tsang}, that for every Boolean function of sparsity $s$ there is an affine subspace of $\mathbb{F}_2^n$ of co-dimension $O(\mathrm{poly}\log s)$ restricted to which the function is constant. This conjectured bound is tight up to poly-logarithmic factors, as the Fourier dimension and sparsity of the address function are quadratically separated. We obtain these bounds by observing that the Fourier dimension of a Boolean function is equivalent to its non-adaptive parity decision tree complexity, and then bounding the latter.
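
    The sketch below is a minimal, brute-force illustration of the two quantities compared in this abstract: the Fourier sparsity (size of the Fourier support) and the Fourier dimension (the $\mathbb{F}_2$-rank of the support). The example function, the helper names, and the exhaustive enumeration are illustrative assumptions and have nothing to do with the paper's proof technique.

```python
import numpy as np
from itertools import product

def fourier_coefficients(f, n):
    """Brute-force Fourier coefficients of f: {0,1}^n -> {-1,+1}.

    hat_f(S) = 2^{-n} * sum_x f(x) * (-1)^{<S, x>}, with S encoded as a 0/1 vector.
    """
    coeffs = {}
    xs = list(product([0, 1], repeat=n))
    for S in product([0, 1], repeat=n):
        total = sum(f(x) * (-1) ** (np.dot(S, x) % 2) for x in xs)
        coeffs[S] = total / 2 ** n
    return coeffs

def sparsity_and_dimension(f, n, tol=1e-9):
    """Fourier sparsity = |support|; Fourier dimension = rank of the support over F_2."""
    support = [S for S, c in fourier_coefficients(f, n).items() if abs(c) > tol]
    sparsity = len(support)
    # Gaussian elimination over F_2 on the support vectors.
    rows = [list(S) for S in support]
    rank, col = 0, 0
    while rank < len(rows) and col < n:
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is not None:
            rows[rank], rows[pivot] = rows[pivot], rows[rank]
            for r in range(len(rows)):
                if r != rank and rows[r][col]:
                    rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
            rank += 1
        col += 1
    return sparsity, rank

# Example: AND on 3 bits in the +/-1 convention.
f = lambda x: 1 if all(x) else -1
print(sparsity_and_dimension(f, 3))   # sparsity 8, dimension 3
```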

    Simple Analysis of Sparse, Sign-Consistent JL

    Allen-Zhu, Gelashvili, Micali, and Shavit construct a sparse, sign-consistent Johnson-Lindenstrauss distribution, and prove that this distribution yields an essentially optimal dimension for the correct choice of sparsity. However, their analysis of the upper bound on the dimension and sparsity requires a complicated combinatorial graph-based argument similar to Kane and Nelson's analysis of sparse JL. We present a simple, combinatorics-free analysis of sparse, sign-consistent JL that yields the same dimension and sparsity upper bounds as the original analysis. Our analysis also yields dimension/sparsity tradeoffs, which were not previously known. As with previous proofs in this area, our analysis is based on applying Markov's inequality to the p-th moment of an error term that can be expressed as a quadratic form of Rademacher variables. Interestingly, we show that, unlike in previous work in the area, the traditionally used Hanson-Wright bound is not strong enough to yield our desired result. Indeed, although the Hanson-Wright bound is known to be optimal for Gaussian degree-2 chaos, it was already shown to be suboptimal for Rademachers. Surprisingly, we are able to show a simple moment bound for quadratic forms of Rademachers that is sufficiently tight to achieve our desired result, which, given the ubiquity of moment and tail bounds in theoretical computer science, is likely to be of broader interest.
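
    As a rough illustration of the object being analyzed (not the paper's construction or proof), the sketch below draws a sparse, sign-consistent embedding matrix: every column has s nonzero entries of magnitude 1/sqrt(s), and all nonzeros within a column share a single random sign. The dimensions and the empirical distortion check are arbitrary choices.

```python
import numpy as np

def sparse_sign_consistent_jl(m, n, s, rng):
    """m x n embedding matrix: s nonzeros per column, one shared sign per column.

    Nonzero entries are +/- 1/sqrt(s), so every column has unit norm.
    """
    A = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)   # sparsity pattern of column j
        sign = rng.choice([-1.0, 1.0])                 # single sign shared by the column
        A[rows, j] = sign / np.sqrt(s)
    return A

rng = np.random.default_rng(0)
n, m, s = 10_000, 400, 8
A = sparse_sign_consistent_jl(m, n, s, rng)

# Empirical distortion on a few random unit vectors.
X = rng.standard_normal((n, 5))
X /= np.linalg.norm(X, axis=0)
print(np.linalg.norm(A @ X, axis=0))   # typically close to 1 for these parameters
```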

    Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors

    Sparsity has become a key concept for solving high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity constraints in the Bayesian framework for inverse problems, by encoding them in the prior distribution, has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity-promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion: accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this article, we develop and examine a new implementation of a single-component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis-Hastings (MH) sampling schemes: we demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. In practice, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample-based Bayesian inference. Comment: 33 pages, 14 figures
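
    A minimal sketch of a single-component Gibbs sampler for a Gaussian likelihood with an L1-type (Laplace) prior is shown below; each one-dimensional conditional is a two-sided mixture of truncated Gaussians and can be sampled exactly. This is a textbook construction written for illustration, not the authors' implementation, and all problem sizes and hyperparameters are assumptions.

```python
import numpy as np
from scipy.stats import norm, truncnorm

def gibbs_l1(y, A, sigma, lam, n_iter, rng):
    """Single-component Gibbs sampler for
        p(x | y) ~ exp(-||y - A x||^2 / (2 sigma^2) - lam * ||x||_1).

    Each conditional p(x_i | x_{-i}, y) is a two-sided mixture of truncated
    Gaussians and is sampled exactly (no accept/reject step).
    """
    m, n = A.shape
    x = np.zeros(n)
    col_sq = np.sum(A ** 2, axis=0)
    samples = np.empty((n_iter, n))
    r = y - A @ x                                  # running residual
    for t in range(n_iter):
        for i in range(n):
            r_i = r + A[:, i] * x[i]               # residual excluding component i
            c = col_sq[i] / sigma ** 2
            b = A[:, i] @ r_i / sigma ** 2
            s = 1.0 / np.sqrt(c)
            mu_p = (b - lam) / c                   # mode of the positive branch
            mu_m = (b + lam) / c                   # mode of the negative branch
            # Log-weights of the two branches (common constants cancel).
            log_wp = 0.5 * c * mu_p ** 2 + norm.logcdf(mu_p / s)
            log_wm = 0.5 * c * mu_m ** 2 + norm.logcdf(-mu_m / s)
            p_pos = np.exp(log_wp - np.logaddexp(log_wp, log_wm))
            if rng.random() < p_pos:
                x_new = truncnorm.rvs(-mu_p / s, np.inf, loc=mu_p, scale=s,
                                      random_state=rng)
            else:
                x_new = truncnorm.rvs(-np.inf, -mu_m / s, loc=mu_m, scale=s,
                                      random_state=rng)
            r = r_i - A[:, i] * x_new              # update residual with new value
            x[i] = x_new
        samples[t] = x
    return samples

# Small synthetic example (all sizes and hyperparameters chosen arbitrarily).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20); x_true[[3, 11]] = [2.0, -1.5]
y = A @ x_true + 0.1 * rng.standard_normal(50)
chain = gibbs_l1(y, A, sigma=0.1, lam=5.0, n_iter=200, rng=rng)
print(chain[100:].mean(axis=0).round(2))           # posterior mean after burn-in
```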

    Structured Sparsity: Discrete and Convex approaches

    Compressive sensing (CS) exploits sparsity to recover sparse or compressible signals from dimensionality-reducing, non-adaptive sensing mechanisms. Sparsity is also used to enhance interpretability in machine learning and statistics applications: while the ambient dimension is vast in modern data analysis problems, the relevant information therein typically resides in a much lower-dimensional space. However, many solutions proposed nowadays do not leverage the true underlying structure. Recent results in CS extend the simple sparsity idea to more sophisticated {\em structured} sparsity models, which describe the interdependency between the nonzero components of a signal, allowing one to increase the interpretability of the results and leading to better recovery performance. In order to better understand the impact of structured sparsity, in this chapter we analyze the connections between the discrete models and their convex relaxations, highlighting their relative advantages. We start with the general group sparse model and then elaborate on two important special cases: the dispersive and the hierarchical models. For each, we present the models in their discrete nature, discuss how to solve the ensuing discrete problems, and then describe convex relaxations. We also consider more general structures as defined by set functions and present their convex proxies. Further, we discuss efficient optimization solutions for structured sparsity problems and illustrate structured sparsity in action via three applications. Comment: 30 pages, 18 figures
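
    To make the discrete-versus-convex contrast concrete for the simplest group-sparse model, the sketch below compares hard selection of the k groups with the largest norm (a discrete projection) with the proximal operator of the group-lasso penalty (its standard convex relaxation). The toy signal and group layout are assumptions; the chapter's dispersive and hierarchical models are not covered here.

```python
import numpy as np

def hard_group_selection(x, groups, k):
    """Discrete model: keep the k groups with largest Euclidean norm, zero out the rest."""
    norms = np.array([np.linalg.norm(x[g]) for g in groups])
    keep = np.argsort(norms)[-k:]
    out = np.zeros_like(x)
    for idx in keep:
        out[groups[idx]] = x[groups[idx]]
    return out

def group_soft_threshold(x, groups, lam):
    """Convex relaxation: proximal operator of lam * sum_g ||x_g||_2 (group lasso)."""
    out = np.zeros_like(x)
    for g in groups:
        norm_g = np.linalg.norm(x[g])
        if norm_g > lam:
            out[g] = (1.0 - lam / norm_g) * x[g]   # shrink the whole group toward zero
    return out

# Toy signal with four non-overlapping groups of size 3.
x = np.array([3.0, 2.5, 2.8,  0.1, -0.2, 0.05,  -1.5, -1.8, -1.2,  0.3, 0.0, -0.1])
groups = [np.arange(i, i + 3) for i in range(0, 12, 3)]
print(hard_group_selection(x, groups, k=2))      # exactly two groups survive
print(group_soft_threshold(x, groups, lam=1.0))  # groups shrunk; small ones vanish
```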

    Minimax bounds for sparse PCA with noisy high-dimensional data

    We study the problem of estimating the leading eigenvectors of a high-dimensional population covariance matrix based on independent Gaussian observations. We establish a lower bound on the minimax risk of estimators under the $\ell_2$ loss, in the joint limit as dimension and sample size increase to infinity, under various models of sparsity for the population eigenvectors. The lower bound on the risk points to the existence of different regimes of sparsity of the eigenvectors. We also propose a new method for estimating the eigenvectors by a two-stage coordinate selection scheme. Comment: 1 figure
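
    The following is a generic sketch of a coordinate-selection-then-PCA estimator in a spiked covariance model, included only to illustrate what a two-stage scheme of this kind looks like; it is not necessarily the scheme proposed in the paper, and the selection rule, sizes, and spike strength are assumptions.

```python
import numpy as np

def coordinate_selection_pca(X, k):
    """Select the k coordinates with largest sample variance, then run PCA on them.

    X: (n_samples, d) data matrix. Returns a d-dimensional unit vector supported
    on the selected coordinates (a simple sparse estimate of the leading eigenvector).
    """
    d = X.shape[1]
    variances = X.var(axis=0)
    selected = np.argsort(variances)[-k:]          # stage 1: coordinate selection
    S = np.cov(X[:, selected], rowvar=False)       # restricted sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)
    v = np.zeros(d)
    v[selected] = eigvecs[:, -1]                   # stage 2: leading eigenvector, embedded back
    return v

# Synthetic spiked model: leading eigenvector supported on 5 of 200 coordinates.
rng = np.random.default_rng(0)
d, n, k = 200, 300, 5
u = np.zeros(d); u[:k] = 1 / np.sqrt(k)
X = rng.standard_normal((n, 1)) * (3.0 * u) + rng.standard_normal((n, d))
v_hat = coordinate_selection_pca(X, k)
print(abs(u @ v_hat))   # close to 1 when the spike is strong enough
```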

    Bounds of restricted isometry constants in extreme asymptotics: formulae for Gaussian matrices

    Restricted Isometry Constants (RICs) provide a measure of how far from an isometry a matrix can be when acting on sparse vectors. This, and related quantities, provide a mechanism by which standard eigen-analysis can be applied to topics relying on sparsity. RIC bounds have been presented for a variety of random matrices and matrix dimension and sparsity ranges. We provide explicit formulae for RIC bounds of n by N Gaussian matrices with sparsity k in three settings: (a) n/N fixed and k/n approaching zero, (b) k/n fixed and n/N approaching zero, and (c) n/N approaching zero with k/n decaying inverse logarithmically in N/n. In these three settings the RICs (a) decay to zero, (b) become unbounded (or approach inherent bounds), and (c) approach a non-zero constant. Implications of these results for RIC-based analysis of compressed sensing algorithms are presented. Comment: 40 pages, 5 figures
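
    Since RICs are defined by a maximum over all column subsets of size k, they cannot be computed exactly at scale; the sketch below only estimates a Monte Carlo lower bound for a Gaussian matrix by sampling random supports and recording the worst deviation of the squared extreme singular values from one. Matrix sizes and the number of trials are arbitrary.

```python
import numpy as np

def empirical_ric_lower_bound(n, N, k, trials, rng):
    """Monte Carlo lower bound on the order-k RIC of an n x N Gaussian matrix.

    A has i.i.d. N(0, 1/n) entries. For each random support of size k we compute
    the extreme singular values of the corresponding column submatrix; the RIC is
    the worst deviation of their squares from 1 over *all* supports, so sampling
    only gives a lower bound.
    """
    A = rng.standard_normal((n, N)) / np.sqrt(n)
    worst = 0.0
    for _ in range(trials):
        S = rng.choice(N, size=k, replace=False)
        sv = np.linalg.svd(A[:, S], compute_uv=False)
        worst = max(worst, abs(sv[0] ** 2 - 1.0), abs(sv[-1] ** 2 - 1.0))
    return worst

rng = np.random.default_rng(0)
print(empirical_ric_lower_bound(n=200, N=1000, k=10, trials=2000, rng=rng))
```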

    Sparsity in Variational Autoencoders

    Working in high-dimensional latent spaces, the internal encoding of data in Variational Autoencoders becomes naturally sparse. We discuss this known but controversial phenomenon, sometimes referred to as over-pruning to emphasize the under-use of the model's capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, greatly reducing the risk of overfitting. In particular, it is a major methodological guide for the correct tuning of the model capacity, progressively augmenting it to attain sparsity, or conversely reducing the dimension of the network by removing links to zeroed-out neurons. The degree of sparsity crucially depends on the network architecture: for instance, convolutional networks typically show less sparsity, likely due to the tighter relation of features to different spatial regions of the input. Comment: An extended abstract of this survey will be presented at the 1st International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI' 2019), 20-22 March 2019, Barcelona, Spain
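
    One simple way to observe the sparsity discussed above is to measure, per latent dimension, the average KL divergence between the encoder's posterior and the standard normal prior and to count dimensions whose KL is essentially zero (collapsed units). The sketch below assumes the encoder outputs arrays mu and logvar and uses an arbitrary threshold; the fake data only mimics the phenomenon.

```python
import numpy as np

def active_latent_dimensions(mu, logvar, threshold=0.01):
    """Count latent dimensions a VAE actually uses.

    mu, logvar: arrays of shape (n_samples, latent_dim) produced by the encoder.
    A dimension whose average KL to the standard normal prior falls below
    `threshold` is treated as collapsed (effectively unused).
    """
    # KL( N(mu, sigma^2) || N(0, 1) ) per sample and per dimension.
    kl = 0.5 * (mu ** 2 + np.exp(logvar) - logvar - 1.0)
    mean_kl = kl.mean(axis=0)                      # average over the dataset
    active = mean_kl > threshold
    return active.sum(), mean_kl

# Illustrative fake encoder outputs: 16 latent dims, half of them collapsed to the prior.
rng = np.random.default_rng(0)
n, d = 1000, 16
mu = np.concatenate([rng.normal(0, 1.0, (n, d // 2)),        # informative dims
                     rng.normal(0, 0.01, (n, d // 2))], 1)    # collapsed dims
logvar = np.concatenate([np.full((n, d // 2), -1.0),          # sigma < 1
                         np.full((n, d // 2), 0.0)], 1)       # sigma = 1 (matches prior)
n_active, per_dim_kl = active_latent_dimensions(mu, logvar)
print(n_active)   # roughly d // 2 active dimensions
```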