Sub-linear Upper Bounds on Fourier dimension of Boolean Functions in terms of Fourier sparsity
We prove that the Fourier dimension of any Boolean function with Fourier
sparsity $s$ is sub-linear in $s$. Our proof method yields an
improved bound of $O(\sqrt{s}\,\mathrm{poly}\log s)$ assuming a conjecture of
Tsang et al. \cite{tsang}, that for every Boolean function of sparsity $s$ there
is an affine subspace of $\mathbb{F}_2^n$ of co-dimension $O(\mathrm{poly}\log s)$
restricted to which the function is constant. This conjectured bound is tight
up to poly-logarithmic factors, as the Fourier dimension and sparsity of the
address function are quadratically separated. We obtain these bounds by
observing that the Fourier dimension of a Boolean function equals its
non-adaptive parity decision tree complexity, and then bounding the latter.
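To make these quantities concrete, here is a minimal brute-force sketch (not from the paper; the function names and the toy AND example are illustrative): Fourier sparsity counts the nonzero coefficients of the ±1 Fourier expansion, and Fourier dimension is the rank over $\mathbb{F}_2$ of the frequencies in the support.

```python
import itertools
import numpy as np

def gf2_rank(masks):
    """Rank over GF(2) of bit-vectors encoded as Python ints."""
    rank = 0
    masks = [m for m in masks if m]
    while masks:
        pivot = masks.pop()
        rank += 1
        lead = 1 << (pivot.bit_length() - 1)          # leading bit of the pivot
        masks = [m ^ pivot if m & lead else m for m in masks]
        masks = [m for m in masks if m]
    return rank

def fourier_profile(f, n):
    """Fourier sparsity and Fourier dimension of f: {0,1}^n -> {0,1}.
    hat_f(S) = E_x[(-1)^(f(x) + <S,x>)]; sparsity counts nonzero hat_f(S),
    dimension is the GF(2) rank of the frequencies S in the support."""
    cube = list(itertools.product((0, 1), repeat=n))
    support = []
    for S in cube:
        coeff = np.mean([(-1) ** (f(x) + sum(s * xi for s, xi in zip(S, x)))
                         for x in cube])
        if abs(coeff) > 1e-9:
            support.append(S)
    masks = [int("".join(map(str, S)), 2) for S in support if any(S)]
    return len(support), gf2_rank(masks)

# Toy check: AND of the first two of three variables has sparsity 4, dimension 2.
print(fourier_profile(lambda x: x[0] & x[1], 3))
```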
Simple Analysis of Sparse, Sign-Consistent JL
Allen-Zhu, Gelashvili, Micali, and Shavit construct a sparse, sign-consistent Johnson-Lindenstrauss distribution, and prove that this distribution yields an essentially optimal dimension for the correct choice of sparsity. However, their analysis of the upper bound on the dimension and sparsity requires a complicated combinatorial graph-based argument similar to Kane and Nelson's analysis of sparse JL. We present a simple, combinatorics-free analysis of sparse, sign-consistent JL that yields the same dimension and sparsity upper bounds as the original analysis. Our analysis also yields dimension/sparsity tradeoffs, which were not previously known.
As with previous proofs in this area, our analysis is based on applying Markov's inequality to the pth moment of an error term that can be expressed as a quadratic form of Rademacher variables. Interestingly, we show that, unlike in previous work in the area, the traditionally used Hanson-Wright bound is not strong enough to yield our desired result. Indeed, although the Hanson-Wright bound is known to be optimal for Gaussian degree-2 chaos, it was already shown to be suboptimal for Rademachers. Surprisingly, we are able to show a simple moment bound for quadratic forms of Rademachers that is sufficiently tight to achieve our desired result, which, given the ubiquity of moment and tail bounds in theoretical computer science, is likely to be of broader interest.
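As a rough illustration of the object being analyzed (a sketch under the assumption that sign-consistency means one shared random sign per column and exactly s nonzero entries per column; not the authors' exact construction or parameter choices):

```python
import numpy as np

def sparse_sign_consistent_jl(m, d, s, rng=None):
    """Draw an m x d embedding matrix: each column has exactly s nonzero
    entries equal to sigma_j / sqrt(s), where sigma_j is a single random
    sign shared by the whole column (the sign-consistency constraint)."""
    rng = np.random.default_rng(rng)
    A = np.zeros((m, d))
    signs = rng.choice([-1.0, 1.0], size=d)            # one sign per column
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)    # s random rows per column
        A[rows, j] = signs[j] / np.sqrt(s)
    return A

# Usage: embed a few vectors and compare a pairwise distance before and after.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 10_000))
A = sparse_sign_consistent_jl(m=1_000, d=10_000, s=32, rng=1)
Y = X @ A.T
print(np.linalg.norm(X[0] - X[1]), np.linalg.norm(Y[0] - Y[1]))
```

The dimension/sparsity tradeoffs proved in the paper are not reproduced by this sketch; it only shows the shape of the distribution.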
Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors
Sparsity has become a key concept for solving high-dimensional inverse
problems using variational regularization techniques. Recently, using similar
sparsity-constraints in the Bayesian framework for inverse problems by encoding
them in the prior distribution has attracted attention. Important questions
about the relation between regularization theory and Bayesian inference still
need to be addressed when using sparsity promoting inversion. A practical
obstacle for these examinations is the lack of fast posterior sampling
algorithms for sparse, high-dimensional Bayesian inversion: Accessing the full
range of Bayesian inference methods requires being able to draw samples from
the posterior probability distribution in a fast and efficient way. This is
usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this
article, we develop and examine a new implementation of a single component
Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that
the efficiency of our Gibbs sampler increases when the level of sparsity or the
dimension of the unknowns is increased. This property is contrary to the
properties of the most commonly applied Metropolis-Hastings (MH) sampling
schemes: We demonstrate that the efficiency of MH schemes for L1-type priors
dramatically decreases when the level of sparsity or the dimension of the
unknowns is increased. In practice, Bayesian inversion for L1-type priors using
MH samplers is therefore not feasible. As this behavior is commonly believed to be an
intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also
challenges common beliefs about the applicability of sample-based Bayesian
inference. Comment: 33 pages, 14 figures.
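The following is a minimal sketch of a single-component Gibbs update of this kind, assuming a linear forward model y = Ax + Gaussian noise and an L1 prior with rate lam (the names gibbs_l1, sigma, and lam are illustrative, and this is not the authors' implementation): each full conditional is a mixture of two one-sided truncated Gaussians and can be sampled exactly, with no accept/reject step.

```python
import numpy as np
from scipy.stats import norm, truncnorm

def gibbs_l1(A, y, sigma, lam, n_samples, rng=None):
    """Single-component Gibbs sampler for the posterior proportional to
        exp(-||y - A x||^2 / (2 sigma^2) - lam * ||x||_1).
    Each conditional p(x_i | x_-i, y) is a mixture of two one-sided truncated
    Gaussians and is sampled exactly."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    x = np.zeros(n)
    r = y - A @ x                          # current residual y - A x
    col_sq = np.sum(A ** 2, axis=0)        # a_i = ||A[:, i]||^2
    samples = np.empty((n_samples, n))
    for t in range(n_samples):
        for i in range(n):
            r_i = r + A[:, i] * x[i]       # residual with component i removed
            b_i = A[:, i] @ r_i
            a_i = col_sq[i]
            tau = sigma / np.sqrt(a_i)
            mu_pos = (b_i - lam * sigma ** 2) / a_i   # conditional mean on x_i >= 0
            mu_neg = (b_i + lam * sigma ** 2) / a_i   # conditional mean on x_i <  0
            # log-masses of the two half-lines (common factors cancel)
            lw_pos = mu_pos ** 2 / (2 * tau ** 2) + norm.logcdf(mu_pos / tau)
            lw_neg = mu_neg ** 2 / (2 * tau ** 2) + norm.logcdf(-mu_neg / tau)
            p_pos = 1.0 / (1.0 + np.exp(lw_neg - lw_pos))
            if rng.random() < p_pos:
                x_new = truncnorm.rvs(-mu_pos / tau, np.inf, loc=mu_pos,
                                      scale=tau, random_state=rng)
            else:
                x_new = truncnorm.rvs(-np.inf, -mu_neg / tau, loc=mu_neg,
                                      scale=tau, random_state=rng)
            r = r_i - A[:, i] * x_new      # update residual with the new value
            x[i] = x_new
        samples[t] = x
    return samples

# Toy usage (sizes, noise level, and prior rate are illustrative).
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 100))
x_true = np.zeros(100)
x_true[[5, 40, 77]] = 1.0
y = A @ x_true + 0.05 * rng.standard_normal(80)
chain = gibbs_l1(A, y, sigma=0.05, lam=50.0, n_samples=200)
post_mean = chain[100:].mean(axis=0)     # discard the first half as burn-in
print(np.flatnonzero(np.abs(post_mean) > 0.5))  # should align with the true support
```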
Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality-reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated structured sparsity models, which describe the
interdependency between the nonzero components of a signal, increasing the
interpretability of the results and leading to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications. Comment: 30 pages, 18 figures.
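As a concrete contrast between the discrete and convex viewpoints for non-overlapping groups (an illustrative sketch, not taken from the chapter): the discrete model keeps the k groups with the largest energy, while the convex relaxation, the group-lasso norm, is handled through its proximal operator, i.e., block soft-thresholding.

```python
import numpy as np

def hard_group_select(x, groups, k):
    """Discrete model: keep the k groups of x with largest Euclidean norm,
    zero out the rest."""
    energies = [np.linalg.norm(x[g]) for g in groups]
    keep = np.argsort(energies)[-k:]
    out = np.zeros_like(x)
    for idx in keep:
        out[groups[idx]] = x[groups[idx]]
    return out

def group_lasso_prox(x, groups, lam):
    """Convex relaxation: prox of lam * sum_g ||x_g||_2
    (block soft-thresholding), for non-overlapping groups."""
    out = np.zeros_like(x)
    for g in groups:
        norm_g = np.linalg.norm(x[g])
        if norm_g > lam:
            out[g] = (1.0 - lam / norm_g) * x[g]
    return out

# Example with four non-overlapping groups of size 3.
x = np.arange(12, dtype=float)
groups = [np.arange(3 * j, 3 * j + 3) for j in range(4)]
print(hard_group_select(x, groups, k=2))
print(group_lasso_prox(x, groups, lam=5.0))
```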
Minimax bounds for sparse PCA with noisy high-dimensional data
We study the problem of estimating the leading eigenvectors of a
high-dimensional population covariance matrix based on independent Gaussian
observations. We establish a lower bound on the minimax risk of estimators
under the $\ell_2$ loss, in the joint limit as dimension and sample size increase
to infinity, under various models of sparsity for the population eigenvectors.
The lower bound on the risk points to the existence of different regimes of
sparsity of the eigenvectors. We also propose a new method for estimating the
eigenvectors by a two-stage coordinate selection scheme. Comment: 1 figure.
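A generic sketch of a two-stage coordinate-selection estimator in this spirit (the diagonal-thresholding rule, the constant alpha, and the unit-noise assumption are illustrative, not the authors' exact scheme): first select coordinates whose sample variance is unusually large, then compute the leading eigenvector of the sample covariance restricted to those coordinates.

```python
import numpy as np

def two_stage_sparse_pca(X, alpha=1.5):
    """X: n x p data matrix (rows are i.i.d. observations).
    Stage 1: keep coordinates whose sample variance exceeds a threshold of the
             form 1 + alpha * sqrt(log(p) / n) (assumes unit noise variance).
    Stage 2: leading eigenvector of the sample covariance on those coordinates,
             zero elsewhere."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    variances = (Xc ** 2).mean(axis=0)
    thresh = 1.0 + alpha * np.sqrt(np.log(p) / n)
    selected = np.flatnonzero(variances > thresh)
    if selected.size == 0:
        selected = np.array([np.argmax(variances)])
    S = Xc[:, selected].T @ Xc[:, selected] / n
    w, V = np.linalg.eigh(S)
    v = np.zeros(p)
    v[selected] = V[:, -1]               # eigenvector of the largest eigenvalue
    return v

# Toy usage: spiked model with a 10-sparse leading eigenvector and unit noise.
rng = np.random.default_rng(0)
n, p = 200, 500
v_true = np.zeros(p)
v_true[:10] = 1.0 / np.sqrt(10)
X = 3.0 * rng.standard_normal((n, 1)) * v_true + rng.standard_normal((n, p))
v_hat = two_stage_sparse_pca(X)
print(abs(v_hat @ v_true))               # close to 1 when the support is recovered
```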
Bounds of restricted isometry constants in extreme asymptotics: formulae for Gaussian matrices
Restricted Isometry Constants (RICs) provide a measure of how far from an
isometry a matrix can be when acting on sparse vectors. This, and related
quantities, provide a mechanism by which standard eigen-analysis can be applied
to topics relying on sparsity. RIC bounds have been presented for a variety of
random matrices and matrix dimension and sparsity ranges. We provide explicit
formulae for RIC bounds of n by N Gaussian matrices with sparsity k in three
settings: a) n/N fixed and k/n approaching zero, b) k/n fixed and n/N
approaching zero, and c) n/N approaching zero with k/n decaying inverse
logarithmically in N/n; in these three settings the RICs a) decay to zero, b)
become unbounded (or approach inherent bounds), and c) approach a non-zero
constant. Implications of these results for RIC based analysis of compressed
sensing algorithms are presented. Comment: 40 pages, 5 figures.
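For reference, with the symmetric definition the RIC of order $k$ of a matrix $A$ is the smallest $\delta_k$ such that $(1-\delta_k)\|x\|_2^2 \le \|Ax\|_2^2 \le (1+\delta_k)\|x\|_2^2$ for all $k$-sparse $x$ (upper and lower deviations are often bounded separately in this literature). A brute-force sketch, feasible only for tiny sizes, makes the quantity and its dependence on (k, n, N) concrete:

```python
import itertools
import numpy as np

def restricted_isometry_constant(A, k):
    """Exact order-k RIC by enumerating every k-column submatrix:
    delta_k = max over supports S with |S| = k of the largest deviation of the
    eigenvalues of A_S^T A_S from 1. Exponential in k, so illustration only."""
    N = A.shape[1]
    delta = 0.0
    for S in itertools.combinations(range(N), k):
        gram = A[:, list(S)].T @ A[:, list(S)]
        eigvals = np.linalg.eigvalsh(gram)
        delta = max(delta, abs(eigvals[0] - 1.0), abs(eigvals[-1] - 1.0))
    return delta

# Tiny Gaussian example with the usual 1/sqrt(n) normalization.
rng = np.random.default_rng(0)
n, N, k = 20, 40, 3
A = rng.standard_normal((n, N)) / np.sqrt(n)
print(restricted_isometry_constant(A, k))
```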
Sparsity in Variational Autoencoders
In high-dimensional latent spaces, the internal encoding of data in
Variational Autoencoders becomes naturally sparse. We discuss this known but
controversial phenomenon, sometimes referred to as overpruning to emphasize the
under-use of the model capacity. In fact, it is an important form of
self-regularization, with all the typical benefits associated with sparsity: it
forces the model to focus on the really important features, greatly reducing the
risk of overfitting. In particular, it is a major methodological guide for the
correct tuning of the model capacity, progressively augmenting it to attain
sparsity, or, conversely, reducing the dimension of the network by removing links to
zeroed-out neurons. The degree of sparsity crucially depends on the network
architecture: for instance, convolutional networks typically show less
sparsity, likely due to the tighter relation of features to different spatial
regions of the input. Comment: An Extended Abstract of this survey will be presented at the 1st
International Conference on Advances in Signal Processing and Artificial
Intelligence (ASPAI' 2019), 20-22 March 2019, Barcelona, Spain.
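One common way to quantify this kind of latent sparsity (a generic sketch, not the survey's specific measurements; the function name and the 0.01 threshold are illustrative) is the per-dimension KL term of the VAE objective: dimensions whose average KL(q(z_i|x) || N(0,1)) stays near zero have collapsed to the prior and are effectively unused.

```python
import numpy as np

def inactive_latent_dimensions(mu, log_var, kl_threshold=0.01):
    """mu, log_var: arrays of shape (num_examples, latent_dim) produced by the
    encoder. Returns the per-dimension average KL to N(0, 1) and a boolean mask
    of 'collapsed' (effectively unused) dimensions."""
    kl_per_example = 0.5 * (np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    kl_per_dim = kl_per_example.mean(axis=0)
    return kl_per_dim, kl_per_dim < kl_threshold

# Toy usage: a 16-dimensional latent space where half of the dimensions match
# the prior exactly (mu = 0, log_var = 0) and are flagged as inactive.
rng = np.random.default_rng(0)
mu = np.concatenate([rng.normal(size=(1000, 8)), np.zeros((1000, 8))], axis=1)
log_var = np.concatenate([rng.normal(scale=0.1, size=(1000, 8)),
                          np.zeros((1000, 8))], axis=1)
kl, inactive = inactive_latent_dimensions(mu, log_var)
print(inactive.sum(), "inactive dimensions out of", inactive.size)
```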