Sparsity lower bounds for dimensionality reducing maps
We give near-tight lower bounds for the sparsity required in several dimensionality reducing linear maps. First, consider the Johnson-Lindenstrauss (JL) lemma, which states that for any set of n vectors in R^d there is an A ∈ R^{m×d} with m = O(ε^{-2} log n) such that mapping by A preserves the pairwise Euclidean distances up to a 1 ± ε factor. We show there exists a set of n vectors such that any such A with at most s non-zero entries per column must have s = Ω(ε^{-1} log n / log(1/ε)) if m < O(n/log(1/ε)). This improves the lower bound of Ω(min{ε^{-2}, ε^{-1}√(log_m d)}) by [Dasgupta-Kumar-Sarlós, STOC 2010], which only held against the stronger property of distributional JL, and only against a certain restricted class of distributions. Meanwhile our lower bound is against the JL lemma itself, with no restrictions. Our lower bound matches the sparse JL upper bound of [Kane-Nelson, SODA 2012] up to an O(log(1/ε)) factor. Next, we show that any m × n matrix with the k-restricted isometry property (RIP) with constant distortion must have Ω(k log(n/k)) non-zeroes per column if m = O(k log(n/k)), the optimal number of rows for RIP, and k < n/polylog n. This improves the previous lower bound of Ω(min{k, n/m}) by [Chandar, 2010] and shows that for most k it is impossible to have a sparse RIP matrix with an optimal number of rows.
Both lower bounds above also offer a tradeoff between sparsity and the number of rows.
Lastly, we show that any oblivious distribution over subspace embedding matrices with 1 non-zero per column and preserving distances in a d-dimensional subspace up to a constant factor must have at least Ω(d^2) rows. This matches an upper bound in [Nelson-Nguyễn, arXiv abs/1211.1002] and shows the impossibility of obtaining the best of both constructions in that work, namely 1 non-zero per column and d ⋅ polylog d rows.
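To make the objects in this abstract concrete, here is a minimal numpy sketch (not from the paper; all parameter choices are illustrative assumptions) of an oblivious subspace embedding with exactly one non-zero per column, in the roughly d^2-row regime discussed in the last paragraph, together with an empirical check of norm preservation for a vector inside the subspace.

import numpy as np

def countsketch(m, n, rng):
    # Exactly one non-zero entry (+1 or -1) per column, placed in a uniformly random row.
    S = np.zeros((m, n))
    S[rng.integers(0, m, size=n), np.arange(n)] = rng.choice([-1.0, 1.0], size=n)
    return S

rng = np.random.default_rng(0)
n, d = 5000, 10
m = 4 * d * d                                      # about d^2 rows, as in the regime above
U, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal basis of a d-dimensional subspace
S = countsketch(m, n, rng)
x = U @ rng.standard_normal(d)                     # an arbitrary vector inside the subspace
print("||Sx|| / ||x|| =", np.linalg.norm(S @ x) / np.linalg.norm(x))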
An Improved Lower Bound for Sparse Reconstruction from Subsampled Hadamard Matrices
We give a short argument that yields a new lower bound on the number of
subsampled rows from a bounded, orthonormal matrix necessary to form a matrix
with the restricted isometry property. We show that a matrix formed by
uniformly subsampling rows of an N × N Hadamard matrix contains a K-sparse vector in the kernel, unless the number of subsampled rows is Ω(K log K log(N/K)) --- our lower bound applies whenever min(K, N/K) > log^C N. Containing a sparse vector in the kernel precludes not only the restricted isometry property, but more generally the application of those matrices for uniform sparse recovery.
An Improved Lower Bound for Sparse Reconstruction from Subsampled Walsh Matrices
We give a short argument that yields a new lower bound on the number of uniformly and independently subsampled rows from a bounded, orthonormal matrix necessary to form a matrix with the restricted isometry property. We show that a matrix formed by uniformly and independently subsampling rows of an N × N Walsh matrix contains a K-sparse vector in the kernel, unless the number of subsampled rows is Ω(K log K log(N/K)) --- our lower bound applies whenever min(K, N/K) > log^C N. Containing a sparse vector in the kernel precludes not only the restricted isometry property, but more generally the application of those matrices for uniform sparse recovery.
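For concreteness, a minimal sketch (assumed setup, not taken from the paper) of the measurement ensemble studied here: uniformly and independently subsample m rows of an N × N Walsh-Hadamard matrix, rescale by the standard sqrt(N/m) factor, and apply the result to a K-sparse vector.

import numpy as np
from scipy.linalg import hadamard

def subsampled_walsh(N, m, rng):
    # N must be a power of two; rows of the orthonormal Hadamard/Walsh matrix are
    # drawn uniformly and independently, then rescaled by sqrt(N/m).
    H = hadamard(N) / np.sqrt(N)
    rows = rng.integers(0, N, size=m)
    return np.sqrt(N / m) * H[rows, :]

rng = np.random.default_rng(0)
N, m, K = 256, 64, 4
Phi = subsampled_walsh(N, m, rng)
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
print("||Phi x|| / ||x|| =", np.linalg.norm(Phi @ x) / np.linalg.norm(x))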
Sparser Johnson-Lindenstrauss Transforms
We give two different and simple constructions for dimensionality reduction in ℓ_2 via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1 + ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (JCSS 2003) and Dasgupta, Kumar, and Sarlós (STOC 2010). Such distributions can be used to speed up applications where dimensionality reduction is used.
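A minimal sketch of a sparse embedding in the spirit of this abstract (illustrative only; the parameters and the exact distribution are assumptions, not the paper's scheme): each column has s ≈ ε^{-1} log(1/δ) non-zero entries of value ±1/√s, so only an O(ε)-fraction of each column is non-zero, while the number of rows stays around ε^{-2} log(1/δ).

import numpy as np

def sparse_embedding(m, d, s, rng):
    # s non-zero entries of value +/- 1/sqrt(s) per column, in s distinct random rows.
    A = np.zeros((m, d))
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)
        A[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return A

rng = np.random.default_rng(1)
d, eps, delta = 2000, 0.1, 1e-3
m = int(np.ceil(eps ** -2 * np.log(1 / delta)))    # asymptotically optimal number of rows
s = int(np.ceil(eps ** -1 * np.log(1 / delta)))    # column sparsity: an O(eps)-fraction of m
A = sparse_embedding(m, d, s, rng)
x = rng.standard_normal(d)
print("||Ax|| / ||x|| =", np.linalg.norm(A @ x) / np.linalg.norm(x))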
Restricted Isometry Property for General p-Norms
The Restricted Isometry Property (RIP) is a fundamental property of a matrix
which enables sparse recovery. Informally, an m × n matrix A satisfies RIP of order k for the ℓ_p norm if ‖Ax‖_p ≈ ‖x‖_p for every vector x with at most k non-zero coordinates. For every p ∈ [1, ∞) we obtain almost tight bounds on the minimum number of rows necessary for the RIP property to hold. Prior to this work, only the cases p = 1, p = 1 + 1/log k, and p = 2 were studied. Interestingly, our results show that the case p = 2 is a "singularity" point: the optimal number of rows is Θ(k^p) for all p ∈ [1, ∞) \ {2}, as opposed to Θ(k) for p = 2.
We also obtain almost tight bounds for the column sparsity of RIP matrices
and discuss implications of our results for the Stable Sparse Recovery problem. An extended abstract of this paper is to appear at the 31st International Symposium on Computational Geometry (SoCG 2015).
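A minimal sketch of the RIP-p definition above (assumed setup): empirically probe the ratio ‖Ax‖_p / ‖x‖_p over random k-sparse vectors for a candidate matrix. A true RIP certificate would have to cover all k-sparse supports, so this is only an illustration.

import numpy as np

def rip_p_probe(A, k, p, trials, rng):
    # Sample random k-sparse vectors and record how far ||Ax||_p / ||x||_p strays from 1.
    n = A.shape[1]
    ratios = []
    for _ in range(trials):
        x = np.zeros(n)
        support = rng.choice(n, size=k, replace=False)
        x[support] = rng.standard_normal(k)
        ratios.append(np.linalg.norm(A @ x, ord=p) / np.linalg.norm(x, ord=p))
    return min(ratios), max(ratios)

rng = np.random.default_rng(0)
m, n, k, p = 200, 1000, 5, 2
A = rng.standard_normal((m, n)) / np.sqrt(m)   # standard Gaussian candidate for RIP with p = 2
print("empirical distortion range:", rip_p_probe(A, k, p, trials=500, rng=rng))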
Random projections for Bayesian regression
This article deals with random projections applied as a data reduction
technique for Bayesian regression analysis. We show sufficient conditions under
which the entire d-dimensional distribution is approximately preserved under random projections by reducing the number of data points from n to k in the case n ≫ d. Under mild assumptions, we prove that evaluating a Gaussian likelihood function based on the projected data instead of the original data yields a (1 + O(ε))-approximation in terms of the Wasserstein distance. Our main result shows that the posterior distribution of Bayesian linear regression is approximated up to a small error depending on only an ε-fraction of its defining parameters. This holds when using arbitrary Gaussian priors or the degenerate case of uniform distributions over R^d. Our empirical evaluations involve different simulated settings of Bayesian linear regression. Our experiments underline that the proposed method is able to recover the regression model up to small error while considerably reducing the total running time.
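A minimal sketch of the overall idea (assumed setup; the projection used here is a plain Gaussian matrix rather than necessarily the paper's embedding): sketch the regression data from n rows down to k rows and compare the Gaussian posterior computed from the projected data with the one computed from the full data.

import numpy as np

def gaussian_posterior(X, y, sigma2=1.0, tau2=10.0):
    # Posterior mean/covariance for y = X beta + noise, with prior beta ~ N(0, tau2 I)
    # and noise variance sigma2.
    d = X.shape[1]
    cov = np.linalg.inv(X.T @ X / sigma2 + np.eye(d) / tau2)
    mean = cov @ (X.T @ y) / sigma2
    return mean, cov

rng = np.random.default_rng(0)
n, d, k = 20000, 10, 500
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d)
y = X @ beta + 0.1 * rng.standard_normal(n)

Pi = rng.standard_normal((k, n)) / np.sqrt(k)      # dense Gaussian projection as a stand-in
mean_full, _ = gaussian_posterior(X, y)
mean_proj, _ = gaussian_posterior(Pi @ X, Pi @ y)
print("posterior-mean error:", np.linalg.norm(mean_full - mean_proj))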