Dimensionality reduction with subgaussian matrices: a unified theory
We present a theory for Euclidean dimensionality reduction with subgaussian
matrices which unifies several restricted isometry property and
Johnson-Lindenstrauss type results obtained earlier for specific data sets. In
particular, we recover and, in several cases, improve results for sets of
sparse and structured sparse vectors, low-rank matrices and tensors, and smooth
manifolds. In addition, we establish a new Johnson-Lindenstrauss embedding for
data sets taking the form of an infinite union of subspaces of a Hilbert space.
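A minimal numpy sketch of the kind of subgaussian dimensionality reduction described above, using a Gaussian matrix scaled so that norms are preserved in expectation (the sizes are illustrative assumptions, not values from the paper):

```python
import numpy as np

def subgaussian_embed(X, m, rng=None):
    """Embed the rows of X (n points in R^d) into R^m with a random
    Gaussian (hence subgaussian) matrix, scaled so that squared norms
    are preserved in expectation."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    A = rng.standard_normal((m, d)) / np.sqrt(m)
    return X @ A.T

# Pairwise distances should be roughly preserved after embedding.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 1000))
Y = subgaussian_embed(X, m=300, rng=1)
ratio = np.linalg.norm(Y[0] - Y[1]) / np.linalg.norm(X[0] - X[1])
```

For structured sets (sparse vectors, low-rank matrices, manifolds), the theory shows the target dimension m can be taken far smaller than a worst-case bound would suggest.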
Isometric sketching of any set via the Restricted Isometry Property
In this paper we show that, for the purposes of dimensionality reduction, a
certain class of structured random matrices behaves similarly to random Gaussian
matrices. This class includes several matrices for which matrix-vector multiply
can be computed in log-linear time, providing efficient dimensionality
reduction of general sets. In particular, we show that, using such matrices, any
set in high dimensions can be embedded into lower dimensions with
near-optimal distortion. We obtain our results by connecting dimensionality
reduction of any set to dimensionality reduction of sparse vectors via a
chaining argument.
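One well-known member of such a class of structured matrices (offered here as an illustrative assumption; the paper treats the class in general) is the subsampled randomized Hadamard transform, whose matrix-vector multiply costs O(d log d) via the fast Walsh-Hadamard transform:

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform of a length-2^k vector, O(d log d)."""
    x = x.copy()
    d = len(x)
    h = 1
    while h < d:
        for i in range(0, d, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x

def srht_embed(x, m, rng=None):
    """Subsampled randomized Hadamard transform: random signs D, then
    the orthonormal Hadamard rotation H/sqrt(d), then uniform row
    subsampling with rescaling so norms are preserved in expectation."""
    rng = np.random.default_rng(rng)
    d = len(x)
    signs = rng.choice([-1.0, 1.0], size=d)
    y = fwht(signs * x) / np.sqrt(d)
    idx = rng.choice(d, size=m, replace=False)
    return y[idx] * np.sqrt(d / m)

x = np.random.default_rng(2).standard_normal(1024)
z = srht_embed(x, m=256, rng=3)
```

The random sign flip spreads the mass of x across all Hadamard coordinates, which is what makes uniform subsampling safe.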
Fast Cross-Polytope Locality-Sensitive Hashing
We provide a variant of cross-polytope locality-sensitive hashing with
respect to angular distance which is provably optimal in asymptotic sensitivity
and enjoys $O(d \ln d)$ hash computation time. Building on a recent
result (by Andoni, Indyk, Laarhoven, Razenshteyn, Schmidt, 2015), we show that
optimal asymptotic sensitivity for cross-polytope LSH is retained even when the
dense Gaussian matrix is replaced by a fast Johnson-Lindenstrauss transform
followed by a discrete pseudo-rotation, reducing the hash computation time from
$O(d^2)$ to $O(d \ln d)$. Moreover, our scheme achieves
the optimal rate of convergence for sensitivity. By incorporating a
low-randomness Johnson-Lindenstrauss transform, our scheme can be modified to
require only polylogarithmically many random bits.
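A minimal sketch of the baseline cross-polytope hash that the fast scheme accelerates: apply a dense Gaussian matrix, then map to the nearest cross-polytope vertex, that is, a signed standard basis vector. In the fast variant the dense matrix would be replaced by a fast Johnson-Lindenstrauss transform and a pseudo-rotation; the sizes below are illustrative.

```python
import numpy as np

def crosspolytope_hash(x, G):
    """Hash a unit vector: apply the (here dense Gaussian) rotation G,
    then return the nearest cross-polytope vertex +/- e_i, encoded as
    a (coordinate index, sign) pair."""
    y = G @ x
    i = int(np.argmax(np.abs(y)))
    return (i, 1 if y[i] >= 0 else -1)

rng = np.random.default_rng(0)
d = 64
G = rng.standard_normal((d, d))
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
h = crosspolytope_hash(x, G)
```

Nearby points on the sphere tend to map to the same signed coordinate, which is the collision property LSH exploits.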
Toward a unified theory of sparse dimensionality reduction in Euclidean space
Let $\Phi$ be a sparse Johnson-Lindenstrauss
transform [KN14] with $s$ non-zeroes per column. For a subset $T$ of the unit
sphere and $\varepsilon \in (0, 1/2)$ given, we study settings for $m, s$ required to
ensure $\mathbb{E} \sup_{x \in T} \big| \|\Phi x\|_2^2 - 1 \big| < \varepsilon$, i.e. so that $\Phi$ preserves the norm of every $x \in T$
simultaneously and multiplicatively up to $1 + \varepsilon$. We
introduce a new complexity parameter, which depends on the geometry of $T$, and
show that it suffices to choose $m$ and $s$ such that this parameter is small.
Our result is a sparse analog of Gordon's theorem, which was concerned with a
dense $\Phi$ having i.i.d. Gaussian entries. We qualitatively unify several
results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and
Fourier-based restricted isometries. Our work also implies new results in using
the sparse Johnson-Lindenstrauss transform in numerical linear algebra,
classical and model-based compressed sensing, manifold learning, and
constrained least squares problems such as the Lasso.
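A toy numpy construction of a sparse Johnson-Lindenstrauss transform with exactly s non-zeroes per column (a sketch with illustrative sizes; [KN14] specifies the actual constructions, and the theory above dictates how m and s must scale with the geometry of the set):

```python
import numpy as np

def sparse_jl(n, m, s, rng=None):
    """m x n sparse JL matrix: each column has exactly s non-zero
    entries, placed in s distinct random rows, with random signs
    scaled by 1/sqrt(s) so norms are preserved in expectation."""
    rng = np.random.default_rng(rng)
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)
        Phi[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return Phi

Phi = sparse_jl(n=500, m=100, s=8, rng=0)
x = np.random.default_rng(1).standard_normal(500)
x /= np.linalg.norm(x)
err = abs(np.linalg.norm(Phi @ x) ** 2 - 1.0)  # norm distortion on one unit vector
```

Because each column has only s non-zeroes, applying the transform to a vector with k non-zero entries costs O(sk) rather than O(mk), which is the source of the speedups in the applications listed above.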
Subspace clustering of dimensionality-reduced data
Subspace clustering refers to the problem of clustering unlabeled
high-dimensional data points into a union of low-dimensional linear subspaces,
assumed unknown. In practice one may have access to dimensionality-reduced
observations of the data only, resulting, e.g., from "undersampling" due to
complexity and speed constraints on the acquisition device. More pertinently,
even if one has access to the high-dimensional data set it is often desirable
to first project the data points into a lower-dimensional space and to perform
the clustering task there; this reduces storage requirements and computational
cost. The purpose of this paper is to quantify the impact of
dimensionality-reduction through random projection on the performance of the
sparse subspace clustering (SSC) and the thresholding based subspace clustering
(TSC) algorithms. We find that for both algorithms dimensionality reduction
down to the order of the subspace dimensions is possible without incurring
significant performance degradation. The mathematical engine behind our
theorems is a result quantifying how the affinities between subspaces change
under random dimensionality-reducing projections. Comment: ISIT 201
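A small synthetic sketch in the spirit of this analysis, using a TSC-style nearest-neighbor affinity rather than the full SSC/TSC algorithms (the sizes and the two random subspaces are illustrative assumptions): after random projection from d = 100 down to p = 20 dimensions, each point's strongest absolute correlation should still come from its own subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, n_per = 100, 20, 40      # ambient dim, reduced dim, points per subspace

# Two random 3-dimensional subspaces, with unit-norm points on each.
blocks, labels = [], []
for lab in range(2):
    U = np.linalg.qr(rng.standard_normal((d, 3)))[0]   # orthonormal basis
    X = U @ rng.standard_normal((3, n_per))
    X /= np.linalg.norm(X, axis=0)
    blocks.append(X)
    labels += [lab] * n_per
X = np.hstack(blocks)           # d x 80 data matrix, columns are points
y = np.array(labels)

# Random dimensionality reduction of the data.
P = rng.standard_normal((p, d)) / np.sqrt(p)
Z = P @ X
Z /= np.linalg.norm(Z, axis=0)

# TSC-style affinity: each point's strongest absolute correlation.
G = np.abs(Z.T @ Z)
np.fill_diagonal(G, 0.0)
nn = G.argmax(axis=0)
same = float((y[nn] == y).mean())  # fraction of same-subspace nearest neighbors
```

Note that p = 20 is on the order of the subspace dimension (3), far below the ambient dimension d = 100, matching the paper's finding that reduction down to the order of the subspace dimensions preserves clustering performance.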