Approximate Matrix Multiplication with Application to Linear Embeddings
In this paper, we study the problem of approximately computing the product of
two real matrices. In particular, we analyze a dimensionality-reduction-based
approximation algorithm due to Sarlos [1], introducing the notion of nuclear
rank as the ratio of the nuclear norm over the spectral norm. The presented
bound has improved dependence with respect to the approximation error (as
compared to previous approaches), whereas the subspace -- on which we project
the input matrices -- has dimension proportional to the maximum of their
nuclear ranks and is independent of the input dimensions. In addition, we
provide an application of this result to linear low-dimensional embeddings.
Namely, we show that any Euclidean point-set with bounded nuclear rank is
amenable to projection onto a number of dimensions that is independent of the
input dimensionality, while achieving additive error guarantees. Comment: 8 pages, International Symposium on Information Theory.
Approximation and Streaming Algorithms for Projective Clustering via Random Projections
Let P be a set of n points in R^d. In the projective clustering problem, given
k, q and a norm ρ ∈ [1, ∞], we have to compute a set F of k q-dimensional
flats such that (∑_{p∈P} d(p, F)^ρ)^{1/ρ} is minimized; here d(p, F)
represents the (Euclidean) distance of p to the closest flat in F. We let
f_k^q(P, ρ) denote the minimal value and interpret f_k^q(P, ∞) to be
max_{r∈P} d(r, F). When ρ = 1, 2 and ∞ and q = 0, the problem corresponds to
the k-median, k-means and the k-center clustering problems respectively.
For every 0 < ε < 1, S ⊂ P and ρ ≥ 1, we show that the orthogonal projection
of P onto a randomly chosen flat of dimension O(((q + 1)^2 log(1/ε)/ε^3) log n)
will ε-approximate f_1^q(S, ρ). This result combines the concepts of geometric
coresets and subspace embeddings based on the Johnson-Lindenstrauss Lemma. As a
consequence, an orthogonal projection of P to an
O(((q + 1)^2 log((q + 1)/ε)/ε^3) log n)-dimensional randomly chosen subspace
ε-approximates projective clusterings for every k and ρ simultaneously. Note
that the dimension of this subspace is independent of the number of clusters k.
Using this dimension reduction result, we obtain new approximation and
streaming algorithms for projective clustering problems. For example, given a
stream of n points, we show how to compute an ε-approximate projective
clustering for every k and ρ simultaneously using only
O((n + d)((q + 1)^2 log((q + 1)/ε))/ε^3 log n) space. Compared to standard
streaming algorithms with Ω(kd) space requirement, our approach is a
significant improvement when the number of input points and their dimensions
are of the same order of magnitude. Comment: Canadian Conference on Computational Geometry (CCCG 2015).
Dimensionality reduction with subgaussian matrices: a unified theory
We present a theory for Euclidean dimensionality reduction with subgaussian
matrices which unifies several restricted isometry property and
Johnson-Lindenstrauss type results obtained earlier for specific data sets. In
particular, we recover and, in several cases, improve results for sets of
sparse and structured sparse vectors, low-rank matrices and tensors, and smooth
manifolds. In addition, we establish a new Johnson-Lindenstrauss embedding for
data sets taking the form of an infinite union of subspaces of a Hilbert space.
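As a small illustration of the subgaussian setting, the sketch below draws a Rademacher (random ±1) matrix, one standard subgaussian ensemble covered by such unified analyses, and checks empirically that it nearly preserves the norms of a sample of sparse vectors. The dimensions and sparsity are illustrative choices, not constants from the theory.

```python
import numpy as np

rng = np.random.default_rng(2)

d, m, s = 2000, 400, 10   # ambient dim, sketch dim, sparsity (illustrative)

# Subgaussian (Rademacher) matrix scaled by 1/sqrt(m); Gaussian entries
# would work equally well under the same analysis.
A = rng.choice([-1.0, 1.0], size=(m, d)) / np.sqrt(m)

# Empirical norm preservation over a sample of random s-sparse vectors.
worst = 0.0
for _ in range(50):
    x = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    x[support] = rng.standard_normal(s)
    ratio = np.linalg.norm(A @ x) / np.linalg.norm(x)
    worst = max(worst, abs(ratio - 1.0))
```

A restricted-isometry-style guarantee is uniform over all sparse vectors; this sample check only probes a few of them, which is why it is an illustration rather than a verification.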
Embeddings of surfaces, curves, and moving points in Euclidean space
In this paper we show that dimensionality reduction (i.e., Johnson-Lindenstrauss lemma) preserves not only the distances between static points, but also between moving points, and more generally between low-dimensional flats, polynomial curves, curves with low winding degree, and polynomial surfaces. We also show that surfaces with bounded doubling dimension can be embedded into low dimension with small additive error. Finally, we show that for points with polynomial motion, the radius of the smallest enclosing ball can be preserved under dimensionality reduction.
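The key observation for points with polynomial motion can be verified directly: a linear projection commutes with polynomial evaluation, so embedding the finitely many coefficient vectors embeds the entire trajectory. A minimal Python check, with illustrative dimensions and degree:

```python
import numpy as np

rng = np.random.default_rng(3)

d, m, deg = 500, 20, 3   # ambient dim, target dim, motion degree (illustrative)

# A moving point p(t) = sum_i C[i] * t**i with coefficient vectors in R^d.
C = rng.standard_normal((deg + 1, d))

def position(coeffs, t):
    """Evaluate the polynomial trajectory at time t."""
    return sum(c * t**i for i, c in enumerate(coeffs))

# A Johnson-Lindenstrauss-style Gaussian projection Pi : R^d -> R^m.
Pi = rng.standard_normal((m, d)) / np.sqrt(m)

# By linearity, projecting the trajectory equals evaluating the trajectory
# of the projected coefficients: Pi @ p(t) = sum_i (Pi @ C[i]) * t**i.
C_proj = C @ Pi.T
gaps = [np.linalg.norm(Pi @ position(C, t) - position(C_proj, t))
        for t in (0.0, 0.5, 1.0, 2.0)]
```

The identity is exact up to floating-point rounding; the content of the paper is the quantitative claim that distances along such trajectories are also preserved.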
Random observations on random observations: Sparse signal acquisition and processing
In recent years, signal processing has come under mounting pressure to accommodate the increasingly high-dimensional raw data generated by modern sensing systems. Despite extraordinary advances in computational power, processing the signals produced in application areas such as imaging, video, remote surveillance, spectroscopy, and genomic data analysis continues to pose a tremendous challenge. Fortunately, in many cases these high-dimensional signals contain relatively little information compared to their ambient dimensionality. For example, signals can often be well-approximated as a sparse linear combination of elements from a known basis or dictionary.
Traditionally, sparse models have been exploited only after acquisition, typically for tasks such as compression. Recently, however, the applications of sparsity have greatly expanded with the emergence of compressive sensing, a new approach to data acquisition that directly exploits sparsity in order to acquire analog signals more efficiently via a small set of more general, often randomized, linear measurements. If properly chosen, the number of measurements can be much smaller than the number of Nyquist-rate samples. A common theme in this research is the use of randomness in signal acquisition, inspiring the design of hardware systems that directly implement random measurement protocols.
This thesis builds on the field of compressive sensing and illustrates how sparsity can be exploited to design efficient signal processing algorithms at all stages of the information processing pipeline, with a particular focus on the manner in which randomness can be exploited to design new kinds of acquisition systems for sparse signals. Our key contributions include: (i) exploration and analysis of the appropriate properties for a sparse signal acquisition system; (ii) insight into the useful properties of random measurement schemes; (iii) analysis of an important family of algorithms for recovering sparse signals from random measurements; (iv) exploration of the impact of noise, both structured and unstructured, in the context of random measurements; and (v) algorithms that process random measurements to directly extract higher-level information or solve inference problems without resorting to full-scale signal recovery, reducing both the cost of signal acquisition and the complexity of the post-acquisition processing.
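As a concrete instance of the acquisition-and-recovery pipeline described above, the sketch below measures a sparse signal with a random Gaussian matrix using far fewer measurements than samples, then recovers it with Orthogonal Matching Pursuit, one standard greedy algorithm from the family such analyses cover. The dimensions, sparsity, and this particular OMP implementation are illustrative, not the thesis's specific algorithms.

```python
import numpy as np

rng = np.random.default_rng(4)

n, m, s = 128, 48, 4   # signal length, measurements, sparsity (illustrative)

# An s-sparse signal and a random Gaussian measurement matrix.
x = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x[support] = rng.standard_normal(s)

A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x              # the m << n compressive measurements

def omp(A, y, s):
    """Orthogonal Matching Pursuit: greedily pick the column most
    correlated with the residual, then re-fit by least squares."""
    residual, chosen = y.copy(), []
    for _ in range(s):
        chosen.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, chosen], y, rcond=None)
        residual = y - A[:, chosen] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[chosen] = coef
    return x_hat

x_hat = omp(A, y, s)
err = np.linalg.norm(x_hat - x)
```

With Gaussian measurements and m on the order of s log n, greedy recovery of this kind succeeds with high probability, which is the regime the measurement-property analyses in such work characterize.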