Approximation and Streaming Algorithms for Projective Clustering via Random Projections
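Let $P$ be a set of $n$ points in $\mathbb{R}^d$. In the projective
clustering problem, given $k$, $q$ and norm $\rho \in [1, \infty]$, we have to
compute a set $\mathcal{F}$ of $k$ $q$-dimensional flats such that
$(\sum_{p \in P} d(p, \mathcal{F})^\rho)^{1/\rho}$ is minimized; here
$d(p, \mathcal{F})$ represents the (Euclidean) distance of $p$ to the closest flat in
$\mathcal{F}$. We let $f_k^q(P, \rho)$ denote the minimal value and interpret
$f_k^q(P, \infty)$ to be $\max_{p \in P} d(p, \mathcal{F})$. When $\rho = 1, 2$
and $\infty$ and $q = 0$, the problem corresponds to the $k$-median, $k$-mean and the
$k$-center clustering problems respectively.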
For every , and , we show that the
orthogonal projection of onto a randomly chosen flat of dimension
will -approximate
. This result combines the concepts of geometric coresets and
subspace embeddings based on the Johnson-Lindenstrauss Lemma. As a consequence,
an orthogonal projection of to an dimensional randomly chosen subspace
-approximates projective clusterings for every and
simultaneously. Note that the dimension of this subspace is independent of the
number of clusters.
Using this dimension reduction result, we obtain new approximation and
streaming algorithms for projective clustering problems. For example, given a
stream of points, we show how to compute an -approximate
projective clustering for every and simultaneously using only
space. Compared to
standard streaming algorithms with space requirement, our approach
is a significant improvement when the number of input points and their
dimensions are of the same order of magnitude.
Comment: Canadian Conference on Computational Geometry (CCCG 2015)
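The dimension reduction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only projects a point set orthogonally onto a randomly chosen subspace and compares one simple clustering cost before and after. The function names, the Gaussian test data, the target dimension of 100, and the rescaling by sqrt(d/m) are all assumptions made for illustration; the cost shown is the single-center case (one cluster, 0-dimensional flat, Euclidean norm), where the optimal center is the centroid.

```python
import numpy as np


def random_orthogonal_projection(points, m, rng):
    """Project the rows of `points` (n x d) onto a random m-dimensional subspace.

    Hypothetical helper: the subspace is spanned by the orthonormal columns
    obtained from a QR factorization of a d x m Gaussian matrix.
    """
    d = points.shape[1]
    basis, _ = np.linalg.qr(rng.standard_normal((d, m)))  # d x m, orthonormal columns
    # Assumption: rescale by sqrt(d/m) so squared lengths are preserved in expectation.
    return np.sqrt(d / m) * (points @ basis)


def one_center_l2_cost(points):
    """l2 norm of the distances from each point to the centroid."""
    centered = points - points.mean(axis=0)
    return np.sqrt((centered ** 2).sum())


rng = np.random.default_rng(0)
P = rng.standard_normal((200, 1000))          # 200 points in 1000 dimensions
Q = random_orthogonal_projection(P, 100, rng)  # same points in 100 dimensions

ratio = one_center_l2_cost(Q) / one_center_l2_cost(P)
print(Q.shape, ratio)
```

On data like this the printed ratio should be close to 1, which is the kind of cost preservation the abstract refers to; the sketch says nothing about the specific dimension bounds the paper proves.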
A Dimension Reduction Scheme for the Computation of Optimal Unions of Subspaces
Given a set of points $\mathcal{F}$ in a high dimensional space, the difficulty of finding
a union of subspaces $\bigcup_i V_i \subset \mathbb{R}^N$ that best explains the data $\mathcal{F}$
increases dramatically with the dimension of $\mathbb{R}^N$. In this article, we study a
class of transformations that map the problem into another one in lower
dimension. We use the best model in the low dimensional space to approximate
the best solution in the original high dimensional space. We then estimate the
error produced between this solution and the optimal solution in the high
dimensional space.
Comment: 15 pages. Some corrections were added, in particular the title was
changed. It will appear in "Sampling Theory in Signal and Image Processing
- …