8 research outputs found

    Approximate Matrix Multiplication with Application to Linear Embeddings

    In this paper, we study the problem of approximately computing the product of two real matrices. In particular, we analyze a dimensionality-reduction-based approximation algorithm due to Sarlos [1], introducing the notion of nuclear rank as the ratio of the nuclear norm to the spectral norm. The presented bound has improved dependence on the approximation error compared to previous approaches, while the subspace onto which we project the input matrices has dimension proportional to the maximum of their nuclear ranks and is independent of the input dimensions. In addition, we provide an application of this result to linear low-dimensional embeddings: we show that any Euclidean point set with bounded nuclear rank is amenable to projection onto a number of dimensions that is independent of the input dimensionality, while achieving additive error guarantees. (Comment: 8 pages, International Symposium on Information Theory)
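    As a rough illustration of the two ingredients above, here is a minimal numpy sketch (not the paper's exact algorithm or constants; all function names are ours): the nuclear rank is read off the singular values, and the product A^T B is approximated by sketching both factors with a shared Gaussian projection, in the spirit of Sarlos-style dimensionality reduction.

```python
import numpy as np

def nuclear_rank(M):
    """Nuclear rank as defined in the abstract: nuclear norm / spectral norm."""
    s = np.linalg.svd(M, compute_uv=False)
    return s.sum() / s[0]

def approx_matmul(A, B, t, rng=None):
    """Approximate A.T @ B by projecting both factors onto a random t-dim subspace.

    A is (n, p), B is (n, q); the sketch S is a (t, n) Gaussian matrix scaled so
    that E[(S @ A).T @ (S @ B)] = A.T @ B.
    """
    rng = np.random.default_rng(rng)
    S = rng.standard_normal((t, A.shape[0])) / np.sqrt(t)
    return (S @ A).T @ (S @ B)

# Tiny demo: inputs with small nuclear rank are approximated well even for small t.
rng = np.random.default_rng(0)
n, p, q = 2000, 50, 40
A = rng.standard_normal((n, 5)) @ rng.standard_normal((5, p))  # rank-5 factor
B = rng.standard_normal((n, 5)) @ rng.standard_normal((5, q))
exact = A.T @ B
approx = approx_matmul(A, B, t=200, rng=1)
rel_err = np.linalg.norm(exact - approx) / (np.linalg.norm(A) * np.linalg.norm(B))
print(f"nuclear rank of A: {nuclear_rank(A):.1f}, relative error: {rel_err:.3f}")
```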

    Approximation and Streaming Algorithms for Projective Clustering via Random Projections

    Let $P$ be a set of $n$ points in $\mathbb{R}^d$. In the projective clustering problem, given $k$, $q$ and a norm $\rho \in [1,\infty]$, we have to compute a set $\mathcal{F}$ of $k$ $q$-dimensional flats such that $(\sum_{p\in P} d(p,\mathcal{F})^\rho)^{1/\rho}$ is minimized; here $d(p,\mathcal{F})$ represents the (Euclidean) distance of $p$ to the closest flat in $\mathcal{F}$. We let $f_k^q(P,\rho)$ denote the minimal value and interpret $f_k^q(P,\infty)$ to be $\max_{r\in P} d(r,\mathcal{F})$. When $\rho = 1, 2$ and $\infty$ and $q = 0$, the problem corresponds to the $k$-median, $k$-means and the $k$-center clustering problems respectively. For every $0 < \epsilon < 1$, $S \subset P$ and $\rho \ge 1$, we show that the orthogonal projection of $P$ onto a randomly chosen flat of dimension $O(((q+1)^2 \log(1/\epsilon)/\epsilon^3) \log n)$ will $\epsilon$-approximate $f_1^q(S,\rho)$. This result combines the concepts of geometric coresets and subspace embeddings based on the Johnson-Lindenstrauss Lemma. As a consequence, an orthogonal projection of $P$ onto an $O(((q+1)^2 \log((q+1)/\epsilon)/\epsilon^3) \log n)$-dimensional randomly chosen subspace $\epsilon$-approximates projective clusterings for every $k$ and $\rho$ simultaneously. Note that the dimension of this subspace is independent of the number of clusters $k$. Using this dimension reduction result, we obtain new approximation and streaming algorithms for projective clustering problems. For example, given a stream of $n$ points, we show how to compute an $\epsilon$-approximate projective clustering for every $k$ and $\rho$ simultaneously using only $O((n+d)((q+1)^2 \log((q+1)/\epsilon))/\epsilon^3 \log n)$ space. Compared to standard streaming algorithms with $\Omega(kd)$ space requirement, our approach is a significant improvement when the number of input points and their dimensions are of the same order of magnitude. (Comment: Canadian Conference on Computational Geometry, CCCG 2015)
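    To make the dimension-reduction step concrete, here is a minimal numpy sketch under our own naming: it computes an illustrative target dimension following the bound quoted above (with the unspecified constant set to 1) and orthogonally projects the point set onto a uniformly random subspace of that dimension; any projective-clustering routine can then be run on the projected points.

```python
import numpy as np

def target_dim(q, eps, n):
    """Illustrative reduced dimension: O(((q+1)^2 log((q+1)/eps)/eps^3) log n).

    The leading constant (here 1) is for illustration; the paper's analysis
    fixes it. For small inputs this can exceed d, in which case no reduction
    is needed.
    """
    return int(np.ceil(((q + 1) ** 2 * np.log((q + 1) / eps) / eps ** 3) * np.log(n)))

def random_orthogonal_projection(P, m, rng=None):
    """Orthogonally project the rows of P (an n x d array) onto a random m-dim subspace."""
    rng = np.random.default_rng(rng)
    G = rng.standard_normal((P.shape[1], m))
    Q, _ = np.linalg.qr(G)   # orthonormal basis of a uniformly random m-dim subspace
    return P @ Q             # coordinates of the projected points

# Per the abstract, the projected point set epsilon-approximates f_k^q(P, rho)
# for every k and rho simultaneously, so clustering can proceed in m dimensions.
```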

    Dimensionality reduction with subgaussian matrices: a unified theory

    We present a theory for Euclidean dimensionality reduction with subgaussian matrices which unifies several restricted isometry property and Johnson-Lindenstrauss-type results obtained earlier for specific data sets. In particular, we recover and, in several cases, improve results for sets of sparse and structured sparse vectors, low-rank matrices and tensors, and smooth manifolds. In addition, we establish a new Johnson-Lindenstrauss embedding for data sets taking the form of an infinite union of subspaces of a Hilbert space.
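    A minimal numpy sketch of the setting, with our own function names: draw a subgaussian matrix from either the Gaussian or the Rademacher ensemble, scale it so squared norms are preserved in expectation, and check empirically that the norms of a few sparse vectors are nearly preserved, as a Johnson-Lindenstrauss-type guarantee predicts.

```python
import numpy as np

def subgaussian_map(d, m, kind="rademacher", rng=None):
    """An m x d subgaussian embedding matrix, scaled so E||Ax||^2 = ||x||^2."""
    rng = np.random.default_rng(rng)
    if kind == "gaussian":
        A = rng.standard_normal((m, d))
    elif kind == "rademacher":
        A = rng.choice([-1.0, 1.0], size=(m, d))
    else:
        raise ValueError(kind)
    return A / np.sqrt(m)

# Empirical check on a small set of sparse vectors: norms are nearly preserved.
rng = np.random.default_rng(0)
d, m, s = 10_000, 300, 10                       # ambient dim, target dim, sparsity
X = np.zeros((20, d))
for row in X:                                   # rows are views; edits are in place
    row[rng.choice(d, size=s, replace=False)] = rng.standard_normal(s)
A = subgaussian_map(d, m, "rademacher", rng=1)
ratios = np.linalg.norm(X @ A.T, axis=1) / np.linalg.norm(X, axis=1)
print(f"norm ratios in [{ratios.min():.3f}, {ratios.max():.3f}]")  # close to 1
```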

    Embeddings of surfaces, curves, and moving points in Euclidean space

    In this paper we show that dimensionality reduction (i.e., the Johnson-Lindenstrauss lemma) preserves not only the distances between static points, but also those between moving points, and more generally between low-dimensional flats, polynomial curves, curves with low winding degree, and polynomial surfaces. We also show that surfaces with bounded doubling dimension can be embedded into low dimension with small additive error. Finally, we show that for points with polynomial motion, the radius of the smallest enclosing ball can be preserved under dimensionality reduction.
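    The case of polynomial motion admits a short empirical illustration: since a linear map commutes with polynomial parameterizations, projecting the coefficient vectors of each trajectory projects the whole moving point. The numpy sketch below (our own construction, not the paper's proof) applies one random projection to a few polynomially moving points and measures the worst pairwise-distance distortion over a grid of times.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, deg, n_pts = 5_000, 400, 3, 8

# Each moving point is p_i(t) = sum_j C[i, j] * t**j with coefficients in R^d.
C = rng.standard_normal((n_pts, deg + 1, d))
A = rng.standard_normal((m, d)) / np.sqrt(m)    # one JL map reused for all times

# Projecting the coefficients projects the whole trajectory, since the map is
# linear: (A p_i)(t) = sum_j (A C[i, j]) * t**j.
C_proj = C @ A.T

def pos(coeffs, t):
    """Positions of all points at time t, given per-point polynomial coefficients."""
    powers = t ** np.arange(coeffs.shape[1])
    return np.einsum("ijd,j->id", coeffs, powers)

worst = 0.0
for t in np.linspace(-1.0, 1.0, 21):
    P, Pp = pos(C, t), pos(C_proj, t)
    D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)    # original distances
    Dp = np.linalg.norm(Pp[:, None] - Pp[None, :], axis=-1)  # projected distances
    mask = D > 0
    worst = max(worst, np.abs(Dp[mask] / D[mask] - 1).max())
print(f"worst relative distance distortion over the time grid: {worst:.3f}")
```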

    Random observations on random observations: Sparse signal acquisition and processing

    In recent years, signal processing has come under mounting pressure to accommodate the increasingly high-dimensional raw data generated by modern sensing systems. Despite extraordinary advances in computational power, processing the signals produced in application areas such as imaging, video, remote surveillance, spectroscopy, and genomic data analysis continues to pose a tremendous challenge. Fortunately, in many cases these high-dimensional signals contain relatively little information compared to their ambient dimensionality. For example, signals can often be well-approximated as a sparse linear combination of elements from a known basis or dictionary. Traditionally, sparse models have been exploited only after acquisition, typically for tasks such as compression. Recently, however, the applications of sparsity have greatly expanded with the emergence of compressive sensing, a new approach to data acquisition that directly exploits sparsity in order to acquire analog signals more efficiently via a small set of more general, often randomized, linear measurements. If properly chosen, the number of measurements can be much smaller than the number of Nyquist-rate samples. A common theme in this research is the use of randomness in signal acquisition, inspiring the design of hardware systems that directly implement random measurement protocols. This thesis builds on the field of compressive sensing and illustrates how sparsity can be exploited to design efficient signal processing algorithms at all stages of the information processing pipeline, with a particular focus on the manner in which randomness can be exploited to design new kinds of acquisition systems for sparse signals. Our key contributions include: (i) exploration and analysis of the appropriate properties for a sparse signal acquisition system; (ii) insight into the useful properties of random measurement schemes; (iii) analysis of an important family of algorithms for recovering sparse signals from random measurements; (iv) exploration of the impact of noise, both structured and unstructured, in the context of random measurements; and (v) algorithms that process random measurements to directly extract higher-level information or solve inference problems without resorting to full-scale signal recovery, reducing both the cost of signal acquisition and the complexity of the post-acquisition processing.
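    A minimal end-to-end numpy sketch of the acquisition-and-recovery pipeline described here, with all names ours: a k-sparse signal is acquired through far fewer random Gaussian measurements than Nyquist-rate samples, then recovered with a small orthogonal matching pursuit routine (one standard algorithm from the recovery family analyzed in this line of work).

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal Matching Pursuit: recover a k-sparse x from y = Phi @ x."""
    residual, support = y.copy(), []
    for _ in range(k):
        # Greedily pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit the signal on the chosen support via least squares.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 1_000, 100, 5                          # ambient dim, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian measurement matrix
y = Phi @ x                                      # m << n linear measurements
x_hat = omp(Phi, y, k)
print(f"recovery error: {np.linalg.norm(x_hat - x):.2e}")
```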