
    Lower Bounds for Oblivious Subspace Embeddings

    An oblivious subspace embedding (OSE) for some $\epsilon, \delta \in (0,1/3)$ and $d \le m \le n$ is a distribution $\mathcal{D}$ over $\mathbb{R}^{m\times n}$ such that $\Pr_{\Pi \sim \mathcal{D}}\big(\forall x \in W,\ (1-\epsilon)\|x\|_2 \le \|\Pi x\|_2 \le (1+\epsilon)\|x\|_2\big) \ge 1-\delta$ for any linear subspace $W \subset \mathbb{R}^n$ of dimension $d$. We prove any OSE with $\delta < 1/3$ has $m = \Omega((d + \log(1/\delta))/\epsilon^2)$, which is optimal. Furthermore, if every $\Pi$ in the support of $\mathcal{D}$ is sparse, having at most $s$ non-zero entries per column, we show tradeoff lower bounds between $m$ and $s$.
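    The definition can be checked numerically for a concrete distribution. The sketch below (not from the paper, which proves lower bounds only) draws a dense Gaussian sketching matrix with $m$ on the order of $d/\epsilon^2$, picks a random $d$-dimensional subspace $W$, and reads off the distortion over all of $W$ from the singular values of $\Pi U$; all sizes are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        n, d, eps = 1000, 10, 0.25
        m = int(4 * d / eps**2)        # rows on the order of d / eps^2, matching the lower bound

        # Orthonormal basis U of a random d-dimensional subspace W of R^n.
        U, _ = np.linalg.qr(rng.standard_normal((n, d)))

        # Dense Gaussian sketch; rows scaled so that E[||Pi x||^2] = ||x||^2.
        Pi = rng.standard_normal((m, n)) / np.sqrt(m)

        # For x = U c, ||Pi x|| / ||x|| lies between the extreme singular values of Pi @ U,
        # so these two numbers capture the distortion over the entire subspace W.
        s = np.linalg.svd(Pi @ U, compute_uv=False)
        print("distortion range over W:", s.min(), s.max())   # typically within [1 - eps, 1 + eps]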

    Optimal approximate matrix product in terms of stable rank

    We prove, using the subspace embedding guarantee in a black-box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having $m = O(\tilde{r}/\varepsilon^2)$ rows. Here $\tilde{r}$ is the maximum stable rank, i.e. squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [MZ11, KVZ14], and is also optimal for any oblivious dimensionality-reducing map. Furthermore, due to the black-box reliance on the subspace embedding property in our proofs, our theorem can be applied to a much more general class of sketching matrices than was known before, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even to subspace embedding constructions that may be developed in the future. Our main theorem, via connections with spectral error matrix multiplication shown in prior work, implies quantitative improvements for approximate least squares regression and low rank approximation. Our main result has also already been applied to improve dimensionality reduction guarantees for $k$-means clustering [CEMMP14], and implies new results for nonparametric regression [YPW15]. We also separately point out that the proof of the "BSS" deterministic row-sampling result of [BSS12] can be modified to show that for any matrices $A, B$ of stable rank at most $\tilde{r}$, one can achieve the spectral norm guarantee for approximate matrix multiplication of $A^T B$ by deterministically sampling $O(\tilde{r}/\varepsilon^2)$ rows that can be found in polynomial time. The original result of [BSS12] was for rank instead of stable rank. Our observation leads to a stronger version of a main theorem of [KMST10].
    Comment: v3: minor edits; v2: fixed one step in proof of Theorem 9 which was wrong by a constant factor (see the new Lemma 5 and its use; final theorem unaffected).
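    As a purely illustrative companion to the statement (not the paper's construction), the sketch below forms the approximate product $(\Pi A)^T (\Pi B)$ with a Gaussian $\Pi$ whose number of rows scales as stable rank over $\varepsilon^2$, and compares the spectral-norm error against $\varepsilon \|A\| \|B\|$; the leading constant and all matrix sizes are arbitrary choices for the example.

        import numpy as np

        rng = np.random.default_rng(1)
        n, p, q, eps = 2000, 30, 40, 0.25

        # Two tall matrices with decaying spectra, so that stable rank << rank.
        A = rng.standard_normal((n, p)) @ np.diag(np.linspace(1.0, 0.05, p))
        B = rng.standard_normal((n, q)) @ np.diag(np.linspace(1.0, 0.05, q))

        def stable_rank(M):
            # squared ratio of Frobenius and operator norms
            return np.linalg.norm(M, "fro") ** 2 / np.linalg.norm(M, 2) ** 2

        r = max(stable_rank(A), stable_rank(B))
        m = int(4 * r / eps**2)                        # rows proportional to stable rank / eps^2
        Pi = rng.standard_normal((m, n)) / np.sqrt(m)  # dense Gaussian sketch

        err = np.linalg.norm(A.T @ B - (Pi @ A).T @ (Pi @ B), 2)
        bound = eps * np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
        print(f"stable rank ~ {r:.1f}, spectral error {err:.4f}, eps*||A||*||B|| = {bound:.4f}")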

    Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

    We present a technical survey of state-of-the-art approaches to data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching, and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower bounding techniques.

    Dimensionality Reduction for k-Means Clustering and Low Rank Approximation

    We show how to approximate a data matrix $\mathbf{A}$ with a much smaller sketch $\mathbf{\tilde A}$ that can be used to solve a general class of constrained $k$-rank approximation problems to within $(1+\epsilon)$ error. Importantly, this class of problems includes $k$-means clustering and unconstrained low rank approximation (i.e. principal component analysis). By reducing data points to just $O(k)$ dimensions, our methods generically accelerate any exact, approximate, or heuristic algorithm for these ubiquitous problems. For $k$-means dimensionality reduction, we provide $(1+\epsilon)$ relative error results for many common sketching techniques, including random row projection, column selection, and approximate SVD. For approximate principal component analysis, we give a simple alternative to known algorithms that has applications in the streaming setting. Additionally, we extend recent work on column-based matrix reconstruction, giving column subsets that not only 'cover' a good subspace for $\mathbf{A}$, but can be used directly to compute this subspace. Finally, for $k$-means clustering, we show how to achieve a $(9+\epsilon)$ approximation by Johnson-Lindenstrauss projecting data points to just $O(\log k/\epsilon^2)$ dimensions. This gives the first result that leverages the specific structure of $k$-means to achieve dimension independent of input size and sublinear in $k$.
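    A minimal sketch of that last reduction, under the assumption that an off-the-shelf $k$-means solver is available (scikit-learn's KMeans is used here only for illustration): project the points to roughly $\log k/\epsilon^2$ dimensions with a dense Johnson-Lindenstrauss matrix, cluster in the low-dimensional space, and evaluate the resulting partition back in the original space. The data, constants, and parameter choices are assumptions made for the example.

        import numpy as np
        from sklearn.cluster import KMeans   # any k-means solver could be plugged in here

        rng = np.random.default_rng(2)
        n, d, k, eps = 5000, 200, 20, 0.5

        # Synthetic data: k well-separated Gaussian blobs in d dimensions.
        centers = rng.standard_normal((k, d)) * 5
        X = np.repeat(centers, n // k, axis=0) + rng.standard_normal((n, d))

        # Johnson-Lindenstrauss projection to about log(k) / eps^2 dimensions (constant omitted).
        t = int(np.ceil(np.log(k) / eps**2))
        Pi = rng.standard_normal((d, t)) / np.sqrt(t)
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X @ Pi)

        def kmeans_cost(X, labels):
            # cost of the partition in the original space, using the optimal center per cluster
            return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
                       for c in np.unique(labels))

        print("cost of clustering found in the projected space:", kmeans_cost(X, labels))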

    Ramsey-type theorems for metric spaces with applications to online problems

    A nearly logarithmic lower bound on the randomized competitive ratio for the metrical task systems problem is presented. This implies a similar lower bound for the extensively studied k-server problem. The proof is based on Ramsey-type theorems for metric spaces, which state that every metric space contains a large subspace that is approximately a hierarchically well-separated tree (and in particular an ultrametric). These Ramsey-type theorems may be of independent interest.
    Comment: Fix an error in the metadata. 31 pages, 0 figures. Preliminary version in FOCS '01. To be published in J. Comput. System Sci.

    Random projections for Bayesian regression

    This article deals with random projections applied as a data reduction technique for Bayesian regression analysis. We show sufficient conditions under which the entire $d$-dimensional distribution is approximately preserved under random projections by reducing the number of data points from $n$ to $k \in O(\operatorname{poly}(d/\varepsilon))$ in the case $n \gg d$. Under mild assumptions, we prove that evaluating a Gaussian likelihood function based on the projected data instead of the original data yields a $(1+O(\varepsilon))$-approximation in terms of the $\ell_2$ Wasserstein distance. Our main result shows that the posterior distribution of Bayesian linear regression is approximated up to a small error depending on only an $\varepsilon$-fraction of its defining parameters. This holds when using arbitrary Gaussian priors or the degenerate case of uniform distributions over $\mathbb{R}^d$ for $\beta$. Our empirical evaluations involve different simulated settings of Bayesian linear regression. Our experiments underline that the proposed method is able to recover the regression model up to small error while considerably reducing the total running time.
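    A minimal sketch of the idea under simplifying assumptions (dense Gaussian sketching matrix, isotropic Gaussian prior on $\beta$, known noise variance): compress $(X, y)$ from $n$ rows to $k$ sketched rows and compute the standard Gaussian posterior from the sketched data in place of the full data. This is not the authors' implementation; all sizes and the prior are illustrative.

        import numpy as np

        rng = np.random.default_rng(3)
        n, d, sigma2 = 20_000, 10, 1.0
        beta = rng.standard_normal(d)
        X = rng.standard_normal((n, d))
        y = X @ beta + rng.standard_normal(n) * np.sqrt(sigma2)

        # Reduce the n data points to k sketched rows with a dense Gaussian projection.
        k = 500
        S = rng.standard_normal((k, n)) / np.sqrt(k)
        Xs, ys = S @ X, S @ y

        def gaussian_posterior(X, y, prior_prec=1.0):
            # posterior of beta under a N(0, prior_prec^{-1} I) prior and Gaussian noise
            prec = X.T @ X / sigma2 + prior_prec * np.eye(X.shape[1])
            cov = np.linalg.inv(prec)
            return cov @ (X.T @ y / sigma2), cov

        mean_full, _ = gaussian_posterior(X, y)
        mean_sketch, _ = gaussian_posterior(Xs, ys)
        print("gap between full and sketched posterior means:",
              np.linalg.norm(mean_full - mean_sketch))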