Search CORE

1,320 research outputs found

Improved Practical Matrix Sketching with Guarantees

Author: Desai Amey
Ghashami Mina
Phillips Jeff M.
Publication venue
Publication date: 01/01/2014
Field of study

Matrices have become essential data representations for many large-scale problems in data analytics, and hence matrix sketching is a critical task. Although much research has focused on improving the error/size tradeoff under various sketching paradigms, the many forms of error bounds make these approaches hard to compare in theory and in practice. This paper attempts to categorize and compare most known methods under row-wise streaming updates with provable guarantees, and then to tweak some of these methods to gain practical improvements while retaining guarantees. For instance, we observe that a simple heuristic iSVD, with no guarantees, tends to outperform all known approaches in terms of size/error trade-off. We modify the best performing method with guarantees FrequentDirections under the size/error trade-off to match the performance of iSVD and retain its guarantees. We also demonstrate some adversarial datasets where iSVD performs quite poorly. In comparing techniques in the time/error trade-off, techniques based on hashing or sampling tend to perform better. In this setting we modify the most studied sampling regime to retain error guarantee but obtain dramatic improvements in the time/error trade-off. Finally, we provide easy replication of our studies on APT, a new testbed which makes available not only code and datasets, but also a computing platform with fixed environmental settings.Comment: 27 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

Isometric sketching of any set via the Restricted Isometry Property

Author: Oymak Samet
Recht Benjamin
Soltanolkotabi Mahdi
Publication venue
Publication date: 06/10/2015
Field of study

In this paper we show that for the purposes of dimensionality reduction certain class of structured random matrices behave similarly to random Gaussian matrices. This class includes several matrices for which matrix-vector multiply can be computed in log-linear time, providing efficient dimensionality reduction of general sets. In particular, we show that using such matrices any set from high dimensions can be embedded into lower dimensions with near optimal distortion. We obtain our results by connecting dimensionality reduction of any set to dimensionality reduction of sparse vectors via a chaining argument.Comment: 17 page

arXiv.org e-Print Archive

CiteSeerX

Practical sketching algorithms for low-rank matrix approximation

Author: Cevher Volkan
Tropp Joel A.
Udell Madeleine
Yurtsever Alp
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 06/12/2017
Field of study

This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Caltech Authors

Toward a unified theory of sparse dimensionality reduction in Euclidean space

Author: Avron H.
Bühlmann P.
Candès E.
Hegde C.
Lu Y.
Paul S.
Talagrand M.
Woodruff D. P.
Publication venue
Publication date: 01/01/2015
Field of study

Let

\Phi\in\mathbb{R}^{m\times n}

be a sparse Johnson-Lindenstrauss transform [KN14] with

s

non-zeroes per column. For a subset

T

of the unit sphere,

\varepsilon\in(0,1/2)

given, we study settings for

m,s

required to ensure

\mathop{\mathbb{E}}_\Phi \sup_{x\in T} \left|\|\Phi x\|_2^2 - 1 \right| < \varepsilon ,

i.e. so that

\Phi

preserves the norm of every

x\in T

simultaneously and multiplicatively up to

1+\varepsilon

. We introduce a new complexity parameter, which depends on the geometry of

T

, and show that it suffices to choose

s

and

m

such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense

\Phi

having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson-Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso

arXiv.org e-Print Archive

CiteSeerX

Crossref

Publikationsserver der RWTH Aachen University

Utrecht University Repository

Simple and Deterministic Matrix Sketching

Author: Liberty Edo
Publication venue
Publication date: 11/07/2012
Field of study

We adapt a well known streaming algorithm for approximating item frequencies to the matrix sketching setting. The algorithm receives the rows of a large matrix

A \in \R^{n \times m}

one after the other in a streaming fashion. It maintains a sketch matrix B \in \R^ {1/\eps \times m} such that for any unit vector

x

[\|Ax\|^2 \ge \|Bx\|^2 \ge \|Ax\|^2 - \eps \|A\|_{f}^2 \.] Sketch updates per row in

A

require O(m/\eps^2) operations in the worst case. A slight modification of the algorithm allows for an amortized update time of O(m/\eps) operations per row. The presented algorithm stands out in that it is: deterministic, simple to implement, and elementary to prove. It also experimentally produces more accurate sketches than widely used approaches while still being computationally competitive

arXiv.org e-Print Archive

CiteSeerX