93 research outputs found

    Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression

    Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix $A \in \mathbb{R}^{n \times d}$ with $n \gg d$ and a $p \in [1, 2)$, with a constant probability, we can construct a low-distortion embedding matrix $\Pi \in \mathbb{R}^{O(\mathrm{poly}(d)) \times n}$ that embeds $\mathcal{A}_p$, the $\ell_p$ subspace spanned by $A$'s columns, into $(\mathbb{R}^{O(\mathrm{poly}(d))}, \|\cdot\|_p)$; the distortion of our embeddings is only $O(\mathrm{poly}(d))$, and we can compute $\Pi A$ in $O(\mathrm{nnz}(A))$ time, i.e., input-sparsity time. Our result generalizes the input-sparsity time $\ell_2$ subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for $\ell_2$. These input-sparsity time $\ell_p$ embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as $(1 \pm \epsilon)$-distortion $\ell_p$ subspace embedding and relative-error $\ell_p$ regression. For $\ell_2$, we show that a $(1+\epsilon)$-approximate solution to the $\ell_2$ regression problem specified by the matrix $A$ and a vector $b \in \mathbb{R}^n$ can be computed in $O(\mathrm{nnz}(A) + d^3 \log(d/\epsilon)/\epsilon^2)$ time; and for $\ell_p$, via a subspace-preserving sampling procedure, we show that a $(1 \pm \epsilon)$-distortion embedding of $\mathcal{A}_p$ into $\mathbb{R}^{O(\mathrm{poly}(d))}$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n)$ time, and we also show that a $(1+\epsilon)$-approximate solution to the $\ell_p$ regression problem $\min_{x \in \mathbb{R}^d} \|Ax - b\|_p$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n + \mathrm{poly}(d) \log(1/\epsilon)/\epsilon^2)$ time. Moreover, we can improve the embedding dimension, or equivalently the sample size, to $O(d^{3+p/2} \log(1/\epsilon)/\epsilon^2)$ without increasing the complexity. (Comment: 22 pages)
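    As a rough illustration of the input-sparsity-time idea for $\ell_2$ (a minimal sketch, not the paper's exact construction), the snippet below applies a CountSketch-style embedding $\Pi$ with a single random $\pm 1$ entry per column, so forming $\Pi A$ costs time proportional to $\mathrm{nnz}(A)$, and then solves the much smaller sketched least-squares problem. The sketch size `m` and the problem dimensions are arbitrary demo choices, not the paper's bounds.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

n, d = 100_000, 20
A = sparse.random(n, d, density=1e-3, format="csr", random_state=0)
b = rng.standard_normal(n)

m = 4 * d * d  # illustrative sketch size (theory: poly(d)/eps^2 rows for l2)

# CountSketch-style Pi: each of the n coordinates is hashed to one of m
# buckets and given a random sign, so Pi has one nonzero per column and
# Pi @ A can be formed in time proportional to nnz(A).
buckets = rng.integers(0, m, size=n)
signs = rng.choice([-1.0, 1.0], size=n)
Pi = sparse.csr_matrix((signs, (buckets, np.arange(n))), shape=(m, n))

SA = (Pi @ A).toarray()
Sb = Pi @ b

# Solve the small sketched regression instead of the full one.
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
x_exact, *_ = np.linalg.lstsq(A.toarray(), b, rcond=None)
print("sketched residual:", np.linalg.norm(A @ x_sketch - b))
print("exact residual:   ", np.linalg.norm(A @ x_exact - b))
```

    A CountSketch-type embedding with $O(d^2)$ rows is known to preserve the column span up to constant distortion with constant probability, which already gives a constant-factor regression approximation; the $(1+\epsilon)$ running times quoted above come from combining such an embedding with the further steps described in the paper.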

    Sketching via hashing: from heavy hitters to compressed sensing to sparse Fourier transform

    Sketching via hashing is a popular and useful method for processing large data sets. Its basic idea is as follows. Suppose that we have a large multi-set of elements S = [formula], and we would like to identify the elements that occur "frequently" in S. The algorithm starts by selecting a hash function h that maps the elements into an array c[1…m]. The array entries are initialized to 0. Then, for each element a ∈ S, the algorithm increments c[h(a)]. At the end of the process, each array entry c[j] contains the count of all data elements a ∈ S mapped to j.
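    A minimal, self-contained Python sketch of the counting step just described (the bucket count `m`, the seed, and the use of Python's built-in `hash` are illustrative assumptions; a practical heavy-hitters sketch such as Count-Min would combine several independent hash rows):

```python
from collections import Counter

def build_count_array(stream, m, seed=0):
    """Hash each element into one of m buckets and keep per-bucket counts.
    c[j] ends up holding the total count of all elements hashed to j."""
    c = [0] * m
    for a in stream:
        j = hash((seed, a)) % m   # h(a): bucket index in [0, m)
        c[j] += 1
    return c

def estimate_count(a, c, m, seed=0):
    """Estimated frequency of a: an overestimate inflated by colliding elements."""
    return c[hash((seed, a)) % m]

stream = ["x"] * 50 + ["y"] * 30 + list("abcdefghij")
m = 16
c = build_count_array(stream, m)
print(estimate_count("x", c, m), "vs true", Counter(stream)["x"])
```

    Because colliding elements add their counts together, c[h(a)] can only overestimate the true frequency of a, which is why the frequent elements can still be identified when m is large enough relative to the number of distinct light elements.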

    Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings

    We consider the following oblivious sketching problem: given $\epsilon \in (0,1/3)$ and $n \ge d/\epsilon^2$, design a distribution $D$ over $\mathbb{R}^{k \times nd}$ and a function $f: \mathbb{R}^k \times \mathbb{R}^{nd} \to \mathbb{R}$, so that for any $n \times d$ matrix $A$, $\Pr_{S \sim D}[(1-\epsilon)\|A\|_{op} \le f(S(A), S) \le (1+\epsilon)\|A\|_{op}] \ge 2/3$, where $\|A\|_{op} = \sup_{x:\|x\|_2 = 1} \|Ax\|_2$ is the operator norm of $A$ and $S(A)$ denotes $S \cdot A$, interpreting $A$ as a vector in $\mathbb{R}^{nd}$. We show a tight lower bound of $k = \Omega(d^2/\epsilon^2)$ for this problem. Previously, Nelson and Nguyen (ICALP, 2014) considered the problem of finding a distribution $D$ over $\mathbb{R}^{k \times n}$ such that for any $n \times d$ matrix $A$, $\Pr_{S \sim D}[\forall x, (1-\epsilon)\|Ax\|_2 \le \|SAx\|_2 \le (1+\epsilon)\|Ax\|_2] \ge 2/3$, which is called an oblivious subspace embedding (OSE). Our result considerably strengthens theirs, as it (1) applies only to estimating the operator norm, which can be estimated given any OSE, and (2) applies to distributions over general linear operators $S$ which treat $A$ as a vector and compute $S(A)$, rather than the restricted class of linear operators corresponding to matrix multiplication. Our technique also implies the first tight bounds for approximating the Schatten $p$-norm for even integers $p$ via general linear sketches, improving the previous lower bound from $k = \Omega(n^{2-6/p})$ [Regev, 2014] to $k = \Omega(n^{2-4/p})$. Importantly, for sketching the operator norm up to a factor of $\alpha$, where $\alpha - 1 = \Omega(1)$, we obtain a tight $k = \Omega(n^2/\alpha^4)$ bound, matching the upper bound of Andoni and Nguyen (SODA, 2013), and improving the previous $k = \Omega(n^2/\alpha^6)$ lower bound. Finally, we also obtain the first lower bounds for approximating Ky Fan norms.
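    To make the two sketching models concrete, here is a small NumPy illustration (the dimensions, Gaussian entries, and scaling are demo assumptions, not constructions from the paper) contrasting a general linear sketch, which acts on $A$ flattened to a vector in $\mathbb{R}^{nd}$, with an OSE-style sketch restricted to left matrix multiplication $SA$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 10, 100
A = rng.standard_normal((n, d))

# General linear sketch: S acts on A viewed as a vector in R^{nd},
# producing k linear measurements S(A) = S @ vec(A).
S_general = rng.standard_normal((k, n * d)) / np.sqrt(k)
sketch_general = S_general @ A.reshape(-1)      # shape (k,)

# OSE-style sketch: S is k x n and is applied by matrix multiplication,
# so the sketch S @ A keeps the column-space structure of A.
S_ose = rng.standard_normal((k, n)) / np.sqrt(k)
sketch_ose = S_ose @ A                          # shape (k, d)

# From an OSE one can read off a rough estimate of the operator norm;
# the lower bound above says that even the fully general model needs
# k = Omega(d^2 / epsilon^2) measurements for a (1 +/- epsilon) estimate.
print(np.linalg.norm(A, 2), np.linalg.norm(sketch_ose, 2))
```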

    Online Row Sampling

    Finding a small spectral approximation for a tall $n \times d$ matrix $A$ is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of $A$. Row sampling improves interpretability, saves space when $A$ is sparse, and preserves row structure, which is especially important, for example, when $A$ represents a graph. However, correctly sampling rows from $A$ can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the outputted approximation [KL13, KLM+14]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of $A$ one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates $A$ up to multiplicative error $\epsilon$ and additive error $\delta$ using $O(d \log d \log(\epsilon\|A\|_2/\delta)/\epsilon^2)$ online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses $O(d^2)$ memory but only requires $O(d \log(\epsilon\|A\|_2/\delta)/\epsilon^2)$ samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage score based matrix approximation.
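    The following Python sketch is in the spirit of the online scheme described above, though it is only an illustrative simplification (the constant `c`, the exact sampling probability, and the ridge term `delta` are assumptions, not the paper's algorithm): each arriving row is kept, irrevocably, with probability proportional to its online ridge leverage score measured against the reweighted rows already kept.

```python
import numpy as np

def online_row_sample(rows, eps, delta, c=2.0, rng=None):
    """Stream rows a_i one by one; keep row i with probability proportional to
    its online ridge leverage score a_i^T (M + delta*I)^{-1} a_i, where M
    accumulates the reweighted rows kept so far. Kept rows are rescaled by
    1/sqrt(p_i) so that B^T B is an unbiased estimator of A^T A."""
    rng = rng if rng is not None else np.random.default_rng(0)
    d = rows.shape[1]
    M = np.zeros((d, d))                 # running approximation of A^T A
    kept = []
    for a in rows:
        score = a @ np.linalg.solve(M + delta * np.eye(d), a)  # online leverage score
        p = min(1.0, c * np.log(d) * score / eps**2)           # sampling probability
        if rng.random() < p:
            w = a / np.sqrt(p)                                 # reweight the kept row
            kept.append(w)
            M += np.outer(w, w)
    return np.array(kept)

rng = np.random.default_rng(1)
A = rng.standard_normal((20_000, 20))
B = online_row_sample(A, eps=0.5, delta=1e-3, rng=rng)
print(f"kept {B.shape[0]} of {A.shape[0]} rows")
print("spectral ratio:", np.linalg.norm(B.T @ B, 2) / np.linalg.norm(A.T @ A, 2))
```

    The decision for each row is made immediately and never revisited, matching the online constraint; the working memory is just the $d \times d$ matrix M plus the rows that are kept.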

    Lower Bounds for Oblivious Subspace Embeddings

    An oblivious subspace embedding (OSE) for some $\epsilon, \delta \in (0,1/3)$ and $d \le m \le n$ is a distribution $\mathcal{D}$ over $\mathbb{R}^{m \times n}$ such that $\Pr_{\Pi \sim \mathcal{D}}(\forall x \in W, (1-\epsilon)\|x\|_2 \le \|\Pi x\|_2 \le (1+\epsilon)\|x\|_2) \ge 1-\delta$ for any linear subspace $W \subset \mathbb{R}^n$ of dimension $d$. We prove any OSE with $\delta < 1/3$ has $m = \Omega((d + \log(1/\delta))/\epsilon^2)$, which is optimal. Furthermore, if every $\Pi$ in the support of $\mathcal{D}$ is sparse, having at most $s$ non-zero entries per column, we show tradeoff lower bounds between $m$ and $s$.
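    As a concrete, hedged illustration of the definition (a numerical check, not a construction from the paper): the snippet below draws a dense Gaussian $\Pi$ with $m$ on the order of $(d + \log(1/\delta))/\epsilon^2$ rows, the regime matched by the lower bound, and empirically checks the $(1\pm\epsilon)$ condition on random unit vectors from a $d$-dimensional subspace $W$; the constant `c` is an arbitrary demo choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 15
eps, delta = 0.25, 0.05

# Embedding dimension in the m = Theta((d + log(1/delta))/eps^2) regime,
# with an arbitrary demo constant c.
c = 2
m = int(c * (d + np.log(1 / delta)) / eps**2)

# A random d-dimensional subspace W of R^n, given by an orthonormal basis U.
U, _ = np.linalg.qr(rng.standard_normal((n, d)))

# Dense Gaussian Pi, scaled so that E ||Pi x||_2^2 = ||x||_2^2.
Pi = rng.standard_normal((m, n)) / np.sqrt(m)

# Check the distortion on random unit vectors x = U y in W;
# the sampled norms should land in [1 - eps, 1 + eps].
norms = []
for _ in range(1000):
    y = rng.standard_normal(d)
    x = U @ (y / np.linalg.norm(y))      # unit vector in W
    norms.append(np.linalg.norm(Pi @ x))
print(f"m = {m}, min = {min(norms):.3f}, max = {max(norms):.3f}")
```

    Checking random vectors only samples the guarantee; the definition quantifies over all of $W$, which a Gaussian $\Pi$ of this size satisfies with high probability for a large enough constant $c$.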