Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression
Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix A \in \R^{n \times d} with n \gg d and a p \in [1, 2), with a constant probability, we can construct a low-distortion embedding matrix \Pi \in \R^{O(\poly(d)) \times n} that embeds \A_p, the \ell_p subspace spanned by A's columns, into (\R^{O(\poly(d))}, \| \cdot \|_p); the distortion of our embeddings is only O(\poly(d)), and we can compute \Pi A in O(\nnz(A)) time, i.e., input-sparsity time. Our result generalizes the input-sparsity time \ell_2 subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for \ell_2. These input-sparsity time \ell_p embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as (1 \pm \epsilon)-distortion \ell_p subspace embedding and relative-error \ell_p regression. For \ell_2, we show that a (1+\epsilon)-approximate solution to the \ell_2 regression problem specified by the matrix A and a vector b \in \R^n can be computed in O(\nnz(A) + d^3 \log(d/\epsilon)/\epsilon^2) time; and for \ell_p, via a subspace-preserving sampling procedure, we show that a (1 \pm \epsilon)-distortion embedding of \A_p into \R^{O(\poly(d))} can be computed in O(\nnz(A) \cdot \log n) time, and we also show that a (1+\epsilon)-approximate solution to the \ell_p regression problem \min_x \|Ax - b\|_p can be computed in O(\nnz(A) \cdot \log n + \poly(d) \log(1/\epsilon)/\epsilon^2) time. Moreover, we can improve the embedding dimension, or equivalently the sample size, without increasing the overall complexity.

Comment: 22 pages
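To make the input-sparsity time idea concrete, here is a minimal NumPy sketch of the \ell_2 case, in the spirit of the Clarkson-Woodruff sparse embedding that this abstract generalizes: each of the n coordinates is hashed to one of k buckets with a random sign, so \Pi A can be accumulated in a single pass over the nonzeros of A. The sketch dimension k, the constants, and the sketch-and-solve use for least squares are illustrative assumptions, not the paper's exact parameters.

import numpy as np

def sparse_embedding(M, k, rng):
    """CountSketch-style embedding Pi: each row of M is hashed to one of k
    buckets with a random sign, so Pi @ M is built in one pass, O(nnz(M)) time."""
    n, d = M.shape
    buckets = rng.integers(0, k, size=n)          # h: [n] -> [k]
    signs = rng.choice([-1.0, 1.0], size=n)       # sigma: [n] -> {-1, +1}
    PM = np.zeros((k, d))
    np.add.at(PM, buckets, signs[:, None] * M)    # accumulate signed rows per bucket
    return PM

# Toy sketch-and-solve for ell_2 regression: embed [A b] once, then solve the
# small k x d least-squares problem. k ~ poly(d)/eps^2 is chosen loosely here.
rng = np.random.default_rng(0)
n, d, k = 5000, 10, 2000
A, b = rng.standard_normal((n, d)), rng.standard_normal(n)
S = sparse_embedding(np.column_stack([A, b]), k, rng)
x_sketch, *_ = np.linalg.lstsq(S[:, :d], S[:, d], rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))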
Sketching via hashing: from heavy hitters to compressed sensing to sparse Fourier transform
Sketching via hashing is a popular and useful method for processing large data sets. Its basic idea is as follows. Suppose that we have a large multi-set of elements S=[formula], and we would like to identify the elements that occur "frequently" in S. The algorithm starts by selecting a hash function h that maps the elements into an array c[1…m]. The array entries are initialized to 0. Then, for each element a ∈ S, the algorithm increments c[h(a)]. At the end of the process, each array entry c[j] contains the count of all data elements a ∈ S mapped to j.
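As a concrete illustration of the counting scheme described above, here is a minimal Python version with a single hash function; the salt-based hashing and the array size m are illustrative choices, and practical heavy-hitter sketches (e.g., Count-Min) repeat this with several independent hash functions to control collisions.

import random

def build_sketch(stream, m, seed=0):
    """Hash each element into an array c[0..m-1] of counters and increment c[h(a)]."""
    salt = random.Random(seed).getrandbits(64)
    h = lambda a: hash((salt, a)) % m             # one salted hash function
    c = [0] * m
    for a in stream:
        c[h(a)] += 1
    return c, h

def estimate(c, h, a):
    """c[h(a)] counts every element hashing to the same entry, so it never underestimates."""
    return c[h(a)]

# Toy usage: truly frequent elements stand out despite collisions.
stream = ["x"] * 50 + ["y"] * 30 + list("abcdefghij") * 2
c, h = build_sketch(stream, m=16)
print({a: estimate(c, h, a) for a in set(stream)})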
Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings
We consider the following oblivious sketching problem: given epsilon in (0,1/3) and n >= d/epsilon^2, design a distribution D over R^{k * nd} and a function f: R^k * R^{nd} -> R, so that for any n * d matrix A, Pr_{S sim D} [(1-epsilon) |A|_{op} <= f(S(A)) <= (1+epsilon) |A|_{op}] >= 2/3, where |A|_{op} = sup_{x:|x|_2 = 1} |Ax|_2 is the operator norm of A and S(A) denotes S * A, interpreting A as a vector in R^{nd}. We show a tight lower bound of k = Omega(d^2/epsilon^2) for this problem. Previously, Nelson and Nguyen (ICALP, 2014) considered the problem of finding a distribution D over R^{k * n} such that for any n * d matrix A, Pr_{S sim D}[forall x, (1-epsilon)|Ax|_2 <= |SAx|_2 <= (1+epsilon)|Ax|_2] >= 2/3, which is called an oblivious subspace embedding (OSE). Our result considerably strengthens theirs, as it (1) applies only to estimating the operator norm, which can be estimated given any OSE, and (2) applies to distributions over general linear operators S which treat A as a vector and compute S(A), rather than the restricted class of linear operators corresponding to matrix multiplication. Our technique also implies the first tight bounds for approximating the Schatten p-norm for even integers p via general linear sketches, improving the previous lower bound from k = Omega(n^{2-6/p}) [Regev, 2014] to k = Omega(n^{2-4/p}). Importantly, for sketching the operator norm up to a factor of alpha, where alpha - 1 = Omega(1), we obtain a tight k = Omega(n^2/alpha^4) bound, matching the upper bound of Andoni and Nguyen (SODA, 2013), and improving the previous k = Omega(n^2/alpha^6) lower bound. Finally, we also obtain the first lower bounds for approximating Ky Fan norms.
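To illustrate the remark that the operator norm can be estimated given any OSE, the snippet below uses a dense Gaussian sketch S with on the order of d/eps^2 rows and compares |SA|_{op} with |A|_{op}; the Gaussian construction and the constants are illustrative assumptions, and the abstract's point is that general linear sketches of A viewed as a vector in R^{nd} still require k = Omega(d^2/epsilon^2).

import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 4000, 20, 0.25
A = rng.standard_normal((n, d)) @ np.diag(np.linspace(1.0, 5.0, d))
k = int(8 * d / eps**2)                       # illustrative OSE size, ~ d/eps^2 rows
S = rng.standard_normal((k, n)) / np.sqrt(k)  # dense Gaussian OSE
est = np.linalg.norm(S @ A, 2)                # |SA|_op = sup_x |SAx|_2 / |x|_2
print(est, np.linalg.norm(A, 2))              # should agree up to a (1 +/- eps) factor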
Online Row Sampling
Finding a small spectral approximation for a tall n \times d matrix A is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of A. Row sampling improves interpretability, saves space when A is sparse, and preserves row structure, which is especially important, for example, when A represents a graph. However, correctly sampling rows from A can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the outputted approximation [KL13, KLM+14].
Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of A one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates A up to multiplicative error \epsilon and additive error \delta using O(d \log d \log(\epsilon \|A\|_2^2/\delta)/\epsilon^2) online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses O(d^2) memory but only requires O(d \log(\epsilon \|A\|_2^2/\delta)/\epsilon^2) samples, which we show is optimal.
Our methods are clean and intuitive, allow for lower memory usage than prior
work, and expose new theoretical properties of leverage score based matrix
approximation.
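The following is a minimal NumPy sketch of online leverage-score row sampling in the spirit of the setting above: rows arrive one by one, each row's (ridge) leverage score is estimated against the rows kept so far, and the keep-or-discard decision is made immediately and never retracted. The regularization lam, the oversampling constant c, and the exact sampling probabilities are illustrative assumptions rather than the paper's precise algorithm.

import numpy as np

def online_row_sample(rows, d, eps=0.5, lam=1e-3, c=8.0, seed=0):
    """Stream rows one at a time; keep or discard each immediately, no retractions."""
    rng = np.random.default_rng(seed)
    M = lam * np.eye(d)                     # Gram matrix of kept (rescaled) rows + lam*I
    kept = []
    for a in rows:
        tau = float(a @ np.linalg.solve(M, a))          # estimated online leverage score
        p = min(1.0, c * tau * np.log(d) / eps**2)      # keep probability
        if rng.random() < p:
            a_kept = a / np.sqrt(p)                     # rescale to keep B^T B close to A^T A
            kept.append(a_kept)
            M += np.outer(a_kept, a_kept)
    return np.array(kept)

# Toy usage: B^T B should approximate A^T A spectrally with far fewer rows.
rng = np.random.default_rng(1)
A = rng.standard_normal((20000, 20)) @ np.diag(np.linspace(1.0, 10.0, 20))
B = online_row_sample(A, d=20)
err = np.linalg.norm(A.T @ A - B.T @ B, 2) / np.linalg.norm(A.T @ A, 2)
print(B.shape[0], err)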
Lower Bounds for Oblivious Subspace Embeddings
An oblivious subspace embedding (OSE) for some ε, δ ∈ (0, 1/3) and d ≤ m ≤ n is a distribution D over R^{m × n} such that, for any linear subspace W ⊆ R^n of dimension d, Pr_{Π ∼ D}[∀ x ∈ W, (1 − ε)||x||_2 ≤ ||Πx||_2 ≤ (1 + ε)||x||_2] ≥ 1 − δ. We prove that any such OSE must have m = Ω((d + log(1/δ))/ε^2), which is optimal. Furthermore, if every Π in the support of D is sparse, having at most s non-zero entries per column, we show tradeoff lower bounds between m and s.
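As a quick numerical illustration of the OSE definition (not of the lower-bound techniques), the snippet below draws a dense Gaussian Π and checks the guarantee on one random d-dimensional subspace: the (1 ± ε) condition holds for every x in the subspace exactly when all singular values of ΠU lie in [1 − ε, 1 + ε], where U is an orthonormal basis of the subspace. The dimensions and constants are illustrative.

import numpy as np

def gaussian_ose(m, n, rng):
    """Dense Gaussian sketch, a classical OSE; rows are scaled by 1/sqrt(m)."""
    return rng.standard_normal((m, n)) / np.sqrt(m)

rng = np.random.default_rng(0)
n, d, eps = 2000, 10, 0.25
m = int(4 * d / eps**2)               # m on the order of d/eps^2; the constant 4 is arbitrary
U, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal basis of a random subspace W
Pi = gaussian_ose(m, n, rng)
svals = np.linalg.svd(Pi @ U, compute_uv=False)
print(m, svals.min(), svals.max())    # all values should fall within [1 - eps, 1 + eps]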