Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression
Low-distortion embeddings are critical building blocks for developing random
sampling and random projection algorithms for linear algebra problems. We show
that, given a matrix A \in \R^{n \times d} with n \gg d and a p \in [1, 2), with a constant probability, we can construct a low-distortion embedding
matrix \Pi \in \R^{O(\poly(d)) \times n} that embeds \A_p, the
subspace spanned by A's columns, into (\R^{O(\poly(d))}, \| \cdot \|_p);
the distortion of our embeddings is only O(\poly(d)), and we can compute \Pi A in O(\nnz(A)) time, i.e., input-sparsity time. Our result generalizes the
input-sparsity time \ell_2 subspace embedding by Clarkson and Woodruff
[STOC'13]; and, for completeness, we present a simpler and improved analysis of
their construction for \ell_2. These input-sparsity time \ell_p embeddings
are optimal, up to constants, in terms of their running time; and the improved
running time propagates to applications such as (1 \pm \epsilon)-distortion
subspace embedding and relative-error \ell_p regression. For
\ell_2, we show that a (1+\epsilon)-approximate solution to the \ell_2
regression problem specified by the matrix A and a vector b \in \R^n can be
computed in O(\nnz(A) + d^3 \log(d/\epsilon) /\epsilon^2) time; and for
p \in [1, 2), via a subspace-preserving sampling procedure, we show that a
(1 \pm \epsilon)-distortion embedding of \A_p into \R^{O(\poly(d))} can be
computed in O(\nnz(A) \cdot \log n) time, and we also show that a
(1+\epsilon)-approximate solution to the \ell_p regression problem can be
computed in O(\nnz(A) \cdot \log n + \poly(d)
\log(1/\epsilon)/\epsilon^2) time. Moreover, we can reduce the embedding
dimension, or equivalently the sample size, without increasing the complexity.

Comment: 22 pages
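As a concrete illustration of the sketch-and-solve pattern behind these results, here is a minimal NumPy sketch (not the paper's construction; the dimensions and the sketch size m are illustrative choices) that applies a CountSketch-style sparse embedding in O(\nnz(A)) time and solves the reduced least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)

def countsketch(A, m):
    """Apply a CountSketch (sparse embedding) matrix S with m rows.

    Each row of A is hashed to one of m buckets and multiplied by a
    random sign, so S @ A is formed in O(nnz(A)) work without ever
    materializing S.
    """
    n = A.shape[0]
    buckets = rng.integers(0, m, size=n)      # hash h : [n] -> [m]
    signs = rng.choice([-1.0, 1.0], size=n)   # random signs in {-1, +1}
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)
    return SA

# Tall least-squares problem: min_x ||Ax - b||_2.
n, d = 20000, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

# Sketch [A, b] once, then solve the small m x d problem instead.
m = 2000  # theory asks for poly(d)/eps^2 rows; fixed here for the demo
SAb = countsketch(np.hstack([A, b[:, None]]), m)
x_sketch, *_ = np.linalg.lstsq(SAb[:, :d], SAb[:, d], rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

err_sketch = np.linalg.norm(A @ x_sketch - b)
err_exact = np.linalg.norm(A @ x_exact - b)
print(err_sketch / err_exact)  # close to 1, i.e. a (1+eps)-approximation
```

The sketch is applied once to the stacked matrix [A, b] so the same row hashes act on both, after which any least-squares solver can be run on the m-row problem.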
Optimal approximate matrix product in terms of stable rank
We prove, using the subspace embedding guarantee in a black box way, that one
can achieve the spectral norm guarantee for approximate matrix multiplication
with a dimensionality-reducing map having O(\tilde{r}/\epsilon^2)
rows. Here \tilde{r} is the maximum stable rank, i.e. squared ratio of
Frobenius and operator norms, of the two matrices being multiplied. This is a
quantitative improvement over previous work of [MZ11, KVZ14], and is also
optimal for any oblivious dimensionality-reducing map. Furthermore, due to the
black box reliance on the subspace embedding property in our proofs, our
theorem can be applied to a much more general class of sketching matrices than
what was known before, in addition to achieving better bounds. For example, one
can apply our theorem to efficient subspace embeddings such as the Subsampled
Randomized Hadamard Transform or sparse subspace embeddings, or even with
subspace embedding constructions that may be developed in the future.
Our main theorem, via connections with spectral error matrix multiplication
shown in prior work, implies quantitative improvements for approximate least
squares regression and low rank approximation. Our main result has also already
been applied to improve dimensionality reduction guarantees for k-means
clustering [CEMMP14], and implies new results for nonparametric regression
[YPW15].
We also separately point out that the proof of the "BSS" deterministic
row-sampling result of [BSS12] can be modified to show that for any matrices
A, B of stable rank at most \tilde{r}, one can achieve the spectral norm
guarantee for approximate matrix multiplication of A^T B by deterministically
sampling O(\tilde{r}/\epsilon^2) rows that can be found in polynomial
time. The original result of [BSS12] was for rank instead of stable rank. Our
observation leads to a stronger version of a main theorem of [KMST10].

Comment: v3: minor edits; v2: fixed one step in proof of Theorem 9 which was
wrong by a constant factor (see the new Lemma 5 and its use; final theorem
unaffected)
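To make the quantities concrete, the following NumPy sketch (our own illustration, not the paper's proof; the constant in the sketch size is a generous choice for the demo rather than the theorem's constant) computes the stable rank and checks the spectral-norm error of an approximate product formed through a Gaussian dimensionality-reducing map:

```python
import numpy as np

rng = np.random.default_rng(1)

def stable_rank(M):
    """Squared ratio of Frobenius and operator norms: ||M||_F^2 / ||M||_2^2."""
    return np.linalg.norm(M, "fro") ** 2 / np.linalg.norm(M, 2) ** 2

n, d = 5000, 30
# Two matrices with rapidly decaying spectra, so stable rank << rank d.
U = np.linalg.qr(rng.standard_normal((n, d)))[0]
A = U * (0.5 ** np.arange(d))   # scale columns geometrically
B = U * (0.6 ** np.arange(d))

r = max(stable_rank(A), stable_rank(B))   # maximum stable rank

# Dense Gaussian sketch with O(r/eps^2) rows (the theorem also covers
# SRHT and sparse subspace embeddings via the black-box OSE property).
eps = 0.5
m = int(np.ceil(20 * r / eps**2))   # generous constant for the demo
S = rng.standard_normal((m, n)) / np.sqrt(m)

exact = A.T @ B
approx = (S @ A).T @ (S @ B)
spec_err = np.linalg.norm(approx - exact, 2)
bound = eps * np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
print(r, spec_err, bound)  # spectral error small relative to ||A|| ||B||
```

Note that m depends on the stable rank r (here about 1.6) rather than on the rank d = 30, which is exactly the improvement the abstract describes.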
Efficient Construction of Probabilistic Tree Embeddings
In this paper we describe an algorithm that embeds a graph metric
(V, d_G) of an undirected weighted graph G = (V, E) into a distribution of
tree metrics (T, d_T) such that for every pair u, v \in V, d_G(u, v) \le
d_T(u, v) and \mathbf{E}[d_T(u, v)] \le O(\log n) \cdot d_G(u, v). Such
embeddings have proved highly useful in designing fast approximation
algorithms, as many hard problems on graphs are easy to solve on tree
instances. For a graph with n vertices and m edges, our algorithm runs in
O(m \log n) time with high probability, which improves the previous upper
bound shown by Mendel et al. in 2009.
The key component of our algorithm is a new approximate single-source
shortest-path algorithm, which implements the priority queue with a new data
structure, the "bucket-tree structure". The algorithm has three properties: it
only requires linear time in the number of edges in the input graph; the
computed distances have a distance preserving property; and when computing the
shortest-paths to the k-nearest vertices from the source, it only needs to
visit these vertices and their edge lists. These properties are essential to
guarantee the correctness and the stated time bound.
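A classical relative of this idea is Dial's algorithm, which replaces Dijkstra's heap with an array of buckets indexed by tentative distance. The sketch below (exact distances, small integer weights; this is not the paper's bucket-tree structure, which additionally handles approximate distances) shows the bucket-queue mechanics:

```python
from collections import deque

def dial_sssp(adj, src, max_w):
    """Dijkstra with a bucket queue (Dial's algorithm) for integer weights.

    adj[u] is a list of (v, w) pairs with 1 <= w <= max_w. Buckets
    replace the heap: bucket d holds vertices with tentative distance d,
    and buckets are drained in increasing order of d.
    """
    n = len(adj)
    INF = float("inf")
    dist = [INF] * n
    dist[src] = 0
    buckets = [deque() for _ in range((n - 1) * max_w + 1)]
    buckets[0].append(src)
    for d in range(len(buckets)):
        while buckets[d]:
            u = buckets[d].popleft()
            if dist[u] != d:
                continue  # stale entry; a shorter distance was found later
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    buckets[d + w].append(v)
    return dist

# Unit-weight 4-cycle 0-1-2-3-0 plus a chord 0-2 of weight 2.
adj = [[(1, 1), (3, 1), (2, 2)],
       [(0, 1), (2, 1)],
       [(1, 1), (3, 1), (0, 2)],
       [(0, 1), (2, 1)]]
print(dial_sssp(adj, 0, 2))  # -> [0, 1, 2, 1]
```

Stale entries are skipped rather than deleted, so each vertex may sit in several buckets but is settled only once, at its final distance.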
Using this shortest-path algorithm, we show how to generate an intermediate
structure, the approximate dominance sequences of the input graph, in
O(m \log n) time, and further propose a simple yet efficient algorithm to
convert this sequence to a tree embedding in O(n \log n) time, both with high
probability. Combining the three subroutines gives the stated time bound of
the algorithm.
We then show that this efficient construction can facilitate applications. We
prove that FRT trees (the generated tree embedding) are Ramsey partitions with
an asymptotically tight bound, so the construction of a series of distance
oracles can be accelerated.
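The shape of the guarantee (every tree metric dominates the graph metric, while the expected stretch stays low) can be checked on a standard toy example rather than the paper's algorithm: embed the unit-weight n-cycle into the distribution of paths obtained by deleting one uniformly random edge.

```python
from itertools import combinations

n = 64  # cycle C_n with unit-weight edges

def cycle_dist(u, v):
    k = abs(u - v)
    return min(k, n - k)

def path_dist_after_deleting(e, u, v):
    """Tree (path) metric obtained by deleting cycle edge (e, e+1 mod n)."""
    # Relabel vertices so the deleted edge sits at the ends of the path.
    ru, rv = (u - e - 1) % n, (v - e - 1) % n
    return abs(ru - rv)

worst = 0.0
for u, v in combinations(range(n), 2):
    d = cycle_dist(u, v)
    tree = [path_dist_after_deleting(e, u, v) for e in range(n)]
    # Dominance: a path is a subgraph of the cycle, so distances only grow.
    assert min(tree) >= d
    # Expected stretch over the uniformly random deleted edge.
    worst = max(worst, (sum(tree) / n) / d)
print(worst)  # at most 2(n-1)/n, i.e. below 2 for every pair
```

Here a single fixed path would stretch some edge by a factor of n - 1, but averaging over the random choice of deleted edge brings the expected stretch of every pair below 2, mirroring (in miniature) the gap between deterministic and probabilistic tree embeddings.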
Multi-Embedding of Metric Spaces
Metric embedding has become a common technique in the design of algorithms.
Its applicability is often dependent on how high the embedding's distortion is.
For example, embedding a finite metric space into trees may require linear
distortion as a function of its size. Using probabilistic metric embeddings,
the bound on the distortion reduces to logarithmic in the size.
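The notions of expansion, contraction, and distortion used here fit in a few lines of Python; the toy example below (our own illustration, not from the paper) embeds the 4-cycle into a path by identity labeling, which stretches one cycle edge to length 3:

```python
from itertools import combinations

def distortion(dX, dY, f, points):
    """Distortion of an embedding f : (X, dX) -> (Y, dY), defined as
    (max expansion) * (max contraction) over all pairs of points."""
    ratios = [dY(f(u), f(v)) / dX(u, v) for u, v in combinations(points, 2)]
    return max(ratios) * max(1.0 / r for r in ratios)

n = 4
cycle = lambda u, v: min(abs(u - v), n - abs(u - v))  # metric of C_4
path = lambda u, v: abs(u - v)                        # metric of a path

# Identity embedding of the 4-cycle into the path 0-1-2-3: the cycle
# edge (0, 3) of length 1 is stretched to path distance 3, and no pair
# contracts, so the distortion is exactly 3.
print(distortion(cycle, path, lambda x: x, range(n)))  # -> 3.0
```

For the n-cycle the same phenomenon forces distortion linear in n into any single tree, which is the lower bound that multi-embeddings are designed to bypass.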
We take a step toward bypassing the lower bound on the
distortion in terms of the size of the metric. We define "multi-embeddings" of
metric spaces in which a point is mapped onto a set of points, while keeping
the target metric of polynomial size and preserving the distortion of paths.
The distortion obtained with such multi-embeddings into ultrametrics is at most
O(log Delta loglog Delta) where Delta is the aspect ratio of the metric. In
particular, for expander graphs, we are able to obtain constant distortion
embeddings into trees in contrast with the Omega(log n) lower bound for all
previous notions of embeddings.
We demonstrate the algorithmic application of the new embeddings for two
optimization problems: group Steiner tree and metrical task systems.