Sketch-based Randomized Algorithms for Dynamic Graph Regression
A well-known problem in data science and machine learning is {\em linear
regression}, which has recently been extended to dynamic graphs. Existing exact
algorithms for updating the solution of the dynamic graph regression problem
require at least linear time (in terms of the size of the graph).
However, this time complexity might be intractable in practice. In the current
paper, we utilize {\em subsampled randomized Hadamard transform} and
\textsf{CountSketch} to propose the first randomized algorithms. Suppose that
we are given an matrix embedding of the graph, where .
Let be the number of samples required for a guaranteed approximation error,
which is a sublinear function of . Our first algorithm reduces time
complexity of pre-processing to .
Then after an edge insertion or an edge deletion, it updates the approximate
solution in time. Our second algorithm reduces time complexity of
pre-processing to , where is the number of nonzero elements of . Then after
an edge insertion or an edge deletion or a node insertion or a node deletion,
it updates the approximate solution in time, with
. Finally, we show
that under some assumptions, if our first algorithm
outperforms our second algorithm and if our second
algorithm outperforms our first algorithm.
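To make the CountSketch idea in the abstract above concrete, here is a minimal sketch of sketched least-squares regression. It is only an illustration and not the paper's dynamic update algorithm: the function name `countsketch`, the sizes `n`, `m`, `r`, and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np

def countsketch(A, r, rng):
    """Compress the n rows of a 2-D array A into r rows (illustrative helper).

    Each row is hashed to one of r buckets and added there with a random
    sign, so the work is proportional to the number of entries of A.
    """
    n = A.shape[0]
    buckets = rng.integers(0, r, size=n)        # bucket index for each row
    signs = rng.choice([-1.0, 1.0], size=n)     # random sign for each row
    SA = np.zeros((r, A.shape[1]))
    np.add.at(SA, buckets, signs[:, None] * A)  # signed accumulation per bucket
    return SA

# Synthetic regression problem (assumed data, not taken from the paper).
rng = np.random.default_rng(0)
n, m, r = 5000, 5, 500
M = rng.standard_normal((n, m))
x_true = rng.standard_normal(m)
b = M @ x_true + 0.01 * rng.standard_normal(n)

# Sketch [M | b] with one shared sketch, then solve the small r x m problem.
S_Mb = countsketch(np.column_stack([M, b]), r, rng)
x_sketch, *_ = np.linalg.lstsq(S_Mb[:, :m], S_Mb[:, m], rcond=None)

# Compare against the exact least-squares solution on the full data.
x_exact, *_ = np.linalg.lstsq(M, b, rcond=None)
rel_err = np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact)
```

Because the sketch is linear, changing a single row of the input only changes one bucket of the sketched matrix, which is one intuition for why sketches suit dynamic settings; the update procedures and running times claimed in the abstract are not reproduced here.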
Improved analysis of the subsampled randomized Hadamard transform
This paper presents an improved analysis of a structured dimension-reduction
map called the subsampled randomized Hadamard transform. This argument
demonstrates that the map preserves the Euclidean geometry of an entire
subspace of vectors. The new proof is much simpler than previous approaches,
and it offers---for the first time---optimal constants in the estimate on the
number of dimensions required for the embedding.
Comment: 8 pages. To appear, Advances in Adaptive Data Analysis, special issue
"Sparse Representation of Data and Images." v2--v4 include minor correction
Randomized Dynamic Mode Decomposition
This paper presents a randomized algorithm for computing the near-optimal
low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging
techniques to compute low-rank matrix approximations at a fraction of the cost
of deterministic algorithms, easing the computational challenges arising in the
area of `big data'. The idea is to derive a small matrix from the
high-dimensional data, which is then used to efficiently compute the dynamic
modes and eigenvalues. The algorithm is presented in a modular probabilistic
framework, and the approximation quality can be controlled via oversampling and
power iterations. The effectiveness of the resulting randomized DMD algorithm
is demonstrated on several benchmark examples of increasing complexity,
providing an accurate and efficient approach to extract spatiotemporal coherent
structures from big data in a framework that scales with the intrinsic rank of
the data, rather than the ambient measurement dimension. In this work we
assume that the dynamics of the problem under consideration evolve on a
low-dimensional subspace that is well characterized by a fast-decaying singular
value spectrum.
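The two-stage idea described in the abstract above (derive a small matrix from the data, then compute modes and eigenvalues from it) can be sketched as follows. This is a hedged illustration assuming a Gaussian test matrix and the standard exact-DMD formulas on the compressed data; the function name `randomized_dmd`, its default parameters, and the toy data are assumptions, not the authors' reference implementation.

```python
import numpy as np

def randomized_dmd(D, rank, oversample=10, power_iters=2, rng=None):
    """Illustrative randomized DMD of a snapshot matrix D (space x time)."""
    rng = np.random.default_rng() if rng is None else rng
    n_time = D.shape[1]

    # Stage 1: randomized range finder with oversampling and power iterations.
    Omega = rng.standard_normal((n_time, rank + oversample))
    Q, _ = np.linalg.qr(D @ Omega)
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(D.T @ Q)
        Q, _ = np.linalg.qr(D @ Z)
    B = Q.T @ D                                   # small compressed data matrix

    # Stage 2: DMD on the compressed snapshots.
    X, Y = B[:, :-1], B[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    A_tilde = U.conj().T @ Y @ Vh.conj().T / s    # U^* Y V Sigma^{-1}
    eigvals, W = np.linalg.eig(A_tilde)
    modes_small = (Y @ Vh.conj().T / s) @ W       # modes in compressed space

    # Lift the modes back to the original measurement space.
    return eigvals, Q @ modes_small

# Toy data: two spatial patterns decaying at rates 0.95 and 0.8 per step.
x = np.linspace(0, np.pi, 200)
t = np.arange(30)
D = np.outer(np.sin(x), 0.95 ** t) + np.outer(np.cos(3 * x), 0.8 ** t)
eigvals, modes = randomized_dmd(D, rank=2, rng=np.random.default_rng(0))
```

On this exactly rank-2 toy example the recovered eigenvalues should match the two decay rates; for real data the accuracy depends on how quickly the singular values decay and on the oversampling and power-iteration settings.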
Optimal approximate matrix product in terms of stable rank
We prove, using the subspace embedding guarantee in a black box way, that one
can achieve the spectral norm guarantee for approximate matrix multiplication
with a dimensionality-reducing map having
rows. Here is the maximum stable rank, i.e. squared ratio of
Frobenius and operator norms, of the two matrices being multiplied. This is a
quantitative improvement over previous work of [MZ11, KVZ14], and is also
optimal for any oblivious dimensionality-reducing map. Furthermore, due to the
black box reliance on the subspace embedding property in our proofs, our
theorem can be applied to a much more general class of sketching matrices than
what was known before, in addition to achieving better bounds. For example, one
can apply our theorem to efficient subspace embeddings such as the Subsampled
Randomized Hadamard Transform or sparse subspace embeddings, or even with
subspace embedding constructions that may be developed in the future.
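As an illustration of one subspace embedding named above, the following is a minimal sketch of the Subsampled Randomized Hadamard Transform, assuming the usual construction (random signs, orthonormal Hadamard transform, uniform row sampling, rescaling). The helper names `fwht` and `srht` and the sizes `n`, `d`, `r` are assumptions made for this example.

```python
import numpy as np

def fwht(X):
    """Unnormalized fast Walsh-Hadamard transform along axis 0.

    The number of rows of the 2-D array X must be a power of two.
    """
    X = X.copy()
    n = X.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = X[i:i + h].copy()
            b = X[i + h:i + 2 * h].copy()
            X[i:i + h] = a + b
            X[i + h:i + 2 * h] = a - b
        h *= 2
    return X

def srht(X, r, rng):
    """Apply an r-row SRHT to the rows of the 2-D array X (illustrative)."""
    n = X.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)        # random diagonal sign flip
    HX = fwht(signs[:, None] * X) / np.sqrt(n)     # orthonormal Hadamard mix
    rows = rng.choice(n, size=r, replace=False)    # uniform row subsample
    return np.sqrt(n / r) * HX[rows]               # rescale to preserve norms

# Check how well the map preserves a random d-dimensional subspace.
rng = np.random.default_rng(0)
n, d, r = 1024, 4, 256
U, _ = np.linalg.qr(rng.standard_normal((n, d)))   # orthonormal basis
SU = srht(U, r, rng)
sing = np.linalg.svd(SU, compute_uv=False)         # ideally all close to 1
```

If every singular value of `SU` is close to 1, then the norms of all vectors in the subspace are approximately preserved, which is the subspace embedding property the abstract relies on.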
Our main theorem, via connections with spectral error matrix multiplication
shown in prior work, implies quantitative improvements for approximate least
squares regression and low rank approximation. Our main result has also already
been applied to improve dimensionality reduction guarantees for $k$-means
clustering [CEMMP14], and implies new results for nonparametric regression
[YPW15].
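The stable rank defined above (squared ratio of Frobenius and operator norms) is easy to compute, and a small experiment shows what a spectral-norm error for a sketched matrix product looks like. This is a hedged sketch: the dense Gaussian sketch, the sketch size `m_rows`, and the test matrices are assumptions for illustration, and `m_rows` is not the row bound from the abstract.

```python
import numpy as np

def stable_rank(A):
    """Squared Frobenius norm divided by squared operator (spectral) norm."""
    return np.linalg.norm(A, "fro") ** 2 / np.linalg.norm(A, 2) ** 2

rng = np.random.default_rng(0)
n, d = 2048, 20

# Columns with decaying scales give a stable rank well below the rank d.
scales = 1.0 / np.arange(1, d + 1)
A = rng.standard_normal((n, d)) * scales
B = rng.standard_normal((n, d)) * scales
r_tilde = max(stable_rank(A), stable_rank(B))

# Dense Gaussian sketch with an illustrative (assumed) number of rows.
m_rows = 400
S = rng.standard_normal((m_rows, n)) / np.sqrt(m_rows)

# Spectral-norm error of the sketched product, relative to ||A|| * ||B||.
err = np.linalg.norm((S @ A).T @ (S @ B) - A.T @ B, 2)
rel_err = err / (np.linalg.norm(A, 2) * np.linalg.norm(B, 2))
```

Here both matrices have full rank `d` but a much smaller stable rank, which is the regime where a bound in terms of stable rank is more informative than one in terms of rank.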
We also separately point out that the proof of the "BSS" deterministic
row-sampling result of [BSS12] can be modified to show that for any matrices
of stable rank at most , one can achieve the spectral norm
guarantee for approximate matrix multiplication of by deterministically
sampling rows that can be found in polynomial
time. The original result of [BSS12] was for rank instead of stable rank. Our
observation leads to a stronger version of a main theorem of [KMST10].
Comment: v3: minor edits; v2: fixed one step in proof of Theorem 9 which was
wrong by a constant factor (see the new Lemma 5 and its use; final theorem
unaffected)