Deterministic Fully Dynamic SSSP and More
We present the first non-trivial fully dynamic algorithm maintaining exact
single-source distances in unweighted graphs. This resolves an open problem
stated by Sankowski [COCOON 2005] and van den Brand and Nanongkai [FOCS 2019].
Previous fully dynamic single-source distances data structures were all
approximate, but so far, non-trivial dynamic algorithms for the exact setting
could only be ruled out for polynomially weighted graphs (Abboud and
Vassilevska Williams, [FOCS 2014]). The exact unweighted case remained the main
case for which neither a subquadratic dynamic algorithm nor a quadratic lower
bound was known.
Our dynamic algorithm works on directed graphs, is deterministic, and can
report a single-source shortest paths tree in subquadratic time as well. Thus
we also obtain the first deterministic fully dynamic data structure for
reachability (transitive closure) with subquadratic update and query time. This
answers an open problem of van den Brand, Nanongkai, and Saranurak [FOCS 2019].
Finally, using the same framework we obtain the first fully dynamic data
structure maintaining all-pairs -approximate distances within
non-trivial sub- worst-case update time while supporting optimal-time
approximate shortest path reporting at the same time. This data structure is
also deterministic and therefore implies the first known non-trivial
deterministic worst-case bound for recomputing the transitive closure of a
digraph.
Comment: Extended abstract to appear in FOCS 202
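For intuition, here is a minimal sketch of the trivial quadratic baseline this abstract improves upon: rebuild a BFS from the source after every update. This is only the naive approach the paper beats, not its algorithm, and all names are my own:

```python
from collections import deque

def bfs_dists(adj, s):
    """Exact single-source distances in an unweighted digraph via BFS."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

class NaiveDynamicSSSP:
    """Trivial fully dynamic exact SSSP: rerun BFS on demand.
    Every query costs O(n + m), which can be quadratic in n."""
    def __init__(self, n, source):
        self.adj = {u: set() for u in range(n)}
        self.s = source

    def insert(self, u, v):
        self.adj[u].add(v)

    def delete(self, u, v):
        self.adj[u].discard(v)

    def distance(self, t):
        return bfs_dists(self.adj, self.s).get(t, float("inf"))
```

The paper's contribution is precisely that this per-operation cost can be made subquadratic while staying exact and deterministic.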
Fully Dynamic Shortest Path Reporting Against an Adaptive Adversary
Algebraic data structures are the main subroutine for maintaining distances
in fully dynamic graphs in subquadratic time. However, these dynamic algebraic
algorithms generally cannot maintain the shortest paths, especially against
adaptive adversaries. We present the first fully dynamic algorithm that
maintains the shortest paths against an adaptive adversary in subquadratic
update time. This is obtained via a combinatorial reduction that allows
reconstructing the shortest paths with only a few distance estimates. Using
this reduction, we obtain the following:
On weighted directed graphs with real edge weights in , we can
maintain approximate shortest paths in
update and query time. This improves upon the approximate distance
data structures from [v.d.Brand, Nanongkai, FOCS'19], which only returned a
distance estimate, by matching their complexity and returning an approximate
shortest path.
On unweighted directed graphs, we can maintain exact shortest paths in
update and query time. This
improves upon [Bergamaschi, Henzinger, P.Gutenberg, V.Williams, Wein, SODA'21]
who could report the path only against oblivious adversaries. We improve both
their update and query time while also handling adaptive adversaries.
On unweighted undirected graphs, our reduction holds not just against
adaptive adversaries but is also deterministic. We maintain a
-approximate -shortest path in
time per update, and -approximate single source shortest paths in
time per update. Previous deterministic results by
[v.d.Brand, Nazari, Forster, FOCS'22] could only maintain distance estimates
but no paths.
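The core trick of reconstructing a shortest path from distance estimates alone can be illustrated on an unweighted undirected graph: from the current vertex, any neighbor whose distance to t is exactly one smaller lies on a shortest path. A hedged sketch, where an exact BFS stands in for the paper's distance estimates and all names are my own:

```python
from collections import deque

def bfs(adj, src):
    """Exact hop distances from src in an undirected graph."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def path_from_distance_oracle(adj, s, t):
    """Walk from s toward t using only distance values: a neighbor v of u
    with dist(v, t) == dist(u, t) - 1 always lies on a shortest s-t path."""
    dist_to_t = bfs(adj, t)  # stands in for the oracle's distance estimates
    if s not in dist_to_t:
        return None
    path, u = [s], s
    while u != t:
        u = next(v for v in adj[u]
                 if dist_to_t.get(v, float("inf")) == dist_to_t[u] - 1)
        path.append(u)
    return path
```

The reduction in the abstract is subtler (it must survive adaptive adversaries with only a few estimate queries), but the next-hop invariant above is the underlying principle.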
Fast Deterministic Fully Dynamic Distance Approximation
In this paper, we develop deterministic fully dynamic algorithms for
computing approximate distances in a graph with worst-case update time
guarantees. In particular, we obtain improved dynamic algorithms that, given an
unweighted and undirected graph undergoing edge insertions and
deletions, and a parameter , maintain
-approximations of the -distance between a given pair of
nodes and , the distances from a single source to all nodes
("SSSP"), the distances from multiple sources to all nodes ("MSSP"), or the
distances between all nodes ("APSP").
Our main result is a deterministic algorithm for maintaining
-approximate -distance with worst-case update time
(for the current best known bound on the matrix multiplication
exponent ). This even improves upon the fastest known randomized
algorithm for this problem. Similar to several other well-studied dynamic
problems whose state-of-the-art worst-case update time is , this
matches a conditional lower bound [BNS, FOCS 2019]. We further give a
deterministic algorithm for maintaining -approximate
single-source distances with worst-case update time , which also
matches a conditional lower bound.
At the core, our approach is to combine algebraic distance maintenance data
structures with near-additive emulator constructions. This also leads to novel
dynamic algorithms for maintaining -emulators that improve
upon the state of the art, which might be of independent interest. Our
techniques also lead to improved randomized algorithms for several problems
such as exact -distances and diameter approximation.
Comment: Changes to the previous version: improved bounds for approximate
st-distances using a new algebraic data structure
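As a toy stand-in for the near-additive emulators used here, the classic greedy multiplicative spanner shows the general pattern: keep an edge only if the sparse graph built so far cannot already approximate it, then answer distance queries on the sparse graph. This is the textbook static construction, not the paper's dynamic one, and all names are my own:

```python
from collections import deque

def bounded_bfs_dist(adj, s, t, cutoff):
    """Hop distance from s to t, giving up beyond `cutoff` hops."""
    if s == t:
        return 0
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        if dist[u] >= cutoff:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == t:
                    return dist[v]
                q.append(v)
    return float("inf")

def greedy_spanner(n, edges, t):
    """Greedy t-spanner of an unweighted graph: an edge is kept only if
    the sparse graph so far cannot t-approximate its endpoints' distance."""
    adj = {u: set() for u in range(n)}
    kept = []
    for u, v in edges:
        if bounded_bfs_dist(adj, u, v, t) > t:
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept
```

A (1+eps, beta) near-additive emulator plays the analogous role in the paper, but combined with algebraic maintenance of short-range distances rather than built statically.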
Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models
Large language models (LLMs) have made fundamental changes in human life. The
attention scheme is one of the key components over all the LLMs, such as BERT,
GPT-1, Transformers, GPT-2, 3, 3.5 and 4. Inspired by previous theoretical
study of the static version of the attention multiplication problem [Zandieh,
Han, Daliri, and Karbasi, arXiv 2023; Alman and Song, arXiv 2023], in this
work we formally define a dynamic version of the attention matrix
multiplication problem.
There are matrices ; they represent the query,
key and value in LLMs. In each iteration we update one entry in or . In
the query stage, we receive as input and want to
answer , where is a square matrix and is a diagonal matrix. Here denotes a
length- vector whose entries are all ones.
We provide two results: an algorithm and a conditional lower bound.
On one hand, inspired by the lazy update idea from [Demetrescu and
Italiano FOCS 2000, Sankowski FOCS 2004, Cohen, Lee and Song STOC 2019, Brand
SODA 2020], we provide a data structure that uses
amortized update time and
worst-case query time.
On the other hand, we show that unless the hinted matrix-vector
multiplication conjecture [Brand, Nanongkai and Saranurak FOCS 2019] is false,
no algorithm can achieve both the amortized update time and the worst-case
query time above.
In conclusion, our algorithmic result is conditionally optimal unless the
hinted matrix-vector multiplication conjecture is false.
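A hedged sketch of the lazy-update idea in its simplest form: buffer entry updates to K or V and recompute the normalized attention row only when queried. The actual data structure amortizes this far more cleverly; all names here are my own, and the softmax row below corresponds to the D^{-1} exp(QK^T) V expression restricted to a single query row:

```python
import numpy as np

class LazyAttention:
    """Buffer entry updates; pay the recomputation cost only at query time."""
    def __init__(self, Q, K, V):
        self.Q, self.K, self.V = (np.array(M, dtype=float) for M in (Q, K, V))
        self.pending = []

    def update(self, which, i, j, value):
        """Record a single-entry update to 'K' or 'V' without applying it."""
        self.pending.append((which, i, j, value))

    def query(self, i):
        """Return the attention output row for query index i."""
        mats = {"K": self.K, "V": self.V}
        for which, r, c, value in self.pending:  # apply buffered updates
            mats[which][r, c] = value
        self.pending.clear()
        scores = np.exp(self.Q[i] @ self.K.T)    # exp(QK^T), row i
        return (scores / scores.sum()) @ self.V  # normalize, multiply by V
```

The interesting regime in the paper is when updates and queries interleave, so that neither eager recomputation nor full laziness is optimal; the lazy-update framework balances the two.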
Dynamic Maxflow via Dynamic Interior Point Methods
In this paper we provide an algorithm for maintaining a
-approximate maximum flow in a dynamic, capacitated graph
undergoing edge additions. Over a sequence of -additions to an -node
graph where every edge has capacity our algorithm runs in
time . To obtain this result we
design dynamic data structures for the more general problem of detecting when
the value of the minimum cost circulation in a dynamic graph undergoing edge
additions obtains value at most (exactly) for a given threshold . Over a
sequence of -additions to an -node graph where every edge has capacity
and cost we solve this thresholded
minimum cost flow problem in . Both of our algorithms
succeed with high probability against an adaptive adversary. We obtain these
results by dynamizing the recent interior point method used to obtain an almost
linear time algorithm for minimum cost flow (Chen, Kyng, Liu, Peng, Probst
Gutenberg, Sachdeva 2022), and introducing a new dynamic data structure for
maintaining minimum ratio cycles in an undirected graph that succeeds with high
probability against adaptive adversaries.
Comment: 30 pages
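For contrast with the dynamic interior-point approach, the naive baseline for the thresholded problem simply recomputes a max flow (here via Edmonds-Karp) after each edge addition and compares it against the threshold F. This is only the obvious baseline, not the paper's method; all names are my own:

```python
from collections import deque

def max_flow(n, cap, s, t):
    """Edmonds-Karp on a capacity matrix (mutated into the residual graph)."""
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:           # BFS for an augmenting path
            u = q.popleft()
            for v in range(n):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v, bottleneck = t, float("inf")        # find the bottleneck capacity
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t                                  # push flow along the path
        while parent[v] is not None:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

class ThresholdFlow:
    """Naive dynamic baseline: recompute max flow after each edge addition
    and report whether it has reached the threshold F."""
    def __init__(self, n, s, t, F):
        self.n, self.s, self.t, self.F = n, s, t, F
        self.edges = []

    def add_edge(self, u, v, c):
        self.edges.append((u, v, c))
        cap = [[0] * self.n for _ in range(self.n)]
        for a, b, w in self.edges:
            cap[a][b] += w
        return max_flow(self.n, cap, self.s, self.t) >= self.F
```

The paper replaces the per-addition recomputation with a dynamized interior-point method plus a dynamic minimum-ratio-cycle data structure, avoiding the from-scratch cost entirely.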
On Dynamic Graph Algorithms with Predictions
We study dynamic algorithms in the model of algorithms with predictions. We
assume the algorithm is given imperfect predictions regarding future updates,
and we ask how such predictions can be used to improve the running time. This
can be seen as a model interpolating between classic online and offline dynamic
algorithms. Our results give smooth tradeoffs between these two extreme
settings.
First, we give algorithms for incremental and decremental transitive closure
and approximate APSP that take as an additional input a predicted sequence of
updates (edge insertions, or edge deletions, respectively). They preprocess it
in time, and then handle updates in
worst-case time and queries in worst-case
time. Here is an error measure that can be bounded by the maximum
difference between the predicted and actual insertion (deletion) time of an
edge, i.e., by the -error of the predictions.
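The error measure described here (bounded by the maximum difference between predicted and actual update times of any edge) is straightforward to compute from the two update sequences. A small sketch with names of my own; it assumes every actual update also appears in the prediction:

```python
def prediction_error(predicted, actual):
    """Max displacement of any element between the predicted and actual
    update sequences, i.e. the l_inf error of the predictions."""
    pred_time = {e: i for i, e in enumerate(predicted)}
    act_time = {e: i for i, e in enumerate(actual)}
    return max(abs(pred_time[e] - act_time[e]) for e in act_time)
```

With a perfect prediction the error is 0 and the claimed update time degenerates to the offline bound; the worse the prediction, the closer the running time drifts toward the online case.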
The second group of results concerns fully dynamic problems with vertex
updates, where the algorithm has access to a predicted sequence of the next
updates. We show how to solve fully dynamic triangle detection, maximum
matching, single-source reachability, and more, in
worst-case update time. Here denotes how much earlier the -th
update occurs than predicted.
Our last result is a reduction that transforms a worst-case incremental
algorithm without predictions into a fully dynamic algorithm which is given a
predicted deletion time for each element at the time of its insertion. As a
consequence we can, e.g., maintain fully dynamic exact APSP with such
predictions in worst-case vertex insertion time and
worst-case vertex deletion time (for the prediction
error defined as above).
Comment: To appear in proceedings of SODA 2024. Abstract shortened to meet
arXiv requirements
Training (Overparametrized) Neural Networks in Near-Linear Time
The slow convergence rate and pathological curvature issues of first-order
gradient methods for training deep neural networks initiated an ongoing effort
for developing faster second-order optimization algorithms beyond SGD, without
compromising the generalization error. Despite their remarkable convergence
rate (independent of the training batch size), second-order algorithms incur a
daunting slowdown in the cost per iteration (inverting the Hessian matrix of
the loss function), which renders them impractical. Very recently, this
computational overhead was mitigated by the works of [ZMG19, CGH+19],
yielding an -time second-order algorithm for training two-layer
overparametrized neural networks of polynomial width .
We show how to speed up the algorithm of [CGH+19], achieving an
-time backpropagation algorithm for training (mildly
overparametrized) ReLU networks, which is near-linear in the dimension
of the full gradient (Jacobian) matrix. The centerpiece of our algorithm is to
reformulate the Gauss-Newton iteration as an -regression problem, and
then use a Fast-JL type dimension reduction to precondition the
underlying Gram matrix in time independent of , allowing us to find a
sufficiently good approximate solution via first-order conjugate gradient. Our
result provides a proof-of-concept that advanced machinery from randomized
linear algebra -- which led to recent breakthroughs in convex optimization
(ERM, LPs, regression) -- can be carried over to the realm of deep learning as
well.
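The sketch-and-solve pattern behind this result can be illustrated on ordinary least squares: compress the tall regression problem with a random projection and solve the small one. For simplicity the example uses a dense Gaussian sketch rather than a Fast-JL transform, and it is only a generic illustration, not the paper's preconditioning pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def sketched_lstsq(A, b, sketch_rows):
    """Sketch-and-solve regression: replace min ||Ax - b|| by the much
    smaller problem min ||S A x - S b|| for a random sketch S."""
    n = A.shape[0]
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```

With enough sketch rows relative to the number of columns, the compressed solution approximates the true least-squares solution with high probability, while the dominant cost shifts from the tall dimension n to the sketch size.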