44 research outputs found
Reliable Hubs for Partially-Dynamic All-Pairs Shortest Paths in Directed Graphs
We give new partially-dynamic algorithms for the all-pairs shortest paths problem in weighted directed graphs. Most importantly, we give a new deterministic incremental algorithm for the problem that handles updates in O~(mn^(4/3) log{W}/epsilon) total time (where the edge weights are from [1,W]) and explicitly maintains a (1+epsilon)-approximate distance matrix. For a fixed epsilon>0, this is the first deterministic partially dynamic algorithm for all-pairs shortest paths in directed graphs, whose update time is o(n^2) regardless of the number of edges. Furthermore, we also show how to improve the state-of-the-art partially dynamic randomized algorithms for all-pairs shortest paths [Baswana et al. STOC\u2702, Bernstein STOC\u2713] from Monte Carlo randomized to Las Vegas randomized without increasing the running time bounds (with respect to the O~(*) notation).
Our results are obtained by giving new algorithms for the problem of dynamically maintaining hubs, that is a set of O~(n/d) vertices which hit a shortest path between each pair of vertices, provided it has hop-length Omega(d). We give new subquadratic deterministic and Las Vegas algorithms for maintenance of hubs under either edge insertions or deletions
Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array
The longest common prefix (LCP) array is a versatile auxiliary data structure
in indexed string matching. It can be used to speed up searching using the
suffix array (SA) and provides an implicit representation of the topology of an
underlying suffix tree. The LCP array of a string of length can be
represented as an array of length words, or, in the presence of the SA, as
a bit vector of bits plus asymptotically negligible support data
structures. External memory construction algorithms for the LCP array have been
proposed, but those proposed so far have a space requirement of words
(i.e. bits) in external memory. This space requirement is in some
practical cases prohibitively expensive. We present an external memory
algorithm for constructing the bit version of the LCP array which uses
bits of additional space in external memory when given a
(compressed) BWT with alphabet size and a sampled inverse suffix array
at sampling rate . This is often a significant space gain in
practice where is usually much smaller than or even constant. We
also consider the case of computing succinct LCP arrays for circular strings
Geometry Helps to Compare Persistence Diagrams
Exploiting geometric structure to improve the asymptotic complexity of
discrete assignment problems is a well-studied subject. In contrast, the
practical advantages of using geometry for such problems have not been
explored. We implement geometric variants of the Hopcroft--Karp algorithm for
bottleneck matching (based on previous work by Efrat el al.) and of the auction
algorithm by Bertsekas for Wasserstein distance computation. Both
implementations use k-d trees to replace a linear scan with a geometric
proximity query. Our interest in this problem stems from the desire to compute
distances between persistence diagrams, a problem that comes up frequently in
topological data analysis. We show that our geometric matching algorithms lead
to a substantial performance gain, both in running time and in memory
consumption, over their purely combinatorial counterparts. Moreover, our
implementation significantly outperforms the only other implementation
available for comparing persistence diagrams.Comment: 20 pages, 10 figures; extended version of paper published in ALENEX
201
Fully Dynamic k -Center Clustering
International audienceStatic and dynamic clustering algorithms are a fundamental tool in any machine learning library. Most of the efforts in developing dynamic machine learning and data mining algorithms have been focusing on the sliding window model (where at any given point in time only the most recent data items are retained) or more simplistic models. However, in many real-world applications one might need to deal with arbitrary deletions and insertions. For example, one might need to remove data items that are not necessarily the oldest ones, because they have been flagged as containing inappropriate content or due to privacy concerns. Clustering trajectory data might also require to deal with more general update operations. We develop a (2 +)-approximation algorithm for the k-center clustering problem with "small" amortized cost under the fully dynamic adversarial model. In such a model, points can be added or removed arbitrarily, provided that the adversary does not have access to the random choices of our algorithm. The amortized cost of our algorithm is poly-logarithmic when the ratio between the maximum and minimum distance between any two points in input is bounded by a polynomial, while k and are constant. Our theoretical results are complemented with an extensive experimental evaluation on dynamic data from Twitter, Flickr, as well as trajectory data, demonstrating the effectiveness of our approach
Optimal Dynamic Distributed MIS
Finding a maximal independent set (MIS) in a graph is a cornerstone task in
distributed computing. The local nature of an MIS allows for fast solutions in
a static distributed setting, which are logarithmic in the number of nodes or
in their degrees. The result trivially applies for the dynamic distributed
model, in which edges or nodes may be inserted or deleted. In this paper, we
take a different approach which exploits locality to the extreme, and show how
to update an MIS in a dynamic distributed setting, either \emph{synchronous} or
\emph{asynchronous}, with only \emph{a single adjustment} and in a single
round, in expectation. These strong guarantees hold for the \emph{complete
fully dynamic} setting: Insertions and deletions, of edges as well as nodes,
gracefully and abruptly. This strongly separates the static and dynamic
distributed models, as super-constant lower bounds exist for computing an MIS
in the former.
Our results are obtained by a novel analysis of the surprisingly simple
solution of carefully simulating the greedy \emph{sequential} MIS algorithm
with a random ordering of the nodes. As such, our algorithm has a direct
application as a -approximation algorithm for correlation clustering. This
adds to the important toolbox of distributed graph decompositions, which are
widely used as crucial building blocks in distributed computing.
Finally, our algorithm enjoys a useful \emph{history-independence} property,
meaning the output is independent of the history of topology changes that
constructed that graph. This means the output cannot be chosen, or even biased,
by the adversary in case its goal is to prevent us from optimizing some
objective function.Comment: 19 pages including appendix and reference