Efficient Computation of Distance Sketches in Distributed Networks
Distance computation is one of the most fundamental primitives used in
communication networks. The cost of effectively and accurately computing
pairwise network distances can become prohibitive in large-scale networks such
as the Internet and Peer-to-Peer (P2P) networks. To meet the rising need
for very efficient distance computation, approximation techniques for numerous
variants of this question have recently received significant attention in the
literature. The goal is to preprocess the graph and store a small amount of
information such that whenever a query for any pairwise distance is issued, the
distance can be well approximated (i.e., with small stretch) very quickly in an
online fashion. Specifically, the pre-processing (usually) involves storing a
small sketch with each node, such that at query time only the sketches of the
concerned nodes need to be looked up to compute the approximate distance. In
this paper, we present the first theoretical study of distance sketches derived
from distance oracles in a distributed network. We first present a fast
distributed algorithm for computing approximate distance sketches, based on a
distributed implementation of the distance oracle scheme of [Thorup-Zwick, JACM
2005]. We also show how to modify this basic construction to achieve different
tradeoffs between the number of pairs for which the distance estimate is
accurate and other parameters. These tradeoffs can then be combined to give an
efficient construction of small sketches with provable average-case as well as
worst-case performance. Our algorithms use only small-sized messages and hence
are suitable for bandwidth-constrained networks, and can be used in various
networking applications such as topology discovery and construction, token
management, load balancing, monitoring overlays, and several other problems in
distributed algorithms. Comment: 18 page
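As a rough illustration of the sketch-and-query idea in the abstract above, here is a minimal centralized sketch of the two-level (stretch-3) case of the Thorup-Zwick construction. It is not the distributed algorithm of the paper. The function names, the dictionary-based graph format, and passing a fixed `landmarks` set (instead of random sampling) are assumptions made for illustration; the graph is assumed to be connected, undirected, and non-negatively weighted.

```python
import heapq

def dijkstra(graph, source):
    """Exact shortest-path distances from source; graph[u] maps neighbor -> weight."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def build_sketches(graph, landmarks):
    """Per-node sketch: nearest landmark (pivot) plus a 'bunch' of stored distances.

    The bunch holds every landmark and every node strictly closer than the pivot.
    Assumes a connected graph and a non-empty landmark set (hypothetical inputs).
    """
    sketches = {}
    for v in graph:
        dist = dijkstra(graph, v)
        pivot = min(landmarks, key=lambda a: dist[a])
        radius = dist[pivot]
        bunch = {w: d for w, d in dist.items() if d < radius or w in landmarks}
        sketches[v] = {"pivot": pivot, "bunch": bunch}
    return sketches

def estimate(sketches, u, v):
    """Approximate d(u, v) by looking only at the sketches of u and v."""
    su, sv = sketches[u], sketches[v]
    if v in su["bunch"]:
        return su["bunch"][v]  # exact distance is stored
    p = su["pivot"]  # every landmark is in every bunch
    return su["bunch"][p] + sv["bunch"][p]
```

When `v` is not in the bunch of `u`, the pivot of `u` is at most as far from `u` as `v` is, so by the triangle inequality the estimate is at least the true distance and at most three times it. In the actual scheme the landmarks are sampled at random so that bunches stay small in expectation.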
Brief Announcement: Massively Parallel Approximate Distance Sketches
Data structures that allow efficient distance estimation have been extensively studied both in centralized models and classical distributed models. We initiate their study in newer (and arguably more realistic) models of distributed computation: the Congested Clique model and the Massively Parallel Computation (MPC) model. In MPC we give two main results: an algorithm that constructs stretch/space optimal distance sketches but takes a (small) polynomial number of rounds, and an algorithm that constructs distance sketches with worse stretch but that only takes polylogarithmic rounds. Along the way, we show that other useful combinatorial structures can also be computed in MPC. In particular, one key component we use is an MPC construction of the hopsets of Elkin and Neiman (2016). This result has additional applications such as the first polylogarithmic time algorithm for constant approximate single-source shortest paths for weighted graphs in the low memory MPC setting.
Fully decentralized computation of aggregates over data streams
In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all data at a few central nodes of the network is infeasible in networks of low-capability devices or in the presence of massive data sets. The main difficulty in designing diffusive algorithms is to cope with duplicate detections. These arise both from the observation of the same event at several nodes of the network and from the receipt of the same aggregated information along multiple paths of diffusion. In this paper, we consider fully decentralized algorithms that locally answer continuous aggregate queries on the number of distinct events, the total number of events, and the second frequency moment in the scenario outlined above. The proposed algorithms use sublinear space at every node, either in the worst case or on realistic distributions. We also propose strategies that minimize the communication needed to update the aggregates when new events are observed. We experimentally evaluate the efficiency and accuracy of our algorithms on realistic simulated scenarios.
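To make the duplicate problem in the abstract above concrete, the following is a minimal sketch of a duplicate-insensitive distinct-count summary, using the well-known k-minimum-values technique as a stand-in. It is not claimed to be the algorithm of the paper. The names `make_sketch`, `merge`, and `estimate_distinct`, the sketch size `K = 64`, and the use of SHA-256 as the hash are all assumptions.

```python
import hashlib

K = 64  # sketch size; an assumed value, trades space for accuracy

def _hash01(item):
    """Map an item to a pseudo-random point in [0, 1)."""
    digest = hashlib.sha256(str(item).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def make_sketch(items, k=K):
    """Keep only the k smallest distinct hash values of the observed items."""
    return sorted({_hash01(x) for x in items})[:k]

def merge(a, b, k=K):
    """Combine two sketches; repeated events or repeated merges change nothing."""
    return sorted(set(a) | set(b))[:k]

def estimate_distinct(sketch, k=K):
    """Estimate the number of distinct items summarized by the sketch."""
    if len(sketch) < k:
        return float(len(sketch))  # fewer than k distinct hashes seen: exact
    return (k - 1) / sketch[k - 1]  # standard k-minimum-values estimator
```

Because `merge` is a set union followed by truncation, it is idempotent, commutative, and associative. A node can therefore fold in the same neighbour's summary any number of times, or receive it along several diffusion paths, without inflating the count.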
Massively Parallel Approximate Distance Sketches
Data structures that allow efficient distance estimation (distance oracles, distance sketches, etc.) have been extensively studied, and are particularly well studied in centralized models and classical distributed models such as CONGEST. We initiate their study in newer (and arguably more realistic) models of distributed computation: the Congested Clique model and the Massively Parallel Computation (MPC) model. We provide efficient constructions in both of these models, but our core results are for MPC. In MPC we give two main results: an algorithm that constructs stretch/space optimal distance sketches but takes a (small) polynomial number of rounds, and an algorithm that constructs distance sketches with worse stretch but that only takes polylogarithmic rounds.
Along the way, we show that other useful combinatorial structures can also be computed in MPC. In particular, one key component we use to construct distance sketches is an MPC construction of the hopsets of [Elkin and Neiman, 2016]. This result has additional applications such as the first polylogarithmic time algorithm for constant approximate single-source shortest paths for weighted graphs in the low memory MPC setting.
Fast Routing Table Construction Using Small Messages
We describe a distributed randomized algorithm computing approximate
distances and routes that approximate shortest paths. Let n denote the number
of nodes in the graph, and let HD denote the hop diameter of the graph, i.e.,
the diameter of the graph when all edges are considered to have unit weight.
Given 0 < eps <= 1/2, our algorithm runs in weak-O(n^(1/2 + eps) + HD)
communication rounds using messages of O(log n) bits and guarantees a stretch
of O(eps^(-1) log eps^(-1)) with high probability. This is the first
distributed algorithm approximating weighted shortest paths that uses small
messages and runs in weak-o(n) time (in graphs where HD is in weak-o(n)). The time
complexity nearly matches the lower bounds of weak-Omega(sqrt(n) + HD) in the
small-messages model that hold for stateless routing (where routing decisions
do not depend on the traversed path) as well as approximation of the weighted
diameter. Our scheme replaces the original identifiers of the nodes by labels
of size O(log eps^(-1) log n). We show that no algorithm that keeps the
original identifiers and runs for weak-o(n) rounds can achieve a
polylogarithmic approximation ratio.
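The hop diameter HD used in the bounds above ignores edge weights. As a small centralized illustration of that definition only (not of the distributed algorithm), it can be computed by a breadth-first search from every node. The function name and the dictionary-based graph format are assumptions, and the graph is assumed to be connected.

```python
from collections import deque

def hop_diameter(adj):
    """Largest hop distance between any two nodes.

    adj[u] maps each neighbor of u to an edge weight; the weights are ignored,
    which is the same as treating every edge as having unit weight.
    """
    best = 0
    for source in adj:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        best = max(best, max(hops.values()))
    return best
```

For example, in a triangle whose edges have weights 1, 1, and 10, the hop diameter is 1 even though the weighted shortest path between the endpoints of the heavy edge has length 2 and uses two hops.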
Variations of our techniques yield a number of fast distributed approximation
algorithms solving related problems using small messages. Specifically, we
present algorithms that run in weak-O(n^(1/2 + eps) + HD) rounds for a given 0
< eps <= 1/2, and solve, with high probability, the following problems:
- O(eps^(-1))-approximation for the Generalized Steiner Forest (the running
time in this case has an additive weak-O(t^(1 + 2eps)) term, where t is the
number of terminals);
- O(eps^(-2))-approximation of weighted distances, using node labels of size
O(eps^(-1) log n) and weak-O(n^(eps)) bits of memory per node;
- O(eps^(-1))-approximation of the weighted diameter;
- O(eps^(-3))-approximate shortest paths using the labels 1,...,n. Comment: 40 pages, 2 figures, extended abstract submitted to STOC'1
- …