Derandomizing Local Distributed Algorithms under Bandwidth Restrictions
This paper addresses the cornerstone family of local problems in distributed computing, and investigates the curious gap between randomized and deterministic solutions under bandwidth restrictions.
Our main contribution is in providing tools for derandomizing solutions to local problems, when the n nodes can only send O(log n)-bit messages in each round of communication. We combine bounded independence, which we show to be sufficient for some algorithms, with the method of conditional expectations and with additional machinery, to obtain the following results.
First, we show that in the Congested Clique model, which allows all-to-all communication, there is a deterministic maximal independent set (MIS) algorithm that runs in O(log^2 Delta) rounds, where Delta is the maximum degree. When Delta=O(n^(1/3)), the bound improves to O(log Delta).
Adapting the above to the CONGEST model gives an O(D log^2 n)-round deterministic MIS algorithm, where D is the diameter of the graph. Apart from a previous unproven claim of an O(D log^3 n)-round algorithm, the only known deterministic solutions for the CONGEST model are a coloring-based O(Delta + log^* n)-round algorithm, where Delta is the maximum degree of the graph, and a 2^O(sqrt(log n log log n))-round algorithm, which is super-polylogarithmic in n.
In addition, we deterministically construct a (2k-1)-spanner with O(kn^(1+1/k) log n) edges in O(k log n) rounds in the Congested Clique model. For comparison, in the more stringent CONGEST model, where the communication graph is identical to the input graph, the best deterministic algorithm for constructing a (2k-1)-spanner with O(kn^(1+1/k)) edges runs in O(n^(1-1/k)) rounds.
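The method of conditional expectations mentioned in this abstract can be illustrated on a much simpler textbook problem. The sketch below is not the paper's MIS algorithm: it derandomizes the classic "place each vertex on a random side" argument for max-cut, and the function names are hypothetical. It assumes a simple graph (no self-loops) on vertices 0..n-1.

```python
def derandomized_max_cut(n, edges):
    """Fix vertex sides one at a time so the conditional expected cut never drops.

    A uniformly random assignment cuts each edge with probability 1/2, so the
    expected cut is len(edges) / 2. Edges to still-unassigned neighbors stay at
    probability 1/2 whatever side v takes, so only assigned neighbors matter.
    """
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    side = {}
    for v in range(n):
        placed = [side[u] for u in adj[v] if u in side]
        zeros = placed.count(0)
        ones = len(placed) - zeros
        # Side 1 cuts the edges to neighbors on side 0, and vice versa;
        # take whichever choice cuts at least half of the decided edges.
        side[v] = 1 if zeros >= ones else 0
    return side


def cut_size(edges, side):
    """Number of edges whose endpoints lie on different sides."""
    return sum(1 for u, v in edges if side[u] != side[v])
```

Because every greedy step keeps the conditional expectation at or above its previous value, the final deterministic cut contains at least half of the edges, matching the randomized guarantee.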
Rounds vs Communication Tradeoffs for Maximal Independent Sets
We consider the problem of finding a maximal independent set (MIS) in the
shared blackboard communication model with vertex-partitioned inputs. There are
n players corresponding to vertices of an undirected graph, and each player
sees the edges incident on its vertex -- this way, each edge is known by both
its endpoints and is thus shared by two players. The players communicate in
simultaneous rounds by posting their messages on a shared blackboard visible to
all players, with the goal of computing an MIS of the graph. While the MIS
problem is well studied in other distributed models, and while shared
blackboard is, perhaps, the simplest broadcast model, lower bounds for our
problem were only known against one-round protocols.
We present a lower bound on the round-communication tradeoff for computing an
MIS in this model. Specifically, we show that when r rounds of interaction
are allowed, at least one player needs to communicate Omega(n^(1/20^(r+1)))
bits. In particular, with logarithmic bandwidth, finding an MIS requires
Omega(log log n) rounds. This lower bound can be compared with the
algorithm of Ghaffari, Gouleakis, Konrad, Mitrović, and Rubinfeld [PODC 2018]
that solves MIS in O(log log n) rounds but with a logarithmic bandwidth for
an average player. Additionally, our lower bound further extends to the closely
related problem of maximal bipartite matching.
To prove our results, we devise a new round elimination framework, which we
call partial-input embedding, that may also be useful in future work for
proving round-sensitive lower bounds in the presence of edge-sharing between
players.
Finally, we discuss several implications of our results to multi-round
(adaptive) distributed sketching algorithms, broadcast congested clique, and to
the welfare maximization problem in two-sided matching markets.
Comment: Full version of the paper in FOCS 2022, 44 pages
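To make the object of study concrete, here is a minimal sequential sketch of what a maximal independent set is and how to check one. It is not a blackboard protocol and says nothing about rounds or communication; the function names are hypothetical. The input `adj` maps each vertex to the set of its neighbors, which mirrors the vertex-partitioned setting where each player sees the edges incident on its own vertex. Self-loops are assumed absent.

```python
def greedy_mis(adj):
    """Scan vertices in sorted order; keep a vertex if no neighbor was kept."""
    chosen = set()
    for v in sorted(adj):
        if not (adj[v] & chosen):
            chosen.add(v)
    return chosen


def is_mis(adj, s):
    """True if s is independent and no further vertex can be added to it."""
    # Independent: no chosen vertex has a chosen neighbor.
    independent = all(not (adj[v] & s) for v in s)
    # Maximal: every vertex outside s has at least one neighbor in s.
    maximal = all(v in s or bool(adj[v] & s) for v in adj)
    return independent and maximal
```

On the path 0-1-2-3, `greedy_mis` returns `{0, 2}`, and `is_mis` rejects both `{0}` (not maximal) and `{0, 1}` (not independent).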
Massively Parallel Algorithms for Distance Approximation and Spanners
Over the past decade, there has been increasing interest in
distributed/parallel algorithms for processing large-scale graphs. By now, we
have quite fast algorithms -- usually sublogarithmic-time and often
poly(log log n)-time, or even faster -- for a number of fundamental graph
problems in the massively parallel computation (MPC) model. This model is a
widely-adopted theoretical abstraction of MapReduce style settings, where a
number of machines communicate in an all-to-all manner to process large-scale
data. Contributing to this line of work on MPC graph algorithms, we present
poly(log k)-round MPC algorithms for computing
O(k^(1+o(1)))-spanners in the strongly sublinear regime of local memory. To
the best of our knowledge, these are the first sublogarithmic-time MPC
algorithms for spanner construction. As primary applications of our spanners,
we get two important implications, as follows:
- For the MPC setting, we get an O(log^2 log n)-round algorithm for
O(log^(1+o(1)) n) approximation of all pairs shortest paths (APSP) in the
near-linear regime of local memory. To the best of our knowledge, this is the
first sublogarithmic-time MPC algorithm for distance approximations.
- Our result above also extends to the Congested Clique model of distributed
computing, with the same round complexity and approximation guarantee. This
gives the first sublogarithmic algorithm for approximating APSP in weighted
graphs in the Congested Clique model.
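For readers unfamiliar with spanners, the sketch below shows the classic sequential greedy construction for unweighted graphs: an edge is kept only if its endpoints are currently farther apart than the allowed stretch. This is not the MPC algorithm of the abstract, only an illustration of the object it computes; the function names and the `stretch` parameter are hypothetical. With stretch 2k-1 the greedy construction is known to keep O(n^(1+1/k)) edges.

```python
from collections import deque


def bfs_distance(adj, src, dst, limit):
    """Hop distance from src to dst if it is at most limit, else None."""
    if src == dst:
        return 0
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == limit:
            continue  # do not explore beyond the allowed radius
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                if w == dst:
                    return dist[w]
                queue.append(w)
    return None


def greedy_spanner(n, edges, stretch):
    """Keep edge (u, v) only if the spanner built so far has no u-v path
    of length at most stretch. Vertices are assumed to be 0..n-1."""
    adj = {v: set() for v in range(n)}
    kept = []
    for u, v in edges:
        if bfs_distance(adj, u, v, stretch) is None:
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept
```

Every discarded edge already has a detour of length at most `stretch` in the kept subgraph, so all pairwise distances grow by at most that factor.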
Streaming and Massively Parallel Algorithms for Edge Coloring
A valid edge-coloring of a graph is an assignment of "colors" to its edges such that no two incident edges receive the same color. The goal is to find a proper coloring that uses few colors. (Note that the maximum degree, Delta, is a trivial lower bound.) In this paper, we revisit this fundamental problem in two models of computation specific to massive graphs, the Massively Parallel Computation (MPC) model and the Graph Streaming model:
- Massively Parallel Computation: We give a randomized MPC algorithm that with high probability returns a Delta+O~(Delta^(3/4)) edge coloring in O(1) rounds using O(n) space per machine and O(m) total space. The space per machine can also be further improved to n^(1-Omega(1)) if Delta = n^Omega(1). Our algorithm improves upon a previous result of Harvey et al. [SPAA 2018].
- Graph Streaming: Since the output of edge-coloring is as large as its input, we consider a standard variant of the streaming model where the output is also reported in a streaming fashion. The main challenge is that the algorithm cannot "remember" all the reported edge colors, yet has to output a proper edge coloring using few colors.
We give a one-pass O~(n)-space streaming algorithm that always returns a valid coloring and uses 5.44 Delta colors with high probability if the edges arrive in a random order. For adversarial order streams, we give another one-pass O~(n)-space algorithm that requires O(Delta^2) colors.
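As a point of reference for the streaming setting described above, the following sketch is the naive greedy edge coloring, not either of the paper's algorithms. It processes edges in arrival order and uses at most 2 Delta - 1 colors, but it remembers every color used at every vertex, which can take space proportional to the number of edges. That is exactly the memory the O~(n)-space streaming model does not allow. Function names are hypothetical, and the input is assumed to be a simple graph with each edge listed once.

```python
def greedy_edge_coloring(edges):
    """Give each arriving edge the smallest color unused at both endpoints."""
    used = {}      # vertex -> set of colors already on its incident edges
    coloring = {}  # edge (u, v) -> color index starting at 0
    for u, v in edges:
        taken = used.setdefault(u, set()) | used.setdefault(v, set())
        color = 0
        while color in taken:
            color += 1
        coloring[(u, v)] = color
        used[u].add(color)
        used[v].add(color)
    return coloring


def is_proper(coloring):
    """True if no two edges sharing an endpoint have the same color."""
    seen = set()  # (vertex, color) pairs observed so far
    for (u, v), color in coloring.items():
        if (u, color) in seen or (v, color) in seen:
            return False
        seen.add((u, color))
        seen.add((v, color))
    return True
```

The 2 Delta - 1 bound follows because an arriving edge touches at most 2 (Delta - 1) previously colored edges, so some color among the first 2 Delta - 1 is always free.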
Large-Scale Distributed Algorithms for Facility Location with Outliers
This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are:
- Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
- Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both facility location problems with outliers. As shown in our previous work (Berns et al., ICALP 2012; Bandyapadhyay et al., ICDCN 2018), Mettu-Plaxton-style algorithms are more easily amenable to efficient implementation in distributed and large-scale models of computation.
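The two objectives named in this abstract can be made concrete with a small cost evaluator. The sketch below only scores a given set of open facilities under the standard definitions of the two problems; it is not an approximation algorithm and not the Mettu-Plaxton-style method. All names are hypothetical: `dist[i][j]` is assumed to be the distance from facility i to client j, and `opened` is assumed to be non-empty.

```python
def nearest_open(dist, opened, j):
    """Distance from client j to its closest open facility."""
    return min(dist[i][j] for i in opened)


def penalties_cost(open_cost, dist, penalty, opened):
    """Facility Location with Penalties: each client is either connected to
    its nearest open facility or left out at the price of its penalty."""
    opening = sum(open_cost[i] for i in opened)
    service = sum(
        min(nearest_open(dist, opened, j), penalty[j])
        for j in range(len(penalty))
    )
    return opening + service


def robust_cost(open_cost, dist, opened, num_outliers):
    """Robust Facility Location: up to num_outliers clients may be discarded
    for free, so the most expensive connections are the ones dropped."""
    opening = sum(open_cost[i] for i in opened)
    num_clients = len(dist[0])
    conn = sorted(nearest_open(dist, opened, j) for j in range(num_clients))
    served = conn[:max(0, num_clients - num_outliers)]
    return opening + sum(served)
```

Once the open set is fixed, choosing the outliers is easy in both variants, which is why the evaluator needs no search: the difficulty of the problems lies in choosing which facilities to open.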
- …