2,616 research outputs found
Exponentially Faster Massively Parallel Maximal Matching
The study of approximate matching in the Massively Parallel Computations
(MPC) model has recently seen a burst of breakthroughs. Despite this progress,
however, we still have a far more limited understanding of maximal matching
which is one of the central problems of parallel and distributed computing. All
known MPC algorithms for maximal matching either take polylogarithmic time
which is considered inefficient, or require a strictly super-linear space of
per machine.
In this work, we close this gap by providing a novel analysis of an extremely
simple algorithm a variant of which was conjectured to work by Czumaj et al.
[STOC'18]. The algorithm edge-samples the graph, randomly partitions the
vertices, and finds a random greedy maximal matching within each partition. We
show that this algorithm drastically reduces the vertex degrees. This, among
some other results, leads to an round algorithm for
maximal matching with space (or even mildly sublinear in using
standard techniques).
As an immediate corollary, we get a approximate minimum vertex cover in
essentially the same rounds and space. This is the best possible approximation
factor under standard assumptions, culminating a long line of research. It also
leads to an improved round algorithm for
approximate matching. All these results can also be implemented in the
congested clique model within the same number of rounds.Comment: A preliminary version of this paper is to appear in the proceedings
of The 60th Annual IEEE Symposium on Foundations of Computer Science (FOCS
2019
Massively Parallel Algorithms for Distance Approximation and Spanners
Over the past decade, there has been increasing interest in
distributed/parallel algorithms for processing large-scale graphs. By now, we
have quite fast algorithms -- usually sublogarithmic-time and often
-time, or even faster -- for a number of fundamental graph
problems in the massively parallel computation (MPC) model. This model is a
widely-adopted theoretical abstraction of MapReduce style settings, where a
number of machines communicate in an all-to-all manner to process large-scale
data. Contributing to this line of work on MPC graph algorithms, we present
round MPC algorithms for computing
-spanners in the strongly sublinear regime of local memory. To
the best of our knowledge, these are the first sublogarithmic-time MPC
algorithms for spanner construction. As primary applications of our spanners,
we get two important implications, as follows:
-For the MPC setting, we get an -round algorithm for
approximation of all pairs shortest paths (APSP) in the
near-linear regime of local memory. To the best of our knowledge, this is the
first sublogarithmic-time MPC algorithm for distance approximations.
-Our result above also extends to the Congested Clique model of distributed
computing, with the same round complexity and approximation guarantee. This
gives the first sub-logarithmic algorithm for approximating APSP in weighted
graphs in the Congested Clique model
Exponential Speedup over Locality in MPC with Optimal Memory
Locally Checkable Labeling (LCL) problems are graph problems in which a solution is correct if it satisfies some given constraints in the local neighborhood of each node. Example problems in this class include maximal matching, maximal independent set, and coloring problems. A successful line of research has been studying the complexities of LCL problems on paths/cycles, trees, and general graphs, providing many interesting results for the LOCAL model of distributed computing. In this work, we initiate the study of LCL problems in the low-space Massively Parallel Computation (MPC) model. In particular, on forests, we provide a method that, given the complexity of an LCL problem in the LOCAL model, automatically provides an exponentially faster algorithm for the low-space MPC setting that uses optimal global memory, that is, truly linear.
While restricting to forests may seem to weaken the result, we emphasize that all known (conditional) lower bounds for the MPC setting are obtained by lifting lower bounds obtained in the distributed setting in tree-like networks (either forests or high girth graphs), and hence the problems that we study are challenging already on forests. Moreover, the most important technical feature of our algorithms is that they use optimal global memory, that is, memory linear in the number of edges of the graph. In contrast, most of the state-of-the-art algorithms use more than linear global memory. Further, they typically start with a dense graph, sparsify it, and then solve the problem on the residual graph, exploiting the relative increase in global memory. On forests, this is not possible, because the given graph is already as sparse as it can be, and using optimal memory requires new solutions
Equivalence Classes and Conditional Hardness in Massively Parallel Computations
The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ? NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems.
In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model
Parallel Graph Algorithms in Constant Adaptive Rounds: Theory meets Practice
We study fundamental graph problems such as graph connectivity, minimum
spanning forest (MSF), and approximate maximum (weight) matching in a
distributed setting. In particular, we focus on the Adaptive Massively Parallel
Computation (AMPC) model, which is a theoretical model that captures
MapReduce-like computation augmented with a distributed hash table.
We show the first AMPC algorithms for all of the studied problems that run in
a constant number of rounds and use only space per machine,
where . Our results improve both upon the previous results in
the AMPC model, as well as the best-known results in the MPC model, which is
the theoretical model underpinning many popular distributed computation
frameworks, such as MapReduce, Hadoop, Beam, Pregel and Giraph.
Finally, we provide an empirical comparison of the algorithms in the MPC and
AMPC models in a fault-tolerant distriubted computation environment. We
empirically evaluate our algorithms on a set of large real-world graphs and
show that our AMPC algorithms can achieve improvements in both running time and
round-complexity over optimized MPC baselines
- …