Lessons from the Congested Clique Applied to MapReduce
The main results of this paper are (I) a simulation algorithm which, under
quite general constraints, transforms algorithms running on the Congested
Clique into algorithms running in the MapReduce model, and (II) a distributed
-coloring algorithm running on the Congested Clique which has an
expected running time of (i) rounds, if ;
and (ii) rounds otherwise. Applying the simulation theorem to
the Congested-Clique -coloring algorithm yields an -round
-coloring algorithm in the MapReduce model.
Our simulation algorithm illustrates a natural correspondence between
per-node bandwidth in the Congested Clique model and memory per machine in the
MapReduce model. In the Congested Clique (and more generally, any network in
the model), the major impediment to constructing fast
algorithms is the restriction on message sizes. Similarly, in the
MapReduce model, the combined restrictions on memory per machine and total
system memory have a dominant effect on algorithm design. In showing a fairly
general simulation algorithm, we highlight the similarities and differences
between these models. Comment: 15 pages
Massively Parallel Algorithms for Distance Approximation and Spanners
Over the past decade, there has been increasing interest in
distributed/parallel algorithms for processing large-scale graphs. By now, we
have quite fast algorithms -- usually sublogarithmic-time and often
-time, or even faster -- for a number of fundamental graph
problems in the massively parallel computation (MPC) model. This model is a
widely-adopted theoretical abstraction of MapReduce style settings, where a
number of machines communicate in an all-to-all manner to process large-scale
data. Contributing to this line of work on MPC graph algorithms, we present
round MPC algorithms for computing
-spanners in the strongly sublinear regime of local memory. To
the best of our knowledge, these are the first sublogarithmic-time MPC
algorithms for spanner construction. As primary applications of our spanners,
we get two important implications, as follows:
- For the MPC setting, we get an -round algorithm for
approximation of all pairs shortest paths (APSP) in the
near-linear regime of local memory. To the best of our knowledge, this is the
first sublogarithmic-time MPC algorithm for distance approximations.
- Our result above also extends to the Congested Clique model of distributed
computing, with the same round complexity and approximation guarantee. This
gives the first sublogarithmic-time algorithm for approximating APSP in weighted
graphs in the Congested Clique model.
Sparse Hopsets in Congested Clique
We give the first Congested Clique algorithm that computes a sparse hopset
with polylogarithmic hopbound in polylogarithmic time. Given a graph ,
a -hopset with "hopbound" , is a set of edges
added to such that for any pair of nodes and in there is a path
with at most hops in with length within of
the shortest path between and in .
Our hopsets are significantly sparser than the recent construction of
Censor-Hillel et al. [6], which constructs a hopset of size
, but with a smaller polylogarithmic hopbound. On the other
hand, the previously known constructions of sparse hopsets with polylogarithmic
hopbound in the Congested Clique model, proposed by Elkin and Neiman
[10],[11],[12], all require polynomial rounds.
One tool that we use is an efficient algorithm that constructs an
-limited neighborhood cover, which may be of independent interest.
Finally, as a side result, we also give a hopset construction in a variant of
the low-memory Massively Parallel Computation model, with improved running time
over existing algorithms.
Equivalence Classes and Conditional Hardness in Massively Parallel Computations
The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasing attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ≠ NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole range of problems.
In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic number of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model.
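To make the one cycle vs. two cycles problem concrete, here is a minimal sequential sketch in Python. It only illustrates the input and the question being asked; it is not an MPC algorithm and says nothing about round complexity, which is what the conjecture concerns. The function name `count_cycles` and the input encoding (each vertex listed with its two neighbours) are assumptions made for this illustration.

```python
def count_cycles(neighbors):
    """Count the cycles in a graph where every vertex has exactly two neighbours.

    neighbors[v] is a pair holding the two neighbours of vertex v
    (hypothetical encoding chosen for this sketch).
    """
    seen = set()
    cycles = 0
    for start in range(len(neighbors)):
        if start in seen:
            continue
        cycles += 1
        prev, cur = None, start
        # Walk around the cycle containing `start` until we return to a seen vertex.
        while cur not in seen:
            seen.add(cur)
            a, b = neighbors[cur]
            nxt = b if a == prev else a  # do not step straight back
            prev, cur = cur, nxt
    return cycles


# One cycle on 6 vertices: 0-1-2-3-4-5-0
one_cycle = [(1, 5), (0, 2), (1, 3), (2, 4), (3, 5), (4, 0)]
# Two cycles on 3 vertices each: 0-1-2-0 and 3-4-5-3
two_cycles = [(1, 2), (0, 2), (0, 1), (4, 5), (3, 5), (3, 4)]
```

Sequentially the walk takes linear time; the difficulty in the MPC setting comes from the limit on memory per machine, which prevents any single machine from following a whole cycle.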
Large-Scale Distributed Algorithms for Facility Location with Outliers
This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are:
- Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
- Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model.
Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both Facility Location with outlier problems. As shown in our previous work (Berns et al., ICALP 2012; Bandyapadhyay et al., ICDCN 2018), Mettu-Plaxton-style algorithms are more easily amenable to being implemented efficiently in distributed and large-scale models of computation.
Algebraic Methods in the Congested Clique
In this work, we use algebraic methods for studying distance computation and
subgraph detection tasks in the congested clique model. Specifically, we adapt
parallel matrix multiplication implementations to the congested clique,
obtaining an round matrix multiplication algorithm, where
is the exponent of matrix multiplication. In conjunction
with known techniques from centralised algorithmics, this gives significant
improvements over previous best upper bounds in the congested clique model. The
highlight results include:
-- triangle and 4-cycle counting in rounds, improving upon the
triangle detection algorithm of Dolev et al. [DISC 2012],
-- a -approximation of all-pairs shortest paths in
rounds, improving upon the -round -approximation algorithm of Nanongkai [STOC 2014], and
-- computing the girth in rounds, which is the first
non-trivial solution in this model.
In addition, we present a novel constant-round combinatorial algorithm for
detecting 4-cycles. Comment: This work is a merger of arxiv:1412.2109 and arxiv:1412.266
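As a rough illustration of how matrix multiplication relates to triangle counting, the sketch below uses the standard centralised identity that an undirected simple graph with adjacency matrix A has trace(A^3)/6 triangles. This is a sequential sketch under that assumption, not the congested clique algorithm of the paper; the helper names `mat_mul` and `count_triangles` are hypothetical.

```python
def mat_mul(a, b):
    """Naive product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_triangles(adj):
    """Count triangles in an undirected simple graph from its 0/1 adjacency matrix."""
    n = len(adj)
    a2 = mat_mul(adj, adj)
    # trace(A^3) = sum over i, j of (A^2)[i][j] * A[j][i]
    trace_a3 = sum(a2[i][j] * adj[j][i] for i in range(n) for j in range(n))
    # Each triangle is counted once per starting vertex and direction: 3 * 2 = 6.
    return trace_a3 // 6


# Complete graph on 4 vertices and a 4-cycle, as small test inputs.
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
c4 = [[0, 1, 0, 1],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [1, 0, 1, 0]]
```

Here `count_triangles(k4)` evaluates to 4 and `count_triangles(c4)` to 0. Replacing the naive `mat_mul` with a faster multiplication routine is what ties the cost of counting to the cost of matrix multiplication.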