41 research outputs found

    Lessons from the Congested Clique Applied to MapReduce

    Full text link
    The main results of this paper are (I) a simulation algorithm which, under quite general constraints, transforms algorithms running on the Congested Clique into algorithms running in the MapReduce model, and (II) a distributed O(Δ)O(\Delta)-coloring algorithm running on the Congested Clique which has an expected running time of (i) O(1)O(1) rounds, if ΔΘ(log4n)\Delta \geq \Theta(\log^4 n); and (ii) O(loglogn)O(\log \log n) rounds otherwise. Applying the simulation theorem to the Congested-Clique O(Δ)O(\Delta)-coloring algorithm yields an O(1)O(1)-round O(Δ)O(\Delta)-coloring algorithm in the MapReduce model. Our simulation algorithm illustrates a natural correspondence between per-node bandwidth in the Congested Clique model and memory per machine in the MapReduce model. In the Congested Clique (and more generally, any network in the CONGEST\mathcal{CONGEST} model), the major impediment to constructing fast algorithms is the O(logn)O(\log n) restriction on message sizes. Similarly, in the MapReduce model, the combined restrictions on memory per machine and total system memory have a dominant effect on algorithm design. In showing a fairly general simulation algorithm, we highlight the similarities and differences between these models.Comment: 15 page

    Massively Parallel Algorithms for Distance Approximation and Spanners

    Full text link
    Over the past decade, there has been increasing interest in distributed/parallel algorithms for processing large-scale graphs. By now, we have quite fast algorithms -- usually sublogarithmic-time and often poly(loglogn)poly(\log\log n)-time, or even faster -- for a number of fundamental graph problems in the massively parallel computation (MPC) model. This model is a widely-adopted theoretical abstraction of MapReduce style settings, where a number of machines communicate in an all-to-all manner to process large-scale data. Contributing to this line of work on MPC graph algorithms, we present poly(logk)poly(loglogn)poly(\log k) \in poly(\log\log n) round MPC algorithms for computing O(k1+o(1))O(k^{1+{o(1)}})-spanners in the strongly sublinear regime of local memory. To the best of our knowledge, these are the first sublogarithmic-time MPC algorithms for spanner construction. As primary applications of our spanners, we get two important implications, as follows: -For the MPC setting, we get an O(log2logn)O(\log^2\log n)-round algorithm for O(log1+o(1)n)O(\log^{1+o(1)} n) approximation of all pairs shortest paths (APSP) in the near-linear regime of local memory. To the best of our knowledge, this is the first sublogarithmic-time MPC algorithm for distance approximations. -Our result above also extends to the Congested Clique model of distributed computing, with the same round complexity and approximation guarantee. This gives the first sub-logarithmic algorithm for approximating APSP in weighted graphs in the Congested Clique model

    Sparse Hopsets in Congested Clique

    Get PDF
    We give the first Congested Clique algorithm that computes a sparse hopset with polylogarithmic hopbound in polylogarithmic time. Given a graph G=(V,E)G=(V,E), a (β,ϵ)(\beta,\epsilon)-hopset HH with "hopbound" β\beta, is a set of edges added to GG such that for any pair of nodes uu and vv in GG there is a path with at most β\beta hops in GHG \cup H with length within (1+ϵ)(1+\epsilon) of the shortest path between uu and vv in GG. Our hopsets are significantly sparser than the recent construction of Censor-Hillel et al. [6], that constructs a hopset of size O~(n3/2)\tilde{O}(n^{3/2}), but with a smaller polylogarithmic hopbound. On the other hand, the previously known constructions of sparse hopsets with polylogarithmic hopbound in the Congested Clique model, proposed by Elkin and Neiman [10],[11],[12], all require polynomial rounds. One tool that we use is an efficient algorithm that constructs an \ell-limited neighborhood cover, that may be of independent interest. Finally, as a side result, we also give a hopset construction in a variant of the low-memory Massively Parallel Computation model, with improved running time over existing algorithms

    Equivalence Classes and Conditional Hardness in Massively Parallel Computations

    Get PDF
    The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ? NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems. In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model

    Large-Scale Distributed Algorithms for Facility Location with Outliers

    Get PDF
    This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are: - Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. - Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both Facility Location with outlier problems. As shown in our previous work (Berns et al., ICALP 2012, Bandyapadhyay et al., ICDCN 2018) Mettu-Plaxton style algorithms are more easily amenable to being implemented efficiently in distributed and large-scale models of computation

    Algebraic Methods in the Congested Clique

    Full text link
    In this work, we use algebraic methods for studying distance computation and subgraph detection tasks in the congested clique model. Specifically, we adapt parallel matrix multiplication implementations to the congested clique, obtaining an O(n12/ω)O(n^{1-2/\omega}) round matrix multiplication algorithm, where ω<2.3728639\omega < 2.3728639 is the exponent of matrix multiplication. In conjunction with known techniques from centralised algorithmics, this gives significant improvements over previous best upper bounds in the congested clique model. The highlight results include: -- triangle and 4-cycle counting in O(n0.158)O(n^{0.158}) rounds, improving upon the O(n1/3)O(n^{1/3}) triangle detection algorithm of Dolev et al. [DISC 2012], -- a (1+o(1))(1 + o(1))-approximation of all-pairs shortest paths in O(n0.158)O(n^{0.158}) rounds, improving upon the O~(n1/2)\tilde{O} (n^{1/2})-round (2+o(1))(2 + o(1))-approximation algorithm of Nanongkai [STOC 2014], and -- computing the girth in O(n0.158)O(n^{0.158}) rounds, which is the first non-trivial solution in this model. In addition, we present a novel constant-round combinatorial algorithm for detecting 4-cycles.Comment: This is work is a merger of arxiv:1412.2109 and arxiv:1412.266
    corecore