7 research outputs found

    On the Distributed Complexity of Large-Scale Graph Computations

    Full text link
    Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where k≥2k \geq 2 machines jointly perform computations on graphs with nn nodes (typically, n≫kn \gg k). The input graph is assumed to be initially randomly partitioned among the kk machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication {\em rounds} of the computation. Our main contribution is the {\em General Lower Bound Theorem}, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely {\em PageRank computation} and {\em triangle enumeration}. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious. Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques

    Large-Scale Distributed Algorithms for Facility Location with Outliers

    Get PDF
    This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are: - Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. - Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both Facility Location with outlier problems. As shown in our previous work (Berns et al., ICALP 2012, Bandyapadhyay et al., ICDCN 2018) Mettu-Plaxton style algorithms are more easily amenable to being implemented efficiently in distributed and large-scale models of computation

    The Complexity of Symmetry Breaking in Massive Graphs

    Get PDF
    The goal of this paper is to understand the complexity of symmetry breaking problems, specifically maximal independent set (MIS) and the closely related beta-ruling set problem, in two computational models suited for large-scale graph processing, namely the k-machine model and the graph streaming model. We present a number of results. For MIS in the k-machine model, we improve the O~(m/k^2 + Delta/k)-round upper bound of Klauck et al. (SODA 2015) by presenting an O~(m/k^2)-round algorithm. We also present an Omega~(n/k^2) round lower bound for MIS, the first lower bound for a symmetry breaking problem in the k-machine model. For beta-ruling sets, we use hierarchical sampling to obtain more efficient algorithms in the k-machine model and also in the graph streaming model. More specifically, we obtain a k-machine algorithm that runs in O~(beta n Delta^{1/beta}/k^2) rounds and, by using a similar hierarchical sampling technique, we obtain one-pass algorithms for both insertion-only and insertion-deletion streams that use O(beta * n^{1+1/2^{beta-1}}) space. The latter result establishes a clear separation between MIS, which is known to require Omega(n^2) space (Cormode et al., ICALP 2019), and beta-ruling sets, even for beta = 2. Finally, we present an even faster 2-ruling set algorithm in the k-machine model, one that runs in O~(n/k^{2-epsilon} + k^{1-epsilon}) rounds for any epsilon, 0 <=epsilon <=1. For a wide range of values of k this round complexity simplifies to O~(n/k^2) rounds, which we conjecture is optimal. Our results use a variety of techniques. For our upper bounds, we prove and use simulation theorems for beeping algorithms, hierarchical sampling, and L_0-sampling, whereas for our lower bounds we use information-theoretic arguments and reductions to 2-party communication complexity problems

    The Message Complexity of Distributed Graph Optimization

    Full text link
    The message complexity of a distributed algorithm is the total number of messages sent by all nodes over the course of the algorithm. This paper studies the message complexity of distributed algorithms for fundamental graph optimization problems. We focus on four classical graph optimization problems: Maximum Matching (MaxM), Minimum Vertex Cover (MVC), Minimum Dominating Set (MDS), and Maximum Independent Set (MaxIS). In the sequential setting, these problems are representative of a wide spectrum of hardness of approximation. While there has been some progress in understanding the round complexity of distributed algorithms (for both exact and approximate versions) for these problems, much less is known about their message complexity and its relation with the quality of approximation. We almost fully quantify the message complexity of distributed graph optimization by showing the following results...[see paper for full abstract
    corecore