75 research outputs found

    On the Distributed Complexity of Large-Scale Graph Computations

    Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n ≫ k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious. Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques.

    A Distributed Algorithm for Directed Minimum-Weight Spanning Tree

    Towards a complexity theory for the congested clique

    The congested clique model of distributed computing has been receiving attention as a model for densely connected distributed systems. While there has been significant progress on the side of upper bounds, we have very little in terms of lower bounds for the congested clique; indeed, it is now known that proving explicit congested clique lower bounds is as difficult as proving circuit lower bounds. In this work, we use various more traditional complexity-theoretic tools to build a clearer picture of the complexity landscape of the congested clique:
    -- Nondeterminism and beyond: We introduce the nondeterministic congested clique model (analogous to NP) and show that there is a natural canonical problem family that captures all problems solvable in constant time with nondeterministic algorithms. We further generalise these notions by introducing the constant-round decision hierarchy (analogous to the polynomial hierarchy).
    -- Non-constructive lower bounds: We lift the prior non-uniform counting arguments to a general technique for proving non-constructive uniform lower bounds for the congested clique. In particular, we prove a time hierarchy theorem for the congested clique, showing that there are decision problems of essentially all complexities, both in the deterministic and nondeterministic settings.
    -- Fine-grained complexity: We map out relationships between various natural problems in the congested clique model, arguing that a reduction-based complexity theory currently gives us a fairly good picture of the complexity landscape of the congested clique.

    Super-Fast MST Algorithms in the Congested Clique Using o(m) Messages

    In a sequence of recent results (PODC 2015 and PODC 2016), the running time of the fastest algorithm for the minimum spanning tree (MST) problem in the Congested Clique model was first improved to O(log(log(log(n)))) from O(log(log(n))) (Hegeman et al., PODC 2015) and then to O(log^*(n)) (Ghaffari and Parter, PODC 2016). All of these algorithms use Theta(n^2) messages, independent of the number of edges in the input graph. This paper positively answers a question raised in Hegeman et al., and presents the first "super-fast" MST algorithm with o(m) message complexity for input graphs with m edges. Specifically, we present an algorithm running in O(log^*(n)) rounds, with message complexity ~O(sqrt{m * n}), and then build on this algorithm to derive a family of algorithms, containing for any epsilon, 0 < epsilon <= 1, an algorithm running in O(log^*(n)/epsilon) rounds, using ~O(n^{1 + epsilon}/epsilon) messages. Setting epsilon = log(log(n))/log(n) leads to the first sub-logarithmic round Congested Clique MST algorithm that uses only ~O(n) messages. Our primary tools in achieving these results are (i) a component-wise bound on the number of candidates for MST edges, extending the sampling lemma of Karger, Klein, and Tarjan (Karger, Klein, and Tarjan, JACM 1995) and (ii) Theta(log(n))-wise-independent linear graph sketches (Cormode and Firmani, Dist. Par. Databases, 2014) for generating MST candidate edges.
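    The choice epsilon = log(log(n))/log(n) in the abstract above can be checked with a short calculation. The following is only a sketch, under two assumptions not stated in the abstract: logarithms are taken base 2, and the ~O notation hides factors polylogarithmic in n.

```latex
% Assumptions (not from the abstract): \log = \log_2, and \tilde{O}(\cdot) hides polylog(n) factors.
\begin{align*}
  n^{\varepsilon}
    &= 2^{\log n \cdot \frac{\log\log n}{\log n}}
     = 2^{\log\log n}
     = \log n,
     \qquad \varepsilon = \frac{\log\log n}{\log n}, \\
  \text{messages: }\;
  \tilde{O}\!\left(\frac{n^{1+\varepsilon}}{\varepsilon}\right)
    &= \tilde{O}\!\left(n \log n \cdot \frac{\log n}{\log\log n}\right)
     = \tilde{O}(n), \\
  \text{rounds: }\;
  O\!\left(\frac{\log^{*} n}{\varepsilon}\right)
    &= O\!\left(\frac{\log^{*} n \cdot \log n}{\log\log n}\right)
     = o(\log n),
     \quad \text{since } \log^{*} n = o(\log\log n).
\end{align*}
```

    Under these assumptions, the extra n^epsilon factor is only log n, which the ~O notation absorbs, while the round bound stays strictly below logarithmic.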

    Brief Announcement: On Connectivity in the Broadcast Congested Clique

    Recently, very fast deterministic and randomized algorithms have been obtained for connectivity and minimum spanning tree in the unicast congested clique. In contrast, no solution faster than a simple parallel implementation of Borůvka's algorithm has been known for either problem in the broadcast congested clique. In this announcement, we present the first sub-logarithmic deterministic algorithm for connected components in the broadcast congested clique.