    A Faster Distributed Single-Source Shortest Paths Algorithm

    We devise new algorithms for the single-source shortest paths (SSSP) problem with non-negative edge weights in the CONGEST model of distributed computing. While close-to-optimal solutions, in terms of the number of rounds spent by the algorithm, have recently been developed for computing SSSP approximately, the fastest known exact algorithms are still far away from matching the lower bound of $\tilde \Omega (\sqrt{n} + D)$ rounds by Peleg and Rubinovich [SIAM Journal on Computing 2000], where $n$ is the number of nodes in the network and $D$ is its diameter. The state of the art is Elkin's randomized algorithm [STOC 2017] that performs $\tilde O(n^{2/3} D^{1/3} + n^{5/6})$ rounds. We significantly improve upon this upper bound with our two new randomized algorithms for polynomially bounded integer edge weights, the first performing $\tilde O (\sqrt{n D})$ rounds and the second performing $\tilde O (\sqrt{n} D^{1/4} + n^{3/5} + D)$ rounds. Our bounds also compare favorably to the independent result by Ghaffari and Li [STOC 2018]. As side results, we obtain a $(1 + \epsilon)$-approximation $\tilde O ((\sqrt{n} D^{1/4} + D) / \epsilon)$-round algorithm for directed SSSP and a new work/depth trade-off for exact SSSP on directed graphs in the PRAM model. Comment: Presented at the 59th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2018).
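
For orientation, the round bounds above can be compared against the classic distributed Bellman-Ford baseline, which needs up to about $n$ rounds. The sketch below is a sequential simulation of that baseline and is not the algorithm of the paper. The function name `distributed_bellman_ford` and the adjacency-list input format are assumptions made for illustration. Each node sends one distance estimate per incident edge per round, which fits the CONGEST message size when weights are polynomially bounded integers.

```python
import math


def distributed_bellman_ford(adj, source):
    """Simulate synchronous rounds of distributed Bellman-Ford (a baseline).

    adj maps each node to a list of (neighbor, weight) pairs with
    non-negative weights. Returns (dist, rounds), where rounds counts every
    simulated round, including the final one in which nothing changes.
    """
    dist = {v: math.inf for v in adj}
    dist[source] = 0
    rounds = 0
    changed = {source}  # nodes whose estimate improved in the previous round
    while changed:
        # Every node that improved sends its estimate over each incident edge.
        messages = {}
        for u in changed:
            for v, w in adj[u]:
                candidate = dist[u] + w
                if candidate < messages.get(v, math.inf):
                    messages[v] = candidate
        # Every receiver keeps the best estimate it has seen so far.
        changed = set()
        for v, candidate in messages.items():
            if candidate < dist[v]:
                dist[v] = candidate
                changed.add(v)
        rounds += 1
    return dist, rounds


# Hypothetical 4-node example graph (undirected, so edges appear twice).
example = {
    "a": [("b", 1), ("c", 5)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 5), ("b", 2), ("d", 1)],
    "d": [("c", 1)],
}
distances, used_rounds = distributed_bellman_ford(example, "a")
```

The simulation stops once a round passes with no improvement, so on a graph with $n$ nodes it runs for at most $n$ rounds.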

    An Improved Distributed Algorithm for Maximal Independent Set

    The Maximal Independent Set (MIS) problem is one of the basics in the study of locality in distributed graph algorithms. This paper presents an extremely simple randomized algorithm providing a near-optimal local complexity for this problem, which incidentally, when combined with some recent techniques, also leads to a near-optimal global complexity. Classical algorithms of Luby [STOC'85] and Alon, Babai and Itai [JALG'86] provide the global complexity guarantee that, with high probability, all nodes terminate after $O(\log n)$ rounds. In contrast, our initial focus is on the local complexity, and our main contribution is to provide a very simple algorithm guaranteeing that each particular node $v$ terminates after $O(\log \mathsf{deg}(v)+\log 1/\epsilon)$ rounds, with probability at least $1-\epsilon$. The guarantee holds even if the randomness outside the $2$-hop neighborhood of $v$ is determined adversarially. This degree-dependency is optimal, due to a lower bound of Kuhn, Moscibroda, and Wattenhofer [PODC'04]. Interestingly, this local complexity smoothly transitions to a global complexity: by adding techniques of Barenboim, Elkin, Pettie, and Schneider [FOCS'12, arXiv: 1202.1983v3], we get a randomized MIS algorithm with a high probability global complexity of $O(\log \Delta) + 2^{O(\sqrt{\log \log n})}$, where $\Delta$ denotes the maximum degree. This improves over the $O(\log^2 \Delta) + 2^{O(\sqrt{\log \log n})}$ result of Barenboim et al., and gets close to the $\Omega(\min\{\log \Delta, \sqrt{\log n}\})$ lower bound of Kuhn et al. Corollaries include improved algorithms for MIS in graphs of upper-bounded arboricity, or lower-bounded girth, for Ruling Sets, for MIS in the Local Computation Algorithms (LCA) model, and a faster distributed algorithm for the Lovász Local Lemma.
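
To make the classical $O(\log n)$-round baseline concrete, here is a minimal sequential simulation of a Luby-style random-priority MIS procedure. It is a sketch of the classical approach cited above, not the new algorithm of this paper. The function name `luby_style_mis`, the `seed` parameter, and the adjacency-dictionary input are assumptions made for illustration.

```python
import random


def luby_style_mis(adj, seed=0):
    """Simulate a Luby-style randomized MIS procedure round by round.

    adj maps each node to an iterable of its neighbors (undirected graph).
    Returns (mis, rounds).
    """
    rng = random.Random(seed)
    ident = {v: i for i, v in enumerate(adj)}  # tie-breaker for equal draws
    active = set(adj)
    mis = set()
    rounds = 0
    while active:
        # Each active node draws a random priority and tells its neighbors.
        prio = {v: (rng.random(), ident[v]) for v in adj if v in active}
        # A node joins the MIS if it beats every neighbor that is still active.
        winners = {
            v for v in prio
            if all(prio[v] < prio[u] for u in adj[v] if u in active)
        }
        mis |= winners
        # Winners and all of their neighbors drop out of the computation.
        removed = set(winners)
        for v in winners:
            removed.update(adj[v])
        active -= removed
        rounds += 1
    return mis, rounds


# Hypothetical example: a cycle on six nodes.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
independent_set, used_rounds = luby_style_mis(cycle)
```

The active node with the globally smallest priority always wins its round, so at least one node is removed per round and the loop terminates.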

    Towards a complexity theory for the congested clique

    The congested clique model of distributed computing has been receiving attention as a model for densely connected distributed systems. While there has been significant progress on the side of upper bounds, we have very little in terms of lower bounds for the congested clique; indeed, it is now known that proving explicit congested clique lower bounds is as difficult as proving circuit lower bounds. In this work, we use various more traditional complexity-theoretic tools to build a clearer picture of the complexity landscape of the congested clique: -- Nondeterminism and beyond: We introduce the nondeterministic congested clique model (analogous to NP) and show that there is a natural canonical problem family that captures all problems solvable in constant time with nondeterministic algorithms. We further generalise these notions by introducing the constant-round decision hierarchy (analogous to the polynomial hierarchy). -- Non-constructive lower bounds: We lift the prior non-uniform counting arguments to a general technique for proving non-constructive uniform lower bounds for the congested clique. In particular, we prove a time hierarchy theorem for the congested clique, showing that there are decision problems of essentially all complexities, both in the deterministic and nondeterministic settings. -- Fine-grained complexity: We map out relationships between various natural problems in the congested clique model, arguing that a reduction-based complexity theory currently gives us a fairly good picture of the complexity landscape of the congested clique.

    Massively Parallel Algorithms for Distance Approximation and Spanners

    Over the past decade, there has been increasing interest in distributed/parallel algorithms for processing large-scale graphs. By now, we have quite fast algorithms -- usually sublogarithmic-time and often $poly(\log\log n)$-time, or even faster -- for a number of fundamental graph problems in the massively parallel computation (MPC) model. This model is a widely-adopted theoretical abstraction of MapReduce style settings, where a number of machines communicate in an all-to-all manner to process large-scale data. Contributing to this line of work on MPC graph algorithms, we present $poly(\log k) \in poly(\log\log n)$ round MPC algorithms for computing $O(k^{1+o(1)})$-spanners in the strongly sublinear regime of local memory. To the best of our knowledge, these are the first sublogarithmic-time MPC algorithms for spanner construction. As primary applications of our spanners, we get two important implications, as follows: -- For the MPC setting, we get an $O(\log^2\log n)$-round algorithm for $O(\log^{1+o(1)} n)$ approximation of all pairs shortest paths (APSP) in the near-linear regime of local memory. To the best of our knowledge, this is the first sublogarithmic-time MPC algorithm for distance approximations. -- Our result above also extends to the Congested Clique model of distributed computing, with the same round complexity and approximation guarantee. This gives the first sublogarithmic algorithm for approximating APSP in weighted graphs in the Congested Clique model.
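
As a reminder of what a spanner guarantees, the following sketch builds a spanner with the classic sequential greedy rule: scan edges by increasing weight and keep an edge only if the spanner built so far does not already connect its endpoints within the allowed stretch. This is a sequential textbook baseline, not the MPC algorithm of the paper. The names `greedy_spanner` and `shortest_distance`, the `stretch` parameter, and the use of integer node labels are assumptions made for illustration.

```python
import heapq
import math


def shortest_distance(adj, source, target, cutoff):
    """Dijkstra from source to target, ignoring paths longer than cutoff."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd <= cutoff and nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf


def greedy_spanner(nodes, edges, stretch):
    """Return the edges kept by the greedy spanner rule.

    edges is a list of (u, v, weight) triples of an undirected graph.
    """
    adj = {v: [] for v in nodes}
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep (u, v) only if no existing path has length <= stretch * w.
        if shortest_distance(adj, u, v, stretch * w) > stretch * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            kept.append((u, v, w))
    return kept


# Hypothetical example: the complete graph on four nodes with unit weights.
nodes = [0, 1, 2, 3]
edges = [(a, b, 1.0) for a in nodes for b in nodes if a < b]
kept_edges = greedy_spanner(nodes, edges, stretch=3)
```

Every discarded edge already has a detour of length at most `stretch` times its weight, so all pairwise distances in the kept subgraph are within a factor `stretch` of the originals.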

    On the Distributed Complexity of Large-Scale Graph Computations

    Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the $k$ machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious. Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques.