38 research outputs found
Sparse Matrix Multiplication and Triangle Listing in the Congested Clique Model
We show how to multiply two n x n matrices S and T over semirings in the Congested Clique model, where n nodes communicate in a fully connected synchronous network using O(log{n})-bit messages, within O(nz(S)^{1/3} nz(T)^{1/3}/n + 1) rounds of communication, where nz(S) and nz(T) denote the number of non-zero elements in S and T, respectively. By leveraging the sparsity of the input matrices, our algorithm greatly reduces communication costs compared with general multiplication algorithms [Censor-Hillel et al., PODC 2015], and thus improves upon the state-of-the-art for matrices with o(n^2) non-zero elements. Moreover, our algorithm exhibits the additional strength of surpassing previous solutions also in the case where only one of the two matrices is such. Particularly, this allows to efficiently raise a sparse matrix to a power greater than 2. As applications, we show how to speed up the computation on non-dense graphs of 4-cycle counting and all-pairs-shortest-paths.
Our algorithmic contribution is a new deterministic method of restructuring the input matrices in a sparsity-aware manner, which assigns each node with element-wise multiplication tasks that are not necessarily consecutive but guarantee a balanced element distribution, providing for communication-efficient multiplication.
Moreover, this new deterministic method for restructuring matrices may be used to restructure the adjacency matrix of input graphs, enabling faster deterministic solutions for graph related problems. As an example, we present a new sparsity aware, deterministic algorithm which solves the triangle listing problem in O(m/n^{5/3} + 1) rounds, a complexity that was previously obtained by a randomized algorithm [Pandurangan et al., SPAA 2018], and that matches the known lower bound of Omega~(n^{1/3}) when m=n^2 of [Izumi and Le Gall, PODC 2017, Pandurangan et al., SPAA 2018]. Naturally, our triangle listing algorithm also implies triangle counting within the same complexity of O(m/n^{5/3} + 1) rounds, which is (possibly more than) a cubic improvement over the previously known deterministic O(m^2/n^3)-round algorithm [Dolev et al., DISC 2012]
Fast Distributed Algorithms for Girth, Cycles and Small Subgraphs
In this paper we give fast distributed graph algorithms for detecting and listing small subgraphs, and for computing or approximating the girth. Our algorithms improve upon the state of the art by polynomial factors, and for girth, we obtain a constant-time algorithm for additive +1 approximation in Congested Clique, and the first parametrized algorithm for exact computation in Congest.
In the Congested Clique model, we first develop a technique for learning small neighborhoods, and apply it to obtain an O(1)-round algorithm that computes the girth with only an additive +1 error. Next, we introduce a new technique (the partition tree technique) allowing for efficiently listing all copies of any subgraph, which is deterministic and improves upon the state-of the-art for non-dense graphs. We give two concrete applications of the partition tree technique: First we show that for constant k, it is possible to solve C_{2k}-detection in O(1) rounds in the Congested Clique, improving on prior work, which used fast matrix multiplication and thus had polynomial round complexity. Second, we show that in triangle-free graphs, the girth can be exactly computed in time polynomially faster than the best known bounds for general graphs. We remark that no analogous result is currently known for sequential algorithms.
In the Congest model, we describe a new approach for finding cycles, and instantiate it in two ways: first, we show a fast parametrized algorithm for girth with round complexity O?(min{g? n^{1-1/?(g)},n}) for any girth g; and second, we show how to find small even-length cycles C_{2k} for k = 3,4,5 in O(n^{1-1/k}) rounds. This is a polynomial improvement upon the previous running times; for example, our C?-detection algorithm runs in O(n^{2/3}) rounds, compared to O(n^{3/4}) in prior work. Finally, using our improved C?-freeness algorithm, and the barrier on proving lower bounds on triangle-freeness of Eden et al., we show that improving the current ??(?n) lower bound for C?-freeness of Korhonen et al. by any polynomial factor would imply strong circuit complexity lower bounds
On the Distributed Complexity of Large-Scale Graph Computations
Motivated by the increasing need to understand the distributed algorithmic
foundations of large-scale graph computations, we study some fundamental graph
problems in a message-passing model for distributed computing where
machines jointly perform computations on graphs with nodes (typically, ). The input graph is assumed to be initially randomly partitioned among
the machines, a common implementation in many real-world systems.
Communication is point-to-point, and the goal is to minimize the number of
communication {\em rounds} of the computation.
Our main contribution is the {\em General Lower Bound Theorem}, a theorem
that can be used to show non-trivial lower bounds on the round complexity of
distributed large-scale data computations. The General Lower Bound Theorem is
established via an information-theoretic approach that relates the round
complexity to the minimal amount of information required by machines to solve
the problem. Our approach is generic and this theorem can be used in a
"cookbook" fashion to show distributed lower bounds in the context of several
problems, including non-graph problems. We present two applications by showing
(almost) tight lower bounds for the round complexity of two fundamental graph
problems, namely {\em PageRank computation} and {\em triangle enumeration}. Our
approach, as demonstrated in the case of PageRank, can yield tight lower bounds
for problems (including, and especially, under a stochastic partition of the
input) where communication complexity techniques are not obvious.
Our approach, as demonstrated in the case of triangle enumeration, can yield
stronger round lower bounds as well as message-round tradeoffs compared to
approaches that use communication complexity techniques
Beyond Distributed Subgraph Detection: Induced Subgraphs, Multicolored Problems and Graph Parameters
Subgraph detection has recently been one of the most studied problems in the CONGEST model of distributed computing. In this work, we study the distributed complexity of problems closely related to subgraph detection, mainly focusing on induced subgraph detection. The main line of this work presents lower bounds and parameterized algorithms w.r.t structural parameters of the input graph:
- On general graphs, we give unconditional lower bounds for induced detection of cycles and patterns of treewidth 2 in CONGEST. Moreover, by adapting reductions from centralized parameterized complexity, we prove lower bounds in CONGEST for detecting patterns with a 4-clique, and for induced path detection conditional on the hardness of triangle detection in the congested clique.
- On graphs of bounded degeneracy, we show that induced paths can be detected fast in CONGEST using techniques from parameterized algorithms, while detecting cycles and patterns of treewidth 2 is hard.
- On graphs of bounded vertex cover number, we show that induced subgraph detection is easy in CONGEST for any pattern graph. More specifically, we adapt a centralized parameterized algorithm for a more general maximum common induced subgraph detection problem to the distributed setting. In addition to these induced subgraph detection results, we study various related problems in the CONGEST and congested clique models, including for multicolored versions of subgraph-detection-like problems
Parallel Algorithms for Small Subgraph Counting
Subgraph counting is a fundamental problem in analyzing massive graphs, often
studied in the context of social and complex networks. There is a rich
literature on designing efficient, accurate, and scalable algorithms for this
problem. In this work, we tackle this challenge and design several new
algorithms for subgraph counting in the Massively Parallel Computation (MPC)
model:
Given a graph over vertices, edges and triangles, our first
main result is an algorithm that, with high probability, outputs a
-approximation to , with optimal round and space complexity
provided any space per machine, assuming
.
Our second main result is an -rounds
algorithm for exactly counting the number of triangles, parametrized by the
arboricity of the input graph. The space per machine is
for any constant , and the total space is ,
which matches the time complexity of (combinatorial) triangle counting in the
sequential model. We also prove that this result can be extended to exactly
counting -cliques for any constant , with the same round complexity and
total space . Alternatively, allowing space per
machine, the total space requirement reduces to .
Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri
(ITCS 2020) for exactly counting all subgraphs of size at most , can be
implemented in the MPC model in rounds,
space per machine and total space. Therefore,
this result also exhibits the phenomenon that a time bound in the sequential
model translates to a space bound in the MPC model