Parallel Graph Connectivity in Log Diameter Rounds
We study the graph connectivity problem in the MPC model. On an undirected graph with
$n$ nodes and $m$ edges, $O(\log n)$ round connectivity algorithms have been
known for over 35 years. However, no algorithms with better complexity bounds
were known. In this work, we give fully scalable, faster algorithms for the
connectivity problem, by parameterizing the time complexity as a function of
the diameter of the graph. Our main result is an $O(\log D \log\log_{m/n} n)$
time connectivity algorithm for diameter-$D$ graphs, using $\Theta(m)$ total
memory. If our algorithm can use more memory, it can terminate in fewer rounds,
and there is no lower bound on the memory per processor.
We extend our results to related graph problems such as spanning forest,
finding a DFS sequence, exact/approximate minimum spanning forest, and
bottleneck spanning forest. We also show that achieving similar bounds for
reachability in directed graphs would imply faster boolean matrix
multiplication algorithms.
We introduce several new algorithmic ideas. We describe a general technique
called double exponential speed problem size reduction which roughly means that
if we can use total memory $N$ to reduce a problem from size $n$ to $n/k$, for
$k = (N/n)^{\Theta(1)}$ in one phase, then we can solve the problem in
$O(\log\log_{N/n} n)$ phases. In order to achieve this fast reduction for graph
connectivity, we use a multistep algorithm. One key step is a carefully
constructed truncated broadcasting scheme where each node broadcasts neighbor
sets to its neighbors in a way that limits the size of the resulting neighbor
sets. Another key step is random leader contraction, where we choose a smaller
set of leaders than many previous works do.
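The double exponential speed reduction described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the function name `phases_until_solved` is hypothetical, the per-phase reduction factor is taken to be exactly `(total_memory / size) ** exponent` with `exponent = 1.0` standing in for the unspecified constant, and a problem counts as solved once its size is at most 1. Because the memory budget stays fixed while the problem shrinks, the ratio of memory to problem size roughly squares each phase, which is where the doubly logarithmic phase count comes from.

```python
def phases_until_solved(n: int, total_memory: int, exponent: float = 1.0) -> list:
    """Simulate repeated problem-size reduction under a fixed total memory budget.

    Assumption (illustrative only): one phase shrinks the current problem size
    by a factor k = (total_memory / size) ** exponent.
    Returns the problem size after each phase, stopping once it is <= 1.
    """
    if n < 1 or total_memory <= n or exponent <= 0:
        raise ValueError("need n >= 1, total_memory > n, and exponent > 0")
    sizes = []
    size = float(n)
    while size > 1.0:
        # The reduction factor grows as the problem shrinks, because the
        # same memory budget is spread over fewer remaining items.
        k = (total_memory / size) ** exponent
        size = size / k
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    sizes = phases_until_solved(2**16, 2**17)
    print(len(sizes), sizes)
```

With a problem of size 2^16 and total memory 2^17, the simulation finishes in 5 phases, whereas halving the problem each phase would need 16.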
New Primitives for Tackling Graph Problems and Their Applications in Parallel Computing
We study fundamental graph problems under parallel computing models. In particular, we consider two parallel computing models: Parallel Random Access Machine (PRAM) and Massively Parallel Computation (MPC). The PRAM model is a classic model of parallel computation. The efficiency of a PRAM algorithm is measured by its parallel time and the number of processors needed to achieve the parallel time. The MPC model is an abstraction of modern massively parallel computing systems such as MapReduce, Hadoop and Spark. The MPC model captures coarse-grained computation on large data well: data is distributed to processors, each of which has a sublinear (in the input data) amount of local memory, and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. We usually desire fully scalable MPC algorithms, i.e., algorithms that can work for any local memory size. The efficiency of a fully scalable MPC algorithm is measured by its parallel time and the total space usage (the local memory size times the number of machines).
Consider an $n$-vertex $m$-edge undirected graph $G$ (either weighted or unweighted) with diameter $D$ (the largest diameter of its connected components). Let $N = m + n$ denote the size of $G$. We present a series of efficient (randomized) parallel graph algorithms with theoretical guarantees. Several results are listed as follows:
1) Fully scalable MPC algorithms for graph connectivity and spanning forest using $O(N)$ total space and $O(\log D \log\log_{N/n} n)$ parallel time.
2) Fully scalable MPC algorithms for 2-edge and 2-vertex connectivity using $O(N)$ total space, where the 2-edge connectivity algorithm needs $O(\log D \log\log_{N/n} n)$ parallel time, and the 2-vertex connectivity algorithm needs $O(\log D \cdot \log^2\log_{N/n} n + \log D' \cdot \log\log_{N/n} n)$ parallel time. Here $D'$ denotes the bi-diameter of $G$.
3) PRAM algorithms for graph connectivity and spanning forest using $O(N)$ processors and $O(\log D \log\log_{N/n} n)$ parallel time.
4) PRAM algorithms for $(1 + \epsilon)$-approximate shortest path and $(1 + \epsilon)$-approximate uncapacitated minimum cost flow using $O(N)$ processors and $\mathrm{poly}(\log N)$ parallel time.
These algorithms are built on a series of new graph algorithmic primitives which may be of independent interest.
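To get a feel for how a log-diameter bound compares with the classical log n bound, the sketch below evaluates the growth term log D · log log_{N/n} n, taking N = m + n as the graph size. The function name `log_diameter_term` is hypothetical, constant factors hidden by the O-notation are ignored, and each logarithmic factor is clamped below at 1 (an assumption made here so that tiny inputs do not produce zero or negative values). The result is a unitless growth term, not a round count of any actual algorithm.

```python
import math


def log_diameter_term(n: int, m: int, diameter: int) -> float:
    """Evaluate log2(D) * log2(log_{N/n}(n)) with N = m + n, ignoring constants.

    Each logarithmic factor is clamped below at 1 (an illustrative choice).
    """
    if n < 2 or m < 1 or diameter < 1:
        raise ValueError("need n >= 2, m >= 1, diameter >= 1")
    total_size = m + n                                   # N = m + n
    log_base = math.log(n) / math.log(total_size / n)    # log_{N/n}(n)
    log_d = max(1.0, math.log2(diameter))
    log_log = max(1.0, math.log2(log_base))
    return log_d * log_log


if __name__ == "__main__":
    n = 2**20
    # Denser graphs (larger N/n) and smaller diameters both shrink the term.
    for m, d in [(n, 16), (15 * n, 16), (15 * n, 1024)]:
        print(m, d, round(log_diameter_term(n, m, d), 2), "vs log2(n) =", math.log2(n))
```

For fixed n, the term drops as the graph gets denser or its diameter gets smaller, while it can exceed log2(n) on sparse graphs with very large diameter.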
Log Diameter Rounds Algorithms for 2-Vertex and 2-Edge Connectivity
Many modern parallel systems, such as MapReduce, Hadoop and Spark, can be modeled well by the MPC model. The MPC model captures coarse-grained computation on large data well: data is distributed to processors, each of which has a sublinear (in the input data) amount of memory, and we alternate between rounds of computation and rounds of communication, where each machine can communicate an amount of data as large as the size of its memory. This model is stronger than the classical PRAM model, and it is an intriguing question to design algorithms whose running time is smaller than in the PRAM model.
In this paper, we study two fundamental problems, 2-edge connectivity and 2-vertex connectivity (biconnectivity). PRAM algorithms which run in $O(\log n)$ time have been known for many years. We give algorithms using roughly log diameter rounds in the MPC model. Our main results are, for an $n$-vertex, $m$-edge graph of diameter $D$ and bi-diameter $D'$, 1) an $O(\log D \log\log_{m/n} n)$ parallel time 2-edge connectivity algorithm, 2) an $O(\log D \log^2\log_{m/n} n + \log D' \log\log_{m/n} n)$ parallel time biconnectivity algorithm, where the bi-diameter $D'$ is the largest cycle length over all the vertex pairs in the same biconnected component. Our results are fully scalable, meaning that the memory per processor can be $O(n^{\delta})$ for arbitrary constant $\delta > 0$, and the total memory used is linear in the problem size. Our 2-edge connectivity algorithm achieves the same parallel time as the connectivity algorithm of [Andoni et al., 2018]. We also show an $\Omega(\log D')$ conditional lower bound for the biconnectivity problem.
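For readers unfamiliar with the problem itself, the following sketch states 2-edge connectivity operationally: a graph is 2-edge connected when it is connected and has no bridge (an edge whose removal disconnects it). This is a plain sequential baseline using the standard DFS low-link technique, not the MPC algorithm of the paper; the function names `find_bridges` and `is_two_edge_connected` and the edge-list input format are assumptions made for illustration.

```python
def find_bridges(n, edges):
    """Return (bridge_edge_indices, number_of_components) for an undirected
    multigraph on vertices 0..n-1, using an iterative DFS with low-link values."""
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    disc = [-1] * n          # DFS discovery time, -1 means unvisited
    low = [0] * n            # earliest discovery time reachable from the subtree
    bridges = set()
    timer = 0
    components = 0
    for root in range(n):
        if disc[root] != -1:
            continue
        components += 1
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            v, parent_edge, neighbors = stack[-1]
            descended = False
            for w, idx in neighbors:
                if idx == parent_edge:
                    continue             # do not reuse the tree edge we came in on
                if disc[w] == -1:
                    disc[w] = low[w] = timer
                    timer += 1
                    stack.append((w, idx, iter(adj[w])))
                    descended = True
                    break
                low[v] = min(low[v], disc[w])   # back edge
            if not descended:
                stack.pop()
                if stack:
                    p = stack[-1][0]
                    low[p] = min(low[p], low[v])
                    if low[v] > disc[p]:
                        bridges.add(parent_edge)   # no back edge bypasses (p, v)
    return bridges, components


def is_two_edge_connected(n, edges):
    """True if the graph is connected and contains no bridge."""
    bridges, components = find_bridges(n, edges)
    return components == 1 and not bridges
```

A 4-cycle passes the check, while a triangle with one pendant edge fails because the pendant edge is a bridge.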