21,207 research outputs found
Round Compression for Parallel Matching Algorithms
For over a decade now we have been witnessing the success of {\em massive
parallel computation} (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or
Spark. One of the reasons for their success is the fact that these frameworks
are able to accurately capture the nature of large-scale computation. In
particular, compared to the classic distributed algorithms or PRAM models,
these frameworks allow for much more local computation. The fundamental
question that arises in this context is though: can we leverage this additional
power to obtain even faster parallel algorithms?
A prominent example here is the {\em maximum matching} problem---one of the
most classic graph problems. It is well known that in the PRAM model one can
compute a 2-approximate maximum matching in rounds. However, the
exact complexity of this problem in the MPC framework is still far from
understood. Lattanzi et al. showed that if each machine has
memory, this problem can also be solved -approximately in a constant number
of rounds. These techniques, as well as the approaches developed in the follow
up work, seem though to get stuck in a fundamental way at roughly
rounds once we enter the near-linear memory regime. It is thus entirely
possible that in this regime, which captures in particular the case of sparse
graph computations, the best MPC round complexity matches what one can already
get in the PRAM model, without the need to take advantage of the extra local
computation power.
In this paper, we finally refute that perplexing possibility. That is, we
break the above round complexity bound even in the case of {\em
slightly sublinear} memory per machine. In fact, our improvement here is {\em
almost exponential}: we are able to deliver a -approximation to
maximum matching, for any fixed constant , in
rounds
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
There has been significant recent interest in parallel graph processing due
to the need to quickly analyze the large graphs available today. Many graph
codes have been designed for distributed memory or external memory. However,
today even the largest publicly-available real-world graph (the Hyperlink Web
graph with over 3.5 billion vertices and 128 billion edges) can fit in the
memory of a single commodity multicore server. Nevertheless, most experimental
work in the literature report results on much smaller graphs, and the ones for
the Hyperlink graph use distributed or external memory. Therefore, it is
natural to ask whether we can efficiently solve a broad class of graph problems
on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can
scale to the largest publicly-available graphs using a single machine with a
terabyte of RAM, processing them in minutes. We give implementations of
theoretically-efficient parallel algorithms for 20 important graph problems. We
also present the optimizations and techniques that we used in our
implementations, which were crucial in enabling us to process these large
graphs quickly. We show that the running times of our implementations
outperform existing state-of-the-art implementations on the largest real-world
graphs. For many of the problems that we consider, this is the first time they
have been solved on graphs at this scale. We have made the implementations
developed in this work publicly-available as the Graph-Based Benchmark Suite
(GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs
As massive graphs become more prevalent, there is a rapidly growing need for
scalable algorithms that solve classical graph problems, such as maximum
matching and minimum vertex cover, on large datasets. For massive inputs,
several different computational models have been introduced, including the
streaming model, the distributed communication model, and the massively
parallel computation (MPC) model that is a common abstraction of
MapReduce-style computation. In each model, algorithms are analyzed in terms of
resources such as space used or rounds of communication needed, in addition to
the more traditional approximation ratio.
In this paper, we give a single unified approach that yields better
approximation algorithms for matching and vertex cover in all these models. The
highlights include:
* The first one pass, significantly-better-than-2-approximation for matching
in random arrival streams that uses subquadratic space, namely a
-approximation streaming algorithm that uses space
for constant .
* The first 2-round, better-than-2-approximation for matching in the MPC
model that uses subquadratic space per machine, namely a
-approximation algorithm with memory per
machine for constant .
By building on our unified approach, we further develop parallel algorithms
in the MPC model that give a -approximation to matching and an
-approximation to vertex cover in only MPC rounds and
memory per machine. These results settle multiple open
questions posed in the recent paper of Czumaj~et.al. [STOC 2018]
Recommended from our members
System approach to disparity estimation
A system approach to disparity estimation using dynamic programming is presented. The four step system can calculate a dense correspondence map between a stereo pair with parallel or
nonparallel camera geometry. Results are presented with CCIR 601 format stereo images
- …