8 research outputs found
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
There has been significant recent interest in parallel graph processing due
to the need to quickly analyze the large graphs available today. Many graph
codes have been designed for distributed memory or external memory. However,
today even the largest publicly-available real-world graph (the Hyperlink Web
graph with over 3.5 billion vertices and 128 billion edges) can fit in the
memory of a single commodity multicore server. Nevertheless, most experimental
work in the literature report results on much smaller graphs, and the ones for
the Hyperlink graph use distributed or external memory. Therefore, it is
natural to ask whether we can efficiently solve a broad class of graph problems
on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can
scale to the largest publicly-available graphs using a single machine with a
terabyte of RAM, processing them in minutes. We give implementations of
theoretically-efficient parallel algorithms for 20 important graph problems. We
also present the optimizations and techniques that we used in our
implementations, which were crucial in enabling us to process these large
graphs quickly. We show that the running times of our implementations
outperform existing state-of-the-art implementations on the largest real-world
graphs. For many of the problems that we consider, this is the first time they
have been solved on graphs at this scale. We have made the implementations
developed in this work publicly-available as the Graph-Based Benchmark Suite
(GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
Fast algorithm for real-time rings reconstruction
The GAP project is dedicated to study the application of GPU in several contexts in which
real-time response is important to take decisions. The definition of real-time depends on
the application under study, ranging from answer time of μs up to several hours in case
of very computing intensive task. During this conference we presented our work in low
level triggers [1] [2] and high level triggers [3] in high energy physics experiments, and
specific application for nuclear magnetic resonance (NMR) [4] [5] and cone-beam CT [6].
Apart from the study of dedicated solution to decrease the latency due to data transport
and preparation, the computing algorithms play an essential role in any GPU application.
In this contribution, we show an original algorithm developed for triggers application, to
accelerate the ring reconstruction in RICH detector when it is not possible to have seeds
for reconstruction from external trackers
Scalable parallel minimum spanning forest computation
10.1145/2145816.2145842Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP205-21