
65 research outputs found

Deterministic massively parallel connectivity

Author: Coy Sam
Czumaj Artur
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022

We consider the problem of designing fundamental graph algorithms in the Massively Parallel Computation (MPC) model. The input to the problem is an undirected grap

Warwick Research Archives Portal Repository

Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs

Author: Acar Umut A.
Bader David A.
Banerjee Dip Sankar
Beamer Scott
Blelloch Guy E.
Chakrabarti Deepayan
Chitnis Laukik
Dhulipala Laxman
Ediger D.
Ester Martin
Green O.
Hambrusch S.
Holm J.
Hsu Tsan-Sheng
Jayanti Siddhartha V.
Karger David R.
Kiveris Raimondas
Liu Sixue
Madduri Kamesh
McColl R.
Merrill Duane
Patwary M. M. A.
Phillips C. A.
Shun Julian
Siddhartha
Slota George M.
Soman J.
Stergiou Stergios
Sutton M.
Wang Yangzihao
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/08/2020

Connected components and spanning forest are fundamental graph algorithms due to their use in many important applications, such as graph clustering and image segmentation. GPUs are an ideal platform for graph algorithms due to their high peak performance and memory bandwidth. While several GPU connectivity algorithms exist in the literature, many design choices have not yet been explored. In this paper, we explore various design choices in GPU connectivity algorithms, including sampling, linking, and tree compression, for both the static and the incremental settings. These design choices lead to over 300 new GPU implementations of connectivity, many of which outperform the state of the art. We present an experimental evaluation and show that we achieve an average speedup of 2.47x over existing static algorithms. In the incremental setting, we achieve a throughput of up to 48.23 billion edges per second. Compared to state-of-the-art CPU implementations on a 72-core machine, we achieve a speedup of 8.26–14.51x for static connectivity and 1.85–13.36x for incremental connectivity using a Tesla V100 GPU.

arXiv.org e-Print Archive

Crossref

DSpace@MIT
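
The linking and tree-compression steps named in the abstract above can be illustrated with a minimal sequential union-find sketch. This is not the paper's GPU code: the function names (`find`, `link`, `connected_components`), the path-halving compression rule, and the "hook the larger root under the smaller" linking rule are all assumptions chosen for illustration, and sampling is omitted.

```python
def find(parent, v):
    """Return the root of v, compressing the tree by path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]  # point v at its grandparent
        v = parent[v]
    return v


def link(parent, u, v):
    """Link the trees containing u and v.

    Returns True if the edge (u, v) joined two different components.
    Assumed rule: the larger root is hooked under the smaller root.
    """
    ru, rv = find(parent, u), find(parent, v)
    if ru == rv:
        return False
    if ru < rv:
        ru, rv = rv, ru
    parent[ru] = rv
    return True


def connected_components(n, edges):
    """Label each of the n vertices with the root of its component."""
    parent = list(range(n))
    for u, v in edges:
        link(parent, u, v)
    return [find(parent, v) for v in range(n)]


if __name__ == "__main__":
    # Static setting: process all edges, then read off the labels.
    print(connected_components(6, [(0, 1), (1, 2), (3, 4)]))

    # Incremental setting: keep `parent` alive and link edges as they arrive.
    parent = list(range(4))
    print(link(parent, 0, 1), link(parent, 1, 0))
```

Because `link` only needs the shared `parent` array, the same structure serves both the static case (one pass over all edges) and the incremental case (edges inserted over time).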

Biconnectivity, Chain Decomposition and st-Numbering Using O(n) Bits

Author: Chakraborty Sankardeep
Raman Venkatesh
Satti Srinivasa Rao
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th International Symposium on Algorithms and Computation (ISAAC 2016)
Publication date: 01/01/2016

Recent work by Elmasry et al. (STACS 2015) and Asano et al. (ISAAC 2014) reconsidered classical fundamental graph algorithms with a focus on improving space complexity. Elmasry et al. gave, among others, an implementation of depth-first search (DFS) of a graph on n vertices and m edges, taking O(m lg lg n) time using O(n) bits of space, improving on the time bound of O(m lg n) due to Asano et al. Subsequently, Banerjee et al. (COCOON 2016) gave an O(m + n) time implementation using O(m + n) bits for DFS and its classical applications (including testing for biconnectivity and finding cut vertices and cut edges). Recently, Kammer et al. (MFCS 2016) gave an algorithm for testing biconnectivity using O(n + min{m, n lg lg n}) bits in linear time. In this paper, we consider O(n)-bit implementations of the classical applications of DFS. These include the problems of finding cut vertices and biconnected components, chain decomposition, and st-numbering. Classical algorithms for them typically use DFS and some Ω(lg n) bits of information at each node. Our O(n)-bit implementations for these problems take O(m lg^c n lg lg n) time for some small constant c (c ≤ 3). Central to our implementation is a succinct representation of the DFS tree and a space-efficient partitioning of the DFS tree into connected subtrees, which may be of independent interest for space-efficient graph algorithms.

Dagstuhl Research Online Publication Server
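
For contrast with the O(n)-bit results in the abstract above, here is a minimal sketch of the classical DFS-based approach to finding cut vertices, which stores two Ω(lg n)-bit values (discovery time and low-link) at each node. It is the textbook method, not the paper's space-efficient implementation; the function name `cut_vertices` and the edge-list input format are assumptions, and the input is assumed to be a simple undirected graph.

```python
def cut_vertices(n, edges):
    """Return the sorted cut vertices of a simple undirected graph on 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = [-1] * n   # DFS discovery time per vertex (-1 means unvisited)
    low = [0] * n     # earliest discovery time reachable from the subtree
    cuts = set()
    timer = 0

    def dfs(u, parent):
        # Recursive for clarity; very deep graphs may exceed Python's
        # default recursion limit.
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0
        for w in adj[u]:
            if disc[w] == -1:
                children += 1
                dfs(w, u)
                low[u] = min(low[u], low[w])
                # A non-root u is a cut vertex if some child subtree
                # cannot reach a proper ancestor of u.
                if parent != -1 and low[w] >= disc[u]:
                    cuts.add(u)
            elif w != parent:
                low[u] = min(low[u], disc[w])  # back edge
        # The DFS root is a cut vertex iff it has more than one child.
        if parent == -1 and children > 1:
            cuts.add(u)

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return sorted(cuts)


if __name__ == "__main__":
    # Triangle 0-1-2 with a tail 1-3-4 attached.
    print(cut_vertices(5, [(0, 1), (1, 2), (2, 0), (1, 3), (3, 4)]))
```

The `disc` and `low` arrays are exactly the per-node bookkeeping that a space-restricted implementation has to avoid storing in full.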