
5 research outputs found

A Novel Parallel Triangle Counting Algorithm with Reduced Communication

Authors: Bader David A.; Du Zhihui; Ganeshan Anya; Gundogdu Ahmet; Lew Jason; Li Fuhuan; Rodriguez Oliver Alvarado
Publication date: 03/10/2022

Counting and finding triangles in graphs is often used in real-world analytics for characterizing cohesiveness and identifying communities in graphs. In this paper, we present novel sequential and parallel triangle counting algorithms based on identifying horizontal edges in a breadth-first search (BFS) traversal of the graph. The BFS allows our algorithm to drastically reduce the number of edges examined for set intersections. Our new approach is the first communication-optimal parallel algorithm that asymptotically reduces the communication on massive graphs, such as those from real social networks and synthetic graphs from the Graph500 Benchmark. In our estimate from massive-scale Graph500 graphs, our new algorithm reduces the communication by 21.8x at scale 36 and by 180x at scale 42. Because communication is known to be the dominant cost of parallel triangle counting, our new parallel algorithm, to our knowledge, is now the fastest method for counting triangles in large graphs.

Comment: 10 pages

arXiv.org e-Print Archive
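
The horizontal-edge idea in the abstract above can be illustrated with a small sequential sketch. It is not the authors' implementation and says nothing about their parallel or communication-reducing design. It assumes an undirected graph stored as a dictionary of neighbor sets with comparable vertex labels, and the helper names (`build_adjacency`, `bfs_levels`, `count_triangles_bfs`) are hypothetical. The sketch relies on one observation: adjacent vertices differ by at most one BFS level, so every triangle contains at least one edge whose endpoints share a level (a horizontal edge), and set intersections are only needed on those edges.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_levels(adj):
    """Assign a BFS level to every vertex, restarting in each connected component."""
    level = {}
    for root in adj:
        if root in level:
            continue
        level[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in level:
                    level[v] = level[u] + 1
                    queue.append(v)
    return level


def count_triangles_bfs(adj):
    """Count triangles by intersecting neighbor sets only on horizontal edges."""
    level = bfs_levels(adj)
    apex_on_other_level = 0  # triangles with exactly one horizontal edge
    all_on_same_level = 0    # triangles with three horizontal edges (seen 3 times)
    for u in adj:
        for v in adj[u]:
            # Visit each undirected edge once and keep only horizontal edges.
            if u < v and level[u] == level[v]:
                for w in adj[u] & adj[v]:
                    if level[w] != level[u]:
                        apex_on_other_level += 1
                    else:
                        all_on_same_level += 1
    return apex_on_other_level + all_on_same_level // 3
```

For example, `count_triangles_bfs(build_adjacency([(0, 1), (1, 2), (0, 2)]))` counts the single triangle in a three-vertex cycle. Dividing the same-level count by three is a simplification chosen here for brevity; an ordering rule on vertex labels would avoid the repeated work.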

Parallelizing Maximal Clique Enumeration on GPUs

Authors: Almasri Mohammad; Chang Yen-Hsiang; Hajj Izzat El; Hwu Wen-mei; Nagi Rakesh; Xiong Jinjun
Publication date: 10/06/2022

We present a GPU solution for exact maximal clique enumeration (MCE) that performs a search tree traversal following the Bron-Kerbosch algorithm. Prior works on parallelizing MCE on GPUs perform a breadth-first traversal of the tree, which has limited scalability because of the explosion in the number of tree nodes at deep levels. We propose to parallelize MCE on GPUs by performing depth-first traversal of independent subtrees in parallel. Since MCE suffers from high load imbalance and memory capacity requirements, we propose a worker list for dynamic load balancing, as well as partial induced subgraphs and a compact representation of excluded vertex sets to regulate memory consumption. Our evaluation shows that our GPU implementation on a single GPU outperforms the state-of-the-art parallel CPU implementation by a geometric mean of 4.9x (up to 16.7x), and scales efficiently to multiple GPUs. Our code has been open-sourced to enable further research on accelerating MCE.

arXiv.org e-Print Archive
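
As a rough illustration of the search tree mentioned in the abstract above, here is a sequential CPU sketch of the classic Bron-Kerbosch algorithm with pivoting. It is not the authors' GPU code and does not model their worker list, partial induced subgraphs, or compact excluded-set representation. The function names and the dictionary-of-sets graph format are assumptions made for this example. Each iteration of the top-level loop in `maximal_cliques` starts a subtree that depends only on its own candidate and excluded sets, which is the kind of independent subtree a parallel implementation could hand to separate workers.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bron_kerbosch_pivot(adj, clique, candidates, excluded, out):
    """Depth-first Bron-Kerbosch step with pivoting; appends maximal cliques to out."""
    if not candidates and not excluded:
        out.append(frozenset(clique))
        return
    # Pick the pivot with the most neighbors among the candidates to prune branches.
    pivot = max(candidates | excluded, key=lambda u: len(adj[u] & candidates))
    for v in list(candidates - adj[pivot]):
        bron_kerbosch_pivot(
            adj, clique | {v}, candidates & adj[v], excluded & adj[v], out
        )
        # Move v from the candidate set to the excluded set for later branches.
        candidates = candidates - {v}
        excluded = excluded | {v}


def maximal_cliques(adj):
    """Enumerate all maximal cliques, one independent subtree per start vertex."""
    out = []
    order = sorted(adj)
    position = {v: i for i, v in enumerate(order)}
    for v in order:
        later = {u for u in adj[v] if position[u] > position[v]}
        earlier = {u for u in adj[v] if position[u] < position[v]}
        bron_kerbosch_pivot(adj, {v}, later, earlier, out)
    return out
```

Calling `maximal_cliques(build_adjacency([(0, 1), (1, 2), (0, 2), (2, 3)]))` enumerates the triangle on vertices 0, 1, 2 and the edge between 2 and 3 as the two maximal cliques. The plain sorted vertex order is a simplification; other orderings, such as a degeneracy order, are commonly used to bound subtree sizes.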

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Recommended from our members

Fast BFS-Based Triangle Counting on GPUs

Authors: Owens John D; Wang Leyuan
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

In this paper, we propose a novel method, GSM, to compute graph matching (subgraph isomorphism) on GPUs. Unlike previous formulations of graph matching, our approach is BFS-based and thus can be mapped onto GPUs in a massively parallel fashion. Our implementation uses the Gunrock programming model, and we evaluate its runtime and memory consumption against previous state-of-the-art work. We sustain a peak traversed-edges-per-second (TEPS) rate of nearly 10 GTEPS. Our algorithm is the most scalable and parallel among all existing GPU implementations and also outperforms all existing CPU distributed implementations. This work specifically focuses on leveraging our implementation on the triangle counting problem for the Subgraph Isomorphism Graph Challenge, demonstrating a geometric mean speedup over the 2018 champion of 3.84x.

eScholarship - University of California

Fast BFS-Based Triangle Counting on GPUs

Authors: Owens John D; Wang Leyuan
Publication venue: eScholarship, University of California
Publication date: 04/09/2019

arXiv.org e-Print Archive

Crossref

eScholarship - University of California