4 research outputs found

GraphChallenge.org: Raising the Bar on Graph Analytic Performance

Authors: Gadepally Vijay, Hurley Michael, Jones Michael, Kao Edward, Kepner Jeremy, Mohindra Sanjeev, Monticciolo Paul, Reuther Albert, Samsi Siddharth, Smith Steven, Song William, Staheli Diane
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 22/05/2018

The rise of graph analytic systems has created a need for new ways to measure and compare the capabilities of graph processing systems. The MIT/Amazon/IEEE Graph Challenge has been developed to provide a well-defined community venue for stimulating research and highlighting innovations in graph analysis software, hardware, algorithms, and systems. GraphChallenge.org provides a wide range of pre-parsed graph data sets, graph generators, mathematically defined graph algorithms, example serial implementations in a variety of languages, and specific metrics for measuring performance. Graph Challenge 2017 received 22 submissions by 111 authors from 36 organizations. The submissions highlighted graph analytic innovations in hardware, software, algorithms, systems, and visualization. These submissions produced many comparable performance measurements that can be used for assessing the current state of the art of the field. Numerous submissions implemented the triangle counting challenge, resulting in over 350 distinct measurements. Analysis of these submissions shows that their execution time is a strong function of the number of edges in the graph, $N_e$, and is typically proportional to $N_e^{4/3}$ for large values of $N_e$. Combining the model fits of the submissions presents a picture of the current state of the art of graph analysis, which is typically $10^8$ edges processed per second for graphs with $10^8$ edges. These results are 30 times faster than serial implementations commonly used by many graph analysts and underscore the importance of making these performance benefits available to the broader community. Graph Challenge provides a clear picture of current graph analysis systems and underscores the need for new innovations to achieve high performance on very large graphs.

Comment: 7 pages, 6 figures; submitted to IEEE HPEC Graph Challenge. arXiv admin note: text overlap with arXiv:1708.0686
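To make the scaling claim in the abstract above concrete, here is a minimal Python sketch. It assumes the power-law model holds exactly and calibrates its constant from the single "typical" data point quoted in the abstract (10^8 edges processed per second on a graph with 10^8 edges). The function names are illustrative and do not come from the paper or from GraphChallenge.org.

```python
# Sketch of the model T(N_e) = c * N_e**(4/3) from the abstract.
# Assumption: the constant c is calibrated from one reported data point.

def fit_constant(n_edges: float, edges_per_second: float) -> float:
    """Solve T = c * N_e**(4/3) for c, given one (size, rate) observation."""
    elapsed = n_edges / edges_per_second  # seconds to process the whole graph
    return elapsed / n_edges ** (4.0 / 3.0)


def predicted_time(n_edges: float, c: float) -> float:
    """Execution time in seconds predicted by the power-law model."""
    return c * n_edges ** (4.0 / 3.0)


def predicted_rate(n_edges: float, c: float) -> float:
    """Edges processed per second predicted by the power-law model."""
    return n_edges / predicted_time(n_edges, c)


# Calibrate on the abstract's typical figure: 1e8 edges/s at 1e8 edges.
c = fit_constant(1e8, 1e8)

if __name__ == "__main__":
    for n in (1e7, 1e8, 1e9):
        print(f"N_e={n:.0e}  time={predicted_time(n, c):.3f} s  "
              f"rate={predicted_rate(n, c):.3e} edges/s")
```

Under these assumptions, a graph with ten times more edges takes about 10^(4/3) ≈ 21.5 times longer, so the processing rate falls by a factor of 10^(1/3) ≈ 2.15 for every tenfold increase in edges.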

arXiv.org e-Print Archive

Crossref

Understanding Coarsening for Embedding Large-Scale Graphs

Authors: Akyildiz Taha Atahan, Aljundi Amro Alabsi, Kaya Kamer
Publication date: 10/09/2020

A significant portion of today's data, e.g., social networks and web connections, can be modeled by graphs. A proper analysis of graphs with Machine Learning (ML) algorithms has the potential to yield far-reaching insights into many areas of research and industry. However, the irregular structure of graph data is an obstacle to running ML tasks on graphs, such as link prediction, node classification, and anomaly detection. Graph embedding is a compute-intensive process that represents a graph as a set of vectors in a d-dimensional space, which in turn makes it amenable to ML tasks. Many approaches have been proposed in the literature to improve the performance of graph embedding, e.g., using distributed algorithms, accelerators, and pre-processing techniques. Graph coarsening, which can be considered a pre-processing step, is a structural approximation of a given large graph by a smaller one. As the literature suggests, the cost of embedding decreases significantly when coarsening is employed. In this work, we thoroughly analyze the impact of coarsening quality on embedding performance in terms of both speed and accuracy. Our experiments with a state-of-the-art, fast graph embedding tool show that there is an interplay between the coarsening decisions taken and the embedding quality.

Comment: 10 pages, 6 figures, submitted to 2020 IEEE International Conference on Big Dat
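As an illustration of what "approximating a large graph with a smaller one" can look like, here is a minimal Python sketch of one simple coarsening scheme: greedy pairwise matching followed by edge contraction. This is an assumed, generic technique chosen for brevity. It is not the coarsening strategy studied in the paper, and the function name `coarsen` is hypothetical.

```python
# Generic coarsening sketch (assumption: greedy matching, not the paper's method).
# Each unmatched node is merged with its first unmatched neighbor into one
# supernode; edges between different supernodes are kept and deduplicated.

def coarsen(edges, num_nodes):
    """Return (mapping, coarse_edges) for an undirected graph.

    edges: list of (u, v) pairs with 0 <= u, v < num_nodes
    mapping[u]: id of the supernode that contains node u
    """
    adjacency = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    mapping = [-1] * num_nodes
    next_id = 0
    for u in range(num_nodes):
        if mapping[u] != -1:
            continue  # already absorbed into an earlier supernode
        mapping[u] = next_id
        for v in adjacency[u]:
            if mapping[v] == -1:
                mapping[v] = next_id  # merge u and v
                break
        next_id += 1

    coarse_edges = set()
    for u, v in edges:
        a, b = mapping[u], mapping[v]
        if a != b:  # drop edges that collapsed inside a supernode
            coarse_edges.add((min(a, b), max(a, b)))
    return mapping, sorted(coarse_edges)


if __name__ == "__main__":
    # Path graph 0-1-2-3 collapses to two supernodes joined by one edge.
    print(coarsen([(0, 1), (1, 2), (2, 3)], 4))
```

Which neighbor each node merges with is exactly the kind of "coarsening decision" the abstract refers to. The first-unmatched-neighbor rule used here is the simplest possible choice, not a recommended one.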

arXiv.org e-Print Archive

Sabanci University Research Database

Scalable Community Detection using Distributed Louvain Algorithm

Author: Sattar Naw Safrin
Publication venue: ScholarWorks@UNO
Publication date: 23/05/2019

Community detection (or clustering) in large-scale graphs is an important problem in graph mining. Communities reveal interesting characteristics of a network. Louvain is an efficient sequential algorithm, but it fails to scale to emerging large-scale data. Developing distributed-memory parallel algorithms is challenging because of inter-process communication and load-balancing issues. In this work, we design a shared-memory algorithm using OpenMP, which shows a 4-fold speedup but is limited by the number of available physical cores. Our second algorithm is an MPI-based parallel algorithm that scales to a moderate number of processors. We also implement a hybrid algorithm combining both. Finally, we incorporate dynamic load balancing in our final algorithm, DPLAL (Distributed Parallel Louvain Algorithm with Load-balancing). DPLAL overcomes the performance bottleneck of the previous algorithms and shows around 12-fold speedup while scaling to a larger number of processors. Overall, we present the challenges, our solutions, and the empirical performance of our algorithms for several large real-world networks
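The Louvain method greedily optimizes modularity, the quantity that scores how good a partition into communities is. As a minimal sketch (assuming an undirected, unweighted graph given as an edge list; the function name is illustrative and not taken from the thesis), modularity can be computed as Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes:

```python
# Modularity of a partition of an undirected, unweighted graph.
# Assumption: edges is a list of (u, v) pairs and community maps node -> label.
from collections import defaultdict


def modularity(edges, community):
    """Return Q = sum over c of L_c/m - (d_c/(2m))**2."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # L_c: edges with both endpoints in c
    degree = defaultdict(int)    # d_c: sum of degrees of nodes in c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    print(modularity(edges, split))
```

For the two-triangle example, each community has 3 internal edges and total degree 7 out of m = 7 edges, which gives Q = 2 × (3/7 − 1/4) = 5/14. Placing all nodes in one community gives Q = 0.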

University of New Orleans

puzzlef/louvain-communities-openmp: Design of OpenMP-based Louvain algorithm for community detection

Author: Subhajit Sahu
Publication venue: Zenodo
Publication date: 19/12/2023

Multi-threaded OpenMP-based <a href="https://en.wikipedia.org/wiki/Louvain_method">Louvain</a> algorithm for <a href="https://en.wikipedia.org/wiki/Community_structure">community detection</a>. Recent advancements in data collection and graph representations have led to unprecedented levels of complexity, demanding efficient parallel algorithms for community detection on large networks. The use of multicore/shared memory setups is crucial for energy efficiency and compatibility with extensive DRAM sizes. However, existing community detection algorithms face challenges in parallelization due to their irregular and inherently sequential nature. While studies on the Louvain algorithm propose optimizations and parallelization techniques, they often neglect the aggregation phase, creating a bottleneck even after optimizing the local-moving phase. Additionally, these optimization techniques are scattered across multiple papers, making it challenging for readers to grasp and implement them effectively. To address this, we introduce GVE-Louvain, an optimized parallel implementation of Louvain for shared memory multicores. Below we plot the time taken by <a href="https://github.com/ECP-ExaGraph/vite">Vite</a> (Louvain), <a href="https://github.com/ECP-ExaGraph/grappolo">Grappolo</a> (Louvain), <a href="https://github.com/networkit/networkit">NetworKit</a> Louvain, and GVE-Louvain on 13 different graphs. GVE-Louvain surpasses Vite, Grappolo, and NetworKit by <code>50×</code>, <code>22×</code>, and <code>20×</code> respectively, achieving a processing rate of <code>560M</code> edges/s on a <code>3.8B</code> edge graph. <a href="https://docs.google.com/spreadsheets/d/1aJI2Us60KXbSx9LeGyHdnuYfEg4d_5bFhiLXc9eaUjM/edit?usp=sharing">[Figure]</a> Below we plot the speedup of GVE-Louvain with respect to Vite, Grappolo, and NetworKit.
<a href="https://docs.google.com/spreadsheets/d/1aJI2Us60KXbSx9LeGyHdnuYfEg4d_5bFhiLXc9eaUjM/edit?usp=sharing">[Figure]</a> Next, we plot the modularity of communities identified by Vite, Grappolo, NetworKit, and GVE-Louvain. GVE-Louvain on average obtains <code>3.1%</code> higher modularity than Vite (especially on web graphs), and <code>0.6%</code> lower modularity than Grappolo and NetworKit (especially on social networks with poor clustering). <a href="https://docs.google.com/spreadsheets/d/1aJI2Us60KXbSx9LeGyHdnuYfEg4d_5bFhiLXc9eaUjM/edit?usp=sharing">[Figure]</a> Finally, we plot the strong scaling behaviour of GVE-Louvain. With each doubling of threads, GVE-Louvain exhibits an average performance scaling of <code>1.6×</code>. <a href="https://docs.google.com/spreadsheets/d/1eR0jkbjoskL9K2HNVy-irnHERMV770egljh94alRT0U/edit?usp=sharing">[Figure]</a> Refer to our technical report for more details: <a href="https://arxiv.org/abs/2312.04876">GVE-Louvain: Fast Louvain Algorithm for Community Detection in Shared Memory Setting</a>.
<blockquote>Note: You can just copy <code>main.sh</code> to your system and run it. For the code, refer to <code>main.cxx</code>.</blockquote>
<h3>Code structure</h3>
The code structure of GVE-Louvain is as follows:
<pre><code>- inc/_algorithm.hxx: Algorithm utility functions
- inc/_bitset.hxx: Bitset manipulation functions
- inc/_cmath.hxx: Math functions
- inc/_ctypes.hxx: Data type utility functions
- inc/_cuda.hxx: CUDA utility functions
- inc/_debug.hxx: Debugging macros (LOG, ASSERT, ...)
- inc/_iostream.hxx: Input/output stream functions
- inc/_iterator.hxx: Iterator utility functions
- inc/_main.hxx: Main program header
- inc/_mpi.hxx: MPI (Message Passing Interface) utility functions
- inc/_openmp.hxx: OpenMP utility functions
- inc/_queue.hxx: Queue utility functions
- inc/_random.hxx: Random number generation functions
- inc/_string.hxx: String utility functions
- inc/_utility.hxx: Runtime measurement functions
- inc/_vector.hxx: Vector utility functions
- inc/batch.hxx: Batch update generation functions
- inc/bfs.hxx: Breadth-first search algorithms
- inc/csr.hxx: Compressed Sparse Row (CSR) data structure functions
- inc/dfs.hxx: Depth-first search algorithms
- inc/duplicate.hxx: Graph duplicating functions
- inc/Graph.hxx: Graph data structure functions
- inc/louvain.hxx: Louvain community detection algorithm functions
- inc/main.hxx: Main header
- inc/mtx.hxx: Graph file reading functions
- inc/properties.hxx: Graph property functions
- inc/selfLoop.hxx: Graph self-looping functions
- inc/symmetricize.hxx: Graph symmetricization functions
- inc/transpose.hxx: Graph transpose functions
- inc/update.hxx: Update functions
- main.cxx: Experimentation code
- process.js: Node.js script for processing output logs
</code></pre>
Note that each branch in this repository contains code for a specific experiment. The <code>main</code> branch contains code for the final experiment. If the intention of a branch is unclear, or if you have comments on our technical report, feel free to open an issue.
<h2>References</h2>
<ul>
<li><a href="https://arxiv.org/abs/0803.0476">Fast unfolding of communities in large networks; Vincent D. Blondel et al. (2008)</a></li>
<li><a href="https://arxiv.org/abs/1305.2006">Community Detection on the GPU; Md. Naim et al. (2017)</a></li>
<li><a href="https://ieeexplore.ieee.org/document/8091047">Scalable Static and Dynamic Community Detection Using Grappolo; Mahantesh Halappanavar et al. (2017)</a></li>
<li><a href="https://www.nature.com/articles/s41598-019-41695-z">From Louvain to Leiden: guaranteeing well-connected communities; V.A. Traag et al. (2019)</a></li>
<li><a href="https://www.youtube.com/watch?v=0zuiLBOIcsw">CS224W: Machine Learning with Graphs | Louvain Algorithm; Jure Leskovec (2021)</a></li>
<li><a href="https://doi.org/10.1145/2049662.2049663">The University of Florida Sparse Matrix Collection; Timothy A. Davis et al. (2011)</a></li>
</ul>
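The strong-scaling figure quoted in the GVE-Louvain description above (an average of 1.6× per doubling of threads) can be turned into a rough projection. The Python sketch below assumes that the average factor applies uniformly at every doubling. That is an extrapolation for illustration only, not a result reported by the authors, and the function names are hypothetical.

```python
# Rough strong-scaling projection.
# Assumption: every doubling of threads yields the same 1.6x factor.
import math


def projected_speedup(threads: int, per_doubling: float = 1.6) -> float:
    """Speedup over one thread after log2(threads) doublings."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return per_doubling ** math.log2(threads)


def parallel_efficiency(threads: int, per_doubling: float = 1.6) -> float:
    """Fraction of ideal linear speedup that is retained."""
    return projected_speedup(threads, per_doubling) / threads


if __name__ == "__main__":
    for t in (1, 2, 4, 8, 16, 32, 64):
        print(f"{t:3d} threads: speedup {projected_speedup(t):6.2f}x, "
              f"efficiency {parallel_efficiency(t):.2f}")
```

Under this assumption, 64 threads (six doublings) would give about 1.6^6 ≈ 16.8× over a single thread, which corresponds to roughly 26% parallel efficiency.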

ZENODO