Search CORE

2,438 research outputs found

Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

Author: Blelloch G. E.
Blelloch G. E.
Cormen T. H.
Da Zheng D. M.
Dasari N. S.
Gonzalez J. E.
Greenlaw R.
Karp R. M.
Low Y.
Maon Y.
Ramachandran V.
Shiloach Y.
Zhou W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/07/2019
Field of study

There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

arXiv.org e-Print Archive

Crossref

DSpace@MIT

A Distributed Algorithm For Identifying Strongly Connected Components On Incremental Graphs

Author: Bhowmick S.
Das S. (Sajal) K.
Khanda A.
Norris B.
Pandey A.
Srinivasan S.
Srinivasan S.
Publication venue: Scholars\u27 Mine
Publication date: 01/01/2023
Field of study

Incremental graphs that change over time capture the changing relationships of different entities. Given that many real-world networks are extremely large, it is often necessary to partition the network over many distributed systems and solve a complex graph problem over the partitioned network. This paper presents a distributed algorithm for identifying strongly connected components (SCC) on incremental graphs. We propose a two-phase asynchronous algorithm that involves storing the intermediate results between each iteration of dynamic updates in a novel meta-graph storage format for efficient recomputation of the SCC for successive iterations. To the best of our knowledge, this is the first attempt at identifying SCC for incremental graphs across distributed compute nodes. Our experimental analysis on real and synthesized graphs shows up to 2.8x performance improvement over the state-of-the-art by reducing the overall memory utilized and improving the communication bandwidth

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Parallel Maximum Clique Algorithms with Applications to Network Analysis and Storage

Author: Ali Patwary
Assefaw H. Gebremedhin
David F. Gleich
Md. Mostofa
Ryan A. Rossi
Publication venue
Publication date: 25/12/2013
Field of study

We propose a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. The method exhibits a roughly linear runtime scaling over real-world networks ranging from 1000 to 100 million nodes. In a test on a social network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. Our method employs a branch and bound strategy with novel and aggressive pruning techniques. For instance, we use the core number of a vertex in combination with a good heuristic clique finder to efficiently remove the vast majority of the search space. In addition, we parallelize the exploration of the search tree. During the search, processes immediately communicate changes to upper and lower bounds on the size of maximum clique, which occasionally results in a super-linear speedup because vertices with large search spaces can be pruned by other processes. We apply the algorithm to two problems: to compute temporal strong components and to compress graphs.Comment: 11 page

arXiv.org e-Print Archive

CiteSeerX

Parametric Multi-Step Scheme for GPU-Accelerated Graph Decomposition into Strongly Connected Components

Author: \u10ce\u161ka Milan
Aldegheri Stefano
Barnat Ji\u159\ued
Bombieri Nicola
Busato Federico
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

The problem of decomposing a directed graph into strongly connected components (SCCs) is a fundamental graph problem that is inherently present in many scientific and commercial applications. Clearly, there is a strong need for good high-performance, e.g., GPU-accelerated, algorithms to solve it. Unfortunately, among existing GPU-enabled algorithms to solve the problem, there is none that can be considered the best on every graph, disregarding the graph characteristics. Indeed, the choice of the right and most appropriate algorithm to be used is often left to inexperienced users. In this paper, we introduce a novel parametric multi-step scheme to evaluate existing GPU-accelerated algorithms for SCC decomposition in order to alleviate the burden of the choice and to help the user to identify which combination of existing techniques for SCC decomposition would fit an expected use case the most. We support our scheme with an extensive experimental evaluation that dissects correlations between the internal structure of GPU-based algorithms and their performance on various classes of graphs. The measurements confirm that there is no algorithm that would beat all other algorithms in the decomposition on all of the classes of graphs. Our contribution thus represents an important step towards an ultimate solution of automatically adjusted scheme for the GPU-accelerated SCC decomposition

Crossref

Catalogo dei prodotti della ricerca

Optimizing graph algorithms on pregel-like systems

Author: Allwright J. R.
Chung S.
Dean J.
Gonzalez J. E.
Hong S.
Hong S.
Kannan R.
Karger D. R.
Manne F.
Orzan S. M.
Preis R.
Publication venue: 'VLDB Endowment'
Publication date
Field of study

Crossref

Efficient trimming for strongly connected components calculation

Author: Niewenhuis Dante
Varbanescu Ana Lucia
Publication venue: Association for Computing Machinery
Publication date: 17/05/2022
Field of study

Strongly Connected Components (SCCs) are useful for many applications, such as community detection and personalized recommendation. Determining the SCCs of a graph, however, can be very expensive, and parallelization is not an easy way out: the paral-lelization itself is challenging, and its performance impact varies non-trivially with the input graph structure. This variability is due to trivial components, i.e., SCCs consisting of a single vertex, which lead to significant workload imbalance. Trimming is an effective method to remove trivial components, but is inefficient when used on graphs with few trivial components. In this work, we propose FB-AI-Trim, a parallel SCC algorithm with selective trimming. Our algorithm decides dynamically, at runtime, based on the input graph how to trim the graph. To this end, we train a neural network to predict, using topological graph information, whether trimming is beneficial for performance. We evaluate FB-AI-Trim using 173 unseen graphs, and compare it against four different static trimming models. Our results demonstrate that, over the set of graphs, FB-AI-Trim is the fastest algorithm. Furthermore, FB-AI-Trim is, in 80% of the cases, less than 10% slower than the best performing model on a single graph. Finally, FB-AI-Trim shows significant performance degradation in less than 3% of the graphs.</p

University of Twente Research Information

High performance graph analysis on parallel architectures

Author: Grivas Athanasios K
Publication venue: Newcastle University
Publication date: 01/01/2016
Field of study

PhD ThesisOver the last decade pharmacology has been developing computational methods to enhance drug development and testing. A computational method called network pharmacology uses graph analysis tools to determine protein target sets that can lead on better targeted drugs for diseases as Cancer. One promising area of network-based pharmacology is the detection of protein groups that can produce better e ects if they are targeted together by drugs. However, the e cient prediction of such protein combinations is still a bottleneck in the area of computational biology. The computational burden of the algorithms used by such protein prediction strategies to characterise the importance of such proteins consists an additional challenge for the eld of network pharmacology. Such computationally expensive graph algorithms as the all pairs shortest path (APSP) computation can a ect the overall drug discovery process as needed network analysis results cannot be given on time. An ideal solution for these highly intensive computations could be the use of super-computing. However, graph algorithms have datadriven computation dictated by the structure of the graph and this can lead to low compute capacity utilisation with execution times dominated by memory latency. Therefore, this thesis seeks optimised solutions for the real-world graph problems of critical node detection and e ectiveness characterisation emerged from the collaboration with a pioneer company in the eld of network pharmacology as part of a Knowledge Transfer Partnership (KTP) / Secondment (KTS). In particular, we examine how genetic algorithms could bene t the prediction of protein complexes where their removal could produce a more e ective 'druggable' impact. Furthermore, we investigate how the problem of all pairs shortest path (APSP) computation can be bene ted by the use of emerging parallel hardware architectures as GPU- and FPGA- desktop-based accelerators. In particular, we address the problem of critical node detection with the development of a heuristic search method. It is based on a genetic algorithm that computes optimised node combinations where their removal causes greater impact than common impact analysis strategies. Furthermore, we design a general pattern for parallel network analysis on multi-core architectures that considers graph's embedded properties. It is a divide and conquer approach that decomposes a graph into smaller subgraphs based on its strongly connected components and computes the all pairs shortest paths concurrently on GPU. Furthermore, we use linear algebra to design an APSP approach based on the BFS algorithm. We use algebraic expressions to transform the problem of path computation to multiple independent matrix-vector multiplications that are executed concurrently on FPGA. Finally, we analyse how the optimised solutions of perturbation analysis and parallel graph processing provided in this thesis will impact the drug discovery process.This research was part of a Knowledge Transfer Partnership (KTP) and Knowledge Transfer Secondment (KTS) between e-therapeutics PLC and Newcastle University. It was supported as a collaborative project by e-therapeutics PLC and Technology Strategy boar

Newcastle University eTheses

Distributed Set Reachability

Author: Martin Theobald
Sairam Gurajada
Publication venue
Publication date: 01/01/2016
Field of study

DBIS EPub