
57 research outputs found

Shared-memory Graph Truss Decomposition

Author: Kabir Humayun
Madduri Kamesh
Publication date: 06/07/2017

We present PKT, a new shared-memory parallel algorithm and OpenMP implementation for the truss decomposition of large sparse graphs. A k-truss is a dense subgraph definition that can be considered a relaxation of a clique. Truss decomposition refers to a partitioning of all the edges in the graph based on their k-truss membership. The truss decomposition of a graph has many applications. We show that our new approach PKT consistently outperforms other truss decomposition approaches for a collection of large sparse graphs and on a 24-core shared-memory server. PKT is based on a recently proposed algorithm for k-core decomposition. Comment: 10 pages, conference submissio
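To make the notion of truss decomposition concrete, here is a minimal sequential sketch in Python. It is not PKT and not the authors' code: it is the classic edge-peeling approach, assuming the standard definition that every edge of a k-truss lies in at least k-2 triangles of that subgraph. The function name `truss_decomposition` and the integer vertex labels are assumptions made for illustration.

```python
from collections import defaultdict


def truss_decomposition(edges):
    """Return {edge: trussness} for an undirected simple graph.

    Assumption: an edge has trussness k when it belongs to the k-truss
    but not to the (k+1)-truss. Vertices must be mutually comparable.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    def key(u, v):
        # Canonical (smaller, larger) representation of an undirected edge.
        return (u, v) if u < v else (v, u)

    # Support of an edge = number of triangles that contain it.
    support = {(u, v): len(adj[u] & adj[v])
               for u in adj for v in adj[u] if u < v}

    trussness = {}
    k = 2
    while support:
        # Edges with support <= k-2 cannot be part of the (k+1)-truss.
        queue = [e for e, s in support.items() if s <= k - 2]
        while queue:
            e = queue.pop()
            if e not in support:  # already peeled (queued more than once)
                continue
            u, v = e
            # Removing e destroys every triangle (u, v, w).
            for w in adj[u] & adj[v]:
                for f in (key(u, w), key(v, w)):
                    support[f] -= 1
                    if support[f] <= k - 2:
                        queue.append(f)
            adj[u].discard(v)
            adj[v].discard(u)
            del support[e]
            trussness[e] = k
        k += 1
    return trussness
```

For a 4-clique with one pendant edge attached, the six clique edges receive trussness 4 and the pendant edge receives trussness 2.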

arXiv.org e-Print Archive

Crossref

Distributed-Memory Breadth-First Search on Massive Graphs

Author: Asanovic Krste
Beamer Scott
Buluc Aydin
Madduri Kamesh
Patterson David
Publication date: 01/01/2017

This chapter studies the problem of traversing large graphs using the breadth-first search order on distributed-memory supercomputers. We consider both the traditional level-synchronous top-down algorithm as well as the recently discovered direction-optimizing algorithm. We analyze the performance and scalability trade-offs in using different local data structures such as CSR and DCSC, enabling in-node multithreading, and graph decompositions such as 1D and 2D decomposition. Comment: arXiv admin note: text overlap with arXiv:1104.451
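As a point of reference for the level-synchronous top-down algorithm mentioned above, the following is a minimal single-node sketch in Python over a CSR graph. It is an illustration only, not the distributed-memory implementation studied in the chapter; the names `bfs_top_down`, `row_ptr`, and `col_idx` are assumptions.

```python
def bfs_top_down(row_ptr, col_idx, source):
    """Level-synchronous top-down BFS on a graph stored in CSR form.

    row_ptr[u]..row_ptr[u + 1] delimits the neighbors of u in col_idx.
    Returns a list of BFS levels, with -1 for unreachable vertices.
    """
    n = len(row_ptr) - 1
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # Top-down step: every frontier vertex inspects its out-neighbors.
        for u in frontier:
            for i in range(row_ptr[u], row_ptr[u + 1]):
                v = col_idx[i]
                if level[v] == -1:  # first visit
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level
```

Each iteration of the outer loop corresponds to one BFS level; in a parallel setting, the work inside a level is what gets distributed.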

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Jet: Multilevel Graph Partitioning on GPUs

Author: Boman Erik G.
Gilbert Michael S.
Madduri Kamesh
Rajamanickam Sivasankaran
Publication date: 25/04/2023

The multilevel heuristic is the dominant strategy for high-quality sequential and parallel graph partitioning. Partition refinement is a key step of multilevel graph partitioning. In this work, we present Jet, a new parallel algorithm for partition refinement specifically designed for Graphics Processing Units (GPUs). We combine Jet with GPU-aware coarsening to develop a k-way graph partitioner. The new partitioner achieves superior quality when compared to state-of-the-art shared memory graph partitioners on a large collection of test graphs. Comment: Submitted as a non-archival track paper for SIAM ACDA 202

arXiv.org e-Print Archive

Multi-level bitmap indexes for flash memory storage

Author: Canon Shane
Madduri Kamesh
Wu Kesheng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

Due to their low access latency, high read speed, and power-efficient operation, flash memory storage devices are rapidly emerging as an attractive alternative to traditional magnetic storage devices. However, tests show that the most efficient indexing methods are not able to take advantage of flash memory storage devices. In this paper, we present a set of multi-level bitmap indexes that can effectively take advantage of flash storage devices. These indexing methods use coarsely binned indexes to answer queries approximately, and then use finely binned indexes to refine the answers. Our new methods read significantly lower volumes of data at the expense of an increased disk access count, thus taking full advantage of the improved read speed and low access latency of flash devices. To demonstrate the advantage of these new indexes, we measure their performance on a number of storage systems using a standard data warehousing benchmark called the Set Query Benchmark. We observe that multi-level strategies on flash drives are up to 3 times faster than traditional indexing strategies on magnetic disk drives.
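The coarse-then-fine idea described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's index: bitmaps are uncompressed Python integers, values are non-negative integers, the coarse bin width is a multiple of the fine bin width, and all names (`build_index`, `range_query`, `rows`) are hypothetical.

```python
def build_index(values, width):
    """Map bin id -> bitmap (Python int); bit i is set if values[i] is in the bin."""
    index = {}
    for i, x in enumerate(values):
        b = x // width
        index[b] = index.get(b, 0) | (1 << i)
    return index


def range_query(values, coarse, fine, cw, fw, lo, hi):
    """Bitmap of rows with lo <= value < hi (assumes cw is a multiple of fw)."""
    result = 0
    for cb, bitmap in coarse.items():
        c_lo, c_hi = cb * cw, (cb + 1) * cw
        if lo <= c_lo and c_hi <= hi:
            result |= bitmap  # coarse bin lies fully inside the range
        elif c_lo < hi and lo < c_hi:
            # Boundary coarse bin: refine the answer with fine bins.
            for fb in range(c_lo // fw, c_hi // fw):
                f_lo, f_hi = fb * fw, (fb + 1) * fw
                fbm = fine.get(fb, 0)
                if lo <= f_lo and f_hi <= hi:
                    result |= fbm
                elif f_lo < hi and lo < f_hi:
                    # Fine bin straddles a boundary: check the raw values.
                    i = 0
                    while fbm >> i:
                        if (fbm >> i) & 1 and lo <= values[i] < hi:
                            result |= 1 << i
                        i += 1
    return result


def rows(bitmap):
    """Row ids whose bit is set in the bitmap, in increasing order."""
    out, i = [], 0
    while bitmap >> i:
        if (bitmap >> i) & 1:
            out.append(i)
        i += 1
    return out
```

In this sketch only the boundary bins trigger finer-grained reads, which mirrors the trade-off named in the abstract: less data read overall, at the cost of more individual accesses.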

Crossref

UNT Digital Library

Parallel Shortest Path Algorithms for Solving Large-Scale Instances

Author: Bruce A. Hendrickson
David A. Bader
Jonathan W. Berry
Joseph R. Crobak
Kamesh Madduri
Publication venue: Georgia Institute of Technology
Publication date: 01/01/2006

We present an experimental study of parallel algorithms for solving the single source shortest path problem with non-negative edge weights (NSSP) on large-scale graphs. We implement Meyer and Sanders' Δ-stepping algorithm and report performance results on the Cray MTA-2, a multithreaded parallel architecture. The MTA-2 is a high-end shared memory system offering two unique features that aid the efficient implementation of irregular parallel graph algorithms: the ability to exploit fine-grained parallelism, and low-overhead synchronization primitives. Our implementation exhibits remarkable parallel speedup when compared with a competitive sequential algorithm, for low-diameter sparse graphs. For instance, Δ-stepping on a directed scale-free graph of 100 million vertices and 1 billion edges takes less than ten seconds on 40 processors of the MTA-2, with a relative speedup of close to 30. To our knowledge, these are the first performance results of a parallel NSSP problem on realistic graph instances on the order of billions of vertices and edges.
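For readers unfamiliar with Δ-stepping, the following is a minimal sequential sketch in Python of the bucket-based scheme. It is not the MTA-2 implementation reported above; the function name `delta_stepping` and the adjacency-dictionary input format are assumptions. In a parallel version, the relaxations issued from one bucket phase are the units of concurrent work.

```python
import math
from collections import defaultdict


def delta_stepping(graph, source, delta):
    """Single-source shortest paths with non-negative edge weights.

    graph: {u: [(v, w), ...]}; delta > 0 is the bucket width.
    Returns {vertex: distance} for every vertex reachable from source.
    """
    dist = defaultdict(lambda: math.inf)
    buckets = defaultdict(set)  # bucket i holds tentative distances in [i*delta, (i+1)*delta)

    def relax(v, d):
        if d < dist[v]:
            if dist[v] != math.inf:
                buckets[int(dist[v] // delta)].discard(v)
            dist[v] = d
            buckets[int(d // delta)].add(v)

    relax(source, 0)
    while True:
        nonempty = [i for i, b in buckets.items() if b]
        if not nonempty:
            break
        i = min(nonempty)
        settled = set()
        # Light edges (w <= delta) may reinsert vertices into bucket i.
        while buckets[i]:
            frontier = buckets[i]
            buckets[i] = set()
            settled |= frontier
            for u in frontier:
                for v, w in graph.get(u, ()):
                    if w <= delta:
                        relax(v, dist[u] + w)
        # Heavy edges (w > delta) are relaxed once the bucket has emptied.
        for u in settled:
            for v, w in graph.get(u, ()):
                if w > delta:
                    relax(v, dist[u] + w)
    return dict(dist)
```

The choice of `delta` trades work for parallelism: a small value behaves like Dijkstra's algorithm, while a large value approaches Bellman-Ford.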

Scholarly Materials And Research @ Georgia Tech

CiteSeerX

Global Simulation of Plasma Microturbulence at the Petascale & Beyond (Optimizing the GTC Code for Blue Gene/Q): ALCF-2 Early Science Program Technical Report

Author: Ethier Stephane
Ibrahim Khaled
Madduri Kamesh
Oliker Leonid
Tang William
Wang Bei
Williams Samuel
Williams Timothy
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 14/05/2013

Crossref

UNT Digital Library

A graph-theoretic analysis of the human protein-interaction network using multicore parallel algorithms

Author: Alfarano
Bader
Batada
Batagelj
Berg
Bork
Brandes
Coffman
David A. Bader
Freeman
Gandhi
Giot
Girvan
Guimerà
Hermjakob
Jeong
Joy
Kamesh Madduri
Lehner
Li
Liljeros
Newman
Peri
Ramani
Reguly
Rual
Salwinski
Scott
Shannon
Sole
Tong
Uetz
Vazquez
Wuchty
Publication venue: 'Elsevier BV'

Crossref

Exploring the Design Space of Static and Incremental Graph Connectivity Algorithms on GPUs

Author: Acar Umut A.
Bader David A.
Banerjee Dip Sankar
Beamer Scott
Blelloch Guy E.
Chakrabarti Deepayan
Chitnis Laukik
Dhulipala Laxman
Ediger D.
Ester Martin
Green O.
Hambrusch S.
Holm J.
Hsu Tsan-Sheng
Jayanti Siddhartha V.
Karger David R.
Kiveris Raimondas
Liu Sixue
Madduri Kamesh
McColl R.
Merrill Duane
Patwary M. M. A.
Phillips C. A.
Shun Julian
Siddhartha
Slota George M.
Soman J.
Stergiou Stergios
Sutton M.
Wang Yangzihao
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/08/2020

Connected components and spanning forest are fundamental graph algorithms due to their use in many important applications, such as graph clustering and image segmentation. GPUs are an ideal platform for graph algorithms due to their high peak performance and memory bandwidth. While there exist several GPU connectivity algorithms in the literature, many design choices have not yet been explored. In this paper, we explore various design choices in GPU connectivity algorithms, including sampling, linking, and tree compression, for both the static as well as the incremental setting. Our various design choices lead to over 300 new GPU implementations of connectivity, many of which outperform the state of the art. We present an experimental evaluation and show that we achieve an average speedup of 2.47x over existing static algorithms. In the incremental setting, we achieve a throughput of up to 48.23 billion edges per second. Compared to state-of-the-art CPU implementations on a 72-core machine, we achieve a speedup of 8.26-14.51x for static connectivity and 1.85-13.36x for incremental connectivity using a Tesla V100 GPU.
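Two of the design dimensions named above, linking and tree compression, can be illustrated with a small sequential union-find sketch in Python. This is one simple combination chosen for illustration (link the larger root under the smaller, compress by path halving), not any of the GPU implementations evaluated in the paper; the function names are assumptions.

```python
def find(parent, x):
    """Return the root of x, compressing the tree by path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # point x at its grandparent
        x = parent[x]
    return x


def connected_components(n, edges):
    """Label each of n vertices with the smallest vertex id in its component."""
    parent = list(range(n))
    for u, v in edges:  # edges may also arrive incrementally, one at a time
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            # Linking rule: the larger root is attached under the smaller one,
            # so every root is the minimum id of its component.
            if ru < rv:
                parent[rv] = ru
            else:
                parent[ru] = rv
    return [find(parent, x) for x in range(n)]
```

Because each edge is processed independently of the ones that follow, the same loop body serves the static setting (all edges known up front) and the incremental setting (edges inserted over time).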

arXiv.org e-Print Archive

Crossref

DSpace@MIT

A high-performance framework for analyzing massive complex networks

Author: Madduri Kamesh
Publication venue: Georgia Institute of Technology
Publication date: 08/07/2008

Graphs are a fundamental and widely-used abstraction for representing data. We can analytically study interesting aspects of real-world complex systems such as the Internet, social systems, transportation networks, and biological interaction data by modeling them as graphs. Graph-theoretic and combinatorial problems are also pervasive in scientific computing and engineering applications. In this dissertation, we address the problem of analyzing large-scale complex networks that represent interactions between hundreds of thousands to billions of entities. We present SNAP, a new high-performance computational framework for efficiently processing graph-theoretic queries on massive datasets. Graph analysis is computationally very different from traditional scientific computing, and solving massive graph-theoretic problems on current high performance computing systems is challenging due to several reasons. First, real-world graphs are often characterized by a low diameter and unbalanced degree distributions, and are difficult to partition on parallel systems. Second, parallel algorithms for solving graph-theoretic problems are typically memory intensive, and the memory accesses are fine-grained and highly irregular. The primary contributions of this dissertation are the design and implementation of novel parallel graph algorithms for traversal, shortest paths, and centrality computations, optimized for the small-world network topology, and high-performance multithreaded architectures and multicore servers. SNAP (Small-world Network Analysis and Partitioning) is a modular, open-source framework for the exploratory analysis and partitioning of large-scale networks. With SNAP, we demonstrate the capability to process massive graphs with billions of vertices and edges, and achieve up to two orders of magnitude speedup over state-of-the-art network analysis approaches. 
We also design a new parallel computing benchmark for characterizing the performance of graph-theoretic problems on high-end systems; study data representations for dynamic graph problems on parallel systems; and apply algorithms in SNAP to solve real-world problems in social network analysis and systems biology. Ph.D. Committee Chair: Bader, David; Committee Member: Berry, Jonathan; Committee Member: Fujimoto, Richard; Committee Member: Saini, Subhash; Committee Member: Vuduc, Richar

Scholarly Materials And Research @ Georgia Tech


Efficient Joins with Compressed Bitmap Indexes

Author: Madduri Kamesh
Publication venue: eScholarship, University of California
Publication date: 08/07/2010

We present a new class of adaptive algorithms that use compressed bitmap indexes to speed up evaluation of the range join query in relational databases. We determine the best strategy to process a join query based on a fast sub-linear time computation of the join selectivity (the ratio of the number of tuples in the result to the total number of possible tuples). In addition, we use compressed bitmaps to represent the join output compactly: the space requirement for storing the tuples representing the join of two relations is asymptotically bounded by $\min(h, n \cdot c_b)$, where $h$ is the number of tuple pairs in the result relation, $n$ is the number of tuples in the smaller of the two relations, and $c_b$ is the cardinality of the larger column being joined. We present a theoretical analysis of our algorithms, as well as experimental results on large-scale synthetic and real data sets. Our implementations are efficient, and consistently outperform well-known approaches for a range of join selectivity factors. For instance, our count-only algorithm is up to three orders of magnitude faster than the sort-merge approach, and our best bitmap index-based algorithm is 1.2x-80x faster than the sort-merge algorithm, for various query instances. We achieve these speedups by exploiting several inherent performance advantages of compressed bitmap indexes for join processing: an implicit partitioning of the attributes, space-efficiency, and tolerance of high-cardinality relations.

eScholarship - University of California