70 research outputs found

Parallel heuristics for scalable community detection

Author: Halappanavar Mahantesh
Kalyanaraman Ananth
Lu Hao
Publication venue: The Authors and Battelle Memorial Institute. Published by Elsevier B.V.
Publication date: 31/08/2015
Field of study

Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real-world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is a multi-phase, iterative heuristic for modularity optimization. Originally developed by Blondel et al. (2008), the method has become increasingly popular owing to its ability to detect high-modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading, and tested them over real-world graphs derived from multiple application domains (e.g., internet, citation, biological). Compared to the serial Louvain implementation, our parallel implementation is able to produce community outputs with a higher modularity for most of the inputs tested, in a comparable or smaller number of iterations, while providing absolute speedups of up to 16× using 32 threads.
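
As a rough illustration of the quantity that modularity optimization targets, the sketch below computes the standard Newman-Girvan modularity of a given partition of a small undirected, unweighted graph. It is not the Louvain method itself and not the authors' parallel heuristics; the function name `modularity`, the example graph, and the partition are all hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every vertex to a community label.
    (Both inputs are illustrative assumptions, not taken from the paper.)
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = defaultdict(int)   # edges with both endpoints inside community c
    degree = defaultdict(int)  # sum of vertex degrees inside community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of intra_c / m - (degree_c / (2m))^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "all" for v in range(6)}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Splitting the two triangles into separate communities scores higher than lumping every vertex together, which is the kind of comparison a modularity-driven heuristic makes when it decides whether to move a vertex.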

Elsevier - Publisher Connector

FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators

Author: Dhingra Pratyush
Doppa Janardhan Rao
Joardar Biresh Kumar
Kalyanaraman Ananth
Ogbogu Chukwufumnanya
Pande Partha Pratim
Publication venue
Publication date: 19/01/2024
Field of study

Resistive random-access memory (ReRAM)-based processing-in-memory (PIM) architecture is an attractive solution for training Graph Neural Networks (GNNs) on edge platforms. However, the immature fabrication process and limited write endurance of ReRAMs make them prone to hardware faults, thereby limiting their widespread adoption for GNN training. Further, the existing fault-tolerant solutions prove inadequate for effectively training GNNs in the presence of faults. In this paper, we propose a fault-aware framework referred to as FARe that mitigates the effect of faults during GNN training. FARe outperforms existing approaches in terms of both accuracy and timing overhead. Experimental results demonstrate that the FARe framework can restore GNN test accuracy by 47.6% on faulty ReRAM hardware with a ~1% timing overhead compared to the fault-free counterpart. Comment: This paper has been accepted to the conference DATE (Design, Automation and Test in Europe) - 202

arXiv.org e-Print Archive

Fused Breadth-First Probabilistic Traversals on Distributed GPU Systems

Author: Becchi Michela
Halappanavar Mahantesh
Kalyanaraman Ananth
Minutoli Marco
Neff Reece
Tumeo Antonino
Zarch Mostafa Eghbali
Publication venue
Publication date: 16/11/2023
Field of study

Probabilistic breadth-first traversals (BPTs) are used in many network science and graph machine learning applications. In this paper, we are motivated by the application of BPTs in stochastic diffusion-based graph problems such as influence maximization. These applications heavily rely on BPTs to implement a Monte-Carlo sampling step for their approximations. Given the large sampling complexity, stochasticity of the diffusion process, and the inherent irregularity in real-world graph topologies, efficiently parallelizing these BPTs remains significantly challenging. In this paper, we present a new algorithm to fuse a massive number of concurrently executing BPTs with random starts on the input graph. Our algorithm is designed to fuse BPTs by combining separate traversals into a unified frontier on distributed multi-GPU systems. To show the general applicability of the fused BPT technique, we have incorporated it into two state-of-the-art influence maximization parallel implementations (gIM and Ripples). Our experiments on up to 4K nodes of the OLCF Frontier supercomputer (32,768 GPUs and 196K CPU cores) show strong scaling behavior, and that fused BPTs can improve the performance of these implementations up to 34× (for gIM) and ~360× (for Ripples). Comment: 12 pages, 11 figures
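
To make the idea of a unified frontier concrete, here is a minimal single-process sketch in which several probabilistic traversals share one frontier and each vertex carries a bitmask of the traversals that have reached it. This is only one plausible reading of "fusing" and is an assumption throughout: the function `fused_bpt`, the uniform edge probability `p`, and the bitmask layout are hypothetical, and nothing here reflects the paper's distributed multi-GPU design.

```python
import random


def fused_bpt(adj, starts, p, seed=0):
    """Run len(starts) probabilistic BFS traversals over one shared frontier.

    adj: dict mapping each vertex to a list of neighbours.
    starts: start vertex of traversal i at index i.
    p: probability that a traversal crosses any given edge (assumed uniform).
    Returns a dict vertex -> bitmask; bit i is set if traversal i reached it.
    """
    rng = random.Random(seed)
    visited = {v: 0 for v in adj}
    frontier = {}  # vertex -> bitmask of traversals that just reached it
    for i, s in enumerate(starts):
        visited[s] |= 1 << i
        frontier[s] = frontier.get(s, 0) | (1 << i)
    while frontier:
        next_frontier = {}
        for u, mask in frontier.items():
            for v in adj[u]:
                # Only traversals active at u that have not yet reached v.
                candidates = mask & ~visited[v]
                activated = 0
                bit = 1
                while bit <= candidates:
                    # Independent coin flip per traversal per edge.
                    if candidates & bit and rng.random() < p:
                        activated |= bit
                    bit <<= 1
                if activated:
                    visited[v] |= activated
                    next_frontier[v] = next_frontier.get(v, 0) | activated
        frontier = next_frontier
    return visited


# A path 0-1-2-3 plus an isolated vertex 4; three traversals.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
always = fused_bpt(adj, starts=[0, 3, 4], p=1.0)
never = fused_bpt(adj, starts=[0, 3, 4], p=0.0)
```

With `p=1.0` every traversal degenerates to an ordinary BFS, and with `p=0.0` no traversal leaves its start vertex, which makes the two extremes easy to check by hand. The point of the shared frontier is that a vertex reached by many traversals in the same level has its adjacency list scanned once rather than once per traversal.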

arXiv.org e-Print Archive

The B73 Maize Genome: Complexity, Diversity, and Dynamics

Author: Aluru Srinivas
Emrich Scott J.
et al.
Fu Yan
Hsia An-Ping
Jia Yi
Kalyanaraman Ananth
Liu Sanzhen
Myers Alan M.
Nettleton Dan
Schnable Patrick S.
Ware Doreen
Yeh Cheng-Ting
Ying Kai
Publication venue: Iowa State University Digital Repository
Publication date: 20/11/2009
Field of study

We report an improved draft nucleotide sequence of the 2.3-gigabase genome of maize, an important crop plant and model for biological research. Over 32,000 genes were predicted, of which 99.8% were placed on reference chromosomes. Nearly 85% of the genome is composed of hundreds of families of transposable elements, dispersed nonuniformly across the genome. These were responsible for the capture and amplification of numerous gene fragments and affect the composition, sizes, and positions of centromeres. We also report on the correlation of methylation-poor regions with Mu transposon insertions and recombination, and copy number variants with insertions and/or deletions, as well as how uneven gene losses between duplicated regions were involved in returning an ancient allotetraploid to a genetically diploid state. These analyses inform and set the stage for further investigations to improve our understanding of the domestication and agricultural improvements of maize

Digital Repository @ Iowa State University (ISU)

Detailed Analysis of a Contiguous 22-Mb Region of the Maize Genome

Author: A Esen
A Kalyanaraman
A Smit
AA Salamov
AH Paterson
Ananth Kalyanaraman
Angelina Angelova
AP Tikhonov
Apurva Narechania
B Gaut
B McClintock
B Meyers
BA Kronmiller
Blake C. Meyers
C Liang
C Soderlund
CA Whitelaw
Catrina Fronick
Cheng-Ting Yeh
Chengzhi Liang
Cm Vitte
Cristian Chaparro
D Austin
D Bubeck
D Lisch
David C. Schwartz
David Kudrna
Dawn H. Nagel
DN Duvick
Doreen Ware
E Allen
E Kellogg
EM McCarthy
Emanuele De Paoli
F Liu
F Wei
Fusheng Wei
G Haberer
G Zabala
Gabriel Scara
H Fu
H Yao
HB Mann
HS Malik
HyeRan Kim
I Goldman
J Besemer
J Lai
J Ma
J Messing
JD Thompson
JE Stajich
Jean-Marc Deragon
Jeffrey L. Bennetzen
Jennifer Currie
Jianwei Zhang
Jinke Lin
JL Bennetzen
JN Volff
Joseph R. Ecker
Joshua C. Stein
K Ilic
K Lahners
K Nobuta
K Vandepoele
Kai Ying
KJ Edwards
KM Devos
Kristi Collura
L Veldboom
L Yang
L Zhang
Laura Courtney
Lifang Zhang
Lixing Yang
Lori Spiegel
Lucinda A. Fulton
Lydia Nascimento
M Alleman
M Bohn
M Chen
M Gale
M Kimura
M Morgante
M Spannagl
MA Gore
Marina Wissotski
Melissa Kramer
MR Woodhouse
N Alexandrov
N Jiang
N Rostoks
N Springer
Ning Jiang
P Byrne
P SanMiguel
Pamela J. Green
Patrick S. Schnable
Phillip San Miguel
PS Schnable
Q Li
Q Zhou
R Bruggmann
R Liu
RA Martienssen
RD Finn
Regina S. Baucom
Richard K. Wilson
RK Slotkin
Robert A. Martienssen
Robert S. Fulton
Rod A. Wing
RS Baucom
S Ahn
S Kurtz
S Liu
S Ouyang
S Schwartz
S Takahashi
S Zhou
Sandra W. Clifton
Scott Kruchowski
SE Lewis
SH Hulbert
Shiguo Zhou
Shiran Pasternak
Srinivas Aluru
Stephanie Adams
Susan M. Rock
Susan R. Wessler
T Wicker
TAGI AGI
Tina A. Graves
V Curwen
VV Kapitonov
W Beavis
W Gilbert
W Ramakrishna
W. Richard McCombie
WA Wilson
William Courtney
WJ Kent
Wolfgang Golser
X Cui
X Gao
XF Wang
XY Lin
Y Jia
Yeisoo Yu
YK Jin
Yujun Han
Z Swigonova
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2009
Field of study

Most of our understanding of plant genome structure and evolution has come from the careful annotation of small (e.g., 100 kb) sequenced genomic regions or from automated annotation of complete genome sequences. Here, we sequenced and carefully annotated a contiguous 22 Mb region of maize chromosome 4 using an improved pseudomolecule for annotation. The sequence segment was comprehensively ordered, oriented, and confirmed using the maize optical map. Nearly 84% of the sequence is composed of transposable elements (TEs) that are mostly nested within each other, of which most families are low-copy. We identified 544 gene models using multiple levels of evidence, as well as five miRNA genes. Gene fragments, many captured by TEs, are prevalent within this region. Elimination of gene redundancy from a tetraploid maize ancestor that originated a few million years ago is responsible in this region for most disruptions of synteny with sorghum and rice. Consistent with other sub-genomic analyses in maize, small RNA mapping showed that many small RNAs match TEs and that most TEs match small RNAs. These results, performed on ∼1% of the maize genome, demonstrate the feasibility of refining the B73 RefGen_v1 genome assembly by incorporating optical map, high-resolution genetic map, and comparative genomic data sets. Such improvements, along with those of gene and repeat annotation, will serve to promote future functional genomic and phylogenomic research in maize and other grasses

Public Library of Science (PLOS)

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Archivio istituzionale della ricerca - Università degli Studi di Udine

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Purdue E-Pubs

University of Queensland eSpace

BioEarth: Envisioning and developing a new regional earth system model to inform natural and agricultural resource management

Author: AB Guenther
AF Hamlet
Alan F. Hamlet
Alex Guenther
Ananth Kalyanaraman
Andrew B. Perleberg
B Drewniak
Bart Nijssen
Bhagyam Chandrasekharan
Brian K. Lamb
Chad E. Kruger
Christina L. Tague
CL Tague
Claudio O. Stöckle
CO Stöckle
Cody Miller
CP Weaver
D Byun
D Jaffe
DC Wong
DW Cash
E Allen
EG Irwin
Elizabeth Allen
F Giorgi
Fok-Yan Leung
Georgine G. Yorgey
H Biemans
J Liu
J Phillipson
J-F Lamarque
JA Harrison
Janet S. Choate
Jennie C. Stephens
Jennifer C. Adam
Jin-Ho Yoon
JJ Harou
John A. Harrison
Jonathan Yoder
Joseph K. Vaughan
Julian Reyes
Jun Zhu
Justin Poinsatte
Keyvan Malek
Kiran J. Chinnayakanahalli
Kirti Rajagopalan
Kristen Johnson
L. Ruby Leung
LR Leung
LW Green
M Callon
M Liu
MC Lemos
Michael P. Brady
Mingliang Liu
MS Smith
MS Wigmosta
N Voisin
PR Gent
PW Mote
R. David Evans
Roger Nelson
RW Katz
S Wharton
S-Y Wang
Sarah Anderson
Serena H. Chung
T Kitzberger
Tristan Mullis
TSC Rowan
Tsengel Nergui
Von Walden
W Maslowski
WM Washington
WV Reid
X Liang
Xiaoyan Jiang
Yong Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref

FastEtch: A Fast Sketch-Based Assembler for Genomes

Author: Ghosh Priyanka
Kalyanaraman Ananth
Publication venue: IEEE
Publication date: 01/07/2019
Field of study

De novo genome assembly describes the process of reconstructing an unknown genome from a large collection of short (or long) reads sequenced from the genome. A single run of a Next-Generation Sequencing (NGS) technology can produce billions of short reads, making genome assembly computationally demanding (both in terms of memory and time). One of the major computational steps in modern-day short read assemblers involves the construction and use of a string data structure called the de Bruijn graph. In fact, a majority of short read assemblers build the complete de Bruijn graph for the set of input reads, and subsequently traverse and prune low-quality edges, in order to generate genomic "contigs", the output of assembly. These steps of graph construction and traversal contribute to well over 90 percent of the runtime and memory. In this paper, we present a fast algorithm, FastEtch, that uses sketching to build an approximate version of the de Bruijn graph for the purpose of generating an assembly. The algorithm uses Count-Min sketch, which is a probabilistic data structure for streaming data sets. The result is an approximate de Bruijn graph that stores information pertaining only to a selected subset of nodes that are most likely to contribute to the contig generation step. In addition, edges are not stored; instead, the fraction of edges that contributes to contig generation is detected on the fly. This approximate approach is intended to significantly improve performance (both execution time and memory footprint) whilst possibly compromising on the output assembly quality. We present two main versions of the assembler: one that generates an assembly where each contig represents a contiguous genomic region from one strand of the DNA, and another that generates an assembly where the contigs can straddle either of the two strands of the DNA. For further scalability, we have implemented a multi-threaded parallel code.
Experimental results using our algorithm conducted on E. coli, Yeast, C. elegans, and Human (Chr2 and Chr2+3) genomes show that our method yields one of the best time-memory-quality trade-offs when compared against many state-of-the-art genome assemblers.
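
For readers unfamiliar with the Count-Min sketch named above, the following is a minimal, generic sketch of the data structure applied to k-mer counting. It is not FastEtch's implementation: the class `CountMinSketch`, the helper `kmers`, the table dimensions, and the choice of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter; estimates never undercount."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        # One independent-looking hash per row, derived by salting BLAKE2b
        # with the row index (an illustrative choice of hash family).
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._columns(item):
            self.table[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum is the best bound.
        return min(self.table[row][col] for row, col in self._columns(item))


def kmers(read, k):
    """Yield every length-k substring of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


# Hypothetical toy reads; real inputs would be billions of NGS reads.
sketch = CountMinSketch(width=1024, depth=4)
for read in ["ACGTAC", "CGTACG"]:
    for kmer in kmers(read, 3):
        sketch.add(kmer)
acg_estimate = sketch.estimate("ACG")
```

Each of the k-mers `ACG`, `CGT`, `GTA`, and `TAC` occurs twice across the two toy reads, so every estimate is at least 2. Memory stays fixed at `width * depth` counters regardless of how many distinct k-mers are streamed through, which is the property that makes sketching attractive for selecting frequent k-mers without building the full graph.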

Washington State University institutional repository

An Efficient Parallel Approach for Identifying Protein Families in Large-scale Metagenomic Data Sets

Author: Ananth Kalyanaraman
Changjun Wu
Publication venue: IEEE Press
Publication date: 01/01/2008
Field of study

Metagenomics is the study of environmental microbial communities using state-of-the-art genomic tools. Recent advancements in high-throughput technologies have enabled the accumulation of large volumes of metagenomic data that, until a couple of years ago, was deemed impractical to generate. A primary bottleneck, however, is the lack of scalable algorithms and open source software for large-scale data processing. In this paper, we present the design and implementation of a novel parallel approach to identify protein families from large-scale metagenomic data. Given a set of peptide sequences, we reduce the problem to one of detecting arbitrarily-sized dense subgraphs from bipartite graphs. Our approach efficiently parallelizes this task on a distributed memory machine through a combination of divide-and-conquer and combinatorial pattern matching heuristic techniques. We present performance and quality results of extensively testing our implementation on 160K randomly sampled sequences from the CAMERA environmental sequence database using 512 nodes of a BlueGene/L supercomputer.

CiteSeerX

Crossref