47 research outputs found

SRS: A framework for developing malleable and migratable parallel applications for distributed systems

Author: Arabe J. N. C.; Foster I.; Jack J. Dongarra; Koo R.; Sathish S. Vadhiyar; Tannenbaum T.
Publication venue: 'World Scientific Pub Co Pte Lt'

Identifying Logical Homogeneous Clusters for Efficient Wide-area Communications

Author: A. Legrand; M. Burger; N.T. Karonis; P. Bhat; R. Thakur; S. Vadhiyar; T. Kielman; T. Kielmann
Publication date: 01/01/2004

Recently, many works have focused on implementing collective communication operations adapted to wide-area computational systems, such as computational Grids or global computing. Due to the inherent heterogeneity of such environments, most works separate "clusters" into different hierarchy levels to better model the communication. However, in our opinion, such works do not give enough attention to the delimitation of such clusters, as they normally use the locality or the IP subnet of the machines to delimit a cluster without verifying the "homogeneity" of such clusters. In this paper, we describe a strategy to gather network information from different local-area networks and to construct "logical homogeneous clusters", better suited to performance modelling. Comment: http://www.springerlink.com/index/TTJJL61R1EXDLCM

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Holistic Slowdown Driven Scheduling and Resource Management for Malleable Jobs

Author: Cirne W.; Kale L. V.; Kumbhar P.; Lopez V.; Lucero A.; Ludwig Walter; M. Cera; Vadhiyar Sathish S.; Yoo A. B.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

In job scheduling, the concept of malleability has been explored for many years. Research shows that malleability improves system performance, but its use in HPC never became widespread. The causes are the difficulty of developing malleable applications and the lack of support and integration across the different layers of the HPC software stack. However, in recent years, malleability in job scheduling has become more critical because of the increasing complexity of hardware and workloads. In this context, using nodes in exclusive mode is not always the most efficient solution, as it was for traditional HPC jobs, where applications were highly tuned for static allocations but offered no flexibility for dynamic executions. This paper proposes a new holistic, dynamic job scheduling policy, Slowdown Driven (SD-Policy), which exploits the malleability of applications as the key technology to reduce the average slowdown and response time of jobs. SD-Policy is based on backfill and node sharing. It applies malleability to running jobs to make room for jobs that will run with a reduced set of resources, only when the estimated slowdown improves over the static approach. We implemented SD-Policy in SLURM and evaluated it in a real production environment, and with a simulator using workloads of up to 198K jobs. Results show better resource utilization, with makespan, response time, slowdown, and energy consumption reduced by up to 7%, 50%, 70%, and 6%, respectively, for the evaluated workloads.
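The abstract does not give the estimator SD-Policy uses, so the following is only a minimal sketch of the kind of comparison it describes, assuming the common definition of slowdown as response time (wait plus run) divided by the job's static run time. All function and parameter names are hypothetical and are not taken from the paper or from SLURM.

```python
def slowdown(wait_time: float, run_time: float) -> float:
    """Slowdown of one job: response time divided by run time (assumed definition)."""
    if run_time <= 0:
        raise ValueError("run_time must be positive")
    return (wait_time + run_time) / run_time


def malleable_start_is_better(static_wait: float, static_run: float,
                              shrunk_wait: float, shrunk_run: float) -> bool:
    """Return True when starting earlier on fewer resources lowers the estimated slowdown.

    Both options are normalised by the static run time so they are comparable.
    """
    static_estimate = slowdown(static_wait, static_run)
    malleable_estimate = (shrunk_wait + shrunk_run) / static_run
    return malleable_estimate < static_estimate
```

For instance, under these assumptions a job that would wait 30 time units for a 10-unit static run has a slowdown of 4, so starting immediately on a reduced allocation that takes 20 units (estimate 2) would be preferred.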

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC

GrADSolve—a grid-based RPC system for parallel computing with application-level scheduling

Author: Arbenz; Balay; Berman; Bershad; Birrell; Butler; Casanova; Chang; Denis; Foster; Geist; Jack J. Dongarra; Maassen; Petitet; René; Sathish S. Vadhiyar; Sato; Wolski
Publication venue: 'Elsevier BV'

Crossref

Analysis of DNA sequence transformations on grids

Author: Joshi Y; Vadhiyar S
Publication venue: Elsevier Science

The study of the evolution of species or organisms is essential for various biological applications. Evolution is typically studied at the molecular level by analyzing the mutations of DNA sequences of organisms. Techniques have been developed for building phylogenetic or evolutionary trees for a set of sequences. Though phylogenetic trees capture the overall evolutionary relationships among the sequences, they do not reveal fine-level details of the evolution. In this work, we attempt to resolve various fine-level sequence transformation details associated with a phylogenetic tree using cellular automata. In particular, our work tries to determine the cellular automata rules for neighbor-dependent mutations of segments of DNA sequences. We also determine the number of time steps needed for the evolution of a progeny from an ancestor and the unknown segments of the intermediate sequences in the phylogenetic tree. Due to the existence of a vast number of cellular automata rules, we have developed a grid system that performs parallel guided explorations of the rules on grid resources. We demonstrate our techniques by conducting experiments on a grid comprising machines in three countries and obtaining potentially useful statistics regarding evolution in three HIV sequences. In particular, our work is able to verify the phenomenon of neighbor-dependent mutations and finds that certain combinations of neighbor-dependent mutations, defined by a cellular automata rule, occur with greater than 90% probability. We also find the average number of time steps for mutations for some branches of the phylogenetic tree over a large number of possible transformations, with standard deviations of less than 2.

Open Access Repository of IISc Research Publications

HyPar: A divide-and-conquer model for hybrid CPU-GPU graph processing

Author: Panja Rintu; Vadhiyar Sathish S
Publication venue: ACADEMIC PRESS INC ELSEVIER SCIENCE

Efficient processing of graph applications on heterogeneous CPU-GPU systems requires effectively harnessing the combined power of both the CPU and GPU devices. This paper presents HyPar, a divide-and-conquer model for processing graph applications on hybrid CPU-GPU systems. Our strategy partitions the given graph across the devices and performs simultaneous independent computations on both devices. The model provides a simple and generic API, supported by efficient runtime strategies for hybrid executions. The divide-and-conquer model is demonstrated with five graph applications, and experiments with these applications on a heterogeneous system show that our HyPar strategy provides performance equivalent to the state-of-the-art, optimized CPU-only and GPU-only implementations of the corresponding applications. When compared to the prevalent BSP approach for multi-device executions of graphs, our HyPar method yields 74%-92% average performance improvements.

Open Access Repository of IISc Research Publications

An Efficient MPI_Allgather for Grids

Author: Gupta Rakhi; Vadhiyar Sathish S
Publication venue: ACM Press

Allgather is an important MPI collective communication operation. Most of the algorithms for allgather have been designed for homogeneous and tightly coupled systems. The existing algorithms for allgather on grid systems do not efficiently utilize the bandwidths available on slow wide-area links of the grid. In this paper, we present an algorithm for allgather on grids that efficiently utilizes wide-area bandwidths and is also wide-area optimal. Our algorithm is also adaptive to grid load dynamics, since it considers transient network characteristics when dividing the nodes into clusters. Our experiments on a real grid setup consisting of 3 sites show that our algorithm gives an average performance improvement of 52% over existing strategies.

Open Access Repository of IISc Research Publications

ACCT: Automatic Collective Communications Tuning

Author: Fagg Graham E; Vadhiyar Sathish S
Publication venue: Springer Nature
Publication date: 01/01/2000

The University of Manchester - Institutional Repository

SRS: A framework for developing malleable and migratable parallel applications for distributed systems

Author: Dongarra Jack J.; Vadhiyar Sathish S.
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2002

The ability to produce malleable parallel applications that can be stopped and reconfigured during execution can offer attractive benefits for both the system and the applications. The reconfiguration can take the form of varying the parallelism of the applications, changing the data distributions during execution, or dynamically changing the software components involved in the application execution. In distributed and Grid computing systems, migration and reconfiguration of such malleable applications across distributed heterogeneous sites that do not share common file systems provide flexibility for scheduling and resource management. Present reconfiguration systems do not support migration of parallel applications to distributed locations. In this paper, we discuss a framework for developing malleable and migratable MPI message-passing parallel applications for distributed systems. The framework includes a user-level checkpointing library called SRS and a runtime support system that manages the checkpointed data for distribution to distributed locations. Our experimental results indicate that parallel applications instrumented with the SRS library were able to achieve reconfigurability while incurring about 15-35% overhead.

CiteSeerX

The University of Manchester - Institutional Repository