14 research outputs found

    Load Balancing in Parallel Molecular Dynamics

    Implementing molecular dynamics as a parallel application presents unique load-balancing challenges. The non-uniform spatial distribution of atoms, combined with the need to avoid symmetric redundant computations, produces a highly irregular computational load, and scalability and efficiency considerations introduce further irregularity. Moreover, as the simulation evolves, the movement of atoms changes the load distribution. This paper describes the use of an object-based, measurement-based load balancing strategy for a parallel molecular dynamics application, and its impact on performance.

    1 Introduction

    Computational molecular dynamics is aimed at studying the properties of biomolecular systems and their dynamic interactions. As human understanding of biomolecules progresses, such computational simulations become increasingly important. In addition to their use in understanding basic biological processes, such simulations are used in rational drug design. As rese…
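
    The object-based, measurement-based idea admits a short illustration. The sketch below is not the paper's implementation: it shows only the core policy of measuring each migratable object's load at runtime and greedily reassigning objects to the least-loaded processor. All names, the greedy heuristic, and the assumption of dense object ids are inventions of the sketch; a real balancer would also weigh communication volume and migration cost.

        // Measurement-based greedy rebalancing: loads are observed, not modeled.
        #include <algorithm>
        #include <cstdio>
        #include <functional>
        #include <queue>
        #include <utility>
        #include <vector>

        struct Obj { int id; double measuredLoad; };  // load observed over past steps

        // Returns assignment[objectId] = processor.
        std::vector<int> greedyRebalance(std::vector<Obj> objs, int numProcs) {
            // Place the heaviest objects first so large items don't arrive last.
            std::sort(objs.begin(), objs.end(),
                      [](const Obj& a, const Obj& b) { return a.measuredLoad > b.measuredLoad; });

            // Min-heap of (currentLoad, processor).
            using P = std::pair<double, int>;
            std::priority_queue<P, std::vector<P>, std::greater<P>> procs;
            for (int p = 0; p < numProcs; ++p) procs.push({0.0, p});

            std::vector<int> assign(objs.size());
            for (const Obj& o : objs) {
                auto [load, p] = procs.top();   // lightest processor so far
                procs.pop();
                assign[o.id] = p;               // (conceptually) migrate the object here
                procs.push({load + o.measuredLoad, p});
            }
            return assign;
        }

        int main() {
            std::vector<Obj> objs = {{0, 3.0}, {1, 1.5}, {2, 2.5}, {3, 0.5}};
            std::vector<int> where = greedyRebalance(objs, 2);
            for (std::size_t i = 0; i < where.size(); ++i)
                std::printf("object %zu -> proc %d\n", i, where[i]);
        }

    Because the inputs are measured rather than predicted, rerunning the same loop periodically adapts the assignment as atoms move and loads drift.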

    Adaptive Load Balancing for MPI Programs

    Parallel Computational Science and Engineering (CSE) applications often exhibit irregular structure and dynamic load patterns. Many such applications have been developed using MPI, but incorporating dynamic load balancing techniques at the application level involves significant changes to the design and structure of applications, and traditional run-time systems for MPI do not support dynamic load balancing. Object-based parallel programming languages, such as Charm++, support efficient dynamic load balancing using object migration; however, converting legacy MPI applications to such object-based paradigms is cumbersome. This paper describes an implementation of MPI, called Adaptive MPI (AMPI), that supports dynamic load balancing for MPI applications. Conversion from MPI to this platform is straightforward even for large legacy codes. We describe our positive experience in converting the component codes ROCFLO and ROCSOLID of a rocket simulation application to AMPI.
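
    The conversion pattern can be sketched in a few lines: an ordinary MPI timestep loop periodically yields to the runtime, which may then migrate the thread-virtualized rank. The kernels below are hypothetical stubs, and AMPI_Migrate(AMPI_INFO_LB_SYNC) is the spelling used by recent AMPI releases, assumed here rather than taken from the paper.

        #include <mpi.h>

        // Hypothetical application kernels, stubbed so the sketch is self-contained.
        void computeLocalPhysics() { /* ... local force/update work ... */ }
        void exchangeBoundaries()  { /* ... halo exchange with neighbors ... */ }

        void timestepLoop(int nSteps) {
            for (int step = 0; step < nSteps; ++step) {
                computeLocalPhysics();
                exchangeBoundaries();
                if (step % 64 == 0) {
                    // Under AMPI each MPI "process" is a migratable user-level
                    // thread; this collective lets the runtime move overloaded
                    // ranks. Under plain MPI the line would simply be absent.
                    AMPI_Migrate(AMPI_INFO_LB_SYNC);  // assumed AMPI extension
                }
            }
        }

        int main(int argc, char** argv) {
            MPI_Init(&argc, &argv);
            timestepLoop(1024);
            MPI_Finalize();
            return 0;
        }

    This is the sense in which conversion is straightforward: the loop body remains unchanged MPI code, and only the migration hook is added.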

    Converse: An Interoperable Framework for Parallel Programming

    Many different parallel languages and paradigms have been developed, each with its own advantages. To benefit from all of them, it should be possible to link together modules written in different parallel languages in a single application. Since the paradigms sometimes differ in fundamental ways, this is difficult to accomplish. This paper describes a framework, Converse, that supports such multi-lingual interoperability. The framework is meant to be inclusive, and has been verified to support the SPMD programming style, message-driven programming, parallel object-oriented programming, and thread-based paradigms. The framework aims at extracting the essential aspects of the runtime support into a set of core components, so that language-specific code does not have to pay overhead for features that it does not need.

    1 Introduction

    Research on parallel computing has produced a number of different parallel programming paradigms, architectures and algorithms. There is a wealth of paralle…
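
    The core-component idea can be made concrete with a toy message-driven scheduler: each module registers handlers, and a scheduler loop dequeues messages and dispatches each to its registered handler. The names below are invented for this sketch; Converse's actual C API (e.g. CmiRegisterHandler, CsdScheduler) differs in detail.

        #include <cstdio>
        #include <functional>
        #include <queue>
        #include <utility>
        #include <vector>

        using Handler = std::function<void(void*)>;

        class Scheduler {
            std::vector<Handler> handlers_;            // handler table, index = id
            std::queue<std::pair<int, void*>> msgs_;   // (handlerId, message) queue
        public:
            // Each language module registers its own message handlers.
            int registerHandler(Handler h) {
                handlers_.push_back(std::move(h));
                return static_cast<int>(handlers_.size()) - 1;
            }
            void send(int handlerId, void* msg) { msgs_.push({handlerId, msg}); }
            // Message-driven main loop: dispatch every queued message.
            void run() {
                while (!msgs_.empty()) {
                    auto [id, msg] = msgs_.front();
                    msgs_.pop();
                    handlers_[id](msg);                // hand off to owning paradigm
                }
            }
        };

        int main() {
            Scheduler sched;
            int hello = sched.registerHandler([](void* m) {
                std::printf("handler got: %s\n", static_cast<const char*>(m));
            });
            sched.send(hello, (void*)"hi from another module");
            sched.run();
        }

    Because every message carries its handler id, modules written in different paradigms can share one queue without knowing about each other, which is the essence of the interoperability claim.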

    Avoiding Algorithmic Obfuscation in a Message-Driven Parallel MD

    Parallel molecular dynamics programs employing shared memory or replicated data architectures encounter problems scaling to large numbers of processors. Spatial decomposition schemes offer better performance in theory, but often suffer from complexity of implementation and difficulty in load balancing. In the program NAMD 2, we have addressed these issues with a hybrid decomposition scheme in which atoms are distributed among processors in regularly sized patches while the work involved in computing interactions between patches is decomposed into independently assignable compute objects. When needed, patches are represented on remote processors by proxies. The execution of compute objects takes place in a prioritized message-driven manner, allowing maximum overlap of work and communication without significant programmer effort. In order to avoid obfuscation of the simulation algorithm by the parallel framework, the algorithm associated with a patch is encapsulated by a single function executing in a separate thread. Output and calculations requiring globally reduced quantities are similarly isolated in a single thread executing on the master node. This combination of features allows us to make efficient use of large parallel machines and clusters of multiprocessor workstations while presenting minimal barriers to method development and implementation.
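
    The hybrid decomposition reduces to a compact sketch: atoms live in regularly sized patches, pairwise work is factored into independently assignable compute objects, and computes run in priority order. Every class and field name below is illustrative rather than NAMD's own, and proxies and threads are omitted.

        #include <cstdio>
        #include <queue>
        #include <vector>

        struct Atom { double x, y, z; };

        struct Patch {                        // regular spatial cell of atoms
            int id;
            std::vector<Atom> atoms;          // a proxy would mirror this remotely
        };

        struct Compute {                      // independently assignable work unit
            int priority;                     // e.g. off-node interactions first
            const Patch* a;
            const Patch* b;
            void run() const {
                std::printf("computing %zu x %zu interactions (patches %d, %d)\n",
                            a->atoms.size(), b->atoms.size(), a->id, b->id);
            }
        };

        struct ByPriority {                   // highest priority executes first
            bool operator()(const Compute* x, const Compute* y) const {
                return x->priority < y->priority;
            }
        };

        int main() {
            Patch p0{0, std::vector<Atom>(3)};
            Patch p1{1, std::vector<Atom>(5)};
            Compute inter{/*priority=*/2, &p0, &p1};  // inter-patch interaction
            Compute intra{/*priority=*/1, &p0, &p0};  // intra-patch interaction
            std::priority_queue<const Compute*, std::vector<const Compute*>, ByPriority> q;
            q.push(&inter);
            q.push(&intra);
            while (!q.empty()) {              // prioritized execution order
                q.top()->run();
                q.pop();
            }
        }

    Separating patches (data) from computes (work) is what lets a load balancer reassign compute objects without touching the simulation algorithm encapsulated in each patch.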