Hierarchical fractional-step approximations and parallel kinetic Monte Carlo algorithms
We present a mathematical framework for constructing and analyzing parallel
algorithms for lattice Kinetic Monte Carlo (KMC) simulations. The resulting
algorithms have the capacity to simulate a wide range of spatio-temporal scales
in spatially distributed, non-equilibrium physicochemical processes with complex
chemistry and transport micro-mechanisms. The algorithms can be tailored to
specific hierarchical parallel architectures such as multi-core processors or
clusters of Graphics Processing Units (GPUs). The proposed parallel algorithms
are controlled-error approximations of kinetic Monte Carlo algorithms,
departing from the predominant paradigm of creating parallel KMC algorithms
with exactly the same master equation as the serial one.
Our methodology relies on a spatial decomposition of the Markov operator
underlying the KMC algorithm into a hierarchy of operators corresponding to the
processors' structure in the parallel architecture. Based on this operator
decomposition, we formulate Fractional Step Approximation schemes by employing
the Trotter Theorem and its random variants; these schemes (a) determine the
communication schedule between processors, and (b) are run independently on
each processor through a serial KMC simulation, called a kernel, on each
fractional step time-window.
Furthermore, the proposed mathematical framework allows us to rigorously
justify the numerical and statistical consistency of the proposed algorithms,
showing the convergence of our approximating schemes to the original serial
KMC. The approach also provides a systematic evaluation of different processor
communication schedules.
Comment: 34 pages, 9 figures
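The fractional-step idea in the abstract above can be illustrated numerically. The following is a minimal sketch, not the authors' scheme: it assumes two small, hypothetical generator matrices `L1` and `L2` (standing in for the operators assigned to two processors) and checks that the Lie-Trotter product of the two separate evolutions approaches the evolution under `L1 + L2` as the number of fractional steps `n` grows. All names and values are illustrative assumptions.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def scale(a, s):
    return [[s * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_exp(a, terms=40):
    """Truncated Taylor series for exp(a); adequate for the small matrices here."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(mat_mul(term, a), 1.0 / k)  # term is now a^k / k!
        result = add(result, term)
    return result

def mat_pow(a, p):
    result = identity(len(a))
    for _ in range(p):
        result = mat_mul(result, a)
    return result

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical, non-commuting generators on three states (each row sums to zero).
L1 = [[-1.0, 1.0, 0.0], [2.0, -2.0, 0.0], [0.0, 0.0, 0.0]]
L2 = [[0.0, 0.0, 0.0], [0.0, -3.0, 3.0], [0.0, 1.0, -1.0]]

def lie_trotter_error(t, n):
    """Distance between exp(t(L1+L2)) and (exp(t/n L1) exp(t/n L2))^n."""
    exact = mat_exp(scale(add(L1, L2), t))
    step = mat_mul(mat_exp(scale(L1, t / n)), mat_exp(scale(L2, t / n)))
    return max_abs_diff(exact, mat_pow(step, n))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, lie_trotter_error(0.5, n))
```

In this toy setting the error shrinks as the fractional step `t / n` gets smaller, which is the sense in which such a scheme is a controlled-error approximation rather than an exact reproduction of the serial dynamics.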
Multi-Architecture Monte-Carlo (MC) Simulation of Soft Coarse-Grained Polymeric Materials: SOft coarse grained Monte-carlo Acceleration (SOMA)
Multi-component polymer systems are important for the development of new
materials because of their ability to phase-separate or self-assemble into
nano-structures. The Single-Chain-in-Mean-Field (SCMF) algorithm in conjunction
with a soft, coarse-grained polymer model is an established technique to
investigate these soft-matter systems. Here we present an implementation of
this method: SOft coarse grained Monte-carlo Acceleration (SOMA). It is
suitable to simulate large system sizes with up to billions of particles, yet
versatile enough to study properties of different kinds of molecular
architectures and interactions. We achieve efficient simulations by
employing accelerators like GPUs on both workstations and
supercomputers. The implementation remains flexible and maintainable because
it is written in a scientific programming language enhanced by
OpenACC pragmas for the accelerators. We present implementation details and
features of the program package, investigate the scalability of our
implementation SOMA, and discuss two applications, which cover system sizes
that are difficult to reach with other, common particle-based simulation
methods.
PPF - A Parallel Particle Filtering Library
We present the parallel particle filtering (PPF) software library, which
enables hybrid shared-memory/distributed-memory parallelization of particle
filtering (PF) algorithms combining the Message Passing Interface (MPI) with
multithreading for multi-level parallelism. The library is implemented in Java
and relies on OpenMPI's Java bindings for inter-process communication. It
includes dynamic load balancing, multi-thread balancing, and several
algorithmic improvements for PF, such as input-space domain decomposition. The
PPF library hides the difficulties of efficient parallel programming of PF
algorithms and provides application developers with the necessary tools for
parallel implementation of PF methods. We demonstrate the capabilities of the
PPF library using two distributed PF algorithms in two scenarios with different
numbers of particles. The PPF library runs a 38 million particle problem,
corresponding to more than 1.86 GB of particle data, on 192 cores with 67%
parallel efficiency. To the best of our knowledge, the PPF library is the first
open-source software that offers a parallel framework for PF applications.
Comment: 8 pages, 8 figures; will appear in the proceedings of the IET Data
Fusion & Target Tracking Conference 201
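The performance figures quoted in the PPF abstract can be unpacked with a little arithmetic. The sketch below assumes the usual definition of parallel efficiency (speedup divided by core count) and takes 1 GB as 10^9 bytes; neither convention is stated in the abstract, and the function names are illustrative.

```python
def implied_speedup(parallel_efficiency, cores):
    """Speedup implied by efficiency = speedup / cores (assumed definition)."""
    return parallel_efficiency * cores

def bytes_per_particle(total_bytes, particles):
    """Average storage per particle for a given total data volume."""
    return total_bytes / particles

if __name__ == "__main__":
    # Figures from the abstract: 67% efficiency on 192 cores,
    # 38 million particles, more than 1.86 GB of particle data.
    print(implied_speedup(0.67, 192))          # roughly 129
    print(bytes_per_particle(1.86e9, 38e6))    # roughly 49 bytes (a lower bound)
```

Under these assumptions, 67% efficiency on 192 cores corresponds to a speedup of roughly 129 over a single core, and each particle carries at least about 49 bytes of state.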
Parallelizing RRT on large-scale distributed-memory architectures
This paper addresses the problem of parallelizing the Rapidly-exploring Random Tree (RRT) algorithm on large-scale distributed-memory architectures, using the Message Passing Interface. We compare three parallel versions of RRT based on classical parallelization schemes. We evaluate them on different motion planning problems and analyze the various factors influencing their performance.
Computational Physics on Graphics Processing Units
The use of graphics processing units for scientific computations is an
emerging strategy that can significantly speed up a variety of algorithms.
In this review, we discuss advances made in the field of computational physics,
focusing on classical molecular dynamics, and on quantum simulations for
electronic structure calculations using the density functional theory, wave
function techniques, and quantum field theory.
Comment: Proceedings of the 11th International Conference, PARA 2012,
Helsinki, Finland, June 10-13, 201
Optimizing molecular dynamics simulations with product lines
This paper presents a case study of using product lines to address the variability of optimization methods and target platform mappings in high-performance molecular dynamics simulations. We use Feature Oriented Programming to incrementally extend the base algorithm by composing performance enhancement features with the core functionality. The developed features encapsulate common optimization methods in molecular dynamics simulations and target platform mappings. The main benefits of the approach are: 1) it promotes incremental development, where optimizations and mappings are developed incrementally and simultaneously with the core functionality; 2) complex optimizations and mappings can be obtained by composing basic features. The performance of synthesized products is comparable to the performance of products developed with traditional parallel programming techniques. In this approach complex optimizations become easier to develop, by composing basic features, providing a performance advantage over traditional programming techniques.
Scalable and fast heterogeneous molecular simulation with predictive parallelization schemes
Multiscale and inhomogeneous molecular systems are challenging topics in the
field of molecular simulation. In particular, modeling biological systems in
the context of multiscale simulations and exploring material properties are
driving a permanent development of new simulation methods and optimization
algorithms. In computational terms, those methods require parallelization
schemes that make productive use of computational resources for each
simulation from the outset. Here, we introduce the heterogeneous domain
decomposition approach, which combines a heterogeneity-sensitive
spatial domain decomposition with an a priori rearrangement of
subdomain-walls. Within this approach, the theoretical modeling and
scaling-laws for the force computation time are proposed and studied as a
function of the number of particles and the spatial resolution ratio. We also
demonstrate the capabilities of the new approach by comparing it to both static domain
decomposition algorithms and dynamic load balancing schemes. Specifically, two
representative molecular systems have been simulated and compared to the
heterogeneous domain decomposition proposed in this work. These two systems
comprise an adaptive resolution simulation of a biomolecule solvated in water
and a phase separated binary Lennard-Jones fluid.
Comment: 14 pages, 12 figures
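The idea of rearranging subdomain walls before the run, based on where the work is concentrated, can be sketched in one dimension. The example below is not the method of the paper: it substitutes a simple cumulative-cost partition, and the per-cell cost values (high-resolution cells costing four times as much as coarse ones) are hypothetical.

```python
from itertools import accumulate

def place_walls(cell_costs, n_domains):
    """Return the exclusive end index of each subdomain along one axis.

    Walls are placed where the cumulative cost crosses k/n_domains of the
    total, so each subdomain carries a similar estimated workload. A single
    very expensive cell can still produce an empty subdomain in this sketch.
    """
    total = sum(cell_costs)
    walls = []
    next_share = 1
    for i, running in enumerate(accumulate(cell_costs)):
        while next_share < n_domains and running >= total * next_share / n_domains:
            walls.append(i + 1)
            next_share += 1
    walls.append(len(cell_costs))
    return walls

def domain_costs(cell_costs, walls):
    """Total estimated cost inside each subdomain delimited by `walls`."""
    starts = [0] + walls[:-1]
    return [sum(cell_costs[s:e]) for s, e in zip(starts, walls)]

if __name__ == "__main__":
    # Hypothetical cost profile: a high-resolution region in the middle.
    costs = [1] * 8 + [4] * 4 + [1] * 8
    walls = place_walls(costs, 4)
    print(walls, domain_costs(costs, walls))
    # Uniform walls every 5 cells, for comparison.
    print(domain_costs(costs, [5, 10, 15, 20]))
```

With this profile, the cost-aware walls give each of the four subdomains the same estimated cost, whereas equally spaced walls leave the two middle subdomains with roughly twice the work of the outer ones.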