6,203 research outputs found
Scaling soft matter physics to thousands of graphics processing units in parallel
We describe a multi-graphics processing unit (GPU) implementation of the Ludwig application, which specialises in simulating a variety of complex fluids via lattice Boltzmann fluid dynamics coupled to additional physics describing complex fluid constituents. We describe our methodology in augmenting the original central processing unit (CPU) version with GPU functionality in a maintainable fashion. We present several optimisations that maximise performance on the GPU architecture through tuning for the GPU memory hierarchy. We describe how we implement particles within the fluid in such a way as to avoid a major diversion of the CPU and GPU codebases, whilst minimising data transfer at each time step. We detail our halo-exchange communication phase for the code, which exploits overlapping to allow efficient parallel scaling to many GPUs. We present results showing that the application demonstrates excellent scaling to at least 8192 GPUs in parallel, the largest system tested at the time of writing. The GPU version (on NVIDIA K20X GPUs) is around 3.5-5 times faster than the CPU version (on fully utilised AMD Opteron 6274 16-core CPUs), comparing equal numbers of CPUs and GPUs.
Computational Physics on Graphics Processing Units
The use of graphics processing units for scientific computations is an
emerging strategy that can significantly speed up various different algorithms.
In this review, we discuss advances made in the field of computational physics,
focusing on classical molecular dynamics, and on quantum simulations for
electronic structure calculations using the density functional theory, wave
function techniques, and quantum field theory. Comment: Proceedings of the 11th International Conference, PARA 2012,
Helsinki, Finland, June 10-13, 201
Harvesting graphics power for MD simulations
We discuss an implementation of molecular dynamics (MD) simulations on a
graphics processing unit (GPU) in the NVIDIA CUDA language. We tested our code
on a modern GPU, the NVIDIA GeForce 8800 GTX. Results for two MD algorithms
suitable for short-ranged and long-ranged interactions, and a congruential
shift random number generator are presented. The performance of the GPU is
compared to that of its main-processor counterpart. We achieve speedups of up to 80,
40 and 150 fold, respectively. With the newest generation of GPUs one can run
standard MD simulations at 10^7 flops/$. Comment: 12 pages, 5 figures. Submitted to Mol. Si
Strong scaling of general-purpose molecular dynamics simulations on GPUs
We describe a highly optimized implementation of MPI domain decomposition in
a GPU-enabled, general-purpose molecular dynamics code, HOOMD-blue (Anderson
and Glotzer, arXiv:1308.5587). Our approach is inspired by a traditional
CPU-based code, LAMMPS (Plimpton, J. Comp. Phys. 117, 1995), but is implemented
within a code that was designed for execution on GPUs from the start (Anderson
et al., J. Comp. Phys. 227, 2008). The software supports short-ranged pair
force and bond force fields and achieves optimal GPU performance using an
autotuning algorithm. We are able to demonstrate equivalent or superior scaling
on up to 3,375 GPUs in Lennard-Jones and dissipative particle dynamics (DPD)
simulations of up to 108 million particles. GPUDirect RDMA capabilities in
recent GPU generations provide better performance in full double precision
calculations. For a representative polymer physics application, HOOMD-blue 1.0
provides an effective GPU vs. CPU node speed-up of 12.5x. Comment: 30 pages, 14 figure
Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond
In this and a set of companion whitepapers, the USQCD Collaboration lays out
a program of science and computing for lattice gauge theory. These whitepapers
describe how calculation using lattice QCD (and other gauge theories) can aid
the interpretation of ongoing and upcoming experiments in particle and nuclear
physics, as well as inspire new ones. Comment: 44 pages. 1 of USQCD whitepapers
Fast, Scalable, and Interactive Software for Landau-de Gennes Numerical Modeling of Nematic Topological Defects
Numerical modeling of nematic liquid crystals using the tensorial Landau-de
Gennes (LdG) theory provides detailed insights into the structure and
energetics of the enormous variety of possible topological defect
configurations that may arise when the liquid crystal is in contact with
colloidal inclusions or structured boundaries. However, these methods can be
computationally expensive, making it challenging to predict (meta)stable
configurations involving several colloidal particles, and they are often
restricted to system sizes well below the experimental scale. Here we present
an open-source software package that exploits the embarrassingly parallel
structure of the lattice discretization of the LdG approach. Our
implementation, combining CUDA/C++ and OpenMPI, allows users to accelerate
simulations using both CPU and GPU resources in either single- or multiple-core
configurations. We make use of an efficient minimization algorithm, the Fast
Inertial Relaxation Engine (FIRE) method, that is well-suited to large-scale
parallelization, requiring little additional memory or computational cost while
offering performance competitive with other commonly used methods. In
multi-core operation we are able to scale simulations up to supra-micron length
scales of experimental relevance, and in single-core operation the simulation
package includes a user-friendly GUI environment for rapid prototyping of
interfacial features and the multifarious defect states they can promote. To
demonstrate this software package, we examine in detail the competition between
curvilinear disclinations and point-like hedgehog defects as size scale,
material properties, and geometric features are varied. We also study the
effects of an interface patterned with an array of topological point-defects. Comment: 16 pages, 6 figures, 1 youtube link. The full catastroph
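
The abstract above names the Fast Inertial Relaxation Engine (FIRE) as its minimizer without spelling out the update rule. The following is a minimal sketch of a common FIRE variant, applied to a toy two-variable quadratic rather than a Landau-de Gennes free energy. The function name `fire_minimize`, the parameter defaults, the unit mass, and the semi-implicit Euler integrator are all assumptions made for illustration; they are not taken from the software package described in the abstract.

```python
import math


def fire_minimize(grad, x0, dt=0.01, dt_max=0.1, n_min=5,
                  f_inc=1.1, f_dec=0.5, alpha0=0.1, f_alpha=0.99,
                  tol=1e-8, max_steps=100000):
    """Minimize a function given its gradient, using a FIRE-style scheme.

    Returns (x, steps). All parameter defaults are illustrative assumptions.
    """
    x = list(x0)
    v = [0.0] * len(x)          # fictitious velocities, unit mass assumed
    alpha = alpha0              # velocity/force mixing parameter
    steps_since_reset = 0
    for step in range(max_steps):
        f = [-g for g in grad(x)]                      # force = -gradient
        f_norm = math.sqrt(sum(fi * fi for fi in f))
        if f_norm < tol:
            return x, step                             # converged
        power = sum(fi * vi for fi, vi in zip(f, v))   # P = F . v
        if power > 0.0:
            # Moving downhill: steer the velocity towards the force direction.
            v_norm = math.sqrt(sum(vi * vi for vi in v))
            v = [(1.0 - alpha) * vi + alpha * v_norm * fi / f_norm
                 for vi, fi in zip(v, f)]
            steps_since_reset += 1
            if steps_since_reset > n_min:
                dt = min(dt * f_inc, dt_max)           # accelerate
                alpha *= f_alpha
        else:
            # Moving uphill (or at rest): stop, shrink the time step, reset mixing.
            v = [0.0] * len(x)
            dt *= f_dec
            alpha = alpha0
            steps_since_reset = 0
        # Semi-implicit Euler step: update velocity first, then position.
        v = [vi + dt * fi for vi, fi in zip(v, f)]
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x, max_steps


def toy_grad(p):
    """Gradient of the toy energy (x - 1)^2 + 10 (y + 2)^2."""
    return [2.0 * (p[0] - 1.0), 20.0 * (p[1] + 2.0)]


x_min, n_steps = fire_minimize(toy_grad, [5.0, 3.0])
print(x_min, n_steps)
```

The scheme stores only one velocity per degree of freedom beyond the positions and forces, and each degree of freedom is updated independently apart from the two dot products, which is consistent with the abstract's remark that the method needs little additional memory and suits large-scale parallelization.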