A Customized 3D GPU Poisson Solver for Free Boundary Conditions
A 3-dimensional GPU Poisson solver is developed for all possible combinations
of free and periodic boundary conditions (BCs) along the three directions. It
is benchmarked for various grid sizes and different BCs and a significant
performance gain is observed for problems including one or more free BCs. The
GPU Poisson solver is also benchmarked against two different CPU
implementations of the same method, and the GPU version is significantly
faster.
Comment: 10 pages, 5 figure
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
In cryo-electron microscopy (EM), molecular structures are determined from
large numbers of projection images of individual particles. To harness the full
power of this single-molecule information, we use the Bayesian inference of EM
(BioEM) formalism. By ranking structural models using posterior probabilities
calculated for individual images, BioEM in principle addresses the challenge of
working with highly dynamic or heterogeneous systems not easily handled in
traditional EM reconstruction. However, the calculation of these posteriors for
large numbers of particles and models is computationally demanding. Here we
present highly parallelized, GPU-accelerated computer software that performs
this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI
parallelization combined with both CPU and GPU computing. The resulting BioEM
software scales nearly ideally both on pure CPU and on CPU+GPU architectures,
thus enabling Bayesian analysis of tens of thousands of images in a reasonable
time. The general mathematical framework and robust algorithms are not limited
to cryo-electron microscopy but can be generalized for electron tomography and
other imaging experiments.
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS
GROMACS is a widely used package for biomolecular simulation, and over the
last two decades it has evolved from small-scale efficiency to advanced
heterogeneous acceleration and multi-level parallelism targeting some of the
largest supercomputers in the world. Here, we describe some of the ways we have
been able to realize this through the use of parallelization on all levels,
combined with a constant focus on absolute performance. Release 4.6 of GROMACS
uses SIMD acceleration on a wide range of architectures, GPU offloading
acceleration, and both OpenMP and MPI parallelism within and between nodes,
respectively. The recent work on acceleration made it necessary to revisit the
fundamental algorithms of molecular simulation, including the concept of
neighbor searching, and we discuss the present and future challenges we see for
exascale simulation, in particular very fine-grained task parallelism. We
also discuss the software management, code peer review and continuous
integration testing required for a project of this complexity.
Comment: EASC 2014 conference proceedin
A GPU based real-time software correlation system for the Murchison Widefield Array prototype
Modern graphics processing units (GPUs) are inexpensive commodity hardware
that offer Tflop/s theoretical computing capacity. GPUs are well suited to many
compute-intensive tasks including digital signal processing.
We describe the implementation and performance of a GPU-based digital
correlator for radio astronomy. The correlator is implemented using the NVIDIA
CUDA development environment. We evaluate three design options on two
generations of NVIDIA hardware. The different designs utilize the internal
registers, shared memory and multiprocessors in different ways. We find that
optimal performance is achieved with the design that minimizes global memory
reads on recent generations of hardware.
The GPU-based correlator outperforms a single-threaded CPU equivalent by a
factor of 60 for a 32-antenna array, and runs on commodity PC hardware. The
extra compute capability provided by the GPU maximises the correlation
capability of a PC while retaining the fast development time associated with
using standard hardware, networking and programming languages. In this way, a
GPU-based correlation system represents a middle ground in design space between
high-performance, custom-built hardware and pure CPU-based software
correlation.
The correlator was deployed at the Murchison Widefield Array 32-antenna
prototype system, where it ran in real time for extended periods. We briefly
describe the data capture, streaming and correlation system for the prototype
array.
Comment: 11 pages, to appear in PAS
Computational Physics on Graphics Processing Units
The use of graphics processing units for scientific computations is an
emerging strategy that can significantly speed up a variety of algorithms.
In this review, we discuss advances made in the field of computational physics,
focusing on classical molecular dynamics and on quantum simulations for
electronic structure calculations using density functional theory, wave
function techniques, and quantum field theory.
Comment: Proceedings of the 11th International Conference, PARA 2012,
Helsinki, Finland, June 10-13, 201