920 research outputs found
Harvesting graphics power for MD simulations
We discuss an implementation of molecular dynamics (MD) simulations on a
graphic processing unit (GPU) in the NVIDIA CUDA language. We tested our code
on a modern GPU, the NVIDIA GeForce 8800 GTX. Results for two MD algorithms
suitable for short-ranged and long-ranged interactions, and a congruential
shift random number generator are presented. The performance of the GPU's is
compared to their main processor counterpart. We achieve speedups of up to 80,
40 and 150 fold, respectively. With newest generation of GPU's one can run
standard MD simulations at 10^7 flops/$.Comment: 12 pages, 5 figures. Submitted to Mol. Si
Simulating spin models on GPU
Over the last couple of years it has been realized that the vast
computational power of graphics processing units (GPUs) could be harvested for
purposes other than the video game industry. This power, which at least
nominally exceeds that of current CPUs by large factors, results from the
relative simplicity of the GPU architectures as compared to CPUs, combined with
a large number of parallel processing units on a single chip. To benefit from
this setup for general computing purposes, the problems at hand need to be
prepared in a way to profit from the inherent parallelism and hierarchical
structure of memory accesses. In this contribution I discuss the performance
potential for simulating spin models, such as the Ising model, on GPU as
compared to conventional simulations on CPU.Comment: 5 pages, 4 figures, elsarticl
GPU in Physics Computation: Case Geant4 Navigation
General purpose computing on graphic processing units (GPU) is a potential
method of speeding up scientific computation with low cost and high energy
efficiency. We experimented with the particle physics simulation toolkit Geant4
used at CERN to benchmark its geometry navigation functionality on a GPU. The
goal was to find out whether Geant4 physics simulations could benefit from GPU
acceleration and how difficult it is to modify Geant4 code to run in a GPU.
We ported selected parts of Geant4 code to C99 & CUDA and implemented a
simple gamma physics simulation utilizing this code to measure efficiency. The
performance of the program was tested by running it on two different platforms:
NVIDIA GeForce 470 GTX GPU and a 12-core AMD CPU system. Our conclusion was
that GPUs can be a competitive alternate for multi-core computers but porting
existing software in an efficient way is challenging
Brian2GeNN: accelerating spiking neural network simulations with graphics hardware
âBrianâ is a popular Python-based simulator for spiking neural networks, commonly used in computational neuroscience. GeNN is a C++-based meta-compiler for accelerating spiking neural network simulations using consumer or high performance grade graphics processing units (GPUs). Here we introduce a new software package, Brian2GeNN, that connects the two systems so that users can make use of GeNN GPU acceleration when developing their models in Brian, without requiring any technical knowledge about GPUs, C++ or GeNN. The new Brian2GeNN software uses a pipeline of code generation to translate Brian scripts into C++ code that can be used as input to GeNN, and subsequently can be run on suitable NVIDIA GPU accelerators. From the userâs perspective, the entire pipeline is invoked by adding two simple lines to their Brian scripts. We have shown that using Brian2GeNN, two non-trivial models from the literature can run tens to hundreds of times faster than on CPU
Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration.
General-purpose computing on graphics processing units (GPGPU) is shown to dramatically increase the speed of Monte Carlo simulations of photon migration. In a standard simulation of time-resolved photon migration in a semi-infinite geometry, the proposed methodology executed on a low-cost graphics processing unit (GPU) is a factor 1000 faster than simulation performed on a single standard processor. In addition, we address important technical aspects of GPU-based simulations of photon migration. The technique is expected to become a standard method in Monte Carlo simulations of photon migration
Computing Performance Benchmarks among CPU, GPU, and FPGA
In recent years, the world of high performance computing has been developing rapidly. The goal of this project was to conduct computing performance benchmarks on three major computing platforms, CPUs, GPUs, and FPGAs. A total of 66 benchmarks were evaluated. GPUs outperformed the other platforms in terms of execution time. CPUs outperformed in overall execution combined with transfer time. FPGAs outperformed for fixed algorithms using streaming. The team made several recommendations for further research in this area
Parallel Tempering Simulation of the three-dimensional Edwards-Anderson Model with Compact Asynchronous Multispin Coding on GPU
Monte Carlo simulations of the Ising model play an important role in the
field of computational statistical physics, and they have revealed many
properties of the model over the past few decades. However, the effect of
frustration due to random disorder, in particular the possible spin glass
phase, remains a crucial but poorly understood problem. One of the obstacles in
the Monte Carlo simulation of random frustrated systems is their long
relaxation time making an efficient parallel implementation on state-of-the-art
computation platforms highly desirable. The Graphics Processing Unit (GPU) is
such a platform that provides an opportunity to significantly enhance the
computational performance and thus gain new insight into this problem. In this
paper, we present optimization and tuning approaches for the CUDA
implementation of the spin glass simulation on GPUs. We discuss the integration
of various design alternatives, such as GPU kernel construction with minimal
communication, memory tiling, and look-up tables. We present a binary data
format, Compact Asynchronous Multispin Coding (CAMSC), which provides an
additional speedup compared with the traditionally used Asynchronous
Multispin Coding (AMSC). Our overall design sustains a performance of 33.5
picoseconds per spin flip attempt for simulating the three-dimensional
Edwards-Anderson model with parallel tempering, which significantly improves
the performance over existing GPU implementations.Comment: 15 pages, 18 figure
- âŠ