11,977 research outputs found
A Hardware Generator of Multi-point Distributed Random Numbers for Monte Carlo Simulation
Monte Carlo simulation of weak approximations of stochastic differential equations constitutes an intensive computational task. In applications such as finance, for instance, to achieve "real time" execution, as often required, one needs highly efficient implementations of the multi-point distributed random number generator underlying the simulations. In this paper a fast and flexible dedicated hardware solution on a field programmable gate array is presented. A comparative performance analysis between a software-only and the proposed hardware solution demonstrates that the hardware solution is bottleneck-free, retains the flexibility of the software solution and significantly increases the computational efficiency. Moreover, simulations in applications such as economics, insurance, physics, population dynamics, epidemiology, structural mechanics, chemistry and biotechnology can benefit from the obtained speedup.
Keywords: random number generators; random bit generators; hardware implementation; field programmable gate arrays (FPGAs); Monte Carlo simulation; weak Taylor schemes; multi-point distributed random variables
A Hardware Generator of Multi-Point Distributed Random Numbers for Monte Carlo Simulation
Monte Carlo simulation of weak approximation of stochastic differential equations constitutes an intensive computational task. In applications such as finance, for instance, to achieve "real time" execution, as often required, one needs highly efficient implementations of the multi-point distributed random number generator underlying the simulations. In this paper, a fast and flexible dedicated hardware solution on a field programmable gate array is presented. A comparative performance analysis between a software-only and the proposed hardware solution demonstrated that the hardware solution is bottleneck-free, retains the flexibility of the software solution and significantly increases the computational efficiency. Moreover, simulations in applications such as economics, insurance, physics, population dynamics, epidemiology, structural mechanics, chemistry and biotechnology can benefit from the obtained speedup.
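As a rough software illustration of what a multi-point distributed random number generator produces, the sketch below draws from a discrete distribution by inverting its cumulative probabilities. The abstract does not say which distributions the hardware supports; the three-point law used here (values -√3, 0, +√3 with probabilities 1/6, 2/3, 1/6), which appears in weak order 2.0 Taylor schemes, is an assumption chosen for illustration, and the names `THREE_POINT` and `sample_multipoint` are hypothetical.

```python
import math
import random

# Assumed example: three-point law with mean 0 and variance 1.
# Each entry is (value, probability); the probabilities sum to 1.
THREE_POINT = (
    (-math.sqrt(3.0), 1.0 / 6.0),
    (0.0, 2.0 / 3.0),
    (math.sqrt(3.0), 1.0 / 6.0),
)


def sample_multipoint(points, rng):
    """Draw one value from a finite discrete distribution.

    A uniform number in [0, 1) is compared against the running
    cumulative probability; the first point whose cumulative
    probability exceeds it is returned.
    """
    u = rng.random()
    cumulative = 0.0
    for value, prob in points:
        cumulative += prob
        if u < cumulative:
            return value
    # Guard against floating-point round-off in the cumulative sum.
    return points[-1][0]


if __name__ == "__main__":
    rng = random.Random(2024)
    draws = [sample_multipoint(THREE_POINT, rng) for _ in range(10)]
    print(draws)
```

A hardware generator would typically work from raw random bits rather than floating-point uniforms; this version only shows the mapping from a uniform source to the target points.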
GPU in Physics Computation: Case Geant4 Navigation
General-purpose computing on graphics processing units (GPUs) is a potential
method of speeding up scientific computation with low cost and high energy
efficiency. We experimented with the particle physics simulation toolkit Geant4
used at CERN to benchmark its geometry navigation functionality on a GPU. The
goal was to find out whether Geant4 physics simulations could benefit from GPU
acceleration and how difficult it is to modify Geant4 code to run on a GPU.
We ported selected parts of Geant4 code to C99 & CUDA and implemented a
simple gamma physics simulation utilizing this code to measure efficiency. The
performance of the program was tested by running it on two different platforms:
an NVIDIA GeForce 470 GTX GPU and a 12-core AMD CPU system. Our conclusion was
that GPUs can be a competitive alternative to multi-core computers, but porting
existing software in an efficient way is challenging.
QCDGPU: open-source package for Monte Carlo lattice simulations on OpenCL-compatible multi-GPU systems
The multi-GPU open-source package QCDGPU for lattice Monte Carlo simulations
of pure SU(N) gluodynamics in an external magnetic field at finite temperature and
the O(N) model is developed. The code is implemented in OpenCL, tested on AMD and
NVIDIA GPUs, AMD and Intel CPUs and may run on other OpenCL-compatible devices.
The package contains minimal external library dependencies and is OS
platform-independent. It is optimized for heterogeneous computing due to the
possibility of dividing the lattice into non-equivalent parts to hide the
difference in performances of the devices used. QCDGPU has a client-server part
for distributed simulations. The package is designed to produce lattice gauge
configurations as well as to analyze previously generated ones. QCDGPU may be
executed in fault-tolerant mode. The Monte Carlo procedure core is based on the PRNGCL
library for pseudo-random number generation on OpenCL-compatible devices,
which contains several of the most popular pseudo-random number generators.
Comment: Presented at the Third International Conference "High Performance
Computing" (HPC-UA 2013), Kyiv, Ukraine; 9 pages, 2 figures
Effective Monte Carlo simulation on System-V massively parallel associative string processing architecture
We show that the latest version of massively parallel processing associative
string processing architecture (System-V) is applicable for fast Monte Carlo
simulation if an effective on-processor random number generator is implemented.
Our lagged Fibonacci generator can produce random numbers on a processor
string of 12K PE-s. The time dependent Monte Carlo algorithm of the
one-dimensional non-equilibrium kinetic Ising model performs 80 times faster than the
corresponding serial algorithm on a 300 MHz UltraSparc.
Comment: 8 pages, 9 color ps figures embedded
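For readers unfamiliar with the generator family named in this abstract, a minimal serial sketch of an additive lagged Fibonacci generator follows. The abstract gives no parameters, so the lags (24, 55), the modulus 2^32, and the linear congruential seeding are assumptions made for illustration, and the class name `LaggedFibonacci` is hypothetical. It does not model the parallel on-processor implementation the authors describe.

```python
from collections import deque


class LaggedFibonacci:
    """Additive lagged Fibonacci generator:
    x[n] = (x[n - short_lag] + x[n - long_lag]) mod modulus.

    The lags, modulus, and seeding procedure are illustrative
    assumptions, not values taken from the paper.
    """

    def __init__(self, seed=12345, short_lag=24, long_lag=55, modulus=2**32):
        self.short_lag = short_lag
        self.modulus = modulus
        # Fill the initial state with a simple linear congruential
        # generator so the lag table is not all zeros.
        x = seed
        initial = []
        for _ in range(long_lag):
            x = (1664525 * x + 1013904223) % 2**32
            initial.append(x % modulus)
        # The deque always holds the last long_lag outputs, oldest first.
        self.state = deque(initial, maxlen=long_lag)

    def next(self):
        # state[0] is x[n - long_lag]; state[-short_lag] is x[n - short_lag].
        value = (self.state[0] + self.state[-self.short_lag]) % self.modulus
        self.state.append(value)  # maxlen drops the oldest entry
        return value


if __name__ == "__main__":
    gen = LaggedFibonacci(seed=1)
    print([gen.next() for _ in range(5)])
```

Each new value depends only on two earlier entries of a fixed-size table, which is one reason this family of generators is often considered for architectures with many simple processing elements.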
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some application paradigms. Software languages and systems such as NVIDIA's
CUDA and Khronos consortium's open compute language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many core GPUs for data parallelism is necessary. We describe our use of hybrid
applications using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the applications
level programmer and offer some suggested areas for language development and integration
between coarse-grained and fine-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas including: partial differential equations; graph cluster
metric calculations and random number generation. We report on programming experiences
and selected performance for these algorithms on: single and multiple GPUs; multi-core CPUs;
a CellBE; and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scientific applications developers.
JANUS: an FPGA-based System for High Performance Scientific Computing
This paper describes JANUS, a modular massively parallel and reconfigurable
FPGA-based computing system. Each JANUS module has a computational core and a
host. The computational core is a 4x4 array of FPGA-based processing elements
with nearest-neighbor data links. Processors are also directly connected to an
I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for,
but not limited to, the requirements of a class of hard scientific applications
characterized by regular code structure, unconventional data manipulation
instructions and not too large data-base size. We discuss the architecture of
this configurable machine, and focus on its use on Monte Carlo simulations of
statistical mechanics. On this class of application JANUS achieves impressive
performances: in some cases one JANUS processing element outperforms high-end
PCs by a factor ~ 1000. We also discuss the role of JANUS on other classes of
scientific applications.
Comment: 11 pages, 6 figures. Improved version, largely rewritten, submitted
to Computing in Science & Engineering