Search CORE

1,312 research outputs found

Many-Threaded Differential Evolution on the GPU

Author: A. Engelbrecht
A.J. Page
D. Ashlock
D. Fernandez-Baca
H. Izakian
J.G. Proakis
K.V. Price
L. Bertacco
N. Fujimoto
P. Krömer
P. Krömer
R. Martí
R. Storn
S. Ali
S. Roy
T. Schiavinotto
T. Schiavinotto
T.D. Braun
V. Campos
V. Snášel
W. Zhu
W.B. Langdon
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Simulation of reaction-diffusion processes in three dimensions using CUDA

Author: Alexandrov
Anderson
Belleman
Block
Buluc
Castano-Diez
Castets
Che
Costello
Cross
Dabdub
Epstein
Ferenc Izsák
Ferenc Molnár
Ford
Fowler
Garland
Gutiérrez
Horváth
Horváth
Horváth
Huang
István Lagzi
Januszewski
Komatitsch
Komatitsch
Lagzi
Lagzi
Lagzi
Lagzi
Lengyel
Li
Liu
Liu
Lovas
Martin
Melchionna
Micikevicius
Molnár
Nakamasu
NVIDIA Corporation
Owens
Preis
Pápai
Rácz
Róbert Mészáros
Sainio
Sanderson
Sanna
Sanna
Schmidt
Senocak
Shoji
Shoji
Simek
Stone
Stone
Sultan
Volford
Volford
Walsh
Publication venue: 'Elsevier BV'
Publication date: 03/04/2010
Field of study

Numerical solution of reaction-diffusion equations in three dimensions is one of the most challenging applied mathematical problems. Since these simulations are very time consuming, any ideas and strategies aiming at the reduction of CPU time are important topics of research. A general and robust idea is the parallelization of source codes/programs. Recently, the technological development of graphics hardware created a possibility to use desktop video cards to solve numerically intensive problems. We present a powerful parallel computing framework to solve reaction-diffusion equations numerically using the Graphics Processing Units (GPUs) with CUDA. Four different reaction-diffusion problems, (i) diffusion of chemically inert compound, (ii) Turing pattern formation, (iii) phase separation in the wake of a moving diffusion front and (iv) air pollution dispersion were solved, and additionally both the Shared method and the Moving Tiles method were tested. Our results show that parallel implementation achieves typical acceleration values in the order of 5-40 times compared to CPU using a single-threaded implementation on a 2.8 GHz desktop computer.Comment: 8 figures, 5 table

arXiv.org e-Print Archive

Crossref

University of Twente Research Information

Thread-Scalable Evaluation of Multi-Jet Observables

Author: A. Hameren van
C.F. Berger
C.F. Berger
C.F. Berger
C.F. Berger
E. Boos
F. Krauss
F.A. Berends
G. Bevilacqua
G. Bevilacqua
Gerben C. Stavenga
J. Pumplin
Jan Winter
K. Hagiwara
K. Hagiwara
K. Melnikov
K. Melnikov
M.L. Mangano
P. Draggiotis
P.D. Draggiotis
P.D. Draggiotis
R. Frederix
R. Kleiss
R. Kleiss
R.K. Ellis
T. Gleisberg
T. Gleisberg
T. Stelzer
W. Kahan
W.T. Giele
Walter T. Giele
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/02/2010
Field of study

A leading-order, leading-color parton-level event generator is developed for use on a multi-threaded GPU. Speed-up factors between 150 and 300 are obtained compared to an unoptimized CPU-based implementation of the event generator. In this first paper we study the feasibility of a GPU-based event generator with an emphasis on the constraints imposed by the hardware. Some studies of Monte Carlo convergence and accuracy are presented for PP -> 2,...,10 jet observables using of the order of 1e11 events.Comment: 16 pages, 5 figures, 3 table

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Springer - Publisher Connector

Air pollution modelling using a graphics processing unit with CUDA

Author: Lagzi Istvan
Meszaros Robert
Molnar Jr. Ferenc
Szakaly Tamas
Publication venue: 'Elsevier BV'
Publication date: 16/12/2009
Field of study

The Graphics Processing Unit (GPU) is a powerful tool for parallel computing. In the past years the performance and capabilities of GPUs have increased, and the Compute Unified Device Architecture (CUDA) - a parallel computing architecture - has been developed by NVIDIA to utilize this performance in general purpose computations. Here we show for the first time a possible application of GPU for environmental studies serving as a basement for decision making strategies. A stochastic Lagrangian particle model has been developed on CUDA to estimate the transport and the transformation of the radionuclides from a single point source during an accidental release. Our results show that parallel implementation achieves typical acceleration values in the order of 80-120 times compared to CPU using a single-threaded implementation on a 2.33 GHz desktop computer. Only very small differences have been found between the results obtained from GPU and CPU simulations, which are comparable with the effect of stochastic transport phenomena in atmosphere. The relatively high speedup with no additional costs to maintain this parallel architecture could result in a wide usage of GPU for diversified environmental applications in the near future.Comment: 5 figure

arXiv.org e-Print Archive

ELTE Digital Institutional Repository (EDIT)

Parallel Algorithm for Solving Kepler's Equation on Graphics Processing Units: Application to Analysis of Doppler Exoplanet Searches

Author: Belleman
Eric B. Ford
Ford
Ford
Gregory
Harris
Kahan
Portegies Zwart
ter Braak
Publication venue: 'Elsevier BV'
Publication date: 16/12/2008
Field of study

[Abridged] We present the results of a highly parallel Kepler equation solver using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce 280GTX and the "Compute Unified Device Architecture" programming environment. We apply this to evaluate a goodness-of-fit statistic (e.g., chi^2) for Doppler observations of stars potentially harboring multiple planetary companions (assuming negligible planet-planet interactions). We tested multiple implementations using single precision, double precision, pairs of single precision, and mixed precision arithmetic. We find that the vast majority of computations can be performed using single precision arithmetic, with selective use of compensated summation for increased precision. However, standard single precision is not adequate for calculating the mean anomaly from the time of observation and orbital period when evaluating the goodness-of-fit for real planetary systems and observational data sets. Using all double precision, our GPU code outperforms a similar code using a modern CPU by a factor of over 60. Using mixed-precision, our GPU code provides a speed-up factor of over 600, when evaluating N_sys > 1024 models planetary systems each containing N_pl = 4 planets and assuming N_obs = 256 observations of each system. We conclude that modern GPUs also offer a powerful tool for repeatedly evaluating Kepler's equation and a goodness-of-fit statistic for orbital models when presented with a large parameter space.Comment: 19 pages, to appear in New Astronom

arXiv.org e-Print Archive

Crossref

Fat vs. thin threading approach on GPUs: application to stochastic simulation of chemical reactions

Author: Erban R.
Giles M. B.
Klingbeil G.
Maini P. K.
Publication venue
Publication date: 01/01/2010
Field of study

We explore two different threading approaches on a graphics processing unit (GPU) exploiting two different characteristics of the current GPU architecture. The fat thread approach tries to minimise data access time by relying on shared memory and registers potentially sacrificing parallelism. The thin thread approach maximises parallelism and tries to hide access latencies. We apply these two approaches to the parallel stochastic simulation of chemical reaction systems using the stochastic simulation algorithm (SSA) by Gillespie (J. Phys. Chem, Vol. 81, p. 2340-2361, 1977). In these cases, the proposed thin thread approach shows comparable performance while eliminating the limitation of the reaction system’s size

Oxford University Research Archive