Accelerating the Fourier split operator method via graphics processing units

Author: Bauke Heiko
Keitel Christoph H.
Publication venue: 'Elsevier BV'
Publication date: 17/12/2010

Current generations of graphics processing units have turned into highly parallel devices with general computing capabilities. Thus, graphics processing units may be utilized, for example, to solve time-dependent partial differential equations by the Fourier split operator method. In this contribution, we demonstrate that graphics processing units are capable of calculating fast Fourier transforms much more efficiently than traditional central processing units. Thus, graphics processing units make efficient implementations of the Fourier split operator method possible. Performance gains of more than an order of magnitude compared to implementations for traditional central processing units are reached in the solution of the time-dependent Schrödinger equation and the time-dependent Dirac equation.

arXiv.org e-Print Archive

MPG.PuRe

Computational Physics on Graphics Processing Units

Author: A. Asadchev
A. Castro
A. Harju
A. McAdams
A.G. Anderson
A.P. Lyubartsev
A.W. Götz
B.L. Tembre
C. Bonati
C. McNeile
C.M. Isborn
D.J. Hardy
E. Darve
G. Bhanot
G. Egri
G. Kresse
H.J. Rothe
I. Montvay
I. Samish
I. Ufimtsev
I.S. Ufimtsev
J. Enkovaara
J. Gao
J. Hubbard
J.A. Anderson
J.A. McCammon
J.E. Stone
J.S. Meredith
K. Esler
K. Moreland
K. Yasuda
L. Genovese
L. Greengard
L. Gu
L. Ha
M. Bordag
M. Göckeler
M. Hasenbusch
M. Hutchinson
M. Macedonia
M.C. Gutzwiller
M.C. Payne
M.P. Allen
N. Cardoso
N. Goodnight
N. Luehr
N.A. Gumerov
P. Giannozzi
P. Kipfer
P. Petreczky
R. Parr
R.D. Mawhinney
R.D. Skeel
R.G. Belleman
S. Hakala
S. Ihnatsenka
S. Maintz
T. Shirakawa
T. Siro
T. Takahashi
T.W. Chiu
V. Rokhlin
V. Springel
W. Jia
W. Kohn
W.M.C. Foulkes
X. Andrade
Y. Aoki
Y. Chen
Z. Fodor
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The use of graphics processing units for scientific computations is an emerging strategy that can significantly speed up various algorithms. In this review, we discuss advances made in the field of computational physics, focusing on classical molecular dynamics, and on quantum simulations for electronic structure calculations using density functional theory, wave function techniques, and quantum field theory. Comment: Proceedings of the 11th International Conference, PARA 2012, Helsinki, Finland, June 10-13, 201

arXiv.org e-Print Archive

Crossref

Accelerating NBODY6 with Graphics Processing Units

Author: Aarseth
Ahmad
Belleman
Bulirsch
Fukushima
Gaburov
Keigo Nitadori
Kustaanheimo
Makino
Mikkola
Spurzem
Sverre J. Aarseth
Tanikawa
Publication venue: 'Wiley'
Publication date: 01/01/2012

We describe the use of Graphics Processing Units (GPUs) for speeding up the code NBODY6 which is widely used for direct $N$-body simulations. Over the years, the $N^2$ nature of the direct force calculation has proved a barrier for extending the particle number. Following an early introduction of force polynomials and individual time-steps, the calculation cost was first reduced by the introduction of a neighbour scheme. After a decade of GRAPE computers which speeded up the force calculation further, we are now in the era of GPUs where relatively small hardware systems are highly cost-effective. A significant gain in efficiency is achieved by employing the GPU to obtain the so-called regular force which typically involves some 99 percent of the particles, while the remaining local forces are evaluated on the host. However, the latter operation is performed up to 20 times more frequently and may still account for a significant cost. This effort is reduced by parallel SSE/AVX procedures where each interaction term is calculated using mainly single precision. We also discuss further strategies connected with coordinate and velocity prediction required by the integration scheme. This leaves hard binaries and multiple close encounters which are treated by several regularization methods. The present nbody6-GPU code is well balanced for simulations in the particle range $10^4$-$2 \times 10^5$ for a dual GPU system attached to a standard PC. Comment: 8 pages, 3 figures, 2 tables, MNRAS accepted
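
As a rough illustration of why the direct force calculation scales as $N^2$, the following minimal Python sketch sums pairwise gravitational accelerations over all particle pairs. It is not taken from NBODY6: the function name `direct_accelerations`, the softening parameter `eps`, and the choice of units with `G = 1` are assumptions made for this example, and the neighbour scheme, individual time-steps, and GPU offloading described in the abstract are deliberately left out.

```python
import math


def direct_accelerations(pos, mass, G=1.0, eps=0.0):
    """Return the gravitational acceleration on each particle by direct summation.

    pos  : list of [x, y, z] positions
    mass : list of particle masses
    G    : gravitational constant (assumed 1.0, i.e. N-body units)
    eps  : optional softening length (hypothetical parameter for this sketch)
    """
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    # Two nested loops over particles: this is the N^2 cost.
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * mass[j] * dx[k] * inv_r3
    return acc


if __name__ == "__main__":
    # Two unit masses one length unit apart attract each other equally.
    print(direct_accelerations([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [1.0, 1.0]))
```

Each particle's inner loop is independent of the others, which is the property that makes this kind of sum a natural candidate for parallel hardware.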

arXiv.org e-Print Archive

CiteSeerX

Crossref

Improved Parallel Rabin-Karp Algorithm Using Compute Unified Device Architecture

Author: D Xu
DE Knuth
M Gongora-Blandon
N Singla
RM Karp
RS Boyer
RS Chillar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2018

String matching algorithms are among the most widely used algorithms in computer science. Improving the efficiency of the underlying string matching algorithm will greatly increase the efficiency of any application. In recent years, graphics processing units have emerged as highly parallel processors. They outperform the best central processing units in scientific computation power. Combining recent advances in graphics processing units with string matching algorithms makes it possible to speed up the string matching process. In this paper we propose a modified parallel version of the Rabin-Karp algorithm using graphics processing units. Based on that, results of CPU and parallel GPU implementations are compared to evaluate the effect of varying the number of threads, cores, file size, and pattern size. Comment: Information and Communication Technology for Intelligent Systems (ICTIS 2017)
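
For readers unfamiliar with the baseline, the sketch below shows a plain sequential Rabin-Karp search in Python. It is not the modified parallel version proposed in the paper, which the abstract does not describe in detail. The function name `rabin_karp` and the `base` and `mod` defaults are assumptions chosen for this example.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the start index of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    matches = []
    for start in range(n - m + 1):
        # Compare characters only when hashes agree, to rule out collisions.
        if w_hash == p_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start < n - m:
            # Rolling update: drop text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return matches


if __name__ == "__main__":
    print(rabin_karp("abracadabra", "abra"))
```

The rolling hash makes each window update constant-time on a CPU. One common way to parallelize the idea, assumed here rather than taken from the paper, is to split the text into chunks that overlap by the pattern length minus one and search each chunk independently.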

arXiv.org e-Print Archive

Crossref

Large-scale Ferrofluid Simulations on Graphics Processing Units

Author: Denisov S.
Hanggi P.
Lyutyy T. V.
Polyakov A. Yu.
Reva V. V.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013

We present an approach to molecular-dynamics simulations of ferrofluids on graphics processing units (GPUs). Our numerical scheme is based on a GPU-oriented modification of the Barnes-Hut (BH) algorithm designed to increase the parallelism of computations. For an ensemble consisting of one million ferromagnetic particles, the performance of the proposed algorithm on a Tesla M2050 GPU demonstrated a computational-time speed-up of four orders of magnitude compared to the performance of the sequential All-Pairs (AP) algorithm on a single-core CPU, and two orders of magnitude compared to the performance of the optimized AP algorithm on the GPU. The accuracy of the scheme is corroborated by comparing the results of numerical simulations with theoretical predictions.

arXiv.org e-Print Archive

OPUS Augsburg