SU(2) Lattice Gauge Theory Simulations on Fermi GPUs

Bhanot; Clark; Creutz; Creutz; Egri; Engels; Huntley; Kirk; Kovacs; McLerran; Nuno Cardoso; Pedro Bicudo; Press; Shakespeare; Stack


Authors: Bhanot; Clark; Creutz; Creutz; Egri; Engels; Huntley; Kirk; Kovacs; McLerran; Nuno Cardoso; Pedro Bicudo; Press; Shakespeare; Stack
Publication date: 1 January 2011
Publisher: Elsevier BV

Abstract

In this work we explore the performance of CUDA in quenched lattice SU(2) simulations. CUDA (Compute Unified Device Architecture) is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and the CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi) are also presented. To obtain high performance, the code must be optimized for the GPU architecture, i.e., the implementation must exploit the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU(2) lattice gauge configurations, for the mean plaquette, for the Polyakov loop at finite temperature T, and for the Wilson loop. We also present results for the potential using many configurations (50 000) without smearing and almost 2 000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed of one CPU in single precision, around 110 Gflop/s. We also find that, on the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.

Comment: 20 pages, 11 figures, 3 tables, accepted in Journal of Computational Physics
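The abstract lists the mean plaquette among the measured observables. As a rough illustration of what that quantity is, the following is a minimal CPU sketch in plain Python, not the authors' CUDA code. It assumes the standard parametrization of an SU(2) matrix by a unit 4-vector (a0, a1, a2, a3), with U = a0·1 + i(a1σ1 + a2σ2 + a3σ3), a periodic four-dimensional lattice, and the normalization (1/2) Tr so that a cold (all-identity) configuration gives 1. All function names (su2_mul, make_lattice, mean_plaquette, and so on) are hypothetical and chosen only for this example.

```python
import itertools
import math
import random


def su2_mul(a, b):
    """Product of two SU(2) elements stored as unit 4-vectors (a0, a1, a2, a3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + b0 * a1 - (a2 * b3 - a3 * b2),
        a0 * b2 + b0 * a2 - (a3 * b1 - a1 * b3),
        a0 * b3 + b0 * a3 - (a1 * b2 - a2 * b1),
    )


def su2_dagger(a):
    """Hermitian conjugate: flip the sign of the three sigma components."""
    return (a[0], -a[1], -a[2], -a[3])


def random_su2(rng):
    """Random SU(2) element: a normalized 4-vector of Gaussian numbers."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def all_sites(dims):
    """Iterate over every site of a lattice with extents dims."""
    return itertools.product(*(range(n) for n in dims))


def shift(x, mu, dims):
    """Neighbour of site x one step in direction mu (periodic boundaries)."""
    y = list(x)
    y[mu] = (y[mu] + 1) % dims[mu]
    return tuple(y)


def make_lattice(dims, link_factory):
    """Dictionary (site, direction) -> link, filled by calling link_factory()."""
    return {(x, mu): link_factory() for x in all_sites(dims) for mu in range(4)}


def mean_plaquette(links, dims):
    """Average of (1/2) Tr of the plaquette over all sites and all 6 planes."""
    total = 0.0
    count = 0
    for x in all_sites(dims):
        for mu in range(4):
            for nu in range(mu + 1, 4):
                # U_mu(x) U_nu(x+mu) U_mu(x+nu)^dagger U_nu(x)^dagger
                p = su2_mul(links[(x, mu)], links[(shift(x, mu, dims), nu)])
                p = su2_mul(p, su2_dagger(links[(shift(x, nu, dims), mu)]))
                p = su2_mul(p, su2_dagger(links[(x, nu)]))
                total += p[0]  # (1/2) Tr U equals a0 in this parametrization
                count += 1
    return total / count


if __name__ == "__main__":
    dims = (4, 4, 4, 4)
    rng = random.Random(1)
    cold = make_lattice(dims, lambda: (1.0, 0.0, 0.0, 0.0))
    hot = make_lattice(dims, lambda: random_su2(rng))
    print("cold start:", mean_plaquette(cold, dims))
    print("hot start: ", mean_plaquette(hot, dims))
```

Storing each link as four real numbers instead of a 2×2 complex matrix halves the memory traffic per link, which is the kind of consideration that matters when the same loop is moved to a GPU; whether the paper's code uses exactly this layout is not stated in the abstract.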

Available Versions

CiteSeerX

oai:CiteSeerX.psu:10.1.1.751.8...

Last time updated on 30/10/2017

Crossref

info:doi/10.1016%2Fj.jcp.2011....

Last time updated on 01/04/2019