Multi-mass solvers for lattice QCD on GPUs
Graphics Processing Units (GPUs) are increasingly used for lattice QCD calculations. Lattice studies often require computing the quark propagators for several masses. These systems can be solved using multi-shift inverters, but those algorithms are memory intensive, which limits the size of the problem that can be solved on GPUs. In this paper, we show how to efficiently use a memory-lean single-mass inverter to solve multi-mass problems. We focus on the BiCGstab algorithm for Wilson fermions and show that the single-mass inverter not only requires less memory but also outperforms the multi-shift variant by a factor of two.
Comment: 27 pages, 6 figures, 3 tables
Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond
In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculations using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones.
Comment: 44 pages. 1 of the USQCD whitepapers
QCD simulations with staggered fermions on GPUs
We report on our implementation of the RHMC algorithm for the simulation of lattice QCD with two staggered flavors on Graphics Processing Units, using the NVIDIA CUDA programming language. The main feature of our code is that the GPU is not used merely as an accelerator: the whole Molecular Dynamics trajectory is performed on it. After pointing out the main bottlenecks and how to circumvent them, we discuss the performance obtained. We present some preliminary results regarding OpenCL and multi-GPU extensions of our code and discuss future perspectives.
Comment: 22 pages, 14 eps figures, final version to be published in Computer Physics Communications
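As a rough illustration of what running "the whole Molecular Dynamics trajectory" on the device entails, the sketch below shows the skeleton of a leapfrog trajectory as used inside (R)HMC: field and conjugate momenta are updated in alternating steps, with one force evaluation per step. In the GPU code these updates and the pseudofermion force kernels all run on the device, so no field data crosses the PCIe bus within a trajectory; here a toy scalar field stands in for the gauge links, and all names are illustrative assumptions rather than the authors' code.

```cpp
// Sketch: structure of one leapfrog MD trajectory inside (R)HMC.
// The "field" is a toy scalar phi with action S = sum_i 1/2 phi_i^2; in the GPU codes
// described above, phi would be the gauge links (plus pseudofermions) and both the
// force evaluation and the updates would run as CUDA kernels. Illustrative only.
#include <cstdio>
#include <vector>

using Field = std::vector<double>;

// dS/dphi for the toy action; stand-in for the gauge + fermion force kernels.
void force(const Field& phi, Field& F) {
    for (std::size_t i = 0; i < phi.size(); ++i) F[i] = phi[i];
}

double hamiltonian(const Field& phi, const Field& pi) {
    double h = 0.0;
    for (std::size_t i = 0; i < phi.size(); ++i)
        h += 0.5 * phi[i] * phi[i] + 0.5 * pi[i] * pi[i];
    return h;
}

// One leapfrog trajectory: half-step momentum update, then alternating full steps
// for the field and momenta, ending with a final half-step for the momenta.
void leapfrog(Field& phi, Field& pi, int n_steps, double eps) {
    Field F(phi.size());
    force(phi, F);
    for (std::size_t i = 0; i < pi.size(); ++i) pi[i] -= 0.5 * eps * F[i];
    for (int s = 0; s < n_steps; ++s) {
        for (std::size_t i = 0; i < phi.size(); ++i) phi[i] += eps * pi[i];
        force(phi, F);
        double w = (s == n_steps - 1) ? 0.5 * eps : eps;   // final half step
        for (std::size_t i = 0; i < pi.size(); ++i) pi[i] -= w * F[i];
    }
}

int main() {
    Field phi(8, 1.0), pi(8, 0.0);
    double h0 = hamiltonian(phi, pi);
    leapfrog(phi, pi, 50, 0.02);
    // The energy violation dH is what enters the HMC accept/reject step.
    std::printf("dH = %.3e\n", hamiltonian(phi, pi) - h0);
    return 0;
}
```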
Solving Lattice QCD systems of equations using mixed precision solvers on GPUs
Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the main computational challenge is to efficiently solve the discretized Dirac equation in the presence of an SU(3) gauge field. Using NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector product that performs at up to 40 Gflops, 135 Gflops and 212 Gflops for double, single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have developed a new mixed precision approach for Krylov solvers using reliable updates, which allows for full double precision accuracy while using only single or half precision arithmetic for the bulk of the computation. The resulting BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations until convergence, perform better than the usual defect-correction approach for mixed precision.
Comment: 30 pages, 7 figures
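Below is a minimal sketch of the reliable-updates idea, assuming its standard formulation: the Krylov recursion runs in low precision, the solution is accumulated in double precision, and whenever the iterated residual has dropped by a chosen factor the true residual b - Ax is recomputed in double precision and re-injected, so the final accuracy is that of double precision without the full restart that plain defect correction would perform. The SPD toy operator and CG (rather than the Wilson operator and BiCGstab), the replacement threshold, and all names are stand-in assumptions, not the paper's implementation.

```cpp
// Sketch: mixed-precision CG with reliable updates. The iteration runs in single
// precision, the solution is accumulated in double, and the true residual is
// periodically recomputed in double precision. Toy SPD operator; illustrative only.
#include <cstdio>
#include <cmath>
#include <vector>

constexpr int N = 256;

template <typename T>
void apply(const std::vector<T>& x, std::vector<T>& y) {   // tridiagonal SPD stand-in
    for (int i = 0; i < N; ++i) {
        T v = T(4) * x[i];
        if (i > 0)     v -= x[i - 1];
        if (i < N - 1) v -= x[i + 1];
        y[i] = v;
    }
}

template <typename T>
double dot(const std::vector<T>& a, const std::vector<T>& b) {
    double s = 0.0;                                         // accumulate in double
    for (int i = 0; i < N; ++i) s += double(a[i]) * double(b[i]);
    return s;
}

int main() {
    std::vector<double> b(N, 1.0), x(N, 0.0), r(N), t(N);
    std::vector<float>  rs(N), ps(N), Aps(N);

    apply(x, t);
    for (int i = 0; i < N; ++i) { r[i] = b[i] - t[i]; rs[i] = float(r[i]); ps[i] = rs[i]; }
    double rr = dot(rs, rs), r_max = std::sqrt(rr);
    const double tol = 1e-10 * std::sqrt(dot(b, b)), delta = 0.1;

    for (int it = 0; it < 10000 && std::sqrt(rr) > tol; ++it) {
        apply(ps, Aps);                                     // single-precision mat-vec
        double alpha = rr / dot(ps, Aps);
        for (int i = 0; i < N; ++i) {
            x[i]  += alpha * ps[i];                         // solution kept in double
            rs[i] -= float(alpha) * Aps[i];
        }
        double rr_new = dot(rs, rs);
        if (std::sqrt(rr_new) < delta * r_max) {            // reliable update:
            apply(x, t);                                    // recompute true residual
            for (int i = 0; i < N; ++i) { r[i] = b[i] - t[i]; rs[i] = float(r[i]); }
            rr_new = dot(rs, rs);
            r_max  = std::sqrt(rr_new);
        }
        double beta = rr_new / rr;
        rr = rr_new;
        for (int i = 0; i < N; ++i) ps[i] = rs[i] + float(beta) * ps[i];
    }
    apply(x, t);
    double res = 0.0;
    for (int i = 0; i < N; ++i) res += (b[i] - t[i]) * (b[i] - t[i]);
    std::printf("true |r| = %.3e\n", std::sqrt(res));       // near double-precision accuracy
    return 0;
}
```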
Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics
Graphics Processing Units (GPUs) are having a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations of importance in nuclear and particle physics. The QUDA library provides a package of mixed precision sparse matrix linear solvers for LQCD applications, supporting single GPUs based on NVIDIA's Compute Unified Device Architecture (CUDA). This library, interfaced to the QDP++/Chroma framework for LQCD calculations, is currently in production use on the "9g" cluster at the Jefferson Laboratory, enabling unprecedented price/performance for a range of problems in LQCD. Nevertheless, memory constraints on current GPU devices limit the problem sizes that can be tackled. In this contribution we describe the parallelization of the QUDA library onto multiple GPUs using MPI, including strategies for the overlapping of communication and computation. We report on both weak and strong scaling for up to 32 GPUs interconnected by InfiniBand, on which we sustain in excess of 4 Tflops.
Comment: 11 pages, 7 figures, to appear in the Proceedings of Supercomputing 2010 (submitted April 12, 2010)
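The communication/computation overlap mentioned here typically follows a halo-exchange pattern: post non-blocking sends and receives for the boundary slices, apply the operator to the interior sites (which need no remote data) while the messages are in flight, then complete the boundary sites once the halos arrive. The sketch below shows only the MPI-level structure for a 1-D decomposed stencil stand-in; the library itself additionally relies on CUDA streams and asynchronous device-host copies, and none of the names below are from QUDA.

```cpp
// Sketch: overlapping halo exchange with interior computation for a 1-D domain-
// decomposed stencil (a stand-in for the Wilson-Dirac operator split across GPUs).
// MPI-level structure only; hypothetical names. Compile with mpicxx, run with mpirun.
#include <mpi.h>
#include <vector>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int nloc = 1024;                       // local sites per rank
    std::vector<double> x(nloc, 1.0), y(nloc, 0.0);
    double halo_lo = 0.0, halo_hi = 0.0;         // received boundary values
    int lo = (rank - 1 + size) % size, hi = (rank + 1) % size;

    // 1) Post non-blocking receives and sends for the two boundary sites.
    MPI_Request req[4];
    MPI_Irecv(&halo_lo, 1, MPI_DOUBLE, lo, 0, MPI_COMM_WORLD, &req[0]);
    MPI_Irecv(&halo_hi, 1, MPI_DOUBLE, hi, 1, MPI_COMM_WORLD, &req[1]);
    MPI_Isend(&x[0],        1, MPI_DOUBLE, lo, 1, MPI_COMM_WORLD, &req[2]);
    MPI_Isend(&x[nloc - 1], 1, MPI_DOUBLE, hi, 0, MPI_COMM_WORLD, &req[3]);

    // 2) Interior stencil: needs no remote data, so it overlaps the messages above.
    for (int i = 1; i < nloc - 1; ++i)
        y[i] = 4.0 * x[i] - x[i - 1] - x[i + 1];

    // 3) Wait for the halos, then finish the boundary sites.
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    y[0]        = 4.0 * x[0]        - halo_lo    - x[1];
    y[nloc - 1] = 4.0 * x[nloc - 1] - x[nloc - 2] - halo_hi;

    if (rank == 0) std::printf("stencil applied on %d ranks\n", size);
    MPI_Finalize();
    return 0;
}
```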
Staggered fermions simulations on GPUs
We present our implementation of the RHMC algorithm for staggered fermions on Graphics Processing Units using the NVIDIA CUDA programming language. While previous studies deal exclusively with the Dirac matrix inversion problem, our code performs the complete MD trajectory on the GPU. After pointing out the main bottlenecks and how to circumvent them, we discuss the performance of our code.
Comment: Poster presented at the XXVIII International Symposium on Lattice Field Theory, June 14-19, 2010, Villasimius, Sardinia, Italy