Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics

Author: Babich Ronald
Clark Michael A.
Joó Bálint
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

Graphics Processing Units (GPUs) are having a transformational effect on numerical lattice quantum chromodynamics (LQCD) calculations of importance in nuclear and particle physics. The QUDA library provides a package of mixed precision sparse matrix linear solvers for LQCD applications, supporting single GPUs based on NVIDIA's Compute Unified Device Architecture (CUDA). This library, interfaced to the QDP++/Chroma framework for LQCD calculations, is currently in production use on the "9g" cluster at the Jefferson Laboratory, enabling unprecedented price/performance for a range of problems in LQCD. Nevertheless, memory constraints on current GPU devices limit the problem sizes that can be tackled. In this contribution we describe the parallelization of the QUDA library onto multiple GPUs using MPI, including strategies for the overlapping of communication and computation. We report on both weak and strong scaling for up to 32 GPUs interconnected by InfiniBand, on which we sustain in excess of 4 Tflops.

Comment: 11 pages, 7 figures, to appear in the Proceedings of Supercomputing 2010 (submitted April 12, 2010)

arXiv.org e-Print Archive

CiteSeerX

Gauge Field Generation on Large-Scale GPU-Enabled Systems

Author: Winter Frank
Publication date: 05/12/2012

Over the past years GPUs have been successfully applied to the task of inverting the fermion matrix in lattice QCD calculations. Even strong scaling to capability-level supercomputers, corresponding to O(100) GPUs or more, has been achieved. However, strong scaling a whole gauge field generation algorithm to this regime requires significantly more functionality than just having the matrix inverter utilize the GPUs, and has not yet been accomplished. This contribution extends QDP-JIT, the migration of SciDAC QDP++ to GPU-enabled parallel systems, to help strong scale the whole Hybrid Monte-Carlo to this regime. Initial results are shown for gauge field generation with Chroma simulating pure Wilson fermions on OLCF TitanDev.

Comment: The 30th International Symposium on Lattice Field Theory, June 24-29, 2012, Cairns, Australia (Acknowledgment and Citation added)

arXiv.org e-Print Archive

Crossref

Excited and exotic charmonium spectroscopy from lattice QCD

Author: AX El-Khadra
Bálint Joó
C Kim
C Michael
C Morningstar
Christopher E. Thomas
D Mohler
David G. Richards
E Follana
E Kou
FE Close
Graham Moir
GS Bali
H-W Lin
J Bulava
JJ Dudek
Jozef J. Dudek
K Nakamura
K Rummukainen
L Levkova
Liuming Liu
M Clark
M Lüscher
M Peardon
Michael Peardon
MS Chanowitz
N Brambilla
N Isgur
NH Christ
P Guo
Pol Vilaseca
R Morrin
RG Edwards
Robert G. Edwards
S-L Zhu
Sinéad M. Ryan
T Barnes
T Burch
T Pedlar
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Extending the QUDA library for Domain Wall and Twisted Mass fermions

Author: Alexei Strelchenko

We extend the QUDA library, an open-source library for performing calculations in lattice QCD on Graphics Processing Units (GPUs) using NVIDIA's CUDA platform, to include kernels for non-degenerate twisted mass and multi-GPU Domain Wall fermion operators. Performance analysis is provided for both cases.

ZENODO

Multi-mass solvers for lattice QCD on GPUs

Author: A. Alexandru
Alexandru
B. Gamari
Babich
C. Pelissier
Clark
DeGrand
Egri
F.X. Lee
Gross
Joo
Montvay
Politzer
Saad
van der Vorst
Wettig
Wilson
Publication venue: 'Elsevier BV'
Publication date: 26/03/2011

Graphical Processing Units (GPUs) are more and more frequently used for lattice QCD calculations. Lattice studies often require computing the quark propagators for several masses. These systems can be solved using multi-shift inverters, but these algorithms are memory intensive, which limits the size of the problem that can be solved using GPUs. In this paper, we show how to efficiently use a memory-lean single-mass inverter to solve multi-mass problems. We focus on the BiCGstab algorithm for Wilson fermions and show that the single-mass inverter not only requires less memory but also outperforms the multi-shift variant by a factor of two.

Comment: 27 pages, 6 figures, 3 Tables
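
As a rough illustration of the idea in this abstract, the sketch below solves a family of shifted systems (A + sigma I) x = b by calling a plain single-system BiCGstab once per shift, so only one set of work vectors is alive at a time. It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the small dense matrix `A`, the right-hand side `b`, the `shifts` list, and the helper names `make_shifted_operator` and `bicgstab` are all hypothetical stand-ins for a Wilson fermion operator and a GPU solver.

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return sqrt(dot(u, u))

def make_shifted_operator(A, sigma):
    """Return a function computing (A + sigma*I) x for a dense list-of-lists A."""
    def apply(x):
        return [dot(row, x) + sigma * xi for row, xi in zip(A, x)]
    return apply

def bicgstab(apply_A, b, tol=1e-10, max_iter=200):
    """Unpreconditioned BiCGstab for a real system, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)        # residual b - A x for the zero initial guess
    r_hat = list(r)    # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    target = tol * norm(b)
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = apply_A(p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if norm(s) <= target:
            return [xi + alpha * pi for xi, pi in zip(x, p)]
        t = apply_A(s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if norm(r) <= target:
            return x
        rho = rho_new
    return x

# Hypothetical toy problem: a small non-symmetric, diagonally dominant matrix.
A = [[4.0, 1.0, 0.0, 0.0],
     [-1.0, 4.0, 1.0, 0.0],
     [0.0, -1.0, 4.0, 1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]
shifts = [0.0, 0.5, 1.0]   # stand-ins for different quark masses

# One independent single-system solve per shift.
solutions = {sigma: bicgstab(make_shifted_operator(A, sigma), b) for sigma in shifts}
```

A multi-shift solver would instead carry one solution and search vector per shift through a single Krylov iteration, which is where the extra memory mentioned in the abstract comes from; the loop above trades that storage for repeated solves.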

arXiv.org e-Print Archive

Crossref

Lattice QCD based on OpenCL

Author: Bach Matthias
Lindenstruth Volker
Philipsen Owe
Pinke Christopher
Publication venue: 'Elsevier BV'
Publication date: 26/09/2012

We present an OpenCL-based Lattice QCD application using a heatbath algorithm for the pure gauge case and Wilson fermions in the twisted mass formulation. The implementation is platform independent and can be used on AMD or NVIDIA GPUs, as well as on classical CPUs. On the AMD Radeon HD 5870 our double precision dslash implementation performs at 60 GFLOPS over a wide range of lattice sizes. The hybrid Monte-Carlo presented reaches a speedup of four over the reference code running on a server CPU.

Comment: 19 pages, 11 figures

arXiv.org e-Print Archive

GSI Repository