77 research outputs found

Solving Lattice QCD systems of equations using mixed precision solvers on GPUs

Author: Barros
Brannick
Bulava
Bunk
C. Rebbi
Clark
De Forcrand
DeGrand
Edwards
Egri
Holmgren
K. Barros
Kahan
M.A. Clark
Martin
NVIDIA Corporation
R. Babich
R.C. Brower
Sleijpen
Publication venue: 'Elsevier BV'
Publication date: 21/12/2009

Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the main computational challenge is to efficiently solve the discretized Dirac equation in the presence of an SU(3) gauge field. Using NVIDIA's CUDA platform, we have implemented a Wilson-Dirac sparse matrix-vector product that performs at up to 40 Gflops, 135 Gflops and 212 Gflops for double, single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have developed a new mixed precision approach for Krylov solvers using reliable updates, which allows for full double precision accuracy while using only single or half precision arithmetic for the bulk of the computation. The resulting BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations until convergence, perform better than the usual defect-correction approach for mixed precision. Comment: 30 pages, 7 figures
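
For orientation, the "usual defect-correction approach" that the abstract compares against can be sketched as follows. This is only an illustrative sketch, not the paper's reliable-update method and not its CUDA code: it assumes a small dense symmetric positive-definite matrix in place of the Wilson-Dirac operator, and the names `cg` and `defect_correction_solve` are hypothetical.

```python
import numpy as np

def cg(A, b, tol, max_iter=1000):
    """Plain conjugate gradient for symmetric positive-definite A, run in A's dtype."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    b_norm = np.sqrt(b @ b)
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol * b_norm:
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

def defect_correction_solve(A, b, tol=1e-12, inner_tol=1e-4, max_outer=50):
    """Assumed baseline scheme: single precision inner solves, double precision residuals."""
    A_lo = A.astype(np.float32)              # low precision copy used for the bulk of the work
    x = np.zeros_like(b, dtype=np.float64)   # the solution is accumulated in double precision
    b_norm = np.linalg.norm(b)
    n_outer = 0
    while n_outer < max_outer:
        r = b - A @ x                        # true residual, computed in double precision
        if np.linalg.norm(r) <= tol * b_norm:
            break
        # Solve A e = r only loosely, in single precision, then correct x.
        e = cg(A_lo, r.astype(np.float32), inner_tol)
        x += e.astype(np.float64)
        n_outer += 1
    return x, n_outer
```

Each outer step restarts the inner Krylov solver from scratch, which discards the search directions built so far. That restart cost is the usual motivation for alternatives such as the reliable-update scheme the abstract describes.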

arXiv.org e-Print Archive

Crossref

Preconditioning of Improved and ``Perfect'' Fermion Actions

Author: A. Frommer
B. Medeke
Barbour
Battista
Bhattacharya
Bietenholz
Bietenholz
Bietenholz
Eicker
Eisenstat
Fischer
Frommer
G. Weuffen
Gupta
Hasenfratz
Jansen
Jansen
K. Schilling
Lepage
Lüscher
Lüscher
N. Eicker
Niedermayer
Orginos
Oyanagi
Sheikholeslami
Symanzik
Th. Lippert
W. Bietenholz
Wilson
Wilson
Publication venue: 'Elsevier BV'
Publication date: 01/01/1998

We construct a locally-lexicographic SSOR preconditioner to accelerate the parallel iterative solution of linear systems of equations for two improved discretizations of lattice fermions: (i) the Sheikholeslami-Wohlert scheme, where a non-constant block-diagonal term is added to the Wilson fermion matrix, and (ii) renormalization group improved actions, which incorporate couplings beyond nearest neighbors of the lattice fermion fields. In case (i) we find the block ll-SSOR scheme to be more effective by a factor of about 2 than odd-even preconditioned solvers in terms of convergence rates, at beta=6.0. For type (ii) actions, we show that our preconditioner accelerates the iterative solution of a linear system of hypercube fermions by a factor of 3 to 4. Comment: 27 pages, LaTeX, 17 figures included
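
As a rough illustration of what an SSOR preconditioner does, the sketch below applies the inverse of the textbook SSOR matrix M = (D + ωL) D⁻¹ (D + ωU) / (ω(2 − ω)) to a vector, where D, L and U are the diagonal, strictly lower and strictly upper parts of A. It assumes a small dense matrix with plain global ordering; it is not the locally-lexicographic, parallel, block variant of the paper, and the function name `ssor_apply_inverse` is hypothetical.

```python
import numpy as np

def ssor_apply_inverse(A, v, omega=1.0):
    """Return M^{-1} v for the SSOR matrix M = (D + wL) D^{-1} (D + wU) / (w (2 - w))."""
    d = np.diag(A)                 # diagonal entries of A (assumed nonzero)
    D = np.diag(d)
    L = np.tril(A, k=-1)           # strictly lower triangular part
    U = np.triu(A, k=1)            # strictly upper triangular part
    scale = omega * (2.0 - omega)
    # M^{-1} v = scale * (D + wU)^{-1} D (D + wL)^{-1} v
    # np.linalg.solve is a general dense solver; a production code would use
    # forward and backward substitution sweeps over the lattice sites instead.
    y = np.linalg.solve(D + omega * L, v)   # forward sweep
    y = d * y                               # multiply by D
    return scale * np.linalg.solve(D + omega * U, y)  # backward sweep
```

Inside a Krylov solver, this routine would be called once per iteration in place of an explicit inverse; the relaxation parameter `omega` is typically tuned empirically.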

arXiv.org e-Print Archive

CiteSeerX

Crossref

Juelich Shared Electronic Resources

CERN Document Server

Algorithms in Lattice QCD

Author: Pickles Stephen M.
Publication venue: The University of Edinburgh
Publication date: 01/01/1998

The enormous computing resources that large-scale simulations in Lattice QCD require will continue to test the limits of even the largest supercomputers into the foreseeable future. The efficiency of such simulations will therefore concern practitioners of lattice QCD for some time to come. I begin with an introduction to those aspects of lattice QCD essential to the remainder of the thesis, and follow with a description of the Wilson fermion matrix M, an object which is central to my theme. The principal bottleneck in Lattice QCD simulations is the solution of linear systems involving M, and this topic is treated in depth. I compare some of the more popular iterative methods, including Minimal Residual, Conjugate Gradient on the Normal Equation, Bi-Conjugate Gradient, QMR, BiCGSTAB and BiCGSTAB2, and then turn to a study of block algorithms, a special class of iterative solvers for systems with multiple right-hand sides. Included in this study are two block algorithms which had not previously been applied to lattice QCD. The next chapters are concerned with a generalised Hybrid Monte Carlo algorithm (GHMC) for QCD simulations involving dynamical quarks. I focus squarely on the efficient and robust implementation of GHMC, and describe some tricks to improve its performance. A limited set of results from HMC simulations at various parameter values is presented. A treatment of the non-hermitian Lanczos method and its application to the eigenvalue problem for M rounds off the theme of large-scale matrix computations.
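
One of the methods listed above, conjugate gradient on the normal equations, can be sketched in a few lines. The sketch assumes the variant that solves M†M x = M†b (often called CGNR); which variant the thesis uses is not stated in this abstract. A small dense complex matrix stands in for the Wilson fermion matrix, and the function name `cg_normal_equations` is hypothetical.

```python
import numpy as np

def cg_normal_equations(M, b, tol=1e-10, max_iter=1000):
    """Solve M x = b for nonsingular, possibly non-Hermitian M via CG on M^H M x = M^H b."""
    Mh = M.conj().T
    x = np.zeros_like(b)
    r = b.copy()                    # residual of the original system, b - M x
    z = Mh @ r                      # residual of the normal equations
    p = z.copy()
    zz = np.vdot(z, z).real
    b_norm = np.linalg.norm(b)
    for it in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        w = M @ p
        alpha = zz / np.vdot(w, w).real
        x = x + alpha * p
        r = r - alpha * w
        z = Mh @ r
        zz_new = np.vdot(z, z).real
        p = z + (zz_new / zz) * p   # new search direction
        zz = zz_new
    return x, max_iter
```

Because M†M is Hermitian positive definite whenever M is nonsingular, plain CG applies. The price is a squared condition number, which is one reason methods such as BiCGSTAB that work on M directly are attractive.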

Edinburgh Research Archive