Performance Portability Strategies for Grid C++ Expression Templates

Avilés-Casco, Alejandro Vaquero; Boyle, Peter A.; Clark, M. A.; DeTar, Carleton; Lin, Meifeng; Rana, Verinder

Authors: Alejandro Vaquero Avilés-Casco
Peter A. Boyle
M. A. Clark
Carleton DeTar
Meifeng Lin
Verinder Rana
Publication date: 25 October 2017
Publisher: EDP Sciences

Abstract

One of the key requirements for the Lattice QCD Application Development as part of the US Exascale Computing Project is performance portability across multiple architectures. Using the Grid C++ expression template as a starting point, we report on the progress made with regard to the Grid GPU offloading strategies. We present both the successes and issues encountered in using CUDA, OpenACC and Just-In-Time compilation. Experimentation and performance on GPUs with an SU(3) $\times$ SU(3) streaming test will be reported. We will also report on the challenges of using current OpenMP 4.x for GPU offloading in the same code.

Comment: 8 pages, 4 figures. Talk presented at the 35th International Symposium on Lattice Field Theory, 18-24 June 2017, Granada, Spain
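
For readers unfamiliar with the benchmark named in the abstract, the following is a minimal CPU-only sketch of what an SU(3) $\times$ SU(3) streaming test might look like: a 3x3 complex matrix product performed independently at every lattice site, with the data volume counted so that a timing can be turned into an effective bandwidth. All names here (`ColourMatrix`, `multiply`, `stream_multiply`, `bytes_moved`) are hypothetical and are not taken from Grid or from the paper. The flat `std::vector` layout is an assumption made for brevity.

```cpp
#include <array>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is an illustration, not Grid's API.
using Complex = std::complex<double>;
using ColourMatrix = std::array<Complex, 9>;  // one 3x3 complex matrix, row-major

// Product of two 3x3 complex matrices for a single lattice site.
inline ColourMatrix multiply(const ColourMatrix& a, const ColourMatrix& b) {
    ColourMatrix c{};  // value-initialised to zero
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[3 * i + j] += a[3 * i + k] * b[3 * k + j];
    return c;
}

// Streaming kernel: z[s] = x[s] * y[s] at every site s.
// Sites are independent, so this loop is the natural unit to offload.
void stream_multiply(const std::vector<ColourMatrix>& x,
                     const std::vector<ColourMatrix>& y,
                     std::vector<ColourMatrix>& z) {
    for (std::size_t s = 0; s < z.size(); ++s) {
        z[s] = multiply(x[s], y[s]);
    }
}

// Bytes touched per call: two matrices read and one written per site.
double bytes_moved(std::size_t sites) {
    return 3.0 * static_cast<double>(sites) * sizeof(ColourMatrix);
}
```

Dividing `bytes_moved(sites)` by the measured wall-clock time of `stream_multiply` gives an effective memory bandwidth, which is the usual figure of merit for a streaming kernel of this kind. A GPU version would replace the site loop with a kernel launch or an offload directive.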
