Measurement data for paper "Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices"

Hahnfeld, Jonas; Müller, Matthias S.; Pflug, Hans Joachim; Price, James; Terboven, Christian

Authors: Jonas Hahnfeld, Matthias S. Müller, Hans Joachim Pflug, James Price, Christian Terboven
Publication date: 1 January 2018
Publisher: RWTH Aachen

Abstract

Accelerator devices are increasingly used to build large supercomputers, and current installations usually include more than one accelerator per system node. To keep all devices busy, kernels have to be executed concurrently, which can be achieved via asynchronous kernel launches. Our work compares the performance of an implementation of the Conjugate Gradient method with CUDA, OpenCL, and OpenACC on NVIDIA Pascal GPUs. Furthermore, it takes a look at Intel Xeon Phi coprocessors when programmed with OpenCL and OpenMP. In doing so, it tries to answer the question of whether the higher abstraction level of directive-based models is inferior to lower-level paradigms in terms of performance. This archive contains the modifications to liboffload, all binaries and libraries including their respective commit IDs, and the raw data of our measurements.

Available Versions

Publikationsserver der RWTH Aachen University

oai:publications.rwth-aachen.d...

Last time updated on 18/04/2018

RWTH Publications

oai:publications.rwth-aachen.d...

Last time updated on 18/04/2020