An investigation of the performance portability of OpenCL

Author: Hammond Simon D.
Herdman J. A.
Jarvis Stephen A.
Miller I.
Pennycook Simon J.
Wright Steven A.
Publication venue: 'Elsevier BV'
Publication date: 11/08/2012

This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level benchmark from the NAS Parallel Benchmark Suite. It gives an account of the design decisions made during development, demonstrating the importance of memory arrangement and work-item/work-group distribution strategies when applications are deployed on different device types. The resulting platform-agnostic, single-source application is benchmarked on a number of different architectures, and is shown to be 1.3–1.5× slower than native FORTRAN 77 or CUDA implementations on a single node and 1.3–3.1× slower on multiple nodes. We also explore the potential performance gains of OpenCL’s device fissioning capability, demonstrating up to a 3× speed-up over our original OpenCL implementation.

Warwick Research Archives Portal Repository

Automatic Computation of Cross Sections in HEP

Author: Fujimoto J.
Ishikawa T.
Jimbo M.
Kaneko T.
Kato K.
Kawabata S.
Kon T.
Kurihara Y.
Kuroda M.
Nakazawa N.
Shimizu Y.
Tanaka H.
Yuasa F.
Publication venue: 'Japan Society of Applied Physics'
Publication date: 01/01/2000

Automatic computation systems have been developed for the study of reactions in High Energy Physics (HEP) and are now widely used. GRACE is one such system, and it has achieved much success in analyzing experimental data. Because a cross section is obtained by calculating hundreds of Feynman diagrams, the calculation is large in scale and requires effective symbolic manipulation and careful treatment of singularities in the numerical integration. The talk will describe the software design of the GRACE system and the computational techniques used in GRACE.

Comment: 6 pages, LaTeX, ICCP

arXiv.org e-Print Archive

Crossref

CERN Document Server

Making extreme computations possible with virtual machines

Author: Chokoufe B.
Ohl T.
Reuter J.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2016

State-of-the-art algorithms generate scattering amplitudes for high-energy physics at leading order for high-multiplicity processes as compiled code (in Fortran, C or C++). For complicated processes the size of these libraries can become tremendous (many GiB). We show that amplitudes can be translated to bytecode instructions, which reduces the size by one order of magnitude. The bytecode is interpreted by a Virtual Machine with runtimes comparable to compiled code and better scaling with additional legs. We study the properties of this algorithm as an extension of the Optimizing Matrix Element Generator (O'Mega). The bytecode matrix elements are available as alternative input for the event generator WHIZARD. The bytecode interpreter can be implemented very compactly, which will help with a future implementation on massively parallel GPUs.

Comment: 5 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:1411.383

arXiv.org e-Print Archive

DESY Publication Database

DESY
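
The idea in the abstract above, of replacing compiled amplitude code with a compact instruction stream run by a small interpreter, can be illustrated with a toy example. The following is only a sketch under stated assumptions: the opcodes, the instruction layout, and the `run` function are hypothetical and are not O'Mega's or WHIZARD's actual bytecode format. It shows how an arithmetic expression over complex numbers can be stored as data and evaluated by a short loop.

```python
from typing import List, Tuple

# Hypothetical opcodes, chosen for illustration only.
OP_LOAD, OP_ADD, OP_MUL = 0, 1, 2

# Each instruction is (opcode, destination register, operand a, operand b).
Instruction = Tuple[int, int, int, int]


def run(bytecode: List[Instruction], inputs: List[complex],
        n_registers: int) -> List[complex]:
    """Interpret the instruction list and return the final register file."""
    regs = [0j] * n_registers
    for op, dest, a, b in bytecode:
        if op == OP_LOAD:
            # Copy input number `a` into a register; `b` is unused.
            regs[dest] = inputs[a]
        elif op == OP_ADD:
            regs[dest] = regs[a] + regs[b]
        elif op == OP_MUL:
            regs[dest] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs


# The expression (x0 * x1) + x2 encoded as data rather than compiled code.
program: List[Instruction] = [
    (OP_LOAD, 0, 0, 0),
    (OP_LOAD, 1, 1, 0),
    (OP_LOAD, 2, 2, 0),
    (OP_MUL, 3, 0, 1),
    (OP_ADD, 4, 3, 2),
]

result = run(program, [2 + 0j, 3j, 1 + 0j], n_registers=5)[4]
```

In this toy layout the interpreter loop stays the same size no matter how long the program is, which is one way to read the abstract's remark that the interpreter can be implemented very compactly.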