Search CORE

7,338 research outputs found

Accelerated test execution using GPUs

Author: Chechik M
Crnkovic I
Grünbacher P
Kroening D
Rajan A
Schrammel P
Sharma S
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

As product life-cycles become shorter and the scale and complexity of systems increase, accelerating the execution of large test suites gains importance. Existing research has primarily focussed on techniques that reduce the size of the test suite. By contrast, we propose a technique that accelerates test execution, allowing test suites to run in a fraction of the original time, by parallel execution with a Graphics Processing Unit (GPU). Program testing, which is in essence execution of the same program with multiple sets of test data, naturally exhibits the kind of data parallelism that can be exploited with GPUs. Our approach simultaneously executes the program with one test case per GPU thread. GPUs have severe limitations, and we discuss these in the context of our approach and define the scope of our applications. We observe speed-ups up to a factor of 27 compared to single-core execution on conventional CPUs with embedded systems benchmark programs

Crossref

Oxford University Research Archive

Sussex Research Online

Accelerated Finite State Machine Test Execution Using GPUs

Author: Dubach Christophe
Kapoor Arnav
Rajan Ajitha
Yaneva Vanya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/05/2019
Field of study

Edinburgh Research Explorer

Highly accelerated simulations of glassy dynamics using GPUs: caveats on limited floating-point precision

Author: Anderson
Block
Dekker
Felix Höfling
Flenner
Frenkel
Götze
Hansen
Harvey
Knuth
Knuth
Kob
Kob
Kob
Lippert
Liu
Mosayebi
Peter H. Colberg
Plimpton
Preis
Rapaport
Sagan
Stone
van Meel
Voelz
Weeks
Xu
Yang
Zagha
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011
Field of study

Modern graphics processing units (GPUs) provide impressive computing resources, which can be accessed conveniently through the CUDA programming interface. We describe how GPUs can be used to considerably speed up molecular dynamics (MD) simulations for system sizes ranging up to about 1 million particles. Particular emphasis is put on the numerical long-time stability in terms of energy and momentum conservation, and caveats on limited floating-point precision are issued. Strict energy conservation over 10^8 MD steps is obtained by double-single emulation of the floating-point arithmetic in accuracy-critical parts of the algorithm. For the slow dynamics of a supercooled binary Lennard-Jones mixture, we demonstrate that the use of single-floating point precision may result in quantitatively and even physically wrong results. For simulations of a Lennard-Jones fluid, the described implementation shows speedup factors of up to 80 compared to a serial implementation for the CPU, and a single GPU was found to compare with a parallelised MD simulation using 64 distributed cores.Comment: 12 pages, 7 figures, to appear in Comp. Phys. Comm., HALMD package licensed under the GPL, see http://research.colberg.org/projects/halm

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Crossref

CoreTSAR: Task Scheduling for Accelerator-aware Runtimes

Author: de Supinski Bronis R.
Feng Wu-chun
Rountree Barry
Scogland Thomas R. W.
Publication venue
Publication date: 01/01/2012
Field of study

Heterogeneous supercomputers that incorporate computational accelerators such as GPUs are increasingly popular due to their high peak performance, energy efficiency and comparatively low cost. Unfortunately, the programming models and frameworks designed to extract performance from all computational units still lack the flexibility of their CPU-only counterparts. Accelerated OpenMP improves this situation by supporting natural migration of OpenMP code from CPUs to a GPU. However, these implementations currently lose one of OpenMP’s best features, its flexibility: typical OpenMP applications can run on any number of CPUs. GPU implementations do not transparently employ multiple GPUs on a node or a mix of GPUs and CPUs. To address these shortcomings, we present CoreTSAR, our runtime library for dynamically scheduling tasks across heterogeneous resources, and propose straightforward extensions that incorporate this functionality into Accelerated OpenMP. We show that our approach can provide nearly linear speedup to four GPUs over only using CPUs or one GPU while increasing the overall flexibility of Accelerated OpenMP

Computer Science Technical Reports @Virginia Tech