A pilgrimage to gravity on GPUs

Author: A. Ahmad
A. Gualandris
A. Tanikawa
E. Gaburov
E. Holmberg
E.N. Dorband
G.J. Sussman
J. Barnes
J. Bédorf
J. Goodman
J. Makino
J.H. Applegate
J.R. Hurley
K. Nitadori
L. Nyland
M. Fujii
P. Hut
R. Spurzem
R. Yokota
R.G. Belleman
R.H. Miller
S. Harfst
S. Inagaki
S. Portegies Zwart
S. von Hoerner
S.F. Portegies Zwart
S.J. Aarseth
T. Fukushige
T.S. van Albada
W. Dehnen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/04/2012

In this short review we present the developments over the last five decades that have led to the use of Graphics Processing Units (GPUs) for astrophysical simulations. Since the introduction of NVIDIA's Compute Unified Device Architecture (CUDA) in 2007, the GPU has become a valuable tool for N-body simulations, and it is now so popular that almost all papers about high-precision N-body simulations use methods that are accelerated by GPUs. With GPU hardware becoming more advanced and being used for more sophisticated algorithms such as gravitational tree-codes, we see a bright future for GPU-like hardware in computational astrophysics. Comment: To appear in: European Physical Journal "Special Topics": "Computer Simulations on Graphics Processing Units". 18 pages, 8 figures

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Leiden University Scholarly Publications

Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond

Author: Christ Norman H.
Detmold William
Edwards Robert G.
Joó Bálint
Jung Chulwoo
Savage Martin
Shanahan Phiala
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/11/2019

In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculation using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones. Comment: 44 pages. 1 of USQCD whitepapers

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Performance analysis of parallel gravitational $N$-body codes on large GPU cluster

Author: Berczik Peter
Huang Siyi
Spurzem Rainer
Publication venue: 'IOP Publishing'
Publication date: 11/08/2015

We compare the performance of two very different parallel gravitational $N$-body codes for astrophysical simulations on large GPU clusters, NBODY6++ and Bonsai, both pioneers in their own fields as well as in certain mutual scales. We benchmark the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs, as their performance has approached half of the maximum single-precision performance of the underlying GPU cards. With such performance we predict that a speed-up of 200-300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We discuss the quantitative comparison of the two codes, finding that in the same cases Bonsai adopts larger time steps and has larger relative energy errors than NBODY6++, typically 10-50 times larger, depending on the chosen parameters of the codes. While the two codes are built for different astrophysical applications, under specified conditions they may overlap in performance at certain physical scales, allowing the user to choose either one with parameters fine-tuned accordingly. Comment: 15 pages, 7 figures, 3 tables, accepted for publication in Research in Astronomy and Astrophysics (RAA)

arXiv.org e-Print Archive

Crossref

Repository of the Academy's Library

High-level programming of stencil computations on multi-GPU systems using the SkelCL library

Author: Breuer Stefan
Gorlatch Sergei
Haidl Michael
Steuwer Michel
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/09/2014

The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually tuned coding using low-level approaches like OpenCL and CUDA. This makes development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach, which combines high-level programming abstractions with competitive performance on multi-GPU systems. SkelCL extends the OpenCL standard by three high-level features: 1) pre-implemented parallel patterns (a.k.a. skeletons); 2) container data types for vectors and matrices; 3) an automatic data (re)distribution mechanism. We introduce two new SkelCL skeletons which specifically target stencil computations – MapOverlap and Stencil – and we describe their use for particular application examples, discuss their efficient parallel implementation, and report experimental results on systems with multiple GPUs. Our evaluation of three real-world applications shows that stencil code written with SkelCL is considerably shorter than hand-tuned OpenCL code and offers competitive performance.

Crossref

Enlighten

Applications for Ultrascale Computing

Author: Bongo Lars Ailo
Ciegis Raimondas
Frasheri Neki
Gong Jing
Kimovski Dragi
Kropf Peter
Margenov Svetozar
Mihajlovic Milan
Neytcheva Maya
Rauber Thomas
Runger Gudula
Trobec Roman
Wuyts Roel
Wyrzykowski Roman
Publication venue: 'FSAEIHE South Ural State University (National Research University)'
Publication date: 01/01/2015

The University of Manchester - Institutional Repository

Enhancing speed and scalability of the ParFlow simulation code

Author: Burstedde Carsten
Fonseca Jose A.
Kollet Stefan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/09/2017

Regional hydrology studies are often supported by high-resolution simulations of subsurface flow that require expensive and extensive computations. Efficient usage of the latest high-performance parallel computing systems becomes a necessity. The simulation software ParFlow has been demonstrated to meet this requirement and shown to have excellent solver scalability for up to 16,384 processes. In the present work we show that the code requires further enhancements in order to take full advantage of current petascale machines. We identify ParFlow's way of parallelizing the computational mesh as a central bottleneck. We propose to reorganize this subsystem using fast mesh partition algorithms provided by the parallel adaptive mesh refinement library p4est. We realize this in a minimally invasive manner by modifying selected parts of the code to reinterpret the existing mesh data structures. We evaluate the scaling performance of the modified version of ParFlow, demonstrating good weak and strong scaling up to 458k cores of the Juqueen supercomputer, and test an example application at large scale. Comment: The final publication is available at link.springer.co

arXiv.org e-Print Archive

Crossref

Juelich Shared Electronic Resources

The stellar atmosphere simulation code Bifrost

Author: Abbett
Arber
Arnaud
B. V. Gudiksen
Carlsson
Cheung
De Pontieu
DeVore
Dorfi
Galsgaard
Gudiksen
Gustafsson
Hansteen
Hayek
Heggland
Heinemann
Isobe
J. Leenaarts
J. Martínez-Sykora
Keller
Leenaarts
M. Carlsson
Martínez-Sykora
Nordlund
Orszag
Rempel
Ryu
Shull
Skartlien
Sod
Stein
Thompson
Tortosa-Andreu
Trujillo Bueno
Tsuji
Tóth
V. H. Hansteen
Vernazza
Vögler
W. Hayek
Wedemeyer-Böhm
Woodall
Publication venue: 'EDP Sciences'
Publication date: 01/01/2011

Context: Numerical simulations of stellar convection and photospheres have been developed to the point where detailed shapes of observed spectral lines can be explained. Stellar atmospheres are very complex, and very different physical regimes are present in the convection zone, photosphere, chromosphere, transition region and corona. To understand the details of the atmosphere it is necessary to simulate the whole atmosphere, since the different layers interact strongly. These physical regimes are very diverse, and it takes a highly efficient, massively parallel numerical code to solve the associated equations. Aims: The design, implementation and validation of the massively parallel numerical code Bifrost for simulating stellar atmospheres from the convection zone to the corona. Methods: The code is subjected to a number of validation tests, among them the Sod shock tube test, the Orszag-Tang colliding shock test, boundary condition tests and tests of how the code treats magnetic field advection, chromospheric radiation, radiative transfer in an isothermal scattering atmosphere, hydrogen ionization and thermal conduction. Results: Bifrost completes the tests with good results and shows near-linear efficiency scaling to thousands of computing cores.

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Utrecht University Repository