100 research outputs found
Optimization of Finite-Differencing Kernels for Numerical Relativity Applications
A simple optimization strategy for the computation of 3D finite-differencing kernels on many-core architectures is proposed. The 3D finite-differencing computation is split direction by direction and exploits two levels of parallelism: in-core vectorization and multi-threaded shared-memory parallelization. The main application of this method is to accelerate high-order stencil computations in numerical relativity codes. Our proposed method provides substantial speedup in computations involving tensor contractions and 3D stencil calculations on different processor microarchitectures, including Intel Knights Landing.
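The direction-by-direction splitting described above can be illustrated with a minimal NumPy sketch (an illustration of the general technique, not the paper's kernel): a fourth-order centered second-derivative stencil is applied along one axis at a time, and the 3D Laplacian is the sum of the three one-directional sweeps.

```python
import numpy as np

def second_derivative_1d(u, axis, h):
    """Fourth-order centered second derivative along one axis.

    Stencil: (-u[i-2] + 16 u[i-1] - 30 u[i] + 16 u[i+1] - u[i+2]) / (12 h^2),
    applied to interior points only (boundary points are left as zero here).
    """
    d = np.zeros_like(u)
    n = u.shape[axis]

    def shifted(k):
        # View of u shifted by k points along `axis`, restricted to the interior.
        sl = [slice(None)] * u.ndim
        sl[axis] = slice(2 + k, n - 2 + k)
        return u[tuple(sl)]

    interior = [slice(None)] * u.ndim
    interior[axis] = slice(2, -2)
    d[tuple(interior)] = (-shifted(-2) + 16 * shifted(-1) - 30 * shifted(0)
                          + 16 * shifted(1) - shifted(2)) / (12.0 * h * h)
    return d

def laplacian(u, h):
    # Direction-by-direction sweep: each pass streams memory along one axis,
    # which is the access pattern the vectorized in-core loops exploit.
    return sum(second_derivative_1d(u, ax, h) for ax in range(3))
```

Each sweep is independently vectorizable, and in a threaded implementation the outer grid loop of each sweep would be divided among cores.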
From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation
Starting from a high-level problem description in terms of partial
differential equations using abstract tensor notation, the Chemora framework
discretizes, optimizes, and generates complete high performance codes for a
wide range of compute architectures. Chemora extends the capabilities of
Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient
manner for complex applications, without low-level code tuning. Chemora
achieves parallelism through MPI and multi-threading, combining OpenMP and
CUDA. Optimizations include high-level code transformations, efficient loop
traversal strategies, dynamically selected data and instruction cache usage
strategies, and JIT compilation of GPU code tailored to the problem
characteristics. The discretization is based on higher-order finite differences
on multi-block domains. Chemora's capabilities are demonstrated by simulations
of black hole collisions. This problem provides an acid test of the framework,
as the Einstein equations contain hundreds of variables and thousands of terms.
Comment: 18 pages, 4 figures, accepted for publication in Scientific Programming.
Kranc: a Mathematica application to generate numerical codes for tensorial evolution equations
We present a suite of Mathematica-based computer-algebra packages, termed
"Kranc", which comprise a toolbox to convert (tensorial) systems of partial
differential evolution equations to parallelized C or Fortran code. Kranc can
be used as a "rapid prototyping" system for physicists or mathematicians
handling very complicated systems of partial differential equations, but
through integration into the Cactus computational toolkit we can also produce
efficient parallelized production codes. Our work is motivated by the field of
numerical relativity, where Kranc is used as a research tool by the authors. In
this paper we describe the design and implementation of both the Mathematica
packages and the resulting code, we discuss some example applications, and
provide results on the performance of an example numerical code for the
Einstein equations.
Comment: 24 pages, 1 figure. Corresponds to journal version.
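The core idea behind Kranc (write the evolution equations symbolically, then emit compilable code for the right-hand sides) can be sketched in miniature with SymPy's C-code printer; this is a hypothetical toy for a scalar wave equation, not Kranc's actual Mathematica implementation, and the variable names are illustrative.

```python
import sympy as sp

# Evolved fields at one grid point, plus precomputed second spatial
# derivatives of phi (in a real code these come from finite differencing).
phi, pi_ = sp.symbols('phi pi_')
d2phi = sp.symbols('d2phi_xx d2phi_yy d2phi_zz')

# Scalar wave equation in first-order-in-time form:
#   d/dt phi = pi,   d/dt pi = Laplacian(phi)
rhs = {
    'phi_rhs': pi_,
    'pi_rhs': sum(d2phi),
}

# Emit one C assignment per right-hand side.
for name, expr in rhs.items():
    print(f"double {name} = {sp.ccode(expr)};")
```

For a tensorial system like the Einstein equations, the same pipeline is applied to every component after the tensor expressions have been expanded, which is where automation pays off most.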
Spatial support vector regression to detect silent errors in the exascale era
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant reliability challenges. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs occurring in HPC applications that can be characterized by an impact error bound. The key contributions are threefold. (1) Our design takes spatial features (i.e., neighbouring data values for each data point in a snapshot) into the training data, so that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study of the detection ability and performance under different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that our detector can achieve detection sensitivity (i.e., recall) of up to 99% while suffering a false positive rate of less than 1% in most cases. Our detector incurs low performance overhead, 5% on average, across all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff between detection ability and overheads.
This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research Program, under Contract DE-AC02-06CH11357, by an FI-DGR 2013 scholarship, by a HiPEAC PhD Collaboration Grant, by the European Community's Seventh Framework Programme [FP7/2007-2013] under the Mont-Blanc 2 Project (www.montblanc-project.eu), grant agreement no. 610402, and by TIN2015-65316-P.
Peer reviewed. Postprint (author's final draft).
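The spatial-feature detection idea can be sketched with scikit-learn's epsilon-insensitive SVR (a hypothetical 1D toy, not the paper's detector; the 0.1 detection range and the injected spike are illustrative choices): each interior point is predicted from its immediate neighbours, and a point whose observed value falls outside the predicted value plus or minus the detection range is flagged as a suspected SDC.

```python
import numpy as np
from sklearn.svm import SVR

# A smooth 1D "snapshot" standing in for application state.
x = np.linspace(0.0, 2.0 * np.pi, 200)
clean = np.sin(x)

# Spatial features: (left neighbour, right neighbour) -> centre value.
features = np.column_stack([clean[:-2], clean[2:]])
targets = clean[1:-1]
model = SVR(kernel='rbf', epsilon=0.01).fit(features, targets)

# Inject a silent error (bit-flip-like spike) into one data point.
corrupted = clean.copy()
corrupted[100] += 0.5

# Flag points whose observed value deviates from the spatial prediction
# by more than the detection range (0.1 here, chosen for this toy).
pred = model.predict(np.column_stack([corrupted[:-2], corrupted[2:]]))
residual = np.abs(pred - corrupted[1:-1])
suspects = np.flatnonzero(residual > 0.1) + 1   # +1: features skip the edges
```

Because only neighbouring values of the current snapshot are used, no extra copy of past state needs to be kept, which is what keeps the memory overhead small.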
A pseudospectral matrix method for time-dependent tensor fields on a spherical shell
We construct a pseudospectral method for the solution of time-dependent,
non-linear partial differential equations on a three-dimensional spherical
shell. The problem we address is the treatment of tensor fields on the sphere.
As a test case we consider the evolution of a single black hole in numerical
general relativity. A natural strategy would be the expansion in tensor
spherical harmonics in spherical coordinates. Instead, we consider the simpler
and potentially more efficient possibility of a double Fourier expansion on the
sphere for tensors in Cartesian coordinates. As usual for the double Fourier
method, we employ a filter to address time-step limitations and certain
stability issues. We find that a tensor filter based on spin-weighted spherical
harmonics is successful, while two simplified, non-spin-weighted filters do not
lead to stable evolutions. The derivatives and the filter are implemented by
matrix multiplication for efficiency. A key technical point is the construction
of a matrix multiplication method for the spin-weighted spherical harmonic
filter. As example for the efficient parallelization of the double Fourier,
spin-weighted filter method we discuss an implementation on a GPU, which
achieves a speed-up of up to a factor of 20 compared to a single core CPU
implementation.
Comment: 33 pages, 9 figures.
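The "derivatives and filters as matrix multiplication" pattern can be shown with a minimal 1D sketch (an illustration of the general pseudospectral technique, not the paper's spin-weighted construction): a dense Fourier differentiation matrix D is built once, after which differentiating any sampled periodic function is a single matrix-vector product.

```python
import numpy as np

n = 16                                   # even number of grid points
x = 2.0 * np.pi * np.arange(n) / n       # periodic grid on [0, 2*pi)
k = np.fft.fftfreq(n, d=1.0 / n)         # integer wavenumbers

# Columns of D are the derivatives of the cardinal (delta) functions:
# transform the identity to Fourier space, multiply each mode by i*k,
# and transform back. Any filter diagonal in mode space (like a
# spherical-harmonic filter) yields a matrix the same way.
D = np.real(np.fft.ifft(1j * k[:, None] * np.fft.fft(np.eye(n), axis=0),
                        axis=0))

f = np.sin(3.0 * x)
df = D @ f                               # spectral derivative of f
```

Precomputing D trades memory for speed: applying it is a dense matrix multiply, which maps naturally onto GPU hardware, consistent with the speedup reported above.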
The Coyote Universe III: Simulation Suite and Precision Emulator for the Nonlinear Matter Power Spectrum
Many of the most exciting questions in astrophysics and cosmology, including
the majority of observational probes of dark energy, rely on an understanding
of the nonlinear regime of structure formation. In order to fully exploit the
information available from this regime and to extract cosmological constraints,
accurate theoretical predictions are needed. Currently such predictions can
only be obtained from costly, precision numerical simulations. This paper is
the third in a series aimed at constructing an accurate calibration of the
nonlinear mass power spectrum on Mpc scales for a wide range of currently
viable cosmological models, including dark energy. The first two papers
addressed the numerical challenges, and the scheme by which an interpolator was
built from a carefully chosen set of cosmological models. In this paper we
introduce the "Coyote Universe" simulation suite which comprises nearly 1,000
N-body simulations at different force and mass resolutions, spanning 38 wCDM
cosmologies. This large simulation suite enables us to construct a prediction
scheme, or emulator, for the nonlinear matter power spectrum accurate at the
percent level out to k~1 h/Mpc. We describe the construction of the emulator,
explain the tests performed to ensure its accuracy, and discuss how the central
ideas may be extended to a wider range of cosmological models and applications.
A power spectrum emulator code is released publicly as part of this paper.
Comment: 10 pages, 10 figures, minor changes to address the referee report. Version v1.1 of the power spectrum emulator code can be downloaded at http://www.hep.anl.gov/cosmology/CosmicEmu/emu.html; it now includes a Fortran wrapper and a choice of any redshift between z=0 and z=1 (note: the webpage is now maintained at Argonne National Laboratory).
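The emulation idea (run an expensive simulation only at carefully chosen design points, then interpolate) can be sketched with a Gaussian-process regressor; this is a one-parameter toy for illustration, not the Coyote pipeline, and the function and parameter names are invented.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_simulation(omega_m):
    # Stand-in for an N-body run: some smooth response to one parameter.
    return np.sin(3.0 * omega_m) + omega_m**2

# Design points: the few parameter values where we can afford to "simulate".
train_x = np.linspace(0.2, 0.5, 8).reshape(-1, 1)
train_y = expensive_simulation(train_x.ravel())

# The emulator interpolates between the design points.
emulator = GaussianProcessRegressor(kernel=RBF(length_scale=0.1))
emulator.fit(train_x, train_y)

# Prediction at a new parameter value costs microseconds, not CPU-years.
pred = emulator.predict(np.array([[0.33]]))
```

The real problem is higher-dimensional (38 wCDM cosmologies spanning several parameters) and the quantity being emulated is a full power spectrum rather than a scalar, but the design-then-interpolate structure is the same.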
Exploring the capabilities of support vector machines in detecting silent data corruptions
As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected.
In this work, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based on different features. (2) We provide an in-depth study of the detection ability and performance under different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) of up to 99% while suffering a false positive rate of less than 1% in most cases. Our detectors incur low performance overhead, 5% on average, across all benchmarks studied in this work.
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Award Number 66905, program manager Lucy Nowell. Pacific Northwest National Laboratory is operated by Battelle for DOE under Contract DE-AC05-76RL01830. In addition, this material is based upon work supported by the National Science Foundation under Grant No. 1619253, and also by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, program manager Lucy Nowell, under Contract Number DE-AC02-06CH11357 (DOE Catalog project), and in part by European Union FEDER funds under contract TIN2015-65316-P.
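The three feature families compared above can be sketched for one point i of a series of 1D snapshots (the window size and function names here are illustrative, not the paper's exact definitions): temporal features come from the same point in earlier snapshots, spatial features from neighbouring points in the current snapshot, and spatiotemporal features combine both.

```python
import numpy as np

def temporal_features(snapshots, i, t, w=2):
    # Values of point i in the previous w snapshots.
    return np.array([snapshots[t - k][i] for k in range(1, w + 1)])

def spatial_features(snapshots, i, t):
    # Values of the immediate neighbours in the current snapshot.
    return np.array([snapshots[t][i - 1], snapshots[t][i + 1]])

def spatiotemporal_features(snapshots, i, t, w=2):
    # Concatenation of the two: history of the point plus its neighbourhood.
    return np.concatenate([temporal_features(snapshots, i, t, w),
                           spatial_features(snapshots, i, t)])
```

Each choice trades memory for context: temporal features require retaining past snapshots, whereas purely spatial features need only the current one, which is why the spatial variant in the companion work has the smallest memory overhead.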
- …