19 research outputs found

Best practices for HPM-assisted performance engineering on modern multicore processors

Author: F. Günther
J. Treibig
K. Iglberger
M. Burtscher
T. Klug
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/06/2012

Many tools and libraries employ hardware performance monitoring (HPM) on modern processors, and using this data for performance assessment and as a starting point for code optimizations is very popular. However, such data is only useful if it is interpreted with care, and if the right metrics are chosen for the right purpose. We demonstrate the sensible use of hardware performance counters in the context of a structured performance engineering approach for applications in computational science. Typical performance patterns and their respective metric signatures are defined, and some of them are illustrated using case studies. Although these generic concepts do not depend on specific tools or environments, we restrict ourselves to modern x86-based multicore processors and use the likwid-perfctr tool under the Linux OS.

Comment: 10 pages, 2 figures

arXiv.org e-Print Archive

Crossref

Energy aware scheduling model and online heuristics for stencil codes on heterogeneous computing architectures

Author: AA Chandio
AD Pereira
CE Shannon
G Terzopoulos
I Holyer
IM Bomze
J Mei
J Treibig
Jan Weglarz
K Bilal
K Datta
K Kurowski
KA Rojek
Krzysztof Kurowski
M Blazewicz
M Ciznicki
Milosz Ciznicki
S Sellappa
S Williams
VG Vizing
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Performance Engineering: From Numbers to Insight

Author: J. Treibig
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Comparison of finite volume and lattice Boltzmann methods for multicomponent flow simulations

Author: Falcucci G.
Mukherjee S.
Treibig J.
Publication date: 01/01/2020

In pseudopotential lattice Boltzmann (LB) models for simulating multicomponent flows, interaction forces between the components of a mixture lead to phase separation and interfacial tension. At the macroscopic scale, such LB models solve an advection‐diffusion equation for each component and the Navier‐Stokes equations for the fluid mixture. In this paper, the computational efficiency of the LB method is compared with a finite volume (FV) solver for the same macroscopic‐scale equations for a binary system in a two‐dimensional domain. The FV implementation replicates the phase separation of the LB model. Differences in the interfacial tension are due to truncation of the Taylor series expansion of the LB interaction force in the FV version. While the computations required to update the domain for each timestep can be completed faster with the FV approach, a smaller timestep is required to achieve stability, which negates the improvement in processing speed. The FV implementation, however, allows independent variation of model parameters, which is not possible in LB. For example, the viscosity can be changed without affecting interfacial tension or the extent of phase separation. Furthermore, with the FV formulation it is possible to obtain low interfacial tensions without suppressing phase separation. The significance of changing the diffusion rate of components on the deformation of a droplet in shear is also demonstrated. For three‐dimensional simulations, the finite volume approach is expected to be faster than LB and would benefit from the demonstrated flexibility in specifying model parameters.

University of Limerick Institutional Repository

Crossref

Irish Universities

Performance Analysis of the Kahan-Enhanced Scalar Product on Current Multicore Processors

Author: D Goldberg
Georg Hager
J Gregory
J Treibig
P Linz
S Williams
SM Rump
W Kahan
YK Zhu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Application instrumentation for performance analysis and tuning with focus on energy efficiency

Author: Asanovic K
Cesarini D
Gerndt M
Treibig J
Publication venue: 'Wiley'
Publication date: 01/01/2020

Profiling and tuning of parallel applications is an essential part of HPC. Analysis and elimination of application hot spots can be performed using many available tools, which also provide resource consumption measurements for instrumented parts of the code. Since complex applications behave differently in each part of the code, it is essential to be able to insert instrumentation to analyze these parts. Because each performance analysis or autotuning tool can bring different insights into an application's behavior, it is valuable to analyze and optimize an application using a variety of them. We present a shared C/C++ API, inserted on request, for the most common open-source HPC performance analysis tools, which simplifies the process of manual instrumentation. Besides manual instrumentation, profiling libraries provide different methods for instrumentation. Of these, binary patching is the most universal mechanism, and it greatly improves the user-friendliness and robustness of the tool. We provide an overview of the most commonly used binary patching tools and describe a workflow for using them to implement a binary instrumentation tool for any profiler or autotuner. We have also evaluated the minimum overhead of manual and binary instrumentation.

Web of Science

Crossref

DSpace at VSB Technical University of Ostrava

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Exploiting SIMD and Thread-Level Parallelism in Multiblock CFD

Author: G. Albada
J. Treibig
M. Vavra
P. Roe
S. Williams
T. Henretty
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Crossref

A Predictive Performance Model for Stencil Codes on Multicore CPUs

Author: A. Nguyen
A. Schäfer
J. Treibig
S. Kamil
S. Williams
T. Henretty
T. Maruyama
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Crossref

Using Intel Xeon Phi Coprocessor to Accelerate Computations in MPDATA Algorithm

Author: J Treibig
K Rojek
M Wittmann
P Smolarkiewicz
R Wyrzykowski
Z Piotrowski
Publication venue: 'Springer Science and Business Media LLC'

Crossref