Search CORE

1,813 research outputs found

Domain knowledge specification for energy tuning

Author: Bendifallah Zakaria
Beseda Martin
Bouizi Othman
Chowdhury Anamika
Gerndt Michael
Kumaraswamy Madhura
Locans Uldis
Vysocký Ondřej
Zapletal Jan
Říha Lubomír
Publication venue: 'Wiley'
Publication date: 01/01/2019
Field of study

To overcome the challenges of energy consumption of HPC systems, the European Union Horizon 2020 READEX (Runtime Exploitation of Application Dynamism for Energy-efficient Exascale computing) project uses an online auto-tuning approach to improve energy efficiency of HPC applications. The READEX methodology pre-computes optimal system configurations at design-time, such as the CPU frequency, for instances of program regions and switches at runtime to the configuration given in the tuning model when the region is executed. READEX goes beyond previous approaches by exploiting dynamic changes of a region's characteristics by leveraging region and characteristic specific system configurations. While the tool suite supports an automatic approach, specifying domain knowledge such as the structure and characteristics of the application and application tuning parameters can significantly help to create a more refined tuning model. This paper presents the means available for an application expert to provide domain knowledge and presents tuning results for some benchmarks.Web of Science316art. no. E465

DSpace at VSB Technical University of Ostrava

An LLVM Instrumentation Plug-in for Score-P

Author: Brendel Ronny
Döbel Sebastian
Herold Christian
Tschüter Ronny
Weber Matthias
Wesarg Bert
Ziegenbalg Johannes
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2017
Field of study

Reducing application runtime, scaling parallel applications to higher numbers of processes/threads, and porting applications to new hardware architectures are tasks necessary in the software development process. Therefore, developers have to investigate and understand application runtime behavior. Tools such as monitoring infrastructures that capture performance relevant data during application execution assist in this task. The measured data forms the basis for identifying bottlenecks and optimizing the code. Monitoring infrastructures need mechanisms to record application activities in order to conduct measurements. Automatic instrumentation of the source code is the preferred method in most application scenarios. We introduce a plug-in for the LLVM infrastructure that enables automatic source code instrumentation at compile-time. In contrast to available instrumentation mechanisms in LLVM/Clang, our plug-in can selectively include/exclude individual application functions. This enables developers to fine-tune the measurement to the required level of detail while avoiding large runtime overheads due to excessive instrumentation.Comment: 8 page

arXiv.org e-Print Archive

Crossref

LIKWID: Lightweight Performance Tools

Author: Hager Georg
Treibig Jan
Wellein Gerhard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

Exploiting the performance of today's microprocessors requires intimate knowledge of the microarchitecture as well as an awareness of the ever-growing complexity in thread and cache topology. LIKWID is a set of command line utilities that addresses four key problems: Probing the thread and cache topology of a shared-memory node, enforcing thread-core affinity on a program, measuring performance counter metrics, and microbenchmarking for reliable upper performance bounds. Moreover, it includes a mpirun wrapper allowing for portable thread-core affinity in MPI and hybrid MPI/threaded applications. To demonstrate the capabilities of the tool set we show the influence of thread affinity on performance using the well-known OpenMP STREAM triad benchmark, use hardware counter tools to study the performance of a stencil code, and finally show how to detect bandwidth problems on ccNUMA-based compute nodes.Comment: 12 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

Scalability and Performance Analysis of OpenMP Codes Using the Periscope Toolkit

Author: Benedict Shajulin
Gerndt Michael
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 10/02/2015
Field of study

In this paper, we present two new approaches while rendering necessary extensions to Periscope to perform scalability and performance analysis on OpenMP codes. Periscope is an online-based performance analysis toolkit which consists of a user defined number of analysis agents that automatically search for the performance properties while the application is running. In order to detect the scalability and performance bottlenecks of OpenMP codes using Periscope, a few newly defined performance properties and meta properties are formalized. We manifest our implementation by evaluating NAS OpenMP benchmarks. As shown in our results, our approach identifies the code regions which do not scale well and other performance problems, e.g. load imbalance in NAS parallel benchmarks

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Distribution of Periscope Analysis Agents on ALTIX 4700

Author: Gerndt Michael
Strohhäcker Sebastian
Publication venue: John von Neumann Institute for Computing
Publication date: 01/01/2007
Field of study

Juelich Shared Electronic Resources

Effects of magnification and visual accommodation on aimpoint estimation in simulated landings with real and virtual image displays

Author: Petitt J. C.
Randle R. J.
Roscoe S. N.
Publication venue
Publication date
Field of study

Twenty professional pilots observed a computer-generated airport scene during simulated autopilot-coupled night landing approaches and at two points (20 sec and 10 sec before touchdown) judged whether the airplane would undershoot or overshoot the aimpoint. Visual accommodation was continuously measured using an automatic infrared optometer. Experimental variables included approach slope angle, display magnification, visual focus demand (using ophthalmic lenses), and presentation of the display as either a real (direct view) or a virtual (collimated) image. Aimpoint judgments shifted predictably with actual approach slope and display magnification. Both pilot judgments and measured accommodation interacted with focus demand with real-image displays but not with virtual-image displays. With either type of display, measured accommodation lagged far behind focus demand and was reliably less responsive to the virtual images. Pilot judgments shifted dramatically from an overwhelming perceived-overshoot bias 20 sec before touchdown to a reliable undershoot bias 10 sec later

NASA Technical Reports Server

LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses

Author: Eitzinger Jan
Hager Georg
Röhl Thomas
Wellein Gerhard
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/08/2017
Field of study

System monitoring is an established tool to measure the utilization and health of HPC systems. Usually system monitoring infrastructures make no connection to job information and do not utilize hardware performance monitoring (HPM) data. To increase the efficient use of HPC systems automatic and continuous performance monitoring of jobs is an essential component. It can help to identify pathological cases, provides instant performance feedback to the users, offers initial data to judge on the optimization potential of applications and helps to build a statistical foundation about application specific system usage. The LIKWID monitoring stack is a modular framework build on top of the LIKWID tools library. It aims on enabling job specific performance monitoring using HPM data, system metrics and application-level data for small to medium sized commodity clusters. Moreover, it is designed to integrate in existing monitoring infrastructures to speed up the change from pure system monitoring to job-aware monitoring.Comment: 4 pages, 4 figures. Accepted for HPCMASPA 2017, the Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications, held in conjunction with IEEE Cluster 2017, Honolulu, HI, September 5, 201

arXiv.org e-Print Archive

Crossref