Search CORE

3 research outputs found

Power monitoring with PAPI for extreme scale architectures and dataflow-based programming models

Author: Anthony Danalis
Heike Mccraw
Jack Dongarra
James Ralph
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/12/2014
Field of study

Abstract—For more than a decade, the PAPI performance-monitoring library has provided a clear, portable interface to the hardware performance counters available on all modern CPUs and other components of interest (e.g., GPUs, network, and I/O systems). Most major end-user tools that application developers use to analyze the performance of their applications rely on PAPI to gain access to these performance counters. One of the critical roadblocks on the way to larger, more complex high performance systems, has been widely identified as being the energy efficiency constraints. With modern extreme scale machines having hundreds of thousands of cores, the ability to reduce power consumption for each CPU at the software level becomes critically important, both for economic and environmental reasons. In order for PAPI to continue playing its well established role in HPC, it is pressing to provide valuable performance data that not only originates from within the processing cores but also delivers insight into the power consumption of the system as a whole. An extensive effort has been made to extend the Perfor-mance API to support power monitoring capabilities for various platforms. This paper provides detailed information about three components that allow power monitoring on the Intel Xeon Phi and Blue Gene/Q. Furthermore, we discuss the integration of PAPI in PARSEC – a task-based dataflow-driven execution engine – enabling hardware performance counter and power monitoring at true task granularity. I

CiteSeerX

Crossref

Application power profiling on IBM Blue Gene/Q

Author: Alam
Bautista Gomez
Bautista-Gomez
Bekas
Chun
chun Feng
Dongarra
Duy
Freeh
Ge
Goiri
Haring
Hennecke
John Tramm
Kant
Kurien
Lakner
Laros
Leon
Malossi
Mammela
Michael E. Papka
Pakbaznia
Patil
Sean Wallace
Susan Coghlan
Tang
Tramm
Venkatram Vishwanath
Wallace
Wallace
Wallace
Weaver
Weaver
Xu
Yang
Yoshii
Zhiling Lan
Zhou Zhou
Publication venue: 'Elsevier BV'
Publication date
Field of study

Crossref

Evaluating technologies and techniques for transitioning hydrodynamics applications to future generations of supercomputers

Author: Mallinson Andrew C.
Publication venue
Publication date
Field of study

Current supercomputer development trends present severe challenges for scientific codebases. Moore’s law continues to hold, however, power constraints have brought an end to Dennard scaling, forcing significant increases in overall concurrency. The performance imbalance between the processor and memory sub-systems is also increasing and architectures are becoming significantly more complex. Scientific computing centres need to harness more computational resources in order to facilitate new scientific insights and maintaining their codebases requires significant investments. Centres therefore have to decide how best to develop their applications to take advantage of future architectures. To prevent vendor "lock-in" and maximise investments, achieving portableperformance across multiple architectures is also a significant concern. Efficiently scaling applications will be essential for achieving improvements in science and the MPI (Message Passing Interface) only model is reaching its scalability limits. Hybrid approaches which utilise shared memory programming models are a promising approach for improving scalability. Additionally PGAS (Partitioned Global Address Space) models have the potential to address productivity and scalability concerns. Furthermore, OpenCL has been developed with the aim of enabling applications to achieve portable-performance across a range of heterogeneous architectures. This research examines approaches for achieving greater levels of performance for hydrodynamics applications on future supercomputer architectures. The development of a Lagrangian-Eulerian hydrodynamics application is presented together with its utility for conducting such research. Strategies for improving application performance, including PGAS- and hybrid-based approaches are evaluated at large node-counts on several state-of-the-art architectures. Techniques to maximise the performance and scalability of OpenMP-based hybrid implementations are presented together with an assessment of how these constructs should be combined with existing approaches. OpenCL is evaluated as an additional technology for implementing a hybrid programming model and improving performance-portability. To enhance productivity several tools for automatically hybridising applications and improving process-to-topology mappings are evaluated. Power constraints are starting to limit supercomputer deployments, potentially necessitating the use of more energy efficient technologies. Advanced processor architectures are therefore evaluated as future candidate technologies, together with several application optimisations which will likely be necessary. An FPGA-based solution is examined, including an analysis of how effectively it can be utilised via a high-level programming model, as an alternative to the specialist approaches which currently limit the applicability of this technology

Warwick Research Archives Portal Repository