451 research outputs found

Evaluation of DVFS techniques on modern HPC processors and accelerators for energy-aware applications

Author: Biferale, Calore, Crimi, Dick, Etinski, Ge, Khabi, Lim, Mantovani, Mazouz, Peraza, Sbragaglia, Scagliarini, Succi, Sundriyal, Williams, Wittmann
Publication venue: 'Wiley'
Publication date: 01/01/2017

Energy efficiency is becoming increasingly important for computing systems, in particular for large-scale HPC facilities. In this work we evaluate, from a user perspective, the use of Dynamic Voltage and Frequency Scaling (DVFS) techniques, assisted by the power and energy monitoring capabilities of modern processors, in order to tune applications for energy efficiency. We run selected kernels and a full HPC application on two high-end processors widely used in the HPC context, namely an NVIDIA K80 GPU and an Intel Haswell CPU. We evaluate the available trade-offs between energy-to-solution and time-to-solution, attempting a function-by-function frequency tuning. We finally estimate the benefits obtainable by running the full code on an HPC multi-GPU node, with respect to the default clock frequency governors. We instrument our code to accurately monitor power consumption and execution time without the need for any additional hardware, and we enable it to change CPU and GPU clock frequencies while running. We analyze our results on the different architectures using a simple energy-performance model, and derive a number of energy-saving strategies which can be easily adopted on recent high-end HPC systems for generic applications.
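
The trade-off between energy-to-solution and time-to-solution mentioned in the abstract can be illustrated with a minimal Python sketch. It relies only on the general relation that energy equals average power multiplied by elapsed time. The frequencies, power values, timings, and function names below are hypothetical placeholders, not measurements or code from the paper, and the paper's own energy-performance model is not reproduced here.

```python
# Hypothetical measurements: clock frequency (MHz) -> (average power in W, time-to-solution in s).
# These numbers are invented for illustration and are NOT taken from the paper.
measurements = {
    600: (85.0, 20.0),
    800: (100.0, 16.0),
    1000: (130.0, 14.0),
}


def energy_to_solution(avg_power_w, time_s):
    """Energy in joules = average power (W) x elapsed time (s)."""
    return avg_power_w * time_s


def summarize(measurements):
    """Return {frequency: (time_s, energy_j)} for each measured frequency."""
    return {f: (t, energy_to_solution(p, t)) for f, (p, t) in measurements.items()}


def best_frequency(measurements, key="energy"):
    """Pick the frequency that minimizes energy ("energy") or run time ("time")."""
    summary = summarize(measurements)
    idx = 1 if key == "energy" else 0
    return min(summary, key=lambda f: summary[f][idx])


table = summarize(measurements)
for f in sorted(table):
    t, e = table[f]
    print(f"{f} MHz: {t:.1f} s, {e:.0f} J")
print("lowest energy:", best_frequency(measurements, "energy"), "MHz")
print("lowest time:", best_frequency(measurements, "time"), "MHz")
```

With these placeholder values the frequency that minimizes energy differs from the one that minimizes run time, which is the kind of trade-off a per-function frequency choice would have to weigh.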

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Ferrara

Reducing Communication in the Solution of Linear Systems

Author: Lindquist Neil S
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2023

There is a growing performance gap between computation and communication on modern computers, making it crucial to develop algorithms with lower latency and bandwidth requirements. Because systems of linear equations are important for numerous scientific and engineering applications, I have studied several approaches for reducing communication in those problems. First, I developed optimizations to dense LU with partial pivoting, which downstream applications can adopt with little to no effort. Second, I consider two techniques to completely replace pivoting in dense LU, which can provide significantly higher speedups, albeit without the same numerical guarantees as partial pivoting. One technique uses randomized preprocessing, while the other is a novel combination of block factorization and additive perturbation. Finally, I investigate using mixed precision in GMRES for solving sparse systems, which reduces the volume of data movement and thus the pressure on the memory bandwidth.

University of Tennessee, Knoxville: Trace