E-AMOM: An Energy-Aware Modeling and Optimization Methodology for Scientific Applications on Multicore Systems

Author: Lively, Charles

Power consumption is an important constraint on efficient execution on high-performance computing (HPC) multicore systems, and its importance will continue to grow as the number of cores per chip increases. To achieve improved performance on multicore systems, scientific applications must use efficient methods for reducing power consumption and must be further refined to reduce execution time. In this dissertation, we introduce E-AMOM, a performance modeling framework that enables improved execution of scientific applications on parallel multicore systems under a limited power budget. We develop models for each application based on hardware performance counters. Our models use different performance counters for each application and for each performance component (runtime, system power consumption, CPU power consumption, and memory power consumption); the counters are selected via our performance-tuned principal component analysis method. Models developed through E-AMOM provide insight into the characteristics of each application that affect each performance component on a parallel multicore system. Our models are more than 92% accurate across both Hybrid (MPI/OpenMP) and MPI implementations of six scientific applications. E-AMOM includes an optimization component that uses our models to apply run-time Dynamic Voltage and Frequency Scaling (DVFS) and Dynamic Concurrency Throttling to reduce the power consumption of the scientific applications. Further, we optimize the applications based on insights provided by the performance models to reduce their runtime. Our methods and techniques save up to 18% in energy consumption for Hybrid (MPI/OpenMP) and MPI scientific applications and reduce application runtime by up to 11% on parallel multicore systems.
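
The abstract describes counter-based models whose inputs are chosen by principal component analysis. The sketch below illustrates that general idea only. It is not the dissertation's performance-tuned method, whose details the abstract does not give. It assumes a simple stand-in heuristic (keep the counters with the largest absolute loading on the first principal component) and an ordinary least-squares model. The function names, the synthetic counter data, and the "power" target are all hypothetical.

```python
import numpy as np

def select_counters(X, k):
    """Return the sorted indices of the k columns (counters) with the largest
    absolute loading on the first principal component of the standardized data.
    Assumption: a generic heuristic, not E-AMOM's selection procedure."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    loadings = np.abs(Vt[0])
    return np.sort(np.argsort(loadings)[::-1][:k])

def fit_linear_model(X, y):
    """Fit y ~ b0 + X @ b by least squares; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

# Synthetic stand-in for per-run counter readings (hypothetical data).
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)               # shared workload intensity
counters = rng.normal(size=(n, 5))        # five unrelated counters...
counters[:, 0] = latent + 0.1 * rng.normal(size=n)  # ...except 0 and 2,
counters[:, 2] = latent + 0.1 * rng.normal(size=n)  # which track the workload
power = 100 + 8 * counters[:, 0] + 4 * counters[:, 2] + 0.5 * rng.normal(size=n)

selected = select_counters(counters, k=2)
coef = fit_linear_model(counters[:, selected], power)
pred = predict(coef, counters[:, selected])
# Mean absolute percentage error of the fitted model on the same runs.
mape = float(np.mean(np.abs((pred - power) / power)) * 100)
print("selected counters:", selected.tolist())
print("mean absolute percentage error: %.2f%%" % mape)
```

In practice one such model would be fitted per application and per component (runtime, system power, CPU power, memory power), and accuracy would be measured on runs held out from fitting rather than on the training runs used here.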

Texas A&M Repository