3 research outputs found

    Analytical cost metrics: days of future past

    Get PDF
    2019 Summer.Includes bibliographical references.Future exascale high-performance computing (HPC) systems are expected to be increasingly heterogeneous, consisting of several multi-core CPUs and a large number of accelerators, special-purpose hardware that will increase the computing power of the system in a very energy-efficient way. Specialized, energy-efficient accelerators are also an important component in many diverse systems beyond HPC: gaming machines, general purpose workstations, tablets, phones and other media devices. With Moore's law driving the evolution of hardware platforms towards exascale, the dominant performance metric (time efficiency) has now expanded to also incorporate power/energy efficiency. This work builds analytical cost models for cost metrics such as time, energy, memory access, and silicon area. These models are used to predict the performance of applications, for performance tuning, and chip design. The idea is to work with domain specific accelerators where analytical cost models can be accurately used for performance optimization. The performance optimization problems are formulated as mathematical optimization problems. This work explores the analytical cost modeling and mathematical optimization approach in a few ways. For stencil applications and GPU architectures, the analytical cost models are developed for execution time as well as energy. The models are used for performance tuning over existing architectures, and are coupled with silicon area models of GPU architectures to generate highly efficient architecture configurations. For matrix chain products, analytical closed form solutions for off-chip data movement are built and used to minimize the total data movement cost of a minimum op count tree

    Energy Modeling and Optimization for Tiled Nested-Loop Codes

    No full text
    International audienceWe develop a methodology for modeling the energy efficiency of tiled nested-loop codes running on a graphics processing unit (GPU) and use it for energy efficiency optimization. % We use the polyhedral model, a We assume that a highly optimized and parametrized version of a tiled nested -- loop code, either written by an expert programmer or automatically produced by a polyhedral compilation tool -- is given to us as an input. We then model the energy consumption as an analytical function of a set of parameters characterizing the software and the GPU hardware. Most previous attempts at GPU energy modeling were based on low-level machine models that were then used to model whole programs through simulations, or were analytical models that required low level details. In contrast, our approach develops analytical models based on (i) machine and architecture parameters, (ii) program size parameters as found in the polyhedral model and (iii) tiling parameters, such as those that are chosen by auto-or manual tuners. Our model therefore allows efficient optimization of the energy efficiency with respect to a set of parameters of interest. We illustrate the framework on three nested-loop codes: Smith-Waterman, and one-dimensional and two-dimensional Jacobi stencils, and analyze the accuracy of the resulting models. We also show that the models can be used for optimal tile-size selection for energy efficiency. With optimal choice of model parameters the RMS error is less than 4%. Two factors allow us to attain this high accuracy. The first is domain-specificity: we focus only on tile-able nested-loop codes. The second is that we decouple the energy model from a model of the execution time, a known hard problem
    corecore