
    ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales

    As we enter the exascale computing era, efficiently utilizing power and optimizing the performance of scientific applications under power and energy constraints have become critical and challenging. We propose a low-overhead autotuning framework that autotunes performance and energy for various hybrid MPI/OpenMP scientific applications at large scales and explores the tradeoffs between application runtime and power/energy for energy-efficient application execution. We use this framework to autotune four ECP proxy applications: XSBench, AMG, SWFFT, and SW4lite. Our approach uses Bayesian optimization with a Random Forest surrogate model to effectively search parameter spaces with up to 6 million different configurations on two large-scale production systems, Theta at Argonne National Laboratory and Summit at Oak Ridge National Laboratory. The experimental results show that our autotuning framework has low overhead and achieves good scalability at large scales. Using the proposed framework to identify the best configurations, we achieve up to 91.59% performance improvement, up to 21.2% energy savings, and up to 37.84% EDP improvement on up to 4,096 nodes.
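
    The search loop described above can be approximated in a few lines. Below is a minimal sketch, assuming a made-up OpenMP-style parameter space and a hypothetical run_app() measurement; it uses scikit-learn's RandomForestRegressor as the surrogate with a simple lower-confidence-bound acquisition and is not the actual ytopt implementation.

    # Minimal sketch: Bayesian optimization with a Random Forest surrogate over a
    # discrete configuration space (illustrative only; not the ytopt API).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)

    # Hypothetical tuning space: (OpenMP threads, block size, schedule chunk).
    candidates = np.array([(t, b, c)
                           for t in (16, 32, 64)
                           for b in (64, 128, 256, 512)
                           for c in (1, 4, 16)])

    def run_app(cfg):
        # Placeholder for running the application and measuring runtime or EDP.
        t, b, c = cfg
        return (b - 200) ** 2 / 1e4 + 64 / t + 0.1 * c + rng.normal(0, 0.05)

    # Seed the surrogate with a few random evaluations.
    idx = rng.choice(len(candidates), size=5, replace=False)
    X, y = list(candidates[idx]), [run_app(c) for c in candidates[idx]]

    for _ in range(20):
        surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
        # Lower-confidence-bound acquisition: tree-ensemble mean minus spread.
        preds = np.stack([tree.predict(candidates) for tree in surrogate.estimators_])
        acq = preds.mean(axis=0) - preds.std(axis=0)
        nxt = candidates[int(np.argmin(acq))]
        X.append(nxt)
        y.append(run_app(nxt))

    print("best configuration:", X[int(np.argmin(y))], "objective:", min(y))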

    BaCO: A Fast and Portable Bayesian Compiler Optimization Framework

    We introduce the Bayesian Compiler Optimization framework (BaCO), a general-purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. In particular, it handles permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these parameter types and efficiently deliver high-quality code, BaCO uses Bayesian optimization algorithms specialized for the autotuning domain. We demonstrate BaCO's effectiveness on three modern compiler systems: TACO, RISE & ELEVATE, and HPVM2FPGA, for CPUs, GPUs, and FPGAs respectively. For these domains, BaCO outperforms current state-of-the-art autotuners by delivering on average 1.36x-1.56x faster code with a tiny search budget, and BaCO reaches expert-level performance 2.9x-3.9x faster.
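
    A rough illustration of the kind of mixed search space described above is sketched here: a permutation parameter (loop order), an ordered parameter (tile size), and a continuous parameter (unroll fraction), plus a known feasibility constraint. The parameter names, the constraint, and the compile_and_time() stub are assumptions, and the Bayesian search loop is replaced by constrained random sampling for brevity; this is not BaCO's API.

    # Illustrative sketch of a mixed-type compiler tuning space with a known
    # constraint (assumed parameters; random sampling stands in for the BO loop).
    import itertools
    import random

    random.seed(0)

    loop_orders = list(itertools.permutations(("i", "j", "k")))   # permutation
    tile_sizes = (8, 16, 32, 64, 128)                             # ordered
    # unroll fraction in [0.0, 1.0]                               # continuous

    def feasible(order, tile, unroll):
        # Hypothetical known constraint: large tiles require the "k" loop innermost.
        return not (tile >= 64 and order[-1] != "k")

    def compile_and_time(order, tile, unroll):
        # Placeholder cost model standing in for compiling and benchmarking.
        return tile * 0.01 + (0.5 - unroll) ** 2 + (0.0 if order[-1] == "k" else 0.3)

    best = None
    for _ in range(200):
        cfg = (random.choice(loop_orders),
               random.choice(tile_sizes),
               random.random())
        if not feasible(*cfg):      # unknown constraints would be learned online
            continue
        cost = compile_and_time(*cfg)
        if best is None or cost < best[0]:
            best = (cost, cfg)

    print("best config:", best)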

    Application Performance Modeling via Tensor Completion

    Performance tuning, software/hardware co-design, and job scheduling are among the many tasks that rely on models to predict application performance. We propose and evaluate low-rank tensor decomposition for modeling application performance. We discretize the input and configuration domains of an application using regular grids. Application execution times mapped within grid cells are averaged and represented by tensor elements. We show that low-rank canonical-polyadic (CP) tensor decomposition is effective in approximating these tensors. We further show that this decomposition enables accurate extrapolation of unobserved regions of an application's parameter space. We then employ tensor completion to optimize a CP decomposition given a sparse set of observed execution times. We consider alternative piecewise/grid-based models and supervised learning models for six applications and demonstrate that CP decomposition optimized using tensor completion offers higher prediction accuracy and memory efficiency for high-dimensional performance modeling.
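
    The central step, fitting a low-rank CP decomposition against only the observed grid cells, can be sketched as follows. The grid shape, rank, synthetic data, and plain stochastic gradient updates are assumptions for illustration rather than the paper's optimizer.

    # Minimal sketch of rank-R CP decomposition fitted by tensor completion:
    # only a sparse set of observed "execution time" cells drives the fit.
    import numpy as np

    rng = np.random.default_rng(0)
    I, J, K, R = 8, 8, 8, 3                  # grid cells per axis, CP rank

    # Synthetic ground-truth low-rank tensor; observe a 30% random sample of cells.
    A0, B0, C0 = rng.random((I, R)), rng.random((J, R)), rng.random((K, R))
    dense = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
    mask = rng.random((I, J, K)) < 0.30
    obs = np.argwhere(mask)

    # Factor matrices optimized by stochastic gradient descent on observed cells only.
    A = rng.normal(0, 0.1, (I, R))
    B = rng.normal(0, 0.1, (J, R))
    C = rng.normal(0, 0.1, (K, R))
    lr = 0.05
    for epoch in range(1000):
        for i, j, k in obs:
            pred = A[i] @ (B[j] * C[k])                  # CP model for one cell
            r = pred - dense[i, j, k]
            gA, gB, gC = r * B[j] * C[k], r * A[i] * C[k], r * A[i] * B[j]
            A[i] -= lr * gA
            B[j] -= lr * gB
            C[k] -= lr * gC

    # Extrapolation quality on cells that were never observed.
    recon = np.einsum("ir,jr,kr->ijk", A, B, C)
    err = np.linalg.norm((recon - dense)[~mask]) / np.linalg.norm(dense[~mask])
    print("relative error on unobserved cells:", round(float(err), 3))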

    Hybrid Models for Mixed Variables in Bayesian Optimization

    This paper presents a new type of hybrid model for Bayesian optimization (BO) adept at managing mixed variables, encompassing both quantitative (continuous and integer) and qualitative (categorical) types. Our proposed hybrid models merge a Monte Carlo Tree Search (MCTS) structure for categorical variables with Gaussian Processes (GPs) for continuous ones. To address efficiency in the search phase, we juxtapose the original (frequentist) upper confidence bound tree search (UCTS) with a Bayesian Dirichlet search strategy, showcasing the tree architecture's integration into Bayesian optimization. Central to our innovation in the surrogate modeling phase is online kernel selection for mixed-variable BO. These innovations, including dynamic kernel selection, a unique UCTS variant (hybridM), and Bayesian update strategies (hybridD), position our hybrid models as an advancement in mixed-variable surrogate modeling. Numerical experiments underscore the hybrid models' superiority, highlighting their potential in Bayesian optimization.
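
    A toy version of the hybrid idea is sketched below: a UCB1-style choice over a categorical variable combined with a scikit-learn Gaussian Process surrogate over a continuous one. The objective, kernel, and acquisition are assumptions; this is not the paper's hybridM/hybridD implementation.

    # Toy hybrid mixed-variable optimizer: UCB over a categorical "arm",
    # GP surrogate over the continuous variable within the chosen arm.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    rng = np.random.default_rng(0)
    categories = ("solver_a", "solver_b", "solver_c")

    def objective(cat, x):
        # Hypothetical mixed-variable function to maximize.
        shift = {"solver_a": 0.2, "solver_b": 0.5, "solver_c": 0.8}[cat]
        return -(x - shift) ** 2 + (0.3 if cat == "solver_b" else 0.0)

    history = {c: [] for c in categories}   # (x, y) pairs per category

    for t in range(1, 31):
        # UCB1-style selection over the categorical variable.
        def ucb(c):
            pulls = len(history[c])
            if pulls == 0:
                return float("inf")
            return np.mean([y for _, y in history[c]]) + np.sqrt(2 * np.log(t) / pulls)
        cat = max(categories, key=ucb)

        # GP over the continuous variable, conditioned on that category's data.
        if len(history[cat]) >= 2:
            X = np.array([[x] for x, _ in history[cat]])
            y = np.array([y for _, y in history[cat]])
            gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
            grid = np.linspace(0, 1, 101).reshape(-1, 1)
            mu, sd = gp.predict(grid, return_std=True)
            x = float(grid[np.argmax(mu + sd)][0])      # optimistic acquisition
        else:
            x = float(rng.random())

        history[cat].append((x, objective(cat, x)))

    best = max(((c, x, y) for c in categories for x, y in history[c]), key=lambda r: r[2])
    print("best found:", best)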