GBOTuner: Autotuning of OpenMP Parallel Codes with Bayesian Optimization and Code Representation Transfer Learning

Abstract

Empirical autotuning methods such as Bayesian optimization (BO) are a powerful approach for optimizing the tuning parameters of parallel codes as black boxes. However, BO is expensive because it relies on empirical samples obtained from true evaluations of varying parameter configurations. In this thesis, we present GBOTuner, an autotuning framework for optimizing the performance of OpenMP parallel codes; OpenMP is a widely used API that enables shared-memory parallelism in C, C++, and Fortran through simple compiler directives. GBOTuner improves the sample efficiency of BO by incorporating code representation learning from a Graph Neural Network (GNN) into the BO autotuning pipeline. Compared to typical BO, GBOTuner uses a hybrid approach that exploits not only a Gaussian Process (GP)-based surrogate model learned from empirical samples of the given target code but also a GNN-based performance prediction model learned from other codes. We evaluate GBOTuner using 78 OpenMP parallel code kernels obtained from five benchmark suites. GBOTuner significantly and consistently improves tuning cost and quality over state-of-the-art BO tools in most cases, especially under a small tuning budget, achieving up to 1.4x/1.3x better tuned results on an Intel and an AMD platform, respectively.
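To illustrate the general idea of such a hybrid approach (not GBOTuner's actual implementation), the following minimal Python sketch runs a BO loop over OpenMP configurations in which a GP surrogate fit on measured runtimes is blended with a stand-in transferred predictor (playing the role of the GNN prior) when scoring candidates by Expected Improvement. The functions measure_runtime and gnn_prior, the candidate space, and the 0.7/0.3 blending weights are all illustrative assumptions.

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Candidate OpenMP configurations: (num_threads, schedule index, chunk_size).
SCHEDULES = ["static", "dynamic", "guided"]
CANDIDATES = [(t, s, c) for t in (2, 4, 8, 16)
              for s in range(len(SCHEDULES)) for c in (1, 16, 64)]

def encode(cfg):
    """Numeric feature vector for the surrogate model."""
    t, s, c = cfg
    return [t, s, c]

def measure_runtime(cfg):
    """Placeholder for building/running the target kernel with `cfg`
    (e.g. via OMP_NUM_THREADS and OMP_SCHEDULE) and timing it."""
    t, s, c = cfg
    return 100.0 / t + 0.5 * s + 0.01 * c + np.random.normal(0.0, 0.1)

def gnn_prior(cfg):
    """Stand-in for a performance predictor trained on other codes;
    here just a crude analytic guess."""
    t, _, _ = cfg
    return 100.0 / t

def expected_improvement(mu, sigma, best):
    sigma = np.maximum(sigma, 1e-9)
    z = (best - mu) / sigma  # minimization: improvement means runtime below best
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
observed = [CANDIDATES[i] for i in rng.choice(len(CANDIDATES), 3, replace=False)]
runtimes = [measure_runtime(c) for c in observed]

for _ in range(10):  # small tuning budget of 10 further evaluations
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit([encode(c) for c in observed], runtimes)
    best = min(runtimes)
    pool = [c for c in CANDIDATES if c not in observed]
    mu, sigma = gp.predict([encode(c) for c in pool], return_std=True)
    # Hybrid score: blend the GP posterior mean with the transferred prior.
    blended_mu = 0.7 * mu + 0.3 * np.array([gnn_prior(c) for c in pool])
    ei = expected_improvement(blended_mu, sigma, best)
    nxt = pool[int(np.argmax(ei))]
    observed.append(nxt)
    runtimes.append(measure_runtime(nxt))

print("best configuration found:", observed[int(np.argmin(runtimes))])

In an actual tuner, measure_runtime would compile and execute the target OpenMP kernel under the chosen configuration, and the prior would come from a pretrained GNN over a graph representation of the code rather than an analytic guess.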

This paper was published in Scholar Commons - Santa Clara University.
