A Work-Stealing For Dynamic Workload Balancing On CPU-GPU Heterogeneous Computing Platforms

Abstract

Although many general-purpose workloads have been accelerated on graphics processing units (GPUs) over the last decade, applications whose runtime behavior is dynamic and irregular, such as those based on trees and graphs, have suffered from serious workload imbalance caused by architectural differences between CPUs and GPUs. In this thesis, we propose a work-stealing framework to overcome such problems. The framework allows CPU and GPU threads to steal tasks from each other, as well as from threads within the same device, by leveraging the fine-grained data sharing and thread communication features available on modern CPU-GPU heterogeneous systems. An implementation of breadth-first search (BFS) on top of our framework achieves a performance improvement of at least 8.5% over an implementation that uses a coarse-grained task partitioning scheme. It also achieves a 16% performance improvement on average over its non-stealing counterpart.
