
Auto Tuning Method for Deciding Block Size Parameters in Dynamically Load-balanced BLAS

By Yuta Sawa and Reiji Suda


Abstract. High-performance BLAS (Basic Linear Algebra Subprograms) routines are in constant demand in numerical computing. We have implemented DL-BLAS (Dynamically Load-balanced BLAS) to improve BLAS performance on multi-core CPUs when other tasks are competing for CPU resources. DL-BLAS tiles matrices into submatrices to create subtasks and assigns those subtasks to CPU cores dynamically. We found that the dimensions of the submatrices affect performance, so attaining high performance requires solving an optimization problem whose variables are the submatrix dimensions. The search space of this problem is too large for exhaustive search to be realistic. We propose an auto-tuning search algorithm that consists of Diagonal Search and Reductive Search. The algorithm provides semi-optimal parameters in realistic computing time, and in most cases the parameters it found gave the best performance. As a result, DL-BLAS achieved higher performance than ATLAS and GotoBLAS in many performance evaluation tests.
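
The tiling and dynamic assignment described in the abstract can be illustrated with a minimal sketch. It is not the DL-BLAS implementation: the function name `tiled_matmul`, the parameters `block_rows`, `block_cols`, and `num_workers`, and the use of a shared queue are all assumptions made for illustration. The sketch splits the output matrix into tiles, puts one subtask per tile on a queue, and lets each worker thread pull the next tile whenever it becomes free.

```python
import queue
import threading


def tiled_matmul(a, b, block_rows, block_cols, num_workers=4):
    """Multiply a (m x k) by b (k x n) using tiles of the output matrix.

    Hypothetical illustration only; block_rows and block_cols stand in for
    the block size parameters that the paper tunes.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]

    # One subtask per output tile; tiles on the edges may be smaller.
    tasks = queue.Queue()
    for i0 in range(0, m, block_rows):
        for j0 in range(0, n, block_cols):
            tasks.put((i0, min(i0 + block_rows, m), j0, min(j0 + block_cols, n)))

    def worker():
        # Dynamic assignment: a worker takes the next tile only when it is
        # free, so a worker slowed by other load ends up with fewer tiles.
        while True:
            try:
                i0, i1, j0, j1 = tasks.get_nowait()
            except queue.Empty:
                return
            # Tiles are disjoint, so workers never write the same entry of c.
            for i in range(i0, i1):
                for j in range(j0, j1):
                    c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c
```

In this sketch the tile dimensions set the number and size of the subtasks, which is one plausible reason block sizes would matter: smaller tiles give finer load balancing at the cost of more per-task overhead. The sketch does not attempt to reproduce Diagonal Search or Reductive Search, whose details are not given in the abstract.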

Topics: DL-BLAS, Diagonal Search, Reductive Search
Year: 2013
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX