Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM

Alaejos, Guillermo; Alonso-Jordá, Pedro; Castelló, Adrián; Igual, Francisco D.; Martínez, Héctor; Quintana-Ortí, Enrique S.

Publication date: 31 October 2023

Abstract

We explore the use of the Apache TVM open source framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS and OpenBLAS, in order to obtain high-performance blocked formulations of the general matrix multiplication (GEMM). In addition, we fully automate the generation process by also leveraging the Apache TVM framework to derive a complete variety of processor-specific micro-kernels for GEMM. This is in contrast with the convention in high-performance libraries, which hand-encode a single micro-kernel per architecture using assembly code. Overall, the combination of our TVM-generated blocked algorithms and micro-kernels for GEMM 1) improves portability and maintainability and, more generally, streamlines the software life cycle; 2) provides high flexibility to easily tailor and optimize the solution to different data types, processor architectures, and matrix operand shapes, yielding performance on a par with (or even superior to, for specific matrix shapes) that of hand-tuned libraries; and 3) features a small memory footprint.

Comment: 35 pages, 22 figures. Submitted to ACM TOM
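
For readers unfamiliar with the blocked GEMM formulation the abstract refers to, the following is a minimal pure-Python sketch of the loop structure commonly associated with GotoBLAS-style libraries: three outer loops that partition the operands into cache-sized blocks, two inner loops that walk over small tiles, and a micro-kernel that updates one tile of C. It is an illustration only, not the paper's TVM generator. The function names (`micro_kernel`, `blocked_gemm`), the block-size parameters (`mc`, `nc`, `kc`, `mr`, `nr`) and their default values are assumptions chosen for this sketch, and the packing of A and B into contiguous buffers that real libraries perform is omitted.

```python
def micro_kernel(A, B, C, i0, j0, p0, mr, nr, kc):
    """Update the mr x nr tile of C at (i0, j0) with a rank-kc product.

    Hypothetical helper for illustration; production libraries implement
    this step with architecture-specific vector instructions.
    """
    for i in range(i0, i0 + mr):
        for j in range(j0, j0 + nr):
            acc = C[i][j]
            for p in range(p0, p0 + kc):
                acc += A[i][p] * B[p][j]
            C[i][j] = acc


def blocked_gemm(A, B, C, mc=4, nc=4, kc=4, mr=2, nr=2):
    """Compute C += A * B in place using a blocked loop nest.

    A is m x k, B is k x n, C is m x n (lists of lists).
    Block sizes are illustrative assumptions, not tuned values.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, nc):              # column blocks of B and C
        j_end = min(jc + nc, n)
        for pc in range(0, k, kc):          # blocks along the shared dimension
            k_len = min(kc, k - pc)
            for ic in range(0, m, mc):      # row blocks of A and C
                i_end = min(ic + mc, m)
                for jr in range(jc, j_end, nr):      # tiles inside the block
                    for ir in range(ic, i_end, mr):
                        micro_kernel(
                            A, B, C, ir, jr, pc,
                            min(mr, i_end - ir),     # partial tiles at the edges
                            min(nr, j_end - jr),
                            k_len,
                        )
    return C
```

Because the block sizes are ordinary parameters here, the same routine can be called with different values for different operand shapes, which loosely mirrors the flexibility the abstract claims for generated code.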

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2310.20347

Last time updated on 18/01/2024