Deep learning (DL) gives rise to computational tasks that differ from those encountered in classical scientific applications. In particular, DL training and inference require general matrix multiplications (gemm) whose matrix operands, unlike those in other scientific fields, are far from large and square. In addition, DL models keep growing in arithmetic and storage complexity and, as a result, reduced precision via quantization is now mainstream when inferring DL models on edge devices. Automatic code generation addresses these new types of gemm by (1) improving portability across different hardware with a single base code; (2) supporting mixed and reduced precision; and (3) enabling auto-tuning methods that, given a base operation, perform a (costly) optimization search for the best schedule. In this paper, we rely on Apache TVM to generate an experience-guided gemm that delivers performance competitive with the TVM auto-scheduler while reducing tuning time by a factor of 48x.

This work received funding from projects PID2020-113656RB and PID2021-126576NB-I00 of MCIN/AEI/10.13039/501100011033, and PROMETEO 2023-CI PROM/2022/20. H. Martinez is a POSTDOC_21_00025 postdoctoral fellow supported by Junta de Andalucia. S. Catalan is supported by grant RYC2021-033973-I, funded by MCIN/AEI/10.13039/501100011033 and the "NextGenerationEU"/PRTR, and by UJI-2023-04, funded by Universitat Jaume I.

Castelló, A.; Martínez, H.; Catalán, S.; Igual Peña, F. D.; Quintana-Ortí, E. S. (2025). Experience-guided, mixed-precision matrix multiplication with Apache TVM for ARM processors. The Journal of Supercomputing, 81(1). https://doi.org/10.1007/s11227-024-06720-7
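To illustrate the kind of operation the abstract refers to, the sketch below shows a quantized gemm of the style commonly used in reduced-precision DL inference: the floating-point operands are quantized to int8, the product is accumulated in int32 to avoid overflow, and the result is dequantized. This is a minimal NumPy illustration under an assumed per-tensor symmetric quantization scheme, not the paper's TVM-generated code; all names here are hypothetical.

```python
import numpy as np

def quantized_gemm(a_fp32, b_fp32):
    """Hypothetical sketch of a reduced-precision gemm for DL inference:
    int8 operands, int32 accumulation, per-tensor symmetric quantization."""
    # Quantization scales map the largest magnitude to the int8 range.
    sa = np.max(np.abs(a_fp32)) / 127.0
    sb = np.max(np.abs(b_fp32)) / 127.0
    a_q = np.round(a_fp32 / sa).astype(np.int8)
    b_q = np.round(b_fp32 / sb).astype(np.int8)
    # Accumulate in int32 (int8 products would overflow), then dequantize.
    c_q = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return c_q.astype(np.float32) * (sa * sb)

rng = np.random.default_rng(0)
# DL-style operand shapes: small and rectangular rather than large and square.
A = rng.standard_normal((8, 384)).astype(np.float32)
B = rng.standard_normal((384, 64)).astype(np.float32)
C = quantized_gemm(A, B)
```

The dequantized result approximates the full-precision product `A @ B` up to the quantization error, which is the trade-off that makes int8 inference attractive on edge devices.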