Deep learning (DL) gives rise to computational tasks that differ from those encountered in classical scientific applications. In particular, DL training and inference require general matrix multiplications (gemm) whose matrix operands, unlike those in other scientific fields, are far from large and square. In addition, DL models keep growing in arithmetic and storage complexity and, as a result, reduced precision via quantization is now mainstream when inferring DL models on edge devices. Automatic code generation addresses these new types of gemm by (1) improving portability across different hardware with a single base code; (2) supporting mixed and reduced precision; and (3) enabling auto-tuning methods that, given a base operation, perform a (costly) optimization search for the best schedule. In this paper, we rely on Apache TVM to generate an experience-guided gemm that delivers performance competitive with the TVM auto-scheduler while reducing tuning time by a factor of 48x.

This work received funding from projects PID2020-113656RB and PID2021-126576NB-I00 of MCIN/AEI/10.13039/501100011033, and PROMETEO 2023-CI PROM/2022/20. H. Martinez is a POSTDOC_21_00025 postdoctoral fellow supported by Junta de Andalucia. S. Catalan is supported by grant RYC2021-033973-I, funded by MCIN/AEI/10.13039/501100011033 and the "NextGenerationEU"/PRTR, and by UJI-2023-04, funded by Universitat Jaume I.

Castelló, A.; Martínez, H.; Catalán, S.; Igual Peña, F. D.; Quintana-Ortí, E. S. (2025). Experience-guided, mixed-precision matrix multiplication with Apache TVM for ARM processors. The Journal of Supercomputing, 81(1). https://doi.org/10.1007/s11227-024-06720-7
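To illustrate the kind of operation the abstract refers to, the sketch below shows a quantized gemm of the style commonly used in reduced-precision DL inference: the floating-point operands are quantized to int8, the product is accumulated in int32 to avoid overflow, and the result is dequantized. This is a minimal NumPy illustration under an assumed per-tensor symmetric quantization scheme, not the paper's TVM-generated code; all names here are hypothetical.

```python
import numpy as np

def quantized_gemm(a_fp32, b_fp32):
    """Hypothetical sketch of a reduced-precision gemm for DL inference:
    int8 operands, int32 accumulation, per-tensor symmetric quantization."""
    # Quantization scales map the largest magnitude to the int8 range.
    sa = np.max(np.abs(a_fp32)) / 127.0
    sb = np.max(np.abs(b_fp32)) / 127.0
    a_q = np.round(a_fp32 / sa).astype(np.int8)
    b_q = np.round(b_fp32 / sb).astype(np.int8)
    # Accumulate in int32 (int8 products would overflow), then dequantize.
    c_q = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return c_q.astype(np.float32) * (sa * sb)

rng = np.random.default_rng(0)
# DL-style operand shapes: small and rectangular rather than large and square.
A = rng.standard_normal((8, 384)).astype(np.float32)
B = rng.standard_normal((384, 64)).astype(np.float32)
C = quantized_gemm(A, B)
```

The dequantized result approximates the full-precision product `A @ B` up to the quantization error, which is the trade-off that makes int8 inference attractive on edge devices.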