3 research outputs found

    Static versus Dynamic Memory Allocation: a Comparison for Linear Algebra Kernels

    The polyhedral model makes it possible to automatically improve data locality and enable parallelism in regular linear algebra kernels. In previous work we proposed a new data structure, the 2d-packed layout, to store only the non-zero elements of regular sparse (triangular and banded) matrices, dynamically allocated, for different basic linear algebra operations, and we used Pluto to parallelize and optimize them. To our surprise, there were huge discrepancies in our measurements of these kernels' execution times that were due to the allocation mode: statically declared arrays versus dynamically allocated arrays of pointers. In this paper we compare the performance of various linear algebra kernels, including some from the PolyBench suite, using different array allocation modes. We present our detailed investigation of the possible reasons for the performance variation on two different architectures: a dual 12-core AMD (Magny-Cours) and a dual 10-core Intel Xeon (Haswell-EP). We conclude that static or dynamic memory allocation has an impact on performance in many cases, and that the processor architecture and the gcc compiler's decisions can provoke significant and sometimes surprising variations in favor of one or the other allocation mode.
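    As a minimal sketch of the two allocation modes this abstract refers to (not the paper's benchmark code; the matrix size N and the kernel are illustrative assumptions), a statically declared array is one contiguous block with extents known to the compiler, while an array of pointers allocates each row separately and adds an indirection on every access:

```c
#include <stdlib.h>

#define N 1024  /* illustrative size, not from the paper */

/* Mode 1: statically declared array -- contiguous storage, extents visible
 * to the compiler at every access. */
static double A_static[N][N];

/* Mode 2: dynamically allocated array of pointers -- each row is a separate
 * heap block, reached through an extra pointer dereference. */
static double **alloc_rows(int n)
{
    double **a = malloc(n * sizeof *a);
    for (int i = 0; i < n; i++)
        a[i] = malloc(n * sizeof **a);
    return a;
}

int main(void)
{
    double **A_dynamic = alloc_rows(N);

    /* The same i/j loop nest is written for both declarations, yet the
     * compiler may generate noticeably different code for each mode,
     * which is the kind of discrepancy the paper investigates. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A_static[i][j]  = 0.0;
            A_dynamic[i][j] = 0.0;
        }

    for (int i = 0; i < N; i++)
        free(A_dynamic[i]);
    free(A_dynamic);
    return 0;
}
```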

    Optimization of Triangular and Banded Matrix Operations Using 2d-Packed Layouts

    Over the past few years, multicore systems have become more and more powerful and, thereby, very useful in high-performance computing. However, many applications, such as some linear algebra algorithms, still cannot take full advantage of these systems. This is mainly due to the shortage of optimization techniques that deal with irregular control structures. In particular, the well-known polyhedral model fails to optimize loop nests whose bounds and/or array references are not affine functions, which is more likely to occur when handling sparse matrices in their packed formats. In this paper, we propose to use 2d-packed layouts and simple affine transformations to enable optimization of triangular and banded matrix operations. The benefit of our proposal is shown through an experimental study over a set of linear algebra benchmarks.
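    A minimal C sketch of the contrast this abstract describes, under stated assumptions: the sizes N and B are illustrative, and the band storage shown is the classical rectangular format rather than the paper's own 2d-packed layout, whose exact index mapping is not reproduced here. The point is that 1d packed triangular storage uses a quadratic subscript, which defeats affine (polyhedral) analysis, whereas a 2d layout with the affine mapping (i, j - i + B) keeps both bounds and subscripts analyzable by tools such as Pluto:

```c
#include <stdio.h>

#define N 8   /* matrix order (illustrative) */
#define B 2   /* bandwidth (illustrative)    */

/* Classical 1d packed lower triangle: element (i, j), j <= i, is stored at
 * index i*(i+1)/2 + j. The term i*(i+1)/2 is not an affine function of i. */
static double tri_packed[N * (N + 1) / 2];

static double tri_get(int i, int j)
{
    return tri_packed[i * (i + 1) / 2 + j];
}

/* Rectangular 2d band storage: element (i, j) with |i - j| <= B is stored at
 * band[i][j - i + B]. The subscript is affine in (i, j). */
static double band[N][2 * B + 1];

static double band_get(int i, int j)
{
    return band[i][j - i + B];
}

int main(void)
{
    /* Touch every stored band element; the loop bounds are max/min of
     * affine expressions, which the polyhedral model handles. */
    for (int i = 0; i < N; i++) {
        int lo = (i - B > 0) ? i - B : 0;
        int hi = (i + B < N - 1) ? i + B : N - 1;
        for (int j = lo; j <= hi; j++)
            band[i][j - i + B] = 1.0;
    }
    printf("%f %f\n", tri_get(3, 1), band_get(3, 4));
    return 0;
}
```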

    Layout-oblivious compiler optimization for matrix computations

    No full text