3 research outputs found

    Static versus Dynamic Memory Allocation: a Comparison for Linear Algebra Kernels

    The polyhedral model makes it possible to automatically improve data locality and enable parallelism in regular linear algebra kernels. In previous work we proposed a new data structure, the 2d-packed layout, to store only the non-zero elements of regular sparse (triangular and banded) matrices, dynamically allocated, for different basic linear algebra operations, and we used Pluto to parallelize and optimize them. To our surprise, there were huge discrepancies in our measurements of these kernels' execution times that were due to the allocation mode: statically declared arrays versus dynamically allocated arrays of pointers. In this paper we compare the performance of various linear algebra kernels, including some from the PolyBench suite, using different array allocation modes. We present our detailed investigation of the possible reasons for the performance variation on two different architectures: a dual 12-core AMD (Magny-Cours) and a dual 10-core Intel Xeon (Haswell-EP). We conclude that static or dynamic memory allocation has an impact on performance in many cases, and that the processor architecture and the gcc compiler's decisions can provoke significant and sometimes surprising variations in favor of one or the other allocation mode.
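    As a minimal sketch of the two allocation modes this abstract refers to (not the paper's benchmark code; the matrix size N and the kernel are illustrative assumptions), a statically declared array is one contiguous block with extents known to the compiler, while an array of pointers allocates each row separately and adds an indirection on every access:

```c
#include <stdlib.h>

#define N 1024  /* illustrative size, not from the paper */

/* Mode 1: statically declared array -- contiguous storage, extents visible
 * to the compiler at every access. */
static double A_static[N][N];

/* Mode 2: dynamically allocated array of pointers -- each row is a separate
 * heap block, reached through an extra pointer dereference. */
static double **alloc_rows(int n)
{
    double **a = malloc(n * sizeof *a);
    for (int i = 0; i < n; i++)
        a[i] = malloc(n * sizeof **a);
    return a;
}

int main(void)
{
    double **A_dynamic = alloc_rows(N);

    /* The same i/j loop nest is written for both declarations, yet the
     * compiler may generate noticeably different code for each mode,
     * which is the kind of discrepancy the paper investigates. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            A_static[i][j]  = 0.0;
            A_dynamic[i][j] = 0.0;
        }

    for (int i = 0; i < N; i++)
        free(A_dynamic[i]);
    free(A_dynamic);
    return 0;
}
```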

    Optimization of Triangular and Banded Matrix Operations Using 2d-Packed Layouts

    Over the past few years, multicore systems have become more and more powerful and, thereby, very useful in high-performance computing. However, many applications, such as some linear algebra algorithms, still cannot take full advantage of these systems. This is mainly due to the shortage of optimization techniques that deal with irregular control structures. In particular, the well-known polyhedral model fails to optimize loop nests whose bounds and/or array references are not affine functions, which is more likely to occur when handling sparse matrices in their packed formats. In this paper, we propose to use 2d-packed layouts and simple affine transformations to enable optimization of triangular and banded matrix operations. The benefit of our proposal is shown through an experimental study over a set of linear algebra benchmarks.
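    A minimal C sketch of the contrast this abstract describes, under stated assumptions: the sizes N and B are illustrative, and the band storage shown is the classical rectangular format rather than the paper's own 2d-packed layout, whose exact index mapping is not reproduced here. The point is that 1d packed triangular storage uses a quadratic subscript, which defeats affine (polyhedral) analysis, whereas a 2d layout with the affine mapping (i, j - i + B) keeps both bounds and subscripts analyzable by tools such as Pluto:

```c
#include <stdio.h>

#define N 8   /* matrix order (illustrative) */
#define B 2   /* bandwidth (illustrative)    */

/* Classical 1d packed lower triangle: element (i, j), j <= i, is stored at
 * index i*(i+1)/2 + j. The term i*(i+1)/2 is not an affine function of i. */
static double tri_packed[N * (N + 1) / 2];

static double tri_get(int i, int j)
{
    return tri_packed[i * (i + 1) / 2 + j];
}

/* Rectangular 2d band storage: element (i, j) with |i - j| <= B is stored at
 * band[i][j - i + B]. The subscript is affine in (i, j). */
static double band[N][2 * B + 1];

static double band_get(int i, int j)
{
    return band[i][j - i + B];
}

int main(void)
{
    /* Touch every stored band element; the loop bounds are max/min of
     * affine expressions, which the polyhedral model handles. */
    for (int i = 0; i < N; i++) {
        int lo = (i - B > 0) ? i - B : 0;
        int hi = (i + B < N - 1) ? i + B : N - 1;
        for (int j = lo; j <= hi; j++)
            band[i][j - i + B] = 1.0;
    }
    printf("%f %f\n", tri_get(3, 1), band_get(3, 4));
    return 0;
}
```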

    Layout-oblivious compiler optimization for matrix computations

    No full text