
    Towards an Achievable Performance for the Loop Nests

    Numerous code optimization techniques, including loop nest optimizations, have been developed over the last four decades. Loop optimization techniques transform loop nests to improve the performance of the code on a target architecture, including by exposing parallelism. Finding and evaluating an optimal, semantics-preserving sequence of transformations is a complex problem: the search is guided by heuristics and/or analytical models, and there is no way of knowing how close the result gets to optimal performance or whether any headroom for improvement remains. This paper makes two contributions. First, it uses a comparative analysis of loop optimizations/transformations across multiple compilers to determine how much headroom may exist for each compiler. Second, it presents an approach that characterizes loop nests by their hardware performance counter values, together with a machine learning model that predicts which compiler will generate the fastest code for a given loop nest. The prediction is made both for auto-vectorized serial compilation and for auto-parallelization. Based on these predictions, the headroom for state-of-the-art compilers ranges from 1.10x to 1.42x for serial code and from 1.30x to 1.71x for auto-parallelized code.
    Comment: Accepted at the 31st International Workshop on Languages and Compilers for Parallel Computing (LCPC 2018).
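
    The headroom analysis rests on the kind of semantics-preserving loop transformation sketched below. The example is hypothetical (not taken from the paper), with made-up function names; it shows a loop interchange that turns a stride-N walk over a matrix into a unit-stride one, the sort of rewrite these compilers apply to improve locality and enable auto-vectorization.

        /* Hypothetical example, not from the paper: column sums of an
         * N x N matrix. Interchanging the i/j loops preserves the
         * per-element summation order while making the inner loop
         * unit-stride, hence cache-friendly and auto-vectorizable.
         * Both functions assume out[] is zero-initialized by the caller. */
        #include <stddef.h>
        #define N 1024

        /* Before: inner loop walks b with stride N (poor locality). */
        void col_sums_naive(const double b[N][N], double out[N]) {
            for (size_t j = 0; j < N; j++)
                for (size_t i = 0; i < N; i++)
                    out[j] += b[i][j];   /* stride-N access to b */
        }

        /* After interchange: both accesses are stride-1. */
        void col_sums_interchanged(const double b[N][N], double out[N]) {
            for (size_t i = 0; i < N; i++)
                for (size_t j = 0; j < N; j++)
                    out[j] += b[i][j];   /* stride-1 access to b and out */
        }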

    Bouclettes: A Fortran loop parallelizer


    Extended Lattice-Based Memory Allocation

    This work extends lattice-based memory allocation, an earlier work on memory reuse through array contraction. Such an optimization is used for optimizing high-level programming languages, where storage mapping may be abstracted away from programmers, and to complement code transformations that introduce intermediate buffers. The main motivation for this extension is to better handle the more general forms of specifications seen today, e.g., with loop tiling, pipelining, and other forms of parallelism available in explicitly parallel languages. Specifically, we handle the case where the conflicting constraints (those that describe the array indices that cannot share the same memory location) are specified as a (non-convex) union of polyhedra. The choice of directions (or basis) of array reuse becomes important when dealing with non-convex specifications. We extend the two dual approaches of the original work to handle unions of polyhedra and to select a suitable basis. Our final approach relies on a combination of the two, also revealing their links with, on one hand, the construction of multi-dimensional schedules for parallelism and tiling (but with a fundamental difference that we identify) and, on the other hand, the construction of universal occupancy vectors (UOV), which were previously used only in a specific context, for schedule-independent mapping.
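
    A minimal sketch of the core idea being extended, array contraction through a modular mapping (my own C illustration, not code from the paper): when the indices that conflict, i.e. are simultaneously live, lie within a bounded distance b of each other, the mapping i -> i mod (b+1) is a valid allocation, and a full array shrinks to a small circular buffer.

        /* Illustrative sketch, not from the paper. a[i] depends on
         * a[i-1] and a[i-2], so indices at distance <= 2 conflict and
         * i mod 3 is a valid modular storage mapping: the N-element
         * array contracts to 3 cells. */
        #include <stdio.h>
        #define N 20

        int main(void) {
            int a[3];                 /* contracted storage for N values */
            a[0 % 3] = 0;
            a[1 % 3] = 1;
            for (int i = 2; i < N; i++)
                a[i % 3] = a[(i - 1) % 3] + a[(i - 2) % 3];
            printf("last value: %d\n", a[(N - 1) % 3]);
            return 0;
        }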

    Tiling and memory reuse for sequences of nested loops

    Our aim is to minimize the electrical energy used during the execution of signal processing applications structured as a sequence of loop nests. This energy is mostly spent transferring data among the levels of the memory hierarchy. To minimize these transfers, we transform such programs by simultaneously applying loop permutation, tiling, loop fusion with shifting, and memory reuse. Each nest consumes a stencil of data produced by the previous nest, and references to the same array are equal up to a shift. All the transformations described in this paper have been implemented in PIPS, our optimizing compiler, and the resulting reductions in cache misses have been measured.
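
    The sketch below (my own C illustration, not the PIPS source; names are made up) shows the fusion-with-shifting pattern on a two-nest sequence: the second nest reads a 3-point stencil of the first nest's output, so after shifting it by one iteration the two nests fuse, and the intermediate array reduces to a 3-value rotating window. That window is the memory-reuse effect that cuts the data transfers.

        /* Illustrative sketch, not from the paper. */
        #include <stddef.h>
        #define N 1000

        /* Original sequence: nest 1 fills b, nest 2 reads a 3-point
         * stencil of b. */
        void two_nests(const double a[N], double c[N]) {
            static double b[N];      /* full intermediate array */
            for (size_t i = 0; i < N; i++)
                b[i] = 2.0 * a[i];
            for (size_t i = 1; i + 1 < N; i++)
                c[i] = b[i - 1] + b[i] + b[i + 1];
        }

        /* Fused, with nest 2 shifted by one iteration: b[i+1] is
         * produced just before c[i] needs it, so only three b-values
         * are ever live. */
        void fused_shifted(const double a[N], double c[N]) {
            double bm1 = 2.0 * a[0]; /* holds b[i-1] */
            double b0  = 2.0 * a[1]; /* holds b[i]   */
            for (size_t i = 1; i + 1 < N; i++) {
                double bp1 = 2.0 * a[i + 1]; /* b[i+1], just in time */
                c[i] = bm1 + b0 + bp1;
                bm1 = b0;            /* rotate the window */
                b0  = bp1;
            }
        }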

    Distribution of NAD Glycohydrolase Among Liver Cells


    Circuit retiming applied to decomposed software pipelining
