INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Safety by Construction: Pattern-Based Application of Safety Mechanisms in XANDAR

Author: Antonopoulos Christos P.
Becker Jürgen
Dörr Tobias
Kelefouras Vasilios
Keramidas Georgios
Masing Leonard
Mavropoulos Michail
Schade Florian
Voros Nikolaos
Publication venue
Publication date: 13/07/2022
Field of study

Safety by Construction: Pattern-Based Application of Safety Mechanisms in XANDAR

Author: Antonopoulos Christos P.
Becker Jürgen
Dörr Tobias
Kelefouras Vasilios
Keramidas Georgios
Masing Leonard
Mavropoulos Michail
Schade Florian
Voros Nikolaos
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 02/12/2022
Field of study

A methodology pruning the search space of six compiler transformations by addressing them together as one problem and by exploiting the hardware architecture details

Author: H Rong
KD Cooper
L Almagor
L Renganarayanan
L Wang
M Stephenson
MD Smith
P Kulkarni
PA Kulkarni
PMW Knijnenburg
RC Whaley
S Hack
S Kulkarni
S Rubin
Vasilios Kelefouras
VI Kelefouras
Publication venue: Springer Science and Business Media LLC
Publication date: 09/01/2017
Field of study

Today’s compilers have a plethora of optimizations and transformations to choose from, and the choice, order, and parameters of these transformations have a large impact on performance. Choosing the correct order and parameters has been a long-standing problem in compilation research and remains unsolved: optimizing the sub-problems separately gives a different schedule/binary for each sub-problem, and these schedules cannot coexist, because refining one degrades the other. Researchers try to solve this problem with iterative compilation techniques, but the search space is so large that it cannot be searched even on modern supercomputers. Moreover, compiler transformations do not take the hardware architecture details and data reuse into account efficiently. In this paper, a new iterative compilation methodology is presented which reduces the search space of six compiler transformations by addressing the above problems; the search space is reduced by many orders of magnitude, so an efficient solution can now be found. The transformations are: loop tiling (including the number of levels of tiling), loop unrolling, register allocation, scalar replacement, loop interchange, and data array layouts. The search space is reduced (a) by addressing these transformations together as one problem rather than separately, and (b) by taking into account the custom hardware architecture details (e.g., cache size and associativity) and algorithm characteristics (e.g., data reuse). The proposed methodology has been evaluated against iterative compilation and the gcc/icc compilers, on both embedded and general-purpose processors; it achieves significant performance gains at a compilation time that is many orders of magnitude lower.
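The idea of shrinking a tile-size search space with hardware parameters can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the candidate sizes, and the capacity test are assumptions made for illustration, and cache associativity is ignored entirely.

```python
from itertools import product

def candidate_tiles(n, sizes):
    """All (ti, tj, tk) tile triples drawn from `sizes` that do not exceed n."""
    return [t for t in product(sizes, repeat=3) if max(t) <= n]

def fits_in_cache(tile, cache_bytes, elem_bytes=8):
    """Hypothetical capacity test (an assumption, not the paper's model):
    the tiles of C, A and B touched by one tile step of
    C[i][j] += A[i][k] * B[k][j] must fit in the cache together."""
    ti, tj, tk = tile
    footprint = (ti * tj + ti * tk + tk * tj) * elem_bytes
    return footprint <= cache_bytes

def prune(n, sizes, cache_bytes, elem_bytes=8):
    """Keep only the tile triples that pass the capacity test."""
    return [t for t in candidate_tiles(n, sizes)
            if fits_in_cache(t, cache_bytes, elem_bytes)]

if __name__ == "__main__":
    sizes = [8, 16, 32, 64, 128]           # illustrative candidate tile sizes
    all_tiles = candidate_tiles(1024, sizes)
    kept = prune(1024, sizes, 32 * 1024)   # assumed 32 KiB data cache
    print(len(all_tiles), "candidates before pruning,", len(kept), "after")
```

Only the surviving candidates would then need to be compiled and timed, which is where a reduction in iterative-compilation effort would come from under these assumptions.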

Plymouth Electronic Archive and Research Library

White Rose Research Online

A high-performance matrix-matrix multiplication methodology for CPU and GPU architectures

Author: A. Kritikakou
B Moon
DF Bacon
F Desprez
G Shobaki
HR Arabnia
Iosif Mporas
J Kurzak
K Goto
KD Cooper
M Hattori
M Kulkarni
M Stephenson
M Tartara
MA Wani
N Binkert
N Nethercote
P Bjørstad
P Kulkarni
PA Kulkarni
R Nath
RC Whaley
RD Blumofe
SM Bhandarkar
SS Pinter
T Austin
V Strassen
Vasilios Kelefouras
Vasilios Kolonias
VI Kelefouras
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016
Field of study

Current compilers cannot generate code that competes with hand-tuned code in efficiency, even for a simple kernel like matrix–matrix multiplication (MMM). A key step in program optimization is estimating optimal values for parameters such as tile sizes and the number of levels of tiling. Selecting the scheduling parameter values is a very difficult and time-consuming task, because the parameter values depend on each other; this is why they are usually found with search methods and empirical techniques. To overcome this problem, the scheduling sub-problems must be optimized together, as one problem rather than separately. In this paper, an MMM methodology is presented in which the optimum scheduling parameters are found by reducing the search space theoretically, while the major scheduling sub-problems are addressed together as one problem, according to the hardware architecture parameters and input size; for different hardware architecture parameters and/or input sizes, a different implementation is produced. This is achieved by fully exploiting the software characteristics (e.g., data reuse) and the hardware architecture parameters (e.g., data cache sizes and associativities), giving high-quality solutions and a smaller search space. The methodology applies to a wide range of CPU and GPU architectures.
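For readers unfamiliar with loop tiling, the following is a generic single-level tiled MMM sketch. It is a textbook formulation, not the methodology of the paper; the function name and the `tile` parameter are assumptions, and pure Python is used only to show the loop structure, not to demonstrate performance.

```python
def matmul_tiled(A, B, tile):
    """Compute C = A * B with one level of loop tiling.

    A is n x m, B is m x p, both given as lists of rows.
    `tile` is the tile edge length (a tuning parameter)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    # Outer loops walk over tiles; inner loops stay inside one tile so that
    # the touched parts of A, B and C are reused while they are still cached.
    for ii in range(0, n, tile):
        for kk in range(0, m, tile):
            for jj in range(0, p, tile):
                for i in range(ii, min(ii + tile, n)):
                    row_c, row_a = C[i], A[i]
                    for k in range(kk, min(kk + tile, m)):
                        a = row_a[k]      # scalar kept in a local for reuse
                        row_b = B[k]
                        for j in range(jj, min(jj + tile, p)):
                            row_c[j] += a * row_b[j]
    return C
```

The result does not depend on `tile`; only the memory access order changes, which is why the tile size can be chosen purely from cache parameters and input size.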

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

University of Hertfordshire Research Archive

HAL Descartes

Hal-Diderot

HAL-Rennes 1

A methodology for speeding up matrix vector multiplication for single/multi-core architectures

Author: Angeliki Kritikakou
B Hendrickson
Costas Goutis
Elissavet Papadima
HR Arabnia
MA Wani
N Fujimoto
N Zhang
P Kulkarni
RC Whaley
SM Bhandarkar
Vasilios Kelefouras
Publication venue: Springer Science and Business Media LLC
Publication date: 29/03/2015
Field of study

In this paper, a new methodology for computing dense matrix-vector multiplication is presented, for both embedded processors (without a SIMD unit) and general-purpose processors (single- and multi-core, with a SIMD unit). The methodology achieves higher execution speed than the state-of-the-art ATLAS library (speedup from 1.2 up to 1.45). This is achieved by fully exploiting the combination of software parameters (e.g., data reuse) and hardware parameters (e.g., data cache associativity), which are considered simultaneously as one problem rather than separately, giving a smaller search space and high-quality solutions. The proposed methodology produces a different schedule for different values of (i) the number of levels of data cache; (ii) the data cache sizes; (iii) the data cache associativities; (iv) the data cache and main memory latencies; (v) the data array layout of the matrix; and (vi) the number of cores.
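One common way to exploit data reuse in matrix-vector multiplication is to process several rows at once so that each loaded vector element is used more than once. The sketch below shows this with a row block of two; it is a generic illustration under that assumption, not the schedule produced by the methodology, and the function name is hypothetical.

```python
def matvec_row_blocked(A, x):
    """Compute y = A * x, handling two rows per pass.

    Each x[j] is read once and reused for both rows of the pair,
    which halves the number of loads of x compared to a row-by-row loop."""
    n, m = len(A), len(x)
    y = [0.0] * n
    i = 0
    while i + 1 < n:
        r0, r1 = A[i], A[i + 1]
        acc0 = acc1 = 0.0          # accumulators play the role of registers
        for j in range(m):
            xj = x[j]              # loaded once, used twice
            acc0 += r0[j] * xj
            acc1 += r1[j] * xj
        y[i], y[i + 1] = acc0, acc1
        i += 2
    if i < n:                      # leftover row when n is odd
        y[i] = sum(a * b for a, b in zip(A[i], x))
    return y
```

In a tuned implementation the block height would be chosen from the number of available registers and the cache parameters rather than fixed at two.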

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

XANDAR: Exploiting the X-by-Construction Paradigm in Model-based Development of Safety-critical Systems

Author: Adler Nico
Ahlbrecht Alexander
Antonopoulos Christos P.
Antonopoulos Konstantinos
Becker Juergen
Durak Umut
Dörr Tobias
Garousi Vahid
Karadimas Dimitris
Kelefouras Vasilios
Keramidas Georgios
Khan Rafiullah
Masing Leonard
Mavropoulos Michail
Morales Victor
Nemeth Geza
Panagiotou Christos
Sailer Andreas
Schade Florian
Sezer Sakir
Siddiqui Fahad
Tiganourias Efstratios
Voros Nikolaos
Weber Raphael
Wilhelm Thomas
Zaeske Wanja
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 07/06/2022
Field of study