
## Loop Optimization in Presence of STT-MRAM Caches: a Study of Performance-Energy Tradeoffs

Pierre-Yves Péneau<sup>1</sup>, Rabab Bouziane<sup>2</sup>, Abdoulaye Gamatié<sup>1</sup>, Erven Rohou<sup>2</sup>, Florent Bruguier<sup>1</sup>, Gilles Sassatelli<sup>1</sup>, Lionel Torres<sup>1</sup>, Sophiane Senni<sup>1</sup>

<sup>1</sup> LIRMM - CNRS - University of Montpellier <sup>2</sup> Inria/IRISA

## Code optimizations

Original loop nest:

    for (i = 0; i < N; i++)
      for (k = 0; k < M; k++)
        for (j = 0; j < N; j++)
          C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];

Tiling:

    for (ti = 0; ti < N; ti += SI)
      for (tk = 0; tk < M; tk += SK)
        for (tj = 0; tj < N; tj += SJ)
          for (i = ti; i < ti + SI; i++)
            for (k = tk; k < tk + SK; k++)
              for (j = tj; j < tj + SJ; j++)
                C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];

or Interchange:

    for (i = 0; i < N; i++)
      for (j = 0; j < N; j++)
        for (k = 0; k < M; k++)
          C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
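Both transformations only reorder the iterations of the loop nest, so they should leave the contents of `C` unchanged. The following is a minimal sketch of how this could be checked in C. The problem sizes, tile sizes, input values, element type (`double`) and the helper names `init_inputs` and `check_equivalence` are assumptions made for illustration and do not come from the poster. As in the tiled loop nest above, the tile sizes are assumed to divide the loop bounds exactly.

```c
#include <assert.h>
#include <string.h>

/* Assumed problem and tile sizes (illustrative, not from the poster). */
#define N  8
#define M  12
#define SI 4   /* tile size along i, must divide N */
#define SK 3   /* tile size along k, must divide M */
#define SJ 2   /* tile size along j, must divide N */

static double A[N][M], B[N][M];

/* Original loop order: i, k, j. */
static void kernel_original(double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int k = 0; k < M; k++)
            for (int j = 0; j < N; j++)
                C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
}

/* Tiled version: three tile loops around three intra-tile loops. */
static void kernel_tiled(double C[N][N]) {
    /* No boundary handling, so tiles must cover the iteration space exactly. */
    assert(N % SI == 0 && M % SK == 0 && N % SJ == 0);
    for (int ti = 0; ti < N; ti += SI)
        for (int tk = 0; tk < M; tk += SK)
            for (int tj = 0; tj < N; tj += SJ)
                for (int i = ti; i < ti + SI; i++)
                    for (int k = tk; k < tk + SK; k++)
                        for (int j = tj; j < tj + SJ; j++)
                            C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
}

/* Interchanged version: loop order i, j, k. */
static void kernel_interchanged(double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < M; k++)
                C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
}

/* Fill A and B with small integer values so that every product and sum
 * is exactly representable in double and results can be compared bitwise. */
static void init_inputs(void) {
    for (int i = 0; i < N; i++)
        for (int k = 0; k < M; k++) {
            A[i][k] = (double)((i + 2 * k) % 7);
            B[i][k] = (double)((3 * i + k) % 5);
        }
}

/* Returns 1 if the three variants produce identical C matrices, else 0. */
int check_equivalence(void) {
    static double C0[N][N], C1[N][N], C2[N][N];
    init_inputs();
    memset(C0, 0, sizeof C0);
    memset(C1, 0, sizeof C1);
    memset(C2, 0, sizeof C2);
    kernel_original(C0);
    kernel_tiled(C1);
    kernel_interchanged(C2);
    return memcmp(C0, C1, sizeof C0) == 0 && memcmp(C0, C2, sizeof C0) == 0;
}
```

Calling `check_equivalence()` from a `main` function and checking that it returns 1 confirms that the three variants agree on this input. What changes between them is the order in which `A`, `B` and `C` are traversed, and therefore the cache access pattern, which is the property the study is concerned with.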







## Gained insights

Decreased power consumption (up to 31%)

No or low overhead (up to 5.4%)

Observed gain varies with memory operating frequency

Contact information: Pierre-Yves Péneau, LIRMM - CNRS, first.last@lirmm.fr

This work has been partially funded by the French ANR agency under the grant ANR-15-CE25-0007-01 within the CONTINUUM project.







