Loop Optimization in Presence of STT-MRAM Caches: a Study of Performance-Energy Tradeoffs by Péneau, Pierre-Yves et al.
Pierre-Yves Péneau¹, Rabab Bouziane², Abdoulaye Gamatié¹, Erven Rohou²,
Florent Bruguier¹, Gilles Sassatelli¹, Lionel Torres¹, Sophiane Senni¹
¹ LIRMM - CNRS - University of Montpellier
² Inria/IRISA
This work has been partially funded by the French ANR under grant ANR-15-CE25-0007-01, within the CONTINUUM project.
Code optimizations
Original loop nest:
for (i = 0; i < N; i++)
    for (k = 0; k < M; k++)
        for (j = 0; j < N; j++)
            C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
Tiling (tile sizes SI, SK, SJ):
for (ti = 0; ti < N; ti += SI)
    for (tk = 0; tk < M; tk += SK)
        for (tj = 0; tj < N; tj += SJ)
            for (i = ti; i < ti + SI; i++)
                for (k = tk; k < tk + SK; k++)
                    for (j = tj; j < tj + SJ; j++)
                        C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
Interchange (loops j and k swapped):
for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
        for (k = 0; k < M; k++)
            C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
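
The loop nests above can be checked against each other with a small C harness. The sketch below is an illustration only, not the authors' experimental setup: the sizes N, M, SI, SK and SJ, the input values, and the names init_inputs, kernel_original, kernel_tiled, kernel_interchanged and variants_agree are all assumptions made for the example. The tiled version, as written on the poster, has no boundary handling, so the sketch assumes the tile sizes divide N and M evenly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical problem and tile sizes (not taken from the poster). */
enum { N = 32, M = 24, SI = 8, SK = 8, SJ = 8 };

static double A[N][M], B[N][M];

/* Fill the inputs with small integer values so every sum is exact in double. */
static void init_inputs(void) {
    for (int i = 0; i < N; i++)
        for (int k = 0; k < M; k++) {
            A[i][k] = (double)((i * 3 + k) % 7);
            B[i][k] = (double)((i + k * 5) % 5);
        }
}

/* Original loop order: i, k, j. */
static void kernel_original(double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int k = 0; k < M; k++)
            for (int j = 0; j < N; j++)
                C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
}

/* Tiled version: same i, k, j order, iterated tile by tile. */
static void kernel_tiled(double C[N][N]) {
    for (int ti = 0; ti < N; ti += SI)
        for (int tk = 0; tk < M; tk += SK)
            for (int tj = 0; tj < N; tj += SJ)
                for (int i = ti; i < ti + SI; i++)
                    for (int k = tk; k < tk + SK; k++)
                        for (int j = tj; j < tj + SJ; j++)
                            C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
}

/* Interchanged version: loops j and k swapped, giving order i, j, k. */
static void kernel_interchanged(double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < M; k++)
                C[i][j] += A[j][k] * B[i][k] + B[j][k] * A[i][k];
}

/* Returns 1 if the three variants produce identical results, 0 otherwise. */
int variants_agree(void) {
    static double C0[N][N], C1[N][N], C2[N][N];

    /* The tiled loops have no remainder handling, so tiles must fit exactly. */
    assert(N % SI == 0 && M % SK == 0 && N % SJ == 0);

    init_inputs();
    memset(C0, 0, sizeof C0);
    memset(C1, 0, sizeof C1);
    memset(C2, 0, sizeof C2);

    kernel_original(C0);
    kernel_tiled(C1);
    kernel_interchanged(C2);

    return memcmp(C0, C1, sizeof C0) == 0 && memcmp(C0, C2, sizeof C0) == 0;
}
```

Calling variants_agree() from a main function is enough to confirm that the transformations preserve the computed result. The harness says nothing about cache behaviour or energy; it only checks that the transformed loops compute the same values as the original.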

Energy consumption (up to 31%)
No or low overhead (up to 5.4%)
Impact of NVMs on performance and energy
Tiling or Interchange
Performance and energy tradeoff
Operating frequency impact
