A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD

Authors: Goutis Costas, Kelefouras Vasileios, Kritikakou Angeliki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 21/01/2014
This paper presents a new methodology for speeding up Matrix–Matrix Multiplication using the Single Instruction Multiple Data (SIMD) unit, on one or more cores sharing a cache. The methodology achieves higher execution speed than the state-of-the-art ATLAS library (speedups from 1.08 up to 3.5) by decreasing the number of instructions (load/store and arithmetic) and the number of data cache accesses and misses in the memory hierarchy. This is achieved by fully exploiting the software characteristics (e.g. data reuse) and the hardware parameters (e.g. data cache sizes and associativities) as one problem rather than separately, which gives high-quality solutions and a smaller search space.
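
To make the idea of data reuse concrete, the sketch below shows a generic cache-blocked (tiled) matrix multiplication in C. It is not the methodology of the paper, which the abstract does not spell out. The function name matmul_tiled, the helper min_size, and the tile size TILE = 32 are all assumptions chosen for illustration. A real implementation would derive the tile size from the cache sizes and associativities and would likely use explicit SIMD instructions.

```c
#include <stddef.h>

/* Hypothetical tile edge length, not taken from the paper. In practice it
   would be tuned to the data cache size and associativity. */
#define TILE 32

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Computes C += A * B for row-major n x n matrices.
   The three outer loops walk over tiles so that a tile of A and a tile of B
   are reused many times while they are still resident in cache. */
void matmul_tiled(size_t n, const float *A, const float *B, float *C)
{
    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = min_size(ii + TILE, n);
        for (size_t kk = 0; kk < n; kk += TILE) {
            size_t k_end = min_size(kk + TILE, n);
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t j_end = min_size(jj + TILE, n);
                for (size_t i = ii; i < i_end; ++i) {
                    for (size_t k = kk; k < k_end; ++k) {
                        /* Load A[i][k] once and reuse it across the row. */
                        float a = A[i * n + k];
                        /* Unit-stride inner loop: a natural candidate for
                           compiler auto-vectorization on a SIMD unit. */
                        for (size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

The caller is expected to zero-initialize C before the call if a plain product is wanted, since the routine accumulates into C. The i-k-j loop order inside each tile keeps the innermost accesses to B and C contiguous, which reduces cache misses compared with the textbook i-j-k order.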

Crossref

Sheffield Hallam University Research Archive