StreamSVD: Low-rank approximation and streaming accelerator co-design

Abstract

The post-training compression of a Convolutional Neural Network (CNN) aims to produce Pareto-optimal designs on the accuracy-performance frontier when access to training data is not possible. Low-rank approximation is one of the methods often utilised in such cases. However, existing work considers the low-rank approximation of the network and the optimisation of the hardware accelerator separately, leading to systems with sub-optimal performance. This work focuses on the efficient mapping of a CNN onto an FPGA device, and presents StreamSVD, a model-accelerator co-design framework. The framework simultaneously considers the compression of a CNN model through a hardware-aware low-rank approximation scheme, and the optimisation of the hardware accelerator's architecture by taking into account the approximation scheme's compute structure. Our results show that the co-designed StreamSVD outperforms existing work that utilises similar low-rank approximation schemes by providing a better accuracy-throughput trade-off. The proposed framework also achieves competitive performance compared with other post-training compression methods, even outperforming them in certain cases.
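To make the idea of post-training low-rank approximation concrete, the following is a minimal, generic sketch (not the StreamSVD scheme itself, which is hardware-aware) of how a convolutional layer's weights can be compressed with a truncated SVD. The function name `low_rank_factorise`, the tensor shapes, and the chosen rank are illustrative assumptions.

```python
# Illustrative sketch: post-training low-rank approximation of a conv layer.
# The weight tensor (C_out, C_in, kH, kW) is flattened per output filter,
# factorised with an SVD, and truncated to rank r, effectively splitting the
# layer into two smaller ones with fewer parameters and operations.
import numpy as np

def low_rank_factorise(weights: np.ndarray, rank: int):
    """Return two factors whose product approximates the flattened weights."""
    c_out, c_in, kh, kw = weights.shape
    w_mat = weights.reshape(c_out, c_in * kh * kw)   # one row per output filter
    u, s, vt = np.linalg.svd(w_mat, full_matrices=False)
    u_r = u[:, :rank] * s[:rank]                     # (C_out, r), absorbs singular values
    v_r = vt[:rank, :]                               # (r, C_in * kH * kW)
    return u_r, v_r

# Example: compress a 64x64x3x3 convolution to rank 16 and check the error.
w = np.random.randn(64, 64, 3, 3).astype(np.float32)
u_r, v_r = low_rank_factorise(w, rank=16)
approx = (u_r @ v_r).reshape(w.shape)
rel_err = np.linalg.norm(w - approx) / np.linalg.norm(w)
print(f"relative reconstruction error at rank 16: {rel_err:.3f}")
```

In a co-design setting such as the one the abstract describes, the rank would not be chosen purely from the reconstruction error as above, but jointly with the accelerator architecture so that the resulting two-stage compute structure maps efficiently onto the FPGA.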
