Characterizing Sources of Ineffectual Computations in Deep Learning Networks

Abstract

Hardware accelerators for inference with neural networks can take advantage of the properties of the data they process. Performance gains and reduced memory bandwidth during inference have been demonstrated by using narrower data types [1], [2] and by exploiting the ability to skip and compress values that are zero [3]-[6]. Similarly useful properties have been identified at a lower level, such as varying precision requirements [7] and bit-level sparsity [8], [9]. To date, the analysis of these potential sources of superfluous computation and communication has been constrained to a small number of older Convolutional Neural Networks (CNNs) used for image classification. It is an open question whether they exist more broadly. This paper aims to determine whether these properties persist in: (1) more recent and thus more accurate and better-performing image classification networks, (2) models for image applications other than classification, such as image segmentation and low-level computational imaging, (3) Long Short-Term Memory (LSTM) models for non-image applications such as natural language processing, and (4) quantized image classification models. We demonstrate that such properties persist and discuss the implications and opportunities for future accelerator designs.
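To make the two properties named in the abstract concrete, the following is a minimal, illustrative sketch (not the paper's methodology) of how value sparsity (the fraction of exactly-zero values an accelerator could skip) and bit-level sparsity (the fraction of zero bits in a fixed-point encoding) might be measured for a tensor of activations. The uniform 8-bit quantizer and the post-ReLU assumption are simplifying assumptions for illustration only.

```python
import numpy as np

def value_sparsity(x: np.ndarray) -> float:
    """Fraction of elements that are exactly zero (skippable values)."""
    return float(np.mean(x == 0))

def bit_sparsity(x: np.ndarray, bits: int = 8) -> float:
    """Fraction of zero bits when x is quantized to unsigned fixed point.

    Assumes a simple uniform quantizer over non-negative values; real
    accelerators may use other encodings (signed, per-channel scales, etc.).
    """
    x = np.clip(x, 0, None)                       # e.g., post-ReLU activations
    scale = x.max() / (2**bits - 1) if x.max() > 0 else 1.0
    q = np.round(x / scale).astype(np.uint64)
    one_bits = sum(np.count_nonzero((q >> b) & 1) for b in range(bits))
    return 1.0 - one_bits / (q.size * bits)

if __name__ == "__main__":
    # Synthetic ReLU-like activations purely for demonstration.
    act = np.maximum(np.random.randn(1024, 256), 0)
    print(f"value sparsity: {value_sparsity(act):.2%}")
    print(f"bit sparsity:   {bit_sparsity(act):.2%}")
```

In this toy setting roughly half the values are zero, and many of the remaining values contribute mostly zero bits, which is the kind of ineffectual work that value-skipping and bit-serial accelerator designs aim to eliminate.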