18 research outputs found

Compressing a convolution neural network based on quantization

Authors: Kupryianava D., Lukashevich M., Pertsau D.
Publication venue: 'Belarusian State University of Informatics and Radioelectronics'
Publication date: 01/01/2023

Modern deep neural network models contain a large number of parameters and have a significant size. In this paper we experimentally investigate approaches to compressing convolutional neural networks. The results show that quantization compresses the model efficiently while maintaining high accuracy.
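
The abstract does not say which quantization scheme the authors evaluated. As a generic illustration only, the sketch below applies a uniform affine quantization to a small list of weights, storing each as an integer code in [0, 255] together with a scale and an offset. The function names and the sample weights are hypothetical and are not taken from the paper.

```python
def quantize_uint8(weights):
    """Map floats to integer codes in [0, 255] (assumed uniform affine scheme)."""
    lo, hi = min(weights), max(weights)
    # Fall back to a scale of 1.0 if all weights are equal (range is zero).
    scale = (hi - lo) / 255 or 1.0
    codes = [min(255, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + lo for c in codes]


# Hypothetical sample weights, for illustration only.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each code fits in one byte instead of the four bytes of a 32-bit float, and the rounding error per weight stays within half a quantization step (`scale / 2`).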

Belarusian State University of Informatics and Radioelectronics Repository

Taxonomy of Saliency Metrics for Channel Pruning

Authors: Anderson Andrew, Gregg David, Persand Kaveena
Publication date: 01/01/2021

Pruning unimportant parameters can allow deep neural networks (DNNs) to reduce their heavy computation and memory requirements. A saliency metric estimates which parameters can be safely pruned with little impact on the classification performance of the DNN. Many saliency metrics have been proposed, each within the context of a wider pruning algorithm. The result is that it is difficult to separate the effectiveness of the saliency metric from the wider pruning algorithm that surrounds it. Similar-looking saliency metrics can yield very different results because of apparently minor design choices. We propose a taxonomy of saliency metrics based on four mostly orthogonal principal components. We show that a broad range of metrics from the pruning literature can be grouped according to these components. Our taxonomy not only serves as a guide to prior work, but also allows us to construct new saliency metrics by exploring novel combinations of our taxonomic components. We perform an in-depth experimental investigation of more than 300 saliency metrics. Our results provide decisive answers to open research questions, and demonstrate the importance of reduction and scaling when pruning groups of weights. We find that some of our constructed metrics can outperform the best existing state-of-the-art metrics for convolutional neural network channel pruning.
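
To make the remark about reduction and scaling concrete, here is a minimal sketch that reduces the weights of each channel with an L1 sum and optionally scales the result by the number of weights. It is not one of the metrics from the paper's taxonomy; the function names, the choice of L1, and the toy channels are all assumptions made for illustration.

```python
def channel_saliency(weights, scale_by_size=False):
    """Reduce a channel's weights to one score (assumed L1 reduction)."""
    total = sum(abs(w) for w in weights)
    # Optional scaling: divide by the number of weights in the channel.
    return total / len(weights) if scale_by_size else total


def least_salient(channels, scale_by_size=False):
    """Return the name of the channel with the lowest saliency score."""
    return min(
        channels,
        key=lambda name: channel_saliency(channels[name], scale_by_size),
    )


# Two hypothetical channels of different sizes.
channels = {"a": [0.3, 0.3, 0.3, 0.3], "b": [0.5, 0.5]}
unscaled_choice = least_salient(channels)                    # "b": sum 1.0 < 1.2
scaled_choice = least_salient(channels, scale_by_size=True)  # "a": mean 0.3 < 0.5
```

In this toy case the scaling step alone changes which channel would be pruned, which is the kind of apparently minor design choice the abstract refers to.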

arXiv.org e-Print Archive

Directory of Open Access Journals

Composition of Saliency Metrics for Channel Pruning with a Myopic Oracle

Authors: Anderson Andrew, Gregg David, Persand Kaveena
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/06/2021

The computation and memory needed for Convolutional Neural Network (CNN) inference can be reduced by pruning weights from the trained network. Pruning is guided by a pruning saliency, which heuristically approximates the change in the loss function associated with the removal of specific weights. Many pruning signals have been proposed, but the performance of each heuristic depends on the particular trained network. This leaves the data scientist with a difficult choice. When using any one saliency metric for the entire pruning process, we run the risk of the metric's assumptions being invalidated, leading to poor decisions being made by the metric. Ideally we could combine the best aspects of different saliency metrics. However, despite an extensive literature review, we are unable to find any prior work on composing different saliency metrics. The chief difficulty lies in combining the numerical outputs of different saliency metrics, which are not directly comparable. We propose a method to compose several primitive pruning saliencies, to exploit the cases where each saliency measure does well. Our experiments show that the composition of saliencies avoids many poor pruning choices identified by individual saliencies. In most cases our method finds better selections than even the best individual pruning saliency.
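
The abstract does not describe the composition method in detail. One way to avoid comparing raw saliency values, suggested by the "myopic oracle" in the title but assumed here rather than confirmed, is to let each metric nominate one channel and then choose among the nominees by the loss measured after removing each. Every name, both metrics, and the loss values below are hypothetical.

```python
def l1_saliency(channels):
    """Score each channel by the sum of absolute weights."""
    return {name: sum(abs(w) for w in ws) for name, ws in channels.items()}


def mean_square_saliency(channels):
    """Score each channel by the mean of squared weights."""
    return {name: sum(w * w for w in ws) / len(ws) for name, ws in channels.items()}


def compose_by_lookahead(channels, metrics, loss_without):
    """Pick among each metric's nominee using a one-step loss lookahead."""
    # Each metric nominates its own lowest-scoring channel.
    nominees = set()
    for metric in metrics:
        scores = metric(channels)
        nominees.add(min(scores, key=scores.get))
    # Nominees are compared by measured loss, never by raw saliency values.
    return min(sorted(nominees), key=loss_without)


channels = {"a": [0.3, 0.3, 0.3, 0.3], "b": [0.5, 0.5], "c": [2.0, 2.0]}
# Stand-in for evaluating the network with one channel removed (made-up values).
toy_loss = {"a": 0.40, "b": 0.15, "c": 0.90}
choice = compose_by_lookahead(
    channels, [l1_saliency, mean_square_saliency], toy_loss.get
)
```

Here the two metrics disagree (`l1_saliency` nominates "b", `mean_square_saliency` nominates "a"), and the lookahead settles the disagreement in favour of "b" without ever placing the two score scales side by side.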

arXiv.org e-Print Archive

Crossref