
1,482 research outputs found

Towards Effective Low-bitwidth Convolutional Neural Networks

Authors: Liu Lingqiao, Reid Ian, Shen Chunhua, Tan Mingkui, Zhuang Bohan
Publication date: 16/11/2017

This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations. Optimizing a low-precision network is very challenging since the training process can easily get trapped in poor local minima, which results in substantial accuracy loss. To mitigate this problem, we propose three simple-yet-effective approaches to improve the network training. First, we propose to use a two-stage optimization strategy to progressively find good local minima. Specifically, we propose to first optimize a net with quantized weights and then with quantized activations. This is in contrast to traditional methods, which optimize them simultaneously. Second, in a similar spirit to the first method, we propose another progressive optimization approach which gradually decreases the bit-width from high precision to low precision during the course of training. Third, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one. By doing so, the full-precision model provides hints to guide the low-precision model training. Extensive experiments on various datasets (i.e., CIFAR-100 and ImageNet) show the effectiveness of the proposed methods. To highlight, using our methods to train a 4-bit precision network leads to no performance decrease in comparison with its full-precision counterpart with standard network architectures (i.e., AlexNet and ResNet-50). Comment: 11 page
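As a rough illustration of the second idea in the abstract above (lowering the bit-width progressively during training), the sketch below pairs a generic uniform quantizer with a simple bit-width schedule. The abstract does not specify the quantizer or the schedule, so `quantize_uniform`, `bitwidth_schedule`, the [0, 1] value range, and the one-bit-per-stage decrement are all hypothetical choices made for illustration, not the authors' method.

```python
import math


def quantize_uniform(x, bits):
    """Snap x (clipped to [0, 1]) to the nearest of 2**bits evenly spaced levels.

    Assumption: a plain uniform quantizer; the paper may use a different one.
    """
    if bits < 1:
        raise ValueError("bits must be >= 1")
    steps = 2 ** bits - 1                 # number of intervals between levels
    clipped = min(max(x, 0.0), 1.0)       # keep the value inside [0, 1]
    return math.floor(clipped * steps + 0.5) / steps


def bitwidth_schedule(start_bits, target_bits, epochs_per_stage):
    """Yield (epoch, bits) pairs, lowering the bit-width by one per stage.

    Assumption: one bit per stage; the actual schedule is not given in the abstract.
    """
    epoch = 0
    for bits in range(start_bits, target_bits - 1, -1):
        for _ in range(epochs_per_stage):
            yield epoch, bits
            epoch += 1


if __name__ == "__main__":
    weight = 0.3
    for epoch, bits in bitwidth_schedule(start_bits=4, target_bits=2, epochs_per_stage=2):
        print(epoch, bits, quantize_uniform(weight, bits))
```

In a real training loop the quantizer would be applied to whole weight and activation tensors, with gradients passed through the rounding step by some approximation; that part is omitted here.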

arXiv.org e-Print Archive

Crossref

Adelaide Research & Scholarship

Bayesian Compression for Deep Learning

Authors: Louizos Christos, Ullrich Karen, Welling Max
Publication date: 01/01/2017

Compression and computational efficiency in deep learning have become a problem of great significance. In this work, we argue that the most principled and effective way to attack this problem is by adopting a Bayesian point of view, where through sparsity-inducing priors we prune large parts of the network. We introduce two novelties in this paper: 1) we use hierarchical priors to prune nodes instead of individual weights, and 2) we use the posterior uncertainties to determine the optimal fixed-point precision to encode the weights. Both factors significantly contribute to achieving the state of the art in terms of compression rates, while still staying competitive with methods designed to optimize for speed or energy efficiency. Comment: Published as a conference paper at NIPS 201

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE

ARTS: An adaptive regularization training schedule for activation sparsity exploration

Authors: Bondarau Egor, Moreira Orlando, Pourtaherian Arash, Waeijen Luc J.W., Zhu Zeqi
Publication date: 01/09/2022

Pure OAI Repository