Towards Effective Low-bitwidth Convolutional Neural Networks
This paper tackles the problem of training a deep convolutional neural
network with both low-precision weights and low-bitwidth activations.
Optimizing a low-precision network is very challenging since the training
process can easily become trapped in a poor local minimum, resulting in
substantial accuracy loss. To mitigate this problem, we propose three
simple yet effective approaches to improve network training. First, we
propose to use a two-stage optimization strategy to progressively find good
local minima. Specifically, we propose to first optimize a network with
quantized weights only, and then additionally quantize the activations. This is in contrast to the traditional
methods which optimize them simultaneously. Second, following a similar spirit
of the first method, we propose another progressive optimization approach which
progressively decreases the bit-width from high-precision to low-precision
during the course of training. Third, we adopt a novel learning scheme to
jointly train a full-precision model alongside the low-precision one. By doing
so, the full-precision model provides hints to guide the low-precision model
training. Extensive experiments on various datasets (i.e., CIFAR-100 and
ImageNet) show the effectiveness of the proposed methods. To highlight, using
our methods to train a 4-bit precision network leads to no performance decrease
in comparison with its full-precision counterpart with standard network
architectures (i.e., AlexNet and ResNet-50).
Comment: 11 pages
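The two-stage strategy described in the abstract can be sketched with a simple uniform quantizer: weights are quantized first, and only afterwards are activations quantized as well. The quantizer, bit-widths, and random data below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniformly quantize x to 2**bits levels spanning [x.min(), x.max()]."""
    levels = 2 ** bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / levels
    return np.round((x - lo) / scale) * scale + lo

rng = np.random.default_rng(0)

# Stage 1: quantize the weights, keep activations full-precision.
w = rng.standard_normal((4, 4))
w_q = quantize_uniform(w, bits=4)

# Stage 2: starting from the weight-quantized model, also quantize
# the (non-negative, ReLU-style) activations.
a = np.maximum(0.0, rng.standard_normal((4, 4)))
a_q = quantize_uniform(a, bits=4)
```

In a real training run, each stage would involve further fine-tuning after quantization; the quantization error of this scheme is bounded by half the step size, i.e. `(max - min) / (2 * (2**bits - 1))`.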
Bayesian Compression for Deep Learning
Compression and computational efficiency in deep learning have become a
problem of great significance. In this work, we argue that the most principled
and effective way to attack this problem is by adopting a Bayesian point of
view, where through sparsity inducing priors we prune large parts of the
network. We introduce two novelties in this paper: 1) we use hierarchical
priors to prune nodes instead of individual weights, and 2) we use the
posterior uncertainties to determine the optimal fixed point precision to
encode the weights. Both factors significantly contribute to achieving the
state of the art in terms of compression rates, while still staying competitive
with methods designed to optimize for speed or energy efficiency.
Comment: Published as a conference paper at NIPS 201
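The paper's two ideas, pruning whole nodes via a learned hierarchical scale and choosing a fixed-point precision from posterior uncertainty, can be illustrated with a minimal numpy sketch. The threshold, scale values, and the bit-width rule (step size no larger than the posterior standard deviation) are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def prune_nodes(weights, node_scales, threshold):
    """Zero out entire rows (nodes) whose learned scale falls below threshold."""
    mask = node_scales >= threshold
    return weights * mask[:, None], mask

def bits_from_uncertainty(posterior_std, weight_range):
    """Pick a fixed-point bit-width whose quantization step is no larger
    than the posterior standard deviation of the weights."""
    # step = weight_range / 2**bits <= posterior_std  =>  bits >= log2(range/std)
    return int(np.ceil(np.log2(weight_range / posterior_std)))

W = np.ones((3, 4))
scales = np.array([0.5, 0.01, 0.9])           # hypothetical learned node scales
W_pruned, kept = prune_nodes(W, scales, 0.1)  # the low-scale node is removed
bits = bits_from_uncertainty(posterior_std=0.01, weight_range=1.0)
```

Pruning at the node level removes whole rows (and the matching columns of the next layer in a full model), which shrinks the dense matrix dimensions rather than just producing scattered zeros.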