Automated Pruning for Deep Neural Network Compression
In this work we present a method to improve the pruning step of the current
state-of-the-art methodology for compressing neural networks. The novelty of the
proposed pruning technique lies in its differentiability, which allows pruning to
be performed during the backpropagation phase of the network training. This
enables end-to-end learning and substantially reduces training time. The
technique is based on a family of differentiable pruning functions and a new
regularizer specifically designed to enforce pruning. The experimental results
show that the joint optimization of both the thresholds and the network weights
achieves a higher compression rate, reducing the number of weights of
the pruned network by a further 14% to 33% compared to the current
state-of-the-art. Furthermore, we believe this is the first study to analyze
the generalization capabilities, in transfer learning tasks, of the features
extracted by a pruned network. To this end, we show that
the representations learned using the proposed pruning methodology maintain the
same effectiveness and generality as those learned by the corresponding
non-compressed network on a set of different recognition tasks.

Comment: 8 pages, 5 figures. Published as a conference paper at ICPR 201
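The abstract describes a family of differentiable pruning functions with learnable thresholds, trained jointly with the weights under a pruning-enforcing regularizer. Below is a minimal NumPy sketch of one plausible member of such a family; the sigmoid gate, the sharpness parameter `beta`, and the penalty form are illustrative assumptions, not the paper's actual functions.

```python
import numpy as np

def soft_prune(w, t, beta=50.0):
    """Differentiable pruning gate: weights whose magnitude falls below
    the learnable threshold t are smoothly driven toward zero, so the
    threshold can be optimized by backpropagation alongside the weights.
    (Illustrative form; the paper's exact function family may differ.)"""
    gate = 1.0 / (1.0 + np.exp(-beta * (np.abs(w) - t)))  # sigmoid gate in [0, 1]
    return w * gate, gate

def pruning_regularizer(gate):
    """Toy penalty encouraging gates (and hence weights) to close;
    minimizing it alongside the task loss enforces pruning."""
    return float(np.sum(gate))

w = np.array([0.01, -0.5, 0.003, 1.2])
pruned, gate = soft_prune(w, t=0.1)  # small weights shrink, large ones survive
```

Because the gate is smooth in both `w` and `t`, gradients flow to the threshold itself, which is what makes joint optimization of thresholds and weights possible.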
Deep Multiple Description Coding by Learning Scalar Quantization
In this paper, we propose a deep multiple description coding framework, whose
quantizers are adaptively learned via the minimization of multiple description
compressive loss. Firstly, our framework is built upon auto-encoder networks,
comprising a multiple description multi-scale dilated encoder network and
multiple description decoder networks. Secondly, two entropy estimation
networks are learned to estimate the amount of information in the quantized
tensors, which further supervises the multiple description
encoder network to represent the input image finely. Thirdly, a pair of
scalar quantizers accompanied by two importance-indicator maps is automatically
learned in an end-to-end self-supervised way. Finally, a multiple description
structural dissimilarity distance loss is imposed on the multiple description
decoded images in the pixel domain, rather than on feature tensors in the
feature domain, to diversify the generated descriptions, in addition to the
multiple description reconstruction loss. Through testing on two commonly used datasets,
it is verified that our method outperforms several state-of-the-art multiple
description coding approaches in terms of coding efficiency.

Comment: 8 pages, 4 figures. (DCC 2019: Data Compression Conference). Testing
datasets for "Deep Optimized Multiple Description Image Coding via Scalar
Quantization Learning" can be found at
https://github.com/mdcnn/Deep-Multiple-Description-Codin
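The core idea of multiple description coding with scalar quantizers can be shown with a toy example: two staggered uniform quantizers produce two descriptions of the same signal, either of which alone yields a coarse side reconstruction, while both together yield a finer central one. The fixed offset grids below are illustrative stand-ins for the paper's learned quantizers and importance-indicator maps.

```python
import numpy as np

def scalar_quantize(x, step, offset=0.0):
    """Uniform scalar quantizer: returns integer indices (what would be
    entropy-coded) and the dequantized values."""
    idx = np.round((x - offset) / step)
    return idx, idx * step + offset

x = np.array([0.13, -0.42, 0.77])                  # toy "feature tensor"
_, d1 = scalar_quantize(x, step=0.2)               # description 1
_, d2 = scalar_quantize(x, step=0.2, offset=0.1)   # description 2, offset grid
central = (d1 + d2) / 2.0                          # central reconstruction
```

If only one description survives the channel, the decoder falls back to `d1` or `d2`; receiving both halves the effective quantization step, which is the central/side trade-off that MDC exploits.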
Scalable Compression of Deep Neural Networks
Deep neural networks generally involve some layers with millions of
parameters, making them difficult to deploy and update on devices with
limited resources such as mobile phones and other smart embedded systems. In
this paper, we propose a scalable representation of the network parameters, so
that different applications can select the most suitable bit rate of the
network based on their own storage constraints. Moreover, when a device needs
to upgrade to a high-rate network, the existing low-rate network can be reused,
and only some incremental data need to be downloaded. We first
hierarchically quantize the weights of a pre-trained deep neural network to
enforce weight sharing. Next, we adaptively select the bits assigned to each
layer given the total bit budget. After that, we retrain the network to
fine-tune the quantized centroids. Experimental results show that our method
can achieve scalable compression with graceful degradation in performance.

Comment: 5 pages, 4 figures, ACM Multimedia 201
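The quantize-then-refine idea behind scalable representation can be sketched as follows: a coarse base quantization serves the low-rate network, and quantized residuals form the incremental download for an upgrade. The uniform quantizer here stands in for the paper's learned weight-sharing centroids, and the bit allocations are illustrative.

```python
import numpy as np

def quantize(x, levels):
    """k-level uniform quantizer over the range of x (a stand-in for the
    paper's hierarchically learned centroids)."""
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (levels - 1)
    return np.round((x - lo) / step) * step + lo

w = np.random.RandomState(0).randn(1000)   # pre-trained weights (toy)

base = quantize(w, levels=4)               # low-rate network (~2 bits/weight)
residual = quantize(w - base, levels=4)    # incremental refinement data
enhanced = base + residual                 # high-rate reconstruction

# Upgrading a device only requires downloading `residual`;
# the existing `base` representation is reused.
```

Each extra refinement layer shrinks the representable error range, so quality degrades gracefully as layers are dropped to meet a storage budget.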
Compression of Deep Neural Networks on the Fly
Thanks to their state-of-the-art performance, deep neural networks are
increasingly used for object recognition. Achieving these results requires
training millions of parameters. However, when targeting embedded
applications, the size of these models becomes problematic. As a consequence,
their use on smartphones and other resource-limited devices is impractical. In
this paper we introduce a novel compression method for deep neural networks
that is performed during the learning phase. It consists in adding an extra
regularization term to the cost function of fully-connected layers. We combine
this method with Product Quantization (PQ) of the trained weights for higher
savings in storage consumption. We evaluate our method on two data sets (MNIST
and CIFAR10), on which we achieve significantly larger compression rates than
state-of-the-art methods.
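As an illustration of the Product Quantization step, the sketch below splits each weight row into sub-vectors and replaces each with its nearest k-means centroid. This is a simplified stand-in; the paper's exact PQ configuration (sub-vector size, codebook size, initialization) is not given in the abstract.

```python
import numpy as np

def product_quantize(W, n_sub, k, iters=10):
    """Compress the rows of W: split each row into n_sub sub-vectors and
    replace every sub-vector with the nearest of k centroids learned by
    plain k-means over that sub-space."""
    rows, cols = W.shape
    sub = cols // n_sub
    out = np.empty_like(W)
    for s in range(n_sub):
        block = W[:, s * sub:(s + 1) * sub]          # all sub-vectors for slot s
        rng = np.random.RandomState(s)
        centroids = block[rng.choice(rows, k, replace=False)]  # init from data
        for _ in range(iters):                        # Lloyd iterations
            d = ((block[:, None, :] - centroids[None]) ** 2).sum(-1)
            assign = d.argmin(1)
            for c in range(k):
                if (assign == c).any():
                    centroids[c] = block[assign == c].mean(0)
        out[:, s * sub:(s + 1) * sub] = centroids[assign]
    return out

W = np.random.RandomState(1).randn(64, 16)   # toy fully-connected weight matrix
Wq = product_quantize(W, n_sub=4, k=8)
```

The storage saving comes from replacing each 4-float sub-vector with a 3-bit codebook index plus a small shared codebook, rather than storing the floats themselves.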