Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T)
Deep neural networks (DNNs) have shown remarkable success in a variety of
machine learning applications. The capacity of these models (i.e., their number
of parameters) endows them with expressive power and allows them to reach the
desired performance. In recent years, there has been increasing interest in
deploying DNNs on resource-constrained devices (e.g., mobile devices) with
limited energy, memory, and computational budgets. To address this problem, we
propose Entropy-Constrained Trained Ternarization (EC2T), a general framework
to create sparse and ternary neural networks that are efficient in terms of
storage (e.g., at most two binary masks and two full-precision values are
required to store a weight matrix) and computation (e.g., MAC operations are
reduced to a few accumulations plus two multiplications; see the sketch after
this abstract). This approach
consists of two steps. First, a super-network is created by scaling the
dimensions of a pre-trained model (i.e., its width and depth). Subsequently,
this super-network is simultaneously pruned (using an entropy constraint) and
quantized (that is, ternary values are assigned layer-wise) in a training
process, resulting in a sparse and ternary network representation. We validate
the proposed approach on the CIFAR-10, CIFAR-100, and ImageNet datasets, showing
its effectiveness in image classification tasks.
Comment: Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning
in Computer Vision. Code is available at
https://github.com/d-becking/efficientCNN
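
Illustration (not from the paper): a minimal NumPy sketch of the storage and
compute scheme described in the abstract above. The per-layer values w_p and
w_n, the ternarization threshold, and the matrix shapes are hypothetical,
hand-picked for the example; EC2T learns such quantities during training. The
final assertion checks that per-mask accumulations followed by two
multiplications reproduce the dense multiply-accumulate result.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense weight matrix; threshold and scales are illustrative only.
W = rng.standard_normal((4, 8))
w_p, w_n = 0.7, 0.5                 # the two full-precision values of the layer
mask_pos = W > 0.5                  # binary mask of the +w_p positions
mask_neg = W < -0.5                 # binary mask of the -w_n positions

# Only the two binary masks and the two scalars need to be stored.
W_ternary = w_p * mask_pos - w_n * mask_neg

x = rng.standard_normal(8)

# Reference: ordinary multiply-accumulate with the reconstructed matrix.
y_dense = W_ternary @ x

# Ternary evaluation: accumulate the inputs selected by each mask,
# then apply the two multiplications once per output.
y_ternary = w_p * (mask_pos @ x) - w_n * (mask_neg @ x)

assert np.allclose(y_dense, y_ternary)
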
Forward and Backward Information Retention for Accurate Binary Neural Networks
Weight and activation binarization is an effective approach to deep neural
network compression and can accelerate inference by leveraging bitwise
operations. Although many binarization methods have improved the accuracy of
the model by minimizing the quantization error in forward propagation, there
remains a noticeable performance gap between the binarized model and the
full-precision one. Our empirical study indicates that quantization causes
information loss in both forward and backward propagation, which is the
bottleneck for training accurate binary neural networks. To address this
issue, we propose an Information Retention Network (IR-Net) to retain the
information contained in the forward activations and backward gradients.
IR-Net mainly relies on two technical contributions: (1) Libra Parameter
Binarization (Libra-PB): simultaneously minimizing both quantization error and
information loss of parameters by balanced and standardized weights in forward
propagation; (2) Error Decay Estimator (EDE): minimizing the information loss
of gradients by gradually approximating the sign function in backward
propagation, jointly balancing update ability and gradient accuracy.
We are the first to investigate both the forward and backward processes of
binary networks from a unified information perspective, which provides new
insight into the mechanism of network binarization. Comprehensive experiments
with various network structures on the CIFAR-10 and ImageNet datasets
demonstrate that the proposed IR-Net consistently outperforms state-of-the-art
quantization methods.
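
Illustration (not the official IR-Net code): a minimal NumPy sketch of the two
ideas named in the abstract, assuming a mean-absolute-value scaling factor for
the binarized weights and a tanh-based surrogate whose steepness t grows over
training; the paper's exact scaling choice and schedule may differ.

import numpy as np

def balanced_standardized_binarize(w, eps=1e-8):
    # Zero-center ("balance") and standardize the weights, then binarize with
    # sign; the scalar scale used here is an assumed choice to keep magnitude.
    w_std = (w - w.mean()) / (w.std() + eps)
    scale = np.abs(w_std).mean()
    return scale * np.sign(w_std)

def soft_sign_grad(x, t):
    # Surrogate gradient of sign(x): the derivative of tanh(t * x).
    # Small t gives a wide, smooth gradient that keeps weights updatable;
    # large t approaches the hard sign, giving more accurate gradients.
    return t * (1.0 - np.tanh(t * x) ** 2)

rng = np.random.default_rng(0)
w = rng.standard_normal(6)
print(balanced_standardized_binarize(w))

# Sharpening the surrogate over hypothetical training stages.
for t in (1.0, 5.0, 20.0):
    print(t, soft_sign_grad(np.array([-0.5, 0.0, 0.5]), t))
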