Least squares binary quantization of neural networks
Quantizing weights and activations of deep neural networks results in
significant improvement in inference efficiency at the cost of lower accuracy.
A source of the accuracy gap between full-precision and quantized models is the
quantization error. In this work, we focus on binary quantization, in which
values are mapped to -1 and 1. We provide a unified framework to analyze
different scaling strategies. Inspired by the Pareto optimality of 2-bit
versus 1-bit quantization, we introduce a novel 2-bit quantization with
provably minimal squared error. Our quantization algorithms can be implemented
efficiently in hardware using bitwise operations. We present proofs to show
that our proposed methods are optimal, and we also provide an empirical error
analysis. We conduct experiments on the ImageNet dataset and show a reduced
accuracy gap when using the proposed least squares quantization algorithms.
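To make the scaling idea concrete, below is a minimal NumPy sketch of 1-bit quantization with the least-squares-optimal scale (the mean absolute value of the tensor), plus a greedy 2-bit extension that quantizes the residual. This is an illustrative construction under common conventions; the function names are hypothetical and the greedy 2-bit scheme is not necessarily the provably optimal 2-bit algorithm proposed in the paper.

```python
import numpy as np

def quantize_1bit(x):
    """Approximate x ~= s * b with b in {-1, +1}.

    For fixed b = sign(x), the scale s minimizing ||x - s*b||^2
    is the mean absolute value of x.
    """
    b = np.sign(x)
    b[b == 0] = 1.0          # map zeros to +1 so b stays in {-1, +1}
    s = np.mean(np.abs(x))   # least-squares-optimal scale for this b
    return s, b

def quantize_2bit_greedy(x):
    """Greedy 2-bit quantization: x ~= s1*b1 + s2*b2.

    Quantize x with one bit, then quantize the residual with another bit.
    (Illustrative only; the paper derives a provably least-squares 2-bit
    scheme, which need not coincide with this greedy construction.)
    """
    s1, b1 = quantize_1bit(x)
    residual = x - s1 * b1
    s2, b2 = quantize_1bit(residual)
    return (s1, b1), (s2, b2)

if __name__ == "__main__":
    x = np.random.randn(1024).astype(np.float32)
    s, b = quantize_1bit(x)
    err1 = np.mean((x - s * b) ** 2)
    (s1, b1), (s2, b2) = quantize_2bit_greedy(x)
    err2 = np.mean((x - s1 * b1 - s2 * b2) ** 2)
    print(f"1-bit MSE: {err1:.4f}, greedy 2-bit MSE: {err2:.4f}")
```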
Improving Accuracy of Binary Neural Networks using Unbalanced Activation Distribution
Binarization of neural network models is considered one of the promising
methods to deploy deep neural network models in resource-constrained
environments such as mobile devices. However, Binary Neural Networks (BNNs)
tend to suffer from severe accuracy degradation compared to their
full-precision counterparts. Several techniques have been proposed to improve
the accuracy of BNNs. One approach is to balance the distribution of binary
activations so that the amount of information in the binary activations is
maximized. Based on extensive analysis, and in stark contrast to previous work,
we argue that an unbalanced activation distribution can actually improve the
accuracy of BNNs. We also show that adjusting the threshold values of binary
activation functions results in an unbalanced distribution of the binary
activations, which increases the accuracy of BNN models. Experimental results
show that the accuracy of previous BNN models (e.g., XNOR-Net and Bi-Real-Net)
can be improved by simply shifting the threshold values of binary activation
functions without requiring any other modification.
Comment: CVPR 2021, 10 pages, 10 figures
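To illustrate the threshold-shifting idea, here is a minimal NumPy sketch of a binary activation with an adjustable threshold: moving the threshold away from zero skews the balance of +1 and -1 outputs. The function name and the threshold values used in the demo are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def binary_activation(x, threshold=0.0):
    """Binarize pre-activations: +1 if x > threshold, else -1.

    With threshold = 0 the output distribution is roughly balanced for
    zero-centered inputs; a nonzero threshold makes it unbalanced.
    """
    return np.where(x > threshold, 1.0, -1.0)

if __name__ == "__main__":
    x = np.random.randn(100_000)
    for t in (0.0, 0.5):
        a = binary_activation(x, threshold=t)
        frac_pos = np.mean(a == 1.0)
        print(f"threshold={t:+.1f}: fraction of +1 activations = {frac_pos:.3f}")
```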