Ternary Weight Networks
We introduce ternary weight networks (TWNs) - neural networks with weights
constrained to +1, 0 and -1. The Euclidean distance between the full (float or
double) precision weights and the ternary weights scaled by a factor is
minimized. In addition, a threshold-based ternary function is optimized to obtain an
approximate solution that can be computed quickly and easily. TWNs have
stronger expressive power than the recently proposed binary-precision
counterparts and are thus more effective. At the same time, TWNs
achieve up to a 16x or 32x model compression rate and require fewer
multiplications than their full-precision counterparts. Benchmarks on
MNIST, CIFAR-10, and the large-scale ImageNet dataset show that TWNs perform
only slightly worse than the full-precision counterparts while substantially
outperforming the analogous binary-precision networks.
Comment: 5 pages, 3 figures, conference
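The abstract above describes minimizing the Euclidean distance between the full-precision weights and a scaled ternary tensor via a fast threshold-based rule. The sketch below is a minimal NumPy illustration of that idea; the 0.7 threshold factor and the choice of the scaling factor as the mean magnitude of the surviving weights are the commonly cited closed-form approximation, used here as assumptions rather than a verbatim reproduction of the paper's method.

```python
import numpy as np

def ternarize(w, delta_scale=0.7):
    """Threshold-based ternary approximation of a full-precision weight tensor.

    Maps each weight to {-1, 0, +1} times a per-tensor scaling factor alpha,
    chosen to reduce the Euclidean distance || w - alpha * t ||_2.
    delta = delta_scale * mean(|w|) is the fast closed-form threshold
    (delta_scale ~ 0.7 is an assumed, commonly cited value).
    """
    w = np.asarray(w, dtype=np.float64)
    delta = delta_scale * np.mean(np.abs(w))   # ternarization threshold
    t = np.zeros_like(w)
    t[w > delta] = 1.0                         # large positive weights -> +1
    t[w < -delta] = -1.0                       # large negative weights -> -1
    mask = t != 0
    # Scaling factor: mean magnitude of the weights that survive the threshold.
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return t, alpha

# Example: quantize a small weight matrix and measure the approximation error.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 4))
T, alpha = ternarize(W)
print("alpha =", alpha)
print("reconstruction error =", np.linalg.norm(W - alpha * T))
```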
Understanding BatchNorm in Ternary Training
Neural networks are comprised of two components, weights and activation functions. Ternary weight neural networks (TNNs) achieve good performance and offer up to a 16x compression ratio. TNNs are difficult to train without BatchNorm, and there has been no study to clarify the role of BatchNorm in a ternary network. Building on a study of binary networks, we show how BatchNorm helps resolve the exploding-gradients issue.
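To make the role of BatchNorm concrete, here is a minimal sketch, not taken from the paper, of a ternary linear layer followed by batch normalization. All names and shapes are hypothetical; the point is that BatchNorm re-centers and re-scales each output channel, so the output statistics stay controlled regardless of how the ternary weights and their scaling factor affect the raw pre-activations.

```python
import numpy as np

def ternary_linear_batchnorm(x, w_ternary, alpha, gamma, beta, eps=1e-5):
    """Forward pass: ternary linear layer followed by BatchNorm.

    Normalizes over the current mini-batch for illustration. The per-channel
    normalization keeps activations near zero mean and unit variance even when
    the scaled ternary matmul produces large pre-activations, which is one
    intuition for why ternary networks are hard to train without BatchNorm.
    """
    z = x @ (alpha * w_ternary)               # ternary matmul, scaled by alpha
    mu = z.mean(axis=0)                       # per-channel batch mean
    var = z.var(axis=0)                       # per-channel batch variance
    z_hat = (z - mu) / np.sqrt(var + eps)     # normalize
    return gamma * z_hat + beta               # learnable affine transform

# Tiny usage example with hypothetical shapes.
rng = np.random.default_rng(1)
x = rng.normal(size=(32, 8))                          # batch of 32, 8 features
w = rng.choice([-1.0, 0.0, 1.0], size=(8, 4))         # ternary weights
out = ternary_linear_batchnorm(x, w, alpha=0.05,
                               gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0), out.std(axis=0))              # ~0 mean, ~1 std per channel
```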
Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T)
Deep neural networks (DNNs) have shown remarkable success in a variety of
machine learning applications. The capacity of these models (i.e., the number of
parameters) endows them with expressive power and allows them to reach the
desired performance. In recent years, there has been increasing interest in
deploying DNNs to resource-constrained devices (e.g., mobile devices) with
limited energy, memory, and computational budget. To address this problem, we
propose Entropy-Constrained Trained Ternarization (EC2T), a general framework
to create sparse and ternary neural networks which are efficient in terms of
storage (e.g., at most two binary-masks and two full-precision values are
required to save a weight matrix) and computation (e.g., MAC operations are
reduced to a few accumulations plus two multiplications). This approach
consists of two steps. First, a super-network is created by scaling the
dimensions of a pre-trained model (i.e., its width and depth). Subsequently,
this super-network is simultaneously pruned (using an entropy constraint) and
quantized (that is, ternary values are assigned layer-wise) in a training
process, resulting in a sparse and ternary network representation. We validate
the proposed approach on the CIFAR-10, CIFAR-100, and ImageNet datasets, showing
its effectiveness in image classification tasks.
Comment: Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning
in Computer Vision. Code is available at
https://github.com/d-becking/efficientCNN
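The storage and compute claims above (two binary masks plus two full-precision values per weight matrix; MACs reduced to accumulations plus two multiplications) can be illustrated with the following sketch. It assumes, consistent with the abstract but not spelled out in it, that each layer uses one positive and one negative ternary value; all function names are hypothetical and the code is not from the EC2T repository.

```python
import numpy as np

def encode_ternary(W):
    """Split a ternary weight matrix into two 1-bit masks plus two scalars.

    Assumes W contains at most one distinct positive value w_p and one distinct
    negative value w_n; zeros need no storage beyond the masks.
    """
    pos_vals = W[W > 0]
    neg_vals = W[W < 0]
    w_p = pos_vals[0] if pos_vals.size else 0.0
    w_n = neg_vals[0] if neg_vals.size else 0.0
    mask_p = (W > 0).astype(np.uint8)         # 1-bit mask of positive weights
    mask_n = (W < 0).astype(np.uint8)         # 1-bit mask of negative weights
    return mask_p, mask_n, w_p, w_n

def ternary_matmul(x, mask_p, mask_n, w_p, w_n):
    """y = x @ W using only accumulations plus two scalar multiplications.

    x @ mask selects and sums inputs (pure additions); the two full-precision
    values are applied once per output, not once per weight.
    """
    return w_p * (x @ mask_p) + w_n * (x @ mask_n)

# Sanity check against the dense computation.
rng = np.random.default_rng(2)
W = rng.choice([-0.03, 0.0, 0.07], p=[0.2, 0.6, 0.2], size=(8, 4))
x = rng.normal(size=(5, 8))
mask_p, mask_n, w_p, w_n = encode_ternary(W)
assert np.allclose(x @ W, ternary_matmul(x, mask_p, mask_n, w_p, w_n))
```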