Flexible Network Binarization with Layer-wise Priority
How to effectively approximate real-valued parameters with binary codes plays
a central role in neural network binarization. In this work, we reveal the
important fact that binarizing different layers has widely varying effects on
the network's compression ratio and its performance loss. Based on this
fact, we propose a novel and flexible neural network binarization method by
introducing the concept of layer-wise priority, which binarizes parameters in
inverse order of their layer depth. In each training step, our method selects a
specific network layer, minimizes the discrepancy between its original
real-valued weights and their binary approximations, and fine-tunes the whole
network accordingly. As this process iterates, we can flexibly decide whether
to binarize the remaining floating-point layers, exploring a trade-off between
performance loss and model compression ratio. The resulting binary network is
applied to
efficient pedestrian detection. Extensive experimental results on several
benchmarks show that, under the same compression ratio, our method achieves a
much lower miss rate and faster detection speed than the state-of-the-art
neural network binarization method.

Comment: More experiments on image classification are planned.
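The per-layer step above, minimizing the discrepancy between real-valued weights and their binary approximations, has a well-known closed-form solution when each layer is approximated by a single scaling factor times a sign tensor (XNOR-Net-style binarization). The abstract does not spell out its exact formulation, so the sketch below, including the layer names, is an illustrative assumption:

```python
import numpy as np

def binarize_layer(w):
    """Approximate real-valued weights w by alpha * b with b in {-1, +1}.

    Minimizing ||w - alpha * b||^2 over alpha and b has the closed-form
    solution b = sign(w), alpha = mean(|w|).
    """
    b = np.sign(w)
    b[b == 0] = 1.0          # treat exact zeros as +1
    alpha = np.abs(w).mean()
    return alpha, b

# Illustrative layers, binarized in inverse order of depth (deepest
# first), mirroring the layer-wise priority described in the abstract.
layers = {"conv1": np.random.randn(16, 3, 3, 3),
          "conv2": np.random.randn(32, 16, 3, 3),
          "fc":    np.random.randn(10, 128)}
for name in reversed(list(layers)):   # deepest layer first
    alpha, b = binarize_layer(layers[name])
    approx_err = np.linalg.norm(layers[name] - alpha * b)
```

In a full training loop, each binarization step would be followed by fine-tuning the remaining floating-point layers, which the sketch omits.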
Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving
Autonomous driving has harsh requirements of small model size and energy
efficiency, in order to enable the embedded system to achieve real-time
on-board object detection. Recent deep convolutional neural network based
object detectors have achieved state-of-the-art accuracy. However, such models
are trained with numerous parameters and their high computational costs and
large storage prohibit the deployment to memory and computation resource
limited systems. Low-precision neural networks are popular techniques for
reducing the computation requirements and memory footprint. Among them, binary
weight neural network (BWN) is the extreme case, which quantizes each
floating-point weight into a single bit. BWNs are difficult to train and
suffer from accuracy degradation due to the extremely low-bit representation.
To address this problem,
we propose a knowledge transfer (KT) method to aid the training of BWN using a
full-precision teacher network. We build DarkNet- and MobileNet-based binary
weight YOLO-v2 detectors and conduct experiments on KITTI benchmark for car,
pedestrian and cyclist detection. The experimental results show that the
proposed method maintains high detection accuracy while reducing the model size
of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.

Comment: Accepted by ICRA 201
PBGen: Partial Binarization of Deconvolution-Based Generators for Edge Intelligence
This work explores the binarization of the deconvolution-based generator in a
GAN for memory saving and speedup of image construction. Our study suggests
that different from convolutional neural networks (including the discriminator)
where all layers can be binarized, only some of the layers in the generator can
be binarized without significant performance loss. Supported by theoretical
analysis and verified by experiments, a direct metric based on the dimension of
deconvolution operations is established, which can be used to quickly decide
which layers in the generator can be binarized. Our results also indicate that
both the generator and the discriminator should be binarized simultaneously for
balanced competition and better performance. Experimental results based on
CelebA suggest that directly applying state-of-the-art binarization techniques
to all the layers of the generator will lead to a 2.83x performance loss
measured by sliced Wasserstein distance compared with the original generator,
while applying them to selected layers only can yield up to a 25.81x
saving in memory consumption, and 1.96x and 1.32x speedups in
inference and training respectively, with little performance loss.

Comment: 17 pages, paper re-organized
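Sliced Wasserstein distance, the evaluation metric named above, projects both sample sets onto random 1-D directions, where optimal transport reduces to sorting, and averages the per-slice costs. A minimal Monte-Carlo sketch (the function name, projection count, and seed are illustrative, and both sets are assumed to have equal size):

```python
import numpy as np

def sliced_wasserstein(x, y, n_proj=64, seed=0):
    """Monte-Carlo sliced Wasserstein-2 distance between point sets.

    Each random unit direction gives a 1-D projection, where the optimal
    coupling is obtained by sorting, so each slice costs O(n log n).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)      # random unit direction
        px, py = np.sort(x @ theta), np.sort(y @ theta)
        total += np.mean((px - py) ** 2)    # 1-D W2^2 via sorting
    return float(np.sqrt(total / n_proj))
```

In the GAN setting, x and y would be feature descriptors of generated and real images; the paper's exact feature extraction pipeline is not described in the abstract.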
Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts
Historical palm-leaf manuscripts and early paper documents from the Indian
subcontinent form an important part of the world's literary and cultural
heritage. Despite their importance, large-scale annotated Indic manuscript
image datasets do not exist. To address this deficiency, we introduce
Indiscapes, the first ever dataset with multi-regional layout annotations for
historical Indic manuscripts. To address the challenge of large diversity in
scripts and the presence of dense, irregular layout elements (e.g., text lines,
pictures, multiple documents per image), we adapt a Fully Convolutional Deep
Neural Network architecture for fully automatic, instance-level spatial layout
parsing of manuscript images. We demonstrate the effectiveness of the proposed
architecture on images from the Indiscapes dataset. For annotation flexibility,
and keeping in mind the non-technical background of domain experts, we also
contribute a custom, web-based GUI annotation tool and a dashboard-style
analytics portal. Overall, our contributions set the stage for enabling
downstream applications such as OCR and word-spotting in historical Indic
manuscripts at scale.

Comment: Oral presentation at the International Conference on Document Analysis
and Recognition (ICDAR) - 2019. For dataset, pre-trained networks and
additional details, visit project page at http://ihdia.iiit.ac.in
DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks
The field of video compression has developed some of the most sophisticated
and efficient compression algorithms known in the literature, enabling very
high compressibility for little loss of information. Whilst some of these
techniques are domain specific, many of their underlying principles are
universal in that they can be adapted and applied for compressing different
types of data. In this work we present DeepCABAC, a compression algorithm for
deep neural networks that is based on one of the state-of-the-art video coding
techniques. Concretely, it applies a Context-based Adaptive Binary Arithmetic
Coder (CABAC) to the network's parameters. CABAC was originally designed for
the H.264/AVC video coding standard and became the state-of-the-art technique
for lossless compression. Moreover, DeepCABAC employs a novel quantization scheme
that minimizes the rate-distortion function while simultaneously taking into
account the impact of quantization on the network's accuracy.
Experimental results show that DeepCABAC consistently attains higher
compression rates than previously proposed coding techniques for neural network
compression. For instance, it is able to compress the VGG16 ImageNet model by
a factor of 63.6 with no loss of accuracy, thus representing the entire network
with merely 8.7 MB. The source code for encoding and decoding can be found at
https://github.com/fraunhoferhhi/DeepCABAC
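DeepCABAC's quantizer minimizes a rate-distortion objective. In the actual method the rate term comes from CABAC's context models, but the trade-off can be sketched with fixed per-level code lengths; the levels, bit costs, and lambda below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def rd_quantize(w, levels, level_bits, lam=0.1):
    """Assign each weight the level minimizing distortion + lam * rate.

    `levels` are candidate reconstruction values and `level_bits` their
    assumed code lengths in bits; lam trades distortion against rate,
    mirroring a rate-distortion quantization objective in simplified form.
    """
    w = np.asarray(w, dtype=float)
    levels = np.asarray(levels, dtype=float)
    bits = np.asarray(level_bits, dtype=float)
    # cost[i, j] = (w_i - level_j)^2 + lam * bits_j
    cost = (w[:, None] - levels[None, :]) ** 2 + lam * bits[None, :]
    idx = cost.argmin(axis=1)
    return levels[idx], idx
```

With lam = 0 this reduces to nearest-neighbor quantization; as lam grows, cheaply codable levels (typically zero, which dominates the learned probability model) are increasingly preferred, which is what drives the high compression ratios reported above.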