30,108 research outputs found
Fixed-point Factorized Networks
In recent years, Deep Neural Networks (DNN) based methods have achieved
remarkable performance in a wide range of tasks and have been among the most
powerful and widely used techniques in computer vision. However, DNN-based
methods are both computational-intensive and resource-consuming, which hinders
the application of these methods on embedded systems like smart phones. To
alleviate this problem, we introduce a novel Fixed-point Factorized Networks
(FFN) for pretrained models to reduce the computational complexity as well as
the storage requirement of networks. The resulting networks have only weights
of -1, 0 and 1, which significantly eliminates the most resource-consuming
multiply-accumulate operations (MACs). Extensive experiments on large-scale
ImageNet classification task show the proposed FFN only requires one-thousandth
of multiply operations with comparable accuracy
ReBNet: Residual Binarized Neural Network
This paper proposes ReBNet, an end-to-end framework for training
reconfigurable binary neural networks on software and developing efficient
accelerators for execution on FPGA. Binary neural networks offer an intriguing
opportunity for deploying large-scale deep learning models on
resource-constrained devices. Binarization reduces the memory footprint and
replaces the power-hungry matrix-multiplication with light-weight XnorPopcount
operations. However, binary networks suffer from a degraded accuracy compared
to their fixed-point counterparts. We show that the state-of-the-art methods
for optimizing binary networks accuracy, significantly increase the
implementation cost and complexity. To compensate for the degraded accuracy
while adhering to the simplicity of binary networks, we devise the first
reconfigurable scheme that can adjust the classification accuracy based on the
application. Our proposition improves the classification accuracy by
representing features with multiple levels of residual binarization. Unlike
previous methods, our approach does not exacerbate the area cost of the
hardware accelerator. Instead, it provides a tradeoff between throughput and
accuracy while the area overhead of multi-level binarization is negligible.Comment: To Appear In The 26th IEEE International Symposium on
Field-Programmable Custom Computing Machine
- …