ReBNet: Residual Binarized Neural Network
This paper proposes ReBNet, an end-to-end framework for training
reconfigurable binary neural networks in software and developing efficient
accelerators for execution on FPGA. Binary neural networks offer an intriguing
opportunity for deploying large-scale deep learning models on
resource-constrained devices. Binarization reduces the memory footprint and
replaces the power-hungry matrix-multiplication with light-weight XnorPopcount
operations. However, binary networks suffer from a degraded accuracy compared
to their fixed-point counterparts. We show that state-of-the-art methods for
improving the accuracy of binary networks significantly increase the
implementation cost and complexity. To compensate for the degraded accuracy
while adhering to the simplicity of binary networks, we devise the first
reconfigurable scheme that can adjust the classification accuracy based on the
application. Our proposition improves the classification accuracy by
representing features with multiple levels of residual binarization. Unlike
previous methods, our approach does not exacerbate the area cost of the
hardware accelerator. Instead, it provides a tradeoff between throughput and
accuracy, while the area overhead of multi-level binarization is negligible.
Comment: To appear in the 26th IEEE International Symposium on
Field-Programmable Custom Computing Machines
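As an illustration of the two ideas above, the sketch below is a minimal NumPy
rendering of (a) the XnorPopcount dot product that replaces fixed-point matrix
multiplication once weights and activations are restricted to {-1, +1}, and
(b) multi-level residual binarization, where each level binarizes the residual
left by the previous levels. The per-level scaling factor is a trainable
parameter in ReBNet; here it is merely estimated as the mean absolute residual,
so this is a sketch of the encoding scheme rather than the paper's
implementation.

    import numpy as np

    def xnor_popcount_dot(a_bits, b_bits):
        # a_bits, b_bits: {0, 1} arrays encoding -1 as 0 and +1 as 1.
        # Agreements minus disagreements gives the {-1, +1} dot product:
        # dot = n - 2 * popcount(a XOR b).
        n = a_bits.size
        return n - 2 * int(np.count_nonzero(a_bits ^ b_bits))

    def residual_binarize(x, levels=2):
        # Each level binarizes the residual left by the previous levels;
        # the sum of the per-level estimates approximates x. The scale
        # gamma is trainable in ReBNet; estimated here as mean |residual|.
        residual = x.astype(np.float64)
        approx = np.zeros_like(residual)
        for _ in range(levels):
            gamma = np.abs(residual).mean()
            estimate = gamma * np.sign(residual)
            approx += estimate
            residual -= estimate
        return approx

Increasing the number of levels trades throughput for accuracy at run time,
which is the reconfigurability knob the abstract describes.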
On Exact Computation with an Infinitely Wide Neural Net
How well does a classic deep net architecture like AlexNet or VGG19 classify
on a standard dataset such as CIFAR-10 when its width --- namely, number of
channels in convolutional layers, and number of nodes in fully-connected
internal layers --- is allowed to increase to infinity? Such questions have
come to the forefront in the quest to theoretically understand deep learning
and its mysteries about optimization and generalization. They also connect deep
learning to notions such as Gaussian processes and kernels. A recent paper
[Jacot et al., 2018] introduced the Neural Tangent Kernel (NTK) which captures
the behavior of fully-connected deep nets in the infinite width limit trained
by gradient descent; this object was implicit in some other recent papers. An
attraction of such ideas is that a pure kernel-based method is used to capture
the power of a fully-trained deep net of infinite width.
The current paper gives the first efficient exact algorithm for computing the
extension of NTK to convolutional neural nets, which we call Convolutional NTK
(CNTK), as well as an efficient GPU implementation of this algorithm. This
results in a significant new benchmark for the performance of a pure
kernel-based method on CIFAR-10, surpassing the results reported in
[Novak et al., 2019] and falling only modestly below the performance of the
corresponding finite deep net architecture (once batch normalization, etc. are
turned off). Theoretically, we also give the first non-asymptotic proof showing
that a fully-trained sufficiently wide net is indeed equivalent to the kernel
regression predictor using NTK.
Comment: In NeurIPS 2019. Code available: https://github.com/ruosongwang/cnt
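To make the final claim concrete, the sketch below is a minimal NumPy rendering
of the kernel regression predictor that a fully-trained, sufficiently wide net
is shown to match. It uses the widely cited closed form of the NTK for a
one-hidden-layer ReLU network on unit-norm inputs; the paper's actual
contribution, the convolutional extension (CNTK) and its exact GPU algorithm,
is not reproduced here.

    import numpy as np

    def ntk_2layer_relu(X1, X2):
        # Infinite-width NTK of a one-hidden-layer ReLU net, assuming the
        # rows of X1 and X2 are unit-norm. Uses the arc-cosine kernel
        # expressions kappa0 and kappa1.
        u = np.clip(X1 @ X2.T, -1.0, 1.0)
        kappa0 = (np.pi - np.arccos(u)) / np.pi
        kappa1 = (u * (np.pi - np.arccos(u)) + np.sqrt(1.0 - u ** 2)) / np.pi
        return u * kappa0 + kappa1

    def ntk_regression(X_train, y_train, X_test, ridge=1e-6):
        # Kernel regression predictor f(x) = Theta(x, X) Theta(X, X)^{-1} y;
        # the small ridge term only stabilizes the linear solve.
        K = ntk_2layer_relu(X_train, X_train)
        K_test = ntk_2layer_relu(X_test, X_train)
        alpha = np.linalg.solve(K + ridge * np.eye(len(X_train)), y_train)
        return K_test @ alpha

Rows of X_train and X_test are assumed to be normalized to unit length before
calling ntk_regression.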