40 research outputs found
Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU
Binary convolutional networks have a lower computational load and a smaller
memory footprint than their full-precision counterparts, which makes them a
feasible alternative for deploying computer vision applications on
resource-limited embedded devices. Once trained in less resource-constrained
computational environments, they can be deployed for real-time inference on
such devices. In this study, we propose a GPU implementation of binary
convolutional network inference that focuses on optimizing the XNOR
convolution.
convolution. Experimental results show that using GPU can provide a speed-up of
up to with a kernel size of . The implementation is
publicly available at
https://github.com/metcan/Binary-Convolutional-Neural-Network-Inference-on-GP
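The core identity behind XNOR convolution can be illustrated on the CPU: for two sign vectors a, b in {+1, -1}^n packed as bit masks (bit set means +1), the dot product equals 2 * popcount(XNOR(a, b)) - n, so a multiply-accumulate becomes a bitwise operation plus a bit count. The following NumPy sketch (our own illustration, not the paper's GPU kernel; all function names are hypothetical) checks this against a dense correlation:

```python
import numpy as np

def binarize(x):
    """Map real values to {+1, -1} by sign (zeros treated as +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int32)

def pack_bits(v):
    """Pack a {+1, -1} vector into an integer: bit i set iff v[i] == +1."""
    bits = 0
    for i, s in enumerate(v):
        if s == 1:
            bits |= 1 << i
    return bits

def xnor_dot(a, b, n):
    """Dot product of two packed sign vectors.

    XNOR counts positions where the signs agree; with `agree` agreements
    out of n positions, the dot product is agree - (n - agree).
    """
    agree = bin(~(a ^ b) & ((1 << n) - 1)).count("1")
    return 2 * agree - n

def xnor_conv1d(x, w):
    """Valid-mode 1-D convolution (CNN-style, i.e. correlation) of the
    sign-binarized signals, computed via XNOR + popcount."""
    k = len(w)
    wb = pack_bits(binarize(w))
    xb = binarize(x)
    return np.array([xnor_dot(pack_bits(xb[i:i + k]), wb, k)
                     for i in range(len(x) - k + 1)])

x = np.array([0.5, -1.2, 0.3, 2.0, -0.7])
w = np.array([1.0, -0.5, 0.25])
# Reference: dense correlation of the binarized signals.
ref = np.correlate(binarize(x), binarize(w), mode="valid")
```

A GPU kernel would pack 32 or 64 signs per machine word and use a hardware population-count instruction instead of Python bit tricks, but the arithmetic identity is the same.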
Signed Binary Weight Networks: Improving Efficiency of Binary Weight Networks by Exploiting Sparsity
Efficient inference of Deep Neural Networks (DNNs) is essential to making AI
ubiquitous. Two important algorithmic techniques have shown promise for
enabling efficient inference - sparsity and binarization. These techniques
translate into weight sparsity and weight repetition at the hardware-software
level, allowing the deployment of DNNs under critically low power and latency
requirements. We propose a new method, called signed-binary networks, that
further improves efficiency by exploiting both weight sparsity and weight
repetition while maintaining similar accuracy. Our method achieves comparable
accuracy on
ImageNet and CIFAR10 datasets with binary and can lead to sparsity. We
observe real speedup when deploying these models on general-purpose devices. We
show that this high percentage of unstructured sparsity can lead to a further
~2x reduction in energy consumption on ASICs with respect to binary