40 research outputs found
Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU
Binary convolutional networks have a lower computational load and a smaller
memory footprint than their full-precision counterparts, which makes them a
feasible alternative for deploying computer vision applications on
resource-limited embedded devices. Once trained in less resource-constrained
computational environments, they can be deployed for real-time inference on
such devices. In this study, we propose a GPU implementation of binary
convolutional network inference that focuses on optimizing the XNOR
convolution.
convolution. Experimental results show that using GPU can provide a speed-up of
up to with a kernel size of . The implementation is
publicly available at
https://github.com/metcan/Binary-Convolutional-Neural-Network-Inference-on-GP
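The core identity behind XNOR convolution can be illustrated on the CPU: for two sign vectors a, b in {+1, -1}^n packed as bit masks (bit set means +1), the dot product equals 2 * popcount(XNOR(a, b)) - n, so a multiply-accumulate becomes a bitwise operation plus a bit count. The following NumPy sketch (our own illustration, not the paper's GPU kernel; all function names are hypothetical) checks this against a dense correlation:

```python
import numpy as np

def binarize(x):
    """Map real values to {+1, -1} by sign (zeros treated as +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int32)

def pack_bits(v):
    """Pack a {+1, -1} vector into an integer: bit i set iff v[i] == +1."""
    bits = 0
    for i, s in enumerate(v):
        if s == 1:
            bits |= 1 << i
    return bits

def xnor_dot(a, b, n):
    """Dot product of two packed sign vectors.

    XNOR counts positions where the signs agree; with `agree` agreements
    out of n positions, the dot product is agree - (n - agree).
    """
    agree = bin(~(a ^ b) & ((1 << n) - 1)).count("1")
    return 2 * agree - n

def xnor_conv1d(x, w):
    """Valid-mode 1-D convolution (CNN-style, i.e. correlation) of the
    sign-binarized signals, computed via XNOR + popcount."""
    k = len(w)
    wb = pack_bits(binarize(w))
    xb = binarize(x)
    return np.array([xnor_dot(pack_bits(xb[i:i + k]), wb, k)
                     for i in range(len(x) - k + 1)])

x = np.array([0.5, -1.2, 0.3, 2.0, -0.7])
w = np.array([1.0, -0.5, 0.25])
# Reference: dense correlation of the binarized signals.
ref = np.correlate(binarize(x), binarize(w), mode="valid")
```

A GPU kernel would pack 32 or 64 signs per machine word and use a hardware population-count instruction instead of Python bit tricks, but the arithmetic identity is the same.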
Signed Binary Weight Networks: Improving Efficiency of Binary Weight Networks by Exploiting Sparsity
Efficient inference of Deep Neural Networks (DNNs) is essential to making AI
ubiquitous. Two important algorithmic techniques have shown promise for
enabling efficient inference - sparsity and binarization. These techniques
translate into weight sparsity and weight repetition at the hardware-software
level, allowing the deployment of DNNs under critically low power and latency
requirements. We propose a new method, called signed-binary networks, that
further improves efficiency by exploiting both weight sparsity and weight
repetition while maintaining similar accuracy. Our method achieves comparable
accuracy on
ImageNet and CIFAR10 datasets with binary and can lead to sparsity. We
observe real speedup when deploying these models on general-purpose devices. We
show that this high percentage of unstructured sparsity can lead to a further
~2x reduction in energy consumption on ASICs with respect to binary