40 research outputs found

    Optimization of XNOR Convolution for Binary Convolutional Neural Networks on GPU

    Binary convolutional networks have a lower computational load and a smaller memory footprint than their full-precision counterparts, so they are a feasible alternative for deploying computer vision applications on limited-capacity embedded devices. Once trained in less resource-constrained computational environments, they can be deployed for real-time inference on such devices. In this study, we propose an implementation of binary convolutional network inference on GPU that focuses on optimizing XNOR convolution. Experimental results show that using a GPU can provide a speed-up of up to 42.61× with a kernel size of 3×3. The implementation is publicly available at https://github.com/metcan/Binary-Convolutional-Neural-Network-Inference-on-GP
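
As an illustration of the XNOR convolution the abstract refers to, here is a minimal pure-Python sketch, not the authors' GPU implementation. It assumes weights and activations take values in {+1, -1} and are packed into integer bit masks (bit set for +1), so that a dot product reduces to an XNOR followed by a population count. The function names `pack_bits`, `xnor_dot`, and `binary_conv2d` are hypothetical and chosen only for this example.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int; bit i is 1 where values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given as packed bit masks.

    XNOR marks positions where the signs agree; each agreement contributes +1
    and each disagreement -1, so the result is 2 * matches - n.
    """
    mask = (1 << n) - 1                      # keep only the n valid bits
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n


def binary_conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation of +1/-1 inputs."""
    kh, kw = len(kernel), len(kernel[0])
    n = kh * kw
    k_bits = pack_bits([v for row in kernel for v in row])
    out = []
    for r in range(len(image) - kh + 1):
        out_row = []
        for c in range(len(image[0]) - kw + 1):
            # Flatten the patch in the same row-major order as the kernel.
            patch = [image[r + i][c + j] for i in range(kh) for j in range(kw)]
            out_row.append(xnor_dot(pack_bits(patch), k_bits, n))
        out.append(out_row)
    return out
```

On a GPU the same idea would typically operate on 32- or 64-bit words with a hardware popcount instruction; the sketch above only shows the arithmetic identity, not the optimizations measured in the study.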

    Signed Binary Weight Networks: Improving Efficiency of Binary Weight Networks by Exploiting Sparsity

    Efficient inference of Deep Neural Networks (DNNs) is essential to making AI ubiquitous. Two important algorithmic techniques have shown promise for enabling efficient inference: sparsity and binarization. These techniques translate into weight sparsity and weight repetition at the hardware-software level, allowing the deployment of DNNs with critically low power and latency requirements. We propose a new method called signed-binary networks to further improve efficiency (by exploiting both weight sparsity and weight repetition) while maintaining similar accuracy. Our method achieves comparable accuracy on ImageNet and CIFAR10 datasets with binary and can lead to >69% sparsity. We observe real speedup when deploying these models on general-purpose devices. We show that this high percentage of unstructured sparsity can lead to a further ~2x reduction in energy consumption on ASICs with respect to binary