
    Forward and Backward Information Retention for Accurate Binary Neural Networks

    Weight and activation binarization is an effective approach to deep neural network compression and can accelerate inference by leveraging bitwise operations. Although many binarization methods have improved model accuracy by minimizing the quantization error in forward propagation, a noticeable performance gap remains between the binarized model and the full-precision one. Our empirical study indicates that quantization causes information loss in both forward and backward propagation, which is the bottleneck of training accurate binary neural networks. To address these issues, we propose an Information Retention Network (IR-Net) to retain the information contained in the forward activations and backward gradients. IR-Net mainly relies on two technical contributions: (1) Libra Parameter Binarization (Libra-PB), which simultaneously minimizes both the quantization error and the information loss of parameters through balanced and standardized weights in forward propagation; and (2) the Error Decay Estimator (EDE), which minimizes the information loss of gradients by gradually approximating the sign function in backward propagation, jointly considering the updating ability and the accuracy of gradients. We are the first to investigate both the forward and backward processes of binary networks from a unified information perspective, which provides new insight into the mechanism of network binarization. Comprehensive experiments with various network structures on the CIFAR-10 and ImageNet datasets show that the proposed IR-Net consistently outperforms state-of-the-art quantization methods.
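    A minimal PyTorch sketch of the two ingredients described above, assuming a straight-through estimator for the binarization step; the function names, the layer-wise scaling choice, and the k/t schedule for EDE are illustrative assumptions, not the paper's exact formulation.

import torch

def libra_pb_binarize(w, eps=1e-8):
    # Libra-PB idea: balance (zero-mean) and standardize (unit-std) the weights
    # before binarization so the binary tensor retains as much information as possible.
    w_std = (w - w.mean()) / (w.std() + eps)
    scale = w_std.abs().mean()  # layer-wise scaling factor (assumed choice)
    w_bin = torch.sign(w_std) * scale
    # Straight-through estimator: forward uses w_bin, backward treats it as identity.
    return w_std + (w_bin - w_std).detach()

def ede_sign(x, k=1.0, t=1.0):
    # EDE idea: the forward pass uses sign(x); the backward pass uses the gradient of
    # t * tanh(k * x). Scheduling k and t over training makes the surrogate gradually
    # approach the true sign function while keeping the weights updatable early on.
    soft = t * torch.tanh(k * x)
    return soft + (torch.sign(x) - soft).detach()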

    Enhancing Reliability of Neural Networks at the Edge: Inverted Normalization with Stochastic Affine Transformations

    Bayesian Neural Networks (BayNNs) naturally provide uncertainty in their predictions, making them a natural choice for safety-critical applications. Additionally, their realization using memristor-based in-memory computing (IMC) architectures makes them suitable for resource-constrained edge applications. In addition to predictive uncertainty, however, inherent robustness to noise in computation is also essential to ensure functional safety. In particular, memristor-based IMCs are susceptible to various sources of non-idealities such as manufacturing and runtime variations, drift, and failures, which can significantly reduce inference accuracy. In this paper, we propose a method to inherently enhance the robustness and inference accuracy of BayNNs deployed in IMC architectures. To achieve this, we introduce a novel normalization layer combined with stochastic affine transformations. Empirical results on various benchmark datasets show a graceful degradation in inference accuracy, with an improvement of up to 58.11%.
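    The abstract does not spell out the proposed inverted normalization, so the following is only a rough PyTorch illustration of the general idea of pairing a normalization layer with a stochastic affine transformation (random multiplicative noise on the learned scale); the class name, the noise model, and all hyperparameters are assumptions, not the authors' design.

import torch
import torch.nn as nn

class StochasticAffineNorm(nn.Module):
    # Illustrative sketch: normalize the activations, then apply an affine transform
    # whose learned scale is perturbed by random noise at every forward pass, so the
    # network is trained to tolerate multiplicative variations such as those caused
    # by memristor non-idealities.
    def __init__(self, num_features, noise_std=0.1, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))
        self.noise_std = noise_std
        self.eps = eps

    def forward(self, x):  # x: (batch, num_features)
        mean = x.mean(dim=0, keepdim=True)
        var = x.var(dim=0, unbiased=False, keepdim=True)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        # Stochastic affine transform: multiplicative noise on the learned scale.
        noise = 1.0 + self.noise_std * torch.randn_like(self.gamma)
        return x_hat * (self.gamma * noise) + self.beta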

    Learning Discrete Weights and Activations Using the Local Reparameterization Trick

    In computer vision and machine learning, a crucial challenge is to lower the computation and memory demands of neural network inference. A common solution to this challenge is binarization. By binarizing the network weights and activations, one can significantly reduce computational complexity by substituting computationally expensive floating-point operations with faster bitwise operations. This leads to more efficient neural network inference that can be deployed on low-resource devices. In this work, we extend previous approaches that trained networks with discrete weights using the local reparameterization trick to also allow for discrete activations. The original approach optimized a distribution over the discrete weights and used the central limit theorem to approximate the pre-activation with a continuous Gaussian distribution. Here we show that this probabilistic modeling also allows effective training of networks with discrete activations. This further reduces the runtime and memory footprint at inference time, with state-of-the-art results for networks with binary activations.
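    A short PyTorch sketch of the local reparameterization trick for binary {-1, +1} weights that this work builds on: a distribution over each weight is learned, the pre-activation is approximated as a Gaussian via the central limit theorem, and it is sampled with the reparameterization trick so gradients reach the weight probabilities. Extending the same probabilistic treatment to binary activations, as proposed here, is not shown; the class name and layer shapes are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LRTBinaryLinear(nn.Module):
    # Linear layer with a learned distribution over binary {-1, +1} weights.
    def __init__(self, in_features, out_features):
        super().__init__()
        # Unconstrained logits; sigmoid(logits) gives P(w = +1).
        self.logits = nn.Parameter(torch.zeros(out_features, in_features))

    def forward(self, x):  # x: (batch, in_features)
        p = torch.sigmoid(self.logits)
        w_mean = 2.0 * p - 1.0      # E[w] for w in {-1, +1}
        w_var = 1.0 - w_mean ** 2   # Var[w]
        # Central limit theorem: the pre-activation is approximately Gaussian.
        z_mean = F.linear(x, w_mean)
        z_var = F.linear(x ** 2, w_var)
        # Local reparameterization: sample the pre-activation directly.
        eps = torch.randn_like(z_mean)
        return z_mean + torch.sqrt(z_var + 1e-8) * eps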

    Optimization of Deep Convolutional Neural Network with the Integrated Batch Normalization and Global Pooling

    Deep convolutional neural networks (DCNNs) have made significant progress in a wide range of applications in recent years, including image identification, audio recognition, and machine translation, assisting machine intelligence in a variety of ways. However, because of the large number of parameters and floating-point operations, deployment on machine terminals remains difficult. To handle this issue, the convolution operations in the DCNN are optimized to adjust the characteristics of the network, minimizing information loss while enriching performance. First, batch normalization is applied; then, instead of pooling over local neighborhoods, each full feature map is reduced to a single value using global pooling. Traditional convolution is split into depthwise and pointwise convolutions to decrease the model size and computation. The performance of the optimized convolution-based DCNN is evaluated in terms of accuracy and error rate. The optimized DCNN is compared with existing state-of-the-art techniques and outperforms them.
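    A brief PyTorch sketch of the building blocks mentioned above: a standard convolution split into depthwise and pointwise convolutions with batch normalization, plus global average pooling that reduces each feature map to a single value before the classifier. Channel counts and the classifier head are placeholder values, not the paper's configuration.

import torch
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    # Standard convolution factored into a depthwise and a pointwise convolution,
    # each followed by batch normalization, to cut parameters and computation.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1,
                                   groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

# Global average pooling collapses each feature map to a single value,
# replacing a large fully connected layer before the classifier.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10))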

    Basic Binary Convolution Unit for Binarized Image Restoration Network

    Lighter and faster image restoration (IR) models are crucial for deployment on resource-limited devices. Binary neural networks (BNNs), one of the most promising model compression methods, can dramatically reduce the computations and parameters of full-precision convolutional neural networks (CNNs). However, BNNs and full-precision CNNs have different properties, so the experience of designing CNNs can hardly be used to develop BNNs. In this study, we reconsider components in binary convolution, such as the residual connection, BatchNorm, the activation function, and structure, for IR tasks. We conduct systematic analyses to explain each component's role in binary convolution and discuss the pitfalls. Specifically, we find that the residual connection can reduce the information loss caused by binarization; BatchNorm can bridge the value range gap between the residual connection and binary convolution; and the position of the activation function dramatically affects the performance of the BNN. Based on our findings and analyses, we design a simple yet efficient basic binary convolution unit (BBCU). Furthermore, we divide IR networks into four parts and specially design variants of BBCU for each part to explore the benefit of binarizing these parts. We conduct experiments on different IR tasks, and our BBCU significantly outperforms other BNNs and lightweight models, which shows that BBCU can serve as a basic unit for binarized IR networks. All codes and models will be released. Comment: ICLR 2023; code is available at https://github.com/Zj-BinXia/BBC
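    The exact BBCU design is in the linked code; below is only a generic PyTorch sketch of the ingredients the abstract analyzes (sign binarization with a straight-through estimator, BatchNorm to align value ranges, and a residual connection around the binary convolution), not the authors' unit. Class names and the omission of scaling factors are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryActivation(nn.Module):
    # sign() in the forward pass; clipped-identity straight-through estimator
    # in the backward pass.
    def forward(self, x):
        clipped = torch.clamp(x, -1.0, 1.0)
        return clipped + (torch.sign(x) - clipped).detach()

class BinaryConvBlock(nn.Module):
    # Generic binary convolution block: binarized activations and weights,
    # BatchNorm to match the value range of the residual branch, and a
    # residual connection that carries full-precision information around
    # the binary convolution.
    def __init__(self, channels):
        super().__init__()
        self.binact = BinaryActivation()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        # Binarize weights with sign + straight-through estimator
        # (per-channel scaling factors omitted for brevity).
        w = self.conv.weight
        w_clip = torch.clamp(w, -1.0, 1.0)
        w_bin = w_clip + (torch.sign(w) - w_clip).detach()
        out = F.conv2d(self.binact(x), w_bin, padding=1)
        out = self.bn(out)
        return out + x  # residual connection reduces binarization information loss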