
Towards Efficient and Reliable Deep Neural Networks

Deep neural networks have achieved state-of-the-art performance on a variety of machine learning tasks in domains such as computer vision, natural language processing, bioinformatics, and speech processing. Despite this success, their excessive computational and memory requirements limit their practical usability in real-time applications and on resource-limited devices. Neural network quantization, in which the objective is to learn a network while restricting the parameters (and activations) to values from a small discrete set, has become increasingly popular because quantized networks consume less memory and compute faster through bit-wise operations. Another important aspect of modern neural networks is the adversarial vulnerability and the reliability of their predictions. Beyond obtaining accurate predictions, it is critical in many real-world decision-making applications to accurately quantify the predictive uncertainty of deep neural networks; calibrating neural networks is of utmost importance in safety-critical applications where downstream decision-making depends on the predicted probabilities. Furthermore, modern machine vision algorithms have been shown to be extremely susceptible to small, almost imperceptible perturbations of their inputs. This work tackles these fundamental challenges, focusing on the efficiency and reliability of modern neural networks.

Neural network quantization is usually formulated as a constrained optimization problem and optimized via a modified version of gradient descent. By interpreting the continuous (unconstrained) parameters as the dual of the quantized ones, we first introduce a Mirror Descent (MD) framework for neural network quantization. Specifically, we provide conditions on the projections (i.e., the mappings from continuous to quantized parameters) that enable us to derive valid mirror maps and, in turn, the respective MD updates. Furthermore, we present a numerically stable implementation of MD that requires storing an additional set of auxiliary (unconstrained) variables, and show that it is strikingly analogous to the straight-through estimator (STE) based method, which is typically viewed as a "trick" to avoid the vanishing gradient issue (see the first sketch below). Our experiments on multiple computer vision classification datasets and network architectures demonstrate that our MD variants yield state-of-the-art performance.

Even though quantized networks exhibit excellent generalization capabilities, their robustness properties are not well understood. We therefore systematically study the robustness of quantized networks against gradient-based adversarial attacks and demonstrate that these quantized models suffer from gradient vanishing and exhibit a false sense of robustness. Attributing the vanishing gradients to poor forward-backward signal propagation in the trained network, we introduce a simple temperature scaling approach to mitigate this issue while preserving the decision boundary (see the second sketch below). Experiments on multiple image classification datasets and network architectures demonstrate that our temperature-scaled attacks obtain a near-perfect success rate on quantized networks.
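To make the numerically stable MD update concrete, here is a minimal sketch assuming binary {-1, +1} quantization with a tanh-based projection; the names (md_step, aux_w, beta) and the toy quadratic loss are illustrative, not taken from the thesis.

```python
import numpy as np

def md_step(aux_w, grad_fn, lr=0.1, beta=5.0):
    """One mirror-descent-style quantization update.

    aux_w   -- auxiliary unconstrained parameters (the 'dual' variables)
    grad_fn -- gradient of the loss w.r.t. the projected weights
    beta    -- sharpness of the projection; beta -> inf approaches sign()
    """
    # Projection from the unconstrained space into (-1, 1); this plays the
    # role of the mirror map's gradient, mapping dual points back to primal.
    w_q = np.tanh(beta * aux_w)
    # The gradient computed at the projected weights is applied directly to
    # the auxiliary variables -- the step that makes the stable MD update
    # look strikingly like the straight-through estimator (STE).
    aux_w = aux_w - lr * grad_fn(w_q)
    return aux_w, w_q

# Toy usage: drive the weights toward the binary target [1, -1, 1, -1].
target = np.array([1.0, -1.0, 1.0, -1.0])
grad_fn = lambda w: 2.0 * (w - target)   # gradient of a quadratic loss
aux_w = np.random.default_rng(0).normal(scale=0.1, size=4)
for _ in range(100):
    aux_w, w_q = md_step(aux_w, grad_fn)
print(np.round(w_q))                     # -> [ 1. -1.  1. -1.]
```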
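And a minimal sketch of the temperature-scaling idea applied to a gradient-based attack, using single-step FGSM for brevity; model is any PyTorch classifier returning logits, and the temperature T is an illustrative hyperparameter (in practice it would be chosen per network).

```python
import torch
import torch.nn.functional as F

def temperature_scaled_fgsm(model, x, y, eps=8 / 255, T=10.0):
    """One FGSM step computed on temperature-scaled logits.

    Scaling the logits by 1/T leaves the predicted class (and hence the
    decision boundary) unchanged, but softens saturated softmax outputs
    so the cross-entropy gradient no longer vanishes.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv) / T, y)
    loss.backward()
    # Standard FGSM step using the sign of the (now non-vanishing)
    # gradient; clamping assumes inputs are images in [0, 1].
    return (x + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()
```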
Finally, we introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test, in which the main idea is to compare the respective cumulative probability distributions. From this, by approximating the empirical cumulative distribution with a differentiable spline function, we obtain a recalibration function that maps the network outputs to actual (calibrated) class-assignment probabilities. We tested our method against existing calibration approaches on various image classification datasets, and our spline-based recalibration approach consistently outperforms existing methods on the KS error as well as other commonly used calibration measures.
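The following is a minimal sketch of a binning-free KS-style calibration error for top-label calibration; conf holds the predicted top-class probabilities and correct the 0/1 correctness of each prediction (the names are illustrative).

```python
import numpy as np

def ks_calibration_error(conf, correct):
    """Binning-free KS-style calibration error.

    Compares the empirical cumulative distributions of predicted
    confidence and observed accuracy; for a perfectly calibrated model
    the two cumulative curves coincide, so the maximum gap is zero.
    """
    order = np.argsort(conf)
    conf = np.asarray(conf, dtype=float)[order]
    correct = np.asarray(correct, dtype=float)[order]
    n = conf.size
    cum_conf = np.cumsum(conf) / n      # cumulative predicted probability
    cum_acc = np.cumsum(correct) / n    # cumulative observed accuracy
    return float(np.max(np.abs(cum_conf - cum_acc)))
```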
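A rough sketch of the spline-based recalibration idea under the same setup: approximate the empirical cumulative accuracy with a smooth spline and differentiate it, so that the derivative at a given confidence estimates the true (calibrated) probability there. The use of scipy's UnivariateSpline and the interpolation-based return map are assumptions made for illustration; the thesis derives the exact construction.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def spline_recalibration_map(conf, correct, smoothing=None):
    """Fit a recalibration function: raw confidence -> calibrated probability."""
    order = np.argsort(conf)
    conf = np.asarray(conf, dtype=float)[order]
    correct = np.asarray(correct, dtype=float)[order]
    n = conf.size
    u = np.arange(1, n + 1) / n        # sample fraction (strictly increasing)
    cum_acc = np.cumsum(correct) / n   # empirical cumulative accuracy
    # A smooth, differentiable approximation of the cumulative distribution;
    # its derivative w.r.t. the sample fraction is the local accuracy, i.e.,
    # an estimate of the calibrated probability at that confidence level.
    spline = UnivariateSpline(u, cum_acc, k=3, s=smoothing)
    calibrated = np.clip(spline.derivative()(u), 0.0, 1.0)
    return lambda c: np.interp(c, conf, calibrated)
```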