27 research outputs found

    Approximation speed of quantized vs. unquantized ReLU neural networks and beyond

    Full text link
    We deal with two complementary questions about approximation properties of ReLU networks. First, we study how the uniform quantization of ReLU networks with real-valued weights impacts their approximation properties. We establish an upper-bound on the minimal number of bits per coordinate needed for uniformly quantized ReLU networks to keep the same polynomial asymptotic approximation speeds as unquantized ones. We also characterize the error of nearest-neighbour uniform quantization of ReLU networks. This is achieved using a new lower-bound on the Lipschitz constant of the map that associates the parameters of ReLU networks to their realization, and an upper-bound generalizing classical results. Second, we investigate when ReLU networks can be expected, or not, to have better approximation properties than other classical approximation families. Indeed, several approximation families share the following common limitation: their polynomial asymptotic approximation speed of any set is bounded from above by the encoding speed of this set. We introduce a new abstract property of approximation families, called infinite-encodability, which implies this upper-bound. Many classical approximation families, defined with dictionaries or ReLU networks, are shown to be infinite-encodable. This unifies and generalizes several situations where this upper-bound is known.
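    The nearest-neighbour uniform quantization discussed above can be sketched directly: round every real-valued weight to the closest point of a uniform grid and compare the realization of the quantized network with the unquantized one. The NumPy sketch below is illustrative only; the layer sizes, the weight range [-1, 1], and the grid construction are assumptions, not the paper's construction or bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def forward(weights, x):
    # Realization of a bias-free ReLU network given its list of weight matrices.
    for W in weights[:-1]:
        x = relu(W @ x)
    return weights[-1] @ x

def quantize(weights, bits, radius=1.0):
    # Nearest-neighbour uniform quantization: round every coordinate to the
    # closest point of a grid with 2**bits levels covering [-radius, radius].
    step = 2.0 * radius / (2 ** bits - 1)
    return [np.clip(np.round(W / step) * step, -radius, radius) for W in weights]

# Hypothetical 3-layer ReLU network with weights drawn uniformly from [-1, 1].
dims = [16, 32, 32, 1]
weights = [rng.uniform(-1.0, 1.0, size=(dims[i + 1], dims[i]))
           for i in range(len(dims) - 1)]

x = rng.standard_normal(dims[0])
for bits in (2, 4, 8):
    err = np.abs(forward(weights, x) - forward(quantize(weights, bits), x)).item()
    print(f"{bits} bits per coordinate -> output perturbation {err:.3e}")
```

    In this toy setting the output perturbation shrinks with the number of bits per coordinate, which is the kind of behaviour the paper's Lipschitz-based bounds quantify rigorously.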

    Quantized Deep Transfer Learning - Gearbox Fault Diagnosis on Edge Devices

    Get PDF
    This study designs and implements a deep transfer learning (DTL) framework that takes as input a time series of gearbox vibration patterns (accelerometer readings) and classifies the gear's damage type from a predefined catalog. Industrial gearboxes are often kept in operation after damage occurs because damage detection is difficult; this causes additional wear and tear and drives up repair costs. With the proposed DTL framework, gearbox damage can be detected at an early stage so that gears can be replaced promptly at lower repair cost. The proposed methodology trains a convolutional neural network (CNN) using transfer learning on a dataset covering eight gearbox conditions. Quantization then reduces the size of the CNN model, enabling easy inference on edge and embedded devices. An accuracy of 99.49 % is achieved using transfer learning with the VGG16 model pre-trained on the ImageNet dataset; other models and architectures were also tested, but VGG16 performed best. The methodology also addresses deployment on edge/embedded devices: in most cases, accurate models are too heavy for industrial use because of the memory and compute constraints of embedded devices. Quantization enables the proposed model to be deployed on devices such as the Raspberry Pi, allowing on-device inference without internet access or cloud computing. Overall, the methodology achieves a 4x reduction in model size with INT8 quantization. A sketch of this pipeline follows the abstract.
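    A minimal sketch of the pipeline the abstract describes: transfer learning from an ImageNet-pretrained VGG16 followed by post-training INT8 quantization with TensorFlow Lite. Only the eight gearbox classes, VGG16/ImageNet, and INT8 quantization come from the abstract; the input shape, classifier head, calibration data, and file names are assumptions, and the paper's actual preprocessing of the accelerometer time series is not shown.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 8  # eight gearbox conditions, per the abstract

# Transfer learning: freeze the ImageNet-pretrained VGG16 feature extractor
# and train a small classification head on top (assumed image-like inputs,
# e.g. spectrograms resized to 224x224x3).
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # dataset not shown here

# Post-training INT8 quantization with TensorFlow Lite (roughly 4x smaller model).
def representative_data():
    # Placeholder calibration samples; real code would yield preprocessed
    # training spectrograms so the quantizer sees realistic activation ranges.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()
with open("gearbox_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

    The resulting .tflite file can then be run with the TensorFlow Lite interpreter on an edge device such as a Raspberry Pi, which is the deployment scenario the abstract targets.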

    ProxQuant: Quantized Neural Networks via Proximal Operators

    Full text link
    To make deep neural networks feasible in resource-constrained environments (such as mobile devices), it is beneficial to quantize models by using low-precision weights. One common technique for quantizing neural networks is the straight-through gradient method, which enables back-propagation through the quantization mapping. Despite its empirical success, little is understood about why the straight-through gradient method works. Building upon a novel observation that the straight-through gradient method is in fact identical to the well-known Nesterov's dual-averaging algorithm on a quantization constrained optimization problem, we propose a more principled alternative approach, called ProxQuant, that formulates quantized network training as a regularized learning problem instead and optimizes it via the prox-gradient method. ProxQuant does back-propagation on the underlying full-precision vector and applies an efficient prox-operator in between stochastic gradient steps to encourage quantizedness. For quantizing ResNets and LSTMs, ProxQuant outperforms state-of-the-art results on binary quantization and is on par with state-of-the-art on multi-bit quantization. For binary quantization, our analysis shows both theoretically and experimentally that ProxQuant is more stable than the straight-through gradient method (i.e., BinaryConnect), challenging the indispensability of the straight-through gradient method and providing a powerful alternative.
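    The core idea, full-precision gradient steps interleaved with a prox operator that pulls weights toward {-1, +1}, can be sketched as follows. The closed-form prox below corresponds to one common reading of a W-shaped binary regularizer; the step size, regularization strength, and helper names are assumptions rather than the paper's exact recipe.

```python
import numpy as np

def prox_binary(w, lam):
    # Prox operator of lam * sum_i min(|w_i - 1|, |w_i + 1|):
    # soft-threshold each coordinate toward its nearest value in {-1, +1}.
    target = np.where(w >= 0, 1.0, -1.0)          # nearest binary value
    z = w - target
    return target + np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def proxquant_step(w, grad, lr, lam):
    # One prox-gradient step: ordinary gradient step on the full-precision
    # weights, then the prox nudges them toward the binary set.
    return prox_binary(w - lr * grad, lr * lam)

# Toy usage with a zero gradient: the prox alone pulls weights toward +/-1.
w = np.array([0.9, -0.2, 1.4, -1.1])
print(proxquant_step(w, grad=np.zeros_like(w), lr=0.1, lam=0.5))
```

    In practice the regularization strength would be increased over training so that the weights become exactly binary by the end, while back-propagation always acts on the full-precision vector, as the abstract describes.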