
    Automated Circuit Approximation Method Driven by Data Distribution

    We propose an application-tailored, data-driven, fully automated method for functional approximation of combinational circuits. We demonstrate how an application-level error metric, such as classification accuracy, can be translated to the component-level error metric needed for an efficient and fast search in the space of approximate low-level components used in the application. This is made possible by employing a weighted mean error distance (WMED) metric to steer the circuit approximation process, which is conducted by means of genetic programming. WMED introduces a set of weights (calculated from the data distribution measured on a selected signal in a given application) that determine the importance of each input vector for the approximation process. The method is evaluated using synthetic benchmarks and application-specific approximate MAC (multiply-and-accumulate) units designed to provide the best trade-offs between classification accuracy and power consumption for two image classifiers based on neural networks. Comment: Accepted for publication at Design, Automation and Test in Europe (DATE 2019), Florence, Italy.
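
    As a rough illustration of the WMED idea, the sketch below scores a candidate approximate component by weighting each input vector's error distance with its measured probability of occurrence. The normalization, the toy approximate multiplier, and the operand distribution are assumptions for illustration, not the paper's definitions.

        import numpy as np

        def wmed(exact_fn, approx_fn, inputs, weights):
            # Weighted mean error distance: per-input error distances weighted by
            # the probability of each input vector observed in the application.
            errors = np.array([abs(exact_fn(x) - approx_fn(x)) for x in inputs], dtype=float)
            weights = np.asarray(weights, dtype=float)
            return float((weights * errors).sum() / weights.sum())

        # Toy 4-bit multiplier approximated by clearing the LSB of one operand
        # (a hypothetical low-cost variant, not a circuit from the paper).
        exact = lambda ab: ab[0] * ab[1]
        approx = lambda ab: (ab[0] & ~1) * ab[1]
        inputs = [(a, b) for a in range(16) for b in range(16)]
        # Made-up operand distribution: small operands assumed to occur more often.
        weights = [1.0 / (1 + a + b) for a, b in inputs]
        print(wmed(exact, approx, inputs, weights))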

    Gabor Filter Assisted Energy Efficient Fast Learning Convolutional Neural Networks

    Convolutional Neural Networks (CNNs) are being increasingly used in computer vision for a wide range of classification and recognition problems. However, training these large networks demands high computational time and energy; hence, their energy-efficient implementation is of great interest. In this work, we reduce the training complexity of CNNs by replacing certain weight kernels of a CNN with Gabor filters. The convolutional layers use the Gabor filters as fixed weight kernels, which extract intrinsic features, alongside regular trainable weight kernels. This combination creates a balanced system that gives better training performance in terms of energy and time, compared to the standalone CNN (without any Gabor kernels), in exchange for tolerable accuracy degradation. We show that the accuracy degradation can be mitigated by partially training the Gabor kernels for a small fraction of the total training cycles. We evaluated the proposed approach on four benchmark applications. Simple tasks such as face detection and character recognition (MNIST and TiCH) were implemented using the LeNet architecture, while the more complex task of object recognition (CIFAR10) was implemented on a state-of-the-art deep CNN (Network in Network) architecture. The proposed approach yields a 1.31-1.53x improvement in training energy compared to a conventional CNN implementation. We also obtain improvements of up to 1.4x in training time, up to 2.23x in storage requirements, and up to 2.2x in memory access energy. The accuracy degradation suffered by the approximate implementations is within 0-3% of the baseline. Comment: Accepted at ISLPED 2017.
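
    A minimal sketch of the core idea follows: a first convolutional layer whose output channels combine fixed Gabor kernels (held as non-trainable buffers) with ordinary trainable kernels. The layer sizes, filter parameters, and the MixedConv name are illustrative assumptions, not the architecture used in the paper.

        import math
        import torch
        import torch.nn as nn

        def gabor_kernel(size=5, sigma=2.0, theta=0.0, lambd=4.0, gamma=0.5, psi=0.0):
            # Real part of a standard 2-D Gabor filter.
            half = size // 2
            ys, xs = torch.meshgrid(
                torch.arange(-half, half + 1, dtype=torch.float32),
                torch.arange(-half, half + 1, dtype=torch.float32),
                indexing="ij")
            x_t = xs * math.cos(theta) + ys * math.sin(theta)
            y_t = -xs * math.sin(theta) + ys * math.cos(theta)
            return torch.exp(-(x_t ** 2 + (gamma * y_t) ** 2) / (2 * sigma ** 2)) \
                 * torch.cos(2 * math.pi * x_t / lambd + psi)

        class MixedConv(nn.Module):
            # First layer mixing fixed Gabor kernels with trainable kernels.
            def __init__(self, n_gabor=4, n_trainable=4):
                super().__init__()
                gabor = torch.stack([gabor_kernel(theta=k * math.pi / n_gabor)
                                     for k in range(n_gabor)]).unsqueeze(1)
                self.register_buffer("gabor_weight", gabor)   # fixed, never updated
                self.trainable = nn.Conv2d(1, n_trainable, kernel_size=5, padding=2)

            def forward(self, x):
                fixed = nn.functional.conv2d(x, self.gabor_weight, padding=2)
                return torch.cat([fixed, self.trainable(x)], dim=1)

        # Usage on a batch of MNIST-sized grayscale images -> (8, 8, 28, 28) features.
        out = MixedConv()(torch.randn(8, 1, 28, 28))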

    The Effects of Approximate Multiplication on Convolutional Neural Networks

    This paper analyzes the effects of approximate multiplication when performing inferences on deep convolutional neural networks (CNNs). Approximate multiplication can reduce the cost of the underlying circuits so that CNN inferences can be performed more efficiently in hardware accelerators. The study identifies the critical factors in the convolution, fully-connected, and batch normalization layers that allow more accurate CNN predictions despite the errors introduced by approximate multiplication. The same factors also provide an arithmetic explanation of why bfloat16 multiplication performs well on CNNs. The experiments are performed with recognized network architectures to show that the approximate multipliers can produce predictions that are nearly as accurate as the FP32 references, without additional training. For example, the ResNet and Inception-v4 models with Mitch-ww6 multiplication produce Top-5 errors within 0.2% of the FP32 references. A brief cost comparison of Mitch-ww6 against bfloat16 is presented, in which a MAC operation saves up to 80% of energy compared to bfloat16 arithmetic. The most far-reaching contribution of this paper is the analytical justification that multiplications can be approximated while additions need to be exact in CNN MAC operations. Comment: 12 pages, 11 figures, 4 tables, accepted for publication in the IEEE Transactions on Emerging Topics in Computing.
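
    For context, the Mitch-w multipliers build on Mitchell's logarithm-based approximate multiplication. The sketch below shows the basic Mitchell algorithm for unsigned integers; the operand truncation that distinguishes Mitch-ww6 and the handling of signed values are omitted, so this is an illustration of the underlying idea rather than the multiplier evaluated in the paper.

        def mitchell_mult(a: int, b: int) -> int:
            # Mitchell's approximate multiplication: add the approximate log2 of the
            # operands (integer part plus a linearly approximated fraction), then
            # take the approximate antilog. Worst-case relative error is about 11%.
            if a == 0 or b == 0:
                return 0
            ka, kb = a.bit_length() - 1, b.bit_length() - 1   # integer parts of log2
            fa = (a - (1 << ka)) / (1 << ka)                  # fractional parts
            fb = (b - (1 << kb)) / (1 << kb)
            k, f = ka + kb, fa + fb
            if f >= 1.0:                                      # carry into the integer part
                k, f = k + 1, f - 1.0
            return int((1 << k) * (1 + f))

        print(mitchell_mult(100, 200), 100 * 200)   # 18432 vs. 20000 exact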

    An energy efficient additive neural network

    In this paper, we propose a new energy-efficient neural network with the universal approximation property over the space of Lebesgue-integrable functions. This network, called the additive neural network, is well suited to mobile computing. The neural structure is based on a novel vector product definition, called the ef-operator, that permits a multiplier-free implementation. In the ef-operation, the 'product' of two real numbers is defined as the sum of their absolute values, with the sign determined by the sign of the product of the numbers. This 'product' is used to construct a vector product in n-dimensional Euclidean space. The vector product induces the lasso (l1) norm. The proposed additive neural network successfully solves the XOR problem. Experiments on the MNIST dataset show that the classification performance of the proposed additive neural networks is very similar to that of the corresponding multi-layer perceptron. © 2017 IEEE
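
    A minimal sketch of the ef-operator as described above, assuming a straightforward reading of the definition (the paper's exact formulation, scaling, and bias handling may differ):

        import numpy as np

        def ef_product(x, y):
            # Multiplier-free 'product': sign of x*y times the sum of absolute values.
            # The sign factors are +/-1 (or 0), so in hardware this reduces to sign
            # logic plus an addition rather than a true multiplication.
            return np.sign(x) * np.sign(y) * (np.abs(x) + np.abs(y))

        def ef_dot(x, w):
            # Vector 'product' built from the scalar ef-operator; it replaces the dot
            # product inside a neuron, leaving only additions and sign flips.
            return ef_product(x, w).sum(axis=-1)

        # One additive 'neuron' on a toy input (bias and activation applied as usual).
        x = np.array([0.5, -1.0, 2.0])
        w = np.array([1.5, 0.25, -0.5])
        print(ef_dot(x, w))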