Automated Circuit Approximation Method Driven by Data Distribution
We propose an application-tailored, data-driven, fully automated method for
functional approximation of combinational circuits. We demonstrate how an
application-level error metric such as the classification accuracy can be
translated to a component-level error metric needed for an efficient and fast
search in the space of approximate low-level components that are used in the
application. This is possible by employing a weighted mean error distance
(WMED) metric for steering the circuit approximation process which is conducted
by means of genetic programming. WMED introduces a set of weights (calculated
from the data distribution measured on a selected signal in a given
application) determining the importance of each input vector for the
approximation process. The method is evaluated using synthetic benchmarks and
application-specific approximate MAC (multiply-and-accumulate) units that are
designed to provide the best trade-offs between the classification accuracy and
power consumption of two image classifiers based on neural networks.
Comment: Accepted for publication at Design, Automation and Test in Europe (DATE 2019), Florence, Italy.
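To make the WMED metric concrete, here is a minimal Python sketch of how a weighted mean error distance could be computed for a candidate approximate component; the weights represent the relative frequency of each input vector as measured on a selected signal in the application (the function and parameter names are illustrative, and the paper's exact formulation may differ).

    import numpy as np

    def wmed(exact_outputs, approx_outputs, weights):
        # Weighted mean error distance: each input vector's absolute error
        # is scaled by a weight derived from the application's data
        # distribution, so frequent inputs dominate the resulting value.
        exact = np.asarray(exact_outputs, dtype=float)
        approx = np.asarray(approx_outputs, dtype=float)
        w = np.asarray(weights, dtype=float)
        return float(np.sum(w * np.abs(exact - approx)) / np.sum(w))

A genetic-programming loop would then use such a value as part of the fitness of each candidate circuit, preferring candidates with low WMED and low estimated power.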
An energy efficient additive neural network
In this paper, we propose a new energy-efficient neural network with the universal approximation property over the space of Lebesgue-integrable functions. This network, called the additive neural network, is well suited to mobile computing. The neural structure is based on a novel vector product definition, called the ef-operator, that permits a multiplier-free implementation. In the ef-operation, the 'product' of two real numbers is defined as the sum of their absolute values, with the sign determined by the sign of the product of the numbers. This 'product' is used to construct a vector product in n-dimensional Euclidean space, and the vector product induces the lasso norm. The proposed additive neural network successfully solves the XOR problem, and experiments on the MNIST dataset show that its classification performance is very similar to that of the corresponding multi-layer perceptron. © 2017 IEEE
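As an illustration of the ef-operator described above, the following Python sketch computes the multiplier-free vector 'product' of two vectors: each element-wise 'product' is the sum of absolute values carrying the sign of the true product, and the results are accumulated (this is an illustrative reading of the abstract, not the authors' reference implementation).

    import numpy as np

    def ef_vector_product(x, w):
        # Element-wise ef-operation: sign(x*w) * (|x| + |w|), realizable
        # with additions and sign manipulation only, then accumulated.
        return float(np.sum(np.sign(x) * np.sign(w) * (np.abs(x) + np.abs(w))))

    # Example: ef_vector_product(np.array([1.0, -2.0]), np.array([3.0, 0.5]))
    # returns (1 + 3) - (2 + 0.5) = 1.5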
Gabor Filter Assisted Energy Efficient Fast Learning Convolutional Neural Networks
Convolutional Neural Networks (CNN) are being increasingly used in computer
vision for a wide range of classification and recognition problems. However,
training these large networks demands high computational time and energy
requirements; hence, their energy-efficient implementation is of great
interest. In this work, we reduce the training complexity of CNNs by replacing
certain weight kernels of a CNN with Gabor filters. The convolutional layers
use the Gabor filters as fixed weight kernels, which extract intrinsic
features, alongside regular trainable weight kernels. This combination creates a
balanced system that gives better training performance in terms of energy and
time, compared to the standalone CNN (without any Gabor kernels), in exchange
for tolerable accuracy degradation. We show that the accuracy degradation can
be mitigated by partially training the Gabor kernels, for a small fraction of
the total training cycles. We evaluated the proposed approach on 4 benchmark
applications. Simple tasks such as face detection and character recognition
(MNIST and TiCH) were implemented using the LeNet architecture, while the more
complex task of object recognition (CIFAR-10) was implemented on a
state-of-the-art deep CNN (Network in Network) architecture. The proposed
approach yields 1.31-1.53x improvement in training energy compared to a
conventional CNN implementation. We also obtain improvements of up to 1.4x in
training time, up to 2.23x in storage requirements, and up to 2.2x in
memory-access energy. The accuracy degradation suffered by the approximate
implementations is within 0-3% of the baseline.
Comment: Accepted in ISLPED 201
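For readers unfamiliar with Gabor kernels, the Python sketch below generates a small fixed bank of real-valued Gabor filters at several orientations, the kind of kernels that could be frozen (not trained) in a convolutional layer alongside regular trainable kernels; the kernel size and hyperparameters here are arbitrary illustrative choices, not the settings used in the paper.

    import numpy as np

    def gabor_kernel(size, sigma, theta, lambd, gamma=0.5, psi=0.0):
        # Real part of a 2-D Gabor filter: a Gaussian envelope modulated
        # by a cosine wave oriented at angle theta.
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
        x_t = x * np.cos(theta) + y * np.sin(theta)
        y_t = -x * np.sin(theta) + y * np.cos(theta)
        envelope = np.exp(-(x_t ** 2 + (gamma * y_t) ** 2) / (2 * sigma ** 2))
        return envelope * np.cos(2 * np.pi * x_t / lambd + psi)

    # Fixed bank at 4 orientations, e.g. to fill part of a first-layer
    # kernel tensor whose remaining kernels stay trainable.
    bank = np.stack([gabor_kernel(5, sigma=2.0, theta=t, lambd=4.0)
                     for t in np.linspace(0, np.pi, 4, endpoint=False)])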
The Effects of Approximate Multiplication on Convolutional Neural Networks
This paper analyzes the effects of approximate multiplication when performing
inferences on deep convolutional neural networks (CNNs). The approximate
multiplication can reduce the cost of the underlying circuits so that CNN
inferences can be performed more efficiently in hardware accelerators. The
study identifies the critical factors in the convolution, fully-connected, and
batch normalization layers that allow more accurate CNN predictions despite the
errors from approximate multiplication. The same factors also provide an
arithmetic explanation of why bfloat16 multiplication performs well on CNNs.
The experiments are performed with recognized network architectures to show
that the approximate multipliers can produce predictions that are nearly as
accurate as the FP32 references, without additional training. For example, the
ResNet and Inception-v4 models with Mitch-6 multiplication produce Top-5
errors that are within 0.2% of the FP32 references. A brief cost comparison of
Mitch-6 against bfloat16 is presented, where a MAC operation saves up to 80% of
the energy of bfloat16 arithmetic. The most
far-reaching contribution of this paper is the analytical justification that
multiplications can be approximated while additions need to be exact in CNN MAC
operations.
Comment: 12 pages, 11 figures, 4 tables, accepted for publication in the IEEE
Transactions on Emerging Topics in Computing
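The Mitch-6 multiplier mentioned above belongs to the family of designs built on Mitchell's logarithmic multiplication, which replaces a multiplication by an addition in the log domain. The following Python sketch shows the basic idea for positive integers; the actual Mitch-6 hardware truncates operands and handles signs and corner cases differently, so this is only an illustration of the underlying approximation.

    def mitchell_multiply(a, b):
        # Approximate log2 of each operand as (position of leading one) +
        # (fractional mantissa), add the two logs, then convert back with
        # the same linear approximation of the antilogarithm.
        if a == 0 or b == 0:
            return 0
        ka, kb = a.bit_length() - 1, b.bit_length() - 1
        fa = a / (1 << ka) - 1.0   # mantissa fraction in [0, 1)
        fb = b / (1 << kb) - 1.0
        log_sum = ka + kb + fa + fb
        k = int(log_sum)
        return int(round((1.0 + (log_sum - k)) * (1 << k)))

    # Example: mitchell_multiply(100, 200) gives 18432 instead of the exact
    # 20000, an error of roughly 8% (Mitchell's worst case is about 11%).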
Number Systems for Deep Neural Network Architectures: A Survey
Deep neural networks (DNNs) have become an enabling component for a myriad of
artificial intelligence applications. DNNs have sometimes shown superior
performance, even compared to humans, in areas such as self-driving and health
applications. Because of their computational complexity, deploying DNNs in
resource-constrained devices still faces many challenges related to computing
complexity, energy efficiency, latency, and cost. To this end, several research
directions are being pursued by both academia and industry to accelerate and
efficiently implement DNNs. One important direction is determining the
appropriate data representation for the massive amount of data involved in DNN
processing. Conventional number systems have been found to be sub-optimal for
DNNs, so a large body of research focuses on exploring more suitable number
systems. This article aims to provide a comprehensive survey and
discussion about alternative number systems for more efficient representations
of DNN data. Various number systems (conventional/unconventional) exploited for
DNNs are discussed. The impact of these number systems on the performance and
hardware design of DNNs is considered. In addition, this paper highlights the
challenges associated with each number system and various solutions that are
proposed for addressing them. The reader will be able to understand the
importance of an efficient number system for DNNs, learn about the widely used
number systems for DNNs, understand the trade-offs between various number
systems, and consider various design aspects that affect the impact of number
systems on DNN performance. In addition, recent trends and related research
opportunities will be highlighted.
Comment: 28 pages
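As one concrete example of the kind of representation choice such a survey covers, the Python sketch below quantizes FP32 values into a signed fixed-point format, a conventional alternative to floating point that is frequently used for DNN inference; the word length and fractional bit count here are arbitrary illustrative parameters, not recommendations from the article.

    import numpy as np

    def to_fixed_point(x, total_bits=16, frac_bits=8):
        # Map real values to signed fixed-point codes with 'frac_bits'
        # fractional bits, saturating at the representable range, and
        # return both the integer codes and the dequantized values.
        scale = 1 << frac_bits
        lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
        codes = np.clip(np.round(np.asarray(x, dtype=float) * scale), lo, hi)
        codes = codes.astype(np.int32)
        return codes, codes.astype(np.float64) / scale

    # Example: to_fixed_point([0.7311, -1.25]) -> codes [187, -320],
    # dequantized values [0.73046875, -1.25]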