1,395 research outputs found
An investigation into adaptive power reduction techniques for neural hardware
In light of the growing applicability of Artificial Neural Network (ANN) in the signal processing field [1] and the present thrust of the semiconductor industry towards lowpower SOCs for mobile devices [2], the power consumption of ANN hardware has become a very important implementation issue. Adaptability is a powerful and useful feature of neural networks. All current approaches for low-power ANN hardware techniques are ‘non-adaptive’ with respect to the power consumption of the network (i.e. power-reduction is not an objective of the adaptation/learning process). In the research work presented in this thesis, investigations on possible adaptive power reduction techniques have been carried out, which attempt to exploit the adaptability of neural networks in order to reduce the power consumption. Three separate approaches for such adaptive power reduction are proposed: adaptation of size, adaptation of network weights and adaptation of calculation precision. Initial case studies exhibit promising results with significantpower reduction
Non-Volatile Memory Array Based Quantization- and Noise-Resilient LSTM Neural Networks
In cloud and edge computing models, it is important that compute devices at
the edge be as power efficient as possible. Long short-term memory (LSTM)
neural networks have been widely used for natural language processing, time
series prediction and many other sequential data tasks. Thus, for these
applications there is increasing need for low-power accelerators for LSTM model
inference at the edge. In order to reduce power dissipation due to data
transfers within inference devices, there has been significant interest in
accelerating vector-matrix multiplication (VMM) operations using non-volatile
memory (NVM) weight arrays. In NVM array-based hardware, reduced bit-widths
also significantly increases the power efficiency. In this paper, we focus on
the application of quantization-aware training algorithm to LSTM models, and
the benefits these models bring in terms of resilience against both
quantization error and analog device noise. We have shown that only 4-bit NVM
weights and 4-bit ADC/DACs are needed to produce equivalent LSTM network
performance as floating-point baseline. Reasonable levels of ADC quantization
noise and weight noise can be naturally tolerated within our NVMbased quantized
LSTM network. Benchmark analysis of our proposed LSTM accelerator for inference
has shown at least 2.4x better computing efficiency and 40x higher area
efficiency than traditional digital approaches (GPU, FPGA, and ASIC). Some
other novel approaches based on NVM promise to deliver higher computing
efficiency (up to 4.7x) but require larger arrays with potential higher error
rates.Comment: Published in: 2019 IEEE International Conference on Rebooting
Computing (ICRC
A Decade of Neural Networks: Practical Applications and Prospects
The Jet Propulsion Laboratory Neural Network Workshop, sponsored by NASA and DOD, brings together sponsoring agencies, active researchers, and the user community to formulate a vision for the next decade of neural network research and application prospects. While the speed and computing power of microprocessors continue to grow at an ever-increasing pace, the demand to intelligently and adaptively deal with the complex, fuzzy, and often ill-defined world around us remains to a large extent unaddressed. Powerful, highly parallel computing paradigms such as neural networks promise to have a major impact in addressing these needs. Papers in the workshop proceedings highlight benefits of neural networks in real-world applications compared to conventional computing techniques. Topics include fault diagnosis, pattern recognition, and multiparameter optimization
Memory Devices and A/D Interfaces: Design Trade-offs in Mixed-Signal Accelerators for Machine Learning Applications
This tutorial focuses on memory elements and analog/digital (A/D) interfaces used in mixed-signal accelerators for deep neural networks (DNNs) in machine learning (ML) applications. These very dedicated systems exploit analog in-memory computation (AiMC) of weights and input activations to accelerate the DNN algorithm. The co-optimization of the memory cell storing the weights with the peripheral circuits is mandatory for improving the performance metrics of the accelerator. In this tutorial, four memory devices for AiMC are reported and analyzed with their computation scheme,
including the digital-to-analog converter (DAC). Moreover, we review analog-to-digital converters (ADCs) for the quantization of the AiMC results, focusing on the design trade-offs of the different topologies given by the context
- …