Bit Error Robustness for Energy-Efficient DNN Accelerators
Deep neural network (DNN) accelerators have received considerable attention in
recent years because they save energy compared to mainstream hardware.
Low-voltage operation of DNN accelerators further reduces energy consumption
significantly, but it causes bit-level failures in the memory storing the
quantized DNN weights. In this paper, we show that a combination of robust
fixed-point quantization, weight clipping, and random bit error training
(RandBET) improves robustness against random bit errors in (quantized) DNN
weights significantly. This leads to high energy savings from both low-voltage
operation as well as low-precision quantization. Our approach generalizes
across operating voltages and accelerators, as demonstrated on bit errors from
profiled SRAM arrays. We also discuss why weight clipping alone is already
quite an effective way to achieve robustness against bit errors. Moreover, we
specifically discuss the involved trade-offs regarding accuracy, robustness and
precision: Without losing more than 1% in accuracy compared to a normally
trained 8-bit DNN, we can reduce energy consumption on CIFAR-10 by 20%. Higher
energy savings of, e.g., 30%, are possible at the cost of 2.5% accuracy, even
for 4-bit DNNs.
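The random bit errors described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the symmetric quantization scheme, the unsigned storage offset, and the function names `quantize`, `dequantize`, and `inject_bit_errors` are all assumptions made for illustration. It clips weights to a range, quantizes them to 8-bit codes, and flips each stored bit independently with probability `p`.

```python
import numpy as np

def quantize(weights, w_max, bits=8):
    """Clip to [-w_max, w_max], then quantize symmetrically (assumed scheme)."""
    levels = 2 ** (bits - 1) - 1
    clipped = np.clip(weights, -w_max, w_max)
    q = np.round(clipped / w_max * levels).astype(np.int32)
    # Store with an offset so codes are unsigned integers.
    return (q + levels).astype(np.uint8)

def dequantize(codes, w_max, bits=8):
    """Map unsigned codes back to floating-point weights."""
    levels = 2 ** (bits - 1) - 1
    return (codes.astype(np.float32) - levels) / levels * w_max

def inject_bit_errors(codes, p, rng, bits=8):
    """Flip each of the `bits` low-order stored bits independently with probability p."""
    flips = rng.random((codes.size, bits)) < p
    # Combine the per-bit flip decisions into one XOR mask per weight.
    masks = (flips * (1 << np.arange(bits))).sum(axis=1).astype(np.uint8)
    return codes ^ masks.reshape(codes.shape)

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=1000)
codes = quantize(weights, w_max=0.25)
noisy = inject_bit_errors(codes, p=0.01, rng=rng)
perturbed = dequantize(noisy, w_max=0.25)
```

In this assumed scheme, a flip of the most significant bit changes a weight by roughly `w_max`, so a smaller clipping range limits the absolute damage of any single bit error. Training with such errors injected in the forward pass is the general idea behind random bit error training, though the details here are a sketch only.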
NeuralFuse: Learning to Improve the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
Deep neural networks (DNNs) have become ubiquitous in machine learning, but
their energy consumption remains a notable issue. Lowering the supply voltage
is an effective strategy for reducing energy consumption. However, aggressively
scaling down the supply voltage can lead to accuracy degradation due to random
bit flips in static random access memory (SRAM) where model parameters are
stored. To address this challenge, we introduce NeuralFuse, a novel add-on
module that addresses the accuracy-energy tradeoff in low-voltage regimes by
learning input transformations to generate error-resistant data
representations. NeuralFuse protects DNN accuracy in both nominal and
low-voltage scenarios. Moreover, NeuralFuse is easy to implement and can be
readily applied to DNNs with limited access, such as non-configurable hardware
or remote access to cloud-based APIs. Experimental results demonstrate that, at
a 1% bit error rate, NeuralFuse can reduce SRAM memory access energy by up to
24% while improving accuracy by up to 57%. To the best of our knowledge, this
is the first model-agnostic approach (i.e., no model retraining) to address
low-voltage-induced bit errors. The source code is available at
https://github.com/IBM/NeuralFuse
Accelerating Neuromorphic Vision Algorithms for Recognition
Video analytics introduce new levels of intelligence to automated scene understanding. Neuromorphic algorithms, such as HMAX, have been proposed as robust and accurate algorithms that mimic the processing in the visual cortex of the brain. HMAX, for instance, is a versatile algorithm that can be repurposed to target several visual recognition applications. This paper presents the design and evaluation of hardware accelerators for extracting visual features for universal recognition. The recognition applications include object recognition, face identification, facial expression recognition, and action recognition. These accelerators were validated on a multi-FPGA platform, and significant gains in performance and power efficiency were demonstrated when compared to CMP and GPU platforms. Results demonstrate as much as 7.6X speedup and 12.8X more power-efficient performance when compared to those platforms.