An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration
We empirically evaluate an undervolting technique, i.e., underscaling the
circuit supply voltage below the nominal level, to improve the power-efficiency
of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable
Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing
faults due to excessive circuit latency increase. We evaluate the
reliability-power trade-off for such accelerators. Specifically, we
experimentally study the reduced-voltage operation of multiple components of
real FPGAs, characterize the corresponding reliability behavior of CNN
accelerators, propose techniques to minimize the drawbacks of reduced-voltage
operation, and combine undervolting with architectural CNN optimization
techniques, i.e., quantization and pruning. We investigate the effect of
environmental temperature on the reliability-power trade-off of such
accelerators. We perform experiments on three identical samples of modern
Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification
CNN benchmarks. This setup allows us to study the effects of our undervolting
technique under both software (benchmark) and hardware (chip-to-chip)
variability. We achieve
more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain
is the result of eliminating the voltage guardband region, i.e., the safe
voltage region below the nominal level that is set by the FPGA vendor to ensure
correct functionality in worst-case environmental and circuit conditions. 43%
of the power-efficiency gain is due to further undervolting below the
guardband, which comes at the cost of accuracy loss in the CNN accelerator. We
evaluate an effective frequency underscaling technique that prevents this
accuracy loss, and find that it reduces the power-efficiency gain from 43% to
25%. Comment: To appear at the DSN 2020 conference.
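As a quick sanity check, the reported gains compose as follows under one plausible reading (our assumption, not stated explicitly in the abstract): the further sub-guardband gain acts as a multiplicative factor on top of the 2.6X guardband-elimination gain.

```python
# Back-of-envelope composition of the reported power-efficiency gains.
# The multiplicative interpretation below is our assumption, not the paper's.

guardband_gain = 2.6            # gain from eliminating the voltage guardband
extra_below_guardband = 0.43    # further gain from undervolting below it
extra_with_freq_scaling = 0.25  # same, once frequency underscaling avoids accuracy loss

total = guardband_gain * (1 + extra_below_guardband)
total_safe = guardband_gain * (1 + extra_with_freq_scaling)

print(f"total gain (with accuracy loss): {total:.2f}x")       # 3.72x
print(f"total gain (accuracy preserved): {total_safe:.2f}x")  # 3.25x
```

Both readings stay above 3X, which is consistent with the "more than 3X" headline figure.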
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
Wireless sensor networks monitor dynamic environments that change rapidly
over time. This dynamic behavior is either caused by external factors or
initiated by the system designers themselves. To adapt to such conditions,
sensor networks often adopt machine learning techniques to eliminate the need
for unnecessary redesign. Machine learning also inspires many practical
solutions that maximize resource utilization and prolong the lifespan of the
network. In this paper, we present an extensive literature review over the
period 2002-2013 of machine learning methods that were used to address common
issues in wireless sensor networks (WSNs). The advantages and disadvantages of
each proposed algorithm are evaluated against the corresponding problem. We
also provide a comparative guide to aid WSN designers in developing suitable
machine learning solutions for their specific application challenges. Comment: Accepted for publication in IEEE Communications Surveys and Tutorials.
Efficient Error-Tolerant Quantized Neural Network Accelerators
Neural networks are currently among the most widely deployed machine
learning algorithms. In particular, Convolutional Neural Networks (CNNs) are
gaining popularity and are being evaluated for deployment in safety-critical
applications such as self-driving vehicles. Modern CNNs feature enormous memory
bandwidth and high computational needs, challenging existing hardware platforms
to meet throughput, latency, and power requirements. Functional safety and
error tolerance must be considered as additional requirements in safety-critical
systems. In general, fault-tolerant operation can be achieved by adding
redundancy to the system, which further exacerbates the computational
demands. Furthermore, the question arises whether pruning and quantization
methods used for performance scaling turn out to be counterproductive with
regard to fail-safety requirements. In this work we present a methodology to evaluate
the impact of permanent faults affecting Quantized Neural Networks (QNNs) and
how to effectively decrease their effects in hardware accelerators. We use
FPGA-based, hardware-accelerated error injection to enable fast evaluation. A
detailed analysis shows that QNNs containing convolutional layers are far less
robust to faults than commonly believed and can suffer accuracy drops of up to
10%. To circumvent this, we propose two
different methods to increase their robustness: 1) selective channel
replication which adds significantly less redundancy than used by the common
triple modular redundancy and 2) a fault-aware scheduling of processing
elements for folded implementations. Comment: 6 pages, 5 figures.
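The abstract's claim that selective channel replication adds far less redundancy than triple modular redundancy (TMR) can be illustrated with a minimal cost sketch. The channel count and the number of fault-sensitive channels below are assumed values for illustration, not figures from the paper.

```python
# Illustrative redundancy-cost comparison: full TMR vs. selective channel
# replication. All layer sizes here are hypothetical assumptions.

def redundancy_overhead(num_channels: int, replicated: int, copies: int) -> float:
    """Fraction of extra compute from giving `replicated` channels
    `copies` extra copies each, relative to the unprotected layer."""
    return copies * replicated / num_channels

channels = 64   # channels in a hypothetical convolutional layer
critical = 8    # assumed number of fault-sensitive channels

# TMR triplicates every channel: two extra copies of all of them.
tmr = redundancy_overhead(channels, channels, copies=2)
# Selective replication duplicates only the critical channels.
selective = redundancy_overhead(channels, critical, copies=1)

print(f"TMR overhead:                   +{tmr:.0%}")        # +200%
print(f"Selective replication overhead: +{selective:.1%}")  # +12.5%
```

Under these assumed numbers, protecting only the sensitive channels costs a small fraction of TMR's uniform triplication, which is the trade-off the paper's first proposed method exploits.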