Resilient neural network training for accelerators with computing errors
With the advancement of neural networks, customized accelerators are increasingly adopted in many AI applications. To gain higher energy efficiency or performance, hardware design optimizations such as near-threshold logic or overclocking can be applied. In these cases, computing errors may occur, and such errors are difficult to capture with conventional training on general-purpose processors (GPPs). Directly applying offline-trained neural network models to accelerators with computing errors may therefore lead to considerable prediction accuracy loss.
To address this problem, we explore the resilience of neural network models and relax the accelerator design constraints to enable aggressive design options. First, we propose to train the neural network models using the accelerators' forward computing results, such that the models learn both the data and the computing errors.
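The training loop described here runs the forward pass on the faulty accelerator itself; absent that hardware, one way to picture the idea is a software error model injected into the forward computation while the backward pass stays exact. A minimal sketch assuming PyTorch; ErrorInjectedLinear and its random-corruption noise model are illustrative placeholders, not the authors' error model:

import torch
import torch.nn as nn

class ErrorInjectedLinear(nn.Linear):
    """Linear layer whose forward pass emulates accelerator computing
    errors, so training sees the same faulty arithmetic it will meet
    at deployment. The random corruption below is a hypothetical
    stand-in: the paper instead uses the accelerator's actual
    forward computing results."""

    def __init__(self, in_features, out_features, error_rate=1e-3):
        super().__init__(in_features, out_features)
        self.error_rate = error_rate  # fraction of outputs corrupted

    def forward(self, x):
        y = super().forward(x)
        # Corrupt a random subset of outputs to mimic computing errors.
        mask = torch.rand_like(y) < self.error_rate
        noise = torch.randn_like(y) * y.detach().abs().mean()
        return torch.where(mask, y + noise, y)

Training then proceeds as usual; only the forward computation carries the error model, mirroring the split between the accelerator (forward) and the GPP (backward).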
In addition, we observe that some neural network layers are more sensitive to computing errors than others. Based on this observation, we schedule the most sensitive layer to the attached GPP to reduce the negative influence of the computing errors.
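The abstract does not say how sensitivity is measured; one plausible reading is a per-layer ablation that injects errors into a single layer at a time and records the accuracy drop. A rough sketch under that assumption, where inject_error is a hypothetical context manager that temporarily gives the named layer a faulty forward pass:

import torch

@torch.no_grad()
def accuracy(model, loader):
    """Plain top-1 accuracy over a data loader."""
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total

def most_sensitive_layer(model, loader, inject_error, layer_names):
    """Rank layers by the accuracy drop seen when computing errors
    are injected into one layer alone; the layer with the largest
    drop is the candidate to schedule onto the attached GPP, where
    it runs with exact arithmetic. inject_error(model, name) is a
    hypothetical helper, not part of the paper."""
    baseline = accuracy(model, loader)
    drops = {}
    for name in layer_names:
        with inject_error(model, name):
            drops[name] = baseline - accuracy(model, loader)
    return max(drops, key=drops.get)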
network models obtained from the proposed training outperform
the original models significantly when the CNN accelerators are
affected by computing errors