2 research outputs found

    FATE: Fast and Accurate Timing Error Prediction Framework for Low Power DNN Accelerator Design

    Full text link
    Deep neural networks (DNN) are increasingly being accelerated on application-specific hardware such as the Google TPU designed especially for deep learning. Timing speculation is a promising approach to further increase the energy efficiency of DNN accelerators. Architectural exploration for timing speculation requires detailed gate-level timing simulations that can be time-consuming for large DNNs that execute millions of multiply-and-accumulate (MAC) operations. In this paper we propose FATE, a new methodology for fast and accurate timing simulations of DNN accelerators like the Google TPU. FATE proposes two novel ideas: (i) DelayNet, a DNN based timing model for MAC units; and (ii) a statistical sampling methodology that reduces the number of MAC operations for which timing simulations are performed. We show that FATE results in between 8 times-58 times speed-up in timing simulations, while introducing less than 2% error in classification accuracy estimates. We demonstrate the use of FATE by comparing to conventional DNN accelerator that uses 2's complement (2C) arithmetic with an alternative implementation that uses signed magnitude representations (SMR). We show that that the SMR implementation provides 18% more energy savings for the same classification accuracy than 2C, a result that might be of independent interest.Comment: To appear at IEEE/ACM International Conference On Computer Aided Design 201

    ThUnderVolt: Enabling Aggressive Voltage Underscaling and Timing Error Resilience for Energy Efficient Deep Neural Network Accelerators

    Full text link
    Hardware accelerators are being increasingly deployed to boost the performance and energy efficiency of deep neural network (DNN) inference. In this paper we propose Thundervolt, a new framework that enables aggressive voltage underscaling of high-performance DNN accelerators without compromising classification accuracy even in the presence of high timing error rates. Using post-synthesis timing simulations of a DNN accelerator modeled on the Google TPU, we show that Thundervolt enables between 34%-57% energy savings on state-of-the-art speech and image recognition benchmarks with less than 1% loss in classification accuracy and no performance loss. Further, we show that Thundervolt is synergistic with and can further increase the energy efficiency of commonly used run-time DNN pruning techniques like Zero-Skip
    corecore