    An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration

    We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We also investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique under both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by the FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%. Comment: To appear at the DSN 2020 conference.
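The reported gains can be checked with a little arithmetic. The sketch below assumes one particular reading of the abstract, which the abstract itself does not spell out: the 2.6X guardband gain and the further 43% (or 25% with frequency underscaling) compound multiplicatively. The function names are hypothetical and the numbers are taken from the abstract, not measured here.

```python
def power_efficiency(gops: float, watts: float) -> float:
    """Power-efficiency in GOPs/W: throughput divided by power draw."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return gops / watts


def compound_gain(guardband_gain: float, extra_fraction: float) -> float:
    """Total gain when a further fractional gain is applied on top of the
    guardband gain (assumption: the two gains multiply)."""
    return guardband_gain * (1.0 + extra_fraction)


GUARDBAND_GAIN = 2.6       # from the abstract: eliminating the guardband
BELOW_GUARDBAND = 0.43     # from the abstract: further undervolting
WITH_FREQ_SCALING = 0.25   # from the abstract: with frequency underscaling

total_with_accuracy_loss = compound_gain(GUARDBAND_GAIN, BELOW_GUARDBAND)
total_without_accuracy_loss = compound_gain(GUARDBAND_GAIN, WITH_FREQ_SCALING)

print(f"below guardband:       {total_with_accuracy_loss:.2f}X")
print(f"with frequency scaling: {total_without_accuracy_loss:.2f}X")
```

Under this reading, both totals stay above 3X, which is consistent with the headline figure in the abstract.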

    Machine Learning for Microcontroller Performance Screening

    In safety-critical applications, microcontrollers must satisfy strict quality and performance constraints, in particular on Fmax (the maximum operating frequency). Traditional speed-binning techniques are not feasible in mass production because of the high cost of the required test equipment. The literature has shown that data extracted from on-chip ring oscillators (ROs) can model the Fmax of integrated circuits: machine learning models trained on these data predict the actual operating frequency of the devices. Once trained, those models can be applied to the RO data from every produced device with low effort and no need for high-cost equipment. This research aims to develop machine learning methodologies to be deployed in the MCU screening process, allowing for more efficient and accurate Fmax estimation as well as improved speed binning. The effectiveness of this approach has been demonstrated on a real-world dataset of microcontroller data.
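The idea of predicting Fmax from ring-oscillator readings can be illustrated with a minimal regression sketch. Everything here is an assumption made for illustration: the data are synthetic, the device model (a latent "speed" factor driving both RO frequency and Fmax) is invented, and a one-feature least-squares line stands in for whatever models the authors actually use.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def make_device(n_ros: int = 8):
    """Synthetic device: a latent speed factor shifts both the RO
    frequencies and Fmax (hypothetical model, in MHz)."""
    speed = rng.gauss(0.0, 1.0)
    ro_mhz = [100.0 + 5.0 * speed + rng.gauss(0.0, 0.5) for _ in range(n_ros)]
    fmax_mhz = 200.0 + 8.0 * speed + rng.gauss(0.0, 0.5)
    return ro_mhz, fmax_mhz


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


devices = [make_device() for _ in range(300)]
features = [sum(ro) / len(ro) for ro, _ in devices]  # mean RO frequency
targets = [fmax for _, fmax in devices]

# Train on the first 200 devices (those with a measured Fmax label),
# then predict Fmax for the rest from RO data alone.
slope, intercept = fit_line(features[:200], targets[:200])
predicted = [slope * x + intercept for x in features[200:]]
rmse = math.sqrt(
    sum((p - t) ** 2 for p, t in zip(predicted, targets[200:])) / len(predicted)
)
print(f"held-out RMSE: {rmse:.2f} MHz")
```

Once the line is fitted, screening a new device needs only its RO readings, which is the low-cost property the abstract highlights.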

    A Multi-Label Active Learning Framework for Microcontroller Performance Screening

    In safety-critical applications, microcontrollers have to be tested to satisfy strict quality and performance constraints. It has been demonstrated that on-chip ring oscillators can be used as speed monitors to reliably predict performance. However, any machine-learning model is likely to be inaccurate if trained on an inadequate dataset, and labeling data for training is a costly process. In this paper, we present a methodology based on active learning to select the best samples to include in the training set, significantly reducing the time and cost required. Moreover, since different speed measurements are available, we designed a multi-label technique to take advantage of their correlations. Experimental results demonstrate that the approach halves the training-set size with respect to random labeling, while increasing predictive accuracy with respect to standard single-label machine-learning models.
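The abstract does not say which selection criterion the authors use, so the sketch below substitutes a simple stand-in: k-center greedy selection, a diversity-based heuristic for choosing which unlabeled pool samples to send for costly labeling. The function name and the synthetic pool are hypothetical, and the multi-label part of the method is not covered.

```python
import math
import random


def k_center_greedy(pool, budget, start=0):
    """Return `budget` indices into `pool`, each new pick being the point
    farthest from all points selected so far (assumes distinct points)."""
    if not 0 < budget <= len(pool):
        raise ValueError("budget must be between 1 and len(pool)")
    selected = [start]
    # Distance from every pool point to its nearest selected point.
    min_dist = [math.dist(p, pool[start]) for p in pool]
    while len(selected) < budget:
        nxt = max(range(len(pool)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(pool):
            min_dist[i] = min(min_dist[i], math.dist(p, pool[nxt]))
    return selected


# Hypothetical pool: each device is described by 4 RO-derived features.
rng = random.Random(1)
pool = [tuple(rng.gauss(0.0, 1.0) for _ in range(4)) for _ in range(100)]

to_label = k_center_greedy(pool, budget=10)
print("devices to send for Fmax measurement:", to_label)
```

Only the selected devices would then go through the expensive speed measurement; the remaining pool stays unlabeled until a later round asks for more.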