    Energy-efficient adaptive machine learning on IoT end-nodes with class-dependent confidence

    Energy-efficient machine learning models that can run directly on edge devices are of great interest in IoT applications, as they can reduce network pressure and response latency, and improve privacy. An effective way to obtain energy efficiency with small accuracy drops is to sequentially execute a set of increasingly complex models, early-stopping the procedure for 'easy' inputs that can be confidently classified by the smallest models. As a stopping criterion, current methods employ a single threshold on the output probabilities produced by each model. In this work, we show that such a criterion is sub-optimal for datasets that include classes of different complexity, and we propose a more general approach based on per-class thresholds. With experiments on a low-power end-node, we show that our method can significantly reduce energy consumption compared to the single-threshold approach.
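
    As a hedged illustration of the per-class threshold idea (a minimal sketch, not the paper's exact algorithm), the following Python snippet cascades two hypothetical pre-trained classifiers, small_model and big_model, each assumed to return a softmax probability vector, and early-stops based on the threshold of the predicted class:

    import numpy as np

    def cascade_predict(x, small_model, big_model, per_class_thresholds):
        """Run the small model first; invoke the big model only when the
        small model's confidence is below its predicted class's threshold."""
        probs = small_model(x)                # probability vector over classes
        pred = int(np.argmax(probs))
        # Per-class bar: 'easy' classes can stop at lower confidence than hard ones.
        if probs[pred] >= per_class_thresholds[pred]:
            return pred                       # early stop: big model never runs
        return int(np.argmax(big_model(x)))   # fall back to the complex model

    Tuning one threshold per class (e.g. per_class_thresholds = np.array([0.6, 0.9]) for an easy and a hard class) is what distinguishes this scheme from the single-threshold baseline.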

    Energy-efficient and Privacy-aware Social Distance Monitoring with Low-resolution Infrared Sensors and Adaptive Inference

    Low-resolution infrared (IR) sensors combined with machine learning (ML) can be leveraged to implement privacy-preserving social distance monitoring solutions in indoor spaces. However, the need to execute these applications on Internet of Things (IoT) edge nodes makes energy consumption critical. In this work, we propose an energy-efficient adaptive inference solution consisting of the cascade of a simple wake-up trigger and an 8-bit quantized Convolutional Neural Network (CNN), which is only invoked for difficult-to-classify frames. Deploying such an adaptive system on an IoT microcontroller, we show that, when processing the output of an 8×8 low-resolution IR sensor, we are able to reduce energy consumption by 37-57% with respect to a static CNN-based approach, with an accuracy drop of less than 2% (83% balanced accuracy).
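
    A minimal Python sketch of the trigger-plus-CNN cascade, with hypothetical names: frame stands for an 8×8 temperature array, cnn_int8 for the 8-bit quantized CNN, and the trigger rule and threshold below are illustrative assumptions, not the paper's exact design:

    import numpy as np

    def classify_frame(frame, cnn_int8, warm_threshold=20.0):
        """Cheap wake-up trigger: decide 'easy' frames directly and invoke
        the quantized CNN only for difficult-to-classify ones."""
        hot_pixels = int(np.sum(frame > warm_threshold))
        if hot_pixels == 0:
            return 0                  # clearly empty scene: CNN stays off
        return int(cnn_int8(frame))   # difficult frame: run the 8-bit CNN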

    Input-dependent edge-cloud mapping of recurrent neural networks inference

    Given the computational complexity of Recurrent Neural Network (RNN) inference, IoT and mobile devices typically offload this task to the cloud. However, the execution time and energy consumption of RNN inference strongly depend on the length of the processed input. Therefore, considering also communication costs, it may be more convenient to process short input sequences locally and only offload long ones to the cloud. In this paper, we propose a low-overhead runtime tool that performs this choice automatically. Results based on real edge and cloud devices show that our method is able to simultaneously reduce the total execution time and energy consumption of the system compared to solutions that run RNN inference fully locally or fully in the cloud.
    Jahier Pagliari, D.; Chiaro, R.; Chen, Y.; Vinco, S.; Macii, E.; Poncino, M.
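
    A minimal sketch of the length-based edge-cloud choice, with all names illustrative: run_local executes the RNN on the device, offload_to_cloud sends the sequence to a remote endpoint, and the threshold would come from an offline calibration of measured time and energy costs rather than the fixed value used here:

    def map_rnn_inference(sequence, run_local, offload_to_cloud, length_threshold=64):
        """Short sequences are cheaper to process locally; long ones amortize
        the communication cost of offloading to the faster cloud backend."""
        if len(sequence) <= length_threshold:
            return run_local(sequence)        # edge: avoids network transfer
        return offload_to_cloud(sequence)     # cloud: compute cost dominates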

    Fast and Accurate Inference on Microcontrollers with Boosted Cooperative Convolutional Neural Networks (BC-Net)

    Arithmetic precision scaling is mandatory to deploy Convolutional Neural Networks (CNNs) on resource-constrained devices such as microcontrollers (MCUs), and quantization via fixed-point representations or binarization are the most widely adopted techniques today. Despite stemming from the same concept of bit-width lowering, these two strategies differ substantially from each other, and hence are often conceived and implemented separately. However, their joint integration is feasible and, if properly implemented, can bring large savings and high processing efficiency. This work elaborates on this aspect by introducing a boosted collaborative mechanism that pushes CNNs towards higher performance and greater predictive capability. Referred to as BC-Net, the proposed solution consists of a self-adaptive conditional scheme where a lightweight binary net and an 8-bit quantized net are trained to cooperate dynamically. Experiments conducted on four different CNN benchmarks deployed on off-the-shelf boards powered by ARM Cortex-M family MCUs show that BC-Nets outperform classical quantization and binarization applied as separate techniques (up to 81.49% speed-up and up to 3.8% accuracy improvement). A comparative analysis with a previously proposed cooperative method also demonstrates that BC-Nets achieve substantial savings in terms of both performance (+19%) and accuracy (+3.45%).
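
    The conditional binary/8-bit cooperation can be pictured with the following hedged Python sketch, where binary_net and int8_net are hypothetical stand-ins for the two trained networks and the confidence-based switch is a simplification of BC-Net's self-adaptive scheme, in which the two nets are trained jointly:

    import numpy as np

    def bc_net_predict(x, binary_net, int8_net, confidence_threshold=0.8):
        """Try the cheap binarized network first; escalate to the 8-bit
        quantized network only when the binary prediction lacks confidence."""
        probs = binary_net(x)                  # 1-bit arithmetic: fastest path
        if float(np.max(probs)) >= confidence_threshold:
            return int(np.argmax(probs))       # confident: stop at the binary net
        return int(np.argmax(int8_net(x)))     # boost with the 8-bit network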

    Dynamic bit-width reconfiguration for energy-efficient deep learning hardware

    Deep learning models have reached state-of-the-art performance in many machine learning tasks. Benefits in terms of energy, bandwidth, latency, etc., can be obtained by evaluating these models directly within Internet of Things end nodes, rather than in the cloud. This calls for implementations of deep learning tasks that can run in resource-limited environments with low energy footprints. Research and industry have recently investigated these aspects, coming up with specialized hardware accelerators for low-power deep learning. One effective technique adopted in these devices consists in reducing the bit-width of calculations, exploiting the error resilience of deep learning. However, bit-widths are typically set statically for a given model, regardless of input data. Unless models are retrained, this solution invariably sacrifices accuracy for energy efficiency. In this paper, we propose a new approach for implementing input-dependent dynamic bit-width reconfiguration in deep learning accelerators. Our method is based on a fully automatic characterization phase, and can be applied to popular models without retraining. Using energy data from a real deep learning accelerator chip, we show that a 50% energy reduction can be achieved with respect to a static bit-width selection, with less than 1% accuracy loss.
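
    A hedged sketch of input-dependent bit-width selection, assuming a hypothetical accelerator API run(x, bit_width=...) and a cheap difficulty score produced by the offline characterization phase; the score function and cut-points below are illustrative, not the paper's:

    def dynamic_bitwidth_infer(x, accelerator, estimate_difficulty):
        """Pick the arithmetic precision per input: easy inputs tolerate an
        aggressive bit-width reduction, hard ones get full precision."""
        difficulty = estimate_difficulty(x)    # cheap score from characterization
        if difficulty < 0.3:
            bits = 4                           # easy input: lowest-energy mode
        elif difficulty < 0.7:
            bits = 8
        else:
            bits = 16                          # hard input: full precision
        return accelerator.run(x, bit_width=bits)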