
    Compact recurrent neural networks for acoustic event detection on low-energy low-complexity platforms

    Outdoor acoustic event detection is an exciting research field, but it is challenged by the need for complex algorithms and deep learning techniques that typically require substantial computational, memory, and energy resources. This challenge discourages IoT implementations, where resources must be used efficiently. However, current embedded technologies and microcontrollers have increased their capabilities without penalizing energy efficiency. This paper addresses sound event detection at the edge by optimizing deep learning techniques on resource-constrained embedded platforms for the IoT. The contribution is two-fold: firstly, a two-stage student-teacher approach is presented to make state-of-the-art neural networks for sound event detection fit on current microcontrollers; secondly, we test our approach on an ARM Cortex-M4, focusing in particular on issues related to 8-bit quantization. Our embedded implementation achieves 68% recognition accuracy on UrbanSound8K, not far from state-of-the-art performance, with an inference time of 125 ms for each second of the audio stream and a power consumption of 5.5 mW, in just 34.3 kB of RAM.
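
    The sketch below illustrates, in a generic way, the two ingredients named in this abstract: a student-teacher (knowledge-distillation) loss that lets a compact student mimic a larger teacher network, and symmetric post-training 8-bit weight quantization. The temperature, loss weight, and function names are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of student-teacher distillation plus 8-bit weight quantization.
# Hyperparameters (T, alpha) and helper names are illustrative, not the paper's values.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend soft teacher targets (softened by temperature T) with the hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def quantize_weights_int8(weight: torch.Tensor):
    """Symmetric per-tensor 8-bit quantization: returns int8 weights and a scale."""
    scale = (weight.abs().max() / 127.0).clamp(min=1e-8)
    q = torch.clamp(torch.round(weight / scale), -128, 127).to(torch.int8)
    return q, scale  # dequantize with q.float() * scale
```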

    Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices

    Sound event detection (SED) is a hot topic in consumer and smart-city applications. Existing approaches based on deep neural networks are very effective, but highly demanding in terms of memory, power, and throughput when targeting ultra-low-power always-on devices. Latency, availability, cost, and privacy requirements are pushing recent IoT systems to process the data on the node, close to the sensor, with a very limited energy supply and tight constraints on memory size and processing capabilities that preclude running state-of-the-art DNNs. In this paper, we explore the combination of extreme quantization to a small-footprint binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller. Starting from an existing CNN for SED whose footprint (815 kB) exceeds the 512 kB of memory available on our platform, we retrain the network using binary filters and activations to match these memory constraints. (Fully) binary neural networks come with a natural accuracy drop of 12-18% on the challenging ImageNet object recognition challenge compared to their equivalent full-precision baselines. Our BNN reaches 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) memory in total. With our BNN implementation, we reach a peak throughput of 4.6 GMAC/s and 1.5 GMAC/s over the full network, including preprocessing with Mel bins, which corresponds to an efficiency of 67.1 GMAC/s/W and 31.3 GMAC/s/W, respectively. Compared to an ARM Cortex-M4 implementation, our system has a 10.3 times faster execution time and a 51.1 times higher energy efficiency.
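
    As a rough illustration of how binary filters and activations are typically obtained (the paper's GAP8 kernels are not reproduced here), the sketch below binarizes weights and inputs with a sign function in the forward pass and uses the straight-through estimator in the backward pass, so the latent full-precision weights remain trainable. Class names and layer shapes are hypothetical.

```python
# Minimal binary-neural-network building block: sign() binarization with a
# straight-through estimator (STE). Illustrative only, not the paper's implementation.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # +/-1 (sign of the latent full-precision value)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # STE: pass the gradient where |x| <= 1, block it elsewhere.
        return grad_output * (x.abs() <= 1).float()

class BinaryConv2d(torch.nn.Conv2d):
    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)  # binary filters
        x_bin = BinarizeSTE.apply(x)            # binary activations
        return torch.nn.functional.conv2d(
            x_bin, w_bin, self.bias, self.stride, self.padding, self.dilation, self.groups
        )
```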

    Near Sensor Artificial Intelligence on IoT Devices for Smart Cities

    The IoT is in a continuous evolution thanks to new technologies that open the doors to various applications. While the structure of the IoT network remains the same over the years, specifically composed of a server, gateways, and nodes, their tasks change according to new challenges: the use of multimedia information and the large amount of data created by millions of devices forces the system to move from the cloud-centric approach to the thing-centric approach, where nodes partially process the information. Computing at the sensor node level solves well-known problems like scalability and privacy concerns. However, this study’s primary focus is on the impact that bringing the computation at the edge has on energy: continuous transmission of multimedia data drains the battery, and processing information on the node reduces the amount of data transferred to event-based alerts. Nevertheless, most of the foundational services for IoT applications are provided by AI. Due to this class of algorithms’ complexity, they are always delegated to GPUs or devices with an energy budget that is orders of magnitude more than an IoT node, which should be energy-neutral and powered only by energy harvesters. Enabling AI on IoT nodes is a challenging task. From the software side, this work explores the most recent compression techniques for NN, enabling the reduction of state-of-the-art networks to make them fit in microcontroller systems. From the hardware side, this thesis focuses on hardware selection. It compares the AI algorithms’ efficiency running on both well-established microcontrollers and state-of-the-art processors. An additional contribution towards energy-efficient AI is the exploration of hardware for acquisition and pre-processing of sound data, analyzing the data’s quality for further classification. Moreover, the combination of software and hardware co-design is the key point of this thesis to bring AI to the very edge of the IoT network
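
    As a toy illustration of the "make the network fit in a microcontroller" constraint mentioned above (not code from the thesis), the helper below performs a rough flash/RAM budget check before attempting deployment; all sizes and budgets are hypothetical examples.

```python
# Illustrative pre-deployment budget check for a microcontroller target.
def fits_on_mcu(num_params, largest_activation_elems,
                bytes_per_weight=1,            # e.g. 1 byte after 8-bit quantization
                bytes_per_activation=1,
                flash_budget=512 * 1024,       # hypothetical 512 kB flash for weights
                ram_budget=64 * 1024):         # hypothetical 64 kB RAM for activations
    flash_needed = num_params * bytes_per_weight
    ram_needed = largest_activation_elems * bytes_per_activation
    return flash_needed <= flash_budget and ram_needed <= ram_budget

# Example: a 400k-parameter 8-bit model with a 30k-element peak activation tensor.
print(fits_on_mcu(400_000, 30_000))   # True for the budgets above
```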

    Decentralized Federated Learning for Epileptic Seizures Detection in Low-Power Wearable Systems

    In healthcare, patient data privacy regulations prohibit data from being moved outside the hospital, preventing international medical datasets from being centralized for AI training. Federated learning (FL) is a data-privacy-focused method that trains a global model by aggregating local models from hospitals. Existing FL techniques adopt a central-server-based network topology, where the server assembles the local models trained in each hospital to create a global model. However, the server can be a single point of failure, and models trained with FL usually perform worse than those trained in a centralized manner when the patients' data are not independent and identically distributed (non-IID) across hospitals. This paper presents a decentralized FL framework, including training with adaptive ensemble learning and a deployment phase using knowledge distillation. The adaptive ensemble learning step in the training phase produces a specific model for each hospital that is the optimal combination of the local model and the models from the other available hospitals; this step addresses the non-IID challenge in each hospital. The deployment phase adjusts the model's complexity to meet the resource constraints of wearable systems. We evaluated the performance of our approach on edge computing platforms using the EPILEPSIAE and TUSZ databases, which are public epilepsy datasets.
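
    A minimal sketch of the decentralized aggregation idea, assuming each hospital simply forms a weighted combination of its own model and the models received from the other reachable hospitals; the paper learns these weights adaptively, whereas here they are fixed and purely illustrative.

```python
# Decentralized, serverless model combination: each hospital mixes its local model
# with neighbor models using per-model weights. Weights here are fixed placeholders.
import torch

def combine_models(local_state, neighbor_states, weights):
    """Weighted combination of model parameters; weights sum to 1 over all models."""
    states = [local_state] + list(neighbor_states)
    combined = {}
    for name in local_state:
        combined[name] = sum(w * s[name] for w, s in zip(weights, states))
    return combined

# Usage sketch: a hospital with two reachable neighbors, favouring its own model.
# combined = combine_models(model.state_dict(),
#                           [nbr1.state_dict(), nbr2.state_dict()],
#                           weights=[0.5, 0.25, 0.25])
```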

    Towards Green Metaverse Networking: Technologies, Advancements and Future Directions

    As the Metaverse is iteratively being defined, its potential to unleash the next wave of digital disruption and create real-life value becomes increasingly clear. With distinctive features of immersive experience, simultaneous interactivity, and user agency, the Metaverse has the capability to transform all walks of life. However, the enabling technologies of the Metaverse, i.e., digital twins, artificial intelligence, blockchain, and extended reality, are known to be energy-hungry, raising concerns about the sustainability of its large-scale deployment and development. This article proposes Green Metaverse Networking for the first time, optimizing the energy efficiency of all network components for sustainable Metaverse development. We first analyze the energy consumption, efficiency, and sustainability of energy-intensive technologies in the Metaverse. Next, focusing on computation and networking, we present major advancements related to energy efficiency and their integration into the Metaverse. A case study of energy conservation by incorporating semantic communication and stochastic resource allocation in the Metaverse is presented. Finally, we outline the critical challenges of sustainable Metaverse development, thereby indicating potential directions for future research towards the green Metaverse.

    Machine Learning for Microcontroller-Class Hardware -- A Review

    Advancements in machine learning have opened a new opportunity to bring intelligence to low-end Internet-of-Things nodes such as microcontrollers. Conventional machine learning deployments have a high memory and compute footprint, hindering their direct deployment on ultra-resource-constrained microcontrollers. This paper highlights the unique requirements of enabling onboard machine learning for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited applications to ensure that the compute and latency budget stays within the device limits while still maintaining the desired performance. We characterize a closed-loop, widely applicable workflow of machine learning model development for microcontroller-class devices and show that several classes of applications adopt a specific instance of it. We present both qualitative and numerical insights into different stages of model development by showcasing several use cases. Finally, we identify the open research challenges and unsolved questions demanding careful consideration moving forward.
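
    The closed-loop workflow described above can be rendered schematically as below (our own sketch, not the paper's exact procedure): train a candidate model, compress it, measure its footprint against the device budget, and shrink the architecture until the constraints are met. All callables are user-supplied placeholders.

```python
# Schematic closed-loop model-development workflow for a microcontroller target.
# build_model/train/compress/measure_footprint are placeholders for user-supplied steps.
def develop_for_mcu(build_model, train, compress, measure_footprint,
                    flash_budget, ram_budget, width=1.0, min_width=0.1):
    while width >= min_width:
        model = build_model(width)             # candidate architecture (width multiplier)
        model = train(model)                   # task training
        model = compress(model)                # e.g. pruning + 8-bit quantization
        flash, ram = measure_footprint(model)  # estimated or measured on-device cost
        if flash <= flash_budget and ram <= ram_budget:
            return model                       # fits: deploy this candidate
        width *= 0.75                          # shrink the architecture and iterate
    raise RuntimeError("No candidate fits the device budget")
```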

    Efficient Neural Networks for Tiny Machine Learning: A Comprehensive Review

    The field of Tiny Machine Learning (TinyML) has gained significant attention due to its potential to enable intelligent applications on resource-constrained devices. This review provides an in-depth analysis of the advancements in efficient neural networks and the deployment of deep learning models on ultra-low-power microcontrollers (MCUs) for TinyML applications. It begins by introducing neural networks and discussing their architectures and resource requirements. It then explores MEMS-based applications on ultra-low-power MCUs, highlighting their potential for enabling TinyML on resource-constrained devices. The core of the review centres on efficient neural networks for TinyML. It covers techniques such as model compression, quantization, and low-rank factorization, which optimize neural network architectures for minimal resource utilization on MCUs. The paper then delves into the deployment of deep learning models on ultra-low-power MCUs, addressing challenges such as limited computational capabilities and memory resources. Techniques like model pruning, hardware acceleration, and algorithm-architecture co-design are discussed as strategies to enable efficient deployment. Lastly, the review provides an overview of current limitations in the field, including the trade-off between model complexity and resource constraints. Overall, this review presents a comprehensive analysis of efficient neural networks and deployment strategies for TinyML on ultra-low-power MCUs, and identifies future research directions for unlocking the full potential of TinyML applications on resource-constrained devices.
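
    One of the techniques named above, low-rank factorization, can be sketched in a few lines with NumPy: a dense layer's weight matrix is approximated by a truncated SVD, trading a small reconstruction error for a large parameter reduction. The matrix sizes and rank chosen here are hypothetical examples.

```python
# Low-rank factorization of a dense layer via truncated SVD (illustrative sketch).
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W (out x in) as A @ B with A: out x rank, B: rank x in."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the first factor
    B = Vt[:rank, :]
    return A, B

W = np.random.randn(256, 512).astype(np.float32)
A, B = low_rank_factorize(W, rank=32)
# Parameters drop from 256*512 = 131072 to 256*32 + 32*512 = 24576.
print(np.linalg.norm(W - A @ B) / np.linalg.norm(W))  # relative approximation error
```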

    OWSNet: Towards Real-time Offensive Words Spotting Network for Consumer IoT Devices

    Every modern household owns at least a dozen IoT devices such as smart speakers, video doorbells, and smartwatches, most of which are equipped with a keyword spotting (KWS) based digital voice assistant like Alexa. State-of-the-art KWS systems require a large number of operations and substantial computation and memory resources to achieve top performance. In this paper, in contrast to existing resource-demanding KWS systems, we propose a lightweight, temporal-convolution-based KWS system named OWSNet that can comfortably execute on a variety of IoT devices around us and can accurately spot multiple keywords in real time without disturbing the device's routine functionalities. When OWSNet is deployed on consumer IoT devices placed in the workplace, home, etc., in addition to spotting wake/trigger words like 'Hey Siri' and 'Alexa', it can also accurately spot offensive words in real time. If regular wake words are spotted, it activates the voice assistant; if offensive words are spotted, it starts capturing and streaming audio data to speech-analytics APIs for autonomous detection of threats and insecurities in the scene. The evaluation results show that OWSNet is faster than state-of-the-art models, producing roughly 1-74 times faster inference on a Raspberry Pi 4 and roughly 1-12 times faster inference on an NVIDIA Jetson Nano. To optimize IoT use-case models like OWSNet, we also present a generic multi-component ML model optimization sequence that can reduce the memory and computation demands of a wide range of ML models, thus enabling their execution on low-resource, low-cost, low-power IoT devices.
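
    As a hedged sketch of the kind of temporal-convolution building block a KWS network such as OWSNet may rely on (the actual OWSNet architecture is not reproduced here), the snippet below applies a dilated 1D convolution over feature frames with a residual connection; channel counts, kernel size, and dilation are illustrative assumptions.

```python
# Illustrative dilated temporal-convolution block over audio feature frames.
import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    def __init__(self, channels=64, kernel_size=9, dilation=1):
        super().__init__()
        pad = (kernel_size - 1) // 2 * dilation       # keep the time length unchanged
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=pad, dilation=dilation)
        self.bn = nn.BatchNorm1d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):                             # x: (batch, channels, frames)
        return self.relu(self.bn(self.conv(x)) + x)   # residual connection

# Usage: stack blocks with growing dilation, then pool over time and classify keywords.
x = torch.randn(1, 64, 101)                           # e.g. 101 Mel/MFCC feature frames
y = TemporalBlock(dilation=2)(x)
print(y.shape)                                        # torch.Size([1, 64, 101])
```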