8 research outputs found
Machine Learning for Microcontroller-Class Hardware -- A Review
The advancements in machine learning opened a new opportunity to bring
intelligence to the low-end Internet-of-Things nodes such as microcontrollers.
Conventional machine learning models have a high memory and compute footprint,
hindering their direct deployment on ultra resource-constrained
microcontrollers. This paper highlights the unique requirements of enabling
onboard machine learning for microcontroller class devices. Researchers use a
specialized model development workflow for resource-limited applications to
ensure the compute and latency budget is within the device limits while still
maintaining the desired performance. We characterize a closed-loop, widely
applicable workflow of machine learning model development for microcontroller
class devices and show that several classes of applications adopt a specific
instance of it. We present both qualitative and numerical insights into
different stages of model development by showcasing several use cases. Finally,
we identify the open research challenges and unsolved questions demanding
careful consideration moving forward. Comment: Accepted for publication at IEEE Sensors Journal
Recommended from our members
Robust and Efficient Neural Inertial Localization and Complex Activity Recognition
Inertial complex activity recognition and neural inertial navigation are challenging due to missing samples, misaligned data timestamps across sensor channels, variations in sampling rates, and high model deployment costs. In this thesis, we introduce a robust training pipeline for complex activity detection that handles sampling rate variability, missing data, and misaligned data timestamps using intelligent data augmentation techniques. Specifically, we use controlled jitter in window length and add artificial misalignments in data timestamps between sensors, along with masking representations of missing data. In addition, we exploit end-to-end sequential learning, alpha-beta filters, Madgwick filters, hardware- and quantization-aware Bayesian neural architecture search, and a temporal convolutional neural network backbone to form the basis of scalable, real-time, sub-meter, GPS-free inertial localization on a wide spectrum of resource-constrained target hardware. We also provide a compact, ultra-low-power, environmentally resilient, and modular sensor tag configuration that pushes the state of the art in inertial odometry hardware. On average, the network found via our efficient pipeline provided 3x peak activation and 6x memory savings over state-of-the-art neural inertial algorithms, while taking at most 24 hours to train and search Pareto-optimal models in the backbone search space. Moreover, we evaluate the complex activity pipeline on a state-of-the-art complex activity recognition dataset, achieving test accuracies of 88% and 72% for coarse and granular activity classification, respectively, and ranking 3rd out of 78 submissions in the 2020 Cooking Activity Recognition Challenge.
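The three augmentations named in this abstract (window-length jitter, artificial timestamp misalignment between sensors, and masking of missing data) can be illustrated with a minimal sketch. This is not the thesis pipeline: the function name `augment_window`, all of its parameters, and the choice to zero-fill dropped samples are assumptions made for illustration, and each channel is assumed to be longer than `base_len + len_jitter`.

```python
import random


def augment_window(channels, start, base_len, len_jitter, max_shift, drop_prob, rng):
    """Cut one training window from multi-channel data with three augmentations.

    channels: list of equal-length sample lists, one per sensor channel.
    Returns (values, mask); mask holds 1 for a kept sample, 0 for a dropped one.
    All names and defaults here are illustrative assumptions.
    """
    # 1) Jitter the window length around its nominal value.
    length = base_len + rng.randint(-len_jitter, len_jitter)
    values, mask = [], []
    for ch in channels:
        # 2) Shift each channel independently to mimic misaligned timestamps.
        shift = rng.randint(-max_shift, max_shift)
        lo = min(max(start + shift, 0), len(ch) - length)
        window = ch[lo:lo + length]
        # 3) Drop samples at random and record the drops in a mask,
        #    so a model can tell a missing sample from a true zero.
        keep = [0 if rng.random() < drop_prob else 1 for _ in window]
        values.append([v if k else 0.0 for v, k in zip(window, keep)])
        mask.append(keep)
    return values, mask
```

Passing a seeded `random.Random` instance as `rng` keeps the augmentation reproducible across training runs.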
Physics-Aware Tiny Machine Learning
Tiny machine learning has enabled Internet of Things platforms to make intelligent inferences for time-critical and remote applications from unstructured data. However, realizing edge artificial intelligence systems that can perform long-term high-level reasoning and obey the underlying system physics, rules, and constraints within the tight platform resource budget is challenging. This dissertation explores how rich, robust, and intelligent inferences can be made on extremely resource-constrained platforms in a platform-aware and automated fashion. Firstly, we introduce a robust training pipeline that handles sampling rate variability, missing data, and misaligned data timestamps through intelligent data augmentation techniques during training time. We use a controlled jitter in window length and add artificial misalignments in data timestamps between sensors, along with masking representations of missing data. Secondly, we introduce TinyNS, a platform-aware neurosymbolic architecture search framework for the automatic co-optimization and deployment of neural operators and physics-based process models. TinyNS exploits fast, gradient-free, and black-box Bayesian optimization to automatically construct the most performant learning-enabled, physics, and context-aware edge artificial intelligence program from a search space containing neural and symbolic operators within the platform resource constraints. To guarantee deployability, TinyNS receives hardware metrics directly from the target hardware during the optimization process. Thirdly, we introduce the concept of neurosymbolic tiny machine learning, where we showcase recipes for defining the physics-aware tiny machine learning program synthesis search space from five neurosymbolic program categories. Neurosymbolic artificial intelligence combines the context awareness and integrity of symbolic techniques with the robustness and performance of machine learning models. 
We develop parsers to automatically write microcontroller code for neurosymbolic programs and showcase several previously unseen TinyML applications. These include onboard physics-aware neural-inertial navigation, on-device human activity recognition, on-chip fall detection, neural-Kalman filtering, and co-optimization of neural and symbolic processes. Finally, we showcase techniques to personalize and adapt tiny machine learning systems to the target domain and application. We illustrate the use of transfer learning, resource-efficient unsupervised template creation and matching, and foundation models as pathways to realize generalizable, domain-aware, and data-efficient edge artificial intelligence systems.
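The idea of searching for the best program that still fits the platform, using resource metrics reported for each candidate, can be sketched as follows. This sketch deliberately swaps in plain random search for the Bayesian optimization described in the abstract, and it is not the TinyNS API: `constrained_random_search`, `sample_config`, `evaluate`, and `measure_hw` are hypothetical names, with `measure_hw` standing in for a measurement taken on the target hardware.

```python
import random


def constrained_random_search(sample_config, evaluate, measure_hw,
                              ram_limit, flash_limit, trials, rng):
    """Return the best (score, config) that fits the platform, or None.

    sample_config(rng) -> candidate configuration (hypothetical callback)
    evaluate(config)   -> task score, higher is better (hypothetical callback)
    measure_hw(config) -> (ram, flash) usage, assumed to come from the device
    """
    best = None
    for _ in range(trials):
        config = sample_config(rng)
        ram, flash = measure_hw(config)
        # Reject candidates that exceed the resource budget before scoring them.
        if ram > ram_limit or flash > flash_limit:
            continue
        score = evaluate(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best
```

Checking the resource budget before scoring means no training or evaluation effort is spent on a candidate that could never be deployed.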
TinyOdom: Hardware-Aware Efficient Neural Inertial Navigation.
Deep inertial sequence learning has shown promising odometric resolution over model-based approaches for trajectory estimation in GPS-denied environments. However, existing neural inertial dead-reckoning frameworks are not suitable for real-time deployment on ultra-resource-constrained (URC) devices due to substantial memory, power, and compute bounds. Current deep inertial odometry techniques also suffer from gravity pollution, high-frequency inertial disturbances, varying sensor orientation, heading rate singularity, and failure in altitude estimation. In this paper, we introduce TinyOdom, a framework for training and deploying neural inertial models on URC hardware. TinyOdom exploits hardware- and quantization-aware Bayesian neural architecture search (NAS) and a temporal convolutional network (TCN) backbone to train lightweight models targeted at URC devices. In addition, we propose a magnetometer, physics, and velocity-centric sequence learning formulation robust to preceding inertial perturbations. We also expand 2D sequence learning to 3D using a model-free barometric g-h filter robust to inertial and environmental variations. We evaluate TinyOdom for a wide spectrum of inertial odometry applications and target hardware against competing methods. Specifically, we consider four applications: pedestrian, animal, aerial, and underwater vehicle dead-reckoning. Across different applications, TinyOdom reduces the size of neural inertial models by 31× to 134× with 2.5 m to 12 m error in 60 seconds, enabling the direct deployment of models on URC devices while still maintaining or exceeding the localization resolution over the state-of-the-art. The proposed barometric filter tracks altitude within ±0.1 m and is robust to inertial disturbances and ambient dynamics.
Finally, our ablation study shows that the introduced magnetometer, physics, and velocity-centric sequence learning formulation significantly improves localization performance even with notably lightweight models.
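A g-h filter of the kind mentioned for altitude tracking can be sketched in a few lines. This is the textbook fixed-gain form, not TinyOdom's implementation: the function name `gh_filter`, the gains `g` and `h`, and the initialisation from the first measurement are assumptions chosen for illustration.

```python
def gh_filter(measurements, dt, g, h, x0=None, v0=0.0):
    """Track a scalar (for example, altitude) and its rate with fixed gains.

    measurements: scalar readings taken every dt seconds.
    g, h: fixed gains on the position and rate corrections (assumed values).
    Returns the list of filtered position estimates.
    """
    x = measurements[0] if x0 is None else x0
    v = v0
    estimates = []
    for z in measurements:
        # Predict with a constant-rate model.
        x_pred = x + v * dt
        # Correct position and rate using the measurement residual.
        residual = z - x_pred
        x = x_pred + g * residual
        v = v + h * residual / dt
        estimates.append(x)
    return estimates
```

Larger gains follow the measurements more closely, while smaller gains smooth more heavily and respond more slowly to real changes in altitude.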
Auritus
Smart ear-worn devices (called earables) are being equipped with various onboard sensors and algorithms, transforming earphones from simple audio transducers to multi-modal interfaces making rich inferences about human motion and vital signals. However, developing sensory applications using earables is currently quite cumbersome, with several barriers in the way. First, time-series data from earable sensors incorporate information about physical phenomena in complex settings, requiring machine-learning (ML) models learned from large-scale labeled data. This is challenging in the context of earables because large-scale open-source datasets are missing. Second, the small size and compute constraints of earable devices make on-device integration of many existing algorithms for tasks such as human activity and head-pose estimation difficult. To address these challenges, we introduce Auritus, an extendable and open-source optimization toolkit designed to enhance and replicate earable applications. Auritus serves two primary functions. First, Auritus handles data collection, pre-processing, and labeling tasks for creating customized earable datasets using graphical tools. The system includes an open-source dataset with 2.43 million inertial samples related to head and full-body movements, consisting of 34 head poses and 9 activities from 45 volunteers. Second, Auritus provides a tightly integrated hardware-in-the-loop (HIL) optimizer and TinyML interface to develop lightweight and real-time ML models for activity detection and filters for head-pose tracking. To validate the utility of Auritus, we showcase three sample applications, namely fall detection, spatial audio rendering, and augmented reality (AR) interfacing. Auritus recognizes activities with 91% leave-one-out test accuracy (98% test accuracy) using real-time models as small as 6-13 kB. Our models are 98-740× smaller and 3-6% more accurate than the state of the art.
We also estimate head pose with absolute errors as low as 5 degrees using 20 kB filters, achieving up to 1.6× precision improvement over existing techniques. We make the entire system open-source so that researchers and developers can contribute to any layer of the system or rapidly prototype their applications using our dataset and algorithms.