
    Machine Learning Meets Internet of Things: From Theory to Practice

    Standalone execution of problem-solving Artificial Intelligence (AI) on IoT devices provides a higher level of autonomy and privacy, because the sensitive user data collected by the devices need not be transmitted to the cloud for inference. The chipsets used to design IoT devices are resource-constrained: they have a limited memory footprint, few computation cores, and low clock speeds. These limitations make it difficult to deploy and execute complex problem-solving AI (usually an ML model) on IoT devices. Since there is high potential for building intelligent IoT devices, in this tutorial we teach researchers and developers: (i) how to deeply compress CNNs and efficiently deploy them on resource-constrained devices; (ii) how to efficiently port ML classifiers that solve ranking, regression, and classification problems and execute them on IoT devices; (iii) how to create ML-based self-learning devices that can locally re-train themselves on the fly using unseen real-world data.
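
    As a concrete illustration of point (i), the sketch below shows one common compression-and-porting step: post-training int8 quantization of a Keras CNN with TensorFlow Lite, followed by export of the result as a C byte array for MCU firmware. The file names, input shape, and calibration data are placeholders, not artifacts from the tutorial itself.

```python
# Minimal sketch: post-training int8 quantization of a Keras CNN and export
# as a C byte array that can be compiled into MCU firmware (e.g. for
# TensorFlow Lite Micro). Paths and the representative dataset are placeholders.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("cnn_model.h5")   # hypothetical trained CNN

def representative_data():
    # Yield a few calibration samples matching the model's input shape.
    for _ in range(100):
        yield [np.random.rand(1, 32, 32, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

# Emit the model as a C array, the form usually flashed onto IoT devices.
with open("cnn_model_data.cc", "w") as f:
    f.write("const unsigned char g_cnn_model[] = {\n")
    f.write(", ".join(str(b) for b in tflite_model))
    f.write("\n};\nconst unsigned int g_cnn_model_len = %d;\n" % len(tflite_model))
```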

    Smart speaker design and implementation with biometric authentication and advanced voice interaction capability

    Advancements in semiconductor technology have reduced the dimensions and cost of chipsets while improving their performance and capacity. In addition, advances in AI frameworks and libraries make it possible to accommodate more AI at the resource-constrained edge, in consumer IoT devices. Sensors are nowadays an integral part of our environment and provide continuous data streams for building intelligent applications; a smart home with multiple interconnected devices is one example. In such smart environments, for convenience and quick access to web-based services and personal information such as calendars, notes, emails, reminders, and banking, users link third-party skills, or skills from the Amazon store, to their smart speakers. Several smart home products, such as smart security cameras, video doorbells, smart plugs, smart carbon monoxide monitors, and smart door locks, are also interlinked with a modern smart speaker through custom skill addition. Since these services and devices are linked via the smart speaker user's account, anyone with physical access to the smart speaker can use them through voice commands, compromising the user's data privacy, home security, and more. The recently launched Tensor Cam's AI Camera, Toshiba's Symbio, and Facebook's Portal are camera-enabled smart speakers with AI functionalities, yet they have no authentication scheme beyond calling out the wake word. This paper provides an overview of the cybersecurity risks smart speaker users face due to the lack of an authentication scheme and describes the development of a state-of-the-art camera-enabled, microphone-array-based modern Alexa smart speaker prototype that addresses these risks.
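
    The abstract does not spell out the prototype's authentication pipeline; the sketch below only illustrates the general idea of gating voice commands on a camera-based face match, using the open-source face_recognition library. The enrollment image, frame capture, and command handler are hypothetical placeholders.

```python
# Illustrative sketch only: gate voice commands on a face match against an
# enrolled user. The enrollment image, camera frame, and command handler are
# hypothetical placeholders; the paper's actual pipeline may differ.
import face_recognition

# Enroll the owner from a reference photo (one encoding per enrolled user).
owner_image = face_recognition.load_image_file("owner.jpg")
owner_encoding = face_recognition.face_encodings(owner_image)[0]

def is_authorized(frame_path):
    """Return True if the camera frame contains an enrolled user's face."""
    frame = face_recognition.load_image_file(frame_path)
    encodings = face_recognition.face_encodings(frame)
    return any(
        face_recognition.compare_faces([owner_encoding], enc, tolerance=0.5)[0]
        for enc in encodings
    )

def on_wake_word(frame_path, command):
    if is_authorized(frame_path):
        print("Authorized: executing", command)   # forward to the linked skill/service
    else:
        print("Unauthorized speaker: command ignored")
```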

    Imbal-OL: Online Machine Learning from Imbalanced Data Streams in Real-world IoT

    Typically, a Neural Network (NN) is trained in a data center using historic datasets; a C source file of the trained model (the model as a char array) is then generated and flashed onto IoT devices. This standard process limits the flexibility of billions of deployed ML-powered devices: they cannot learn unseen/fresh data patterns (static intelligence) and cannot adapt to dynamic scenarios. To address this issue, Online Machine Learning (OL) algorithms are deployed on IoT devices, giving them the ability to locally re-train themselves by continuously updating the last few NN layers using unseen data patterns encountered after deployment. In OL, catastrophic forgetting is common when NNs are trained on a non-stationary data distribution. The majority of recent work in the OL domain embraces the implicit assumption that the distribution of local training data is balanced, but sensor data streams in real-world IoT are severely imbalanced and temporally correlated. This paper introduces Imbal-OL, a resource-friendly technique that can be used as an OL plugin to balance the class sizes in a range of data streams. When the Imbal-OL-processed stream is used for OL, models adapt faster to changes in the stream while, in parallel, avoiding catastrophic forgetting. An experimental evaluation of Imbal-OL on CIFAR datasets with ResNet-18 demonstrates its ability to deal with imperfect data streams, as it produces high-quality models even under challenging learning settings.
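
    Imbal-OL's exact balancing rules are not given in the abstract; the sketch below only illustrates the general idea of a class-balanced buffer placed between an imbalanced stream and the online learner. The capacity and eviction policy are assumptions.

```python
# Sketch of a class-balanced buffer for an imbalanced, temporally correlated
# stream: the majority class is down-sampled by evicting from its largest
# bucket, so the online learner sees roughly equal class sizes.
# Capacity and eviction policy are assumptions, not Imbal-OL's exact rules.
import random
from collections import defaultdict

class BalancedStreamBuffer:
    def __init__(self, capacity=512):
        self.capacity = capacity
        self.buckets = defaultdict(list)          # class label -> samples

    def add(self, x, y):
        self.buckets[y].append(x)
        if sum(len(b) for b in self.buckets.values()) > self.capacity:
            # Evict a random sample from the currently largest class.
            largest = max(self.buckets, key=lambda c: len(self.buckets[c]))
            self.buckets[largest].pop(random.randrange(len(self.buckets[largest])))

    def minibatch(self, per_class=8):
        """Draw an (approximately) class-balanced batch for one OL update step."""
        batch = []
        for label, samples in self.buckets.items():
            picks = random.sample(samples, min(per_class, len(samples)))
            batch.extend((x, label) for x in picks)
        random.shuffle(batch)
        return batch
```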

    Enabling Machine Learning on the Edge using SRAM Conserving Efficient Neural Networks Execution Approach

    Edge analytics refers to the application of data analytics and Machine Learning (ML) algorithms on IoT devices. The concept is gaining popularity due to its ability to perform AI-based analytics at the device level, enabling autonomous decision-making without depending on the cloud. However, the majority of Internet of Things (IoT) devices are embedded systems whose brain is a low-cost microcontroller unit (MCU) or a small CPU, which is often incapable of handling complex ML algorithms. In this paper, we propose an approach for the efficient execution of already deeply compressed, large neural networks (NNs) on tiny IoT devices. After NNs are optimized using state-of-the-art deep model compression methods, executing the resulting models on MCUs or small CPUs using the model execution sequence produced by our approach conserves more SRAM. In an evaluation of nine popular models, comparing the default NN execution sequence with the sequence produced by our approach, we found that 1.61-38.06% less SRAM was used to produce inference results, inference time was reduced by 0.28-4.9 ms, and energy consumption was reduced by 4-84 mJ. Despite conserving this much SRAM, our method fully preserves model performance (accuracy, F1 score, etc.).
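
    The sketch below illustrates the underlying idea rather than the paper's exact algorithm: in a branching NN graph, different topologically valid execution orders keep different sets of intermediate tensors alive at once, so an order can be searched for that minimizes peak SRAM. The graph, tensor sizes, and exhaustive search are illustrative assumptions.

```python
# Toy search for the execution order of a branching graph that minimizes the
# peak of simultaneously live output buffers (a stand-in for SRAM usage).
from itertools import permutations

# op -> (output buffer size in bytes, list of producer ops it reads from)
graph = {
    "conv1": (8192,  []),
    "a1":    (16384, ["conv1"]),
    "a2":    (2048,  ["a1"]),
    "b1":    (16384, ["conv1"]),
    "b2":    (2048,  ["b1"]),
    "out":   (4096,  ["a2", "b2"]),
}

def peak_sram(order):
    """Peak live-buffer total for one execution order, or None if invalid."""
    peak, live = 0, {}
    for i, op in enumerate(order):
        size, inputs = graph[op]
        if any(dep not in order[:i] for dep in inputs):
            return None                              # dependency not yet computed
        live[op] = size
        peak = max(peak, sum(live.values()))
        done = set(order[:i + 1])
        for t in list(live):                         # free fully consumed buffers
            consumers = [o for o, (_, ins) in graph.items() if t in ins]
            if consumers and all(c in done for c in consumers):
                del live[t]
    return peak

valid = [o for o in permutations(graph) if peak_sram(o) is not None]
best = min(valid, key=peak_sram)
print("best order:", best, "-> peak SRAM:", peak_sram(best), "bytes")
```

    On this toy graph, finishing one branch before starting the other peaks at 26624 bytes, whereas interleaving the branches peaks at 40960 bytes, which is the kind of gap an SRAM-aware execution sequence exploits.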

    Demo Abstract: Porting and Execution of Anomalies Detection Models on Embedded Systems in IoT

    In the Industry 4.0 era, Microcontroller (MCU)-based tiny embedded sensor systems have become the sensing paradigm for interacting with the physical world. In 2020, 25.6 billion MCUs were shipped, and over 250 billion MCUs are already operating in the wild. Such low-power, low-cost MCUs are being used as the brain controlling diverse applications and will soon become the global digital nervous system. In an Industrial IoT setup, tiny MCU-based embedded systems equipped with anomaly detection models are mounted on production plant machines to monitor machine health/condition. These models process the machine's health data (from temperature, RPM, and vibration sensors) and raise timely alerts when they detect data patterns that deviate from the normal operating state. In this demo, we train One-Class Support Vector Machine (OCSVM) based anomaly detection models and port the trained models to their MCU-executable versions. We then deploy and execute the ported models on 4 popular MCUs and report their on-board inference performance along with their memory (Flash and SRAM) consumption. The procedure we show in the demo is generic, and viewers can use it to efficiently port a wide variety of classifiers trained on different datasets and execute them on resource-constrained MCU- and small-CPU-based devices.
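
    A minimal sketch of the general porting idea follows: train a scikit-learn One-Class SVM on normal sensor readings and emit its support vectors, dual coefficients, and intercept as C arrays so the RBF decision function can be re-implemented on an MCU. The training data and hyper-parameters are placeholders, not the demo's actual configuration.

```python
# Sketch: train a One-Class SVM on "normal" machine-health readings and emit
# its parameters as C arrays so the RBF decision function can be evaluated
# on an MCU. Sensor data and hyper-parameters are placeholders.
import numpy as np
from sklearn.svm import OneClassSVM

# Placeholder training data: rows of [temperature, rpm, vibration] logged
# during normal operation of the plant machine.
X_normal = np.random.rand(500, 3).astype(np.float32)

model = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_normal)

def to_c_array(name, arr):
    flat = ", ".join(f"{v:.6f}f" for v in np.ravel(arr))
    return f"const float {name}[] = {{ {flat} }};\n"

with open("ocsvm_params.h", "w") as f:
    f.write(to_c_array("sv", model.support_vectors_))     # support vectors
    f.write(to_c_array("dual_coef", model.dual_coef_))    # alpha coefficients
    f.write(to_c_array("intercept", model.intercept_))    # offset term
    f.write("const float gamma_rbf = 0.5f;\n")
    f.write(f"const int n_sv = {model.support_vectors_.shape[0]};\n")

# On the MCU, a reading x is flagged as anomalous when
#   sum_i dual_coef[i] * exp(-gamma_rbf * ||x - sv_i||^2) + intercept < 0.
```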

    OWSNet: Towards Real-time Offensive Words Spotting Network for Consumer IoT Devices

    Every modern household owns at least a dozen IoT devices such as smart speakers, video doorbells, and smartwatches, most of which are equipped with a keyword spotting (KWS) system-based digital voice assistant like Alexa. State-of-the-art KWS systems require a large number of operations and substantial computation and memory resources to achieve top performance. In this paper, in contrast to existing resource-demanding KWS systems, we propose OWSNet, a lightweight temporal-convolution-based KWS system that can comfortably execute on a variety of IoT devices around us and can accurately spot multiple keywords in real time without disturbing the device's routine functionality. When OWSNet is deployed on consumer IoT devices placed in the workplace, home, etc., in addition to spotting wake/trigger words like 'Hey Siri' and 'Alexa', it can also accurately spot offensive words in real time. If regular wake words are spotted, it activates the voice assistant; if offensive words are spotted, it starts to capture and stream audio data to speech analytics APIs for autonomous detection of threats and insecurities in the scene. Evaluation results show that OWSNet is faster than state-of-the-art models, producing ~1-74x faster inference on Raspberry Pi 4 and ~1-12x faster inference on NVIDIA Jetson Nano. To optimize IoT use-case models like OWSNet, we also present a generic multi-component ML model optimization sequence that can reduce the memory and computation demands of a wide range of ML models, thus enabling their execution on low-resource, low-cost, low-power IoT devices.
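
    The published OWSNet architecture is not detailed in the abstract; the sketch below shows a generic temporal-convolution keyword spotter over MFCC frames to illustrate the model family. The input shape, layer widths, and number of keyword classes are assumptions.

```python
# Minimal sketch of a temporal-convolution keyword spotter: stacked dilated
# 1-D convolutions over MFCC frames, ending in a softmax over keyword classes.
# Shapes and layer sizes are assumptions, not the published OWSNet configuration.
import tensorflow as tf

N_FRAMES, N_MFCC, N_KEYWORDS = 98, 40, 12   # ~1 s of audio, 12 target words

inputs = tf.keras.Input(shape=(N_FRAMES, N_MFCC))
x = inputs
for dilation in (1, 2, 4, 8):               # growing temporal receptive field
    x = tf.keras.layers.Conv1D(64, kernel_size=3, padding="causal",
                               dilation_rate=dilation, activation="relu")(x)
    x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
outputs = tf.keras.layers.Dense(N_KEYWORDS, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()   # small enough to run in real time on Pi-class devices
```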

    OTA-TinyML: Over the air deployment of TinyML models and execution on IoT devices

    This article presents a novel over-the-air (OTA) technique to remotely deploy TinyML models onto Internet of Things (IoT) devices and perform tasks such as machine learning (ML) model updates, firmware reflashing, reconfiguration, or repurposing. We discuss the relevant scientific and engineering challenges of OTA ML deployment over IoT. We propose OTA-TinyML to enable resource-constrained IoT devices to perform end-to-end fetching, storage, and execution of many TinyML models. OTA-TinyML loads the C source file of ML models from a web server onto embedded IoT devices via HTTPS. OTA-TinyML is tested by remotely fetching six types of ML models, storing them on four types of memory units, and then loading and executing them on seven popular MCU boards.
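
    A minimal sketch of the fetch-store-execute flow on a Linux-class edge device (e.g. a Raspberry Pi) follows. The model URL is a placeholder, and for simplicity the sketch fetches a ready-to-run .tflite binary rather than a C source file; on MCUs the equivalent steps would be implemented in C against an HTTPS client and external flash or EEPROM.

```python
# Sketch of the fetch-store-execute flow on a Linux-class edge device.
# The server URL is a hypothetical placeholder.
import numpy as np
import requests
import tensorflow as tf

MODEL_URL = "https://example.com/models/sine_model.tflite"   # hypothetical

# 1. Fetch the model binary over HTTPS.
resp = requests.get(MODEL_URL, timeout=30)
resp.raise_for_status()

# 2. Store it on local storage (stand-in for flash / SD card / EEPROM).
with open("model.tflite", "wb") as f:
    f.write(resp.content)

# 3. Load and execute the model.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

sample = np.zeros(inp["shape"], dtype=inp["dtype"])          # dummy input
interpreter.set_tensor(inp["index"], sample)
interpreter.invoke()
print("inference output:", interpreter.get_tensor(out["index"]))
```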

    Toward distributed, global, deep learning using IoT devices

    Deep learning (DL) using large-scale, high-quality IoT datasets can be computationally expensive. Utilizing such datasets to produce a problem-solving model within a reasonable time frame requires a scalable distributed training platform/system. We present a novel approach that trains one DL model on the hardware of thousands of mid-sized IoT devices across the world, rather than on a GPU cluster within a data center. We analyze the scalability and convergence of the resulting model and identify three bottlenecks: heavy computational workloads, time-consuming dataset-loading I/O, and slow exchange of model gradients. To highlight the research challenges of globally distributed DL training and classification, we consider a case study from the video data processing domain. We also outline the need for a two-step deep compression method that increases the speed and scalability of the DL training process. Our initial experimental validation shows that the proposed method improves the tolerance of the distributed training process to varying internet bandwidth, latency, and Quality of Service metrics.
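
    The two-step deep compression method is not detailed in the abstract; the sketch below shows one plausible combination, top-k sparsification followed by 8-bit quantization of the surviving gradient values, to illustrate how the gradient payload exchanged between devices could shrink.

```python
# Sketch of one plausible two-step gradient compression before exchange:
# (1) keep only the top-k gradient values by magnitude, (2) quantize the
# survivors to 8 bits. The paper's exact method is not specified here.
import numpy as np

def compress(grad, k_ratio=0.01):
    """Step 1: top-k sparsification; step 2: 8-bit quantization of survivors."""
    flat = grad.ravel()
    k = max(1, int(k_ratio * flat.size))
    idx = np.argpartition(np.abs(flat), -k)[-k:]        # indices of top-k values
    vals = flat[idx]
    scale = max(float(np.abs(vals).max()) / 127.0, 1e-8)
    q = np.round(vals / scale).astype(np.int8)
    return idx.astype(np.uint32), q, np.float32(scale), grad.shape

def decompress(idx, q, scale, shape):
    flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
    flat[idx] = q.astype(np.float32) * scale
    return flat.reshape(shape)

grad = np.random.randn(256, 128).astype(np.float32)     # dummy layer gradient
idx, q, scale, shape = compress(grad)
print("exchange payload:", idx.nbytes + q.nbytes + 4, "bytes vs dense:", grad.nbytes)
```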