Horizontally distributed inference of deep neural networks for AI-enabled IoT
Motivated by the pervasiveness of artificial intelligence (AI) and the Internet of Things (IoT) in the current “smart everything” scenario, this article provides a comprehensive overview of the most recent research at the intersection of both domains. It focuses on the design and development of mechanisms that enable collaborative inference across edge devices, towards the in situ execution of highly complex state-of-the-art deep neural networks (DNNs), despite the resource-constrained nature of such infrastructures. In particular, the review discusses the most salient approaches conceived along those lines, elaborating on the specificities of the partitioning schemes and parallelism paradigms explored. It provides an organized and schematic discussion of the underlying workflows and associated communication patterns, as well as the architectural aspects of the DNNs that have driven the design of such techniques, while also highlighting the primary challenges encountered at the design and operational levels and the specific adjustments or enhancements explored in response to them.
Agencia Estatal de Investigación | Ref. DPI2017-87494-R
Ministerio de Ciencia e Innovación | Ref. PDC2021-121644-I00
Xunta de Galicia | Ref. ED431C 2022/03-GR
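Among the parallelism paradigms such surveys cover, horizontal (spatial) partitioning is a recurring one: the input to a layer is sliced across devices, each device computes its own tile of the output, and neighboring devices exchange a small overlapping "halo" so the stitched result matches the monolithic computation. A minimal sketch, assuming a 1-D convolution and illustrative function names (not taken from any surveyed system):

```python
# Sketch of horizontal (spatial) partitioning for collaborative inference:
# a 1-D convolution is split across hypothetical edge workers, each
# computing one tile of the output. Names here are illustrative.

def conv1d(signal, kernel):
    """Valid 1-D convolution (no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def partitioned_conv1d(signal, kernel, num_workers=2):
    """Split the input spatially; each worker needs a halo of k-1 extra
    samples so the stitched output matches the monolithic result."""
    k = len(kernel)
    out_len = len(signal) - k + 1
    tile = out_len // num_workers
    outputs = []
    for w in range(num_workers):
        start = w * tile
        stop = out_len if w == num_workers - 1 else (w + 1) * tile
        # Each worker receives its input slice plus the overlapping halo.
        chunk = signal[start:stop + k - 1]
        outputs.extend(conv1d(chunk, kernel))
    return outputs

signal = [1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]
assert partitioned_conv1d(signal, kernel) == conv1d(signal, kernel)
```

The halo exchange is exactly the communication pattern whose cost such partitioning schemes must balance against the compute saved per device.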
Adaptive ResNet Architecture for Distributed Inference in Resource-Constrained IoT Systems
As deep neural networks continue to expand and become more complex, most edge
devices are unable to handle their extensive processing requirements.
Therefore, the concept of distributed inference is essential to distribute the
neural network among a cluster of nodes. However, distribution may lead to
additional energy consumption and dependency among devices that suffer from
unstable transmission rates. Unstable transmission rates harm the real-time
performance of IoT devices, causing high latency, high energy usage, and
potential failures. Hence, for dynamic systems, it is necessary to have a
resilient DNN with an adaptive architecture that can downsize as per the
available resources. This paper presents an empirical study that identifies the
connections in ResNet that can be dropped without significantly impacting the
model's performance to enable distribution in case of resource shortage. Based
on the results, a multi-objective optimization problem is formulated to
minimize latency and maximize accuracy as per available resources. Our
experiments demonstrate that an adaptive ResNet architecture can reduce shared
data, energy consumption, and latency throughout the distribution while
maintaining high accuracy.
Comment: Accepted in the International Wireless Communications & Mobile Computing Conference (IWCMC 2023).
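The reason ResNet connections can be dropped gracefully is the residual form y = x + F(x): removing F(x) leaves the identity shortcut, so a skipped block degrades accuracy smoothly instead of breaking the network. A hedged sketch of that idea, not the paper's code (the toy F and all names are illustrative):

```python
# Adaptive residual execution: each block computes y = x + F(x), so a
# block whose F is dropped under resource shortage falls back to the
# identity shortcut -- no compute, no intermediate data to transmit.

def make_block(scale):
    # F(x): a toy per-element transform standing in for the conv layers.
    return lambda x: [scale * v for v in x]

def adaptive_forward(x, blocks, active):
    """Run only the blocks flagged active; skipped blocks pass x through."""
    for block, on in zip(blocks, active):
        if on:
            fx = block(x)
            x = [a + b for a, b in zip(x, fx)]  # residual addition
    return x

blocks = [make_block(1), make_block(2), make_block(3)]
full = adaptive_forward([1, 2], blocks, [True, True, True])
reduced = adaptive_forward([1, 2], blocks, [True, False, True])
```

An optimizer like the paper's multi-objective formulation would choose the `active` mask: it trades the accuracy lost by each dropped block against the latency and energy saved.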
Privacy-Preserving Secure Inference for Cloud-Edge Collaboration Using Differential Privacy
Cloud-edge collaborative inference approach splits deep neural networks
(DNNs) into two parts that run collaboratively on resource-constrained edge
devices and cloud servers, aiming at minimizing inference latency and
protecting data privacy. However, even if the raw input data from edge devices
is not directly exposed to the cloud, state-of-the-art attacks targeting
collaborative inference are still able to reconstruct the raw private data from
the intermediate outputs of the exposed local models, introducing serious
privacy risks. In this paper, a secure privacy inference framework for
cloud-edge collaboration is proposed, termed CIS, which supports adaptively
partitioning the network according to the dynamically changing network
bandwidth and fully releases the computational power of edge devices. To
mitigate the accuracy loss introduced by the privacy perturbation, CIS
achieves differential privacy protection by adding refined noise to the
intermediate-layer feature maps offloaded to the cloud. Meanwhile, given a
total privacy budget, the budget is allocated according to the rank of the
feature maps generated by different convolution filters, which makes the
inference in the cloud robust to the perturbed data and thus effectively
trades off the conflicting goals of privacy and availability. Finally, we
construct a real cloud-edge collaborative inference scenario to evaluate the
inference latency and model partitioning on resource-constrained edge
devices. Furthermore, a state-of-the-art cloud-edge collaborative
reconstruction attack is used to evaluate the practical effectiveness of the
end-to-end privacy protection mechanism provided by CIS.
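The perturbation step described above can be sketched as follows. This is a hedged illustration, not CIS itself: Laplace noise is added to each channel's feature map before offloading, and the total privacy budget is split across channels in proportion to an importance score (the paper ranks feature maps per convolution filter; the scores below are placeholders):

```python
import math
import random

def allocate_budget(scores, total_epsilon):
    """Higher-scored channels get a larger epsilon, hence less noise,
    so the most informative features stay usable in the cloud."""
    total = sum(scores)
    return [total_epsilon * s / total for s in scores]

def laplace_noise(scale):
    # Inverse-CDF sampling; Python's stdlib has no Laplace distribution.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_feature_maps(feature_maps, scores, total_epsilon, sensitivity=1.0):
    """Add per-channel Laplace noise with scale b = sensitivity / epsilon."""
    budgets = allocate_budget(scores, total_epsilon)
    return [[v + laplace_noise(sensitivity / eps) for v in fmap]
            for fmap, eps in zip(feature_maps, budgets)]

fmaps = [[0.5, 1.0], [2.0, 0.0], [1.5, 1.5]]
noisy = perturb_feature_maps(fmaps, scores=[1, 2, 5], total_epsilon=8.0)
```

The allocation preserves the total budget (the epsilons sum to `total_epsilon`), which is what lets the scheme claim a single end-to-end differential privacy guarantee while distributing the utility cost unevenly across channels.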
Runtime adaptive IoMT node on multi-core processor platform
The Internet of Medical Things (IoMT) paradigm is becoming mainstream in multiple clinical trials and healthcare procedures. Thanks to innovative technologies, latest-generation communication networks, and state-of-the-art portable devices, IoMT opens up new scenarios for data collection and continuous patient monitoring. Two very important aspects should be considered to make the most of this paradigm. First, moving the processing task from the cloud to the edge leads to several advantages, such as responsiveness, portability, scalability, and reliability of the sensor node. Second, in order to increase the accuracy of the system, state-of-the-art cognitive algorithms based on artificial intelligence and deep learning must be integrated. Sensor nodes often need to be battery powered and to remain active for a long time without a different power source. Therefore, one of the challenges to be addressed during the design and development of IoMT devices concerns energy optimization. Our work proposes an implementation of cognitive data analysis based on deep learning techniques on a resource-constrained computing platform. To handle power efficiency, we introduced a component called Adaptive runtime Manager (ADAM). This component takes care of dynamically reconfiguring the hardware and software of the device during execution, in order to better adapt it to the workload and the required operating mode. To test a high computational load on a multi-core system, cognitive analysis of electrocardiogram (ECG) traces was adopted on the Orlando prototype board by STMicroelectronics, considering single-channel and six-channel simultaneous cases. Experimental results show that by managing the sensory node configuration at runtime, energy savings of at least 15% can be achieved.
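The core of such a runtime manager is a decision rule over the platform's operating points. A sketch of an ADAM-like decision with a made-up cost model (the latency/energy formulas and operating points below are hypothetical, not measurements from the Orlando board): among the available (cores, frequency) configurations, pick the lowest-energy one that still meets the workload's latency deadline.

```python
# ADAM-like runtime configuration choice over toy operating points.
CONFIGS = [(1, 200), (2, 200), (4, 200),   # (active cores, clock in MHz)
           (1, 400), (2, 400), (4, 400)]

def latency_ms(workload, cores, freq):
    # Toy model with sub-linear parallel speedup across cores.
    return workload / (freq * cores ** 0.8)

def energy_mj(workload, cores, freq):
    power = cores * freq ** 1.5 / 1000     # dynamic power grows with clock
    return power * latency_ms(workload, cores, freq)

def choose_config(workload, deadline_ms):
    """Lowest-energy feasible configuration; fastest one as fallback."""
    feasible = [c for c in CONFIGS if latency_ms(workload, *c) <= deadline_ms]
    if not feasible:                       # deadline unreachable: go fast
        return max(CONFIGS, key=lambda c: c[0] * c[1])
    return min(feasible, key=lambda c: energy_mj(workload, *c))
```

Under this toy model, a heavy six-channel workload with a tight deadline pushes the manager toward more cores or a higher clock, while a light single-channel trace lets it fall back to a slower, cheaper configuration; that gap is where the reported runtime energy savings come from.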