911 research outputs found

    Service Abstractions for Scalable Deep Learning Inference at the Edge

    Deep learning-driven intelligent edge has already become a reality: millions of mobile, wearable, and IoT devices analyze real-time data and transform them into actionable insights on-device. Typical approaches to optimizing deep learning inference focus mostly on accelerating the execution of individual inference tasks, without considering the contextual correlation unique to edge environments or the statistical nature of learning-based computation. Specifically, they treat inference workloads as individual black boxes and apply canonical system optimization techniques, developed over the last few decades, to handle them as yet another type of computation-intensive application. As a result, deep learning inference on edge devices still faces the ever-increasing challenges of customization to edge device heterogeneity, fuzzy computation redundancy between inference tasks, and end-to-end deployment at scale. In this thesis, we propose the first framework that automates and scales the end-to-end process of deploying efficient deep learning inference from the cloud to heterogeneous edge devices. The framework consists of a series of service abstractions that handle DNN model tailoring, model indexing and query, and computation reuse for runtime inference, respectively. Together, these services bridge the gap between deep learning training and inference, eliminate computation redundancy during inference execution, and further lower the barrier for deep learning algorithm and system co-optimization. To build efficient and scalable services, we take a unique algorithmic approach that harnesses the semantic correlation among learning-based computations. Rather than viewing individual tasks as isolated black boxes, we optimize them collectively in a white-box approach, proposing primitives to formulate the semantics of deep learning workloads, algorithms to assess their hidden correlation (in terms of the input data, the neural network models, and the deployment trials), and techniques to merge common processing steps and minimize redundancy.
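    The computation-reuse idea described above can be illustrated with a minimal sketch: cache inference results keyed by a coarse fingerprint of the input, so semantically near-identical requests skip the expensive model entirely. All names here (`fingerprint`, `ReuseCache`, the toy model) are illustrative assumptions, not the thesis's actual primitives.

```python
import hashlib

def fingerprint(frame: bytes, bucket: int = 16) -> str:
    # Coarse fingerprint: quantize byte values so near-identical
    # inputs collide on the same cache key.
    quantized = bytes(b // bucket for b in frame)
    return hashlib.sha256(quantized).hexdigest()

class ReuseCache:
    """Wraps a model and serves repeated/similar inputs from cache."""

    def __init__(self, model):
        self.model = model
        self.cache = {}
        self.hits = 0

    def infer(self, frame: bytes):
        key = fingerprint(frame)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        result = self.model(frame)      # expensive path
        self.cache[key] = result
        return result

# Toy "model": classify a frame by its average byte value.
model = lambda frame: "bright" if sum(frame) / len(frame) > 127 else "dark"
rc = ReuseCache(model)
rc.infer(bytes([200] * 64))
rc.infer(bytes([201] * 64))  # near-duplicate input: served from cache
print(rc.hits)  # 1
```

    Real systems would use a perceptual or feature-level fingerprint rather than byte quantization, but the structure (fingerprint, lookup, fall back to the model) is the same.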

    Edge Intelligence : Empowering Intelligence to the Edge of Network

    Edge intelligence refers to a set of connected systems and devices that perform data collection, caching, processing, and analysis in proximity to where the data are captured, based on artificial intelligence. Edge intelligence aims to enhance data processing and to protect the privacy and security of data and users. Although it emerged only recently, spanning the period from 2011 to the present, this field of research has shown explosive growth over the past five years. In this article, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, i.e., edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components, and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate on, compare, and analyze the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, and so on. This article provides a comprehensive survey of edge intelligence and its application areas. In addition, we summarize the development of the emerging research fields and the current state of the art, and discuss important open issues and possible theoretical and technical directions. Peer reviewed.
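    One of the four components the survey identifies, edge offloading, often reduces to a latency comparison: run a task locally, or ship it to a remote server. A hedged sketch of such a decision rule, with all numbers and names purely illustrative rather than taken from the survey:

```python
def should_offload(work_flops: float, local_flops_s: float,
                   remote_flops_s: float, payload_bits: float,
                   bandwidth_bps: float) -> bool:
    """Offload iff transmit-plus-remote time beats local execution time."""
    local_latency = work_flops / local_flops_s
    remote_latency = payload_bits / bandwidth_bps + work_flops / remote_flops_s
    return remote_latency < local_latency

# A 2 GFLOP inference on a 1 GFLOP/s device vs. a 100 GFLOP/s server
# reached over a 10 Mbit/s link with a 4 Mbit input payload:
print(should_offload(2e9, 1e9, 100e9, 4e6, 10e6))  # True
```

    Practical offloading schemes add energy budgets, queueing delay, and privacy constraints on top of this basic latency trade-off.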

    Performance Evaluation of the Object Detection Algorithms on Embedded Devices

    Edge computing has seen a dramatic rise in demand, driven by the need for real-time, low-latency applications across domains ranging from autonomous vehicles to surveillance systems. Among these, real-time object detection stands out as a crucial technology. However, the inherent constraints of edge devices, including limited computational power, present significant challenges. This thesis provides a comprehensive evaluation of several convolutional neural network-based object detection models deployed on resource-constrained edge devices, specifically the Raspberry Pi and Google's Coral TPU. The models examined include EfficientDet, YOLO, and variants of the MobileNet family combined with SSD for object detection tasks. We developed a novel benchmarking framework that evaluates these models under diverse configurations, enabling an accurate assessment of their performance characteristics. The performance of the models was scrutinized using an exhaustive set of metrics, including processing speed (frames per second), model accuracy (F1 score), energy consumption, CPU utilization, memory footprint, and device temperature. The benchmarking framework and the evaluation metrics provide a foundation for future work on the design and deployment of efficient real-time object detection models on edge devices.
    The findings of this study underscore that no single model is a universal solution for all edge applications; instead, the choice of model depends heavily on the specific requirements and constraints of the given application. By offering a detailed overview of the performance traits of each model, we aim to guide practitioners in making informed decisions when deploying object detection models in edge computing environments. This work sets the stage for future exploration of more efficient and effective models for real-time object detection on edge devices.
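    The core of such a benchmarking framework is a timing harness: run the detector repeatedly, record per-frame latency, and derive frames per second. A minimal sketch, where `fake_detector` is a stand-in assumption for a real model such as EfficientDet or SSD-MobileNet:

```python
import statistics
import time

def benchmark(detector, frames, warmup: int = 2):
    """Time a detector over a list of frames and report latency and FPS."""
    for f in frames[:warmup]:          # warm-up runs, excluded from stats
        detector(f)
    latencies = []
    for f in frames:
        t0 = time.perf_counter()
        detector(f)
        latencies.append(time.perf_counter() - t0)
    mean_latency = statistics.mean(latencies)
    return {"mean_latency_s": mean_latency, "fps": 1.0 / mean_latency}

def fake_detector(frame):
    # Placeholder "model": burn a little CPU, return one detection.
    _ = sum(range(1000))
    return [("object", 0.9)]

stats = benchmark(fake_detector, [object()] * 50)
print(stats["fps"] > 0)  # True
```

    The full framework in the thesis additionally samples energy, CPU, memory, and temperature; those require platform-specific probes and are omitted here.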

    A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends

    In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for various computer vision tasks such as image classification, object detection, and image segmentation. Numerous types of CNNs have been designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, and depthwise convolutions, and NAS-derived architectures, among others. Each type of CNN has a unique structure and characteristics, making it suitable for specific tasks. It is crucial to gain a thorough understanding of these different CNN types and to perform a comparative analysis of their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type of CNN can aid in the development of new and improved architectures in the future. We also examine, from various perspectives, the platforms and frameworks that researchers use for research and development. Additionally, we explore major CNN research fields such as 6D vision, generative models, and meta-learning. This survey paper provides a comprehensive examination and comparison of various CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.
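    The parameter-count arithmetic behind one of the surveyed variants is easy to make concrete: a depthwise-separable convolution replaces a standard KxK convolution with a depthwise KxK step plus a 1x1 pointwise step, sharply reducing parameters. The formulas below follow the standard definitions (biases ignored for simplicity):

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    # Standard conv: c_out filters, each spanning all c_in channels.
    return c_in * c_out * k * k

def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    depthwise = c_in * k * k   # one KxK filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixing channels
    return depthwise + pointwise

c_in, c_out, k = 128, 256, 3
std = standard_conv_params(c_in, c_out, k)        # 294912
sep = depthwise_separable_params(c_in, c_out, k)  # 1152 + 32768 = 33920
print(round(std / sep, 1))  # 8.7
```

    This roughly 9x reduction for 3x3 kernels is what makes depthwise-separable convolutions the backbone of mobile-oriented families such as MobileNet.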

    Designing for Deployable, Secure, and Generic Machine Learning Systems

    Machine learning systems have catalyzed numerous image-centric applications owing to the significant achievements of machine learning algorithms and models. While these systems have showcased the efficacy of machine learning models, certain challenges persist, such as machine learning system design and the security vulnerabilities inherent in deep neural networks. Moreover, the deployment of deep neural network models remains a significant hurdle. This dissertation introduces a multimedia prototyping framework tailored for visual analytical applications, improving the reusability of video analysis software tools with minimal performance overhead. Furthermore, we present novel image-processing techniques designed to bolster the robustness of deep neural networks and propose an innovative compression technique to address deployment challenges. First, we propose a new software prototyping framework called Video as Text (vText) that analyzes and manipulates video data as trivially as most Unix and Linux systems handle text data, tackling the reusability issues in existing video analysis tools. The vText paradigm seeks to mimic such text-processing programs. We demonstrate the design and implementation of vText, linking video codecs with computer vision and image processing algorithms; our performance evaluation shows that the vText framework achieves comparable running time and is easy to use for prototyping visual analytical programs. Second, to reduce the vulnerability of deep neural networks to adversaries, we propose three color-reduction image processing approaches, namely Gaussian smoothing plus PNM color reduction (GPCR), Gaussian smoothing plus K-means (GK-means), and fast GK-means, to make deep convolutional neural networks more robust to adversarial perturbation. We evaluate the approaches on a subset of the ImageNet dataset; our evaluation reveals that the GK-means-based algorithms achieve the best top-1 classification accuracy.
    The final contribution of the dissertation is a novel deep neural network compression framework for class-specialization problems, addressing the limited utilization of deep neural network-based functionalities. We propose a novel knowledge distillation framework with two proposed losses, Renormalized Knowledge Distillation (RKD) and Intra-Class Variance (ICV), to render computationally efficient, specialized neural network models. Our quantitative empirical evaluation demonstrates that the proposed framework achieves significant classification accuracy improvements for tasks where the number of subclasses or instances in the dataset is relatively small.
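    The color-reduction idea behind the GK-means defense can be sketched in miniature: cluster pixel values with k-means and snap each pixel to its cluster center, discarding small adversarial perturbations. This toy version works on grayscale intensities and omits the Gaussian smoothing step; it is an illustrative assumption, not the dissertation's implementation.

```python
import random

def kmeans_1d(values, k: int, iters: int = 10, seed: int = 0):
    """Plain k-means on scalar values; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(sorted(set(values)), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

def reduce_colors(pixels, centers):
    # Snap every pixel to its nearest cluster center.
    return [min(centers, key=lambda c: abs(p - c)) for p in pixels]

# Two "true" intensity groups with small perturbations on top:
pixels = [10, 12, 11, 200, 205, 198, 199]
centers = kmeans_1d(pixels, k=2)
reduced = reduce_colors(pixels, centers)
print(len(set(reduced)))  # 2
```

    After reduction only two distinct values survive, so perturbations smaller than the cluster spacing are erased, which is the intuition behind using color reduction as a pre-processing defense.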