
    Collaborative Intelligent Cross-Camera Video Analytics at Edge: Opportunities and Challenges

    Nowadays, video cameras are deployed at large scale for spatial monitoring of physical places (e.g., surveillance systems in the context of smart cities). This massive camera deployment, however, presents new challenges for analyzing the enormous volume of data: the high computational cost of sophisticated deep learning techniques imposes a prohibitive overhead, in terms of energy consumption and processing throughput, on resource-constrained edge devices. To address these limitations, this paper envisions a collaborative intelligent cross-camera video analytics paradigm at the network edge, in which camera nodes adjust their pipelines (e.g., inference) to incorporate correlated observations and shared knowledge from other nodes' content. By harnessing redundant spatio-temporal correlations to reduce the size of the inference search space on the one hand, and intelligent collaboration between video nodes on the other, we discuss how such a collaborative paradigm can considerably improve accuracy, reduce latency, and decrease communication bandwidth compared to non-collaborative baselines. The paper also describes major opportunities and challenges in realizing such a paradigm.
    Comment: First International Workshop on Challenges in Artificial Intelligence and Machine Learning
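
    To make the search-space-reduction idea concrete, the minimal sketch below has a camera node skip candidate regions that a collaborating camera has already covered. This is an illustrative assumption about one possible mechanism, not the paper's design; the Region type, the IoU test, and the 0.5 threshold are all hypothetical names and values.

```python
# Hypothetical sketch: prune a node's inference search space using a peer
# camera's recent detections (illustrative only, not the paper's API).
from dataclasses import dataclass

@dataclass
class Region:
    x: int  # top-left corner in a shared ground-plane coordinate frame
    y: int
    w: int  # width
    h: int  # height

def iou(a: Region, b: Region) -> float:
    """Intersection-over-union of two axis-aligned regions."""
    ix = max(0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union else 0.0

def prune_search_space(candidates, peer_regions, threshold=0.5):
    """Keep only candidates a collaborating camera has not already covered."""
    return [c for c in candidates
            if all(iou(c, p) < threshold for p in peer_regions)]

# One of the three candidates heavily overlaps a peer's recent detection,
# so only two regions reach the node's (expensive) local detector.
candidates = [Region(0, 0, 100, 100), Region(90, 0, 100, 100), Region(400, 400, 50, 50)]
peer = [Region(10, 0, 100, 100)]
print(prune_search_space(candidates, peer))  # keeps the 2 uncovered regions
```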

    System Abstractions for Scalable Application Development at the Edge

    Recent years have witnessed an explosive growth of Internet of Things (IoT) devices, which collect or generate huge amounts of data. Given diverse device capabilities and application requirements, data processing takes place across a range of settings, from on-device to a nearby edge server/cloud and remote cloud. Consequently, edge-cloud coordination has been studied extensively from the perspectives of job placement, scheduling, and joint optimization. Typical approaches focus on performance optimization for individual applications; this not only requires domain knowledge of the applications, but also leads to application-specific solutions. Application development and deployment over diverse scenarios thus incur repetitive manual efforts.

    There are two overarching challenges to providing system-level support for application development at the edge. First, there is inherent heterogeneity at the device hardware level. The execution settings may range from a small cluster serving as an edge cloud to on-device inference on embedded devices, differing in hardware capability and programming environments. Further, application performance requirements vary significantly, making it even more difficult to map different applications to already heterogeneous hardware. Second, there are trends toward incorporating both edge and cloud, and multi-modal data. Together, these add further dimensions to the design space and increase the complexity significantly.

    In this thesis, we propose a novel framework to simplify application development and deployment over a continuum of edge to cloud. Our framework provides key connections between different dimensions of design considerations, corresponding to the application abstraction, data abstraction, and resource management abstraction, respectively. First, our framework masks hardware heterogeneity with abstract resource types through containerization, and abstracts the application processing pipelines into generic flow graphs. It further supports a notion of degradable computing for application scenarios at the edge that are driven by multimodal sensory input. Next, as video analytics is the killer app of edge computing, we include a generic data management service between video query systems and a video store to organize video data at the edge. We propose a video data unit abstraction based on a notion of distance between objects in the video, quantifying the semantic similarity among video data. Last, considering concurrent application execution, our framework supports multi-application offloading with device-centric control, with a userspace scheduler service that wraps over the operating system scheduler.
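
    The following sketch illustrates the general shape of a "generic flow graph" application abstraction with per-stage placement hints. The Stage and FlowGraph names, the placement strings, and the toy pipeline are assumptions made for illustration; the thesis's actual containerized framework is far richer.

```python
# Minimal sketch of a flow-graph application abstraction (hypothetical API):
# stages are vertices, data flows along edges, and each stage carries a
# placement hint the runtime could use for edge-cloud mapping.
from collections import deque

class Stage:
    def __init__(self, name, fn, placement="edge"):
        self.name = name              # human-readable stage name
        self.fn = fn                  # per-item processing function
        self.placement = placement    # "device" | "edge" | "cloud" hint

class FlowGraph:
    def __init__(self):
        self.stages = {}
        self.edges = {}               # stage name -> downstream stage names

    def add(self, stage, after=None):
        self.stages[stage.name] = stage
        self.edges.setdefault(stage.name, [])
        if after:
            self.edges[after].append(stage.name)
        return stage

    def run(self, source_name, item):
        """Push one item through the graph in breadth-first order."""
        results = {source_name: self.stages[source_name].fn(item)}
        queue = deque([source_name])
        while queue:
            cur = queue.popleft()
            for nxt in self.edges[cur]:
                results[nxt] = self.stages[nxt].fn(results[cur])
                queue.append(nxt)
        return results

# Example pipeline: decode on-device, detect at the edge, aggregate in cloud.
g = FlowGraph()
g.add(Stage("decode", lambda frame: frame.lower(), placement="device"))
g.add(Stage("detect", lambda frame: f"objects({frame})", placement="edge"), after="decode")
g.add(Stage("aggregate", lambda dets: [dets], placement="cloud"), after="detect")
print(g.run("decode", "FRAME-001"))
```

    The point of the abstraction is that the same graph can be re-mapped across execution settings by changing placement decisions, without rewriting the application logic.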

    Edge Video Analytics: A Survey on Applications, Systems and Enabling Techniques

    Video, as a key driver in the global explosion of digital information, can create tremendous benefits for human society. Governments and enterprises are deploying innumerable cameras for a variety of applications, e.g., law enforcement, emergency management, traffic control, and security surveillance, all facilitated by video analytics (VA). This trend is spurred by the rapid advancement of deep learning (DL), which enables more precise models for object classification, detection, and tracking. Meanwhile, with the proliferation of Internet-connected devices, massive amounts of data are generated daily, overwhelming the cloud. Edge computing, an emerging paradigm that moves workloads and services from the network core to the network edge, has been widely recognized as a promising solution. The resulting intersection, edge video analytics (EVA), has begun to attract widespread attention. Nevertheless, only a few loosely related surveys exist on this topic, and the basic concepts of EVA (e.g., definition, architectures) have not been fully elucidated due to the rapid development of this domain. To fill these gaps, we provide a comprehensive survey of recent efforts on EVA. In this paper, we first review the fundamentals of edge computing, followed by an overview of VA. The EVA system and its enabling techniques are discussed next. In addition, we introduce prevalent frameworks and datasets to aid future researchers in the development of EVA systems. Finally, we discuss existing challenges and foresee future research directions. We believe this survey will help readers comprehend the relationship between VA and edge computing, and spark new ideas on EVA.
    Comment: 31 pages, 13 figures
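
    One canonical EVA pattern such surveys cover is cheap filtering at the edge so that only frames likely to matter are shipped to a heavier cloud model. The sketch below shows that pattern under stated assumptions: the frame-difference trigger, the threshold, and the offload stub are illustrative, not a specific system from the survey.

```python
# Hedged sketch of edge-side frame filtering before cloud offload.
import random

def frame_difference(prev, cur):
    """Toy mean per-pixel difference; real systems may use motion vectors."""
    return sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)

def offload_to_cloud(frame):
    # Placeholder standing in for an RPC to a cloud-hosted detector.
    return {"frame": frame, "detections": None}

def edge_filter(frames, threshold=10.0):
    """Offload only frames that differ enough from the last offloaded one."""
    last_sent = None
    for frame in frames:
        if last_sent is None or frame_difference(last_sent, frame) > threshold:
            last_sent = frame
            yield offload_to_cloud(frame)

# Example: 30 synthetic 8-"pixel" frames, mostly near-duplicates.
random.seed(0)
frames = [[50 + random.randint(-2, 2) for _ in range(8)] for _ in range(30)]
frames[15] = [200] * 8  # a sudden scene change
print(sum(1 for _ in edge_filter(frames)))  # far fewer offloads than 30
```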

    Realizing Video Analytic Service in the Fog-Based Infrastructure-Less Environments

    Deep learning has unleashed great potential in many fields and is now the most significant facilitator for video analytics, owing to its capability to provide more intelligent services in complex scenarios. Meanwhile, the emergence of fog computing has brought unprecedented opportunities to provision intelligent services in infrastructure-less environments like remote national parks and rural farms. However, most deep learning algorithms are computationally intensive and cannot be executed in such environments because of the support they require from the cloud. In this paper, we develop a video analytic framework tailored particularly for fog devices, to realize video analytic services in a rapid manner. Convolutional neural networks are used as the core processing unit of the framework to facilitate the image analysis process.
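
    For readers unfamiliar with the core operation such a CNN-based processing unit performs, here is a minimal 2D convolution in NumPy. This is a didactic sketch only; a deployed fog framework would rely on an optimized inference runtime rather than hand-rolled loops.

```python
# Minimal 2D convolution, the basic building block a CNN layer applies.
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2D convolution of a single-channel image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Example: a 3x3 edge-detection kernel over a synthetic 8x8 frame.
frame = np.zeros((8, 8))
frame[:, 4:] = 1.0  # vertical edge down the middle
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
response = conv2d(frame, sobel_x)
print(response.max())  # strong response (4.0) along the edge
```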

    A survey on big multimedia data processing and management in smart cities

    Integration of embedded multimedia devices with powerful computing platforms, e.g., machine learning platforms, helps to build smart cities and transforms the concept of the Internet of Things into the Internet of Multimedia Things (IoMT). To provide different services to the residents of smart cities, IoMT technology generates big multimedia data, whose management is a challenging task: without proper management, it is hard to maintain the consistency, reusability, and reconcilability of the big multimedia data generated in smart cities. Various machine learning techniques can be used for automatic classification of raw multimedia data and to allow machines to learn features and perform specific tasks. In this survey, we focus on various machine learning platforms that can be used to process and manage big multimedia data generated by different applications in smart cities. We also highlight various limitations and research challenges that need to be considered when processing big multimedia data in real time.

    Why is the video analytics accuracy fluctuating, and what can we do about it?

    It is common practice to think of a video as a sequence of images (frames) and to re-use deep neural network models that are trained only on images for similar analytics tasks on videos. In this paper, we show that this leap of faith, that deep learning models which work well on images will also work well on videos, is actually flawed. We show that even when a video camera is viewing a scene that is not changing in any human-perceptible way, and we control for external factors like video compression and environment (lighting), the accuracy of video analytics applications fluctuates noticeably. These fluctuations occur because successive frames produced by the video camera may look similar visually, yet are perceived quite differently by the video analytics applications. We observed that the root cause of these fluctuations is the dynamic camera parameter changes that a video camera automatically makes in order to capture and produce a visually pleasing video. The camera inadvertently acts as an unintentional adversary because these slight changes in the image pixel values in consecutive frames, as we show, have a noticeably adverse impact on the accuracy of insights from video analytics tasks that re-use image-trained deep learning models. To address this inadvertent adversarial effect from the camera, we explore the use of transfer learning techniques to improve learning in video analytics tasks through the transfer of knowledge from learning on image analytics tasks. In particular, we show that our newly trained Yolov5 model reduces fluctuation in object detection across frames, which leads to better tracking of objects (40% fewer tracking mistakes). Our paper also provides new directions and techniques to mitigate the camera's adversarial effect on deep learning models used for video analytics applications.
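
    The kind of fluctuation the paper describes can be illustrated with a toy measurement: run a detector over consecutive, visually near-identical frames and count how often its output changes. The detector stub below is an assumption standing in for an image-trained model such as YOLOv5, and the exposure nudge is a caricature of automatic camera parameter changes.

```python
# Hedged sketch: measure detection instability across near-identical frames.
def detect(frame):
    """Stub detector: count 'objects' as pixel values above a hard threshold.
    Small shifts from auto-exposure can flip values across that threshold."""
    return sum(1 for row in frame for v in row if v > 127)

def fluctuation(frames):
    """Fraction of consecutive frame pairs whose detection counts disagree."""
    counts = [detect(f) for f in frames]
    pairs = list(zip(counts, counts[1:]))
    return sum(1 for a, b in pairs if a != b) / len(pairs)

# The 'scene' never changes, but auto-exposure nudges pixels by +/-2, so
# values near the threshold flip and the detection count jitters.
base = [[126, 128, 200, 50]]
frames = [[[v + d for v in row] for row in base] for d in (-2, 0, 2, 0, -2)]
print(fluctuation(frames))  # > 0 despite no human-perceptible scene change
```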