491 research outputs found
A comprehensive survey of multi-view video summarization
[EN] There has been exponential growth in the amount of visual data acquired daily from single- or multi-view surveillance camera networks. This massive amount of data requires efficient mechanisms, such as video summarization, to ensure that only significant data are reported and redundancy is reduced. Multi-view video summarization (MVS) is a less redundant and more concise way of presenting the video content of all the cameras, in the form of either keyframes or video segments. This paper presents an overview of the existing strategies proposed for MVS, including their advantages and drawbacks. Our survey covers the generic steps in MVS, such as the pre-processing of video data, feature extraction, and post-processing, followed by summary generation. We also describe the datasets that are available for the evaluation of MVS. Finally, we examine the major current issues related to MVS and put forward recommendations for future research.
(C) 2020 Elsevier Ltd. All rights reserved.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1A2B5B01070067).
Hussain, T.; Muhammad, K.; Ding, W.; Lloret, J.; Baik, SW.; De Albuquerque, VHC. (2021). A comprehensive survey of multi-view video summarization. Pattern Recognition 109:1-15. https://doi.org/10.1016/j.patcog.2020.107567
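The generic pipeline the survey outlines (pre-processing, feature extraction, post-processing, summary generation) can be illustrated with a minimal keyframe-selection sketch. This is a hypothetical baseline, not a method from the paper: the colour-histogram features, the k-means step, and the names frame_histogram and summarize_views are all assumptions for illustration.

    import numpy as np
    from sklearn.cluster import KMeans

    def frame_histogram(frame, bins=16):
        # Illustrative feature: per-channel intensity histograms of an
        # (H, W, 3) uint8 frame. Real MVS systems use richer features
        # (motion cues, deep embeddings, etc.).
        return np.concatenate([
            np.histogram(frame[..., c], bins=bins, range=(0, 256),
                         density=True)[0]
            for c in range(3)
        ])

    def summarize_views(views, num_keyframes=5):
        # Pool frames from all camera views, cluster their features, and
        # keep the frame nearest each cluster centre as a keyframe.
        feats, index = [], []
        for v, frames in enumerate(views):      # views: list of frame lists
            for t, frame in enumerate(frames):
                feats.append(frame_histogram(frame))
                index.append((v, t))            # remember (view, time)
        feats = np.stack(feats)
        km = KMeans(n_clusters=num_keyframes, n_init=10).fit(feats)
        keyframes = []
        for k in range(num_keyframes):
            members = np.where(km.labels_ == k)[0]
            dists = np.linalg.norm(feats[members] - km.cluster_centers_[k],
                                   axis=1)
            keyframes.append(index[members[np.argmin(dists)]])
        return sorted(keyframes)                # (view, time) pairs, in order

Clustering pooled features across views is one simple way to exploit the inter-view redundancy the survey discusses; the methods it reviews vary chiefly in the features and the selection criterion used here.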
Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy
In this paper we consider the problem of deploying attention to subsets of the video streams for collating the most relevant data and information of interest related to a given task. We formalize this monitoring problem as a foraging problem and propose a probabilistic framework to model an observer's attentive behavior as the behavior of a forager. The forager, moment to moment, focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The approach proposed here is well suited to multi-stream video summarization; it can also serve as a preliminary step for more sophisticated video surveillance, e.g. activity and behavior analysis. Experimental results on the UCR Videoweb Activities Dataset, a publicly available dataset, are presented to illustrate the utility of the proposed technique.
Comment: Accepted to IEEE Transactions on Image Processing
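A minimal stand-in for the foraging idea, assuming a Thompson-sampling bandit over streams: each stream's chance of yielding an interesting detection is tracked with a Beta posterior, and attention moves to the stream whose sampled rate is highest. The paper's actual framework is richer than this; StreamForager and its fields are illustrative names only.

    import random

    class StreamForager:
        # Tracks a Beta posterior over each stream's "profitability"
        # (probability of an interesting detection) and attends to the
        # stream with the highest sampled rate (Thompson sampling).

        def __init__(self, num_streams):
            self.a = [1.0] * num_streams   # Beta prior: successes
            self.b = [1.0] * num_streams   # Beta prior: failures

        def choose_stream(self):
            # Sample a plausible detection rate per stream; attend to the best.
            samples = [random.betavariate(a, b)
                       for a, b in zip(self.a, self.b)]
            return max(range(len(samples)), key=samples.__getitem__)

        def update(self, stream, found_something):
            # Reinforce profitable streams, discount barren ones; this is
            # how the forager "switches to a more profitable stream".
            if found_something:
                self.a[stream] += 1.0
            else:
                self.b[stream] += 1.0

At each time step a monitor would call choose_stream(), run its detector on the chosen camera, and feed the outcome back via update(), so attention naturally drifts toward streams that keep paying off.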
Collaborative Multi-Agent Video Fast-Forwarding
Multi-agent applications have recently gained significant popularity. In many computer vision tasks, a network of agents, such as a team of robots with cameras, could work collaboratively to perceive the environment for efficient and accurate situation awareness. However, these agents often have limited computation, communication, and storage resources. Thus, reducing resource consumption while still providing an accurate perception of the environment becomes an important goal when deploying multi-agent systems. To achieve this goal, we identify and leverage the overlap among different camera views in multi-agent systems to reduce the processing, transmission, and storage of redundant/unimportant video frames. Specifically, we have developed two collaborative multi-agent video fast-forwarding frameworks, in distributed and centralized settings, respectively. In these frameworks, each individual agent can selectively process or skip video frames at adjustable paces based on multiple strategies via reinforcement learning. Multiple agents then collaboratively sense the environment via either 1) a consensus-based distributed framework called DMVF that periodically updates the fast-forwarding strategies of agents by establishing communication and consensus among connected neighbors, or 2) a centralized framework called MFFNet that utilizes a central controller to decide the fast-forwarding strategies for agents based on collected data. We demonstrate the efficacy and efficiency of our proposed frameworks on a real-world surveillance video dataset, VideoWeb, and a new simulated driving dataset, CarlaSim, through extensive simulations and deployment on an embedded platform with TCP communication. We show that compared with other approaches in the literature, our frameworks achieve better coverage of important frames while significantly reducing the number of frames processed at each agent.
Comment: IEEE Transactions on Multimedia, 2023. arXiv admin note: text overlap with arXiv:2008.0443
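A rough sketch of the two ingredients the abstract names: pace selection over discrete skip rates (here epsilon-greedy over per-pace value estimates, standing in for the learned RL policy) and a consensus step that blends an agent's strategy with its neighbors', as in a distributed framework like DMVF. The pace values, names, and blending rule below are assumptions for illustration, not the papers' actual algorithms.

    import numpy as np

    PACES = [1, 5, 10, 25]   # illustrative skip paces (frames to jump)

    def choose_pace(q_values, epsilon=0.1, rng=None):
        # Epsilon-greedy selection over per-pace value estimates, a
        # stand-in for the learned fast-forwarding policy: mostly pick
        # the best-valued pace, occasionally explore a random one.
        rng = rng or np.random.default_rng()
        if rng.random() < epsilon:
            return int(rng.integers(len(PACES)))
        return int(np.argmax(q_values))

    def consensus_step(own_strategy, neighbor_strategies, weight=0.5):
        # Sketch of the consensus idea in a distributed setting: blend
        # this agent's pace-selection distribution with the mean of its
        # connected neighbors', then renormalize to a valid distribution.
        if not neighbor_strategies:
            return np.asarray(own_strategy)
        blended = ((1 - weight) * np.asarray(own_strategy)
                   + weight * np.mean(neighbor_strategies, axis=0))
        return blended / blended.sum()

Periodically running consensus_step over each agent's neighborhood is one simple way agents with overlapping views could converge on complementary fast-forwarding behavior instead of all processing the same redundant frames.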
- …