2,042 research outputs found
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver
Video browsing interfaces and applications: a review
We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data—which, if presented in its raw format, is rather unwieldy and costly—have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other
Learning to detect video events from zero or very few video examples
In this work we deal with the problem of high-level event detection in video.
Specifically, we study the challenging problems of i) learning to detect video
events from solely a textual description of the event, without using any
positive video examples, and ii) additionally exploiting very few positive
training samples together with a small number of ``related'' videos. For
learning only from an event's textual description, we first identify a general
learning framework and then study the impact of different design choices for
various stages of this framework. For additionally learning from example
videos, when true positive training samples are scarce, we employ an extension
of the Support Vector Machine that allows us to exploit ``related'' event
videos by automatically introducing different weights for subsets of the videos
in the overall training set. Experimental evaluations performed on the
large-scale TRECVID MED 2014 video dataset provide insight on the effectiveness
of the proposed methods.Comment: Image and Vision Computing Journal, Elsevier, 2015, accepted for
publicatio
A comprehensive survey of multi-view video summarization
[EN] There has been an exponential growth in the amount of visual data on a daily basis acquired from single or multi-view surveillance camera networks. This massive amount of data requires efficient mechanisms such as video summarization to ensure that only significant data are reported and the redundancy is reduced. Multi-view video summarization (MVS) is a less redundant and more concise way of providing information from the video content of all the cameras in the form of either keyframes or video segments. This paper presents an overview of the existing strategies proposed for MVS, including their advantages and drawbacks. Our survey covers the genericsteps in MVS, such as the pre-processing of video data, feature extraction, and post-processing followed by summary generation. We also describe the datasets that are available for the evaluation of MVS. Finally, we examine the major current issues related to MVS and put forward the recommendations for future research(1). (C) 2020 Elsevier Ltd. All rights reserved.This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1A2B5B01070067)Hussain, T.; Muhammad, K.; Ding, W.; Lloret, J.; Baik, SW.; De Albuquerque, VHC. (2021). A comprehensive survey of multi-view video summarization. Pattern Recognition. 109:1-15. https://doi.org/10.1016/j.patcog.2020.10756711510
- …