Determination of Shot Boundary in MPEG Videos for TRECVID 2007
Shot boundary detection plays an important role in many video applications. Herein, a novel method for shot boundary detection in compressed video is proposed. First, we extract several local indicators from macroblocks and use these features to determine candidate cuts via rule-based decision making into five sub-spaces. Then, global indicators of frame similarity between the start and end frames of candidate cuts are examined, using fast phase correlation on cropped DC images. Gradual transitions such as fades and dissolves, as well as combined shot cuts, are also identified in the compressed domain. Experimental results on the TRECVID 2007 test data demonstrate the effectiveness and robustness of the proposed methodology. Moreover, our submissions run nearly five times faster than real-time video playback (25 frames/s) owing to compressed-domain processing, which brings additional advantages in processing speed and computing cost.
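The phase-correlation check mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two small grayscale grids standing in for cropped DC images, uses a naive DFT instead of a fast transform, and the names `dft2`, `phase_correlation` and `circular_shift` are hypothetical.

```python
import cmath
import random


def dft2(grid, inverse=False):
    """Naive 2-D DFT of a small grid (list of equal-length rows)."""
    rows, cols = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = sign * 2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += grid[y][x] * cmath.exp(1j * angle)
            out_row.append(total / (rows * cols) if inverse else total)
        out.append(out_row)
    return out


def phase_correlation(a, b):
    """Return (peak, (dy, dx)): peak height and circular shift of b relative to a."""
    fa, fb = dft2(a), dft2(b)
    cross = []
    for row_a, row_b in zip(fa, fb):
        row = []
        for za, zb in zip(row_a, row_b):
            c = za.conjugate() * zb
            mag = abs(c)
            # Keep only the phase; drop components that carry no energy.
            row.append(c / mag if mag > 1e-12 else 0j)
        cross.append(row)
    surface = dft2(cross, inverse=True)
    peak, dy, dx = max(
        (value.real, y, x)
        for y, row in enumerate(surface)
        for x, value in enumerate(row)
    )
    return peak, (dy, dx)


def circular_shift(grid, dy, dx):
    """Shift a grid down by dy and right by dx with wrap-around (demo helper)."""
    rows, cols = len(grid), len(grid[0])
    return [[grid[(y - dy) % rows][(x - dx) % cols] for x in range(cols)]
            for y in range(rows)]


# Synthetic stand-in for two DC images of the same scene (assumption).
random.seed(0)
frame_a = [[random.random() for _ in range(6)] for _ in range(6)]
frame_b = circular_shift(frame_a, 1, 2)
peak, shift = phase_correlation(frame_a, frame_b)
```

In a sketch like this, a peak close to 1 suggests the two frames show the same content up to a translation, so a candidate cut between them would likely be rejected, while a low, diffuse peak would support the cut. Any decision threshold would have to be tuned on real data.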
Content-based Digital Video Processing. Digital Videos Segmentation, Retrieval and Interpretation.
Recent research approaches in semantics-based video content analysis require shot boundary detection as the first step to divide video sequences into sections. Furthermore, with advances in networking and computing capability, efficient retrieval of multimedia data has become an important issue. Content-based retrieval technologies have been widely implemented to protect intellectual property rights (IPR). In addition, automatic recognition of highlights in videos is a fundamental and challenging problem for content-based indexing and retrieval applications.
In this thesis, a paradigm is proposed to segment, retrieve and interpret digital videos. Five algorithms are presented to solve the video segmentation task. First, a simple shot cut detection algorithm is designed for real-time implementation. Second, a systematic method is proposed for shot detection using content-based rules and a finite state machine (FSM). Third, shot detection is implemented using local and global indicators. Fourth, a context-awareness approach is proposed to detect shot boundaries. Fifth, a fuzzy logic method is implemented for shot detection. Furthermore, a novel analysis approach is presented for the detection of video copies. It is robust to complicated distortions and capable of locating copied segments inside original videos. Then, objects and events are extracted from MPEG sequences for video highlights indexing and retrieval. Finally, a human fighting detection algorithm is proposed for movie annotation.
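As a rough illustration of what a simple, real-time shot cut detector can look like, the sketch below thresholds the histogram difference between consecutive frames. This is a generic textbook technique used here as an assumption, not the algorithm from the thesis. The function names, the bin count and the threshold are hypothetical, and frames are assumed to be flat lists of 8-bit gray values of equal length.

```python
def gray_histogram(frame, bins=8, max_value=256):
    """Count pixels per intensity bin for a flat list of gray values."""
    hist = [0] * bins
    for pixel in frame:
        hist[pixel * bins // max_value] += 1
    return hist


def detect_cuts(frames, threshold=0.5, bins=8):
    """Return indices i where frame i starts a new shot (hard cut)."""
    if not frames:
        return []
    cuts = []
    prev = gray_histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = gray_histogram(frames[i], bins)
        # Normalized L1 distance: 0 for identical histograms, 1 for disjoint ones.
        diff = sum(abs(p - c) for p, c in zip(prev, cur)) / (2 * len(frames[i]))
        if diff > threshold:
            cuts.append(i)
        prev = cur
    return cuts


# Toy sequence: three dark frames followed by three bright frames.
toy_frames = [[10] * 16] * 3 + [[200] * 16] * 3
toy_cuts = detect_cuts(toy_frames)
```

A fixed threshold of this kind handles abrupt cuts but tends to miss gradual transitions, which is one reason rule-based, context-aware or fuzzy-logic methods such as those listed above are of interest.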
Crowd Scene Analysis in Video Surveillance
Interest in crowd scene analysis for video surveillance is growing, driven by the ubiquitous deployment of video surveillance systems in public places with a high density of objects and by increasing concern about public security and safety. A comprehensive crowd scene analysis approach must not only recognize crowd events and detect abnormal events, but also update its learning model in an online, real-time fashion. To this end, a set of approaches for Crowd Event Recognition (CER) and Abnormal Event Detection (AED) is developed in this thesis.
To address the curse of dimensionality, we propose a video manifold learning method for crowd event analysis. A novel feature descriptor is proposed to encode regional optical flow features of video frames, where adaptive quantization and binarization of the feature code are employed to improve the discriminative ability for crowd motion patterns. Using the feature code as input, a linear dimensionality reduction algorithm that preserves both the intrinsic spatial and temporal properties is proposed, and the generated low-dimensional video manifolds are used for CER and AED.
Moreover, we introduce a framework for AED that integrates a novel incremental and decremental One-Class Support Vector Machine (OCSVM) with a sliding buffer. It not only updates the model online at low computational cost, but also adapts to concept drift by discarding obsolete patterns. Furthermore, the framework is improved by introducing Multiple Incremental and Decremental Learning (MIDL), kernel fusion, and multiple target tracking, which lead to more accurate and faster AED.
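The sliding-buffer idea can be sketched in a few lines. The code below deliberately swaps the OCSVM for a much simpler stand-in, a centroid with a fixed distance radius, so it only illustrates the incremental (add newest sample) and decremental (discard oldest sample) update pattern, not the thesis's classifier. The class name, `capacity` and `radius` are hypothetical.

```python
import math
from collections import deque


class SlidingBufferDetector:
    """Centroid-based novelty detector over a fixed-size sliding buffer.

    Stand-in for an incremental/decremental one-class model (assumption).
    """

    def __init__(self, capacity, radius):
        self.capacity = capacity
        self.radius = radius
        self.buffer = deque()
        self.total = None  # running per-dimension sum of buffered samples

    def update(self, sample):
        """Add the newest sample and discard the oldest once the buffer is full."""
        if self.total is None:
            self.total = [0.0] * len(sample)
        # Incremental step: fold the new sample into the running sum.
        self.buffer.append(sample)
        self.total = [t + v for t, v in zip(self.total, sample)]
        # Decremental step: forget the obsolete sample.
        if len(self.buffer) > self.capacity:
            old = self.buffer.popleft()
            self.total = [t - v for t, v in zip(self.total, old)]

    def is_abnormal(self, sample):
        """Flag samples farther than `radius` from the current centroid."""
        if not self.buffer:
            return False
        n = len(self.buffer)
        centroid = [t / n for t in self.total]
        return math.dist(sample, centroid) > self.radius


detector = SlidingBufferDetector(capacity=3, radius=1.0)
for point in ([0.0, 0.0], [0.2, 0.0], [0.0, 0.2]):
    detector.update(point)
flag_before_drift = detector.is_abnormal([5.0, 5.0])

# Simulated concept drift: the normal pattern moves to around (5, 5).
for _ in range(3):
    detector.update([5.0, 5.0])
flag_after_drift = detector.is_abnormal([5.0, 5.0])
```

Because each update touches only the newest and the oldest sample, its cost does not grow with the length of the stream, and patterns that have dropped out of the buffer no longer influence the decision. That is the behaviour the abstract attributes to discarding obsolete patterns.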
In addition, we develop a framework for another video content analysis task, i.e., shot boundary detection. Specifically, instead of directly assessing the pairwise difference between consecutive frames over time, we propose to evaluate a divergence measure between two OCSVM classifiers trained on two successive frame sets, which is more robust to noise and gradual transitions such as fade-in and fade-out. To speed up the processing procedure, the two OCSVM classifiers are updated online by the MIDL proposed for AED.
Extensive experiments on five benchmark datasets validate the effectiveness and efficiency of our approaches in comparison with the state of the art.