Evaluation of automatic shot boundary detection on a large video test suite
The challenge facing the indexing of digital video information to support browsing and retrieval by users is to design systems that can accurately and automatically process large amounts of heterogeneous video.
The segmentation of video material into shots and scenes is the basic operation in the analysis of video content. This paper presents a detailed evaluation of a histogram-based shot cut detector based on eight hours of TV broadcast video.
Our observation is that the selection of similarity thresholds for determining shot boundaries in such broadcast video is difficult, and necessitates the development of systems that employ adaptive thresholding in order to address the huge variation of characteristics prevalent in TV broadcast video.
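The histogram comparison with adaptive thresholding described above can be sketched as follows. This is a minimal numpy-only illustration, not the paper's actual implementation; the function names, the sliding-window size, and the k multiplier are all assumptions:

```python
import numpy as np

def histogram_differences(frames, bins=16):
    """Normalised grayscale histogram per frame, then L1 distance
    between each pair of consecutive frames."""
    hists = [np.histogram(f, bins=bins, range=(0, 256))[0] / f.size for f in frames]
    return np.array([np.abs(hists[i + 1] - hists[i]).sum() for i in range(len(hists) - 1)])

def adaptive_shot_cuts(diffs, window=10, k=3.0):
    """Flag a cut where a difference exceeds mean + k*std of the
    preceding window, instead of a single global threshold."""
    cuts = []
    for i, d in enumerate(diffs):
        lo, hi = max(0, i - window), i
        ctx = diffs[lo:hi] if hi > lo else diffs[:1]
        if d > ctx.mean() + k * ctx.std():
            cuts.append(i + 1)  # cut lies just before frame i+1
    return cuts
```

The local statistics let the detector track slow lighting or motion changes within a shot, which is the motivation for adaptive thresholding over a fixed global value.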
Video shot boundary detection: seven years of TRECVid activity
Shot boundary detection (SBD) is the process of automatically detecting the boundaries between shots in video. It is a problem which has attracted much attention since video became available in digital form, as it is an essential pre-processing step to almost all video analysis, indexing, summarisation, search, and other content-based operations. Automatic SBD was one of the tracks of activity within the annual TRECVid benchmarking exercise, each year from 2001 to 2007 inclusive. Over those seven years we have seen 57 different research groups from across the world work to determine the best approaches to SBD while using a common dataset and common scoring metrics. In this paper we present an overview of the TRECVid shot boundary detection task, a high-level overview of the most significant of the approaches taken, and a comparison of performances, focussing on one year (2005) as an example.
Circle-based Eye Center Localization (CECL)
We propose an improved eye center localization method based on the Hough transform, called Circle-based Eye Center Localization (CECL), that is simple, robust, and achieves accuracy on a par with typically more complex state-of-the-art methods. The CECL method relies on color and shape cues that distinguish the iris from other facial structures. The accuracy of the CECL method is demonstrated through a comparison with 15 state-of-the-art eye center localization methods against five error thresholds, as reported in the literature. The CECL method achieved an accuracy of 80.8% to 99.4% and ranked first for 2 of the 5 thresholds. It is concluded that the CECL method offers an attractive alternative to existing methods for automatic eye center localization.
Comment: Published and presented at The 14th IAPR International Conference on Machine Vision Applications, 2015. http://www.mva-org.jp/mva2015
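The core of a Hough-transform circle detector, as used by methods like CECL, can be illustrated with a small voting accumulator. This is a generic sketch of circle-center voting for a known radius, not the CECL algorithm itself (which additionally uses color and shape cues); the function name and parameters are assumptions:

```python
import numpy as np

def hough_circle_center(edge_mask, radius):
    """Each edge pixel votes for every candidate center lying `radius`
    away from it; the accumulator maximum is the most-supported center."""
    h, w = edge_mask.shape
    acc = np.zeros((h, w), dtype=np.int32)
    thetas = np.linspace(0, 2 * np.pi, 120, endpoint=False)
    ys, xs = np.nonzero(edge_mask)
    for y, x in zip(ys, xs):
        cy = np.round(y - radius * np.sin(thetas)).astype(int)
        cx = np.round(x - radius * np.cos(thetas)).astype(int)
        ok = (cy >= 0) & (cy < h) & (cx >= 0) & (cx < w)
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    return np.unravel_index(acc.argmax(), acc.shape)  # (row, col) of best center
```

Because every point on the iris boundary votes for the true center, the accumulator peak is robust to partial occlusion (e.g. by eyelids), which is one reason Hough-style voting suits eye center localization.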
Audio and video processing for automatic TV advertisement detection
As a partner in the Centre for Digital Video Processing, the Visual Media Processing Group at Dublin City University conducts research and development in the area of digital video management. The current stage of development is demonstrated in our Web-based digital video system called Físchlár [1,2], which provides for efficient recording, analyzing, browsing and viewing of digitally captured television programmes. In order to make the browsing of programme material more efficient, users have requested the option of automatically deleting advertisement breaks. Our initial work on this task focused on locating ad-breaks by detecting patterns of silent black frames which separate individual advertisements and/or complete ad-breaks on most commercial TV stations. However, not all TV stations use silent, black frames to flag ad-breaks. We therefore decided to attempt to detect advertisements using the rate of shot cuts in the digitised TV signal. This paper describes the implementation and performance of both methods of ad-break detection.
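The first method above, locating runs of frames that are simultaneously black and silent, can be sketched as follows. This is an illustrative outline under assumed inputs (per-frame mean luminance and per-frame audio RMS); the thresholds and function names are not from the paper:

```python
import numpy as np

def black_silent_runs(lum_means, audio_rms, lum_thresh=20.0, rms_thresh=0.01, min_len=3):
    """Return (start, end) frame-index runs where the video is dark
    AND the audio is quiet, i.e. candidate ad-break separators."""
    flags = (np.asarray(lum_means) < lum_thresh) & (np.asarray(audio_rms) < rms_thresh)
    runs, start = [], None
    for i, f in enumerate(flags):
        if f and start is None:
            start = i                       # run of black+silent frames begins
        elif not f and start is not None:
            if i - start >= min_len:        # ignore one-off dark/quiet glitches
                runs.append((start, i))
            start = None
    if start is not None and len(flags) - start >= min_len:
        runs.append((start, len(flags)))
    return runs
```

Requiring both modalities at once is what makes the cue specific: a dark scene with dialogue, or a silent pause over normal picture, does not trigger a detection.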
GraFIX: a semiautomatic approach for parsing low- and high-quality eye-tracking data
Fixation durations (FD) have been used widely as a measurement of information processing and attention. However, issues like data quality can seriously influence the accuracy of the fixation detection methods and, thus, affect the validity of our results (Holmqvist, Nyström, & Mulvey, 2012). This is crucial when studying special populations such as infants, where common issues with testing (e.g., high degree of movement, unreliable eye detection, low spatial precision) result in highly variable data quality and render existing FD detection approaches highly time consuming (hand-coding) or imprecise (automatic detection). To address this problem, we present GraFIX, a novel semiautomatic method consisting of a two-step process in which eye-tracking data is initially parsed by using velocity-based algorithms whose input parameters are adapted by the user and then manipulated using the graphical interface, allowing accurate and rapid adjustments of the algorithms’ outcome. The present algorithms (1) smooth the raw data, (2) interpolate missing data points, and (3) apply a number of criteria to automatically evaluate and remove artifactual fixations. The input parameters (e.g., velocity threshold, interpolation latency) can be easily manually adapted to fit each participant. Furthermore, the present application includes visualization tools that facilitate the manual coding of fixations. We assessed this method by performing an intercoder reliability analysis in two groups of infants presenting low- and high-quality data and compared it with previous methods. Results revealed that our two-step approach with adaptable FD detection criteria gives rise to more reliable and stable measures in low- and high-quality data.
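The velocity-based parsing step at the heart of such pipelines (often called I-VT) can be sketched as below. This is a generic illustration, not GraFIX's implementation; the sampling rate, velocity threshold, and minimum duration are the kind of per-participant parameters the abstract says the user adapts:

```python
import numpy as np

def velocity_fixations(x, y, hz, vel_thresh=30.0, min_dur=0.1):
    """I-VT style parsing: samples whose sample-to-sample gaze velocity
    is below vel_thresh are grouped into fixations; groups shorter than
    min_dur seconds are discarded as artifactual."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    vel = np.hypot(np.diff(x), np.diff(y)) * hz        # units of position per second
    slow = np.concatenate([[True], vel < vel_thresh])  # first sample has no velocity yet
    fixations, start = [], None
    for i, s in enumerate(slow):
        if s and start is None:
            start = i
        elif not s and start is not None:
            if (i - start) / hz >= min_dur:            # enforce minimum fixation duration
                fixations.append((start, i))
            start = None
    if start is not None and (len(slow) - start) / hz >= min_dur:
        fixations.append((start, len(slow)))
    return fixations
```

In noisy infant data the raw velocities are smoothed and gaps interpolated before this step, which is exactly why the thresholds must remain adjustable per participant.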
Adaptive video segmentation
The efficiency of a video indexing technique depends on the efficiency of the video segmentation algorithm, which is a fundamental step in video indexing. Video segmentation is a process of splitting up a video sequence into its constituent scenes. This work focuses on the problem of video segmentation. A content-based approach has been used which segments a video based on the information extracted from the video itself. The main emphasis is on using structural information in the video, such as edges, as they are largely invariant to illumination and motion changes. The edge-based features have been used in conjunction with the intensity-based features in a multi-resolution framework to improve the performance of the segmentation algorithm. To further improve the performance and to reduce the problem of automated choice of parameters, we introduce adaptation in the video segmentation process. (Abstract shortened by UMI.)
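An edge-based feature of the kind this work builds on is the edge change ratio: the fraction of edge pixels that enter or exit between consecutive frames. The sketch below is a minimal numpy illustration under assumed simplifications (a crude gradient-threshold edge map standing in for a real edge detector, and no multi-resolution or intensity combination), not the thesis's actual algorithm:

```python
import numpy as np

def edge_map(frame, thresh=30.0):
    """Crude edge map via gradient-magnitude thresholding (a stand-in
    for a proper edge detector such as Canny)."""
    gy, gx = np.gradient(frame.astype(float))
    return np.hypot(gx, gy) > thresh

def edge_change_ratio(e1, e2):
    """Fraction of edge pixels entering or exiting between two frames;
    near 0 within a shot, near 1 across a cut."""
    entering = np.logical_and(e2, ~e1).sum()
    exiting = np.logical_and(e1, ~e2).sum()
    denom = max(e1.sum(), e2.sum(), 1)
    return max(entering, exiting) / denom
```

Because edge locations change little under gradual lighting shifts, this ratio stays low within a shot while intensity histograms may drift, which is the motivation for combining edge-based and intensity-based features.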