The TREC-2002 video track report
TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930s and the 1970s by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities - 4 from Asia, 9 from Europe, and 4 from the US - participated in one or more of three tasks in the 2002 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework - the tasks, data, and measures - the approaches taken by the participating groups, the results, and issues regarding the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings.
Activity-driven content adaptation for effective video summarisation
In this paper, we present a novel method for content adaptation and video summarization implemented fully in the compressed domain. Firstly, summarization of generic videos is modeled as the process of extracting human objects under various activities/events. Accordingly, frames are classified into five categories via fuzzy decision, including shot changes (cut and gradual transitions), motion activities (camera motion and object motion), and others, using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and the attained frame categories, an activity level is determined for each frame so that the summary adapts to the video content. Consecutive frames belonging to the same category are grouped to form one activity entry as content of interest (COI), which converts the original video into a series of activities. An overall adjustable quota is used to control the size of the generated summary for efficient streaming. Given this quota, the frames selected for the summary are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations demonstrate the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic because domain-specific tasks such as accurate recognition of objects can be avoided.
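The quota-driven selection step described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `sample_by_activity`, the choice of midpoint targets, and the toy activity values are all assumptions made for illustration. The idea shown is only that frames are picked at evenly spaced points along the accumulated activity curve, so high-activity stretches contribute more frames than static ones.

```python
from bisect import bisect_left
from itertools import accumulate


def sample_by_activity(activity, quota):
    """Pick up to `quota` frame indices, evenly spaced in cumulative activity.

    `activity` holds one non-negative activity level per frame (hypothetical
    values; the abstract does not specify how they are computed).
    """
    if quota <= 0 or not activity:
        return []
    cumulative = list(accumulate(activity))
    total = cumulative[-1]
    if total <= 0:
        return []  # no activity anywhere, nothing to prioritise
    selected = []
    for k in range(quota):
        # Assumed sampling rule: midpoint of the k-th equal slice of total activity.
        target = (k + 0.5) * total / quota
        idx = min(bisect_left(cumulative, target), len(activity) - 1)
        # Skip repeats, which occur when one frame dominates the activity.
        if not selected or selected[-1] != idx:
            selected.append(idx)
    return selected


# Static frames at both ends are skipped; samples land in the active middle.
print(sample_by_activity([0, 0, 1, 1, 1, 1, 0, 0], quota=2))
```

With uniform activity the rule degenerates to plain even sampling in time, which is the behaviour one would expect when no part of the video is more eventful than another.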
Event detection based on generic characteristics of field-sports
In this paper, we propose a generic framework for event detection in broadcast video of multiple different field-sports. Features indicating significant events are selected, and robust detectors built. These features are rooted in generic characteristics common to all genres of field-sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested across multiple genres of field-sports including soccer, rugby, hockey and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable
Event detection in field sports video using audio-visual features and a support vector machine
In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable
Personalized video summarization based on group scoring
In this paper, an expert-based model for generating personalized video summaries is proposed. The video frames are initially scored and annotated by multiple video experts. Thereafter, the scores of the video segments to which end users have assigned higher priorities are upgraded. Given the required summary length, the highest-scored video frames are inserted into a personalized final summary. For evaluation purposes, the video summaries generated by our system have been compared against the results from a number of automatic and semi-automatic summarization tools that use different modalities for abstraction.
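A minimal sketch of the scoring pipeline this abstract outlines, under stated assumptions: expert scores are combined by a plain average, the "upgrade" for user-prioritised segments is a multiplicative `boost`, and segment labels are simple strings. None of these choices come from the paper; the names `personalized_summary`, `segment_of`, and `user_priority` are hypothetical.

```python
from statistics import mean


def personalized_summary(expert_scores, segment_of, user_priority, length, boost=1.5):
    """Return the indices of the `length` best frames, in temporal order.

    expert_scores: one list of expert scores per frame.
    segment_of:    segment label for each frame (hypothetical annotation).
    user_priority: set of segment labels the end user cares about.
    boost:         assumed multiplicative upgrade for prioritised segments.
    """
    # Step 1: combine the experts' scores for each frame (assumed: average).
    base = [mean(scores) for scores in expert_scores]
    # Step 2: upgrade frames that fall in user-prioritised segments.
    adjusted = [
        score * (boost if segment_of[i] in user_priority else 1.0)
        for i, score in enumerate(base)
    ]
    # Step 3: keep the highest-scored frames, breaking ties by earlier frame.
    ranked = sorted(range(len(adjusted)), key=lambda i: (-adjusted[i], i))
    return sorted(ranked[:length])


scores = [[1, 3], [4, 4], [2, 2], [5, 3], [1, 1]]
segments = ["intro", "intro", "goal", "goal", "outro"]
print(personalized_summary(scores, segments, {"goal"}, length=3))
```

Returning the chosen frames in temporal order keeps the summary playable as a shortened version of the original video rather than as a ranked list.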
Automatic parsing of sports videos with grammars
Motivated by the analogies between languages and sports videos, we introduce a novel approach for video parsing with grammars. It utilizes compiler techniques for integrating both semantic annotation and syntactic analysis to generate a semantic index of events and a table of contents for a given sports video. The video sequence is first segmented and annotated by event detection with domain knowledge. A grammar-based parser is then used to identify the structure of the video content. Meanwhile, facilities for error handling are introduced, which are particularly useful when the results of automatic parsing need to be adjusted. As a case study, we have developed a system for video parsing in the particular domain of TV diving programs. Experimental results indicate the proposed approach is effectiv
- …