12,811 research outputs found
News story segmentation in the Físchlár video indexing system
This paper presents an approach to segmenting individual news stories in broadcast news programmes. The approach first performs shot boundary detection and keyframe extraction on the programme. Shots are then clustered into groups based on their colour and temporal similarity. The clustering process is controlled using the groups' statistics. After clustering, a set of criteria are applied and groups are successively eliminated in order to converge upon a set of anchorperson groups. The temporal locations of the shots in these anchorperson groups are then used to segment the programme in terms of individual news items. This work is carried out within the context of a complete video indexing, browsing and retrieval syste
Scene extraction in motion pictures
This paper addresses the challenge of bridging the semantic gap between the rich meaning users desire when they query to locate and browse media and the shallowness of media descriptions that can be computed in today\u27s content management systems. To facilitate high-level semantics-based content annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from fill production to determine when a scene change occurs. We then investigate different rules and conventions followed as part of Fill Grammar that would guide and shape an algorithmic solution for determining a scene. Two different techniques using intershot analysis are proposed as solutions in this paper. In addition, we present different refinement mechanisms, such as film-punctuation detection founded on Film Grammar, to further improve the results. These refinement techniques demonstrate significant improvements in overall performance. Furthermore, we analyze errors in the context of film-production techniques, which offer useful insights into the limitations of our method
Dublin City University video track experiments for TREC 2003
In this paper, we describe our experiments for both the News Story Segmentation task and Interactive Search task for
TRECVID 2003. Our News Story Segmentation task involved the use of a Support Vector Machine (SVM) to combine evidence from audio-visual analysis tools in order to generate a listing of news stories from a given news programme. Our
Search task experiment compared a video retrieval system based on text, image and relevance feedback with a text-only
video retrieval system in order to identify which was more effective. In order to do so we developed two variations of our Físchlár video retrieval system and conducted user testing in a controlled lab environment. In this paper we outline our work on both of these two tasks
Automatic genre identification for content-based video categorization
This paper presents a set of computational features originating from our study of editing effects, motion, and color used in videos, for the task of automatic video categorization. These features besides representing human understanding of typical attributes of different video genres, are also inspired by the techniques and rules used by many directors to endow specific characteristics to a genre-program which lead to certain emotional impact on viewers. We propose new features whilst also employing traditionally used ones for classification. This research, goes beyond the existing work with a systematic analysis of trends exhibited by each of our features in genres such as cartoons, commercials, music, news, and sports, and it enables an understanding of the similarities, dissimilarities, and also likely confusion between genres. Classification results from our experiments on several hours of video establish the usefulness of this feature set. We also explore the issue of video clip duration required to achieve reliable genre identification and demonstrate its impact on classification accuracy.<br /
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver
Personalized video summarization based on group scoring
In this paper an expert-based model for generation of personalized video summaries is suggested. The video frames are initially scored and annotated by multiple video experts. Thereafter, the scores for the video segments that have been assigned the higher priorities by end users will be upgraded. Considering the required summary length, the highest scored video frames will be inserted into a personalized final summary. For evaluation purposes, the video summaries generated by our system have been compared against the results from a number of automatic and semi-automatic summarization tools that use different modalities for abstraction
Personalized video summarization by highest quality frames
In this work, a user-centered approach has been the basis for generation of the personalized video summaries. Primarily, the video experts score and annotate the video frames during the enrichment phase. Afterwards, the frames scores for different video segments will be updated based on the captured end-users (different with video experts) priorities towards existing video scenes. Eventually, based on the pre-defined skimming time, the highest scored video frames will be extracted to be included into the personalized video summaries. In order to evaluate the effectiveness of our proposed model, we have compared the video summaries generated by our system against the results from 4 other summarization tools using different modalities
TRECVID 2008 - goals, tasks, data, evaluation mechanisms and metrics
The TREC Video Retrieval Evaluation (TRECVID) 2008 is a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in content-based exploitation of digital video via open, metrics-based evaluation. Over the last 7 years this effort has yielded a
better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. In 2008, 77 teams (see Table 1) from various research organizations --- 24 from
Asia, 39 from Europe, 13 from North America, and 1 from Australia --- participated in one or more of five tasks: high-level feature extraction, search (fully automatic, manually assisted, or interactive), pre-production video (rushes) summarization, copy detection, or surveillance event detection. The copy detection and surveillance event detection tasks are being run for the first time in TRECVID.
This paper presents an overview of TRECVid in 2008
- …