
    Design of Video Retrieval System Using MPEG-7 Descriptors

    The paper proposes a content-based video retrieval system designed using MPEG-7 (Multimedia Content Description Interface), which provides a standard description for a video. The system consists of three parts: shot boundary detection, feature extraction, and similarity measurement. In shot boundary detection, cuts and dissolves are detected using the histogram difference and the skipping image difference, respectively. In the feature extraction part, two MPEG-7 visual descriptors, the Color Structure Descriptor (CSD) and the Edge Histogram Descriptor (EHD), are used to represent the color and edge features of the key frames. Lastly, the similarity between key frames is calculated using a dynamic-weighted feature similarity calculation. The proposed system is tested on three kinds of videos. Promising results are obtained in terms of both effectiveness and efficiency.
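As a rough illustration of cut detection by histogram difference, the sketch below compares intensity histograms of consecutive frames and flags a cut when the difference exceeds a threshold. It is only a minimal sketch, not the paper's implementation: the frame representation (a flat, non-empty sequence of 8-bit gray levels), the bin count, the half-L1 distance, the threshold value, and all function names are assumptions made for the example.

```python
from typing import List, Sequence

# Assumption: a frame is a flat, non-empty sequence of 8-bit gray levels (0..255).
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return a normalized intensity histogram with `bins` equal-width bins."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // 256] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames: Sequence[Frame], threshold: float = 0.5,
                bins: int = 16) -> List[int]:
    """Return each index i where a cut is declared between frame i-1 and frame i."""
    hists = [gray_histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]


# Toy usage: two dark frames followed by two bright frames.
toy_frames = [[10] * 100, [12] * 100, [200] * 100, [205] * 100]
cuts = detect_cuts(toy_frames)
```

A fixed threshold of this kind suits abrupt cuts; gradual transitions such as dissolves change the histogram slowly, which is presumably why the abstract handles them with a separate skipping image difference.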

    Scene Segmentation and Classification

    In this thesis we propose a novel method for video segmentation and classification, which are important tasks in the indexing and retrieval of videos. Video indexing techniques require the video to be segmented effectively into smaller meaningful units called shots. Because of the huge volume of digital data and its dimensionality, indexing the data at the shot level is a difficult task. Scene classification has become a challenging and important problem in recent years because of its usefulness in video indexing. The main issue in video segmentation is the selection of features that are robust to illumination changes and object motion. A shot boundary detection algorithm is proposed that detects both abrupt and gradual transitions simultaneously. Each shot is represented by one or more key frames. A key frame is either a still image from the shot or a cumulative histogram representation that best represents the content of the shot. This work also presents a new method for segmenting videos into scenes. A scene is defined as a sequence of shots that are semantically related. Shots from the same scene have similar color content and background information. The similarity between a pair of shots is the color histogram intersection of the key frames of the two shots. Histogram intersection outputs the count of pixels with similar color in the two frames. A shot similarity matrix of 0's and 1's is computed, which records the similarity between any two shots. Two shots are from the same scene if their similarity is 1; otherwise they are from different scenes. A spectral clustering algorithm is used to identify scene boundaries, and shots belonging to the same scene form a cluster. In the proposed scene detection method, a sequence of similar shots is connected by edges and forms a node, where an edge represents a similarity value of 1 between shots. An SVM classifier is used for scene classification. The experimental results on different datasets show that the proposed algorithms can effectively segment and classify digital videos. Key words: content-based video retrieval, video content analysis, video indexing, shot boundary detection, key frames, scene segmentation, video classification.
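The shot-similarity step described above can be sketched as follows: compute the histogram intersection of two key-frame histograms, threshold it into a 0/1 similarity matrix, and group shots into scenes. This is a hedged sketch under several assumptions. The histograms are normalized (so the intersection is a fraction rather than a pixel count), the threshold of 0.7 is arbitrary, and the grouping uses a simple rule on consecutive shots in place of the spectral clustering named in the abstract. All function names are hypothetical.

```python
from typing import List, Sequence

Histogram = Sequence[float]  # assumption: normalized so the bins sum to 1


def histogram_intersection(h1: Histogram, h2: Histogram) -> float:
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def binary_similarity_matrix(hists: Sequence[Histogram],
                             threshold: float = 0.7) -> List[List[int]]:
    """Entry [i][j] is 1 when shots i and j are similar enough, else 0."""
    n = len(hists)
    return [[1 if histogram_intersection(hists[i], hists[j]) >= threshold else 0
             for j in range(n)]
            for i in range(n)]


def group_into_scenes(sim: Sequence[Sequence[int]]) -> List[List[int]]:
    """Simplified stand-in for clustering: start a new scene whenever
    a shot is not similar to the shot immediately before it."""
    scenes: List[List[int]] = []
    for i in range(len(sim)):
        if scenes and sim[i - 1][i] == 1:
            scenes[-1].append(i)
        else:
            scenes.append([i])
    return scenes


# Toy usage: four key-frame histograms with four color bins each.
key_frame_hists = [
    [0.5, 0.5, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.1, 0.4, 0.5],
]
similarity = binary_similarity_matrix(key_frame_hists)
scenes = group_into_scenes(similarity)
```

The consecutive-shot rule cannot merge shots that alternate between two settings (for example, a dialogue filmed with shot and reverse shot), which is one reason a clustering method over the full matrix may be preferred.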

    The TREC-2002 video track report

    TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930s and the 1970s by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. Seventeen teams representing 5 companies and 12 universities (4 from Asia, 9 from Europe, and 4 from the US) participated in one or more of three tasks in the 2002 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework (the tasks, data, and measures), the approaches taken by the participating groups, the results, and issues regarding the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings.

    Scene extraction in motion pictures

    This paper addresses the challenge of bridging the semantic gap between the rich meaning users desire when they query to locate and browse media and the shallowness of media descriptions that can be computed in today's content management systems. To facilitate high-level semantics-based content annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from film production to determine when a scene change occurs. We then investigate different rules and conventions followed as part of Film Grammar that would guide and shape an algorithmic solution for determining a scene. Two different techniques using intershot analysis are proposed as solutions in this paper. In addition, we present different refinement mechanisms, such as film-punctuation detection founded on Film Grammar, to further improve the results. These refinement techniques demonstrate significant improvements in overall performance. Furthermore, we analyze errors in the context of film-production techniques, which offer useful insights into the limitations of our method.

    TRECVID 2003 - an overview


    TRECVID 2004 - an overview


    Automatic summarization of rushes video using bipartite graphs

    In this paper we present a new approach for automatic summarization of rushes, or unstructured video. Our approach is composed of three major steps. First, based on shot and sub-shot segmentations, we filter sub-shots with low information content not likely to be useful in a summary. Second, a method using maximal matching in a bipartite graph is adapted to measure similarity between the remaining shots and to minimize inter-shot redundancy by removing repetitive retake shots common in rushes video. Finally, the presence of faces and motion intensity are characterised in each sub-shot. A measure of how representative the sub-shot is in the context of the overall video is then proposed. Video summaries composed of keyframe slideshows are then generated. In order to evaluate the effectiveness of this approach we re-run the evaluation carried out by TRECVid, using the same dataset and evaluation metrics used in the TRECVid video summarization task in 2007 but with our own assessors. Results show that our approach leads to a significant improvement on our own work in terms of the fraction of the TRECVid summary ground truth included and is competitive with the best of other approaches in TRECVid 2007
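One way to picture the bipartite-matching step is sketched below: the frames of two shots form the two sides of a bipartite graph, an edge joins two frames judged similar, and the size of a maximum matching, normalized by the shorter shot, serves as a shot-to-shot similarity for spotting retakes. The abstract does not give the paper's exact graph construction, so the edge rule, the normalization, the threshold, and every name here are assumptions made for illustration. The matching itself uses the standard augmenting-path method.

```python
from typing import Callable, List, Sequence, Set, TypeVar

F = TypeVar("F")  # frame feature type; left abstract on purpose


def maximum_bipartite_matching(adj: Sequence[Sequence[int]], n_right: int) -> int:
    """Size of a maximum matching; adj[u] lists right-side nodes adjacent to u."""
    match_right = [-1] * n_right  # left node currently matched to each right node

    def try_augment(u: int, visited: Set[int]) -> bool:
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can be re-matched.
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    return sum(1 for u in range(len(adj)) if try_augment(u, set()))


def shot_similarity(shot_a: Sequence[F], shot_b: Sequence[F],
                    frame_sim: Callable[[F, F], float],
                    threshold: float = 0.8) -> float:
    """Fraction of the shorter shot's frames that can be paired one-to-one
    with similar frames of the other shot (hypothetical normalization)."""
    if not shot_a or not shot_b:
        return 0.0
    adj: List[List[int]] = [
        [j for j, fb in enumerate(shot_b) if frame_sim(fa, fb) >= threshold]
        for fa in shot_a
    ]
    matched = maximum_bipartite_matching(adj, len(shot_b))
    return matched / min(len(shot_a), len(shot_b))


# Toy usage: frames are integer labels; two frames are similar only when equal.
def exact(a: int, b: int) -> float:
    return 1.0 if a == b else 0.0


similarity = shot_similarity([1, 2, 3], [3, 1, 7], exact)
```

Under this sketch, a pair of shots whose similarity exceeds some cutoff would be treated as retakes of each other and only one of them kept, which is the redundancy-removal idea the abstract describes.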

    Video shot boundary detection: seven years of TRECVid activity

    Shot boundary detection (SBD) is the process of automatically detecting the boundaries between shots in video. It is a problem which has attracted much attention since video became available in digital form as it is an essential pre-processing step to almost all video analysis, indexing, summarisation, search, and other content-based operations. Automatic SBD was one of the tracks of activity within the annual TRECVid benchmarking exercise, each year from 2001 to 2007 inclusive. Over those seven years we have seen 57 different research groups from across the world work to determine the best approaches to SBD while using a common dataset and common scoring metrics. In this paper we present an overview of the TRECVid shot boundary detection task, a high-level overview of the most significant of the approaches taken, and a comparison of performances, focussing on one year (2005) as an example