454 research outputs found

    Indexing, browsing and searching of digital video

    Get PDF
    Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver

    Video shot boundary detection: seven years of TRECVid activity

    Get PDF
    Shot boundary detection (SBD) is the process of automatically detecting the boundaries between shots in video. It is a problem which has attracted much attention since video became available in digital form as it is an essential pre-processing step to almost all video analysis, indexing, summarisation, search, and other content-based operations. Automatic SBD was one of the tracks of activity within the annual TRECVid benchmarking exercise, each year from 2001 to 2007 inclusive. Over those seven years we have seen 57 different research groups from across the world work to determine the best approaches to SBD while using a common dataset and common scoring metrics. In this paper we present an overview of the TRECVid shot boundary detection task, a high-level overview of the most significant of the approaches taken, and a comparison of performances, focussing on one year (2005) as an example

    DC-image for real time compressed video matching

    Get PDF
    This chapter presents a suggested framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without full decompression. In addition, the relevant arguments and supporting evidences are discussed. Several local feature detectors will be examined to select the best for matching using the DC-image. Two experiments are carried to support the above. The first is comparing between the DC-image and I-frame, in terms of matching performance and computation complexity. The second experiment compares between using local features and global features regarding compressed video matching with respect to the DC-image. The results confirmed that the use of DC-image, despite its highly reduced size, it is promising as it produces higher matching precision, compared to the full I-frame. Also, SIFT, as a local feature, outperforms most of the standard global features. On the other hand, its computation complexity is relatively higher, but it is still within the real-time margin which leaves a space for further optimizations that can be done to improve this computation complexity

    Compressed-domain shot boundary detection for H.264/AVC using intra partitioning maps

    Get PDF
    In this paper, a novel technique for shot boundary detection operating on H.264/AVC-compressed sequences is presented. Due to new and improved coding tools in H.264/AVC, the characteristics of the obtained sequences differ from former video coding standards. Although several algorithms working on this new standard are already proposed, the presence of IDR frames can still lead to a low accuracy for abrupt transitions. To solve this issue, we present the motion-compensated intra partitioning map which relies on the intra partitioning modes and the motion vectors present in the compressed video stream. Experimental results show that this motion-compensated map achieves a high accuracy and exceeds related work

    Highly efficient low-level feature extraction for video representation and retrieval.

    Get PDF
    PhDWitnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current Content Based Video Indexing and Retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from the compressed domain features and the robust scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm that runs in real time maintaining the high precision and recall of the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords

    Video Indexing and Retrieval Techniques Using Novel Approaches to Video Segmentation, Characterization, and Similarity Matching

    Get PDF
    Multimedia applications are rapidly spread at an ever-increasing rate introducing a number of challenging problems at the hands of the research community, The most significant and influential problem, among them, is the effective access to stored data. In spite of the popularity of keyword-based search technique in alphanumeric databases, it is inadequate for use with multimedia data due to their unstructured nature. On the other hand, a number of content-based access techniques have been developed in the context of image indexing and retrieval; meanwhile video retrieval systems start to gain wide attention, This work proposes a number of techniques constituting a fully content-based system for retrieving video data. These techniques are primarily targeting the efficiency, reliability, scalability, extensibility, and effectiveness requirements of such applications. First, an abstract representation of the video stream, known as the DC sequence, is extracted. Second, to deal with the problem of video segmentation, an efficient neural network model is introduced. The novel use of the neural network improves the reliability while the efficiency is achieved through the instantaneous use of the recall phase to identify shot boundaries. Third, the problem of key frames extraction is addressed using two efficient algorithms that adapt their selection decisions based on the amount of activity found in each video shot enabling the selection of a near optimal expressive set of key frames. Fourth, the developed system employs an indexing scheme that supports two low-level features, color and texture, to represent video data, Finally, we propose, in the retrieval stage, a novel model for performing video data matching task that integrates a number of human-based similarity factors. All our software implementations are in Java, which enables it to be used across heterogeneous platforms. The retrieval system performance has been evaluated yielding a very good retrieval rate and accuracy, which demonstrate the effectiveness of the developed system

    Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project

    Get PDF
    The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners and end users, in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include: content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, semantic -web, MPEG-7 and MPEG-21 standards, user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated in the SCHEMA module-based, expandable reference system

    Top-Down and Bottom-Up Semantic Indexing of Multimedia

    Get PDF
    The aim of this work consists in proposing a dual approach for the sake of semantic indexing of audio-visual documents. We present two dierent algorithms based respectively on a bottom-up and a top-down strategy. Considering the top-down approach, we propose an algorithm which implements a nite-state machine and uses low-level motion indices extracted from an MPEG compressed bit-stream. Simulation results show that the proposed method can eectively detect the presence of relevant events in sport programs. Using the bottom-up approach, the indexing is performed by means of Hidden Markov Models (HMM), with an innovative approach: the input signal is considered as a non-stationary stochastic process, modeled by a HMM in which each state is associated with a different property of audio-visual material. Several samples from the MPEG-7 content set have been analyzed using the proposed scheme, demonstrating the performance of the overall approach to provide insights about the content of audio-visual programmes. Moreover, what appears quite attractive instead is to use low-level descriptors in providing a feedback for non-expert users of the content of the described audio-visual programme. The experiments have demonstrated that, by adequate visualization or presentation, low-level features carry instantly semantic information about the programme content, given a certain programme category, which may thus help the viewer to use such low-level information for navigation or retrieval of relevant events
    • 

    corecore