
    Ridiculously Fast Shot Boundary Detection with Fully Convolutional Neural Networks

    Shot boundary detection (SBD) is an important component of many video analysis tasks, such as action recognition, video indexing, summarization and editing. Previous work typically used a combination of low-level features, such as color histograms, together with simple models such as SVMs. Instead, we propose to learn shot detection end-to-end, from pixels to final shot boundaries. For training such a model, we rely on our insight that all shot boundaries are generated. Thus, we create a dataset with one million frames and automatically generated transitions such as cuts, dissolves and fades. In order to efficiently analyze hours of video, we propose a Convolutional Neural Network (CNN) which is fully convolutional in time, which allows the use of a large temporal context without the need to repeatedly process frames. With this architecture our method obtains state-of-the-art results while running at an unprecedented speed of more than 120x real-time.
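
    The idea of automatically generated transitions can be illustrated with a minimal sketch. It is not the authors' data pipeline: the helper names `cut` and `dissolve`, the linear blending weights, and the representation of a frame as a flat list of pixel intensities are all assumptions made for illustration.

```python
# Illustrative sketch only: `cut` and `dissolve` are hypothetical helpers,
# and a frame is assumed to be a flat list of pixel intensities.

def cut(shot_a, shot_b):
    """Join two shots with a hard cut; label 1 marks the first frame of shot_b."""
    if not shot_a or not shot_b:
        raise ValueError("both shots must contain at least one frame")
    frames = shot_a + shot_b
    labels = [0] * len(frames)
    labels[len(shot_a)] = 1
    return frames, labels


def dissolve(shot_a, shot_b, length):
    """Blend the last `length` frames of shot_a into the first `length` of shot_b."""
    if length < 1 or length > min(len(shot_a), len(shot_b)):
        raise ValueError("length must fit inside both shots")
    blended = []
    for i in range(length):
        alpha = (i + 1) / (length + 1)  # weight of shot_b, strictly between 0 and 1
        frame_a = shot_a[len(shot_a) - length + i]
        frame_b = shot_b[i]
        blended.append([(1 - alpha) * pa + alpha * pb
                        for pa, pb in zip(frame_a, frame_b)])
    frames = shot_a[:-length] + blended + shot_b[length:]
    # Label 1 marks every frame that belongs to the synthetic transition.
    labels = ([0] * (len(shot_a) - length) + [1] * length
              + [0] * (len(shot_b) - length))
    return frames, labels
```

    Under the same assumptions, a fade could be produced by dissolving a shot into (or out of) a shot of all-black frames. Because the transitions are synthesized, the labels come for free, which is what makes a large training set cheap to build.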

    Using selfsupervised algorithms for video analysis and scene detection

    With the growing volume of available audiovisual content, well-ordered and effective management of video is desired, and therefore automatic and accurate solutions for video indexing and retrieval are needed. Self-supervised learning algorithms with 3D convolutional neural networks are a promising solution for these tasks, thanks to their independence from human annotations and their suitability for identifying spatio-temporal features. This work presents a self-supervised algorithm for the analysis of video shots, implemented in two stages: (1) an algorithm that generates pseudo-labels for 20-frame samples with different automatically generated shot transitions (Hardcuts/Cropcuts, Dissolves, Fades in/out, Wipes) and (2) a trained fully convolutional 3D network that achieves an overall accuracy greater than 97% on the test set. The implemented model is based on [5] and improves the detection of long, smooth transitions by using a larger temporal context. The detected transitions are centered on the 10th and 11th frames of a 20-frame input window.
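
    The centering convention described above can be made concrete with a small sketch for the hard-cut case. The abstract does not say how the pseudo-labels are assigned, so the functions `window_label` and `pseudo_labels` and the convention that `cut_index` is the 0-based index of the first frame of the new shot are assumptions for illustration.

```python
WINDOW = 20  # frames per sample, as stated in the abstract


def window_label(start, cut_index, window=WINDOW):
    """Return 1 if the hard cut lies between the 10th and 11th frames of the window.

    `start` is the 0-based index of the window's first frame and `cut_index`
    is the 0-based index of the first frame of the new shot (assumed convention).
    """
    return 1 if cut_index == start + window // 2 else 0


def pseudo_labels(num_frames, cut_index, window=WINDOW):
    """Label every full window of a clip that contains a single hard cut."""
    return [(start, window_label(start, cut_index, window))
            for start in range(num_frames - window + 1)]
```

    For a 40-frame clip whose new shot begins at frame index 25, only the window starting at index 15 is labeled positive: its 10th frame (index 24) is the last frame of the old shot and its 11th frame (index 25) is the first frame of the new one.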

    Wipe scene change detection in video sequences

    TRECVID 2004 - an overview

    TRECVID 2007 - Overview

    Language-based multimedia information retrieval

    This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE built on subtitles or captions as the prime language key for disclosing video fragments, OLIVE makes use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.

    Debris/ice/TPS assessment and integrated photographic analysis for Shuttle Mission STS-54

    A Debris/Ice/TPS assessment and integrated photographic analysis was conducted for Shuttle Mission STS-54. Debris inspections of the flight elements and launch pad were performed before and after launch. Ice/frost conditions on the External Tank were assessed by the use of computer programs, nomographs, and infrared scanner data during cryogenic loading of the vehicle, followed by on-pad visual inspection. High speed photography was analyzed after launch to identify ice/debris sources and evaluate potential vehicle damage and/or in-flight anomalies. This report documents the debris/ice/TPS conditions and integrated photographic analysis of Shuttle Mission STS-54, and the resulting effect on the Space Shuttle Program.

    Adaptive video segmentation

    The efficiency of a video indexing technique depends on the efficiency of the video segmentation algorithm, which is a fundamental step in video indexing. Video segmentation is the process of splitting a video sequence into its constituent scenes. This work focuses on the problem of video segmentation. A content-based approach has been used, which segments a video based on information extracted from the video itself. The main emphasis is on using structural information in the video, such as edges, since they are largely invariant to illumination and motion changes. The edge-based features have been used in conjunction with the intensity-based features in a multi-resolution framework to improve the performance of the segmentation algorithm. To further improve the performance and to reduce the problem of automated choice of parameters, we introduce adaptation in the video segmentation process. (Abstract shortened by UMI.)
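
    One common way to avoid a hand-tuned global threshold is to compare each frame-difference score with statistics of its neighbours. The sketch below shows that generic idea only; the abstract does not describe the adaptation scheme actually used, and the function name `adaptive_boundaries`, the parameters `radius` and `k`, and the assumption that `diffs` already holds one edge- or intensity-based difference score per frame pair are all hypothetical.

```python
from statistics import mean, pstdev


def adaptive_boundaries(diffs, radius=5, k=3.0):
    """Flag positions whose difference score stands out from its neighbourhood.

    `diffs[i]` is assumed to be a precomputed dissimilarity between frame i
    and frame i + 1. A position is flagged when its score exceeds the local
    mean plus `k` local standard deviations, computed over up to `radius`
    neighbours on each side (the position itself is excluded).
    """
    boundaries = []
    for i, score in enumerate(diffs):
        lo = max(0, i - radius)
        hi = min(len(diffs), i + radius + 1)
        neighbours = diffs[lo:i] + diffs[i + 1:hi]
        if len(neighbours) < 2:
            continue  # not enough context to estimate local statistics
        threshold = mean(neighbours) + k * pstdev(neighbours)
        if score > threshold:
            boundaries.append(i)
    return boundaries
```

    Because the threshold follows the local activity level, a busy action sequence raises it and a static sequence lowers it, so the same `k` can serve both without retuning.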

    Debris/ice/TPS assessment and photographic analysis for shuttle mission STS-31R

    A Debris/Ice/Thermal Protection System (TPS) assessment and photographic analysis was conducted for Space Shuttle Mission STS-31R. Debris inspections of the flight elements and launch pad are performed before and after launch. Ice/frost conditions on the External Tank are assessed by the use of computer programs, nomographs, and infrared scanner data during cryogenic loading of the vehicle, followed by on-pad visual inspection. High speed photography is analyzed after launch to identify ice/debris sources and evaluate potential vehicle damage and/or in-flight anomalies. The debris/ice/TPS conditions and photographic analysis of Mission STS-31R are presented along with their overall effect on the Space Shuttle Program.