1,770 research outputs found
Ridiculously Fast Shot Boundary Detection with Fully Convolutional Neural Networks
Shot boundary detection (SBD) is an important component of many video
analysis tasks, such as action recognition, video indexing, summarization and
editing. Previous work typically used a combination of low-level features like
color histograms, in conjunction with simple models such as SVMs. Instead, we
propose to learn shot detection end-to-end, from pixels to final shot
boundaries. For training such a model, we rely on our insight that all shot
boundaries are generated. Thus, we create a dataset with one million frames and
automatically generated transitions such as cuts, dissolves and fades. In order
to efficiently analyze hours of videos, we propose a Convolutional Neural
Network (CNN) which is fully convolutional in time, thus allowing the use of a
large temporal context without the need to repeatedly process frames. With
this architecture our method obtains state-of-the-art results while running at
an unprecedented speed of more than 120x real-time.
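The key efficiency idea above is a network that is fully convolutional in time: a single pass over the whole frame sequence scores every temporal window, instead of re-running the model once per window. A minimal NumPy sketch of that idea, using an invented per-frame feature and a hand-picked difference kernel purely for illustration (not the paper's actual architecture):

```python
import numpy as np

def temporal_conv(features, kernel):
    """Valid 1D convolution over the time axis: one score per temporal window."""
    T, k = len(features), len(kernel)
    return np.array([np.dot(features[t:t + k], kernel) for t in range(T - k + 1)])

# Toy input: a per-frame feature (e.g., mean intensity) for 10 frames,
# with a hard cut between frames 4 and 5.
frames = np.array([0., 0., 0., 0., 0., 1., 1., 1., 1., 1.])

# A simple temporal-difference kernel responds strongly at the cut.
kernel = np.array([-1., 0., 1.])

# One pass yields a score for every window; the peak marks windows
# straddling the cut.
responses = temporal_conv(frames, kernel)
cut = np.argmax(np.abs(responses))
```

Because the convolution slides across the full sequence in one pass, each frame contributes to every overlapping window without being re-processed, which is where this style of architecture gets its speed.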
Using self-supervised algorithms for video analysis and scene detection
With the increasing availability of audiovisual content, well-ordered and effective management of video is desired, and therefore automatic and accurate solutions for video indexing and retrieval are needed. Self-supervised learning algorithms with 3D convolutional neural networks are a promising solution for these tasks, thanks to their independence from human annotations and their suitability for identifying spatio-temporal features. This work presents a self-supervised algorithm for the analysis of video shots, accomplished by a two-stage implementation: (1) an algorithm that generates pseudo-labels for 20-frame samples with different automatically generated shot transitions (hard cuts/crop cuts, dissolves, fades in/out, wipes), and (2) a fully convolutional 3D network trained to an overall accuracy greater than 97% on the test set. The implemented model is based on [5], improving the detection of long smooth transitions by using a larger temporal context. The detected transitions are centered on the 10th and 11th frames of the 20-frame input window.
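The pseudo-label stage described above synthesizes labeled transitions from unlabeled footage. A hedged NumPy sketch of two such generators, a hard cut and a dissolve, on toy grayscale clips; the function names, clip shapes, and parameters are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def make_dissolve(clip_a, clip_b, length):
    """Alpha-blend the tail of clip_a into the head of clip_b over `length` frames."""
    alphas = np.linspace(0.0, 1.0, length)[:, None, None]  # (length, 1, 1) for broadcasting
    return (1 - alphas) * clip_a[-length:] + alphas * clip_b[:length]

def make_hard_cut(clip_a, clip_b, t):
    """Switch instantly from clip_a to clip_b at frame index t."""
    return np.concatenate([clip_a[:t], clip_b[t:]], axis=0)

# Toy grayscale clips: 20 frames of 4x4 pixels from two "shots".
a = np.zeros((20, 4, 4))
b = np.ones((20, 4, 4))

dissolve = make_dissolve(a, b, 6)   # sample labeled "dissolve"
cut_clip = make_hard_cut(a, b, 10)  # sample labeled "hard cut", centered in the window
```

Since the generator knows exactly where and what kind of transition it inserted, every synthesized sample comes with a free, perfectly accurate label, which is what makes the training self-supervised.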
Language-based multimedia information retrieval
This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE was building on subtitles or captions as the prime language key for disclosing video fragments, OLIVE is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.
Debris/ice/TPS assessment and integrated photographic analysis for Shuttle Mission STS-54
A Debris/Ice/TPS assessment and integrated photographic analysis was conducted for Shuttle Mission STS-54. Debris inspections of the flight elements and launch pad were performed before and after launch. Ice/frost conditions on the External Tank were assessed by the use of computer programs, nomographs, and infrared scanner data during cryogenic loading of the vehicle followed by on-pad visual inspection. High speed photography was analyzed after launch to identify ice/debris sources and evaluate potential vehicle damage and/or in-flight anomalies. This report documents the debris/ice/TPS conditions and integrated photographic analysis of Shuttle Mission STS-54, and the resulting effect on the Space Shuttle Program.
Adaptive video segmentation
The efficiency of a video indexing technique depends on the efficiency of the video segmentation algorithm, which is a fundamental step in video indexing. Video segmentation is a process of splitting up a video sequence into its constituent scenes. This work focuses on the problem of video segmentation. A content-based approach has been used which segments a video based on the information extracted from the video itself. The main emphasis is on using structural information in the video such as edges, as they are largely invariant to illumination and motion changes. The edge-based features have been used in conjunction with the intensity-based features in a multi-resolution framework to improve the performance of the segmentation algorithm. To further improve the performance and to reduce the problem of automated choice of parameters, we introduce adaptation in the video segmentation process. (Abstract shortened by UMI.)
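The edge-based cue described above can be illustrated with a simple edge-change ratio between consecutive frames. The sketch below is an assumption-laden toy (crude gradient-threshold edges, hand-picked threshold, synthetic frames), not the dissertation's actual feature set:

```python
import numpy as np

def edge_map(frame, thresh=0.4):
    # Binary edge map from gradient magnitude; a crude stand-in for a
    # real edge detector, chosen for illustration only.
    gy, gx = np.gradient(frame.astype(float))
    return np.hypot(gx, gy) > thresh

def edge_change(f1, f2):
    # Fraction of edge pixels that appear or disappear between two frames.
    # Edges are largely invariant to illumination and motion changes, so a
    # spike in this ratio suggests a shot boundary rather than ordinary motion.
    e1, e2 = edge_map(f1), edge_map(f2)
    total = e1.sum() + e2.sum()
    return 0.0 if total == 0 else float(np.logical_xor(e1, e2).sum()) / float(total)

# Toy frames: one containing a vertical edge, one blank.
with_edge = np.zeros((8, 8))
with_edge[:, 4:] = 1.0
blank = np.zeros((8, 8))

score_cut = edge_change(with_edge, blank)       # near 1.0: likely shot boundary
score_same = edge_change(with_edge, with_edge)  # 0.0: same shot
```

Thresholding the ratio over time is one natural place where the adaptation the abstract mentions could enter, e.g. by setting the decision threshold from recent frame statistics rather than a fixed constant.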
Debris/ice/TPS assessment and photographic analysis for shuttle mission STS-31R
A Debris/Ice/Thermal Protection System (TPS) assessment and photographic analysis was conducted for Space Shuttle Mission STS-31R. Debris inspections of the flight elements and launch pad were performed before and after launch. Ice/frost conditions on the External Tank were assessed by the use of computer programs, nomographs, and infrared scanner data during cryogenic loading of the vehicle, followed by on-pad visual inspection. High speed photography was analyzed after launch to identify ice/debris sources and evaluate potential vehicle damage and/or in-flight anomalies. The debris/ice/TPS conditions and photographic analysis of Mission STS-31R are presented along with their overall effect on the Space Shuttle Program.