Video browsing interfaces and applications: a review

Author: Boeszoermenyi L.
Hopfgartner F.
Jose J.
Marques O.
Schoeffmann K.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/02/2010

We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data (which, if presented in its raw format, is rather unwieldy and costly) have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other.

Enlighten

White Rose Research Online

Towards key-frame extraction methods for 3D video: a review

Author: Lino Ferreira
Luis A. da Silva Cruz
Pedro Assuncao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The increasing rate of creation and use of 3D video content leads to a pressing need for methods capable of lowering the cost of 3D video searching, browsing and indexing operations, with improved content selection performance. Video summarisation methods specifically tailored for 3D video content fulfil these requirements. This paper presents a review of the state of the art in a crucial component of 3D video summarisation algorithms: the key-frame extraction methods. The methods reviewed cover 3D video key-frame extraction as well as shot boundary detection methods specific to 3D video. The performance metrics used to evaluate the key-frame extraction methods and the summaries derived from those key-frames are presented and discussed. The applications of these methods are also presented and discussed, followed by a discussion of current research challenges in 3D video summarisation.

Crossref

Estudo Geral

RUSHES—an annotation and retrieval engine for multimedia semantic units

Author: Abdul H. Sadka
Benini Sergio
Ebroul Izquierdo
Ingo Feldmann
Isabel Alonso Mediavilla
Leonardi Riccardo
Mohammad Rafiq Swash
Oliver Schreer
Pedro Concejero
Tijana Janjusevic
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Multimedia analysis and reuse of raw, unedited audiovisual content, known as rushes, is gaining acceptance among a large number of research labs and companies. Several European-funded research projects address multimedia indexing, annotation, search and retrieval, but only the FP6 project RUSHES focuses on automatic semantic annotation, indexing and retrieval of raw, unedited audiovisual content. Professional content creators and providers, as well as home users, deal with this type of content, and novel technologies for semantic search and retrieval are therefore required. In this paper, we present a summary of the most relevant achievements of the RUSHES project, focusing on specific approaches for automatic annotation as well as the main features of the final RUSHES search engine.

Fraunhofer-ePrints

Archivio istituzionale della ricerca - Università di Brescia

UntrimmedNets for Weakly Supervised Action Recognition and Detection

Author: Lin Dahua
Van Gool Luc
Wang Limin
Xiong Yuanjun
Publication date: 22/05/2017

Current action recognition methods rely heavily on trimmed videos for model training. However, it is expensive and time-consuming to acquire a large-scale trimmed video dataset. This paper presents a new weakly supervised architecture, called UntrimmedNet, which is able to directly learn action recognition models from untrimmed videos without the requirement of temporal annotations of action instances. Our UntrimmedNet couples two important components, the classification module and the selection module, to learn the action models and reason about the temporal duration of action instances, respectively. These two components are implemented with feed-forward networks, and UntrimmedNet is therefore an end-to-end trainable architecture. We exploit the learned models for action recognition (WSR) and detection (WSD) on the untrimmed video datasets of THUMOS14 and ActivityNet. Although our UntrimmedNet only employs weak supervision, our method achieves performance superior or comparable to that of strongly supervised approaches on these two datasets.
Comment: camera-ready version to appear in CVPR201

arXiv.org e-Print Archive

Crossref
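
The UntrimmedNets abstract above describes a classification module coupled with a selection module. The following is only a minimal sketch of one way such a coupling could work, assuming the selection module emits one attention logit per clip and the video-level prediction is the attention-weighted sum of per-clip class scores. The soft weighting, the function names, and the toy numbers are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def video_level_scores(clip_class_scores, clip_selection_logits):
    """Combine per-clip class scores into one video-level score vector.

    clip_class_scores: list of per-clip lists, one score per class
        (hypothetical output of a classification module).
    clip_selection_logits: one float per clip
        (hypothetical output of a selection module).
    """
    if len(clip_class_scores) != len(clip_selection_logits):
        raise ValueError("need exactly one selection logit per clip")
    # Assumption: soft selection, i.e. a softmax over clips.
    weights = softmax(clip_selection_logits)
    num_classes = len(clip_class_scores[0])
    video_scores = [0.0] * num_classes
    for weight, scores in zip(weights, clip_class_scores):
        for c in range(num_classes):
            video_scores[c] += weight * scores[c]
    return weights, video_scores


# Toy input: 3 clips, 2 classes; the second clip has the largest logit.
weights, scores = video_level_scores(
    [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]],
    [0.0, 2.0, 0.0],
)
```

In this toy input the second clip dominates the weighting, so the video-level scores lean toward the class that clip favors. Because only a video-level label is needed to train on such an aggregated score, the per-clip weights can double as a rough indication of where the action occurs.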

Using selfsupervised algorithms for video analysis and scene detection

Author: Mora Arias Juan Felipe
Publication venue: Universitat Politècnica de Catalunya
Publication date: 29/10/2020

With the growing amount of available audiovisual content, well-ordered and effective management of video is desirable, and automatic, accurate solutions for video indexing and retrieval are therefore needed. Self-supervised learning algorithms with 3D convolutional neural networks are a promising solution for these tasks, thanks to their independence from human annotations and their suitability for identifying spatio-temporal features. This work presents a self-supervised algorithm for the analysis of video shots, implemented in two stages: (1) an algorithm that generates pseudo-labels for 20-frame samples with different automatically generated shot transitions (Hardcuts/Cropcuts, Dissolves, Fades in/out, Wipes) and (2) a trained fully convolutional 3D network with an overall accuracy greater than 97% on the test set. The implemented model is based on [5], improving the detection of long, smooth transitions by using a larger temporal context. The detected transitions are centered on the 10th and 11th frames of a 20-frame input window.

UPCommons. Portal del coneixement obert de la UPC
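
The first stage described in the self-supervised shot analysis abstract above (pseudo-labelled 20-frame samples containing synthetic transitions centered between the 10th and 11th frames) might be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the thesis code: frames are modelled as flat lists of pixel values, only hard cuts and linear dissolves are covered, and the names `make_sample`, `TRANSITION_LABELS`, and `dissolve_len` are hypothetical.

```python
# Hypothetical label ids; the abstract lists more transition types.
TRANSITION_LABELS = {"hardcut": 0, "dissolve": 1}


def blend(frame_a, frame_b, alpha):
    """Linear mix of two frames: alpha=0 gives frame_a, alpha=1 gives frame_b."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(frame_a, frame_b)]


def make_sample(shot_a, shot_b, kind, window=20, dissolve_len=6):
    """Build one pseudo-labelled sample with a transition centered in the window.

    shot_a, shot_b: lists of at least `window` frames, each frame a flat
        list of pixel values (an assumed, simplified frame format).
    kind: "hardcut" or "dissolve".
    Returns (frames, label).
    """
    if kind not in TRANSITION_LABELS:
        raise ValueError("unsupported transition kind: " + kind)
    if dissolve_len % 2 != 0 or dissolve_len > window:
        raise ValueError("dissolve_len must be even and fit in the window")
    half = window // 2
    if kind == "hardcut":
        start = end = half  # switch shots between frame 10 and frame 11
    else:
        start = half - dissolve_len // 2
        end = start + dissolve_len
    frames = []
    for t in range(window):
        if t < start:
            alpha = 0.0
        elif t >= end:
            alpha = 1.0
        else:
            # Ramp strictly between 0 and 1, symmetric about the window center.
            alpha = (t - start + 1) / (dissolve_len + 1)
        frames.append(blend(shot_a[t], shot_b[t], alpha))
    return frames, TRANSITION_LABELS[kind]


# Toy shots: an all-black shot and an all-white shot, one pixel per frame.
black = [[0.0] for _ in range(20)]
white = [[1.0] for _ in range(20)]
cut_frames, cut_label = make_sample(black, white, "hardcut")
dis_frames, dis_label = make_sample(black, white, "dissolve")
```

Because the transition type and position are chosen by the generator, the label comes for free, which is what makes the setup self-supervised. Fades and wipes could presumably be generated in the same way by changing how `alpha` varies over time or across pixel positions.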