
    Video summarisation: A conceptual framework and survey of the state of the art

    Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation, derived from the research literature and used as a means of surveying it. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (the outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly unobtrusively sourced user-based information, in order to overcome longstanding challenges such as the semantic gap and to provide video summaries with greater relevance to individual users.
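    As a rough illustration only, the framework's two axes (technique category; summary content basis and consumption functionality) can be rendered as a small data model. The sketch below is a hypothetical Python encoding of the taxonomy described above, not anything from the paper.

```python
from dataclasses import dataclass
from enum import Enum

class TechniqueCategory(Enum):
    INTERNAL = "internal"   # analyses information sourced directly from the video stream
    EXTERNAL = "external"   # analyses information not sourced from the video stream
    HYBRID = "hybrid"       # combines internal and external information

class ContentBasis(Enum):
    OBJECT = "object"
    EVENT = "event"
    PERCEPTION = "perception"
    FEATURE = "feature"

@dataclass
class VideoSummary:
    content_basis: ContentBasis
    interactive: bool    # interactive vs. static consumption
    personalised: bool   # personalised vs. generic

@dataclass
class SummarisationTechnique:
    category: TechniqueCategory
    produces: VideoSummary
```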

    Video summarization by group scoring

    In this paper, a new model for user-centered video summarization is presented. Its main use case is the involvement of more than one expert in generating the final video summary. The approach consists of three major steps. First, the video frames are scored by a group of operators. Next, the assigned scores are averaged to produce a single score for each frame. Finally, the highest-scoring video frames, alongside the corresponding audio and textual content, are extracted and inserted into the summary. The effectiveness of this approach has been evaluated by comparing the video summaries generated by this system against the results of a number of automatic summarization tools that use different modalities for abstraction.
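    The three steps above map directly onto a short script. The following is a minimal sketch, assuming a per-operator frame-score matrix and a fixed keyframe budget (both assumptions, not the authors' implementation); the extraction of the accompanying audio and text is omitted.

```python
import numpy as np

def group_score_summary(scores: np.ndarray, num_keyframes: int) -> np.ndarray:
    """Select summary frames from per-operator scores.

    scores: (num_operators, num_frames) array, one score per frame per expert.
    Returns indices of the highest-scoring frames, in temporal order.
    """
    # Step 2: average the operators' scores into a single value per frame.
    mean_scores = scores.mean(axis=0)
    # Step 3: keep the top-scoring frames for the summary.
    top = np.argsort(mean_scores)[-num_keyframes:]
    return np.sort(top)

# Example: 3 operators scoring a 10-frame clip, keeping 3 keyframes.
rng = np.random.default_rng(0)
print(group_score_summary(rng.random((3, 10)), num_keyframes=3))
```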

    Efficient Video Indexing on the Web: A System that Leverages User Interactions with a Video Player

    In this paper, we propose a user-based video indexing method that automatically generates thumbnails of the most important scenes of an online video stream by analyzing users' interactions with a web video player. As a test bench to verify our idea, we have extended the YouTube video player into the VideoSkip system. In addition, VideoSkip uses a web database (Google App Engine) to keep a record of important parameters, such as the timing of basic user actions (play, pause, skip). Moreover, we implemented an algorithm that selects representative thumbnails. Finally, we populated the system with data from an experiment with nine users. We found that the VideoSkip system indexes video content by leveraging implicit user interactions, such as pause and thirty-second skip. Our early findings point toward improvements of the web video player and its thumbnail generation technique. The VideoSkip system could complement content-based algorithms in order to achieve efficient video indexing of difficult videos, such as lectures or sports.
    Comment: 9 pages, 3 figures, UCMedia 2010: 2nd International ICST Conference on User Centric Media
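    One plausible reading of the indexing idea is a weighted histogram of interaction events over the video timeline, with thumbnails taken at the peaks. The sketch below assumes a simple (event, timestamp) log and illustrative event weights; it is not the VideoSkip algorithm itself.

```python
from collections import Counter

# Hypothetical interaction log: (event, video time in seconds).
events = [("pause", 62), ("skip30", 60), ("pause", 63), ("play", 5), ("skip30", 61)]

# Events that suggest interest in a scene get higher weight (assumed values).
WEIGHTS = {"pause": 2, "skip30": 1, "play": 0}

def thumbnail_times(events, bin_size=10, k=2):
    """Bin weighted interactions over the timeline; return the k hottest bins."""
    heat = Counter()
    for event, t in events:
        heat[t // bin_size] += WEIGHTS.get(event, 0)
    hottest = [b for b, w in heat.most_common(k) if w > 0]
    # Centre of each hot bin serves as a candidate thumbnail timestamp.
    return sorted(b * bin_size + bin_size // 2 for b in hottest)

print(thumbnail_times(events))  # [65] -> the scene around 60s attracts interactions
```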

    Spott : on-the-spot e-commerce for television using deep learning-based video analysis techniques

    Spott is an innovative second-screen mobile multimedia application which offers viewers relevant information on objects (e.g., clothing, furniture, food) they see and like on their television screens. The application enables interaction between TV audiences and brands, so producers and advertisers can offer potential consumers tailored promotions, e-shop items, and/or free samples. In line with current views on innovation management, the technological excellence of the Spott application is coupled with iterative user involvement throughout the entire development process. This article discusses both of these aspects and how they impact each other. First, we focus on the technological building blocks that facilitate the (semi-)automatic interactive tagging of objects in video streams. The majority of these building blocks make extensive use of novel and state-of-the-art deep learning concepts and methodologies. We show how these deep learning based video analysis techniques facilitate video summarization, semantic keyframe clustering, and (similar) object retrieval. Second, we provide insights into the user tests that have been performed to evaluate and optimize the application's user experience. The lessons learned from these open field tests have already been an essential input to the technology development and will further shape future modifications to the Spott application.
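    Of the building blocks listed, semantic keyframe clustering is the easiest to sketch: embed each keyframe with a pretrained network, cluster the embeddings, and keep one representative frame per cluster. The snippet below uses scikit-learn k-means on placeholder embeddings; the embedding source and cluster count are assumptions, not details from the article.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_keyframes(embeddings: np.ndarray, n_clusters: int) -> list[int]:
    """Group keyframe embeddings; return one representative index per cluster.

    embeddings: (num_keyframes, dim) array, e.g. from a pretrained CNN or
    vision transformer (the choice of model is an assumption here).
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    reps = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        # Representative = the member closest to the cluster centroid.
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        reps.append(int(members[np.argmin(dists)]))
    return sorted(reps)

# Placeholder embeddings: 20 keyframes with 128-dim features.
print(cluster_keyframes(np.random.default_rng(1).random((20, 128)), n_clusters=4))
```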

    Multimedia content modeling and personalization


    Adapting End-to-End Speech Recognition for Readable Subtitles

    Automatic speech recognition (ASR) systems are primarily evaluated on transcription accuracy. However, in some use cases, such as subtitling, verbatim transcription would reduce output readability given limited screen size and reading time. Therefore, this work focuses on ASR with output compression, a task challenging for supervised approaches due to the scarcity of training data. We first investigate a cascaded system, where an unsupervised compression model is used to post-edit the transcribed speech. We then compare several methods of end-to-end speech recognition under output length constraints. The experiments show that with limited data, far less than is needed to train a model from scratch, we can adapt a Transformer-based ASR model to incorporate both transcription and compression capabilities. Furthermore, the best performance in terms of WER and ROUGE scores is achieved by explicitly modeling the length constraints within the end-to-end ASR system.
    Comment: IWSLT 2020
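    Since results are reported in terms of WER alongside ROUGE, a minimal reference implementation of word error rate may be useful context; this is the standard word-level edit-distance definition, not code from the paper.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance (substitutions, insertions, deletions).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("the cat sat on the mat", "the cat sat mat"))  # 2 edits / 6 words ≈ 0.33
```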