The Físchlár-News-Stories system: personalised access to an archive of TV news
The "Físchlár" systems are a family of tools for the capture, analysis, indexing, browsing, searching and summarisation of digital video information. Físchlár-News-Stories, described in this paper, is one of those systems, and provides access to a growing archive of broadcast TV news. Físchlár-News-Stories has several notable features, including the fact that it automatically records TV news and segments a broadcast news program into stories, eliminating advertisements and credits at the start/end of the broadcast. Físchlár-News-Stories supports access to individual stories via calendar lookup, text search through closed captions, automatically generated links between related stories, and personalised access using a personalisation and recommender system based on collaborative filtering. Access to individual news stories is supported either by browsing keyframes with synchronised closed captions, or by playback of the recorded video. One strength of the Físchlár-News-Stories system is that it is actually used, in practice, daily, to access news. Several aspects of the Físchlár systems have been published before, but in this paper we give a summary of the Físchlár-News-Stories system in operation by following a scenario in which it is used, and also outline how the underlying system realises the functions it offers
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research
An MPEG-7 scheme for semantic content modelling and filtering of digital video
Part 5 of the MPEG-7 standard specifies Multimedia Description Schemes (MDS); that is, the format multimedia content models should conform to in order to ensure interoperability across multiple platforms and applications. However, the standard does not specify how the content or the associated model may be filtered. This paper proposes an MPEG-7 scheme which can be deployed for digital video content modelling and filtering. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user. We present details of the scheme, front-end systems used for content modelling and filtering and experiences with a number of users
COSMOS-7: Video-oriented MPEG-7 scheme for modelling and filtering of semantic content
MPEG-7 prescribes a format for semantic content models for multimedia to ensure interoperability across a multitude of platforms and application domains. However, the standard leaves it open as to how the models should be used and how their content should be filtered. Filtering is a technique used to retrieve only content relevant to user requirements, thereby reducing the necessary content-sifting effort of the user. This paper proposes an MPEG-7 scheme that can be deployed for semantic content modelling and filtering of digital video. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user
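The abstract does not give the filtering mechanics. As a heavily simplified, hedged sketch (real COSMOS-7 models are MPEG-7/XML description documents, not Python sets, and the segment IDs and concepts below are invented), content-based filtering against a user's preferred concepts might reduce to a subset test per annotated segment:

```python
# Hypothetical stand-in for a semantic content model: each video segment
# carries the set of semantic concepts annotated for it.
segments = [
    {"id": "seg1", "concepts": {"goal", "football", "crowd"}},
    {"id": "seg2", "concepts": {"interview", "football"}},
    {"id": "seg3", "concepts": {"weather"}},
]

def filter_segments(segments, preferred):
    # Content-based filtering: keep only segments whose annotations
    # cover every concept the user asked for.
    return [s["id"] for s in segments if preferred <= s["concepts"]]
```

Only the matching segments would then be fetched or analysed further, which is the effort-reduction the abstract describes.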
Automated Organisation and Quality Analysis of User-Generated Audio Content
The abundance and ubiquity of user-generated content has opened new horizons for the organisation and analysis of vast and heterogeneous data, especially given the increase in the quality of today's recording devices. Most of the activity on social networks today involves audio excerpts, either as part of a video file or as standalone audio clips, so analysing the audio features present in such content is extremely important for understanding it. Such understanding would lead to better handling of this ubiquitous data and would ultimately provide a better experience to the end-user.
The work discussed in this thesis revolves around using audio features to organise and retrieve meaningful insights from user-generated content crawled from social media websites, in particular data related to concert clips. Given their redundancy and abundance (i.e., the existence of several recordings of a given event), recordings of musical shows represent a very good use case for deriving useful and practical conclusions within the scope of this thesis.
Mechanisms that provide a better understanding of such content are presented and already partly implemented: audio clustering based on the existence of overlapping audio segments between different audio clips, audio segmentation that synchronises and relates each cluster's clips in time, and techniques to infer the audio quality of those clips. All the proposed methods use information retrieved from an audio fingerprinting algorithm, which is used to synchronise the different audio files; methods for filtering possible false positives of the algorithm are also presented.
For the evaluation and validation of the proposed methods, we used one dataset made up of several audio recordings of different concerts, manually crawled from YouTube
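The fingerprint-based synchronisation described above can be illustrated with a small sketch. Assuming a Shazam-style fingerprinter that emits (hash, time) landmark pairs (the hashes and times below are invented, not the thesis's actual data), two clips of the same event can be aligned by voting over the time offsets of matching hashes; offsets backed by only a few votes can be discarded as likely false positives:

```python
from collections import Counter

# Hypothetical fingerprints: (hash, time_in_seconds) landmark pairs for
# two clips of the same concert, the second starting 1.5 s later.
clip_a = [(101, 1.0), (202, 2.0), (303, 3.0), (404, 4.0)]
clip_b = [(202, 0.5), (303, 1.5), (404, 2.5), (999, 3.0)]

def estimate_offset(fp_a, fp_b):
    # For every pair of landmarks sharing a hash, record the time offset.
    # A strong mode in the offset histogram indicates genuinely
    # overlapping audio; its value is the relative delay between clips.
    index_b = {}
    for h, t in fp_b:
        index_b.setdefault(h, []).append(t)
    offsets = Counter()
    for h, t in fp_a:
        for tb in index_b.get(h, []):
            offsets[round(t - tb, 3)] += 1
    if not offsets:
        return None, 0
    offset, votes = offsets.most_common(1)[0]
    return offset, votes
```

A pairwise overlap test like this is also the natural basis for the clustering step: clips whose best offset gathers enough votes are placed in the same event cluster.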
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analysed gaps within the European research effort during our second year. In this period we focused on three directions, namely technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed in two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of the requirements and technological challenges they raise. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (the multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects and of national initiatives. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges
Indexing Audio-Visual Sequences by Joint Audio and Video Processing
This work focuses on the creation of a content-based hierarchical organisation of audio-visual data (a description scheme) and on the creation of meta-data (descriptors) to associate with audio and/or visual signals. The generation of efficient indices to access audio-visual databases is strictly connected to the generation of content descriptors and to the hierarchical representation of audio-visual material. Once a hierarchy can be extracted from the data analysis, a nested indexing structure can be created to access relevant information at a specific level of detail. Accordingly, a query can be made very specific in relation to the level of detail required by the user. In order to construct the hierarchy, we describe how to extract information content from audio-visual sequences so as to obtain different hierarchical indicators (or descriptors) that can be associated with each medium (audio, video). At this stage, video and audio signals can be separated into temporally consistent elements. At the lowest level, information is organised in frames (groups of pixels for visual information, groups of consecutive samples for audio information). At a higher level, low-level consistent temporal entities are identified: in the case of digital image sequences, these consist of shots (or continuous camera records), which can be obtained by detecting cuts or special effects such as dissolves, fade-ins and fade-outs; in the case of audio information, these represent consistent audio segments belonging to one specific audio type (such as speech, music or silence). One more level up, patterns of video shots or audio segments can be recognised so as to reflect more meaningful structures such as dialogues and actions. At the highest level, information is organised so as to establish correlations beyond the temporal organisation of information, reflecting classes of visual or audio types: we call these classes idioms. The paper ends with a description of possible solutions for cross-modal analysis of audio and video information, which may validate or invalidate the proposed hierarchy and, in some cases, enable more sophisticated levels of representation of information content
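The shot-boundary (cut) detection mentioned above is commonly implemented by thresholding the difference between histograms of consecutive frames; the abstract does not state which variant the authors use, so the following is only an illustrative grey-level sketch with invented frame data and threshold:

```python
def histogram(frame, bins=4, levels=256):
    # Coarse normalised grey-level histogram of one frame,
    # where a frame is a flat list of pixel intensities in [0, levels).
    counts = [0] * bins
    for p in frame:
        counts[p * bins // levels] += 1
    total = len(frame)
    return [c / total for c in counts]

def detect_cuts(frames, threshold=0.5):
    # Flag a cut at frame i when the L1 distance between the histograms
    # of frames i-1 and i exceeds the threshold. Gradual effects such as
    # dissolves and fades need a longer-window test instead.
    cuts = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            cuts.append(i)
        prev = cur
    return cuts

# Two dark frames followed by two bright ones: one cut, at index 2.
frames = [[10] * 8, [10] * 8, [250] * 8, [250] * 8]
```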
- …