
8 research outputs found

Language-based multimedia information retrieval

Author: Gauvain J.L.
Hiemstra D.
Jong F.M.G. de
Netter K.
Publication date: 01/01/2000

This paper describes various methods and approaches for language-based multimedia information retrieval, which have been developed in the projects POP-EYE and OLIVE and which will be developed further in the MUMIS project. All of these projects aim at supporting automated indexing of video material by use of human language technologies. Thus, in contrast to image or sound-based retrieval methods, where both the query language and the indexing methods build on non-linguistic data, these methods attempt to exploit advanced text retrieval technologies for the retrieval of non-textual material. While POP-EYE was building on subtitles or captions as the prime language key for disclosing video fragments, OLIVE is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which then serve as the basis for text-based retrieval functionality.

CiteSeerX

Radboud Repository

University of Twente Research Information

Multimedia information technology and the annotation of video

Author: Jong F.M.G. de
Smeulders A.
Worring M.
Publication venue: Stichting Archiefpublicaties
Publication date: 01/01/2006

The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning.

University of Twente Research Information

A Robust Intelligent Tutoring System for the Integration of People with Intellectual Disabilities into Social and Work Environments

Author: A. Conde
A. Ezeiza
A. Goicoechea
A. Soraluze
E. Irigoyen
K. L. de Ipina
M. Larranaga
N. Garay
Publication venue: 'IntechOpen'
Publication date: 01/03/2010

IntechOpen

Crossref

Enriching Textual Documents with Timecodes from Video Fragments

Author: de Jong Franciska
van der Sluis Ielka
Publication date: 01/01/2000

ARTS repository - University of Groningen

A Probabilistic Multimedia Retrieval Model and its Evaluation

Author: de Jong Franciska M.G.
de Vries A.J.
de Vries A.P.
Hiemstra Djoerd
Sayed A.H.
van Ballegooij A.
Westerveld T.H.W.
Publication venue: Hindawi Publishing
Publication date: 01/01/2003

We present a probabilistic model for the retrieval of multimodal documents. The model is based on Bayesian decision theory and combines models for text-based search with models for visual search. The textual model is based on the language modelling approach to text retrieval, and the visual information is modelled as a mixture of Gaussian densities. Both models have proved successful on various standard retrieval tasks. We evaluate the multimodal model on the search task of TREC's video track. We found that the disclosure of video material based on visual information only is still too difficult. Even with purely visual information needs, text-based retrieval still outperforms visual approaches. The probabilistic model is useful for text, visual, and multimedia retrieval. Unfortunately, simplifying assumptions that reduce its computational complexity degrade retrieval effectiveness. Regarding the question whether the model can effectively combine information from different modalities, we conclude that whenever both modalities yield reasonable scores, a combined run outperforms the individual runs.

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

University of Twente Research Information

Transcriber: Development and use of a tool for assisting speech corpora production

Author: Claude Barras
Edouard Geoffrois
Mark Liberman
Zhibiao Wu
Publication date: 01/01/2001

We present "Transcriber", a tool for assisting in the creation of speech corpora, and describe some aspects of its development and use. Transcriber was designed for the manual segmentation and transcription of long duration broadcast news recordings, including annotation of speech turns, topics and acoustic conditions. It is highly portable, relying on the scripting language Tcl/Tk with extensions such as Snack for advanced audio functions and tcLex for lexical analysis, and has been tested on various Unix systems and Windows. The data format follows the XML standard with Unicode support for multilingual transcriptions. Distributed as free software in order to encourage the production of corpora, ease their sharing, increase user feedback and motivate software contributions, Transcriber has been in use for over a year in several countries. As a result of this collective experience, new requirements arose to support additional data formats, video control, and a better management of conversational speech. Using the recently formalized annotation graphs framework, adaptation of the tool towards new tasks and support of different data formats will become easier. © 2001 Elsevier Science B.V. All rights reserved.

CiteSeerX

Language-based multimedia information retrieval

Author: de Jong Franciska
Gauvain Jean-Luc
Harman Donna
Hiemstra Djoerd
Mariani Joseph-Jean
Netter Klaus
Publication venue: Centre de Hautes Etudes Internationales d'Informatique Documentaire (CID)
Publication date: 01/01/2000