3 research outputs found

Overview of VideoCLEF 2009: New perspectives on speech-based multimedia content enrichment

Author: A. Hanjalic
A.F. Smeaton
J. Kekäläinen
J. Kürsten
J.J.M. Kierkels
J.M. Perea-Ortega
M. Larson
P. Pecina
S. Raaijmakers
T.-A. Dobrilǎ
Á. Gyarmati
Publication date: 01/01/2009

VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts was provided. The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was achieved by approaching subject tagging as an information retrieval task and using both speech recognition transcripts and archival metadata. Alternatively, classifiers were trained using either the training data provided or data collected from Wikipedia or via general Web search. The Affect Task involved detecting narrative peaks, defined as points where viewers perceive heightened dramatic tension. The task was carried out on the “Beeldenstorm” collection containing 45 short-form documentaries on the visual arts. The best runs exploited affective vocabulary and audience-directed speech. Other approaches included using topic changes, elevated speaking pitch, increased speaking intensity and radical visual changes. The Linking Task, also called “Finding Related Resources Across Languages,” involved linking video to material on the same subject in a different language. Participants were provided with a list of multimedia anchors (short video segments) in the Dutch-language “Beeldenstorm” collection and were expected to return target pages drawn from English-language Wikipedia. The best-performing methods used the transcript of the speech spoken during the multimedia anchor to build a query to search an index of the Dutch-language Wikipedia. The Dutch Wikipedia pages returned were used to identify related English pages. Participants also experimented with pseudo-relevance feedback, query translation and methods that targeted proper names.
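The idea of treating subject tagging as an information retrieval task can be illustrated with a minimal sketch: each subject label acts as a query, and videos are ranked by how well their text matches it. Everything below is an assumption made for illustration, not the participants' actual systems: the function names, the simple TF-IDF scoring, and the three toy Dutch texts standing in for transcripts plus metadata.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; a real system would also stem and drop stop words.
    return re.findall(r"\w+", text.lower())


def rank_videos(label_query, videos):
    """Rank video ids for one subject label, best match first.

    videos maps a video id to its text (e.g. transcript plus metadata).
    Scoring is a plain TF-IDF sum over the query terms (an assumed choice).
    """
    term_counts = {vid: Counter(tokenize(text)) for vid, text in videos.items()}
    n_docs = len(term_counts)
    query_terms = tokenize(label_query)

    scores = {}
    for vid, counts in term_counts.items():
        doc_len = sum(counts.values()) or 1
        score = 0.0
        for term in query_terms:
            doc_freq = sum(1 for c in term_counts.values() if term in c)
            if doc_freq == 0:
                continue  # term occurs nowhere, so it cannot discriminate
            idf = math.log((n_docs + 1) / (doc_freq + 0.5))
            score += (counts[term] / doc_len) * idf
        scores[vid] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy collection: video id -> transcript and metadata text.
videos = {
    "v1": "schilderij museum kunst schilder",
    "v2": "voetbal wedstrijd sport",
    "v3": "muziek concert orkest",
}
ranking = rank_videos("kunst museum", videos)
print(ranking[0])
```

In such a setup, a label would be assigned to the top-ranked videos for its query; how many to keep per label is a tuning decision the abstract does not specify.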

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Overview of VideoCLEF 2008: Automatic generation of topic-based feeds for dual language audio-visual content

Author: Jones, Gareth J.F.
Larson, Martha
Newman, Eamonn
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF piloted the Vid2RSS task, whose main subtask was the classification of dual-language video (Dutch-language television content featuring English-speaking experts and studio guests). The task offered two additional discretionary subtasks: feed translation and automatic keyframe extraction. Task participants were supplied with Dutch archival metadata, Dutch speech transcripts, English speech transcripts and 10 thematic category labels, which they were required to assign to the test set videos. The videos were grouped by class label into topic-based RSS feeds, displaying title, description and keyframe for each video. Five groups participated in the 2008 VideoCLEF track. Participants were required to collect their own training data; both Wikipedia and general web content were used. Groups deployed various classifiers (SVM, Naive Bayes and k-NN) or treated the problem as an information retrieval task. Both the Dutch speech transcripts and the archival metadata performed well as sources of indexing features, but no group succeeded in exploiting combinations of feature sources to significantly enhance performance. A small-scale fluency/adequacy evaluation of the translation task output revealed the translation to be of sufficient quality to make it valuable to a non-Dutch-speaking English speaker. For keyframe extraction, the strategy chosen was to select the keyframe from the shot with the most representative speech transcript content. The automatically selected shots were shown, with a small user study, to be competitive with manually selected shots. Future years of VideoCLEF will aim to expand the corpus and the class label list, as well as to extend the track to additional tasks.
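The keyframe strategy described above (pick the shot whose speech transcript is most representative) can be sketched as follows. The abstract does not say how representativeness was measured, so this sketch assumes one plausible reading: cosine similarity between each shot's term counts and the term counts of the whole video. The function names and the toy transcripts are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    # Bag-of-words term counts for a piece of transcript text.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_representative_shot(shot_transcripts):
    """Return the index of the shot most similar to the whole video.

    Assumption: "representative" means highest cosine similarity to the
    concatenated transcript of all shots.
    """
    if not shot_transcripts:
        raise ValueError("need at least one shot transcript")
    video_vec = term_vector(" ".join(shot_transcripts))
    sims = [cosine(term_vector(s), video_vec) for s in shot_transcripts]
    return max(range(len(sims)), key=sims.__getitem__)


# Hypothetical per-shot transcripts for one video.
shots = [
    "welkom",
    "rembrandt nachtwacht",
    "rembrandt nachtwacht rembrandt nachtwacht museum",
]
best = most_representative_shot(shots)
print(best)
```

A keyframe would then be taken from the selected shot, for example its middle frame; that last step is outside what the abstract describes.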

CiteSeerX

Irish Universities

DCU Online Research Access Service

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Automatic video classification (Clasificación automática de vídeos)

Author: Aparicio Escribano, David
Publication date: 14/12/2009

The current trend toward digitizing audiovisual content for storage and possible use in computing and telecommunications media is leading several lines of research to focus on processing and analyzing such documents, and on finding solutions to the problems and needs that this content brings with it. Searching text documents is one of the best-served needs today, thanks to Internet search engines such as Google or Yahoo, but the same is not true of audiovisual content. Being able to query videos, audio recordings and similar documents both by subject and by content opens up a wide range of possibilities. Automatic classification of audiovisual content can help digitize more quickly the hundreds of thousands of items of this kind from past years, saving resources and time. It can also make it possible to detect videos with violent, pornographic or other content that should be handled differently for certain users. This study aims to analyze current techniques for automatic video classification, which comprises two well-defined phases: automatic speech recognition and automatic text classification. Automatic speech recognition transcribes the audiovisual content to text so that it can then be classified as a text document. Current research on automatic text classification is fairly advanced, which is why the project follows this line, converting audiovisual documents into text documents that are then processed with natural language processing techniques and automatic classification methods.
In short, classifying and searching audiovisual documents is a present-day necessity, and although it is not yet a priority task, it should gradually gain ground, since society, and in particular the world surrounding the Internet, requires documents such as videos and audio recordings whose content users can query. The project presented here carried out an advanced study of automatic video classification and obtained acceptable results in a practical case study, with precision above 40% and similar recall. It gives an idea of the viability of these systems and offers a detailed study of current techniques and lines of research. Ingeniería Técnica en Informática de Gestió
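The two figures reported in this abstract, precision ("precisión") and recall ("cobertura"), can be computed as in the minimal sketch below. The micro-averaged formulation over (video, label) pairs, the function name, and the toy predictions are assumptions for illustration; the abstract does not state how the project computed its scores, and the example data is not taken from it.

```python
def precision_recall(predicted, gold):
    """Precision and recall over sets of (video_id, label) pairs.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold-standard pairs
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical classifier output and reference labels.
predicted = {("v1", "art"), ("v2", "sport"), ("v3", "art"),
             ("v4", "music"), ("v5", "art")}
gold = {("v1", "art"), ("v2", "music"), ("v3", "art"),
        ("v4", "sport"), ("v6", "art")}

precision, recall = precision_recall(predicted, gold)
print(precision, recall)
```

Here two of five predicted pairs are correct and two of five reference pairs are recovered, so both measures come out at 0.4.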

Universidad Carlos III de Madrid e-Archivo