Search CORE

1,301 research outputs found

Multimedia information seeking through search and hyperlinking

Author: Aly Robin
Chen Shu
de Nies Tom
Debevere Pedro
Eskevich Maria
Galuscakova Petra
Gravier Guillaume
Guinaudeau Camille
Jones Gareth J.F.
Larson Martha
Nadeem Danish
Ordelman Roeland
Pecina Pavel
Sebillot Pascale
Van de Walle Rik
Publication date: 16/04/2013

Searching for relevant webpages and following hyperlinks to related content is a widely accepted and effective approach to information seeking on the textual web. Existing work on multimedia information retrieval has focused on search for individual relevant items or on content linking without specific attention to search results. We describe our research exploring integrated multimodal search and hyperlinking for multimedia data. Our investigation is based on the MediaEval 2012 Search and Hyperlinking task. This includes a known-item search task using the Blip10000 internet video collection, where automatically created hyperlinks link each relevant item to related items within the collection. The search test queries and link assessments for this task were generated using the Amazon Mechanical Turk crowdsourcing platform. Our investigation examines a range of alternative methods which seek to address the challenges of search and hyperlinking using multimodal approaches. The results of our experiments are used to propose a research agenda for developing effective techniques for search and hyperlinking of multimedia content.

DCU Online Research Access Service

Indexing, browsing and searching of digital video

Author: Abe
Avaro
Brown
Chang
Chang
Choi
Goodrum
Hauptmann
Hirschman
Jarina
Kavanagh
Kazman
Koegel Buford
Kravtchenko
Le Gall
Lee
Lienhart
Marchionini
Maybury
McTear
Myers
Myllymaki
Poynton
Puri
Rasmussen
Rorvig
Rowley
Smyth
Sparck Jones
Stein
Wactlar
Wallace
Witbrock
Publication venue: Wiley
Publication date: 01/01/2003

Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Video Augmentation in Education: in-context support for learners through prerequisite graphs

Author: GALLUCCIO ILENIA
Publication venue: Università degli studi di Genova
Publication date: 29/05/2023

The field of education is experiencing a massive digitisation process that has been ongoing for the past decade. The role played by distance learning and Video-Based Learning, further reinforced by the pandemic crisis, has become an established reality. However, the typical features of video consumption, such as sequential viewing and viewing time proportional to duration, often lead to sub-optimal conditions for the use of video lessons in the process of acquisition, retrieval and consolidation of learning contents. Video augmentation can prove to be an effective support to learners, allowing a more flexible exploration of contents, a better understanding of concepts and relationships between concepts, and an optimization of the time required for video consumption at different stages of the learning process. This thesis therefore focuses on the study of methods for: 1) enhancing video capabilities through video augmentation features; 2) extracting concepts and relationships from video materials; 3) developing intelligent user interfaces based on the knowledge extracted. The main research goal is to understand to what extent video augmentation can improve the learning experience. This research goal inspired the design of the EDURELL Framework, within which two applications were developed to enable the testing of augmented methods and their provision. The novelty of this work lies in using the knowledge within the video, without relying on external materials, to exploit its educational potential. The enhancement of the user interface takes place through various support features, in particular a map that progressively highlights the prerequisite relationships between the concepts as they are explained, i.e., following the advancement of the video.
The proposed approach has been designed following a user-centered iterative process, and the results in terms of effect and impact on video comprehension and learning experience make a contribution to the research in this field.

Archivio istituzionale della ricerca - Università di Genova

Multimedia search without visual analysis: the value of linguistic and contextual information

Author: Jong Franciska M.G. de
Vries Arjen P. de
Westerveld Thijs
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2007

This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features.

CiteSeerX

CWI's Institutional Repository

University of Twente Research Information

Hierarchical Topic Models for Language-based Video Hyperlinking

Author: Bois Rémi
Gravier Guillaume
Moens Sien
Morin Emmanuel
Simon Anca-Roxana
Sébillot Pascale
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

We investigate video hyperlinking based on speech transcripts, leveraging a hierarchical topical structure to address two essential aspects of hyperlinking, namely, serendipity control and link justification. We propose and compare different approaches exploiting a hierarchy of topic models as an intermediate representation to compare the transcripts of video segments. These hierarchical representations offer a basis to characterize the hyperlinks, thanks to the knowledge of the topics that contributed to the creation of the links, and to control serendipity by choosing to give more weight to either general or specific topics. Experiments are performed on BBC videos from the Search and Hyperlinking task at MediaEval. Link precisions similar to those of direct text comparison are achieved, though with different link targets and with potential control over serendipity.

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Automatic Transformation of a Video Using Multimodal Information for an Engaging Exploration Experience

Author: Conlan Owen
Haider Fasih
Luz Saturnino
Salim Fahim A
Publication venue: MDPI AG
Publication date: 27/04/2020

Edinburgh Research Explorer

Supporting learning-in-use: Some applications of activity theory to the analysis and design of ICT-enabled collaborative work and learning

Author: Harris Steven R.
Publication date: 01/01/2007

University of South Wales Research Explorer

Deliverable D5.1 LinkedTV Platform and Architecture

Author: Fricke R. (Rolf)
Thomsen J. (Jan)
Publication date: 18/04/2012

The objective of LinkedTV is the integration of hyperlinks in videos to open up new possibilities for an interactive, seamless usage of video on the Web. LinkedTV provides a platform for the automatic identification of media fragments, their metadata annotations and connection with the Linked Open Data Cloud, which enables the development of applications that search for objects, persons or events in videos and retrieve more detailed related information. The objective of D5.1 is the design of the platform architecture for the server and client side based on the requirements derived from the scenarios defined in WP6 and the technical needs from WP1-4. The document defines workflows, components, data structures and tools. Flexible interfaces and an efficient communications infrastructure allow for a seamless deployment of the system in heterogeneous, distributed environments. The resulting design forms the basis for the distributed development of all components in WP1-4 and their integration into a platform enabling the efficient development of Hypervideo applications.

CWI's Institutional Repository

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Author: Benetos Emmanouil
Chen Wenhu
Dannenberg Roger
Fu Jie
Guo Yike
LI Yizhi
Lin Chenghua
Liu Si
Ma Yinghao
Pan Jiahao
Xue Wei
Yuan Ruibin
Zhang Ge
Zhuo Le
Publication date: 29/06/2023

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal. Our novel, training-free approach utilizes Whisper, a weakly supervised robust speech recognition model, and GPT-4, today's most performant chat-based large language model. In the proposed method, Whisper functions as the "ear" by transcribing the audio, while GPT-4 serves as the "brain," acting as an annotator with a strong performance for contextualized output selection and correction. Our experiments show that LyricWhiz significantly reduces Word Error Rate compared to existing methods in English and can effectively transcribe lyrics across multiple languages. Furthermore, we use LyricWhiz to create the first publicly available, large-scale, multilingual lyrics transcription dataset with a CC-BY-NC-SA copyright license, based on MTG-Jamendo, and offer a human-annotated subset for noise level estimation and evaluation. We anticipate that our proposed method and dataset will advance the development of multilingual lyrics transcription, a challenging and emerging task. Comment: 9 pages, 2 figures, 5 tables, accepted by ISMIR 202

arXiv.org e-Print Archive