10 research outputs found

    Segment Based Indexing Technique for Video Data File

    A video is an effective way to exchange information that would otherwise require lengthy text, thanks to advances in technology. Capturing video is an effortless process, but retrieving the related video is difficult, and to support retrieval the videos must be indexed. Retrieval is the method that returns a video in response to a user query; the query may be an image or text, and the system returns a particular video or image based on it. In this project we create an index for video files using a segment-based indexing technique: the video is divided into a hierarchy analogous to the storyboards of film making. A hierarchical video search is thus composed of multiple stages of abstraction that help users locate specific video segments/frames logically. This reduces the bandwidth and delay of searching and reviewing video over the network, and experimental results verify it.
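    A minimal sketch of how such a hierarchical segment index might be organised; the class and field names below are illustrative assumptions, not taken from the paper:

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    @dataclass
    class Shot:
        start_frame: int    # first frame of the shot
        end_frame: int      # last frame of the shot
        keyframes: List[int] = field(default_factory=list)  # representative frames

    @dataclass
    class Scene:
        shots: List[Shot] = field(default_factory=list)

    @dataclass
    class VideoIndex:
        video_id: str
        scenes: List[Scene] = field(default_factory=list)

        def locate(self, frame: int) -> Optional[Tuple[int, int]]:
            # Drill down the hierarchy to the (scene, shot) containing a
            # frame, so a query can return a short segment instead of
            # forcing the whole file across the network.
            for s_i, scene in enumerate(self.scenes):
                for sh_i, shot in enumerate(scene.shots):
                    if shot.start_frame <= frame <= shot.end_frame:
                        return s_i, sh_i
            return None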

    Multimedia Retrieval


    Automatic Quality Assessment of Lecture Videos Using Multimodal Features

    Multimedia retrieval, a methodology that developed out of information retrieval, is widely used in the digitalised society. When searching for videos online, they need to be ranked by relevance; however, most approaches compute relevance only from basic content information. This thesis aims to analyse relevance across multiple modalities. For the specific case of lecture videos, features from the following modalities are extracted from the corresponding course materials: the acoustic, linguistic, and visual modalities. Furthermore, cross-modal features are first proposed in this thesis and computed by processing audio, images, transcripts, and texts. A user evaluation was conducted to collect users' opinions on the generated features. The results show that most features can reflect a video in multiple aspects. How the learning effect is influenced by these features is considered as well. For further research, this study builds a solid base for feature extraction and contributes a better understanding of learning.
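    As an illustration, one plausible cross-modal feature is the alignment between the spoken transcript and the slide text, scored here with TF-IDF cosine similarity; this exact formulation is an assumption made for the example, not a feature specified by the thesis:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def transcript_slide_alignment(transcript: str, slide_text: str) -> float:
        # Score in [0, 1] for how well the spoken content matches the
        # slides; higher values suggest a better-aligned lecture video.
        vectorizer = TfidfVectorizer()
        matrix = vectorizer.fit_transform([transcript, slide_text])
        return float(cosine_similarity(matrix[0], matrix[1])[0, 0])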

    Semantic multimedia analysis using knowledge and context

    The difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty of expressing them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts' relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a Bayesian network (BN) modeling approach that defines a conceptual space where both domain-related evidence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to significant gains in performance compared to analysis methods that cannot handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as manually annotated images when working with large datasets, we significantly contribute towards scalable object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both the visual and tag information space. By showing that the cross-modal dependencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media.
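    A toy sketch of the core idea of combining explicit domain knowledge (a prior) with implicit evidence from content analysis (detector likelihoods) via Bayes' rule; this stands in for the thesis's Bayesian network machinery, and the detectors and numbers are invented for the example:

    def posterior(prior, likelihoods):
        # prior: P(concept); each likelihood pair is
        # (P(evidence | concept), P(evidence | not concept)).
        p_c, p_not = prior, 1.0 - prior
        for l_c, l_not in likelihoods:
            p_c *= l_c       # evidence given the concept holds
            p_not *= l_not   # evidence given it does not
        return p_c / (p_c + p_not)

    # A hypothetical "beach" concept: domain prior 0.2, supported by a
    # sand detector (0.8 vs 0.3) and a water detector (0.7 vs 0.4).
    print(posterior(0.2, [(0.8, 0.3), (0.7, 0.4)]))  # ~0.54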

    Multi-modal surrogates for retrieving and making sense of videos: is synchronization between the multiple modalities optimal?

    Video surrogates can help people quickly make sense of the content of a video before downloading or seeking more detailed information. Visual and audio features of a video are primary information carriers and might become important components of video retrieval and video sense-making. In the past decades, most research and development efforts on video surrogates have focused on visual features of the video, and comparatively little work has been done on audio surrogates and on examining their pros and cons in aiding users' retrieval and sense-making of digital videos. Even less work has been done on multi-modal surrogates, where more than one modality is employed for consuming the surrogates, for example, the audio and visual modalities. This research examined the effectiveness of a number of multi-modal surrogates and investigated whether synchronization between the audio and visual channels is optimal. A user study was conducted to evaluate six different surrogates on a set of six recognition and inference tasks, answering two main research questions: (1) How do automatically-generated multi-modal surrogates compare to manually-generated ones in video retrieval and video sense-making? and (2) Does synchronization between multiple surrogate channels enhance or inhibit video retrieval and video sense-making? Forty-eight participants took part in the study, in which the surrogates were measured on the time participants spent experiencing the surrogates, the time participants spent doing the tasks, participants' performance accuracy on the tasks, participants' confidence in their task responses, and participants' subjective ratings of the surrogates. On average, the uncoordinated surrogates were more helpful than the coordinated ones, but the manually-generated surrogates were only more helpful than the automatically-generated ones in terms of task completion time. Participants' subjective ratings were more favorable for the coordinated surrogate C2 (Magic A + V) and the uncoordinated surrogate U1 (Magic A + Storyboard V) with respect to usefulness, usability, enjoyment, and engagement. The post-session questionnaire comments demonstrated participants' preference for the coordinated surrogates, but the comments also revealed the value of having uncoordinated sensory channels.

    Multimodal content-based video retrieval

    This chapter is a case study showing how important events (highlights) can be automatically detected in video recordings of Formula 1 car racing. Numerous approaches presented in the literature have shown that it is becoming possible to extract interesting events from video. However, the majority of the approaches use individual visual or audio cues. According to the current understanding of human perception, using evidence obtained from different modalities should result in a more robust and accurate perception of video. On the other hand, fusion of multimodal evidence is quite challenging, since it has to deal with indications that may contradict each other. In this chapter we deal with three topics, one being fusion of evidence from different modalities.
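    A hedged illustration of late fusion across modalities: per-modality highlight scores (say, from audio excitement, commentator speech, and visual replay cues) are combined with weights, and strong disagreement between cues is penalised. The weights, threshold, and penalty are invented for the example and are not the chapter's method:

    def fuse(scores, weights, conflict_penalty=0.5):
        # Weighted average of per-modality highlight scores in [0, 1].
        total_w = sum(weights[m] for m in scores)
        fused = sum(weights[m] * s for m, s in scores.items()) / total_w
        # If the modalities strongly contradict each other, be cautious
        # and damp the fused score.
        if max(scores.values()) - min(scores.values()) > 0.6:
            fused *= conflict_penalty
        return fused

    print(fuse({"audio": 0.9, "visual": 0.8, "text": 0.7},
               {"audio": 0.4, "visual": 0.4, "text": 0.2}))  # 0.82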
