Search CORE

123 research outputs found

Recommended from our members

MAC-REALM: A video content feature extraction and modelling framework

Author: Parmar Minaz
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.A consequence of the ‘data deluge’ is the exponential increase in digital video footage, while the ability to find relevant video clips diminishes. Traditional text based search engines are no longer optimal for searching, as they cannot provide a granular search of the content inside video footage. To be able to search the video in a content based manner, the content features of the video need to be extracted and modelled into a content model, which can then act as a searchable proxy for the video content. This thesis focuses on the extraction of syntactic and semantic content features and content modelling, using machine driven processes, with either little or no user interaction. Our abstract framework design extracts syntactic and semantic content features and compiles them into an integrated content model. The framework integrates a four plane strategy that consists of a pre-processing plane that removes redundant data and filters the media to improve the feature extraction properties of the media; a syntactic feature extraction plane that extracts low level syntactic feature and mid-level syntactic features that have semantic attributes; a semantic relationship analysis and linkage plane, where the spatial and temporal relationships of all the content features are defined, and finally a content modelling stage where the syntactic and semantic content features are integrated into a content model. Each of the four planes can be split into three layers namely, the content layer, where the content to be processed is stored; the application layer, where the content is converted into content descriptions, and the MPEG-7 layer, where content descriptions are serialised. Using MPEG-7 standards to produce the content model will provide wide-ranging interoperability, while facilitating granular multi-content type searches. The framework is aiming to ‘bridge’ the semantic gap, by integrating the syntactic and semantic content features from extraction through to modelling. The design of the framework has been implemented into a prototype called MAC-REALM, which has been tested and evaluated for its effectiveness to extract and model content features. Conclusions are drawn about the research output as a whole and whether they have met the objectives. Finally, future work is presented on how concept detection and crowd sourcing can be used with MAC-REALM

Brunel University Research Archive

TRECVID 2004 - an overview

Author: Kraaij Wessel
Over Paul
Smeaton Alan F.
Publication venue: 'University of Aden - Faculty of Economics and Administration'
Publication date: 01/11/2004
Field of study

Irish Universities

DCU Online Research Access Service

A PROPOSAL FOR A VIDEO MODELING FOR COMPOSING MULTIMEDIA DOCUMENT

Author
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date
Field of study

Crossref

Video annotation for studying the brain in naturalistic settings

Author: Wallenius Kalle
Publication venue: Aalto-yliopisto
Publication date: 01/01/2010
Field of study

Aivojen tutkiminen luonnollisissa asetelmissa on viimeaikainen suunta aivotutkimuksessa. Perinteisesti aivotutkimuksessa on käytetty hyvin yksinkertaistettuja ja keinotekoisia ärsykkeitä, mutta viime aikoina on alettu tutkia ihmisaivoja yhä luonnollisimmissa asetelmissa. Näissä kokeissa on käytetty elokuvaa luonnollisena ärsykkeenä. Elokuvan monimutkaisuudesta johtuen tarvitaan siitä yksinkertaistettu malli laskennallisen käsittely mahdollistamiseksi. Tämä malli tuotetaan annotoimalla; keräämällä elokuvan keskeisistä ärsykepiirteistä dataa tietorakenteen muodostamiseksi. Tätä dataa verrataan aivojen aikariippuvaiseen aktivaatioon etsittäessä mahdollisia korrelaatioita. Kaikkia elokuvan ominaisuuksia ei pystytä annotoimaan automaattisesti; ihmiselle merkitykselliset ominaisuudet on annotoitava käsin, joka on joissain tapauksissa ongelmallista johtuen elokuvan käyttämistä useista viestintämuodoista. Ymmärrys näistä viestinnän muodoista auttaa analysoimaan ja annotoimaan elokuvia. Elokuvaa Tulitikkutehtaan Tyttö (Aki Kaurismäki, 1990) käytettiin ärsykkeenä aivojen tutkimiseksi luonnollisissa asetelmissa. Kokeista saadun datan analysoinnin helpottamiseksi annotoitiin elokuvan keskeiset visuaaliset ärsykepiirteet. Tässä työssä tutkittiin annotointiin käytettävissä olevia eri lähestymistapoja ja teknologioita. Annotointi auttaa informaation organisoinnissa, mistä syystä annotointia ilmestyy nykyään kaikkialla. Erilaisia annotaatiotyökaluja ja -teknologioita kehitetään jatkuvasti. Lisäksi videoanalyysimenetelmät ovat alkaneet mahdollistaa yhä merkityksellisemmän informaation automaattisen annotoinnin tulevaisuudessa.Studying the brain in naturalistic settings is a recent trend in neuroscience. Traditional brain imaging experiments have relied on using highly simplified and artificial stimuli, but recently efforts have been put into studying the human brain in conditions closer to real-life. The methodology used in these studies involve imitating naturalistic stimuli with a movie. Because of the complexity of the naturalistic stimulus, a simplified model of it is needed to handle it computationally. This model is obtained by making annotations; collecting information of salient features of the movie to form a data structure. This data is compared with the brain activity evolving in time to search for possible correlations. All the features of a movie cannot be reliably annotated automatically: semantic features of a movie require manual annotations, which is in some occasions problematic due to the various cinematic techniques adopted. Understanding these methods helps analyzing and annotating movies. The movie Match Factory Girl (Aki Kaurismäki, 1990) was used as a stimulus in studying the brain in naturalistic settings. To help the analysis of the acquired data the salient visual features of the movie were annotated. In this work existing annotation approaches and available technologies for annotation were reviewed. Annotations help organizing information, therefore they are nowadays found everywhere. Different tools and technologies are being developed constantly. Furthermore, development of automatic video analysis methods are going to provide more meaningful annotations in the future

Aaltodoc Publication Archive

Handling temporal heterogeneous data for content-based management of large video collections

Author: Bruno Eric
Janvier Bruno
Marchand-Maillet Stéphane
Moënne-Loccoz Nicolas
Publication venue
Publication date: 18/06/2018
Field of study

Video document retrieval is now an active part of the domain of multimedia retrieval. However, unlike for other media, the management of a collection of video documents adds the problem of efficiently handling an overwhelming volume of temporal data. Challenges include balancing efficient content modeling and storage against fast access at various levels. In this paper, we detail the framework we have built to accommodate our developments in content-based multimedia retrieval. We show that not only our framework facilitates the development of processing and indexing algorithms but it also opens the way to several other possibilities such as rapid interface prototyping or retrieval algorithm benchmarking. Here, we discuss our developments in relation to wider contexts such as MPEG-7 and the TREC Video Trac

RERO DOC Digital Library

Highly efficient low-level feature extraction for video representation and retrieval.

Author: Calie Janko
Publication venue: 'Queen Mary University of London'
Publication date: 01/01/2004
Field of study

PhDWitnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current Content Based Video Indexing and Retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from the compressed domain features and the robust scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm that runs in real time maintaining the high precision and recall of the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords

Queen Mary Research Online

OpenGrey Repository

MAC-REALM: A Video Content Feature Extraction and Modelling Framework

Author: Aly
Amiri
Amri
Anderson
Angelides
Angelides
Bai
Chiarcos
Choroś
Dal Mutto
Dumont
Döller
Fromme
Garg
Güsgen
Haslhofer
Hu
Huang
Inigo
Kristensen
Ladický
Lavee
Lawrence
Li
Liu
Ma
Ma
Mahesh
Manjunath
Marios C. Angelides
Mezaris
Mika
Minaz Parmar
Mitrovic
MPEG
Richardson
Sarmiento
Seeling
Singhai
Snoek
Tapu
Tjondronegoro
Tsao
Tuzel
Vazquez-Reina
Wang
Wolf
Xue
Zajic
Zeng
Zhu
Publication venue: 'Oxford University Press (OUP)'
Publication date: 29/06/2015
Field of study

Crossref

Brunel University Research Archive

Digital Image Access & Retrieval

Author: Heidorn P. Bryan
Sandore Beth
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1997
Field of study

The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

Illinois Digital Environment for Access to Learning and Scholarship Repository

Adaptive video delivery using semantics

Author: Steiger Olivier
Publication venue: Lausanne, EPFL
Publication date: 16/03/2005
Field of study

The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of object's shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. At last, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications

Infoscience - École polytechnique fédérale de Lausanne

Overview of the MPEG-7 Standard and of Future Challenges for Visual Information Analysis

Author
Publication venue: Springer
Publication date
Field of study

Springer - Publisher Connector