2,169 research outputs found

An MPEG-7 scheme for semantic content modelling and filtering of digital video

Author: A. Vakali
A. Vetro
B.L. Tseng
C. Okoli
C.S. Goldfarb
F. Golshani
F. Kretz
G. Rowe
H. Kosch
H.W. Agius
Harry Agius
J. Hunter
J. Magalhães
J.F. Allen
L. Al-Safadi
L. Wenyin
M. Davis
M. Echiffre
M. Eirinaki
M.C. Angelides
M.R. Naphande
Marios C. Angelides
N. Adami
P. Correia
P. Salembier
P.M. Fonseca
R. Zhao
S. Adali
S.R. Newcomb
S.W. Ambler
T. Meyer-Boudnik
U. Westermann
Y.F. Day
É Germain
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2006

Part 5 of the MPEG-7 standard specifies Multimedia Description Schemes (MDS); that is, the format multimedia content models should conform to in order to ensure interoperability across multiple platforms and applications. However, the standard does not specify how the content or the associated model may be filtered. This paper proposes an MPEG-7 scheme which can be deployed for digital video content modelling and filtering. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user. We present details of the scheme, front-end systems used for content modelling and filtering and experiences with a number of users.

Crossref

Brunel University Research Archive

Recommended from our members

Multimedia delivery in the future internet

Author: Aggoun A
Amon P
Arbel I
Chernilov A
Cosmas J
Garcia G
Jari A
Keller S
Kontopoulos C
Lamy-Bergot C
Leon A
Mattavelli M
Mauthe A
Mota T
Naumann M
Navarro A
Negru O
Pinto F
Shao B
Timmerer C
Tsekleves E
Zahariadis T
Publication venue: 'Society for Leukocyte Biology'
Publication date: 01/01/2008

The term “Networked Media” implies that all kinds of media, including text, image, 3D graphics, audio and video, are produced, distributed, shared, managed and consumed on-line through various networks, like the Internet, Fiber, WiFi, WiMAX, GPRS, 3G and so on, in a convergent manner [1]. This white paper is the contribution of the Media Delivery Platform (MDP) cluster and aims to cover the networking challenges of Networked Media in the transition to the Future of the Internet. The Internet has evolved and changed the way we work and live. End users of the Internet have been confronted with a bewildering range of media, services and applications and of technological innovations concerning media formats, wireless networks, terminal types and capabilities, and there is little evidence that the pace of this innovation is slowing. Today, over one billion users access the Internet on a regular basis, more than 100 million users have downloaded at least one (multi)media file and over 47 million of them do so regularly, searching in more than 160 Exabytes of content. In the near future these numbers are expected to rise exponentially. It is expected that Internet content will increase by at least a factor of 6, rising to more than 990 Exabytes before 2012, fuelled mainly by the users themselves. Moreover, it is envisaged that in the near- to mid-term future, the Internet will provide the means to share and distribute (new) multimedia content and services with superior quality and striking flexibility, in a trusted and personalized way, improving citizens’ quality of life, working conditions, edutainment and safety.
In this evolving environment, new transport protocols, new multimedia encoding schemes, cross-layer in-the-network adaptation, machine-to-machine communication (including RFIDs), rich 3D content, as well as community networks and the use of peer-to-peer (P2P) overlays, are expected to generate new models of interaction and cooperation, and to support enhanced perceived quality-of-experience (PQoE) and innovative applications “on the move”, like virtual collaboration environments, personalised services/media, virtual sport groups, on-line gaming and edutainment. In this context, the interaction with content, combined with interactive/multimedia search capabilities across distributed repositories, opportunistic P2P networks and the dynamic adaptation to the characteristics of diverse mobile terminals, is expected to contribute towards such a vision. Based on work that has taken place in a number of EC co-funded projects in Framework Program 6 (FP6) and Framework Program 7 (FP7), a group of experts and technology visionaries have voluntarily contributed to this white paper, which aims to describe the status, the state of the art, the challenges and the way ahead in the area of Content Aware media delivery platforms.

Brunel University Research Archive

Learning midlevel image features for natural scene and texture classification

Author: Guérin-Dugué Anne
Le Borgne Hervé
O'Connor Noel E.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/03/2007

This paper deals with coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the space of representation for faster computation. The proposed approach is tested in the context of texture classification (111 classes), as well as natural scenes classification (11 categories, 2037 images). Using a common protocol, the other commonly used descriptors have at most 47.7% accuracy on average while our method obtains performances of up to 63.8%. We show that this advantage does not depend on the size of the signature and demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensio
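
As an illustration of a signature based on the maximal activity of filters, the following is a minimal sketch of one plausible reading of the global signature: for every image patch, find the filter with the strongest absolute response and record how often each filter wins. The filters here are two hand-made 2x2 edge detectors, not filters learned by ICA, and the function names and toy patches are assumptions made for illustration only; the paper's actual definition may differ.

```python
from collections import Counter


def filter_responses(patch, filters):
    """Dot product of one flattened patch with each flattened filter."""
    return [sum(p * w for p, w in zip(patch, f)) for f in filters]


def global_signature(patches, filters):
    """Fraction of patches for which each filter has the largest absolute response."""
    counts = Counter()
    for patch in patches:
        responses = filter_responses(patch, filters)
        winner = max(range(len(filters)), key=lambda k: abs(responses[k]))
        counts[winner] += 1
    total = len(patches)
    return [counts[k] / total for k in range(len(filters))]


# Hypothetical stand-ins for learned filters (row-major 2x2 patches).
filters = [
    [1, 1, -1, -1],  # top row minus bottom row: responds to horizontal edges
    [1, -1, 1, -1],  # left column minus right column: responds to vertical edges
]

# Toy patches, also row-major 2x2.
patches = [
    [9, 9, 0, 0],  # horizontal edge
    [9, 0, 9, 0],  # vertical edge
    [8, 8, 1, 1],  # horizontal edge
    [5, 5, 0, 0],  # horizontal edge
]

signature = global_signature(patches, filters)
print(signature)  # [0.75, 0.25]
```

A local variant along the lines of the abstract would compute the same histogram per image region, so that the spatial distribution of the maximal responses is retained.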

DCU Online Research Access Service

Multi modal multi-semantic image retrieval

Author: Kesorn Kraisak
Publication date: 01/01/2010

PhD

The rapid growth in the volume of visual information, e.g. images and video, can overwhelm users’ ability to find and access the specific visual information of interest to them. In recent years, ontology knowledge-based (KB) image information retrieval techniques have been adopted in order to extract knowledge from these images, enhancing the retrieval performance. A KB framework is presented to promote semi-automatic annotation and semantic image retrieval using multimodal cues (visual features and text captions). In addition, a hierarchical structure for the KB allows metadata to be shared that supports multi-semantics (polysemy) for concepts. The framework builds up an effective knowledge base pertaining to a domain specific image collection, e.g. sports, and is able to disambiguate and assign high level semantics to ‘unannotated’ images. Local feature analysis of visual content, namely using Scale Invariant Feature Transform (SIFT) descriptors, has been deployed in the ‘Bag of Visual Words’ model (BVW) as an effective method to represent visual content information and to enhance its classification and retrieval. Local features are more useful than global features, e.g. colour, shape or texture, as they are invariant to image scale, orientation and camera angle. An innovative approach is proposed for the representation, annotation and retrieval of visual content using a hybrid technique based upon the use of an unstructured visual word and upon a (structured) hierarchical ontology KB model. The structural model facilitates the disambiguation of unstructured visual words and a more effective classification of visual content, compared to a vector space model, through exploiting local conceptual structures and their relationships.
The key contributions of this framework in using local features for image representation include: first, a method to generate visual words using the semantic local adaptive clustering (SLAC) algorithm, which takes term weight and spatial locations of keypoints into account. Consequently, the semantic information is preserved. Second, a technique is used to detect the domain specific ‘non-informative visual words’ which are ineffective at representing the content of visual data and degrade its categorisation ability. Third, a method to combine an ontology model with a visual word model to resolve synonym (visual heterogeneity) and polysemy problems is proposed. The experimental results show that this approach can discover semantically meaningful visual content descriptions and recognise specific events, e.g. sports events, depicted in images efficiently. Since discovering the semantics of an image is an extremely challenging problem, one promising approach to enhance visual content interpretation is to use any associated textual information that accompanies an image as a cue to predict the meaning of the image, by transforming this textual information into a structured annotation for the image, e.g. using XML, RDF, OWL or MPEG-7. Although text and image are distinct types of information representation and modality, there are some strong, invariant, implicit connections between images and any accompanying text information. Semantic analysis of image captions can be used by image retrieval systems to retrieve selected images more precisely. To do this, Natural Language Processing (NLP) is first exploited in order to extract concepts from image captions. Next, an ontology-based knowledge model is deployed in order to resolve natural language ambiguities. To deal with the accompanying text information, two methods to extract knowledge from textual information have been proposed.
First, metadata can be extracted automatically from text captions and restructured with respect to a semantic model. Second, the use of LSI in relation to a domain-specific ontology-based knowledge model enables the combined framework to tolerate ambiguities and variations (incompleteness) of metadata. The use of the ontology-based knowledge model allows the system to find indirectly relevant concepts in image captions and thus leverage these to represent the semantics of images at a higher level. Experimental results show that the proposed framework significantly enhances image retrieval and leads to narrowing of the semantic gap between lower level machine-derived and higher level human-understandable conceptualisation.
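
The ‘Bag of Visual Words’ representation mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's method: the vocabulary is given by hand rather than produced by the SLAC algorithm, descriptors are 2-dimensional toy vectors rather than 128-dimensional SIFT descriptors, and the function names are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)), key=lambda i: sq_dist(descriptor, vocabulary[i]))


def bag_of_visual_words(descriptors, vocabulary):
    """Count how many local descriptors of one image map to each visual word."""
    histogram = [0] * len(vocabulary)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, vocabulary)] += 1
    return histogram


# Hypothetical vocabulary of three visual words (cluster centres).
vocabulary = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]

# Toy local descriptors extracted from a single image.
descriptors = [[1.0, 1.0], [9.0, 9.5], [0.5, 8.0], [0.2, 0.1], [11.0, 10.0]]

histogram = bag_of_visual_words(descriptors, vocabulary)
print(histogram)  # [2, 2, 1]
```

The resulting histogram is the unstructured visual-word description of the image; the framework described above goes further by pruning non-informative words and linking the remaining words to an ontology.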

Queen Mary Research Online

Device-based decision-making for adaptation of three-dimensional content

Author: Di Giacomo Thomas
Garchery Stephane
Joslin Chris
Kim Hyung Seok
Magnenat-Thalmann Nadia
Publication date: 18/06/2018

The goal of this research was the creation of an adaptation mechanism for the delivery of three-dimensional content. The adaptation of content for various network and terminal capabilities, as well as for different user preferences, is a key feature that needs to be investigated. Current state-of-the-art research on adaptation shows promising results for specific tasks and limited types of content, but is still not well-suited for massive heterogeneous environments. In this research, we present a method for transmitting adapted three-dimensional content to multiple target devices. This paper presents some theoretical and practical methods for adapting three-dimensional content, which includes shapes and animation. We also discuss practical details of the integration of our methods into MPEG-21 and MPEG-4 architecture.

RERO DOC Digital Library

MPEG-SCORM : ontologia de metadados interoperáveis para integração de padrões multimídia e e-learning

Author: Santos Marcelo Correia dos, 1978-
Publication venue: [s.n.]
Publication date: 02/09/2018

Advisor: Yuzo Iano

Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação

Summary: The convergence of digital media proposes an integration between ICT focused on the multimedia domain (under the responsibility of the Moving Picture Experts Group, which constitutes subcommittee ISO/IEC JTC1 SC29) and ICTE (ICT for Education, managed by ISO/IEC JTC1 SC36), highlighting the MPEG standards, employed as content and metadata for multimedia, and ICTE, applied to Distance Education, or e-Learning (electronic learning). In this regard, the problem arises of developing an interoperable correspondence between normative bases, thereby arriving at an innovative proposal for the convergence between digital media and e-Learning applications, which are essentially multimedia. To this end, it is proposed to create and apply an ontology of interoperable metadata for the web, digital TV and extensions for mobile devices, based on the integration of the MPEG-21 and SCORM metadata standards and employing the XPath language.

Abstract: The convergence of digital media offers an integration of ICT, focused on the telecommunications and multimedia domain (under the responsibility of the Moving Picture Experts Group, ISO/IEC JTC1 SC29), with ICTE (ICT for Education, managed by ISO/IEC JTC1 SC36), highlighting the MPEG formats, featured as content and as description metadata potentially applied to Multimedia or Digital TV and as a technology applied to e-Learning. In this regard, the problem is presented of developing an interoperable matching for normative bases, achieving an innovative proposal in the convergence between digital Telecommunications and applications for e-Learning, both essentially multimedia.
To achieve this purpose, it is proposed to create an ontology for interoperability between educational applications in Digital TV environments and vice-versa, simultaneously facilitating the creation of learning metadata based objects for Digital TV programs as well as providing multimedia video content as learning objects for Distance Education. This ontology is designed as interoperable metadata for the Web, Digital TV and e-Learning, built on the integration between the MPEG-21 and SCORM metadata standards, employing the XPath language.

Doctorate. Telecommunications and Telematics. Doctor of Electrical Engineering. CAPE

Repositorio da Producao Cientifica e Intelectual da Unicamp

Understanding user experience of mobile video: Framework, measurement, and optimization

Author: Docherty Michael
Song Wei
Tjondronegoro Dian
Publication venue: 'IntechOpen'
Publication date: 01/01/2012

Since users have become the focus of product/service design in the last decade, the term User eXperience (UX) has been frequently used in the field of Human-Computer Interaction (HCI). Research on UX facilitates a better understanding of the various aspects of the user’s interaction with the product or service. Mobile video, as a new and promising service and research field, has attracted great attention. Due to the significance of UX in the success of mobile video (Jordan, 2002), many researchers have centered on this area, examining users’ expectations, motivations, requirements, and usage context. As a result, many influencing factors have been explored (Buchinger, Kriglstein, Brandt & Hlavacs, 2011; Buchinger, Kriglstein & Hlavacs, 2009). However, a general framework specific to mobile video service is lacking for structuring such a great number of factors. To measure user experience of multimedia services such as mobile video, quality of experience (QoE) has recently become a prominent concept. In contrast to the traditionally used concept of quality of service (QoS), QoE not only involves objectively measuring the delivered service but also takes into account the user’s needs and desires when using the service, emphasizing the user’s overall acceptability of the service. Many QoE metrics are able to estimate the user perceived quality or acceptability of mobile video, but may not be accurate enough for overall UX prediction due to the complexity of UX. Only a few QoE frameworks have addressed more aspects of UX for mobile multimedia applications, and these still need to be transformed into practical measures. The challenge of optimizing UX remains adapting to resource constraints (e.g., network conditions, mobile device capabilities, and heterogeneous usage contexts) as well as meeting complicated user requirements (e.g., usage purposes and personal preferences).
In this chapter, we investigate the existing important UX frameworks, compare their similarities and discuss some important features that fit the mobile video service. Based on the previous research, we propose a simple UX framework for mobile video applications by mapping a variety of influencing factors of UX upon a typical mobile video delivery system. Each component and its factors are explored with comprehensive literature reviews. The proposed framework may benefit user-centred design of mobile video by taking complete consideration of UX influences, and may improve mobile video service quality by adjusting the values of certain factors to produce a positive user experience. It may also facilitate related research by locating important issues to study, clarifying research scopes, and setting up proper study procedures. We then review a great deal of research on UX measurement, including QoE metrics and QoE frameworks of mobile multimedia. Finally, we discuss how to achieve an optimal quality of user experience by focusing on the issues of various aspects of UX of mobile video. In the conclusion, we suggest some open issues for future study.

IntechOpen

Queensland University of Technology ePrints Archive

State-of-the-Art and Trends in Scalable Video Compression with Wavelet Based Approaches

Author: Adami Nicola
Leonardi Riccardo
Signoroni Alberto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007

Scalable Video Coding (SVC) differs from traditional single point approaches mainly because it allows several working points, corresponding to different quality, picture size and frame rate, to be encoded in a single bit stream. This work describes the current state-of-the-art in SVC, focusing on wavelet based motion-compensated approaches (WSVC). It reviews individual components that have been designed to address the problem over the years and how such components are typically combined to achieve meaningful WSVC architectures. Coding schemes which mainly differ in the space-time order in which the wavelet transforms operate are here compared, discussing strengths and weaknesses of the resulting implementations. An evaluation of the achievable coding performances is provided considering the reference architectures studied and developed by ISO/MPEG in its exploration on WSVC. The paper also attempts to draw a list of major differences between wavelet based solutions and the SVC standard jointly targeted by ITU and ISO/MPEG. A major emphasis is devoted to a promising WSVC solution, named STP-tool, which presents architectural similarities with respect to the SVC standard. The paper ends by drawing some evolution trends for WSVC systems and giving insights on video coding applications which could benefit from a wavelet based approach.
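
To illustrate how a wavelet transform along the time axis can yield several frame-rate working points from one representation, here is a minimal sketch of a single level of temporal Haar analysis and synthesis. It is a simplification under stated assumptions: frames are tiny lists of pixel values, motion compensation (central to the WSVC schemes surveyed) is omitted, and the function names are hypothetical.

```python
def haar_temporal_split(frames):
    """One level of temporal Haar analysis over consecutive frame pairs.

    Returns (low, high): per-pair averages and per-pair half-differences.
    """
    low, high = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        low.append([(x + y) / 2 for x, y in zip(a, b)])
        high.append([(x - y) / 2 for x, y in zip(a, b)])
    return low, high


def haar_temporal_merge(low, high):
    """Inverse of haar_temporal_split: rebuild the full-rate frame sequence."""
    frames = []
    for l, h in zip(low, high):
        frames.append([x + y for x, y in zip(l, h)])  # first frame of the pair
        frames.append([x - y for x, y in zip(l, h)])  # second frame of the pair
    return frames


# Four toy frames of two pixels each.
frames = [[10, 20], [14, 24], [30, 40], [30, 44]]

low, high = haar_temporal_split(frames)

# A decoder that receives only the low band plays at half the frame rate.
half_rate = low

# A decoder that also receives the high band recovers every original frame.
full_rate = haar_temporal_merge(low, high)
```

Applying the split again to the low band would give a quarter-rate working point, which is the sense in which one bit stream can serve several frame rates.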

Archivio istituzionale della ricerca - Università di Brescia

Multimedia content description framework

Author: Bergman Lawrence David
Kim Michelle Yoonk Yung
Li Chung-Sheng
Mohan Rakesh
Smith John Richard
Publication date: 13/05/2003

A framework is provided for describing multimedia content and a system in which a plurality of multimedia storage devices employing the content description methods of the present invention can interoperate. In accordance with one form of the present invention, the content description framework is a description scheme (DS) for describing streams or aggregations of multimedia objects, which may comprise audio, images, video, text, time series, and various other modalities. This description scheme can accommodate an essentially limitless number of descriptors in terms of features, semantics or metadata, and facilitate content-based search, index, and retrieval, among other capabilities, for both streamed or aggregated multimedia objects

NASA Technical Reports Server

Content on demand video adaptation based on MPEG-21 digital item adaptation

Author: A Garcia
BS Manjunath
D Cotroneo
D Mukherjee
F Pereira
G Panis
H Harroud
H Huang
J Gecsei
J Hunter
J Nam
J Xin
Jesse S Jin
JG Kim
JT Park
K Shen
KT Fung
Liang-Tien Chia
M Bertini
M Hicks
M Metso
M Naghshineh
M Xu
Min Xu
O Avaro
R Han
S Acharya
S Benyaminovich
S Khan
S Ramanathan
SF Chang
Suhuai Luo
VK Goyal
W Lee
W Li
W Yin
W Yuan
WH Cheng
X Wang
Xiangjian He
Y Wang
Yu Peng
Yusuo Hu
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref