9 research outputs found
A Model for Weighting Image Objects in Home Photographs
The paper presents a contribution to image indexing consisting of a weighting model for visible objects, or image objects, in home photographs. To improve its effectiveness, this weighting model has been designed according to human perception criteria about what is considered important in photographs. Four basic hypotheses related to human perception are presented, and their validity is assessed against actual observations from a user study. Finally, a formal definition of this weighting model is presented and its consistency with the user study is evaluated.
Investigation Report on Universal Multimedia Access
Universal Multimedia Access (UMA) refers to the ability of any user to access the desired multimedia content over any type of network, with any device, from anywhere and at any time. UMA is a key framework for multimedia content delivery services using metadata. This investigation report analyzes the state-of-the-art technologies in UMA and tries to identify the key issues of UMA. The report presents the state of the art in multimedia content adaptation, an overview of the standards that support the UMA framework, potential privacy problems in UMA systems, and some new UMA applications. It also outlines the challenges that remain to be resolved in UMA, in order to clarify the potential key problems and determine which ones to solve.
MAC-REALM: A video content feature extraction and modelling framework
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A consequence of the "data deluge" is the exponential increase in digital video footage, while the ability to find relevant video clips diminishes. Traditional text-based search engines are no longer optimal for searching, as they cannot provide a granular search of the content inside video footage. To be able to search the video in a content-based manner, the content features of the video need to be extracted and modelled into a content model, which can then act as a searchable proxy for the video content. This thesis focuses on the extraction of syntactic and semantic content features and on content modelling, using machine-driven processes with little or no user interaction. Our abstract framework design extracts syntactic and semantic content features and compiles them into an integrated content model. The framework integrates a four-plane strategy that consists of a pre-processing plane, which removes redundant data and filters the media to improve the feature extraction properties of the media; a syntactic feature extraction plane, which extracts low-level syntactic features and mid-level syntactic features that have semantic attributes; a semantic relationship analysis and linkage plane, where the spatial and temporal relationships of all the content features are defined; and finally a content modelling plane, where the syntactic and semantic content features are integrated into a content model. Each of the four planes can be split into three layers, namely the content layer, where the content to be processed is stored; the application layer, where the content is converted into content descriptions; and the MPEG-7 layer, where content descriptions are serialised. Using MPEG-7 standards to produce the content model will provide wide-ranging interoperability, while facilitating granular multi-content type searches.
The framework aims to "bridge" the semantic gap by integrating the syntactic and semantic content features from extraction through to modelling. The design of the framework has been implemented in a prototype called MAC-REALM, which has been tested and evaluated for its effectiveness in extracting and modelling content features. Conclusions are drawn about the research output as a whole and whether it has met the objectives. Finally, future work is presented on how concept detection and crowdsourcing can be used with MAC-REALM.
Enhancing person annotation for personal photo management using content and context based technologies
Rapid technological growth and the decreasing cost of photo capture mean that we are all taking more digital photographs than ever before. However, the lack of technology for automatically organising personal photo archives has left many users with poorly annotated photos, causing them great frustration when such photo collections are to be browsed or searched at a later time. As a result, there has recently been significant research interest in technologies for supporting effective annotation.
This thesis addresses an important sub-problem of the broad annotation problem, namely "person annotation" associated with personal digital photo management. Solutions to this problem are provided using content analysis tools in combination with context data within the experimental photo management framework called "MediAssist". Readily available image metadata, such as location and date/time, are captured from digital cameras with in-built GPS functionality, and thus provide knowledge about when and where the photos were taken. Such information is then used to identify the "real-world" events corresponding to certain activities in the photo capture process. The problem of enabling effective person annotation is formulated in such a way that both "within-event" and "cross-event" relationships of persons' appearances are captured.
The research reported in the thesis is built upon a firm foundation of content-based analysis technologies, namely face detection, face recognition, and body-patch matching together with data fusion.
Two annotation models are investigated in this thesis, namely progressive and non-progressive. The effectiveness of each model is evaluated against varying proportions of initial annotation, and the type of initial annotation based on individual and combined face, body-patch and person-context information sources. The results reported in the thesis strongly validate the use of multiple information sources for person annotation whilst emphasising the advantage of event-based photo analysis in real-life photo management systems.
New Frontiers in Universal Multimedia Access
Universal Multimedia Access (UMA) refers to the ability of any user to access the desired multimedia content over any type of network, with any device, from anywhere and at any time. UMA is a key framework for multimedia content delivery services using metadata. This report consists of three parts. The first part analyzes the state-of-the-art technologies in UMA, identifies the key issues, and outlines the new challenges that remain to be resolved in UMA. The key issues in UMA include the adaptation of multimedia content to bridge the gap between content creation and consumption, standardized metadata descriptions that facilitate the adaptation (e.g. MPEG-7, MPEG-21 DIA, CC/PP), and UMA system design that takes the target application into account. The second part introduces our approach to these challenges: how to jointly adapt multimedia content comprising different modalities and balance their presentation in an optimal way. A scheme for adapting audiovisual content and its metadata (text) to any screen is proposed to provide the best experience in browsing the desired content. The adaptation process is modeled as an optimization problem over the total value of the content provided to the user. The total content value is optimized by jointly controlling the balance between video and metadata presentation, the transformation of the video content, and the amount of metadata to be presented. Experimental results show that the proposed adaptation scheme enables users to browse audiovisual content with its metadata optimized to the screen size of their devices. The last part reports some potential UMA applications, focusing in particular on a universal access application to TV news archives as an example.