An object-based approach to retrieval of image and video content
Promising new directions have been opened up for content-based visual retrieval in recent years. Object-based retrieval, which allows users to manipulate video objects as part of their searching and browsing interaction, is one of these. This thesis forms part of a larger stream of research that investigates visual objects as a possible approach to advancing the use of semantics in content-based visual retrieval.
The notion of using objects in video retrieval has been seen as desirable for some years, but only very recently has technology begun to allow even very basic object-location functions on video. The main hurdles to greater use of objects in video retrieval are the overhead of object segmentation on large amounts of video and the question of whether objects can actually be used effectively for multimedia retrieval. Despite this, there are already some examples of work which supports retrieval based on video objects.
This thesis investigates an object-based approach to content-based visual retrieval. The main research contributions of this work are a study of shot boundary detection on compressed-domain video, where a fast detection approach is proposed and evaluated, and a study of the use of objects in interactive image retrieval. An object-based retrieval framework is developed in order to investigate object-based retrieval on a corpus of natural images and video. This framework contains the entire processing chain required to analyse, index and interactively retrieve images and video via object-to-object matching. The experimental results indicate that object-based searching consistently outperforms image-based searching using low-level features. This result goes some way towards validating the approach of allowing users to select objects as a basis for searching video archives when the information need makes it appropriate.
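The compressed-domain shot boundary detection mentioned above can be illustrated with a minimal sketch. The idea, under simplifying assumptions, is that DC-coefficient thumbnails are available from MPEG video without full decoding, so successive frames can be compared cheaply; a cut is flagged when the histogram difference between consecutive thumbnails exceeds a threshold. The frame representation, bin count and threshold here are illustrative choices, not the thesis's actual method.

```python
# Hypothetical sketch of fast compressed-domain shot boundary detection:
# compare successive frames using only their DC-coefficient thumbnails
# and flag a cut when the normalised histogram difference is large.

def histogram(dc_frame, bins=16, max_val=256):
    """Coarse greyscale histogram of a DC-coefficient thumbnail."""
    h = [0] * bins
    for v in dc_frame:
        h[min(v * bins // max_val, bins - 1)] += 1
    return h

def detect_cuts(dc_frames, threshold=0.4):
    """Return frame indices where the normalised histogram difference
    between consecutive thumbnails exceeds the threshold."""
    cuts = []
    for i in range(1, len(dc_frames)):
        h1, h2 = histogram(dc_frames[i - 1]), histogram(dc_frames[i])
        n = sum(h1) or 1
        diff = sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * n)
        if diff > threshold:
            cuts.append(i)
    return cuts
```

Working on DC thumbnails rather than fully decoded frames is what makes this kind of approach fast: the comparison touches two orders of magnitude fewer pixels per frame.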
An automatic technique for visual quality classification for MPEG-1 video
The Centre for Digital Video Processing at Dublin City University developed Físchlár [1], a web-based system for recording, analysis, browsing and playback of digitally captured television programs. One major issue for Físchlár is the automatic evaluation of video quality in order to avoid processing and storage of corrupted data. In this paper we propose an automatic classification technique that detects the video content quality in order to provide a decision criterion for the processing and storage stages.
Local wavelet features for statistical object classification and localisation
This article presents a system for texture-based probabilistic classification and localisation of 3D objects in 2D digital images and discusses selected applications. The objects are described by local feature vectors computed using the wavelet transform. In the training phase, object features are statistically modelled as normal density functions. In the recognition phase, a maximisation algorithm compares the learned density functions with the feature vectors extracted from a real scene and yields the classes and poses of the objects found in it. Experiments carried out on a real dataset of over 40,000 images demonstrate the robustness of the system in terms of classification and localisation accuracy. Finally, two important application scenarios are discussed, namely classification of museum artefacts and classification of metallography images.
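The statistical core of the approach above can be sketched briefly. As a simplifying assumption, each class is modelled here as a single normal density with diagonal covariance over feature vectors, and a query vector is assigned to the class maximising the log-likelihood; the wavelet feature extraction and pose estimation are omitted, and the function names are illustrative.

```python
# Minimal sketch of maximum-likelihood classification with per-class
# Gaussian density models (diagonal covariance), as an illustration of
# the training/recognition phases described in the abstract.
import math

def fit_class(vectors):
    """Training phase: estimate per-dimension mean and variance
    from the feature vectors of one class."""
    d, n = len(vectors[0]), len(vectors)
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    var = [sum((v[i] - mean[i]) ** 2 for v in vectors) / n + 1e-6
           for i in range(d)]
    return mean, var

def log_likelihood(x, mean, var):
    """Log of a diagonal Gaussian density evaluated at x."""
    return sum(-0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
               for xi, m, s in zip(x, mean, var))

def classify(x, models):
    """Recognition phase: return the class label whose learned
    density assigns x the highest log-likelihood."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))
```

In the real system such a comparison would run over many local feature vectors per image rather than a single vector, with the pose found by maximising over candidate transformations.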
Associating low-level features with semantic concepts using video objects and relevance feedback
The holy grail of multimedia indexing and retrieval is developing algorithms capable of imitating human abilities in distinguishing and recognising semantic concepts within the content, so that retrieval can be based on "real world" concepts that come naturally to users. In this paper, we discuss an approach to using segmented video objects as the mid-level connection between low-level features and semantic concept description. We consider a video object as a particular instance of a semantic concept and we model the semantic concept as an average representation of its instances. A system supporting object-based search through a test corpus is presented that allows matching pre-segmented objects based on automatically extracted low-level features. In the system, relevance feedback is employed to drive the learning of the semantic model during a regular search process.
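The modelling idea above can be sketched in a few lines. This is an illustrative reading, not the paper's implementation: a concept is represented as the average feature vector of its object instances, and relevance feedback applies a Rocchio-style update that pulls the model towards examples marked relevant and pushes it away from non-relevant ones. The learning rate is an assumed parameter.

```python
# Illustrative sketch: a semantic concept as the mean of its instances'
# low-level feature vectors, refined by relevance feedback.

def concept_model(instance_features):
    """Average the feature vectors of a concept's object instances."""
    n, d = len(instance_features), len(instance_features[0])
    return [sum(f[i] for f in instance_features) / n for i in range(d)]

def feedback_update(model, example, relevant, rate=0.2):
    """Rocchio-style update: move the concept model towards a
    relevant example, or away from a non-relevant one."""
    sign = 1.0 if relevant else -1.0
    return [m + sign * rate * (e - m) for m, e in zip(model, example)]
```

Repeating the update over a search session gradually specialises the average representation towards the user's actual information need.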
QIMERA: a software platform for video object segmentation and tracking
In this paper we present an overview of an ongoing collaborative project in the field of video object segmentation and tracking. The objective of the project is to develop a flexible, modular software architecture that can be used as a test-bed for segmentation algorithms. The background to the project is described, as is the first version of the software system itself. Some sample results for the first segmentation algorithm developed using the system are presented, and directions for future work are discussed.
Dublin City University video track experiments for TREC 2001
Dublin City University participated in the interactive search task and Shot Boundary Detection task of the TREC Video Track. In the interactive search task experiment, thirty people used three different digital video browsers to find video segments matching the given topics. Each user was under a time constraint of six minutes for each topic assigned to them. The purpose of this experiment was to compare video browsers, and so a method was developed for combining independent users' results for a topic into one set of results. Collated results based on thirty users are available herein, though individual users' and browsers' results are currently unavailable for comparison. Our purpose in participating in this TREC track was to create the ground truth within the TREC framework, which will allow us to do direct browser performance comparisons.
Using video objects and relevance feedback in video retrieval
Video retrieval is mostly based on using text from dialogue, and this remains the most significant component despite progress in other aspects. One problem with this arises when a searcher wants to locate video based on what is appearing in the video rather than what is being spoken about. Alternatives such as automatically-detected features and image-based keyframe matching can be used, though these still need further improvement in quality. One other modality for video retrieval is based on segmenting objects from video and allowing end users to use these as part of querying. This uses similarity between query objects and objects from video, and in theory allows retrieval based on what is actually appearing on-screen. The main hurdles to greater use of this are the overhead of object segmentation on large amounts of video and the issue of whether we can actually achieve effective object-based retrieval.
We describe a system to support object-based video retrieval where a user selects example video objects as part of the query. During a search, a user builds up a set of these which are matched against objects previously segmented from a video library. This matching is based on the MPEG-7 Dominant Colour, Shape Compaction and Texture Browsing descriptors. We use a user-driven, semi-automated segmentation process to segment the video archive, which is very accurate and is faster than conventional video annotation.
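The object-to-object matching described above can be sketched as a weighted combination of per-descriptor distances. The descriptor names mirror those in the text, but the distance measure (L1) and the weights are illustrative assumptions, not the MPEG-7 matching functions or the system's actual configuration.

```python
# Hedged sketch of object matching over several visual descriptors:
# each descriptor contributes a distance, combined by a weighted sum.

def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def object_distance(query, candidate, weights):
    """Weighted combination of per-descriptor distances between two
    objects, each represented as {descriptor_name: feature_vector}."""
    return sum(w * l1_distance(query[name], candidate[name])
               for name, w in weights.items())

def rank_objects(query, library, weights):
    """Return library objects sorted by ascending distance to the query."""
    return sorted(library, key=lambda obj: object_distance(query, obj, weights))
```

In an interactive setting the weights could themselves be adjusted from user feedback, emphasising whichever descriptor best separates relevant from non-relevant objects for the current query.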
Addressing the challenge of managing large-scale digital multimedia libraries
Traditional Digital Libraries require human editorial control over the lifecycles of the digital objects contained therein. This imposes an inherent (human) overhead on the maintenance of these digital libraries, which becomes unwieldy once the number of important information units in the digital library becomes too large. A revised framework is needed for digital libraries that takes the onus off the editor and allows the digital library to directly control digital object lifecycles, by employing a set of transformation rules that operate directly on the digital objects themselves. In this paper we motivate and describe a revised digital library framework that utilises transformation rules to automatically optimise system resources. We evaluate this library in three scenarios and also outline how we could apply concepts from this revised framework to address other challenges for digital libraries and digital information access in general.
TRECVid 2005 experiments at Dublin City University
In this paper we describe our experiments in the automatic and interactive search tasks and the BBC rushes pilot task of TRECVid 2005. Our approach this year is somewhat different from previous submissions in that we have implemented a multi-user search system using a DiamondTouch tabletop device from Mitsubishi Electric Research Labs (MERL). We developed two versions of our system: one with emphasis on efficient completion of the search task (Físchlár-DT Efficiency) and the other with more emphasis on increasing awareness among searchers (Físchlár-DT Awareness). We supplemented these runs with a further two runs, one for each of the two systems, in which we augmented the initial results with results from an automatic run. In addition to these interactive submissions we also submitted three fully automatic runs. We also took part in the BBC rushes pilot task, where we indexed the video by semi-automatic segmentation of objects appearing in the video, and our search/browsing system allows full keyframe and/or object-based searching. In the interactive search experiments we found that the awareness system outperformed the efficiency system. We also found that supplementing the interactive results with the results of an automatic run improves both the Mean Average Precision and Recall values for both system variants. Our results suggest that providing awareness cues in a collaborative search setting improves retrieval performance. We also learned that multi-user searching is a viable alternative to the traditional single-searcher paradigm, provided the system is designed to effectively support collaboration.
Balancing simplicity and functionality in designing user-interface for an interactive TV
Recent computer vision and content-based multimedia techniques such as scene segmentation, face detection, searching through video clips, and video summarisation are potentially useful tools for enhancing interactive TV (iTV). However, the technical nature and relative immaturity of these tools make it difficult to present the new functionalities they afford in an easy-to-use manner on a TV interface, where simplicity is critical and viewers are not necessarily proficient in advanced or highly sophisticated interaction using a remote control. By introducing multiple layers of interaction sophistication and unobtrusive semi-transparent panels that can be invoked immediately, without a menu hierarchy or a complex sequence of actions, we developed an iTV application featuring powerful content retrieval techniques while providing a streamlined and simple interface that gracefully leverages these techniques. An initial version of the interface is ready for demonstration.