2,070 research outputs found
Highly efficient low-level feature extraction for video representation and retrieval.
PhD
Witnessing the omnipresence of digital video media, the research community has
raised the question of its meaningful use and management. Stored in immense
multimedia databases, digital videos need to be retrieved and structured in an
intelligent way, relying on the content and the rich semantics involved. Current
Content Based Video Indexing and Retrieval systems face the problem of the semantic
gap between the simplicity of the available visual features and the richness of user
semantics.
This work focuses on the issues of efficiency and scalability in video indexing and
retrieval to facilitate a video representation model capable of semantic annotation. A
highly efficient algorithm for temporal analysis and key-frame extraction is developed.
It is based on the prediction information extracted directly from the compressed domain
features and the robust scalable analysis in the temporal domain. Furthermore,
a hierarchical quantisation of the colour features in the descriptor space is presented.
Derived from the extracted set of low-level features, a video representation model that
enables semantic annotation and contextual genre classification is designed.
Results demonstrate the efficiency and robustness of the temporal analysis algorithm,
which runs in real time while maintaining high precision and recall in the detection task.
Adaptive key-frame extraction and summarisation achieve a good overview of the
visual content, while the colour quantisation algorithm efficiently creates a hierarchical
set of descriptors. Finally, the video representation model, supported by the genre
classification algorithm, achieves excellent results in an automatic annotation system by
linking the video clips with a limited lexicon of related keywords.
Recent Trends in Computational Intelligence
Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as reverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies such as swarm intelligence, as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the use of CI for solving various applications optimally, demonstrating its wide reach and relevance. Combining optimization methods and data mining strategies yields a strong and reliable prediction tool for handling real-life applications.
Organising and structuring a visual diary using visual interest point detectors
As wearable cameras become more popular, researchers are increasingly focusing on novel applications to manage the large volume of data these devices produce. One such application is the construction of a Visual Diary from an individual's photographs. Microsoft's SenseCam, a device designed to passively record a Visual Diary and cover a typical day of the user wearing the camera, is an example of one such device. The vast quantity of images generated by these devices means that the management and organisation of these collections is not a trivial matter.
We believe wearable cameras, such as SenseCam, will become more popular in the future and that managing the volume of data generated by these devices is a key issue.
Although there is a significant volume of work in the literature on object detection and recognition and on scene classification, there is little work in the area of setting detection. Furthermore, few authors have examined the issues involved in analysing extremely large image collections (like a Visual Diary) gathered over a long period of time. An algorithm developed for setting detection should be capable of clustering images captured at the same real-world locations (e.g. in the dining room at home, in front of the computer in the office, in the park, etc.). This requires the selection and implementation of suitable methods to identify visually similar backgrounds in images using their visual features. We present a number of approaches to setting detection based on the extraction of visual interest points from the images. We also analyse the performance of two of the most popular descriptors: Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). We present an implementation of a Visual Diary application and evaluate its performance via a series of user experiments. Finally, we outline some techniques to allow the Visual Diary to automatically detect new settings, to scale as the image collection continues to grow substantially over time, and to allow the user to generate a personalised summary of their data.
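To make the idea of setting detection concrete, the following is a minimal sketch of grouping images by matched interest-point descriptors. It is not the method of the work summarised above, whose clustering approach the abstract does not specify. It assumes descriptors have already been extracted (for example by SIFT or SURF) and are given as plain numeric tuples. The function names, the nearest-neighbour ratio test with `ratio=0.8`, the `min_matches` threshold, and the greedy grouping strategy are all illustrative assumptions.

```python
import math


def count_matches(desc_a, desc_b, ratio=0.8):
    """Count descriptors in desc_a that pass a nearest-neighbour ratio test against desc_b.

    A descriptor counts as matched when its closest neighbour in desc_b is
    clearly closer than its second-closest neighbour (hypothetical ratio 0.8).
    """
    if len(desc_b) < 2:
        return 0
    matches = 0
    for d in desc_a:
        dists = sorted(math.dist(d, e) for e in desc_b)
        if dists[0] < ratio * dists[1]:
            matches += 1
    return matches


def group_by_setting(images, min_matches=2):
    """Greedily group images: each image joins the first setting whose
    representative (first member) it matches well enough, else starts a new one.

    `images` maps an image name to its list of descriptors.
    """
    settings = []
    for name, desc in images.items():
        for members in settings:
            representative = images[members[0]]
            if count_matches(desc, representative) >= min_matches:
                members.append(name)
                break
        else:
            settings.append([name])
    return settings


# Toy 2-D "descriptors" (real SIFT/SURF descriptors have 128 or 64 dimensions).
toy_images = {
    "office_1": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "office_2": [(0.1, 0.0), (10.0, 0.1), (0.0, 10.1)],
    "park_1": [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0)],
}
print(group_by_setting(toy_images))
```

A greedy scheme like this handles new settings naturally, since an unmatched image simply opens a new group, but its result depends on the order in which images arrive. A production system would more likely index descriptors rather than compare every pair exhaustively.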
MAC-REALM: A video content feature extraction and modelling framework
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A consequence of the "data deluge" is the exponential increase in digital video footage, while the ability to find relevant video clips diminishes. Traditional text-based search engines are no longer optimal for searching, as they cannot provide a granular search of the content inside video footage. To be able to search the video in a content-based manner, the content features of the video need to be extracted and modelled into a content model, which can then act as a searchable proxy for the video content. This thesis focuses on the extraction of syntactic and semantic content features and content modelling, using machine-driven processes, with either little or no user interaction. Our abstract framework design extracts syntactic and semantic content features and compiles them into an integrated content model. The framework integrates a four-plane strategy that consists of a pre-processing plane that removes redundant data and filters the media to improve the feature extraction properties of the media; a syntactic feature extraction plane that extracts low-level syntactic features and mid-level syntactic features that have semantic attributes; a semantic relationship analysis and linkage plane, where the spatial and temporal relationships of all the content features are defined; and finally a content modelling stage where the syntactic and semantic content features are integrated into a content model. Each of the four planes can be split into three layers, namely the content layer, where the content to be processed is stored; the application layer, where the content is converted into content descriptions; and the MPEG-7 layer, where content descriptions are serialised. Using MPEG-7 standards to produce the content model will provide wide-ranging interoperability, while facilitating granular multi-content type searches.
The framework aims to "bridge" the semantic gap by integrating the syntactic and semantic content features from extraction through to modelling. The design of the framework has been implemented in a prototype called MAC-REALM, which has been tested and evaluated for its effectiveness in extracting and modelling content features. Conclusions are drawn about the research output as a whole and whether it has met the objectives. Finally, future work is presented on how concept detection and crowdsourcing can be used with MAC-REALM.
A heuristic for the retrieval of objects in low resolution video
In this paper, we tackle the problem of matching objects in video in the context of the rough indexing paradigm. In this context, the video data are of very low resolution and segmentation is consequently inaccurate. The region features (texture, color, shape) are not strongly relevant due to the resolution. The structure of the objects must be considered in order to improve the robustness of the matching of regions. Indeed, the problem of object matching can be expressed in terms of directed acyclic graph (DAG) matching. Here, we propose a method based on a heuristic in order to approach object matching. The results are compared with those of a method based on relaxation matching.
Efficient Techniques for Management and Delivery of Video Data
The rapid advances in electronic imaging, storage, data compression, telecommunications, and networking technology have resulted in the vast creation and use of digital videos in many important applications such as digital libraries, distance learning, public information systems, electronic commerce, movie on demand, etc. This brings about the need for management as well as delivery of video data. Organizing and managing video data, however, is much more complex than managing conventional text data due to their semantically rich and unstructured contents. Also, the enormous size of video files requires high communication bandwidth for data delivery. In this dissertation, I present the following techniques for video data management and delivery. Decomposing video into meaningful pieces (i.e., shots) is a fundamental step in handling the complicated contents of video data. Content-based video parsing techniques are presented and analyzed. In order to reduce the computation cost substantially, a non-sequential approach to shot boundary detection is investigated. Efficient browsing and indexing of video data are essential for video data management. Non-linear browsing and cost-effective indexing schemes for video data based on their contents are described and evaluated. Delivering long videos over limited bandwidth while satisfying various user requests is challenging work. To reduce the demand on this bandwidth, a hybrid of two effective approaches, periodic broadcast and scheduled multicast, is discussed and simulated.
The current techniques related to the above work are discussed thoroughly to explain their advantages and disadvantages and to motivate the new, improved schemes. A substantial number of experiments and simulations, along with the underlying concepts, are provided to compare the introduced techniques with existing ones. The results indicate that they outperform recent techniques by a significant margin. I conclude the dissertation with a discussion of future research directions.
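As an illustration of why a non-sequential approach to shot boundary detection can reduce computation, the sketch below compares frames a fixed stride apart and bisects only when the two ends differ. The abstract does not describe the dissertation's actual algorithm, so this stride-and-bisect scheme is a stand-in. The function names, the L1 histogram distance, and the `step` and `threshold` values are assumptions, and the sketch assumes every shot is longer than `step` frames, so that at most one cut falls in any stride.

```python
def frame_distance(a, b):
    """L1 distance between two equal-length frame histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_shot_boundaries(frames, step=8, threshold=0.5):
    """Return (boundaries, comparisons).

    `boundaries` lists each index i at which frames[i] starts a new shot.
    `comparisons` counts frame-pair comparisons, to contrast with the
    len(frames) - 1 comparisons of a purely sequential scan.
    """
    boundaries = []
    comparisons = 0
    last = len(frames) - 1
    left = 0
    while left < last:
        right = min(left + step, last)
        comparisons += 1
        if frame_distance(frames[left], frames[right]) > threshold:
            # A cut lies in (left, right]; bisect to locate it.
            lo, hi = left, right
            while hi - lo > 1:
                mid = (lo + hi) // 2
                comparisons += 1
                if frame_distance(frames[left], frames[mid]) > threshold:
                    hi = mid
                else:
                    lo = mid
            boundaries.append(hi)
            left = hi
        else:
            # Both ends look alike: skip the whole stride.
            left = right
    return boundaries, comparisons


# Three synthetic shots of 20, 15, and 10 frames with distinct histograms.
toy_frames = [(1, 0, 0)] * 20 + [(0, 1, 0)] * 15 + [(0, 0, 1)] * 10
print(find_shot_boundaries(toy_frames))
```

The saving comes from skipping strides whose end frames look alike. The trade-off is that shots shorter than the stride, or gradual transitions, can be missed, which is why the stride length matters in practice.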