1,416 research outputs found
Segmentation for Image Indexing and Retrieval on Discrete Cosines Domain
This paper used region growing segmentation technique to segment the Discrete Cosines (DC) image. The classic problem of content Based image retrieval (CBIR) is the lack of accuracy in matching between image query and image in the database. By using region growing technique on DC image,it reduced the number of image regions indexed. The proposed of recursive region growing is not new technique but its application on DC images to build indexing keys is quite new and not yet presented by many authors. The experimental results show that the proposed methods on segmented images present good precision which are higher than 0.60 on all classes. So, it could be concluded that region growing segmented based CBIR more efficient compared to DC images in term of their precision 0.59 and 0.75, respectively. Moreover, DC based CBIR can save time and simplify algorithm compared to DCT images. The most significant finding from this work is instead of using 64 DCT coefficients this research only used 1/64 coefficients which is DC coefficient.
Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to state-of-the-art. Especially, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.Comment: 29 pages, 16 figure
Audiovisual processing for sports-video summarisation technology
In this thesis a novel audiovisual feature-based scheme is proposed for the automatic summarization of sports-video content The scope of operability of the scheme is designed to encompass the wide variety o f sports genres that come under the description ‘field-sports’. Given the assumption that, in terms of conveying the narrative of a field-sports-video, score-update events constitute the most significant moments, it is proposed that their detection should thus yield a favourable summarisation solution. To this end, a generic methodology is proposed for the automatic identification of score-update events in field-sports-video content. The scheme is based on the development of robust extractors for a set of critical features, which are shown to reliably indicate their locations. The evidence gathered by the feature extractors is combined and analysed using a Support Vector Machine (SVM), which performs the event detection process. An SVM is chosen on the basis that its underlying technology represents an implementation of the latest generation of machine learning algorithms, based on the recent advances in statistical learning. Effectively, an SVM offers a solution to optimising the classification performance of a decision hypothesis, inferred from a given set of training data. Via a learning phase that utilizes a 90-hour field-sports-video trainmg-corpus, the SVM infers a score-update event model by observing patterns in the extracted feature evidence. Using a similar but distinct 90-hour evaluation corpus, the effectiveness of this model is then tested genencally across multiple genres of fieldsports- video including soccer, rugby, field hockey, hurling, and Gaelic football. The results suggest that in terms o f the summarization task, both high event retrieval and content rejection statistics are achievable
Pattern Recognition
Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition
Highly efficient low-level feature extraction for video representation and retrieval.
PhDWitnessing the omnipresence of digital video media, the research community has
raised the question of its meaningful use and management. Stored in immense
multimedia databases, digital videos need to be retrieved and structured in an
intelligent way, relying on the content and the rich semantics involved. Current
Content Based Video Indexing and Retrieval systems face the problem of the semantic
gap between the simplicity of the available visual features and the richness of user
semantics.
This work focuses on the issues of efficiency and scalability in video indexing and
retrieval to facilitate a video representation model capable of semantic annotation. A
highly efficient algorithm for temporal analysis and key-frame extraction is developed.
It is based on the prediction information extracted directly from the compressed domain
features and the robust scalable analysis in the temporal domain. Furthermore,
a hierarchical quantisation of the colour features in the descriptor space is presented.
Derived from the extracted set of low-level features, a video representation model that
enables semantic annotation and contextual genre classification is designed.
Results demonstrate the efficiency and robustness of the temporal analysis algorithm
that runs in real time maintaining the high precision and recall of the detection task.
Adaptive key-frame extraction and summarisation achieve a good overview of the
visual content, while the colour quantisation algorithm efficiently creates hierarchical
set of descriptors. Finally, the video representation model, supported by the genre
classification algorithm, achieves excellent results in an automatic annotation system by
linking the video clips with a limited lexicon of related keywords
Recommended from our members
Image database retrieval using neural networks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The broad objective of this work has been to achieve retrieval of images from large unconstrained databases using image content. The problem is typified by the need to locate a target image within a database where no numerical indexing terms exist. Here, retrieval is based on important features within in an image and uses sample images or user sketches to specify a query. A typical query might be framed as "Find all images similar to this one", for example. The aim of this work has been to show how neural networks can provide a practical, flexible and robust solution to this problem. A neural network is basically an adaptive information filter which can be used to extract the salient characteristics of a data set during a training phase. The transformation learnt by the network can map the images into compact indices which support very rapid fuzzy matching of images across the database. This learning process optimises the performance of the code with respect to the contents of the database. We assess the applicability of several neural network architectures and learning rules for a practical coding scheme and investigate how the system parameters affect the performance of the system. We introduce a novel learning law which has a number of advantages over existing paradigms. In-depth mathematical analysis and extensive empirical tests are used to corroborate the arguments presented throughout. This thesis aims to show the nature of the image retrieval problem, how current research trends attempt to tackle it and how neural networks can offer us a real alternative to conventional approaches
Automatic indexing of video content via the detection of semantic events
The number, and size, of digital video databases is continuously growing. Unfortunately, most, if not all, of the video content in these databases is stored without any sort of indexing or analysis and without any associated metadata. If any of the videos do have metadata, then it is usually the result of some manual annotation process rather than any automatic indexing. Thus, locating clips and browsing content is difficult, time consuming and generally inefficient. The task of automatically indexing movies is particularly difficult given their innovative creation process and the individual style of many film makers. However, there are a number of underlying film grammar conventions that are universally followed, from a Hollywood blockbuster to an underground movie with a limited budget. These conventions dictate many elements of film making such as camera placement and editing. By examining the use of these conventions it is possible to extract information about the events in a movie. This research aims to provide an approach that creates an indexed version of a movie to facilitate ease of browsing and efficient retrieval. In order to achieve this aim, all of the relevant events contained within a movie are detected and classified into a predefined index. The event detection process involves examining the underlying structure of a movie and utilising audiovisual analysis techniques, supported by machine learning algorithms, to extract information based on this structure. The result is an indexed movie that can be presented to users for browsing/retrieval of relevant events, as well as supporting user specified searching. Extensive evaluation of the indexing approach is carried out. This evaluation indicates efficient performance of the event detection and retrieval system, and also highlights the subjective nature of video content
Image similarity in medical images
Recent experiments have indicated a strong influence of the substrate grain orientation on the self-ordering in anodic porous alumina. Anodic porous alumina with straight pore channels grown in a stable, self-ordered manner is formed on (001) oriented Al grain, while disordered porous pattern is formed on (101) oriented Al grain with tilted pore channels growing in an unstable manner. In this work, numerical simulation of the pore growth process is carried out to understand this phenomenon. The rate-determining step of the oxide growth is assumed to be the Cabrera-Mott barrier at the oxide/electrolyte (o/e) interface, while the substrate is assumed to determine the ratio β between the ionization and oxidation reactions at the metal/oxide (m/o) interface. By numerically solving the electric field inside a growing porous alumina during anodization, the migration rates of the ions and hence the evolution of the o/e and m/o interfaces are computed. The simulated results show that pore growth is more stable when β is higher. A higher β corresponds to more Al ionized and migrating away from the m/o interface rather than being oxidized, and hence a higher retained O:Al ratio in the oxide. Experimentally measured oxygen content in the self-ordered porous alumina on (001) Al is indeed found to be about 3% higher than that in the disordered alumina on (101) Al, in agreement with the theoretical prediction. The results, therefore, suggest that ionization on (001) Al substrate is relatively easier than on (101) Al, and this leads to the more stable growth of the pore channels on (001) Al
- …