1,265 research outputs found
Indexing Techniques for Image and Video Databases: an approach based on Animate Vision Paradigm
[ITALIANO]In questo lavoro di tesi vengono presentate e discusse delle innovative tecniche di indicizzazione per database video e di immagini basate sul paradigma della “Animate Vision” (Visione Animata).
Da un lato, sarà mostrato come utilizzando, quali algoritmi di analisi di una data immagine, alcuni meccanismi di visione biologica, come i movimenti saccadici e le fissazioni dell'occhio umano, sia possibile ottenere un query processing in database di immagini più efficace ed efficiente. In particolare, verranno discussi, la metodologia grazie alla quale risulta possibile generare due sequenze di fissazioni, a partire rispettivamente, da un'immagine di query I_q ed una di test I_t del data set, e, come confrontare tali sequenze al fine di determinare una possibile misura della similarità (consistenza) tra le due immagini. Contemporaneamente, verrà discusso come tale approccio unito a tecniche classiche di clustering possa essere usato per scoprire le associazioni semantiche nascoste tra immagini, in termini di categorie, che, di contro, permettono un'automatica pre-classificazione (indicizzazione) delle immagini e possono essere usate per guidare e migliorare il processo di query. Saranno presentati, infine, dei risultati preliminari e l'approccio proposto sarà confrontato con le più recenti tecniche per il recupero di immagini descritte in letteratura.
Dall'altro lato, sarà mostrato come utilizzando la precedente rappresentazione “foveata” di un'immagine, risulti possibile partizionare un video in shot. Più precisamente, il metodo per il rilevamento dei cambiamenti di shot si baserà sulla computazione, in ogni istante di tempo, della misura di consistenza tra le sequenze di fissazioni generate da un osservatore ideale che guarda il video. Lo schema proposto permette l'individuazione, attraverso l'utilizzo di un'unica tecnica anziché di più metodi dedicati, sia delle transizioni brusche sia di quelle graduali. Vengono infine mostrati i risultati ottenuti su varie tipologie di video e, come questi, validano l'approccio proposto. / [INGLESE]In this dissertation some novel indexing techniques for video and image database based on “Animate Vision” Paradigm are presented and discussed.
From one hand, it will be shown how, by embedding within image inspection algorithms active mechanisms of biological vision such as saccadic eye movements and fixations, a more effective query processing in image database can be achieved. In particular, it will be discussed the way to generate two fixation sequences from a query image I_q and a test image I_t of the data set, respectively, and how to compare the two sequences in order to compute a possible similarity (consistency) measure between the two images. Meanwhile, it will be shown how the approach can be used with classical clustering techniques to discover and represent the hidden semantic associations among images, in terms of categories, which, in turn, allow an automatic pre-classification (indexing), and can be used to drive and improve the query processing. Eventually, preliminary results will be presented and the proposed approach compared with the most recent techniques for image retrieval described in the literature.
From the other one, it will be discussed how by taking advantage of such foveated representation of an image, it is possible to partitioning of a video into shots. More precisely, the shot-change detection method will be based on the computation, at each time instant, of the consistency measure of the fixation sequences generated by an ideal observer looking at the video. The proposed scheme aims at detecting both abrupt and gradual transitions between shots using a single technique, rather than a set of dedicated methods. Results on videos of various content types are reported and validate the proposed approach
06171 Abstracts Collection -- Content-Based Retrieval
From 23.04.06 to 28.04.06, the Dagstuhl Seminar 06171 `Content-Based Retrieval\u27\u27
was held in the International Conference and Research Center (IBFI),
Schloss Dagstuhl.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available
Gaussian mixture model classifiers for detection and tracking in UAV video streams.
Masters Degree. University of KwaZulu-Natal, Durban.Manual visual surveillance systems are subject to a high degree of human-error and operator fatigue. The automation of such systems often employs detectors, trackers and classifiers as fundamental building blocks. Detection, tracking and classification are especially useful and challenging in Unmanned Aerial Vehicle (UAV) based surveillance systems. Previous solutions have addressed challenges via complex classification methods. This dissertation proposes less complex Gaussian Mixture Model (GMM) based classifiers that can simplify the process; where data is represented as a reduced set of model parameters, and classification is performed in the low dimensionality parameter-space. The specification and adoption of GMM based classifiers on the UAV visual tracking feature space formed the principal contribution of the work. This methodology can be generalised to other feature spaces.
This dissertation presents two main contributions in the form of submissions to ISI accredited journals. In the first paper, objectives are demonstrated with a vehicle detector incorporating a two stage GMM classifier, applied to a single feature space, namely Histogram of Oriented Gradients (HoG). While the second paper demonstrates objectives with a vehicle tracker using colour histograms (in RGB and HSV), with Gaussian Mixture Model (GMM) classifiers and a Kalman filter.
The proposed works are comparable to related works with testing performed on benchmark datasets. In the tracking domain for such platforms, tracking alone is insufficient. Adaptive detection and classification can assist in search space reduction, building of knowledge priors and improved target representations. Results show that the proposed approach improves performance and robustness. Findings also indicate potential further enhancements such as a multi-mode tracker with global and local tracking based on a combination of both papers
Similarity search and data mining techniques for advanced database systems.
Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects, on the other hand it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques.
Based on an analysis of typical advanced database systems — such as biometrical, biological, multimedia, moving, and CAD-object database systems — the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects.
The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness.
The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections.
The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets
Similarity search and data mining techniques for advanced database systems.
Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects, on the other hand it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques.
Based on an analysis of typical advanced database systems — such as biometrical, biological, multimedia, moving, and CAD-object database systems — the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects.
The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness.
The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections.
The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets
Recommended from our members
Interactive Imaging via Hand Gesture Recognition.
With the growth of computer power, Digital Image Processing plays a more and more important role in the modern world, including the field of industry, medical, communications, spaceflight technology etc. As a sub-field, Interactive Image Processing emphasizes particularly on the communications between machine and human. The basic flowchart is definition of object, analysis and training phase, recognition and feedback. Generally speaking, the core issue is how we define the interesting object and track them more accurately in order to complete the interaction process successfully.
This thesis proposes a novel dynamic simulation scheme for interactive image processing. The work consists of two main parts: Hand Motion Detection and Hand Gesture recognition. Within a hand motion detection processing, movement of hand will be identified and extracted. In a specific detection period, the current image is compared with the previous image in order to generate the difference between them. If the generated difference exceeds predefined threshold alarm, a typical hand motion movement is detected. Furthermore, in some particular situations, changes of hand gesture are also desired to be detected and classified. This task requires features extraction and feature comparison among each type of gestures. The essentials of hand gesture are including some low level features such as color, shape etc. Another important feature is orientation histogram. Each type of hand gestures has its particular representation in the domain of orientation histogram. Because Gaussian Mixture Model has great advantages to represent the object with essential feature elements and the Expectation-Maximization is the efficient procedure to compute the maximum likelihood between testing images and predefined standard sample of each different gesture, the comparability between testing image and samples of each type of gestures will be estimated by Expectation-Maximization algorithm in Gaussian Mixture Model. The performance of this approach in experiments shows the proposed method works well and accurately
Semantic multimedia analysis using knowledge and context
PhDThe difficulty of semantic multimedia analysis can be attributed to the
extended diversity in form and appearance exhibited by the majority of
semantic concepts and the difficulty to express them using a finite number
of patterns. In meeting this challenge there has been a scientific debate
on whether the problem should be addressed from the perspective of using
overwhelming amounts of training data to capture all possible instantiations
of a concept, or from the perspective of using explicit knowledge about
the concepts’ relations to infer their presence. In this thesis we address
three problems of pattern recognition and propose solutions that combine
the knowledge extracted implicitly from training data with the knowledge
provided explicitly in structured form. First, we propose a BNs modeling
approach that defines a conceptual space where both domain related evi-
dence and evidence derived from content analysis can be jointly considered
to support or disprove a hypothesis. The use of this space leads to sig-
nificant gains in performance compared to analysis methods that can not
handle combined knowledge. Then, we present an unsupervised method
that exploits the collective nature of social media to automatically obtain
large amounts of annotated image regions. By proving that the quality of
the obtained samples can be almost as good as manually annotated images
when working with large datasets, we significantly contribute towards scal-
able object detection. Finally, we introduce a method that treats images,
visual features and tags as the three observable variables of an aspect model
and extracts a set of latent topics that incorporates the semantics of both
visual and tag information space. By showing that the cross-modal depen-
dencies of tagged images can be exploited to increase the semantic capacity
of the resulting space, we advocate the use of all existing information facets
in the semantic analysis of social media
- …