47 research outputs found
Object recognition by extended geometric indexing.
National audience. Several works already address the recognition problem. Some use the prediction-verification paradigm; their efficiency generally depends on the implementation of the tree search inherent to this problem. Lamdan and Wolfson [7] were the first to introduce the notion of Geometric Hashing, or geometric indexing. Like the methods of the previous class, this approach relies on the assumption of a rigid motion between the unknown image and its model. However, the tree search is reduced to indexing via hash tables, which considerably reduces the complexity of the problem. To overcome the rigidity constraint, other authors have tried to implement systems based on more topological criteria [6]; unfortunately, these are very sensitive to noise. Other methods, such as those based on stochastic approaches, have the drawback of being very strongly tied to the initial modeling of the problem and consequently lack the necessary flexibility. Our method is closest to Geometric Hashing; however, the latter solves only the indexing problem, not the matching problem. We propose to do both simultaneously, making it possible to quickly find one (or several) model(s) in an unknown image and to provide quantitative information about the positions of the primitives that contributed to the recognition and their correspondence with those of the recognized model.
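The indexing idea behind Geometric Hashing can be sketched as follows. This is a minimal illustration assuming 2-D point features and similarity-invariant ordered basis pairs; the model names, point sets, and quantization step are hypothetical, and the original scheme of Lamdan and Wolfson treats affine invariance and noise more carefully.

```python
from collections import defaultdict
from itertools import permutations

def basis_coords(p, a, b):
    # Express p in the frame where a maps to (0, 0) and b maps to (1, 0);
    # these coordinates are invariant to translation, rotation, and scale.
    ux, uy = b[0] - a[0], b[1] - a[1]
    n2 = ux * ux + uy * uy
    dx, dy = p[0] - a[0], p[1] - a[1]
    return ((dx * ux + dy * uy) / n2, (-dx * uy + dy * ux) / n2)

def quantize(c, step=0.25):
    # Hypothetical quantization step; a real system tunes this to the noise.
    return (round(c[0] / step), round(c[1] / step))

def build_table(models):
    # Offline stage: for every model and every ordered basis pair, hash the
    # quantized coordinates of the remaining points under the model's name.
    table = defaultdict(list)
    for name, pts in models.items():
        for i, j in permutations(range(len(pts)), 2):
            for k, p in enumerate(pts):
                if k != i and k != j:
                    table[quantize(basis_coords(p, pts[i], pts[j]))].append(name)
    return table

def recognize(table, scene):
    # Online stage: try scene basis pairs and vote for the models whose
    # hashed entries collide with the scene's quantized coordinates.
    votes = defaultdict(int)
    for i, j in permutations(range(len(scene)), 2):
        for k, p in enumerate(scene):
            if k != i and k != j:
                key = quantize(basis_coords(p, scene[i], scene[j]))
                for name in table.get(key, ()):
                    votes[name] += 1
    return max(votes, key=votes.get) if votes else None
```

Because the hashed coordinates are frame-invariant, a scaled, rotated, or translated copy of a stored model produces the same keys and accumulates votes for it.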
Object representation and recognition
One of the primary functions of the human visual system is object recognition, an ability that allows us to relate the visual stimuli falling on our retinas to our knowledge of the world. For example, object recognition allows you to use knowledge of what an apple looks like to find it in the supermarket, to use knowledge of what a shark looks like to swim in th
On the Mahalanobis Distance Classification Criterion for Multidimensional Normal Distributions
Many existing engineering works model the statistical characteristics of the entities under study as normal distributions. These models are eventually used for decision making, which in practice requires defining the classification region corresponding to the desired confidence level. Surprisingly, however, a great number of computer vision works using multidimensional normal models leave the confidence regions unspecified or fail to establish them correctly, due to misconceptions about the features of Gaussian functions or to wrong analogies with the unidimensional case. The resulting regions incur deviations that can be unacceptable in high-dimensional models.
Here we provide a comprehensive derivation of the optimal confidence regions for multivariate normal distributions of arbitrary dimensionality. To this end, we first derive the condition for region optimality for general continuous multidimensional distributions, and then apply it to the widespread case of the normal probability density function. The obtained results are used to analyze the confidence error incurred by previous works in vision research, showing that deviations caused by wrong regions may become unacceptable as dimensionality increases. To support the theoretical analysis, a quantitative example is given in the context of moving object detection by means of background modeling.
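The optimal region described above can be checked with a small Monte Carlo experiment: for a d-dimensional normal, the p-confidence region is the ellipsoid where the squared Mahalanobis distance stays below the chi-square quantile with d degrees of freedom. This is only a sketch with made-up 2-D parameters and the tabulated quantile, not the paper's derivation; it also reproduces one wrong 1-D analogy the abstract alludes to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D normal model (parameter values are assumptions).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Correct 95% confidence region: squared Mahalanobis distance
# (x - mu)^T Sigma^{-1} (x - mu) <= chi-square quantile with 2 dof.
CHI2_2_95 = 5.991  # 0.95 quantile of the chi-square law, 2 dof

x = rng.multivariate_normal(mu, Sigma, size=200_000)
d = x - mu
m2 = np.einsum('ij,jk,ik->i', d, np.linalg.inv(Sigma), d)
coverage = (m2 <= CHI2_2_95).mean()  # close to 0.95

# Wrong analogy with the 1-D case: clipping each coordinate at 1.96 sigma.
# Each marginal covers 95%, but the joint axis-aligned box covers less.
box = np.all(np.abs(d) <= 1.96 * np.sqrt(np.diag(Sigma)), axis=1).mean()
print(round(coverage, 3), round(box, 3))
```

The gap between the two empirical coverages illustrates the kind of deviation the paper quantifies, and it widens as dimensionality grows.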
Closed-Loop Learning of Visual Control Policies
In this paper we present a general, flexible framework for learning mappings from images to actions by interacting with the environment. The basic idea is to introduce a feature-based image classifier in front of a reinforcement learning algorithm. The classifier partitions the visual space according to the presence or absence of a few highly informative local descriptors that are incrementally selected in a sequence of attempts to remove perceptual aliasing. We also address the problem of fighting overfitting in such a greedy algorithm. Finally, we show how high-level visual features can be generated when the power of local descriptors is insufficient to completely disambiguate the aliased states. This is done by building a hierarchy of composite features that consist of recursive spatial combinations of visual features. We demonstrate the efficacy of our algorithms by solving three visual navigation tasks and a visual version of the classical Car on the Hill control problem.
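The descriptor-selection idea can be sketched as a greedy loop that keeps adding whichever local descriptor best reduces perceptual aliasing. This toy version, with hypothetical descriptor and action names, only illustrates the principle of partitioning the visual space by descriptor presence/absence; it is not the paper's actual algorithm.

```python
from collections import defaultdict

def percept(features, selected):
    # Discrete state: presence/absence pattern over the selected descriptors.
    return tuple(d in features for d in selected)

def aliasing(samples, selected):
    # Number of perceptual states whose samples disagree on the action,
    # i.e. states that are still perceptually aliased.
    actions_by_state = defaultdict(set)
    for features, action in samples:
        actions_by_state[percept(features, selected)].add(action)
    return sum(1 for acts in actions_by_state.values() if len(acts) > 1)

def select_descriptors(samples, candidates):
    # Greedily add the candidate descriptor that most reduces aliasing;
    # stop when aliasing is gone or no candidate helps (to limit overfitting).
    selected = []
    while True:
        current = aliasing(samples, selected)
        if current == 0:
            return selected
        best = min(candidates, key=lambda c: aliasing(samples, selected + [c]))
        if aliasing(samples, selected + [best]) >= current:
            return selected
        selected.append(best)
```

Each added descriptor doubles the number of available perceptual states, so the loop refines the partition only where the action labels demand it.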
Simultaneous Object Recognition and Segmentation from Single or Multiple Model Views
We present a novel Object Recognition approach based on affine invariant regions. It actively counters the problems related to the limited repeatability of the region detectors, and the difficulty of matching, in the presence of large amounts of background clutter and particularly challenging viewing conditions. After producing an initial set of matches, the method gradually explores the surrounding image areas, recursively constructing more and more matching regions, increasingly farther from the initial ones. This process covers the object with matches, and simultaneously separates the correct matches from the wrong ones. Hence, recognition and segmentation are achieved at the same time. The approach includes a mechanism for capturing the relationships between multiple model views and exploiting these for integrating the contributions of the views at recognition time. This is based on an efficient algorithm for partitioning a set of region matches into groups lying on smooth surfaces. Integration is achieved by measuring the consistency of configurations of groups arising from different model views. Experimental results demonstrate the power of the approach in dealing with extensive clutter, dominant occlusion, and large scale and viewpoint changes. Non-rigid deformations are explicitly taken into account, and the approximate contours of the object are produced. All presented techniques can extend any viewpoint-invariant feature extractor.
Geometric and photometric affine invariant image registration
This thesis aims to present a solution to the correspondence problem for the registration of wide-baseline images taken from uncalibrated cameras. We propose an affine invariant descriptor that combines the geometry and photometry of the scene to find correspondences between both views. The geometric affine invariant component of the descriptor is based on the affine arc-length metric, whereas the photometry is analysed by invariant colour moments. A graph structure represents the spatial distribution of the primitive features: nodes correspond to detected high-curvature points, whereas arcs represent connectivities established by extracted contours. After matching, we refine the search for correspondences by using a robust maximum-likelihood algorithm. We have evaluated the system on synthetic and real data. The method is, however, prone to propagation of errors introduced by approximations in the system.
BAE Systems; Selex Sensors and Airborne Systems
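A discrete version of the equi-affine arc-length metric mentioned above can be sketched as follows; the thesis's exact formulation and normalization may differ, and the finite-difference scheme here is an assumption. The quantity integrated is |x'y'' - x''y'|^(1/3), which is invariant under area-preserving affine maps of the plane.

```python
import numpy as np

def affine_arc_length(x, y):
    # Discrete equi-affine arc length of a planar curve (x(t), y(t)):
    #   sigma = integral of |x' y'' - x'' y'|**(1/3) dt,
    # approximated with central finite differences via np.gradient.
    x, y = np.asarray(x, float), np.asarray(y, float)
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return np.sum(np.abs(dx * ddy - ddx * dy) ** (1.0 / 3.0))
```

Because np.gradient is linear, the discrete determinant dx*ddy - ddx*dy is exactly preserved by any unit-determinant linear map of the curve, so a sheared or rotated copy of a curve yields the same value.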
Improving Bags-of-Words model for object categorization
In the past decade, Bags-of-Words (BOW) models have become popular for the task of object recognition, owing to their good performance and simplicity. Some of the most effective recent methods for computer-based object recognition work by detecting and extracting local image features, before quantizing them according to a codebook rule such as k-means clustering, and classifying these with conventional classifiers such as Support Vector Machines and Naive Bayes.
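The quantization-and-histogram step described above can be sketched as follows, assuming a precomputed codebook (e.g. k-means centres) and local descriptors stacked as rows of a matrix; all names and values are illustrative.

```python
import numpy as np

def assign(descriptors, codebook):
    # Vector-quantize each local descriptor to its nearest codeword
    # (visual word) under squared Euclidean distance.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def bow_histogram(descriptors, codebook):
    # Normalized Bags-of-Words histogram: one bin per visual word,
    # counting how often each codeword occurs in the image.
    words = assign(descriptors, codebook)
    h = np.bincount(words, minlength=len(codebook)).astype(float)
    return h / max(h.sum(), 1.0)
```

The resulting fixed-length histogram is what a conventional classifier such as an SVM or Naive Bayes then consumes.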
In this thesis, a Spatial Object Recognition Framework is presented that consists of the four main contributions of the research.
The first contribution, frequent keypoint pattern discovery, works by combining pairs and triplets of frequent keypoints in order to discover intermediate representations for object classes. Based on the same frequent-keypoint principle, algorithms for locating the region of interest in training images are then discussed.
Extensions to the successful Spatial Pyramid Matching scheme, designed to better capture spatial relationships, are then proposed. The pairs frequency histogram and the shapes frequency histogram work by capturing more refined spatial information between local image features.
Finally, alternative techniques to Spatial Pyramid Matching for capturing spatial information are presented. The proposed techniques, variations of binned log-polar histograms, divide the image into grids of different scales and orientations, thus explicitly capturing the distribution of image features in both distance and orientation.
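A binned log-polar histogram of this kind can be sketched as follows; the bin counts, the log-binning rule, and the reference point are assumptions for illustration, not the thesis's exact design.

```python
import numpy as np

def log_polar_histogram(points, center, n_r=4, n_theta=8, r_max=1.0):
    # Bin 2-D feature positions around `center` by log-distance and angle,
    # capturing their spatial distribution in both distance and orientation.
    pts = np.asarray(points, float) - np.asarray(center, float)
    r = np.hypot(pts[:, 0], pts[:, 1])
    theta = np.arctan2(pts[:, 1], pts[:, 0])  # in [-pi, pi]
    # Logarithmic radial bins (points beyond r_max fall in the outer bin).
    r_bin = np.floor(n_r * np.log1p(r) / np.log1p(r_max)).clip(0, n_r - 1)
    t_bin = np.floor((theta + np.pi) / (2 * np.pi) * n_theta).clip(0, n_theta - 1)
    hist = np.zeros((n_r, n_theta))
    np.add.at(hist, (r_bin.astype(int), t_bin.astype(int)), 1)
    return hist
```

Logarithmic radial binning makes the descriptor finer near the reference point and coarser far away, which is the usual motivation for log-polar grids.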
Evaluations of the framework focus on several recent and popular datasets, covering image retrieval, object recognition, and object categorization tasks. Overall, while the effectiveness of the framework is limited on some of the datasets, the proposed contributions are nevertheless powerful improvements to the BOW model.