8 research outputs found

    Generalization to Novel Views: Universal, Class-based, and Model-based Processing

    A major problem in object recognition is that a novel image of a given object can be different from all previously seen images. Images can vary considerably due to changes in viewing conditions such as viewing position and illumination. In this paper we distinguish between three types of recognition schemes by the level at which generalization to novel images takes place: universal, class, and model-based. The first is applicable equally to all objects, the second to a class of objects, and the third uses known properties of individual objects. We derive theoretical limitations on each of the three generalization levels. For the universal level, previous results have shown that no invariance can be obtained. Here we show that this limitation holds even when the assumptions made on the objects and the recognition functions are relaxed. We also extend the results to changes of illumination direction. For the class level, previous studies presented specific examples of classes of objects for which functions invariant to viewpoint exist. Here, we distinguish between classes that admit such invariance and classes that do not. We demonstrate that there is a tradeoff between the set of objects that can be discriminated by a given recognition function and the set of images from which the recognition function can recognize these objects. Furthermore, we demonstrate that although functions invariant to illumination direction do not exist at the universal level, when the objects are restricted to belong to a given class, a function invariant to illumination direction can be defined. A general conclusion of this study is that class-based processing, which has not been used extensively in the past, is often advantageous for dealing with variations due to viewpoint and illuminant changes. Keywords: object recognition, invariance

    What geometric information can be obtained from one or more images taken under perspective projection?

    The work presented in this article was carried out within the Movi project of the Lifia laboratory in Grenoble, by Boubakeur Boufama, Pascal Brand, Patrick Gros, Luce Morin, Long Quan and Francoise Veillon, with the participation and under the direction of Roger Mohr. The contribution of each author is indicated throughout the text by the bibliographic references, to which the reader is invited to turn for technical details not all given here. The whole of the work was carried out within the Esprit-BRA Viva project. In computer vision, we consider a camera that takes images. Assuming only that this image-formation operation is of a certain geometric type, and more precisely that it is a perspective projection, geometric quantities characteristic of the observed scene can be computed from one or more images. After a study of several geometric camera models, the geometric information that can be extracted from one, two, three or more images is examined in turn

    Retinal image quality assessment using generic image quality indicators

    This document describes an algorithm for assessing the quality of retinal images based on generic image quality criteria. The image characteristics used are colour, focus, contrast and illumination, analysed using different image processing techniques, among which a novel application of histogram backprojection stands out. The four characteristic-assessment algorithms yield fourteen measures that are combined to gauge the suitability of the image for diagnostic purposes. Besides forming the basis of the overall quality classification of the retinal image, these four algorithms also provide important feedback to the retinal camera operator, since they indicate the quality of the corresponding image characteristic (colour, focus, contrast or illumination). This information can be used to better adjust the image capture process. The performance of each algorithm was extensively evaluated by comparing the automatic classification of images gathered from a wide range of sources, including proprietary as well as public databases (DRIVE, Messidor, ROC and STARE), against classification by human graders, yielding areas under the ROC curve close to the optimal value of 1. The overall performance of the algorithm was evaluated against human classification, showing a sensitivity of 99.76% and a specificity of 99.49% on a data set of 2032 retinal images drawn from a proprietary database and from the Messidor database. In addition, the computational complexity of the algorithm and its sensitivity to image noise and resolution were experimentally quantified, demonstrating very good performance and confirming the usability of the solution in ambulatory conditions.
Thus, the combination of the characteristics mentioned above constitutes a new contribution to retinal image quality assessment, whose effectiveness is demonstrated by the results. Keywords: retinal image quality, colour assessment, focus assessment, contrast assessment, illumination assessment
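The histogram backprojection step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's actual method: the hue range, bin count, and mean-based scoring are assumptions introduced here.

```python
import numpy as np

def hue_histogram(hue, bins=32):
    """Normalised hue histogram of a reference (good-quality) image region."""
    hist, _ = np.histogram(hue, bins=bins, range=(0.0, 180.0), density=True)
    return hist

def backproject(hue, ref_hist, bins=32):
    """Replace each pixel's hue by the reference histogram value at that hue."""
    idx = np.clip((hue / 180.0 * bins).astype(int), 0, bins - 1)
    return ref_hist[idx]

def colour_score(hue, ref_hist):
    # mean backprojection: high when the image's colours match the reference
    return float(backproject(hue, ref_hist).mean())

# a test image whose hues resemble the reference scores higher than one that does not
reference = np.full(500, 20.0)            # hypothetical retina-like hue values
ref_hist = hue_histogram(reference)
similar = np.full(200, 22.0)
different = np.full(200, 100.0)
```

Backprojection turns a colour model into a per-pixel likeness map, which is why the same mechanism can serve both colour assessment and operator feedback.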

    Statistical Object Recognition

    Two formulations of model-based object recognition are described. MAP Model Matching evaluates joint hypotheses of match and pose, while Posterior Marginal Pose Estimation evaluates the pose only. Local search in pose space is carried out with the Expectation-Maximization (EM) algorithm. Recognition experiments are described where the EM algorithm is used to refine and evaluate pose hypotheses in 2D and 3D. Initial hypotheses for the 2D experiments were generated by a simple indexing method: Angle Pair Indexing. The Linear Combination of Views method of Ullman and Basri is employed as the projection model in the 3D experiments
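The Linear Combination of Views result used as the projection model above can be illustrated numerically: under orthographic projection, the image coordinates of a novel view of a rigid object lie in the span of a small basis built from two model views. The rotations and the random point set below are arbitrary choices for this sketch, not data from the paper.

```python
import numpy as np

def rot_y(t):
    """Rotation about the vertical axis by angle t (radians)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def view(points, R):
    # orthographic projection: rotate the object, keep the x and y coordinates
    q = points @ R.T
    return q[:, 0], q[:, 1]

rng = np.random.default_rng(0)
P = rng.standard_normal((20, 3))          # hypothetical 3-D model points

x1, y1 = view(P, np.eye(3))               # model view 1
x2, _ = view(P, rot_y(0.3))               # model view 2 (only its x is needed)
xn, yn = view(P, rot_y(0.7))              # novel view to be predicted

# novel-view coordinates are linear combinations of the basis [x1, y1, x2, 1]
B = np.column_stack([x1, y1, x2, np.ones_like(x1)])
ax, *_ = np.linalg.lstsq(B, xn, rcond=None)
ay, *_ = np.linalg.lstsq(B, yn, rcond=None)
```

The least-squares residual is zero up to rounding, confirming that the novel view is exactly reproducible from the two stored views without recovering 3-D structure.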

    Pose-invariant face recognition using real and virtual views

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 173-184). By David James Beymer

    Pose-Invariant Face Recognition Using Real and Virtual Views

    The problem of automatic face recognition is to visually identify a person in an input image. This task is performed by matching the input face against the faces of known people in a database of faces. Most existing work in face recognition has limited the scope of the problem, however, by dealing primarily with frontal views, neutral expressions, and fixed lighting conditions. To help generalize existing face recognition systems, we look at the problem of recognizing faces under a range of viewpoints. In particular, we consider two cases of this problem: (i) many example views are available of each person, and (ii) only one view is available per person, perhaps a driver's license or passport photograph. Ideally, we would like to address these two cases using a simple view-based approach, where a person is represented in the database by using a number of views on the viewing sphere. While the view-based approach is consistent with case (i), for case (ii) we need to augment the single real view of each person with synthetic views from other viewpoints, views we call 'virtual views'. Virtual views are generated using prior knowledge of face rotation, knowledge that is 'learned' from images of prototype faces. This prior knowledge is used to effectively rotate in depth the single real view available of each person. In this thesis, I present the view-based face recognizer, techniques for synthesizing virtual views, and experimental results using real and virtual views in the recognizer

    Attentional Selection in Object Recognition

    A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection is of help in recognition, it has largely remained unsolved because of the difficulty in isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, attracted and pay attention modes as being appropriate for data and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating with a 3D from 2D object recognition system and recording the improvement in performance. 
These results indicate that attentional selection can significantly overcome the computational bottleneck in object recognition, both due to a reduction in the number of features, and due to a reduction in the number of matches during recognition using the information derived during selection. Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object
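The idea of specifying color by perceptual color categories for fast segmentation can be sketched as a hue-range lookup. The category names and hue boundaries below are illustrative assumptions, not the thesis's calibrated categories.

```python
import numpy as np

# hypothetical hue ranges (degrees) for a few perceptual colour categories
CATEGORIES = {
    "red": (0, 20),
    "yellow": (40, 70),
    "green": (90, 150),
    "blue": (200, 260),
}

def label_pixels(hue):
    """Assign each pixel's hue to a named perceptual category."""
    labels = np.full(hue.shape, "other", dtype=object)
    for name, (lo, hi) in CATEGORIES.items():
        labels[(hue >= lo) & (hue < hi)] = name
    return labels

def select_category(hue, name):
    # binary mask of pixels falling in the model object's colour category
    return label_pixels(hue) == name
```

Because the lookup is a handful of threshold tests per pixel, category-based selection can run before any detailed segmentation, which is the point of using it for attentional selection.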

    Recognizing 3-D Objects Using 2-D Images

    We discuss a strategy for visual recognition by forming groups of salient image features, and then using these groups to index into a database to find all of the matching groups of model features. We discuss the most space-efficient possible method of representing 3-D models for indexing from 2-D data, and show how to account for sensing error when indexing. We also present a convex grouping method that is robust and efficient, both theoretically and in practice. Finally, we combine these modules into a complete recognition system, and test its performance on many real images
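The indexing-with-sensing-error idea can be sketched with a quantized invariant hash: groups of model features are stored under quantized coordinates, and queries probe neighbouring cells to tolerate bounded error. The choice of invariant (2-D affine coordinates of a fourth point), the cell size, and the probe radius below are assumptions made for this sketch, not the paper's actual scheme.

```python
import numpy as np
from collections import defaultdict

def affine_coords(group):
    """Express the 4th point in the basis of the first three (2-D affine invariant)."""
    p0, p1, p2, p3 = [np.asarray(p, dtype=float) for p in group]
    B = np.column_stack([p1 - p0, p2 - p0])
    return np.linalg.solve(B, p3 - p0)

class GroupIndex:
    def __init__(self, cell=0.1):
        self.cell = cell
        self.table = defaultdict(set)

    def _key(self, coords):
        return tuple(np.floor(coords / self.cell).astype(int))

    def insert(self, group, model_id):
        self.table[self._key(affine_coords(group))].add(model_id)

    def query(self, group, spread=1):
        # probe neighbouring cells so bounded sensing error still yields a hit
        ka, kb = self._key(affine_coords(group))
        hits = set()
        for da in range(-spread, spread + 1):
            for db in range(-spread, spread + 1):
                hits |= self.table.get((ka + da, kb + db), set())
        return hits

idx = GroupIndex(cell=0.1)
idx.insert([(0, 0), (1, 0), (0, 1), (0.5, 0.5)], "m1")   # hypothetical model group
```

A slightly perturbed image group still retrieves the model, while an unrelated group retrieves nothing, which is the behaviour the sensing-error analysis is meant to guarantee.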