11 research outputs found

    Recognition of 3-D Objects from Multiple 2-D Views by a Self-Organizing Neural Architecture

    The recognition of 3-D objects from sequences of their 2-D views is modeled by a neural architecture called VIEWNET, which uses View Information Encoded With NETworks. VIEWNET illustrates how several types of noise and variability in image data can be progressively removed while incomplete image features are restored and invariant features are discovered using an appropriately designed cascade of processing stages. VIEWNET first processes 2-D views of 3-D objects using the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and removes noise from the images. Boundary regularization and completion are achieved by the same mechanisms that suppress image noise. A log-polar transform is taken with respect to the centroid of the resulting figure and then re-centered to achieve 2-D scale and rotation invariance. The invariant images are coarse coded to further reduce noise, reduce foreshortening effects, and increase generalization. These compressed codes are input into a supervised learning system based on the fuzzy ARTMAP algorithm. Recognition categories of 2-D views are learned before evidence from sequences of 2-D view categories is accumulated to improve object recognition. Recognition is studied with noisy and clean images using slow and fast learning. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 2-D views of jet aircraft with and without additive noise. A recognition rate of 90% is achieved with one 2-D view category and of 98.5% with three 2-D view categories. National Science Foundation (IRI 90-24877); Office of Naval Research (N00014-91-J-1309, N00014-91-J-4100, N00014-92-J-0499); Air Force Office of Scientific Research (F9620-92-J-0499, 90-0083
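
The log-polar step mentioned in this abstract can be illustrated with a minimal sketch. It is not the VIEWNET implementation: the CORT-X 2 filtering, re-centering, and coarse coding are omitted, and the function names (`centroid`, `log_polar`), the grid sizes, and the nearest-neighbour sampling are all assumptions made for illustration. The idea it shows is that, once an image is resampled on a (log-radius, angle) grid about its centroid, a rescaling or rotation of the figure becomes approximately a shift along one of the two grid axes.

```python
import math


def centroid(img):
    """Return the (row, col) centroid of the pixel mass of a 2-D list image."""
    total = sum(sum(row) for row in img)
    if total == 0:
        raise ValueError("image has no nonzero pixels")
    r_c = sum(r * v for r, row in enumerate(img) for v in row) / total
    c_c = sum(c * v for row in img for c, v in enumerate(row)) / total
    return r_c, c_c


def log_polar(img, n_rho=16, n_theta=32, r_min=1.0):
    """Resample img onto a (log-radius, angle) grid centred on its centroid.

    Hypothetical helper: nearest-neighbour sampling, zero outside the image.
    """
    rows, cols = len(img), len(img[0])
    r_c, c_c = centroid(img)
    r_max = math.hypot(rows, cols)  # large enough to reach past every pixel
    out = []
    for i in range(n_rho):
        # Radii are spaced geometrically between r_min and r_max, so a
        # uniform scaling of the figure moves it along the rho axis.
        rho = r_min * (r_max / r_min) ** (i / (n_rho - 1))
        ring = []
        for j in range(n_theta):
            # Angles are spaced uniformly, so a rotation of the figure
            # becomes a circular shift along the theta axis.
            theta = 2.0 * math.pi * j / n_theta
            r = int(round(r_c + rho * math.sin(theta)))
            c = int(round(c_c + rho * math.cos(theta)))
            inside = 0 <= r < rows and 0 <= c < cols
            ring.append(img[r][c] if inside else 0)
        out.append(ring)
    return out
```

Calling `log_polar(img)` on a binary image given as a list of rows returns `n_rho` rings of `n_theta` samples each; a practical system would interpolate rather than round to the nearest pixel.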

    The 3D visibility complex: a new approach to the problems of accurate visibility


    Sensory processing and world modeling for an active ranging device

    In this project, we studied world modeling and sensory processing for laser range data. World model data representations and operations were defined. Sensory processing algorithms for point processing and linear feature detection were designed and implemented. The interface between world modeling and sensory processing at the Servo and Primitive levels was investigated and implemented. At the Primitive level, linear feature detectors for edges were also implemented, analyzed, and compared. Existing world model representations are surveyed. Also presented is the design and implementation of the Y-frame model, a hierarchical world model. The interfaces between the world model module and the sensory processing module are discussed, as well as the linear feature detectors that were designed and implemented.

    Image understanding and feature extraction for applications in industry and mapping

    Bibliography: p. 212-220. The aim of digital photogrammetry is the automated extraction and classification of the three-dimensional information of a scene from a number of images. Existing photogrammetric systems are semi-automatic, requiring manual editing and control, and have very limited domains of application, so that image understanding capabilities are left to the user. Among the most important steps in a fully integrated system are the extraction of features suitable for matching, the establishment of the correspondence between matching points, and object classification. The following study explores the applicability of pattern recognition concepts in conjunction with existing area-based methods, feature-based techniques and other approaches used in computer vision in order to increase the level of automation and as a general alternative and addition to existing methods. As an illustration of the pattern recognition approach, examples of industrial applications are given. The underlying method is then extended to the identification of objects in aerial images of urban scenes and to the location of targets in close-range photogrammetric applications. Various moment-based techniques are considered as pattern classifiers, including geometric invariant moments, Legendre moments, Zernike moments and pseudo-Zernike moments. Two-dimensional Fourier transforms are also considered as pattern classifiers. The suitability of these techniques is assessed. These are then applied as object locators and as feature extractors or interest operators. Additionally, the use of fractal dimension to segment natural scenes for regional classification, in order to limit the search space for particular objects, is considered. The pattern recognition techniques require considerable preprocessing of images. The various image processing techniques required are explained where needed.
Extracted feature points are matched using relaxation-based techniques in conjunction with area-based methods to obtain subpixel accuracy. A subpixel method based on pattern recognition is also proposed, and an investigation into improved area-based subpixel matching methods is undertaken. An algorithm for determining relative orientation parameters incorporating the epipolar line constraint is investigated and compared with a standard relative orientation algorithm. In conclusion, a basic system that can be automated, based on some novel techniques in conjunction with existing methods, is described and implemented in a mapping application. This system could be largely automated with suitably powerful computers.
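
As a rough illustration of the geometric invariant moments named in this abstract, the sketch below computes the first two of Hu's classical moment invariants from normalized central moments. It is an assumption that these are the invariants meant; the thesis's own formulation, preprocessing, and classifiers are not reproduced, and the function names are hypothetical. Images are plain lists of rows with non-negative pixel values.

```python
def raw_moment(img, p, q):
    """Raw geometric moment m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def central_moment(img, p, q):
    """Central moment mu_pq, taken about the intensity centroid."""
    m00 = raw_moment(img, 0, 0)
    xc = raw_moment(img, 1, 0) / m00
    yc = raw_moment(img, 0, 1) / m00
    return sum(((x - xc) ** p) * ((y - yc) ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def hu_first_two(img):
    """Return (phi1, phi2), the first two Hu moment invariants."""
    mu00 = central_moment(img, 0, 0)

    def eta(p, q):
        # Normalized central moment: divides out the effect of scale.
        return central_moment(img, p, q) / mu00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2
```

Because the moments are taken about the centroid and normalized by the zeroth moment, the two values stay the same when the shape is translated or rotated by a multiple of 90 degrees on the pixel grid, and approximately the same under other rotations and rescalings, which is what makes such features usable as pattern classifiers.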

    Reconstrucción geométrica de sólidos utilizando técnicas de optimización (Geometric reconstruction of solids using optimization techniques)

    This work aims at the automatic reconstruction of geometric models from the information contained in a single, geometrically consistent vector image of a polyhedral object. In our view, optimization processes are the most promising route to reconstruction, insofar as they can simulate the way humans perceive. However, Geometric Reconstruction posed as an optimization process has a fundamental problem: a complex objective function with many local minima. The local minima are invalid models, because they do not agree with human visual perception (they are not psychologically plausible). Moreover, the starting point of the algorithm (the image) is itself a local minimum. Our work was initially directed at implementing one of the optimization algorithms that claim to be able to find global minima. However, we concluded that not even these algorithms guarantee the optimum in the case of Geometric Reconstruction, because their behavior depends heavily on their own tuning parameters and on the nature of the model to be reconstructed. We therefore believe that optimization algorithms need to be assisted by tentative inflation strategies that generate initial models as close as possible to the global optimum, that is, as similar as possible to the psychologically plausible model. To that end we have developed three strategies for generating initial models. We have found that each of these strategies works well when applied to models of certain types, so we have developed a specific classification of polyhedra suited to our purposes.
Since the classification is aimed at selecting the most suitable tentative inflation strategy, we have also developed an algorithm that detects the type of polyhedron automatically from the input image. Universidad Politécnica de Cartagena. Programa de Doctorado en Análisis y Diseño Avanzado de Estructura

    Pose-invariant face recognition using real and virtual views

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 173-184). By David James Beymer.

    Learning and Example Selection for Object and Pattern Detection

    This thesis presents a learning-based approach for detecting classes of objects and patterns with variable image appearance but highly predictable image boundaries. It consists of two parts. In part one, we introduce our object and pattern detection approach using a concrete human face detection example. The approach first builds a distribution-based model of the target pattern class in an appropriate feature space to describe the target's variable image appearance. It then learns from examples a similarity measure for matching new patterns against the distribution-based target model. The approach makes few assumptions about the target pattern class and should therefore be fairly general, as long as the target class has predictable image boundaries. Because our object and pattern detection approach is very much learning-based, how well a system eventually performs depends heavily on the quality of the training examples it receives. The second part of this thesis looks at how one can select high-quality examples for function approximation learning tasks. We propose an active learning formulation for function approximation, and show, for three specific approximation function classes, that the active example selection strategy learns its target with fewer data samples than random sampling. We then simplify the original active learning formulation, and show how it leads to a tractable example selection paradigm suitable for use in many object and pattern detection problems.

    Pose-Invariant Face Recognition Using Real and Virtual Views

    The problem of automatic face recognition is to visually identify a person in an input image. This task is performed by matching the input face against the faces of known people in a database of faces. Most existing work in face recognition has limited the scope of the problem, however, by dealing primarily with frontal views, neutral expressions, and fixed lighting conditions. To help generalize existing face recognition systems, we look at the problem of recognizing faces under a range of viewpoints. In particular, we consider two cases of this problem: (i) many example views are available of each person, and (ii) only one view is available per person, perhaps a driver's license or passport photograph. Ideally, we would like to address these two cases using a simple view-based approach, where a person is represented in the database by a number of views on the viewing sphere. While the view-based approach is consistent with case (i), for case (ii) we need to augment the single real view of each person with synthetic views from other viewpoints, which we call 'virtual views'. Virtual views are generated using prior knowledge of face rotation, knowledge that is 'learned' from images of prototype faces. This prior knowledge is used to effectively rotate in depth the single real view available of each person. In this thesis, I present the view-based face recognizer, techniques for synthesizing virtual views, and experimental results using real and virtual views in the recognizer.