
    The Extraction and Use of Image Planes for Three-dimensional Metric Reconstruction

    The three-dimensional (3D) metric reconstruction of a scene from two-dimensional images is a fundamental problem in Computer Vision. The major bottleneck in retrieving such structure lies in recovering the camera parameters. These parameters can be calculated either through a pattern-based calibration procedure, which requires accurate knowledge of the scene, or through a more flexible approach, known as camera autocalibration, which exploits point correspondences across images. While pattern-based calibration requires the presence of a calibration object, autocalibration constraints are typically cast into nonlinear optimization problems that are sensitive to both image noise and initialization. In addition, autocalibration fails for certain camera motions. To overcome these problems, we propose to combine scene and autocalibration constraints and address in this thesis (a) the problem of extracting geometric information about the scene from uncalibrated images, (b) the problem of obtaining a robust estimate of the affine calibration of the camera, and (c) the problem of upgrading and refining the affine calibration into a metric one. In particular, we propose a method for identifying the major planar structures in a scene from images and another method for recognizing pairs of parallel planes whenever these are available. The identified parallel planes are then used to obtain a robust estimate of both the affine and metric 3D structure of the scene without resorting to the traditional, error-prone calculation of vanishing points. We also propose a refinement method which, unlike existing ones, is capable of simultaneously incorporating plane-parallelism and perpendicularity constraints in the autocalibration process. Our experiments demonstrate that the proposed methods are robust to image noise and provide satisfactory results.
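    As background for the plane constraints mentioned above (a standard textbook formulation, not necessarily the exact one used in the thesis): parallel scene planes share a single vanishing line l in the image, and two mutually perpendicular planes with vanishing lines l_1 and l_2 give a constraint on the dual image of the absolute conic,

        l_2^\top \, \omega^{*} \, l_1 = 0, \qquad \omega^{*} = K K^{\top},

    which is linear in the entries of \omega^{*}; stacking such equations over the identified plane pairs allows the intrinsic matrix K to be recovered, e.g. by Cholesky factorization of \omega^{*}.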

    What can be done with an embedded stereo-rig in urban environments?

    The development of Autonomous Guided Vehicles (AGVs) for urban applications is now possible thanks to recent solutions (e.g. from the DARPA Grand Challenge) to the Simultaneous Localization And Mapping (SLAM) problem and its components: perception, path planning and control. Over the last decade, the introduction of GPS and vision has allowed SLAM methods designed for indoor environments to be transposed to outdoor ones. When GPS data are unavailable, the current position of the mobile robot can be estimated by fusing data from an odometer and/or an Inertial Navigation System (INS). We detail in this article what can be done with an uncalibrated stereo rig embedded in a vehicle driving on urban roads. The methodology is based on features extracted on planes: we mainly take the road in the foreground as the plane common to all urban scenes, but other planes, such as the vertical facades of buildings, can be used when the features extracted on the road are not sufficiently reliable. The relative motion of the coplanar features tracked in both cameras allows us to estimate the vehicle's ego-motion with high precision. Furthermore, features that are not consistent with the relative motion of the considered plane can be flagged as obstacles.
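    A minimal sketch of this kind of plane-based ego-motion estimation, using OpenCV (the function and thresholds are illustrative, not the authors' implementation): fit a homography to the road-plane features tracked between two frames, decompose it into rotation/translation candidates, and flag points with a large transfer error as potential obstacles.

        import numpy as np
        import cv2

        def plane_ego_motion(pts_prev, pts_curr, K, reproj_thresh=2.0):
            """Ego-motion from features tracked on a dominant plane (e.g. the road).

            pts_prev, pts_curr: Nx2 arrays of matched image points in two frames.
            K: 3x3 camera intrinsic matrix.
            Returns candidate (R, t, n) decompositions and a mask of off-plane points.
            """
            # Robustly fit the plane-induced homography between the two frames.
            H, inliers = cv2.findHomography(pts_prev, pts_curr, cv2.RANSAC, reproj_thresh)

            # Decompose H into up to four (R, t, n) candidates; the physically valid
            # one is usually selected with cheirality checks on the plane normal.
            _, Rs, ts, normals = cv2.decomposeHomographyMat(H, K)

            # Points that do not follow the plane's motion model are obstacle candidates.
            obstacle_mask = (inliers.ravel() == 0)
            return list(zip(Rs, ts, normals)), obstacle_mask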

    Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments

    Image-based estimation of camera motion, known as visual odometry (VO), plays a very important role in many robotic applications such as control and navigation of unmanned mobile robots, especially when no external navigation reference signal is available. The core problem of VO is the estimation of the camera's ego-motion (i.e. tracking) either between successive frames, namely relative pose estimation, or with respect to a global map, namely absolute pose estimation. This thesis aims to develop efficient, accurate and robust VO solutions by taking advantage of structural regularities in man-made environments, such as piece-wise planar structures, the Manhattan World and, more generally, contours and edges. Furthermore, to handle challenging scenarios that are beyond the limits of classical sensor-based VO solutions, we investigate a recently emerging sensor, the event camera, and study event-based mapping, one of the key problems in event-based VO/SLAM. The main achievements are summarized as follows. First, we revisit an old topic in relative pose estimation: accurately and robustly estimating the fundamental matrix given a collection of independently estimated homographies. Three classical methods are reviewed, and then we show that a simple but nontrivial two-step normalization within the direct linear method achieves performance similar to the less attractive and more computationally intensive hallucinated-points-based method. Second, an efficient 3D rotation estimation algorithm for depth cameras in piece-wise planar environments is presented. It shows that, by using surface normal vectors as input, planar modes in the corresponding density distribution function can be discovered and continuously tracked using efficient non-parametric estimation techniques. The relative rotation can be estimated by registering entire bundles of planar modes using robust L1-norm minimization. Third, an efficient alternative to the iterative closest point algorithm for real-time tracking of modern depth cameras in Manhattan Worlds is developed. We exploit the common orthogonal structure of man-made environments in order to decouple the estimation of the rotation from the three degrees of freedom of the translation. The derived camera orientation is absolute and thus free of long-term drift, which in turn benefits the accuracy of the translation estimation as well. Fourth, we look into a more general structural regularity: edges. A real-time VO system that uses Canny edges is proposed for RGB-D cameras. Two novel alternatives to classical distance transforms are developed with properties that significantly improve on classical Euclidean distance field based methods in terms of efficiency, accuracy and robustness. Finally, to deal with challenging scenarios that go beyond what standard RGB/RGB-D cameras can handle, we investigate the recently emerging event camera and focus on the problem of 3D reconstruction from data captured by a stereo event-camera rig moving in a static scene, such as in the context of stereo Simultaneous Localization and Mapping.
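    To illustrate the setting of the first contribution (a generic sketch of the standard compatibility constraint, not the thesis's two-step normalization): a plane-induced homography H and the fundamental matrix F of the same camera pair satisfy H^T F + F^T H = 0, so every independently estimated homography contributes six linear equations on the entries of F.

        import numpy as np

        def fundamental_from_homographies(H_list):
            """Linear estimate of F from plane-induced homographies of one camera pair.

            Uses the compatibility constraint H^T F + F^T H = 0 (H^T F is skew-symmetric),
            i.e. six linear equations in the nine entries of F per homography.
            Plain DLT sketch, without normalization refinements.
            """
            rows = []
            for H in H_list:
                for i in range(3):
                    for j in range(i, 3):
                        # (H^T F + F^T H)[i, j] = sum_k H[k, i] F[k, j] + H[k, j] F[k, i]
                        row = np.zeros(9)
                        for k in range(3):
                            row[3 * k + j] += H[k, i]
                            row[3 * k + i] += H[k, j]
                        rows.append(row)
            _, _, Vt = np.linalg.svd(np.asarray(rows))
            F = Vt[-1].reshape(3, 3)
            # Project onto rank 2, since a fundamental matrix is singular.
            U, s, Vt = np.linalg.svd(F)
            s[2] = 0.0
            return U @ np.diag(s) @ Vt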

    Uncalibrated stereo vision applied to breast cancer treatment aesthetic assessment

    Integrated Master's thesis in Informatics and Computing Engineering. Universidade do Porto, Faculdade de Engenharia. 201

    Use of a single reference image in visual processing of polyhedral objects.

    He Yong. Thesis (M.Phil.), Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 69-72). Abstracts in English and Chinese. Contents: 1. Introduction; 2. Preliminary; 3. Image Mosaicing for Singly Visible Surfaces (Background; Correspondence Inference Mechanism; Seamless Lining up of Surface Boundary; Experimental Result; Summary of Image Mosaicing Work); 4. Mobile Robot Self-localization from Monocular Vision (Background; Problem Definition; Our Strategy of Localizing the Mobile Robot: Establishing Correspondences, Determining Position from Factorizing E-matrix, Improvement on the Factorization Result; Experimental Result; Summary of Mobile Robot Self-localization Work); 5. Conclusion and Future Work; Appendix; Bibliography.

    Specular surface recovery from reflections of a planar pattern undergoing an unknown pure translation

    In: Computer Vision – ACCV 2010: 10th Asian Conference on Computer Vision, Queenstown, New Zealand, November 8-12, 2010, Revised Selected Papers, Part 2. Lecture Notes in Computer Science, v. 6493, p. 137-14. This paper addresses the problem of specular surface recovery and proposes a novel solution based on observing the reflections of a translating planar pattern. Previous works have demonstrated that a specular surface can be recovered from the reflections of two calibrated planar patterns. In this paper, however, only one reference planar pattern is assumed to have been calibrated against a fixed camera observing the specular surface. Instead of introducing and calibrating a second pattern, the reference pattern is allowed to undergo an unknown pure translation, and a closed-form solution is derived for recovering such a motion. Unlike previous methods, which estimate the shape by directly triangulating the visual rays and reflection rays, a novel method based on computing the projections of the visual rays on the translating pattern is introduced. This produces a depth range for each pixel, which also provides a measure of the accuracy of the estimation. The proposed approach enables a simple auto-calibration of the translating pattern, and the data redundancy resulting from the translating pattern improves both the robustness and accuracy of the shape estimation. Experimental results on both synthetic and real data are presented to demonstrate the effectiveness of the proposed approach. © 2011 Springer-Verlag Berlin Heidelberg.
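    For reference, a minimal sketch of the baseline that the paper improves upon (direct midpoint triangulation of a visual ray against a reflection ray; not the paper's projection-based method):

        import numpy as np

        def triangulate_rays(o1, d1, o2, d2):
            """Midpoint triangulation of two rays given by origin o and direction d.

            Returns the point halfway between the closest points on the two rays,
            a common way to intersect a visual ray with a reflection ray.
            """
            d1 = d1 / np.linalg.norm(d1)
            d2 = d2 / np.linalg.norm(d2)
            b = o2 - o1
            c = d1 @ d2
            denom = 1.0 - c ** 2
            if denom < 1e-12:          # nearly parallel rays: no stable intersection
                return None
            t1 = (d1 @ b - c * (d2 @ b)) / denom
            t2 = (c * (d1 @ b) - d2 @ b) / denom
            return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))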

    Método para el registro automático de imágenes basado en transformaciones proyectivas planas dependientes de las distancias y orientado a imágenes sin características comunes (Method for automatic image registration based on distance-dependent planar projective transformations, aimed at images without common features)

    Unpublished thesis, Universidad Complutense de Madrid, Facultad de Ciencias Físicas, Departamento de Arquitectura de Computadores y Automática, defended 18-12-2015. Multisensory data fusion oriented to image-based applications improves the accuracy, quality and availability of the data, and consequently the performance of robotic systems, by combining the information of a scene acquired from multiple, different sources into a unified representation of the 3D world scene, which is more enlightening and enriching for subsequent image processing: reliability is improved by using redundant information, and capability by taking advantage of complementary information. Image registration is one of the most relevant steps in image fusion techniques. This procedure aims at the geometric alignment of two or more images. Normally, this process relies on feature-matching techniques, which is a drawback when combining sensors that are not able to deliver common features. For instance, in the combination of ToF and RGB cameras, robust feature matching is not reliable. Typically, the fusion of these two sensors has been addressed by computing the cameras' calibration parameters for coordinate transformation between them. As a result, a low-resolution colour depth map is provided. For improving the resolution of these maps and reducing the loss of colour information, extrapolation techniques are adopted. A crucial issue for computing high-quality and accurate dense maps is the presence of noise in the depth measurements from the ToF camera, which is normally reduced by means of sensor calibration and filtering techniques. However, the filtering methods implemented for data extrapolation and denoising usually over-smooth the data, consequently reducing the accuracy of the registration procedure...
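    A minimal sketch of the calibration-based coordinate transformation described above (the pinhole model and all names are generic assumptions, not the thesis's implementation): each ToF depth pixel is back-projected to 3D, transformed into the RGB camera frame with the stereo extrinsics, and reprojected to sample a colour, yielding the low-resolution colour-depth map that is then upsampled and denoised.

        import numpy as np

        def register_tof_to_rgb(depth, K_tof, K_rgb, R, t, rgb):
            """Project ToF depth pixels into the RGB image to build a sparse colour-depth map.

            depth: HxW ToF depth map (metres); K_tof, K_rgb: 3x3 intrinsics;
            R, t: rigid transformation from the ToF frame to the RGB frame; rgb: RGB image.
            Returns the 3D points (in the RGB frame) and the colours sampled for them.
            """
            h, w = depth.shape
            u, v = np.meshgrid(np.arange(w), np.arange(h))
            z = depth.ravel()
            valid = z > 0

            # Back-project ToF pixels to 3D points in the ToF camera frame.
            pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])[:, valid]
            pts_tof = np.linalg.inv(K_tof) @ pix * z[valid]

            # Transform into the RGB camera frame and project with its intrinsics.
            pts_rgb = R @ pts_tof + t.reshape(3, 1)
            proj = K_rgb @ pts_rgb
            uv = (proj[:2] / proj[2]).round().astype(int)

            # Keep projections inside the RGB image and sample their colour.
            inside = (uv[0] >= 0) & (uv[0] < rgb.shape[1]) & (uv[1] >= 0) & (uv[1] < rgb.shape[0])
            colours = rgb[uv[1, inside], uv[0, inside]]
            return pts_rgb[:, inside].T, colours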

    Flexible and User-Centric Camera Calibration using Planar Fiducial Markers

    The benefit of accurate camera calibration for recovering 3D structure from images is a well-studied topic. Recently, 3D vision tools for end-user applications have become popular among large audiences, mostly unskilled in computer vision. This motivates the need for a flexible and user-centric camera calibration method which relaxes the strict requirements on the calibration target and ensures that low-quality or faulty images provided by end users do not degrade the overall calibration and, in effect, the resulting 3D model. In this paper we present and advocate an approach to camera calibration using fiducial markers, aiming at the accuracy of target-based calibration techniques without requiring a precise calibration pattern, so as to ease the calibration effort for the end user. An extensive set of experiments with real images is presented which demonstrates improvements in the estimation of the parameters of the camera model as well as in the accuracy of the multi-view stereo reconstruction of large-scale scenes. Pixel reprojection errors and ground-truth errors obtained by our method are significantly lower than those of popular calibration routines, even though paper-printable and easy-to-use targets are employed.
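    A minimal sketch of marker-based calibration in this spirit, using OpenCV's classic cv2.aruco API (the dictionary, board layout and function names are illustrative assumptions, not the paper's method): detect markers in every image, match their corners to the known printed layout, and run the standard calibration.

        import numpy as np
        import cv2

        def calibrate_from_markers(images, marker_corners_3d, dict_id=cv2.aruco.DICT_4X4_50):
            """Calibrate a camera from views of planar fiducial markers.

            images: list of grayscale images of the marker layout.
            marker_corners_3d: dict mapping marker id -> 4x3 array of its corner
                coordinates in the board frame (known from the printed layout).
            """
            dictionary = cv2.aruco.getPredefinedDictionary(dict_id)
            obj_points, img_points = [], []
            for img in images:
                corners, ids, _ = cv2.aruco.detectMarkers(img, dictionary)
                if ids is None:
                    continue  # no marker found in this image
                obj, pix = [], []
                for marker_id, c in zip(ids.ravel(), corners):
                    if marker_id in marker_corners_3d:
                        obj.append(marker_corners_3d[marker_id])   # 4 known 3D corners
                        pix.append(c.reshape(4, 2))                # 4 detected 2D corners
                if obj:
                    obj_points.append(np.concatenate(obj).astype(np.float32))
                    img_points.append(np.concatenate(pix).astype(np.float32))
            size = images[0].shape[::-1]
            rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, size, None, None)
            return rms, K, dist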