    Conic projections of lines in catadioptric systems for visual perception in man-made environments

    Omnidirectional vision systems are devices that acquire images with a 360º field of view around one axis and more than 180º around the other. The need to integrate these cameras into computer vision systems has driven research in this field, deepening the mathematical models and the theoretical groundwork needed to implement applications. Several technologies exist for obtaining omnidirectional images. Catadioptric systems are those that enlarge the field of view using mirrors. Among these, hyper-catadioptric systems combine a perspective camera with a hyperbolic mirror; the hyperbolic geometry of the mirror guarantees that the system is central. In these systems, lines in space take on special relevance, since long straight lines are completely visible in a single image. The straight line is a geometric shape that abounds in man-made environments and, moreover, tends to be arranged along dominant directions. Except in singular constructions, the force of gravity fixes a vertical direction that can be used as a reference when computing the orientation of the system. However, the use of lines in catadioptric systems brings the added difficulty of working with a non-linear projection model in which 3D lines are projected to conics. This Master's thesis (TFM) compiles the work presented in the article "Significant Conics on Catadioptric Images for 3D Orientation and Image Rectification", which we intend to submit to "Robotics and Autonomous Systems". It presents a method to compute the orientation of a hyper-catadioptric system using the conics that are projections of 3D lines. The method computes the orientation with respect to the absolute reference frame defined by the set of vanishing points in an environment with dominant directions.
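    The projection of a 3D line to a conic can be made concrete with the unified sphere model for central catadioptric systems. The sketch below is a minimal illustration assuming the standard Geyer-Daniilidis/Barreto parametrization with mirror parameter xi, not necessarily the exact formulation of the thesis; the particular values of xi and the plane normal n are hypothetical.

```python
import numpy as np

def line_image_conic(n, xi):
    """Conic (3x3 symmetric matrix, normalized image plane) that is the image
    of a 3D line whose interpretation plane through the effective viewpoint
    has normal n, in the unified sphere model with mirror parameter xi
    (xi = 0: perspective, 0 < xi < 1: hyperbolic mirror, xi = 1: paracatadioptric)."""
    nx, ny, nz = n / np.linalg.norm(n)
    return np.array([
        [nx*nx*(1 - xi*xi) - nz*nz*xi*xi, nx*ny*(1 - xi*xi),               nx*nz],
        [nx*ny*(1 - xi*xi),               ny*ny*(1 - xi*xi) - nz*nz*xi*xi, ny*nz],
        [nx*nz,                           ny*nz,                           nz*nz]])

def sphere_project(P, xi):
    """Unified model: project P onto the unit sphere, then reproject onto the
    normalized plane from a point at distance xi from the sphere centre."""
    x, y, z = P / np.linalg.norm(P)
    return np.array([x / (z + xi), y / (z + xi), 1.0])

# Sanity check: every point of a line lying on the plane n.X = 0 through the
# viewpoint must satisfy m' Omega m = 0 on the image conic.
xi = 0.96                                 # hypothetical mirror parameter
n = np.array([0.3, -0.5, 0.8])            # hypothetical interpretation-plane normal
d = np.cross(n, [0.0, 0.0, 1.0])          # one direction inside the plane
for t in np.linspace(-2.0, 2.0, 9):
    P = d + t * np.cross(n, d)            # 3D points spanning the plane
    m = sphere_project(P, xi)
    assert abs(m @ line_image_conic(n, xi) @ m) < 1e-9
```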

    Geometric Properties of Central Catadioptric Line Images and Their Application in Calibration

    In central catadioptric systems, lines in a scene are projected to conic curves in the image. This work studies the geometry of the central catadioptric projection of lines and its use in calibration. It is shown that the conic curves where the lines are mapped possess several projective invariant properties. From these properties, it follows that any central catadioptric system can be fully calibrated from an image of three or more lines. The image of the absolute conic, the relative pose between the camera and the mirror, and the shape of the reflective surface can be recovered using a geometric construction based on the conic loci where the lines are projected. This result is valid for any central catadioptric system and generalizes previous results for paracatadioptric sensors. Moreover, it is proven that systems with a hyperbolic/elliptical mirror can be calibrated from the image of two lines. If both the shape and the pose of the mirror are known, then two line images are enough to determine the image of the absolute conic encoding the camera’s intrinsic parameters. The sensitivity to errors is evaluated, and the approach is used to calibrate a real camera.
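    Any such calibration pipeline starts from conic matrices estimated from tracked line-image points. As a minimal, hedged sketch (plain algebraic least squares via SVD, an assumed pre-processing step rather than the paper's geometric construction), a conic can be fitted to five or more image points as follows:

```python
import numpy as np

def fit_conic(pts):
    """Least-squares conic through 2D points: returns the symmetric 3x3
    matrix C with x' C x = 0 for homogeneous x = (u, v, 1). Needs at least
    5 points; with more, the SVD gives the algebraic least-squares fit."""
    u, v = pts[:, 0], pts[:, 1]
    D = np.column_stack([u*u, u*v, v*v, u, v, np.ones_like(u)])
    _, _, Vt = np.linalg.svd(D)
    a, b, c, d, e, f = Vt[-1]             # null vector = conic coefficients
    return np.array([[a,   b/2, d/2],
                     [b/2, c,   e/2],
                     [d/2, e/2, f]])

# Example: recover the circle u^2 + v^2 = 4 from noisy samples.
t = np.linspace(0, 2*np.pi, 20, endpoint=False)
pts = 2.0 * np.column_stack([np.cos(t), np.sin(t)])
pts += 1e-3 * np.random.default_rng(0).normal(size=pts.shape)
C = fit_conic(pts)
print(C / C[0, 0])                        # approx diag(1, 1, -4)
```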

    Modeling the environment with egocentric vision systems

    More and more autonomous systems, whether robots or assistive systems, are present in our daily lives. Such systems interact with their environment and therefore need a model of it. Depending on the tasks to be performed, the information or level of detail the model requires varies, from detailed 3D models for autonomous navigation systems to semantic models that include information relevant to the user, such as the type of area or which objects are present. These models are built from the readings of the different sensors available on the system. Nowadays, thanks to their small size, low price and the wealth of information they capture, cameras are included as sensors in every autonomous system. The goal of this thesis is to develop and study new methods for building models of the environment at different semantic levels and with different levels of accuracy. Two points characterize the work developed in this thesis:
    - The use of cameras with an egocentric, first-person point of view, carried either by a robot or by a wearable system on the user. In such systems the cameras move rigidly with the mobile platform on which they are mounted. In recent years many wearable vision systems have appeared, used in a multitude of applications from leisure to personal assistance.
    - The use of omnidirectional vision systems, which are distinguished by their large field of view and capture much more information in each image than conventional cameras, but pose new difficulties due to distortions and more complex projection models.
    This thesis studies different types of environment models:
    - Metric models: the objective of these models is to build detailed representations of the environment in which the autonomous system can be localized precisely. This thesis focuses on adapting these models to omnidirectional vision, which captures more information per image and improves localization results.
    - Topological models: these models structure the environment as nodes connected by arcs. This representation is less precise than the metric one, but it offers a higher level of abstraction and can model the environment more richly. This thesis focuses on building topological models with additional information about the type of area of each node and connection (corridor, room, doors, stairs...); a minimal data-structure sketch follows below.
    - Semantic models: this work also contributes new semantic models, aimed at applications in which the system interacts with or assists a person. These models represent the environment through concepts close to those used by people. In particular, this thesis develops techniques to obtain and propagate semantic information about the environment in image sequences.
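    To make the topological level concrete, here is a minimal, hypothetical data-structure sketch: places become nodes labelled with an area type and connections become arcs labelled with a connection type. All names are invented for illustration and are not taken from the thesis.

```python
from dataclasses import dataclass, field

@dataclass
class TopoMap:
    nodes: dict = field(default_factory=dict)   # node id -> area type
    edges: dict = field(default_factory=dict)   # (id_a, id_b) -> connection type

    def add_place(self, node_id, area_type):
        self.nodes[node_id] = area_type

    def connect(self, a, b, connection_type):
        self.edges[(a, b)] = connection_type

m = TopoMap()
m.add_place("n0", "corridor")
m.add_place("n1", "room")
m.connect("n0", "n1", "door")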

    Low-Resolution Vision for Autonomous Mobile Robots

    The goal of this research is to develop algorithms using low-resolution images to perceive and understand a typical indoor environment and thereby enable a mobile robot to autonomously navigate such an environment. We present techniques for three problems: autonomous exploration, corridor classification, and minimalistic geometric representation of an indoor environment for navigation. First, we present a technique for mobile robot exploration in unknown indoor environments using only a single forward-facing camera. Rather than processing all the data, the method intermittently examines only small 32×24 downsampled grayscale images. We show that for the task of indoor exploration the visual information is highly redundant, allowing successful navigation even using only a small fraction (0.02%) of the available data. The method keeps the robot centered in the corridor by estimating two state parameters: the orientation within the corridor and the distance to the end of the corridor. The orientation is determined by combining the results of five complementary measures, while the estimated distance to the end combines the results of three complementary measures. These measures, which are predominantly information-theoretic, are analyzed independently, and the combined system is tested in corridors of several unknown buildings exhibiting a wide variety of appearances, showing the sufficiency of low-resolution visual information for mobile robot exploration. Because the algorithm discards such a large percentage (99.98%) of the information both spatially and temporally, processing occurs at an average of 1000 frames per second, or equivalently takes only a small fraction of the CPU. Second, we present an algorithm using image entropy to detect and classify corridor junctions from low-resolution images. Because entropy can be used to perceive depth, it can be used to detect an open corridor in a set of images recorded by turning a robot 360 degrees at a junction. Our algorithm detects peaks in the continuously measured entropy values and uses the angular distance between the detected peaks to determine the type of junction that was recorded (middle, L-junction, T-junction, dead-end, or cross junction). We show that the same algorithm can be used to detect open corridors from both monocular and omnidirectional images. Third, we propose a minimalistic corridor representation consisting of the orientation line (center) and the wall-floor boundaries (lateral limits). The representation is extracted from low-resolution images using a novel combination of information-theoretic measures and gradient cues. Our study investigates the impact of image resolution on the accuracy of extracting such a geometry, showing that the centerline and wall-floor boundaries can be estimated with reasonable accuracy even in texture-poor environments from low-resolution images. In a database of 7 unique corridor sequences, orientation measurements showed less than 2% additional error as the image resolution decreased by 99.9%.
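    The entropy measure underlying the second contribution is simple to state: compute the Shannon entropy of the intensity histogram of a heavily downsampled grayscale frame. The sketch below is a minimal version of that idea (numpy only; the five orientation measures and the peak detector are not reproduced, and the bin count and frame sizes are assumptions):

```python
import numpy as np

def image_entropy(gray, bins=64):
    """Shannon entropy (bits) of an intensity histogram. Entropy correlates
    with visible depth along a corridor, which is why peaks in the entropy
    profile of a turning robot indicate open corridors."""
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def downsample(gray, out_w=32, out_h=24):
    """Crude nearest-neighbour downsample to a 32x24 working size; a real
    pipeline would block-average or use cv2.resize."""
    h, w = gray.shape
    ys = (np.arange(out_h) * h) // out_h
    xs = (np.arange(out_w) * w) // out_w
    return gray[np.ix_(ys, xs)]

# Usage sketch with synthetic frames: the entropy profile over a 360-degree
# turn would be scanned for peaks, whose angular spacing classifies the
# junction (L-junction, T-junction, dead-end or cross junction).
frames = [np.random.default_rng(i).integers(0, 256, size=(480, 640))
          for i in range(36)]
profile = [image_entropy(downsample(f)) for f in frames]
```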

    Towards Robust Visual Localization in Challenging Conditions

    Visual localization is a fundamental problem in computer vision, with a multitude of applications in robotics, augmented reality and structure-from-motion. The basic problem is to determine, from one or more images, the position and orientation of the camera that captured them relative to some model of the environment. Current visual localization approaches typically work well when the images to be localized are captured under conditions similar to those encountered during mapping. However, when the environment exhibits large changes in visual appearance, due to e.g. variations in weather, season, time of day or viewpoint, the traditional pipelines break down. The reason is that the local image features used are based on low-level pixel-intensity information, which is not invariant to these transformations: when the environment changes, a different set of keypoints is detected and their descriptors differ, making long-term visual localization a challenging problem. This thesis includes five papers that present work towards solving the long-term visual localization problem. Two of the articles present ideas for including semantic information in the localization process: one approach relies only on semantic information for visual localization, and the other shows how semantics can be used to detect outlier feature correspondences. The third paper considers how the output of a monocular depth-estimation network can be used to extract features that are less sensitive to viewpoint changes. The fourth article is a benchmark paper presenting three new datasets aimed at evaluating localization algorithms in the context of long-term visual localization. Lastly, the fifth article considers how to perform convolutions on spherical imagery, which in the future might be applied to learning local image features for the localization problem.
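    The semantic outlier-rejection idea reduces, at its simplest, to a consistency test: a putative feature match is suspect if its two keypoints land on different semantic classes. The sketch below shows only that bare test under assumed array shapes; the thesis papers' actual scoring is more involved.

```python
import numpy as np

def semantic_inlier_mask(kp_query, kp_ref, sem_query, sem_ref):
    """Keep only matches whose two keypoints carry the same semantic label.
    kp_*: (N, 2) integer pixel coordinates (x, y); sem_*: (H, W) label maps."""
    lbl_q = sem_query[kp_query[:, 1], kp_query[:, 0]]
    lbl_r = sem_ref[kp_ref[:, 1], kp_ref[:, 0]]
    return lbl_q == lbl_r

# Hypothetical usage: drop semantically inconsistent putative matches
# before geometric verification (e.g. RANSAC pose estimation).
rng = np.random.default_rng(0)
sem_q = rng.integers(0, 5, size=(480, 640))     # fake label maps, 5 classes
sem_r = rng.integers(0, 5, size=(480, 640))
kps = rng.integers(0, 480, size=(100, 2))       # fake matched keypoints
mask = semantic_inlier_mask(kps, kps, sem_q, sem_r)
```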

    Wide-area egomotion from omnidirectional video and coarse 3D structure

    S.M. thesis by Olivier Koch, Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (p. 85-89). This thesis describes a method for real-time vision-based localization in human-made environments. Given a coarse model of the structure (walls, floors, ceilings, doors and windows) and a video sequence, the system computes the camera pose (translation and rotation) in model coordinates with an accuracy of a few centimeters in translation and a few degrees in rotation. The system has several novel aspects: it performs 6-DOF localization; it handles visually cluttered and dynamic environments; it scales well over regions extending through several buildings; and it runs for several hours without losing lock. We demonstrate that the localization problem can be split into two distinct problems: an initialization phase and a maintenance phase. In the initialization phase, the system determines the camera pose with no other information than a search region provided by the user (building, floor, area, room). This step is computationally intensive and is run only once, at startup. We present a probabilistic method to address the initialization problem using a RANSAC framework. In the maintenance phase, the system keeps track of the camera pose from frame to frame without any user interaction. This phase is computationally lightweight to allow a high processing frame rate and is coupled with a feedback loop that helps reacquire "lock" when lock has been lost. We demonstrate a simple, robust geometric tracking algorithm based on correspondences between 3D model lines and 2D image edges. We present navigation results on several real datasets across the MIT campus with cluttered, dynamic environments. The first dataset consists of a five-minute robotic exploration across the Robotics, Vision and Sensor Network Lab. The second dataset consists of a two-minute hand-held 3D motion in the same lab space. The third dataset consists of a 26-minute exploration across MIT buildings 26 and 36.
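    The initialization phase follows a standard hypothesize-and-test pattern. Below is a generic RANSAC skeleton in that spirit, a sketch rather than the thesis's code: solve_pose and count_inliers are placeholder callables standing in for a minimal line-based pose solver and an edge-reprojection consistency test.

```python
import random

def ransac_init(correspondences, solve_pose, count_inliers,
                min_set=3, iters=1000):
    """Generic RANSAC skeleton for pose initialization.
    correspondences: putative (3D model line, 2D image edge) pairs."""
    best_pose, best_inliers = None, 0
    for _ in range(iters):
        sample = random.sample(correspondences, min_set)
        pose = solve_pose(sample)            # 6-DOF hypothesis, or None
        if pose is None:                     # degenerate sample
            continue
        inliers = count_inliers(pose, correspondences)
        if inliers > best_inliers:
            best_pose, best_inliers = pose, inliers
    return best_pose, best_inliers
```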

    Exploiting line metric reconstruction from non-central circular panoramas

    In certain non-central imaging systems, straight lines are projected via a non-planar surface that encapsulates the 4 degrees of freedom of the 3D line. Consequently, the geometry of the 3D line can be recovered from a minimum of four image points. However, with classical non-central catadioptric systems there is not enough effective baseline for a practical implementation of the method. In this paper we propose a multi-camera system configuration, resembling the circular panoramic model, which results in a particular non-central projection that allows the stitching of a non-central panorama. From a single panorama we obtain well-conditioned 3D reconstructions of lines, which are especially interesting in texture-less scenarios. No prior information about the direction or arrangement of the lines in the scene is assumed. The proposed method is evaluated on both synthetic and real images.
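    In Plücker coordinates the four-point minimal case has a clean linear-algebra form: each projection ray meeting the unknown line contributes one linear "side operator" constraint, four rays leave a two-dimensional null space, and the quadratic Klein-quadric constraint selects the line. The sketch below follows that generic formulation; it is an assumption that this matches the paper's exact parametrization, and note that four rays generically admit two common transversals, so a fifth ray or scene knowledge disambiguates.

```python
import numpy as np

def line_from_rays(rays):
    """Recover Plucker coordinates (direction d, moment m) of a 3D line
    meeting four given rays. Each ray is a pair (d_i, m_i) with m_i = p_i x d_i
    for any point p_i on the ray. Two lines meet iff the 'side' bilinear form
    d1.m2 + m1.d2 vanishes, so each ray yields one linear constraint on (d, m);
    four rays leave a 2D null space, and the Plucker constraint d.m = 0
    (Klein quadric) selects the solution."""
    A = np.array([np.concatenate([m, d]) for d, m in rays])  # side-operator rows
    _, _, Vt = np.linalg.svd(A)
    u, v = Vt[-1], Vt[-2]                    # basis of the 2D null space
    q = lambda x, y: x[:3] @ y[3:] + x[3:] @ y[:3]
    # (u + t v) on the Klein quadric: q(v,v) t^2 + 2 q(u,v) t + q(u,u) = 0
    roots = np.roots([q(v, v), 2.0 * q(u, v), q(u, u)])
    t = roots[np.isreal(roots)].real[0]      # noise-free rays give real roots
    L = u + t * v
    return L[:3], L[3:]                      # direction, moment
```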