8,554 research outputs found

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling

    On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

    Current approaches for visual-inertial odometry (VIO) are able to attain highly accurate state estimation via nonlinear optimization. However, real-time optimization quickly becomes infeasible as the trajectory grows over time. This problem is compounded by the fact that inertial measurements arrive at a high rate, leading to fast growth of the number of variables in the optimization. In this paper, we address this issue by preintegrating inertial measurements between selected keyframes into single relative motion constraints. Our first contribution is a preintegration theory that properly addresses the manifold structure of the rotation group. We formally discuss the generative measurement model as well as the nature of the rotation noise and derive the expression for the maximum a posteriori state estimator. Our theoretical development enables the computation of all necessary Jacobians for the optimization and a-posteriori bias correction in analytic form. The second contribution is to show that the preintegrated IMU model can be seamlessly integrated into a visual-inertial pipeline under the unifying framework of factor graphs. This enables the application of incremental-smoothing algorithms and the use of a structureless model for visual measurements, which avoids optimizing over the 3D points, further accelerating the computation. We perform an extensive evaluation of our monocular VIO pipeline on real and simulated datasets. The results confirm that our modelling effort leads to accurate state estimation in real-time, outperforming state-of-the-art approaches. Comment: 20 pages, 24 figures, accepted for publication in IEEE Transactions on Robotics (TRO) 201
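
    As a rough illustration of what preintegrating inertial measurements between two keyframes can look like, the sketch below accumulates gyroscope and accelerometer samples into a single relative rotation, velocity and position increment. It is a simplified sketch, not the paper's implementation: the function names (`so3_exp`, `preintegrate`) are hypothetical, biases are assumed known and constant, and noise propagation, Jacobians and gravity handling are omitted.

```python
import numpy as np

def so3_exp(phi):
    """Rodrigues' formula: map a rotation vector (3,) to a rotation matrix."""
    theta = np.linalg.norm(phi)
    # Skew-symmetric matrix of phi
    K = np.array([[0.0, -phi[2], phi[1]],
                  [phi[2], 0.0, -phi[0]],
                  [-phi[1], phi[0], 0.0]])
    if theta < 1e-10:
        return np.eye(3) + K  # first-order approximation near identity
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def preintegrate(gyro, accel, dt, gyro_bias=None, accel_bias=None):
    """Accumulate IMU samples into relative increments (dR, dv, dp).

    gyro, accel: sequences of 3-vectors sampled at a fixed period dt.
    Biases are assumed known and constant (an assumption of this sketch).
    """
    gyro_bias = np.zeros(3) if gyro_bias is None else np.asarray(gyro_bias, float)
    accel_bias = np.zeros(3) if accel_bias is None else np.asarray(accel_bias, float)
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        # Acceleration rotated into the frame of the first keyframe
        a_rot = dR @ (np.asarray(a, float) - accel_bias)
        dp = dp + dv * dt + 0.5 * a_rot * dt**2
        dv = dv + a_rot * dt
        # Rotation increment composed on the rotation group, not added linearly
        dR = dR @ so3_exp((np.asarray(w, float) - gyro_bias) * dt)
    return dR, dv, dp
```

    The increments depend only on the IMU samples and the bias estimate, not on the absolute state at the first keyframe, which is what allows them to be used as a single relative motion constraint.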

    Fine-To-Coarse Global Registration of RGB-D Scans

    RGB-D scanning of indoor environments is important for many applications, including real estate, interior design, and virtual reality. However, it is still challenging to register RGB-D images from a hand-held camera over a long video sequence into a globally consistent 3D model. Current methods often lose tracking or drift and thus fail to reconstruct salient structures in large environments (e.g., parallel walls in different rooms). To address this problem, we propose a "fine-to-coarse" global registration algorithm that leverages robust registrations at finer scales to seed detection and enforcement of new correspondence and structural constraints at coarser scales. To test global registration algorithms, we provide a benchmark with 10,401 manually-clicked point correspondences in 25 scenes from the SUN3D dataset. During experiments with this benchmark, we find that our fine-to-coarse algorithm registers long RGB-D sequences better than previous methods

    A factorization approach to inertial affine structure from motion

    We consider the problem of reconstructing a 3-D scene from a moving camera with high frame rate using the affine projection model. This problem is traditionally known as Affine Structure from Motion (Affine SfM), and can be solved using an elegant low-rank factorization formulation. In this paper, we assume that an accelerometer and gyro are rigidly mounted with the camera, so that synchronized linear acceleration and angular velocity measurements are available together with the image measurements. We extend the standard Affine SfM algorithm to integrate these measurements through the use of image derivatives
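
    For orientation, the low-rank factorization that the abstract refers to for standard Affine SfM can be sketched as follows. This shows only the classical vision-only step (centering the measurement matrix and truncating its SVD to rank 3), not the inertial extension the paper proposes. The function name `affine_factorization` and the row layout of `W` are assumptions made for illustration.

```python
import numpy as np

def affine_factorization(W):
    """Rank-3 factorization of a (2F, P) measurement matrix.

    W stacks the image coordinates of P tracked points over F frames,
    two rows per frame (an assumed layout). Returns motion M (2F, 3),
    structure S (3, P) and per-row translation t (2F, 1) such that
    W is approximately M @ S + t.
    """
    # Under affine projection, the translation is the centroid of each row
    t = W.mean(axis=1, keepdims=True)
    W_centered = W - t
    U, s, Vt = np.linalg.svd(W_centered, full_matrices=False)
    # Keep the three dominant components and split the singular values evenly
    sqrt_s = np.sqrt(s[:3])
    M = U[:, :3] * sqrt_s
    S = sqrt_s[:, None] * Vt[:3]
    return M, S, t
```

    The recovered `M` and `S` are defined only up to an invertible 3x3 transformation, since `M @ A` and `inv(A) @ S` give the same product. Resolving that ambiguity requires additional constraints that this sketch does not apply.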

    Simultaneous Parameter Calibration, Localization, and Mapping

    The calibration parameters of a mobile robot play a substantial role in navigation tasks. Often these parameters are subject to variations that depend either on changes in the environment or on the load of the robot. In this paper, we propose an approach to simultaneously estimate a map of the environment, the position of the on-board sensors of the robot, and its kinematic parameters. Our method requires no prior knowledge about the environment and relies only on a rough initial guess of the parameters of the platform. The proposed approach estimates the parameters online and it is able to adapt to non-stationary changes of the configuration. We tested our approach in simulated environments and on a wide range of real-world data using different types of robotic platforms. (C) 2012 Taylor & Francis and The Robotics Society of Japa

    Holistic Temporal Situation Interpretation for Traffic Participant Prediction

    For a profound understanding of traffic situations including a prediction of traffic participants’ future motion, behaviors and routes it is crucial to incorporate all available environmental observations. The presence of sensor noise and dependency uncertainties, the variety of available sensor data, the complexity of large traffic scenes and the large number of different estimation tasks with diverging requirements require a general method that gives a robust foundation for the development of estimation applications. In this work, a general description language, called Object-Oriented Factor Graph Modeling Language (OOFGML), is proposed, that unifies formulation of estimation tasks from the application-oriented problem description via the choice of variable and probability distribution representation through to the inference method definition in implementation. The different language properties are discussed theoretically using abstract examples. The derivation of explicit application examples is shown for the automated driving domain. A domain-specific ontology is defined which forms the basis for four exemplary applications covering the broad spectrum of estimation tasks in this domain: Basic temporal filtering, ego vehicle localization using advanced interpretations of perceived objects, road layout perception utilizing inter-object dependencies and finally highly integrated route, behavior and motion estimation to predict traffic participant’s future actions. All applications are evaluated as proof of concept and provide an example of how their class of estimation tasks can be represented using the proposed language. The language serves as a common basis and opens a new field for further research towards holistic solutions for automated driving

    Information-driven navigation

    In recent years, we have witnessed impressive progress in the accuracy and robustness of Visual Odometry (VO) and Simultaneous Localization and Mapping (SLAM). This boost in performance has enabled the first commercial implementations related to augmented reality (AR), virtual reality (VR) and robotics. In this thesis, we developed new probabilistic methods to further improve the accuracy, robustness and efficiency of VO and SLAM. The contributions of our work are published in three main papers and complemented by the release of SID-SLAM, the software containing all our contributions, and the challenging Minimal Texture dataset. Our first contribution is an information-theoretic approach to point selection for direct and/or feature-based RGB-D VO/SLAM. The aim is to select only the most informative measurements, in order to reduce the size of the optimization problem with minimal impact on accuracy. Our experimental results show that our novel criterion allows us to reduce the number of tracked points down to only 24, achieving state-of-the-art accuracy while reducing the computational demand by up to 10x. Better uncertainty models for visual measurements would improve the accuracy of multi-view structure and motion and lead to more realistic uncertainty estimates of the VO/SLAM states. We derived a novel model for multi-view residual covariances based on perspective deformation, which has become a crucial element of our information-driven approach. Visual odometry and SLAM systems are typically divided in the literature into two categories, feature-based and direct methods, depending on the type of residuals that are minimized. We combined our two previous contributions in the formulation and implementation of SID-SLAM, the first full semi-direct RGB-D SLAM system that tightly and interchangeably uses features and direct methods within a complete information-driven pipeline. Moreover, we recorded Minimal Texture, an RGB-D dataset with conceptually simple but challenging content and accurate ground truth, to facilitate state-of-the-art research on semi-direct SLAM.