    Probabilistic RGB-D Odometry based on Points, Lines and Planes Under Depth Uncertainty

    This work proposes a robust visual odometry method for structured environments that combines point features with line and plane segments extracted with an RGB-D camera. Noisy depth maps are processed by a probabilistic depth fusion framework based on Mixtures of Gaussians, which denoises them and derives the depth uncertainty; this uncertainty is then propagated throughout the visual odometry pipeline. Probabilistic 3D plane and line fitting solutions are used to model the uncertainties of the feature parameters, and the pose is estimated by combining the three types of primitives according to their uncertainties. Performance evaluation on RGB-D sequences collected in this work and on two public RGB-D datasets, TUM and ICL-NUIM, shows the benefit of using the proposed depth fusion framework and of combining the three feature types, particularly in scenes with low-textured surfaces, dynamic objects and missing depth measurements. Comment: Major update: more results, depth filter released as open source, 34 pages
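
The abstract above describes deriving a per-pixel depth estimate and its uncertainty from a Mixture of Gaussians. As a minimal sketch of one ingredient of such a scheme, the snippet below collapses a one-dimensional Gaussian mixture into a single mean and variance by moment matching. The function name `collapse_mixture`, the tuple layout, and the example numbers are illustrative assumptions; this is not the paper's actual fusion framework.

```python
def collapse_mixture(components):
    """Collapse a 1-D Gaussian mixture into one Gaussian by moment matching.

    components: list of (weight, mean_depth_m, variance_m2) tuples.
    Weights need not be normalised. Returns (mean, variance).
    All names here are hypothetical and chosen for illustration only.
    """
    total = sum(w for w, _, _ in components)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Mixture mean: weighted average of the component means.
    mean = sum(w * mu for w, mu, _ in components) / total
    # Mixture variance: within-component variance plus spread of the means.
    var = sum(w * (v + (mu - mean) ** 2) for w, mu, v in components) / total
    return mean, var


if __name__ == "__main__":
    # Two equally weighted depth hypotheses for one pixel (assumed values).
    samples = [(1.0, 2.0, 0.01), (1.0, 2.2, 0.01)]
    mean, var = collapse_mixture(samples)
    print(f"fused depth = {mean:.3f} m, variance = {var:.4f} m^2")
```

The resulting variance grows when the component means disagree, which is the kind of signal a downstream estimator could use to down-weight unreliable depth.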

    Dynamic Objects Segmentation for Visual Localization in Urban Environments

    Visual localization and mapping is a crucial capability to address many challenges in mobile robotics. It constitutes a robust, accurate and cost-effective approach for local and global pose estimation within prior maps. Yet, in highly dynamic environments, like crowded city streets, problems arise as major parts of the image can be covered by dynamic objects. Consequently, visual odometry pipelines often diverge and the localization systems malfunction as detected features are not consistent with the precomputed 3D model. In this work, we present an approach to automatically detect dynamic object instances to improve the robustness of vision-based localization and mapping in crowded environments. By training a convolutional neural network model with a combination of synthetic and real-world data, dynamic object instance masks are learned in a semi-supervised way. The real-world data can be collected with a standard camera and requires minimal further post-processing. Our experiments show that a wide range of dynamic objects can be reliably detected using the presented method. Promising performance is demonstrated on our own and also publicly available datasets, which also shows the generalization capabilities of this approach. Comment: 4 pages, submitted to the IROS 2018 Workshop "From Freezing to Jostling Robots: Current Challenges and New Paradigms for Safe Robot Navigation in Dense Crowds"

    Robust Real-time RGB-D Visual Odometry in Dynamic Environments via Rigid Motion Model

    In this paper, we propose a robust real-time visual odometry method for dynamic environments based on a rigid-motion model updated by scene flow. The proposed algorithm consists of spatial motion segmentation and temporal motion tracking. The spatial segmentation first generates several motion hypotheses using a grid-based scene flow and then clusters them, separating objects that move independently of one another. In the temporal motion tracking stage, we use a dual-mode motion model to consistently distinguish between the static and dynamic parts of the scene. Finally, the algorithm estimates the camera pose from the regions classified as static. To evaluate the performance of visual odometry in the presence of dynamic rigid objects, we use a self-collected dataset containing RGB-D images and motion capture data for ground truth. We compare our algorithm with state-of-the-art visual odometry algorithms. The validation results suggest that the proposed algorithm can estimate the camera pose robustly and accurately in dynamic environments.
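
The abstract above separates static from dynamic regions before estimating the camera pose. A much simplified sketch of that idea follows: given point correspondences and one candidate rigid motion, a point is labelled static when its residual under that motion is small. The 2D setting, the function names, and the threshold are assumptions made for brevity; the paper's grid-based scene flow, clustering, and dual-mode model are not reproduced here.

```python
import math


def apply_rigid_2d(theta, tx, ty, point):
    """Rotate a 2-D point by theta (radians) and translate it by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


def label_static(prev_pts, curr_pts, motion, threshold=0.05):
    """Return one bool per correspondence: True if consistent with `motion`.

    motion: (theta, tx, ty), a hypothetical candidate motion of the static
    background between the two frames. threshold is in the same units as
    the points and is an assumed value.
    """
    theta, tx, ty = motion
    labels = []
    for p, q in zip(prev_pts, curr_pts):
        px, py = apply_rigid_2d(theta, tx, ty, p)
        residual = math.hypot(q[0] - px, q[1] - py)
        labels.append(residual <= threshold)
    return labels


if __name__ == "__main__":
    prev_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    # The third point moves independently of the background.
    curr_pts = [(0.1, 0.0), (1.1, 0.0), (0.5, 1.0)]
    print(label_static(prev_pts, curr_pts, motion=(0.0, 0.1, 0.0)))
```

Only the correspondences labelled static would then be passed to the pose estimator.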

    Visual SLAM in dynamic environments

    The problem of visual simultaneous localization and mapping (visual SLAM) consists of localizing a camera within a map that is built online. This technology allows robots to be localized in unknown environments and a map of the area to be built using only onboard sensors, that is, without relying on any external infrastructure. Unlike odometry approaches, in which incremental motion is integrated over time, a map allows the sensor to localize itself continuously in the same environment without accumulating drift. Visual SLAM algorithms commonly assume that the observed scene is static. Although the static assumption holds for some applications, it limits their usefulness in crowded real-world scenes for autonomous driving, service robots, and augmented and virtual reality, among others. Detecting and studying dynamic objects is a prerequisite for accurately estimating the sensor pose and building stable maps that are useful for robotic applications operating over the long term. This thesis makes three main contributions: 1. We detect dynamic objects using semantic segmentation from deep learning together with multi-view geometry approaches. This lets us achieve a camera trajectory accuracy in highly dynamic scenes comparable to that achieved in static environments, and build 3D maps that contain only the static and stable structure of the environment. 2. We hallucinate, with realistic images, the static structure of the scene behind dynamic objects. This lets us offer complete maps with a plausible representation of the scene, without the discontinuities or gaps caused by occlusions from dynamic objects. Visual place recognition also benefits from these advances in image processing. 3. We develop a joint framework that solves both the SLAM problem and multi-object tracking, yielding a spatio-temporal map with information about the trajectory of the sensor and of its surroundings. Understanding the surrounding dynamic objects is of crucial importance for the new requirements of emerging augmented/virtual reality and autonomous navigation applications. These three contributions advance the state of the art in visual SLAM. As a by-product of our research and for the benefit of the scientific community, we have released the code that implements the proposed solutions.
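
The first contribution above combines semantic segmentation with multi-view geometry to reject dynamic content. The sketch below illustrates one plausible form of that combination: a keypoint is dropped if it falls on a pixel flagged as a dynamic class, or if its measured depth disagrees with the depth predicted from another view. The function name, the data layout, and the 0.1 m threshold are illustrative assumptions and are not taken from the thesis.

```python
def filter_static_keypoints(keypoints, dynamic_mask, depth_residuals,
                            max_residual=0.1):
    """Keep keypoints that pass both a semantic and a geometric check.

    keypoints: list of (row, col) pixel coordinates.
    dynamic_mask: 2-D list of bools, True where a segmentation network
        (assumed to exist upstream) marked a potentially dynamic class.
    depth_residuals: per-keypoint |measured - predicted| depth in metres,
        assumed to come from reprojection into another view.
    """
    kept = []
    for (row, col), residual in zip(keypoints, depth_residuals):
        if dynamic_mask[row][col]:
            continue  # semantic check: lies on a dynamic-class pixel
        if residual > max_residual:
            continue  # geometric check: inconsistent with a static scene
        kept.append((row, col))
    return kept


if __name__ == "__main__":
    mask = [[False, False, False],
            [False, True, False],
            [False, False, False]]
    kps = [(0, 0), (1, 1), (2, 2)]
    residuals = [0.01, 0.00, 0.50]
    print(filter_static_keypoints(kps, mask, residuals))
```

Using both checks covers two failure cases: the geometric test can catch moving objects the network has no class for, while the semantic test can catch movable objects that happen to be still in the current frames.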

    Agent and object aware tracking and mapping methods for mobile manipulators

    The age of the intelligent machine is upon us. They exist in our factories, our warehouses, our military, our hospitals, on our roads, and on the moon. Most of these things we call robots. When placed in a controlled or known environment such as an automotive factory or a distribution warehouse they perform their given roles with exceptional efficiency, achieving far more than is within reach of a humble human being. Despite the remarkable success of intelligent machines in such domains, they have yet to make a full-hearted deployment into our homes. The missing link between the robots we have now and the robots that are soon to come to our houses is perception. Perception as we mean it here refers to a level of understanding beyond the collection and aggregation of sensory data. Much of the available sensory information is noisy and unreliable: our homes contain many reflective surfaces, repeating textures on large flat surfaces, and many disruptive moving elements, including humans. These environments change over time, with objects frequently moving within and between rooms. This idea of change in an environment is fundamental to robotic applications, as in most cases we expect robots to be the effectors of such change. We can identify two particular challenges that must be solved for robots to make the jump to less structured environments: how to manage noise and disruptive elements in observational data, and how to understand the world as a set of changeable elements (objects) which move over time within a wider environment. In this thesis we look at one possible approach to solving each of these problems. For the first challenge we use proprioception aboard a robot with an articulated arm to handle difficult and unreliable visual data caused both by the robot and the environment. We use sensor data aboard the robot to improve the pose tracking of a visual system when the robot moves rapidly, with high jerk, or when observing a scene with little visual variation. For the second challenge, we build a model of the world on the level of rigid objects, and relocalise them both as they change location between different sequences and as they move. We use semantics, image keypoints, and 3D geometry to register and align objects between sequences, showing how their position has moved between disparate observations.