1,084 research outputs found

    Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image

    We consider the problem of dense depth prediction from a sparse set of depth measurements and a single RGB image. Since depth estimation from monocular images alone is inherently ambiguous and unreliable, we introduce additional sparse depth samples to attain a higher level of robustness and accuracy; these samples are either acquired with a low-resolution depth sensor or computed via visual Simultaneous Localization and Mapping (SLAM) algorithms. We propose the use of a single deep regression network to learn directly from the RGB-D raw data, and explore the impact of the number of depth samples on prediction accuracy. Our experiments show that, compared to using only RGB images, the addition of 100 spatially random depth samples reduces the prediction root-mean-square error by 50% on the NYU-Depth-v2 indoor dataset. It also boosts the percentage of reliable predictions from 59% to 92% on the KITTI dataset. We demonstrate two applications of the proposed algorithm: a plug-in module in SLAM to convert sparse maps to dense maps, and super-resolution for LiDARs. Software and a video demonstration are publicly available. Comment: accepted to ICRA 2018. 8 pages, 8 figures, 3 tables. Video at https://www.youtube.com/watch?v=vNIIT_M7x7Y. Code at https://github.com/fangchangma/sparse-to-dens
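The sampling and evaluation steps described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code: the function names are hypothetical, the regression network itself is omitted, and the threshold-based accuracy measure (ratio below 1.25) is an assumption about what "reliable prediction" means, since the abstract does not define it.

```python
import numpy as np

def sample_sparse_depth(depth, n_samples, rng):
    """Keep n_samples randomly chosen valid pixels of a dense depth map; zero the rest."""
    valid = np.flatnonzero(depth > 0)            # flat indices of pixels with a measurement
    n = min(n_samples, valid.size)
    keep = rng.choice(valid, size=n, replace=False)
    sparse = np.zeros_like(depth)
    sparse.flat[keep] = depth.flat[keep]
    return sparse

def make_rgbd_input(rgb, sparse_depth):
    """Stack an H x W x 3 image and an H x W sparse depth map into an H x W x 4 array."""
    return np.concatenate([rgb, sparse_depth[..., None]], axis=-1)

def rmse(pred, gt):
    """Root-mean-square error over pixels that have ground-truth depth."""
    mask = gt > 0
    return float(np.sqrt(np.mean((pred[mask] - gt[mask]) ** 2)))

def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels whose ratio max(pred/gt, gt/pred) is below the threshold.
    Assumed metric: the abstract does not say how 'reliable' is measured."""
    mask = (gt > 0) & (pred > 0)
    ratio = np.maximum(pred[mask] / gt[mask], gt[mask] / pred[mask])
    return float(np.mean(ratio < threshold))
```

A network trained on the four-channel input would then be scored with `rmse` and `delta_accuracy` against the dense ground truth.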

    Non-Parametric Learning for Monocular Visual Odometry

    This thesis addresses the problem of incremental localization from visual information, a scenario commonly known as visual odometry. Current visual odometry algorithms are heavily dependent on camera calibration, using a pre-established geometric model to provide the transformation between input (optical flow estimates) and output (vehicle motion estimates) information. A novel approach to visual odometry is proposed in this thesis where the need for camera calibration, or even for a geometric model, is circumvented by the use of machine learning principles and techniques. A non-parametric Bayesian regression technique, the Gaussian Process (GP), is used to select the most probable transformation function hypothesis from input to output, based on training data collected prior to and during navigation. Besides eliminating the need for a geometric model and traditional camera calibration, this approach also allows for scale recovery even in a monocular configuration, and provides a natural treatment of uncertainties due to the probabilistic nature of GPs. Several extensions to the traditional GP framework are introduced and discussed in depth, and they constitute the core of the contributions of this thesis to the machine learning and robotics community. The proposed framework is tested in a wide variety of scenarios, ranging from urban and off-road ground vehicles to unconstrained 3D unmanned aircraft. The results show a significant improvement over traditional visual odometry algorithms, and also surpass results obtained using other sensors, such as laser scanners and IMUs. The incorporation of these results into a SLAM scenario, using an Exact Sparse Information Filter (ESIF), is shown to decrease global uncertainty by exploiting revisited areas of the environment. Finally, a technique for the automatic segmentation of dynamic objects is presented, as a way to increase the robustness of image information and further improve visual odometry results.
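As a rough illustration of the regression idea in this abstract, the sketch below implements textbook Gaussian Process regression with a squared-exponential kernel in NumPy. It is a generic GP and does not include the thesis's extensions; the kernel choice, the hyperparameter values, and the function names are assumptions. In the setting described, the rows of `X_train` would be optical-flow features and `y_train` one component of vehicle motion.

```python
import numpy as np

def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of A (n x d) and B (m x d)."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_predict(X_train, y_train, X_test, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of the latent function at X_test."""
    K = sq_exp_kernel(X_train, X_train, **kernel_args) + noise_var * np.eye(len(X_train))
    K_s = sq_exp_kernel(X_train, X_test, **kernel_args)
    K_ss = sq_exp_kernel(X_test, X_test, **kernel_args)
    L = np.linalg.cholesky(K)                      # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)     # uncertainty grows away from the data
    return mean, var
```

The predictive variance is what gives the "natural treatment of uncertainties" mentioned above: it is small near the training inputs and returns to the prior variance far from them.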

    Towards an Autonomous Walking Robot for Planetary Surfaces

    In this paper, recent progress in the development of the DLR Crawler, a six-legged, actively compliant walking robot prototype, is presented. The robot implements a walking layer with a simple tripod gait and a more complex, biologically inspired gait. Using a variety of proprioceptive sensors, different reflexes for reactively crossing obstacles within the walking height are realised. On top of the walking layer, a navigation layer provides the ability to autonomously navigate to a predefined goal point in unknown rough terrain using a stereo camera. A model of the environment is created, the terrain traversability is estimated and an optimal path is planned. The difficulty of the path can be influenced by behavioral parameters. Motion commands are sent to the walking layer and the gait pattern is switched according to the estimated terrain difficulty. The interaction between walking layer and navigation layer was tested in different experimental setups.
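The planning step in this abstract (estimate traversability, then plan an optimal path whose difficulty depends on a behavioral parameter) can be sketched as a shortest-path search over a cost grid. This is only an illustration: the abstract does not name the planner, so Dijkstra's algorithm on a 4-connected grid, the `difficulty_weight` parameter, and the cost convention (infinite cost means impassable) are all assumptions.

```python
import heapq
import numpy as np

def plan_path(cost, start, goal, difficulty_weight=1.0):
    """Dijkstra on a 4-connected grid; cost[r, c] is the estimated terrain difficulty.
    A larger difficulty_weight makes the planner avoid rough cells more strongly."""
    rows, cols = cost.shape
    dist = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        d, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if d > dist.get(cell, np.inf):
            continue                                # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if not np.isfinite(cost[nr, nc]):
                continue                            # impassable cell
            nd = d + 1.0 + difficulty_weight * float(cost[nr, nc])
            if nd < dist.get((nr, nc), np.inf):
                dist[(nr, nc)] = nd
                parent[(nr, nc)] = cell
                heapq.heappush(queue, (nd, (nr, nc)))
    if goal not in dist:
        return None                                 # goal unreachable
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]
```

With `difficulty_weight=0` the planner ignores terrain difficulty and returns the geometrically shortest route; raising it trades path length for easier terrain.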

    Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

    Simultaneous Localization and Mapping (SLAM) consists of the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues that still deserve careful scientific investigation. The paper also contains the authors' take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
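The abstract does not spell out the "de-facto standard formulation". It is commonly stated as maximum a posteriori (MAP) estimation over a factor graph, which reduces to nonlinear least squares under Gaussian noise. The block below is a sketch under that assumption; the symbols (state $\mathcal{X}$, measurements $z_k$, measurement models $h_k$, information matrices $\Omega_k$) are notation chosen here, not taken from the listing.

```latex
\begin{align}
  \mathcal{X}^{\star}
    &= \operatorname*{arg\,max}_{\mathcal{X}} \; p(\mathcal{X} \mid Z)
     = \operatorname*{arg\,max}_{\mathcal{X}} \; p(\mathcal{X}) \prod_{k=1}^{m} p(z_k \mid \mathcal{X}_k), \\
  \mathcal{X}^{\star}
    &= \operatorname*{arg\,min}_{\mathcal{X}} \; \sum_{k=0}^{m}
       \bigl\lVert h_k(\mathcal{X}_k) - z_k \bigr\rVert^{2}_{\Omega_k}
    \quad \text{(Gaussian noise; the prior is written as the } k = 0 \text{ term)} .
\end{align}
```

Here $\mathcal{X}_k$ is the subset of variables involved in the $k$-th measurement, and $\lVert e \rVert^{2}_{\Omega} = e^{\top} \Omega\, e$.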