
    Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image

    We consider the problem of dense depth prediction from a sparse set of depth measurements and a single RGB image. Since depth estimation from monocular images alone is inherently ambiguous and unreliable, we introduce additional sparse depth samples to attain a higher level of robustness and accuracy. These samples are either acquired with a low-resolution depth sensor or computed via visual Simultaneous Localization and Mapping (SLAM) algorithms. We propose the use of a single deep regression network to learn directly from the RGB-D raw data, and explore the impact of the number of depth samples on prediction accuracy. Our experiments show that, compared to using only RGB images, the addition of 100 spatially random depth samples reduces the prediction root-mean-square error by 50% on the NYU-Depth-v2 indoor dataset. It also boosts the percentage of reliable predictions from 59% to 92% on the KITTI dataset. We demonstrate two applications of the proposed algorithm: a plug-in module in SLAM to convert sparse maps to dense maps, and super-resolution for LiDARs. Software and video demonstration are publicly available.
    Comment: accepted to ICRA 2018. 8 pages, 8 figures, 3 tables. Video at https://www.youtube.com/watch?v=vNIIT_M7x7Y. Code at https://github.com/fangchangma/sparse-to-dens
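The abstract above refers to spatially random depth samples and to root-mean-square error. The following is a minimal sketch of both ideas, not the authors' code: the function names, the nested-list image representation, and the convention that a depth of 0 marks a missing measurement are all assumptions made for illustration.

```python
import math
import random


def sample_sparse_depth(depth, num_samples, seed=0):
    """Keep num_samples randomly chosen valid pixels and zero out the rest.

    Assumption: depth is a list of rows, and 0 marks "no measurement".
    """
    rng = random.Random(seed)
    height, width = len(depth), len(depth[0])
    valid = [(r, c) for r in range(height) for c in range(width)
             if depth[r][c] > 0]
    keep = set(rng.sample(valid, min(num_samples, len(valid))))
    return [[depth[r][c] if (r, c) in keep else 0.0 for c in range(width)]
            for r in range(height)]


def rmse(pred, target):
    """Root-mean-square error over pixels with valid ground truth (> 0)."""
    squared = [(p - t) ** 2
               for pred_row, target_row in zip(pred, target)
               for p, t in zip(pred_row, target_row)
               if t > 0]
    return math.sqrt(sum(squared) / len(squared))
```

In this sketch a predictor would receive the RGB image together with the output of `sample_sparse_depth`, and `rmse` would compare its dense prediction against the full depth map.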

    Dense Piecewise Planar RGB-D SLAM for Indoor Environments

    The paper exploits weak Manhattan constraints to parse the structure of indoor environments from RGB-D video sequences in an online setting. We extend the previous approach for single-view parsing of indoor scenes to video sequences and formulate the problem of recovering the floor plan of the environment as an optimal labeling problem solved using dynamic programming. Temporal continuity is enforced in a recursive setting, where the labeling from previous frames is used as a prior term in the objective function. In addition to the recovery of the piecewise planar weak Manhattan structure of the extended environment, the orthogonality constraints are also exploited by visual odometry and pose graph optimization. This yields reliable estimates in the presence of large motions and in the absence of distinctive features to track. We evaluate our method on several challenging indoor sequences, demonstrating accurate SLAM and dense mapping of low-texture environments. On the existing TUM benchmark we achieve results competitive with the alternative approaches, which fail in our environments.
    Comment: International Conference on Intelligent Robots and Systems (IROS) 201

    Benchmarking and Comparing Popular Visual SLAM Algorithms

    This paper presents a performance analysis and benchmark of two popular visual SLAM algorithms: RGBD-SLAM and RTABMap. The dataset used for the analysis is the TUM RGBD Dataset from the Computer Vision Group at TUM. It contains a large set of image sequences from a Microsoft Kinect RGB-D sensor with highly accurate and time-synchronized ground truth poses from a motion capture system. The selected test sequences depict a variety of problems and camera motions faced by Simultaneous Localization and Mapping (SLAM) algorithms, in order to test the robustness of the algorithms in different situations. The evaluation metrics used for the comparison are Absolute Trajectory Error (ATE) and Relative Pose Error (RPE). The analysis compares the Root Mean Square Error (RMSE) of the two metrics and the processing time for each algorithm. This paper serves as an aid in selecting a SLAM algorithm for different scenes and camera motions. The analysis helps to identify the limitations of both SLAM methods. This paper also points out some underlying flaws in the evaluation metrics used.
    Comment: 7 pages, 4 figure
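As a rough illustration of the two metrics named in the abstract above, the sketch below computes the RMSE of a simplified ATE and a simplified, translation-only RPE. It is not the benchmark's evaluation code. It assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame (a full ATE evaluation first aligns them with a rigid-body transform), and it ignores orientation entirely. The function names and the `delta` parameter are hypothetical.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose position differences (simplified ATE).

    Assumption: both trajectories are lists of (x, y, z) positions that are
    already matched by timestamp and aligned to a common frame.
    """
    squared = [sum((e - g) ** 2 for e, g in zip(est, gt))
               for est, gt in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def rpe_translation_rmse(estimated, ground_truth, delta=1):
    """RMSE of relative displacement differences over a step of delta poses.

    Assumption: orientation is ignored, so this only measures
    translational drift between pose i and pose i + delta.
    """
    squared = []
    for i in range(len(estimated) - delta):
        move_est = [b - a for a, b in zip(estimated[i], estimated[i + delta])]
        move_gt = [b - a for a, b in zip(ground_truth[i], ground_truth[i + delta])]
        squared.append(sum((x - y) ** 2 for x, y in zip(move_est, move_gt)))
    return math.sqrt(sum(squared) / len(squared))
```

A constant offset between the two trajectories shows up in this simplified ATE but not in the RPE, which only compares motion between poses.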

    A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration

    The ability to build maps is a key functionality for the majority of mobile robots. A central ingredient of most mapping systems is the registration or alignment of the recorded sensor data. In this paper, we present a general methodology for photometric registration that can deal with multiple different cues. We provide examples for registering RGBD as well as 3D LIDAR data. In contrast to popular point cloud registration approaches such as ICP, our method does not rely on explicit data association and exploits multiple modalities such as raw range and image data streams. Color, depth, and normal information are handled in a uniform manner, and the registration is obtained by minimizing the pixel-wise difference between two multi-channel images. We developed a flexible and general framework and implemented our approach inside that framework. We also released our implementation as open source C++ code. The experiments show that our approach allows for an accurate registration of the sensor data without requiring an explicit data association or model-specific adaptations to datasets or sensors. Our approach exploits the different cues in a natural and consistent way, and the registration can be done at framerate for a typical range or imaging sensor.
    Comment: 8 page
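The pixel-wise difference between two multi-channel images mentioned in the abstract above can be pictured as a weighted sum of squared channel differences. The sketch below only evaluates such a cost for two images that are already in pixel correspondence. It does not include the warping or the optimization over the sensor pose that a registration method needs, and it is not the released C++ implementation. The function name, the per-channel `weights`, and the tuple-per-pixel layout are assumptions.

```python
def multi_cue_error(image_a, image_b, weights):
    """Weighted sum of squared per-channel differences over all pixels.

    Assumption: each image is a list of rows, each pixel is a tuple of
    channel values (for example intensity, depth, normal components), and
    weights holds one non-negative weight per channel.
    """
    total = 0.0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            for weight, a, b in zip(weights, pixel_a, pixel_b):
                total += weight * (a - b) ** 2
    return total
```

A registration loop would re-render one image under a candidate transform and pick the transform that makes this cost smallest. Treating every cue as just another channel is what lets color, depth, and normals share one cost function.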