Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image
We consider the problem of dense depth prediction from a sparse set of depth
measurements and a single RGB image. Since depth estimation from monocular
images alone is inherently ambiguous and unreliable, to attain a higher level
of robustness and accuracy, we introduce additional sparse depth samples, which
are either acquired with a low-resolution depth sensor or computed via visual
Simultaneous Localization and Mapping (SLAM) algorithms. We propose the use of
a single deep regression network to learn directly from the RGB-D raw data, and
explore the impact of the number of depth samples on prediction accuracy. Our
experiments show that, compared to using only RGB images, the addition of 100
spatially random depth samples reduces the prediction root-mean-square error by
50% on the NYU-Depth-v2 indoor dataset. It also boosts the percentage of
reliable predictions from 59% to 92% on the KITTI dataset. We demonstrate two
applications of the proposed algorithm: a plug-in module in SLAM to convert
sparse maps to dense maps, and super-resolution for LiDARs. Software and a video
demonstration are publicly available.
Comment: Accepted to ICRA 2018. 8 pages, 8 figures, 3 tables. Video at
https://www.youtube.com/watch?v=vNIIT_M7x7Y. Code at
https://github.com/fangchangma/sparse-to-dens
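The evaluation setup described in this abstract (a fixed number of spatially random depth samples, scored by root-mean-square error) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `sample_sparse_depth` and `rmse`, the nested-list depth-map representation, and the convention that a depth of 0 means "no measurement" are all assumptions made for the example.

```python
import math
import random


def sample_sparse_depth(dense, num_samples, seed=0):
    """Keep num_samples spatially random valid pixels of a dense depth map.

    Assumption: depth 0 marks a missing measurement, so all other pixels
    in the returned map are set to 0.0.
    """
    rng = random.Random(seed)
    h, w = len(dense), len(dense[0])
    valid = [(r, c) for r in range(h) for c in range(w) if dense[r][c] > 0]
    keep = set(rng.sample(valid, min(num_samples, len(valid))))
    return [[dense[r][c] if (r, c) in keep else 0.0 for c in range(w)]
            for r in range(h)]


def rmse(pred, target):
    """Root-mean-square error over pixels with valid (positive) ground truth."""
    errs = [(p - t) ** 2
            for pred_row, target_row in zip(pred, target)
            for p, t in zip(pred_row, target_row)
            if t > 0]
    return math.sqrt(sum(errs) / len(errs))


# Hypothetical 4x5 ground-truth depth map (metres) for illustration only.
dense = [[1.0 + r + 0.1 * c for c in range(5)] for r in range(4)]
sparse = sample_sparse_depth(dense, num_samples=3)
```

In a pipeline of the kind the abstract describes, `sparse` would be fed to the network together with the RGB image, and `rmse` would compare the network output with `dense`.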
Dense Piecewise Planar RGB-D SLAM for Indoor Environments
The paper exploits weak Manhattan constraints to parse the structure of
indoor environments from RGB-D video sequences in an online setting. We extend
the previous approach for single view parsing of indoor scenes to video
sequences and formulate the problem of recovering the floor plan of the
environment as an optimal labeling problem solved using dynamic programming.
The temporal continuity is enforced in a recursive setting, where labeling from
previous frames is used as a prior term in the objective function. In addition
to recovery of piecewise planar weak Manhattan structure of the extended
environment, the orthogonality constraints are also exploited by visual
odometry and pose graph optimization. This yields reliable estimates in the
presence of large motions and absence of distinctive features to track. We
evaluate our method on several challenging indoor sequences, demonstrating
accurate SLAM and dense mapping of low-texture environments. On the existing TUM
benchmark, we achieve results competitive with alternative approaches, which
fail in our environments.
Comment: International Conference on Intelligent Robots and Systems (IROS)
201
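The idea of an optimal labeling solved by dynamic programming, with the previous frame's labeling acting as a prior term, can be illustrated with a generic Viterbi-style sketch. This is not the paper's formulation: the per-column costs, the constant `switch_cost` for changing label between neighbouring columns, and the `prior_weight` penalty for disagreeing with the previous frame are hypothetical stand-ins chosen to keep the example small.

```python
def label_columns(unary, switch_cost, prev_labels=None, prior_weight=0.0):
    """Minimum-cost labeling of a 1D sequence via dynamic programming.

    unary[i][l]  : cost of giving column i the label l (hypothetical data term)
    switch_cost  : penalty when adjacent columns take different labels
    prev_labels  : labeling of the previous frame, used as a prior (optional)
    prior_weight : penalty for disagreeing with prev_labels at a column
    """
    n, k = len(unary), len(unary[0])

    def cost(i, l):
        c = unary[i][l]
        if prev_labels is not None and prev_labels[i] != l:
            c += prior_weight
        return c

    best = [cost(0, l) for l in range(k)]
    back = []  # back[i-1][l] = best predecessor label for label l at column i
    for i in range(1, n):
        new_best, ptr = [], []
        for l in range(k):
            cands = [best[m] + (0.0 if m == l else switch_cost)
                     for m in range(k)]
            m_star = min(range(k), key=cands.__getitem__)
            new_best.append(cands[m_star] + cost(i, l))
            ptr.append(m_star)
        best = new_best
        back.append(ptr)

    # Trace the optimal path backwards from the cheapest final label.
    l = min(range(k), key=best.__getitem__)
    labels = [l]
    for ptr in reversed(back):
        l = ptr[l]
        labels.append(l)
    labels.reverse()
    return labels
```

With a large enough `switch_cost`, an isolated noisy column is absorbed into its neighbours' label; with a nonzero `prior_weight`, ambiguous columns follow the previous frame, which is the recursive temporal-continuity idea in miniature.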
Benchmarking and Comparing Popular Visual SLAM Algorithms
This paper presents a performance analysis and benchmark of two popular
visual SLAM algorithms: RGBD-SLAM and RTABMap. The dataset used for the
analysis is the TUM RGBD Dataset from the Computer Vision Group at TUM. The
dataset selected has a large set of image sequences from a Microsoft Kinect
RGB-D sensor with highly accurate and time-synchronized ground truth poses from
a motion capture system. The test sequences selected depict a variety of
problems and camera motions faced by Simultaneous Localization and Mapping
(SLAM) algorithms for the purpose of testing the robustness of the algorithms
in different situations. The evaluation metrics used for the comparison are
Absolute Trajectory Error (ATE) and Relative Pose Error (RPE). The analysis
involves comparing the Root Mean Square Error (RMSE) of the two metrics and the
processing time for each algorithm. This paper serves as an important aid in
the selection of a SLAM algorithm for different scenes and camera motions. The
analysis helps reveal the limitations of both SLAM methods. This paper also
points out some underlying flaws in the evaluation metrics used.
Comment: 7 pages, 4 figure
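The two metrics named in this abstract can be sketched in simplified form. The version below is an illustration under stated assumptions, not the benchmark's tooling: it uses only the translational part of each pose, and it assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same frame. The full metrics operate on rigid-body poses, and ATE is normally computed after rigidly aligning the two trajectories. The names `ate_rmse` and `rpe_rmse` are hypothetical.

```python
import math


def ate_rmse(est, gt):
    """RMSE of the absolute position error at each timestamp.

    Assumption: est and gt are equal-length lists of (x, y, z) positions
    that are already associated in time and aligned in space.
    """
    sq = [sum((e - g) ** 2 for e, g in zip(p, q)) for p, q in zip(est, gt)]
    return math.sqrt(sum(sq) / len(sq))


def rpe_rmse(est, gt, delta=1):
    """RMSE of the error in relative translation over a step of delta frames."""
    sq = []
    for i in range(len(est) - delta):
        d_est = [b - a for a, b in zip(est[i], est[i + delta])]
        d_gt = [b - a for a, b in zip(gt[i], gt[i + delta])]
        sq.append(sum((e - g) ** 2 for e, g in zip(d_est, d_gt)))
    return math.sqrt(sum(sq) / len(sq))


# Hypothetical trajectories: the estimate has a constant 1 m offset in y.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
```

In this toy case the constant offset is charged in full by the unaligned ATE but leaves the RPE at zero, since the frame-to-frame motion is reproduced exactly. The two metrics measure different things, global consistency versus local drift.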
A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration
The ability to build maps is a key functionality for the majority of mobile
robots. A central ingredient to most mapping systems is the registration or
alignment of the recorded sensor data. In this paper, we present a general
methodology for photometric registration that can deal with multiple different
cues. We provide examples for registering RGBD as well as 3D LIDAR data. In
contrast to popular point cloud registration approaches such as ICP, our method
does not rely on explicit data association and exploits multiple modalities
such as raw range and image data streams. Color, depth, and normal information
are handled in a uniform manner and the registration is obtained by minimizing
the pixel-wise difference between two multi-channel images. We developed a
flexible and general framework and implemented our approach inside that
framework. We also released our implementation as open source C++ code. The
experiments show that our approach allows for an accurate registration of the
sensor data without requiring an explicit data association or model-specific
adaptations to datasets or sensors. Our approach exploits the different cues in
a natural and consistent way and the registration can be done at framerate for
a typical range or imaging sensor.
Comment: 8 page
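The core quantity in this abstract, a pixel-wise difference between two multi-channel images, can be illustrated with a toy example. The sketch below is not the released C++ implementation. It treats each pixel as a tuple of cue values (for instance intensity and depth), weights the channels with hypothetical `weights`, and replaces the pose optimization with a brute-force search over integer column shifts, purely to show that the alignment falls out of the image difference without matching individual points.

```python
def multi_cue_cost(img_a, img_b, weights):
    """Weighted sum of squared per-pixel, per-channel differences.

    img_a, img_b : equal-sized images as rows of pixels, each pixel a tuple
                   of cue values (the channel layout is an assumption)
    weights      : one hypothetical weight per channel
    """
    total = 0.0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            total += sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, px_a, px_b))
    return total


def best_column_shift(img_a, img_b, weights, max_shift):
    """Return the integer shift s minimizing the mean cost between
    img_a[r][c] and img_b[r][c + s] over the overlapping columns.

    A stand-in for pose optimization; assumes max_shift < image width.
    """
    width = len(img_a[0])
    best = None
    for s in range(-max_shift, max_shift + 1):
        cols = [c for c in range(width) if 0 <= c + s < width]
        crop_a = [[row[c] for c in cols] for row in img_a]
        crop_b = [[row[c + s] for c in cols] for row in img_b]
        mean_cost = (multi_cue_cost(crop_a, crop_b, weights)
                     / (len(cols) * len(img_a)))
        if best is None or mean_cost < best[1]:
            best = (s, mean_cost)
    return best[0]


# Hypothetical one-row images with (intensity, depth) pixels;
# img_b is img_a moved one column to the right.
img_a = [[(0.0, 1.0), (1.0, 1.0), (4.0, 2.0), (9.0, 2.0), (16.0, 3.0), (25.0, 3.0)]]
img_b = [[(99.0, 9.0)] + img_a[0][:5]]
shift = best_column_shift(img_a, img_b, weights=(1.0, 1.0), max_shift=2)
```

Adding a further cue, such as surface normals, would only mean extending each pixel tuple and the `weights` vector, which is one way to read the abstract's claim that all cues are handled uniformly.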