784 research outputs found
Semantically Guided Depth Upsampling
We present a novel method for accurate and efficient up- sampling of sparse
depth data, guided by high-resolution imagery. Our approach goes beyond the use
of intensity cues only and additionally exploits object boundary cues through
structured edge detection and semantic scene labeling for guidance. Both cues
are combined within a geodesic distance measure that allows for
boundary-preserving depth in- terpolation while utilizing local context. We
model the observed scene structure by locally planar elements and formulate the
upsampling task as a global energy minimization problem. Our method determines
glob- ally consistent solutions and preserves fine details and sharp depth
bound- aries. In our experiments on several public datasets at different levels
of application, we demonstrate superior performance of our approach over the
state-of-the-art, even for very sparse measurements.Comment: German Conference on Pattern Recognition 2016 (Oral
Layered Interpretation of Street View Images
We propose a layered street view model to encode both depth and semantic
information on street view images for autonomous driving. Recently, stixels,
stix-mantics, and tiered scene labeling methods have been proposed to model
street view images. We propose a 4-layer street view model, a compact
representation over the recently proposed stix-mantics model. Our layers encode
semantic classes like ground, pedestrians, vehicles, buildings, and sky in
addition to the depths. The only input to our algorithm is a pair of stereo
images. We use a deep neural network to extract the appearance features for
semantic classes. We use a simple and an efficient inference algorithm to
jointly estimate both semantic classes and layered depth values. Our method
outperforms other competing approaches in Daimler urban scene segmentation
dataset. Our algorithm is massively parallelizable, allowing a GPU
implementation with a processing speed about 9 fps.Comment: The paper will be presented in the 2015 Robotics: Science and Systems
Conference (RSS
Effects of Ground Manifold Modeling on the Accuracy of Stixel Calculations
This paper highlights the role of ground manifold modeling for stixel calculations; stixels are medium-level data representations used for the development of computer vision modules for self-driving cars. By using single-disparity maps and simplifying ground manifold models, calculated stixels may suffer from noise, inconsistency, and false-detection rates for obstacles, especially in challenging datasets. Stixel calculations can be improved with respect to accuracy and robustness by using more adaptive ground manifold approximations. A comparative study of stixel results, obtained for different ground-manifold models (e.g., plane-fitting, line-fitting in v-disparities or polynomial approximation, and graph cut), defines the main part of this paper. This paper also considers the use of trinocular stereo vision and shows that this provides options to enhance stixel results, compared with the binocular recording. Comprehensive experiments are performed on two publicly available challenging datasets. We also use a novel way for comparing calculated stixels with ground truth. We compare depth information, as given by extracted stixels, with ground-truth depth, provided by depth measurements using a highly accurate LiDAR range sensor (as available in one of the public datasets). We evaluate the accuracy of four different ground-manifold methods. The experimental results also include quantitative evaluations of the tradeoff between accuracy and run time. As a result, the proposed trinocular recording together with graph-cut estimation of ground manifolds appears to be a recommended way, also considering challenging weather and lighting conditions
Robust and accurate depth estimation by fusing LiDAR and Stereo
Depth estimation is one of the key technologies in some fields such as
autonomous driving and robot navigation. However, the traditional method of
using a single sensor is inevitably limited by the performance of the sensor.
Therefore, a precision and robust method for fusing the LiDAR and stereo
cameras is proposed. This method fully combines the advantages of the LiDAR and
stereo camera, which can retain the advantages of the high precision of the
LiDAR and the high resolution of images respectively. Compared with the
traditional stereo matching method, the texture of the object and lighting
conditions have less influence on the algorithm. Firstly, the depth of the
LiDAR data is converted to the disparity of the stereo camera. Because the
density of the LiDAR data is relatively sparse on the y-axis, the converted
disparity map is up-sampled using the interpolation method. Secondly, in order
to make full use of the precise disparity map, the disparity map and stereo
matching are fused to propagate the accurate disparity. Finally, the disparity
map is converted to the depth map. Moreover, the converted disparity map can
also increase the speed of the algorithm. We evaluate the proposed pipeline on
the KITTI benchmark. The experiment demonstrates that our algorithm has higher
accuracy than several classic methods
- …