Morphological processing of stereoscopic image superimpositions for disparity map estimation
This paper deals with the problem of depth map computation from a pair of rectified stereo images and presents a novel solution based on the morphological processing of disparity space volumes. The reader is guided through the four steps composing the proposed method: the segmentation of the stereo images, the segmentation-controlled diffusion of superimposition costs, the generation of a sparse disparity map, and finally the estimation of the dense disparity map driven by the sparse one. An objective evaluation of the algorithm's features and qualities is provided, accompanied by results obtained on Middlebury's 2014 stereo database.
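The sparse-to-dense pipeline described above can be illustrated with a minimal disparity-space-volume sketch. This uses plain absolute-difference costs and winner-take-all selection; the paper's segmentation and morphological diffusion steps are omitted, and all function names are hypothetical:

```python
import numpy as np

def disparity_space_volume(left, right, max_disp):
    """Build a disparity space volume of absolute-difference costs:
    volume[d, y, x] = |left[y, x] - right[y, x - d]|.
    Entries where x < d (no valid right-image pixel) stay at infinity."""
    h, w = left.shape
    volume = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        volume[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    return volume

def winner_take_all(volume):
    """Pick the disparity of minimum cost at each pixel."""
    return np.argmin(volume, axis=0)
```

In the actual method, the raw costs would be diffused under segmentation control before a sparse, confident subset of the winner-take-all map is kept and densified.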
Robust and accurate depth estimation by fusing LiDAR and Stereo
Depth estimation is one of the key technologies in fields such as
autonomous driving and robot navigation. However, traditional methods
using a single sensor are inevitably limited by the performance of that sensor.
Therefore, a precise and robust method for fusing LiDAR and stereo
cameras is proposed. This method fully combines the advantages of the two
sensors, retaining the high precision of the LiDAR and the high resolution
of the images. Compared with traditional stereo matching methods, the
algorithm is less affected by object texture and lighting conditions.
Firstly, the depth of the
LiDAR data is converted to the disparity of the stereo camera. Because the
density of the LiDAR data is relatively sparse on the y-axis, the converted
disparity map is up-sampled using the interpolation method. Secondly, in order
to make full use of the precise disparity map, the disparity map and stereo
matching are fused to propagate the accurate disparity. Finally, the disparity
map is converted to the depth map. Moreover, the converted disparity map can
also increase the speed of the algorithm. We evaluate the proposed pipeline on
the KITTI benchmark. The experiments demonstrate that our algorithm achieves
higher accuracy than several classic methods.
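The first step above, converting LiDAR depth to stereo disparity, follows the standard pinhole relation d = f·B/Z (and its inverse for the final disparity-to-depth conversion). A minimal sketch with illustrative parameter names:

```python
def depth_to_disparity(depth_m, focal_px, baseline_m):
    """Convert a metric depth Z to stereo disparity in pixels: d = f * B / Z.
    focal_px is the focal length in pixels, baseline_m the stereo baseline
    in metres (illustrative names, not from the paper)."""
    return focal_px * baseline_m / depth_m

def disparity_to_depth(disp_px, focal_px, baseline_m):
    """Inverse relation: Z = f * B / d."""
    return focal_px * baseline_m / disp_px
```

The inverse relation also explains the long-range weakness of pure stereo noted in several of these abstracts: since Z ∝ 1/d, a one-pixel disparity error at small d translates to a large depth error.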
Hierarchical inference of disparity
Disparity-selective cells in V1 respond to correlated receptive fields of the left and right retinae, which do not necessarily correspond to the same object in the 3D scene; i.e., these cells respond equally to both false and correct stereo matches. On the other hand, neurons in the extrastriate visual area V2 show much stronger responses to correct visual matches [Bakin et al., 2000]. This indicates that part of the stereo correspondence problem is solved during disparity processing in these two areas. However, the mechanisms employed by the brain to accomplish this task are not yet understood. Existing computational models are mostly based on cooperative computations in V1 [Marr and Poggio, 1976; Read and Cumming, 2007], without exploiting the potential benefits of the hierarchical structure between V1 and V2. Here we propose a two-layer graphical model for disparity estimation from stereo. The lower layer matches the linear responses of neurons with Gabor receptive fields across images. Nodes in the upper layer infer a sparse code of the disparity map and act as priors that help disambiguate false from correct matches. When learned on natural disparity maps, the receptive fields of the sparse code converge to oriented depth edges, which is consistent with electrophysiological studies in macaque [von der Heydt et al., 2000]. Moreover, when such a code is used for depth inference in our two-layer model, the resulting disparity map for the Tsukuba stereo pair [Middlebury database] has 40% fewer false matches than the solution given by the first layer. Our model offers a demonstration of hierarchical disparity computation, leading to testable predictions about V1-V2 interactions.
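The lower layer's matching of Gabor linear responses can be sketched as a binocular-energy-style product of monocular responses. This is a simplification of the model described above, and the kernel parameters are illustrative:

```python
import numpy as np

def gabor_1d(x, freq, sigma, phase=0.0):
    """1-D Gabor receptive field: a Gaussian envelope times a sinusoid."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def binocular_response(left_patch, right_patch, kernel):
    """Product of the monocular linear responses to the same kernel,
    a crude stand-in for the first-layer match score: large and positive
    when both eyes see the same pattern, negative when they disagree."""
    return float(np.dot(kernel, left_patch) * np.dot(kernel, right_patch))
```

A response like this is high for both correct and false matches that happen to correlate, which is exactly why the model's upper layer (the sparse prior on the disparity map) is needed to disambiguate them.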
Non-learning Stereo-aided Depth Completion under Mis-projection via Selective Stereo Matching
We propose a non-learning depth completion method for a sparse depth map
captured using a light detection and ranging (LiDAR) sensor guided by a pair of
stereo images. Generally, conventional stereo-aided depth completion methods
have two limitations. (i) They assume the given sparse depth map is accurately
aligned to the input image, whereas the alignment is difficult to achieve in
practice. (ii) They have limited accuracy in the long range because the depth
is estimated by pixel disparity. To solve the abovementioned limitations, we
propose selective stereo matching (SSM) that searches the most appropriate
depth value for each image pixel from its nearby projected LiDAR points
based on an energy minimization framework. This depth selection approach can
handle any type of mis-projection. Moreover, SSM has an advantage in terms of
long-range depth accuracy because it directly uses the LiDAR measurement rather
than the depth acquired from the stereo. SSM is a discrete process; thus, we
apply variational smoothing with binary anisotropic diffusion tensor (B-ADT) to
generate a continuous depth map while preserving depth discontinuity across
object boundaries. Experimentally, compared with the previous state-of-the-art
stereo-aided depth completion, the proposed method reduced the mean absolute
error (MAE) of the depth estimation to 0.65 times that of the previous method
and achieved approximately twice the accuracy in the long range. Moreover, under
various LiDAR-camera calibration errors, the proposed method reduced the depth
estimation MAE to 0.34-0.93 times that of previous depth completion methods.
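The depth-selection idea behind SSM can be sketched as a per-pixel minimization over candidate projected LiDAR depths. The full method's energy also couples neighboring pixels and is followed by B-ADT smoothing; this data-term-only version and its names are hypothetical:

```python
def selective_match(pixel_candidates):
    """For each pixel, choose among its candidate LiDAR depths the one with
    the lowest matching cost (e.g. a stereo photometric cost).

    pixel_candidates maps a pixel coordinate to a list of (depth, cost)
    pairs; returns a sparse {pixel: depth} map. Hypothetical sketch of the
    selection step only, not the paper's full energy minimization."""
    return {pixel: min(candidates, key=lambda dc: dc[1])[0]
            for pixel, candidates in pixel_candidates.items()}
```

Because the selected value is always an actual LiDAR measurement rather than a depth triangulated from pixel disparity, the long-range accuracy of the sensor is preserved, which matches the advantage claimed above.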
Traffic sign detection and tracking using robust 3D analysis
In this paper we present an innovative technique to tackle the problem of automatic road sign detection and tracking using an on-board stereo camera. It involves a continuous 3D analysis of the road sign during the whole tracking process. Firstly, a color and appearance based model is applied to generate road sign candidates in both stereo images. A sparse disparity map between the left and right images is then created for each candidate by using contour-based and SURF-based matching in the far and short range, respectively. Once the map has been computed, the correspondences are back-projected to generate a cloud of 3D points, and the best-fit plane is computed through RANSAC, ensuring robustness to outliers. Temporal consistency is enforced by means of a Kalman filter, which exploits the intrinsic smoothness of the 3D camera motion in traffic environments. Additionally, the estimation of the plane makes it possible to correct deformations due to perspective, thus easing further sign classification.
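The plane-fitting step can be sketched with a generic RANSAC loop over 3-point samples of the back-projected cloud (iteration count and inlier threshold are illustrative, not the paper's values):

```python
import numpy as np

def ransac_plane(points, iters=200, thresh=0.01, rng=None):
    """Fit a plane to an (N, 3) point cloud with RANSAC.

    Returns (normal, d) with n . p + d = 0 for points p on the plane.
    Generic sketch: sample 3 points, build the plane through them, count
    inliers within `thresh`, keep the best hypothesis."""
    rng = np.random.default_rng(rng)
    best_inliers, best = 0, None
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        n = n / norm
        d = -np.dot(n, sample[0])
        inliers = np.sum(np.abs(points @ n + d) < thresh)
        if inliers > best_inliers:
            best_inliers, best = inliers, (n, d)
    return best
```

A refinement step (least-squares fit on the inlier set) would typically follow the loop; the recovered plane normal is what enables the perspective correction mentioned above.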
Learning sparse representations of depth
This paper introduces a new method for learning and inferring sparse
representations of depth (disparity) maps. The proposed algorithm relaxes the
usual assumption of the stationary noise model in sparse coding. This enables
learning from data corrupted with spatially varying noise or uncertainty,
typically obtained by laser range scanners or structured light depth cameras.
Sparse representations are learned from the Middlebury database disparity maps
and then exploited in a two-layer graphical model for inferring depth from
stereo, by including a sparsity prior on the learned features. Since they
capture higher-order dependencies in the depth structure, these priors can
complement smoothness priors commonly used in depth inference based on Markov
Random Field (MRF) models. Inference on the proposed graph is achieved using an
alternating iterative optimization technique, where the first layer is solved
using an existing MRF-based stereo matching algorithm, then held fixed as the
second layer is solved using the proposed non-stationary sparse coding
algorithm. This leads to a general method for improving the solutions of
state-of-the-art MRF-based depth estimation algorithms. Our experimental
results first show that depth inference using learned representations leads to
state-of-the-art denoising of depth maps obtained from laser range scanners and
a time-of-flight camera. Furthermore, we show that adding sparse priors
improves the results of two depth estimation methods: the classical graph cut
algorithm by Boykov et al. and the more recent algorithm of Woodford et al.
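The non-stationary (spatially varying) noise idea can be sketched as sparse coding with a per-pixel-weighted data term, solved by ISTA. This is a generic sketch of the weighted objective min ||sqrt(w) ⊙ (x − Da)||² + λ|a|₁, not the authors' exact algorithm:

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def weighted_ista(D, x, w, lam=0.1, steps=200):
    """Sparse coding under non-stationary noise: each pixel i of the signal x
    carries its own confidence weight w[i] (e.g. from a laser scanner's
    per-point uncertainty), so reliable pixels dominate the reconstruction.

    Minimizes 0.5 * ||sqrt(w) * (x - D a)||^2 + lam * |a|_1 via ISTA.
    Hypothetical sketch with illustrative parameter names."""
    Dw = D * w[:, None]                        # rows of D scaled by weights
    L = np.linalg.norm(Dw.T @ D, 2) + 1e-12    # Lipschitz constant ||D^T W D||
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        grad = D.T @ (w * (D @ a - x))         # gradient of the weighted data term
        a = soft_threshold(a - grad / L, lam / L)
    return a
```

With w constant this reduces to ordinary sparse coding; letting w vary per pixel is what allows learning from depth data whose uncertainty differs from point to point, as the abstract describes.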