4,707 research outputs found

    Cross-Scale Cost Aggregation for Stereo Matching

    Full text link
    Human beings process stereoscopic correspondence across multiple scales. However, this bio-inspiration is ignored by state-of-the-art cost aggregation methods for dense stereo correspondence. In this paper, a generic cross-scale cost aggregation framework is proposed to allow multi-scale interaction in cost aggregation. We firstly reformulate cost aggregation from a unified optimization perspective and show that different cost aggregation methods essentially differ in the choices of similarity kernels. Then, an inter-scale regularizer is introduced into optimization and solving this new optimization problem leads to the proposed framework. Since the regularization term is independent of the similarity kernel, various cost aggregation methods can be integrated into the proposed general framework. We show that the cross-scale framework is important as it effectively and efficiently expands state-of-the-art cost aggregation methods and leads to significant improvements, when evaluated on Middlebury, KITTI and New Tsukuba datasets.Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014 (poster, 29.88%

    Local Stereo Matching Using Adaptive Local Segmentation

    Get PDF
    We propose a new dense local stereo matching framework for gray-level images based on an adaptive local segmentation using a dynamic threshold. We define a new validity domain of the fronto-parallel assumption based on the local intensity variations in the 4-neighborhood of the matching pixel. The preprocessing step smoothes low textured areas and sharpens texture edges, whereas the postprocessing step detects and recovers occluded and unreliable disparities. The algorithm achieves high stereo reconstruction quality in regions with uniform intensities as well as in textured regions. The algorithm is robust against local radiometrical differences; and successfully recovers disparities around the objects edges, disparities of thin objects, and the disparities of the occluded region. Moreover, our algorithm intrinsically prevents errors caused by occlusion to propagate into nonoccluded regions. It has only a small number of parameters. The performance of our algorithm is evaluated on the Middlebury test bed stereo images. It ranks highly on the evaluation list outperforming many local and global stereo algorithms using color images. Among the local algorithms relying on the fronto-parallel assumption, our algorithm is the best ranked algorithm. We also demonstrate that our algorithm is working well on practical examples as for disparity estimation of a tomato seedling and a 3D reconstruction of a face

    A Nonlocal Method with Modified Initial Cost and Multiple Weight for Stereo Matching

    Get PDF
    This paper presents a new nonlocal cost aggregation method for stereo matching. The minimum spanning tree (MST) employs color difference as the sole component to build the weight function, which often leads to failure in achieving satisfactory results in some boundary regions with similar color distributions. In this paper, a modified initial cost is used. The erroneous pixels are often caused by two pixels from object and background, which have similar color distribution. And then inner color correlation is employed as a new component of the weight function, which is determined to effectively eliminate them. Besides, the segmentation method of the tree structure is also improved. Thus, a more robust and reasonable tree structure is developed. The proposed method was tested on Middlebury datasets. As can be expected, experimental results show that the proposed method outperforms the classical nonlocal methods

    ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

    Full text link
    In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allow the network to reach compelling results. Finally we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground-truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitatively and qualitatively evaluations on real and synthetic data demonstrate state of the art results in many challenging scenes.Comment: Accepted by ECCV2018, Oral Presentation, Main paper + Supplementary Material
    • ā€¦
    corecore