23,228 research outputs found

    Guided Stereo Matching

    Stereo is a prominent technique for inferring dense depth maps from images, and deep learning has further advanced the state of the art, making end-to-end architectures unrivaled when enough training data is available. However, deep networks suffer significant drops in accuracy when dealing with new environments. Therefore, in this paper, we introduce Guided Stereo Matching, a novel paradigm that mitigates this weakness by leveraging a small amount of sparse yet reliable depth measurements retrieved from an external source. The additional sparse cues required by our method can be obtained with any strategy (e.g., a LiDAR) and are used to enhance features linked to corresponding disparity hypotheses. Our formulation is general and fully differentiable, so the additional sparse inputs can be exploited in pre-trained deep stereo networks as well as when training a new instance from scratch. Extensive experiments on three standard datasets and two state-of-the-art deep architectures show that, even with a small set of sparse input cues, i) the proposed paradigm enables significant improvements to pre-trained networks; ii) training from scratch notably increases accuracy and robustness to domain shifts; and iii) it is suited and effective even with traditional stereo algorithms such as SGM. Comment: CVPR 201
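
    The idea of enhancing features linked to hinted disparity hypotheses can be illustrated with a small sketch. The abstract does not say how the enhancement is computed; the Gaussian weighting below, the function names `guide_cost_volume` and `winner_takes_all`, and the `width` and `height` parameters are all assumptions made for illustration, not the authors' implementation.

```python
import math


def guide_cost_volume(cost, hints, width=1.0, height=10.0):
    """Boost similarity scores near sparse disparity hints.

    cost  : cost[row][col][d] is a similarity score (higher = better match).
    hints : dict mapping (row, col) -> hinted disparity; pixels without
            a hint are left untouched.
    The Gaussian shape and both parameters are illustrative assumptions.
    """
    guided = [[list(scores) for scores in row] for row in cost]
    for (r, c), d_hint in hints.items():
        for d, score in enumerate(cost[r][c]):
            # Weight peaks at the hinted disparity and decays away from it.
            w = height * math.exp(-((d - d_hint) ** 2) / (2.0 * width ** 2))
            guided[r][c][d] = score * w
    return guided


def winner_takes_all(volume):
    """Pick, per pixel, the disparity with the highest score."""
    return [[max(range(len(s)), key=s.__getitem__) for s in row]
            for row in volume]


# One image row, two pixels, four disparity hypotheses each.
cost = [[[0.2, 0.9, 0.3, 0.8], [0.1, 0.5, 0.4, 0.2]]]
print(winner_takes_all(cost))                                    # [[1, 1]]
print(winner_takes_all(guide_cost_volume(cost, {(0, 0): 3})))    # [[3, 1]]
```

    In this toy case a single hint at pixel (0, 0) moves its estimate from disparity 1 to the hinted disparity 3, while the pixel without a hint keeps its original estimate.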

    Guided Filtering based Pyramidal Stereo Matching for Unrectified Images

    Stereo matching deals with recovering quantitative depth information from a set of input images, based on the visual disparity between corresponding points. Most algorithms assume that the processed images are rectified. As robotics becomes popular, conducting stereo matching in the context of cloth manipulation, such as obtaining the disparity map of garments from the two cameras of a cloth-folding robot, is useful and challenging. The challenge stems from the need for high efficiency, high accuracy and low memory use when processing the high-resolution images required to capture the details (e.g. cloth wrinkles) for the given application (e.g. cloth folding). Meanwhile, the images can be unrectified. Therefore, we propose to adapt the guided filtering algorithm into a pyramidal stereo matching framework that works directly on unrectified images. To evaluate the proposed unrectified stereo matching in terms of accuracy, we present three datasets that are especially suited to the characteristics of cloth manipulation tasks. By comparing the proposed algorithm with two baseline algorithms on those three datasets, we demonstrate that our proposed approach is accurate, efficient and requires little memory. This also shows that, rather than relying on image rectification, directly applying stereo matching to the unrectified images can be both effective and efficient.

    Evaluation of confidence-driven cost aggregation strategies

    In this thesis I describe eight new stereo matching algorithms that perform the cost-aggregation step using a guided filter with a confidence map as the guidance image, and that share the structure of a linear stereo matching algorithm. The results of running the proposed algorithms on four pictures from the Middlebury dataset are shown as well. Finally, based on these results, a ranking of the proposed algorithms is presented.

    Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs

    The human visual system relies on both binocular stereo cues and monocular focusness cues to gain effective 3D perception. In computer vision, the two problems are traditionally solved in separate tracks. In this paper, we present a unified learning-based technique that simultaneously uses both types of cues for depth inference. Specifically, we use a pair of focal stacks as input to emulate human perception. We first construct a comprehensive focal stack training dataset synthesized by depth-guided light field rendering. We then construct three individual networks: a Focus-Net to extract depth from a single focal stack, an EDoF-Net to obtain the extended depth of field (EDoF) image from the focal stack, and a Stereo-Net to conduct stereo matching. We show how to integrate them into a unified BDfF-Net to obtain high-quality depth maps. Comprehensive experiments show that our approach outperforms the state of the art in both accuracy and speed and effectively emulates human vision systems.

    A new high resolution depth map estimation system using stereo vision and depth sensing device

    Depth map estimation is a classical problem in computer vision. Conventional depth estimation relies on stereo/multi-view matching or depth sensing devices alone. In this paper, we propose a system that addresses high-resolution, high-quality depth estimation based on joint fusion of stereo and Kinect data. The problem is formulated as a maximum a posteriori probability (MAP) estimation problem, and the reliabilities of the two devices are derived. The estimated depth map is further refined by color-image-guided depth matting and a 2D polynomial regression (LPR)-based filtering. Experimental results show that our system can provide high-quality, high-resolution depth maps, complementing the strengths of stereo vision and the Kinect depth sensor. © 2013 IEEE.

    New Stereo Vision Algorithm Composition Using Weighted Adaptive Histogram Equalization and Gamma Correction

    This work presents the composition of a new algorithm for a stereo vision system to acquire accurate depth measurements from stereo correspondence. Stereo correspondence produced by matching is commonly affected by image noise such as illumination variation, blurry boundaries, and radiometric differences. The proposed algorithm introduces a pre-processing step based on the combination of Contrast Limited Adaptive Histogram Equalization (CLAHE) and Adaptive Gamma Correction Weighted Distribution (AGCWD) with a guided filter (GF). The cost value of the pre-processed images is determined in the matching cost step using the census transform (CT), which is followed by aggregation using the fixed-window and GF techniques. A winner-takes-all (WTA) approach is employed to select the disparity with the minimum cost, and final refinement uses left-right consistency checking (LR) along with a weighted median filter (WMF) to remove outliers. The algorithm improved accuracy by 31.65% for all pixel errors and 23.35% for pixel errors in nonoccluded regions compared to several established algorithms on a Middlebury dataset.
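
    The census transform used in the matching cost step can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: the 3x3 window (`radius=1`), the border clamping, and the function names `census_transform` and `hamming` are assumptions.

```python
def census_transform(img, radius=1):
    """Encode each pixel as a bit string (stored as an int).

    A bit is 1 where the neighbour is darker than the centre pixel.
    Coordinates outside the image are clamped to the border (an
    assumption; other border policies are possible).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre pixel is not compared with itself
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    bits = (bits << 1) | (1 if img[yy][xx] < img[y][x] else 0)
            out[y][x] = bits
    return out


def hamming(a, b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(a ^ b).count("1")


img = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
# A gain and offset change (a simple radiometric difference) leaves
# the census codes unchanged, because only intensity order matters.
brighter = [[2 * v + 10 for v in row] for row in img]
print(census_transform(img) == census_transform(brighter))  # True
```

    Because the codes depend only on the ordering of intensities inside the window, the Hamming cost is unaffected by monotonic brightness changes, which is why the census transform is a common choice when radiometric differences are a concern.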

    Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching

    Correlation-based stereo matching, which builds a cost volume between two feature maps, has achieved outstanding performance. Unfortunately, current methods with a fixed model do not work uniformly well across various datasets, greatly limiting their real-world applicability. To tackle this issue, this paper proposes a new perspective: dynamically calculating correlation for robust stereo matching. A novel Uncertainty Guided Adaptive Correlation (UGAC) module is introduced to robustly adapt the same model to different scenarios. Specifically, a variance-based uncertainty estimation is employed to adaptively adjust the sampling area during the warping operation. Additionally, we improve the traditional non-parametric warping with learnable parameters, such that position-specific weights can be learned. We show that by empowering the recurrent network with the UGAC module, stereo matching can be exploited more robustly and effectively. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the ETH3D, KITTI, and Middlebury datasets when employing the same fixed model across these datasets without any retraining procedure. To target real-time applications, we further design a lightweight model based on UGAC, which also outperforms other methods on the KITTI benchmarks with only 0.6 M parameters. Comment: Accepted by ICCV202

    Stereo matching based on absolute differences for multiple objects detection

    This article presents a new algorithm for object detection using a stereo camera system. The obstacle to accurate object detection with a stereo camera is the imprecision of the matching process between the two camera views. Hence, this article aims to reduce the number of incorrectly matched pixels using four stages. The new algorithm is a sequential combination of matching cost computation, aggregation, optimization and filtering. The first stage is matching cost computation, which acquires a preliminary result using an absolute differences method. The second stage, the aggregation step, uses a guided filter with a fixed window support size. After that, the optimization stage uses a winner-takes-all (WTA) approach, which selects the smallest matching difference value and normalizes it to the disparity level. The last stage in the framework uses a bilateral filter, which further decreases the error in the disparity map that contains the object detection and location information. The proposed work produces low errors (12.11% nonocc error and 14.01% all error) on the KITTI dataset, performs much better than the pipeline without the proposed framework, and is competitive with some recently published methods.
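
    The first three stages described above (absolute-difference cost, window aggregation, WTA) can be sketched as follows. This is a simplified illustration under stated assumptions: a plain box filter stands in for the guided filter, the final bilateral filtering stage is omitted, and the function name `ad_disparity`, its parameters, and the out-of-bounds cost of 255 are hypothetical choices, not taken from the article.

```python
def ad_disparity(left, right, max_disp, radius=1):
    """Estimate a disparity map from two rectified grayscale images.

    left, right : lists of rows of pixel intensities, same size.
    A box filter replaces the guided filter named in the abstract.
    """
    h, w = len(left), len(left[0])
    big = 255.0  # assumed cost where the right-image pixel is out of bounds

    # Stage 1: absolute-difference matching cost per pixel and disparity.
    cost = [[[abs(left[y][x] - right[y][x - d]) if x - d >= 0 else big
              for d in range(max_disp + 1)]
             for x in range(w)]
            for y in range(h)]

    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_d, best_c = 0, float("inf")
            for d in range(max_disp + 1):
                # Stage 2: aggregate the cost over a fixed window.
                total, n = 0.0, 0
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        total += cost[yy][xx][d]
                        n += 1
                mean = total / n
                # Stage 3: winner-takes-all keeps the lowest aggregated cost.
                if mean < best_c:
                    best_d, best_c = d, mean
            disp[y][x] = best_d
    return disp
```

    On a synthetic pair in which the left image is the right image shifted by two pixels, this sketch should recover a disparity of 2 away from the left border. Replacing the box filter with an edge-preserving filter is what lets aggregation avoid blurring disparities across object boundaries.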
