
    Cross-Scale Cost Aggregation for Stereo Matching

    Human beings process stereoscopic correspondence across multiple scales. However, this bio-inspiration is ignored by state-of-the-art cost aggregation methods for dense stereo correspondence. In this paper, a generic cross-scale cost aggregation framework is proposed to allow multi-scale interaction in cost aggregation. We first reformulate cost aggregation from a unified optimization perspective and show that different cost aggregation methods essentially differ in their choice of similarity kernel. Then, an inter-scale regularizer is introduced into the optimization, and solving this new optimization problem leads to the proposed framework. Since the regularization term is independent of the similarity kernel, various cost aggregation methods can be integrated into the proposed general framework. We show that the cross-scale framework is important as it effectively and efficiently expands state-of-the-art cost aggregation methods and leads to significant improvements when evaluated on the Middlebury, KITTI and New Tsukuba datasets. Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014 (poster, 29.88%
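
    The "unified optimization perspective" mentioned above can be illustrated with a small sketch. It assumes the usual weighted least-squares reading of cost aggregation: the aggregated cost at a pixel is the value that minimizes a kernel-weighted sum of squared differences to its neighbours' costs, which works out to a normalized weighted average. The function names, the 1-D setting, and the box kernel are illustrative assumptions, not taken from the paper, and the inter-scale regularizer is not modelled here.

```python
def aggregate_costs(costs, kernel):
    """Aggregate a 1-D row of matching costs at one disparity level.

    costs  -- list of raw per-pixel matching costs
    kernel -- kernel(i, j) -> non-negative similarity weight
              (hypothetical signature, chosen for this sketch)
    """
    aggregated = []
    for i in range(len(costs)):
        weights = [kernel(i, j) for j in range(len(costs))]
        normalizer = sum(weights)
        # Closed-form minimizer of sum_j K(i, j) * (z - costs[j])**2
        aggregated.append(sum(w * c for w, c in zip(weights, costs)) / normalizer)
    return aggregated


def box_kernel(radius):
    """Box filter: weight 1 inside the window, 0 outside."""
    return lambda i, j: 1.0 if abs(i - j) <= radius else 0.0


def wls_objective(z, i, costs, kernel):
    """Weighted least-squares objective that the aggregated cost minimizes."""
    return sum(kernel(i, j) * (z - c) ** 2 for j, c in enumerate(costs))


raw = [4.0, 1.0, 7.0, 2.0, 6.0]
smoothed = aggregate_costs(raw, box_kernel(1))
```

    Swapping `box_kernel` for a different weight function (for example, one based on colour similarity) changes the aggregation method without touching `aggregate_costs`, which is the sense in which methods "differ in their choice of similarity kernel".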

    Low-level Vision by Consensus in a Spatial Hierarchy of Regions

    We introduce a multi-scale framework for low-level vision, where the goal is estimating physical scene values from image data, such as depth from stereo image pairs. The framework uses a dense, overlapping set of image regions at multiple scales and a "local model," such as a slanted-plane model for stereo disparity, that is expected to be valid piecewise across the visual field. Estimation is cast as optimization over a dichotomous mixture of variables, simultaneously determining which regions are inliers with respect to the local model (binary variables) and the correct co-ordinates in the local model space for each inlying region (continuous variables). When the regions are organized into a multi-scale hierarchy, optimization can occur in an efficient and parallel architecture, where distributed computational units iteratively perform calculations and share information through sparse connections between parents and children. The framework performs well on a standard benchmark for binocular stereo, and it produces a distributional scene representation that is appropriate for combining with higher-level reasoning and other low-level cues. Comment: Accepted to CVPR 2015. Project page: http://www.ttic.edu/chakrabarti/consensus

    3D RECONSTRUCTION FROM STEREO/RANGE IMAGES

    3D reconstruction from stereo/range images is one of the most fundamental and extensively researched topics in computer vision. Stereo research has recently entered something of a new era as a result of publicly available performance testing such as the Middlebury data set, which has allowed researchers to compare their algorithms against all the state-of-the-art algorithms. This thesis investigates general stereo problems in both the two-view and multi-view stereo settings. For two-view stereo, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy minimization framework. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer. A GPU implementation of the Hierarchical BP algorithm is then proposed, which provides stereo quality similar to CPU Hierarchical BP while running at real-time speed. A fast-converging BP is also proposed to solve the slow convergence problem of general BP algorithms. Besides two-view stereo, efficient multi-view stereo for large-scale urban reconstruction is carefully studied in this thesis. A novel approach is presented for computing depth maps from urban imagery in which large parts of surfaces are often weakly textured. Finally, a new post-processing step is proposed to enhance range images in both spatial resolution and depth precision.

    ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

    In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allowing the network to reach compelling results. Finally we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitative and qualitative evaluations on real and synthetic data demonstrate state-of-the-art results in many challenging scenes. Comment: Accepted by ECCV2018, Oral Presentation, Main paper + Supplementary Material

    Enhanced Image View Synthesis Using Multistage Hybrid Median Filter For Stereo Images

    Disparity depth map estimation by stereo matching algorithms is one of the most active research topics in computer vision. In the field of image processing, many existing stereo matching algorithms for obtaining disparity depth maps have low accuracy. Improving the accuracy of the disparity depth map is challenging, especially in uncontrolled dynamic environments. The accuracy is affected by many unwanted aspects, including random noise, horizontal streaks, low texture, non-edge-preserving depth maps, occlusion, and depth discontinuities. Thus, this research proposes a new robust hybrid stereo matching method with significant computational accuracy. The thesis presents in detail the development, design, and performance analysis of the Multistage Hybrid Median Filter (MHMF). The method combines two main stages. Stage 1 consists of the Sum of Absolute Differences (SAD) from the Basic Block Matching (BBM) algorithm and the Scanline Optimization (SO) part of the Dynamic Programming (DP) algorithm. Stage 2, the core of MHMF, is a post-processing step comprising segmentation, merging, and hybrid median filtering. The significant feature of the post-processing step is its ability to handle efficiently the unwanted aspects of the raw disparity depth map produced by the optimization step. To remove these unwanted aspects, the proposed MHMF applies three stages of filtering along with the approaches developed in Stage 2 of the MHMF algorithm. Two categories of evaluation are performed on the resulting disparity depth map: subjective and objective. The objective evaluation involves evaluation on the Middlebury Stereo Vision system and evaluation using traditional metrics such as Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index Metric (SSIM). Based on the results on the standard benchmarking datasets from Middlebury, the proposed algorithm is able to reduce the non-occluded and all errors. The subjective evaluation is done on datasets captured with an MV BLUE FOX camera, using human visual perception. Based on the results, the proposed MHMF is able to obtain accurate results, specifically 69% and 71% of non-occluded and all errors, respectively, for the disparity depth map, and it outperforms some existing methods in the literature such as the BBM and DP algorithms.
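
    The SAD block matching used in Stage 1 can be sketched for a single scanline as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function name, the border clamping, the winner-takes-all selection, and the toy intensity values are all chosen for this example, and the Scanline Optimization and MHMF post-processing stages are not included.

```python
def sad_disparity(left, right, max_disp, radius):
    """Winner-takes-all disparity for one scanline using SAD block matching.

    left, right -- lists of pixel intensities from a rectified stereo pair
    max_disp    -- largest disparity to test (assumed search range)
    radius      -- half-width of the matching window
    """
    n = len(left)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = 0
            for k in range(-radius, radius + 1):
                # Clamp indices at the image border (an assumption of this sketch)
                xl = min(max(x + k, 0), n - 1)
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left[xl] - right[xr])
            # Keep the first disparity with the lowest SAD cost
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Toy scanline: the left view is the right view shifted by two pixels.
right = [10, 20, 80, 30, 40, 90, 50, 60, 70, 15]
left = [10, 10, 10, 20, 80, 30, 40, 90, 50, 60]
disp = sad_disparity(left, right, max_disp=3, radius=1)
```

    Away from the borders, the recovered disparity matches the two-pixel shift used to build the toy data. Raw winner-takes-all maps of this kind are typically noisy on real images, which is the motivation the abstract gives for a post-processing stage.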