Cross-Scale Cost Aggregation for Stereo Matching

Authors: Dongbo Min, Kang Zhang, Lifeng Sun, Qi Tian, Shiqiang Yang, Shuicheng Yan, Yuqiang Fang
Publication date: 03/03/2014

Human beings process stereoscopic correspondence across multiple scales. However, this bio-inspiration is ignored by state-of-the-art cost aggregation methods for dense stereo correspondence. In this paper, a generic cross-scale cost aggregation framework is proposed to allow multi-scale interaction in cost aggregation. We first reformulate cost aggregation from a unified optimization perspective and show that different cost aggregation methods essentially differ in their choice of similarity kernel. Then, an inter-scale regularizer is introduced into the optimization, and solving this new optimization problem leads to the proposed framework. Since the regularization term is independent of the similarity kernel, various cost aggregation methods can be integrated into the proposed general framework. We show that the cross-scale framework is important, as it effectively and efficiently extends state-of-the-art cost aggregation methods and leads to significant improvements when evaluated on the Middlebury, KITTI and New Tsukuba datasets. Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014 (poster, 29.88%)
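As a rough illustration of what "cost aggregation with a similarity kernel" means in the abstract above, the sketch below averages a matching-cost volume with a uniform (box) kernel and then picks the cheapest disparity per pixel. This is only the simplest single-scale case under assumed conventions; it is not the cross-scale framework from the paper and contains no inter-scale regularizer. The function names `box_aggregate` and `winner_take_all` and the toy cost volume are hypothetical.

```python
import numpy as np


def box_aggregate(cost, radius):
    """Aggregate an (H, W, D) cost volume with a uniform (box) kernel.

    Each output cost is the mean of the raw costs in a
    (2 * radius + 1) x (2 * radius + 1) window at the same disparity.
    """
    h, w, _ = cost.shape
    size = 2 * radius + 1
    # Replicate border costs so every pixel has a full window.
    padded = np.pad(cost.astype(float),
                    ((radius, radius), (radius, radius), (0, 0)),
                    mode="edge")
    total = np.zeros(cost.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w, :]
    return total / (size * size)


def winner_take_all(cost):
    """Pick, per pixel, the disparity index with the lowest cost."""
    return np.argmin(cost, axis=2)


# Toy volume (hypothetical data): disparity level 2 is cheapest everywhere
# except for one noisy cost at the centre pixel.
cost = np.ones((5, 5, 4))
cost[:, :, 2] = 0.2
cost[2, 2, 2] = 5.0
raw = winner_take_all(cost)
smoothed = winner_take_all(box_aggregate(cost, radius=1))
```

In this toy case the raw winner-take-all result is wrong at the noisy centre pixel, while the aggregated volume recovers level 2 there. Swapping the uniform weights for edge-aware weights would give a different aggregation method under the same averaging scheme, which appears to be the sense in which the abstract says methods differ in their similarity kernels.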

arXiv.org e-Print Archive

CiteSeerX

Crossref

Low-level Vision by Consensus in a Spatial Hierarchy of Regions

Authors: Ayan Chakrabarti, Steven J. Gortler, Ying Xiong, Todd Zickler
Publication date: 14/04/2015

We introduce a multi-scale framework for low-level vision, where the goal is estimating physical scene values from image data, such as depth from stereo image pairs. The framework uses a dense, overlapping set of image regions at multiple scales and a "local model," such as a slanted-plane model for stereo disparity, that is expected to be valid piecewise across the visual field. Estimation is cast as optimization over a dichotomous mixture of variables, simultaneously determining which regions are inliers with respect to the local model (binary variables) and the correct coordinates in the local model space for each inlying region (continuous variables). When the regions are organized into a multi-scale hierarchy, optimization can occur in an efficient and parallel architecture, where distributed computational units iteratively perform calculations and share information through sparse connections between parents and children. The framework performs well on a standard benchmark for binocular stereo, and it produces a distributional scene representation that is appropriate for combining with higher-level reasoning and other low-level cues. Comment: Accepted to CVPR 2015. Project page: http://www.ttic.edu/chakrabarti/consensus

arXiv.org e-Print Archive

Crossref

3D RECONSTRUCTION FROM STEREO/RANGE IMAGES

Author: Qingxiong Yang
Publication venue: UKnowledge
Publication date: 01/01/2007

3D reconstruction from stereo/range images is one of the most fundamental and extensively researched topics in computer vision. Stereo research has recently entered something of a new era as a result of publicly available performance testing such as the Middlebury data set, which has allowed researchers to compare their algorithms against all the state-of-the-art algorithms. This thesis investigates general stereo problems in both the two-view and multi-view stereo scopes. In the two-view stereo scope, we formulate an algorithm for the stereo matching problem with careful handling of disparity, discontinuity and occlusion. The algorithm works with a global matching stereo model based on an energy minimization framework. The experimental results are evaluated on the Middlebury data set, showing that our algorithm is the top performer. A GPU implementation of the Hierarchical BP algorithm is then proposed, which provides stereo quality similar to CPU Hierarchical BP while running at real-time speed. A fast-converging BP is also proposed to solve the slow convergence problem of general BP algorithms. Besides two-view stereo, efficient multi-view stereo for large-scale urban reconstruction is carefully studied in this thesis. A novel approach is presented for computing depth maps from urban imagery in which large parts of surfaces are often weakly textured. Finally, a new post-processing step is proposed to enhance range images in both spatial resolution and depth precision.

University of Kentucky

ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

Authors: A Bhandari, A Foi, A Hosni, D Scharstein, F Besse, H Hirschmuller, H Zhao, J Kowalczuk, J Xie, J Zbontar, KJ Yoon, Mingsong Dou, PF Felzenszwalb, R Garg, R Szeliski, RA Hamzah, SR Fanello
Publication date: 01/01/2018

In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allowing the network to reach compelling results. Finally, we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitative and qualitative evaluations on real and synthetic data demonstrate state-of-the-art results in many challenging scenes. Comment: Accepted by ECCV2018, Oral Presentation, Main paper + Supplementary Material
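To give a feel for what a subpixel precision of 1/30th of a pixel could mean in metric terms, the following sketch applies the standard rectified-stereo triangulation relation z = f * b / d and its first-order error propagation. The focal length and baseline are hypothetical values chosen for illustration only; they are not taken from the paper, and the function names are assumptions.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulated depth in metres for a rectified stereo pair."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth error: |dz| ~= z^2 / (f * b) * |dd|."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Hypothetical camera parameters, not values from the paper.
FOCAL_PX = 1000.0
BASELINE_M = 0.05
DISPARITY_ERROR_PX = 1.0 / 30.0

for depth_m in (1.0, 2.0, 4.0):
    err_mm = 1000.0 * depth_error(FOCAL_PX, BASELINE_M, depth_m,
                                  DISPARITY_ERROR_PX)
    print(f"depth {depth_m:.1f} m -> depth error about {err_mm:.2f} mm")
```

Because the error grows with the square of depth, the same disparity precision corresponds to a much coarser depth precision for distant surfaces than for near ones.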

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Enhanced Image View Synthesis Using Multistage Hybrid Median Filter For Stereo Images

Author: Ali Hussein Aboali Maged Yahya
Publication date: 01/01/2018

Disparity depth map estimation by stereo matching is one of the most active research topics in computer vision. In the field of image processing, many existing stereo matching algorithms for obtaining a disparity depth map have low accuracy. Improving the accuracy of the disparity depth map is challenging, especially in uncontrolled dynamic environments. The accuracy is affected by many unwanted aspects, including random noise, horizontal streaks, low texture, non-edge-preserving depth maps, occlusion, and depth discontinuities. This research therefore proposes a new, robust hybrid stereo matching algorithm with high accuracy. The thesis presents in detail the development, design, and performance analysis of the Multistage Hybrid Median Filter (MHMF). The method combines two main parts in two stages. Stage 1 consists of the Sum of Absolute Differences (SAD) from the Basic Block Matching (BBM) algorithm and the Scanline Optimization (SO) part of the Dynamic Programming (DP) algorithm. Stage 2 is the core of MHMF, a post-processing step that includes segmentation, merging, and hybrid median filtering. The significant feature of the post-processing step is its ability to efficiently handle the unwanted aspects of the raw disparity depth map produced by the optimization step. To remove these unwanted aspects, the proposed MHMF applies three stages of filtering along with the approaches developed in Stage 2 of the MHMF algorithm. Two categories of evaluation are performed on the resulting disparity depth map: subjective and objective. The objective evaluation uses the Middlebury Stereo Vision system and traditional metrics such as Mean Square Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity Index Metric (SSIM). Based on the results on the standard benchmarking datasets from Middlebury, the proposed algorithm is able to reduce the non-occluded and all errors. The subjective evaluation is done on datasets captured with an MV BLUE FOX camera, using human visual perception. Based on the results, the proposed MHMF is able to obtain accurate results, specifically 69% and 71% of non-occluded and all errors for the disparity depth map, and it outperforms some of the existing methods in the literature, such as the BBM and DP algorithms.
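The Stage 1 ingredient named in the abstract above, SAD block matching, can be sketched as follows. This is a minimal illustration under assumed conventions (rectified grayscale images, a left pixel at column x matched against the right pixel at column x - d); it omits scanline optimization and the whole MHMF post-processing stage, and it is not the thesis implementation. The function name `sad_block_matching` and its parameters are hypothetical.

```python
import numpy as np


def sad_block_matching(left, right, max_disp, radius):
    """Return an integer disparity map for a rectified image pair.

    For each disparity d in 0 .. max_disp, the absolute difference
    between left[:, x] and right[:, x - d] is summed over a
    (2 * radius + 1)^2 window; the cheapest d wins per pixel.
    """
    left = left.astype(float)
    right = right.astype(float)
    h, w = left.shape
    size = 2 * radius + 1
    cost = np.empty((h, w, max_disp + 1))
    for d in range(max_disp + 1):
        # Columns x < d have no counterpart in the right image.
        diff = np.full((h, w), np.inf)
        diff[:, d:] = np.abs(left[:, d:] - right[:, :w - d])
        # Replicate borders so every pixel has a full window.
        padded = np.pad(diff, radius, mode="edge")
        sad = np.zeros((h, w))
        for dy in range(size):
            for dx in range(size):
                sad += padded[dy:dy + h, dx:dx + w]
        cost[:, :, d] = sad
    return np.argmin(cost, axis=2)


# Synthetic pair (hypothetical data): the left view is the right view
# shifted by 3 pixels, so the expected disparity is 3 away from the border.
rng = np.random.default_rng(0)
right_img = rng.integers(0, 256, size=(20, 40))
left_img = np.roll(right_img, 3, axis=1)
disparity = sad_block_matching(left_img, right_img, max_disp=6, radius=2)
```

On real images a raw map like this typically shows the noise and streak artifacts the abstract lists, which is the motivation it gives for a filtering stage afterwards.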

Universiti Teknikal Malaysia Melaka (UTeM) Repository