2,073 research outputs found

    ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems

    In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allowing the network to reach compelling results. Finally, we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitative and qualitative evaluations on real and synthetic data demonstrate state-of-the-art results in many challenging scenes. Comment: Accepted by ECCV2018, Oral Presentation, Main paper + Supplementary Material

    Reliable fusion of ToF and stereo depth driven by confidence measures

    In this paper we propose a framework for the fusion of depth data produced by a Time-of-Flight (ToF) camera and a stereo vision system. Initially, depth data acquired by the ToF camera are upsampled by an ad-hoc algorithm based on image segmentation and bilateral filtering. In parallel, a dense disparity map is obtained using the Semi-Global Matching stereo algorithm. Reliable confidence measures are extracted for both the ToF and stereo depth data. In particular, the ToF confidence also accounts for the mixed-pixel effect, and the stereo confidence accounts for the relationship between the pointwise matching costs and the cost obtained by the semi-global optimization. Finally, the two depth maps are synergically fused by enforcing the local consistency of depth data, accounting for the confidence of the two data sources at each location. Experimental results clearly show that the proposed method produces accurate high-resolution depth maps and outperforms the compared fusion algorithms.
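
    To illustrate the idea of weighting two depth sources by per-location confidence, here is a minimal sketch. It uses a plain confidence-weighted average, which is a much simpler stand-in and not the local-consistency fusion the abstract describes. The function name `fuse_depth`, its parameters, and the flat-list data layout are assumptions made for illustration only.

```python
from typing import List, Optional


def fuse_depth(
    tof_depth: List[float],
    tof_conf: List[float],
    stereo_depth: List[float],
    stereo_conf: List[float],
    eps: float = 1e-9,
) -> List[Optional[float]]:
    """Fuse two depth maps (flattened to lists) by confidence-weighted averaging.

    Hypothetical helper: a simplified stand-in, not the paper's method.
    Confidences are assumed to be non-negative weights, e.g. in [0, 1].
    """
    fused: List[Optional[float]] = []
    for zt, ct, zs, cs in zip(tof_depth, tof_conf, stereo_depth, stereo_conf):
        total = ct + cs
        if total < eps:
            # Neither source is trusted at this location: mark it invalid.
            fused.append(None)
        else:
            # The more confident source pulls the fused value towards itself.
            fused.append((ct * zt + cs * zs) / total)
    return fused
```

    With equal confidences the result is the plain mean of the two depths; when one confidence is zero, the other source is used unchanged.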

    Precision in 3-D Points Reconstructed From Stereo

    We characterize the precision of a 3-D reconstruction from stereo: we derive confidence intervals for the components (X,Y,Z) of the reconstructed 3-D points. The precision assessment can be used in data rejection, data reduction, and data fusion of the 3-D points. Also, based on the confidence intervals, a bad or failing stereo camera pair can be detected and discarded from a polynocular stereo system. Experimentally, we have evaluated the performance of the confidence intervals for Z in terms of empirical capture frequencies vs. theoretical probability of capture for a ground-truth test scene. We have tested the interval estimation procedure on more complex scenes (for example, human faces), but since we do not have ground truth models, we have evaluated the performance in such cases only qualitatively. Currently we are developing ground truth models for more complex (such as general indoor) scenes, and will evaluate quantitatively the performance of the confidence intervals for the depth of the reconstructed points in the automatic rejection of 3-D points which have a high degree of uncertainty.
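
    As a rough illustration of what a confidence interval for Z can look like, the sketch below uses the textbook first-order error propagation for a rectified stereo pair, where Z = f·B/d. It is not the derivation from the paper. It assumes Gaussian disparity noise, and the function name `depth_interval` and all of its parameters are hypothetical.

```python
from typing import Tuple


def depth_interval(
    focal_px: float,
    baseline_m: float,
    disparity_px: float,
    sigma_disp_px: float,
    z_score: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (depth, lower, upper) in metres for a rectified stereo pair.

    Assumption: disparity noise is Gaussian with standard deviation
    sigma_disp_px, and first-order propagation is accurate enough.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    depth = focal_px * baseline_m / disparity_px
    # |dZ/dd| = f*B / d^2 = Z^2 / (f*B), so depth error grows with Z squared.
    sigma_depth = depth ** 2 / (focal_px * baseline_m) * sigma_disp_px
    return depth, depth - z_score * sigma_depth, depth + z_score * sigma_depth
```

    Because the standard deviation scales with the square of the depth, distant points get much wider intervals, which is one reason such intervals are useful for rejecting uncertain 3-D points.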

    Positioning System for a Hand-Held Mine Detector

    Humanitarian mine clearance aims at reducing the nuisance of regions infested by explosive devices. These devices need to be detected with a high rate of success while keeping a low false alarm rate to reduce time losses and personnel's fatigue. This chapter describes a positioning system developed to track hand-held detector movements in the context of close-range mine detection. With such a system, the signals captured by the detector over time can be used to build two- or three-dimensional data. The objects possibly present in the data can then be visually assessed by an operator to detect specific features such as shape, size, or known signatures. The positioning system developed in the framework of the HOPE European project requires only a camera and an extra bar. It adds few constraints to current mine clearance procedures and requires limited additional hardware. The software developed for calibration and continuous acquisition of the position is described, and evaluation results are presented.