60 research outputs found

    Better Stereo Matching From Simple Yet Effective Wrangling of Deep Features

    The cost volume plays a pivotal role in stereo matching. Most recent works focus on deep feature extraction and cost refinement to obtain a more accurate cost volume. Unlike them, we approach the problem from a different perspective: feature wrangling. We find that simple wrangling of deep features can effectively improve the construction of the cost volume and thus the performance of stereo matching. Specifically, we develop two simple yet effective wrangling techniques for deep features, a spatial, differentiable feature transformation and a channel-wise, memory-economical feature expansion, for better cost construction. Exploiting the local ordering information provided by a differentiable rank transform, we enhance the search for correspondences; with the help of disparity division, our feature expansion allows more features into the cost volume with no extra memory required. Equipped with these two feature wrangling techniques, our simple network performs outstandingly on the widely used KITTI and Sceneflow datasets.
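
The abstract above does not spell out its differentiable rank transform, so the following is only a minimal sketch of the general idea. It assumes the classic rank transform (count the neighbours darker than the centre pixel) and relaxes the hard comparison with a sigmoid. The function names, the `temperature` parameter, and the sigmoid relaxation are illustrative assumptions, not the authors' formulation.

```python
import math

def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def _rank(img, radius, step):
    # For every pixel, sum step(centre - neighbour) over its window,
    # skipping the centre itself and neighbours outside the image.
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                        total += step(img[y][x] - img[ny][nx])
            out[y][x] = total
    return out

def hard_rank(img, radius=1):
    # Classic rank transform: number of neighbours strictly darker than the centre.
    return _rank(img, radius, lambda d: 1.0 if d > 0 else 0.0)

def soft_rank(img, radius=1, temperature=0.1):
    # Assumed relaxation: the step function is replaced by a sigmoid,
    # so the output varies smoothly with the input intensities.
    return _rank(img, radius, lambda d: _sigmoid(d / temperature))

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(hard_rank(img)[1][1])
print(soft_rank(img, temperature=0.01)[1][1])
```

As the temperature shrinks, the soft version approaches the hard count; a larger temperature trades that sharpness for smoother gradients.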

    Disparity Estimation with Scene Depth Cues

    The cost volume plays a pivotal role in stereo matching, usually serving as the object of optimization. However, we find that it can also provide an effective scene prior to guide disparity learning, as it reflects well the depth relationships between objects in the scene. Inspired by this new perspective, we propose the CSA module, which consists of a new correlation and selection (CS) layer and a new aggregation layer. The CS layer regulates the matching costs and re-encodes the feature information into the correlation volume. The aggregation layer better preserves the depth cues of the refined cost volume through a convolution network and a unimodalization operation. The proposed module can be trained in a supervised manner, making the extraction of scene depth cues more accurate. Extensive experiments on the Sceneflow and KITTI datasets demonstrate that, with our module embedded, SOTA networks achieve substantially better performance.
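
For readers unfamiliar with the correlation volume mentioned above, here is a minimal sketch of a plain correlation cost volume along a single scanline, followed by a winner-take-all readout. It does not implement the CS layer, the aggregation layer, or the unimodalization operation, none of which are specified in the abstract; the function names and the channel-mean normalisation are assumptions made for illustration.

```python
def correlation_volume(left, right, max_disp):
    # left, right: lists of feature vectors, one per pixel of a scanline.
    # vol[x][d] is the channel-averaged dot product of left[x] and right[x - d];
    # candidates that fall outside the right scanline get -inf.
    channels = len(left[0])
    vol = []
    for x in range(len(left)):
        row = []
        for d in range(max_disp):
            if x - d < 0:
                row.append(float("-inf"))
            else:
                dot = sum(a * b for a, b in zip(left[x], right[x - d]))
                row.append(dot / channels)
        vol.append(row)
    return vol

def winner_take_all(vol):
    # Pick, for each pixel, the disparity with the highest correlation.
    return [max(range(len(row)), key=row.__getitem__) for row in vol]

# Toy scanline: every right pixel has a unique one-hot feature and the
# left scanline is the right one shifted by one pixel (true disparity 1).
right = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
left = [[0, 0, 0, 0]] + right[:3]
vol = correlation_volume(left, right, max_disp=3)
print(winner_take_all(vol))
```

Learned stereo networks typically replace the hard winner-take-all step with a differentiable readout, but the volume layout (pixel by disparity candidate) is the same.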

    Real-time self-adaptive deep stereo

    Deep convolutional neural networks trained end-to-end are the state-of-the-art methods to regress dense disparity maps from stereo pairs. These models, however, suffer from a notable decrease in accuracy when exposed to scenarios significantly different from the training set (e.g., real vs. synthetic images). We argue that it is extremely unlikely to gather enough samples to achieve effective training/tuning in any target domain, thus making this setup impractical for many applications. Instead, we propose to perform unsupervised and continuous online adaptation of a deep stereo network, which allows for preserving its accuracy in any environment. However, this strategy is extremely computationally demanding and thus prevents real-time inference. We address this issue by introducing a new lightweight, yet effective, deep stereo architecture, Modularly ADaptive Network (MADNet), and developing a Modular ADaptation (MAD) algorithm, which independently trains sub-portions of the network. By deploying MADNet together with MAD we introduce the first real-time self-adaptive deep stereo system enabling competitive performance on heterogeneous datasets. Comment: Accepted at CVPR2019 as oral presentation. Code available: https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stere
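
The idea of training sub-portions of a network independently can be illustrated with a toy update step that touches only one module per iteration. This is a hedged sketch, not the MAD algorithm itself: the abstract does not say how the module is chosen, so the round-robin selection, the plain gradient-descent update, and the names `modular_update_step`, `block_a`, and `block_b` are all hypothetical.

```python
def modular_update_step(params, grads, step_index, lr=0.1):
    # params, grads: dicts mapping a module name to a list of floats.
    # Assumption: modules are visited round-robin in sorted name order.
    names = sorted(params)
    chosen = names[step_index % len(names)]
    # Only the chosen module is updated; every other module is left untouched,
    # which is what keeps the per-step cost low.
    params[chosen] = [p - lr * g for p, g in zip(params[chosen], grads[chosen])]
    return chosen

params = {"block_a": [1.0, 2.0], "block_b": [3.0]}
grads = {"block_a": [1.0, 1.0], "block_b": [1.0]}
for step in range(2):
    updated = modular_update_step(params, grads, step, lr=0.5)
    print(step, updated, params)
```

In a real network the gradients would come from an unsupervised loss computed on the incoming stereo pair, and only the selected module would need a backward pass.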

    MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching

    Stereo matching is a fundamental task in scene comprehension. In recent years, methods based on iterative optimization have shown promise in stereo matching. However, the current iterative framework employs a single-peak lookup, which struggles to handle the multi-peak problem effectively. Additionally, the fixed search range used during the iteration process limits final convergence. To address these issues, we present a novel iterative optimization architecture called MC-Stereo. This architecture mitigates the multi-peak distribution problem in matching through a multi-peak lookup strategy, and integrates the coarse-to-fine concept into the iterative framework via a cascade search range. Furthermore, given that feature representation learning is crucial for successful learning-based stereo matching, we introduce a pre-trained network to serve as the feature extractor, enhancing the front end of the stereo matching pipeline. Based on these improvements, MC-Stereo ranks first among all publicly available methods on the KITTI-2012 and KITTI-2015 benchmarks, and also achieves state-of-the-art performance on ETH3D. Code is available at https://github.com/MiaoJieF/MC-Stereo. Comment: Accepted to 3DV 202
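
To make the contrast between single-peak and multi-peak lookup concrete, the sketch below finds the strongest local maxima of a one-dimensional correlation profile and gathers a small window of values around each. It is an illustration under stated assumptions rather than the MC-Stereo implementation: the peak definition, the zero padding at the borders, and the names `top_k_peaks` and `multi_peak_lookup` are all hypothetical.

```python
def top_k_peaks(corr, k):
    # Indices of up to k local maxima of corr, strongest first.
    peaks = []
    for i, v in enumerate(corr):
        left = corr[i - 1] if i > 0 else float("-inf")
        right = corr[i + 1] if i + 1 < len(corr) else float("-inf")
        if v > left and v >= right:
            peaks.append(i)
    peaks.sort(key=lambda i: corr[i], reverse=True)
    return peaks[:k]

def multi_peak_lookup(corr, k, radius):
    # For each selected peak, sample correlation values within +/- radius;
    # positions outside the profile are padded with 0.0.
    windows = []
    for p in top_k_peaks(corr, k):
        windows.append([corr[j] if 0 <= j < len(corr) else 0.0
                        for j in range(p - radius, p + radius + 1)])
    return windows

# Correlation over six disparity candidates with two competing peaks.
corr = [0.1, 0.9, 0.2, 0.1, 0.7, 0.3]
print(top_k_peaks(corr, 2))
print(multi_peak_lookup(corr, 2, radius=1))
```

With `k=1` this reduces to a single-peak lookup; shrinking `radius` over successive iterations would be one simple way to mimic a coarse-to-fine search range.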