Guided Stereo Matching
Stereo is a prominent technique to infer dense depth maps from images, and
deep learning further pushed forward the state-of-the-art, making end-to-end
architectures unrivaled when enough data is available for training. However,
deep networks suffer from significant drops in accuracy when dealing with new
environments. Therefore, in this paper, we introduce Guided Stereo Matching, a
novel paradigm that leverages a small number of sparse yet reliable depth
measurements retrieved from an external source to mitigate this
weakness. The additional sparse cues required by our method can be obtained
with any strategy (e.g., a LiDAR) and used to enhance features linked to
corresponding disparity hypotheses. Our formulation is general and fully
differentiable, so the additional sparse inputs can be exploited both in
pre-trained deep stereo networks and when training a new instance from
scratch. Extensive experiments on three standard datasets and two
state-of-the-art deep architectures show that even with a small set of sparse
input cues, i) the proposed paradigm enables significant improvements to
pre-trained networks. Moreover, ii) training from scratch notably increases
accuracy and robustness to domain shifts. Finally, iii) it is also effective
with traditional stereo algorithms such as SGM.
Comment: CVPR 201
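The idea of enhancing features linked to the disparity hypotheses that agree with a sparse measurement can be illustrated with a minimal sketch. The abstract does not state the enhancement function, so the Gaussian weighting below, its parameters `k` and `c`, and the names `guide_pixel` and `guide_volume` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def guide_pixel(features, hint, k=10.0, c=1.0):
    """Modulate one pixel's per-disparity feature scores with a sparse hint.

    features: list of scores; index d corresponds to disparity hypothesis d.
    hint: measured disparity at this pixel, or None when no measurement exists.
    k, c: height and width of the Gaussian (assumed, illustrative defaults).
    """
    if hint is None:
        # No external measurement at this pixel: leave the features untouched.
        return list(features)
    # Amplify hypotheses near the hint and dampen those far from it.
    return [
        f * k * math.exp(-((d - hint) ** 2) / (2.0 * c ** 2))
        for d, f in enumerate(features)
    ]

def guide_volume(volume, hints):
    """Apply guide_pixel to every pixel of a {pixel: features} volume."""
    return {p: guide_pixel(f, hints.get(p)) for p, f in volume.items()}

# Toy volume with two pixels and four disparity hypotheses each.
volume = {(0, 0): [0.2, 0.5, 0.4, 0.1], (0, 1): [0.3, 0.3, 0.6, 0.2]}
hints = {(0, 0): 2.0}  # only one pixel carries a sparse depth cue

guided = guide_volume(volume, hints)
# Winner-take-all disparity per pixel after guidance.
best = {p: max(range(len(f)), key=f.__getitem__) for p, f in guided.items()}
```

In this toy case the hinted pixel switches its winning hypothesis from disparity 1 to disparity 2, while the pixel without a hint is unchanged. Because the weighting is a smooth function of the features, a scheme of this kind stays differentiable.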
Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation
We present an approach for encoding visual task relationships to improve
model performance in an Unsupervised Domain Adaptation (UDA) setting. Semantic
segmentation and monocular depth estimation are shown to be complementary
tasks; in a multi-task learning setting, a proper encoding of their
relationships can further improve performance on both tasks. Motivated by this
observation, we propose a novel Cross-Task Relation Layer (CTRL), which encodes
task dependencies between the semantic and depth predictions. To capture the
cross-task relationships, we propose a neural network architecture that
contains task-specific and cross-task refinement heads. Furthermore, we propose
an Iterative Self-Learning (ISL) training scheme, which exploits semantic
pseudo-labels to provide extra supervision on the target domain. We
experimentally observe improvements in both tasks' performance because the
complementary information present in these tasks is better captured.
Specifically, we show that: (1) our approach improves performance on all tasks
when they are complementary and mutually dependent; (2) the CTRL helps to
improve the performance of both semantic segmentation and depth estimation in
the challenging UDA setting; (3) the proposed ISL training scheme further
improves the semantic segmentation performance. The implementation is available
at https://github.com/susaha/ctrl-uda.
Comment: Accepted at CVPR 2021; updated results according to the released
source cod
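The use of semantic pseudo-labels as extra supervision on the target domain can be sketched as follows. The abstract does not say how pseudo-labels are selected, so the confidence threshold, the `IGNORE` label id, and the function name `pseudo_labels` are assumptions that follow a common self-training recipe rather than the paper's ISL scheme.

```python
IGNORE = 255  # hypothetical label id meaning "no supervision for this pixel"

def pseudo_labels(probs, threshold=0.9):
    """Turn per-pixel class probabilities into pseudo-labels.

    probs: list of per-pixel probability lists predicted on target images.
    threshold: minimum confidence to accept a prediction (assumed value).
    """
    labels = []
    for p in probs:
        # Most likely class for this pixel.
        cls = max(range(len(p)), key=p.__getitem__)
        # Keep only confident predictions; mask out the rest.
        labels.append(cls if p[cls] >= threshold else IGNORE)
    return labels

# Three target-domain pixels with three-class predictions.
probs = [[0.95, 0.03, 0.02], [0.40, 0.35, 0.25], [0.05, 0.92, 0.03]]
labels = pseudo_labels(probs)
```

In an iterative scheme, labels produced this way would be used to retrain the model on target images, and the procedure would then be repeated with the updated predictions.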
On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey
Stereo matching is one of the longest-standing problems in computer vision
with close to 40 years of studies and research. Throughout the years the
paradigm has shifted from local, pixel-level decisions to various forms of
discrete and continuous optimization to data-driven, learning-based methods.
Recently, the rise of machine learning and the rapid proliferation of deep
learning enhanced stereo matching with new exciting trends and applications
unthinkable until a few years ago. Interestingly, the relationship between
these two worlds is two-way. While machine, and especially deep, learning
advanced the state-of-the-art in stereo matching, stereo itself enabled new
ground-breaking methodologies such as self-supervised monocular depth
estimation based on deep networks. In this paper, we review recent research in
the field of learning-based depth estimation from single and binocular images
highlighting the synergies, the successes achieved so far and the open
challenges the community is going to face in the immediate future.
Comment: Accepted to TPAMI. Paper version of our CVPR 2019 tutorial:
"Learning-based depth estimation from stereo and monocular images: successes,
limitations and future challenges"
(https://sites.google.com/view/cvpr-2019-depth-from-image/home
- …