3 research outputs found
Self-Supervised Learning for Stereo Reconstruction on Aerial Images
Recent developments have established deep learning as an indispensable tool to boost
the performance of dense matching and stereo estimation. On the downside,
learning these networks requires a substantial amount of training data to be
successful. Consequently, the application of these models outside of the
laboratory is far from straightforward. In this work we propose a
self-supervised training procedure that allows us to adapt our network to the
specific (imaging) characteristics of the dataset at hand, without the
requirement of external ground truth data. We instead generate interim training
data by running our intermediate network on the whole dataset, followed by
conservative outlier filtering. Bootstrapped from a pre-trained version of our
hybrid CNN-CRF model, we alternate the generation of training data and network
training. With this simple concept we are able to lift the completeness and
accuracy of the pre-trained version significantly. We also show that our final
model compares favorably to other popular stereo estimation algorithms on an
aerial dataset.
Comment: Symposium Prize Paper Award @IGARSS 201
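The alternation described in the abstract above (generate interim training data with the current network, filter it conservatively, retrain, repeat) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say which outlier filter is used, so a left-right consistency check on one scanline of a rectified pair stands in for it here. The names `left_right_filter`, `bootstrap`, `predict`, `train` and the tolerance `max_diff` are all hypothetical.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def left_right_filter(disp_left: Sequence[float],
                      disp_right: Sequence[float],
                      max_diff: float = 0.5) -> List[Optional[float]]:
    """Keep a left-view disparity only where the right view agrees with it.

    Assumption: this consistency check is a stand-in for the unspecified
    "conservative outlier filtering" mentioned in the abstract.
    """
    kept: List[Optional[float]] = []
    for x, d in enumerate(disp_left):
        xr = x - int(round(d))  # column of the matching pixel in the right view
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= max_diff:
            kept.append(d)
        else:
            kept.append(None)  # rejected: left out of the interim training data
    return kept


def bootstrap(model: Any,
              pairs: Sequence[Tuple[Any, Any]],
              predict: Callable[[Any, Any, Any], Tuple[Sequence[float], Sequence[float]]],
              train: Callable[[Any, Sequence[Tuple[Any, Any]], list], Any],
              rounds: int = 3) -> Any:
    """Alternate pseudo-label generation and training for a fixed number of rounds.

    `predict` and `train` are hypothetical callables supplied by the caller.
    """
    for _ in range(rounds):
        labels = []
        for left, right in pairs:
            disp_l, disp_r = predict(model, left, right)
            labels.append(left_right_filter(disp_l, disp_r))
        # Retrain on the filtered predictions of the current model.
        model = train(model, pairs, labels)
    return model
```

Rejected pixels are marked `None` rather than dropped, so the filtered labels stay aligned with the image columns and a training loss can simply skip them.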
Self-Supervised Learning for Monocular Depth Estimation from Aerial Imagery
Supervised learning based methods for monocular depth estimation usually require large amounts of extensively annotated training data. In the case of aerial imagery, this ground truth is particularly difficult to acquire. Therefore, in this paper, we present a method for self-supervised learning for monocular depth estimation from aerial imagery that does not require annotated training data. For this, we only use an image sequence from a single moving camera and learn to simultaneously estimate depth and pose information. By sharing the weights between pose and depth estimation, we achieve a relatively small model, which favors real-time application. We evaluate our approach on three diverse datasets and compare the results to conventional methods that estimate depth maps based on multi-view geometry. We achieve an accuracy δ1.25 of up to 93.5 %. In addition, we have paid particular attention to the generalization of a trained model to unknown data and the self-improving capabilities of our approach. We conclude that, even though the results of monocular depth estimation are inferior to those achieved by conventional methods, they are well suited to provide a good initialization for methods that rely on image matching or to provide estimates in regions where image matching fails, e.g. occluded or texture-less regions.
Self-Supervised Learning for Monocular Depth Estimation from Aerial Imagery
Supervised learning based methods for monocular depth estimation usually
require large amounts of extensively annotated training data. In the case of
aerial imagery, this ground truth is particularly difficult to acquire.
Therefore, in this paper, we present a method for self-supervised learning for
monocular depth estimation from aerial imagery that does not require annotated
training data. For this, we only use an image sequence from a single moving
camera and learn to simultaneously estimate depth and pose information. By
sharing the weights between pose and depth estimation, we achieve a relatively
small model, which favors real-time application. We evaluate our approach on
three diverse datasets and compare the results to conventional methods that
estimate depth maps based on multi-view geometry. We achieve an accuracy
δ1.25 of up to 93.5 %. In addition, we have paid particular attention to
the generalization of a trained model to unknown data and the self-improving
capabilities of our approach. We conclude that, even though the results of
monocular depth estimation are inferior to those achieved by conventional
methods, they are well suited to provide a good initialization for methods that
rely on image matching or to provide estimates in regions where image matching
fails, e.g. occluded or texture-less regions.
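The accuracy figure quoted in the abstract above is presumably the threshold accuracy commonly reported in depth estimation: the fraction of pixels whose predicted depth is within a factor of 1.25 of the reference depth. The abstract does not spell out the definition, so the sketch below assumes that common form; the function name `threshold_accuracy` and the handling of invalid pixels are illustrative choices, not taken from the paper.

```python
from typing import Sequence


def threshold_accuracy(pred: Sequence[float],
                       ref: Sequence[float],
                       threshold: float = 1.25) -> float:
    """Fraction of valid pixels with max(pred/ref, ref/pred) < threshold."""
    # Assumption: non-positive depths mark invalid pixels and are skipped.
    valid = [(p, r) for p, r in zip(pred, ref) if p > 0 and r > 0]
    if not valid:
        raise ValueError("no valid pixels to evaluate")
    hits = sum(1 for p, r in valid if max(p / r, r / p) < threshold)
    return hits / len(valid)


# Three of the four predictions lie within a factor of 1.25 of the reference.
score = threshold_accuracy([10.0, 12.0, 20.0, 9.0], [10.0, 10.0, 10.0, 10.0])
```

Taking the maximum of both ratios makes the measure symmetric, so over- and under-estimation by the same factor are penalized equally.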