Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos
Learning to predict scene depth from RGB inputs is a challenging task both
for indoor and outdoor robot navigation. In this work we address unsupervised
learning of scene depth and robot ego-motion where supervision is provided by
monocular videos, as cameras are the cheapest, least restrictive and most
ubiquitous sensor for robotics.
Previous work in unsupervised image-to-depth learning has established strong
baselines in the domain. We propose a novel approach which produces higher
quality results, is able to model moving objects and is shown to transfer
across data domains, e.g. from outdoors to indoor scenes. The main idea is to
introduce geometric structure in the learning process, by modeling the scene
and the individual objects; camera ego-motion and object motions are learned
from monocular videos as input. Furthermore, an online refinement method is
introduced to adapt learning on the fly to unknown domains.
The proposed approach outperforms all state-of-the-art approaches, including
those that handle motion, e.g. through learned flow. Our results are comparable
in quality to those of methods that use stereo as supervision, and significantly
improve depth prediction on scenes and datasets which contain a lot of object
motion. The approach is of practical relevance, as it allows transfer across
environments, by transferring models trained on data collected for robot
navigation in urban scenes to indoor navigation settings. The code associated
with this paper can be found at https://sites.google.com/view/struct2depth.
Comment: Thirty-Third AAAI Conference on Artificial Intelligence (AAAI'19)
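To illustrate the kind of geometric structure that learning depth and ego-motion from monocular video relies on, the following minimal sketch reprojects a pixel from one frame into the next using a depth value and a camera motion. It is not the authors' code: the function name `reproject_pixel`, the pinhole intrinsics parameters, and the restriction to a rotation-free motion are all assumptions made for illustration.

```python
def reproject_pixel(u, v, depth, fx, fy, cx, cy, translation):
    """Project pixel (u, v) with a given depth into a translated camera.

    Assumes a pinhole camera and a rotation-free ego-motion (illustrative only).
    `translation` maps point coordinates from the source to the target frame.
    """
    # Back-project the pixel to a 3D point in the source camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    z = depth
    # Express the point in the target camera frame (translation only).
    tx, ty, tz = translation
    x, y, z = x + tx, y + ty, z + tz
    if z <= 0:
        raise ValueError("point falls behind the target camera")
    # Project the point back onto the target image plane.
    return fx * x / z + cx, fy * y / z + cy
```

In an unsupervised setup of this kind, the image sampled at the reprojected location would typically be compared with the original pixel, so that errors in the predicted depth or motion show up as a photometric difference.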
Occlusion Aware Unsupervised Learning of Optical Flow
It has been recently shown that a convolutional neural network can learn
optical flow estimation with unsupervised learning. However, a relatively
large performance gap remains between unsupervised methods and their
supervised counterparts. Occlusion and large motion are among the major
factors that limit current unsupervised optical flow methods.
In this work we introduce a new method which models occlusion explicitly and a
new warping scheme that facilitates the learning of large motion. Our method shows
promising results on the Flying Chairs, MPI-Sintel and KITTI benchmark datasets.
In particular, on the KITTI dataset, where abundant unlabeled samples exist, our
unsupervised method outperforms its counterpart trained with supervised
learning.
Comment: CVPR 2018 Camera-read
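One common way to account for occlusion explicitly in an unsupervised flow objective is to exclude occluded pixels from the photometric comparison. The sketch below shows that idea only; it is not the paper's implementation, and the function name `masked_photometric_loss`, the absolute-difference penalty, and the assumption that a binary visibility mask is already available are all illustrative choices.

```python
def masked_photometric_loss(target, warped, visible):
    """Mean absolute intensity difference over non-occluded pixels only.

    target, warped: 2D lists of intensities of equal shape.
    visible: 2D list of 0/1 flags; 0 marks a pixel assumed to be occluded
    in the other frame (hypothetical input, computed elsewhere).
    """
    total = 0.0
    count = 0
    for row_t, row_w, row_m in zip(target, warped, visible):
        for t, w, m in zip(row_t, row_w, row_m):
            if m:
                total += abs(t - w)
                count += 1
    # With no visible pixels there is nothing to penalise.
    return total / count if count else 0.0
```

Without the mask, pixels that have no valid correspondence in the other frame would contribute a large error that no flow estimate can reduce.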