Search CORE

1,815 research outputs found

A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

Author: Brox Thomas
Cremers Daniel
Dosovitskiy Alexey
Fischer Philipp
Häusser Philip
Ilg Eddy
Mayer Nikolaus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/12/2015
Field of study

Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks. Training of the so-called FlowNet was enabled by a large synthetically generated dataset. The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation. To this end, we propose three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks. Our datasets are the first large-scale datasets to enable training and evaluating scene flow methods. Besides the datasets, we present a convolutional network for real-time disparity estimation that provides state-of-the-art results. By combining a flow and disparity estimation network and training it jointly, we demonstrate the first scene flow estimation with a convolutional network.Comment: Includes supplementary materia

arXiv.org e-Print Archive

Crossref

Anytime Stereo Image Depth Estimation on Mobile Devices

Author: Campbell Mark
Huang Gao
Lai Zihang
van der Maaten Laurens
Wang Brian H.
Wang Yan
Weinberger Kilian Q.
Publication venue
Publication date: 05/03/2019
Field of study

Many applications of stereo depth estimation in robotics require the generation of accurate disparity maps in real time under significant computational constraints. Current state-of-the-art algorithms force a choice between either generating accurate mappings at a slow pace, or quickly generating inaccurate ones, and additionally these methods typically require far too many parameters to be usable on power- or memory-constrained devices. Motivated by these shortcomings, we propose a novel approach for disparity prediction in the anytime setting. In contrast to prior work, our end-to-end learned approach can trade off computation and accuracy at inference time. Depth estimation is performed in stages, during which the model can be queried at any time to output its current best estimate. Our final model can process 1242

\times

375 resolution images within a range of 10-35 FPS on an NVIDIA Jetson TX2 module with only marginal increases in error -- using two orders of magnitude fewer parameters than the most competitive baseline. The source code is available at https://github.com/mileyan/AnyNet .Comment: Accepted by ICRA201

arXiv.org e-Print Archive

Crossref