1,254 research outputs found
Accurate Optical Flow via Direct Cost Volume Processing
We present an optical flow estimation approach that operates on the full
four-dimensional cost volume. This direct approach shares the structural
benefits of leading stereo matching pipelines, which are known to yield high
accuracy. To this day, such approaches have been considered impractical due to
the size of the cost volume. We show that the full four-dimensional cost volume
can be constructed in a fraction of a second due to its regularity. We then
exploit this regularity further by adapting semi-global matching to the
four-dimensional setting. This yields a pipeline that achieves significantly
higher accuracy than state-of-the-art optical flow methods while being faster
than most. Our approach outperforms all published general-purpose optical flow
methods on both Sintel and KITTI 2015 benchmarks.Comment: Published at the Conference on Computer Vision and Pattern
Recognition (CVPR 2017
SCNet: Learning Semantic Correspondence
This paper addresses the problem of establishing semantic correspondences
between images depicting different instances of the same object or scene
category. Previous approaches focus on either combining a spatial regularizer
with hand-crafted features, or learning a correspondence model for appearance
only. We propose instead a convolutional neural network architecture, called
SCNet, for learning a geometrically plausible model for semantic
correspondence. SCNet uses region proposals as matching primitives, and
explicitly incorporates geometric consistency in its loss function. It is
trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and
a comparative evaluation on several standard benchmarks demonstrates that the
proposed approach substantially outperforms both recent deep learning
architectures and previous methods based on hand-crafted features.Comment: ICCV 201
LIFT: Learned Invariant Feature Transform
We introduce a novel Deep Network architecture that implements the full
feature point handling pipeline, that is, detection, orientation estimation,
and feature description. While previous works have successfully tackled each
one of these problems individually, we show how to learn to do all three in a
unified manner while preserving end-to-end differentiability. We then
demonstrate that our Deep pipeline outperforms state-of-the-art methods on a
number of benchmark datasets, without the need of retraining.Comment: Accepted to ECCV 2016 (spotlight
Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs
Human visual system relies on both binocular stereo cues and monocular
focusness cues to gain effective 3D perception. In computer vision, the two
problems are traditionally solved in separate tracks. In this paper, we present
a unified learning-based technique that simultaneously uses both types of cues
for depth inference. Specifically, we use a pair of focal stacks as input to
emulate human perception. We first construct a comprehensive focal stack
training dataset synthesized by depth-guided light field rendering. We then
construct three individual networks: a Focus-Net to extract depth from a single
focal stack, a EDoF-Net to obtain the extended depth of field (EDoF) image from
the focal stack, and a Stereo-Net to conduct stereo matching. We show how to
integrate them into a unified BDfF-Net to obtain high-quality depth maps.
Comprehensive experiments show that our approach outperforms the
state-of-the-art in both accuracy and speed and effectively emulates human
vision systems
- …