TAPA-MVS: Textureless-Aware PAtchMatch Multi-View Stereo
One of the most successful approaches in Multi-View Stereo estimates a depth
map and a normal map for each view via PatchMatch-based optimization and fuses
them into a consistent 3D point cloud. This approach relies on
photo-consistency to evaluate the goodness of a depth estimate. It generally
produces very accurate results; however, the reconstructed model often lacks
completeness, especially in broad untextured areas where the
photo-consistency metrics are unreliable. Assuming that untextured areas are
piecewise planar, in this paper we generate novel PatchMatch hypotheses so as to
extend reliable depth estimates into neighboring untextured regions. At the same
time, we modify the photo-consistency measure so as to favor standard or novel
PatchMatch depth hypotheses depending on the textureness of the considered
area. We also propose a depth refinement step to filter wrong estimates and to
fill the gaps on both the depth maps and normal maps while preserving the
discontinuities. The effectiveness of our new methods has been tested against
several state-of-the-art algorithms on the publicly available ETH3D dataset,
which contains a wide variety of high- and low-resolution images.
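As an illustration of why photo-consistency becomes unreliable in untextured areas, the sketch below uses zero-mean normalized cross-correlation (NCC), one common photo-consistency measure. It is an assumption that NCC stands in for the measure used in the paper; the function name `ncc`, the `eps` threshold, and the sample patches are hypothetical.

```python
import math

def ncc(patch_a, patch_b, eps=1e-9):
    """Zero-mean normalized cross-correlation of two flattened, equal-size patches."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denominator < eps:
        # A constant-intensity patch has zero variance, so the correlation
        # is undefined; returning 0.0 here is an illustrative convention.
        return 0.0
    return numerator / denominator

# Hypothetical 3x3 patches, flattened row by row.
textured = [10, 200, 35, 180, 90, 15, 220, 60, 140]
brighter = [v + 20 for v in textured]   # same texture, uniform brightness offset
flat = [128] * 9                        # untextured patch

score_textured = ncc(textured, brighter)  # close to 1: a confident match
score_flat = ncc(flat, flat)              # 0.0: no evidence either way
```

For the flat patch every depth hypothesis scores the same, so the measure cannot distinguish a correct depth from a wrong one.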
Learned Multi-Patch Similarity
Estimating a depth map from multiple views of a scene is a fundamental task
in computer vision. As soon as more than two viewpoints are available, one
faces the very basic question of how to measure similarity across more than two image
patches. Surprisingly, no direct solution exists; instead, it is common to fall
back to more or less robust averaging of two-view similarities. Encouraged by
the success of machine learning, and in particular convolutional neural
networks, we propose to learn a matching function which directly maps multiple
image patches to a scalar similarity score. Experiments on several multi-view
datasets demonstrate that this approach has advantages over methods based on
pairwise patch similarity. Comment: 10 pages, 7 figures, Accepted at ICCV 201
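The fallback that the abstract describes, averaging two-view similarities over all pairs of patches, can be sketched as follows. This is not the learned multi-patch function proposed in the paper; the choice of NCC as the two-view similarity and the names `ncc` and `mean_pairwise_similarity` are assumptions made for illustration.

```python
import math
from itertools import combinations

def ncc(patch_a, patch_b, eps=1e-9):
    """Two-view similarity: zero-mean normalized cross-correlation."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return numerator / denominator if denominator >= eps else 0.0

def mean_pairwise_similarity(patches):
    """Average the two-view similarity over every unordered pair of patches."""
    pairs = list(combinations(patches, 2))
    return sum(ncc(a, b) for a, b in pairs) / len(pairs)

# Three hypothetical views of the same surface patch (flattened 2x2),
# differing only by a brightness offset, plus one unrelated patch.
view_1 = [10, 50, 30, 90]
view_2 = [20, 60, 40, 100]
view_3 = [15, 55, 35, 95]
outlier = [90, 30, 50, 10]

consistent = mean_pairwise_similarity([view_1, view_2, view_3])
with_outlier = mean_pairwise_similarity([view_1, view_2, outlier])
```

A single inconsistent view pulls the plain mean down sharply, which is one reason such averaging schemes are often made robust in practice.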
Semi-Global Stereo Matching with Surface Orientation Priors
Semi-Global Matching (SGM) is a widely used and efficient stereo matching
technique. It works well for textured scenes, but fails on untextured slanted
surfaces due to its fronto-parallel smoothness assumption. To remedy this
problem, we propose a simple extension, termed SGM-P, to utilize precomputed
surface orientation priors. Such priors favor different surface slants in
different 2D image regions or 3D scene regions and can be derived in various
ways. In this paper we evaluate plane orientation priors derived from stereo
matching at a coarser resolution and show that such priors can yield
significant performance gains for difficult weakly-textured scenes. We also
explore surface normal priors derived from Manhattan-world assumptions, and we
analyze the potential performance gains using oracle priors derived from
ground-truth data. SGM-P only adds a minor computational overhead to SGM and is
an attractive alternative to more complex methods employing higher-order
smoothness terms. Comment: extended draft of 3DV 2017 (spotlight) paper
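The fronto-parallel smoothness assumption mentioned above is visible in the standard SGM cost aggregation along a single path, sketched below. Keeping the same disparity as the previous pixel costs nothing, a change of one level costs `p1`, and any larger jump costs `p2`. The SGM-P extension itself is not reproduced here; the function name `aggregate_path`, the penalty values, and the toy cost volume are assumptions.

```python
def aggregate_path(cost, p1=1.0, p2=4.0):
    """Aggregate matching costs left to right along one scanline.

    cost[x][d] is the matching cost of pixel x at disparity d.
    Returns the aggregated costs with the same shape.
    """
    num_disp = len(cost[0])
    aggregated = [list(cost[0])]          # first pixel has no predecessor
    for x in range(1, len(cost)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(num_disp):
            best = prev[d]                               # same disparity: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)       # change by one level
            if d + 1 < num_disp:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)              # larger jump
            # Subtracting prev_min keeps the values from growing along the path.
            row.append(cost[x][d] + best - prev_min)
        aggregated.append(row)
    return aggregated

# Toy scanline: the middle pixel is untextured (all disparities cost the same),
# while its neighbors clearly prefer disparity 1.
cost = [[5, 0, 5],
        [2, 2, 2],
        [5, 0, 5]]
aggregated = aggregate_path(cost)
middle = aggregated[1]
best_disparity = middle.index(min(middle))
```

Because the zero-penalty case is "same disparity as the neighbor", the ambiguous pixel inherits a constant disparity. That is appropriate for fronto-parallel surfaces and biased against slanted ones, which is the problem the orientation priors are meant to address.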
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from
multiple moving cameras without prior knowledge or limiting constraints on the
scene structure, appearance, or illumination. Existing techniques for dynamic
scene reconstruction from multiple wide-baseline camera views primarily focus
on accurate reconstruction in controlled environments, where the cameras are
fixed and calibrated and the background is known. These approaches are not robust
for general dynamic scenes captured with sparse moving cameras. Previous
approaches for outdoor dynamic scene reconstruction assume prior knowledge of
the static background appearance and structure. The primary contributions of
this paper are twofold: an automatic method for initial coarse dynamic scene
segmentation and reconstruction without prior knowledge of background
appearance or structure; and a general robust approach for joint segmentation
refinement and dense reconstruction of dynamic scenes from multiple
wide-baseline static or moving cameras. Evaluation is performed on a variety of
indoor and outdoor scenes with cluttered backgrounds and multiple dynamic
non-rigid objects such as people. Comparison with state-of-the-art approaches
demonstrates improved accuracy in both multiple view segmentation and dense
reconstruction. The proposed approach also eliminates the requirement for prior
knowledge of scene structure and appearance.