Multi-View Stereo with Single-View Semantic Mesh Refinement
While 3D reconstruction is a well-established and widely explored research
topic, semantic 3D reconstruction has only recently witnessed an increasing
share of attention from the Computer Vision community. Semantic annotations
make it possible to enforce strong class-dependent priors, such as planarity
for ground and walls, which can be exploited to refine the reconstruction, often
resulting in non-trivial performance improvements. State-of-the-art methods propose
volumetric approaches to fuse RGB image data with semantic labels; even if
successful, they do not scale well and fail to output high resolution meshes.
In this paper we propose a novel method to refine both the geometry and the
semantic labeling of a given mesh. We refine the mesh geometry by applying a
variational method that optimizes a composite energy made of a state-of-the-art
pairwise photometric term and a single-view term that models the semantic
consistency between the labels of the 3D mesh and those of the segmented
images. We also update the semantic labeling through a novel Markov Random
Field (MRF) formulation that, together with the classical data and smoothness
terms, takes into account class-specific priors estimated directly from the
annotated mesh. This is in contrast to state-of-the-art methods that are
typically based on handcrafted or learned priors. Together with the very
recent and seminal work of [M. Blaha et al arXiv:1706.08336, 2017], we are the
first to propose the use of semantics inside a mesh refinement framework.
Unlike [M. Blaha et al arXiv:1706.08336, 2017], which adopts a more
classical pairwise comparison to estimate the flow of the mesh, we apply a
single-view comparison between the semantically annotated image and the current
3D mesh labels; this improves the robustness in case of noisy segmentations.
Comment: 3D Reconstruction Meets Semantics, ICCV workshop
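The labeling step described in the abstract above combines data and smoothness terms in an MRF over mesh facets. The following is a minimal sketch of that general idea, not the authors' formulation: it assumes a Potts smoothness term and uses iterated conditional modes (ICM) as a simple stand-in optimizer. The names `mrf_energy`, `icm`, `unary`, `edges` and `smooth_weight` are hypothetical, and the class-specific priors mentioned in the abstract are not modeled (they could enter as an extra per-facet cost added to `unary`).

```python
def mrf_energy(labels, unary, edges, smooth_weight):
    """Energy = sum of per-facet data costs + Potts penalty on adjacent facets."""
    data = sum(unary[i][l] for i, l in enumerate(labels))
    smooth = sum(smooth_weight for a, b in edges if labels[a] != labels[b])
    return data + smooth


def icm(unary, edges, smooth_weight, max_sweeps=10):
    """Greedy label update (iterated conditional modes); an assumed optimizer."""
    n = len(unary)
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Start from the label with the lowest data cost on each facet.
    labels = [min(range(len(u)), key=u.__getitem__) for u in unary]

    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def local_cost(l):
                # Data cost plus a penalty for each neighbor with a different label.
                penalty = sum(smooth_weight for j in neighbors[i] if labels[j] != l)
                return unary[i][l] + penalty

            best = min(range(len(unary[i])), key=local_cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Three facets in a chain; the middle facet has an ambiguous data term.
unary = [[0.0, 2.0], [1.0, 0.9], [0.0, 2.0]]
edges = [(0, 1), (1, 2)]
labels = icm(unary, edges, smooth_weight=1.0)
```

In this toy chain the middle facet slightly prefers label 1 on its own, but the smoothness term pulls it to the label of its two neighbors, which lowers the total energy.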
Planar Prior Assisted PatchMatch Multi-View Stereo
The completeness of 3D models is still a challenging problem in multi-view
stereo (MVS) due to the unreliable photometric consistency in low-textured
areas. Since low-textured areas usually exhibit strong planarity, planar models
are advantageous for the depth estimation of low-textured areas. On the other
hand, PatchMatch multi-view stereo is very efficient thanks to its sampling and
propagation scheme. By taking advantage of planar models and PatchMatch
multi-view stereo, we propose a planar prior assisted PatchMatch multi-view
stereo framework in this paper. In detail, we utilize a probabilistic graphical
model to embed planar models into PatchMatch multi-view stereo and contribute a
novel multi-view aggregated matching cost. This novel cost takes both
photometric consistency and planar compatibility into consideration, making it
suited for the depth estimation of both non-planar and planar regions.
Experimental results demonstrate that our method can efficiently recover the
depth information of extremely low-textured areas, thus obtaining highly complete
3D models and achieving state-of-the-art performance.
Comment: Accepted by AAAI-202
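To illustrate how a matching cost can account for both photometric consistency and planar compatibility, here is a small sketch. It is not the paper's probabilistic formulation: it assumes a simple additive cost in which the planar term is the truncated distance between a depth hypothesis and the depth predicted by a prior plane. The names `combined_cost`, `select_depth`, `weight` and `trunc` are hypothetical.

```python
def combined_cost(photo_cost, depth, prior_depth, weight=0.5, trunc=1.0):
    """Photometric cost plus a truncated penalty for deviating from the planar prior."""
    planar_penalty = min(abs(depth - prior_depth), trunc)
    return photo_cost + weight * planar_penalty


def select_depth(hypotheses, prior_depth, weight=0.5, trunc=1.0):
    """Pick the depth whose combined cost is lowest.

    hypotheses: list of (depth, photometric_cost) pairs.
    """
    best = min(
        hypotheses,
        key=lambda h: combined_cost(h[1], h[0], prior_depth, weight, trunc),
    )
    return best[0]


# Low-textured pixel: photometric costs are uninformative, so the prior decides.
flat = [(2.0, 0.50), (2.4, 0.50), (3.0, 0.50)]
depth_flat = select_depth(flat, prior_depth=2.5)

# Textured pixel: a clear photometric minimum overrides the prior.
textured = [(2.0, 0.05), (2.4, 0.90)]
depth_textured = select_depth(textured, prior_depth=2.5)
```

The truncation keeps the planar term from dominating in genuinely non-planar regions, where a hypothesis far from the plane can still win on photometric evidence.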
CHOSEN: Contrastive Hypothesis Selection for Multi-View Depth Refinement
We propose CHOSEN, a simple yet flexible, robust and effective multi-view
depth refinement framework. It can be employed in any existing multi-view
stereo pipeline and generalizes straightforwardly to different multi-view
capture systems, for example with different relative camera positioning and lenses.
Given an initial depth estimation, CHOSEN iteratively re-samples and selects
the best hypotheses, and automatically adapts to different metric or intrinsic
scales determined by the capture system. The key to our approach is the
application of contrastive learning in an appropriate solution space and a
carefully designed hypothesis feature, based on which positive and negative
hypotheses can be effectively distinguished. Integrated in a simple baseline
multi-view stereo pipeline, CHOSEN delivers impressive quality in terms of
depth and normal accuracy compared to many current deep-learning-based
multi-view stereo pipelines.
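The iterative re-sample-and-select loop described above can be sketched as follows. This is only an illustration of the control flow under stated assumptions: the learned contrastive ranking is replaced by an arbitrary `score` callable supplied by the caller, candidates are sampled on a regular grid rather than by the paper's scheme, and the names `refine_depth`, `radius`, `num_samples` and `shrink` are hypothetical.

```python
def refine_depth(initial, score, radius=1.0, num_samples=5, iterations=4, shrink=0.5):
    """Iteratively sample depth hypotheses around the current estimate and keep the best.

    score: callable returning a higher value for a better hypothesis
           (a stand-in for a learned ranking function).
    """
    depth = initial
    for _ in range(iterations):
        step = 2.0 * radius / (num_samples - 1)
        # Evenly spaced candidates in [depth - radius, depth + radius];
        # with an odd num_samples the current estimate is one of them.
        candidates = [depth - radius + k * step for k in range(num_samples)]
        depth = max(candidates, key=score)
        radius *= shrink  # narrow the search window each round
    return depth


# Toy score that peaks at a depth of 3.2.
def toy_score(d):
    return -abs(d - 3.2)


refined = refine_depth(2.0, toy_score)
```

Shrinking the window each round lets a coarse initial estimate converge without evaluating a dense set of hypotheses at every step.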