Learned Semantic Multi-Sensor Depth Map Fusion
Volumetric depth map fusion based on truncated signed distance functions has
become a standard method and is used in many 3D reconstruction pipelines. In
this paper, we generalize this classic method in multiple ways: 1)
Semantics: Semantic information enriches the scene representation and is
incorporated into the fusion process. 2) Multi-Sensor: Depth information can
originate from different sensors or algorithms with very different noise and
outlier statistics which are considered during data fusion. 3) Scene denoising
and completion: Sensors can fail to recover depth for certain materials and
lighting conditions, or data is missing due to occlusions. Our method denoises the
geometry, closes holes and computes a watertight surface for every semantic
class. 4) Learning: We propose a neural network reconstruction method that
unifies all these properties within a single powerful framework. Our method
learns sensor or algorithm properties jointly with semantic depth fusion and
scene completion and can also be used as an expert system, e.g. to unify the
strengths of various photometric stereo algorithms. Our approach is the first
to unify all these properties. Experimental evaluations on both synthetic and
real data sets demonstrate clear improvements.

Comment: 11 pages, 7 figures, 2 tables; accepted for the 2nd Workshop on 3D Reconstruction in the Wild (3DRW2019) in conjunction with ICCV 2019
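The multi-sensor aspect of the abstract above can be illustrated with a toy weighted TSDF fusion. The sketch below fixes per-sensor reliability weights by hand (the paper instead *learns* sensor properties with a neural network); all names and the weighting scheme are illustrative assumptions, not the paper's method.

```python
import numpy as np

def fuse_multi_sensor_tsdf(tsdf_obs, sensor_weights):
    """Fuse per-sensor TSDF observations into one voxel grid.

    tsdf_obs: list of (D, W) pairs, one per sensor, where D holds truncated
    signed distances per voxel and W the per-voxel observation weights.
    sensor_weights: a hand-set scalar reliability per sensor (a stand-in for
    the noise/outlier statistics the paper learns from data).
    """
    num = np.zeros_like(tsdf_obs[0][0])
    den = np.zeros_like(tsdf_obs[0][0])
    for (d, w), s in zip(tsdf_obs, sensor_weights):
        num += s * w * d  # each sensor votes, scaled by its reliability
        den += s * w
    den = np.maximum(den, 1e-8)  # avoid division by zero in unobserved voxels
    return num / den
```

A sensor with weight 3.0 pulls the fused surface three times as hard as one with weight 1.0; the learned version of this idea also lets the weights vary with local context rather than staying constant per sensor.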
OctNetFusion: Learning Depth Fusion from Data
In this paper, we present a learning based approach to depth fusion, i.e.,
dense 3D reconstruction from multiple depth images. The most common approach to
depth fusion is based on averaging truncated signed distance functions, which
was originally proposed by Curless and Levoy in 1996. While this method is
simple and provides great results, it is not able to reconstruct (partially)
occluded surfaces and requires a large number of frames to filter out sensor noise
and outliers. Motivated by the availability of large 3D model repositories and
recent advances in deep learning, we present a novel 3D CNN architecture that
learns to predict an implicit surface representation from the input depth maps.
Our learning based method significantly outperforms the traditional volumetric
fusion approach in terms of noise reduction and outlier suppression. By
learning the structure of real world 3D objects and scenes, our approach is
further able to reconstruct occluded regions and to fill in gaps in the
reconstruction. We demonstrate that our learning based approach outperforms
both vanilla TSDF fusion as well as TV-L1 fusion on the task of volumetric
fusion. Further, we demonstrate state-of-the-art 3D shape completion results.

Comment: 3DV 2017, https://github.com/griegler/octnetfusion
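The "averaging truncated signed distance functions" baseline that this abstract builds on can be sketched as a per-voxel running weighted average, in the spirit of Curless and Levoy (1996). Function and parameter names below are illustrative, not from either paper.

```python
import numpy as np

def tsdf_update(D, W, d_new, w_new, trunc=0.05):
    """One running-average TSDF update from a new depth frame.

    D, W:   current signed-distance and weight voxel grids
    d_new:  signed distances to the new frame's surface, per voxel
    w_new:  per-voxel weights for the new observation
    trunc:  truncation band in scene units (an assumed value)
    """
    d_new = np.clip(d_new, -trunc, trunc)  # truncate far-from-surface values
    D_out = (W * D + w_new * d_new) / np.maximum(W + w_new, 1e-8)
    W_out = W + w_new
    return D_out, W_out
```

Each frame nudges the stored distance toward its own measurement in proportion to the accumulated weight, which is why many frames are needed to average out noise; the learned 3D CNN in the paper replaces this fixed update with a data-driven one that can also complete occluded regions.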
CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction
Given the recent advances in depth prediction from Convolutional Neural
Networks (CNNs), this paper investigates how predicted depth maps from a deep
neural network can be deployed for accurate and dense monocular reconstruction.
We propose a method where CNN-predicted dense depth maps are naturally fused
together with depth measurements obtained from direct monocular SLAM. Our
fusion scheme privileges depth prediction in image locations where monocular
SLAM approaches tend to fail, e.g. along low-textured regions, and vice-versa.
We demonstrate the use of depth prediction for estimating the absolute scale of
the reconstruction, hence overcoming one of the major limitations of monocular
SLAM. Finally, we propose a framework to efficiently fuse semantic labels,
obtained from a single frame, with dense SLAM, yielding semantically coherent
scene reconstruction from a single view. Evaluation results on two benchmark
datasets show the robustness and accuracy of our approach.

Comment: 10 pages, 6 figures, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Hawaii, USA, June, 2017. The first two authors contribute equally to this paper
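The fusion scheme described above, which privileges CNN-predicted depth where monocular SLAM tends to fail (low-textured regions), can be caricatured as a per-pixel selection driven by image gradient magnitude. This is a deliberately simplified stand-in for the paper's scheme; the hard gradient threshold, function names, and parameters are all assumptions.

```python
import numpy as np

def blend_depths(depth_cnn, depth_slam, image_gray, valid_slam, grad_thresh=0.1):
    """Toy per-pixel depth fusion in the spirit of CNN-SLAM.

    Trust the SLAM depth where the image is well textured (high gradient
    magnitude) and SLAM produced a valid measurement; fall back to the CNN
    prediction elsewhere, e.g. in flat, low-texture regions.
    """
    gy, gx = np.gradient(image_gray.astype(np.float64))
    textured = np.hypot(gx, gy) > grad_thresh  # crude texture detector
    use_slam = textured & valid_slam
    return np.where(use_slam, depth_slam, depth_cnn)
```

In the actual system the two sources are fused continuously rather than switched, and the CNN depth additionally anchors the absolute scale of the reconstruction, which pure monocular SLAM cannot recover.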