An Octree-Based Approach towards Efficient Variational Range Data Fusion
Volume-based reconstruction is usually expensive in terms of both memory
consumption and runtime. For sparse geometric structures in particular,
volumetric representations incur substantial computational overhead. We present
an efficient way to fuse range data via a variational Octree-based minimization
approach that takes the actual range data geometry into account. We transform
the data into Octree-based truncated signed distance fields and show how the
optimization can be conducted on the newly created structures. The main
challenge is to maintain speed and a low memory footprint without sacrificing
the solution's accuracy during optimization. We explain how to dynamically
adjust the optimizer's geometric structure by joining and splitting Octree
nodes, and how to define the corresponding operators. We evaluate on various
datasets and demonstrate the method's suitability in terms of performance and
geometric accuracy.
Comment: BMVC 201
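To make the join/split idea concrete, below is a minimal Python sketch of an
adaptive Octree node storing TSDF values, with a simple variation-based join
criterion. All names, the split policy, and the threshold are illustrative
assumptions; the paper's actual operators are defined within its variational
optimizer and are more involved.

```python
# Hypothetical sketch of an adaptive Octree node for a truncated signed
# distance field (TSDF). The split/join criterion is an assumption made
# for illustration, not the paper's operator definition.

class OctreeNode:
    def __init__(self, center, size, depth, max_depth=8):
        self.center = center          # (x, y, z) of the cell centre
        self.size = size              # edge length of the cubic cell
        self.depth = depth
        self.max_depth = max_depth
        self.children = None          # None => leaf node
        self.tsdf = 0.0               # truncated signed distance value
        self.weight = 0.0             # accumulated fusion weight

    def split(self):
        """Refine a leaf into 8 children, inheriting the parent's TSDF."""
        if self.children is not None or self.depth >= self.max_depth:
            return
        half, quarter = self.size / 2.0, self.size / 4.0
        cx, cy, cz = self.center
        self.children = []
        for dx in (-quarter, quarter):
            for dy in (-quarter, quarter):
                for dz in (-quarter, quarter):
                    child = OctreeNode((cx + dx, cy + dy, cz + dz),
                                       half, self.depth + 1, self.max_depth)
                    child.tsdf, child.weight = self.tsdf, self.weight
                    self.children.append(child)

    def join(self, variation_threshold=1e-3):
        """Collapse children back into this node when their TSDF values are
        nearly constant, i.e. the region carries no fine geometry."""
        if self.children is None or any(c.children for c in self.children):
            return
        values = [c.tsdf for c in self.children]
        if max(values) - min(values) < variation_threshold:
            self.tsdf = sum(values) / 8.0
            self.weight = sum(c.weight for c in self.children) / 8.0
            self.children = None
```

The key property this structure buys is that flat or empty regions collapse to
single coarse nodes, which is what keeps memory and runtime low for sparse
geometry.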
Creating Simplified 3D Models with High Quality Textures
This paper presents an extension to the KinectFusion algorithm which allows
creating simplified 3D models with high-quality RGB textures. This is achieved
by (i) creating model textures using images from an HD RGB camera that is
calibrated with the Kinect depth camera, (ii) using a modified scheme to update
model textures in an asymmetrical colour volume that contains a higher number
of voxels than the geometry volume, (iii) simplifying the dense polygon mesh
using a quadric-based mesh decimation algorithm, and (iv) creating and mapping
2D textures to every polygon in the output 3D model. The proposed method is
implemented in real time by means of GPU parallel processing. Visualization via
ray casting of both the geometry and colour volumes provides users with
real-time feedback on the currently scanned 3D model. Experimental results show
that the proposed method preserves the model's texture quality even for a
heavily decimated model and that, when reconstructing small objects,
photorealistic RGB textures can still be reconstructed.
Comment: 2015 International Conference on Digital Image Computing: Techniques
and Applications (DICTA), Page 1 -
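As a rough illustration of step (ii), the colour volume can be sampled more
densely than the geometry volume, so each geometry voxel maps to a small block
of colour voxels. The resolutions, the integer upsampling factor, and the
helper function below are assumptions for illustration only, not the paper's
actual layout.

```python
import numpy as np

# Illustrative sketch of an asymmetric geometry/colour volume pair: the
# colour volume is denser by an integer factor per axis, so one geometry
# voxel owns a block of colour voxels. All sizes here are assumptions.

GEOM_RES = 64                 # geometry volume resolution (voxels per axis)
COLOR_FACTOR = 2              # colour volume is denser by this factor per axis
COLOR_RES = GEOM_RES * COLOR_FACTOR

geometry_volume = np.zeros((GEOM_RES,) * 3, dtype=np.float32)      # TSDF
color_volume = np.zeros((COLOR_RES,) * 3 + (3,), dtype=np.uint8)   # RGB

def color_block(ix, iy, iz):
    """Return the colour-volume sub-block covering geometry voxel (ix,iy,iz)."""
    f = COLOR_FACTOR
    return color_volume[ix*f:(ix+1)*f, iy*f:(iy+1)*f, iz*f:(iz+1)*f]
```

The design point is that texture detail can exceed geometric detail without
paying the cubic memory cost of refining the whole TSDF grid.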
OctNetFusion: Learning Depth Fusion from Data
In this paper, we present a learning-based approach to depth fusion, i.e.,
dense 3D reconstruction from multiple depth images. The most common approach to
depth fusion is based on averaging truncated signed distance functions, which
was originally proposed by Curless and Levoy in 1996. While this method is
simple and provides good results, it is not able to reconstruct (partially)
occluded surfaces and requires a large number of frames to filter out sensor
noise
and outliers. Motivated by the availability of large 3D model repositories and
recent advances in deep learning, we present a novel 3D CNN architecture that
learns to predict an implicit surface representation from the input depth maps.
Our learning-based method significantly outperforms the traditional volumetric
fusion approach in terms of noise reduction and outlier suppression. By
learning the structure of real world 3D objects and scenes, our approach is
further able to reconstruct occluded regions and to fill in gaps in the
reconstruction. We demonstrate that our learning-based approach outperforms
both vanilla TSDF fusion and TV-L1 fusion on the task of volumetric fusion.
Further, we demonstrate state-of-the-art 3D shape completion results.
Comment: 3DV 2017, https://github.com/griegler/octnetfusio
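For reference, the classic volumetric baseline that OctNetFusion is compared
against boils down to a per-voxel running weighted average of truncated signed
distances, in the spirit of Curless and Levoy (1996). The sketch below is a
minimal NumPy version; the variable names and the weight cap are illustrative
assumptions.

```python
import numpy as np

# Minimal sketch of classic TSDF fusion by weighted averaging, in the
# spirit of Curless and Levoy (1996). Names and the weight cap are
# illustrative assumptions, not any specific implementation's API.

def fuse_tsdf(tsdf, weight, observation, obs_weight, max_weight=100.0):
    """Fold one frame's truncated signed distances into the running average.

    tsdf, weight : accumulated per-voxel grids (same shape)
    observation  : this frame's truncated signed distances
    obs_weight   : per-voxel confidence, 0 where the voxel is unobserved
    """
    new_weight = weight + obs_weight
    valid = new_weight > 0            # skip voxels never observed so far
    tsdf[valid] = (tsdf[valid] * weight[valid]
                   + observation[valid] * obs_weight[valid]) / new_weight[valid]
    weight[:] = np.minimum(new_weight, max_weight)   # cap to stay adaptive
    return tsdf, weight
```

Averaging suppresses zero-mean noise given enough frames, which is exactly the
weakness the abstract points at: it cannot hallucinate occluded surfaces, and
it needs many observations per voxel.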
Point-to-hyperplane RGB-D Pose Estimation: Fusing Photometric and Geometric Measurements
The objective of this paper is to investigate how best to combine and fuse color and depth measurements for incremental pose estimation or 3D tracking. A framework is proposed that formulates the problem with a unique measurement vector rather than combining the modalities in an ad-hoc manner. In particular, the full color and depth measurement is defined as a 4-vector (combining 3D Euclidean points with image intensities), and an optimal error for pose estimation is derived from it. As will be shown, this leads to an iterative closest point (ICP) approach in 4-dimensional space. A kd-tree is used to find the closest point in 4D space, thereby simultaneously accounting for color and depth. Based on this unified framework, a novel Point-to-hyperplane approach is introduced which has the advantages of classic Point-to-plane ICP but in 4D space. It will be shown that this removes the need to provide or estimate a scale factor between the different measurement types. Consequently, the convergence domain is increased and the alignment is sped up, whilst maintaining robustness and accuracy. Results on both simulated and real environments are provided along with benchmark comparisons.
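A minimal sketch of the joint 4D data association described above, using a
kd-tree over stacked geometry-plus-intensity vectors. Note that this basic 4D
formulation needs a scale between metric and intensity units (the
intensity_scale parameter here, an assumption of this sketch); the paper's
point-to-hyperplane formulation is precisely what removes the need to tune
such a factor.

```python
import numpy as np
from scipy.spatial import cKDTree

# Sketch of joint 4D closest-point association: each measurement stacks a
# 3D point with its grey-level intensity, and nearest neighbours are found
# over geometry and colour simultaneously. Function name and the intensity
# scaling are illustrative assumptions.

def closest_points_4d(ref_points, ref_intensity, cur_points, cur_intensity,
                      intensity_scale=1.0):
    """Match each current 4D measurement to its nearest reference one.

    ref_points, cur_points       : (N, 3) arrays of 3D Euclidean points
    ref_intensity, cur_intensity : (N,) arrays of image intensities
    """
    ref4 = np.hstack([ref_points, intensity_scale * ref_intensity[:, None]])
    cur4 = np.hstack([cur_points, intensity_scale * cur_intensity[:, None]])
    tree = cKDTree(ref4)                 # kd-tree over 4-vectors
    distances, indices = tree.query(cur4)  # 1-NN in 4D space
    return distances, indices
```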
Dense 3D Object Reconstruction from a Single Depth View
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs
the complete 3D structure of a given object from a single arbitrary depth view
using generative adversarial networks. Unlike existing work, which typically
requires multiple views of the same object or class labels to recover the full
3D geometry, the proposed 3D-RecGAN++ only takes the voxel grid representation
of a depth view of the object as input, and is able to generate the complete 3D
occupancy grid with a high resolution of 256^3 by recovering the
occluded/missing regions. The key idea is to combine the generative
capabilities of autoencoders and the conditional Generative Adversarial
Networks (GAN) framework, to infer accurate and fine-grained 3D structures of
objects in high-dimensional voxel space. Extensive experiments on large
synthetic datasets and real-world Kinect datasets show that the proposed
3D-RecGAN++ significantly outperforms the state of the art in single view 3D
object reconstruction, and is able to reconstruct unseen types of objects.
Comment: TPAMI 2018. Code and data are available at:
https://github.com/Yang7879/3D-RecGAN-extended. This article extends from
arXiv:1708.0796
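To illustrate the kind of input 3D-RecGAN++ consumes, the sketch below
back-projects a single depth view with pinhole intrinsics and bins the points
into an occupancy grid. The intrinsics, grid bounds, and resolution are
illustrative assumptions, not the paper's values (the paper's output
resolution is 256^3).

```python
import numpy as np

# Hypothetical sketch of converting one depth view into an occupancy grid.
# Camera intrinsics (fx, fy, cx, cy), grid bounds, and the default
# resolution are assumptions made for illustration.

def depth_to_occupancy(depth, fx, fy, cx, cy, grid_min, grid_max, res=64):
    """Back-project a depth image and bin the 3D points into a res^3 grid."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx                     # pinhole back-projection
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)

    gmin = np.asarray(grid_min, dtype=np.float64)
    gmax = np.asarray(grid_max, dtype=np.float64)
    idx = ((points - gmin) / (gmax - gmin) * res).astype(int)
    inside = np.all((idx >= 0) & (idx < res), axis=1)
    grid = np.zeros((res, res, res), dtype=np.uint8)
    grid[tuple(idx[inside].T)] = 1                   # mark occupied voxels
    return grid
```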
3D Object reconstruction from a single depth view with adversarial learning
In this paper, we propose a novel 3D-RecGAN approach, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work, which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN only takes the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid by filling in the occluded/missing regions. The key idea is to combine the generative capabilities of autoencoders and the conditional Generative Adversarial Networks (GAN) framework, to infer accurate and fine-grained 3D structures of objects in high-dimensional voxel space. Extensive experiments on large synthetic datasets show that the proposed 3D-RecGAN significantly outperforms the state of the art in single view 3D object reconstruction, and is able to reconstruct unseen types of objects. Our code and data are available at: https://github.com/Yang7879/3D-RecGAN
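The combination of an autoencoder with a conditional GAN suggests an objective
of the following general shape, written here only as a plausible form; the
specific reconstruction norm, the conditioning, and the weighting factor are
assumptions, not the paper's exact loss.

```latex
% x: input depth-view voxel grid, y: ground-truth occupancy grid,
% \hat{y} = G(x): generator output, D: conditional discriminator.
% The L1 norm and the weighting \lambda are illustrative assumptions.
\mathcal{L}(G, D) =
    \underbrace{\lVert \hat{y} - y \rVert_{1}}_{\text{reconstruction}}
  + \lambda \Bigl(
      \mathbb{E}\bigl[\log D(y \mid x)\bigr]
    + \mathbb{E}\bigl[\log\bigl(1 - D(\hat{y} \mid x)\bigr)\bigr]
    \Bigr)
```

The reconstruction term anchors the output to the observed geometry, while the
adversarial term pushes the completed regions toward the manifold of plausible
shapes rather than a blurred average.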
Fast capture of textured full-body avatar with RGB-D cameras
We present a practical system which can provide
a textured full-body avatar within three seconds. It uses sixteen
RGB-depth (RGB-D) cameras, ten of which are arranged
to capture the body, while six target the important
head region. The configuration of the multiple cameras is
formulated as a constraint-based minimum set space-covering
problem, which is approximately solved by a heuristic algorithm.
The camera layout determined can cover the fullbody
surface of an adult, with geometric errors of less than
5 mm. After arranging the cameras, they are calibrated using
a mannequin before scanning real humans. The 16 RGB-D
images are all captured within 1 s, which both avoids the
need for the subject to attempt to remain still for an uncomfortable
period, and helps to keep pose changes between different
cameras small. All scans are combined and processed
to reconstruct the photo-realistic textured mesh in 2 s. During
both system calibration and working capture of a real
subject, the high-quality RGB information is exploited to assist
geometric reconstruction and texture stitching optimization
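The abstract states only that camera placement is posed as a constrained
minimum set space-covering problem solved approximately by a heuristic. One
standard heuristic for set cover is greedy selection, sketched below purely as
an illustration; it is an assumption, not necessarily the authors' algorithm.

```python
# Greedy set-cover sketch for camera placement: repeatedly pick the
# candidate camera pose covering the most still-uncovered surface points.
# This is a generic textbook heuristic shown for illustration only.

def greedy_camera_cover(candidate_views, surface_points):
    """candidate_views: dict mapping a camera pose id to the set of surface
    point ids it covers. Returns a small list of poses covering the points."""
    uncovered = set(surface_points)
    chosen = []
    while uncovered:
        # Candidate with the largest marginal coverage gain.
        best = max(candidate_views,
                   key=lambda c: len(candidate_views[c] & uncovered))
        gain = candidate_views[best] & uncovered
        if not gain:                      # remaining points are uncoverable
            break
        chosen.append(best)
        uncovered -= gain
    return chosen
```

In the paper's setting, the candidate sets would additionally encode
constraints such as working distance and the denser coverage reserved for the
head region.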