1,709 research outputs found
Sparse-to-Continuous: Enhancing Monocular Depth Estimation using Occupancy Maps
This paper addresses the problem of single image depth estimation (SIDE),
focusing on improving the quality of deep neural network predictions. In a
supervised learning scenario, the quality of predictions is intrinsically
related to the training labels, which guide the optimization process. For
indoor scenes, structured-light-based depth sensors (e.g. Kinect) are able to
provide dense, albeit short-range, depth maps. On the other hand, for outdoor
scenes, LiDARs are considered the standard sensor, which comparatively provides
much sparser measurements, especially in areas further away. Rather than
modifying the neural network architecture to deal with sparse depth maps, this
article introduces a novel densification method for depth maps, using the
Hilbert Maps framework. A continuous occupancy map is produced based on 3D
points from LiDAR scans, and the resulting reconstructed surface is projected
into a 2D depth map with arbitrary resolution. Experiments conducted with
various subsets of the KITTI dataset show a significant improvement produced by
the proposed Sparse-to-Continuous technique, without the introduction of extra
information into the training stage.Comment: Accepted. (c) 2019 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other work
DeepVoxels: Learning Persistent 3D Feature Embeddings
In this work, we address the lack of 3D understanding of generative neural
networks by introducing a persistent 3D feature embedding for view synthesis.
To this end, we propose DeepVoxels, a learned representation that encodes the
view-dependent appearance of a 3D scene without having to explicitly model its
geometry. At its core, our approach is based on a Cartesian 3D grid of
persistent embedded features that learn to make use of the underlying 3D scene
structure. Our approach combines insights from 3D geometric computer vision
with recent advances in learning image-to-image mappings based on adversarial
loss functions. DeepVoxels is supervised, without requiring a 3D reconstruction
of the scene, using a 2D re-rendering loss and enforces perspective and
multi-view geometry in a principled manner. We apply our persistent 3D scene
representation to the problem of novel view synthesis demonstrating
high-quality results for a variety of challenging scenes.Comment: Video: https://www.youtube.com/watch?v=HM_WsZhoGXw Supplemental
material:
https://drive.google.com/file/d/1BnZRyNcVUty6-LxAstN83H79ktUq8Cjp/view?usp=sharing
Code: https://github.com/vsitzmann/deepvoxels Project page:
https://vsitzmann.github.io/deepvoxels
VConv-DAE: Deep Volumetric Shape Learning Without Object Labels
With the advent of affordable depth sensors, 3D capture becomes more and more
ubiquitous and already has made its way into commercial products. Yet,
capturing the geometry or complete shapes of everyday objects using scanning
devices (e.g. Kinect) still comes with several challenges that result in noise
or even incomplete shapes. Recent success in deep learning has shown how to
learn complex shape distributions in a data-driven way from large scale 3D CAD
Model collections and to utilize them for 3D processing on volumetric
representations and thereby circumventing problems of topology and
tessellation. Prior work has shown encouraging results on problems ranging from
shape completion to recognition. We provide an analysis of such approaches and
discover that training as well as the resulting representation are strongly and
unnecessarily tied to the notion of object labels. Thus, we propose a full
convolutional volumetric auto encoder that learns volumetric representation
from noisy data by estimating the voxel occupancy grids. The proposed method
outperforms prior work on challenging tasks like denoising and shape
completion. We also show that the obtained deep embedding gives competitive
performance when used for classification and promising results for shape
interpolation
- …