A Deep Primal-Dual Network for Guided Depth Super-Resolution
In this paper we present a novel method to increase the spatial resolution of
depth images. We combine a deep fully convolutional network with a non-local
variational method in a deep primal-dual network. The joint network computes a
noise-free, high-resolution estimate from a noisy, low-resolution input depth
map. Additionally, a high-resolution intensity image is used to guide the
reconstruction in the network. By unrolling the optimization steps of a
first-order primal-dual algorithm and formulating it as a network, we can train
our joint method end-to-end. This not only enables us to learn the weights of
the fully convolutional network, but also to optimize all parameters of the
variational method and its optimization procedure. The training of such a deep
network requires a large dataset for supervision. Therefore, we generate
high-quality depth maps and corresponding color images with a physically based
renderer. In an exhaustive evaluation we show that our method outperforms the
state-of-the-art on multiple benchmarks.
Comment: BMVC 201
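The idea of unrolling a first-order primal-dual algorithm into a fixed number of steps can be illustrated with a much simpler model than the one in the abstract. The sketch below is an assumption-laden stand-in: it uses plain 1D total-variation denoising instead of the paper's non-local variational model, NumPy instead of a trainable network, and hand-picked values for the hypothetical parameters `lam`, `tau`, `sigma`, and `n_steps`. In an end-to-end trained version, those parameters would be learned rather than fixed.

```python
import numpy as np

def grad(u):
    """Forward differences D u (length n - 1)."""
    return u[1:] - u[:-1]

def grad_T(p):
    """Adjoint D^T p of the forward-difference operator (length n)."""
    return np.concatenate(([-p[0]], p[:-1] - p[1:], [p[-1]]))

def unrolled_primal_dual(f, lam=0.5, tau=0.5, sigma=0.5, n_steps=200):
    """Run a fixed number of primal-dual steps for
    min_u 0.5 * ||u - f||^2 + lam * ||D u||_1  (1D TV denoising).

    Each loop iteration corresponds to one "layer" of the unrolled scheme.
    The step sizes satisfy tau * sigma * ||D||^2 <= 1, since ||D||^2 <= 4.
    """
    f = np.asarray(f, dtype=float)
    u = f.copy()
    u_bar = u.copy()
    p = np.zeros(len(f) - 1)
    for _ in range(n_steps):
        # Dual ascent step, then projection onto the box [-lam, lam].
        p = np.clip(p + sigma * grad(u_bar), -lam, lam)
        # Primal descent step with the closed-form prox of the data term.
        u_new = (u - tau * grad_T(p) + tau * f) / (1.0 + tau)
        # Over-relaxation of the primal variable.
        u_bar = 2.0 * u_new - u
        u = u_new
    return u
```

Calling `unrolled_primal_dual(noisy_signal)` returns a smoothed signal of the same length. Because every operation is differentiable almost everywhere, the same loop could be written in an autodiff framework so that the parameters are optimized by backpropagation.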
Sparsity Invariant CNNs
In this paper, we consider convolutional neural networks operating on sparse
inputs with an application to depth upsampling from sparse laser scan data.
First, we show that traditional convolutional networks perform poorly when
applied to sparse data even when the location of missing data is provided to
the network. To overcome this problem, we propose a simple yet effective sparse
convolution layer which explicitly considers the location of missing data
during the convolution operation. We demonstrate the benefits of the proposed
network architecture in synthetic and real experiments with respect to various
baseline approaches. Compared to dense baselines, the proposed sparse
convolution network generalizes well to novel datasets and is invariant to the
level of sparsity in the data. For our evaluation, we derive a novel dataset
from the KITTI benchmark, comprising 93k depth annotated RGB images. Our
dataset allows for training and evaluating depth upsampling and depth
prediction techniques in challenging real-world settings and will be made
available upon publication.
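A convolution that explicitly considers the location of missing data can be sketched as a mask-normalized filter. The code below is a minimal single-channel NumPy illustration under stated assumptions, not the authors' implementation: the function name `sparse_conv2d`, the normalization by the number of valid pixels in each window, and the propagation of the mask by a window maximum are choices made here for illustration.

```python
import numpy as np

def sparse_conv2d(x, mask, weight, bias=0.0, eps=1e-8):
    """Mask-normalized 2D convolution (single channel, zero 'same' padding).

    x      : (H, W) input values; entries where mask == 0 are ignored
    mask   : (H, W) binary validity mask, 1 = observed, 0 = missing
    weight : (k, k) kernel with odd k
    Returns the filtered map and the propagated validity mask.
    """
    k = weight.shape[0]
    pad = k // 2
    xp = np.pad(x * mask, pad)            # zero out missing values
    mp = np.pad(mask.astype(float), pad)
    H, W = x.shape
    out = np.zeros((H, W))
    new_mask = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            xw = xp[i:i + k, j:j + k]
            mw = mp[i:i + k, j:j + k]
            # Normalize by the count of valid pixels in the window so the
            # response does not shrink as the input gets sparser.
            out[i, j] = (weight * xw).sum() / (mw.sum() + eps) + bias
            # A location becomes valid if any pixel in its window was valid.
            new_mask[i, j] = mw.max()
    return out, new_mask
```

With an all-ones kernel and a constant depth map, the output is approximately that constant wherever at least one valid pixel falls in the window, regardless of how many pixels are observed. This is the sense in which the response does not depend on the sparsity level.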
Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera
Depth completion, the technique of estimating a dense depth image from sparse
depth measurements, has a variety of applications in robotics and autonomous
driving. However, depth completion faces three main challenges: the irregularly
spaced pattern in the sparse depth input, the difficulty in handling multiple
sensor modalities (when color images are available), as well as the lack of
dense, pixel-level ground truth depth labels. In this work, we address all
these challenges. Specifically, we develop a deep regression model to learn a
direct mapping from sparse depth (and color images) to dense depth. We also
propose a self-supervised training framework that requires only sequences of
color and sparse depth images, without the need for dense depth labels. Our
experiments demonstrate that our network, when trained with semi-dense
annotations, attains state-of-the-art accuracy and is the winning approach on
the KITTI depth completion benchmark at the time of submission. Furthermore,
the self-supervised framework outperforms a number of existing solutions
trained with semi-dense annotations.
Comment: Software:
https://github.com/fangchangma/self-supervised-depth-completion . Video:
https://youtu.be/bGXfvF261pc . 12 pages, 6 figures, 3 table
Indoor Depth Completion with Boundary Consistency and Self-Attention
Depth estimation features are helpful for 3D recognition. Commodity-grade
depth cameras can capture depth and color images in real time. However,
glossy, transparent, or distant surfaces cannot be scanned properly by the
sensor. As a result, enhancing and restoring the sensed depth is an
important task. Depth completion aims to fill the holes that sensors fail to
detect, which is still a complex task for machines to learn. Traditional
hand-tuned methods have reached their limits, while neural-network-based
methods tend to copy and interpolate the output from surrounding depth values.
This leads to blurred boundaries and a loss of structure in the depth map.
Consequently, our main contribution is an end-to-end network that improves
completed depth maps while maintaining edge clarity. We use a self-attention
mechanism, previously used in image inpainting, to extract more useful
information in each convolution layer so that the completed depth map is
enhanced. In addition, we propose a boundary consistency concept to enhance the
quality and structure of the depth map. Experimental results validate the
effectiveness of our self-attention and boundary consistency scheme, which
outperforms previous state-of-the-art depth completion work on the Matterport3D
dataset. Our code is publicly available at
https://github.com/patrickwu2/Depth-Completion
Comment: Accepted by ICCVW (RLQ) 201