Propagating Confidences through CNNs for Sparse Data Regression
In most computer vision applications, convolutional neural networks (CNNs)
operate on dense image data generated by ordinary cameras. Designing CNNs for
sparse and irregularly spaced input data is still an open problem with numerous
applications in autonomous driving, robotics, and surveillance. To tackle this
challenging problem, we introduce an algebraically-constrained convolution
layer for CNNs with sparse input and demonstrate its capabilities for the scene
depth completion task. We propose novel strategies for determining the
confidence from the convolution operation and propagating it to consecutive
layers. Furthermore, we propose an objective function that simultaneously
minimizes the data error while maximizing the output confidence. Comprehensive
experiments are performed on the KITTI depth benchmark and the results clearly
demonstrate that the proposed approach achieves superior performance while
requiring three times fewer parameters than the state-of-the-art methods.
Moreover, our approach produces a continuous pixel-wise confidence map enabling
information fusion, state inference, and decision support. Comment: To appear in the British Machine Vision Conference (BMVC 2018)
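The abstract above does not give the layer's equations, so the following is only a minimal sketch of one way a confidence-weighted convolution with confidence propagation could look. The softplus constraint that keeps the weights positive, the normalisation by the weighted confidence, and the names `softplus` and `normalized_conv2d` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softplus(w):
    # Assumed constraint: map unconstrained weights to strictly positive values.
    return np.log1p(np.exp(w))

def normalized_conv2d(depth, conf, raw_weights, eps=1e-8):
    """Confidence-weighted 'valid' 2D filtering (no padding, stride 1).

    Returns the filtered depth and a propagated confidence map in [0, 1].
    """
    w = softplus(raw_weights)
    kh, kw = w.shape
    out_h = depth.shape[0] - kh + 1
    out_w = depth.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    out_conf = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            d = depth[i:i + kh, j:j + kw]
            c = conf[i:i + kh, j:j + kw]
            num = np.sum(w * c * d)   # confidence-weighted data
            den = np.sum(w * c)       # confidence mass seen by the filter
            out[i, j] = num / (den + eps)
            # Propagated confidence: fraction of the filter mass that was trusted.
            out_conf[i, j] = den / np.sum(w)
    return out, out_conf

# Usage: two valid depth samples in an otherwise empty 4x4 grid.
depth = np.zeros((4, 4))
depth[1, 1] = 10.0
depth[2, 3] = 10.0
conf = (depth > 0).astype(float)
filled, filled_conf = normalized_conv2d(depth, conf, np.zeros((3, 3)))
```

In this sketch, missing pixels contribute nothing to the output because their confidence is zero, and the propagated confidence grows with the number of trusted samples inside each window.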
Sparse-to-Continuous: Enhancing Monocular Depth Estimation using Occupancy Maps
This paper addresses the problem of single image depth estimation (SIDE),
focusing on improving the quality of deep neural network predictions. In a
supervised learning scenario, the quality of predictions is intrinsically
related to the training labels, which guide the optimization process. For
indoor scenes, structured-light-based depth sensors (e.g. Kinect) are able to
provide dense, albeit short-range, depth maps. On the other hand, for outdoor
scenes, LiDAR is considered the standard sensor, although it provides comparatively
much sparser measurements, especially at longer ranges. Rather than
modifying the neural network architecture to deal with sparse depth maps, this
article introduces a novel densification method for depth maps, using the
Hilbert Maps framework. A continuous occupancy map is produced based on 3D
points from LiDAR scans, and the resulting reconstructed surface is projected
into a 2D depth map with arbitrary resolution. Experiments conducted with
various subsets of the KITTI dataset show a significant improvement produced by
the proposed Sparse-to-Continuous technique, without the introduction of extra
information into the training stage. Comment: Accepted. (c) 2019 IEEE.
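To illustrate only the final projection step mentioned above, here is a minimal sketch that projects camera-frame 3D points into a 2D depth map with a pinhole model. It does not implement Hilbert Maps or any surface reconstruction. The function name `project_to_depth_map` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, and using 0 to mark pixels without a measurement is an assumed convention.

```python
import numpy as np

def project_to_depth_map(points, fx, fy, cx, cy, height, width):
    """Project camera-frame 3D points of shape (N, 3) to a depth map.

    Pixels that receive no point are set to 0 (assumed 'no measurement' marker).
    """
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    front = z > 0  # discard points behind the camera
    x, y, z = x[front], y[front], z[front]
    u = np.round(fx * x / z + cx).astype(int)  # pixel column
    v = np.round(fy * y / z + cy).astype(int)  # pixel row
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth = np.full((height, width), np.inf)
    # Keep the nearest point per pixel (a simple z-buffer).
    np.minimum.at(depth, (v[inside], u[inside]), z[inside])
    depth[np.isinf(depth)] = 0.0
    return depth

# Usage: five points, of which only three fall inside a 5x5 image.
pts = [[0, 0, 10], [0, 0, 5], [1, 0, 10], [0, 0, -1], [100, 0, 1]]
sparse_depth = project_to_depth_map(pts, fx=10, fy=10, cx=2, cy=2, height=5, width=5)
```

Projecting raw points in this way leaves most pixels empty, which is the sparsity that a densification step is meant to address; projecting a reconstructed continuous surface instead would allow the output resolution to be chosen freely.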
A Few Photons Among Many: Unmixing Signal and Noise for Photon-Efficient Active Imaging
Conventional LIDAR systems require hundreds or thousands of photon detections
to form accurate depth and reflectivity images. Recent photon-efficient
computational imaging methods are remarkably effective with only 1.0 to 3.0
detected photons per pixel, but they have not been demonstrated at
signal-to-background ratio (SBR) below 1.0 because their imaging accuracies
degrade significantly in the presence of high background noise. We introduce a
new approach to depth and reflectivity estimation that focuses on unmixing
contributions from signal and noise sources. At each pixel in an image,
short-duration range gates are adaptively determined and applied to remove
detections likely to be due to noise. For pixels with too few detections to
perform this censoring accurately, we borrow data from neighboring pixels to
improve depth estimates, where the neighborhood formation is also adaptive to
scene content. Algorithm performance is demonstrated on experimental data at
varying levels of noise. Results show improved performance of both reflectivity
and depth estimates over state-of-the-art methods, especially at low
signal-to-background ratios. In particular, accurate imaging is demonstrated
with SBR as low as 0.04. This validation of a photon-efficient, noise-tolerant
method demonstrates the viability of rapid, long-range, and low-power LIDAR
imaging.
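The idea of censoring noise detections with a short range gate can be sketched as follows. This is a simple stand-in that keeps the detections inside the densest fixed-width time window at one pixel; it is not the adaptive procedure from the paper, and the names `gate_detections`, `depth_from_times`, and the parameter `gate_width` are assumptions for illustration.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def gate_detections(times, gate_width):
    """Keep the detections inside the densest window of length gate_width (seconds)."""
    t = np.sort(np.asarray(times, dtype=float))
    if t.size == 0:
        return t
    # For each detection taken as a window start, find where the window ends.
    ends = np.searchsorted(t, t + gate_width, side="right")
    counts = ends - np.arange(t.size)
    start = int(np.argmax(counts))
    return t[start:ends[start]]

def depth_from_times(times):
    # Convert the mean round-trip time of flight to a one-way distance in metres.
    return C * np.mean(times) / 2.0

# Usage: three signal detections near 40 ns among three background detections.
detections = [5e-9, 40.0e-9, 40.2e-9, 39.9e-9, 90e-9, 12e-9]
kept = gate_detections(detections, gate_width=1e-9)
depth = depth_from_times(kept)
```

Signal photons cluster around the true time of flight while background photons are spread over the whole acquisition window, so the densest short window is a reasonable first guess at the signal when enough detections are available.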
Confidence Propagation through CNNs for Guided Sparse Depth Regression
Generally, convolutional neural networks (CNNs) process data on a regular
grid, e.g. data generated by ordinary cameras. Designing CNNs for sparse and
irregularly spaced input data is still an open research problem with numerous
applications in autonomous driving, robotics, and surveillance. In this paper,
we propose an algebraically-constrained normalized convolution layer for CNNs
with highly sparse input that has a smaller number of network parameters
compared to related work. We propose novel strategies for determining the
confidence from the convolution operation and propagating it to consecutive
layers. We also propose an objective function that simultaneously minimizes the
data error while maximizing the output confidence. To integrate structural
information, we also investigate fusion strategies to combine depth and RGB
information in our normalized convolution network framework. In addition, we
introduce the use of output confidence as auxiliary information to improve
the results. The capabilities of our normalized convolution network framework
are demonstrated for the problem of scene depth completion. Comprehensive
experiments are performed on the KITTI-Depth and the NYU-Depth-v2 datasets. The
results clearly demonstrate that the proposed approach achieves superior
performance while requiring only about 1-5% of the number of parameters
compared to the state-of-the-art methods. Comment: 14 pages, 14 figures
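The abstract describes an objective that minimizes the data error while maximizing the output confidence, but does not state its form. The sketch below shows one plausible shape for such an objective: a mean squared error on labelled pixels minus a weighted mean confidence. The squared-error choice, the weight `lam`, and the function name `confidence_aware_loss` are assumptions, not the loss used in the paper.

```python
import numpy as np

def confidence_aware_loss(pred, target, out_conf, valid, lam=0.1):
    """Mean squared data error on labelled pixels minus a confidence reward.

    `valid` marks pixels that have a ground-truth label; it must contain
    at least one True entry.
    """
    valid = np.asarray(valid).astype(bool)
    data_term = np.mean((pred[valid] - target[valid]) ** 2)
    conf_term = np.mean(out_conf[valid])
    # Lower loss for smaller error and for higher output confidence.
    return data_term - lam * conf_term

# Usage with a 2x2 prediction in which one pixel is off by 2.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 2.0], [3.0, 6.0]])
conf = np.full((2, 2), 0.5)
loss = confidence_aware_loss(pred, target, conf, np.ones((2, 2)))
```

The subtraction rewards confident outputs, so in practice the weight on the confidence term would need to stay small enough that the network cannot lower the loss by raising confidence alone.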
A Joint Intensity and Depth Co-Sparse Analysis Model for Depth Map Super-Resolution
High-resolution depth maps can be inferred from low-resolution depth
measurements and an additional high-resolution intensity image of the same
scene. To that end, we introduce a bimodal co-sparse analysis model, which is
able to capture the interdependency of registered intensity and depth
information. This model is based on the assumption that the co-supports of
corresponding bimodal image structures are aligned when computed by a suitable
pair of analysis operators. No analytic form of such operators exists, so we
propose a method for learning them from a set of registered training signals.
This learning process is done offline and returns a bimodal analysis operator
that is universally applicable to natural scenes. We use this to exploit the
bimodal co-sparse analysis model as a prior for solving inverse problems, which
leads to an efficient algorithm for depth map super-resolution. Comment: 13 pages, 4 figures
Sparsity Invariant CNNs
In this paper, we consider convolutional neural networks operating on sparse
inputs with an application to depth upsampling from sparse laser scan data.
First, we show that traditional convolutional networks perform poorly when
applied to sparse data even when the location of missing data is provided to
the network. To overcome this problem, we propose a simple yet effective sparse
convolution layer which explicitly considers the location of missing data
during the convolution operation. We demonstrate the benefits of the proposed
network architecture in synthetic and real experiments with respect to various
baseline approaches. Compared to dense baselines, the proposed sparse
convolution network generalizes well to novel datasets and is invariant to the
level of sparsity in the data. For our evaluation, we derive a novel dataset
from the KITTI benchmark, comprising 93k depth annotated RGB images. Our
dataset allows for training and evaluating depth upsampling and depth
prediction techniques in challenging real-world settings and will be made
available upon publication.
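The claim that ordinary convolution depends on the sparsity level while a mask-aware layer does not can be illustrated in one dimension. This is a minimal sketch under assumed conventions: missing samples are stored as 0, a binary mask marks observed samples, and the output is renormalised by the filter weights that fell on observed samples. The exact normalisation and mask update used in the paper are not given in the abstract, and the function names are hypothetical.

```python
import numpy as np

def plain_filter(x, w):
    # Ordinary filtering: missing samples (stored as 0) are treated as real zeros.
    return np.correlate(x, w, mode="valid")

def mask_normalized_filter(x, mask, w, eps=1e-8):
    """Filter only observed samples and renormalise by the weights actually used."""
    num = np.correlate(x * mask, w, mode="valid")
    den = np.correlate(mask, w, mode="valid")
    out = num / (den + eps)
    # An output is marked valid if any sample in its window was observed.
    new_mask = (den > 0).astype(float)
    return out, new_mask

# Usage: a constant signal of 5.0 observed at half of its positions.
signal = np.full(8, 5.0)
mask = np.array([1, 0, 0, 1, 0, 1, 1, 0], dtype=float)
w = np.ones(3) / 3.0
naive = plain_filter(signal * mask, w)
aware, aware_mask = mask_normalized_filter(signal, mask, w)
```

With these inputs the plain filter returns values that scale with how many samples each window happened to observe, whereas the mask-normalised output stays close to 5.0 wherever at least one sample is present.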