Propagating Confidences through CNNs for Sparse Data Regression
In most computer vision applications, convolutional neural networks (CNNs)
operate on dense image data generated by ordinary cameras. Designing CNNs for
sparse and irregularly spaced input data is still an open problem with numerous
applications in autonomous driving, robotics, and surveillance. To tackle this
challenging problem, we introduce an algebraically-constrained convolution
layer for CNNs with sparse input and demonstrate its capabilities for the scene
depth completion task. We propose novel strategies for determining the
confidence from the convolution operation and propagating it to consecutive
layers. Furthermore, we propose an objective function that simultaneously
minimizes the data error while maximizing the output confidence. Comprehensive
experiments are performed on the KITTI depth benchmark and the results clearly
demonstrate that the proposed approach achieves superior performance while
requiring three times fewer parameters than the state-of-the-art methods.
Moreover, our approach produces a continuous pixel-wise confidence map enabling
information fusion, state inference, and decision support.
Comment: To appear in the British Machine Vision Conference (BMVC2018)
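The abstract does not give the layer's equations, so the following is only a minimal sketch of one common way to propagate confidence through a convolution on sparse input: a normalized convolution with non-negative weights. The function name `normalized_conv2d`, the `eps` stabilizer, and the choice of output confidence (weighted confidence divided by the sum of the weights) are assumptions for illustration, not the paper's stated formulation.

```python
import numpy as np

def normalized_conv2d(x, c, w, eps=1e-8):
    """Confidence-weighted convolution on sparse data (illustrative sketch).

    x : (H, W) array of measurements; values at missing pixels are ignored
    c : (H, W) array of confidences in [0, 1]; 0 marks a missing pixel
    w : (k, k) array of non-negative weights, k odd (assumed constraint)
    Returns the filtered data and a propagated confidence map.
    """
    k = w.shape[0]
    pad = k // 2
    H, W = x.shape
    xc = np.pad(x * c, pad)   # zero-pad the confidence-weighted data
    cp = np.pad(c, pad)       # zero-pad the confidence map
    num = np.zeros((H, W))
    den = np.zeros((H, W))
    # Accumulate the weighted sums over the k x k window.
    for i in range(k):
        for j in range(k):
            num += w[i, j] * xc[i:i + H, j:j + W]
            den += w[i, j] * cp[i:i + H, j:j + W]
    out = num / (den + eps)   # data estimate, normalized by local confidence
    c_out = den / w.sum()     # assumed rule for the confidence passed onward
    return out, c_out
```

With a single valid pixel in an otherwise empty input, the estimate spreads to the neighbours inside the window while the propagated confidence stays low, which is the behaviour a continuous pixel-wise confidence map is meant to expose.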
The Missing Data Encoder: Cross-Channel Image Completion with Hide-And-Seek Adversarial Network
Image completion is the problem of generating whole images from fragments
only. It encompasses inpainting (generating a patch given its surroundings),
reverse inpainting/extrapolation (generating the periphery given the central
patch) as well as colorization (generating one or several channels given other
ones). In this paper, we employ a deep network to perform image completion,
with adversarial training as well as perceptual and completion losses, and call
it the "missing data encoder" (MDE). We consider several configurations based
on how the seed fragments are chosen. We show that training MDE for "random
extrapolation and colorization" (MDE-REC), i.e. using random
channel-independent fragments, captures the image semantics and geometry
better. MDE training makes use of a novel "hide-and-seek" adversarial
loss, where the discriminator seeks the original non-masked regions, while the
generator tries to hide them. We validate our models both qualitatively and
quantitatively on several datasets, showing their usefulness for image
completion, unsupervised representation learning, and face occlusion
handling.
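To make "random channel-independent fragments" concrete, here is a minimal sketch of how such seed masks could be sampled, assuming one random rectangle of visible pixels per channel. The function name `random_channel_masks`, the rectangular shape, and the size bounds `min_frac` and `max_frac` are hypothetical choices for illustration; the abstract does not specify how fragments are drawn.

```python
import numpy as np

def random_channel_masks(height, width, channels, rng,
                         min_frac=0.25, max_frac=0.75):
    """Sample one visible rectangle per channel, independently (sketch).

    Returns a (channels, height, width) float array where 1 marks a
    visible (seed) pixel and 0 marks a pixel the network must generate.
    """
    masks = np.zeros((channels, height, width), dtype=np.float32)
    for ch in range(channels):
        # Rectangle size is drawn independently for every channel.
        h = int(rng.integers(max(1, int(min_frac * height)),
                             int(max_frac * height) + 1))
        w = int(rng.integers(max(1, int(min_frac * width)),
                             int(max_frac * width) + 1))
        # Rectangle position is also drawn independently.
        top = int(rng.integers(0, height - h + 1))
        left = int(rng.integers(0, width - w + 1))
        masks[ch, top:top + h, left:left + w] = 1.0
    return masks

rng = np.random.default_rng(0)
image = rng.random((3, 32, 32)).astype(np.float32)  # stand-in for a real image
masks = random_channel_masks(32, 32, 3, rng)
seed = image * masks  # fragments that would be fed to the completion network
```

Because each channel gets its own rectangle, a single sample can mix extrapolation (a pixel visible in no channel) with colorization (a pixel visible in some channels but not others).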