Learning Shape Priors for Single-View 3D Completion and Reconstruction
The problem of single-view 3D shape completion or reconstruction is
challenging, because among the many possible shapes that explain an
observation, most are implausible and do not correspond to natural objects.
Recent research in the field has tackled this problem by exploiting the
expressiveness of deep convolutional networks. However, there is another level
of ambiguity that is often overlooked: among plausible shapes, there are still
multiple shapes that fit the 2D image equally well; i.e., the ground truth
shape is non-deterministic given a single-view input. Existing fully supervised
approaches fail to address this issue, and often produce blurry mean shapes
with smooth surfaces but no fine details.
In this paper, we propose ShapeHD, pushing the limit of single-view shape
completion and reconstruction by integrating deep generative models with
adversarially learned shape priors. The learned priors serve as a regularizer,
penalizing the model only if its output is unrealistic, not if it deviates from
the ground truth. Our design thus overcomes both aforementioned levels of
ambiguity. Experiments demonstrate that ShapeHD outperforms the state of the
art by a large margin in both shape completion and shape reconstruction on
multiple real datasets.
Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://shapehd.csail.mit.edu
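The key idea above, a prior that penalizes the model only when its output is unrealistic rather than whenever it deviates from the ground truth, can be illustrated with a minimal sketch. The discriminator score, margin, and masked data term below are illustrative assumptions, not the paper's exact losses:

```python
import numpy as np

def shape_prior_penalty(disc_score, margin=0.5):
    """Penalize the generator only when the shape looks unrealistic.

    disc_score: scalar in [0, 1] from an (assumed) adversarially trained
    discriminator -- higher means more realistic. Unlike an L2 term against
    a single ground-truth shape, this term is zero for any sufficiently
    realistic output, so multiple plausible completions are not penalized.
    """
    return max(0.0, margin - disc_score)

def total_loss(pred_voxels, obs_voxels, obs_mask, disc_score, lam=1.0):
    """Hypothetical combined objective: data term plus realism prior.

    The data term compares predictions only on the observed region
    (obs_mask), leaving unobserved regions free to vary among the
    many shapes that fit the single-view input equally well.
    """
    data = float(np.mean((pred_voxels[obs_mask] - obs_voxels[obs_mask]) ** 2))
    return data + lam * shape_prior_penalty(disc_score)
```

A realistic but non-ground-truth completion (high `disc_score`, matching observations) then incurs zero loss, which is exactly the behavior the abstract describes.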
Propagating Confidences through CNNs for Sparse Data Regression
In most computer vision applications, convolutional neural networks (CNNs)
operate on dense image data generated by ordinary cameras. Designing CNNs for
sparse and irregularly spaced input data is still an open problem with numerous
applications in autonomous driving, robotics, and surveillance. To tackle this
challenging problem, we introduce an algebraically-constrained convolution
layer for CNNs with sparse input and demonstrate its capabilities for the scene
depth completion task. We propose novel strategies for determining the
confidence from the convolution operation and propagating it to consecutive
layers. Furthermore, we propose an objective function that simultaneously
minimizes the data error while maximizing the output confidence. Comprehensive
experiments are performed on the KITTI depth benchmark and the results clearly
demonstrate that the proposed approach achieves superior performance while
requiring three times fewer parameters than the state-of-the-art methods.
Moreover, our approach produces a continuous pixel-wise confidence map enabling
information fusion, state inference, and decision support.
Comment: To appear in the British Machine Vision Conference (BMVC 2018)
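The mechanism described here, convolving data and confidence jointly so that missing samples neither corrupt the output nor inflate certainty, can be sketched in one dimension. This is an illustrative normalized-convolution-style sketch under assumed conventions (non-negative filter, confidence 0 for missing pixels), not the paper's exact layer:

```python
import numpy as np

def confidence_conv1d(x, c, w, eps=1e-8):
    """1-D sketch of confidence-weighted convolution for sparse input.

    x: signal with missing samples, c: per-sample confidence in [0, 1]
    (0 = missing), w: non-negative filter. The output at each position is
    a confidence-weighted average over valid inputs, and a confidence map
    is propagated for consecutive layers.
    """
    num = np.convolve(x * c, w, mode="same")   # filter only trusted data
    den = np.convolve(c, w, mode="same")       # accumulate local confidence
    out = num / (den + eps)                    # normalize by confidence mass
    c_out = den / w.sum()                      # propagated confidence map
    return out, c_out
```

For example, with `x = [0, 5, 0, 5, 0]` where only the two 5s are observed (`c = [0, 1, 0, 1, 0]`) and a box filter `w = [1, 1, 1]`, the center output is the average of the two valid neighbors rather than being dragged toward zero by the missing samples, and its propagated confidence (2/3) reflects that two of three taps were valid.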