Learning Deep Object Detectors from 3D Models
Crowdsourced 3D CAD models are becoming easily accessible online, and can
potentially generate an infinite number of training images for almost any
object category. We show that augmenting the training data of contemporary Deep
Convolutional Neural Net (DCNN) models with such synthetic data can be
effective, especially when real training data is limited or not well matched to
the target domain. Most freely available CAD models capture 3D shape but are
often missing other low level cues, such as realistic object texture, pose, or
background. In a detailed analysis, we use synthetic CAD-rendered images to
probe the ability of DCNN to learn without these cues, with surprising
findings. In particular, we show that when the DCNN is fine-tuned on the target
detection task, it exhibits a large degree of invariance to missing low-level
cues, but, when pretrained on generic ImageNet classification, it learns better
when the low-level cues are simulated. We show that our synthetic DCNN training
approach significantly outperforms previous methods on the PASCAL VOC2007
dataset when learning in the few-shot scenario and improves performance in a
domain shift scenario on the Office benchmark.
GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation
The inherent ambiguity in ground-truth annotations of 3D bounding boxes
caused by occlusions, missing signal, or manual annotation errors can confuse
deep 3D object detectors during training, thus deteriorating the detection
accuracy. However, existing methods overlook such issues to some extent and
treat the labels as deterministic. In this paper, we formulate the label
uncertainty problem as the diversity of potentially plausible bounding boxes of
objects, then propose GLENet, a generative framework adapted from conditional
variational autoencoders, to model the one-to-many relationship between a
typical 3D object and its potential ground-truth bounding boxes with latent
variables. The label uncertainty generated by GLENet is a plug-and-play module
and can be conveniently integrated into existing deep 3D detectors to build
probabilistic detectors and supervise the learning of the localization
uncertainty. In addition, we propose an uncertainty-aware quality estimator
architecture in probabilistic detectors to guide the training of the IoU branch
with the predicted localization uncertainty. We incorporate the proposed methods
into various popular base 3D detectors and demonstrate significant and
consistent performance gains on both KITTI and Waymo benchmark datasets.
In particular, the proposed GLENet-VR outperforms all published LiDAR-based
approaches by a large margin and ranks among single-modal methods on
the challenging KITTI test set. We will make the source code and pre-trained
models publicly available.
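One common way to let a label uncertainty supervise a predicted localization uncertainty is to treat each box parameter as a univariate Gaussian and penalize the KL divergence between the label distribution and the predicted one. The abstract does not state the loss used, so the sketch below is only an illustration under that assumption; the function names `gaussian_kl` and `box_kl_loss` and the direction of the divergence are hypothetical choices, not the authors' implementation.

```python
import math


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for univariate Gaussians P = N(mu_p, sigma_p^2), Q = N(mu_q, sigma_q^2)."""
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


def box_kl_loss(label_box, label_sigma, pred_box, pred_sigma):
    """Sum the per-parameter KL terms over all box parameters.

    Assumption: P is the label distribution (mean = annotated value,
    std = estimated label uncertainty) and Q is the detector's prediction.
    """
    return sum(
        gaussian_kl(mu_p, s_p, mu_q, s_q)
        for mu_p, s_p, mu_q, s_q in zip(label_box, label_sigma, pred_box, pred_sigma)
    )
```

The loss is zero only when the predicted mean and the predicted standard deviation both match the label distribution, so a detector that is confidently wrong on an ambiguous box is penalized more than one that reports a matching uncertainty.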
Learning Shape Priors for Single-View 3D Completion and Reconstruction
The problem of single-view 3D shape completion or reconstruction is
challenging, because among the many possible shapes that explain an
observation, most are implausible and do not correspond to natural objects.
Recent research in the field has tackled this problem by exploiting the
expressiveness of deep convolutional networks. In fact, there is another level
of ambiguity that is often overlooked: among plausible shapes, there are still
multiple shapes that fit the 2D image equally well; i.e., the ground truth
shape is non-deterministic given a single-view input. Existing fully supervised
approaches fail to address this issue, and often produce blurry mean shapes
with smooth surfaces but no fine details.
In this paper, we propose ShapeHD, pushing the limit of single-view shape
completion and reconstruction by integrating deep generative models with
adversarially learned shape priors. The learned priors serve as a regularizer,
penalizing the model only if its output is unrealistic, not if it deviates from
the ground truth. Our design thus overcomes both of the aforementioned levels
of ambiguity. Experiments demonstrate that ShapeHD outperforms the state of the
art by a large margin in both shape completion and shape reconstruction on
multiple real datasets.
Comment: ECCV 2018. The first two authors contributed equally to this work.
Project page: http://shapehd.csail.mit.edu
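The regularizer described in the ShapeHD abstract can be illustrated with a toy loss in which the prior term depends only on the predicted shape and never on the ground truth. Everything in this sketch is an assumption made for illustration: `toy_critic` is a hand-written stand-in for an adversarially learned critic, the supervised term is a plain mean squared error, and the weighted sum is one simple way to combine the two terms.

```python
def toy_critic(voxels):
    """Hypothetical stand-in for a learned critic: a higher score means more realistic.

    It rewards occupancies close to 0 or 1 and penalizes blurry values near 0.5.
    """
    return -sum(min(v, 1.0 - v) for v in voxels) / len(voxels)


def naturalness_penalty(voxels, critic=toy_critic):
    """Penalty from the shape prior; it looks only at the prediction."""
    return -critic(voxels)


def total_loss(pred, target, weight=1.0, critic=toy_critic):
    """Supervised error plus a weighted prior penalty (assumed combination)."""
    supervised = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return supervised + weight * naturalness_penalty(pred, critic)
```

With this toy critic, a crisp prediction incurs no prior penalty even when it disagrees with the target, while a blurry average of several plausible shapes is penalized regardless of how close it is to the target.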