3D Bounding Box Estimation Using Deep Learning and Geometry
We present a method for 3D object detection and pose estimation from a single
image. In contrast to current techniques that only regress the 3D orientation
of an object, our method first regresses relatively stable 3D object properties
using a deep convolutional neural network and then combines these estimates
with geometric constraints provided by a 2D object bounding box to produce a
complete 3D bounding box. The first network output estimates the 3D object
orientation using a novel hybrid discrete-continuous loss, which significantly
outperforms the L2 loss. The second output regresses the 3D object dimensions,
which have relatively little variance compared to alternatives and can often be
predicted for many object types. These estimates, combined with the geometric
constraints on translation imposed by the 2D bounding box, enable us to recover
a stable and accurate 3D object pose. We evaluate our method on the challenging
KITTI object detection benchmark both on the official metric of 3D orientation
estimation and also on the accuracy of the obtained 3D bounding boxes. Although
conceptually simple, our method outperforms more complex and computationally
expensive approaches that leverage semantic segmentation, instance-level
segmentation, flat ground priors, and sub-category detection. Our
discrete-continuous loss also produces state-of-the-art results for 3D
viewpoint estimation on the Pascal 3D+ dataset.
Comment: To appear in IEEE Conference on Computer Vision and Pattern
Recognition (CVPR) 201
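To make the idea of a hybrid discrete-continuous orientation loss concrete, the following is a minimal sketch, not the authors' formulation. It assumes the angle is split into non-overlapping bins, that a classifier picks the bin, and that a regressor predicts the offset from the bin centre. The names `encode_angle`, `decode_angle`, and `hybrid_loss`, the default of 8 bins, and the choice of a cosine-based residual penalty are all illustrative assumptions.

```python
import math


def encode_angle(theta, num_bins=8):
    """Split an angle in radians into (bin index, residual from the bin centre)."""
    two_pi = 2.0 * math.pi
    theta = theta % two_pi  # wrap into one full turn
    bin_width = two_pi / num_bins
    # Clamp guards against floating-point wrap-around landing exactly on 2*pi.
    bin_index = min(int(theta // bin_width), num_bins - 1)
    centre = (bin_index + 0.5) * bin_width
    residual = theta - centre  # within half a bin width of zero
    return bin_index, residual


def decode_angle(bin_index, residual, num_bins=8):
    """Recover the angle from a bin index and a residual."""
    bin_width = 2.0 * math.pi / num_bins
    return ((bin_index + 0.5) * bin_width + residual) % (2.0 * math.pi)


def hybrid_loss(bin_logits, pred_residual, theta, num_bins=8):
    """Discrete part: softmax cross-entropy over bins.
    Continuous part: 1 - cos(residual error), which is zero for a perfect offset.
    (Assumed combination with equal weights, for illustration only.)
    """
    target_bin, target_residual = encode_angle(theta, num_bins)
    m = max(bin_logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in bin_logits))
    cross_entropy = log_z - bin_logits[target_bin]
    residual_term = 1.0 - math.cos(pred_residual - target_residual)
    return cross_entropy + residual_term
```

Under these assumptions, the classification term only has to choose a coarse direction, while the regression term refines it inside a narrow range, which is one plausible reason such a loss could be easier to optimize than a plain L2 loss on the raw angle.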
Frustum PointNets for 3D Object Detection from RGB-D Data
In this work, we study 3D object detection from RGB-D data in both indoor and
outdoor scenes. While previous methods focus on images or 3D voxels, often
obscuring natural 3D patterns and invariances of 3D data, we directly operate
on raw point clouds by popping up RGB-D scans. However, a key challenge of this
approach is how to efficiently localize objects in point clouds of large-scale
scenes (region proposal). Instead of solely relying on 3D proposals, our method
leverages both mature 2D object detectors and advanced 3D deep learning for
object localization, achieving efficiency as well as high recall for even small
objects. Benefiting from learning directly on raw point clouds, our method is
also able to precisely estimate 3D bounding boxes even under strong occlusion
or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection
benchmarks, our method outperforms the state of the art by remarkable margins
while having real-time capability.
Comment: 15 pages, 12 figures, 14 tables
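One way to picture how a 2D detection can narrow the 3D search space is to keep only the points whose image projection falls inside the detected box. The sketch below is an illustration under stated assumptions, not the authors' pipeline: it assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, points already expressed in the camera frame with `z` pointing forward, and a hypothetical helper name `points_in_frustum`.

```python
def points_in_frustum(points, box, fx, fy, cx, cy):
    """Return the 3D points whose pinhole projection lies inside a 2D box.

    points: iterable of (x, y, z) in the camera frame (assumed convention)
    box:    (u_min, v_min, u_max, v_max) in pixels
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:
            continue  # behind the camera, cannot project into the image
        u = fx * x / z + cx  # horizontal pixel coordinate
        v = fy * y / z + cy  # vertical pixel coordinate
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Usage with made-up intrinsics and points:
cloud = [(0.0, 0.0, 5.0), (10.0, 0.0, 5.0), (0.0, 0.0, -1.0)]
selected = points_in_frustum(cloud, (300, 200, 340, 280), 500.0, 500.0, 320.0, 240.0)
```

Any subsequent 3D network would then operate on `selected` rather than on the whole scene, which is where the efficiency gain would come from in this simplified picture.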