Frustum PointNets for 3D Object Detection from RGB-D Data
In this work, we study 3D object detection from RGB-D data in both indoor and
outdoor scenes. While previous methods focus on images or 3D voxels, often
obscuring natural 3D patterns and invariances of 3D data, we directly operate
on raw point clouds by popping up RGB-D scans. However, a key challenge of this
approach is how to efficiently localize objects in point clouds of large-scale
scenes (region proposal). Instead of solely relying on 3D proposals, our method
leverages both mature 2D object detectors and advanced 3D deep learning for
object localization, achieving efficiency as well as high recall for even small
objects. Benefiting from learning directly on raw point clouds, our method is
also able to precisely estimate 3D bounding boxes even under strong occlusion
or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection
benchmarks, our method outperforms the state of the art by remarkable margins
while having real-time capability.
Comment: 15 pages, 12 figures, 14 table
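The abstract above describes using a 2D detector to narrow the 3D search space in a point cloud. As a rough illustration only, the sketch below keeps the points whose image projection falls inside a 2D detection box (the "frustum" of that box). It is not the authors' implementation: the function name `points_in_frustum`, the pinhole intrinsics `K`, and the camera-frame convention (z pointing forward) are all assumptions made for this example.

```python
import numpy as np

def points_in_frustum(points, K, box2d):
    """Return the points whose pinhole projection lies inside a 2D box.

    points : (N, 3) array in camera coordinates, z forward (assumed convention).
    K      : (3, 3) camera intrinsic matrix (assumed pinhole model).
    box2d  : (xmin, ymin, xmax, ymax) in pixels, e.g. from any 2D detector.
    """
    points = np.asarray(points, dtype=float)
    xmin, ymin, xmax, ymax = box2d

    # Only points in front of the camera can project into the image.
    in_front = points[:, 2] > 0

    # Project with the intrinsics; use a dummy depth of 1 for points behind
    # the camera so the division is always well defined (they are masked out).
    uvw = points @ np.asarray(K, dtype=float).T
    depth = np.where(in_front, uvw[:, 2], 1.0)
    u = uvw[:, 0] / depth
    v = uvw[:, 1] / depth

    inside = (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)
    return points[in_front & inside]
```

A 3D network would then operate only on the returned subset rather than on the whole scene, which is where the efficiency gain described in the abstract would come from.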
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
The goal of this paper is to take a single 2D image of a scene and recover
the 3D structure in terms of a small set of factors: a layout representing the
enclosing surfaces as well as a set of objects represented in terms of shape
and pose. We propose a convolutional neural network-based approach to predict
this representation and benchmark it on a large dataset of indoor scenes. Our
experiments evaluate a number of practical design questions, demonstrate that
we can infer this representation, and quantitatively and qualitatively
demonstrate its merits compared to alternate representations.
Comment: Project URL with code: https://shubhtuls.github.io/factored3
Joint 3D Proposal Generation and Object Detection from View Aggregation
We present AVOD, an Aggregate View Object Detection network for autonomous
driving scenarios. The proposed neural network architecture uses LIDAR point
clouds and RGB images to generate features that are shared by two subnetworks:
a region proposal network (RPN) and a second stage detector network. The
proposed RPN uses a novel architecture capable of performing multimodal feature
fusion on high resolution feature maps to generate reliable 3D object proposals
for multiple object classes in road scenes. Using these proposals, the second
stage detection network performs accurate oriented 3D bounding box regression
and category classification to predict the extents, orientation, and
classification of objects in 3D space. Our proposed architecture is shown to
produce state-of-the-art results on the KITTI 3D object detection benchmark
while running in real time with a low memory footprint, making it a suitable
candidate for deployment on autonomous vehicles. Code is at:
https://github.com/kujason/avod
Comment: For any inquiries contact aharakeh(at)uwaterloo(dot)c
3D Object Detection Using Scale Invariant and Feature Reweighting Networks
3D object detection plays an important role in a large number of real-world
applications. It requires us to estimate the locations and orientations
of 3D objects in real scenes. In this paper, we present a new network
architecture which focuses on utilizing the front view images and frustum point
clouds to generate 3D detection results. On the one hand, a PointSIFT module is
used to improve the performance of 3D segmentation. It captures information
from different orientations in space and is robust to shapes of different
scales. On the other hand, a SENet module lets our network retain useful
features and suppress less informative ones. This module reweights channel
features so that the 3D bounding boxes are estimated more effectively. Our
method is evaluated on both the KITTI dataset for outdoor scenes and the
SUN-RGBD dataset for indoor scenes. The experimental results show that our
method outperforms state-of-the-art methods, especially when point clouds are
highly sparse.
Comment: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19
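The channel reweighting mentioned in the abstract above can be illustrated with a generic squeeze-and-excitation step. This is a minimal NumPy sketch of the general SE idea applied to per-point features, not the network from the paper: the function name `se_reweight`, the weight matrices `w1` and `w2`, and the bias-free two-layer gating are assumptions chosen to keep the example short.

```python
import numpy as np

def se_reweight(features, w1, w2):
    """Generic squeeze-and-excitation style channel reweighting (sketch).

    features : (N, C) per-point feature matrix.
    w1       : (C_r, C) weights of the reduction layer (hypothetical, bias-free).
    w2       : (C, C_r) weights of the expansion layer (hypothetical, bias-free).
    Returns the reweighted features and the per-channel gates in (0, 1).
    """
    features = np.asarray(features, dtype=float)

    # Squeeze: summarize each channel by its mean over all points.
    squeezed = features.mean(axis=0)

    # Excite: a small bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))

    # Scale: channels with small gates are suppressed, others are kept.
    return features * gates, gates
```

In a trained network `w1` and `w2` would be learned, so the gates would come to emphasize the channels that help box estimation; here they are plain inputs.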
- …