336 research outputs found

Frustum PointNets for 3D Object Detection from RGB-D Data

Authors: Guibas Leonidas J.; Liu Wei; Qi Charles R.; Su Hao; Wu Chenxia
Publication date: 12/04/2018

In this work, we study 3D object detection from RGB-D data in both indoor and outdoor scenes. While previous methods focus on images or 3D voxels, often obscuring natural 3D patterns and invariances of 3D data, we directly operate on raw point clouds by popping up RGB-D scans. However, a key challenge of this approach is how to efficiently localize objects in point clouds of large-scale scenes (region proposal). Instead of solely relying on 3D proposals, our method leverages both mature 2D object detectors and advanced 3D deep learning for object localization, achieving efficiency as well as high recall for even small objects. Benefited from learning directly in raw point clouds, our method is also able to precisely estimate 3D bounding boxes even under strong occlusion or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection benchmarks, our method outperforms the state of the art by remarkable margins while having real-time capability.

Comment: 15 pages, 12 figures, 14 table
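The abstract above describes using a 2D detection to narrow the 3D search to a frustum of points. A minimal sketch of that idea follows, assuming a pinhole camera and points already expressed in the camera frame; the function name, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sample values are hypothetical and are not taken from the paper.

```python
def points_in_frustum(points, box2d, fx, fy, cx, cy):
    """Keep 3D points (camera frame, z pointing forward) whose pinhole
    projection falls inside the 2D box (u_min, v_min, u_max, v_max).

    Illustrative only: the intrinsics and box format are assumptions.
    """
    u_min, v_min, u_max, v_max = box2d
    kept = []
    for x, y, z in points:
        if z <= 0:
            # Points behind the camera cannot appear in the image.
            continue
        # Pinhole projection onto the image plane.
        u = fx * x / z + cx
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Hypothetical point cloud and 2D detection box.
points = [(0.0, 0.0, 5.0), (2.0, 0.0, 5.0), (0.1, -0.1, 10.0), (0.0, 0.0, -1.0)]
frustum = points_in_frustum(
    points, box2d=(300, 220, 340, 260), fx=500.0, fy=500.0, cx=320.0, cy=240.0
)
print(len(frustum), "of", len(points), "points fall inside the frustum")
```

Only the points selected this way would be passed to a downstream 3D network, which is what keeps the search cheap compared with scanning the whole scene.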

arXiv.org e-Print Archive

Crossref

Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene

Authors: Efros Alexei A.; Fouhey David; Gupta Saurabh; Malik Jitendra; Tulsiani Shubham
Publication date: 24/04/2018

The goal of this paper is to take a single 2D image of a scene and recover the 3D structure in terms of a small set of factors: a layout representing the enclosing surfaces as well as a set of objects represented in terms of shape and pose. We propose a convolutional neural network-based approach to predict this representation and benchmark it on a large dataset of indoor scenes. Our experiments evaluate a number of practical design questions, demonstrate that we can infer this representation, and quantitatively and qualitatively demonstrate its merits compared to alternate representations.

Comment: Project url with code: https://shubhtuls.github.io/factored3

arXiv.org e-Print Archive

Crossref

Joint 3D Proposal Generation and Object Detection from View Aggregation

Authors: Harakeh Ali; Ku Jason; Lee Jungwook; Mozifian Melissa; Waslander Steven
Publication date: 12/07/2018

We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion on high resolution feature maps to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. Our proposed architecture is shown to produce state of the art results on the KITTI 3D object detection benchmark while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles. Code is at: https://github.com/kujason/avod

Comment: For any inquiries contact aharakeh(at)uwaterloo(dot)c

arXiv.org e-Print Archive

Crossref

3D Object Detection Using Scale Invariant and Feature Reweighting Networks

Authors: Hu Ruolan; Huang Kaiqi; Liu Zhe; Zhao Xin
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 08/01/2019

3D object detection plays an important role in a large number of real-world applications. It requires estimating the locations and orientations of 3D objects in real scenes. In this paper, we present a new network architecture that uses front-view images and frustum point clouds to generate 3D detection results. On the one hand, a PointSIFT module is used to improve the performance of 3D segmentation; it captures information from different orientations in space and is robust to shapes of different scales. On the other hand, a SENet module lets our network retain useful features and suppress less informative ones. This module reweights channel features so that 3D bounding boxes are estimated more effectively. Our method is evaluated on both the KITTI dataset for outdoor scenes and the SUN-RGBD dataset for indoor scenes. The experimental results show that our method outperforms state-of-the-art methods, especially when point clouds are highly sparse.

Comment: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19
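The channel reweighting attributed to the SENet module above can be illustrated with a generic squeeze-and-excitation step. This is a sketch under stated assumptions, not the authors' network: the function name, the bias-free weight matrices `w1` and `w2`, and the toy feature values are hypothetical, and how the paper wires this step into its pipeline is not stated in the abstract.

```python
import math


def se_reweight(features, w1, w2):
    """Generic squeeze-and-excitation over per-point features.

    features: N x C list of per-point channel features.
    w1: C x R bottleneck weights; w2: R x C expansion weights.
    Both are hypothetical and bias-free to keep the sketch short.
    """
    n = len(features)
    c = len(features[0])
    r = len(w1[0])
    # Squeeze: average each channel over all points.
    squeezed = [sum(row[j] for row in features) / n for j in range(c)]
    # Excitation, part 1: bottleneck fully connected layer with ReLU.
    hidden = [
        max(0.0, sum(squeezed[j] * w1[j][k] for j in range(c))) for k in range(r)
    ]
    # Excitation, part 2: expand back to C channels, sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(hidden[k] * w2[k][j] for k in range(r))))
        for j in range(c)
    ]
    # Scale: multiply every point's channel j by its gate.
    scaled = [[row[j] * gates[j] for j in range(c)] for row in features]
    return scaled, gates


# Toy input: 2 points, 2 channels, bottleneck width 1.
features = [[1.0, 2.0], [3.0, 4.0]]
w1 = [[1.0], [0.0]]
w2 = [[1.0, -1.0]]
scaled, gates = se_reweight(features, w1, w2)
print("gates:", gates)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so the first channel is emphasized and the second is suppressed.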

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications