3D object detection is receiving increasing attention from both industry and
academia thanks to its wide applications in various fields. In this paper, we
propose Point-Voxel Region-based Convolutional Neural Networks (PV-RCNNs) for 3D
object detection on point clouds. First, we propose a novel 3D detector,
PV-RCNN, which boosts the 3D detection performance by deeply integrating the
feature learning of both point-based set abstraction and voxel-based sparse
convolution through two novel steps, i.e., the voxel-to-keypoint scene encoding
and the keypoint-to-grid RoI feature abstraction. Second, we propose an
advanced framework, PV-RCNN++, for more efficient and accurate 3D object
detection. It consists of two major improvements: sectorized proposal-centric
sampling for efficiently producing more representative keypoints, and
VectorPool aggregation for better aggregating local point features with much
less resource consumption. With these two strategies, our PV-RCNN++ is about
3× faster than PV-RCNN, while also achieving better performance. The
experiments demonstrate that our proposed PV-RCNN++ framework achieves
state-of-the-art 3D detection performance on the large-scale and
highly competitive Waymo Open Dataset, at an inference speed of 10 FPS over a
detection range of 150m × 150m.

Comment: Accepted by International Journal of Computer Vision (IJCV), code is
available at https://github.com/open-mmlab/OpenPCDe