3D Object Detection for Autonomous Driving: A Survey
Autonomous driving is regarded as one of the most promising remedies to
shield human beings from severe crashes. To this end, 3D object detection
serves as the core basis of such a perception system, especially for the sake
of path planning, motion prediction, and collision avoidance. Generally,
stereo or monocular images with corresponding 3D point clouds are the standard
input layout for 3D object detection, among which point clouds are
increasingly prevalent since they provide accurate depth information. Despite
existing efforts, 3D object detection on point clouds is still in its infancy
due to the inherent sparsity and irregularity of point clouds, the
misalignment between the camera view and the LiDAR bird's-eye view that
complicates modality synergy, occlusions and scale variations at long
distances, etc. Recently, profound progress has been made in 3D object
detection, with a large body of literature investigating this vision task. As
such, we present a comprehensive review of the latest progress in this field,
covering all the main topics including sensors, fundamentals, and recent
state-of-the-art detection methods with their pros and cons. Furthermore, we
introduce metrics and provide quantitative comparisons on popular public
datasets. Avenues for future work are identified after an in-depth analysis
of the surveyed works. Finally, we conclude the paper.
Comment: 3D object detection, Autonomous driving, Point cloud
LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
Semantic map construction under bird's-eye view (BEV) plays an essential role
in autonomous driving. In contrast to camera images, LiDAR inherently
provides accurate 3D observations for projecting the captured 3D features
onto BEV space. However, the vanilla LiDAR-based BEV feature often contains
considerable noise, as the spatial features carry few texture and semantic
cues. In this paper, we propose an effective LiDAR-based method to build
semantic maps. Specifically, we introduce a BEV feature pyramid decoder that
learns robust multi-scale BEV features for semantic map construction, which
greatly boosts the accuracy of the LiDAR-based method. To mitigate the
defects caused by the lack of semantic cues in LiDAR data, we present an
online Camera-to-LiDAR distillation scheme to facilitate semantic learning
from image to point cloud. Our distillation scheme consists of feature-level
and logit-level distillation to absorb semantic information from the camera
in BEV. Experimental results on the challenging nuScenes dataset demonstrate
the efficacy of our proposed LiDAR2Map for semantic map construction: it
significantly outperforms previous LiDAR-based methods by 27.9% mIoU and even
performs better than state-of-the-art camera-based approaches. Source code is
available at: https://github.com/songw-zju/LiDAR2Map
Comment: Accepted by CVPR202
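The abstract names two distillation terms, feature-level and logit-level, but not their exact form. A minimal NumPy sketch of such a two-part camera-to-LiDAR loss is shown below; the MSE/KL choice, tensor shapes, and temperature are illustrative assumptions, not LiDAR2Map's actual implementation:

```python
import numpy as np

def feature_distillation_loss(student_feat, teacher_feat):
    """Feature-level distillation: mean-squared error between the
    student (LiDAR) and teacher (camera) BEV feature maps."""
    return float(np.mean((student_feat - teacher_feat) ** 2))

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Logit-level distillation: KL divergence between temperature-softened
    class distributions, averaged over all BEV cells."""
    def softmax(x, t):
        z = x / t
        z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-8)
                             - np.log(p_student + 1e-8)), axis=-1)
    # Standard T^2 rescaling keeps gradient magnitudes comparable.
    return float(np.mean(kl)) * temperature ** 2

# Toy BEV tensors: (H, W, C) features and (H, W, num_classes) logits.
rng = np.random.default_rng(0)
cam_feat = rng.normal(size=(8, 8, 16))    # teacher (camera) BEV features
lidar_feat = rng.normal(size=(8, 8, 16))  # student (LiDAR) BEV features
cam_logits = rng.normal(size=(8, 8, 4))
lidar_logits = rng.normal(size=(8, 8, 4))

total = (feature_distillation_loss(lidar_feat, cam_feat)
         + logit_distillation_loss(lidar_logits, cam_logits))
print(round(total, 4))
```

In training, a loss like this would be added to the ordinary segmentation loss on the LiDAR branch, with the camera branch detached so it acts purely as a teacher.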