Accurate 3D object detection (3DOD) is crucial for safe navigation of complex
environments by autonomous robots. Regressing accurate 3D bounding boxes in
cluttered environments based on sparse LiDAR data is, however, a highly
challenging problem. We address this task by exploring recent advances in
conditional energy-based models (EBMs) for probabilistic regression. While
methods employing EBMs for regression have demonstrated impressive performance
on 2D object detection in images, these techniques are not directly applicable
to 3D bounding boxes. In this work, we therefore design a differentiable
pooling operator for 3D bounding boxes, serving as the core module of our EBM
network. We further integrate this general approach into the state-of-the-art
3D object detector SA-SSD. On the KITTI dataset, our proposed approach
consistently outperforms the SA-SSD baseline across all 3DOD metrics,
demonstrating the potential of EBM-based regression for highly accurate 3DOD.
Code is available at https://github.com/fregu856/ebms_3dod.