Reinforced Axial Refinement Network for Monocular 3D Object Detection
Monocular 3D object detection aims to extract the 3D position and properties
of objects from a 2D input image. This is an ill-posed problem, and a major
difficulty is the information loss caused by depth-agnostic cameras.
Conventional approaches sample 3D bounding boxes from the space and infer the
relationship between the target object and each of them. However, the
probability of drawing an effective sample is relatively small in 3D space. To
improve the efficiency of sampling, we propose to start with an initial
prediction and refine it gradually towards the ground truth, with only one 3D
parameter changed in each step. This requires designing a policy which gets a
reward after several steps, and thus we adopt reinforcement learning to
optimize it. The proposed framework, Reinforced Axial Refinement Network
(RAR-Net), serves as a post-processing stage which can be freely integrated
into existing monocular 3D detection methods, and improves performance on
the KITTI dataset at a small extra computational cost.
Comment: Accepted by ECCV 202
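
The refinement loop described in the abstract (start from an initial box, change exactly one 3D parameter per step) can be sketched as follows. This is a minimal illustration under stated assumptions, not RAR-Net itself: the 7-parameter box layout, the fixed step size, and every function name are hypothetical, and the learned reinforcement-learning policy is replaced by a stand-in "oracle" that peeks at the target box, which a real detector cannot do at test time.

```python
from typing import Callable, List, Optional, Tuple

# Assumed box layout for illustration: centre (x, y, z), size (h, w, l), yaw.
PARAMS = ("x", "y", "z", "h", "w", "l", "theta")
Box = List[float]
Action = Tuple[int, float]  # (index of the single parameter to move, signed step)


def make_actions(step: float) -> List[Action]:
    """Enumerate axial actions: each one moves exactly one parameter by +/- step."""
    return [(i, s) for i in range(len(PARAMS)) for s in (step, -step)]


def apply_action(box: Box, action: Action) -> Box:
    """Return a copy of the box with one parameter shifted."""
    index, delta = action
    new_box = list(box)
    new_box[index] += delta
    return new_box


def l1_error(box: Box, target: Box) -> float:
    """Sum of absolute parameter differences between two boxes."""
    return sum(abs(b - t) for b, t in zip(box, target))


def oracle_policy(target: Box, step: float) -> Callable[[Box], Optional[Action]]:
    """Stand-in for a learned policy: greedily pick the action that most reduces
    the L1 error to a known target, or None when no action helps."""
    actions = make_actions(step)

    def choose(box: Box) -> Optional[Action]:
        best = min(actions, key=lambda a: l1_error(apply_action(box, a), target))
        if l1_error(apply_action(box, best), target) < l1_error(box, target):
            return best
        return None  # acts as a "stop" decision

    return choose


def refine(box: Box, policy: Callable[[Box], Optional[Action]],
           max_steps: int = 50) -> List[Box]:
    """Apply one axial action per step; return every intermediate box."""
    trajectory = [list(box)]
    for _ in range(max_steps):
        action = policy(trajectory[-1])
        if action is None:
            break
        trajectory.append(apply_action(trajectory[-1], action))
    return trajectory
```

Calling `refine(initial_box, oracle_policy(target_box, 0.5))` returns the list of intermediate boxes, where consecutive boxes differ in a single parameter. In the method the abstract describes, the choice of action would instead come from a policy trained with a reward that arrives after several steps.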