LPFormer: LiDAR Pose Estimation Transformer with Multi-Task Network

Abstract

In this technical report, we present the 1st place solution for the 2023 Waymo Open Dataset Pose Estimation challenge. Due to the difficulty of acquiring large-scale 3D human keypoint annotation, previous methods have commonly relied on 2D image features and 2D sequential annotations for 3D human pose estimation. In contrast, our proposed method, named LPFormer, uses only LiDAR as its input along with its corresponding 3D annotations. LPFormer consists of two stages: the first stage detects the human bounding box and extracts multi-level feature representations, while the second stage employs a transformer-based network to regress the human keypoints using these features. Experimental results on the Waymo Open Dataset demonstrate the top performance, and improvements even compared to previous multi-modal solutions.Comment: Technical report of the top solution for the Waymo Open Dataset Challenges 2023 - Pose Estimation. CVPR 2023 Workshop on Autonomous Drivin

    Similar works

    Full text

    thumbnail-image

    Available Versions