Existing autonomous driving pipelines separate the perception module from the
prediction module. The two modules communicate via hand-picked features, such
as agent boxes and trajectories, that serve as their interface. Due to this
separation, the prediction module receives only partial information from the
perception module. Even worse, errors from the perception module can propagate
and accumulate,
adversely affecting the prediction results. In this work, we propose ViP3D, a
visual trajectory prediction pipeline that leverages the rich information from
raw videos to predict future trajectories of agents in a scene. ViP3D employs
sparse agent queries throughout the pipeline, making it fully differentiable
and interpretable. Furthermore, we propose an evaluation metric for this novel
end-to-end visual trajectory prediction task. Extensive experimental results on
the nuScenes dataset show the strong performance of ViP3D over traditional
pipelines and previous end-to-end models.

Comment: Project page is at https://tsinghua-mars-lab.github.io/ViP3
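Although the abstract does not detail the architecture, a minimal sketch can illustrate how sparse agent queries make such a pipeline differentiable end to end: the same learned queries attend to raw visual features in the perception stage and are decoded into future trajectories in the prediction stage, with no hand-picked boxes or trajectories as an intermediate interface. All names and dimensions below (AgentQueryPipeline, num_queries, the single cross-attention layer) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a query-based end-to-end prediction pipeline.
import torch
import torch.nn as nn

class AgentQueryPipeline(nn.Module):
    """Sparse agent queries carry per-agent state from perception to
    prediction, so gradients flow through both stages."""

    def __init__(self, num_queries=32, d_model=256, horizon=12):
        super().__init__()
        # Learned agent queries shared across the whole pipeline.
        self.queries = nn.Parameter(torch.randn(num_queries, d_model))
        # Perception stage: queries attend to visual features (cross-attention).
        self.perception = nn.MultiheadAttention(d_model, num_heads=8,
                                                batch_first=True)
        # Prediction stage: decode each refined query into future waypoints.
        self.predictor = nn.Linear(d_model, horizon * 2)  # (x, y) per step
        self.horizon = horizon

    def forward(self, image_feats):
        # image_feats: (batch, num_tokens, d_model) from a video backbone.
        b = image_feats.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)
        # Each query aggregates evidence about one (potential) agent.
        q, _ = self.perception(q, image_feats, image_feats)
        # Decode trajectories directly from agent queries: no hand-picked
        # boxes or trajectories in between.
        traj = self.predictor(q).view(b, q.size(1), self.horizon, 2)
        return traj

feats = torch.randn(2, 900, 256)      # stand-in backbone features
trajs = AgentQueryPipeline()(feats)   # (2, 32, 12, 2) future (x, y)
print(trajs.shape)
```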
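Evaluating end-to-end visual prediction also differs from standard trajectory benchmarks: the set of agents is itself predicted, so an evaluation must first match predicted agents to ground-truth agents and account for false positives. The sketch below shows one plausible form of such a metric; the matching radius, penalty term, and function name are assumptions for illustration, not necessarily the metric proposed in the paper.

```python
# Illustrative end-to-end prediction metric (not the paper's definition).
import numpy as np
from scipy.optimize import linear_sum_assignment

def end_to_end_prediction_error(pred, gt, match_radius=2.0, fp_penalty=1.0):
    """pred: (P, T, 2) predicted trajectories; gt: (G, T, 2) ground truth.
    Returns final-displacement error summed over matched agents plus a
    penalty for unmatched (false-positive) predictions, averaged over gt."""
    # Match on current position (t=0) with the Hungarian algorithm.
    cost = np.linalg.norm(pred[:, None, 0] - gt[None, :, 0], axis=-1)  # (P, G)
    rows, cols = linear_sum_assignment(cost)
    keep = cost[rows, cols] < match_radius   # only nearby pairs count as hits
    rows, cols = rows[keep], cols[keep]
    # Final displacement error on matched pairs.
    fde = np.linalg.norm(pred[rows, -1] - gt[cols, -1], axis=-1)
    n_fp = len(pred) - len(rows)             # unmatched predictions
    return (fde.sum() + fp_penalty * n_fp) / max(len(gt), 1)

pred = np.random.randn(5, 12, 2)
gt = np.random.randn(4, 12, 2)
print(end_to_end_prediction_error(pred, gt))
```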