PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Fu, Changhong; Huang, Ziyuan; Li, Bowen; Li, Yiming; Scherer, Sebastian; Ye, Junjie; Zhao, Hang

Authors: Changhong Fu, Ziyuan Huang, Bowen Li, Yiming Li, Sebastian Scherer, Junjie Ye, Hang Zhao
Publication date: 21 November 2022

Abstract

Visual object tracking is an essential capability of intelligent robots. Most existing approaches ignore online latency, which can cause severe performance degradation during real-world processing. For unmanned aerial vehicles in particular, where robust tracking is more challenging and onboard computation is limited, latency can be fatal. In this work, we present a simple framework for end-to-end latency-aware tracking, i.e., end-to-end predictive visual tracking (PVT++). PVT++ can turn most leading-edge trackers into predictive trackers by appending an online predictor. Unlike existing solutions that use model-based approaches, our framework is learnable, so it can take motion information, visual cues, or a combination of both as input. Moreover, since PVT++ is end-to-end optimizable, joint training can further boost latency-aware tracking performance. Additionally, this work presents an extended latency-aware evaluation benchmark for assessing an any-speed tracker in the online setting. Empirical results on a robotic platform from an aerial perspective show that PVT++ can achieve up to 60% performance gain on various trackers and exhibits better robustness than the prior model-based solution, largely mitigating the degradation brought by latency. Code and models will be made public.

Comment: 18 pages, 10 figures

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2211.11629

Last time updated on 24/12/2022