Robust Visual Tracking Using the Bidirectional Scale Estimation
Object tracking with robust scale estimation is a challenging task in computer vision. This paper presents a novel tracking algorithm that learns the translation and scale filters with a complementary scheme. The translation filter is constructed using ridge regression and multidimensional features. A robust scale filter is constructed by bidirectional scale estimation, combining a forward scale and a backward scale. First, the scale filter is learned from the forward tracking information, and the forward and backward scales are estimated with the respective scale filters. Second, a conservative strategy is adopted to compromise between the forward and backward scales. Finally, the scale filter is updated based on the final scale estimate; this update is effective because a stable scale estimate improves the performance of the scale filter. To demonstrate the effectiveness of our tracker, experiments are performed on 32 sequences with significant scale variation and on the benchmark dataset with 50 challenging videos. The results show that the proposed tracker outperforms several state-of-the-art trackers in terms of robustness and accuracy.
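The abstract does not spell out the compromise rule between the forward and backward scales, so the Python sketch below only illustrates one plausible conservative rule. The function names, the agreement threshold, and the linear filter update are assumptions for illustration, not the paper's actual method.

```python
def conservative_scale(forward_scale, backward_scale, agreement_thresh=0.1):
    """Hypothetical conservative compromise between the two scale estimates.

    If the forward and backward estimates agree within a relative threshold,
    average them; otherwise fall back to a scale factor of 1.0, i.e. keep the
    previous target size (the more conservative choice).
    """
    rel_diff = abs(forward_scale - backward_scale) / max(forward_scale, backward_scale)
    if rel_diff < agreement_thresh:
        return 0.5 * (forward_scale + backward_scale)
    return 1.0


def update_scale_filter(scale_filter, new_response, learning_rate=0.02):
    """Linearly interpolate the scale filter toward the newly learned response
    (a scalar stands in for the real filter here)."""
    return (1.0 - learning_rate) * scale_filter + learning_rate * new_response


if __name__ == "__main__":
    # Toy example: forward and backward scale estimates for one frame.
    s_forward, s_backward = 1.08, 1.05
    s_final = conservative_scale(s_forward, s_backward)
    print(f"final scale factor: {s_final:.3f}")

    # Update the (stand-in) scale filter with the stable final estimate.
    filt = update_scale_filter(1.0, s_final)
    print(f"updated filter value: {filt:.3f}")
```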
Bi-Mapper: Holistic BEV Semantic Mapping for Autonomous Driving
A semantic map of the road scene, covering fundamental road elements, is an essential ingredient in autonomous driving systems. It provides important perception foundations for positioning and planning when rendered in the Bird's-Eye-View (BEV). Currently, prior knowledge of hypothetical depth can guide the learning of translating front perspective views directly into BEV with the help of calibration parameters. However, this approach suffers from geometric distortions in the representation of distant objects. Another stream of methods, without prior knowledge, learns the transformation between front perspective views and BEV implicitly with a global view. Considering that fusing these different learning schemes may be mutually beneficial, we propose a Bi-Mapper framework for top-down road-scene semantic understanding, which incorporates a global view and local prior knowledge. To enable reliable interaction between them, an asynchronous mutual learning strategy is proposed. In addition, an Across-Space Loss (ASL) is designed to mitigate the negative impact of geometric distortions. Extensive results on the nuScenes and Cam2BEV datasets verify the consistent effectiveness of each module in the proposed Bi-Mapper framework. Compared with existing road-mapping networks, Bi-Mapper achieves 2.1% higher IoU on the nuScenes dataset. Moreover, we verify the generalization performance of Bi-Mapper in a real-world driving scenario. The source code is publicly available at https://github.com/lynn-yu/Bi-Mapper.
Comment: Accepted to IEEE Robotics and Automation Letters (RA-L).
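As a rough illustration of the dual-branch idea, the PyTorch sketch below pairs a prior-knowledge branch with a global-view branch and couples them with a symmetric mutual-learning loss. The module names, feature shapes, loss weighting, and the fixed (rather than asynchronous) coupling schedule are all assumptions for illustration and do not reproduce the actual Bi-Mapper architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DualBranchBEV(nn.Module):
    """Toy two-branch BEV mapper: one branch stands in for the geometry-prior
    (IPM-style) view transform, the other for a learned global-view transform.
    Both are plain conv stacks here; the real branches are far more elaborate."""

    def __init__(self, in_ch=64, num_classes=4):
        super().__init__()
        self.prior_branch = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1),
        )
        self.global_branch = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, num_classes, 1),
        )
        self.fuse = nn.Conv2d(2 * num_classes, num_classes, 1)

    def forward(self, bev_feat):
        p = self.prior_branch(bev_feat)    # prior-knowledge branch logits
        g = self.global_branch(bev_feat)   # global-view branch logits
        fused = self.fuse(torch.cat([p, g], dim=1))
        return p, g, fused


def mutual_learning_loss(p_logits, g_logits, weight_p=1.0, weight_g=1.0):
    """Symmetric KL divergence between the two branches' predictions.
    An asynchronous schedule would vary weight_p / weight_g over training;
    they are fixed constants in this sketch."""
    p_log = F.log_softmax(p_logits, dim=1)
    g_log = F.log_softmax(g_logits, dim=1)
    kl_pg = F.kl_div(p_log, g_log.exp(), reduction="batchmean")
    kl_gp = F.kl_div(g_log, p_log.exp(), reduction="batchmean")
    return weight_p * kl_pg + weight_g * kl_gp


if __name__ == "__main__":
    model = DualBranchBEV()
    bev_feat = torch.randn(2, 64, 100, 100)      # assumed BEV feature grid
    labels = torch.randint(0, 4, (2, 100, 100))  # toy semantic labels
    p, g, fused = model(bev_feat)
    seg_loss = F.cross_entropy(fused, labels)
    ml_loss = mutual_learning_loss(p, g)
    total = seg_loss + 0.1 * ml_loss             # assumed loss weighting
    print(total.item())
```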