1,163 research outputs found
Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism
In this paper, we propose a CNN-based framework for online MOT. This
framework utilizes the merits of single object trackers in adapting appearance
models and searching for target in the next frame. Simply applying single
object tracker for MOT will encounter the problem in computational efficiency
and drifted results caused by occlusion. Our framework achieves computational
efficiency by sharing features and using ROI-Pooling to obtain individual
features for each target. Some online learned target-specific CNN layers are
used for adapting the appearance model for each target. In the framework, we
introduce spatial-temporal attention mechanism (STAM) to handle the drift
caused by occlusion and interaction among targets. The visibility map of the
target is learned and used for inferring the spatial attention map. The spatial
attention map is then applied to weight the features. Besides, the occlusion
status can be estimated from the visibility map, which controls the online
updating process via weighted loss on training samples with different occlusion
statuses in different frames. It can be considered as temporal attention
mechanism. The proposed algorithm achieves 34.3% and 46.0% in MOTA on
challenging MOT15 and MOT16 benchmark dataset respectively.Comment: Accepted at International Conference on Computer Vision (ICCV) 201
- …