Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism
In this paper, we propose a CNN-based framework for online MOT. This
framework utilizes the merits of single object trackers in adapting appearance
models and searching for the target in the next frame. Simply applying a single
object tracker to MOT leads to poor computational efficiency
and to drift caused by occlusion. Our framework achieves computational
efficiency by sharing features and using ROI-Pooling to obtain individual
features for each target. Some online learned target-specific CNN layers are
used for adapting the appearance model for each target. In the framework, we
introduce a spatial-temporal attention mechanism (STAM) to handle the drift
caused by occlusion and interaction among targets. The visibility map of the
target is learned and used for inferring the spatial attention map. The spatial
attention map is then applied to weight the features. Besides, the occlusion
status can be estimated from the visibility map, which controls the online
updating process via weighted loss on training samples with different occlusion
statuses in different frames. This can be regarded as a temporal attention
mechanism. The proposed algorithm achieves 34.3% and 46.0% in MOTA on the
challenging MOT15 and MOT16 benchmark datasets, respectively.
Comment: Accepted at International Conference on Computer Vision (ICCV) 201
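The attention steps described in the abstract above can be illustrated with a minimal sketch. It assumes the visibility map is a small 2D grid of values in [0, 1]; the function names, the sum-to-one normalisation, and the use of mean visibility as the occlusion score are illustrative assumptions, not the paper's actual network or loss.

```python
def spatial_attention(visibility):
    """Normalise a visibility map (values in [0, 1]) so the weights sum to 1.

    Assumption: simple sum normalisation; the paper learns this mapping.
    """
    total = sum(sum(row) for row in visibility)
    if total == 0:
        # Fully occluded target: fall back to uniform attention.
        n = sum(len(row) for row in visibility)
        return [[1.0 / n for _ in row] for row in visibility]
    return [[v / total for v in row] for row in visibility]


def weight_features(features, attention):
    """Multiply each feature cell by its spatial attention weight."""
    return [
        [f * a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attention)
    ]


def occlusion_status(visibility):
    """Mean visibility as a scalar score (1.0 = visible, 0.0 = occluded)."""
    cells = [v for row in visibility for v in row]
    return sum(cells) / len(cells)


def weighted_loss(sample_losses, visibility_scores):
    """Average per-sample losses, down-weighting samples from occluded frames.

    This plays the role of the temporal attention: heavily occluded
    samples contribute less to the online update.
    """
    total_weight = sum(visibility_scores)
    if total_weight == 0:
        return 0.0
    weighted = sum(l * w for l, w in zip(sample_losses, visibility_scores))
    return weighted / total_weight


visibility = [[1.0, 0.0], [1.0, 0.0]]  # right half of the target is occluded
attention = spatial_attention(visibility)
features = weight_features([[2.0, 4.0], [6.0, 8.0]], attention)
```

In this toy case the occluded right-hand column receives zero attention, so its features drop out of the weighted map, and a training sample with a visibility score of zero has no influence on the loss.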
Tracking by Prediction: A Deep Generative Model for Multi-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over-reliance
on the use of appearance models for target re-identification and almost no
approaches employ a complete deep learning solution for both objectives. We
present a novel, complete deep learning framework for multi-person localisation
and tracking. In this context we first introduce a lightweight sequential
Generative Adversarial Network architecture for person localisation, which
overcomes issues related to occlusions and noisy detections, typically found in
a multi-person environment. In the proposed tracking framework we build upon
recent advances in pedestrian trajectory prediction approaches and propose a
novel data association scheme based on predicted trajectories. This removes the
need for computationally expensive person re-identification systems based on
appearance features and generates human-like trajectories with minimal
fragmentation. The proposed method is evaluated on multiple public benchmarks
including both static and dynamic cameras and is capable of generating
outstanding performance, especially among other recently proposed deep neural
network based approaches.
Comment: To appear in IEEE Winter Conference on Applications of Computer Vision (WACV), 201
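The idea of associating detections with predicted trajectories instead of appearance features can be sketched as follows. The abstract does not specify the association scheme, so this is only an assumption-laden illustration: `predict_next` is a constant-velocity stand-in for the learned trajectory predictor, and `associate` uses greedy nearest-neighbour matching with a hypothetical gating distance `max_dist`.

```python
import math


def predict_next(history):
    """Constant-velocity stand-in for a learned trajectory predictor.

    history: list of (x, y) positions, oldest first, with at least two entries.
    """
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def associate(predicted, detections, max_dist=50.0):
    """Greedily match predicted track positions to detections by distance.

    predicted:  dict mapping track id -> predicted (x, y)
    detections: list of detected (x, y) positions
    Returns a dict mapping track id -> index of the matched detection.
    Pairs farther apart than max_dist are never matched.
    """
    candidates = []
    for track_id, position in predicted.items():
        for index, detection in enumerate(detections):
            distance = math.dist(position, detection)
            if distance <= max_dist:
                candidates.append((distance, track_id, index))
    candidates.sort(key=lambda item: item[0])  # closest pairs first

    matches, used = {}, set()
    for _, track_id, index in candidates:
        if track_id in matches or index in used:
            continue
        matches[track_id] = index
        used.add(index)
    return matches


predicted = {1: (0.0, 0.0), 2: (10.0, 10.0)}
detections = [(9.0, 11.0), (1.0, 0.0), (100.0, 100.0)]
matches = associate(predicted, detections)
```

Here track 1 is matched to the detection at index 1 and track 2 to the detection at index 0, while the far-away detection at index 2 stays unmatched and could start a new track. No appearance features are compared at any point.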
Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers
Online Multi-Object Tracking (MOT) from videos is a challenging computer
vision task which has been extensively studied for decades. Most of the
existing MOT algorithms are based on the Tracking-by-Detection (TBD) paradigm
combined with popular machine learning approaches which largely reduce the
human effort to tune algorithm parameters. However, the commonly used
supervised learning approaches require labeled data (e.g., bounding boxes),
which is expensive to obtain for videos. Also, the TBD framework is usually suboptimal
since it is not end-to-end, i.e., it treats detection and tracking as
separate tasks rather than solving them jointly. To achieve both label-free and end-to-end learning
of MOT, we propose a Tracking-by-Animation framework, where a differentiable
neural model first tracks objects from input frames and then animates these
objects into reconstructed frames. Learning is then driven by the
reconstruction error through backpropagation. We further propose a
Reprioritized Attentive Tracking to improve the robustness of data association.
Experiments conducted on both synthetic and real video datasets show the
potential of the proposed model. Our project page is publicly available at:
https://github.com/zhen-he/tracking-by-animation
Comment: CVPR 201
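The track-then-animate loop described above can be illustrated with a toy reconstruction error. This is a sketch under strong assumptions: objects are axis-aligned rectangles given as hypothetical `(top, left, height, width, intensity)` tuples, and the renderer below is a plain, non-differentiable stand-in. The paper's model uses a differentiable neural renderer so that the error can be backpropagated, which this example does not attempt.

```python
def render(objects, height, width):
    """Draw rectangular objects onto a blank frame (the 'animation' step).

    objects: list of (top, left, h, w, intensity) tuples (hypothetical format).
    """
    frame = [[0.0] * width for _ in range(height)]
    for top, left, h, w, intensity in objects:
        for r in range(max(top, 0), min(top + h, height)):
            for c in range(max(left, 0), min(left + w, width)):
                frame[r][c] = intensity
    return frame


def reconstruction_error(frame, reconstruction):
    """Mean squared error between an input frame and its reconstruction."""
    n = len(frame) * len(frame[0])
    squared = sum(
        (a - b) ** 2
        for f_row, r_row in zip(frame, reconstruction)
        for a, b in zip(f_row, r_row)
    )
    return squared / n


# The observed frame contains one bright 2x2 object in the top-left corner.
observed = render([(0, 0, 2, 2, 1.0)], 4, 4)

# A correct tracker state reconstructs the frame exactly ...
good_error = reconstruction_error(observed, render([(0, 0, 2, 2, 1.0)], 4, 4))
# ... while a wrong position estimate yields a large error.
bad_error = reconstruction_error(observed, render([(2, 2, 2, 2, 1.0)], 4, 4))
```

Because the error depends only on the input frame and its reconstruction, no bounding-box labels are needed; a lower error signals that the tracker state explains the frame better.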
Multiperson Tracking by Online Learned Grouping Model With Nonlinear Motion Context
Data association and occlusion handling for vision-based people tracking by mobile robots
This paper presents an approach for tracking multiple persons on a mobile robot with a combination of colour and thermal vision sensors, using several new techniques. First, an adaptive colour model is incorporated into the measurement model of the tracker. Second, a new approach for detecting occlusions is introduced, using a machine learning classifier for pairwise comparison of persons (classifying which one is in front of the other). Third, explicit occlusion handling is incorporated into the tracker. The paper presents a comprehensive, quantitative evaluation of the whole system and its different components using several real-world data sets.