Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism
In this paper, we propose a CNN-based framework for online MOT. This
framework utilizes the merits of single object trackers in adapting appearance
models and searching for the target in the next frame. Simply applying a single
object tracker to MOT runs into problems of computational efficiency
and drift caused by occlusion. Our framework achieves computational
efficiency by sharing features and using ROI-Pooling to obtain individual
features for each target. Some online learned target-specific CNN layers are
used for adapting the appearance model for each target. In the framework, we
introduce a spatial-temporal attention mechanism (STAM) to handle the drift
caused by occlusion and interaction among targets. The visibility map of the
target is learned and used for inferring the spatial attention map. The spatial
attention map is then applied to weight the features. Besides, the occlusion
status can be estimated from the visibility map, which controls the online
updating process via weighted loss on training samples with different occlusion
statuses in different frames. This can be regarded as a temporal attention
mechanism. The proposed algorithm achieves 34.3% and 46.0% MOTA on the
challenging MOT15 and MOT16 benchmark datasets, respectively.
Comment: Accepted at International Conference on Computer Vision (ICCV) 201
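The attention idea in this abstract can be illustrated with a small sketch. It is not the paper's network: plain normalization stands in for the learned layers that infer the spatial attention map, and mean visibility stands in for the occlusion status that weights training samples. The function names `spatial_attention`, `weight_features` and `sample_weight` are hypothetical.

```python
def spatial_attention(visibility):
    """Turn a visibility map (values in [0, 1]) into weights that sum to 1.

    Assumption: plain normalization replaces the learned attention layers.
    """
    total = sum(sum(row) for row in visibility)
    cells = sum(len(row) for row in visibility)
    if total == 0:
        # Fully occluded target: fall back to uniform attention.
        return [[1.0 / cells for _ in row] for row in visibility]
    return [[v / total for v in row] for row in visibility]


def weight_features(features, attention):
    """Element-wise product of a single-channel feature map and the attention map."""
    return [[f * a for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(features, attention)]


def sample_weight(visibility):
    """Temporal attention stand-in: mean visibility used as a loss weight."""
    cells = sum(len(row) for row in visibility)
    return sum(sum(row) for row in visibility) / cells


visibility = [[1.0, 0.0], [1.0, 0.0]]  # right half of the target is occluded
features = [[2.0, 4.0], [6.0, 8.0]]
attention = spatial_attention(visibility)
weighted = weight_features(features, attention)
loss_weight = sample_weight(visibility)
```

In this toy case the occluded column contributes nothing to the weighted features, and the half-visible sample would count half as much during an online update.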
Who is where? Matching people in video to wearable acceleration during crowded mingling events
Conference
We address the challenging problem of associating acceleration
data from a wearable sensor with the corresponding
spatio-temporal region of a person in video during crowded
mingling scenarios. This is an important first step for
multi-sensor behavior analysis using these two modalities. Clearly,
as the number of people in a scene increases, there is also
a need to robustly and automatically associate a region of
the video with each person's device. We propose a hierarchical
association approach which exploits the spatial context
of the scene, significantly outperforming the state-of-the-art
approaches. Moreover, we present experiments on matching
from 3 to more than 130 acceleration and video streams,
which, to our knowledge, is significantly larger than prior
works where only up to 5 device streams are associated.
Improving Global Multi-target Tracking with Local Updates
Conference dates: September 6-7 & 12, 2014
We propose a scheme to explicitly detect and resolve ambiguous situations in multiple target tracking. During periods of uncertainty, our method applies multiple local single target trackers to hypothesise short-term tracks. These tracks are combined with the tracks obtained by a global multi-target tracker if they result in a reduction in the global cost function. Since tracking failures typically arise when targets become occluded, we propose a local data association scheme to maintain the target identities in these situations. We demonstrate a reduction of up to 50% in the global cost function, which in turn leads to superior performance on several challenging benchmark sequences. Additionally, we show tracking results in sports videos where poor video quality and frequent and severe occlusions between multiple players pose difficulties for state-of-the-art trackers.
Anton Milan, Rikke Gade, Anthony Dick, Thomas B. Moeslund, and Ian Rei
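The acceptance rule described in this abstract (keep a locally hypothesised track only if it lowers the global cost) can be sketched as a greedy loop. The cost function here is a toy assumption, not the paper's energy: it charges a penalty per unexplained detection and a penalty per track. The names `global_cost` and `merge_local_tracks` and both penalty values are hypothetical.

```python
def global_cost(tracks, detections, miss_penalty=10.0, track_penalty=3.0):
    """Toy cost: pay for every detection no track explains, and for every track."""
    covered = set().union(*tracks)
    missed = len(detections - covered)
    return miss_penalty * missed + track_penalty * len(tracks)


def merge_local_tracks(global_tracks, local_tracks, cost_fn):
    """Greedily add local tracks, keeping each one only if the cost drops."""
    accepted = list(global_tracks)
    best = cost_fn(accepted)
    for track in local_tracks:
        trial_cost = cost_fn(accepted + [track])
        if trial_cost < best:
            accepted.append(track)
            best = trial_cost
    return accepted, best


detections = {1, 2, 3, 4, 5}           # detection ids in the ambiguous window
global_tracks = [{1, 2}]               # output of the global tracker
local_tracks = [{3, 4}, {2}, {5}]      # hypotheses from local single target trackers
cost = lambda tracks: global_cost(tracks, detections)
initial_cost = cost(global_tracks)
merged, final_cost = merge_local_tracks(global_tracks, local_tracks, cost)
```

The redundant hypothesis `{2}` is rejected because it adds a track without explaining any new detection, while the other two are accepted.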
Deformable Object Tracking Using Clustering and Particle Filter
Visual tracking of a deformable object is a challenging problem, as the target object frequently changes attributes such as shape, posture and color. In this work, we propose a model-free tracker that uses clustering to track a target object undergoing deformations and rotations. Data clustering techniques segment the tracked object into several independent components, and the discriminative parts are tracked by finding corresponding clusters to locate the object. A particle filter is incorporated to improve the accuracy of the proposed technique. Experiments are carried out on several standard data sets, and the results demonstrate performance comparable to state-of-the-art visual tracking methods.
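For readers unfamiliar with the particle filter mentioned in this abstract, a minimal bootstrap filter is sketched below. It is a generic illustration, not the authors' tracker: the state is assumed to be a single 1-D coordinate (for example, one cluster centroid), the motion model is a random walk, and the noise levels and the function name `particle_filter_step` are hypothetical.

```python
import math
import random


def particle_filter_step(particles, measurement, rng,
                         motion_std=0.2, meas_std=0.5):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: diffuse each particle with random-walk motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # Every particle is far from the measurement: use uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
for _ in range(10):
    particles = particle_filter_step(particles, measurement=5.0, rng=rng)
estimate = sum(particles) / len(particles)  # posterior mean of the position
```

After a few cycles the particles concentrate around the measured position, so their mean gives a smoothed location estimate.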
Identity Retention of Multiple Objects under Extreme Occlusion Scenarios using Feature Descriptors
Identity assignment and retention require multiple object detection and tracking, and play a vital role in behavior analysis and gait recognition. The objective of Multiple Object Tracking (MOT) is to detect, track and retain identities from an image sequence. Occlusion is a major obstacle to identity retention, and handling it while tracking a varying number of people in a complex scene with a monocular camera remains challenging in real-world applications. This paper uses a Gaussian Mixture Model (GMM) and Hungarian Assignment (HA) for person detection and tracking. We propose an identity retention algorithm using Rotation, Scale and Translation (RST) invariant feature descriptors. In addition, a segmentation-based optimum demerge handling algorithm is proposed to retain proper identities under occlusion. The proposed approach is evaluated on standard surveillance dataset sequences and achieves 97% object detection accuracy and 85% tracking accuracy for the PETS-S2.L1 sequence, and 69.7% accuracy and 72.3% precision for the Town Centre sequence.
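The assignment step named in this abstract pairs existing tracks with new detections so that the total matching cost is minimal. The sketch below finds the same optimum by brute force over permutations, which is only practical for a handful of targets; it is not the Hungarian algorithm itself. Euclidean distance as the cost, equal numbers of tracks and detections, and the name `assign_detections` are assumptions for illustration.

```python
import itertools
import math


def assign_detections(tracks, detections):
    """Return the detection index for each track that minimizes total distance.

    Brute force over permutations; assumes len(tracks) == len(detections).
    """
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(detections))):
        cost = sum(math.dist(tracks[i], detections[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost


tracks = [(0.0, 0.0), (10.0, 10.0)]      # last known track positions
detections = [(9.0, 9.0), (1.0, 1.0)]    # detections in the new frame
assignment, total = assign_detections(tracks, detections)
```

Here track 0 is matched to detection 1 and track 1 to detection 0, since swapping them would make the total distance far larger.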