Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism
In this paper, we propose a CNN-based framework for online MOT. This
framework utilizes the merits of single object trackers in adapting appearance
models and searching for the target in the next frame. Simply applying a single
object tracker to MOT runs into problems with computational efficiency
and with drift caused by occlusion. Our framework achieves computational
efficiency by sharing features and using ROI-Pooling to obtain individual
features for each target. Some online learned target-specific CNN layers are
used for adapting the appearance model for each target. In the framework, we
introduce a spatial-temporal attention mechanism (STAM) to handle the drift
caused by occlusion and interaction among targets. The visibility map of the
target is learned and used for inferring the spatial attention map. The spatial
attention map is then applied to weight the features. Besides, the occlusion
status can be estimated from the visibility map, which controls the online
updating process via weighted loss on training samples with different occlusion
statuses in different frames. This can be regarded as a temporal attention
mechanism. The proposed algorithm achieves MOTA scores of 34.3% and 46.0% on the
challenging MOT15 and MOT16 benchmark datasets, respectively. Comment: Accepted at International Conference on Computer Vision (ICCV) 201
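The attention steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax normalization, the mean-visibility occlusion score, and all function names are assumptions made for illustration, and the learned CNN layers are replaced by plain lists.

```python
import math

def spatial_attention(visibility):
    """Turn a 2-D visibility map into an attention map that sums to 1.
    Assumption: a softmax over all positions stands in for the learned mapping."""
    peak = max(v for row in visibility for v in row)
    exps = [[math.exp(v - peak) for v in row] for row in visibility]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def weight_features(features, attention):
    """Multiply every channel of a C x H x W nested list by the attention map."""
    return [
        [[f * a for f, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, attention)]
        for channel in features
    ]

def occlusion_score(visibility):
    """Mean visibility as a crude occlusion status (1.0 = fully visible).
    Assumption: the abstract does not say how the status is computed."""
    flat = [v for row in visibility for v in row]
    return sum(flat) / len(flat)

# Toy target whose lower half is occluded.
visibility = [[0.9, 0.8], [0.1, 0.2]]
features = [[[1.0, 1.0], [1.0, 1.0]]]  # a single feature channel

attention = spatial_attention(visibility)
weighted = weight_features(features, attention)

# Temporal attention: this score would scale the frame's training-sample loss,
# so heavily occluded frames contribute less to the online update.
sample_weight = occlusion_score(visibility)
```

In this sketch the visible upper positions receive more attention than the occluded lower ones, and a frame with low mean visibility would be down-weighted during online updating.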
Real-time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification
Online multi-object tracking is a fundamental problem in time-critical video
analysis applications. A major challenge in the popular tracking-by-detection
framework is how to associate unreliable detection results with existing
tracks. In this paper, we propose to handle unreliable detection by collecting
candidates from outputs of both detection and tracking. The intuition behind
generating redundant candidates is that detection and tracks can complement
each other in different scenarios. Detection results of high confidence prevent
tracking drifts in the long term, and predictions of tracks can handle noisy
detection caused by occlusion. To select optimally from a
considerable number of candidates in real time, we present a novel scoring
function based on a fully convolutional neural network that shares most
computations across the entire image. Moreover, we adopt a deeply learned
appearance representation, which is trained on large-scale person
re-identification datasets, to improve the identification ability of our
tracker. Extensive experiments show that our tracker achieves real-time and
state-of-the-art performance on a widely used people tracking benchmark. Comment: ICME 201
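The idea of pooling candidates from detection and tracking and then selecting among them can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the scores are supplied by hand rather than by a fully convolutional network, and greedy non-maximum suppression with a hypothetical IoU threshold of 0.5 stands in for the selection step.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_candidates(detections, predictions, iou_threshold=0.5):
    """Pool (box, score) candidates from both sources, then keep the
    highest-scoring box in each group of heavily overlapping boxes."""
    pool = sorted(detections + predictions, key=lambda c: c[1], reverse=True)
    kept = []
    for box, score in pool:
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept

# One confident detection, one overlapping track prediction, and one track
# prediction for a person the detector missed (for example, under occlusion).
detections = [((10, 10, 50, 90), 0.9)]
predictions = [((12, 11, 52, 92), 0.6), ((200, 40, 240, 120), 0.7)]

selected = select_candidates(detections, predictions)
```

Here the confident detection suppresses the redundant track prediction, while the prediction for the missed person survives, which mirrors the complementary roles the abstract describes.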
UA-DETRAC: A New Benchmark and Protocol for Multi-Object Detection and Tracking
In recent years, numerous effective multi-object tracking (MOT) methods have been
developed because of the wide range of applications. Existing performance
evaluations of MOT methods usually separate the object tracking step from the
object detection step by using the same fixed object detection results for
comparisons. In this work, we perform a comprehensive quantitative study of the
effects of object detection accuracy on the overall MOT performance, using the
new large-scale University at Albany DETection and tRACking (UA-DETRAC)
benchmark dataset. The UA-DETRAC benchmark dataset consists of 100 challenging
video sequences captured from real-world traffic scenes (over 140,000 frames
with rich annotations, including occlusion, weather, vehicle category,
truncation, and vehicle bounding boxes) for object detection, object tracking
and MOT systems. We evaluate complete MOT systems constructed from combinations
of state-of-the-art object detection and object tracking methods. Our analysis
shows the complex effects of object detection accuracy on MOT system
performance. Based on these observations, we propose new evaluation tools and
metrics for MOT systems that consider both object detection and object tracking
for comprehensive analysis. Comment: 18 pages, 11 figures, accepted by CVI
Identity Retention of Multiple Objects under Extreme Occlusion Scenarios using Feature Descriptors
Identity assignment and retention require multiple object detection and tracking, and they play a vital role in behavior analysis and gait recognition. The objective of Multiple Object Tracking (MOT) is to detect, track, and retain identities from an image sequence. Occlusion is a major obstacle to identity retention: handling it while tracking a varying number of people in a complex scene with a monocular camera remains challenging in real-world applications. This paper uses a Gaussian Mixture Model (GMM) and Hungarian Assignment (HA) for person detection and tracking. We propose an identity retention algorithm using Rotation, Scale and Translation (RST) invariant feature descriptors. In addition, a segmentation-based optimum demerge handling algorithm is proposed to retain proper identities under occlusion. The proposed approach is evaluated on standard surveillance dataset sequences and achieves 97% object detection accuracy and 85% tracking accuracy on the PETS-S2.L1 sequence, and 69.7% accuracy and 72.3% precision on the Town Centre sequence.
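The track-to-detection assignment step can be illustrated with a small minimum-cost matching sketch. To stay self-contained it uses exhaustive search over permutations rather than the Hungarian algorithm itself; for small inputs both yield a minimum-cost assignment. The Euclidean centroid cost and all names here are assumptions, since the abstract does not specify the cost used.

```python
from itertools import permutations
import math

def min_cost_assignment(cost):
    """Return (total_cost, [(track, detection), ...]) minimising the summed cost.
    Exhaustive search: only suitable for small matrices, and assumes there are
    at least as many detections (columns) as tracks (rows)."""
    n_tracks, n_dets = len(cost), len(cost[0])
    best_total, best_pairs = math.inf, []
    for cols in permutations(range(n_dets), n_tracks):
        total = sum(cost[t][d] for t, d in enumerate(cols))
        if total < best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_total, best_pairs

# Hypothetical track and detection centroids in image coordinates.
tracks = [(0.0, 0.0), (10.0, 10.0)]
detections = [(9.0, 11.0), (1.0, 0.0), (30.0, 30.0)]

# Cost matrix: Euclidean distance between each track and each detection.
cost = [[math.dist(t, d) for d in detections] for t in tracks]
total, pairs = min_cost_assignment(cost)
```

Each track is paired with its nearby detection, and the distant third detection is left unmatched, where it could start a new identity. In practice a polynomial-time solver would replace the exhaustive search for larger scenes.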