End-to-end Flow Correlation Tracking with Spatial-temporal Attention
Discriminative correlation filters (DCF) with deep convolutional features
have achieved favorable performance in recent tracking benchmarks. However,
most existing DCF trackers consider only the appearance features of the current
frame and hardly benefit from motion and inter-frame information. The lack of
temporal information degrades the tracking performance during challenges such
as partial occlusion and deformation. In this work, we focus on making use of
the rich flow information in consecutive frames to improve the feature
representation and the tracking accuracy. First, individual components,
including optical flow estimation, feature extraction, aggregation and
correlation filter tracking, are formulated as special layers in a network. To
the best of our knowledge, this is the first work to jointly train flow and
tracking tasks in a deep learning framework. Then the historical feature maps at
predefined intervals are warped and aggregated with the current ones under the
guidance of flow. For adaptive aggregation, we propose a novel spatial-temporal
attention mechanism. Extensive experiments are performed on four challenging
tracking datasets: OTB2013, OTB2015, VOT2015 and VOT2016, and the proposed
method achieves superior results on these benchmarks.
Comment: Accepted in CVPR 201
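The warp-and-aggregate step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `warp` and `aggregate`, the flow channel order (x displacement first), and the use of a softmax over per-location cosine similarity as the attention weight are all assumptions made for illustration, and the flow field is taken as given rather than estimated by a network.

```python
import numpy as np

def warp(feat, flow):
    """Bilinearly sample feat (C, H, W) at positions displaced by flow (2, H, W).

    Assumed convention: flow[0] is the x displacement, flow[1] the y displacement.
    Sampling positions are clamped to the feature map borders.
    """
    _, H, W = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    x = np.clip(xs + flow[0], 0, W - 1)
    y = np.clip(ys + flow[1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = x - x0
    wy = y - y0
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def aggregate(current, warped_history, eps=1e-8):
    """Blend the current feature map with warped historical ones.

    Hypothetical weighting: at each spatial location, frames are weighted by a
    softmax over their cosine similarity to the current frame's feature vector.
    """
    frames = np.stack([current] + list(warped_history))        # (T, C, H, W)
    cur_norm = np.linalg.norm(current, axis=0) + eps            # (H, W)
    norms = np.linalg.norm(frames, axis=1) + eps                # (T, H, W)
    cos = (frames * current[None]).sum(axis=1) / (norms * cur_norm[None])
    w = np.exp(cos - cos.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)                           # sums to 1 over T
    return (w[:, None] * frames).sum(axis=0)                    # (C, H, W)
```

A historical frame that agrees with the current one at a location receives a larger weight there, so occluded or badly warped regions contribute less to the aggregated features.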
Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Tracking
With efficient appearance learning models, Discriminative Correlation Filter
(DCF) has been proven to be very successful in recent video object tracking
benchmarks and competitions. However, the existing DCF paradigm suffers from
two major issues, i.e., spatial boundary effect and temporal filter
degradation. To mitigate these challenges, we propose a new DCF-based tracking
method. The key innovations of the proposed method include adaptive spatial
feature selection and temporal consistency constraints, with which the new
tracker enables joint spatial-temporal filter learning in a lower-dimensional
discriminative manifold. More specifically, we apply structured spatial
sparsity constraints to multi-channel filters. Consequently, the process of
learning spatial filters can be approximated by the lasso regularisation. To
encourage temporal consistency, the filter model is restricted to lie around
its historical value and updated locally to preserve the global structure in
the manifold. Last, a unified optimisation framework is proposed to jointly
select temporal consistency preserving spatial features and learn
discriminative filters with the augmented Lagrangian method. Qualitative and
quantitative evaluations have been conducted on a number of well-known
benchmarking datasets such as OTB2013, OTB50, OTB100, Temple-Colour, UAV123 and
VOT2018. The experimental results demonstrate the superiority of the proposed
method over the state-of-the-art approaches.
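To make the combination of structured spatial sparsity and temporal consistency more concrete, here is a small NumPy sketch of one closed-form step. It is an illustration under stated assumptions, not the paper's optimisation: the objective 0.5·||θ − a||² + (μ/2)·||θ − θ_prev||² + λ·Σ ||θ_group||₂, the grouping of filter coefficients by spatial location across channels, and the names `group_soft_threshold` and `sparse_consistent_update` are all hypothetical. The abstract states only that the full problem is solved with the augmented Lagrangian method.

```python
import numpy as np

def group_soft_threshold(v, tau, eps=1e-12):
    """Proximal operator of tau * sum of per-location L2 norms.

    v has shape (C, H, W); each group is the C-vector at one spatial location.
    Groups whose norm is at most tau are set to zero (spatial feature selection).
    """
    norms = np.linalg.norm(v, axis=0, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, eps))
    return v * scale

def sparse_consistent_update(a, theta_prev, lam, mu):
    """Minimise the assumed objective in closed form.

    The two quadratic terms merge into one centred at a weighted average of the
    data-driven estimate `a` and the previous filter `theta_prev`; the group
    lasso term is then handled by group soft-thresholding.
    """
    centre = (a + mu * theta_prev) / (1.0 + mu)
    return group_soft_threshold(centre, lam / (1.0 + mu))
```

Raising `mu` pulls the filter towards its historical value, while raising `lam` zeroes out more spatial locations.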
- …