Online Metric-Weighted Linear Representations for Robust Visual Tracking
In this paper, we propose a visual tracker based on a metric-weighted linear
representation of appearance. In order to capture the interdependence of
different feature dimensions, we develop two online distance metric learning
methods using proximity comparison information and structured output learning.
The learned metric is then incorporated into a linear representation of
appearance.
We show that online distance metric learning significantly improves the
robustness of the tracker, especially on those sequences exhibiting drastic
appearance changes. In order to bound growth in the number of training samples,
we design a time-weighted reservoir sampling method.
Moreover, we enable our tracker to automatically perform object
identification during the process of object tracking, by introducing a
collection of static template samples belonging to several object classes of
interest. Object identification results for an entire video sequence are
achieved by systematically combining the tracking information and visual
recognition at each frame. Experimental results on challenging video sequences
demonstrate the effectiveness of the method for both inter-frame tracking and
object identification.
Comment: 51 pages. Appearing in IEEE Transactions on Pattern Analysis and
Machine Intelligence
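
The abstract above mentions a time-weighted reservoir sampling method for bounding the number of training samples but gives no details. The following is only a minimal sketch of one way such a buffer could work, not the authors' method: it uses the standard weighted reservoir scheme of Efraimidis and Spirakis with a weight that grows exponentially with the frame index, so recent samples are more likely to be kept. The class name `TimeWeightedReservoir`, the `rate` parameter, and the exponential weighting are all assumptions made for illustration.

```python
import heapq
import math
import random


class TimeWeightedReservoir:
    """Fixed-size sample buffer biased toward recent samples (illustrative only)."""

    def __init__(self, capacity, rate=0.05, seed=None):
        self.capacity = capacity          # maximum number of stored samples
        self.rate = rate                  # assumed growth rate of the time weight
        self._rng = random.Random(seed)
        self._heap = []                   # min-heap of (key, time, sample)
        self._t = 0                       # index of the next incoming sample

    def add(self, sample):
        # Weighted reservoir sampling keeps the items with the largest
        # keys u ** (1 / w). With w = exp(rate * t), the log of that key is
        # log(u) * exp(-rate * t), which avoids overflow for large t.
        u = 1.0 - self._rng.random()      # uniform in (0, 1]
        key = math.log(u) * math.exp(-self.rate * self._t)
        entry = (key, self._t, sample)
        self._t += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif key > self._heap[0][0]:
            # The new sample beats the weakest stored one, so replace it.
            heapq.heapreplace(self._heap, entry)

    def samples(self):
        return [sample for _, _, sample in self._heap]
```

Because the buffer never exceeds `capacity`, the memory used for training samples stays bounded however long the sequence runs, while older samples are gradually displaced by newer ones.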
Understanding and Diagnosing Visual Tracking Systems
Several benchmark datasets for visual tracking research have been proposed in
recent years. Despite their usefulness, whether they are sufficient for
understanding and diagnosing the strengths and weaknesses of different trackers
remains questionable. To address this issue, we propose a framework by breaking
a tracker down into five constituent parts, namely, motion model, feature
extractor, observation model, model updater, and ensemble post-processor. We
then conduct ablative experiments on each component to study how it affects the
overall result. Surprisingly, our findings contradict some common
beliefs in the visual tracking research community. We find that the feature
extractor plays the most important role in a tracker. On the other hand,
although the observation model is the focus of many studies, we find that it
often brings no significant improvement. Moreover, the motion model and model
updater contain many details that could affect the result. Also, the ensemble
post-processor can improve the result substantially when the constituent
trackers have high diversity. Based on our findings, we put together some very
elementary building blocks to give a basic tracker that is competitive in
performance with state-of-the-art trackers. We believe our framework can
provide a solid baseline when conducting controlled experiments for visual
tracking research.
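
The five-part breakdown described in the abstract above can be pictured as a small pipeline. The sketch below is an assumption-laden illustration rather than the authors' code: the `Tracker` class, the callable signatures, the `Box` layout, and the box-averaging ensemble rule are all hypothetical choices made to show how the components might plug together and be swapped independently in ablation experiments.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]   # assumed layout: (x, y, width, height)
Features = List[float]


@dataclass
class Tracker:
    """One tracker assembled from four replaceable components (illustrative)."""

    motion_model: Callable[[Box], List[Box]]            # proposes candidate boxes
    feature_extractor: Callable[[object, Box], Features]
    observation_model: Callable[[Features], float]      # scores a candidate
    model_updater: Callable[[Features, float], None]    # adapts the model

    def step(self, frame, previous_box: Box) -> Box:
        candidates = self.motion_model(previous_box)
        features = [self.feature_extractor(frame, box) for box in candidates]
        scores = [self.observation_model(f) for f in features]
        best = max(range(len(candidates)), key=scores.__getitem__)
        # Let the updater decide how to use the winning candidate.
        self.model_updater(features[best], scores[best])
        return candidates[best]


def ensemble_post_processor(boxes: Sequence[Box]) -> Box:
    """Combine the outputs of several trackers by averaging (assumed rule)."""
    n = len(boxes)
    return tuple(sum(box[i] for box in boxes) / n for i in range(4))
```

With this layout, an ablation study amounts to holding three of the callables fixed and replacing the fourth, then comparing the boxes returned by `step` across variants.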