Non-sparse Linear Representations for Visual Tracking with Online Reservoir Metric Learning
Most sparse linear representation-based trackers need to solve a
computationally expensive L1-regularized optimization problem. To address this
problem, we propose a visual tracker based on non-sparse linear
representations, which admit an efficient closed-form solution without
sacrificing accuracy. Moreover, in order to capture the correlation information
between different feature dimensions, we learn a Mahalanobis distance metric in
an online fashion and incorporate the learned metric into the optimization
problem for obtaining the linear representation. We show that online metric
learning using proximity comparison significantly improves the robustness of
the tracking, especially on those sequences exhibiting drastic appearance
changes. Furthermore, in order to prevent the unbounded growth in the number of
training samples for the metric learning, we design a time-weighted reservoir
sampling method to maintain and update limited-sized foreground and background
sample buffers for balancing sample diversity and adaptability. Experimental
results on challenging videos demonstrate the effectiveness and robustness of
the proposed tracker.
Comment: Appearing in IEEE Conf. Computer Vision and Pattern Recognition, 201
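As a rough illustration of why a non-sparse representation admits a closed-form solution, the sketch below solves a ridge-style problem under a Mahalanobis metric. The objective, the function name `metric_weighted_representation`, and the symbols `D` (template matrix), `y` (candidate feature vector), `M` (metric), and `lam` (regularization weight) are assumptions made for this example; the abstract does not state the exact formulation used in the paper.

```python
import numpy as np

def metric_weighted_representation(D, y, M, lam=0.1):
    """Closed-form minimizer of (y - D c)^T M (y - D c) + lam * ||c||^2.

    Assumed objective (not taken from the paper): an L2-regularized
    least-squares fit measured under a positive semidefinite metric M.
    Setting the gradient to zero gives (D^T M D + lam I) c = D^T M y.
    """
    A = D.T @ M @ D + lam * np.eye(D.shape[1])
    b = D.T @ M @ y
    return np.linalg.solve(A, b)

# Toy check: with M = I and a tiny lam, the coefficients are recovered.
rng = np.random.default_rng(0)
D = rng.normal(size=(20, 5))       # 5 hypothetical templates, 20-dim features
c_true = rng.normal(size=5)
y = D @ c_true
c = metric_weighted_representation(D, y, np.eye(20), lam=1e-8)
```

Unlike an L1-regularized problem, this needs only one linear solve per candidate, which is where the efficiency claimed in the abstract would come from under this assumed objective.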
Good Features to Correlate for Visual Tracking
In recent years, correlation filters have shown dominant and
spectacular results for visual object tracking. The types of features
employed in this family of trackers significantly affect tracking
performance. The ultimate goal is to utilize robust features invariant
to any kind of appearance change of the object, while predicting the object
location as accurately as in the case of no appearance change. As deep
learning based methods have emerged, the study of learning features for
specific tasks has accelerated. For instance, discriminative visual tracking
methods based on deep architectures have been studied with promising
performance. Nevertheless, correlation filter based (CFB) trackers confine
themselves to pre-trained networks that were trained for the object
classification problem. To this end, this manuscript formulates the problem of
learning deep fully convolutional features for CFB visual tracking. In
order to learn the proposed model, a novel and efficient backpropagation
algorithm is presented based on the loss function of the network. The proposed
learning framework enables the network model to be flexible for a custom
design. Moreover, it alleviates the dependency on the network trained for
classification. Extensive performance analysis shows the efficacy of the
proposed custom design in the CFB tracking framework. By fine-tuning the
convolutional parts of a state-of-the-art network and integrating this model into
a CFB tracker, which is the top performing one of VOT2016, an 18% increase is
achieved in terms of expected average overlap, and tracking failures are
decreased by 25%, while maintaining the superiority over the state-of-the-art
methods on the OTB-2013 and OTB-2015 tracking datasets.
Comment: Accepted version of IEEE Transactions on Image Processin
Online Metric-Weighted Linear Representations for Robust Visual Tracking
In this paper, we propose a visual tracker based on a metric-weighted linear
representation of appearance. In order to capture the interdependence of
different feature dimensions, we develop two online distance metric learning
methods using proximity comparison information and structured output learning.
The learned metric is then incorporated into a linear representation of
appearance.
We show that online distance metric learning significantly improves the
robustness of the tracker, especially on those sequences exhibiting drastic
appearance changes. In order to bound growth in the number of training samples,
we design a time-weighted reservoir sampling method.
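One way to picture a bounded, recency-biased sample buffer is weighted reservoir sampling with weights that grow over time. The sketch below uses the Efraimidis-Spirakis weighted reservoir scheme with an assumed weight of exp(alpha * t); this is a stand-in for illustration, not the sampling rule from the paper, and the class name `TimeWeightedReservoir` and the parameter `alpha` are hypothetical.

```python
import heapq
import itertools
import math
import random

class TimeWeightedReservoir:
    """Fixed-size buffer that favors recent samples (illustrative sketch).

    Each sample gets key u**(1/w) with w = exp(alpha * t), and the `capacity`
    largest keys are kept. For numerical stability we rank by the equivalent
    score log(-log(u)) - alpha * t, where a smaller score is better.
    """

    def __init__(self, capacity, alpha=0.05, seed=None):
        self.capacity = capacity
        self.alpha = alpha
        self._rng = random.Random(seed)
        self._tie = itertools.count()  # tie-breaker so samples are never compared
        self._heap = []                # (-score, tie, sample); root = worst kept

    def add(self, sample, t):
        u = max(self._rng.random(), 1e-300)  # avoid log(0)
        score = math.log(-math.log(u)) - self.alpha * t
        entry = (-score, next(self._tie), sample)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif score < -self._heap[0][0]:
            # New sample beats the worst kept one, so replace it.
            heapq.heapreplace(self._heap, entry)

    def samples(self):
        return [s for _, _, s in self._heap]

# Usage: stream 1000 frame indices through a 50-slot buffer.
buf = TimeWeightedReservoir(capacity=50, alpha=0.01, seed=0)
for t in range(1000):
    buf.add(t, t)
kept = buf.samples()
```

In a tracker, one such buffer could be kept for foreground samples and another for background samples. A larger `alpha` favors adaptability to recent appearance, while a smaller one preserves older, more diverse samples.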
Moreover, we enable our tracker to automatically perform object
identification during the process of object tracking, by introducing a
collection of static template samples belonging to several object classes of
interest. Object identification results for an entire video sequence are
achieved by systematically combining the tracking information and visual
recognition at each frame. Experimental results on challenging video sequences
demonstrate the effectiveness of the method for both inter-frame tracking and
object identification.
Comment: 51 pages. Appearing in IEEE Transactions on Pattern Analysis and
Machine Intelligenc