Remove Cosine Window from Correlation Filter-based Visual Trackers: When and How
Correlation filters (CFs) have continuously advanced state-of-the-art tracking
performance and have been studied extensively in the past few years. Most
existing CF trackers adopt a cosine window to spatially reweight the base image
and alleviate boundary discontinuity. However, the cosine window places more
emphasis on the central region of the base image and risks contaminating
negative training samples during model learning. On the
other hand, spatial regularization, deployed in many recent CF trackers, plays a
role similar to the cosine window by enforcing a spatial penalty on CF
coefficients. Therefore, in this paper we investigate the feasibility of
removing the cosine window from CF trackers with spatial regularization. When
the cosine window is simply removed, a CF with spatial regularization still
suffers from a small degree of boundary discontinuity. To tackle this issue,
binary and Gaussian-shaped mask
functions are further introduced to eliminate boundary discontinuity while
reweighting the estimation error of each training sample; they can be
incorporated into multiple CF trackers with spatial regularization. In
comparison to their counterparts with a cosine window, our methods are effective in
handling boundary discontinuity and sample contamination, thereby benefiting
tracking performance. Extensive experiments on three benchmarks show that our
methods perform favorably against the state-of-the-art trackers using either
handcrafted or deep CNN features. The code is publicly available at
https://github.com/lifeng9472/Removing_cosine_window_from_CF_trackers.
Comment: 13 pages, 7 figures, submitted to IEEE Transactions on Image Processing
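
The contrast drawn in the abstract above, reweighting the base image with a cosine window versus reweighting the estimation error with a mask, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `sigma_ratio` parameter, and the toy 5x5 patch are assumptions made for illustration, and the real trackers operate on feature maps inside a correlation-filter objective.

```python
import numpy as np

def cosine_window(height, width):
    """2-D cosine (Hann) window: outer product of two 1-D Hann windows."""
    return np.outer(np.hanning(height), np.hanning(width))

def gaussian_mask(height, width, sigma_ratio=0.25):
    """Gaussian-shaped weight map centred on the patch.

    sigma_ratio is a hypothetical parameter (standard deviation as a
    fraction of the patch size); it is not taken from the paper.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    gy = np.exp(-0.5 * (ys / (sigma_ratio * height)) ** 2)
    gx = np.exp(-0.5 * (xs / (sigma_ratio * width)) ** 2)
    return np.outer(gy, gx)

# Toy "base image": a constant 5x5 patch.
patch = np.ones((5, 5))

# Cosine-window approach: the image itself is reweighted, so pixels on the
# border are suppressed to zero and the centre is left at full weight.
windowed = patch * cosine_window(5, 5)

# Mask-style approach (illustrative only): the image is left untouched and
# a per-position squared error is reweighted instead.
residual = np.full((5, 5), 0.1)  # made-up estimation error
weighted_loss = float(np.sum(gaussian_mask(5, 5) * residual ** 2))
```

In this sketch the cosine window changes the training data, while the mask only changes how much each position contributes to the loss, which is the distinction the abstract relies on.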
Learning Spatial-Aware Regressions for Visual Tracking
In this paper, we analyze the spatial information of deep features, and
propose two complementary regressions for robust visual tracking. First, we
propose a kernelized ridge regression model wherein the kernel value is defined
as the weighted sum of similarity scores of all pairs of patches between two
samples. We show that this model can be formulated as a neural network and thus
can be efficiently solved. Second, we propose a fully convolutional neural
network with spatially regularized kernels, through which the filter kernel
corresponding to each output channel is forced to focus on a specific region of
the target. Distance transform pooling is further exploited to determine the
effectiveness of each output channel of the convolution layer. The outputs from
the kernelized ridge regression model and the fully convolutional neural
network are combined to obtain the ultimate response. Experimental results on
two benchmark datasets validate the effectiveness of the proposed method.
Comment: To appear in CVPR201
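
The first regression described in the abstract above defines the kernel value as a weighted sum of similarity scores over all pairs of patches from two samples. A minimal sketch of that idea follows, assuming a dot product as the similarity score and a fixed weight matrix; the names `patch_pair_kernel`, `krr_fit`, `weights`, and `lam` are hypothetical, and the paper's network-based solver and learned weights are not reproduced here.

```python
import numpy as np

def patch_pair_kernel(x, z, weights):
    """Weighted sum of similarities over all patch pairs.

    x, z:    arrays of shape (P, D), one D-dim feature vector per patch.
    weights: array of shape (P, P); weights[i, j] scales the similarity
             between patch i of x and patch j of z.
    The dot product is used as the similarity score (an assumption).
    """
    similarity = x @ z.T  # (P, P) matrix of pairwise dot products
    return float(np.sum(weights * similarity))

def krr_fit(samples, targets, weights, lam=1e-2):
    """Closed-form kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = len(samples)
    K = np.array([[patch_pair_kernel(a, b, weights) for b in samples]
                  for a in samples])
    alpha = np.linalg.solve(K + lam * np.eye(n), targets)
    return alpha, K

# Toy data: 4 samples, each with 3 patches of 2-D features.
rng = np.random.default_rng(0)
samples = [rng.normal(size=(3, 2)) for _ in range(4)]
targets = np.array([1.0, 0.5, 0.2, 0.0])
weights = np.ones((3, 3))  # uniform weights keep the kernel matrix PSD

alpha, K = krr_fit(samples, targets, weights)

# Response for a new sample: kernel values against the training samples.
query = rng.normal(size=(3, 2))
response = sum(a * patch_pair_kernel(query, s, weights)
               for a, s in zip(alpha, samples))
```

With uniform weights this reduces to a linear kernel on the summed patch features; a non-uniform `weights` matrix is what lets the kernel emphasise particular patch pairs.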