5 research outputs found
Particle Filter Re-detection for Visual Tracking via Correlation Filters
Most of the correlation filter based tracking algorithms can achieve good
performance and maintain fast computational speed. However, in some complicated
tracking scenes, there is a fatal defect that causes the object to be located
inaccurately. In order to address this problem, we propose a particle filter
redetection based tracking approach for accurate object localization. During
the tracking process, the kernelized correlation filter (KCF) based tracker
locates the object by relying on the maximum response value of the response
map; when the response map becomes ambiguous, the KCF tracking result becomes
unreliable. Our method can provide more candidates by particle resampling to
detect the object accordingly. Additionally, we give a new object scale
evaluation mechanism, which merely considers the differences between the
maximum response values in consecutive frames. Extensive experiments on OTB2013
and OTB2015 datasets demonstrate that the proposed tracker performs favorably
in relation to the state-of-the-art methods.
Comment: 18 pages, 6 figures, 2 tables
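The re-detection idea in this abstract can be illustrated with a minimal sketch: trust the peak of the response map when it is high, otherwise scatter particles around the last known position, resample them by score, and keep the best one. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the `min_peak` threshold, the Gaussian particle spread, and `toy_score`, which stands in for evaluating the correlation filter at a candidate location.

```python
import math
import random


def peak(response):
    """Return (value, (row, col)) of the largest entry in a 2D response map."""
    return max(
        (v, (r, c)) for r, row in enumerate(response) for c, v in enumerate(row)
    )


def systematic_resample(particles, weights, rng):
    """Resample len(particles) items with probability proportional to weights."""
    n = len(particles)
    total = sum(weights)
    if total <= 0:  # degenerate weights: keep the particle set unchanged
        return list(particles)
    step = total / n
    u = rng.random() * step
    out, i, cum = [], 0, weights[0]
    for _ in range(n):
        while u >= cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
        u += step
    return out


def redetect(last_pos, score, rng, n=200, sigma=8.0):
    """Scatter particles around last_pos, resample by score, return the best."""
    particles = [
        (last_pos[0] + rng.gauss(0.0, sigma), last_pos[1] + rng.gauss(0.0, sigma))
        for _ in range(n)
    ]
    weights = [score(p) for p in particles]
    survivors = systematic_resample(particles, weights, rng)
    return max(survivors, key=score)


def locate(response, last_pos, score, rng, min_peak=0.3):
    """Use the response peak if it is confident, else fall back to particles."""
    value, pos = peak(response)
    if value >= min_peak:
        return pos
    return redetect(last_pos, score, rng)


def toy_score(target, spread=10.0):
    """Hypothetical stand-in for the tracker's response at a candidate position."""
    def score(p):
        d2 = (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2
        return math.exp(-d2 / (2.0 * spread ** 2))
    return score
```

A call such as `locate(response, (45.0, 55.0), toy_score((50.0, 60.0)), random.Random(0))` returns the peak position when the map is confident and a particle close to the scored target otherwise. The abstract does not say how ambiguity is measured, so the fixed peak threshold is only a placeholder for that test.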
Hierarchical Spatial-aware Siamese Network for Thermal Infrared Object Tracking
Most thermal infrared (TIR) tracking methods are discriminative, treating the
tracking problem as a classification task. However, the objective of the
classifier (label prediction) is not coupled to the objective of the tracker
(location estimation). The classification task focuses on the between-class
difference of the arbitrary objects, while the tracking task mainly deals with
the within-class difference of the same objects. In this paper, we cast the TIR
tracking problem as a similarity verification task, which is coupled well to
the objective of the tracking task. We propose a TIR tracker via a Hierarchical
Spatial-aware Siamese Convolutional Neural Network (CNN), named HSSNet. To
obtain both spatial and semantic features of the TIR object, we design a
Siamese CNN that coalesces the multiple hierarchical convolutional layers.
Then, we propose a spatial-aware network to enhance the discriminative ability
of the coalesced hierarchical feature. Subsequently, we train this network end
to end on a large visible video detection dataset to learn the similarity
between paired objects before we transfer the network into the TIR domain.
Next, this pre-trained Siamese network is used to evaluate the similarity
between the target template and target candidates. Finally, we locate the
candidate that is most similar to the tracked target. Extensive experimental
results on the benchmarks VOT-TIR 2015 and VOT-TIR 2016 show that our proposed
method achieves favourable performance compared to the state-of-the-art
methods.
Comment: 20 pages, 7 figures
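The final matching step described above (score each candidate against the target template, then pick the most similar one) can be sketched as follows. This is a simplified illustration under stated assumptions: HSSNet learns its similarity end to end, whereas this sketch uses plain cosine similarity, and `coalesce` models the merging of hierarchical layers as simple concatenation. All function names are hypothetical.

```python
import math


def coalesce(layer_features):
    """Concatenate per-layer feature vectors into one descriptor (assumed fusion)."""
    merged = []
    for feat in layer_features:
        merged.extend(feat)
    return merged


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def locate_target(template_feat, candidate_feats):
    """Return (index, score) of the candidate most similar to the template."""
    scores = [cosine_similarity(template_feat, c) for c in candidate_feats]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

For example, `locate_target(coalesce([[1.0], [0.0, 1.0]]), candidates)` compares a three-dimensional template descriptor with each entry of `candidates` and returns the index of the closest match.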
Material Based Object Tracking in Hyperspectral Videos: Benchmark and Algorithms
Traditional color images only depict color intensities in red, green and blue
channels, often making object trackers fail in challenging scenarios, e.g.,
background clutter and rapid changes of target appearance. Alternatively,
material information of targets contained in a large number of bands of
hyperspectral images (HSI) is more robust to these difficult conditions. In
this paper, we conduct a comprehensive study on how material information can be
utilized to boost object tracking from three aspects: benchmark dataset,
material feature representation and material based tracking. In terms of
benchmark, we construct a dataset of fully-annotated videos, which contain both
hyperspectral and color sequences of the same scene. Material information is
represented by a spectral-spatial histogram of multidimensional gradients, which
describes the 3D local spectral-spatial structure in an HSI, and by fractional
abundances of constituent material components, which encode the underlying
material distribution. These two types of features are embedded into
correlation filters, yielding material based tracking. Experimental results on
the collected benchmark dataset show the potential and advantages of material
based object tracking.
Comment: Update results
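To make the term "fractional abundances" concrete, here is a minimal sketch of the linear mixing model commonly used in hyperspectral unmixing: each pixel spectrum is a convex combination of material spectra (endmembers), and the combination weights are the abundances. The abstract does not state which mixing model or unmixing method the authors use, so the model and the function name `mix` are assumptions for illustration only.

```python
def mix(endmembers, abundances, tol=1e-9):
    """Build a pixel spectrum as sum_k abundances[k] * endmembers[k].

    endmembers: list of material spectra, one list of band values per material.
    abundances: non-negative fractions, one per material, summing to 1.
    """
    if len(endmembers) != len(abundances):
        raise ValueError("need one abundance per endmember")
    if any(a < 0 for a in abundances) or abs(sum(abundances) - 1.0) > tol:
        raise ValueError("abundances must be non-negative and sum to 1")
    n_bands = len(endmembers[0])
    return [
        sum(a * spectrum[band] for a, spectrum in zip(abundances, endmembers))
        for band in range(n_bands)
    ]
```

Unmixing is the inverse problem: given the pixel spectrum and the endmembers, recover the abundance vector, which can then serve as a per-pixel material feature.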
Learning Deep Multi-Level Similarity for Thermal Infrared Object Tracking
Existing deep Thermal InfraRed (TIR) trackers only use semantic features to
describe the TIR object, which lack sufficient discriminative capacity for
handling distractors. This becomes worse when the feature extraction network is
only trained on RGB images. To address this issue, we propose a multi-level
similarity model under a Siamese framework for robust TIR object tracking.
Specifically, we compute different pattern similarities on two convolutional
layers using the proposed multi-level similarity network. One of them focuses
on the global semantic similarity and the other computes the local structural
similarity of the TIR object. These two similarities complement each other and
hence enhance the discriminative capacity of the network for handling
distractors. In addition, we design a simple yet effective relative entropy
based ensemble subnetwork to integrate the semantic and structural
similarities. This subnetwork can adaptively learn the weights of the semantic
and structural similarities at the training stage. To further enhance the
discriminative capacity of the tracker, we construct the first large scale TIR
video sequence dataset for training the proposed model. The proposed TIR
dataset not only benefits the training for TIR tracking but also can be applied
to numerous TIR vision tasks. Extensive experimental results on the VOT-TIR2015
and VOT-TIR2017 benchmarks demonstrate that the proposed algorithm performs
favorably against the state-of-the-art methods.
Comment: 18 pages
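The two ingredients named in this abstract, relative entropy and a weighted combination of semantic and structural similarities, can be sketched as below. The abstract does not describe how the ensemble subnetwork uses relative entropy or how its weights are learned, so this only shows the standard definition (Kullback-Leibler divergence) and a fusion with fixed weights. The function names and the weights are hypothetical.

```python
import math


def softmax(values):
    """Turn raw similarity scores into a probability distribution."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def relative_entropy(p, q, eps=1e-12):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def fuse(semantic, structural, w_semantic, w_structural):
    """Weighted sum of two similarity maps given as flat lists of equal length."""
    return [
        w_semantic * a + w_structural * b for a, b in zip(semantic, structural)
    ]
```

In a trained model the two weights would be parameters learned at the training stage, as the abstract states; here they are simply passed in by the caller.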
AMIL: Adversarial Multi Instance Learning for Human Pose Estimation
Human pose estimation has an important impact on a wide range of applications
from human-computer interface to surveillance and content-based video
retrieval. For human pose estimation, joint occlusions and overlapping
human bodies lead to inaccurate pose estimates. To address these problems, by
integrating priors of the structure of human bodies, we present a novel
structure-aware network to discreetly consider such priors during the training
of the network. Typically, learning such constraints is a challenging task.
Instead, we propose generative adversarial networks as our learning model, in
which we design two residual multiple instance learning (MIL) models with
identical architecture: one is used as the generator and the other as the
discriminator. The discriminator's task is to distinguish the actual poses
from the fake ones. If the pose generator generates results that the
discriminator is not able to distinguish from the real ones, the model has
successfully learnt the priors. In the proposed model, the discriminator
differentiates the ground-truth heatmaps from the generated ones, and later the
adversarial loss back-propagates to the generator. This procedure helps the
generator learn reasonable body configurations and proves advantageous for
improving the pose estimation accuracy. Meanwhile, we propose a
novel function for MIL. It is an adjustable structure for both instance
selection and modeling to appropriately pass the information between instances
in a single bag. In the proposed residual MIL neural network, the pooling
action adequately updates the instance contribution to its bag. The proposed
adversarial residual multi-instance neural network, which is based on pooling,
has been validated on two datasets for the human pose estimation task and
outperforms other state-of-the-art models.
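The adversarial training described in this abstract, where a discriminator separates ground-truth heatmaps from generated ones and its loss is propagated back to the generator, can be sketched with the standard binary cross-entropy GAN objective. The abstract does not specify the exact loss, so this formulation is an assumption. The inputs are the discriminator's output probabilities that a heatmap is real, and all function names are hypothetical.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Push scores on ground-truth heatmaps towards 1 and on generated ones towards 0."""
    real = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real + fake


def generator_adversarial_loss(d_fake):
    """The generator is rewarded when the discriminator scores its heatmaps as real."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
```

In practice the generator's total loss would also include a supervised heatmap regression term. The adversarial term above only captures the part that encourages plausible body configurations.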