3 research outputs found
Improved Hard Example Mining by Discovering Attribute-based Hard Person Identity
In this paper, we propose Hard Person Identity Mining (HPIM) that attempts to
refine the hard example mining to improve the exploration efficacy in person
re-identification. It is motivated by the following observation: the more
attributes people share, the more difficult it is to separate their identities.
Based on this observation, we develop HPIM via a transferred attribute
describer, a deep multi-attribute classifier trained from the source noisy
person attribute datasets. We encode each image in the target person re-ID
dataset into a probabilistic attribute description. Then, in the attribute
code space, we model each person as a distribution that generates their
view-specific attribute codes in different practical scenarios. Hence we
estimate the person-specific statistical moments from zeroth to higher order,
which are further used to calculate the central moment discrepancies between
persons. This discrepancy provides a basis for choosing hard identities to organize
proper mini-batches, regardless of how the person representation changes during
metric learning. It serves as a complementary tool to hard example mining,
which helps to explore the global instead of the local hard example constraint
in the mini-batch built by randomly sampled identities. Extensive experiments
on two person re-identification benchmarks validate the effectiveness of our
proposed algorithm.
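The identity-selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes attribute codes are probability vectors in [0, 1], it assumes one common form of central moment discrepancy (the distance between means plus the distances between per-attribute central moments up to a chosen order), and it starts from the first-order moment. The names `cmd`, `hardest_identities`, and `max_order` are hypothetical.

```python
import math


def mean_vector(codes):
    """Per-attribute mean over a list of equal-length attribute codes."""
    n = len(codes)
    return [sum(column) / n for column in zip(*codes)]


def central_moment(codes, mean, order):
    """Per-attribute central moment of the given order."""
    n = len(codes)
    return [
        sum((x - m) ** order for x in column) / n
        for column, m in zip(zip(*codes), mean)
    ]


def l2(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cmd(codes_a, codes_b, max_order=3):
    """Assumed discrepancy: mean distance plus central-moment distances.

    Codes are assumed to lie in [0, 1], so no range normalisation is applied.
    """
    mean_a, mean_b = mean_vector(codes_a), mean_vector(codes_b)
    total = l2(mean_a, mean_b)
    for k in range(2, max_order + 1):
        total += l2(
            central_moment(codes_a, mean_a, k),
            central_moment(codes_b, mean_b, k),
        )
    return total


def hardest_identities(codes_by_id, anchor_id, num_hard):
    """Return the identities whose attribute statistics are closest to the anchor."""
    others = [pid for pid in codes_by_id if pid != anchor_id]
    others.sort(key=lambda pid: cmd(codes_by_id[anchor_id], codes_by_id[pid]))
    return others[:num_hard]


# Toy example: two attribute probabilities per image, two images per identity.
codes = {
    "a": [[0.9, 0.1], [0.8, 0.2]],
    "b": [[0.85, 0.15], [0.8, 0.2]],
    "c": [[0.1, 0.9], [0.2, 0.8]],
}
print(hardest_identities(codes, "a", 1))
```

In this toy example, identity "b" shares nearly the same attribute statistics as "a", so it would be placed in the same mini-batch as a hard identity, while "c" would not. Because the discrepancy depends only on the fixed attribute codes, the ranking can be computed once, before metric learning begins.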
\emph{cm}SalGAN: RGB-D Salient Object Detection with Cross-View Generative Adversarial Networks
Image salient object detection (SOD) is an active research topic in the computer
vision and multimedia areas. Fusing complementary information from RGB and depth
has been shown to be effective for image salient object detection, a setting
known as the RGB-D salient object detection problem. The main challenge for
RGB-D salient object detection is how to exploit the salient cues of both
intra-modality (RGB, depth) and cross-modality simultaneously, which is known as
the cross-modality detection problem. In this paper, we tackle this challenge by
designing a novel cross-modality Saliency Generative Adversarial Network
(\emph{cm}SalGAN). \emph{cm}SalGAN aims to learn an optimal view-invariant and
consistent pixel-level representation for RGB and depth images via a novel
adversarial learning framework, which incorporates both intra-view information
and cross-view correlation information simultaneously for the
RGB-D saliency detection problem. To further improve the detection results, an
attention mechanism and an edge detection module are also incorporated into
\emph{cm}SalGAN. The entire \emph{cm}SalGAN can be trained in an end-to-end
manner using a standard deep neural network framework. Experimental
results show that \emph{cm}SalGAN achieves new state-of-the-art RGB-D
saliency detection performance on several benchmark datasets.
Comment: Accepted by IEEE Transactions on Multimedia
Tracking by Joint Local and Global Search: A Target-aware Attention based Approach
Tracking-by-detection is a very popular framework for single object tracking
that searches for the target object within a local search window in
each frame. Although such a local search mechanism works well on simple videos,
it makes the trackers sensitive to extremely challenging scenarios,
such as heavy occlusion and fast motion. In this paper, we propose a novel and
general target-aware attention mechanism (termed TANet) and integrate it with
the tracking-by-detection framework to conduct joint local and global search for
robust tracking. Specifically, we extract the features of the target object patch
and continuous video frames, then concatenate them and feed them into a decoder
network to generate target-aware global attention maps. More importantly, we
resort to adversarial training for better attention prediction. Appearance
and motion discriminator networks are designed to ensure the consistency of the
predicted attention in the spatial and temporal views. In the tracking procedure, we integrate the
target-aware attention with multiple trackers by exploring candidate search
regions for robust tracking. Extensive experiments on both short-term and
long-term tracking benchmark datasets validate the effectiveness of our
algorithm. The project page of this paper can be found at
\url{https://sites.google.com/view/globalattentiontracking/home/extend}.
Comment: Accepted by IEEE TNNLS 202
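The idea of joint local and global search from this abstract can be illustrated with a small sketch. It is a hypothetical simplification, not the TANet pipeline: it assumes the attention map is already available as a 2D grid of scores, and it proposes search-window centers by keeping the previous target position (local search) and adding the highest-attention cells that lie far enough from the centers already chosen (global search). The names `candidate_centers`, `num_global`, and `min_dist` are assumptions made for illustration.

```python
def candidate_centers(attention, prev_center, num_global=1, min_dist=2):
    """Propose search-window centers for one frame.

    The first center is the previous target position (local search). The
    remaining centers are the highest-scoring attention cells that are at
    least `min_dist` cells (Chebyshev distance) from every chosen center.
    """
    cells = [
        (attention[r][c], (r, c))
        for r in range(len(attention))
        for c in range(len(attention[0]))
    ]
    # Highest attention first.
    cells.sort(key=lambda item: item[0], reverse=True)

    centers = [prev_center]
    for _, (r, c) in cells:
        if len(centers) > num_global:
            break
        far_enough = all(
            max(abs(r - cr), abs(c - cc)) >= min_dist for cr, cc in centers
        )
        if far_enough:
            centers.append((r, c))
    return centers


# Toy 4x4 attention map with a strong response away from the previous position.
attention_map = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.9],
    [0.1, 0.1, 0.1, 0.1],
]
print(candidate_centers(attention_map, prev_center=(1, 1)))
```

A base tracker would then evaluate candidates in the window around each returned center, so a target that reappears far from its last position after occlusion or fast motion can still be found.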