Semantics-Aligned Representation Learning for Person Re-identification
Person re-identification (reID) aims to match person images to retrieve the
ones with the same identity. This is a challenging task, as the images to be
matched are generally semantically misaligned due to the diversity of human
poses and capture viewpoints, incompleteness of the visible bodies (due to
occlusion), etc. In this paper, we propose a framework that drives the reID
network to learn semantics-aligned feature representations through carefully
designed supervision. Specifically, we build a Semantics Aligning Network (SAN)
which consists of a base network as encoder (SA-Enc) for reID, and a decoder
(SA-Dec) for reconstructing/regressing the densely semantics-aligned full
texture image. We jointly train the SAN under the supervision of person
re-identification and aligned texture generation. Moreover, at the decoder,
besides the reconstruction loss, we add Triplet ReID constraints over the
feature maps as perceptual losses. The decoder is discarded at inference, so
our scheme is computationally efficient. Ablation studies demonstrate the
effectiveness of our design. We achieve state-of-the-art performance on the
benchmark datasets CUHK03, Market1501, MSMT17, and the partial person reID
dataset Partial REID. Code for our proposed method is
available at:
https://github.com/microsoft/Semantics-Aligned-Representation-Learning-for-Person-Re-identification
Comment: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20);
code has been released
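
The training objective described in the abstract above combines a reID loss on the encoder with a reconstruction loss and a triplet constraint at the decoder. The following is a minimal sketch of how such terms might be combined, not the authors' implementation. The L1 form of the reconstruction loss, the margin of 0.3, the weights `w_rec` and `w_tri`, and all function names are assumptions made for illustration; features are plain Python lists rather than feature maps.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss; the margin value is an assumption."""
    return max(0.0, l2(anchor, positive) - l2(anchor, negative) + margin)


def l1_reconstruction(pred, target):
    """Mean absolute error between predicted and target texture values.
    The abstract does not state the form of the reconstruction loss;
    L1 is assumed here."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def joint_loss(reid_loss, pred_texture, target_texture, decoder_triplet,
               w_rec=1.0, w_tri=1.0):
    """Weighted sum of the encoder reID loss, the decoder reconstruction
    loss, and the decoder triplet term. The weights are hypothetical."""
    rec = l1_reconstruction(pred_texture, target_texture)
    return reid_loss + w_rec * rec + w_tri * decoder_triplet


if __name__ == "__main__":
    tri = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
    print(joint_loss(0.5, [1.0, 2.0], [1.0, 4.0], tri))
```

Because the decoder terms only appear in the training loss, dropping the decoder at inference leaves the encoder features unchanged, which is the efficiency argument the abstract makes.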
Cycle-Consistent Deep Generative Hashing for Cross-Modal Retrieval
In this paper, we propose a novel deep generative approach to cross-modal
retrieval to learn hash functions in the absence of paired training samples
through a cycle consistency loss. Our approach employs an adversarial
training scheme to learn a pair of hash functions that enable translation between
modalities while assuming an underlying semantic relationship. To endow the
hash codes with the semantics of each input-output pair, a cycle consistency loss
is further imposed on top of the adversarial training to strengthen the correlations
between inputs and corresponding outputs. Our approach is generative: it learns
hash functions such that the learned hash codes maximally correlate each
input-output correspondence and can also regenerate the inputs, so as to
minimize the information loss. Learning the hash embedding thus jointly
optimizes the parameters of the hash functions across modalities as
well as the associated generative models. Extensive experiments on a variety of
large-scale cross-modal data sets demonstrate that our proposed method achieves
better retrieval results than state-of-the-art methods.
Comment: To appear in IEEE Trans. Image Processing. arXiv admin note: text
overlap with arXiv:1703.10593 by other authors
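
The cycle consistency idea in the abstract above can be illustrated with a small sketch: translate a sample to the other modality and back, then penalize the difference from the original. This is only an illustration under stated assumptions, not the paper's model. The linear maps `f_ab` and `g_ba` stand in for the learned networks, the L1 penalty and sign-based binarization are assumed, and the adversarial part of the training is omitted entirely.

```python
def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def hash_code(vec):
    """Binarize a real-valued embedding into a {-1, +1} code by sign.
    Sign thresholding is an assumption; zero maps to +1."""
    return [1 if x >= 0 else -1 for x in vec]


def l1_distance(a, b):
    """Sum of absolute differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cycle_consistency_loss(x_a, x_b, f_ab, g_ba):
    """L1 cycle loss over both directions.

    f_ab maps modality A to modality B and g_ba maps B back to A.
    Both are hypothetical linear stand-ins for learned generators.
    """
    a_cycled = matvec(g_ba, matvec(f_ab, x_a))  # A -> B -> A
    b_cycled = matvec(f_ab, matvec(g_ba, x_b))  # B -> A -> B
    return l1_distance(a_cycled, x_a) + l1_distance(b_cycled, x_b)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    doubled = [[2.0, 0.0], [0.0, 2.0]]
    print(cycle_consistency_loss([1.0, -1.0], [1.0, 1.0], doubled, identity))
    print(hash_code(matvec(doubled, [1.0, -1.0])))
```

Note that the loss needs no paired samples: `x_a` and `x_b` are each compared only with their own round-trip reconstruction, which is why this kind of term suits the unpaired setting the abstract describes.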
Support Neighbor Loss for Person Re-Identification
Person re-identification (re-ID) has recently been tremendously boosted due
to the advancement of deep convolutional neural networks (CNN). The majority of
deep re-ID methods focus on designing new CNN architectures, while less
attention is paid to investigating the loss functions. Verification loss and
identification loss are two types of losses widely used to train various deep
re-ID models, but both have limitations. Verification loss guides
the networks to generate feature embeddings whose intra-class variance
is decreased while the inter-class variance is enlarged. However, training networks
with verification loss tends to converge slowly and perform unstably
when the number of training samples is large. On the other hand, identification
loss has good separability and scalability, but it does not explicitly
reduce the intra-class variance, which limits its performance on re-ID, because the
same person may have significant appearance disparity across different camera
views. To avoid the limitations of the two types of losses, we propose a new
loss, called support neighbor (SN) loss. Rather than being derived from data
sample pairs or triplets, SN loss is calculated based on the positive and
negative support neighbor sets of each anchor sample, which contain more
valuable contextual information and neighborhood structure that are beneficial
for more stable performance. To ensure scalability and separability, a
softmax-like function is formulated to push apart the positive and negative
support sets. To reduce intra-class variance, the distance between the anchor's
nearest positive neighbor and furthest positive sample is penalized.
With SN loss integrated on top of ResNet50, re-ID results superior to the
state of the art are obtained on several widely used datasets.
Comment: Accepted by ACM Multimedia (ACM MM) 201
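
The two ingredients of SN loss named in the abstract above, a softmax-like separation term and a penalty on the spread of positive distances, can be sketched as follows. This is one possible reading of the description, not the published formulation. The exact form of the softmax term, the `temperature` and `lam` parameters, and the function names are assumptions, and the support neighbor sets are taken as given inputs rather than selected by a neighbor search.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sn_loss_sketch(anchor, positives, negatives, temperature=1.0, lam=0.1):
    """Return (separation, squeeze, total) for one anchor.

    separation: negative log of the share of softmax mass that falls on
                the positive support set (assumed form).
    squeeze:    distance to the furthest positive minus distance to the
                nearest positive, penalizing intra-class spread.
    """
    d_pos = [euclid(anchor, p) for p in positives]
    d_neg = [euclid(anchor, n) for n in negatives]

    # Closer samples receive more mass; negatives dilute the positive share.
    pos_mass = sum(math.exp(-d / temperature) for d in d_pos)
    neg_mass = sum(math.exp(-d / temperature) for d in d_neg)
    separation = -math.log(pos_mass / (pos_mass + neg_mass))

    squeeze = max(d_pos) - min(d_pos)
    return separation, squeeze, separation + lam * squeeze


if __name__ == "__main__":
    sep, sq, total = sn_loss_sketch(
        anchor=[0.0, 0.0],
        positives=[[0.0, 0.0], [3.0, 4.0]],
        negatives=[[6.0, 8.0]],
    )
    print(sep, sq, total)
```

In this sketch the separation term shrinks as negatives move away from the anchor, while the squeeze term is zero only when all positives are equidistant from it, so the two terms address the separability and intra-class variance concerns separately.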
- …