949 research outputs found
Person Re-identification with Correspondence Structure Learning
This paper addresses the problem of handling spatial misalignments due to
camera-view changes or human-pose variations in person re-identification. We
first introduce a boosting-based approach to learn a correspondence structure
which indicates the patch-wise matching probabilities between images from a
target camera pair. The learned correspondence structure can not only capture
the spatial correspondence pattern between cameras but also handle the
viewpoint or human-pose variation in individual images. We further introduce a
global-based matching process. It integrates a global matching constraint over
the learned correspondence structure to exclude cross-view misalignments during
the image patch matching process, hence achieving a more reliable matching
score between images. Experimental results on various datasets demonstrate the
effectiveness of our approach
Learning Correspondence Structures for Person Re-identification
This paper addresses the problem of handling spatial misalignments due to
camera-view changes or human-pose variations in person re-identification. We
first introduce a boosting-based approach to learn a correspondence structure
which indicates the patch-wise matching probabilities between images from a
target camera pair. The learned correspondence structure can not only capture
the spatial correspondence pattern between cameras but also handle the
viewpoint or human-pose variation in individual images. We further introduce a
global constraint-based matching process. It integrates a global matching
constraint over the learned correspondence structure to exclude cross-view
misalignments during the image patch matching process, hence achieving a more
reliable matching score between images. Finally, we also extend our approach by
introducing a multi-structure scheme, which learns a set of local
correspondence structures to capture the spatial correspondence sub-patterns
between a camera pair, so as to handle the spatial misalignments between
individual images in a more precise way. Experimental results on various
datasets demonstrate the effectiveness of our approach.Comment: IEEE Trans. Image Processing, vol. 26, no. 5, pp. 2438-2453, 2017.
The project page for this paper is available at
http://min.sjtu.edu.cn/lwydemo/personReID.htm arXiv admin note: text overlap
with arXiv:1504.0624
Support Neighbor Loss for Person Re-Identification
Person re-identification (re-ID) has recently been tremendously boosted due
to the advancement of deep convolutional neural networks (CNN). The majority of
deep re-ID methods focus on designing new CNN architectures, while less
attention is paid on investigating the loss functions. Verification loss and
identification loss are two types of losses widely used to train various deep
re-ID models, both of which however have limitations. Verification loss guides
the networks to generate feature embeddings of which the intra-class variance
is decreased while the inter-class ones is enlarged. However, training networks
with verification loss tends to be of slow convergence and unstable performance
when the number of training samples is large. On the other hand, identification
loss has good separating and scalable property. But its neglect to explicitly
reduce the intra-class variance limits its performance on re-ID, because the
same person may have significant appearance disparity across different camera
views. To avoid the limitations of the two types of losses, we propose a new
loss, called support neighbor (SN) loss. Rather than being derived from data
sample pairs or triplets, SN loss is calculated based on the positive and
negative support neighbor sets of each anchor sample, which contain more
valuable contextual information and neighborhood structure that are beneficial
for more stable performance. To ensure scalability and separability, a
softmax-like function is formulated to push apart the positive and negative
support sets. To reduce intra-class variance, the distance between the anchor's
nearest positive neighbor and furthest positive sample is penalized.
Integrating SN loss on top of Resnet50, superior re-ID results to the
state-of-the-art ones are obtained on several widely used datasets.Comment: Accepted by ACM Multimedia (ACM MM) 201
Beyond Frontal Faces: Improving Person Recognition Using Multiple Cues
We explore the task of recognizing peoples' identities in photo albums in an
unconstrained setting. To facilitate this, we introduce the new People In Photo
Albums (PIPA) dataset, consisting of over 60000 instances of 2000 individuals
collected from public Flickr photo albums. With only about half of the person
images containing a frontal face, the recognition task is very challenging due
to the large variations in pose, clothing, camera viewpoint, image resolution
and illumination. We propose the Pose Invariant PErson Recognition (PIPER)
method, which accumulates the cues of poselet-level person recognizers trained
by deep convolutional networks to discount for the pose variations, combined
with a face recognizer and a global recognizer. Experiments on three different
settings confirm that in our unconstrained setup PIPER significantly improves
on the performance of DeepFace, which is one of the best face recognizers as
measured on the LFW dataset
Structured learning of metric ensembles with application to person re-identification
Matching individuals across non-overlapping camera networks, known as person
re-identification, is a fundamentally challenging problem due to the large
visual appearance changes caused by variations of viewpoints, lighting, and
occlusion. Approaches in literature can be categoried into two streams: The
first stream is to develop reliable features against realistic conditions by
combining several visual features in a pre-defined way; the second stream is to
learn a metric from training data to ensure strong inter-class differences and
intra-class similarities. However, seeking an optimal combination of visual
features which is generic yet adaptive to different benchmarks is a unsoved
problem, and metric learning models easily get over-fitted due to the scarcity
of training data in person re-identification. In this paper, we propose two
effective structured learning based approaches which explore the adaptive
effects of visual features in recognizing persons in different benchmark data
sets. Our framework is built on the basis of multiple low-level visual features
with an optimal ensemble of their metrics. We formulate two optimization
algorithms, CMCtriplet and CMCstruct, which directly optimize evaluation
measures commonly used in person re-identification, also known as the
Cumulative Matching Characteristic (CMC) curve.Comment: 16 pages. Extended version of "Learning to Rank in Person
Re-Identification With Metric Ensembles", at
http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Paisitkriangkrai_Learning_to_Rank_2015_CVPR_paper.html.
arXiv admin note: text overlap with arXiv:1503.0154
- …