Person re-identification via efficient inference in fully connected CRF
In this paper, we address the problem of person re-identification,
i.e., retrieving instances from a gallery that were generated by the same person
as the given probe image. This is very challenging because the person's
appearance usually undergoes significant variations due to changes in
illumination, camera angle and view, background clutter, and occlusion over the
camera network. In this paper, we assume that the matched gallery images should
not only be similar to the probe, but also be similar to each other, under
a suitable metric. We express this assumption with a fully connected CRF model in
which each node corresponds to a gallery image and every pair of nodes is connected
by an edge. A label variable is associated with each node to indicate whether
the corresponding image is from the target person. We define a unary potential for
each node using existing feature calculation and matching techniques, which
reflects the similarity between the probe and the gallery image, and define a pairwise
potential for each edge as a weighted combination of Gaussian kernels,
which encodes appearance similarity between a pair of gallery images. The specific
form of the pairwise potential allows us to exploit an efficient inference
algorithm to calculate the marginal distribution of each label variable for
this densely connected CRF. We show the superiority of our method by applying it
to public datasets and comparing with the state of the art.
Comment: 7 pages, 4 figures
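The inference step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Potts-style label compatibility, squared-exponential kernels over generic appearance features, and a naive O(N^2) mean-field update rather than the efficient inference algorithm the abstract refers to. The function name `mean_field_dense_crf` and all parameters are hypothetical.

```python
import numpy as np

def _softmax(x):
    # Row-wise softmax with max subtraction for numerical stability.
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)

def mean_field_dense_crf(unary, feats, weights=(1.0,), bandwidths=(1.0,), n_iters=10):
    """Approximate marginals for a fully connected binary CRF (illustrative only).

    unary: (N, 2) energies per gallery image; column 1 = "same person as probe".
    feats: (N, D) appearance features of the gallery images.
    weights, bandwidths: assumed parameters of the Gaussian kernel mixture.
    """
    unary = np.asarray(unary, dtype=float)
    feats = np.asarray(feats, dtype=float)
    # Pairwise squared distances between gallery features.
    sq = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=-1)
    # Weighted combination of Gaussian kernels; no self-connections.
    K = sum(w * np.exp(-sq / (2.0 * b ** 2)) for w, b in zip(weights, bandwidths))
    np.fill_diagonal(K, 0.0)
    q = _softmax(-unary)  # initialise marginals from the unary term alone
    for _ in range(n_iters):
        msg = K @ q  # msg[i, l] = sum_j K[i, j] * q[j, l]
        # Potts penalty: cost of label l is the kernel mass on the other label.
        pairwise = msg.sum(axis=1, keepdims=True) - msg
        q = _softmax(-(unary + pairwise))
    return q
```

In this sketch, a gallery image with an ambiguous unary score is pulled towards the label of the gallery images it resembles, which is the assumption the abstract states: matched gallery images should be similar to each other as well as to the probe.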
Ensemble of Different Approaches for a Reliable Person Re-identification System
An ensemble of approaches for reliable person re-identification is proposed in this paper. The proposed ensemble is built combining widely used person re-identification systems using different color spaces and some variants of state-of-the-art approaches that are proposed in this paper. Different descriptors are tested, and both texture and color features are extracted from the images; then the different descriptors are compared using different distance measures (e.g., the Euclidean distance, angle, and the Jeffrey distance). To improve performance, a method based on skeleton detection, extracted from the depth map, is also applied when the depth map is available. The proposed ensemble is validated on three widely used datasets (CAVIAR4REID, IAS, and VIPeR), keeping the same parameter set of each approach constant across all tests to avoid overfitting and to demonstrate that the proposed system can be considered a general-purpose person re-identification system. Our experimental results show that the proposed system offers significant improvements over baseline approaches. The source code used for the approaches tested in this paper will be available at https://www.dei.unipd.it/node/2357 and http://robotics.dei.unipd.it/reid/
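The three distance measures named in this abstract can be sketched as follows. The definitions are the common textbook ones and are an assumption here: the abstract does not give the exact formulas used, and the Jeffrey distance in particular is taken to be the symmetric Jeffrey divergence between normalised histograms. Function names are hypothetical.

```python
import numpy as np

def euclidean_distance(a, b):
    # L2 distance between two descriptor vectors.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

def angle_distance(a, b):
    # Angle in radians between two non-zero descriptor vectors.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def jeffrey_divergence(h, k, eps=1e-12):
    # Symmetric divergence between two non-negative histograms
    # (assumed definition; eps guards against log(0)).
    h, k = np.asarray(h, dtype=float), np.asarray(k, dtype=float)
    h, k = h / h.sum(), k / k.sum()
    m = (h + k) / 2.0
    return float(np.sum(h * np.log((h + eps) / (m + eps))
                        + k * np.log((k + eps) / (m + eps))))
```

An ensemble in the spirit of the abstract would compute several such distances per descriptor and fuse the resulting scores; the fusion rule is not stated in the abstract and is left out of the sketch.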
Person Re-identification by Local Maximal Occurrence Representation and Metric Learning
Person re-identification is an important technique towards automatic search
of a person's presence in a surveillance video. Two fundamental problems are
critical for person re-identification: feature representation and metric
learning. An effective feature representation should be robust to illumination
and viewpoint changes, and a discriminant metric should be learned to match
various person images. In this paper, we propose an effective feature
representation called Local Maximal Occurrence (LOMO), and a subspace and
metric learning method called Cross-view Quadratic Discriminant Analysis
(XQDA). The LOMO feature analyzes the horizontal occurrence of local features,
and maximizes the occurrence to make a stable representation against viewpoint
changes. Besides, to handle illumination variations, we apply the Retinex
transform and a scale invariant texture operator. To learn a discriminant
metric, we propose to learn a discriminant low dimensional subspace by
cross-view quadratic discriminant analysis, and simultaneously, a QDA metric is
learned on the derived subspace. We also present a practical computation method
for XQDA, as well as its regularization. Experiments on four challenging person
re-identification databases, VIPeR, QMUL GRID, CUHK Campus, and CUHK03, show
that the proposed method improves the state-of-the-art rank-1 identification
rates by 2.2%, 4.88%, 28.91%, and 31.55% on the four databases, respectively.
Comment: This paper has been accepted by CVPR 2015. For source code and
extracted features please visit
http://www.cbsr.ia.ac.cn/users/scliao/projects/lomo_xqda
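The rank-1 identification rate reported above is the first point of the cumulative matching characteristic (CMC) curve. The sketch below shows one common way to compute it from a probe-by-gallery distance matrix. It assumes a single-shot setting in which every probe has exactly one true match in the gallery; the function name `cmc_curve` is hypothetical and the evaluation protocol of the paper itself may differ.

```python
import numpy as np

def cmc_curve(dist, probe_ids, gallery_ids):
    """Return cmc where cmc[k-1] is the fraction of probes whose true match
    appears among the k nearest gallery images (illustrative only).

    dist: (P, G) distances between P probes and G gallery images.
    Assumes every probe identity occurs exactly once in gallery_ids.
    """
    dist = np.asarray(dist, dtype=float)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    order = np.argsort(dist, axis=1)            # nearest gallery image first
    ranked_ids = gallery_ids[order]             # (P, G) identities by rank
    hits = ranked_ids == probe_ids[:, None]
    first_hit = hits.argmax(axis=1)             # 0-based rank of the true match
    cmc = np.zeros(dist.shape[1])
    for r in first_hit:
        cmc[r:] += 1.0                          # counted at rank r and beyond
    return cmc / len(probe_ids)
```

The rank-1 rate is then `cmc_curve(dist, probe_ids, gallery_ids)[0]`, where `dist` would come from whatever learned metric is being evaluated.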
BiCov: a novel image representation for person re-identification and face verification
This paper proposes a novel image representation which can properly handle both background and illumination variations. It is therefore well suited to person and face re-identification tasks, avoiding the use of any additional pre-processing steps such as foreground-background separation or face and body part segmentation. This novel representation relies on the combination of Biologically Inspired Features (BIF) and covariance descriptors used to compute the similarity of the BIF features at neighboring scales. Hence, we will refer to it as the BiCov representation. To show the effectiveness of BiCov, this paper conducts experiments on two person re-identification tasks (VIPeR and ETHZ) and one face verification task (LFW), on which it improves the current state-of-the-art performance.
Re-identifying people in the crowd
Developing an automated surveillance system is of great interest for various reasons including forensic and security applications. In the case of a network of surveillance cameras with non-overlapping fields of view, person detection and tracking alone are insufficient to track a subject of interest across the network. In this case, instances of a person captured in one camera view need to be retrieved among a gallery of different people, in other camera views. This vision problem is commonly known as person re-identification (re-id).
Cross-view instances of pedestrians exhibit varying degrees of illumination, viewpoint, and pose variation, which makes the problem very challenging. Despite recent progress towards improving accuracy, existing systems suffer from low applicability to real-world scenarios. This is mainly caused by the need for large amounts of annotated data from pairwise camera views to be available for training. Given the difficulty of obtaining such data and annotating it, this thesis aims to bring the person re-id problem a step closer to real-world deployment.
In the first contribution, the single-shot protocol, where each individual is represented by a pair of images that need to be matched, is considered. Following the extensive annotation of four datasets for six attributes, an evaluation of the most widely used feature extraction schemes is conducted. The results reveal two high-performing descriptors among those evaluated, and show illumination variation to have the most impact on re-id accuracy.
Motivated by the wide availability of videos from surveillance cameras and the additional visual and temporal information they provide, video-based person re-id is then investigated, and a supervised system is developed. This is achieved by improving and extending the best performing image-based person descriptor into three dimensions and combining it with distance metric learning. The system obtained achieves state-of-the-art results on two widely used datasets.
Given the cost and difficulty of obtaining labelled data from pairwise cameras in a network to train the model, an unsupervised video-based person re-id method is also developed. It is based on a set-based distance measure that leverages rank vectors to estimate the similarity scores between person tracklets. The proposed system outperforms other unsupervised methods by a large margin on two datasets while remaining competitive with deep learning methods on another large-scale dataset.