Person Re-identification by Local Maximal Occurrence Representation and Metric Learning
Person re-identification is an important technique towards automatic search
of a person's presence in a surveillance video. Two fundamental problems are
critical for person re-identification: feature representation and metric
learning. An effective feature representation should be robust to illumination
and viewpoint changes, and a discriminant metric should be learned to match
various person images. In this paper, we propose an effective feature
representation called Local Maximal Occurrence (LOMO), and a subspace and
metric learning method called Cross-view Quadratic Discriminant Analysis
(XQDA). The LOMO feature analyzes the horizontal occurrence of local features,
and maximizes the occurrence to make a stable representation against viewpoint
changes. Besides, to handle illumination variations, we apply the Retinex
transform and a scale invariant texture operator. To learn a discriminant
metric, we propose to learn a discriminant low dimensional subspace by
cross-view quadratic discriminant analysis, and simultaneously, a QDA metric is
learned on the derived subspace. We also present a practical computation method
for XQDA, as well as its regularization. Experiments on four challenging person
re-identification databases, VIPeR, QMUL GRID, CUHK Campus, and CUHK03, show
that the proposed method improves the state-of-the-art rank-1 identification
rates by 2.2%, 4.88%, 28.91%, and 31.55% on the four databases, respectively.
Comment: This paper has been accepted by CVPR 2015. For source code and
extracted features, please visit
http://www.cbsr.ia.ac.cn/users/scliao/projects/lomo_xqda
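The max-occurrence idea in the abstract above can be illustrated with a short Python sketch that pools local histograms along one horizontal strip of an image. This is a simplified sketch under stated assumptions, not the authors' implementation: the helper names (`local_histogram`, `max_occurrence`), the flat list-of-pixel-values patch format, and the bin count are all hypothetical, and the Retinex and texture-operator steps are omitted.

```python
from typing import List, Sequence


def local_histogram(patch: Sequence[int], n_bins: int, max_value: int = 256) -> List[float]:
    """Normalized histogram of the quantized values in one patch (hypothetical helper)."""
    if not patch:
        raise ValueError("patch must contain at least one value")
    counts = [0] * n_bins
    for v in patch:
        # Map a value in [0, max_value) to a bin index in [0, n_bins).
        counts[v * n_bins // max_value] += 1
    return [c / len(patch) for c in counts]


def max_occurrence(row_patches: Sequence[Sequence[int]], n_bins: int) -> List[float]:
    """Element-wise maximum of the local histograms of all patches in one horizontal strip.

    Because the maximum ignores where in the strip a pattern occurs,
    the result does not change when patches shift horizontally.
    """
    hists = [local_histogram(p, n_bins) for p in row_patches]
    return [max(h[b] for h in hists) for b in range(n_bins)]


# Two patches in the same strip: one half dark and half bright, one fully dark.
strip = [[0, 0, 255, 255], [0, 0, 0, 0]]
pooled = max_occurrence(strip, n_bins=2)
```

Reordering the patches within the strip leaves `pooled` unchanged, which is the kind of stability against horizontal (viewpoint-induced) shifts that the abstract attributes to maximizing the occurrence.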
Illumination Distillation Framework for Nighttime Person Re-Identification and A New Benchmark
Nighttime person Re-ID (person re-identification in the nighttime) is a very
important and challenging task for visual surveillance but it has not been
thoroughly investigated. Under low illumination, the performance
of person Re-ID methods usually deteriorates sharply. To address the low
illumination challenge in nighttime person Re-ID, this paper proposes an
Illumination Distillation Framework (IDF), which utilizes illumination
enhancement and illumination distillation schemes to promote the learning of
Re-ID models. Specifically, IDF consists of a master branch, an illumination
enhancement branch, and an illumination distillation module. The master branch
is used to extract the features from a nighttime image. The illumination
enhancement branch first estimates an enhanced image from the nighttime image
using a nonlinear curve mapping method and then extracts the enhanced features.
However, nighttime and enhanced features usually contain data noise due to
unstable lighting conditions and enhancement failures. To fully exploit the
complementary benefits of nighttime and enhanced features while suppressing
data noise, we propose an illumination distillation module. In particular, the
illumination distillation module fuses the features from two branches through a
bottleneck fusion model and then uses the fused features to guide the learning
of both branches in a distillation manner. In addition, we build a real-world
nighttime person Re-ID dataset, named Night600, which contains 600 identities
captured from different viewpoints and nighttime illumination conditions under
complex outdoor environments. Experimental results demonstrate that our IDF can
achieve state-of-the-art performance on two nighttime person Re-ID datasets
(i.e., Night600 and Knight). We will release our code and dataset at
https://github.com/Alexadlu/IDF.
Comment: Accepted by TM
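The abstract above mentions a nonlinear curve mapping for the enhancement branch but does not give its form. As a hedged illustration only, the Python sketch below applies a quadratic brightening curve, x + alpha * x * (1 - x), of the kind used in curve-based low-light enhancement. The choice of curve, the function name `curve_enhance`, and the parameters `alpha` and `iterations` are assumptions made for this example and are not taken from the IDF paper.

```python
from typing import List, Sequence


def curve_enhance(pixels: Sequence[float], alpha: float, iterations: int = 1) -> List[float]:
    """Apply the quadratic curve x + alpha * x * (1 - x) to intensities in [0, 1].

    Assumed form, for illustration only. With alpha in [-1, 1] the curve is
    monotonic and keeps values inside [0, 1]; positive alpha brightens.
    """
    if not -1.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [-1, 1] to keep outputs in [0, 1]")
    out = list(pixels)
    for _ in range(iterations):
        # Repeating the mapping gives a higher-order, stronger adjustment.
        out = [x + alpha * x * (1.0 - x) for x in out]
    return out


# Black and white are fixed points; mid-gray is brightened.
enhanced = curve_enhance([0.0, 0.5, 1.0], alpha=0.5)
```

In a learned system the curve parameters would typically be predicted per image or per pixel rather than fixed by hand; here `alpha` is a constant so the effect of the mapping is easy to see.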
Re-identifying people in the crowd
Developing an automated surveillance system is of great interest for various reasons including forensic and security applications. In the case of a network of surveillance cameras with non-overlapping fields of view, person detection and tracking alone are insufficient to track a subject of interest across the network. In this case, instances of a person captured in one camera view need to be retrieved among a gallery of different people, in other camera views. This vision problem is commonly known as person re-identification (re-id).
Cross-view instances of pedestrians exhibit varying degrees of illumination, viewpoint, and pose variation, which makes the problem very challenging. Despite recent progress towards improving accuracy, existing systems suffer from low applicability to real-world scenarios. This is mainly caused by the need for large amounts of annotated data from pairwise camera views to be available for training. Given the difficulty of obtaining such data and annotating it, this thesis aims to bring the person re-id problem a step closer to real-world deployment.
In the first contribution, the single-shot protocol, where each individual is represented by a pair of images that need to be matched, is considered. Following the extensive annotation of four datasets for six attributes, an evaluation of the most widely used feature extraction schemes is conducted. The results reveal two high-performing descriptors among those evaluated, and show illumination variation to have the most impact on re-id accuracy.
Motivated by the wide availability of videos from surveillance cameras and the additional visual and temporal information they provide, video-based person re-id is then investigated, and a supervised system is developed. This is achieved by improving and extending the best performing image-based person descriptor into three dimensions and combining it with distance metric learning. The system obtained achieves state-of-the-art results on two widely used datasets.
Given the cost and difficulty of obtaining labelled data from pairwise cameras in a network to train the model, an unsupervised video-based person re-id method is also developed. It is based on a set-based distance measure that leverages rank vectors to estimate the similarity scores between person tracklets. The proposed system outperforms other unsupervised methods by a large margin on two datasets while competing with deep learning methods on another large-scale dataset.
- …