Orientation Driven Bag of Appearances for Person Re-identification
Person re-identification (re-id) consists of associating individuals across
a camera network, which is valuable for intelligent video surveillance and has
drawn wide attention. Although person re-identification research is making
progress, it still faces some challenges such as varying poses, illumination
and viewpoints. For feature representation in re-identification, existing works
usually use low-level descriptors which do not take full advantage of body
structure information, resulting in low representation ability. To solve this
problem, this paper proposes the mid-level
body-structure based feature representation (BSFR) which introduces body
structure pyramid for codebook learning and feature pooling in the vertical
direction of the human body. Besides, varying viewpoints in the horizontal
direction of the human body usually cause the data missing problem, i.e., the
appearances obtained in different orientations of the identical person could
vary significantly. To address this problem, the orientation driven bag of
appearances (ODBoA) is proposed to utilize person orientation information
extracted by an orientation estimation technique. To properly evaluate the proposed
approach, we introduce a new re-identification dataset (Market-1203) based on
the Market-1501 dataset and propose a new re-identification dataset (PKU-Reid).
Both datasets contain multiple images captured in different body orientations
for each person. Experimental results on three public datasets and two proposed
datasets demonstrate the superiority of the proposed approach, indicating the
effectiveness of body structure and orientation information for improving
re-identification performance.
Comment: 13 pages, 15 figures, 3 tables, submitted to IEEE Transactions on
Circuits and Systems for Video Technology
Robust Depth-based Person Re-identification
Person re-identification (re-id) aims to match people across non-overlapping
camera views. So far, RGB-based appearance has been widely used in most existing
works. However, when people appear under extreme illumination or change
clothes, RGB appearance-based re-id methods tend to fail. To overcome
this problem, we propose to exploit depth information to provide more invariant
body shape and skeleton information regardless of illumination and color
change. More specifically, we exploit depth voxel covariance descriptor and
further propose a locally rotation invariant depth shape descriptor called
Eigen-depth feature to describe pedestrian body shape. We prove that the
distance between any two covariance matrices on the Riemannian manifold is
equivalent to the Euclidean distance between the corresponding Eigen-depth
features. Furthermore, we propose a kernelized implicit feature transfer scheme
to estimate Eigen-depth feature implicitly from RGB image when depth
information is not available. We find that combining the estimated depth
features with RGB-based appearance features can sometimes help to better reduce
visual ambiguities of appearance features caused by illumination and similar
clothes. The effectiveness of our models was validated on publicly available
depth pedestrian datasets as compared to related methods for person
re-identification.
Comment: IEEE Transactions on Image Processing Early Access
Large Margin Learning in Set to Set Similarity Comparison for Person Re-identification
Person re-identification (Re-ID) aims at matching images of the same person
across disjoint camera views, which is a challenging problem in multimedia
analysis, multimedia editing and content-based media retrieval communities. The
major challenge lies in how to preserve similarity of the same person across
video footages with large appearance variations, while discriminating different
individuals. To address this problem, conventional methods usually consider the
pairwise similarity between persons by only measuring the point to point (P2P)
distance. In this paper, we propose to use a deep learning technique to model a
novel set to set (S2S) distance, in which the underlying objective focuses on
preserving the compactness of intra-class samples for each camera view, while
maximizing the margin between the intra-class set and inter-class set. The S2S
distance metric consists of three terms, namely the class-identity term,
the relative distance term and the regularization term. The class-identity term
keeps the intra-class samples within each camera view gathered together, the
relative distance term maximizes the distance between the intra-class set
and inter-class set across different camera views, and the regularization term
smooths the parameters of the deep convolutional neural network (CNN). As a
result, the final learned deep model can effectively find out the matched
target to the probe object among various candidates in the video gallery by
learning discriminative and stable feature representations. Using the CUHK01,
CUHK03, PRID2011 and Market1501 benchmark datasets, we extensively conducted
comparative evaluations to demonstrate the advantages of our method over the
state-of-the-art approaches.
Comment: Accepted by IEEE Transactions on Multimedia
Person Re-identification in Appearance Impaired Scenarios
Person re-identification is critical in surveillance applications. Current
approaches rely on appearance based features extracted from a single or
multiple shots of the target and candidate matches. These approaches are at a
disadvantage when trying to distinguish between candidates dressed in similar
colors or when targets change their clothing. In this paper we propose a
dynamics-based feature to overcome this limitation. The main idea is to capture
soft biometrics from gait and motion patterns by gathering dense short
trajectories (tracklets) which are Fisher vector encoded. To illustrate the
merits of the proposed features we introduce three new "appearance-impaired"
datasets. Our experiments on the original and the appearance impaired datasets
demonstrate the benefits of incorporating dynamics-based information with
appearance-based information to re-identification algorithms.
Comment: 10 pages
GAN-based Pose-aware Regulation for Video-based Person Re-identification
Video-based person re-identification deals with the inherent difficulty of
matching unregulated sequences with different lengths and with incomplete target
pose/viewpoint structure. Common approaches operate either by reducing the
problem to the still images case, facing a significant information loss, or by
exploiting inter-sequence temporal dependencies as in Siamese Recurrent Neural
Networks or in gait analysis. However, in all cases, the inter-sequences
pose/viewpoint misalignment is not considered, and the existing spatial
approaches are mostly limited to the still images context. To this end, we
propose a novel approach that can exploit more effectively the rich video
information, by accounting for the role that the changing pose/viewpoint factor
plays in the sequences matching process. Specifically, our approach consists of
two components. The first one attempts to complement the original
pose-incomplete information carried by the sequences with synthetic
GAN-generated images, and fuse their feature vectors into a more discriminative
viewpoint-insensitive embedding, namely Weighted Fusion (WF). The second one
performs an explicit pose-based alignment of sequence pairs to promote coherent
feature matching, namely Weighted-Pose Regulation (WPR). Extensive experiments
on two large video-based benchmark datasets show that our approach considerably
outperforms existing methods.
Recognizing Partial Biometric Patterns
Biometric recognition on partially captured targets is challenging, since only
several partial observations of objects are available for matching. In this
area, deep learning based methods are widely applied to match such partially
captured objects, which result from occlusions, posture variations or subjects
being partly out of view, in person re-identification and partial face
recognition. However, most current methods are not able to identify an
individual when some parts of the object are not obtainable, while the rest are
specialized to
certain constrained scenarios. To this end, we propose a robust general
framework for arbitrary biometric matching scenarios without the limitations of
alignment as well as the size of inputs. We introduce a feature post-processing
step to handle the feature maps from FCN and a dictionary learning based
Spatial Feature Reconstruction (SFR) to match different sized feature maps in
this work. Moreover, the batch hard triplet loss function is applied to
optimize the model. The applicability and effectiveness of the proposed method
are demonstrated by the results from experiments on three person
re-identification datasets (Market1501, CUHK03, DukeMTMC-reID), two partial
person datasets (Partial REID and Partial iLIDS) and two partial face datasets
(CASIA-NIR-Distance and Partial LFW), on which state-of-the-art performance is
achieved in comparison with several state-of-the-art approaches. The code is
released online and can be found on the website:
https://github.com/lingxiao-he/Partial-Person-ReID
Comment: 13 pages, 11 figures
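The batch hard triplet loss mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the plain-list embedding format and the margin value of 0.3 are assumptions chosen for illustration. For each anchor in a batch, the loss takes the farthest same-identity sample and the closest different-identity sample.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Mean over anchors of max(0, margin + hardest_positive - hardest_negative).

    embeddings: list of feature vectors (hypothetical toy format).
    labels: identity label for each embedding.
    margin: assumed value, not taken from the abstract.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        # Distances to every other sample with the same identity.
        positives = [
            euclidean(anchor, other)
            for j, (other, other_label) in enumerate(zip(embeddings, labels))
            if other_label == label and j != i
        ]
        # Distances to every sample with a different identity.
        negatives = [
            euclidean(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != label
        ]
        if not positives or not negatives:
            continue  # an anchor needs at least one positive and one negative
        losses.append(max(0.0, margin + max(positives) - min(negatives)))
    return sum(losses) / len(losses) if losses else 0.0
```

When identities are well separated in the embedding space, every term is clipped to zero and the loss vanishes. Anchors whose hardest positive lies farther away than their hardest negative contribute a positive penalty.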
PRISM: Person Re-Identification via Structured Matching
Person re-identification (re-id), an emerging problem in visual surveillance,
deals with maintaining entities of individuals whilst they traverse various
locations surveilled by a camera network. From a visual perspective re-id is
challenging due to significant changes in visual appearance of individuals in
cameras with different pose, illumination and calibration. Globally the
challenge arises from the need to maintain structurally consistent matches
among all the individual entities across different camera views. We propose
PRISM, a structured matching method to jointly account for these challenges. We
view the global problem as a weighted graph matching problem and estimate edge
weights by learning to predict them based on the co-occurrences of visual
patterns in the training examples. These co-occurrence based scores in turn
account for appearance changes by inferring likely and unlikely visual
co-occurrences appearing in training instances. We implement PRISM on single
shot and multi-shot scenarios. PRISM uniformly outperforms the state of the art
in terms of matching rate while being computationally efficient.
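To make the idea of structurally consistent matching concrete, the sketch below treats the problem as a small weighted bipartite matching with a one-to-one constraint. This is an assumption made for illustration and is not necessarily the constraint set used by PRISM. The score matrix is hypothetical and stands in for the learned edge weights. The search is brute force over permutations, which is only feasible for tiny examples; a polynomial-time assignment solver would be used in practice.

```python
from itertools import permutations


def best_one_to_one_matching(scores):
    """Return (assignment, total) maximizing the sum of matched edge weights.

    scores[i][j] is a hypothetical similarity between probe i (camera A)
    and gallery entry j (camera B). assignment[i] is the gallery index
    matched to probe i. Assumes a square score matrix.
    """
    n = len(scores)
    best_total = float("-inf")
    best_assignment = None
    for perm in permutations(range(n)):
        total = sum(scores[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total, best_assignment = total, perm
    return list(best_assignment), best_total


# Probes 0 and 1 both prefer gallery entry 0 when matched independently;
# the joint matching resolves the conflict.
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.10],
    [0.10, 0.30, 0.70],
]
assignment, total = best_one_to_one_matching(scores)
```

In this toy example, matching each probe independently would assign probes 0 and 1 to the same gallery entry. The joint matching instead sends probe 0 to entry 1 and probe 1 to entry 0, which gives a higher total score under the one-to-one constraint.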
Adaptive Re-ranking of Deep Feature for Person Re-identification
Typical person re-identification (re-ID) methods train a deep CNN to extract
deep features and combine them with a distance metric for the final evaluation.
In this work, we focus on exploiting the full information encoded in the deep
feature to boost the re-ID performance. First, we propose a Deep Feature Fusion
(DFF) method to exploit the diverse information embedded in a deep feature. DFF
treats each sub-feature as an information carrier and employs a diffusion
process to exchange their information. Second, we propose an Adaptive
Re-Ranking (ARR) method to exploit the contextual information encoded in the
features of neighbors. ARR utilizes the contextual information to re-rank the
retrieval results in an iterative manner. Particularly, it adds more contextual
information after each iteration automatically to consider more matches. Third,
we propose a strategy that combines DFF and ARR to enhance the performance.
Extensive comparative evaluations demonstrate the superiority of the proposed
methods on three large benchmarks.
Attention Driven Person Re-identification
Person re-identification (ReID) is a challenging task due to arbitrary human
pose variations, background clutter, etc. It has been studied extensively in
recent years, but the multifarious local and global features are still not
fully exploited: existing methods either ignore the interplay between
whole-body images and body-part images or lack an in-depth examination of
specific body-part images.
In this paper, we propose a novel attention-driven multi-branch network that
learns robust and discriminative human representation from global whole-body
images and local body-part images simultaneously. Within each branch, an
intra-attention network is designed to search for informative and
discriminative regions within the whole-body or body-part images, where
attention is elegantly decomposed into spatial-wise attention and channel-wise
attention for effective and efficient learning. In addition, a novel
inter-attention module is designed which fuses the output of intra-attention
networks adaptively for optimal person ReID. The proposed technique has been
evaluated over three widely used datasets CUHK03, Market-1501 and
DukeMTMC-ReID, and experiments demonstrate its superior robustness and
effectiveness as compared with the state of the art.
Comment: Accepted in Pattern Recognition (PR)
Person Re-identification Meets Image Search
For a long time, person re-identification and image search have been two
separately studied tasks. However, for person re-identification, the
effectiveness of local features and the "query-search" mode make it well suited
to image search techniques.
In the light of recent advances in image search, this paper proposes to treat
person re-identification as an image search problem. Specifically, this paper
claims two major contributions. 1) By designing an unsupervised Bag-of-Words
representation, we are devoted to bridging the gap between the two tasks by
integrating techniques from image search in person re-identification. We show
that our system sets up an effective yet efficient baseline that is amenable to
further supervised/unsupervised improvements. 2) We contribute a new high
quality dataset which uses the DPM detector and includes a number of distractor
images. Our dataset comes closer to realistic settings, and new perspectives
are provided.
Compared with approaches that rely on feature-to-feature matching, our method is
faster by over two orders of magnitude. Moreover, on three datasets, we report
competitive results compared with the state-of-the-art methods.
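The Bag-of-Words representation referred to in the abstract above can be sketched generically. Each local descriptor is assigned to its nearest codeword, and the image is summarized by a normalized histogram of codeword counts. The codebook, descriptor format and function names below are hypothetical, and the abstract does not specify the weighting and pooling details of the actual system.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bow_histogram(descriptors, codebook):
    """L1-normalized histogram of codeword assignments for one image."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy codebook with two visual words and four local descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [1.0, 1.0]]
histogram = bow_histogram(descriptors, codebook)
```

Because every image becomes a fixed-length vector, standard image search techniques can be applied to it, instead of comparing every local feature of the query against every local feature of each gallery image.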