What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification
Matching pedestrians across disjoint camera views, known as person
re-identification (re-id), is a challenging problem that is of importance to
visual recognition and surveillance. Most existing methods divide each image
into local regions and perform matching between corresponding regions.
However, they essentially extract \emph{fixed} representations
from pre-divided regions for each image and then perform matching based on the
extracted representations. Models in this pipeline cannot capture the finer local
patterns that are crucial for distinguishing positive pairs from negative ones,
and therefore underperform. In this paper, we
propose a novel deep multiplicative integration gating function, which answers
the question of \emph{what-and-where to match} for effective person re-id. To
address \emph{what} to match, our deep network emphasizes common local patterns
by learning joint representations in a multiplicative way. The network
comprises two Convolutional Neural Networks (CNNs) to extract convolutional
activations, and generates relevant descriptors for pedestrian matching. This
leads to flexible representations for pair-wise images. To address
\emph{where} to match, we combat the spatial misalignment by performing
spatially recurrent pooling via a four-directional recurrent neural network to
impose spatial dependency over all positions with respect to the entire image.
The proposed network is designed to be end-to-end trainable to characterize
local pairwise feature interactions in a spatially aligned manner. To
demonstrate the superiority of our method, extensive experiments are conducted
over three benchmark data sets: VIPeR, CUHK03 and Market-1501.
Comment: Published at Pattern Recognition, Elsevie
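As a rough illustration of the "what to match" idea in the abstract above, the following minimal sketch fuses two feature maps multiplicatively. It assumes the simplest possible reading, an element-wise product at each spatial location followed by sum pooling; the paper's actual gating function is not given in the abstract and may differ. The function name, the nested-list layout, and the toy values are all hypothetical.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # indexed as [row][col][channel]


def multiplicative_joint_descriptor(map_a: FeatureMap, map_b: FeatureMap) -> Vector:
    """Fuse two equally shaped feature maps by element-wise product,
    then sum-pool over all spatial locations (an assumed, simplified scheme)."""
    channels = len(map_a[0][0])
    pooled = [0.0] * channels
    for row_a, row_b in zip(map_a, map_b):
        for vec_a, vec_b in zip(row_a, row_b):
            for c in range(channels):
                # A channel contributes only where both images activate it.
                pooled[c] += vec_a[c] * vec_b[c]
    return pooled


# Toy 1x2 feature maps with 2 channels (hypothetical values).
image_a = [[[1.0, 0.0], [2.0, 3.0]]]
image_b = [[[4.0, 5.0], [0.5, 0.0]]]
descriptor = multiplicative_joint_descriptor(image_a, image_b)
print(descriptor)
```

Because the product is zero wherever either image has no activation, the pooled descriptor keeps only patterns shared by the pair, which is the intuition behind a pair-dependent rather than fixed representation.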
Learning Correspondence Structures for Person Re-identification
This paper addresses the problem of handling spatial misalignments due to
camera-view changes or human-pose variations in person re-identification. We
first introduce a boosting-based approach to learn a correspondence structure
which indicates the patch-wise matching probabilities between images from a
target camera pair. The learned correspondence structure can not only capture
the spatial correspondence pattern between cameras but also handle the
viewpoint or human-pose variation in individual images. We further introduce a
global constraint-based matching process. It integrates a global matching
constraint over the learned correspondence structure to exclude cross-view
misalignments during the image patch matching process, hence achieving a more
reliable matching score between images. Finally, we also extend our approach by
introducing a multi-structure scheme, which learns a set of local
correspondence structures to capture the spatial correspondence sub-patterns
between a camera pair, so as to handle the spatial misalignments between
individual images in a more precise way. Experimental results on various
datasets demonstrate the effectiveness of our approach.
Comment: IEEE Trans. Image Processing, vol. 26, no. 5, pp. 2438-2453, 2017.
The project page for this paper is available at
http://min.sjtu.edu.cn/lwydemo/personReID.htm
arXiv admin note: text overlap
with arXiv:1504.0624
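The global constraint-based matching described in the abstract above can be sketched as follows, under stated assumptions. Here `corr[i][j]` stands for a learned patch-wise matching probability and `sim[i][j]` for a patch similarity between two images; the global constraint is assumed to be a one-to-one assignment between patches, solved by brute force over permutations. None of these names or choices come from the paper, which may use a different constraint and solver.

```python
from itertools import permutations
from typing import List, Tuple

Matrix = List[List[float]]


def constrained_matching_score(corr: Matrix, sim: Matrix) -> Tuple[float, Tuple[int, ...]]:
    """Return the best one-to-one patch assignment and its score.

    Assumes square matrices; brute force is only practical for a handful
    of patches and is used here purely for clarity.
    """
    n = len(corr)
    best_score = float("-inf")
    best_assignment: Tuple[int, ...] = ()
    for assignment in permutations(range(n)):
        # Weight each patch similarity by its correspondence probability.
        score = sum(corr[i][j] * sim[i][j] for i, j in enumerate(assignment))
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, best_assignment


# Hypothetical 2x2 example: correspondence favours the diagonal,
# raw similarity favours the off-diagonal.
corr = [[0.9, 0.1], [0.2, 0.8]]
sim = [[0.5, 1.0], [1.0, 0.5]]
score, assignment = constrained_matching_score(corr, sim)
print(score, assignment)
```

In this toy case the diagonal assignment wins even though the off-diagonal patches look more similar, showing how a correspondence prior can suppress cross-view mismatches.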
Query-guided End-to-End Person Search
Person search has recently gained attention as the novel task of finding a
person, provided as a cropped sample, from a gallery of non-cropped images,
in which several other people are also visible. We believe that i. person
detection and re-identification should be pursued in a joint optimization
framework and that ii. the person search should leverage the query image
extensively (e.g. emphasizing unique query patterns). However, so far, no prior
art realizes this. We introduce a novel query-guided end-to-end person search
network (QEEPS) to address both aspects. We build on a recent joint
detection and re-identification method, OIM [37]. We extend this with i. a
query-guided Siamese squeeze-and-excitation network (QSSE-Net) that uses global
context from both the query and gallery images, ii. a query-guided region
proposal network (QRPN) to produce query-relevant proposals, and iii. a
query-guided similarity subnetwork (QSimNet), to learn a query-guided
reidentification score. QEEPS is the first end-to-end query-guided detection
and re-id network. On both the most recent CUHK-SYSU [37] and PRW [46]
datasets, we outperform the previous state-of-the-art by a large margin.
Comment: Accepted as poster in CVPR 201
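To make the query-guided squeeze-and-excitation idea in the abstract above more concrete, here is a heavily simplified sketch. It assumes that the global context of each image is a per-channel average, and that the gate for each channel is a sigmoid of the summed query and gallery context. A real squeeze-and-excitation block uses learned fully connected layers, which are omitted here; all names and values are hypothetical and not taken from QEEPS.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Squeeze step: one average value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def query_guided_rescale(query: FeatureMap, gallery: FeatureMap) -> FeatureMap:
    """Excitation step (simplified): rescale gallery channels with gates
    computed from both query and gallery context, with no learned weights."""
    q_ctx = global_average_pool(query)
    g_ctx = global_average_pool(gallery)
    gates = [1.0 / (1.0 + math.exp(-(q + g))) for q, g in zip(q_ctx, g_ctx)]
    return [[[gate * v for v in row] for row in ch] for gate, ch in zip(gates, gallery)]


# Hypothetical 2-channel, 1x2 feature maps.
query = [[[-2.0, -2.0]], [[10.0, 10.0]]]
gallery = [[[2.0, 2.0]], [[1.0, 3.0]]]
rescaled = query_guided_rescale(query, gallery)
print(rescaled)
```

Channels that the query activates strongly receive a gate close to 1 and pass through nearly unchanged, while the others are damped, so the gallery features are modulated by what the query looks like.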