Temporal Model Adaptation for Person Re-Identification
Person re-identification is an open and challenging problem in computer
vision. The majority of efforts have gone into either designing the best
feature representation or learning the optimal matching metric. Most approaches
have neglected the problem of adapting the selected features or the learned
model over time. To address such a problem, we propose a temporal model
adaptation scheme with human in the loop. We first introduce a
similarity-dissimilarity learning method which can be trained in an incremental
fashion by means of a stochastic alternating direction method of multipliers
optimization procedure. Then, to achieve temporal adaptation with limited human
effort, we exploit a graph-based approach to present the user with only the most
informative probe-gallery matches that should be used to update the model.
Results on three datasets have shown that our approach performs on par or even
better than state-of-the-art approaches while reducing the manual pairwise
labeling effort by about 80%.
Joint & Progressive Learning from High-Dimensional Data for Multi-Label Classification
Despite the fact that nonlinear subspace learning techniques (e.g. manifold
learning) have been successfully applied to data representation, there is still room
for improvement in explainability (explicit mapping), generalization
(out-of-samples), and cost-effectiveness (linearization). To this end, a novel
linearized subspace learning technique is developed in a joint and progressive
way, called the joint and progressive learning
strategy (J-Play), with its application to multi-label
classification. J-Play learns high-level and semantically meaningful
feature representations from high-dimensional data by 1) jointly performing
multiple subspace learning and classification to find a latent subspace where
samples are expected to be better classified; 2) progressively learning
multi-coupled projections to linearly approach the optimal mapping bridging the
original space with the most discriminative subspace; 3) locally embedding
manifold structure in each learnable latent subspace. Extensive experiments are
performed to demonstrate the superiority and effectiveness of the proposed
method in comparison with previous state-of-the-art methods.
Comment: accepted in ECCV 201
Action Classification with Locality-constrained Linear Coding
We propose an action classification algorithm which uses Locality-constrained
Linear Coding (LLC) to capture discriminative information of human body
variations in each spatiotemporal subsequence of a video sequence. Our proposed
method divides the input video into equally spaced overlapping spatiotemporal
subsequences, each of which is decomposed into blocks and then cells. We use
the Histogram of Oriented Gradients (HOG3D) feature to encode the information in
each cell. We justify the use of LLC for encoding the block descriptor by
demonstrating its superiority over Sparse Coding (SC). Our sequence descriptor
is obtained via a logistic regression classifier with L2 regularization. We
evaluate and compare our algorithm with ten state-of-the-art algorithms on five
benchmark datasets. Experimental results show that, on average, our algorithm
gives better accuracy than these ten algorithms.
Comment: ICPR 201
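As a rough illustration of the LLC step named in the abstract above, the sketch below follows the approximated LLC solution commonly described in the coding literature: pick the k codebook atoms nearest to a descriptor, solve a small regularized linear system, and normalize the weights to sum to one. It is not taken from the paper itself; the names `llc_code` and `solve`, the neighbor count `k`, the regularizer `reg`, and the toy codebook are all assumptions made for this example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def llc_code(x, codebook, k=2, reg=1e-4):
    """Approximate LLC code of descriptor x (hypothetical helper).

    codebook: list of atoms, each a list of floats with len(x) entries.
    Returns a list with one weight per atom; only k entries are non-zero.
    """
    # 1) Locality: keep the k atoms closest to x (squared Euclidean distance).
    dists = [sum((a - b) ** 2 for a, b in zip(atom, x)) for atom in codebook]
    nearest = sorted(range(len(codebook)), key=lambda i: dists[i])[:k]
    # 2) Shift the selected atoms so that x becomes the origin.
    z = [[a - b for a, b in zip(codebook[i], x)] for i in nearest]
    # 3) Local covariance C = z z^T, plus a trace-scaled ridge for stability.
    c = [[sum(p * q for p, q in zip(zi, zj)) for zj in z] for zi in z]
    trace = sum(c[i][i] for i in range(k))
    ridge = reg * trace if trace > 0 else reg
    for i in range(k):
        c[i][i] += ridge
    # 4) Solve C w = 1, then enforce the sum-to-one constraint.
    w = solve(c, [1.0] * k)
    total = sum(w)
    code = [0.0] * len(codebook)
    for i, wi in zip(nearest, w):
        code[i] = wi / total
    return code


# Toy usage: a 2-D descriptor halfway between the first two atoms.
toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
toy_code = llc_code([0.5, 0.0], toy_codebook, k=2)
```

In this toy case the descriptor sits midway between the first two atoms, so the weight is split evenly between them and the distant atoms receive zero. That sparsity restricted to nearby atoms is the locality property that distinguishes LLC from plain sparse coding.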
Recurrent Attention Models for Depth-Based Person Identification
We present an attention-based model that reasons on human body shape and
motion dynamics to identify individuals in the absence of RGB information,
hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
to address the identification problem across days. Formulated as a
reinforcement learning task, our model is based on a combination of
convolutional and recurrent neural networks with the goal of identifying small,
discriminative regions indicative of human identity. We demonstrate that our
model produces state-of-the-art results on several published datasets given
only depth images. We further study the robustness of our model towards
viewpoint, appearance, and volumetric changes. Finally, we share insights
gleaned from interpretable 2D, 3D, and 4D visualizations of our model's
spatio-temporal attention.
Comment: Computer Vision and Pattern Recognition (CVPR) 201