Deep Co-attention based Comparators For Relative Representation Learning in Person Re-identification
Person re-identification (re-ID) requires rapid, flexible yet discriminative
representations that generalize quickly to unseen observations on the fly and
recognize the same identity across disjoint camera views. Recent effective
methods adopt a pair-wise similarity learning framework that detects a fixed
set of features from distinct regions and maps them to vector embeddings for
distance measurement. However, the most relevant and crucial parts of each
image are detected independently, without modeling the dependency between the
two images. Moreover, these region-based methods rely on spatial manipulation
to position the local features for comparable similarity measurement. To
overcome these limitations, in this paper we introduce
the Deep Co-attention based Comparators (DCCs) that fuse the co-dependent
representations of the paired images so as to focus on the relevant parts of
both images and produce their \textit{relative representations}. Given a pair
of pedestrian images to be compared, the proposed model mimics the foveation of
human eyes to detect distinct regions concurrently in both images, namely
co-dependent features, and alternately attends to the relevant regions to fuse
them into the similarity learning. Our comparator is capable of producing
dynamic representations relative to a particular sample every time, and thus
well-suited to the case of re-identifying pedestrians on-the-fly. We perform
extensive experiments to provide the insights and demonstrate the effectiveness
of the proposed DCCs in person re-ID. Moreover, our approach has achieved the
state-of-the-art performance on three benchmark data sets: DukeMTMC-reID
\cite{DukeMTMC}, CUHK03 \cite{FPNN}, and Market-1501 \cite{Market1501}.
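The alternating cross-image attention described above can be illustrated with a generic scaled dot-product cross-attention step. This is a minimal sketch of the general mechanism only, not the authors' exact DCC architecture; all names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def co_attend(regions_a, regions_b):
    """Represent each region of image A by an attention-weighted sum of
    image B's regions (scaled dot-product cross-attention)."""
    d = regions_a.shape[1]
    scores = regions_a @ regions_b.T / np.sqrt(d)  # (n_a, n_b) relevance scores
    return softmax(scores, axis=1) @ regions_b     # (n_a, d) fused features

# With a single region in B, every fused row is exactly that region.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[3.0, 4.0]])
fused = co_attend(A, B)
```

In the full model this step would be applied in both directions and alternated, so each image's representation is conditioned on the other member of the pair.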
Transfer Metric Learning: Algorithms, Applications and Outlooks
Distance metric learning (DML) aims to find an appropriate way to reveal the
underlying data relationship. It is critical in many machine learning, pattern
recognition, and data mining algorithms, and usually requires a large amount of
label information (such as class labels or pair/triplet constraints) to achieve
satisfactory performance. However, the label information may be insufficient in
real-world applications due to high labeling costs, and DML may fail in this
case. Transfer metric learning (TML) is able to mitigate this issue for DML in
the domain of interest (target domain) by leveraging knowledge/information from
other related domains (source domains). Although it has achieved a certain
level of development, TML still has limited success in aspects such as
selective transfer, theoretical understanding, and the handling of complex
data, big data, and extreme cases. In this survey, we present a systematic
review of the TML
literature. In particular, we group TML into different categories according to
different settings and metric transfer strategies, such as direct metric
approximation, subspace approximation, distance approximation, and distribution
approximation. A summarization and insightful discussion of the various TML
approaches and their applications will be presented. Finally, we indicate some
challenges and provide possible future directions. Comment: 14 pages, 5 figures.
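The metrics surveyed here are typically Mahalanobis distances d_M(x, y) = sqrt((x-y)^T M (x-y)) with M positive semi-definite. As a minimal illustration of the object being learned (not any specific TML algorithm), M can be parameterized as L^T L so the PSD constraint holds by construction:

```python
import numpy as np

def mahalanobis(x, y, L):
    """Distance under the learned metric M = L^T L (PSD by construction)."""
    diff = L @ (x - y)
    return float(np.sqrt(diff @ diff))

# With L = I the metric reduces to the ordinary Euclidean distance.
L = np.eye(3)
x = np.array([1.0, 2.0, 2.0])
y = np.array([0.0, 0.0, 0.0])
d = mahalanobis(x, y, L)  # sqrt(1 + 4 + 4) = 3.0
```

Learning then amounts to fitting L (or M directly) so that same-class pairs are close and different-class pairs are far under this distance; transfer metric learning additionally initializes or regularizes L using metrics from related source domains.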
Cross-Entropy Adversarial View Adaptation for Person Re-identification
Person re-identification (re-ID) is a task of matching pedestrians under
disjoint camera views. To recognise paired snapshots, it has to cope with large
cross-view variations caused by the camera view shift. Supervised deep neural
networks are effective in producing a set of non-linear projections that can
transform cross-view images into a common feature space. However, they
typically impose a symmetric architecture, leaving the network ill-conditioned
in its optimisation. In this paper, we learn a view-invariant subspace for person
re-ID, and its corresponding similarity metric using an adversarial view
adaptation approach. The main contribution is to learn coupled asymmetric
mappings regarding view characteristics which are adversarially trained to
address the view discrepancy by optimising the cross-entropy view confusion
objective. To determine the similarity value, the network is empowered with a
similarity discriminator to promote features that are highly discriminant in
distinguishing positive and negative pairs. A further contribution is an
adaptive weighting of the most difficult samples to address the imbalance of
within- and between-identity pairs. Our approach achieves notably improved
performance in comparison to state-of-the-art methods on benchmark
datasets. Comment: Appearing in IEEE Transactions on Circuits and Systems for
Video Technology.
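The cross-entropy view confusion objective mentioned above can be sketched as follows: a view classifier's softmax output is pushed toward the uniform distribution over camera views, so the learned features carry no view information. This is a minimal numpy illustration of one common formulation of view/domain confusion, not the paper's exact implementation.

```python
import numpy as np

def view_confusion_loss(view_logits):
    """Cross-entropy between the view classifier's predictions and a
    uniform target over the k camera views; minimized (at log k) when
    the views are indistinguishable from the features."""
    z = view_logits - view_logits.max(axis=1, keepdims=True)  # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return float(-np.log(p + 1e-12).mean())

uniform = view_confusion_loss(np.zeros((8, 4)))    # indistinguishable views
confident = view_confusion_loss(np.eye(4) * 10.0)  # view is easily predicted
```

In adversarial training, the view classifier is updated to predict views correctly while the feature mapper is updated to minimize this confusion loss, driving the two asymmetric mappings toward a view-invariant subspace.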
Unsupervised Part-Based Disentangling of Object Shape and Appearance
Large intra-class variation is the result of changes in multiple object
characteristics. Images, however, only show the superposition of different
variable factors such as appearance or shape. Therefore, learning to
disentangle and represent these different characteristics poses a great
challenge, especially in the unsupervised case. Moreover, large object
articulation calls for a flexible part-based model. We present an unsupervised
approach for disentangling appearance and shape by learning parts consistently
over all instances of a category. Our model for learning an object
representation is trained by simultaneously exploiting invariance and
equivariance constraints between synthetically transformed images. Since no
part annotation or prior information on an object class is required, the
approach is applicable to arbitrary classes. We evaluate our approach on a wide
range of object categories and diverse tasks including pose prediction,
disentangled image synthesis, and video-to-video translation. The approach
outperforms the state-of-the-art on unsupervised keypoint prediction and
compares favorably even against supervised approaches on the task of shape and
appearance transfer. Comment: CVPR 2019 Oral.
Adaptation and Re-Identification Network: An Unsupervised Deep Transfer Learning Approach to Person Re-Identification
Person re-identification (Re-ID) aims at recognizing the same person from
images taken across different cameras. To address this task, one typically
requires a large amount of labeled data to train an effective Re-ID model,
which might not be practical for real-world applications. To alleviate this
limitation, we choose to exploit a sufficient amount of pre-existing labeled
data from a different (auxiliary) dataset. By jointly considering such an
auxiliary dataset and the dataset of interest (but without label information),
our proposed adaptation and re-identification network (ARN) performs
unsupervised domain adaptation, which leverages information across datasets and
derives domain-invariant features for Re-ID purposes. In our experiments, we
verify that our network performs favorably against state-of-the-art
unsupervised Re-ID approaches, and even outperforms a number of baseline Re-ID
methods which require fully supervised data for training. Comment: 7 pages, 3 figures. CVPR 2018 workshop paper.
Transfer Adaptation Learning: A Decade Survey
The world we see is ever-changing: it changes with people, things, and the
environment. A domain refers to the state of the world at a certain moment. A
research problem is characterized as transfer adaptation learning (TAL) when it
requires knowledge correspondence between different moments/domains.
Conventional machine learning aims to find a model with the minimum expected
risk on test data by minimizing the regularized empirical risk on the training
data, which, however, assumes that the training and test data share a similar
joint probability distribution. TAL aims to build models that can perform tasks
in a target domain by transferring knowledge from a semantically related source
domain with a different distribution. It is an active research field of
increasing influence and importance, exhibiting a rapidly growing publication
trend. This paper surveys the advances of TAL methodologies in the past decade
and discusses the technical challenges and essential problems of TAL with deep
insights and new perspectives. We identify the broad families of solutions
being created by researchers, i.e.,
instance re-weighting adaptation, feature adaptation, classifier adaptation,
deep network adaptation and adversarial adaptation, which are beyond the early
semi-supervised and unsupervised split. The survey helps researchers rapidly
but comprehensively understand and identify the research foundation, research
status, theoretical limitations, future challenges and under-studied issues
(universality, interpretability, and credibility) to be broken in the field
toward universal representation and safe applications in open-world scenarios. Comment: 26 pages, 4 figures.
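Of the solution families listed in this survey, instance re-weighting adaptation is the simplest to sketch: source-sample losses are weighted by an estimated target/source density ratio, so source examples resembling the target domain dominate training. The weights below are assumed given; estimating them (e.g. via kernel mean matching) is a separate problem.

```python
import numpy as np

def reweighted_risk(source_losses, density_ratios):
    """Weighted empirical risk on source samples, each loss scaled by an
    estimated density ratio w(x) = p_target(x) / p_source(x)."""
    w = np.asarray(density_ratios, dtype=float)
    return float((w * source_losses).sum() / w.sum())

losses = np.array([1.0, 1.0, 5.0])
# Hypothetical ratios: the third sample looks nothing like the target domain.
ratios = np.array([1.0, 1.0, 0.0])
risk = reweighted_risk(losses, ratios)  # 1.0: the off-target sample is ignored
```

Feature, classifier, deep-network, and adversarial adaptation instead change the representation or model rather than the sample weights, but all pursue the same goal of closing the source-target distribution gap.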
Unsupervised Person Re-identification: Clustering and Fine-tuning
The superiority of deeply learned pedestrian representations has been
reported in very recent literature of person re-identification (re-ID). In this
paper, we consider the more pragmatic issue of learning a deep feature with no
or only a few labels. We propose a progressive unsupervised learning (PUL)
method to transfer pretrained deep representations to unseen domains. Our
method is easy to implement and can be viewed as an effective baseline for
unsupervised re-ID feature learning. Specifically, PUL iterates between 1)
pedestrian clustering and 2) fine-tuning of the convolutional neural network
(CNN) to improve the original model trained on an unrelated labeled dataset.
Since the clustering results can be very noisy, we add a selection operation
between the clustering and fine-tuning. At the beginning, when the model is
weak, the CNN is fine-tuned on a small number of reliable examples located near
cluster centroids in the feature space. As the model becomes stronger in
subsequent iterations, more images are adaptively selected as CNN
improved simultaneously until algorithm convergence. This process is naturally
formulated as self-paced learning. We then point out promising directions that
may lead to further improvement. Extensive experiments on three large-scale
re-ID datasets demonstrate that PUL outputs discriminative features that
improve re-ID accuracy. Comment: Added more results, parameter analysis, and comparisons.
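The PUL loop above (cluster, select samples near centroids, fine-tune, repeat) can be sketched on the feature side. The minimal k-means below is a stand-in for the clustering step, and the CNN fine-tuning is omitted; in the full method the network would be retrained on the selected samples with cluster labels as pseudo-labels and features re-extracted each iteration.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Minimal k-means, standing in for PUL's pedestrian-clustering step."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest centroid, then update centroids.
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = features[labels == c].mean(axis=0)
    return labels, centroids

def select_reliable(features, labels, centroids, threshold):
    """PUL's selection step: keep only samples lying close to their assigned
    centroid, where the cluster pseudo-labels are most trustworthy."""
    dists = np.linalg.norm(features - centroids[labels], axis=1)
    return dists < threshold

# Two tight, well-separated groups of identity features (toy example).
features = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
labels, centroids = kmeans(features, k=2)
mask = select_reliable(features, labels, centroids, threshold=1.0)
```

Raising the threshold across outer iterations reproduces the self-paced behaviour described above: more samples are admitted as the model strengthens.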
FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification
Person re-identification (reID) is an important task that requires retrieving a
person's images from an image dataset, given one image of the person of
interest. For learning robust person features, the pose variation of person
images is one of the key challenges. Existing works targeting the problem
either perform human alignment or learn human-region-based representations.
Extra pose information and computational cost are generally required for
inference. To address this issue, a Feature Distilling Generative Adversarial
Network (FD-GAN) is proposed for learning identity-related and pose-unrelated
representations. It is a novel framework based on a Siamese structure with
multiple novel discriminators on human poses and identities. In addition to the
discriminators, a novel same-pose loss is also integrated, which requires the
appearances of the same person's generated images to be similar. After learning
pose-unrelated person features with pose guidance, no auxiliary pose
information or additional computational cost is required during testing. Our
proposed FD-GAN achieves state-of-the-art performance on three person reID
datasets, which demonstrates the effectiveness and robust feature distilling
capability of the proposed FD-GAN. Comment: Accepted in Proceedings of the 32nd
Conference on Neural Information Processing Systems (NeurIPS 2018). Code
available: https://github.com/yxgeee/FD-GAN
cvpaper.challenge in 2016: Futuristic Computer Vision through 1,600 Papers Survey
This paper presents the futuristic challenges discussed in the
cvpaper.challenge. In 2015 and 2016, we thoroughly studied 1,600+ papers from
several conferences/journals, including CVPR, ICCV, ECCV, NIPS, PAMI, and IJCV.
Towards Self-similarity Consistency and Feature Discrimination for Unsupervised Domain Adaptation
Recent advances in unsupervised domain adaptation mainly focus on learning
shared representations by global distribution alignment without considering
class information across domains. The neglect of class information, however,
may lead to partial alignment (or even misalignment) and poor generalization
performance. For comprehensive alignment, we argue that the similarities across
different features in the source domain should be consistent with those in the
target domain. Based on this assumption, we propose a new domain
discrepancy metric, i.e., Self-similarity Consistency (SSC), to enforce the
feature structure being consistent across domains. The renowned correlation
alignment (CORAL) is proven to be a special case, and a sub-optimal measure, of
our proposed SSC. Furthermore, we also propose to mitigate the side effects of
the partial alignment and misalignment by incorporating the discriminative
information of the deep representations. Specifically, an embarrassingly simple
and effective feature norm constraint is exploited to enlarge the discrepancy
of inter-class samples. It relieves the requirements of strict alignment when
performing adaptation, therefore improving the adaptation performance
significantly. Extensive experiments on visual domain adaptation tasks
demonstrate the effectiveness of our proposed SSC metric and feature
discrimination approach. Comment: This paper has been submitted to ACM MM 201
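One plausible reading of the SSC metric described above (an assumption, since the abstract does not give the exact formulation) is to compare the cosine self-similarity matrix of feature dimensions across the two domains; replacing the cosine similarities with raw centered covariances would recover a CORAL-style comparison, consistent with CORAL being a special case.

```python
import numpy as np

def self_similarity(X, eps=1e-8):
    """Cosine similarity between feature dimensions (columns) of a batch X (n, d)."""
    Xc = X - X.mean(axis=0)                       # center each dimension
    Xn = Xc / (np.linalg.norm(Xc, axis=0) + eps)  # unit-normalize columns
    return Xn.T @ Xn                              # (d, d) self-similarity matrix

def ssc_discrepancy(X_source, X_target):
    """Squared Frobenius distance between the two self-similarity matrices."""
    D = self_similarity(X_source) - self_similarity(X_target)
    return float((D ** 2).sum())

X_source = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])     # dims correlated
X_target = np.array([[1.0, -2.0], [2.0, -4.0], [3.0, -6.0]])  # anti-correlated
gap = ssc_discrepancy(X_source, X_target)
```

Minimizing such a discrepancy during adaptation aligns the feature structure of the two domains; identical batches give a discrepancy of zero.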