Pixel and Feature Transfer Fusion for Unsupervised Cross-Dataset Person Re-Identification
Recently, unsupervised cross-dataset person re-identification (Re-ID) has attracted increasing attention; it aims to transfer knowledge from a labeled source domain to an unlabeled target domain. There are two common frameworks: pixel alignment, which transfers low-level knowledge, and feature alignment, which transfers high-level knowledge. In this article, we propose a novel recurrent autoencoder (RAE) framework to unify these two kinds of methods and inherit their merits. Specifically, the proposed RAE includes three modules: a feature-transfer (FT) module, a pixel-transfer (PT) module, and a fusion module. The FT module uses an encoder to map source and target images to a shared feature space in which features are identity-discriminative and the gap between source and target features is reduced. The PT module uses a decoder to reconstruct the original images from these features; here, we expect the images reconstructed from target features to be in the source style, so that low-level knowledge propagates to the target domain. After transferring both high- and low-level knowledge with the two modules above, we design a bilinear pooling layer to fuse the two kinds of knowledge. Extensive experiments on the Market-1501, DukeMTMC-ReID, and MSMT17 datasets show that our method significantly outperforms both pixel-alignment and feature-alignment Re-ID methods and achieves new state-of-the-art results.
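The fusion step can be illustrated with a minimal NumPy sketch. Shapes, function names, and the signed-square-root normalization are illustrative assumptions on our part, not the authors' code; the sketch only shows the general idea of bilinear pooling (outer product of the two descriptors, flattened).

```python
import numpy as np

def bilinear_fusion(feat_high, feat_low):
    """Fuse a high-level (feature-transfer) descriptor with a low-level
    (pixel-transfer) descriptor by bilinear pooling: take the outer
    product, flatten it, then apply signed square-root and L2
    normalization (a common stabilization for bilinear features)."""
    z = np.outer(feat_high, feat_low).ravel()   # (d1 * d2,)
    z = np.sign(z) * np.sqrt(np.abs(z))         # signed sqrt
    return z / (np.linalg.norm(z) + 1e-12)      # L2 normalize

# Toy descriptors for one person image (dimensions are illustrative).
rng = np.random.default_rng(0)
f_high = rng.standard_normal(8)   # identity-discriminative feature
f_low = rng.standard_normal(4)    # reconstruction-derived feature
fused = bilinear_fusion(f_high, f_low)
print(fused.shape)  # (32,)
```

The fused vector grows as the product of the two dimensionalities, which is why practical bilinear pooling usually follows with a compact projection.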
Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation
We present a novel unsupervised domain adaptation method for semantic
segmentation that generalizes a model trained with source images and
corresponding ground-truth labels to a target domain. A key to domain adaptive
semantic segmentation is to learn domain-invariant and discriminative features
without target ground-truth labels. To this end, we propose a bi-directional
pixel-prototype contrastive learning framework that minimizes intra-class
variations of features for the same object class, while maximizing inter-class
variations for different ones, regardless of domains. Specifically, our
framework aligns pixel-level features in target images with the prototype of
the same object class in source images (i.e., positive pairs), sets them apart
from prototypes of different classes (i.e., negative pairs), and performs the
same alignment and separation in the other direction, with pixel-level features
in the source image and prototypes in the target image. The cross-domain
matching encourages domain-invariant feature representations, while the
bidirectional pixel-prototype correspondences aggregate features for the same
object class, providing discriminative features. To establish training pairs
for contrastive learning, we propose to generate dynamic pseudo labels of
target images using a non-parametric label transfer, that is, pixel-prototype
correspondences across different domains. We also present a calibration method
compensating class-wise domain biases of prototypes gradually during training.

Comment: Accepted to ECCV 202
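One direction of the pixel-prototype contrastive objective can be sketched as an InfoNCE loss over class prototypes. This is a simplified NumPy illustration under our own assumptions (shapes, temperature value, and normalization are ours, not the paper's implementation):

```python
import numpy as np

def pixel_prototype_contrastive(pixels, labels, prototypes, tau=0.1):
    """Pull each pixel feature toward the prototype of its
    (pseudo-)labeled class and push it away from the other classes'
    prototypes, as a softmax cross-entropy over prototype similarities.
    pixels:     (N, D) L2-normalized pixel features (one domain)
    labels:     (N,)   class index per pixel (pseudo labels for target)
    prototypes: (C, D) L2-normalized class prototypes (other domain)"""
    logits = pixels @ prototypes.T / tau          # (N, C) similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

# Sanity check: pixels that coincide with their class prototype
# (orthonormal toy prototypes) should give a near-zero loss.
protos = np.eye(3)                 # C = D = 3, orthonormal prototypes
labels = np.array([0, 1, 2, 1])
pixels = protos[labels]
loss = pixel_prototype_contrastive(pixels, labels, protos)
```

The bi-directional loss of the paper would evaluate this in both directions (target pixels vs. source prototypes and source pixels vs. target prototypes) and sum the two terms.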
Weakly-supervised Caricature Face Parsing through Domain Adaptation
A caricature is an artistic rendering of a person's picture in which certain
striking characteristics are abstracted or exaggerated to create a humorous
or sarcastic effect. For numerous caricature-related applications such as
attribute recognition and caricature editing, face parsing is an essential
pre-processing step that provides a complete facial structure understanding.
However, current state-of-the-art face parsing methods require large amounts of
pixel-level labeled data, and collecting such annotations for caricatures is
tedious and labor-intensive. For real photos, there are numerous labeled datasets for face
parsing. Thus, we formulate caricature face parsing as a domain adaptation
problem, where real photos play the role of the source domain, adapting to the
target caricatures. Specifically, we first leverage a spatial transformer based
network to enable shape domain shifts. A feed-forward style transfer network is
then utilized to capture texture-level domain gaps. With these two steps, we
synthesize face caricatures from real photos, and thus we can use parsing
ground truths of the original photos to learn the parsing model. Experimental
results on the synthetic and real caricatures demonstrate the effectiveness of
the proposed domain adaptation algorithm. Code is available at:
https://github.com/ZJULearning/CariFaceParsing.

Comment: Accepted in ICIP 2019, code and model are available at
https://github.com/ZJULearning/CariFaceParsing
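A key point of the pipeline is that the parsing ground truth of a photo remains valid for the synthesized caricature because the geometric warp can be applied identically to the image and its label map. The NumPy sketch below illustrates this with a fixed affine warp and nearest-neighbor sampling; the paper itself uses a learned spatial transformer, so the transform here is purely an assumed stand-in:

```python
import numpy as np

def affine_warp_nn(arr, A, out_shape):
    """Warp a 2-D array with an inverse affine map A (2x3) using
    nearest-neighbor sampling. Applying the SAME warp to a photo and to
    its parsing label map keeps them pixel-aligned, which is why the
    original ground truth can supervise the synthesized caricature."""
    H, W = out_shape
    ys, xs = np.mgrid[0:H, 0:W]
    src = A @ np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)])
    sx = np.clip(np.rint(src[0]).astype(int), 0, arr.shape[1] - 1)
    sy = np.clip(np.rint(src[1]).astype(int), 0, arr.shape[0] - 1)
    return arr[sy, sx].reshape(H, W)

# Exaggerate horizontally (a caricature-like shape shift).
A = np.array([[0.5, 0.0, 2.0],   # inverse map: output x -> 0.5*x + 2
              [0.0, 1.0, 0.0]])
img = np.arange(64, dtype=float).reshape(8, 8)   # toy "photo"
lab = (img > 32).astype(int)                     # toy parsing labels
w_img = affine_warp_nn(img, A, (8, 8))
w_lab = affine_warp_nn(lab, A, (8, 8))
```

Because nearest-neighbor warping is a pure index gather, any per-pixel labeling computed from the photo stays consistent after warping, so (warped image, warped labels) form a valid training pair for the parsing model.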
Unsupervised Domain Transfer with Conditional Invertible Neural Networks
Synthetic medical image generation has evolved into a key technique for neural
network training and validation. A core challenge, however, remains in the
domain gap between simulations and real data. While deep learning-based domain
transfer using Cycle Generative Adversarial Networks and similar architectures
has led to substantial progress in the field, there are use cases in which
state-of-the-art approaches still fail to generate training images that produce
convincing results on relevant downstream tasks. Here, we address this issue
with a domain transfer approach based on conditional invertible neural networks
(cINNs). As a particular advantage, our method inherently guarantees cycle
consistency through its invertible architecture, and network training can
efficiently be conducted with maximum likelihood training. To showcase our
method's generic applicability, we apply it to two spectral imaging modalities
at different scales, namely hyperspectral imaging (pixel-level) and
photoacoustic tomography (image-level). According to comprehensive experiments,
our method enables the generation of realistic spectral data and outperforms
the state of the art on two downstream classification tasks (binary and
multi-class). cINN-based domain transfer could thus evolve into an important
method for realistic synthetic data generation in the field of spectral imaging
and beyond.
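The inherent cycle consistency of invertible architectures can be demonstrated with a single coupling layer in the RealNVP style. This is a simplified sketch, not the conditional INN architecture of the paper; the toy "network" producing scale and shift is an assumption for illustration:

```python
import numpy as np

def coupling_forward(x, subnet):
    """Affine coupling layer: split the input, transform the second half
    conditioned on the first. Invertible by construction, so a domain
    transfer built from such layers is cycle-consistent without any
    explicit cycle loss."""
    x1, x2 = np.split(x, 2)
    s, t = subnet(x1)                       # scale/shift from first half
    return np.concatenate([x1, x2 * np.exp(s) + t])

def coupling_inverse(y, subnet):
    """Exact inverse of coupling_forward with the same subnet."""
    y1, y2 = np.split(y, 2)
    s, t = subnet(y1)
    return np.concatenate([y1, (y2 - t) * np.exp(-s)])

# Toy conditioning subnetwork (fixed random weights, illustrative only).
rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
subnet = lambda h: (np.tanh(W1 @ h), W2 @ h)

x = rng.standard_normal(8)
y = coupling_forward(x, subnet)             # "transferred" sample
x_rec = coupling_inverse(y, subnet)         # mapped back
print(np.allclose(x, x_rec))  # True: cycle-consistent to float precision
```

Stacking many such layers (with permutations between them) yields an expressive yet exactly invertible mapping, which is what allows maximum-likelihood training instead of adversarial cycle losses.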