DACS: Domain Adaptation via Cross-domain Mixed Sampling
Semantic segmentation models based on convolutional neural networks have
recently displayed remarkable performance for a multitude of applications.
However, these models typically do not generalize well when applied to new
domains, especially when going from synthetic to real data. In this paper we
address the problem of unsupervised domain adaptation (UDA), which attempts to
train on labelled data from one domain (the source domain) while simultaneously
learning from unlabelled data in the domain of interest (the target domain). Existing
methods have seen success by training on pseudo-labels for these unlabelled
images. Multiple techniques have been proposed to mitigate low-quality
pseudo-labels arising from the domain shift, with varying degrees of success.
We propose DACS: Domain Adaptation via Cross-domain mixed Sampling, which mixes
images from the two domains along with the corresponding labels and
pseudo-labels. The model is then trained on these mixed samples in addition to
the labelled source data itself. We demonstrate the effectiveness of our solution by
achieving state-of-the-art results for GTA5 to Cityscapes, a common
synthetic-to-real semantic segmentation benchmark for UDA.
Comment: This paper has been accepted to WACV 2021.
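For intuition, here is a minimal sketch of the kind of cross-domain mixing the abstract describes (not the authors' code): pixels belonging to a random half of the classes in a source image are pasted onto a target image, and the mixed label is assembled from the source labels and the target pseudo-labels. The PyTorch tensors, shapes, and the function name dacs_mix are illustrative assumptions.

import torch

def dacs_mix(src_img, src_lbl, tgt_img, tgt_pseudo_lbl):
    # src_img, tgt_img: (C, H, W) images; src_lbl, tgt_pseudo_lbl: (H, W) class-id maps
    classes = torch.unique(src_lbl)
    # randomly select half of the classes present in the source image
    chosen = classes[torch.randperm(len(classes))[: len(classes) // 2]]
    mask = torch.isin(src_lbl, chosen)                    # boolean paste mask (H, W)
    mixed_img = torch.where(mask.unsqueeze(0), src_img, tgt_img)
    mixed_lbl = torch.where(mask, src_lbl, tgt_pseudo_lbl)
    return mixed_img, mixed_lbl                           # trained on with cross-entropy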
SimSwap: An Efficient Framework For High Fidelity Face Swapping
We propose an efficient framework, called Simple Swap (SimSwap), aiming at
generalized and high-fidelity face swapping. In contrast to previous approaches
that either lack the ability to generalize to arbitrary identities or fail to
preserve attributes like facial expression and gaze direction, our framework is
capable of transferring the identity of an arbitrary source face into an
arbitrary target face while preserving the attributes of the target face. We
overcome the above defects in the following two ways. First, we present the ID
Injection Module (IIM), which transfers the identity information of the source
face into the target face at the feature level. Using this module, we extend the
architecture of an identity-specific face swapping algorithm to a framework for
arbitrary face swapping. Second, we propose the Weak Feature Matching Loss
which efficiently helps our framework preserve the facial attributes in an
implicit way. Extensive experiments on wild faces demonstrate that our SimSwap
is able to achieve competitive identity performance while preserving attributes
better than previous state-of-the-art methods. The code is available on
GitHub: https://github.com/neuralchen/SimSwap.
Comment: Accepted by ACM MM 2020.
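As a rough illustration of what a weak feature matching loss can look like (an assumed formulation, not the paper's definition): only the last few discriminator feature maps of the swapped face are matched against those of the target face with an L1 penalty, leaving lower-level features unconstrained. The list-of-feature-maps interface is an assumption.

import torch.nn.functional as F

def weak_feature_matching_loss(feats_fake, feats_real, num_last_layers=3):
    # feats_fake / feats_real: lists of discriminator feature maps, shallow to deep
    loss = 0.0
    for f_fake, f_real in zip(feats_fake[-num_last_layers:],
                              feats_real[-num_last_layers:]):
        loss = loss + F.l1_loss(f_fake, f_real.detach())
    return loss / num_last_layers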
Text-Guided Neural Image Inpainting
The image inpainting task requires filling a corrupted image with content
coherent with its context. This research field has made promising progress
through neural image inpainting methods. Nevertheless, guessing the missing
content from the context pixels alone remains a critical challenge.
The goal of this paper is to fill in the missing semantic information of
corrupted images according to provided descriptive text. Unlike existing
text-guided image generation works, the inpainting model must compare the
semantic content of the given text with the remaining part of the image and
then determine what content should be filled in for the missing part. To
fulfill such a task, we propose a novel inpainting model named Text-Guided Dual
Attention Inpainting Network (TDANet). Firstly, a dual multimodal attention
mechanism is designed to extract the explicit semantic information about the
corrupted regions, which is done by comparing the descriptive text and
complementary image areas through reciprocal attention. Secondly, an image-text
matching loss is applied to maximize the semantic similarity between the
generated image and the text. Experiments are conducted on two open datasets. Results
show that the proposed TDANet model reaches a new state of the art on both
quantitative and qualitative measures. Analysis of the results suggests that the
generated images are consistent with the guidance text, enabling diverse results
to be generated by providing different descriptions. Code is available at
https://github.com/idealwhite/TDANet
Comment: ACM MM'2020 (Oral). 9 pages, 4 tables, 7 figures.
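A minimal sketch of an image-text matching loss of the kind described above: the generated image and the guidance text are embedded into a shared space and low cosine similarity is penalized. The encoders img_encoder and txt_encoder are hypothetical placeholders, and the cosine-similarity form is an assumption rather than TDANet's exact loss.

import torch.nn.functional as F

def image_text_matching_loss(img_encoder, txt_encoder, generated_img, text_tokens):
    # embed the generated image and the guidance text into a shared space
    img_emb = F.normalize(img_encoder(generated_img), dim=-1)   # (B, D)
    txt_emb = F.normalize(txt_encoder(text_tokens), dim=-1)     # (B, D)
    # maximizing cosine similarity == minimizing (1 - cosine similarity)
    return (1.0 - (img_emb * txt_emb).sum(dim=-1)).mean()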
Consistency Regularization with High-dimensional Non-adversarial Source-guided Perturbation for Unsupervised Domain Adaptation in Segmentation
Unsupervised domain adaptation for semantic segmentation has been intensively
studied due to the low cost of the pixel-level annotation for synthetic data.
The most common approaches try to generate images or features mimicking the
distribution in the target domain while preserving the semantic contents in the
source domain so that a model can be trained with annotations from the latter.
However, such methods rely heavily on an image translator or feature extractor
trained through an elaborate mechanism involving adversarial training, which
brings extra complexity and instability into the adaptation process. Furthermore,
these methods mainly focus on exploiting the labeled source dataset, leaving the
unlabeled target dataset underutilized. In this paper, we
propose a bidirectional style-induced domain adaptation method, called BiSIDA,
that employs consistency regularization to efficiently exploit information from
the unlabeled target domain dataset, requiring only a simple neural style
transfer model. BiSIDA aligns domains by not only transferring source images
into the style of target images but also transferring target images into the
style of source images to perform high-dimensional perturbation on the
unlabeled target images, which is crucial to successfully applying
consistency regularization in segmentation tasks. Extensive experiments show
that our BiSIDA achieves a new state of the art on two commonly used
synthetic-to-real domain adaptation benchmarks: GTA5-to-CityScapes and
SYNTHIA-to-CityScapes.
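A minimal sketch of the consistency-regularization idea this abstract builds on: a target image perturbed by source-style transfer should receive the same segmentation as the unperturbed target image. The model and style_transfer callables, and the soft cross-entropy form, are illustrative assumptions rather than the authors' formulation.

import torch
import torch.nn.functional as F

def consistency_loss(model, style_transfer, tgt_img, src_img):
    with torch.no_grad():
        # teacher distribution predicted on the unperturbed target image
        teacher_prob = F.softmax(model(tgt_img), dim=1)
    # high-dimensional, non-adversarial perturbation: render the target image in source style
    perturbed = style_transfer(tgt_img, src_img)
    student_log_prob = F.log_softmax(model(perturbed), dim=1)
    # soft cross-entropy between teacher and student per-pixel predictions
    return -(teacher_prob * student_log_prob).sum(dim=1).mean()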