Semi-Supervised Image-to-Image Translation
Image-to-image translation is a long-standing and difficult problem in
computer vision. In this paper we propose an adversarially trained model for
image-to-image translation. Conventional deep neural-network based methods
perform image-to-image translation by comparing Gram matrices and using image
segmentation, which requires human intervention. Our generative adversarial
network based model instead works on a conditional-probability approach, which
makes the translation independent of any local, global, content, or style
features. We use a bidirectional reconstruction model augmented with an affine
transform factor that helps preserve content and photorealism compared to
other models. The advantage of this approach is that the translation is
semi-supervised, independent of image segmentation, and inherits the tendency
of generative adversarial networks to produce realistic images. This method
has been shown to produce better results than Multimodal Unsupervised
Image-to-Image Translation.
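The abstract does not give the exact losses, but the bidirectional reconstruction idea can be illustrated with a minimal PyTorch sketch; the generator names `G_ab`/`G_ba` are placeholders, and the paper's affine transform factor is omitted since its formulation is not specified here.

```python
import torch.nn.functional as F

def bidirectional_reconstruction_loss(G_ab, G_ba, x_a, x_b):
    """Round-trip reconstruction in both directions (illustrative sketch).

    G_ab translates domain A -> B, G_ba translates B -> A; penalizing
    the round-trip error encourages content preservation.
    """
    recon_a = G_ba(G_ab(x_a))  # A -> B -> A
    recon_b = G_ab(G_ba(x_b))  # B -> A -> B
    return F.l1_loss(recon_a, x_a) + F.l1_loss(recon_b, x_b)
```

In practice such a term would be added to the adversarial losses of the two domain discriminators, weighted by a hyperparameter.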
Transformation Consistency Regularization – A Semi-supervised Paradigm for Image-to-Image Translation
The scarcity of labeled data has motivated the development of semi-supervised
learning methods, which learn from large amounts of unlabeled data alongside a
few labeled samples. Consistency regularization, which constrains a model's
predictions under different input perturbations, has in particular been shown
to provide state-of-the-art results in the semi-supervised setting. However,
most of these methods have been limited to classification and segmentation
applications. We propose Transformation Consistency Regularization, which
tackles the more challenging setting of image-to-image translation, a setting
that remains unexplored by semi-supervised algorithms. The method introduces a
diverse set of geometric transformations and enforces the model's predictions
on unlabeled data to be invariant to those transformations. We evaluate the
efficacy of our algorithm on three different applications: image colorization,
denoising, and super-resolution. Our method is significantly more data
efficient, requiring only around 10-20% of the labeled samples to achieve
image reconstructions similar to those of its fully supervised counterpart.
Furthermore, we show the effectiveness of our method in video processing
applications, where knowledge from a few frames can be leveraged to enhance
the quality of the rest of the movie.
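As a concrete illustration of the core idea, here is a minimal PyTorch sketch of a transformation-consistency term using rotations; the exact transformation set, loss, and weighting in the paper may differ, and `model` stands for any image-to-image network.

```python
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def transformation_consistency_loss(model, x_unlabeled, angles=(90, 180, 270)):
    """Encourage model(T(x)) ~= T(model(x)) for geometric transforms T.

    Uses rotations as the geometric transformations; the reference
    prediction is detached so only the transformed branch is trained.
    """
    with torch.no_grad():
        y = model(x_unlabeled)  # reference prediction on the untransformed input
    loss = 0.0
    for angle in angles:
        y_rot = model(TF.rotate(x_unlabeled, angle))  # predict on rotated input
        loss = loss + F.mse_loss(y_rot, TF.rotate(y, angle))
    return loss / len(angles)
```

During training, this term on unlabeled images would be added to the usual supervised reconstruction loss computed on the few labeled pairs.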
Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation
Automatic high-quality rendering of anime scenes from complex real-world
images is of significant practical value. The challenges of this task lie in
the complexity of the scenes, the unique features of anime style, and the lack
of high-quality datasets to bridge the domain gap. Despite promising attempts,
previous efforts still fall short of achieving satisfactory results with
consistent semantic preservation, evident stylization, and fine details. In
this study, we propose Scenimefy, a novel semi-supervised image-to-image
translation framework that addresses these challenges. Our approach guides the
learning with structure-consistent pseudo paired data, simplifying the pure
unsupervised setting. The pseudo data are derived uniquely from a
semantic-constrained StyleGAN leveraging rich model priors like CLIP. We
further apply segmentation-guided data selection to obtain high-quality pseudo
supervision. A patch-wise contrastive style loss is introduced to improve
stylization and fine details. In addition, we contribute a high-resolution anime
scene dataset to facilitate future research. Our extensive experiments
demonstrate the superiority of our method over state-of-the-art baselines in
terms of both perceptual quality and quantitative performance.Comment: ICCV 2023. The first two authors contributed equally. Code:
https://github.com/Yuxinn-J/Scenimefy Project page:
https://yuxinn-j.github.io/projects/Scenimefy.htm
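The abstract only names the patch-wise contrastive style loss; a common way to realize such a loss is the PatchNCE formulation, sketched below in PyTorch under the assumption that matching patch features from the stylized output and its reference form positive pairs, with all other patches acting as negatives.

```python
import torch
import torch.nn.functional as F

def patch_contrastive_loss(feat_q, feat_k, tau=0.07):
    """InfoNCE over N corresponding patch features of shape (N, C).

    feat_q: patch features extracted from the translated (stylized) image.
    feat_k: features of the corresponding reference patches.
    """
    feat_q = F.normalize(feat_q, dim=1)
    feat_k = F.normalize(feat_k, dim=1)
    logits = feat_q @ feat_k.t() / tau  # (N, N) cosine-similarity matrix
    targets = torch.arange(feat_q.size(0), device=feat_q.device)
    return F.cross_entropy(logits, targets)  # diagonal entries are positives
```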
RainDiffusion: When Unsupervised Learning Meets Diffusion Models for Real-world Image Deraining
What will happen when unsupervised learning meets diffusion models for
real-world image deraining? To answer this question, we propose RainDiffusion, the first
unsupervised image deraining paradigm based on diffusion models. Beyond the
traditional unsupervised wisdom of image deraining, RainDiffusion introduces
stable training on unpaired real-world data instead of weakly adversarial
training. RainDiffusion consists of two cooperative branches: Non-diffusive
Translation Branch (NTB) and Diffusive Translation Branch (DTB). NTB exploits a
cycle-consistent architecture to bypass the difficulty in unpaired training of
standard diffusion models by generating initial clean/rainy image pairs. DTB
leverages two conditional diffusion modules to progressively refine the desired
output with initial image pairs and diffusive generative prior, to obtain a
better generalization ability for deraining and rain generation. RainDiffusion
is a non-adversarial training paradigm, setting a new standard for
real-world image deraining. Extensive experiments confirm the superiority of
our RainDiffusion over un/semi-supervised methods and show its competitive
advantages over fully-supervised ones.
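A rough sketch of how the two branches could cooperate, based only on the description above; the module names and interfaces are illustrative assumptions, not the authors' implementation.

```python
def rain_diffusion_step(ntb_derain, ntb_rain, dtb_derain, dtb_rain,
                        rainy, clean):
    """One conceptual pass over unpaired rainy/clean images.

    NTB: cycle-consistent generators produce initial pseudo pairs.
    DTB: conditional diffusion modules refine those initial estimates.
    """
    # Non-diffusive Translation Branch: initial pseudo pairs from unpaired data
    init_clean = ntb_derain(rainy)   # rainy -> initial derained estimate
    init_rainy = ntb_rain(clean)     # clean -> initial synthetic-rain image

    # Diffusive Translation Branch: refine, conditioned on the original inputs
    refined_clean = dtb_derain(init_clean, rainy)
    refined_rainy = dtb_rain(init_rainy, clean)
    return refined_clean, refined_rainy
```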
Semantically consistent image-to-image translation for unsupervised domain adaptation
Unsupervised Domain Adaptation (UDA) aims to adapt models trained on a source
domain to a new target domain where no labelled data is available. In this
work, we investigate the problem of UDA from a synthetic computer-generated
domain to a similar but real-world domain for learning semantic segmentation.
We propose a semantically consistent image-to-image translation method in
combination with a consistency regularisation method for UDA. We overcome
previous limitations in transferring synthetic images to real-looking images.
We leverage pseudo-labels in order to learn a generative image-to-image
translation model that receives additional feedback from semantic labels on
both domains. Our method outperforms state-of-the-art methods that combine
image-to-image translation and semi-supervised learning on relevant domain
adaptation benchmarks, i.e., GTA5 to Cityscapes and SYNTHIA to Cityscapes …
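One plausible form of the semantic feedback described here is to require a segmentation network to predict, on the translated image, the same labels as the pseudo-labels (or ground-truth labels) of the original image; the PyTorch sketch below assumes this formulation, which may differ from the paper's exact method.

```python
import torch.nn.functional as F

def semantic_consistency_loss(seg_net, translator, x_src, labels):
    """Penalize translations that alter the predicted semantics.

    seg_net: segmentation model returning (B, K, H, W) class logits.
    labels: (B, H, W) class indices for x_src (ground-truth labels on the
    source domain, pseudo-labels on the target domain).
    """
    x_translated = translator(x_src)  # source image rendered in target style
    logits = seg_net(x_translated)
    return F.cross_entropy(logits, labels)
```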