131,767 research outputs found
Progressive Energy-Based Cooperative Learning for Multi-Domain Image-to-Image Translation
This paper studies a novel energy-based cooperative learning framework for
multi-domain image-to-image translation. The framework consists of four
components: descriptor, translator, style encoder, and style generator. The
descriptor is a multi-head energy-based model that represents a multi-domain
image distribution. The components of translator, style encoder, and style
generator constitute a diversified image generator. Specifically, given an
input image from a source domain, the translator turns it into a stylised
output image of the target domain according to a style code, which can be
inferred by the style encoder from a reference image or produced by the style
generator from a random noise. Since the style generator is represented as an
domain-specific distribution of style codes, the translator can provide a
one-to-many transformation (i.e., diversified generation) between source domain
and target domain. To train our framework, we propose a likelihood-based
multi-domain cooperative learning algorithm to jointly train the multi-domain
descriptor and the diversified image generator (including translator, style
encoder, and style generator modules) via multi-domain MCMC teaching, in which
the descriptor guides the diversified image generator to shift its probability
density toward the data distribution, while the diversified image generator
uses its randomly translated images to initialize the descriptor's Langevin
dynamics process for efficient sampling
AlignFlow: Cycle Consistent Learning from Multiple Domains via Normalizing Flows
Given datasets from multiple domains, a key challenge is to efficiently
exploit these data sources for modeling a target domain. Variants of this
problem have been studied in many contexts, such as cross-domain translation
and domain adaptation. We propose AlignFlow, a generative modeling framework
that models each domain via a normalizing flow. The use of normalizing flows
allows for a) flexibility in specifying learning objectives via adversarial
training, maximum likelihood estimation, or a hybrid of the two methods; and b)
learning and exact inference of a shared representation in the latent space of
the generative model. We derive a uniform set of conditions under which
AlignFlow is marginally-consistent for the different learning objectives.
Furthermore, we show that AlignFlow guarantees exact cycle consistency in
mapping datapoints from a source domain to target and back to the source
domain. Empirically, AlignFlow outperforms relevant baselines on image-to-image
translation and unsupervised domain adaptation and can be used to
simultaneously interpolate across the various domains using the learned
representation.Comment: AAAI 202
MedGAN: Medical Image Translation using GANs
Image-to-image translation is considered a new frontier in the field of
medical image analysis, with numerous potential applications. However, a large
portion of recent approaches offers individualized solutions based on
specialized task-specific architectures or require refinement through
non-end-to-end training. In this paper, we propose a new framework, named
MedGAN, for medical image-to-image translation which operates on the image
level in an end-to-end manner. MedGAN builds upon recent advances in the field
of generative adversarial networks (GANs) by merging the adversarial framework
with a new combination of non-adversarial losses. We utilize a discriminator
network as a trainable feature extractor which penalizes the discrepancy
between the translated medical images and the desired modalities. Moreover,
style-transfer losses are utilized to match the textures and fine-structures of
the desired target images to the translated images. Additionally, we present a
new generator architecture, titled CasNet, which enhances the sharpness of the
translated medical outputs through progressive refinement via encoder-decoder
pairs. Without any application-specific modifications, we apply MedGAN on three
different tasks: PET-CT translation, correction of MR motion artefacts and PET
image denoising. Perceptual analysis by radiologists and quantitative
evaluations illustrate that the MedGAN outperforms other existing translation
approaches.Comment: 16 pages, 8 figure
- …