Multimodal Controller for Generative Models
Class-conditional generative models are crucial tools for data generation
from user-specified class labels. Existing approaches for class-conditional
generative models require nontrivial modifications of backbone generative
architectures to model conditional information fed into the model. This paper
introduces a plug-and-play module named the 'multimodal controller' to generate
multimodal data without introducing additional learnable parameters. In the
absence of the controllers, our models reduce to non-conditional generative
models. We test the efficacy of multimodal controllers on CIFAR10, COIL100, and
Omniglot benchmark datasets. We demonstrate that multimodal controlled
generative models (including VAE, PixelCNN, Glow, and GAN) can generate
class-conditional images of significantly better quality than those produced by
conventional conditional generative models. Moreover, we show that multimodal
controlled models can also create novel modalities of images.
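A minimal PyTorch sketch of one plausible reading of such a controller is below: each class owns a fixed random binary mask over a layer's channels, and activations are gated by the mask of the requested class, so no learnable parameters are added. The class name, the mask-construction recipe, and the keep_prob parameter are illustrative assumptions, not the paper's exact scheme.

```python
import torch
import torch.nn as nn

class MultimodalController(nn.Module):
    """Parameter-free, plug-and-play channel gate: each class (mode) owns a
    fixed random binary mask over channels, and activations are multiplied
    by the mask of the requested class. Mask construction and keep_prob are
    illustrative assumptions, not necessarily the paper's exact scheme."""

    def __init__(self, num_classes: int, num_channels: int, keep_prob: float = 0.5):
        super().__init__()
        # Fixed (non-learnable) binary codewords, one per class, stored as a
        # buffer so the module adds no learnable parameters.
        masks = (torch.rand(num_classes, num_channels) < keep_prob).float()
        self.register_buffer("masks", masks)

    def forward(self, x: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature maps; labels: (B,) integer class ids.
        m = self.masks[labels]                   # (B, C) per-sample masks
        return x * m.view(x.size(0), -1, 1, 1)   # gate channels per class

# Usage: drop the controller after hidden layers of an otherwise
# unconditional generator; removing it recovers the unconditional model.
# ctrl = MultimodalController(num_classes=10, num_channels=128)
# h = ctrl(h, labels)
```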
DualVAE: Controlling Colours of Generated and Real Images
Colour-controlled image generation and manipulation are of interest to
artists and graphic designers. Vector Quantised Variational AutoEncoders
(VQ-VAEs) with an autoregressive (AR) prior can produce high-quality
images, but lack an explicit representation mechanism to control colour
attributes. We introduce DualVAE, a hybrid representation model that provides
such control by learning disentangled representations for colour and geometry.
The geometry is represented by an image intensity mapping that identifies
structural features. The disentangled representation is obtained by two novel
mechanisms: (i) a dual-branch architecture that separates image colour
attributes from geometric attributes, and (ii) a new ELBO that trains the
combined colour and geometry representations. DualVAE can control the colour
of generated images,
and recolour existing images by transferring the colour latent representation
obtained from an exemplar image. We demonstrate that DualVAE achieves an FID
nearly two times better (i.e., lower) than VQ-GAN on a diverse collection of
datasets, including animated faces, logos, and artistic landscapes.
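As a concrete illustration, here is a minimal PyTorch sketch of the dual-branch idea: one encoder branch sees the full RGB image and summarises colour, while the other sees only an intensity mapping and encodes geometry, so the two factors are separated by construction. All names, layer sizes, and the use of a channel-mean intensity map are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class DualBranchEncoder(nn.Module):
    """Hypothetical two-branch encoder: the colour branch summarises colour
    statistics from the RGB image, while the geometry branch encodes only an
    intensity (grayscale) mapping that carries structural features."""

    def __init__(self, colour_dim: int = 16, geom_dim: int = 64):
        super().__init__()
        self.colour_net = nn.Sequential(   # sees the full RGB image
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2 * colour_dim),
        )
        self.geom_net = nn.Sequential(     # sees only the intensity map
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2 * geom_dim),
        )

    def forward(self, x: torch.Tensor):
        # Channel-mean intensity as a stand-in for the intensity mapping.
        intensity = x.mean(dim=1, keepdim=True)
        c_mu, c_logvar = self.colour_net(x).chunk(2, dim=1)
        g_mu, g_logvar = self.geom_net(intensity).chunk(2, dim=1)
        return (c_mu, c_logvar), (g_mu, g_logvar)

# Recolouring: encode an exemplar, keep its colour latent, and decode it
# together with the geometry latent of the image to be recoloured.
```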
Morphology-preserving Autoregressive 3D Generative Modelling of the Brain
Human anatomy, morphology, and associated diseases can be studied using
medical imaging data. However, access to medical imaging data is restricted by
governance and privacy concerns, data ownership, and the cost of acquisition,
thus limiting our ability to understand the human body. A possible solution to
this issue is the creation of a model able to learn and then generate synthetic
images of the human body conditioned on specific characteristics of relevance
(e.g., age, sex, and disease status). Deep generative models, in the form of
neural networks, have recently been used to create synthetic 2D images of
natural scenes. However, the ability to produce high-resolution 3D volumetric
imaging data with correct anatomical morphology has been hampered by data
scarcity and algorithmic and computational limitations. This work proposes a
generative model that can be scaled to produce anatomically correct,
high-resolution, and realistic images of the human brain, with the necessary
quality to allow further downstream analyses. The ability to generate a
potentially unlimited amount of data not only enables large-scale studies of
human anatomy and pathology without jeopardizing patient privacy, but also
significantly advances research in the fields of anomaly detection, modality
synthesis, learning under limited data, and fair and ethical AI. Code and
trained models are available at: https://github.com/AmigoLab/SynthAnatomy
Comment: 13 pages, 3 figures, 2 tables, accepted at SASHIMI MICCAI 2022
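A hedged sketch of the general two-stage recipe such a model could follow is below: a VQ autoencoder (not shown) compresses the 3D volume into a sequence of discrete codes, and a transformer models the code sequence autoregressively with a conditioning prefix built from subject covariates such as age, sex, and disease status. Module names, sizes, and the covariate encoding are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn as nn

class ConditionalLatentTransformer(nn.Module):
    """Sketch of autoregressive modelling over flattened VQ codes of a 3D
    latent grid, conditioned on covariates via a prefix token. All names
    and hyperparameters are assumptions, not the authors' exact settings."""

    def __init__(self, vocab: int = 1024, n_covariates: int = 3, d: int = 512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab, d)
        self.cov_proj = nn.Linear(n_covariates, d)       # conditioning prefix
        self.pos_emb = nn.Parameter(torch.zeros(1, 4096, d))
        layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d, vocab)

    def forward(self, codes: torch.Tensor, covariates: torch.Tensor):
        # codes: (B, T) flattened VQ indices of the 3D latent grid
        # covariates: (B, n_covariates), e.g. normalised age, sex, status
        h = torch.cat([self.cov_proj(covariates).unsqueeze(1),
                       self.tok_emb(codes)], dim=1)
        h = h + self.pos_emb[:, : h.size(1)]
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(h.size(1))
        h = self.blocks(h, mask=mask)
        return self.head(h[:, :-1])   # next-code logits, aligned with codes
```

Training would minimise cross-entropy between these logits and the observed codes; sampling a new code sequence from chosen covariates and decoding it through the VQ decoder would yield a synthetic conditioned volume.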