Multimodal Controller for Generative Models
Class-conditional generative models are crucial tools for data generation
from user-specified class labels. Existing approaches for class-conditional
generative models require nontrivial modifications of backbone generative
architectures to model conditional information fed into the model. This paper
introduces a plug-and-play module named the 'multimodal controller' to generate
multimodal data without introducing additional learnable parameters. In the
absence of the controllers, our models reduce to non-conditional generative
models. We test the efficacy of multimodal controllers on CIFAR10, COIL100, and
Omniglot benchmark datasets. We demonstrate that multimodal controlled
generative models (including VAE, PixelCNN, Glow, and GAN) can generate
class-conditional images of significantly better quality than those produced by
conventional conditional generative models. Moreover, we show that multimodal
controlled models can also create novel modalities of images.
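A minimal PyTorch sketch of one plausible reading of such a controller is below: each class owns a fixed random binary mask over a layer's channels, and activations are gated by the mask of the requested class, so no learnable parameters are added. The class name, the mask-construction recipe, and the keep_prob parameter are illustrative assumptions, not the paper's exact scheme.

```python
import torch
import torch.nn as nn

class MultimodalController(nn.Module):
    """Parameter-free, plug-and-play channel gate: each class (mode) owns a
    fixed random binary mask over channels, and activations are multiplied
    by the mask of the requested class. Mask construction and keep_prob are
    illustrative assumptions, not necessarily the paper's exact scheme."""

    def __init__(self, num_classes: int, num_channels: int, keep_prob: float = 0.5):
        super().__init__()
        # Fixed (non-learnable) binary codewords, one per class, stored as a
        # buffer so the module adds no learnable parameters.
        masks = (torch.rand(num_classes, num_channels) < keep_prob).float()
        self.register_buffer("masks", masks)

    def forward(self, x: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature maps; labels: (B,) integer class ids.
        m = self.masks[labels]                   # (B, C) per-sample masks
        return x * m.view(x.size(0), -1, 1, 1)   # gate channels per class

# Usage: drop the controller after hidden layers of an otherwise
# unconditional generator; removing it recovers the unconditional model.
# ctrl = MultimodalController(num_classes=10, num_channels=128)
# h = ctrl(h, labels)
```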
DualVAE: Controlling Colours of Generated and Real Images
Colour-controlled image generation and manipulation are of interest to
artists and graphic designers. Vector Quantised Variational AutoEncoders
(VQ-VAEs) with an autoregressive (AR) prior can produce high-quality
images, but lack an explicit representation mechanism to control colour
attributes. We introduce DualVAE, a hybrid representation model that provides
such control by learning disentangled representations for colour and geometry.
The geometry is represented by an image intensity mapping that identifies
structural features. The disentangled representation is obtained by two novel
mechanisms: (i) a dual-branch architecture that separates image colour
attributes from geometric attributes, and (ii) a new ELBO that trains the
combined colour and geometry representations. DualVAE can control the colour
of generated images,
and recolour existing images by transferring the colour latent representation
obtained from an exemplar image. We demonstrate that DualVAE achieves an FID
nearly two times better (i.e., lower) than VQ-GAN on a diverse collection of
datasets, including animated faces, logos, and artistic landscapes.
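As a concrete illustration, here is a minimal PyTorch sketch of the dual-branch idea: one encoder branch sees the full RGB image and summarises colour, while the other sees only an intensity mapping and encodes geometry, so the two factors are separated by construction. All names, layer sizes, and the use of a channel-mean intensity map are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class DualBranchEncoder(nn.Module):
    """Hypothetical two-branch encoder: the colour branch summarises colour
    statistics from the RGB image, while the geometry branch encodes only an
    intensity (grayscale) mapping that carries structural features."""

    def __init__(self, colour_dim: int = 16, geom_dim: int = 64):
        super().__init__()
        self.colour_net = nn.Sequential(   # sees the full RGB image
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2 * colour_dim),
        )
        self.geom_net = nn.Sequential(     # sees only the intensity map
            nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2 * geom_dim),
        )

    def forward(self, x: torch.Tensor):
        # Channel-mean intensity as a stand-in for the intensity mapping.
        intensity = x.mean(dim=1, keepdim=True)
        c_mu, c_logvar = self.colour_net(x).chunk(2, dim=1)
        g_mu, g_logvar = self.geom_net(intensity).chunk(2, dim=1)
        return (c_mu, c_logvar), (g_mu, g_logvar)

# Recolouring: encode an exemplar, keep its colour latent, and decode it
# together with the geometry latent of the image to be recoloured.
```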
Morphology-preserving Autoregressive 3D Generative Modelling of the Brain
Human anatomy, morphology, and associated diseases can be studied using
medical imaging data. However, access to medical imaging data is restricted by
governance and privacy concerns, data ownership, and the cost of acquisition,
thus limiting our ability to understand the human body. A possible solution to
this issue is the creation of a model able to learn and then generate synthetic
images of the human body conditioned on specific characteristics of relevance
(e.g., age, sex, and disease status). Deep generative models, in the form of
neural networks, have recently been used to create synthetic 2D images of
natural scenes. However, the ability to produce high-resolution 3D volumetric
imaging data with correct anatomical morphology has been hampered by data
scarcity and algorithmic and computational limitations. This work proposes a
generative model that can be scaled to produce anatomically correct,
high-resolution, and realistic images of the human brain, with the necessary
quality to allow further downstream analyses. The ability to generate a
potentially unlimited amount of data not only enables large-scale studies of
human anatomy and pathology without jeopardizing patient privacy, but also
significantly advances research in the fields of anomaly detection, modality
synthesis, learning under limited data, and fair and ethical AI. Code and
trained models are available at: https://github.com/AmigoLab/SynthAnatomy
Comment: 13 pages, 3 figures, 2 tables, accepted at SASHIMI MICCAI 2022
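A hedged sketch of the general two-stage recipe such a model could follow is below: a VQ autoencoder (not shown) compresses the 3D volume into a sequence of discrete codes, and a transformer models the code sequence autoregressively with a conditioning prefix built from subject covariates such as age, sex, and disease status. Module names, sizes, and the covariate encoding are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn as nn

class ConditionalLatentTransformer(nn.Module):
    """Sketch of autoregressive modelling over flattened VQ codes of a 3D
    latent grid, conditioned on covariates via a prefix token. All names
    and hyperparameters are assumptions, not the authors' exact settings."""

    def __init__(self, vocab: int = 1024, n_covariates: int = 3, d: int = 512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab, d)
        self.cov_proj = nn.Linear(n_covariates, d)       # conditioning prefix
        self.pos_emb = nn.Parameter(torch.zeros(1, 4096, d))
        layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d, vocab)

    def forward(self, codes: torch.Tensor, covariates: torch.Tensor):
        # codes: (B, T) flattened VQ indices of the 3D latent grid
        # covariates: (B, n_covariates), e.g. normalised age, sex, status
        h = torch.cat([self.cov_proj(covariates).unsqueeze(1),
                       self.tok_emb(codes)], dim=1)
        h = h + self.pos_emb[:, : h.size(1)]
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(h.size(1))
        h = self.blocks(h, mask=mask)
        return self.head(h[:, :-1])   # next-code logits, aligned with codes
```

Training would minimise cross-entropy between these logits and the observed codes; sampling a new code sequence from chosen covariates and decoding it through the VQ decoder would yield a synthetic conditioned volume.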