Improving Generalization for Abstract Reasoning Tasks Using Disentangled Feature Representations
In this work we explore the generalization characteristics of unsupervised
representation learning by leveraging disentangled VAEs to learn a useful
latent space on a set of relational reasoning problems derived from Raven's
Progressive Matrices. We show that the latent representations, learned by
unsupervised training using the right objective function, significantly
outperform the same architectures trained with purely supervised learning,
especially when it comes to generalization.
Auto-regressive Image Synthesis with Integrated Quantization
Deep generative models have achieved conspicuous progress in realistic image
synthesis with multifarious conditional inputs, while generating diverse yet
high-fidelity images remains a grand challenge in conditional image generation.
This paper presents a versatile framework for conditional image generation
which incorporates the inductive bias of CNNs and powerful sequence modeling of
auto-regression that naturally leads to diverse image generation. Instead of
independently quantizing the features of multiple domains as in prior research,
we design an integrated quantization scheme with a variational regularizer that
mingles the feature discretization in multiple domains, and markedly boosts the
auto-regressive modeling performance. Notably, the variational regularizer
makes it possible to regularize feature distributions in incomparable latent
spaces by penalizing the intra-domain variations of distributions. In addition,
we design a Gumbel sampling strategy that incorporates distribution uncertainty
into the auto-regressive training procedure. The Gumbel sampling substantially
mitigates the exposure bias that often causes misalignment between the training
and inference stages and severely impairs the inference performance. Extensive
experiments over multiple conditional image generation tasks show that our
method achieves superior performance in diverse image generation, both
qualitatively and quantitatively, compared with the state of the art.
Comment: Accepted to ECCV 2022 as Oral Presentation
DualVAE: Controlling Colours of Generated and Real Images
Colour-controlled image generation and manipulation are of interest to
artists and graphic designers. Vector Quantised Variational AutoEncoders
(VQ-VAEs) with an autoregressive (AR) prior are able to produce high-quality
images, but lack an explicit representation mechanism to control colour
attributes. We introduce DualVAE, a hybrid representation model that provides
such control by learning disentangled representations for colour and geometry.
The geometry is represented by an image intensity mapping that identifies
structural features. The disentangled representation is obtained by two novel
mechanisms: (i) a dual-branch architecture that separates image colour
attributes from geometric attributes, and (ii) a new ELBO that trains the
combined colour and
geometry representations. DualVAE can control the colour of generated images,
and recolour existing images by transferring the colour latent representation
obtained from an exemplar image. We demonstrate that DualVAE generates images
with FID nearly two times better than VQ-GAN on a diverse collection of
datasets, including animated faces, logos and artistic landscapes.