Unsupervised Controllable Generation with Self-Training
Recent generative adversarial networks (GANs) can generate impressive photo-realistic images. However, controllable generation with GANs remains a challenging research problem. Achieving controllable generation requires semantically interpretable and disentangled factors of variation, and this is difficult to achieve with simple fixed distributions such as the Gaussian distribution. Instead, we propose an unsupervised framework that learns, through self-training, a distribution of latent codes that control the generator. Self-training provides iterative feedback during GAN training, from the discriminator to the generator, and progressively improves the latent-code proposal as training proceeds. The latent codes are sampled from a latent variable model learned in the feature space of the discriminator. We consider a normalized independent component analysis (ICA) model and learn its parameters through tensor factorization of the higher-order moments. Our framework exhibits better disentanglement than alternatives such as the variational autoencoder, and discovers semantically meaningful latent codes without any supervision. We demonstrate empirically on both car and face datasets that each group of elements in the learned code controls a mode of variation with a semantic meaning, e.g., a pose or background change. We also show with quantitative metrics that our method generates better results than other approaches.
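To make the feedback loop concrete, below is a minimal sketch of the self-training structure the abstract describes. Everything in it is an illustrative assumption: the networks are toy MLPs on 2D data, and the paper's normalized-ICA latent model (learned by tensor factorization in the discriminator's feature space) is stood in by a diagonal Gaussian refitted on high-scoring latent codes. It shows only the iterative discriminator-to-latent-proposal feedback, not the authors' actual method.

```python
# Hedged sketch: GAN training with a latent-code proposal that is
# periodically refitted using discriminator feedback (self-training).
import torch
import torch.nn as nn

latent_dim, data_dim, batch = 8, 2, 64

# Toy generator, and a discriminator split into a feature extractor and a
# real/fake head so a latent model can be fitted on intermediate features.
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D_feat = nn.Sequential(nn.Linear(data_dim, 16), nn.ReLU())
D_head = nn.Linear(16, 1)

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(list(D_feat.parameters()) + list(D_head.parameters()), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

# Latent proposal: starts as a standard Gaussian, refined as training proceeds.
mu, log_std = torch.zeros(latent_dim), torch.zeros(latent_dim)

def sample_latent(n):
    return mu + log_std.exp() * torch.randn(n, latent_dim)

real_data = torch.randn(1024, data_dim)  # placeholder for a real dataset

for step in range(300):
    real = real_data[torch.randint(0, len(real_data), (batch,))]
    fake = G(sample_latent(batch))

    # Standard GAN discriminator / generator updates.
    d_loss = bce(D_head(D_feat(real)), torch.ones(batch, 1)) + \
             bce(D_head(D_feat(fake.detach())), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    g_loss = bce(D_head(D_feat(fake)), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # Self-training feedback: every 50 steps, refit the latent proposal on
    # the codes whose samples the current discriminator scores as most real.
    if step % 50 == 49:
        with torch.no_grad():
            z_pool = sample_latent(512)
            scores = D_head(D_feat(G(z_pool))).squeeze(1)
            top_z = z_pool[scores.topk(128).indices]
            mu, log_std = top_z.mean(0), top_z.std(0).log()
```

The key design point the sketch preserves is that the proposal distribution over latent codes is a separate, refittable model rather than a fixed prior, which is what lets discriminator feedback progressively reshape it.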
Towards a Neural Graphics Pipeline for Controllable Image Generation
In this paper, we leverage advances in neural networks to form a neural
rendering pipeline for controllable image generation, thereby bypassing the
need for detailed modeling in a conventional graphics pipeline. To this end, we
present Neural Graphics Pipeline (NGP), a hybrid generative model that brings
together neural and traditional image formation models. NGP decomposes the
image into a set of interpretable appearance feature maps, uncovering direct
control handles for controllable image generation. To form an image, NGP
generates coarse 3D models that are fed into neural rendering modules to
produce view-specific interpretable 2D maps, which are then composited into the
final output image using a traditional image formation model. Our approach
offers control over image generation by providing direct handles controlling
illumination and camera parameters, in addition to control over shape and
appearance variations. The key challenge is to learn these controls through
unsupervised training that links generated coarse 3D models with unpaired real
images via neural and traditional (e.g., Blinn-Phong) rendering functions,
without establishing an explicit correspondence between them. We demonstrate
the effectiveness of our approach on controllable image generation of
single-object scenes. We evaluate our hybrid modeling framework, compare with
neural-only generation methods (namely, DCGAN, LSGAN, WGAN-GP, VON, and SRNs),
report improvement in FID scores against real images, and demonstrate that NGP
supports direct controls common in traditional forward rendering. Code is
available at http://geometry.cs.ucl.ac.uk/projects/2021/ngp. (Eurographics 2021)
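Since the abstract names Blinn-Phong as the traditional rendering function in the composite, here is a minimal sketch of that classical shading step applied to interpretable per-pixel maps (normals, albedo) of the kind NGP's neural modules emit. The map names, shapes, and the single directional light are illustrative assumptions rather than the paper's exact pipeline; the explicit light_dir and view_dir arguments mirror the direct illumination and camera handles the abstract describes.

```python
# Hedged sketch of classical Blinn-Phong shading over per-pixel maps;
# the map layout and lighting setup are assumptions for illustration only.
import numpy as np

def blinn_phong(normals, albedo, light_dir, view_dir,
                ambient=0.1, diffuse=0.7, specular=0.2, shininess=32.0):
    """normals: (H, W, 3) unit normals; albedo: (H, W, 3) base color."""
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    h = (l + v) / np.linalg.norm(l + v)        # half-vector
    n_dot_l = np.clip(normals @ l, 0.0, None)  # diffuse term, shape (H, W)
    n_dot_h = np.clip(normals @ h, 0.0, None)  # specular term, shape (H, W)
    shade = ambient + diffuse * n_dot_l + specular * n_dot_h ** shininess
    return np.clip(albedo * shade[..., None], 0.0, 1.0)

# Toy usage: a flat, camera-facing surface lit from above-right. Changing
# light_dir or view_dir directly changes the image, which is the kind of
# explicit control handle the traditional image-formation stage exposes.
H, W = 4, 4
normals = np.zeros((H, W, 3)); normals[..., 2] = 1.0
albedo = np.full((H, W, 3), 0.8)
img = blinn_phong(normals, albedo,
                  light_dir=np.array([0.3, 0.3, 1.0]),
                  view_dir=np.array([0.0, 0.0, 1.0]))
```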