9 research outputs found

    Unsupervised Controllable Generation with Self-Training

    Get PDF
    Recent generative adversarial networks (GANs) are able to generate impressive photo-realistic images. However, controllable generation with GANs remains a challenging research problem. Achieving controllable generation requires semantically interpretable and disentangled factors of variation. It is challenging to achieve this goal using simple fixed distributions such as Gaussian distribution. Instead, we propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training. Self-training provides an iterative feedback in the GAN training, from the discriminator to the generator, and progressively improves the proposal of the latent codes as training proceeds. The latent codes are sampled from a latent variable model that is learned in the feature space of the discriminator. We consider a normalized independent component analysis model and learn its parameters through tensor factorization of the higher-order moments. Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder, and is able to discover semantically meaningful latent codes without any supervision. We demonstrate empirically on both cars and faces datasets that each group of elements in the learned code controls a mode of variation with a semantic meaning, e.g. pose or background change. We also demonstrate with quantitative metrics that our method generates better results compared to other approaches

    Unsupervised Controllable Generation with Self-Training

    Get PDF
    Recent generative adversarial networks (GANs) are able to generate impressive photo-realistic images. However, controllable generation with GANs remains a challenging research problem. Achieving controllable generation requires semantically interpretable and disentangled factors of variation. It is challenging to achieve this goal using simple fixed distributions such as Gaussian distribution. Instead, we propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training. Self-training provides an iterative feedback in the GAN training, from the discriminator to the generator, and progressively improves the proposal of the latent codes as training proceeds. The latent codes are sampled from a latent variable model that is learned in the feature space of the discriminator. We consider a normalized independent component analysis model and learn its parameters through tensor factorization of the higher-order moments. Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder, and is able to discover semantically meaningful latent codes without any supervision. We demonstrate empirically on both cars and faces datasets that each group of elements in the learned code controls a mode of variation with a semantic meaning, e.g. pose or background change. We also demonstrate with quantitative metrics that our method generates better results compared to other approaches

    Towards a Neural Graphics Pipeline for Controllable Image Generation

    Get PDF
    In this paper, we leverage advances in neural networks towards forming a neural rendering for controllable image generation, and thereby bypassing the need for detailed modeling in conventional graphics pipeline. To this end, we present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models. NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation. To form an image, NGP generates coarse 3D models that are fed into neural rendering modules to produce view-specific interpretable 2D maps, which are then composited into the final output image using a traditional image formation model. Our approach offers control over image generation by providing direct handles controlling illumination and camera parameters, in addition to control over shape and appearance variations. The key challenge is to learn these controls through unsupervised training that links generated coarse 3D models with unpaired real images via neural and traditional (e.g., Blinn- Phong) rendering functions, without establishing an explicit correspondence between them. We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes. We evaluate our hybrid modeling framework, compare with neural-only generation methods (namely, DCGAN, LSGAN, WGAN-GP, VON, and SRNs), report improvement in FID scores against real images, and demonstrate that NGP supports direct controls common in traditional forward rendering. Code is available at http://geometry.cs.ucl.ac.uk/projects/2021/ngp.Comment: Eurographics 202

    Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA

    Full text link
    Decision trees are machine learning models commonly used in various application scenarios. In the era of big data, traditional decision tree induction algorithms are not suitable for learning large-scale datasets due to their stringent data storage requirement. Online decision tree learning algorithms have been devised to tackle this problem by concurrently training with incoming samples and providing inference results. However, even the most up-to-date online tree learning algorithms still suffer from either high memory usage or high computational intensity with dependency and long latency, making them challenging to implement in hardware. To overcome these difficulties, we introduce a new quantile-based algorithm to improve the induction of the Hoeffding tree, one of the state-of-the-art online learning models. The proposed algorithm is light-weight in terms of both memory and computational demand, while still maintaining high generalization ability. A series of optimization techniques dedicated to the proposed algorithm have been investigated from the hardware perspective, including coarse-grained and fine-grained parallelism, dynamic and memory-based resource sharing, pipelining with data forwarding. We further present a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques. Experimental results show that our proposed algorithm outperforms the state-of-the-art Hoeffding tree learning method, leading to 0.05% to 12.3% improvement in inference accuracy. Real implementation of the complete learning system on the FPGA demonstrates a 384x to 1581x speedup in execution time over the state-of-the-art design.Comment: appear as a conference paper in FCCM 201
    corecore