Unsupervised Controllable Generation with Self-Training
Recent generative adversarial networks (GANs) are able to generate impressive
photo-realistic images. However, controllable generation with GANs remains a
challenging research problem. Achieving controllable generation requires
semantically interpretable and disentangled factors of variation. It is
challenging to achieve this goal with simple fixed distributions such as the
Gaussian. Instead, we propose an unsupervised framework to learn a
distribution of latent codes that control the generator through self-training.
Self-training provides iterative feedback during GAN training, from the
discriminator to the generator, progressively improving the latent-code
proposals as training proceeds. The latent codes are sampled from a latent
variable model that is learned in the feature space of the discriminator. We
consider a normalized independent component analysis model and learn its
parameters through tensor factorization of the higher-order moments. Our
framework exhibits better disentanglement compared to other variants such as
the variational autoencoder, and is able to discover semantically meaningful
latent codes without any supervision. We demonstrate empirically on both cars
and faces datasets that each group of elements in the learned code controls a
mode of variation with a semantic meaning, e.g., pose or background changes. We
also demonstrate with quantitative metrics that our method generates better
results than other approaches.
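The higher-order moments mentioned above can be made concrete with a small sketch. This is a toy illustration of computing an empirical third-order moment tensor, not the authors' code; the function name `third_moment_tensor` is my own, and the tensor-factorization step that recovers the ICA parameters is omitted.

```python
import numpy as np

def third_moment_tensor(X):
    # X: (n_samples, d) data matrix. Returns the empirical third-order
    # moment tensor E[x ⊗ x ⊗ x] of shape (d, d, d).
    return np.einsum('ni,nj,nk->ijk', X, X, X) / X.shape[0]

rng = np.random.default_rng(0)
X = rng.standard_normal((10000, 3))
M3 = third_moment_tensor(X)
# For zero-mean Gaussian data all third moments vanish in expectation;
# it is non-Gaussian independent sources that make this tensor informative,
# which is why a factorization of it can identify ICA components.
```

Factorizing such a tensor (e.g., by tensor power iteration) yields the component directions of a non-Gaussian latent variable model, which is the role the abstract assigns to tensor factorization of higher-order moments.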
Towards a Neural Graphics Pipeline for Controllable Image Generation
In this paper, we leverage advances in neural networks to form a neural
rendering pipeline for controllable image generation, thereby bypassing the
need for detailed modeling in a conventional graphics pipeline. To this end, we
present Neural Graphics Pipeline (NGP), a hybrid generative model that brings
together neural and traditional image formation models. NGP decomposes the
image into a set of interpretable appearance feature maps, uncovering direct
control handles for controllable image generation. To form an image, NGP
generates coarse 3D models that are fed into neural rendering modules to
produce view-specific interpretable 2D maps, which are then composited into the
final output image using a traditional image formation model. Our approach
offers control over image generation by providing direct handles controlling
illumination and camera parameters, in addition to control over shape and
appearance variations. The key challenge is to learn these controls through
unsupervised training that links generated coarse 3D models with unpaired real
images via neural and traditional (e.g., Blinn-Phong) rendering functions,
without establishing an explicit correspondence between them. We demonstrate
the effectiveness of our approach on controllable image generation of
single-object scenes. We evaluate our hybrid modeling framework, compare with
neural-only generation methods (namely, DCGAN, LSGAN, WGAN-GP, VON, and SRNs),
report improvement in FID scores against real images, and demonstrate that NGP
supports direct controls common in traditional forward rendering. Code is
available at http://geometry.cs.ucl.ac.uk/projects/2021/ngp.
Comment: Eurographics 202
Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA
Decision trees are machine learning models commonly used in various
application scenarios. In the era of big data, traditional decision tree
induction algorithms are not suitable for learning large-scale datasets due to
their stringent data storage requirement. Online decision tree learning
algorithms have been devised to tackle this problem by concurrently training
with incoming samples and providing inference results. However, even the most
up-to-date online tree learning algorithms still suffer from either high memory
usage or computationally intensive operations with data dependencies and long
latency, making them challenging to implement in hardware. To overcome these
difficulties, we
introduce a new quantile-based algorithm to improve the induction of the
Hoeffding tree, one of the state-of-the-art online learning models. The
proposed algorithm is light-weight in terms of both memory and computational
demand, while still maintaining high generalization ability. A series of
optimization techniques dedicated to the proposed algorithm have been
investigated from the hardware perspective, including coarse-grained and
fine-grained parallelism, dynamic and memory-based resource sharing, and
pipelining with data forwarding. We further present a high-performance,
hardware-efficient
and scalable online decision tree learning system on a field-programmable gate
array (FPGA) with system-level optimization techniques. Experimental results
show that our proposed algorithm outperforms the state-of-the-art Hoeffding
tree learning method, leading to 0.05% to 12.3% improvement in inference
accuracy. Real implementation of the complete learning system on the FPGA
demonstrates a 384x to 1581x speedup in execution time over the
state-of-the-art design.
Comment: appears as a conference paper in FCCM 201
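The split criterion of the Hoeffding tree mentioned above rests on the Hoeffding bound, which can be sketched in a few lines. This is a generic illustration of the standard bound, not the paper's quantile-based algorithm; the numeric parameters are arbitrary examples.

```python
import math

def hoeffding_bound(R, delta, n):
    # ε = sqrt(R² · ln(1/δ) / (2n)): with probability at least 1 − δ, the
    # true mean of a random variable with range R lies within ε of the
    # empirical mean of n observations.
    return math.sqrt(R * R * math.log(1.0 / delta) / (2.0 * n))

# A Hoeffding tree splits a leaf on the currently best attribute once the
# observed gain margin between the best and second-best attribute exceeds ε,
# so the decision matches batch learning with high probability.
eps = hoeffding_bound(R=1.0, delta=1e-7, n=2000)
```

As n grows, ε shrinks at a rate of 1/√n, which is why the tree can commit to splits from a stream without storing past samples.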