
    Generative Temporal Models with Spatial Memory for Partially Observed Environments

    In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent's representations during training or by use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially partially observed and 3D environments. In this work we introduce a novel action-conditioned generative model of such challenging environments. The model features a non-parametric spatial memory system in which we store learned, disentangled representations of the environment. Low-dimensional spatial updates are computed using a state-space model that makes use of knowledge of the prior dynamics of the moving agent, and high-dimensional visual observations are modelled with a Variational Auto-Encoder. The result is a scalable architecture capable of performing coherent predictions over hundreds of time steps across a range of partially observed 2D and 3D environments. Comment: ICML 2018
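    As a rough illustration of the architecture this abstract describes, the toy sketch below pairs a trivial state-space pose update with a non-parametric memory of (position, latent) pairs; the latent codes stand in for VAE encodings, and the dimensions, the nearest-neighbour read rule, and the deterministic transition are assumptions made for the sketch, not details taken from the paper.

```python
# Illustrative sketch only: a non-parametric spatial memory keyed by agent
# pose, with a toy state-space transition and stand-in "VAE" latents.
import numpy as np

class SpatialMemory:
    """Non-parametric store of (position, latent) pairs."""
    def __init__(self):
        self.positions, self.latents = [], []

    def write(self, pos, z):
        self.positions.append(np.asarray(pos, dtype=float))
        self.latents.append(np.asarray(z, dtype=float))

    def read(self, query_pos, k=3):
        # Soft attention over the k nearest stored positions.
        P = np.stack(self.positions)          # (N, 2)
        Z = np.stack(self.latents)            # (N, D)
        d = np.linalg.norm(P - query_pos, axis=1)
        idx = np.argsort(d)[:k]
        w = np.exp(-d[idx]); w /= w.sum()
        return w @ Z[idx]                     # weighted latent for decoding

def transition(pos, action):
    """Toy deterministic state-space update: pose integrates the action."""
    return pos + action

# Rollout: store latents while "observing", then predict at a future pose.
rng = np.random.default_rng(0)
memory, pos = SpatialMemory(), np.zeros(2)
for t in range(10):
    z_t = rng.normal(size=8)                  # stand-in for a VAE encoding
    memory.write(pos, z_t)
    pos = transition(pos, rng.uniform(-1, 1, size=2))

z_pred = memory.read(query_pos=pos)           # latent to feed a VAE decoder
print("predicted latent:", z_pred[:4])
```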

    Leveraging the Exact Likelihood of Deep Latent Variable Models

    Deep latent variable models (DLVMs) combine the approximation abilities of deep neural networks and the statistical foundations of generative models. Variational methods are commonly used for inference; however, the exact likelihood of these models has been largely overlooked. The purpose of this work is to study the general properties of this quantity and to show how they can be leveraged in practice. We focus on important inferential problems that rely on the likelihood: estimation and missing data imputation. First, we investigate maximum likelihood estimation for DLVMs: in particular, we show that most unconstrained models used for continuous data have an unbounded likelihood function. This problematic behaviour is demonstrated to be a source of mode collapse. We also show how to ensure the existence of maximum likelihood estimates, and draw useful connections with nonparametric mixture models. Finally, we describe an algorithm for missing data imputation using the exact conditional likelihood of a deep latent variable model. On several data sets, our algorithm consistently and significantly outperforms the usual imputation scheme used for DLVMs.
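    The imputation algorithm itself is not spelled out in the abstract; the sketch below only shows the general shape of iterative imputation with a deep latent variable model (encode the current fill-in, sample a latent code, decode, overwrite only the missing entries), using an untrained toy VAE. The architecture sizes, the simple pseudo-Gibbs-style loop, and the mask convention are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch of iterative missing-data imputation with a toy DLVM/VAE.
import torch
import torch.nn as nn

D, H, Z = 20, 64, 5
encoder = nn.Sequential(nn.Linear(D, H), nn.Tanh(), nn.Linear(H, 2 * Z))
decoder = nn.Sequential(nn.Linear(Z, H), nn.Tanh(), nn.Linear(H, D))

def impute(x, mask, n_iters=50):
    """mask[i] == 1 where x[i] is observed, 0 where it is missing."""
    x_hat = torch.where(mask.bool(), x, torch.zeros_like(x))  # initial fill
    for _ in range(n_iters):
        stats = encoder(x_hat)
        mu, log_var = stats[:Z], stats[Z:]
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # sample q(z|x)
        x_dec = decoder(z)                                      # mean of p(x|z)
        # Keep observed values, refresh only the missing coordinates.
        x_hat = torch.where(mask.bool(), x, x_dec)
    return x_hat

x = torch.randn(D)
mask = (torch.rand(D) > 0.3).float()   # roughly 30% of entries missing
print(impute(x, mask)[:5])
```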

    Memory-Based Learning of Latent Structures for Generative Adversarial Networks

    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2019. 2. ๊น€๊ฑดํฌ.๋ณธ ์—ฐ๊ตฌ๋Š” Generative Adversarial Network (GAN) ๋ชจ๋ธ์˜ ํ•™์Šต ๊ณผ์ •์—์„œ ๋ฐœ์ƒํ•˜๋Š” ๋‘๊ฐ€์ง€ ๋ฌธ์ œ์ ์„ ํ•ด๊ฒฐํ•˜๋Š” ๋ฐฉ์•ˆ์„ ์ œ์‹œํ•˜์˜€๋‹ค. ๋จผ์ €, ์ผ๋ฐ˜์ ์ธ GAN ๋ชจ๋ธ์€ ์‚ฌ์ง„๊ณผ ๊ฐ™์€ ๋ณต์žกํ•œ ํ™•๋ฅ ๋ณ€์ˆ˜์˜ ๋ถ„ํฌ๋ฅผ ๋ชจ๋ธ๋งํ•  ๋•Œ ์ž ์žฌ๋ณ€์ˆ˜์˜ ์‚ฌ์ „ํ™•๋ฅ ๋ถ„ํฌ๋กœ ํ‘œ์ค€์ •๊ทœ๋ถ„ํฌ๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด๋Ÿฐ ์—ฐ์†์ ์ธ ์ž ์žฌ๋ณ€์ˆ˜๋ฅผ ์‚ฌ์šฉํ•  ๊ฒฝ์šฐ ์„œ๋กœ ๋‹ค๋ฅธ ๋ฐ์ดํ„ฐ ์ƒ˜ํ”Œ๊ฐ„์˜ ๊ตฌ์กฐ์  ๋ถˆ์—ฐ์†์„ฑ์„ ๋ฐ˜์˜ํ•˜๊ธฐ ์–ด๋ ต๋‹ค. ๋˜ ๋‹ค๋ฅธ ๋ฌธ์ œ์ ์œผ๋กœ, GAN ๋ชจ๋ธ์—์„œ ํŒ๋ณ„์ž๋Š” ํ•™์Šต ๊ณผ์ •์—์„œ ๊ณผ๊ฑฐ์— ์ƒ์„ฑ์ž ๋ชจ๋ธ์ด ์ƒ์„ฑํ–ˆ๋˜ ๋ฐ์ดํ„ฐ ์ƒ˜ํ”Œ์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ๋ง๊ฐํ•˜๋ฉฐ, ์ด๋กœ์ธํ•ด ํ•™์Šต ๊ณผ์ •์ด ๋ถˆ์•ˆ์ •ํ•ด์ง„๋‹ค. ์ด ๋‘๊ฐ€์ง€ ๋ฌธ์ œ์ ์€ ์ƒ์„ฑ์ž๊ฐ€ ํŒ๋ณ„์ž๊ฐ€ ๊ณต์œ ํ•˜๋Š” memory network๋ฅผ ๋™์‹œ์— ํ•™์Šตํ•จ์œผ๋กœ์จ ํฌ๊ฒŒ ์™„ํ™”ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ƒ์„ฑ์ž๊ฐ€ ํ•™์Šต ๋ฐ์ดํ„ฐ์— ๋‚ด์žฌ๋œ ๊ตฐ์ง‘์˜ ๋ถ„ํฌ๋ฅผ ํ•™์Šตํ•œ๋‹ค๋ฉด ์ด๋ฅผ ํ†ตํ•ด ๊ตฌ์กฐ์  ๋ถˆ์—ฐ์†์„ฑ์œผ๋กœ ์ธํ•œ ์„ฑ๋Šฅ ํ•˜๋ฝ์„ ํ”ผํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ํŒ๋ณ„์ž๊ฐ€ ์ฃผ์–ด์ง„ ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ํŒ๋ณ„์„ ํ•  ๋•Œ ํ•™์Šต ์ „ ๊ณผ์ •์— ๊ฑธ์ณ ์ƒ์„ฑ์ž๊ฐ€ ์ƒ์„ฑํ–ˆ๋˜ ๋ฐ์ดํ„ฐ ์ƒ˜ํ”Œ๋“ค๋กœ๋ถ€ํ„ฐ ํ•™์Šต๋œ ๊ตฐ์ง‘ ๋ถ„ํฌ๋ฅผ ์ฐธ์กฐํ•œ๋‹ค๋ฉด ๋ง๊ฐ ๋ฌธ์ œ๋กœ ์ธํ•œ ์˜ํ–ฅ์„ ๋œ ๋ฐ›๊ฒŒ ๋œ๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ ์ œ์‹œํ•œ memoryGAN ๋ชจ๋ธ์€ ๋น„์ง€๋„ํ•™์Šต์„ ํ†ตํ•ด ๋ฐ์ดํ„ฐ์— ๋‚ด์žฌ๋œ ๊ตฐ์ง‘์˜ ๋ถ„ํฌ๋ฅผ ํ•™์Šตํ•˜์—ฌ ๊ตฌ์กฐ์  ๋ถˆ์—ฐ์†์„ฑ ๋ฌธ์ œ์™€ ๋ง๊ฐ ๋ฌธ์ œ๋ฅผ ์™„ํ™”ํ•˜๋ฉฐ, ๋Œ€๋ถ€๋ถ„์˜ GAN ๋ชจ๋ธ์— ์ ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค. Fashion-MNIST, CelebA, CIFAR10, ๊ทธ๋ฆฌ๊ณ  Chairs ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ์„ฑ๋Šฅ ํ‰๊ฐ€ ๋ฐ ์‹œ๊ฐํ™” ์‹คํ—˜์„ ํ†ตํ•ด memoryGAN์ด ํ™•๋ฅ ๋ก ์ ์œผ๋กœ ํ•ด์„ ๊ฐ€๋Šฅํ•œ ๋ชจ๋ธ์ด๋ฉฐ, ๋†’์€ ์ˆ˜์ค€์˜ ์‚ฌ์ง„ ์ƒ˜ํ”Œ์„ ์ƒ์„ฑํ•œ๋‹ค๋Š” ๊ฒƒ์„ ๋ณด์˜€๋‹ค. ํŠนํžˆ memoryGAN์€ ๊ฐœ์„ ๋œ ์ตœ์ ํ™” ๋ฐฉ๋ฒ•์ด๋‚˜ Weaker divergence๋ฅผ ๋„์ž…ํ•˜์ง€ ์•Š๊ณ ๋„ CIFAR10 ๋ฐ์ดํ„ฐ์…‹์—์„œ Inception Score๋ฅผ ๊ธฐ์ค€์œผ๋กœ ๋น„์ง€๋„ํ•™์Šต ๋ฐฉ์‹์˜ GAN ๋ชจ๋ธ ์ค‘ ๋†’์€ ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ–ˆ๋‹ค.We propose an approach to address two issues that commonly occur during training of unsupervised GANs. First, since GANs use only a continuous latent distribution to embed multiple classes or clusters of data, they often do not correctly handle the structural discontinuity between disparate classes in a latent space. Second, discriminators of GANs easily forget about past generated samples by generators, incurring instability during adversarial training. We argue that these two infamous problems of unsupervised GAN training can be largely alleviated by a learnable memory network to which both generators and discriminators can access. Generators can effectively learn representation of training samples to understand underlying cluster distributions of data, which ease the structure discontinuity problem. At the same time, discriminators can better memorize clusters of previously generated samples, which mitigate the forgetting problem. We propose a novel end-to-end GAN model named memoryGAN, which involves a memory network that is unsupervisedly trainable and integrable to many existing GAN models. With evaluations on multiple datasets such as Fashion-MNIST, CelebA, CIFAR10, and Chairs, we show that our model is probabilistically interpretable, and generates realistic image samples of high visual fidelity. 
The memoryGAN also achieves the state-of-the-art inception scores over unsupervised GAN models on the CIFAR10 dataset, without any optimization tricks and weaker divergences.Introduction Related Works The MemoryGAN Experiments ConclusionMaste
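    To make the shared-memory idea concrete, here is a minimal, hypothetical key-value memory module that both a generator and a discriminator could query; the slot counts, cosine-similarity addressing, and nearest-slot write rule are assumptions for illustration and are deliberately simpler than the clustering-based memory the thesis describes.

```python
# Illustrative sketch (not the thesis' implementation) of a memory shared by
# generator and discriminator: cosine-addressed reads, nearest-slot writes.
import torch
import torch.nn.functional as F

class SharedMemory(torch.nn.Module):
    def __init__(self, n_slots=128, dim=64):
        super().__init__()
        self.keys = torch.nn.Parameter(torch.randn(n_slots, dim))
        self.values = torch.nn.Parameter(torch.randn(n_slots, dim))

    def read(self, query):
        # query: (batch, dim) -> soft attention over slots.
        attn = F.softmax(F.normalize(query, dim=-1) @
                         F.normalize(self.keys, dim=-1).t(), dim=-1)
        return attn @ self.values

    @torch.no_grad()
    def write(self, content, lr=0.1):
        # Nudge the most similar slot toward the new content (e.g. features
        # of a generated sample) so it can be recalled later.
        sims = F.normalize(content, dim=-1) @ F.normalize(self.keys, dim=-1).t()
        idx = sims.argmax(dim=-1)
        self.values[idx] += lr * (content - self.values[idx])

memory = SharedMemory()
features = torch.randn(4, 64)        # stand-in for discriminator features
recalled = memory.read(features)     # conditioning signal for the generator
memory.write(features)
print(recalled.shape)                # torch.Size([4, 64])
```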

    Learning to Learn Variational Semantic Memory

    In this paper, we introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning. The variational semantic memory accrues and stores semantic information for the probabilistic inference of class prototypes in a hierarchical Bayesian framework. The semantic memory is grown from scratch and gradually consolidated by absorbing information from the tasks it experiences. By doing so, it is able to accumulate long-term, general knowledge that enables it to learn new concepts of objects. We formulate memory recall as the variational inference of a latent memory variable from addressed contents, which offers a principled way to adapt the knowledge to individual tasks. Our variational semantic memory, as a new long-term memory module, confers principled recall and update mechanisms that enable semantic information to be efficiently accrued and adapted for few-shot learning. Experiments demonstrate that the probabilistic modelling of prototypes achieves a more informative representation of object classes compared to deterministic vectors. The consistent new state-of-the-art performance on four benchmarks shows the benefit of variational semantic memory in boosting few-shot recognition. Comment: accepted to NeurIPS 2020; code is available at https://github.com/YDU-uva/VS
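    The following sketch illustrates, under assumed shapes and a deliberately simple addressing and fusion rule, how recalling a latent memory variable might refine a class prototype computed from a few-shot support set; it is a toy rendering of the mechanism the abstract describes, not the paper's model.

```python
# Hedged sketch: stochastic memory recall refining a few-shot class prototype.
import torch
import torch.nn.functional as F

def recall(memory, query, noise_scale=0.1):
    """Address memory by similarity, then sample a latent memory variable."""
    attn = F.softmax(query @ memory.t(), dim=-1)       # (1, n_slots)
    mu = attn @ memory                                  # addressed content
    return mu + noise_scale * torch.randn_like(mu)      # stochastic recall

def prototype(support, memory):
    """Combine the support-set mean with recalled semantic memory."""
    mean = support.mean(dim=0, keepdim=True)            # (1, dim)
    z_m = recall(memory, mean)
    return 0.5 * (mean + z_m)                           # simple fusion rule

dim, n_slots = 32, 16
memory = torch.randn(n_slots, dim)                      # consolidated knowledge
support = torch.randn(5, dim)                           # 5-shot support features
queries = torch.randn(8, dim)

proto = prototype(support, memory)                      # (1, dim)
scores = -torch.cdist(queries, proto)                   # nearest-prototype logits
print(scores.shape)                                     # torch.Size([8, 1])
```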