Semantic Compression of Episodic Memories
Storing knowledge of an agent's environment in the form of a probabilistic
generative model has been established as a crucial ingredient in a multitude of
cognitive tasks. Perception has been formalised as probabilistic inference over
the state of latent variables, whereas in decision making the model of the
environment is used to predict likely consequences of actions. Such generative
models have previously been proposed to underlie semantic memory, but it remained
unclear whether such a model also underlies the efficient storage of experiences in
episodic memory. We formalise the compression of episodes in the normative
framework of information theory and argue that semantic memory provides the
distortion function for compression of experiences. Recent advances and
insights from machine learning allow us to approximate semantic compression in
naturalistic domains and contrast the resulting deviations in compressed
episodes with memory errors observed in the experimental literature on human
memory.
Comment: CogSci201
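To make the rate-distortion framing concrete, here is a minimal, hypothetical sketch (not the authors' model): a VAE-style generative model stands in for semantic memory, the KL term plays the role of the rate, and the reconstruction likelihood under the generative model plays the role of the distortion. All names, dimensions, and the beta trade-off are illustrative assumptions.

# Hypothetical sketch: rate-distortion view of semantic compression.
# The "semantic" generative model defines the distortion; the KL to the
# prior is the rate paid for storing an episode's compressed code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticCompressor(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, h_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x, beta=1.0):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterised code
        x_hat = self.dec(z)
        # Distortion: reconstruction error under the generative (semantic) model.
        distortion = F.binary_cross_entropy_with_logits(
            x_hat, x, reduction='none').sum(-1)
        # Rate: KL between the episode's code and the prior.
        rate = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(-1)
        return (distortion + beta * rate).mean()

x = torch.rand(8, 784)          # a batch of toy "episodes"
loss = SemanticCompressor()(x)  # beta controls how aggressively episodes are compressed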
OOGAN: Disentangling GAN with One-Hot Sampling and Orthogonal Regularization
Exploring the potential of GANs for unsupervised disentanglement learning,
this paper proposes a novel GAN-based disentanglement framework with One-Hot
Sampling and Orthogonal Regularization (OOGAN). While previous works mostly
attempt to tackle disentanglement learning through VAE and seek to implicitly
minimize the Total Correlation (TC) objective with various sorts of
approximation methods, we show that GANs have a natural advantage in
disentangling with an alternating latent variable (noise) sampling method that
is straightforward and robust. Furthermore, we provide a brand-new perspective
on designing the structure of the generator and discriminator, demonstrating
that a minor structural change and an orthogonal regularization on model
weights yield improved disentanglement. Instead of experimenting on simple
toy datasets, we conduct experiments on higher-resolution images and show that
OOGAN greatly pushes the boundary of unsupervised disentanglement.
Comment: AAAI 202
VAE with a VampPrior
Many different methods to train deep generative models have been introduced
in the past. In this paper, we propose to extend the variational auto-encoder
(VAE) framework with a new type of prior which we call "Variational Mixture of
Posteriors" prior, or VampPrior for short. The VampPrior consists of a mixture
distribution (e.g., a mixture of Gaussians) with components given by
variational posteriors conditioned on learnable pseudo-inputs. We further
extend this prior to a two-layer hierarchical model and show that this
architecture, with a coupled prior and posterior, learns significantly better
models. The model also avoids the usual local optima issues related to useless
latent dimensions that plague VAEs. We provide empirical studies on six
datasets, namely, static and dynamic MNIST, OMNIGLOT, Caltech 101 Silhouettes,
Frey Faces and Histopathology patches, and show that applying the hierarchical
VampPrior delivers state-of-the-art results on all datasets in the unsupervised
permutation-invariant setting, and results that are the best or comparable to SOTA
methods when convolutional networks are used.
Comment: 16 pages, final version, AISTATS 201
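The VampPrior itself has a simple form, p(z) = (1/K) * sum_k q(z | u_k), a mixture of the encoder's variational posteriors evaluated at K learnable pseudo-inputs u_k. The sketch below evaluates that mixture density; the encoder wrapper, dimensions, and number of pseudo-inputs are illustrative assumptions, not the paper's code.

# Hypothetical sketch of a VampPrior-style prior density.
import math
import torch
import torch.nn as nn

class VampPrior(nn.Module):
    def __init__(self, encoder, x_dim, n_pseudo=50):
        super().__init__()
        self.encoder = encoder  # callable mapping x -> (mu, logvar)
        self.pseudo_inputs = nn.Parameter(torch.rand(n_pseudo, x_dim))

    def log_prob(self, z):
        mu, logvar = self.encoder(self.pseudo_inputs)   # each of shape (K, z_dim)
        z = z.unsqueeze(1)                               # (B, 1, z_dim) for broadcasting
        # Diagonal-Gaussian log density of z under each mixture component.
        log_comp = -0.5 * (logvar + (z - mu) ** 2 / logvar.exp()
                           + math.log(2 * math.pi)).sum(-1)      # (B, K)
        # Uniform mixture over the K pseudo-input components.
        return torch.logsumexp(log_comp, dim=1) - math.log(self.pseudo_inputs.size(0))

enc_net = nn.Sequential(nn.Linear(784, 64), nn.Tanh(), nn.Linear(64, 2 * 8))
encoder = lambda x: enc_net(x).chunk(2, dim=-1)
prior = VampPrior(encoder, x_dim=784, n_pseudo=10)
lp = prior.log_prob(torch.randn(4, 8))   # log p(z) for 4 latent samples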
Predictive Uncertainty through Quantization
High-risk domains require reliable confidence estimates from predictive
models. Deep latent variable models provide these, but suffer from the rigid
variational distributions used for tractable inference, which err on the side
of overconfidence. We propose Stochastic Quantized Activation Distributions
(SQUAD), which imposes a flexible yet tractable distribution over discretized
latent variables. The proposed method is scalable, self-normalizing and sample
efficient. We demonstrate that the model fully utilizes the flexible
distribution, learns interesting non-linearities, and provides predictive
uncertainty of competitive quality.
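A rough illustration of the underlying idea, a categorical distribution over a fixed grid of discretized activation values from which a mean activation and a per-unit uncertainty can be read off. This is a generic sketch under assumed names and bin settings, not the SQUAD architecture itself.

# Hypothetical sketch of a distribution over quantized activations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QuantizedActivation(nn.Module):
    def __init__(self, in_dim, out_dim, n_bins=11):
        super().__init__()
        self.logits = nn.Linear(in_dim, out_dim * n_bins)
        # Fixed grid of possible activation values, e.g. -1.0 ... 1.0.
        self.register_buffer("grid", torch.linspace(-1.0, 1.0, n_bins))
        self.out_dim, self.n_bins = out_dim, n_bins

    def forward(self, x):
        logits = self.logits(x).view(-1, self.out_dim, self.n_bins)
        probs = F.softmax(logits, dim=-1)                  # distribution over grid values
        mean = (probs * self.grid).sum(-1)                 # expected activation per unit
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1)  # simple uncertainty signal
        return mean, entropy

layer = QuantizedActivation(32, 16)
h, unc = layer(torch.randn(4, 32))   # activations plus a per-unit uncertainty estimate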