Multi-Level Variational Autoencoder: Learning Disentangled Representations from Grouped Observations
We would like to learn a representation of the data which decomposes an
observation into factors of variation which we can independently control.
Specifically, we want to use minimal supervision to learn a latent
representation that reflects the semantics behind a specific grouping of the
data, where within a group the samples share a common factor of variation. For
example, consider a collection of face images grouped by identity. We wish to
anchor the semantics of the grouping into a relevant and disentangled
representation that we can easily exploit. However, existing deep probabilistic
models often assume that the observations are independent and identically
distributed. We present the Multi-Level Variational Autoencoder (ML-VAE), a new
deep probabilistic model for learning a disentangled representation of a set of
grouped observations. The ML-VAE separates the latent representation into
semantically meaningful parts by working both at the group level and the
observation level, while retaining efficient test-time inference. Quantitative
and qualitative evaluations show that the ML-VAE model (i) learns a
semantically meaningful disentanglement of grouped data, (ii) enables
manipulation of the latent representation, and (iii) generalises to unseen
groups.
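The grouping mechanism at the heart of the ML-VAE lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch rendering of the idea: each observation gets its own style (observation-level) posterior, while the content (group-level) posteriors of all observations in a group are accumulated into one shared posterior, here via a product of Gaussians in which precisions add and means are precision-weighted. Layer sizes, names, and the exact accumulation rule are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal ML-VAE-style sketch in PyTorch; architecture details are assumptions.
import torch
import torch.nn as nn

class MLVAE(nn.Module):
    def __init__(self, x_dim=784, z_style=16, z_content=16, h=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU())
        # Separate heads for the observation-level (style) and
        # group-level (content) posterior parameters.
        self.style_head = nn.Linear(h, 2 * z_style)
        self.content_head = nn.Linear(h, 2 * z_content)
        self.dec = nn.Sequential(nn.Linear(z_style + z_content, h),
                                 nn.ReLU(), nn.Linear(h, x_dim))

    @staticmethod
    def group_posterior(mu, logvar):
        # Accumulate group evidence with a product of Gaussians:
        # precisions add, means are precision-weighted.
        prec = torch.exp(-logvar)               # (N, z)
        prec_g = prec.sum(dim=0)                # (z,)
        mu_g = (mu * prec).sum(dim=0) / prec_g
        return mu_g, -torch.log(prec_g)

    def forward(self, x_group):                 # x_group: (N, x_dim), one group
        h = self.enc(x_group)
        mu_s, lv_s = self.style_head(h).chunk(2, dim=-1)
        mu_c, lv_c = self.content_head(h).chunk(2, dim=-1)
        mu_g, lv_g = self.group_posterior(mu_c, lv_c)
        z_s = mu_s + torch.randn_like(mu_s) * (0.5 * lv_s).exp()
        z_c = mu_g + torch.randn_like(mu_g) * (0.5 * lv_g).exp()
        # The shared content code is broadcast to every group member.
        z = torch.cat([z_s, z_c.expand(x_group.size(0), -1)], dim=-1)
        return self.dec(z), (mu_s, lv_s), (mu_g, lv_g)
```

Swapping the shared content code between groups while holding the per-observation style codes fixed is the kind of latent manipulation the abstract refers to.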
DISCO Nets: DISsimilarity COefficient Networks
We present a new type of probabilistic model which we call DISsimilarity
COefficient Networks (DISCO Nets). DISCO Nets allow us to efficiently sample
from a posterior distribution parametrised by a neural network. During
training, DISCO Nets are learned by minimising the dissimilarity coefficient
between the true distribution and the estimated distribution. This allows us to
tailor the training to the loss related to the task at hand. We empirically
show that (i) by modeling uncertainty on the output value, DISCO Nets
outperform equivalent non-probabilistic predictive networks and (ii) DISCO Nets
accurately model the uncertainty of the output, outperforming existing
probabilistic models based on deep neural networks.
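The training procedure described above can be made concrete. The snippet below is a hedged PyTorch illustration of a dissimilarity-coefficient-style objective: K samples are drawn by feeding noise alongside the input, a first term pulls the samples toward the ground truth under the task loss, and a second term pushes the samples apart so the network does not collapse to a point estimate. The network shape, the uniform noise, and the gamma weight are illustrative assumptions.

```python
# Hedged sketch of a DISCO-Net-style training step in PyTorch.
import torch
import torch.nn as nn

def disco_loss(samples, y_true, delta, gamma=1.0):
    # samples: (K, d) draws G(x, z_k); y_true: (d,); delta: task loss.
    K = samples.size(0)
    fit = torch.stack([delta(s, y_true) for s in samples]).mean()
    spread = torch.stack([delta(samples[i], samples[j])
                          for i in range(K) for j in range(K) if i != j]).mean()
    return fit - 0.5 * gamma * spread

class DiscoNet(nn.Module):
    # A deterministic network that maps (input, noise) to one sample.
    def __init__(self, x_dim, z_dim, y_dim, h=128):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(nn.Linear(x_dim + z_dim, h), nn.ReLU(),
                                 nn.Linear(h, y_dim))

    def sample(self, x, k):
        z = torch.rand(k, self.z_dim)            # noise injected at the input
        return self.net(torch.cat([x.expand(k, -1), z], dim=-1))

# Usage: eight samples per training point, squared error as the task loss.
net = DiscoNet(x_dim=10, z_dim=4, y_dim=3)
x, y = torch.randn(10), torch.randn(3)
loss = disco_loss(net.sample(x, k=8), y,
                  delta=lambda a, b: ((a - b) ** 2).sum())
loss.backward()
```

Because the objective is built directly from delta, swapping the squared error for a structured, task-specific loss tailors training to the task at hand, which is the flexibility the abstract highlights.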
How agents see things: On visual representations in an emergent language game
There is growing interest in the language developed by agents interacting in
emergent-communication settings. Earlier studies have focused on the agents'
symbol usage, rather than on their representation of visual input. In this
paper, we consider the referential games of Lazaridou et al. (2017) and
investigate the representations the agents develop during their evolving
interaction. We find that the agents establish successful communication by
inducing visual representations that almost perfectly align with each other,
but, surprisingly, do not capture the conceptual properties of the objects
depicted in the input images. We conclude that, if we are interested in
developing language-like communication systems, we must pay more attention to
the visual semantics agents associate with the symbols they use.

Comment: 2018 Conference on Empirical Methods in Natural Language Processing
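The alignment finding suggests a simple kind of measurement: compare the two agents' embeddings of the same images. The sketch below uses representational similarity analysis, correlating the agents' pairwise-similarity structures; the abstract does not spell out an exact procedure, so the metric here (Spearman correlation between cosine-similarity matrices) and all names are illustrative.

```python
# Hedged sketch: quantifying representation alignment between two agents.
import torch
import torch.nn.functional as F

def rsa_alignment(reps_a, reps_b):
    # reps_*: (N, d) embeddings of the same N images from the two agents.
    n = reps_a.size(0)
    rows, cols = torch.triu_indices(n, n, offset=1)

    def upper_sims(r):
        # Upper triangle of the pairwise cosine-similarity matrix.
        r = F.normalize(r, dim=-1)
        return (r @ r.T)[rows, cols]

    def ranks(v):
        return v.argsort().argsort().float()

    # Spearman correlation = Pearson correlation of the ranks.
    a, b = ranks(upper_sims(reps_a)), ranks(upper_sims(reps_b))
    a, b = a - a.mean(), b - b.mean()
    return (a * b).sum() / (a.norm() * b.norm())

# Usage: near-identical representations score close to 1.
x = torch.randn(50, 64)
print(rsa_alignment(x, x + 0.01 * torch.randn_like(x)))
```

A high score between the two agents, paired with a low score against a similarity structure derived from the objects' conceptual properties, would be consistent with the abstract's conclusion.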