On the Difference Between the Information Bottleneck and the Deep Information Bottleneck
Combining the Information Bottleneck model with deep learning by replacing
mutual information terms with deep neural nets has proved successful in areas
ranging from generative modelling to interpreting deep neural networks. In this
paper, we revisit the Deep Variational Information Bottleneck and the
assumptions needed for its derivation. The two assumed properties of the data,
X and Y, and their latent representation T take the form of two Markov chains,
T - X - Y and X - T - Y. Requiring both to hold during the optimisation process
can be limiting for the set of potential joint distributions P(X, Y, T). We
therefore show how to circumvent this limitation by optimising a lower bound
for I(T; Y) for which only the latter Markov chain has to be satisfied. The
actual mutual information consists of the lower bound which is optimised in
DVIB and cognate models in practice and of two terms measuring how much the
former requirement T - X - Y is violated. Finally, we propose to interpret the
family of information bottleneck models as directed graphical models and show
that in this framework the original and deep information bottlenecks are
special cases of a fundamental IB model.
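
The quantities named in this abstract can be illustrated on a small discrete example. The following is a minimal sketch, not the paper's method: the joint table `p_xy`, the encoder `p_t_given_x`, and the weight `beta` are hypothetical values chosen for illustration, and the chain T - X - Y holds here only because T is generated from X alone. Under that assumption the data processing inequality guarantees I(T; Y) <= I(X; Y).

```python
import numpy as np


def mutual_information(joint):
    """Mutual information (in nats) of a 2-D joint probability table."""
    joint = np.asarray(joint, dtype=float)
    p_row = joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    p_col = joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    product = p_row @ p_col                    # table of independent marginals
    mask = joint > 0                           # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


# Hypothetical toy distributions, chosen only for illustration.
p_xy = np.array([[0.30, 0.10],
                 [0.05, 0.25],
                 [0.10, 0.20]])         # P(X, Y): 3 values of X, 2 values of Y
p_t_given_x = np.array([[0.9, 0.1],
                        [0.2, 0.8],
                        [0.5, 0.5]])    # stochastic encoder P(T | X)

p_x = p_xy.sum(axis=1)

# T is sampled from X alone, so the chain T - X - Y holds by construction.
p_xt = p_x[:, None] * p_t_given_x       # P(X, T) = P(X) P(T | X)
p_ty = p_t_given_x.T @ p_xy             # P(T, Y) = sum_x P(T | x) P(x, Y)

i_xt = mutual_information(p_xt)         # compression term I(X; T)
i_ty = mutual_information(p_ty)         # relevance term I(T; Y)
i_xy = mutual_information(p_xy)         # upper limit on I(T; Y)

beta = 2.0                              # hypothetical trade-off weight
ib_objective = i_xt - beta * i_ty       # classic IB Lagrangian, to be minimised

print(f"I(X;T)={i_xt:.4f}  I(T;Y)={i_ty:.4f}  I(X;Y)={i_xy:.4f}")
```

In the deep variants the two mutual information terms are not computed exactly as above but replaced by variational bounds parameterised by neural networks.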
Semantic Compression of Episodic Memories
Storing knowledge of an agent's environment in the form of a probabilistic
generative model has been established as a crucial ingredient in a multitude of
cognitive tasks. Perception has been formalised as probabilistic inference over
the state of latent variables, whereas in decision making the model of the
environment is used to predict likely consequences of actions. Such generative
models have earlier been proposed to underlie semantic memory but it remained
unclear if this model also underlies the efficient storage of experiences in
episodic memory. We formalise the compression of episodes in the normative
framework of information theory and argue that semantic memory provides the
distortion function for compression of experiences. Recent advances and
insights from machine learning allow us to approximate semantic compression in
naturalistic domains and contrast the resulting deviations in compressed
episodes with memory errors observed in the experimental literature on human
memory.
Comment: CogSci201
Learning Extremal Representations with Deep Archetypal Analysis
Archetypes are typical population representatives in an extremal sense, where
typicality is understood as the most extreme manifestation of a trait or
feature. In linear feature space, archetypes approximate the data convex hull
allowing all data points to be expressed as convex mixtures of archetypes.
However, it might not always be possible to identify meaningful archetypes in a
given feature space. Learning an appropriate feature space and identifying
suitable archetypes simultaneously addresses this problem. This paper
introduces a generative formulation of the linear archetype model,
parameterized by neural networks. By introducing the distance-dependent
archetype loss, the linear archetype model can be integrated into the latent
space of a variational autoencoder, and an optimal representation with respect
to the unknown archetypes can be learned end-to-end. Reformulating linear
Archetypal Analysis as a deep variational information bottleneck allows
the incorporation of arbitrarily complex side information during training.
Furthermore, an alternative prior, based on a modified Dirichlet distribution,
is proposed. The real-world applicability of the proposed method is
demonstrated by exploring archetypes of female facial expressions while using
multi-rater based emotion scores of these expressions as side information. A
second application illustrates the exploration of the chemical space of small
organic molecules. This experiment demonstrates that exchanging the side
information while keeping the same set of molecules, e.g., using the heat
capacity of each molecule instead of the band gap energy as side information,
results in the identification of different archetypes. As an application,
these learned representations of chemical space might reveal distinct starting
points for de novo molecular design.
Comment: Under review for publication at the International Journal of Computer
Vision (IJCV). Extended version of our GCPR2019 paper "Deep Archetypal
Analysis"
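
The linear archetype model mentioned in this abstract, in which every data point is a convex mixture of archetypes, can be sketched in a few lines. This is an illustrative sketch only: the three `archetypes`, the random `logits`, and the use of a softmax to obtain mixture weights are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical archetypes: 3 extreme points (a triangle) in a 2-D feature space.
archetypes = np.array([[0.0, 0.0],
                       [1.0, 0.0],
                       [0.0, 1.0]])


def softmax(logits):
    """Row-wise softmax: maps arbitrary scores to non-negative weights summing to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Hypothetical unconstrained scores for 5 data points over the 3 archetypes.
logits = rng.normal(size=(5, 3))
weights = softmax(logits)               # each row lies on the probability simplex

# Each point is a convex combination of the archetypes, so it lies inside
# the convex hull they span (here, the triangle).
points = weights @ archetypes

print(points)
```

Because the weights are non-negative and sum to one, every generated point stays inside the triangle spanned by the archetypes, which is the sense in which archetypes approximate the convex hull of the data.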