The Poisson transform for unnormalised statistical models
Contrary to standard statistical models, unnormalised statistical models only
specify the likelihood function up to a constant. While such models are natural
and popular, the lack of normalisation makes inference much more difficult.
Here we show that inferring the parameters of an unnormalised model defined
on a space can be mapped onto an equivalent problem of estimating the
intensity of a Poisson point process on the same space. The unnormalised statistical model now
specifies an intensity function that does not need to be normalised.
Effectively, the normalisation constant may now be inferred as just another
parameter, at no loss of information. The result can be extended to cover
non-IID models, including, for example, unnormalised models for sequences of
graphs (dynamical graphs) or for sequences of binary vectors. As a
consequence, we prove that unnormalised parametric inference in non-IID models
can be turned into a semi-parametric estimation problem. Moreover, we show that
the noise-contrastive divergence of Gutmann & Hyvärinen (2012) can be
understood as an approximation of the Poisson transform, and extended to
non-IID settings. We use our results to fit spatial Markov chain models of eye
movements, where the Poisson transform allows us to turn a highly non-standard
model into vanilla semi-parametric logistic regression.
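As a toy illustration of this logistic-regression view (in the spirit of noise-contrastive estimation; the model, noise distribution, and plain gradient-descent optimiser below are our own illustrative choices, not the paper's), consider fitting an unnormalised Gaussian log f(x) = -θx² + c, where c absorbs the unknown log normalising constant and is estimated as just another parameter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Data come from the normalised model with theta_true = 0.5 (a standard
# Gaussian); the fitted model only knows log f up to the constant c.
x_data = rng.normal(0.0, 1.0, size=5000)

# Noise samples from a known reference density q = N(0, 2^2).
x_noise = rng.normal(0.0, 2.0, size=5000)
log_q = lambda x: -0.5 * (x / 2.0) ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))

# Logistic regression classifying data vs noise, with
# log f(x) - log q(x) as the logit: the NCE objective.
x_all = np.concatenate([x_data, x_noise])
y_all = np.concatenate([np.ones(5000), np.zeros(5000)])

theta, c = 0.1, 0.0
for _ in range(2000):
    logits = -theta * x_all**2 + c - log_q(x_all)
    p = 1.0 / (1.0 + np.exp(-logits))   # predicted P(sample is data)
    g = p - y_all                       # d(loss)/d(logit) per sample
    theta -= 0.1 * np.mean(g * (-x_all**2))
    c -= 0.1 * np.mean(g)

# theta converges near 0.5 and c near -log sqrt(2*pi), the true
# log normalising constant, recovered with no extra machinery.
```

Because the objective is convex in (θ, c), plain gradient descent suffices here; the point is that the intractable normaliser never has to be computed.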
Hierarchical Models in the Brain
This paper describes a general model that subsumes many parametric models for
continuous data. The model comprises hidden layers of state-space or dynamic
causal models, arranged so that the output of one provides input to another. The
ensuing hierarchy furnishes a model for many types of data, of arbitrary
complexity. Special cases range from the general linear model for static data to
generalised convolution models, with system noise, for nonlinear time-series
analysis. Crucially, all of these models can be inverted using exactly the same
scheme, namely, dynamic expectation maximization. This means that a single model
and optimisation scheme can be used to invert a wide range of models. We present
the model and a brief review of its inversion to disclose the relationships
among, apparently, diverse generative models of empirical data. We then show
that this inversion can be formulated as a simple neural network and may provide
a useful metaphor for inference and learning in the brain.
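A minimal generative sketch of the hierarchical idea, assuming simple linear-Gaussian dynamics of our own choosing (the paper's framework is far more general): the hidden state of an upper layer provides the input to the layer below, whose output generates the observations.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200

# Top layer: a slowly varying hidden state with its own system noise.
x2 = np.zeros(T)
for t in range(1, T):
    x2[t] = 0.99 * x2[t - 1] + 0.05 * rng.normal()

# Bottom layer: a faster state-space model driven by the top layer's output.
x1 = np.zeros(T)
for t in range(1, T):
    x1[t] = 0.8 * x1[t - 1] + x2[t] + 0.1 * rng.normal()

# Observations: a linear mapping of the bottom state plus measurement noise.
# Dropping the dynamics entirely recovers the general linear model for
# static data as a special case.
y = 2.0 * x1 + 0.1 * rng.normal(size=T)
```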
Future state maximisation as an intrinsic motivation for decision making
The concept of an “intrinsic motivation” is used in the psychology literature to distinguish between behaviour that is motivated by the expectation of an immediate, quantifiable reward (“extrinsic motivation”) and behaviour that arises because it is inherently useful, interesting or enjoyable. Examples of the latter include curiosity-driven behaviour such as exploration and the accumulation of knowledge, as well as developing skills that might not be immediately useful but that have the potential to be re-used in a variety of different future situations. In this thesis, we examine a candidate for an intrinsic motivation with wide-ranging applicability, which we refer to as “future state maximisation”. Loosely speaking, this is the idea that, all else being equal, decisions should be made so as to maximally keep one's options open, or to give the maximal amount of control over what one can potentially do in the future. Our goal is to study how this principle can be applied in a quantitative manner, as well as to identify examples of systems where doing so could be useful in either explaining or generating behaviour.
We consider a number of examples, but our primary application is to a model of collective motion in which a group of agents equipped with simple visual sensors moves around in two dimensions. In this model, agents aim to make decisions about how to move so as to maximise the amount of control they have over the potential visual states that they can access in the future. We find that with each agent following this simple, low-level motivational principle, a swarm spontaneously emerges in which the agents exhibit rich collective behaviour, remaining cohesive and highly aligned. Remarkably, the emergent swarm also shares a number of features observed in real flocks of starlings, including scale-free correlations and marginal opacity. We go on to explore how the model can be developed to allow us to manipulate and control the swarm, and we examine heuristics which mimic future state maximisation whilst requiring significantly less computation, and so could plausibly operate under animal cognition.
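The principle can be made quantitative by counting reachable states. A sketch on a toy grid world (the grid, walls, and horizon below are illustrative assumptions, not the thesis's visual-swarm model): the agent greedily picks the action whose successor state keeps the largest number of states reachable within a fixed horizon.

```python
SIZE = 5
WALLS = {(2, 1), (2, 2), (2, 3)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # includes "stay"

def step(pos, a):
    """Move unless blocked by a wall or the grid boundary."""
    nxt = (pos[0] + a[0], pos[1] + a[1])
    if nxt in WALLS or not (0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE):
        return pos
    return nxt

def reachable(pos, horizon):
    """All states reachable from pos within `horizon` steps (breadth-first)."""
    frontier, seen = {pos}, {pos}
    for _ in range(horizon):
        frontier = {step(p, a) for p in frontier for a in ACTIONS} - seen
        seen |= frontier
    return seen

def best_action(pos, horizon=3):
    # Greedy future state maximisation: act so as to keep one's options open.
    return max(ACTIONS, key=lambda a: len(reachable(step(pos, a), horizon)))
```

From the corner (0, 0), `best_action((0, 0))` selects the move down along the open column, away from the wall, because that successor leaves the most states reachable within the horizon.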
Unsupervised speech enhancement with diffusion-based generative models
Recently, conditional score-based diffusion models have gained significant
attention in the field of supervised speech enhancement, yielding
state-of-the-art performance. However, these methods may face challenges when
generalising to unseen conditions. To address this issue, we introduce an
alternative approach that operates in an unsupervised manner, leveraging the
generative power of diffusion models. Specifically, in a training phase, a
clean speech prior distribution is learnt in the short-time Fourier transform
(STFT) domain using score-based diffusion models, allowing it to
unconditionally generate clean speech from Gaussian noise. Then, we develop a
posterior sampling methodology for speech enhancement by combining the learnt
clean speech prior with a noise model for speech signal inference. The noise
parameters are simultaneously learnt along with clean speech estimation through
an iterative expectation-maximisation (EM) approach. To the best of our
knowledge, this is the first work exploring diffusion-based generative models
for unsupervised speech enhancement, demonstrating promising results compared
to a recent variational auto-encoder (VAE)-based unsupervised approach and a
state-of-the-art diffusion-based supervised method. It thus opens a new
direction for future research in unsupervised speech enhancement.
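The prior-plus-noise-model EM structure can be sketched in a much simpler setting. Here we substitute a fixed Gaussian for the learnt diffusion prior (an assumption of this sketch, made so the E-step has closed form; the paper's prior is a score-based model over STFT coefficients) and show the EM loop recovering an unknown noise variance:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in "clean signal prior": a zero-mean Gaussian with known variance.
sigma_x2 = 1.0

# Noisy observations y = x + n with unknown noise variance (truth: 0.25).
x = rng.normal(0.0, np.sqrt(sigma_x2), size=10000)
y = x + rng.normal(0.0, 0.5, size=10000)

sigma_n2 = 1.0                                # initial noise-variance guess
for _ in range(100):
    # E-step: Gaussian posterior over the clean signal given y.
    gain = sigma_x2 / (sigma_x2 + sigma_n2)   # Wiener gain
    post_mean = gain * y
    post_var = gain * sigma_n2
    # M-step: closed-form noise-variance update from the posterior moments.
    sigma_n2 = np.mean((y - post_mean) ** 2 + post_var)
```

The clean-signal estimate (the posterior mean) and the noise parameters are refined jointly, mirroring the alternation described in the abstract.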
Amortised learning by wake-sleep
Models that employ latent variables to capture structure in observed data lie at the heart of many current unsupervised learning algorithms, but exact maximum-likelihood learning for powerful and flexible latent-variable models is almost always intractable. Thus, state-of-the-art approaches either abandon the maximum-likelihood framework entirely, or else rely on a variety of variational approximations to the posterior distribution over the latents. Here, we propose an alternative approach that we call amortised learning. Rather than computing an approximation to the posterior over latents, we use a wake-sleep Monte Carlo strategy to learn a function that directly estimates the maximum-likelihood parameter updates. Amortised learning is possible whenever samples of latents and observations can be simulated from the generative model, treating the model as a “black box”. We demonstrate its effectiveness on a wide range of complex models, including those with latents that are discrete or supported on non-Euclidean spaces.
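The wake-sleep idea can be sketched on a one-parameter toy model (the model, features, and learning rate are our own choices for illustration): in the sleep phase, (latent, observation) pairs are simulated from the current model together with their exact joint-likelihood gradients; a regressor fitted to those gradients then estimates the marginal-likelihood gradient on real data, by Fisher's identity E[∇θ log p(x, z) | x] = ∇θ log p(x).

```python
import numpy as np

rng = np.random.default_rng(3)

# Generative model: z ~ N(0, 1), x | z ~ N(theta * z, 1).
# Observed data were generated with theta_true = 2.0; z is never observed.
theta_true = 2.0
x_data = theta_true * rng.normal(size=20000) + rng.normal(size=20000)

theta = 0.5
for _ in range(50):
    # Sleep phase: simulate (z, x) pairs from the current model and record
    # the exact joint gradient d/dtheta log p(x, z) = (x - theta*z) * z.
    z = rng.normal(size=20000)
    x = theta * z + rng.normal(size=20000)
    g = (x - theta * z) * z

    # Regress the simulated gradients onto functions of x alone; for this
    # model the features [1, x^2] are sufficient to represent E[grad | x].
    A = np.stack([np.ones_like(x), x ** 2], axis=1)
    w, *_ = np.linalg.lstsq(A, g, rcond=None)

    # Wake phase: apply the amortised estimator to the real observations.
    theta += 0.5 * np.mean(w[0] + w[1] * x_data ** 2)
```

No posterior approximation is ever formed: only the ability to simulate from the generative model is used, treating it as a black box.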
Hybrid system identification using switching density networks
Behaviour cloning is a commonly used strategy for imitation learning and can
be extremely effective in constrained domains. However, in cases where the
dynamics of an environment may be state dependent and varying, behaviour
cloning places a burden on model capacity and the number of demonstrations
required. This paper introduces switching density networks, which rely on a
categorical reparametrisation for hybrid system identification. This results in
a network comprising a classification layer that is followed by a regression
layer. We use switching density networks to predict the parameters of hybrid
control laws, which are toggled by a switching layer to produce different
controller outputs, when conditioned on an input state. This work shows how
switching density networks can be used for hybrid system identification in a
variety of tasks, successfully identifying the key joint angle goals that make
up manipulation tasks, while simultaneously learning image-based goal
classifiers and regression networks that predict joint angles from images. We
also show that they can cluster the phase space of an inverted pendulum,
identifying the balance, spin and pump controllers required to solve this task.
Switching density networks can be difficult to train, but we introduce a cross
entropy regularisation loss that stabilises training.
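The classification-then-regression structure can be sketched as a forward pass (a minimal numpy sketch with hand-set parameters chosen for illustration; the paper's networks are learnt and deeper): a gating layer produces a categorical choice over controllers, and the chosen controller's regression head produces the output.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hand-set parameters: controller 0 handles x < 0 with the law y = 2x;
# controller 1 handles x >= 0 with the law y = -3x.
W_gate = np.array([[-5.0], [5.0]])   # (K, d) gating (classification) weights
W_ctrl = np.array([[2.0], [-3.0]])   # (K, d) per-controller regression gains

def forward(x):
    x = np.atleast_2d(x)                       # (N, d) input states
    probs = softmax(x @ W_gate.T)              # (N, K) controller probabilities
    k = probs.argmax(axis=1)                   # hard categorical switch
    y = np.einsum('nd,nd->n', x, W_ctrl[k])    # selected controller's output
    return y, k

y, k = forward(np.array([[-1.0], [2.0]]))
# controller 0 fires for x = -1 (y = -2); controller 1 for x = 2 (y = -6)
```

The hard argmax here is the inference-time behaviour; during training the soft controller probabilities keep the switch differentiable, which is where the categorical reparametrisation comes in.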
Learning object, grasping and manipulation activities using hierarchical HMMs
This article presents a probabilistic algorithm for representing and learning complex manipulation activities performed by humans in everyday life. The work builds on the multi-level Hierarchical Hidden Markov Model (HHMM) framework which allows decomposition of longer-term complex manipulation activities into layers of abstraction whereby the building blocks can be represented by simpler action modules called action primitives. This way, human task knowledge can be synthesised in a compact, effective representation suitable, for instance, to be subsequently transferred to a robot for imitation. The main contribution is the use of a robust framework capable of dealing with the uncertainty or incomplete data inherent to these activities, and the ability to represent behaviours at multiple levels of abstraction for enhanced task generalisation. Activity data from 3D video sequencing of human manipulation of different objects handled in everyday life is used for evaluation. A comparison with a mixed generative-discriminative hybrid model HHMM/SVM (support vector machine) is also presented to add rigour in highlighting the benefit of the proposed approach against comparable state-of-the-art techniques. © 2014 Springer Science+Business Media New York
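The building block underlying the HHMM is the flat HMM's forward algorithm, which scores an observation sequence under a given layer of the hierarchy (the 2-state, 2-symbol parameters below are illustrative values, not from the article; the hierarchical model composes such layers, with higher-level states emitting sub-HMMs over action primitives):

```python
import numpy as np

# Parameters of a toy 2-state, 2-symbol HMM.
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # state-transition matrix
B = np.array([[0.7, 0.3],
              [0.1, 0.9]])          # emission probabilities per state
pi = np.array([0.5, 0.5])           # initial state distribution

def forward_loglik(obs):
    """log p(obs) via the scaled forward algorithm (avoids underflow)."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # propagate, then weight by emission
        c = alpha.sum()
        loglik += np.log(c)
        alpha = alpha / c
    return loglik
```

Scoring candidate activity sequences by likelihood in this way is what allows the layered model to classify and segment activities while remaining robust to uncertain or incomplete observations.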