Information transfer and causality in the sensorimotor loop
This thesis investigates information-theoretic tools for detecting and describing causal influences in embodied agents. It presents an analysis of philosophical and statistical approaches to causation, and in particular focuses on causal Bayes nets and transfer entropy. It argues for a novel perspective that explicitly incorporates the epistemological role of information as a tool for inference. This approach clarifies and resolves some of the known problems associated with such methods.
Here it is argued, through a series of experiments, mathematical results and some philosophical accounts, that universally applicable measures of causal influence strength are unlikely to exist. Instead, the focus should be on the role that information-theoretic tools can play in inferential tests for causal relationships in embodied agents particularly, and dynamical systems in general. This thesis details how these two approaches differ.
Following directly from these arguments, the thesis proposes a concept of “hidden” information transfer to describe situations where causal influences passing through a chain of variables may be more easily detected at the end-points than at intermediate nodes. This is described using theoretical examples, and also appears in the information dynamics of computer-simulated and real robots developed herein. Practical examples include minimal models of agent-environment systems, as well as a novel complete system for generating locomotion gait patterns using a biologically-inspired decentralized architecture on a walking robotic hexapod.
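For readers unfamiliar with the measure at the centre of this work, transfer entropy over discrete time series can be estimated with a simple plug-in (histogram) estimator. The sketch below is the generic textbook estimator with history length 1, not the thesis's own implementation:

```python
import numpy as np
from collections import Counter

def transfer_entropy(src, dst):
    """Plug-in estimate (in bits) of transfer entropy from src to dst
    with history length 1, i.e. I(dst_{t+1}; src_t | dst_t)."""
    triples = list(zip(dst[1:], dst[:-1], src[:-1]))
    n = len(triples)
    c_xyz = Counter(triples)                           # counts of (y', y, x)
    c_yz = Counter((yn, yp) for yn, yp, _ in triples)  # counts of (y', y)
    c_zx = Counter((yp, xp) for _, yp, xp in triples)  # counts of (y, x)
    c_z = Counter(yp for _, yp, _ in triples)          # counts of (y,)
    te = 0.0
    for (yn, yp, xp), c in c_xyz.items():
        # p(y'|y,x) / p(y'|y) expressed via raw counts
        te += (c / n) * np.log2(c * c_z[yp] / (c_yz[yn, yp] * c_zx[yp, xp]))
    return te
```

With a source that deterministically drives the target one step later, the estimate approaches the source's entropy rate; for unrelated series it is near zero (up to small-sample bias, one of the estimation issues the thesis discusses).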
Phenomenological modelling: statistical abstraction methods for Markov chains
Continuous-time Markov chains have long served as exemplary low-level models for an
array of systems, be they natural processes like chemical reactions and population fluctuations
in ecosystems, or artificial processes like server queuing systems or communication
networks. Our interest in such systems is often an emergent macro-scale behaviour, or
phenomenon, which can be well characterised by the satisfaction of a set of properties.
Although theoretically elegant, the fundamental low-level nature of Markov chain models
makes macro-scale analysis of the phenomenon of interest difficult. In particular, it is not
easy to determine the driving mechanisms for the emergent phenomenon, or to predict
how changes at the Markov chain level will influence the macro-scale behaviour.
The difficulties arise primarily from two aspects of such models. Firstly, as the number
of components in the modelled system grows, so does the state-space of the Markov
chain, often making behaviour characterisation intractable for both simulation-based
and analytical methods. Secondly, the behaviour of interest in such systems usually
depends on the inherent stochasticity of the model, and may not be aligned with the
underlying state interpretation: in a model whose states represent a low-level, primitive
aspect of system components, the phenomenon of interest often varies significantly with
respect to that very aspect.
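To make the low-level object concrete: continuous-time Markov chains of this kind are simulated exactly with Gillespie's algorithm. The sketch below simulates a generic immigration-death chain (a toy model chosen for illustration, not one of the systems studied in the thesis) and shows the trajectory-level object on which any behaviour characterisation must operate:

```python
import numpy as np

def simulate_birth_death(lam, mu, x0, t_end, rng):
    """Exact (Gillespie) simulation of an immigration-death CTMC:
    births occur at constant rate lam, deaths at rate mu * x."""
    t, x = 0.0, x0
    times, states = [t], [x]
    while t < t_end:
        total = lam + mu * x          # total exit rate of current state
        t += rng.exponential(1.0 / total)
        if rng.random() < lam / total:
            x += 1                    # immigration event
        else:
            x -= 1                    # death event
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)
```

Even for this one-dimensional example, characterising a macro-scale property (say, the long-run mean lam/mu) requires either many such trajectories or analysis of the full state-space, which is the cost the text refers to.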
This work focuses on providing methodological frameworks that circumvent these
issues by developing abstraction strategies, which preserve the phenomena of interest. In
the first part of this thesis, we express behavioural characteristics of the system in terms
of a temporal logic with Markov chain trajectories as semantic objects. This allows us
to group regions of the state-space by how well they satisfy the logical properties that
characterise macro-scale behaviour, in order to produce an abstracted Markov chain.
States of the abstracted chain correspond to certain satisfaction probabilities of the logical
properties, and inferred dynamics match the behaviour of the original chain in terms of
the properties. The resulting model has a smaller state-space which is interpretable in
terms of an emergent behaviour of the original system, and is therefore valuable to a
researcher despite the accuracy sacrifices.

Coarsening based on logical properties is particularly useful in multi-scale modelling,
where a layer of the model is a (continuous-time) Markov chain. In such models, the layer
is relevant to other layers only in terms of its output: some logical property evaluated
on the trajectory drawn from the Markov chain. We develop here a framework for
constructing a surrogate (discrete-time) Markov chain, with states corresponding to layer
output. The expensive simulation of a large Markov chain is therefore replaced by an
interpretable abstracted model. We can further use this framework to test whether a
posited mechanism could be the driver for a specific macro-scale behaviour exhibited by
the model.
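At its barest, the surrogate construction amounts to estimating a transition matrix between property-satisfaction labels observed on consecutive windows of sampled trajectories. The following is a hypothetical, minimal sketch of that counting step only, not the framework developed in the thesis:

```python
import numpy as np

def surrogate_dtmc(label_sequences, n_labels):
    """Estimate the transition matrix of an abstracted discrete-time chain
    whose states are the property labels (0..n_labels-1) observed on
    consecutive windows of trajectories of the original chain."""
    counts = np.zeros((n_labels, n_labels))
    for seq in label_sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    row = counts.sum(axis=1, keepdims=True)
    # row-normalise; leave never-visited label rows as zeros
    return np.divide(counts, row, out=np.zeros_like(counts), where=row > 0)
```

The surrogate chain is small and its states carry the layer's output semantics directly, so other model layers can consume it without simulating the original chain.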
We use a powerful Bayesian non-parametric regression technique based on Gaussian
process theory to produce the necessary elements of the abstractions above. In particular,
we observe trajectories of the original system, from which we infer both the satisfaction of
logical properties under varying model parametrisations and the dynamics of the abstracted
system that match the behaviour of the original.
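The regression step can be sketched in a few lines of numpy, assuming an RBF kernel with fixed hyperparameters; the actual inference machinery in the thesis is more sophisticated than this generic posterior-mean computation:

```python
import numpy as np

def gp_posterior_mean(X_train, y_train, X_test, length=1.0, noise=1e-2):
    """Posterior mean of GP regression with an RBF kernel on 1-D inputs,
    e.g. satisfaction probability as a function of a model parameter."""
    def k(a, b):
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length) ** 2)
    K = k(X_train, X_train) + noise * np.eye(len(X_train))  # noisy Gram matrix
    Ks = k(X_test, X_train)                                 # cross-covariances
    return Ks @ np.linalg.solve(K, y_train)
```

Being non-parametric, the GP interpolates the satisfaction estimates across parametrisations without committing to a functional form, which is what makes it suitable for the abstraction pipeline described above.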
The final part of the thesis presents a novel continuous-state process approximation
to the macro-scale behaviour of discrete-state Markov chains with large state-spaces.
The method is based on spectral analysis of the transition matrix of the chain, where we
use the popular manifold learning method of diffusion maps to analyse the transition
matrix as the operator of a hidden continuous process. An embedding of states in
a continuous space is recovered, and the space is endowed with a drift vector field
inferred via Gaussian process regression. In this manner, we form an ODE whose
solution approximates the evolution of the CTMC mean, mapped onto the continuous
space (known as the fluid limit). Our method is general and differs significantly from
other continuous approximation methods; the latter rely on the Markov chain having
a particular population structure, suggestive of a natural continuous state-space and
associated dynamics.
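The spectral step admits a generic illustration: the leading non-trivial right eigenvectors of the transition matrix supply embedding coordinates for the states. This toy version (restricted to a small reversible chain, with no kernel normalisation or diffusion-time parameter) is only meant to convey the idea, not the method developed here:

```python
import numpy as np

def diffusion_map(P, n_coords=2):
    """Embed the states of a Markov chain with transition matrix P using
    the leading non-trivial right eigenvectors, scaled by eigenvalue
    (a simplified diffusion-maps step)."""
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)     # leading eigenvalue of P is 1
    idx = order[1:1 + n_coords]        # skip the trivial constant eigenvector
    return vals.real[idx] * vecs.real[:, idx]
```

For a nearest-neighbour walk on a path, the first coordinate orders the states along the path, recovering a natural one-dimensional continuous geometry from the transition structure alone.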
Overall, this thesis contributes novel methodologies that emphasize the importance
of macro-scale behaviour in modelling complex systems. Part of the work focuses on
abstracting large systems into more concise systems that retain behavioural characteristics
and are interpretable to the modeller. The final part examines the relationship between
continuous and discrete state-spaces and seeks a transition path between the two that
does not rely on exogenous semantics of the system states. Beyond their computational
and theoretical benefits, these methodologies push at the boundaries of several
prevalent approaches to stochastic modelling.
Learning AMP chain graphs and some marginal models thereof under faithfulness
This paper deals with chain graphs under the Andersson-Madigan-Perlman (AMP) interpretation. In particular, we present a constraint-based algorithm for learning an AMP chain graph that a given probability distribution is faithful to. Moreover, we show that the extension of Meek's conjecture to AMP chain graphs does not hold, which compromises the development of efficient and correct score + search learning algorithms under assumptions weaker than faithfulness. We also study the problem of how to represent the result of marginalizing out some nodes in an AMP CG. We introduce a new family of graphical models that solves this problem partially. We name this new family maximal covariance-concentration graphs because it includes both covariance and concentration graphs as subfamilies.
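Constraint-based learners of the kind presented here query a conditional-independence oracle on the data. For Gaussian data that oracle is commonly a partial-correlation (Fisher z) test; the following generic sketch shows such a test, not the paper's algorithm:

```python
import numpy as np
from math import erf, log, sqrt

def fisher_z_ci_test(data, i, j, cond, alpha=0.05):
    """Partial-correlation (Fisher z) test of data[:, i] independent of
    data[:, j] given data[:, cond]. True iff independence is not rejected."""
    idx = [i, j] + list(cond)
    # partial correlation of i and j given cond, via the precision matrix
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    r = -prec[0, 1] / sqrt(prec[0, 0] * prec[1, 1])
    z = 0.5 * log((1 + r) / (1 - r)) * sqrt(data.shape[0] - len(cond) - 3)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal p-value
    return p > alpha
```

The learner's correctness under faithfulness rests entirely on the reliability of these queries, which is why weakening the faithfulness assumption is the delicate point the abstract raises.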