Learning and predicting with chain event graphs
Abstract
Graphical models provide a very promising avenue for making sense of large,
complex datasets. The most popular graphical models in use at the moment are
Bayesian networks (BNs). This thesis shows, however, that they are not always ideal factorisations
of a system. Instead, I advocate for the use of a relatively new graphical
model, the chain event graph (CEG), that is based on event trees.
Event trees directly represent the event space of a system in graphical form. Chain
event graphs reduce their potentially huge dimensionality by exploiting
identical probability distributions on some of the event tree’s subtrees. This
brings two added benefits. First, CEGs show the conditional independence
relationships of the system, an advantage of the Bayesian network representation
that event trees lack. Second, causal hypotheses can be implemented just as easily
as, and arguably more naturally than, with Bayesian networks, and over a larger
domain using purely graphical means.
The trade-off for this greater expressive power, however, is that model specification
and selection are much more difficult, because the set of possible models for a
given set of variables is larger. My thesis is the first exposition of how
to learn CEGs. I demonstrate that conjugate (and hence quick) learning of CEGs
is possible, and I characterise priors that imply conjugate updating based
on very reasonable assumptions that also have direct Bayesian network analogues.
By re-casting CEGs as partition models, I show how established partition learning
algorithms can be adapted for the task of learning CEGs.
I then develop a robust yet flexible prediction machine based on CEGs for
any discrete multivariate time series — the dynamic CEG model — which combines
the power of CEGs, multi-process and steady modelling, lattice theory and Occam’s
razor. This is also an exact method that produces reliable predictions without
requiring much a priori modelling. I then demonstrate how easily causal analysis
can be implemented with this model class, which can express a wide variety of causal
hypotheses. I end with an application of these techniques to real educational data,
drawing inferences that would not have been possible using BNs alone.