Faculty of Engineering, School of Computer Science
Abstract
The universe of mathematical modelling from observational data is a vast space. It consists of a
cacophony of differing paths, with doors to worlds with seemingly diametrically opposed perspectives
that all attempt to conjure a crystal ball of both intuitive understanding and predictive capability.
Among these many worlds is an approach that is broadly called kernel methods, which, while
complex in detail, when viewed from afar ultimately reduces to a rather simple question: how close is
something to something else? What does it mean to be close? Specifically, how can we quantify
closeness in some reasonable and principled way?
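One common answer to this question of quantifying closeness is a positive-definite kernel function. As a purely illustrative sketch (not a method from this thesis), the squared-exponential (RBF) kernel maps a squared distance to a similarity in (0, 1], with a lengthscale parameter controlling how quickly similarity decays:

```python
import math

def rbf_kernel(x, y, lengthscale=1.0):
    """Squared-exponential (RBF) kernel: similarity decays with squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

# Identical points are maximally similar (kernel value 1.0);
# distant points have similarity approaching zero.
print(rbf_kernel([0.0, 0.0], [0.0, 0.0]))  # 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]))  # near zero
```

The lengthscale is precisely the kind of kernel parameter whose learning and transfer the contributions below address.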
This thesis presents four approaches that address generalised kernel learning. Firstly, we introduce a
probabilistic framework that allows joint learning of model and kernel parameters in order to capture
nonstationary spatial phenomena. Secondly, we introduce a theoretical framework based on optimal
transport that enables online kernel parameter transfer. Such parameter transfer involves the ability
to re-use previously learned parameters, without re-optimisation, on newly observed data. This
extends the first contribution, which could not operate in real time because parameters had to be
re-optimised for each new observation. Thirdly, we introduce learnable Fourier-based kernel
embeddings that exploit generalised quantile representations for stationary kernels. Finally, we
propose a method for input-warped Fourier kernel embeddings that embeds nonstationary data
using simple stationary kernels.
By introducing theoretically cohesive and algorithmically intuitive methods, this thesis opens new
doors to removing traditional assumptions that have hindered adoption of the kernel perspective. We
hope that the ideas presented offer a curious and inspiring view of the potential of learnable
kernel embeddings.