Generalised Kernel Representations with Applications to Data Efficient Machine Learning

Abstract

The universe of mathematical modelling from observational data is a vast space. It consists of a cacophony of differing paths, with doors to worlds of seemingly diametrically opposed perspectives that all attempt to conjure a crystal ball of both intuitive understanding and predictive capability. Among these many worlds is an approach broadly called kernel methods, which, while complex in detail, ultimately reduces when viewed from afar to a rather simple question: how close is something to something else? What does it mean to be close? Specifically, how can we quantify closeness in a reasonable and principled way?

This thesis presents four approaches that address generalised kernel learning. Firstly, we introduce a probabilistic framework that allows joint learning of model and kernel parameters in order to capture nonstationary spatial phenomena. Secondly, we introduce a theoretical framework based on optimal transport that enables online kernel parameter transfer. Such transfer involves the ability to re-use previously learned parameters, without re-optimisation, on newly observed data; this extends the first contribution, which was unable to operate in real time because its parameters had to be re-optimised for each new set of observations. Thirdly, we introduce learnable Fourier-based kernel embeddings that exploit generalised quantile representations for stationary kernels. Finally, we propose a method for input-warped Fourier kernel embeddings that allows nonstationary data to be embedded using simple stationary kernels. By introducing theoretically cohesive and algorithmically intuitive methods, this thesis opens new doors to removing traditional assumptions that have hindered adoption of the kernel perspective. We hope that the ideas presented offer a curious and inspiring view of the potential of learnable kernel embeddings.
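To make the flavour of the third and fourth contributions concrete, the sketch below illustrates the general idea of a Fourier (random feature) kernel embedding for a stationary RBF kernel, and how warping the inputs before embedding yields a nonstationary similarity. This is only a minimal illustration under standard assumptions (Bochner-type random Fourier features), not the thesis's actual methods; the names `rff_embed` and `warp` are hypothetical.

```python
import numpy as np

def rff_embed(X, num_features=256, lengthscale=1.0, seed=0):
    """Random Fourier features approximating the stationary RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2)), so that
    phi(x) . phi(y) ~= k(x, y)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density (a Gaussian here).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

def warp(X):
    """A toy fixed input warping; in a nonstationary setting this map
    would typically be learned alongside the kernel parameters."""
    return np.sign(X) * np.log1p(np.abs(X))

X = np.random.default_rng(1).normal(size=(6, 3))
Phi = rff_embed(X)               # stationary embedding
Phi_warped = rff_embed(warp(X))  # nonstationary embedding via input warping
K_approx = Phi @ Phi.T           # approximate kernel Gram matrix
```

The point of the second call is simply that a stationary kernel applied to warped inputs no longer depends only on the difference between the original inputs, which is one route to nonstationary embeddings with otherwise simple kernels.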
