7 research outputs found

    Particle filter-based Gaussian process optimisation for parameter inference

    We propose a novel method for maximum-likelihood parameter inference in nonlinear and/or non-Gaussian state space models. The method is an iterative procedure with three steps. At each iteration, a particle filter is used to estimate the value of the log-likelihood function at the current parameter iterate. Using these log-likelihood estimates, a surrogate objective function is built with a Gaussian process model. Finally, a heuristic procedure yields a revised parameter iterate, providing an automatic trade-off between exploration and exploitation of the surrogate model. The method is profiled on two state space models, showing good performance in terms of both accuracy and computational cost.
    Comment: Accepted for publication in the proceedings of the 19th World Congress of the International Federation of Automatic Control (IFAC), Cape Town, South Africa, August 2014. 6 pages, 4 figures.
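    The paper's specific acquisition heuristic is not reproduced here; as a rough Python sketch of the overall loop, the following substitutes a noisy toy objective for the particle-filter log-likelihood estimator and a standard expected-improvement rule for the paper's heuristic, with all GP hyperparameters held fixed.

    ```python
    import numpy as np
    from scipy.stats import norm

    def rbf(a, b, ell=0.5):
        # Squared-exponential kernel between 1-D parameter points.
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

    def estimate_loglik(theta):
        # Stand-in for a particle-filter log-likelihood estimate
        # (hypothetical toy objective plus estimator noise).
        return -(theta - 1.0) ** 2 + 0.1 * np.random.randn()

    thetas = np.array([-2.0, 0.0, 2.0])             # initial design
    ll = np.array([estimate_loglik(t) for t in thetas])
    noise = 0.1**2                                  # assumed estimator variance

    for _ in range(15):
        # GP posterior of the log-likelihood surface on a candidate grid.
        cand = np.linspace(-3.0, 3.0, 200)
        K = rbf(thetas, thetas) + noise * np.eye(len(thetas))
        Ks = rbf(cand, thetas)
        mu = Ks @ np.linalg.solve(K, ll)
        var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
        sd = np.sqrt(np.maximum(var, 1e-12))

        # Expected improvement trades off exploration and exploitation.
        z = (mu - ll.max()) / sd
        ei = (mu - ll.max()) * norm.cdf(z) + sd * norm.pdf(z)

        t_next = cand[np.argmax(ei)]
        thetas = np.append(thetas, t_next)
        ll = np.append(ll, estimate_loglik(t_next))

    print("approximate ML estimate:", thetas[np.argmax(ll)])
    ```

    In the actual method, estimate_loglik would run a particle filter over the data at each candidate parameter, and the GP hyperparameters would themselves be re-estimated as evaluations accumulate.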

    Statistical modelling with additive Gaussian process priors

    Regression with Gaussian process (GP) priors has become increasingly popular due to its ability to model complex relationships between variables and to handle auto-correlation in the data through the covariance function of the process, called the kernel. Despite this popularity, the statistical modelling aspect of GP regression has received relatively limited attention. In this thesis, we explore a regression model in which the regression function decomposes into a sum of lower-dimensional functions, akin to the principles of Generalised Additive Models (Hastie and Tibshirani, 1990). We propose additive interaction modelling using a class of hierarchical ANOVA decomposition kernels. This flexible statistical modelling framework naturally accommodates interaction effects of any order without increasing the number of model parameters. Our approach facilitates straightforward assessment and comparison of models with different interaction structures through the model marginal likelihood. We also demonstrate how the framework enhances the interpretability of complex data structures, especially when combined with the concept of kernel centring.

    The second segment of the thesis focuses on the computational aspects of implementing the proposed additive models for large-scale data structured in multidimensional grids. Such structured data often arise with multilevel repeated measurements, as commonly seen in spatio-temporal analysis and in medical, behavioural, and psychological studies. Leveraging the Kronecker product structure within the covariance matrix, we reduce the time complexity to O(n³) and the storage requirements to O(n²). We extend existing work in the GP literature to encompass all models under hierarchical ANOVA decomposition kernels, and we address issues related to incomplete grids and various missingness mechanisms. We illustrate the practical application of the proposed methodologies on both simulated and real-world spatio-temporal and longitudinal data.
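    The ANOVA-decomposition kernels and incomplete-grid machinery are not reproduced here, but the Kronecker identity such methods exploit is easy to illustrate: when the covariance over a two-dimensional grid factorises as K1 ⊗ K2, the identity (K1 ⊗ K2) vec(X) = vec(K1 X K2ᵀ) (for row-major vectorisation) lets one multiply by the full covariance without ever forming it. A minimal sketch:

    ```python
    import numpy as np

    def kron_mv(K1, K2, v):
        # Computes (K1 ⊗ K2) @ v without forming the (n1*n2) x (n1*n2)
        # matrix, via (K1 ⊗ K2) vec(X) = vec(K1 @ X @ K2.T) for row-major
        # vec; cost drops from O((n1*n2)^2) to O(n1*n2*(n1+n2)).
        n1, n2 = K1.shape[0], K2.shape[0]
        X = v.reshape(n1, n2)
        return (K1 @ X @ K2.T).reshape(-1)

    rng = np.random.default_rng(0)
    K1 = rng.standard_normal((3, 3))
    K2 = rng.standard_normal((4, 4))
    v = rng.standard_normal(12)
    assert np.allclose(np.kron(K1, K2) @ v, kron_mv(K1, K2, v))
    ```

    The same reshaping trick extends to higher-order grids, and combined with per-dimension eigendecompositions of the kernel matrices it underlies the reduced time and storage costs quoted above.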

    Gaussian Process Based Approaches for Survival Analysis

    Traditional machine learning focuses on the situation where a fixed number of features is available for each data point. In medical applications, each patient typically has a different set of clinical tests associated with them, so the number of observed features varies from patient to patient. An important indicator of interest in medical domains is survival information, and survival data presents its own particular challenges, such as censoring. The aim of this thesis is to explore how machine learning ideas can be transferred to the domain of clinical data analysis. We consider two primary challenges: first, how survival models can be made more flexible through non-linearisation, and second, methods for missing-data imputation to handle the varying number of features observed per patient. We use the framework of Gaussian process modelling to unify our approaches, allowing the dual challenges of survival data and missing data to be addressed together. The results show promise, although challenges remain: in particular, when a large proportion of the data is missing, inferences become more uncertain. Principled handling of this uncertainty requires propagating it through any Gaussian process model used for subsequent regression.
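    The thesis's GP survival models are not reproduced here, but the censoring mechanism they must handle is simple to illustrate: in a right-censored likelihood, an observed event contributes its density, while a censored time contributes only the probability of surviving past it. A minimal sketch with an assumed parametric (Weibull) model:

    ```python
    import numpy as np
    from scipy.stats import weibull_min

    def censored_loglik(shape, scale, times, event):
        # Right-censored log-likelihood: events contribute the log-density,
        # censored observations contribute the log-survival function.
        dist = weibull_min(c=shape, scale=scale)
        return np.where(event, dist.logpdf(times), dist.logsf(times)).sum()

    times = np.array([2.0, 5.0, 3.5, 7.0])
    event = np.array([True, False, True, False])  # False = right-censored
    print(censored_loglik(1.5, 4.0, times, event))
    ```

    A GP-based approach would replace the fixed parametric form with a flexible, non-linear dependence on patient covariates, but the event/censored split in the likelihood is handled the same way.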

    Variational Approximate Inference in Latent Linear Models

    Latent linear models are core to much of machine learning and statistics. Specific examples of this model class include Bayesian generalised linear models, Gaussian process regression models, and unsupervised latent linear models such as factor analysis and principal components analysis. In general, exact inference in this model class is computationally and analytically intractable, so approximations are required. In this thesis we consider deterministic approximate inference methods based on minimising the Kullback-Leibler (KL) divergence between a given target density and an approximating 'variational' density.

    First, we consider Gaussian KL (G-KL) approximate inference methods, where the approximating variational density is a multivariate Gaussian. We make a number of novel contributions to this procedure: sufficient conditions under which the G-KL objective is differentiable and convex are described; constrained parameterisations of the Gaussian covariance that make G-KL methods fast and scalable are presented; and the G-KL lower bound on the target density's normalisation constant is proven to dominate those provided by local variational bounding methods. We also discuss complexity and model-applicability issues of G-KL and other Gaussian approximate inference methods. To numerically validate our approach, we present results comparing the performance of G-KL and other deterministic Gaussian approximate inference methods across a range of latent linear model inference problems.

    Second, we present a new method to perform KL variational inference for a broad class of approximating variational densities. Specifically, we construct the variational density as an affine transformation of independently distributed latent random variables. The method extends the known class of tractable variational approximations for which the KL divergence can be computed and optimised, and enables more accurate approximations of non-Gaussian target densities to be obtained.
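    As a rough illustration of the G-KL idea (a sketch under simplifying assumptions, not the thesis's constrained parameterisations), the following evaluates a Gaussian KL lower bound for Bayesian logistic regression. This is a latent linear model: each likelihood term depends on the weights only through the univariate Gaussian variable wᵀxₙ, so the expected log-likelihood reduces to one-dimensional quadrature.

    ```python
    import numpy as np

    def gkl_bound(mu, L, X, y, prior_var=1.0):
        # Gaussian KL lower bound on log Z for Bayesian logistic regression,
        # q(w) = N(mu, S) with S = L @ L.T and L lower-triangular:
        #   B = E_q[log p(y|w)] + E_q[log p(w)] + H[q]  <=  log Z.
        S = L @ L.T
        D = mu.size
        # Each likelihood site depends on w only via w^T x_n, which is
        # univariate Gaussian under q -- the latent linear structure.
        m = X @ mu
        v = np.einsum('nd,de,ne->n', X, S, X)
        # Gauss-Hermite quadrature for E_q[log sigmoid(y_n * w^T x_n)].
        z, wq = np.polynomial.hermite_e.hermegauss(20)
        a = m[:, None] + np.sqrt(v)[:, None] * z[None, :]
        exp_ll = (wq * -np.logaddexp(0.0, -y[:, None] * a)).sum() / np.sqrt(2 * np.pi)
        exp_prior = -0.5 * (D * np.log(2 * np.pi * prior_var)
                            + (mu @ mu + np.trace(S)) / prior_var)
        entropy = 0.5 * D * (1 + np.log(2 * np.pi)) + np.log(np.abs(np.diag(L))).sum()
        return exp_ll + exp_prior + entropy

    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 3))
    y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.standard_normal(50))
    print(gkl_bound(np.zeros(3), 0.5 * np.eye(3), X, y))  # labels y in {-1, +1}
    ```

    Maximising this bound over mu and the Cholesky factor L (for instance with a generic gradient-based optimiser on the negative bound) yields the G-KL approximation; the thesis describes covariance parameterisations that make this optimisation fast and scalable, and conditions under which the objective is convex.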