57 research outputs found

    Approximate inference for state-space models

    This thesis is concerned with state estimation in partially observed diffusion processes with discrete-time observations. The problem can be solved exactly in a Bayesian framework, up to a set of generally intractable stochastic partial differential equations, and numerous approximate inference methods exist to tackle it in a practical way. This thesis introduces a novel deterministic approach that can capture non-normal properties of the exact Bayesian solution. The variational approach to approximate inference has a natural formulation for partially observed diffusion processes: the exact Bayesian solution is the optimal variational solution and, as a consequence, all variational approximations have a universal ordering in terms of optimality. The new approach generalises the current variational Gaussian process approximation algorithm and therefore provides a method for obtaining algorithms that improve on the current state-of-the-art variational methods. Every diffusion process is composed of a drift component and a diffusion component. To obtain a variational formulation, the diffusion component must be fixed; the exact Bayesian solution and all variational approximations are then characterised by their drift components. To use a particular class of drifts, the variational formulation requires a closed form for the family of marginal densities generated by diffusion processes with drifts from that class, a requirement that in general cannot be met. This thesis shows how this coupling can be weakened, allowing more flexible relations between the variational drift and the variational approximations of the marginal densities of the true posterior process. Based on this result, a selection of novel variational drift components is proposed.
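    The drift/diffusion decomposition described above can be sketched by simulating a diffusion with the Euler-Maruyama scheme. The linear drift used here gives an Ornstein-Uhlenbeck process, the drift class underlying the Gaussian variational approximation; richer drift families yield the non-Gaussian approximations the thesis is concerned with. All parameter values are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

# Euler-Maruyama simulation of a one-dimensional diffusion
#   dX_t = f(X_t) dt + sigma dW_t,
# where f is the drift and sigma scales the (fixed) diffusion component.

def euler_maruyama(drift, sigma, x0, dt, n_steps, rng):
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))            # Brownian increment
        x[t + 1] = x[t] + drift(x[t]) * dt + sigma * dw
    return x

rng = np.random.default_rng(0)
# Linear drift f(x) = -2x: the Ornstein-Uhlenbeck (Gaussian) special case.
path = euler_maruyama(lambda x: -2.0 * x, sigma=0.5, x0=3.0,
                      dt=0.01, n_steps=1000, rng=rng)
```

Swapping the lambda for a nonlinear drift changes only the drift component, which is exactly the degree of freedom the variational formulation optimises over.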

    Advances in variational Bayesian nonlinear blind source separation

    Linear data analysis methods such as factor analysis (FA), independent component analysis (ICA) and blind source separation (BSS), as well as state-space models such as the Kalman filter model, are used in a wide range of applications. In many of these, linearity is just a convenient approximation while the underlying effect is nonlinear, and it would therefore be more appropriate to use nonlinear methods. In this work, nonlinear generalisations of FA and ICA/BSS are presented. The methods are based on a generative model, with a multilayer perceptron (MLP) network modelling the nonlinearity from the latent variables to the observations. The model is estimated using variational Bayesian learning, which is well suited to nonlinear data analysis problems. The approach is also theoretically interesting, as essentially the same method is used in several different fields and can be derived from several different starting points, including statistical physics, information theory, Bayesian statistics, and information geometry. These complementary views can aid interpretation of the operation of the learning method and its results. Much of the work presented in this thesis consists of improvements that make the nonlinear factor analysis and blind source separation methods faster and more stable, while remaining applicable to other learning problems as well. The improvements include methods to accelerate convergence of alternating optimisation algorithms such as the EM algorithm, and an improved approximation of the moments of a nonlinear transform of a multivariate probability distribution. These improvements can easily be applied to other models besides FA and ICA/BSS, such as nonlinear state-space models. A specialised version of the nonlinear factor analysis method for post-nonlinear mixtures is presented as well.
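    One subproblem mentioned above, approximating the moments of a nonlinear transform of a Gaussian variable, can be sketched with a first-order Taylor expansion and checked against Monte Carlo. The thesis develops a more accurate approximation; the transform and input moments here are illustrative choices.

```python
import numpy as np

# First-order Taylor approximation of the moments of g(x) for x ~ N(m, v):
#   E[g(x)]   ≈ g(m)
#   Var[g(x)] ≈ g'(m)^2 * v

def taylor_moments(g, dg, mean, var):
    return g(mean), dg(mean) ** 2 * var

m, v = 0.5, 0.1
approx_mean, approx_var = taylor_moments(
    np.tanh,
    lambda x: 1.0 - np.tanh(x) ** 2,   # derivative of tanh
    m, v)

# Monte Carlo ground truth for comparison.
rng = np.random.default_rng(5)
samples = np.tanh(rng.normal(m, np.sqrt(v), 100_000))
```

The Taylor estimate is biased where the transform's curvature matters, which is precisely the regime where an improved moment approximation pays off.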

    Nonparametric enrichment in computational and biological representations of distributions

    This thesis proposes nonparametric techniques to enhance unsupervised learning methods in computational or biological contexts. Representations of intractable distributions and their relevant statistics are enhanced by nonparametric components trained to handle challenging estimation problems. The first part introduces a generic algorithm for learning generative latent variable models. In contrast to traditional variational learning, no representation of the intractable posterior distributions is computed, making the approach agnostic to the model structure and the support of the latent variables. Kernel ridge regression is used to consistently estimate the gradient for learning. In many unsupervised tasks, this approach outperforms advanced alternatives based on the expectation-maximisation algorithm and variational approximate inference. In the second part, I train a model of data known as the kernel exponential family density. The kernel, used to describe smooth functions, is augmented by a parametric component trained using an efficient meta-learning procedure; meta-learning prevents the overfitting that would occur using conventional routines. After training, the contours of the kernel become adaptive to the local geometry of the underlying density. Compared to maximum-likelihood learning, our method better captures the shape of the density, which is the desired quantity in many downstream applications. The final part shows how nonparametric ideas contribute to understanding uncertainty computation in the brain. First, I show that neural networks can learn to represent uncertainty using the distributed distributional code (DDC), a representation similar to the nonparametric kernel mean embedding. I then derive several DDC-based message-passing algorithms, including computations of filtering and real-time smoothing; the latter is a common neural computation embodied in many postdictive phenomena of perception across multiple modalities. The main idea behind these algorithms is least-squares regression, where the training data are simulated from an internal model. The internal model can be concurrently updated to follow the statistics of sensory stimuli, enabling adaptive inference.
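    The least-squares idea described above can be sketched in a toy setting: training pairs are simulated from an internal generative model (here a linear-Gaussian one, our assumption), and a regression from stimulus features to latent features then reads out a posterior expectation. The feature map and model are illustrative, not the DDC construction of the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(0.0, 1.0, n)          # latent, prior N(0, 1)
y = x + rng.normal(0.0, 1.0, n)      # observation: y = x + noise

def psi(y):
    # Fixed nonlinear features of the stimulus (polynomial basis here).
    return np.stack([np.ones_like(y), y, y ** 2, y ** 3], axis=1)

# Regress the latent quantity of interest on psi(y); for squared loss,
# the fitted readout approximates the conditional expectation E[x | y].
w, *_ = np.linalg.lstsq(psi(y), x, rcond=None)

posterior_mean = psi(np.array([1.0])) @ w   # estimate of E[x | y = 1]
```

For this linear-Gaussian model the true posterior mean at y = 1 is 0.5, so the learned readout can be checked directly; updating the simulator that generates (x, y) is what makes the inference adaptive.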

    Deep Gaussian Processes and Variational Propagation of Uncertainty

    Uncertainty propagation across components of complex probabilistic models is vital for improving regularisation. Unfortunately, for many interesting models based on non-linear Gaussian processes (GPs), straightforward propagation of uncertainty is computationally and mathematically intractable. This thesis is concerned with solving this problem through novel variational inference approaches. From a modelling perspective, a key contribution of the thesis is the development of deep Gaussian processes (deep GPs). Deep GPs generalise several interesting GP-based models and hence motivate the development of uncertainty propagation techniques. In a deep GP, each layer is modelled as the output of a multivariate GP whose inputs are governed by another GP. The resulting model is no longer a GP but can instead learn much more complex interactions between data. In contrast to other deep models, all the uncertainty in parameters and latent variables is marginalised out, and both supervised and unsupervised learning are handled. Two important special cases of a deep GP can equivalently be seen as its building components and, historically, were developed as such. Firstly, the variational GP-LVM is concerned with propagating uncertainty in Gaussian process latent variable models; any observed inputs (e.g. temporal) can also be used to correlate the latent space posteriors. Secondly, this thesis develops manifold relevance determination (MRD), which considers a common latent space for multiple views; an adapted variational framework allows for strong model regularisation, letting rich latent-space representations be learned. The developed models are also equipped with algorithms that maximise the information communicated between their different stages using uncertainty propagation, to achieve improved learning when partially observed values are present. The developed methods are demonstrated in experiments with simulated and real data. The results show that the developed variational methodologies improve practical applicability by enabling automatic capacity control in the models, even when data are scarce.
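    The layered construction described above can be sketched by sampling from a two-layer deep GP prior, where one GP draw serves as the input to the next. The RBF kernel, lengthscales, and jitter are illustrative assumptions, not the choices made in the thesis.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel between 1-D input vectors a and b.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def sample_gp(x, rng, lengthscale=1.0):
    # Draw one function sample from a zero-mean GP prior at inputs x.
    k = rbf_kernel(x, x, lengthscale) + 1e-6 * np.eye(len(x))  # jitter
    return np.linalg.cholesky(k) @ rng.normal(size=len(x))

rng = np.random.default_rng(2)
x = np.linspace(-3.0, 3.0, 50)
h = sample_gp(x, rng)   # layer 1: a GP draw that warps the inputs
f = sample_gp(h, rng)   # layer 2: a GP draw on the warped inputs
```

The composition f is no longer a draw from any GP, which is the point: the layered prior expresses richer interactions than a single GP can.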

    Automatic scoring of X-rays in Psoriatic Arthritis


    Towards Bayesian System Identification: With Application to SHM of Offshore Structures

    Within the offshore industry, Structural Health Monitoring remains a growing area of interest. The oil and gas sectors are faced with ageing infrastructure and are driven by the desire for reliable lifetime extension, whereas the wind energy sector is investing heavily in a large number of structures. This leads to a number of distinct challenges for Structural Health Monitoring, brought together by one unifying theme: uncertainty. The offshore environment is highly uncertain; existing structures have not been monitored from construction, and the loading and operational conditions they have experienced (among other factors) are not known. For the wind energy sector, high numbers of structures make traditional inspection methods costly and in some cases dangerous, owing to the inaccessibility of many wind farms. Structural Health Monitoring attempts to address these issues by providing tools that allow automated online assessment of the condition of structures to aid decision making. The work of this thesis presents a number of Bayesian methods which allow system identification, for Structural Health Monitoring, under uncertainty. The Bayesian approach explicitly incorporates available prior knowledge and combines it with evidence from observed data to form updated beliefs. This is a natural way to approach Structural Health Monitoring, or indeed many engineering problems: it is reasonable to assume that the engineer has some knowledge before attempting to detect, locate, classify, or model damage on a structure, and a framework in which this knowledge can be exploited, and the uncertainty in it handled rigorously, is a powerful methodology. The difficulty is that the actual computation of Bayesian results can pose a significant challenge, both computationally and in terms of specifying appropriate models.
This thesis aims to present a number of Bayesian tools, each of which leverages the power of the Bayesian paradigm to address a different Structural Health Monitoring challenge. Within this work the use of Gaussian Process models is presented as a flexible nonparametric Bayesian approach to regression, which is extended to handle dynamic models within the Gaussian Process NARX framework. The challenge in training Gaussian Process models is seldom discussed and the work shown here aims to offer a quantitative assessment of different learning techniques including discussions on the choice of cost function for optimisation of hyperparameters and the choice of the optimisation algorithm itself. Although rarely considered, the effects of these choices are demonstrated to be important and to inform the use of a Gaussian Process NARX model for wave load identification on offshore structures. The work is not restricted to only Gaussian Process models, but Bayesian state-space models are also used. The novel use of Particle Gibbs for identification of nonlinear oscillators is shown and modifications to this algorithm are applied to handle its specific use in Structural Health Monitoring. Alongside this, the Bayesian state-space model is used to perform joint input-state-parameter inference for Operational Modal Analysis where the use of priors over the parameters and the forcing function (in the form of a Gaussian Process transformed into a state-space representation) provides a methodology for this output-only identification under parameter uncertainty. Interestingly, this method is shown to recover the parameter distributions of the model without compromising the recovery of the loading time-series signal when compared to the case where the parameters are known. Finally, a novel use of an online Bayesian clustering method is presented for performing Structural Health Monitoring in the absence of any available training data. 
This online method does not require a pre-collected training dataset, nor a model of the structure, and is capable of detecting and classifying a range of operational and damage conditions while in service. This leaves the reader with a toolbox of methods which can be applied, where appropriate, to the identification of dynamic systems, with a view to Structural Health Monitoring problems within the offshore industry and across engineering.
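As an illustration of the Gaussian Process NARX idea mentioned above, here is a minimal one-step-ahead predictor on a toy nonlinear autoregressive system. The data-generating system, kernel, lengthscale, and noise level are all illustrative assumptions, and hyperparameters are fixed rather than learned, sidestepping the training choices the thesis examines.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # Squared-exponential kernel between rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

rng = np.random.default_rng(3)
T = 200
u = rng.normal(size=T)                      # exogenous input (e.g. loading)
y = np.zeros(T)
for t in range(1, T):                       # a toy nonlinear AR system
    y[t] = 0.8 * np.tanh(y[t - 1]) + 0.3 * u[t - 1] + 0.05 * rng.normal()

# NARX regressors: each output is regressed on lagged output and input.
X = np.stack([y[:-1], u[:-1]], axis=1)
Y = y[1:]

noise_var = 0.05 ** 2
K = rbf(X, X) + noise_var * np.eye(len(X))  # noisy GP regression
alpha = np.linalg.solve(K, Y)

x_star = X[-1:]                             # regressor for the final step
y_pred = rbf(x_star, X) @ alpha             # GP posterior mean prediction
```

Free simulation (feeding predictions back in as lagged outputs) follows the same pattern and is where the choice of cost function and optimiser, discussed above, starts to matter.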

    Robot environment learning with a mixed-linear probabilistic state-space model

    This thesis proposes the use of a probabilistic state-space model with mixed-linear dynamics for learning to predict a robot's experiences. It is motivated by a desire to bridge the gap between traditional models with predefined objective semantics on the one hand, and the biologically-inspired "black box" behavioural paradigm on the other. A novel EM-type algorithm for the model is presented, which is less computationally demanding than the Monte Carlo techniques developed for use in (for example) visual applications. The algorithm's E-step is slightly approximative, but an extension is described which would in principle make it asymptotically correct. Investigation using synthetically sampled data shows that the uncorrected E-step can in any case make correct inferences about quite complicated systems. Results collected from two simulated mobile robot environments support the claim that mixed-linear models can capture both discontinuous and continuous structure in the world in an intuitively natural manner; while they proved to perform only slightly better than simpler autoregressive hidden Markov models on these simple tasks, it is possible to claim tentatively that they might scale more effectively to environments in which trends over time play a larger role. Bayesian confidence regions, which are easily computed with a mixed-linear model, proved to be an effective guard against the model making over-confident predictions outside its area of competence. A section on future extensions discusses how the model's easy invertibility could be harnessed to the ultimate aim of choosing actions, from a continuous space of possibilities, which maximise the robot's expected payoff over several steps into the future.
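    The linear-Gaussian inference that a mixed-linear model's E-step builds on can be sketched with a minimal scalar Kalman filter. All model parameters here are illustrative assumptions; the thesis combines this kind of per-regime inference across discrete dynamics modes.

```python
import numpy as np

def kalman_filter(ys, a, c, q, r, m0, p0):
    # Filtered means for the scalar linear-Gaussian state-space model
    #   x_t = a x_{t-1} + w_t,  w_t ~ N(0, q)
    #   y_t = c x_t     + v_t,  v_t ~ N(0, r)
    m, p = m0, p0
    means = []
    for y_t in ys:
        m_pred = a * m                       # predict
        p_pred = a * p * a + q
        s = c * p_pred * c + r               # innovation variance
        k = p_pred * c / s                   # Kalman gain
        m = m_pred + k * (y_t - c * m_pred)  # update
        p = (1.0 - k * c) * p_pred
        means.append(m)
    return np.array(means)

rng = np.random.default_rng(4)
T = 100
x = np.zeros(T)
ys = np.zeros(T)
for t in range(T):                           # simulate the true system
    prev = x[t - 1] if t > 0 else 0.0
    x[t] = 0.9 * prev + rng.normal(0.0, 0.5)
    ys[t] = x[t] + rng.normal(0.0, 1.0)

filtered = kalman_filter(ys, a=0.9, c=1.0, q=0.25, r=1.0, m0=0.0, p0=1.0)
```

The filtered means track the latent state more closely than the raw observations do; a mixed-linear model runs such recursions per linear regime and mixes them according to inferred regime probabilities.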