Probabilistic Linear Multistep Methods
We present a derivation and theoretical investigation of the Adams-Bashforth and Adams-Moulton family of linear multistep methods for solving ordinary differential equations, starting from a Gaussian process (GP) framework. In the limit, this formulation coincides with the classical deterministic methods, which have been used as higher-order initial value problem solvers for over a century. Furthermore, the natural probabilistic framework provided by the GP formulation allows us to derive probabilistic versions of these methods, in the spirit of a number of other probabilistic ODE solvers presented in the recent literature. In contrast to higher-order Runge-Kutta methods, which require multiple intermediate function evaluations per step, Adams family methods make use of previous function evaluations, so that the increased accuracy arising from a higher-order multistep approach comes at very little additional computational cost. We show that through a careful choice of covariance function for the GP, the posterior mean and standard deviation over the numerical solution can be made to exactly coincide with the value given by the deterministic method and its local truncation error, respectively. We provide a rigorous proof of the convergence of these new methods, as well as an empirical investigation (up to fifth order) demonstrating their convergence rates in practice.
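To make the construction concrete, here is a minimal sketch of a randomised two-step Adams-Bashforth solver in the spirit of the paper. The noise scale sigma is a hypothetical user-supplied calibration constant standing in for the scale the paper derives from the GP posterior, and the perturbation is scaled like the O(h^3) local truncation error of the deterministic two-step method.

```python
import numpy as np

def prob_ab2_step(y_n, f_hist, h, sigma, rng):
    """One perturbed two-step Adams-Bashforth step.

    f_hist = (f_{n-1}, f_n); the Gaussian perturbation is scaled like
    the O(h^3) local truncation error of the deterministic AB2 step."""
    f_nm1, f_n = f_hist
    y_det = y_n + h * (1.5 * f_n - 0.5 * f_nm1)   # classical AB2 update
    return y_det + sigma * h**3 * rng.standard_normal(np.shape(y_n))

def prob_ab2(f, y0, y1, t0, h, n_steps, sigma=1.0, seed=0):
    """Integrate y' = f(t, y) with the randomised AB2 solver; the two
    starting values y0, y1 would come from a one-step method."""
    rng = np.random.default_rng(seed)
    ts = t0 + h * np.arange(n_steps + 1)
    ys = [np.asarray(y0, float), np.asarray(y1, float)]
    f_hist = (np.asarray(f(ts[0], ys[0])), np.asarray(f(ts[1], ys[1])))
    for n in range(1, n_steps):
        y_new = prob_ab2_step(ys[-1], f_hist, h, sigma, rng)
        ys.append(y_new)
        f_hist = (f_hist[1], np.asarray(f(ts[n + 1], y_new)))
    return ts, np.array(ys)
```

Repeating the integration with different seeds yields the Monte Carlo ensemble that forms the method's probabilistic output.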
Black box probabilistic numerics
Optimal Quantisation of Probability Measures Using Maximum Mean Discrepancy
Several researchers have proposed minimisation of maximum mean discrepancy (MMD) as a method to quantise probability measures, i.e., to approximate a distribution by a representative point set. We consider sequential algorithms that greedily minimise MMD over a discrete candidate set. We propose a novel non-myopic algorithm and, in order to both improve statistical efficiency and reduce computational cost, we investigate a variant that applies this technique to a mini-batch of the candidate set at each iteration. When the candidate points are sampled from the target, the consistency of these new algorithms—and their mini-batch variants—is established. We demonstrate the algorithms on a range of important computational problems, including optimisation of nodes in Bayesian cubature and the thinning of Markov chain output.
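For reference, the following is a minimal sketch of the myopic greedy baseline that the paper's non-myopic and mini-batch algorithms build on, in the setting where the candidates are sampled from the target (so that the candidate average stands in for the kernel mean embedding). The Gaussian kernel and its lengthscale are illustrative choices.

```python
import numpy as np

def gaussian_kernel(X, Y, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def greedy_mmd_select(candidates, m, lengthscale=1.0):
    """Greedily pick m of the candidate points so as to minimise MMD
    against the empirical distribution of the candidates."""
    K = gaussian_kernel(candidates, candidates, lengthscale)
    mean_embed = K.mean(axis=1)            # approximates E_P[k(x, X)]
    sum_to_selected = np.zeros(len(candidates))
    selected = []
    for n in range(m):
        # Adding point i changes the squared MMD by a term proportional
        # to k(x_i, x_i)/2 + sum_j k(x_i, x_j) - (n + 1) E_P[k(x_i, X)].
        crit = 0.5 * np.diag(K) + sum_to_selected - (n + 1) * mean_embed
        i = int(np.argmin(crit))
        selected.append(i)
        sum_to_selected += K[:, i]
    return candidates[selected]
```

A mini-batch variant would evaluate the criterion only on a random subset of the candidates at each iteration, trading a little statistical efficiency for a large reduction in cost.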
Topics in the probabilistic solution of ordinary differential equations
This thesis concerns several new developments in the probabilistic solution of ordinary differential equations. Probabilistic numerical methods are differentiated from their classical counterparts through the key property of returning a probability measure as output, rather than simply a point value. When properly calibrated, this measure can then be taken to probabilistically represent the output uncertainty arising from the application of the numerical procedure.
After giving some introductory context, we start with a concise survey of the still-developing field of probabilistic ODE solvers, highlighting how several different paradigms have developed somewhat in parallel. One of these, established by Conrad et al. (2016), defines randomised one-step solvers for initial value problems, where the outputs are empirical measures arising from Monte Carlo repetitions of the algorithm. We extend this to multistep solvers of Adams-Bashforth type using a novel Gaussian process construction. The properties of this method are explored and its convergence is rigorously proved.
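A minimal sketch of the one-step case (a Conrad et al. style randomised Euler method, order p = 1) may help fix ideas; the scale sigma is a hypothetical calibration constant, and the perturbation variance scales as h^(2p+1) so that the noise matches the local error of the underlying method.

```python
import numpy as np

def perturbed_euler(f, y0, t0, h, n_steps, sigma, rng):
    """One realisation of a randomised Euler solver: each deterministic
    step is followed by Gaussian noise with standard deviation
    proportional to h^1.5 (variance h^(2p+1), p = 1)."""
    y = np.asarray(y0, float)
    traj = [y]
    for n in range(n_steps):
        y = y + h * np.asarray(f(t0 + n * h, y))    # deterministic step
        y = y + sigma * h**1.5 * rng.standard_normal(y.shape)
        traj.append(y)
    return np.array(traj)

def empirical_output(f, y0, t0, h, n_steps, sigma=1.0, n_rep=100, seed=0):
    """Monte Carlo repetitions; the ensemble of trajectories is the
    empirical measure returned by the randomised solver."""
    rng = np.random.default_rng(seed)
    return np.stack([perturbed_euler(f, y0, t0, h, n_steps, sigma, rng)
                     for _ in range(n_rep)])
```

The multistep extension replaces the Euler update with an Adams-Bashforth one, as in the sketch given earlier.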
We continue by defining a class of implicit probabilistic ODE solvers, the first in the literature. Unlike explicit methods, these modified Adams-Moulton algorithms incorporate information from the ODE dynamics beyond the current time-point, and as such are able to enhance the accuracy of the probabilistic model of numerical error. In their full form, they output a non-parametric description of the stepwise error, though we also propose a parametric approximation that aids computation. Once again, we explore the properties of the method and prove its convergence in the small step-size limit.
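A minimal sketch of the simplest such implicit step (one-step Adams-Moulton, i.e. the trapezoidal rule, with a parametric error model) might look as follows; the Euler predictor, the fixed-point corrector, and the scale sigma are illustrative choices rather than the thesis's exact construction.

```python
import numpy as np

def prob_am1_step(f, t_next, y_n, f_n, h, sigma, rng, n_corr=5):
    """Perturbed one-step Adams-Moulton (trapezoidal) step: solve the
    implicit equation y = y_n + h/2 (f(t_next, y) + f_n) by fixed-point
    iteration from an Euler predictor, then add Gaussian noise scaled
    like the O(h^3) local truncation error."""
    y = y_n + h * f_n                            # explicit predictor
    for _ in range(n_corr):                      # corrector iterations
        y = y_n + 0.5 * h * (np.asarray(f(t_next, y)) + f_n)
    return y + sigma * h**3 * rng.standard_normal(np.shape(y_n))
```

Because the corrector evaluates f at the next time-point, the error model sees dynamics beyond the current one, which is the source of the accuracy gain described above.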
We follow with a discussion of the problem of calibration for these classes of algorithms, and generalise a proposal from Conrad et al. in order to implement it for our methods. We then apply the new integrators to two test differential equation models, first in the solution of the forward model, then later in the setting of a Bayesian inverse problem. We contrast the effect of using probabilistic integrators instead of classical ones on posterior inference over the model parameters, as well as over derived functions of the forward solution.
We conclude with a brief discussion on the advantages and shortcomings of the proposed methods, and posit several suggestions for potential future research.
Postprocessing of MCMC
Markov chain Monte Carlo is the engine of modern Bayesian statistics, being used to approximate the posterior and derived quantities of interest. Despite this, the issue of how the output from a Markov chain is post-processed and reported is often overlooked. Convergence diagnostics can be used to control bias via burn-in removal, but these do not account for (common) situations where a limited computational budget engenders a bias-variance trade-off. The aim of this article is to review state-of-the-art techniques for post-processing Markov chain output. Our review covers methods based on discrepancy minimisation, which directly address the bias-variance trade-off, as well as general-purpose control variate methods for approximating expected quantities of interest.
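To illustrate one member of the control variate family, here is a minimal sketch of first-order "zero-variance" control variates: the score components psi_i(x) = d log pi(x)/dx_i have zero mean under the target (under mild tail conditions), so subtracting a least-squares fit of them from f reduces the variance of the MCMC average without changing its limit. The interface (grad_log_pi and so on) is illustrative.

```python
import numpy as np

def zv_cv_estimate(samples, f_vals, grad_log_pi):
    """Control-variate postprocessing of MCMC output.

    samples:     (n, d) array of MCMC draws
    f_vals:      (n,) array of f evaluated at the draws
    grad_log_pi: callable returning the (d,) score at a point
    """
    psi = np.array([grad_log_pi(x) for x in samples])   # zero-mean CVs
    beta, *_ = np.linalg.lstsq(psi - psi.mean(0),
                               f_vals - f_vals.mean(), rcond=None)
    return float(np.mean(f_vals - psi @ beta))          # corrected mean
```

For a standard normal target, for instance, grad_log_pi is x -> -x, and the estimator typically recovers the expectation with far smaller variance than the plain sample average.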
Testing whether a Learning Procedure is Calibrated
A learning procedure takes as input a dataset and performs inference for the parameters θ of a model that is assumed to have given rise to the dataset. Here we consider learning procedures whose output is a probability distribution, representing uncertainty about θ after seeing the dataset. Bayesian inference is a prime example of such a procedure, but one can also construct other learning procedures that return distributional output. This paper studies conditions for a learning procedure to be considered calibrated, in the sense that the true data-generating parameters are plausible as samples from its distributional output. A learning procedure whose inferences and predictions are systematically over- or under-confident will fail to be calibrated. On the other hand, a learning procedure that is calibrated need not be statistically efficient. A hypothesis-testing framework is developed in order to assess, using simulation, whether a learning procedure is calibrated. Several vignettes are presented to illustrate different aspects of the framework.
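One concrete instance of such a simulation-based test, for a scalar parameter, is a rank-uniformity check: draw "true" parameters from the prior, simulate data, run the learning procedure, and test whether the rank of the truth among the procedure's output draws is uniform. The sketch below assumes hypothetical callables prior_sample, simulate_data and learn; the paper's framework is more general than this single diagnostic.

```python
import numpy as np
from scipy import stats

def calibration_test(prior_sample, simulate_data, learn,
                     n_trials=200, n_post=99, seed=0):
    """For each trial: draw a true parameter from the prior, simulate a
    dataset, obtain n_post draws from the learning procedure's
    distributional output, and record the rank of the truth among them.
    Under calibration the ranks are uniform on {0, ..., n_post}."""
    rng = np.random.default_rng(seed)
    ranks = []
    for _ in range(n_trials):
        theta = prior_sample(rng)
        data = simulate_data(theta, rng)
        draws = learn(data, n_post, rng)        # (n_post,) draws
        ranks.append(int(np.sum(draws < theta)))
    counts = np.bincount(ranks, minlength=n_post + 1)
    return stats.chisquare(counts).pvalue       # small p => miscalibrated
```

A systematically over-confident procedure concentrates the ranks at the extremes, while an under-confident one piles them up in the middle; both departures are detected by the uniformity test.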