Statistical computation with kernels
Modern statistical inference has seen a tremendous increase in the size and complexity of models and datasets. As such, it has become reliant on advanced computational tools for implementation. A first canonical problem in this area is the numerical approximation of integrals of complex and expensive functions. Numerical integration is required for a variety of tasks, including prediction, model comparison and model choice. A second canonical problem is that of statistical inference for models with intractable likelihoods. These include models with intractable normalisation constants, or models which are so complex that their likelihood cannot be evaluated, but from which data can be generated. Examples include large graphical models, as well as many models in imaging or spatial statistics.
This thesis proposes to tackle these two problems using tools from the kernel methods and Bayesian non-parametrics literature. First, we analyse a well-known algorithm for numerical integration called Bayesian quadrature, and provide consistency and contraction rates. The algorithm is then assessed on a variety of statistical inference problems, and extended in several directions in order to reduce its computational requirements. We then demonstrate how the combination of reproducing kernels with Stein's method can lead to computational tools which can be used with unnormalised densities, including numerical integration and approximation of probability measures. We conclude by studying two minimum distance estimators derived from kernel-based statistical divergences which can be used for unnormalised and generative models.
In each instance, the tractability provided by reproducing kernels and their properties allows us to provide easily-implementable algorithms whose theoretical foundations can be studied in depth.
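As a concrete illustration of the kind of tool described above, the sketch below estimates a kernel Stein discrepancy between a sample and an unnormalised target, using an inverse multiquadric kernel and a standard Gaussian target whose score function is available without the normalisation constant. This is a minimal sketch written for this summary rather than code from the thesis; the kernel, its parameters, and the toy target are all assumptions.

```python
import numpy as np

def ksd_squared(x, score, c=1.0, beta=0.5):
    """V-statistic estimate of the squared kernel Stein discrepancy (KSD).

    Uses the inverse multiquadric kernel k(x, y) = (c^2 + ||x - y||^2)^(-beta).
    `score` returns grad log p at each row of x, so p is only needed up to its
    normalisation constant.
    """
    n, d = x.shape
    s = score(x)                                  # (n, d) score evaluations
    diff = x[:, None, :] - x[None, :, :]          # (n, n, d) pairwise differences
    d2 = np.sum(diff ** 2, axis=-1)               # squared pairwise distances
    base = c ** 2 + d2
    k = base ** (-beta)                           # kernel matrix
    sx_dot_r = np.einsum("id,ijd->ij", s, diff)   # s(x_i) . (x_i - x_j)
    sy_dot_r = np.einsum("jd,ijd->ij", s, diff)   # s(x_j) . (x_i - x_j)
    # Stein kernel: s(x)'s(y) k + s(x)'grad_y k + s(y)'grad_x k + tr(grad_x grad_y k)
    kp = (
        (s @ s.T) * k
        + 2.0 * beta * (sx_dot_r - sy_dot_r) * base ** (-(beta + 1.0))
        + 2.0 * beta * d * base ** (-(beta + 1.0))
        - 4.0 * beta * (beta + 1.0) * d2 * base ** (-(beta + 2.0))
    )
    return kp.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    score = lambda x: -x                          # score of an (unnormalised) standard Gaussian
    good = rng.normal(size=(300, 2))              # sample from the target
    bad = rng.normal(loc=1.0, size=(300, 2))      # sample from a shifted distribution
    print("KSD^2, correct sample:", ksd_squared(good, score))
    print("KSD^2, shifted sample:", ksd_squared(bad, score))
```

Because the discrepancy only requires the score function, it can be evaluated for unnormalised densities, which is what makes it useful for the problems described in the abstract.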
Bayesian Quadrature for Multiple Related Integrals
Bayesian probabilistic numerical methods are a set of tools providing
posterior distributions on the output of numerical methods. The use of these
methods is usually motivated by the fact that they can represent our
uncertainty due to incomplete/finite information about the continuous
mathematical problem being approximated. In this paper, we demonstrate that
this paradigm can provide additional advantages, such as the possibility of
transferring information between several numerical methods. This allows users
to represent uncertainty in a more faithful manner and, as a by-product,
provide increased numerical efficiency. We propose the first such numerical
method by extending the well-known Bayesian quadrature algorithm to the case
where we are interested in computing the integral of several related functions.
We then prove convergence rates for the method in the well-specified and
misspecified cases, and demonstrate its efficiency in the context of
multi-fidelity models for complex engineering systems and a problem of global
illumination in computer graphics.
Comment: Proceedings of the 35th International Conference on Machine Learning (ICML), PMLR 80:5369-5378, 2018.
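For readers unfamiliar with the base algorithm being extended, the following sketch implements standard (single-output) Bayesian quadrature with a Gaussian kernel against a standard Gaussian integration measure in one dimension, where the required kernel embeddings have closed forms. It is illustrative only: the integrand, lengthscale and point set are assumptions, and the multi-output extension proposed in the paper is not shown.

```python
import numpy as np

def bayesian_quadrature(x, fx, lengthscale=0.5, jitter=1e-10):
    """Posterior mean and variance of int f dN(0,1) under a GP(0, k) prior on f.

    k(x, x') = exp(-(x - x')^2 / (2 l^2)); its mean embedding against N(0, 1)
    is available in closed form, which is what makes the computation tractable.
    """
    l2 = lengthscale ** 2
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / l2) + jitter * np.eye(len(x))
    # Kernel mean embedding z_i = int k(x, x_i) N(x; 0, 1) dx.
    z = np.sqrt(l2 / (l2 + 1.0)) * np.exp(-0.5 * x ** 2 / (l2 + 1.0))
    # Initial variance int int k(x, x') N(x) N(x') dx dx'.
    kbar = np.sqrt(l2 / (l2 + 2.0))
    weights = np.linalg.solve(K, z)
    mean = weights @ fx                                   # posterior mean of the integral
    var = max(kbar - z @ np.linalg.solve(K, z), 0.0)      # posterior variance (clipped for numerics)
    return mean, var

if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 20)      # design points (a uniform grid, for illustration)
    f = lambda t: t ** 2                # E[X^2] = 1 under N(0, 1)
    mean, var = bayesian_quadrature(x, f(x))
    print(f"BQ estimate: {mean:.4f} +/- {np.sqrt(var):.1e} (truth: 1.0)")
```

The paper's contribution is to replace the scalar Gaussian process above with a multi-output model, so that evaluations of one integrand inform the posteriors over several related integrals.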
Discrepancy-based Inference for Intractable Generative Models using Quasi-Monte Carlo
Intractable generative models are models for which the likelihood is
unavailable but sampling is possible. Most approaches to parameter inference in
this setting require the computation of some discrepancy between the data and
the generative model. This is for example the case for minimum distance
estimation and approximate Bayesian computation. These approaches require
sampling a high number of realisations from the model for different parameter
values, which can be a significant challenge when simulating is an expensive
operation. In this paper, we propose to enhance this approach by enforcing
"sample diversity" in simulations of our models. This will be implemented
through the use of quasi-Monte Carlo (QMC) point sets. Our key results are
sample complexity bounds which demonstrate that, under smoothness conditions on
the generator, QMC can significantly reduce the number of samples required to
obtain a given level of accuracy when using three of the most common
discrepancies: the maximum mean discrepancy, the Wasserstein distance, and the
Sinkhorn divergence. This is complemented by a simulation study which
highlights that improved accuracy is sometimes also possible in settings which are not covered by the theory.
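To make the idea concrete, here is a small sketch of minimum-MMD estimation in which the generator is driven by a scrambled Sobol point set rather than i.i.d. uniforms. The generative model (a Gaussian location model), the Gaussian kernel, and the grid search over the parameter are assumptions made purely for illustration; the paper's results cover more general generators and also the Wasserstein and Sinkhorn cases.

```python
import numpy as np
from scipy.stats import norm, qmc

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD with a Gaussian kernel."""
    def gram(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * d2 / bandwidth ** 2)
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()

def generator(theta, u):
    """Toy intractable-likelihood generator: push uniforms through N(theta, 1)."""
    return norm.ppf(u) + theta

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(loc=2.0, size=(200, 1))   # observed data, true theta = 2

    # Low-discrepancy inputs to the generator ("sample diversity" via QMC).
    sobol = qmc.Sobol(d=1, scramble=True, seed=1)
    u = sobol.random(128)                       # 2^7 scrambled Sobol points in [0, 1)

    # Minimum distance estimation: reuse the same point set for every candidate theta.
    thetas = np.linspace(0.0, 4.0, 81)
    losses = [mmd2(data, generator(t, u)) for t in thetas]
    print("estimated theta:", thetas[int(np.argmin(losses))])
```

Reusing one low-discrepancy point set across all candidate parameter values is what keeps the number of expensive simulations small, which is the setting the sample complexity bounds address.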
Convergence Guarantees for Gaussian Process Means with Misspecified Likelihoods and Smoothness
Gaussian processes are ubiquitous in machine learning, statistics, and
applied mathematics. They provide a flexible modelling framework for
approximating functions, whilst simultaneously quantifying uncertainty.
However, this is only true when the model is well-specified, which is often not
the case in practice. In this paper, we study the properties of Gaussian
process means when the smoothness of the model and the likelihood function are
misspecified. In this setting, an important theoretical question of practical
relevance is how accurate the Gaussian process approximations will be given the
difficulty of the problem, our model and the extent of the misspecification.
The answer to this question is particularly useful since it can inform our
choice of model and experimental design. In particular, we describe how the
experimental design and choice of kernel and kernel hyperparameters can be
adapted to alleviate model misspecification.
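The following sketch illustrates the phenomenon studied in the paper on a toy problem: a Gaussian process posterior mean is fitted to a function with a kink (so only Lipschitz smoothness) using both a rough Matern-1/2 kernel and a much smoother Gaussian kernel, and the resulting errors on a uniform design are compared. The target function, kernels, lengthscales and design are assumptions chosen for illustration; they are not the paper's experiments.

```python
import numpy as np

def matern12(a, b, lengthscale=0.2):
    """Matern-1/2 (exponential) kernel: models rough, once-integrable sample paths."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / lengthscale)

def gaussian(a, b, lengthscale=0.2):
    """Gaussian (RBF) kernel: assumes an infinitely smooth function."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_mean(kernel, x_train, y_train, x_test, noise=1e-6):
    """GP posterior mean m(x*) = k(x*, X) (K + noise I)^{-1} y."""
    K = kernel(x_train, x_train) + noise * np.eye(len(x_train))
    return kernel(x_test, x_train) @ np.linalg.solve(K, y_train)

if __name__ == "__main__":
    f = lambda x: np.abs(x - 0.5)            # Lipschitz but not differentiable at 0.5
    x_train = np.linspace(0.0, 1.0, 30)      # uniform experimental design
    x_test = np.linspace(0.0, 1.0, 1000)
    y_train = f(x_train)
    for name, k in [("Matern-1/2", matern12), ("Gaussian", gaussian)]:
        err = np.max(np.abs(gp_mean(k, x_train, y_train, x_test) - f(x_test)))
        print(f"{name:>10} kernel, sup-norm error: {err:.4f}")
```

Comparing the two errors for different designs, kernels and lengthscales gives a hands-on feel for the kind of question the paper answers theoretically.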
Meta-learning Control Variates: Variance Reduction with Limited Data
Control variates can be a powerful tool to reduce the variance of Monte Carlo estimators, but constructing effective control variates can be challenging when the number of samples is small. In this paper, we show that when a large number of related integrals need to be computed, it is possible to leverage the similarity between these integration tasks to improve performance even when the number of samples per task is very small. Our approach, called meta-learning CVs (Meta-CVs), can be used for up to hundreds or thousands of tasks. Our empirical assessment indicates that Meta-CVs can lead to significant variance reduction in such settings, and our theoretical analysis establishes general conditions under which Meta-CVs can be successfully trained.
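For context, the sketch below shows the classical single-task control variate construction that Meta-CVs build on: a function with known mean under the sampling distribution is fitted to the integrand and subtracted to reduce variance. The toy integrand, the polynomial control variate, and the sample sizes are assumptions made for illustration; the meta-learning of a shared control variate across many tasks is not shown.

```python
import numpy as np

def cv_estimate(f_vals, g_vals, g_mean):
    """Control variate estimator: subtract a fitted multiple of g, whose mean is known.

    The coefficient beta = Cov(f, g) / Var(g) is estimated from the same samples,
    which is standard practice when the sample size is small.
    """
    beta = np.cov(f_vals, g_vals)[0, 1] / np.var(g_vals, ddof=1)
    return np.mean(f_vals) - beta * (np.mean(g_vals) - g_mean)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    f = lambda x: np.exp(np.sin(x)) + 0.5 * x ** 2     # integrand of interest
    g = lambda x: x ** 2                               # control variate, E[g] = 1 under N(0, 1)
    n, reps = 20, 2000                                 # very few samples per estimate
    plain, with_cv = [], []
    for _ in range(reps):
        x = rng.normal(size=n)
        plain.append(np.mean(f(x)))
        with_cv.append(cv_estimate(f(x), g(x), g_mean=1.0))
    print("std of plain MC estimator:", np.std(plain))
    print("std of CV estimator:      ", np.std(with_cv))
```

With only 20 samples per task, fitting a flexible control variate from scratch is unreliable; the paper's point is that sharing structure across many related tasks makes this feasible.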
Bayesian Numerical Integration with Neural Networks
Bayesian probabilistic numerical methods for numerical integration offer
significant advantages over their non-Bayesian counterparts: they can encode
prior information about the integrand, and can quantify uncertainty over
estimates of an integral. However, the most popular algorithm in this class,
Bayesian quadrature, is based on Gaussian process models and is therefore
associated with a high computational cost. To improve scalability, we propose
an alternative approach based on Bayesian neural networks which we call
Bayesian Stein networks. The key ingredients are a neural network architecture
based on Stein operators, and an approximation of the Bayesian posterior based
on the Laplace approximation. We show that this leads to orders of magnitude
speed-ups on the popular Genz functions benchmark, and on challenging problems
arising in the Bayesian analysis of dynamical systems, and the prediction of
energy production for a large-scale wind farm.
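As a rough illustration of the main ingredient, the sketch below applies the Langevin Stein operator to a small one-hidden-layer network u and exploits the fact that the resulting function has zero mean under the target: regressing the integrand on a constant plus the Stein-transformed network yields the constant as the integral estimate. To keep the example self-contained, only the output layer is fitted by least squares (a linear-in-parameters simplification) and the target is a standard Gaussian; the paper instead trains the full network and quantifies uncertainty via a Laplace approximation, neither of which is shown here.

```python
import numpy as np

def stein_features(x, W, b, score):
    """Langevin Stein operator applied to u(x) = A tanh(W x + b), as features linear in A.

    (Stein u)(x) = div u(x) + u(x) . score(x) has zero mean under the target, and
    equals <vec(A), features(x)> with features[i, k] = dphi_k(x) W[k, i] + s_i(x) phi_k(x).
    """
    z = x @ W.T + b                       # (n, h) pre-activations
    phi = np.tanh(z)                      # (n, h) hidden activations
    dphi = 1.0 - phi ** 2                 # (n, h) tanh derivative
    s = score(x)                          # (n, d) score of the target
    psi = dphi[:, None, :] * W.T[None, :, :] + s[:, :, None] * phi[:, None, :]
    return psi.reshape(len(x), -1)        # (n, d*h) feature matrix

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    d, h, n = 2, 50, 500
    score = lambda x: -x                                  # standard Gaussian target
    g = lambda x: np.sum(x ** 2, axis=1)                  # integrand, true integral = d
    x = rng.normal(size=(n, d))
    W, b = rng.normal(size=(h, d)), rng.normal(size=h)
    # Regress g(x) on [1, Stein features]; the intercept is the integral estimate,
    # because every Stein feature integrates to zero under the target.
    design = np.column_stack([np.ones(n), stein_features(x, W, b, score)])
    coef, *_ = np.linalg.lstsq(design, g(x), rcond=None)
    print("Stein-network-style estimate:", coef[0], "(truth:", float(d), ")")
```

Fitting the hidden-layer weights as well, as the paper does, makes the architecture far more expressive, and the Laplace approximation then turns the point estimate into a posterior over the integral.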