8 research outputs found

DISCO Nets: DISsimilarity COefficient Networks

Authors: Bouchacourt Diane; Kumar M. Pawan; Nowozin Sebastian
Publication date: 28/10/2016

We present a new type of probabilistic model which we call DISsimilarity COefficient Networks (DISCO Nets). DISCO Nets allow us to efficiently sample from a posterior distribution parametrised by a neural network. During training, DISCO Nets are learned by minimising the dissimilarity coefficient between the true distribution and the estimated distribution. This allows us to tailor the training to the loss related to the task at hand. We empirically show that (i) by modeling uncertainty on the output value, DISCO Nets outperform equivalent non-probabilistic predictive networks and (ii) DISCO Nets accurately model the uncertainty of the output, outperforming existing probabilistic models based on deep neural networks.

arXiv.org e-Print Archive

Oxford University Research Archive

Statistical inference for generative models with maximum mean discrepancy

Authors: Barp A; Briol F-X; Duncan AB; Girolami M
Publication venue: ArXiv
Publication date: 20/02/2022

While likelihood-based inference and its variants provide a statistically efficient and widely applicable approach to parametric inference, their application to models involving intractable likelihoods poses challenges. In this work, we study a class of minimum distance estimators for intractable generative models, that is, statistical models for which the likelihood is intractable, but simulation is cheap. The distance considered, maximum mean discrepancy (MMD), is defined through the embedding of probability measures into a reproducing kernel Hilbert space. We study the theoretical properties of these estimators, showing that they are consistent, asymptotically normal and robust to model misspecification. A main advantage of these estimators is the flexibility offered by the choice of kernel, which can be used to trade off statistical efficiency and robustness. On the algorithmic side, we study the geometry induced by MMD on the parameter space and use this to introduce a novel natural gradient descent-like algorithm for efficient implementation of these estimators. We illustrate the relevance of our theoretical results on several classes of models including a discrete-time latent Markov process and two multivariate stochastic differential equation models.
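
For readers unfamiliar with the idea, a minimum MMD estimator can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses a one-dimensional Gaussian kernel with a fixed bandwidth, the biased (V-statistic) estimate of squared MMD, a hypothetical location model (parameter plus fixed noise draws), and a plain grid search standing in for the natural gradient descent-like algorithm the abstract mentions. The names `gaussian_kernel`, `mmd_squared` and `fit_location` are made up for this example.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two real numbers (assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two 1-D samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def fit_location(observed, noise, grid, bandwidth=1.0):
    """Pick the location parameter on `grid` whose simulated sample
    (theta + noise) is closest to `observed` in estimated MMD.
    Grid search is a stand-in here, not the algorithm from the paper."""
    def objective(theta):
        simulated = [theta + z for z in noise]
        return mmd_squared(observed, simulated, bandwidth)

    return min(grid, key=objective)


if __name__ == "__main__":
    noise = [-0.5, 0.1, 0.4, -0.2, 0.3]      # fixed draws, hypothetical
    observed = [2.0 + z for z in noise]      # data generated with theta = 2
    print(fit_location(observed, noise, grid=[0.0, 1.0, 2.0, 3.0]))
```

The likelihood is never evaluated: the estimator only needs samples simulated from the model, which is what makes the approach usable for intractable generative models. Changing `bandwidth` changes the kernel and, with it, the efficiency and robustness trade-off described in the abstract.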

Spiral - Imperial College Digital Repository

Statistical Inference for Generative Models with Maximum Mean Discrepancy

Authors: Barp Alessandro; Briol Francois-Xavier; Duncan Andrew B.; Girolami Mark
Publication date: 13/06/2019

arXiv.org e-Print Archive

UCL Discovery

Statistical computation with kernels

Author: Briol François-Xavier
Publication venue: University of Warwick
Publication date: 01/09/2018

Modern statistical inference has seen a tremendous increase in the size and complexity of models and datasets. As such, it has become reliant on advanced computational tools for implementation. A first canonical problem in this area is the numerical approximation of integrals of complex and expensive functions. Numerical integration is required for a variety of tasks, including prediction, model comparison and model choice. A second canonical problem is that of statistical inference for models with intractable likelihoods. These include models with intractable normalisation constants, or models which are so complex that their likelihood cannot be evaluated, but from which data can be generated. Examples include large graphical models, as well as many models in imaging or spatial statistics. This thesis proposes to tackle these two problems using tools from the kernel methods and Bayesian non-parametrics literature. First, we analyse a well-known algorithm for numerical integration called Bayesian quadrature, and provide consistency and contraction rates. The algorithm is then assessed on a variety of statistical inference problems, and extended in several directions in order to reduce its computational requirements. We then demonstrate how the combination of reproducing kernels with Stein’s method can lead to computational tools which can be used with unnormalised densities, including numerical integration and approximation of probability measures. We conclude by studying two minimum distance estimators derived from kernel-based statistical divergences which can be used for unnormalised and generative models. In each instance, the tractability provided by reproducing kernels and their properties allows us to provide easily-implementable algorithms whose theoretical foundations can be studied in depth.

UCL Discovery

Statistical computation with kernels

Author: Briol François-Xavier

Modern statistical inference has seen a tremendous increase in the size and complexity of models and datasets. As such, it has become reliant on advanced computational tools for implementation. A first canonical problem in this area is the numerical approximation of integrals of complex and expensive functions. Numerical integration is required for a variety of tasks, including prediction, model comparison and model choice. A second canonical problem is that of statistical inference for models with intractable likelihoods. These include models with intractable normalisation constants, or models which are so complex that their likelihood cannot be evaluated, but from which data can be generated. Examples include large graphical models, as well as many models in imaging or spatial statistics. This thesis proposes to tackle these two problems using tools from the kernel methods and Bayesian non-parametrics literature. First, we analyse a well-known algorithm for numerical integration called Bayesian quadrature, and provide consistency and contraction rates. The algorithm is then assessed on a variety of statistical inference problems, and extended in several directions in order to reduce its computational requirements. We then demonstrate how the combination of reproducing kernels with Stein's method can lead to computational tools which can be used with unnormalised densities, including numerical integration and approximation of probability measures. We conclude by studying two minimum distance estimators derived from kernel-based statistical divergences which can be used for unnormalised and generative models. In each instance, the tractability provided by reproducing kernels and their properties allows us to provide easily-implementable algorithms whose theoretical foundations can be studied in depth.

Warwick Research Archives Portal Repository

Nonparametric Scoring Rules

Authors: Lahaie Sebastien; Zawadzki Erik
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 04/03/2015

A scoring rule is a device for eliciting and assessing probabilistic forecasts from an agent. When dealing with continuous outcome spaces, and absent any prior insights into the structure of the agent's beliefs, the rule should allow for a flexible reporting interface that can accurately represent complicated, multi-modal distributions. In this paper, we provide such a scoring rule based on a nonparametric approach of eliciting a set of samples from the agent and efficiently evaluating the score using kernel methods. We prove that sampled reports of increasing size converge rapidly to the true score, and that sampled reports are approximately optimal. We also demonstrate a connection between the scoring rule and the maximum mean discrepancy divergence. Experimental results are provided that confirm rapid convergence and that the expected score correlates well with standard notions of divergence, both important considerations for ensuring that agents are incentivized to report accurate information.

Association for the Advancement of Artificial Intelligence: AAAI Publications