3,662 research outputs found

String and Membrane Gaussian Processes

Author: Roberts Stephen
Samo Yves-Laurent Kom
Publication date: 01/01/2016

In this paper we introduce a novel framework for making exact nonparametric Bayesian inference on latent functions that is particularly suitable for Big Data tasks. Firstly, we introduce a class of stochastic processes we refer to as string Gaussian processes (string GPs), which are not to be mistaken for Gaussian processes operating on text. We construct string GPs so that their finite-dimensional marginals exhibit suitable local conditional independence structures, which allow for scalable, distributed, and flexible nonparametric Bayesian inference, without resorting to approximations, and while ensuring some mild global regularity constraints. Furthermore, string GP priors naturally cope with heterogeneous input data, and the gradient of the learned latent function is readily available for explanatory analysis. Secondly, we provide some theoretical results relating our approach to the standard GP paradigm. In particular, we prove that some string GPs are Gaussian processes, which provides a complementary global perspective on our framework. Finally, we derive a scalable and distributed MCMC scheme for supervised learning tasks under string GP priors. The proposed MCMC scheme has computational time complexity $\mathcal{O}(N)$ and memory requirement $\mathcal{O}(dN)$, where $N$ is the data size and $d$ the dimension of the input space. We illustrate the efficacy of the proposed approach on several synthetic and real-world datasets, including a dataset with 6 million input points and 8 attributes.

Comment: To appear in the Journal of Machine Learning Research (JMLR), Volume 1

arXiv.org e-Print Archive

Oxford University Research Archive

Scalable Bayesian nonparametric regression via a Plackett-Luce model for conditional ranks

Author: Caron Francois
Gray-Davies Tristan
Holmes Chris
Publication date: 24/06/2015

We present a novel Bayesian nonparametric regression model for covariates X and a continuous, real-valued response variable Y. The model is parametrized in terms of marginal distributions for Y and X and a regression function which tunes the stochastic ordering of the conditional distributions F(y|x). By adopting an approximate composite likelihood approach, we show that the resulting posterior inference can be decoupled for the separate components of the model. This procedure can scale to very large datasets and allows standard, existing software from Bayesian nonparametric density estimation and Plackett-Luce ranking estimation to be applied. As an illustration, we show an application of our approach to a US Census dataset with over 1,300,000 data points and more than 100 covariates.

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive