55,091 research outputs found
Bayesian classification with Gaussian processes
We consider the problem of assigning an input vector to one of m classes by predicting P(c|x) for c=1,...,m. For a two-class problem, the probability of class one given x is estimated by s(y(x)), where s(y)=1/(1+e-y). A Gaussian process prior is placed on y(x), and is combined with the training data to obtain predictions for new x points. We provide a Bayesian treatment, integrating over uncertainty in y and in the parameters that control the Gaussian process prior the necessary integration over y is carried out using Laplace's approximation. The method is generalized to multiclass problems (m>2) using the softmax function. We demonstrate the effectiveness of the method on a number of datasets
Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp Version 2, an R Package for Treed Gaussian Process Models
This document describes the new features in version 2.x of the tgp package for R, implementing treed Gaussian process (GP) models. The topics covered include methods for dealing with categorical inputs and excluding inputs from the tree or GP part of the model; fully Bayesian sensitivity analysis for inputs/covariates; sequential optimization of black-box functions; and a new Monte Carlo method for inference in multi-modal posterior distributions that combines simulated tempering and importance sampling. These additions extend the functionality of tgp across all models in the hierarchy: from Bayesian linear models, to classification and regression trees (CART), to treed Gaussian processes with jumps to the limiting linear model. It is assumed that the reader is familiar with the baseline functionality of the package, outlined in the first vignette (Gramacy 2007).
A Mutually-Dependent Hadamard Kernel for Modelling Latent Variable Couplings
We introduce a novel kernel that models input-dependent couplings across
multiple latent processes. The pairwise joint kernel measures covariance along
inputs and across different latent signals in a mutually-dependent fashion. A
latent correlation Gaussian process (LCGP) model combines these non-stationary
latent components into multiple outputs by an input-dependent mixing matrix.
Probit classification and support for multiple observation sets are derived by
Variational Bayesian inference. Results on several datasets indicate that the
LCGP model can recover the correlations between latent signals while
simultaneously achieving state-of-the-art performance. We highlight the latent
covariances with an EEG classification dataset where latent brain processes and
their couplings simultaneously emerge from the model.Comment: 17 pages, 6 figures; accepted to ACML 201
Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Many real-world regression problems demand a measure of the uncertainty
associated with each prediction. Standard decision forests deliver efficient
state-of-the-art predictive performance, but high-quality uncertainty estimates
are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but
scaling GPs to large-scale data sets comes at the cost of approximating the
uncertainty estimates. We extend Mondrian forests, first proposed by
Lakshminarayanan et al. (2014) for classification problems, to the large-scale
non-parametric regression setting. Using a novel hierarchical Gaussian prior
that dovetails with the Mondrian forest framework, we obtain principled
uncertainty estimates, while still retaining the computational advantages of
decision forests. Through a combination of illustrative examples, real-world
large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that
Mondrian forests outperform approximate GPs on large-scale regression tasks and
deliver better-calibrated uncertainty assessments than decision-forest-based
methods.Comment: Proceedings of the 19th International Conference on Artificial
Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume
5
- …