Laplace Approximation for Divisive Gaussian Processes for Nonstationary Regression
The standard Gaussian Process regression (GP) is usually formulated under stationary hypotheses: the noise power is considered constant throughout the input space, and the covariance of the prior distribution is typically modeled as depending only on the difference between input samples. These assumptions can be too restrictive and unrealistic for many real-world problems. Although nonstationarity can be achieved using specific covariance functions, these require prior knowledge of the kind of nonstationarity, which is not available for most applications. In this paper we propose using the Laplace approximation for inference in a divisive GP model that performs nonstationary regression, including heteroscedastic-noise cases. The log-concavity of the likelihood ensures a unimodal posterior, so the Laplace approximation converges to a unique maximum. The characteristics of the likelihood also yield posterior approximations that are accurate compared with Expectation Propagation (EP) and with the asymptotically exact posterior provided by a Markov chain Monte Carlo implementation with Elliptical Slice Sampling (ESS), at a reduced computational load with respect to both EP and ESS.
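The divisive likelihood of the paper is not reproduced here; the sketch below only shows the generic Laplace machinery it builds on, a Newton iteration for the posterior mode of a GP with a factorizing, log-concave likelihood (the standard GPML-style formulation). The callables `grad_log_lik` and `hess_diag_log_lik` are assumed placeholders for the first derivative and the diagonal of the second derivative of the log-likelihood with respect to the latent values.

```python
import numpy as np

def laplace_gp_posterior(K, grad_log_lik, hess_diag_log_lik, n_iter=50, tol=1e-8):
    """Newton iteration for the mode of p(f | y) with prior f ~ N(0, K) and a
    factorizing, log-concave likelihood; returns the mode and the covariance
    of the Gaussian (Laplace) approximation around it."""
    n = K.shape[0]
    f = np.zeros(n)
    for _ in range(n_iter):
        W = -hess_diag_log_lik(f)                 # nonnegative when log-concave
        sW = np.sqrt(W)
        L = np.linalg.cholesky(np.eye(n) + sW[:, None] * K * sW[None, :])
        b = W * f + grad_log_lik(f)
        a = b - sW * np.linalg.solve(L.T, np.linalg.solve(L, sW * (K @ b)))
        f_new = K @ a
        converged = np.max(np.abs(f_new - f)) < tol
        f = f_new
        if converged:
            break
    # Laplace covariance: (K^-1 + W)^-1 = K - K sW (I + sW K sW)^-1 sW K
    W = -hess_diag_log_lik(f)
    sW = np.sqrt(W)
    L = np.linalg.cholesky(np.eye(n) + sW[:, None] * K * sW[None, :])
    V = np.linalg.solve(L, sW[:, None] * K)
    return f, K - V.T @ V
```

Because the likelihood is log-concave, W stays nonnegative and the Newton iteration has a single fixed point, which is the property the abstract appeals to.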
Slice sampling covariance hyperparameters of latent Gaussian models
The Gaussian process (GP) is a popular way to specify dependencies between
random variables in a probabilistic model. In the Bayesian framework the
covariance structure can be specified using unknown hyperparameters.
Integrating over these hyperparameters considers different possible
explanations for the data when making predictions. This integration is often
performed using Markov chain Monte Carlo (MCMC) sampling. However, with
non-Gaussian observations standard hyperparameter sampling approaches require
careful tuning and may converge slowly. In this paper we present a slice
sampling approach that requires little tuning while mixing well in both strong-
and weak-data regimes.
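The paper's specific reparameterisations are not reproduced here; for reference, the basic univariate stepping-out slice sampler (Neal, 2003) that such schemes build on might be sketched as follows, with `log_post` an assumed placeholder for the log of the unnormalised conditional posterior of a single (log-)hyperparameter.

```python
import numpy as np

def slice_sample_1d(log_post, x0, w=1.0, max_steps=50, rng=None):
    """One slice-sampling update (Neal, 2003) with stepping-out: draw a
    vertical level under log_post(x0), grow an interval until both ends
    fall below it, then shrink the interval until a point on the slice
    is accepted. Needs essentially no tuning beyond the width w."""
    rng = np.random.default_rng() if rng is None else rng
    log_y = log_post(x0) + np.log(rng.uniform())
    left = x0 - w * rng.uniform()
    right = left + w
    for _ in range(max_steps):                 # step out to the left
        if log_post(left) < log_y:
            break
        left -= w
    for _ in range(max_steps):                 # step out to the right
        if log_post(right) < log_y:
            break
        right += w
    while True:                                # shrink onto the slice
        x1 = rng.uniform(left, right)
        if log_post(x1) >= log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1
```

In a GP model, one such update per hyperparameter (for example a log-lengthscale, with `log_post` combining the relevant conditional likelihood and the prior) can be interleaved with updates of the latent function values.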
Advances in Bayesian inference and stable optimization for large-scale machine learning problems
A core task in machine learning, and the topic of this thesis, is developing faster and more accurate methods of posterior inference in probabilistic models. The thesis has two components. The first explores using deterministic methods to improve the efficiency of Markov Chain Monte Carlo (MCMC) algorithms. We propose new MCMC algorithms that can use deterministic methods as a “prior” to bias MCMC proposals toward areas of high posterior density, leading to highly efficient sampling. In Chapter 2 we develop such methods for continuous distributions, and in Chapter 3 for binary distributions. The resulting methods consistently outperform existing state-of-the-art sampling techniques, sometimes by several orders of magnitude. Chapter 4 uses ideas similar to those in Chapters 2 and 3, but in the context of modeling the performance of left-handed players in one-on-one interactive sports.
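The thesis' own proposal constructions are not shown here; a minimal sketch of the general idea, using a fixed Gaussian approximation of the posterior (for instance from a Laplace or variational fit) as an independence Metropolis-Hastings proposal, could look like the following. `log_post`, `mu`, and `Sigma` are assumed placeholders.

```python
import numpy as np
from scipy.stats import multivariate_normal

def independence_mh(log_post, mu, Sigma, n_samples=1000, rng=None):
    """Independence Metropolis-Hastings whose proposal is a fixed Gaussian
    N(mu, Sigma), e.g. a deterministic approximation of the posterior; the
    acceptance probability depends only on the importance ratios p(x)/q(x)."""
    rng = np.random.default_rng() if rng is None else rng
    q = multivariate_normal(mean=mu, cov=Sigma)
    x = np.asarray(mu, dtype=float)
    log_w = log_post(x) - q.logpdf(x)
    samples = []
    for _ in range(n_samples):
        x_prop = rng.multivariate_normal(mu, Sigma)
        log_w_prop = log_post(x_prop) - q.logpdf(x_prop)
        if np.log(rng.uniform()) < log_w_prop - log_w:
            x, log_w = x_prop, log_w_prop
        samples.append(x)
    return np.array(samples)
```

The better the deterministic approximation matches the posterior, the closer the importance ratios are to constant and the higher the acceptance rate, which is the mechanism this part of the thesis exploits.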
The second part of this thesis explores the use of stable stochastic gradient descent (SGD) methods for computing a maximum a posteriori (MAP) estimate in large-scale machine learning problems. In Chapter 5 we propose two such methods for softmax regression. The first is an implementation of Implicit SGD (ISGD), a stable but difficult-to-implement SGD method, and the second is a new SGD method specifically designed for optimizing a double-sum formulation of the softmax. Both methods comprehensively outperform the previous state of the art on seven real-world datasets. Inspired by the success of ISGD on the softmax, we investigate its application to neural networks in Chapter 6. In this chapter we present a novel layer-wise approximation of ISGD that has efficiently computable updates. Experiments show that the resulting method is more robust to high learning rates and generally outperforms standard backpropagation on a variety of tasks.
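Neither the double-sum softmax formulation nor the layer-wise network approximation is reproduced here; the sketch below only illustrates the implicit-SGD principle on ordinary least squares, where the implicit update happens to have a closed form. The function name and data shapes are illustrative.

```python
import numpy as np

def isgd_least_squares(X, y, lr=0.5, n_epochs=10, rng=None):
    """Implicit SGD for linear least squares. Each step solves the implicit
    equation  theta' = theta - lr * (x_i @ theta' - y_i) * x_i,  whose closed
    form damps the step by 1 + lr * ||x_i||^2; this damping is what makes
    implicit SGD stable under large learning rates."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            x_i = X[i]
            resid = x_i @ theta - y[i]
            theta = theta - lr * resid / (1.0 + lr * (x_i @ x_i)) * x_i
    return theta
```

For the softmax and for neural-network layers the implicit equation is no longer available in closed form, which is exactly the difficulty the thesis' approximations address.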
Pseudo-Marginal Bayesian Inference for Gaussian Processes
The main challenges that arise when adopting Gaussian Process priors in
probabilistic modeling are how to carry out exact Bayesian inference and how to
account for uncertainty on model parameters when making model-based predictions
on out-of-sample data. Using probit regression as an illustrative working
example, this paper presents a general and effective methodology based on the
pseudo-marginal approach to Markov chain Monte Carlo that efficiently addresses
both of these issues. The results presented in this paper show improvements
over existing sampling methods to simulate from the posterior distribution over
the parameters defining the covariance function of the Gaussian Process prior.
This is particularly important as it offers a powerful tool to carry out full
Bayesian inference of Gaussian Process based hierarchical statistical models in
general. The results also demonstrate that Monte Carlo based integration of all
model parameters is actually feasible in this class of models, providing a
superior quantification of uncertainty in predictions. Extensive comparisons
with respect to state-of-the-art probabilistic classifiers confirm this
assertion.
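The full probit-regression treatment of the paper is not reproduced here; a minimal sketch of the pseudo-marginal mechanism itself, a random-walk Metropolis-Hastings step whose acceptance ratio uses an unbiased but noisy estimate of the marginal likelihood (obtained, for example, by importance sampling over the latent GP values), could look as follows. `log_lik_estimate` and `log_prior` are assumed placeholders.

```python
import numpy as np

def pseudo_marginal_mh(log_lik_estimate, log_prior, theta0, step=0.1,
                       n_samples=2000, rng=None):
    """Pseudo-marginal Metropolis-Hastings over covariance hyperparameters:
    the intractable marginal likelihood is replaced by an unbiased estimate,
    and the estimate attached to the current state is recycled until the next
    acceptance, which keeps the exact posterior invariant."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    log_target = log_lik_estimate(theta, rng) + log_prior(theta)
    chain = []
    for _ in range(n_samples):
        prop = theta + step * rng.standard_normal(theta.shape)
        log_target_prop = log_lik_estimate(prop, rng) + log_prior(prop)
        if np.log(rng.uniform()) < log_target_prop - log_target:
            theta, log_target = prop, log_target_prop
        chain.append(theta.copy())
    return np.array(chain)
```

The key property is that, as long as the likelihood estimate is unbiased, the chain targets the exact posterior over the hyperparameters regardless of the estimator's variance; the variance only affects how well the chain mixes.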
Integrals over Gaussians under Linear Domain Constraints
Integrals of linearly constrained multivariate Gaussian densities are a
frequent problem in machine learning and statistics, arising in tasks like
generalized linear models and Bayesian optimization. Yet they are notoriously
hard to compute, and to further complicate matters, the numerical values of
such integrals may be very small. We present an efficient black-box algorithm
that exploits geometry to estimate integrals over a small, truncated Gaussian
volume and to simulate from it. Our algorithm uses the
Holmes-Diaconis-Ross (HDR) method combined with an analytic version of
elliptical slice sampling (ESS). Adapted to the linear setting, ESS allows for
rejection-free sampling, because intersections of ellipses and domain
boundaries have closed-form solutions. The key idea of HDR is to decompose the
integral into easier-to-compute conditional probabilities by using a sequence
of nested domains. Remarkably, it allows for direct computation of the
logarithm of the integral value and thus enables the computation of extremely
small probability masses. We demonstrate the effectiveness of our tailored
combination of HDR and ESS on high-dimensional integrals and on entropy search
for Bayesian optimization
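The HDR nesting is not reproduced here; the sketch below only illustrates why the elliptical slice sampling step becomes rejection-free under linear constraints: along the ESS ellipse x·cos(t) + nu·sin(t), a half-space constraint a·z + b >= 0 reduces to A·cos(t) + B·sin(t) + b >= 0, whose boundary angles are available in closed form. The helper below is hypothetical, not the authors' implementation.

```python
import numpy as np

def boundary_angles(x, nu, a, b):
    """Angles t at which the ESS ellipse x*cos(t) + nu*sin(t) crosses the
    boundary of the half-space {z : a @ z + b >= 0}. Writing the constraint
    along the ellipse as A*cos(t) + B*sin(t) + b = r*cos(t - phi) + b, the
    crossings are t = phi +/- arccos(-b / r); an empty array means the whole
    ellipse lies on one side of the boundary."""
    A, B = a @ x, a @ nu
    r = np.hypot(A, B)
    if r < abs(b):
        return np.array([])
    phi = np.arctan2(B, A)
    delta = np.arccos(np.clip(-b / r, -1.0, 1.0))
    return np.mod(np.array([phi - delta, phi + delta]), 2.0 * np.pi)
```

Intersecting the feasible arcs obtained from every constraint yields the angular bracket from which the next point on the ellipse is drawn, so no proposal ever leaves the domain and none has to be rejected.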
Gaussian Processes with Monotonicity constraints for Big Data
In this thesis, we combine recent advances in monotonicity constraints for Gaussian processes with Big Data inference for Gaussian processes. The new variational-inference-based method is developed and evaluated on several simulated and real-world data sets, comparing its predictive performance to Expectation Propagation and Markov chain Monte Carlo methods. The results indicate that the new method performs well and can be used when data sets grow too large for the computationally demanding methods.
Perspectives of Imaging of Single Protein Molecules with the Present Design of the European XFEL. - Part I - X-ray Source, Beamline Optics and Instrument Simulations
The Single Particles, Clusters and Biomolecules (SPB) instrument at the
European XFEL is located behind the SASE1 undulator, and aims to support
imaging and structure determination of biological specimens between about 0.1
micrometer and 1 micrometer in size. The instrument is designed to work at photon
energies from 3 keV up to 16 keV. This wide operating range poses challenges
for the focusing optics. In particular, a long propagation distance
of about 900 m between x-ray source and sample leads to a large lateral photon
beam size at the optics. The beam divergence is the most important parameter
for the optical system, and is largest for the lowest photon energies and for
the shortest pulse duration (corresponding to the lowest charge). Due to the
large divergence of nominal X-ray pulses with duration shorter than 10 fs, one
suffers diffraction from the mirror aperture, leading to a 100-fold decrease in
fluence at photon energies around 4 keV, which are ideal for imaging of single
biomolecules. The nominal SASE1 output power is about 50 GW. This is very far
from the level required for single biomolecule imaging, even assuming perfect
beamline and focusing efficiency. Here we demonstrate that the parameters of
the accelerator complex and of the SASE1 undulator offer an opportunity to
optimize the SPB beamline for single biomolecule imaging with minimal
additional costs and time. Start-to-end simulations from the electron injector
at the beginning of the accelerator complex up to the generation of diffraction
data indicate that one can achieve about
0.5 photons per Shannon pixel at near-atomic resolution with 1e13 photons in a
4 fs pulse at 4 keV photon energy and in a 100 nm focus, corresponding to a
fluence of 1e23 ph/cm^2. This result is exemplified using the RNA Pol II
molecule as a case study.
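As a sanity check on the quoted figures (assuming the 100 nm value refers to the side of a square focal spot), the stated fluence follows directly:

\Phi = \frac{N_{\mathrm{ph}}}{A_{\mathrm{focus}}} = \frac{10^{13}\ \mathrm{photons}}{(100\ \mathrm{nm})^{2}} = \frac{10^{13}}{10^{-10}\ \mathrm{cm}^{2}} = 10^{23}\ \mathrm{ph/cm^{2}}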