
    Training Deep Gaussian Processes using Stochastic Expectation Propagation and Probabilistic Backpropagation

    Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations of Gaussian processes (GPs) and are formally equivalent to neural networks with multiple, infinitely wide hidden layers. DGPs are probabilistic and non-parametric and as such are arguably more flexible, have a greater capacity to generalise, and provide better-calibrated uncertainty estimates than alternative deep models. The focus of this paper is scalable approximate Bayesian learning of these networks. The paper develops a novel and efficient extension of probabilistic backpropagation, a state-of-the-art method for training Bayesian neural networks, that can be used to train DGPs. The new method leverages a recently proposed method for scaling Expectation Propagation, called stochastic Expectation Propagation. The method is able to automatically discover useful input warping, expansion or compression, and is therefore a flexible form of Bayesian kernel design. We demonstrate the success of the new method for supervised learning on several real-world datasets, showing that it typically outperforms GP regression and is never much worse.
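
    The full DGP training procedure combines probabilistic backpropagation with stochastic EP, which is too much for a snippet, but the core SEP idea of keeping a single averaged site factor instead of one site per data point can be sketched in a few lines. Below is a minimal numpy illustration on a conjugate 1-D Gaussian model (variable names are illustrative, not from the paper's code):

```python
import numpy as np

np.random.seed(0)
N = 200
data = np.random.normal(2.0, 1.0, size=N)  # observations, known unit noise variance
lik_var = 1.0

# Natural parameters (precision, precision * mean) of the prior and the
# single shared site factor f that SEP maintains instead of N sites.
prior = np.array([1.0, 0.0])   # N(0, 1) prior on the unknown mean
site = np.array([0.0, 0.0])    # shared site, initially flat

for sweep in range(10):
    for x in np.random.permutation(data):
        q = prior + N * site           # q(theta) = prior * f(theta)^N
        cavity = q - site              # remove one copy of the shared site
        # The tilted distribution cavity * N(x | theta, lik_var) is Gaussian
        # here, so moment matching is exact: add the likelihood's naturals.
        tilted = cavity + np.array([1.0 / lik_var, x / lik_var])
        f_new = tilted - cavity        # site implied by this data point
        # SEP update: damped average, f <- f^(1 - 1/N) * f_new^(1/N)
        site = (1 - 1.0 / N) * site + (1.0 / N) * f_new

q = prior + N * site
post_var = 1.0 / q[0]
post_mean = q[1] * post_var
print(f"SEP posterior mean: {post_mean:.3f}, variance: {post_var:.4f}")
print(f"Exact posterior mean: {np.sum(data) / (N + 1):.3f}")
```

    In the conjugate case the update is exact; in DGPs the moment-matching step is where probabilistic backpropagation comes in, propagating means and variances through the layers.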

    On the impact of covariance functions in multi-objective Bayesian optimization for engineering design

    Multi-objective Bayesian optimization (BO) is a highly useful class of methods that can effectively solve computationally expensive engineering design optimization problems with multiple objectives. However, the impact of the covariance function, an important component of multi-objective BO, is rarely studied in the context of engineering optimization. We aim to shed light on this issue by performing numerical experiments on engineering design optimization problems, primarily low-fidelity problems, so that we are able to statistically evaluate the performance of BO methods with various covariance functions. In this paper, we performed the study using a set of subsonic airfoil optimization cases as benchmark problems. Expected hypervolume improvement was used as the acquisition function to enrich the experimental design. Results show that the choice of covariance function has a notable impact on the performance of multi-objective BO. In this regard, Kriging models with the Matérn-3/2 covariance function are the most robust in terms of diversity and convergence to the Pareto front, and can handle problems of various complexities.
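
    The expected hypervolume improvement criterion and the airfoil benchmarks are beyond a snippet, but the object under study, the covariance function, is easy to show. Below is a minimal numpy sketch of GP regression with the Matérn-3/2 kernel the study found most robust, on toy 1-D data (all names are illustrative):

```python
import numpy as np

def matern32(X1, X2, lengthscale=1.0, variance=1.0):
    """Matern-3/2 kernel: k(r) = s^2 (1 + sqrt(3) r / l) exp(-sqrt(3) r / l)."""
    r = np.abs(X1[:, None] - X2[None, :])
    a = np.sqrt(3.0) * r / lengthscale
    return variance * (1.0 + a) * np.exp(-a)

# Toy "expensive" objective observed at a few design points
X = np.array([0.0, 0.3, 0.6, 1.0])
y = np.sin(6 * X)
Xs = np.linspace(0, 1, 5)                     # candidate points

K = matern32(X, X) + 1e-8 * np.eye(len(X))    # jitter for numerical stability
Ks = matern32(Xs, X)
mean = Ks @ np.linalg.solve(K, y)             # GP posterior mean
cov = matern32(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
print(mean.round(3), std.round(3))
```

    The Matérn-3/2 kernel assumes once-differentiable sample paths, which is often a better match for engineering responses than the infinitely smooth squared-exponential kernel.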

    A Geometric Variational Approach to Bayesian Inference

    We propose a novel Riemannian geometric framework for variational inference in Bayesian models based on the nonparametric Fisher-Rao metric on the manifold of probability density functions. Under the square-root density representation, the manifold can be identified with the positive orthant of the unit hypersphere in L2, and the Fisher-Rao metric reduces to the standard L2 metric. Exploiting this Riemannian structure, we formulate the task of approximating the posterior distribution as a variational problem on the hypersphere based on the alpha-divergence. This provides a tighter lower bound on the marginal distribution than approaches based on the Kullback-Leibler divergence, together with a corresponding upper bound that is unavailable with such approaches. We propose a novel gradient-based algorithm for the variational problem based on Fréchet derivative operators motivated by the geometry of the Hilbert sphere, and examine its properties. Through simulations and real-data applications, we demonstrate the utility of the proposed geometric framework and algorithm on several Bayesian models.
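
    The square-root representation that underlies the framework is simple to demonstrate: square-root densities are unit vectors in L2, so the Fisher-Rao distance is a great-circle arc length, and geodesics stay on the sphere. A minimal grid-based numpy sketch of this geometry (not the paper's inference algorithm; names are illustrative):

```python
import numpy as np

# Two densities on a grid. Under psi = sqrt(p), ||psi||_L2 = 1, so each psi
# is a point on the unit Hilbert sphere, and the Fisher-Rao distance is the
# arc length d(p, q) = arccos <psi_p, psi_q>.
x = np.linspace(-8, 8, 2001)
dx = x[1] - x[0]

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

p = gaussian_pdf(x, 0.0, 1.0)
q = gaussian_pdf(x, 1.0, 1.5)
psi_p, psi_q = np.sqrt(p), np.sqrt(q)

inner = np.sum(psi_p * psi_q) * dx            # L2 inner product (Riemann sum)
theta = np.arccos(np.clip(inner, -1.0, 1.0))
print("Fisher-Rao distance:", theta)

# Spherical geodesic between the square-root densities: the midpoint is
# still a unit vector, hence still a valid density after squaring.
t = 0.5
psi_t = (np.sin((1 - t) * theta) * psi_p + np.sin(t * theta) * psi_q) / np.sin(theta)
print("midpoint integrates to:", np.sum(psi_t ** 2) * dx)   # ~1.0
```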

    On the Use of Upper Trust Bounds in Constrained Bayesian Optimization Infill Criterion

    In order to handle constrained optimization problems with a large number of design variables, a new approach has been proposed to address constraints in a surrogate-based optimization framework. This approach focuses on sequential enrichment using adaptive surrogate models, based on a Bayesian optimization approach with Gaussian process models. A constraint criterion using the uncertainty estimates of the Gaussian process models is introduced. Different variants of the algorithm, chosen according to the accuracy of the constraint surrogate models, are used for selecting the infill sample points. The resulting algorithm has been tested on the well-known modified Branin optimization problem.
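
    The paper's exact criterion is more involved, but the sketch below shows one common way to fold GP uncertainty on a constraint into an infill criterion, in the spirit of an upper trust bound: a candidate's expected improvement only counts if even the upper confidence bound of the constraint model stays feasible. The function names, the factor k, and the zeroing rule are assumptions for illustration, not the paper's code:

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best):
    """Standard EI for minimisation under a GP posterior N(mu, sigma^2)."""
    z = (y_best - mu) / sigma
    return sigma * (z * norm.cdf(z) + norm.pdf(z))

def constrained_infill(mu_f, sig_f, mu_g, sig_g, y_best, k=2.0):
    # Upper trust bound on the constraint g(x) <= 0: trust a candidate only
    # if mu_g + k * sig_g is still feasible; otherwise zero out its EI.
    ei = expected_improvement(mu_f, sig_f, y_best)
    feasible = (mu_g + k * sig_g) <= 0.0
    return np.where(feasible, ei, 0.0)

# Toy scores for three candidates from hypothetical GP posteriors
mu_f = np.array([0.2, -0.1, 0.5]); sig_f = np.array([0.3, 0.4, 0.2])
mu_g = np.array([-0.5, 0.1, -1.0]); sig_g = np.array([0.1, 0.3, 0.2])
print(constrained_infill(mu_f, sig_f, mu_g, sig_g, y_best=0.0))
```

    Tightening k makes the search more conservative about constraint violation; loosening it lets the optimizer explore regions where the constraint model is still uncertain.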

    The Variational Garrote

    In this paper, we present a new variational method for sparse regression using L0 regularization. The variational parameters appear in the approximate model in a way that is similar to Breiman's Garrote model. We refer to this method as the variational Garrote (VG). We show that the combination of the variational approximation and L0 regularization has the effect of making the problem effectively of maximal rank, even when the number of samples is small compared to the number of variables. The VG is compared numerically with the Lasso method, ridge regression and the recently introduced paired mean field method (PMF) (M. Titsias & M. Lázaro-Gredilla, NIPS 2012). Numerical results show that the VG and PMF yield more accurate predictions and more accurately reconstruct the true model than the other methods. It is shown that the VG finds correct solutions when the Lasso solution is inconsistent due to large input correlations. Globally, the VG is significantly faster than PMF and tends to perform better as the problems become denser and in problems with strongly correlated inputs. The naive implementation of the VG scales cubically with the number of features. By introducing Lagrange multipliers, we obtain a dual formulation of the problem that scales cubically in the number of samples but close to linearly in the number of features.
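
    The VG itself places a mean-field approximation over binary inclusion variables, which takes more machinery than fits here. As a simpler grounding, the sketch below implements Breiman's non-negative garrote that the VG's parametrisation resembles: each OLS weight is multiplied by a learned non-negative shrinkage factor, with a penalty on the factors' sum. This is not the variational algorithm, just the underlying idea (illustrative numpy code):

```python
import numpy as np

np.random.seed(1)
n, p = 100, 10
X = np.random.randn(n, p)
beta_true = np.zeros(p); beta_true[:3] = [2.0, -1.5, 1.0]   # sparse truth
y = X @ beta_true + 0.1 * np.random.randn(n)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
Xb = X * beta_ols              # garrote: refit y ~ Xb @ c with factors c >= 0

c = np.ones(p)
lam, lr = 5.0, 1e-3
for _ in range(5000):
    grad = -2 * Xb.T @ (y - Xb @ c) + lam   # squared error + lam * sum(c)
    c = np.maximum(c - lr * grad, 0.0)      # projected gradient step, c >= 0

print("garrote factors:", c.round(2))       # ~0 for irrelevant features
print("shrunk weights :", (c * beta_ols).round(2))
```

    The VG replaces the hard non-negativity constraint with variational posteriors over binary spike variables, which is what yields the L0-like behaviour described in the abstract.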

    Probabilistic machine learning and artificial intelligence.

    How can a machine learn from experience? Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.
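
    As a concrete instance of "representing and manipulating uncertainty about models and predictions", below is a minimal conjugate Bayesian linear regression, where posterior uncertainty over the weights flows into the predictive variance. This is a textbook sketch, not code from the Review:

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(30, 2)
w_true = np.array([1.0, -2.0])
y = X @ w_true + 0.3 * np.random.randn(30)

alpha, beta = 1.0, 1.0 / 0.3 ** 2       # prior precision, noise precision
S = np.linalg.inv(alpha * np.eye(2) + beta * X.T @ X)   # posterior covariance
m = beta * S @ X.T @ y                                  # posterior mean

x_star = np.array([0.5, 0.5])
pred_mean = x_star @ m
pred_var = 1.0 / beta + x_star @ S @ x_star   # noise + model uncertainty
print(f"prediction: {pred_mean:.2f} +/- {np.sqrt(pred_var):.2f}")
```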