
43 research outputs found

Student-t Processes as Alternatives to Gaussian Processes

Author: Ghahramani Zoubin
Shah Amar
Wilson Andrew Gordon
Publication date: 19/02/2014

We investigate the Student-t process as an alternative to the Gaussian process as a nonparametric prior over functions. We derive closed form expressions for the marginal likelihood and predictive distribution of a Student-t process, by integrating away an inverse Wishart process prior over the covariance kernel of a Gaussian process model. We show surprising equivalences between different hierarchical Gaussian process models leading to Student-t processes, and derive a new sampling scheme for the inverse Wishart process, which helps elucidate these equivalences. Overall, we show that a Student-t process can retain the attractive properties of a Gaussian process -- a nonparametric representation, analytic marginal and predictive distributions, and easy model selection through covariance kernels -- but has enhanced flexibility, and predictive covariances that, unlike a Gaussian process, explicitly depend on the values of training observations. We verify empirically that a Student-t process is especially useful in situations where there are changes in covariance structure, or in applications like Bayesian optimization, where accurate predictive covariances are critical for good performance. These advantages come at no additional computational cost over Gaussian processes.

Comment: 13 pages, 6 figures, 1 table. To appear in "The Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS), 2014."

arXiv.org e-Print Archive

CiteSeerX
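The abstract above notes that Student-t process predictive covariances depend on the observed training values. A minimal sketch of that behaviour follows, assuming a zero mean function, a squared-exponential kernel, and the commonly used parameterisation with degrees of freedom nu > 2 in which the kernel matrix is the covariance. The names `rbf_kernel` and `tp_predict` and all default values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def tp_predict(x_train, y_train, x_test, nu=5.0, jitter=1e-8):
    """Zero-mean Student-t process predictive (assumes nu > 2).

    Returns the predictive mean, predictive covariance, and the
    updated degrees of freedom nu + n.
    """
    n = x_train.shape[0]
    K11 = rbf_kernel(x_train, x_train) + jitter * np.eye(n)
    K21 = rbf_kernel(x_test, x_train)
    K22 = rbf_kernel(x_test, x_test)

    # Cholesky-based solves, as in standard Gaussian process regression.
    L = np.linalg.cholesky(K11)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K11^{-1} y
    V = np.linalg.solve(L, K21.T)

    mean = K21 @ alpha              # same form as the GP predictive mean
    gp_cov = K22 - V.T @ V          # GP predictive covariance
    beta = y_train @ alpha          # squared Mahalanobis norm of the data

    # The data-dependent factor is the only change relative to a GP.
    scale = (nu + beta - 2.0) / (nu + n - 2.0)
    return mean, scale * gp_cov, nu + n
```

Under these assumptions the only extra work over a Gaussian process is the scalar `beta`, and as `nu` grows the scaling factor tends to 1, so the Gaussian process predictive covariance is recovered.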

Hypervolume-based Multi-objective Bayesian Optimization with Student-t Processes

Author: Couckuyt Ivo
Dhaene Tom
van der Herten Joachim
Publication date: 01/01/2016

Student-t processes have recently been proposed as an appealing alternative nonparametric function prior. They feature enhanced flexibility and predictive variance. In this work, the use of Student-t processes is explored for multi-objective Bayesian optimization. In particular, an analytical expression for the hypervolume-based probability of improvement is developed for independent Student-t process priors of the objectives. Its effectiveness is shown on a multi-objective optimization problem which is known to be difficult with traditional Gaussian processes.

Comment: 5 pages, 3 figures

arXiv.org e-Print Archive

Ghent University Academic Bibliography

On the average uncertainty for systems with nonlinear coupling

Author: Kon Mark A.
Nelson Kenric P.
Umarov Sabir
Publication date: 06/05/2016

The increased uncertainty and complexity of nonlinear systems have motivated investigators to consider generalized approaches to defining an entropy function. New insights are achieved by defining the average uncertainty in the probability domain as a transformation of entropy functions. The Shannon entropy when transformed to the probability domain is the weighted geometric mean of the probabilities. For the exponential and Gaussian distributions, we show that the weighted geometric mean of the distribution is equal to the density of the distribution at the location plus the scale, i.e. at the width of the distribution. The average uncertainty is generalized via the weighted generalized mean, in which the moment is a function of the nonlinear source. Both the Renyi and Tsallis entropies transform to this definition of the generalized average uncertainty in the probability domain. For the generalized Pareto and Student's t-distributions, which are the maximum entropy distributions for these generalized entropies, the appropriate weighted generalized mean also equals the density of the distribution at the location plus scale. A coupled entropy function is proposed, which is equal to the normalized Tsallis entropy divided by one plus the coupling.

Comment: 24 pages, including 4 figures and 1 table

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)
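The claim in the abstract above about the Gaussian and exponential distributions can be checked with a short worked calculation. This is a sketch that assumes the continuous analogue of the weighted geometric mean is the exponential of the negative differential entropy; the symbols $P_{\mathrm{avg}}$, $H[f]$, $\mu$ and $\sigma$ are notation chosen here, not necessarily the paper's.

```latex
\begin{align}
P_{\mathrm{avg}} &= \exp\!\Big(\int f(x)\,\ln f(x)\,dx\Big) = e^{-H[f]} \\[4pt]
\text{Gaussian } \mathcal{N}(\mu,\sigma^2):\quad
H[f] &= \tfrac{1}{2}\ln\!\big(2\pi e\,\sigma^2\big)
\;\Rightarrow\;
P_{\mathrm{avg}} = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-1/2} = f(\mu+\sigma) \\[4pt]
\text{Exponential } f(x) = \tfrac{1}{\sigma}\,e^{-(x-\mu)/\sigma},\ x \ge \mu:\quad
H[f] &= 1 + \ln\sigma
\;\Rightarrow\;
P_{\mathrm{avg}} = \frac{1}{\sigma}\,e^{-1} = f(\mu+\sigma)
\end{align}
```

In both cases the average uncertainty equals the density evaluated one scale unit away from the location, which matches the statement in the abstract.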

Practical Bayesian optimization in the presence of outliers

Author: Martinez-Cantin Ruben
McCourt Michael
Tee Kevin
Publication date: 12/12/2017

Inference in the presence of outliers is an important field of research as outliers are ubiquitous and may arise across a variety of problems and domains. Bayesian optimization is a method that heavily relies on probabilistic inference. This allows outstanding sample efficiency because the probabilistic machinery provides a memory of the whole optimization process. However, that virtue becomes a disadvantage when the memory is populated with outliers, inducing bias in the estimation. In this paper, we present an empirical evaluation of Bayesian optimization methods in the presence of outliers. The empirical evidence shows that Bayesian optimization with robust regression often produces suboptimal results. We then propose a new algorithm which combines robust regression (a Gaussian process with Student-t likelihood) with outlier diagnostics to classify data points as outliers or inliers. By using a scheduler for the classification of outliers, our method is more efficient and has better convergence than standard robust regression. Furthermore, we show that even in controlled situations with no expected outliers, our method is able to produce better results.

Comment: 10 pages (2 of references), 6 figures, 1 algorithm

arXiv.org e-Print Archive

Repositorio Universidad de Zaragoza

BRUNO: A Deep Recurrent Model for Exchangeable Data

Author: Dambre Joni
Degrave Jonas
Gal Yarin
Gretton Arthur
Huszár Ferenc
Korshunova Iryna
Publication date: 01/01/2018

We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations. Our model is provably exchangeable, meaning that the joint distribution over observations is invariant under permutation: this property lies at the heart of Bayesian inference. The model does not require variational approximations to train, and new samples can be generated conditional on previous samples, with cost linear in the size of the conditioning set. The advantages of our architecture are demonstrated on learning tasks that require generalisation from short observed sequences while modelling sequence variability, such as conditional image generation, few-shot learning, and anomaly detection.

Comment: NIPS 201

arXiv.org e-Print Archive

Ghent University Academic Bibliography

UCL Discovery

Oxford University Research Archive