    Invariant Priors for Bayesian Quadrature

    Bayesian quadrature (BQ) is a model-based numerical integration method that can increase sample efficiency by encoding and exploiting known structure of the integration task at hand. In this paper, we explore priors that encode invariance of the integrand under a set of bijective transformations of the input domain, in particular certain unitary transformations such as rotations, axis flips, or point symmetries. We show initial results indicating superior performance compared to standard Bayesian quadrature on several synthetic problems and one real-world application.
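One common way to build such an invariant prior, shown here only as a minimal sketch, is to average a base kernel over a finite transformation group. The abstract does not state which construction the paper uses, so the group-averaging approach, the squared-exponential base kernel, and the names `rbf`, `GROUP`, and `invariant_kernel` are all assumptions made for illustration.

```python
import numpy as np

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential base kernel between two points."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-0.5 * np.dot(d, d) / lengthscale**2))

# Hypothetical finite group: identity and the point reflection x -> -x in 2D.
GROUP = [np.eye(2), -np.eye(2)]

def invariant_kernel(x, y, base=rbf, group=GROUP):
    """Average the base kernel over all pairs of group elements.

    A GP with this kernel puts prior mass only on functions f that
    satisfy f(g x) = f(x) for every g in the group.
    """
    total = sum(base(g @ x, h @ y) for g in group for h in group)
    return total / len(group) ** 2

x = np.array([0.3, -1.2])
y = np.array([1.0, 0.5])
k_xy = invariant_kernel(x, y)
k_reflected = invariant_kernel(-x, y)  # first argument transformed by a group element
```

Because the group is closed under composition, transforming either argument by a group element only reorders the terms of the sum, so `k_xy` and `k_reflected` agree up to floating-point rounding.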

    Automating Active Learning for Gaussian Processes

    In many problems in science, technology, and engineering, unlabeled data is abundant but acquiring labeled observations is expensive: it requires a human annotator, a costly laboratory experiment, or a time-consuming computer simulation. Active learning is a machine learning paradigm designed to minimize the cost of obtaining labeled data by carefully selecting which new data should be gathered next. However, substantial machine learning expertise is often required to apply these techniques effectively in their current form. In this dissertation, we propose solutions that further automate active learning. Our core contributions are active learning algorithms that are easy for non-experts to use but that deliver results competitive with or better than human-expert solutions. We begin by introducing a novel active search algorithm that automatically and dynamically balances exploration against exploitation, without relying on a parameter to control this tradeoff. We also provide a theoretical investigation of the hardness of this problem, proving that no polynomial-time policy can achieve a constant-factor approximation ratio for the expected utility of the optimal policy. Next, we introduce a novel information-theoretic approach for active model selection. Our method is based on maximizing the mutual information between the output variable and the model class. This is the first active model selection approach that does not require updating each model for every candidate point. As a result, we developed an automated audiometry test for rapid screening of noise-induced hearing loss, a widespread disability that is preventable if diagnosed early. We proceed by introducing a novel model selection algorithm for fixed-size datasets, called Bayesian optimization for model selection (BOMS). Our proposed model search method is based on Bayesian optimization in model space, where we reason about the model evidence as a function to be maximized. BOMS is capable of finding a model that explains the dataset well without any human assistance. Finally, we extend BOMS to active learning, creating a fully automatic active learning framework. We apply this framework to Bayesian optimization, creating a sample-efficient automated system for black-box optimization. Crucially, we account for the uncertainty in the choice of model; our method uses multiple carefully selected models to represent its current belief about the latent objective function. Our algorithms are completely general and can be extended to any class of probabilistic models. In this dissertation, however, we mainly use the powerful class of Gaussian process models to perform inference. Extensive experimental evidence is provided to demonstrate that all proposed algorithms outperform previously developed solutions to these problems.

    Numerical Integration as and for Probabilistic Inference

    Numerical integration, or quadrature, is one of the workhorses of modern scientific computing and a key operation for performing inference in intractable probabilistic models. The epistemic uncertainty about the true value of an analytically intractable integral identifies the integration task as an inference problem itself. Indeed, numerical integration can be cast as a probabilistic numerical method known as Bayesian quadrature (BQ). BQ leverages structural assumptions about the function to be integrated via properties encoded in the prior. A posterior belief over the unknown integral value emerges by conditioning the BQ model on an actively selected point set and the corresponding function evaluations. Iterative updates to the Bayesian model turn BQ into an adaptive quadrature method that quantifies its uncertainty about the solution of the integral in a principled way. This thesis traces out the scope of probabilistic integration methods and highlights types of integration tasks that BQ excels at. These arise when sample efficiency is required and encodable prior knowledge about an integration problem of low to moderate dimensionality is at hand. The first contribution addresses transfer learning with BQ. It extends the notion of active learning schemes to cost-sensitive settings where cheap approximations to an expensive-to-evaluate integrand are available. The degeneracy of acquisition policies in simple BQ is lifted upon generalization to the multi-source, cost-sensitive setting. This motivates the formulation of a set of desirable properties for BQ acquisition functions. A second application considers integration tasks arising in statistical computations on Riemannian manifolds that have been learned from data. Unsupervised learning algorithms that respect the intrinsic geometry of the data rely on the repeated estimation of expensive and structured integrals. Our custom-made active BQ scheme outperforms conventional integration tools for Riemannian statistics. Despite their undeniable benefits, BQ schemes provide limited flexibility to construct suitable priors while keeping the inference step tractable. In a final contribution, we identify the ubiquitous integration problem of computing multivariate normal probabilities as a type of integration task that is structurally taxing for BQ. The method we propose instead is an elegant algorithm based on Markov chain Monte Carlo that permits both sampling from and estimating the normalization constant of linearly constrained Gaussians that contain an arbitrarily small probability mass. As a whole, this thesis contributes to the wider goal of advancing integration algorithms to satisfy the needs imposed by contemporary probabilistic machine learning applications.
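To make the conditioning step described in this abstract concrete, the following is a minimal sketch of vanilla BQ in one dimension. It assumes a zero-mean Gaussian process prior with a squared-exponential kernel and a standard normal integration measure, for which the kernel integrals are available in closed form. The function name `bq_gaussian_measure`, the fixed node grid, and the test integrand are illustrative choices, not taken from the thesis, and the nodes are fixed in advance rather than actively selected.

```python
import numpy as np

def bq_gaussian_measure(f, nodes, lengthscale=1.0, jitter=1e-10):
    """Posterior mean and variance of the integral of f(x) N(x; 0, 1) dx.

    Assumes a zero-mean GP prior on f with kernel
    k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2)).
    """
    x = np.asarray(nodes, dtype=float)
    y = f(x)
    l2 = lengthscale**2

    # Gram matrix of the kernel at the nodes, with jitter for stability.
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / l2)
    K += jitter * np.eye(len(x))

    # Kernel mean z_i: integral of k(x_i, x') against N(x'; 0, 1).
    z = np.sqrt(l2 / (l2 + 1.0)) * np.exp(-0.5 * x**2 / (l2 + 1.0))
    # Prior variance of the integral: double integral of k against the measure.
    prior_var = np.sqrt(l2 / (l2 + 2.0))

    weights = np.linalg.solve(K, z)               # BQ quadrature weights
    mean = float(weights @ y)                     # posterior mean of the integral
    var = max(float(prior_var - weights @ z), 0.0)  # posterior variance, clipped at 0
    return mean, var, float(prior_var)

# Illustrative integrand: E[cos(X)] for X ~ N(0, 1) equals exp(-1/2).
estimate, variance, prior_variance = bq_gaussian_measure(np.cos, np.linspace(-3.0, 3.0, 9))
```

In this simple model the posterior variance depends only on the node locations and not on the observed function values, which is one way to read the abstract's remark about the degeneracy of acquisition policies in simple BQ.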