8 research outputs found

Notes on the runtime of A* sampling

Author: Markou Stratis
Publication date: 24/07/2022

The challenge of simulating random variables is a central problem in Statistics and Machine Learning. Given a tractable proposal distribution $P$, from which we can draw exact samples, and a target distribution $Q$ which is absolutely continuous with respect to $P$, the A* sampling algorithm allows simulating exact samples from $Q$, provided we can evaluate the Radon-Nikodym derivative of $Q$ with respect to $P$. Maddison et al. originally showed that for a target distribution $Q$ and proposal distribution $P$, the runtime of A* sampling is upper bounded by $\mathcal{O}(\exp(D_{\infty}[Q||P]))$, where $D_{\infty}[Q||P]$ is the Rényi divergence from $Q$ to $P$. This runtime can be prohibitively large for many cases of practical interest. Here, we show that with additional restrictive assumptions on $Q$ and $P$, we can achieve much faster runtimes. Specifically, we show that if $Q$ and $P$ are distributions on $\mathbb{R}$ and their Radon-Nikodym derivative is unimodal, the runtime of A* sampling is $\mathcal{O}(D_{\infty}[Q||P])$, which is exponentially faster than A* sampling without assumptions.
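To give a feel for the exponential scale in the abstract above, here is a minimal sketch that uses plain rejection sampling, not A* sampling. The Gaussian choices Q = N(mu, sigma^2) and P = N(0, 1), and every function name, are assumptions made for illustration only. With these choices the Radon-Nikodym derivative is unimodal when sigma < 1, its log-supremum D_inf has a closed form, and the expected number of proposals drawn by the rejection sampler is exp(D_inf).

```python
import math
import random


def log_ratio(x, mu, sigma):
    """log dQ/dP(x) for the assumed Q = N(mu, sigma^2) and P = N(0, 1)."""
    log_q = -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)
    log_p = -0.5 * x ** 2
    return log_q - log_p  # the shared -0.5 * log(2 * pi) terms cancel


def d_infinity(mu, sigma):
    """Closed form of sup_x log dQ/dP(x); finite only when sigma < 1."""
    return 0.5 * mu ** 2 / (1.0 - sigma ** 2) - math.log(sigma)


def rejection_sample(mu, sigma, rng):
    """Return (exact sample from Q, number of proposals drawn from P)."""
    bound = d_infinity(mu, sigma)
    trials = 0
    while True:
        trials += 1
        x = rng.gauss(0.0, 1.0)  # proposal from P
        # Accept with probability exp(log_ratio - bound), which is <= 1.
        if math.log(1.0 - rng.random()) < log_ratio(x, mu, sigma) - bound:
            return x, trials


def mean_trials(mu, sigma, n=20000, seed=0):
    """Average number of proposals per accepted sample over n runs."""
    rng = random.Random(seed)
    return sum(rejection_sample(mu, sigma, rng)[1] for _ in range(n)) / n
```

Comparing `mean_trials(1.0, 0.5)` with `math.exp(d_infinity(1.0, 0.5))` shows the two agree closely, which is the exponential dependence on the divergence that the linear bound in the abstract improves on.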

arXiv.org e-Print Archive

Environmental Sensor Placement with Convolutional Gaussian Neural Processes

Author: Andersson Tom R.
Bruinsma Wessel P.
Coca-Castro Alejandro
Ellis Anna-Louise
Hosking J. Scott
Jones Daniel C.
Lazzara Matthew A.
Markou Stratis
Requeima James
Turner Richard E.
Vaughan Anna
Publication date: 29/03/2023

Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change. However, it is challenging to maximise measurement informativeness and place sensors efficiently, particularly in remote regions like Antarctica. Probabilistic machine learning models can evaluate placement informativeness by predicting the uncertainty reduction provided by a new sensor. Gaussian process (GP) models are widely used for this purpose, but they struggle with capturing complex non-stationary behaviour and scaling to large datasets. This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues. A ConvGNP uses neural networks to parameterise a joint Gaussian distribution at arbitrary target locations, enabling flexibility and scalability. Using simulated surface air temperature anomaly over Antarctica as ground truth, the ConvGNP learns spatial and seasonal non-stationarities, outperforming a non-stationary GP baseline. In a simulated sensor placement experiment, the ConvGNP better predicts the performance boost obtained from new observations than GP baselines, leading to more informative sensor placements. We contrast our approach with physics-based sensor placement methods and propose future work towards an operational sensor placement recommendation system. This system could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality.

Comment: In review for the Climate Informatics 2023 special issue of Environmental Data Science

arXiv.org e-Print Archive

Trieste: Efficiently Exploring The Depths of Black-box Functions with TensorFlow

Author: Artemev Artem
Berkeley Joel
Couckuyt Ivo
Ghani Khurram
Goodall Alexander
Granta Uri
Loka Nasrulloh R. B. S
Markou Stratis
Moss Henry B.
Ober Sebastian W.
Paleyes Andrei
Pascual-Diaz Sergio
Picheny Victor
Qing Jixiang
Stojic Hrvoje
Vakili Sattar
Publication date: 16/02/2023

We present Trieste, an open-source Python package for Bayesian optimization and active learning benefiting from the scalability and efficiency of TensorFlow. Our library enables the plug-and-play of popular TensorFlow-based models within sequential decision-making loops, e.g. Gaussian processes from GPflow or GPflux, or neural networks from Keras. This modular mindset is central to the package and extends to our acquisition functions and the internal dynamics of the decision-making loop, both of which can be tailored and extended by researchers or engineers when tackling custom use cases. Trieste is a research-friendly and production-ready toolkit backed by a comprehensive test suite, extensive documentation, and available at https://github.com/secondmind-labs/trieste

arXiv.org e-Print Archive

Environmental sensor placement with convolutional Gaussian neural processes

Author: Andersson Tom R.
Bruinsma Wessel P.
Coca-Castro Alejandro
Ellis Anna-Louise
Hosking Scott
Jones Dani
Lazzara Matthew A.
Markou Stratis
Requeima James
Turner Richard E.
Vaughan Anna
Publication venue: Cambridge University Press
Publication date: 03/08/2023

Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change. However, it is challenging to place sensors in a way that maximises the informativeness of their measurements, particularly in remote regions like Antarctica. Probabilistic machine learning models can suggest informative sensor placements by finding sites that maximally reduce prediction uncertainty. Gaussian process (GP) models are widely used for this purpose, but they struggle with capturing complex non-stationary behaviour and scaling to large datasets. This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues. A ConvGNP uses neural networks to parameterise a joint Gaussian distribution at arbitrary target locations, enabling flexibility and scalability. Using simulated surface air temperature anomaly over Antarctica as training data, the ConvGNP learns spatial and seasonal non-stationarities, outperforming a non-stationary GP baseline. In a simulated sensor placement experiment, the ConvGNP better predicts the performance boost obtained from new observations than GP baselines, leading to more informative sensor placements. We contrast our approach with physics-based sensor placement methods and propose future steps towards an operational sensor placement recommendation system. Our work could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality.
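The idea of "finding sites that maximally reduce prediction uncertainty" can be sketched with a stationary GP, which corresponds to the kind of baseline the abstract mentions rather than to the ConvGNP itself. Everything below is an assumption made for illustration: the one-dimensional site coordinates, the RBF kernel, the noise level, and the function names `rbf` and `greedy_placement`. The sketch greedily picks the site whose noisy observation removes the most total posterior variance, then applies the standard rank-one Gaussian conditioning update.

```python
import math


def rbf(a, b, lengthscale=1.0):
    """Assumed RBF kernel with unit prior variance."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)


def greedy_placement(sites, n_sensors, noise=0.01, lengthscale=1.0):
    """Greedily choose sensor sites; return (chosen indices, remaining total variance)."""
    n = len(sites)
    # Prior covariance between all candidate sites (also used as target sites).
    cov = [[rbf(sites[i], sites[j], lengthscale) for j in range(n)] for i in range(n)]
    chosen = []
    for _ in range(n_sensors):
        best, best_gain = None, -1.0
        for s in range(n):
            if s in chosen:
                continue
            # Total variance removed across all targets by observing site s.
            gain = sum(cov[t][s] ** 2 for t in range(n)) / (cov[s][s] + noise)
            if gain > best_gain:
                best, best_gain = s, gain
        chosen.append(best)
        # Rank-one update: condition the covariance on a noisy observation at `best`.
        denom = cov[best][best] + noise
        col = [cov[t][best] for t in range(n)]
        cov = [[cov[i][j] - col[i] * col[j] / denom for j in range(n)] for i in range(n)]
    return chosen, sum(cov[i][i] for i in range(n))
```

For example, `greedy_placement([float(i) for i in range(11)], 3)` returns three distinct site indices along with the total variance left after conditioning on them. Because a GP's posterior variance does not depend on the observed values, this criterion can be evaluated before any measurement is taken.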

NERC Open Research Archive

Efficient Gaussian Neural Processes for Regression

Author: Bruinsma Wessel
Markou Stratis
Requeima James
Turner Richard
Publication venue: https://sites.google.com/view/udlworkshop2021/home
Publication date: 22/08/2021

Conditional Neural Processes (CNPs; Garnelo et al., 2018) are an attractive family of meta-learning models which produce well-calibrated predictions, enable fast inference at test time, and are trainable via a simple maximum likelihood procedure. A limitation of CNPs is their inability to model dependencies in the outputs. This significantly hurts predictive performance and renders it impossible to draw coherent function samples, which limits the applicability of CNPs in downstream applications and decision making. Neural Processes (NPs; Garnelo et al., 2018) attempt to alleviate this issue by using latent variables, relying on these to model output dependencies, but this introduces difficulties stemming from approximate inference. One recent alternative (Bruinsma et al., 2021), which we refer to as the FullConvGNP, models dependencies in the predictions while still being trainable via exact maximum likelihood. Unfortunately, the FullConvGNP relies on expensive $2D$-dimensional convolutions (twice the input dimensionality $D$), which limit its applicability to only one-dimensional data. In this work, we present an alternative way to model output dependencies which also lends itself to maximum likelihood training but, unlike the FullConvGNP, can be scaled to two- and three-dimensional data. The proposed models exhibit good performance in synthetic experiments.

arXiv.org e-Print Archive

Apollo (Cambridge)

Practical Conditional Neural Processes Via Tractable Dependent Predictions

Author: Bruinsma Wessel P.
Markou Stratis
Requeima James
Turner Richard E.
Vaughan Anna
Publication date: 13/06/2022

Conditional Neural Processes (CNPs; Garnelo et al., 2018a) are meta-learning models which leverage the flexibility of deep learning to produce well-calibrated predictions and naturally handle off-the-grid and missing data. CNPs scale to large datasets and train with ease. Due to these features, CNPs appear well-suited to tasks from environmental sciences or healthcare. Unfortunately, CNPs do not produce correlated predictions, making them fundamentally inappropriate for many estimation and decision making tasks. Predicting heat waves or floods, for example, requires modelling dependencies in temperature or precipitation over time and space. Existing approaches which model output dependencies, such as Neural Processes (NPs; Garnelo et al., 2018b) or the FullConvGNP (Bruinsma et al., 2021), are either complicated to train or prohibitively expensive. What is needed is an approach which provides dependent predictions, but is simple to train and computationally tractable. In this work, we present a new class of Neural Process models that make correlated predictions and support exact maximum likelihood training that is simple and scalable. We extend the proposed models by using invertible output transformations, to capture non-Gaussian output distributions. Our models can be used in downstream estimation tasks which require dependent function samples. By accounting for output dependencies, our models show improved predictive performance on a range of experiments with synthetic and real data.

Comment: 23 pages; accepted to the 10th International Conference on Learning Representations (ICLR 2022)

arXiv.org e-Print Archive

Autoregressive Conditional Neural Processes

Author: Andersson Tom R.
Bruinsma Wessel P.
Buonomo Anthony
Foong Andrew Y.K.
Hosking J. Scott
Markou Stratis
Requeima James
Turner Richard E.
Vaughan Anna
Publication date: 01/03/2023

Conditional neural processes (CNPs; Garnelo et al., 2018a) are attractive meta-learning models which produce well-calibrated predictions and are trainable via a simple maximum likelihood procedure. Although CNPs have many advantages, they are unable to model dependencies in their predictions. Various works propose solutions to this, but these come at the cost of either requiring approximate inference or being limited to Gaussian predictions. In this work, we instead propose to change how CNPs are deployed at test time, without any modifications to the model or training procedure. Instead of making predictions independently for every target point, we autoregressively define a joint predictive distribution using the chain rule of probability, taking inspiration from the neural autoregressive density estimator (NADE) literature. We show that this simple procedure allows factorised Gaussian CNPs to model highly dependent, non-Gaussian predictive distributions. Perhaps surprisingly, in an extensive range of tasks with synthetic and real data, we show that CNPs in autoregressive (AR) mode not only significantly outperform non-AR CNPs, but are also competitive with more sophisticated models that are significantly more computationally expensive and challenging to train. This performance is remarkable given that AR CNPs are not trained to model joint dependencies. Our work provides an example of how ideas from neural distribution estimation can benefit neural processes, and motivates research into the AR deployment of other neural process models.
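The autoregressive deployment described in the abstract above can be sketched as follows. The predictor `toy_cnp` is a hypothetical stand-in for a trained CNP, not the authors' model: it returns a Gaussian marginal at a target input by conditioning on the nearest context point under an assumed RBF correlation. The two sampling functions contrast independent prediction with the chain-rule rollout, in which each sampled value is appended to the context before the next target is predicted.

```python
import math
import random


def toy_cnp(context, x):
    """Hypothetical stand-in for a trained CNP: (mean, variance) of the marginal at x."""
    if not context:
        return 0.0, 1.0
    xc, yc = min(context, key=lambda c: abs(c[0] - x))  # nearest context point
    rho = math.exp(-0.5 * (xc - x) ** 2)  # assumed RBF correlation
    return rho * yc, max(1.0 - rho ** 2, 1e-9)


def sample_independent(context, targets, rng):
    """Standard deployment: every target is sampled from its own marginal."""
    out = []
    for x in targets:
        mean, var = toy_cnp(context, x)
        out.append(rng.gauss(mean, math.sqrt(var)))
    return out


def sample_autoregressive(context, targets, rng):
    """AR deployment: feed each sample back in as context (chain rule of probability)."""
    context = list(context)  # copy so the caller's context is not modified
    out = []
    for x in targets:
        mean, var = toy_cnp(context, x)
        y = rng.gauss(mean, math.sqrt(var))
        out.append(y)
        context.append((x, y))
    return out


def correlation(pairs):
    """Empirical Pearson correlation between the two entries of each pair."""
    n = len(pairs)
    ma = sum(p[0] for p in pairs) / n
    mb = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - ma) * (p[1] - mb) for p in pairs) / n
    va = sum((p[0] - ma) ** 2 for p in pairs) / n
    vb = sum((p[1] - mb) ** 2 for p in pairs) / n
    return cov / math.sqrt(va * vb)
```

Drawing many samples at two nearby targets such as `[0.0, 0.5]` with an empty context shows the difference: the independent samples are close to uncorrelated, while the autoregressive samples are strongly correlated, even though the predictor itself only ever outputs one-dimensional marginals.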

NERC Open Research Archive