The Gaussian Process Autoregressive Regression Model (GPAR)
Multi-output regression models must exploit dependencies between outputs to maximise predictive performance. The application of Gaussian processes (GPs) to this setting typically yields models that are computationally demanding and have limited representational power. We present the Gaussian Process Autoregressive Regression (GPAR) model, a scalable multi-output GP model that is able to capture nonlinear, possibly input-varying, dependencies between outputs in a simple and tractable way: the product rule is used to decompose the joint distribution over the outputs into a set of conditionals, each of which is modelled by a standard GP. GPAR's efficacy is demonstrated on a variety of synthetic and real-world problems, outperforming existing GP models and achieving state-of-the-art performance on established benchmarks.
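To make the decomposition concrete, here is a minimal sketch of GPAR-style autoregressive conditioning using off-the-shelf scikit-learn GPs; the data and variable names are illustrative, not the paper's reference implementation.

```python
# A minimal sketch of GPAR's product-rule decomposition with
# off-the-shelf scikit-learn GPs. Data and names are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)[:, None]              # shared inputs
y1 = np.sin(2 * np.pi * x[:, 0])                # first output
y2 = y1 ** 2 + 0.1 * rng.standard_normal(50)    # depends nonlinearly on y1

# p(y1, y2 | x) = p(y1 | x) p(y2 | x, y1): each factor is a standard GP.
gp1 = GaussianProcessRegressor(kernel=RBF()).fit(x, y1)
gp2 = GaussianProcessRegressor(kernel=RBF()).fit(
    np.hstack([x, y1[:, None]]), y2)            # y1 acts as an extra input

# Predict autoregressively: predict y1 first, then feed it into gp2.
x_new = np.linspace(0, 1, 20)[:, None]
m1 = gp1.predict(x_new)
m2 = gp2.predict(np.hstack([x_new, m1[:, None]]))
```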
Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes
The goal of this paper is to design image classification systems that, after an initial multi-task training phase, can automatically adapt to new tasks encountered at test time. We introduce an approach based on conditional neural processes for the multi-task classification setting, and establish connections to the meta-learning and few-shot learning literature. The resulting approach, called CNAPs, comprises a classifier whose parameters are modulated by an adaptation network that takes the current task's dataset as input. We demonstrate that CNAPs achieves state-of-the-art results on the challenging Meta-Dataset benchmark, indicating high-quality transfer learning. We show that the approach is robust, avoiding both over-fitting in low-shot regimes and under-fitting in high-shot regimes. Timing experiments reveal that CNAPs is computationally efficient at test time because it does not involve gradient-based adaptation. Finally, we show that trained models are immediately deployable to continual learning and active learning, where they can outperform existing approaches that do not leverage transfer learning.
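As a rough illustration of "a classifier whose parameters are modulated by an adaptation network", the sketch below conditions FiLM-style per-channel scales and shifts on a pooled task embedding; the module names and shapes are assumptions for illustration, not the CNAPs codebase.

```python
# A minimal sketch of the CNAPs idea: an adaptation network maps a
# task's context set to per-layer FiLM parameters that modulate a
# fixed feature extractor. Shapes and module names are illustrative.
import torch
import torch.nn as nn

class AdaptationNetwork(nn.Module):
    """Pools the context set into a task embedding, then emits
    per-channel scale/shift (FiLM) parameters for one conv layer."""
    def __init__(self, feat_dim, num_channels):
        super().__init__()
        self.embed = nn.Linear(feat_dim, 64)
        self.to_gamma = nn.Linear(64, num_channels)
        self.to_beta = nn.Linear(64, num_channels)

    def forward(self, context_feats):              # (N_context, feat_dim)
        z = self.embed(context_feats).mean(0)      # mean-pool over the task
        return self.to_gamma(z), self.to_beta(z)   # (C,), (C,)

def film(h, gamma, beta):
    # h: (B, C, H, W); modulate channels with task-conditioned parameters.
    return gamma.view(1, -1, 1, 1) * h + beta.view(1, -1, 1, 1)

# No gradient steps at test time: adaptation is a single forward pass.
adapt = AdaptationNetwork(feat_dim=512, num_channels=64)
context = torch.randn(25, 512)                     # 25 context examples
gamma, beta = adapt(context)
features = film(torch.randn(8, 64, 14, 14), gamma, beta)
```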
Sim2Real for Environmental Neural Processes
Machine learning (ML)-based weather models have recently undergone rapid improvements. These models are typically trained on gridded reanalysis data from numerical data assimilation systems. However, reanalysis data comes with limitations, such as assumptions about physical laws and low spatiotemporal resolution. The gap between reanalysis and reality has sparked growing interest in training ML models directly on observations such as weather stations. Modelling scattered and sparse environmental observations requires scalable and flexible ML architectures, one of which is the convolutional conditional neural process (ConvCNP). ConvCNPs can learn to condition on both gridded and off-the-grid context data to make uncertainty-aware predictions at target locations. However, the sparsity of real observations presents a challenge for data-hungry deep learning models like the ConvCNP. One potential solution is 'Sim2Real': pre-training on reanalysis and fine-tuning on observational data. We analyse Sim2Real with a ConvCNP trained to interpolate surface air temperature over Germany, using varying numbers of weather stations for fine-tuning. On held-out weather stations, Sim2Real training substantially outperforms the same model architecture trained only with reanalysis data or only with station data, showing that reanalysis data can serve as a stepping stone for learning from real observations. Sim2Real could thus enable more accurate models for weather prediction and climate monitoring.
Comment: 4 pages, 3 figures. To be published in the Tackling Climate Change with Machine Learning workshop at NeurIPS.
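A toy sketch of the two-stage Sim2Real recipe follows, with a stand-in for the ConvCNP and synthetic tasks; the model, data, and hyperparameters are all placeholder assumptions, not the paper's setup.

```python
# A toy sketch of Sim2Real: pre-train on abundant 'reanalysis' tasks,
# then fine-tune on scarce 'station' tasks. Everything here is a
# placeholder, not the paper's code.
import torch
import torch.nn as nn

class TinyCNP(nn.Module):
    """Stand-in for a ConvCNP: maps context features to a predictive
    Gaussian at the targets."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, context):
        out = self.net(context)
        return torch.distributions.Normal(out[..., 0], out[..., 1].exp())

def train(model, tasks, epochs, lr):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for context, target in tasks:
            loss = -model(context).log_prob(target).mean()  # negative log lik.
            opt.zero_grad()
            loss.backward()
            opt.step()

model = TinyCNP()
reanalysis_tasks = [(torch.randn(128, 16), torch.randn(128))]  # dense, plentiful
station_tasks = [(torch.randn(8, 16), torch.randn(8))]         # sparse, scarce

train(model, reanalysis_tasks, epochs=100, lr=1e-3)  # stage 1: pre-train (sim)
train(model, station_tasks, epochs=20, lr=1e-4)      # stage 2: fine-tune (real)
```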
Environmental Sensor Placement with Convolutional Gaussian Neural Processes
Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change. However, it is challenging to maximise measurement informativeness and place sensors efficiently, particularly in remote regions like Antarctica. Probabilistic machine learning models can evaluate placement informativeness by predicting the uncertainty reduction provided by a new sensor. Gaussian process (GP) models are widely used for this purpose, but they struggle with capturing complex non-stationary behaviour and scaling to large datasets. This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues. A ConvGNP uses neural networks to parameterise a joint Gaussian distribution at arbitrary target locations, enabling flexibility and scalability. Using simulated surface air temperature anomaly over Antarctica as ground truth, the ConvGNP learns spatial and seasonal non-stationarities, outperforming a non-stationary GP baseline. In a simulated sensor placement experiment, the ConvGNP better predicts the performance boost obtained from new observations than GP baselines, leading to more informative sensor placements. We contrast our approach with physics-based sensor placement methods and propose future work towards an operational sensor placement recommendation system. This system could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality.
Comment: In review for the Climate Informatics 2023 special issue of Environmental Data Science.
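For intuition, the following sketch implements the generic greedy placement criterion described above (choosing sites that most reduce total predictive variance over targets), using a plain GP as the probabilistic model rather than a ConvGNP; the sites and ground-truth function are synthetic assumptions.

```python
# A sketch of greedy variance-reduction sensor placement with a GP as
# the uncertainty model. Candidate sites and the 'truth' are synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
candidates = rng.uniform(0, 1, size=(100, 2))    # candidate sensor sites
targets = rng.uniform(0, 1, size=(400, 2))       # locations to predict at
truth = lambda s: np.sin(4 * s[:, 0]) * np.cos(4 * s[:, 1])

placed = []
for _ in range(5):                               # greedily place 5 sensors
    scores = []
    for site in candidates:
        X = np.array(placed + [site])            # hypothetical sensor network
        gp = GaussianProcessRegressor().fit(X, truth(X))
        _, std = gp.predict(targets, return_std=True)
        scores.append((std ** 2).sum())          # remaining target uncertainty
    best = int(np.argmin(scores))                # site that reduces it most
    placed.append(candidates[best])
    candidates = np.delete(candidates, best, axis=0)
```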
Environmental sensor placement with convolutional Gaussian neural processes
Environmental sensors are crucial for monitoring weather conditions and the impacts of climate change. However, it is challenging to place sensors in a way that maximises the informativeness of their measurements, particularly in remote regions like Antarctica. Probabilistic machine learning models can suggest informative sensor placements by finding sites that maximally reduce prediction uncertainty. Gaussian process (GP) models are widely used for this purpose, but they struggle with capturing complex non-stationary behaviour and scaling to large datasets. This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues. A ConvGNP uses neural networks to parameterise a joint Gaussian distribution at arbitrary target locations, enabling flexibility and scalability. Using simulated surface air temperature anomaly over Antarctica as training data, the ConvGNP learns spatial and seasonal non-stationarities, outperforming a non-stationary GP baseline. In a simulated sensor placement experiment, the ConvGNP better predicts the performance boost obtained from new observations than GP baselines, leading to more informative sensor placements. We contrast our approach with physics-based sensor placement methods and propose future steps towards an operational sensor placement recommendation system. Our work could help to realise environmental digital twins that actively direct measurement sampling to improve the digital representation of reality.
The Neural Processes Family: Translation Equivariance and Output Dependencies
Most contemporary machine learning approaches use a model trained from scratch on a particular task and a learning algorithm designed by hand. This approach has worked very well with the advent of deep learning and in the presence of very large datasets (Goodfellow et al., 2016). Recently, meta-learning has emerged as a machine learning approach to learn both a model and a learning algorithm (Hospedales et al., 2021; Schmidhuber, 1987) directly from data. Neural processes (Garnelo et al., 2018a,b) are a family of meta-learning models which combine the flexibility of deep learning with the uncertainty awareness of probabilistic models. Training using meta-learning allows neural processes to apply deep neural networks to applications with smaller training sets where they would typically overfit. Neural processes produce well-calibrated predictions, enable fast inference at test time, and have flexible data-handling properties that make them a good candidate for messy real-world datasets and applications.
However, this thesis focuses on addressing two shortcomings that arise when applying neural processes to real-world applications: i) it incorporates translation equivariance into the architecture of neural processes, rather than requiring the model to learn this inductive bias directly from data, and ii) it develops methods for neural processes to parameterise rich predictive distributions that can model dependencies between output-space variables and produce coherent samples.
This thesis makes four main contributions to the neural process family. First, we introduce the convolutional conditional neural process (ConvCNP). The ConvCNP incorporates translation equivariance into its modelling assumptions by using convolutional neural networks, improving training data efficiency and performance when the data are approximately stationary. Second, we propose the latent-variable version of the ConvCNP, the convolutional latent neural process (ConvLNP), which can model epistemic uncertainty and output-space dependencies and produce coherent function samples. We also propose an approximate maximum likelihood training procedure for the ConvLNP, improving upon the standard variational inference technique used by latent neural processes at the time. Third, we propose the Gaussian neural process (GNP), which models the predictive distribution with a full-covariance Gaussian. The GNP can model joint output-space dependencies like the ConvLNP but avoids the issues associated with latent variables. Training GNPs is also much simpler than training ConvLNPs, since it uses the same maximum likelihood technique as standard conditional neural processes. Fourth, we introduce the autoregressive neural process (AR NP). Rather than proposing a new architecture, this method produces predictions at test time by evaluating existing neural process models autoregressively via the product rule of probability. It allows existing, potentially already trained neural processes to model non-Gaussian predictive distributions and produce coherent samples without any modification to the architecture or training procedure.
The efficacy of each of these methods is demonstrated through a series of synthetic and real-world experiments in climate science, population modelling, and medical science applications. These applications show that incorporating translation equivariance as a modelling assumption and modelling output-space dependencies in the predictive distribution both improve predictive performance.
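To illustrate the AR NP idea, the sketch below rolls out an already-trained conditional model one target at a time via the product rule; the `model(xs, ys, x_target)` interface and the stub model are assumptions, not the thesis's actual API.

```python
# A sketch of autoregressive prediction with an existing conditional
# model: sample targets one at a time and append each sample to the
# context. The model interface here is assumed, not the thesis's API.
import torch

def ar_sample(model, context_x, context_y, target_x):
    """Joint sample over target_x via the product rule:
    p(y_1..y_n | C) = prod_i p(y_i | x_i, C + earlier (x, y) pairs)."""
    xs, ys = context_x.clone(), context_y.clone()
    samples = []
    for x in target_x:                    # one target location at a time
        dist = model(xs, ys, x[None])     # marginal predictive at x
        y = dist.sample()                 # sample, then condition on it
        samples.append(y)
        xs = torch.cat([xs, x[None]])
        ys = torch.cat([ys, y])
    return torch.stack(samples)

class StubCNP(torch.nn.Module):
    """Toy conditional model: predicts the running mean of the context."""
    def forward(self, xs, ys, x_target):
        m = ys.mean().expand(x_target.shape[0])
        return torch.distributions.Normal(m, torch.ones_like(m))

coherent = ar_sample(StubCNP(), torch.randn(10, 1), torch.randn(10),
                     torch.randn(5, 1))
```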
Relative sectional curvature in compact angled 2-complexes
We define the notion of relative sectional curvature for 2-complexes, and prove that a compact angled 2-complex that has negative sectional curvature relative to planar sections has coherent fundamental group. We analyze a certain type of 1-complex that we call flattenable graphs Γ → X for a compact angled 2-complex X, and show that if X has nonpositive sectional curvature, and if for every flattenable graph Γ → X the map π₁(Γ) → π₁(X) is finitely presented, then X has coherent fundamental group. Finally, we show that if X is a compact angled 2-complex with negative sectional curvature relative to π-gons and planar sections, then π₁(X) is coherent. Some results are provided which are useful for creating examples of 2-complexes with these properties, or for testing a 2-complex for these properties.
Efficient Gaussian Neural Processes for Regression
Conditional Neural Processes (CNPs; Garnelo et al., 2018) are an attractive family of meta-learning models which produce well-calibrated predictions, enable fast inference at test time, and are trainable via a simple maximum likelihood procedure. A limitation of CNPs is their inability to model dependencies in the outputs. This significantly hurts predictive performance and renders it impossible to draw coherent function samples, which limits the applicability of CNPs in downstream applications and decision making. Neural Processes (NPs; Garnelo et al., 2018) attempt to alleviate this issue by using latent variables to model output dependencies, but this introduces difficulties stemming from approximate inference. One recent alternative (Bruinsma et al., 2021), which we refer to as the FullConvGNP, models dependencies in the predictions while still being trainable via exact maximum likelihood. Unfortunately, the FullConvGNP relies on expensive 2D-dimensional convolutions (convolutions over twice the input dimension D), which in practice limits its applicability to one-dimensional data. In this work, we present an alternative way to model output dependencies which also lends itself to maximum likelihood training but, unlike the FullConvGNP, can be scaled to two- and three-dimensional data. The proposed models exhibit good performance in synthetic experiments.
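As a rough illustration of the kind of construction described here, the sketch below parameterises a joint Gaussian over targets with a network that outputs a mean and a low-rank covariance factor, trainable by exact maximum likelihood; context conditioning is omitted for brevity, and all names and sizes are assumptions, not the paper's architecture.

```python
# A sketch of a GNP-style predictive: a network outputs a mean and a
# low-rank covariance factor per target, giving a full joint Gaussian.
import torch
import torch.nn as nn

class ToyGNP(nn.Module):
    def __init__(self, rank=8):
        super().__init__()
        self.mean_net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
        self.feat_net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, rank))
        self.log_noise = nn.Parameter(torch.zeros(()))

    def forward(self, x_target):                    # (N, 1) target inputs
        mean = self.mean_net(x_target).squeeze(-1)  # (N,)
        feats = self.feat_net(x_target)             # (N, rank)
        cov = feats @ feats.T                       # low-rank output dependencies
        cov = cov + self.log_noise.exp() * torch.eye(len(x_target))
        return torch.distributions.MultivariateNormal(mean, cov)

model = ToyGNP()
dist = model(torch.linspace(-1, 1, 20)[:, None])
nll = -dist.log_prob(torch.randn(20))    # exact maximum-likelihood objective
sample = dist.sample()                   # coherent, correlated joint sample
```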