Dimension reduction for Gaussian process emulation: an application to the influence of bathymetry on tsunami heights
High accuracy complex computer models, or simulators, require large resources
in time and memory to produce realistic results. Statistical emulators are
computationally cheap approximations of such simulators. They can be built to
replace simulators for various purposes, such as the propagation of
uncertainties from inputs to outputs or the calibration of some internal
parameters against observations. However, when the input space is of high
dimension, the construction of an emulator can become prohibitively expensive.
In this paper, we introduce a joint framework merging emulation with dimension
reduction in order to overcome this hurdle. The gradient-based kernel dimension
reduction technique is chosen due to its ability to drastically decrease
dimensionality with little loss in information. The Gaussian process emulation
technique is combined with this dimension reduction approach. Our proposed
approach provides an answer to the dimension reduction issue in emulation for a
wide range of simulation problems that cannot be tackled using existing
methods. The efficiency and accuracy of the proposed framework are demonstrated
theoretically and compared with other methods on an elliptic partial
differential equation (PDE) problem. We finally present a realistic application
to tsunami modeling. The uncertainties in the bathymetry (seafloor elevation)
are modeled as high-dimensional realizations of a spatial process using a
geostatistical approach. Our dimension-reduced emulation enables us to compute
the impact of these uncertainties on resulting possible tsunami wave heights
near-shore and on-shore. We observe a significant increase in the spread of
uncertainties in the tsunami heights due to the contribution of the bathymetry
uncertainties. These results highlight the need to include the effect of
uncertainties in the bathymetry in tsunami early warnings and risk assessments.
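As a rough illustration of the dimension-reduced emulation workflow, the sketch below projects high-dimensional inputs onto a few components and fits a Gaussian process on the reduced space. PCA is used here only as a simple stand-in for the gradient-based kernel dimension reduction of the paper, and `simulator` is a hypothetical placeholder for an expensive model.

```python
# Minimal sketch of dimension-reduced emulation: project high-dimensional
# inputs to a low-dimensional subspace, then fit a Gaussian process emulator.
# PCA is a stand-in for gradient-based kernel dimension reduction (gKDR);
# `simulator` is a hypothetical expensive computer model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def simulator(x):
    # placeholder for an expensive simulator with a high-dimensional input
    return np.sum(np.sin(x[:5])) + 0.1 * np.sum(x)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 100))   # 200 runs, 100-dimensional inputs
y = np.array([simulator(x) for x in X])

# Step 1: reduce the input dimension (gKDR in the paper; PCA here for brevity).
reducer = PCA(n_components=5).fit(X)
Z = reducer.transform(X)

# Step 2: emulate the simulator on the reduced inputs with a GP.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(Z, y)

# Predict (with uncertainty) at new high-dimensional inputs via the same projection.
X_new = rng.uniform(-1.0, 1.0, size=(10, 100))
mean, sd = gp.predict(reducer.transform(X_new), return_std=True)
```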
Sequential Design with Mutual Information for Computer Experiments (MICE): Emulation of a Tsunami Model
Computer simulators can be computationally intensive to run over a large
number of input values, as required for optimization and various uncertainty
quantification tasks. The standard paradigm for the design and analysis of
computer experiments is to employ Gaussian random fields to model computer
simulators. Gaussian process models are trained on input-output data obtained
from simulation runs at various input values. Following this approach, we
propose a sequential design algorithm, MICE (Mutual Information for Computer
Experiments), that adaptively selects the input values at which to run the
computer simulator, in order to maximize the expected information gain (mutual
information) over the input space. The superior computational efficiency of the
MICE algorithm compared with other algorithms is demonstrated on test functions
and on a tsunami simulator, with overall gains of up to 20% in the latter case.
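The sketch below illustrates a sequential design loop in the spirit of the MICE criterion: each new run is placed at the candidate point maximizing the ratio of its predictive variance given the current design to its predictive variance given the design plus the remaining candidates (with a nugget). Kernel hyperparameters are fixed for brevity, so this is an assumption-laden illustration rather than the published algorithm.

```python
# Sequential design in the spirit of MICE, with a fixed RBF kernel.
import numpy as np

def rbf(A, B, ls=0.3):
    """Squared-exponential kernel matrix between point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def pred_var(x, X, nugget=1e-6):
    """GP predictive variance at points x given design X (zero mean, unit variance)."""
    K = rbf(X, X) + nugget * np.eye(len(X))
    k = rbf(x, X)
    return 1.0 - np.einsum('ij,jk,ik->i', k, np.linalg.inv(K), k)

rng = np.random.default_rng(1)
cand = rng.uniform(0.0, 1.0, size=(80, 2))   # candidate inputs on [0, 1]^2
design = cand[:5].copy()                     # small initial design
tau2 = 1.0                                   # nugget used in the denominator term

for step in range(15):
    remaining = np.array([c for c in cand
                          if not any(np.allclose(c, d) for d in design)])
    num = pred_var(remaining, design)        # variance given the current design
    scores = []
    for i, x in enumerate(remaining):
        others = np.delete(remaining, i, axis=0)
        den = pred_var(x[None, :], np.vstack([design, others]), nugget=tau2)[0]
        scores.append(num[i] / den)
    design = np.vstack([design, remaining[int(np.argmax(scores))]])

# `design` now lists the inputs at which to run the simulator next.
```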
Efficient spatial modelling using the SPDE approach with bivariate splines
Gaussian fields (GFs) are frequently used in spatial statistics for their
versatility. The associated computational cost can be a bottleneck, especially
in realistic applications. It has been shown that computational efficiency can
be gained by carrying out the computations with Gaussian Markov random fields
(GMRFs), since the GFs can be seen as weak solutions of corresponding stochastic
partial differential equations (SPDEs) using piecewise linear finite elements. We
introduce a new class of representations of GFs with bivariate splines instead
of finite elements. This allows an easier implementation of piecewise
polynomial representations of various degrees. It leads to GMRFs that can be
inferred efficiently and can be easily extended to non-stationary fields. The
solutions approximated with higher order bivariate splines converge faster,
hence the computational cost can be alleviated. Numerical simulations using
both real and simulated data also demonstrate that our framework improves
flexibility and efficiency.
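For orientation, the block below recalls in hedged form the standard SPDE-GMRF construction that the paper generalizes from piecewise linear finite elements to bivariate splines: a basis expansion of the field turns the SPDE into a sparse precision matrix. The α = 2 formula is quoted from the usual finite element construction, with C often replaced by its lumped diagonal approximation.

```latex
% A Matern field x(s) is a stationary solution of the SPDE below, and a basis
% expansion (finite elements originally; bivariate splines in this paper)
% yields Gaussian weights w with a sparse precision matrix.
\[
(\kappa^2 - \Delta)^{\alpha/2}\bigl(\tau\,x(s)\bigr) = \mathcal{W}(s),
\qquad
x(s) \approx \sum_k w_k\,\psi_k(s),
\]
\[
Q_{\alpha=2} = \tau^2\left(\kappa^4 C + 2\kappa^2 G + G C^{-1} G\right),
\qquad
C_{ij} = \int \psi_i\,\psi_j\,\mathrm{d}s,
\quad
G_{ij} = \int \nabla\psi_i \cdot \nabla\psi_j\,\mathrm{d}s.
\]
```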
Linked Gaussian Process Emulation for Systems of Computer Models Using Matérn Kernels and Adaptive Design
The state-of-the-art linked Gaussian process offers a way to build analytical
emulators for systems of computer models. We generalize the closed form
expressions for the linked Gaussian process under the squared exponential
kernel to a class of Matérn kernels that are essential in advanced
applications. An iterative procedure to construct linked Gaussian processes as
surrogate models for any feed-forward system of computer models is presented
and illustrated on a feedback-coupled satellite system. We also introduce an
adaptive design algorithm that can increase the approximation accuracy of
linked Gaussian process surrogates while reducing the computational cost of
running expensive computer systems, by allocating runs among individual
sub-models and refining their emulators according to their heterogeneous
functional complexity.
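To make the notion of linking concrete, the sketch below chains two independently trained GP emulators of a hypothetical feed-forward pair of models by Monte Carlo propagation; the linked GP of the paper instead gives closed-form mean and variance under Matérn kernels, which avoids this sampling step.

```python
# Naive Monte Carlo linking of two GP emulators for a feed-forward system
# f2(f1(x)); `f1` and `f2` are hypothetical sub-models of the system.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

f1 = lambda x: np.sin(3 * x)        # first computer model
f2 = lambda u: np.exp(-u ** 2)      # second computer model, fed by f1

rng = np.random.default_rng(2)
X1 = rng.uniform(0, 1, (15, 1)); y1 = f1(X1).ravel()
X2 = rng.uniform(-1, 1, (15, 1)); y2 = f2(X2).ravel()

gp1 = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X1, y1)
gp2 = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X2, y2)

def linked_predict_mc(x, n_samples=500):
    """Propagate gp1's predictive uncertainty through gp2 by sampling."""
    m1, s1 = gp1.predict(np.atleast_2d(x), return_std=True)
    u = rng.normal(m1, s1, size=(n_samples, 1))    # samples of the intermediate output
    m2, s2 = gp2.predict(u, return_std=True)
    mean = m2.mean()
    var = (s2 ** 2 + m2 ** 2).mean() - mean ** 2   # law of total variance
    return mean, var

mean, var = linked_predict_mc([[0.3]])
```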
Deep Gaussian Process Emulation using Stochastic Imputation
We propose a novel deep Gaussian process (DGP) inference method for computer
model emulation using stochastic imputation. By stochastically imputing the
latent layers, the approach transforms the DGP into the linked GP, a
state-of-the-art surrogate model formed by linking a system of feed-forward
coupled GPs. This transformation yields a simple yet efficient DGP training
procedure that involves only the optimization of conventional stationary GPs. In
addition, the analytically tractable mean and variance of the linked GP allow
one to implement predictions from DGP emulators in a fast and accurate manner.
We demonstrate the method in a series of synthetic examples and real-world
applications, and show that it is a competitive candidate for efficient DGP
surrogate modeling in comparison to variational inference and fully Bayesian
approaches. A package implementing the method is available at
https://github.com/mingdeyu/DGP.
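The sketch below is only a didactic caricature of the stochastic imputation idea: the latent layer of a two-layer DGP is repeatedly imputed (here by naive resampling around the inner GP's prediction, as a stand-in for the elliptical slice sampling used in practice), and each layer is then trained as a conventional GP. The actual procedure and the linked GP predictions are implemented in the package linked above.

```python
# Caricature of stochastic imputation for a two-layer DGP: alternate between
# training each layer as a conventional GP and re-imputing the latent layer.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, (40, 1))
y = np.sin(8 * X).ravel() * np.sign(X.ravel() - 0.5)   # a non-stationary-looking response

W = X.copy()   # initialize the latent layer at the inputs themselves
for it in range(10):
    # Train the two layers conditional on the current imputation.
    gp_inner = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, W.ravel())
    gp_outer = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(W, y)
    # Re-impute the latent layer by sampling around the inner GP's prediction
    # (a crude stand-in for the sampling scheme used in practice).
    m, s = gp_inner.predict(X, return_std=True)
    W = (m + rng.normal(0.0, s)).reshape(-1, 1)

# Predictions chain the two conventional GPs through the imputed layer.
x_new = np.linspace(0, 1, 5).reshape(-1, 1)
w_new = gp_inner.predict(x_new).reshape(-1, 1)
y_mean = gp_outer.predict(w_new)
```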
Embedding machine-learnt sub-grid variability improves climate model biases
The under-representation of cloud formation is a long-standing bias
associated with climate simulations. Parameterisation schemes are required to
capture cloud processes within current climate models but have known biases. We
overcome these biases by embedding a Multi-Output Gaussian Process (MOGP)
trained on high resolution Unified Model simulations to represent the
variability of temperature and specific humidity within a climate model. A
trained MOGP model is coupled in-situ with a simplified Atmospheric General
Circulation Model named SPEEDY. The temperature and specific humidity profiles
of SPEEDY are perturbed at fixed intervals according to the variability
predicted from the MOGP. Ten-year predictions are generated for both control
and ML-hybrid models. The hybrid model reduces the global precipitation bias by
18% and over the tropics by 22%. To further understand the drivers of these
improvements, physical quantities of interest are explored, such as the
distribution of lifted index values and the alteration of the Hadley cell. The
control and hybrid set-ups are also run in a plus 4K sea-surface temperature
experiment to explore the effects of the approach on patterns relating to cloud
cover and precipitation in a warmed climate setting.
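Schematically, the in-situ coupling amounts to the loop sketched below, where the host model's temperature and humidity profiles are perturbed at fixed intervals by samples from a trained multi-output GP; all names (`step_dynamics`, `mogp_sample_variability`, the profile sizes) are hypothetical placeholders rather than the actual SPEEDY or MOGP interfaces used in the study.

```python
# Schematic hybrid coupling loop: perturb T and q profiles at fixed intervals
# with samples from a (placeholder) multi-output GP of sub-grid variability.
import numpy as np

rng = np.random.default_rng(4)
n_levels = 8
state = {"T": 250.0 + 20.0 * rng.random(n_levels),   # temperature profile (K)
         "q": 1e-3 * rng.random(n_levels)}           # specific humidity (kg/kg)

def step_dynamics(state):
    """Placeholder for one time step of the host GCM (e.g. SPEEDY)."""
    state["T"] += 0.01 * rng.standard_normal(n_levels)
    state["q"] = np.clip(state["q"] + 1e-6 * rng.standard_normal(n_levels), 0.0, None)
    return state

def mogp_sample_variability(T, q):
    """Placeholder for the trained MOGP: a sampled perturbation of the profiles."""
    return 0.1 * rng.standard_normal(T.shape), 1e-5 * rng.standard_normal(q.shape)

perturb_every = 6          # apply the ML perturbation at fixed intervals
for t in range(48):
    state = step_dynamics(state)
    if t % perturb_every == 0:
        dT, dq = mogp_sample_variability(state["T"], state["q"])
        state["T"] += dT
        state["q"] = np.clip(state["q"] + dq, 0.0, None)
```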
Multi-level emulation of tsunami simulations over Cilacap, South Java, Indonesia
Carrying out a Probabilistic Tsunami Hazard Assessment (PTHA) requires a large number of simulations done at high resolution. Statistical emulation builds a surrogate to replace the simulator and thus reduces computational costs when propagating uncertainties from the earthquake sources to the tsunami inundations. To reduce these costs further, we propose here to build emulators that exploit multiple levels of resolution and a sequential design of computer experiments. By running a few tsunami simulations at high resolution and many more simulations at lower resolutions, we are able to provide realistic assessments whereas, for the same budget, using only the high resolution tsunami simulations does not provide a satisfactory outcome. As a result, PTHA can be considered with higher precision using the highest spatial resolutions, and for impacts over larger regions. We provide an illustration for the city of Cilacap in Indonesia that demonstrates the benefit of our approach.
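A minimal two-level version of this idea is sketched below, assuming hypothetical `low_res` and `high_res` simulators: a GP is fitted to many cheap low-resolution runs and a second GP to the discrepancy between the few high-resolution runs and the low-resolution predictions. The paper's approach additionally uses sequential design to decide where, and at which resolution level, to spend the simulation budget.

```python
# Simplified two-level emulator: low-resolution GP plus an emulated discrepancy.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

low_res = lambda x: np.sin(2 * np.pi * x).ravel()                    # cheap, coarse simulator
high_res = lambda x: (np.sin(2 * np.pi * x) + 0.2 * x ** 2).ravel()  # expensive, accurate simulator

rng = np.random.default_rng(5)
X_low = rng.uniform(0, 1, (60, 1))      # many low-resolution runs
X_high = rng.uniform(0, 1, (8, 1))      # few high-resolution runs

gp_low = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_low, low_res(X_low))
resid = high_res(X_high) - gp_low.predict(X_high)
gp_diff = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_high, resid)

def predict_high(x):
    """Multi-level prediction: low-resolution emulator plus emulated discrepancy."""
    x = np.atleast_2d(x)
    return gp_low.predict(x) + gp_diff.predict(x)

print(predict_high([[0.25]]))
```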
