
    Dimension reduction for Gaussian process emulation: an application to the influence of bathymetry on tsunami heights

    High-accuracy, complex computer models, or simulators, require large resources in time and memory to produce realistic results. Statistical emulators are computationally cheap approximations of such simulators. They can be built to replace simulators for various purposes, such as the propagation of uncertainties from inputs to outputs or the calibration of internal parameters against observations. However, when the input space is high-dimensional, the construction of an emulator can become prohibitively expensive. In this paper, we introduce a joint framework merging emulation with dimension reduction in order to overcome this hurdle. The gradient-based kernel dimension reduction technique is chosen for its ability to drastically decrease dimensionality with little loss of information, and it is combined with Gaussian process emulation. Our proposed approach provides an answer to the dimension reduction issue in emulation for a wide range of simulation problems that cannot be tackled using existing methods. The efficiency and accuracy of the proposed framework are demonstrated theoretically and compared with other methods on an elliptic partial differential equation (PDE) problem. We finally present a realistic application to tsunami modeling. The uncertainties in the bathymetry (seafloor elevation) are modeled as high-dimensional realizations of a spatial process using a geostatistical approach. Our dimension-reduced emulation enables us to compute the impact of these uncertainties on the resulting possible tsunami wave heights near shore and on shore. We observe a significant increase in the spread of uncertainties in the tsunami heights due to the contribution of the bathymetry uncertainties. These results highlight the need to include the effect of bathymetry uncertainties in tsunami early warnings and risk assessments. (26 pages, 8 figures, 2 tables)
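
    A minimal sketch of the emulation-with-dimension-reduction workflow, using plain PCA as a crude stand-in for the gradient-based kernel dimension reduction used in the paper; the toy simulator, the dimensions and the scikit-learn choices below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy 100-dimensional "simulator" standing in for an expensive computer model.
def simulator(x):
    return np.sin(x[:, :5].sum(axis=1)) + 0.1 * x[:, 5:10].sum(axis=1)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 100))   # high-dimensional inputs
y = simulator(X)

# Step 1: dimension reduction (PCA here; the paper's gradient-based kernel
# dimension reduction instead targets directions that drive the output).
proj = PCA(n_components=10).fit(X)
Z = proj.transform(X)

# Step 2: Gaussian process emulation on the reduced input space.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(Z, y)

# Cheap predictions with uncertainty at new inputs, e.g. for uncertainty propagation.
X_new = rng.uniform(-1.0, 1.0, size=(1000, 100))
mean, std = gp.predict(proj.transform(X_new), return_std=True)
print(mean[:3], std[:3])
```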

    Sequential Design with Mutual Information for Computer Experiments (MICE): Emulation of a Tsunami Model

    Computer simulators can be computationally intensive to run over a large number of input values, as required for optimization and various uncertainty quantification tasks. The standard paradigm for the design and analysis of computer experiments is to employ Gaussian random fields to model computer simulators. Gaussian process models are trained on input-output data obtained from simulation runs at various input values. Following this approach, we propose a sequential design algorithm, MICE (Mutual Information for Computer Experiments), that adaptively selects the input values at which to run the computer simulator in order to maximize the expected information gain (mutual information) over the input space. The superior computational efficiency of the MICE algorithm compared with other algorithms is demonstrated on test functions and on a tsunami simulator, with overall gains of up to 20% in the latter case.
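
    A simplified sequential-design sketch in the spirit of the MICE criterion: each new run is placed where the predictive variance under the current design is large relative to the variance that would remain if the other candidate points were known. The toy function, candidate grid, fixed kernel and nugget are illustrative assumptions; the paper's algorithm and its exact mutual-information criterion are more involved.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def f(x):                                        # toy stand-in for an expensive simulator
    return np.sin(3 * x) + 0.5 * np.cos(7 * x)

rng = np.random.default_rng(1)
cand = np.linspace(0.0, 1.0, 100).reshape(-1, 1)            # candidate design points
X = cand[rng.choice(len(cand), 3, replace=False)]           # small initial design
y = f(X).ravel()

kern = RBF(length_scale=0.1)
for _ in range(10):                                          # add 10 runs sequentially
    gp = GaussianProcessRegressor(kernel=kern, optimizer=None).fit(X, y)
    _, s_design = gp.predict(cand, return_std=True)          # sd given current design

    scores = np.empty(len(cand))
    for j in range(len(cand)):
        # Variance at candidate j if all *other* candidates were observed (with nugget).
        rest = np.delete(cand, j, axis=0)
        gp_rest = GaussianProcessRegressor(kernel=kern + WhiteKernel(1e-2),
                                           optimizer=None).fit(rest, np.zeros(len(rest)))
        _, s_rest = gp_rest.predict(cand[j:j + 1], return_std=True)
        # MICE-style ratio: favour points that are uncertain under the current design
        # but informative about the rest of the input space.
        scores[j] = s_design[j] ** 2 / max(s_rest[0] ** 2, 1e-12)

    x_new = cand[[int(np.argmax(scores))]]
    X, y = np.vstack([X, x_new]), np.append(y, f(x_new).ravel())
```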

    Efficient spatial modelling using the SPDE approach with bivariate splines

    Gaussian fields (GFs) are frequently used in spatial statistics for their versatility. The associated computational cost can be a bottleneck, especially in realistic applications. It has been shown that computational efficiency can be gained by carrying out the computations with Gaussian Markov random fields (GMRFs), since the GFs can be seen as weak solutions of corresponding stochastic partial differential equations (SPDEs) discretized with piecewise linear finite elements. We introduce a new class of representations of GFs using bivariate splines instead of finite elements. This allows an easier implementation of piecewise polynomial representations of various degrees. It leads to GMRFs that can be inferred efficiently and easily extended to non-stationary fields. The solutions approximated with higher-order bivariate splines converge faster, so the computational cost can be alleviated. Numerical experiments using both real and simulated data also demonstrate that our framework increases flexibility and efficiency. (26 pages, 7 figures, 3 tables)
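
    A one-dimensional toy version of the SPDE-to-GMRF construction with piecewise linear ("hat") basis functions, i.e. the baseline that the paper generalizes to bivariate splines on two-dimensional domains; the mesh, the value of kappa and the choice alpha = 2 are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp

# Uniform 1-D mesh on [0, 1] with piecewise linear basis functions.
n = 200
h = 1.0 / (n - 1)
kappa = 10.0                                   # controls the correlation range

# Lumped mass matrix C and stiffness matrix G for the hat functions.
c = np.full(n, h); c[[0, -1]] = h / 2.0
C = sp.diags(c)
g_main = np.full(n, 2.0 / h); g_main[[0, -1]] = 1.0 / h
G = sp.diags([np.full(n - 1, -1.0 / h), g_main, np.full(n - 1, -1.0 / h)], [-1, 0, 1])

# Sparse GMRF precision for alpha = 2:  Q = (kappa^2 C + G) C^{-1} (kappa^2 C + G).
K = (kappa**2 * C + G).tocsc()
Q = K @ sp.diags(1.0 / c) @ K

# Draw one realisation of the weights: if Q = L L^T then x = L^{-T} z has precision Q.
L = np.linalg.cholesky(Q.toarray())            # dense here; sparse Cholesky in practice
z = np.random.default_rng(2).standard_normal(n)
x = np.linalg.solve(L.T, z)
print(x[:5])
```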

    Linked Gaussian Process Emulation for Systems of Computer Models Using Matérn Kernels and Adaptive Design

    The state-of-the-art linked Gaussian process offers a way to build analytical emulators for systems of computer models. We generalize the closed-form expressions for the linked Gaussian process under the squared exponential kernel to a class of Matérn kernels, which are essential in advanced applications. An iterative procedure to construct linked Gaussian processes as surrogate models for any feed-forward system of computer models is presented and illustrated on a feed-back coupled satellite system. We also introduce an adaptive design algorithm that can increase the approximation accuracy of linked Gaussian process surrogates while reducing the computational cost of running the expensive computer system, by allocating runs to, and refining emulators of, individual sub-models according to their heterogeneous functional complexity.
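
    A Monte Carlo sketch of the linked-GP idea for a two-model chain: samples from the first emulator's predictive distribution are pushed through the second emulator, and the linked mean and variance are estimated via the law of total variance. The paper instead derives these two moments in closed form under Matérn kernels; the toy sub-models and scikit-learn GPs below are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Two toy sub-models forming a feed-forward chain y = f2(f1(x)).
f1 = lambda x: np.sin(2 * np.pi * x)
f2 = lambda u: u**2 + 0.3 * u

rng = np.random.default_rng(3)
X1 = rng.uniform(0.0, 1.0, (15, 1)); U = f1(X1).ravel()     # runs of the first model
X2 = rng.uniform(-1.0, 1.0, (15, 1)); Y = f2(X2).ravel()    # runs of the second model

gp1 = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X1, U)
gp2 = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X2, Y)

# Linked prediction at x* by simulation: sample the intermediate output, then
# combine the second emulator's moments with the law of total variance.
x_star = np.array([[0.3]])
m1, s1 = gp1.predict(x_star, return_std=True)
u_samples = rng.normal(m1[0], s1[0], size=(500, 1))
m2, s2 = gp2.predict(u_samples, return_std=True)
linked_mean = m2.mean()
linked_var = (s2**2 + m2**2).mean() - linked_mean**2
print(linked_mean, linked_var)
```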

    Deep Gaussian Process Emulation using Stochastic Imputation

    We propose a novel deep Gaussian process (DGP) inference method for computer model emulation using stochastic imputation. By stochastically imputing the latent layers, the approach transforms the DGP into the linked GP, a state-of-the-art surrogate model formed by linking a system of feed-forward coupled GPs. This transformation yields a simple yet efficient DGP training procedure that only involves optimization of conventional stationary GPs. In addition, the analytically tractable mean and variance of the linked GP allow one to implement predictions from DGP emulators in a fast and accurate manner. We demonstrate the method in a series of synthetic examples and real-world applications, and show that it is a competitive candidate for efficient DGP surrogate modeling in comparison to variational inference and fully Bayesian approaches. A Python package, dgpsi, implementing the method has also been produced and is available at https://github.com/mingdeyu/DGP
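
    A schematic of the imputation idea for a two-layer DGP: once a latent layer is imputed, the model decomposes into ordinary stationary GPs that can be trained conventionally. The resampling step below is only a crude stand-in for the conditional sampling of latent layers used in the paper, and the data and kernels are illustrative assumptions; see the dgpsi package linked above for the actual implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(4)
X = np.linspace(0.0, 1.0, 30).reshape(-1, 1)
y = np.where(X.ravel() < 0.5, 0.2 * X.ravel(), 1.0 + np.sin(8 * X.ravel()))  # non-stationary toy data

W = X + 0.05 * rng.standard_normal(X.shape)    # initial imputation of the latent layer
for i in range(20):
    # Given an imputed latent layer W, the two-layer DGP splits into ordinary GPs.
    gp_inner = GaussianProcessRegressor(RBF(0.2), alpha=1e-4).fit(X, W.ravel())
    gp_outer = GaussianProcessRegressor(RBF(0.2), alpha=1e-4, normalize_y=True).fit(W, y)
    # "Imputation" step: redraw the latent layer (a crude stand-in for sampling it
    # conditionally on the observed outputs, as the paper's method does).
    W = gp_inner.sample_y(X, random_state=i).reshape(-1, 1)

# Predict a new point by pushing it through both layers (mean propagation only).
x_new = np.array([[0.75]])
print(gp_outer.predict(gp_inner.predict(x_new).reshape(-1, 1)))
```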

    Embedding machine-learnt sub-grid variability improves climate model biases

    The under-representation of cloud formation is a long-standing bias associated with climate simulations. Parameterisation schemes are required to capture cloud processes within current climate models but have known biases. We overcome these biases by embedding a Multi-Output Gaussian Process (MOGP), trained on high-resolution Unified Model simulations, to represent the variability of temperature and specific humidity within a climate model. A trained MOGP model is coupled in situ with a simplified Atmospheric General Circulation Model named SPEEDY. The temperature and specific humidity profiles of SPEEDY are perturbed at fixed intervals according to the variability predicted by the MOGP. Ten-year predictions are generated for both control and ML-hybrid models. The hybrid model reduces the global precipitation bias by 18% and the tropical precipitation bias by 22%. To further understand the drivers of these improvements, physical quantities of interest are explored, such as the distribution of lifted-index values and the alteration of the Hadley cell. The control and hybrid set-ups are also run in a +4 K sea-surface temperature experiment to explore the effects of the approach on patterns relating to cloud cover and precipitation in a warmed climate setting.
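
    A schematic of the coupling loop: at a fixed interval, the model column state is perturbed with noise whose amplitude comes from a multi-output GP trained offline on high-resolution data. SPEEDY, the real MOGP and its training set are replaced by toy stand-ins below, so every name and number here is an illustrative assumption.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(5)
n_lev = 8                                           # toy vertical levels (temperature only)

# Toy offline training set: coarse column state -> sub-grid standard deviation per level.
X_train = rng.uniform(250.0, 300.0, (100, n_lev))
s_train = 0.01 * (X_train - 250.0) + rng.normal(0.0, 0.05, X_train.shape)
mogp = GaussianProcessRegressor(RBF(20.0), normalize_y=True).fit(X_train, s_train)

def step(state):
    """Toy stand-in for one time step of the host model (relaxation towards 280 K)."""
    return state + 0.01 * (280.0 - state)

state = np.full(n_lev, 290.0)
for t in range(100):
    state = step(state)
    if t % 10 == 0:                                 # fixed coupling interval
        sigma = mogp.predict(state.reshape(1, -1)).ravel()  # predicted sub-grid spread
        state = state + rng.normal(0.0, np.abs(sigma))      # stochastic perturbation
print(state)
```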

    Multi-level emulation of tsunami simulations over Cilacap, South Java, Indonesia

    Carrying out a Probabilistic Tsunami Hazard Assessment (PTHA) requires a large number of simulations run at high resolution. Statistical emulation builds a surrogate to replace the simulator and thus reduces the computational cost of propagating uncertainties from the earthquake sources to the tsunami inundations. To further reduce these costs, we propose to build emulators that exploit multiple levels of resolution and a sequential design of computer experiments. By running a few tsunami simulations at high resolution and many more at lower resolutions, we are able to provide realistic assessments whereas, for the same budget, using only the high-resolution tsunami simulations does not provide a satisfactory outcome. As a result, PTHA can be carried out with higher precision, using the highest spatial resolutions and for impacts over larger regions. We provide an illustration for the city of Cilacap in Indonesia that demonstrates the benefit of our approach.
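
    A minimal two-level sketch of the idea: many cheap low-resolution runs train a first emulator, a few expensive high-resolution runs train an emulator of the discrepancy, and predictions combine the two. The toy functions, the autoregressive form y_high ≈ rho * y_low + delta and the run budgets are illustrative assumptions; the paper's multi-level construction and its sequential design are more elaborate.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy low- and high-resolution "simulators" standing in for the tsunami model.
low = lambda x: np.sin(8 * x).ravel()                        # cheap, biased
high = lambda x: np.sin(8 * x).ravel() + 0.2 * x.ravel()     # expensive, accurate

rng = np.random.default_rng(6)
X_low = rng.uniform(0.0, 1.0, (60, 1)); y_low = low(X_low)       # many cheap runs
X_high = rng.uniform(0.0, 1.0, (8, 1)); y_high = high(X_high)    # few expensive runs

# Level 1: emulate the low-resolution simulator.
gp_low = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(X_low, y_low)

# Level 2: emulate the discrepancy between resolutions at the high-resolution runs.
m_low = gp_low.predict(X_high)
rho = float(np.polyfit(m_low, y_high, 1)[0])                 # simple scale estimate
gp_delta = GaussianProcessRegressor(Matern(nu=2.5), normalize_y=True).fit(
    X_high, y_high - rho * m_low)

# Multi-level prediction combines both emulators.
x_star = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
print(rho * gp_low.predict(x_star) + gp_delta.predict(x_star))
```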
