    Deep Statistical Models with Application to Environmental Data

    When analyzing environmental data, constructing a realistic statistical model is important, not only to fully characterize the physical phenomena, but also to provide valid and useful predictions. Gaussian process models are amongst the most popular tools used for this purpose. However, many assumptions are usually made when using Gaussian processes, such as stationarity of the covariance function. There are several approaches to constructing nonstationary spatial and spatio-temporal Gaussian processes, including the deformation approach, in which the geographical domain is warped into a new domain on which the Gaussian process is modeled as stationary. One of the main challenges with this approach is constructing a deformation function that is complicated enough to adequately capture the nonstationarity in the process, yet simple enough to facilitate statistical inference and prediction. In this thesis, using ideas from deep learning, we construct deformation functions that are compositions of simple warping units. In particular, deformation functions composed of aligning functions and warping functions are introduced to model nonstationary and asymmetric multivariate spatial processes, while spatial and temporal warping functions are used to model nonstationary spatio-temporal processes. As in the traditional deformation approach, familiar stationary models are used on the warped domain. We show that this new approach to modeling nonstationarity is computationally efficient, and that it can lead to predictions that are superior to those from stationary models. We demonstrate the utility of these models on both simulated data and real-world environmental data: ocean temperatures and surface-ice elevation. The developed warped nonstationary processes can also be used for emulation. We show that a warped, gradient-enhanced Gaussian process surrogate model can be embedded in algorithms such as importance sampling and delayed-acceptance Markov chain Monte Carlo. Our surrogate models can provide more accurate emulation than other traditional surrogate models, and can help speed up Bayesian inference in problems with exponential-family likelihoods that have intractable normalizing constants, for example when analyzing satellite images using the Potts model.
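
    To make the compositional-deformation idea concrete, here is a minimal NumPy sketch (not the thesis's actual model): a nonstationary covariance is obtained by composing a few simple, hypothetical radial warping units and then evaluating a stationary Matérn-3/2 kernel on the warped coordinates. The unit form, centres, and weights below are illustrative assumptions only.

    ```python
    import numpy as np

    def rbf_warp_unit(s, centre, scale, weight):
        """One simple warping unit: shift locations towards/away from a centre
        by a Gaussian bump. A hypothetical unit form; keep |weight| small so
        each unit remains injective."""
        d = s - centre
        bump = np.exp(-np.sum(d**2, axis=-1, keepdims=True) / (2 * scale**2))
        return s + weight * d * bump

    def warp(s, units):
        """Deformation = composition of simple warping units (the 'deep' part)."""
        for u in units:
            s = rbf_warp_unit(s, **u)
        return s

    def stationary_matern32(d, sigma2=1.0, ell=1.0):
        r = np.sqrt(3.0) * d / ell
        return sigma2 * (1.0 + r) * np.exp(-r)

    def nonstationary_cov(s1, s2, units, **kern_args):
        """Nonstationary covariance on the original domain = stationary kernel
        evaluated on the warped domain (always a valid covariance)."""
        w1, w2 = warp(s1, units), warp(s2, units)
        d = np.linalg.norm(w1[:, None, :] - w2[None, :, :], axis=-1)
        return stationary_matern32(d, **kern_args)

    # Two illustrative units stacked, as in a deep (compositional) deformation
    units = [dict(centre=np.array([0.3, 0.4]), scale=0.2, weight=0.8),
             dict(centre=np.array([0.7, 0.6]), scale=0.3, weight=-0.5)]
    S = np.random.default_rng(0).uniform(size=(50, 2))
    K = nonstationary_cov(S, S, units, sigma2=1.0, ell=0.5)
    ```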

    Manifold learning for emulations of computer models

    Computer simulations are widely used in scientific research and engineering. Though they can provide accurate results, their computational expense is typically high, which hinders their application to problems where repeated evaluations are required, e.g., design optimization and uncertainty quantification. For partial differential equation (PDE) models, the outputs of interest are often spatial fields, leading to high-dimensional output spaces. Although emulators can be used to find faithful and computationally inexpensive approximations of computer models, there are few methods for handling high-dimensional output spaces. For Gaussian process (GP) emulation, approximations of the correlation structure and/or dimensionality reduction are necessary. Linear dimensionality reduction will fail when the output space is not well approximated by a linear subspace of the ambient space in which it lies. Manifold learning can overcome the limitations of linear methods if an accurate inverse map is available. In this thesis, manifold learning is applied to construct GP emulators for very high-dimensional output spaces arising from parameterised PDE model simulations. Artificial neural network (ANN) and support vector machine (SVM) emulators using manifold learning are also studied. A general framework for approximating the inverse map and a new, efficient method for diffusion maps are developed. The manifold-learning-based emulators are then used to extend reduced order models (ROMs) based on proper orthogonal decomposition to dynamic, parameterized PDEs. A similar approach is used to extend the discrete empirical interpolation method (DEIM) to ROMs for nonlinear, parameterized, dynamic PDEs.
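
    As a rough illustration of the emulation strategy (not the thesis's implementation), the sketch below uses scikit-learn's KernelPCA, whose built-in pre-image approximation (fit_inverse_transform=True) stands in for the thesis's inverse-map framework for diffusion maps: high-dimensional outputs are reduced to a few manifold coordinates, a GP maps inputs to those coordinates, and the pre-image map reconstructs the full field. The toy "solver", design, and all settings are assumptions.

    ```python
    import numpy as np
    from sklearn.decomposition import KernelPCA
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    # Toy stand-in for an expensive PDE solver: input theta -> high-dim spatial field
    rng = np.random.default_rng(1)
    x_grid = np.linspace(0, 1, 500)                       # 500-dim output space
    theta_train = rng.uniform(0.5, 2.0, size=(40, 1))     # design points
    Y_train = np.sin(np.outer(theta_train.ravel(), 2 * np.pi * x_grid))  # snapshots

    # Nonlinear dimensionality reduction with an (approximate) inverse map;
    # kPCA stands in for diffusion maps because scikit-learn provides a
    # built-in pre-image approximation.
    kpca = KernelPCA(n_components=3, kernel="rbf",
                     fit_inverse_transform=True, alpha=1e-3)
    Z_train = kpca.fit_transform(Y_train)

    # GP emulator from inputs to the low-dimensional manifold coordinates
    gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
    gp.fit(theta_train, Z_train)

    # Emulate at a new input: predict latent coords, then map back to the field
    theta_new = np.array([[1.3]])
    Y_emulated = kpca.inverse_transform(gp.predict(theta_new))
    ```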

    A Metalearning Approach for Physics-Informed Neural Networks (PINNs): Application to Parameterized PDEs

    Physics-informed neural networks (PINNs) as a means of discretizing partial differential equations (PDEs) are garnering much attention in the Computational Science and Engineering (CS&E) world. At least two challenges exist for PINNs at present: understanding their accuracy and convergence characteristics with respect to tunable parameters, and identifying optimization strategies that make PINNs as efficient as other computational science tools. The cost of training PINNs remains a major challenge of Physics-informed Machine Learning (PiML) - and, in fact, of machine learning (ML) in general. This paper is meant to move towards addressing the latter through the study of PINNs on new tasks, for which parameterized PDEs provide a good testbed application, as tasks can be easily defined in this context. Following the ML world, we introduce metalearning of PINNs with application to parameterized PDEs. By introducing metalearning and transfer learning concepts, we can greatly accelerate the PINNs optimization process. We present a survey of model-agnostic metalearning, and then discuss our model-aware metalearning applied to PINNs, as well as implementation considerations and algorithmic complexity. We then test our approach on various canonical forward parameterized PDEs that have been presented in the emerging PINNs literature.
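
    Here is a minimal PyTorch sketch of the transfer-learning ingredient alone: warm-starting a PINN on a nearby task in a parameterized family, using a toy Poisson problem u''(x) + λπ²sin(πx) = 0 with u(0) = u(1) = 0 (exact solution u = λ sin(πx)). The paper's model-aware metalearning is richer than this; the architecture, task, and step counts below are illustrative assumptions.

    ```python
    import torch

    torch.manual_seed(0)

    def make_pinn():
        return torch.nn.Sequential(
            torch.nn.Linear(1, 32), torch.nn.Tanh(),
            torch.nn.Linear(32, 32), torch.nn.Tanh(),
            torch.nn.Linear(32, 1))

    def pinn_loss(net, lam, n_col=64):
        """PDE residual of u'' + lam*pi^2*sin(pi*x) = 0 at random collocation
        points, plus a penalty on the boundary conditions u(0) = u(1) = 0."""
        x = torch.rand(n_col, 1, requires_grad=True)
        u = net(x)
        du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
        d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
        residual = d2u + lam * torch.pi**2 * torch.sin(torch.pi * x)
        bc = net(torch.tensor([[0.0], [1.0]]))
        return (residual**2).mean() + (bc**2).mean()

    def train(net, lam, steps, lr=1e-3):
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = pinn_loss(net, lam)
            loss.backward()
            opt.step()
        return loss.item()

    # Train on a source task (lam = 1.0), then transfer to a nearby task:
    # starting from the source weights typically needs far fewer steps
    # than training the target task from scratch.
    net = make_pinn()
    train(net, lam=1.0, steps=2000)
    loss_transfer = train(net, lam=1.5, steps=200)   # warm start
    ```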

    Convolved Gaussian process priors for multivariate regression with applications to dynamical systems

    In this thesis we address the problem of modeling correlated outputs using Gaussian process priors. Applications of modeling correlated outputs include the joint prediction of pollutant metals in geostatistics and multitask learning in machine learning. Defining a Gaussian process prior for correlated outputs translates into specifying a suitable covariance function that captures dependencies between the different output variables. Classical models for obtaining such a covariance function include the linear model of coregionalization and process convolutions. We propose a general framework for developing multiple-output covariance functions by performing convolutions between smoothing kernels particular to each output and covariance functions that are common to all outputs. Both the linear model of coregionalization and process convolutions turn out to be special cases of this framework. Practical aspects of the proposed methodology are studied in this thesis: the use of domain-specific knowledge for defining relevant smoothing kernels, efficient approximations for reducing computational complexity, and a novel method for establishing a general class of nonstationary covariances, with applications in robotics and motion-capture data. Reprints of the publications that appear at the end of this document report case studies and experimental results in sensor networks, geostatistics, and motion-capture data that illustrate the performance of the different methods proposed.
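
    For a concrete instance of the convolution framework, consider the simplest case: one-dimensional outputs built by convolving output-specific Gaussian smoothing kernels against a shared white-noise latent process, for which the cross-covariance is available in closed form (the convolution of two Gaussians is again Gaussian). The NumPy sketch below, with assumed length-scales, builds the resulting joint covariance for two outputs; the thesis's framework is more general, allowing common covariance functions beyond white noise.

    ```python
    import numpy as np

    def cross_cov(x1, x2, ell_i, ell_j, sigma2=1.0):
        """Cross-covariance K_ij(x, x') from convolving Gaussian smoothing
        kernels (length-scales ell_i, ell_j) against a shared white-noise
        latent process: a Gaussian with summed squared length-scales."""
        a, b = ell_i**2, ell_j**2
        norm = sigma2 * np.sqrt(2 * np.pi * a * b / (a + b))
        d = x1[:, None] - x2[None, :]
        return norm * np.exp(-d**2 / (2 * (a + b)))

    # Joint covariance for two correlated outputs sharing one latent process
    x = np.linspace(0, 1, 25)
    K = np.block([[cross_cov(x, x, 0.1, 0.1),   # output 1 with itself
                   cross_cov(x, x, 0.1, 0.3)],  # output 1 with output 2
                  [cross_cov(x, x, 0.3, 0.1),
                   cross_cov(x, x, 0.3, 0.3)]]) # output 2 with itself
    # K is symmetric positive semi-definite by construction, so it can serve
    # directly as the prior covariance of the stacked outputs.
    ```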

    Bayesian treed multivariate Gaussian process with adaptive design: Application to a carbon capture unit

    Computer experiments are widely used in scientific research to study and predict the behavior of complex systems, which often have responses consisting of a set of nonstationary outputs. Simulations at high resolution are often computationally expensive, making parametric studies at many input values impractical. In this article, we develop a Bayesian treed multivariate Gaussian process (BTMGP) as an extension of the Bayesian treed Gaussian process (BTGP) to model the cross-covariance function and the nonstationarity of the multivariate output. We keep the computational complexity of the Markov chain Monte Carlo sampler manageable by choosing the covariance function and prior distributions appropriately. Based on the BTMGP, we develop a sequential design of experiments for the input space and construct an emulator. We demonstrate the proposed method on test cases and compare it with alternative approaches. We also apply the sequential sampling technique and the BTMGP to model the multiphase flow in a full-scale regenerator of a carbon capture unit.
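
    The following is a much-simplified, non-Bayesian illustration of the treed-GP idea: a single fixed split of the input space with an independent stationary GP fit in each leaf, so the overall model is nonstationary. The BTMGP instead samples tree structures by Markov chain Monte Carlo and models multivariate outputs jointly; the test function, split location, and kernels here are assumptions.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    # A nonstationary test function: smooth on the left, rough on the right
    rng = np.random.default_rng(2)
    X = rng.uniform(0, 1, size=(120, 1))
    y = np.where(X[:, 0] < 0.5, np.sin(2 * np.pi * X[:, 0]),
                 np.sin(16 * np.pi * X[:, 0]))

    split = 0.5  # fixed here; the BTGP/BTMGP samples tree structures by MCMC
    leaves = {}
    for name, mask in [("left", X[:, 0] < split), ("right", X[:, 0] >= split)]:
        gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                                      normalize_y=True)
        leaves[name] = gp.fit(X[mask], y[mask])  # independent GP per leaf

    def predict(X_new):
        """Piecewise prediction: route each point to its leaf's GP."""
        out = np.empty(len(X_new))
        mask = X_new[:, 0] < split
        out[mask] = leaves["left"].predict(X_new[mask])
        out[~mask] = leaves["right"].predict(X_new[~mask])
        return out

    X_test = np.linspace(0, 1, 200)[:, None]
    y_hat = predict(X_test)
    ```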

    High-dimensional-output surrogate models for uncertainty and sensitivity analyses

    Computational models that describe complex physical phenomena tend to be computationally expensive and time consuming. Partial differential equation (PDE) based models in particular produce spatio-temporal data sets in high-dimensional output spaces. Repeated calls to such computer models, to perform tasks such as sensitivity analysis, uncertainty quantification, and design optimization, can therefore become computationally infeasible. While constructing an emulator is one way to approximate the output of expensive computer models, emulators are not always capable of dealing with high-dimensional data sets. To deal with high-dimensional data, in this thesis emulation strategies (Gaussian processes (GPs), artificial neural networks (ANNs), and support vector machines (SVMs)) are combined with linear and non-linear dimensionality reduction techniques (kPCA, Isomap, and diffusion maps) to develop efficient emulators. For variance-based sensitivity analysis, a probabilistic framework is developed to account for emulator uncertainty, and the method is extended to multivariate outputs, with a derivation of new semi-analytical results for performing rapid sensitivity analysis of univariate or multivariate outputs. The developed emulators are also used to extend reduced order models (ROMs) based on proper orthogonal decomposition to parameter-dependent PDEs, including an extension of the discrete empirical interpolation method to non-linear PDE systems.
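
    To indicate how an emulator accelerates variance-based sensitivity analysis, here is a sketch of the standard Monte Carlo pick-freeze (Saltelli-type) estimator of first-order Sobol indices, evaluated on an inexpensive stand-in "emulator" (the Ishigami function, a common test case). The thesis's semi-analytical results replace such Monte Carlo loops; everything here beyond the generic estimator is an assumption.

    ```python
    import numpy as np

    def first_order_sobol(emulator, dim, n=10_000, rng=None):
        """First-order Sobol indices by the pick-freeze estimator, evaluated
        on a cheap emulator instead of the full simulator."""
        rng = rng or np.random.default_rng(0)
        A = rng.uniform(size=(n, dim))
        B = rng.uniform(size=(n, dim))
        fA, fB = emulator(A), emulator(B)
        var = np.var(np.concatenate([fA, fB]))
        S = np.empty(dim)
        for i in range(dim):
            ABi = A.copy()
            ABi[:, i] = B[:, i]          # freeze all inputs except the i-th
            S[i] = np.mean(fB * (emulator(ABi) - fA)) / var
        return S

    # Stand-in for a trained emulator: the Ishigami function on [0, 1]^3
    def emulator(X):
        x = np.pi * (2 * X - 1)          # map to [-pi, pi]^3
        return (np.sin(x[:, 0]) + 7 * np.sin(x[:, 1])**2
                + 0.1 * x[:, 2]**4 * np.sin(x[:, 0]))

    print(first_order_sobol(emulator, dim=3))   # roughly (0.31, 0.44, 0.00)
    ```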