Structured Bayesian Gaussian process latent variable model: applications to data-driven dimensionality reduction and high-dimensional inversion
We introduce a methodology for nonlinear inverse problems using a variational
Bayesian approach where the unknown quantity is a spatial field. A structured
Bayesian Gaussian process latent variable model is used both to construct a
low-dimensional generative model of the sample-based stochastic prior and to
serve as a surrogate for the forward evaluation. Its Bayesian formulation captures
epistemic uncertainty introduced by the limited number of input and output
examples, automatically selects an appropriate dimensionality for the learned
latent representation of the data, and rigorously propagates the uncertainty of
the data-driven dimensionality reduction of the stochastic space through the
forward model surrogate. The structured Gaussian process model explicitly
leverages spatial information for an informative generative prior to improve
sample efficiency while achieving computational tractability through Kronecker
product decompositions of the relevant kernel matrices. Importantly, the
Bayesian inversion is carried out by solving a variational optimization
problem, replacing traditional computationally-expensive Monte Carlo sampling.
The methodology is demonstrated on an elliptic PDE and is shown to return
well-calibrated posteriors while remaining tractable with latent spaces of over
100 dimensions.
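As a rough illustration of the structure-exploiting algebra mentioned above (the separable kernel, grid dimensions, and symmetry assumptions here are ours, not the paper's exact formulation), a spatial covariance of the form K = Kx ⊗ Ky can be solved against without ever forming the full matrix:

```python
import numpy as np

def kron_solve(Kx, Ky, b):
    """Solve (Kx kron Ky) v = b using only the small per-axis factors.

    Illustrative sketch: assumes symmetric positive-definite factors,
    as for a separable spatial kernel on an nx-by-ny grid.
    """
    nx, ny = Kx.shape[0], Ky.shape[0]
    B = b.reshape(ny, nx, order="F")   # un-vectorize (column-major convention)
    V = np.linalg.solve(Ky, B)         # apply Ky^{-1} from the left
    V = np.linalg.solve(Kx, V.T).T     # apply Kx^{-1} from the right
    return V.ravel(order="F")          # re-vectorize
```

This is why the cost scales with the per-axis sizes (roughly nx^3 + ny^3) rather than with the cube of the full grid size.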
Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive Networks
In recent years, deep learning has proven to be a viable methodology for
surrogate modeling and uncertainty quantification for a vast number of physical
systems. However, in their traditional form, such models can require a large
amount of training data. This is of particular importance for various
engineering and scientific applications where data may be extremely expensive
to obtain. To overcome this shortcoming, physics-constrained deep learning
provides a promising methodology as it only utilizes the governing equations.
In this work, we propose a novel auto-regressive dense encoder-decoder
convolutional neural network to solve and model non-linear dynamical systems
without training data at a computational cost potentially orders of magnitude
lower than that of standard numerical solvers. This model includes a Bayesian framework
that allows for uncertainty quantification of the predicted quantities of
interest at each time-step. We rigorously test this model on several non-linear
transient partial differential equation systems including the turbulence of the
Kuramoto-Sivashinsky equation, multi-shock formation and interaction with 1D
Burgers' equation and 2D wave dynamics with coupled Burgers' equations. For
each system, the predictive results and uncertainty are presented and discussed
together with comparisons to the results obtained from traditional numerical
analysis methods. Comment: 48 pages, 30 figures, accepted to the Journal of Computational Physics.
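A minimal sketch of the auto-regressive rollout idea (the `step` callable and the state shape are placeholders, not the paper's dense encoder-decoder): the surrogate maps the current field to the next time step, and its own prediction is fed back in as the next input.

```python
import numpy as np

def rollout(step, u0, n_steps):
    """Auto-regressive rollout: each prediction becomes the next input."""
    traj = [u0]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return np.stack(traj)

# Dummy stand-in for a trained physics-constrained network:
u0 = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
trajectory = rollout(lambda u: 0.99 * np.roll(u, 1), u0, n_steps=10)
```

In the physics-constrained setting, the training loss on each predicted step would come from the discretized PDE residual rather than from labeled solver data.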
Quantifying model form uncertainty in Reynolds-averaged turbulence models with Bayesian deep neural networks
Data-driven methods for improving turbulence modeling in Reynolds-Averaged
Navier-Stokes (RANS) simulations have gained significant interest in the
computational fluid dynamics community. Modern machine learning algorithms have
opened up a new area of black-box turbulence models allowing for the tuning of
RANS simulations to increase their predictive accuracy. While several
data-driven turbulence models have been reported, the quantification of the
uncertainties introduced has mostly been neglected. Uncertainty quantification
for such data-driven models is essential since their predictive capability
rapidly declines as they are tested for flow physics that deviate from that in
the training data. In this work, we propose a novel data-driven framework that
not only improves RANS predictions but also provides probabilistic bounds for
fluid quantities such as velocity and pressure. The uncertainties capture both
model form uncertainty as well as epistemic uncertainty induced by the limited
training data. An invariant Bayesian deep neural network is used to predict the
anisotropic tensor component of the Reynolds stress. This model is trained
using the Stein variational gradient descent algorithm. The computed uncertainty on
the Reynolds stress is propagated to the quantities of interest by vanilla
Monte Carlo simulation. Results are presented for two test cases that differ
geometrically from the training flows at several different Reynolds numbers.
The prediction enhancement of the data-driven model is discussed as well as the
associated probabilistic bounds for flow properties of interest. Ultimately
this framework allows for a quantitative measurement of model confidence and
uncertainty quantification for flows in which no high-fidelity observations or
prior knowledge is available. Comment: 47 pages, 21 figures, accepted to the
Journal of Computational Physics.
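For reference, the standard Stein variational gradient descent update (Liu and Wang) that moves an ensemble of parameter particles toward the posterior; the kernel and likelihood actually used by the paper are not specified in the abstract:

```latex
\theta_i \leftarrow \theta_i + \epsilon\,\hat{\phi}(\theta_i), \qquad
\hat{\phi}(\theta) = \frac{1}{n}\sum_{j=1}^{n}
\Big[\, k(\theta_j,\theta)\,\nabla_{\theta_j}\log p(\theta_j \mid \mathcal{D})
      + \nabla_{\theta_j} k(\theta_j,\theta) \,\Big]
```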
Predictive Collective Variable Discovery with Deep Bayesian Models
Extending spatio-temporal scale limitations of models for complex atomistic
systems considered in biochemistry and materials science necessitates the
development of enhanced sampling methods. The potential acceleration in
exploring the configurational space by enhanced sampling methods depends on the
choice of collective variables (CVs). In this work, we formulate the discovery
of CVs as a Bayesian inference problem and consider the CVs as hidden
generators of the full-atomistic trajectory. The ability to generate samples of
the fine-scale atomistic configurations using limited training data allows us
to compute estimates of observables as well as our probabilistic confidence on
them. The methodology is based on emerging methodological advances in machine
learning and variational inference. The discovered CVs are related to
physicochemical properties which are essential for understanding mechanisms
especially in unexplored complex systems. We provide a quantitative assessment
of the CVs in terms of their predictive ability for alanine dipeptide (ALA-2)
and the ALA-15 peptide.
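Schematically (notation ours, for orientation only), the CVs z act as latent generators of atomistic configurations x, and learning maximizes a variational lower bound on the data log-likelihood:

```latex
p_\theta(x) = \int p_\theta(x \mid z)\, p(z)\, dz, \qquad
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\big[\log p_\theta(x \mid z)\big]
- \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```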
Predictive Coarse-Graining
We propose a data-driven, coarse-graining formulation in the context of
equilibrium statistical mechanics. In contrast to existing techniques which are
based on a fine-to-coarse map, we adopt the opposite strategy by prescribing a
probabilistic coarse-to-fine map. This corresponds to a directed probabilistic
model where the coarse variables play the role of latent generators of the fine
scale (all-atom) data. From an information-theoretic perspective, the framework
proposed provides an improvement upon the relative entropy method and is
capable of quantifying the uncertainty due to the information loss that
unavoidably takes place during the CG process. Furthermore, it can be readily
extended to a fully Bayesian model where various sources of uncertainties are
reflected in the posterior of the model parameters. The latter can be used to
produce not only point estimates of fine-scale reconstructions or macroscopic
observables, but more importantly, predictive posterior distributions on these
quantities. Predictive posterior distributions reflect the confidence of the
model as a function of the amount of data and the level of coarse-graining. The
issues of model complexity and model selection are seamlessly addressed by
employing a hierarchical prior that favors the discovery of sparse solutions,
revealing the most prominent features in the coarse-grained model. A flexible
and parallelizable Monte Carlo - Expectation-Maximization (MC-EM) scheme is
proposed for carrying out inference and learning tasks. A comparative
assessment of the proposed methodology is presented for a lattice spin system
and the SPC/E water model.
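In schematic form (our notation), the coarse variables X act as latent generators of the all-atom configuration x, and MC-EM alternates between sampling the latent coarse configurations and maximizing the expected complete-data log-likelihood:

```latex
p(x \mid \theta) = \int p_{cf}(x \mid X, \theta)\, p_c(X \mid \theta)\, dX,
\qquad
\theta^{(t+1)} = \arg\max_{\theta}\;
\mathbb{E}_{p(X \mid x,\, \theta^{(t)})}\!\big[\log p(x, X \mid \theta)\big],
```

with the E-step expectation estimated by Monte Carlo samples of X.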
Bayesian Deep Convolutional Encoder-Decoder Networks for Surrogate Modeling and Uncertainty Quantification
We are interested in the development of surrogate models for uncertainty
quantification and propagation in problems governed by stochastic PDEs using a
deep convolutional encoder-decoder network in a similar fashion to approaches
considered in deep learning for image-to-image regression tasks. Since standard
neural networks are data-intensive and cannot provide predictive uncertainty,
we propose a Bayesian approach to convolutional neural networks. A recently
introduced variational gradient descent algorithm based on Stein's method is
scaled to deep convolutional networks to perform approximate Bayesian inference
on millions of uncertain network parameters. This approach achieves
state-of-the-art performance in terms of predictive accuracy and uncertainty
quantification in comparison to other approaches in Bayesian neural networks as
well as techniques that include Gaussian processes and ensemble methods even
when the training data size is relatively small. To evaluate the performance of
this approach, we consider standard uncertainty quantification benchmark
problems including flow in heterogeneous media defined in terms of limited
data-driven permeability realizations. The performance of the surrogate model
developed is very good even though there is no underlying structure shared
between the input (permeability) and output (flow/pressure) fields as is often
the case in the image-to-image regression models used in computer vision
problems. Studies are performed with underlying stochastic input
dimensionalities at levels where most other uncertainty quantification
methods fail. Uncertainty propagation tasks are considered and the predictive
output Bayesian statistics are compared to those obtained with Monte Carlo
estimates. Comment: 52 pages, 28 figures, submitted to the Journal of Computational Physics.
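A hedged sketch of how predictive statistics are typically assembled from such a Bayesian surrogate (the particle list and prediction signature below are placeholders, not the paper's code): each SVGD particle is one sampled network, and output moments are averaged over both posterior particles and input realizations.

```python
import numpy as np

def predictive_moments(particles, inputs):
    """Predictive mean/variance over posterior particles and input samples.

    particles: list of callables, one per posterior sample of the network.
    inputs:    iterable of input-field realizations (e.g. permeability draws).
    """
    preds = np.stack([np.stack([f(x) for x in inputs]) for f in particles])
    mean = preds.mean(axis=(0, 1))   # average over particles and inputs
    var = preds.var(axis=(0, 1))     # total predictive variance
    return mean, var
```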
Structured Bayesian Gaussian process latent variable model
We introduce a Bayesian Gaussian process latent variable model that
explicitly captures spatial correlations in data using a parameterized spatial
kernel and leveraging structure-exploiting algebra on the model covariance
matrices for computational tractability. Inference is made tractable through a
collapsed variational bound with similar computational complexity to that of
the traditional Bayesian GP-LVM. Inference over partially-observed test cases
is achieved by optimizing a "partially-collapsed" bound. Modeling
high-dimensional time series systems is enabled through use of a dynamical GP
latent variable prior. Examples of imputing missing data in images and
super-resolution imputation of missing video frames demonstrate the model.
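As a complementary sketch to the Kronecker solve shown earlier (grid layout and kernel form are illustrative), the structured spatial covariance of such a model can be assembled from one-dimensional kernels over the image axes, so only the small per-axis matrices are ever stored or factorized:

```python
import numpy as np

def rbf(x, lengthscale):
    """Squared-exponential kernel matrix for 1D inputs x."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# Separable spatial kernel over a 32x24 image grid: K = kron(Kx, Ky).
xs, ys = np.linspace(0.0, 1.0, 32), np.linspace(0.0, 1.0, 24)
Kx, Ky = rbf(xs, 0.1), rbf(ys, 0.1)
```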
Variational Reformulation of Bayesian Inverse Problems
The classical approach to inverse problems is based on the optimization of a
misfit function. Despite its computational appeal, such an approach suffers
from several shortcomings, e.g., non-uniqueness of solutions and difficulty in
incorporating prior knowledge. The Bayesian formalism for inverse problems avoids most of the
difficulties encountered by the optimization approach, albeit at an increased
computational cost. In this work, we use information theoretic arguments to
cast the Bayesian inference problem in terms of an optimization problem. The
resulting scheme combines the theoretical soundness of fully Bayesian inference
with the computational efficiency of a simple optimization problem.
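In outline (notation ours), the reformulation replaces posterior sampling with an optimization over a family of distributions q, driven by a Kullback-Leibler divergence to the Bayesian posterior:

```latex
p(x \mid y) \propto p(y \mid x)\, p(x), \qquad
q^{\ast} = \arg\min_{q \in \mathcal{Q}}
\mathrm{KL}\big(q(x)\,\|\,p(x \mid y)\big)
= \arg\max_{q \in \mathcal{Q}}
\mathbb{E}_{q}\!\big[\log p(y \mid x) + \log p(x) - \log q(x)\big]
```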
Solving inverse problems using conditional invertible neural networks
Inverse modeling for computing a high-dimensional spatially-varying property
field from indirect sparse and noisy observations is a challenging problem.
This is due to the complex physical system of interest often expressed in the
form of multiscale PDEs, the high-dimensionality of the spatial property of
interest, and the incomplete and noisy nature of observations. To address these
challenges, we develop a model that maps the given observations to the unknown
input field in the form of a surrogate model. This inverse surrogate model will
then allow us to estimate the unknown input field for any given sparse and
noisy output observations. Here, the inverse mapping is limited to a broad
prior distribution of the input field with which the surrogate model is
trained. In this work, we construct two- and three-dimensional inverse
surrogate models, each consisting of an invertible network and a conditional
neural network trained in an end-to-end fashion with limited training data. The invertible
network is developed using a flow-based generative model. The developed inverse
surrogate model is then applied to an inversion task for a multiphase flow
problem in which, given pressure and saturation observations, the aim is to
recover a high-dimensional non-Gaussian permeability field whose two facies
exhibit heterogeneous permeability and varying length scales. For both the
two- and three-dimensional surrogate models, the predicted sample realizations
of the non-Gaussian permeability field are diverse with the predictive mean
being close to the ground truth even when the model is trained with limited
data. Comment: 58 pages, 20 figures.
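A schematic of the conditional flow idea (the coupling functions s and t below are toy placeholders, not the paper's architecture): latent Gaussian samples are mapped invertibly to field realizations, with the sparse observations entering as conditioning inputs.

```python
import numpy as np

def coupling_forward(z, y, s, t):
    """One conditional affine coupling layer: invertible in z given y."""
    z1, z2 = np.split(z, 2)
    x2 = z2 * np.exp(s(z1, y)) + t(z1, y)   # scale/shift conditioned on (z1, y)
    return np.concatenate([z1, x2])

# Drawing several diverse realizations for the same fixed observations y:
rng = np.random.default_rng(0)
y = rng.normal(size=8)                      # stand-in for sparse, noisy data
s = lambda a, b: 0.1 * np.tanh(a.mean() + b.mean()) * np.ones_like(a)
t = lambda a, b: np.zeros_like(a)
samples = [coupling_forward(rng.normal(size=16), y, s, t) for _ in range(5)]
```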
Physics-Constrained Deep Learning for High-dimensional Surrogate Modeling and Uncertainty Quantification without Labeled Data
Surrogate modeling and uncertainty quantification tasks for PDE systems are
most often considered as supervised learning problems where input and output
data pairs are used for training. The construction of such emulators is by
definition a small data problem which poses challenges to deep learning
approaches that have been developed to operate in the big data regime. Even in
cases where such models have been shown to have good predictive capability in
high dimensions, they fail to address constraints in the data implied by the
PDE model. This paper provides a methodology that incorporates the governing
equations of the physical model in the loss/likelihood functions. The resulting
physics-constrained, deep learning models are trained without any labeled data
(e.g. employing only input data) and provide comparable predictive responses
with data-driven models while obeying the constraints of the problem at hand.
This work employs a convolutional encoder-decoder neural network approach as
well as a conditional flow-based generative model for the solution of PDEs,
surrogate model construction, and uncertainty quantification tasks. The
methodology is posed as a minimization problem of the reverse Kullback-Leibler
(KL) divergence between the model predictive density and the reference
conditional density, where the latter is defined as the Boltzmann-Gibbs
distribution at a given inverse temperature with the underlying potential
relating to the PDE system of interest. The generalization capability of these
models to out-of-distribution input is considered. Quantification and
interpretation of the predictive uncertainty is provided for a number of
problems. Comment: 51 pages, 18 figures, submitted to the Journal of Computational Physics.
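Schematically (our notation, following the abstract), training minimizes the reverse KL divergence between the model's predictive density and a Boltzmann-Gibbs reference whose potential V encodes the PDE residual, so no labeled solutions are required:

```latex
p_\beta(u \mid x) \propto \exp\!\big(-\beta\, V(u, x)\big), \qquad
\theta^{\ast} = \arg\min_{\theta}\;
\mathrm{KL}\big(p_\theta(u \mid x)\,\|\,p_\beta(u \mid x)\big)
= \arg\min_{\theta}\;
\mathbb{E}_{p_\theta}\!\big[\beta\, V(u, x) + \log p_\theta(u \mid x)\big] + \mathrm{const}
```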