Efficient Gaussian process updating under linear operator data for uncertainty reduction on implicit sets in Bayesian inverse problems
This thesis aims at developing sequential uncertainty reduction techniques for set
estimation in Bayesian inverse problems. Sequential uncertainty reduction (SUR)
strategies provide a statistically principled way of designing data collection plans
that optimally reduce the uncertainty on a given quantity of interest. This thesis
focuses on settings where the quantity of interest is a set implicitly defined
by conditions on some unknown function, and one is only able to observe the values
of linear operators applied to that function. This setting corresponds to the one encountered
in linear inverse problems and proves challenging for SUR techniques.
Indeed, SUR relies on having a probabilistic model for the unknown function under
consideration, and these models become intractable for moderately sized problems.
We start by introducing an implicit representation for covariance matrices of Gaussian
processes (GP) to overcome this limitation, and demonstrate how it allows one
to perform SUR for excursion set estimation in a real-world 3D gravimetric inversion
problem on the Stromboli volcano. We then focus on extending vanilla
SUR to multivariate problems. To that end, we introduce the concept of 'generalized
locations', which allows us to rewrite the co-kriging equations in a form-invariant
way and to derive semi-analytical formulae for multivariate SUR criteria. Those
approaches are demonstrated on a river plume estimation problem. After having
extended SUR for inverse problems to large-scale and multivariate settings, we devote
our attention to improving the realism of the models by including user-defined
trends. We show how this can be done by extending universal kriging to inverse
problems and also provide fast k-fold cross-validation formulae. Finally, in order to
provide a theoretical footing for the developed approaches, we show how the conditional
law of a GP can be seen as a disintegration of a corresponding Gaussian measure
under suitable conditions.
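The conditioning step that such sequential strategies repeatedly apply can be sketched with the standard Gaussian formulas for data observed through a linear operator. The snippet below is a minimal numpy illustration (the function name and interface are ours, not the thesis's); it conditions a GP on observations y = G f + noise:

```python
import numpy as np

def condition_gp_on_linear_data(mean, cov, G, y, noise_var):
    """One conditioning step for a GP observed through a linear operator.

    Standard Gaussian conditioning under data y = G f + noise;
    names and interface are illustrative, not from the thesis.
    """
    m_obs = G.shape[0]
    S = G @ cov @ G.T + noise_var * np.eye(m_obs)   # data covariance
    K = cov @ G.T @ np.linalg.inv(S)                # kriging-type gain
    post_mean = mean + K @ (y - G @ mean)
    post_cov = cov - K @ G @ cov
    return post_mean, post_cov
```

Repeating this update for each new batch of observations is what makes the naive approach intractable at scale: `post_cov` is a dense matrix in the number of grid points, which motivates the implicit representations discussed above.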
Non-Sequential Ensemble Kalman Filtering using Distributed Arrays
This work introduces a new, distributed implementation of the Ensemble Kalman
Filter (EnKF) that allows for non-sequential assimilation of large datasets in
high-dimensional problems. The traditional EnKF algorithm is computationally
intensive and exhibits difficulties in applications requiring interaction with
the background covariance matrix, prompting the use of methods like sequential
assimilation which can introduce unwanted consequences, such as dependency on
observation ordering. Our implementation leverages recent advancements in
distributed computing to enable the construction and use of the full model
error covariance matrix in distributed memory, allowing for single-batch
assimilation of all observations and eliminating order dependencies.
Comparative performance assessments, involving both synthetic and real-world
paleoclimatic reconstruction applications, indicate that the new,
non-sequential implementation outperforms the traditional, sequential one.
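The order-independence claim follows from assimilating all observations in one analysis step. The sketch below shows a generic single-batch stochastic EnKF update (not the paper's distributed implementation; names are illustrative), in which the gain is computed once from the full innovation covariance:

```python
import numpy as np

def enkf_update(X, H, y, R, rng):
    """Single-batch stochastic EnKF analysis step (generic sketch).

    X: (n_state, n_ens) forecast ensemble.  All observations in y are
    assimilated at once, so the analysis does not depend on their order.
    """
    n_ens = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)   # ensemble anomalies
    P = A @ A.T / (n_ens - 1)               # sample forecast covariance
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    # perturbed observations: one noisy copy of y per ensemble member
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=n_ens).T
    return X + K @ (Y - H @ X)
```

In contrast, sequential variants split y into batches and loop this update, which is cheaper in memory but makes the result depend on the batch ordering; holding the full P in distributed memory is what removes that constraint.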
Learning excursion sets of vector-valued Gaussian random fields for autonomous ocean sampling
Improving and optimizing oceanographic sampling is a crucial task for marine
science and maritime resource management. Faced with limited resources for
understanding processes in the water column, the combination of statistics and
autonomous systems provides new opportunities for experimental design. In this
work we develop efficient spatial sampling methods for characterizing regions
defined by simultaneous exceedances above prescribed thresholds of several
responses, with an application focus on mapping coastal ocean phenomena based
on temperature and salinity measurements. Specifically, we define a design
criterion based on uncertainty in the excursions of vector-valued Gaussian
random fields, and derive tractable expressions for the expected integrated
Bernoulli variance reduction in such a framework. We demonstrate how this
criterion can be used to prioritize sampling efforts at locations that are
ambiguous, making exploration more effective. We use simulations to study and
compare properties of the considered approaches, followed by results from field
deployments with an autonomous underwater vehicle as part of a study mapping
the boundary of a river plume. The results demonstrate the potential of
combining statistical methods and robotic platforms to effectively inform and
execute data-driven environmental sampling.
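The pointwise quantity driving such a criterion is the Bernoulli variance p(1-p) of the joint-excursion indicator, which peaks exactly where classification is most ambiguous. A minimal sketch follows; for simplicity it treats the responses (e.g. temperature and salinity) as independent at each location, whereas the paper handles the correlated vector-valued GP case:

```python
import numpy as np
from scipy.stats import norm

def bernoulli_excursion_variance(means, stds, thresholds):
    """Pointwise Bernoulli variance of the joint-excursion indicator.

    means, stds: (n_locations, n_responses) marginal GP predictions.
    Simplifying assumption: responses are independent at each location,
    so the joint excursion probability is a product of marginal tails.
    """
    # marginal exceedance probabilities P(f_j(x) > t_j)
    p_marg = norm.sf((thresholds - means) / stds)
    p = p_marg.prod(axis=1)    # joint excursion probability
    return p * (1.0 - p)       # largest where the excursion is ambiguous
```

Ranking candidate locations by this quantity prioritizes sampling where the excursion boundary is least resolved, which is the behaviour the field deployments exploit.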
Uncertainty Quantification and Experimental Design for Large-Scale Linear Inverse Problems under Gaussian Process Priors
We consider the use of Gaussian process (GP) priors for solving inverse
problems in a Bayesian framework. As is well known, the computational
complexity of GPs scales cubically in the number of datapoints. We here show
that in the context of inverse problems involving integral operators, one faces
additional difficulties that hinder inversion on large grids. Furthermore, in
that context, covariance matrices can become too large to be stored. By
leveraging results about sequential disintegrations of Gaussian measures, we
are able to introduce an implicit representation of posterior covariance
matrices that reduces the memory footprint by only storing low rank
intermediate matrices, while allowing individual elements to be accessed
on-the-fly without needing to build full posterior covariance matrices.
Moreover, it allows for fast sequential inclusion of new observations. These
features are crucial when considering sequential experimental design tasks. We
demonstrate our approach by computing sequential data collection plans for
excursion set recovery for a gravimetric inverse problem, where the goal is to
provide fine resolution estimates of high density regions inside the Stromboli
volcano, Italy. Sequential data collection plans are computed by extending the
weighted integrated variance reduction (wIVR) criterion to inverse problems.
Our results show that this criterion is able to significantly reduce the
uncertainty on the excursion volume, reaching close to minimal levels of
residual uncertainty. Overall, our techniques allow the advantages of
probabilistic models to be brought to bear on large-scale inverse problems
arising in the natural sciences.
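The implicit representation can be sketched as follows: each assimilated batch contributes one slim low-rank factor, and individual posterior covariance entries are reconstructed on demand from the prior plus these factors. The class below is our illustrative sketch of this idea (names and interface are assumptions, not the paper's API), using the standard rank-m downdate K - (K Gᵀ) S⁻¹ (G K) with S = G K Gᵀ + noise:

```python
import numpy as np

class ImplicitPosteriorCov:
    """Posterior covariance stored implicitly as prior minus low-rank terms.

    Illustrative sketch: each batch y = G f + noise contributes one slim
    factor L with L @ L.T = (K G^T) S^{-1} (G K), so single covariance
    entries can be read without forming the full posterior matrix.
    """

    def __init__(self, prior_cov_fn, n):
        self.k = prior_cov_fn   # prior covariance function k(i, j)
        self.n = n              # number of grid points
        self.factors = []       # low-rank factors from past batches

    def element(self, i, j):
        """Posterior covariance entry (i, j), computed on the fly."""
        val = self.k(i, j)
        for L in self.factors:
            val -= L[i] @ L[j]
        return val

    def assimilate(self, G, noise_var):
        """Fold in a batch of linear observations with i.i.d. noise."""
        m = G.shape[0]
        # rows of K G^T, built from current (implicit) covariance entries
        KGt = np.empty((self.n, m))
        for i in range(self.n):
            KGt[i] = G @ np.array([self.element(i, j) for j in range(self.n)])
        S = G @ KGt + noise_var * np.eye(m)            # data covariance
        C = np.linalg.cholesky(S)
        self.factors.append(KGt @ np.linalg.inv(C).T)  # slim (n, m) factor
```

Only the (n, m) factors are ever stored, where m is the batch size; the full n-by-n posterior matrix is never materialized, which is what keeps the memory footprint manageable on large inversion grids and makes sequential inclusion of new observations cheap.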