21 research outputs found
Multifidelity Covariance Estimation via Regression on the Manifold of Symmetric Positive Definite Matrices
We introduce a multifidelity estimator of covariance matrices formulated as
the solution to a regression problem on the manifold of symmetric positive
definite matrices. The estimator is positive definite by construction, and the
Mahalanobis distance minimized to obtain it possesses properties which enable
practical computation. We show that our manifold regression multifidelity
(MRMF) covariance estimator is a maximum likelihood estimator under a certain
error model on manifold tangent space. More broadly, we show that our
Riemannian regression framework encompasses existing multifidelity covariance
estimators constructed from control variates. We demonstrate via numerical
examples that our estimator can provide significant decreases, up to one order
of magnitude, in squared estimation error relative to both single-fidelity and
other multifidelity covariance estimators. Furthermore, preservation of
positive definiteness ensures that our estimator is compatible with downstream
tasks, such as data assimilation and metric learning, in which this property is
essential. Comment: 30 pages + 15-page supplement
Multigrid sequential data assimilation for the large-eddy simulation of a massively separated bluff-body flow
The potential for data-driven applications to scale-resolving simulations of
turbulent flows is assessed herein. Multigrid sequential data assimilation
algorithms have been used to calibrate solvers for Large Eddy Simulation for
the analysis of the high-Reynolds-number flow around a rectangular cylinder of
aspect ratio 5:1. This test case has been chosen because of a number of
physical complexities which elude accurate representation using reduced-order
numerical simulation. The results for the statistical moments of the velocity
and pressure flow field show that the data-driven techniques employed, which
are based on the Ensemble Kalman Filter, are able to significantly improve the
predictive features of the solver for reduced grid resolution. In addition,
despite the sparse and asymmetric distribution of observations in the
data-driven process, the data-augmented results exhibit perfectly symmetric
statistics and significantly improved accuracy, even far from the sensor
location.
Covariance estimation using h-statistics in Monte Carlo and Multilevel Monte Carlo methods
We present novel Monte Carlo (MC) and multilevel Monte Carlo (MLMC) methods
for determining the unbiased covariance of random variables using h-statistics.
The advantage of this procedure lies in the unbiased construction of the
estimator's mean square error in a closed form. This is in contrast to the
conventional MC and MLMC covariance estimators, which are based on biased mean
square errors defined solely by upper bounds, particularly within the MLMC.
Finally, the numerical results of the algorithms are demonstrated by estimating
the covariance of the stochastic response of a simple 1D stochastic elliptic
PDE such as Poisson's model.
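As a minimal sketch of the idea, the lowest-order bivariate h-statistic is the familiar unbiased sample covariance written in terms of power sums. The function name `h11` and the synthetic data below are illustrative assumptions; the abstract does not give the paper's notation or its closed-form error expressions.

```python
import numpy as np

def h11(x, y):
    """Unbiased covariance estimate built from power sums
    (the bivariate h-statistic of order (1, 1))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n < 2 or y.size != n:
        raise ValueError("need two equally sized samples with n >= 2")
    s10 = x.sum()          # power sum of x
    s01 = y.sum()          # power sum of y
    s11 = (x * y).sum()    # mixed power sum
    return (n * s11 - s10 * s01) / (n * (n - 1))

# Hypothetical data: two correlated random variables.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)
estimate = h11(x, y)
```

The estimate agrees with `np.cov(x, y)[0, 1]`, which uses the same `n - 1` normalization; the power-sum form is what extends to higher-order statistics.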
Bayesian Recursive Update for Ensemble Kalman Filters
Few real-world systems are amenable to truly Bayesian filtering;
nonlinearities and non-Gaussian noises can wreak havoc on filters that rely on
linearization and Gaussian uncertainty approximations. This article presents
the Bayesian Recursive Update Filter (BRUF), a Kalman filter that uses a
recursive approach to incorporate information from nonlinear measurements. The
BRUF relaxes the measurement linearity assumption of the Extended Kalman Filter
(EKF) by dividing the measurement update into a user-defined number of steps.
The proposed technique is extended for ensemble filters in the Bayesian
Recursive Update Ensemble Kalman Filter (BRUEnKF). The performance of both
filters is demonstrated in numerical examples, and new filters are introduced
which exploit the theoretical foundation of the BRUF in different ways. A
comparison between the BRUEnKF and Gromov flow, a popular particle flow
algorithm, is presented in detail. Finally, the BRUEnKF is shown to outperform
the EnKF for a very high-dimensional system.
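The sketch below illustrates splitting one measurement update into several EKF-style sub-updates. It assumes each of the `n_steps` sub-updates uses the inflated noise covariance `n_steps * R` and relinearizes at the current estimate; the abstract does not spell out this detail, so treat it as one plausible reading rather than the authors' algorithm. The function name, argument names, and numbers are hypothetical.

```python
import numpy as np

def recursive_update(x, P, z, h, jac, R, n_steps):
    """Split one measurement update into n_steps EKF-style sub-updates.

    Assumption: every sub-update uses the inflated noise n_steps * R and
    relinearizes the measurement model h at the current estimate.
    """
    R_step = n_steps * R
    identity = np.eye(x.size)
    for _ in range(n_steps):
        H = jac(x)                          # Jacobian at the current estimate
        S = H @ P @ H.T + R_step            # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ (z - h(x))
        P = (identity - K @ H) @ P
    return x, P

# Linear measurement used as a sanity check (hypothetical numbers).
H_lin = np.array([[1.0, 0.5]])
h = lambda state: H_lin @ state
jac = lambda state: H_lin
x0 = np.array([0.0, 0.0])
P0 = np.diag([4.0, 1.0])
R = np.array([[0.25]])
z = np.array([1.2])

x_one, P_one = recursive_update(x0, P0, z, h, jac, R, n_steps=1)
x_ten, P_ten = recursive_update(x0, P0, z, h, jac, R, n_steps=10)
```

For a linear measurement the ten-step result matches the single Kalman update, because the information contributed by the sub-updates sums to that of one update with `R`. The two differ only when `h` is nonlinear, which is the case the recursive scheme targets.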
Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry
We introduce a multi-fidelity estimator of covariance matrices that employs
the log-Euclidean geometry of the symmetric positive-definite manifold. The
estimator fuses samples from a hierarchy of data sources of differing
fidelities and costs for variance reduction while guaranteeing definiteness, in
contrast with previous approaches. The new estimator makes covariance
estimation tractable in applications where simulation or data collection is
expensive; to that end, we develop an optimal sample allocation scheme that
minimizes the mean-squared error of the estimator given a fixed budget.
Guaranteed definiteness is crucial to metric learning, data assimilation, and
other downstream tasks. Evaluations of our approach using data from physical
applications (heat conduction, fluid dynamics) demonstrate more accurate metric
learning and speedups of more than one order of magnitude compared to
benchmarks. Comment: To appear at the International Conference on Machine Learning (ICML) 202
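To make the log-Euclidean idea concrete, here is a minimal two-fidelity sketch: sample covariances are combined as a control variate in the matrix-logarithm domain, and the result is mapped back with the matrix exponential, which is positive definite for any symmetric input. The scalar weight `alpha`, the function names, and the synthetic data are assumptions for illustration; the paper's actual estimator and its optimal sample allocation are not given in the abstract.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def sym_exp(A):
    """Matrix exponential of a symmetric matrix (always positive definite)."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def log_euclidean_bifidelity(hi, lo_paired, lo_extra, alpha=1.0):
    """Control-variate combination of sample covariances in the log domain.

    hi, lo_paired: samples from paired high- and low-fidelity evaluations.
    lo_extra: additional cheap low-fidelity samples.
    alpha: scalar control-variate weight (an assumption of this sketch).
    """
    S_hi = np.cov(hi, rowvar=False)
    S_lo_paired = np.cov(lo_paired, rowvar=False)
    S_lo_all = np.cov(np.vstack([lo_paired, lo_extra]), rowvar=False)
    log_est = spd_log(S_hi) + alpha * (spd_log(S_lo_all) - spd_log(S_lo_paired))
    return sym_exp(log_est)

# Hypothetical data: a cheap model that is a noisy copy of the expensive one.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0, 0.0], [0.6, 1.0, 0.0], [0.3, 0.2, 0.5]])
hi = rng.normal(size=(20, 3)) @ A.T
lo_paired = hi + 0.1 * rng.normal(size=hi.shape)
lo_extra = rng.normal(size=(500, 3)) @ A.T + 0.1 * rng.normal(size=(500, 3))
S_mf = log_euclidean_bifidelity(hi, lo_paired, lo_extra)
```

A linear control-variate combination of the covariances themselves can lose definiteness when the correction term is large; working in the log domain avoids that by construction.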
Multilevel assimilation of inverted seismic data
In ensemble-based data assimilation (DA), the ensemble size is usually limited to around one hundred. Straightforward application of ensemble-based DA can therefore result in significant Monte Carlo errors, often manifesting themselves as severe underestimation of parameter uncertainties. Assimilation of large amounts of simultaneous data enhances the negative effects of Monte Carlo errors. Distance-based localization is the conventional remedy for this problem. However, it has its own drawbacks, e.g. not allowing for true long-range correlations and difficulty in assimilation of data which do not have a specific physical location. Use of lower-fidelity models reduces the computational cost per ensemble member and therefore makes it possible to reduce Monte Carlo errors by increasing the ensemble size, but it also adds to the modeling error.
Multilevel data assimilation (MLDA) uses a selection of models forming hierarchies of both computational cost and computational accuracy, and tries to obtain a better balance between Monte Carlo errors and modeling errors.
In this PhD project, several MLDA algorithms were developed and their quality for assimilation of inverted seismic data was assessed in simplistic reservoir problems. Utilization of multilevel models entails introduction of some numerical errors (multilevel modeling error, MLME) to the problem in addition to the already existing numerical errors. Several computationally inexpensive methods were devised for partially accounting for MLME in the context of multilevel data assimilation. They were also investigated in simplistic reservoir history-matching problems. Finally, one of the novel MLDA algorithms was chosen and its performance was assessed in a realistic reservoir history-matching problem. Doctoral thesis
A filtered multilevel Monte Carlo method for estimating the expectation of discretized random fields
We investigate the use of multilevel Monte Carlo (MLMC) methods for
estimating the expectation of discretized random fields. Specifically, we
consider a setting in which the input and output vectors of the numerical
simulators have inconsistent dimensions across the multilevel hierarchy. This
requires the introduction of grid transfer operators borrowed from multigrid
methods. Starting from a simple 1D illustration, we demonstrate numerically
that the resulting MLMC estimator deteriorates the estimation of high-frequency
components of the discretized expectation field compared to a Monte Carlo (MC)
estimator. By adapting mathematical tools initially developed for multigrid
methods, we perform a theoretical spectral analysis of the MLMC estimator of
the expectation of discretized random fields, in the specific case of linear,
symmetric and circulant simulators. This analysis provides a spectral
decomposition of the variance into contributions associated with each scale
component of the discretized field. We then propose improved MLMC estimators
using a filtering mechanism similar to the smoothing process of multigrid
methods. The filtering operators improve the estimation of both the small- and
large-scale components of the variance, resulting in a reduction of the total
variance of the estimator. These improvements are quantified for the specific
class of simulators considered in our spectral analysis. The resulting filtered
MLMC (F-MLMC) estimator is applied to the problem of estimating the discretized
variance field of a diffusion-based covariance operator, which amounts to
estimating the expectation of a discretized random field. The numerical
experiments support the conclusions of the theoretical analysis even with
non-linear simulators, and demonstrate the improvements brought by the proposed
F-MLMC estimator compared to both a crude MC and an unfiltered MLMC estimator.
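The setting described above, in which levels produce vectors of different sizes, can be sketched with a two-level estimator that uses a linear-interpolation prolongation as its grid transfer operator. The toy simulator, the periodic grid, the sample counts, and all function names are assumptions for illustration; the filtering step that defines F-MLMC is not reproduced here.

```python
import numpy as np

def prolongation_matrix(m):
    """Linear interpolation from a periodic m-point grid to a 2m-point grid."""
    P = np.zeros((2 * m, m))
    for j in range(m):
        P[2 * j, j] = 1.0                    # coincident grid points
        P[2 * j + 1, j] = 0.5                # midpoints: average of neighbours
        P[2 * j + 1, (j + 1) % m] = 0.5
    return P

def simulate(xi, n):
    """Toy simulator: random mix of two Fourier modes on an n-point grid."""
    t = np.arange(n) / n
    return (1.0
            + np.outer(xi[:, 0], np.sin(2 * np.pi * t))
            + np.outer(xi[:, 1], np.cos(2 * np.pi * t)))

def two_level_mlmc(n_fine, n_coarse_samples, n_paired_samples, rng):
    """Unfiltered two-level MLMC estimate of the expected fine-grid field."""
    m = n_fine // 2
    P = prolongation_matrix(m)
    xi_coarse = rng.normal(size=(n_coarse_samples, 2))
    xi_paired = rng.normal(size=(n_paired_samples, 2))
    coarse_mean = simulate(xi_coarse, m).mean(axis=0)
    # Correction term: the fine and prolonged coarse outputs share the same inputs.
    correction = (simulate(xi_paired, n_fine)
                  - simulate(xi_paired, m) @ P.T).mean(axis=0)
    return P @ coarse_mean + correction

estimate = two_level_mlmc(16, 2000, 50, np.random.default_rng(1))
```

The exact expectation of this toy field is 1 at every grid point, so the deviation of `estimate` from 1 shows the combined sampling and interpolation error.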
State estimation in nonlinear parametric time dependent systems using Tensor Train
In the present work we propose a reduced-order method to solve the state estimation problem when nonlinear parametric time-dependent systems are at hand. The method is based on the approximation of the set of system solutions by means of a Tensor Train format. The particular structure of Tensor Train makes it possible to set up both a variational and a sequential method. Several numerical experiments are proposed to assess the behaviour of the method.
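For readers unfamiliar with the format, the following sketch compresses a space × time × parameter snapshot tensor into Tensor Train cores using the standard TT-SVD procedure with a rank cap. It only illustrates the representation; the state estimation methods of the paper are not shown, and the snapshot function, grid sizes, and function names are assumptions.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a tensor into Tensor Train cores by sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    r_prev = 1
    C = tensor.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        # Carry the remaining factor forward and expose the next mode.
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract Tensor Train cores back into a full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full[0, ..., 0]

# Hypothetical snapshots u(x, t, mu) = sin(pi x) * exp(-mu t).
x = np.linspace(0.0, 1.0, 8)
t = np.linspace(0.0, 1.0, 6)
mu = np.linspace(1.0, 2.0, 5)
snapshots = (np.sin(np.pi * x)[:, None, None]
             * np.exp(-mu[None, None, :] * t[None, :, None]))
cores = tt_svd(snapshots, max_rank=3)
rel_err = (np.linalg.norm(tt_to_full(cores) - snapshots)
           / np.linalg.norm(snapshots))
```

Each core has shape `(left rank, mode size, right rank)`, so storage grows with the ranks rather than with the product of all mode sizes.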