
    Design and implementation of data assimilation methods based on Cholesky decomposition

    In data assimilation, analyses of a system are obtained by combining a numerical model of the system with observations or measurements of it. These numerical models are typically expressed as sets of ordinary and/or partial differential equations that encapsulate all knowledge about the dynamics and physics of, for instance, the ocean or the atmosphere. We treat numerical forecasts and observations as random variables, so error dynamics can be estimated using Bayes' rule. For the estimation of hyper-parameters in the error distributions, an ensemble of model realizations is employed. In practice, model resolutions are several orders of magnitude larger than ensemble sizes, so sampling errors degrade the quality of analysis corrections; moreover, models can be highly non-linear, and the common Gaussian assumptions on prior errors can break down. To overcome these situations, we replace the prior errors by a mixture of Gaussians, and the intra-cluster precision matrices are estimated by means of the modified Cholesky decomposition. Four methods are proposed: the posterior EnKF in deterministic and stochastic variants, a non-Gaussian method, and an MCMC filter that uses the Bickel-Levina estimator. All are based on the modified Cholesky decomposition and are tested with the Lorenz-96 model. Their implementations are shown to provide solutions equivalent to those of other EnKF methods such as the LETKF and the EnSRF.
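    The core estimator above can be sketched in a few lines. The following is a minimal sketch, not the authors' implementation, of estimating a precision matrix from an ensemble via the modified Cholesky decomposition (regressing each variable on a few local predecessors) and using it in a precision-form analysis step; the localization radius, toy dimensions, and observation setup are all illustrative assumptions.

```python
import numpy as np

def modified_cholesky_precision(X, r=2):
    """X: n x N matrix of ensemble deviations from the mean.
    Returns B^{-1} = L^T D^{-1} L with L unit lower triangular:
    each variable is regressed on at most r local predecessors."""
    n, _ = X.shape
    L = np.eye(n)
    d = np.empty(n)
    d[0] = np.var(X[0], ddof=1)
    for i in range(1, n):
        lo = max(0, i - r)              # localization: nearby predecessors only
        Z = X[lo:i].T                   # N x (i - lo) regressors
        beta, *_ = np.linalg.lstsq(Z, X[i], rcond=None)
        L[i, lo:i] = -beta
        d[i] = np.var(X[i] - Z @ beta, ddof=1)
    return L.T @ np.diag(1.0 / d) @ L

# Toy precision-form analysis step with the estimated B^{-1}:
rng = np.random.default_rng(0)
n, N = 40, 20                           # state dimension >> ensemble size
Xb = rng.standard_normal((n, N))        # stand-in background ensemble
xm = Xb.mean(axis=1)
Binv = modified_cholesky_precision(Xb - xm[:, None])
H = np.eye(n)[::2]                      # observe every other variable
R = 0.01 * np.eye(H.shape[0])
y = H @ xm + 0.1 * rng.standard_normal(H.shape[0])
Ainv = Binv + H.T @ np.linalg.solve(R, H)                 # posterior precision
xa = xm + np.linalg.solve(Ainv, H.T @ np.linalg.solve(R, y - H @ xm))
```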

    Improved tabu search and simulated annealing methods for nonlinear data assimilation

    Nonlinear data assimilation can be a very challenging task. Four local search methods for nonlinear data assimilation are proposed in this paper. The methods work as follows: at each iteration, the observation operator is linearized around the current solution, and a gradient approximation of the three-dimensional variational (3D-Var) cost function is obtained. Samples are then generated along potential steepest-descent directions of the 3D-Var cost function, and the acceptance/rejection criteria for such samples are similar to those of the Tabu Search and Simulated Annealing frameworks. In addition, such samples can be drawn within certain sub-spaces so as to reduce the computational effort of computing search directions. Once a posterior mode is estimated, matrix-free ensemble Kalman filter approaches can be implemented to estimate posterior members. Furthermore, the convergence of the proposed methods is proven under the necessary assumptions and conditions. Numerical experiments have been performed using the Lorenz-96 model. The results show that the cost function values can, on average, be reduced by several orders of magnitude using the proposed methods. Moreover, the methods converge faster to posterior modes when sub-space approximations are employed to reduce the computational effort between iterations.
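    As a rough illustration of the search loop described above (not the paper's code), the sketch below linearizes a toy observation operator, approximates the 3D-Var gradient, draws random steps along the descent direction, and accepts or rejects them with a simulated-annealing rule; the cooling schedule, step distribution, and toy problem are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
xb = np.zeros(n)                       # background state
Binv = np.eye(n)                       # background precision
Rinv = 100.0 * np.eye(n)               # observation precision
y = rng.standard_normal(n)             # toy observations
Hop = lambda x: np.sin(x)              # nonlinear observation operator
Hjac = lambda x: np.diag(np.cos(x))    # its linearization (Jacobian)

def J(x):
    """3D-Var cost: background misfit plus observation misfit."""
    dx, dy = x - xb, y - Hop(x)
    return 0.5 * dx @ Binv @ dx + 0.5 * dy @ Rinv @ dy

x, T = xb.copy(), 1.0
for _ in range(200):
    grad = Binv @ (x - xb) - Hjac(x).T @ Rinv @ (y - Hop(x))
    d = -grad / (np.linalg.norm(grad) + 1e-12)   # steepest-descent direction
    cand = x + abs(rng.normal(scale=0.5)) * d    # random step along d
    dJ = J(cand) - J(x)
    if dJ < 0 or rng.random() < np.exp(-dJ / T): # Metropolis-style acceptance
        x = cand
    T *= 0.98                                    # geometric cooling
```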

    Parameter estimation by implicit sampling

    Implicit sampling is a weighted sampling method used in data assimilation, where one sequentially updates estimates of the state of a stochastic model based on a stream of noisy or incomplete data. Here we describe how to use implicit sampling in parameter estimation problems, where the goal is to find parameters of a numerical model, e.g. a partial differential equation (PDE), such that the output of the numerical model is compatible with (noisy) data. We take the Bayesian approach to parameter estimation, in which a posterior probability density describes the probability of the parameters conditioned on the data, and compute an empirical estimate of this posterior with implicit sampling. Our approach generates independent samples, so that some of the practical difficulties encountered with Markov chain Monte Carlo methods, e.g. burn-in time or correlations among dependent samples, are avoided. We describe a new implementation of implicit sampling for parameter estimation problems that makes use of multiple grids (coarse to fine) and BFGS optimization coupled to adjoint equations for the required gradient calculations. The implementation is "dimension independent", in the sense that a well-defined finite-dimensional subspace is sampled as the mesh used for discretization of the PDE is refined. We illustrate the algorithm with an example in which we estimate a diffusion coefficient in an elliptic equation from sparse and noisy pressure measurements. In the example, dimension/mesh independence is achieved via Karhunen-Loève expansions.
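    A hedged sketch of the implicit sampling recipe with a linear map (one common variant, not necessarily the paper's multigrid/adjoint implementation): minimize the negative log posterior F with BFGS, push Gaussian reference samples through a factor of the inverse Hessian at the mode, and reweight. The one-dimensional toy posterior below is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

def F(theta):
    """Negative log posterior of a toy non-Gaussian problem:
    standard-normal prior on theta, data y = theta**2 + noise."""
    y, sigma = 1.2, 0.3
    return 0.5 * theta[0] ** 2 + 0.5 * ((y - theta[0] ** 2) / sigma) ** 2

opt = minimize(F, x0=[1.0], method="BFGS")   # find the posterior mode (MAP)
mu, Hinv = opt.x, opt.hess_inv               # BFGS inverse-Hessian estimate
Lmap = np.linalg.cholesky(Hinv)              # factor for the linear map

xi = rng.standard_normal((1000, 1))          # Gaussian reference samples
samples = mu + xi @ Lmap.T                   # implicit map: theta = mu + L xi
logw = F(mu) + 0.5 * (xi ** 2).sum(axis=1) - np.array([F(s) for s in samples])
w = np.exp(logw - logw.max())
w /= w.sum()                                 # self-normalized importance weights
posterior_mean = (w[:, None] * samples).sum(axis=0)
```

    Because each sample is drawn independently from the reference Gaussian, there is no burn-in and no chain correlation, which is the practical advantage the abstract contrasts against MCMC.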

    Unconstrained Learning Machines

    With the use of information technology in industry, a new need has arisen: analyzing large-scale data sets and automating data analysis that was once performed by human intuition and simple analog processing machines. The new generation of computer programs now has to outperform its predecessors in detecting complex and non-trivial patterns buried in data warehouses. Improved Machine Learning (ML) techniques such as Neural Networks (NNs) and Support Vector Machines (SVMs) have shown remarkable performance on supervised learning problems over the past couple of decades (e.g. anomaly detection, classification and identification, interpolation and extrapolation).

    Nevertheless, many such techniques have ill-conditioned structures that lack adaptability for processing exotic data or very large amounts of data. Some techniques cannot even process data in an on-line fashion. Furthermore, as the processing power of computers increases, there is a pressing need for ML algorithms to perform supervised learning tasks in less time over ever larger data sets, which means that the time and memory complexities of these algorithms must be improved.

    The aim of this research is to construct an improved type of SVM-like algorithm for tasks such as nonlinear classification and interpolation that is more scalable, error-tolerant, and accurate. Additionally, this family of algorithms must be able to compute solutions in a controlled time, preferably small with respect to modern computational technologies. These new algorithms should also be versatile enough to have useful applications in engineering, meteorology, or quality control.

    This dissertation introduces a family of SVM-based algorithms named Unconstrained Learning Machines (ULMs), which attempt to solve the robustness, scalability, and timing issues of traditional supervised learning algorithms. ULMs are not based on geometrical analogies (e.g. SVMs) or on the replication of biological models (e.g. NNs). Their construction is strictly based on statistical considerations taken from the recently developed statistical learning theory. Like SVMs, ULMs use kernel methods extensively in order to process exotic and/or non-numerical objects stored in databases and to search for hidden patterns in data with tailored measures of similarity.

    ULMs are applied to a variety of problems in manufacturing engineering and in meteorology. The robust nonlinear nonparametric interpolation abilities of ULMs allow for the representation of sub-millimetric deformations on the surface of manufactured parts, the selection of conforming objects, and the diagnosis and modeling of manufacturing processes. ULMs also play a role in assimilating the system states of computational weather models, removing the intrinsic noise without any knowledge of the underlying mathematical models and helping establish more accurate forecasts.
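    The abstract does not spell out the ULM objective, so the following is only a hedged illustration of the general family it gestures at: an SVM-like kernel learner fitted by a single unconstrained regularized least-squares solve (kernel ridge regression) instead of a constrained quadratic program. The RBF kernel and regularization weight are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(100, 1))                  # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)   # noisy target

lam = 1e-2                                             # regularization weight
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)   # one unconstrained solve

Xtest = np.linspace(-3, 3, 50)[:, None]
y_pred = rbf_kernel(Xtest, X) @ alpha                  # nonlinear interpolation
```

    Replacing the constrained QP with a linear solve is what makes the training time controllable: the cost is dominated by one factorization, regardless of how many constraints an SVM formulation would carry.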

    Continuous reservoir model updating by ensemble Kalman filter on Grid computing architectures

    A reservoir engineering Grid computing toolkit, ResGrid, and its extensions were developed and applied to designed reservoir simulation studies and to continuous reservoir model updating. The toolkit gives reservoir engineers high-performance computing capacity for their projects without requiring them to delve into Grid resource heterogeneity, security certification, or network protocols. Continuous, real-time reservoir model updating is an important component of closed-loop model-based reservoir management. The method must rapidly and continuously update reservoir models by assimilating production data, so that performance predictions and the associated uncertainty are up to date for optimization. The ensemble Kalman filter (EnKF), a Bayesian approach to model updating, uses Monte Carlo statistics to fuse observation data with simulation forecasts and estimate a range of plausible models. The ensemble of updated models can be used for uncertainty forecasting or optimization. Grid environments aggregate geographically distributed, heterogeneous resources. Their virtual architecture can handle many large parallel simulation runs and is thus well suited to model-based reservoir management problems. In this study, the ResGrid workflow for Grid-based designed reservoir simulation and an adapted workflow provide tools for building prior model ensembles, farming out and executing tasks, extracting simulator outputs, and implementing the EnKF, with a web portal for invoking those scripts. The ResGrid workflow is demonstrated on a geostatistical study of 3-D displacements in heterogeneous reservoirs. A suite of 1920 simulations assesses the effects of geostatistical methods and model parameters; the runs are executed simultaneously using parallel Grid computing. Flow response analyses indicate that efficient, widely used sequential geostatistical simulation methods may overestimate flow response variability compared to more rigorous but computationally costly direct methods. Although the EnKF has attracted great interest in reservoir engineering, some of its aspects remain poorly understood and are explored in the dissertation. First, guidelines are offered for selecting data assimilation intervals. Second, an adaptive covariance inflation method is shown to be effective in stabilizing the EnKF. Third, we show that simple truncation can correct negative effects of nonlinearity and non-Gaussianity as effectively as more complex and expensive reparameterization methods.
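    Two of the EnKF points raised above, multiplicative covariance inflation and simple truncation back into physical bounds, can be sketched directly. The following is an illustrative perturbed-observation EnKF update, not the dissertation's toolkit; the inflation factor, the [0, 1] saturation-like bounds, and the toy dimensions are assumptions.

```python
import numpy as np

def enkf_update(E, H, R, y, infl=1.05, lo=0.0, hi=1.0, rng=None):
    """E: n x N ensemble; H: m x n observation matrix; R: m x m obs covariance;
    y: m observations. Returns the inflated, updated, truncated ensemble."""
    rng = rng or np.random.default_rng()
    n, N = E.shape
    xm = E.mean(axis=1, keepdims=True)
    E = xm + infl * (E - xm)                       # multiplicative inflation
    A = (E - E.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)
    S = H @ A                                      # observation-space deviations
    K = A @ S.T @ np.linalg.inv(S @ S.T + R)       # ensemble Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, N).T
    E = E + K @ (Y - H @ E)                        # perturbed-observation update
    return np.clip(E, lo, hi)                      # truncation to physical bounds

# Toy usage: 50 grid cells, 30 members, 5 observed cells.
rng = np.random.default_rng(4)
E = np.clip(0.5 + 0.1 * rng.standard_normal((50, 30)), 0, 1)
H = np.eye(50)[::10]
R = 0.01 * np.eye(5)
y = np.clip(0.5 + 0.1 * rng.standard_normal(5), 0, 1)
E_new = enkf_update(E, H, R, y, rng=rng)
```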