495 research outputs found

    Second order adjoints for solving PDE-constrained optimization problems

    Get PDF
    Inverse problems are of utmost importance in many fields of science and engineering. In the variational approach inverse problems are formulated as PDE-constrained optimization problems, where the optimal estimate of the uncertain parameters is the minimizer of a certain cost functional subject to the constraints posed by the model equations. The numerical solution of such optimization problems requires the computation of derivatives of the model output with respect to model parameters. The first order derivatives of a cost functional (defined on the model output) with respect to a large number of model parameters can be calculated efficiently through first order adjoint sensitivity analysis. Second order adjoint models give second derivative information in the form of matrix-vector products between the Hessian of the cost functional and user defined vectors. Traditionally, the construction of second order derivatives for large scale models has been considered too costly. Consequently, data assimilation applications employ optimization algorithms that use only first order derivative information, like nonlinear conjugate gradients and quasi-Newton methods. In this paper we discuss the mathematical foundations of second order adjoint sensitivity analysis and show that it provides an efficient approach to obtain Hessian-vector products. We study the benefits of using of second order information in the numerical optimization process for data assimilation applications. The numerical studies are performed in a twin experiment setting with a two-dimensional shallow water model. Different scenarios are considered with different discretization approaches, observation sets, and noise levels. Optimization algorithms that employ second order derivatives are tested against widely used methods that require only first order derivatives. Conclusions are drawn regarding the potential benefits and the limitations of using high-order information in large scale data assimilation problems

    Parameter estimation by implicit sampling

    Full text link
    Implicit sampling is a weighted sampling method that is used in data assimilation, where one sequentially updates estimates of the state of a stochastic model based on a stream of noisy or incomplete data. Here we describe how to use implicit sampling in parameter estimation problems, where the goal is to find parameters of a numerical model, e.g.~a partial differential equation (PDE), such that the output of the numerical model is compatible with (noisy) data. We use the Bayesian approach to parameter estimation, in which a posterior probability density describes the probability of the parameter conditioned on data and compute an empirical estimate of this posterior with implicit sampling. Our approach generates independent samples, so that some of the practical difficulties one encounters with Markov Chain Monte Carlo methods, e.g.~burn-in time or correlations among dependent samples, are avoided. We describe a new implementation of implicit sampling for parameter estimation problems that makes use of multiple grids (coarse to fine) and BFGS optimization coupled to adjoint equations for the required gradient calculations. The implementation is "dimension independent", in the sense that a well-defined finite dimensional subspace is sampled as the mesh used for discretization of the PDE is refined. We illustrate the algorithm with an example where we estimate a diffusion coefficient in an elliptic equation from sparse and noisy pressure measurements. In the example, dimension\slash mesh-independence is achieved via Karhunen-Lo\`{e}ve expansions

    Space-time adaptive solution of inverse problems with the discrete adjoint method

    Get PDF
    Adaptivity in both space and time has become the norm for solving problems modeled by partial differential equations. The size of the discretized problem makes uniformly refined grids computationally prohibitive. Adaptive refinement of meshes and time steps allows to capture the phenomena of interest while keeping the cost of a simulation tractable on the current hardware. Many fields in science and engineering require the solution of inverse problems where parameters for a given model are estimated based on available measurement information. In contrast to forward (regular) simulations, inverse problems have not extensively benefited from the adaptive solver technology. Previous research in inverse problems has focused mainly on the continuous approach to calculate sensitivities, and has typically employed fixed time and space meshes in the solution process. Inverse problem solvers that make exclusive use of uniform or static meshes avoid complications such as the differentiation of mesh motion equations, or inconsistencies in the sensitivity equations between subdomains with different refinement levels. However, this comes at the cost of low computational efficiency. More efficient computations are possible through judicious use of adaptive mesh refinement, adaptive time steps, and the discrete adjoint method. This paper develops a framework for the construction and analysis of discrete adjoint sensitivities in the context of time dependent, adaptive grid, adaptive step models. Discrete adjoints are attractive in practice since they can be generated with low effort using automatic differentiation. However, this approach brings several important challenges. The adjoint of the forward numerical scheme may be inconsistent with the continuous adjoint equations. A reduction in accuracy of the discrete adjoint sensitivities may appear due to the intergrid transfer operators. Moreover, the optimization algorithm may need to accommodate state and gradient vectors whose dimensions change between iterations. This work shows that several of these potential issues can be avoided for the discontinuous Galerkin (DG) method. The adjoint model development is considerably simplified by decoupling the adaptive mesh refinement mechanism from the forward model solver, and by selectively applying automatic differentiation on individual algorithms. In forward models discontinuous Galerkin discretizations can efficiently handle high orders of accuracy, h/ph/p-refinement, and parallel computation. The analysis reveals that this approach, paired with Runge Kutta time stepping, is well suited for the adaptive solutions of inverse problems. The usefulness of discrete discontinuous Galerkin adjoints is illustrated on a two-dimensional adaptive data assimilation problem

    Mean-field optimal control and optimality conditions in the space of probability measures

    Get PDF
    We derive a framework to compute optimal controls for problems with states in the space of probability measures. Since many optimal control problems constrained by a system of ordinary differential equations (ODE) modelling interacting particles converge to optimal control problems constrained by a partial differential equation (PDE) in the mean-field limit, it is interesting to have a calculus directly on the mesoscopic level of probability measures which allows us to derive the corresponding first-order optimality system. In addition to this new calculus, we provide relations for the resulting system to the first-order optimality system derived on the particle level, and the first-order optimality system based on L2L^2-calculus under additional regularity assumptions. We further justify the use of the L2L^2-adjoint in numerical simulations by establishing a link between the adjoint in the space of probability measures and the adjoint corresponding to L2L^2-calculus. Moreover, we prove a convergence rate for the convergence of the optimal controls corresponding to the particle formulation to the optimal controls of the mean-field problem as the number of particles tends to infinity
    • …
    corecore