Second order adjoints for solving PDE-constrained optimization problems
Inverse problems are of utmost importance in many fields of science and engineering. In the
variational approach inverse problems are formulated as PDE-constrained optimization problems,
where the optimal estimate of the uncertain parameters is the minimizer of a certain cost
functional subject to the constraints posed by the model equations. The numerical solution
of such optimization problems requires the computation of derivatives of the model output
with respect to model parameters. The first order derivatives of a cost functional (defined
on the model output) with respect to a large number of model parameters can be calculated
efficiently through first order adjoint sensitivity analysis. Second order adjoint models
give second derivative information in the form of matrix-vector products between the Hessian
of the cost functional and user defined vectors. Traditionally, the construction of second
order derivatives for large scale models has been considered too costly. Consequently, data
assimilation applications employ optimization algorithms that use only first order derivative
information, like nonlinear conjugate gradients and quasi-Newton methods.
In this paper we discuss the mathematical foundations of second order adjoint sensitivity
analysis and show that it provides an efficient approach to obtain Hessian-vector products. We
study the benefits of using second order information in the numerical optimization process
for data assimilation applications. The numerical studies are performed in a twin experiment
setting with a two-dimensional shallow water model. Different scenarios are considered with
different discretization approaches, observation sets, and noise levels. Optimization algorithms
that employ second order derivatives are tested against widely used methods that require
only first order derivatives. Conclusions are drawn regarding the potential benefits and the
limitations of using high-order information in large scale data assimilation problems.
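The way a second order adjoint yields a Hessian-vector product can be sketched on a toy problem. The following is a minimal illustration, not the paper's shallow water setup: the model `x_{k+1} = x_k - DT * x_k**2`, the cost on the final state only, and all names (`forward`, `gradient`, `hessian_vector`, `DT`, `N_STEPS`) are assumptions made for the example. One tangent linear sweep followed by one backward sweep gives `H v` without forming the Hessian.

```python
import numpy as np

DT = 0.1        # time step (illustrative value)
N_STEPS = 20    # number of time steps (illustrative value)


def forward(x0):
    """Integrate x_{k+1} = x_k - DT * x_k**2 and return the whole trajectory."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(N_STEPS):
        xs.append(xs[-1] - DT * xs[-1] ** 2)
    return xs


def cost(x0, y):
    """Cost J(x0) = 0.5 * ||x_N - y||^2 on the final state only."""
    return 0.5 * np.sum((forward(x0)[-1] - y) ** 2)


def gradient(x0, y):
    """First order adjoint: one backward sweep gives dJ/dx0."""
    xs = forward(x0)
    lam = xs[-1] - y                        # lambda_N
    for x in reversed(xs[:-1]):
        lam = (1.0 - 2.0 * DT * x) * lam    # lambda_k = M'(x_k)^T lambda_{k+1}
    return lam


def hessian_vector(x0, y, v):
    """Second order adjoint: returns (d^2 J / dx0^2) v."""
    xs = forward(x0)
    # Tangent linear model: propagate the perturbation v forward in time.
    dxs = [np.asarray(v, dtype=float)]
    for x in xs[:-1]:
        dxs.append((1.0 - 2.0 * DT * x) * dxs[-1])
    lam = xs[-1] - y                        # first order adjoint at step N
    sig = dxs[-1]                           # second order adjoint at step N
    for x, dx in zip(reversed(xs[:-1]), reversed(dxs[:-1])):
        # sigma_k = M'(x_k)^T sigma_{k+1} + (M''(x_k) dx_k)^T lambda_{k+1}
        sig = (1.0 - 2.0 * DT * x) * sig - 2.0 * DT * dx * lam
        lam = (1.0 - 2.0 * DT * x) * lam
    return sig
```

A finite difference of `gradient` along `v` is a convenient check on `hessian_vector`. A Newton-type or truncated-Newton optimizer would call `hessian_vector` inside its inner linear solve.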
Parameter estimation by implicit sampling
Implicit sampling is a weighted sampling method that is used in data
assimilation, where one sequentially updates estimates of the state of a
stochastic model based on a stream of noisy or incomplete data. Here we
describe how to use implicit sampling in parameter estimation problems, where
the goal is to find parameters of a numerical model, e.g. a partial
differential equation (PDE), such that the output of the numerical model is
compatible with (noisy) data. We use the Bayesian approach to parameter
estimation, in which a posterior probability density describes the probability
of the parameter conditioned on data, and compute an empirical estimate of this
posterior with implicit sampling. Our approach generates independent samples,
so that some of the practical difficulties one encounters with Markov Chain
Monte Carlo methods, e.g. burn-in time or correlations among dependent samples,
are avoided. We describe a new implementation of implicit sampling for
parameter estimation problems that makes use of multiple grids (coarse to fine)
and BFGS optimization coupled to adjoint equations for the required gradient
calculations. The implementation is "dimension independent", in the sense that
a well-defined finite dimensional subspace is sampled as the mesh used for
discretization of the PDE is refined. We illustrate the algorithm with an
example where we estimate a diffusion coefficient in an elliptic equation from
sparse and noisy pressure measurements. In the example,
dimension/mesh-independence is achieved via Karhunen-Loève expansions.
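The role of the Karhunen-Loève expansion in mesh independence can be illustrated with a small sketch. It is not the paper's implementation: the one-dimensional domain [0, 1], the exponential covariance, the midpoint-rule discretization, and the names `kl_modes` and `field_from_coefficients` are all assumptions chosen for the example. The point is that the unknown is the fixed-length coefficient vector `xi`, whatever the mesh resolution.

```python
import numpy as np


def kl_modes(n_cells, corr_len, n_modes):
    """Leading Karhunen-Loeve pairs of an exponential covariance on [0, 1].

    The covariance C(x, x') = exp(-|x - x'| / corr_len) is an assumed prior.
    The integral eigenproblem is discretized with the midpoint rule on
    n_cells equal cells.
    """
    h = 1.0 / n_cells
    x = (np.arange(n_cells) + 0.5) * h                 # cell centres
    cov = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
    vals, vecs = np.linalg.eigh(cov * h)               # symmetric eigenproblem
    order = np.argsort(vals)[::-1][:n_modes]           # largest eigenvalues first
    # Rescale so each mode has unit L2 norm on [0, 1]: sum(phi**2) * h == 1.
    return x, vals[order], vecs[:, order] / np.sqrt(h)


def field_from_coefficients(vals, modes, xi):
    """Map n_modes coefficients xi to a field on the mesh behind `modes`."""
    return modes @ (np.sqrt(vals) * xi)
```

Calling `kl_modes(50, 0.3, 5)` and `kl_modes(200, 0.3, 5)` gives nearly the same eigenvalues, so a sampler working on the five coefficients sees essentially the same problem on both meshes. The eigenvector signs returned by `np.linalg.eigh` are arbitrary, so fields on different meshes should be compared only after fixing a sign convention.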
Space-time adaptive solution of inverse problems with the discrete adjoint method
Adaptivity in both space and time has become the norm for solving problems modeled by partial differential equations. The size of the discretized problem makes uniformly refined grids computationally prohibitive. Adaptive refinement of meshes and time steps makes it possible to capture the phenomena of interest while keeping the cost of a simulation tractable on current hardware. Many fields in science and engineering require the solution of inverse problems where parameters for a given model are estimated based on available measurement information. In contrast to forward (regular) simulations, inverse problems have not extensively benefited from the adaptive solver technology. Previous research in inverse problems has focused mainly on the continuous approach to calculate sensitivities, and has typically employed fixed time and space meshes in the solution process. Inverse problem solvers that make exclusive use of uniform or static meshes avoid complications such as the differentiation of mesh motion equations, or inconsistencies in the sensitivity equations between subdomains with different refinement levels. However, this comes at the cost of low computational efficiency. More efficient computations are possible through judicious use of adaptive mesh refinement, adaptive time steps, and the discrete adjoint method.
This paper develops a framework for the construction and analysis of discrete adjoint sensitivities in the context of time dependent, adaptive grid, adaptive step models. Discrete adjoints are attractive in practice since they can be generated with low effort using automatic differentiation. However, this approach brings several important challenges. The adjoint of the forward numerical scheme may be inconsistent with the continuous adjoint equations. A reduction in accuracy of the discrete adjoint sensitivities may appear due to the intergrid transfer operators. Moreover, the optimization algorithm may need to accommodate state and gradient vectors whose dimensions change between iterations. This work shows that several of these potential issues can be avoided for the discontinuous Galerkin (DG) method. The adjoint model development is considerably simplified by decoupling the adaptive mesh refinement mechanism from the forward model solver, and by selectively applying automatic differentiation on individual algorithms.
In forward models discontinuous Galerkin discretizations can efficiently handle high orders of accuracy, $hp$-refinement, and parallel computation. The analysis reveals that this approach, paired with Runge-Kutta time stepping, is well suited for the adaptive solution of inverse problems. The usefulness of discrete discontinuous Galerkin adjoints is illustrated on a two-dimensional adaptive data assimilation problem.
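A discrete adjoint of a Runge-Kutta scheme with non-uniform time steps can be sketched on a scalar test problem. This is only an illustration under stated assumptions: the ODE `x' = -a * x`, Heun's second order method, the cost on the final state, and the function names are chosen for the example, and the step sizes are treated as fixed data (the step-size controller is not differentiated). No spatial mesh adaptation is involved.

```python
def heun_factor(a, h):
    """One Heun (RK2) step for x' = -a * x multiplies x by this factor."""
    return 1.0 - a * h + 0.5 * (a * h) ** 2


def forward(a, x0, steps):
    """Forward trajectory over a sequence of (possibly unequal) step sizes."""
    xs = [x0]
    for h in steps:
        xs.append(heun_factor(a, h) * xs[-1])
    return xs


def cost(a, x0, steps, y):
    """Cost J(a) = 0.5 * (x_N - y)^2 on the final state."""
    return 0.5 * (forward(a, x0, steps)[-1] - y) ** 2


def discrete_adjoint_gradient(a, x0, steps, y):
    """dJ/da from the adjoint of the discrete scheme (step sizes frozen)."""
    xs = forward(a, x0, steps)
    lam = xs[-1] - y                          # adjoint variable at the final step
    grad = 0.0
    for x, h in zip(reversed(xs[:-1]), reversed(steps)):
        grad += lam * x * (-h + a * h ** 2)   # d(heun_factor)/da times x_k
        lam = heun_factor(a, h) * lam         # adjoint of the step itself
    return grad
```

Because the adjoint is taken of the discrete scheme itself, the result matches a finite difference of `cost` up to rounding for any step sequence, for example `steps = [0.05, 0.1, 0.2, 0.1, 0.025]`.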
Mean-field optimal control and optimality conditions in the space of probability measures
We derive a framework to compute optimal controls for problems with states in
the space of probability measures. Since many optimal control problems
constrained by a system of ordinary differential equations (ODE) modelling
interacting particles converge to optimal control problems constrained by a
partial differential equation (PDE) in the mean-field limit, it is interesting
to have a calculus directly on the mesoscopic level of probability measures
which allows us to derive the corresponding first-order optimality system. In
addition to this new calculus, we provide relations for the resulting system to
the first-order optimality system derived on the particle level, and the
first-order optimality system based on $L^2$-calculus under additional
regularity assumptions. We further justify the use of the $L^2$-adjoint in
numerical simulations by establishing a link between the adjoint in the space
of probability measures and the adjoint corresponding to $L^2$-calculus.
Moreover, we prove a convergence rate for the convergence of the optimal
controls corresponding to the particle formulation to the optimal controls of
the mean-field problem as the number of particles tends to infinity.