Robust Optimization of PDEs with Random Coefficients Using a Multilevel Monte Carlo Method
This paper addresses optimization problems constrained by partial
differential equations with uncertain coefficients. In particular, the robust
control problem and the average control problem are considered for a tracking
type cost functional with an additional penalty on the variance of the state.
The expressions for the gradient and Hessian corresponding to either problem
contain expected value operators. Due to the large number of uncertainties
considered in our model, we suggest to evaluate these expectations using a
multilevel Monte Carlo (MLMC) method. Under mild assumptions, it is shown that
this results in the gradient and Hessian corresponding to the MLMC estimator of
the original cost functional. Furthermore, we show that the use of certain
correlated samples yields a reduction in the total number of samples required.
Two optimization methods are investigated: the nonlinear conjugate gradient
method and the Newton method. For both, a specific algorithm is provided that
dynamically decides which and how many samples should be taken in each
iteration. The cost of the optimization up to some specified tolerance τ
is shown to be proportional to the cost of a gradient evaluation with requested
root mean square error τ. The algorithms are tested on a model elliptic
diffusion problem with lognormal diffusion coefficient. An additional nonlinear
term is also considered.
Comment: This work was presented at the IMG 2016 conference (Dec 5 - Dec 9,
2016), at the Copper Mountain conference (Mar 26 - Mar 30, 2017), and at the
FrontUQ conference (Sept 5 - Sept 8, 2017).
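The multilevel Monte Carlo idea mentioned in the abstract can be illustrated with a telescoping sum: the expectation on the finest level is written as the coarsest-level expectation plus a sum of level-to-level corrections, and each term is estimated with its own number of samples. The sketch below is only a toy illustration, not the authors' algorithm. The function `q_level` (a truncated Taylor series of exp(z)) is a hypothetical stand-in for a PDE solve on a given mesh level, and the sample counts are chosen by hand rather than dynamically.

```python
import math
import numpy as np

def q_level(z, level):
    """Toy level-`level` approximation of Q(z) = exp(z).

    Hypothetical stand-in for a PDE solve on mesh `level`:
    a Taylor series truncated after 2 * (level + 1) terms.
    """
    n_terms = 2 * (level + 1)
    return sum(z**k / math.factorial(k) for k in range(n_terms))

def mlmc_estimate(samples_per_level, rng):
    """Estimate E[Q_L] as E[Q_0] + sum_l E[Q_l - Q_{l-1}]."""
    estimate = 0.0
    for level, n in enumerate(samples_per_level):
        z = 0.5 * rng.standard_normal(n)  # random input, z ~ N(0, 0.25)
        fine = q_level(z, level)
        if level == 0:
            estimate += fine.mean()
        else:
            # The SAME samples z are used on both levels, so the
            # difference has small variance and needs few samples.
            coarse = q_level(z, level - 1)
            estimate += (fine - coarse).mean()
    return estimate

rng = np.random.default_rng(0)
# Many cheap coarse samples, few expensive fine ones.
approx = mlmc_estimate([20000, 5000, 1000, 200, 50], rng)
exact = math.exp(0.125)  # E[exp(z)] for z ~ N(0, 0.25)
```

Because the corrections shrink on finer levels, most samples can be spent on the cheap coarse level, which is the source of the cost savings.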
Interior-point methods for PDE-constrained optimization
In applied sciences PDEs model an extensive variety of phenomena. Typically the final goal of simulations is a system which is optimal in a certain sense. For instance optimal control problems identify a control to steer a system towards a desired state. Inverse problems seek PDE parameters which are most consistent with measurements. In these optimization problems PDEs appear as equality constraints. PDE-constrained optimization problems are large-scale and often nonconvex. Their numerical solution leads to large ill-conditioned linear systems. In many practical problems inequality constraints implement technical limitations or prior knowledge.
In this thesis interior-point (IP) methods are considered to solve nonconvex large-scale PDE-constrained optimization problems with inequality constraints. To cope with enormous fill-in of direct linear solvers, inexact search directions are allowed in an inexact interior-point (IIP) method. This thesis builds upon the IIP method proposed in [Curtis, Schenk, Wächter, SIAM Journal on Scientific Computing, 2010]. SMART tests cope with the lack of inertia information to control Hessian modification and also specify termination tests for the iterative linear solver.
The original IIP method needs to solve two sparse large-scale linear systems in each optimization step. This is improved to only a single linear system solution in most optimization steps. Within this improved IIP framework, two iterative linear solvers are evaluated: A general purpose algebraic multilevel incomplete LDL^T preconditioned SQMR method is applied to PDE-constrained optimization problems for optimal server room cooling in three space dimensions and to compute an ambient temperature for optimal cooling. The results show robustness and efficiency of the IIP method when compared with the exact IP method.
These advantages are even more evident for a reduced-space preconditioned (RSP) GMRES solver which takes advantage of the linear system's structure. This RSP-IIP method is studied on the basis of distributed and boundary control problems originating from superconductivity and from two-dimensional and three-dimensional parameter estimation problems in groundwater modeling. The numerical results exhibit the improved efficiency especially for multiple PDE constraints.
An inverse medium problem for the Helmholtz equation with pointwise box constraints is solved by IP methods. The ill-posedness of the problem is explored numerically and different regularization strategies are compared. The impact of box constraints and the importance of Hessian modification for the optimization algorithm are demonstrated. A real world seismic imaging problem is solved successfully by the RSP-IIP method.
Second order adjoints for solving PDE-constrained optimization problems
Inverse problems are of utmost importance in many fields of science and engineering. In the
variational approach inverse problems are formulated as PDE-constrained optimization problems,
where the optimal estimate of the uncertain parameters is the minimizer of a certain cost
functional subject to the constraints posed by the model equations. The numerical solution
of such optimization problems requires the computation of derivatives of the model output
with respect to model parameters. The first order derivatives of a cost functional (defined
on the model output) with respect to a large number of model parameters can be calculated
efficiently through first order adjoint sensitivity analysis. Second order adjoint models
give second derivative information in the form of matrix-vector products between the Hessian
of the cost functional and user defined vectors. Traditionally, the construction of second
order derivatives for large scale models has been considered too costly. Consequently, data
assimilation applications employ optimization algorithms that use only first order derivative
information, like nonlinear conjugate gradients and quasi-Newton methods.
In this paper we discuss the mathematical foundations of second order adjoint sensitivity
analysis and show that it provides an efficient approach to obtain Hessian-vector products. We
study the benefits of using second order information in the numerical optimization process
for data assimilation applications. The numerical studies are performed in a twin experiment
setting with a two-dimensional shallow water model. Different scenarios are considered with
different discretization approaches, observation sets, and noise levels. Optimization algorithms
that employ second order derivatives are tested against widely used methods that require
only first order derivatives. Conclusions are drawn regarding the potential benefits and the
limitations of using high-order information in large scale data assimilation problems.
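To make the notion of a Hessian-vector product concrete, the following is a minimal sketch on a toy least-squares cost J(x) = 0.5 * ||A x - b||^2, which is an assumption made for illustration and not the shallow water model of the paper. Here the gradient is A^T (A x - b) and the Hessian-vector product is A^T (A v), so each product needs one "forward" and one "adjoint" (transpose) application and the Hessian is never formed. The function names are hypothetical.

```python
import numpy as np

def gradient(A, b, x):
    """First order information: gradient of 0.5 * ||A x - b||^2."""
    return A.T @ (A @ x - b)

def hessian_vector_product(A, v):
    """Second order information as a matrix-free product H v = A^T (A v)."""
    return A.T @ (A @ v)

def newton_cg_step(A, b, x, max_iter=50, tol=1e-18):
    """One Newton step: solve H p = -g by conjugate gradients,
    touching the Hessian only through Hessian-vector products."""
    g = gradient(A, b, x)
    p = np.zeros_like(x)
    r = -g.copy()            # residual of H p = -g at p = 0
    d = r.copy()             # search direction
    rs = r @ r
    for _ in range(max_iter):
        if rs < tol:
            break
        Hd = hessian_vector_product(A, d)
        alpha = rs / (d @ Hd)
        p += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x + p

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 4))
b = rng.standard_normal(8)
x_new = newton_cg_step(A, b, np.zeros(4))
```

For this quadratic cost a single Newton step reaches the minimizer, whereas a first order method would need several iterations. For nonlinear models the same structure applies, with the second order adjoint model supplying the product.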
Hybrid Analog-Digital Co-Processing for Scientific Computation
In the past 10 years computer architecture research has moved to more heterogeneity and less adherence to conventional abstractions. Scientists and engineers hold an unshakable belief that computing holds keys to unlocking humanity's Grand Challenges. Acting on that belief they have looked deeper into computer architecture to find specialized support for their applications. Likewise, computer architects have looked deeper into circuits and devices in search of untapped performance and efficiency. The lines between computer architecture layers---applications, algorithms, architectures, microarchitectures, circuits and devices---have blurred. Against this backdrop, a menagerie of computer architectures are on the horizon, ones that forgo basic assumptions about computer hardware, and require new thinking of how such hardware supports problems and algorithms.
This thesis is about revisiting hybrid analog-digital computing in support of diverse modern workloads. Hybrid computing had extensive applications in early computing history, and has been revisited for small-scale applications in embedded systems. But architectural support for using hybrid computing in modern workloads, at scale and with high accuracy solutions, has been lacking.
I demonstrate solving a variety of scientific computing problems, including stochastic ODEs, partial differential equations, linear algebra, and nonlinear systems of equations, as case studies in hybrid computing. I solve these problems on a system of multiple prototype analog accelerator chips built by a team at Columbia University. On that team I made contributions toward programming the chips, building the digital interface, and validating the chips' functionality. The analog accelerator chip is intended for use in conjunction with a conventional digital host computer.
The appeal and motivation for using an analog accelerator is efficiency and performance, but it comes with limitations in accuracy and problem sizes that we have to work around.
The first problem is how to pose problems in this unconventional computation model. Scientific computing phrases problems as differential equations and algebraic equations. Differential equations are a continuous view of the world, while algebraic equations are a discrete one. Prior work in analog computing focused mostly on differential equations, with algebraic equations playing only a minor role. The secret to using the analog accelerator to support modern workloads on conventional computers is that these two viewpoints are interchangeable. The algebraic equations that underlie most workloads can be solved as differential equations, and differential equations are naturally solvable in the analog accelerator chip. A hybrid analog-digital computer architecture can focus on solving linear and nonlinear algebra problems to support many workloads.
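One way to see how an algebraic equation can be solved as a differential equation: for a symmetric positive definite matrix A, the solution of A x = b is the steady state of dx/dt = b - A x. The sketch below is an illustrative assumption, not the thesis's implementation. An analog accelerator would let x evolve in continuous time, while this code only emulates that evolution digitally with forward Euler steps. The function name and the step size rule are hypothetical choices.

```python
import numpy as np

def solve_by_gradient_flow(A, b, steps=200):
    """Approximate the solution of A x = b (A symmetric positive definite)
    by integrating dx/dt = b - A x toward its steady state."""
    # Forward Euler is stable for 0 < dt < 2 / lambda_max(A);
    # dt = 1 / ||A||_2 satisfies this for SPD matrices.
    dt = 1.0 / np.linalg.norm(A, 2)
    x = np.zeros_like(b, dtype=float)
    for _ in range(steps):
        x = x + dt * (b - A @ x)   # at steady state, b - A x = 0
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = solve_by_gradient_flow(A, b)
```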
The second problem is how to get accurate solutions using hybrid analog-digital computing. The analog computation model gives less accurate solutions because it gives up representing numbers as digital binary numbers and instead uses the full range of analog voltage and current to represent real numbers. Prior work has established that encoding data in analog signals gives an energy efficiency advantage as long as the analog data precision is limited. While the analog accelerator alone may be useful for energy-constrained applications where inputs and outputs are imprecise, we are more interested in using analog in conjunction with digital for precise solutions. This thesis offers the novel insight that the way to do so is to solve nonlinear problems, where low-precision guesses are useful to conventional digital algorithms.
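The role of a low-precision guess can be sketched with Newton's method, which converges quickly once it starts close to a root. In the toy example below the nonlinear system, the function names, and the seed value are all assumptions made for illustration. The seed `analog_seed` is a hypothetical accelerator readout with about three fractional bits of precision, and it is not measured data.

```python
import numpy as np

def residual(v):
    """Toy nonlinear system: x^2 + y^2 = 4 and x * y = 1."""
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x * y - 1.0])

def jacobian(v):
    x, y = v
    return np.array([[2.0 * x, 2.0 * y], [y, x]])

def newton(v0, tol=1e-12, max_iter=20):
    """Digital Newton iteration; returns the solution and the step count."""
    v = np.array(v0, dtype=float)
    for it in range(max_iter):
        f = residual(v)
        if np.linalg.norm(f) < tol:
            return v, it
        v = v - np.linalg.solve(jacobian(v), f)
    return v, max_iter

# Hypothetical low-precision analog readout (multiples of 1/8).
analog_seed = np.array([1.875, 0.5])
root, iterations = newton(analog_seed)
```

The digital iteration supplies the final precision, and the coarse seed only has to land in the region where Newton's method converges rapidly.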
The third problem is how to solve large problems using hybrid analog-digital computing. The analog computation model cannot handle large problems because it gives up step-by-step discrete-time operation, instead allowing variables to evolve smoothly in continuous time. To make that happen the analog accelerator works by chaining hardware for mathematical operations end-to-end. During computation analog data flows through the hardware with no overheads in control logic and memory accesses. The downside is that the hardware needed grows with the problem size. While scientific computing researchers have for a long time split large problems into smaller subproblems to fit in digital computer constraints, this thesis is a first attempt to consider these divide-and-conquer algorithms as an essential tool in using the analog model of computation.
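As one possible illustration of splitting a large problem into subproblems of fixed size, the sketch below uses a block Jacobi iteration for a linear system. This particular method is an assumption chosen for simplicity and is not claimed to be the algorithm used in the thesis. Each small block solve, done here with `np.linalg.solve`, stands in for a subproblem small enough to fit on an accelerator.

```python
import numpy as np

def block_jacobi(A, b, block_size, sweeps=100):
    """Solve A x = b by repeatedly solving small diagonal-block subproblems."""
    n = b.size
    x = np.zeros(n)
    blocks = [slice(i, min(i + block_size, n)) for i in range(0, n, block_size)]
    for _ in range(sweeps):
        x_new = x.copy()
        for s in blocks:
            # Move the coupling to other blocks to the right-hand side,
            # using values from the previous sweep.
            rhs = b[s] - A[s, :] @ x + A[s, s] @ x[s]
            # Small subproblem: a stand-in for one accelerator-sized solve.
            x_new[s] = np.linalg.solve(A[s, s], rhs)
        x = x_new
    return x

# Strictly diagonally dominant tridiagonal test matrix, so the iteration converges.
n = 8
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = block_jacobi(A, b, block_size=2)
```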
As we enter the post-Moore’s law era of computing, unconventional architectures will offer specialized models of computation that uniquely support specific problem types. Two prominent examples are deep neural networks and quantum computers. Recent trends in computer science research show these unconventional architectures will soon have broad adoption. In this thesis I show that another specialized, unconventional architecture is the use of analog accelerators to solve problems in scientific computing. Computer architecture researchers will discover other important models of computation in the future. This thesis is an example of the discovery process, implementation, and evaluation of how an unconventional architecture supports specialized workloads.
An Efficient Parallel-in-Time Method for Optimization with Parabolic PDEs
To solve optimization problems with parabolic PDE constraints, methods that
work on the reduced objective functional are often used. They are computationally
expensive due to the necessity of solving both the state equation and a
backward-in-time adjoint equation to evaluate the reduced gradient in each
iteration of the optimization method. In this study, we investigate the use of
the parallel-in-time method PFASST in the setting of PDE constrained
optimization. In order to develop an efficient fully time-parallel algorithm we
discuss different options for applying PFASST to adjoint gradient computation,
including the possibility of doing PFASST iterations on both the state and
adjoint equations simultaneously. We also explore the additional gains in
efficiency from reusing information from previous optimization iterations when
solving each equation. Numerical results for both a linear and a non-linear
reaction-diffusion optimal control problem demonstrate the parallel speedup and
efficiency of different approaches.
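The cost structure described above, one forward state solve followed by one backward-in-time adjoint solve per reduced gradient, can be sketched on a small linear model problem. The code below is a serial toy illustration under stated assumptions and does not implement PFASST. It uses the state equation y' = -A y + u discretized by explicit Euler, a tracking cost with weight `alpha` on the control, and hypothetical function names.

```python
import numpy as np

def forward(A, y0, u, dt):
    """State solve: y[k+1] = y[k] + dt * (-A y[k] + u[k])."""
    N = u.shape[0]
    y = np.zeros((N + 1, y0.size))
    y[0] = y0
    for k in range(N):
        y[k + 1] = y[k] + dt * (-A @ y[k] + u[k])
    return y

def cost(A, y0, u, yd, dt, alpha):
    """Discrete tracking cost plus control penalty."""
    y = forward(A, y0, u, dt)
    return 0.5 * dt * np.sum((y[1:] - yd) ** 2) + 0.5 * alpha * dt * np.sum(u ** 2)

def reduced_gradient(A, y0, u, yd, dt, alpha):
    """Gradient of the cost with respect to u via the discrete adjoint."""
    y = forward(A, y0, u, dt)            # forward-in-time sweep
    N = u.shape[0]
    p = np.zeros_like(y)
    p[N] = dt * (y[N] - yd)              # terminal condition
    for k in range(N - 1, 0, -1):        # backward-in-time sweep
        p[k] = p[k + 1] - dt * (A.T @ p[k + 1]) + dt * (y[k] - yd)
    return dt * (alpha * u + p[1:])      # entry k pairs u[k] with p[k+1]

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
y0 = np.array([1.0, 0.0])
yd = np.array([0.5, 0.5])
rng = np.random.default_rng(2)
u = rng.standard_normal((10, 2))
g = reduced_gradient(A, y0, u, yd, dt=0.05, alpha=0.1)
```

The two loops run sequentially in time, which is the bottleneck that a parallel-in-time method aims to remove.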