Second order adjoints for solving PDE-constrained optimization problems
Inverse problems are of utmost importance in many fields of science and engineering. In the
variational approach inverse problems are formulated as PDE-constrained optimization problems,
where the optimal estimate of the uncertain parameters is the minimizer of a certain cost
functional subject to the constraints posed by the model equations. The numerical solution
of such optimization problems requires the computation of derivatives of the model output
with respect to model parameters. The first order derivatives of a cost functional (defined
on the model output) with respect to a large number of model parameters can be calculated
efficiently through first order adjoint sensitivity analysis. Second order adjoint models
give second derivative information in the form of matrix-vector products between the Hessian
of the cost functional and user defined vectors. Traditionally, the construction of second
order derivatives for large scale models has been considered too costly. Consequently, data
assimilation applications employ optimization algorithms that use only first order derivative
information, like nonlinear conjugate gradients and quasi-Newton methods.
In this paper we discuss the mathematical foundations of second order adjoint sensitivity
analysis and show that it provides an efficient approach to obtain Hessian-vector products. We
study the benefits of using second order information in the numerical optimization process
for data assimilation applications. The numerical studies are performed in a twin experiment
setting with a two-dimensional shallow water model. Different scenarios are considered with
different discretization approaches, observation sets, and noise levels. Optimization algorithms
that employ second order derivatives are tested against widely used methods that require
only first order derivatives. Conclusions are drawn regarding the potential benefits and the
limitations of using high-order information in large scale data assimilation problems.
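The Hessian-vector products described above can be sketched on a toy quadratic cost (an assumption for illustration, not the paper's shallow water model): a second order adjoint model returns the product exactly, while the finite difference of first order gradients below approximates it.

```python
import numpy as np

# Toy quadratic cost standing in for the data assimilation cost functional
# (an illustrative assumption -- not the paper's shallow water model).
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M.T @ M + n * np.eye(n)      # symmetric positive definite Hessian
b = rng.standard_normal(n)

def grad_J(u):
    """First order adjoint analogue: gradient of J(u) = 0.5 u^T A u - b^T u."""
    return A @ u - b

def hessian_vector_product(u, v, eps=1e-6):
    """Approximate H(u) @ v by differencing first order gradients; a second
    order adjoint model delivers this product exactly at comparable cost."""
    return (grad_J(u + eps * v) - grad_J(u - eps * v)) / (2 * eps)

u = rng.standard_normal(n)
v = rng.standard_normal(n)
print(np.allclose(hessian_vector_product(u, v), A @ v))  # Hessian is A here
```

Because only gradient evaluations are needed, the cost of one Hessian-vector product is a small multiple of one gradient, which is what makes truncated-Newton-type methods feasible at scale.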
Probabilistic Interpretation of Linear Solvers
This manuscript proposes a probabilistic framework for algorithms that
iteratively solve unconstrained linear problems Bx = b with positive definite
B for x. The goal is to replace the point estimates returned by existing
methods with a Gaussian posterior belief over the elements of the inverse of
B, which can be used to estimate errors. Recent probabilistic interpretations
of the secant family of quasi-Newton optimization algorithms are extended.
Combined with properties of the conjugate gradient algorithm, this leads to
uncertainty-calibrated methods with very limited cost overhead over conjugate
gradients, a self-contained novel interpretation of the quasi-Newton and
conjugate gradient algorithms, and a foundation for new nonlinear optimization
methods. Comment: final version, in press at SIAM J Optimization.
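For reference, a minimal sketch of the plain conjugate gradient iteration whose point estimates the proposed framework augments (standard textbook CG, not the paper's probabilistic variant):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Plain conjugate gradient for A x = b with symmetric positive definite
    A -- the point-estimate solver the probabilistic framework wraps with a
    Gaussian posterior over the inverse matrix."""
    n = len(b)
    max_iter = max_iter or n
    x = np.zeros(n)
    r = b - A @ x            # residual
    p = r.copy()             # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # new A-conjugate direction
        rs = rs_new
    return x

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M.T @ M + 4 * np.eye(4)   # SPD test matrix
b = rng.standard_normal(4)
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b))
```

In exact arithmetic CG terminates in at most n iterations; the probabilistic reading reinterprets the quantities it already computes as updates to a posterior belief, which is why the cost overhead can stay small.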
The divergence of the BFGS and Gauss Newton Methods
We present examples of divergence for the BFGS and Gauss-Newton methods.
These examples have objective functions with bounded level sets and share
other properties with the examples published recently in this journal, like
unit steps and convexity along the search lines. As in those examples, the
iterates, function values, and gradients in the new examples fit into the
general formulation in our previous work "On the divergence of line search
methods", Comput. Appl. Math. vol. 26, no. 1 (2007), which also presents an
example of divergence for Newton's method. Comment: This article was accepted by Mathematical Programming.
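For context, a minimal sketch of the standard BFGS inverse-Hessian update and iteration, run here on a benign convex quadratic with exact line search, where it converges; the paper's contribution is constructing objectives on which such good behavior fails.

```python
import numpy as np

def bfgs_update(H, s, y):
    """Standard BFGS update of the inverse Hessian approximation H, given
    the step s = x_{k+1} - x_k and gradient change y = g_{k+1} - g_k
    (well defined when the curvature condition s^T y > 0 holds)."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# Benign setting (an assumption for illustration): strictly convex quadratic
# J(x) = 0.5 x^T A x with minimizer at the origin, exact line searches.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad = lambda x: A @ x
x = np.array([1.0, -1.0])
H = np.eye(2)
for _ in range(3):
    g = grad(x)
    if np.linalg.norm(g) < 1e-12:
        break
    d = -H @ g                            # quasi-Newton direction
    alpha = -(g @ d) / (d @ (A @ d))      # exact line search on the quadratic
    x_new = x + alpha * d
    s, y = x_new - x, grad(x_new) - g
    H = bfgs_update(H, s, y)
    x = x_new
print(np.linalg.norm(x) < 1e-8)
```

With exact line searches on a quadratic, BFGS started from the identity terminates in at most n iterations; the divergence examples show that no comparable global guarantee holds for general objectives.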
Implicit particle methods and their connection with variational data assimilation
The implicit particle filter is a sequential Monte Carlo method for data
assimilation that guides the particles to the high-probability regions via a
sequence of steps that includes minimizations. We present a new and more
general derivation of this approach and extend the method to particle smoothing
as well as to data assimilation for perfect models. We show that the
minimizations required by implicit particle methods are similar to the ones one
encounters in variational data assimilation and explore the connection of
implicit particle methods with variational data assimilation. In particular, we
argue that existing variational codes can be converted into implicit particle
methods at a low cost, often yielding better estimates that are also equipped
with quantitative measures of the uncertainty. A detailed example is presented.
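The minimizations referred to above can be illustrated in the linear-Gaussian case, where the variational cost is quadratic and its minimizer has a closed form. This is a toy sketch with assumed covariances B and R and a random observation operator, not the paper's code:

```python
import numpy as np

# Linear-Gaussian variational data assimilation sketch: implicit particle
# methods perform minimizations of essentially this form per particle.
rng = np.random.default_rng(2)
n, m = 3, 2
B = np.eye(n)                      # background error covariance (assumed)
R = 0.1 * np.eye(m)                # observation error covariance (assumed)
H = rng.standard_normal((m, n))    # linear observation operator (assumed)
x_b = rng.standard_normal(n)       # background state
y = H @ x_b + 0.1 * rng.standard_normal(m)   # synthetic observations

def cost(x):
    """Variational cost: background term plus observation misfit."""
    dx, dy = x - x_b, y - H @ x
    return 0.5 * dx @ np.linalg.solve(B, dx) + 0.5 * dy @ np.linalg.solve(R, dy)

# In this linear-Gaussian case the minimizer solves the normal equations,
# so no iterative optimizer is needed.
Binv, Rinv = np.linalg.inv(B), np.linalg.inv(R)
x_a = np.linalg.solve(Binv + H.T @ Rinv @ H, Binv @ x_b + H.T @ Rinv @ y)
g = Binv @ (x_a - x_b) - H.T @ Rinv @ (y - H @ x_a)   # gradient at x_a
print(np.linalg.norm(g) < 1e-10)   # first order optimality holds
print(cost(x_a) <= cost(x_b))      # analysis improves on the background
```

For nonlinear models the cost is no longer quadratic and the minimization must be done iteratively, which is exactly the machinery an existing variational code already provides.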
Do optimization methods in deep learning applications matter?
With advances in deep learning, exponential data growth, and increasing model
complexity, developing efficient optimization methods is attracting much
research attention. Several implementations favor the use of Conjugate Gradient
(CG) and Stochastic Gradient Descent (SGD) as practical and elegant
solutions to achieve quick convergence; however, these optimization processes
also present many limitations in learning across deep learning applications.
Recent research is exploring higher-order optimization functions as better
approaches, but these present very complex computational challenges for
practical use. Comparing first- and higher-order optimization functions, in this
paper our experiments reveal that Levenberg-Marquardt (LM) converges
significantly faster but suffers from very large processing times,
increasing the training complexity of both classification and reinforcement
learning problems. Our experiments compare off-the-shelf optimization
functions (CG, SGD, LM, and L-BFGS) on standard CIFAR, MNIST, CartPole, and
FlappyBird experiments. The paper presents arguments on which optimization
functions to use and, further, which functions would benefit from
parallelization efforts to improve pretraining time and learning-rate
convergence.
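For reference, a minimal sketch of the Levenberg-Marquardt iteration on a toy curve-fitting problem (not the paper's deep learning experiments): the damping factor interpolates between a Gauss-Newton step and a small gradient-descent step.

```python
import numpy as np

# Levenberg-Marquardt sketch on an assumed toy model y = a * exp(b * t),
# fit to noise-free synthetic data -- illustration only.
t = np.linspace(0.0, 1.0, 20)
a_true, b_true = 2.0, -1.5
y = a_true * np.exp(b_true * t)

def residuals(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    e = np.exp(b * t)
    return np.column_stack([e, a * t * e])   # d r/d a, d r/d b

p, lam = np.array([1.0, -1.0]), 1e-3
for _ in range(100):
    r, J = residuals(p), jacobian(p)
    # Damped normal equations: lam -> 0 gives Gauss-Newton,
    # lam -> infinity gives a short gradient-descent step.
    step = np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ r)
    p_new = p + step
    if np.sum(residuals(p_new) ** 2) < np.sum(r ** 2):
        p, lam = p_new, lam * 0.5      # accept step, trust the model more
    else:
        lam *= 2.0                     # reject step, damp harder
print(np.allclose(p, [a_true, b_true], atol=1e-4))
```

The per-step linear solve is what makes LM expensive as the parameter count grows, consistent with the processing-time penalty the experiments report.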