Sharing storage using dirty vectors
Consider a computation F with n inputs (independent variables) and m outputs (dependent variables), and suppose that we wish to evaluate the Jacobian of F. Automatic differentiation commonly performs this evaluation by associating vector storage either with the program variables (in the case of forward-mode automatic differentiation) or with the adjoint variables (in the case of reverse mode). Each vector component contains a partial derivative with respect to an independent variable, or a partial derivative of a dependent variable, respectively. The vectors may be full vectors, or they may be dynamically managed sparse data structures. In either case, many of these vectors will be scalar multiples of one another. For example, any intermediate variable produced by a unary operation in the forward mode will have a derivative vector that is a multiple of the derivative vector of its argument. Any computational graph node that is read just once during its lifetime will have an adjoint vector that is a multiple of the adjoint of the node that reads it. It is frequently wasteful to perform these component multiplications explicitly. A scalar multiple of another vector can instead be replaced by a single multiplicative "scale factor" together with a pointer to the other vector. Automated use of this "dirty vector" technique can save considerable memory management overhead and dramatically reduce the number of floating-point operations required. In particular, dirty vectors often allow shared threads of computation to be reverse-accumulated cheaply. The mechanism permits a number of generalizations, some of which give efficient techniques for preaccumulation.
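As an illustration of the scale-factor-plus-pointer idea (a minimal sketch, not the paper's implementation; the class names `DerivVector` and `DirtyVector` are hypothetical):

```python
import math

class DerivVector:
    """A plain derivative vector: one partial per independent variable."""
    def __init__(self, components):
        self.components = list(components)

    def value(self):
        return list(self.components)

class DirtyVector:
    """A scalar multiple of another vector, stored as a scale factor plus a
    pointer to the base vector instead of an explicit component-wise product."""
    def __init__(self, scale, base):
        self.scale = scale
        self.base = base  # a DerivVector or another DirtyVector

    def value(self):
        # Materialize only on demand; nested DirtyVectors compose their scales.
        return [self.scale * c for c in self.base.value()]

# Forward mode: y = sin(x) gives dy = cos(x) * dx, so y's derivative vector
# is a scalar multiple of x's -- share the storage instead of copying it.
x_val = 0.5
dx = DerivVector([1.0, 0.0, 0.0])      # x is the first of 3 independents
dy = DirtyVector(math.cos(x_val), dx)  # no per-component multiplication yet
dz = DirtyVector(2.0, dy)              # z = 2*y: the scale factors compose
```

The component multiplications happen only if and when a full vector is actually demanded (e.g. by a binary operation), which is where the floating-point savings come from.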
The divergence of the BFGS and Gauss-Newton methods
We present examples of divergence for the BFGS and Gauss-Newton methods.
These examples have objective functions with bounded level sets and share
other properties with examples published recently in this journal, such as
unit steps and convexity along the search lines. As with those examples, the
iterates, function values and gradients in the new examples fit into the
general formulation in our previous work {\it On the divergence of line search
methods, Comput. Appl. Math. vol. 26, no. 1 (2007)}, which also presents an
example of divergence for Newton's method.

Comment: This article was accepted by Mathematical Programming
Fast derivatives of likelihood functionals for ODE-based models using the adjoint-state method
We consider time series data modeled by ordinary differential equations
(ODEs), widespread models in physics, chemistry, biology and science in
general. The sensitivity analysis of such dynamical systems usually requires
calculation of various derivatives with respect to the model parameters.
We employ the adjoint state method (ASM) for efficient computation of the
first and the second derivatives of likelihood functionals constrained by ODEs
with respect to the parameters of the underlying ODE model. Essentially, the
gradient can be computed at a cost (measured in model evaluations) that is
independent of the number of ODE model parameters, and the Hessian at a cost
that is linear, rather than quadratic, in the number of parameters. The
sensitivity analysis thus remains feasible even when the parameter space is
high-dimensional.
The main contributions are derivation and rigorous analysis of the ASM in the
statistical context, when the discrete data are coupled with the continuous ODE
model. Further, we present a highly optimized implementation of the results and
its benchmarks on a number of problems.
The results are directly applicable in, e.g., maximum-likelihood estimation
or Bayesian sampling of ODE-based statistical models, allowing for faster, more
stable estimation of the parameters of the underlying ODE model.

Comment: 5 figures
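To make the cost argument concrete, here is a minimal sketch of a discrete adjoint sweep, assuming a forward-Euler discretization of the scalar ODE dx/dt = -theta*x and a sum-of-squares misfit; the function names and setup are illustrative, not the paper's optimized implementation:

```python
def simulate(theta, x0, h, n_steps):
    """Forward Euler for dx/dt = -theta * x; returns the whole trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] * (1.0 - h * theta))
    return xs

def loss_and_grad(theta, x0, data, h):
    """Sum-of-squares misfit J and dJ/dtheta via the discrete adjoint:
    one forward sweep plus one backward sweep, whatever the parameter count."""
    n = len(data) - 1
    xs = simulate(theta, x0, h, n)
    J = 0.5 * sum((x - d) ** 2 for x, d in zip(xs, data))
    lam = xs[n] - data[n]          # adjoint state: lam_k = dJ/dx_k
    grad = 0.0
    for k in range(n - 1, -1, -1):
        grad += lam * (-h * xs[k])                  # d x_{k+1} / d theta term
        lam = (xs[k] - data[k]) + (1.0 - h * theta) * lam
    return J, grad

# Check the adjoint gradient against a central finite difference.
x0, h, n = 2.0, 0.1, 20
data = simulate(0.5, x0, h, n)     # synthetic observations from theta = 0.5
J, g = loss_and_grad(0.8, x0, data, h)
eps = 1e-6
fd = (loss_and_grad(0.8 + eps, x0, data, h)[0]
      - loss_and_grad(0.8 - eps, x0, data, h)[0]) / (2 * eps)
```

The backward sweep reuses the stored trajectory, so the gradient costs one extra pass over the time grid regardless of how many parameters the model has.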
Scalable Rejection Sampling for Bayesian Hierarchical Models
Bayesian hierarchical modeling is a popular approach to capturing unobserved
heterogeneity across individual units. However, standard estimation methods
such as Markov chain Monte Carlo (MCMC) can be impracticable for modeling
outcomes from a large number of units. We develop a new method to sample from
posterior distributions of Bayesian models, without using MCMC. Samples are
independent, so they can be collected in parallel, and we do not need to be
concerned with issues like chain convergence and autocorrelation. The algorithm
is scalable under the weak assumption that individual units are conditionally
independent, making it applicable to large datasets. It can also be used to
compute marginal likelihoods.
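A minimal sketch of the underlying idea, assuming a simple one-parameter model rather than the paper's hierarchical setting (all names here are hypothetical): propose from the prior and accept with probability proportional to the likelihood, bounded above by a constant M, so that every accepted draw is an exact, independent posterior sample:

```python
import math
import random

def rejection_sample_posterior(log_lik, sample_prior, log_M, n_samples, rng):
    """Exact i.i.d. posterior samples: propose from the prior and accept with
    probability exp(log_lik(theta) - log_M), where log_M bounds log_lik."""
    samples = []
    while len(samples) < n_samples:
        theta = sample_prior(rng)
        if rng.random() < math.exp(log_lik(theta) - log_M):
            samples.append(theta)
    return samples

# Toy single-unit example: 7 heads in 10 coin flips, uniform prior on p.
def log_lik(p):
    if p <= 0.0 or p >= 1.0:
        return float("-inf")     # zero likelihood at the boundary
    return 7 * math.log(p) + 3 * math.log(1 - p)

log_M = log_lik(0.7)             # likelihood is maximized at p = 7/10
rng = random.Random(0)
samples = rejection_sample_posterior(
    log_lik, lambda r: r.random(), log_M, 4000, rng)
post_mean = sum(samples) / len(samples)   # true posterior Beta(8,4), mean 2/3
```

Because each accepted sample is independent of every other, the proposal loop can be split across workers with no concern for chain convergence or autocorrelation.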