
    Automatic Differentiation of Algorithms for Machine Learning

    Automatic differentiation---the mechanical transformation of numeric computer programs to calculate derivatives efficiently and accurately---dates to the origin of the computer age. Reverse mode automatic differentiation both antedates and generalizes the method of backwards propagation of errors used in machine learning. Despite this, practitioners in a variety of fields, including machine learning, have been little influenced by automatic differentiation, and make scant use of available tools. Here we review the technique of automatic differentiation, describe its two main modes, and explain how it can benefit machine learning practitioners. To reach the widest possible audience our treatment assumes only elementary differential calculus, and does not assume any knowledge of linear algebra.
    Comment: 7 pages, 1 figure
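    The two main modes the paper reviews are forward and reverse accumulation. As a concrete companion, here is a minimal sketch of forward-mode AD via dual numbers; it is ours rather than the authors', and the Dual class and example function f are illustrative names only.

```python
# A minimal sketch of forward-mode automatic differentiation using dual
# numbers: each value carries its derivative, and arithmetic propagates both.

class Dual:
    """Pairs a value with its derivative with respect to the input."""
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def f(x):
    return x * x + 3 * x + 1   # f'(x) = 2x + 3

x = Dual(2.0, 1.0)             # seed the derivative dx/dx = 1
y = f(x)
print(y.value, y.deriv)        # 11.0 7.0  (f(2) = 11, f'(2) = 7)
```

    One forward evaluation yields the value and one directional derivative; reverse mode, by contrast, yields all input derivatives in one backward pass.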

    Fast derivatives of likelihood functionals for ODE based models using adjoint-state method

    We consider time series data modeled by ordinary differential equations (ODEs), widespread models in physics, chemistry, biology, and science in general. The sensitivity analysis of such dynamical systems usually requires the calculation of various derivatives with respect to the model parameters. We employ the adjoint-state method (ASM) for efficient computation of the first and second derivatives of likelihood functionals constrained by ODEs with respect to the parameters of the underlying ODE model. Essentially, the gradient can be computed at a cost (measured in model evaluations) that is independent of the number of ODE model parameters, and the Hessian at a cost that is linear, rather than quadratic, in the number of parameters. Sensitivity analysis thus becomes feasible even if the parameter space is high-dimensional. The main contributions are the derivation and rigorous analysis of the ASM in the statistical context, where discrete data are coupled with a continuous ODE model. Further, we present a highly optimized implementation of the results and benchmark it on a number of problems. The results are directly applicable, for example, in maximum-likelihood estimation or Bayesian sampling of ODE-based statistical models, allowing for faster, more stable estimation of the parameters of the underlying ODE model.
    Comment: 5 figures
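    To make the cost argument concrete, the following is a minimal sketch of the adjoint idea, not the paper's implementation: a discrete adjoint of forward Euler for the toy model dx/dt = -theta*x with a least-squares misfit against data. All names and the model are ours; one forward and one backward sweep yield the gradient, and the backward sweep's cost does not grow with the number of parameters.

```python
import numpy as np

def forward(theta, x0, h, n_steps):
    """Forward Euler trajectory x_0, ..., x_N for dx/dt = -theta * x."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] - h * theta * x[k]
    return x

def loss_and_grad(theta, x0, h, y):
    """Least-squares misfit and dL/dtheta via a backward (adjoint) sweep."""
    n = len(y) - 1
    x = forward(theta, x0, h, n)
    loss = np.sum((x - y) ** 2)

    grad = 0.0
    lam = 2.0 * (x[n] - y[n])            # adjoint of the final state
    for k in range(n - 1, -1, -1):
        grad += lam * (-h * x[k])        # d x_{k+1} / d theta contribution
        lam = lam * (1.0 - h * theta) + 2.0 * (x[k] - y[k])
    return loss, grad

# Synthetic data from theta = 0.5; check the adjoint gradient at theta = 0.3
# against a finite-difference estimate.
h, n = 0.01, 100
y = forward(0.5, 1.0, h, n)
L, g = loss_and_grad(0.3, 1.0, h, y)
eps = 1e-6
fd = (loss_and_grad(0.3 + eps, 1.0, h, y)[0] - L) / eps
print(g, fd)   # the two gradient estimates should agree closely
```

    With m parameters, the same single backward sweep would accumulate all m components of the gradient, which is the cost independence the abstract describes.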

    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.
    Comment: 43 pages, 5 figures
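    As a concrete companion to the survey's definitions, here is a minimal sketch, assuming nothing from the paper itself, of reverse-mode AD with an explicit evaluation tape, the construction that generalizes backpropagation. The Var class and backward function are illustrative names only.

```python
# A minimal tape-based reverse-mode AD: evaluation records every
# intermediate on a tape; one reverse sweep accumulates all gradients.

tape = []  # intermediates in evaluation order (a topological order)

class Var:
    def __init__(self, value, parents=()):
        self.value, self.parents, self.grad = value, parents, 0.0
        tape.append(self)

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        # Local partials: d(uv)/du = v, d(uv)/dv = u
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def backward(out):
    """Accumulate d(out)/d(node) into each node's .grad."""
    out.grad = 1.0
    for node in reversed(tape):
        for parent, local in node.parents:
            parent.grad += local * node.grad

x, y = Var(3.0), Var(4.0)
z = x * y + x           # dz/dx = y + 1 = 5, dz/dy = x = 3
backward(z)
print(x.grad, y.grad)   # 5.0 3.0
```

    Unlike symbolic differentiation, no closed-form expression for the derivative is ever built; unlike numerical differentiation, the result is exact up to floating-point rounding.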

    Structural model updating using vibration measurements

    A multi-objective optimization framework is presented for updating finite element models of structures based on vibration measurements. The method results in multiple Pareto optimal structural models that are consistent with the measured data and the residuals used to measure the discrepancies between the measured and the finite element model predicted characteristics. The relation between the multi-objective identification method, the Bayesian inference method, and conventional single-objective weighted-residuals methods for model updating is discussed. Computational algorithms for the efficient and reliable solution of the resulting optimization problems are presented. The algorithms are classified into gradient-based, evolutionary strategies and hybrid techniques. In particular, efficient algorithms are introduced for reducing the computational cost involved in estimating the gradients of the objective functions representing the modal residuals. Specifically, a formulation requiring the solution of the adjoint problem is presented, avoiding the explicit estimation of the gradients of the modal characteristics. The adjoint method is also extended to carry out efficiently the estimation of the Hessian of the objective function. The computational cost for estimating the gradients and the Hessian is shown to be independent of the number of structural model parameters. The methodology is particularly efficient for systems with a large number of model parameters and a large number of DOFs, where repeated gradient and Hessian evaluations are computationally time consuming. Component mode synthesis methods, dividing the structure into linear substructural components with fixed properties and linear substructural components with uncertain properties, are incorporated into the methodology to further reduce the computational effort required in the optimization problems. The linear substructures with fixed properties are represented by their lower contributing modes, which remain unchanged during the model updating process. The method is particularly effective for finite element models with a large number of DOFs and for parameter estimation in localized areas of a structure. Theoretical and computational developments are illustrated by updating finite element models of a laboratory building using impact hammer measurements and of multi-span reinforced concrete bridges using ambient vibration measurements.
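    For readers unfamiliar with the conventional single-objective weighted-residuals baseline mentioned above, the following is a minimal sketch on a toy two-DOF spring-mass chain. It is entirely ours, omits the paper's adjoint-based gradients and component mode synthesis, and all matrices and parameter values are invented for illustration.

```python
# Toy model updating: tune stiffnesses k1, k2 so that the predicted
# eigenvalues (squared natural frequencies) match "measured" modal data.

import numpy as np
from scipy.linalg import eigh
from scipy.optimize import minimize

M = np.diag([1.0, 1.0])                    # known mass matrix

def eigvals(k):
    """Eigenvalues of the generalized problem K @ phi = lam * M @ phi."""
    k1, k2 = k
    K = np.array([[k1 + k2, -k2],
                  [-k2,      k2]])
    return eigh(K, M, eigvals_only=True)

lam_meas = eigvals([2.0, 1.0])             # synthetic "measured" data

def modal_residual(k):
    # Weighted squared residual between predicted and measured eigenvalues.
    return np.sum(((eigvals(k) - lam_meas) / lam_meas) ** 2)

res = minimize(modal_residual, x0=[1.0, 0.5], method="Nelder-Mead")
print(res.x)   # should recover stiffnesses close to [2.0, 1.0]
```

    The paper's contribution is precisely what this sketch lacks: adjoint-based gradients and Hessians whose cost does not grow with the number of parameters, and substructuring to cut the cost of each model evaluation.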

    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD) is a technique for calculating derivatives of numeric functions expressed as computer programs efficiently and accurately, used in fields such as computational fluid dynamics, nuclear engineering, and atmospheric sciences. Despite its advantages and use in other fields, machine learning practitioners have been little influenced by AD and make scant use of available tools. We survey the intersection of AD and machine learning, cover applications where AD has the potential to make a big impact, and report on some recent developments in the adoption of this technique. We aim to dispel some misconceptions that we contend have impeded the use of AD within the machine learning community.