    Sparse Automatic Differentiation for Large-Scale Computations Using Abstract Elementary Algebra

    Most numerical solvers and libraries nowadays are implemented to use mathematical models created with language-specific built-in data types (e.g., real in Fortran or double in C) and their respective elementary algebra implementations. However, built-in elementary algebra typically has limited functionality and often restricts the flexibility of the mathematical models and the types of analysis that can be applied to them. To overcome this limitation, a number of domain-specific languages with more feature-rich built-in data types have been proposed. In this paper, we argue that if numerical libraries and solvers are designed to use abstract elementary algebra rather than language-specific built-in algebra, modern mainstream languages can be as effective as any domain-specific language. We illustrate our ideas using the example of sparse Jacobian matrix computation. We implement an automatic differentiation method that takes advantage of sparse system structures and is straightforward to parallelize in an MPI setting. Furthermore, we show that the computational cost scales linearly with the size of the system. (Comment: Submitted to ACM Transactions on Mathematical Software.)
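
    A minimal sketch of the idea the abstract describes (all names and types here are illustrative, not the paper's actual code): the model is written generically over its scalar type, so the same source works with the built-in double or with an AD type swapped in. Here a tiny forward-mode dual number extracts one Jacobian column of a tridiagonal system, whose sparsity is what makes linear scaling plausible.

        #include <array>
        #include <cstdio>

        // Tiny forward-mode scalar: a value plus one derivative component.
        struct Dual {
            double v;   // value
            double d;   // derivative w.r.t. the chosen seed variable
        };
        Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }
        Dual operator-(Dual a, Dual b) { return {a.v - b.v, a.d - b.d}; }

        // Model written against "abstract" algebra: any T with * and - works.
        template <typename T, std::size_t N>
        std::array<T, N> residual(const std::array<T, N>& x) {
            std::array<T, N> f{};
            for (std::size_t i = 0; i < N; ++i) {
                T left  = (i > 0)     ? x[i - 1] : T{};
                T right = (i + 1 < N) ? x[i + 1] : T{};
                f[i] = x[i] * x[i] - left - right;   // tridiagonal coupling
            }
            return f;
        }

        int main() {
            constexpr std::size_t N = 5;
            std::array<Dual, N> x;
            for (std::size_t i = 0; i < N; ++i) x[i] = {double(i + 1), 0.0};
            x[2].d = 1.0;                  // seed: differentiate w.r.t. x[2]
            auto f = residual(x);
            // Sparsity: only f[1], f[2], f[3] depend on x[2], so this Jacobian
            // column is sparse; columns for structurally independent inputs
            // could be computed together, one route to linear scaling.
            for (std::size_t i = 0; i < N; ++i)
                std::printf("dF[%zu]/dx[2] = %g\n", i, f[i].d);
        }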

    Getting Started with ADOL-C

    The C++ package ADOL-C described in this paper facilitates the evaluation of first and higher derivatives of vector functions that are defined by computer programs written in C or C++. The numerical values of derivative vectors are obtained free of truncation errors, at a small multiple of the run time and a fixed small multiple of the random access memory required by the given function evaluation program. Derivative matrices are obtained by columns, by rows, or in sparse format. This tutorial describes the source code modifications required for the application of ADOL-C, the most frequently used drivers to evaluate derivatives, and some recent developments.
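
    As a rough illustration of the source code modifications the tutorial refers to, the usual ADOL-C pattern is to retype the active variables as adouble, record the evaluation on a tape between trace_on and trace_off, and then call a driver such as gradient. A minimal sketch from memory, not taken from the tutorial; consult it for the authoritative headers and build flags:

        #include <adolc/adolc.h>
        #include <cstdio>

        int main() {
            const short tag = 1;           // tape identifier
            const int n = 2;
            double xp[2] = {1.0, 2.0}, yp;

            trace_on(tag);                 // start recording on tape `tag`
            adouble x[2], y;
            x[0] <<= xp[0];                // <<= marks independent variables
            x[1] <<= xp[1];
            y = x[0] * x[0] * sin(x[1]);   // evaluated with overloaded adoubles
            y >>= yp;                      // >>= marks the dependent variable
            trace_off();

            double g[2];
            gradient(tag, n, xp, g);       // driver evaluates the gradient from the tape
            std::printf("g = (%g, %g)\n", g[0], g[1]);  // (2*x0*sin(x1), x0^2*cos(x1))
            return 0;
        }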

    Advanced Concepts for Automatic Differentiation based on Operator Overloading

    Using the technique of automatic differentiation (AD), derivative information can be computed efficiently, and with little effort from the user, for any function that is given as source code in a supported programming language. One implementation strategy is based on the concept of operator overloading, which is available in many modern programming languages. Through the overloading of operators and functions, an internal representation of the function (the so-called tape) is generated at runtime; this tape can then be used for computing derivatives. In the thesis, new techniques are introduced that allow more efficient tape creation and the parallel evaluation of tapes. The advantages of the new techniques are demonstrated by means of runtime analyses for numerical examples.
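
    To make the taping idea concrete, here is a self-contained toy sketch (not code from the thesis) of reverse-mode AD via operator overloading: each overloaded operation appends a node with its local partial derivatives to a global tape, and a single reverse sweep over the tape accumulates all adjoints.

        #include <cmath>
        #include <cstdio>
        #include <vector>

        // Tape node: up to two parents and the local partials w.r.t. them.
        struct Node { int p1, p2; double w1, w2; };
        static std::vector<Node> tape;

        struct Var { int idx; double val; };

        Var make(double v, int p1 = -1, double w1 = 0, int p2 = -1, double w2 = 0) {
            tape.push_back({p1, p2, w1, w2});
            return {int(tape.size()) - 1, v};
        }
        Var operator*(Var a, Var b) { return make(a.val * b.val, a.idx, b.val, b.idx, a.val); }
        Var operator+(Var a, Var b) { return make(a.val + b.val, a.idx, 1.0, b.idx, 1.0); }
        Var sin(Var a) { return make(std::sin(a.val), a.idx, std::cos(a.val)); }

        // Reverse sweep: propagate the adjoint of y back through the tape.
        std::vector<double> grad(Var y) {
            std::vector<double> adj(tape.size(), 0.0);
            adj[y.idx] = 1.0;
            for (int i = y.idx; i >= 0; --i) {
                if (tape[i].p1 >= 0) adj[tape[i].p1] += tape[i].w1 * adj[i];
                if (tape[i].p2 >= 0) adj[tape[i].p2] += tape[i].w2 * adj[i];
            }
            return adj;
        }

        int main() {
            Var x = make(2.0), z = make(3.0);
            Var y = x * z + sin(x);      // overloading records the tape as a side effect
            auto adj = grad(y);
            // dy/dx = z + cos(x) = 3 + cos(2), dy/dz = x = 2
            std::printf("dy/dx = %g, dy/dz = %g\n", adj[x.idx], adj[z.idx]);
        }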

    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation", as these are encountered more and more in machine learning settings. (Comment: 43 pages, 5 figures.)
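
    A one-screen illustration of the distinction the survey draws (hypothetical toy code, not from the paper): AD is neither numerical nor symbolic differentiation. A dual-number scalar propagates exact derivatives through an ordinary program, including loops and branches, without ever building a closed-form expression and without truncation error.

        #include <cstdio>

        // Forward-mode dual number: value and derivative travel together,
        // so the *program* is differentiated as it executes.
        struct Dual { double v, d; };
        Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.d * b.v + a.v * b.d}; }

        // An iterative program: f(x) = x^(2^k) by repeated squaring. A symbolic
        // differentiator would need the unrolled expression; AD does not.
        Dual f(Dual x, int k) {
            for (int i = 0; i < k; ++i) x = x * x;
            return x;
        }

        int main() {
            Dual x = {1.1, 1.0};          // seed dx/dx = 1
            Dual y = f(x, 3);             // f(x) = x^8
            std::printf("f = %g, f' = %g\n", y.v, y.d);  // f' = 8 * x^7
        }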

    TMB: Automatic Differentiation and Laplace Approximation

    TMB is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, admb-project.org). In addition, it offers easy access to parallel computations. The user defines the joint likelihood for the data and the random effects as a C++ template function, while all the other operations are done in R, e.g., reading in the data. The package evaluates and maximizes the Laplace approximation of the marginal likelihood, where the random effects are automatically integrated out. This approximation, and its derivatives, are obtained using automatic differentiation (up to order three) of the joint likelihood. The computations are designed to be fast for problems with many random effects (~10^6) and parameters (~10^3). Computation times using ADMB and TMB are compared on a suite of examples ranging from simple models to large spatial models where the random effects are a Gaussian random field. Speedups ranging from 1.5 to about 100 are obtained, with increasing gains for large problems. The package and examples are available at http://tmb-project.org
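
    The C++ template function the abstract mentions looks roughly like the following minimal sketch (a plain Gaussian likelihood for brevity; a real random-effects model would add a PARAMETER_VECTOR of latent variables to be integrated out by the Laplace approximation):

        #include <TMB.hpp>

        template<class Type>
        Type objective_function<Type>::operator() ()
        {
            DATA_VECTOR(y);          // observations, passed from R
            PARAMETER(mu);           // mean
            PARAMETER(logSigma);     // log standard deviation
            // Negative log-likelihood; TMB differentiates this template
            // automatically, up to the third order needed for the Laplace
            // approximation and its gradient.
            Type nll = -sum(dnorm(y, mu, exp(logSigma), true));
            return nll;
        }

    On the R side, MakeADFun builds the objective and its derivatives from this template, and a standard optimizer such as nlminb maximizes the (approximated) marginal likelihood.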