9 research outputs found

    Index handling and assign optimization for Algorithmic Differentiation reuse index managers

    For operator overloading Algorithmic Differentiation tools, the identification of primal variables and adjoint variables is usually done via indices. Two common schemes exist for their management and distribution. The linear approach is easy to implement and supports memory optimization with respect to copy statements. On the other hand, the reuse approach requires more implementation effort but results in much smaller adjoint vectors, which are more suitable for the vector mode of Algorithmic Differentiation. In this paper, we present both approaches, how to implement them, and discuss their advantages, disadvantages and properties of the resulting Algorithmic Differentiation type. In addition, a new management scheme is presented which supports copy optimizations and the reuse of indices, thus combining the advantages of the other two. The implementations of all three schemes are compared on a simple synthetic example and on a real world example using the computational fluid dynamics solver in SU2. Comment: 20 pages, 14 figures, 4 tables
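
The difference between the linear and the reuse scheme described in this abstract can be illustrated with a minimal sketch. The class names, the `simulate` helper, and the convention of reserving index 0 for passive values are assumptions made for illustration; this is not the paper's implementation, and it covers neither the copy optimization nor the combined scheme.

```python
class LinearIndexManager:
    """Hands out a fresh index on every request; indices are never recycled."""

    def __init__(self):
        self.next_index = 1  # assumption: index 0 is reserved for passive values

    def assign(self):
        index = self.next_index
        self.next_index += 1
        return index

    def free(self, index):
        pass  # the linear scheme has nothing to recycle

    def adjoint_vector_size(self):
        return self.next_index


class ReuseIndexManager:
    """Recycles indices of destroyed variables via a free list."""

    def __init__(self):
        self.next_index = 1  # assumption: index 0 is reserved for passive values
        self.free_list = []

    def assign(self):
        if self.free_list:
            return self.free_list.pop()
        index = self.next_index
        self.next_index += 1
        return index

    def free(self, index):
        self.free_list.append(index)

    def adjoint_vector_size(self):
        return self.next_index


def simulate(manager, steps):
    """Overwrite one temporary `steps` times, releasing the old index each time."""
    index = manager.assign()
    for _ in range(steps):
        new_index = manager.assign()
        manager.free(index)
        index = new_index
    return manager.adjoint_vector_size()


linear_size = simulate(LinearIndexManager(), 1000)
reuse_size = simulate(ReuseIndexManager(), 1000)
```

In this toy loop the linear manager needs an adjoint vector that grows with the number of assignments, while the reuse manager stays at a constant size, which is the effect the abstract attributes to the reuse approach.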

    Reverse-Mode Automatic Differentiation of Compiled Programs

    Tools for algorithmic differentiation (AD) provide accurate derivatives of computer-implemented functions for use in, e.g., optimization and machine learning (ML). However, they often require the source code of the function to be available in a restricted set of programming languages. As a step towards making AD accessible for code bases with cross-language or closed-source components, we recently presented the forward-mode AD tool Derivgrind. It inserts forward-mode AD logic into the machine code of a compiled program using the Valgrind dynamic binary instrumentation framework. This work extends Derivgrind, adding the capability to record the real-arithmetic evaluation tree, and thus enabling operator overloading style reverse-mode AD for compiled programs. We maintain the high level of correctness reported for Derivgrind's forward mode, failing the same few test cases in an extensive test suite for the same well-understood reasons. Runtime-wise, the recording slows down the execution of a compiled 64-bit benchmark program by a factor of about 180. Comment: 17 pages, 5 figures, 1 listing
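
The idea of recording the evaluation tree and then sweeping it backwards can be sketched in a few lines. This is a generic operator overloading tape, not Derivgrind's recording (which works on machine code); the names `Tape`, `Var`, and `gradient` are hypothetical.

```python
import math


class Tape:
    """Stores one entry per recorded operation: (result index, [(arg index, partial)])."""

    def __init__(self):
        self.statements = []
        self.size = 0

    def new_index(self):
        index = self.size
        self.size += 1
        return index

    def record(self, partials):
        index = self.new_index()
        self.statements.append((index, partials))
        return index

    def gradient(self, output_index):
        # Reverse sweep: propagate adjoints from the output back to the inputs.
        adjoints = [0.0] * self.size
        adjoints[output_index] = 1.0
        for index, partials in reversed(self.statements):
            for arg_index, partial in partials:
                adjoints[arg_index] += partial * adjoints[index]
        return adjoints


class Var:
    """Active variable that records every operation on the tape."""

    def __init__(self, tape, value, index=None):
        self.tape = tape
        self.value = value
        self.index = tape.new_index() if index is None else index

    def __add__(self, other):
        index = self.tape.record([(self.index, 1.0), (other.index, 1.0)])
        return Var(self.tape, self.value + other.value, index)

    def __mul__(self, other):
        index = self.tape.record([(self.index, other.value), (other.index, self.value)])
        return Var(self.tape, self.value * other.value, index)


def sin(x):
    index = x.tape.record([(x.index, math.cos(x.value))])
    return Var(x.tape, math.sin(x.value), index)


tape = Tape()
a = Var(tape, 2.0)
b = Var(tape, 3.0)
y = a * b + sin(a)  # forward run records three operations
adjoints = tape.gradient(y.index)  # adjoints[a.index] = b + cos(a), adjoints[b.index] = a
```

A single reverse sweep yields the derivatives with respect to all inputs at once, at the price of storing the recorded operations.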

    Forward-Mode Automatic Differentiation of Compiled Programs

    Algorithmic differentiation (AD) is a set of techniques that provide partial derivatives of computer-implemented functions. Such a function can be supplied to state-of-the-art AD tools via its source code, or via an intermediate representation produced while compiling its source code. We present the novel AD tool Derivgrind, which augments the machine code of compiled programs with forward-mode AD logic. Derivgrind leverages the Valgrind instrumentation framework for a structured access to the machine code, and a shadow memory tool to store dot values. Access to the source code is required at most for the files in which input and output variables are defined. Derivgrind's versatility comes at the price of scaling the run-time by a factor between 30 and 75, measured on a benchmark based on a numerical solver for a partial differential equation. Results of our extensive regression test suite indicate that Derivgrind produces correct results on GCC- and Clang-compiled programs, including a Python interpreter, with a small number of exceptions. While we provide a list of scenarios that Derivgrind does not handle correctly, nearly all of them are academic counterexamples or originate from highly optimized math libraries. As long as differentiating those is avoided, Derivgrind can be applied to an unprecedentedly wide range of cross-language or partially closed-source software with little integration effort. Comment: 21 pages, 3 figures, 3 tables, 5 listings
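
The "dot values" mentioned in this abstract are the directional derivatives that forward-mode AD carries alongside each primal value. The following minimal sketch shows that propagation with operator overloading; Derivgrind itself works on machine code with shadow memory, so the `Dual` class and the function `f` here are purely illustrative assumptions.

```python
import math


class Dual:
    """A primal value paired with its dot value (directional derivative)."""

    def __init__(self, value, dot=0.0):
        self.value = value
        self.dot = dot

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants with a zero dot value.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule applied to the dot values.
        return Dual(self.value * other.value,
                    self.dot * other.value + self.value * other.dot)

    __rmul__ = __mul__


def sin(x):
    x = Dual.lift(x)
    # Chain rule: d/dt sin(x) = cos(x) * dx/dt.
    return Dual(math.sin(x.value), math.cos(x.value) * x.dot)


def f(x):
    return x * x + 3 * sin(x)


x = Dual(2.0, 1.0)  # seed the input's dot value with 1 to obtain df/dx
y = f(x)            # y.dot now holds 2*x + 3*cos(x) evaluated at x = 2
```

Each arithmetic operation updates the dot value together with the primal value, so the derivative is available as soon as the function evaluation finishes.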

    Exploration of differentiability in a proton computed tomography simulation framework

    Objective. Gradient-based optimization using algorithmic derivatives can be a useful technique to improve engineering designs with respect to a computer-implemented objective function. Likewise, uncertainty quantification through computer simulations can be carried out by means of derivatives of the computer simulation. However, the effectiveness of these techniques depends on how ‘well-linearizable’ the software is. In this study, we assess how promising derivative information of a typical proton computed tomography (pCT) scan computer simulation is for the aforementioned applications. Approach. This study is mainly based on numerical experiments, in which we repeatedly evaluate three representative computational steps with perturbed input values. We support our observations with a review of the algorithmic steps and arithmetic operations performed by the software, using debugging techniques. Main results. The model-based iterative reconstruction (MBIR) subprocedure (at the end of the software pipeline) and the Monte Carlo (MC) simulation (at the beginning) were piecewise differentiable. However, the observed high density and magnitude of jumps were likely to preclude most meaningful uses of the derivatives. Jumps in the MBIR function arose from the discrete computation of the set of voxels intersected by a proton path, and could be reduced in magnitude by a ‘fuzzy voxels’ approach. The investigated jumps in the MC function arose from local changes in the control flow that affected the amount of consumed random numbers. The tracking algorithm solves an inherently non-differentiable problem. Significance. Besides the technical challenges of merely applying AD to existing software projects, the MC and MBIR codes must be adapted to compute smoother functions. For the MBIR code, we presented one possible approach for this, while for the MC code, this will be subject to further research. For the tracking subprocedure, further research on surrogate models is necessary.

    AutoMat -- Automatic Differentiation for Generalized Standard Materials on GPUs

    We propose a universal method for the evaluation of generalized standard materials that greatly simplifies the material law implementation process. By means of automatic differentiation and a numerical integration scheme, AutoMat reduces the implementation effort to two potential functions. By moving AutoMat to the GPU, we close the performance gap to conventional evaluation routines and demonstrate in detail that the expression level reverse mode of automatic differentiation as well as its extension to second order derivatives can be applied inside CUDA kernels. We underline the effectiveness and the applicability of AutoMat by integrating it into the FFT-based homogenization scheme of Moulinec and Suquet and discuss the benefits of using AutoMat with respect to runtime and solution accuracy for an elasto-viscoplastic example. Comment: 28 pages, 15 figures, 7 tables; new layout, more detailed proof of Theorem

    SciCompKL/CoDiPack: Version 2.2.0

    <h3>v 2.2.0 - 2024-01-30</h3> <ul> <li><p>Features:</p> <ul> <li>New helper for adding Enzyme-generated derivative functions to the tape. See \ref Example_24_Enzyme_external_function_helper.</li> <li>Recover primal values from primal value tapes in ExternalFunctionHelper.</li> <li>Forward AD type for CUDA kernels.</li> <li>Matrix-matrix multiplications can now be handled in an optimal way with CoDiPack.</li> <li>Tagging tape for detecting errors in the AD workflow.</li> </ul> </li> <li><p>Bugfix:</p> <ul> <li>Uninitialized values in external function helper.</li> <li>External function outputs in Jacobian tapes no longer use unused indices.</li> </ul> </li> <li><p>Other:</p> <ul> <li>Added low level function support to the tapes. Low level functions are between external functions and statements. As they can occur quite often, they reduce the overhead for storing data as much as possible.</li> <li>Added helper structures for creating low level functions.</li> <li>External functions are now handled via the low level function interface.</li> </ul> </li> </ul>

    Event-Based Automatic Differentiation of OpenMP with OpDiLib

    We present the new software OpDiLib, a universal add-on for classical operator overloading AD tools that enables the automatic differentiation (AD) of OpenMP parallelized code. With it, we establish support for OpenMP features in a reverse mode operator overloading AD tool to an extent that was previously only reported on in source transformation tools. We achieve this with an event-based implementation ansatz that is unprecedented in AD. Combined with modern OpenMP features around OMPT, we demonstrate how it can be used to achieve differentiation without any additional modifications of the source code; neither do we impose a priori restrictions on the data access patterns, which makes OpDiLib highly applicable. For further performance optimizations, restrictions like atomic updates on the adjoint variables can be lifted in a fine-grained manner for any parts of the code. OpDiLib can also be applied in a semi-automatic fashion via a macro interface, which supports compilers that do not implement OMPT. In a detailed performance study, we demonstrate the applicability of OpDiLib for a pure operator overloading approach in a hybrid parallel environment. We quantify the cost of atomic updates on the adjoint vector and showcase the speedup and scaling that can be achieved with the different configurations of OpDiLib in both the forward and the reverse pass. Comment: 34 pages, 10 figures, 2 tables, 12 listings

    Exploration of Differentiability in a Proton Computed Tomography Simulation Framework

    Objective. Algorithmic differentiation (AD) can be a useful technique to numerically optimize design and algorithmic parameters by, and quantify uncertainties in, computer simulations. However, the effectiveness of AD depends on how 'well-linearizable' the software is. In this study, we assess how promising derivative information of a typical proton computed tomography (pCT) scan computer simulation is for the aforementioned applications. Approach. This study is mainly based on numerical experiments, in which we repeatedly evaluate three representative computational steps with perturbed input values. We support our observations with a review of the algorithmic steps and arithmetic operations performed by the software, using debugging techniques. Main results. The model-based iterative reconstruction (MBIR) subprocedure (at the end of the software pipeline) and the Monte Carlo (MC) simulation (at the beginning) were piecewise differentiable. Jumps in the MBIR function arose from the discrete computation of the set of voxels intersected by a proton path. Jumps in the MC function likely arose from changes in the control flow that affect the amount of consumed random numbers. The tracking algorithm solves an inherently non-differentiable problem. Significance. The MC and MBIR codes are ready for the integration of AD, and further research on surrogate models for the tracking subprocedure is necessary.