5 research outputs found

    High-level python abstractions for optimal checkpointing in inversion problems

    Get PDF
    Inversion and PDE-constrained optimization problems often rely on solving the adjoint problem to calculate the gradient of the objec- tive function. This requires storing large amounts of intermediate data, setting a limit to the largest problem that might be solved with a given amount of memory available. Checkpointing is an approach that can reduce the amount of memory required by redoing parts of the computation instead of storing intermediate results. The Revolve checkpointing algorithm o ers an optimal schedule that trades computational cost for smaller memory footprints. Integrat- ing Revolve into a modern python HPC code and combining it with code generation is not straightforward. We present an API that makes checkpointing accessible from a DSL-based code generation environment along with some initial performance gures with a focus on seismic applications

    Instead of Rewriting Foreign Code for Machine Learning, Automatically Synthesize Fast Gradients

    Full text link
    Applying differentiable programming techniques and machine learning algorithms to foreign programs requires developers to either rewrite their code in a machine learning framework, or otherwise provide derivatives of the foreign code. This paper presents Enzyme, a high-performance automatic differentiation (AD) compiler plugin for the LLVM compiler framework capable of synthesizing gradients of statically analyzable programs expressed in the LLVM intermediate representation (IR). Enzyme synthesizes gradients for programs written in any language whose compiler targets LLVM IR including C, C++, Fortran, Julia, Rust, Swift, MLIR, etc., thereby providing native AD capabilities in these languages. Unlike traditional source-to-source and operator-overloading tools, Enzyme performs AD on optimized IR. On a machine-learning focused benchmark suite including Microsoft's ADBench, AD on optimized IR achieves a geometric mean speedup of 4.5x over AD on IR before optimization allowing Enzyme to achieve state-of-the-art performance. Packaging Enzyme for PyTorch and TensorFlow provides convenient access to gradients of foreign code with state-of-the art performance, enabling foreign code to be directly incorporated into existing machine learning workflows.Comment: To be published in NeurIPS 202
    corecore