High-level python abstractions for optimal checkpointing in inversion problems
Inversion and PDE-constrained optimization problems often rely on solving the adjoint problem to calculate the gradient of the objective function. This requires storing large amounts of intermediate data, setting a limit to the largest problem that might be solved with a given amount of memory available. Checkpointing is an approach that can reduce the amount of memory required by redoing parts of the computation instead of storing intermediate results. The Revolve checkpointing algorithm offers an optimal schedule that trades computational cost for smaller memory footprints. Integrating Revolve into a modern Python HPC code and combining it with code generation is not straightforward. We present an API that makes checkpointing accessible from a DSL-based code generation environment along with some initial performance figures with a focus on seismic applications.
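The memory-for-recomputation trade described in this abstract can be illustrated with a minimal sketch. It assumes a toy scalar time-stepping recurrence and a simple uniform checkpoint interval; it is not the Revolve schedule (which places checkpoints optimally) and not the API the paper presents. All function names here are hypothetical.

```python
# Toy illustration of checkpointing for an adjoint (reverse) sweep.
# Assumption: uniform checkpoint spacing, NOT Revolve's optimal schedule.

def step(u, dt=0.01):
    """One forward time step of a toy nonlinear recurrence u -> u - dt*u^2."""
    return u - dt * u * u


def step_adjoint(u, u_bar, dt=0.01):
    """Adjoint of `step` taken from state u: multiply by d(step)/du."""
    return (1.0 - 2.0 * dt * u) * u_bar


def gradient_store_all(u0, n_steps):
    """d(u_N)/d(u_0), keeping every forward state in memory."""
    states = [u0]
    for _ in range(n_steps - 1):
        states.append(step(states[-1]))  # states[n] is the input of step n
    u_bar = 1.0  # seed: derivative of the objective u_N with respect to u_N
    for u_n in reversed(states):
        u_bar = step_adjoint(u_n, u_bar)
    return u_bar, len(states)  # gradient, number of states held at once


def gradient_checkpointed(u0, n_steps, interval):
    """Same gradient, storing only every `interval`-th state and
    recomputing each segment of forward states when the reverse sweep needs it."""
    checkpoints = {}
    u = u0
    for n in range(n_steps):
        if n % interval == 0:
            checkpoints[n] = u
        u = step(u)

    u_bar = 1.0
    peak = 0
    for start in sorted(checkpoints, reverse=True):
        end = min(start + interval, n_steps)
        # Recompute the states u_start .. u_{end-1} from the checkpoint.
        segment = [checkpoints[start]]
        for _ in range(end - start - 1):
            segment.append(step(segment[-1]))
        peak = max(peak, len(checkpoints) + len(segment))
        for u_n in reversed(segment):
            u_bar = step_adjoint(u_n, u_bar)
    return u_bar, peak  # gradient, upper bound on states held at once


if __name__ == "__main__":
    grad_all, mem_all = gradient_store_all(1.0, 100)
    grad_ckpt, mem_ckpt = gradient_checkpointed(1.0, 100, 10)
    print(grad_all, mem_all)
    print(grad_ckpt, mem_ckpt)
```

In this sketch both routines return the same gradient, while the checkpointed variant holds fewer states at any one time at the cost of roughly one extra forward sweep.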
Instead of Rewriting Foreign Code for Machine Learning, Automatically Synthesize Fast Gradients
Applying differentiable programming techniques and machine learning
algorithms to foreign programs requires developers to either rewrite their code
in a machine learning framework, or otherwise provide derivatives of the
foreign code. This paper presents Enzyme, a high-performance automatic
differentiation (AD) compiler plugin for the LLVM compiler framework capable of
synthesizing gradients of statically analyzable programs expressed in the LLVM
intermediate representation (IR). Enzyme synthesizes gradients for programs
written in any language whose compiler targets LLVM IR including C, C++,
Fortran, Julia, Rust, Swift, MLIR, etc., thereby providing native AD
capabilities in these languages. Unlike traditional source-to-source and
operator-overloading tools, Enzyme performs AD on optimized IR. On a
machine-learning focused benchmark suite including Microsoft's ADBench, AD on
optimized IR achieves a geometric mean speedup of 4.5x over AD on IR before
optimization, allowing Enzyme to achieve state-of-the-art performance. Packaging
Enzyme for PyTorch and TensorFlow provides convenient access to gradients of
foreign code with state-of-the-art performance, enabling foreign code to be
directly incorporated into existing machine learning workflows.
Comment: To be published in NeurIPS 202