We show how reverse-mode AD (automatic differentiation), a generalized gradient-calculation operator, can be incorporated as a first-class function in a functional-programming language. An important property of AD transformations is that they preserve certain complexity properties. Here, the relevant property is that the reverse phase of the reverse-mode transform of a function has the same temporal complexity (up to a small constant factor) as the original untransformed function. The main technical difficulty is that reverse-mode AD must convert fanout (multiple uses of a variable) in the untransformed code into addition in the reverse phase of the transformed code. We address this by expressing all straight-line code segments in A-normal form, which makes fanout lexically apparent. Our formulation generalizes reverse-mode AD to apply to arbitrary higher-order functions while preserving its desirable complexity properties.
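To make the fanout-to-addition point concrete, the following is a minimal hand-written sketch in Python. It is not the paper's transformation or implementation: the example function f(x) = x*x + sin(x) and the names `f`, `f_reverse`, `backpropagator`, and the `*_bar` sensitivity variables are all assumptions chosen for illustration. The function is written with every intermediate result named, in the spirit of A-normal form, so each use of `x` is visible as a separate line, and each such use contributes one addition to `x_bar` in the reverse phase.

```python
import math


def f(x):
    # Original function with every intermediate value named (A-normal-form style).
    t1 = x * x          # x is used twice here (fanout)
    t2 = math.sin(x)    # and a third time here
    t3 = t1 + t2
    return t3


def f_reverse(x):
    """Return f(x) together with a closure that runs the reverse phase.

    Hand-derived for this one function; an illustrative assumption,
    not a general-purpose AD transform.
    """
    # Forward phase: the same bindings as in f.
    t1 = x * x
    t2 = math.sin(x)
    t3 = t1 + t2

    def backpropagator(t3_bar):
        # Reverse phase: visit the bindings in reverse order.
        x_bar = 0.0
        # t3 = t1 + t2
        t1_bar = t3_bar
        t2_bar = t3_bar
        # t2 = sin(x): one use of x, one addition into x_bar
        x_bar += t2_bar * math.cos(x)
        # t1 = x * x: two uses of x, two additions into x_bar
        x_bar += t1_bar * x  # contribution via the left operand
        x_bar += t1_bar * x  # contribution via the right operand
        return x_bar

    return t3, backpropagator


value, backprop = f_reverse(2.0)
gradient = backprop(1.0)  # sensitivity of the output with respect to x
```

In this sketch the reverse phase performs one constant-cost step per binding of the forward phase, which is the sense in which its running time stays within a small constant factor of the original function's.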