Accelerated Adjoint Algorithmic Differentiation with Applications in Finance

De Beer, Jarred

Authors: Jarred De Beer
Publication date: 1 January 2017
Publisher: Division of Actuarial Science

Abstract

The ability of Adjoint Differentiation (AD) to calculate Greeks efficiently and to machine precision, at a cost that stays constant as the number of input variables grows, is attractive for calibration and hedging, where frequent calculations are required. Algorithmic adjoint differentiation tools automatically generate derivative code and provide interesting challenges in both Computer Science and Mathematics. In this dissertation we focus on a manual implementation, with particular emphasis on parallel processing using Graphics Processing Units (GPUs) to accelerate run times. Adjoint differentiation is applied to a Call on Max rainbow option with 3 underlying assets in a Monte Carlo environment. Assets are driven by the Heston stochastic volatility model and implemented using the Milstein discretisation scheme with truncation. The price is calculated along with the Delta and Vega of each asset, for a total of 6 sensitivities. The application achieves favourable levels of parallelism on all three dimensions implemented by the GPU: Instruction Level Parallelism (ILP), Thread Level Parallelism (TLP), and Single Instruction Multiple Data (SIMD). We estimate that the forward pass of the Milstein discretisation has an ILP of 3.57, which lies within the typical range of 2 to 4. Monte Carlo simulations are embarrassingly parallel and are capable of achieving a high level of concurrency. However, in this context a single kernel running at low occupancy can perform better with a combination of shared memory, vectorized data structures and a high register count per thread. On the Intel Xeon CPU, a run with 501 760 paths and 360 time steps takes 48.801 seconds. The GT950 Maxwell GPU completes in 0.115 seconds, achieving a 422× speedup and a throughput of 13 million paths per second. The K40 is capable of achieving better performance
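
To make the forward and reverse passes described in the abstract concrete, the following is a minimal sketch of a manually written adjoint for a Milstein discretisation of the Heston model with truncation. It is not the dissertation's implementation: it assumes a single asset and a plain European call instead of the three-asset Call on Max, runs in plain Python on the CPU instead of on a GPU, takes Vega as the sensitivity to the initial variance v0, and uses illustrative parameter values. The function name `heston_call_adjoint` and all argument names are hypothetical.

```python
import math
import random


def heston_call_adjoint(s0, v0, strike=100.0, r=0.03, kappa=2.0, theta=0.04,
                        xi=0.3, rho=-0.7, maturity=1.0,
                        n_paths=500, n_steps=50, seed=1):
    """Return (price, delta, vega), with delta = dP/ds0 and vega = dP/dv0.

    Illustrative assumptions: one asset, European call payoff,
    hypothetical parameter values, pathwise (adjoint) sensitivities.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    disc = math.exp(-r * maturity)
    rho_c = math.sqrt(1.0 - rho * rho)
    price = delta = vega = 0.0

    for _ in range(n_paths):
        # Forward pass: Milstein step with truncation v+ = max(v, 0).
        # Each step's inputs are stored on a tape for the reverse sweep.
        s, v, tape = s0, v0, []
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + rho_c * rng.gauss(0.0, 1.0)
            vp = max(v, 0.0)
            sq = math.sqrt(vp * dt)
            growth = 1.0 + r * dt + sq * z1 + 0.5 * vp * dt * (z1 * z1 - 1.0)
            tape.append((s, v, z1, z2, growth))
            s, v = (s * growth,
                    v + kappa * (theta - vp) * dt + xi * sq * z2
                    + 0.25 * xi * xi * dt * (z2 * z2 - 1.0))
        price += disc * max(s - strike, 0.0)

        # Reverse pass: seed the adjoints with the payoff derivative and
        # propagate them backwards through the stored steps.
        s_bar = disc if s > strike else 0.0
        v_bar = 0.0
        for s_n, v_n, z1, z2, growth in reversed(tape):
            if v_n > 0.0:
                # d sqrt(v+ * dt) / d v+
                dsq = 0.5 * math.sqrt(dt / v_n)
                vp_bar = (s_bar * s_n * (z1 * dsq + 0.5 * dt * (z1 * z1 - 1.0))
                          + v_bar * (xi * z2 * dsq - kappa * dt))
            else:
                # Truncated region: v+ does not depend on v.
                vp_bar = 0.0
            s_bar, v_bar = s_bar * growth, v_bar + vp_bar
        delta += s_bar
        vega += v_bar

    return price / n_paths, delta / n_paths, vega / n_paths
```

A call such as `heston_call_adjoint(100.0, 0.04)` returns the price and both sensitivities from one forward and one reverse sweep per path. A bump-and-revalue approach would instead need extra full simulations for every input, which is the cost that the adjoint method avoids. Because every path is independent, the outer loop is the part that would be distributed across GPU threads.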

Available Versions

Cape Town University OpenUCT

oai:open.uct.ac.za:11427/24888

Last time updated on 15/10/2017