
Exploiting spatiotemporal locality for fast call stack traversal

Abstract

As systems approach exascale, scalable tools are becoming increasingly necessary to support parallel applications. Evaluating an application’s call stack is a vital technique for a wide variety of profilers and debuggers, but it can incur a significant performance overhead. In this paper, we present a heuristic technique to reduce the overhead of frequent call stack evaluations. The technique estimates the similarity between successive call stacks, removing the need for a full call stack traversal and eliminating a significant portion of the performance overhead. We demonstrate the technique applied to a parallel memory tracing toolkit, WMTools, and analyse the resulting performance gains and accuracy.
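The general idea of reusing the previous call stack can be illustrated with a minimal sketch. This is not the WMTools implementation: the frame representation (a stack pointer paired with a return address), the name `CachedStackWalker`, and the stopping rule (stop at the first frame that also appeared in the previous stack and assume everything outside it is unchanged) are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical frame representation: (stack pointer, return address).
Frame = Tuple[int, int]


class CachedStackWalker:
    """Illustrative heuristic walker that reuses the previous call stack."""

    def __init__(self) -> None:
        self._prev: List[Frame] = []        # previous stack, innermost frame first
        self._index: Dict[Frame, int] = {}  # frame -> position in self._prev
        self.frames_visited = 0             # number of frames actually unwound

    def walk(self, unwinder: Iterable[Frame]) -> List[Frame]:
        """Consume frames lazily, innermost first, stopping early when possible."""
        fresh: List[Frame] = []
        stack: List[Frame] = []
        for frame in unwinder:
            self.frames_visited += 1
            pos = self._index.get(frame)
            if pos is not None:
                # Heuristic assumption: a frame with the same stack pointer and
                # return address as last time has the same callers as last time.
                stack = fresh + self._prev[pos:]
                break
            fresh.append(frame)
        else:
            # No frame matched, so the full traversal result is used.
            stack = fresh
        self._prev = stack
        self._index = {f: i for i, f in enumerate(stack)}
        return stack
```

In this sketch the cost of a traversal is proportional to the number of frames that changed since the previous one, at the price of an occasional wrong answer when a matching frame hides a change further out. That is the kind of trade-off between performance and accuracy the abstract refers to.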
