Efficient call path profiles on unmodified, optimized code

Abstract

Identifying performance bottlenecks and their associated calling contexts is critical for tuning high-performance applications. This thesis presents a new approach to measuring resource utilization and its calling context. Previous instrumentation-based approaches for reporting calling context introduce overhead proportional to the number of function calls performed. We describe a new design for a call path profiler based on stack sampling. Our design enables profiling of unmodified binaries, provides low and controllable overhead, and accurately attributes context-dependent costs of calls. We use a special trampoline function that improves the efficiency of stack sampling and enables the association of unique invocation counts with sampled call sites. We evaluate a Tru64/Alpha implementation of our design and show that on call-intensive codes, the overhead of our approach is over two orders of magnitude lower than the overhead of an instrumentation-based approach, with comparable overhead on other codes

Similar works

Full text

thumbnail-image

DSpace at Rice University

Full text is not available
oai:scholarship.rice.edu:1911/17780Last time updated on 6/11/2012

This paper was published in DSpace at Rice University.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.