4 research outputs found
Understanding Uncertainty in Static Pointer Analysis
Institute for Computing Systems ArchitectureFor programs that make extensive use of pointers, pointer analysis is often critical
for the effectiveness of optimising compilers and tools for reasoning about program
behaviour and correctness. Static pointer analysis has been extensively studied and
several algorithms have been proposed, but these only provide approximate solutions.
As such inaccuracy may hinder further optimisations, it is important to understand
how short these algorithms come of providing accurate information about the points-to
relations.
This thesis attempts to quantify the amount of uncertainty of the points-to relations
that remains after a state-of-the-art context- and flow-sensitive pointer analysis algorithm
is applied to a collection of programs from two well-known benchmark suites:
SPEC integer and MediaBench. This remaining static uncertainty is then compared
to the run-time behaviour. Unlike previous work that compared run-time behaviour
against less accurate context- and flow-insensitive algorithms, the goal of this work is
to quantify the amount of uncertainty that is intrinsic to the applications and that defeat
even the most accurate static analyses.
In a first step to quantify the uncertainties, a compiler framework was proposed and
implemented. It is based on the SUIF1 research compiler framework and the SPAN
pointer analysis package. This framework was then used to collect extensive data
from the static points-to analysis. It was also used to drive a profiled execution of the
programs in order to collect the real run-time points-to data. Finally, the static and the
run-time data were compared.
Experimental results show that often the static pointer analysis is very accurate, but
for some benchmarks a significant fraction, up to 25%, of their accesses via pointer dereferences
cannot be statically fully disambiguated. We find that some 27% of these
de-references turn out to access a single memory location at run time, but many do
access several different memory locations. We find that the main reasons for this are
the use of pointer arithmetic and the fact that some control paths are not taken. The
latter is an example of a source of uncertainty that is intrinsic to the application
Fuzzy Set Abstraction
Program analysis plays a key part in improving modern software. Static (sound) analyses produce globally correct, but often pessimistic results while dynamic (complete) analyses yield highly precise results but with limited coverage. We present the Fuzzy set abstraction which generalizes previous work based on 3-valued logic. Our abstraction allows for hybrid analysis where static results are refined dynamically through the use of fuzzy control systems
Quantifying Uncertainty in Points-To Relations ⋆
Abstract. For programs that make extensive use of pointers, pointer analysis is often critical for the effectiveness of optimizing compilers and tools for reasoning about program behavior and correctness. Static pointer analysis has been extensively studied and several algorithms have been proposed, but these only provide approximate solutions. As such inaccuracy may hinder further optimizations, it is important to understand how short these algorithms come of providing accurate information about the points-to relations. This paper attempts to quantify the amount of uncertainty of the points-to relations that remains after a state-of-the-art context- and flowsensitive pointer analysis algorithm is applied to a collection of programs from two well-known benchmark suites: SPEC integer and MediaBench. This remaining static uncertainty is then compared to the run-time behavior. Unlike previous work that compared run-time behavior against less accurate context- and flow-insensitive algorithms, the goal of this work is to quantify the amount of uncertainty that is intrinsic to the applications and that defeat even the most accurate static analyses. Experimental results show that often the static pointer analysis is very accurate, but for some benchmarks a significant fraction, up to 25%, of their accesses via pointer de-references cannot be statically fully disambiguated. We find that some 27 % of these de-references turn out to access a single memory location at run time, but many do access several different memory locations. We find that the main reasons for this are the use of pointer arithmetic and the fact that some control paths are not taken. The latter is an example of a source of uncertainty that is intrinsic to the application.