175 research outputs found
Heap Abstractions for Static Analysis
Heap data is potentially unbounded and seemingly arbitrary. As a consequence,
unlike stack and static memory, heap memory cannot be abstracted directly in
terms of a fixed set of source variable names appearing in the program being
analysed. This makes it an interesting topic of study and there is an abundance
of literature employing heap abstractions. Although most studies have addressed
similar concerns, their formulations and formalisms often seem dissimilar and
some times even unrelated. Thus, the insights gained in one description of heap
abstraction may not directly carry over to some other description. This survey
is a result of our quest for a unifying theme in the existing descriptions of
heap abstractions. In particular, our interest lies in the abstractions and not
in the algorithms that construct them.
In our search of a unified theme, we view a heap abstraction as consisting of
two features: a heap model to represent the heap memory and a summarization
technique for bounding the heap representation. We classify the models as
storeless, store based, and hybrid. We describe various summarization
techniques based on k-limiting, allocation sites, patterns, variables, other
generic instrumentation predicates, and higher-order logics. This approach
allows us to compare the insights of a large number of seemingly dissimilar
heap abstractions and also paves way for creating new abstractions by
mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure
Structural Analysis: Shape Information via Points-To Computation
This paper introduces a new hybrid memory analysis, Structural Analysis,
which combines an expressive shape analysis style abstract domain with
efficient and simple points-to style transfer functions. Using data from
empirical studies on the runtime heap structures and the programmatic idioms
used in modern object-oriented languages we construct a heap analysis with the
following characteristics: (1) it can express a rich set of structural, shape,
and sharing properties which are not provided by a classic points-to analysis
and that are useful for optimization and error detection applications (2) it
uses efficient, weakly-updating, set-based transfer functions which enable the
analysis to be more robust and scalable than a shape analysis and (3) it can be
used as the basis for a scalable interprocedural analysis that produces precise
results in practice.
The analysis has been implemented for .Net bytecode and using this
implementation we evaluate both the runtime cost and the precision of the
results on a number of well known benchmarks and real world programs. Our
experimental evaluations show that the domain defined in this paper is capable
of precisely expressing the majority of the connectivity, shape, and sharing
properties that occur in practice and, despite the use of weak updates, the
static analysis is able to precisely approximate the ideal results. The
analysis is capable of analyzing large real-world programs (over 30K bytecodes)
in less than 65 seconds and using less than 130MB of memory. In summary this
work presents a new type of memory analysis that advances the state of the art
with respect to expressive power, precision, and scalability and represents a
new area of study on the relationships between and combination of concepts from
shape and points-to analyses
A static heap analysis for shape and connectivity: Unified memory analysis: The base framework
Modeling the evolution of the state of program memory during program execution is critical to many parallehzation techniques. Current memory analysis techniques either provide very accurate information but run prohibitively
slowly or produce very conservative results. An approach based on abstract interpretation is presented for analyzing programs at compile time, which can accurately determine many important program properties such as aliasing, logical data structures and shape. These properties are known to be critical for transforming a single threaded program into a versión that can be run on múltiple execution units in parallel. The analysis is shown to be of polynomial complexity in the size of the memory heap. Experimental results for benchmarks in the Jolden suite are given. These results show that in practice the analysis method is efflcient and is capable of accurately determining shape information in programs that créate and manipúlate complex data structures
Recommended from our members
Combining data structure repair and program repair
textBugs in code continue to pose a fundamental problem for software reliability and cause expensive failures. The process of removing known bugs is termed debugging, which is a classic methodology commonly performed before code is deployed. Traditionally, debugging is tedious, often requiring much manual effort. A more recent technique that complements debugging is data structure repair, which handles bugs that make it to deployed systems and lead to erroneous behavior at runtime by modifying erroneous program states to recover from errors. While data structure repair presents a promising basis for dealing with bugs at runtime, it remains computationally expensive. Our thesis is that debugging and data structure repair can be integrated to provide the basis of an effective approach for removing bugs before code is deployed and handling them after it is deployed. We present a bi-directional integration where ideas at the basis of data structure repair assist in automating debugging and vice versa. Our key insight is two-fold: (1)a repair action performed to mutate an erroneous object field value to repair it can be abstracted into a program statement that performs that update correctly; and (2)repair actions that are performed repeatedly to fix the same error can be memoized and re-used. We design, develop, and evaluate two techniques that embody our insight. One, we present an automated debugging technique that leverages a systematic constraint-based data structure repair technique developed in previous work and provides suggestions on how to fix a faulty program. Two, we present repair abstractions that are based on the same central ideas as in our automated debugging technique and memoize how an erroneous state was repaired, which enables prioritizing and re-using repair actions when the same error occurs again. The focus of our work is programs that operate on structurally complex data, e.g., heap-allocated data structures that have complex structural integrity constraints, such as acyclicity. Checking such constraints plays a central role in the techniques that lay at the foundation of our work. These techniques require the user to provide the constraints, which poses a burden on the user. To facilitate the use of constraint-based techniques, we present a third technique to check constraint violations at runtime using graph spectra, which have been studied extensively by mathematicians to capture properties of graphs. We view the heap of an object-oriented program as an edge-labeled graph, which allows us to apply results from graph spectra theory. Experimental results show the effectiveness of using graph spectra as a basis of capturing structural properties of a class of commonly used data structures.Electrical and Computer Engineerin
Parallelizing irregular C codes assisted by interprocedural shape analysis
In the new multicore architecture arena, the problem of improving the performance of a code is more in the soft-ware side than in the hardware one. However, optimizing irregular dynamic data structure based codes for such ar-chitectures is not easy, either by hand or compiler assisted. Regarding this last approach, shape analysis is a static tech-nique that achieves abstraction of dynamic memory and can help to disambiguate, quite accurately, memory references in programs that create and traverse recursive data struc-tures. This kind of analysis has promising applicability for accurate data dependence tests in loops or recursive func-tions that traverse dynamic data structures. However, sup-port for interprocedural programs in shape analysis is still a challenge, especially in the presence of recursive func-tions. In this work we present a novel fully context-sensitive interprocedural shape analysis algorithm that supports re-cursion and can be used to uncover parallelism. Our ap-proach is based on three key ideas: i) intraprocedural sup-port based on “Coexistent Links Sets ” to precisely describe the memory configurations during the abstract interpreta-tion of the C code; ii) interprocedural support based on “Recursive Flow Links ” to trace the state of pointers in previous calls; and iii) annotations of the read/written heap locations during the program analysis. We present prelim-inary experiments that reveal that our technique compares favorably with related work, and obtains precise memory abstractions in a variety of recursive programs that create and manipulate dynamic data structures. We have also im-plemented a data dependence test over our interprocedural shape analysis. With this test we have obtained promis-ing results, automatically detecting parallelism in three C codes, which have been successfully parallelized
Efficient Context-Sensitive Shape Analysis with Graph Based Heap Models
The performance of heap analysis techniques has a significant impact on their utility in an optimizing compiler.Most shape analysis techniques perform interprocedural dataflow analysis in a context-sensitive manner, which can result in analyzing each procedure body many times (causing significant increases in runtime even if the analysis results are memoized). To improve the effectiveness of memoization (and thus speed up the analysis) project/extend operations are used to remove portions of the heap model that cannot be affected by the called procedure (effectively reducing the number of different contexts that a procedure needs to be analyzed with). This paper introduces project/extend operations that are capable of accurately modeling properties that are important when analyzing non-trivial programs (sharing, nullity information, destructive recursive functions, and composite data structures). The techniques we introduce are able to handle these features while significantly improving the effectiveness of memoizing analysis results (and thus improving analysis performance). Using a range of well known benchmarks (many of which have not been successfully analyzed using other existing shape analysis methods) we demonstrate that our approach results in significant improvements in both accuracy and efficiency over a baseline analysis
- …