694 research outputs found

    Heap Reference Analysis Using Access Graphs

    Full text link
    Despite significant progress in the theory and practice of program analysis, analysing properties of heap data has not reached the same level of maturity as the analysis of static and stack data. The spatial and temporal structure of stack and static data is well understood while that of heap data seems arbitrary and is unbounded. We devise bounded representations which summarize properties of the heap data. This summarization is based on the structure of the program which manipulates the heap. The resulting summary representations are certain kinds of graphs called access graphs. The boundedness of these representations and the monotonicity of the operations to manipulate them make it possible to compute them through data flow analysis. An important application which benefits from heap reference analysis is garbage collection, where currently liveness is conservatively approximated by reachability from program variables. As a consequence, current garbage collectors leave a lot of garbage uncollected, a fact which has been confirmed by several empirical studies. We propose the first ever end-to-end static analysis to distinguish live objects from reachable objects. We use this information to make dead objects unreachable by modifying the program. This application is interesting because it requires discovering data flow information representing complex semantics. In particular, we discover four properties of heap data: liveness, aliasing, availability, and anticipability. Together, they cover all combinations of directions of analysis (i.e. forward and backward) and confluence of information (i.e. union and intersection). Our analysis can also be used for plugging memory leaks in C/C++ languages.Comment: Accepted for printing by ACM TOPLAS. This version incorporates referees' comment

    Precise Null Pointer Analysis Through Global Value Numbering

    Full text link
    Precise analysis of pointer information plays an important role in many static analysis techniques and tools today. The precision, however, must be balanced against the scalability of the analysis. This paper focusses on improving the precision of standard context and flow insensitive alias analysis algorithms at a low scalability cost. In particular, we present a semantics-preserving program transformation that drastically improves the precision of existing analyses when deciding if a pointer can alias NULL. Our program transformation is based on Global Value Numbering, a scheme inspired from compiler optimizations literature. It allows even a flow-insensitive analysis to make use of branch conditions such as checking if a pointer is NULL and gain precision. We perform experiments on real-world code to measure the overhead in performing the transformation and the improvement in the precision of the analysis. We show that the precision improves from 86.56% to 98.05%, while the overhead is insignificant.Comment: 17 pages, 1 section in Appendi

    Heap Abstractions for Static Analysis

    Full text link
    Heap data is potentially unbounded and seemingly arbitrary. As a consequence, unlike stack and static memory, heap memory cannot be abstracted directly in terms of a fixed set of source variable names appearing in the program being analysed. This makes it an interesting topic of study and there is an abundance of literature employing heap abstractions. Although most studies have addressed similar concerns, their formulations and formalisms often seem dissimilar and some times even unrelated. Thus, the insights gained in one description of heap abstraction may not directly carry over to some other description. This survey is a result of our quest for a unifying theme in the existing descriptions of heap abstractions. In particular, our interest lies in the abstractions and not in the algorithms that construct them. In our search of a unified theme, we view a heap abstraction as consisting of two features: a heap model to represent the heap memory and a summarization technique for bounding the heap representation. We classify the models as storeless, store based, and hybrid. We describe various summarization techniques based on k-limiting, allocation sites, patterns, variables, other generic instrumentation predicates, and higher-order logics. This approach allows us to compare the insights of a large number of seemingly dissimilar heap abstractions and also paves way for creating new abstractions by mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure

    Structural Analysis: Shape Information via Points-To Computation

    Full text link
    This paper introduces a new hybrid memory analysis, Structural Analysis, which combines an expressive shape analysis style abstract domain with efficient and simple points-to style transfer functions. Using data from empirical studies on the runtime heap structures and the programmatic idioms used in modern object-oriented languages we construct a heap analysis with the following characteristics: (1) it can express a rich set of structural, shape, and sharing properties which are not provided by a classic points-to analysis and that are useful for optimization and error detection applications (2) it uses efficient, weakly-updating, set-based transfer functions which enable the analysis to be more robust and scalable than a shape analysis and (3) it can be used as the basis for a scalable interprocedural analysis that produces precise results in practice. The analysis has been implemented for .Net bytecode and using this implementation we evaluate both the runtime cost and the precision of the results on a number of well known benchmarks and real world programs. Our experimental evaluations show that the domain defined in this paper is capable of precisely expressing the majority of the connectivity, shape, and sharing properties that occur in practice and, despite the use of weak updates, the static analysis is able to precisely approximate the ideal results. The analysis is capable of analyzing large real-world programs (over 30K bytecodes) in less than 65 seconds and using less than 130MB of memory. In summary this work presents a new type of memory analysis that advances the state of the art with respect to expressive power, precision, and scalability and represents a new area of study on the relationships between and combination of concepts from shape and points-to analyses

    Automatically Finding Bugs in Open Source Programs

    Get PDF
    We consider properties desirable for static analysis tools targeted at finding bugs in the real open source code, and review tools based on various approaches to defect detection. A static analysis tool is described, that includes a framework for flow-sensitive interprocedural dataflow analysis and scales to analysis of large programs. The framework enables implementation of multiple checkers searching for specific bugs, such as null pointer dereference and buffer overflow, abstracting from the checkers details such as alias analysis

    Expression-based aliasing for OO-languages

    Full text link
    Alias analysis has been an interesting research topic in verification and optimization of programs. The undecidability of determining whether two expressions in a program may reference to the same object is the main source of the challenges raised in alias analysis. In this paper we propose an extension of a previously introduced alias calculus based on program expressions, to the setting of unbounded program executions s.a. infinite loops and recursive calls. Moreover, we devise a corresponding executable specification in the K-framework. An important property of our extension is that, in a non-concurrent setting, the corresponding alias expressions can be over-approximated in terms of a notion of regular expressions. This further enables us to show that the associated K-machinery implements an algorithm that always stops and provides a sound over-approximation of the "may aliasing" information, where soundness stands for the lack of false negatives. As a case study, we analyze the integration and further applications of the alias calculus in SCOOP. The latter is an object-oriented programming model for concurrency, recently formalized in Maude; K-definitions can be compiled into Maude for execution
    corecore