306 research outputs found

    Generating analyzers with PAG

    Get PDF
    To produce high qualitiy code, modern compilers use global optimization algorithms based on it abstract interpretation. These algorithms are rather complex; their implementation is therfore a non-trivial task and error-prone. However, since thez are based on a common theory, they have large similar parts. We conclude that analyzer writing better should be replaced with analyzer generation. We present the tool sf PAG that has a high level functional input language to specify data flow analyses. It offers th specifications of even recursive data structures and is therfore not limited to bit vector problems. sf PAG generates efficient analyzers wich can be easily integrated in existing compilers. The analyzers are interprocedural, they can handle recursive procedures with local variables and higher order functions. sf PAG has successfully been tested by generating several analyzers (e.g. alias analysis, constant propagation, inerval analysis) for an industrial quality ANSI-C and Fortran90 compiler. This technical report consits of two parts; the first introduces the generation system and the second evaluates generated analyzers with respect to their space and time consumption. bf Keywords: data flow analysis, specification and generation of analyzers, lattice specification, abstract syntax specification, interprocedural analysis, compiler construction

    Enforcing Termination of Interprocedural Analysis

    Full text link
    Interprocedural analysis by means of partial tabulation of summary functions may not terminate when the same procedure is analyzed for infinitely many abstract calling contexts or when the abstract domain has infinite strictly ascending chains. As a remedy, we present a novel local solver for general abstract equation systems, be they monotonic or not, and prove that this solver fails to terminate only when infinitely many variables are encountered. We clarify in which sense the computed results are sound. Moreover, we show that interprocedural analysis performed by this novel local solver, is guaranteed to terminate for all non-recursive programs --- irrespective of whether the complete lattice is infinite or has infinite strictly ascending or descending chains

    Generating analyzers with PAG

    Get PDF
    To produce high qualitiy code, modern compilers use global optimization algorithms based on it abstract interpretation. These algorithms are rather complex; their implementation is therfore a non-trivial task and error-prone. However, since thez are based on a common theory, they have large similar parts. We conclude that analyzer writing better should be replaced with analyzer generation. We present the tool sf PAG that has a high level functional input language to specify data flow analyses. It offers th specifications of even recursive data structures and is therfore not limited to bit vector problems. sf PAG generates efficient analyzers wich can be easily integrated in existing compilers. The analyzers are interprocedural, they can handle recursive procedures with local variables and higher order functions. sf PAG has successfully been tested by generating several analyzers (e.g. alias analysis, constant propagation, inerval analysis) for an industrial quality ANSI-C and Fortran90 compiler. This technical report consits of two parts; the first introduces the generation system and the second evaluates generated analyzers with respect to their space and time consumption. bf Keywords: data flow analysis, specification and generation of analyzers, lattice specification, abstract syntax specification, interprocedural analysis, compiler construction

    Acta Cybernetica : Volume 20. Number 4.

    Get PDF

    Program analysis of temporal memory mismanagement

    Full text link
    In the use of C/C++ programs, the performance benefits obtained from flexible low-level memory access and management sacrifice language-level support for memory safety and garbage collection. Memory-related programming mistakes are introduced as a result, rendering C/C++ programs prone to memory errors. A common category of programming mistakes is defined by the misplacement of deallocation operations, also known as temporal memory mismanagement, which can generate two types of bugs: (1) use-after-free (UAF) bugs and (2) memory leaks. The former are severe security vulnerabilities that expose programs to both data and control-flow exploits, while the latter are critical performance bugs that compromise software availability and reliability. In the case of UAF bugs, existing solutions that almost exclusively rely on dynamic analysis suffer from limitations, including low code coverage, binary incompatibility, and high overheads. In the case of memory leaks, detection techniques are abundant; however, fixing techniques have been poorly investigated. In this thesis, we present three novel program analysis frameworks to address temporal memory mismanagement in C/C++. First, we introduce Tac, the first static UAF detection framework to combine typestate analysis with machine learning. Tac identifies representative features to train a Support Vector Machine to classify likely true/false UAF candidates, thereby providing guidance for typestate analysis used to locate bugs with precision. We then present CRed, a pointer analysis-based framework for UAF detection with a novel context-reduction technique and a new demand-driven path-sensitive pointer analysis to boost scalability and precision. A major advantage of CRed is its ability to substantially and soundly reduce search space without losing bug-finding ability. This is achieved by utilizing must-not-alias information to truncate unnecessary segments of calling contexts. Finally, we propose AutoFix, an automated memory leak fixing framework based on value-flow analysis and static instrumentation that can fix all leaks reported by any front-end detector with negligible overheads safely and with precision. AutoFix tolerates false leaks with a shadow memory data structure carefully designed to keep track of the allocation and deallocation of potentially leaked memory objects. The contribution of this thesis is threefold. First, we advance existing state-of-the-art solutions to detecting memory leaks by proposing a series of novel program analysis techniques to address temporal memory mismanagement. Second, corresponding prototype tools are fully implemented in the LLVM compiler framework. Third, an extensive evaluation of open-source C/C++ benchmarks is conducted to validate the effectiveness of the proposed techniques

    BCFA: Bespoke Control Flow Analysis for CFA at Scale

    Full text link
    Many data-driven software engineering tasks such as discovering programming patterns, mining API specifications, etc., perform source code analysis over control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be expensive and performance of the analysis heavily depends on the underlying CFG traversal strategy. State-of-the-art analysis frameworks use a fixed traversal strategy. We argue that a single traversal strategy does not fit all kinds of analyses and CFGs and propose bespoke control flow analysis (BCFA). Given a control flow analysis (CFA) and a large number of CFGs, BCFA selects the most efficient traversal strategy for each CFG. BCFA extracts a set of properties of the CFA by analyzing the code of the CFA and combines it with properties of the CFG, such as branching factor and cyclicity, for selecting the optimal traversal strategy. We have implemented BCFA in Boa, and evaluated BCFA using a set of representative static analyses that mainly involve traversing CFGs and two large datasets containing 287 thousand and 162 million CFGs. Our results show that BCFA can speedup the large scale analyses by 1%-28%. Further, BCFA has low overheads; less than 0.2%, and low misprediction rate; less than 0.01%.Comment: 12 page

    A demand-driven solver for constraint-based control flow analysis

    Get PDF
    This thesis develops a demand driven solver for constraint based control flow analysis. Our approach is modular, flow-sensitive and scaling. It allows to efficiently construct the interprocedural control flow graph (ICFG) for object-oriented languages. The analysis is based on the formal semantics of a Java-like language. It is proven to be correct with respect to this semantics. The base algorithms are given and we evaluate the applicability of our approach to real world programs. Construction of the ICFG is a key problem for the translation and optimization of object-oriented languages. The more accurate these graphs are, the more applicable, precise and faster are these analyses. While most present techniques are flow-insensitive, we present a flow-sensitive approach that is scalable. The analysis result is twofold. On the one hand, it allows to identify and delete uncallable methods, thus minimizing the program\u27;s footprint. This is especially important in the setting of embedded systems, where usually memory resources are quite expensive. On the other hand, the interprocedural control flow graph generated is much more precise than those generated with present techniques. This allows for increased accuracy when performing data flow analyses. Also this aspect is important for embedded systems, as more precise analyses allow the compiler to apply better optimizations, resulting in smaller and/or faster programs. Experimental results are given that demonstrate the applicability and scalability of the analysis.Diese Arbeit entwickelt einen Bedarf-gesteuerten Löser fĂŒr Constraint- basierte Kontrollflußanalyse. Unser Ansatz ist modular, fluß-sensitiv and skaliert. Er erlaubt das effiziente Konstruieren des interprozeduralen Kontrollflußgraphen fuer objektorientierte Programmiersprachen. Die Analyse basiert auf der formalen Semantik einer Java-Ă€hnlichen Sprache und wird als korrekt bezĂŒglich dieser Semantik bewiesen. Wir prĂ€sentieren die grundlegenden Algorithmen und belegen die Anwendbarkeit unseres Ansatzes auf realistische Programme. Die Konstruktion des interprozeduralen Kontrollflußgraphen ist ein SchlĂŒsselproblem bei der Übersetzung und Optimierung objekt- orientierter Programmiersprachen. Je genauer diese Graphen sind, desto prĂ€ziser und schneller sind darauf arbeitende Datenfluß-Analysen. WĂ€hrend die meisten heute verbreiteten Techniken fluß-insensitiv sind, prĂ€sentieren wir einen skalierbaren fluß-sensitiven Ansatz. Unsere Analyse hat zwei Hauptergebnisse. Einerseits erlaubt sie, nicht erreichbare Methoden zu identifizieren und zu löschen, wodurch die GrĂ¶ĂŸe des erzeugten Programmes reduziert wird. Dies ist besonders fĂŒr eingebettete Systeme wichtig, bei denen zusĂ€tzlicher Speicherplatz teuer ist. Andererseits ist der mit unserem Ansatz berechnete interprozedurale Kontrollflußgraph wesentlich genauer als der von derzeitigen Techniken berechnete Graph. Dieser prĂ€zisere Graph erlaubt eine grĂ¶ĂŸere Genauigkeit bei Datenflußanalysen. Auch dieser Aspekt ist fĂŒr eingebettete Systeme von großer Bedeutung, da prĂ€zisere Analysen bessere Optimierungen erlauben. Hierdurch wird das erzeugte Programm kleiner und/oder schneller. Experimentelle Ergebnisse belegen die Anwendbarkeit und Skalierbarkeit unserer Analyse
    • 

    corecore