217 research outputs found

    An Algebraic Framework for Compositional Program Analysis

    Full text link
    The purpose of a program analysis is to compute an abstract meaning for a program which approximates its dynamic behaviour. A compositional program analysis accomplishes this task with a divide-and-conquer strategy: the meaning of a program is computed by dividing it into sub-programs, computing their meaning, and then combining the results. Compositional program analyses are desirable because they can yield scalable (and easily parallelizable) program analyses. This paper presents algebraic framework for designing, implementing, and proving the correctness of compositional program analyses. A program analysis in our framework defined by an algebraic structure equipped with sequencing, choice, and iteration operations. From the analysis design perspective, a particularly interesting consequence of this is that the meaning of a loop is computed by applying the iteration operator to the loop body. This style of compositional loop analysis can yield interesting ways of computing loop invariants that cannot be defined iteratively. We identify a class of algorithms, the so-called path-expression algorithms [Tarjan1981,Scholz2007], which can be used to efficiently implement analyses in our framework. Lastly, we develop a theory for proving the correctness of an analysis by establishing an approximation relationship between an algebra defining a concrete semantics and an algebra defining an analysis.Comment: 15 page

    Enforcing Termination of Interprocedural Analysis

    Full text link
    Interprocedural analysis by means of partial tabulation of summary functions may not terminate when the same procedure is analyzed for infinitely many abstract calling contexts or when the abstract domain has infinite strictly ascending chains. As a remedy, we present a novel local solver for general abstract equation systems, be they monotonic or not, and prove that this solver fails to terminate only when infinitely many variables are encountered. We clarify in which sense the computed results are sound. Moreover, we show that interprocedural analysis performed by this novel local solver, is guaranteed to terminate for all non-recursive programs --- irrespective of whether the complete lattice is infinite or has infinite strictly ascending or descending chains

    Incremental Analysis of Programs

    Get PDF
    Algorithms used to determine the control and data flow properties of computer programs are generally designed for one-time analysis of an entire new input. Application of such algorithms when the input is only slightly modified results in an inefficient system. In this theses a set of incremental update algorithms are presented for data flow analysis. These algorithms update the solution from a previous analysis to reflect changes in the program. Thus, extensive reanalysis to reflect changes in the program. Thus, extensive reanalysis of programs after each program modification can be avoided. The incremental update algorithms presented for global flow analysis are based on Hecht/Ullman iterative algorithms. Banning\u27s interprocedural data flow analysis algorithms form the basis for the incremental interprocedural algorithms

    Faster Algorithms for Weighted Recursive State Machines

    Full text link
    Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow analysis problems. Our main contributions are new algorithms for several fundamental problems. As compared to a direct translation of RSMs to PDSs and the best-known existing bounds of PDSs, our analysis algorithm improves the complexity for finite-height semirings (that subsumes reachability and standard dataflow properties). We further consider the problem of extracting distance values from the representation structures computed by our algorithm, and give efficient algorithms that distinguish the complexity of a one-time preprocessing from the complexity of each individual query. Another advantage of our algorithm is that our improvements carry over to the concurrent setting, where we improve the best-known complexity for the context-bounded analysis of concurrent RSMs. Finally, we provide a prototype implementation that gives a significant speed-up on several benchmarks from the SLAM/SDV project

    Static Execute After algorithms as alternatives for impact analysis

    Get PDF
    Impact analysis plays an important role in many software engineering tasks such as software maintenance, regression testing and debugging. In this paper, we present a static method to compute the impact sets of particular program points. For large programs, this method is more effective than the slightly more precise slicing. Our technique can also be used on larger programs with over thousands of lines of code where no slicers can be applied since the determination of the program dependence graphs, which are the bases of slicing, is an especially expensive task. As a result, our method could be efficiently used in the field of impact analysis

    Faster Algorithms for Dynamic Algebraic Queries in Basic RSMs with Constant Treewidth

    Get PDF
    Interprocedural analysis is at the heart of numerous applications in programming languages, such as alias analysis, constant propagation, and so on. Recursive state machines (RSMs) are standard models for interprocedural analysis. We consider a general framework with RSMs where the transitions are labeled from a semiring and path properties are algebraic with semiring operations. RSMs with algebraic path properties can model interprocedural dataflow analysis problems, the shortest path problem, the most probable path problem, and so on. The traditional algorithms for interprocedural analysis focus on path properties where the starting point is fixed as the entry point of a specific method. In this work, we consider possible multiple queries as required in many applications such as in alias analysis. The study of multiple queries allows us to bring in an important algorithmic distinction between the resource usage of the one-time preprocessing vs for each individual query. The second aspect we consider is that the control flow graphs for most programs have constant treewidth. Our main contributions are simple and implementable algorithms that support multiple queries for algebraic path properties for RSMs that have constant treewidth. Our theoretical results show that our algorithms have small additional one-time preprocessing but can answer subsequent queries significantly faster as compared to the current algorithmic solutions for interprocedural dataflow analysis. We have also implemented our algorithms and evaluated their performance for performing on-demand interprocedural dataflow analysis on various domains, such as for live variable analysis and reaching definitions, on a standard benchmark set. Our experimental results align with our theoretical statements and show that after a lightweight preprocessing, on-demand queries are answered much faster than the standard existing algorithmic approaches

    Identifying reusable functions in code using specification driven techniques

    Get PDF
    The work described in this thesis addresses the field of software reuse. Software reuse is widely considered as a way to increase the productivity and improve the quality and reliability of new software systems. Identifying, extracting and reengineering software. components which implement abstractions within existing systems is a promising cost-effective way to create reusable assets. Such a process is referred to as reuse reengineering. A reference paradigm has been defined within the RE(^2) project which decomposes a reuse reengineering process in five sequential phases. In particular, the first phase of the reference paradigm, called Candidature phase, is concerned with the analysis of source code for the identification of software components implementing abstractions and which are therefore candidate to be reused. Different candidature criteria exist for the identification of reuse-candidate software components. They can be classified in structural methods (based on structural properties of the software) and specification driven methods (that search for software components implementing a given specification).In this thesis a new specification driven candidature criterion for the identification and the extraction of code fragments implementing functional abstractions is presented. The method is driven by a formal specification of the function to be isolated (given in terms of a precondition and a post condition) and is based on the theoretical frameworks of program slicing and symbolic execution. Symbolic execution and theorem proving techniques are used to map the specification of the functional abstractions onto a slicing criterion. Once the slicing criterion has been identified the slice is isolated using algorithms based on dependence graphs. The method has been specialised for programs written in the C language. Both symbolic execution and program slicing are performed by exploiting the Combined C Graph (CCG), a fine-grained dependence based program representation that can be used for several software maintenance tasks
    • …
    corecore