
    Faster Algorithms for Weighted Recursive State Machines

    Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow-analysis problems. Our main contributions are new algorithms for several fundamental problems. Compared to a direct translation of RSMs to PDSs and the best-known existing bounds for PDSs, our analysis algorithm improves the complexity for finite-height semirings (which subsume reachability and standard dataflow properties). We further consider the problem of extracting distance values from the representation structures computed by our algorithm, and give efficient algorithms that distinguish the complexity of a one-time preprocessing from the complexity of each individual query. Another advantage of our algorithm is that our improvements carry over to the concurrent setting, where we improve the best-known complexity for the context-bounded analysis of concurrent RSMs. Finally, we provide a prototype implementation that gives a significant speed-up on several benchmarks from the SLAM/SDV project.
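
    The semiring framework can be made concrete with a small sketch. The code below is a hypothetical illustration of the general idea, not the paper's algorithm: weights are extended along a path, combined across paths, and chaotic iteration reaches a fixpoint whenever the semiring has finite height. Instantiating it with the Boolean semiring gives plain reachability; the tropical (min, +) semiring gives shortest distances.

        # Minimal sketch of an algebraic path property over a semiring;
        # hypothetical illustration, not the paper's RSM algorithm.
        def solve(edges, source, zero, one, plus, times):
            """Combine-over-all-paths values from `source` by chaotic
            iteration; terminates for finite-height semirings."""
            val = {source: one}
            changed = True
            while changed:
                changed = False
                for u, v, w in edges:
                    if u in val:
                        new = plus(val.get(v, zero), times(val[u], w))
                        if new != val.get(v, zero):
                            val[v] = new
                            changed = True
            return val

        # Boolean semiring: plain reachability.
        reach = solve([("a", "b", True), ("b", "c", True)], "a",
                      zero=False, one=True,
                      plus=lambda x, y: x or y, times=lambda x, y: x and y)
        assert reach["c"]

        # Tropical (min, +) semiring: shortest distances.
        INF = float("inf")
        dist = solve([("a", "b", 2), ("b", "c", 3), ("a", "c", 10)], "a",
                     zero=INF, one=0, plus=min, times=lambda x, y: x + y)
        assert dist["c"] == 5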

    Weighted pushdown systems and their application to interprocedural dataflow analysis

    Recently, pushdown systems (PDSs) have been extended to weighted PDSs, in which each transition is labeled with a value, and the goal is to determine the meet-over-all-paths value (for paths that meet a certain criterion). This paper shows how weighted PDSs yield new algorithms for certain classes of interprocedural dataflow-analysis problems.
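
    As a tiny worked instance of the meet-over-all-paths idea (illustrative only; the paper's weights and merge operations are more general), take weights to be transfer functions on an abstract store, compose them along a path, and meet the results of all paths reaching a point:

        # Illustrative only: meet-over-all-paths with weights as transfer
        # functions, in a constant-propagation style. None means "unknown".
        def meet(a, b):
            """Pointwise meet of two abstract stores."""
            return {k: (a[k] if a.get(k) == b.get(k) else None)
                    for k in set(a) | set(b)}

        # Two paths reach the same program point with different effects:
        path1 = [lambda s: {**s, "x": 1}]            # x := 1
        path2 = [lambda s: {**s, "x": 1, "y": 2}]    # x := 1; y := 2

        def apply_path(path, store):
            for f in path:                           # extend = composition
                store = f(store)
            return store

        s0 = {}
        print(meet(apply_path(path1, s0), apply_path(path2, s0)))
        # {'x': 1, 'y': None}: x is 1 on all paths, y is not constant.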

    Parameterized Algorithms for Scalable Interprocedural Data-flow Analysis

    Data-flow analysis is a general technique used to compute information of interest at different points of a program and is considered to be a cornerstone of static analysis. In this thesis, we consider interprocedural data-flow analysis as formalized by the standard IFDS framework, which can express many widely-used static analyses such as reaching definitions, live variables, and null-pointer analysis. We focus on the well-studied on-demand setting in which queries arrive one-by-one in a stream and each query should be answered as fast as possible. While the classical IFDS algorithm provides a polynomial-time solution to this problem, it is not scalable in practice. Specifically, it either requires a quadratic-time preprocessing phase or takes linear time per query, both of which are untenable for modern huge codebases with hundreds of thousands of lines of code. Previous works have already shown that parameterizing the problem by the treewidth of the program's control-flow graph is promising and can lead to significant gains in efficiency. Unfortunately, these results were only applicable to the limited special case of same-context queries. In this work, we obtain significant speedups for the general case of on-demand IFDS with queries that are not necessarily same-context. This is achieved by exploiting a new graph sparsity parameter, namely the treedepth of the program's call graph. Our approach is the first to exploit the sparsity of control-flow graphs and call graphs at the same time and parameterize by both treewidth and treedepth. We obtain an algorithm with a linear preprocessing phase that can answer each query in constant time with respect to the input size. Finally, we show experimental results demonstrating that our approach significantly outperforms the classical IFDS and its on-demand variant.
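
    The preprocessing/query trade-off can be seen in miniature in the sketch below. It assumes an already-built exploded supergraph (the GRAPH map and its node names are invented for illustration) and simply caches query answers; it is not the thesis's treewidth/treedepth-parameterized algorithm.

        # Toy sketch of on-demand queries over an (already built) exploded
        # supergraph; repeated queries are answered from a cache, mimicking
        # the on-demand setting. Node names below are invented.
        from functools import lru_cache

        GRAPH = {
            ("main", "d0"): [("f", "x")],    # call edge carrying a fact
            ("f", "x"): [("f", "ret")],
            ("f", "ret"): [("main", "d1")],  # return edge to the caller
        }

        @lru_cache(maxsize=None)             # constant time on a repeat query
        def query(src, dst):
            """Is the fact `dst` reachable from `src`? Plain DFS."""
            stack, seen = [src], {src}
            while stack:
                node = stack.pop()
                if node == dst:
                    return True
                for nxt in GRAPH.get(node, []):
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            return False

        print(query(("main", "d0"), ("main", "d1")))   # True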

    Pluggable abstract domains for analyzing embedded software

    Many abstract value domains such as intervals, bitwise, constants, and value-sets have been developed to support dataflow analysis. Different domains offer alternative tradeoffs between analysis speed and precision. Furthermore, some domains are a better match for certain kinds of code than others. This paper presents the design and implementation of cXprop, an analysis and transformation tool for C that implements "conditional X propagation," a generalization of the well-known conditional constant propagation algorithm where X is an abstract value domain supplied by the user. cXprop is interprocedural, context-insensitive, and achieves reasonable precision on pointer-rich codes. We have applied cXprop to sensor network programs running on TinyOS, in order to reduce code size through interprocedural dead code elimination, and to find limited-bitwidth global variables. Our analysis of global variables is supported by a novel concurrency model for interrupt-driven software. cXprop reduces TinyOS application code size by an average of 9.2% and predicts an average data size reduction of 8.2% through RAM compression.
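
    The pluggable-domain idea amounts to making the propagation engine generic over a small interface. The sketch below is a hypothetical rendering (cXprop itself is a C tool); an interval domain stands in for the user-supplied X, and any domain offering the same join and transfer operations could be swapped in.

        # Hypothetical sketch of a pluggable abstract value domain, in the
        # spirit of "conditional X propagation" (cXprop itself is C code).
        class Interval:
            """Interval domain: one of the X's a user could plug in."""
            def __init__(self, lo, hi):
                self.lo, self.hi = lo, hi

            def join(self, other):           # least upper bound at merges
                return Interval(min(self.lo, other.lo),
                                max(self.hi, other.hi))

            def add(self, other):            # abstract transfer for '+'
                return Interval(self.lo + other.lo, self.hi + other.hi)

            def __repr__(self):
                return f"[{self.lo}, {self.hi}]"

        # A generic propagation step relies only on join/transfer methods,
        # so constants, bitwise, or value-set domains fit the same slot.
        a = Interval(0, 4).join(Interval(2, 8))     # control-flow merge
        print(a, a.add(Interval(1, 1)))             # [0, 8] [1, 9]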

    Contributions to the Construction of Extensible Semantic Editors

    This dissertation addresses the need for easier construction and extension of language tools. Specifically, the construction and extension of so-called semantic editors is considered, that is, editors providing semantic services for code comprehension and manipulation. Editors like these are typically found in state-of-the-art development environments, where they have been developed by hand. The list of programming languages available today is extensive and, with the lively creation of new programming languages and the evolution of old languages, it keeps growing. Many of these languages would benefit from proper tool support. Unfortunately, the development of a semantic editor can be a time-consuming and error-prone endeavor, and too large an effort for most language communities. Given the complex nature of programming, and the huge benefits of good tool support, this lack of tools is problematic. In this dissertation, an attempt is made at narrowing the gap between generative solutions and how state-of-the-art editors are constructed today. A generative alternative for construction of textual semantic editors is explored with focus on how to specify extensible semantic editor services. Specifically, this dissertation shows how semantic services can be specified using a semantic formalism called reference attribute grammars (RAGs), and how these services can be made responsive enough for editing, and be provided also when the text in an editor is erroneous. Results presented in this dissertation have been found useful, both in industry and in academia, suggesting that the explored approach may help to reduce the effort of editor construction.
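
    A reference attribute can be illustrated in miniature: attributes are defined per node type, evaluated on demand with memoization, and may refer to other AST nodes. The sketch below is a hypothetical Python rendering of the idea, not actual RAG or JastAdd syntax.

        # Toy illustration of a reference attribute: a name use resolves,
        # on demand, to the AST node declaring it. Hypothetical sketch.
        from functools import cached_property

        class Decl:
            def __init__(self, name):
                self.name = name

        class Use:
            def __init__(self, name, scope):
                self.name, self.scope = name, scope

            @cached_property                 # attributes memoize on demand
            def decl(self):                  # a *reference* attribute
                return next((d for d in self.scope
                             if d.name == self.name), None)

        scope = [Decl("x"), Decl("y")]
        print(Use("y", scope).decl.name)     # 'y': editor services such as
                                             # go-to-declaration and rename
                                             # build on such attributes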

    Foo's To Blame: Techniques For Mapping Performance Data To Program Variables

    Traditional methods of performance analysis offer a code-centric view, presenting performance data in terms of blocks of contiguous code (statement, basic block, loop, function, etc.). Existing data-centric techniques allow various program properties to be mapped directly to variables. Our approach extends these data-centric mappings. Just as code-centric techniques allow lower-level objects like source lines to be mapped up to functions, our inclusive technique allows low-level data-centric operations, like computations on scalars, to be mapped up to complex data structures such as those found in scientific frameworks. Our system utilizes static analysis to collect information about the program that can be combined with runtime information to perform data-centric program analysis. By pushing most of the analysis to pre-run and post-mortem phases, we minimize the amount of data collected at runtime. This allows us to perform less instrumentation and also minimizes program perturbation. It also allows us to collect information that would not be possible with existing techniques. We present two applications of this analysis. The first is targeted at mapping performance data to high-level data structures with multiple levels of abstraction. We create extended data-centric mappings, which we call variable blame, that relate data-centric information to these variables. The second is a method for mapping cache-miss information to variables. Existing approaches for this analysis rely on explicit hardware support and extensive program instrumentation. By utilizing our analysis and applying software heuristics, we are able to lessen those requirements. We apply both of these analyses to applications and show what performance information our analysis can provide that cannot currently be determined. We also discuss how that information can be used to improve program performance.
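
    One ingredient of such a data-centric mapping can be sketched in isolation: attributing sampled memory addresses to variables through a table of address ranges. This is a deliberate simplification with invented addresses and names; the system described above combines much richer static and runtime information.

        # Hypothetical simplification: attribute sampled addresses to
        # variables using a sorted table of (start, end, name) ranges
        # that static analysis could provide.
        import bisect

        RANGES = sorted([(0x1000, 0x1040, "grid"),
                         (0x2000, 0x2100, "particles")])
        STARTS = [r[0] for r in RANGES]

        def blame(addr):
            i = bisect.bisect_right(STARTS, addr) - 1
            if i >= 0 and addr < RANGES[i][1]:
                return RANGES[i][2]
            return None              # sample outside any known variable

        counts = {}
        for sample in [0x1008, 0x2010, 0x2020, 0x3000]:  # profiled samples
            counts[blame(sample)] = counts.get(blame(sample), 0) + 1
        print(counts)   # {'grid': 1, 'particles': 2, None: 1}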

    Deductive formal verification of embedded systems

    We combine static analysis techniques with model-based deductive verification using SMT solvers to provide a framework that, given an analysis aspect of the source code, automatically generates an analyzer capable of inferring information about that aspect. The analyzer is generated by translating the collecting semantics of a program to a formula in first-order logic over multiple underlying theories. We import the semantics of the API invocations as first-order logic assertions. These assertions constitute the models used by the analyzer. A logical specification of the desired program behavior is incorporated as a first-order logic formula. An SMT-LIB solver treats the combined formula as a constraint and solves it. The solved form can be used to identify logical and security errors in embedded programs. We have used this framework to analyze Android applications and MATLAB code. We also report the formal verification of the conformance of the open-source Netgear WNR3500L wireless router firmware implementation to RFC 2131. Formal verification of a software system is essential for its deployment in mission-critical environments. The specifications for the development of routers are provided by RFCs, which are described only informally in English. It is prudent to ensure that a router firmware conforms to its corresponding RFC before it can be deployed for managing mission-critical networks. The formal verification process demonstrates the usefulness of inductive types and higher-order logic in software certification.
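
    The core step, translating a program fragment and a desired property into a formula handed to an SMT solver, can be sketched with Z3's Python bindings (one concrete SMT-LIB-capable backend; the framework above is not tied to Z3). A satisfiable query means the property can be violated on the encoded path.

        # Sketch: encode "can this program point violate its assertion?"
        # as an SMT query, using Z3 as one possible SMT-LIB backend.
        from z3 import Int, Solver, Not, sat

        x, y = Int("x"), Int("y")
        s = Solver()
        # Semantics of the path:  y = x + 1;  assert y > x
        s.add(y == x + 1)            # program effect as a logical assertion
        s.add(Not(y > x))            # negation of the asserted property
        if s.check() == sat:
            print("assertion can fail:", s.model())
        else:
            print("assertion verified on this path")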