145 research outputs found
Faster Algorithms for Weighted Recursive State Machines
Pushdown systems (PDSs) and recursive state machines (RSMs), which are
linearly equivalent, are standard models for interprocedural analysis. Yet RSMs
are more convenient as they (a) explicitly model function calls and returns,
and (b) specify many natural parameters for algorithmic analysis, e.g., the
number of entries and exits. We consider a general framework where RSM
transitions are labeled from a semiring and path properties are algebraic with
semiring operations, which can model, e.g., interprocedural reachability and
dataflow analysis problems.
Our main contributions are new algorithms for several fundamental problems.
As compared to a direct translation of RSMs to PDSs and the best-known existing
bounds of PDSs, our analysis algorithm improves the complexity for
finite-height semirings (that subsumes reachability and standard dataflow
properties). We further consider the problem of extracting distance values from
the representation structures computed by our algorithm, and give efficient
algorithms that distinguish the complexity of a one-time preprocessing from the
complexity of each individual query. Another advantage of our algorithm is that
our improvements carry over to the concurrent setting, where we improve the
best-known complexity for the context-bounded analysis of concurrent RSMs.
Finally, we provide a prototype implementation that gives a significant
speed-up on several benchmarks from the SLAM/SDV project
Weighted pushdown systems and their application to interprocedural dataflow analysis
AbstractRecently, pushdown systems (PDSs) have been extended to weighted PDSs, in which each transition is labeled with a value, and the goal is to determine the meet-over-all-paths value (for paths that meet a certain criterion). This paper shows how weighted PDSs yield new algorithms for certain classes of interprocedural dataflow-analysis problems
Parameterized Algorithms for Scalable Interprocedural Data-flow Analysis
Data-flow analysis is a general technique used to compute information of
interest at different points of a program and is considered to be a cornerstone
of static analysis. In this thesis, we consider interprocedural data-flow
analysis as formalized by the standard IFDS framework, which can express many
widely-used static analyses such as reaching definitions, live variables, and
null-pointer. We focus on the well-studied on-demand setting in which queries
arrive one-by-one in a stream and each query should be answered as fast as
possible. While the classical IFDS algorithm provides a polynomial-time
solution to this problem, it is not scalable in practice. Specifically, it
either requires a quadratic-time preprocessing phase or takes linear time per
query, both of which are untenable for modern huge codebases with hundreds of
thousands of lines. Previous works have already shown that parameterizing the
problem by the treewidth of the program's control-flow graph is promising and
can lead to significant gains in efficiency. Unfortunately, these results were
only applicable to the limited special case of same-context queries.
In this work, we obtain significant speedups for the general case of
on-demand IFDS with queries that are not necessarily same-context. This is
achieved by exploiting a new graph sparsity parameter, namely the treedepth of
the program's call graph. Our approach is the first to exploit the sparsity of
control-flow graphs and call graphs at the same time and parameterize by both
treewidth and treedepth. We obtain an algorithm with a linear preprocessing
phase that can answer each query in constant time with respect to the input
size. Finally, we show experimental results demonstrating that our approach
significantly outperforms the classical IFDS and its on-demand variant
Pluggable abstract domains for analyzing embedded software
ManuscriptMany abstract value domains such as intervals, bitwise, constants, and value-sets have been developed to support dataflow analysis. Different domains offer alternative tradeoffs between analysis speed and precision. Furthermore, some domains are a better match for certain kinds of code than others. This paper presents the design and implementation of cXprop, an analysis and transformation tool for C that implements "conditional X propagation," a generalization of the well-known conditional constant propagation algorithm where X is an abstract value domain supplied by the user. cXprop is interprocedural, context-insensitive, and achieves reasonable precision on pointer-rich codes. We have applied cXprop to sensor network programs running on TinyOS, in order to reduce code size through interprocedural dead code elimination, and to find limited-bitwidth global variables. Our analysis of global variables is supported by a novel concurrency model for interruptdriven software. cXprop reduces TinyOS application code size by an average of 9.2% and predicts an average data size reduction of 8.2% through RAM compression
Contributions to the Construction of Extensible Semantic Editors
This dissertation addresses the need for easier construction and extension of language tools. Specifically, the construction and extension of so-called semantic editors is considered, that is, editors providing semantic services for code comprehension and manipulation. Editors like these are typically found in state-of-the-art development environments, where they have been developed by hand. The list of programming languages available today is extensive and, with the lively creation of new programming languages and the evolution of old languages, it keeps growing. Many of these languages would benefit from proper tool support. Unfortunately, the development of a semantic editor can be a time-consuming and error-prone endeavor, and too large an effort for most language communities. Given the complex nature of programming, and the huge benefits of good tool support, this lack of tools is problematic. In this dissertation, an attempt is made at narrowing the gap between generative solutions and how state-of-the-art editors are constructed today. A generative alternative for construction of textual semantic editors is explored with focus on how to specify extensible semantic editor services. Specifically, this dissertation shows how semantic services can be specified using a semantic formalism called refer- ence attribute grammars (RAGs), and how these services can be made responsive enough for editing, and be provided also when the text in an editor is erroneous. Results presented in this dissertation have been found useful, both in industry and in academia, suggesting that the explored approach may help to reduce the effort of editor construction
Foo's To Blame: Techniques For Mapping Performance Data To Program Variables
Traditional methods of performance analysis offer a code centric view, presenting performance data in terms of blocks of contiguous code (statement, basic block, loop, function, etc.). Existing data centric techniques allow various program properties to be mapped directly to variables. Our approach extends these data centric mappings. Just as code centric techniques allow lower level objects like source lines be mapped up to functions, our inclusive technique allows low level data centric operations like computations on scalars to be mapped up to complex data structures like those found in scientific frameworks. Our system utilizes static analysis to collect information about the program that can be combined with runtime information to perform data centric program analysis. By pushing most of the analysis to pre-run and post-mortem, we can minimize the amount of data collected at runtime. This allows us to perform less instrumentation and also minimizes program perturbation. It also allows us to collect information that would not be possible with existing techniques.
We present two applications of this analysis. The first application of our analysis is targeted at mapping performance data to high level data structures with multiple levels of abstraction. We create extended data centric mappings, which we call variable blame, that relates data centric information to these variables.
The second application is a method for mapping cache miss information to variables. Existing approaches for this analysis rely on explicit hardware support and extensive program instrumentation. By utilizing our analysis and applying software heuristics, we are able to lessen those requirements.
We apply both of these analyses to applications and show what performance information can be provided by our analysis that can not currently be determined. We also discuss how we can use that information to improve program performance
Deductive formal verification of embedded systems
We combine static analysis techniques with model-based deductive verification using SMT solvers to provide a framework that, given an analysis aspect of the source code, automatically generates an analyzer capable of inferring information about that aspect.
The analyzer is generated by translating the collecting semantics of a program to a formula in first order logic over multiple underlying theories. We import the semantics of the API invocations as first order logic assertions. These assertions constitute the models used by the analyzer. Logical specification of the desired program behavior is incorporated as a first order logic formula. An SMT-LIB solver treats the combined formula as a constraint and solves it. The solved form can be used to identify logical and security errors in embedded programs. We have used this framework to analyze Android applications and MATLAB code.
We also report the formal verification of the conformance of the open source Netgear WNR3500L wireless router firmware implementation to the RFC 2131. Formal verification of a software system is essential for its deployment in mission-critical environments. The specifications for the development of routers are provided by RFCs that are only described informally in English. It is prudential to ensure that a router firmware conforms to its corresponding RFC before it can be deployed for managing mission-critical networks. The formal verification process demonstrates the usefulness of inductive types and higher-order logic in software certification
- …