Differentially Testing Soundness and Precision of Program Analyzers
In recent decades, numerous program analyzers have been developed by both
academia and industry. Despite their abundance, however, there is currently no
systematic way of comparing the effectiveness of different analyzers on
arbitrary code. In this paper, we present the first automated technique for
differentially testing soundness and precision of program analyzers. We used
our technique to compare six mature, state-of-the-art analyzers on tens of
thousands of automatically generated benchmarks. Our technique detected
soundness and precision issues in most analyzers, and we evaluated the
implications of these issues for both designers and users of program analyzers.
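The approach can be pictured with a small harness: generate programs whose verdict is known by construction, run each analyzer on them, and flag a soundness issue when a buggy program is reported safe, or a precision issue when a safe program is reported unsafe. The Python sketch below is hypothetical; the analyzer names, command lines, and output keywords are assumptions for illustration, not the interfaces of the tools evaluated in the paper.

import subprocess

# Hypothetical analyzer front ends; each maps a program to "safe"/"unsafe"/"unknown".
ANALYZERS = {
    "analyzerA": ["analyzerA", "--check"],
    "analyzerB": ["analyzerB", "-verify"],
}

def run_analyzer(cmd, program_path):
    # Run one analyzer and normalize its verdict (output format is assumed).
    out = subprocess.run(cmd + [program_path], capture_output=True, text=True).stdout
    if "VERIFIED" in out:
        return "safe"
    if "VIOLATION" in out:
        return "unsafe"
    return "unknown"

def differential_check(program_path, ground_truth):
    # Compare every analyzer's verdict against the known verdict of the benchmark.
    issues = []
    for name, cmd in ANALYZERS.items():
        verdict = run_analyzer(cmd, program_path)
        if ground_truth == "unsafe" and verdict == "safe":
            issues.append((name, "soundness issue: buggy program reported safe"))
        elif ground_truth == "safe" and verdict == "unsafe":
            issues.append((name, "precision issue: safe program reported unsafe"))
    return issues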
Optimizing compilation with preservation of structural code coverage metrics to support software testing
Code-coverage-based testing is a widely used testing strategy that aims to provide a meaningful decision criterion for the adequacy of a test suite. It is also mandated for the development of safety-critical applications; for example, the DO-178B document requires the application of modified condition/decision coverage. One critical issue of code-coverage-based testing is that structural code coverage criteria are typically applied to source code, whereas the generated machine code may have a different structure because of code optimizations performed by the compiler. In this work, we present the automatic calculation of coverage profiles describing which structural code-coverage criteria are preserved by which code optimization, independently of the concrete test suite. These coverage profiles make it easy to extend a compiler so that it preserves any given code-coverage criterion by enabling only those code optimizations that preserve it. Furthermore, we describe the integration of these coverage profiles into the GCC compiler. With these coverage profiles, we answer the question of how much code optimization is possible without compromising the error-detection likelihood of a given test suite. Experimental results show that the performance cost of preserving structural code coverage in GCC is rather low.
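As an illustration of how such coverage profiles can be consumed, the hypothetical Python sketch below filters an optimization list so that only optimizations preserving a requested criterion are enabled; the profile entries are invented for illustration and do not reproduce the paper's actual classification.

# Hypothetical coverage profiles: which structural coverage criteria each
# optimization preserves (entries invented for illustration).
COVERAGE_PROFILES = {
    "constant-folding": {"statement", "decision", "mcdc"},
    "dead-code-elimination": {"statement"},
    "loop-unrolling": set(),          # assumed to preserve none of the criteria
    "common-subexpression": {"statement", "decision"},
}

def optimizations_preserving(criterion):
    # Optimizations that may be enabled without compromising the given criterion.
    return [opt for opt, preserved in COVERAGE_PROFILES.items()
            if criterion in preserved]

# Example: select the optimizations considered safe for MC/DC-based testing.
print(optimizations_preserving("mcdc"))   # ['constant-folding']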
The SeaHorn Verification Framework
In this paper, we present SeaHorn, a software verification framework. The key distinguishing feature of SeaHorn is its modular design, which separates the concerns of the syntax of the programming language, its operational semantics, and the verification semantics. SeaHorn encompasses several novelties: it (a) encodes verification conditions using an efficient yet precise inter-procedural technique, (b) provides flexibility in the verification semantics to allow different levels of precision, (c) leverages the state of the art in software model checking and abstract interpretation for verification, and (d) uses Horn clauses as an intermediate language to represent verification conditions, which simplifies interfacing with multiple verification tools based on Horn clauses. SeaHorn provides users with a powerful verification tool and researchers with an extensible and customizable framework for experimenting with new software verification techniques. The effectiveness and scalability of SeaHorn are demonstrated by an extensive experimental evaluation using benchmarks from SV-COMP 2015 and real avionics code.
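To illustrate the Horn-clause view of verification conditions (independently of SeaHorn's actual encoding), the sketch below writes the conditions for a toy counting loop as three Horn clauses over an invariant predicate and checks, with the z3 Python bindings, that a candidate invariant discharges them; a real Horn solver such as the one behind SeaHorn would infer the invariant instead of being handed one.

from z3 import Ints, And, Implies, Not, Solver, unsat

# Toy program:  x = 0; while (x < 10) x = x + 1;  assert x == 10
x, x1 = Ints("x x1")

def inv(v):
    # Candidate loop invariant (normally inferred by the Horn solver).
    return And(v >= 0, v <= 10)

# Horn-clause verification conditions for the loop.
vcs = [
    Implies(x == 0, inv(x)),                               # initiation
    Implies(And(inv(x), x < 10, x1 == x + 1), inv(x1)),    # consecution
    Implies(And(inv(x), Not(x < 10)), x == 10),            # safety
]

for vc in vcs:
    s = Solver()
    s.add(Not(vc))        # a clause is valid iff its negation is unsatisfiable
    assert s.check() == unsat
print("all verification conditions discharged")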
Precise set sharing analysis for Java-style programs (and proofs).
Finding useful sharing information between instances in object-oriented programs has recently been the focus of much research. The applications of such static analysis are multiple: by knowing which variables definitely do not share in memory, we can apply conventional compiler optimizations, find coarse-grained parallelism opportunities, or, more importantly, verify certain correctness aspects of programs even in the absence of annotations. In this paper we introduce a framework for deriving precise sharing information based on abstract interpretation for a Java-like language. Our analysis achieves precision in various ways, including supporting multivariance, which allows separating different contexts. We propose a combined Set Sharing + Nullity + Classes domain which captures which instances do not share and which ones are definitely null, and which uses the classes to refine the static information when inheritance is present. The use of a set sharing abstraction allows a more precise representation of the existing sharings and is crucial in achieving precision during interprocedural analysis. Carrying the domains in a combined way facilitates the interaction among them in the presence of multivariance in the analysis. We show through examples and
experimentally that both the set sharing part of the domain and the combined domain provide more accurate information than previous work based on pair sharing domains, at reasonable cost.
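A minimal picture of a set-sharing domain: an abstract state is a set of sharing groups, each collecting variables that may reach a common heap object. The Python sketch below gives the abstract effect of two statements in this simplified setting; it illustrates the idea only and is not the combined Set Sharing + Nullity + Classes domain of the paper.

def kill(sh, x):
    # Drop x from every sharing group: x no longer reaches its old objects.
    return {s - {x} for s in sh if s - {x}}

def assign_new(sh, x):
    # x = new C(): x points to a fresh object shared with nobody else.
    return kill(sh, x) | {frozenset({x})}

def assign_copy(sh, x, y):
    # x = y: every object y may reach is now also reachable from x.
    return {s | {x} if y in s else s for s in kill(sh, x)}

# Example: a = new; b = new; c = a  yields groups {a, c} and {b}, so the
# analysis knows b definitely does not share with a or c.
sh = set()
sh = assign_new(sh, "a")
sh = assign_new(sh, "b")
sh = assign_copy(sh, "c", "a")
print(sh)   # {frozenset({'a', 'c'}), frozenset({'b'})}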
Synthesizing Iterators from Abstraction Functions
A technique for synthesizing iterators from declarative abstraction functions written in a relational logic specification language is described. The logic includes a transitive closure operator that makes it convenient for expressing reachability queries on linked data structures. Some optimizations, including tuple elimination, iterator flattening, and traversal state reduction, are used to improve performance of the generated iterators.
A case study demonstrates that most of the iterators in the widely used JDK Collections classes can be replaced with code synthesized from declarative abstraction functions. These synthesized iterators perform competitively with the hand-written originals.
In a user study, the synthesized iterators always passed more test cases than the hand-written ones, were almost always as efficient, usually took less programmer effort, and were the qualitative preference of all participants who provided free-form comments.
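The flavor of the approach can be conveyed with a small hypothetical sketch: if the abstraction function of a linked list is "the values of all nodes reachable from head via next", an iterator falls out mechanically by enumerating that reachability closure. The Python generator below only illustrates the idea and is not the paper's relational-logic synthesis.

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class LinkedList:
    def __init__(self, head=None):
        self.head = head

    # Declarative abstraction function: the list's elements are the values of
    # all nodes reachable from head by following `next` (a transitive closure).
    def __iter__(self):
        seen = set()               # traversal state: also guards against cycles
        node = self.head
        while node is not None and id(node) not in seen:
            seen.add(id(node))
            yield node.value
            node = node.next

# The "synthesized" iterator behaves like a hand-written one.
lst = LinkedList(Node(1, Node(2, Node(3))))
print(list(lst))   # [1, 2, 3]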
Compatible Remediation on Vulnerabilities from Third-Party Libraries for Java Projects
With the increasing disclosure of vulnerabilities in open-source software,
software composition analysis (SCA) has been widely applied to reveal
third-party libraries and the associated vulnerabilities in software projects.
Beyond the revelation, SCA tools adopt various remediation strategies to fix
vulnerabilities, the quality of which varies substantially. However,
ineffective remediation could induce side effects, such as compilation
failures, which impede acceptance by users. According to our studies, existing
SCA tools could not correctly handle the concerns of users regarding the
compatibility of remediated projects. To this end, we propose Compatible
Remediation of Third-party libraries (CORAL) for Maven projects to fix
vulnerabilities without breaking the projects. Our evaluation showed that CORAL
fixed 87.56% of vulnerabilities, outperforming other tools (best: 75.32%), and
achieved a 98.67% successful compilation rate and a 92.96%
successful unit test rate. Furthermore, we found that 78.45% of vulnerabilities
in popular Maven projects could be fixed without breaking the compilation, and
the rest of the vulnerabilities (21.55%) could either be fixed only by upgrades
that break the compilation or could not be fixed by upgrading at all.
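As a rough illustration of the remediation problem (not CORAL's actual algorithm), the hypothetical sketch below picks, for one vulnerable library, the smallest available upgrade that escapes the vulnerable version range while staying inside a range assumed to keep the project compiling; the library data is invented.

# Versions are modeled as (major, minor, patch) tuples; all data is invented.
def parse(v):
    return tuple(int(p) for p in v.split("."))

def pick_compatible_upgrade(current, available, fixed_since, compatible_below):
    # Smallest available version that contains the fix (>= fixed_since) while
    # staying under an assumed compatibility boundary (< compatible_below).
    candidates = [v for v in available
                  if parse(fixed_since) <= parse(v) < parse(compatible_below)
                  and parse(v) > parse(current)]
    return min(candidates, key=parse) if candidates else None

# Example: a library fixed in 2.12.6; the project is assumed to break on a 3.x
# upgrade, so remediation stays within the 2.x line.
print(pick_compatible_upgrade(
    current="2.10.0",
    available=["2.11.4", "2.12.6", "2.13.1", "3.0.0"],
    fixed_since="2.12.6",
    compatible_below="3.0.0",
))   # 2.12.6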
Reachability computation for polynomial dynamical systems
This paper is concerned with the problem of computing the bounded-time reachable set of a polynomial discrete-time dynamical system. The problem is well known to be difficult when nonlinear systems are considered. In this regard, we propose three reachability methods that differ in their set representation. The proposed algorithms adopt boxes, parallelotopes, and parallelotope bundles to construct flowpipes that contain the actual reachable sets; parallelotope bundles are a new data structure for the symbolic representation of polytopes. Our methods exploit the Bernstein expansion of polynomials to bound the images of sets. The scalability and precision of the presented methods are analyzed on a number of dynamical systems, in comparison with other existing approaches.
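The key primitive, bounding a polynomial's image over a box, can be made concrete in one dimension: the Bernstein coefficients of a polynomial on the unit interval enclose its range. The sketch below (a simplified univariate illustration, not the paper's multivariate implementation) converts a polynomial from the power basis to the Bernstein basis and returns the resulting enclosure.

from math import comb

def bernstein_enclosure(coeffs):
    # Enclose the range of p(x) = sum_i coeffs[i] * x**i over [0, 1] using the
    # Bernstein coefficients b_j = sum_{i<=j} (C(j,i)/C(n,i)) * coeffs[i],
    # whose minimum and maximum bound the polynomial on the unit interval.
    n = len(coeffs) - 1
    b = [sum(comb(j, i) / comb(n, i) * coeffs[i] for i in range(j + 1))
         for j in range(n + 1)]
    return min(b), max(b)

# Example: p(x) = x^2 - x has exact range [-0.25, 0] on [0, 1]; the Bernstein
# enclosure [-0.5, 0] safely over-approximates it.
print(bernstein_enclosure([0.0, -1.0, 1.0]))   # (-0.5, 0.0)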
IST Austria Thesis
This dissertation focuses on algorithmic aspects of program verification, and presents modeling and complexity advances on several problems related to the
static analysis of programs, the stateless model checking of concurrent programs, and the competitive analysis of real-time scheduling algorithms.
Our contributions can be broadly grouped into five categories.
Our first contribution is a set of new algorithms and data structures for the quantitative and data-flow analysis of programs, based on the graph-theoretic notion of treewidth.
It has been observed that the control-flow graphs of typical programs have special structure, and are characterized as graphs of small treewidth.
We utilize this structural property to provide faster algorithms for the quantitative and data-flow analysis of recursive and concurrent programs.
In most cases we give an algebraic treatment of the considered problem,
where several interesting analyses, such as reachability, shortest path, and certain kinds of data-flow analysis problems, follow as special cases.
We exploit the constant-treewidth property to obtain algorithmic improvements for on-demand versions of the problems,
and provide data structures with various tradeoffs between the resources spent in the preprocessing and querying phase.
We also improve on the algorithmic complexity of quantitative problems outside the algebraic path framework,
namely of the minimum mean-payoff, minimum ratio, and minimum initial credit for energy problems.
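A sketch of the algebraic view, independent of the treewidth machinery: instantiating one generic path computation with different semirings yields reachability, shortest paths, or simple data-flow facts. The Python code below runs a Floyd-Warshall-style closure over a user-supplied idempotent semiring; the treewidth-based algorithms of the thesis replace exactly this cubic closure with faster, decomposition-driven computations.

def algebraic_closure(n, edges, zero, one, plus, times):
    # Generic all-pairs path computation over an idempotent semiring.
    d = [[zero] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = one
    for u, v, w in edges:
        d[u][v] = plus(d[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = plus(d[i][j], times(d[i][k], d[k][j]))
    return d

edges = [(0, 1, 2.0), (1, 2, 5.0), (0, 2, 9.0)]

# Shortest paths: the (min, +) semiring.
dist = algebraic_closure(3, edges, float("inf"), 0.0, min, lambda a, b: a + b)
print(dist[0][2])   # 7.0

# Reachability: the boolean (or, and) semiring over the same graph.
reach = algebraic_closure(3, [(u, v, True) for u, v, _ in edges],
                          False, True, lambda a, b: a or b, lambda a, b: a and b)
print(reach[0][2])  # True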
Our second contribution is a set of algorithms for Dyck reachability with applications to data-dependence analysis and alias analysis.
In particular, we develop an optimal algorithm for Dyck reachability on bidirected graphs, which are ubiquitous in context-insensitive, field-sensitive points-to analysis.
Additionally, we develop an efficient algorithm for context-sensitive data-dependence analysis via Dyck reachability,
where the task is to obtain analysis summaries of library code in the presence of callbacks.
Our algorithm preprocesses libraries in almost linear time, after which the contribution of the library in the complexity of the client analysis is (i)~linear in the number of call sites and (ii)~only logarithmic in the size of the whole library, as opposed to linear in the size of the whole library.
Finally, we prove that Dyck reachability is Boolean Matrix Multiplication-hard in general, and the hardness also holds for graphs of constant treewidth.
This hardness result strongly indicates that there exist no combinatorial algorithms for Dyck reachability with truly subcubic complexity.
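The bidirected case admits a particularly simple merging view, sketched below: if two nodes reach the same node by edges carrying the same opening bracket, they are Dyck-inter-reachable (go up one edge, come back down the matching closing edge), so equivalence classes can be grown with a union-find structure. This is a simplified illustration of the idea behind such algorithms, not the optimal algorithm of the thesis.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb

def bidirected_dyck_classes(n, open_edges):
    # open_edges: (u, label, v) means u --(label--> v; bidirectedness supplies
    # the reversed edge v --)label--> u.  Two sources of same-labeled edges into
    # the same class are joined: u --(--> w --)--> u' is a balanced path.
    uf = UnionFind(n)
    changed = True
    while changed:                 # naive fixpoint; fast versions use worklists
        changed = False
        seen = {}                  # (class of target, label) -> class of a source
        for u, label, v in open_edges:
            key = (uf.find(v), label)
            ru = uf.find(u)
            if key in seen and uf.find(seen[key]) != ru:
                uf.union(seen[key], ru)
                changed = True
            seen[key] = ru
    return [uf.find(i) for i in range(n)]

# Example: both node 0 and node 1 have a "(1"-edge into node 2, so they are
# Dyck-inter-reachable and end up in the same class.
print(bidirected_dyck_classes(3, [(0, 1, 2), (1, 1, 2)]))   # e.g. [1, 1, 2]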
Our third contribution is the formalization and algorithmic treatment of the Quantitative Interprocedural Analysis framework.
In this framework, the transitions of a recursive program are annotated as good, bad or neutral, and receive a weight which measures
the magnitude of their respective effect.
The Quantitative Interprocedural Analysis problem asks to determine whether there exists an infinite run of the program where the long-run ratio of the bad weights over the good weights is above a given threshold.
We illustrate how several quantitative problems related to static analysis of recursive programs can be instantiated in this framework,
and present some case studies in this direction.
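For the finite-graph (non-recursive) core of such ratio questions, a standard re-weighting makes the idea concrete: a cycle has bad-to-good ratio above a threshold r exactly when the same cycle has negative total weight under w = r * good - bad (assuming positive good weights), which Bellman-Ford-style relaxation detects. The sketch below shows only this simplest instance, not the interprocedural algorithm of the thesis.

def has_cycle_with_ratio_above(edges, n, r):
    # edges: (u, v, bad, good) with good > 0.  A cycle with sum(bad)/sum(good) > r
    # exists iff the graph re-weighted by r * good - bad has a negative cycle.
    w = [(u, v, r * good - bad) for u, v, bad, good in edges]
    dist = [0.0] * n                 # implicit super-source reaching every node
    for _ in range(n - 1):
        for u, v, c in w:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
    # Any further improvement after n-1 rounds witnesses a negative cycle.
    return any(dist[u] + c < dist[v] for u, v, c in w)

# Example: a two-edge cycle with bad = 3 and good = 1 on each edge has ratio 3,
# which exceeds the threshold r = 2.
print(has_cycle_with_ratio_above([(0, 1, 3, 1), (1, 0, 3, 1)], 2, r=2.0))   # True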
Our fourth contribution is a new dynamic partial-order reduction for the stateless model checking of concurrent programs. Traditional approaches rely on the standard Mazurkiewicz equivalence between traces, by means of partitioning the trace space into equivalence classes, and attempting to explore a few representatives from each class.
We present a new dynamic partial-order reduction method called the Data-centric Partial Order Reduction (DC-DPOR).
Our algorithm is based on a new equivalence between traces, called observation equivalence.
DC-DPOR explores a coarser partitioning of the trace space than any exploration method based on the standard Mazurkiewicz equivalence.
Depending on the program, the new partitioning can be even exponentially coarser.
Additionally, DC-DPOR spends only polynomial time in each explored class.
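The notion can be made concrete with a small sketch: a trace's observation function maps every read to the write it observes, and two interleavings of the same events are observation-equivalent when these functions coincide. The Python code below computes the observation function of a trace of read/write events; it is an illustration of the equivalence, not the DC-DPOR exploration algorithm.

def observation(trace):
    # trace: list of events (thread, op, var) with op in {"r", "w"}.  Events are
    # identified by (thread, position within thread) so that observation
    # functions of different interleavings of the same events are comparable.
    per_thread, last_write, obs = {}, {}, {}
    for thread, op, var in trace:
        eid = (thread, per_thread.get(thread, 0))
        per_thread[thread] = eid[1] + 1
        if op == "w":
            last_write[var] = eid
        else:
            obs[eid] = last_write.get(var)   # the write this read observes
    return obs

# Two interleavings of the same events: the read of x observes the same write in
# both, so they are observation-equivalent and only one needs to be explored.
t1 = [("p", "w", "x"), ("p", "r", "y"), ("q", "r", "x")]
t2 = [("p", "w", "x"), ("q", "r", "x"), ("p", "r", "y")]
print(observation(t1) == observation(t2))   # True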
Our fifth contribution is the use of automata and game-theoretic verification techniques in the competitive analysis and synthesis of real-time scheduling algorithms for firm-deadline tasks.
On the analysis side, we leverage automata on infinite words to compute the competitive ratio of real-time schedulers subject to various environmental constraints.
On the synthesis side, we introduce a new instance of two-player mean-payoff partial-information games, and show
how the synthesis of an optimal real-time scheduler can be reduced to computing winning strategies in this new type of game.
Automatic Derivation of Requirements for Components Used in Human-Intensive Systems
Human-intensive systems (HISs), where humans must coordinate with each other along with software and/or hardware components to achieve system missions, are increasingly prevalent in safety-critical domains (e.g., healthcare). Such systems are often complex, involving aspects such as concurrency and exceptional situations. For these systems, it is often difficult but important to determine requirements for the individual components that are necessary to ensure the system requirements are satisfied. In this thesis, we investigated an approach that employs interface synthesis methods developed for software systems to automatically derive such requirements for components used in HISs.
In previous work, we investigated a requirement deriver that employs a regular language learning algorithm to iteratively refine the derived requirement based on counterexamples generated by model checking techniques. Since this learning-based requirement deriver often did not scale well, we investigated several learning and model checking optimizations. These optimizations significantly improved performance but affected the counterexample generation heuristics, often widely varying the permissiveness of the derived requirements. For comparison purposes, we investigated a direct requirement deriver that was purported to have poor performance but guarantees the derived requirements are adequately permissive, conceptually meaning the requirements are as permissive as possible without violating the system requirements. For our evaluation, we applied these requirement derivers to case studies in two important domains: healthcare and election administration.
Based on this evaluation, the direct requirement deriver with all optimizations applied had reasonable performance and ensured the derived requirements are adequately permissive. For the learning-based requirement deriver, many of the optimizations and heuristics have been presented previously, but we recommend how to selectively combine them to obtain reasonable performance while usually producing adequately permissive derived requirements.
Since such derived requirements often reflect the system's complexity, they can easily be misunderstood. Thus, we also investigated building views of the requirements that abstract away or highlight certain aspects to try to improve their understandability. Each individual view appears to improve understandability, and the multiple views seem to complement each other, further improving understandability. Such derived requirements and their views can be used to safely develop and deploy the components used in HISs.
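In a much-simplified finite-state setting, the idea of an adequately permissive derived requirement can be sketched as follows: compose the component's possible steps with the rest of the system, mark the states that violate the system requirement, and forbid exactly those component steps from which the environment can force a violation, keeping everything else. The toy Python sketch below is a hypothetical illustration and is not either of the requirement derivers studied in the thesis.

def derive_requirement(errors, controlled, uncontrolled):
    # controlled / uncontrolled: (source, action, target) transitions of the
    # component and of the rest of the system, respectively.  A state is doomed
    # if it violates the requirement or an uncontrolled step can drive it to a
    # doomed state; the derived requirement forbids exactly the controlled
    # steps into doomed states, so it is as permissive as possible.
    doomed = set(errors)
    changed = True
    while changed:
        changed = False
        for s, _, t in uncontrolled:
            if t in doomed and s not in doomed:
                doomed.add(s)
                changed = True
    return [(s, a, t) for s, a, t in controlled if t not in doomed]

# Hypothetical example: after the component's "start" step, an environment
# "fault" step reaches an error state, so the derived requirement forbids "start".
controlled = [("idle", "start", "running"), ("idle", "skip", "done")]
uncontrolled = [("running", "fault", "error")]
print(derive_requirement({"error"}, controlled, uncontrolled))
# [('idle', 'skip', 'done')]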
Automating Program Verification and Repair Using Invariant Analysis and Test Input Generation
Software bugs are a persistent feature of daily life---crashing web browsers, allowing cyberattacks, and distorting the results of scientific computations. One approach to improving software uses program invariants---mathematical descriptions of program behaviors---to verify code and detect bugs. Current invariant generation techniques lack support for complex yet important forms of invariants, such as general polynomial relations and properties of arrays. As a result, we lack the ability to conduct precise analysis of programs that use these constructs. This dissertation presents DIG, a static and dynamic analysis framework for discovering several useful classes of program invariants, including (i) nonlinear polynomial relations, which are fundamental to many scientific applications; (ii) disjunctive invariants, which express branching behaviors in programs; and (iii) properties of multidimensional arrays, which appear in many practical applications. We describe theoretical and empirical results showing that DIG can efficiently and accurately find many important invariants in real-world uses, e.g., polynomial properties in numerical algorithms and array relations in a full AES encryption implementation. Automatic program verification and synthesis are long-standing problems in computer science. However, there has been much more work on program verification than on program synthesis. Consequently, important synthesis tasks, e.g., generating program repairs, remain difficult and time-consuming. This dissertation proves that certain formulations of verification and synthesis are equivalent, allowing for direct applications of techniques and tools between these two research areas. Based on these ideas, we develop CETI, a tool that leverages existing verification techniques and tools for automatic program repair. Experimental results show that CETI can have higher success rates than many other standard program repair methods.
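The equation-solving core of trace-based invariant discovery can be sketched directly: evaluate a set of candidate terms on the observed program states, and every vector in the null space of the resulting matrix is a candidate equality invariant to be verified afterwards. The sketch below uses numpy and invented traces; it illustrates this general idea rather than DIG's actual implementation.

import numpy as np

def candidate_equalities(states, term_names, term_fns, tol=1e-8):
    # Build the matrix of term values over the observed states; coefficient
    # vectors c in its null space satisfy sum_i c[i] * term_i == 0 on every
    # observed state and are reported as candidate invariants.
    A = np.array([[f(s) for f in term_fns] for s in states], dtype=float)
    _, sing, vt = np.linalg.svd(A)
    null = [vt[i] for i in range(len(term_names))
            if i >= len(sing) or sing[i] < tol]
    return [dict(zip(term_names, np.round(v / np.max(np.abs(v)), 3))) for v in null]

# Invented traces of a multiplication routine with state (a, b, p) where p = a*b.
states = [(2, 3, 6), (4, 5, 20), (7, 1, 7), (3, 3, 9), (6, 2, 12)]
terms = ["1", "a", "b", "p", "a*b"]
fns = [lambda s: 1, lambda s: s[0], lambda s: s[1], lambda s: s[2],
       lambda s: s[0] * s[1]]
print(candidate_equalities(states, terms, fns))
# a single candidate encoding p - a*b == 0 (up to sign and scaling)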