676 research outputs found

    Correctness Witness Validation by Abstract Interpretation

    Witnesses record automated program analysis results and make them exchangeable. To validate correctness witnesses through abstract interpretation, we introduce a novel abstract operation unassume. This operator incorporates witness invariants into the abstract program state. Given suitable invariants, the unassume operation can accelerate fixpoint convergence and yield more precise results. We demonstrate the feasibility of this approach by augmenting an abstract interpreter with unassume operators and evaluating the impact of incorporating witnesses on performance and precision. Using manually crafted witnesses, we can confirm verification results for multi-threaded programs with a reduction in effort ranging from 7% to 47% in CPU time. More intriguingly, we discover that using witnesses from model checkers can guide our analyzer to verify program properties that it could not verify on its own. Comment: 29 pages, 4 figures, 2 tables, extended version of the paper which is to appear at VMCAI 202

    Decidability and Synthesis of Abstract Inductive Invariants

    Decidability and synthesis of inductive invariants ranging in a given domain play an important role in many software and hardware verification systems. We consider here inductive invariants belonging to an abstract domain A as defined in abstract interpretation, namely, ensuring the existence of the best approximation in A of any system property. In this setting, we study the decidability of the existence of abstract inductive invariants in A of transition systems and their corresponding algorithmic synthesis. Our model relies on some general results which relate the existence of abstract inductive invariants with least fixed points of best correct approximations in A of the transfer functions of transition systems and their completeness properties. This approach allows us to derive decidability and synthesis results for abstract inductive invariants which are applied to the well-known Kildall's constant propagation and Karr's affine equalities abstract domains. Moreover, we show that a recent general algorithm for synthesizing inductive invariants in domains of logical formulae can be systematically derived from our results and generalized to a range of algorithms for computing abstract inductive invariants.

    Approximations in Learning & Program Analysis

    In this work we compare and contrast the approximations made in the problems of Data Compression, Program Analysis and Supervised Machine Learning. Gödel's Incompleteness Theorem mandates that any formal system rich enough to include integers will have unprovable truths. Thus non-computable problems abound, including, but not limited to, Program Analysis, Data Compression and Machine Learning. Indeed, it can be shown that there are more non-computable functions than computable ones. Due to non-computability, precise solutions for these problems are not feasible, and only approximate solutions may be computed. Presently, each of these problems of Data Compression, Machine Learning and Program Analysis is studied independently. Each problem has its own multitude of abstractions, algorithms and notions of tradeoffs among the various parameters. It would be interesting to have a unified framework, across disciplines, that makes explicit the abstraction specifications and ensuing tradeoffs. Such a framework would promote inter-disciplinary research and develop a unified body of knowledge to tackle non-computable problems. As a small step to that larger goal, we propose an Information Oriented Model of Computation that allows comparing the approximations used in Data Compression, Program Analysis and Machine Learning. To the best of our knowledge, this is the first work to propose a method for systematic comparison of approximations across disciplines. The model describes computation as set reconstruction. Non-computability is then presented as inability to perfectly reconstruct sets. In an effort to compare and contrast the approximations, select algorithms for Data Compression, Machine Learning and Program Analysis are analyzed using our model. We were able to relate the problems of Data Compression, Machine Learning and Program Analysis as specific instances of the general problem of approximate set reconstruction.
We demonstrate the use of abstract interpreters in compression schemes. We then compare and contrast the approximations in Program Analysis and Supervised Machine Learning. We demonstrate the use of ordered structures, fixpoint equations and least fixpoint approximation computations, all characteristic of Abstract Interpretation (Program Analysis), in Machine Learning algorithms. We also present the idea that widening, like regression, is an inductive learner. Regression generalizes known states to a hypothesis. Widening generalizes abstract states on an iteration chain to a fixpoint. While Regression usually aims to minimize the total error (sum of false positives and false negatives), Widening aims for soundness and hence errs on the side of false positives to have zero false negatives. We use this duality to derive a generic widening operator from regression on the set of abstract states. The results of the dissertation are the first steps towards a unified approach to approximate computation. Consequently, our preliminary results lead to many more interesting questions, some of which we have tried to discuss in the concluding chapter.
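To make the notion of widening concrete, here is a minimal sketch of the textbook interval widening operator applied to a toy counting loop. It is not the regression-derived operator described in the abstract; the names `join`, `widen`, and `analyze_loop` are hypothetical and chosen only for illustration.

```python
import math
from typing import Tuple

# An interval (lo, hi) over-approximates the set of values a variable may hold.
Interval = Tuple[float, float]


def join(a: Interval, b: Interval) -> Interval:
    """Least upper bound of two intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(old: Interval, new: Interval) -> Interval:
    """Standard interval widening: any bound that is still moving jumps to infinity."""
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)


def analyze_loop(init: Interval, step: int, max_iters: int = 100) -> Interval:
    """Abstractly iterate `x = init; loop: x = x + step`, ignoring the loop guard."""
    state = init
    for _ in range(max_iters):
        after_body = (state[0] + step, state[1] + step)
        nxt = widen(state, join(state, after_body))
        if nxt == state:  # post-fixpoint reached
            return state
        state = nxt
    raise RuntimeError("no fixpoint within max_iters")


if __name__ == "__main__":
    print(analyze_loop((0, 0), 1))
```

Starting from x = 0 with step 1, the upper bound grows after the first iteration, so widening sends it to infinity and the iteration stabilizes in two steps. The result includes values the loop may never reach (false positives) but misses none (no false negatives), which is the trade-off the abstract contrasts with regression.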

    Graph-Based Shape Analysis Beyond Context-Freeness

    We develop a shape analysis for reasoning about relational properties of data structures. Both the concrete and the abstract domain are represented by hypergraphs. The analysis is parameterized by user-supplied indexed graph grammars to guide concretization and abstraction. This novel extension of context-free graph grammars is powerful enough to model complex data structures such as balanced binary trees with parent pointers, while preserving most desirable properties of context-free graph grammars. One strength of our analysis is that no artifacts apart from grammars are required from the user; it thus offers a high degree of automation. We implemented our analysis and successfully applied it to various programs manipulating AVL trees, (doubly-linked) lists, and combinations of both.

    Mapping programs to equations

    Extracting the function of a program from a static analysis of its source code is a valuable capability in software engineering; at a time when there is increasing talk of using AI (Artificial Intelligence) to generate software from natural language specifications, it becomes increasingly important to determine the exact function of software as written, to figure out what AI has understood the natural language specification to mean. For all its criticality, the ability to derive the domain-to-range function of a program has proved to be an elusive goal, due primarily to the difficulty of deriving the function of iterative statements. Several automated tools obviate this difficulty by unrolling the loops; but this is clearly an imperfect solution, especially in light of the fact that loops capture most of the computing power of a program, are the locus of most of its complexity, and the source of most of its faults. This dissertation investigates a three-step process to map a program written in a C-like language into a function from inputs to outputs, or from initial states to final states. The semantics of iterative statements (while loops, repeat loops, for loops), including nested iterative statements, are captured by means of the concept of invariant relation; an invariant relation is a reflexive transitive relation that links program states separated by an arbitrary number of iterations. But the function derived for large and complex programs may be too unwieldy to be useful, not unlike drinking from a fire hose. In order to enable the user to query the program at scale, four functions are proposed: Assume(), which enables the user to make assumptions about program states or program parts; Capture(), which enables the user to capture the state of the program at some label or the function of some program part; Verify(), which enables the user to verify a unary assertion about the state of the program at some label, or a binary assertion about a program part; and Establish(), which is envisioned to use program repair techniques to modify the program so as to make a Verify() query return true.

    A Review of Formal Methods applied to Machine Learning

    We review state-of-the-art formal methods applied to the emerging field of the verification of machine learning systems. Formal methods can provide rigorous correctness guarantees on hardware and software systems. Thanks to the availability of mature tools, their use is well established in the industry, in particular to check safety-critical applications as they undergo a stringent certification process. As machine learning is becoming more popular, machine-learned components are now considered for inclusion in critical systems. This raises the question of their safety and their verification. Yet, established formal methods are limited to classic, i.e., non-machine-learned, software. Applying formal methods to verify systems that include machine learning has only been considered recently and poses novel challenges in soundness, precision, and scalability. We first recall established formal methods and their current use in an exemplar safety-critical field, avionic software, with a focus on abstract interpretation based techniques as they provide a high level of scalability. This provides a gold standard and sets high expectations for machine learning verification. We then provide a comprehensive and detailed review of the formal methods developed so far for machine learning, highlighting their strengths and limitations. The large majority of them verify trained neural networks and employ either SMT, optimization, or abstract interpretation techniques. We also discuss methods for support vector machines and decision tree ensembles, as well as methods targeting training and data preparation, which are critical but often neglected aspects of machine learning. Finally, we offer perspectives for future research directions towards the formal verification of machine learning systems.

    Transfer Function Synthesis without Quantifier Elimination

    Traditionally, transfer functions have been designed manually for each operation in a program, instruction by instruction. In such a setting, a transfer function describes the semantics of a single instruction, detailing how a given abstract input state is mapped to an abstract output state. The net effect of a sequence of instructions, a basic block, can then be calculated by composing the transfer functions of the constituent instructions. However, precision can be improved by applying a single transfer function that captures the semantics of the block as a whole. Since blocks are program-dependent, this approach necessitates automation. There has thus been growing interest in computing transfer functions automatically, most notably using techniques based on quantifier elimination. Although conceptually elegant, quantifier elimination inevitably induces a computational bottleneck, which limits the applicability of these methods to small blocks. This paper contributes a method for calculating transfer functions that finesses quantifier elimination altogether, and can thus be seen as a response to this problem. The practicality of the method is demonstrated by generating transfer functions for input and output states that are described by linear template constraints, which include intervals and octagons. Comment: 37 pages, extended version of ESOP 2011 paper
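The precision gap between composing per-instruction transfer functions and using one block-level transfer function can be illustrated with a small interval-domain sketch. Everything below is an assumption made for illustration: the block `y = x; z = y - x`, the helper names, and the hand-written block-level function, which stands in for what an automatic synthesis method would produce and is not the paper's algorithm.

```python
from typing import Callable, Dict, Tuple

Interval = Tuple[int, int]
State = Dict[str, Interval]          # abstract state: variable -> interval
Transfer = Callable[[State], State]  # abstract input state -> abstract output state


def assign_copy(dst: str, src: str) -> Transfer:
    """Transfer function for the single instruction `dst = src`."""
    def f(s: State) -> State:
        out = dict(s)
        out[dst] = s[src]
        return out
    return f


def assign_sub(dst: str, a: str, b: str) -> Transfer:
    """Transfer function for `dst = a - b` using interval arithmetic."""
    def f(s: State) -> State:
        (al, ah), (bl, bh) = s[a], s[b]
        out = dict(s)
        out[dst] = (al - bh, ah - bl)
        return out
    return f


def compose(*fs: Transfer) -> Transfer:
    """Apply the instruction-level transfer functions in program order."""
    def f(s: State) -> State:
        for g in fs:
            s = g(s)
        return s
    return f


# Block `y = x; z = y - x`, analyzed instruction by instruction.
block_by_instruction = compose(assign_copy("y", "x"), assign_sub("z", "y", "x"))


def block_as_whole(s: State) -> State:
    """Hand-derived transfer function for the same block: z is always 0."""
    out = dict(s)
    out["y"] = s["x"]
    out["z"] = (0, 0)
    return out


if __name__ == "__main__":
    start: State = {"x": (0, 10)}
    print(block_by_instruction(start)["z"])
    print(block_as_whole(start)["z"])
```

With x in [0, 10], the composed version forgets that y equals x and reports z in [-10, 10], while the block-level function reports z = 0 exactly. Both are sound; only the second is precise.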

    Static Analysis of Run-Time Errors in Embedded Real-Time Parallel C Programs

    We present a static analysis by Abstract Interpretation to check for run-time errors in parallel and multi-threaded C programs. Following our work on Astrée, we focus on embedded critical programs without recursion or dynamic memory allocation, but extend the analysis to a static set of threads communicating implicitly through a shared memory and explicitly using a finite set of mutual exclusion locks, and scheduled according to a real-time scheduling policy and fixed priorities. Our method is thread-modular. It is based on a slightly modified non-parallel analysis that, when analyzing a thread, applies and enriches an abstract set of thread interferences. An iterator then re-analyzes each thread in turn until interferences stabilize. We prove the soundness of our method with respect to the sequential consistency semantics, but also with respect to a reasonable weakly consistent memory semantics. We also show how to take into account mutual exclusion and thread priorities through a partitioning over an abstraction of the scheduler state. We present preliminary experimental results analyzing an industrial program with our prototype, Thésée, and demonstrate the scalability of our approach.
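The outer iteration described above, re-analyzing each thread until the interferences stabilize, can be sketched as a simple fixpoint loop. This is a toy under strong assumptions: interferences are modeled as finite sets of values that may be written to each shared variable, each thread analysis is a plain Python function, and locks, priorities, and weak memory are ignored. All names are hypothetical and do not reflect the interface of the prototype.

```python
from typing import Callable, Dict, FrozenSet, List

# Interference: for each shared variable, the values some thread may write to it.
Interference = Dict[str, FrozenSet[int]]
# A thread analysis reads the current interferences and returns those it produces.
ThreadAnalysis = Callable[[Interference], Interference]


def analyze_threads(threads: List[ThreadAnalysis], max_rounds: int = 100) -> Interference:
    """Re-analyze every thread in turn until no new interference appears."""
    interf: Interference = {}
    for _ in range(max_rounds):
        changed = False
        for analyze in threads:
            for var, vals in analyze(interf).items():
                old = interf.get(var, frozenset())
                merged = old | vals
                if merged != old:
                    interf[var] = merged
                    changed = True
        if not changed:  # interferences are stable
            return interf
    raise RuntimeError("interferences did not stabilize")


def writer(interf: Interference) -> Interference:
    """Toy thread: executes `x = 1`."""
    return {"x": frozenset({1})}


def reader(interf: Interference) -> Interference:
    """Toy thread: executes `y = x + 1`; x is initially 0 or any interfering value."""
    xs = frozenset({0}) | interf.get("x", frozenset())
    return {"y": frozenset(v + 1 for v in xs)}


if __name__ == "__main__":
    print(analyze_threads([reader, writer]))
```

Listing `reader` first shows why more than one pass is needed: in the first round it sees only the initial value of x, and the second round picks up the write from `writer`. A real analysis would use an abstract domain with widening in place of finite sets, so that termination does not depend on the set of written values being finite.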