11 research outputs found
Automatic Program Instrumentation for Automatic Verification (Extended Technical Report)
In deductive verification and software model checking, dealing with certain
specification language constructs can be problematic when the back-end solver
is not sufficiently powerful or lacks the required theories. One way to deal
with this is to transform, for verification purposes, the program to an
equivalent one not using the problematic constructs, and to reason about its
correctness instead. In this paper, we propose instrumentation as a unifying
verification paradigm that subsumes various existing ad-hoc approaches, has a
clear formal correctness criterion, can be applied automatically, and can
transfer back witnesses and counterexamples. We illustrate our approach on the
automated verification of programs that involve quantification and aggregation
operations over arrays, such as the maximum value or sum of the elements in a
given segment of the array, which are known to be difficult to reason about
automatically. We formalise array aggregation operations as monoid
homomorphisms. We implement our approach in the MonoCera tool, which is
tailored to the verification of programs with aggregation, and evaluate it on
example programs, including SV-COMP programs.Comment: 36 page
Controlled and effective interpolation
Model checking is a well established technique to verify systems, exhaustively and automatically. The state space explosion, known as the main difficulty in model checking scalability, has been successfully approached by symbolic model checking which represents programs using logic, usually at the propositional or first order theories level. Craig interpolation is one of the most successful abstraction techniques used in symbolic methods. Interpolants can be efficiently generated from proofs of unsatisfiability, and have been used as means of over-approximation to generate inductive invariants, refinement predicates, and function summaries. However, interpolation is still not fully understood. For several theories it is only possible to generate one interpolant, giving the interpolation-based application no chance of further optimization via interpolation. For the theories that have interpolation systems that are able to generate different interpolants, it is not understood what makes one interpolant better than another, and how to generate the most suitable ones for a particular verification task. The goal of this thesis is to address the problems of how to generate multiple interpolants for theories that still lack this flexibility in their interpolation algorithms, and how to aim at good interpolants. This thesis extends the state-of-the-art by introducing novel interpolation frameworks for different theories. For propositional logic, this work provides a thorough theoretical analysis showing which properties are desirable in a labeling function for the Labeled Interpolation Systems framework (LIS). The Proof-Sensitive labeling function is presented, and we prove that it generates interpolants with the smallest number of Boolean connectives in the entire LIS framework. Two variants that aim at controlling the logical strength of propositional interpolants while maintaining a small size are given. The new interpolation algorithms are compared to previous ones from the literature in different model checking settings, showing that they consistently lead to a better overall verification performance. The Equalities and Uninterpreted Functions (EUF)-interpolation system, presented in this thesis, is a duality-based interpolation framework capable of generating multiple interpolants for a single proof of unsatisfiability, and provides control over the logical strength of the interpolants it generates using labeling functions. The labeling functions can be theoretically compared with respect to their strength, and we prove that two of them generate the interpolants with the smallest number of equalities. Our experiments follow the theory, showing that the generated interpolants indeed have different logical strength. We combine propositional and EUF interpolation in a model checking setting, and show that the strength of the interpolation algorithms for different theories has to be aligned in order to generate smaller interpolants. This work also introduces the Linear Real Arithmetic (LRA)-interpolation system, an interpolation framework for LRA. The framework is able to generate infinitely many interpolants of different logical strength using the duality of interpolants. The strength of the LRA interpolants can be controlled by a normalized strength factor, which makes it straightforward for an interpolationbased application to choose the level of strength it wants for the interpolants. Our experiments with the LRA-interpolation system and a model checker show that it is very important for the application to be able to fine tune the strength of the LRA interpolants in order to achieve optimal performance. The interpolation frameworks were implemented and form the interpolation module in OpenSMT2, an open source efficient SMT solver. OpenSMT2 has been integrated to the propositional interpolation-based model checkers FunFrog and eVolCheck, and to the first order interpolation-based model checkerHiFrog. This thesis presents real life model checking experiments using the novel interpolation frameworks and the tools aforementioned, showing the viability and strengths of the techniques
Regular Expressions for Data Words
Abstract. In data words, each position carries not only a letter form a finite alphabet, as the usual words do, but also a data value coming from an infinite domain. There has been a renewed interest in them due to applications in querying and reasoning about data models with complex structural properties, notably XML, and more recently, graph databases. Logical formalisms designed for querying such data often require concise and easily understandable presentations of regular languages over data words. Our goal, therefore, is to define and study regular expressions for data words. As the automaton model, we take register automata, which are a natural analog of NFAs for data words. We first equip standard regular expressions with limited memory, and show that they capture the class of data words defined by register automata. The complexity of the main decision problems for these expressions (nonemptiness, membership) also turns out to be the same as for register automata. We then look at a subclass of these regular expressions that can define many properties of interest in applications of data words, and show that the main decision problems can be solved efficiently for it.
A completely unique account of enumeration
How can we enumerate the inhabitants of an algebraic datatype? This paper explores a datatype generic solution that works for all regular types and indexed families. The enumerators presented here are provably both complete and unique—they will eventually produce every value exactly once—and fair—they avoid bias when composing enumerators. Finally, these enumerators memoise previously enumerated values whenever possible, thereby avoiding repeatedly recomputing recursive results
Genericity Through Stratification
A fundamental issue in the -calculus is to find appropriate notions
for meaningfulness. It is well-known that in the call-by-name
-calculus (CbN) the meaningful terms can be identified with the
solvable ones, and that this notion is not appropriate in the call-by-value
-calculus (CbV). This paper validates the challenging claim that yet
another notion, previously introduced in the literature as potential
valuability, appropriately represents meaningfulness in CbV. Akin to CbN, this
claim is corroborated by proving two essential properties. The first one is
genericity, stating that meaningless subterms have no bearing on evaluating
normalizing terms. To prove this (which was an open problem), we use a novel
approach based on stratified reduction, indifferently applicable to CbN and
CbV, and in a quantitative way. The second property concerns consistency of the
smallest congruence relation resulting from equating all meaningless terms.
While the consistency result is not new, we provide the first direct
operational proof of it. We also show that such a congruence has a unique
consistent and maximal extension, which coincides with a well-known notion of
observational equivalence. Our results thus supply the formal concepts and
tools that validate the informal notion of meaningfulness underlying CbN and
CbV
Quantitative Variants of Language Equations and their Applications to Description Logics
Unification in description logics (DLs) has been introduced as a novel inference service that can be used to detect redundancies in ontologies, by finding different concepts that may potentially stand for the same intuitive notion. Together with the special case of matching, they were first investigated in detail for the DL FL0, where these problems can be reduced to solving certain language equations.
In this thesis, we extend this service in two directions. In order to increase the recall of this method for finding redundancies, we introduce and investigate the notion of approximate unification, which basically finds pairs of concepts that “almost” unify, in order to account for potential small modelling errors. The meaning of “almost” is formalized using distance measures between concepts. We show that approximate unification in FL0 can be reduced to approximately solving language equations, and devise algorithms for solving the latter problem for particular distance measures. Furthermore, we make a first step towards integrating background knowledge, formulated in so-called TBoxes, by investigating the special case of matching in the presence of TBoxes of different forms. We acquire a tight complexity bound for the general case, while we prove that the problem becomes easier in a restricted setting. To achieve these bounds, we take advantage of an equivalence characterization of FL0 concepts that is based on formal languages. In addition, we incorporate TBoxes in computing concept distances. Even though our results on the approximate setting cannot deal with TBoxes yet, we prepare the framework that future research can build on. Before we journey to the technical details of the above investigations, we showcase our program in the simpler setting of the equational theory ACUI, where we are able to also combine the two extensions. In the course of studying the above problems, we make heavy use of automata theory, where we also derive novel results that could be of independent interest