37 research outputs found
GRUNGE: A Grand Unified ATP Challenge
This paper describes a large set of related theorem proving problems obtained
by translating theorems from the HOL4 standard library into multiple logical
formalisms. The formalisms are in higher-order logic (with and without type
variables) and first-order logic (possibly with multiple types, and possibly
with type variables). The resultant problem sets allow us to run automated
theorem provers that support different logical formats on corresponding
problems, and compare their performances. This also results in a new "grand
unified" large theory benchmark that emulates the ITP/ATP hammer setting, where
systems and metasystems can use multiple ATP formalisms in complementary ways,
and jointly learn from the accumulated knowledge.Comment: CADE 27 -- 27th International Conference on Automated Deductio
Disproving in First-Order Logic with Definitions, Arithmetic and Finite Domains
This thesis explores several methods which enable a first-order
reasoner to conclude satisfiability of a formula modulo an
arithmetic theory. The most general method requires restricting
certain quantifiers to range over finite sets; such assumptions
are common in the software verification setting. In addition, the
use of first-order reasoning allows for an implicit
representation of those finite sets, which can avoid
scalability problems that affect other quantified reasoning
methods. These new techniques form a useful complement to
existing methods that are primarily aimed at proving validity.
The Superposition calculus for hierarchic theory combinations
provides a basis for reasoning modulo theories in a first-order
setting. The recent account of ‘weak abstraction’ and related
improvements make an mplementation of the calculus practical.
Also, for several logical theories of interest Superposition is
an effective decision procedure for the quantifier free fragment.
The first contribution is an implementation of that calculus
(Beagle), including an optimized implementation of Cooper’s
algorithm for quantifier elimination in the theory of linear
integer arithmetic. This includes a novel means of extracting
values
for quantified variables in satisfiable integer problems. Beagle
won an efficiency award at CADE Automated theorem prover System
Competition (CASC)-J7, and won the arithmetic non-theorem
category at CASC-25. This implementation is the start point for
solving the ‘disproving with theories’ problem.
Some hypotheses can be disproved by showing that, together with
axioms the hypothesis is unsatisfiable. Often this is relative to
other axioms that enrich a base theory by defining new functions.
In that case, the disproof is contingent on the satisfiability of
the enrichment.
Satisfiability in this context is undecidable. Instead, general
characterizations of definition formulas, which do not alter the
satisfiability status of the main axioms, are given. These
general criteria apply to recursive definitions, definitions over
lists, and to arrays. This allows proving some non-theorems which
are otherwise intractable, and justifies similar disproofs of
non-linear arithmetic formulas.
When the hypothesis is contingently true, disproof requires
proving existence of
a model. If the Superposition calculus saturates a clause set,
then a model exists,
but only when the clause set satisfies a completeness criterion.
This requires each
instance of an uninterpreted, theory-sorted term to have a
definition in terms of
theory symbols.
The second contribution is a procedure that creates such
definitions, given that a subset of quantifiers range over finite
sets. Definitions are produced in a counter-example driven way
via a sequence of over and under approximations to the clause
set. Two descriptions of the method are given: the first uses the
component solver modularly, but has an inefficient
counter-example heuristic. The second is more general, correcting
many of the inefficiencies of the first, yet it requires tracking
clauses through a proof. This latter method is shown to apply
also to lists and to problems with unbounded quantifiers.
Together, these tools give new ways for applying successful
first-order reasoning methods to problems involving interpreted
theories
Automated Theorem Proving with Extensions of First-Order Logic
Automated theorem provers are computer programs that check whether a logical conjecture follows from a set of logical statements. The conjecture and the statements are expressed in the language of some formal logic, such as first-order logic. Theorem provers for first-order logic have been used for automation in proof assistants, verification of programs, static analysis of networks, and other purposes. However, the efficient usage of these provers remains challenging. One of the challenges is the complexity of translating domain problems to first-order logic. Not only can such translation be cumbersome due to semantic differences between the domain and the logic, but it might inadvertently result in problems that provers cannot easily handle.The work presented in the thesis addresses this challenge by developing an extension of first-order logic named FOOL. FOOL contains syntactical features of programming languages and more expressive logics, is friendly for translation of problems from various domains, and can be efficiently supported by existing theorem provers. We describe the syntax and semantics of FOOL and present a simple translation from FOOL to plain first-order logic. We describe an efficient clausal normal form transformation algorithm for FOOL and based on it implement a support for FOOL in the Vampire theorem prover. We illustrate the efficient use of FOOL for program verification by describing a concise encoding of next state relations of imperative programs in FOOL. We show a usage of features of FOOL in problems of static analysis of networks. We demonstrate the efficiency of automated theorem proving in FOOL with an extensive set of experiments. In these experiments we compare the performance of Vampire on a large collection of problems from various sources translated to FOOL and ordinary first-order logic. Finally, we fix the syntax for FOOL in TPTP, the standard language of first-order theorem provers
Analyzing the selection of the herbrand base process for building a Smart Semantic Tree Theorem Prover
Traditionally, semantic trees have played an important role in proof theory for validating the unsatisfiability of sets of clauses. More recently, they have also been used to implement more practical tools for verifying the unsatisfiability of clause sets in first-order predicate logic. The method ultimately relies on the Herbrand Base, a set used in building the semantic tree. The Herbrand Base is used together with the Herbrand Universe, which stems from the initial clause set in a particular theorem. When searching for a closed semantic tree, the selection of suitable atoms from the Herbrand Base is very important and should be carried out carefully by educated guesses in order to avoid building a tree using atoms which are irrelevant for the proof. In an effort to circumvent the creation of irrelevant ground instances, a novel approach is investigated in this dissertation. As opposed to creating the ground instances of the clauses in S in a strict syntactic order, the values will be established through calculations which are based on relevance for the problem at hand. This idea has been applied and accordingly tested with the use of the Smart Semantic Tree Theorem Prover (SSTTP), which provides an algorithm for choos- ing prominent atoms from the Herbrand Base for utilisation in the generation of closed semantic trees. Part of this study is an empirical investigation of this prover performance on first-order problems without equality, as well as whether or not it is able to compete with modern theorem provers in certain niches. The results of the SSTTP are promising in terms of finding proofs in less time than some of the state-of-the-art provers. However, it can not compete with them in terms of the total number of the solved problems