178 research outputs found

    MaLeS: A Framework for Automatic Tuning of Automated Theorem Provers

    Full text link
    MaLeS is an automatic tuning framework for automated theorem provers. It provides solutions for both the strategy finding as well as the strategy scheduling problem. This paper describes the tool and the methods used in it, and evaluates its performance on three automated theorem provers: E, LEO-II and Satallax. An evaluation on a subset of the TPTP library problems shows that on average a MaLeS-tuned prover solves 8.67% more problems than the prover with its default settings

    Learning-Assisted Automated Reasoning with Flyspeck

    Full text link
    The considerable mathematical knowledge encoded by the Flyspeck project is combined with external automated theorem provers (ATPs) and machine-learning premise selection methods trained on the proofs, producing an AI system capable of answering a wide range of mathematical queries automatically. The performance of this architecture is evaluated in a bootstrapping scenario emulating the development of Flyspeck from axioms to the last theorem, each time using only the previous theorems and proofs. It is shown that 39% of the 14185 theorems could be proved in a push-button mode (without any high-level advice and user interaction) in 30 seconds of real time on a fourteen-CPU workstation. The necessary work involves: (i) an implementation of sound translations of the HOL Light logic to ATP formalisms: untyped first-order, polymorphic typed first-order, and typed higher-order, (ii) export of the dependency information from HOL Light and ATP proofs for the machine learners, and (iii) choice of suitable representations and methods for learning from previous proofs, and their integration as advisors with HOL Light. This work is described and discussed here, and an initial analysis of the body of proofs that were found fully automatically is provided

    ENIGMA: Efficient Learning-based Inference Guiding Machine

    Full text link
    ENIGMA is a learning-based method for guiding given clause selection in saturation-based theorem provers. Clauses from many proof searches are classified as positive and negative based on their participation in the proofs. An efficient classification model is trained on this data, using fast feature-based characterization of the clauses . The learned model is then tightly linked with the core prover and used as a basis of a new parameterized evaluation heuristic that provides fast ranking of all generated clauses. The approach is evaluated on the E prover and the CASC 2016 AIM benchmark, showing a large increase of E's performance.Comment: Submitted to LPAR 201

    GRUNGE: A Grand Unified ATP Challenge

    Full text link
    This paper describes a large set of related theorem proving problems obtained by translating theorems from the HOL4 standard library into multiple logical formalisms. The formalisms are in higher-order logic (with and without type variables) and first-order logic (possibly with multiple types, and possibly with type variables). The resultant problem sets allow us to run automated theorem provers that support different logical formats on corresponding problems, and compare their performances. This also results in a new "grand unified" large theory benchmark that emulates the ITP/ATP hammer setting, where systems and metasystems can use multiple ATP formalisms in complementary ways, and jointly learn from the accumulated knowledge.Comment: CADE 27 -- 27th International Conference on Automated Deductio

    Premise Selection for Mathematics by Corpus Analysis and Kernel Methods

    Get PDF
    Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed,extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories.Comment: 26 page

    An efficient contradiction separation based automated deduction algorithm for enhancing reasoning capability

    Get PDF
    Automated theorem prover (ATP) for first-order logic (FOL), as a significant inference engine, is one of the hot research areas in the field of knowledge representation and automated reasoning. E prover, as one of the leading ATPs, has made a significant contribution to the development of theorem provers for FOL, particularly equality handling, after more than two decades of development. However, there are still a large number of problems in the TPTP problem library, the benchmark problem library for ATPs, that E has yet to solve. The standard contradiction separation (S-CS) rule is an inference method introduced recently that can handle multiple clauses in a synergized way and has a few distinctive features which complements to the calculus of E. Binary clauses, on the other hand, are widely utilized in the automated deduction process for FOL because they have a minimal number of literals (typically only two literals), few symbols, and high manipulability. As a result, it is feasible to improve a prover's deduction capability by reusing binary clause. In this paper, a binary clause reusing algorithm based on the S-CS rule is firstly proposed, which is then incorporated into E with the objective to enhance E’s performance, resulting in an extended E prover. According to experimental findings, the performance of the extended E prover not only outperforms E itself in a variety of aspects, but also solves 18 problems with rating of 1 in the TPTP library, meaning that none of the existing ATPs are able to resolve them
    • …
    corecore