
    HOL(y)Hammer: Online ATP Service for HOL Light

    HOL(y)Hammer is an online AI/ATP service for formal (computer-understandable) mathematics encoded in the HOL Light system. The service allows its users to upload and automatically process an arbitrary formal development (project) based on HOL Light, and to attack arbitrary conjectures that use the concepts defined in some of the uploaded projects. For that, the service uses several automated reasoning systems combined with several premise selection methods trained on all the project proofs. The projects that are readily available on the server for such query answering include the recent versions of the Flyspeck, Multivariate Analysis and Complex Analysis libraries. The service runs on a 48-CPU server, currently employing, for each task, 7 AI/ATP combinations and 4 decision procedures in parallel that contribute to its overall performance. The system is also available for local installation by interested users, who can customize it for their own proof development. An Emacs interface allowing parallel asynchronous queries to the service is also provided. The overall structure of the service is outlined, problems that arise and their solutions are discussed, and an initial account of using the system is given.
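
    The premise selection step mentioned above can be pictured with a small sketch. The following Python is a toy k-nearest-neighbour selector of the general kind hammer systems train on recorded proofs; the function and data layout are hypothetical, not HOL(y)Hammer's actual code.

        # Toy k-NN premise selection over symbol features. All names are
        # illustrative; HOL(y)Hammer's real implementation differs.
        from collections import Counter

        def knn_premises(conjecture_feats, library, k=40, n_premises=128):
            """library: list of (fact_features: set, fact_premises: list).
            Rank facts by feature overlap with the conjecture, then let
            the k nearest proofs vote for the premises they used."""
            scored = sorted(library,
                            key=lambda fact: len(conjecture_feats & fact[0]),
                            reverse=True)
            votes = Counter()
            for feats, premises in scored[:k]:
                weight = len(conjecture_feats & feats)
                for p in premises:
                    votes[p] += weight
            return [p for p, _ in votes.most_common(n_premises)]

    The selected premises are then handed, together with the conjecture, to the parallel AI/ATP combinations.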

    Initial Experiments with TPTP-style Automated Theorem Provers on ACL2 Problems

    This paper reports our initial experiments with using external ATPs on corpora built with the ACL2 system. This is intended to provide a first estimate of the usefulness of such external reasoning and AI systems for solving ACL2 problems.

    Comment: In Proceedings ACL2 2014, arXiv:1406.123
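
    For a concrete picture of the interface to the external provers, the snippet below emits a toy problem in TPTP FOF syntax, the input format of such ATPs; the helper is purely illustrative and not part of any ACL2 tool.

        # Print a toy first-order problem in TPTP FOF syntax (illustrative).
        def tptp_problem(axioms, conjecture):
            lines = [f"fof(ax{i}, axiom, {ax})." for i, ax in enumerate(axioms)]
            lines.append(f"fof(goal, conjecture, {conjecture}).")
            return "\n".join(lines)

        print(tptp_problem(
            ["! [X] : (natp(X) => natp(s(X)))", "natp(zero)"],
            "natp(s(s(zero)))"))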

    Learning-assisted Theorem Proving with Millions of Lemmas

    Large formal mathematical libraries consist of millions of atomic inference steps that give rise to a corresponding number of proved statements (lemmas). Analogously to informal mathematical practice, only a tiny fraction of such statements is named and re-used in later proofs by formal mathematicians. In this work, we suggest and implement criteria defining the estimated usefulness of the HOL Light lemmas for proving further theorems. We use these criteria to mine the large inference graph of the lemmas in the HOL Light and Flyspeck libraries, adding up to millions of the best lemmas to the pool of statements that can be re-used in later proofs. We show that in combination with learning-based relevance filtering, such methods significantly strengthen automated theorem proving of new conjectures over large formal mathematical libraries such as Flyspeck.

    Comment: journal version of arXiv:1310.2797 (which was submitted to the LPAR conference)
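
    A minimal sketch of the kind of usefulness criterion described here, assuming the inference graph is given as usage and proof-size maps; the exact weighting in the paper differs.

        # Toy lemma-usefulness score over an inference DAG: a lemma ranks
        # higher the more later proofs use it and the more inference steps
        # it would cost to re-derive. Illustrative only.
        def usefulness(lemma, used_by, proof_size):
            uses = len(used_by.get(lemma, ()))
            return uses * proof_size.get(lemma, 1)

        def best_lemmas(lemmas, used_by, proof_size, n=1000):
            return sorted(lemmas,
                          key=lambda l: usefulness(l, used_by, proof_size),
                          reverse=True)[:n]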

    Hammering towards QED

    This paper surveys the emerging methods to automate reasoning over large libraries developed with formal proof assistants. We call these methods hammers. They give the authors of formal proofs a strong “one-stroke” tool for discharging difficult lemmas without the need for careful and detailed manual programming of proof search. The main ingredients underlying this approach are efficient automatic theorem provers that can cope with hundreds of axioms, suitable translations of the proof assistant’s logic to the logic of the automatic provers, heuristic and learning methods that select relevant facts from large libraries, and methods that reconstruct the automatically found proofs inside the proof assistants. We outline the history of these methods, explain the main issues and techniques, and show their strength on several large benchmarks. We also discuss the relation of this technology to the QED Manifesto and consider its implications for QED-like efforts.

    Blanchette’s Sledgehammer research was supported by the Deutsche Forschungsgemeinschaft projects Quis Custodiet (grants NI 491/11-1 and NI 491/11-2) and Hardening the Hammer (grant NI 491/14-1). Kaliszyk is supported by the Austrian Science Fund (FWF) grant P26201. Sledgehammer was originally supported by the UK’s Engineering and Physical Sciences Research Council (grant GR/S57198/01). Urban’s work was supported by the Marie-Curie Outgoing International Fellowship project AUTOKNOMATH (grant MOIF-CT-2005-21875) and by the Netherlands Organisation for Scientific Research (NWO) project Knowledge-based Automated Reasoning (grant 612.001.208).

    This is the final published version. It first appeared at http://jfr.unibo.it/article/view/4593/5730?acceptCookies=1
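
    The four ingredients the survey names compose into a single loop, sketched below in Python with each component passed in as a callable; this is a schematic of the hammer architecture, not any particular system's API.

        # Schematic hammer loop: premise selection, translation, parallel
        # ATP calls, and in-kernel proof reconstruction. Placeholders only.
        def hammer(goal, library, select, translate, atps, reconstruct,
                   timeout=30):
            premises = select(goal, library)      # relevance filtering
            problem = translate(goal, premises)   # e.g. HOL -> first-order
            for atp in atps:                      # e.g. E, Vampire, Z3
                proof = atp(problem, timeout)
                if proof is not None:
                    # Replay the found proof inside the proof assistant.
                    return reconstruct(goal, proof)
            return None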

    Premise Selection and External Provers for HOL4

    Learning-assisted automated reasoning has recently gained popularity among the users of Isabelle/HOL, HOL Light, and Mizar. In this paper, we present an add-on to the HOL4 proof assistant and an adaptation of the HOLyHammer system that provides machine learning-based premise selection and automated reasoning also for HOL4. We efficiently record the HOL4 dependencies and extract features from the theorem statements, which form a basis for premise selection. HOLyHammer translates the HOL4 statements into the various TPTP ATP proof formats, which are then processed by the ATPs. We discuss the different evaluation settings: ATPs, accessible lemmas, and premise numbers. We measure the performance of HOLyHammer on the HOL4 standard library. The results are combined accordingly and compared with the HOL Light experiments, showing a comparably high quality of predictions. The system directly benefits HOL4 users by automatically finding proof dependencies that can be reconstructed by Metis.
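
    The feature extraction step can be pictured with a simplified sketch: characterize each statement by the constant symbols it mentions. Real hammer featurizers also add type and term-walk features; this is not HOL4's actual code.

        import re

        # Simplified featurizer: keep identifier-like tokens longer than
        # one character, approximating the constants of the statement.
        def features(statement):
            tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_']*", statement)
            return {t for t in tokens if len(t) > 1}

        print(features("!n. LENGTH (REPLICATE n x) = n"))
        # e.g. {'LENGTH', 'REPLICATE'}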

    Machine Learning of Coq Proof Guidance: First Experiments

    We report the results of the first experiments with learning proof dependencies from formalizations done with the Coq system. We explain the process of obtaining the dependencies from the Coq proofs, the characterization of formulas that is used for the learning, and the evaluation method. Various machine learning methods are compared on a dataset of 5021 toplevel Coq proofs coming from the CoRN repository. The best resulting method covers on average 75% of the needed proof dependencies among the first 100 predictions, which is comparable to the performance of such initial experiments on other large-theory corpora.
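
    The reported figure corresponds to a coverage-at-k measure, sketched below; the names are illustrative, not the paper's evaluation code.

        # Fraction of a proof's true dependencies found among the first k
        # ranked predictions; 75% at k=100 matches the average above.
        def coverage_at_k(predictions, dependencies, k=100):
            deps = set(dependencies)
            return len(set(predictions[:k]) & deps) / len(deps)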