5 research outputs found
Beagle as a HOL4 external ATP method
International audienceThis paper presents BEAGLE TAC, a HOL4 tactic for using Beagle as an external ATP for discharging HOL4 goals. We implement a translation of the higher-order goals to the TFA format of TPTP and add trace output to Beagle to reconstruct the intermediate steps derived by the ATP in HOL4. Our translation combines the characteristics of existing successful translations from HOL to FOL and SMT-LIB; however, we needed to adapt certain stages of the translation in order to benefit form the expressiveness of the TFA format and the power of Beagle. In our initial experiments, we demonstrate that our system can prove, without any arithmetic lemmas, 81% of the goals solved by Metis
Premise Selection and External Provers for HOL4
Learning-assisted automated reasoning has recently gained popularity among
the users of Isabelle/HOL, HOL Light, and Mizar. In this paper, we present an
add-on to the HOL4 proof assistant and an adaptation of the HOLyHammer system
that provides machine learning-based premise selection and automated reasoning
also for HOL4. We efficiently record the HOL4 dependencies and extract features
from the theorem statements, which form a basis for premise selection.
HOLyHammer transforms the HOL4 statements in the various TPTP-ATP proof
formats, which are then processed by the ATPs. We discuss the different
evaluation settings: ATPs, accessible lemmas, and premise numbers. We measure
the performance of HOLyHammer on the HOL4 standard library. The results are
combined accordingly and compared with the HOL Light experiments, showing a
comparably high quality of predictions. The system directly benefits HOL4 users
by automatically finding proofs dependencies that can be reconstructed by
Metis
GRUNGE: A Grand Unified ATP Challenge
This paper describes a large set of related theorem proving problems obtained
by translating theorems from the HOL4 standard library into multiple logical
formalisms. The formalisms are in higher-order logic (with and without type
variables) and first-order logic (possibly with multiple types, and possibly
with type variables). The resultant problem sets allow us to run automated
theorem provers that support different logical formats on corresponding
problems, and compare their performances. This also results in a new "grand
unified" large theory benchmark that emulates the ITP/ATP hammer setting, where
systems and metasystems can use multiple ATP formalisms in complementary ways,
and jointly learn from the accumulated knowledge.Comment: CADE 27 -- 27th International Conference on Automated Deductio