Towards Ranking Geometric Automated Theorem Provers
The field of geometric automated theorem provers has a long and rich history,
from the early AI approaches and synthetic provers of the 1960s to today's
algebraic and synthetic provers.
The geometry automated deduction area differs from other areas in the strong
connection between its axiomatic theories and their standard models. In many
cases geometric constructions are used to establish the theorems' statements;
in some provers they are also used to conduct the proof, or as counter-examples
to close branches of the automatic proof. Synthetic geometry proofs are built
from geometric properties, and such proofs can have a visual counterpart in the
supporting geometric construction.
With the growing use of geometry automated deduction tools in other areas,
e.g. in education, the need to evaluate them against different criteria is
increasingly felt. Establishing a ranking among geometric automated theorem
provers would be useful for improving the current methods and implementations,
with respect to wider scope, better efficiency, proof readability and proof
reliability.
To be able to compare geometric automated theorem provers, a common test bench
is needed: a common language to describe the geometric problems; a
comprehensive repository of geometric problems; and a set of quality measures.
Comment: In Proceedings ThEdu'18, arXiv:1903.1240
Mining State-Based Models from Proof Corpora
Interactive theorem provers have been used extensively to reason about
various software/hardware systems and mathematical theorems. The key challenge
when using an interactive prover is that finding a suitable sequence of proof
steps that will lead to a successful proof requires a significant amount of
human intervention. This paper presents an automated technique that takes as input
examples of successful proofs and infers an Extended Finite State Machine as
output. This can in turn be used to generate proofs of new conjectures. Our
preliminary experiments show that the inferred models are generally accurate
(contain few false-positive sequences) and that representing existing proofs in
such a way can be very useful when guiding new ones.
Comment: To appear at Conferences on Intelligent Computer Mathematics 201
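The mining step described above can be illustrated with a much-simplified sketch. Real EFSM inference merges states and attaches data guards, which the snippet below does not attempt; instead it treats each proof as a sequence of tactic names, records which tactics follow which, and uses that transition map to suggest next steps. All function and tactic names here are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def mine_transitions(proofs):
    """Build a transition map from example proofs.

    Each proof is a list of tactic names. A "state" here is simply the
    last tactic applied -- a strong simplification of EFSM inference,
    which would also merge states and infer guards on transitions.
    """
    transitions = defaultdict(set)
    for proof in proofs:
        state = "START"
        for tactic in proof:
            transitions[state].add(tactic)
            state = tactic
        transitions[state].add("QED")  # mark successful completion
    return transitions

def suggest(transitions, last_tactic):
    """Suggest candidate next tactics given the last tactic applied."""
    return sorted(transitions.get(last_tactic, ()))

# Toy corpus of two successful proofs (hypothetical tactic names).
proofs = [
    ["intro", "induction", "simp", "auto"],
    ["intro", "simp", "auto"],
]
model = mine_transitions(proofs)
print(suggest(model, "intro"))  # tactics observed to follow 'intro'
```

The inferred model generalises beyond the corpus: any interleaving of observed transitions counts as a plausible proof attempt, which is also the source of the false-positive sequences the abstract mentions.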
Getting More out of Large Language Models for Proofs
Large language models have the potential to simplify formal theorem proving
and make it more accessible. But how to get the most out of these models is
still an open question. To answer this question, we take a step back and
explore the failure cases of these models using common prompting-based
techniques. Our talk will discuss these failure cases and what they can teach
us about how to get more out of these models.
Computer theorem proving in math
We give an overview of issues surrounding computer-verified theorem proving
in the standard pure-mathematical context. This is based on my talk at the PQR
conference (Brussels, June 2003).
Towards a Geometry Automated Provers Competition
The geometry automated theorem proving area distinguishes itself by a large
number of specific methods and implementations, different approaches
(synthetic, algebraic, semi-synthetic) and different goals and applications
(from research in the area of artificial intelligence to applications in
education).
Apart from the usual measures of efficiency (e.g. CPU time), the possibility
of visual and/or readable proofs is also an expected output against which the
geometry automated theorem provers (GATP) should be measured.
The implementation of a competition between GATPs would make it possible to
create a test bench for GATP developers to improve the existing provers and to
propose new ones. It would also make it possible to establish a ranking of
GATPs that could be used by "clients" (e.g. developers of educational
e-learning systems) to choose the best implementation for a given intended use.
Comment: In Proceedings ThEdu'19, arXiv:2002.1189
LLMSTEP: LLM proofstep suggestions in Lean
We present LLMSTEP, a tool for integrating a language model into the Lean
proof assistant. LLMSTEP is a Lean 4 tactic that sends a user's proof state to
a server hosting a language model. The language model generates suggestions,
which are checked in Lean and displayed to a user in their development
environment. We provide a baseline language model, along with code for
fine-tuning and evaluation to support further development. We provide server
implementations that run on CPU, a CUDA GPU, or a Google Colab notebook, as a
step towards fast, effective language model suggestions for any user.
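The suggestion loop the abstract describes (proof state goes out, checked candidates come back) can be sketched in a few lines. This is a minimal illustration, not LLMSTEP's actual server code: the stub model, function names, and suggestion logic below are all assumptions standing in for the fine-tuned language model and the Lean-side checking.

```python
import json

def suggest_tactics(proof_state, model, num_suggestions=3):
    """Return candidate next-tactic suggestions for a proof state.

    `model` is any callable mapping proof-state text to a ranked list
    of tactic strings; a real deployment would call a fine-tuned
    language model here, and the Lean tactic would then check each
    suggestion before displaying it.
    """
    candidates = model(proof_state)
    return candidates[:num_suggestions]

def toy_model(proof_state):
    # Stub standing in for the language model (hypothetical heuristics).
    if "∀" in proof_state or "forall" in proof_state:
        return ["intro x", "simp", "exact?"]
    return ["simp", "ring", "exact?"]

state = "⊢ ∀ (n : ℕ), n + 0 = n"
print(json.dumps(suggest_tactics(state, toy_model)))
```

Serving this function behind a small HTTP endpoint, as LLMSTEP does, keeps the heavyweight model off the user's machine while the editor-side tactic stays thin.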