Capturing Hiproofs in HOL Light
Hierarchical proof trees (hiproofs for short) add structure to ordinary proof
trees, by allowing portions of trees to be hierarchically nested. The
additional structure can be used to abstract away from details, or to label
particular portions to explain their purpose. In this paper we present two
complementary methods for capturing hiproofs in HOL Light, along with a tool to
produce web-based visualisations. The first method uses tactic recording, by
modifying tactics to record their arguments and construct a hierarchical tree;
this allows a tactic proof script to be modified. The second method uses proof
recording, which extends the HOL Light kernel to record hierarchical proof trees
alongside theorems. This method is less invasive, but requires care to manage
the size of the recorded objects. We have implemented both methods, resulting
in two systems: Tactician and HipCam.
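As an illustration of the hiproof idea described in this abstract, the following is a minimal sketch in Python of a proof tree whose nodes can hide nested detail. It is not the representation used by Tactician or HipCam: the `HiNode` class, its `inner` and `children` fields, and the `flatten` function are hypothetical names chosen for this example, and the tactic names are used only as labels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HiNode:
    """A hypothetical hiproof node (illustrative only, not the paper's data type)."""
    label: str                                              # tactic name or label of a portion
    inner: List["HiNode"] = field(default_factory=list)     # nested detail, hidden when collapsed
    children: List["HiNode"] = field(default_factory=list)  # ordinary proof-tree successors


def flatten(node: HiNode, depth: int) -> List[str]:
    """List the labels visible when nesting is opened to `depth` levels."""
    if depth > 0 and node.inner:
        # Open this node: show its nested contents instead of its own label.
        labels = [l for n in node.inner for l in flatten(n, depth - 1)]
    else:
        # Keep this node collapsed: its label stands in for everything inside.
        labels = [node.label]
    for child in node.children:
        labels.extend(flatten(child, depth))
    return labels


# An induction proof whose base and step cases each hide one tactic.
proof = HiNode(
    "induction",
    inner=[
        HiNode(
            "INDUCT_TAC",
            children=[
                HiNode("base case", inner=[HiNode("REWRITE_TAC")]),
                HiNode("step case", inner=[HiNode("ASM_SIMP_TAC")]),
            ],
        )
    ],
)

print(flatten(proof, 0))  # fully collapsed view
print(flatten(proof, 2))  # nesting opened two levels
```

Collapsing to depth 0 shows only the top-level label, which corresponds to abstracting away from details; opening further levels reveals the labelled portions and then the tactics inside them.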
Proof Repair Infrastructure for Supervised Models: Building a Large Proof Repair Dataset
We report on our efforts building a new, large proof-repair dataset and benchmark suite for the Coq proof assistant. The dataset is made up of Git commits from open-source projects with old and new versions of definitions and proofs aligned across commits. Building this dataset has been a significant undertaking, highlighting a number of challenges and gaps in existing infrastructure. We discuss these challenges and gaps, and we provide recommendations for how the proof assistant community can address them. Our hope is to make it easier to build datasets and benchmark suites so that machine-learning tools for proofs will move to target the tasks that matter most and do so equitably across proof assistants.
Understanding and maintaining tactics graphically OR how we are learning that a diagram can be worth more than 10K LoC
The use of a functional language to implement proof strategies as proof tactics in interactive theorem provers often provides short, concise and elegant implementations. Whilst being elegant, the use of higher order features and combinator languages often results in a very procedural view of a strategy, which may deviate significantly from the high-level ideas behind it. This can make a tactic hard to understand and hence difficult to debug and maintain for experts and non-experts alike: one often has to tear apart complex combinations of lower level tactics manually in order to analyse a failure in the overall strategy. In an industrial technology transfer project, we have been working on porting a very large and complex proof tactic into PSGraph, a graphical language for representing proof strategies. The goal of this work is to improve understandability and maintainability of tactics. Motivated by some initial successes with this, we here extend PSGraph with additional features for development and debugging. Through the re-implementation and refactoring of several existing tactics, we demonstrate the advantages of PSGraph compared with a typical sentential tactic language with respect to debugging, readability and maintenance. In order to act as guidance for others, we give a fairly detailed comparison of the user experience with the two approaches. The paper is supported by a web page providing further details about the implementation as well as interactive illustrations of the examples.
Automating the Formal Verification of Software
Formally verified correctness is one of the most desirable properties of software systems. Despite great progress made toward verification via interactive proof assistants, such as Coq and Isabelle/HOL, such verification remains one of the most effort-intensive (and often prohibitively difficult) software development activities. Recent work has created tools that automatically synthesize proofs either through reasoning using precomputed facts or using machine learning to model proofs and then perform biased search through the proof space. However, models in existing tools fail to capture the richness present in proofs, such as the information the programmer has access to when writing proofs and the natural language contained within variable names. Furthermore, these prior models do not make use of variations in the learning process and advances in large language models.
In this dissertation, I develop tools to improve proof synthesis and to enable fully automating more verification. I first present TacTok, a proof-synthesis tool that models proofs using both the partial proof written thus far and the semantics of the proof state. I then present Diva, a proof-synthesis tool that controls the learning process to produce a diverse set of models and, due to the unique nature of proof synthesis (the existence of the theorem prover, an oracle that infallibly judges a proof’s correctness), efficiently combines these models to improve the overall proving power. I then present Passport, a proof-synthesis tool that systematically explores different ways of encoding identifiers in proofs to improve synthesis. Finally, I present Baldur, a proof-synthesis tool that uses transformer-based pretrained large language models fine-tuned on proofs to generate and repair whole proofs at once, rather than one step at a time.
This dissertation contributes new ideas for improving automated proof synthesis and empirically demonstrates that the improvement is significant on large benchmarks consisting of open-source software projects.
Capturing proof process
PhD Thesis
Proof automation is a common bottleneck for industrial adoption of formal methods.
Heuristic search techniques fail to discharge every proof obligation (PO), and
significant effort is spent on proving the remaining ones interactively. Luckily,
they usually fall into several proof families, where a single idea is required to discharge
all similar POs. However, interactive formal proof requires expertise and
is expensive: repeating the ideas over multiple proofs adds up to significant costs.
The AI4FM research project aims to alleviate the repetitive effort by “learning”
from an expert doing interactive proof. The expert’s proof attempts can give rise
to reusable strategies, which capture the ideas necessary to discharge similar POs.
Automatic replay of these strategies would complete the remaining proof tasks
within the same family, enabling the expert to focus on novel proof ideas.
This thesis presents an architecture to capture the expert’s proof ideas as a high-level
proof process. Expert insight is not reflected in low-level proof scripts; therefore
a generic ProofProcess framework is developed to capture high-level proof information,
such as proof intent and important proof features of the proof steps taken.
The framework accommodates branching to represent the actual proof structure
as well as layers of abstraction to accommodate different granularities. The full
history of how the proof was discovered is recorded, including multiple attempts
to capture alternative, failed or unfinished versions.
A prototype implementation of the ProofProcess framework is available, including
integrations with Isabelle and Z/EVES theorem provers. Two case studies illustrate
how the ProofProcess systems are used to capture high-level proof processes
in examples from industrial-style formal developments. Reuse of the captured
information to discharge similar proofs within the examples is also explored.
The captured high-level information facilitates extraction of reusable proof
strategies. Furthermore, the data could be used for proof maintenance, training,
proof metrics, and other use cases.