2 research outputs found
Binary-Compatible Verification of Filesystems with ACL2
Filesystems are an essential component of most computer systems. Work on the verification of filesystem functionality has been focused on constructing new filesystems in a manner which simplifies the process of verifying them against specifications. This leaves open the question of whether filesystems already in use are correct at the binary level.
This paper introduces LoFAT, a model of the FAT32 filesystem which efficiently implements a subset of the POSIX filesystem operations, and HiFAT, a more abstract model of FAT32 which is simpler to reason about. LoFAT is proved to be correct in terms of refinement of HiFAT, and made executable by enabling the state of the model to be written to and read from FAT32 disk images. EqFAT, an equivalence relation for disk images, considers whether two disk images contain the same directory tree modulo reordering of files and implementation-level details regarding cluster allocation. A suite of co-simulation tests uses EqFAT to compare the operation of existing FAT32 implementations to LoFAT and check the correctness of existing implementations of FAT32 such as the mtools suite of programs and the Linux FAT32 implementation. All models and proofs are formalized and mechanically verified in ACL2
Recommended from our members
Automating the Formal Verification of Software
Formally verified correctness is one of the most desirable properties of software systems. Despite great progress made toward verification via interactive proof assistants, such as Coq and Isabelle/HOL, such verification remains one of the most effort-intensive (and often prohibitively difficult) software development activities. Recent work has created tools that automatically synthesize proofs either through reasoning using precomputed facts or using machine learning to model proofs and then perform biased search through the proof space. However, models in existing tools fail to capture the richness present in proofs, such as the information the programmer has access to when writing proofs and the natural language contained within variable names. Furthermore, these prior models do not make use of variations in the learning process and advances in large language models.
In this dissertation, I develop tools to improve proof synthesis and to enable fully automating more verification. I first present TacTok, a proof-synthesis tool that models proofs using both the partial proof written thus far and the semantics of the proof state. I then present Diva, a proof-synthesis tool that controls the learning process to produce a diverse set of models and, due to the unique nature of proof synthesis (the existence of the theorem prover, an oracle that infallibly judges a proof’s correctness), efficiently combines these models to improve the overall proving power. I then present Passport, a proof-synthesis tool that systematically explores different ways of encoding identifiers in proofs to improve synthesis. Finally, I present Baldur, a proof-synthesis tool that uses transformer-based pretrained large language models fine-tuned on proofs to generate and repair whole proofs at once, rather than one step at a time.
This dissertation contributes new ideas for improving automated proof synthesis and empirically demonstrates that the improvement is significant on large benchmarks consisting of open-source software projects