Abstract. Using a theorem prover, we have verified a microprocessor design, FM9801. We define our correctness criterion for processors with speculative execution and interrupts. Our verification approach defines an invariant on an intermediate abstraction that records the history of instructions. We verified the invariant first, and then proved the correctness criterion. We found several bugs during the verification process.
FM9801 and Correctness Criterion
We argue that even a complex microprocessor design can be formally verified. As evidence for this claim, we have mechanically verified our FM9801 microprocessor design. Its features include out-of-order issue and completion of instructions with Tomasulo's algorithm, speculative execution with branch prediction, precise handling of internal exceptions and external interrupts, and supervisor/user modes.
The FM9801 is formally specified in the ACL2 logic [KM96] at the instruction-set architecture (ISA) level and the microarchitecture (MA) level. These formal definitions are publicly available along with the FM9801 verification scripts [Saw]. The ISA executes instructions sequentially. Its behavior is specified by the function ISA-step(ISA, intr), which returns the ISA state after executing one instruction from state ISA with interrupt signal intr. The MA is a clock-cycle-accurate model of the pipelined hardware design. Its behavioral function MA-step(MA, sigs) returns the MA state after one clock cycle of execution with external signals sigs.
We define ISA-stepn(ISA, intr-list, m) as the recursive function that applies the next-state function ISA-step to state ISA m times, where intr-list is a list of interrupt signals, one for each execution step. Similarly, we define MA-stepn(MA, sig-list, n) as n applications of MA-step with a list of signals sig-list. The projection function proj(MA) returns the ISA state consisting of the program counter, the register file, and the memory in MA.
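The step-iteration functions and the projection can be sketched in Python as follows. This is only an illustration of their shape, not the ACL2 definitions: the state fields, the toy `isa_step` (which advances the program counter, or jumps to a hypothetical handler at address 0 on an interrupt), the toy `ma_step`, and the `cycle` field are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ISAState:
    pc: int
    regs: Tuple[int, ...]
    mem: Tuple[int, ...]


@dataclass(frozen=True)
class MAState:
    pc: int
    regs: Tuple[int, ...]
    mem: Tuple[int, ...]
    cycle: int  # hypothetical microarchitecture-only field, hidden by proj


def isa_step(isa: ISAState, intr: bool) -> ISAState:
    # Toy next-state function (assumption): an interrupt jumps to address 0,
    # otherwise the program counter simply advances by one.
    return replace(isa, pc=0 if intr else isa.pc + 1)


def isa_stepn(isa: ISAState, intr_list: Sequence[bool], m: int) -> ISAState:
    # Recursive m-fold application of isa_step; assumes len(intr_list) >= m.
    if m == 0:
        return isa
    return isa_stepn(isa_step(isa, intr_list[0]), intr_list[1:], m - 1)


def ma_step(ma: MAState, sigs: object) -> MAState:
    # Toy clock-cycle step (assumption): only the cycle counter changes.
    return replace(ma, cycle=ma.cycle + 1)


def ma_stepn(ma: MAState, sig_list: Sequence[object], n: int) -> MAState:
    # n applications of ma_step; assumes len(sig_list) >= n.
    if n == 0:
        return ma
    return ma_stepn(ma_step(ma, sig_list[0]), sig_list[1:], n - 1)


def proj(ma: MAState) -> ISAState:
    # Keep only the programmer-visible components.
    return ISAState(ma.pc, ma.regs, ma.mem)
```

Note that `proj` discards every microarchitectural detail, so two MA states that differ only in pipeline contents project to the same ISA state.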
Our correctness criterion is whether our machine designs satisfy the commutative diagram shown in Fig. 1.
In fact, all we need to know is that the properties labeled 1 through 6 hold for MA_n. The remaining properties are necessary for our inductive proofs. The properties in Table 1 were obtained interactively during the verification process. We started the invariant verification by considering only the conjunction of the properties labeled 1 through 6. Naturally, our first proof attempt failed. We then analyzed the failed proof and added more properties to the conjunction. Eventually we identified all the properties in Table 1. This was the most time-consuming part of the verification.
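The need for the additional properties can be illustrated on a toy state machine, sketched here in Python. Everything in the sketch is hypothetical and unrelated to the actual FM9801 properties: the machine, the `target` property, and the `helper` property are invented for illustration. The target property is true of all the states we care about but is not preserved by the step function on its own; conjoining it with a helper property yields an invariant that is inductive.

```python
# Toy machine (assumption): states are pairs (a, b) with 0 <= a, b < 8.
STATES = [(a, b) for a in range(8) for b in range(8)]


def step(s):
    a, b = s
    return (b, (b + 2) % 8)


def target(s):
    # The property we actually want: a is even.
    return s[0] % 2 == 0


def helper(s):
    # Extra property discovered from a failed proof attempt: b is even.
    return s[1] % 2 == 0


def strengthened(s):
    return target(s) and helper(s)


def counterexamples(inv):
    # States where inv holds but fails after one step, i.e. where the
    # inductive step "inv(s) implies inv(step(s))" breaks down.
    return [s for s in STATES if inv(s) and not inv(step(s))]


weak_failures = counterexamples(target)          # non-empty: not inductive
strong_failures = counterexamples(strengthened)  # empty: inductive
```

Inspecting `weak_failures` plays the role of analyzing a failed proof: each counterexample has an odd `b`, which suggests the helper property to add.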
The proof of the correctness criterion must bridge the complex time abstraction between the ISA level and the MA level. The state of each programmer-visible component in the MA is related to a different ISA state. These relations are expressed by the properties labeled 1 through 4 in Table 1. The proof of the criterion can be found in our report [SH].
Verification Summary
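The time abstraction can be illustrated on a toy two-stage pipeline, sketched below in Python. The sketch assumes a common shape for such a diagram, which is not spelled out here: starting from a flushed MA state, n clock cycles that end in a flushed state correspond to some number m of ISA steps, and the projections agree. The accumulator machine, the program `PROG`, the `stall` signal, and the choice of m as the number of committed instructions are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence

PROG = (3, 5, 7, 11)  # hypothetical program: instruction i adds PROG[i]


class ISA(NamedTuple):
    pc: int
    acc: int


class MA(NamedTuple):
    pc: int               # committed program counter
    acc: int              # committed accumulator
    latch: Optional[int]  # fetched but not yet executed instruction


def isa_step(isa: ISA) -> ISA:
    if isa.pc >= len(PROG):  # past the end of the program: halt
        return isa
    return ISA(isa.pc + 1, isa.acc + PROG[isa.pc])


def isa_stepn(isa: ISA, m: int) -> ISA:
    for _ in range(m):
        isa = isa_step(isa)
    return isa


def ma_step(ma: MA, stall: bool) -> MA:
    pc, acc = ma.pc, ma.acc
    if ma.latch is not None:  # execute stage: commit the latched instruction
        pc, acc = pc + 1, acc + ma.latch
    # Fetch stage: fetch the next instruction unless stalled or at the end.
    fetched = PROG[pc] if (not stall and pc < len(PROG)) else None
    return MA(pc, acc, fetched)


def ma_stepn(ma: MA, stalls: Sequence[bool]) -> MA:
    for stall in stalls:
        ma = ma_step(ma, stall)
    return ma


def proj(ma: MA) -> ISA:
    return ISA(ma.pc, ma.acc)


def commutes(ma0: MA, stalls: Sequence[bool]) -> bool:
    man = ma_stepn(ma0, stalls)
    if ma0.latch is not None or man.latch is not None:
        raise ValueError("both end states must be flushed")
    m = man.pc - ma0.pc  # number of instructions committed in n cycles
    return proj(man) == isa_stepn(proj(ma0), m)
```

For example, `commutes(MA(0, 0, None), [False, False, True, False, True])` relates n = 5 clock cycles to m = 3 ISA steps. The number of cycles per instruction varies with the stall pattern, which is the gap the proof has to bridge.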
The FM9801 verification was carried out with the ACL2 theorem prover. First, we simulated our FM9801 specification using ACL2's execution capability. This eliminated most of the bugs in our original design before we started the formal verification process. The size of the ACL2 verification scripts and the time to certify the proofs for each stage of the verification are given in Table 2. The whole verification project took about 15 months. The verification of the invariant accounted for the largest portion of the ACL2 proof scripts and of our verification effort. This is not surprising, because the proof of the invariant is the core of our verification process.
We found 14 design faults in our machine design during the verification process. Several of these bugs had not been detected in simulation, and all of those were detected during the verification of the invariant. For instance, one bug was found in the control logic for speculative execution and branch prediction. A prediction is usually made for a branch instruction at the instruction fetch unit (IFU). If the branch instruction stalls in the IFU, more than one branch prediction is made for that single instruction. In the original design, if the outcomes of these predictions differed, the machine did not correctly execute instructions from the branching point.
Although this verification was labor intensive, our technique seems to scale well with the size of the machine. In Table 3, we compare the size of our machine specification and verification scripts with those of two other proof efforts, each with a different machine size, in which we employed a similar approach. The ratio between the size of the verified machine design and the size of its verification script does not change much. We also note that the CPU time in Table 2 is relatively small. This is because we decompose a complex verification problem into small lemmas to avoid case explosions. Typically, the ACL2 theorem prover proved a single lemma in less than a minute during our verification.
We have demonstrated that a pipelined machine with complex control features can be mechanically verified. Although the verification cost was high, we do not see any major difficulty in scaling the verification process to a more complex design. So far we have used only a theorem prover, but combining it with algorithmic approaches could improve the verification efficiency. Improving the invariant verification process will make our technique more practical.
