Abstract. A parameterized definition of subtractive floating point division algorithms is presented and verified using PVS. The general algorithm is proven to satisfy a formal definition of an IEEE standard for floating point arithmetic. The utility of the general specification is illustrated using a number of different instances of the general algorithm.
Introduction
As computing systems become more complex, it becomes increasingly difficult to ensure that testing fully exercises the design. This was made abundantly clear by the infamous bug in the floating point unit of the Intel Pentium(tm) microprocessor. The bug consists of five missing entries in a lookup table. Pratt [21] provides a thorough analysis of this error and argues compellingly that even a thorough manual analysis of a design may allow errors to evade detection. This is particularly true if the flaw is in a region of the design that is thought to be unreachable. Machine assisted reasoning is crucial to prevent such errors. Two recent verifications of Taylor's SRT divider [20] illustrate how theorem proving techniques can be used to prevent omissions in lookup tables similar to that employed by the Pentium [5, 18]. These verifications describe a relationship from a verified algorithm to a hardware design. In order to complete the verification, it is necessary to relate the algorithm to a specification of the floating point operation.
Our work provides a link between the IEEE floating point standards [7, 8] and a class of verified division algorithms. A strength of theorem prover based verification is that it allows verification of classes of algorithms. Once a class is verified with respect to the standard, it can be used routinely in the development of verified hardware. Taylor's SRT divider is an instance of the class we have verified. Other instances include the classical restoring and non-restoring division algorithms. Thus, our general theory provides a standard specification for IEEE compliant subtractive division.
Section 3 presents a verification of the class of subtractive division algorithms. We illustrate the utility of the general verification by exhibiting a number of instances. Section 4 extends these algorithms to provide verified IEEE compliant rounding. This verification proceeds in two stages. Section 4.1 presents the verification of a standard algorithm to round floating point results in accordance with the standard. Section 4.2 composes the rounding algorithm with the general division algorithm to provide a verified IEEE compliant division algorithm.
Related work
Much of the previous work in theorem prover based verification of floating point algorithms has focused on verifying core algorithms and a corresponding hardware design. The first efforts targeted verified implementations of binary nonrestoring algorithms. Leeser ...
In all of the efforts above, the verifications address functional correctness of core algorithms and associated hardware designs. They do not address the issue of relating the algorithms to a formal definition of floating point operations.
Harrison has presented a verification of two floating point algorithms: square root, and a CORDIC [23] natural logarithm algorithm [9]. For both algorithms, Harrison relates the proofs to a floating point interpretation. Although he does not present hardware descriptions, he does address some of the preliminary error analysis necessary to provide correct rounding.
The IEEE standards for floating point arithmetic unambiguously state that each operation "shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then that result rounded according to one of the modes ..." [7, 8].
Barrett manually verified a general rounding algorithm with respect to a Z formalization of IEEE 754 [2, 3]. When Barrett performed his verification, there was no machine assisted reasoning for the Z specification language. Some tools for machine assisted application of Z have recently been developed [14]. Thus far, these have not been applied to floating point verification.
Recently, the microcode for the floating point division and square root algorithms of the AMD5K86(tm) microprocessor has been mechanically verified using the ACL2 theorem prover [16, 19]. Both algorithms assume correct hardware for floating point multiplication, addition, and subtraction. Both verifications include detailed analysis of rounding and proof that the delivered result is rounded in accordance with the IEEE standard. In addition, the verifications guarantee that all intermediate results of the algorithms fit the datapath of the existing floating point hardware. Aagaard and Seger employ a combination of theorem proving and model-checking techniques to verify a floating point multiplier against a formal definition of IEEE multiplication [1].
Our work is a generalization of the Rueß, Srivas, and Shankar verification. In addition to the SRT algorithms, our verification encompasses most of the algorithms presented in [6]. Furthermore, our verification includes a formal path relating the algorithm to the IEEE standard.
Brief introduction to PVS
PVS [17] is a verification system that provides support for general purpose theorem proving. The specification language is a higher order logic augmented with dependent types. Theories can be parameterized, and the dependent type mechanism allows for stating arbitrary constraints on theory parameters. The type system of PVS includes predicate subtypes, so type checking is undecidable. PVS therefore frequently generates proof obligations to ensure that expressions are well typed. PVS has powerful decision procedures, so many proofs involving simple arithmetic expressions can be discharged automatically. In addition, PVS provides a collection of pre-proven results in the prelude. Also included with PVS are libraries providing support for bit-vectors and finite sets. In PVS, the real numbers are a base type, and other numeric types are defined as subtypes of the reals. This allows specifications to freely mix operations on numeric types.
3 General verification of subtractive division algorithms
There are two principal classes of floating point division algorithms. The subtractive algorithms use shifting followed by addition/subtraction to generate quotient digits in time linear with respect to the size of the operands. Multiplicative algorithms, such as Goldschmidt's algorithm or Newton-Raphson iteration, require fewer iterations, but the operations in each iteration grow increasingly complex. Ercegovac and Lang present a detailed study of subtractive algorithms for both division and square root [6]. The general division algorithm is presented in PVS, providing a parameterized class of verified subtractive division algorithms.
General algorithm definition
Subtractive algorithms generate one quotient digit per iteration. They are designed so that in iteration i the remainder is no larger than $r^{-i}$ for a radix-r algorithm, assuming certain constraints on the dividend and divisor. Ercegovac and Lang present a series of interrelated factors which differentiate subtractive division algorithms: the radix (r), the quotient-digit set ({-a, -a ...
By using record types for the range of the function, the definition is a direct transliteration of the recurrence equations. In addition, by declaring the partial remainder to be of type p_type(D), PVS automatically generates a proof obligation to ensure that the invariant is satisfied. This obligation is proven using the type constraints on the quotient selection function.
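As an informal illustration of such a recurrence, the following Python sketch parameterizes the iteration by the radix and a quotient selection function. It is not the PVS definition: the names `subtractive_divide` and `select_restoring_radix2` are hypothetical, and the recurrence p <- r*p - q*D is the textbook form, assumed here because the original equations are not reproduced above. Exact rational arithmetic stands in for the real numbers of PVS.

```python
from fractions import Fraction

def subtractive_divide(x, d, n, radix, select):
    """Run n iterations of p <- radix*p - digit*d (assumed textbook form).

    Returns (q, p) with the invariant x == q*d + p / radix**n.
    """
    p = Fraction(x)      # partial remainder, p_0 = x
    q = Fraction(0)      # accumulated quotient
    for i in range(1, n + 1):
        shifted = radix * p
        digit = select(shifted, d)           # quotient selection function
        p = shifted - digit * d              # next partial remainder
        q += Fraction(digit, radix ** i)     # append digit i to the quotient
    return q, p

def select_restoring_radix2(shifted, d):
    """Restoring radix-2 selection: digit set {0, 1}, assumes 0 <= x < d."""
    return 1 if shifted >= d else 0

x, d, n = Fraction(5, 8), Fraction(3, 4), 8
q, p = subtractive_divide(x, d, n, 2, select_restoring_radix2)
assert q * d + p / 2 ** n == x               # correctness invariant
assert 0 <= x / d - q < Fraction(1, 2 ** n)  # error below r**(-n)
```

In this sketch the bound on the partial remainder (0 <= p < d for the restoring instance) plays the role of the type constraint: it holds after every iteration only because the selection function maintains it.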
Verification of the general algorithm
To simplify later definitions we define the abbreviations: ...
The PVS strategy (induct-and-simplify) proves ...
To strengthen the convergence property, the algorithm includes a corrective step after the final iteration.
Definition 4.
Expanding the definitions of P and Q, and using Lemma 2, we prove Theorem 5 (correctness).
Similarly, expanding the definition of P and using Lemma 3, we prove Theorem 6 (convergence).
Thus, $Q(X, D)(n)$ contains n radix-r digits of the quotient $X/D$.
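The role of a corrective step can be seen in a small Python sketch of radix-2 non-restoring division. This is an assumed textbook formulation, not the PVS definition: the function name `nonrestoring_divide` is hypothetical, the digit set is taken to be {-1, 1}, and the operands are assumed to satisfy -d <= x < d. Without the final correction the partial remainder may be negative, so the quotient may overshoot; with it, the error lies in [0, 2^-n).

```python
from fractions import Fraction

def nonrestoring_divide(x, d, n):
    """Radix-2 non-restoring division with a final corrective step.

    Assumes d > 0 and -d <= x < d. Returns (q, p) such that
    x == q*d + p / 2**n and 0 <= p < d.
    """
    p, q = Fraction(x), Fraction(0)
    for i in range(1, n + 1):
        shifted = 2 * p
        digit = 1 if shifted >= 0 else -1    # digit set {-1, 1}
        p = shifted - digit * d
        q += Fraction(digit, 2 ** i)
    if p < 0:                                # corrective step
        p += d
        q -= Fraction(1, 2 ** n)
    return q, p

x, d = Fraction(5, 8), Fraction(3, 4)
for n in range(1, 12):
    q, p = nonrestoring_divide(x, d, n)
    assert q * d + p / 2 ** n == x               # correctness
    assert 0 <= x / d - q < Fraction(1, 2 ** n)  # convergence
```

The two assertions mirror the shape of the correctness and convergence statements: the first is an exact identity, the second bounds the distance between the computed quotient and X/D.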
Example instantiations
A number of quotient selection functions have been developed for use with the general subtractive division algorithm. These include functions for three radix-2 algorithms and five radix-4 algorithms. The radix-2 algorithms are restoring, non-restoring, and SRT. The radix-4 algorithms include two using fixed multiples of the divisor for comparison with p (using a = 3 and a = 2), and three distinct radix-4 SRT lookup tables.
4 Verification with respect to the IEEE standard
This section illustrates how to extend the above general algorithm to provide a verified IEEE compliant implementation of division. This does not constitute a full proof of compliance; it only illustrates how non-exceptional cases of division can be realized. This verification step is performed with respect to a formal specification of IEEE 854 defined using PVS [15]. The verification is performed in two stages. First, a generalization of the basic guard, round, and sticky bit rounding algorithm is shown to satisfy the requirements of the standard. Then, the general subtractive algorithm is shown to provide sufficient information to utilize this rounding scheme. The verified rounding scheme is applicable for all floating-point algorithms. In addition, the theory mapping the standard to the general subtractive algorithm includes a number of intermediate results that apply to all floating point division algorithms.
Rounding scheme
The IEEE standards for floating-point arithmetic [7, 8] require support for four rounding modes. The default mode is round to nearest even, and requires that the returned value be the floating point number nearest to the exact result. If the exact result is halfway between two floating point numbers, the standards require that it be rounded to the one with an even least significant digit. The other three modes round the result towards positive infinity, negative infinity, or zero. The discussion in this section uses the following fact about real numbers: ...
The first extra digit is called the guard digit and ensures that the computed result can be normalized while preserving p digits of precision. This is necessary for multiplication and division algorithms. The second extra digit is called the round digit, and is used to control rounding for every mode except to_zero. Finally, a sticky flag is required to distinguish the case when the infinitely precise result lies halfway between two representable values for mode to_near. The PVS theory implementing the Guard, Round, Sticky (GRS) scheme has been generalized to allow an arbitrary even radix. Thus it works for both base-2 and base-10 instances of IEEE 854. The principal function realizing the GRS rounding scheme is
Definition 11. ...
The PVS proof of this result consists of a fairly simple case analysis, except for mode to_near.
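A simplified Python sketch of the round digit and sticky flag at work is given below. It is an illustration under stated assumptions, not the PVS function of Definition 11: the name `grs_round`, the string mode names, and the integer encoding of the significand are all hypothetical, and the guard digit is omitted by assuming the magnitude is already normalized. The argument `digits` holds the truncated magnitude with exactly one extra (round) digit, and `sticky` records whether any nonzero digits were discarded beyond it.

```python
def grs_round(negative, digits, sticky, mode, radix=2):
    """Round a sign-and-magnitude value given one extra digit and a sticky flag.

    The radix is assumed even. Returns the signed, rounded integer significand.
    """
    kept, round_digit = divmod(digits, radix)
    half = radix // 2
    inexact = round_digit != 0 or sticky
    if mode == "to_zero":
        up = False                            # truncate the magnitude
    elif mode == "to_pos":
        up = inexact and not negative
    elif mode == "to_neg":
        up = inexact and negative
    elif mode == "to_near":
        if round_digit > half or (round_digit == half and sticky):
            up = True                         # strictly above the midpoint
        elif round_digit == half:
            up = kept % 2 == 1                # exact tie: round to even
        else:
            up = False
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    magnitude = kept + 1 if up else kept
    return -magnitude if negative else magnitude

# Binary ties go to the even neighbour; a set sticky flag breaks the tie.
assert grs_round(False, 0b1011, False, "to_near") == 6
assert grs_round(False, 0b1001, False, "to_near") == 4
assert grs_round(False, 0b1001, True, "to_near") == 5
# The same code handles radix 10, as permitted by IEEE 854.
assert grs_round(False, 125, False, "to_near", radix=10) == 12
assert grs_round(False, 135, False, "to_near", radix=10) == 14
```

Only mode to_near needs to distinguish the exact midpoint from values just above it, which is why its case analysis is the most involved of the four.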
The initial PVS proof for mode to_near included a complicated case analysis, where it was difficult to exploit symmetry. IEEE floating point numbers are defined using a sign and magnitude representation. However, the definition of round_scaled does not take advantage of this representation. Thus, the first proof for mode to_near included an unnatural case split on the sign of the argument. Since the GRS rounding scheme is defined in terms of a sign and magnitude representation, the cases for negative arguments do not align in the same manner as for the positive arguments. Thus, there was little opportunity to reuse proofs from the corresponding positive cases. The PVS proof has been simplified using the following lemma (which was proven using the PVS strategy (grind)):
Lemma 15. $\mathrm{round}(z, \mathrm{to\_near}) = \mathrm{sgn}(z) \cdot \mathrm{round}(|z|, \mathrm{to\_near})$
Even without the case split due to the sign, the proof for mode to_near still involves a difficult case analysis. This case analysis consists of relating the values of the round digit and sticky flag to the corresponding cases from the specification for rounding mode to_near.
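The symmetry expressed by Lemma 15 can be checked numerically in a simplified setting. The Python sketch below rounds exact rationals to integers rather than to p-digit floating point numbers, which is an assumption made for brevity; `sgn` and `round_to_near` are hypothetical helper names. It relies on the fact that Python's built-in `round` applied to a `Fraction` rounds halfway cases to the even neighbour.

```python
import math
from fractions import Fraction

def sgn(z):
    """Sign of z as -1, 0, or 1."""
    return (z > 0) - (z < 0)

def round_to_near(z):
    """Round to the nearest integer, ties to even (Fraction.__round__)."""
    return round(Fraction(z))

samples = [Fraction(k, 4) for k in range(-20, 21)]

# Round to nearest even is symmetric about zero.
assert all(round_to_near(z) == sgn(z) * round_to_near(abs(z)) for z in samples)

# A directed mode is not: rounding towards positive infinity breaks the identity.
z = Fraction(-1, 4)
assert math.ceil(z) == 0
assert sgn(z) * math.ceil(abs(z)) == -1
```

The contrast with the directed mode suggests why the lemma is specific to to_near: only there can the sign be factored out before the magnitude is examined.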
Relating the general algorithm to the standard
The verified rounding scheme asserts that in order to achieve IEEE compliant rounding to p significant digits, it is sufficient to compute (p + 1) truncated significant digits and determine if there are any remaining non-zero digits. The general subtractive division algorithm ensures at least n - 1 digits of precision after n iterations. Furthermore, if the computed remainder is nonzero, then there are additional digits in the infinitely precise result. Since it is possible for the radix of the division algorithm to be different from that for the representation of floating point numbers, the PVS theory has to relate the potential division radices to those allowed by the standards.
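The sufficiency claim can be exercised with a short Python sketch. It is a sanity check under assumptions, not the PVS theory: exact floor division stands in for the digits delivered by a subtractive divider, significands are modelled as scaled integers, and the names `truncated_quotient` and `round_near_from_truncation` are hypothetical. The check compares rounding from p + 1 truncated digits plus a sticky flag against rounding the exact quotient directly.

```python
from fractions import Fraction

def truncated_quotient(x, d, digits, radix=2):
    """Return (q, sticky): q = floor(x/d * radix**digits), sticky = remainder nonzero."""
    scaled = Fraction(x) / Fraction(d) * radix ** digits
    q = scaled.numerator // scaled.denominator
    return q, scaled != q

def round_near_from_truncation(q, sticky, radix=2):
    """Round to nearest even using only the extra digit of q and the sticky flag."""
    kept, round_digit = divmod(q, radix)
    half = radix // 2
    if round_digit > half or (round_digit == half and (sticky or kept % 2 == 1)):
        return kept + 1
    return kept

p = 6  # digits kept after rounding
for x in range(1, 40):
    for d in range(1, 40):
        q, sticky = truncated_quotient(x, d, p + 1)
        exact = round(Fraction(x, d) * 2 ** p)   # ties to even on the exact value
        assert round_near_from_truncation(q, sticky) == exact
```

The sticky flag here is derived from a nonzero remainder, matching the observation that a nonzero computed remainder signals additional digits in the infinitely precise result.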
There are some simple results that describe the range of possible values for floating point division. Let x denote a base-b floating point number with p significant digits. A finite floating point number x is represented using three fields: a sign in $\{0, 1\}$, an integer exponent $E_x$, and a significand ...
The IEEE standard requires that all operations be performed as if to infinite precision and then rounded. The infinitely precise quotient of two floating point numbers x and y is $v(x)/v(y)$. A number of general facts about floating point division have been proven using PVS. These include the following:
The PVS proofs of these three identities involve algebraic manipulation of expressions composed of exponents, absolute values, and the integer floor function. The proofs are not conceptually difficult, but neither do they yield to the brute-force proof strategies of PVS.
IEEE compliant division
The PVS theories are structured so that a designer can generate an instance of a verified algorithm with a minimum of effort. First, the designer selects a quotient selection function based on the constraints of the development effort. One such choice might be to use a radix-4 SRT lookup ...
There are a number of ways to build on the work presented here. The basic style of the specification is common for a large number of floating point algorithms.
The most obvious class of algorithms to consider is subtractive square root algorithms. Another good candidate for exploration is whether a similar general schema can be developed for division through multiplication algorithms. Another example to be considered is the generalized CORDIC algorithm [23]. The algorithm has already been defined in PVS, and the solutions to the defining CORDIC equations have been verified. These proofs required the addition of some axioms describing properties of trigonometric and hyperbolic functions. The limited support in PVS for reasoning about these functions made analysis of accumulated error and convergence difficult, so that analysis remains future work.
From the above mentioned work, it should be possible to build up a library of general floating point algorithms verified with respect to the IEEE standard. Such a library would present a developer of floating point hardware with a variety of options, provided there was a good link between the verified library and hardware development tools.
Lessons learned
The PVS language features allow direct definition of recurrence equations. The proof strategies are effective for establishing the functional correctness of recurrence based algorithms. However, issues such as IEEE compliant rounding required difficult and time consuming proofs. There were two primary sources of difficulty. The first is that these details often involve complex case analysis where each case has a different structure. The difficulty is compounded by the fact that most of these cases do not succumb to the automatic proof strategies. The second major source of difficulty was that most of the verification effort was in a domain where there is limited support from the prover. During this verification effort, most proof steps consisted of algebraic manipulation of expressions involving exponentiation. In contrast to the basic arithmetic functions, there are a limited number of pre-proven properties about exponentiation. Although the prelude includes some facts about exponentiation, these are not organized to enable effective automatic rewriting.
The PVS type system can be used to effectively restrict types in definitions. However, this may lead to extra effort proving irrelevant type correctness conditions. Finding the right balance can be difficult. During the verification relating the algorithm to the IEEE standard, much of the effort consisted of repeatedly discharging the same collection of type correctness conditions.
Concluding remarks
Formal verification is an enabling technology. This general verification will allow designers to focus their efforts on more advanced optimizations of hardware implementations, secure in the knowledge that the routine aspects of the design have been addressed. However, a great deal of work is still needed to make formal verification a useful technology. In particular, a designer should not be required to generate all of the supporting theories for well known algorithms and hardware structures. A large set of libraries should be available from which to select the pieces required to complete and verify a design.
The ultimate goal of the work is to assist in the development of verified hardware. With that in mind, the general algorithm was presented in a standard form that can be easily transformed to an equivalent tail-recursive description [11]. This provides a top-level specification for deriving a hardware realization using a transformational system such as DRS [4]. This process has been tested with the general subtractive division algorithms presented in this paper.
The work presented here is a major step toward establishing an environment conducive to development of formally verified floating point hardware. The primary benefit of the work is not the fact that subtractive division algorithms are shown to be compliant with IEEE standards. Instead, this work demonstrates that with proper foresight in developing and verifying generalized solutions, it becomes much easier for future designers to verify particular instantiations of those general solutions. As a complete library of verified floating point algorithms emerges, future floating point implementations should have a much higher confidence of correctness.
