The Formal Verification of a Pipelined Double-Precision IEEE Floating-Point
Multiplier
Mark D. Aagaard and Carl-Johan H. Seger
Dept. of Comp. Sci, Univ. of British Columbia
Vancouver B.C. V6T 1Z4 Canada
Abstract
Floating-point circuits are notoriously difficult to design
and verify. For verification, simulation barely offers
adequate coverage, conventional model-checking techniques
are infeasible, and theorem-proving based verification is not
sufficiently mature. In this paper we present the formal
verification of a radix-eight, pipelined, IEEE double-precision
floating-point multiplier. The verification was carried out
using a mixture of model-checking and theorem-proving
techniques in the Voss hardware verification system. By
combining model-checking and theorem-proving we were
able to build on the strengths of both areas and achieve
significant results with a reasonable amount of effort.
1 Introduction
This paper describes the verification of a pipelined,
IEEE compliant [7], double-precision floating-point
multiplier. The features of the multiplier include:
• based on high-performance commercial designs
  from Digital Equipment Corp. [3]
• radix-eight multiplier array with carry-save
  adders
• round-to-nearest rounding mode
• optional non-IEEE-compliant mode: treats
  denormalised numbers as zero
• four stage pipeline
• three 56-bit carry-select adders with
  carry-propagate circuitry
• over 33 000 two-input gate equivalents
The top-level specification of the circuit is written
in terms of arithmetic operations on integers. The
design was done in structural VHDL, then synthesised
to a unit-delay gate-level model using a cell-library.
The verification was carried out in the Voss hardware
verification system [9].
Voss includes an efficient implementation of ordered
binary decision diagrams (BDDs); an event
driven symbolic simulator with comprehensive delay
and race analysis capabilities; a set of theorem-proving
style inference rules; and a general purpose, functional
programming language. The simulator implements
symbolic trajectory evaluation, which offers a
good compromise between expressibility of specifications
and rapid verification. The inference rules allow
the composition of verification results and support
abstract data-types, such as integers. This enables Voss
to overcome the limitations inherent in BDD-based
model-checking.
Large parts of the IEEE floating-point standard have
been formalised by Barrett [2] in the Z specification
language and by Carreño and Miner [5] in the HOL
and PVS theorem provers. Some work has recently
been done in formally verifying complex integer circuits.
Bryant and Chen have used Binary Moment
Diagrams (BMDs) [4] to verify a sixty-two bit
combinational multiplier. Claesen et al. and O’Leary et al.
have used theorem provers to verify an SRT integer
divider [10] and an SRT integer square-root circuit [8],
respectively.
2 Background and Theory
In Voss, specifications consist of an antecedent, a
consequent, and an optional relation (used for relational,
but not functional, verification). Typically the
antecedent is used to initialise inputs to the circuit.
In functional verification, the consequent specifies the
values of the outputs as functions of the inputs. In
relational verification the relation gives the correctness
condition in terms of the variables appearing in
the antecedent and consequent. (Section 4.1 has an
example of relational verification.) The antecedent
and consequent are temporal formulas. The key to the
efficiency of trajectory evaluation is the restricted
language of the temporal formulas: there is no negation,
the only temporal operator is “next”, and there is only
a restricted form of disjunction.
Hazelhurst and Seger [6] have defined a set of
inference rules for composing verification results in
Voss. These rules include: pre-condition strengthening,
post-condition weakening, structural composition,
and instantiation of symbolic and temporal variables.
One of the most powerful ramifications of these
rules is that abstract data types (ADTs) for integers
and other objects can be defined and related to primitive
trajectory formulas. This allows specifications to
be written in terms of the ADTs (e.g. integers).
Verification can be carried out either by mapping the
specification down to bit-vectors or by manipulating
the ADTs. As an example of the second method, a
linear-programming package has been added to Voss.
Section 4.2 illustrates how we use arithmetic decision
procedures to simplify integer expressions (see footnote 1).
Trajectory evaluation is automatic, but can exceed
the capacity of compute resources. In comparison, using
inference rules requires human interaction, but is
computationally less intensive. Our normal verification
technique is to rely primarily on trajectory evaluation
and use inference rules only when necessary.
We use a mixture of top-down and bottom-up
verification: top-down to isolate bugs and bottom-up to
compose successful verifications. When doing
divide-and-conquer verification with trajectory evaluation,
only the specification needs to be partitioned. Trajectory
evaluations can be carried out on the complete
circuit, but only those parts related to the specification
are exercised, which makes it very efficient.
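The following is a minimal sketch of why an antecedent/consequent check only exercises the relevant part of a circuit, assuming a toy two-output combinational circuit. It is not the Voss interface: the names `circuit` and `check` are hypothetical, there is no "next" operator, and Boolean assignments are enumerated explicitly where Voss would use BDD-based symbolic values. Inputs that the antecedent does not mention are left at the unknown value X.

```python
from itertools import product

X = None  # unknown value in the ternary domain {0, 1, X}

def t_and(a, b):
    if a == 0 or b == 0:
        return 0          # a controlling 0 decides the output even if the other input is X
    if a == 1 and b == 1:
        return 1
    return X

def t_not(a):
    return X if a is X else 1 - a

def t_or(a, b):
    return t_not(t_and(t_not(a), t_not(b)))

def circuit(inputs):
    """Toy circuit with two independent outputs (hypothetical example)."""
    return {
        "and_out": t_and(inputs["a"], inputs["b"]),
        "or_out": t_or(inputs["c"], inputs["d"]),
    }

def check(antecedent_nodes, consequent):
    """Check the consequent for every Boolean assignment to the antecedent nodes.

    Inputs not named in the antecedent stay at X, so only the logic that
    depends on the named inputs produces defined values.
    """
    all_inputs = ["a", "b", "c", "d"]
    for values in product([0, 1], repeat=len(antecedent_nodes)):
        env = dict(zip(antecedent_nodes, values))
        inputs = {name: env.get(name, X) for name in all_inputs}
        outputs = circuit(inputs)
        for node, spec in consequent.items():
            if outputs[node] != spec(env):
                return False
    return True

# Specification of the AND gate alone: c and d never need to be driven.
ok_and = check(["a", "b"], {"and_out": lambda e: e["a"] & e["b"]})
# A consequent about or_out cannot hold when its inputs are left at X.
ok_or = check(["a", "b"], {"or_out": lambda e: 0})
```

In this sketch the first check succeeds without ever assigning `c` or `d`, which mirrors the remark that only the specification, not the circuit, needs to be partitioned.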
3 Multiplication Implementation
The IEEE standard defines six different classes
of floating-point data: infinity, normalised,
denormalised, zero, quiet NaNs, and signalling NaNs.
Multiplication is only performed if both operands are
either normalised or denormalised. If multiplication
is performed, the result may overflow, underflow, be
normalised, or be denormalised.
IEEE floating-point numbers are represented as
bit-vectors with three fields: sign, exponent and
significand. Denormalised numbers (or denorms) represent
values that, if normalised, would require that the
exponent field be less than the minimum representable
value.
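As an illustration of the three fields and the six data classes, the sketch below decodes a Python float (an IEEE double on common platforms) and classifies it. The function names are hypothetical, and the quiet/signalling distinction by the top fraction bit is an assumption that matches common hardware; IEEE 754-1985 does not fix that encoding.

```python
import struct

def decode(x):
    """Split a double into (sign, biased exponent, fraction) bit fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)    # 52 stored significand bits
    return sign, exponent, fraction

def classify(x):
    """Return one of the six data classes named in the text."""
    _, exponent, fraction = decode(x)
    if exponent == 0x7FF:
        if fraction == 0:
            return "infinity"
        # Assumption: top fraction bit set means quiet (common convention).
        return "quiet NaN" if fraction >> 51 else "signalling NaN"
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalised"
    return "normalised"
```

For example, `classify(5e-324)` reports the smallest positive double as denormalised, since its exponent field is zero and its fraction is non-zero.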
Due to the cost in both area and performance
required to support denormalised numbers in hardware,
we rely on software support when a fully IEEE compliant
result is needed. When our multiplier (which we
call the “ADK” multiplier) is in IEEE compliant mode,
it generates an emulation exception when multiplication
is to be performed on a denorm input or when
the result will be a denorm. (Underflows are always
handled in hardware.) We improve performance in
non-compliant mode by treating denorms as zeros.

Footnote 1: We have not formally verified the ADTs or the linear
programming package. Nonetheless, these techniques offer a practical and
relatively sound method for manipulating integer and Boolean
expressions.
The operations performed in each stage of the
pipeline are summarised in Table 1.
Table 1: Pipeline stages and datapath operations

Stage  Datapath     Operation
1      Significand  Multiplier: Booth recode.
                    Multiplicand: multiply by three
       Exponent     Add exponents together
       Special      Detect input denorms, infinities, NaNs, zeroes
2      Significand  Multiplier array with carry-save adders
       Exponent     Subtract bias
3      Significand  Add carry & sum vectors; normalise
       Exponent     Decrement if significand shifted
4      Significand  Round-to-nearest; renormalise
       Exponent     Increment if significand shifted
       Special      Detect overflow, denorm out, underflow;
                    set exception signals
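The arithmetic behind Table 1 can be mirrored by a small integer reference model. The sketch below is only an illustration under stated assumptions: it handles normalised operands with a normalised result, it orders the exponent adjustments for clarity rather than following the pipeline's stage-by-stage decrement and increment, and it takes round-to-nearest to break ties towards even. The function names are hypothetical.

```python
import struct

BIAS = 1023  # exponent bias of the double format

def to_bits(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def multiply_normalised(a, b):
    """Multiply two normalised doubles whose product is also normalised."""
    ba, bb = to_bits(a), to_bits(b)
    sign = (ba >> 63) ^ (bb >> 63)
    ea, eb = (ba >> 52) & 0x7FF, (bb >> 52) & 0x7FF
    assert 0 < ea < 0x7FF and 0 < eb < 0x7FF, "normalised inputs only"
    # Restore the implicit leading one of each 53-bit significand.
    sa = (1 << 52) | (ba & ((1 << 52) - 1))
    sb = (1 << 52) | (bb & ((1 << 52) - 1))
    exponent = ea + eb - BIAS            # add exponents, subtract bias
    product = sa * sb                    # 105 or 106 bits
    shift = 52
    if product >> 105:                   # product in [2, 4): normalise
        shift, exponent = 53, exponent + 1
    kept = product >> shift              # 53-bit significand
    rest = product & ((1 << shift) - 1)  # discarded low-order bits
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if rest > half or (rest == half and kept & 1):
        kept += 1
        if kept >> 53:                   # rounding carried out: renormalise
            kept >>= 1
            exponent += 1
    assert 0 < exponent < 0x7FF, "overflow, underflow and denorms not modelled"
    fraction = kept & ((1 << 52) - 1)
    return from_bits((sign << 63) | (exponent << 52) | fraction)
```

On a platform whose floats are IEEE doubles, the model can be compared directly against the built-in multiplication, for example `multiply_normalised(0.1, 0.2) == 0.1 * 0.2`.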
Each row in the multiplier array calculates the product
of a digit from the recoded multiplier and the
multiplicand. Digits in the radix-eight recoded multiplier
are in the range $-4 \ldots +4$ and are in sign-magnitude
format. The magnitude is used to select the desired
multiple of the multiplicand. The sign is used to
negate the product if needed. Negating the product
is done by inverting it and including the extra
“plus-one” needed for the two’s complement in the initial
partial product. The least-significant cell in each row
computes the sticky and guard bits used in rounding.
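The negate-by-inversion step can be sketched as follows. The row width of 60 bits and the function name `row_product` are assumptions made for illustration only; the point is that inverting a multiple and adding one elsewhere yields its two's complement negation.

```python
WIDTH = 60                 # illustrative row width in bits (assumption)
MASK = (1 << WIDTH) - 1

def row_product(multiplicand, three_times, sgn, mag):
    """Select a multiple by magnitude and invert it if the digit is negative.

    Returns (row_bits, plus_one). The plus_one is not added here; it is
    deferred, as the text describes for the initial partial product.
    """
    multiples = [0, multiplicand, 2 * multiplicand, three_times, 4 * multiplicand]
    value = multiples[mag] & MASK
    if sgn:
        return (~value) & MASK, 1
    return value, 0

m2 = (1 << 52) | 12345                     # an arbitrary 53-bit significand
row, plus_one = row_product(m2, 3 * m2, sgn=1, mag=3)
negated = (row + plus_one) & MASK          # equals -3 * m2 modulo 2**WIDTH
```

Only the multiple by three needs a real addition (it is precomputed in stage one); the multiples by two and four are shifts.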
4 Verification
Our verification of the multiplier relies on a hierarchy
of specifications (Figure 1). IEEE Rel is a relational
formalization of the IEEE standard for multiplication
(including NaNs, infinities, etc.). ADK Rel is a
relational specification of our multiplier. It is identical to
IEEE Rel except for those cases where our multiplier
raises an emulation exception. ADK Fun is a
functional specification describing exactly what result our
multiplier should produce for any set of inputs. Below
ADK Fun are specifications for subparts of the circuit,
such as the Booth recoder and rounding circuitry.
Our formalization of the IEEE standard is relational,
not functional. The IEEE standard is non-functional
in several cases, in that it specifies properties that the
result must satisfy but not the exact value that must
be produced. For example, if one of the operands is a
NaN, the standard requires that the result is a NaN, but it
does not specify which NaN.

[Figure 1: Hierarchy of specifications. Labels in the diagram:
IEEE Rel; ADK Rel; ADK Fun; Sig Mult; ADK Fun2; Theorem 1;
Booth/Preadd; *3; Mux; Prod; Sum; CPA; Norm; Round; Exp/Special;
Implementation (Stage1, Stage2, Stage3, Stage4, Exp).
Legend (verification techniques): theorem proving, trajectory evaluation.]
The IEEE standard is informal and written in natural
language, so formalizations of the standard can
only be verified against it informally. Our formalization [1]
is only a few pages and (we believe) quite readable.
Thus, we claim that others can inspect our
formalization and convince themselves that it conforms
to the standard.
We began the verification by using test-vector
simulation as a quick and effective way of catching many
bugs and then relied on trajectory evaluation to verify
individual components (the lowest layer in Figure 1).
Once the components had been verified, we
needed to compose the results to verify the complete
circuit. We used a single trajectory evaluation to verify
the significand datapaths in stages three and four
and Exp/Special against ADK Fun2. We used inference
rules to combine the verification results from
Booth/Preadd, Prod, and Sum to prove that the circuit
multiplies correctly (Sig Mult). The verification results
for Sig Mult and ADK Fun2 were combined together
using inference rules to complete the verification of
the multiplier against ADK Fun. We used BDDs and
inference rules to verify that ADK Fun implies ADK
Rel and ADK Rel implies IEEE Rel.
The total design and verification effort took
approximately seventy work days (Table 2). At the time of
writing, some of the theorem-proving parts of Voss
are still evolving. A few aspects of the verification of
ADK Fun and ADK Rel are not complete and are not
included in Table 2.
Each of the trajectory evaluations for the lowest
level specifications took under a minute on a
Sparc 10/51 with 64M of memory. ADK Fun2 required
approximately ten minutes. The automated decision
procedures used in the theorem proving verification
ran in under three minutes each. Variable re-ordering
was automatically done once for each of the trajectory
evaluations in Figure 1. Most runs took several hours
on a DEC 3000 with 512M of memory.

Table 2: Design and verification effort

                  Des   Spec    Ver   Rdes    Tot
ADK Fun2            -    3.8    2.2    6.0   12.0
Sig Mult            -    2.0    3.0           5.0
Booth             1.0    1.5    0.5           3.0
Prod/Sum          7.0    5.0    5.5    5.0   22.5
Carry-Prop Add    1.5    0.5    0.3           2.3
Normalization     0.1           0.2           0.3
Rounding          1.0           0.3           1.3
Exp/Special       6.6           4.6    1.0   12.2
Interconnect      1.7           3.0           4.7
Total            18.9   12.8   19.6   12.0   63.3

Des: Initial Design      Ver: Verification/bug fixes
Spec: Specification      Rdes: Redesign
All times are measured in workdays
In Sections 4.1 and 4.2 we briefly describe the
verification of the Booth recoder and significand
multiplication datapath. These examples are two of the most
complicated verifications. They illustrate the use of
relational verification and arithmetic decision
procedures respectively.
4.1 Booth Recoder
The Booth recoder in stage one was verified against
the relational specification in Equation 1. The input is
the multiplier ($m$) and the outputs are eighteen
sign-magnitude digits in the range $-4 \ldots +4$
($\mathit{sgn}_i$ and $\mathit{mag}_i$ for $0 \le i \le 17$).
A functional specification would require
separate equations defining $\mathit{sgn}_i$ and $\mathit{mag}_i$ in terms of
$m$, which would clearly be much more difficult to write
than the relational specification.
\[
m = \sum_{i=0}^{17} 8^i \cdot (1 - 2 \cdot \mathit{sgn}_i) \cdot \mathit{mag}_i \tag{1}
\]
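To make Equation 1 concrete, here is a sketch of one possible functional recoder together with a check of the relational specification. It uses the textbook radix-eight Booth scheme on overlapping groups of four bits; the paper does not say that the ADK recoder computes its digits this way, so `booth_radix8` is an assumption for illustration.

```python
def booth_radix8(m, digits=18):
    """Recode 0 <= m < 2**(3*digits - 1) into sign-magnitude radix-eight
    digits (sgn_i, mag_i), least significant digit first."""
    assert 0 <= m < 1 << (3 * digits - 1)
    result = []
    prev = 0  # bit immediately to the right of the current 3-bit group
    for i in range(digits):
        b0 = (m >> (3 * i)) & 1
        b1 = (m >> (3 * i + 1)) & 1
        b2 = (m >> (3 * i + 2)) & 1
        digit = -4 * b2 + 2 * b1 + b0 + prev   # always in -4 .. +4
        result.append((1 if digit < 0 else 0, abs(digit)))
        prev = b2
    return result

def equation_1_rhs(recoded):
    """Right-hand side of Equation 1 for a list of (sgn_i, mag_i) pairs."""
    return sum(8**i * (1 - 2 * sgn) * mag for i, (sgn, mag) in enumerate(recoded))

m = (1 << 52) | 0xABCDE12345          # an arbitrary 53-bit multiplier
assert equation_1_rhs(booth_radix8(m)) == m
```

Note that Equation 1 also accepts a digit with $\mathit{sgn}_i = 1$ and $\mathit{mag}_i = 0$, which this recoder never produces. The relational form leaves such choices open, whereas a functional specification would have to commit to one of them.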
4.2 Significand Multiplication
Theorem 1 says that the composition of the
specifications for the components in the significand
datapaths in stages one and two implies that the sum of the
carry (C) and sum (S) vectors output from the
multiplier array is the upper fifty-five bits of the product of
the multiplier (M1) and multiplicand (M2). The
theorem was proved automatically by Voss’ arithmetic
decision procedures in three minutes.
Theorem 1: Composition of significand specifications

\[
\begin{aligned}
\vdash\ & \Bigl( \sum_{i=0}^{17} 8^i \cdot (1 - 2 \cdot \mathit{sgn}_i) \cdot \mathit{mag}_i = \mathit{M1} \Bigr) \\
\wedge\ & \Bigl( P = \sum_{i=0}^{17} 8^i \cdot (\mathit{sgn}_i) \Bigr) \\
\wedge\ & \bigl( p_i = \mathit{M2} \cdot (1 - 2 \cdot \mathit{sgn}_i) \cdot \mathit{mag}_i - \mathit{sgn}_i \bigr) \\
\wedge\ & \bigl( C + S = p_{17} + (p_{16} + (\cdots (p_0 + P)/8 \cdots)/8) \bigr) \\
\Longrightarrow\ & \bigl( C + S = (\mathit{M1} \cdot \mathit{M2}) / 2^{51} \bigr)
\end{aligned}
\]

The first line describes the Booth recoding of the
multiplier (M1). The second line is for the preaddition
of the plus-ones into the initial partial product (P) for
the generation of two’s complement products in the
multiplier array (see Section 3). The third and fourth
lines describe the calculation of the product terms ($p_i$)
and the summation of the partial products using
carry-save addition (C and S).
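The statement of Theorem 1 can be spot-checked numerically for concrete operands. The sketch below is a sanity check and not the Voss proof. It assumes that the fourth hypothesis is read as Horner-style accumulation in which every partial sum is divided by eight, using floor division, before the next product term is added. It also assumes textbook radix-eight Booth digits for the first hypothesis. All function names are hypothetical.

```python
import random

def booth_digits(m, digits=18):
    """Radix-eight Booth digits as (sgn, mag) pairs, least significant first."""
    out, prev = [], 0
    for i in range(digits):
        group = (m >> (3 * i)) & 7
        d = (group & 3) - (group & 4) + prev   # 2*b1 + b0 - 4*b2 + previous bit
        out.append((int(d < 0), abs(d)))
        prev = group >> 2
    return out

def carry_save_total(m1, m2):
    """Evaluate C + S from the hypotheses of Theorem 1 for concrete operands."""
    recoded = booth_digits(m1)
    # Second hypothesis: the plus-ones for negative digits, preadded as P.
    P = sum(8**i * sgn for i, (sgn, _) in enumerate(recoded))
    # Third hypothesis: an inverted product is one less than the negated multiple.
    p = [m2 * (1 - 2 * sgn) * mag - sgn for sgn, mag in recoded]
    # Fourth hypothesis (assumed reading): divide by eight at every level.
    acc = p[0] + P
    for term in p[1:]:
        acc = term + acc // 8
    return acc

rng = random.Random(0)
for _ in range(1000):
    m1 = (1 << 52) | rng.getrandbits(52)   # 53-bit significands
    m2 = (1 << 52) | rng.getrandbits(52)
    assert carry_save_total(m1, m2) == (m1 * m2) >> 51
```

The check works because the preadded plus-ones cancel the $-\mathit{sgn}_i$ terms exactly, so $\sum_i 8^i p_i + P = \mathit{M1} \cdot \mathit{M2}$, and seventeen divisions by eight amount to a division by $8^{17} = 2^{51}$.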
5 Conclusion
As shown in Table 2, we found many bugs, both
in our design and specifications. Many of these could
have been found through extensive use of test-vectors,
but it is doubtful that they could have been found as
quickly as with trajectory evaluation. Most of the bugs
were related to the multiplier array or the special cases.
Because of the regularity of the multiplication
implementation, we were able to find many of the bugs
using test vectors. However, the control circuitry for
the special cases is very irregular, making test vectors
impractical. Our most subtle bug illustrates the
need for relational and high-level specifications. We
were very confident in our specification ADK Fun, but
verifying it against ADK Rel revealed that a particular
NaN value would sometimes produce a result of
infinity, rather than a NaN. This error was in both our
implementation and functional specification and would
very likely have remained undetected in test-vector
simulation.
Our long-term goal is to develop practical and
rigorous formal-verification techniques. From experience
with a variety of model-checking and theorem-proving
techniques, we have concluded that trajectory
evaluation, together with built-in support for debugging
hardware, is a very effective verification process.
Composing verification results using both trajectory
evaluation and inference rules provides the freedom to
choose the most appropriate technique for each situation.
Using a general-purpose programming language
as an interface makes it easy to automate repetitive
tasks and customize interfaces. More experience
with combined model-checking and theorem-proving
based verification is clearly needed, but even at this
early stage, we are very optimistic that the combination
offers the promise of practical formal verification,
scalability, and high-level specifications.
References
[1] M. D. Aagaard and C.-J. H. Seger, “The design and
verification of a radix-eight, pipelined, IEEE double-precision
floating-point multiplier,” tech. rep., Dept. of
Comp. Sci, Univ. of British Columbia, 1995.
[2] G. Barrett, “Formal methods applied to a floating-point
number system,” IEEE Trans. Soft. Eng., vol. 15, no. 5,
pp. 611–621, 1989.
[3] B. J. Benschneider et al., “A pipelined 50-MHz CMOS
64-bit floating-point arithmetic processor,” IEEE Jour. of
Solid-State Circuits, vol. 24, pp. 1317–1323, Oct. 1989.
[4] R. E. Bryant and Y.-A. Chen, “Verification of arithmetic
functions with binary moment diagrams,” Tech.
Rep. CMU-CS-94-160, Dept. of Comp. Sci, Carnegie-Mellon
Univ., Aug. 1994.
[5] V. A. Carreño and P. S. Miner, “Specification of the
IEEE-854 floating-point standard in HOL and PVS,” in
Higher Order Logic Theorem Proving and Its Applications,
Sept. 1995.
[6] S. Hazelhurst and C.-J. H. Seger, “A simple theorem
prover based on symbolic trajectory evaluation and
BDDs,” IEEE Trans. on CAD, Apr. 1995.
[7] IEEE, IEEE Standard for binary floating-point arithmetic.
ANSI/IEEE Std 754-1985, 1985.
[8] J. W. O’Leary, M. E. Leeser, J. Y. Hickey, and M. D.
Aagaard, “Non-restoring integer square root: A case
study in design by principled optimization,” in Theorem
Provers in Circuit Design, Springer Verlag; New York,
Sept. 1994.
[9] C.-J. Seger, “Voss - A formal hardware verification
system user’s guide,” Tech. Rep. 93-45, Dept. of Comp.
Sci, Univ. of British Columbia, 1993.
[10] D. Verkest, L. Claesen, and H. De Man, “A proof of
the nonrestoring division algorithm and its implementation
on an ALU,” Formal Methods in System Design,
vol. 4, pp. 5–31, Jan. 1994.
Acknowledgments
We would like to thank Mark Greenstreet, Scott Hazelhurst,
Catherine Leung, Andy Martin, David Weih, and the
Semiconductor Engineering Group at Digital Equipment
Corp. This research was supported by operating grant
OGPO 109688 from the Natural Sciences and Engineering
Research Council of Canada, a fellowship from the B.C.
Advanced Systems Institute, Research Contract DJ-295 from
the Semiconductor Research Corporation, and equipment
grants from Sun Microsystems Inc, Canada, and Digital
Equipment Corp, Canada.