Doctor of Philosophy by Lv, Jinpeng
SCALABLE FORMAL VERIFICATION OF FINITE FIELD




A dissertation submitted to the faculty of
The University of Utah
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
Department of Electrical and Computer Engineering
The University of Utah
December 2012
Copyright © Jinpeng Lv 2012
All Rights Reserved
The University of Utah Graduate School
STATEMENT OF DISSERTATION APPROVAL 
The dissertation of 











and by , Chair of
the Department of 








Electrical and Computer Engineering
ABSTRACT
With the spread of the internet and mobile devices, transferring information safely
and securely has become more important than ever. Finite fields have widespread
applications in these domains, including cryptography and error-correcting codes. In
most finite field applications, the field size (and therefore the bit-width of the
operands) can be very large. The high complexity of arithmetic operations over such
large fields requires circuits to be (semi-)custom designed. This raises the potential
for errors/bugs in the implementation, which can be maliciously exploited and can
compromise the security of such systems. Formal verification of finite field
arithmetic circuits has therefore become imperative.
This dissertation targets the problem of formal verification of hardware
implementations of combinational arithmetic circuits over finite fields of the type
$\mathbb{F}_{2^k}$. Two specific problems are addressed: i) verifying the correctness of
a custom-designed arithmetic circuit implementation against a given word-level
polynomial specification over $\mathbb{F}_{2^k}$; and ii) gate-level equivalence checking
of two different arithmetic circuit implementations.
This dissertation proposes polynomial abstractions over finite fields to model and
represent the circuit constraints. Subsequently, decision procedures based on modern
computer algebra techniques, notably Gröbner bases-related theory and technology,
are engineered to solve the verification problem efficiently. The arithmetic circuit is
modeled as a polynomial system in the ring $\mathbb{F}_{2^k}[x_1, x_2, \dots, x_d]$, and
computer algebra-based results (Hilbert’s Nullstellensatz) over finite fields are
exploited for verification.
Using our approach, experiments are performed on a variety of custom-designed
finite field arithmetic benchmark circuits. The results are also compared against
contemporary methods based on SAT and SMT solvers, BDDs, and AIGs. Our tools can
verify the correctness of, and detect bugs in, up to 163-bit circuits in
$\mathbb{F}_{2^{163}}$, whereas contemporary approaches are infeasible beyond 48-bit
circuits.
To Ruina, Andrew and Emma.
CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
CHAPTERS
1. INTRODUCTION
1.1 Hardware Verification
1.1.1 Property Checking
1.1.2 Equivalence Checking
1.2 Computer Algebra-Based Formal Verification
1.3 Objective and Contributions of this Dissertation
1.3.1 Contributions of this Dissertation
1.4 Thesis Organization
2. PREVIOUS WORK AND LIMITATIONS
2.1 BDDs and Their Variants
2.2 SAT Solvers and SMT Solvers
2.2.1 Circuit-Based Solvers
2.3 Computer Algebra-Based Approaches
2.4 Verification of Finite Field Applications
3. PRELIMINARIES
3.1 Rings, Fields and Polynomials
3.2 Finite Fields
3.2.1 Construction of Finite Fields $\mathbb{F}_{2^k}$
3.2.2 Hardware Implementations of Arithmetic Operations Over $\mathbb{F}_{2^k}$
4. COMPUTER ALGEBRA FUNDAMENTALS
4.1 Monomials and Their Orderings
4.2 Varieties and Ideals
4.3 Gröbner Bases
4.4 Hilbert’s Nullstellensatz
4.5 Concluding Remarks
5. IMPLEMENTATION VERIFICATION USING IDEAL MEMBERSHIP TESTING
5.1 Problem Statement
5.2 Verification Setup and Polynomial Modeling
5.3 Verification Formulation as Ideal Membership Testing
5.3.1 Generating $I(V_{\mathbb{F}_{2^k}}(J))$
5.4 Obviating Buchberger’s Algorithm
5.5 Our Overall Approach
5.6 Experimental Results
5.6.1 Evaluation of SAT, SMT, BDD, AIG-Based Methods
5.6.2 Evaluation of Our Approach
5.7 Conclusions
6. GATE-LEVEL EQUIVALENCE CHECKING OF ARITHMETIC CIRCUITS OVER $\mathbb{F}_{2^k}$
6.1 Problem Statement and Modeling
6.1.1 Verification Problem Formulation as Weak Nullstellensatz
6.2 Verification Using a Minimum Number of S-polynomial Computations
6.3 Improving Polynomial Division Using F4-style Reduction
6.4 Experimental Results
6.4.1 Equivalence Checking of Structurally Similar Circuits
6.4.2 Equivalence Checking of Structurally Dissimilar Circuits
6.5 Limitation of Our Approach
7. VERIFICATION OF COMPOSITE FIELD ARITHMETIC CIRCUITS
7.1 Circuit Designs over Composite Fields
7.2 Problem Formulation and Hierarchy Verification
7.3 Experimental Results
7.4 Conclusions
8. CONCLUSIONS AND FUTURE WORK
8.1 Computer Algebra-Based Approaches for Equivalence Checking of Arithmetic Circuit over $\mathbb{F}_{2^k}$
8.2 Future Work
8.2.1 Speeding up Verification Using a Graphics Processing Unit
8.2.2 Extraction of Circuit Abstraction
8.2.3 Simulation-Based Verification of Circuits
REFERENCES
LIST OF FIGURES
1.1 Typical circuit design and verification flow.
2.1 BMD for F = x ∗ y; x, y are 2-bit wide, F is 4-bit wide.
2.2 BMD for F = x ∗ y; x, y, F are all 2-bit wide.
3.1 4-bit adder over $\mathbb{F}_{2^4}$.
3.2 Mastrovito multiplier over $\mathbb{F}_{2^4}$.
3.3 Montgomery multiplier over $\mathbb{F}_{2^k}$.
3.4 Barrett multiplier over $\mathbb{F}_{2^k}$.
5.1 The verification setup.
5.2 A 2-bit multiplier over $\mathbb{F}_{2^2}$.
5.3 A 2-bit multiplier over $\mathbb{F}_{2^2}$. The gate ⊗ corresponds to an AND gate, i.e., bit-level multiplication modulo 2. The gate ⊕ corresponds to an XOR gate, i.e., addition modulo 2.
6.1 The equivalence checking setup: miter.
6.2 Miter for 2-bit circuit equivalence.
6.3 A solution (bug) in $(\overline{\mathbb{F}_{2^k}} - \mathbb{F}_{2^k})$ is a “don’t care”.
7.1 Mastrovito multiplier over $\mathbb{F}_{(2^2)^2}$.
7.2 Mastrovito multiplier over $\mathbb{F}_{2^4}$.
LIST OF TABLES
3.1 Additive and multiplicative inverses in $\mathbb{Z}_5$.
3.2 Bit-vector, exponential and polynomial representation of elements in $\mathbb{F}_{2^4} = \mathbb{F}_2[x] \pmod{x^4 + x^3 + 1}$.
5.1 Runtime for verification of Montgomery versus Mastrovito multipliers over $\mathbb{F}_{2^k}$ for BDDs, SAT, SMT-solver and AIG/ABC-based methods. TO = timeout of 10hrs. Time is given in seconds.
5.2 Verification of Mastrovito multipliers by computing Gröbner bases using SINGULAR. MO = out of 8G memory. Time is given in seconds.
5.3 Runtime for verifying bug-free and buggy Mastrovito multipliers using our approach. TO = timeout of 10hrs. Time is given in seconds.
5.4 Runtime for verifying bug-free and buggy Montgomery multipliers using our approach. TO = timeout of 10hrs. Time is given in seconds.
5.5 Runtime for verifying bug-free and buggy Barrett multipliers using our approach. TO = timeout of 10hrs. Time is given in seconds.
5.6 Verification of ECC point addition. Run-time is given in seconds. TO = timeout of 24hrs.
5.7 Verification of ECC point doubling. Run-time is given in seconds. TO = timeout of 24hrs.
6.1 Matrix representation for polynomials.
6.2 Matrix subtraction of polynomials.
6.3 Matrix reduction for polynomials: representation.
6.4 Matrix reduction for polynomials: subtraction.
6.5 Matrix created for polynomial reduction for Example 6.8.
6.6 Subtraction result of the matrix created for polynomial reduction.
6.7 Verification of Mastrovito multiplier vs. Barrett multiplier. TO = 10hrs. ⋆ = Out of variable limitation. Time is given in seconds.
6.8 Verification of Barrett multiplier vs. Montgomery multiplier. TO = 10hrs. ⋆ = Out of variable limitation. Time is given in seconds.
6.9 Verification of Mastrovito multiplier vs. Montgomery multiplier. TO = 10hrs. Time is given in seconds.
7.1 Verification setup over $\mathbb{F}_{(2^2)^2}$.
7.2 Statistics of designs over $\mathbb{F}_{2^m}$.
7.3 Verification of Mastrovito multiplier over $\mathbb{F}_{(2^m)^n}$ using proposed approach. All times are given in seconds.
ACKNOWLEDGEMENTS
I am grateful to many people, without whose support I could not have completed
this Ph.D. study and dissertation. First and foremost, I would like to thank my advisor,
Professor Priyank Kalla. He has been patient enough to teach me everything I need to
learn. I have learned many things from him, not only the knowledge itself, but also the
way to organize the knowledge and apply it to real-world problems. Moreover, he is
always available to discuss questions with me and provides perspectives based on his
experience. I especially enjoy brainstorming with him. In fact, the most important
result of my Ph.D. research was achieved by brainstorming. Next, I would like to thank
Professor Florian Enescu for his extensive help and contribution to this work. I would
also like to thank the other members of my committee - Ganesh Gopalakrishnan, Chris
Myers, Ken Stevens and Rongrong Chen for their help and support. Finally, I would
like to thank my friends: Arun Jay, Sammer Merchant and Brandt Hammer for all the
good times we spent together.
CHAPTER 1
INTRODUCTION
With the rapidly increasing complexity of hardware systems, verifying the
correctness of designs poses serious challenges. Design flaws can be extremely costly.
For example, the Intel Pentium floating-point division bug, discovered in 1994, cost
the company 475 million dollars. In many safety-critical applications, such as
cryptography systems, arithmetic bugs can be especially catastrophic. In [10], it is
shown that incorrect (buggy) hardware can lead to full leakage of the secret key,
which can compromise the security of such systems. Therefore, it is of utmost
importance to verify the correctness of hardware designs.
1.1 Hardware Verification
Today, hardware verification averages about 70 percent of the overall hardware de-
sign effort and is believed to be the largest source of risk and cost. Hardware verification
is becoming even more challenging as the design complexity increases.
The hardware design flow typically starts with a high-level specification or a prop-
erty of the design. This specification is then translated into a register-transfer-level
(RTL) description, which is further optimized and translated to its corresponding netlist
representation. Then, the logic-level netlist is translated to a physical layout, which is
subsequently fabricated into integrated circuits. Fig. 1.1 shows a typical design flow
for realizing a hardware system. The design flow can be automated by Computer-Aided
Design (CAD) tools available from both academia and industry. However, one critical
question emerges: how to prove equivalent functionality between the different levels of
representations. This is the objective of hardware verification. For example, after the
RTL description is transformed into a gate-level netlist, it is important to ensure that its
functionality remains the same. Similarly, after logic optimization is performed on the
gate-level unoptimized netlist, it has to be ensured that the optimization process does
not introduce a bug in the original design. Therefore, as shown in Fig. 1.1, verification
is needed between the “golden model”, RTL-level model and netlist-level model, and
between unoptimized and optimized netlists, etc.
Figure 1.1. Typical circuit design and verification flow.
There are two main methodologies applied to hardware verification: simulation and
formal verification. In a traditional design flow, simulation is the primary methodology
for design validation. The effectiveness of simulation is achieved by exhaustive assign-
ments of inputs to excite all possible behaviors of the system and then by analyzing the
output values. However, the increasing complexity of designs makes it impossible for
simulation to provide complete coverage.
In recent years, formal verification has emerged as an alternative technique to ensure
the correctness of hardware designs, overcoming some of the limitations of simulation.
Formal verification is the process of utilizing mathematical theory to reason about the
correctness of hardware designs. Formal verification in hardware usually takes one
of two forms: property checking and equivalence checking. Property checking is a
process of checking whether a design conforms to its given behavior or properties.
Equivalence checking is conducted to prove the equivalent functionality of two given
designs. Usually, equivalence checking is applied at various stages of the design cycle
to verify correctness of the applied transformations. Figure 1.1 shows the role of
equivalence checking in a typical hardware design flow.
Techniques utilized by property checking include model checking, theorem proving
and approaches that integrate both. Equivalence checking makes use of Binary Decision
Diagrams (BDDs), Satisfiability (SAT) solvers, and And-Inverter-Graph (AIG)-based
reductions, among others. As an emerging technique for equivalence checking,
computer algebra-based decision procedures are gaining popularity. These techniques
are believed to be better suited to verifying arithmetic hardware designs because they
exploit the underlying mathematical structure rather than ad-hoc techniques.
1.1.1 Property Checking
Property verification refers to proving the correspondence between designs and
given properties. Usually, property verification is achieved by two main formal meth-
ods: theorem proving and model checking.
Theorem proving [60] requires the existence of mathematical descriptions for both
the specification and implementation, allowing these descriptions to be manipulated
in a formal mathematical framework. Theorem provers apply primitive (mathematical)
proof rules to a specification in order to derive new properties of that specification.
Through this method, theorem proving can reduce a proof goal to simpler subgoals that
can be easily proved/disproved automatically by primitive proof steps. The benefit of
this approach is its generality and completeness. However, despite several advances,
generating the proof requires extensive guidance from the user. As a result, theorem
proving lacks the level of automation that is desirable for a CAD framework to be prac-
tically useful. Theorem proving has gained commercial use in verifying that division
and other operations are correctly implemented in processors at AMD and Intel.
Model checking [21] is an approach to formally verifying finite-state systems. Prop-
erties of the system are modeled as temporal logic formulas, and the model defined
by the system is traversed to check if the properties hold or not. Therefore, model
checking consists of specifying the desired properties of the system and checking if
there are violations of specified properties for all possible behaviors of the system.
Model checking is one of the most successful approaches for property verification
to date. Model checking tools [12] [63] [40] have achieved a significant level of
automation and maturity and are widely used in both academia and industry. A feature
of model checking that is extremely important in practice is the ability to generate
counterexamples. Such counterexamples provide a way to trace the incorrect behaviors
(bugs). However, these tools tend to be memory intensive and are mostly applicable to
medium-sized designs or to individual blocks, rather than to entire systems.
1.1.2 Equivalence Checking
Equivalence checking is used to formally prove that two representations of circuit
designs have exactly equivalent functionality. As shown in Fig. 1.1, once a high-level
representation is validated (by simulation or property checking), it is transformed into a
gate-level netlist so that logic synthesis tools can be used to optimize the design accord-
ing to the desired area/delay/power constraints. Then, the design proceeds through a
varied set of optimization and transformation operations. During various transformation
stages, different implementations of the design, or parts of the design, are examined de-
pending upon the constraints, such as area, performance, testability, etc. As the design
is modified by replacing one of its components by another equivalent implementation,
it needs to be verified whether or not the modified design is functionally equivalent to
the original one.
Equivalence checking has important applications in arithmetic circuit verification.
Hardware designs contain a large number of custom-designed circuits such as adders,
multipliers, dividers, and so on. Such circuits are usually not synthesized by CAD tools
because of area and performance constraints. Therefore, this raises the potential for
errors/bugs in the implementation. Consequently, it remains a challenge to conduct
equivalence checking for these large-scale arithmetic circuits.
Equivalence checking has been intensively investigated, and its techniques are well
established. Among the various techniques employed, BDD-based and SAT-based
approaches are the two dominant ones, widely used in both academia and industry.
BDD-based approaches construct canonical representations of the given circuits and
compare them to determine whether the circuits are equivalent. SAT-based approaches
try to prove the unsatisfiability of a “miter” built from the two designs, i.e., a
circuit whose output is 1 exactly when the two designs disagree on some input.
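As a minimal illustration of the miter idea, the following sketch compares two 2-bit multipliers over $\mathbb{F}_{2^2} = \mathbb{F}_2[x]/(x^2 + x + 1)$. It is only a toy under stated assumptions: the function names (`mult_spec`, `mult_alt`, `mult_buggy`, `find_counterexample`) are hypothetical, and exhaustive enumeration of the 16 input vectors stands in for a SAT solver, which is feasible only because the circuits are tiny.

```python
from itertools import product

def mult_spec(a0, a1, b0, b1):
    """(a0 + a1*x)(b0 + b1*x) mod x^2 + x + 1 over F_2; returns (z0, z1)."""
    z0 = (a0 & b0) ^ (a1 & b1)
    z1 = (a0 & b1) ^ (a1 & b0) ^ (a1 & b1)
    return z0, z1

def mult_alt(a0, a1, b0, b1):
    """Structurally different (Karatsuba-style) circuit for the same function."""
    p = a0 & b0
    q = a1 & b1
    r = (a0 ^ a1) & (b0 ^ b1)
    return p ^ q, r ^ p

def mult_buggy(a0, a1, b0, b1):
    """Hypothetical bug: an OR gate where the XOR gate for z0 should be."""
    z0 = (a0 & b0) | (a1 & b1)
    z1 = (a0 & b1) ^ (a1 & b0) ^ (a1 & b1)
    return z0, z1

def miter(impl1, impl2, inputs):
    """XOR corresponding outputs, then OR the differences into one bit."""
    out = 0
    for u, v in zip(impl1(*inputs), impl2(*inputs)):
        out |= u ^ v
    return out

def find_counterexample(impl1, impl2, n_inputs=4):
    """Return an input on which the miter outputs 1, or None if there is none."""
    for bits in product((0, 1), repeat=n_inputs):
        if miter(impl1, impl2, bits):
            return bits
    return None
```

Here `find_counterexample(mult_spec, mult_alt)` finds no distinguishing input, so the miter is unsatisfiable and the two circuits are equivalent, while the buggy variant is exposed by the single input on which both partial products are 1.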
There are also many promising generalizations of SAT and BDDs: Binary Moment
Diagrams (BMDs), which have shown their superiority for verifying integer multipliers
[16], and Satisfiability Modulo Theories (SMT) solvers, which extend SAT solvers with
decision procedures for richer theories. These approaches have had some success in
equivalence checking. However, they are beginning to show signs of inadequacy in two
cases. First, large-scale hardware designs still hinder equivalence checking as
design complexity grows rapidly. For example, the verification of a 16-bit modular
multiplier is infeasible for current SAT/BDD-based approaches. Second, structural
dissimilarity remains an obstacle. For structurally similar circuits, equivalence can
be checked efficiently using AIG-based reductions [11] followed by circuit-SAT solvers
[53]. However, when the circuits are functionally equivalent but structurally very
dissimilar, none of the contemporary techniques, including BDD, SAT and AIG-based
approaches, is able to prove equivalence.
Ideally, approaches for equivalence checking should maintain a high-level of ab-
straction while still retaining sufficient information so as to not lose lower-level func-
tional details [37]. For instance, implementing arithmetic functions at bit-level can
provide highly optimized implementations while word-level abstraction usually has
much less structural information for solvers to analyze.
Arithmetic Bit Level (ABL) [85] abstraction techniques come close to achieving
these requirements by extracting an arithmetic bit level representation from a given
circuit. Then, the method can use the ABL information to prune the search space of
SAT solvers. The drawback of this approach is that it can only identify ABL information
locally when analyzing the given circuit, which results in an exponential blowup when
looking at sophisticated circuits consisting of several arithmetic blocks.
In this dissertation, we focus on equivalence checking problems for finite field
arithmetic circuits. Such circuits are found in many applications, including
cryptography, coding theory and signal processing. We utilize computer algebra and
algebraic geometry, notably Gröbner bases-related theory and technology, as the
underlying verification engines. Our approach takes into account both high-level
(word-level) specifications and low-level (bit-level) implementation details.
1.2 Computer Algebra-Based Formal Verification
The first computer algebra-based verification technique dates back to 1996 when
Gröbner bases were utilized for SAT solving and formal verification [23]. Indeed,
there have been many attempts to solve verification problems using Gröbner basis
formulations [4] [24] [87]. The standard flow of these approaches is:
1. The verification problem is first formulated as a polynomial system.
2. The polynomial system is fed into a Gröbner basis engine to check whether the
desired property is satisfied.
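The two steps above can be made concrete on a half adder. The sketch below is an illustration under stated assumptions, not the cited methods: the gate polynomials are the standard encodings over $\mathbb{F}_2$ (an AND gate z = a ∧ b becomes z + ab, an XOR gate becomes z + a + b), the function names are hypothetical, and step 2 is replaced by brute-force enumeration of the common zeros of the system. Over $\mathbb{F}_2$ with the vanishing polynomials $x^2 + x$ included, a polynomial lies in the ideal exactly when it vanishes on all common zeros, so for this tiny case the enumeration answers the same question a Gröbner basis engine would.

```python
from itertools import product

# Step 1: each gate becomes a polynomial over F_2 that evaluates to 0
# exactly when the gate's output is consistent with its inputs.
def and_gate(z, a, b):
    return (z + a * b) % 2

def xor_gate(z, a, b):
    return (z + a + b) % 2

def half_adder_system(a, b, s, c):
    """Polynomial system for s = a XOR b, c = a AND b."""
    return [xor_gate(s, a, b), and_gate(c, a, b)]

def property_poly(a, b, s, c):
    """Property to check: sum and carry are never both 1, i.e. s*c = 0."""
    return (s * c) % 2

# Step 2 (stand-in for the Groebner basis engine): enumerate the variety,
# i.e. all points of F_2^4 where every polynomial of the system vanishes.
def variety():
    return [p for p in product((0, 1), repeat=4)
            if all(f == 0 for f in half_adder_system(*p))]

def property_holds():
    return all(property_poly(*p) == 0 for p in variety())
```

The variety has one point per input pair (a, b), and the property polynomial vanishes on all of them. The enumeration grows exponentially with the number of circuit variables, which is why a symbolic decision procedure is needed for realistic circuits.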
The critical step of this approach is the Gröbner basis computation. Unfortunately,
this computation is known to have worst-case double-exponential complexity in the
input data. In practice, Gröbner basis algorithms have not been capable of
satisfactorily solving problems derived from real-world applications. Moreover, these
methods model constraints at the Boolean level, over $\mathbb{Z}_2$; word-level
abstractions, which can be powerfully modeled in algebra, are not utilized.
Recent advances [88] [56] [73] [58] [57] suggest a new direction of utilizing
computer algebra theory to conduct hardware verification. These works show that it is
feasible to overcome the complexity of Gröbner basis algorithms by efficiently
engineering the integration of Gröbner basis theory and circuit analysis techniques.
1.3 Objective and Contributions of this Dissertation
This dissertation focuses on verification of hardware implementations of arithmetic
circuits over finite fields of the type $\mathbb{F}_{2^k}$. Specifically, the following
verification problems are addressed:
1. Formal verification of a custom-designed finite field arithmetic circuit
implementation against its given word-level polynomial specification.
2. Gate-level equivalence checking of two finite field arithmetic circuit
implementations.
Verification of only combinational logic circuits over finite fields is considered in
this work. Sequential circuit verification is a very different problem for arithmetic
circuits – and it is beyond the scope of this dissertation.
The motivation for this work stems from applications in cryptography circuits, though
our techniques can be applied to verify arbitrary finite field arithmetic circuits. In
cryptosystems, the datapath size (operand size) k in the circuits can be very large. For
example, the U.S. National Institute for Standards and Technology (NIST) recommends
the use of finite fields corresponding to datapath sizes of k = 163-bit or more. The large
size and high complexity of such circuits makes design verification quite challenging.
Indeed, contemporary combinational verification techniques are unable to verify such
large arithmetic circuits.
1.3.1 Contributions of this Dissertation
We propose the application of computer algebra techniques, notably Gröbner
bases-related theory and technology [17] [3], as the underlying verification framework
for our applications. The advantage of using computer algebra techniques is that they
allow us to integrate finite field arithmetic, circuit models and algebraic reasoning
in a common verification framework. The circuits are modeled as a system of
multivariate polynomials over the field $\mathbb{F}_{2^k}$. The formal verification
problem is then formulated as ideal membership testing using Hilbert’s
Nullstellensatz [25]. A Gröbner basis engine is subsequently employed as a decision
procedure to solve this verification problem.
Gröbner basis theory is very powerful, as it enables one to solve many polynomial
decision questions. Unfortunately, the computational algorithms are known to have
worst-case double-exponential complexity in the input data. Therefore, in order to
make verification practical and scalable, we engineer an efficient application of
Gröbner bases by integrating them with circuit analysis techniques. Specifically, we
analyze the topology of the given circuit and derive efficient variable and term
orders to systematically represent and manipulate the polynomials. Subsequently,
using the theory of Gröbner bases over finite fields, we prove that our term orderings
impose specific constraints on the polynomials that can obviate the need to compute a
Gröbner basis. Under this term ordering, either the polynomials themselves constitute
a Gröbner basis, or the term ordering allows us to identify a minimum number of
computations in the Gröbner basis algorithm that are sufficient for verification. This
makes verification significantly more scalable: we are able to verify circuits for
which contemporary verification methods are infeasible. To further improve our
approach, we implement an efficient polynomial reduction (division) algorithm that
operates on a matrix-based representation of the polynomial system.
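The effect of a topology-derived term order can be illustrated with a small sketch. It rests on simplifying assumptions that are not the dissertation's formulation: the polynomials are bit-level, over $\mathbb{F}_2$ with $x^2 = x$, rather than word-level over $\mathbb{F}_{2^k}$, and all names (`GOOD`, `BUGGY`, `SPEC`, `remainder`) are hypothetical. When every gate output is ordered above the gate's inputs, each gate polynomial has its output variable as leading term, so dividing by it amounts to substituting the output by the gate function. Walking the gates in reverse topological order then reduces "output + specification" to a remainder in the primary inputs only, and no Buchberger-style computation is needed.

```python
# A polynomial over F_2 with Boolean variables is a frozenset of monomials;
# a monomial is a frozenset of variable names (x*x = x, coefficients mod 2).
ZERO = frozenset()

def var(name):
    return frozenset({frozenset({name})})

def add(p, q):
    return p ^ q                      # equal monomials cancel mod 2

def mul(p, q):
    out = set()
    for m in p:
        for n in q:
            out ^= {m | n}            # idempotence: product is a set union
    return frozenset(out)

def substitute(p, name, tail):
    """Divide p by the gate polynomial (name + tail): rewrite name -> tail."""
    out = set()
    for m in p:
        if name in m:
            out ^= mul(frozenset({m - {name}}), tail)
        else:
            out ^= {m}
    return frozenset(out)

a0, a1, b0, b1 = (var(v) for v in ("a0", "a1", "b0", "b1"))

def gates(z0_tail):
    """2-bit multiplier over F_4 (Karatsuba-style), in topological order."""
    return [
        ("p", mul(a0, b0)),
        ("q", mul(a1, b1)),
        ("sa", add(a0, a1)),
        ("sb", add(b0, b1)),
        ("r", mul(var("sa"), var("sb"))),
        ("z0", z0_tail),
        ("z1", add(var("r"), var("p"))),
    ]

GOOD = gates(add(var("p"), var("q")))
BUGGY = gates(add(var("p"), var("r")))   # hypothetical bug: wrong wire into z0

# Bit-level specification of (a0 + a1*x)(b0 + b1*x) mod x^2 + x + 1.
SPEC = {
    "z0": add(mul(a0, b0), mul(a1, b1)),
    "z1": add(add(mul(a0, b1), mul(a1, b0)), mul(a1, b1)),
}

def remainder(output, circuit):
    """Reduce (output + spec) by the gate polynomials, outputs first."""
    poly = add(var(output), SPEC[output])
    for name, tail in reversed(circuit):
        poly = substitute(poly, name, tail)
    return poly
```

A zero remainder for both outputs means the circuit matches the specification; for the buggy netlist the remainder of `z0` is a nonzero polynomial in the primary inputs, which also pinpoints the inputs where the circuit misbehaves.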
Experiments are conducted over various custom-designed arithmetic circuits over
$\mathbb{F}_{2^k}$. These include three different modulo-multiplier architectures and
point-addition circuits used in elliptic curve cryptosystems. Using our approach and
tools, we can verify the correctness of, and detect bugs in, up to 163-bit finite field
arithmetic circuits, whereas contemporary approaches are infeasible at this scale.
1.4 Thesis Organization
The rest of this dissertation is organized as follows. Chapter 2 reviews previous
approaches and highlights their drawbacks with respect to the given verification
problem. Chapter 3 briefly describes the construction and properties of finite fields
$\mathbb{F}_{2^k}$. Arithmetic circuit design over such fields is also reviewed to shed
some light on the difficulty of the verification problem. Chapter 4 covers preliminary
theoretical background related to computer algebra, algebraic geometry and Gröbner
bases. Chapter 5 describes our approach to verify a circuit implementation against a
word-level polynomial specification using ideal membership testing. We show how the
Gröbner basis computation can be obviated using efficient term orderings derived from
the given circuit. Chapter 6 presents our approach to equivalence checking of two
arithmetic circuit implementations. Efficient term orderings and matrix-based
polynomial reduction procedures are derived. Chapter 7 describes a hierarchical
verification methodology to verify arithmetic circuits over composite fields
$\mathbb{F}_{(2^m)^n}$, where $k = m \cdot n$. Finally, Chapter 8 concludes the
dissertation with a perspective on current and future research directions on computer
algebra methods for verification.
CHAPTER 2
PREVIOUS WORK AND LIMITATIONS
Equivalence checking has been extensively investigated, and many well-developed
theories and techniques have been successfully applied in both academia and industry.
The fundamental techniques used in equivalence checking include BDDs [15] and SAT
solvers [26]. Recently, Gröbner bases-based approaches have also been gaining
popularity. This chapter reviews widely used techniques in the equivalence checking
domain and discusses their limitations.
2.1 BDDs and Their Variants
Reduced Ordered Binary Decision Diagrams (ROBDDs or BDDs) are a canonical
Directed Acyclic Graph (DAG) representation of a Boolean function. Circuits are
usually described as a DAG as well. For a fixed variable order, two functionally
equivalent circuits are represented by the same BDD. Therefore, equivalence checking
between two circuits can be achieved simply by comparing their BDDs.
BDDs have found wide applications in many verification problems, including equiv-
alence checking of arithmetic circuits, symbolic model checking [33] [63], among many
others. However, along with the increasing complexity of designs, the size-explosion
problem of BDDs becomes a bottleneck for many applications. This problem becomes
especially serious when applied on designs containing large arithmetic data-path units.
For example, BDD representation of multipliers requires memory that is exponential
in the number of variables. As a result, BDDs fail to represent multipliers beyond
16-bit. As an attempt to control the exponential size, partitioned BDDs [70] introduce
intermediate variables to represent sub-BDDs, thus partitioning the original BDD. Un-
fortunately, it is an intractable problem to find an optimum partition. This issue renders
partitioned ROBDDs impractical for general verification problems.
Other efforts to extend the capabilities of BDDs are derived from generic Word
Level Decision Diagrams (WLDDs), which are graph-based representations for
functions with a Boolean domain and an integer range. These representations include
ADDs [5], *BMDs [16], etc. A thorough review of WLDDs can be found in [41].
Algebraic decision diagrams (ADDs) [5] provide an efficient means for representing
and performing arithmetic operations on functions from the binary domain ({0, 1}) to
the integer domain, i.e., {0, 1} → Z. However, the mapping/decomposition at each
node/variable is still binary and leads to exactly two terms. Restricting the decompo-
sition to a binary type limits the abstraction of integer variables, as they have to be
decomposed into their constituent bits. Consequently, ADDs face the same problem
that BDDs do: their size is exponential in the number of input bits.
BMDs [16] and their variants, such as HDDs [22], K*BMD [30], among others,
perform a moment-based decomposition of a linear function. BMDs represent binary
variables as (0, 1) integers instead of Boolean variables. Moment diagrams provide a
concise representation of integer-valued functions defined over vectors of bits, or words,
such as X = 2^{n−1} x_{n−1} + · · · + 2 x_1 + x_0, for an n-bit word X, where each x_i is a binary
variable. BMDs are linear in size for integer multiplier circuits, as shown in Figure
2.1. The multiplicative constants of this representation reside in the terminal nodes.
Moreover, the constants can also be represented as multiplicative terms and assigned
to the edges of the graph, giving rise to the Multiplicative Binary Moment Diagram
(*BMD) [16]. Several rules for manipulating edge weights are imposed on the graph to
ensure canonicity.
One of the main limitations of BMDs is that performing some arithmetic operations
on functions represented by BMDs is very expensive. For example, for an n-bit vector
Figure 2.1. BMD for F = x ∗ y; x, y are 2-bit wide, F is 4-bit wide.
on bit-vectors are distorted, losing the compactness of word-level expression. One such
example is depicted in Fig. 2.2.
Taylor Expansion Diagrams (TEDs) [20] [19] [45] [44] are canonical DAG representations,
derived from the Taylor series, for functions that can be abstracted as polynomials.
TEDs represent bit-vectors (x_0, x_1, ..., x_{n−1}) as algebraic symbols (X[0 : n − 1]),
raising the abstraction from bits (Boolean) to words (integers). Let f(x, y, ...)
be a real differentiable function. Using the Taylor series expansion with respect to a
variable x, the function f can be represented as
f(x, y, ...) = f(x = 0, y, ...) + x · f′(x = 0, y, ...) +
(1/2) x^2 · f″(x = 0, y, ...) + · · · (2.1)
The derivatives of f at x = 0 are independent of x, and can be further decomposed w.r.t
the remaining variables, one variable at a time. This resulting recursive decomposition
can be represented using a nonbinary tree called the TED, with memory requirements
much smaller than other representations. TEDs are applicable to modeling, symbolic
simulation and equivalence verification, provided that a polynomial abstraction is feasi-
ble. For binary operations, the diagram reduces to a *BMD, inheriting all its limitations.
Besides, TEDs cannot model modulo operations over bit-vectors. Therefore, TEDs are
incapable of solving the equivalence problems presented in this dissertation.
2.2 SAT Solvers and SMT Solvers
The SAT problem is a decision problem. In principle, any decidable decision problem
can be modeled in terms of SAT, and because of this, SAT solvers are used in an
Figure 2.2. BMD for F = x ∗ y; x, y, F are all 2-bit wide.
The objective of SAT solvers is to find variable assignments such that the given
constraints (formulas) can be satisfied. If this is not possible, SAT solvers have to prove
that no assignments satisfy the constraints (UNSAT).
Solving SAT-instances of any useful size was not possible until the introduction
of the Davis-Putnam (DP) [27] algorithm. The DP algorithm works by eliminating
variables through deriving new constraints from the original constraints containing the
variables. Still, this has its limitations: though the variable is eliminated, the cost of
elimination can be large because of the clauses needed to represent the variable in its
absence. As a result, the algorithm did not see much use, but served as a stepping
stone for more versatile techniques based on searching.
The foundation of nearly all modern SAT solvers lies in the DPLL approach [26].
The DPLL algorithm adopts a technique called backtracking search: variables
are recursively assigned, the formula is simplified at each step, and any partial
assignment that cannot possibly be completed to a valid solution is abandoned
(backtracking). The DPLL algorithm also utilizes rules such as unit propagation
and pure-literal elimination to reduce the formula size and the number
of decisions needed. However, in essence, the DPLL algorithm is an exhaustive search
for a satisfying assignment.
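To make the backtracking search concrete, the following is a minimal Python sketch of a DPLL-style procedure with unit propagation and pure-literal elimination. It is an illustration only and not the algorithm of any particular solver; the clause encoding (lists of signed integers, where -v denotes the negation of variable v) and the names `simplify`, `dpll` and `satisfies` are assumptions introduced here.

```python
from typing import Dict, List, Optional

Clause = List[int]  # non-zero ints; -v means "NOT v" (assumed encoding)


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Set lit to True: drop satisfied clauses and delete the falsified literal.
    Returns None when an empty clause (a conflict) is produced."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue                      # clause already satisfied
        reduced = [x for x in clause if x != -lit]
        if not reduced:
            return None                   # conflict
        out.append(reduced)
    return out


def dpll(clauses: List[Clause],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if UNSAT."""
    assignment = dict(assignment or {})
    # Unit propagation: a one-literal clause forces the value of its variable.
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        lit = units[0]
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)
        if clauses is None:
            return None
    # Pure-literal elimination: a literal whose negation never occurs can be set.
    literals = {x for c in clauses for x in c}
    for lit in [x for x in literals if -x not in literals]:
        assignment[abs(lit)] = lit > 0
        clauses = simplify(clauses, lit)  # cannot conflict: -lit never occurs
    if not clauses:
        return assignment                 # every clause satisfied
    # Decision: branch on a variable and backtrack on failure.
    var = abs(clauses[0][0])
    for lit in (var, -var):
        reduced = simplify(clauses, lit)
        if reduced is not None:
            result = dpll(reduced, {**assignment, var: lit > 0})
            if result is not None:
                return result
    return None


def satisfies(clauses: List[Clause], assignment: Dict[int, bool]) -> bool:
    """Check an assignment; variables left unassigned are treated as False."""
    return all(any(assignment.get(abs(l), False) == (l > 0) for l in c)
               for c in clauses)


if __name__ == "__main__":
    cnf = [[1, 2], [-1, 3], [-2, -3], [2, 3]]
    model = dpll(cnf)
    print(model, satisfies(cnf, model))
```

Even with both simplification rules, the worst case still explores both branches of every decision, which is the exhaustive behavior noted above.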
Based on the basic DPLL framework, many improvements have been proposed. A
major advance is conflict-driven clause learning [79], in which new clauses are
learned from the conflicts encountered during backtrack search, exploiting the
structure of each conflict. With this technique, the size of the search space is
greatly reduced and SAT solvers achieve performance improvements of orders of
magnitude. However, there are still many problems that are intractable for SAT
solvers, such as problems from the cryptography domain,
where the designs often involve tens of millions of variables. One major drawback
that limits the capacity of SAT solvers is the lack of ability for word-level reasoning.
To resolve this limitation, satisfiability modulo theories (SMT) solvers were proposed and have
gained significant popularity since 2003. The SMT problem is to decide the satisfiability
of a formula expressed in a first-order background theory, such as linear inequalities,
bit vectors, linear arithmetic and uninterpreted functions. In fact, SMT can be
considered as an extension of SAT to first-order logic. In other words, SMT solvers
first apply highly optimized decision procedures for different first-order theories and
then check the satisfiability using SAT solvers. For example, X > Y ∧ Y = Z is
first interpreted into X > Z and then X > Z is fed into a SAT solver to check the
satisfiability.
For our problems of interest, bit-vector (BV) theories have been shown to be useful
and important for hardware equivalence checking. In our case, equivalence checking
problems are first compiled into the formula. Then, decision procedures for bit-vector
theories, such as term rewriting techniques, are applied on the compiled formula to ob-
tain further optimization. Next, the optimized formula is bit-blasted to an equisatisfiable
Boolean formula. Finally, an integrated SAT solver is used to enumerate assignments
to the Boolean formula to find a satisfying assignment.
One advantage of bit-vector theories in SMT is that all problems are described
and operated upon at the word level (bit-vector), which proves effective for computationally
intensive designs, such as arithmetic circuits. For example, at word level, a 32-bit
multiplication can be represented as one term with two 32-bit words, while at bit-level,
it is represented with thousands of Boolean variables. Moreover, some instances can be
fully decided at the word level, thus achieving high performance.
As mentioned above, SMT formulas obviously provide a much richer modeling
language than what is possible with Boolean SAT formulas, even allowing word-level
representations of datapath operations. Solvers based on these theories [31] [14] [13]
[43] have improved abilities to represent arithmetic computations, but ultimately rely on
SAT tools to solve the verification instance, making them prone to the same limitations,
as shown in our experiments. For equivalence checking of gate-level circuits, word-
level information is not available. Then, SMT solvers have no benefits as they have to
rely on SAT solvers to solve the bit-level verification instance.
2.2.1 Circuit-Based Solvers
The above SAT and SMT solvers do not take into consideration circuit topology, so
they are inefficient in verifying circuit designs. Instead, circuit-based solvers, such as
C-SAT [53] [54], focus specifically on the mechanics of checking the equivalence of
pairs of combinational circuits. The main strategy utilized by C-SAT solvers is signal
correlation guided learning, which attempts to identify common subcircuit structure. In
other words, an internal node in the first circuit may be equivalent to an internal node
in the second circuit, thus combining the identical subcircuit as one node. This way,
if two circuits are structurally similar, the original problem becomes a problem with
much smaller space. To identify the common subcircuits, a technique called structural
hashing [11] is used. This is achieved by random simulation: first sending random
vectors through the two circuits and then collecting pairs of candidate equivalent nodes.
Practical use [11] has shown that this technique can detect potentially many, high
probability, candidate equivalent nodes.
AIG [49], on the other hand, is a pseudo canonical representation of a circuit. One
good property of AIGs is that the operations based on AIG are fast, such as adding
nodes or merging nodes. By representing the circuit with AIGs, many equivalent nodes
over a large circuit can be identified quickly.
When coupled with AIG as the circuit representation and techniques used in C-SAT,
circuit-based SAT solvers can achieve remarkable speedups in solving a wide variety of
circuit equivalence checking problems.
When two circuits are structurally very similar, structural hashing is able to identify
the common subcircuits, thus reducing the problem size. However, these techniques
are infeasible when verifying structurally dissimilar circuits. For example, in our
experiments, we have shown that equivalence checking of Mastrovito versus Montgomery
multipliers using ABC [11] and C-SAT [53] is infeasible beyond 16-bit circuits.
2.3 Computer Algebra-Based Approaches
Computer algebra-based approaches were first proposed in 1996 for SAT solving
and formal verification [23] [4]. The principal idea of these approaches is to reason
about the existence of solutions in the polynomial domain: verification problems are
first formulated as polynomials; then the polynomial system is fed into a Gröbner basis
engine to check the existence of solutions. There have been many attempts to solve
verification problems using this Gröbner basis formulation [87]. Instead of analyzing
the entire problem for proof-refutation, the work of [24] utilized Gröbner bases to
preprocess a SAT instance to obtain additional information about the problem. This
information is then fed back into the SAT solver, thus benefiting the SAT solving.
One limitation of these approaches is that the Gröbner basis computation is known
to have worst-case double-exponential complexity in the input data. Besides, in practice,
implementations of the Gröbner basis algorithm have not been capable of satisfactorily
solving problems derived from real-world applications.
Recent advances [88] [73] suggest a new direction of utilizing computer algebra
theory to conduct hardware verification. It is feasible to overcome the complexity of
the Gröbner basis algorithm by efficiently engineering Gröbner basis theory and
integrating circuit analysis techniques.
The work described in [88] addresses verification of finite precision integer datapath
circuits using the concepts of Gröbner bases over the ring Z_{2^k}. They model the circuit
constraints by way of arithmetic-bit-level (ABL) polynomials ({G}), and formulate
the verification test as an equivalent variety subset problem. To solve this, first they
derive a term order that already makes {G} a Gröbner basis. Then, they compute a
normal form f of the specification g w.r.t. {G}. If f is a vanishing polynomial over
Z_{2^k} [76], circuit correctness is established. In [73], the authors further show that the
vanishing polynomial test can be omitted by formulating the problem directly over
Q := Z_{2^k}[X]/〈x^2 − x : x ∈ X〉.
However, such approaches are effective only over the ring Z_{2^k}, while our problems are
derived from finite fields F_{2^k}. The mathematical theories differ significantly in these
two domains. Therefore, these approaches cannot be applied to our problems.
2.4 Verification of Finite Field Applications
There has not been much research by the design verification community to verify
finite field applications. The following works specifically targeted automated decision
procedures for verification of finite field applications: [67] [69] [74].
The theorem-proving approach of [67] verifies a finite field F_{2^k} implementation
against a given polynomial specification. They devise a decision procedure based on
polynomial division, variable elimination, term rewriting, etc., and demonstrate a
correctness proof of a sub-block of a Reed-Solomon decoder. Their decision procedures
were partly built upon BDDs (requiring decision over F_2), which is infeasible for
large circuits.
The work of [69] solves similar problems as those of [67]. However, they make use
of OKFDDs [29] to canonically represent the circuit constraints. Moreover, instead of
verifying the circuit over F_{2^k} directly, [69] verifies the circuit over its equivalent composite
field F_{(2^m)^n} representation, where k = m · n is nonprime. Their approach has no benefit
if k is prime, say, when k = 163 for elliptic curves. Moreover, the size explosion of
FDDs limits their approach to 16-bit (F_{2^{16}}) circuits, as shown in their experiments.
MODDs [42] were proposed as a canonical representation of the characteristic
function of a circuit over the finite field F_{2^k}. However, as each node in the DAG may
have up to k children, MODDs have been shown to be exponential in the number of
variables, and thus infeasible beyond 32-bit circuits.
None of the above approaches provides a scalable and efficient solution to the problem
of verification of large finite field arithmetic circuits.
CHAPTER 3
PRELIMINARIES
This chapter gives an account of basic commutative algebra objects, such as
modular arithmetic, groups, rings, fields and polynomials. Emphasis is placed on
finite fields and hardware design over such fields, as these applications are the focus
of this dissertation. The material is drawn from [62] [75] [51] for finite field concepts
and [61] [65] [48] [89] [46] for hardware design over finite fields.
3.1 Rings, Fields and Polynomials
Definition 3.1 An abelian group is a set S and a binary operation ′+′ satisfying:
• Closure Law: For every a, b ∈ S, a+ b ∈ S.
• Associative Law: For every a, b, c ∈ S, a+ (b+ c) = (a+ b) + c.
• Commutativity: For every a, b ∈ S, a+ b = b+ a.
• Existence of Identity: There is an identity element 0 ∈ S such that for all a ∈ S,
a + 0 = a.
• Existence of Inverse: If a ∈ S, then there is an element a^{−1} ∈ S such that
a + a^{−1} = 0.
The set of integers Z, for instance, forms an abelian group under addition.
Definition 3.2 Given two binary operations ′+′ and ′·′ on the set R as well as two
distinguished elements 0, 1 ∈ R, the system R is called a ring if the following properties
hold:
• R forms an abelian group under the ’+’ operation with additive identity element
0.
• Distributive Laws: For all a, b, c ∈ R, a · (b+ c) = a · b+ a · c .
• Associative Law of Multiplication: For every a, b, c ∈ R, a · (b · c) = (a · b) · c.
If there is an identity element 1 ∈ R such that for all a ∈ R, a · 1 = a = 1 · a, then
R is said to be a ring with unity.
The ring R is commutative if the following law also holds:
• Commutative Law of Multiplication: For every a, b ∈ R, a · b = b · a.
Henceforth, we consider only commutative rings with unity, as defined above. The
set of integers, Z, and the set of rational numbers, Q, are examples of commutative
rings with unity.
Definition 3.3 The modular number system with base n is a set of non-negative integers
Z_n = {0, 1, ..., n − 1}, with the two operations ′+′ and ′·′ satisfying the properties
below:
(a+ b) (mod n) ≡ (a (mod n) + b (mod n)) (mod n) (3.1)
(a · b) (mod n) ≡ (a (mod n) · b (mod n)) (mod n) (3.2)
(−a) (mod n) ≡ (n− a) (mod n) (3.3)
Example 3.1 The set Z8 = {0, 1, . . . , 7} denotes the modular number system with base
8. Examples of some operations performed (mod 8) are:
3 + 6 = 9 (mod 8) = 1
5 + 7 = 12 (mod 8) = 4
(−3) = 8− 3 (mod 8) = 5
2 · 4 = 8 (mod 8) = 0
3 · 5 = 15 (mod 8) = 7
3 · (−3) = (3 · 5) (mod 8) = 7
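The computations of Example 3.1 can be reproduced with a few lines of Python. This is only a small sketch; the helper names `add_mod`, `mul_mod` and `neg_mod` are introduced here for illustration and follow properties (3.1) to (3.3).

```python
N = 8  # base of the modular number system Z_8


def add_mod(a: int, b: int, n: int = N) -> int:
    """Addition in Z_n, property (3.1)."""
    return (a + b) % n


def mul_mod(a: int, b: int, n: int = N) -> int:
    """Multiplication in Z_n, property (3.2)."""
    return (a * b) % n


def neg_mod(a: int, n: int = N) -> int:
    """Additive inverse in Z_n, property (3.3): -a = n - a (mod n)."""
    return (n - a) % n


if __name__ == "__main__":
    print(add_mod(3, 6), add_mod(5, 7))      # sums from Example 3.1
    print(neg_mod(3))                        # -3 (mod 8)
    print(mul_mod(2, 4), mul_mod(3, 5))      # products from Example 3.1
    print(mul_mod(3, neg_mod(3)))            # 3 * (-3) (mod 8)
```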
The modular number system Z_n = {0, 1, ..., n − 1}, where n is a natural number,
forms a commutative ring with the identity elements 0 and 1. This type of ring is a
finite integer ring, where addition and multiplication are defined modulo n (mod n).
Many hardware and software applications perform bit-vector arithmetic. Arithmetic
over k-bit vectors manifests itself as algebra over the finite integer ring Z_{2^k}, as a k-bit
vector represents integer values from {0, ..., 2^k − 1}.
Example 3.2 Consider the following hardware description given in Verilog. It takes as
inputs two 4-bit vectors, and computes the sum, which is also represented with a 4-bit
wide vector. Therefore, addition is performed modulo 2^4.
    module Adder (A, B, sum);
      input  [3:0] A;
      input  [3:0] B;
      output [3:0] sum;
      reg    [3:0] sum;
      always @(A or B)
      begin
        sum <= A + B;
      end
    endmodule
This code exemplifies arithmetic computations over the ring Z_{2^k} implemented at
bit-vector level.
Definition 3.4 A field F is a commutative ring with unity, where every non-zero element
in F has a multiplicative inverse; i.e., ∀ a ∈ F − {0}, ∃ â ∈ F such that a · â = 1.
A field is defined as a ring with an extra condition: the presence of a multiplicative
inverse for all non-zero elements. Therefore, a field must be a ring while a ring is not
necessarily a field. For example, the set Z_{2^k} = {0, 1, ..., 2^k − 1} forms a finite ring.
However, for k > 1, Z_{2^k} is not a field because not every non-zero element in Z_{2^k} has a
multiplicative inverse.
In general, fields can be infinite, or contain a finite number of elements. For example,
the rational numbers Q and the complex numbers C are infinite fields. In our applications, we focus
on finite fields, which are described later in Section 3.2.
Definition 3.5 Let R be a ring. A polynomial over R in the indeterminate x is an
expression of the form:
a_0 + a_1 x + a_2 x^2 + · · · + a_k x^k = Σ_{i=0}^{k} a_i x^i, ∀ a_i ∈ R. (3.4)
The constants a_i are the coefficients and k is the degree of the polynomial. For
example, 4x^2 + 6x is a polynomial in x over Z, with coefficients 4 and 6 and degree 2.
Definition 3.6 The system consisting of the set of all polynomials in the indeterminate
x with coefficients in the ring R, where addition and multiplication are defined accord-
ingly, forms a ring called the ring of polynomials R[x]. Similarly, R[x1, x2, · · · , xn]
represents the ring of multivariate polynomials with coefficients in R.
For example, Z_{2^3}[x] stands for the system of all polynomials in x with coefficients
in Z_{2^3}; 4x^2 + 6x is an instance of a polynomial belonging to Z_{2^3}[x].
3.2 Finite Fields
Finite fields find widespread applications in computer engineering, such as in error
correcting codes, elliptic curve cryptography, digital signal processing, testing of VLSI
circuits, among others. We describe the relevant finite field concepts [62] [75] [51] and
hardware designs over such fields [61] [65] [48] [89] [46].
Definition 3.7 A finite field, also called a Galois field, is a field with a finite number
of elements. The number of elements q of the finite field is a power of a prime integer,
i.e., q = p^k, where p is a prime integer, and k ≥ 1. Finite fields are denoted as F_q or
F_{p^k}.
Definition 3.8 The characteristic of a finite field F with unity element 1 is the smallest
positive integer n such that 1 + · · · + 1 (n times) = 0.
Lemma 3.1 The characteristic of a finite field F_{p^k} is the prime integer p.
Lemma 3.2 The finite integer ring Z_n forms a finite field if and only if n is prime. Such
fields are customarily denoted as Z_p = F_p.
Example 3.3 Consider the field Z5. The additive and multiplicative inverses of each
element in Z5 (except 0) are also elements in Z5, as shown in Table 3.1. In contrast, Z4
is not a field, as 2 does not have a multiplicative inverse in Z4.
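The claims of Example 3.3 and Lemma 3.2 can be checked by brute force for small n. The sketch below is illustrative only; the function names `multiplicative_inverse`, `inverse_table` and `is_field` are assumptions made here, and the exhaustive search is practical only for small moduli.

```python
from typing import Dict, Optional, Tuple


def multiplicative_inverse(a: int, n: int) -> Optional[int]:
    """Return b with a*b = 1 (mod n), or None if no such b exists."""
    for b in range(1, n):
        if (a * b) % n == 1:
            return b
    return None


def inverse_table(n: int) -> Dict[int, Tuple[int, Optional[int]]]:
    """Map each a in Z_n to (additive inverse, multiplicative inverse)."""
    return {a: ((-a) % n, multiplicative_inverse(a, n)) for a in range(n)}


def is_field(n: int) -> bool:
    """Z_n is a field iff every non-zero element has a multiplicative inverse."""
    return n > 1 and all(multiplicative_inverse(a, n) is not None
                         for a in range(1, n))


if __name__ == "__main__":
    for a, (add_inv, mul_inv) in inverse_table(5).items():
        print(a, add_inv, mul_inv)
    print(is_field(5), is_field(4))
```

Running `is_field` over a range of n gives True exactly for the primes, in agreement with Lemma 3.2.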
Table 3.1. Additive and multiplicative inverses in Z_5.
While Z_{2^k} is not a field, there do exist fields F_{p^k} with nonprime cardinality. Such
fields are called extension fields. We are interested in extension fields F_{p^k}, where p = 2
and k > 1. As these are algebraic extensions of the binary field F_2, they are generally
termed binary extension fields F_{2^k}. Such fields are most widely used in digital
hardware applications as the computation can be universally encoded in binary form
for practical reasons.
3.2.1 Construction of Finite Fields F2k
To construct and describe the properties of finite fields F_{2^k}, the concept of
irreducible polynomials is required:
Definition 3.9 A polynomial P(x) ∈ F_2[x] is irreducible if P(x) is nonconstant with
degree k and it cannot be factored into a product of polynomials of lower degree in
F_2[x].
In particular, a polynomial of degree 2 or 3 is irreducible over F_2 if and only if it has no
roots in F_2; for higher degrees, having no roots is necessary but not sufficient. For example,
x^2 + x + 1 is an irreducible polynomial, because x^2 + x + 1 = 0
has no roots in F_2. Irreducible polynomials of any arbitrary degree always exist in F_2[x].
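For small degrees, Definition 3.9 can be checked directly by trial division. The following Python sketch is an illustration under assumptions made here: a polynomial over F_2 is encoded as an integer whose bit i is the coefficient of x^i, and the helper names `poly_mod`, `has_root_in_f2` and `is_irreducible` are hypothetical. The search over all candidate factors is exponential in the degree, so it is meant only for small examples.

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a(x) divided by p(x) over F_2 (bit i = coefficient of x^i)."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        # Cancel the leading term of a with a shifted copy of p (XOR = add mod 2).
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a


def has_root_in_f2(p: int) -> bool:
    """p(0) is the constant term; p(1) is the parity of the number of terms."""
    p_at_0 = p & 1
    p_at_1 = bin(p).count("1") % 2
    return p_at_0 == 0 or p_at_1 == 0


def is_irreducible(p: int) -> bool:
    """Trial division by every polynomial of degree 1 .. deg(p) // 2."""
    deg = p.bit_length() - 1
    if deg < 1:
        return False  # constants are not irreducible
    for d in range(2, 1 << (deg // 2 + 1)):
        if poly_mod(p, d) == 0:
            return False
    return True


if __name__ == "__main__":
    for name, p in [("x^2+x+1", 0b111), ("x^3+x+1", 0b1011),
                    ("x^3+x^2+1", 0b1101), ("x^4+x^3+1", 0b11001)]:
        print(name, is_irreducible(p), has_root_in_f2(p))
```

The root test and the trial-division test agree for the degree 2 and 3 polynomials listed; for degree 4 and above only the trial-division result should be relied upon.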
To construct F_{2^k}, we take the polynomial ring F_2[x] and an irreducible polynomial
P(x) ∈ F_2[x] of degree k, and construct F_{2^k} ≡ F_2[x] (mod P(x)). Let α be a root of
P(x), i.e., P(α) = 0. Note that P(x) is irreducible in F_2[x]; however, the root α lies in
F_{2^k}. Any element A ∈ F_{2^k} can therefore be represented as:
A = Σ_{i=0}^{k−1} (a_i · α^i) = a_0 + a_1 · α + · · · + a_{k−1} · α^{k−1} (3.5)
where a_i ∈ F_2 are the coefficients and P(α) = 0. The degree of any element A in F_{2^k} is
always less than k. This is because A is always computed modulo P(x), and P(x) has
degree k. The remainder (mod P(x)) can be of degree at most k − 1. For this reason,
the field F_{2^k} can be viewed as a k-dimensional vector space over F_2. The equivalent bit
vector representation for element A is given below:
A = (a_{k−1} a_{k−2} · · · a_0) (3.6)
The example below explains the construction of the finite field F_{2^4}.
Example 3.4 Let us construct F_{2^4} as F_2[x] (mod P(x)), where P(x) = x^4 + x^3 + 1 ∈
F_2[x] is an irreducible polynomial of degree k = 4. Let α be the root of P(x), i.e.,
P(α) = 0.
Any element A ∈ F_2[x] (mod x^4 + x^3 + 1) has a representation of the type: A =
a_3 x^3 + a_2 x^2 + a_1 x + a_0 (degree < 4) where the coefficients a_3, ..., a_0 are in F_2 = {0, 1}.
Since there are only 16 such polynomials, we obtain 16 elements in the field F_{2^4}. Each
element in F_{2^4} can then be viewed as a 4-bit vector over F_2: F_{2^4} = {(0000), (0001), ...,
(1110), (1111)}. If α is the root of P(x), then each element also has an exponential
representation; all three representations are shown in Table 3.2. For example, consider
the element α^{12}. Computing α^{12} (mod α^4 + α^3 + 1) = α + 1 = (0011); hence, we
have the three equivalent representations.
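The exponential representation of Example 3.4 can be generated mechanically by repeatedly multiplying by α and reducing with P(α) = 0. The sketch below is an illustration only; elements are encoded as 4-bit integers (bit i is the coefficient of α^i), and the names `times_alpha` and `power_table` are assumptions introduced here.

```python
K = 4
P = 0b11001  # P(x) = x^4 + x^3 + 1


def times_alpha(a: int, p: int = P, k: int = K) -> int:
    """Multiply an element by alpha: shift left, then reduce if degree reaches k."""
    a <<= 1
    if (a >> k) & 1:
        a ^= p  # subtracting P(alpha) = 0 is an XOR in characteristic 2
    return a


def power_table(p: int = P, k: int = K) -> dict:
    """Map each exponent e in 0 .. 2^k - 2 to the bit-vector of alpha^e."""
    table = {}
    elem = 1  # alpha^0
    for e in range(2 ** k - 1):
        table[e] = elem
        elem = times_alpha(elem, p, k)
    return table


if __name__ == "__main__":
    for e, v in power_table().items():
        print(f"alpha^{e:<2d} = {v:04b}")
```

The loop visits all 15 non-zero elements before returning to 1, so for this choice of P(x) every non-zero element is a power of α.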
There may exist more than one irreducible polynomial of degree k. In such cases,
any degree-k irreducible polynomial can be used for field construction. For example,
both x^3 + x^2 + 1 and x^3 + x + 1 are irreducible over F_2 and either one can be used to
construct F_{2^3}. This is due to the following result:
Theorem 3.1 There exists a unique field F_{p^k}, for any prime p and any positive integer
k.
Theorem 3.1 implies that finite fields with the same number of elements are isomorphic
to each other up to the labeling of the elements.
Lemma 3.3 Let A be any non-zero element in F_q, then A^{q−1} = 1.
As a consequence of Lemma 3.3, the following is a very important result that we
will use to investigate solutions to polynomial equations in F_q.
Theorem 3.2 [Generalized Fermat's Little Theorem] Given a finite field F_q, each
element A ∈ F_q satisfies:
A^q ≡ A
A^q − A ≡ 0 (3.7)
Table 3.2. Bit-vector, exponential and polynomial representation of elements in
F_{2^4} = F_2[x] (mod x^4 + x^3 + 1)
a3a2a1a0  Exponential  Polynomial          a3a2a1a0  Exponential  Polynomial
0000      0            0                   1000      α^3          α^3
0001      1            1                   1001      α^4          α^3 + 1
0010      α            α                   1010      α^10         α^3 + α
0011      α^12         α + 1               1011      α^5          α^3 + α + 1
0100      α^2          α^2                 1100      α^14         α^3 + α^2
0101      α^9          α^2 + 1             1101      α^11         α^3 + α^2 + 1
0110      α^13         α^2 + α             1110      α^8          α^3 + α^2 + α
0111      α^7          α^2 + α + 1         1111      α^6          α^3 + α^2 + α + 1
As a polynomial extension of the above consequence, let x^q − x be a polynomial in
F_q[x]. Every element A ∈ F_q is a solution to x^q − x = 0. Therefore, x^q − x always
vanishes in F_q, and such polynomials are called vanishing polynomials of the field F_q.
Example 3.5 Given F_{2^2} = {0, 1, α, α + 1} with P(x) = x^2 + x + 1, where P(α) = 0:
0^{2^2} = 0
1^{2^2} = 1
α^{2^2} = α (mod α^2 + α + 1)
(α + 1)^{2^2} = α + 1 (mod α^2 + α + 1)
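The vanishing property A^q = A can be confirmed exhaustively for small fields. The following Python sketch is illustrative only; it assumes elements are encoded as integers (bit i is the coefficient of α^i), and the names `gf_mul` and `gf_pow` are introduced here rather than taken from any library.

```python
def gf_mul(a: int, b: int, p: int, k: int) -> int:
    """Multiply a and b in F_{2^k} = F_2[x] (mod p), shift-and-add style."""
    result = 0
    for _ in range(k):
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # a := a * alpha
        if (a >> k) & 1:
            a ^= p               # reduce with P(alpha) = 0
    return result


def gf_pow(a: int, e: int, p: int, k: int) -> int:
    """Compute a^e in F_{2^k} by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = gf_mul(result, a, p, k)
        a = gf_mul(a, a, p, k)
        e >>= 1
    return result


if __name__ == "__main__":
    # F_{2^2} with P(x) = x^2 + x + 1: every element satisfies A^4 = A.
    print([gf_pow(a, 4, 0b111, 2) for a in range(4)])
    # F_{2^4} with P(x) = x^4 + x^3 + 1: every element satisfies A^16 = A.
    print(all(gf_pow(a, 16, 0b11001, 4) == a for a in range(16)))
```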
3.2.2 Hardware Implementations of Arithmetic Operations Over F2k
In some cases, finite field (primitive) computations such as ADD, MUL, etc., are
implemented in hardware, and algorithms are then implemented in software (e.g., cryp-
toprocessors [84] [47]). In other cases, the entire design can be implemented in hard-
ware – such as a one-shot Reed-Solomon encoder-decoder chip [66] [50], or the point
multiplication circuitry [38] used in elliptic curve cryptosystems. Therefore, there has
been a lot of research in VLSI implementations of finite field arithmetic. We describe
the design of such primitive computations below to shed some light on the architectures
and their design and verification complexity.
Addition in F2k is performed by correspondingly adding the polynomials together,
and reducing the coefficients of the result modulo the characteristic 2.
Example 3.6 Given A = α^3 + α^2 + 1 = (1101) and B = α^2 + 1 = (0101) in F_{2^4},
A + B = (α^3 + α^2 + 1) + (α^2 + 1) = (α^3) + (α^2 + α^2) + (1 + 1) = α^3 = (1000).
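Because coefficients are reduced modulo 2, addition in F_{2^k} is a bitwise XOR of the two bit-vectors. The short Python sketch below reproduces Example 3.6; the function names `gf_add` and `gf_add_bits` are assumptions made for illustration.

```python
from typing import List


def gf_add(a: int, b: int) -> int:
    """Add two elements of F_{2^k} given as integers (bit i = coefficient of alpha^i)."""
    return a ^ b


def gf_add_bits(a_bits: List[int], b_bits: List[int]) -> List[int]:
    """Gate-level view: one XOR gate per coefficient, z_i = a_i XOR b_i."""
    return [ai ^ bi for ai, bi in zip(a_bits, b_bits)]


if __name__ == "__main__":
    A, B = 0b1101, 0b0101            # alpha^3 + alpha^2 + 1 and alpha^2 + 1
    print(f"{gf_add(A, B):04b}")     # bit-vector of A + B
    print(gf_add_bits([1, 0, 1, 1], [1, 0, 1, 0]))  # coefficients listed a0..a3
```

Since every element is its own additive inverse, subtraction in F_{2^k} is the same operation.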
Example 3.7 A 4-bit adder in F24 is given in Figure 3.1. It takes as inputs two 4-bit
vectors: A = (a3a2a1a0), B = (b3b2b1b0) and computes the result Z = (z3z2z1z0).
Note, an adder circuit is trivial and only consists of XOR gates.
Figure 3.1. 4-bit adder over F_{2^4}.
Conceptually, the multiplication Z = A × B (mod P(x)) in F_{2^k} consists of two
steps: first, the polynomial product A × B is computed; then
the result is reduced modulo the irreducible polynomial P(x). The multiplication procedure
is shown in Example 3.8.
Example 3.8 Consider the field F_{2^4}. We take as inputs: A = a_0 + a_1 · α + a_2 · α^2 + a_3 · α^3
and B = b_0 + b_1 · α + b_2 · α^2 + b_3 · α^3, along with the irreducible polynomial P(x) =
x^4 + x^3 + 1. We have to perform the multiplication Z = A × B (mod P(x)). The
coefficients of A = {a_0, ..., a_3}, B = {b_0, ..., b_3} are in F_2 = {0, 1}. Multiplication
can be performed as shown below:
                                    a_3      a_2      a_1      a_0
  ×                                 b_3      b_2      b_1      b_0
  ----------------------------------------------------------------
                                  a_3·b_0  a_2·b_0  a_1·b_0  a_0·b_0
                         a_3·b_1  a_2·b_1  a_1·b_1  a_0·b_1
                a_3·b_2  a_2·b_2  a_1·b_2  a_0·b_2
       a_3·b_3  a_2·b_3  a_1·b_3  a_0·b_3
  ----------------------------------------------------------------
         s_6      s_5      s_4      s_3      s_2      s_1      s_0
The result Sum = s_0 + s_1 · α + s_2 · α^2 + s_3 · α^3 + s_4 · α^4 + s_5 · α^5 + s_6 · α^6, where
s_0 = a_0 · b_0
s_1 = a_0 · b_1 + a_1 · b_0
s_2 = a_0 · b_2 + a_1 · b_1 + a_2 · b_0
s_3 = a_0 · b_3 + a_1 · b_2 + a_2 · b_1 + a_3 · b_0
s_4 = a_1 · b_3 + a_2 · b_2 + a_3 · b_1
s_5 = a_2 · b_3 + a_3 · b_2
s_6 = a_3 · b_3
Here the multiply “·” and add “+” operations are performed modulo 2, so they can
be implemented in a circuit using AND and XOR gates. Note that unlike integer
multipliers, there are no carry-chains in the design, as the coefficients are always reduced
modulo p = 2. However, the result is yet to be reduced modulo the primitive polynomial
P(x) = x^4 + x^3 + 1. This is shown below, where higher degree coefficients are reduced
(mod P(x)).
  s_3  s_2  s_1  s_0
  s_4   0    0   s_4    s_4 · α^4 (mod P(α)) = s_4 · (α^3 + 1)
  s_5   0   s_5  s_5    s_5 · α^5 (mod P(α)) = s_5 · (α^3 + α + 1)
  s_6  s_6  s_6  s_6    s_6 · α^6 (mod P(α)) = s_6 · (α^3 + α^2 + α + 1)
  ------------------
  z_3  z_2  z_1  z_0
The final result (output) of the circuit is: Z = z_0 + z_1 α + z_2 α^2 + z_3 α^3, where
z_0 = s_0 + s_4 + s_5 + s_6; z_1 = s_1 + s_5 + s_6; z_2 = s_2 + s_6; z_3 = s_3 + s_4 + s_5 + s_6.
The above multiplier design is called the Mastrovito multiplier [61], which is the
most straightforward way to design a multiplier over F2k . A logic circuit for a 4-bit
Mastrovito multiplier over finite field F24 is illustrated in Fig. 3.2.
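The two steps of Example 3.8 translate directly into AND/XOR operations on bits. The following Python sketch models the 4-bit multiplier at that level and compares it against a word-level shift-and-add reference. It is an illustration only and not the circuit of Fig. 3.2; the function names (`mastrovito_mul_f16`, `reference_mul`, `to_bits`, `from_bits`) are assumptions introduced here.

```python
from typing import List

P = 0b11001  # P(x) = x^4 + x^3 + 1


def to_bits(v: int) -> List[int]:
    """Integer -> [v0, v1, v2, v3], least significant coefficient first."""
    return [(v >> i) & 1 for i in range(4)]


def from_bits(bits: List[int]) -> int:
    return sum(b << i for i, b in enumerate(bits))


def mastrovito_mul_f16(a: List[int], b: List[int]) -> List[int]:
    """Bit-level multiplier over F_{2^4}: AND for '.', XOR for '+'."""
    # Step 1: partial products, s_n = sum of a_i . b_j over all i + j = n.
    s = [0] * 7
    for i in range(4):
        for j in range(4):
            s[i + j] ^= a[i] & b[j]
    # Step 2: reduce s_4, s_5, s_6 using alpha^4 = alpha^3 + 1, etc.
    z0 = s[0] ^ s[4] ^ s[5] ^ s[6]
    z1 = s[1] ^ s[5] ^ s[6]
    z2 = s[2] ^ s[6]
    z3 = s[3] ^ s[4] ^ s[5] ^ s[6]
    return [z0, z1, z2, z3]


def reference_mul(a: int, b: int, p: int = P, k: int = 4) -> int:
    """Word-level reference: shift-and-add with reduction after each shift."""
    result = 0
    for _ in range(k):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:
            a ^= p
    return result


if __name__ == "__main__":
    ok = all(from_bits(mastrovito_mul_f16(to_bits(a), to_bits(b))) == reference_mul(a, b)
             for a in range(16) for b in range(16))
    print("bit-level multiplier matches reference:", ok)
```

Exhaustive comparison is feasible here only because the field has 16 elements; for the large fields targeted in this dissertation such simulation is out of reach, which is what motivates a formal approach.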
Modular multiplication is at the heart of many public-key cryptosystems, such as
Elliptic Curve Cryptography (ECC) [64]. Due to the very large field size (and hence
the datapath width) used in these cryptosystems, the above Mastrovito multiplier
architecture is inefficient, especially when exponentiation and repeat multiplications are
Figure 3.2. Mastrovito multiplier over F_{2^4}.
multiplication algorithms are used to overcome the complexity of such operations.
These include the Montgomery reduction [65] [48] and the Barrett reduction [46].
Montgomery Reduction: Montgomery reduction (MR) computes:
G = MR(A, B) = A · B · R^{−1} (mod P(x)) (3.8)
where A, B are k-bit inputs, R = α^k, R^{−1} is the multiplicative inverse of R in F_{2^k}
and P(x) is the irreducible polynomial for F_{2^k}. Since Montgomery reduction cannot
directly compute A · B, to compute A · B (mod P(x)), we need to precompute A · R
and B · R, as shown in Figure 3.3.
Each MR block in Figure 3.3 represents a Montgomery reduction step, which is a
hardware implementation of the algorithm shown in Algorithm 1.
Algorithm 1: Montgomery Reduction Algorithm [48]
Input: A(x), B(x) ∈ F_{2^k}; irreducible polynomial P(x).
Output: G(x) = A(x) · B(x) · x^{−k} (mod P(x)).
G(x) := 0
for (i = 0; i ≤ k − 1; ++i) do
    G(x) := G(x) + A_i · B(x)    /* A_i is the ith bit of A */;
    G(x) := G(x) + G_0 · P(x)    /* G_0 is the lowest bit of G */;
    G(x) := G(x)/x               /* Right shift 1 bit */;
end
The design of Fig. 3.3 is overkill for computing just A · B (mod P(x)). However,
when these multiplications are performed repeatedly, such as in iterative squaring,
the Montgomery approach speeds up the computation. As shown in [89], the critical
path delay and gate count of a squarer designed using the Montgomery approach are
much smaller than those of the traditional approaches.
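A software model of Algorithm 1 and of the four-block multiplier arrangement may help in following the data flow. The Python sketch below is illustrative only: elements are integers whose bit i is the coefficient of x^i, the conversion into Montgomery form via R^2 (mod P(x)) is an assumption about how A · R and B · R are obtained, and the names `montgomery_reduce`, `montgomery_multiply`, `poly_mod` and `reference_multiply` are introduced here.

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a(x) divided by p(x) over F_2."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a


def montgomery_reduce(a: int, b: int, p: int, k: int) -> int:
    """Algorithm 1: returns A * B * x^(-k) (mod P); requires P to have constant term 1."""
    g = 0
    for i in range(k):
        if (a >> i) & 1:
            g ^= b        # G := G + A_i * B
        if g & 1:
            g ^= p        # G := G + G_0 * P, which clears the lowest bit
        g >>= 1           # G := G / x
    return g


def montgomery_multiply(a: int, b: int, p: int, k: int) -> int:
    """Four MR steps: convert both inputs, multiply, convert the result back."""
    r2 = poly_mod(1 << (2 * k), p)            # R^2 (mod P), with R = x^k
    ar = montgomery_reduce(a, r2, p, k)       # A * R
    br = montgomery_reduce(b, r2, p, k)       # B * R
    abr = montgomery_reduce(ar, br, p, k)     # A * B * R
    return montgomery_reduce(abr, 1, p, k)    # A * B


def reference_multiply(a: int, b: int, p: int) -> int:
    """Schoolbook carry-less product followed by reduction modulo P."""
    prod, i = 0, 0
    while b >> i:
        if (b >> i) & 1:
            prod ^= a << i
        i += 1
    return poly_mod(prod, p)


if __name__ == "__main__":
    P, K = 0b11001, 4   # x^4 + x^3 + 1
    ok = all(montgomery_multiply(a, b, P, K) == reference_multiply(a, b, P)
             for a in range(16) for b in range(16))
    print("Montgomery multiplier matches reference:", ok)
```

In a chain of multiplications the operands would stay in Montgomery form between steps, so the conversion steps are paid only once at the start and once at the end.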
Barrett Reduction: Barrett reduction is the other widely used multiplier design
Figure 3.3. Montgomery multiplier over F_{2^k}.
the traditional Barrett reduction, proposed in [7], needs a precomputed value of the
reciprocal/inverse of modulus P (x). This precomputation requires extra computational
time and memory space. To overcome this limitation, the recent approach of [46] avoids
such a precomputation of inverses and therefore greatly simplifies the hardware design
implementation. This algorithmic computation is shown in Algorithm 2.
Algorithm 2: Barrett Reduction Without Precomputation Algorithm [46]
Input: R(x) ∈ F_{2^k}; irreducible polynomial P(x) = x^n + Σ_{i=0}^{l} m_i · x^i satisfying l = ⌊n/2⌋, m_i ∈ {0, 1}.




/*Right shift n bit*/;





G1(x) = R(x) (mod x^n)    /* Keep the lower n bits of R(x) */;
G2(x) = P(x) · Q3(x) (mod x^n);
G(x) = G1(x) + G2(x);
Based on Barrett reduction, a multiplier can be designed with two simple steps:
multiplication R = A × B and a subsequent Barrett reduction G = R (mod P ). This
is shown in Figure 3.4. As we can see, a Barrett multiplier is similar to a Mastrovito
multiplier except for the reduction step.
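A software sketch of the two-step Barrett multiplier is given below. It is an illustration under stated assumptions rather than the design of [46]: polynomials over F_2 are encoded as Python integers, and the quotient steps Q1 = R/x^n, Q2 = P · Q1, Q3 = Q2/x^n are assumed here (they follow the usual formulation of the method and are consistent with the G1, G2, G steps of Algorithm 2). The helper names `clmul`, `barrett_reduce`, `barrett_multiply` and `naive_mod` are hypothetical.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two polynomials over F_2 encoded as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def barrett_reduce(r: int, p: int, n: int) -> int:
    """Reduce R(x) (degree <= 2n-2) modulo P(x) = x^n + (terms of degree <= n/2)."""
    mask = (1 << n) - 1
    q1 = r >> n                # Q1 = R / x^n        (assumed step)
    q2 = clmul(p, q1)          # Q2 = P * Q1         (assumed step)
    q3 = q2 >> n               # Q3 = Q2 / x^n       (right shift n bits)
    g1 = r & mask              # G1 = R mod x^n      (keep the lower n bits)
    g2 = clmul(p, q3) & mask   # G2 = P * Q3 mod x^n
    return g1 ^ g2             # G  = G1 + G2


def barrett_multiply(a: int, b: int, p: int, n: int) -> int:
    """Step 1: R = A x B; step 2: G = R mod P by Barrett reduction."""
    return barrett_reduce(clmul(a, b), p, n)


def naive_mod(r: int, p: int, n: int) -> int:
    """Reference reduction: cancel high-order bits one at a time."""
    for i in range(r.bit_length() - 1, n - 1, -1):
        if (r >> i) & 1:
            r ^= p << (i - n)
    return r


if __name__ == "__main__":
    N, P = 4, 0b10011   # P(x) = x^4 + x + 1, so l = 1 <= floor(4/2)
    ok = all(barrett_multiply(a, b, P, N) == naive_mod(clmul(a, b), P, N)
             for a in range(1 << N) for b in range(1 << N))
    print("Barrett multiplier agrees with reference:", ok)
```

The restriction l ≤ ⌊n/2⌋ on P(x) matters: it bounds the degree of the correction term so that two truncated products suffice without a precomputed inverse.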
One of the most influential applications of finite fields is in elliptic curve cryptog-
raphy (ECC). ECC is an approach to public-key cryptography based on the algebraic
structure of elliptic curves over finite fields. The main operations of encryption, de-
cryption and authentication in ECC rely on point multiplications. Point multiplication
involves a series of addition and doubling of points on the elliptic curve. A drawback
of traditional point multiplication is that each point addition and doubling involves a
multiplicative inverse operation over finite fields. Representing the points in projective
coordinate systems [38] eliminates the need for multiplicative inverse operation and




Figure 3.4. Barrett multiplier over F_{2^k}.
we have verified custom designs based on the López-Dahab (LD) coordinate system [52].
Example 3.9 Consider point addition in LD projective coordinates. Given an elliptic curve: Y^2 + XYZ = X^3 Z + aX^2 Z^2 + bZ^4 over F_{2^k}, where X, Y, Z are k-bit vectors that are elements of F_{2^k} and, similarly, a, b are constants from the field. Let (X_1, Y_1, Z_1) + (X_2, Y_2, 1) = (X_3, Y_3, Z_3) represent point addition over the elliptic curve. Then X_3, Y_3, Z_3 can be computed as follows:
A = Y_2 · Z_1^2 + Y_1
B = X_2 · Z_1 + X_1
C = Z_1 · B
D = B^2 · (C + aZ_1^2)
Z_3 = C^2
E = A · C
X_3 = A^2 + D + E
F = X_3 + X_2 · Z_3
G = X_3 + Y_2 · Z_3
Y_3 = E · F + Z_3 · G
Example 3.10 Consider point doubling in the projective coordinate system. Given an elliptic curve: Y^2 + XYZ = X^3 Z + aX^2 Z^2 + bZ^4. Let 2(X_1, Y_1, Z_1) = (X_3, Y_3,






1 · Z_3 + X_3 · (aZ_3 + Y_1^2 + bZ_1^4)
In the above examples, polynomial multiplication and squaring operations are implemented in hardware using Montgomery or Barrett reductions over finite fields F_{2^k}.
The field size for such applications is generally very large; as discussed before, for ECC over F_{2^k}, k = 163 or larger. The large size and complicated arithmetic nature of such circuits clearly show the complexity of the formal verification problem. Contemporary techniques lack the requisite power of abstraction to model and verify such large systems. For this reason, we propose polynomial abstractions over finite fields to model and verify such circuits using computer algebra techniques. This is the subject of subsequent chapters of this dissertation.
CHAPTER 4
COMPUTER ALGEBRA FUNDAMENTALS
This chapter reviews the fundamental concepts of commutative and computer algebra that are utilized in our work. The concepts of polynomial ideals, varieties and Gröbner bases are described with regard to their algorithmic computation. Finally, the results of Hilbert's Nullstellensatz are described, which are employed for verification over finite fields in subsequent chapters. The material is drawn mostly from the textbooks [25] [3].
4.1 Monomials and Their Orderings
Definition 4.1 A monomial in x_1, x_2, ..., x_d is a product of the form:
x_1^{α_1} · x_2^{α_2} · · · x_d^{α_d},    (4.1)
where α_i ≥ 0, i ∈ {1, ..., d}. The total degree of the monomial is α_1 + · · · + α_d.
For simplicity, we will denote a monomial x_1^{α_1} · x_2^{α_2} · · · x_d^{α_d} = x^α, where α = (α_1, ..., α_d), i.e., α ∈ Z^d_{≥0}.
Definition 4.2 A multivariate polynomial f in variables x_1, x_2, ..., x_d with coefficients in a field K is a finite linear combination of monomials with coefficients in K:
f = Σ_α a_α · x^α, a_α ∈ K
The set of all polynomials in x_1, x_2, ..., x_d with coefficients in field K is denoted by K[x_1, x_2, ..., x_d].
Definition 4.3 Let f = Σ_α a_α x^α be a polynomial in K[x_1, x_2, ..., x_d].
1. We refer to the constant a_α ∈ K as the coefficient of the monomial x^α.
2. If a_α ≠ 0, we call a_α x^α a term of f.
As an example, 2x^2 + y is a polynomial with two terms, 2x^2 and y, with 2 and 1 as coefficients, respectively. In contrast, x + y^{-1} is not a polynomial because the exponent of y is less than 0.
A polynomial is a sum of terms, and these terms have to be arranged unambiguously so that they can be manipulated in a consistent manner. Therefore, we need to establish the concept of a monomial ordering (or term ordering). A term ordering, represented by >, defines how terms in a polynomial are ordered. Term orderings are total orders, i.e., antisymmetric, transitive and total, with constant terms last in the ordering. More formally, we have the following definitions:
Definition 4.4 Let T^d = {x^α : α ∈ Z^d_{≥0}} be the set of all monomials in x_1, ..., x_d. A monomial order > on T^d is a total well-ordering satisfying:
• For any x^α ∈ T^d with x^α ≠ 1, x^α > 1
• For all α, β, γ, x^α > x^β ⇒ x^α · x^γ > x^β · x^γ
A total ordering ensures that there is no ambiguity with respect to where a term is found in the term ordering. Total orderings for monomials come in different forms, notably lexicographic ordering (lex) and its variants: degree-lexicographic ordering (deglex) and degree-reverse-lexicographic ordering (degrevlex).
A lexicographic ordering (lex) is a total ordering > such that variables in the terms are lexicographically ordered. Higher variable degrees take precedence over lower degrees (e.g., a^3 = aaa).
Definition 4.5 Lexicographic order: Let x_1 > x_2 > · · · > x_d lexicographically. Also let α = (α_1, ..., α_d); β = (β_1, ..., β_d) ∈ Z^d_{≥0}. Then we have:
x^α > x^β ⇐⇒ starting from the left, the first coordinates α_i, β_i that are different satisfy α_i > β_i    (4.2)
A degree-lexicographic ordering (deglex) is a total ordering > such that the total degree of a term takes precedence over the lexicographic ordering. A degree-reverse-lexicographic ordering (degrevlex) also compares total degrees first. However, ties are broken by reading the exponents in reverse, from the right, and the term with the smaller exponent at the first difference is the larger one.
Definition 4.6 Degree lexicographic order: Let x_1 > x_2 > · · · > x_d lexicographically. Also let α = (α_1, ..., α_d); β = (β_1, ..., β_d) ∈ Z^d_{≥0}. Then we have:
x^α > x^β ⇐⇒ Σ_{i=1}^{d} α_i > Σ_{i=1}^{d} β_i, or Σ_{i=1}^{d} α_i = Σ_{i=1}^{d} β_i and x^α > x^β w.r.t. lex order    (4.3)
Definition 4.7 Degree reverse lexicographic order: Let x_1 > x_2 > · · · > x_d lexicographically. Also let α = (α_1, ..., α_d); β = (β_1, ..., β_d) ∈ Z^d_{≥0}. Then we have:
x^α > x^β ⇐⇒ Σ_{i=1}^{d} α_i > Σ_{i=1}^{d} β_i, or Σ_{i=1}^{d} α_i = Σ_{i=1}^{d} β_i and the first coordinates α_i, β_i from the right, which are different, satisfy α_i < β_i    (4.4)
As a consequence of these term orderings, we have the following relations, where a > b > c.
lex: a^2b > a^2 > abc > ab > ac^2 > ac > b^2c > b^2 > bc^3 > 1    (4.5)
deglex: bc^3 > a^2b > abc > ac^2 > b^2c > a^2 > ab > ac > b^2 > 1    (4.6)
degrevlex: bc^3 > a^2b > abc > b^2c > ac^2 > a^2 > ab > b^2 > ac > 1    (4.7)
The difference between the lex and the two degree-based orderings is obvious, while the difference between the two degree-based orderings can be seen by considering from which direction the term is lexed, e.g., ac^2 > b^2c (deglex, left-to-right) versus b^2c > ac^2 (degrevlex, right-to-left).
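The three orderings can be reproduced with simple sort keys on exponent tuples. The sketch below is illustrative only; the key functions and the `MONOMIALS` table are hypothetical names, and each monomial in a, b, c is written as the tuple of its exponents (deg_a, deg_b, deg_c).

```python
def lex_key(exps):
    """lex: compare exponents from the left."""
    return tuple(exps)


def deglex_key(exps):
    """deglex: total degree first, then lex."""
    return (sum(exps), tuple(exps))


def degrevlex_key(exps):
    """degrevlex: total degree first, then from the right, smaller exponent wins."""
    return (sum(exps), tuple(-e for e in reversed(exps)))


# Exponent tuples (deg_a, deg_b, deg_c) with a > b > c.
MONOMIALS = {
    "a^2b": (2, 1, 0), "a^2": (2, 0, 0), "abc": (1, 1, 1), "ab": (1, 1, 0),
    "ac^2": (1, 0, 2), "ac": (1, 0, 1), "b^2c": (0, 2, 1), "b^2": (0, 2, 0),
    "bc^3": (0, 1, 3), "1": (0, 0, 0),
}


def ordered(key):
    """Return the monomial names from largest to smallest under the given key."""
    return sorted(MONOMIALS, key=lambda name: key(MONOMIALS[name]), reverse=True)


if __name__ == "__main__":
    for name, key in [("lex", lex_key), ("deglex", deglex_key),
                      ("degrevlex", degrevlex_key)]:
        print(f"{name}: " + " > ".join(ordered(key)))
```

Negating the reversed exponents in `degrevlex_key` is what turns "first difference from the right, smaller exponent is larger" into an ordinary tuple comparison.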
Example 4.1 Let f = 2x^2yz + 3xy^3 − 2x^3. Effects of different term orderings on f are shown below:
• lex x > y > z: f = −2x^3 + 2x^2yz + 3xy^3
• deglex x > y > z: f = 2x^2yz + 3xy^3 − 2x^3
• degrevlex x > y > z: f = 3xy^3 + 2x^2yz − 2x^3
Definition 4.8 The leading term is the first term in a term-ordered polynomial. Likewise, the leading coefficient is the coefficient of the leading term. Finally, the leading monomial (or leading power product) is the leading term without its coefficient. We use the following notation:
lt(f): Leading Term    (4.8)
lc(f): Leading Coefficient    (4.9)
lm(f): Leading Monomial    (4.10)
Example 4.2
f = 3a^2b + 2ab + 4bc    (4.11)
lt(f) = 3a^2b    (4.12)
lc(f) = 3    (4.13)
lm(f) = a^2b    (4.14)
4.2 Varieties and Ideals
In verification applications, it is often required to analyze (the presence or absence
of) solutions to a given system of constraints. In our applications, these constraints are
polynomials and their solutions are described as varieties.
Definition 4.9 Let K be a field, and let f_1, ..., f_s ∈ K[x_1, x_2, ..., x_d]. We call V(f_1, ..., f_s) the affine variety defined by f_1, ..., f_s as:
V(f_1, ..., f_s) = {(a_1, ..., a_d) ∈ K^d : f_i(a_1, ..., a_d) = 0, ∀i, 1 ≤ i ≤ s}.    (4.15)
V(f_1, ..., f_s) ⊆ K^d is the set of all solutions of the system of equations: f_1(x_1, ..., x_d) = · · · = f_s(x_1, ..., x_d) = 0.
Example 4.3 Given R[x, y], V(x^2 + y^2) = {(0, 0)}. Similarly, in R[x, y], V(x^2 + y^2 − 1) = {all points on the circle: x^2 + y^2 − 1 = 0}. However, varieties depend on which field we are operating on. For the same polynomial x^2 + 1, we have:
• In R[x], V(x^2 + 1) = ∅.
• In C[x], V(x^2 + 1) = {±i}.
The above example shows that a variety can be infinite, finite (nonempty) or empty. It is interesting to note that we will be operating over finite fields F_q, and any finite set of points is a variety. Consider the points {(a_1, ..., a_d) : a_1, ..., a_d ∈ F_q} in F_q^d. Any single point is the variety of some polynomial system: e.g., (a_1, ..., a_d) is the variety of x_1 − a_1 = x_2 − a_2 = · · · = x_d − a_d = 0. Moreover, finite unions and finite intersections of varieties are also varieties. Let U = V(f_1, ..., f_s) and W = V(g_1, ..., g_t). Then:
• U ∩ W = V(f_1, ..., f_s, g_1, ..., g_t)
• U ∪ W = V(f_i g_j : 1 ≤ i ≤ s, 1 ≤ j ≤ t)
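Over a finite field these statements can be checked by brute-force enumeration. The sketch below is an illustration under simplifying assumptions: it uses the prime field F_5 (so that integer arithmetic modulo 5 is field arithmetic) rather than F_{2^k}, two variables, and hypothetical names `variety`, `f` and `g`.

```python
from itertools import product

Q = 5  # a small prime, so arithmetic modulo Q is arithmetic in the field F_5


def variety(polys, d=2, q=Q):
    """Return all points of F_q^d on which every polynomial in polys vanishes."""
    return {pt for pt in product(range(q), repeat=d)
            if all(f(*pt) % q == 0 for f in polys)}


def f(x, y):
    return x * x - y      # the parabola y = x^2


def g(x, y):
    return x - 2          # the vertical line x = 2


U, W = variety([f]), variety([g])

# Intersection: put both generating sets together.
intersection = variety([f, g])

# Union: take the products f_i * g_j (here a single product).
union = variety([lambda x, y: f(x, y) * g(x, y)])

# A single point is the variety of x1 - a1 = x2 - a2 = 0.
point = variety([lambda x, y: x - 3, lambda x, y: y - 1])

if __name__ == "__main__":
    print("U ∩ W:", sorted(intersection), intersection == U & W)
    print("U ∪ W:", sorted(union), union == U | W)
    print("point:", point)
```

The union rule relies on the absence of zero divisors in a field: a product vanishes at a point exactly when one of its factors does.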
Another important concept related to varieties is that the variety depends not just
on the given system of polynomial equations, but rather on the ideal generated by the
polynomials.
Definition 4.10 A subset I ⊂ K[x1, x2, . . . , xd] is an ideal if it satisfies:
• 0 ∈ I
• I is closed under addition: x, y ∈ I ⇒ x+ y ∈ I
• If x ∈ K[x1, x2, . . . , xd] and y ∈ I , then x · y ∈ I as well as y · x ∈ I .
Any ideal is generated by its basis or generators.
Definition 4.11 Let f1, f2, . . . , fs be the given elements of K[x1, x2, . . . , xd]. Let I be
an ideal in K[x1, x2, . . . , xd]. If:
I = {g1f1 + g2f2 + . . .+ gsfs : g1, . . . , gs ∈ K[x1, x2, . . . , xd]} (4.16)
then, f1, . . . , fs are called the basis (or generators) of the ideal I and correspondingly
I is denoted as I = 〈f1, f2, . . . , fs〉.
Example 4.4 The set of even integers, which is a subset of the ring of integers Z, forms an ideal of Z. This can be seen from the following:
• 0 belongs to the set of even integers.
• The sum of two even integers x and y is always an even integer.
• The product of any integer x with an even integer y is always an even integer.
Example 4.5 Given R[x, y], I = ⟨x, y⟩ is an ideal containing all polynomials generated by x and y, such as x^2 + y, x · y + x. J = ⟨x^2, y^2⟩ is an ideal containing all polynomials generated by x^2 and y^2, such as x^2 + y^2, x^2 · y^2 + x^{10}. Notice that I ≠ J because x + y belongs to I but cannot be generated by x^2 and y^2.
Any ideal may have many different bases. For instance, it is possible to have
different sets of polynomials {f1, . . . , fs} and {g1, . . . , gt} that may generate the same
ideal, i.e., 〈f1, . . . , fs〉 = 〈g1, . . . , gt〉. Since variety depends on the ideal, these sets of
polynomials have the same solutions.
Proposition 4.1 If f1, . . . , fs and g1, . . . , gt are bases of the same ideal in K[x1, . . . , xd],
so that 〈f1, . . . , fs〉 = 〈g1, . . . , gt〉, then V (f1, . . . , fs) = V (g1, . . . , gt).
Example 4.6 Consider the two bases F_1 = {2x^2 + 3y^2 − 11, x^2 − y^2 − 3} and F_2 = {x^2 − 4, y^2 − 1}. These two bases generate the same ideal, i.e., ⟨F_1⟩ = ⟨F_2⟩. Therefore, they represent the same variety, i.e.,
V(F_1) = V(F_2) = {(±2, ±1)}.    (4.17)
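The equality of the two ideals in Example 4.6 can be checked by writing each generator of one basis as a combination of the generators of the other. The derivation below is a worked check, assuming a coefficient field such as Q or R in which 5 is invertible (the example does not name the field).

```latex
\begin{align*}
2x^2 + 3y^2 - 11 &= 2\,(x^2 - 4) + 3\,(y^2 - 1), \\
x^2 - y^2 - 3    &= (x^2 - 4) - (y^2 - 1), \\[4pt]
x^2 - 4 &= \tfrac{1}{5}\bigl[(2x^2 + 3y^2 - 11) + 3\,(x^2 - y^2 - 3)\bigr], \\
y^2 - 1 &= \tfrac{1}{5}\bigl[(2x^2 + 3y^2 - 11) - 2\,(x^2 - y^2 - 3)\bigr].
\end{align*}
```

The first two lines show ⟨F_1⟩ ⊆ ⟨F_2⟩ and the last two show ⟨F_2⟩ ⊆ ⟨F_1⟩; the common zeros are then read off from x^2 = 4 and y^2 = 1.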
An important fundamental problem that we need to solve is one of ideal membership
testing.
Definition 4.12 Let f, f1, . . . , fs be polynomials in K[x1, . . . , xd]. Let ideal I = 〈f1, . . . , fs〉 ⊂
K[x1, . . . , xd]. If f can be written as f = f1h1 + · · ·+ fshs, then we say f is a member
of the ideal I .
Our verification problems are formulated as ideal membership testing. For this purpose, we require a decision procedure to unequivocally decide ideal membership. A Gröbner basis provides such a decision procedure, and this is described in the next section.
4.3 Gröbner Bases
As mentioned above, different generating sets may constitute the same ideal. However, some generating sets may be better than others, that is, they may be a better representation of the ideal. A Gröbner basis is one such ideal representation that has many important properties that allow us to solve many polynomial decision questions. By analyzing the Gröbner basis, one can deduce the presence or absence of solutions (varieties), find the dimension of the varieties and also deduce ideal membership. A Gröbner basis, in essence, is a canonical representation of an ideal. Buchberger's work [17] laid the foundation for computing a Gröbner basis of an ideal. This section provides a synopsis of some of these concepts.
Among many equivalent definitions of Gröbner bases, we start with the definition that can best describe the properties of Gröbner bases:
Definition 4.13 A set of non-zero polynomials G = {g_1, ..., g_t} contained in an ideal I is called a Gröbner basis for I if and only if for all f ∈ I such that f ≠ 0, there exists i ∈ {1, ..., t} such that lm(g_i) divides lm(f).
G = GröbnerBasis(I) ⇐⇒ ∀f ∈ I : f ≠ 0, ∃g_i ∈ G : lm(g_i) | lm(f)    (4.18)
Given a set of polynomials F = {f_1, ..., f_s} that generate ideal I = ⟨f_1, ..., f_s⟩, Buchberger gives an algorithm to compute a Gröbner basis G = {g_1, ..., g_t}. This algorithm relies on the notions of S-polynomials and polynomial reduction, which are described below.
Definition 4.14 For a field K, f, g ∈ K[x_1, ..., x_d], L = lcm(lt(f), lt(g)), an S-polynomial Spoly(f, g) is defined as:
Spoly(f, g) = (L / lt(f)) · f − (L / lt(g)) · g    (4.19)
Note, lcm denotes least common multiple.
Definition 4.15 The reduction of a polynomial f, by another polynomial g, to a reduced polynomial r is denoted:
f \xrightarrow{g} r
Reduction is carried out using multivariate polynomial long division.
For sets of polynomials, the notation
f \xrightarrow{F}_+ r
represents the reduced polynomial r resulting from f as reduced by a set of non-zero polynomials F = {f_1, ..., f_s}. The polynomial r is considered reduced if r = 0 or no term in r is divisible by lm(f_i) for any f_i ∈ F.
For all intents and purposes, the reduction process f \xrightarrow{F}_+ r, of dividing a polynomial f by a set of polynomials F, can be modeled as repeated long division of f by each of the polynomials in F until no further reductions can be made; the result is r, as shown in Algorithm 3.
The division algorithm keeps cancelling the leading terms of polynomials until no more leading terms can be cancelled. So the key step is p = p − lt(p)/lt(f_i) · f_i, as the following example shows.
Example 4.7 Given f_1 = y^2 − x and f_2 = y − x in Q[x, y] with deglex: y > x. Then f_1/f_2 = f_1 − lt(f_1)/lt(f_2) · f_2 = y^2 − x − (y^2/y) · (y − x) = y · x − x. Then y · x − x can be further divided by f_2: (y · x − x)/f_2 = x^2 − x, which is the final result.
Algorithm 3: Polynomial Division
Input: f, f_1, ..., f_s
Output: r, a_1, ..., a_s, such that f = a_1 · f_1 + · · · + a_s · f_s + r.
a_1 = a_2 = · · · = a_s = 0; r = 0;
p := f;
while p ≠ 0 do
    i = 1;
    divisionmark = false;
    while i ≤ s && divisionmark = false do
        if lt(f_i) divides lt(p) then
            a_i = a_i + lt(p)/lt(f_i);
            p = p − lt(p)/lt(f_i) · f_i;
            divisionmark = true;
        else
            i = i + 1;
        end
    end
    if divisionmark = false then
        r = r + lt(p);
        p = p − lt(p);
    end
end
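The division procedure can be sketched in Python as follows. This is a minimal illustration, not an efficient implementation: a polynomial is a dictionary from exponent tuples to exact rational coefficients, the term order is fixed to deglex, and the names `poly`, `leading`, `sub_multiple` and `divide` are hypothetical. The usage at the end replays Example 4.7, with exponent tuples written as (deg_y, deg_x) so that y > x.

```python
from fractions import Fraction


def deglex(m):
    """Sort key for deglex: total degree first, then lex on the exponent tuple."""
    return (sum(m), m)


def poly(terms):
    """Build a polynomial {exponent tuple: coefficient} with exact coefficients."""
    return {m: Fraction(c) for m, c in terms.items() if c != 0}


def leading(p):
    """Return (leading monomial, leading coefficient) of a non-zero polynomial."""
    m = max(p, key=deglex)
    return m, p[m]


def sub_multiple(p, coeff, shift, f):
    """Return p - coeff * x^shift * f, dropping terms that cancel."""
    out = dict(p)
    for m, c in f.items():
        mm = tuple(a + b for a, b in zip(m, shift))
        out[mm] = out.get(mm, 0) - coeff * c
        if out[mm] == 0:
            del out[mm]
    return out


def divide(f, divisors):
    """Return (quotients, remainder) of f divided by the ordered list of divisors."""
    quotients = [dict() for _ in divisors]
    r, p = {}, dict(f)
    while p:
        m, c = leading(p)
        for a_i, f_i in zip(quotients, divisors):
            m_i, c_i = leading(f_i)
            if all(x >= y for x, y in zip(m, m_i)):      # lt(f_i) divides lt(p)
                shift = tuple(x - y for x, y in zip(m, m_i))
                a_i[shift] = a_i.get(shift, 0) + c / c_i
                p = sub_multiple(p, c / c_i, shift, f_i)
                break
        else:                                            # no divisor applies
            r[m] = c                                     # move lt(p) to the remainder
            del p[m]
    return quotients, r


# Example 4.7: exponent tuples are (deg_y, deg_x).
f1 = poly({(2, 0): 1, (0, 1): -1})   # y^2 - x
f2 = poly({(1, 0): 1, (0, 1): -1})   # y - x
quotients, r = divide(f1, [f2])

if __name__ == "__main__":
    print("quotient :", quotients[0])   # y + x
    print("remainder:", r)              # x^2 - x
```

The result satisfies f_1 = (y + x) · f_2 + (x^2 − x), in agreement with the remainder x^2 − x stated in Example 4.7.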
We now present Buchberger's Algorithm [17] for computing Gröbner bases.
Algorithm 4: Buchberger's Algorithm
Input: F = {f_1, ..., f_s}, such that I = ⟨f_1, ..., f_s⟩
Output: G = {g_1, ..., g_t}, a Gröbner basis of I
G := F;
repeat
    G′ := G;
    for each pair {f_i, f_j}, i ≠ j in G′ do
        Spoly(f_i, f_j) \xrightarrow{G′}_+ r;
        if r ≠ 0 then
            G := G ∪ {r};
        end
    end
until G = G′;
For Gröbner basis computation, a monomial (term) ordering is fixed to ensure that polynomials are manipulated in a consistent manner. Buchberger's algorithm then takes pairs of polynomials (f_i, f_j) in the basis G and combines them into "S-polynomials" (Spoly(f_i, f_j)) to cancel leading terms. The S-polynomial is then reduced (divided) by all elements of G to a remainder r, denoted as Spoly(f_i, f_j) \xrightarrow{G}_+ r. Multivariate polynomial division is used for this reduction step. This process is repeated for all unique pairs of polynomials, including those created by newly added elements, until no new polynomials are generated, ultimately constructing the Gröbner basis.
Example 4.8 Consider the ideal I ⊂ Q[x, y], I = ⟨f_1, f_2⟩, where f_1 = yx − y, f_2 = y^2 − x. Assume a degree-lexicographic term ordering with y > x is imposed.
First, we need to compute Spoly(f_1, f_2) = y · f_1 − x · f_2 = x^2 − y^2. Then, we conduct a polynomial reduction x^2 − y^2 \xrightarrow{f_2} x^2 − x \xrightarrow{f_1} x^2 − x. Let f_3 = x^2 − x. Then, G is updated as {f_1, f_2, f_3}. Next, we compute Spoly(f_1, f_3) = 0. So there is no new polynomial generated. Similarly, we compute Spoly(f_2, f_3) = x · y^2 − x^3, followed by x · y^2 − x^3 \xrightarrow{f_1} y^2 − x^3 \xrightarrow{f_2} x − x^3 \xrightarrow{f_3}_+ 0. Again, no polynomial is generated. Finally, G = {f_1, f_2, f_3}.
A Gröbner basis now gives a decision procedure to test for membership in an ideal.
Theorem 4.1 Let G = {g_1, ..., g_t} be a Gröbner basis for an ideal I ⊂ K[x_1, ..., x_d] and let f ∈ K[x_1, ..., x_d]. Then, f ∈ I if and only if the remainder on division of f by G is zero.
In other words,
f ∈ I ⇐⇒ f \xrightarrow{G}_+ 0    (4.20)
Example 4.9 Consider Example 4.8. Let f = y^2x − x be another polynomial. Note that f = y f_1 + f_2, so f ∈ I. If we divide f by f_1 first and then by f_2, we will obtain a zero remainder. However, since the set {f_1, f_2} is not a Gröbner basis, we find that the reduction f \xrightarrow{f_2} x^2 − x \xrightarrow{f_1} x^2 − x ≠ 0; i.e., dividing f by f_2 first and then by f_1 does not lead to a zero remainder. However, if we compute the Gröbner basis G of I, G = {x^2 − x, yx − y, y^2 − x}, dividing f by polynomials in G in any order will always lead to the zero remainder. Therefore, one can decide ideal membership unequivocally using the Gröbner basis.
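Examples 4.8 and 4.9 can be replayed with a small, unoptimized Python sketch of Buchberger's algorithm. It is illustrative only: polynomials are dictionaries from exponent tuples (deg_y, deg_x) to exact rational coefficients, the term order is fixed to deglex with y > x, and all function names are hypothetical. Production tools use far more refined pair-selection and reduction strategies.

```python
from fractions import Fraction
from itertools import combinations


def deglex(m):
    return (sum(m), m)


def leading(p):
    """Return (leading monomial, leading coefficient) under deglex."""
    m = max(p, key=deglex)
    return m, p[m]


def add_multiple(p, coeff, shift, f):
    """Return p + coeff * x^shift * f, dropping terms that cancel."""
    out = dict(p)
    for m, c in f.items():
        mm = tuple(a + b for a, b in zip(m, shift))
        out[mm] = out.get(mm, 0) + coeff * c
        if out[mm] == 0:
            del out[mm]
    return out


def remainder(f, G):
    """Remainder of f on division by the ordered list G."""
    r, p = {}, dict(f)
    while p:
        m, c = leading(p)
        for g in G:
            mg, cg = leading(g)
            if all(x >= y for x, y in zip(m, mg)):
                shift = tuple(x - y for x, y in zip(m, mg))
                p = add_multiple(p, -c / cg, shift, g)
                break
        else:
            r[m] = p.pop(m)
    return r


def s_poly(f, g):
    """Spoly(f, g) = L/lt(f) * f - L/lt(g) * g, L the lcm of the leading monomials."""
    (mf, cf), (mg, cg) = leading(f), leading(g)
    lcm = tuple(max(a, b) for a, b in zip(mf, mg))
    s = add_multiple({}, 1 / cf, tuple(a - b for a, b in zip(lcm, mf)), f)
    return add_multiple(s, -1 / cg, tuple(a - b for a, b in zip(lcm, mg)), g)


def buchberger(F):
    """Repeat over all pairs until no non-zero remainder appears."""
    G = [dict(f) for f in F]
    while True:
        G_prev = list(G)
        for f, g in combinations(G_prev, 2):
            r = remainder(s_poly(f, g), G_prev)
            if r and r not in G:
                G.append(r)
        if len(G) == len(G_prev):
            return G


one = Fraction(1)
f1 = {(1, 1): one, (1, 0): -one}   # yx - y
f2 = {(2, 0): one, (0, 1): -one}   # y^2 - x
G = buchberger([f1, f2])           # expected to add f3 = x^2 - x

f = {(2, 1): one, (0, 1): -one}    # y^2 x - x = y*f1 + f2

if __name__ == "__main__":
    print("basis size:", len(G))
    print("f mod G       :", remainder(f, G))          # zero remainder
    print("f mod [f2, f1]:", remainder(f, [f2, f1]))   # x^2 - x, not zero
```

Dividing by the non-Gröbner generating set in the order f_2, f_1 leaves x^2 − x, while division by G gives zero, which is the point of Example 4.9.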
Definition 4.16 A minimal Gröbner basis for a polynomial ideal I is a Gröbner basis G for I such that
• lc(g_i) = 1, ∀g_i ∈ G
• ∀g_i ∈ G, lt(g_i) ∉ ⟨lt(G − {g_i})⟩
A minimal Gröbner basis is a Gröbner basis such that no leading term of any element in G divides the leading term of another element in G. A minimal Gröbner basis can be computed by removing any polynomial whose leading term can be divided by that of another polynomial in a given Gröbner basis.
A minimal Gröbner basis can be further reduced.
Definition 4.17 A reduced Gröbner basis for a polynomial ideal I is a Gröbner basis G = {g_1, ..., g_t} such that:
• lc(g_i) = 1, ∀g_i ∈ G
• ∀g_i ∈ G, no monomial of g_i lies in ⟨lt(G − {g_i})⟩
G is a reduced Gröbner basis when no monomial of any element in G is divisible by the leading term of another element.
For a given monomial ordering, the reduced Gröbner basis is a canonical representation of the ideal, as given by Proposition 4.2 below.
Proposition 4.2 Let I ≠ {0} be a polynomial ideal. Then, for a given monomial ordering, I has a unique reduced Gröbner basis.
4.4 Hilbert's Nullstellensatz
In this section, we further describe some correspondence between ideals and varieties in the context of algebraic geometry. The celebrated results of Hilbert's Nullstellensatz establish such correspondences, and these results, together with Gröbner bases, provide a basis for our verification solutions.
Definition 4.18 A field \overline{K} is an algebraically closed field if every polynomial in one variable with degree at least 1, with coefficients in \overline{K}, has a root in \overline{K}.
In other words, any nonconstant polynomial equation over \overline{K}[x] always has at least one root in \overline{K}. Every field K is contained in an algebraically closed one \overline{K}. For example, the field of reals R is not an algebraically closed field, because x^2 + 1 = 0 has no root in R. However, x^2 + 1 = 0 has roots in the field of complex numbers C, which is an algebraically closed field. In fact, C is the algebraic closure of R. Every algebraically closed field is an infinite field.
Theorem 4.2 [Weak Nullstellensatz] Let \overline{K} be an algebraically closed field and let I ⊂ \overline{K}[x_1, x_2, ..., x_d] be an ideal satisfying V(I) = ∅. Then, I = \overline{K}[x_1, x_2, ..., x_d], or equivalently,
V(I) = ∅ ⇐⇒ I = \overline{K}[x_1, x_2, ..., x_d] = ⟨1⟩    (4.21)
Corollary 4.1 Let I = ⟨f_1, ..., f_s⟩ ⊂ \overline{K}[x_1, x_2, ..., x_d]. Let G be the reduced Gröbner basis of I. Then, V(I) = ∅ ⇐⇒ G = {1}.
The Weak Nullstellensatz offers a way to evaluate whether or not the system of multivariate polynomial equations (ideal I) has common solutions in \overline{K}^d. For this purpose, we only need to check if the ideal is generated by the unit element, i.e., 1 ∈ I. This approach can be used to evaluate the feasibility of constraints in our verification problems. Another interesting result that we will employ is the Strong Nullstellensatz, to describe which we need the concepts of "ideals of varieties" and radicals.
Let K be any field and let a = (a_1, ..., a_d) ∈ K^d be a point, and f ∈ K[x_1, ..., x_d] be a polynomial. We say that f vanishes on a if f(a) = 0, i.e., a is in the variety of f.
Definition 4.19 For any variety V of K^d, the ideal of polynomials that vanish on V, called the vanishing ideal of V, is defined as I(V) = {f ∈ K[x_1, ..., x_d] : ∀a ∈ V, f(a) = 0}.
Proposition 4.3 If a polynomial f vanishes on a variety V , then, f ∈ I(V ).
Example 4.10 Let ideal J = ⟨x^2, y^2⟩. Then, V(J) = {(0, 0)}. All polynomials in J will obviously agree with the solution and vanish on this variety. However, the polynomials x, y are not in J but they also vanish on this variety. Therefore, I(V(J)) is the set of all polynomials that vanish on V(J), and the polynomials x, y are members of I(V(J)).
Definition 4.20 Let J ⊂ K[x_1, ..., x_d] be an ideal. The radical of J is defined as \sqrt{J} = {f ∈ K[x_1, ..., x_d] : ∃m ∈ N, f^m ∈ J}.
Example 4.11 Let J = ⟨x^2, y^2⟩ ⊂ K[x, y]. Note, neither x nor y belongs to J, but they belong to \sqrt{J}. Similarly, x · y ∉ J, but since (x · y)^2 = x^2 · y^2 ∈ J, therefore, x · y ∈ \sqrt{J}.
If J = \sqrt{J}, then J is said to be a radical ideal. Moreover, I(V) is a radical ideal. The Strong Nullstellensatz establishes the correspondence between radical ideals and varieties.
Theorem 4.3 (Strong Nullstellensatz [3]) Let K be an algebraically closed field, and let J be an ideal in K[x_1, ..., x_d]. Then I(V_K(J)) = \sqrt{J}.
For verification, we have to analyze constraints corresponding to the circuit functionality. Solutions to these constraints are viewed as varieties and the constraints themselves are analyzed as polynomial ideals. Since the Nullstellensatz defines the correspondences between ideals and varieties, the verification problems are modeled using the Nullstellensatz. These are subsequently solved using Gröbner basis techniques. Since the Nullstellensatz applies over algebraically closed fields, and finite fields are not algebraically closed, our approach requires modifications to suit our problems, as described in the next chapter.




This chapter describes our approach to the problem of formal verification of hardware implementations of arithmetic circuits over finite fields of the type F_{2^k}, using a computer-algebra/algebraic-geometry-based approach. Given a specification polynomial f and a circuit C, we have to prove that the circuit C correctly implements f. Otherwise, we have to generate a counterexample that excites the bug in the design. The arithmetic circuit is modeled as a polynomial system in F_{2^k}[x_1, x_2, ..., x_d] and the verification problem is formulated using the Strong Nullstellensatz over finite fields as a membership test in a corresponding (radical) ideal. This requires the computation of a Gröbner basis, which is computationally expensive. To overcome this limitation, we analyze the circuit topology and derive a term order to represent the polynomials. Subsequently, using the theory of Gröbner bases over finite fields, we prove that this term order renders the set of polynomials itself a Gröbner basis of this (radical) ideal, thus significantly enhancing verification efficiency. Using our approach, we can verify the correctness of, and detect bugs in, up to 163-bit circuits in F_{2^{163}}, corresponding to the NIST-specified ECC standard. In contrast, contemporary approaches, including SAT, SMT, BDD and AIG-based techniques, are infeasible.
5.1 Problem Statement
The following is our problem statement:
• Given a finite field F_{2^k}, i.e., given k (the datapath size), along with the corresponding irreducible polynomial P(x), let P(α) = 0, i.e., let α be a root of P(x).
• Given a word-level specification polynomial S = F(A_1, A_2, ..., A_n) (mod P(x)), where each A_i represents a word-level k-bit input; S, A_1, A_2, ..., A_n ∈ F_{2^k}; F is a function describing the input-output relation.
• Given a gate-level combinational circuit C, the bit-level primary inputs of the circuit are {a^j_0, a^j_1, ..., a^j_{k−1}}, for j = 1, ..., n; the primary outputs are {z_0, ..., z_{k−1}} = Z. Here a^j_i, z_i ∈ F_2, i = 0, ..., k − 1.
• The word-level and bit-level correspondences are the following:
A_1 = a^1_0 + a^1_1 α + · · · + a^1_{k−1} α^{k−1} = (a^1_{k−1} · · · a^1_1 a^1_0),
⋮
A_n = a^n_0 + a^n_1 α + · · · + a^n_{k−1} α^{k−1} = (a^n_{k−1} · · · a^n_1 a^n_0),
and the primary outputs are related as:
Z = z_0 + z_1 α + z_2 α^2 + · · · + z_{k−1} α^{k−1} = (z_{k−1} · · · z_2 z_1 z_0).
Our goal is to formally prove that ∀A_j ∈ F_{2^k}, the circuit output Z correctly implements the specification S = F(A_1, A_2, ..., A_n) (mod P(x)) over F_{2^k}. Otherwise, we have to produce a counterexample that excites the bug in the design.
Example 5.1 Consider the verification problem instance for a multiplier circuit over F_{2^k}.
• Given the finite field F_{2^k} and the corresponding irreducible polynomial P(x), let P(α) = 0.
• Given a word-level multiplier specification polynomial S = A · B (mod P(x)), where A, B, S ∈ F_{2^k} (k-bit vectors), function F corresponds to the multiplication operation: A · B (mod P(x)).
• Given a gate-level combinational circuit, the bit-level primary inputs of the circuit are {a_0, ..., a_{k−1}, b_0, ..., b_{k−1}}, and {z_0, ..., z_{k−1}} are the primary outputs;
• A = a_0 + a_1 α + a_2 α^2 + · · · + a_{k−1} α^{k−1}, B = b_0 + b_1 α + b_2 α^2 + · · · + b_{k−1} α^{k−1} and Z = z_0 + z_1 α + · · · + z_{k−1} α^{k−1}.
We need to check whether the circuit implementation matches the specification, i.e., whether S = Z, ∀a_i, b_i.
Our approach is generic enough to verify the implementation of any combinational
finite field arithmetic circuit against the given polynomial specification. Without loss
of generality and for the purpose of exposition of our proposed approach, we use finite
field multiplier circuits for our verification objective, as they form the core of most
computations and are notoriously hard to verify.
5.2 Verification Setup and Polynomial Modeling
Our verification setup is depicted in Fig. 5.1. Given the specification polynomial
S = A · B (mod P (x)), and the circuit implementation with A,B as inputs and Z as
output, we want to verify the property S = Z over F2k .
Specification: Given two k-bit inputs in bit-vector form A = (a_{k−1} a_{k−2} · · · a_1 a_0) and B = (b_{k−1} b_{k−2} · · · b_1 b_0), the specification can be modeled in polynomial form in F_{2^k} as follows:
A = a_0 + a_1 · α + · · · + a_{k−1} · α^{k−1}
B = b_0 + b_1 · α + · · · + b_{k−1} · α^{k−1}
S = A · B (mod P(x))
Implementation: Given a gate-level circuit netlist, we map the gate-level Boolean operators (AND, OR, NOT, XOR) to polynomials over F_2 (⊂ F_{2^k}) using the following one-to-one mapping B → F_2:
¬a → a + 1 (mod 2)
a ∨ b → a + b + a · b (mod 2)
a ∧ b → a · b (mod 2)
a ⊕ b → a + b (mod 2)
(5.1)
where a, b ∈ F_2 = {0, 1}. Note that the equation c = F(a, b) is written in polynomial form as c − F(a, b) = c + F(a, b), as −1 ≡ +1 (mod 2).
Figure 5.1. The verification setup.
Example 5.2 Consider the equation with Boolean operators:
z = a ⊕ (b ∨ c).
The equation modeled over F_2 is:
z + a + b + c + b · c = 0.
The left-hand side expression is a polynomial in F_2[a, b, c, z] ⊂ F_{2^k}[a, b, c, z]:
z + a + b + c + b · c
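The mapping of Equation 5.1 can be checked against the Boolean truth tables by enumeration. The sketch below is illustrative; the names `NOT`, `OR`, `AND`, `XOR` and `example_5_2` are hypothetical, and arithmetic in F_2 is modeled as integer arithmetic modulo 2.

```python
from itertools import product


def NOT(a):
    return (a + 1) % 2


def OR(a, b):
    return (a + b + a * b) % 2


def AND(a, b):
    return (a * b) % 2


def XOR(a, b):
    return (a + b) % 2


def example_5_2(a, b, c, z):
    """The polynomial z + a + b + c + b*c of Example 5.2, evaluated in F_2."""
    return (z + a + b + c + b * c) % 2


# Each polynomial agrees with the Boolean operator on every input.
gates_ok = all(
    NOT(a) == 1 - a and OR(a, b) == (a | b)
    and AND(a, b) == (a & b) and XOR(a, b) == (a ^ b)
    for a, b in product((0, 1), repeat=2)
)

# The polynomial vanishes exactly on the assignments with z = a XOR (b OR c).
example_ok = all(
    (example_5_2(a, b, c, z) == 0) == (z == a ^ (b | c))
    for a, b, c, z in product((0, 1), repeat=4)
)

if __name__ == "__main__":
    print("gate mapping matches truth tables:", gates_ok)
    print("Example 5.2 polynomial matches   :", example_ok)
```

The second check states the modeling principle used throughout: the zeros of the gate polynomial are exactly the consistent input-output assignments of the gate.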
Therefore, we can transform the entire circuit implementation into polynomials over F_{2^k}. Let Z denote the word-level result of the circuit.
The Verification Property: The property S = Z is modeled as a polynomial f : S + Z = 0 over F_{2^k}. Overall, our verification constraints can be modeled as a polynomial system as follows:
f_1(x_1, x_2, ..., x_d) = 0
⋮
f_A : A + a_0 + a_1 · α + · · · + a_{k−1} · α^{k−1} = 0
f_B : B + b_0 + b_1 · α + · · · + b_{k−1} · α^{k−1} = 0
⋮
f : S + Z = 0
Property: S = Z ?
Example 5.3 Consider a 2-bit multiplier over F_{2^2}, where P(x) = x^2 + x + 1, as given in Figure 5.2. Variables a_0, a_1, b_0, b_1 are primary inputs, z_0, z_1 are primary outputs and c_0, c_1, c_2, c_3, r_0 are intermediate variables. The gate ⊗ corresponds to an AND gate, i.e., bit-level multiplication modulo 2. The gate ⊕ corresponds to an XOR gate, i.e., addition modulo 2.
The circuit can be described using the following Boolean equations:
c_0 = a_0 ∧ b_0,
c_1 = a_0 ∧ b_1,
c_2 = a_1 ∧ b_0,
c_3 = a_1 ∧ b_1,
r_0 = c_1 ⊕ c_2,
z_0 = c_0 ⊕ c_3,
z_1 = r_0 ⊕ c_3.
Figure 5.2. A 2-bit multiplier over F_{2^2}.
With the mapping rules given in Equation 5.1, the above equations are transformed into the following polynomials:
c_0 + a_0 · b_0,
c_1 + a_0 · b_1,
c_2 + a_1 · b_0,
c_3 + a_1 · b_1,
r_0 + c_1 + c_2,
z_0 + c_0 + c_3,
z_1 + r_0 + c_3.
Therefore, our overall polynomial system is:
f_1 : c_0 + a_0 · b_0
f_2 : c_1 + a_0 · b_1
f_3 : c_2 + a_1 · b_0
f_4 : c_3 + a_1 · b_1
f_5 : r_0 + c_1 + c_2
f_6 : z_0 + c_0 + c_3
f_7 : z_1 + r_0 + c_3
f_A : A + a_0 + a_1 · α
f_B : B + b_0 + b_1 · α
f_spec : S + A · B    (specification)
f : S + Z
Property to verify: S = Z ?
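For a circuit this small, the property S = Z can also be checked by exhaustive simulation, which gives a useful sanity check on the polynomial model. The sketch below does exactly that and nothing more; it is not the ideal-membership method developed in this chapter, and the function names `circuit` and `spec` are hypothetical. Word-level values are encoded as integers with bit i the coefficient of α^i.

```python
from itertools import product


def circuit(a0, a1, b0, b1):
    """Gate-level model of the 2-bit multiplier of Example 5.3."""
    c0, c1, c2, c3 = a0 & b0, a0 & b1, a1 & b0, a1 & b1
    r0 = c1 ^ c2
    z0 = c0 ^ c3
    z1 = r0 ^ c3
    return z0, z1


def spec(a0, a1, b0, b1):
    """Word-level S = A * B mod P(x), with P(x) = x^2 + x + 1."""
    A, B = a0 | (a1 << 1), b0 | (b1 << 1)
    prod = 0
    for i in range(2):             # carry-less product of A and B
        if (B >> i) & 1:
            prod ^= A << i
    if prod & 0b100:               # alpha^2 = alpha + 1
        prod ^= 0b111
    return prod & 1, (prod >> 1) & 1


mismatches = [bits for bits in product((0, 1), repeat=4)
              if circuit(*bits) != spec(*bits)]

if __name__ == "__main__":
    print("inputs on which S != Z:", mismatches)
```

Enumeration needs 2^{2k} evaluations, which is why it is out of the question for k = 163 and why the algebraic formulation is needed.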
With the polynomial model given above, we formulate our problem as (radical)
ideal membership testing, which is described next.
5.3 Verification Formulation as Ideal Membership Testing
To formulate our verification test, we first analyze the circuit and model the Boolean gate-level operators as polynomials over F_2 (⊂ F_{2^k}), as given by the mappings of Equation 5.1. To this set, we then append the polynomials corresponding to the word-level specification. Let {f_1, f_2, ..., f_s} denote this set of polynomials derived from both specification and implementation. Let {x_1, x_2, ..., x_d} denote all the variables in the polynomial system. As a consequence, {f_1, f_2, ..., f_s} ⊂ F_{2^k}[x_1, ..., x_d]. Let J = ⟨f_1, ..., f_s⟩ ⊂ F_{2^k}[x_1, ..., x_d] denote the ideal generated by these polynomials. Our verification property S = Z is also modeled as a polynomial f : S + Z ∈ F_{2^k}[x_1, ..., x_d].
To prove that the specification matches the implementation (J = ⟨f_1, ..., f_s⟩), we need to check whether f : S + Z = 0 agrees with all the solutions of J over the field F_{2^k}. In computer algebra terminology, we need to check whether or not f vanishes on the variety V_{F_{2^k}}(J), where V_{F_{2^k}}(J) denotes the variety of ideal J over the given field F_{2^k}. This is because if f(p) = 0 for all points (solutions) p ∈ V_{F_{2^k}}(J), then S + Z = 0, i.e., S = Z, holds on every solution. On the other hand, if f(p) ≠ 0 for some point p, then p corresponds to a bug in the design.
Now if f vanishes on V_{F_{2^k}}(J), according to Proposition 4.3, we know that f should be a member of the radical ideal I(V_{F_{2^k}}(J)). Therefore, our verification test can be modeled as membership testing of f in the (radical) ideal I(V_{F_{2^k}}(J)). To solve this problem, we need to first derive the generators of I(V_{F_{2^k}}(J)) (note that we are only





The Strong Nullstellensatz establishes correspondences between ideals and their radicals. As given in Theorem 4.3, I(V_K(J)) = \sqrt{J}, where the variety V is taken over the algebraically closed field K. Finite fields are, however, not algebraically closed, as shown by the following result from [62]:
Theorem 5.1 Let F_{2^n} and F_{2^m} be finite fields such that n divides m. Then F_{2^n} ⊂ F_{2^m}.
Therefore, F_2 ⊂ F_{2^2} ⊂ F_{2^4} ⊂ F_{2^8} ⊂ . . . ; and F_2 ⊂ F_{2^3} ⊂ F_{2^6} . . . ; and so on. The algebraic closure of F_{2^k} is known to be an infinite field obtained as the union of all such finite fields.
Therefore, the Nullstellensatz needs to be suitably modified for application over finite fields. We revisit the notion of vanishing polynomials for this purpose.
Over the finite field F_{2^k}, any element A satisfies the property A^{2^k} − A = 0. Therefore, the polynomial x^{2^k} − x vanishes at all points in F_{2^k}, and x^{2^k} − x is called the vanishing polynomial of the field. As a consequence, the variety V(x^{2^k} − x) = F_{2^k}. Over the multivariate polynomial ring F_{2^k}[x_1, ..., x_d], V(x_1^{2^k} − x_1, ..., x_d^{2^k} − x_d) is (F_{2^k})^d.
In the sequel, we use the following notation: Let J0 = 〈x2k1 − x1, . . . , x2kd − xd〉
denote the ideal of vanishing polynomials over F2k . Also, if J = 〈f1, . . . , fs〉 then,
the sum of ideals J + J0 = 〈f1, . . . , fs, x2k1 − x1, . . . , x2kd − xd〉. Let F2k denote the
algebraic closure of F2k .
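The vanishing property can be checked exhaustively for a small field. The sketch below assumes $k = 3$ and the irreducible polynomial $x^3 + x + 1$ (a choice made here for illustration, not taken from the text); `gf_mul` and `gf_pow` are hypothetical helper names.

```python
K = 3
MODULUS = 0b1011          # x^3 + x + 1, assumed irreducible polynomial for F_8
Q = 2 ** K

def gf_mul(a, b):
    """Multiply two elements of F_{2^K} encoded as K-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> K) & 1:
            a ^= MODULUS
    return result

def gf_pow(a, n):
    """Compute a^n by repeated multiplication (n >= 1)."""
    result = a
    for _ in range(n - 1):
        result = gf_mul(result, a)
    return result

# x^(2^k) - x evaluates to zero at every point of the field ...
vanishes_everywhere = all((gf_pow(a, Q) ^ a) == 0 for a in range(Q))
# ... whereas the lower-degree polynomial x^2 - x does not.
points_where_x2_minus_x_is_nonzero = [a for a in range(Q) if (gf_mul(a, a) ^ a) != 0]

print(vanishes_everywhere, points_where_x2_minus_x_is_nonzero)
```

In characteristic 2, subtraction coincides with addition, which is why the code uses XOR for both.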

























As a consequence of the above lemma, the variety of any ideal $J$ over a finite field $\mathbb{F}_{2^k}$ can be equivalently analyzed over its algebraic closure $\overline{\mathbb{F}_{2^k}}$ by just appending to $J$ all the vanishing polynomials $J_0$. These vanishing polynomials do not change the zero-set





(J + J0)) =
√
J + J0.









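J0)). According to Strong Nullstellensatz, $I(V_{\overline{\mathbb{F}_{2^k}}}(J + J_0)) = \sqrt{J + J_0}$. Therefore,
$I(V_{\mathbb{F}_{2^k}}(J)) = I(V_{\overline{\mathbb{F}_{2^k}}}(J + J_0)) = \sqrt{J + J_0}$ (5.2)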
Lemma 5.3 Let $J$ be any arbitrary polynomial ideal in $\mathbb{F}_{2^k}[x_1, \ldots, x_d]$ and $J_0$ be the corresponding vanishing ideal. Then, $J + J_0$ is radical. In other words, $\sqrt{J + J_0} = J + J_0$.
Proof. This is a well-known result, a proof of which is given in [36].
Putting together the above results, we finally arrive at the following application of Nullstellensatz over finite fields.
Theorem 5.2 [Strong Nullstellensatz in Finite Fields] Let $J \subset \mathbb{F}_{2^k}[x_1, x_2, \cdots, x_d]$ be an ideal and $J_0$ be the ideal of vanishing polynomials. Then,
$I(V_{\mathbb{F}_{2^k}}(J)) = J + J_0 = J + \langle x_1^{2^k} - x_1, x_2^{2^k} - x_2, \cdots, x_d^{2^k} - x_d \rangle$ (5.3)





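$I(V_{\mathbb{F}_{2^k}}(J)) = I(V_{\overline{\mathbb{F}_{2^k}}}(J + J_0)) = \sqrt{J + J_0} = J + J_0$ (5.4)
where $J_0 = \langle x_1^{2^k} - x_1, x_2^{2^k} - x_2, \cdots, x_d^{2^k} - x_d \rangle$.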
Overall Verification Problem Formulation: Through Strong Nullstellensatz over finite fields, given an ideal $J$, we can directly construct the ideal $I(V_{\mathbb{F}_{2^k}}(J)) = J + J_0$. For our verification problem, we take the polynomials $\{f_1, \ldots, f_s\}$ representing the circuit constraints and the specification polynomials to generate ideal $J$. Then, we append the vanishing polynomials $\{x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ of ideal $J_0$. Our verification problem can now be formulated as testing whether the verification property polynomial $f$ is in $J + J_0$. If $f \in (J + J_0)$, correctness of the circuit is established. Otherwise, there is a bug in the design. To test if $f \in (J + J_0)$, it is required to compute a Gröbner basis $G$ of the ideal $J + J_0$. Then, we reduce $f$ w.r.t. $G$: i.e., $f \xrightarrow{G}_+ r$. If $r = 0$, then the circuit is correct; otherwise, there is a bug in the design.
Example 5.4 Let us reconsider Example 5.3. First, polynomials are extracted from the circuit implementation and the specification, as shown in Example 5.3. These polynomials represent the ideal $J$. Along with the ideal $J_0 = \langle x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d \rangle$, the following polynomials represent $J + J_0$ for the multiplier circuit.
Implementation ($\subset J$):
$f_1 : c_0 + a_0 \cdot b_0$
$f_2 : c_1 + a_0 \cdot b_1$
$f_3 : c_2 + a_1 \cdot b_0$
$f_4 : c_3 + a_1 \cdot b_1$
$f_5 : r_0 + c_1 + c_2$
$f_6 : z_0 + c_0 + c_3$
$f_7 : z_1 + r_0 + c_3$
$f_Z : Z + z_0 + z_1 \cdot \alpha$
Specification ($\subset J$):
$f_A : A + a_0 + a_1 \cdot \alpha$
$f_B : B + b_0 + b_1 \cdot \alpha$
$f_{spec} : S + A \cdot B = 0$
Vanishing polynomials ($J_0$):
$a_0^2 - a_0,\ a_1^2 - a_1,\ b_0^2 - b_0,\ b_1^2 - b_1$
$c_0^2 - c_0,\ c_1^2 - c_1,\ c_2^2 - c_2,\ c_3^2 - c_3$
$r_0^2 - r_0,\ z_0^2 - z_0,\ z_1^2 - z_1$
Now we need to compute the Gröbner basis $G$ of this ideal $J + J_0$. Once the computation of $G$ is completed, we simply need a polynomial reduction to test whether $f : S + Z$ can be reduced by $G$. In other words, we need to test whether $S + Z \xrightarrow{G}_+ 0$.
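For a circuit this small, the statement that $f : S + Z$ vanishes on the variety of $J$ can also be checked by brute force, which gives a useful sanity check on the polynomial model. The sketch below enumerates all input assignments of the 2-bit multiplier and compares the circuit output $Z$ with the specification $S = A \cdot B$ over $\mathbb{F}_{2^2}$ (with $\alpha^2 = \alpha + 1$). The function names and the injected wire-swap bug are hypothetical and serve only as an illustration; this is not the Gröbner basis method itself.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply in F_4 = F_2[x]/(x^2 + x + 1); elements are 2-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:
            a ^= 0b111
    return result

def multiplier(a0, a1, b0, b1, buggy=False):
    """Gate-level multiplier from the example; returns Z = z0 + z1*alpha."""
    c0, c1, c2, c3 = a0 & b0, a0 & b1, a1 & b0, a1 & b1
    r0 = c1 ^ c2
    # Hypothetical bug: wire c3 swapped with c2 at the z0 gate.
    z0 = (c0 ^ c2) if buggy else (c0 ^ c3)
    z1 = r0 ^ c3
    return z0 | (z1 << 1)

def counterexamples(buggy=False):
    """Input assignments p with f(p) = S + Z != 0."""
    bad = []
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        S = gf4_mul(a0 | (a1 << 1), b0 | (b1 << 1))
        Z = multiplier(a0, a1, b0, b1, buggy)
        if (S ^ Z) != 0:
            bad.append((a0, a1, b0, b1))
    return bad

print(counterexamples(), counterexamples(buggy=True))
```

Exhaustive enumeration needs $2^{2k}$ evaluations and is therefore only feasible for toy sizes, which is precisely why the ideal membership formulation is needed for large $k$.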
While our approach seems reasonably simple, the complexity of Gröbner basis computation can make verification infeasible.
Complexity of Gröbner Basis Over Finite Fields: For our specific problem of computing a Gröbner basis for $J + J_0$ over $\mathbb{F}_q$, the following result is known [36]:
Theorem 5.3 Let $I = \langle f_1, \ldots, f_s, x_1^q - x_1, \ldots, x_d^q - x_d \rangle \subset \mathbb{F}_q[x_1, \ldots, x_d]$ be an ideal over any finite field $\mathbb{F}_q$. The time and space complexity of Buchberger's algorithm to compute a Gröbner basis of $I$ is bounded by $q^{O(d)}$, assuming that the length of input $f_1, \ldots, f_s$ is dominated by $q^{O(d)}$.
In our case $q = 2^k$, and when $k$ and $d$ are large, this complexity makes verification infeasible. In what follows, we show that a variable/term order can be derived by analyzing the circuit topology, which makes the set of polynomials $\{f_1, \ldots, f_s, x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ itself a Gröbner basis of $J + J_0$, thus obviating the need to apply Buchberger's algorithm.
5.4 Obviating Buchberger’s Algorithm
Just as variable orderings play a critical role in constructing BDDs and solving SAT feasibly, the Gröbner basis computation is also highly susceptible to the term orderings imposed on the polynomials. Therefore, a key step to improve/avoid the high complexity of Gröbner basis computation is to derive a “good” term order.
Buchberger's work [17] initially laid the foundation for computing Gröbner bases. Subsequently, many improvements were introduced to improve the efficiency of Buchberger's algorithm. Two of the most important improvements are the chain and product criteria. For our particular circuit verification application, we exploit the product criterion.
Lemma 5.4 [Product Criterion [18]] Let $\mathbb{F}$ be any field, and $f, g \in \mathbb{F}[x_1, \cdots, x_d]$ be polynomials. If the equality $lm(f) \cdot lm(g) = LCM(lm(f), lm(g))$ holds, then $Spoly(f, g) \xrightarrow{G}_+ 0$.
The above result states that when the leading monomials of $f, g$ are relatively prime, then $Spoly(f, g)$ always reduces to 0 modulo $G$. Thus, $Spoly(f, g)$ need not be considered in Buchberger's algorithm. Modern computer algebra engines perform this check to avoid unnecessary $Spoly(f, g)$ computations. If we could analyze the given circuit and derive a term order such that every polynomial pair $(f, g)$ in the generating set has relatively prime leading monomials, then for all S-polynomials, the subsequent reduction would not add any new polynomials to the basis. In other words, $Spoly(f, g) \xrightarrow{G}_+ 0$ for all pairs $f, g$. Consequently, the polynomials $\{f_1, \ldots, f_s\}$ extracted from the circuit (corresponding ideal $J$) and represented using such a term order would themselves constitute a Gröbner basis of $J$. In [88], the authors derive exactly such a term order, and a similar concept can be applied in our case.
Note that in our case:
• since the circuit constraints $\{f_1, \ldots, f_s\}$ are modeled as polynomials in $\mathbb{F}_2 \subset \mathbb{F}_{2^k}$, they contain only multilinear monomial terms;
• the output of a gate is uniquely computed, and it always appears as a “single variable term” in the polynomials;
• the circuit is acyclic.
Let $x_i$ be the output variable of any gate $H_i$ in the circuit, and let $x_{p_1}, \ldots, x_{p_j}$ denote variables that are the inputs to the gate $H_i$. If we can represent the polynomials $f_i$ such that $x_i >$ every monomial in the variables $x_{p_1}, \ldots, x_{p_j}$, then all pairs $(f_i, f_j)$, $i \neq j$, have relatively prime leading monomials and $\{f_1, \ldots, f_s\}$ is a Gröbner basis.
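The product criterion check itself is simple to state in code: two monomials are relatively prime exactly when they share no variable. The following sketch assumes the 2-bit multiplier signals ($c_0, \ldots, c_3, r_0, z_0, z_1$ as gate outputs, $a_0, a_1, b_0, b_1$ as inputs) with $q = 4$; the dictionary encoding of monomials and all identifiers are illustrative assumptions. It lists the pairs of leading monomials that are not relatively prime, first among the circuit polynomials alone and then after the vanishing polynomials of $J_0$ are appended.

```python
from itertools import combinations

def relatively_prime(m1, m2):
    """Monomials are dicts {variable: exponent}; coprime iff no shared variable."""
    return not (set(m1) & set(m2))

GATE_OUTPUTS = ["c0", "c1", "c2", "c3", "r0", "z0", "z1"]
PRIMARY_INPUTS = ["a0", "a1", "b0", "b1"]
Q = 4  # q = 2^k with k = 2

# Leading monomial of each gate polynomial f_i = x_i + tail(f_i) is x_i.
circuit_lm = {("f", v): {v: 1} for v in GATE_OUTPUTS}
# Leading monomial of each vanishing polynomial x^q - x is x^q.
vanishing_lm = {("vanishing", v): {v: Q} for v in GATE_OUTPUTS + PRIMARY_INPUTS}

def non_coprime_pairs(leading_monomials):
    """Pairs of polynomials for which the product criterion does not apply."""
    return [(p, q) for p, q in combinations(leading_monomials, 2)
            if not relatively_prime(leading_monomials[p], leading_monomials[q])]

pairs_in_circuit = non_coprime_pairs(circuit_lm)
pairs_with_vanishing = non_coprime_pairs({**circuit_lm, **vanishing_lm})

print(pairs_in_circuit)
print(pairs_with_vanishing)
```

Among the circuit polynomials no pair needs an S-polynomial; with $J_0$ appended, the only remaining pairs are those that couple a gate polynomial with the vanishing polynomial of its own output variable.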
Proposition 5.1 Let $C$ be any arbitrary combinational circuit. Let $\{x_1, \ldots, x_d\}$ denote the set of all variables (signals) in the circuit, i.e., the primary input, intermediate and primary output variables. Perform a reverse topological traversal of the circuit and order the variables such that $x_i > x_j$ if $x_i$ appears earlier in the reverse topological order. Impose a lex term order to represent the Boolean expression for each gate as a polynomial $f_i$; then, $f_i = x_i + tail(f_i)$. Then, the set of all polynomials $\{f_1, \ldots, f_s\}$ forms a Gröbner basis, as $lt(f_i)$ and $lt(f_j)$ for $i \neq j$ are relatively prime.
Example 5.5 Consider the circuit of Figure 5.3. Variables $a_0, a_1, b_0, b_1$ are primary inputs, $z_0, z_1$ are primary outputs and $c_0, c_1, c_2, c_3, r_0$ are intermediate variables. We perform a reverse topological traversal of the circuit. Starting from the primary outputs, traverse the circuit to the primary inputs, and order the gates according to their (reverse) topological levels. The primary outputs $z_0, z_1$ are both at level-0, variables $r_0, c_0, c_3$ are at level-1, $c_1, c_2$ are at level-2 and the primary inputs $a_0, a_1, b_0, b_1$ are at level-3. We order the variables $\{z_0 > z_1\} > \{r_0 > c_0 > c_3\} > \{c_1 > c_2\} > \{a_0 > a_1 > b_0 > b_1\}$. Using this variable order, we impose a lex term order on the polynomials:
$c_0 + a_0 \cdot b_0$, $lm = c_0$;
$c_1 + a_0 \cdot b_1$, $lm = c_1$;
$c_2 + a_1 \cdot b_0$, $lm = c_2$;
$c_3 + a_1 \cdot b_1$, $lm = c_3$;
$r_0 + c_1 + c_2$, $lm = r_0$;
$z_0 + c_0 + c_3$, $lm = z_0$;
$z_1 + r_0 + c_3$, $lm = z_1$
Figure 5.3. A 2-bit multiplier over $\mathbb{F}_{2^2}$. The gate ⊗ corresponds to AND-gate, i.e., bit-level multiplication modulo 2. The gate ⊕ corresponds to XOR-gate, i.e., addition modulo 2.
In our overall problem formulation, we also have variables $A, B, S, Z \in \mathbb{F}_{2^k}$. They can also be accommodated in this term order by imposing $S > Z > A > B > z_0 > z_1 > r_0 > c_0 > c_3 > c_1 > c_2 > a_0 > a_1 > b_0 > b_1$.
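The level assignment used in this example can be computed mechanically. The sketch below is one possible way to do so, assuming that the level of a signal is the length of the longest path from a primary output to it, which is consistent with the levels listed in the example. The netlist dictionary and the function name are hypothetical; the simple recursive relaxation is adequate for a small acyclic netlist but is not tuned for large circuits.

```python
# Gate output -> gate inputs, read off the gate polynomials of the 2-bit multiplier.
FANIN = {
    "c0": ("a0", "b0"), "c1": ("a0", "b1"),
    "c2": ("a1", "b0"), "c3": ("a1", "b1"),
    "r0": ("c1", "c2"), "z0": ("c0", "c3"), "z1": ("r0", "c3"),
}
PRIMARY_OUTPUTS = ("z0", "z1")

def reverse_topological_levels(fanin, outputs):
    """Level of a signal = longest distance from any primary output."""
    level = {}

    def visit(signal, depth):
        if level.get(signal, -1) >= depth:
            return                      # already placed at least this deep
        level[signal] = depth
        for source in fanin.get(signal, ()):
            visit(source, depth + 1)

    for out in outputs:
        visit(out, 0)
    return level

levels = reverse_topological_levels(FANIN, PRIMARY_OUTPUTS)
# Variables at smaller levels are larger in the term order.
variable_order = sorted(levels, key=levels.get)

print(levels)
print(" > ".join(variable_order))
```

The order of variables within one level is arbitrary; what matters for the term order is that every gate output precedes all of its inputs.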
Thus, using the result of Proposition 5.1, the set of polynomials $\{f_1, \ldots, f_s\}$ is a Gröbner basis for $J$. Note that $\{x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ is a Gröbner basis for $J_0$. However, we have to compute a Gröbner basis of $J + J_0 = \langle f_1, \ldots, f_s, x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d \rangle$. Not all polynomial pairs in $\{f_1, \ldots, f_s, x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ have relatively prime leading monomials.
Consider an arbitrary polynomial $f_i \in J$. Using our term order, we have $f_i = x_i + tail(f_i)$; i.e., the leading monomial of $f_i$ is a single variable term $x_i$. Clearly, the pairs $(x_i + tail(f_i),\ x_i^{2^k} - x_i)$, with $f_i \in J$ and $x_i^{2^k} - x_i \in J_0$, do not have relatively prime leading monomials. In fact, the pairs $(x_i + tail(f_i),\ x_i^{2^k} - x_i)$ are the only ones to be considered for Gröbner basis computation, as all other pairs have relatively prime leading terms. This motivated us to investigate further the question “what is the result of the reduction $Spoly(x_i + tail(f_i),\ x_i^{2^k} - x_i) \xrightarrow{J, J_0}_+ r$”. We state and prove the following:
Theorem 5.4 Let $q = 2^k$, and let $\mathbb{F}_q[x_1, \ldots, x_d]$ be a ring on which we have a monomial order $>$. Let $I$ be a subset of $\{1, \ldots, d\}$. For all $i \in I$, let $f_i = x_i + P_i$ (where $P_i = tail(f_i)$) such that all indeterminates $x_j$ that appear in $P_i$ satisfy $x_i > x_j$. Then, the set $G = \{f_i : i \in I\} \cup \{x_1^q - x_1, \ldots, x_d^q - x_d\}$ is a Gröbner basis.
Proof. According to Buchberger's Theorem (Theorem 1.7.4 in [3]), we need to show that for all $f, g \in G$, $Spoly(f, g) \xrightarrow{G}_+ 0$. Let $G_1 = \{f_i : i \in I\}$. Lemma 5.4 shows that if $f, g \in G$ have relatively prime leading terms, then $Spoly(f, g) \xrightarrow{G}_+ 0$. So the only case where Lemma 5.4 does not apply is when $f = x_i + P_i$ and $g = x_i^q - x_i$. Then, $Spoly(f, g) = x_i^{q-1} f - g = P_i x_i^{q-1} + x_i$. In what follows, it is important to note that the indeterminates appearing in $P_i$ are all less than $x_i$.
Reducing by $f = x_i + P_i$, we have $P_i x_i^{q-1} + x_i \xrightarrow{x_i + P_i} P_i^2 x_i^{q-2} + x_i$.
Next, $P_i^2 x_i^{q-2} + x_i - P_i^2 x_i^{q-3}(x_i + P_i) = P_i^3 x_i^{q-3} + x_i$. Continuing in this fashion, we obtain
$P_i x_i^{q-1} + x_i \xrightarrow{x_i + P_i} P_i^2 x_i^{q-2} + x_i \xrightarrow{x_i + P_i} P_i^3 x_i^{q-3} + x_i \xrightarrow{x_i + P_i} \cdots \xrightarrow{x_i + P_i} P_i^q + x_i \xrightarrow{x_i + P_i} P_i^q - P_i$.
Over the finite field $\mathbb{F}_q$, $P_i^q - P_i$ is a vanishing polynomial. Therefore, $P_i^q - P_i \in I(V(J_0)) = \langle x_1^q - x_1, \ldots, x_d^q - x_d \rangle$. By Lemma 5.4, $G_0 = \{x_1^q - x_1, \ldots, x_d^q - x_d\}$ is a Gröbner basis. Therefore, $P_i^q - P_i \xrightarrow{G_0}_+ 0$, which gives that $P_i^q - P_i \xrightarrow{G}_+ 0$, as $G_0 \subset G$.
In conclusion, $\forall f, g \in G$, $Spoly(f, g) \xrightarrow{G}_+ 0$ and hence, $G$ is a Gröbner basis.
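As a worked instance of the reduction chain used in this proof, consider the smallest non-trivial case $k = 2$, so that $q = 4$. Writing $x$ for $x_i$ and $P$ for $P_i$, and using that $-1 = 1$ in characteristic 2, each step below cancels the current leading term by subtracting a suitable multiple of $f = x + P$. The choice $q = 4$ is made here purely for illustration.

```latex
\begin{align*}
\mathrm{Spoly}(f, g) &= x^{3}(x + P) - (x^{4} - x) = P x^{3} + x \\
P x^{3} + x - P x^{2}(x + P) &= P^{2} x^{2} + x \\
P^{2} x^{2} + x - P^{2} x (x + P) &= P^{3} x + x \\
P^{3} x + x - P^{3}(x + P) &= P^{4} + x \\
P^{4} + x - (x + P) &= P^{4} - P
\end{align*}
```

The final remainder $P^4 - P$ contains only variables smaller than $x$ and evaluates to zero at every point of $\mathbb{F}_4$, so it reduces to 0 modulo the vanishing polynomials.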
As a consequence of Theorem 5.4, the Gröbner basis $G$ for our verification instance (ideal $J + J_0$) can be obtained directly by construction using a reverse topological traversal of the circuit. While $G$ is indeed a Gröbner basis, it is neither minimal nor reduced. We now show that this basis can actually be made minimal by considering the vanishing ideal of only the primary inputs of the given circuit.
Corollary 5.1 Let $q = 2^k$ and $\mathbb{F}_q[x_1, \ldots, x_d]$ be the ring on which we impose the monomial order $>$ obtained via Proposition 5.1. Let $I$ be a subset of $\{1, \ldots, d\}$. For all $i \in I$, let $f_i = x_i + P_i$ (where $P_i = tail(f_i)$) such that all indeterminates $x_j$ that appear in $P_i$ satisfy $x_i > x_j$. Let $X_{PI}$ denote the set of all primary input variables of the circuit. Then, the set $G = \{f_i : i \in I\} \cup \{x_{pi}^2 - x_{pi}\}$ is a minimal Gröbner basis, where $x_{pi} \in X_{PI}$.
Proof. According to Definition 4.16 of a minimal Gröbner basis, two conditions have to be satisfied: i) all polynomials in the basis are monic, i.e., their leading coefficient is 1; and ii) the leading monomial of any polynomial does not divide the leading monomial of any other polynomial in the basis. We have already shown that $G$ is a Gröbner basis. Moreover, every polynomial in $G$ has leading coefficient 1: the circuit polynomials have the form $f_i = x_i + tail(f_i)$, and the vanishing polynomials have leading term $x_i^{2^k}$. Therefore, all polynomials are monic.
Furthermore, our ideal basis $G$ consists of two sets of polynomials: i) polynomials derived from the circuit, which are of the form $f_i = x_i + tail(f_i)$; and ii) the vanishing polynomials $x_i^{2^k} - x_i$ for $i = 1, \ldots, d$. Our term order ensures that in $f_i = x_i + tail(f_i)$, $x_i$ corresponds to either the primary output variables or the intermediate variables. Primary input variables ($x_i \in X_{PI}$) will never occur as leading terms of $f_i$ because a primary input is not an output of any gate in the circuit. Therefore, $\forall x_i \in (\{x_1, \ldots, x_d\} - X_{PI})$, there always exists $f_i$ with $lm(f_i) = x_i$, which divides the leading monomial of the vanishing polynomial $x_i^{2^k} - x_i$. In such cases, $x_i^{2^k} - x_i$, $x_i \notin X_{PI}$, can be removed from the basis. By eliminating all vanishing polynomials corresponding to non-primary-input variables, we obtain $G = \{f_i : i \in I\} \cup \{x_{pi}^{2^k} - x_{pi}\}$ as a minimal Gröbner basis, where $x_{pi} \in X_{PI}$.
Finally, since $x_{pi} \in \mathbb{F}_2 \subset \mathbb{F}_{2^k}$ satisfies $x_{pi}^2 - x_{pi} = 0$, we obtain $G = \{f_i\} \cup \{x_{pi}^2 - x_{pi}\}$ as the minimal Gröbner basis.
While we can obtain a minimal Gröbner basis $G$ directly by construction, unfortunately, we cannot obtain a reduced Gröbner basis without actually performing the reduction. This is because in a reduced Gröbner basis, the tail ($tail(f_i)$) of every polynomial $f_i$ is also reduced w.r.t. $lt(f_j)$, for all $i \neq j$. However, a reduced Gröbner basis computation is not necessary for ideal membership testing.
5.5 Our Overall Approach
We set up the verification problem in $\mathbb{F}_{2^k}[x_1, \ldots, x_d]$, on which we impose the monomial order $>$ as derived above. We extract the set of polynomials $G_1 = \{f_1, \ldots, f_s\}$ from the circuit. We generate the set $G_0 = \{x_{pi}^{2^k} - x_{pi}\}$, $\forall x_{pi} \in X_{PI}$. Then, the set $G = G_1 \cup G_0$ forms a minimal Gröbner basis of the ideal $J + J_0 = \langle f_1, \ldots, f_s, x_{pi}^{2^k} - x_{pi} \rangle$. We take our specification polynomial $f$ and compute $f \xrightarrow{G}_+ r$. If $r = 0$, then $f \in J + J_0$ and the circuit is correct; otherwise, if $r \neq 0$, then we have a bug in the design.
Moreover, if $r \neq 0$, then the monomial order ensures that $r$ contains only the primary input variables. To show this, assume that $r \neq 0$ and $r$ contains either an intermediate or a primary output variable $x_j$. As there always exists a polynomial $f_j$ in $G$ with $lm(f_j) = x_j$, $r$ can be further reduced by $f_j$. Continuing in this fashion, all the terms with non-primary-input (intermediate or primary output) variables can be eliminated. Finally, in the presence of a bug, any assignment to the (primary-input) variables that makes $r \neq 0$ provides a counter-example for debugging. A SAT or SMT solver can find such an assignment quickly, as $r$ has already been simplified by the Gröbner basis reduction. Our results therefore obviate the need to construct a Gröbner basis, and the verification can be performed only by reduction: $f \xrightarrow{G}_+ r$.
Our overall approach is described in Algorithm 5. It first inputs the given circuit implementation as Boolean equations. Each equation is then transformed to polynomials $G_1$ using Equations 5.1. All polynomials are then normalized into a sum-of-terms form using the distributive law: $A \cdot (B + C) = A \cdot B + A \cdot C$. Subsequently, our verification problem is formulated as radical ideal membership testing. We conduct a reverse topological traversal of the circuit to generate the variable ordering. Then, we append vanishing polynomials $G_0 = \{x^2 + x\}$ for all $x \in$ primary inputs. Finally, we compute the reduction of $f$ (property polynomial) modulo $G_1 \cup G_0$. If the reduction result is $r = 0$, the circuit is correct. If there are bugs in the implementation, then the result $r$ is a polynomial that encodes all input vector assignments that excite the bug(s) in the design.
Algorithm 5: Proposed Verification Algorithm
Input: Circuit Implementation Equations Z.
       Specification Polynomial S.
Output: True if S = Z. Bug polynomial r if S ≠ Z.
for (i = 0; i < number of eqns; i++) do
    /* Each equation is transformed to polynomials */
    poly[i] = Eqn-to-Poly(eqn[i]);

/* Obtain circuit-based variable order */
ordered_var = T_Traversal(newpoly);

    return Bug polynomial r;
end
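Because every circuit polynomial has the form $f_i = x_i + tail(f_i)$ under the circuit-based term order, one division step by $f_i$ amounts to substituting $tail(f_i)$ for $x_i$. The following Python sketch applies this idea to the 2-bit multiplier over $\mathbb{F}_{2^2}$. It is a simplified illustration under stated assumptions, not the SINGULAR-based implementation: polynomials are stored as dictionaries from sets of bit variables to $\mathbb{F}_4$ coefficients, the rule $x^2 = x$ for bit variables is built into the monomial representation, the word-level variables $S, Z, A, B$ are eliminated up front, and all function names as well as the injected wire-swap bug are hypothetical.

```python
ALPHA = 0b10  # alpha in F_4 = F_2[x]/(x^2 + x + 1)

def gf4_mul(a, b):
    """Multiply two elements of F_4 encoded as 2-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:
            a ^= 0b111
    return result

def p_add(p, q):
    """Add polynomials stored as {frozenset of bit variables: F_4 coefficient}."""
    out = dict(p)
    for mono, coeff in q.items():
        total = out.get(mono, 0) ^ coeff
        if total:
            out[mono] = total
        else:
            out.pop(mono, None)
    return out

def p_mul(p, q):
    """Multiply polynomials; x*x = x for bit variables, so monomials are sets."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out = p_add(out, {m1 | m2: gf4_mul(c1, c2)})
    return out

def var(name):
    return {frozenset([name]): 1}

def word(lo, hi):
    """Word-level value lo + hi*alpha."""
    return p_add(var(lo), {frozenset([hi]): ALPHA})

def substitute(p, name, tail):
    """One reduction step by the polynomial name + tail."""
    out = {}
    for mono, coeff in p.items():
        term = {mono - {name}: coeff}
        out = p_add(out, p_mul(term, tail) if name in mono else {mono: coeff})
    return out

def make_gates(buggy=False):
    """Gate tails listed in the order z0 > z1 > r0 > c0 > c3 > c1 > c2."""
    z0_tail = p_add(var("c0"), var("c2" if buggy else "c3"))  # bug: c3 swapped with c2
    return [("z0", z0_tail), ("z1", p_add(var("r0"), var("c3"))),
            ("r0", p_add(var("c1"), var("c2"))),
            ("c0", p_mul(var("a0"), var("b0"))), ("c3", p_mul(var("a1"), var("b1"))),
            ("c1", p_mul(var("a0"), var("b1"))), ("c2", p_mul(var("a1"), var("b0")))]

def remainder(gates):
    """Reduce f: S + Z after replacing S by A*B and A, B, Z by their bit expansions."""
    r = p_add(p_mul(word("a0", "a1"), word("b0", "b1")), word("z0", "z1"))
    for name, tail in gates:
        r = substitute(r, name, tail)
    return r

print(remainder(make_gates()))            # empty dict: remainder is zero
print(remainder(make_gates(buggy=True)))  # non-zero remainder over primary inputs only
```

For the buggy variant the remainder is $a_1 b_1 + a_1 b_0$, which contains only primary input variables; any input assignment that makes it non-zero excites the bug.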
5.6 Experimental Results
Our algorithm is implemented in C++ with calls to the SINGULAR computer algebra tool [v. 3-1-2] [28] to perform polynomial reductions. Our experiments are conducted on a desktop with a 2.40 GHz Intel Core(TM)2 Quad CPU and 8 GB memory running 64-bit Linux.
We conducted verification experiments on several large custom-designed circuits,
including Mastrovito multipliers, Montgomery multipliers, Barrett multipliers and ECC
point addition and point doubling circuits. The designs are given in equation (EQN)
format and then translated to different formats: CNF, SMTLIB, BLIF, Polynomials
that are used by SAT, SMT, BDD/AIG-based solvers, and Singular, respectively. All
our circuit benchmarks have been made available to the larger verification community
through the SMT-LIB benchmark suite [55].
5.6.1 Evaluation of SAT, SMT, BDD, AIG-Based Methods
We evaluated the performance of many SAT solvers [83] [9] [32] [8], SMT solvers
[31] [6] [68] [14] [13] [2] [1] [11] and BDD-based techniques [82], on our benchmarks.
For these experiments, using the conventional equivalence checking approach, we cre-
ated a “miter” circuit to compare the specification against the implementation. The
implementation was given as a Montgomery multiplier as a gate-level netlist. Since
BDD/SAT/AIG-based approaches cannot operate upon word-level representations di-
rectly, the specification is given as a Mastrovito-style gate-level circuit implementation.
For SMT experiments, the designs were modeled at bit-vector level using quantifier-
free bit-vector (QF-BV) theories, maintaining a bit-vector-level abstraction whenever
possible. Table 5.1 shows that none of the BDDs, AIG/ABC, SAT or SMT solvers can
verify the correctness of circuits beyond 16-bit.
5.6.2 Evaluation of Our Approach
Our approach takes as inputs a gate-level circuit implementation and word-level
specification. Note the difference in the input requirements between our approach
and SAT/BDD/SMT/AIG-based approaches. Our approach only requires a word-level
specification while SAT/BDD/SMT/AIG-based approaches require an inherently large
gate-level specification. Therefore, there is an inherent advantage of our method in that
it maintains a high-level abstraction whenever possible.
Verification Using Gröbner Basis Computations in SINGULAR: Conceptually, our approach requires first computing a Gröbner basis and then conducting a polynomial reduction (ideal membership testing). If we use SINGULAR to compute a Gröbner basis using our term order derived from Proposition 5.1, but without deducing the results of Theorem 5.4 and Corollary 5.1, we can verify the correctness of only up to 48-bit multipliers. Beyond that, the Gröbner basis engine runs into memory explosion. This result is shown in Table 5.2.

Table 5.1. Runtime for verification of Montgomery versus Mastrovito multipliers over $\mathbb{F}_{2^k}$ for BDDs, SAT, SMT-solver and AIG/ABC-based methods. TO = timeout of 10 hrs. Time is given in seconds.

                  Word size of the operands (k bits)
Solver                   8        12        16
MiniSAT              22.55        TO        TO
CryptoMiniSAT         7.17  16082.40        TO
PrecoSAT              7.94        TO        TO
PicoSAT              14.85        TO        TO
Yices                10.48        TO        TO
Beaver                6.31        TO        TO
CVC                     TO        TO        TO
Z3                   85.46        TO        TO
Boolector             5.03        TO        TO
Sonolar              46.73        TO        TO
SimplifyingSTP       14.66        TO        TO
ABC                 242.78        TO        TO
BDD                   0.10     14.14   1899.69
Evaluation of Our Approach: Our approach only requires a polynomial reduction (division) for the verification test: $S + Z \xrightarrow{G_1, G_0}_+ r$ and to check if $r = 0$. For this polynomial reduction, we use the REDUCE command in SINGULAR. Results for verification of Mastrovito multipliers using our term ordering and only this reduction are shown in Table 5.3. With our approach, we can verify the correctness of up to 163-bit Mastrovito multipliers. We also experimented with bug-catching in incorrect designs; the bugs were introduced by arbitrarily swapping the wires (variables) $x_i$ with $x_j$, for some gates $i \neq j$. In such cases, we obtained a non-zero $r$. We used a SAT solver to find a satisfying assignment to $r \neq 0$. These run times are shown in Table 5.3.

Table 5.2. Verification of Mastrovito multipliers by computing Gröbner bases using SINGULAR. MO = out of 8 GB memory. Time is given in seconds.

Size                 16       32       48       64       96      128      160      163
#variables          323     1155     2499     4355     9603    16899    26243    27224
#polynomials        609     2241     4897     8577    19009    33537    52161    54117
#terms             2415     9439    21071    37311    83615   148351   231519   240261
Time               0.94    93.80  1174.27       MO       MO       MO       MO       MO

Table 5.3. Runtime for verifying bug-free and buggy Mastrovito multipliers using our approach. TO = timeout of 10 hrs. Time is given in seconds.

Size                 16       32       48       64       96      128      160      163
#variables          323     1155     2499     4355     9603    16899    26243    27224
#polynomials        291     1091     2403     4227     9411    16643    25923    26989
#terms             1793     7169    16129    28673    64513   114689   179201   185984
Bug-free           0.04     1.41    24.00   112.13   758.82     3054     9361    16170
Bugs               0.04     1.43    25.11   114.86   788.65     3061     9384    16368
Results of the verification of Montgomery multipliers are shown in Table 5.4. Montgomery multipliers are significantly larger than Mastrovito multipliers. If we represent a polynomial for every gate in the design, then we create too many variables ($d$) in the system, exceeding SINGULAR's capacity ($d \leq 32767$). For this reason, we partition the circuit and construct the polynomials for each circuit partition, while ensuring that our term ordering constraint is not violated. With such efforts, we are able to verify Montgomery multipliers up to 128-bit datapaths, beyond which we still exceed SINGULAR's capacity. Similarly, results for the verification of Barrett multipliers are shown in Table 5.5.
Table 5.6 and Table 5.7 show the results of verifying ECC point addition and point doubling circuits, respectively. There are several representation systems for ECC point addition and point doubling. We choose the López-Dahab coordinate system [52] to represent point addition and point doubling. We custom designed these circuits, where the polynomial computations were implemented using Mastrovito multipliers. Our approach is able to verify up to 163-bit ECC operations, whereas SAT, SMT, BDD and AIG-based techniques cannot even verify 16-bit ECC circuits.
Table 5.4. Runtime for verifying bug-free and buggy Montgomery multipliers using our approach. TO = timeout of 10 hrs. Time is given in seconds.

Size                 16       32       48       64       96      128
#variables          319     1194     2280     4395     6562    14122
#polynomials        287     1130     2184     4267     6370    13866
#terms             2262    10741    18199    40021    55512   134887
Bug-free           0.03     1.50    11.03    27.70  1802.75 10919.35
Bugs               0.03     1.52    11.10    28.18  1812.15 11047.10

Table 5.5. Runtime for verifying bug-free and buggy Barrett multipliers using our approach. TO = timeout of 10 hrs. Time is given in seconds.

Size                 16       32       48       64       96      128      160      163
#variables          305     1103     2389     4146     9216    16072    24643    26847
#polynomials        276     1041     2263     4004     8986    15008    24318    25746
#terms             1777     6757    15228    26452    60824   107454    16386   174571
Bug-free           0.03     1.31    22.12   103.30   724.14     2865     9024    14048
Bugs               0.03     1.32    23.06   106.02   734.63     2947     9207    14836

Table 5.6. Verification of ECC point addition. Run-time is given in seconds. TO = timeout of 24 hrs.

Size                 16       32       48       64       96      128      160      163
#variables          548     1615     3623     6854    13986    28468    30237    31384
#polynomials      10812    30826    86482   123544   288720   509660   604740   646129
Runtime            0.26     4.82      118      557     3598    15346    47290    81016

Table 5.7. Verification of ECC point doubling. Run-time is given in seconds. TO = timeout of 24 hrs.

Size                 16       32       48       64       96      128      160      163
#variables          528     1598     3321     6409    12230    26493    29015    30442
#polynomials       4640    14523    42324    61274   142733   243452   297465   313145
Runtime            0.10     2.21       54      263     1532     8012    21493    36439

5.7 Conclusions
This chapter has presented a formal approach to model and verify multiplier circuits over finite fields $\mathbb{F}_{2^k}$ using a computer algebra-based approach. We show how the verification test can be formulated as membership testing of the specification polynomial $f$ in a (radical) ideal $J + J_0 = \langle f_1, \ldots, f_s, x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d \rangle$, where $J = \langle f_1, \ldots, f_s \rangle$ corresponds to the ideal generated by polynomials extracted from the circuit, and $J_0 = \langle x_i^{2^k} - x_i \rangle$ corresponds to the ideal of vanishing polynomials of the field. By analyzing the circuit topology, we derive a monomial order that makes the set $\{f_1, \ldots, f_s, x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ itself a Gröbner basis of $J + J_0$. Subsequently, the verification can be formulated by simply carrying out the reduction $f \xrightarrow{J, J_0}_+ r$. Using our approach, we are able to verify the correctness of up to 163-bit multipliers and ECC point addition circuits over $\mathbb{F}_{2^{163}}$, whereas conventional techniques based on SAT, SMT, BDD and AIG-based solvers are infeasible. A conference paper based on this approach was presented in [57], and a journal version of this paper has been submitted for review.
CHAPTER 6
GATE-LEVEL EQUIVALENCE CHECKING OF
ARITHMETIC CIRCUITS OVER $\mathbb{F}_{2^k}$
This chapter describes our approach to equivalence checking of two combinational
circuits designed for finite field computations. Combinational equivalence checking is a
fundamental problem in hardware verification, and it has been widely investigated over
the years. Canonical decision diagrams (BDDs and their variants), implication-based
methods, SAT solvers, and And-Invert-Graph (AIG)-based reductions are among the
many techniques employed for this purpose. When one circuit is synthesized from the
other, this problem can be efficiently solved using AIG-based reductions (e.g., the ABC
tool [11]) and circuit-SAT solvers (e.g., CSAT [53]). Synthesized circuits generally
contain many subcircuit equivalences, which AIG- and CSAT-based tools can identify
and exploit for verification. However, when the circuits are functionally equivalent but
structurally very dissimilar, none of the contemporary techniques, including ABC and
CSAT, offer a practical solution. Particularly, for custom-designed arithmetic circuits, this problem largely remains unsolved today. Since these custom-designed circuits are prevalent in industry, it is therefore imperative to develop scalable methods to verify such circuits.
Focusing on finite field arithmetic circuits, we utilize computer algebra techniques and formulate the equivalence verification problem as a Weak Nullstellensatz proof, and solve it using Gröbner bases. This requires the computation of a reduced Gröbner basis, which can be expensive for large circuits. To overcome this complexity, we again wish to exploit the circuit topology-based term orderings (as described in the previous chapter) for polynomial manipulation. Unfortunately, unlike in the previous case, the set of polynomials corresponding to this verification instance (the miter circuit) does not constitute a Gröbner basis. However, using Gröbner bases theory, we identify a minimum number of S-polynomial computations that are necessary and sufficient to prove or disprove equivalence. Experiments demonstrate the effectiveness and efficiency of our approach: we can verify 128-bit structurally very dissimilar implementations, while none of the contemporary methods are feasible.
6.1 Problem Statement and Modeling
In this application, we are given two combinational arithmetic circuits C1 and C2, as
gate-level flattened netlists. We have to prove or disprove their functional equivalence.
Our approach is generic enough to perform equivalence checking of any arbitrary
combinational arithmetic circuit over F2k . However, without loss of generality, we will
again consider finite field multiplier circuits as examples to explain our approach.
Our problem can be formally described as:
• Given a finite field $\mathbb{F}_{2^k}$, i.e., given $k$ (datapath size), along with the corresponding irreducible polynomial $P(x)$, let $P(\alpha) = 0$, i.e., $\alpha$ be a root of $P(x)$.
• Given two $k$-bit combinational circuits $C_1$ and $C_2$, the common primary inputs of both circuits are $\{a_0, \ldots, a_{k-1}, b_0, \ldots, b_{k-1}\}$. The primary outputs of $C_1$ are $\{x_0, \ldots, x_{k-1}\}$; the primary outputs of $C_2$ are $\{y_0, \ldots, y_{k-1}\}$, where $a_i, b_i, x_i, y_i \in \mathbb{F}_2$, $i = 0, \ldots, k-1$.
• The word-level representation of inputs is $A = a_0 + a_1\alpha + \cdots + a_{k-1}\alpha^{k-1}$, and $B = b_0 + b_1\alpha + \cdots + b_{k-1}\alpha^{k-1}$. Correspondingly, the outputs are $X = x_0 + x_1\alpha + \cdots + x_{k-1}\alpha^{k-1}$ and $Y = y_0 + y_1\alpha + \cdots + y_{k-1}\alpha^{k-1}$.
Our goal is to formally prove that $\forall a_i, b_i \in \mathbb{F}_2 \subset \mathbb{F}_{2^k}$, the outputs $X$ and $Y$ of circuits $C_1$ and $C_2$ are equal to each other, i.e., $X = Y$ always holds. Otherwise, there must exist a bug in one of the given circuits.
The equivalence verification setup is shown in Figure 6.1. Given circuits $C_1$ and $C_2$, we want to prove that for all possible inputs, the output $X$ of circuit $C_1$ is always equal to the output $Y$ of circuit $C_2$. This can be, conversely, modeled as proving that $X \neq Y$ has no solutions. Such a setup is called a “miter” circuit, and proving infeasibility of the miter is a standard practice in combinational circuit verification. This is mostly because it enables the use of constraint-solvers (such as SAT solvers) to prove/disprove equivalence.
The constraints for circuits $C_1$ and $C_2$ are modeled as polynomials over $\mathbb{F}_{2^k}$ using Equations 5.1. The $X \neq Y$ constraint corresponding to the miter is also modeled as a polynomial in $\mathbb{F}_{2^k}$ as follows:
$t(X - Y) = 1$, where $t$ is a free variable in $\mathbb{F}_{2^k}$ (6.1)
The correctness of the above constraint modeling can be shown as follows:
• When $X = Y$, $X - Y = 0$, so $t \cdot 0 = 1$ has no solutions, and the miter is infeasible.
• When $X \neq Y$, $(X - Y) \neq 0$. Over any field, every non-zero element has a multiplicative inverse. Let $t^{-1} = (X - Y)$. Then, $t \cdot t^{-1} = 1$ will always have a solution over $\mathbb{F}_{2^k}$.
The above $t(X - Y) = 1$ model for the miter can also be employed over $\mathbb{F}_2$, i.e., the Boolean ring. Since 1 is the only non-zero element in $\mathbb{F}_2$, $t = 1$, and the $X \neq Y$ constraint is specified as $X + Y + 1 = 0 \pmod 2$.
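The two cases above can be confirmed exhaustively over a small field. The sketch below assumes $\mathbb{F}_{2^2}$ with $\alpha^2 = \alpha + 1$ and checks, for every pair $(X, Y)$, that a value of $t$ satisfying $t(X - Y) = 1$ exists exactly when $X \neq Y$. The helper names are hypothetical.

```python
def gf4_mul(a, b):
    """Multiply in F_4 = F_2[x]/(x^2 + x + 1); elements are 2-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b100:
            a ^= 0b111
    return result

F4 = range(4)

def miter_has_solution(X, Y):
    """True if some t in F_4 satisfies t * (X - Y) = 1."""
    diff = X ^ Y                     # X - Y equals X + Y in characteristic 2
    return any(gf4_mul(t, diff) == 1 for t in F4)

model_is_correct = all(miter_has_solution(X, Y) == (X != Y) for X in F4 for Y in F4)
print(model_is_correct)
```

The same check carries over to larger $k$ because it relies only on the existence of multiplicative inverses for non-zero field elements.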
Overall, the entire miter circuit can be modeled as a polynomial system over $\mathbb{F}_{2^k}$ in Equations 6.2.

$f_A : A + a_0 + a_1\alpha + \cdots + a_{k-1}\alpha^{k-1}$

$f_B : B + b_0 + b_1\alpha + \cdots + b_{k-1}\alpha^{k-1}$

$f_m : t \cdot (X - Y) + 1 = 0$ } Miter: $X \neq Y$ (6.2)

Subsequently, we need to check whether or not there are any solutions to the set of polynomials in Equations 6.2. The following example illustrates our polynomial system modeling.
Example 6.1 Consider two functionally equivalent circuits over $\mathbb{F}_{2^2}$. The miter is shown in Figure 6.2.
Figure 6.1. The equivalence checking setup: miter.
Figure 6.2. Miter for 2-bit circuit equivalence.
The miter is modeled as a system of polynomials, where the outputs of $C_1, C_2$ are expressed at word-level as: $X + x_0 + x_1 \cdot \alpha$ and $Y + y_0 + y_1 \cdot \alpha$.
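Circuit $C_1$:
$x_0 = a_0 \oplus b_0 \Rightarrow x_0 + a_0 + b_0$
$c_0 = a_0 \wedge b_0 \Rightarrow c_0 + a_0 \cdot b_0$
$c_1 = a_1 \oplus b_1 \Rightarrow c_1 + a_1 + b_1$
$x_1 = c_0 \oplus c_1 \Rightarrow x_1 + c_0 + c_1$
Circuit $C_2$:
$d_0 = \neg(a_0 \wedge b_0) \Rightarrow d_0 + a_0 \cdot b_0 + 1$
$d_1 = \neg(a_1 \wedge b_1) \Rightarrow d_1 + a_1 \cdot b_1 + 1$
$d_2 = a_0 \wedge b_0 \Rightarrow d_2 + a_0 \cdot b_0$
$d_3 = \neg(a_1 \wedge d_1) \Rightarrow d_3 + a_1 \cdot d_1 + 1$
$d_4 = \neg(b_1 \wedge d_1) \Rightarrow d_4 + b_1 \cdot d_1 + 1$
$d_5 = \neg(a_0 \wedge d_0) \Rightarrow d_5 + a_0 \cdot d_0 + 1$
$d_6 = \neg(b_0 \wedge d_0) \Rightarrow d_6 + b_0 \cdot d_0 + 1$
$d_7 = \neg(d_3 \wedge d_4) \Rightarrow d_7 + d_3 \cdot d_4 + 1$
$y_0 = \neg(d_5 \wedge d_6) \Rightarrow y_0 + d_5 \cdot d_6 + 1$
$y_1 = d_2 \oplus d_7 \Rightarrow y_1 + d_2 + d_7$
Miter: $X \neq Y$:
$t \cdot (X - Y) + 1 = 0$ (6.3)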
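The gate-to-polynomial translations used in this example (XOR as a sum, AND as a product, NAND as a product plus 1, all modulo 2) can be validated against truth tables. The sketch below does so and additionally checks that the NAND network producing $y_0$ computes the same function as $x_0 = a_0 \oplus b_0$. The table of gates and the function names are illustrative assumptions; only the three gate types appearing in the example are covered.

```python
from itertools import product

# Gate name -> (Boolean definition, polynomial over F_2).
GATES = {
    "AND":  (lambda a, b: a & b,       lambda a, b: (a * b) % 2),
    "XOR":  (lambda a, b: a ^ b,       lambda a, b: (a + b) % 2),
    "NAND": (lambda a, b: 1 - (a & b), lambda a, b: (a * b + 1) % 2),
}

# Any (gate, a, b) where the polynomial disagrees with the Boolean gate.
mismatches = [(name, a, b)
              for name, (gate, poly) in GATES.items()
              for a, b in product((0, 1), repeat=2)
              if gate(a, b) != poly(a, b)]

def nand(a, b):
    return (a * b + 1) % 2

def y0_nand_network(a0, b0):
    """The d0, d5, d6, y0 gates of circuit C2, evaluated via their polynomials."""
    d0 = nand(a0, b0)
    d5 = nand(a0, d0)
    d6 = nand(b0, d0)
    return nand(d5, d6)

y0_equals_x0 = all(y0_nand_network(a, b) == (a + b) % 2
                   for a, b in product((0, 1), repeat=2))

print(mismatches, y0_equals_x0)
```

This kind of exhaustive check is practical only per gate or for very small cones of logic; equivalence of the full circuits is what the algebraic formulation is meant to establish.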
With the polynomial model given above, we formulate our problem as a Weak
Nullstellensatz problem, which is described next.
6.1.1 Verification Problem Formulation as Weak Nullstellensatz
As described in Equation 6.2 and Example 6.1, to formulate our verification test, we
first analyze the miter circuit and model the Boolean gate-level operators as polynomials
over $\mathbb{F}_2$, i.e., two sets of implementation polynomials representing C1 and C2, and the
miter polynomial: X ≠ Y (X, Y are outputs of C1 and C2). Subsequently, we can
reason whether or not solutions exist to this polynomial system.
For this purpose, we wish to use techniques from computer algebra and algebraic
geometry to reason about the solutions (variety) to the polynomial equations (ideal).
Notation: Let F1, F2 represent the sets of polynomials generated from circuits C1
and C2, respectively. Let fm represent the miter polynomial. Let F = {F1, F2, fm} =
{f1, f2, . . . , fs, fm} denote this set of polynomials derived from the miter circuit. Let
{x1, . . . , xd} denote all variables occurring in F. Let J = 〈F1, F2, fm〉 ⊂ $\mathbb{F}_{2^k}$[x1, . . . , xd]
denote the ideal generated by these polynomials. Subsequently, $V_{\mathbb{F}_{2^k}}(J)$ denotes the
variety (solutions) of J over $\mathbb{F}_{2^k}$.
Our verification problem can be formulated as the evaluation:
$V_{\mathbb{F}_{2^k}}(J) = \emptyset$ ? (6.4)
Weak Nullstellensatz [39] explicitly specifies the condition when a variety is empty.
Theorem 6.1 [Weak Nullstellensatz] Let K be an algebraically closed field and let J ⊂ K[x1, x2, · · · , xd] be an ideal satisfying
$V_K(J) = \emptyset$. Then, J = K[x1, x2, · · · , xd], or equivalently, 1 ∈ J.
Recall that a reduced Gröbner basis is a canonical representation of an ideal. We
know that the unit ideal 〈1〉 generates the entire set of polynomials in K[x1, x2, · · · , xd].
Therefore, Weak Nullstellensatz can be further described via Gröbner basis as:
Corollary 6.1 [Weak Nullstellensatz] Let I ⊂ K[x1, x2, · · · , xd] be an ideal satisfying
V(I) = ∅. Then the reduced Gröbner basis of I is {1}.
The Weak Nullstellensatz now offers us a way to evaluate whether the system of
multivariate polynomial equations has a common solution in $K^d$.
However, Weak Nullstellensatz is stated over an algebraically closed field K. Our
problem is modeled over $\mathbb{F}_{2^k}$, which is not algebraically closed. Therefore, Weak
Nullstellensatz cannot be applied directly, without modification, to finite
fields.
Let us explain why Weak Nullstellensatz fails when applying it to the field $\mathbb{F}_2 \subset \mathbb{F}_{2^k}$
by an example.
Example 6.2 We are given an implementation of a circuit over $\mathbb{F}_2 \subset \mathbb{F}_{2^k}$:
x1 = a ∨ (¬a ∧ b) (6.5)
Its corresponding specification is:
y1 = a ∨ b (6.6)
where x1 and y1 are symbolically different but functionally equivalent. Then, we transform
the circuit equations into their polynomial forms:
x1 = a ∨ (¬a ∧ b) ↦ x1 + a + b · (a + 1) + a · b · (a + 1) (mod 2)
y1 = a ∨ b ↦ y1 + a + b + a · b (mod 2)
x1 ≠ y1 ↦ x1 + y1 + 1 (mod 2)
Then, the reduced Gröbner basis of the above polynomials with term ordering lex x1 >
y1 > a > b is:
a^2 · b + a · b + 1
y1 + a · b + a + b
x1 + a · b + a + b + 1
which is not equal to {1}, even though their variety over $\mathbb{F}_2$ is empty. The reason for this can be
explained as follows.
As shown in Figure 6.3, $\overline{\mathbb{F}}_{2^k}$ is the algebraic closure of $\mathbb{F}_{2^k}$. If there is no solution
to ideal J in the algebraic closure $\overline{\mathbb{F}}_{2^k}$, then there is no solution in $\mathbb{F}_{2^k}$ either. However,
what happens when there is a solution in $\overline{\mathbb{F}}_{2^k}$, i.e., 1 ∉ GB(J)? In this case,
there is a nonempty set of solutions to the polynomial system in $\overline{\mathbb{F}}_{2^k}^{\,d}$. There are
two possibilities:
• The solution(s) may lie within $\mathbb{F}_{2^k}$.
• The solution(s) may lie in $\overline{\mathbb{F}}_{2^k}$, but outside $\mathbb{F}_{2^k}$, as depicted in Figure 6.3.
We are interested in finding out whether or not X ≠ Y over $\mathbb{F}_{2^k}$, i.e., whether the
solution lies within $\mathbb{F}_{2^k}$. A solution may instead lie outside
the field $\mathbb{F}_{2^k}$, in which case the bug is really a “don’t care” condition (akin to a “false
negative” in design verification parlance).
Figure 6.3. A solution (bug) in ($\overline{\mathbb{F}}_{2^k} - \mathbb{F}_{2^k}$) is a “don’t care”.
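The "don't care" situation can be observed directly on the basis element a^2 · b + a · b + 1 from Example 6.2. The sketch below is illustrative only: it uses $\mathbb{F}_4 = \mathbb{F}_2[x]/(x^2 + x + 1)$ as one convenient extension field inside the algebraic closure, with elements encoded as 2-bit integers so that 0 and 1 are the embedded elements of $\mathbb{F}_2$. The helper names are hypothetical.

```python
from itertools import product


def gf4_mul(a, b):
    """Multiply in F_4 = F_2[x]/(x^2 + x + 1); elements are 2-bit ints."""
    prod = 0
    for i in range(2):
        if (b >> i) & 1:
            prod ^= a << i
    if prod & 0b100:
        prod ^= 0b111
    return prod


def p(a, b):
    """Evaluate a^2*b + a*b + 1; addition in characteristic 2 is XOR."""
    return gf4_mul(gf4_mul(a, a), b) ^ gf4_mul(a, b) ^ 1


# Roots with both coordinates in F_2 = {0, 1}, and roots over all of F_4.
roots_f2 = [(a, b) for a, b in product((0, 1), repeat=2) if p(a, b) == 0]
roots_f4 = [(a, b) for a, b in product(range(4), repeat=2) if p(a, b) == 0]
```

Under this encoding the polynomial has no root with both coordinates in $\mathbb{F}_2$, but it does have roots whose a-coordinate lies outside $\mathbb{F}_2$, which is why the Gröbner basis computed without vanishing polynomials is not {1}.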
To address this problem, Weak Nullstellensatz needs to be suitably modified for
application over finite fields $\mathbb{F}_{2^k}$.
Theorem 6.2 [Weak Nullstellensatz in $\mathbb{F}_{2^k}$]
Given f1, f2, · · · , fs ∈ $\mathbb{F}_{2^k}$[x1, x2, · · · , xd]. Let J = 〈f1, f2, · · · , fs〉 ⊂ $\mathbb{F}_{2^k}$[x1, x2, · · · , xd]
be an ideal. Let $J_0 = \langle x_1^{2^k} - x_1, x_2^{2^k} - x_2, \cdots, x_d^{2^k} - x_d \rangle$ be the ideal of vanishing
polynomials of $\mathbb{F}_{2^k}$. Then $V_{\mathbb{F}_{2^k}}(J + J_0) = \emptyset$, if and only if the reduced
Gröbner basis of (J + J0) is {1}.
Proof. From Lemma 5.1, we know:
$V_{\mathbb{F}_{2^k}}(J + J_0) = V_{\mathbb{F}_{2^k}}(J)$. (6.7)
Combining with Corollary 6.1, we conclude:
$V_{\mathbb{F}_{2^k}}(J + J_0) = \emptyset \Leftrightarrow$ reduced Gröbner basis of (J + J0) = {1} (6.8)
Example 6.3 Revisiting Example 6.2, we need to append the vanishing polynomials
a^2 − a, b^2 − b, x1^2 − x1, y1^2 − y1 to the given ideal. Now, when we compute the reduced
Gröbner basis, we get: reduced-GB(x1 + a + b · (a + 1) + a · b · (a + 1), y1 + a + b +
a · b, x1 + y1 + 1, a^2 − a, b^2 − b, x1^2 − x1, y1^2 − y1) = {1}, which proves x1 = y1.
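The membership 1 ∈ J + J0 in this example can also be seen by hand. The following worked derivation is a sketch added for illustration; it uses only the three circuit polynomials of Example 6.2 and the vanishing polynomial a^2 − a, with all arithmetic taken modulo 2.

```latex
\begin{align*}
g &= \bigl(x_1 + a + b(a+1) + ab(a+1)\bigr) + \bigl(y_1 + a + b + ab\bigr) + \bigl(x_1 + y_1 + 1\bigr) \\
  &= a^2 b + ab + 1 \in J, \\
g + b\,(a^2 - a) &= (a^2 b + ab + 1) + (a^2 b + ab) = 1 \in J + J_0 .
\end{align*}
```

Since 1 lies in J + J0, the reduced Gröbner basis of J + J0 is {1}, in agreement with the result stated in the example.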
Verification Problem Formulation: Through Weak Nullstellensatz over $\mathbb{F}_{2^k}$, given
an ideal J ⊂ $\mathbb{F}_{2^k}$[x1, . . . , xd], we can determine whether the variety of J is empty by
analyzing the corresponding reduced Gröbner basis of J + J0.
For our verification problem, we take the polynomials {F1, F2, fm} = {f1, . . . , fs, fm}
representing the miter circuit constraints to generate ideal J. Then we append the
vanishing polynomials $\{x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ of ideal J0. We compute the reduced
Gröbner basis G of J + J0 and check whether G = {1}. The two circuits
are functionally equivalent if and only if G = {1}.
The critical issue in the Weak Nullstellensatz formulation is the computational
complexity of a Gröbner basis (as given in Theorem 5.3). To overcome this complexity, we
again wish to exploit our circuit topology-based term ordering from Proposition 5.1 for
polynomial representation. Note that according to the term ordering from Proposition
5.1, the set of polynomials in {F1, F2} does constitute a Gröbner basis, as C1 and C2
are independent circuits. However, with the miter polynomial fm, the set of polynomials
F = {F1, F2, fm} does not constitute a Gröbner basis. This is because there always
exists one polynomial fo ∈ F (fo ≠ fm), corresponding to the output of either C1 or
C2, with a leading term that is not relatively prime w.r.t. the leading term of the miter
polynomial fm. Their corresponding S-polynomial computation also does not reduce
to zero. This is shown in Example 6.4.
Example 6.4 Let us reconsider Example 6.1. Based on our topological term ordering
of the circuit, we impose a lex term order with:
x1 > y1 > d4 > d3 > d2 > d1 > d0 > c1 > c0 > a0 > a1 > b0 > b1.
Then, the set of polynomials of the miter circuit {F1, F2, fm} does not constitute a
Gröbner basis. This is because the miter polynomial fm : tX − tY + 1 and the output
polynomial fX of circuit C1, fX : X + x0 + x1 · α, have a common variable X in their
leading terms tX and X, respectively. Therefore, lt(fm) and lt(fX) are not relatively
prime. Moreover, $Spoly(f_m, f_X) \xrightarrow{F_1, F_2, f_m} r$ with r ≠ 0, thus violating the property of a
Gröbner basis that all S-polynomials should reduce to zero.
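For concreteness, the S-polynomial of this pair can be written out. The derivation below is a sketch that assumes, as stated in the example, lt(fm) = tX and lt(fX) = X, so that their least common multiple is tX; subtraction coincides with addition in characteristic 2.

```latex
\begin{align*}
\mathrm{Spoly}(f_m, f_X) &= \frac{tX}{tX}\, f_m - \frac{tX}{X}\, f_X \\
  &= (tX + tY + 1) + t\,(X + x_0 + x_1\alpha) \\
  &= tY + t\,x_0 + \alpha\, t\,x_1 + 1 .
\end{align*}
```

The leading terms cancel by construction, and the remaining polynomial is what is subsequently reduced modulo the circuit polynomials.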
This suggests that we may have to compute a reduced Gröbner basis. However,
in the next section, we describe our results that can identify a minimum number of
S-polynomial computations that are sufficient and necessary to prove equivalence or to
detect bugs.
6.2 Verification Using a Minimum Number
of S-polynomial Computations
To identify a minimum number of S-polynomial computations in Buchberger’s
algorithm, we make use of the following lemma.
Lemma 6.1 Let r ∈ $\mathbb{F}_2$[x1, . . . , xd] be a multilinear polynomial expression; i.e., r is
a nonconstant polynomial such that every variable in every monomial term of r has
degree 1. Then, r has a root in $\mathbb{F}_2^d$.
Proof. Let l(r) denote the number of nonzero monomials appearing in r. We will
perform induction on l(r). Note that in $\mathbb{F}_2$, the coefficient of all nonzero monomials is
1.
The case l(r) = 1 is trivial, as r = x1x2 . . . xt, for some t ≤ d. A polynomial with
one monomial term always has a solution.
For the general case, l(r) ≥ 2. Then, we can always write r = r′ + M, where M
is a monomial, i.e., a product of variables. After appropriately relabeling the variables, we can assume
that x1 divides M, i.e., x1 appears in M. If x1 divides r′ too, then x1 divides r as well.
As a consequence, we obtain x1 = 0 as a solution for r = 0. So, r has a root in $\mathbb{F}_2^d$.
If x1 does not divide r, then it does not divide r′. So variable x1 does not appear in
r′. Then, let r′′ = r(0, x2, . . . , xd). Note that l(r′′) < l(r), as monomial M does not
appear in r′′. By induction, there is a solution (x2, . . . , xd) for r′′ = 0, which also gives
a solution (0, x2, . . . , xd) for r. Thus r always has a root in $\mathbb{F}_2^d$.
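For small d, the statement of Lemma 6.1 can be confirmed by exhaustive enumeration. The sketch below is an independent brute-force check, not a restatement of the proof: it represents a multilinear polynomial over $\mathbb{F}_2$ as a list of monomials, each a frozenset of variable indices, and all function names are hypothetical.

```python
from itertools import combinations, product


def all_monomials(d):
    """All multilinear monomials in d variables, each a frozenset of variable indices."""
    return [frozenset(c) for k in range(d + 1) for c in combinations(range(d), k)]


def evaluate(poly, point):
    """Evaluate a multilinear polynomial (collection of monomials) at a 0/1 point, mod 2."""
    return sum(all(point[i] for i in mono) for mono in poly) % 2


def has_root(poly, d):
    """True if the polynomial vanishes at some point of F_2^d."""
    return any(evaluate(poly, pt) == 0 for pt in product((0, 1), repeat=d))


def check_lemma(d):
    """Check that every nonconstant multilinear polynomial in d variables has a root."""
    monos = all_monomials(d)
    for mask in range(1, 2 ** len(monos)):
        poly = [m for i, m in enumerate(monos) if (mask >> i) & 1]
        if poly == [frozenset()]:
            continue  # the constant polynomial 1 is excluded by the lemma
        if not has_root(poly, d):
            return False
    return True
```

For d = 3 this enumerates 254 nonconstant polynomials; the constant 1, which is excluded by the lemma, is the only nonzero multilinear polynomial without a root.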
Now we state and prove the following theorem.
Theorem 6.3 Let F1, F2 correspond to the sets of polynomials derived from circuits
C1, C2, respectively. Let fm be the miter polynomial. Let F = {F1, F2, fm} and J =
〈F〉 ⊂ $\mathbb{F}_{2^k}$[x1, . . . , xd] be the ideal of polynomials corresponding to the miter circuit.
Impose the circuit topology-based monomial order > from Proposition 5.1. Let $F_0 =
\{x_1^{2^k} - x_1, \ldots, x_d^{2^k} - x_d\}$ be the vanishing polynomials of $\mathbb{F}_{2^k}$, and J0 = 〈F0〉. Let
fo ∈ F (fo ≠ fm) be the only polynomial such that the leading terms of fm, fo are not
relatively prime. Then $V_{\mathbb{F}_{2^k}}(J) = \emptyset \iff r = 1$, where r is computed as
$Spoly(f_m, f_o) \xrightarrow{F, F_0}_+ r$.
Proof. Let q = 2^k, and let G and Gred, respectively, denote the Gröbner basis and the
reduced Gröbner basis of (J + J0). Let T represent the set of all variables occurring in
F, and let Tpi ⊂ T denote the set of all primary inputs.
Our objective is to deduce whether or not the variety $V_{\mathbb{F}_{2^k}}(J) = \emptyset$, without actually
computing a reduced Gröbner basis. Recall, according to Theorem 6.2, $V_{\mathbb{F}_q}(J) =
\emptyset \iff$ Gred = {1}, so we only need to check whether Gred = {1}. Based on our term
ordering, we will try to identify the polynomials that constitute Gred.
In the first iteration of Buchberger’s algorithm, Spoly(fm, fo) is the only polynomial
that needs to be computed and reduced to obtain r, as all other S-polynomials reduce to
zero, due to Theorem 5.4. We need to consider three cases:
• Case 1: r = 1.
• Case 2: r = 0.
• Case 3: r is a nonconstant multilinear polynomial consisting of only primary
input variables of the circuit.
Case 1 is the trivial case: If r = 1, then 1 ∈ G, so Gred = {1} and therefore,
V(J + J0) = ∅. The miter is infeasible and the circuits are equivalent.
Case 2: When r = 0, no new polynomial is created in Buchberger’s algorithm.
Therefore, G = {F, F0}. While the set {F, F0} is itself a Gröbner basis, it is not
reduced. So, what is the reduced basis Gred? We will show that Gred ≠ {1} and this
will imply that V(J + J0) ≠ ∅.
To reduce a Gröbner basis G, we take all polynomials f ∈ G and reduce $f \xrightarrow{G - \{f\}}_+ f'$.
All such f′ constitute Gred. We will consider such a reduction for G = {F, F0}. For
all fj ∈ F, let fj = xj + Pj, where Pj = tail(fj) and lm(fj) = xj where xj ∉ Tpi.
This is due to our term order where only gate outputs (xj) appear as leading terms of all
polynomials. Let v be any variable in Pj. If v ∈ {T − Tpi} (non-primary-input), then
v = lm(fk) (k ≠ j). Thus $f_j \xrightarrow{\{F, F_0\} - \{f_j\}}_+ f'_j$, where f′j = xj + P′j. In such a case, P′j
contains only primary inputs. From a circuit-structure perspective, this reflects that any
internal gate output xj can be expressed in terms of primary inputs.
Similarly, $x_i^q - x_i$ with xi ∈ {T − Tpi} will reduce to zero, and only vanishing
polynomials of primary inputs will remain in F0. Moreover, since circuit inputs are
bit-level, $x_{pi}^2 = x_{pi}$; so $x_{pi}^2 - x_{pi}$, xpi ∈ Tpi, are the vanishing polynomials remaining
in the reduced basis. Let F′ = {xj + P′j}, where xj ∈ T. Then, the reduced Gröbner
basis Gred of {F, F0} = reducedGB({F} ∪ $\{x_i^q - x_i\}$) = {F′} ∪ $\{x_{pi}^2 - x_{pi}\}$. Clearly,
Gred ≠ {1}. We conclude, if r = 0, Gred ≠ {1}, and V(J + J0) ≠ ∅. The miter
constraints are feasible and the circuits are not equivalent.
Case 3: If r is a nonconstant polynomial, then due to our term order and Corollary
5.1, r will contain only the primary input variables of the circuit. Moreover, as these
variables are Boolean, $x_{pi}^2 = x_{pi}^3 = \cdots = x_{pi}$, all variables in the monomials of r have
degree 1, and r is multilinear.
After the first iteration of Buchberger’s algorithm, we obtain {F, F0, r} in the basis.
Because r contains only primary inputs, lt(r) is relatively prime w.r.t. leading terms of
all polynomials in F. Therefore, the Gröbner basis of {F, r} is {F, r} itself.
However, {F, r} ∪ {F0} is not a Gröbner basis, because lm(r) and lm($x_k^q - x_k$) are
not relatively prime when xk ∈ Tpi. Therefore, G = GB({F, r} ∪ {F0}) = {F} ∪
GB(r ∪ {F0}). In such a case, if we can show that 1 ∉ GB(r ∪ {F0}), then 1 ∉
GB({F, F0, r}).
To show 1 ∉ GB(r ∪ {F0}), we utilize the Weak Nullstellensatz Theorem 6.2: if
V(r ∪ {F0}) ≠ ∅, then 1 ∉ GB(r ∪ {F0}). In Lemma 6.1, we showed that if r is a
multilinear polynomial, it always has a root. This means that V(r ∪ {F0}) ≠ ∅. Therefore,
1 ∉ GB(r ∪ {F0}). This proves Case 3: if r is not 0 or 1, then 1 ∉ G = GB(F, F0).
So, we conclude that:
$V_{\mathbb{F}_{2^k}}(J) = \emptyset \iff r = 1$. (6.9)
Combining with Corollary 5.1, the above theorem can be restated based on a
minimum Gröbner basis.
Corollary 6.2 Let J = 〈F〉 ⊂ $\mathbb{F}_{2^k}$[x1, . . . , xd] on which we impose our circuit-based
monomial order >. Let $J_0^{PI} = \langle x_{pi}^2 - x_{pi} \rangle$, where xpi ∈ PI. Let fo, fm be the only
polynomial pair such that lm(fm), lm(fo) are not relatively prime. Then, $V_{\mathbb{F}_{2^k}}(J) =
\emptyset \iff r = 1$, where r is computed as $Spoly(f_m, f_o) \xrightarrow{J, J_0^{PI}}_+ r$.
Theorem 6.3 and Corollary 6.2 provide the foundation of our verification
formulation. We only need one S-polynomial computation to identify whether or not the two
circuits are equivalent. Our overall approach is described in the following algorithm.
Algorithm 6 first inputs the Boolean expressions of the given circuit
implementations. Each expression is then transformed into a set of polynomials F using the
mappings shown in Equation 5.1. All polynomials are then normalized into a sum-of-term
form using the distributive law A(B + C) = AB + AC. Then, we perform a reverse
topological traversal of the circuit to derive our variable ordering. Then, we append
vanishing polynomials F0 = {x^2 + x} for all x ∈ primary inputs. Subsequently, we
identify the two polynomials fm and fo that have common variables in their leading
terms. Finally, we conduct a polynomial reduction of Spoly(fm, fo) modulo {F ∪ F0}.
If the reduction result is r = 1, the two circuits are equivalent. If r ≠ 1, the circuits are
not equivalent. Again, any assignment to the variables that makes r ≠ 1 provides an
input vector that can be used as a counter-example for debugging.
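The flow just described can be sketched on the single-output circuits of Example 6.2. This is a minimal illustration under stated assumptions, not the implementation evaluated later: polynomials over $\mathbb{F}_2$ are sets of monomials, monomials are frozensets of variable names (so x^2 = x is applied implicitly), and reduction is done by substituting the specification output. All names are hypothetical.

```python
ONE = {frozenset()}                      # the constant polynomial 1


def var(name):
    """Polynomial consisting of a single variable."""
    return {frozenset([name])}


def mul(p, q):
    """Product over F_2 with x^2 = x: monomial product is the union of variable sets."""
    out = set()
    for m in p:
        for n in q:
            out ^= {m | n}               # XOR cancels duplicate monomials (mod 2)
    return out


def or_gate(p, q):
    """a OR b maps to a + b + a*b (mod 2)."""
    return p ^ q ^ mul(p, q)


def substitute(p, name, repl):
    """Reduce p by a gate polynomial name + repl, i.e. replace variable name by repl."""
    out = set()
    for m in p:
        out ^= mul({m - {name}}, repl) if name in m else {m}
    return out


def equivalence_check(x_expr, y_expr):
    """Reduce Spoly(fm, fo) for fm = x1 + y1 + 1 and fo = x1 + x_expr."""
    fm = var("x1") ^ var("y1") ^ ONE
    fo = var("x1") ^ x_expr
    spoly = fm ^ fo                      # equal leading monomials x1 cancel
    return substitute(spoly, "y1", y_expr)


a, b = var("a"), var("b")
impl = or_gate(a, mul(a ^ ONE, b))       # x1 = a OR (NOT a AND b)
spec = or_gate(a, b)                     # y1 = a OR b
r_correct = equivalence_check(impl, spec)
r_buggy = equivalence_check(mul(a, b), spec)   # hypothetical bug: x1 = a AND b
```

For the correct implementation the remainder is the constant 1; for the injected bug it is the nonconstant polynomial a + b + 1, whose roots (for example a = 1, b = 0) are counter-examples.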
Algorithm 6: Our Proposed Equivalence Checking Algorithm
Input: Two Circuit Implementations with outputs X and Y (Boolean equations).
Output: 1 if X = Y. Bug polynomial r if X ≠ Y.
for (i=0; i < number of eqns; i++) do
    /*Each equation is transformed to polynomials*/;
    poly[i] = Eqn-to-Poly(eqn[i]);
    /*Each polynomial is transformed to sum-of-term*/;
    newpoly[i] = Sum-of-term(poly[i]);
end
/*Obtain circuit-based variable order*/;
ordered_var = T_Traversal(newpoly);
/*Append vanishing polynomials x^2 + x of the primary inputs*/;
vanpoly = {x^2 + x : x ∈ primary inputs};
/*Identify polynomials that need to be reduced*/;
fo, fm = Identify(newpoly, vanpoly);
To_Be_Reduced = Spoly(fo, fm);
/*Reduce To_Be_Reduced modulo newpoly and vanpoly to obtain r*/;
if r == 1 then
    return 1;
else
    return Bug polynomial r;
end
6.3 Improving Polynomial Division Using F4-style Reduction
Through the results described above, the need for Buchberger’s algorithm is
obviated and verification can be performed by analyzing the result of just one S-polynomial
reduction. Therefore, the most intensive computational step is that of polynomial
division $Spoly(f_m, f_o) \xrightarrow{F, F_0}_+ r$. When the two circuits C1, C2 are very large, the
polynomial set {F, F0} also becomes extremely large. This division procedure then becomes
the bottleneck in verifying the equivalence. To further improve upon our approach,
we exploit the relatively recent concept of F4-style polynomial reduction [34], which
implements polynomial division using successive row-reductions on a matrix.
Let us first describe the matrix representation for polynomial algebra operations.
Matrix Representation of Polynomials: Each row i of the matrix M corresponds
to polynomial fi, whereas each column j corresponds to monomial mj. If the jth entry
on row i in the matrix is 1, i.e., M(i, j) = 1, it means the jth monomial is present in the
ith polynomial. Similarly, M(i, j) = 0 denotes the absence of mj in fi. Since the
polynomials are at bit-level, i.e., over $\mathbb{F}_2 \subset \mathbb{F}_{2^k}$, coefficients are always in {0, 1}, and no specific representation of
coefficients is required. Note, however, that the entries in rows and columns have to
satisfy the imposed term ordering.
Example 6.5 Given two polynomials: f1 = a0 + a1 · b1 + 1 and f2 = a0 · b0 + b1 + 1
with term ordering lex with a0 > a1 > b0 > b1. First, we sort all monomials occurring
in f1 and f2 w.r.t. term ordering: a0 · b0 > a0 > a1 · b1 > b1 > 1.
Then, we associate these sorted monomials with the columns of the matrix. The
polynomials are also sorted according to the term order before they are associated with
the rows of the matrix. For example, since lm(f2) > lm(f1), f2 appears on row 1 and
f1 appears on row 2. The generated matrix is shown in Table 6.1.
Polynomial reduction requires operations of addition/subtraction and cancellation
of leading terms. We demonstrate how the addition/subtraction and division operations
are implemented on the matrix.
Matrix Subtraction for Polynomials: The subtraction of two polynomials can
be formulated as a row-reduction in the matrix. Since coefficients of polynomials are
computed (mod 2) in our case, row-reductions are also performed (mod 2).
Example 6.6 Again consider f1 = a0 + a1 · b1 + 1 and f2 = a0 · b0 + b1 + 1 with lex
order: a0 > a1 > b0 > b1. Let us perform f1 − f2: f1 − f2 = f2 − f1 (mod 2) =
a0 · b0 + a0 + a1 · b1 + b1. On the matrix, each entry on row 2 is subtracted from the
corresponding entry on row 1 and the result is stored in row 2, as shown in Table 6.2.
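The matrix of Example 6.5 and the row subtraction of Example 6.6 can be reproduced with a few lines of code. This is an illustrative sketch: monomials are encoded as exponent tuples over (a0, a1, b0, b1), so that ordinary tuple comparison coincides with the lex order a0 > a1 > b0 > b1; the helper names are hypothetical.

```python
VARS = ("a0", "a1", "b0", "b1")


def mono(*names):
    """Exponent tuple of a multilinear monomial; mono() is the constant 1."""
    return tuple(1 if v in names else 0 for v in VARS)


# f1 = a0 + a1*b1 + 1 and f2 = a0*b0 + b1 + 1, as sets of monomials.
f1 = {mono("a0"), mono("a1", "b1"), mono()}
f2 = {mono("a0", "b0"), mono("b1"), mono()}


def build_matrix(polys):
    """Columns: all monomials in descending lex order. Rows: polynomials by leading monomial."""
    columns = sorted(set().union(*polys), reverse=True)
    rows = sorted(polys, key=max, reverse=True)
    matrix = [[1 if m in p else 0 for m in columns] for p in rows]
    return columns, matrix


def row_sub(r1, r2):
    """Row subtraction modulo 2 is a bitwise XOR of the two rows."""
    return [x ^ y for x, y in zip(r1, r2)]


columns, matrix = build_matrix([f1, f2])
difference = row_sub(matrix[0], matrix[1])
```

Row 1 of the resulting matrix corresponds to f2 and row 2 to f1, matching Table 6.1; the XOR of the two rows matches the last row of Table 6.2.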
Matrix Reduction for Polynomials: Polynomial division is implemented as
cancellation of leading terms. The reduction step in Algorithm 3 that cancels leading terms
is:
$f_1 / f_2 = f_1 - \dfrac{lm(f_1)}{lm(f_2)} \cdot f_2$ (6.10)
Table 6.1. Matrix representation for polynomials.
           a0·b0    a0   a1·b1    b1     1
f2             1     0       0     1     1
f1             0     1       1     0     1
Table 6.2. Matrix subtraction of polynomials.
           a0·b0    a0   a1·b1    b1     1
f2             1     0       0     1     1
f2 − f1        1     1       1     1     0
In matrix representation, we create two rows, one each for f1 and $\frac{lm(f_1)}{lm(f_2)} \cdot f_2$, and
then perform subtraction on the matrix; this is shown in Example 6.7.
Example 6.7 Given two polynomials: f1 = a0 · b1 + a0 + 1 and f2 = a0 + 1 with term
order lex: a0 > a1 > b0 > b1. Consider the polynomial reduction:
$f_1 / f_2 = f_1 - \dfrac{a_0 \cdot b_1}{a_0} \cdot f_2 = f_1 - b_1 \cdot f_2$
We create two rows in the matrix for f1 and b1 · f2 and insert monomials from f1 and
b1 · f2 into the matrix columns, as shown in Table 6.3.
Then, we conduct f1 − b1 · f2, as shown in Table 6.4.
Finally, row 2 represents the reduction result of f1/f2 = a0 + b1 + 1.
With the above basic polynomial operations formulated as matrix operations, we
now describe our algorithm to create the matrix of polynomials corresponding to our
verification instance (miter circuit). The algorithm is shown in Algorithm 7. The main
idea behind this algorithm is to set up the rows of the matrix (polynomials) in a way
that polynomial division can be subsequently performed by subtracting row i from row
i − 1. In the algorithm, the computation $L := L \cup \frac{mon}{lm(f_k)} \cdot f_k$ in the while-loop actually
corresponds to $\frac{lm(f_1)}{lm(f_2)} \cdot f_2$ in Equation 6.10.
To better understand the algorithm, we describe the matrix construction procedure
in Example 6.8.
Table 6.3. Matrix reduction for polynomials: representation.
              a0·b1    a0    b1     1
b1 · f2           1     0     1     0
f1                1     1     0     1
Table 6.4. Matrix reduction for polynomials: subtraction.
              a0·b1    a0    b1     1
b1 · f2           1     0     1     0
f1 − b1 · f2      0     1     1     1
Algorithm 7: Generating the Matrix for Polynomial Reduction
Input: f, F = {f1, . . . , fs} with f1 > f2 > · · · > fs.
Output: A matrix representing $f \xrightarrow{f_1, \ldots, f_s}_+ r$
/*Let L be the set of polynomials corresponding to rows of matrix*/;
L := {f};
/*The index of polynomials in F*/;
i := 1;
/*Let ML be the set of monomials*/;
ML := {monomials of f};
mon := the ith monomial of ML;
while mon ∉ PrimaryInputs do
    Identify fk ∈ F satisfying: lm(fk) can divide mon;
    /*Add new polynomial to L as a new row in matrix*/;
    L := L ∪ {(mon / lm(fk)) · fk};
    /*Add monomials to ML as new columns in matrix*/;
    ML := ML ∪ {monomials of (mon / lm(fk)) · fk};
    i := i + 1;
    mon := the ith monomial of ML;
end
Example 6.8 Suppose that two functionally equivalent circuits and the miter are
represented by the following polynomials at bit-level (i.e., over $\mathbb{F}_2$).
fm = x + y + 1,
fo = x + n0 + n2,
f1 = y + n10,
f2 = n0 + i2 · i3,
f3 = n2 + i0 · i1,
f4 = n10 + n7,
f5 = n7 + n6 + n4 · i0,
f6 = n6 + n5 + n3 · i1,
f7 = n5 + n4 · n3,
f8 = n4 + i1 + i3,
f9 = n3 + i0 + i2;
Note that i0, . . . , i3 denote the primary inputs of the circuits. The circuit topology-based
monomial order is derived as lex with x > y > n0 > n2 > n10 > n7 > n6 >
n5 > n4 > n3 > i0 > i1 > i2 > i3. All polynomials above have already been sorted
(ordered) according to their leading terms in descending order. All monomials in each
polynomial are also ordered.
In this case, f = Spoly(fm, fo) = y + n0 + n2 + 1 and F = {f1, . . . , f9}. We want
to show the algorithm’s operation to construct a matrix for the reduction $f \xrightarrow{F}_+ r$.
Initialization:
L := {f};
ML := {y, n0, n2, 1};
mon := y;
Iteration i = 1:
fk := f1 = y + n10;
L := {f, f1};
ML := {y, n0, n2, n10, 1};
i := 2;
mon := n0
Iteration i = 2:
fk := f2 = n0 + i2 · i3;
L := {f, f1, f2};
ML := {y, n0, n2, n10, i2 · i3, 1};
i := 3;
mon := n2
Iteration i = 3:
fk := f3 = n2 + i0 · i1;
L := {f, f1, f2, f3};
ML := {y, n0, n2, n10, i0 · i1, i2 · i3, 1};
i := 4;
mon := n10
Iteration i = 4:
fk := f4 = n10 + n7;
L := {f, f1, f2, f3, f4};
ML := {y, n0, n2, n10, n7, i0 · i1, i2 · i3, 1};
i := 5;
mon := n7
Iteration i = 5:
fk := f5 = n7 + n6 + n4 · i0;
L := {f, f1, f2, f3, f4, f5};
ML := {y, n0, n2, n10, n7, n6, n4 · i0, i0 · i1, i2 · i3, 1};
i := 6;
mon := n6
Iteration i = 6:
fk := f6 = n6 + n5 + n3 · i1;
L := {f, f1, f2, f3, f4, f5, f6};
ML := {y, n0, n2, n10, n7, n6, n5, n4 · i0, n3 · i1, i0 · i1, i2 · i3, 1};
i := 7;
mon := n5
Iteration i = 7:
fk := f7 = n5 + n4 · n3;
L := {f, f1, f2, f3, f4, f5, f6, f7};
ML := {y, n0, n2, n10, n7, n6, n5, n4 · n3, n4 · i0, n3 · i1, i0 · i1, i2 · i3, 1};
i := 8;
mon := n4 · n3
Iteration i = 8:
fk := f8 = n4 + i1 + i3;
L := {f, f1, f2, f3, f4, f5, f6, f7, n3 · f8};
ML := {y, n0, n2, n10, n7, n6, n5, n4 · n3, n4 · i0, n3 · i1, n3 · i3, i0 · i1, i2 · i3, 1};
i := 9;
mon := n4 · i0
Iteration i = 9:
fk := f8 = n4 + i1 + i3;
L := {f, f1, f2, f3, f4, f5, f6, f7, n3 · f8, i0 · f8};
ML := {y, n0, n2, n10, n7, n6, n5, n4 · n3, n4 · i0, n3 · i1, n3 · i3, i0 · i1,
i0 · i3, i2 · i3, 1};
i := 10;
mon := n3 · i1
Iteration i = 10:
fk := f9 = n3 + i0 + i2;
L := {f, f1, f2, f3, f4, f5, f6, f7, n3 · f8, i0 · f8, i1 · f9};
ML := {y, n0, n2, n10, n7, n6, n5, n4 · n3, n4 · i0, n3 · i1, n3 · i3, i0 · i1,
i0 · i3, i1 · i2, i2 · i3, 1};
i := 11;
mon := n3 · i3
Iteration i = 11:
fk := f9 = n3 + i0 + i2;
L := {f, f1, f2, f3, f4, f5, f6, f7, n3 · f8, i0 · f8, i1 · f9, i3 · f9};
ML := {y, n0, n2, n10, n7, n6, n5, n4 · n3, n4 · i0, n3 · i1, n3 · i3, i0 · i1,
i0 · i3, i1 · i2, i2 · i3, 1};
i := 12;
mon := i0 · i1
Termination: Because i0 · i1 contains variables ∈ PrimaryInputs only.
Each polynomial in L corresponds to a row in the matrix and each monomial
corresponds to a column. The generated matrix is shown in Table 6.5.
With the generated matrix, the polynomial reduction can be formulated as a series
of matrix subtractions, i.e., $Row_i - Row_{i-1}$. After all row subtractions, the reduction
result corresponds to the polynomial represented in the last row.
Two important points to be noted:
• All subtractions are computed modulo 2.
• If polynomials $f_i$ and $f_{i-1}$ have no common leading monomials, then they cannot
conduct a reduction. Correspondingly, in the matrix, when conducting $Row_i -
Row_{i-1}$, if the first non-zero entries of $Row_i$ and $Row_{i-1}$ are not in the same
column (leading monomials), then we move on to the next row and perform
$Row_{i+1} - Row_{i-1}$.
Table 6.5. Matrix created for polynomial reduction for Example 6.8.
             y    n0    n2   n10    n7    n6    n5 n4·n3 n4·i0 n3·i1 n3·i3 i0·i1 i0·i3 i1·i2 i2·i3     1
f            1     1     1     0     0     0     0     0     0     0     0     0     0     0     0     1
f1           1     0     0     1     0     0     0     0     0     0     0     0     0     0     0     0
f2           0     1     0     0     0     0     0     0     0     0     0     0     0     0     1     0
f3           0     0     1     0     0     0     0     0     0     0     0     1     0     0     0     0
f4           0     0     0     1     1     0     0     0     0     0     0     0     0     0     0     0
f5           0     0     0     0     1     1     0     0     1     0     0     0     0     0     0     0
f6           0     0     0     0     0     1     1     0     0     1     0     0     0     0     0     0
f7           0     0     0     0     0     0     1     1     0     0     0     0     0     0     0     0
n3 · f8      0     0     0     0     0     0     0     1     0     1     1     0     0     0     0     0
i0 · f8      0     0     0     0     0     0     0     0     1     0     0     1     1     0     0     0
i1 · f9      0     0     0     0     0     0     0     0     0     1     0     1     0     1     0     0
i3 · f9      0     0     0     0     0     0     0     0     0     0     1     0     1     0     1     0
This procedure is shown in Table 6.6 for i1 · f9 − i0 · f8: here lm(i1 · f9) = n3 · i1,
while the leading monomial of the reduced row i0 · f8 is n3 · i3. These leading monomials are not equal and they cannot
divide each other. Thus, we skip the current row (i1 · f9). Instead, we move to the next
row (i3 · f9) and compute i3 · f9 − i0 · f8. Finally, the last entry in Table 6.6 corresponds
to r = 1, and that denotes infeasibility of the miter circuit.
As shown in the above example, the polynomial reduction result r can be computed
by successively subtracting rows i from rows i + 1. Finally, the last row represents r. If
the last row only contains the monomial 1, the two circuits are equivalent. Otherwise,
the polynomial corresponding to the last row represents the bug polynomial.
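The reduction of Example 6.8 can be replayed with a short script. This is a simplified variant of the matrix procedure, offered as a sketch: instead of pre-building all rows with Algorithm 7, it generates a row (mon / lm(fk)) · fk only when the current remainder actually contains a reducible monomial, so the skipped row i1 · f9 is never created. Monomials are frozensets of variable names, rows are sets of monomials, and row subtraction modulo 2 is a set XOR. All names are hypothetical.

```python
# Variable order of Example 6.8 (x is omitted because it does not occur in f).
ORDER = ["y", "n0", "n2", "n10", "n7", "n6", "n5", "n4", "n3", "i0", "i1", "i2", "i3"]


def m(*names):
    """A multilinear monomial; m() is the constant 1."""
    return frozenset(names)


def lex_key(mono):
    """Exponent vector, so that tuple comparison realizes the lex order."""
    return tuple(1 if v in mono else 0 for v in ORDER)


# f1..f9 stored as: leading variable -> tail monomials.
GATES = {
    "y": [m("n10")],
    "n0": [m("i2", "i3")],
    "n2": [m("i0", "i1")],
    "n10": [m("n7")],
    "n7": [m("n6"), m("n4", "i0")],
    "n6": [m("n5"), m("n3", "i1")],
    "n5": [m("n4", "n3")],
    "n4": [m("i1"), m("i3")],
    "n3": [m("i0"), m("i2")],
}


def reduce_rows(f):
    """Cancel the largest reducible monomial of the remainder, one row at a time."""
    remainder, rows = set(f), []
    while True:
        reducible = [mono for mono in remainder if any(v in GATES for v in mono)]
        if not reducible:
            return remainder, rows
        mon = max(reducible, key=lex_key)
        lead = next(v for v in ORDER if v in mon and v in GATES)
        cofactor = mon - {lead}
        row = {mon}
        for t in GATES[lead]:
            row ^= {cofactor | t}        # (mon / lm(fk)) * tail term, mod 2
        rows.append(row)
        remainder ^= row                 # row subtraction modulo 2


f = {m("y"), m("n0"), m("n2"), m()}      # f = Spoly(fm, fo) = y + n0 + n2 + 1
r, used_rows = reduce_rows(f)
```

The remainder is the constant 1, as in the last row of Table 6.6, and ten reducer rows are used (f1 through f7, n3 · f8, i0 · f8 and i3 · f9).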
6.4 Experimental Results
The above verification approach using F4-style reduction has been implemented in
C++ as an efficient equivalence checking engine. Using this setup, we performed
experiments to verify equivalence between different finite field multiplier
implementations. Our experiments are conducted on a desktop with a 2.40GHz Intel Core(TM)2
Quad CPU and 8GB memory running 64-bit Linux.
6.4.1 Equivalence Checking of Structurally Similar Circuits
To evaluate the performance on structurally similar circuits, we conduct an
equivalence check between Mastrovito and Barrett multipliers. As shown in Chapter 3,
Mastrovito and Barrett multipliers are somewhat structurally similar. Table 6.7 shows
the results of verifying Mastrovito multipliers against Barrett multipliers. SAT solvers,
ABC and CSAT can solve them reasonably fast. Singular can also verify these circuits
within a matter of seconds. However, since Singular has a limitation on the number of
variables it can accommodate (< 65535 variables), it cannot verify circuits larger than
96-bit circuits. The results also show that our approach is the most efficient in verifying
circuit equivalence over finite fields.
Table 6.6. Subtraction result of the matrix created for polynomial reduction.
             y    n0    n2   n10    n7    n6    n5 n4·n3 n4·i0 n3·i1 n3·i3 i0·i1 i0·i3 i1·i2 i2·i3     1
f            1     1     1     0     0     0     0     0     0     0     0     0     0     0     0     1
f1           0     1     1     1     0     0     0     0     0     0     0     0     0     0     0     1
f2           0     0     1     1     0     0     0     0     0     0     0     0     0     0     1     1
f3           0     0     0     1     0     0     0     0     0     0     0     1     0     0     1     1
f4           0     0     0     0     1     0     0     0     0     0     0     1     0     0     1     1
f5           0     0     0     0     0     1     0     0     1     0     0     1     0     0     1     1
f6           0     0     0     0     0     0     1     0     1     1     0     1     0     0     1     1
f7           0     0     0     0     0     0     0     1     1     1     0     1     0     0     1     1
n3 · f8      0     0     0     0     0     0     0     0     1     0     1     1     0     0     1     1
i0 · f8      0     0     0     0     0     0     0     0     0     0     1     0     1     0     1     1
i1 · f9      0     0     0     0     0     0     0     0     0     1     0     1     0     1     0     0
i3 · f9      0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     1
6.4.2 Equivalence Checking of Structurally Dissimilar Circuits
As the experiments in Table 5.1 depict, given two structurally dissimilar circuits
(such as a Mastrovito versus a Montgomery multiplier), none of the SAT, SMT, BDD and
AIG-based methods are able to verify the equivalence of circuits beyond 16-bit. The
reason why ABC and CSAT are infeasible is that the structural hashing utilized by
ABC and CSAT is not beneficial for structurally dissimilar circuits. It is unable to
find common subcircuit nodes as they do not really exist. Without merging internal
subcircuit equivalences, these tools are unable to reduce the size of the verification
instance.
Our experiments perform verification between Montgomery multipliers on one hand,
and Mastrovito and Barrett multipliers on the other hand. Table 6.8 shows the runtimes
of equivalence verification of Barrett versus Montgomery multipliers. Table 6.9 shows
the runtimes for Mastrovito versus Montgomery multiplier verification. Singular can
only verify 64-bit multipliers because of the limit on the number of variables it
imposes. In contrast, our approach can successfully verify up to 128-bit multipliers with
dissimilar structures. In the tables, note that the verification time for 128-bit multipliers
is significantly less than that of 96-bit ones. These experimental results are correct: we
reran the experiments and also checked the circuit designs for errors, and no errors were
found. The reason for this anomaly may lie in the irreducible polynomials we selected
to construct the circuits.
6.5 Limitation of Our Approach
While our approach is efficient in verifying modulo-arithmetic circuits over finite fields
$\mathbb{F}_{2^k}$, it cannot be applied to verify multiplier circuits over integers or over
the finite ring $\mathbb{Z}_{2^k}$. This is due to the polynomial function representation of circuits over
integers. The polynomial representation of circuits over finite fields has a much simpler
form than that over integer rings. For example, circuits over finite fields are mainly
constructed by XOR and AND gates which can be transformed into simple polynomials
(mod 2):
a ∧ b → a · b (mod 2)
a ⊕ b → a + b (mod 2)
However, circuits over finite integer rings involve a large number of OR gates which
are transformed into polynomials as:
a ∨ b → a + b + a · b (mod 2)
Polynomial representations for OR-dominated functions include more monomial terms
and also more occurrences of variables among the terms. This eventually results in
size-explosion of the intermediate (remainder) polynomials in the reduction. Therefore,
our approach becomes infeasible in verifying integer arithmetic circuits over rings $\mathbb{Z}_{2^k}$.
Table 6.7. Verification of Mastrovito multiplier vs. Barrett multiplier. TO=10hrs.
⋆=Out of variable limitation. Time is given in seconds.
Size                        8      16      32      64      96     128     163
#variables                412    1445    4587   18953   42576  110543  195124
#gates                   1446    6846   25846  101401  227499  403036  653021
MiniSAT                  0.02    0.27    0.36    1.60   17.54    5.10   28.97
PicoSAT                  0.02    0.15    0.78    3.90    6.58   41.89  130.56
PrecoSAT                 0.05    0.40    1.61   22.98   91.90   90.25  187.53
CryptoMiniSAT            0.07    0.82    1.31    4.75   16.81  128.22   42.78
ABC                      0.12    1.07    0.82    2.79    5.72    9.79   18.67
CSAT                     0.03    3.02    0.58    0.87    1.83    5.97    5.49
Singular                 0.03    0.17    0.41    1.12       ⋆       ⋆       ⋆
Ours (correct design)    0.00    0.01    0.01    0.02    0.03    0.05    0.12
Ours (buggy design)      0.00    0.02    0.02    0.02    0.04    0.06    0.13
Table 6.8. Verification of Barrett multiplier vs. Montgomery multiplier.
TO=10hrs. ⋆=Out of variable limitation. Time is given in seconds.
Size                        8      16      32      64      96     128     163
#variables                942    3426    9478   40059   98452  197841  286357
#gates                   1968    8784   23548   86017  188121  330528  528903
Singular                 0.05  486.74 3210.30       ⋆       ⋆       ⋆       ⋆
Ours (correct design)    0.00    0.13    3.39  125.88 1407.86   59.18      TO
Ours (buggy design)      0.00    0.13    3.41  127.03 1435.14   59.86      TO
Table 6.9. Verification of Mastrovito multiplier vs. Montgomery multiplier. TO=10hrs.
Time is given in seconds.
Size                        8      16      32      64      96     128     163
#variables                934    3387    9346   39654   99163  204972  294578
#gates                   1958    8694   23318   86132  188526  331188  530278
Singular                 0.05  446.83 3646.12       ⋆       ⋆       ⋆       ⋆
Ours (correct design)    0.00    0.12    3.29  126.01 1463.95   59.37      TO
Ours (buggy design)      0.00    0.13    3.31  127.45 1511.82   60.10      TO
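The growth in monomial count for OR-dominated logic can be made concrete on a toy case. The sketch below is only an illustration of the a ∨ b → a + b + a · b mapping and is not drawn from the experiments: it compares the number of monomials in the polynomial of an n-input OR chain with that of an n-input XOR chain, using sets of frozensets as polynomials over $\mathbb{F}_2$ with x^2 = x. The helper names are hypothetical.

```python
def var(i):
    """Polynomial consisting of the single variable x_i."""
    return {frozenset([i])}


def mul(p, q):
    """Product over F_2 with x^2 = x; duplicate monomials cancel modulo 2."""
    out = set()
    for m in p:
        for n in q:
            out ^= {m | n}
    return out


def xor_gate(p, q):
    """a XOR b maps to a + b (mod 2)."""
    return p ^ q


def or_gate(p, q):
    """a OR b maps to a + b + a*b (mod 2)."""
    return p ^ q ^ mul(p, q)


def chain(gate, n):
    """Polynomial of gate(...gate(gate(x_0, x_1), x_2)..., x_{n-1})."""
    acc = var(0)
    for i in range(1, n):
        acc = gate(acc, var(i))
    return acc


# Number of monomials for n = 2..6 inputs.
xor_sizes = [len(chain(xor_gate, n)) for n in range(2, 7)]
or_sizes = [len(chain(or_gate, n)) for n in range(2, 7)]
```

The XOR chain grows linearly in n, whereas the OR chain contains every nonempty product of the inputs, i.e. 2^n − 1 monomials, which illustrates the size explosion described above.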
A conference paper that corresponds to the initial theoretical model for this problem was
published in [58] and a paper describing the efficient implementation of our approach
is under submission [59].
CHAPTER 7
VERIFICATION OF COMPOSITE FIELD
ARITHMETIC CIRCUITS
To reduce the high implementation cost of arithmetic over large fields, a methodology for
designing arithmetic circuits over a composite field has been proposed [71]: the finite field
F_{2^k} is decomposed as F_{(2^m)^n}, where k = m · n, and the arithmetic operations are then
performed over F_{(2^m)^n}. The decomposition introduces a hierarchy (modularity) in the design
by lifting the ground field from F_2 (bits) to F_{2^m} (words). This results in impressive area
and delay savings over large finite fields [71] [72] [86].
The hierarchy of composite field circuits also makes them challenging to verify: the designs
contain both word-level and bit-level information, a combination that, to our knowledge,
no contemporary technique can handle.
This chapter addresses the implementation verification of such arithmetic circuits.
We formulate the verification problem as a (radical) ideal membership test at different
abstraction levels and then solve it with the approach presented in Chapter 5, i.e.,
by conducting a polynomial reduction.
Our approach is based on the known field decomposition information and the circuit
hierarchy. We utilize this information to:
• first verify the correctness of lower-level building-blocks (adders and multipliers)
over the ground field F_{2^m};
• then verify the overall function at the higher-level over the extension field F_{(2^m)^n}.
Using our approach, we are able to prove the correctness of finite field circuits for
up to 1024-bit with decomposition F_{(2^32)^32}.
7.1 Circuit Designs over Composite Fields
The finite field F_{2^k} is a k-dimensional vector space over the subfield F_2. If k =
m · n, the field F_{2^k} can be decomposed as F_{(2^m)^n}. Such a field representation is called a
composite field, and it is constructed as an n-dimensional extension of the subfield F_{2^m}.
The subfield F_{2^m} is called the ground field. Note that we have F_2 ⊂ F_{2^m} ⊂ F_{(2^m)^n}.
According to Theorem 3.1, there exists a unique field of size p^k, up to isomorphism. This
implies that F_{2^k} is isomorphic to F_{(2^m)^n} when k = m · n, and due to this isomorphism, it is
possible to derive one field representation from the other. The principle of constructing
a composite field is described in [71]. Here we derive concrete steps for circuit design
purposes.
Definition 7.1 A primitive polynomial P(x) is a polynomial with coefficients in F_2
which has a root α ∈ F_{2^k} such that {0, 1, α, α^2, · · · , α^{2^k−2}} is the set of all elements in
F_{2^k}, where α is a primitive element of F_{2^k}.
Every primitive polynomial is irreducible, but the converse does not hold. The difference
lies in whether a root generates all the non-zero elements of the finite field F_{2^k}: the
powers of a root of a primitive polynomial enumerate every non-zero element of F_{2^k}, whereas
the powers of a root of an irreducible polynomial that is not primitive enumerate only a
proper subset of them.
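The distinction can be checked numerically. The following is a minimal sketch, assuming polynomials over F_2 are encoded as integer bitmasks (bit i holds the coefficient of x^i); the function names are illustrative and do not come from the dissertation. It computes the multiplicative order of the root x modulo a degree-4 polynomial: x^4 + x^3 + 1 is primitive, while x^4 + x^3 + x^2 + x + 1 is irreducible but not primitive.

```python
def mulmod(a, b, poly, k):
    """Multiply a*b in F_2[x]/(poly); deg(poly) = k, operands are < 2**k."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> k) & 1:  # degree reached k: subtract (XOR) the modulus
            a ^= poly
    return r

def order_of_x(poly, k):
    """Multiplicative order of the root x modulo poly (poly assumed irreducible, k >= 2)."""
    e, n = 0b10, 1  # e = x^1
    while e != 1:
        e = mulmod(e, 0b10, poly, k)
        n += 1
    return n

def is_primitive(poly, k):
    # A root of a primitive polynomial has the maximal order 2^k - 1.
    return order_of_x(poly, k) == 2 ** k - 1

print(order_of_x(0b11001, 4), is_primitive(0b11001, 4))  # x^4 + x^3 + 1
print(order_of_x(0b11111, 4), is_primitive(0b11111, 4))  # x^4 + x^3 + x^2 + x + 1
```

For the second polynomial the root satisfies x^5 = 1, so its powers cover only 5 of the 15 non-zero elements of F_{2^4}.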
Recall that to construct a finite field F_{2^k}, we need a primitive polynomial P(x) ∈
F_2[x] of degree k. Similarly, to construct F_{(2^m)^n}, we require a primitive polynomial,
of degree n, with coefficients from the ground field F_{2^m}. Given F_{2^k} and P(x), the
primitive polynomial of the composite field can be easily derived. We will use the
following notation:
• Let P(x) denote the given primitive polynomial of the general field F_{2^k}, and α be the
primitive root, i.e., P(α) = 0.
• Let Q(x) denote the primitive polynomial of the ground field F_{2^m}, and β be the
primitive root of F_{2^m}, i.e., Q(β) = 0. Note that Q(x) is a degree-m primitive
polynomial over F_2, so it is also known.
• Let R(x) denote the primitive polynomial of the composite field F_{(2^m)^n}, and γ be the
primitive root, i.e., R(γ) = 0. This polynomial R(x) has to be derived.
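Lemma 7.1 From [86]: Let F_{2^k} be decomposed as F_{(2^m)^n}, where k = m · n. Let γ be
the primitive root of F_{(2^m)^n}. Then the primitive polynomial R(x) of the composite field is

\[ R(x) = \prod_{j=0}^{n-1} \left( x + \gamma^{2^{mj}} \right) \tag{7.1} \]
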
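Since F_{2^k} is isomorphic to F_{(2^m)^n}, α and γ are actually the same element. Now, let
us consider the representation of an element A in F_{2^k} and its corresponding representation
in the composite field:

\[ A = \sum_{i=0}^{k-1} a_i \cdot \alpha^i, \quad a_i \in \mathbb{F}_2, \text{ and } P(\alpha) = 0 \tag{7.2} \]

\[ A = \sum_{i=0}^{n-1} A_i \cdot \gamma^i, \quad A_i \in \mathbb{F}_{2^m}, \text{ and } R(\gamma) = 0 \tag{7.3} \]

\[ A_i = \sum_{j=0}^{m-1} a_{ij} \cdot \beta^j, \quad a_{ij} \in \mathbb{F}_2, \text{ and } Q(\beta) = 0 \tag{7.4} \]
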
Now, we need to find the relationship between the primitive roots α and β (or
between γ and β, since α = γ), so as to be able to map the elements from F_{2^k} to
F_{(2^m)^n}. We have the following result [86]:
Theorem 7.1 For γ ∈ F_{(2^m)^n}, and β = γ^ω, where ω = (2^{m·n} − 1)/(2^m − 1), then we




The above result states the following: Since γ is a primitive root, it can be used
to generate all the non-zero elements of F_{(2^m)^n}. Moreover, β is a primitive root of the
ground field F_{2^m}, which is a subfield of F_{(2^m)^n} (i.e., F_{2^m} ⊂ F_{(2^m)^n}); so β ∈ F_{(2^m)^n}.
Therefore, an exponent of γ can be used to generate β as β = γ^ω, where ω is given in
Theorem 7.1. Now, we know all the relationships between α, β, γ, and we are ready to
perform the decomposition.
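As a quick numerical check of this relationship, the sketch below computes ω for m = n = 2 and confirms that β = α^ω is a root of x^2 + x + 1 inside F_{2^4} constructed with P(x) = x^4 + x^3 + 1. The choice of polynomials anticipates the running example; the bitmask encoding and the function names are assumptions of this illustration.

```python
P, K = 0b11001, 4  # P(x) = x^4 + x^3 + 1; bit i is the coefficient of alpha^i

def mulmod(a, b):
    """Multiply two elements of F_{2^4} = F_2[x]/(P)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> K) & 1:
            a ^= P
    return r

def power(a, e):
    r = 1
    for _ in range(e):
        r = mulmod(r, a)
    return r

def omega(m, n):
    # Exponent from Theorem 7.1; the division is exact.
    return (2 ** (m * n) - 1) // (2 ** m - 1)

w = omega(2, 2)
alpha = 0b0010
beta = power(alpha, w)                     # beta = alpha^omega
q_of_beta = mulmod(beta, beta) ^ beta ^ 1  # Q(beta) = beta^2 + beta + 1
print(w, bin(beta), q_of_beta, power(beta, 3))
```

Here β^3 = 1 with β ≠ 1, so β has order 2^m − 1 = 3 and generates the non-zero elements of the ground field F_{2^2}.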
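Example 7.1 As an example, let us reconsider the field F_{2^4} and decompose it as F_{(2^2)^2}.
Let P(x) = x^4 + x^3 + 1 and P(α) = 0. We need to perform the following steps:
1. Derivation of R(x), with m = 2 and n = 2:

\[ R(x) = (x + \gamma) \cdot (x + \gamma^{2^2}) = x^2 + (\gamma^4 + \gamma) \cdot x + \gamma^5 \]

Notice that R(γ) = γ^2 + (γ^4 + γ) · γ + γ^5 = 0.
2. Representation of A in F_{(2^2)^2}:

\[ A = \sum_{i=0}^{1} A_i \cdot \gamma^i, \quad A_i \in \mathbb{F}_{2^2} \]
\[ \phantom{A} = A_0 + A_1 \cdot \gamma \]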
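3. Representation of A_0, A_1 in F_{2^m}:
A_0 = a_00 + a_01 · β
A_1 = a_10 + a_11 · β
where a_ij ∈ F_2. Q(x) can be any primitive polynomial of degree m = 2 over F_2 that
constructs the ground field F_{2^2}. Let us take Q(x) = x^2 + x + 1.
4. Representation of A in terms of the bits a_ij:

\[ A = \sum_{i=0}^{1} \Big( \sum_{j=0}^{1} a_{ij} \cdot \beta^j \Big) \cdot \gamma^i = a_{00} + a_{01} \cdot \beta + (a_{10} + a_{11} \cdot \beta) \cdot \gamma \]

where each a_ij ∈ F_2. From Eqn. (7.5), we have: β = α^5 = γ^5. We then substitute
β = α^5 and γ = α:

\[ A = \sum_{i=0}^{1} \Big( \sum_{j=0}^{1} a_{ij} \cdot \beta^j \Big) \cdot \gamma^i = a_{00} + a_{01} \cdot \alpha^5 + (a_{10} + a_{11} \cdot \alpha^5) \cdot \alpha \]

Since P(x) = x^4 + x^3 + 1 with P(α) = 0, we have α^5 = α^3 + α + 1 and
α^6 = α^3 + α^2 + α + 1, so that
A (mod P(α)) = (a_00 + a_01 + a_11) + (a_01 + a_10 + a_11) · α + a_11 · α^2 + (a_01 + a_11) · α^3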
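5. The same element A ∈ F_{2^4} is represented as:
A = a_0 + a_1 · α + a_2 · α^2 + a_3 · α^3
6. Since the expressions obtained in Steps 4 and 5 represent the same element, we can
match the coefficients of the polynomials to obtain:
a_0 = a_00 + a_01 + a_11
a_1 = a_01 + a_10 + a_11
a_2 = a_11
a_3 = a_01 + a_11
In matrix form over F_2, this mapping and its inverse are:

\[
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} a_{00} \\ a_{01} \\ a_{10} \\ a_{11} \end{bmatrix},
\qquad
\begin{bmatrix} a_{00} \\ a_{01} \\ a_{10} \\ a_{11} \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix}
\]
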
Now, we have successfully derived the composite field representation F_{(2^2)^2} from
F_{2^4}. The element A ∈ F_{2^4} is represented as A = a_0 + a_1·α + a_2·α^2 + a_3·α^3, where
P(α) = 0. The same element A is represented in F_{(2^2)^2} as:
A = A_0 + A_1 · α
A_0 = a_00 + a_01 · α^5
A_1 = a_10 + a_11 · α^5
a_00 = a_0 + a_3
a_01 = a_2 + a_3
a_10 = a_1 + a_3
a_11 = a_2
In the above equations, α = γ and R(γ) = 0.
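The conversion derived in Example 7.1 can be cross-checked mechanically. The sketch below is an illustration under assumed encodings (4-bit integers with bit t holding a_t; tuples ordered as (a_00, a_01, a_10, a_11)), not the author's implementation. It maps every composite bit-vector to the standard basis through the basis elements 1, β, γ, β·γ with β = α^5 and γ = α, inverts that map by enumeration, and compares the result with the closed-form equations above.

```python
P, K = 0b11001, 4  # P(x) = x^4 + x^3 + 1

def gf16_mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> K) & 1:
            a ^= P
    return r

def gf16_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf16_mul(r, a)
    return r

ALPHA = 0b0010
BETA = gf16_pow(ALPHA, 5)
# Images of the unit vectors a00, a01, a10, a11 in the standard basis.
BASIS = [1, BETA, ALPHA, gf16_mul(BETA, ALPHA)]

def to_standard(bits):
    """(a00, a01, a10, a11) -> 4-bit integer a3 a2 a1 a0."""
    acc = 0
    for bit, vec in zip(bits, BASIS):
        if bit:
            acc ^= vec
    return acc

def closed_form(a):
    """Inverse mapping as stated at the end of Example 7.1."""
    a0, a1, a2, a3 = [(a >> t) & 1 for t in range(4)]
    return (a0 ^ a3, a2 ^ a3, a1 ^ a3, a2)

# Invert the linear map by enumerating all 16 composite bit-vectors.
inverse = {}
for v in range(16):
    bits = tuple((v >> t) & 1 for t in range(4))
    inverse[to_standard(bits)] = bits

agree = all(inverse[a] == closed_form(a) for a in range(16))
print("closed form matches enumerated inverse:", agree)
```

Enumeration is adequate here only because the field has 16 elements; for larger k one would invert the k × k binary matrix by Gaussian elimination instead.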
Multiplication A · B (mod P(x)) over F_{2^4} can now be performed over the
decomposition F_{(2^2)^2}, where A = A_0 + A_1·γ, B = B_0 + B_1·γ and the modulus is taken
over R(γ). Such a design is shown in Figure 7.1, where a_0, a_1, a_2, a_3, b_0, b_1, b_2, b_3 are
primary inputs. After a suitable transformation, composite field inputs are obtained
as a_00, a_01, a_10, a_11, b_00, b_01, b_10, b_11. A_0, A_1, B_0, B_1 are 2-bit buses. Correspondingly,
each block in Figure 7.1 internally represents a 2-bit operation: × represents 2-bit
multiplication and + represents 2-bit addition over the ground field. A logic circuit
implementation of the Mastrovito multiplier over F_{2^4} is shown in Figure 7.2.

Figure 7.2. Mastrovito multiplier over F_{2^4}.
Its corresponding composite field design with decomposition F_{(2^2)^2} is shown in
Figure 7.1. Each block in Figure 7.1 represents an m-bit (here, 2-bit) operation internally,
where × represents an m-bit multiplier and + represents an m-bit adder.
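The equivalence of the two designs for this small case can be confirmed by exhaustive simulation, as sketched below. This is only an illustration and not the verification method of this chapter, which relies on polynomial reduction. The sketch assumes the identity γ^2 = β·γ + β, obtained here by substituting γ^4 + γ = γ^5 = β into R(γ) = 0, so that A · B = (A_0·B_0 + β·A_1·B_1) + (A_0·B_1 + A_1·B_0 + β·A_1·B_1)·γ; all function names are hypothetical.

```python
P, K = 0b11001, 4  # P(x) = x^4 + x^3 + 1 over F_2

def gf16_mul(a, b):
    """Reference: A * B (mod P), operands as 4-bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> K) & 1:
            a ^= P
    return r

def gf4_mul(x, y):
    """Ground-field multiply in F_4 = F_2[beta]/(beta^2 + beta + 1)."""
    x0, x1, y0, y1 = x & 1, x >> 1, y & 1, y >> 1
    c0 = (x0 & y0) ^ (x1 & y1)
    c1 = (x0 & y1) ^ (x1 & y0) ^ (x1 & y1)
    return c0 | (c1 << 1)

BETA_WORD = 0b10  # beta as a 2-bit word of F_4

def split(a):
    """Bits a0..a3 -> words (A0, A1), using the mapping of Example 7.1."""
    a0, a1, a2, a3 = [(a >> t) & 1 for t in range(4)]
    a00, a01, a10, a11 = a0 ^ a3, a2 ^ a3, a1 ^ a3, a2
    return a00 | (a01 << 1), a10 | (a11 << 1)

def join(w0, w1):
    """Words of w0 + w1*gamma -> standard bits via the basis 1, beta, gamma, beta*gamma."""
    alpha, beta = 0b0010, 0b1011  # beta = alpha^5 = alpha^3 + alpha + 1
    basis = [1, beta, alpha, gf16_mul(beta, alpha)]
    bits = [w0 & 1, w0 >> 1, w1 & 1, w1 >> 1]
    acc = 0
    for bit, vec in zip(bits, basis):
        if bit:
            acc ^= vec
    return acc

def composite_mul(a, b):
    A0, A1 = split(a)
    B0, B1 = split(b)
    low = gf4_mul(A0, B0)
    cross = gf4_mul(A0, B1) ^ gf4_mul(A1, B0)
    high = gf4_mul(gf4_mul(A1, B1), BETA_WORD)  # beta * A1 * B1
    return join(low ^ high, cross ^ high)       # constant term, gamma coefficient

mismatches = [(a, b) for a in range(16) for b in range(16)
              if composite_mul(a, b) != gf16_mul(a, b)]
print("mismatches:", mismatches)
```

Exhaustive simulation needs 2^{2k} input pairs, which is why it does not scale beyond toy field sizes and why an algebraic formulation is pursued in the next section.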
7.2 Problem Formulation and Hierarchy Verification
Let us again take the multiplier verification problem as an example. The specification
S = A · B (mod P(x)) is already given in polynomial form (word-level). The
implementation is available at two different abstraction levels: one at the bit-level
(ground field F_{2^m} adders and multipliers) and one at the higher-level at F_{(2^m)^n}. Using
this information, we derive constraints (polynomials) Z corresponding to the circuit.
Our verification problem is to prove/disprove that for all values of the inputs A =
{a_0, . . . , a_{k−1}}, B = {b_0, . . . , b_{k−1}}, the circuit implementation Z correctly computes
the multiplication S.
As we can notice from Figure 7.1, the entire composite field circuit is constructed
from lower-level building-blocks (adders and multipliers). Therefore, we have two
verification objectives: the low-level circuits and the higher-level interconnection of the
lower-level blocks.
Verification of Low-Level Circuits over F_{2^m}:
Low-level building-blocks consist of adders and multipliers over F_{2^m}. These circuits
are implemented at gate-level and are no different from the regular finite field circuits
we verified before. Therefore, we can simply employ the same methods described
in Chapter 5 to formulate the verification test as membership testing of the property
polynomial (S + Z = 0). Once the correctness of the low-level circuits is certified, we can
conduct the high-level verification over F_{(2^m)^n}.
Verification of Higher-Level Interconnection over F_{(2^m)^n}: The difficulty of
verifying the composite field circuits lies in the verification of the high-level interconnection
of low-level building-blocks. Specifically, due to the hierarchy present in composite
field circuits, the constraints derived from the high-level interconnection contain both
gate-level and word-level abstractions. For example, in Figure 7.1, the circuit hierarchy
can be described as follows:
a_00 = a_0 + a_3
a_01 = a_2 + a_3
a_10 = a_1 + a_3
a_11 = a_2
A_0 = a_00 + a_01 · α^5
A_1 = a_10 + a_11 · α^5
b_00 = b_0 + b_3
b_01 = b_2 + b_3
b_10 = b_1 + b_3
b_11 = b_2
B_0 = b_00 + b_01 · α^5
B_1 = b_10 + b_11 · α^5
C_0 = A_0 · B_1
C_1 = A_1 · B_0
C_2 = A_0 · B_0
C_3 = A_1 · B_1
C_4 = C_0 + C_1
C_5 = C_3 · α^5
C_6 = C_3 · α^5
Z_0 = C_4 + C_5
Z_1 = C_2 + C_6
where a_0, . . . , a_3, b_0, . . . , b_3 are variables in F_2 (bits) while A_0, A_1, B_0, B_1, C_0, . . . , C_6,
Z_0, Z_1 are variables in F_{2^2} (words). Therefore, bit-level variables and word-level
variables co-exist in the design. As far as we know, there are no techniques that can verify
designs with different levels of abstraction. This is mainly because BDD/SAT/AIG-based
approaches can only handle bit-level problems. SMT solvers, on the other hand,
offer no advantage when solving problems at bit-level. Besides, SMT solvers formulate
every problem over rings instead of finite fields. Take Equation 7.6 for example; C_0 =
A_0 · B_1 represents a 2-bit finite field multiplication. In SMT, C_0 = A_0 · B_1 represents
a 2-bit integer multiplication. As we know, multiplication over rings and over finite
fields differs significantly.
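The difference is already visible on 2-bit operands. The following sketch, with an assumed encoding of F_{2^2} elements as 2-bit integers (bit 0 for the constant term, bit 1 for the coefficient of β, and β^2 = β + 1), lists the operand pairs on which integer multiplication modulo 4 and multiplication in F_{2^2} disagree.

```python
def ring_mul(x, y):
    """2-bit integer multiplication, i.e., multiplication in the ring Z_4."""
    return (x * y) % 4

def field_mul(x, y):
    """Multiplication in F_4 = F_2[beta]/(beta^2 + beta + 1)."""
    x0, x1, y0, y1 = x & 1, x >> 1, y & 1, y >> 1
    c0 = (x0 & y0) ^ (x1 & y1)
    c1 = (x0 & y1) ^ (x1 & y0) ^ (x1 & y1)
    return c0 | (c1 << 1)

differing = [(x, y) for x in range(4) for y in range(4)
             if ring_mul(x, y) != field_mul(x, y)]
for x, y in differing:
    print(f"{x} * {y}: ring = {ring_mul(x, y)}, field = {field_mul(x, y)}")
```

In particular, 2 · 2 is 0 in Z_4, so 2 is a zero divisor there, whereas a field has no zero divisors: the same bit pattern, read as β, squares to β + 1.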
Fortunately, because both bit-level and word-level information can be formulated
as polynomials, this verification problem is algebraic in nature and can therefore be
formulated as a system of polynomials and solved by ideal membership testing,
which is described in Algorithm 5.
Example 7.2 Our high-level verification problem is illustrated in Table 7.1. Let F
denote all the polynomials representing implementation, specification and vanishing
polynomials. Let F0 denote the vanishing polynomials for primary inputs. After all the
polynomials in {F} are available, we just need to check whether S + Z is a member of
the ideal ⟨F, F_0⟩.
7.3 Experimental Results
With the approach presented above, we have conducted experiments to hierarchically
verify Mastrovito multiplier implementations M against the specification S =
A · B (mod P(x)). Our verification setup is shown in Table 7.1. The implementation
is given as a circuit over F_{(2^m)^n}. With the given hierarchy information, we construct the
polynomials representing high-level designs M_H over F_{(2^m)^n} and low-level designs M_L
over F_{2^m} separately.
Table 7.1. Verification setup over F_{(2^2)^2}

implementation            specification                                vanishing polynomials
a_00 + a_0 + a_3          A + a_0 + a_1·α + a_2·α^2 + a_3·α^3          a_0^2 − a_0
a_01 + a_2 + a_3          B + b_0 + b_1·α + b_2·α^2 + b_3·α^3          a_1^2 − a_1
a_10 + a_1 + a_3          S + A × B                                    a_2^2 − a_2
a_11 + a_2                                                             a_3^2 − a_3
A_0 + a_00 + a_01·α^5                                                  b_0^2 − b_0
A_1 + a_10 + a_11·α^5                                                  b_1^2 − b_1
b_00 + b_0 + b_3                                                       b_2^2 − b_2
b_01 + b_2 + b_3                                                       b_3^2 − b_3
b_10 + b_1 + b_3
b_11 + b_2
B_0 + b_00 + b_01·α^5
B_1 + b_10 + b_11·α^5
C_0 + A_0·B_1
C_1 + A_1·B_0
C_2 + A_0·B_0
C_3 + A_1·B_1
C_4 + C_0 + C_1
C_5 + C_3·α^5
C_6 + C_3·α^5
Z_0 + C_4 + C_5
Z_1 + C_2 + C_6
Z + Z_1 + Z_0·α

Property: Z + S
For high-level designs M_H, the specification polynomial S = A · B (mod P(x))
is used. In contrast, for low-level designs M_L over F_{2^m}, the specification polynomial
S_L = A_m · B_m (mod Q(x)) is used, where A_m, B_m represent the m-bit inputs
of the low-level building-block circuits and Q(x) is the primitive polynomial of F_{2^m}. Then
vanishing polynomials a_0^2 − a_0, . . . , a_{k−1}^2 − a_{k−1}, b_0^2 − b_0, . . . , b_{k−1}^2 − b_{k−1} are appended to
M_H and M_L at the different levels of design. We use Singular [28] to conduct polynomial
reduction. When the circuits are correctly designed, we do observe that the reduction
result is 0, proving the equivalence.
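The role of the vanishing polynomials can be illustrated numerically: a_i^2 − a_i evaluates to zero at every point of F_2, which is what restricts the bit-level variables to field values. The sketch below checks this, and also checks the word-level analogue X^{2^m} − X over F_{2^2}; the latter is a standard finite field fact included here as an assumption of the illustration, not a polynomial listed in Table 7.1. The encoding of F_{2^2} as 2-bit integers is likewise an assumption.

```python
def gf4_mul(x, y):
    """Multiplication in F_4 = F_2[beta]/(beta^2 + beta + 1), elements as 2-bit ints."""
    x0, x1, y0, y1 = x & 1, x >> 1, y & 1, y >> 1
    c0 = (x0 & y0) ^ (x1 & y1)
    c1 = (x0 & y1) ^ (x1 & y0) ^ (x1 & y1)
    return c0 | (c1 << 1)

def gf4_pow(x, e):
    r = 1
    for _ in range(e):
        r = gf4_mul(r, x)
    return r

# Bit level: a^2 - a vanishes on F_2 = {0, 1}.
bit_level_ok = all((a * a - a) % 2 == 0 for a in (0, 1))

# Word level: X^4 - X (= X^4 + X in characteristic 2) vanishes on all of F_4.
word_level_ok = all(gf4_pow(x, 4) ^ x == 0 for x in range(4))

print(bit_level_ok, word_level_ok)
```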
Our experiments are conducted on a desktop with 2.40GHz CPU and 8GB memory
running 64-bit Linux. The time-out limit is set as 24 hours.
The verification of low-level circuits is the same as the one shown in Table 5.3.
The number of low-level design units is shown in Table 7.2. Note that this number is
determined by n, which means F_{(2^{m1})^n} and F_{(2^{m2})^n} have the same number of low-level
design units, even if m1 ≠ m2.
Since high-level verification cannot be solved by any other technique, we only
show the results of our approach. Table 7.3 shows the runtime of high-level design
verification over F_{(2^m)^n} for varying word-size k = m · n. As shown in Table 7.3,
with our approach, we are able to prove the correctness of finite field circuits for up to
1024-bit with decomposition F_{(2^32)^32}.
7.4 Conclusions
This chapter has targeted the implementation verification of hierarchically designed
composite finite field circuits. Decomposing the finite field F_{2^k} as F_{(2^m)^n} introduces a
hierarchical abstraction. Our approach requires that this hierarchy information be made
available. Then, we formulate the verification problem as an ideal membership test,
solved by polynomial reduction, at different levels of abstraction. First we verify low-level
adders and multipliers at F_{2^m}, and then verify the high-level interconnections between
these blocks at F_{(2^m)^n}. Using our approach, we can verify the correctness of up to
1024-bit multipliers, whereas other contemporary techniques are not capable of verifying
such circuits. This work was presented in [56].
Table 7.2. Statistics of designs over F_{2^m}.

n                2      4      8     16     32
#Multipliers     6     36    168    720   2976
#Adders          3     27    147    675   2883
Table 7.3. Verification of Mastrovito multiplier over F_{(2^m)^n} using proposed approach.
All times are given in seconds.

     k = 32           k = 64           k = 128           k = 256           k = 512           k = 1024
 m   n   time     m   n   time     m   n    time     m    n    time     m    n    time     m    n    time
 2  16   7.55     2  32  879.83    2  64       ∗     2  128       ∗     2  256       ∗     2  512       ∗
 4   8   0.12     4  16   10.81    4  32  1619.51    4   64       ∗     4  128       ∗     4  256       ∗
 8   4   0.01     8   8    0.46    8  16    35.04    8   32  2664.56    8   64       ∗     8  128       ∗
16   2   0.01    16   4    0.15   16   8     3.25   16   16   147.84   16   32   11510    16   64       ∗
 -   -      -    32   2    0.11   32   4     2.14   32    8    37.71   32   16  1166.10   32   32   75336
CHAPTER 8
CONCLUSIONS AND FUTURE WORK
This dissertation presents approaches to performing equivalence checking for
arithmetic circuits over finite fields F_{2^k}. In particular, we target two specific problems:
i) verifying the correctness of a custom-designed arithmetic circuit implementation against
a given word-level polynomial specification over F_{2^k}; and ii) gate-level equivalence
checking of two structurally dissimilar arithmetic circuits. We propose polynomial
abstractions over finite fields to model and represent the circuit constraints. Subsequently,
decision procedures based on modern computer algebra techniques – notably Gröbner
bases-related theory and technology – are engineered to solve the verification problem
efficiently.
8.1 Computer Algebra-Based Approaches for Equivalence
Checking of Arithmetic Circuits over F_{2^k}
The arithmetic circuit is modeled as a polynomial system in the ring F_{2^k}[x_1, x_2, · · · ,
x_d], and computer algebra- and algebraic geometry-based results (Hilbert’s Nullstellensatz)
over finite fields are exploited for verification. Two formulations are presented to
address the implementation verification and the equivalence checking problems.
Using the results of Strong Nullstellensatz over finite fields, the first verification
problem is formulated as an ideal membership test. This ideal membership test
requires the computation of a Gröbner basis. The Gröbner basis computation is known
to have double-exponential worst-case complexity in the input data, so
straightforward use of Gröbner basis engines for verification is infeasible for large
circuits. To overcome this complexity, we analyze the
given circuit topology to get more theoretical insights into the polynomial ideals
corresponding to the circuit constraints. Based on this circuit information, we derive efficient
term orderings to represent the polynomials. Subsequently, using the theory of Gröbner
bases over finite fields, we prove that our term orderings render the set of polynomials
itself a Gröbner basis – thus obviating the need for Buchberger’s algorithm. To fulfill
our verification purpose, we simply conduct a polynomial reduction to test whether the
equality property is a member of the ideal representing the circuit constraints.
The equivalence checking for two structurally dissimilar arithmetic circuits is still
a challenge for contemporary techniques. By utilizing computer algebra theory, we
formulate this problem as a Weak Nullstellensatz proof using Gröbner bases computation.
Once again, this would require the computation of a reduced Gröbner basis, which
is expensive for large circuits. To overcome this complexity, we want to exploit our
circuit-based term ordering for polynomial representation. Unfortunately, unlike in the
previous case, the set of polynomials corresponding to this verification instance does
not constitute a Gröbner basis. Instead of computing a Gröbner basis for the whole
circuit, we identify a minimal number of S-polynomial computations that are sufficient
to prove equivalence or to detect bugs for the whole circuit.
The verification of composite field circuits is a successful application of our
computer algebra-based approaches. To construct a composite field circuit over F_{(2^m)^n}, the
finite field F_{2^k} is decomposed as F_{(2^m)^n}, where k = m · n, and the arithmetic operations
are then performed over F_{(2^m)^n}. The decomposition introduces a hierarchy (modularity)
in the design by lifting the ground field from F_2 (bits) to F_{2^m} (words). We formulate
the verification problem as a (radical) ideal membership test at different abstraction
levels. Using the circuit hierarchy information, we first verify the correctness
of lower-level building-blocks (adders and multipliers) over the ground field F_{2^m}, then
we verify the overall arithmetic at the higher-level over the extension field F_{(2^m)^n}.
8.2 Future Work
The approaches and theories presented in this dissertation can be further extended
to enhance the efficiency of equivalence checking of arithmetic circuits. Some future
research directions are proposed here.
8.2.1 Speeding up Verification Using a Graphics Processing Unit
As shown in Figure 6.2, the equivalence of “CIRCUIT1” and “CIRCUIT2” is
formulated as a single miter at word-level. However, since the circuits have multiple
outputs (k), we can create k miters, one for each output bit. In such cases, we will have
to compute the reduction Spoly(f_m, f_o) \xrightarrow{F, F_0}_{+} r for each of the k outputs,
and check if r = 1 in each case. These are k independent computations, so they
would benefit immensely from parallelization.
It is desirable to implement this technique on a hardware accelerator, particularly
on an NVIDIA Graphics Processing Unit (GPU). In the Electronic Design Automation
(EDA) community, there has been a lot of interest in exploiting GPU computing to
improve synthesis and verification algorithms. Significant speed-ups have been observed
in GPU implementations of circuit simulation algorithms (see for example [35]). Further
study is needed on how to efficiently implement our circuit verification approach
using independent S-polynomial reductions on a general purpose GPU.
8.2.2 Extraction of Circuit Abstraction
Suppose that we are given a circuit that implements a polynomial function
F_{2^k} → F_{2^k}, but we do not know what function it implements. Can we identify a
polynomial representation of this function: f(X, Y) where X represents the input
bit-vector and Y the output? This problem is one of hierarchy abstraction and is used
in component matching and resource allocation in high-level synthesis.
To explain this idea, let us revisit the example of Figure 5.2, a 2-bit multiplier. It
implements a polynomial function Z = A · B; Z, A, B ∈ F_{2^2}. Here A = a_0 + a_1·α, B =
b_0 + b_1·α, Z = z_0 + z_1·α. Let us represent a polynomial for each gate in the circuit.
We will impose the following term order: lex term order with “circuit Variables” >
“Inputs, A, B” > “Output Z”. That is, we use lex term order with c_0 > c_1 > c_2 > c_3 >
r_0 > a_0 > a_1 > b_0 > b_1 > z_0 > z_1 > A > B > Z. If we use this order to compute a
Gröbner basis of the circuit polynomials, then we obtain the following polynomials:
f1 : z0 + z1α + Z
f2 : b0 + b1α + B
f3 : a0 + a1α + A
f4 : c3 + r0 + z1
f5 : c1 + c2 + r0
f6 : c0 + c3 + z0
f7 : A ·B + Z
f8 : a1 · b1 + a1 ·B + b1 · A+ z1
f9 : r0 + a1 · b1 + z1
f10 : c2 + a1 · b0
Notice that the polynomial f7 : A · B + Z is indeed the polynomial representation of
the function implemented by the circuit. We were thus able to “extract” the polynomial
representation using a Gröbner basis.
Polynomial interpolation techniques for this problem were studied in [80] [81].
Further research should be conducted to investigate if we can use Gröbner basis techniques
to efficiently interpolate a polynomial representation from a circuit.
8.2.3 Simulation-Based Verification of Circuits
In our group’s previous work [78] [77], we show that given two polynomial
functions f, g over Z_{2^k}, exhaustive simulation is not always necessary to prove their
equivalence. We identified an integer λ such that functions (polynomials) f, g need to be
evaluated only for λ input vectors: {V_1, . . . , V_λ}. If f = g for these λ vectors, then
f = g over the entire design space. If f ≠ g, then we are guaranteed to catch the bug within
these λ vectors. In practice, λ ≪ 2^k.
Unfortunately, this result did not find much practical application as it required
that f, g be polynomial functions. Not every function (circuit) f : Z_{2^k} → Z_{2^k} is a
polynomial function. Instead of modeling a k-input/output circuit as a function
f : Z_{2^k} → Z_{2^k}, we conjecture that it can be viewed as a polynomial function
over finite fields f : F_{2^k} → F_{2^k}. This way, we can then prove equivalence of two
polyfunctions f, g : F_{2^k} → F_{2^k} without resorting to exhaustive simulation. It is
promising to solve the same problem as in [78] [77], but now over a different domain:
F_{2^k}.
REFERENCES
[1] SimplifyingSTP, SMT-COMP2010. http://www.smtcomp.org/2010.
[2] Sonolar, SMT-COMP2010. http://www.smtcomp.org/2010.
[3] ADAMS, W. W., AND LOUSTAUNAU, P. An Introduction to Gröbner Bases.
American Mathematical Society, 1994.
[4] AVRUNIN, G. Symbolic Model Checking using Algebraic Geometry. In Com-
puter Aided Verification Conference (1996), pp. 26–37.
[5] BAHAR, I., FROHM, E. A., GAONA, C. M., HACHTEL, G. D., MACII, E.,
PARDO, A., AND SOMENZI, F. Algebraic Decision Diagrams and their Applications.
In Proceedings of the IEEE/ACM International Conference on Computer-Aided
Design (Nov. 1993), pp. 188–191.
[6] BARRETT, C., AND TINELLI, C. CVC3. In Computer Aided Verification
Conference (July 2007), Springer, pp. 298–302.
[7] BARRETT, P. Implementing the Rivest Shamir and Adleman Public Key
Encryption Algorithm on a Standard Digital Signal Processor. In Proceedings of
Advances In Cryptology (London, UK, 1987), Springer-Verlag, pp. 311–323.
[8] BIERE, A. Picosat Essentials. Journal on Satisfiability, Boolean Modeling and
Computation (JSAT) 4 (2008), 75–97.
[9] BIERE, A. SAT 2009 Competition.
[10] BIHAM, E., CARMELI, Y., AND SHAMIR, A. Bug Attacks. In Proceedings on
Advances in Cryptology (2008), pp. 221–240.
[11] BRAYTON, R., AND MISHCHENKO, A. ABC: An Academic Industrial-Strength
Verification Tool. In Computer Aided Verification (2010), vol. 6174, Springer,
pp. 24–40.
[12] BRAYTON, R. K., HACHTEL, G. D., SANGIOVANNI-VINCENTELLI, A.,
SOMENZI, F., AZIZ, A., CHENG, S.-T., EDWARDS, S., KHATRI, S., KUKIMOTO,
Y., PARDO, A., QADEER, S., RANJAN, R., SARWARY, S., SHIPLE,
T., SWAMY, G., AND VILLA, T. VIS: A System for Verification and Synthesis.
In Computer Aided Verification (1996).
[13] BRUMMAYER, R., AND BIERE, A. Boolector: An Efficient SMT Solver for
Bit-Vectors and Arrays. In TACAS 09, Volume 5505 of LNCS (2009), Springer.
[14] BRUTTOMESSO, R., CIMATTI, A., FRANZEN, A., GRIGGIO, A., AND SE-
BASTIANI, R. The MathSAT 4 SMT Solver. In Computer Aided Verification
Conference (2008), vol. 5123, Springer.
[15] BRYANT, R. E. Graph Based Algorithms for Boolean Function Manipulation.
IEEE Transactions on Computers C-35 (August 1986), 677–691.
[16] BRYANT, R. E., AND CHEN, Y.-A. Verification of Arithmetic Functions with
Binary Moment Diagrams. In Proceedings of Design Automation Conference
(1995), pp. 535–541.
[17] BUCHBERGER, B. Ein Algorithmus zum Auffinden der Basiselemente des Restk-
lassenringes nach einem Nulldimensionalen Polynomideal. PhD thesis, Univer-
sity of Innsbruck, 1965.
[18] BUCHBERGER, B. A Criterion for Detecting Unnecessary Reductions in the
Construction of a Groebner Bases. In EUROSAM (1979).
[19] CIESIELSKI, M., KALLA, P., ZHENG, Z., AND ROUZYERE, B. Taylor Expan-
sion Diagrams: A New Representation For RTL Verification. In IEEE Interna-
tional High Level Design Validation and Test Workshop (Nov. 2001), pp. 70–75.
[20] CIESIELSKI, M., KALLA, P., ZHENG, Z., AND ROUZYERE, B. Taylor Ex-
pansion Diagrams: A Compact Canonical Representation with Applications to
Symbolic Verification. In IEEE Design, Automation and Test in Europe (2002),
pp. 285–289.
[21] CLARKE, E., GRUMBERG, O., AND PELED, D. Model Checking.
The MIT Press, 1999.
[22] CLARKE, E. M., FUJITA, M., AND ZHAO, X. Hybrid Decision Diagrams - Over-
coming the Limitation of MTBDDs and BMDs. In Proceedings of the IEEE/ACM
International Conference on Computer-Aided Design (1995), pp. 159–163.
[23] CLEGG, M., EDMONDS, J., AND IMPAGLIAZZO, R. Using the Gröbner Basis
Algorithm to Find Proofs of Unsatisfiability. In ACM Symposium on Theory of
Computing (1996), pp. 174–183.
[24] CONDRAT, C., AND KALLA, P. A Gröbner Basis Approach to CNF Formulae
Preprocessing. In International Conference on Tools and Algorithms for the
Construction and Analysis of Systems (2007), pp. 618–631.
[25] COX, D., LITTLE, J., AND O’SHEA, D. Ideals, Varieties, and Algorithms: An
Introduction to Computational Algebraic Geometry and Commutative Algebra.
Springer, 2007.
[26] DAVIS, M., LOGEMANN, G., AND LOVELAND, D. A Machine Program for
Theorem Proving. In Communications of the ACM (1962), vol. 5, pp. 394–397.
[27] DAVIS, M., AND PUTNAM, H. A Computing Procedure for Quantification
Theory. Journal of the ACM 7 (1960), 201–215.
[28] DECKER, W., GREUEL, G.-M., PFISTER, G., AND SCHÖNEMANN, H.
Singular 3-1-3: A Computer Algebra System for Polynomial Computations.
http://www.singular.uni-kl.de.
[29] DRECHSLER, R., SARABI, A., THEOBALD, M., BECKER, B., AND
PERKOWSKI, M. Efficient Representation and Manipulation of Switching Func-
tions based on Ordered Kronecker Functional Decision Diagrams. In Design
Automation Conference (1994), pp. 415–419.
[30] DRECHSLER, R., BECKER, B., AND RUPPERTZ, S. The K*BMD: A Verification
Data Structure. IEEE Design & Test of Computers 14, 2 (1997), 51–59.
[31] DUTERTRE, B., AND MOURA, L. The Yices SMT Solver. Tech. rep., 2006.
[32] EÉN, N., AND SÖRENSSON, N. An Extensible SAT-Solver. Theory And
Applications of Satisfiability Testing 2919 (2004), 333–336.
[33] EMERSON, E. A. Temporal and Modal Logic. In Formal Models and Semantics,
vol. B of Handbook of Theoretical Computer Science. Elsevier Science, 1990,
pp. 996–1072.
[34] FAUGÈRE, J. C. A New Efficient Algorithm for Computing Gröbner Bases (F4).
Journal of Pure and Applied Algebra 139 (June 1999), 61–88.
[35] FENG, Z., ZENG, Z., AND LI, P. Parallel On-Chip Power Distribution Network
Analysis on Multicore GPU Platforms. IEEE Transactions VLSI (2011).
[36] GAO, S. Counting Zeros over Finite Fields with Gröbner Bases. Master’s thesis,
Carnegie Mellon University, 2009.
[37] GUPTA, A. Formal Hardware Verification Methods: A Survey. Formal Methods
in System Design 1 (1992), 151–238.
[38] HANKERSON, D., HERNANDEZ, J., AND MENEZES, A. Software Implementa-
tion of Elliptic Curve Cryptography over Binary Fields, 2000.
[39] HILBERT, D. Über die Theorie der Algebraischen Formen. Math. Annalen 36
(1890), 473–534.
[40] HOLZMANN, G. J. The SPIN Model Checker: Primer and Reference Manual,
First ed. Addison-Wesley Professional, September 2003.
[41] HORETH, S., AND DRECHSLER, R. Formal Verification of Word-Level
Specifications. In IEEE Design, Automation and Test in Europe (1999), pp. 52–58.
[42] JABIR, A., AND PRADHAN, D. MODD: A New Decision Diagram and Representation
for Multiple Output Binary Functions. In IEEE Design, Automation and Test in
Europe (2004).
[43] JHA, S., LIMAYE, R., AND SESHIA, S. Beaver: Engineering An Efficient SMT
Solver for Bit-Vector Arithmetic. In Computer Aided Verification Conference
(2009), pp. 668–674.
[44] KALLA, P. An Infrastructure for RTL Validation and Verification. PhD thesis,
University of Massachusetts Amherst, 2002.
[45] KALLA, P., CIESIELSKI, M., AND BOUTILLON, E. High-Level Design Verifica-
tion using Taylor Expansion Diagrams: First Results. In IEEE International High
Level Design Validation and Test Workshop (2002), pp. 13–17.
[46] KNEŽEVIĆ, M., SAKIYAMA, K., FAN, J., AND VERBAUWHEDE, I. Modular
Reduction in GF(2^n) Without Pre-Computational Phase. In Proceedings of the
International Workshop on Arithmetic of Finite Fields (2008), pp. 77–87.
[47] KOBAYASHI, K. Studies on Hardware Assisted Implementation of Arithmetic
Operations in Galois Field. PhD thesis, Nagoya University, Japan, 2009.
[48] KOC, C., AND ACAR, T. Montgomery Multiplication in GF(2^k). Designs, Codes
and Cryptography 14, 1 (Apr. 1998), 57–69.
[49] KUEHLMANN, A., PARUTHI, V., KROHM, F., AND GANAI, M. K. Robust
Boolean Reasoning for Equivalence Checking and Functional Property Verifica-
tion. IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems 21, 12 (Nov. 2006), 1377–1394.
[50] LEE, Y., SAKIYAMA, K., BATINA, L., AND VERBAUWHEDE, I. Elliptic-Curve-
Based Security Processor for RFID. IEEE Transactions on Computers 57, 11
(Nov. 2008), 1514–1527.
[51] LIDL, R., AND NIEDERREITER, H. Finite Fields. Cambridge University Press,
1997.
[52] LOPEZ, J., AND DAHAB, R. Improved Algorithms for Elliptic
Curve Arithmetic in GF(2^n). In Proceedings of the Selected Areas in Cryptography
(London, UK, 1998), Springer-Verlag, pp. 201–212.
[53] LU, F., WANG, L., CHENG, K., AND HUANG, R. A Circuit SAT Solver With
Signal Correlation Guided Learning. In IEEE Design, Automation and Test in
Europe (2003), pp. 892–897.
[54] LU, F., WANG, L., CHENG, K., MOONDANOS, J., AND HANNA, Z. A Signal
Correlation Guided ATPG Solver And Its Applications For Solving Difficult
Industrial Cases. In Design Automation Conference (2003), pp. 436–441.




[56] LV, J., KALLA, P., AND ENESCU, F. Verification of Composite Galois Field
Multipliers over GF((2^m)^n) using Computer Algebra Techniques. In IEEE High-Level
Design Validation and Test Workshop (2011), pp. 136–143.
[57] LV, J., KALLA, P., AND ENESCU, F. Efficient Groebner Basis Reductions for
Formal Verification of Galois Field Multipliers. In IEEE Design, Automation and
Test in Europe (2012).
[58] LV, J., KALLA, P., AND ENESCU, F. Formal Verification of Galois Field
Multipliers using Computer Algebra. In 25th IEEE International Conference on
VLSI Design (2012).
[59] LV, J., KALLA, P., AND ENESCU, F. Scalable Equivalence Checking of Finite
Field Arithmetic Circuits using Gröbner Bases and F-4 Style Reduction. In IEEE
Design Automation and Test in Europe, in submission (2013).
[60] MANNA, Z., AND PNUELI, A. The Temporal Logic of Reactive and Concurrent
Systems, First ed. Springer-Verlag, 1991.
[61] MASTROVITO, E. VLSI Designs for Multiplication Over Finite Fields GF(2^m).
Lecture Notes in Computer Science 357 (1989), 297–309.
[62] MCELIECE, R. J. Finite Fields for Computer Scientists and Engineers. Kluwer
Academic Publishers, 1987.
[63] MCMILLAN, K. L. Symbolic Model Checking. Kluwer Academic Publishers,
1993.
[64] MILLER, V. Use of Elliptic Curves in Cryptography. In Lecture Notes in
Computer Science (New York, NY, USA, 1986), Springer-Verlag New York, Inc.,
pp. 417–426.
[65] MONTGOMERY, P. Modular Multiplication Without Trial Division. Mathematics
of Computation 44, 170 (Apr. 1985), 519–521.
[66] MORIOKA, S., AND KATAYAMA, Y. Design Methodology for A One-Shot Reed-
Solomon Encoder and Decoder. In IEEE International Conference on Computer
Design (1999), pp. 60–67.
[67] MORIOKA, S., KATAYAMA, Y., AND YAMANE, T. Towards Efficient Verification
of Arithmetic Algorithms Over Galois Fields GF(2^m). Computer Aided
Verification Conference 2102 (2001), 465–477.
[68] MOURA, L., AND BJØRNER, N. Z3: An Efficient SMT Solver. In International
Conference on Tools and Algorithms for the Construction and Analysis of Systems
(2008), vol. 4963, Springer.
[69] MUKHOPADHYAY, D., SENGAR, G., AND CHOWDHURY, D. Hierarchical
Verification of Galois Field Circuits. IEEE Transactions on CAD (2007).
[70] NARAYAN, A., JAIN, J., FUJITA, M., AND SANGIOVANNI-VINCENTELLI, A.
Partitioned ROBDDs: A Compact Canonical and Efficient Representation for
Boolean Functions. In Proceedings of the IEEE/ACM International Conference
on Computer-Aided Design (1996), pp. 547–554.
[71] PAAR, C. Efficient VLSI Architecture for Bit-Parallel Computation in Galois
Fields. PhD thesis, University of Essen, Germany, 1994.
[72] PAAR, C. A New Architecture for A Parallel Finite Field Multiplier with Low
Complexity Based on Composite Fields. IEEE Transactions on Computers 45, 7
(July 1996), 856–861.
[73] PAVLENKO, E., WEDLER, M., STOFFEL, D., KUNZ, W., DREYER, A.,
SEELISCH, F., AND GREUEL, G.-M. STABLE: A New QF-BV SMT Solver for Hard
Verification Problems Combining Boolean Reasoning with Computer Algebra. In
IEEE Design, Automation and Test in Europe Conference (2011), pp. 155–160.
[74] RAJAPRABHU, T. L., SINGH, A. K., JABIR, A. M., AND PRADHAN, D. K.
MODD for CF: A Compact Representation for Multiple Output Function. In IEEE
International High Level Design Validation and Test Workshop (2004).
[75] ROMAN, S. Field Theory. Springer, 2006.
[76] SHEKHAR, N., KALLA, P., AND ENESCU, F. Equivalence Verification of
Polynomial Datapaths using Ideal Membership Testing. IEEE Transactions on
CAD (July 2007), 1320–1330.
[77] SHEKHAR, N., KALLA, P., MEREDITH, M. B., AND ENESCU, F. Simulation
Bounds for Equivalence Verification of Arithmetic Datapaths with Finite Word-
Length Operands. In Formal Methods in Computer Aided Design (November
2006), pp. 179–186.
[78] SHEKHAR, N., KALLA, P., MEREDITH, M. B., AND ENESCU, F. Simulation
Bounds for Equivalence Verification of Polynomial Datapaths using Finite Ring
Algebra. IEEE Transactions on VLSI 16, 4 (2008), 376–387.
[79] SILVA, J., AND SAKALLAH, K. GRASP: A New Search Algorithm for Satis-
fiability. In Proceedings of IEEE/ACM International Conference on Computer-
Aided Design (1996), IEEE Computer Society, pp. 220–227.
[80] SMITH, J., AND DEMICHELI, G. Polynomial Methods for Component Matching
and Verification. In Proceedings of the IEEE/ACM International Conference on
Computer-Aided Design (1998).
[81] SMITH, J., AND DEMICHELI, G. Polynomial Methods for Allocating Complex
Components. In IEEE Design, Automation and Test in Europe (1999).
[82] SOMENZI, F. CUDD: CU Decision Diagram Package Release, 1998.
[83] SOOS, M. CryptoMiniSat - a SAT Solver for Cryptographic Problems.
http://www.msoos.org/cryptominisat2/, 2009.
[84] ST MICROELECTRONICS. ST23YLxx series Microcontroller for Smart Cards.
[85] STOFFEL, D., AND KUNZ, W. Verification of Integer Multipliers On the Arith-
metic Bit Level. In Proceedings of the IEEE/ACM International Conference on
Computer-Aided Design (Piscataway, NJ, USA, 2001), IEEE Press, pp. 183–189.
[86] SUNAR, B., SAVAS, E., AND KOÇ, C. Constructing Composite Field
Representations for Efficient Conversion. IEEE Transactions on Computers 52, 11
(November 2003), 1391–1398.
[87] WATANABE, Y., ET AL. Application of Symbolic Computer Algebra to
Arithmetic Circuit Verification. In IEEE International Conference on Computer
Design (October 2007), pp. 25–32.
[88] WIENAND, O., WEDLER, M., STOFFEL, D., KUNZ, W., AND GREUEL, G.-M. An
Algebraic Approach to Proving Data Correctness in Arithmetic Datapaths. In
Computer Aided Verification Conference (2008), pp. 473–486.
[89] WU, H. Montgomery Multiplier and Squarer for a Class of Finite Fields. IEEE
Transactions On Computers 51, 5 (May 2002).
