Abstract: This paper addresses the problem of equivalence verification of RTL descriptions that implement arithmetic computations (add, mult, shift) over bit-vectors that have differing bit-widths. Such designs are found in many DSP applications where the widths of input and output bit-vectors are dictated by the desired precision. A bit-vector of size $n$ can represent integer values from $0$ to $2^n - 1$; i.e. integers reduced modulo $2^n$. Therefore, to verify bit-vector arithmetic over multiple word-length operands, we model the RTL datapath as a polynomial function from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. Subsequently, RTL equivalence $f \equiv g$ is solved by proving whether $(f - g) \equiv 0$ over such mappings. Exploiting concepts from number theory and commutative algebra, a systematic, complete algorithmic procedure is derived for this purpose. Experimentally, we demonstrate how this approach can be applied within a practical CAD setting. Using our approach, we verify a set of arithmetic datapaths at RTL where contemporary approaches prove to be infeasible.
I. Introduction
Many practical Digital Signal Processing (DSP) applications implement integer arithmetic operations, such as add, mult, shift, etc., over multiple bit-vector variables. Examples of such designs abound in DSP for audio, video and multimedia applications. High-level or register-transfer-level (RTL) descriptions of such systems can be modeled as multi-variate polynomials of finite degree, for design, synthesis [1] and verification purposes [2]. For efficient and correct modeling of such systems, it is important to account for the effect of the bit-vector size of the operands on the resulting computation. For example, the largest (unsigned) integer value that a bit-vector of size $m$ can represent is $2^m - 1$, implying that the bit-vector represents integer values reduced modulo $2^m$ (written $\% 2^m$). This suggests that bit-vector arithmetic can be efficiently modeled as algebra over finite integer rings, where the bit-vector size dictates the cardinality of the ring.
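The modulo view can be illustrated with a minimal Python sketch. The helper names (`bv`, `bv_add`, `bv_mul`) are hypothetical and do not come from the paper; the sketch only assumes unsigned operands whose results are truncated to $m$ bits.

```python
def bv(value: int, m: int) -> int:
    """Truncate an integer to an m-bit unsigned bit-vector (hypothetical helper)."""
    return value & ((1 << m) - 1)


def bv_add(a: int, b: int, m: int) -> int:
    """m-bit addition: the carry out of bit m-1 is discarded."""
    return bv(a + b, m)


def bv_mul(a: int, b: int, m: int) -> int:
    """m-bit multiplication: only the low m bits of the product are kept."""
    return bv(a * b, m)


if __name__ == "__main__":
    m = 8
    for a, b in [(200, 100), (255, 255), (16, 16)]:
        # Truncation to m bits coincides with reduction modulo 2**m.
        assert bv_add(a, b, m) == (a + b) % 2**m
        assert bv_mul(a, b, m) == (a * b) % 2**m
    print(bv_add(200, 100, m), bv_mul(200, 100, m))
```

Masking with `2**m - 1` and reducing modulo `2**m` give the same result for unsigned values, which is why an $m$-bit datapath can be treated as arithmetic in $Z_{2^m}$.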
In many DSP applications, the computations are generally performed over operands that have multiple word-lengths; i.e., input and output bit-vectors may have differing bit-widths. For instance, a digital audio-video mixer may perform polynomial arithmetic over a 20-bit audio and a 32-bit video signal [3]. To analyze these designs efficiently, it is therefore required to derive efficient computational procedures to model and manipulate multiple operand bit-vector arithmetic.
* Sponsored by NSF grants CCF-0514966 & CCF-515010.
This paper addresses the problem of equivalence verification of arithmetic datapath computations over bit-vectors where the input and output operands may have different bit-widths. The problem is addressed at the level of behavioural/RTL descriptions. The following sub-section motivates the verification problem and describes our approach to it.
A. Motivating the Verification Problem
Let us motivate the equivalence verification problem as it appears in the context of our work. Initial high-level (say, MATLAB) specifications of digital signal processing applications are usually in floating-point. Such designs can be converted to fixed-point models [4] and subsequently translated into RTL. Automatic translation utilities are available for this purpose [5]. The verification problem instance is that of checking the equivalence of the fixed-point design against the translated (and optimized) RTL models.
Consider the computation performed by a digital image rejection/separation unit that takes as input two signals: a 12-bit vector A[11:0] and an 8-bit vector B[7:0]. These signals are outputs of a mixer wherein one signal emphasizes the image signal and the other emphasizes the desired signal. The design produces a 16-bit output $Y_1$. The computation performed by the design is described in RTL as shown in Eqn. 1. Note that because of the specified bit-vector sizes, the computation can be equivalently implemented as another polynomial $Y_2$, as shown in Eqn. 2.
B. Problem Modeling and Approach
We model the multiple word-length bit-vector computations as follows. Let $x_1, x_2, \ldots, x_d$ denote the $d$ variables (bit-vectors) in the design. Let $n_1, n_2, \ldots, n_d$ denote the sizes of the corresponding bit-vectors. Therefore, $x_1 \in Z_{2^{n_1}}, x_2 \in Z_{2^{n_2}}, \ldots, x_d \in Z_{2^{n_d}}$. Note that $Z_{2^n}$ corresponds to the finite set of integers $\{0, 1, \ldots, 2^n - 1\}$. Let $m$ correspond to the size of the output bit-vector $f$; hence, $f \in Z_{2^m}$. Subsequently, we model the arithmetic datapath computation as a multi-variate polynomial from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$ [6]. Here $Z_a \times Z_b$ represents the Cartesian product of $Z_a$ and $Z_b$. The equivalence problem then corresponds to checking the congruence of two polynomials: $f \equiv g \ \% \ 2^m$. If two such polynomials $f, g$ are indeed computationally equivalent, then they correspond to the same underlying polynomial function (or polyfunction). Unfortunately, checking for the equality of such polyfunctions is an NP-hard problem [7]. Our approach transforms the equivalence problem $f \equiv g \ \% \ 2^m$ into one of proving $(f - g) \ \% \ 2^m \equiv 0$, known as the zero-equivalence problem [7]. In other words, we test whether or not $(f - g) \ \% \ 2^m$ corresponds to a nil polyfunction. For the example shown above, we can compute
16 :
Note that
Chen [6] has analyzed properties of such (nil) polyfunctions from a number-theoretic and commutative algebra perspective. We exploit some results from [6] and derive a systematic, algorithmic procedure to test for vanishing polyfunctions. Moreover, we demonstrate the applicability of our procedure within a practical CAD setting, by verifying a set of polynomial datapath computations over bit-vectors of disparate lengths for which contemporary techniques prove to be infeasible.
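To make the zero-equivalence formulation concrete, the following Python sketch tests it by exhaustive enumeration. This is not the procedure derived in this paper: it is exponential in the total input width and is only usable for very small bit-widths. The function names are hypothetical.

```python
from itertools import product


def is_nil_polyfunction(poly, widths, m):
    """Return True if poly(x_1, ..., x_d) % 2**m == 0 at every point of
    Z_{2^n1} x ... x Z_{2^nd}. Exhaustive, so only feasible for tiny widths."""
    domains = [range(2 ** n) for n in widths]
    return all(poly(*point) % (2 ** m) == 0 for point in product(*domains))


def equivalent(f, g, widths, m):
    """f and g are equivalent iff their difference is a nil polyfunction."""
    return is_nil_polyfunction(lambda *x: f(*x) - g(*x), widths, m)


if __name__ == "__main__":
    # 4x^2 and 4x agree as functions from Z_8 to Z_8, although they differ
    # as polynomials; 2x^2 and 2x do not agree (take x = 2).
    print(equivalent(lambda x: 4 * x * x, lambda x: 4 * x, [3], 3))
    print(equivalent(lambda x: 2 * x * x, lambda x: 2 * x, [3], 3))
```

The enumeration visits $2^{n_1 + \cdots + n_d}$ points, which motivates an algebraic test that avoids touching the input space at all.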
II. Previous Work
It is evident that contemporary graph-based canonical representations (BDDs [8], BMDs [9], K*BMDs [10] and their derivatives) are ill-suited for our application. This is mostly because these are based on variants of binary (bit-level) decomposition principles and, as such, they do not have the power of abstraction to model high-degree polynomial computations over wide bit-vectors. Recent work on Galois field decomposition of Boolean functions (MODD [11]) and other arithmetic transforms of Boolean functions [12] are also unable to scale w.r.t. the design size corresponding to our applications. TEDs [2] have been proposed as canonical DAG representations for multi-variate polynomials. However, TEDs do not model modulo-arithmetic. While they can prove polynomial equivalence over the integral domain ($Z$), they cannot canonically model polyfunctions over finite integer rings.
Modulo arithmetic concepts have been studied in the context of RTL verification for bit-vector arithmetic [13] [14], word-level ATPG [15] and MILP-based simulation vector generation [16]. However, these are mostly geared toward solving linear congruences under modulo arithmetic, a different application from proving equivalence of polynomial functions. The recent work of [17] uses an arithmetic bit-level normalization technique to simplify subsequent SAT instances for bounded model checking of arithmetic circuits. Techniques such as theorem proving (HOL), term-rewriting [18] etc., have been used for datapath verification. However, they have generally been successful when datapath size can be abstracted away, say using data-dependence, symmetry and other abstractions [19] [20].
Contemporary Symbolic Computer Algebra tools do provide algorithmic solutions to polynomial equivalence over a variety of rings. However, these solutions are available for fields ($R$, $Q$, $C$), prime rings $Z_p$, integral and Euclidean domains, collectively called the unique factorization domains (UFDs). Within UFDs, computer algebra systems solve the equivalence checking problem by uniquely factorizing an expression into irreducible terms and comparing the coefficients of the factored terms ordered lexicographically. However, in the case of our application, the finite integer rings of residue classes $Z_{2^m}$ (formed by $m$-bit vectors) correspond to non-UFDs, due to the presence of zero divisors (e.g., $4 \neq 0$, $2 \neq 0$, but $4 \cdot 2 = 0$ in $Z_8$). In non-UFDs, a polynomial cannot, in general, be uniquely factorized into irreducible terms. Along the same lines, techniques using the concepts of Gröbner bases [21] [1] find extensive application in UFDs. However, for the above reasons, they cannot be directly ported to solve the above problem in non-UFDs of the type $Z_{2^m}$. We have analyzed a large number of symbolic algebra packages (NTL, ZEN, Maple, Mathematica, CoCoA, Singular, Macaulay, Pari, Macsyma, among others [22]). To the best of our knowledge, none of the available packages provide a "ready-made" procedure that can solve the desired polyfunction equivalence.
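The effect of zero divisors can be seen in a few lines of Python. The sketch below is only an illustration with hypothetical helper names: it lists the zero divisors of $Z_8$ and shows one polynomial, $x^2 - 1$, with two different factorizations over $Z_8$. Polynomials are stored as coefficient lists, lowest degree first.

```python
N = 8  # the ring Z_8, i.e. 3-bit vectors


def zero_divisors(n):
    """Non-zero a in Z_n for which some non-zero b gives a*b = 0 (mod n)."""
    return sorted({a for a in range(1, n) for b in range(1, n) if (a * b) % n == 0})


def poly_mul_mod(p, q, n):
    """Multiply two coefficient lists (lowest degree first) modulo n."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    return out


if __name__ == "__main__":
    print(zero_divisors(N))
    # (x - 1)(x + 1) and (x - 3)(x + 3) both reduce to x^2 - 1 modulo 8,
    # because 9 and 1 are congruent modulo 8.
    print(poly_mul_mod([-1, 1], [1, 1], N))
    print(poly_mul_mod([-3, 1], [3, 1], N))
```

Since both products yield the same coefficient list, comparing factored forms cannot serve as a canonical equivalence test in $Z_8$.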
The works that come closest to ours are those of [23] and [24] . In [23] , the zero-equivalence problem is solved for univariate datapaths (those with just one bit-vector variable). While [24] extends the results of [23] to multivariate bit-vector computations, the focus is still restricted to fixed-length datapaths. Both techniques lack the mathematical wherewithal to model polynomial computations over bit-vectors of unequal word-lengths.
III. Preliminaries
In what follows, $Z$ corresponds to the set of integers, $Z^+$ to the set of non-negative integers and $Z_n$ to the finite set of integers $\{0, 1, \ldots, n-1\}$.
In the context of our work, $n_1, n_2, \ldots, n_d$ correspond to the bit-vector sizes of the input variables $x_1, x_2, \ldots, x_d$ and $m$ represents the output bit-vector size. Subsequently, we represent the RTL computations as polyfunctions from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. Chen [6] defines the corresponding polyfunction as follows:
3 for x 1 = 0, 1 and
It is possible for a polynomial with non-zero coefficients to vanish on such mappings; in that case the polynomial represents a nil polyfunction, and such polynomials are often called vanishing polynomials.
The following Section describes the concepts that can be used to identify such polynomials. In the sequel, polynomial addition and multiplication are performed $\% n$ ($n = 2^m$) according to the rules below:
Also, we use the following multi-index notation: $k = \langle k_1, k_2, \ldots, k_d \rangle$ are the (non-negative) degrees corresponding to the $d$ input variables $x = \langle x_1, x_2, \ldots, x_d \rangle$, respectively.
IV. Theory
We begin with the analysis of univariate polynomials that vanish on $Z_{2^m}[x]$ (for didactic purposes) and then extend the results to vanishing polynomials from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$.
According to a fundamental result in number theory, for any $n \in N$, $n!$ divides the product of $n$ consecutive numbers. For example, $4!$ divides $4 \times 3 \times 2 \times 1$. But this is also true of any $n$ consecutive numbers: $4!$ also divides $99 \times 100 \times 101 \times 102$. Consequently, it is possible to find the least $k \in N$ such that $n$ divides $k!$ (denoted $n | k!$). This value $k$ corresponds to the Smarandache function, $SF(n)$ [25]. In the ring $Z_{2^m}$, let $SF(2^m) = k$, such that $2^m | k!$. As an example, $SF(2^3) = 4$ as 8 divides $4!$ but 8 does not divide $3!$; hence, the least $k = 4$.
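The definition of $SF(n)$ translates directly into a short loop. The Python sketch below is a naive search written for illustration; it is not the $O(n/\log n)$ procedure of [26], and the function name is an assumption.

```python
def smarandache(n: int) -> int:
    """Least k >= 1 such that n divides k! (naive search, for illustration)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k, fact_mod_n = 1, 1
    # Track k! modulo n only; n divides k! exactly when this residue is 0.
    while fact_mod_n % n != 0:
        k += 1
        fact_mod_n = (fact_mod_n * k) % n
    return k


if __name__ == "__main__":
    for m in (2, 3, 4, 16):
        print(m, smarandache(2 ** m))
```

For $2^3$ the loop stops at $k = 4$, matching the example above, since $3! = 6$ is not divisible by 8 while $4! = 24$ is.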
This property can be utilized to treat the equivalence problem as a divisibility issue in $Z_{2^m}$. When two polynomials $F(x)$ and $G(x)$ are equivalent in $Z_{2^m}$ (i.e. $F(x) \equiv G(x) \ \% \ 2^m$), then $2^m | (F(x) - G(x))$. For instance, in $Z_{2^3}$, let $8 | (F(x) - G(x))$. But, $8 | 4!$ too. Therefore, if for all $x$, $(F - G)$ evaluated at $x$ is a product of 4 consecutive numbers, then $(F - G)$ vanishes in $Z_{2^3}$. So, what is a natural example of such a polynomial? The answer is: $(x)(x - 1)(x - 2)(x - 3)$. Such a product expression is referred to as a falling factorial and is formally defined below.
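Definition IV.1: Falling factorials of degree $k \in Z$ are defined according to: $Y_0(x) = 1$; $Y_1(x) = x$; $Y_2(x) = x(x - 1)$; $\ldots$; $Y_k(x) = (x - k + 1) \cdot Y_{k-1}(x)$.
The above concept of falling factorials can be similarly defined for multi-variate expressions over $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$: $Y_k(x) = \prod_{i=1}^{d} Y_{k_i}(x_i)$.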
Extending the above concept, if a multivariate polynomial in $Z_{2^m}[x_1, \ldots, x_d]$ can be factorized into a product of $SF(2^m)$ consecutive numbers in at least one of the variables $x_i$, then it vanishes $\% 2^m$. The following example illustrates this idea.
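The vanishing property of falling factorials can be checked numerically. The Python sketch below uses hypothetical function names and the value $SF(2^3) = 4$ quoted earlier; it evaluates $Y_k$ at every point of a small ring and also checks the product form $Y_{\langle 4,1 \rangle}(x_1, x_2) = Y_4(x_1) \cdot Y_1(x_2)$ over $Z_4 \times Z_4$.

```python
from itertools import product


def falling_factorial(x: int, k: int) -> int:
    """Y_k(x) = x (x-1) ... (x-k+1), with Y_0(x) = 1."""
    result = 1
    for i in range(k):
        result *= x - i
    return result


def multi_falling_factorial(xs, ks) -> int:
    """Multi-variate falling factorial: product of Y_{k_i}(x_i) over all i."""
    result = 1
    for x, k in zip(xs, ks):
        result *= falling_factorial(x, k)
    return result


if __name__ == "__main__":
    # Y_4 vanishes on Z_8 because SF(2**3) = 4; Y_3 does not (take x = 3).
    print(all(falling_factorial(x, 4) % 8 == 0 for x in range(8)))
    print(all(falling_factorial(x, 3) % 8 == 0 for x in range(8)))
    # Y_<4,1> vanishes modulo 4 on Z_4 x Z_4.
    print(all(multi_falling_factorial(p, (4, 1)) % 4 == 0
              for p in product(range(4), repeat=2)))
```

A single variable reaching degree $SF(2^m)$ is enough: the other factors only multiply a value that is already divisible by $2^m$.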
Example IV.2: Consider the polynomial $F(x_1, x_2) = x_1^4 x_2 + 2x_1^3 x_2 + 3x_1^2 x_2 + 2x_1 x_2$ over $Z_{2^2}$. Here, $SF(2^2) = 4$ and the highest degrees of $x_1$ and $x_2$ are $k_1 = 4$ and $k_2 = 1$, respectively. Note that $F \ \% \ 4$ can be equivalently written as $F = Y_{\langle 4,1 \rangle}(x_1, x_2) \ \% \ 4 = Y_4(x_1) \cdot Y_1(x_2) \ \% \ 4$. Since $F \ \% \ 4$ can be represented as a product of 4 consecutive numbers in $x_1$, $2^2 | F$ and $F \equiv 0$.
In the above example, both the input variables $x_1, x_2$, as well as the output $F$, are in $Z_{2^2}$. We wish to extend the above concepts to analyze polynomials from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. For this purpose, we define another quantity [6]: $\mu_i = \min\{2^{n_i}, SF(2^m)\}$, for $i = 1, \ldots, d$.
Now consider the following results [6] :
Example IV.3: Let $f: Z_{2^1} \times Z_{2^2} \to Z_{2^3}$ and its corresponding polynomial be $F = x_1^2 x_2 - x_1 x_2$. Here, $SF(2^3) = 4$, $k_1 = 2$ and $k_2 = 1$. Note that $\mu_1 = \min\{2^1, 4\} = 2 = k_1$ (the condition in Lemma IV.1 is satisfied) and $\mu_2 = \min\{2^2, 4\} = 4 > k_2$, and $F$ can be written as: $F = x_1(x_1 - 1) \cdot x_2 = Y_2(x_1) \cdot Y_1(x_2) \equiv 0$.
When a polynomial cannot be factored into such $Y_k$ expressions, can it still vanish? Consider the quadratic polynomial $4x^2 - 4x$ in $Z_8$. It can be written as $4(x)(x - 1)$. While $4x^2 - 4x$ cannot be factorized as $(x)(x - 1)(x - 2)(x - 3)$, it still vanishes in $Z_8$. The missing factors, $(x - 2)(x - 3)$ in this case, are compensated for by the multiplicative constant 4; therefore, $4x^2 - 4x \equiv 0 \ \% \ 8$. We now need to identify the constraints on such multiplicative constants such that the given polynomial would vanish. We state the following result [6]:
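Lemma IV.2: The expression $c_k \cdot Y_k \equiv 0$ if and only if $\frac{2^m}{(2^m, \prod_{i=1}^{d} k_i!)} \ | \ c_k$; where $(a, b)$ denotes the greatest common divisor of $a$ and $b$.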
We can use Lemma IV.2 to prove that $f$ is a nil polyfunction. Here, $2^{n_1} = 2$, $2^{n_2} = 4$ and $2^m = 8$. $k = \langle k_1, k_2 \rangle = \langle 1, 2 \rangle$ corresponds to the highest degrees of $x_1, x_2$. Moreover,
The above results can be extended to derive necessary and sufficient conditions for a polynomial to vanish as a function from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. We state the following theorem [6]:
Theorem 1: Let $F$ be a polynomial representation for the function $f$ from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. Then, $F$ is a vanishing polynomial ($F \equiv 0$) if and only if it can be represented as:
$F = Q_\mu Y_\mu + \sum_k a_k b_k Y_k$
where:
• $Q_\mu$ is an arbitrary polynomial;
• $Y_k$ is the falling factorial defined in Eqn. 7;
• $a_k \in Z$ is an arbitrary integer and $b_k = \frac{2^m}{(2^m, \prod_{i=1}^{d} k_i!)}$.
Proof: The proof follows straightforwardly from Lemma IV.1 (for the computation $Q_\mu Y_\mu$) and from Lemma IV.2 (for the computation $\sum_k a_k b_k Y_k$).
The following example illustrates the above concept. Example IV.5: Consider a polynomial F = x
. F can be written as follows:
Here, $a_{\langle 1,2 \rangle} = 1$ and $b_{\langle 1,2 \rangle} = 8/(8, 1! \cdot 2!) = 4$. $F$ can be written in the form given by Theorem 1, and is thus a vanishing polynomial.
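The coefficient constraint can be cross-checked by brute force. The Python sketch below (hypothetical function names) computes $b_k = 2^m / (2^m, \prod_i k_i!)$, reading $(a, b)$ as the greatest common divisor, and confirms by enumeration that $c \cdot Y_{\langle 1,2 \rangle}$ vanishes from $Z_2 \times Z_4$ to $Z_8$ exactly when $c$ is a multiple of $b_{\langle 1,2 \rangle} = 4$.

```python
from itertools import product
from math import factorial, gcd, prod


def b_coeff(k, m):
    """b_k = 2^m / gcd(2^m, k_1! * ... * k_d!)."""
    n = 2 ** m
    return n // gcd(n, prod(factorial(ki) for ki in k))


def falling_factorial(x, k):
    """Y_k(x) = x (x-1) ... (x-k+1), with Y_0(x) = 1."""
    result = 1
    for i in range(k):
        result *= x - i
    return result


def scaled_term_vanishes(c, k, widths, m):
    """Exhaustively check whether c * Y_k is a nil polyfunction from
    Z_{2^n1} x ... x Z_{2^nd} to Z_{2^m}."""
    modulus = 2 ** m
    domains = [range(2 ** n) for n in widths]
    for point in product(*domains):
        value = c * prod(falling_factorial(x, ki) for x, ki in zip(point, k))
        if value % modulus != 0:
            return False
    return True


if __name__ == "__main__":
    print(b_coeff((1, 2), 3))
    for c in range(8):
        print(c, scaled_term_vanishes(c, (1, 2), (1, 2), 3))
```

The same helper gives $b_{\langle 2 \rangle} = 4$ for the univariate case over $Z_8$, which is the constant that makes $4x^2 - 4x$ vanish.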
V. Algorithm: Zero Equivalence
Using the concepts presented in the previous section, we have derived a systematic algorithmic procedure that tests whether a given polynomial vanishes as a function from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. Algorithm 1 depicts the procedure.
        if (rem == 0) then /* rem = remainder */
            return true; /* poly = QµYµ; a vanishing polynomial */
        else
            poly = rem; break;
        end if
    end if
end for
/* Iterate over all possible degrees */
for j = d l=1
        if (rem == 0) then
            return true;
        else
            poly = rem;
        end if
    else
        return false;
    end if
end for
Algorithm 1: ALGORITHM ZERO EQ: Zero testing a given polynomial.
The algorithm takes as input the two polynomials $F_1$ and $F_2$ in variables $x_1, \ldots, x_d$ with corresponding input bit-widths $n_1, \ldots, n_d$, and output bit-width $m$. The output is true if $F_1 \equiv F_2$. The algorithm operates as follows:
1. Find the difference of the two polynomials, poly. This is the expression which should vanish to prove equivalence.
2. Compute the Smarandache function value for $2^m$; an $O(n/\log n)$ computational procedure, given in [26], has been implemented. Subsequently, the $SF(2^m)$ value is used to obtain the $\mu_i$ values.
• If the quotient can be written as $a_k \cdot b_k$ (where $b_k$ is defined according to Theorem 1), and the remainder is zero, return true. It is a vanishing polynomial.
• If the quotient can be written as $a_k \cdot b_k$, and the remainder is non-zero, continue to the next iteration.
• If the quotient cannot be written as $a_k \cdot b_k$, return false.
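The zero test can be sketched in Python under one stated substitution: instead of the repeated multivariate division used by Algorithm 1, the sketch rewrites the polynomial in the falling-factorial basis using Stirling numbers of the second kind, and then applies the two conditions of Theorem 1 coefficient by coefficient. All names are hypothetical, polynomials are dictionaries mapping exponent tuples to integer coefficients, and the Smarandache search is the naive one rather than the procedure of [26].

```python
from math import factorial, gcd, prod


def smarandache(n):
    """Least k >= 1 with n | k! (naive search)."""
    k, fact_mod_n = 1, 1
    while fact_mod_n % n != 0:
        k += 1
        fact_mod_n = (fact_mod_n * k) % n
    return k


def stirling2_row(n):
    """[S(n,0), ..., S(n,n)], so that x^n = sum_k S(n,k) * Y_k(x)."""
    row = [1]
    for i in range(1, n + 1):
        new = [0] * (i + 1)
        for k in range(1, i + 1):
            new[k] = k * (row[k] if k < i else 0) + row[k - 1]
        row = new
    return row


def to_falling_basis(poly):
    """Rewrite {exponents: coeff} as {degrees k: coeff of Y_k}."""
    out = {}
    for exps, coeff in poly.items():
        partial = {(): coeff}
        for e in exps:
            row = stirling2_row(e)
            nxt = {}
            for ks, c in partial.items():
                for k, s in enumerate(row):
                    if s:
                        nxt[ks + (k,)] = nxt.get(ks + (k,), 0) + c * s
            partial = nxt
        for ks, c in partial.items():
            out[ks] = out.get(ks, 0) + c
    return out


def zero_equivalent(poly, widths, m):
    """True iff poly vanishes from Z_{2^n1} x ... x Z_{2^nd} to Z_{2^m}."""
    n = 2 ** m
    sf = smarandache(n)
    mu = [min(2 ** w, sf) for w in widths]
    for ks, c in to_falling_basis(poly).items():
        if any(k >= u for k, u in zip(ks, mu)):
            continue  # Y_k alone vanishes: some k_i has reached mu_i
        b_k = n // gcd(n, prod(factorial(k) for k in ks))
        if c % b_k != 0:
            return False  # coefficient is not a multiple of b_k
    return True


if __name__ == "__main__":
    # Example IV.3: F = x1^2 x2 - x1 x2 from Z_2 x Z_4 to Z_8.
    print(zero_equivalent({(2, 1): 1, (1, 1): -1}, (1, 2), 3))
    # 4x^2 - 4x vanishes on Z_8, whereas 2x^2 - 2x does not.
    print(zero_equivalent({(2,): 4, (1,): -4}, (3,), 3))
    print(zero_equivalent({(2,): 2, (1,): -2}, (3,), 3))
```

To check $F_1 \equiv F_2$, the caller would pass the coefficient-wise difference of the two polynomials. The basis change is chosen here only because it fits in a short self-contained script; it decides the same condition as the division-based loop but is not the implementation evaluated in this paper.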
Complexity: In Algorithm 1, the number of multivariate divisions is bounded by $O(\prod_{i=1}^{d} \mu_i)$, where $\mu_i$ is as defined previously and $d$ is the total number of variables.
VI. Experimental Results
Algorithm 1 was implemented in Perl with calls to Maple 7 [27] for all the algebraic manipulations. Using our algorithm, we have been able to perform verification runs over a number of designs collected from a variety of benchmark suites. The results are presented in Table I.
The first example is the one from Sec. I, and represents the image rejection computation. The next two examples [1] are phase-shift keying and anti-aliasing functions, both used in digital communication. The polynomial filters [3] are Volterra models of polynomial signal processing applications. Horner polynomials [28] are commonly used in DSP, often implemented using multiply-add-accumulate (MAC) units. In [1], it was shown how computations by these MAC units can be extracted as polynomials in Horner's form. MIBench is an automotive application from [29]. The last example is a vanishing polynomial of degree 10, specifically created to validate our algorithm. The second column in the table describes the characteristics of these benchmarks: number of variables (var), highest degree of the polynomial (Deg), and input/output word-lengths ($n_i$, $m$).
Our experimental setup is as follows: High-level restructuring and symbolic algebra-based transformations (such as modulating and segmenting the coefficients, factorization and expansion, addition and removal of algebraic redundancy (vanishing polynomials), etc.) were applied to the original RTL descriptions to obtain symbolically different but functionally equivalent implementations. Subsequently, the data-flow graphs for the given RTL descriptions were extracted using Gaut [30]. Traversing the DFGs from the inputs to the outputs, the polynomial representations were constructed. The datapath sizes of both inputs and outputs ($n_1, \ldots, n_d$ and $m$) were also recorded. The algorithm was invoked to find the difference between the two polynomials and subsequently verify that it computes zero, to prove equivalence. We were able to solve all problems in under 25 seconds.
We have performed equivalence checking of the given RTL designs using BDD-, BMD- and SAT-based approaches. Since gate-level descriptions are required by both BDDs and SAT, we synthesized our designs using a commercially available logic synthesis tool. BDDs were used to verify the resulting netlists using the VIS [31] package. It was found that though BDDs could solve the problem for some of the smaller benchmarks (especially for univariate polynomials), they failed for the rest of the designs.
From the gate-level netlists corresponding to the two designs, we generated miter circuits and converted them to CNF format. ZChaff [32] was used to prove equivalence via unsatisfiability testing. For all the designs, ZChaff could not solve the problem within the time-limit of 1000s. We also attempted to construct the BMDs from the synthesized gate-level netlists corresponding to the original RTL descriptions. Because of the presence of high-degree polynomial terms, the graph could be constructed only for the smaller benchmarks (degree ≤ 4). The other benchmarks could not be verified within the time-out limit.
A. Limitations of our approach
Many DSP systems implement some form of computation approximation, by incorporating various rounding schemes. Our approach is currently restricted inasmuch as it cannot verify those datapaths where intermediate signals have varying precision (due to rounding). Similarly, saturation arithmetic architectures also cannot be verified using our technique. Analysis of such designs requires substantially more work, and is the subject of our future investigations.
VII. Conclusions
We have presented a framework for equivalence verification of arithmetic datapaths with multiple word-length operands. Our approach models the design as a polyfunction from $Z_{2^{n_1}} \times Z_{2^{n_2}} \times \cdots \times Z_{2^{n_d}}$ to $Z_{2^m}$. The concept of vanishing (nil) polyfunctions is exploited to prove equivalence between two symbolically distinct (but computationally equivalent) polynomials. Concepts from number theory and commutative algebra have been applied to derive a complete algorithmic procedure for this purpose. Using our algorithm, a variety of benchmarks have been verified. Our approach was able to solve the problem in all cases, where contemporary verification approaches were infeasible. As part of future work, we are investigating applications of the proposed concepts to datapath computations that implement rounding.