Abstract-This paper addresses the problem of equivalence verification of RTL descriptions. The focus is on datapath-oriented designs that implement polynomial computations over fixed-size bit-vectors. When the size ($m$) of the entire datapath is kept constant, fixed-size bit-vector arithmetic manifests itself as polynomial algebra over finite integer rings of residue classes $\mathbb{Z}_{2^m}$. The verification problem then reduces to that of checking equivalence of multi-variate polynomials over $\mathbb{Z}_{2^m}$. This paper exploits the concepts of polynomial reducibility over $\mathbb{Z}_{2^m}$ and derives an algorithmic procedure to transform a given polynomial into a unique canonical form modulo $2^m$. Equivalence testing is then carried out by coefficient matching. Experiments demonstrate the effectiveness of our approach over contemporary techniques.
I. INTRODUCTION
RTL descriptions of integer datapaths that implement polynomial arithmetic are found in many practical designs, particularly in digital signal processing (DSP) for audio, video and multimedia applications. Such designs perform a sequence of ADD, MULT and SHIFT type algebraic computations that can be modeled as multi-variate polynomials of finite degree. Initial algorithmic specifications of such systems represent data in floating-point formats. However, they are often implemented with fixed-point architectures in order to optimize the area, delay and power costs of the implementations. In many cases, the design choice is that of a single, uniform system word-length for the computations. Such fixed-size datapath computations are generally implemented using signal truncation, rounding or saturation arithmetic.
This paper addresses the problem of RTL equivalence verification of datapath descriptions that implement polynomial computations over fixed-size bit-vectors by way of signal truncation. In such designs, $m$-bit adders and multipliers produce an $m$-bit output; only the lower $m$ bits of the outputs are used and the higher-order bits are ignored. When the datapath size ($m$) over the entire design is kept constant, fixed-size bit-vector arithmetic manifests itself as polynomial algebra over finite integer rings of residue classes $\mathbb{Z}_{2^m}$; i.e. addition and multiplication are closed within the finite set of integers $\{0, \ldots, 2^m - 1\}$. In such cases, symbolically distinct 
A. Motivating the Verification Problem
Let us motivate the equivalence verification problem as it appears in the context of our work. Fig. 1 depicts a typical design flow for DSP applications. The floating-point MATLAB model is automatically converted to a fixed-point model, which is subsequently translated into RTL. Automatic translation utilities are available for this purpose [1]. The verification problem instance is that of checking the equivalence of the fixed-point design against the translated (and optimized) RTL models. As an example, consider the anti-alias function of an MP3 decoder that computes [2] :
. The given equation can be approximated using the Taylor series expansion for a range of x based on the given application. Under the constraint that the datapath size is to be fixed to 16-bits, the computation 16 ≡ G%2 16 . So how do we prove that the above computations are indeed equivalent? An algorithmic solution to this problem is the subject of this paper.
II. LIMITATIONS OF CONTEMPORARY APPROACHES
It is evident that Binary Decision Diagrams (BDDs) [4], Binary Moment Diagrams (BMDs) [5], K*BMDs [6] and their derivatives are ill-suited for our application, mostly due to the presence of high-degree polynomial computations over wide bit-vectors. TEDs [7] have been proposed as canonical DAG representations for multi-variate polynomials. However, TEDs do not model modulo arithmetic and thus cannot prove polynomial equivalence over finite integer rings.
Modulo arithmetic concepts have been studied in the context of RTL verification for bit-vector arithmetic [8] [9], word-level ATPG [10] and MILP-based simulation vector generation [11]. However, these are mostly geared toward solving linear congruences under modulo arithmetic, a different application from proving polynomial equivalence modulo $2^m$. There exist various applications (such as FIR, IIR, Kalman, Elliptical wave filters, FFT, etc.) whose RTL computations have been verified using co-operative decision procedures, theorem provers (HOL), term-rewriting, and congruence closure based techniques [12]. DSP implementations of the above applications are mostly linear and/or multi-linear forms, which are easy to verify. However, for polynomial equivalence in $\mathbb{Z}_{2^m}$, such approaches are not very efficient.
Within the scope of Symbolic Computer Algebra, tools such as [13] [14] do provide algorithmic solutions to polynomial equivalence over a variety of rings. However, these solutions are available for fields ($\mathbb{R}$, $\mathbb{Q}$, $\mathbb{C}$), prime rings $\mathbb{Z}_p$, integral and Euclidean domains, collectively called the unique factorization domains (UFDs). Within UFDs, computer algebra systems solve the equivalence checking problem by uniquely factorizing an expression into irreducible terms and comparing the coefficients of the factored terms ordered lexicographically.
Efficient algorithms for factorization have been developed [15] [16], which can be readily used for this purpose. However, in the case of our application, the finite integer ring formed by the specific modulo value $2^m$ is a non-UFD, due to the presence of zero divisors (e.g., $4 \neq 0$, $2 \neq 0$, yet $4 \cdot 2 = 0$ in $\mathbb{Z}_8$). Since $\mathbb{Z}_{2^m}$ is a non-UFD, a polynomial in $\mathbb{Z}^d_{2^m}$ cannot, in general, be uniquely factorized into irreducible terms. For example, consider $f(x) = x^2 - x$ in the non-UFD $\mathbb{Z}_6$; $f$ factorizes in two (non-unique) irreducible forms: $(x)(x - 1)$ and $(x - 3)(x - 4)$. Along the same lines, techniques using the concepts of Gröbner bases [17] [2] find extensive application in UFDs. However, for the above reasons, they cannot be directly ported to solve the above problem in the non-UFD $\mathbb{Z}_{2^m}$. The symbolic algebra libraries ZEN [18] and NTL [19] allow for polynomial manipulation (factorization, multiplication, primality testing etc.) over rings of the type $\mathbb{Z}_n$, $n$ an integer, as well as over their polynomial extensions. However, to the best of our knowledge, a "ready-made" algorithmic procedure to test
A. Related Work in Number Theory & Polynomial Algebra
The problem of deciding whether $f \,\%\, n \equiv g \,\%\, n$ is known to be NP-hard when $n \geq 2$ [20]. Researchers from the field of number theory and commutative algebra have analyzed properties of polynomials over arbitrary finite integer rings. Singmaster [21] presented the theory of univariate vanishing polynomials over $\mathbb{Z}_n$, $n \in \mathbb{N}$, $n > 1$; i.e. those polynomials $f$ such that $f(x) \,\%\, n \equiv 0$. For example, $2x^2 + 2x \equiv 0 \,\%\, 4$, $\forall x \in \mathbb{Z}_4$; hence $2x^2 + 2x$ is a vanishing polynomial in $\mathbb{Z}_4$. He identified necessary and sufficient conditions for a univariate polynomial to vanish $\%\, n$. The equivalence test for $f(x) \equiv g(x)$ in $\mathbb{Z}_n$ can then be re-formulated as determining whether $(f(x) - g(x)) \,\%\, n \equiv 0$. However, in digital design, we mostly encounter multi-variate polynomials. Hungerbuhler and Specker [22] extend the concepts from [21] and derive a unique/canonical form representation of a multi-variate polynomial over finite integer rings of the form $\mathbb{Z}_{p^m}$, where $p$ is any prime integer. Their result suits our application, as in our case $p = 2$.
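The re-formulation of equivalence as a vanishing test can be illustrated with a minimal brute-force sketch. The helper names `vanishes` and `equivalent` are hypothetical, and exhaustive evaluation over all residues is used only because the rings here are tiny; it is not the procedure developed in this paper.

```python
def vanishes(f, n):
    """Return True if f(x) % n == 0 for every residue x in Z_n."""
    return all(f(x) % n == 0 for x in range(n))


def equivalent(f, g, n):
    """f and g agree as functions over Z_n iff (f - g) vanishes in Z_n."""
    return vanishes(lambda x: f(x) - g(x), n)


# The example from the text: 2x^2 + 2x vanishes in Z_4.
print(vanishes(lambda x: 2 * x**2 + 2 * x, 4))
# Consequently 2x^2 and 2x are symbolically distinct but equivalent in Z_4.
print(equivalent(lambda x: 2 * x**2, lambda x: 2 * x, 4))
```

For a univariate polynomial with integer coefficients, checking the residues 0 to n-1 is sufficient because f(x) % n depends only on x % n. The cost grows as n^d for d variables, which is why an algebraic test is needed for realistic word-lengths.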
The main contributions of this paper are: i) We formulate the fixed-vector-size ($m$) RTL datapath verification problem as polynomial equivalence in $\mathbb{Z}_{2^m}$; ii) From the concepts presented in [22], we derive a systematic algorithmic procedure that operates on the given polynomials in $\mathbb{Z}_{2^m}$ and reduces them to a unique canonical form; iii) We extract the data-flow graphs (DFG) corresponding to the given RTL descriptions and construct their polynomial representations by traversing the DFGs from inputs to outputs. The polynomials are then reduced to their canonical forms and the equivalence check is performed by coefficient matching; iv) Experimentally, we demonstrate that the proposed approach is able to verify the equivalence of high-degree polynomial RTL datapaths (real-world benchmarks), where contemporary methods prove to be impractical.
In the rest of the paper, we use the notation 
III. VANISHING POLYNOMIALS OVER FINITE RINGS
It is a well-known result in number theory that for any $n \in \mathbb{N}$, $n!$ divides the product of $n$ consecutive numbers. For example, $4!$ divides $4 \times 3 \times 2 \times 1$. But this is also true of any $n$ consecutive numbers: $4!$ also divides $99 \times 100 \times 101 \times 102$. Consequently, it is possible to find the least $k \in \mathbb{N}$ such that $n \mid k!$. This value $k$ corresponds to the Smarandache function, SF($n$) [23]. In the ring of interest, $\mathbb{Z}_{2^m}$, we have $n = 2^m$; for example, for $\mathbb{Z}_{2^3}$, $8 \mid 4!$ and SF(8) = 4.
Note that 8 does not divide $3!$, and hence the least $k = 4$.
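The two facts used above can be checked with a short sketch. The function name `smarandache` is a hypothetical helper, and the incremental search is only one simple way to compute SF(n), chosen here for readability.

```python
from math import factorial, prod


def smarandache(n):
    """Least k >= 1 such that n divides k! (simple incremental search)."""
    k, fact = 1, 1
    while fact % n != 0:
        k += 1
        fact *= k
    return k


# n! divides any product of n consecutive integers, e.g. 99*100*101*102.
print(prod(range(99, 103)) % factorial(4) == 0)
# In Z_{2^3}: 8 does not divide 3! = 6 but divides 4! = 24, so SF(8) = 4.
print(smarandache(8))
```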
This property can be utilized to treat the equivalence problem as a divisibility issue in $\mathbb{Z}_{2^m}$: if $(f - g)$ can be represented as a product of 4 consecutive numbers, then $(f - g)$ would vanish in $\mathbb{Z}_{2^3}$. So, what is a natural example of such a polynomial? The answer is $(x+1)(x+2)(x+3)(x+4)$. In this regard, Singmaster [21] proposed a set of monic polynomials (with leading coefficient = 1), $S_k$, where each $S_i$ represents (in polynomial form) a product of $i$ consecutive numbers. When a polynomial cannot be factored into such $S_k$ expressions, can it still vanish? Consider the quadratic polynomial $4x^2 + 4x$ in $\mathbb{Z}_8$. It can be written as $4(x + 2)(x + 1)$. However, $4x^2 + 4x$ cannot be factorized as $(x+1)(x+2)(x+3)(x+4)$.
The missing factors, $(x+4)(x+3)$ in this case, are compensated for by the multiplicative constant 4; therefore, $4x^2 + 4x \equiv 0 \,\%\, 8$. Singmaster identified the constraints on such multiplicative constants such that the polynomial in question would vanish. We state the following result. Singmaster extended this result to develop a canonical representation of a univariate polynomial that vanishes over any finite integer ring. We have studied this work and applied it to verification of univariate polynomial datapaths in [24].
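As a small illustration of how a multiplicative constant compensates for missing consecutive factors, the sketch below checks the examples in $\mathbb{Z}_8$ by exhaustive evaluation. The helper `vanishes_mod` and the search for the smallest compensating constant are illustrative assumptions and are not taken from [21].

```python
def vanishes_mod(f, n):
    """Return True if f(x) % n == 0 for every residue x in Z_n."""
    return all(f(x) % n == 0 for x in range(n))


# Product of four consecutive numbers: divisible by 4! = 24, hence by 8.
print(vanishes_mod(lambda x: (x + 1) * (x + 2) * (x + 3) * (x + 4), 8))

# Two consecutive factors alone do not vanish in Z_8 ...
print(vanishes_mod(lambda x: (x + 1) * (x + 2), 8))

# ... but they do once multiplied by a suitable constant c.
smallest_c = min(
    c for c in range(1, 8)
    if vanishes_mod(lambda x: c * (x + 1) * (x + 2), 8)
)
print(smallest_c)  # 4, matching 4(x + 2)(x + 1) = 4x^2 + 4x in Z_8
```

The value 4 follows because $(x+1)(x+2)$ is always even and equals 2 at $x = 0$, so $c$ must supply the remaining factor $8/2 = 4$.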
The above concept of vanishing polynomials leads to the concept of reducibility. For example, in $\mathbb{Z}_{2^3}$, $4x^2 + 4x \equiv 0$, i.e. $4x^2 \equiv -4x \equiv 4x \,\%\, 2^3$. In other words, $4x^2$ can be reduced to $4x$. Hungerbuhler and Specker [22] extend the concept of polynomial reduction to the multi-variate case. In the next section, we present the necessary theoretical foundation to perform the requisite reductions on multi-variate polynomials.
IV. REDUCIBILITY OF MULTI-VARIATE POLYNOMIALS
We use the following multi-index notation in the rest of the paper [22] :
In the above notation, the monomial $4x_1^2 x_2$ can be represented as $a x^k \equiv a x_1^{k_1} x_2^{k_2}$, where $a = 4$, $k_1 = 2$ and $k_2 = 1$.
We consider the following results. The proofs are available in [22] and are not reproduced. Here, $a = 4$ and $k! = k_x! = 2$ and clearly, $2^3 \mid 4 \cdot 2!$. Now consider another polynomial in two variables, $f_2(x, y) = 4x^2y + 4xy + 4x^2 + 4x$ in $\mathbb{Z}_8$, which is equivalently written as 
y + 1 1 = 4x 2 y − 4(x + 2)(x + 1)(y + 1)
In the above expression, 4x 2 can be further reduced to 4x (as
A. Canonical representation of polynomials modulo 2 m
The above concepts of monomial reductions can be applied to a given polynomial, iteratively, and the polynomial can be reduced to a minimal, unique canonical form. For this purpose, we first define the following concept.
Definition 4.1: We define $\nu_2(k!)$ as the maximum degree $x$ such that $2^x$ divides $k!$: $\nu_2(k!) = \max\{x \in \mathbb{N}_0 : 2^x \mid k!\}$.
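Definition 4.1 can be evaluated without forming $k!$ explicitly. The sketch below uses Legendre's formula, $\nu_2(k!) = \sum_{i \geq 1} \lfloor k / 2^i \rfloor$, which is a standard identity rather than something stated in this paper. The function names are hypothetical, and `nu2_multi` assumes the usual multi-index convention $k! = k_1! \cdots k_d!$, consistent with $k! = k_x! = 2$ in the example above.

```python
from math import factorial


def nu2(n):
    """Largest x such that 2**x divides the positive integer n."""
    x = 0
    while n % 2 == 0:
        n //= 2
        x += 1
    return x


def nu2_factorial(k):
    """nu_2(k!) via Legendre's formula: sum of floor(k / 2**i) for i >= 1."""
    total = 0
    while k:
        k //= 2
        total += k
    return total


def nu2_multi(k):
    """Assumed multi-index extension: k! = k_1! * ... * k_d!, so valuations add."""
    return sum(nu2_factorial(ki) for ki in k)


# nu_2(2!) = 1, nu_2(4!) = 3, and for the multi-index k = (2, 1): 1 + 0 = 1.
print(nu2_factorial(2), nu2_factorial(4), nu2_multi((2, 1)))
```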
The above results lead to the following canonical form for polynomials in Z 
where
While the proof of this theorem is provided in [22] , we highlight the key concepts as they allow us to derive a systematic algorithmic procedure to reduce a given polynomial to the above canonical form.
The representation exists: Let $a x^k$ be the monomial of the highest total degree that appears in $f$. 
where q is the quotient and r is the remainder. Moreover, and
. This is the minimal unique form representation and further reduction is not possible. Now consider the monomial $5x^2y$ in $\mathbb{Z}_{2^3}$. Here, $\alpha_k = 5$ and $\nu_2(2!) = 1$. Note that $2^3$ does not divide $5 \cdot 2!$. Moreover, $\alpha_k = 5 \notin \{0, 1, 2, 3\}$. Therefore, we represent $5x^2y = 4x^2y + x^2y$. As shown above, $4x^2y$ can be reduced to a lower total degree and $x^2y$ is already in reduced form. Note that the proof of the above theorem allows us to derive an algorithmic procedure to reduce a given polynomial to its unique canonical form. The procedure operates as follows:
1) Order the terms in descending term-order [2] of their highest total degree. 
]; /*Quotient is degree-reducible. Subtract vanishing polynomial*/ reduce quo = quo(α k +1)( 
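The reduction idea described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the paper's Algorithm 1: it assumes that the canonical form restricts each coefficient to $0 \leq \alpha_k < 2^{m - \nu_2(k!)}$ (consistent with the $5x^2y$ example, where the allowed set is $\{0, 1, 2, 3\}$), that $k! = k_1! \cdots k_d!$, and that the vanishing polynomial subtracted for a monomial $x^k$ is a multiple of $\prod_i (x_i+1)(x_i+2)\cdots(x_i+k_i)$. Polynomials are stored as dictionaries from exponent tuples to coefficients; all function names are hypothetical.

```python
def nu2_factorial(k):
    """Exponent of 2 in k!, via Legendre's formula."""
    total = 0
    while k:
        k //= 2
        total += k
    return total


def poly_mul(f, g, mod):
    """Multiply two polynomials stored as {exponent_tuple: coefficient}."""
    out = {}
    for ka, a in f.items():
        for kb, b in g.items():
            key = tuple(i + j for i, j in zip(ka, kb))
            out[key] = (out.get(key, 0) + a * b) % mod
    return out


def consecutive_product(k, mod):
    """Expand prod_i (x_i + 1)(x_i + 2)...(x_i + k_i); its leading term is x^k."""
    d = len(k)
    one = (0,) * d
    result = {one: 1}
    for i, ki in enumerate(k):
        x_i = tuple(int(t == i) for t in range(d))
        for j in range(1, ki + 1):
            result = poly_mul(result, {x_i: 1, one: j % mod}, mod)
    return result


def canonical_form(poly, m):
    """Reduce poly to the assumed canonical form modulo 2**m."""
    mod = 1 << m
    f = {k: c % mod for k, c in poly.items()}
    max_deg = max((sum(k) for k in f), default=0)
    for deg in range(max_deg, -1, -1):       # highest total degree first
        for k in [k for k in f if sum(k) == deg]:
            nu = sum(nu2_factorial(ki) for ki in k)
            bound = 1 << max(m - nu, 0)      # allowed coefficients: 0..bound-1
            quo = f[k] // bound              # degree-reducible part
            if quo == 0:
                continue
            # quo * bound * consecutive_product vanishes modulo 2**m, so
            # subtracting it keeps the remainder on x^k and only changes
            # monomials of strictly lower total degree.
            for kk, c in consecutive_product(k, mod).items():
                f[kk] = (f.get(kk, 0) - quo * bound * c) % mod
    return {k: c for k, c in f.items() if c}


def equivalent(f, g, m):
    """Equivalence check by coefficient matching of the canonical forms."""
    return canonical_form(f, m) == canonical_form(g, m)


# Examples from the text in Z_{2^3}; exponent tuples are (deg_x, deg_y).
print(canonical_form({(2, 1): 5}, 3))
f2 = {(2, 1): 4, (1, 1): 4, (2, 0): 4, (1, 0): 4}
print(equivalent(f2, {}, 3))
```

Processing monomials by descending total degree guarantees termination in this sketch, since each subtraction only introduces terms of strictly lower total degree. The uniqueness of the resulting form relies on the theorem of [22] and is not established by the code itself.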
V. EXPERIMENTS
We have implemented Algorithm 1 described in Sec. IV in Perl with calls to MAPLE 7 [13] for all the algebraic manipulations. The data-flow graph for the given RTL descriptions is extracted using GAUT [25] . Traversing the DFG from the inputs to the outputs, the polynomial representations are constructed. The datapath size (m) is also recorded. The algorithm is applied to reduce the two polynomials to their canonical forms. Equivalency check is carried out by coefficient-matching.
We have tested our algorithm with a number of designs collected from a variety of benchmark suites, as shown in Table I. The first two examples [2] are phase-shift keying and anti-aliasing functions, both used in digital communication. The degree-3 and degree-4 filter designs [26] are Volterra models of polynomial signal processing applications. Horner forms of polynomials, used for faster signal processing, are from [3]. MIBench is a 9th-degree polynomial from [27]. The last example is a vanishing polynomial of degree 10, specifically created to validate our algorithm. The time reported is the total time for canonizing both polynomials and that required for subsequent coefficient matching. The two descriptions to be verified are symbolically different but computationally equivalent. The number of variables, the highest degree with which a variable appears in a term, and the datapath size are shown in column 2. For example, the Savitzky-Golay filter has 5 variables, the highest degree of each being 3, and bit-width = 16.
We have performed equivalence checking of the given RTL designs using BDD, BMD, SAT and MILP based approaches. Since gate-level descriptions are required by both BDDs and SAT, we synthesized our designs using a commercially available logic synthesis tool. BDDs were used to verify the resulting netlists using the VIS [28] package. It was found that though BDDs could solve the problem for some of the smaller benchmarks (especially for univariate polynomials), they failed for the rest of the designs.
From the gate-level netlists corresponding to the two designs, we generated miter circuits and converted them to CNF format. ZChaff [29] was used to prove equivalence via unsatisfiability testing. For all the designs, ZChaff could not solve the problem within the time-limit of 500s. For equivalence via MILP solving [11], linear inequalities were created from the data-flow graphs of the designs, and LPSOLVE was used as the resolution engine. Equivalence $f \equiv g$ was tested by proving UNSAT($f \neq g$). MILP failed because linearizing $X_m^k$ requires expanding it into its constituent $m$ bits. Since BMD packages are not available in the public domain, the TED [7] package was suitably modified to construct BMDs. Note that BMD decomposition is a special case of TEDs; when all variables are Boolean, the TED reduces to a BMD. We attempted to construct the BMDs from the synthesized gate-level netlists corresponding to the original RTL descriptions. Because of the presence of high-degree polynomial terms, the graph could be constructed only for the smaller benchmarks. The other benchmarks could not be verified within the timeout limit.
The last benchmark is a "vanishing polynomial" in 2 variables. We wanted to verify that the outputs always compute zero. Interestingly, (commercial) logic synthesis tools generated a redundant, non-empty circuit. To verify that the circuit was indeed redundant, we attempted to construct both BDDs and BMDs, but were unable to do so. While BDDs ran out of memory, BMD-composition operations did not terminate.
VI. CONCLUSIONS AND FUTURE WORK
This paper has presented a polynomial algebra based framework for equivalence verification of arithmetic datapaths. The targeted applications are polynomial computations implemented with fixed-size bit-vectors. The concept of polynomial reducibility over finite rings, $\mathbb{Z}_{2^m}$, is exploited to transform a given polynomial to a unique canonical form. The equivalence checking is carried out using coefficient matching. A variety of benchmarks were verified using the proposed method. Our algorithm was able to solve the problem in all cases, where established techniques failed.
As part of future work, we are currently looking to extend the concepts presented in this paper to verify fixed-size datapaths that implement rounding schemes by ignoring the lower order bits. We would also like to explore verification of datapaths with multiple word-lengths.
