Abstract. This paper proposes a new approach for proving arithmetic correctness of data paths in System-on-Chip modules. It complements existing techniques which are, for reasons of complexity, restricted to verifying only the control behavior. The circuit is modeled at the arithmetic bit level (ABL) so that our approach is well adapted to current industrial design styles for high performance data paths. Normalization at the ABL is combined with the techniques of computer algebra. We compute normal forms with respect to Gröbner bases over rings $\mathbb{Z}/2^n$. Our approach proves tractable for industrial data path designs where standard property checking techniques fail.
Introduction
Property checking has become well-established in modern design flows for Systems-on-Chip (SoCs). Its main application domain is ensuring the correctness of the individual SoC blocks. This not only leads to high-quality IP (intellectual property) modules but also reduces the costs for system integration and chip-level simulation. Given IP modules of provably high quality, chip-level simulation may concentrate on true system-level aspects and is relieved from hunting bugs in local modules. Therefore, in recent years, a lot of effort has been made to develop sophisticated methodologies and tools for formal module verification based on property checking. Today, formal property checking can handle almost all types of modules found in SoCs. Nonetheless, a few pathological cases remain that sometimes limit the application of property checking in industrial practice. In particular, data paths are often a challenge for formal techniques, especially if not only the correctness of the control flow but also the correctness of the data is to be proved.
For complex arithmetic data paths, simulation therefore still prevails in industrial verification environments. This is due to the inability of standard proving procedures based on satisfiability solving (SAT) or binary decision diagrams (BDDs) to handle arithmetic functions. Multiplication in particular, which is part of nearly all data paths for signal processing applications, has remained a severe problem for standard tools. This deficiency has motivated the research community to investigate alternative proof methods with a focus on arithmetic.
In case the validity of a property can be proven without consideration of the exact functionality of the data path, abstraction and refinement techniques have shown superiority over pure Boolean SAT techniques. A survey on these techniques can be found in [1] . However, for properties that depend on the exact functionality of the datapath a suitable abstraction is not likely to be found.
Another direction of research investigates SAT-modulo-theory (SMT) solvers. These solvers combine a SAT solver with specialized solvers for certain well-selected theories. An example of such a theory is the theory of equality with uninterpreted functions used in UCLID [2]. If the problem at hand really depends on the exact functionality of a datapath, as is typically the case, most SMT solvers resort to bit blasting [1] for the corresponding problem parts. In this case SMT solvers show the same performance limitations as pure SAT solvers as soon as these datapaths include multiplication operations. The decision problems in RTL property checking could be expressed as satisfiability problems for formulas of the quantifier-free logic of fixed-size bit vectors (QF-BV) and in principle be solved using solvers such as Yices [3], MathSat [4], Z3 [5] or Spear [6]. For sophisticated datapath implementations involving multiplication, however, our experience is that the problems are still beyond the capacity of such solvers.
Recently, techniques from symbolic computer algebra have entered the verification arena. The authors of [7] present a procedure to determine whether a multivariate polynomial with fixed word length operands is vanishing. By this means a comparison of polynomial representations for bit vector functions is feasible. This procedure is extended towards multiple word length operands in [8, 9] . However, both approaches require a word-level representation of the datapaths under comparison. This limits their applicability in RTL property checking. Due to performance and area requirements RTL designers typically design specialized arithmetic components. These components are often designed using bit level arithmetic circuitry to build addition trees and partial products. The smallest entities in an addition tree can be described using half and full adders in general. An approach for verification of such bit level implementations using Gröbner basis theory over fields is reported in [10] . This approach requires polynomial specifications for every building block in the hierarchy of the arithmetic circuit design. After proving that a block, e.g., a CSA adder, fulfills its local specification, the polynomial representation is used to verify the block in the next level of the hierarchy. However, as the correctness proof includes a range check the intermediate results at the block boundary are required to have sufficient bit width to represent every possible result. For designs implementing integer arithmetic with fixed bit width this is often not the case.
A heuristic approach to exploit the availability of arithmetic bit level (ABL) information in RTL designs has been reported in [11] . In this work a data structure called ABL description for representation of addition networks and bitwise multiplication is transformed into a reduced normal form. By canceling out common addends from addition networks in the fanin of a comparator the normalization approach relieves the SAT solver from reasoning in structurally different implementations for the same arithmetic function.
In order to overcome the limitations of [10] we use computer algebra algorithms for rings $\mathbb{Z}/2^N$ to solve decision problems at the arithmetic bit level. This extends the normalization approach of [11] with a clean and well-understood mathematical foundation. We show that an ABL description [11] can directly be transformed into a set of equivalent variety subset problems. We exploit the observation that under certain monomial orderings the set $G$ of polynomials generated from the ABL components forms a Gröbner basis, with special properties, of the ideal $I = \langle G \rangle$ generated by these polynomials. This allows us to solve the variety subset problem and hence decide problems at the arithmetic bit level.
The remainder of the paper is organized as follows: Section 2 briefly reviews the notion of an ABL description and describes how such a description can be generated given a design under verification and a property. Section 3 details the mathematical modeling for decision problems at the ABL. The proposed techniques are evaluated by experiments summarized in Section 4. Finally, Section 5 concludes the paper.
ABL Description
Arithmetic bit level (ABL) descriptions as introduced in [11] have proven to be useful for modeling the arithmetic parts of a property checking instance. In this section we briefly review this notion as far as it is required for this paper. We use the following notations:
- For $a \in \mathbb{Z}$, $b > 0$ the remainder, $a \bmod b$, of the integer division $a/b$ denotes the smallest $k \ge 0$ with $k = a - mb$ for some $m \in \mathbb{Z}$.
- For $n > 0$ and $a \in \mathbb{Z}$ the uniquely determined bit vector $(a_{n-1}, \dots, a_0)$ with $a \bmod 2^n = \sum_{i=0}^{n-1} 2^i a_i$ is denoted as $\langle a, n \rangle = (a_{n-1}, \dots, a_0)$, i.e., $\langle a, n \rangle$ is the n-bit binary unsigned integer representation of $a$.
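The two notations can be made concrete with a few lines of Python. This is only an illustrative sketch: the helper names `bitvec` and `value` are hypothetical and do not appear in the paper. It relies on the fact that Python's `%` operator returns the non-negative remainder for a positive modulus, which matches the definition of a mod b given above.

```python
def bitvec(a, n):
    """Return <a, n> = (a_{n-1}, ..., a_0), the n-bit unsigned representation of a mod 2**n."""
    if n <= 0:
        raise ValueError("n must be positive")
    r = a % (2 ** n)  # smallest k >= 0 with k = a - m * 2**n
    return tuple((r >> i) & 1 for i in reversed(range(n)))


def value(bits):
    """Inverse direction: sum of 2**i * a_i for a tuple (a_{n-1}, ..., a_0)."""
    n = len(bits)
    return sum(2 ** i * bits[n - 1 - i] for i in range(n))
```

Negative integers are mapped to their residue first, so `bitvec(-1, 3)` yields the all-ones vector.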
The combinatorial transition function of an RTL circuit design is usually modeled by a directed acyclic graph where the vertices are labeled with bit vector functions. It is common practice to translate verification problems for RTL circuits into such bit vector netlists with a single output indicating whether, e.g., a certain property holds for a design. For the arithmetic problem parts we extract an ABL description from this netlist. This description again is a directed acyclic graph where the vertices can be of type "partial product generator", "addition network" or "comparator". These vertex types are defined as follows: 
Partial product generators model bit-wise multiplication and comparators model comparison of bit vectors. Bit level addition units like half adders (HA) or full adders (FA) are modeled as addition networks. By construction, addition networks can be used to model any addition circuit ranging from HAs and FAs up to the entire addition scheme of a multiplier or a multiply-accumulate unit. This is true for both signed and unsigned arithmetic.
Example 1. A signed 2 × 2-bit multiplier can be modeled with the partial product generator
and the addition network
A simple bit-level implementation of this multiplier may implement the addition network using a full adder and two half adders. They can be modeled by
For reasons of space we omit the formal definition of ABL descriptions as a DAG. The interested reader is referred to [11] . Basically, the nodes of the graph are labelled with their vertex type and the edges describe the interconnections between them. Here, we explain this concept by continuing Example 1.
Example 2. The ABL description for the comparison of the bit level multiplier implementation discussed in Example 1 against its word level specification is depicted in Figure 1. The vertices of this graph are labeled with the bit vector functions defined in the previous example. The edges $(v, v')$ are labeled with bit vectors that propagate the result of $v$ to the inputs of $v'$. In other words, the variables are defined by the following equations:
This example illustrates that ABL descriptions may contain structurally dissimilar representations for one and the same arithmetic function. To simplify the comparison of such representations a heuristic ad-hoc algorithm called ABL normalization was proposed in [11] . This algorithm performs a series of local equivalence transformations on the ABL description that are based on the commutative and distributive laws. However, in the next section we will describe how to obtain a variety subset problem that is equivalent to the decision problem resulting from the comparison of such ABL representations. This paves the way for the application of generic computer algebra algorithms for which efficient implementations are available.
Mathematical Background
Application of computer algebra techniques to ABL verification problems requires ABL components to be modeled by polynomials over a single ring. Due to the operation mod used to specify ABL components, the ring $\mathbb{Z}/2^n$ seems to be the natural choice. However, the mapping of ABL descriptions onto sets of polynomials $G \subset \mathbb{Z}/2^n[X]$ over such a ring is not trivial and will be detailed in this section. The key observation is that the constructed set $G$ is a Gröbner basis of the generated ideal $I = \langle G \rangle$. This makes the proposed approach computationally feasible.
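We start with a set of equations $G_j$, $j = 1, \dots, m$, given by polynomials $f_j \in \mathbb{Z}[X]$, $X$ a finite set of variables, which are of the form
$$G_j:\quad \sum_{i=0}^{n_j-1} 2^i r_i^{(j)} = f_j\bigl(a_1^{(j)}, a_2^{(j)}, \dots, a_{m_j}^{(j)}\bigr) \bmod 2^{n_j}.$$
For the variables $r_i^{(j)}, a_k^{(j)} \in X$ in this equation we assume that no output variable $r_i^{(j)}$ occurs among the inputs of $G_j$ itself or of any equation preceding $G_j$ in a topological order of the ABL description.
Note that the equations $G_j$ can be easily generated from the vertices of an ABL description and that this condition is fulfilled as the ABL description is acyclic by definition. For illustration we give a few examples.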
Example 3. The partial products of a non-Booth-encoded n × m multiplier can be modeled by the polynomial equations
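Example 4. A full adder with inputs $a_0, a_1, a_2$ and outputs $s$ and $c$ for sum and carry is modeled by the equation $s + 2c = a_0 + a_1 + a_2 \bmod 4$.
Example 5. A k-bit adder with inputs $a = (a_0, \dots, a_{k-1})$ and $b = (b_i)$ and result $r = (r_i)$ is modeled by $\sum_{i=0}^{k-1} 2^i r_i = \sum_{i=0}^{k-1} 2^i a_i + \sum_{i=0}^{k-1} 2^i b_i \bmod 2^k$.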
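The models of Examples 3 to 5 can be sanity-checked by exhaustive enumeration for small bit widths. The sketch below is not taken from the paper. It assumes the usual reading of the three models (AND-based partial products $a_i b_j$, the full adder relation s + 2c = a0 + a1 + a2 mod 4, and r = a + b mod 2^k for the adder) and compares them with conventional gate-level implementations. All function names are hypothetical.

```python
from itertools import product


def word(bits):
    """Unsigned value of a bit sequence with bits[i] weighted by 2**i."""
    return sum(2 ** i * b for i, b in enumerate(bits))


def full_adder(a0, a1, a2):
    """Gate-level full adder returning (sum, carry)."""
    s = a0 ^ a1 ^ a2
    c = (a0 & a1) | (a0 & a2) | (a1 & a2)
    return s, c


def full_adder_model_holds():
    # Assumed reading of Example 4: s + 2c = a0 + a1 + a2 (mod 4)
    for a0, a1, a2 in product((0, 1), repeat=3):
        s, c = full_adder(a0, a1, a2)
        if (s + 2 * c) % 4 != (a0 + a1 + a2) % 4:
            return False
    return True


def ripple_adder(a_bits, b_bits):
    """k-bit ripple-carry adder; the final carry is dropped."""
    carry, result = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    return result


def adder_model_holds(k):
    # Assumed reading of Example 5: sum 2^i r_i = sum 2^i a_i + sum 2^i b_i (mod 2^k)
    for a_bits in product((0, 1), repeat=k):
        for b_bits in product((0, 1), repeat=k):
            r = ripple_adder(a_bits, b_bits)
            if word(r) != (word(a_bits) + word(b_bits)) % 2 ** k:
                return False
    return True


def partial_products_model_holds(n, m):
    # Assumed reading of Example 3: p[i][j] = a_i * b_j, so sum 2^(i+j) p[i][j] = a * b
    for a_bits in product((0, 1), repeat=n):
        for b_bits in product((0, 1), repeat=m):
            p = [[a & b for b in b_bits] for a in a_bits]
            total = sum(2 ** (i + j) * p[i][j] for i in range(n) for j in range(m))
            if total != word(a_bits) * word(b_bits):
                return False
    return True
```

Exhaustive enumeration is only feasible for very small widths, which is exactly the limitation the algebraic approach is meant to avoid.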
For every proof goal, we obtain an additional polynomial $g$ depending on a subset of variables $\{a_1, \dots, a_t\} \subset X$ and need to check whether $g(a_1, \dots, a_t) = 0 \bmod 2^n$ for all solutions of the set of equations $\{G_j\}$.
Example 6. A k-bit comparator of operands $a$ and $b$ is modeled by the polynomial $g = \sum_{i=0}^{k-1} 2^i a_i - \sum_{i=0}^{k-1} 2^i b_i$.
Denote the set of all solutions to $\{G_j\}$ as $V(\{G_j\})$. Analogously let $V(g)$ be the set of all roots of $g$. Usually the equations $G_j$ and the polynomial $g$ are given mod $2^k$ for different $k$. We apply a number of transformations to create an equivalent variety subset problem $V(\{h_i\}) \subset V(g)$ where $h_i$ and $g$ are polynomials over a single ring $\mathbb{Z}/2^N$ with appropriate $N$, which is necessary in order to apply computer algebra. To solve the problem we construct a Gröbner basis and then use normal form computations with respect to this basis.
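For a very small instance the variety subset question can be answered by brute force, which makes the problem statement tangible. The sketch below is an assumption-laden illustration, not the paper's method. It uses a hypothetical 2 × 2-bit unsigned multiplier built from four AND-based partial products and two half adders, restricts all signals to the values 0 and 1, and uses invented signal names. It enumerates all solutions of the equations and checks that the proof goal g vanishes on each of them.

```python
from itertools import product

# Hypothetical signal names for the illustration
SIGNALS = ("a0", "a1", "b0", "b1", "p00", "p01", "p10", "p11", "s1", "c1", "s2", "c2")


def satisfies_equations(v):
    """Equations G_j of the example circuit: partial products and two half adders."""
    return (
        v["p00"] == v["a0"] * v["b0"]
        and v["p01"] == v["a0"] * v["b1"]
        and v["p10"] == v["a1"] * v["b0"]
        and v["p11"] == v["a1"] * v["b1"]
        and (v["s1"] + 2 * v["c1"]) % 4 == (v["p01"] + v["p10"]) % 4
        and (v["s2"] + 2 * v["c2"]) % 4 == (v["p11"] + v["c1"]) % 4
    )


def goal(v):
    """Proof goal g: implementation result minus word-level product, mod 2**4."""
    r = v["p00"] + 2 * v["s1"] + 4 * v["s2"] + 8 * v["c2"]
    a = v["a0"] + 2 * v["a1"]
    b = v["b0"] + 2 * v["b1"]
    return (r - a * b) % 16


def solutions():
    """Enumerate V({G_j}) restricted to bit values."""
    for bits in product((0, 1), repeat=len(SIGNALS)):
        v = dict(zip(SIGNALS, bits))
        if satisfies_equations(v):
            yield v


def variety_subset_holds(g):
    """True if g vanishes on every solution of the equations."""
    return all(g(v) == 0 for v in solutions())
```

The enumeration visits 2^12 assignments for this toy circuit and grows exponentially with the number of signals, so it only serves to illustrate what the normal form computation has to decide.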
For the reader's convenience we recall some basic facts about Gröbner basis theory (cf. [12, 13] ). We need a monomial ordering <, i.e., a well ordering on the set of monomials s.t. multiplication with a monomial respects the ordering. Here a monomial is a power product of variables and a term is the product of a monomial with a coefficient, i. 
Problem Formulation over a Single Ring
Instead of directly converting the equations G j into a set of polynomials over a single ring, we generate some additional equations. These equations are redundant in the sense that they can be derived from the original equations G j . However, they will play an important role for the efficiency of the solution techniques described in Section 3.2. More precisely, these equations ensure that the polynomial system generated from them is a Gröbner basis of the corresponding ideal. This will be discussed later.
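For every $G_j$ we generate $n_j$ equations
$$G_j^{(t)}:\quad \sum_{i=0}^{t-1} 2^i r_i^{(j)} = f_j^{(t)}\bigl(a_1^{(j)}, a_2^{(j)}, \dots, a_{m_j}^{(j)}\bigr) \bmod 2^t$$
with $t = 1, \dots, n_j$ and with $f_j^{(t)} = f_j \bmod 2^t$ being the minimal polynomial [14] representing the same polynomial function $(\mathbb{Z}/2^t)^{m_j} \to \mathbb{Z}/2^t$ as $f_j$.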
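The redundancy of these additional equations can be checked numerically on small components. The following sketch is illustrative only: the function names are hypothetical, inputs are restricted to bit values, and the computation of minimal polynomials from [14] is not modeled. For every input assignment it fixes the result word by the original equation mod 2^n and then tests whether the lowest t result bits satisfy the truncated equation mod 2^t.

```python
from itertools import product


def sub_identity_holds(f, n, num_inputs, t):
    """Check that each solution of G: r = f(inputs) mod 2**n also satisfies
    the truncated equation sum_{i<t} 2**i * r_i = f(inputs) mod 2**t."""
    for inputs in product((0, 1), repeat=num_inputs):
        r = f(*inputs) % 2 ** n                    # result word fixed by G
        r_bits = [(r >> i) & 1 for i in range(n)]  # r_0, ..., r_{n-1}
        low = sum(2 ** i * r_bits[i] for i in range(t))
        if low != f(*inputs) % 2 ** t:
            return False
    return True


def full_adder_sum(a0, a1, a2):
    """Right-hand side of a full adder equation."""
    return a0 + a1 + a2


def mac(a0, a1, b0, b1, c0):
    """Right-hand side of a tiny multiply-accumulate: (a * b) + c."""
    return (a0 + 2 * a1) * (b0 + 2 * b1) + c0
```

For the full adder, t = 1 gives the sum bit alone (s = a0 + a1 + a2 mod 2) and t = 2 reproduces the original equation.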
Obviously, every solution of the $G_j$ is also a solution of the system $\{G_j^{(t)}\}$.
Example 7. Suppose the n-bit final adder of a multiply/accumulate unit is reused for computation of an m-bit addition (m < n). In a property checking instance for this addition only the lowermost m bits of the adder influence the arithmetic result. By the above construction we only instantiate the equations
So far the equations G 
The set of common roots for the G mj , i.e., a subset of the inputs of G j . For example, the polynomial modeling a half adder r 0 −a 0 −a 1 +2s results in the polynomial s = a 0 a 1 for the slack variable. However, often it is better to introduce the slack variables because, in general, the polynomials for the slack variables will be very large even for small polynomials f
. . , m and t = 1, . . . , $n_j$} and $I = \langle G \rangle$ be the ideal generated by this set. Using the language of computer algebra our decision problem can be formulated by the following question:
Is $V(I) \subset V(f)$ with $f = 2^{N-n} g$, where $V(I)$ and $V(f)$ denote the set of all common roots (in $(\mathbb{Z}/2^N)^k$, where $k$ is the number of variables) of the polynomials in $I$ and the set of roots of the polynomial $2^{N-n} g$, respectively? In the next section we will detail how to efficiently solve this problem.
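Two elementary facts behind this formulation can be checked directly. The sketch below is a hedged illustration with hypothetical function names. It tests the half adder polynomial with its slack variable over bit values, and it tests that scaling by 2^(N-n) turns a congruence mod 2^n into an equivalent congruence mod 2^N, which is why the proof goal appears as 2^(N-n) g.

```python
from itertools import product


def half_adder_slack_ok():
    """r0 - a0 - a1 + 2*s vanishes over the integers for r0 = a0 XOR a1 and s = a0*a1."""
    return all(
        (a0 ^ a1) - a0 - a1 + 2 * (a0 * a1) == 0
        for a0, a1 in product((0, 1), repeat=2)
    )


def lifting_is_equivalent(n, N):
    """x = 0 mod 2**n  <=>  2**(N-n) * x = 0 mod 2**N, for all residues x mod 2**N."""
    scale = 2 ** (N - n)
    return all(
        (x % 2 ** n == 0) == ((scale * x) % 2 ** N == 0)
        for x in range(2 ** N)
    )
```

The equivalence holds because 2^N divides 2^(N-n) x exactly when 2^n divides x.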
Solving Decision Problems at the ABL
The following proposition turns out to be the key for an effective solution of the presented problem. Proof. Let < be a monomial order as required in the statement. We need to show that it is not possible to generate a polynomial from the polynomials in G with a leading term that is not divisible by any leading term of the polynomials in G. It is sufficient to show (cf. [15] , Theorem 30) (1) For any two polynomials f, g ∈ G the normal form of
A slight generalization of the product criterion (cf. [15] , Lemma 35) states that (1) is fulfilled, as our polynomials have different variables in their leading terms and these variables do not occur in any other term of the corresponding polynomials. Now let f = G (t) j . We obtain LT 2
t−2 = 0, and since the polynomials f By Lemma 3 we prove that normal form computation can be used as an effective solution procedure for our problem at hand. Proof. If h defines a constant zero function the set V (h) = V (g) contains all points and therefore V (G) ⊂ V (g) is trivial. Assume that for the variables x of h a valuation exists such that h is not zero. By assumption we can extend this valuation to a valuation on all variables such that g(
Lemma 3. Let G be a Gröbner basis of an ideal
n [x] and h be the normal form of 2 N −n g with respect to G, which can be computed [15] by Algorithm 1. Since we are only interested in the function of h on V (I) we can always replace portions of h by equivalent polynomials with respect to V (I). In particular, we can replace every slack variable in the normal form by a polynomial expression in the inputs of the corresponding equation G j . Therefore we may assume that h does not contain any slack variables. Furthermore, the output variables of the equations G j do not occur in h as otherwise h would be reducible by some of the generated sub-identities G (t) j , hence h satisfies the assumptions of Lemma 3. This guarantees that the variables present in h are inputs to the ABL description. Every valuation of these variables can be extended to a consistent valuation for the signals of the ABL. Further we can effectively decide whether h defines the zero function for all rings Z/m (cf. [14] ) and therefore decide the ABL problem by Lemma 3.
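The final step, deciding whether h defines the zero function, matters because over $\mathbb{Z}/2^n$ a polynomial with non-zero coefficients can still vanish at every point. The sketch below only illustrates this phenomenon by brute-force evaluation and is not the decision procedure of [14]. The dictionary representation (exponent tuple mapped to coefficient) and the function names are assumptions made for the example.

```python
from itertools import product


def evaluate(poly, point, mod):
    """Evaluate a polynomial {exponent tuple: coefficient} at a point, modulo mod."""
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exponents):
            term *= x ** e
        total += term
    return total % mod


def is_zero_function(poly, num_vars, n):
    """Brute-force test whether poly vanishes on all of (Z/2**n)**num_vars."""
    mod = 2 ** n
    return all(
        evaluate(poly, pt, mod) == 0
        for pt in product(range(mod), repeat=num_vars)
    )


# x(x-1)(x-2)(x-3) = x^4 - 6x^3 + 11x^2 - 6x: a product of four consecutive
# integers is divisible by 24, hence by 8.
falling = {(4,): 1, (3,): -6, (2,): 11, (1,): -6}

# 4*x*(x-1)*y: x*(x-1) is always even, so the value is a multiple of 8.
two_var = {(2, 1): 4, (1, 1): -4}
```

Both example polynomials are non-zero as polynomials over $\mathbb{Z}/8$ yet define the zero function there, so a purely syntactic comparison of h with 0 would not be sufficient.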
As already noted in Section 3.1 it is not always efficient to replace all remaining slack variables by polynomial expressions in terms of the input variables of the corresponding equations. Therefore we use special procedures for the practical computations, which we do not detail here.
Require: f a polynomial, G a finite set of polynomials, > a monomial ordering
Ensure: A normal form of f
while f ≠ 0 and
Algorithm 1. Normal form algorithm
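Since the loop body of Algorithm 1 is not legible here, the following Python sketch shows one standard way such a reduction loop can look over $\mathbb{Z}/2^N$. It is an assumption, not the authors' implementation. It performs top reduction only, uses a lexicographic ordering given by the tuple `VARS`, and all names (`poly`, `normal_form`, the example set `G`) are hypothetical. The example models a full adder composed of two half adders plus a one-bit addition of the two carries, with polynomials scaled into $\mathbb{Z}/4$ in the spirit of Section 3.1.

```python
N = 2                # work in Z/2**N
MOD = 2 ** N
# Lexicographic variable order, largest first (outputs before inputs)
VARS = ("c", "c2", "s", "c1", "s1", "a0", "a1", "a2")


def poly(**terms):
    """Build a linear polynomial {exponent tuple: coefficient} from var=coeff pairs."""
    p = {}
    for name, coeff in terms.items():
        mono = tuple(1 if v == name else 0 for v in VARS)
        if coeff % MOD:
            p[mono] = coeff % MOD
    return p


def leading(p):
    """Leading monomial and coefficient; tuple comparison is the lex order."""
    mono = max(p)
    return mono, p[mono]


def val2(c):
    """2-adic valuation of a non-zero residue."""
    return (c & -c).bit_length() - 1


def divides_term(mg, cg, mf, cf):
    """Term cg*x^mg divides cf*x^mf in Z/2**N[X]."""
    return all(e <= d for e, d in zip(mg, mf)) and val2(cg) <= val2(cf)


def reduce_step(f, g):
    """Cancel the leading term of f with a multiple of g."""
    (mf, cf), (mg, cg) = leading(f), leading(g)
    shift = tuple(d - e for d, e in zip(mf, mg))
    a = val2(cg)
    # Solve q * cg = cf (mod 2**N); the odd part of cg is invertible
    q = (cf >> a) * pow(cg >> a, -1, MOD) % MOD
    h = dict(f)
    for m, c in g.items():
        mm = tuple(x + y for x, y in zip(m, shift))
        v = (h.get(mm, 0) - q * c) % MOD
        if v:
            h[mm] = v
        else:
            h.pop(mm, None)
    return h


def normal_form(f, G):
    """Top-reduce f by G until its leading term is no longer reducible."""
    f = dict(f)
    while f:
        mf, cf = leading(f)
        g = next((g for g in G if divides_term(*leading(g), mf, cf)), None)
        if g is None:
            break
        f = reduce_step(f, g)
    return f


G = [
    poly(c1=2, s1=1, a0=-1, a1=-1),  # half adder 1: s1 + 2*c1 = a0 + a1 (mod 4)
    poly(s1=2, a0=-2, a1=-2),        # its sub-identity for t = 1, scaled by 2
    poly(c2=2, s=1, s1=-1, a2=-1),   # half adder 2: s + 2*c2 = s1 + a2 (mod 4)
    poly(s=2, s1=-2, a2=-2),         # its sub-identity for t = 1, scaled by 2
    poly(c=2, c1=-2, c2=-2),         # c = c1 + c2 (mod 2), scaled by 2
]
goal = poly(c=2, s=1, a0=-1, a1=-1, a2=-1)  # s + 2c - (a0 + a1 + a2) mod 4
```

Here `normal_form(goal, G)` reduces to the empty dictionary, i.e. the zero polynomial, so the goal lies in the ideal generated by `G`. Dropping `a2` from the goal leaves the irreducible remainder `a2`. The modular inverse via `pow(x, -1, MOD)` requires Python 3.8 or later.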

Experimental Results
In order to evaluate the techniques presented in the previous sections we conducted a series of experiments. Except for one experiment explicitly indicated in the sequel, all experiments were carried out on a machine running Suse Linux 10.3 on an Intel Core 2 Duo E6400 with 8 GB RAM.
The algorithms presented in Section 3 have been implemented within the framework of the general purpose computer algebra system Singular [16] . We used the industrial formal property checker Onespin 360 MV [17] to generate bit vector netlists for the considered verification problems. From these bit-vector netlists we extracted an arithmetic bit level description for the arithmetic parts of the decision problem and dumped out the resulting ABL description. The resulting problem file is used to generate the variety subset problem that is handed over to Singular in order to find a solution.
As a first step of the evaluation we used a number of parameterized benchmarks to evaluate the scalability of the proposed approach with respect to the bit-width of the datapath under verification. The benchmark suite consists of two instances (distrib and commute) for word-level implementations of the functions ab + ac and (ab)c where commutative and distributive laws have been applied to the word level operands, a bit level implementation of an unsigned multiplier with Booth-encoded partial products (mult ub) and a sequential implementation for the multiplication of four values with a single multiplier (shared).
We compare the performance in terms of run-time of our solution based on Singular against the normalization approach of [11], a SAT-based decision procedure based on bit blasting, and the SMT solver Spear v.2.0 for the theory of fixed-size bit-vector functions (QF-BV). Note that an earlier version of Spear showed the best performance in this category in the 2007 SMT competition. Table 1 summarizes the results of these experiments. The table is organized as follows. Columns one and two contain the instance and the operand bit-width of the datapath. The remaining columns show the CPU times required by the particular tool to prove the instance. Cases in which the memory limit or the timeout limit was reached are indicated by "> 8 GB" and "> 3600", respectively.
In order to evaluate the performance of Singular with respect to other computer algebra systems we also report results for solving the generated variety subset problems with the industrial computer algebra tool Magma [18]. However, due to license restrictions, these results were obtained using another machine, namely an AMD Dual Opteron 2.2 GHz with 16 GB RAM running Linux. We re-ran the Singular problems on this machine in order to allow for comparison of the run times. For the comparison we also increased the memory limit to 16 GB. Table 2 summarizes the results for this comparison.
The presented results of the scalability experiments indicate that the proposed modeling and the proposed algorithms are adequate to solve verification problems with industrial impact. To demonstrate this we investigated a property suite originating from the verification of Infineon's Tricore 2 processor. The processor has advanced DSP features including a sophisticated integer pipeline that provides a large variety of multiply and multiply/accumulate instructions. The properties in the investigated property suite verify that every variant of these instructions causes the integer pipeline of the processor to deliver the expected arithmetic result according to the architectural manual. In order to obtain a high degree of resource sharing large portions of the datapath have been implemented at the arithmetic bit level and sophisticated control logic is used for configuration according to the executed instructions.
We used the techniques of [11] to generate the decision problems at the arithmetic bit level. All the resulting decision problems could be solved with Singular when modeled by polynomials as presented in this paper. Table 3 shows the results for a representative subset of the problem instances derived from the Tricore 2 property suite. It is organized as follows. The first column shows the commitment of the property specifying the arithmetic result of the integer instruction under verification. Columns two and three show the run-time of the normalization approach and the corresponding Singular run-time. Unless explicitly indicated all operations are considered as signed operations on the specified bit-vectors.
In essence, all our experiments show that the presented approach outperforms the ad-hoc normalization approach in terms of CPU time. Moreover, algorithms and modeling rely on a well-understood mathematical foundation which opens ample opportunities for further extensions of this framework.
However, the use of a generic computer algebra system such as Singular for solving the normalization problems comes at a price in terms of memory consumption. Except for some of the problems where the ABL description is generated from word-level problems, Singular typically requires 3-8 GB of memory. This is caused by the data structures used inside Singular to represent polynomials. These data structures are not optimized with respect to the characteristics of the problems considered here. Compared to problems typically considered in computer algebra, we consider a large number of variables and many polynomials. On the other hand, the individual polynomials have low degree and only use a small fraction of the variables. With application-specific implementations of the employed algorithms, such as the normal form computation, we expect that the memory efficiency can be improved considerably.
Conclusion and Future Work
Decision problems at the arithmetic bit level have been modeled using polynomials over rings $\mathbb{Z}/2^n$. It has been proven that the generated sets of polynomials form a Gröbner basis with respect to certain monomial orderings that can easily be determined using the topological ordering of design signals. This allows for utilization of the normal form algorithm to efficiently solve a variety subset problem that is equivalent to the original decision problem.
By this means we provide a solid mathematical foundation to the ad-hoc technique of arithmetic bit level normalization. The developed techniques have proven to be applicable to verification problems of industrial size.
As the data structures for polynomial representation in out-of-the-box computer algebra systems do not exploit the typical characteristics of the generated polynomial sets, we are working on a specialized implementation of the employed algorithms that is expected to reduce the memory consumption substantially.
