Abstract
Introduction
Modern microprocessors, embedded and signal processors, as well as communication ASICs, utilize various arithmetic circuits in their datapaths. These arithmetic circuits vary in their area, delay and power constraints. Hence, many different realizations of arithmetic circuits can be found, from custom designs to those that are modified from the standard library elements. Their design, testing and verification pose a major challenge.
Verification of arithmetic circuits has exposed the limits of methods based on Decision Diagrams (DDs). The original Reduced Ordered Binary Decision Diagrams (ROBDDs) present a canonical reduced representation of a truth table. They can be used to verify circuits such as adders in polynomial time. However, ROBDDs are of exponential size [1] even for multipliers. Numerous extensions to ROBDDs were proposed that made the verification of arithmetic circuits more efficient. The most relevant are the extensions to word-level diagrams, such as *BMDs [3] or EVBDDs [6], for multi-output Boolean functions. All such diagrams are referred to as Word-Level Decision Diagrams (WLDDs). The common limitation of all WLDDs is their inability to represent dividers [9] and the more complex datapath operators by polynomial-size diagrams.
While WLDDs enabled the verification of circuits such as multipliers, we believe that their current use has serious shortcomings. First, they have been used only for equivalence/model checking, which compares a function implementation against its specification. This approach is analogous to comparing two abstraction levels for each input combination. However, under some conditions, a smaller number of comparisons could be performed. For example, if the set of possible errors were known, one could devise a set of test vectors that detects all such faults. Further, it would be useful if the verification vectors could also be applied for testing purposes, i.e. under a single stuck-at fault model.
In this paper, we explore the theoretical bounds and present an experimental demonstration of such a verification scenario. We exploit the properties of the Arithmetic Transform, which is the underlying mechanism behind the word-level DDs. We present the basic test vector generation scheme and some of its optimizations, and derive an upper bound on the number of test vectors under the assumption of a bounded error in the spectral domain. Finally, we demonstrate that the same test vectors can be applied successfully to a model of design errors [2], considered recently for verification by error modeling.
Arithmetic Transform
The Arithmetic Transform (AT), also known as integer-valued Reed-Muller (RM) polynomials [5], extends the traditional RM forms by allowing integer function values, while the inputs remain Boolean. The RM forms are obtained by employing a Davio expansion around each input variable $x_i$ as follows:
$$f = f_{x_i=0} + x_i \left( f_{x_i=1} - f_{x_i=0} \right) \qquad (1)$$
In the RM forms, the arithmetic is performed modulo 2; consequently, both "+" and "-" denote an XOR operation. The AT is obtained by using integer addition and subtraction in Equation 1 instead. The Arithmetic Transform of a multi-output Boolean function is calculated by applying the expansion from Equation 1 to every input variable. This expansion leads to a polynomial:
$$f(x_1, \ldots, x_n) = \sum_{i_1=0}^{1} \sum_{i_2=0}^{1} \cdots \sum_{i_n=0}^{1} c_{i_1 i_2 \ldots i_n} \, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} \qquad (2)$$
where each Boolean variable $x_j$ is raised to an exponent $i_j$. When an exponent is equal to 1, the variable appears in the product term, otherwise it does not. Hence, each coefficient multiplies a product term consisting of a subset of all variables. There are $2^n$ such subsets, and the arithmetic spectrum coefficients can be associated with these subsets, or equivalently with the subset characteristic functions.
[Table: Integer Encoding, Number Norm]
To quickly derive arithmetic spectra for datapath circuits, we use an auxiliary norm function $|x|$, equal to the integer value that a binary-represented number takes. For an unsigned $n$-bit number $x = x_{n-1} \ldots x_1 x_0$, the norm is $|x| = \sum_{i=0}^{n-1} 2^i x_i$. To obtain the Arithmetic Transform of the adder, we consider the numerical value of the sum of two $n$-bit unsigned numbers $x$ and $y$, calculated as:
$$|x| + |y| = \sum_{i=0}^{n-1} 2^i x_i + \sum_{i=0}^{n-1} 2^i y_i$$
Comparing with Equation 2, we notice that this is a polynomial with integer coefficients representing a multiple-output function of the arguments $x_i$, $y_i$, where the only nonzero coefficients are the weights $2^i$ of the single-variable terms.
In conclusion, the AT of the addition operation has $2n$ nonzero spectral coefficients.
In a similar way, the subtraction operation is obtained by replacing the arithmetic "+" with a "-" sign. For example, for sign-extended encoding, the difference is obtained by expanding the norm of each operand in that encoding and subtracting the resulting polynomials.
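Multipliers can be represented in a straightforward way by using $O(n^2)$ spectral coefficients. For unsigned operands with integer values $|x|$ and $|y|$, the product is:
$$|x| \cdot |y| = \left( \sum_{i=0}^{n-1} 2^i x_i \right) \left( \sum_{j=0}^{n-1} 2^j y_j \right) = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} 2^{i+j} x_i y_j$$
resulting in $n^2$ spectral coefficients after the sums are multiplied out. In practice, this number can be reduced to $2n$ by keeping the polynomial in the above factored form.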
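The coefficient counts quoted for adders and multipliers can be checked numerically on small operand widths. The following is a minimal sketch, not the authors' implementation: it applies the integer Davio expansion to every input variable of a truth table. The function names (`arithmetic_transform`, `spectrum_size`) and the bit layout of an input point (x in the low n bits, y in the high n bits) are assumptions made for illustration.

```python
def arithmetic_transform(values):
    """Integer Davio expansion applied to every variable in turn.

    values[p] is the integer function value at input point p, where bit i
    of p is the value of variable i. In the result, index p holds the
    coefficient of the product term made of the variables set in p.
    """
    coeffs = list(values)
    step = 1
    while step < len(coeffs):
        for p in range(len(coeffs)):
            if p & step:
                # f = f0 + x * (f1 - f0): keep f0, store f1 - f0
                coeffs[p] -= coeffs[p ^ step]
        step <<= 1
    return coeffs


def spectrum_size(op, n):
    """Count nonzero coefficients of op(x, y) for two n-bit unsigned operands."""
    mask = (1 << n) - 1
    # Assumed layout: x occupies the low n bits of p, y the high n bits.
    values = [op(p & mask, p >> n) for p in range(1 << (2 * n))]
    return sum(1 for c in arithmetic_transform(values) if c != 0)


for n in (2, 3, 4):
    adder = spectrum_size(lambda x, y: x + y, n)
    multiplier = spectrum_size(lambda x, y: x * y, n)
    print(n, adder, multiplier)
```

For unsigned operands the counts come out as 2n for the adder and n^2 for the multiplier, which corresponds to the multiplied-out (not factored) form of the product.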
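The cases of multipliers and adders have been addressed in a similar way in the construction of *BMDs [3] (and all subsequent DDs). The extension to more complex arithmetic expressions can be done as follows. A multiple-output Boolean function will be represented by a single polynomial (i.e. its arithmetic spectrum). For example, a simple expression
$$\sum_{i=0}^{n-1} 2^i x_i + \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} 2^{i+j} y_i z_j$$
leads to the arithmetic spectrum for the expression $x + y \cdot z$. Any linear filter with $m$ coefficients $a_1, a_2, \ldots, a_m$ applied to $n$-bit integers $x_1, x_2, \ldots, x_m$ has $mn$ spectral coefficients:
$$\sum_{j=1}^{m} a_j |x_j| = \sum_{j=1}^{m} \sum_{i=0}^{n-1} a_j 2^i x_{j,i}$$
where $x_{j,i}$ denotes bit $i$ of the unsigned integer $x_j$ and $|x_j|$ its integer value.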
The application of the AT to alternative data types results in spectra of a size comparable to that for the unsigned integer encoding.
Calculation of Arithmetic Spectra
The Arithmetic Transform of an arbitrary multi-output Boolean function can be obtained by multiplying the vector of function values, considered as integers, by the transform matrix. The transform matrix is defined recursively as:
$$T_0 = 1, \qquad T_n = \begin{bmatrix} T_{n-1} & 0 \\ -T_{n-1} & T_{n-1} \end{bmatrix} \qquad (3)$$
The transform matrix has $2^{2n}$ entries, and the transform, obtained by multiplying $T_n$ with the vector of values, requires a number of operations that is exponential in $n$. This approach is best used in conjunction with DD representations, to reduce its execution time and produce graph representations such as *BMDs.
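The matrix formulation can be illustrated on a very small case. The sketch below assumes the standard recursive form of the arithmetic transform matrix, $T_0 = [1]$ and $T_n = [[T_{n-1}, 0], [-T_{n-1}, T_{n-1}]]$; the helper names `transform_matrix` and `apply_transform` are hypothetical and chosen only for this example. It uses plain nested lists so that the exponential growth of the matrix is explicit.

```python
def transform_matrix(n):
    """Build T_n from T_0 = [1] using the assumed recursion [[T, 0], [-T, T]]."""
    t = [[1]]
    for _ in range(n):
        size = len(t)
        top = [row + [0] * size for row in t]            # [ T   0 ]
        bottom = [[-v for v in row] + row for row in t]  # [-T   T ]
        t = top + bottom
    return t


def apply_transform(matrix, values):
    """Multiply the transform matrix by the vector of integer function values."""
    return [sum(m * v for m, v in zip(row, values)) for row in matrix]


# Integer value of a 2-bit unsigned number x1 x0, listed for points 00, 01, 10, 11.
values = [0, 1, 2, 3]
t2 = transform_matrix(2)
spectrum = apply_transform(t2, values)
print(spectrum)                       # coefficients for terms 1, x0, x1, x0*x1
print(sum(len(row) for row in t2))    # number of matrix entries, 2^(2n)
```

The direct product touches every matrix entry, so it is practical only for a handful of variables; this is the cost that the DD-based evaluation mentioned in the text is meant to avoid.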
The arithmetic spectrum can also be viewed as a polynomial that is interpolated from the values that a function takes. Another way of obtaining the arithmetic spectrum is derived from this interpretation. It is most useful to consider multi-output Boolean functions as integer-valued functions defined over the points of the Boolean lattice. The Arithmetic Transform can be obtained by traversing the lattice in the increasing order of points. It can be shown that at each point $x$, the transform coefficient $c_x$ can be calculated by subtracting all the preceding coefficients from the function value at $x$:
$$c_x = f(x) - \sum_{y < x} c_y \qquad (4)$$
Consider, for example, transforming the adder function $f = x + y$ for the 2-bit unsigned encoding: $x = x_1 x_0$ and $y = y_1 y_0$. The spectral coefficients are generated by applying Equation 4 in the lattice order, i.e.
$$c_{0000} = 0, \quad c_{0001} = 1, \quad c_{0010} = 2, \quad c_{0100} = 1$$
and $c_{1000} = 2$. All other coefficients are 0, as inscribed in Figure 1, where the nonzero coefficients are highlighted. For unsigned arithmetic functions, all such traversals result in forms whose nonzero coefficients are in layer 1 (for adders) or layer 2 (for multipliers) of the lattice.
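The lattice traversal for the 2-bit adder can be reproduced with a few lines of code. This is a sketch under stated assumptions: each coefficient is the function value at a point minus the sum of the coefficients of all points strictly below it in the lattice, layers are visited in order of increasing number of ones, and the bit layout `x1 x0 y1 y0` of a point is chosen for illustration only.

```python
from itertools import combinations


def spectrum_by_lattice(f, n):
    """Compute c_x = f(x) - (sum of c_y over all points y strictly below x)."""
    coeffs = {}
    for weight in range(n + 1):                    # layers in increasing order
        for ones in combinations(range(n), weight):
            x = sum(1 << i for i in ones)
            # y lies below x when every 1 of y is also a 1 of x
            below = sum(c for y, c in coeffs.items() if y & x == y and y != x)
            coeffs[x] = f(x) - below
    return coeffs


def adder(p):
    # Assumed layout of the 4-bit point p: x1 x0 y1 y0
    return (p >> 2) + (p & 3)


coeffs = spectrum_by_lattice(adder, 4)
nonzero = {format(p, "04b"): c for p, c in coeffs.items() if c}
print(nonzero)
```

All nonzero coefficients sit on points with a single 1, that is, in layer 1 of the lattice, and they equal the bit weights 1 and 2 of the two operands.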
In our case, the arithmetic spectrum will be used as a specification of the arithmetic operation. The shape of the polynomial for a given arithmetic operation is known and it depends only on the data type used; the addition example uses unsigned numbers. We use the knowledge of the shape of the representation polynomial to verify whether the circuit matches it. We are especially interested in the minimum number of comparisons needed under error modeling, i.e. when all possible errors are given.
Error Models
Each faulty circuit can be modeled by an error superimposed on the correct circuit. Identification of accurate and robust error models is an important step in developing testing and verification methods.
In verification by error modeling, a set of test vectors is found to verify that the circuit contains no error included in the model. A circuit can be treated as a black box, by which the description of the design error can be obtained by subtracting the responses of the erroneous circuit from the corresponding responses of its (correct) specification. This additive error model is directly useful in conjunction with the Arithmetic Transform. Under the assumption of an error of bounded spectral complexity, we derive efficient verification methods.
Additive Error Model
Any error can be modeled by a quantity added to the circuit output. The operation of the faulty circuit is represented by the additive error model as a sum of the correct output $f$ and an error $e$, i.e. $f + e$. The Arithmetic Transform is linear, and satisfies the equation:
$$T(f + e) = T(f) + T(e)$$
The "size" of the error will be measured in terms of the number of nonzero spectral coefficients in $T(e)$.
For the considered verification scheme, we treat each design error as an additive error. Although the value of $e$ for each such error can be obtained by simulating the faulty and correct circuits and subtracting their outputs, we emphasize that this model, and the analysis to follow, do not require the explicit identification of the error.
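The additive error model and the linearity of the transform can be illustrated on a small adder. The sketch below is an assumption-laden example rather than the procedure used in the paper: the fault (input bit x_0 stuck at 1), the bit layout of an input point, and the name `arithmetic_transform` are all chosen for illustration. The error is taken as the faulty output minus the correct output.

```python
def arithmetic_transform(values):
    """Integer Davio expansion over all variables (bit i of the index = variable i)."""
    coeffs = list(values)
    step = 1
    while step < len(coeffs):
        for p in range(len(coeffs)):
            if p & step:
                coeffs[p] -= coeffs[p ^ step]
        step <<= 1
    return coeffs


N = 2
MASK = (1 << N) - 1
points = range(1 << (2 * N))

# Assumed layout: x in the low N bits of p, y in the high N bits.
correct = [(p & MASK) + (p >> N) for p in points]        # specification x + y
faulty = [((p & MASK) | 1) + (p >> N) for p in points]   # hypothetical fault: x_0 stuck at 1
error = [g - f for f, g in zip(correct, faulty)]         # e = faulty - correct

t_f, t_g, t_e = map(arithmetic_transform, (correct, faulty, error))

# Linearity: T(f + e) = T(f) + T(e)
assert t_g == [a + b for a, b in zip(t_f, t_e)]

error_size = sum(1 for c in t_e if c)
print(error_size, t_e[0], t_e[1])
```

Here the error is 1 - x_0, so its spectrum has two nonzero coefficients even though the fault changes the output for half of the input points.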
Arithmetic Transform of Basic Design Errors
We defined an error model through its Arithmetic Transform. The question of relating the additive error model to the models used more often in practice is addressed next. We consider the design error classes proposed in [2]. Most of these classes can be described as additive errors with few spectral coefficients. The basic error types identified in [2] are the bus errors: Bus Order Error (BOE), Bus Source Error (BSE), Bus Driver Error (BDE), Bus Single Stuck Line (SSL) Error and Module Substitution Error (MSE). When the buses are considered in isolation, the error spectra, for most of the above classes, are compact, as shown next.
Bus Order Error. This class includes a common design error of incorrectly ordering the bits in a bus. For example, if the signals $x_l$ and $x_k$ of the bus with bits $x_0, \ldots, x_{n-1}$ have been interchanged, the transform of the bus considered in isolation is:
$$\sum_{i \neq k, l} 2^i x_i + 2^l x_k + 2^k x_l$$
If the correct circuit transform was $T(f)$, the transform of the faulty one is:
$$T(f) + 2^l x_k + 2^k x_l - 2^l x_l - 2^k x_k$$
The error polynomial has at most four nonzero spectral coefficients. In general, any permutation of bus signals will have an error transform with at most $2n$ spectral coefficients.
Bus Source Error. This class replaces the intended source $x$ with a source $r$. In this case, the AT of the error would be:
$$T(e) = \sum_{i=0}^{n-1} 2^i r_i - \sum_{i=0}^{n-1} 2^i x_i$$
Bus Driver Error. This kind of error corresponds to a bus being driven by two sources. It manifests itself in a way dependent on the implementation technology. For example, if the bus line is implementing "wired-OR", then by connecting an additional source $r$ to a line $x_i$, the resulting signal is $x_i \lor r$. Using integer arithmetic, the logical OR is obtained as $x_i + r - x_i r$.
In the case of a bus stuck-at-1, the error transform has $2n$ nonzero coefficients.
Single Stuck-At Faults. In contrast to the above, the single stuck-at faults have a direct relation to the testing of a circuit. These faults cannot be described by a single formula. For example, the distribution of spectra of all single stuck-at faults in a 4x4 CSA multiplier is plotted in Figure 2. This figure shows that a number of faults result in a substantial error spectrum. Regardless of the spectrum size of the stuck-at faults, we demonstrate experimentally that these faults are easily detectable by the vectors derived for small spectral errors.
The small additive error assumption is partly motivated by the examples of the common design error classes from [2]. There are several classes of circuits that are "small", such as the class of constant depth circuits. The results in [7] imply that this class of circuits has its (Fourier) spectrum small and concentrated in the low order coefficients.
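The compactness of the bus error spectra can be checked by brute force on a small example. The following sketch is illustrative only: it assumes a specification x + y on 2-bit buses, a third bus r as the wrong source, an error defined as faulty output minus correct output, and a hypothetical bit layout with x, y and r packed into one input point. It computes the error spectrum for a Bus Source Error (x replaced by r) and for a wired-OR Bus Driver Error (line r_0 also driving line x_0).

```python
def arithmetic_transform(values):
    """Integer Davio expansion over all variables (bit i of the index = variable i)."""
    coeffs = list(values)
    step = 1
    while step < len(coeffs):
        for p in range(len(coeffs)):
            if p & step:
                coeffs[p] -= coeffs[p ^ step]
        step <<= 1
    return coeffs


N = 2
MASK = (1 << N) - 1


def buses(p):
    """Split an input point into three N-bit buses x, y and r (assumed layout)."""
    return p & MASK, (p >> N) & MASK, (p >> (2 * N)) & MASK


def error_spectrum(faulty):
    """Nonzero coefficients of T(e), where e = faulty - (x + y)."""
    values = []
    for p in range(1 << (3 * N)):
        x, y, r = buses(p)
        values.append(faulty(x, y, r) - (x + y))
    coeffs = arithmetic_transform(values)
    return {p: c for p, c in enumerate(coeffs) if c}


source_error = error_spectrum(lambda x, y, r: r + y)              # x replaced by r
driver_error = error_spectrum(lambda x, y, r: (x | (r & 1)) + y)  # r_0 wired-OR onto x_0
print(len(source_error), len(driver_error))
```

In this instance the source error has 2n = 4 nonzero coefficients (the weights of r with a plus sign and of x with a minus sign), and the wired-OR error has two, corresponding to r_0 - x_0 r_0.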
Detecting Small Additive Errors
Under the assumption that the size of the additive error is bounded, the verification by error modeling can be done by using the bounds on the test vector size that are derived next.
The Arithmetic Transform of an arbitrary multi-output Boolean function can be obtained by multiplying the function values by the transform matrix $T_n$, defined in Equation 3. Rows of this matrix multiply the values that a function takes at all points of the function domain. A test set will contain a selected set of these points; finding the polynomial representing a function can be performed by inverting the matrix. The structure of the matrix is identical to that of the RM error correcting code check matrix [8], albeit with a different addition operation. The redundancy incorporated in the matrix allows us to find a minimal test set for the case of a bounded spectrum error.
An error check matrix $H_r$ consists of the rows of $T_n^{-1}$ corresponding to the points in the top $r + 1$ layers of the lattice. This matrix has $2^n$ columns; the number of rows is equal to the number of points in these lattice layers. The auxiliary lemma, proven in the Appendix, states that the matrix $H_r$ has at least $2^{r+1} - 1$ independent columns. The following theorem is used to derive a test set for the case of small spectra errors. It is sufficient to check that each $2t$ columns of this matrix are independent. Then, as in RM error correcting codes, any error polynomial with up to $t$ terms will be detected. By the lemma in the Appendix, the number of independent columns is at least $2^{r+1} - 1$, so the condition holds when $2^{r+1} - 1 \geq 2t$.
Therefore, any polynomial with up to $t$ terms will be uniquely identified. This theorem uses the properties of the Arithmetic Transform that are identical to those of the binary RM transform, due to the fact that both transforms consider binary inputs. A proof of the theorem for the binary RM transform can be found in [8]. We note that a non-binary input generalization was proven in [4] for detecting faults in circuits described by a multiple-valued RM transform. While the RM transform is of exponential size even for adders, the Arithmetic Transform is of polynomial size for the datapath operations considered above, hence our result is more practical. Also, our result offers a generalization of the theorem in [8] to non-binary outputs.
This theorem provides an upper bound on the number of points that have to be simulated to detect this class of errors. In actual circuits, faults that involve many more spectral coefficients are often detected as well.
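The detection property can be checked exhaustively on a small instance. The sketch below rests on several assumptions that are not stated in the text: the "top r + 1 layers" are read as the input points with at most r zeros, the lemma is read as saying that any $2^{r+1} - 1$ columns are independent, and error coefficients are restricted to +1 and -1 to keep the enumeration small. It only checks detection (the error is nonzero on some test point), not unique identification, and the function names are hypothetical.

```python
from itertools import combinations, product


def top_layers(n, r):
    """Test points with at most r zeros (assumed meaning of the top r + 1 layers)."""
    full = (1 << n) - 1
    return [full ^ sum(1 << i for i in zeros)
            for k in range(r + 1)
            for zeros in combinations(range(n), k)]


def evaluate(poly, point):
    """poly maps a monomial (bit mask of its variables) to an integer coefficient."""
    return sum(c for mono, c in poly.items() if mono & point == mono)


def all_detected(n, r, t, coefficient_values=(-1, 1)):
    """True if every error polynomial with 1..t terms is nonzero on some test point."""
    tests = top_layers(n, r)
    for size in range(1, t + 1):
        for monos in combinations(range(1 << n), size):
            for cs in product(coefficient_values, repeat=size):
                poly = dict(zip(monos, cs))
                if all(evaluate(poly, x) == 0 for x in tests):
                    return False
    return True


print(len(top_layers(4, 1)))     # number of test points for n = 4, r = 1
print(all_detected(4, 1, 3))     # up to 2^(r+1) - 1 = 3 terms
print(all_detected(4, 1, 4))     # 4 terms: (1 - x0)(1 - x1) escapes these points
```

For n = 4 and r = 1 the five test points detect every enumerated error with up to three terms, while the four-term polynomial (1 - x0)(1 - x1) vanishes on all of them, which is consistent with a bound of the form $2^{r+1} - 1$.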
Experimental Results
We have performed a set of experiments over several of the most commonly used arithmetic circuits and several MCNC benchmarks. The errors considered include stuck-at faults and module replacement errors. Three types of gate replacement faults are applied. These faults substitute any gate in a circuit with an AND, OR or XOR gate of the corresponding size. Faults are chosen to represent the basic design errors considered in Section 3.2. The stuck-at faults are selected for the additional reason of indicating the testing capabilities of this verification approach. We recorded the coverage of these four classes of faults, as well as the statistics on the test set size. Tables 2 and 3 report the coverage of MCNC benchmarks and arithmetic circuits, respectively. The results indicate that only 4 layers are sufficient for the detection of the given classes of faults in arithmetic circuits, as well as in most of the considered MCNC benchmarks. The goal of these experiments was to determine the effectiveness of the complete test set, including the information on the number of vectors per layer that detect each fault. Hence, no fault dropping was applied, and reporting the simulation time would not reflect the actual time needed in the verification. Otherwise, the running time is proportional to the circuit size and the number of vectors needed to detect the faults, which are reported.
It is worth noting that, based on Theorem 1 and the error spectra from Figure 2, a larger number of layers would have been expected than the experiments required. We attribute this to the fact that the theorem gives an upper bound on the number of vectors. Also, according to Theorem 1, these vectors are necessary for the unique identification of the error polynomial. We note that for verification and testing purposes, only the detection of the presence of the error is required.
Improvements: Neighborhood Subspace Points
Although only up to 4 lattice layers (plus the all-ones vector $11 \ldots 1$) were needed for good coverage, there were many redundancies in the test sets. This prompts us to consider alternative schemes that use a subset of the considered lattice points.
We construct the vector windows covering exhaustively only the neighboring variables. We consider the neighboring inputs to be $x_i$, $y_i$, $x_{i+1}$, $y_{i+1}$. For example, the expression $(x_i + y_i) 2^i$ joins together the neighbor variables in a polynomial term that is multiplied by the same constant. The size of the considered windows is in this case 4. Table 3 compares the two methods. The first column for each fault type shows the coverage obtained by the exhaustive lattice points, and the second column reports the neighbor window results. We notice that little coverage is lost; further improvements are possible by experimenting with larger windows and alternative schemes for neighbor signal selection. Table 4 compares the test set lengths. Testing with all the vectors belonging to the top four layers requires $O(n^4)$ points. With the window size of four, the exhaustive test set for all the neighbor variable combinations needs only $O(n^2)$ points. The savings are equivalent to using two fewer layers in the test set.
This way of generating test vectors follows the universal test set approach, in which the tests are independent of the circuit implementation. The additional information used to restrict the test set came not from the circuit structure, but from the high-level specification.
Conclusions and Future Work
We proposed a vector-based verification of datapath circuits using the Arithmetic Transform and the concept of error modeling. We have shown that this approach can be applied to derive effective test sets for several classes of design errors. Furthermore, testing can be combined with the validation process through the reuse of verification vectors for detecting manufacturing faults.
The arithmetic spectrum provides both compact circuit representations and a way of relating the common errors to bounds on the vector set size. This gives confidence in restricting an otherwise exhaustive test set to smaller subsets, without sacrificing the fault detection capability.
The improvements to the basic concept include the use of the high-level information on the input variable dependences, through neighbor window variables. More improvements are possible for circuits such as ALUs. Preliminary results show that dividers can be verified using the same approach. More work is needed in this direction. Further work on error modeling using this approach needs to be done.
