Abstract
Introduction
Spectrum analysis of logic functions [13] is useful in logic synthesis [20, 8, 10], Boolean matching [5, 7], testing [17, 9, 11], and verification [18]. Various methods to compute the spectrum are known: the fast Fourier transform (FFT) [3], cubes [20, 6], and decision diagrams [5, 7].
A disadvantage of the spectral method is that the representations tend to be large, especially when the entire spectrum is represented at one time. In many applications, only a fragment of the spectrum is needed, and computing it takes less time than computing the entire spectrum. Most research on spectrum computation focuses on software. In particular, [7] and [12] have considered efficient methods to compute a fragment of the Walsh spectrum of a given logic function.
In this paper, however, we use hardware to compute a fragment of the Walsh spectrum. Theoretically, the FFT realization computes the entire Walsh spectrum at one time. In practice, however, the straightforward FFT realization requires an excessive amount of hardware to implement on an FPGA. Thus, [2] proposes a bit-serial method to compute the spectrum. That hardware is typical of digital signal processing, and assumes the following conditions:
1. The entire spectrum is computed at one time.
2. Each input is a signal of bits.
In this paper, we consider hardware to compute a part of the Walsh spectrum, and assume the following conditions:
1. A part of the coefficients of the spectrum is computed at one time.
2. Each input is a signal of a single bit. (We compute the spectrum of a single-output logic function. The extension to multiple-output functions is shown in Section 4.)
Such hardware is applicable to the fault diagnosis of semiconductor memories [11] and to Boolean matching [5].
Definitions and Basic Properties
In this part, we define the Walsh spectrum, Walsh transformation trees, and Walsh transformation diagrams. We also show a method to compute Walsh coefficients from the Walsh transformation tree and the Walsh transformation diagram [15]. In the case of the BDT for $f$, the SOPP corresponds to the canonical sum-of-products expression for $f$.
Walsh Transformation
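$(f_0, f_1, f_2, f_3) = (1, 1, 0, 1)$. The Walsh spectrum is
$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ 1 \\ 1 \end{bmatrix}.$$
Therefore, we have $S = (s_0, s_1, s_2, s_3) = (3, -1, 1, 1)$.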
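The example above can be checked numerically. The following is a minimal sketch, assuming the Walsh matrix is built in Hadamard (Kronecker-product) order; the function names `walsh_matrix` and `walsh_spectrum` are hypothetical and do not come from the paper.

```python
def walsh_matrix(n):
    """Build the 2^n x 2^n Walsh matrix in Hadamard order (an assumed ordering)."""
    w = [[1]]
    for _ in range(n):
        # Kronecker step: [[W, W], [W, -W]]
        top = [row + row for row in w]
        bottom = [row + [-v for v in row] for row in w]
        w = top + bottom
    return w


def walsh_spectrum(truth):
    """Multiply the Walsh matrix by a truth vector given in (0, 1) encoding."""
    n = len(truth).bit_length() - 1
    if len(truth) != 1 << n:
        raise ValueError("truth vector length must be a power of two")
    return [sum(a * b for a, b in zip(row, truth)) for row in walsh_matrix(n)]


spectrum = walsh_spectrum([1, 1, 0, 1])
```

Under this ordering, the truth vector (1, 1, 0, 1) yields the spectrum (3, -1, 1, 1); a different row ordering of the matrix would permute the coefficients.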
Walsh Transformation Tree
In the Walsh expression, by specifying the value of $\mathbf{w} = (w_1, w_2, \ldots, w_n)$, we can compute an arbitrary Walsh coefficient. That is, the WTT represents a row of $W(n)$, and the value $(w_1, w_2, \ldots, w_n)$ specifies the row. When
This corresponds to the inner product of the truth vector and the first row of $W(n)$.
From this, we can compute the Walsh coefficients as follows: 
(Proof) We prove the theorem by mathematical induction on $n$, the number of variables in $f$. Note that, in a BDD, when a node $v$ has the same children, the reduced graph has an edge with the label $\bar{w} + w = 1$. On the other hand, in a WTD, when a node $v$ has the same children, the reduced graph has an edge with the label $1 + (1 - 2w) = 2(1 - w)$.
Amount of Hardware

Computing a Single Coefficient
A hardware realization of a WTT is obtained by replacing each node in the WTT with an adder-subtracter. A $k$-bit adder-subtracter realizes the function $y(w, s_a, s_b) = s_a + (1 - 2w) s_b$, where $s_a$ and $s_b$ are $k$-bit binary numbers. Note that $y(w, s_a, s_b)$ represents the addition $(s_a + s_b)$ when $w = 0$, and the subtraction $(s_a - s_b)$ when $w = 1$, where $w$ is the control input. A $k$-bit adder-subtracter has $(2k + 1)$ inputs and $(k + 1)$ outputs. We assume that the cost of hardware for a $k$-bit adder-subtracter is $\alpha k$, where $\alpha$ is a constant. The hardware for a WTT has the structure of a binary tree. Note that the adder-subtracters near the root node cost more than those near the leaf nodes. However, we can prove that the total cost of hardware is $O(2^n)$.
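The tree of adder-subtracters can be modeled in software to see how one coefficient is selected and why the total cost stays linear in the number of leaves. This is a minimal sketch, not the paper's circuit: it assumes that $w_1$ controls the root level, that the truth vector is indexed with $x_1$ as the most significant bit, and that the adder-subtracters at the $k$-th level from the leaves are $k$-bit units. All function names are hypothetical.

```python
def adder_subtracter(w, s_a, s_b):
    """Add when the control input w is 0, subtract when w is 1."""
    return s_a + (1 - 2 * w) * s_b


def walsh_coefficient_tree(truth, w_bits):
    """Evaluate one Walsh coefficient with a binary tree of adder-subtracters.

    truth  : truth vector in (0, 1) encoding, length 2^n
    w_bits : (w_1, ..., w_n); assumed here that w_1 controls the root level
    """
    level = list(truth)
    # Adjacent leaves differ in the last variable, so w_n is applied first.
    for w in reversed(w_bits):
        level = [adder_subtracter(w, level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def wtt_cost(n, alpha=1):
    """Total cost under the assumed model: 2^(n-k) units of cost alpha*k at level k."""
    return sum(alpha * k * 2 ** (n - k) for k in range(1, n + 1))
```

Under these assumptions, the cost sum evaluates to $\alpha (2^{n+1} - n - 2)$, which is consistent with the $O(2^n)$ bound stated in the text. For example, `walsh_coefficient_tree([1, 1, 0, 1], (0, 1))` selects the second row of the Walsh matrix in Hadamard order.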
Computing the Entire Coefficients
Theorem 3.2 The cost of hardware that computes all the Walsh coefficients of $n$ variables at one time is $O(n^2 \cdot 2^n)$.
(Proof) By replacing each node in the butterfly diagram with an adder or a subtracter, we obtain hardware that computes all the Walsh coefficients at one time. Each stage needs $2^{n-1}$ copies of adders and subtracters. Also, the cost of an adder-subtracter in the $i$-th stage is $\beta i$, where $\beta$ is a constant. Thus, the total cost of the hardware is $O(n^2 \cdot 2^n)$.
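The butterfly structure used in the proof can be illustrated with a software model that applies one adder and one subtracter to each pair of values in every stage. This is a sketch under stated assumptions: the function names are hypothetical, the output is in Hadamard order, and the cost function simply sums $2^{n-1} \cdot \beta i$ over the $n$ stages as one reading of the proof.

```python
def fast_walsh_transform(truth):
    """In-place style butterfly; returns the spectrum and the operator count per stage."""
    data = list(truth)
    size = len(data)
    ops_per_stage = []
    half = 1
    while half < size:
        ops = 0
        for start in range(0, size, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i] = a + b          # adder
                data[i + half] = a - b   # subtracter
                ops += 2
        ops_per_stage.append(ops)
        half *= 2
    return data, ops_per_stage


def butterfly_cost(n, beta=1):
    """Assumed cost model: 2^(n-1) adder/subtracter copies of cost beta*i in stage i."""
    return sum(2 ** (n - 1) * beta * i for i in range(1, n + 1))
```

With this model the sum equals $\beta \, n(n+1) \, 2^{n-2}$, which grows as $n^2 \cdot 2^n$. Each stage of `fast_walsh_transform` performs $2^{n-1}$ additions and $2^{n-1}$ subtractions.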
Multiple-Output Functions
With the integer function [15]: The Walsh coefficients of a multiple-output function can also be obtained from the MTBDT. However, the straightforward implementation of the MTBDT requires excessive hardware. In the method of Theorem 4.1, most of the hardware is independent of $m$, the number of outputs. The only hardware that depends on $m$ is the adder in the final stage. This realization drastically reduces the amount of hardware, but the computation time is proportional to $m$.
Experimental Results
Circuits to Compute Single Coefficient
In Sections 2.2 and 2.3, we presented two methods to compute the coefficients: WTT and WTD. In this part, we only consider the hardware realization of WTTs, since the method using WTD is feasible only for fixed functions.
In the computation of the spectrum for logic functions, two encodings exist: one is the $(0, 1)$ encoding, and the other is the $(1, -1)$ encoding [10]. In this paper, we use the $(0, 1)$ encoding. In this case, the maximum value of the spectrum for an $n$-variable function is $2^n$, and the minimum value is $-2^{n-1}$.
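The stated range of the spectrum under the $(0, 1)$ encoding can be confirmed by exhaustive enumeration for small $n$. The sketch below is only a sanity check, not part of the hardware design; the function name `spectrum_range` is hypothetical, and the sign of each term is taken as the parity of the bitwise AND of the row index and the input index.

```python
from itertools import product


def spectrum_range(n):
    """Return (min, max) over all coefficients of all n-variable functions."""
    size = 1 << n
    lo, hi = 0, 0
    for truth in product((0, 1), repeat=size):
        for w in range(size):
            # Sign is -1 when w and x share an odd number of 1 bits.
            s = sum(-v if bin(w & x).count("1") % 2 else v
                    for x, v in enumerate(truth))
            lo = min(lo, s)
            hi = max(hi, s)
    return lo, hi
```

The maximum is reached by the constant-1 function at the first coefficient, and the minimum by a function that is 1 exactly where a nonzero row of the Walsh matrix is negative. The enumeration is feasible only for very small $n$, since there are $2^{2^n}$ functions.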
$(w_1, w_2, w_3) = (1, 1, 0)$. As shown in Fig. 5.1(c), the adder-subtracter of a WTT has $(2k + 1)$ inputs and $k + 1$ outputs. In this realization, to reduce the amount of hardware, we use a special encoding: the code $(1, 0, 0, \ldots, 0)$ represents $2^k$, while other codes represent 2's complement numbers. For example, Table 5.1 shows the encoding for $k = 2$. In this case, the code $(1, 0, 0)$ represents $2^k$. Table 5.2 shows the environment and conditions of the experiments. Table 5
Circuits to Compute All the Coefficients
We also implemented circuits to compute all the coefficients at one time. The circuits directly realize the butterfly networks shown in Fig. 2.1. Up to $n$ , we could implement combinational circuits to compute all the coefficients at one time. Table 5.5 shows the numbers of ALUTs and the delay time. For $n$ , the number of pins of the FPGA is not sufficient, so we used the TDM method. Table 5.6 shows the amount of hardware and the delay time. From these $O(n^2 \cdot 2^n)$.
Comparison with Microprocessor
Various methods exist to compute Walsh coefficients by software. As the data structure, we assume an array that stores the truth vector. To compute any coefficient, we need to access all $2^n$ elements of the truth vector and perform $(2^n - 1)$ additions and/or subtractions. So, to calculate We used the computer shown in Table 5.2 with the gcc compiler. Table 5.7 compares the computation times. In the case of the microprocessor (MPU), the computation time is proportional to $2^n$. From the table, we can see that the FPGA realization is at least 12 3 times faster than the MPU when $n$ 1 . Note that the software implementations in [5, 12] can compute the coefficients only for fixed functions, and require precomputation. In contrast, our implementation can compute the coefficients for any function without any precomputation.
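The software baseline described above, a single pass over the truth-vector array, can be sketched as follows. This is an illustrative Python model rather than the C program measured in the paper; the function name and the integer row index `w` are hypothetical, and the sign of each term is taken as the parity of the bitwise AND of `w` and the element index.

```python
def walsh_coefficient_scan(truth, w):
    """Compute one Walsh coefficient by scanning the truth vector once.

    Returns the coefficient and the number of additions/subtractions performed.
    """
    acc = truth[0]  # the term for x = 0 always has sign +1
    ops = 0
    for x in range(1, len(truth)):
        if bin(w & x).count("1") % 2:
            acc -= truth[x]
        else:
            acc += truth[x]
        ops += 1
    return acc, ops
```

The scan touches all $2^n$ elements and performs $2^n - 1$ additions or subtractions, which matches the operation count given in the text and explains why the running time grows in proportion to $2^n$.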
Conclusion
In this paper, we have shown hardware to compute a Walsh coefficient of a logic function directly from the Walsh transformation tree. Also, we have designed the circuits using FPGAs. With the current FPGAs, our approach is feasible for $n$ 1 inputs. It is at least 12 3 times faster than a software realization on a microprocessor when $n$ 1 .
We have also shown that the amounts of hardware to compute a single coefficient and all the coefficients are $O(2^n)$ and $O(n^2 \cdot 2^n)$, respectively.
