In this paper, we present a condition for effective wire addition in Look-Up-Table-based (LUT-based) field programmable gate array (FPGA) circuits, and an optimization procedure that utilizes this effective wire addition. Each wire has different characteristics, such as delay and power dissipation. Therefore, replacing one wire that is critical for the circuit performance with many non-critical ones, i.e., many-addition-for-one-removal (m-for-1), is useful. However, the conventional logic optimization methods based on sets of pairs of functions to be distinguished (SPFDs) for LUT-based FPGA circuits do not make use of the m-for-1 manipulation, and perform only simple replacement and removal, i.e., the one-addition-for-one-removal (1-for-1) manipulation and the no-addition-for-one-removal (0-for-1) manipulation, respectively. Since each LUT can realize an arbitrary internal function with respect to a specified number of input variables, there is no sufficient condition at the logic design level for simple wire addition. Moreover, in general, simple addition of a wire has no effect on the removal of another wire, so it is important to derive a condition for non-simple, effective wire addition. We found an SPFD-based condition under which wire addition is likely to make another wire redundant or replaceable, and developed an optimization procedure utilizing this effective wire addition. According to the experimental results, when we focused on the delay reduction of LUT-based FPGA circuits, our method reduced the delay by 24.2% from the initial circuits, while the conventional SPFD-based logic optimization and the enhanced global rewiring reduced it by 14.2% and 18.0%, respectively. Thus, the method presented in this paper is sufficiently practical, and is expected to improve the circuit performance.
key words: logic design, set of pairs of functions to be distinguished (SPFD), look-up-
Introduction
FPGA circuits have been used mainly for prototyping, but are now often employed for practical implementations. While their performance is still lower than that of full/semi-custom-designed circuits, the technology is adopted partially or fully in circuit design and production because of its advantageous feature, reconfigurability. This is expected to be the key feature of new computer models in the future.
A popular type of FPGA circuit is composed of Look-Up-Tables (LUTs). Each LUT can realize an arbitrary function with respect to a specified number of variables. The function stored inside the LUT is called the internal function. When the specified number of LUT input variables is K, the FPGA circuit is said to be K-feasible. In such a K-feasible circuit, each LUT can work similarly to a K-input gate. The function realized by the LUT is analogous to the logic operations realized by the gate. As logic optimization methods for gate circuits, the Transduction Method [1]-[3], [8] and automatic-test-pattern-generation-based (ATPG-based) methods [6], [7], [10]-[14] are well known, and these have been extended to LUT-based FPGA circuits. These methods are based on the manipulation of circuit components, such as wires, and the conditions for the manipulations are expressed by incompletely specified functions (ISFs) or test pattern vector sets.
In particular, sets of pairs of functions to be distinguished (SPFDs) were proposed for optimization of LUT-based FPGA logic circuits [9], [15] to take full advantage of the flexible change of the internal function. The conventional SPFD-based logic optimization method performs only wire replacement and wire removal. We classify wire manipulations by decomposing them into wire addition and wire removal as follows:
• No Addition for One Removal (0-for-1),
• One Addition for One Removal (1-for-1),
• Many Addition for One Removal (m-for-1).
Then, the wire replacement and the wire removal used in the conventional method are regarded as 1-for-1 and 0-for-1, respectively. In logic optimization, m-for-1 is also very useful. For example, if the removed wire is more critical for the circuit performance than the added ones, then the circuit performance is improved. In order to realize the m-for-1 manipulation, wires are first added so that the primary output functions are not undesirably changed, and the effects of this addition must then be propagated through the circuit. If they are fully propagated, we can detect all the transformations made possible by the wire addition. However, in the case of LUT-based FPGA logic optimization based on SPFDs, wire addition mostly does not help the manipulation of other wires, since it is not practical (or almost impossible) to propagate the effects fully.
As methods utilizing wire addition, the SPFD-based global rewiring (GR) [16] and enhanced global rewiring (ER) [17] were proposed. However, these methods are based on the 0-for-1 and 1-for-1 manipulations. They remove the target wire forcibly, i.e., without SPFD calculation. As a result, the primary output functions are likely to be changed undesirably. This undesirable change is rectified by modifying the internal function of an LUT or by adding one wire based on SPFDs. Thus, these methods also do not make use of the m-for-1 manipulation.

Copyright © 2005 The Institute of Electronics, Information and Communication Engineers
In this paper, we present the condition for the SPFD-based effective wire addition for the m-for-1 manipulation in LUT-based FPGA circuits. Even though SPFDs still cannot fully propagate the effects of the wire addition, we can determine, based on the condition, whether the addition of wires may make another wire redundant, that is, whether the wire addition is effective or not. Moreover, we propose a logic optimization procedure for circuit delay reduction using the effective wire addition. In this procedure, we first pick up the wire that is likely to degrade the circuit performance most, and specify it as the target wire. Based on the condition presented in this paper, new wires are added effectively to remove the target wire.
In our experiments, we optimized the circuit delay by using our method, and compared it with the conventional SPFD-based logic optimization method and the ER. As a result, our method reduced the delay by 24.2% on average, while the conventional SPFD-based optimization and the ER reduced it by 14.2% and 18.0%, respectively. From these results, we consider that our method is sufficiently useful for improving the LUT-based FPGA circuit performance. This paper is organized as follows: Sect. 2 defines the terminology and notation. It also introduces the SPFD calculation method and the conventional SPFD-based logic optimization method. Section 3 presents the condition for the effective wire addition with a beneficial example, and provides an optimization procedure using the effective wire addition. Section 4 demonstrates the experimental results. Section 5 concludes this paper.
Preliminaries
In this section, we define the terminology and notation. We also introduce the fundamental notion and the conventional calculation method of sets of pairs of functions to be distinguished (SPFDs) in a look-up-table-based (LUT-based) field programmable gate array (FPGA) circuit. We review the conventional logic optimization using SPFDs.
Terminology and Notation
In this section, we define the terminology and notation used in this paper.
In this paper, we consider a loop-free LUT-based FPGA circuit with n primary input variables, $x_1, x_2, \ldots, x_n$. The vector composed of these variables is denoted by $x = (x_1, x_2, \ldots, x_n)$.
Let f be a logic function. If the value of f is always either 0 or 1, then f is called a completely specified function (CSF). For a pair of CSFs, f and g, when $f(x) \le g(x)$ holds for any x, in other words, when $(f(x), g(x))$ is (0, 0), (0, 1) or (1, 1) for any x, the relationship is denoted by $f \le g$. The number of LUTs on the longest path from the primary input terminals to the primary output terminals is called the level of the FPGA circuit, which is denoted by Lev. For each LUT, v, the number of LUTs on the longest path from the primary input terminals to v is denoted by iLev(v).
Each LUT or wire is called collectively a point. The function realized at a point, p, with respect to the primary input variables, x, is denoted by f p . Since this paper considers FPGA circuits consisting of only single-output LUTs,
An LUT can realize an arbitrary function with respect to a specified number of variables assigned to the LUT input wires. In this paper, the function realized by an LUT is called the internal function of the LUT, and the specified number is assumed to be K. An ordinal number is assigned to each input wire of an LUT. The k-th input wire of an LUT, $v_j$, is denoted by $w_{k/v_j}$ where $k = 1, 2, \ldots, K$. Note that each wire is therefore denoted in two manners: if $w_{k/v_j}$ is an output wire of $v_i$, then $w_{k/v_j}$ is identical to $w_{i,j}$.
In order to express the internal function, let the variables for the first, second, ..., K-th input wires be denoted by $y_1, y_2, \ldots, y_K$, respectively, which constitute a vector, y. Let the internal function of v be denoted by $u_v(y)$. Namely, the following equation holds:

$$f_v(x) = u_v\bigl(f_{w_{1/v}}(x), f_{w_{2/v}}(x), \ldots, f_{w_{K/v}}(x)\bigr)$$
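The relationship between the internal function and the functions at the input wires can be illustrated with a small sketch. The 2-input LUT, its XOR internal function, and the two input-wire functions below are hypothetical values chosen only for illustration; the helper name `compose` is likewise an assumption and not part of the method described in this paper.

```python
from itertools import product

def compose(u_v, input_funcs, n):
    """Tabulate f_v(x) = u_v(f_w1(x), ..., f_wK(x)) for every n-bit vector x."""
    return {
        x: u_v(*(f(x) for f in input_funcs))
        for x in product((0, 1), repeat=n)
    }

# Hypothetical 2-input LUT in a circuit with three primary inputs.
def u_v(y1, y2):
    """Internal function of the LUT (XOR, assumed for illustration)."""
    return y1 ^ y2

def f_w1(x):
    """Function at the first input wire (assumed)."""
    return x[0] & x[1]

def f_w2(x):
    """Function at the second input wire (assumed)."""
    return x[1] | x[2]

# Function realized at the LUT output with respect to the primary inputs.
f_v = compose(u_v, [f_w1, f_w2], n=3)
```

Here `f_v` is a truth table indexed by the primary input vector, so changing `u_v` while keeping `f_w1` and `f_w2` fixed shows directly how the internal function alone determines the output function.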
Set of Pairs of Functions to be Distinguished (SPFD)
In this section, we introduce SPFDs. An SPFD is computed at each point, and is used to determine whether the circuit can be transformed without any undesirable change of the primary output functions.
Considering a point, p, and the output functions of the sub-circuit between the primary input terminals and p, let us discuss transformation of the sub-circuit at p without any transformation of the outside of the sub-circuit. Then, we assume that as a result of the transformation at p, the function at only p may be replaced with g p . Then, no other output functions of the sub-circuit are changed. SPFD at p represents the conditions on g p . 
For an intuitive introduction of SPFDs, we use an example LUT, v, with two input wires, $w_{1/v}$ and $w_{2/v}$, in a three-primary-input FPGA circuit. We assume that the internal function is $u_v = y_1 \oplus y_2$ and that the functions in Table 1 are realized at v, $w_{1/v}$ and $w_{2/v}$. Table 1 is obtained from the truth table as follows. We reorder all the rows in the truth table by $(f_{w_{1/v}}(x), f_{w_{2/v}}(x))$. For every LUT input vector, y = (0, 0), (0, 1), (1, 0), (1, 1), we compose the set, $X_y$, of all x's such that $(f_{w_{1/v}}(x), f_{w_{2/v}}(x)) = y$ holds. We put all the rows in the truth table corresponding to the same set into one row of (100), (110) and (111). Namely,
Let us discuss the replacement with g w 2/v while f v and f w 1/v are kept unchanged.
Let us consider that $f_{w_{2/v}}$ has been replaced with $g^1_{w_{2/v}}$. According to Table 1, $f_v(x_{00}) = 0$ and $f_v(x_{01}) = 1$ must be kept unchanged for any pair of $x_{00} \in X_{00}$ and $x_{01} \in X_{01}$. At the same time, $f_{w_{1/v}}(x_{00}) = f_{w_{1/v}}(x_{01}) = 0$ and $g^1_{w_{2/v}}(x_{00}) = g^1_{w_{2/v}}(x_{01}) = 0$ hold for any such pair. However, when $u_v$ is modified so that $u_v(00) = 0$ holds, $f_v$ is changed so that $f_v(x_{01}) = 0$ for each $x_{01} \in X_{01}$. When $u_v$ is modified so that $u_v(00) = 1$ holds, $f_v$ is changed so that $f_v(x_{00}) = 1$ for each $x_{00} \in X_{00}$. Thus, there is no internal function that keeps $f_v$ unchanged. Namely, $f_{w_{2/v}}$ cannot be replaced with $g^1_{w_{2/v}}$, no matter how $u_v$ is modified.
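The argument above can be checked mechanically: a candidate function can replace an input-wire function exactly when no two primary input vectors that must produce different values of $f_v$ are mapped to the same LUT input vector. The following is a minimal sketch of that check. The concrete functions (`f_w1`, `f_w2`, `g_bad`, `g_good`) are assumptions chosen for illustration and are not the entries of Table 1; the helper names are hypothetical.

```python
from itertools import product
from collections import defaultdict

def input_classes(funcs, n):
    """Group primary input vectors x by the LUT input vector y = (f1(x), ..., fK(x))."""
    classes = defaultdict(list)
    for x in product((0, 1), repeat=n):
        classes[tuple(f(x) for f in funcs)].append(x)
    return dict(classes)

def find_internal_function(f_v, funcs, n):
    """Return a table y -> value that realizes f_v from funcs, or None if none exists."""
    u = {}
    for y, xs in input_classes(funcs, n).items():
        values = {f_v(x) for x in xs}
        if len(values) > 1:
            # Two x's share the same y but require different outputs:
            # no internal function can keep f_v unchanged.
            return None
        u[y] = values.pop()
    return u

# Hypothetical example with three primary inputs and u_v = y1 XOR y2.
f_w1 = lambda x: x[0]
f_w2 = lambda x: x[1]
f_v = lambda x: f_w1(x) ^ f_w2(x)

g_bad = lambda x: 0            # constant 0: cannot separate X_00 from X_01
g_good = lambda x: 1 - x[1]    # complement of f_w2: still separates them

no_solution = find_internal_function(f_v, [f_w1, g_bad], 3)
new_u = find_internal_function(f_v, [f_w1, g_good], 3)
```

With these assumed functions, `no_solution` is `None`, mirroring the failure of $g^1_{w_{2/v}}$ in the text, while `new_u` is the XNOR table, i.e., a modified internal function that keeps $f_v$ unchanged.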
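From the example of $g^1_{w_{2/v}}$, we find that $g_{w_{2/v}}(x_1) \neq g_{w_{2/v}}(x_2)$ must hold for any pair of $x_1$ and $x_2$ such that $f_v(x_1) \neq f_v(x_2)$ and $f_{w_{1/v}}(x_1) = f_{w_{1/v}}(x_2)$. Some functions $g^i_{w_{2/v}}$ in Table 1 satisfy this kind of condition. For such a function, $g^i_{w_{2/v}}(x_{00}) \neq g^i_{w_{2/v}}(x_{01})$
holds for any pair of $x_{00} \in X_{00}$ and $x_{01} \in X_{01}$. Also, $g^i_{w_{2/v}}(x_{10}) \neq g^i_{w_{2/v}}(x_{11})$ holds for any pair of $x_{10} \in X_{10}$ and $x_{11} \in X_{11}$. As shown in the top row of Table 1, we can find a function that can become $u_v$ without any change of $f_v$, for each of g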
respectively. This set of pairs of CSFs is SPFD at w 2/v . Generally, the conditions on g p are described as the following relationships of g p (x 1 ) and g p (x 2 ) for each pair of distinct vectors, x 1 and x 2 :
In the example of the conditions on g w 2/v in Table 1 , the conditions in Table 2 
In this case, g p is said to distinguish each pair of CSFs and is said to satisfy the SPFD. Note that an empty SPFD is satisfied by any function.
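The notion of a function satisfying an SPFD can be sketched as follows. Because the formal conditions are not reproduced in the text above, this sketch assumes the usual reading from the SPFD literature: a function g distinguishes a pair $(g_a, g_b)$ when g is 1 wherever $g_a$ is 1 and 0 wherever $g_b$ is 1, or the other way around. Functions are represented by their on-sets (sets of minterm labels); this representation and the names `distinguishes` and `satisfies` are assumptions for illustration.

```python
def distinguishes(g, pair):
    """True if on-set g separates the two on-sets in pair (assumed SPFD semantics)."""
    a, b = pair
    covers_a_only = a <= g and not (b & g)   # g = 1 on a, g = 0 on b
    covers_b_only = b <= g and not (a & g)   # g = 1 on b, g = 0 on a
    return covers_a_only or covers_b_only

def satisfies(g, spfd):
    """True if g distinguishes every pair in the SPFD; an empty SPFD is always satisfied."""
    return all(distinguishes(g, pair) for pair in spfd)

# Hypothetical SPFD with two pairs over minterms labelled 0..5.
spfd = [({0, 1}, {2}), ({3}, {4, 5})]

ok = satisfies({0, 1, 4, 5}, spfd)     # separates both pairs
not_ok = satisfies({0, 2}, spfd)       # mixes the two sides of the first pair
empty_ok = satisfies(set(), [])        # empty SPFD
```

Note that each pair may be separated in either polarity independently, which is the source of the flexibility that the internal function of an LUT can exploit.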
LUT-based FPGA Circuit Transformation Based on SPFDs
In this section, we introduce the basic SPFD calculation method and the transformation conditions based on SPFDs. Let the SPFD at a point, p, be denoted by $SPFD_p$.
SPFD Calculation at the Output of an LUT
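Let us consider how to calculate the SPFD at the output of an LUT, $v_i$. If $v_i$ is a primary output, the SPFD is assigned as

$$SPFD_{v_i} = \{(f_{v_i}, \overline{f_{v_i}})\}$$

Otherwise, the SPFD at $v_i$ is obtained as the union of those at all the output wires as follows:

$$SPFD_{v_i} = \bigcup_{j} SPFD_{w_{i,j}}$$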
SPFD Calculation at Input Wires of an LUT
In this section, we consider how to calculate SPFDs at K input wires of an LUT, v.
Suppose that the following m-pair SPFD has been obtained: 
Second, for each y = (y 1 , y 2 , . . . , y K ) = (0, 0, . . . , 0), (0, 0, . . . , 1), . . . , (1, 1, . . . , 1), we calculate the function h y/v as follows:
wheref w k/v is calculated as follows:
These functions are classified into two sets as follows:
By computing the direct product of the two sets, H 1/v × H 0/v , we obtain the pairs that are to be contained in SPFDs at the K input wires. Finally, each pair, (h y 1 /v , h y 0 /v ) ∈ H 1/v × H 0/v , is placed into SPFD at one input wire, w k/v , satisfying the following condition:
If there are two or more input wires satisfying Eq. (6), then we select one of them and place the pair into the SPFD at the selected input wire. For this selection, we assign a priority to each LUT. The priority of each LUT is distinct from those of the other LUTs, and is also inherited by all its output wires. Based on these assigned priorities, the input wire with the highest priority among those satisfying Eq. (6) is selected.
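The priority-based placement can be sketched as below. Since Eq. (6) is not reproduced in the text above, this sketch assumes the usual SPFD criterion: an input wire can take a pair $(h_{y_1/v}, h_{y_0/v})$ only if the LUT input vectors $y_1$ and $y_0$ differ in that wire's position. Each function $h_{y/v}$ is represented simply by its vector y, and the priorities are hypothetical values; all names are assumptions for illustration.

```python
def place_pairs(ones, zeros, wire_priority):
    """Distribute pairs (y1, y0) over input wires, preferring the highest priority.

    ones / zeros: LUT input vectors y whose internal-function value is 1 / 0.
    wire_priority: wire_priority[k] is the priority inherited by the k-th input wire.
    Returns a dict: wire index -> list of pairs placed into the SPFD at that wire.
    """
    placed = {k: [] for k in range(len(wire_priority))}
    for y1 in ones:
        for y0 in zeros:
            # Assumed form of Eq. (6): wire k can distinguish the pair
            # only if y1 and y0 differ in position k.
            candidates = [k for k in range(len(wire_priority)) if y1[k] != y0[k]]
            best = max(candidates, key=lambda k: wire_priority[k])
            placed[best].append((y1, y0))
    return placed

# Hypothetical 2-input LUT realizing AND; wire 0 has the higher priority.
ones = [(1, 1)]
zeros = [(0, 0), (0, 1), (1, 0)]
placed = place_pairs(ones, zeros, wire_priority=[5, 2])
```

In this assumed example, the pair formed with (0, 0) could be placed at either wire and goes to wire 0 because of its higher priority, so the lower-priority wire 1 receives only the single pair that no other wire can take. This is the effect the optimization procedure relies on when it gives a low priority to the wire it wants to remove.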
SPFD-Based Conditions for Manipulations
If the following condition is satisfied, then $w_{i,j}$ can be removed:

$$SPFD_{w_{i,j}} = \emptyset$$

If the following condition is satisfied, $w_{i,j}$ can be replaced with $w_{k,j}$: $f_{v_k}$ satisfies $SPFD_{w_{i,j}}$.
SPFD-Based Internal Function Modification of an LUT
If input wires of an LUT, v, were removed or replaced, the internal function of v is modified. Based on the SPFDs at the input wires of v, the modified internal function of v is obtained in a disjunctive form. Each function in H 1/v in Eq. (5) corresponds to one term in the disjunctive form, which is composed as follows: The sum of all these terms is the modified internal function.
SPFD-Based Effective Wire Addition
The conventional SPFD-based logic optimization method [15] does not make use of wire addition for the m-for-1 manipulation, and utilizes only wire removal (0-for-1) and wire replacement (1-for-1), as described in Sect. 2.3.3. Since each LUT can realize an arbitrary internal function and the variable for the new input wire can be ignored, the internal function can be left unchanged. However, it is well known that, in gate-based circuit logic optimization, the addition of an input wire to a gate may make those of another gate redundant. That is also considered to be true in LUT-based FPGA logic optimization. In this section, we present the condition for the effective wire addition, and show a beneficial example using the effective wire addition. Also, we provide an optimization procedure based on the effective wire addition.
Condition for Effective Wire Addition
Let us consider that a new input wire is added to $v_j$ that has $K'$ input wires where $K' < K$.
An Example of Effective Wire Addition
Let us consider an example sub-circuit shown in Fig. 1 . Suppose that the functions in Table 3 SPFDs are computed at the input wires, we obtain h y/v s 1 's and h y/v s 2 's satisfying the condition in Eq. (9) as follows:
Hence, we can consider that addition of w a 1 ,s 1 and w a 2 ,s 2 illustrated by the bold arrows in Fig. 1 
Optimization Procedure Utilizing Effective Wire Addition
We can consider that since Eq. (9) is very simple, there are many LUTs, v i 's, satisfying it for each LUT, v j . The circuit configuration is likely to become more complicated. When the circuit is placed and routed, this more complicated configuration may degrade the circuit performance. It is desirable to suppress the number of added wires. Hence, for efficiently utilizing the effective wire addition, we target a sub-circuit that is likely to degrade the circuit performance most. In the giga-hertz era, the most important circuit performance is the delay, and hence, in particular, we present an optimization procedure for the circuit delay reduction. At the logic design level, the delay is approximated as Lev. In the optimization procedure efficiently utilizing the effective wire addition, we aimed at reduction of the circuit level, Lev. As an intuitive strategy for the circuit delay reduction, we try to cut the critical path. In other words, we try to remove a wire on the critical path or to replace it with another wire supplied by an LUT closer to the primary input terminals. Such a wire on the critical path is called the target wire. The LUT where the target wire is input is called target LUT. The condition to remove the target wire is that the SPFD is empty. The fewer pairs are contained in SPFD at the wire, the more LUTs satisfy the condition to replace the wire. Therefore, in order to determine whether it is innately possible to remove or replace the target wire, the number of pairs propagated into SPFD at the target LUT is made as small as possible.
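The selection of candidate target wires on the critical path can be sketched with a simple longest-path computation. The netlist below is hypothetical, and the sketch makes two assumptions that the text does not state explicitly: iLev(v) counts v itself, and every LUT without fan-out is treated as a primary output. The names `ilev`, `olev` and `critical_wires` are illustrative only.

```python
from functools import lru_cache

# Hypothetical netlist: LUT -> list of fan-ins. Names starting with "x" are primary inputs.
FANINS = {
    "v1": ["x1", "x2"],
    "v2": ["x2", "x3"],
    "v3": ["v1", "v2"],
    "v4": ["v3", "x1"],
    "v5": ["v2", "x3"],
}

# Derive the fan-out lists from the fan-in lists.
FANOUTS = {}
for dst, srcs in FANINS.items():
    for src in srcs:
        FANOUTS.setdefault(src, []).append(dst)

@lru_cache(maxsize=None)
def ilev(node):
    """Number of LUTs on the longest path from the primary inputs to node (node included)."""
    if node not in FANINS:          # primary input
        return 0
    return 1 + max(ilev(src) for src in FANINS[node])

@lru_cache(maxsize=None)
def olev(node):
    """Number of LUTs on the longest path from node (included) to a LUT without fan-out."""
    return 1 + max((olev(dst) for dst in FANOUTS.get(node, [])), default=0)

LEV = max(ilev(v) for v in FANINS)  # circuit level, used as the delay estimate

def critical_wires():
    """LUT-to-LUT wires (src, dst) lying on a longest path, i.e., candidate target wires."""
    return [
        (src, dst)
        for dst, srcs in FANINS.items()
        for src in srcs
        if src in FANINS and ilev(src) + olev(dst) == LEV
    ]
```

For this assumed netlist, `LEV` is 3 and the wires `v1→v3`, `v2→v3` and `v3→v4` are critical, whereas `v2→v5` is not, so removing or re-sourcing it would not reduce Lev.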
In order to suppress the number of pairs propagated into the SPFD at the target wire, we have to assign the priorities to the LUTs in the circuit with this purpose in mind. We explain how to assign them by using the example illustrated in Fig. 2, where the target LUT and the target wire are denoted by $v_t$ and $w_t$, respectively. Let $w_t$ be an output wire of $v'_t$. In Fig. 2, the priorities are written between parentheses.
Since fewer pairs are placed into the SPFD at an LUT with a lower priority, the lowest priority is assigned to $v_t$. Pairs placed into the SPFDs at LUTs that are not in $S(v_t)$ are not propagated into the SPFD at $v_t$. Hence, the non-successors are given higher priorities than the successors.
When SPFDs at input wires of a farther LUT from v t are calculated, we have a better opportunity to prevent the propagation of pairs to successors of v t . Therefore, the priorities assigned to the successors are based on the distance from v t .
Finally, when the SPFDs at the input wires of $v_t$ are computed, $v'_t$ must be given the lowest priority, since the number of pairs placed in the SPFD at $w_t$ must be made as small as possible. However, since $v'_t \notin S(v_t) \cup \{v_t\}$ may have another output wire, a high priority has already been assigned to it, and $r(v'_t)$ may be higher than $r(v)$ for $v \in IP(v_t)$ with $v \neq v'_t$. Only at this time, $r(v'_t)$ is considered to be the lowest. In the example, $r(v'_t) = 11$ is first assigned. However, when the SPFDs at the input wires of $v_t$ are computed, we reassign $r(v'_t) = -1$ only at that time. This reassignment makes the number of pairs placed in the SPFD at $w_t$ as small as possible. If the SPFD obtained at $w_t$ is empty, then $w_t$ is removed. However, if it is innately impossible to remove $w_t$, then some pairs remain in the SPFD. In this case, $w_t$ is replaced with a new wire so that the distance from the primary input terminals to $v_t$ becomes shorter. Hence, LUTs closer to the primary input terminals are picked up as the candidates for supplying the substitutive wire.
Based on this strategy, the following procedure is performed for the circuit delay reduction: Procedure: Delay Reduction Using Effective Wire Addition
Step 1: Set the target wire, $w_t$, and the target LUT, $v_t$. Let $v'_t$ be an IP of $v_t$ connected by $w_t$.
Step 2: Assign the priorities, r(v)'s, to all LUTs, v's, so that the following conditions are satisfied:
• For any pair of LUTs, $v_i \notin S(v_t) \cup \{v_t\}$ and $v_j \notin S(v_t) \cup \{v_t\}$ where $i \neq j$, we assign $r(v_i)$ and $r(v_j)$ in an arbitrary relationship.
• For any pair of LUTs, $v_i \in S(v_t) \cup \{v_t\}$ and $v_j \in S(v_t) \cup \{v_t\}$ where $i \neq j$, we assign $r(v_i)$ and $r(v_j)$ in an arbitrary relationship.
• For any pair of LUTs, $v_i \notin S(v_t) \cup \{v_t\}$ and $v_j \in S(v_t) \cup \{v_t\}$ where $i \neq j$, we assign $r(v_i) > r(v_j)$ in order to make the number of pairs propagated to $v_t$ as small as possible.
Step 3: Calculate the functions at all the points.
Step 4: For every v ∈ S (v t ), calculate S PFD v and h y/v 's, and add new wires from v k 's such that iLev(v k ) is small and many h y/v 's satisfy Eq. (9) with f v k .
Step 5: Reassign a lower priority than those of any other IPs of $v_t$ as $r(v'_t)$. Calculate the SPFD at the output of $v_t$ and those at its input wires.
Step 6: Remove $w_t$ if $SPFD_{w_t} = \emptyset$. Otherwise, replace $w_t$ with a new wire from an LUT,
. If there are two or more v k 's, select the one with the lowest iLev(v k ).
Step 7: Apply the conventional SPFD-based logic optimization to the resultant circuit.
Step 8: Based on the configuration of the resultant circuit, update w t that has never been the target wire. If the new target wire is successfully specified, then go back to Step 2. Otherwise, terminate.
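The priority assignment of Step 2 can be sketched as follows. The sketch only enforces the rule that every LUT outside $S(v_t) \cup \{v_t\}$ receives a higher priority than every LUT inside it; within each group the order is arbitrary (here alphabetical). The netlist, the function names, and the use of consecutive integers as priorities are assumptions for illustration, and the temporary reassignment of Step 5 is not modelled.

```python
def successors(fanouts, v_t):
    """Return S(v_t): all LUTs reachable from v_t through output wires."""
    seen, stack = set(), [v_t]
    while stack:
        for nxt in fanouts.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def assign_priorities(luts, fanouts, v_t):
    """Give distinct priorities so that non-successors outrank S(v_t) and v_t itself."""
    low_group = successors(fanouts, v_t) | {v_t}
    # LUTs in the low group sort first and therefore get the smallest ranks.
    ordered = sorted(luts, key=lambda v: (v not in low_group, v))
    return {v: rank for rank, v in enumerate(ordered)}

# Hypothetical circuit: v1 and v2 feed v3, v2 also feeds v4, v3 and v4 feed v5.
luts = ["v1", "v2", "v3", "v4", "v5"]
fanouts = {"v1": ["v3"], "v2": ["v3", "v4"], "v3": ["v5"], "v4": ["v5"]}

r = assign_priorities(luts, fanouts, v_t="v3")
```

With `v3` as the target LUT, `v3` and its only successor `v5` receive the two lowest priorities, so pairs tend to be absorbed by wires coming from `v1`, `v2` and `v4` before they reach the target LUT.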
In the following, we compare the computational complexities between our method and the conventional method.
Since the SPFD calculation is almost the same between two methods, the main difference is considered to be the number of iterations (trials) of wire replacement or wire addition. Let A O be the number of trials of wire addition by our method. Let the total number of LUTs and the total number of existing wires in the circuit be V and W. Since the 
Thus, the ratio of A O to R C is at most (K − 1)W t . In our experiments for benchmark circuits, we specified the wires on the logically critical paths as the target. According to the results, W t = 31 on average. In this case, A O /R C is not significantly large. From the improvement of the circuit transformation flexibility, we consider that the trade-off between the complexity and the optimization capability is acceptable.
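As a rough numerical illustration of this bound, one can substitute K = 4, the value used in the experimental setup of Sect. 4, together with the reported average $W_t = 31$. Combining these two values is an assumption made here only to give a feeling for the magnitude of the worst case:

```latex
\[
\frac{A_O}{R_C} \;\le\; (K-1)\,W_t \;=\; (4-1)\times 31 \;=\; 93 .
\]
```

This is an upper bound on the number of trials, not a measured slowdown.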
Experimental Results
In this section, we demonstrate the experimental results. For the comparison, we implemented in the C++ language the SPFD-based conventional method using wire removal and wire replacement only (Conv.) [15], the SPFD-based enhanced rewiring (ER) [17] and the SPFD-based delay reduction using the effective wire addition (our method). For representing logic functions in this implementation, the CUDD package was adopted [19]. The BDD manipulation is dominant in the computation time. Hence, we made the calculation in Eq. (3) more efficient by reusing the previous results. We ran these methods on a 2.8 GHz Xeon machine with 4 GB of memory, using LUT-based FPGA circuits obtained by applying SIS commands [5] with K = 4 to the MCNC benchmark circuits [4].
We assume that the conventional method is used for the first logic optimization immediately after the logic synthesis. On the other hand, the ER and our method specify the wire that degrades the circuit performance most as the target. These methods are useful when the placement and routing do not produce any sufficient results. Based on this assumption, we performed three experiments, "Conv.," "Conv.+ER" and "Conv.+Our Method." Table 5 shows the experimental results. With respect to the averages, we compare the results with the initial circuits in the second lowest row. Moreover, since we assume that the ER and our method are applied after the conventional method, we compare the results from the ER and our method with those from the conventional method in the lowest row.
According to the experimental results in Table 5 , the conventional method reduced the number of LUTs, the number of wires and the circuit level by 5.9%, 9.9% and 14.2%, respectively, on average. Our method after the conventional one reduced these by 11.0%, 14.2% and 24.2%, respectively, while the ER after it reduced them by 6.2%, 10.2% and 18.0%.
Measured from the initial circuits, our method applied after the conventional method reduced the circuit level by about 1.3 times as much as the ER applied after it. Measured from the circuits optimized by the conventional method, our method reduced it by about three times as much as the ER. In addition to the circuit level, our method reduced the numbers of LUTs and wires from the results of the conventional method more than the ER did.
The SPFD-based conventional method makes full use of wire removal, which corresponds to the disconnectable procedure in the Transduction Method [3] . However, wire addition corresponding to the connectable procedure is performed only as a part of wire replacement. In other words, it is combined with wire removal. Comparing to the Transduction Method, we can consider that the SPFD-based ER is based on a similar idea to the Transduction Method based on the error compensation procedure. Both the conventional method and the ER perform only 1-for-1 and 0-for-1.
On the other hand, our method can apply wire addition independently of wire removal. It can perform m-for-1 to realize more flexible transformation than the SPFD-based conventional method and the ER. In the circuit implementation, each wire has different characteristics with respect to the circuit performance. For example, suppose that the circuit has a wire that cannot be routed as a short one. If the wire is replaced with wires that can be routed as short ones, then the circuit performance is expected to improve. Therefore, our method based on the m-for-1 manipulation could improve the circuit performance more than the 1-for-1-based methods.
Concerning the optimization time, when the ER and our method are applied together with the conventional method, our method took 8.72 times as long as the conventional method alone, while the ER took 3.75 times as long. When the target wire is selected, the ER examines whether each wire can be specified as the target, based on the circuit configuration. As a result, there are many wires that cannot be specified as the target. In our method, on the other hand, no such configuration constraint is imposed in specifying the target wire, so more wires can be specified as the target than in the ER. Hence, our method took a longer time than the ER. Considering the significant circuit level reduction by our method, we conclude that the trade-off between the optimization time and the optimization capability is acceptable in practice.
Conclusion
In this paper, we have presented the condition for the SPFD-based effective wire addition in LUT-based FPGA circuits, and a logic optimization method using the effective wire addition. Since each LUT can realize an arbitrary logic function with respect to a specified number of variables and the variable for the added wire can be ignored, there is no sufficient condition for simple wire addition, and wire addition may not make other wires redundant or replaceable. For this reason, the conventional SPFD-based logic optimization method performs only wire removal and wire replacement. On the other hand, if the condition we have presented in this paper is satisfied, the wire addition is more likely to make other wires redundant or replaceable. Hence, our logic optimization method using the effective wire addition transforms LUT-based FPGA circuits more flexibly than the conventional method. As another SPFD-based logic optimization method for LUT-based FPGA circuits, the SPFD-based enhanced rewiring (ER) is well known. The ER forcibly removes a wire, and modifies the internal function of an LUT or adds a new wire to rectify the primary output functions undesirably changed by this forcible removal. Unless this rectification is successful, the circuit is restored to the original configuration. According to the experimental results, when each is applied after the conventional SPFD-based logic optimization, our method reduces the circuit delay, approximated as the circuit level in the logic design, more than the ER. On average, while the ER reduced the delay of the initial circuits by 18.0%, our method reduced it by 24.2%, which is about 1.3 times the reduction achieved by the ER. Since our method provides more flexible transformation, the circuit delay is improved by our method more than by the ER, but the optimization time of our method is longer than that of the ER. We conclude that this is an acceptable trade-off and that our method is sufficiently practical.
