Traditionally, logic synthesis constrains the solution space of later design steps, such as physical design, because the steps are applied in sequence. Rewiring is a technique to restructure a circuit while maintaining its functionality. Since design properties and objectives can be considered during postsynthesis rewiring, it can help relieve constraints imposed by decisions made at earlier design steps. The extent of rewiring that an algorithm can perform has a great impact on the success of the design flow. This paper presents a rewiring technique that unifies all previously proposed Set-of-Pairs-of-Functions-to-be-Distinguished (SPFD) based rewiring techniques and can also perform rewiring with more than one wire, which increases our ability to circumvent poorly decided early design constraints. With this ability, the rewiring ability of using different numbers of wires is reported for the first time in this paper. Our technique can be used for runtime/quality trade-offs in any given rewiring application.
INTRODUCTION
Optimization is at the core of any integrated circuit design flow. Due to the intractability of the majority of VLSI problems, the flow is organized in a sequential manner. Although the two steps are interdependent, physical design is performed after logic synthesis. Thus, traditional physical design techniques are constrained by the circuit structure obtained from logic synthesis, resulting in sub-optimality. Since both logic synthesis and physical design are quite time consuming, iterating over these two steps to improve solutions has found limited use. Rewiring has been proposed to restructure a circuit while its functionality is maintained, i.e., removing some wires in the circuit and adding other wires at different locations in the circuit under the condition that the global functions of primary outputs and flip-flop inputs are unchanged. As a result, rewiring techniques can be used to adapt a circuit's structure to suit optimizations during physical design.
Rewiring techniques can be classified into automatic-test-pattern-generation (ATPG) based and Set-of-Pairs-of-Functions-to-be-Distinguished (SPFD) based techniques. ATPG-based rewiring relies heavily on the present node functions. Thus, its solution space is severely limited, as a fixed function provides minimal flexibility. SPFD was proposed to express a node's function flexibility [1]. Its application to rewiring has been shown to provide better results both in theory [2] and in practice [3]. SPFD-based rewiring can be used in many applications. For example, it has been used to reduce the number of Look-Up Tables (LUTs) used in implementing a circuit [4] and has also been shown to help reduce FPGA power consumption by 12% [5].
Rewiring ability is defined as the ratio of the number of wires that can be rewired to the number of total wires in a circuit. Thus, a rewiring algorithm with higher rewiring ability has more optimization capability. As a result, rewiring ability improvement is of great interest.
SPFD-based rewiring is composed of two steps: choosing the sources and targets for additional wires and checking if the circuit functionality is still maintained. Previously proposed checking mechanisms either restrict target nodes or limit the number of wires to be added or both. As a result, the rewiring ability is limited.
The contributions of this paper can be summarized as follows:
1. Provide necessary and sufficient conditions for a generalized checking mechanism. The scheme unifies previous approaches into a single framework and is applicable to rewiring with more than one wire.
2. Show an efficient implementation of the scheme.
3. Report for the first time the rewiring ability of using different numbers of additional wires, which can be used for runtime/quality trade-offs in any rewiring application.
The paper is organized as follows. Section 2 summarizes notations used in this paper. Section 3 explains SPFD. The previous SPFD-based rewiring approaches are summarized in Section 4. Section 5 introduces our generalized checking scheme. An efficient implementation of the scheme is detailed in Section 6. Section 7 shows experimental results.
BASIC TERMINOLOGIES
A combinational circuit consists of nodes and directed edges between nodes. We use wire(n_a, n_b) to denote an edge from node n_a to node n_b. The source and sink nodes of a wire w are denoted by sr(w) and sk(w), respectively. Thus, sr(wire(n_a, n_b)) = n_a. If there exists wire(n_a, n_b), we say that n_a is a fanin node of n_b and n_b is a fanout node of n_a. Similarly, we call wire(n_a, n_b) a fanin wire of n_b and a fanout wire of n_a. FI(n) denotes either the fanin nodes or the fanin wires of node n. FO(n) is similarly defined for fanout nodes or wires. We use TFI(n) and TFO(n) to represent the sets of transitive fanin and fanout nodes of node n, respectively. For example, if there exist wire(n_a, n_b) and wire(n_b, n_c), then n_a, n_b ∈ TFI(n_c) and n_b, n_c ∈ TFO(n_a). Nodes with no fanout nodes and nodes with no fanin nodes are called primary output (PO) and primary input (PI) nodes, respectively. As a circuit is a directed acyclic graph, nodes can be organized into levels using topological sort. The level of node n is denoted by L(n).
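The terminology above can be illustrated with a minimal Python sketch. The representation (wires as `(source, sink)` tuples), the helper names `build`, `transitive`, and `levels`, and the level convention (primary inputs at level 0, every other node one level above its deepest fanin) are assumptions made for illustration; the text only states that levels follow from a topological sort.

```python
from collections import defaultdict

def build(wires):
    """Return (fanin, fanout) maps from a list of (source, sink) wires."""
    fi, fo = defaultdict(set), defaultdict(set)
    for src, snk in wires:
        fo[src].add(snk)
        fi[snk].add(src)
    return fi, fo

def transitive(n, adj):
    """Nodes reachable from n via adj: TFI(n) with the fanin map, TFO(n) with the fanout map."""
    seen, stack = set(), list(adj.get(n, ()))
    while stack:
        m = stack.pop()
        if m not in seen:
            seen.add(m)
            stack.extend(adj.get(m, ()))
    return seen

def levels(nodes, fi):
    """Assumed convention: L(n) = 0 for primary inputs, else 1 + max level of the fanins."""
    memo = {}
    def level(n):
        if n not in memo:
            memo[n] = 0 if not fi.get(n) else 1 + max(level(m) for m in fi[n])
        return memo[n]
    return {n: level(n) for n in nodes}

# The example from the text: wire(n_a, n_b) and wire(n_b, n_c).
fi, fo = build([("na", "nb"), ("nb", "nc")])
tfi_nc = transitive("nc", fi)   # n_a and n_b
tfo_na = transitive("na", fo)   # n_b and n_c
lv = levels(["na", "nb", "nc"], fi)
```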
Each node n implements a single-output binary function f(n). Iteratively composing f(n) with the functions of nodes in TFI(n), we obtain the global function of node n, g(n). The global function of n before rewiring is denoted as g_org(n).
SPFD
A binary function of n variables defines its ON and OFF sets of n-tuple binary numbers. Since each variable can take either 0 or 1, the total number of elements in both sets is 2^n. For any given binary function, we can draw a graph where each node represents one n-tuple. Nodes in the graph can be organized into two groups, one for each set. To represent the fact that nodes from different groups evaluate to different values, an edge is added between every pair of nodes from different groups. The resulting graph is bipartite, as every edge connects nodes in different groups and there is no edge between nodes in the same group. From this construction, we can see that there is a one-to-one mapping between the set of n-input binary functions and a set of bipartite graphs with 2^n nodes. If there are some don't care tuples, the corresponding function is called an incompletely specified function (ISF). These don't care nodes can be placed in either group. Different placements represent different functions, but each of them implements the given ISF.
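As a small illustration of this construction, the following Python sketch builds the ON set, the OFF set, and the complete bipartite edge set for a completely specified function. The helper name `function_graph` and the choice of a 2-input OR as the example are assumptions for illustration only.

```python
from itertools import product

def function_graph(f, n):
    """Return (on_set, off_set, edges) for an n-input binary function f."""
    tuples = list(product((0, 1), repeat=n))
    on = [t for t in tuples if f(*t)]
    off = [t for t in tuples if not f(*t)]
    # One edge between every ON node and every OFF node; none inside a group.
    edges = {(a, b) for a in on for b in off}
    return on, off, edges

on, off, edges = function_graph(lambda a, b: a | b, 2)
```

For the 2-input OR, the ON group holds three tuples and the OFF group holds only (0, 0), so the graph has three edges.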
In 1996, Yamashita, et al., cleverly expanded this idea [1] by generalizing an edge to the following definitions.
Definition 3.1. [1] For any Boolean functions f and g, let FX = {x | f(x) = 1} and GX = {x | g(x) = 1}, where x is a primary input vector. If GX ⊆ FX, f includes g, written as g ≤ f or g → f, which is equivalent to g · ¬f = 0.
Definition 3.2. [6] A function f is said to distinguish a pair of functions g and h if either g ≤ f ≤ ¬h or h ≤ f ≤ ¬g is satisfied. Note that g ≤ ¬f ⇔ f ≤ ¬g, and if g · h ≠ 0, there is no function that satisfies the pair.
For a given set of functions, Definition 3.2 can be extended to the following one.
Definition 3.3. [6] A function f satisfies a set of pairs of functions to be distinguished (SPFD) if f distinguishes every pair in the set.
Equivalently, SPFD can also be used to represent the set of functions that satisfy the SPFD. Therefore, f ∈ SPFD can also be used to indicate that f satisfies the SPFD. Essentially, an SPFD is a collection of ISFs, and it can be conceptualized in the form of a graph as well. However, the graph of an SPFD may contain many connected components, one for each ISF. SPFD has been shown to express the flexibility of functions better than ISF [6]. In simple terms, the ISFs in an SPFD can be assigned to ON or OFF sets separately, instead of collectively as when they are combined into one ISF.
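The definitions above can be checked mechanically on small examples. The sketch below represents each function by its ON set of input vectors, so that g ≤ f becomes a subset test, and it assumes the usual reading of Definition 3.2 in which the upper bound is the complement of the other function of the pair. The names `universe`, `distinguishes`, and `satisfies` are hypothetical helpers, not part of any cited work.

```python
from itertools import product

def universe(n):
    """All n-tuple input vectors."""
    return frozenset(product((0, 1), repeat=n))

def distinguishes(f_on, g_on, h_on, U):
    """True if g <= f <= not h, or h <= f <= not g (functions given as ON sets)."""
    f_on, g_on, h_on = set(f_on), set(g_on), set(h_on)
    return (g_on <= f_on <= U - h_on) or (h_on <= f_on <= U - g_on)

def satisfies(f_on, spfd, U):
    """True if f distinguishes every pair (g, h) in the SPFD."""
    return all(distinguishes(f_on, g, h, U) for g, h in spfd)

U = universe(2)
AND = {(1, 1)}
XOR = {(0, 1), (1, 0)}
spfd = [({(1, 1)}, {(0, 0)})]      # one pair: separate 11 from 00
and_ok = satisfies(AND, spfd, U)   # AND puts 11 in ON and 00 in OFF
xor_ok = satisfies(XOR, spfd, U)   # XOR puts 11 and 00 on the same side
```

Under this reading, a pair whose two functions overlap cannot be distinguished by any function, which the subset tests reflect: both alternatives would require the shared vector to be inside and outside the ON set of f at once.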
For a given circuit and its output functions, the SPFD at the output pins can be constructed from their ON and OFF sets. This SPFD can be represented as a bipartite graph. Each edge of the graph will be distributed to one of the node's inputs which can distinguish the edge [6] . Nodes will be processed from primary outputs to primary inputs. The summary of SPFD computation at a node is shown in Algorithm 1.
Algorithm 1 SPFD computation at a gate.
Require: SPFD …, inputs to the gate are …
1: for each pair in the SPFD do
2:   Construct all possible minterms on the inputs of the gate, i.e., …
3:   … is the care set. Thus, a_i describes all minterms that need to be distinguished.
4:   Distribute all care minterms into two sets: …
5:   Build complete bipartite graph …
6:   for each edge (a_i, a_j) of the graph do
7:     Add (a_i, a_j) to at least one input k such that …
8:   end for
9: end for

Algorithm 1 can be summarized as follows. For each pair of the SPFD, care minterms are constructed (Lines 2-3). Lines 4-5 categorize the minterms into two sets of minterms to be distinguished. Each pair of minterms is assigned to a fanin of the gate that can distinguish the pair at Line 7. Applying Algorithm 1 to all nodes in the circuit from the POs backward, the SPFD of each node and wire can be computed.
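The distribution step summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `distribute_spfd`, the representation of each SPFD pair as two sets of primary input vectors, and the rule of assigning each edge to the first input whose value differs are all choices made here. As noted later in the paper, the order of distribution affects rewiring ability, so the first-input rule should not be read as the authors' policy.

```python
from itertools import product

def distribute_spfd(input_funcs, spfd):
    """Distribute an output SPFD to the inputs of a gate.

    input_funcs: list of callables, each mapping a PI vector to 0/1
                 (the global functions of the gate's fanins).
    spfd: list of (g_a, g_b) pairs, each a set of PI vectors.
    Returns one set of local minterm pairs per gate input.
    """
    per_input = [set() for _ in input_funcs]
    for g_a, g_b in spfd:
        # Care minterms on the gate inputs reached from each side of the pair.
        side_a = {tuple(f(x) for f in input_funcs) for x in g_a}
        side_b = {tuple(f(x) for f in input_funcs) for x in g_b}
        # Complete bipartite graph between the two sides.
        for a_i, a_j in product(side_a, side_b):
            # Assumed policy: give the edge to the first input that differs.
            for k in range(len(input_funcs)):
                if a_i[k] != a_j[k]:
                    per_input[k].add((a_i, a_j))
                    break
            else:
                raise ValueError("no input can distinguish this pair")
    return per_input

# Gate whose two fanins are the primary inputs x[0] and x[1] themselves.
fanins = [lambda x: x[0], lambda x: x[1]]
result = distribute_spfd(fanins, [({(1, 1)}, {(0, 0), (1, 0)})])
```

In this example the edge (11, 00) goes to the first input, while (11, 10) can only be distinguished by the second input.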
PREVIOUS WORK ON SPFD-BASED REWIRING
In this section, previous techniques on SPFD-based rewiring are outlined. We assume that the SPFD of each wire is already computed. The wire to be removed is denoted as w_r.
No Additional wire for one wire removal (0-for-1) and one additional wire for one wire removal (1-for-1)
If SPFD(w_r) can be redistributed to other existing wires, w_r can be removed without adding another wire; this is called 0-for-1 rewiring. If another wire has to be added, the rewiring is called 1-for-1 rewiring. Rewiring algorithms process both rewiring techniques in similar ways. Hence, they are categorized based on the location of the target of additional wires.
Local rewiring
Let n_r = sk(w_r), as shown in Figure 1a. If there is a wire w_a that satisfies SPFD(w_r), i.e., SPFD(w_r) ⊆ SPFD(w_a), w_a can be used to replace w_r. f(n_r) needs to be updated after the replacement. If SPFD(w_r) is empty, w_r can be removed without adding any wires. Since the destinations of w_a and w_r are the same, this operation is referred to as local rewiring.
Global rewiring
A dominator node of w_r is a node through which all paths from w_r to any PO pass. The set of dominator nodes of w_r is denoted as Dominator(w_r). For example, Dominator(w_r) = {n_a, n_b} in Figure 1b. Let sk(w_r) = n_r. In global rewiring [3], the set of target nodes for additional wires is expanded from only n_r to Dominator(w_r). If a node n_a ∈ Dominator(w_r), the effect of removing w_r must pass through n_a. Global rewiring proceeds by removing w_r and propagating the change through the fanout nodes of n_r until n_a is reached. A candidate w_a is then added as a fanin of n_a, and the resulting SPFD(n_a) is checked to see whether it covers its original SPFD before the removal of w_r [3].
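The dominator set can be computed directly from its definition on small circuits. The sketch below tests each node reachable from sk(w_r) by blocking it and checking whether a PO is still reachable. It is a simple illustration, not the method of [3]; the function names and the fanout-map representation are assumptions, and the sketch counts sk(w_r) itself as a dominator because every path from w_r passes through it.

```python
def reachable(start, fanout):
    """All nodes reachable from start (including start) along fanout wires."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(fanout.get(n, ()))
    return seen

def reaches_po(start, blocked, fanout, primary_outputs):
    """True if some PO is reachable from start without passing through blocked."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        if n == blocked or n in seen:
            continue
        seen.add(n)
        if n in primary_outputs:
            return True
        stack.extend(fanout.get(n, ()))
    return False

def dominators(wire, fanout, primary_outputs):
    """Nodes that lie on every path from the wire's sink to any PO."""
    _, sink = wire
    return {d for d in reachable(sink, fanout)
            if not reaches_po(sink, d, fanout, primary_outputs)}

# Hypothetical circuit: n_r fans out to x and y, which reconverge at n_a.
fanout = {"nr": ["x", "y"], "x": ["na"], "y": ["na"], "na": ["nb"], "nb": []}
doms = dominators(("ns", "nr"), fanout, {"nb"})
```

Nodes x and y are not dominators because the other branch bypasses each of them, while the reconvergence node and everything after it are.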
Many additional wires for 1 wire removal (m-for-1)
Even though w a1 and w a2 in Figure 1c may not be used to individually replace w r , they may collectively substitute w r . The conditions that are likely to lead to this type of rewiring were suggested in [7] . For example, let
, respectively, adding both wire(n s1 , n a1 ) and wire(n s2 , n a2 ) directs SP F D(w r ) away from w r . Hence, SP F D(w r ) = ∅ and w r can be removed.
A UNIFIED FRAMEWORK
A generalized rewiring scheme which includes all previous approaches as special cases is proposed in this section. For a given n-variable function, f, we can also define the corresponding SPFD, representing pairs of n-tuples that the function can distinguish. Therefore, in this work we make the distinction between SPFD obtained from Algorithm 1 and SPFD that f can distinguish.
Definition 5.1. An SPFD derived from a node's global function is called an arrival SPFD, or SPFD_A. SPFD_A at a node n can be computed from the functions of nodes in TFI(n). Therefore, SPFD_A can also be perceived as forward propagation. If n = sr(e), SPFD_A(e) = SPFD_A(n). An SPFD at a node obtained by backward distribution (Algorithm 1) is called a required SPFD, or SPFD_R.
The properties of SPFD_A and SPFD_R are given in the following lemma.
Lemma 5.2.
SPFD_A(n_i) ⊆ ∪_{n_k ∈ FI(n_i)} SPFD_A(n_k)   (1)
SPFD_R(n_i) = ∪_{n_k ∈ FO(n_i)} SPFD_R(n_k)   (2)
Lemma 5.3. A circuit, C, works if and only if
, ∀n ∈ C and either one of the following conditions hold.
In the SPFD computation outlined in Algorithm 1, a pair p is distributed to the input e if f(sr(e)) distinguishes the pair, or equivalently p ∈ SPFD_A(sr(e)). Therefore, the resulting SPFDs satisfy the above conditions. A cut set can be derived from nodes' levels as follows. Each level has only one cut set. There are two types of nodes in the cut set at level i: 1) nodes with level i and 2) overpass nodes, defined as nodes at levels less than i which have fanouts at levels greater than i but no fanout node at level i. Organizing a circuit into levels, the following corollary is obtained as a necessary and sufficient condition for correct circuit functionality.
Corollary 5.4. A circuit works if
1. SPFD_A(n) ⊂ SPFD_R(n), ∀n ∈ PO.
Let N i be the set of nodes in the cut set of level i. For all levels i,
At a first glance, Corollary 5.4 is just a necessary condition because pairs may be lost according to (1) 
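The level-based cut sets used in Corollary 5.4 follow directly from the two node types described in this section. A minimal Python sketch is given here; the function name `cut_set`, the dictionary representation of levels and fanouts, and the small example circuit are assumptions for illustration.

```python
def cut_set(i, level, fanout):
    """Cut set at level i: nodes at level i plus overpass nodes.

    level:  dict mapping node -> L(node)
    fanout: dict mapping node -> iterable of fanout nodes
    """
    cut = set()
    for n, ln in level.items():
        if ln == i:
            cut.add(n)                      # type 1: node at level i
        elif ln < i:
            fo_levels = {level[m] for m in fanout.get(n, ())}
            # type 2: overpass node, i.e. it has a fanout above level i
            # and no fanout node at level i
            if any(l > i for l in fo_levels) and i not in fo_levels:
                cut.add(n)
    return cut

# Hypothetical circuit: a and b are PIs, c = f(b), d = f(a, c).
level = {"a": 0, "b": 0, "c": 1, "d": 2}
fanout = {"a": ["d"], "b": ["c"], "c": ["d"], "d": []}
cuts = {i: cut_set(i, level, fanout) for i in range(3)}
```

In this example node a skips level 1 on its way to d, so it appears in the level-1 cut set as an overpass node alongside c.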
AN EFFICIENT IMPLEMENTATION
Since our scheme does not pose any restrictions on where and how new wires should be added, the checking process cannot be limited to a specific part of the circuit. As a result, a naïve implementation would result in unacceptable runtimes. Three key techniques for an efficient implementation are outlined in this section.
Processing when necessary
Let i be the minimum of L(sr(e)) over all edges e ∈ W_r. Since the distribution of SPFD_R at levels less than i is untouched, the checking process can start at level i. Initially, all source nodes of wires in W_a and all sink nodes of wires in W_r will be processed. A node at any given level will be processed if at least one of its fanin nodes has been processed and that fanin's global function has changed.
Fig. 2. SPFD_A and bipartition. All minterms of a 2-input gate are organized into a graph. Different gate functions bipartition the graph in different ways, as seen in a) and b). Thus, they produce different SPFD_A, shown by thick edges.
Implementable node functions
During the SPFD computation in Algorithm 1, SPFD_R is computed backwards. However, during the checking process, we are trying to propagate ∪_{u ∈ FI(n)} SPFD_A(u) forward across node n. Different node functions will produce different SPFD_A(n) sets. Consider the graph in Figure 2, which represents the local SPFD_A at a node with 2 inputs. The bipartition in Figure 2a, representing OR, propagates only (00,01), (00,10), (00,11) but not (01,10), (10,11), (01,11). However, XOR, the bipartition in Figure 2b, propagates (00,01), (00,10), (11,01), (11,10).
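The pairs listed for OR and XOR can be reproduced with a few lines of Python. The helper `local_spfd_a` is a hypothetical name; it returns every pair of input minterms that a given gate function places on opposite sides of its bipartition, with minterms written as bit strings.

```python
from itertools import combinations, product

def local_spfd_a(f, k=2):
    """Pairs of input minterms that the k-input gate function f separates."""
    minterms = ["".join(map(str, t)) for t in product((0, 1), repeat=k)]
    value = {m: f(*map(int, m)) for m in minterms}
    return {frozenset(p) for p in combinations(minterms, 2)
            if value[p[0]] != value[p[1]]}

or_pairs = local_spfd_a(lambda a, b: a | b)
xor_pairs = local_spfd_a(lambda a, b: a ^ b)
```

OR separates only the three pairs that contain 00, whereas XOR separates four pairs but cannot separate 00 from 11.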
At a given node n, we have to partition in such a way that
away from w r . This problem is formulated as a maximum cut graph bipartitioning problem in which a graph of
, their weights are set to be higher than the others to ensure they will be cut and then distinguished by the new node function. As there are 2^k nodes in a graph representing a k-input LUT, there are 2^(2^k) ways to bipartition the graph. Therefore, exhaustive search is inefficient even for k = 4. In practice, a graph representing SPFD_R(n) may consist of many components, each of which is a bipartite graph. The new node function must maintain the bipartiteness of these components. Therefore, each component can be treated as one node, which reduces both the maximum number of nodes and the number of ways to bipartition them.
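To make the size of the search space concrete, the following Python sketch counts the bipartitions of a k-input LUT and performs the exhaustive weighted search that the text describes as inefficient. It is a brute-force illustration only, not the authors' implementation; the names `num_functions` and `best_bipartition` and the example weights are hypothetical.

```python
from itertools import product

def num_functions(k):
    """Number of ways to bipartition the 2^k minterms of a k-input LUT."""
    return 2 ** (2 ** k)

def best_bipartition(k, weighted_pairs):
    """Exhaustively pick the assignment that maximizes the weight of cut pairs.

    weighted_pairs: dict mapping (minterm_a, minterm_b) -> weight.
    Returns (assignment dict minterm -> 0/1, total cut weight).
    """
    minterms = list(product((0, 1), repeat=k))
    best, best_w = None, -1
    for bits in product((0, 1), repeat=len(minterms)):
        side = dict(zip(minterms, bits))
        w = sum(wt for (a, b), wt in weighted_pairs.items() if side[a] != side[b])
        if w > best_w:
            best, best_w = side, w
    return best, best_w

# Hypothetical weights: the pair (11, 00) is given a higher weight than the rest.
pairs = {((0, 0), (0, 1)): 1, ((0, 0), (1, 0)): 1, ((1, 1), (0, 0)): 10}
assignment, weight = best_bipartition(2, pairs)
count_k4 = num_functions(4)
```

For k = 2 the loop visits 16 assignments; for k = 4 it would visit 65536, which is why merging each bipartite component into a single node is attractive.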
Early correctness declaration
for all nodes n_i of a cut set, it is tempting to declare that the circuit functionality is correct by invoking Corollary 5.4 and treating the nodes in the cut set as artificial POs. However, it has been noted that if the functions of some nodes change, SPFD_R of their transitive fanouts may become non-bipartite, which is not implementable by any single-output binary function [9].
A simple example demonstrating the phenomenon is shown in Figure 3. Pairs of 3-tuples and pairs of 2-tuples represent global and local SPFDs, respectively. The SPFD at the output of Gate 2 contains two ISFs, and they are distributed to the fanin wires of Gate 2 as shown. If Gate 1 changes from OR to XOR, it is still able to distinguish (001,011) assigned to the gate. However, the encoding at the input of Gate 2 changes. As can be seen, the required assignment tries to distinguish between 11 and 01 in one ISF while forcing 11 and 01 to be on the same side in the other ISF. Graphically, this encoding makes the local SPFD at Gate 2 non-bipartite, which is not implementable by any single-output binary function.
However, if the new global function at a node n remains the same as, or becomes the inverse of, the original global function before wire replacement, SPFD_A(n) will remain the same. Thus, the entries in the column corresponding to node n of the local function truth table at a node n_k ∈ FO(n) will either remain the same or all flip, neither of which destroys the bipartiteness of the local function at n_k; this is not the case for the example in Figure 3. Furthermore, g(n_k) and SPFD_A(n_k) will remain the same. As a result, if at cut set i, g(n_i) equals g_org(n_i) or its complement ¬g_org(n_i) for all nodes in the cut set, the circuit can be declared correct by invoking Corollary 5.4 with the nodes in the cut set as artificial POs.
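The early-declaration test described above amounts to comparing truth tables at a cut set. A minimal Python sketch follows, assuming that global functions are stored as tuples of 0/1 outputs over all primary input vectors in a fixed order; the function name `can_declare_correct` and the example values are hypothetical, and a practical implementation would more likely compare BDDs.

```python
def can_declare_correct(cut_nodes, g_new, g_org):
    """True if every node in the cut set keeps its original global function
    or implements exactly its complement.

    g_new, g_org: dict mapping node -> tuple of 0/1 truth-table outputs.
    """
    def same_or_inverse(a, b):
        return a == b or all(x != y for x, y in zip(a, b))
    return all(same_or_inverse(g_new[n], g_org[n]) for n in cut_nodes)

g_org = {"n1": (0, 1, 1, 1), "n2": (0, 0, 0, 1)}
inverted = {"n1": (1, 0, 0, 0), "n2": (0, 0, 0, 1)}   # n1 inverted, n2 unchanged
changed = {"n1": (0, 1, 1, 0), "n2": (0, 0, 0, 1)}    # n1 changed from OR to XOR
ok_inverted = can_declare_correct(["n1", "n2"], inverted, g_org)
ok_changed = can_declare_correct(["n1", "n2"], changed, g_org)
```

When the check fails, as in the OR-to-XOR case, the checking process simply has to continue toward the POs rather than stop at this cut set.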
EXPERIMENTAL RESULTS
Rewiring can be used in various, diverse applications. In this work, we do not attempt to investigate the rewiring ability for any application in particular. Therefore, our experiments are directed toward revealing the capability of rewiring independent of any specific application by considering all possible replacement wires.
First, the process to select candidate wires will be discussed. After that, rewiring ability will be compared against that of an ATPG approach. Finally, rewiring ability using more than one wire will be reported.
Choosing replacement wires
We assume that a node is a 4-input LUT and that it can accept at most one new wire, even if it has more than one empty pin. This assumption should not severely limit rewiring ability, as virtually no node has 3 free inputs and only some nodes have more than 1 free pin. For a given wire w_r to be removed, SPFD_R(w_r) will be removed from TFO(sk(w_r)). These nodes will be used as possible targets for new wires.
The previous approaches search for wires with suitable SPFD_A. However, node functions, and thus their SPFD_A, are derived without provisions for rewiring. Therefore, the number of candidates is limited. Consider the example in Figure 2. Let SPFD_R(w_r) = (11, 00). Assume that node n implements the XOR function, whose SPFD_A(n) is shown in Figure 2b, with SPFD_R(n) = (00, 01), (00, 10). According to previous approaches, node n is not a candidate for rewiring. However, if the node function is changed to OR, SPFD_A(n) can satisfy all of (00, 01), (00, 10) and (11, 00). Thus, a node n_s is a possible source for a new wire, wire(n_s, n_a), where
The circuit depth does not increase.
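The XOR-versus-OR example above can be verified by enumerating all 2-input node functions and keeping those that separate every required pair. This is a sketch for illustration; `functions_satisfying` is a hypothetical helper, and each function is written as its outputs for the inputs 00, 01, 10, 11 in that order.

```python
from itertools import product

def functions_satisfying(required_pairs, k=2):
    """All k-input functions (as output tuples) that separate every required pair."""
    minterms = list(product((0, 1), repeat=k))
    hits = []
    for bits in product((0, 1), repeat=len(minterms)):
        f = dict(zip(minterms, bits))
        if all(f[a] != f[b] for a, b in required_pairs):
            hits.append(bits)
    return hits

# SPFD_R(n) = (00,01), (00,10) plus the pair (11,00) taken over from w_r.
required = [((0, 0), (0, 1)), ((0, 0), (1, 0)), ((1, 1), (0, 0))]
candidates = functions_satisfying(required)
```

Only OR and its complement NOR survive, while XOR is rejected because it places 00 and 11 on the same side. This matches the observation that changing the node function can turn a rejected node into a rewiring candidate.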
Rewiring ability using none or one wire
In this experiment, the rewiring ability of our scheme is compared against that of other techniques. Table 1 shows the rewiring ability of different approaches. The circuits used in the experiment and their numbers of nodes and wires are shown in Columns 1, 2, and 3, respectively. During SPFD computation, BDD sizes can grow exponentially (also observed in [10]). Thus, although our checking process is quite efficient, only medium-size circuits can be handled in the current implementation. However, the technique proposed in [10] can speed up SPFD computation by at least 23 times, which will make our techniques applicable to large circuits.
ATPG-based rewiring ability (quoted from [3]) is shown in Column 4. The best known rewiring ability of SPFD-based rewiring was reported in [3]. The order of SPFD distribution at Step 7 of Algorithm 1 affects rewiring ability, as it dictates the concentration of SPFD_R on a wire. This observation has been used to improve rewiring ability [11] as well as power reduction for FPGAs [5]. Hence, Table 1 is not meant to be a direct comparison between our scheme and that of [3], as the SPFD distribution of [3] is not known. Furthermore, in [3], the rewiring was declared feasible as soon as SPFD_R of a target node covered the one before wire removal. Thus, some nodes may not be implementable (see Section 6.3 for an example). We therefore quote the results as a pseudo upper bound on rewiring ability in Column 6.
Our implementation uses CUDD (a BDD package) to represent functions and SPFDs. As replacement wires are observed to be close to the removed wire [3] , a circuit is partitioned into clusters of 60 nodes using hMetis, similar to [3] . As partitioning might limit rewiring ability especially for wires next to partition lines, we ran our scheme in multiple passes. The wires with successful rewiring are not considered again in subsequent passes. To reduce the effect of partitioning, wires crossing a partition line are discouraged 
