Efficient circuit partitioning is becoming more important as the size of modern circuits increases. Conventionally, circuit partitioning is solved by modeling a circuit as a hypergraph so that graph algorithms can be applied. However, there is room for further improvement even on optimum hypergraph partitioning results if logic information can be applied for perturbation. In this paper, we present a multi-way partitioning framework that couples any high-quality hypergraph partitioner with a novel logic perturbation technique based on Graph-Based Alternative Wires (GBAW) to further improve already excellent partitioning results. We performed experiments on 2-, 3-, 4-, and 5-way partitionings for circuits of various sizes from the MCNC benchmarks, using the state-of-the-art hMetis-Kway to obtain high-quality initial solutions. Our experiments showed that this partitioning approach can achieve a further reduction of about 15% in cut size for 2-way partitioning with an area penalty of only 0.33%. These results demonstrate the effectiveness of this new partitioning technique.
Introduction
Traditionally, circuit partitioning is done by simply modeling the circuit as a graph (or hypergraph). Graph partitioning problems are known to be NP-hard [1]. A comprehensive survey [2] has presented the recent directions of partitioning. Commonly used partitioning algorithms can be categorized into three classes. The first class strictly abides by the modeling graph, with no attempt to change the graph. High quality results have been reported by several algorithms in this class, including iterative improvement based [1, 3], clustering based [4], and spectrum (eigenvector) based [5, 6] algorithms. The second class of algorithms may modify the graph through node replications [7, 8]. Improvement is achieved by sacrificing some area due to node replications. These two classes both perform the partitioning task on the graph without considering the logic function of the circuit. The third class [9, 10, 11] couples the graph domain (nodes and their connections) and the logic domain (the function performed by each node). The price of the improved partitioning results is an expensive computational cost [10, 11].
Recently, many research works on multi-level partitioning have been proposed [12, 13, 14, 15]. The general idea behind multi-level partitioning is to first cluster the whole problem to reduce its size, then apply a well-known graph domain partitioner on the coarsened graph to get a good initial solution. The graph is then unclustered and a suitable partitioning refinement algorithm is applied to adjust the cut edges between partitions. The quality and the runtime of multi-level partitioning are very encouraging. In particular, Karypis and Kumar [15] proposed a partitioner called hMetis-Kway. It first coarsens the hypergraph, then recursively bisects the graph into k parts, and finally uncoarsens the hypergraph with refinement algorithms. More recent research works [16, 17], in comparison with hMetis-Kway, showed that the solution by hMetis-Kway is of such high quality that the cut size cannot be further reduced greatly.
Alternative wiring (rewiring) is the technique of adding single or multiple redundant wires or gates to a circuit such that other wires or gates become redundant and thus removable. This logic domain technique has been widely used for solving many logic level and physical level design problems [9, 18, 19, 20]. Circuit performance can be improved by removing a wire on the critical path and adding its alternative wire elsewhere. Circuit routability can also be improved by substituting an unroutable wire in a congested area by a routable alternative wire in some other part of the circuit. The cut size of a partition can be reduced by replacing the wires crossing the cut line. Figure 1 illustrates how rewiring can be used to further improve an already optimum partition result obtained by a typical graph domain partitioning algorithm. The global optimum partition result in the graph domain, with a cut size of 3, is shown in Figure 1(a). However, if we apply the logic domain rewiring technique to replace a target wire (thick wire) crossing the cut line by its alternative wire (dotted wire), the cut size can be reduced to 2 without any area increase, as shown in Figure 1(b). From this example, we can see that rewiring can be applied to partitioning to improve even upon an optimum solution in the graph domain.
To investigate the possibility of perturbing the circuit without applying any Boolean operations, minimal circuit structures yielding rewiring patterns have been studied [21]. Based on benchmark circuits, we observe that the nearest existing alternative wire is quite close to its target wire. As a result, instead of applying the ATPG-based logic implications repeatedly for the same pattern, the Graph-Based Alternative Wire (GBAW) technique [21] employs a more efficient graph pattern matching operation to locate alternative wires. The basic idea of GBAW is to match the sub-circuit with some "pre-specified" patterns. Rewiring by GBAW requires no logic implication or redundancy check, hence it runs very fast. Besides the alternative wires close to the target wire that are found from those small "pre-specified" patterns, distant alternative wires can also be located by propagating the matchings in a cascading way.
To expand the optimization space, we applied the notion of coupling the graph and logic domains in our GBAW-Partitioner (GP). In the graph domain, we chose the well-known Fiduccia-Mattheyses (FM) partitioning algorithm [3] as the iterative move-based engine for its simplicity. In the logic domain, we applied an augmented GBAW, which is able to locate more 2-local alternative wires, as a greedily guided perturbation engine. In our experiments, near optimum partition results were first obtained from the pure graph domain partitioner hMetis-Kway. The coupled graph and logic domain optimization by the GP engine then followed. Note that our logic perturbation process can be coupled with any powerful graph domain partitioning tool, and GBAW itself is able to handle patterns with multiple-input gates. We tested this partitioning flow for 2-, 3-, 4-, and 5-way partitionings on various MCNC benchmarks ranging from small to fairly large circuits. The results showed that such a coupled graph-logic domain partitioning approach can further reduce the cut size effectively with small CPU overhead, and that it is a very promising direction for circuit partitioning.
This paper is organized as follows. The background and definitions are introduced in Section 2. Section 3 gives a brief introduction to the Graph-Based Alternative Wire technique. Section 4 presents the details of repartitioning by rewiring. Experimental results are presented in Section 5. Conclusions are drawn in Section 6.
Background and Definitions
A combinational circuit can be represented by a directed acyclic graph (DAG) whose vertices correspond to the primary inputs (PI), primary outputs (PO), and internal gates of the circuit. PIs and POs are nodes that have only outgoing edges and only incoming edges, respectively. An internal node has at least two incoming edges and one outgoing edge and is associated with a Boolean function. Inverters are not considered internal nodes; they are treated as the polarity of edges during logic domain perturbation. In a Boolean network, the in-degree of node $y$, denoted by $d^{-}(y)$, is defined as the number of edges entering $y$. The out-degree of node $y$, denoted by $d^{+}(y)$, is defined as the number of edges leaving $y$. We also define a node $y$ by a triplet $(op, d^{-}(y), d^{+}(y))$, where $op$ is the Boolean operator of $y$, which can be AND, OR, NAND, or NOR.
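The node triplet defined above can be illustrated with a minimal sketch. The function name `node_triplets`, the edge-list input format, and the two-gate example circuit are assumptions made for illustration only; they are not taken from the GBAW implementation.

```python
from collections import defaultdict

def node_triplets(edges, ops):
    """Return {gate: (op, in_degree, out_degree)} for the internal gates.

    edges: iterable of (source, sink) pairs of the circuit DAG.
    ops:   {gate_name: operator}; nodes absent from ops are PIs or POs.
    (Both input formats are hypothetical choices for this sketch.)
    """
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    for src, dst in edges:
        outdeg[src] += 1   # edge leaves src
        indeg[dst] += 1    # edge enters dst
    return {g: (op, indeg[g], outdeg[g]) for g, op in ops.items()}

# Hypothetical circuit: a, b, c are PIs; z is the PO.
edges = [("a", "g1"), ("b", "g1"), ("g1", "g2"), ("c", "g2"), ("g2", "z")]
ops = {"g1": "AND", "g2": "OR"}
triplets = node_triplets(edges, ops)
print(triplets)
```

Here each gate has two incoming edges and one outgoing edge, so `g1` maps to `("AND", 2, 1)` and `g2` to `("OR", 2, 1)`.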
We use a graph configuration $D$ to map the logic function from a Boolean network $G$. For each node $n_i$ in a sub-network $S$ of network $G$, $n_i$ is mapped to a triplet $(op, i_1, i_2)$ in $D$, where $op$ denotes the operator representing the Boolean function of $n_i$ and $i_1$, $i_2$ are non-negative integers. All edges inside $S$ are preserved, while the edges outside $S$ are omitted in $D$. In most cases, $i_1$ equals $d^{-}(n_i)$ and $i_2$ equals $d^{+}(n_i)$. An element of a triplet $(op, d^{-}(y), d^{+}(y))$ can also be a don't care. For the first element, don't care means any operator. For the other elements, don't care can be any positive integer. We use a configuration to denote a minimal pattern containing both the target wire and its alternative wire. The mapping is illustrated in Figure 2, where $S$ is a sub-network of $G$, and $D_1$ and $D_2$ are two mappable configurations of $S$. A k-local pattern denotes a minimal sub-graph in which the distance between the alternative wire and its target wire is k. The distance between two wires is defined as the difference of the maximum path lengths from any primary input to each of the wires. A wire is defined as a 2-point connection between a pair of source and sink nodes. When a larger circuit is partitioned into two sub-circuits, we define the wires crossing the partitioning cut line as cut wires. We also define a cut net as a hyperedge connecting partitions and the cut cost as the number of partitions that the hyperedge connects.
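The definitions of cut wires and cut cost can be made concrete with a short sketch. It assumes each net is given as a source node plus a list of sink nodes, and that the total cut cost is the sum over all cut nets of the number of blocks each one connects; the function name `cut_metrics` and the toy 3-way example are hypothetical.

```python
def cut_metrics(nets, part):
    """Count cut wires and total cut cost of a partition.

    nets: {net_name: (source, [sinks])}, one hyperedge per net (assumed format).
    part: {node: block index}.
    """
    cut_wires = 0
    cut_cost = 0
    for source, sinks in nets.values():
        blocks = {part[source]} | {part[s] for s in sinks}
        if len(blocks) > 1:           # the hyperedge is a cut net
            cut_cost += len(blocks)   # number of partitions it connects
        # each source-to-sink connection is one 2-point wire
        cut_wires += sum(1 for s in sinks if part[s] != part[source])
    return cut_wires, cut_cost

# Hypothetical 3-way example with one node per block.
nets = {"n1": ("g1", ["g2", "g3"]), "n2": ("g2", ["g3"])}
part = {"g1": 0, "g2": 1, "g3": 2}
wires_cut, cost = cut_metrics(nets, part)
print(wires_cut, cost)
```

In this example net `n1` spans three blocks and `n2` spans two, so the sketch reports 3 cut wires and a cut cost of 5.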
Graph-Based Alternative Wire Technique (GBAW)
Graph-Based Alternative Wire (GBAW) is a newly proposed and efficient rewiring technique. It models a circuit as a directed acyclic graph (DAG) and searches for alternative wires by checking graph matching between local sub-networks and the pre-specified minimal sub-graph configurations. A configuration is a minimal circuit pattern containing alternative wires within a given distance. Experiments showed that the number of all such local minimal sub-graphs is limited. Most of the alternative wires are located topologically "near" to their target wires. It has been shown that about 96% of the closest alternative wires are only 2 edges distant from their target wires. When a sub-network matches a pattern, GBAW can quickly determine the target wire and the corresponding alternative wires. Obviously, if $w_r$ is an alternative wire of $w_t$, then $w_t$ is also an alternative wire of $w_r$. Both $w_t$ and $w_r$ are present in a pattern, but only one of them exists in a sub-network. In [21], it has been shown that by using GBAW as a random perturbation engine, a competitive logic optimization result is obtained compared to RAMBO while the runtime is greatly reduced.
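As a rough illustration of the matching step, the sketch below compares a single node triplet against a pattern triplet that may contain don't-care elements. It covers only the per-node test, not the full sub-graph matching that GBAW performs, and the names `DONT_CARE` and `triplet_matches` are hypothetical.

```python
DONT_CARE = None  # hypothetical marker for a don't-care element

def triplet_matches(node, pattern):
    """Check one node triplet (op, in_degree, out_degree) against a pattern.

    A don't-care operator matches any operator; a don't-care degree
    matches any positive integer.
    """
    op, indeg, outdeg = node
    p_op, p_in, p_out = pattern
    op_ok = p_op is DONT_CARE or p_op == op
    in_ok = (indeg > 0) if p_in is DONT_CARE else (p_in == indeg)
    out_ok = (outdeg > 0) if p_out is DONT_CARE else (p_out == outdeg)
    return op_ok and in_ok and out_ok

print(triplet_matches(("AND", 2, 1), ("AND", DONT_CARE, 1)))  # True
print(triplet_matches(("AND", 2, 1), ("OR", 2, 1)))           # False
```

A complete matcher would additionally have to check that the edges inside the sub-network agree with the edges of the configuration.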
There are 0-local, 1-local, and 2-local patterns in GBAW. In this paper, we apply an augmented GBAW, a much extended version of the GBAW presented in [21], to identify the alternative wires of a given circuit more effectively for repartitioning. Figure 3 shows some new 2-local patterns used in GBAW, with the target wire and its alternative wire shown as the thick line and the dotted line, respectively. The positions of the target wire and the alternative wire can be swapped. GBAW can find the alternative wire of a target wire within a limited distance, and it can also locate distant alternative wires by waveform propagations. This paper applies GBAW as the perturbation engine in the logic domain.
Figure 3. Some new 2-local patterns in GBAW
There are more than 40 different patterns in the implementation of GBAW. GBAW handles the case of adding one wire and removing another, as well as the cases of adding one AND, OR, NAND, or NOR gate to remove one target wire. It also handles the cases of simultaneously adding two wires and removing two other wires.
Partitioning using Alternative Wiring
Assume that one pin is used in a partition for a net. The objective of a multi-way partition is essentially to minimize the number of pins required to connect all partitions. Since some wires may have alternative wires, the cut size can be reduced by replacing cut wires with alternative wires that are not cut wires. The rewiring process also produces a new circuit graph, which in turn helps the graph domain partitioning process escape from local minima.
A perturbation refers to the replacement of a target wire by its alternative wires. Figure 4 illustrates the gains of various perturbations in a circuit. Thick lines represent the target wires and dotted lines represent their alternative wires. As shown in the example, a perturbation may have a positive, zero, or negative gain.
Figure 4. Perturbations and gains
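The gain of a perturbation can be sketched as the number of cut wires before the replacement minus the number after it. The sketch below assumes wires are simple (source, sink) pairs and measures gain in cut wires only; the toy wire list and the candidate alternative wires are hypothetical and are not checked for logical equivalence.

```python
def count_cut_wires(wires, part):
    """Number of 2-point wires whose endpoints lie in different blocks."""
    return sum(1 for u, v in wires if part[u] != part[v])

def perturbation_gain(wires, part, target, alternative):
    """Cut-wire reduction obtained by replacing target with alternative."""
    before = count_cut_wires(wires, part)
    after_wires = [w for w in wires if w != target] + [alternative]
    return before - count_cut_wires(after_wires, part)

# Hypothetical 2-way example: block 0 holds a, b, c; block 1 holds d.
part = {"a": 0, "b": 0, "c": 0, "d": 1}
wires = [("a", "c"), ("b", "d"), ("c", "d")]

positive = perturbation_gain(wires, part, ("b", "d"), ("b", "c"))
zero = perturbation_gain(wires, part, ("b", "d"), ("a", "d"))
negative = perturbation_gain(wires, part, ("a", "c"), ("a", "d"))
print(positive, zero, negative)
```

The three calls reproduce the three situations named above: a cut wire replaced by an uncut one (gain 1), a cut wire replaced by another cut wire (gain 0), and an uncut wire replaced by a cut one (gain -1).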
We use the hMetis-Kway partitioning tool to provide a fast and near optimum solution as our initial partition. We select the well-known FM partitioning algorithm [3] as our graph domain partitioner for its simplicity and efficiency. In fact, we can apply any other graph domain partitioner. Then we apply our rewiring technique, GBAW, to perform logic perturbations aiming for further improvements. Figure 5 gives the algorithm of GP.
During the perturbation process of GP, only cut wires are selected as target wires for perturbations. We first randomly select a cut wire as the target wire. Then, GBAW is used to find the alternative wire set SWa of the target wire. Finally, among the wire set SWa, the alternative wire with the highest gain is selected for perturbation. When the SWa of the target cut wire is empty, GP may randomly select another cut wire for another trial. The number of iterations is set by m, the number of trials is limited to t, and k is the limit on the number of perturbations. These limits bound the search in order to improve performance. Some hill-climbing perturbations with negative gain are allowed, which increases the chance of obtaining better solutions. By integrating GBAW into GP, our partitioner can locate nearly all the alternative wires of circuits with multi-input gates.

    Algorithm GP (best_partition, m, k, t) {
        search_limit = 0;
        n_perturbations = 0;
        curr_partition = best_partition;
        last_partition = best_partition;
        for i = 1 to m {
            while ((n_perturbations < k) && (exit == false)) {
                search_limit = 0;
                while (search_limit < t) {
                    search_limit++;

Figure 5. Algorithm GP (fragment)
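The perturbation loop described above can be approximated by the following sketch. It is an assumption-laden reading of the text, not the authors' Algorithm GP: the partition is kept fixed (the FM refinement step is omitted), the cut size is measured in cut wires, and `find_alternatives` is a caller-supplied stand-in for GBAW. All function names and the toy alternative-wire table are hypothetical.

```python
import random

def count_cut_wires(wires, part):
    return sum(1 for u, v in wires if part[u] != part[v])

def replace_wire(wires, target, alt):
    return [alt if w == target else w for w in wires]

def gp_sketch(wires, part, find_alternatives, m, k, t, seed=0):
    """m iterations, at most k perturbations each, at most t trials per target."""
    rng = random.Random(seed)
    best = list(wires)
    best_cut = count_cut_wires(best, part)
    for _iteration in range(m):
        curr = list(best)
        n_perturbations = 0
        while n_perturbations < k:
            target, alts = None, []
            for _trial in range(t):
                cut = [w for w in curr if part[w[0]] != part[w[1]]]
                if not cut:
                    break
                target = rng.choice(cut)            # random cut wire as target
                alts = find_alternatives(curr, target)
                if alts:                            # non-empty SWa found
                    break
            if not alts:
                break                               # no usable target within t trials
            # highest gain = smallest resulting cut; negative gain is accepted
            best_alt = min(
                alts,
                key=lambda a: count_cut_wires(replace_wire(curr, target, a), part),
            )
            curr = replace_wire(curr, target, best_alt)
            n_perturbations += 1
            cut_now = count_cut_wires(curr, part)
            if cut_now < best_cut:                  # remember the best circuit seen
                best, best_cut = list(curr), cut_now
    return best, best_cut

# Hypothetical example: block 0 holds a, b, c; block 1 holds d, e.
part = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
wires = [("a", "c"), ("b", "d"), ("c", "e")]
alt_table = {("b", "d"): [("a", "d"), ("b", "c")], ("c", "e"): [("d", "e")]}
best_wires, best_cut = gp_sketch(
    wires, part, lambda ws, w: alt_table.get(w, []), m=2, k=60, t=50
)
print(best_cut)
```

In this toy case both cut wires have an uncut alternative, so the sketch reduces the number of cut wires from 2 to 0 regardless of the order in which targets are drawn.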
Experimental Results
The algorithm GBAW-Partitioner (GP) was implemented in C, and the experiments were conducted on a Sun Enterprise E4500 workstation with 8 GB of memory in a single-processor configuration for MCNC benchmarks of various sizes. The large benchmark circuits used in ISPD98 [22] are not applicable to our experiments because they lack logic domain information. Since the rewiring engine GBAW [21] is able to locate alternative wires of multiple-input gates as well as 2-input gates, the circuit simplification by SIS [23] performed in [9, 24] can be skipped.
In our experiments, we set the tolerance of area imbalance of GP to 20% of the average area of each partitioned block. Therefore, the maximal ratios are 40%:60% and 16%:24% in 2-way and 5-way partitioning, respectively. To explore the graph domain optimization, hMetis-Kway [15] was first run for each circuit, yielding a nearly optimum partition solution. The next step was to select the best solution and apply GP for logic perturbation, with k = 60 and t = 50, to further improve the quality of the partitioning. Tables 1 to 4 list the experimental results for the 2- to 5-way partitionings, respectively. Column "area" lists the area of the sub-circuit in terms of the number of gates. Column "#lits" lists the total number of literals of the partitioned circuits. From the results, the area penalties for 2- to 5-way partitioning are 0.33%, 0.53%, 0.61%, and 0.71%, respectively. Column "cut cost" lists the total number of cut pins obtained for all partitioned blocks. Column "cut wire" lists the number of cut wires of the partitioning. Column "cpu" lists the CPU time (in seconds). From the results, we can see that applying logic perturbation can further reduce the cut size of the good partitionings produced by a purely graph domain based partitioner. The total number of literals is slightly increased because of the area cost of the gates added during perturbation. We obtained 14.48%, 10.18%, 9.08%, and 9.24% reductions in cut size for the 2-, 3-, 4-, and 5-way partitionings, respectively. The last two columns show that the quality and CPU time of GP are both much better than the results obtained by simply running FM 250 times.
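The balance ratios quoted at the start of this section follow directly from the 20% tolerance, as the short calculation below shows. The function name `block_area_bounds` is hypothetical, and block areas are expressed as fractions of the total circuit area.

```python
def block_area_bounds(num_blocks, tolerance=0.20):
    """Smallest and largest allowed block area as fractions of total area."""
    average = 1.0 / num_blocks
    return average * (1 - tolerance), average * (1 + tolerance)

for kway in (2, 3, 4, 5):
    lo, hi = block_area_bounds(kway)
    print(f"{kway}-way: {lo:.1%} to {hi:.1%}")
```

For 2-way partitioning the average block holds 50% of the area, so a 20% tolerance gives 40% to 60%; for 5-way partitioning the average is 20%, giving 16% to 24%.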
Conclusion and Future Work
In this paper, we proposed a scheme that couples graph domain and logic domain partitioners to explore a larger optimization space for circuit partitioning. The scheme is shown to be very efficient in terms of CPU expenditure and is also quite capable of bringing further improvements to the good partition results produced by the state-of-the-art partitioner hMetis-Kway. Because the scheme does not rely on integration with RAMBO, the input circuits are no longer limited to circuits of simple 2-input gates. We conducted experiments on 29 MCNC benchmark circuits for 2- to 5-way partitionings and obtained further cut size reductions of 9.08% to 14.48% upon the good results produced by hMetis-Kway. Moreover, the partition quality and CPU expenditure of GP are both better than those of simply running FM 250 times. As GP can be integrated with any newly developed powerful graph partitioner, this partitioning scheme should be very practical and useful for many partitioning tasks.
