Quantum-dot Cellular Automata (QCA) is a novel computing mechanism that represents binary information through the spatial distribution of electron charge within chemical molecules. QCA layout is currently restricted to a single layer with a very limited number of wire crossings permitted. Thus, wire crossing minimization is crucial for improving the manufacturability of QCA circuits. In this article, we present the first QCA channel routing algorithm for wire crossing minimization. Our channel routing algorithm reduces crossings where the Left-Edge First (LEF), Yoshimura and Kuh (YK), and topologically-based algorithms fail to do so.
INTRODUCTION
One approach to computing at the nano-scale is the quantum-dot cellular automata (QCA) concept [1], which represents information in a binary fashion but replaces a current switch with a cell having a bi-stable charge configuration. In this article, we present the first QCA channel routing algorithm for wire crossing minimization. Wire crossing rarely becomes an issue in CMOS circuits because multiple routing layers and vias are available. QCA routing, however, is currently restricted to a single layer with a very limited number of wire crossings permitted. Thus, wire crossing minimization is crucial for improving the manufacturability of QCA layouts. Our approach is to insert crossing edges into the VCG (Vertical Constraint Graph) to enforce additional vertical relations that reduce wire crossings. We formulate the Weighted Minimum Feedback Edge Set Problem and provide a heuristic solution for it, so that cycles can be removed effectively from the VCG after crossing edge insertion. As a result, we achieve wire crossing counts that are very close to the theoretical lower bound and significantly outperform conventional channel routing algorithms. Recent CAD work on QCA circuits includes [5, 10, 8, 2, 9].
PROBLEM FORMULATION
In a QCA layout, the I/O terminals are located on the top and bottom boundaries of each logic block. In addition, the signal flow among the logic QCA cells needs to be unidirectional from the input boundary to the output boundary. This unidirectional flow is caused by QCA's clocking scheme, in which an electric field E created by the underlying CMOS wires propagates in one direction within each logic block. Thus, we assume that the placement, which is the input to our QCA routing problem, is done based on a K-Layered Bipartite Graph (KLBG). A directed graph G(V, E) is a k-layered bipartite graph iff (i) V is divided into k disjoint layers, (ii) each layer p is assigned a level, denoted lev(p), and (iii) for every edge e = (x, y), lev(y) = lev(x) + 1. Finally, the goal of QCA channel routing is to complete the connections among the terminals in every two adjacent layers of the KLBG so that the total wire crossing count and the channel width are minimized.
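Condition (iii) of the KLBG definition can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which each vertex is mapped directly to the level of its layer; the vertex names and the `is_k_layered_bipartite` helper are illustrative and not part of the original work.

```python
def is_k_layered_bipartite(level, edges):
    """Return True if every directed edge (x, y) satisfies lev(y) = lev(x) + 1.

    level: dict mapping each vertex to the level of its layer (assumed encoding).
    edges: iterable of directed edges (x, y).
    """
    return all(level[y] == level[x] + 1 for x, y in edges)


# Three layers: inputs a, b at level 0; cells g1, g2 at level 1; output o at level 2.
level = {"a": 0, "b": 0, "g1": 1, "g2": 1, "o": 2}
good_edges = [("a", "g1"), ("b", "g1"), ("b", "g2"), ("g1", "o"), ("g2", "o")]
bad_edges = good_edges + [("a", "o")]  # skips a layer, so condition (iii) fails

print(is_k_layered_bipartite(level, good_edges))  # True
print(is_k_layered_bipartite(level, bad_edges))   # False
```

Because every edge joins two adjacent layers, each pair of adjacent layers defines one channel to be routed.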
Many channel routing algorithms exist which attempt to minimize the resulting channel width or crosstalk. Yoshimura and Kuh presented a graph-based method (YK), using a VCG for channel width minimization [11] . The YK algorithm attempts to optimize the channel width, but an ideal QCA algorithm would optimize both the channel width and the number of crossings that occur between nets.
QCA CHANNEL ROUTING

Overview
Each QCA channel is routed in five steps. First, the channel terminals are traversed from left to right, and each contact edge is added to the VCG (step 1). Since the VCG must be acyclic, cycles among the contact edges are removed by empty column insertion and vertical doglegs (step 2). Once the VCG is acyclic, crossing edges are added to reduce wire crossing (step 3). If the crossing edges introduce cycles into the VCG, crossing edges are selectively removed until the VCG is once again acyclic (step 4). Lastly, we use the Left-Edge First (LEF) algorithm [6] to assign tracks to the nets and finish routing the channel (step 5). This entire process is applied to each channel of the KLBG from top to bottom to route the entire circuit.
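Step 1 can be illustrated with a short sketch. It assumes a hypothetical channel encoding in which `top` and `bottom` list the net id at each column (0 for an empty terminal), and it uses the usual convention that a contact edge (t, b) means net t must lie above net b; neither the encoding nor the `build_vcg` name comes from the original text.

```python
def build_vcg(top, bottom):
    """Collect contact edges (upper_net, lower_net) column by column.

    top, bottom: equal-length lists of net ids per column; 0 marks an empty terminal.
    """
    edges = set()
    for t, b in zip(top, bottom):   # left-to-right traversal of the columns
        if t and b and t != b:
            edges.add((t, b))       # net t must be routed above net b
    return edges


top    = [1, 2, 0, 3, 2]
bottom = [2, 0, 1, 1, 3]
print(sorted(build_vcg(top, bottom)))  # [(1, 2), (2, 3), (3, 1)]
```

The example channel was chosen so that its contact edges form the cycle 1, 2, 3, 1, which is exactly the situation that step 2 has to resolve before routing can continue.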
Contact Cycle Removal: Doglegging
Note that the initial VCGs built from the input KLBG may contain cycles, making it impossible to route the channels. A directed graph contains a cycle if and only if a DFS (Depth First Search) encounters a backward edge. This means that removing all backward edges ensures that the VCG contains no cycles [12]. To remove a cycle, a net on the cycle is split into multiple subnets and a vertical dogleg is inserted [3]. Each of these subnets is created and inserted into the VCG to remove these cycles.
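The backward-edge test used above can be sketched as a standard three-state DFS. This is a generic illustration rather than the authors' implementation, and the `back_edges` name is hypothetical.

```python
def back_edges(vertices, edges):
    """Return the edges classified as backward by one DFS over a directed graph."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
    state = {v: 0 for v in vertices}  # 0 = unvisited, 1 = on DFS stack, 2 = finished
    found = []

    def visit(u):
        state[u] = 1
        for v in adj[u]:
            if state[v] == 1:         # v is an ancestor of u: backward edge
                found.append((u, v))
            elif state[v] == 0:
                visit(v)
        state[u] = 2

    for v in vertices:
        if state[v] == 0:
            visit(v)
    return found


print(back_edges([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))  # [(3, 1)]
print(back_edges([1, 2, 3], [(1, 2), (2, 3), (1, 3)]))  # []
```

An empty result means the VCG is acyclic; each reported edge identifies a cycle on which a net can be split by a dogleg.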
Crossing Edge Insertion
We note that wire crossing can only occur between nets that overlap horizontally. Also, the crossing count between any pair of horizontally overlapping nets is determined by their vertical ordering. For example, consider Net 1 and Net 2, which overlap horizontally in Figure 1. If Net 1 is placed below Net 2, as in Figure 1(a), then they will cross twice: a crossing for each top terminal of Net 1 and for each bottom terminal of Net 2 located in this region of overlap. Likewise, if Net 2 is placed below Net 1, as in Figure 1(c), they will cross once: a crossing for each bottom terminal of Net 1 and for each top terminal of Net 2 located in this region of overlap. A similar relationship exists between Nets 1 and 3, but for the opposite order.
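The counting rule in this paragraph can be written down directly. The sketch below assumes a hypothetical net encoding (lists of column positions for top and bottom terminals) and counts terminals inside the inclusive overlap of the two horizontal spans; the terminal positions are made up so that the two orderings give 2 and 1 crossings, as in the Net 1 / Net 2 example, and are not taken from Figure 1.

```python
def span(net):
    """Horizontal extent (leftmost column, rightmost column) of a net."""
    cols = net["top"] + net["bottom"]
    return min(cols), max(cols)


def crossings(lower, upper):
    """Crossings between two nets when `lower` is routed on a track below `upper`."""
    lo = max(span(lower)[0], span(upper)[0])
    hi = min(span(lower)[1], span(upper)[1])
    if lo > hi:                       # no horizontal overlap, so no crossing
        return 0
    inside = lambda cols: sum(lo <= c <= hi for c in cols)
    # Top terminals of the lower net descend through the upper net's track;
    # bottom terminals of the upper net descend through the lower net's track.
    return inside(lower["top"]) + inside(upper["bottom"])


net1 = {"top": [6], "bottom": [1, 4]}
net2 = {"top": [8], "bottom": [3]}
print(crossings(net1, net2))  # Net 1 below Net 2 -> 2
print(crossings(net2, net1))  # Net 2 below Net 1 -> 1
```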
To reduce crossings, we add crossing edges between nets in the VCG to enforce the vertical relationship that results in the minimum number of crossings between them. An original VCG is shown in Figure 1(b), and a modified version is shown in Figure 1. Each crossing edge is assigned a weight equal to the number of wire crossings saved by orienting the nets according to that vertical orientation. The theoretical lower bound of wire crossing for each circuit is equal to the sum of the minimum values of wire crossing for each pair of nets. In the same way, the upper bound of crossing is equal to the sum of the maximum values for each pair. These lower and upper bounds were calculated and appear in Table 1.

Footnote 1: The additional crossings caused by doglegs are a small portion of the overall crossings, as revealed in Table 2, so we do not currently attempt to reduce them.

Figure 2: Crossing edges may create cycles. backEdge will break the two cycles in a single pass (by removing (n5, n1)), whereas cycleBreaker takes two passes: first to remove (n2, n3) and next to remove (n4, n5).
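The edge weights and the two bounds follow from the per-pair crossing counts. The sketch below assumes those counts are already known and stored per pair as (count if a is below b, count if b is below a); the pair values and the `crossing_edges_and_bounds` helper are illustrative. An edge (u, v) is taken to mean that u lies above v, matching the contact-edge convention.

```python
def crossing_edges_and_bounds(pair_counts):
    """Derive weighted crossing edges plus lower/upper crossing bounds.

    pair_counts: dict (a, b) -> (crossings if a is below b, crossings if b is below a).
    Returns (edges, lower, upper), where edges maps (upper_net, lower_net) -> weight.
    """
    edges, lower, upper = {}, 0, 0
    for (a, b), (a_below_b, b_below_a) in pair_counts.items():
        lower += min(a_below_b, b_below_a)
        upper += max(a_below_b, b_below_a)
        if a_below_b < b_below_a:
            edges[(b, a)] = b_below_a - a_below_b   # prefer b above a
        elif b_below_a < a_below_b:
            edges[(a, b)] = a_below_b - b_below_a   # prefer a above b
    return edges, lower, upper


pair_counts = {("n1", "n2"): (2, 1), ("n1", "n3"): (0, 3), ("n2", "n3"): (1, 1)}
print(crossing_edges_and_bounds(pair_counts))
# ({('n1', 'n2'): 1, ('n3', 'n1'): 3}, 2, 6)
```

A pair whose two orderings cost the same receives no crossing edge, since neither orientation saves a crossing.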
Optimal Cycle Breaking
If the VCG is not acyclic after the addition of crossing edges, then the cycles must be removed before track assignment can take place. An illustration is shown in Figure 2. We remove crossing edges until the VCG is once again acyclic. The weight of a crossing edge represents its wire crossing reduction, so our goal is to minimize the total weight of the crossing edges removed to make the VCG acyclic. This problem is formally defined as follows: Weighted Minimum Feedback Edge Set Problem: Given an edge-weighted directed graph G(V, E) with cycles, the goal is to find a set of edges A ⊂ E with the minimum total weight such that G(V, E−A) is acyclic. Since the non-weighted version of the Minimum Feedback Edge Set Problem is NP-complete [4], and it is the special case of the weighted problem in which all weights are equal, the weighted version is NP-complete as well.
We note that a simple heuristic exists to solve the weighted minimum feedback edge set problem: perform DFS once and remove all backward crossing edges found. This will guarantee that the remaining VCG is acyclic. A major shortcoming of this approach, named backEdge, is that it ignores the edge weights and may produce a feedback edge set with a high total weight. An illustration is shown in Figure 2, where backEdge removes (n5, n1) and loses a wire crossing reduction of 6 to break the two cycles. A better approach is to remove (n2, n3) and (n4, n5) to break the cycles and lose a wire crossing reduction of only 3. Note that this second approach, named cycleBreaker, requires us to enumerate all cycles in the VCG. DFS, however, cannot provide all cycles in a single pass, as illustrated in Figure 2: it can detect either (1, 2, 3, 5, 1) or (1, 2, 4, 5, 1), but not both in a single pass. Therefore, this cycle-by-cycle approach, which searches for the minimum-weighted crossing edge, takes several iterations to remove all cycles.
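The backEdge heuristic can be sketched on a graph shaped like the two cycles described for Figure 2. Several details here are assumptions made for illustration: which edges are crossing edges, the individual weights of (n2, n3) and (n4, n5) (only their sum of 3 is given in the text), and the placeholder weight of 1 on contact edges. The sketch also relies on every backward edge in this example being a crossing edge.

```python
def back_edge_removal(vertices, weight, crossing):
    """Run one DFS and return the backward crossing edges plus their total weight.

    weight: dict mapping every VCG edge (u, v) to its weight.
    crossing: set of crossing edges (the only edges that may be removed).
    """
    adj = {v: [] for v in vertices}
    for u, v in sorted(weight):
        adj[u].append(v)
    state = {v: 0 for v in vertices}  # 0 = unvisited, 1 = on DFS stack, 2 = finished
    removed = []

    def visit(u):
        state[u] = 1
        for v in adj[u]:
            if state[v] == 1 and (u, v) in crossing:
                removed.append((u, v))   # backward crossing edge
            elif state[v] == 0:
                visit(v)
        state[u] = 2

    for v in vertices:
        if state[v] == 0:
            visit(v)
    return removed, sum(weight[e] for e in removed)


weight = {(1, 2): 1, (3, 5): 1, (2, 4): 1,   # contact edges (placeholder weights)
          (5, 1): 6, (2, 3): 1, (4, 5): 2}   # crossing edges (assumed split of 3)
crossing = {(5, 1), (2, 3), (4, 5)}
print(back_edge_removal([1, 2, 3, 4, 5], weight, crossing))  # ([(5, 1)], 6)
```

The single DFS finds only the backward edge (5, 1), so both cycles are broken at a cost of 6, whereas removing (2, 3) and (4, 5) would cost 3.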
An overview of the cycleBreaker algorithm, our heuristic for the weighted minimum feedback edge set problem, is shown in Figure 3.

Figure 3:

cycleBreaker
    while (the VCG contains cycles)
        perform depth-first traversal(VCG);
        C = set of detected cycles in VCG;
        while (|C| > 0)
            for (each crossing edge e ∈ C)
                setKey(e, C);
            e = edge with the highest key in C;
            remove e from VCG;
            remove all cycles containing e from C;

setKey(e, C)
    key(e) = count = 0;
    for (each cycle c ∈ C that contains e)
        count = count + 1;
        for (each edge e' ∈ c, e' ≠ e)
            key(e) = key(e) + w(e');
    key(e) = count · key(e)/w(e);

As the cycleBreaker function traverses the VCG in depth-first order, it collects the cycles detected. When this traversal is complete, it will have added some cycles from each strongly connected subgraph in the VCG to the collection of cycles that must be broken. Because the VCG was acyclic until the crossing edges were added, each cycle must depend on the presence of a crossing edge. Once these cycles are found, the crossing edges of each are keyed according to the setKey function in Figure 3. We use the following function to determine the edge key value:
$$\mathrm{key}(e) = |c(e)| \cdot \frac{\sum_{e' \in c(e)} \mathrm{weight}(e')}{\mathrm{weight}(e)}$$
where c(e) denotes the cycles that contain e. The edge with the highest key is removed from the VCG, and the cycles that contain this edge are removed from the collection of detected cycles. The edges are rekeyed and further cycles are removed until no more detected cycles remain unbroken. In this manner, small-weighted edges that occur in many cycles with other higher-weighted edges are more likely to be targeted for removal to break cycles. As discussed previously, cycleBreaker runs multiple iterations of "DFS plus cycle removal" until all cycles are removed from the VCG.
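The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' C++ implementation: each DFS pass records one cycle per backward edge, contact edges carry a placeholder weight of 1, the crossing/contact split and the individual weights of (2, 3) and (4, 5) are invented to match the totals quoted for Figure 2, and the key sums the weights of the other edges in each cycle, as in the setKey pseudocode.

```python
def find_cycles(vertices, edges):
    """One DFS pass; return one cycle (as a list of edges) per backward edge found."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
    state = {v: 0 for v in vertices}  # 0 = unvisited, 1 = on DFS stack, 2 = finished
    path, cycles = [], []

    def visit(u):
        state[u] = 1
        path.append(u)
        for v in adj[u]:
            if state[v] == 1:                          # backward edge closes a cycle
                nodes = path[path.index(v):] + [v]
                cycles.append(list(zip(nodes, nodes[1:])))
            elif state[v] == 0:
                visit(v)
        path.pop()
        state[u] = 2

    for v in vertices:
        if state[v] == 0:
            visit(v)
    return cycles


def set_key(e, cycles, w):
    """count * (sum of weights of the other edges in cycles containing e) / w(e)."""
    containing = [c for c in cycles if e in c]
    total = sum(w[f] for c in containing for f in c if f != e)
    return len(containing) * total / w[e]


def cycle_breaker(vertices, w, crossing):
    """Remove crossing edges until no cycle remains; return them in removal order."""
    edges, removed = set(w), []
    while True:
        cycles = find_cycles(vertices, sorted(edges))
        if not cycles:
            return removed
        while cycles:
            candidates = sorted({e for c in cycles for e in c if e in crossing})
            best = max(candidates, key=lambda e: set_key(e, cycles, w))
            edges.remove(best)
            removed.append(best)
            cycles = [c for c in cycles if best not in c]


w = {(1, 2): 1, (3, 5): 1, (2, 4): 1,   # contact edges (placeholder weights)
     (5, 1): 6, (2, 3): 1, (4, 5): 2}   # crossing edges (assumed split of 3)
removed = cycle_breaker([1, 2, 3, 4, 5], w, {(5, 1), (2, 3), (4, 5)})
print(removed, sum(w[e] for e in removed))  # [(2, 3), (4, 5)] 3
```

On this example the first pass detects only the cycle (1, 2, 3, 5, 1) and removes (2, 3); the second pass detects (1, 2, 4, 5, 1) and removes (4, 5), for a total loss of 3.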
Wire Crossing vs Channel Width
When the VCG is finally acyclic, the last step is to assign each net to a unique routing track. We note that the net-merging algorithm used in the YK router [11] may not be a good option for wire crossing minimization. Crossing edges could potentially be added to the VCG between each pair of nets. Thus, the longest path length could be increased by these new crossing edges, and merging opportunities could be reduced. Therefore, the task of the YK algorithm is much harder with these crossing edges present than without them. These more complicated vertical relationships between overlapping nets often require a larger number of tracks than a solution that does not attempt to minimize the resulting crossing count. Thus, the routing track assignment is done by the LEF algorithm. This is shown in the previous Figure 1.

Table 1 (rows recovered from the source):

circuit     #cha   x-col   x-low    x-upp    t-low
cmb            6      27       0        0       42
decod          3      50       0       55       64
pm1            5      40       1      113       25
i1             6      38      15      250      105
sct            4      76      71     4806       48
my_adder      18      62       0        0       72
i2             5     201    1577     5410      118
x4             4     303    2126   113984      215
k2             3     215   80832   967007      279
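For the track assignment step, the sketch below shows the classic constrained left-edge scheme: tracks are filled from top to bottom, and a net may enter the current track only if all of its VCG predecessors already sit on higher tracks and it does not overlap a net already placed on that track. Whether the authors' LEF variant lets non-overlapping nets share a track in this way is not stated in the text; the net spans, the `left_edge` name, and the single VCG edge are illustrative assumptions.

```python
def left_edge(spans, vcg_edges):
    """Assign nets to tracks (0 = topmost) with a constrained left-edge pass per track.

    spans: dict net -> (left column, right column).
    vcg_edges: set of (upper_net, lower_net) pairs; the VCG must be acyclic.
    """
    preds = {n: set() for n in spans}
    for u, v in vcg_edges:
        preds[v].add(u)
    track_of, track = {}, 0
    while len(track_of) < len(spans):
        above = set(track_of)              # nets already on strictly higher tracks
        right_end = None
        for n in sorted(spans, key=lambda n: spans[n]):
            if n in track_of or not preds[n] <= above:
                continue                   # already placed, or a predecessor is pending
            if right_end is None or spans[n][0] > right_end:
                track_of[n] = track
                right_end = spans[n][1]
        if right_end is None:
            raise ValueError("VCG contains a cycle; no net can be placed")
        track += 1
    return track_of


spans = {"n1": (1, 4), "n2": (5, 8), "n3": (2, 6)}
print(left_edge(spans, {("n3", "n1")}))  # {'n3': 0, 'n1': 1, 'n2': 1}
```

Every added crossing edge is one more predecessor constraint in this loop, which is why enforcing low-crossing orderings tends to push nets onto additional tracks.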
EXPERIMENTAL RESULTS
Our algorithms were implemented in C++/STL, compiled with gcc v2.96, and run on a 746 MHz Pentium III machine. Fourteen combinational circuits were selected from the ISCAS benchmark suite [7], converted to KLBGs, and placed with an advanced QCA placement algorithm. Table 1 shows the resulting KLBGs, which are ready for channel routing. We report the total number of channels (#cha) and the maximum column count among all channels (x-col) in each circuit. The theoretical wire crossing lower and upper bounds (x-low and x-upp) for each KLBG were calculated and are also shown. We also show the lower bound of the channel width (the longest path from source to sink in the VCG) before crossing edge insertion (t-low).
These circuits were routed by four methods, and the results are shown in Table 2. The results of our cycleBreaker algorithm are listed as "CB" in Table 2. It was compared to the LEF algorithm (LEF), to the algorithm presented by Yoshimura and Kuh (YK) [11], and to the backEdge algorithm (BE) discussed in Section 3.4. Table 3 shows the ratio of the resulting wire crossing count from each method to the theoretical lower bound of wire crossing for each circuit. The average LEF result has wire crossing 19x above the lower bound. The average result of the YK method is 18x above the lower bound, and the average BE result is almost 17x above the lower bound. In comparison, the method presented here results in average wire crossing only 13% above the lower bound, providing superior wire crossing minimization. Note that our channel routing method requires only 7.5% more runtime than the simplest alternative method, LEF. Table 4 shows the trade-off between wire crossing count and track count. While our channel routing method produces wire crossing much closer to the theoretical lower bound, the resulting channel width is almost 6x greater than the theoretical lower bound. The LEF, YK, and BE methods produce channels about 40-80% wider than the lower bound. As traditional channel routing algorithms attempt to reduce circuit area, they increase the wire crossing significantly. Conversely, while the channel routing algorithm presented here may increase the channel width, it comes much closer to the theoretical crossing lower bound than conventional methods.
CONCLUSION
QCA promises numerous benefits over today's CMOS circuits. However, due to its technological constraints, the circuit has to be laid out on a single layer. In this paper, we present new algorithms for wire crossing reduction in channel routing. Our channel routing algorithm reduces the crossing count to very near the theoretical lower bound, as it targets the absolute minimum number of wire crossings to route a particular circuit. 
