Consider a set of nets given by horizontal segments $S = \{s_1, s_2, \ldots, s_n\}$ and a set of tracks $T = \{t_1, t_2, \ldots, t_k\}$ in a channel. A track assignment is an assignment of the nets to the tracks such that no two nets assigned to the same track overlap. One important goal is to find a track assignment with the minimum number of tracks such that the signal interference between nets assigned to neighboring tracks is minimized. This problem is called crosstalk minimization. For a given track assignment with $k$ tracks, crosstalk can be reduced by finding another track assignment for $S$ with $k$ tracks (i.e., by permuting tracks). However, considering all possible permutations requires exponential time. For a general cost function as the crosstalk measure, the problem is NP-hard. Several heuristic approaches have been presented previously. In this paper, we consider special instances of the crosstalk-minimization problem where the cost function depends only on the length of the segments that run in parallel and all pairs of segments intersect. An algorithm solving this problem in $O(n \log n)$ time is presented. An extension to instances with a more general cost function of switching activity and mixed-signal sensitivity, which reduces both crosstalk and power consumption, is also presented.
I. Introduction
As CMOS technology advances into deep submicron, some of the nets interconnecting modules can be so long that their wire resistance, $R_{wire}$, is comparable to the resistance of the driver. As interconnection delays become more and more dominant in the overall delay, performance-driven (for timing and low power) routing becomes important, in addition to the traditional goals of area and interconnect minimization. The coupling capacitance, $C_{coupling}$, between minimum-pitch wires on a 0.25 µm CMOS IC can account for over 80% of the total capacitance of a wire. This makes interconnect crosstalk noise one of the biggest challenges in VLSI design today. When $R_{wire} C_{coupling} = t_{rising}$, there is a noise problem. The increase in crosstalk holds not only for coupling via the interconnect but also for crosstalk via the substrate. Crosstalk is also proportional to the power consumption in CMOS circuits; thus minimizing crosstalk also leads to minimizing power consumption. A battery-operated multimedia system in 0.1 µm technology will require 40 NiMH battery cells, which makes the system no longer portable.
Timing, routability, size, and power problems are not discovered until after detailed routing. Therefore, deep-submicron designs require crosstalk-free detailed routing. The major contribution of this paper is a very fast optimal algorithm that improves routing quality in terms of crosstalk for special instances of the channel routing problem. The proposed basic algorithmic framework can be extended to incorporate other performance issues for practical use. Our work is significant and innovative since it is the first attempt at identifying instances of crosstalk reduction in the channel routing problem that can be solved in near-linear, $O(n \log n)$, time.
A. Crosstalk Measure
Crosstalk is a capacitive and inductive interference caused by the noise voltage developed on signal lines when nearby lines change state. It can occur in the following manner, as in Figure 1. When the voltage of an aggressor signal changes while the voltage of a victim signal is in transition through the high-gain region of a receiver circuit, a ripple or a small glitch is formed in the victim signal if the switching directions are opposite to each other. This high-gain crosstalk can affect a circuit in many different ways. In a static CMOS design, this glitch increases the wire delay considerably. It also increases the receiving circuit delay since it alters the effective input rise/fall time. This causes speed-related logic errors. Sometimes, this glitch can propagate through many fanout gates. If the signal feeds into a dynamic logic gate, it can discharge the stored charge during the evaluation phase and cause a logic error.
The crosstalk is a function of the separation between signal lines and the linear distance over which signal lines run parallel to each other. To maximize system speed, crosstalk must be reduced to levels where no extra time is required for the signal to stabilize. Signals that are highly sensitive to crosstalk, such as clocks, should be isolated by reference planes from signals on other layers and/or by extra-wide line-to-line spacing.
Signals are grouped into categories based on waveshape control requirements, crosstalk limits, or other special requirements. For example, clocks, strobes, buses, memory address, data, chip-select, and write lines, as well as asynchronous signals, ECL signals, and analog signals, have special routing requirements as follows [1].
Data buses: Crosstalk between buses tends to be data-pattern-sensitive and is worse when all address or data lines change in the same direction at the same time.
Memory address and data signals: Cross-coupling between any combination of memory input or output signals, as well as crosstalk between nearby unrelated signals, can be disruptive and must be guarded against. Excessive cross-coupling from memory data lines to address lines during read cycles can result in positive feedback that degrades the response time of the memory device and in extreme cases can cause unstable oscillatory operation. Data-to-address-line cross-coupling may upset address lines sufficiently to cause writes to incorrect memory locations. Memory write lines require the highest possible degree of isolation from crosstalk.
Low-Sensitivity = digital nets that directly affect the analog part in some cells, such as control signals.
Non-Sensitivity = the most noise-insensitive nets, such as pure digital nets.
The crosstalk between two interconnection wires also depends on the frequencies (i.e., signal activities) of the signals traveling on the wires.
Once the electrical designer has established the electrical requirements or limits of each signal category based on system performance requirements and error budgets, the requirements must be translated into specific mechanical requirements for the routing engineers. For example, twisting signal lines with ground lines provides some shielding effect, minimizing the chance of coupling into adjacent wiring. As VLSI technology improves, more layers become available for routing. As a result, there is a need for multilayer routing schemes that reduce the die size (and thus the average wirelength) and crosstalk (e.g., the layer assignment of a 7-layer process is as follows: layers 1 and 2, local routing; layers 3 and 4, inter-block routing; layers 5 and 6, power and ground; top layer, clock).
A number of papers have been published on crosstalk issues: mixed analog and digital applications [2]; crosstalk-minimum layer assignment [3], [4]; a spacing algorithm [5]; and a channel routing enhancement that considers crosstalk through linear programming of track permutations [6]. A channel routing algorithm was also addressed in [7], and [8] proposed an optimal algorithm for the problem of minimizing crosstalk between vertical wires in 3-layer VHV channel routing. Recently, a crosstalk-minimum rainbow k-color permutation based on left-edge dynamic programming was presented in [9].
B. Power Measure
Power consumption in CMOS circuits is due to three sources: dynamic power consumption due to charging and discharging of capacitive loads during output transitions at gates, the short-circuit current which flows during output transitions at gates, and the leakage current. The last two factors can be made sufficiently small with proper device and circuit design techniques; thus, research in design automation for low power has focused on minimizing the first factor, the dynamic power consumption.
The average dynamic power $P_{av}$ consumed by a CMOS gate is given below, where $C_l$ is the load capacitance at the output of the node, $V_{dd}$ is the supply voltage, $T_{cycle}$ is the global clock period, $N$ is the number of transitions of the gate output per clock cycle, $C_g$ is the load capacitance due to the input capacitance of fanout gates, and $C_w$ is the load capacitance due to the interconnection tree formed between the driver and its fanout gates.
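For reference, the textbook form of the average dynamic power that matches the symbols defined above can be written as follows. This is a sketch under two assumptions that the text does not state explicitly: the usual factor of 1/2, and a load capacitance that is the sum of the gate and wire contributions.

```latex
P_{av} = \frac{1}{2}\,\frac{C_l\, V_{dd}^{2}}{T_{cycle}}\, N,
\qquad C_l = C_g + C_w
```

With $V_{dd}$ and $T_{cycle}$ fixed, only the products of capacitance and switching activity can be changed by synthesis or layout.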
Logic synthesis for low power attempts to minimize $\sum_i C_{g_i} N_i$, whereas physical design for low power tries to minimize $\sum_i C_{w_i} N_i$. Here $C_{w_i}$ consists of $C_{x_i} + C_{s_i}$, where $C_{x_i}$ is the capacitance of net $i$ due to its crosstalk, and $C_{s_i}$ is the substrate capacitance of net $i$.
For low-power layout applications, power dissipation due to crosstalk is minimized by ensuring that wires carrying high-activity signals are placed sufficiently far from the other wires. Similarly, power dissipation due to substrate capacitance is proportional to the wirelength and its signal activity.
We need to minimize $C_{x_i} + C_{s_i}$ to minimize both crosstalk and power consumption. In this paper, we present an effective algorithm for the crosstalk minimization problem. An extension applied to a layout with minimum power consumption is also presented. This paper is organized as follows. We formulate the problem in Section 2. In Section 3, we present an optimal algorithm for the crosstalk minimization problem in the special case of channel routing. Sections 4 and 5 present experimental results and the conclusion, respectively.
II. Formulation of the Problems
A channel routing instance is given by a set of multi-terminal nets $N$ specified by the locations of their terminals on the two channel sides. In two-layer routing, the top layer is usually reserved for vertical wires and the bottom layer for horizontal wires. The general 2-layer crosstalk-minimum channel routing problem is known to be NP-complete.
The complexity of channel routing stems from the vertical constraints. The vertical constraints imply that two nets whose pins are in the same column of the channel cannot overlap vertically. Two-layer channels are usually dense and crosstalk-sensitive. Thus, it is crucial to attain the desired crosstalk minimization solution.
In the early 1990s, a third metal layer became feasible. Most current gate-array technologies use three layers for routing. For example, the Motorola 2900ETL, DEC's Alpha chip, and Intel's 486 chip used a three-metal-layer process, and the original Intel Pentium was also fabricated on a similar process. Three-layer routing algorithms can be classified into two main categories: the reserved layer model and the unreserved layer model. The reserved layer model can be further classified into the VHV model and the HVH model. Note that in VHV routing, the vertical constraints between nets no longer exist. Therefore, a channel height equal to the maximum density can always be realized using the Left-Edge Algorithm. Without vertical constraints, more nets are permutable in a channel. Thus, at the cost of one more layer, VHV routing is effective in terms of both area and crosstalk.
Some pairs of nets cannot be assigned to adjacent tracks because they might strongly interfere with each other. Note that we can reduce crosstalk by maximizing the track separation between pairs of nets with high crosstalk. Thus, we formulate the crosstalk minimization problem in the VHV model as follows.
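The claim that the maximum density can be realized without vertical constraints can be illustrated with a minimal sketch of a left-edge style assignment. The function names `left_edge` and `density`, and the representation of a net as a closed `(left, right)` pair, are assumptions made for this illustration and are not taken from the paper.

```python
def left_edge(intervals):
    """Assign closed intervals (left, right) to tracks, scanning by left edge."""
    tracks = []  # each track holds its intervals in left-to-right order
    for left, right in sorted(intervals):
        for track in tracks:
            # reuse a track if its last interval ends strictly before this one starts
            if track[-1][1] < left:
                track.append((left, right))
                break
        else:
            tracks.append([(left, right)])
    return tracks


def density(intervals):
    """Maximum number of intervals crossed by any vertical cut line."""
    points = {p for interval in intervals for p in interval}
    return max(sum(l <= x <= r for l, r in intervals) for x in points)


nets = [(0, 3), (1, 5), (4, 7), (6, 9), (2, 8)]
print(len(left_edge(nets)), density(nets))
```

In this example the number of tracks used equals the channel density, which is the behavior the paragraph above relies on.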
Definition 1: Given $k$ tracks $T = (t_1, t_2, \ldots, t_k)$ and $n$ intervals, i.e., horizontal segments of nets, $S = (s_1, s_2, \ldots, s_n)$, with $k \le n$, the crosstalk minimization problem is to find an assignment of the intervals to the tracks, i.e., $\pi : S \to T$, such that no two intervals assigned to the same track intersect, and the cost function
$$\sum_{(s_i, s_j) \in S} w_{i,j} \quad \text{subject to} \quad |\pi(s_i) - \pi(s_j)| = 1 \qquad (2)$$
is minimum, where $w_{i,j} = X_{ij} (N_i + N_j) A_{ij}$ and
$N_i$ (resp. $N_j$) = switching activity of net $i$ (resp. net $j$);
$A_{ij}$ = signal sensitivity (for mixed-signal interactions) between nets $i$ and $j$;
$X_{ij} = L_{ij} / D_{ij}$ = coupled noise between nets $i$ and $j$;
$L_{ij}$ = coupled length between horizontal wires of nets $i$ and $j$;
$D_{ij}$ = separation distance between horizontal wires of nets $i$ and $j$.
A special case of the crosstalk minimization problem where n = k is track permutation for crosstalk minimization.
Definition 2: Given an assignment of $S = (s_1, s_2, \ldots, s_k)$ to $T = (t_1, t_2, \ldots, t_k)$ such that each track $t_i$ is assigned exactly one net $s_j$, find a permutation of the nets $\pi(S) = (\pi(s_1), \pi(s_2), \ldots, \pi(s_k))$, i.e., $\pi(s_j)$ is the track number assigned to $s_j$, such that
$$\sum_{(s_i, s_j) \in S} w_{i,j} \quad \text{subject to} \quad |\pi(s_i) - \pi(s_j)| = 1 \qquad (3)$$
is minimum. We denote such an optimal ordering by $\pi_{opt}$.
Note that in the worst case the crosstalk is proportional to $(N_i + N_j)$. This occurs, for example, when all address or data lines change in the same direction at the same time.
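The cost in Definitions 1 and 2 only counts pairs of nets on neighboring tracks, which a short sketch makes concrete. The function names, the list-of-lists weight matrix, and the sample numbers below are illustrative assumptions, not part of the paper.

```python
def pair_weight(coupled_len, distance, act_i, act_j, sensitivity):
    """w_ij = X_ij * (N_i + N_j) * A_ij, with X_ij = L_ij / D_ij."""
    return (coupled_len / distance) * (act_i + act_j) * sensitivity


def permutation_cost(order, w):
    """Sum w[a][b] over nets a, b placed on neighboring tracks.

    order[t] is the net placed on track t; w is a symmetric weight matrix.
    """
    return sum(w[a][b] for a, b in zip(order, order[1:]))


# Hypothetical weights for three nets.
w = [[0, 4, 1],
     [4, 0, 2],
     [1, 2, 0]]
print(permutation_cost([0, 1, 2], w))  # nets 0-1 and 1-2 are neighbors
print(permutation_cost([0, 2, 1], w))  # nets 0-2 and 2-1 are neighbors
```

Moving net 2 between nets 0 and 1 separates the most strongly coupled pair, which is why the second ordering is cheaper.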
Let us identify horizontal segments of nets in the channel with intervals. Then an interval clique is a set of intervals whose corresponding interval graph is a clique. That is, when scanning the channel from left to right, we consider the sets of intervals intersecting a vertical cut-line, as depicted in Figure 2.
In the following, we are only concerned with crosstalk-minimum track permutation. In Section 3, we describe an algorithm considering only $X_{ij}$, called the first order model, in Section 3.1, and extend the algorithm to consider $X_{ij} (N_i + N_j) A_{ij}$, called the second order model, in Section 3.2.
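The left-to-right scan described above can be sketched as follows. This is a simple quadratic illustration, assuming closed intervals given as `(left, right)` pairs; the function name `cut_line_cliques` is hypothetical. It places a candidate cut line at every left endpoint and keeps only the maximal sets.

```python
def cut_line_cliques(intervals):
    """Return the maximal sets of interval indices crossed by one vertical cut line."""
    candidates = []
    # It suffices to try cut lines at left endpoints: the largest left end
    # of any clique lies inside every interval of that clique.
    for x in sorted({left for left, _ in intervals}):
        members = frozenset(
            i for i, (l, r) in enumerate(intervals) if l <= x <= r
        )
        candidates.append(members)
    maximal = []
    for c in candidates:
        # drop sets strictly contained in another candidate, and duplicates
        if not any(c < d for d in candidates) and c not in maximal:
            maximal.append(c)
    return [sorted(c) for c in maximal]


channel = [(0, 4), (1, 5), (2, 6), (7, 10), (8, 9)]
print(cut_line_cliques(channel))
```

For this sample channel the scan finds two interval cliques, one per group of mutually overlapping segments.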
Traveling Salesman Problem (TSP):
INSTANCE: Set $S$ of $n$ cities, distance $w(s_i, s_j) \in Z^+$ for each pair of cities $s_i, s_j \in S$, positive integer $B$.
QUESTION: Is there a tour (Hamiltonian cycle) of $S$, i.e., a permutation $\langle \pi(s_1), \pi(s_2), \ldots, \pi(s_n) \rangle$ of $S$, whose total edge weight is no greater than $B$?
The problem is NP-complete in general graphs. A brute-force algorithm generates $n!$ tours, and a dynamic programming algorithm uses $O(n^2 2^n)$ time.
TSP is a special case of crosstalk minimization for an interval clique, provided an arbitrary cost function is given. So crosstalk minimization with an arbitrarily chosen cost function is NP-hard, even in the interval clique case. The first order model we consider is easier not because of the restriction to interval cliques, but because of our restriction to a very special cost function, i.e., just the length of the interferences.
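The brute-force enumeration mentioned above can be sketched for the track permutation setting. Note one deliberate difference from TSP, made as an assumption for this illustration: a track order is an open path, so no edge closes the tour. The function name and the sample weights are hypothetical.

```python
from itertools import permutations


def brute_force_permutation(w):
    """Try all n! track orders; return (minimum cost, first order attaining it)."""
    best = None
    for order in permutations(range(len(w))):
        # cost counts only nets on neighboring tracks (open path, not a cycle)
        cost = sum(w[a][b] for a, b in zip(order, order[1:]))
        if best is None or cost < best[0]:
            best = (cost, order)
    return best


w = [[0, 4, 1],
     [4, 0, 2],
     [1, 2, 0]]
print(brute_force_permutation(w))
```

The factorial growth of the loop makes this usable only as a reference for very small instances.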
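The special cost function of the first order model, the length over which two segments run in parallel, is easy to compute from the interval endpoints. The following is a minimal sketch, assuming each segment is a `(left, right)` pair of x-coordinates; the helper names are assumptions for illustration.

```python
def coupled_length(a, b):
    """Length of the common x-range of two intervals (0 if they do not overlap)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def is_interval_clique(intervals):
    """All pairs intersect exactly when the largest left end does not
    lie beyond the smallest right end."""
    return max(l for l, _ in intervals) <= min(r for _, r in intervals)


segments = [(0, 6), (2, 9), (4, 7)]
print(is_interval_clique(segments))
print(coupled_length(segments[0], segments[1]))
```

Because every pair of segments in an interval clique shares the cut-line position, the coupled length of any pair in a clique is well defined and nonnegative.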
A. Algorithm on the First-Order Crosstalk Model
Consider an interval clique $S = \{s_1, s_2, \ldots, s_n\}$, where $s_i = (\ell_i, r_i)$, $\ell_i \le r_i$, and $\ell_i$ and $r_i$ represent the x-coordinates of the left and right end points of the interval $s_i$. The length $L(s)$ of an interval $s = (\ell, r)$ is defined as the quantity $|r - \ell|$. A simple heuristic for the problem of finding a minimum-crosstalk track assignment on interval cliques is to adapt a "greedy" algorithm.
Algorithm Greedy (Interval Clique):
Step 1: assigned = Null; unassigned = $S$; $s_0$ = a virtual segment corresponding to the top channel shore; $s_{n+1}$ = a virtual segment corresponding to the bottom channel shore;
Step 2: Select two segments $s_i$ and $s_j$ from unassigned such that $w_{ij}$ is the largest, and assign $s_i$ to $t_1$ and $s_j$ to $t_2$; assigned = $\{s_i, s_j\}$; unassigned = $S - \{s_i, s_j\}$;
Step 3: Select a segment $s_k$ such that the crosstalk gain, when segment $k$ is inserted (case 1) between 0 and $i$, or (case 2) between $i$ and $j$, or (case 3) between $j$ and $n+1$, $g(ij; k) = w_{ij} - (\,\cdots)$ …; assigned = assigned + $\{s_k\}$; unassigned = unassigned $- \{s_k\}$;
Step 4: Repeat Step 3 until all segments are inserted at the position with the most gain.
Fig. 2. Two Interval Cliques in a channel
Even though the approach generates an optimal solution in most instances, the time complexity of the Greedy Algorithm is $O(n^3)$, which may not be practical for large $n$. Also, we do not know yet whether the approach always yields an optimal solution. We show below that there exists a polynomial-time algorithm for the crosstalk-minimum track permutation problem on an interval clique.
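The greedy steps above can be sketched in code under explicit assumptions, since the gain expression is incomplete in the text. The sketch assumes that inserting net $k$ between neighbors $a$ and $b$ has gain $w_{ab} - (w_{ak} + w_{kb})$, that the virtual shore segments carry zero weight, and that every gap in the current order is a candidate position. The function name `greedy_order` is hypothetical, and at least two nets are required.

```python
def greedy_order(w):
    """Greedy insertion heuristic on a symmetric weight matrix w (n >= 2)."""
    n = len(w)
    # Step 2: seed the order with the pair of largest weight.
    i, j = max(((a, b) for a in range(n) for b in range(a + 1, n)),
               key=lambda p: w[p[0]][p[1]])
    order = [i, j]
    unassigned = set(range(n)) - {i, j}
    # Steps 3-4: repeatedly insert the net and position with the largest gain.
    while unassigned:
        best = None
        for k in sorted(unassigned):
            for pos in range(len(order) + 1):
                if pos == 0:                      # next to the top shore
                    gain = -w[k][order[0]]
                elif pos == len(order):           # next to the bottom shore
                    gain = -w[order[-1]][k]
                else:                             # between two placed nets
                    a, b = order[pos - 1], order[pos]
                    gain = w[a][b] - (w[a][k] + w[k][b])
                if best is None or gain > best[0]:
                    best = (gain, k, pos)
        _, k, pos = best
        order.insert(pos, k)
        unassigned.remove(k)
    return order


w = [[0, 4, 1],
     [4, 0, 2],
     [1, 2, 0]]
print(greedy_order(w))
```

Each round scans every unassigned net against every gap, which matches the cubic running time quoted for the heuristic.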
An interval clique can be partitioned into two subsets, the containment subset $S_{\in}$ and the monotone subset $S_{\prec}$, as follows.
Procedure Clique-Partition(input $S$; output $S_{\in}$, $S_{\prec}$)
Step 1: Consider a vertical cut-line that intersects all intervals in $S$, and for $s_i \in S$ denote by left($s_i$) the part of $s_i$ to the left of that cut line and by right($s_i$) the part of $s_i$ to the right of that cut line. Accordingly, partition the interval clique into two sets $S_{left}$ and $S_{right}$.
Step 2: …
Step 3: Apply the process in Step 2 to $S_{right}$.
Step 4: …
Step 2: Assignment. Let $T = \{t_1, t_2, \ldots, t_n\}$ be the set of tracks. Assign each $s_i \in S_1$ to a distinct even-numbered track $t_{2m}$, and each $s_i \in S_2$ to a distinct odd-numbered track $t_{2m+1}$.
Lemma 1 (Algorithm 1): Algorithm 1 generates a track permutation for a Containment Interval Clique with minimum crosstalk in $O(n \log n)$ time.
Proof: The track assignment generated by Algorithm 1 is an alternate LONG-SHORT sequence (refer to Figure 3). Let $n_{\in}$ be the number of intervals, indexed in nondecreasing order of length. When $n_{\in}$ is odd, the lower bound on crosstalk is $\sum_{i=1}^{\lfloor n_{\in}/2 \rfloor} 2L(s_i)$. Similarly, the lower bound on crosstalk when $n_{\in}$ is even is $\sum_{i=1}^{n_{\in}/2 - 1} 2L(s_i) + L(s_{n_{\in}/2})$. The crosstalk for the alternate LONG-SHORT sequence meets these lower bounds. The time complexity is dominated by sorting the interval lengths.
Corollary 2: Consider a track assignment generated by Algorithm 1 for a Containment Interval Clique $S_{\in}$. Then the crosstalk for a track assignment induced by a permutation on $S_1$ and $S_2$ respectively is again minimum.
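The alternate LONG-SHORT sequence used in the proof can be sketched as follows. The sketch assumes that in a containment clique the intervals are nested, so the coupled length of any pair equals the length of the shorter interval; the function names and the 0-based track positions are illustrative choices, not the paper's notation.

```python
def long_short_order(lengths):
    """Return interval indices from top track to bottom, alternating LONG and SHORT."""
    by_len = sorted(range(len(lengths)), key=lambda i: lengths[i])
    half = len(lengths) // 2
    shorts, longs = by_len[:half], by_len[half:]  # shorter half, longer half
    order = []
    for t in range(len(lengths)):
        # even positions take the next LONG interval, odd positions the next SHORT
        order.append(longs[t // 2] if t % 2 == 0 else shorts[t // 2])
    return order


def crosstalk(order, lengths):
    """Nested intervals: a neighboring pair couples over the shorter length."""
    return sum(min(lengths[a], lengths[b]) for a, b in zip(order, order[1:]))


lengths = [10, 8, 6, 4, 2]  # five nested intervals
order = long_short_order(lengths)
print(order, crosstalk(order, lengths))
```

For this odd-sized example the two shortest intervals each separate two long ones, so the cost is twice the sum of their lengths, in line with the lower bound in the proof. The sort dominates the running time.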
Consequently, we can resolve a vertical constraint (i.e., the case where two vertical wire segments overlap) without increasing the crosstalk, by exchanging two intervals having end points with the same x-coordinate.
Algorithm 2: Track permutation on a Monotone Interval Clique $S_{\prec}$
Step 1: Partition each interval in $S_{\prec}$ into two parts (left and right) using a vertical cut line that intersects all intervals in $S_{\prec}$. We denote the left part of an interval $s_i$ as left($s_i$) and the right part as right($s_i$). Let $S_{left}$ (resp. $S_{right}$) denote the set of intervals containing all left($s_i$) (resp. right($s_i$)). Then, both $S_{left}$ and $S_{right}$ are considered as Containment Interval Cliques.
Step 2: Apply Algorithm 1 to $S_{left}$.
Lemma 2 (Algorithm 2): Algorithm 2 generates a track permutation for a Monotone …
We can satisfy a vertical constraint (i.e., the constraint that two vertical wire segments are not allowed to overlap) by exchanging the two intervals that cause the vertical constraint.
Based on the above two lemmas, we have the following algorithm for a general Interval Clique.
Algorithm 3: Track permutation on an Interval Clique S
Step 1: Apply Clique-Partition($S$; $S_{\in}$, $S_{\prec}$).
Step 2: Apply Algorithm 1 to $S_{\in}$ and Algorithm 2 to $S_{\prec}$.
Step 3: Apply Merge-Clique.
Procedure Merge-Clique(input track assignments for $S_{\in}$ and $S_{\prec}$ respectively; output track assignment for $S$)
Case A: $|S_{\in}|$ = ODD:
Step 1: Identify $s_1, s_2, s_3$ …
Proof: Note that after clique-partition the number of intervals in $S_{\prec}$ is always even. Thus, we have three cases as in Figure 5. In Figure 5a, on the left-hand side of the vertical cut-line we have an alternate LONG-SHORT sequence, so this side is of no concern. On the right-hand side of the vertical cut-line, right($s_1$) and right($s_2$) are both LONG. Intervals $s_1$ and $s_2$ are chosen such that the crosstalk between $s_1$ and $s_2$ is minimized. Similarly, $s_3 \in S_{\prec}$ and $s_4 \in S_{\in}$ are chosen such that the crosstalk between $s_3$ and $s_4$ is minimized. The proof for Case 2 in Figure 5b is similar. In Figure 5c, the numbers of intervals in $S_{\in}$ and $S_{\prec}$ are both even. Thus, we find only LONG-SHORT and SHORT-SHORT sequences on both the left-hand and right-hand sides of the vertical cut-line, as in Figure 5c.
Theorem 1 (Algorithm 3): Algorithm 3 generates a track permutation for an Interval Clique with minimum crosstalk in $O(n \log n)$ time.
Proof: Finding optimal solutions for both $S_{\in}$ and $S_{\prec}$ individually takes $O(n \log n)$ time. The Merge-Clique operation takes $O(n)$ time.
B. Extension to the Second Order Crosstalk Model
As described in Section 2, crosstalk in the second order model is proportional to $X_{ij} (N_i + N_j) A_{ij}$. The algorithm for the second order model is as follows. We first partition the intervals into 4 interval groups as shown in Figure 6, and then apply the procedure for the first order model. Note that by partitioning the intervals as in Figure 6, the instance of the second order model consists of four smaller instances of the first order model. This is true because we no longer need to consider the signal sensitivity part. For each partitioned interval group, we apply Algorithm 1 or Algorithm 2 according to its type. Then, we apply Algorithm 3 to merge those four partitioned interval cliques. For brevity, we omit the detailed algorithm description.
Fig. 6. The Second-Order Crosstalk Modeling
III. Experimental Results
We implemented our algorithm (Algorithm 3) in C/C++ and ran it on a SUN Ultra-Sparc 2 and a Pentium-Pro machine. We compared our algorithm with the left-edge algorithm and with a brute-force method that generates an optimal solution by enumeration. Table 1 shows the results obtained by our algorithm, the left-edge algorithm, and the brute-force method for the track permutation problem on an interval clique. We ran each algorithm 10,000 times on randomly generated interval cliques. For the examples in Table 1 that have more than 9 tracks, the brute-force method could not produce a result because of its exponential running time. For the cases with fewer than 9 tracks, our algorithm generates the same result as the brute-force method. For all cases, the average crosstalk is about 30 …

Table 1. Columns: number of tracks (trs); average crosstalk, average wire length, and CPU seconds, each reported for the left-edge algorithm (L.E), our algorithm (Our), and the brute-force method (B.F).

IV. Conclusion
In this paper, we considered special instances of the crosstalk-minimization problem where the cost function depends only on the length of the segments that run in parallel and all pairs of segments intersect. An algorithm solving this problem in $O(n \log n)$ time was presented. An extension to instances with a more general cost function of switching activity and mixed-signal sensitivity, which reduces both crosstalk and power consumption, was also presented. The presented algorithm can be applied to performance-driven low-power channel routing in deep-submicron VLSI designs.
