Abstract-With process technology and functional integration advancing steadily, chips continue to grow in area while critical dimensions shrink. As a result, on-chip inductance has emerged as a factor whose effect on performance and signal integrity must be managed by chip designers and sometimes traded off against other performance parameters. In this paper, we cover several techniques for reducing on-chip inductance, which in turn improve timing predictability and reduce signal delay and crosstalk noise. We present experimental results obtained from simulations of a typical high-performance bus structure and a clock tree structure to examine the effectiveness of several of these inductance reduction techniques.
I. INTRODUCTION

In deep submicron (DSM) technologies, critical feature size continues to shrink, and over 100 million transistors can now be packed into a single die. The availability of many layers of low-resistance metal (Cu) interconnect makes routing of such complex chips possible, but demand for higher system performance reduces timing slacks and puts added constraints on timing accuracy and predictability. This makes the optimization of interconnect extremely difficult. Typically, performance is achieved by routing global interconnects on thick upper metal layers with wide metal lines to reduce resistance. To keep resistance low, top metal layer thickness has not scaled with newer technologies, which has led to an increase in coupling capacitance and has therefore created crosstalk problems [1]-[7].
Until recently, most extraction and delay analysis tools have been limited to RC networks, leaving an inherent unpredictability in the design process where inductive effects are suspected [6]-[9]. With the recognized significance of inductive effects and their impact on performance and signal integrity, several techniques have been proposed to deal with them. The most common of these techniques are shielding [10], where signal lines are interdigitated with Vdd or ground alternately in order to isolate signal lines from their neighboring signals, and buffer insertion [11], where buffers are inserted in long lines to reduce crosstalk noise and delay. Other techniques were originally used to reduce capacitive effects, such as widening metal lines [12], optimizing wire separation [13], [14], and net ordering [15], [16], where nets are ordered to reduce crosstalk noise. Shielding can be combined with net ordering to minimize both capacitive and inductive crosstalk noise [17]. Another technique makes use of the long-used twisted pair principle and adapts it to on-chip signal routing in a twisted bundle format [18]. Dedicated ground planes are another common method to reduce inductive effects, where the layers above and below the signal lines are dedicated to Vdd or ground [19]-[21].
Manuscript received December 30, 2001; revised April 19, 2002. This work was supported by the Advanced Technology Group at Synopsys, the Somerset Design Center of Motorola, the DARPA packaging program, and the Semiconductor Research Corporation.
Y. Massoud
In this paper, we use three-dimensional electromagnetic field solvers to analyze on-chip inductive effects and explore different methods to minimize these effects. In Section II, we evaluate the noise characteristics of crosstalk avoidance strategies including shielding, widening metal lines, increasing wire separation, buffer insertion, and differential signaling. RC models of the interconnect are analyzed first; inductive effects are then included to show that in 0.18 micron technology and beyond, inductance has a first order effect on crosstalk noise. Assuming the same set of timing constraints applies for all strategies, we compare the effectiveness of the different crosstalk avoidance strategies.
In Section III, we discuss the minimization of on-chip self inductance. We start by examining the inductance of a signal line sandwiched between ground return lines. We show that for integrated circuit interconnect operating below twenty-five gigahertz, it is the low frequency inductance that predicts performance. We then compare the performance of the sandwiched structure with that of two alternatives: using two dedicated ground planes, and interdigitating thinned signal lines with thinned ground lines.
II. MINIMIZING COUPLING INDUCTANCE
In this section, we discuss different strategies to reduce coupling inductance and inductive crosstalk. We used a high performance 8-bit data bus to examine and compare the different strategies. For our simulation and analysis, we used a major foundry's 0.18 µm process. The metal lines were implemented in metal 6, with all lines having a metal width of 3 µm and a metal to metal spacing of 1.5 µm, consistent with typical upper-level metal implementations of high performance global buses. The only exceptions are the test cases where the metal width or the metal spacing was intentionally varied as part of the experiment. In all experiments, we sandwiched the data bus between a Vdd line and a ground line, each 15 µm wide, to provide a return path for the current flowing in the bus.
In all test cases, we used simple buffers to implement the drivers and receivers in the standard cases, and a differential driver-receiver pair in the differential examples. In order to be consistent in our comparison between the various cases considered, we maintained the same timing constraint of a propagation delay of 0.35 ns from input to output. In every test case, we used the weakest drivers sufficient to meet that timing constraint, to make sure that the drivers are not themselves a source of noise. Also, we maintained an input capacitance of approximately 10 fF. All receivers were loaded with a moderate load of 100 fF, which is equivalent to ten standard loads. Finally, a supply voltage of 1.8 V was used in all the experiments. We modeled the interconnects with a distributed RLC model, where FastCap [22] was used to extract the interconnect capacitance and FastHenry [23] was used to extract both the resistance and the inductance of the interconnects. Both FastHenry and FastCap employ multipole-accelerated Method-of-Moments techniques [24], [25].
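As an illustration of what a distributed RLC model looks like in netlist form, the following minimal sketch splits one interconnect line into equal lumped sections and emits SPICE-style element lines. The function name, node names, and the total R, L, and C values are hypothetical placeholders, not values extracted in this work, and the sketch covers a single uncoupled line only; the mutual coupling between bus lines that FastHenry and FastCap provide is not represented.

```python
def rlc_ladder_netlist(r_total, l_total, c_total, n_segments, name="line"):
    """Return SPICE-style element lines for a uniform RLC ladder.

    Each of the n_segments sections carries 1/n of the total series
    resistance and inductance and 1/n of the total capacitance to ground.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    r_seg = r_total / n_segments
    l_seg = l_total / n_segments
    c_seg = c_total / n_segments
    lines = []
    for k in range(1, n_segments + 1):
        start = f"{name}_n{k - 1}"   # input node of this section
        mid = f"{name}_m{k}"         # node between series R and series L
        end = f"{name}_n{k}"         # output node of this section
        lines.append(f"R{name}{k} {start} {mid} {r_seg:.6g}")
        lines.append(f"L{name}{k} {mid} {end} {l_seg:.6g}")
        lines.append(f"C{name}{k} {end} 0 {c_seg:.6g}")
    return lines


# Hypothetical totals for one line: 20 ohm, 3 nH, 600 fF, split into 10 sections.
netlist = rlc_ladder_netlist(20.0, 3e-9, 600e-15, 10)
print("\n".join(netlist))
```

More sections approximate the distributed behavior more closely at the cost of a larger netlist; the section count of 10 here is arbitrary.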
In order to test for the worst case noise generated on the 3000 µm long 8-bit standard single-ended bus (shown in Fig. 1), we applied a step with a 50 ps rise time to all the inputs except the one in the middle. In Fig. 3, simulation results using Hspice [26] show a large voltage glitch of 1.17 V. Such a glitch could cause erroneous switching and logic failures. In order to solve this crosstalk noise problem, we tested several of the most popular crosstalk noise reduction techniques against this example.
A. Shielding Technique
In the shielding technique, signal lines are interdigitated with Vdd or ground alternately [10], as shown in Fig. 2. The idea of the technique is to isolate signal lines from their neighbors. Fig. 3 shows that even when we use the shielding technique, a voltage glitch of 0.54 V still appears. Fig. 4 shows that when the inductance is not modeled, the shielding technique appears to solve the crosstalk problem almost perfectly, as only a 0.03 V voltage glitch is generated. This is because the shielding technique is capable of screening signal lines and thus eliminating capacitive coupling. However, because current return paths extend over a long range, shielding screens only part of the signals' current return paths and thus eliminates only part of the inductive coupling.
B. Widening Metal Lines
Widening the signal lines is one of the techniques used to reduce capacitive crosstalk noise, by increasing the signal capacitance to ground [12]. Fig. 5 shows that the generated noise was reduced by not more than 20% after the wire width was increased from 3 to 7.5 µm. In this example, we widened the interconnects and kept the separation distance at 1.5 µm so that the total area taken by the data bus is the same as that used in the shielding technique. Apart from this modest noise improvement, this technique tends to increase the delay as the capacitance increases.
C. Increasing Metal to Metal Separation
Increasing the separation distance between signal lines is a technique that reduces crosstalk noise by pushing signal lines farther apart [13], [14]. In order to test this method, we simulated the configuration shown in Fig. 1, where we kept the signal width at 3 µm and increased the separation distance such that the total area consumed by the structure is the same as that consumed by shielding. Fig. 6 shows that the generated noise was reduced to 0.82 V. This is not as good as the shielding technique. Note that when the inductance is neglected, this technique reduces the crosstalk noise to 0.21 V, which shows that this technique can be successful if no inductive coupling is involved. This is because coupling inductance decays only logarithmically with separation distance, unlike the much quicker linear decay of the coupling capacitance.
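The slow logarithmic decay can be illustrated with the standard long-filament approximation for the partial mutual inductance of two parallel wires, M ≈ (µ0 l / 2π)(ln(2l/d) − 1 + d/l), valid when the length l is much larger than the center-to-center distance d. The sketch below is a rough estimate only, not the FastHenry extraction used in this work; it treats each line as a filament, and the 4.5 µm pitch is taken from the 3 µm width plus 1.5 µm spacing of the baseline bus.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m


def mutual_inductance_parallel(length_m, distance_m):
    """Approximate partial mutual inductance (H) of two parallel filaments.

    Long-filament approximation, assumes length_m >> distance_m.
    """
    return MU0 * length_m / (2 * math.pi) * (
        math.log(2 * length_m / distance_m) - 1 + distance_m / length_m
    )


length = 3000e-6      # 3000 um bus length
pitch_base = 4.5e-6   # 3 um width + 1.5 um spacing (center to center)

m_near = mutual_inductance_parallel(length, pitch_base)
m_far = mutual_inductance_parallel(length, 2 * pitch_base)
ratio = m_far / m_near

print(f"M at 4.5 um pitch: {m_near * 1e9:.3f} nH")
print(f"M at 9.0 um pitch: {m_far * 1e9:.3f} nH")
print(f"Doubling the pitch keeps {ratio:.0%} of the mutual inductance")
```

Under these assumptions, doubling the distance between two lines removes only a small fraction of their mutual inductance, which is consistent with the limited benefit of increased spacing once inductance is modeled.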
D. Buffer Insertion Technique
Buffers are often inserted in long interconnect routes to reduce crosstalk noise [11]. Buffer insertion typically reduces crosstalk noise but sometimes degrades performance because of the additional delay of the inserted buffers. In order to test the buffer insertion method with the standard single-ended bus configuration in Fig. 1, we divided each interconnect line in the bus into two 1500 µm segments with a buffer inserted between the two segments. The generated crosstalk noise was only 0.27 V, which is half of that of shielding. However, the delay increased to 0.52 ns, which is almost 50% more than the targeted timing budget. In high performance designs, timing is often the leading factor in the design. In such a case, in order to keep the timing budget at 0.35 ns, we gradually increased the strengths of the drivers and buffers until we reached the weakest buffers and drivers that meet the delay target of 0.35 ns. Fig. 7 shows that the resulting crosstalk noise was 0.87 V, which is now more than that of the shielding technique. When two buffers instead of one were inserted in each interconnect line of the bus in Fig. 1, the delay deteriorated significantly and the delay constraint could not be met.
E. Differential Signaling
There has been increased interest in low swing signaling [27] , but inductive effects have not been included in previous studies. Here, we will discuss the use of limited swing differential signaling and show how it can significantly reduce crosstalk noise.
The differential driver used is shown in Fig. 8. The driver consists of a very low input capacitance inverter and a transmission gate, which generate balanced delay signals that drive two matched buffers, which in turn generate the differential signal. The buffers are simple inverters with active current feedback in the form of an always-on transmission gate. The buffer with the feedback is essentially a simple op-amp with a virtual grounded input and a low voltage gain. The active feedback provides maximum flexibility in controlling the delay, swing, and centering of the differential signal with respect to Vdd and ground. Also, the very low input capacitance of the predriver makes this driver very useful in tight timing budgets and shallow pipeline architectures. Similar drivers have been used for very high frequency RF applications [28]. The driver we implemented had a swing of 300 mV with a low level of 0.7 V and a high level of 1.0 V. The input capacitance of the driver was 10 fF. The receiver, shown in Fig. 9, was a standard static differential receiver with a low impedance current source for stability against injected noise. The receiver is loaded with a 100 fF capacitor.
We next investigated the use of differential signaling for the whole bus, as shown in Fig. 10, and we observed the outputs OP1 and OP2 of the "quiet" differential signal and signal-bar in the middle of the bus at the end of the 3000 µm differential line. Fig. 11 shows that the two points OP1 and OP2 are almost always in phase, which makes the difference OP1-OP2 very small. This difference is the effective noise seen by the differential receiver. Fig. 12 shows that with this differential bus the effective noise seen at the input of the differential receiver is only 53 mV peak, reducing the crosstalk noise by more than one order of magnitude compared with the previously discussed crosstalk noise reduction techniques. We devised two experiments to determine whether the noise improvement is due to the noise immunity of differential signaling or to its lower radiation.
Fig. 12. The difference between the signals on points OP1 and OP2 of the differential receiver in Fig. 10. This represents the input noise signal to the differential receiver (maximum crosstalk noise is only 53 mV).
1) Low Noise Generation: Fig. 13 shows a setup where the bus is driven differentially, except for the "quiet" line in the center, which is single ended. We observed the output (OP1) of the quiet line with the remaining 7 differential pairs switching. Fig. 14 shows the quiet line to have a crosstalk noise peak of 38 mV, which confirms that differential signaling does not radiate significant electromagnetic interference. This is mainly because a differential transmitter generates a voltage swing of less than 300 mV, compared to 1.8 V for a standard driver. Therefore, for the same latency, differential drivers tend to be much smaller than standard drivers, which results in a significantly lower di/dt and therefore lower inductive noise.
2) Noise Immunity: In the second experiment, shown in Fig. 15, all the switching signals were on single-ended standard nonshielded lines, with the "quiet" line driven differentially. Although the single-ended lines caused a high level of coupling on each of the differential lines, OP1 and OP2, as shown in Fig. 16, the two differential lines moved together, with a differential signal of less than 83 mV. The small differential signal generated an insignificant 4 mV on the differential receiver output, OP3, as shown in Fig. 17.
Fig. 15. The bus is sandwiched between a VDD line and a VSS line. The driver of the differential signal is a limited swing differential driver and the receiver is a differential receiver. The drivers and receivers of all the standard bits are standard CMOS buffers.
Fig. 16. The input signals to the differential receiver, OP1 and OP2, in Fig. 15.
Fig. 17. The dotted graph represents the difference between the signals on points OP1 and OP2 of the differential receiver in Fig. 15. This represents the input noise signal to the differential receiver (peak input noise is 83 mV). The solid graph represents the output of the differential receiver, OP3, in Fig. 15 (peak output noise is only 4 mV).
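The reason a differential receiver tolerates large coupled glitches can be sketched numerically: noise that appears equally on both wires cancels in the difference OP1-OP2, and only the mismatch between the two wires remains. The sample values below are entirely synthetic and chosen for illustration; they are not taken from the simulations reported here.

```python
def peak_abs(samples):
    """Return the largest absolute value in a sequence of samples."""
    return max(abs(s) for s in samples)


# Synthetic (hypothetical) coupled noise in volts on a quiet differential pair.
common_mode = [0.00, 0.20, 0.45, 0.30, -0.10, -0.25, -0.05, 0.00]
mismatch = [0.00, 0.01, 0.03, 0.02, -0.01, -0.02, 0.00, 0.00]

# Each wire sees the common-mode glitch plus or minus half the mismatch.
op1 = [c + m / 2 for c, m in zip(common_mode, mismatch)]
op2 = [c - m / 2 for c, m in zip(common_mode, mismatch)]

# The differential receiver responds to OP1 - OP2 only.
differential = [a - b for a, b in zip(op1, op2)]

print(f"peak noise on OP1:       {peak_abs(op1) * 1e3:.0f} mV")
print(f"peak noise on OP1 - OP2: {peak_abs(differential) * 1e3:.0f} mV")
```

In this toy data set the single-wire glitch is more than ten times larger than the differential residue, which mirrors the qualitative behavior described for OP1 and OP2.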
In order to compare the noise immunity of differential signaling with that of shielding, we replaced the differential pair in the middle of the single-ended bus in Fig. 15 with a standard single-ended line with a shield, as shown in Fig. 18. Fig. 19 shows a crosstalk voltage glitch of 1.12 V on the input of the standard single-ended receiver, OP1, which resulted in a final output noise of 360 mV on OP2 in Fig. 18. This experiment shows that differential signaling is much more immune to injected noise, as the noise glitches on both the input and the output of the differential receiver were more than an order of magnitude smaller than the respective glitches on the shielded single-ended receiver. This makes differential signaling a very efficient solution for critical nets, as it provides very good noise immunity along with speed.
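The peak noise values reported in this section can be put side by side with a few lines of arithmetic. The sketch below uses only numbers quoted in the text (the 1.17 V baseline glitch and the RLC-model results at the 0.35 ns timing budget); the dictionary labels and the helper name are ours. Note that the differential entry is the effective noise at the receiver input, so the comparison is indicative rather than exact.

```python
BASELINE_V = 1.17  # peak glitch on the unprotected single-ended bus

# Peak crosstalk noise (V) with inductance modeled, as reported in the text.
noise_v = {
    "shielding": 0.54,
    "increased spacing": 0.82,
    "one buffer, drivers resized for 0.35 ns": 0.87,
    "differential bus (receiver input)": 0.053,
}


def reduction_percent(noise, baseline=BASELINE_V):
    """Percentage reduction of peak noise relative to the baseline glitch."""
    return 100.0 * (1.0 - noise / baseline)


reductions = {name: reduction_percent(v) for name, v in noise_v.items()}

for name, value in noise_v.items():
    print(f"{name:42s} {value * 1e3:6.0f} mV  {reductions[name]:5.1f}% lower")
```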
III. MINIMIZING INTERCONNECT SELF INDUCTANCE
In this section, we discuss different strategies to reduce self inductance and, therefore, signal delay. To compare the different strategies, we used a case study of a high performance clock tree where it was necessary to minimize self inductance. Balancing delays is critical in clock tree design, as this minimizes clock skew. Since magnetic effects have a much larger range than electrostatic effects, an interconnect line with large inductance will be sensitive to distant variations in interconnect topology. This long range sensitivity makes it difficult to balance delays in clock trees, hence the importance of minimizing self inductance.
We examined a typical clock tree structure to explore methods for minimizing the self inductance. A cross-sectional view of that clock tree is shown in Fig. 20 where the clock signal line is sandwiched between ground return lines.
Typically, consecutive metal layers are orthogonal to each other, so there is no inductive coupling between lines in consecutive layers. Thus, the problem of minimizing the inductance of the structure in Fig. 21 is reduced to minimizing the inductance of the structure in Fig. 22 .
In order to estimate the resistance and inductance of the clock, we used the 3-D field solver FastHenry [23]. FastHenry uses a standard filament discretization of an integral formulation of magnetoquasistatic coupling [29], [30]. The 3-D capacitance solver FastCap [22] was used to estimate the capacitance of the structure. In the capacitance model, conductors in upper and lower metal layers were represented, as they influence the capacitance of the clock structure. Note that without including the surrounding conductors in the upper and lower metal layers, changes in capacitance between the conductors would be grossly overestimated. We tested several of the most popular self inductance reduction techniques using this example.
A. Optimizing the Metal to Metal Spacing
In order to find the metal to metal spacing that minimizes self inductance, we fixed the width of the clock signal line and the width of the ground return lines to 1 µm. Not surprisingly, we found that the self inductance increases as the separation distance between the clock signal line and the ground lines increases, as shown in Fig. 23. Thus, in order to minimize the inductance, the separation distance between the clock signal line and the ground return lines should be as small as possible.
Fig. 23. Variation of the self inductance with the separation distance between the signal and the ground lines.
Fig. 24. Three-dimensional self-inductance frequency dependence for the clock structure in Fig. 22.
It is often presumed that near gigahertz clock rates imply that on-chip inductive effects can be analyzed by determining high frequency limit current distributions. In order to verify this assumption and find the specific frequency at which the conductors behave as perfect conductors, we performed a frequency sweep of the inductance of the clock structure using FastHenry. In Fig. 24, we show the frequency dependence of the self inductance of the structure shown in Fig. 22. Fig. 24 also shows that for frequencies less than 25 GHz, it is the low-frequency current distribution that determines inductive effects. The corner frequency for self inductance is determined by the skin effect. Fig. 25 shows the frequency dependence of the resistance for the same structure. The structure resistance equals the dc resistance value for frequencies below the corner frequency; as frequency increases beyond the corner frequency, the resistance increases due to the skin effect.
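As an order-of-magnitude check on where the skin effect becomes relevant, the classical skin depth δ = sqrt(ρ / (π f µ0)) can be compared with the conductor cross-section. The sketch below assumes the bulk room-temperature resistivity of copper (about 1.7e-8 Ω·m), which is an assumption on our part; on-chip copper is typically more resistive, and the actual corner frequency in Fig. 24 comes from the FastHenry sweep, not from this formula.

```python
import math

MU0 = 4e-7 * math.pi   # permeability of free space, H/m
RHO_CU = 1.7e-8        # assumed bulk copper resistivity, ohm*m


def skin_depth(freq_hz, resistivity=RHO_CU):
    """Classical skin depth (m) of a good nonmagnetic conductor."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU0))


def frequency_for_skin_depth(depth_m, resistivity=RHO_CU):
    """Frequency (Hz) at which the skin depth equals depth_m."""
    return resistivity / (math.pi * MU0 * depth_m ** 2)


for f in (1e9, 5e9, 25e9):
    print(f"{f / 1e9:5.0f} GHz: skin depth = {skin_depth(f) * 1e6:.2f} um")
```

When the skin depth is comparable to or larger than the conductor dimensions, the current stays close to its low-frequency distribution, which is the regime the text identifies as relevant for on-chip clock lines.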
B. Optimizing Metal Line Width
In order to determine the clock line width that minimizes self inductance, we reexamined the structure in Fig. 22. The self inductance versus line width is given in Fig. 26.
The inductance decreases as the width of the clock signal line increases until it reaches a minimum. Beyond this minimum, the inductance increases as the width increases. The resistance of the clock always decreases as the signal line width increases, as shown in Fig. 27. This curve is not a linear function of the signal line width because of the constant resistance of the ground return lines. The component of the total resistance from the signal line continues to fall off, but the total value saturates at the ground return value. Fig. 28 shows that the capacitance increases linearly with the signal line width. Consider two structures with different dimensions: the second has been optimized for minimum inductance with one dimension held fixed, and its inductance is 10% less than that of the first structure. This 10% reduction in inductance was achieved at the cost of 2.3 times the original space and a 120% increase in capacitance, as shown in Fig. 28. Therefore, widening the clock lines has little impact on the inductance and increases the capacitance significantly.
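The saturation of the total resistance can be reproduced with the usual sheet-resistance model, R = R_sheet · l / w, applied to the signal line in series with two parallel ground returns. The sheet resistance, line length, and ground return width below are hypothetical values chosen only to make the trend visible; they are not the process parameters used in this work.

```python
def line_resistance(r_sheet, length_um, width_um):
    """DC resistance (ohm) of a rectangular line from its sheet resistance."""
    return r_sheet * length_um / width_um


R_SHEET = 0.02          # hypothetical sheet resistance, ohm per square
LENGTH_UM = 3000.0      # hypothetical line length, um
GROUND_WIDTH_UM = 3.0   # hypothetical width of each ground return, um

# Two identical ground returns carry the return current in parallel.
r_ground = line_resistance(R_SHEET, LENGTH_UM, GROUND_WIDTH_UM) / 2


def loop_resistance(signal_width_um):
    """Signal line resistance in series with the fixed ground return resistance."""
    return line_resistance(R_SHEET, LENGTH_UM, signal_width_um) + r_ground


widths = [1, 2, 5, 10, 20, 50]
totals = [loop_resistance(w) for w in widths]

for w, r in zip(widths, totals):
    print(f"signal width {w:3d} um: total R = {r:6.2f} ohm (floor {r_ground:.2f} ohm)")
```

The signal line contribution keeps shrinking as the line is widened, but the total never drops below the ground return resistance, which is the saturation visible in Fig. 27.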
C. Using Dedicated Ground Plane Techniques
We also investigated using dedicated ground planes as return paths for the clock signal, as shown in Fig. 29 [19]-[21]. Fig. 30 shows the low frequency self inductance as a function of the ground plane width. Fig. 30 also shows that the inductance reaches a minimum at a particular ground plane width. Beyond that minimum, the low frequency self inductance increases monotonically as the ground plane width increases. At low frequency, the current is uniformly distributed on the ground plane; therefore, large current loops are formed when the ground plane is wide, which increases the inductance. Fig. 31 compares the self inductance frequency response of the dedicated ground planes case with that of the two ground traces case, which is the same as in Fig. 24. Fig. 31 also shows the frequency response when both the ground traces and the ground planes serve as return paths. As shown in Fig. 31, using only guard traces gives the smallest inductance unless the frequency of interest exceeds several GHz. Above that frequency, dedicated ground planes have somewhat lower inductance, but since most of the energy in the signal is below several GHz, the use of dedicated ground planes is not very effective in reducing the self inductance.
D. Using Interdigitated Techniques
As space is always a limiting factor in chip design, it would be best if the inductance could be significantly reduced with only a limited increase in the total space allocated for the clock structure. To achieve that, one might distribute the clock signal over many lines and do the same for the ground return lines, as shown in Fig. 32. The 10 µm signal line has been divided into two 5 µm lines. Similarly, the two 3 µm ground returns have been replaced by three 2 µm ground returns. This design resulted in no change in resistance, a 27% increase in capacitance, a 43% decrease in inductance, and only an 11% increase in area. The inductance is not reduced by 50%, as might be expected, because of the nonopposing mutual inductances.
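The effect of the nonopposing mutual inductances can be seen in a rough partial-inductance estimate of the two layouts. The sketch below is not the FastHenry computation used in this work. It assumes the common closed-form approximation for the partial self inductance of a rectangular bar, the filament formula for mutual inductance evaluated at center-to-center distances, an equal split of current among the signal lines and among the ground lines (reasonable at low frequency for equal widths), and hypothetical values for the spacing (1 µm), metal thickness (1 µm), and length (1000 µm).

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m


def partial_self(length, width, thickness):
    """Approximate partial self inductance (H) of a rectangular bar."""
    k = MU0 * length / (2 * math.pi)
    return k * (math.log(2 * length / (width + thickness)) + 0.5
                + 0.2235 * (width + thickness) / length)


def partial_mutual(length, distance):
    """Partial mutual inductance (H) of two equal parallel filaments."""
    r = length / distance
    k = MU0 * length / (2 * math.pi)
    return k * (math.log(r + math.sqrt(1 + r * r))
                - math.sqrt(1 + 1 / (r * r)) + 1 / r)


def loop_inductance(lines, length, thickness):
    """Loop inductance for lines given as (center_x, width, current_fraction).

    Signal lines carry positive fractions, ground returns negative ones,
    and the fractions sum to zero.
    """
    total = 0.0
    for i, (xi, wi, ci) in enumerate(lines):
        total += ci * ci * partial_self(length, wi, thickness)
        for xj, _, cj in lines[i + 1:]:
            total += 2 * ci * cj * partial_mutual(length, abs(xi - xj))
    return total


UM = 1e-6
LENGTH = 1000 * UM   # hypothetical length
THICKNESS = 1 * UM   # hypothetical metal thickness

# Ground(3) / signal(10) / ground(3) with 1 um spacing: 18 um total width.
sandwich = [(1.5 * UM, 3 * UM, -0.5), (9 * UM, 10 * UM, 1.0),
            (16.5 * UM, 3 * UM, -0.5)]
# G(2) / S(5) / G(2) / S(5) / G(2) with 1 um spacing: 20 um total width.
interdigitated = [(1 * UM, 2 * UM, -1 / 3), (5.5 * UM, 5 * UM, 0.5),
                  (10 * UM, 2 * UM, -1 / 3), (14.5 * UM, 5 * UM, 0.5),
                  (19 * UM, 2 * UM, -1 / 3)]

l_sandwich = loop_inductance(sandwich, LENGTH, THICKNESS)
l_inter = loop_inductance(interdigitated, LENGTH, THICKNESS)
reduction = 1 - l_inter / l_sandwich

print(f"sandwich:       {l_sandwich * 1e9:.3f} nH")
print(f"interdigitated: {l_inter * 1e9:.3f} nH")
print(f"reduction:      {reduction:.0%}")
```

In the sum, the signal-to-ground terms enter with a negative sign, while the signal-to-signal and ground-to-ground terms enter with a positive sign. The latter are the nonopposing mutual inductances, and they keep the estimated reduction below 50% under these assumptions.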
We tried different interdigitated structures, keeping the increase in total structure width to less than 20%. Table I shows that a significant reduction in the self inductance of the clock can be achieved by increasing the number of interconnect lines in the clock structure. Table II shows the relative change in RLC performance for all structures.
The resistance or the maximum space allowed for the clock, whichever is more critical in the design, determines the exact width of each line. As shown in Table I, about the same percentage reduction in inductance can be achieved with two different five-line structures. However, the five-line structure with the 17 µm wide clock structure has 62% more resistance than the 20 µm wide clock structure. Table II also shows that the inductance can be reduced by a factor of up to 3.9 by composing the clock structure of 11 lines, where ground and signal lines are placed alternately. This significant reduction in inductance can be achieved with an insignificant increase in the total clock structure width and clock resistance, and a modest increase in capacitance. The interdigitating approach can be used to help constrain a design to the RC domain to maintain predictability at some performance cost, or it can be used as a basis for alternative design rules where inductance and capacitance must be traded off to optimize for specific performance targets.
IV. DISCUSSION
In this paper, we examined different techniques to reduce the on-chip self and coupling inductances of global interconnect lines. Reducing these inductances directly improves timing predictability and enhances signal integrity. Simulation results from industrial test cases were presented to compare the different inductance reduction techniques.
