Abstract-A highly efficient loop-based interconnect modeling methodology is proposed for multigigahertz clock network design and optimization. Closed-form loop resistance and inductance models are proposed for fully shielded global clock interconnect structures, which capture high-frequency effects including inductance and proximity effects. The models are validated through comparisons with electromagnetic simulations and measured data taken from a Power4 chip. This modeling methodology greatly improves the clock interconnect simulation efficiency and enables fast physical design exploration. Examples of interconnect performance optimization are demonstrated and design guidelines are proposed.
I. INTRODUCTION
GLOBAL clock signals are typically routed in top-level copper lines with large cross-sectional areas to reduce line resistance. With increases in clock frequency (especially clock edge rate) and low line resistance, inductance has now become a first-order consideration for global clock distribution design and verification. Inductance effects will change delay and ramp rate, and cause signal reflections in the vicinity of the clock drivers, as well as overshoot and undershoot at the far ends of the clock grid. Besides inductance, other high-frequency effects like proximity effects will change loop resistance and inductance values significantly at high frequencies. All these effects need to be carefully modeled and controlled to minimize skew and to keep ringing to a minimum.
The most accurate technique for analyzing inductance effects in complex structures is partial element equivalent circuit (PEEC) analysis, which is based on the concept of partial inductance [1]. However, because of the dense inductance matrix involved, this method is computationally expensive even with recent efforts in sparse approximation of the inductance matrix [2] and its inverse [3]. For fast circuit tuning and physical design exploration, more efficient methodologies are preferred for the design and simulation of complex networks. One major complication of inductance analysis is the difficulty of determining current return paths in advance. In clock networks, however, signal wires are highly optimized and usually have power/ground shields to provide close return paths and to limit signal line coupling. Also, wide lines are usually split into multiple fingers interspersed with power/ground shields in order to suppress inductive ringing [4]. In these special structures, the current return loops are relatively well defined, or they can be estimated. Therefore, it is feasible to use a loop-based method, in which the entire signal line and return paths of single or multiline structures are modeled with equivalent loop resistances, inductances, and capacitances (Fig. 1). This method greatly reduces simulation complexity because the circuit contains far fewer components; in particular, all mutual inductance elements are eliminated.
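To give a rough sense of the reduction in circuit size, the following sketch compares element counts for a dense partial-inductance model and a single loop chain. It is only an illustration under stated assumptions: the function names are hypothetical, every pair of inductive branches is assumed to be coupled (an upper bound that ignores sparsification), and capacitors are left out of the partial-inductance count.

```python
def peec_element_count(num_conductors: int, segments_per_conductor: int) -> dict:
    """Upper-bound element count for a dense partial-inductance model.

    Assumption: every inductive branch couples to every other branch.
    """
    branches = num_conductors * segments_per_conductor
    return {
        "resistors": branches,
        "self_inductors": branches,
        "mutual_inductors": branches * (branches - 1) // 2,
    }


def loop_element_count(num_segments: int) -> dict:
    """Element count for one loop chain: one R, one L, one C per segment."""
    return {
        "resistors": num_segments,
        "self_inductors": num_segments,
        "capacitors": num_segments,
        "mutual_inductors": 0,
    }


if __name__ == "__main__":
    # One clock line with four shields, 20 segments along the length.
    print(peec_element_count(num_conductors=5, segments_per_conductor=20))
    print(loop_element_count(num_segments=20))
```

For this hypothetical five-conductor, 20-segment case the dense model carries several thousand mutual inductances, whereas the loop chain has none.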
Most loop inductance calculations are based on an impedance matrix calculation [5], [6]. The loop inductance and resistance are extracted by defining ports at the driving gate, shorting the receiver end of the signal, and then solving the current distribution for an RL model of the circuit. These methods ignore displacement current through the capacitive coupling to neighboring wires and tend to overestimate loop inductance [7], [8]. Furthermore, these models do not provide closed-form expressions for the loop resistance and inductance as functions of the interconnect geometry, and thus cannot provide insight into the geometry-dependent line behavior or guidelines for layout optimization. In this paper, we propose closed-form loop resistance and inductance models, developed by analyzing realistic on-chip fully shielded global clock structures with coupling capacitance taken into account [9]. The frequency-dependent behavior of clock wires with multiple return paths can be explained by line and loop proximity effects [8], [10], which are important when chip frequencies reach the gigahertz regime (Fig. 2). At high frequency, the magnetic field generated by neighboring lines changes the current distribution in the conductor and results in higher current density at the edges of the lines. This is referred to as the line proximity effect. Also, at different frequencies, the current return loop adjusts to minimize the overall impedance of the circuit. At low frequencies, inductive reactance is negligible and the currents spread out to minimize resistance. At high frequencies, inductive reactance dominates, and the currents return through close neighbors to minimize inductance. This is called the loop proximity effect. In state-of-the-art technology, the line proximity effect is typically less significant than the loop proximity effect. These effects lead to frequency-dependent loop resistance and inductance.
0018-9200/03$17.00 © 2003 IEEE
In this paper, we propose analytical models for the line and loop proximity effects. Based on these models, a complete loop-based interconnect modeling approach is developed for fully shielded wiring such as clock distribution networks, in which the loop resistance and inductance are calculated from analytical equations as functions of frequency and interconnect geometry.
II. LOOP RESISTANCE AND INDUCTANCE MODEL
We first develop the analytical models for the line and loop proximity effects, and later combine them into a complete clock structure loop resistance and inductance model. The models are physically based with fitting parameters included to simplify the models. All fitting parameters are extracted by comparing the analytical expressions with full-wave simulation results using the nonlinear least squares fitting method. The detailed simulation setup and extraction method are described in Section III.
A. Line Proximity Effect Model
The line proximity effect [Fig. 2(a)] increases the line resistance above its dc value at high frequencies. It is more important than the skin effect for multiline structures and is significant for many on-chip interconnects designed today. We use a model similar to that in [8], which was developed for current crowding effects in RF inductors. The model is based on a first-order calculation of the electric and magnetic coupling in the metal regions. Fig. 3 illustrates the physical situation and the approximations of the magnetic and induced electric field distributions inside the conductor (line1) in the presence of an adjacent wire (line2) carrying excitation current $I$. We approximate the magnetic field along the cross section of line1 as a constant $B$. This constant is equal to the magnetic field generated by line2 at the center of line1, assuming all the current in line2 has been condensed to its axis:
$$B = \frac{\mu_0 I}{2\pi P} \qquad (1)$$
Here $\mu_0$ is the permeability of free space, $P$ is the wire pitch, and $I$ is the excitation current. Based on Maxwell's equations, we can calculate the induced eddy current density within line1:
(at line edges) (2) Here is the conductivity of the metal and is the width of the wire. By approximating the eddy current on each edge by a uniform current flowing within the outer 25% of the wire width, we can derive the effective eddy resistance: Here is the wire length and is the thickness. The effective resistance of line1 can be calculated by setting equal to the power dissipated:
Power (4) where is the line dc resistance and is the sheet resistance. A fitting parameter is introduced to account for the fact that there is more than one neighbor line in practice. Extracted from full-wave simulation , which yields the following result for estimates of :
The clock signals usually have a wide frequency spectrum. When the signal rise/fall time is relatively consistent across the network, which is typically the case for clock nets, we could use a single characteristic frequency to represent the clock signal for the whole network. In this paper, the characteristic frequency is defined as (6) where is the typical 30%-70% rise time at the far end of the interconnect. This definition gives a close match of the characteristic frequency to the real clock frequency. If another definition of characteristic frequency is used, the frequency coefficients in the loop models will need to be adjusted accordingly.
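The first-order field and eddy-current estimates described in this subsection can be sketched numerically as follows. This is a minimal sketch, not the fitted model: it assumes the textbook field of a filament current at a distance of one wire pitch, and it obtains the edge current density from Faraday's law for a uniform field across the wire width (|E| = ωB·W/2 at the edge, J = σE). The function names and the example dimensions are hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space, H/m


def field_from_neighbor(current_a: float, pitch_m: float) -> float:
    """Flux density (T) at the center of line1 due to a filament current in line2."""
    return MU_0 * current_a / (2.0 * math.pi * pitch_m)


def edge_eddy_current_density(freq_hz: float, current_a: float, pitch_m: float,
                              width_m: float, conductivity_s_per_m: float) -> float:
    """Magnitude of the induced eddy current density (A/m^2) at the line edges.

    Assumes a uniform field across the width, so the induced electric field
    grows linearly from the wire center and peaks at x = width / 2.
    """
    omega = 2.0 * math.pi * freq_hz
    b_field = field_from_neighbor(current_a, pitch_m)
    return conductivity_s_per_m * omega * b_field * width_m / 2.0


if __name__ == "__main__":
    # Hypothetical example: 1 mA in the neighbor, 4 um pitch, 2 um wide copper line.
    print(field_from_neighbor(1e-3, 4e-6))
    print(edge_eddy_current_density(2e9, 1e-3, 4e-6, 2e-6, 5.8e7))
```

Because the edge density scales linearly with frequency in this approximation, the associated power loss, and hence the resistance increase, grows with the square of the frequency.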
B. Loop Proximity Effect Model
Loop proximity effect [ Fig. 2(b) ] leads to the increase of return-path resistance and the decrease of loop inductance at high frequencies. For example, in the case of four neighboring ground shields (two on each side) next to the center clock line, using the lumped-element circuit representation the current return in the nearest neighbor versus total current can be expressed as (7) where , , , and are the mutual inductances to the th order neighbor. Here we assume same widths and spacings for all wires. The ratios of and versus frequency are plotted in Fig. 4 . We choose to approximate this loop frequency dependence with an exponential function LoopFactor
The factor 1/2 comes from our symmetric shielding assumption. There are two fitting parameters and in this equation, which can be extracted from comparison with full-wave simulation results. 
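The current redistribution behind the loop proximity effect can be illustrated with a small lumped-element calculation for the four-shield case described above. The sketch below is an assumption-laden illustration rather than the paper's fitted LoopFactor: it uses common closed-form approximations for partial self and mutual inductance, identical wires on a uniform pitch, and hypothetical default dimensions. The return wires are taken as shorted together at both ends, so they share the same voltage drop, and symmetry reduces the problem to one unknown current.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space, H/m


def self_partial_inductance(length, width, thickness):
    """Approximate partial self-inductance (H) of a rectangular bar."""
    return MU_0 * length / (2 * math.pi) * (
        math.log(2 * length / (width + thickness)) + 0.5
        + 0.2235 * (width + thickness) / length)


def mutual_partial_inductance(length, distance):
    """Approximate partial mutual inductance (H) of two parallel filaments."""
    return MU_0 * length / (2 * math.pi) * (
        math.log(2 * length / distance) - 1.0 + distance / length)


def nearest_return_fraction(freq_hz, length=1e-3, width=2e-6, thickness=1e-6,
                            pitch=4e-6, resistivity=1.7e-8):
    """Fraction of the return current carried by the two nearest shields.

    Layout: shield, shield, clock, shield, shield on a uniform pitch.
    The clock carries +1; each near shield carries a, each far shield b,
    with 2a + 2b = -1.
    """
    omega = 2 * math.pi * freq_hz
    r = resistivity * length / (width * thickness)
    z_self = complex(r, omega * self_partial_inductance(length, width, thickness))
    m1, m2, m3, m4 = (1j * omega * mutual_partial_inductance(length, k * pitch)
                      for k in (1, 2, 3, 4))
    # Equal voltage drop on near and far shields: subtract the two branch equations.
    coef_a = z_self + m2 - m1 - m3
    coef_b = m1 + m3 - z_self - m4
    const = m1 - m2
    a = (0.5 * coef_b - const) / (coef_a - coef_b)
    return abs(2 * a)


if __name__ == "__main__":
    for f in (1e6, 1e9, 1e10, 1e11):
        print(f, nearest_return_fraction(f))
```

With these assumptions the nearest shields carry half of the return current at low frequency, where resistance alone sets the split, and a clearly larger share once inductive reactance dominates.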
C. Loop Resistance and Inductance Model for Single Shielded Clock Line
Based on the above models, we derive the loop resistance and inductance models for a single clock line case (Table I). All models are listed in SI units. The loop inductance is calculated mainly based on coupling capacitance. Theoretically, at infinite frequency, the impedance of the line is dominated by inductance, and the inductive return and capacitive coupling share the same paths. Thus, the interconnect can be represented by a lossless transmission line and the loop inductance can be expressed as length (9) where is the speed of light in the dielectric and is the lateral coupling capacitance. At finite frequencies, the inductance is adjusted with factor , which consists of the same loop proximity effect term as in the loop resistance model and a spacing adjustment term , with fitting parameters and . The spacing adjustment term comes from the fact that inductance changes more slowly with spacing than capacitance does (to the first order, and mutual inductance, ).
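The lossless transmission-line limit can be checked numerically. The sketch below relies on the standard relation L'C' = 1/v² between per-unit-length inductance and capacitance; treating the lateral coupling capacitance as a per-unit-length quantity is an assumption of this sketch, and the function name and example values are hypothetical. The finite-frequency adjustment factor and its fitting parameters are not modeled here.

```python
import math

C_0 = 299_792_458.0  # speed of light in vacuum, m/s


def high_frequency_loop_inductance(length_m: float,
                                   cap_per_length_f_per_m: float,
                                   eps_r: float) -> float:
    """Loop inductance (H) in the lossless transmission-line limit.

    Uses L' * C' = 1 / v**2 with v = c0 / sqrt(eps_r), then scales by length.
    """
    v = C_0 / math.sqrt(eps_r)
    return length_m / (v ** 2 * cap_per_length_f_per_m)


if __name__ == "__main__":
    # Hypothetical example: 1 mm line, 100 fF/mm lateral coupling, eps_r = 4.
    print(high_frequency_loop_inductance(1e-3, 1e-10, 4.0))
```

The inverse dependence on capacitance is why a larger spacing, which lowers the coupling capacitance, raises the loop inductance in this limit.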
D. Loop Resistance and Inductance Model for Multiple Split Clock Lines
The multiline loop resistance and inductance models are developed based on the single-line loop resistance and inductance (Table II). We found that both the effective resistance and the effective inductance are larger than what the simple parallel rule implies (the single-line value divided by the number of split lines), because the shield wires are now shared by more clock lines, and the inductive coupling between the split clock wires is additive. These effects are modeled through two parameters used as exponents of the number of split lines, both of which are less than one. They are functions of the interconnect dimensions. Note that in the case of a multiple split line, the resistance increase over the parallel rule is usually more significant than that of the inductance. To construct the full RLC chain, the capacitance values can be easily calculated from a lookup table or published closed-form expressions, as they are relatively frequency independent.
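The departure from the parallel rule can be illustrated with a short sketch. The functional form used here (single-line value divided by the number of lines raised to an exponent below one) is an assumed reading of the description above, not a reproduction of Table II, and the exponent values in the example are hypothetical rather than extracted parameters.

```python
def split_line_loop_values(r_single: float, l_single: float, n_lines: int,
                           r_exponent: float, l_exponent: float):
    """Scale single-line loop R and L to n_lines split lines.

    Assumed form: value / n_lines ** exponent with 0 < exponent < 1,
    so the result exceeds the parallel-rule value (value / n_lines).
    """
    if n_lines < 1:
        raise ValueError("n_lines must be at least 1")
    if not (0.0 < r_exponent < 1.0 and 0.0 < l_exponent < 1.0):
        raise ValueError("exponents are expected to lie between 0 and 1")
    return r_single / n_lines ** r_exponent, l_single / n_lines ** l_exponent


if __name__ == "__main__":
    # Hypothetical exponents; a smaller resistance exponent reflects the
    # larger resistance penalty relative to the parallel rule.
    r_multi, l_multi = split_line_loop_values(10.0, 1e-9, 4, 0.7, 0.9)
    print(r_multi, 10.0 / 4)   # model value vs. parallel rule
    print(l_multi, 1e-9 / 4)
```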
III. MODEL VALIDATION

A. Full-Wave Simulation
We use a three-dimensional full-wave PEEC interconnect transient analysis tool as a model development reference and verification target [10]. The simulation setup is illustrated in Fig. 5. Loaded line structures are used with a large power mesh and underlying layer with 70% coverage to represent a realistic on-chip environment. We found that eight power/ground lines (four on each side, routed with typical power grid pitch) next to the outermost shield wire were sufficient to represent the on-chip power distribution network and that adding further wide power/ground lines does not significantly change the clock line behavior. All wires are discretized to capture line proximity effects. For all simulations, we assumed symmetrical shielding, implying the same spacing on both sides of the clock wire, and the same number and width for all ground wires. The full-wave simulator generates the transient voltage waveform at the far end of the interconnect. The HSPICE optimization function is used to extract the frequency-dependent loop resistance and inductance values by matching delay, rise time, and peak noise values at the far end of the clock line with full-wave simulation results. Fig. 6 and Fig. 7 demonstrate excellent matching between the simulated loop resistance and inductance values and our model prediction for both single and multiline cases and for different interconnect geometries. The valid range of the models is listed in Table III. Within this range, the error between the full-wave simulation and our model prediction is less than 15%.
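The waveform quantities matched during extraction can be computed from a sampled far-end waveform as sketched below. This is an illustrative helper, not the HSPICE optimization itself: the function names are hypothetical, threshold crossings use linear interpolation between samples, delay is taken at the 50% point relative to a supplied input reference time, and overshoot above the supply is used as a stand-in for peak noise.

```python
def crossing_time(times, volts, level):
    """First time the waveform rises through `level` (linear interpolation)."""
    for i in range(1, len(times)):
        v0, v1 = volts[i - 1], volts[i]
        if v0 < level <= v1:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never rises through the requested level")


def far_end_metrics(times, volts, vdd, t_input_50):
    """Delay, 30%-70% rise time, and overshoot of a rising far-end waveform."""
    t30 = crossing_time(times, volts, 0.3 * vdd)
    t50 = crossing_time(times, volts, 0.5 * vdd)
    t70 = crossing_time(times, volts, 0.7 * vdd)
    return {
        "delay": t50 - t_input_50,
        "rise_time_30_70": t70 - t30,
        "overshoot": max(0.0, max(volts) - vdd),
    }


if __name__ == "__main__":
    # Hypothetical coarse waveform (arbitrary time units, volts).
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    v = [0.0, 0.5, 1.0, 1.1, 1.0]
    print(far_end_metrics(t, v, vdd=1.0, t_input_50=0.0))
```

A fitting loop would then adjust the loop resistance and inductance of the reduced model until these three quantities agree with the reference simulation.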
B. Measurement Results
We also compared the SPICE simulation result using the values calculated from the proposed models with measured data from a Power4 chip using 0.18-µm copper technology [2]. The models are implemented by replacing each global clock wire (or multiple split wires) together with their ground returns by a single chain of loop segments calculated from the proposed analytical equations. Fig. 8 shows a die photo of a Power4 chip with outlines of the on-chip clock network. Fig. 9 shows the comparison of both near-end (inverter output) and far-end (circled) waveforms from measurement and SPICE simulation results based on the proposed loop model. Assuming a good driver model, our proposed interconnect model is very accurate for both ramp rate and skew predictions.
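The replacement of a shielded clock wire by a single chain of loop segments can be sketched as a small netlist generator. The helper below is hypothetical and makes several assumptions: the loop resistance, loop inductance, and total capacitance have already been evaluated at the characteristic frequency, they are divided equally among the segments, and each segment places its capacitor at the downstream node. Only standard SPICE R, L, and C element lines are emitted.

```python
def loop_chain_netlist(name, r_loop, l_loop, c_total, n_segments,
                       node_in="in", node_out="out"):
    """Return SPICE element lines for an n-segment series-R, series-L, shunt-C chain."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    r = r_loop / n_segments
    l = l_loop / n_segments
    c = c_total / n_segments
    lines = [f"* {name}: {n_segments}-segment loop RLC chain"]
    prev = node_in
    for k in range(1, n_segments + 1):
        mid = f"{name}_m{k}"
        nxt = node_out if k == n_segments else f"{name}_n{k}"
        lines.append(f"R{name}{k} {prev} {mid} {r:.6g}")
        lines.append(f"L{name}{k} {mid} {nxt} {l:.6g}")
        lines.append(f"C{name}{k} {nxt} 0 {c:.6g}")  # node 0 is the ideal ground
        prev = nxt
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical loop values: 30 ohm, 1.5 nH, 0.3 pF over three segments.
    print(loop_chain_netlist("clk", 30.0, 1.5e-9, 3e-13, 3))
```

Because the return path is folded into the loop values, the chain needs no separate shield wires and no mutual inductance elements.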
IV. INTERCONNECT OPTIMIZATION INCLUDING INDUCTANCE
As our model shows, both clock lines and ground shields are important parts of the current return loop and affect the overall interconnect performance. For a fixed total routing area (Fig. 1) , there exists a compromise between the clock width and ground width. Fig. 10 shows the performance comparison of interconnect structures with optimized ratio and the simple case of from full-wave simulation. It demonstrates that, with the same routing area, using the optimized can reduce delay by 9% and rise time by 21% compared with the nonoptimal case.
By combining the proposed interconnect parasitic model with performance models as in [12] , we can derive the optimal clock interconnect geometry directly targeted for performance.
As an example, we analyze the strategy of minimizing the total gate-stage delay (driver delay plus interconnect delay) versus minimizing total dynamic power consumption for a fixed interconnect routing area. The detailed setup is listed in Table IV . We sweep the interconnect geometry ( , Fig. 6.) , , ) and driver size and numerically solve for the combination that gives minimal total delay and dynamic power consumption with 5% delay penalty from the delay-optimal case. We define the interconnect signal-to-return ratio as the ratio of the total clock width to the total ground shield width :
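$$\text{signal-to-return ratio} = \frac{\text{total clock width}}{\text{total ground shield width}} \qquad (10)$$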
The following design rules are observed in order to achieve minimal delay or power consumption at a clock frequency of 2 GHz:
Optimal delay:
(11) Optimal power:
These ratios will decrease (implying an increased ground shield width) with increased frequency and line splitting number because of the increased importance of ground return resistance. The guidelines for designing a delay-optimal interconnect structure at a fixed routing area (11) can be summarized as follows: 1) provide at least as much close return path as the signal and 2) use larger than minimal spacing, because the resulting reduction in coupling capacitance outweighs the increase in loop inductance. The optimization of dynamic power consumption results in smaller clock line width and larger spacing for reduced ground and coupling capacitance compared with the delay-optimal case. We observe that a 15% dynamic power saving can be achieved with a 5% delay penalty from the delay-optimal case.
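The signal-to-return ratio and the first guideline can be expressed as a simple layout check. The sketch below is an interpretation, not the optimal ratios of (11): it reads "at least as much close return path as the signal" as a total shield width no smaller than the total clock width, that is, a ratio of at most one. The function names and example widths are hypothetical.

```python
def signal_to_return_ratio(clock_widths, shield_widths):
    """Total clock width divided by total ground shield width."""
    total_clock = sum(clock_widths)
    total_shield = sum(shield_widths)
    if total_shield <= 0:
        raise ValueError("total shield width must be positive")
    return total_clock / total_shield


def meets_return_path_guideline(clock_widths, shield_widths):
    """True when the shields offer at least as much width as the clock lines."""
    return signal_to_return_ratio(clock_widths, shield_widths) <= 1.0


if __name__ == "__main__":
    # Hypothetical widths in micrometers.
    print(signal_to_return_ratio([2.0, 2.0], [1.0, 1.0, 1.0]))
    print(meets_return_path_guideline([2.0, 2.0], [1.0, 1.0, 1.0]))
    print(meets_return_path_guideline([2.0], [1.5, 1.5]))
```

Such a check could serve as a quick filter before a full geometry sweep, which would still be needed to locate the delay-optimal or power-optimal point.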
It is worth mentioning that, in actual designs, other factors such as wire congestion, noise, and power line IR drop are also important in deciding interconnect geometries.
V. CONCLUSION
In this paper, we propose a highly efficient loop-based interconnect modeling approach suitable for multigigahertz clock network performance estimation. The target clock line, together with all of its possible return paths, is modeled as a single effective loop chain, with the loop values calculated directly from closed-form models. The models capture high-frequency effects including inductance and current crowding (line and loop proximity effects) and are validated with full-wave simulation and Power4 chip measurement results, showing less than 15% error for a wide range of interconnect geometries up to a clock frequency of 5 GHz. Based on the proposed models, interconnect physical structure optimization is studied and design guidelines are proposed for both delay- and power-optimized interconnect structures.
He joined the IBM T. J. Watson Research Center, Yorktown Heights, NY, in 1986, where he worked on CMOS parametric testing and modeling, CMOS oxide-trap noise, package testing, and DRAM variable retention time. Since 1993, he has concentrated on tools and designs for VLSI clock distribution networks, contributing to ten IBM server microprocessors as well as high-performance ASIC designs. He has authored or coauthored 17 papers, holds four patents, and has given keynotes and several invited talks and tutorials on clock distribution, high-frequency on-chip interconnects, and technical visualization of VLSI design data.
Dr. Restle received IBM awards for the S/390 G4 and G5 microprocessors in 1997 and 1998 and for the invention of a high-performance VLSI clock distribution design methodology in 2000. 
