The improved T and improved π models are proposed for on-chip interconnect macromodeling.
Introduction
As VLSI feature sizes shrink in CMOS and GaAs technologies, gate delays decrease and interconnect delays increase to such an extent that interconnect delays now dominate the overall path delays of current integrated circuits and systems. Meanwhile, on-chip interconnects exhibit transmission line effects when the operating frequency reaches several gigahertz [1]. Because estimating interconnect effects with simple lumped capacitances gives unsatisfactory accuracy, more accurate modeling is needed for synthesis and layout optimization. On the other hand, the existing accurate modeling approaches generally represent the distributed parameters by cascaded RC networks, which makes the circuit sizes very large [2]; they are therefore too inefficient for use in the iterative process of synthesis and layout optimization.
Hence, VLSI design demands modeling that is accurate yet efficient. The simplest approximation of the interconnect tree is the total capacitance of the tree (see Fig. 1), which is a first-order approximation [3]. For submicron technologies, the total interconnect resistance is large and comparable to the driver output resistance, and cannot be neglected in estimating the gate delay. The actual delay is much smaller than that derived from the lumped capacitance model because the interconnect resistance acts as a shield that reduces the load capacitance seen by the gate driver. The lumped L-type RC model gathers the total interconnect resistance and total capacitance into a single lumped RC segment, and yields an optimistic delay estimate because the total interconnect resistance is lumped together and shields the total capacitance. The one-segment π model was proposed to approximate the interconnect load at the gate by matching the first three moments of the driving point admittance of the gate [4], giving better accuracy. An effective load capacitance method that includes the shielding effect of the interconnect resistance was also proposed: it calculates the effective capacitance that draws the same average current as the RC π model load, assuming that the driving gates can be represented by piecewise linear devices [5]. As deep submicron technology comes to prevail in VLSI design, more accurate modeling approaches are required.
Reduced-order macromodels are now widely used for simulating complicated interconnects. Asymptotic Waveform Evaluation (AWE) is the best-known method for approximating general linear networks using the moment-matching technique [6]. However, higher-order moments are numerically troublesome: increasing the order of moments does not guarantee a better approximation. Furthermore, AWE may produce unstable poles even though the original network is stable. As extensions to AWE, the multipole expansion [7] and Krylov subspace [8] techniques are available for model reduction. Another class of reduction techniques is based on congruence transformations [9]. These techniques can give high accuracy, but at considerable computational cost. Layout optimization and high-level synthesis in VLSI design, however, require efficient modeling tools, because these processes need many iterations, in which computational efficiency is a critical bottleneck.

In this paper, approximation frames with compound-order accuracy are derived for on-chip RC interconnect modeling. The approximation frames are based on global approximations, which give higher-order accuracy than local approximations do. Applying these approximation frames to interconnects leads to the improved T and improved π macromodels for on-chip RC interconnects, which have the same simple forms as the one-segment T and π models, yet have higher accuracy. The new models for distributed RC interconnects are independent of CMOS gates, and can be directly incorporated into SPICE frames.

RC Interconnect Modeling
For different accuracy requirements, interconnects are traditionally modeled as cascades of L, T or π elements [10], each of which represents a small fraction of the entire interconnect. Although these approaches can provide high accuracy if the number of cascaded elements is large enough, they are not efficient for iterative optimization. Instead, simple models like one-segment T or π elements are desired. For on-chip interconnects, one-segment T and π models can give satisfactory accuracy if the operating frequency is relatively low. At high operating frequencies, the simple one-segment T and π models suffer from accuracy loss.
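As a rough illustration of this accuracy loss, the following sketch compares the open-ended driving point admittance of the one-segment T and π models with that of a uniform distributed RC line. It assumes the standard closed-form result Y(s) = sqrt(sC/R) tanh(sqrt(sRC)) for an open-ended line with total resistance R and total capacitance C. The function names and the values `R_TOTAL` and `C_TOTAL` are illustrative choices, not taken from the text.

```python
import cmath
import math


def exact_admittance(s, r, c):
    """Open-ended uniform RC line with total resistance r and capacitance c."""
    gamma = cmath.sqrt(s * r * c)
    return cmath.sqrt(s * c / r) * cmath.tanh(gamma)


def t_admittance(s, r, c):
    """One-segment T model (r/2, shunt c, r/2) with the far end open."""
    # With the far end open, no current flows in the second r/2 arm.
    return 1.0 / (r / 2 + 1.0 / (s * c))


def pi_admittance(s, r, c):
    """One-segment pi model (c/2, series r, c/2) with the far end open."""
    y_far = s * c / 2
    return s * c / 2 + 1.0 / (r + 1.0 / y_far)


def relative_error(model, f_hz, r, c):
    """Relative error of a model admittance against the distributed line."""
    s = 2j * math.pi * f_hz
    y_exact = exact_admittance(s, r, c)
    return abs(model(s, r, c) - y_exact) / abs(y_exact)


R_TOTAL = 220.0     # ohms, assumed for illustration
C_TOTAL = 260e-15   # farads, assumed for illustration

for f_hz in (0.1e9, 1e9, 10e9):
    err_t = relative_error(t_admittance, f_hz, R_TOTAL, C_TOTAL)
    err_pi = relative_error(pi_admittance, f_hz, R_TOTAL, C_TOTAL)
    print(f"f = {f_hz / 1e9:5.1f} GHz  T: {err_t:.4f}  pi: {err_pi:.4f}")
```

Under these assumed values both one-segment models track the distributed line closely at 0.1 GHz and deviate strongly at 10 GHz, which is the behavior described above.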
In the Laplace domain, the normalized diffusion equation governing a distributed RC interconnect can be written as

$$V''(x,s) = sRC\,V(x,s) \tag{1}$$

where $V(x,s)$ is the distributed voltage, and $'$ denotes the derivative with respect to $x$. $x \in [0, 1]$ is the normalized interconnect length, $R$ is the normalized distributed resistance and $C$ the normalized distributed capacitance over the interconnect. By introducing the distributed current $I(x,s)$, the diffusion equation can be written in the form:

$$V'(x,s) = -R\,I(x,s) \tag{2}$$

$$I'(x,s) = -sC\,V(x,s) \tag{3}$$
Along the line, we select three grid points: $x_0 = 0$, $x_1 = 1/2$ and $x_2 = 1$, then take three voltage variables $V_0 = V(x_0,s)$, $V_1 = V(x_1,s)$ and $V_2 = V(x_2,s)$, and two current variables $I_1 = I(x_0,s)$ and $I_2 = I(x_2,s)$. By applying the global approximation to compute the current and voltage differences, we obtain the following approximation frames [11]:

$$I_2 - I_1 = a\,I'(x_1,s) \tag{4}$$

$$V_1 - V_0 = b_1 V'(x_0,s) + b_2 V'(x_2,s) \tag{5}$$

$$V_2 - V_1 = b_3 V'(x_0,s) + b_4 V'(x_2,s) \tag{6}$$

where $a$, $b_1$, $b_2$, $b_3$ and $b_4$ are coefficients to be determined by using fitting functions. By using the generalized Galerkin's method with the fitting functions $x$ and $x^2$, Eqn. 4 gives

$$x_2 - x_0 = a\,x'|_{x=x_1} \tag{7}$$

so that $a = 1$, and Eqn. 5 gives

$$x_1 - x_0 = b_1\,x'|_{x=x_0} + b_2\,x'|_{x=x_2} \tag{8}$$

$$x_1^2 - x_0^2 = b_1\,(x^2)'|_{x=x_0} + b_2\,(x^2)'|_{x=x_2} \tag{9}$$

which gives $b_1 = 3/8$ and $b_2 = 1/8$. Doing the same operations to Eqn. 6 results in $b_3 = 1/8$ and $b_4 = 3/8$. Taking Eqns. 2-3 into consideration, the approximation frame can be written as:

$$I_1 - I_2 = sCV_1 \tag{10}$$

$$V_0 - V_1 = R\left(\tfrac{3}{8} I_1 + \tfrac{1}{8} I_2\right) \tag{11}$$

$$V_1 - V_2 = R\left(\tfrac{1}{8} I_1 + \tfrac{3}{8} I_2\right) \tag{12}$$

Eqn. 7 has an accuracy order of $O(x^2)$, while Eqns. 8-9 have an accuracy order of $O(x^3)$. Therefore, the approximation frame given by Eqns. 10-12, which are determined by Eqns. 7-9, has compound-order accuracy.

A process of mathematical manipulations shows that Eqns. 10-12 represent an equivalent circuit having the modified T type topology as shown in Fig. 2. The model in Fig. 2 differs from the original T model by an additional negative resistance.
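The coefficient values quoted above can be checked with exact rational arithmetic. The sketch below solves the two fitting conditions for the functions x and x² on the grid 0, 1/2, 1, and then maps the result onto a T network. The mapping assumes a topology with two equal series arms and a resistor in series with the shunt capacitor, with frame equations of the form V0 - V1 = R(b1·I1 + b2·I2). Both the topology and the helper name `solve_2x2` are assumptions made for illustration.

```python
from fractions import Fraction as F


def solve_2x2(a11, a12, a21, a22, r1, r2):
    """Solve [[a11, a12], [a21, a22]] [u, v]^T = [r1, r2]^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - r1 * a21) / det


x0, x1, x2 = F(0), F(1, 2), F(1)

# Fitting conditions u(xb) - u(xa) = p*u'(x0) + q*u'(x2) for u = x and u = x^2.
# Row 1 uses u = x (u' = 1); row 2 uses u = x^2 (u' = 2x).
b1, b2 = solve_2x2(F(1), F(1), 2 * x0, 2 * x2, x1 - x0, x1**2 - x0**2)
b3, b4 = solve_2x2(F(1), F(1), 2 * x0, 2 * x2, x2 - x1, x2**2 - x1**2)

# Assumed T topology: series arms r_series*R, shunt branch r_shunt*R in
# series with C. Then V0 - V1 = (r_series + r_shunt)*R*I1 - r_shunt*R*I2.
r_shunt = -b2
r_series = b1 - r_shunt

print("b1..b4 =", b1, b2, b3, b4)
print("series arm / R =", r_series, " shunt resistance / R =", r_shunt)
```

Under these assumptions the series arms equal R/2, as in the original T model, and the shunt branch carries a resistance of -R/8, which is consistent with the additional negative resistance mentioned in the text.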
Applying the dual procedure, with the voltages taken at the two end points and the currents at all three grid points, gives Eqns. 13-15, the last of which reads

$$I_1 - I_2 = \tfrac{1}{8} sCV_1 + \tfrac{3}{8} sCV_2 \tag{15}$$

The equivalent circuit model of Eqns. 13-15 is the improved π model as shown in Fig. 3. The difference between the improved π model and the original π model is that the former includes an additional negative capacitance. By equating the first three terms of Eqns. 16 and 17 and maintaining the symmetric form of the π model, i.e., $C_1 = C_2$, we obtain the parameters of the π model with 3rd order AWE: $C_1 = C_2 = C/2$, $R_1 = 4R/3$, and $C_3 = -C/5$. The improved π model with 3rd order AWE has a similar form but different parameters, as shown in Fig. 4.
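The quoted parameters of the π model with 3rd order AWE can be checked by a short moment computation. The sketch below rests on two assumptions that are not spelled out in the text: that the open-ended line admittance expands as sC - s²RC²/3 + 2s³R²C³/15 (the Taylor series of sqrt(sC/R) tanh(sqrt(sRC))), and that C3 sits in parallel with R1 between the two shunt capacitors. It works in normalized units R = C = 1 with truncated power series; the helper names are hypothetical.

```python
from fractions import Fraction as F

ORDER = 4  # keep the coefficients of s^0 .. s^3


def series_mul(p, q):
    """Product of two truncated power series in s."""
    out = [F(0)] * ORDER
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < ORDER:
                out[i + j] += a * b
    return out


def series_inv(p):
    """Reciprocal of a truncated power series; requires p[0] != 0."""
    out = [F(0)] * ORDER
    out[0] = 1 / p[0]
    for n in range(1, ORDER):
        acc = sum(p[k] * out[n - k] for k in range(1, n + 1))
        out[n] = -acc / p[0]
    return out


def pi_moments(c1, c2, r1, c3):
    """Coefficients of s, s^2, s^3 of the open-ended driving point admittance.

    Assumed topology: Y = s*c1 + s*c2*(1 + s*r1*c3) / (1 + s*r1*(c2 + c3)).
    """
    num = [F(0), c2, r1 * c2 * c3, F(0)]
    den = [F(1), r1 * (c2 + c3), F(0), F(0)]
    y = series_mul(num, series_inv(den))
    y[1] += c1
    return y[1:]


exact = [F(1), F(-1, 3), F(2, 15)]  # assumed expansion of the line admittance
awe_pi = pi_moments(F(1, 2), F(1, 2), F(4, 3), F(-1, 5))
original_pi = pi_moments(F(1, 2), F(1, 2), F(1), F(0))

print("exact      :", exact)
print("AWE pi     :", awe_pi)
print("original pi:", original_pi)
```

Under these assumptions the three moments of the AWE π model coincide with those of the distributed line, whereas the original π model matches only the first.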
Passivity and Accuracy
By using the AWE technique, a π model presented in [13] matches the first three moments of the driving point admittance at the output of a CMOS gate. It depends only on the total interconnect tree parameters, and therefore has high efficiency for interconnect delay modeling. However, unlike the original π model, the model in [13] has an asymmetric structure, i.e., it is an anisotropic model, though the original interconnect is a symmetric one. This causes difficulty when modeling bi-directional interconnects, which demand the same transfer properties in both directions. By varying the improved π model so as to match the first three moments of the analytical driving point admittance, we derive new π models having symmetric structures.

For an open-ended interconnect, the analytical driving point admittance can be expanded as

$$Y(s) = sC - \tfrac{1}{3} s^2 RC^2 + \tfrac{2}{15} s^3 R^2 C^3 - \cdots \tag{16}$$

and the driving point admittance of a π model as shown in Fig. 3 can be expanded as

$$Y_\pi(s) = s(C_1 + C_2) - s^2 R_1 C_2^2 + s^3 R_1^2 C_2^2 (C_2 + C_3) - \cdots \tag{17}$$

The following definitions and results are referred to [15].
Lemma 1: A necessary and sufficient condition for an $n \times n$ transfer function matrix $Y(s)$ to be passive is that $Y(s)$ is positive-real, i.e.:
(1) each element of $Y(s)$ is analytic in $\Re(s) > 0$, (2) $Y(s^*) = Y^*(s)$, and (3) $(Y^*)^T(s) + Y(s)$ is non-negative definite for all $\Re(s) > 0$.
Lemma 2: An n-port network is passive if and only if its admittance matrix $Y(s)$ is positive-real.
Lemma 3: If $A(s)$ is positive-real, then $A^{-1}(s)$ is positive-real, if it exists.
Lemma 4: If $A(s)$ is positive-real and $B$ is real, then $B^T A(s) B$ is positive-real.
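Consider the 2-port model shown in Fig. 2 and its constitutive equations, Eqns. 10-12. If we take $V_0$ and $V_2$ as independent input voltages, the MNA equations can be obtained as

$$\begin{bmatrix} sC & -1 & 1 \\ 1 & \tfrac{3}{8}R & \tfrac{1}{8}R \\ -1 & \tfrac{1}{8}R & \tfrac{3}{8}R \end{bmatrix} \begin{bmatrix} V_1 \\ I_1 \\ I_2 \end{bmatrix} = \begin{bmatrix} 0 \\ V_0 \\ -V_2 \end{bmatrix} \tag{18}$$

Then the admittance matrix is obtained by deriving $I_1$ and $-I_2$, setting both $V_0$ and $V_2$ to 1's:

$$Y(s) = B^T A^{-1}(s)\, B \tag{19}$$

where $A(s)$ is the coefficient matrix in Eqn. 18 and

$$B^T = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$

By Lemmas 1-4, $Y(s)$ in Eqn. 19 can be easily verified to be positive-real; therefore, the model shown in Fig. 2 is passive.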
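The positive-real property can also be spot-checked numerically. The sketch below is not a proof: it samples points in the closed right half-plane and evaluates the smallest eigenvalue of the Hermitian part of the 2-port admittance matrix. It assumes the improved T model consists of two series arms of R/2 and a shunt branch of -R/8 in series with C, uses normalized values R = C = 1, and takes both port currents as flowing into the network. All function names are hypothetical.

```python
import math


def improved_t_admittance(s, r=1.0, c=1.0):
    """2x2 admittance matrix of the assumed improved T model."""
    z_shunt = -r / 8 + 1.0 / (s * c)   # negative resistance in series with C
    z11 = r / 2 + z_shunt              # Z11 = Z22 by symmetry
    z12 = z_shunt                      # Z12 = Z21
    det = z11 * z11 - z12 * z12
    return ((z11 / det, -z12 / det), (-z12 / det, z11 / det))


def min_eig_hermitian_part(y):
    """Smallest eigenvalue of H = (Y + Y^H) / 2 for a 2x2 matrix Y."""
    h11 = y[0][0].real
    h22 = y[1][1].real
    h12 = (y[0][1] + y[1][0].conjugate()) / 2
    mean = (h11 + h22) / 2
    radius = math.sqrt(((h11 - h22) / 2) ** 2 + abs(h12) ** 2)
    return mean - radius


def worst_case_min_eig():
    """Scan a small grid of s = sigma + j*omega with sigma >= 0."""
    worst = float("inf")
    for sigma in (0.0, 0.1, 1.0, 10.0):
        for omega in (0.01, 0.1, 1.0, 10.0, 100.0):
            y = improved_t_admittance(complex(sigma, omega))
            worst = min(worst, min_eig_hermitian_part(y))
    return worst


print("smallest eigenvalue over the sample grid:", worst_case_min_eig())
```

A non-negative result over the sampled grid is consistent with condition (3) of Lemma 1, although only the algebraic argument above establishes passivity.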
Noting that Eqns. 13-15 and Eqns. 10-12 are dual to each other, the model shown in Fig. 3 has an impedance matrix of the same form as Eqn. 19, and is likewise passive.

In order to show the accuracy of the derived models, consider a practical RC interconnect in a CMOS circuit with 0.18 μm feature size as shown in Fig. 5 (a) (see also Fig. 5 (b)). The interconnect has distributed parameters R = 110 Ω/mm and C = 130 fF/mm, with a length of 2 mm. Assume that the load capacitance is $C_L$ = 100 fF and that the internal resistance is $R_d$ = 100 Ω. The relative errors of the driving point admittance in the frequency domain (plotted against frequency f in GHz), using the models shown in Figs. 2-4, are shown in Fig. 6. As shown in Fig. 6, the improved T and π models are more accurate than the original T and π models over the frequency range of interest. The π model with 3rd order AWE is less accurate than the improved T and π models, although its relative error gets smaller at higher frequency.

Circuit Applications
The first example is the practical CMOS inverter circuit shown in Fig. 5 (a), which is laid out using the MOSIS/TSMC 0.18 μm feature size. The distributed RC line (metal 2) has a length of 2 mm, and its width is 4λ. The distributed resistance and capacitance are 222 Ω/mm and 123 fF/mm, respectively. The two identical inverters have the same parameters: channel length 0.18 μm for both FETs, channel width $W_p$ = 18 μm for the PMOS and channel width $W_n$ = 9 μm for the NMOS. Using the original T model, original π model, improved T model, improved π model, π model with AWE, and improved π model with AWE to represent the interconnect, we incorporate the models into HSPICE as subcircuits [16]. The delayed waveforms at points A and B are shown in Fig. 7.
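To make the subcircuit step concrete, the following sketch generates SPICE-style subcircuit text for the two improved models from the total line resistance and capacitance. The element values are assumptions inferred from the equations rather than read from the figures: R/2 arms with -R/8 in series with C for the improved T model, and C/2 shunts with -C/8 across R for the improved π model. The node names, subcircuit names and Python helpers are hypothetical, and whether a given simulator accepts negative element values should be checked against its documentation.

```python
def improved_t_subckt(name, r_total, c_total):
    """Netlist text for the assumed improved T model (ground is node 0)."""
    lines = [
        f".SUBCKT {name} in out",
        f"R1 in mid {r_total / 2:g}",
        f"R2 mid out {r_total / 2:g}",
        f"R3 mid sh {-r_total / 8:g}",    # additional negative resistance
        f"C1 sh 0 {c_total:g}",
        ".ENDS",
    ]
    return "\n".join(lines)


def improved_pi_subckt(name, r_total, c_total):
    """Netlist text for the assumed improved pi model (ground is node 0)."""
    lines = [
        f".SUBCKT {name} in out",
        f"C1 in 0 {c_total / 2:g}",
        f"C2 out 0 {c_total / 2:g}",
        f"R1 in out {r_total:g}",
        f"C3 in out {-c_total / 8:g}",    # additional negative capacitance
        ".ENDS",
    ]
    return "\n".join(lines)


# Totals for the 2 mm line of the first example: 222 ohm/mm and 123 fF/mm.
R_LINE = 222.0 * 2
C_LINE = 123e-15 * 2

t_netlist = improved_t_subckt("IMP_T", R_LINE, C_LINE)
pi_netlist = improved_pi_subckt("IMP_PI", R_LINE, C_LINE)
print(t_netlist)
print(pi_netlist)
```

Because both subcircuits depend only on the total R and C of the line, the same generator can be reused for the 4 mm line of the second example by changing the two totals.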
The exact simulation results are obtained by using 10 segments of T models to represent the interconnect. Fig. 7 shows that at the near end point A, both AWE models agree closely with the exact values. However, both AWE models have larger errors than the other models at the far end point B: the π model with AWE gives an optimistic delay and the improved π model with AWE gives a pessimistic delay. On the other hand, the improved T and π models agree better with the exact values than the other models at the far end point B.

Based on the above simulation results, further numerical experiments show the error distribution of applying the improved T model to different practical CMOS layouts whose schematics are shown in Fig. 5 (a). The feature size is retained as 0.18 μm, and the channel width of the [...] Fig. 8 (a). The results show that the error distribution of the improved T model has a well-formed standard distribution, as compared to the error distribution of the original T model shown in Fig. 8 (b). Fig. 8 (a) shows that the maximum relative error of the improved T model is within 6% of the exact values.

The second example is the same as the first example except that the interconnect length is 4 mm. The response waveforms shown in Fig. 9 demonstrate that both the π model with AWE and the improved π model with AWE give better accuracy at the near end point A and worse accuracy at the far end point B. At the far end point B, the original T model gives optimistic delay estimates, while the original π model gives pessimistic results. The results based on the improved T and π models agree better with the exact values at the load end.

The third example is an H-shaped clock tree whose schematic is shown in Fig. 10, with the 0.25 μm feature size of the MOSIS/TSMC technology.
All the inverters have a channel width of 10 μm for the PMOS and 5 μm for the NMOS, and each of the inverters at a leaf has a load capacitance of 200 fF. The distributed parameters for the interconnects are shown in Table 1. Assuming that the input is an impulse with $t_r = t_f = 50$ ps, the responses at a leaf point G are calculated by incorporating the models in Figs. 2-4 into the HSPICE frames [16], as shown in Fig. 11. Fig. 11 shows that the original T model gives an optimistic delay estimate and the original π model gives pessimistic results, while the improved T and improved π models give more accurate results, which agrees with the analysis in the first and second examples. The accuracy of the 3rd-order AWE models is lower than that of the improved T and improved π models, and also lower than that of the original T and original π models.
In the third example, the run times based on the improved T and improved π models are comparable to those based on the original T and original π models, each of which is 1 second at a time step of 10 ps.
Proceedings of the 15th International Conference on VLSI Design (VLSID'02) 0-7695-1441-3/02 $17.00 © 2002 IEEE

Figure legend: [...] AWE, original T, improved T, exact, improved π, original π, and improved π model with 3rd order AWE models.

Conclusions
The improved T and improved π models are proposed for on-chip interconnect macromodeling. Efficient approximation [...] comparable to those of original T and original π models.

[8] [...] analysis by Padé approximation via the Lanczos process," IEEE Trans. Computer-Aided Design, vol. 14, no. 5, pp. 639-649, 1995.
[9] K. J. Kerns and A. T. Yang, "Stable and efficient reduction of large, multiport RC networks by pole analysis via congruence transformations," IEEE Trans. Computer-Aided Design, vol. 16, no. 7, pp. 734-744, 1997.
[10] T. Dhaene and D. D. Zutter, "Selection of lumped element models for coupled lossy transmission lines," IEEE Trans. Computer-Aided Design, vol. 11, no. 7, pp.
