# **Repeater Insertion To Minimise Delay in Coupled Interconnects**

Dinesh Pamunuwa and Hannu Tenhunen, ESD Lab, Department of Electronics, Royal Institute of Technology, Electrum 229, Isafjordsgatan 22-26, SE-164 40 Kista, Sweden. dinesh|hannu@ele.kth.se

### ABSTRACT

Signalling over long interconnect is a dominant issue in electronic chip design in current technologies, as device sizes shrink and circuits become ever larger. Repeater insertion is a well-established technique for minimising the propagation delay over long resistive interconnect. In deep sub-micron technologies, as wires are spaced ever closer together and signal rise and fall times move into the sub-nanosecond region, the coupling between interconnects assumes great significance. The resulting cross talk has implications for data throughput and signal integrity. Depending on the data correlation on the coupled lines, the delay can either decrease or increase. In this paper we quantify the effect of worst-case capacitive cross talk in parallel buses and examine how it affects repeater insertion in particular. We develop analytic expressions for the delay, buffer size and buffer number that are suitable for a-priori timing analyses and signal integrity estimations. All equations are checked against a dynamic circuit simulator (SPECTRE).

## **1. INTRODUCTION**

Signal propagation on long resistive interconnect lines is a function of the product of the line resistance and capacitance, commonly known as the RC delay. Since both the resistance and the capacitance increase linearly with length, the delay increases quadratically with length. Because Moore's law has held true for VLSI circuits over the years, interconnections have become smaller in cross-section and longer in length with each succeeding generation of CMOS technology. Hence there has been a lot of investigation into the problem of repeater insertion in long interconnect. Bakoglu [1] presented one of the pioneering works in this area, an analysis based on characterizing the repeater with an input capacitance and an output resistance. Wu and Shiau [2] improve on the repeater model and use a linearised form of the Shichman-Hodges equations, while Adler and Friedman [3] use Sakurai's alpha power model to include the effect of velocity saturation in short channel devices. Ismail and Friedman [4] present an analysis which models inductance in the interconnect for the first time. Some other work on repeater insertion is given in [5] and [6].

In future generations of VLSI circuits, as the feature size shrinks to a fraction of a micrometre and more and more transistors are placed on a single chip, cross talk will pose a serious challenge in designing VLSI systems. In closely coupled interconnects such as long parallel buses, cross talk can either speed up the signal or cause considerable additional delay, depending on the correlation between the data on the different lines. We present in this paper an analysis for repeater insertion to minimise delay in parallel coupled interconnects with worst-case capacitive cross talk. Our methodology uses the same repeater model as in [1] and incorporates coupling capacitance between adjacent interconnect lines. The equations we derive reduce to Bakoglu's equations when the coupling capacitance term is set to zero, and are suitable for *a-priori* signalling estimates.

Figure 1. Configuration for investigating effect of cross talk

0-7695-0831-6/00 \$10.00 © 2000 IEEE

## **2. DELAY MODEL FOR COUPLED INTERCONNECTS**

The first step in analysing the effect of cross talk on repeater insertion is to obtain an accurate delay estimate for the step response. From here on, delay always means the 50% delay, since this corresponds to the delay of the output to the switching threshold of an inverter. We consider a line with coupling on two sides and derive a model for the step response with worst-case coupling: that is, both aggressor lines switch in the direction opposite to that of the victim line. The configuration for this analysis is shown in Fig. 1.



Figure 2. Lumped model with cross talk

The reasoning behind the derivation of the model requires a short diversion into previous work. The delay of a lumped RC circuit is analytically solvable and is

$$T_{0.5, lumped} = 0.7RC \tag{1}$$

The step response of a distributed RC circuit has no closed form time domain solution. However a closed form frequency domain solution exists, and it is possible to make an approximation for t>>RC and t<<RC and use these equations to separately calculate the low frequency and high frequency portions of the output waveform (Ref. [1]). This leads to the following delay model for the step response to a distributed RC line:

$$T_{0.5, distr} = 0.4RC \tag{2}$$
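
As a quick numerical illustration of Eqs. (1) and (2), the following Python sketch compares the lumped and distributed estimates and shows the quadratic growth of the unbuffered line delay with length. The per-unit-length values `R_PER_CM` and `C_PER_CM` are assumptions chosen for illustration only, not a characterised process.

```python
def t50_lumped(r_ohm, c_farad):
    """50% delay of a lumped RC stage, Eq. (1)."""
    return 0.7 * r_ohm * c_farad


def t50_distributed(r_ohm, c_farad):
    """50% delay of a distributed RC line, Eq. (2)."""
    return 0.4 * r_ohm * c_farad


# Illustrative per-unit-length parasitics (assumed values).
R_PER_CM = 625.0     # ohm per cm
C_PER_CM = 1.0e-12   # farad per cm


def line_delay(length_cm):
    """Unbuffered distributed-line delay; R and C both scale with length."""
    return t50_distributed(R_PER_CM * length_cm, C_PER_CM * length_cm)


if __name__ == "__main__":
    for length in (1.0, 2.0, 3.0):
        print(f"{length:.0f} cm: {line_delay(length) * 1e9:.3f} ns")
```

Doubling the length quadruples the delay, which is the motivation for breaking a long line into shorter, separately driven sections.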

Now the step response for a lumped RC circuit with coupling as shown in Fig. 2 is given by:

$$V_{out} = Vdd \left\{ 1 + \frac{C_s}{2C_c} e^{-\frac{t}{RC_s}} - \frac{4C_c + 3C_s}{2(2C_c + C_s)} e^{-\frac{t}{R(2C_c + C_s)}} \right\} \tag{3}$$

Just as the delay for the distributed line can be approximated by $T_{0.5} = \lambda RC$, where RC is the time constant of the lumped model and $\lambda$ a constant, we approximate the delay for the distributed line with cross talk by a model of the form $T_{0.5} = \lambda_a RC_s + \lambda_b R(2C_c+C_s)$, where $RC_s$ and $R(2C_c+C_s)$ are the two time constants present in the lumped model. If there is an additional line acting as the aggressor, the coupling capacitance term doubles. For a distributed line with two aggressor nets as shown in Fig. 1, this reduces to the following equation:

$$T_{0.5, distr, withCT} = 0.4RC_s + 0.58RC_c \tag{4}$$

Table 1. Comparison of actual delay and delay predicted by model for a distributed RC line with cross talk

| R (ohms) | Cs (fF) | Cc (fF) | Td, simulated (ps) | Td, model (ps) | Error (%) |
|----------|---------|---------|--------------------|----------------|-----------|
| 10       | 1       | 1       | 10                 | 10             | -0.1%     |
| 10       | 1       | 10      | 70                 | 60             | 5.5%      |
| 10       | 1       | 100     | 640                | 580            | 8.33%     |
| 10       | 100     | 1       | 390                | 410            | -5.06%    |
| 10       | 100     | 10      | 450                | 460            | -1.59%    |
| 10       | 100     | 100     | 980                | 980            | -0.09%    |
| 1000     | 1       | 1       | 980                | 980            | -0.1%     |
| 1000     | 1       | 10      | 6560               | 6200           | 5.5%      |
| 1000     | 1       | 100     | 63710              | 58400          | 8.33%     |
| 1000     | 10      | 1       | 4510               | 4580           | -1.6%     |
| 1000     | 10      | 10      | 9790               | 9800           | -0.08%    |
| 1000     | 10      | 100     | 65610              | 62000          | 5.51%     |
| 1000     | 100     | 1       | 38620              | 40580          | -5.07%    |
| 1000     | 100     | 10      | 45090              | 45800          | -1.58%    |
| 1000     | 100     | 100     | 97920              | 98000          | -0.08%    |

The constants were obtained by running simulations for a range of R, $C_c$ and $C_s$ and then fitting the above model to the data. Note that when $C_c$ is set to zero, this reduces to Bakoglu's approximation given in Eq. (2). Eq. (4) is appropriate for on-chip lines, which are typically very lossy so that the inductance is negligible. Ref. [7] gives closed form equations for the approximate coupling noise in the time domain, for coupled lines where the resistance is small compared to the inductive impedance.

Figure 3: Repeater Insertion in a long interconnect

Table 1 lists the delay values obtained from simulations together with the values predicted by the model in Eq. (4). There is good agreement for a wide range of R, $C_c$ and $C_s$ parameters, and the agreement is weakest when $C_c > 100C_s$, which is unrealistic in practice. For all practical purposes, this model is as accurate as the very commonly used approximation for a distributed line given in Eq. (2), which is reported to be accurate to within 4% over a wide range. This delay model is used in the next section to obtain equations for repeater numbering and sizing for minimum delay.
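
The model column of Table 1 can be regenerated directly from Eq. (4). The sketch below does this for a few of the R = 1000 rows. The error definition (simulated minus model, divided by simulated) is an assumption inferred from the tabulated percentages. The function returns the delay in whatever unit the inputs imply (Ω × fF gives fs, Ω × pF gives ps), and the rows are entered with the table's numerical values as printed.

```python
def t50_coupled(r, cs, cc):
    """Eq. (4): worst-case 50% delay of a distributed line with two
    aggressors. The result is in units of (unit of r) * (unit of cs)."""
    return 0.4 * r * cs + 0.58 * r * cc


def error_percent(simulated, model):
    """Relative error of the model with respect to simulation, in percent.
    This definition is an assumption inferred from Table 1."""
    return 100.0 * (simulated - model) / simulated


# (R, Cs, Cc, simulated delay) copied from the R = 1000 rows of Table 1.
TABLE1_ROWS = [
    (1000, 1, 10, 6560),
    (1000, 1, 100, 63710),
    (1000, 100, 1, 38620),
    (1000, 100, 10, 45090),
]

if __name__ == "__main__":
    for r, cs, cc, sim in TABLE1_ROWS:
        model = t50_coupled(r, cs, cc)
        print(r, cs, cc, round(model), f"{error_percent(sim, model):+.2f}%")
```

Setting `cc` to zero in `t50_coupled` returns the 0.4RC value of Eq. (2), as the text notes.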

## **3. REPEATER INSERTION**

To reduce delay, the long lines in Fig. 1 are broken up into shorter sections, with a repeater (an inverter) driving each section. Let the number of repeaters including the original driver be k, and the size of each repeater be h times a minimum sized inverter. The output impedance of a minimum sized inverter for the particular technology is $R_{drv}$ and the output capacitance $C_{drv}$. Then the output impedance of an h sized driver becomes $R_{drv}/h$, and the output capacitance $hC_{drv}$. This configuration is sketched out in Fig. 3, where the interconnect symbol refers to a capacitively coupled interconnect as shown in Fig. 1. Now, with reference to Fig. 3 and using superposition with the delay Eqs. (1), (2) and (4), the total delay becomes:

$$t_{0.5} = k \left[ 0.7 \frac{R_{drv}}{h} \left( \frac{C_s}{k} + hC_{drv} + 2.2 \frac{2C_c}{k} \right) + \frac{R}{k} \left( 0.4 \frac{C_s}{k} + 0.58 \frac{C_c}{k} + 0.7 hC_{drv} \right) \right] \tag{5}$$

It is assumed that the load $C_L$ is equal to the input capacitance of an h sized inverter. This delay expression follows the Bakoglu model; the difference lies in the terms containing $C_c$, which result from modelling the cross talk in the delay. The coefficient of 2.2 for the lumped term involving the coupling capacitance $C_c$ takes the Miller effect into account. The accuracy of this approximation for the delay of a section can be checked using Table 2, which shows the difference between the predicted delay and the delay given by SPECTRE simulations. Results are given for a range of values deemed relevant. The error is within 5% over the full range of test values, of which only a sample is given here.

Table 2. Comparison of delay model for section with simulations

| R (ohm) | Cs (fF) | Cc (fF) | Error Percentage |
|---------|---------|---------|------------------|
| 1000    | 1000    | 1000    | 4.6%             |
| 1000    | 1000    | 100     | -1.1%            |
| 1000    | 100     | 100     | 3.2%             |
| 1000    | 100     | 1000    | 2.7%             |
| 100     | 100     | 100     | 2.7%             |
| 100     | 100     | 10      | -2.1%            |
| 100     | 10      | 100     | -2.3%            |
| 100     | 10      | 10      | 2.2%             |
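
Eq. (5) translates directly into a small function. The sketch below is a plain transcription of the equation, with SI units assumed for all arguments. The example values (`R_DRV` = 10 kΩ, `C_DRV` = 10 fF and the 625 Ω, 1 pF, 0.44 pF line) are taken from the 0.35 µm example in this paper and serve only as sample inputs.

```python
def total_delay(k, h, r, cs, cc, r_drv, c_drv):
    """Eq. (5): 50% delay of a line split into k sections, each driven by
    a repeater h times the minimum size (SI units in, seconds out)."""
    # Lumped term: driver resistance charging one section and the next
    # repeater; the coupling term carries the coefficient 2.2 on 2*Cc.
    driver_term = 0.7 * (r_drv / h) * (cs / k + h * c_drv + 2.2 * 2.0 * cc / k)
    # Distributed wire of one section (Eq. 4) plus the lumped repeater load.
    wire_term = (r / k) * (0.4 * cs / k + 0.58 * cc / k + 0.7 * h * c_drv)
    return k * (driver_term + wire_term)


def bakoglu_delay(k, h, r, cs, r_drv, c_drv):
    """The same expression with the coupling terms removed (Cc = 0)."""
    return total_delay(k, h, r, cs, 0.0, r_drv, c_drv)


# Sample inputs (driver values as quoted for the 0.35 um technology).
R_DRV, C_DRV = 10e3, 10e-15
R, CS, CC = 625.0, 1e-12, 0.44e-12

if __name__ == "__main__":
    t = total_delay(2, 40, R, CS, CC, R_DRV, C_DRV)
    t0 = bakoglu_delay(2, 40, R, CS, R_DRV, C_DRV)
    print(f"with cross talk: {t * 1e9:.2f} ns, without: {t0 * 1e9:.2f} ns")
```

Keeping the coupling capacitance as an explicit argument makes the comparison with the uncoupled case a one-line change.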

Now to find the optimum h and k for minimising delay, the partial derivatives of Eq. (5) with respect to k and h are equated to zero. Setting

$$\frac{\partial t_{0.5}}{\partial k} = 0$$

leads to

$$k = \sqrt{\frac{0.4RC_s + 0.58RC_c}{0.7R_{drv}C_{drv}}} \tag{6}$$

Similarly

$$\frac{\partial t_{0.5}}{\partial h} = 0$$

leads to

$$h = \sqrt{\frac{0.7R_{drv}C_s + 3.1R_{drv}C_c}{0.7RC_{drv}}} \tag{7}$$

Note that when the coupling capacitance term $C_c$ is set to zero, Eqs. (6) and (7) simplify to the Bakoglu equations.
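
The step from Eq. (5) to Eqs. (6) and (7) can be made explicit. Multiplying out Eq. (5) gives four terms, of which two depend on k and two on h. This is only a restatement of the equations above; the one approximation is the rounding of $0.7 \times 4.4 = 3.08$ to 3.1.

$$t_{0.5} = \frac{0.7R_{drv}(C_s + 4.4C_c)}{h} + 0.7kR_{drv}C_{drv} + \frac{R(0.4C_s + 0.58C_c)}{k} + 0.7RhC_{drv}$$

$$\frac{\partial t_{0.5}}{\partial k} = 0.7R_{drv}C_{drv} - \frac{R(0.4C_s + 0.58C_c)}{k^2} = 0, \qquad \frac{\partial t_{0.5}}{\partial h} = -\frac{0.7R_{drv}(C_s + 4.4C_c)}{h^2} + 0.7RC_{drv} = 0$$

Solving the first condition for k gives Eq. (6), and solving the second for h gives Eq. (7). With $C_c = 0$ they become

$$k = \sqrt{\frac{0.4RC_s}{0.7R_{drv}C_{drv}}}, \qquad h = \sqrt{\frac{R_{drv}C_s}{RC_{drv}}}$$

which are the forms referred to above as the Bakoglu equations.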

Table 3 compares the buffer sizes, buffer numbers and delays given by the Bakoglu equations with those given by the equations that take cross talk into account, for a number of line resistances and capacitances. The difference in the delay times increases with increasing coupling capacitance; in certain cases it is as much as 50%. Critical nets which are buffered without taking cross talk into account will exceed the timing slack. It should be noted that the optimal sizes taking cross talk into account are considerably higher than those given by the Bakoglu equations. Replacing $C_c$ by $C_c/2$ in all of the above gives the equations corresponding to a system with a single aggressor.

| R (ohm) | Cs (pF) | Cc (pF) | k1 (without cross talk) | h1 (without cross talk) | k2 (with cross talk) | h2 (with cross talk) | Td<sub>0.5</sub> using k1 & h1 (ns) | Td<sub>0.5</sub> using k2 & h2 (ns) | Diff. due to disregarding cross talk |
|---------|---------|---------|-------------------------|-------------------------|----------------------|----------------------|-------------------------------------|-------------------------------------|--------------------------------------|
| 100     | 0.5     | 0.5     | 0.6                     | 79                      | 0.9                  | 184                  | 0.39                                | 0.31                                | 25%                                  |
| 100     | 0.5     | 1       | 0.6                     | 79                      | 1.2                  | 248                  | 0.62                                | 0.41                                | 49%                                  |
| 100     | 1       | 0.5     | 0.8                     | 112                     | 1.2                  | 200                  | 0.39                                | 0.35                                | 11%                                  |
| 100     | 1       | 1       | 0.8                     | 112                     | 1.3                  | 260                  | 0.56                                | 0.45                                | 25%                                  |
| 1000    | 0.5     | 0.5     | 1.9                     | 25                      | 3                    | 58                   | 1.25                                | 0.98                                | 28%                                  |
| 1000    | 0.5     | 1       | 1.9                     | 25                      | 3.7                  | 78                   | 2.02                                | 1.30                                | 56%                                  |
| 1000    | 1       | 0.5     | 2.7                     | 35                      | 3.5                  | 63                   | 1.23                                | 1.11                                | 11%                                  |
| 1000    | 1       | 1       | 2.7                     | 35                      | 4.2                  | 82                   | 1.78                                | 1.39                                | 27%                                  |

Table 3. Investigation of the effect of including cross talk in buffer sizing and numbering

We ran Spectre simulations on an actual 0.35 µm AMS technology and compared the impact of the different buffer numbering and sizing to further verify the model. The average output impedance of a minimum sized inverter in that technology is 10 kΩ, while the input capacitance is approximately 10 fF. Table 4 gives the results of simulations run for two nets, the first 1 cm long and the second 3 cm long, where parallel nets switch in the opposite direction as in Fig. 1. Both lines are 0.7 µm wide and 0.5 µm thick, with a spacing between the lines of 0.7 µm and a dielectric thickness of 0.5 µm. The resistive and capacitive parasitics were obtained from typical per unit length values. The buffers were numbered and sized according to Bakoglu's equations ($k_a$ and $h_a$) and according to the cross-talk analysis model ($k_b$ and $h_b$), with k being rounded to the nearest integer. The delay in both cases was calculated using the new metric proposed in Eq. (5). The rise times of the signals in all cases were set to 100 ps. It can be seen that there is good agreement with the model.
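
The sizing procedure described above can be sketched in Python for the 1 cm net, using the line and driver values quoted in the text and in Table 4. This is an illustrative transcription of Eqs. (5) to (7), not the authors' tool. k is left unrounded here, so the computed figures should only be expected to land close to the tabulated ones.

```python
import math


def optimal_k(r, cs, cc, r_drv, c_drv):
    """Eq. (6): number of repeaters, including the original driver."""
    return math.sqrt((0.4 * r * cs + 0.58 * r * cc) / (0.7 * r_drv * c_drv))


def optimal_h(r, cs, cc, r_drv, c_drv):
    """Eq. (7): repeater size relative to a minimum sized inverter."""
    return math.sqrt((0.7 * r_drv * cs + 3.1 * r_drv * cc) / (0.7 * r * c_drv))


def total_delay(k, h, r, cs, cc, r_drv, c_drv):
    """Eq. (5), repeated here so that the sketch is self-contained."""
    driver_term = 0.7 * (r_drv / h) * (cs / k + h * c_drv + 2.2 * 2.0 * cc / k)
    wire_term = (r / k) * (0.4 * cs / k + 0.58 * cc / k + 0.7 * h * c_drv)
    return k * (driver_term + wire_term)


R_DRV, C_DRV = 10e3, 10e-15          # minimum inverter, 0.35 um technology
R, CS, CC = 625.0, 1e-12, 0.44e-12   # 1 cm net of Table 4

if __name__ == "__main__":
    # Bakoglu sizing: coupling ignored when choosing k and h.
    k_a = optimal_k(R, CS, 0.0, R_DRV, C_DRV)
    h_a = optimal_h(R, CS, 0.0, R_DRV, C_DRV)
    # Cross-talk sizing: coupling included.
    k_b = optimal_k(R, CS, CC, R_DRV, C_DRV)
    h_b = optimal_h(R, CS, CC, R_DRV, C_DRV)
    for label, k, h in (("Bakoglu", k_a, h_a), ("cross-talk", k_b, h_b)):
        # Both sizings are evaluated with the coupling present, as in Eq. (5).
        t = total_delay(k, h, R, CS, CC, R_DRV, C_DRV)
        print(f"{label:10s} k={k:.1f} h={h:.0f} Td={t * 1e9:.2f} ns")
```

Because R, $C_s$ and $C_c$ all scale with length, Eq. (7) gives the same h for the 3 cm net, while k from Eq. (6) scales linearly with length.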

The effect of cross talk on delay can also be seen in the eye diagrams shown in Fig. 4. Since the eye diagrams are obtained by simulating with pulse streams instead of single steps, the eye opening is a statistical measure of delay. These were obtained with different pseudo-random bit streams running on the three lines, with a bit frequency of 1 GHz and rise and fall times of 100 ps. The simulations were carried out for the 3 cm line with parameters as given in Table 4. The first diagram is with repeaters numbered and sized according to $k_b$ and $h_b$, and the second with repeaters sized and numbered according to $k_a$ and $h_a$ as given in Table 4. The first has an eye opening as shown, while in the second this has completely closed.
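
To give a feel for how often the worst case of Fig. 1 occurs with random data on the three lines, the following sketch counts the bit intervals in which the victim switches while both neighbours switch the opposite way. Python's `random` module stands in for the pseudo-random bit source, which is an assumption; the generator used for the simulations is not specified. The helper names are hypothetical.

```python
import random


def edge(prev, curr):
    """Return +1 for a rising edge, -1 for a falling edge, 0 for no change."""
    return curr - prev


def worst_case_fraction(victim, left, right):
    """Fraction of bit intervals in which the victim switches while both
    neighbours switch in the opposite direction (the case of Fig. 1)."""
    if not (len(victim) == len(left) == len(right)) or len(victim) < 2:
        raise ValueError("need three equally long streams of at least 2 bits")
    worst = 0
    for i in range(1, len(victim)):
        v = edge(victim[i - 1], victim[i])
        a = edge(left[i - 1], left[i])
        b = edge(right[i - 1], right[i])
        if v != 0 and a == -v and b == -v:
            worst += 1
    return worst / (len(victim) - 1)


def random_bits(n, seed):
    """Reproducible stream of n random bits (stand-in for a PRBS source)."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(n)]


if __name__ == "__main__":
    n = 20000
    frac = worst_case_fraction(random_bits(n, 1), random_bits(n, 2),
                               random_bits(n, 3))
    print(f"worst-case intervals: {100 * frac:.2f}%")
```

For independent, uniformly random bits the expected fraction is 1/32: each line makes a given transition with probability 1/4, and there are two worst-case patterns (victim rising or falling). The worst case is therefore infrequent, but it is the pattern that determines the eye opening.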

## **4. SUMMARY**

| Net length (cm) | R (ohm) | Cs (pF) | Cc (pF) | k<sub>a</sub> (Bakoglu sizing) | h<sub>a</sub> (Bakoglu sizing) | Td<sub>0.5,a</sub> (ns) | k<sub>b</sub> (cross-talk sizing) | h<sub>b</sub> (cross-talk sizing) | Td<sub>0.5,b</sub> (ns) | Td<sub>0.5</sub> Spectre, Bakoglu sizing (ns) | Td<sub>0.5</sub> Spectre, cross-talk sizing (ns) | Error, Bakoglu sizing | Error, cross-talk sizing |
|-----------------|---------|---------|---------|--------------------------------|--------------------------------|-------------------------|-----------------------------------|-----------------------------------|-------------------------|-----------------------------------------------|--------------------------------------------------|-----------------------|--------------------------|
| 1               | 625     | 1       | 0.44    | 1.9                            | 40                             | 1.04                    | 2.4                               | 68                                | 0.95                    | 1.12                                          | 1.08                                             | 7.1%                  | 12%                      |
| 3               | 1875    | 3       | 1.32    | 5.7                            | 40                             | 3.12                    | 7.3                               | 66                                | 2.83                    | 3.16                                          | 1.3%                                             | 4.3%                  | 4.7%                     |

Table 4. Comparison with simulations run in a 0.35 µm technology

Figure 4: Eye diagrams for different repeater configurations for a 3cm long net with cross talk

We have presented a worst-case delay model for parallel coupled interconnects and shown that it is accurate to within 5% over a wide range of parameters. We have used this model to study the impact of cross talk on buffer sizing for delay minimisation in long nets, and derived a new set of equations that give the number and optimum size of the repeaters. The delay equation for a single section of the line was verified with simulations and found to be accurate to within 5% over the range of test parameters, which were chosen to encompass a wide spectrum.

Finally, we tested our delay model and the buffer sizing equations by running Spectre simulations with transistors from an actual 0.35 µm AMS process. We obtained an accuracy of around 85% for the specific cases investigated. We also show, by means of the eye diagram at the output of the line, that disregarding cross talk can result in closure of the sampling window.

All the equations derived in this paper are completely general and are in no way restricted to a particular technology. The test cases are representative points for different regions of the parameter space defined by the R, $C_s$ and $C_c$ coordinates, so that typical values for a wide range of buses in VLSI circuits are represented.

Even in modern synthesis programs, the capability of the routing tool to take into account effects such as cross talk is very limited. This results in poor layout and repeated iterations of the design cycle. The availability of closed form equations to predict the timing behaviour in the face of cross talk would hence potentially be very useful.

## **5. REFERENCES**

- [1] Bakoglu H. B., "Circuits, Interconnections, and Packaging for VLSI", Addison Wesley, 1990.
- [2] Wu C. Y. and Shiau M., "Accurate speed improvement techniques for RC line and tree interconnections in CMOS VLSI", in proc. IEEE International Symposium on Circuits and Systems (ISCAS) 1990, pp. 2.1648-2.1651.
- [3] Adler V. and Friedman E. G., "Repeater Design to Reduce Delay and Power in Resistive Interconnect", in IEEE Transactions on Circuits and Systems-II, Analog and Digital Signal Processing, Vol. 45, No. 5, May 1998.
- [4] Ismail Y. I. and Friedman E. G., "Effects of Inductance on the Propagation Delay and Repeater Insertion in VLSI Circuits", IEEE Transactions on VLSI Systems, vol. 8, pp. 195-206, April 2000.
- [5] Nekili M. and Savaria Y., "Optimal methods of Driving Interconnections in VLSI Circuits", in proc. IEEE International Symposium on Circuits and Systems (ISCAS), May 1992, pp. 21-23.
- [6] Dar S. and Franklin M. A., "Optimum Buffer Circuits for Driving Long Uniform Lines", IEEE J. Solid State Circuits, vol. 26, pp. 32-40, Jan. 1991.
- [7] Tang K. T. and Friedman E. G., "Peak Crosstalk Noise Estimation in CMOS VLSI Circuits", in proc. IEEE International Symposium on Circuits and Systems (ISCAS) 1999, vol. 1, pp. 541-544.
