

# Measuring the Tolerance of Self-adaptive Clocks to Supply Voltage Noise

Jordi Pérez-Puigdemont, Francesc Moll

Dpt. of Electronic Engineering, Universitat Politécnica de Catalunya - C/ Jordi Girona 1-3 08034 Barcelona jordi.perez-puigdemont@upc.edu | francesc.moll@upc.edu

Jordi Cortadella

Dpt. of Languages and Informatics Systems, Universitat Politécnica de Catalunya - C/ Jordi Girona 1-3 08034 Barcelona jordi.cortadella@upc.edu

Abstract—Simultaneous switching noise has become an important issue due to its signal integrity and timing implications. Therefore a lot of time and resources are spent during the PDN design to minimize the supply voltage variation. This paper presents the self-adaptive clock as an alternative to tolerate the critical path delay variation due to supply noise thanks to its self-adaptable nature. A self-adaptive clock generation circuit is proposed in this paper and its benefits, in terms of clock period reduction, are assessed under a realistic supply noise obtained through simulation for different switching activities.

## I. INTRODUCTION

The synchronous system paradigm continues to be the basis of today's sequential system design. In this paradigm the clock period is fixed and determined by the critical path delay plus the setup time of the registers and a timing margin to accommodate worst-case conditions. In current and new technologies, process variations [1], [2] introduce a large uncertainty in the gate delay, which could be up to 30% [3], [4], and therefore a larger timing margin, $\Delta T_{PV}$, must be added to the clock period. In some circuits this added margin accounts for a loss of performance. In addition to process variations, voltage variations introduce a second source of added timing margin, $\Delta T_{VV}$, which, as shown in Fig. 2, could be up to 50% of the nominal delay.
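One compact way to write this budget, using $t_{cp}$ and $t_{setup}$ as assumed symbols (they do not appear elsewhere in this paper) for the critical path delay and the register setup time, is:

$$T_{clk} = t_{cp} + t_{setup} + \Delta T_{PV} + \Delta T_{VV}$$

Any reduction of $\Delta T_{PV}$ or $\Delta T_{VV}$ translates directly into a shorter clock period.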

In order to minimize the effects of power supply noise, a large design effort is dedicated to the power supply distribution [5], [6], including the use of on-chip decoupling capacitors [7], [8]. The use of decoupling capacitors presents a drawback in terms of area overhead and increased leakage.

The objective of this paper is to evaluate the use of a self-adaptive clock as a way to tolerate a larger power supply noise originating from a simpler power distribution network or a smaller quantity of decoupling capacitance.

A self-adaptive clock is a clock whose period varies in correlation with the delay variations of the critical path, whether these are caused by process or by supply voltage variations. If the correlation is high, the circuit will still behave correctly even under large voltage fluctuations, because the clock period will be longer when the critical path is slower and vice versa. This approach aims to reduce as much as possible the time guards $\Delta T_{VV}$ and $\Delta T_{PV}$ introduced due to the delay uncertainty caused by voltage or process variations, respectively. In the current article we focus only on the reduction of $\Delta T_{VV}$ due to supply voltage variation. The studied approach is focused on globally-asynchronous locally-synchronous (GALS) architectures [9], depicted in Fig. 1, even though the results can be easily extended to asynchronous circuits with bundled data and matched delays [10]. In the GALS architecture the whole circuit is divided into different clock islands, which operate synchronously internally and communicate with each other asynchronously. In this paper we propose a self-adaptive clock to act as the local clock of each island in the GALS architecture.

Fig. 1. Scheme of a GALS circuit. The different synchronous clock domains communicate with the neighbouring domains through asynchronous FIFO queues. These queues isolate the different clock domains in terms of timing constraints. Therefore each domain can be studied separately.

The proposed mechanism is self-compensating in the sense that no external control is necessary and the clock frequency self-adjusts to the existing variations in the critical path. Moreover, the self-adaptive clock scheme allows a relaxation of the Power Distribution Network (PDN) rules, resulting in shorter design times, fewer on-chip decoupling capacitors, or lower operating voltages and consequently a decrease in power.

The remainder of this work is structured in four sections. In the first we develop the concept of self-adaptive clocks. After that we focus on how the supply voltage variation is modelled. In the third section we assess the benefits of the self-adaptive clock scheme in two ways: by analysing the correlation between the supply voltage waveforms at the self-adaptive clock and at the critical path, and by analysing the delay variation produced by the supply noise. Finally, we summarize our results in the conclusions.



Fig. 2. Propagation delay for different logic gates, taken from the same 65 nm library, as a function of supply voltage. Every gate output is loaded with a gate identical to itself. Despite the different delay values, the dependence of the delay on the supply voltage is similar for every gate.

## II. CONCEPT OF SELF-ADAPTIVE CLOCKS

As mentioned, the proposed self-compensation scheme will be effective when there is a strong similarity between clock period variations and critical path delay variations throughout the clock domain. Therefore, the method aims especially at compensating for systematic die-to-die variations, temperature variations and supply voltage variations.

In this work, which is a proof of concept, we consider a ring oscillator (RO) made of inverters as the clock signal generation circuit. This circuit has been chosen because the delay variation that supply voltage noise causes in the inverters (Fig. 2) behaves similarly to that of other logic gates. Therefore, the corresponding clock period will naturally adapt to voltage variations as well as to process and temperature variations that may affect the critical path. However, the delay is not the same for the different logic gates; in a more realistic design, instead of a RO made of inverters, the RO should be made with a mixture of logic gates similar to the ones present in the critical path.

The gain in operating frequency with respect to a conventional (fixed) clock, that is, the decrease in $\Delta T_{VV}$, will depend on the similarity between the supply voltage noise at the clock generator circuit and at the critical path. The similarity of these noise waveforms, and hence the correlation between the critical path delay and the clock period, will depend on their locations, more precisely on the relative distance between them. However, since the clock signal is not distributed instantaneously, we have to take into account the clock distribution tree depth. This depth can be modelled as a delay applied to the supply voltage of the clock generation circuit and, as we will see in Section IV, it plays a major role in the minimum time guard $\Delta T_{VV}$ that is required.

It is important to state that the work presented here is focused only on the timing implications of the supply voltage noise for a given PDN. We also take into account only the high-frequency noise, that is, the noise caused on the die by the circuit's switching activity. We focus on the high-frequency noise because it is the least spatially and temporally correlated noise, and therefore the noise that will most degrade the performance of a self-adaptive clock.



Fig. 3. Scheme of the PDN. The horizontal stripes are placed in the lower metallization level, i.e. level 1, while the vertical stripes are placed in an upper metallization level such as level 2. The figure also shows the location and size of the standard cells and the ports where the supply voltage is measured. This supply voltage is always calculated as the voltage difference between the Vdd rail and the Gnd rail at the ij-th location, that is, $V_{supply}(i, j) = V_{Vdd}(i, j) - V_{Gnd}(i, j)$.

## III. VOLTAGE VARIATION MODELLING

To assess the effectiveness of the self-adaptive clock architecture by measuring the increase in operating frequency, we need to work with a realistic on-chip PDN. Simulating this PDN then yields realistic voltage variations on the power supply rails.

In order to model a realistic PDN under a realistic work load we follow these steps:

- 1) Specify the power distribution network (PDN) size in terms of the number of activity cells, $N \times M$, as depicted in Fig. 3. Each of these activity cells represents a group of closely located standard cells.
- 2) Generate a PDN model with RL parasitics extracted with the FastHenry 3.0 software [11].
- 3) Add, in parallel with each PDN port, a capacitor $C_{SW}$ that represents the total capacitance that has to be charged or discharged when the activity cell switches.
- 4) Simulate the noise generated at each port by the single switching of every activity cell. This is achieved by placing a PWL current source with a Gaussian shape in parallel with the port of the switching cell. This high-level model is similar to the one used for substrate noise analysis [12]. The width of the Gaussian current pulse is related to the switching time of the activity cell. The peak current is fixed by the $C_{SW}$ that has to be charged or discharged and by the width of the current pulse. Therefore the shape of the noise produced by a single cell switching is determined by the PDN size, the relative position of the switching cell and the measuring port, the switching capacitance $C_{SW}$ and the switching time. The result of this step is a switching noise library (SNL) of the simulated PDN with $(N \times M)^2$ entries. Some of its entries are depicted in Fig. 4.
- 5) Combine the single-cell switching noise (assuming that the PDN behaves as a linear system) to generate simultaneous switching noise (SSN) following a given pattern at each self-adaptive clock rising edge. When the supply noise waveforms are combined, the switching instants of the activity cells are governed by a self-adaptive clock: the time between switching events depends on the clock's supply noise, which is in turn triggered by the same self-adaptive clock, in an interdependent process. The result of this last step is shown in Fig. 5.
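The relation in step 4 between the peak current, $C_{SW}$ and the pulse width can be sketched as follows. This is a minimal illustration, assuming that the pulse delivers the charge $Q = C_{SW} V_{dd}$ and that the width is expressed as the Gaussian standard deviation; the function names and the numerical values (10 fF, 1.2 V, 5 ps) are hypothetical and not taken from the paper.

```python
import math

def gaussian_pwl(c_sw, v_dd, sigma, t_center, n_points=41, span=4.0):
    """Return (time, current) pairs sampling a Gaussian current pulse.

    Assumption: the pulse delivers the charge Q = c_sw * v_dd, so the
    peak follows from the Gaussian area I_peak * sigma * sqrt(2*pi).
    """
    i_peak = c_sw * v_dd / (sigma * math.sqrt(2.0 * math.pi))
    t_start = t_center - span * sigma
    dt = 2.0 * span * sigma / (n_points - 1)
    points = []
    for k in range(n_points):
        t = t_start + k * dt
        current = i_peak * math.exp(-((t - t_center) ** 2) / (2.0 * sigma ** 2))
        points.append((t, current))
    return points

def trapezoid_charge(points):
    """Integrate the piecewise-linear current to recover the charge."""
    return sum(0.5 * (ia + ib) * (tb - ta)
               for (ta, ia), (tb, ib) in zip(points, points[1:]))

# Hypothetical values: 10 fF switched at 1.2 V with a 5 ps-wide pulse.
pulse = gaussian_pwl(c_sw=10e-15, v_dd=1.2, sigma=5e-12, t_center=30e-12)
charge = trapezoid_charge(pulse)
```

The list of `(time, current)` pairs has the shape expected by a piecewise-linear source, and integrating it recovers approximately the assumed charge, which is a quick check that the peak value is consistent with the chosen width.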

Simulating the SSN in this way, we can precisely control the switching of every cell at any given clock cycle. In the work presented here we set up this process to control the switching activity by means of a switching probability, P(Switch), that is constant in space and time. The mean number of simultaneously switching cells is proportional to this probability.
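The superposition of step 5 with a constant switching probability can be sketched as below. This is a simplified illustration under stated assumptions: the clock edges are given as fixed sample indices (the feedback from the supply noise to the self-adaptive clock period is omitted), cells are addressed by a flat index, and the toy library values are hypothetical.

```python
import random

def superpose_ssn(snl, n_cells, edges, p_switch, n_samples, rng):
    """Build the noise waveform seen at every cell by linear superposition.

    snl[src][obs] is the sampled noise at cell `obs` caused by a single
    switching of cell `src` (one entry of the switching noise library).
    """
    noise = [[0.0] * n_samples for _ in range(n_cells)]
    for edge in edges:                       # sample index of a clock edge
        for src in range(n_cells):
            if rng.random() >= p_switch:     # cell stays idle this cycle
                continue
            for obs in range(n_cells):
                for k, value in enumerate(snl[src][obs]):
                    if edge + k < n_samples:
                        noise[obs][edge + k] += value
    return noise

# Toy library (hypothetical numbers): 3 cells, with a droop that is
# deeper at the switching cell than at the other cells.
n_cells = 3
snl = [[[-0.02, -0.01] if src == obs else [-0.005, -0.0025]
        for obs in range(n_cells)] for src in range(n_cells)]

waves = superpose_ssn(snl, n_cells, edges=[0, 4, 8], p_switch=1.0,
                      n_samples=12, rng=random.Random(0))
idle = superpose_ssn(snl, n_cells, edges=[0, 4, 8], p_switch=0.0,
                     n_samples=12, rng=random.Random(0))
```

With `p_switch=1.0` every cell switches at every edge, so each port sees its own droop plus the contributions of the other two cells; with `p_switch=0.0` the waveforms stay at zero.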



Fig. 4. Noise waveforms obtained after step 4 of the simulation process. The upper-left graph corresponds to the supply voltage noise obtained at the port of the switching cell. The other three graphs (upper-right, bottom-left and bottom-right) are the supply voltage noise waveforms obtained at different ports of the PDN. Note that the high-frequency behaviour is depicted in the insets.

Once this process is run for a given PDN, a specific P(Switch) and a nominal clock period (which is determined by the number of inverters in the RO), $N \times M$ SSN waveforms are obtained. Each of these corresponds to the supply noise measured at one of the PDN cells (PDN ports, Fig. 3). The shape of these noise waveforms depends on the switching probability and the location of the observed cell. The results of the simulation process as a function of the switching probability and the nominal RO period are shown in Fig. 5. In this figure it can be seen that, as P(Switch) increases, both the amplitude of the supply noise and the duration of the perturbation increase. At the same time, the perturbations recorded at different clock periods also become more similar to each other.

Once we have obtained the supply noise all across the PDN for a given switching probability, we can use these data to calculate the effect of this noise on the logic circuit delay and the performance gain of the self-adaptive clock compared with classical fixed clock systems.

## IV. EVALUATION OF SELF-ADAPTIVE CLOCK ARCHITECTURE FOR VOLTAGE VARIATION TOLERANCE

In order to assess the reduction of the clock period margin, $\Delta T_{VV}$, due to the use of a self-adaptive clock, it is necessary to set up a simulation framework. This framework reduces both the complexity of the circuit that has to be simulated and the simulation time. In the previous section we described how the PDN supply noise is obtained through simulation for a given set of specifications. In our case the simulated PDN has a rectangular shape with $11 \times 11$ activity cells, each of them measuring $2.5\,\mu m \times 3\,\mu m$. The clock generation circuit is a RO made of 13 inverters (65 nm technology), which leads to a nominal operating frequency of 2.4 GHz. The critical path is made of 27 inverters of the same type as the ones used in the RO.
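As a rough consistency check on these figures, assuming the usual idealisation that one RO period corresponds to two traversals of an $N$-stage ring in which every inverter has the same average delay $t_{inv}$ (a symbol introduced here for illustration only), the nominal values imply:

$$T_{RO} = \frac{1}{f_{RO}} = \frac{1}{2.4\,\text{GHz}} \approx 417\,\text{ps}, \qquad t_{inv} \approx \frac{T_{RO}}{2N} = \frac{417\,\text{ps}}{2 \times 13} \approx 16\,\text{ps}$$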

As shown in Fig. 2, the logic gate delay is inversely related to the supply voltage, and this affects both the period of the ring oscillator and the delay of the critical path. It is therefore interesting to study the correlation between the voltage waveforms at the RO and critical path locations.

### A. Voltage correlation measure

Since we are assessing the self-adaptive clock architecture through the supply noise we need some metrics to measure the similarity between the supply noise at the central cell and the rest of the cells. As a first approach, we study the correlation between the noise at the central cell, or ring oscillator (RO) cell, and the other cells: the higher the degree of similarity between the supply noise in the RO and the supply noise of the critical path, the more efficient the self-adaptive clock scheme will be.

In order to assess the self-adaptive clock architecture we have to take into account the distance $\delta$ between the RO cell and the cell containing the critical path, as well as the delay $\tau$ introduced by the clock distribution tree. As a first approximation, $\tau$ is assumed to be the same across the tree (perfectly balanced clock distribution) and unaffected by supply noise. We assume that the circuit used to generate the clock signal (RO) is placed in the central cell of the PDN, and the distance $\delta$ is measured from this cell. The delay $\tau$ is introduced as a shift in the RO supply voltage vector. The metric function $\rho$ is then calculated as follows:



Fig. 5. SSN measured at the central cell of an $11 \times 11$-cell PDN for different values of the switching probability P(Switch), which is constant in space and time. As P(Switch) increases, the mean number of switching cells at each period, $\langle N_{cc} \rangle$, also increases. This increase in $\langle N_{cc} \rangle$ leads to an increase in the noise amplitude, but also to a rise in the correlation between the switching perturbations produced at each clock edge.



Fig. 6. Supply noise correlation $\rho(\delta, \tau)$ for four different values of P(Switch): 0.0100, 0.0322, 0.1036 and 0.3333, from left to right and top to bottom in the figure. The supply noise correlation shows a periodic behaviour along the clock tree depth dimension, with the maxima placed at multiples of the nominal period $T_{RO}$. At the same time, it does not show any strong dependence on the distance $\delta$.

$$\rho(\delta,\tau) = \min_{i,j} \left\{ \operatorname{corr} \left( V_{ij}(t), V_{RO}(t-\tau) \right) \right\} \tag{1}$$

where $i, j \mid \lVert \vec{r}_{RO} - \vec{r}_{ij} \rVert = \delta$, $\vec{r}_{RO}$ is the RO position and $\vec{r}_{ij}$ is the position of the *ij-th* cell. Note that the worst case corresponds to minimum correlation and for this reason the metric function is chosen as the minimum among all the different cells that are at the same distance $\delta$.
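The metric of Eq. (1) can be sketched on sampled waveforms as follows. This is an illustrative implementation under assumptions: the Pearson coefficient is used for corr, $\tau$ is expressed as an integer number of samples (`shift`), cells are addressed by a flat index with explicit coordinates, and the toy waveforms are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rho(v_cells, positions, ro_index, delta, shift, tol=1e-9):
    """Minimum correlation over all cells at distance `delta` from the RO."""
    v_ro = v_cells[ro_index]
    x_ro, y_ro = positions[ro_index]
    worst = None
    for idx, (x, y) in enumerate(positions):
        if abs(math.hypot(x - x_ro, y - y_ro) - delta) > tol:
            continue
        # Compare V_ij(t) with V_RO(t - tau) on the overlapping samples.
        c = pearson(v_cells[idx][shift:], v_ro[:len(v_ro) - shift])
        worst = c if worst is None else min(worst, c)
    return worst

# Toy data: a periodic perturbation with a period of 8 samples.
period = 8
v_ro = [1.2 + 0.05 * math.sin(2 * math.pi * k / period) for k in range(40)]
v_cells = [v_ro, [1.2 + 0.5 * (v - 1.2) for v in v_ro], list(v_ro)]
positions = [(0, 0), (1, 0), (0, 1)]

rho_aligned = rho(v_cells, positions, 0, delta=1.0, shift=0)
rho_half = rho(v_cells, positions, 0, delta=1.0, shift=period // 2)
rho_full = rho(v_cells, positions, 0, delta=1.0, shift=period)
```

For this idealised periodic noise the metric is close to 1 when the shift is zero or a whole period and close to -1 at half a period, which is the kind of periodic dependence on $\tau$ that a correlation-based metric produces.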

Fig. 6 shows the metric function $\rho(\delta, \tau)$ between the supply noise at the RO cell and the supply noise at all the other cells, including the RO cell itself, for different switching probabilities. Two interesting facts arise from this figure. First, for a given clock distribution tree delay, the distance to the central cell has no influence on the metric function. Second, the correlation shows a clear periodic behaviour along the $\tau$ axis, with a period that corresponds to the nominal clock period. The maximum values of the metric function appearing at each clock period increase with increasing probability P(Switch).

### B. Relative delay measure

The relation between supply voltage and gate delay involves complex dynamic processes and therefore, to evaluate our previous results, it is necessary to assess the self-adaptive clock system through the difference between the RO period and the delay of a virtual critical path located along the PDN. In our study, the critical path is simulated with a chain of 2N inverters, where N is the number of inverters in the RO. The difference between the RO period and the critical path delay corresponds to the time guard $\Delta T_{VV}$ that needs to be added to the clock period to ensure correct chip operation.

1) Worst case cell determination: To avoid simulating the delay of every cell in the PDN, which would consume too many computational resources, we only simulate the cell corresponding to the worst case (WC) condition. The WC in terms of timing corresponds to the case when the supply voltage at the RO is higher than the nominal value while the critical path is fed with a sub-nominal voltage. As shown in Fig. 2, a sub-nominal supply voltage leads to an increase in the propagation delay, while a supra-nominal voltage leads to shorter propagation delays or, in the RO case, to shorter clock periods.

To identify the cell where the WC takes place we define the worst relative voltage difference between the RO and a cell, taking into account the clock distribution tree depth  $\tau$ , as follows

$$\Delta V_{min}(i,j) = \min_{t,\tau} \left\{ V_{ij}(t) - V_{RO}(t-\tau) \right\} \tag{2}$$

then the WC cell indices are determined as

$$(i, j)_{WC} = \arg\min_{i,j} \Delta V_{min}(i, j) \tag{3}$$

The  $\Delta V_{min}(i, j)$  values, for different switching probabilities, are shown in Fig. 7, where the WC cells are highlighted.
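The worst-case search of Eqs. (2) and (3) can be sketched on sampled waveforms as follows. This is an illustration under assumptions: $\tau$ is restricted to integer sample shifts from 0 to `max_shift`, cells are addressed by a flat index instead of the pair $(i, j)$, and the voltages are hypothetical.

```python
def delta_v_min(v_cell, v_ro, max_shift):
    """Eq. (2): smallest V_ij(t) - V_RO(t - tau) over t and tau."""
    best = float("inf")
    for shift in range(max_shift + 1):
        for k in range(shift, len(v_cell)):
            best = min(best, v_cell[k] - v_ro[k - shift])
    return best

def worst_case_cell(v_cells, ro_index, max_shift):
    """Eq. (3): index of the cell with the most negative Delta V_min."""
    scores = {idx: delta_v_min(v, v_cells[ro_index], max_shift)
              for idx, v in enumerate(v_cells)}
    return min(scores, key=scores.get), scores

# Hypothetical supply waveforms in volts; cell 0 hosts the RO.
v_cells = [
    [1.20, 1.25, 1.20, 1.20],   # RO cell: supra-nominal bump
    [1.20, 1.20, 1.15, 1.20],   # droop one sample after the RO bump
    [1.20, 1.20, 1.20, 1.18],   # shallower, later droop
]
wc_index, scores = worst_case_cell(v_cells, ro_index=0, max_shift=2)
```

In this toy case the search selects cell 1, where a droop of the path coincides, for a shift of one sample, with the supra-nominal bump at the RO.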



Fig. 7. Values of $\Delta V_{min}(i, j)$ for P(Switch) equal to 0.0100, 0.0322, 0.1036 and 0.3333, from left to right and top to bottom. The worst cell is highlighted for each case.

2) Time margin determination: Once the worst case cell is determined, we can study the time difference between it and the RO through transistor-level electrical simulation carried out with HSPICE. If the delay of a chain with twice as many inverters as the RO is defined as a function of the supply voltage, $delay(V_{ij}(t))$, the relative delay for the worst case is

$$\Delta d_{WC}^n(\tau) = \text{delay}(V_{RO}(t-\tau)) - \text{delay}(V_{WC}(t)), \qquad nT_{RO} < t \le (n+1)T_{RO} \tag{4}$$

where  $\tau$  is the clock distribution tree delay.

Then the time margin that needs to be added to the clock period  $\Delta T_{VV}$ , to ensure the correct sequential chip operation, is, for a given  $\tau$ , equal to

$$\Delta T_{VV}(\tau) = -\min_{n}\left\{\Delta d_{WC}^n(\tau)\right\} \tag{5}$$

The value of $\Delta T_{VV}(\tau)$ for different values of P(Switch) is shown in Fig. 8, together with the $\Delta T_{VV}$ margin needed in the fixed clock architecture. A reduction of the time margin is achieved for the four studied workloads, and this reduction grows as P(Switch) increases. The data in Fig. 8 show that the self-adaptive clock architecture gives a clear advantage over the fixed clock architecture in terms of operating frequency. Notably, although this reduction shows a periodic behaviour as a function of the clock distribution tree depth, it remains positive for any value of $\tau$.
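The sign convention of Eqs. (4) and (5) can be made concrete with a short sketch. The per-cycle delays below are hypothetical numbers in picoseconds standing in for the output of a transistor-level simulation at one value of $\tau$; they are not results from this work.

```python
def time_margin(ro_delay, wc_delay):
    """Eq. (5): margin = -min over cycles n of (RO delay - WC path delay)."""
    rel = [d_ro - d_wc for d_ro, d_wc in zip(ro_delay, wc_delay)]
    return -min(rel), rel

# Hypothetical per-cycle delays in ps for a single clock tree delay tau.
ro_delay = [417, 425, 410, 418]   # delay(V_RO(t - tau)) in cycle n
wc_delay = [417, 430, 421, 415]   # delay(V_WC(t)) in cycle n

margin, rel = time_margin(ro_delay, wc_delay)
```

The margin is set by the single cycle in which the critical path is slowest relative to the clock (here the third cycle); cycles in which the clock period exceeds the path delay do not contribute.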

Fig. 8 makes it evident that the correlation $\rho$ between the supply voltages at the RO and the critical path (Fig. 6) is not a good metric for the self-adaptive clock scheme. Although $\rho$ and $\Delta T_{VV}(\tau)$ both show a periodic behaviour, the periodicity of $\Delta T_{VV}(\tau)$ is not the same as the one observed in the supply noise correlation metric $\rho$; the period of $\Delta T_{VV}(\tau)$ is approximately twice the one found in the correlation analysis. The difference between the two metrics can be explained by the cumulative nature of the supply-noise-induced relative delay $\Delta T_{VV}(\tau)$. That is, to minimize $\Delta T_{VV}(\tau)$ within a period of the self-adaptive clock, the supply voltages ($V_{RO}(t)$ and $V_{ij}(t)$) do not have to be identical; if they have a similar shape, they can be mis-synchronized as long as both perturbations occur completely within the same clock period, as depicted in Fig. 9.

Fig. 8. Values of $\Delta T_{VV}(\tau)$ calculated for the self-adaptive clock (s-a.c.) and fixed clock (f.c.) architectures under different values of P(Switch). The reduction in the required time margin grows as the switching activity increases. Notably, although this reduction depends periodically on the clock tree depth, the self-adaptive clock achieves a significant reduction of $\Delta T_{VV}(\tau)$ for any $\tau$.



Fig. 9. Representation of the capability of the self-adaptive clock to tackle mis-synchronized supply-noise-induced delay. The self-adaptive clock will adapt its period to the propagation delay of the critical path only if both perturbations are similar and occur within the same clock period.
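The following toy example illustrates why a low voltage correlation does not necessarily imply a large delay mismatch. It relies on a strong simplification that is not the transistor-level model used in this work: each sample of the period contributes an equal share of the nominal delay, stretched linearly by the local voltage droop. The sensitivity, nominal values and droop shape are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def accumulated_delay(v, v_nom, d_nom, sensitivity):
    """Sum per-sample delay contributions over one clock period.

    Assumption: first-order linear model, where each sample contributes
    d_nom / len(v) scaled by the relative voltage droop at that sample.
    """
    step = d_nom / len(v)
    return sum(step * (1.0 + sensitivity * (v_nom - vk) / v_nom) for vk in v)

V_NOM = 1.2                       # hypothetical nominal supply in volts
DROOP = [0.0, 0.03, 0.06, 0.03]   # hypothetical droop shape in volts

def with_droop(start, n_samples=16):
    """One clock period of supply voltage with the droop placed at `start`."""
    v = [V_NOM] * n_samples
    for k, d in enumerate(DROOP):
        v[start + k] -= d
    return v

v_ro = with_droop(2)      # perturbation early in the period
v_path = with_droop(9)    # same perturbation, later in the same period

corr = pearson(v_ro, v_path)
gap = (accumulated_delay(v_ro, V_NOM, 417.0, 1.5)
       - accumulated_delay(v_path, V_NOM, 417.0, 1.5))
```

The two waveforms do not overlap in time, so their correlation is weakly negative, yet under this simplified model the delay accumulated over the period is the same for both and the gap is zero up to rounding.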

## V. CONCLUSION

In this article we evaluated the concept of the self-adaptive clock within a synchronous computational framework. In order to assess its benefits in terms of clock period reduction, we modelled a realistic PDN to obtain the supply voltage variation due to SSN. Once the SSN was obtained, we used the relative difference between the self-adaptive clock period and the propagation delay through a chain of inverters representing the critical path to show that the self-adaptive clock scheme achieves a reduction of the clock period over a wide range of switching activity.

We also showed that this reduction depends on the clock distribution tree delay in a periodic fashion; consequently, to maximize the clock period reduction, the clock tree needs to be carefully designed.

Finally, we found that the correlation between the supply voltage at the RO and at the critical path is not a good metric. Therefore, the assessment of any self-adaptive clock implementation should rely on the transient analysis of the relative delay obtained through transistor-level simulation. Unfortunately, this procedure consumes a prohibitive amount of computational resources for a medium or large chip, so an interesting avenue for future research is to find a reliable metric that does not rely on delay calculations. Using such a metric, it will be possible to estimate the maximum size of the clock domain for which the self-adaptive clock scheme is useful.

## ACKNOWLEDGMENT

This work was supported by project MODERN funded by the Spanish MICINN through "Fondo Especial para la Dinamización de la Economía y el Empleo – Plan E" (contract PLE2009-0024) and the ENIAC JU (contract ENIAC-120003).

## REFERENCES

- [1] B. Cheng, S. Roy, G. Roy, F. Adamu-Lema, and A. Asenov, "Impact of intrinsic parameter fluctuations in decanano MOSFETs on yield and functionality of SRAM cells," *Solid-State Electronics*, vol. 49, no. 5, pp. 740–746, 2005, 5th International Workshop on the Ultimate Integration of Silicon, ULIS 2004.
- [2] G. Gielen, P. De Wit, E. Maricau, J. Loeckx, J. Martin-Martinez, B. Kaczer, G. Groeseneken, R. Rodriguez, and M. Nafria, "Emerging yield and reliability challenges in nanometer CMOS technologies," in *Design, Automation and Test in Europe, 2008. DATE '08*, 2008, pp. 1322–1327.
- [3] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, "Parameter variations and impact on circuits and microarchitecture," in *Proceedings of the 40th annual Design Automation Conference*, ser. DAC '03. New York, NY, USA: ACM, 2003, pp. 338–342. [Online]. Available: http://doi.acm.org/10.1145/775832.775920
- [4] H. Mahmoodi, S. Mukhopadhyay, and K. Roy, "Estimation of delay variations due to random-dopant fluctuations in nanoscale CMOS circuits," *Solid-State Circuits, IEEE Journal of*, vol. 40, no. 9, pp. 1787–1796, Sept. 2005.
- [5] Q. K. Zhu, *Power Distribution Network Design For VLSI*. John Wiley & Sons, Inc., 2005. [Online]. Available: http://dx.doi.org/10.1002/0471660302
- [6] M. Popovich, A. V. Mezhiba, and E. G. Friedman, *Power Distribution Networks with On-Chip Decoupling Capacitors*, 1st ed. Springer Publishing Company, Incorporated, 2007.
- [7] R. Downing, P. Gebler, and G. Katopis, "Decoupling capacitor effects on switching noise," *Components, Hybrids, and Manufacturing Technology, IEEE Transactions on*, vol. 16, no. 5, pp. 484–489, Aug. 1993.
- [8] M. Pant, P. Pant, and D. Wills, "On-chip decoupling capacitor optimization using architectural level prediction," *Very Large Scale Integration (VLSI) Systems, IEEE Transactions on*, vol. 10, no. 3, pp. 319–326, Jun. 2002.
- [9] D. M. Chapiro, "Globally-asynchronous locally-synchronous systems," Ph.D. dissertation, Stanford University, Oct. 1984.
- [10] J. Sparsø and S. Furber, Eds., *Principles of Asynchronous Circuit Design: A Systems Perspective*. Kluwer Academic Publishers, 2001.
- [11] FastHenry 3.0. [Online]. Available: http://www.rle.mit.edu/cpg/research\_codes.htm
- [12] L. Elvira, F. Martorell, X. Aragonés, and J. L. González, "A physical-based noise macromodel for fast simulation of switching noise generation," *Microelectronics Journal*, vol. 35, no. 8, pp. 677–684, 2004.
