Performance of Clock Mesh Under Dynamic Voltage and Frequency Scaling  by Harsha, P. et al.
 Procedia Computer Science  46 ( 2015 )  1433 – 1440 
1877-0509 © 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license 
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of organizing committee of the International Conference on Information and Communication Technologies (ICICT 2014)
doi: 10.1016/j.procs.2015.02.062 
Harsha P (a,*), Sunil K V (a), John Reuben (b)
(a) Student, M.Tech VLSI Design, School of Electronics and Engineering, VIT University, Vellore, 632014, India
(b) Professor, School of Electronics and Engineering, VIT University, Vellore, 632014, India
Abstract 
Mesh based clock distribution is gaining popularity in microprocessor based designs because of its tolerance to
skew induced by process variations in Deep Sub-Micron (DSM) technologies. In the recent past, there has been
much research on reducing the power consumed by a chip, and Dynamic Voltage and Frequency Scaling (DVFS)
has emerged as one of the prominent methods for doing so. In this work, we first synthesize a capacitance
driven clock mesh and study the variation of skew when the mesh is operated under DVFS. Based on
the observations, a novel method to reduce the skew variations is then proposed.
Keywords: Clock mesh; DVFS; Skew; DSM; Capacitance driven.
*Corresponding author. Tel: +91-9538606535
E-mail address: harsha.p2013@vit.ac.in

1. Introduction
Clock distribution methodology plays a crucial role in the performance of any chip. Clock distribution using a
clock tree to connect to the sinks has been widely researched and implemented. It is more suitable for low-end
designs (such as ASIC based designs) and for low power applications. However, a clock tree is not robust to skew
induced by process variations in DSM technologies, and considerable degradation can be observed.
Thus, in high-end processors, where minimum skew and high tolerance to process variations are needed, a clock
mesh is preferred. In a mesh topology, all nodes are electrically shorted, which provides multiple
paths from the source to each sink and thus reduces the skew drastically. In recent designs, a clock mesh is used to
distribute the clock to the sinks at the bottom (sink) level, preceded by a clock tree at the upper level.
Extensive research is being carried out to reduce the power consumed by a chip. Among the
various techniques for power reduction, one method which could affect the performance of a clock mesh is
Dynamic Voltage and Frequency Scaling [1]. This method involves scaling the supply voltage, the
frequency, or both during the operation of the chip. In the case of a clock mesh, this scaling affects the power consumed by
the mesh and also the latencies.
In the existing literature, clock mesh formation based upon a skew constraint, together with its simulation, has been
presented [2]. A capacitance driven mesh topology, using rectilinear Steiner trees and stubs for connecting the sinks to
the mesh edges, and a sliding window [3] based simulation scheme have been discussed [4]. A clock mesh for multiple voltage
domains, in which the mesh is split up for each domain, has been explored [5]. However, no work exists
which studies the effect of DVFS on a clock mesh.
This work focuses on how skew and power vary when a clock mesh is operated under DVFS and aims at
finding a solution for the skew variations. First, a non-uniform clock mesh is constructed based upon the load
capacitances, as described in section 2. For the mesh thus designed, the sinks are connected to the nearest
mesh edge using stubs (also described in section 2). Buffers are placed at suitable mesh nodes based on a buffer placement
algorithm, which is described in section 3. The simulation of the clock mesh is explained in
section 4. The effect of DVFS on the skew and power is described in section 5. The results are
discussed in section 6 and, based on these results, a novel methodology to reduce the variations of skew is
proposed in section 7. The paper is concluded in section 8. This work has been implemented on the ISPD10CNS06
Intel benchmark, and the 45nm Predictive Technology Model (PTM) library [6] is used for buffer design.
2. Capacitance driven clock mesh 
The traditional methodology of forming a mesh structure involves forming a uniform mesh and iteratively
increasing the number of mesh windows until a target skew is achieved [2]. Further, the edges of the mesh are placed
in such a way that the lengths of the stubs connecting the sinks to the mesh edges are minimized. These approaches
structure the mesh wires only with respect to a given skew target and not the cluster density of the sinks. If the
distribution of sinks is non-uniform, this makes the mesh denser than necessary, resulting in longer mesh wires and
higher power dissipation. In order to minimize these, a capacitance driven mesh formation algorithm [4] is used here.
The algorithm is shown and explained in Fig. 1. The algorithm was run using different target capacitances, and the
mesh and stub lengths for each target were determined as shown in Table 1. The total mesh length affects the skew
and the total stub length affects the latencies at the sinks. Considering the trade-off between these two, a
target capacitance (pre-defined capacitance) of 100fF gives mesh and stub lengths that yield a good
compromise between power and latency. Thus, 100fF is used as the target capacitance in this work. The stub
connection for a sink is determined by finding the nearest mesh edge for that sink inside its room. No stubs are
needed for sinks that lie on a mesh edge itself. Fig. 2 shows the mesh for one quadrant of the benchmark under
consideration.
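
The stub connection rule described above can be illustrated with a short sketch. It assumes, purely for illustration, that a room is an axis-aligned rectangle whose four sides are mesh edges and that a stub runs perpendicular to the nearest side; the `Room` class, the function name and the coordinates are hypothetical and are not taken from the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Room:
    """Axis-aligned mesh window whose four sides are mesh edges (assumed model)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def stub_to_nearest_edge(room, x, y):
    """Return (edge_name, stub_length) for a sink at (x, y) inside the room.

    A stub length of 0.0 means the sink lies on a mesh edge and needs no stub.
    """
    if not (room.x_min <= x <= room.x_max and room.y_min <= y <= room.y_max):
        raise ValueError("sink lies outside the room")
    # Perpendicular distance from the sink to each side of the room.
    distances = {
        "left": x - room.x_min,
        "right": room.x_max - x,
        "bottom": y - room.y_min,
        "top": room.y_max - y,
    }
    edge = min(distances, key=distances.get)
    return edge, distances[edge]


# Hypothetical room and sink coordinates (arbitrary length units).
room = Room(0.0, 0.0, 100.0, 60.0)
sinks = [(10.0, 30.0), (50.0, 55.0), (100.0, 20.0)]
stubs = [stub_to_nearest_edge(room, x, y) for x, y in sinks]
total_stub_length = sum(length for _, length in stubs)
```

Summing the stub lengths over all rooms in this way would give a total stub length comparable to the stub length column of Table 1, under the assumptions stated above.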
3. Buffer Placement 
Buffers are required to preserve the integrity of the clock signal and to reduce the load capacitance seen by the clock
source. Buffers of different sizes must be used depending on the capacitance of the sinks. A smaller buffer is
placed at a mesh node that has a lower load capacitance in the room under consideration. If the capacitance is
large, a bigger buffer is used in order to minimize the slew [4].
 
 
Table 1. Variation of mesh and stub lengths for different target capacitances

Capacitance (fF)   Mesh Length (mm)   Stub Length (mm)   Total Length (mm)
25                 44.060             3.5553             47.6153
50                 31.741             5.850              37.591
100                25.102             8.762              33.8645
150                21.185             11.22              32.405

Fig. 1. Algorithm for formation of capacitance driven mesh
 
The algorithm for placement of buffers is shown in Fig. 3 and the buffer selection library is given in Table 2.
Care should be taken not to place more than one buffer at a mesh node. If two buffers would fall on the same node, a single bigger buffer
that is equivalent to the two smaller buffers is placed at that node instead. The mesh with the buffers placed is also
depicted in Fig. 2, where different shapes are used to indicate the different types of buffers.
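
As an illustration of how a buffer could be picked from the library, the sketch below maps a load capacitance to a buffer name using the load ranges of Table 2. It is not the placement algorithm of Fig. 3. In particular, treating the "equivalent" bigger buffer as the buffer sized for the sum of the two loads is an assumption made here, and the function names are hypothetical.

```python
import bisect

# Upper bound of the load range (fF) for each buffer, taken from Table 2.
BUFFER_NAMES = ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"]
UPPER_LIMITS_FF = [50, 100, 150, 200, 250, 300, 450, 600]


def select_buffer(load_ff):
    """Return the smallest library buffer whose load range covers load_ff."""
    if load_ff < 0:
        raise ValueError("load capacitance cannot be negative")
    # bisect_left finds the first range whose upper limit is >= load_ff.
    index = bisect.bisect_left(UPPER_LIMITS_FF, load_ff)
    if index == len(BUFFER_NAMES):
        raise ValueError("load exceeds the largest buffer in the library")
    return BUFFER_NAMES[index]


def merge_buffers_at_node(loads_ff):
    """Pick one buffer for a node that would otherwise receive several buffers.

    Assumption: the equivalent buffer is the one sized for the summed load.
    """
    return select_buffer(sum(loads_ff))


single = select_buffer(120)               # falls in the 101-150fF range
merged = merge_buffers_at_node([40, 90])  # combined load of 130fF
```

A load that falls between two tabulated integer ranges (for example 50.5fF) is assigned to the larger buffer in this sketch.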
 
 
4. Mesh Simulation 
 
NGSPICE [7] is used in this work to simulate the clock mesh. Once the buffers are placed, a
netlist containing the placement and sizes of the buffers and the capacitances they are
driving is generated in MATLAB. The mesh wires as well
as the stub connections are modelled as π-connected networks. A raw file is generated by NGSPICE and imported into MATLAB. Using the
HSPICE Toolbox for MATLAB [8], the latency at each sink is determined and the skew is found
using equation 1.
 
$$\mathrm{Skew} = \mathrm{latency}_{\max} - \mathrm{latency}_{\min} \qquad (1)$$
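
Equation 1 can be sketched in a few lines. The paper extracts latencies with the HSPICE Toolbox for MATLAB; the Python stand-in below is only an illustration. It assumes that the latency of a sink is the time at which its waveform first rises through half the supply voltage, measured from a source edge at t = 0, and the sample waveforms are made up.

```python
def crossing_time(times, volts, threshold):
    """First time the waveform rises through threshold (linear interpolation)."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < threshold <= v1:
            return t0 + (threshold - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never rises through the threshold")


def clock_skew(latencies):
    """Equation 1: maximum latency minus minimum latency."""
    return max(latencies) - min(latencies)


# Made-up sink waveforms sampled every 10ps with a 1V supply.
times_ps = [0.0, 10.0, 20.0, 30.0]
sink_waveforms = [
    [0.0, 0.0, 1.0, 1.0],  # rises between 10ps and 20ps
    [0.0, 0.0, 0.0, 1.0],  # rises between 20ps and 30ps
]
latencies_ps = [crossing_time(times_ps, v, 0.5) for v in sink_waveforms]
skew_ps = clock_skew(latencies_ps)
```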
 
 
 
Fig. 2. Clock Mesh along with stub connections and buffer placement 
  
5. DVFS Operation 
 
To mimic DVFS operation, the supply voltage was first varied while keeping the frequency constant, and the skew
variation was observed. Then, keeping the voltage constant, the frequency was varied and the skew variation was noted. Table 3 shows the
variation of power and skew when the voltage is scaled from 1.2V down to 0.6V at a constant frequency of 1GHz. The
frequency is then scaled down to 500MHz, and the skew and power are observed for the same set of scaled-down
voltages, as shown in Table 4. The frequency is further scaled to 250MHz, and the corresponding skew and power are
tabulated in Table 5.
6. Results 
 
The skew was found to be the same for a given supply voltage when the frequency was scaled from 1GHz to
250MHz. When the supply voltage was scaled from 1V to 0.9V, the clock skew increased by about 8% (from 16.54ps
to 17.87ps, Table 3). As the voltage scales down further, the percentage change in skew from one voltage step to the next increases. The power consumed by
the clock mesh decreases whenever the voltage or the frequency is scaled down.
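
The step-to-step change in skew can be recomputed directly from the skew column of Table 3. The snippet below is a small sketch of that arithmetic; the variable and function names are chosen here for illustration.

```python
# Supply voltages (V) and measured skews (ps) from Table 3.
voltages = [1.2, 1.0, 0.9, 0.8, 0.7, 0.6]
skews_ps = [14.55, 16.54, 17.87, 19.70, 22.86, 29.66]


def percent_changes(values):
    """Percentage change from each entry to the next one."""
    return [100.0 * (after - before) / before
            for before, after in zip(values, values[1:])]


steps = percent_changes(skews_ps)
for (v_high, v_low), change in zip(zip(voltages, voltages[1:]), steps):
    print(f"{v_high:.1f}V -> {v_low:.1f}V: skew increases by {change:.1f}%")
```

From 1V downwards, each successive step produces a larger percentage increase than the previous one.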
 
Fig. 3. Buffer placement algorithm 
 
Table 2. Buffer library

Buffer Name   CLoad (fF)   First Inverter         Second Inverter
                           W/L        W/L         W/L         W/L
B1            0-50         44/45      32/45       840/45      575/45
B2            51-100       82/45      57/45       1656/45     1177/45
B3            101-150      114/45     82/45       2451/45     1718/45
B4            151-200      151/45     110/45      3258/45     2281/45
B5            201-250      189/45     130/45      4073/45     2852/45
B6            251-300      227/45     156/45      4887/45     3422/45
B7            301-450      340/45     233/45      7330/45     5118/45
B8            451-600      450/45     310/45      9700/45     6800/45
 
 
Table 3. Skew and power when voltage is scaled at a frequency of 1GHz

Sr No   Frequency   Voltage   Skew      Power
1       1GHz        1.2V      14.55ps   70.737mW
2       1GHz        1V        16.54ps   56.55mW
3       1GHz        0.9V      17.87ps   50mW
4       1GHz        0.8V      19.70ps   43.81mW
5       1GHz        0.7V      22.86ps   37.13mW
6       1GHz        0.6V      29.66ps   28.25mW
 
Table 4. Skew and power when voltage is scaled at a frequency of 500MHz

Sr No   Frequency   Voltage   Skew      Power
1       500MHz      1.2V      14.55ps   35.43mW
2       500MHz      1V        16.54ps   28.3mW
3       500MHz      0.9V      17.87ps   25.064mW
4       500MHz      0.8V      19.70ps   11.41mW
5       500MHz      0.7V      22.86ps   9.928mW
6       500MHz      0.6V      29.66ps   8.465mW
 
Table 5. Skew and power when voltage is scaled at a frequency of 250MHz

Sr No   Frequency   Voltage   Skew      Power
1       250MHz      1.2V      14.55ps   17.78mW
2       250MHz      1V        16.54ps   14.178mW
3       250MHz      0.9V      17.87ps   12.549mW
4       250MHz      0.8V      19.70ps   10.97mW
5       250MHz      0.7V      22.86ps   9.29mW
6       250MHz      0.6V      29.66ps   7.07mW
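
The skew columns of Tables 3 to 5 can be compared programmatically, and the skew can be expressed as a fraction of the clock period. The following is a small sketch of both checks; the dictionary layout and helper name are illustrative choices, not part of the original flow.

```python
# Skew (ps) per supply voltage, in the row order 1.2V ... 0.6V of Tables 3 to 5.
skew_by_frequency = {
    "1GHz":   [14.55, 16.54, 17.87, 19.70, 22.86, 29.66],
    "500MHz": [14.55, 16.54, 17.87, 19.70, 22.86, 29.66],
    "250MHz": [14.55, 16.54, 17.87, 19.70, 22.86, 29.66],
}
frequency_hz = {"1GHz": 1e9, "500MHz": 500e6, "250MHz": 250e6}

# The skew is frequency independent if every column matches the 1GHz column.
reference = skew_by_frequency["1GHz"]
frequency_independent = all(col == reference for col in skew_by_frequency.values())


def skew_percent_of_period(skew_ps, freq_hz):
    """Skew expressed as a percentage of the clock period."""
    period_ps = 1e12 / freq_hz
    return 100.0 * skew_ps / period_ps


# Largest tabulated skew (0.6V row) relative to the period at each frequency.
worst_skew_ps = max(reference)
worst_case = {name: skew_percent_of_period(worst_skew_ps, f)
              for name, f in frequency_hz.items()}
```

With the tabulated values, the largest skew is roughly 3% of the period at 1GHz and roughly 0.7% at 250MHz.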
 
7. Skew Variation Minimization 
 
The variation of skew and power when the clock mesh is operated under DVFS was studied, and the
clock mesh thus designed was found to be skew tolerant to frequency scaling. The clock skew was found to vary between 0.7% and
3% of the clock period, corresponding to 250MHz and 1GHz respectively. The increase in skew with decreasing
supply voltage can be explained as follows.
The propagation delay through a CMOS buffer [9] is given by equation 2.

$$T_{PD} = \frac{C\,V_{DD}}{k\,(V_{DD} - V_T)^2} \qquad (2)$$
 
As can be seen from this equation, the propagation delay depends on the capacitive load C at the output of the buffer,
the threshold voltage VT of the buffer and its supply voltage VDD; k is a device-dependent constant. As mentioned earlier, the load capacitances
driven by the buffers differ. For example, one B3 buffer may be driving 100fF of load capacitance, while another buffer of the same size
drives 120fF and yet another drives 140fF. For this case, the
variations in propagation delay when the supply voltage scales down are calculated using equation 2 and are
summarized in Table 6.
 
Table 6. Propagation delay variations

Capacitance (fF)   TPD (ps) at VDD=1V   TPD (ps) at VDD=0.8V   % change in delay
100                0.23                 0.39                   71.74
120                0.28                 0.47                   69.28
140                0.33                 0.55                   67.5
    
 
Suppose the load capacitance offered to the buffers were constant, say 100fF, throughout the mesh. Then the
percentage change in delay when the supply voltage scales down would be the same everywhere, and hence no variation in
skew would be observed due to voltage scaling. In our case, however, the load capacitance offered to a buffer
(say B3, which drives 100fF to 150fF) is variable and, as shown in Table 6, the increase in propagation delay when
the supply voltage scales down is not uniform throughout the clock mesh. This was the case for one particular
range of load capacitance; when all the different possible ranges (B1 to B8) are considered, there will be more drastic
variations in the propagation delay, which in turn increase the clock skew when the supply voltage scales down. It
is important to note that the threshold voltage of the buffer was kept constant throughout this analysis. The threshold
voltage also plays a crucial role in how the skew changes with supply voltage. So, either the load capacitance has
to be made uniform for all the buffers, or the buffer speed has to be adjusted by varying its threshold voltage.
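
The effect of unequal loads can be made explicit with a short worked step based on equation 2. As a sketch, assume two buffers of the same size (same k and V_T) driving loads C_1 and C_2; these symbols and the delay difference \Delta T_{PD} are introduced here only for illustration.

```latex
\Delta T_{PD} = T_{PD}(C_1) - T_{PD}(C_2)
             = \frac{(C_1 - C_2)\,V_{DD}}{k\,(V_{DD} - V_T)^2},
\qquad
\frac{d}{dV_{DD}}\!\left[\frac{V_{DD}}{(V_{DD} - V_T)^2}\right]
             = -\,\frac{V_{DD} + V_T}{(V_{DD} - V_T)^3} < 0
\quad \text{for } V_{DD} > V_T > 0 .
```

The delay difference therefore vanishes only when C_1 = C_2, and for unequal loads it grows as VDD is lowered. The delays in Table 6 show this trend: neighbouring rows differ by 0.05ps at VDD=1V and by 0.08ps at VDD=0.8V. In this first-order model the relative change in delay is the same for every load, so the small spread in the last column of Table 6 is presumably a rounding effect; it is the growth of the absolute delay difference that contributes to skew.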
In order to overcome the above mentioned problem of skew variations, we propose a method in which the
threshold voltage of the buffers is varied to equalize the skew even when the supply voltage scales down. For a MOSFET,
the body (substrate) acts as a second gate of the device, and a suitable body bias can change the
performance of the device considerably. It is this fact that we exploit in this work. If the body voltage of a PMOS is
made higher than its source potential, the threshold voltage of the device increases and the device takes relatively
longer to turn ON. The same effect can be observed in an NMOS device by keeping its body potential lower
than its source potential. From equation 2, if VDD=1V and the threshold is kept higher than the
nominal threshold by means of body bias (say 0.6V), then VDD-VT=0.4V. When VDD scales to 0.9V, the body
bias is also reduced, thus decreasing the threshold (say to 0.5V) and again giving VDD-VT=0.4V. Thus,
by suitably varying the threshold, the term in the denominator of equation 2 can be kept almost constant.
Note that, for a given buffer, the load being driven by it and its aspect ratio (W/L) are constant, and hence the entire
right-hand side of equation 2 reduces to an almost constant value. This is what is required to keep the skew constant even as
the supply scales. Table 7 shows the body bias that was introduced for the PMOS and NMOS of the buffer at different
voltage levels.
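
The numerical example above can be checked with a short sketch based on equation 2. Since C and k are the same before and after scaling for a given buffer, only the factor VDD/(VDD-VT)^2 is evaluated. The fixed-threshold case is a hypothetical comparison added here, and the function names are illustrative.

```python
def delay_factor(vdd, vt):
    """V_DD / (V_DD - V_T)^2, i.e. equation 2 without the constant factor C / k."""
    if vdd <= vt:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return vdd / (vdd - vt) ** 2


def percent_change(before, after):
    return 100.0 * (after - before) / before


# Thresholds from the example in the text: 0.6V at VDD=1V and 0.5V at VDD=0.9V.
tracked_change = percent_change(delay_factor(1.0, 0.6), delay_factor(0.9, 0.5))

# Hypothetical comparison: the threshold stays at 0.6V while VDD scales to 0.9V.
fixed_change = percent_change(delay_factor(1.0, 0.6), delay_factor(0.9, 0.6))

print(f"threshold tracking VDD: delay changes by {tracked_change:+.1f}%")
print(f"threshold held at 0.6V: delay changes by {fixed_change:+.1f}%")
```

With the tracked threshold the remaining change comes only from VDD in the numerator, which is why the delay is almost, rather than exactly, constant.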
Generally, VBS of the PMOS is made positive and VBS of the NMOS is made negative or zero, as seen from the first two
rows of Table 7. However, it is also possible to reverse the polarity of the body bias, thereby making the device turn
ON earlier (i.e., at a lower gate voltage, since the body voltage now aids the gate voltage); this has been used
when the supply scales to 0.8V (row 3 of Table 7). As a result, the skew remains almost the same
as the supply voltage is reduced from 1V to 0.8V (17.37ps to 17.83ps, a change of 0.46ps), as compared to a
variation of about 3ps in the earlier case (16.54ps to 19.70ps in Table 3).
 
Table 7. Effect of body bias on skew

VDD (V)   VB of PMOS (V)   VB of NMOS (V)   Skew (ps)
1         1.4              -1.4             17.37
0.9       0.9              0                17.8
0.8       0.3              0.5              17.83
    
 
8. Conclusion 
 
A non-uniform clock mesh based on the sink capacitances was constructed, and the sinks were connected to the mesh
using stubs. The mesh was designed such that the capacitance within a room does not exceed 100fF; the
mesh and stub lengths obtained are 25.102mm and 8.762mm respectively. A total of 156 buffers of various sizes are
used for driving the clock to all the sinks in the benchmark considered. The performance of the clock mesh under
DVFS operation was then studied, and it was found that the skew does not vary as the frequency of the clock scales.
This is consistent with equation 2, in which frequency has no effect on the propagation delay
through a buffer. In order to nullify the skew variations as the supply voltage scales, a novel method of
varying the body bias of the buffer was proposed. It was found that the effect of supply voltage scaling can be
nullified by keeping the threshold high at higher VDD and suitably reducing the threshold as the supply voltage reduces.
All the body bias voltages considered here are within practically achievable limits. Thus, by suitable body
biasing of the buffer, the skew can be kept almost the same when the clock mesh is operated under Dynamic
Voltage and Frequency Scaling.
References 
1. Magklis G et al. Dynamic frequency and voltage scaling for a multiple-clock-domain microprocessor. IEEE Micro; 2000. p. 62-68.
2. Rajaram A, Pan DZ. MeshWorks: A comprehensive framework for optimized clock mesh network synthesis. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems; vol. 29, no. 12, December 2010.
3. Chen H et al. A sliding window scheme for accurate clock mesh analysis. ICCAD; 2005.
4. Reuben J, Zackriya V M, Nashit S, Kittur HM. Capacitance driven clock mesh synthesis to minimize skew and power dissipation.
IEICE Electronics Express; vol. 10, no. 24, p. 1-12.
5. Sitik C, Taskin B. Multi voltage domain clock mesh design. ICCD; 2012. p. 201-206.
6. Predictive Technology Model, 2013: http://ptm.asu.edu/
7. NGSPICE: http://ngspice.sourceforge.net/
8. HSPICE Toolbox for MATLAB: http://cppsim.com/download_hspice_tools.html
9. Roy K, Prasad SC. Low-power CMOS VLSI circuit design. Wiley India; 2011.
  
 
