Statistical Skew Modeling for General Clock Distribution Networks in Presence of Process by Jiang, Xiaohong & Horiguchi, Susumu
704 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 9, NO. 5, OCTOBER 2001
Statistical Skew Modeling for General Clock
Distribution Networks in Presence of Process
Variations
Xiaohong Jiang and Susumu Horiguchi, Senior Member, IEEE
Abstract—Clock skew modeling is important in the perfor-
mance evaluation and prediction of clock distribution networks.
This paper addresses the problem of statistical skew modeling for
general clock distribution networks in the presence of process vari-
ations. The only available statistical skew model is not suitable for
modeling the clock skews of general clock distribution networks
in which clock paths are not identical. The old model is also too
conservative for estimating the clock skew of a well-balanced clock
network that has identical but strongly correlated clock paths
(for instance, a well-balanced H-tree). In order to provide a more
accurate and more general statistical skew model for general clock
distributions, we propose a new approach to estimating the mean
values and variances of both clock skews and the maximal clock
delay of general clock distribution networks. Based on the new
approach, a closed-form model is also obtained for well-balanced
H-tree clock distribution networks. The paths delay correlation
caused by the overlapped parts of path lengths is considered in
the new approach, so the mean values and the variances of both
clock skews and the maximal clock delay are accurately estimated
for general clock distribution networks. This enables an accurate
estimate of yields of both clock skew and maximal clock delay to
be made for a general clock distribution network.
Index Terms—Clock distribution network, clock skew, maximal
clock delay, process variations, statistical modeling, yield.
Manuscript received August 30, 2000; revised May 23, 2001. This work was
supported in part by a Grant-in-Aid for Scientific Research from the JSPS (Japan
Society for the Promotion of Science).
The authors are with the Graduate School of Information Science, Japan
Advanced Institute of Science and Technology (JAIST), Tatsunokuchi, Ishikawa
923-1292, Japan (e-mail: jiang@jaist.ac.jp; hori@jaist.ac.jp).
Publisher Item Identifier S 1063-8210(01)07661-2.
I. INTRODUCTION
The evolution of VLSI chips toward larger die sizes and faster clock
speeds makes clock distribution an increasingly important issue [1].
A striking example of what can be accomplished with aggressive clock
design is the DEC Alpha chip, designed to operate at more than
600 MHz [2]. The advanced Pentium III and AMD Athlon processors now
work above 1 GHz. At such high speeds, clock skew becomes a very
significant problem. Clock skew arises mainly from unequal clock path
lengths to various modules and from process variations that cause
clock path delay variations [3], [4]. To model the clock skew, either
a worst-case or a statistical approach may be utilized. A worst-case
approach usually leads to an unnecessarily long clock period. In a
statistical approach, on the other hand, the clock parameters may be
chosen so that the probability of timing failure is very small, but
not zero. This usually results in a shorter clock period. The
available literature dealing with
statistical clock skew modeling [5], [6] approaches the problem
from the standpoint that all clock paths are assumed to be iden-
tical and independent, so an upper bound of expected clock skew
is obtained. For general clock distribution networks (CDNs), the
clock paths may not be identical and they usually depend on
each other as they may overlap at some parts of their length,
so the old statistical model is not applicable to modeling the
clock skews of these clock networks in which clock paths are
not identical. The skew model is also too conservative when it
is used to estimate the clock skew of a well-balanced CDN in
which clock paths are identical but strongly correlated (e.g., the
well-balanced H-tree CDNs [7], [8] which are commonly used
to reduce the clock skew). For different level H-trees, the ex-
pected clock skews estimated by using the old model are con-
siderably larger than the actual expected skews as shown in this
paper. In the case where the clock frequency is limited by the
skew rather than by the minimum time between two successive
events propagated through the H-tree [8], the old model will re-
sult in an unnecessarily long clock period.
The clock period of a CDN is in general determined by both
the clock skew and the maximal clock delay of the network.
The focus of this paper is to provide a recursive approach to
estimating the expected values and the variances of both the
clock skews and the maximal clock delay of general CDNs. The
paths delay correlation caused by the overlapped parts of path
lengths is taken into account in the new estimates, so the mean
values and the variances of both clock skews and the maximal
clock delay of general clock distribution networks are accurately
estimated by using the new approach. This enables an accurate
estimate of yields of both clock skew and the maximal clock
delay and, thus, the yield of clock period to be made for a general
CDN.
The rest of the paper is organized as follows: A general skew
model for clock distribution networks is described in Section II.
A novel approach is presented in Section III-A for evaluating
recursively the mean values and the variances of clock skew
and the maximal clock delay of general CDNs. In Section III-B,
closed-form expressions are derived for the mean values and
variances of both clock skew and the maximal clock delay of
well-balanced H-tree CDNs. The yield modeling of clock skew
and the maximal clock delay is discussed in Section IV. Sec-
tion V compares the simulation results and theoretical results of
clock skew and the maximal clock delay for three typical CDNs,
and Section VI summarizes the contributions of this paper.
1063–8210/01$10.00 © 2001 IEEE
JIANG AND HORIGUCHI: STATISTICAL SKEW MODELING FOR GENERAL CLOCK DISTRIBUTION NETWORKS 705
II. CLOCK SKEW MODELING
Due to variations in process parameters, the actual circuit
delay will deviate from the designed value [9]. For a given CDN,
let $t(s_0, s_i)$ denote the signal propagation time on the unique
path from the clock source $s_0$ to the sink $s_i$. The maximal clock
delay $\xi$ and the minimal clock delay $\eta$ of the CDN can be
defined as
$$\xi = \max_i \, t(s_0, s_i) \qquad (1)$$
$$\eta = \min_i \, t(s_0, s_i) \qquad (2)$$
The clock skew between two sinks $s_i$ and $s_j$ is the delay
difference $|t(s_0, s_i) - t(s_0, s_j)|$, and the clock skew $\chi$ of
the CDN is in general defined as the maximum value of
$|t(s_0, s_i) - t(s_0, s_j)|$ over all sink pairs $s_i$ and $s_j$ in
the CDN [10], [11]. Thus, $\chi$ is given by
$$\chi = \xi - \eta \qquad (3)$$
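As a minimal sketch of the definitions above, the following Python snippet computes the maximal clock delay, the minimal clock delay, and the clock skew from a list of source-to-sink delays, and checks that the pairwise definition of skew agrees with the difference of the two extremes. The function names and the sample delays are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def delay_extremes_and_skew(path_delays):
    """Return (maximal delay, minimal delay, skew) for source-to-sink delays."""
    d_max = max(path_delays)
    d_min = min(path_delays)
    return d_max, d_min, d_max - d_min


def pairwise_skew(path_delays):
    """Largest absolute delay difference over all sink pairs."""
    return max(abs(a - b) for a, b in combinations(path_delays, 2))


# Hypothetical delays (ns) of four clock paths on one die.
delays_ns = [6.5, 7.0, 6.75, 6.25]
d_max, d_min, skew = delay_extremes_and_skew(delays_ns)
```

The two ways of computing the skew coincide because the largest pairwise difference is always attained by the slowest and the fastest path.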
Process variations are subject to two sets of factors: systematic
factors, such as power supply fluctuations, which can be controlled
by proper techniques, and random factors, which cannot be controlled
by improved techniques. Thus, the random factors determine the
achievable performance of a circuit. Our major concern in this paper
is to model the clock skew and maximal clock delay of general CDNs
when the random factors are considered. When random process
variations are considered, variations of path delay are modeled
by normal distributions [3], [12]. To model the clock skew $\chi$,
the random variables $\xi$ (the maximal clock delay) and $\eta$ (the
minimal clock delay) should first be characterized. The model
developed in this paper is based on the following two assumptions.
1) A CDN can in general be represented by a binary tree. We
assume that both the maximal clock delay and the minimal
clock delay in each subtree (and also in the whole binary
tree) of the CDN can be modeled by normal distributions
when process variations are considered. This assumption
is rooted in the available results [13], [14]. It makes it
easy to analyze the correlation that exists between the
maximal and the minimal delay in a subtree. This
correlation analysis is critical in determining the variance
of skew in each subtree (and also in the whole binary tree)
of the CDN. Most importantly, the estimates of clock skew
and maximal clock delay obtained under this assumption
are accurate, as shown in this paper.
2) The delay along a clock path is the sum of the uncertain
independent delays of the branches along the given path.
Correlation between the delay of any two paths is deter-
mined only by the overlapped parts of their length.
The clock paths of a CDN usually have some common
branches over their length, and these common branches cause
correlation among the delays of these paths. The above assump-
tion enables a complete analysis of this kind of correlation.
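The consequence of Assumption 2) can be sketched in a few lines of Python: if branch delays are independent, the covariance of two path delays is the sum of the variances of the branches they share. The branch names and variance values below are hypothetical, and the helper names are not from the paper.

```python
import math


def path_variance(path, branch_var):
    """Variance of a path delay: sum of the variances of its independent branches."""
    return sum(branch_var[b] for b in path)


def path_correlation(path_a, path_b, branch_var):
    """Correlation coefficient of two path delays.

    Under Assumption 2), covariance comes only from the shared branches.
    """
    shared = set(path_a) & set(path_b)
    cov = sum(branch_var[b] for b in shared)
    return cov / math.sqrt(
        path_variance(path_a, branch_var) * path_variance(path_b, branch_var)
    )


# Hypothetical branch delay variances (ps^2) for a small two-level tree.
branch_var = {"root": 4.0, "left": 1.0, "right": 1.0,
              "ll": 1.0, "lr": 1.0, "rl": 1.0}
p1 = ["root", "left", "ll"]
p2 = ["root", "left", "lr"]   # shares root and left with p1
p3 = ["root", "right", "rl"]  # shares only root with p1
```

Paths that split later share more branches and are therefore more strongly correlated, which is why treating them as independent overestimates the skew.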
In addition to the delay correlation described in Assumption
2), correlation among path delays may also be caused by
correlated intra-die variations of the parameters that determine
the delay (e.g., threshold voltages, resistances, etc.). However,
finding the correlation coefficients of these parameters is, in
practice, quite cumbersome and would compromise the practicality of
an approach that considers all these kinds of correlation. To avoid
intractably complex models, these kinds of correlation are neglected
in this paper, as indicated in Assumption 2). In general, the
correlation of intra-die process parameters makes the path delays in
the same chip positively dependent. In this case, the expected values
of clock skew and maximal clock delay are still upper bounded by the
corresponding values estimated under Assumption 2) using our approach
(see Appendix A for a discussion). Compared to the old upper bound on
the expected skew of a well-balanced CDN, where all the clock paths
are assumed completely independent, our estimates are significantly
tighter, as shown in this paper, because the path delay correlation
caused by the common branches of the paths is fully considered.
Furthermore, the new approach is applicable to general CDNs, whereas
the old model is applicable only to well-balanced CDNs in which clock
paths are identical.
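From (3), the mean value and the variance of the clock skew $\chi$
are given by [15]
$$E(\chi) = E(\xi) - E(\eta) \qquad (4)$$
$$D(\chi) = D(\xi) + D(\eta) - 2\rho\sqrt{D(\xi)\,D(\eta)} \qquad (5)$$
Here, $E(\cdot)$ and $D(\cdot)$ represent the mean value and the
variance of a random variable, respectively, and $\rho$ is the
correlation coefficient of the maximal clock delay $\xi$ and the
minimal clock delay $\eta$. The parameters $E(\xi)$, $E(\eta)$,
$D(\xi)$, $D(\eta)$, and $\rho$ should be accurately estimated for a
CDN to allow, in turn, the accurate modeling of clock skew and
maximal clock delay.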
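The role of the correlation coefficient can be illustrated with a short Python sketch that evaluates the mean and variance of the difference of two correlated random variables (maximal delay minus minimal delay). The function name and the numbers are hypothetical; the identity itself is the standard one for the variance of a difference.

```python
import math


def skew_moments(mean_max, var_max, mean_min, var_min, rho):
    """Mean and variance of skew = (maximal delay) - (minimal delay).

    rho is the correlation coefficient of the maximal and minimal delays.
    """
    mean_skew = mean_max - mean_min
    var_skew = var_max + var_min - 2.0 * rho * math.sqrt(var_max * var_min)
    return mean_skew, var_skew


# Hypothetical moments (ns, ns^2): same marginals, different correlation.
independent = skew_moments(10.0, 4.0, 8.0, 4.0, rho=0.0)
correlated = skew_moments(10.0, 4.0, 8.0, 4.0, rho=0.5)
```

With identical marginal variances, a positive correlation lowers the skew variance, which is the effect the old independence assumption ignores.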
III. A NEW APPROACH FOR PARAMETER ESTIMATION
A recursive approach for evaluating the mean values and variances
of the maximal clock delay and the clock skew of general CDNs is
presented here. Based on this algorithm, closed-form expressions
for the clock skew and the maximal clock delay of well-balanced
H-tree CDNs are also developed.
A. Parameter Estimation Algorithm for General CDNs
A CDN can in general be represented by a binary tree, so the
simplified binary tree shown in Fig. 1 is taken as an example to
illustrate the evaluation of these parameters. The evaluation
process is then applied to general CDNs. All the paths in
Fig. 1 are partitioned into independent branches by the branch
split points in the clock tree, and each branch has its own actual
delay. Each branch split point $(i, j)$ in the clock tree is
associated with a set of three random variables: the maximal clock
delay, the minimal clock delay, and the clock skew of the subtree
starting from that split point. Each random variable is
characterized by both its mean value and its variance.
Fig. 1. Illustration of a simplified binary clock tree.
To illustrate that the mean values and variances of the maximal
clock delay and the clock skew of the simplified binary clock tree
can be evaluated recursively, we begin with the evaluation of the
random variables associated with split point (0, 0).
Let each branch also be associated with a set of random variables:
the maximal clock delay and the minimal clock delay of the subtree
starting from that branch, together with the correlation coefficient
of these two delays. Thus
(6)
(7)
Then we have
(8)
(9)
Equations (6)–(9) indicate that the results of branch split point
(0, 0), i.e., the mean values and variances of the maximal clock
delay, the minimal clock delay, and the clock skew of its subtree,
are determined by and can be evaluated from both the corresponding
results of the next lower level split points (1, 1) and (1, 2) and
the delays of the branches connecting point (0, 0) to those next
lower level split points (see Appendix B for the detailed evaluation
process). So once these results are obtained for each lowest-level
split point (i.e., a split point from which no further branch split
points can be found in the subtree starting from that split point),
the process above can be used recursively to evaluate the mean values
and the variances of the clock skew, the maximal clock delay, and the
minimal clock delay of a general CDN in a bottom-up manner. In fact,
the results of a lowest-level split point in Fig. 1 can be obtained
as follows. The distribution functions of its maximal and minimal
clock delays can be obtained by using the same idea as for equations
(B.1) and (B.2) in Appendix B, so the mean values and the variances
of these two delays can be evaluated by using their distribution
functions, respectively. The mean value and the variance of the clock
skew of that split point are given by (see Appendix C for the proof)
(10)
(11)
where
(12)
Based on the results of split point (0, 0), the mean values and
variances of the maximal clock delay and the clock skew of the whole
binary tree are then given by
(13)
(14)
The pseudocode for the parameter estimation algorithm can
be summarized as follows:
Algorithm
Parameter estimation for general CDNs
Initialization: for each do
while ( not empty) do
for each do
Remove from
In the pseudocode, a CDN is represented by a graph $G = (V, E)$
with vertex (split point) set $V$ and edge (branch) set $E$. Each
lowest-level split point of the graph is associated with the random
variables defined above (the maximal clock delay, the minimal clock
delay, and the clock skew of its subtree), and each of the two
branches starting from a lowest-level split point is associated with
its own maximal and minimal clock delays. For a CDN, the initial
values of these branch variables are just the actual delays of the
branches that support sinks. The algorithm also uses the actual delay
of the branch connecting a split point to its parent split point, and
the subgraph starting from that split point.
Since the algorithm carries out the same amount of computation
for each split point, the following conclusion can be obtained.
Theorem 1: The parameter estimation algorithm given above
computes a network in time linear in the number of its split points.
The theorem indicates that the parameter estimation algorithm is
computationally efficient in estimating the mean values and variances
of the maximal clock delay and the clock skew of a general CDN.
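The bottom-up structure of the recursion can be sketched in Python. Note that this sketch is a Monte Carlo cross-check, not the analytical moment propagation of this section: it samples every branch delay, combines the maximal and minimal delays of the child subtrees at each split point, and adds the delay of the connecting branch. The nested-tuple tree encoding `(mean, sd, children)`, the function names, and all numbers are hypothetical.

```python
import random
import statistics


def sample_subtree(node, rng):
    """Return (maximal delay, minimal delay) below `node` for one random die.

    `node` is (mean, sd, children); `children` is empty for a branch that
    drives a sink. Each branch delay is an independent normal sample.
    """
    mean, sd, children = node
    d = rng.gauss(mean, sd)
    if not children:
        return d, d
    results = [sample_subtree(child, rng) for child in children]
    # Bottom-up step: combine child subtrees, then add this branch's delay.
    return d + max(r[0] for r in results), d + min(r[1] for r in results)


def estimate(tree, runs=20000, seed=1):
    """Monte Carlo estimates of E and D for maximal delay and skew."""
    rng = random.Random(seed)
    maxima, skews = [], []
    for _ in range(runs):
        hi, lo = sample_subtree(tree, rng)
        maxima.append(hi)
        skews.append(hi - lo)
    return (statistics.mean(maxima), statistics.pvariance(maxima),
            statistics.mean(skews), statistics.pvariance(skews))


# Hypothetical unbalanced tree: two sinks on one side, one on the other.
leaf = (1.0, 0.1, [])
tree = (2.0, 0.2, [(1.5, 0.1, [leaf, leaf]), (1.0, 0.1, [leaf])])
```

Each split point is visited once per sample, so the cost per sample is linear in the number of split points, mirroring the linear-time behavior stated for the analytical algorithm.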
Fig. 2. A well-balanced H-tree clock distribution network for 64 processors
(clock buffers are not illustrated here).
B. Parameter Estimation for H-Tree CDNs
The H-tree technique is widely used to reduce the clock skew
[7], [8]. Due to the highly symmetric structure of H-tree CDNs,
it is possible to obtain a closed-form model for both the clock
skew and the maximal clock delay of H-tree CDNs. Before
developing the models, the H-tree itself must first be defined.
Without loss of generality, a well-balanced H-tree has $n$
hierarchical levels, where $n$ denotes the tree depth. The level-zero
branch corresponds to the root branch, and the level-$n$ branches to
the branches that support sinks. A level-$i$ branch begins at a
level-$(i-1)$ split point and ends at a level-$i$ split point. The
H-tree illustrated in Fig. 2 is drawn for $n = 6$ and is used to
distribute the clock signals to 64 processors.
For an $n$-level well-balanced H-tree, let $d_i$, $i = 0, 1, \ldots, n$,
be the actual delay of the level-$i$ branch of a clock path. The mean
values and the variances of the maximal clock delay and the minimal
clock delay of the H-tree are then given by the following equations
(see Appendix D for the derivation):
(15)
(16)
(17)
TABLE I
SOME MAJOR PROCESS PARAMETERS AND THEIR INTRA-DIE STANDARD DEVIATIONS (SD) OF A TYPICAL 0.25-μm CMOS PROCESS
Results (15)–(17) and (3) indicate that the expected clock
skew and the skew variance of the $n$-level well-balanced
H-tree are given by
(18)
(19)
where $\rho$ is the correlation coefficient of the maximal clock
delay and the minimal clock delay of the H-tree, and can be
recursively evaluated for a network as discussed in Section III-A.
The closed-form expressions (15)–(19) indicate clearly how the
clock skew and the maximal clock delay accumulate along the clock
paths and with increasing H-tree size. This enables a suitable
H-tree size to be selected for a specified clock frequency, and also
enables minimization of the clock period, improving the speed of a
fixed-size H-tree network [16].
IV. MODELING THE YIELDS OF CLOCK SKEW AND THE
MAXIMAL CLOCK DELAY
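The clock period of a CDN is in general determined by both
the clock skew and the maximal clock delay of the network.
With the estimates of the mean values and variances of both the
clock skew $\chi$ and the maximal clock delay $\xi$ in hand, it is
possible to estimate the yields of $\chi$ and $\xi$, and thus the
yield of the clock period. Here, the yield of a random variable means
the probability that the variable is less than a specified value. As
indicated in Assumption 1), the yield of $\xi$ can be estimated by a
normal distribution with mean $E(\xi)$ and variance $D(\xi)$. For
general CDNs, the maximal clock delay $\xi$ and the minimal clock
delay $\eta$ are positively correlated (i.e., $\rho > 0$) normal
variables [17], and the clock skew $\chi$ can be modeled by a
log-normal distribution, as verified by extensive simulation results
[14]. The clock skew yield, i.e., the probability that the actual
skew of the network, $\chi$, is less than a skew specification
$\chi_s$ ($\chi_s > 0$), can then be evaluated as
$$P(\chi \le \chi_s) = \Phi\!\left(\frac{\ln \chi_s - \mu_\chi}{\sigma_\chi}\right) \qquad (20)$$
Here, $\Phi$ is the standard normal distribution function, and the
parameters $\mu_\chi$ and $\sigma_\chi$ are given by [18]
$$\mu_\chi = \ln E(\chi) - \tfrac{1}{2}\sigma_\chi^2 \qquad (21)$$
$$\sigma_\chi^2 = \ln\!\left(1 + \frac{D(\chi)}{E(\chi)^2}\right) \qquad (22)$$
Once the mean values and variances of both the clock skew and the
maximal clock delay of a CDN are estimated by the algorithm developed
in Section III, the yields of the maximal clock delay and the clock
skew can be approximated by a normal distribution and a log-normal
distribution, respectively.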
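A minimal sketch of the two yield estimates is given below in Python. It assumes the standard moment-matching parameterization of a log-normal distribution (the unique log-normal with a given mean and variance); whether this is written exactly as in [18] is an assumption. The function names and the example numbers are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_delay_yield(spec, mean, var):
    """P(maximal delay < spec) for a normal model with the given mean/variance."""
    return normal_cdf((spec - mean) / math.sqrt(var))


def skew_yield(spec, mean, var):
    """P(skew < spec) for a log-normal model matched to the given mean/variance."""
    if spec <= 0.0:
        return 0.0  # a log-normal variable is strictly positive
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return normal_cdf((math.log(spec) - mu) / math.sqrt(sigma2))


# Hypothetical moments: mean skew 150 ps with a 40 ps standard deviation.
y_tight = skew_yield(150.0, 150.0, 40.0 ** 2)
y_loose = skew_yield(250.0, 150.0, 40.0 ** 2)
```

Because the log-normal distribution is right-skewed, the yield at a specification equal to the mean skew is slightly above 50%, whereas the normal model for the maximal delay gives exactly 50% at its mean.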
The delay of the branch may then be obtained by averaging
the rise and fall times. Here is the output resistance of the
driving transistor of minimum size inverter, is the input ca-
pacitance of the driven minimum size inverter, and are
the capacitance and resistance of the interconnection line in the
branch. , , and are given by
(24)
(25)
where
and width and length of the transistor;
gate unit area capacitance;
gate oxide thickness;
charge carrier mobility;
threshold voltage;
metal resistivity;
oxide dielectric constant.
Here, the interconnection line is with width , length , and
thickness on an oxide layer of thickness , and can
be estimated by the following empirical formula including the
contribution of fringing fields [21]:
(26)
The process parameters and their standard deviations used
here are based on the 0.25-μm CMOS technology predicted
by the International Technology Roadmap for Semiconductors
(ITRS) [22] and the MOSIS parametric test results of a typical
0.25-μm technology [23]. The mean values and intra-die standard
deviations (SD) of these process parameters are presented in
Table I.
Here, the supply voltage is not considered as a random variable
since the power supply in a system is globally controlled.
Furthermore, the standard deviation of the width of a transistor is
0.02 μm for n-MOS and 0.05 μm for p-MOS as estimated from MOSIS,
and the standard deviation of the length of the interconnection line
is assumed to be 2% of its nominal value. The n-MOS transistor and
p-MOS transistor in the minimum inverter of the technology are
assumed to have gate width/length of 0.37 μm/0.25 μm and
1.1 μm/0.25 μm, respectively.
As indicated in Assumption 2), the intra-die parameter
correlations are neglected in this paper. Thus, the output
resistance of the driving transistor of the minimum-size inverter in
(23) is independent of the input capacitance of the driven
minimum-size inverter. One approach to calculating the delay
variance of a branch due to the variations of process parameters is
to first express relation (23) in terms of independent variables (the
geometrical dimensions of the interconnection line in the branch,
together with this output resistance and input capacitance). The
delay variance of the branch can then be determined in terms of the
variances of these independent random variables [3], [24]. For
example, the variance $\sigma_f^2$ of a random variable $f$ that is a
function of $n$ independent random variables $x_1, x_2, \ldots, x_n$
may be obtained from
$$\sigma_f^2 = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i}\right)^2 \sigma_{x_i}^2 \qquad (27)$$
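The first-order variance propagation described above can be sketched in Python as follows. The sketch assumes the usual form in which each squared partial derivative, evaluated at the mean values, weights the variance of the corresponding independent input; the derivatives are approximated by central differences. The function name and the resistance-capacitance example are hypothetical.

```python
def propagated_variance(f, means, variances, rel_step=1e-6):
    """First-order variance of f(x_1, ..., x_n) for independent inputs.

    Partial derivatives are taken numerically at the mean values.
    """
    total = 0.0
    for i, (m, v) in enumerate(zip(means, variances)):
        h = rel_step * (abs(m) if m != 0 else 1.0)
        up = list(means)
        dn = list(means)
        up[i] = m + h
        dn[i] = m - h
        dfdx = (f(*up) - f(*dn)) / (2.0 * h)  # central difference
        total += dfdx ** 2 * v
    return total


def rc_product(r, c):
    """Hypothetical delay-like quantity: resistance times capacitance."""
    return r * c


# Means (2, 3) and variances (0.01, 0.04) are illustrative values only.
var_rc = propagated_variance(rc_product, [2.0, 3.0], [0.01, 0.04])
```

For the product above the result is 3^2 * 0.01 + 2^2 * 0.04 = 0.25; the approximation neglects second-order terms, so it is most reliable when the input variations are small relative to their means.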
Now consider the variances of and . Since
, all factors in this expression are independent variables.
The variance of can be determined in terms of variances of
these independent random variables using (27). By (24), is
determined by parameters , , and . In a system, the
power supply is controlled globally and is not a random variable.
Since and are both dependent on , they are correlated
variables. Also, for the gain constant, ,
of the transistor, the mobility has a dependence due to the
impact of the vertical gate field and thus correlated to . Con-
sidering this correlation can lead to a more accurate estimation
of variance of . For a simplified computation, however, we
will neglect this correlation here because and , and
can be roughly considered as independent variables as discussed
in [3], [25]–[27]. Thus, the variance of can be determined by
the variances of , , and , and the variance of
can then be evaluated from the variances of and using
(27). Following similar arguments as above, we can also eval-
uate the mean delay value of a branch based on both (23) and
the distributions of the basic parameters in (24)–(26) (geomet-
rical parameters of both transistor and interconnection line, ,
, , and ). Here, each basic parameter is considered as
a normal variable, but the process is also applied to other distri-
butions.
Once the mean delay value and the delay variance of each
branch are evaluated for a network, the theoretical approach
developed in this paper can be used to estimate the mean values,
the variances, and the yields of both the clock skew and the maximal
clock delay of the network. In the theoretical approach, the
algorithms presented in Section III are first used to evaluate the
required mean values, variances, and correlation coefficient for the
different networks; the yields of the clock skew and the maximal
clock delay are then estimated using the models provided in
Section IV.
To verify the new approach, transistor-level Monte Carlo
simulations are also conducted. In the simulation, each basic
parameter in (24)–(26) is simulated by a normal random variable.
To agree with the conditions used in the theoretical approach,
the correlations between these basic parameters are neglected, as
discussed above. The actual delay of a branch is then evaluated from
the random values of these basic parameters using (23)–(26). The
actual delay of a path is the sum of the actual delays of the
branches along that path. The actual maximal clock delay, minimal
clock delay, and clock skew of the network are then determined by
(1)–(3). For a specified value, the yield of a parameter (clock skew
or the maximal clock delay) is estimated by the ratio of the number
of simulations in which the parameter is less than the specified
value to the total number of simulations.
Fig. 3. Simulation results and theoretical results of clock skew of H-tree
networks when different numbers of processors are considered. (a) Mean value
of clock skew. (b) Standard deviation of clock skew.
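The simulated yield defined above is simply an empirical fraction, which can be sketched in Python as follows. The function name and the sample values are hypothetical.

```python
def empirical_yield(samples, spec):
    """Fraction of simulation runs in which the parameter is below `spec`."""
    if not samples:
        raise ValueError("need at least one simulation run")
    return sum(1 for s in samples if s < spec) / len(samples)


# Hypothetical clock skews (ps) observed in eight Monte Carlo runs.
skews_ps = [120.0, 135.0, 150.0, 160.0, 142.0, 171.0, 128.0, 155.0]
yield_at_150 = empirical_yield(skews_ps, 150.0)
```

With only a few runs the estimate is coarse; the resolution of an empirical yield is one over the number of simulations, so tail yields need correspondingly many runs.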
The first network considered is the well-known H-tree shown in
Fig. 2 (for brevity, inverters are not illustrated in the following
networks). Due to the very symmetrical design of H-tree clock
networks, all clock paths within the H-tree are identical, and the
old statistical model [5], [6] can be used to get an upper bound on
its expected clock skew when all the paths are assumed to be
independent. According to the old model, an upper bound on the
expected clock skew $E(\chi)$ of a well-balanced H-tree is
asymptotically given by [6]
$$E(\chi) = \sigma\left[\frac{4\ln N - \ln\ln N - \ln 4\pi + 2C}{(2\ln N)^{1/2}} + O\!\left(\frac{1}{\ln N}\right)\right] \qquad (28)$$
with the variance of clock skew being given by
$$D(\chi) = \frac{\sigma^2}{\ln N}\,\frac{\pi^2}{6} + O\!\left(\frac{1}{\ln^2 N}\right) \qquad (29)$$
where
$\sigma$ standard deviation of path delay;
$C$ Euler's constant;
$N$ number of paths;
$O(\cdot)$ higher order term.
Fig. 4. Simulation results and theoretical results of the maximal clock delay of
H-tree networks when different numbers of processors are considered. (a) Mean
value of the maximal clock delay. (b) Standard deviation of the maximal clock
delay.
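For a rough numerical feel of the old independent-path model, the Python sketch below evaluates the leading asymptotic terms for the expected range (maximum minus minimum) of N independent, identically distributed normal path delays, and the corresponding variance. The exact form used here, with the higher-order terms dropped, is an assumption based on the standard extreme-value asymptotics for normal samples; the function name is hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def old_model_skew(sigma, n_paths):
    """Leading-order mean and variance of skew for n independent identical paths.

    sigma is the standard deviation of a single path delay; n_paths must
    exceed 1. Higher-order correction terms are ignored.
    """
    ln_n = math.log(n_paths)
    mean = sigma * (4.0 * ln_n - math.log(ln_n) - math.log(4.0 * math.pi)
                    + 2.0 * EULER_GAMMA) / math.sqrt(2.0 * ln_n)
    var = sigma ** 2 * math.pi ** 2 / (6.0 * ln_n)
    return mean, var


# 64 paths with a normalized path-delay standard deviation of 1.
mean_64, var_64 = old_model_skew(1.0, 64)
```

Under these assumptions the expected skew for 64 independent paths is a little under five path-delay standard deviations, and it keeps growing (slowly) with the number of paths, which is why the bound becomes conservative when paths actually share most of their branches.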
In an $n$-level H-tree, there are a total of $2^{n+1} - 1$ branches,
and it can be used to distribute clock signals to $2^n$ elements.
For two combinations of the parameters $h$ and $W$, in a wide range,
and when the numbers of processors are 4, 8, 16, 32, and 64, the
theoretical results (obtained using both the old and new models)
and simulation results of clock skews of the corresponding H-trees
are summarized in Fig. 3(a) and (b). The equivalent results for
the maximal delay of the H-trees are summarized in Fig. 4(a)
and (b).
The results in Figs. 3 and 4 show that the new model is
accurate in estimating the mean values and standard deviations
of both the clock skew and the maximal clock delay of H-tree
CDNs, where the delay correlation determined by the overlapped
parts of path lengths has been considered as indicated in
Assumption 2). Compared to the old estimates of expected clock
skew, where all the paths in the H-tree are assumed completely
independent, the new estimates based on Assumption 2) are
significantly tighter. For H-trees of different sizes, the expected
clock skews estimated using the old model (28) are at least 2.6
times the expected clock skews estimated using our approach, as
shown in Fig. 3(a). In cases where clock frequency is limited by
skew rather than by the minimum time between two successive events
propagated through the H-tree [8], an unnecessarily long clock
period will result from using the old skew model.
Fig. 5. Simulation yield results and theoretical yield results of clock skew and
maximal clock delay of an H-tree network for two combinations of parameters
h and W. (a) Yield results of clock skew. (b) Yield results of the maximal clock
delay.
Based on the above estimated results of the mean values and the
variances of both the clock skew and the maximal clock delay, the
yields of these two quantities can be further estimated as discussed
in Section IV. For the six-level H-tree shown in Fig. 2, the
theoretical yield results and the simulation yield results of both
the clock skew and the maximal clock delay are summarized in
Fig. 5(a) and (b). For comparison, we also present in Fig. 5(a) the
simulation results of the yield of clock skew when all paths are
assumed independent.
The results in Fig. 5(a) indicate that the old model's independence
assumption leads to very conservative estimates of the clock
skew yields of H-tree CDNs that have identical but strongly
correlated clock paths. The results in Fig. 5 also show that when
the mean values and the variances of both the clock skew and the
maximal clock delay of H-tree CDNs are accurately estimated by our
approach, the yields of the clock skew and the maximal clock delay
are well approximated by log-normal and normal distributions,
respectively.
Fig. 6. Tree-type network.
Fig. 7. Trunk-type network.
The next two networks considered are the Tree-type network
(Fig. 6) and the Trunk-type network (Fig. 7). Since the paths in
these two general CDNs are not identical, the old model (28) and (29)
cannot be used to model the skews of these networks. Thus,
only the theoretical results of the new approach and the simulation
results are illustrated for the following two CDNs.
In a Tree-type network, a single clock input drives a row of
columns, and each of these columns drives same number of pro-
cessors. When a Tree-type network is used to distribute clock
signals to an array of processors, a total of
branches will be needed.
The Trunk-type network involves dividing the processors into
two equal parts. A single clock drives a main trunk that is used
to drive signals across each sub-row placed on either side of the
trunk. In a Trunk-type network used for clock distribution for
an array of processors, there are a total of
branches.
Fig. 8. Mean values and standard deviations of clock skew and maximal
clock delay of Tree-type networks when different numbers of processors are
considered. (a) Mean values of clock skew and maximal clock delay. (b)
Standard deviations of clock skew and maximal clock delay.
For the same combinations of the parameters $h$ and $W$ used
above for the H-tree, and when the array size varies from
$2 \times 2$ to $8 \times 8$, the theoretical results and the
simulation results for clock skew and maximal clock delay of
Tree-type networks are summarized in Fig. 8. The corresponding
results of Trunk-type networks are summarized in Fig. 9.
The results in Figs. 8 and 9 indicate that the new model is
also accurate in estimating the mean values and standard
deviations of both the clock skew and the maximal clock delay for
Tree-type CDNs and Trunk-type CDNs. For the Tree-type network shown
in Fig. 6, the theoretical yield results and the simulation yield
results of both the clock skew and the maximal clock delay are
further summarized in Fig. 10. The corresponding results of the
Trunk-type network shown in Fig. 7 are summarized in Fig. 11.
Again, the results in Figs. 10 and 11 indicate that when the
mean values and the variances of both the clock skew and the maximal
clock delay are estimated accurately for Tree-type and Trunk-type
networks, the yields of the clock skew and the maximal clock delay
are well modeled by log-normal and normal distributions,
respectively.
Fig. 9. Mean values and standard deviations of clock skew and maximal
clock delay of Trunk-type networks when different numbers of processors
are considered. (a) Mean values of clock skew and maximal clock delay. (b)
Standard deviations of clock skew and maximal clock delay.
Due to the assumption that all clock paths are identical, the
old statistical skew model cannot be used to model the clock
skews of general clock networks with nonidentical paths. The
new model developed in this paper, however, can be used to
accurately estimate the mean values and the variances of both the
clock skew and the maximal clock delay of these general CDNs.
For a well-balanced CDN (e.g., an H-tree clock network), the
expected clock skews estimated by the old model are very
conservative because the correlation among path delays is completely
neglected. The expected clock skews estimated by the new model, on
the other hand, are significantly tighter. For the well-balanced
H-tree network shown in Fig. 2, the results in Figs. 3(a)
and 4(a) show that when parameters and m,
the actual mean value of maximal clock delay is 6.96 ns, and the
actual mean value of clock skew is 153.4 ps. The expected clock
skew estimated by the old model is about 892.5 ps, about 5.8 times
the actual value. For a traditional clocking mode, and when the 10%
rule of thumb relating the skew to the clock period is used, the
actual clock period is dominated by the maximal clock delay (ten
times the actual skew is only 1.534 ns), and the mean value of the
clock period will be 6.96 ns. However, when the old skew model is
used in the skew estimate, the clock period appears to be determined
by the clock skew rather than the maximal clock delay, and the mean
value of the clock period will be 8.925 ns. The old model will thus
mislead efforts to reduce the clock period of the network. For
a pipelined clocking mode, the clock periods of well-balanced
H-tree networks will be dominated by the clock skew rather than
by the minimum time between two successive events propagated
through the H-tree, so an unnecessarily long clock period will
result from using the old skew model.
Fig. 10. Simulation yield results and theoretical yield results of clock skew and
maximal clock delay of a Tree-type network for two combinations of parameters
h and W. (a) Yield results of clock skew. (b) Yield results of maximal clock
delay.
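The clock period arithmetic in this discussion can be reproduced with a short Python sketch. It assumes one reading of the 10% rule of thumb, namely that the period must be at least the maximal clock delay and that the skew may occupy at most 10% of the period; this reading is consistent with the 6.96 ns and 8.925 ns figures quoted above, but the function name and the formulation are hypothetical.

```python
def clock_period_ns(max_delay_ns, skew_ns, skew_fraction=0.10):
    """Clock period for a traditional clocking mode (assumed rule).

    The period must cover the maximal clock delay, and the skew may use
    at most `skew_fraction` of the period.
    """
    return max(max_delay_ns, skew_ns / skew_fraction)


# Values quoted in the text: 6.96 ns maximal delay, 153.4 ps actual skew,
# 892.5 ps skew predicted by the old model.
period_actual = clock_period_ns(6.96, 0.1534)
period_old_model = clock_period_ns(6.96, 0.8925)
overestimate = 0.8925 / 0.1534
```

With the actual skew the maximal delay sets the period, whereas the old model's skew estimate pushes the skew-limited term above the maximal delay and lengthens the period by roughly 2 ns.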
VI. CONCLUSION
The available statistical skew model is too conservative in
estimating the expected skew of a well-balanced CDN, and it is
not general enough to model the clock skew of a nonbalanced CDN.
A computationally effective approach is presented for estimating
the mean values and the variances of both the clock skew and the
maximal clock delay of general CDNs. Closed-form models of clock
skew and maximal clock delay are also presented for well-balanced
H-tree CDNs. The path delay correlation determined by the
overlapped parts of path lengths is fully considered in the new
approach, so the mean values and variances of both clock skew and
maximal clock delay are accurately estimated for general CDNs. It
is further verified that, when the mean values and variances of
both clock skew and maximal clock delay of a CDN are accurately
estimated by the new approach, the skew yield and maximal delay
yield of the CDN are approximated by log-normal and normal
distributions, respectively. This enables the clock period yield
of the CDN to be estimated. If process variations are considered
in designing a clock distribution network, the mean values and
variances of the delays of all branches should be carefully
estimated; the approach presented here will then be useful in
evaluating and predicting the network’s clock skew and maximal
clock delay performance.
Fig. 11. Simulation yield results and theoretical yield results of clock skew
and maximal clock delay of a Trunk-type network for two combinations of
parameters h and W. (a) Yield results of clock skew. (b) Yield results of
maximal clock delay.
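As a rough illustration of how estimated means and variances translate into yields, the sketch below evaluates a log-normal skew yield and a normal maximal-delay yield. It is a minimal sketch under stated assumptions: the log-normal parameters are obtained by matching the given mean and variance (an assumption, not necessarily the fitting procedure used in this paper), the function names are hypothetical, and the standard deviations in the example are invented for illustration.

```python
import math


def normal_cdf(x, mean, std):
    """Distribution function of a normal random variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def lognormal_params(mean, var):
    """Parameters (mu, sigma) of ln(X) matching the mean and variance of X."""
    sigma2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def skew_yield(limit, mean, var):
    """P(skew <= limit) under a moment-matched log-normal model."""
    mu, sigma = lognormal_params(mean, var)
    return normal_cdf(math.log(limit), mu, sigma)


def delay_yield(limit, mean, var):
    """P(maximal clock delay <= limit) under a normal model."""
    return normal_cdf(limit, mean, math.sqrt(var))


# Hypothetical example: the means are those quoted for the H-tree example,
# the standard deviations (50 ps and 0.2 ns) are assumed for illustration.
skew_y = skew_yield(limit=200.0, mean=153.4, var=50.0 ** 2)   # ps
delay_y = delay_yield(limit=7.2, mean=6.96, var=0.2 ** 2)     # ns
print(f"skew yield  = {skew_y:.3f}")
print(f"delay yield = {delay_y:.3f}")
```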
APPENDIX A
First, we need the following lemma [5].
Lemma 1: Let and be the probability distribu-
tion functions for random variables and respectively, and
suppose further that and are differentiable and
and have finite means and variances. If
for all , then . [Here and repre-
sent the mean value and the variance of a random variable,
respectively.]
Correlated intra-die parameter variations mean that an increase
in a parameter’s value in one area of a chip tends to be
accompanied by an increase in that parameter’s value in other
areas, and vice versa. This leads to a positive dependence
between delays and of two paths and in the chip,
so we have
(A.1)
where is the probability that an event happens.
A CDN can be represented by a binary tree, so the simplified
binary tree shown in Fig. 1 is taken as an example for illustration. All
the paths in Fig. 1 are partitioned into branches
by the branch split points in the clock tree, where is the ac-
tual delay of . The branch split point in the clock tree
is associated with a set of random variables , here
, and are the maximal clock delay, the minimal clock
delay and the clock skew of the subtree starting from the split
point, respectively. We begin with the subtree which starts from
the lowest level split point . When the intra-die parameter
correlation is considered, the delays and will
be positively dependent, so we have
(A.2)
(A.3)
If Assumption 2) is used, and are independent,
and the distribution functions of and will be
and , re-
spectively. By Lemma 1 and (A.2), (A.3), the expected value of
the maximal clock delay ( ) and the expected value of the
clock skew ( ) of
the subtree will be upper bounded by the corresponding values
estimated using Assumption 2). Since we always have
and
whether is dependent or independent with
and , the expected values of the maximal clock delay and
clock skew of the subtree starting from branch will also be
upper-bounded by the corresponding values estimated using As-
sumption 2), and the same conclusion will apply to the subtree
starting from branch .
Furthermore, if the intra-die parameter correlation is
considered, the delay of a path in the subtree starting from branch
will be positively dependent on the delay of a path in the
subtree starting from branch . Following arguments similar to
those for the subtree starting from the lowest level split point
, the expected values of clock skew and the maximal clock
delay of the subtree starting from point will also be
upper bounded by the corresponding values estimated using
Assumption 2).
By applying the arguments above to the binary tree recursively,
we conclude that, if the intra-die parameter correlation is
considered, the expected values of clock skew and the maximal
clock delay of each subtree (and also of the whole binary tree)
of the CDN will be upper bounded by the corresponding
values estimated using Assumption 2).
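The bound argued in this appendix can be illustrated numerically for the simplest case of two paths. The following Monte Carlo sketch is an illustration under stated assumptions, not the proof itself: the two path delays are taken to be normal with a hypothetical common mean and standard deviation, positive dependence is modeled by a hypothetical correlation coefficient `rho`, and `rho = 0` plays the role of the independence case.

```python
import math
import random


def simulate(rho, n=20000, mean=1.0, std=0.1, seed=1):
    """Estimate E[max delay] and E[skew] for two correlated normal delays."""
    rng = random.Random(seed)
    total_max = 0.0
    total_skew = 0.0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        d1 = mean + std * z1
        # d2 has the same marginal distribution as d1 and correlation rho with it.
        d2 = mean + std * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        total_max += max(d1, d2)
        total_skew += abs(d1 - d2)
    return total_max / n, total_skew / n


max_indep, skew_indep = simulate(rho=0.0)   # independent delays
max_corr, skew_corr = simulate(rho=0.6)     # positively dependent delays

print(f"E[max]:  independent {max_indep:.4f}, correlated {max_corr:.4f}")
print(f"E[skew]: independent {skew_indep:.4f}, correlated {skew_corr:.4f}")
```

Both runs use the same seed, so the comparison is made on identical underlying random draws; under these assumptions the independent case gives the larger expected maximal delay and the larger expected skew.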
APPENDIX B
Part 1—Evaluation of , , , and
: Since
The following results can be obtained by using the Assumption
2):
Since
The distribution function of and distribution func-
tion of are given by
(B.1)
(B.2)
Thus, the parameters , , , and
can be obtained by using their distribution functions, respec-
tively. Based on the results above, the mean value of is de-
termined by
Part 2—Evaluation of : The variance of is given
by
Here is the covariance of and , and the
evaluation of is the main computational problem
of . To evaluate the covariance, and are first ex-
pressed as
Then is evaluated from (B.3) as shown at the
bottom of the page, where is the correlation coefficient of
and , is the correlation coefficient of and . The
in (B.3) can be evaluated as follows.
Since normal random variables and are independent from
normal random variables and , is modeled by normal
with mean, , and variance, , and is
modeled by normal .
Here
The covariance of and
is given by
(B.3)
Then the correlation coefficient of and is
given by
Thus, the relation between and is given by
[15]
where is a standard normal variable independent from .
Then we have
(B.4)
Here, the function and parameter are defined as
The covariance is defined as
where is the joint distribution function of and
. From (B.4), is given by
The joint density function of and is
determined as
So the covariance can be evaluated
from
Here, and are given by the following
(see Appendix C for the proof):
It can be proven in the same way as that for
, that the covariance is given
by
(B.5)
where
(B.6)
(B.7)
(B.8)
, and
can be evaluated as for . To evaluate
, and in (B.5)–(B.8) should be replaced
by and , respectively. To evaluate , ,
and in (B.5)–(B.8) should be replaced by , and ,
respectively. Finally, to evaluate , , ,
and in (B.5)–(B.8) should be replaced by , , and ,
respectively.
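For two paths only, the role of the covariance term in a skew variance can be checked with closed-form moments. The sketch below is an illustration under stated assumptions and does not reproduce (B.1)–(B.8): it uses the well-known formulas for the moments of the maximum of two jointly normal variables (often attributed to Clark), a hypothetical correlation coefficient `rho` between the two delays, and hypothetical function and key names.

```python
import math


def pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def skew_moments(mu1, s1, mu2, s2, rho=0.0):
    """Moments of max, min and skew (max - min) of two jointly normal delays."""
    var_diff = s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2   # variance of X - Y
    if var_diff <= 0.0:
        raise ValueError("the two delays differ by a constant")
    a = math.sqrt(var_diff)
    alpha = (mu1 - mu2) / a
    m1 = mu1 * mu1 + s1 * s1          # E[X^2]
    m2 = mu2 * mu2 + s2 * s2          # E[Y^2]
    e_max = mu1 * cdf(alpha) + mu2 * cdf(-alpha) + a * pdf(alpha)
    e_max2 = m1 * cdf(alpha) + m2 * cdf(-alpha) + (mu1 + mu2) * a * pdf(alpha)
    e_min = mu1 + mu2 - e_max         # because max + min = X + Y
    e_min2 = m1 + m2 - e_max2         # because max^2 + min^2 = X^2 + Y^2
    var_max = e_max2 - e_max ** 2
    var_min = e_min2 - e_min ** 2
    # max * min = X * Y, so E[max * min] = rho * s1 * s2 + mu1 * mu2.
    cov = rho * s1 * s2 + mu1 * mu2 - e_max * e_min
    return {
        "mean_max": e_max,
        "mean_min": e_min,
        "var_max": var_max,
        "var_min": var_min,
        "cov_max_min": cov,
        "mean_skew": e_max - e_min,
        "var_skew": var_max + var_min - 2.0 * cov,
    }


# Hypothetical delays (ns) for two paths with positive correlation.
moments = skew_moments(mu1=1.0, s1=0.1, mu2=1.2, s2=0.2, rho=0.3)
for name, value in moments.items():
    print(f"{name:12s} = {value:.5f}")
```

The variance of the skew is assembled as Var(max) + Var(min) - 2 Cov(max, min), so any error in the covariance term feeds directly into the skew variance.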
APPENDIX C
Let and be two independent normal random variables,
and let . Then is also a normal random variable
with its mean, , and its standard deviation, , given by
The distribution function of can be determined as
So the density function of is:
Thus, the mean value of is given by
Then the variance of can be evaluated as
APPENDIX D
For the hierarchical-level, well-balanced H-tree, let be
the maximal clock delay and be the minimal clock delay of
the sub H-tree starting from level split point. Then we have
where and are independent samples of ,
and are independent samples of ,
and are independent samples of . Based on the
property that two normal random variables are independent if
and only if their covariance is zero [15], the following results
can be obtained by using both the symmetry of well-balanced
H-tree and Assumption 1)
(D.1)
(D.2)
(D.3)
The process above indicates clearly that , ,
and can be obtained using , , ,
, , and . By applying (D.1)–(D.3) to
the -level, well-balanced H-tree recursively, the mean values
and the variances of the maximal clock delay, , and the minimal
clock delay of the H-tree are then given by
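The level-by-level recursion described in this appendix can also be cross-checked by simulation. The sketch below is a minimal Monte Carlo illustration under stated assumptions, not the closed-form model: the tree is a balanced binary tree, every branch delay is drawn independently from a normal distribution, and the per-level means and standard deviations in `BRANCH_PARAMS` are hypothetical values chosen only for the example.

```python
import random
import statistics


def sample_subtree(level, branch_params, rng):
    """Sample (max_delay, min_delay) from the split point at `level` to the sinks."""
    if level == len(branch_params):
        return 0.0, 0.0                       # reached a sink
    mean, std = branch_params[level]
    arrivals_max = []
    arrivals_min = []
    for _ in range(2):                        # two branches leave each split point
        branch_delay = rng.gauss(mean, std)
        sub_max, sub_min = sample_subtree(level + 1, branch_params, rng)
        arrivals_max.append(branch_delay + sub_max)
        arrivals_min.append(branch_delay + sub_min)
    return max(arrivals_max), min(arrivals_min)


def simulate_tree(branch_params, trials=2000, seed=7):
    """Estimate mean and variance of maximal clock delay and clock skew."""
    rng = random.Random(seed)
    samples = [sample_subtree(0, branch_params, rng) for _ in range(trials)]
    max_delays = [mx for mx, _ in samples]
    skews = [mx - mn for mx, mn in samples]
    return {
        "mean_max_delay": statistics.mean(max_delays),
        "var_max_delay": statistics.variance(max_delays),
        "mean_skew": statistics.mean(skews),
        "var_skew": statistics.variance(skews),
    }


# Hypothetical (mean, std) of the branch delay at each level, in ns.
BRANCH_PARAMS = [(2.0, 0.10), (1.0, 0.05), (0.5, 0.03)]
result = simulate_tree(BRANCH_PARAMS)
for name, value in result.items():
    print(f"{name:15s} = {value:.5f}")
```

Each split point combines the samples of its two subtrees exactly as the recursion does, taking the maximum and minimum after adding the branch delay, so the simulated moments give a reference against which recursively computed values could be compared.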
ACKNOWLEDGMENT
The authors would like to thank Dr. G. A. Allan, Edinburgh
University, U.K., Dr. N. Scaife and Professor H. Yasushi, JAIST,
Japan, Dr. F. Adachi, NEC, Japan, and the anonymous reviewers
for their valuable comments.
REFERENCES
[1] E. G. Friedman, Clock Distribution Networks in VLSI Circuits and Sys-
tems. New York: IEEE, 1995.
[2] D. W. Bailey and B. J. Benschneider, “Clocking design and analysis for
a 600-MHz alpha microprocessor,” IEEE J. Solid-State Circuits, vol. 33,
pp. 1627–1633, Dec. 1998.
[3] M. Afghahi and C. Svensson, “Performance of synchronous and asyn-
chronous schemes for VLSI systems,” IEEE Trans. Comput., vol. 41,
pp. 858–872, July 1992.
[4] M. Shoji, “Elimination of process-dependent clock skew in CMOS
VLSI,” IEEE J. Solid-State Circuits, vol. SC-21, pp. 875–880, Oct.
1986.
[5] S. D. Kugelmass and K. Steiglitz, “A probabilistic model for clock
skew,” in Proc. Int. Conf. Systolic Arrays, San Diego, CA, May 1988,
pp. 545–554.
[6] S. D. Kugelmass and K. Steiglitz, “An upper bound of expected clock skew
in synchronous system,” IEEE Trans. Comput., vol. 39, pp. 1475–1477, Dec. 1990.
[7] A. L. Fisher and H. T. Kung, “Synchronizing large VLSI processor
arrays,” IEEE Trans. Comput., vol. C-34, pp. 734–740, Aug. 1985.
[8] M. Nekili, G. Bois, and Y. Savaria, “Pipelined H-trees for high-speed
clocking of large integrated systems in presence of process variations,”
IEEE Trans. VLSI Syst., vol. 5, pp. 161–174, June 1997.
[9] M. D’Abreu, et al., “Understanding of the fabrication process—Key
to design and test of mixed signal ICs,” in Proc. Eur. Test Workshop,
Barcelona, Spain, May 1998.
[10] A. B. Kahng and G. Robins, On Optimal Interconnections for
VLSI. Norwell, MA: Kluwer, 1996.
[11] A. Balboni, C. Costi, M. Pellencin, A. Quadrini, and D. Sciuto, “Clock
skew reduction in ASIC logic design: A methodology for clock tree man-
agement,” IEEE Trans. Comput.-Aided Design Integrated Circuits Syst.,
vol. 17, pp. 344–356, Apr. 1998.
[12] M. Eisele, J. Berthold, D. Schmitt-Landsiedel, and R. Mahnkopf, “The
impact of intra-die device parameter variations on path delay and on the
design for yield of low voltage digital circuits,” IEEE Trans. VLSI Syst.,
vol. 5, pp. 360–368, 1997.
[13] T. Gneiting and I. P. Jalowiecki, “Influence of process parameter
variations on the signal distribution behavior of wafer scale integra-
tion devices,” IEEE Trans. Components, Packaging, Manufacturing
Technol.—Part B, vol. 18, pp. 424–430, Aug. 1995.
[14] X. H. Jiang and S. Horiguchi, “Distribution analysis of clock skew and
clock delay for general clock distribution networks,” JAIST Res. Rep.
(ISSN 0918-7553) IS-RR-2000-014, June 2000.
[15] J. Pitman, Probability. New York: Springer-Verlag, 1993.
[16] X. H. Jiang and S. Horiguchi, “Optimization of wafer scale H-tree clock
distribution network based on a new statistical skew model,” in Proc.
IEEE Int. Symp. Defect and Fault Tolerance in VLSI Systems (DFT’2000),
Yamanashi, Japan, Oct. 2000, pp. 96–104.
[17] X. H. Jiang and S. Horiguchi, “A recursive approach to estimating clock
skew yield and clock delay yield for general clock distribution networks,”
JAIST Res. Rep. (ISSN 0918-7553) IS-RR-2000-010, Apr. 2000.
[18] Probability and Statistics. New York: Academic, 1985.
[19] N. Nigam and D. C. Keezer, “A comparative study of clock distribution
approaches for VLSI,” in Proc. IEEE Int. Conf. Wafer Scale Integration,
San Francisco, CA, Jan. 1993, pp. 243–251.
[20] H. B. Bakoglu and J. D. Meindl, “Optimal interconnection circuits for
VLSI,” IEEE Trans. Electron Devices, vol. ED-32, pp. 903–909, 1985.
[21] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design: A
Systems Perspective. Reading, MA: Addison-Wesley, Apr. 1994.
[22] International technology roadmap for semiconductors (ITRS) (1998).
[Online]. Available: http://public.itrs.net
[23] MOSIS (2000). [Online]. Available: http://www.mosis.org
[24] A. Papoulis, Probability, Random Variables, and Stochastic
Processes. New York: McGraw-Hill, 1965.
[25] K. R. Lakshmikumar, A. Hadaway, and M. A. Copeland, “Characteriza-
tion and modeling of mismatch in MOS transistors for precision analog
design,” IEEE J. Solid-State Circuits, vol. SC-21, pp. 1057–1066, Dec.
1986.
[26] K. R. Lakshmikumar, “Characterization and modeling of mismatch in
MOS devices and application to precision analog design,” Ph.D.
dissertation, Carleton Univ., Ottawa, ON, Canada, 1985.
[27] J. Bastos, M. Steyaert, A. Pergoot, and W. Sansen, “Mismatch charac-
terization of submicron MOS transistors,” Analog Integrated Circuits
Signal Processing, vol. 12, pp. 95–106, 1997.
Xiaohong Jiang received B.S. and M.S. degrees in
applied mathematics in 1989 and 1992, respectively,
and the Ph.D. degree in solid-state electronics and
microelectronics in 1999, all from Xidian University,
Xi’an, China.
He is currently a Japan Society for the Promotion
of Science (JSPS) Postdoctoral Research Fellow
at the Japan Advanced Institute of Science and
Technology (JAIST). He was a Research Associate
in the Department of Electronics and Electrical
Engineering, University of Edinburgh, U.K., from
March 1999 to October 1999. His research interests include integrated circuit
yield modeling, timing analysis of digital circuits, clock distribution and
fault-tolerant technologies for VLSI and WSI, performance analysis, and
modeling of optical interconnection networks. He has published more than 20
technical papers in these areas.
Susumu Horiguchi (S’79–M’81–SM’95) received
the M.S. and the Ph.D. degrees in electrical commu-
nication engineering from Tohoku University, Japan,
in 1976, 1978, and 1981, respectively.
Since 1992, he has been a Full Professor with
the Graduate School of Information Science at
the Japan Advanced Institute of Science and
Technology (JAIST). He was a Faculty Member
of the Department of Information Science, Tohoku
University, Japan, from 1981 to 1992. From 1986
to 1987, he was a Visiting Scientist at IBM Thomas
J. Watson Research Center, Yorktown Heights, NY, and a Visiting Professor
with the Center for Advanced Studies, University of Southwestern Louisiana,
Lafayette, in 1994. He leads his research group as the Chair of
the Multi-Media Integral System Laboratory, JAIST. His research interests
include parallel computer architectures, VLSI/WSI architectures,
interconnection networks, parallel computing algorithms, massively
parallel processing, and multi-media integral systems.
Dr. Horiguchi has been involved in organizing many international workshops,
symposia and conferences sponsored by IEEE, ACM, and IEICE. He is a Senior
Member of the IEEE Computer Society and a board member of the Information
and System Society of IEICE.
