Fast Monte Carlo Estimation of Timing Yield: Importance Sampling with
  Stochastic Logical Effort (ISLE) by Bayrakci, Alp Arslan et al.
arXiv:0805.2627v2 [cs.OH] 21 May 2008
Fast Monte Carlo Estimation of Timing Yield:
Importance Sampling with Stochastic Logical Effort (ISLE)
Alp Arslan Bayrakci, Alper Demir, Serdar Tasiran
Center for Advanced Design Technologies
College of Engineering
Koc University, Istanbul, Turkey
November 19, 2007
Abstract— In the nano era in integrated circuit fabrication technologies, the per-
formance variability due to statistical process and circuit parameter variations is
becoming more and more significant. Considerable effort has been expended in the
EDA community during the past several years in trying to cope with the so-called sta-
tistical timing problem. Most of this effort has been aimed at generalizing the static
timing analyzers to the statistical case. In this paper, we take a pragmatic approach
in pursuit of making the Monte Carlo method for timing yield estimation practically
feasible. The Monte Carlo method is widely used as a golden reference in assessing
the accuracy of other timing yield estimation techniques. However, it is generally
believed that it cannot be used in practice for estimating timing yield as it requires too
many costly full circuit simulations for acceptable accuracy. In this paper, we present
a novel approach to constructing an improved Monte Carlo estimator for timing yield
which provides the same accuracy as the standard Monte Carlo estimator, but at a
cost of far fewer full circuit simulations. This improved estimator is based on a
novel combination of a variance reduction technique, importance sampling, and a
stochastic generalization of the logical effort formalism for cheap but approximate
delay estimation. The results we present demonstrate that our improved yield esti-
mator achieves the same accuracy as the standard Monte Carlo estimator at a cost
reduction reaching several orders of magnitude.
Keywords— logical effort, statistical variations, timing yield estimation and opti-
mization, statistical timing analysis, Monte Carlo methods, variance reduction tech-
niques, importance sampling.
I. INTRODUCTION
We address the problem of estimating timing yield for a circuit under
statistical process parameter variations and environmental fluctuations
by proposing a novel and improved Monte Carlo method based on
transistor-level circuit simulations. In conventional Monte Carlo yield
estimation, a number of samples in the parameter probability space
are generated. The overall delay of the circuit for each sample point
is determined by performing transistor-level timing simulations. An
estimator for timing yield is obtained by considering the fraction of
samples for which the timing constraint is satisfied. Because of the
computational cost of determining circuit delay for each sample, the
number of samples one has to work with is limited. This adversely
affects the accuracy of the yield estimator, which has a large error for
a small number of samples. This is a weakness of the conventional
Monte Carlo method and has prevented it from finding widespread
use for practical yield estimation.
The technique we propose aims to improve the accuracy of the yield
estimates obtained from a given number of Monte Carlo simulations.
Alternatively, our improved Monte Carlo estimator achieves the same
accuracy as the standard Monte Carlo estimator, but at a cost of far
fewer full circuit simulations. This is made possible by
using a variance reduction technique called importance sampling that
we combine in a novel manner with a stochastic generalization of the
logical effort formalism originally proposed by Sutherland et al. [1].
Logical effort is a method for quickly estimating and optimizing the
path delays in a circuit. We use the stochastic logical effort formalism
to guide the generation and selection of sample points in the parameter
probability space in a transistor-level simulation based Monte Carlo
method for timing yield estimation.
Our approach is based on the premise that, given the magnitude of
process parameter variations and the non-linear dependency of gate
and circuit delay on these variations, the only sufficiently reliable and
accurate method for determining circuit delay is detailed,
transistor-level simulation. We believe that sufficient accuracy in yield
estimation cannot be obtained even by applying Monte Carlo simulations at
a higher level, e.g. at the block level. Yield estimation techniques not
based on Monte Carlo simulations operate by propagating probability
density functions across the circuit. To make this process feasible, one
is forced to use approximate gate delay models and delay propagation
methods that may be too inaccurate when process parameter variations
are large. We therefore believe that accurate determination of timing
yield must have circuit simulation as its basis. We demonstrate in this
paper that Monte Carlo simulation in conjunction with a novel variance
reduction technique can serve as an accurate yet computationally
viable yield estimation method.
This work was funded by TUBA (Turkish Academy of Sciences) under the GEBIP Program and by
TUBITAK (Scientific and Technological Research Council of Turkey) under two career awards (104E057
and 104E058). The first author was supported by a TUBITAK BIDEB Fellowship.
In Section II, we provide background information and preliminar-
ies on logical effort and its stochastic generalization, the Monte Carlo
method for yield estimation and importance sampling. In Section III,
we present our improved Monte Carlo yield estimator featuring sig-
nificant error reduction through the use of the stochastic logical effort
formalism to facilitate importance sampling. We also provide a pre-
cise, comparative error analysis for our proposed technique and the
standard Monte Carlo estimator. Finally in Section IV, we present
experimental results on several examples which demonstrate the effi-
ciency and accuracy of our proposed yield estimator.
II. BACKGROUND
Section II-A provides an overview of the logical effort approach [1].
In Section II-B, we introduce two techniques for using logical effort
as a method for approximating circuit delay in the presence of statisti-
cal variations. Section II-C reviews the standard Monte Carlo method
for evaluating definite integrals, and Section II-D presents the impor-
tance sampling technique for variance reduction in Monte Carlo sim-
ulations. Finally, Section II-E describes the preliminaries in applying
the standard Monte Carlo and importance sampling techniques to tim-
ing yield estimation.
A. Logical Effort
The logical effort formalism is a fast and efficient way of determining
the delay of a path in a digital circuit. The path delay is simply the
sum of the delays of the gates on the path, and the delay of a logic
gate r is approximated as
$$d^{LE}_r = \tau\, d \qquad (1)$$
where $d^{LE}_r$ is the absolute delay of a gate measured in seconds, τ is the
delay of a parasitic-capacitance-free reference inverter driving another
identical inverter, and d is the delay of the logic gate expressed in units
of τ. The d factor in (1) models the gate delay and is given by
d = (p+gh) (2)
where p represents the intrinsic (parasitic) delay, g is the logical ef-
fort, and h is the electrical effort or electrical fan-out. Logical effort
g for a logic gate is defined as the (unitless) ratio of its (per) input
capacitance to that of an inverter that delivers the same output cur-
rent. Thus, logical effort g is a measure of the complexity of a gate.
It depends only on the gate’s topology and is independent of the size
and the loading of the gate. Parasitic delay p expresses the intrinsic
delay of the gate due to its own internal parasitic capacitance, and it is
largely independent of the sizes of the transistors in the gate. Parasitic
delay p is also expressed in units of τ. The electrical effort h is the
ratio of the load capacitance of the logic gate to the capacitance of a
particular input [1].
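
As a small numerical illustration of (1) and (2), the sketch below evaluates the delay of a two-gate path. The `Gate` class, the function names, and the value of τ are illustrative assumptions and not part of the original formulation; the g and p values used for the inverter and the 2-input NAND are the usual textbook values from the logical effort literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    p: float  # parasitic delay, in units of tau
    g: float  # logical effort (unitless)
    h: float  # electrical effort: load capacitance / input capacitance


def gate_delay(gate: Gate, tau: float) -> float:
    """Absolute gate delay per (1) and (2): tau * (p + g*h)."""
    return tau * (gate.p + gate.g * gate.h)


def path_delay(gates: list[Gate], tau: float) -> float:
    """Path delay as the sum of the delays of the gates on the path."""
    return sum(gate_delay(gate, tau) for gate in gates)


TAU_PS = 5.0  # assumed reference inverter delay in picoseconds (illustrative)
inverter = Gate(p=1.0, g=1.0, h=4.0)         # inverter driving a fan-out of 4
nand2 = Gate(p=2.0, g=4.0 / 3.0, h=3.0)      # 2-input NAND, fan-out of 3
total_ps = path_delay([inverter, nand2], TAU_PS)
```

Because τ multiplies every gate delay, rescaling τ rescales the whole path delay by the same factor.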
B. Stochastic Logical Effort
Equations (1) and (2) provide a way of decomposing the effects of
statistical parameter variations on gate delays. In a different context,
Sutherland et al. [1] analyzed different semiconductor processes with
varying supply voltages, and observed that almost all of the effect of
process parameters and supply voltage on gate delay is captured by the
reference inverter delay (τ in (1)), even when the parameters vary over
a large range spanning different fabrication processes. The logical ef-
fort g and the unitless parasitic delay p of a gate exhibit relatively little
variation with process parameters and supply voltage. Exploiting this
observation in the context of timing yield analysis, a stochastic
logical effort (SLE) model was proposed in [2], where the delay of a gate
is modeled as
$$d^{LE}_r(X) = \tau(X)\,(p + g h) \qquad (3)$$
where X is a vector of random variables, each component of which
represents a different statistical circuit or process parameter and τ(X)
is the reference inverter delay when the parameters are given by X .
As is apparent in this equation, in the stochastic logical effort approx-
imation, all process and environmental variations are captured by the
statistical variable τ while g, p and therefore d are assumed to be in-
dependent of process parameters. If only inter-die variations are mod-
eled, statistical parameters on the chip at all locations are perfectly
correlated. In this case, using the stochastic characterization of τ for
the same reference inverter for all of the logic gates on the die cap-
tures this perfect statistical correlation among gates. We refer to the
approximation given in (3) as first-degree stochastic logical effort (ab-
breviated as SLE.d1).
In this paper, we also investigate a further refinement of this ap-
proximation described by the following equation
$$d^{LE}_r(X) = \tau(X)\,\bigl(p(X) + g(X)\,h\bigr) \qquad (4)$$
where the dependency of p and g on X is also modeled. We call this
model second-degree stochastic logical effort (SLE.d2). As will be-
come apparent later in the paper, SLE.d2 is more accurate but compu-
tationally more expensive.
In both versions of SLE, in order to compute the delay of a path π
in a circuit, we simply add the delays of the gates on π:
$$d^{LE}_{\pi}(X) = \sum_{r=1}^{k} d^{LE}_r(X) \qquad (5)$$
Here $d^{LE}_r(X)$ is the delay of the r-th gate on the path π. $d^{LE}_r(X)$ is
computed by evaluating (3) for SLE.d1 and (4) for SLE.d2. For this
evaluation, a full transistor-level simulation of the whole circuit con-
taining the logic path is not necessary. However, the values of τ(X)
(for both SLE.d1 and for SLE.d2), and p(X) and g(X) (for SLE.d2) at
a given X are needed. For the results we present in this paper, we com-
pute these at a given X by running transistor-level circuit simulations
on small test circuits which contain only the reference inverter (for
τ(X)) or the gate under consideration (for p(X) and g(X)) together
with a proper driver and load circuitry. We envision that the statistical
characterizations (in the form of simple analytical formulas, a look-up
table or a response surface model generated by running circuit simu-
lations on appropriate test circuits) for these quantities could become
part of the characterizations supplied with a standard-cell library. In
this case, no circuit simulations will be needed when evaluating the
SLE delay formulas for circuits that are built using gates from such a
pre-characterized library.
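
To make the SLE.d1 evaluation in (3) and (5) concrete, the following sketch computes a path delay for a given parameter vector X. The linear response surface used for τ(X), its sensitivities, and the nominal τ value are hypothetical placeholders standing in for a pre-characterized model; they are not taken from the paper.

```python
def tau_of_x(x, tau_nominal_ps=5.0, sensitivities=(0.08, 0.05)):
    """Hypothetical linear response surface for tau(X).

    x holds normalized parameter deviations (e.g. in units of sigma);
    the sensitivities are made-up illustrative numbers.
    """
    return tau_nominal_ps * (1.0 + sum(s * xi for s, xi in zip(sensitivities, x)))


def sle_d1_path_delay(gates, x):
    """Equations (3) and (5): tau(X) times the sum of (p + g*h) over the path."""
    return tau_of_x(x) * sum(p + g * h for p, g, h in gates)


# Each gate is given as (p, g, h); the values are illustrative only.
path = [(1.0, 1.0, 4.0), (2.0, 4.0 / 3.0, 3.0)]
nominal_ps = sle_d1_path_delay(path, (0.0, 0.0))
slow_ps = sle_d1_path_delay(path, (1.0, 0.0))
```

In SLE.d1 the sum over (p + gh) is a constant of the path, so only τ(X) has to be re-evaluated per sample; SLE.d2 would additionally make p and g functions of X.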
C. The Monte Carlo Method
Monte Carlo (MC) techniques can be used to estimate the value of a
definite, finite-dimensional integral of the form
$$G = \int_{\Omega} g(X)\, f(X)\, dX \qquad (6)$$
where Ω is a finite domain and f(X) is a probability density function
(PDF) over X, i.e., $f(X) \ge 0$ for all X and $\int_{\Omega} f(X)\, dX = 1$. MC
estimation for the value of G is accomplished by drawing a set of
independent samples $X_1, X_2, \ldots, X_N$ from f(X) and by using
$$G_N = \frac{1}{N} \sum_{i=1}^{N} g(X_i) \qquad (7)$$
The estimator $G_N$ above is itself a random variable. Its mean is equal
to the integral G that it is trying to estimate, i.e., $E(G_N) = G$, making
it an unbiased estimator. The variance of $G_N$ is $\mathrm{Var}(G_N) = \sigma^2/N$,
where $\sigma^2$ is the variance of the random variable g(X) given by
$$\sigma^2 = \int_{\Omega} g^2(X)\, f(X)\, dX - G^2 \qquad (8)$$
The standard deviation of $G_N$ can be used to assess its accuracy in
estimating G. If N is sufficiently large, then by the Central Limit Theorem,
$(G_N - G)/(\sigma/\sqrt{N})$ has an approximately standard normal (N(0,1)) distribution.
Hence,
$$P\left(G - 1.96\,\frac{\sigma}{\sqrt{N}} \le G_N \le G + 1.96\,\frac{\sigma}{\sqrt{N}}\right) = 0.95 \qquad (9)$$
where P is the probability measure. The equation above means that
$G_N$ will be in the interval $[G - 1.96\,\sigma/\sqrt{N},\; G + 1.96\,\sigma/\sqrt{N}]$ with 95%
confidence. Thus, one can use the error measure
$$|\mathrm{Error}| \approx \frac{2\sigma}{\sqrt{N}} \qquad (10)$$
in order to assess the accuracy of the estimator.
Several techniques exist for improving the accuracy of MC evaluation
of definite integrals. In these techniques, one tries to construct an
estimator with a reduced variance for a given, fixed number of samples,
or equivalently, an improved estimator that provides the same accuracy
as the standard MC estimator but with considerably fewer
samples. This is desirable because computing the value of $g(X_i)$ is
typically computationally or otherwise costly.
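
The estimator (7) and the error measure (10) can be sketched in a few lines. The function name `mc_estimate`, the choice of integrand, and the seed below are illustrative assumptions; σ is replaced by the sample standard deviation, since the true σ is normally unknown.

```python
import math
import random


def mc_estimate(g, draw, n, rng):
    """Standard MC estimate of E[g(X)] per (7), with the 2*sigma/sqrt(N) error of (10).

    draw(rng) must return one sample from f(X); sigma is estimated from the samples.
    """
    values = [g(draw(rng)) for _ in range(n)]
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, 2.0 * math.sqrt(variance / n)


# Illustrative check: X ~ N(0, 1) and g(X) = X^2, for which G = 1 exactly.
rng = random.Random(0)
estimate, error = mc_estimate(lambda x: x * x, lambda r: r.gauss(0.0, 1.0), 20000, rng)
```

Quadrupling N halves the reported error, which is the slow convergence that motivates variance reduction.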
D. Importance Sampling
One MC variance reduction technique is importance sampling (IS) [3,
4]. IS improves upon the standard MC approach described above by
drawing samples for X from another distribution $\tilde f$. G in (6) is first
rewritten as below
$$G = \int_{\Omega} \left( \frac{g(X)\, f(X)}{\tilde f(X)} \right) \tilde f(X)\, dX \qquad (11)$$
If $X_1, X_2, \ldots, X_N$ are drawn from $\tilde f$ instead of f, the improved estimator
$\tilde G_N$ takes the form
$$\tilde G_N = \frac{1}{N} \sum_{i=1}^{N} g(X_i)\, \frac{f(X_i)}{\tilde f(X_i)} \qquad (12)$$
where the weighting factor $f(X_i)/\tilde f(X_i)$ has been used in order to
compensate for the use of samples drawn from the biased distribution $\tilde f$.
In order for the improved estimator above to be well-defined and
unbiased, $\tilde f(X_i)$ must be nonzero for every $X_i$ for which $f(X_i)\,g(X_i)$ is
nonzero. We refer to this as the safety requirement. The ideal choice
for the biasing distribution $\tilde f$ is
$$\tilde f_{ideal}(X) = \frac{g(X)\, f(X)}{G} \qquad (13)$$
which results in an exact estimator with zero variance with a single
sample! However, $\tilde f_{ideal}$ obviously cannot be used in practice since
the value of G is not known a priori. Instead, a practically realizable $\tilde f$
that resembles $\tilde f_{ideal}$ is used. The key in using IS in practical problems
is the determination of an effective biasing distribution that results in
significant variance reduction. We have identified one such biasing
distribution by exploiting the SLE formalism that we use to construct
an efficient and accurate estimator for the timing yield of digital cir-
cuits. This distribution will be described in Section III.
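
As a generic illustration of the IS estimator (12), unrelated to the SLE-based biasing distribution used later, the sketch below estimates a small Gaussian tail probability. The choice of a mean-shifted normal as the biasing distribution, the function name, and the seed are assumptions made for this example only.

```python
import math
import random


def tail_prob_is(threshold, n, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Biasing distribution (an illustrative choice): N(threshold, 1), which is
    nonzero everywhere, so the safety requirement holds.
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)          # sample from the biasing density
        if x > threshold:                      # g(x) is the tail indicator
            # Weight f(x)/f~(x) = exp(-x^2/2) / exp(-(x-t)^2/2) = exp(-t*x + t^2/2)
            total += math.exp(-threshold * x + 0.5 * threshold * threshold)
    return total / n


exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # closed-form value of P(X > 3)
estimate = tail_prob_is(3.0, 20000, random.Random(1))
```

With plain MC, only about 27 of 20000 samples would land in the tail; the biased sampler places roughly half of them there and corrects with the weights.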
E. Monte Carlo Estimation of Timing Yield
A path π in a circuit C is a sequence of gates $g_0, g_1, g_2, \ldots, g_n$ where
$g_0$'s inputs are primary inputs of the circuit, and $g_n$'s output is a
primary output of the circuit. Given a circuit and values for the statistical
parameters, a path is said to be critical if (i) it is sensitizable, and (ii)
its delay is at least as large as the delay of every other sensitizable path. A path
π is said to be statistically critical if it is a critical path of C for some
possible assignment to process parameters. We denote by $\Pi_{crit}$ the set
of statistically critical paths. Then, the delay of a circuit is computed
using
$$d_C(X) = \max_{\pi \in \Pi_{crit}} d_{\pi}(X) \qquad (14)$$
where $d_C(X)$ is the delay of the circuit and $d_{\pi}(X)$ is the delay of path
π when the circuit and process parameters are given by X.
A target delay $T_c$ is specified for the circuit. Given a PDF f(X)
for the statistical parameters, we would like to compute the fraction
of circuits that satisfy $d_C(X) \le T_c$, i.e., the timing yield of the circuit.
We define an indicator random variable $I(T_c, X)$ for the entire circuit
as follows: $I(T_c, X) = 1$ if the circuit delay exceeds the target, i.e.,
$d_C(X) > T_c$, and $I(T_c, X) = 0$ otherwise. We then define the timing
loss, or simply loss, as
$$\mathrm{Loss} = 1 - \mathrm{Yield} = \int_{\Omega} I(T_c, X)\, f(X)\, dX \qquad (15)$$
i.e., as the mean of the indicator random variable $I(T_c, X)$ over the PDF
f(X). Evaluation of the integral above is the timing yield (loss)
estimation problem addressed in this paper.
In a straightforward application of the MC method to loss estimation,
one would draw samples $X_1, X_2, \ldots, X_N$ from the statistical parameter
space according to the PDF f(X) and construct the loss estimator
$$\mathrm{Loss}_N = \frac{1}{N} \sum_{i=1}^{N} I(T_c, X_i) \qquad (16)$$
With the MC method, full circuit simulations (transistor-level SPICE
simulations of the whole circuit containing the paths under considera-
tion) must be performed for each sample point, Xi, in order to compute
dC (Xi) and determine whether I(Tc,Xi) = 1 or 0. The MC method is
widely used as a golden reference in the literature in assessing the ac-
curacy and efficiency of timing yield estimation techniques. However,
it is generally believed that it cannot be used in practice for
estimating timing yield as it requires too many costly full circuit simulations
for acceptable accuracy, even though there are some arguments to the
contrary [5]. In the rest of this paper, the loss estimator in (16) is
referred to as the standard MC (STD-MC) estimator.
Loss can also be estimated based on the SLE formalism, without
performing any full circuit simulations. The delay of a circuit can be
computed analytically based on the SLE formalism as follows
$$d^{LE}_C(X) = \max_{\pi \in \Pi_{crit}} d^{LE}_{\pi}(X) \qquad (17)$$
where $d^{LE}_{\pi}(X)$ is evaluated using the SLE formula in (5), with either
SLE.d1 or SLE.d2. We define a new indicator random variable
$I^{LE}(T_c, X)$, which takes the value 1 if the delay of a circuit computed
analytically using the SLE equations exceeds the target delay $T_c$, i.e.,
$I^{LE}(T_c, X_i)$ is 1 if $d^{LE}_C(X_i) > T_c$, and 0 otherwise. The loss estimator
based on this new indicator variable takes the form
$$\mathrm{Loss}^{LE}_N = \frac{1}{N} \sum_{i=1}^{N} I^{LE}(T_c, X_i) \qquad (18)$$
In computing $\mathrm{Loss}^{LE}_N$ above, no full circuit simulations are performed.
Only simple evaluations of the SLE delay formulas are needed, based
on pre-characterizations of τ(X), p(X) and g(X) in (3) and (4). In
contrast, the loss estimator in (16) requires N full circuit simulations,
one for every sample. The loss estimator in (18) will be referred to as
the SLE-MC estimator in the rest of this paper.
The estimation of loss based on the STD-MC estimator in (16) will
obviously be more accurate than the one based on the SLE-MC
estimator in (18), but much more costly. We use the cheap SLE-MC
estimator not by itself for yield estimation, but in a novel approach to
constructing an IS-based loss estimator with reduced variance. This
approach is called ISLE (Importance Sampling based on Stochastic
Logical Effort) and provides the same accuracy as the STD-MC
estimator but at a cost of far fewer full circuit simulations.
III. TIMING YIELD ESTIMATION WITH ISLE
The biasing distribution $\tilde f(X)$ used in ISLE is
$$\tilde f(X) = \frac{I^{LE}(T_c^{\varepsilon}, X)\, f(X)}{\mathrm{Loss}^{LE,\varepsilon}} \qquad (19)$$
This biasing distribution serves as a good approximation to the ideal
but practically unrealizable biasing distribution for importance
sampling, $\tilde f_{ideal}(X) = I(T_c, X)\, f(X)/\mathrm{Loss}$. In (19) above, $T_c^{\varepsilon} = (1-\varepsilon)\,T_c$ and $\mathrm{Loss}^{LE,\varepsilon}$
is the loss computed by the SLE-MC estimator in (18) with the target
delay set to $T_c^{\varepsilon}$ instead of $T_c$, where ε is a margin parameter which we
explain below. The ISLE loss estimator is then constructed as follows
$$\mathrm{Loss}^{ISLE}_N = \frac{1}{N} \sum_{i=1}^{N} I(T_c, X_i)\, \frac{f(X_i)}{\tilde f(X_i)} \qquad (20)$$
where the sample points $X_i$ must be drawn from $\tilde f(X)$ in (19) instead
of f(X). The margin parameter ε was introduced above in order to
guarantee that $\tilde f(X_i)$ is nonzero everywhere $I(T_c, X_i)\, f(X_i)$ is nonzero,
i.e., $I^{LE}(T_c^{\varepsilon}, X_i)$ must take the value 1 everywhere $I(T_c, X_i)$ is 1. The
margin parameter ε must be large enough so that the indicator
variables never assume the values $I^{LE}(T_c^{\varepsilon}, X_i) = 0$ (the timing constraint
$T_c^{\varepsilon}$ is satisfied according to SLE) and $I(T_c, X_i) = 1$ (the actual circuit
fails to satisfy the timing constraint) for any of the sample points. An
automated and adaptive algorithm for the determination of the
smallest value for the margin parameter ε will be described in Section III-B.
Substituting the biasing distribution $\tilde f$ in (19) into (20), and
performing some simplifications based on the fact that $I^{LE}(T_c^{\varepsilon}, X_i)$ takes
the value 1 for all samples drawn from $\tilde f(X)$, we arrive at
$$\mathrm{Loss}^{ISLE}_N = \frac{\mathrm{Loss}^{LE,\varepsilon}}{N} \sum_{i=1}^{N} I(T_c, X_i) \qquad (21)$$
where, as in (20), the samples $X_i$ are drawn from $\tilde f(X)$ in (19).
In order to draw a sample from $\tilde f(X)$ in (19), we first draw a sample
from f(X). We keep the sample if $I^{LE}(T_c^{\varepsilon}, X_i)$ evaluates to 1 at the
sample point and discard it otherwise. The evaluation of $I^{LE}(T_c^{\varepsilon}, X_i)$
is done using the analytical SLE formulas. Each kept sample
constitutes one of the $X_i$ in (21). $\mathrm{Loss}^{ISLE}_N$ is then computed by determining
whether $I(T_c, X_i) = 1$ for each such kept sample, i.e., by carrying out
a full circuit-level simulation at $X_i$.
A key benefit of the ISLE approach is that circuit-level simulations
are avoided for discarded samples, i.e., when Xi results in an SLE cir-
cuit delay estimate smaller than T εc . The improvement brought about
by ISLE, however, goes significantly beyond this. For the same num-
ber of samples N, the ISLE estimator in (21) provides a much more
accurate (with significantly reduced variance) loss estimate than the
STD-MC estimator in (16). Were it possible to use the ideal biasing
function $\tilde f_{ideal}$, a zero-variance estimator would have been obtained
with a single sample. The ISLE approach makes it possible to explore
the space between standard MC and this ideal. Using an $\tilde f$ that
approximates $\tilde f_{ideal}$ as closely as possible, ISLE both reduces the number of
full circuit simulations required and improves upon standard MC in
the estimator accuracy achieved for the same number of full circuit
simulations. The next section makes this discussion more precise.
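
The sampling procedure and the estimator (21) can be sketched on a toy problem. Everything in the model below is a hypothetical stand-in: a single standard-normal parameter, a linear "SLE" delay, and a "full simulation" delay that differs from it by a bounded term. The margin is applied additively, $T_c^{\varepsilon} = T_c - \varepsilon$, as in the EXPLORE listing, and is chosen larger than the toy SLE error bound so that the safety requirement holds.

```python
import math
import random

TC = 120.0   # target delay (arbitrary units, illustrative)
EPS = 1.0    # additive margin; exceeds the toy SLE error bound of 0.5


def sle_delay(x):
    """Cheap stand-in for the SLE circuit delay (hypothetical model)."""
    return 100.0 + 10.0 * x


def full_delay(x):
    """Stand-in for a full circuit simulation: SLE delay plus a bounded error."""
    return sle_delay(x) + 0.5 * math.sin(7.0 * x)


def isle_loss(n_full, n_sle, rng):
    """ISLE loss estimate per (21) using n_full 'full simulations'."""
    t_eps = TC - EPS
    # Step 1: Loss^{LE,eps} from many cheap SLE evaluations, as in (18).
    hits = sum(sle_delay(rng.gauss(0.0, 1.0)) > t_eps for _ in range(n_sle))
    loss_le = hits / n_sle
    # Step 2: draw from the biasing density by keeping only samples with I^LE = 1.
    kept = []
    while len(kept) < n_full:
        x = rng.gauss(0.0, 1.0)
        if sle_delay(x) > t_eps:
            kept.append(x)
    # Step 3: 'full simulation' only at the kept samples.
    failing = sum(full_delay(x) > TC for x in kept)
    return loss_le * failing / n_full


loss_isle = isle_loss(n_full=500, n_sle=200_000, rng=random.Random(2))
```

In this toy setting most kept samples actually fail timing, so the fraction in step 3 is close to 1 and has low variance; that is the mechanism behind the variance reduction.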
A. Theoretical Gain: Quantifying Variance Reduction due to ISLE
The error of an estimator is the deviation of the estimator's result from
the actual loss as explained in Section II-C for a general estimator.
In this section, the errors of the STD-MC and ISLE estimators are
derived and the results are compared.
Theorem III.1: The error of the STD-MC estimator in (16) obtained
with N full-circuit simulations is
$$\mathrm{Error}_{MC} = \frac{2\sqrt{\mathrm{Loss} \cdot \mathrm{Yield}}}{\sqrt{N}} \qquad (22)$$
with more than 95% confidence.
Proof: By (10), the error of the STD-MC estimator for loss
using N full-circuit simulations is $2\sigma/\sqrt{N}$, where $\sigma^2$ is the variance of
the indicator random variable $I(T_c, X)$ with PDF f(X). The mean of
$I(T_c, X)$ is equal to the actual timing loss. $\sigma^2$ is computed as
$$\sigma^2 = \int_{\Omega} I(T_c, X)^2\, f(X)\, dX - \mathrm{Loss}^2 \qquad (23)$$
$I(T_c, X)$ is either 1 or 0; thus, $I(T_c, X) = I(T_c, X)^2$. Eqn. (23) becomes
$$\sigma^2 = \mathrm{Loss} - \mathrm{Loss}^2 = \mathrm{Loss}\,(1 - \mathrm{Loss}) = \mathrm{Loss} \cdot \mathrm{Yield} \qquad (24)$$
The error of the STD-MC estimator is thus given by (22).
Theorem III.2: The error of the ISLE estimator for loss when N full
circuit simulations are performed is
$$\mathrm{Error}_{ISLE} = \frac{2\sqrt{\mathrm{Loss} \cdot (\mathrm{Loss}^{LE,\varepsilon} - \mathrm{Loss})}}{\sqrt{N}} \qquad (25)$$
with more than 95% confidence.
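Proof: By (10), the error of the ISLE estimator for loss using
N full-circuit simulations is $2\tilde\sigma/\sqrt{N}$, where $\tilde\sigma^2$ is the variance of the
random variable $I(T_c, X)\, f(X)/\tilde f(X)$ with PDF $\tilde f(X)$. The mean of this random
variable is equal to the actual timing loss. $\tilde\sigma^2$ is computed as
$$\tilde\sigma^2 = \int_{\Omega} \left( \frac{I(T_c, X)\, f(X)}{\tilde f(X)} \right)^2 \tilde f(X)\, dX - \mathrm{Loss}^2 \qquad (26)$$
Substituting $\tilde f(X)$ from (19) and using the fact that $I(T_c, X)^2 =
I(T_c, X)$ we obtain
$$\tilde\sigma^2 = \int_{\theta} \frac{I(T_c, X)\, f^2(X)}{I^{LE}(T_c^{\varepsilon}, X)\, f(X) / \mathrm{Loss}^{LE,\varepsilon}}\, dX - \mathrm{Loss}^2 \qquad (27)$$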
[Fig. 1. Iterations of the ISLEEXPLORER algorithm for determining the margin ε (curves labeled εinit, εmin, εend).]
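θ denotes the subregion of Ω in which $\tilde f(X)$ is non-zero. From (19),
$\tilde f(X)$ is zero when $I^{LE}(T_c^{\varepsilon}, X)$ is zero (and thus $I(T_c, X) = 0$, if the
margin ε is chosen properly). When $\tilde f(X)$ is non-zero, $I^{LE}(T_c^{\varepsilon}, X) = 1$.
Thus
$$\tilde\sigma^2 = \mathrm{Loss}^{LE,\varepsilon} \int_{\theta} I(T_c, X)\, f(X)\, dX - \mathrm{Loss}^2 = \mathrm{Loss} \cdot (\mathrm{Loss}^{LE,\varepsilon} - \mathrm{Loss}) \qquad (28)$$
The error of the ISLE estimator is thus given by (25).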
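If the same number of full circuit simulations N is used with both
methods, then the ratio of the errors of the estimators is given by
$$\text{Error Ratio} = \frac{\mathrm{Error}_{MC}}{\mathrm{Error}_{ISLE}} = \sqrt{\frac{\mathrm{Yield}}{\mathrm{Loss}^{LE,\varepsilon} - \mathrm{Loss}}} \qquad (29)$$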
Alternatively, suppose a bound on the allowable estimation error is
given. The ratio of the number of full circuit simulations required by
the two approaches to achieve this same error bound is given by
$$\mathrm{Gain} = \frac{N_{MC}}{N_{ISLE}} = \frac{\sigma^2}{\tilde\sigma^2} = \frac{\mathrm{Yield}}{\mathrm{Loss}^{LE,\varepsilon} - \mathrm{Loss}} \qquad (30)$$
As is apparent from (29) and (30), as LossLE,ε approaches the real loss
Loss, the improvement that ISLE offers over STD-MC increases.
In the proof of Theorem III.2 above for the error of the ISLE es-
timator, LossLE,ε was assumed to be a known deterministic quantity.
However, $\mathrm{Loss}^{LE,\varepsilon}$ is not determined analytically. $\mathrm{Loss}^{LE,\varepsilon}$ is a
random variable and is estimated using the SLE-MC estimator in (18).
The variance of this random variable is inversely proportional to the
number of samples used in the SLE-MC estimator. In order for the
error result for the ISLE estimator in (25) to be valid, the estimation
of LossLE,ε must be performed by using a large enough number of
samples in (18) so that it has negligible variance. This would validate
its treatment as a deterministic quantity in the derivation of the error
for the ISLE estimator. The use of a large number of samples in the
SLE-MC estimator in (18) is easily affordable, because no full circuit
simulations are performed; only simple evaluations of the SLE delay
formulas are needed. The results we present later show that the
retical error expressions derived here are in excellent agreement with
experimental data.
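
The expressions (22), (25), (29) and (30) are easy to evaluate numerically. The sketch below does so for made-up values of Loss, $\mathrm{Loss}^{LE,\varepsilon}$ and N that are chosen only for illustration; the function names are likewise assumptions.

```python
import math


def error_std_mc(loss, n):
    """Equation (22): 2 * sqrt(Loss * Yield) / sqrt(N)."""
    return 2.0 * math.sqrt(loss * (1.0 - loss) / n)


def error_isle(loss, loss_le, n):
    """Equation (25): 2 * sqrt(Loss * (Loss_LE - Loss)) / sqrt(N)."""
    return 2.0 * math.sqrt(loss * (loss_le - loss) / n)


def gain(loss, loss_le):
    """Equation (30): Yield / (Loss_LE - Loss)."""
    return (1.0 - loss) / (loss_le - loss)


# Illustrative numbers only: 2% true loss, 3% SLE loss at the chosen margin.
LOSS, LOSS_LE, N = 0.02, 0.03, 1000
ratio = error_std_mc(LOSS, N) / error_isle(LOSS, LOSS_LE, N)   # equation (29)
speedup = gain(LOSS, LOSS_LE)
```

For these values the gain is about 98, i.e., STD-MC would need roughly 98 times as many full circuit simulations for the same error; the error ratio is the square root of the gain.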
Algorithm 1 ISLEEXPLORER(MCSimCapacity, ExpectedMaxLoss, Tc)
    NumFSamples ← ⌈MCSimCapacity × 1/ExpectedMaxLoss⌉
    Draw NumFSamples sample points {X1, X2, X3, ..., XNumFSamples} from f(X)
    for i = 1 to NumFSamples do
        Xi.color ← BLACK
    end for
    PointsInMargin ← 0, ε ← εinit, ε-step ← 0.02 picoseconds
    MCLossCount ← 0, WhitePoints ← 0
    while (PointsInMargin ≤ SafetyLimit) do
        EXPLORE(Tc, ε, NumFSamples)
        if PointsInMargin = 0 then
            if (NewWhitePointsDiscovered) then
                εmin ← ε
                LossPointsAtεmin ← MCLossCount
                WhitePointsAtεmin ← WhitePoints
            end if
        end if
        ε ← ε + ε-step
    end while
    return (LossPointsAtεmin / WhitePointsAtεmin) × Loss^{LE,εmin}

Algorithm 2 EXPLORE(Tc, ε, NumFSamples)
    T^ε_c ← Tc − ε
    NewWhitePointsDiscovered ← false
    for i = 1 to NumFSamples do
        if Xi.color = BLACK then
            Compute I^LE(T^ε_c, Xi)
            if I^LE(T^ε_c, Xi) = 1 then
                Xi.color ← WHITE
                WhitePoints ← WhitePoints + 1
                NewWhitePointsDiscovered ← true
                Compute I(Tc, Xi)    // Full circuit simulation
                if I(Tc, Xi) = 1 then
                    MCLossCount ← MCLossCount + 1
                    PointsInMargin ← 0
                else
                    PointsInMargin ← PointsInMargin + 1
                end if
            end if
        end if
    end for
B. ISLEEXPLORER: The Margin Determination Algorithm
The determination of an appropriate value of ε, the timing margin used
by ISLE, is essential for the correctness, accuracy, and efficiency of
the technique. On the one hand, ε must be large enough to satisfy
the correctness constraint that for every value of X for which $f(X)\, I(T_c, X)$
is non-zero, $\tilde f(X)$ is also non-zero. Since $\tilde f(X)$ is proportional to
f(X) and $I^{LE}(T_c^{\varepsilon}, X)$, this translates to the safety requirement that
$I(T_c, X_i) = 1 \Rightarrow I^{LE}(T_c^{\varepsilon}, X_i) = 1$. On the other hand, as can be seen
in (25) and (30), the closer LossLE,ε is to Loss, the more accurate the
ISLE estimator becomes and the more improvement it achieves over
standard MC. Making LossLE,ε close to Loss requires that ε be kept
small. Thus, to make ISLE accurate and efficient while preserving
correctness, we must make ε as small as possible without violating
the safety requirement for any Xi.
If SLE were a perfect approximation, a margin of ε = 0 would
satisfy the requirements above. However, as also demonstrated by our
experimental results, SLE is inaccurate to an extent that
is circuit and process parameter dependent. Therefore, the margin ε
must be determined separately for each different circuit and it must be
checked that the resulting ε satisfies $I(T_c, X_i) = 1 \Rightarrow I^{LE}(T_c^{\varepsilon}, X_i) = 1$.
This section presents ISLEEXPLORER, an iterative heuristic
algorithm for determining ε (Algorithm 1) and estimating Loss.
ISLEEXPLORER interleaves steps of incrementing ε and performing
a number of full circuit simulations required to compute the ISLE
estimator. When ISLEEXPLORER terminates, a correct value of ε has been
determined and all of the full circuit simulations required for computing
the estimator $\mathrm{Loss}^{ISLE}_N$ in (21) have been performed. ISLEEXPLORER
runs only a fixed number (SafetyLimit) of additional full circuit
simulations beyond those needed for $\mathrm{Loss}^{ISLE}_N$ in order to ensure that
the margin value ε used is correct. As will become apparent below,
the cost of the full circuit simulations is the dominant factor in the
computational cost of ISLEEXPLORER. Therefore, the computational
cost of adaptively determining the margin parameter is a fixed number,
SafetyLimit, of full circuit simulations.
The intuition behind the operation of ISLEEXPLORER is illustrated
in Figure 1. The solid rectangle represents the two-dimensional
parameter space. Every possible point in the rectangle corresponds to a
unique valuation of the parameters X. The solid curve in Figure 1
consists of the points X for which the delay of the circuit is exactly
$T_c$. Outside the solid curve, circuit delay exceeds $T_c$, i.e., $I(T_c, X) = 1$.
Each dotted curve consists of points X for which $d^{LE}_C(X) = T_c - \varepsilon$ for
a particular value of ε.
ISLEEXPLORER considers NumFSamples samples generated from
f(X) and computes ε based on data it collects on these samples.
ISLEEXPLORER starts exploration with a negative initial value for the
margin, εinit, gradually increases ε, ends exploration at εend, and
determines εmin in the process. εmin is the smallest margin ISLEEXPLORER
can detect for which it can verify that the $d^{LE}_C(X) = T_c - \varepsilon_{min}$ curve
lies completely inside the $d_C(X) = T_c$ curve. εmin is the value of ε
used by ISLEEXPLORER for computing the ISLE estimator for timing
loss (21). All circuit simulations required to compute the summation
in (21) have already been performed when the ε exploration is
completed. At that point, to arrive at the value of the estimator $\mathrm{Loss}^{ISLE}_N$,
all ISLEEXPLORER needs is an estimate for the value of $\mathrm{Loss}^{LE,\varepsilon_{min}}$,
which is computed using the SLE-MC estimator in (18) as explained
before. The computational cost of determining $\mathrm{Loss}^{LE,\varepsilon}$ is unavoidable
with ISLE and is not due to the adaptive determination of ε.
[Fig. 2. Test circuit 1: inverter chain (nodes 1 to 10).]
[Fig. 3. Test circuit 2: gate chain (nodes 1 to 10; one gate labeled oai21).]
At each iteration, ISLEEXPLORER increases the margin ε by
ε-step. It then investigates (using the EXPLORE subroutine in
Algorithm 2) the samples that fall between $d^{LE}_C(X) = T_c - \varepsilon$ and
$d^{LE}_C(X) = T_c - (\varepsilon + \varepsilon\text{-step})$ and determines for each such sample $X_i$ whether
$I(T_c, X_i) = 1$ is satisfied. The iterations continue until a value of
margin εend is reached for which the number of samples $X_i$ that fall in a
safety band defined by $I^{LE}(T_c - \varepsilon_{end}, X_i) = 1$ and $I^{LE}(T_c - \varepsilon_{min}, X_i) = 0$
(also $I(T_c, X_i) = 0$) reaches SafetyLimit, a user-given parameter.
ISLEEXPLORER uses the colors white and black to mark the status
of samples $X_i$ generated from f. If Xi.color = BLACK, this indicates
that a full circuit simulation has not been run for $X_i$. This is because
for the values of the margin ε explored so far, $I^{LE}(T_c^{\varepsilon}, X_i)$ was found to
be 0. If Xi.color = WHITE, this indicates that an SLE timing estimation
and a full circuit simulation for $X_i$ have been performed and it has been
determined whether the safety requirement is satisfied for $X_i$. White
points do not need to be revisited when the value of ε increases, since
the values of $I(T_c, X_i)$ and $I^{LE}(T_c^{\varepsilon}, X_i)$ do not change afterwards.
ISLEEXPLORER tries to obtain as accurate a delay estimate as
possible while limiting the number of full circuit simulations to
about MCSimCapacity, a parameter provided by the user. These
≈ MCSimCapacity samples are chosen among NumFSamples samples
generated from the distribution f(X). The user also provides a rough
estimate for an upper bound on the loss, 0 ≤ ExpectedMaxLoss ≤ 1.
From among NumFSamples samples, we expect to run full circuit
simulations for about ExpectedMaxLoss × NumFSamples of them.
Therefore, the algorithm selects NumFSamples to be
MCSimCapacity / ExpectedMaxLoss.
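
The sizing rule above can be illustrated with a minimal sketch. The
function name and the rounding-up choice are assumptions made for this
illustration; the text only states the ratio itself.

```python
import math


def num_f_samples(mc_sim_capacity: int, expected_max_loss: float) -> int:
    """Number of samples to draw from f so that roughly mc_sim_capacity
    of them are expected to need a full circuit simulation.

    Rounding up to an integer is an assumption of this sketch.
    """
    if not 0.0 < expected_max_loss <= 1.0:
        raise ValueError("expected_max_loss must lie in (0, 1]")
    return math.ceil(mc_sim_capacity / expected_max_loss)


# Hypothetical budget: about 200 full simulations, loss assumed <= 0.25.
n_samples = num_f_samples(200, 0.25)
```

A smaller ExpectedMaxLoss lets the same simulation budget cover a larger
pool of samples from f, since a smaller fraction of them is expected to
need a full simulation.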
It should be noted that ISLEEXPLORER is a heuristic algorithm,
and, as such, does not formally guarantee that the safety requirement
is satisfied for all samples X. Checking the safety requirement for
all NumFSamples samples Xi would require NumFSamples full circuit
simulations. In order to keep the computational cost reasonable,
ISLEEXPLORER instead considers margins larger than the minimum
satisfactory εmin and makes sure that for SafetyLimit samples Xi that
satisfy $T_c - \varepsilon_{end} \le d_C^{LE}(X_i) \le T_c - \varepsilon_{min}$
(the points in the safety band) the safety requirement is not
violated. This is done in order to build further confidence that the εmin
value arrived at is valid. In future work, we plan to investigate
techniques that can formally ensure that the safety requirement is
satisfied for importance sampling using the SLE approximation.
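
The safety-band check can be sketched as follows. This is only an
illustration under stated assumptions: the function names are
hypothetical, the band is taken as the SLE delay estimates lying between
Tc − εend and Tc − εmin, and a violation is taken to be a band sample
whose simulated delay exceeds Tc (that is, I(Tc,Xi) = 1).

```python
def safety_band(d_le, t_c, eps_min, eps_end):
    """Indices of samples whose SLE delay estimate lies in the band
    (t_c - eps_end, t_c - eps_min], with eps_end > eps_min."""
    lo, hi = t_c - eps_end, t_c - eps_min
    return [i for i, d in enumerate(d_le) if lo < d <= hi]


def band_violations(band, d_sim, t_c):
    """Band samples whose simulated delay exceeds the constraint.

    d_sim maps a sample index to its full-simulation delay; only band
    samples need an entry.
    """
    return [i for i in band if d_sim[i] > t_c]


# Hypothetical delays in arbitrary units, with t_c = 1.0.
d_le = [0.80, 0.90, 0.95, 1.05]
band = safety_band(d_le, t_c=1.0, eps_min=0.08, eps_end=0.15)
violations = band_violations(band, {1: 0.97}, t_c=1.0)
```

In this sketch an empty violation list over SafetyLimit band samples is
what builds confidence in εmin; a non-empty list would suggest that the
margin is too small.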
IV. RESULTS
A. Experimental Setup
We first explain the technical issues related to our experimental setup
in order to help interpret our results better. We present results on two
test circuits, InverterChain and GateChain shown in Figures 2 and 3.
For both circuits, the timing loss is computed by comparing the delay
between nodes 3 and 8 with the timing constraint. The precursor gates
between nodes 1 and 3, and the postcursor gates between nodes 8 and
10 are placed in the circuit in order to realize a typical driver and load
for the logic path under consideration. The gates used in these circuits
are from Graham Petley’s 0.13µ library version 8.1 [6].
We consider three statistically varying process and circuit
parameters [7]:
• Effective channel length Leff with a 3σ/µ ratio of 15%.
• Supply voltage Vdd with a 3σ/µ ratio of 10%.
• Threshold voltage Vth with a 3σ/µ ratio of 10%.
These parameters are assumed to have Gaussian distributions, and are
considered independent. We create three sets of statistical parameters
from the above:
• OnePar: A one-parameter set consisting of Leff.
• TwoPar: A two-parameter set consisting of Leff and Vdd.
• ThrPar: A three-parameter set consisting of Leff, Vdd and Vth.
For the results we report in this paper, we consider only inter-die cor-
relations. In other words, the statistical parameters for all of the tran-
sistors in the circuit are fully correlated, and the variation in the pa-
rameters is location and transistor independent.
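
As an illustration of this parameter model, the sketch below draws one
inter-die sample per parameter set. The nominal values in NOMINAL are
hypothetical placeholders, not taken from the library or the paper; only
the 3σ/µ ratios and the set definitions come from the text.

```python
import random

# 3*sigma/mu ratios stated in the text.
RATIOS = {"Leff": 0.15, "Vdd": 0.10, "Vth": 0.10}
PARAM_SETS = {
    "OnePar": ["Leff"],
    "TwoPar": ["Leff", "Vdd"],
    "ThrPar": ["Leff", "Vdd", "Vth"],
}
# Hypothetical nominal (mean) values, for illustration only.
NOMINAL = {"Leff": 0.13, "Vdd": 1.2, "Vth": 0.3}


def sigma_from_ratio(mu: float, ratio: float) -> float:
    """Standard deviation implied by a given 3*sigma/mu ratio."""
    return ratio * mu / 3.0


def sample_die(param_set: str, rng: random.Random) -> dict:
    """Draw one inter-die sample: a single Gaussian value per varying
    parameter, shared by every transistor in the circuit. Parameters
    outside the chosen set stay at their nominal values."""
    die = dict(NOMINAL)
    for name in PARAM_SETS[param_set]:
        mu = NOMINAL[name]
        die[name] = rng.gauss(mu, sigma_from_ratio(mu, RATIOS[name]))
    return die


rng = random.Random(0)
die = sample_die("OnePar", rng)
```

Because only inter-die variation is modeled, one draw per parameter
describes the whole circuit, and the parameters are drawn independently
of each other.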
In order to empirically measure the error in the loss estimates ob-
tained by the standard MC (STD-MC) estimator and our ISLE esti-
mator, we perform 50 independent repetitions of the same experiment
run. In our graphs and tables, experiment numbers 1 through 50 refer
to these different runs. In each independent run, we compute the loss
estimates using 1000 separate samples generated from f in the
parameter space. These 50 independent runs constitute samples of the
loss estimator, and the variance and error of the loss estimator are
computed over these 50 samples. For the STD-MC estimator, transistor-level
circuit simulations are performed at every one of these 1000 sample
points. For the ISLE estimator, a reduced number of simulations are
performed since most of the samples are discarded based on the eval-
uation of the SLE equations. The number of circuit simulations that
are run for the ISLE estimator may be different in the 50 runs. In our
tables and graphs, we report the average number of simulations over
the 50 runs.
The LossLE,ε value that is needed for computing the ISLE estimator
in (19) and (21) is obtained from the SLE-MC estimator in (18),
using all of the 50000 sample points generated during the 50 runs.
We report and compare loss and error results for three estimators:
• the standard MC estimator: STD-MC,
• the ISLE estimator based on SLE.d1, and
• the ISLE estimator based on SLE.d2.
We report results for six different experiment configurations
(combinations of a test circuit and a parameter set) in the next two
sections. We use the notation Exp(TestCircuit, ParameterSet) to
denote an experiment that is run on the TestCircuit (which can be
InverterChain or GateChain) with the ParameterSet (which can be
OnePar, TwoPar or ThrPar).
B. The Accuracy and Efficiency of the ISLE Estimator
We illustrate the performance and operation of our ISLE timing loss
estimator (based on SLE.d2) by providing four graphs that were gen-
erated from one selected experiment, Exp(GateChain,ThrPar). We
are not able to provide similar graphs for the other experiments due to
space constraints, but we present detailed performance results in the
next section on all of the six experiments we have run.
The graph in Figure 4 shows the loss estimates obtained with the
STD-MC estimator and ISLE for all of the 50 experiment runs. The
value of the ISLE estimator in each case is much closer to the mean
than that of the STD-MC estimator. The variance reduction obtained
by the ISLE estimator over STD-MC is thus apparent from this graph.
We should note that for every loss estimate shown in Figure 4, 1000
transistor-level simulations are performed for STD-MC, but the
average number of simulations for ISLE was only 213 over the 50 runs.
After a normalization (explained below) that accounts for the
different errors of the two estimators, the Gain of ISLE over STD-MC,
theoretically given by (30), is found to be 179. Gain represents the
ratio of the number of full circuit simulations required by the two
approaches to achieve the same error.
As discussed before, the accuracy of the SLE approximation is key
in order for it to facilitate IS for yield estimation. To gauge this ac-
curacy, the scatter plots in Figure 5 and Figure 6 show circuit de-
lays computed with the SLE formulas versus delay computed with
transistor-level circuit simulation. Figure 5 is for SLE.d1 and Figure 6
is for SLE.d2. As seen in these plots, both versions of SLE formulas
provide reliable delay estimates. However, SLE.d2 is more accurate,
and hence it results in a bigger Gain as confirmed by the detailed re-
sults we present in the next section. This comes at the cost of having
statistical pre-characterizations for parasitic delay p and logical effort
g for all of the gates in the library, in addition to the reference inverter
delay τ.

Fig. 4
TIMING LOSS FOR Exp(GateChain, TwoPar) (STD-MC AND SLE.d2)
[Plot "Loss computed by ISLE and Loss computed by STD-MC": computed
loss (y-axis, 0.15 to 0.2) versus experiment number (x-axis, 0 to 50).]

Fig. 5
SCATTER PLOT FOR Exp(GateChain, TwoPar) (SLE.d1)

Figure 7 confirms empirically that the $1/\sqrt{N}$ dependency of error
on the number of MC samples is as expressed by (22) and (25) for the
STD-MC and ISLE estimators, respectively. The significant reduction
in variance that ISLE provides is also obvious in this graph. In this
figure, a plot of loss error versus the number of full circuit simulations
is shown for both estimators. The smooth curves in this plot were
obtained using the theoretical error formulas. The two other curves
were computed using data from the 50 runs, each of which explores
sample set sizes ranging from 1 to 500. As explained before, full
circuit simulations are performed at all of the sample points for the
STD-MC estimator, but a reduced number of simulations are needed
for the ISLE estimator. We observe the excellent match between the
theoretical and experimental error curves in this plot.
C. Results
Table I presents results obtained from six experiments with the
STD-MC estimator and our ISLE estimator based on both SLE.d1 and
SLE.d2. The mean loss and loss error values are computed over 50
independent runs as explained before. The Gain value reported in the
table represents the ratio of the number of full circuit simulations
required by the STD-MC and ISLE estimators to achieve the same error,
theoretically given by (30). Gain here is computed using the
experimental data for the loss errors and the actual number of circuit
simulations reported in the table as follows:
\[
\mathrm{Gain} = \frac{N_{MC}}{N_{ISLE}} \cdot
\frac{\mathrm{Error}_{MC}^{2}}{\mathrm{Error}_{ISLE}^{2}}
\qquad (31)
\]
The second factor in the above formula is needed to perform a
normalization required in order to make a correction for the difference
in the errors achieved by the two estimators with the given number of
simulations. The ratio of the squares of errors is used because of the
$1/\sqrt{N}$ dependency of error.
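
As a worked check of the Gain formula (31), the short sketch below
evaluates it on two rows of Table I. The function name is an assumption
made for this illustration; the input values are the rounded entries
reported in the table, so small differences from the tabulated Gain are
expected.

```python
def gain(n_mc: int, n_isle: int, error_mc: float, error_isle: float) -> float:
    """Gain as in (31): ratio of simulation counts, normalized by the
    ratio of squared errors (error scales as 1/sqrt(N))."""
    return (n_mc / n_isle) * (error_mc / error_isle) ** 2


# Exp(InverterChain, OnePar), ISLE based on SLE.d1 (Table I reports 2305).
g_inv_one_d1 = gain(1000, 181, 2.35e-2, 1.15e-3)

# Exp(GateChain, ThrPar), ISLE based on SLE.d1 (Table I reports 12).
g_gate_thr_d1 = gain(1000, 259, 2.29e-2, 1.28e-2)
```

The first value comes out at roughly 2.3e3 and the second at roughly 12,
in line with the corresponding Gain entries of Table I.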
The results in Table I show that both versions of our ISLE yield
estimator achieve significant cost reduction over the standard MC es-
timator for the same error, in the range from one to four orders of mag-
nitude. As expected, ISLE based on SLE.d2 performs better, achiev-
ing two orders of magnitude cost reduction in the worst-case, whereas

Fig. 6
SCATTER PLOT FOR Exp(GateChain, TwoPar) (SLE.d2)

Fig. 7
LOG-VARIANCE PLOT FOR Exp(GateChain, TwoPar) (SLE.d2)
[Plot "Theoretical and Experimental Loss Errors for ISLE and STD-MC":
loss error (y-axis, log scale, 10^-3 to 10^0) versus number of full
circuit simulations performed for loss computation (x-axis, 0 to 500).
Curves: Loss(ISLE) Error, Loss(STD-MC) Error, Theoretical Loss(STD-MC)
Error, Theoretical Loss(ISLE) Error.]

the cost reduction achieved by ISLE based on SLE.d1 goes down to
12 in the worst case. The Gain data presented in Table I shows that
ISLE performs better for the InverterChain circuit containing only
inverters. This is due to the fact that the SLE delay formulas are more
accurate for inverters because of their use in the SLE formalism as a
delay reference. It should also be noted that the performance of ISLE
improves considerably when only one statistical parameter is used,
achieving a speed-up reaching three orders of magnitude. When more
(two or three) statistical parameters are considered simultaneously, the
performance degrades in comparison but remains above a respectable two
orders of magnitude speed-up with ISLE based on SLE.d2.
V. CONCLUSION
We have demonstrated in this paper that Monte Carlo simulation
in conjunction with a novel variance reduction technique, Importance
Sampling with Stochastic Logical Effort, can serve as an accurate
yet computationally viable timing yield estimation method. Numerous
other techniques for reducing the variance of Monte Carlo
estimators have been proposed in the Monte Carlo simulation
literature [3, 8]. The stochastic logical effort formalism can be used
to facilitate other techniques for variance reduction in Monte Carlo
estimation, e.g., stratified sampling, control variates, the
multicanonical Monte Carlo method, etc. [4], which we intend to explore
in future work.
REFERENCES
[1] I. Sutherland, B. Sproull, and D. Harris. Logical Effort: Designing Fast CMOS Circuits. Morgan
Kaufmann, 1999.
[2] A. Demir and S. Tasiran. Statistical logical effort: Designing for timing yield on the back of an
envelope. In ACM/IEEE International Workshop on Timing Issues in the Specification and Synthesis
of Digital Systems (TAU), February 2006.
[3] Malvin H. Kalos and Paula A. Whitlock. Monte Carlo Methods, Volume 1, Basics. Wiley, 1986.
[4] P. Glasserman, P. Heidelberger, and P. Shahabuddin. Importance sampling and stratification for
value-at-risk. In Proc. 6th Intl. Conference on Computational Finance, pages 7–24. MIT Press,
May 28-31 1999.
[5] L. Scheffer. The count of Monte Carlo. In ACM/IEEE International Workshop on Timing Issues in
the Specification and Synthesis of Digital Systems (TAU), February 2004.
[6] http://www.vlsitechnology.org/. Cell Library, Release 8.1.
[7] Y. Cao, H. Qin, R. Wang, P. Friedberg, A. Vladimirescu, and J. Rabaey. Yield optimization with
energy-delay constraints in low-power digital circuits. In IEEE Conference on Electron Devices and
Solid-State Circuits, December 2003.

TABLE I
EXPERIMENTAL RESULTS COMPARING STANDARD MC AND ISLE

                                    Mean Loss               Loss Error         Number of Ckt. Simulations       Gain
Experiment                   SLE.d1  SLE.d2  STD-MC   SLE.d1   SLE.d2   STD-MC   SLE.d1  SLE.d2  STD-MC    SLE.d1  SLE.d2
Exp(InverterChain, OnePar)   0.1394  0.1394  0.1395   1.15e-3  1.28e-3  2.35e-2     181     181    1000      2305    1866
Exp(InverterChain, TwoPar)   0.1481  0.1483  0.1482   8.15e-3  1.72e-3  1.87e-2     211     190    1000        25     624
Exp(InverterChain, ThrPar)   0.1535  0.1537  0.1538   8.72e-3  2.15e-3  2.03e-2     218     196    1000        25     457
Exp(GateChain, OnePar)       0.1575  0.1575  0.1576   1.32e-3  1.33e-3  2.34e-2     199     199    1000      1589    1549
Exp(GateChain, TwoPar)       0.1686  0.1684  0.1688   1.08e-2  2.81e-3  2.24e-2     248     211    1000        17     300
Exp(GateChain, ThrPar)       0.1739  0.1741  0.1743   1.28e-2  2.80e-3  2.29e-2     259     217    1000        12     307

[8] A. D. Sokal. Monte Carlo methods in statistical mechanics: Foundations and new algorithms. In
P. Cartier, C. DeWitt-Morette, and A. Folacci, editors, Functional Integration: Basics and
Applications (1996 Cargèse summer school). Plenum, 1997.
