Non-invasive IC tomography using spatial correlations by Shamsi, Davood
RICE UNIVERSITY 
Non-invasive IC Tomography Using Spatial Correlations 
by 
Davood Shamsi 
A THESIS SUBMITTED 
IN PARTIAL FULFILLMENT OF THE 
REQUIREMENTS FOR THE DEGREE 
Master of Science 
APPROVED, THESIS COMMITTEE: 
Dr. Farinaz Koushanfar, Chair
Assistant Professor, Electrical and Computer Engineering
Dr. Don H. Johnson 
J.S. Abercrombie Professor Emeritus, 
Electrical and Computer Engineering 
Dr. Richard Baraniuk
Victor E. Cameron Professor, Electrical 
and Computer Engineering 
HOUSTON, TEXAS 
NOVEMBER 2009
UMI Number: 1485969 
Copyright 2010 by ProQuest LLC.
All rights reserved. This edition of the work is protected against
unauthorized copying under Title 17, United States Code.
ProQuest LLC
789 East Eisenhower Parkway
P.O. Box 1346
Ann Arbor, MI 48106-1346
ABSTRACT 
Non-invasive IC Tomography Using Spatial Correlations 
by 
Davood Shamsi 
We introduce a new methodology for post-silicon characterization of the gate-
level variations in a manufactured Integrated Circuit (IC). The estimated char-
acteristics are based on the power and the delay measurements that are affected 
by the process variations. The power (delay) variations are spatially correlated. 
Thus, there exists a basis in which variations are sparse. The sparse representation
suggests using $\ell_1$-regularization (the compressive sensing theory). We
show how to use the compressive sensing theory to improve post-silicon
characterization. We also address the problem by adding spatial constraints directly to
the traditional $\ell_2$-minimization.
The proposed methodology is fast, inexpensive, non-invasive, and applicable 
to legacy designs. Noninvasive IC characterization has a range of emerging ap-
plications, including post-silicon optimization, IC identification, and variations' 
modeling/simulations. The evaluation results on standard benchmark circuits 
show that, on average, the accuracy of gate-level characteristic estimation can be
improved by more than a factor of two using the proposed methods.
Contents
Abstract
List of Tables
List of Figures
1 Introduction
2 Background
2.1 Related work on process variation
2.1.1 Early work
2.1.2 Variation estimation and modeling
2.1.3 Effects of variations on the design
2.1.4 Testing
2.2 Preliminaries
2.2.1 Variation model
2.2.2 Compressive sensing
3 Power Tomography
3.1 Preliminaries
3.1.1 Leakage current
3.1.2 Global flow of the power tomography
3.2 Noninvasive tomography
3.3 Fast tomography by compressive sensing
3.3.1 Sparse representation
3.3.2 Regular grid tomography
3.3.3 Irregular grid tomography
3.4 Tomography using spatial constraints (TUSC)
3.4.1 Adding spatial constraints
4 Delay Tomography
4.1 Preliminaries
4.1.1 Delay variation model
4.1.2 Sensitizable paths
4.1.3 Global flow of the delay tomography
4.2 Delay estimation by $\ell_2$-norm minimization
4.3 Delay estimation using compressive sensing
4.3.1 Sparse representation of variations
4.3.2 Gates on the regular grids
4.3.3 Gates on the irregular grids
4.4 Determining the regularization coefficient $\lambda$
4.5 Path selection
4.5.1 Sensitizable paths
4.5.2 Basis path set
5 Applications
6 Evaluation Results
6.1 Simulations setup
6.2 Power tomography results
6.2.1 Measurement matrix evaluation
6.2.2 Tomography results in the power framework
6.3 Delay evaluation results
6.3.1 Measurement matrix and estimation in subspaces
6.3.2 Delay tomography results
7 Conclusion
List of Tables
3.1 Static power for different input vector combinations
4.1 Transition propagation rate for different gates
6.1 Average number of independent power vectors
6.2 Performance of the $\ell_2$-norm minimization, the $\ell_1$-norm regularization, and TUSC (power)
6.3 Number of independent paths and independent linear equations
6.4 Performance of $\ell_2$-norm minimization and $\ell_1$-norm regularization (delay)
List of Figures
2.1 Design structure used by Doh et al. [23]
2.2 Spatial correlation study by Doh et al. [23]
2.3 Measured process variation in a wafer [30]
2.4 Variation on four test chips
3.1 Global flow of the power tomography
3.2 A simple logic circuit
3.3 Number of independent measurement vectors
3.4 The power variation and its sparse wavelet transform
3.5 Sorted wavelet coefficients (power)
3.6 Gates are not placed on regular grids
3.7 Irregular wavelet transformation
4.1 Global flow of the delay tomography
4.2 A sensitizable path from an input to the output
4.3 Delay variations and their wavelet transform
4.4 Sorted wavelet coefficients
4.5 The variation estimation error for various regularization factors $\lambda$
4.6 Optimization curves for various measurement errors
4.7 An example of a circuit with sensitizable and unsensitizable paths
6.1 Singular values of the measurement matrix
6.2 Variations estimation error vs. percent of the power measurement noise
6.3 Variation estimation error vs. number of power measurements
6.4 Singular values of the measurement matrices
6.5 Variation (delay) estimation error vs. measurement error
6.6 Variation (delay) estimation error vs. the number of measurements
Chapter 1 
Introduction 
In modern integrated circuit (IC) design, the objective is to increase operating
speed (maximum frequency) and decrease power consumption. The maximum
frequency of an IC is a function of the longest delay encountered in its different
parts. Signal delay can be reduced by increasing the transistor density, but to
increase transistor density in an IC, the dimensions of its CMOS transistors
must be scaled down. Decreasing power consumption also demands reducing CMOS
transistor dimensions. Moore's law predicts that the number of
transistors on an inexpensive IC doubles every two years. For example, the Intel
80486 introduced in 1989 was manufactured using 0.8 µm CMOS technology and
had a maximum clock speed of 133 MHz. Today's modern processors, such
as the Intel Core 2, are manufactured using 65 nm technology or smaller, and the
maximum frequency can be more than 3.20 GHz.
The dimensions of a manufactured CMOS transistor are not exactly as designed.
If one measures the dimensions of the manufactured transistors, there
are some variations from the design specifications. This phenomenon is called
manufacturing variation. Imperfection in manufacturing tools is the main
contributor to systematic process variations. For example, because of the limitations
on the minimum wavelength of the laser etching the mask [45], masks that are
used in manufacturing are not perfectly identical and symmetric. Thus, the
transistor dimensions depend on the specific mask used in the manufacturing process
and the transistors' location on the mask. Another reason for manufacturing
variations is the uncontrollable physical parameters of the manufacturing process
(random variations). Because it is not possible to strictly control the physical
environment of the fabrication, manufacturing two ICs with the same mask does
not result in the same variations.
Process variations can dramatically affect properties of manufactured ICs.
Statistical static timing analysis (SSTA) and statistical power analysis are two
examples of techniques that consider variations for pre-silicon optimization. In
SSTA, the goal is to find the longest path delay in the circuit. Because of the
nondeterministic behavior of variations, no single path always exhibits the longest
delay. Thus, path delays should be statistically modeled and then the longest
delay of the circuit is determined with a specific confidence interval. Orshansky
et al. [57] showed that variations might cause up to 25% error in timing analysis.
In statistical power analysis, it has also been shown that under variations the ratio
of the standard deviation to the mean of the total current might vary between 0.17
and 0.98 [5]. However, pre-silicon optimizations, such as SSTA, have some
limitations: the statistical characterizations of variations are not precisely determined
and they might vary on different chips.
A number of post-silicon variation characterization methods have recently been
introduced [23,30,36,79]. Friedberg et al. [30] used electrical linewidth metrology
(ELM) to measure variations of chips' dimensions on a wafer. They exhaustively 
measured the variations of all the transistors. Hargreaves et al. [36] introduced a 
post-silicon characterization method using ring oscillators. They put a number of 
ring oscillators in different locations on an IC. Then, they measured the frequency 
of each ring oscillator. Frequencies of the oscillators represent variations across 
the IC. The mentioned methods are either expensive [30] or design specific [36]. 
We propose a fast, non-invasive, and inexpensive method for gate-level post-silicon
characterization using power and delay measurements. In the power framework,
we first explain how the nominal leakage power consumption of a logic gate
is multiplied by a scaling factor due to process variations. The scaling factor
indicates the ratio of the gate leakage to its expected value. Then, we show that
measuring the total power consumption for each circuit input enforces a linear
constraint on the scaling factors. Feeding the circuit with different input vectors
and measuring the total power for each input vector leads to a system of linear
equations with the scaling factors as unknown variables. A common technique
to solve the system of linear equations is traditional least squares minimization
($\ell_2$-minimization).
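As a minimal sketch of this step, the snippet below builds a small system of linear equations from made-up numbers and solves it by least squares through the normal equations. The leakage values, the three-gate circuit, and the function name `solve_least_squares` are illustrative assumptions, not data or code from the thesis.

```python
def solve_least_squares(A, p):
    """Return s minimizing ||A s - p||_2 via the normal equations (A^T A) s = A^T p."""
    m, n = len(A), len(A[0])
    # Augmented normal-equation system [A^T A | A^T p].
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
         + [sum(A[k][i] * p[k] for k in range(m))] for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("measurements do not determine all scaling factors")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    s = [0.0] * n
    for i in reversed(range(n)):
        s[i] = (M[i][n] - sum(M[i][j] * s[j] for j in range(i + 1, n))) / M[i][i]
    return s

# Hypothetical nominal leakage of 3 gates (columns) under 4 input vectors (rows).
nominal = [[2.0, 1.0, 3.0],
           [1.0, 4.0, 1.0],
           [3.0, 2.0, 2.0],
           [1.0, 1.0, 5.0]]
true_scale = [1.10, 0.95, 1.02]  # assumed per-gate scaling factors
# Each measured total power is the sum of scaled gate leakages (noise-free here).
measured = [sum(a * s for a, s in zip(row, true_scale)) for row in nominal]
estimate = solve_least_squares(nominal, measured)
```

With noise-free measurements and linearly independent rows, the estimate reproduces the assumed scaling factors. With noisy or too few measurements it generally does not, which is the motivation for the regularized formulations.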
This estimation approach can be improved by incorporating the spatial
correlations in our framework. We show that spatial correlations suggest that there
is a basis in which variations in the scaling factors can be represented sparsely.
We specifically consider wavelet bases that can capture spatial correlation
efficiently [25,73]. We experimentally determine a wavelet basis that results in the
sparsest representation for variations. Having a sparse representation for
variations, we use compressive sensing techniques to efficiently recover the scaling factors.
Here, we regularize the objective function of the optimization problem with an
$\ell_1$-norm term to impose sparsity on the solution.
The post-silicon characterization can also be improved by adding spatial
constraints directly to the optimization. The spatial correlation implies that two
spatially close gates approximately follow similar variations. It is not
statistically expected that two nearby gates follow totally independent variations. Thus,
in the underlying optimization, we penalize the difference among scaling factors
of nearby gates. The new formulation results in a better estimation of the gates'
scaling factors. The approach is based on our paper at the ISLPED 2008
conference [62].
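One way to write such a spatially constrained estimate is as a penalized least-squares problem. The formulation below is only a sketch under assumed notation: $A$ (nominal leakage per input vector), $P$ (measured total power), $s$ (scaling factors), the set $\mathcal{N}$ of neighboring gate pairs, and the weight $\mu$ are illustrative symbols chosen here, and the exact formulation developed in Chapter 3 may differ.

```latex
\min_{s} \; \| P - A s \|_2^2 \;+\; \mu \sum_{(u,v) \in \mathcal{N}} (s_u - s_v)^2
```

Setting $\mu = 0$ gives back the plain least-squares estimate, while a larger $\mu$ pulls the scaling factors of neighboring gates toward each other.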
Next, we use path delay analysis to characterize the variations in gate
delays [63]. The same approach as in post-silicon leakage characterization is used
for gate-level delay variation characterization. However, in contrast to the power
variations, the variations in delay are additive and they are linear functions of the
CMOS dimensions. We use HSPICE simulation to find linear relations between
transistor variations and delay variations in various gates. However, in the delay
framework, the form and the construction of the system of linear equations are
different from the power framework. In the delay framework, one can only measure
the delay of the signal propagation on specific paths that start from a primary
input and end at a primary output. Such paths are called sensitizable (testable)
paths [64,65]. We use the testable basis selection method in [65] to find a set of
sensitizable basis paths for a circuit. Then, using the linear relationship between
transistor dimensions and the gate delays, we construct a system of linear
equations with variations as the unknown variables. Again, we can use traditional
$\ell_2$-minimization or $\ell_1$-regularization (compressive sensing) to estimate the
gate-level timing characteristics.
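The construction of the delay equations can be sketched as follows. The snippet simplifies the setting by treating each gate's delay deviation as the unknown directly (the thesis relates delays to transistor dimensions through HSPICE-derived linear relations, which is skipped here). The gate names, delays, paths, and the helper `path_equations` are hypothetical.

```python
def path_equations(paths, gates, nominal_delay, measured_delay):
    """Return (rows, rhs): one 0/1 row per path and the measured-minus-nominal delay."""
    index = {g: i for i, g in enumerate(gates)}
    rows, rhs = [], []
    for path, measured in zip(paths, measured_delay):
        row = [0.0] * len(gates)
        for g in path:
            row[index[g]] = 1.0  # gate g lies on this path
        rows.append(row)
        # Additive model: path deviation is the sum of gate deviations on the path.
        rhs.append(measured - sum(nominal_delay[g] for g in path))
    return rows, rhs

gates = ["g1", "g2", "g3", "g4"]
nominal_delay = {"g1": 10.0, "g2": 12.0, "g3": 8.0, "g4": 15.0}  # ps, hypothetical
paths = [["g1", "g2", "g4"], ["g1", "g3", "g4"]]  # hypothetical sensitizable paths
measured_delay = [38.5, 33.2]  # ps, hypothetical
rows, rhs = path_equations(paths, gates, nominal_delay, measured_delay)
```

Each row marks the gates on one measured path, so `rows` times the vector of gate delay deviations should equal `rhs`. With fewer independent paths than gates, as in this toy case, the system is underdetermined, which is where the sparsity assumption becomes useful.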
We evaluate the performance of the proposed methods for both delay and power
frameworks on a number of circuits from the MCNC benchmark suite. Results
indicate that the $\ell_1$-regularization method can estimate the variations much more
accurately than the traditional $\ell_2$-minimization. However, the performance of the
$\ell_1$-regularization method depends on the circuit topology. For example, in the delay
framework, the $\ell_1$-regularization method on the C499 benchmark circuit improves
gate-level characteristic estimation by more than 100%, while on the b9 benchmark,
the improvement is only 10%.
A number of applications can benefit from non-invasive post-silicon characterization.
They include post-silicon optimization, manufacturing process characterization,
simulation improvement, and IC identification.
The new aspects of this thesis are as follows:
• We propose a method for post-silicon gate-level characterization for both
power and delay frameworks that uses only non-invasive measurements. In
contrast to variation measurement methods based on ring oscillators,
our method works for a general combinational IC.
• For the first time, we represent post-silicon variations in a sparse domain.
Even though the spatial correlation in the variations has been widely studied
[23,30,79], this is the first time that it is used for post-silicon optimization.
We experimentally determine which wavelet basis results in the sparsest
representation.
• We use the theory of compressive sensing to estimate the variations with
a small number of measurements. We use the wavelet basis to sparsely
represent delay and power variations.
• We analyze the regularization factor in $\ell_1$-regularization and introduce a
method to estimate the optimal regularization factor.
• We modify the original compressive sensing formulation such that it can be
applied to irregular gate placements.
• We add new constraints to the optimization problem that directly impose 
spatial correlations. With these additional constraints, variation estima-
tions improve considerably. 
• The proposed post-silicon variation characterization method is fast, inex-
pensive, and non-invasive. It enables a range of new applications. We 
introduce a number of novel applications for the proposed method. 
The thesis is organized as follows. In Chapter 2, we discuss related work and 
preliminaries that are used in the thesis. Preliminaries include the variations 
model and the compressive sensing theory. Chapters 3 and 4 introduce our vari-
ation estimation method in power and delay frameworks, respectively. Next, we 
discuss a number of applications for the proposed post-silicon variations charac-
terization method in Chapter 5. The evaluation results are presented in Chapter 
6. We finally summarize the thesis in Chapter 7. 
Chapter 2 
Background 
2.1 Related work on process variation 
2.1.1 Early work 
Manufacturing variations have been a main source of random properties of pre-
cisely designed ICs. Even though process variations were very small in 20th 
century fabrication technology, they could affect precise analog design and they 
were addressed by a number of researchers [27,41,55,66]. Three of the early 
works in identification of random variations stand out. 
In 1982, Shyu et al. [66] studied the effects of random variations on MOS
capacitors. They identified the capacitor edge and the oxide thickness fluctuation as
two sources of randomness in MOS capacitors. The variations in the physical
properties lead to a random capacitance. They analytically derived the
relationship between the capacitance and the random variations, and they numerically
showed how variations affect the capacitance. For example, they showed that a
Gaussian random fluctuation with variance 0.1 µm in a capacitor with edge length
50 µm causes only a 0.036% difference in capacitance. Their results indicate that,
in early CMOS capacitors, the effects of the random variations were negligible.
Lakshmikumar et al. [41] in 1982 proposed a method to predict the current
mismatch (intra-die) of the transistors on an integrated circuit. Since only the
relative dimensions of transistors are important in analog design, the impacts of
global variations (inter-die) were not analyzed in this work. The paper had two
main goals. First, sources of variations were determined and a model
was fitted to the measurement data. In other words, they tried to predict the
systematic part of the variation. Second, they constructed an analytical relation
between the current mismatch and transistor dimensions. Thus, the predicted
current mismatch could be transformed into dimension variations. Knowing the
variation in dimensions helps in designing more precise analog circuits. However,
random variations were not considered. Thus, the total variations could not be
predicted.
In 1995, Eisele et al. [27] used a 10 x 10 transistor array to study intra-die
variations in manufactured ICs. Their addressing scheme allowed individual
transistor selection, meaning that they could characterize each transistor separately.
After finding $V_{GS}$ of all transistors, a normal distribution was fitted to the
measured values. They also showed that variations in the gate-source voltage, $V_{GS}$, are
spatially correlated. They then repeated the procedure for different aspect
ratios ($W/L$) and verified the relation between the transistor dimensions and the
standard deviation of the threshold voltage: $\sigma_{V_{th}} \propto (W L_{\mathrm{eff}})^{-1/2}$.
Thus, as the CMOS transistor dimensions decrease, the variance of the fluctuations increases.
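As a rough worked illustration of this scaling, assume the inverse-square-root form $\sigma_{V_{th}} = A / \sqrt{W L_{\mathrm{eff}}}$, where $A$ is a hypothetical technology-dependent constant introduced here only for the example. Halving both the width and the effective length then doubles the threshold voltage spread:

```latex
\frac{\sigma_{V_{th}}(W/2,\; L_{\mathrm{eff}}/2)}{\sigma_{V_{th}}(W,\; L_{\mathrm{eff}})}
= \sqrt{\frac{W L_{\mathrm{eff}}}{(W/2)(L_{\mathrm{eff}}/2)}}
= \sqrt{4} = 2 .
```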
2.1.2 Variation estimation and modeling 
As technology improved and nano-scale CMOS transistors could be fabricated, 
process variations became a determining factor. To appreciate how variations
affect the circuit design, one needs a thorough understanding of variation and
its statistical properties in ICs. Several researchers performed measurement and 
modeling of the process variations in different CMOS technologies [8,12,16,23, 
30,36,43,46,47,50,76,79]. 
In 2005, Doh et al. [23] experimentally characterized the spatial correlation in 
process variations. To do so, they fabricated a 4 x 5 module array in 130nm CMOS 
technology. As can be seen in Figure 2.1, each module consisted of 16 patterns of 
nMOS and pMOS transistors and an oscillator. Oscillators are standard devices 
used to characterize properties of integrated circuits [36]. They consist of a 
number of inverters that are connected in a loop circuit. Doh et al. [23] used a 
40-pattern ring oscillator (see Figure 2.1). Using this method, they explained the 
spatial correlation in variations. Figure 2.2 shows the scatter plot for the saturation
voltage of nMOS transistors. The saturation voltages of transistors in close modules,
like M1 and M2, are strongly correlated. The right side of Figure 2.2 shows that
the correlation decreases linearly with distance.
[Figure 2.1 panels: Test chip (20mm x 29mm); Test module (1200um x 600um)]
Figure 2.1: Design structure used by Doh et al. to characterize spatial correlation in
process variations [23]. The left part is the 4 x 5 module array that they
used in the experiment. Each module includes 16 patterns of nMOS, 16
patterns of pMOS, and an oscillator.
To characterize variations accurately, Friedberg et al. [30] used Electrical
Linewidth Metrology (ELM) to measure transistor feature sizes in a 200mm
wafer. They used the Kelvin test to find the linewidth from ELM measurements.
Figure 2.3 shows the variation distribution of 130nm technology for a complete wafer.
Patterns of inter-die and intra-die variations can be clearly observed in the
picture. They measured dimension variations of all transistors in a number of wafers
and introduced a variation model for the transistor dimensions. They proposed a
piecewise linear fit to the measurement data. Their experimental results showed
that spatial correlation increases after a specific distance, but they offer no
explanation for this observation. Their method is invasive
and expensive in time and equipment, making it very hard to characterize
variations in a large number of ICs using ELM.
Figure 2.2: Spatial correlation study by Doh et al. [23]. Left: scatter plot of saturation
voltage of nMOS transistors. Close modules are strongly correlated. Right:
Spatial correlation decreases as distance between modules increases.
Zhao et al. [79] used a transistor array to study the process variation. They
used the test chip that was designed and fabricated by Agarwal et al. [7]. The test
structure was specifically designed to determine the local variation in transistors.
The dimensions of the test structure were 125 µm x 110 µm and it consisted of 1000
columns and 96 rows. They used Level Sensitive Scan Design (LSSD) latch banks
in the structure to allow addressing each transistor uniquely. They determined
the current-voltage characteristics of all transistors. The observed variations were
thought to be a result of threshold voltage and gate-length variations. They also
proposed a model for the variation of each parameter. The results show that having
a statistical characterization of variations can reduce IC power prediction error
from 30% to 7%. Their work demonstrated the benefit of variation modeling. However,
their analysis used a test array circuit and it cannot be extended to modeling
legacy ICs that are not equipped with such sensors.
[Figure 2.3 axes: Wafer X (mm); color scale: CD (nm)]
Figure 2.3: Measured process variation in a wafer [30]. Friedberg et al. used Electrical
Linewidth Metrology (ELM) to measure the process variation in all the dies
of the wafer. Inter-die and intra-die variations can be clearly observed.
Liu [47] proposed a new modeling approach that described systematic vari-
ations as an affine function of the device's geometric coordinates. To model 
random variations, he recommended three spatial correlation functions: expo-
nential, Gaussian, and linear. Using generalized least square fitting, he chose a 
Gaussian model for the measured data. His main focus was on modeling rather 
than on measuring IC variations. 
Ring oscillators spread throughout a test chip were used by Hargreaves et
al. [36] to measure variation on the chip. The chip design allowed the ring
oscillators to be accessed sequentially. Thus, Hargreaves et al. [36] could
measure each ring oscillator frequency separately. Figure 2.4 shows inverter delays
for four different test ICs. Finally, they also modeled the variations as a Gaussian
field. Their method differs from the model by Liu [47] in the correlation function
and fitting procedure. Hargreaves et al. used a more accurate parameter estimation
method with higher complexity compared to Liu [47].
Figure 2.4: Variation on four test chips measured by Hargreaves et al. [36].
None of these methods provides a fast and practical method for variation
estimation. They are either invasive (that is, destructive) and expensive in terms
of time and equipment cost, or they rely on the addition of on-chip oscillators
for variation sensing. We introduce a fast, non-invasive, and inexpensive method
to estimate the variation. Only a small number of power or delay measurements
are used to characterize the gate-level post-silicon variations.
2.1.3 Effects of variations on the design 
Process variations have considerable effects on chip properties [2-6,9,19,21,26, 
33,44,51,57]. For example, they can seriously affect timing [14,17,19,39,49, 
51,57,58,78]. In statistical static timing analysis (SSTA), researchers try to 
find signal propagation delays on the critical paths in a circuit. Most of the 
proposed solutions are particularly interested in finding the statistical distribution 
of the maximum propagation delay. Orshansky et al. [57] found that in 180nm 
technology not considering process variation might cause a 25% timing error. 
Choi et al. [19] estimated path delays under process variation and proposed a 
new sizing algorithm. Their proposed method performed up to 19% better than 
the worst case analysis. Mangassarian et al. [51] found the delay probability 
distribution function (pdf) of the critical paths and sorted them. Based on sorted 
pdf of path delay, they proposed a statistical timing analysis that is about 30% 
better than the worst case analysis. 
The above methods are pre-silicon models in which a specific variation distribution
is assumed on the IC. Cline et al. [20] analyzed the impact of the variation models
on SSTA methods. They used real measurement data to fit the models such
that the correlation decreases as the distance between two gates increases. Then,
they compared the SSTA methods of the models with static timing analysis
(STA). They showed that correlation models for SSTA should follow the
specific process variations in the IC. Otherwise, the performance of SSTA
would degrade.
Liu et al. [48] introduced an SSTA method using post-silicon measurements 
and optimizations. They combined post-silicon measurements with the existing 
pre-silicon models for the variations. Thus, they constructed a specific model for 
each die. The proposed method could decrease the standard deviation by 83.5% 
compared to the traditional post-silicon SSTA techniques. 
Process variations affect the performance of pipelined circuits as well.
Pipelined circuits consist of a number of sequential stages. To increase the
operating frequency, one needs stages with small delays, but the slowest stage is the
system's bottleneck. In the presence of variations, the delay of each gate is randomly
distributed according to some pdf and it is not possible to exactly determine
the slowest stage [21]. Datta et al. [21] showed that considering variations can
result in a 9% improvement of design yield. Eisele et al. [26] showed that, in
180nm CMOS technology, variation might cause a 10% reduction in the operating
frequency.
Leakage current of an IC also changes with process variation [3,5,11,59,60]. 
Agarwal et al. [5] proposed a method to model IC leakage current distribution. 
They showed that in 50nm CMOS technology the coefficient of variation of the
total current might vary between 0.17 and 0.98.
2.1.4 Testing 
The goal of the IC testing is finding the defective gates in the circuit [56,64,65]. 
The test might be a functional test or a delay test. In the functional test, the 
logical functionality of the gates is tested. The delay test ensures that the delays 
of all gates satisfy a number of specific constraints. 
Finding a set of testable paths is the most important task in testing.
Sharma et al. [65] introduced a technique to construct a small basis path set
that covers all gates. They proposed automatic test pattern generation (ATPG)
techniques to identify the longest testable path through each gate. Thus, they
could detect any defect in the circuits using delay measurements.
Murakami et al. [56] introduced a method to recognize untestable paths. Their 
method was based on the logical necessity conditions that should be satisfied 
for a path to be testable. Knowing the necessity conditions, they proposed an 
algorithm to find the longest testable path through each gate.
Although, similar to our method, circuit testing is based on delay
measurements on a set of testable paths, its goal is not to characterize the delay
variations. In circuit testing, only defective gates are of interest, while the goal
of our method is to characterize the delay variations of the gates.
Thus, process variations affect many different properties of a manufactured
IC and they can no longer be ignored. The previously described methods for
variation estimation are expensive and cannot be extended to a legacy IC.
2.2 Preliminaries 
2.2.1 Variation model 
Process variations can be generally described as the sum of systematic variations 
and random variations. The systematic variations have a deterministic pattern 
resulting from physical imperfection in the manufacturing process. For example, 
mask imperfections result in systematic variations in the chip. Because of their 
deterministic source, systematic variations can potentially be known beforehand 
[76]. The systematic variations of a specific logic gate $u$, denoted by $\psi_u^s$, are
usually linearly modeled [47],
$$\psi_u^s = a_0 + a_1 x_u + a_2 y_u,$$
where $a_0$, $a_1$, and $a_2$ are the model parameters and $(x_u, y_u)$ is the physical
location of the gate on the IC.
Random variations result from arbitrary fluctuations in the manufacturing
process. These variations can be decomposed into inter-die variations $\psi^{\mathrm{inter}}$ and
intra-die variations $\psi_u^{\mathrm{intra}}$. Inter-die variations represent the differences among
the dies on the same wafer. The inter-die variation is a random variable equaling some
constant value for each chip. Intra-die variations represent the differences among the
devices on one chip. Thus, the total random variation for gate $u$ is
$$\psi_u^r = \psi^{\mathrm{inter}} + \psi_u^{\mathrm{intra}}.$$
Finally, the total variation can be written as
$$\psi_u = \psi_u^s + \psi_u^r$$
$$= a_0 + a_1 x_u + a_2 y_u + \psi^{\mathrm{inter}} + \psi_u^{\mathrm{intra}}$$
$$= F_u^T \beta + \psi_u^{\mathrm{intra}}, \quad (2.1)$$
where $F_u = [1, x_u, y_u]^T$ and $\beta = [a_0 + \psi^{\mathrm{inter}}, a_1, a_2]^T$. Note that $F_u$ contains the
gate's location information. The term $F_u^T \beta$ for a specific gate is constant. $\psi^{\mathrm{intra}}$
is a Gaussian random vector with zero mean and correlation matrix $\Sigma$ [47],
$$\Sigma_{u,v} = \rho(F_u - F_v).$$
$\rho$ is the correlation function and can have three forms [47]: $\rho(\cdot) = \exp(-\alpha^2 \|\cdot\|)$
(exponential), $\rho(\cdot) = \exp(-\alpha^2 \|\cdot\|^2)$ (Gaussian), or $\rho(\cdot) = \max\{0, 1 - \alpha^2 \|\cdot\|\}$
(linear).
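The three correlation functions can be sketched in a few lines. In the snippet below, `alpha` stands for the decay parameter of the correlation function, and the gate coordinates and helper names are hypothetical. Since the first entry of every location vector is the constant 1, the norm of the difference of two location vectors reduces to the Euclidean distance between the gates.

```python
import math

def exponential(dist, alpha):
    return math.exp(-alpha**2 * dist)

def gaussian(dist, alpha):
    return math.exp(-alpha**2 * dist**2)

def linear(dist, alpha):
    return max(0.0, 1.0 - alpha**2 * dist)

def correlation_matrix(locations, rho, alpha):
    """Sigma[u][v] = rho(distance between gate u and gate v)."""
    return [[rho(math.dist(a, b), alpha) for b in locations] for a in locations]

locations = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]  # hypothetical (x, y) gate coordinates
sigma = correlation_matrix(locations, gaussian, alpha=0.5)
```

All three forms give a correlation of 1 at zero distance and decay as gates move apart. Only the linear form reaches exactly zero beyond a finite distance.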
Note that Gaussian random variables describe variations in the dimensions
of gates (or equivalently gate delays), i.e., $d_u = d_u^0 + \psi_u$, where $d_u^0$ is the nominal
dimension of the gate.
2.2.2 Compressive sensing 
The compressive sensing concepts, that enable us to reconstruct a sparse vector 
by partial measurement, are explained here (see [10,15,24]). A vector is called 
s-sparse when it has only s non-zero elements. Assume X is an s-sparse N x 1 
vector. Assume Y is described based on the following equation 
Vector X is the unknown sparse vector; U is a known KxN measurement matrix 
and e is measurement noise. Note that not only are the values of the non-zero 
components of X are not known, neither which components are zero. The vector 
Y is our observation (measurement). The goal is to estimate the sparse vector X 
using the measurement vector Y. To retrieve the vector X r one might choose a 
vector that minimizes — UX\\2- Because of the measurement noise and small 
number of measurements, this procedure usually leads to a non-sparse signal. 
However, solving the following optimization problem finds an sparse solution 
Y = UX + e. (2.2) 
min||X||o + a| |e| |2 (2.3) 
such that Y = UX + e, 
20 
where a is a positive constant. 
Note that the zero norm, $\|\cdot\|_0$, in Equation 2.3 measures the number of non-zero elements of the vector. This objective function is not convex, which means that solving the optimization problem in Equation 2.3 is difficult. Instead, Donoho et al. showed [10,15,24] that one can use the following optimization problem to approximate the sparse vector X:
$$\min \|X\|_1 + a\|e\|_2 \tag{2.4}$$
such that Y = UX + e.
They proved that, for a Gaussian measurement matrix, an s-sparse vector can be retrieved via $\ell_1$-norm optimization if
$$s \le C\,\frac{K}{\log(N/K)},$$
where C is a constant. Moreover, for a general measurement matrix U, the Restricted Isometry Property (RIP) should be satisfied [15].
Most real-world vectors have an approximately sparse representation. A vector $X_{N\times 1}$ is called approximately s-sparse if it has s large elements and N - s very small elements. It is also shown that the optimization problem in Equation 2.4 can be used to recover approximately sparse vectors that lie in a weak $\ell_p$ ball of radius r [15], i.e.,
$$|x_{(i)}| \le r\,i^{-p}, \quad 1 \le i \le N, \tag{2.5}$$
where $X = (x_1, x_2, \ldots, x_N)$, $x_{(i)}$ is the i-th largest element of X, and p is a positive integer number.
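As a rough illustration of sparse recovery from few measurements, the sketch below uses iterative soft thresholding (ISTA) on the closely related Lagrangian form $\min \tfrac{1}{2}\|Y - UX\|_2^2 + \tau\|X\|_1$. This is a swapped-in, textbook solver rather than the formulation in Equation 2.4 itself; the problem sizes, the noise level, and the weight `tau` are assumptions chosen only for the demonstration.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t (proximal map of t*||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(U, Y, tau, n_iter=2000):
    """Minimize 0.5*||Y - U X||_2^2 + tau*||X||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(U, 2) ** 2   # 1 / Lipschitz constant of the gradient
    X = np.zeros(U.shape[1])
    for _ in range(n_iter):
        grad = U.T @ (U @ X - Y)             # gradient of the quadratic term
        X = soft_threshold(X - step * grad, step * tau)
    return X

# Synthetic test problem (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
N, K, s = 200, 80, 5                          # unknowns, measurements, sparsity
X_true = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
X_true[support] = rng.choice([-1.0, 1.0], size=s) * (1.0 + rng.random(s))
U = rng.standard_normal((K, N)) / np.sqrt(K)  # Gaussian measurement matrix
Y = U @ X_true + 0.01 * rng.standard_normal(K)

tau = 0.05
X_hat = ista(U, Y, tau)
rel_err = np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)
```

Here K is well below N, so a plain least-squares fit would be underdetermined; the $\ell_1$ term is what steers the estimate toward the sparse vector.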
Chapter 3 
Power Tomography 
In this chapter, we introduce the new fast power variations estimation (power tomography) method. The proposed power tomography is based on our paper in the ISLPED 2008 conference [62].
3.1 Preliminaries 
3.1.1 Leakage current 
Digital circuits are designed such that there is no direct path between the voltage 
source and ground. Thus, one might expect that digital circuits do not consume 
static power; however, the leakage current does occur. There are four sources of 
leakage current [28]: (1) reverse-biased junctions, (2) gate-induced drain leakage, 
(3) gate direct-tunneling leakage, and (4) sub-threshold leakage. Finding the exact value of the leakage current involves elaborate expressions. Since such an exact leakage model does not affect our basic approach, we use the following model presented in [60]:
$$I_{\mathrm{leak}} = q_1 e^{q_2 L + q_3 L^2}. \tag{3.1}$$
$I_{\mathrm{leak}}$ is the leakage current of a transistor; $q_1$, $q_2$, and $q_3$ are three constants that are determined by the physical characteristics of the transistor, and L is the gate length of the transistor. $q_3$ is a small number and $q_3 L^2 \ll q_2 L$ [60]. This model suggests an exponential relation between the transistor gate length and the leakage power. Thus, the leakage current approximately has a log-normal distribution and $p_u = \phi_u p_u^0$, where $p_u^0$ and $p_u$ are the nominal power and the real power of the gate, respectively, and $\phi_u = e^{q_2\psi_u}$, where $\psi_u$ represents the variation in the transistor dimension.
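A small numerical check can make the scaling-factor argument concrete: under the model of Equation 3.1, a gate-length deviation $\psi$ multiplies the leakage by roughly $e^{q_2\psi}$ as long as the $q_3$ term is negligible. The constants below are hypothetical values picked only so that $q_3 L^2 \ll |q_2 L|$ holds; they are not taken from [60].

```python
import math

# Hypothetical model constants (illustrative only, not from the thesis or [60]).
Q1, Q2, Q3 = 1.0e-9, -0.05, 1.0e-5

def leakage(length_nm):
    """Leakage current for a given gate length, following the form of Equation 3.1."""
    return Q1 * math.exp(Q2 * length_nm + Q3 * length_nm ** 2)

L0 = 65.0    # nominal gate length in nm (assumed)
psi = 2.0    # gate-length variation in nm (assumed)

phi_exact = leakage(L0 + psi) / leakage(L0)   # exact scaling factor under the model
phi_approx = math.exp(Q2 * psi)               # approximation that ignores the q3 term
```

Because $\psi$ is modeled as Gaussian, $e^{q_2\psi}$ is log-normal, which is the distribution the text attributes to the leakage.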
Thus, given a combinational circuit C consisting of N logic gates, $P_i$ input pins, and $P_o$ output pins, each gate $g_u$, based on its input signals b, consumes a specific power $p_{g_u,b}$. Because of the process variation, the power consumption of gate $g_u$ does not equal its nominal power consumption $p^0_{g_u,b}$. Rather, it is scaled by $\phi_u$:
$$p_{g_u,b} = p^0_{g_u,b}\,\phi_u.$$
The scaling factors of the gates, $\phi_u$, need to be estimated whenever it is feasible.
3.1.2 Global flow of the power tomography 
Figure 3.1: Global flow of the power tomography.
Figure 3.1 shows the global flow of our method. A number of random input 
vectors are applied to the circuit, and the leakage current corresponding to each 
input vector is measured (Steps 1 and 2). Next, a system of linear equations is 
formed where each equation corresponds to one measurement (Step 3). The unknowns of the equations are the (normalized) leakage current variations of each gate. The standard way to estimate the IC's leakage tomogram is to use $\ell_2$-norm optimization (Steps 4a-5a). However, our method exploits the spatial correlations of the statistical leakage variations and compressive sensing theory to estimate the leakage tomogram efficiently (Steps 4b-5b). We also enforce the spatial constraint on the power variation estimation directly (the TUSC method in Steps 4c-5c).
Figure 3.2: A simple logic circuit. 
Table 3.1: Static power for different input vector combinations. 
input vector NAND-2 NOR-2 
00 0.776 nW 17.41 nW 
01 10.39 nW 4.112 nW 
10 4.137 nW 7.581 nW 
11 15.15 nW 3.527 nW 
3.2 Noninvasive tomography 
In this section, we detail the full matrix measurement method for noninvasive 
gate-level characterization. First, different inputs are applied to the circuit and the chip's total leakage current is measured for each input. Then, an optimization problem is solved to find the process variation based on the power measurements.
Consider the simple logic circuit in Figure 3.2. It has 3 inputs and 2 outputs. The nominal power consumption of each gate for different inputs is shown in Table 3.1. The table shows power consumption for 65nm CMOS transistor technology. As a result, the circuit has a different power consumption for each input vector. Because of the process variation, the nominal power consumption of the gate $g_u$ is scaled by $\phi_u$. For example, if input 1, input 2, and input 3 are 0, 1, and 1, respectively, then the total power consumption of the circuit would be
$$p_{011} = p_{g_1,01}\phi_1 + p_{g_2,11}\phi_2 + p_{g_3,00}\phi_3 + p_{g_4,00}\phi_4 = 4.112\phi_1 + 15.15\phi_2 + 0.776\phi_3 + 17.41\phi_4, \tag{3.2}$$
where $p_{g_i,b_j^i}$ is the power consumption of the gate $g_i$ for input $b_j^i$. Note that $b_j^i$, the input of each gate $g_i$, is a function of the input vector of the circuit, which is denoted by $b_j$. For example, in Figure 3.2, if $b_j = 011$ then $b_j^3 = 00$.
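The arithmetic behind Equation 3.2 can be reproduced directly from Table 3.1. In the sketch below, the gate types and the local gate inputs for circuit input 011 are read off the coefficients of Equation 3.2 (an inference, since Figure 3.2 is not reproduced here); the names `NOMINAL_POWER_NW`, `GATES_FOR_011`, and `total_power` are illustrative.

```python
# Nominal static power in nW from Table 3.1, indexed by gate type and local input.
NOMINAL_POWER_NW = {
    "NAND2": {"00": 0.776, "01": 10.39, "10": 4.137, "11": 15.15},
    "NOR2":  {"00": 17.41, "01": 4.112, "10": 7.581, "11": 3.527},
}

# (gate type, local input) for g1..g4 when the circuit input is 011.
# Inferred from the coefficients in Equation 3.2, not from the schematic.
GATES_FOR_011 = [("NOR2", "01"), ("NAND2", "11"), ("NAND2", "00"), ("NOR2", "00")]

def total_power(gates, phi):
    """Sum of nominal gate powers, each scaled by its variation factor phi_i."""
    return sum(NOMINAL_POWER_NW[kind][bits] * f for (kind, bits), f in zip(gates, phi))

# One row of the measurement matrix A, and the power of a variation-free chip.
row_011 = [NOMINAL_POWER_NW[kind][bits] for kind, bits in GATES_FOR_011]
p_nominal = total_power(GATES_FOR_011, [1.0, 1.0, 1.0, 1.0])
```

Each applied input vector contributes one such row, which is how the measurement matrix introduced next is assembled.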
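In a digital circuit with N gates, for the binary input vector $b_j$, the total power consumption $p_{b_j}$ is
$$p_{b_j} = \sum_{i=1}^{N} p_{g_i,b_j^i}\,\phi_i. \tag{3.3}$$
If there are M input vectors $b_1, \ldots, b_M$, define the measurement matrix A as
$$A = \begin{pmatrix}
p_{g_1,b_1^1} & p_{g_2,b_1^2} & \cdots & p_{g_N,b_1^N} \\
p_{g_1,b_2^1} & p_{g_2,b_2^2} & \cdots & p_{g_N,b_2^N} \\
\vdots & \vdots & & \vdots \\
p_{g_1,b_M^1} & p_{g_2,b_M^2} & \cdots & p_{g_N,b_M^N}
\end{pmatrix}.$$
Also, let
$$p = [p_{b_1}, \ldots, p_{b_M}]^T, \qquad d = [\phi_1, \ldots, \phi_N]^T.$$
Then, we need to solve the following system of linear equations to find the gate variations:
$$p = Ad. \tag{3.4}$$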
Since there are N unknown variables ($\phi_i$, $i = 1 \ldots N$), N independent measurements are needed to completely determine the solution of the linear system in Equation 3.4. In the presence of power measurement noise, we can use least squares:
$$\min \|Ad - p\|_2^2. \tag{3.5}$$
We call this method the $\ell_2$-minimization method.
Note that each input vector $b_j$, based on the topology of the circuit, determines a row of the measurement matrix A (a power vector). The rows of the measurement matrix are not necessarily independent, which can make it impossible to find the variation of all gates by the optimization in Equation 3.5.
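The least-squares step of Equation 3.5 can be sketched with NumPy as follows. The measurement matrix here is synthetic (uniform random stand-ins for nominal gate powers), and the sizes and noise level are assumptions; the returned rank shows whether the measurements determine every scaling factor.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 6, 12                                   # gates and measurements (assumed sizes)

# Synthetic stand-in for the measurement matrix: nominal gate powers in nW.
A = rng.uniform(0.5, 20.0, size=(M, N))
phi_true = rng.lognormal(mean=0.0, sigma=0.1, size=N)   # log-normal scaling factors
p = A @ phi_true + 0.01 * rng.standard_normal(M)        # noisy total-power readings

# Least-squares estimate of the scaling factors (Equation 3.5).
phi_hat, _, rank, _ = np.linalg.lstsq(A, p, rcond=None)

# rank < N would mean some gates cannot be resolved from these measurements.
fully_determined = (rank == N)
```

When `rank` is smaller than N, `lstsq` still returns a minimum-norm answer, but individual gate variations are then not identifiable, which is the situation the paragraph above warns about.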
Multi-voltage leakage measurement 
The number of independent power vectors (rows of the measurement matrix) may increase by increasing the number of power measurements, M. However,
circuit topology dictates an upper bound on the maximum number of independent 
power vectors. But as discussed in Section 2.2, supply voltage and the leakage 
current are not linearly dependent. Hence, measuring static power for different 
supply voltages results in independent power vectors. We use this fact to increase 
the number of the independent power vectors in the measurement matrix. 
Figure 3.3: Number of independent measurement vectors for single voltage measure-
ments and multiple voltages (3 voltages) measurements. Multiple voltage 
measurements increase number of independent rows in the measurement 
matrix. 
Figure 3.3 shows the number of the independent power vectors in the C432 cir-
cuit from ISCAS'85 benchmarks. Similar to the previous section, this experiment 
is based on the 65nm CMOS transistor technology. The figure shows the number of independent power vectors versus the number of random measurements. Two cases were investigated: measurement under a single supply voltage
and measurement under three supply voltages. It is clear that for the same to-
tal number of measurements, three supply voltages measurements result in more 
independent power vectors. 
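The effect of multi-voltage measurement on the number of independent power vectors can be illustrated with a toy rank count. Everything numeric below is an assumption made for the demonstration: a four-gate circuit whose topology allows only two distinct power vectors, and a hypothetical exponential voltage dependence with a different sensitivity per gate type. It is not the C432 experiment of Figure 3.3.

```python
import numpy as np

# Two distinct nominal power vectors (rows) at the reference voltage, four gates.
base_rows = np.array([[0.776, 17.41, 10.39, 4.112],
                      [15.15, 3.527, 4.137, 7.581]])

# Hypothetical voltage sensitivity per gate (gates 0 and 2 of one type, 1 and 3 of another).
sensitivity = np.array([2.0, 3.0, 2.0, 3.0])

def rows_at(voltage, v_ref=1.0):
    """Power vectors at a given supply voltage under the assumed exponential scaling."""
    return base_rows * np.exp(sensitivity * (voltage - v_ref))

# Six measurements at one voltage versus six measurements spread over three voltages.
single_voltage = np.vstack([rows_at(1.0)] * 3)
multi_voltage = np.vstack([rows_at(v) for v in (0.9, 1.0, 1.1)])

rank_single = np.linalg.matrix_rank(single_voltage)
rank_multi = np.linalg.matrix_rank(multi_voltage)
```

Repeating measurements at one voltage cannot raise the rank above the number of distinct rows, whereas changing the voltage rescales the columns unevenly and yields new independent rows.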
Figure 3.4: Process variation and its sparse wavelet transform for a typical circuit in 
power framework. 
3.3 Fast tomography by compressive sensing 
As discussed in Section 2.2, sparse vectors can be acquired using very few mea-
surements. In this section, first, we introduce fast tomography for chips with 
gates located on regular grids. Then, we extend this approach for cases with 
gates located on irregular grids. 
3.3.1 Sparse representation 
The spatial correlation in the variations provides some redundancies in the variation values. The spatial correlation suggests that variations can be sparsely represented in an appropriate basis. In this section, we use a wavelet basis to sparsely represent the process variations. Specifically, we assume $d = W^{-1}s$, where W is a wavelet transform matrix and s is a sparse vector. Wavelet bases are very efficient in sparse modeling of spatial correlation, as shown in Figure 3.4. The left side of the figure images the variations of a chip in the spatial domain. The right side shows the variations in the wavelet domain. In the wavelet domain, most of the non-zero coefficients are concentrated in the upper-left corner of the transform and most of the remaining coefficients are close to zero.
Figure 3.5: Sorted wavelet coefficients for different basis functions in power framework (x-axis: sorted coefficients index; curves: db2, db5, db6, db9, bior1.3). The db9 basis produces the most sparse representation.
Figure 3.5 shows wavelet transformation of variations for a number of wavelet 
bases. The figure demonstrates the coefficients decay rate for a variety of wavelet 
families on typical 32x32 regular grid circuits. The figure suggests that the 
Daubechies 9 (db9) wavelet basis is very good at sparsifying the process variation. 
In the remainder of the thesis, we use the Daubechies 9 wavelet to model process 
variation sparsity in the power framework. 
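The energy-compaction idea can be checked on a small synthetic example. The sketch below substitutes an orthonormal Haar transform for the Daubechies 9 basis (Haar is easy to build by hand with NumPy) and uses a smooth synthetic field as a stand-in for spatially correlated variations; both choices are assumptions for illustration.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar transform matrix of size n x n (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        m = H.shape[0]
        averages = np.kron(H, [1.0, 1.0])            # coarser-scale rows
        details = np.kron(np.eye(m), [1.0, -1.0])    # finest-scale difference rows
        H = np.vstack([averages, details]) / np.sqrt(2.0)
    return H

def energy_fraction(values, k):
    """Fraction of the total energy carried by the k largest-magnitude entries."""
    energy = np.sort(values.ravel() ** 2)[::-1]
    return energy[:k].sum() / energy.sum()

n = 32                                               # 32 x 32 regular grid
yy, xx = np.mgrid[0:n, 0:n]
# Smooth synthetic variation map (two broad bumps), standing in for correlated variations.
field = (np.exp(-((xx - 10.0) ** 2 + (yy - 20.0) ** 2) / 128.0)
         + 0.5 * np.exp(-((xx - 25.0) ** 2 + (yy - 5.0) ** 2) / 72.0))

H = haar_matrix(n)
coeffs = H @ field @ H.T                             # separable 2-D transform

k = 51                                               # about 5% of the 1024 entries
frac_spatial = energy_fraction(field, k)
frac_wavelet = energy_fraction(coeffs, k)
```

A larger `frac_wavelet` than `frac_spatial` means that a few transform coefficients describe the map well, which is the property the compressive sensing step relies on.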
3.3.2 Regular grid tomography 
First, we assume that the logic gates are located on a regular T x R grid on the chip. The matrix of process variation on the regular grid is denoted by $H = \{h_{s,t}\}_{s=1\ldots T,\; t=1\ldots R}$, where $h_{s,t}$ is the variation of the gate located at the (s, t)-th point of the grid. We stack all the elements of the matrix H in a long column vector d. Assume W is the transformation matrix for a wavelet in which the variation vector d is sparse. Let
$$s = Wd; \tag{3.6}$$
then, s is a sparse vector.
Using the wavelet basis to model the spatial correlation of the process variation, Equation 3.4 becomes
$$p = Ad + e = AW^{-1}s + e. \tag{3.7}$$
The sparse s can be recovered using the optimization in Equation 2.4:
$$\min \|s\|_1 + \lambda\|AW^{-1}s - p\|_2^2. \tag{3.8}$$
The process variation d is then recovered using $d = W^{-1}s$.
Figure 3.6: Gates are not placed on regular grids. 
3.3.3 Irregular grid tomography 
In practice, gates are not placed on a regular layout grid. Figure 3.6 shows an 
example of an IC in which gates are placed on an irregular grid. To address 
the irregular placement, we cover the IC with fine regular grids. Then, using 
Procedure 1, each gate is assigned to a point on the regular grid. At the first 
step of Procedure 1, all the regular grid points are labeled unmarked, meaning 
that none of the regular points is assigned to any gate. In the second step, for 
every gate, we find its closest regular point that is unmarked. Finally, to prevent multiple selection, we mark the selected regular grid point.
Then, we assign auxiliary variables to the points in the fine grid that are not assigned to any gate. We also modify the measurement matrix A to be consistent with the fine regular grids, i.e., for each auxiliary variable, we add an appropriate zero column to the matrix A. Since the coefficients of the auxiliary variables in the measurement matrix are zero, they do not affect the optimization.
Figure 3.7: Wavelet coefficients of the irregular wavelet transformation and the fine-grid wavelet transformation.
PROCEDURE 1
Mapping from irregular gates to fine regular grids
(1) Set all the regular grid points unmarked
(2) for all gates $g_i$
a. p = the closest grid point to gate $g_i$ that is unmarked
b. Assign gate $g_i$ to p
c. Mark regular grid point p
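The steps of Procedure 1 can be sketched in a few lines of Python. The function name `map_gates_to_grid`, the tie-breaking rule (lowest grid index), and the small example layout are assumptions; the sketch also presumes there are at least as many grid points as gates.

```python
import math

def map_gates_to_grid(gates, grid_points):
    """Greedy mapping in the spirit of Procedure 1.

    gates: dict mapping a gate name to its (x, y) position.
    grid_points: list of (x, y) positions of the fine regular grid.
    Returns a dict mapping each gate name to the index of its grid point.
    """
    unmarked = set(range(len(grid_points)))          # step (1): nothing is marked yet
    assignment = {}
    for name, (gx, gy) in gates.items():             # step (2): visit every gate
        # a. closest unmarked grid point (ties broken by the lower index)
        best = min(unmarked,
                   key=lambda k: (math.hypot(grid_points[k][0] - gx,
                                             grid_points[k][1] - gy), k))
        assignment[name] = best                      # b. assign the gate to it
        unmarked.remove(best)                        # c. mark the grid point
    return assignment

# Example: a 3 x 3 grid (index = 3*y + x) and three irregularly placed gates.
grid = [(x, y) for y in range(3) for x in range(3)]
gates = {"g1": (0.1, 0.2), "g2": (0.2, 0.1), "g3": (1.9, 2.2)}
assignment = map_gates_to_grid(gates, grid)
```

In this example g1 and g2 both lie closest to grid point 0; g1 is visited first and takes it, so g2 falls back to its next-nearest unmarked point. The result therefore depends on the order in which gates are visited.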
Note that as an alternative method to deal with irregular grids, we could use the irregular wavelet transformation introduced by Wagner et al. [75]. The irregular wavelet transformation is based on the regular wavelet transformation; however, it is adapted to irregular point arrangements. Figure 3.7 shows the sorted wavelet coefficients for both the irregular wavelet transformation and our fine-grid wavelet transformation in the C880 circuit. The wavelet coefficients of the proposed fine-grid method decay much faster than those of the irregular grid transformation. The main reason is that the gate placement is not completely irregular. The standard gate sizes are integer multiples of a specific value. Moreover, the placement tools assume irregularity in just one dimension.
3.4 Tomography using spatial constraints (TUSC)
In this section, we directly use the spatial correlation to improve the estimation 
error of power variations. In Section 3.2, we just used power (leakage) mea-
surements in Equation 3.3 to estimate the variations. Representing variations in 
sparse domain in Section 3.3 is based on the spatial correlation in the variations. 
Here, we reformulate the variation estimation problem such that the spatial cor-
relation explicitly appears in the optimization problem. 
3.4.1 Adding spatial constraints 
Adding spatial constraints directly to the optimization problem improves the 
estimation performance. The spatial correlation implies that nearby gates should 
have approximately similar scaling factors. As the distance between two gates 
increases, the correlation between their scaling factors decreases. Thus, far gates 
might have totally different scaling factors. We should penalize solutions in which 
nearby gates do not have close scaling factors. 
Consider the optimization problem in Equation 3.5. We add a number of constraints to the optimization problem such that they enforce spatially correlated solutions. Assume $g_u$ and $g_v$ are two logic gates that are located at $(x_u, y_u)$ and $(x_v, y_v)$, respectively. Similar to Section 3.2, their scaling factors are denoted by $\phi_u$ and $\phi_v$. We use the following optimization problem to improve the variation estimation:
$$\min \|Ad - p\|_2^2 + \sum_{(g_u,g_v)\in\mathcal{E}} \gamma(d_{u,v})(\phi_u - \phi_v)^2, \tag{3.9}$$
where
$$d_{u,v} = \sqrt{(x_u - x_v)^2 + (y_u - y_v)^2},$$
$$\mathcal{E} = \{(g_u, g_v) \mid g_u \text{ and } g_v \text{ are two gates in the circuit}\}, \tag{3.10}$$
and $\gamma(\cdot)$ is a monotone-decreasing function. Thus, when the distance between two gates ($d_{u,v}$) is small, $\gamma(d_{u,v})$ is large. It enforces a small value for $(\phi_u - \phi_v)^2$. Conversely, when the distance between two gates ($d_{u,v}$) is large, $\gamma(d_{u,v})$ is small and $(\phi_u - \phi_v)^2$ does not affect the optimization problem dramatically. Hence, the solution of the optimization problem in Equation 3.9 will exhibit spatial correlations.
To simplify the constraints, one can eliminate the gate pairs that are far from each other. For example, we can define $\mathcal{E}_r$ as
$$\mathcal{E}_r = \{(g_u, g_v) \mid g_u \text{ and } g_v \text{ are two gates in the circuit},\; d_{u,v} < r\}. \tag{3.11}$$
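Because the penalty in Equation 3.9 is quadratic, the problem can be solved as one augmented least-squares system: each gate pair within distance r contributes an extra row $\sqrt{\gamma(d_{u,v})}\,(e_u - e_v)$ with a zero right-hand side. The sketch below follows that idea; the choice $\gamma(d) = e^{-d}$, the synthetic data, and the function name `tusc_estimate` are assumptions, and each unordered pair is counted once.

```python
import numpy as np

def tusc_estimate(A, p, xy, radius, gamma=lambda dist: np.exp(-dist)):
    """Least squares with pairwise smoothness penalties between nearby gates."""
    n = A.shape[1]
    rows = []
    for u in range(n):
        for v in range(u + 1, n):
            dist = np.hypot(*(xy[u] - xy[v]))        # Euclidean distance d_{u,v}
            if dist < radius:                        # keep only pairs in E_r
                r = np.zeros(n)
                r[u], r[v] = 1.0, -1.0
                rows.append(np.sqrt(gamma(dist)) * r)
    D = np.array(rows).reshape(-1, n)                # penalty rows
    A_aug = np.vstack([A, D])
    p_aug = np.concatenate([p, np.zeros(D.shape[0])])
    phi, *_ = np.linalg.lstsq(A_aug, p_aug, rcond=None)
    return phi, D

# Synthetic example: 16 gates on a 4 x 4 layout, only 8 power measurements.
rng = np.random.default_rng(2)
side = 4
xy = np.array([(x, y) for y in range(side) for x in range(side)], dtype=float)
N, M = side * side, 8
phi_true = 1.0 + 0.05 * (xy[:, 0] + xy[:, 1])        # smooth spatial trend (assumed)
A = rng.uniform(0.5, 20.0, size=(M, N))              # stand-in nominal powers
p = A @ phi_true

phi_tusc, D = tusc_estimate(A, p, xy, radius=1.5)
```

With fewer measurements than gates the plain system of Equation 3.5 is underdetermined, while the penalty rows make the augmented system full rank whenever the neighbor graph is connected.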
Chapter 4 
Delay Tomography 
In this chapter, we extend the variation estimation to the delay framework. Sim-
ilar to the power tomography in Chapter 3, we only use primary inputs/outputs 
of the IC to characterize the delay variations. The approach is based on our 
paper in ICCAD 2008 conference [63]. 
4.1 Preliminaries 
4.1.1 Delay variation model 
Transition delay is usually modeled as a linear function of transistor feature size 
variations [38,49,58]. For example, consider a NAND2 gate where one of its 
inputs is 1 and the other input, at time t = 0, transits from 0 to 1. Because of 
propagation delay, the output transits from 1 to 0 at time $t = d_r$. When there are variations in transistor feature size, the rising propagation delay, $d_r$, varies among different NAND2 gates in the IC, i.e. [49],
$$d_r(\psi_u^{\mathrm{total}}) = d_r^{\mathrm{nominal}} + \xi\,\psi_u^{\mathrm{total}}, \tag{4.1}$$
where $\xi$ is a constant and $d_r^{\mathrm{nominal}}$ is the nominal rising delay of the gate. Note that, even if we model the propagation delay as a quadratic (or higher-order) function [29], we can use the same approach by assuming new variables for the higher-order parameters.
4.1.2 Sensitizable paths 
A path in an IC is defined as a sequence of logic gates from an input of the 
IC to one of its output pins. To find propagation delay in a path, one should 
find an appropriate input vector for the IC. The input vector should guarantee 
propagation of a transition in the path. If such an input vector exists, the path 
is called sensitizable; otherwise it is called unsensitizable. 
4.1.3 Global flow of the delay tomography 
Figure 4.1 shows the global flow of the work. At the first step, we feed the circuit 
with a number of input vector pairs based on the set of sensitizable paths. The 
inputs are found based on the path selection procedure introduced in Section 
4.5. In step 2, propagation delay is measured for every sensitizable path. Based 
on the measured propagation delays, we construct a System of Linear Equations (SLE) with gate variations as its unknown parameters. Then, we estimate the variations by two methods (4a and 4b). The first method is based on the traditional $\ell_2$-minimization (4a). In the second method (4b), we show the sparsity of the variations in the wavelet domain and use compressed sensing ($\ell_1$-regularization) to estimate the variations more efficiently.
Figure 4.1: Global flow of the delay tomography.
Figure 4.2: A sensitizable path from an input to the output. Inputs to the circuit are set such that a rising (falling) transition in input a can propagate to the output n.
4.2 Delay estimation by $\ell_2$-norm minimization
The signal propagation delays of a number of sensitizable paths are measured. 
Linear equations are constructed with the scaling factors of gate delays (defined 
in Section 4.1.1) as the unknown parameters. Finally, solving these equations, 
we estimate the scaling factors and, therefore, the gate variations. In Section 4.3, we utilize the spatial correlations in the variations to improve the scaling factor estimations.
An example of path delay analysis is shown in Figure 4.2. Lines labeled by a, b, c, and d are the circuit's primary inputs and the line n is the circuit's primary output. We want to sensitize the highlighted path, $P_1$: a-$g_1$-z-e-$g_3$-f-$g_4$-s-$g_6$-k-$g_7$-n. We need to find an input vector that guarantees that a transition in input a propagates through the path. Let us assume a rising transition in a (input a transits from 0 to 1). To allow propagation through the gate $g_1$, we need to set b to be equal to 0. Then, there would be a falling (1 → 0) and a rising (0 → 1) transition in lines e and f, respectively. If g is equal to 1 and m is equal to 0, then the rising transition propagates in the lines s, k and n. To guarantee that g is equal to 1 and m is equal to 0, we just need to set the input c = 0.
The input assignments above allow the transition in input a to propagate through the path $P_1$: a-$g_1$-z-e-$g_3$-f-$g_4$-s-$g_6$-k-$g_7$-n. Using the delay bounding method introduced in [64], one can measure the total delay of the underlying path. We can measure the time difference between the transitions in line a and in line n. Let us denote the total delay of the path $P_1$ for the rising transition by $d_r(P_1)$.
The total path delay is an additive composition of the delays of its elements. For example, the delay of the path $P_1$ can be written as the summation of the delays in line a, gate $g_1$, line z, line e, gate $g_3$, and so on, i.e.,
$$d_r(P_1) = d(a) + d_r(g_1) + d(z) + d(e) + d_f(g_3) + d(f) + d_r(g_4) + d(s) + d_f(g_6) + d(k) + d_r(g_7) + d(n), \tag{4.2}$$
where d(x) is the delay of the line x, and $d_r(g_i)$ and $d_f(g_i)$ are the rising and falling delays of gate $g_i$, respectively.
Here, we assume for presentation clarity that interconnect delays (line delays) are zero. The proposed method can be easily extended to cases with non-zero interconnect delays. Note that it may be the case that variations in the interconnects have a separate statistical representation. In such scenarios, one may consider compressed sensing methods that address the summation of two distinct distributions in one framework [24]. Assuming zero interconnect delays, Equation 4.2 reduces to:
$$d_r(P_1) = d_r(g_1) + d_f(g_3) + d_r(g_4) + d_f(g_6) + d_r(g_7). \tag{4.3}$$
In Section 4.1, we illustrated that because of the process variation, delays of the gates deviate from their nominal values, i.e. [49],
$$d_r(g_i) = d_r^{\mathrm{nominal}}(g_i) + \xi_{r,g_i}\,l_{g_i}, \tag{4.4}$$
where $d_r^{\mathrm{nominal}}(g_i)$ is the nominal delay for the rising transition, $l_{g_i}$ is the variation for the gate $g_i$, and $\xi_{r,g_i}$ is a constant coefficient. Table 4.1 shows the constant coefficients for inverter, NAND2, and NOR2 gates. Similarly for the falling transition,
$$d_f(g_i) = d_f^{\mathrm{nominal}}(g_i) + \xi_{f,g_i}\,l_{g_i}. \tag{4.5}$$
Table 4.1: Transition propagation rate for different gates. The rising and the falling transitions do not enforce the same delay rates.
Gate Rising (ps/µm) Falling (ps/µm)
Inverter 86.9 40.77
NAND2 176.9 507.7
NOR2 95.4 1106.2
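Thus, Equation 4.3 becomes
$$d_r(P_1) = d_r^{\mathrm{nominal}}(g_1) + \xi_{r,g_1}l_{g_1} + d_f^{\mathrm{nominal}}(g_3) + \xi_{f,g_3}l_{g_3} + d_r^{\mathrm{nominal}}(g_4) + \xi_{r,g_4}l_{g_4} + d_f^{\mathrm{nominal}}(g_6) + \xi_{f,g_6}l_{g_6} + d_r^{\mathrm{nominal}}(g_7) + \xi_{r,g_7}l_{g_7}, \tag{4.6}$$
or
$$\xi_{r,g_1}l_{g_1} + \xi_{f,g_3}l_{g_3} + \xi_{r,g_4}l_{g_4} + \xi_{f,g_6}l_{g_6} + \xi_{r,g_7}l_{g_7} = b_{P_1},$$
where
$$b_{P_1} = d_r(P_1) - d_r^{\mathrm{nominal}}(g_1) - d_f^{\mathrm{nominal}}(g_3) - d_r^{\mathrm{nominal}}(g_4) - d_f^{\mathrm{nominal}}(g_6) - d_r^{\mathrm{nominal}}(g_7).$$
$b_{P_1}$ is a constant. Thus, each sensitizable path in the circuit leads to a linear relation among the variation elements, $l_{g_i}$. The falling and rising coefficients ($\xi_{f,g_i}$ and $\xi_{r,g_i}$) are known and our goal is to estimate the variations, $l_{g_i}$.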
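Assume that $P_1, P_2, \ldots, P_M$ are M sensitizable paths in a general combinational circuit C with N gates. For each path $P_j$, if it is stimulated by a rising transition,
$$\sum_{i=1}^{N} \alpha_{P_j}(i)\,\xi_{\lambda_r(P_j,i),g_i}\,l_{g_i} = b_j^r, \tag{4.7}$$
where
$$\alpha_{P_j}(i) = \begin{cases} 1 & \text{if } g_i \text{ belongs to the path } P_j; \\ 0 & \text{otherwise,} \end{cases}$$
and
$$\lambda_r(P_j,i) = \begin{cases} f & \text{if } g_i \text{ has a falling transition when path } P_j \text{ is stimulated by a rising transition;} \\ r & \text{otherwise.} \end{cases}$$
Similarly for a falling transition,
$$\sum_{i=1}^{N} \alpha_{P_j}(i)\,\xi_{\lambda_f(P_j,i),g_i}\,l_{g_i} = b_j^f, \tag{4.8}$$
where
$$\lambda_f(P_j,i) = \begin{cases} f & \text{if } g_i \text{ has a falling transition when path } P_j \text{ is stimulated by a falling transition;} \\ r & \text{otherwise.} \end{cases}$$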
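To write Equations 4.7 and 4.8 in a compact form, we define the matrix A, the measurement vector b, and the variation vector l as follows:
$$A = \begin{pmatrix}
\alpha_{P_1}(1)\xi_{\lambda_r(P_1,1),g_1} & \cdots & \alpha_{P_1}(N)\xi_{\lambda_r(P_1,N),g_N} \\
\alpha_{P_2}(1)\xi_{\lambda_r(P_2,1),g_1} & \cdots & \alpha_{P_2}(N)\xi_{\lambda_r(P_2,N),g_N} \\
\vdots & & \vdots \\
\alpha_{P_M}(1)\xi_{\lambda_r(P_M,1),g_1} & \cdots & \alpha_{P_M}(N)\xi_{\lambda_r(P_M,N),g_N} \\
\alpha_{P_1}(1)\xi_{\lambda_f(P_1,1),g_1} & \cdots & \alpha_{P_1}(N)\xi_{\lambda_f(P_1,N),g_N} \\
\alpha_{P_2}(1)\xi_{\lambda_f(P_2,1),g_1} & \cdots & \alpha_{P_2}(N)\xi_{\lambda_f(P_2,N),g_N} \\
\vdots & & \vdots \\
\alpha_{P_M}(1)\xi_{\lambda_f(P_M,1),g_1} & \cdots & \alpha_{P_M}(N)\xi_{\lambda_f(P_M,N),g_N}
\end{pmatrix},$$
$$b = (b_1^r, b_2^r, \ldots, b_M^r, b_1^f, b_2^f, \ldots, b_M^f)^T,$$
and
$$l = (l_{g_1}, \ldots, l_{g_N})^T.$$
This notation allows the following minimization for finding the variation l:
$$\min \|Al - b\|_2. \tag{4.9}$$
We call this method the $\ell_2$-minimization method.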
Note that it may not be possible to find the variations of all gates by this method. For example, in Figure 4.2, if we want to find another sensitizable path that includes $g_4$, we should fix f = 1 (non-controlling value), causing e = 0 and g = 1. Thus, the transition cannot propagate on the line g, and path $P_1$ is the only path that includes the gates $g_3$, $g_4$ and $g_6$. As a result, there are at most two equations (falling and rising) that include the variations of the gates $g_3$, $g_4$ and $g_6$; it is impossible to find the variations of the three gates separately. We refer to such cases as ambiguous gates.
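One row of the matrix A for a single path can be assembled as in the sketch below. The coefficients come from Table 4.1, while the gate types assigned to $g_1 \ldots g_7$ are hypothetical (Figure 4.2 is not reproduced here), and the code assumes every gate on the path is inverting, so the transition direction alternates from gate to gate as in Equation 4.3.

```python
# Rising/falling delay coefficients from Table 4.1, in ps per micrometer.
XI = {
    "INV":   {"r": 86.9,  "f": 40.77},
    "NAND2": {"r": 176.9, "f": 507.7},
    "NOR2":  {"r": 95.4,  "f": 1106.2},
}

def path_row(path_gates, gate_types, n_gates, first_transition):
    """Row of A for one path: xi for gates on the path, zero for all other gates.

    first_transition is "r" or "f" for the first gate on the path; it is flipped
    after every gate because all gates are assumed to be inverting.
    """
    row = [0.0] * n_gates
    transition = first_transition
    for g in path_gates:
        row[g] = XI[gate_types[g]][transition]
        transition = "f" if transition == "r" else "r"
    return row

# Hypothetical gate types for g1..g7 (list indices 0..6).
gate_types = ["NAND2", "NAND2", "INV", "NAND2", "NOR2", "INV", "NOR2"]
p1 = [0, 2, 3, 5, 6]                       # gates g1, g3, g4, g6, g7 on path P1

row_rising = path_row(p1, gate_types, 7, "r")    # stimulated by a rising transition
row_falling = path_row(p1, gate_types, 7, "f")   # stimulated by a falling transition
```

Gates that never appear on any measured path keep a zero column in A, and gates that only ever appear together, such as the ambiguous gates described above, cannot be separated by least squares alone.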
4.3 Delay estimation using compressive sensing 
Section 4.2 presents a system of linear equations to estimate the variations of the gates. However, the optimization problem in Equation 4.9 does not consider the spatial correlation of the delay variations. Incorporating the spatial correlation in the model significantly improves the results and allows resolving the ambiguities described in the previous section. This section incorporates sparsity in the wavelet domain as a model for the spatial correlation of the timing variation. Thus, we can use compressive sensing theory to estimate the variations more accurately.
Figure 4.3: Left: Spatial correlation in delay variations in a typical IC. Right: wavelet transform of the variation. Because of the spatial correlation, the variation is sparse in the wavelet domain.
4.3.1 Sparse representation of variations 
As we explained in Section 3.3.1, because of the spatial correlation, a wavelet basis can sparsely represent the variations. Similar to power tomography, we use the wavelet basis to sparsely represent variations. Note that variations in the power framework are based on a log-normal distribution but variations in the delay are approximately normally distributed. Thus, power variations and delay variations might be sparse in different wavelet bases.
Figure 4.4: Sorted wavelet coefficients for different bases. The bior3.5 basis results in the most sparse representation.
Figure 4.3 demonstrates the effectiveness of the wavelet transform in representing spatial variations. The left side of the figure is the image plot of the variations in a typical IC, generated using the Gaussian model in [47]. The spatial correlation is evident in the figure. The right side of the figure represents the wavelet transform of the left-hand side. Most of the transform coefficients are zero. Only the top-left part of the figure has a dense amount of significant non-zero elements.
Figure 4.4 presents the decay rate of the wavelet coefficients for a number 
of different wavelet transforms. A transform appropriate for compressed sensing 
should have a fast decay rate. The faster the decay, the sparser the signal under 
this transform, and the fewer the measurements necessary to acquire the variation 
vector. The figure demonstrates that the (3,5) Biorthogonal wavelet basis best 
describes the spatial variations. We use this wavelet basis for the remainder of 
this thesis. 
4.3.2 Gates on the regular grids 
When gates are located on a regular grid, the two-dimensional wavelet transform of the variations, s, can be expressed as the product of the variation vector, l, with the wavelet transform matrix W:
$$s = Wl. \tag{4.10}$$
As discussed in Section 4.3.1, s is assumed sparse because of the spatial correlation in the variations. We enforce the sparsity prior by regularizing Equation 4.9 using the $\ell_1$ norm of s, as described in Section 2.2.2:
$$\min \|s\|_1 + \lambda\|Al - b\|_2^2 \tag{4.11}$$
or, equivalently,
$$\min \|s\|_1 + \lambda\|AW^{-1}s - b\|_2^2, \tag{4.12}$$
where $\lambda$ is the regularization coefficient. Sparsity of the wavelet transformation of the variations, s, provides a new piece of information. We call this method the $\ell_1$-regularization method.
4.3.3 Gates on the irregular grids 
As we saw in Section 3.3.3, in practice because of area and logic gate constraints, 
the gates are not located on regular grids. An example of gate placement is shown 
in Figure 3.6. Similar to Section 3.3.3, we overcome this problem by using a dense 
regular grid such that the center of each gate is close to some grid point for all 
the gates in the circuit. We assign the variation of each gate gu to the point on 
the regular grid that is closest to the center of the gate. If there is more than one closest point, we select one of them randomly. The remaining grid points are assigned to free variables that do not correspond to physical gates and do not affect the measurements.
The remainder of the measurement process is similar to Section 4.3.2. The points on the regular grid are mapped to a column vector l, which is measured by a measurement matrix A as in Equation 4.11. Note that if the i-th element of l is a free variable not assigned to any gate variation, then the i-th column of A is zero. The vector l is still spatially correlated, and therefore sparse in the wavelet domain, and can be recovered through s in Equation 4.12. The free variables in the recovered l can be ignored since they do not correspond to physical gates.
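The zero-column construction can be sketched as follows. The helper name `expand_to_grid`, the tiny matrix, and the gate-to-grid assignment are illustrative assumptions; the point is only that the values of free variables never reach the predicted measurements.

```python
import numpy as np

def expand_to_grid(A, assignment, n_grid):
    """Place each gate's column of A at its grid index; free grid points get zero columns."""
    A_grid = np.zeros((A.shape[0], n_grid))
    for gate_col, grid_idx in enumerate(assignment):
        A_grid[:, grid_idx] = A[:, gate_col]
    return A_grid

# Two measurements of three gates, mapped onto a 3 x 3 fine grid (9 points).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
assignment = [0, 1, 8]                      # grid index of each gate (assumed)
A_grid = expand_to_grid(A, assignment, n_grid=9)

# Any values placed on the free grid points leave the measurements unchanged.
l_gates = np.array([0.1, -0.2, 0.3])
l_grid = np.full(9, 7.0)                    # arbitrary values for the free variables
l_grid[assignment] = l_gates
same_measurements = np.allclose(A_grid @ l_grid, A @ l_gates)
```

The free variables are therefore constrained only through the sparsity term on the wavelet coefficients, which is what lets the regular-grid transform be applied to an irregular placement.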
Figure 4.5: The variation estimation error for various regularization factors $\lambda$ (x-axis: regularization coefficient $\lambda$).
4.4 Determining the regularization coefficient $\lambda$
Consider the $\ell_1$-regularization problem,
$$\min \|x\|_1 + \lambda\|Ax - b\|_2^2. \tag{4.13}$$
When $\lambda$ is very small, $\lambda\|Ax - b\|_2^2$ would be small compared to the $\ell_1$-norm term, $\|x\|_1$, and does not affect the objective function dramatically. Thus, the norm-one term $\|x\|_1$ is the main component that determines the solution of the regularization problem; the solution tends to be sparse. On the other hand, when $\lambda$ is very large, $\lambda\|Ax - b\|_2^2$ would be large compared to the norm-one term, $\|x\|_1$, and small changes in $\|Ax - b\|_2^2$ result in large changes in the objective function. In general, $\lambda$ balances between sparsity ($\ell_1$-norm term) and fitting to measurements ($\ell_2$-norm term).
Measurement noise and sparsity of the vector x are two major components that determine $\lambda$. When there is no noise in the measurements, i.e., $Ax_r = b$, the regularization coefficient $\lambda$ should be set to infinity. As the measurement noise increases, we should relax the $\ell_2$-norm constraint or, equivalently, decrease $\lambda$. In addition, sparse vectors imply small $\lambda$. When it is known that the vector x is strongly sparse, one should relax the $\ell_2$-norm constraint (decrease $\lambda$) to obtain a very sparse solution for the problem.
Figure 4.5 shows the estimation error for different regularization coefficients, $\lambda$. As explained, for very small $\lambda$ and very large $\lambda$ the estimation error is high. There is an optimal regularization coefficient $\lambda_{opt}$ at which the variation estimation error is minimum. Optimizing Equation 4.13 for $\lambda = \lambda_{opt}$ leads to the minimum variation estimation error. $\lambda_{opt}$ is a function of the measurement matrix, the measurement noise, and the true variations $x_r$; thus, it is not possible to find $\lambda_{opt}$ exactly.
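Applying the first-order necessary condition for the regularization problem in Equation 4.13 determines a minimum value for $\lambda$. Let
$$J(x) = \|x\|_1 + \lambda\|Ax - b\|_2^2.$$
The first-order necessary condition for an optimal solution implies $\partial J/\partial x_i = 0$, $i = 1 \ldots n$. Thus,
$$\frac{\partial\|x\|_1}{\partial x_i} = -\lambda\frac{\partial\|Ax - b\|_2^2}{\partial x_i}$$
or
$$-\lambda\frac{\partial\|Ax - b\|_2^2}{\partial x_i} = \begin{cases} 1 & x_i > 0 \\ -1 & x_i < 0. \end{cases}$$
Hence,
$$\left\|\frac{\partial}{\partial x}\|Ax - b\|_2^2\right\|_\infty = 2\|A^T(Ax - b)\|_\infty \le \frac{1}{\lambda}. \tag{4.14}$$
As we mentioned before, for very small regularization coefficients $\lambda$, zero is close to the optimal point. Thus, putting x = 0 in Equation 4.14 determines a value for $\lambda$, i.e., if x = 0 is an optimal solution,
$$\lambda \le \lambda_0 = \frac{1}{2\|A^T b\|_\infty}.$$
Kim et al. [40] suggest determining $\lambda$ based on $\lambda_0$. They use $\lambda_1 = 10\lambda_0$. For the problem shown in Figure 4.5, $\lambda_1 = 10\lambda_0 = 5.56 \times 10^{-4}$. This estimate of $\lambda$ is far from $\lambda_{opt}$ ($\lambda_{opt}$ is shown in Figure 4.5).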
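The threshold value can be computed in one line. Setting x = 0 in the first-order condition $2\lambda\|A^T(Ax - b)\|_\infty \le 1$ gives $\lambda_0 = 1/(2\|A^T b\|_\infty)$; for any $\lambda \le \lambda_0$ the all-zero vector minimizes the objective of Equation 4.13. The small matrix below is an arbitrary example, and the function names are illustrative.

```python
import numpy as np

def lambda_zero(A, b):
    """Largest lambda for which x = 0 still minimizes ||x||_1 + lambda*||Ax - b||_2^2."""
    return 1.0 / (2.0 * np.linalg.norm(A.T @ b, ord=np.inf))

def objective(A, b, x, lam):
    """Objective of the l1-regularization problem in Equation 4.13."""
    return np.sum(np.abs(x)) + lam * np.sum((A @ x - b) ** 2)

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
b = np.array([1.0, 1.0])

lam0 = lambda_zero(A, b)        # A^T b = [1, 3], so lam0 = 1 / (2 * 3)
lam1 = 10.0 * lam0              # the heuristic attributed to Kim et al. [40] in the text
j_at_zero = objective(A, b, np.zeros(2), lam0)
```

Values of $\lambda$ at or below `lam0` give the trivial all-zero estimate, so any useful regularization coefficient has to be searched for above it.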
Hale et al. [35] use the distribution of the measurement error to find $\lambda$. Assuming an independent normal distribution for the measurement noise, they suggest
Thus, the value for $\lambda$ corresponding to zero would be
where a = minimum eigenvalue of $AA^T$. For a = 0.05, $\lambda_2$ will be 591.13. It is clearly far from the optimal regularization factor, $\lambda_{opt}$ (Figure 4.5).
To understand the behavior of the best $\lambda$, we study the optimal point curves of the problem. For each $\lambda \in [\lambda_0, \infty)$, let $x_\lambda$ be the solution of the problem in Equation 4.13. Define
$$s(\lambda) = \|x_\lambda\|_1, \qquad t(\lambda) = \|Ax_\lambda - b\|_2. \tag{4.15}$$
$(s(\lambda), t(\lambda))$ defines a curve in the s-t plane. A number of these curves are shown in Figure 4.6. These curves are for different noise levels. The points that are shown by a star on each curve represent the optimal regularization factor, $(s(\lambda_{opt}), t(\lambda_{opt}))$; we call these points optimal points. The figure suggests that the optimal points are approximately on a horizontal line. Thus, we use the following optimization formulation to estimate the variation:
$$\min \|Ax - b\|_2 \quad \text{such that } \|x\|_1 \le c, \tag{4.16}$$
where c is a constant number. We assume $c = \theta\,\mathbb{E}(\|x\|_1)$, where $\theta \in [1.5, 2]$.
Figure 4.6: Optimization curves for various measurement errors (x-axis: norm-two).
4.5 Path selection 
The accuracy of variation estimation is a function of the paths that are used for 
constructing optimization problems. First of all, paths should be sensitizable; i.e., it should be possible to measure the delay of the paths by externally stimulating the primary inputs of the IC. Moreover, the paths should be linearly independent.
Ignoring the measurement noise, dependent paths provide redundant information 
about the variations. 
4.5.1 Sensitizable paths 
As we mentioned in Section 4.1, it might not be possible to find the delay of 
every arbitrary path. Only delays of the sensitizable paths (testable paths) can 
be measured by externally stimulating the IC. 
Figure 4.7: An example of a circuit with sensitizable and unsensitizable paths. 
Figure 4.7 shows examples of a sensitizable path and an unsensitizable path.
Consider the following path in the circuit, P2: a-g2-g-g4-k. To propagate a transition
along the path P2, d should be 1 and h should be 0. Choosing c = 1 and b = 0
satisfies these constraints. Thus, P2 is a sensitizable (testable) path. However,
the path P3: c-f-g3-h-g4-k is not sensitizable. A transition propagates along this path
if and only if g = 0 and e = 0. To satisfy g = 0, we should have a = 1
and d = 1, which contradicts e = 0. Thus, P3 is unsensitizable.
To ensure that a path is sensitizable, we should generate two input vectors
for the circuit such that a transition propagates along the path. Creating such input
vectors might be very complex and take a long time. Thus, we determine whether a path
is testable in two steps: a primary necessity check, followed by automatic test
pattern generation (ATPG) tools.
The primary necessity check is based on the partial path sensitization introduced
by Murakami et al. [56]. Using the topology and functionality of the circuit, they
introduce bf-pairs in the circuit. Each bf-pair consists of a b-line (back line) and
an f-line (forward line). The bf-pairs are determined such that the necessary conditions
for transition propagation in the b-line and the f-line contradict. Thus, a testable path
cannot contain any bf-pair. If a path contains at least one bf-pair, it is not
testable; otherwise, the path is potentially testable.
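The necessity check described above can be sketched as a simple membership test. The sketch assumes that a path "contains" a bf-pair when both the b-line and the f-line lie on it; the line names, the bf-pairs, and the function name `is_potentially_testable` are hypothetical and are not taken from Figure 4.7 or from [56].

```python
def is_potentially_testable(path, bf_pairs):
    """Return False if the path contains both lines of any bf-pair."""
    lines = set(path)
    return not any(b in lines and f in lines for b, f in bf_pairs)


# Hypothetical line names and bf-pairs, for illustration only.
bf_pairs = [("n2", "n7"), ("n4", "n9")]
path_a = ["n1", "n2", "n5", "n7", "n10"]   # contains the pair (n2, n7)
path_b = ["n1", "n2", "n5", "n8", "n10"]   # contains no complete bf-pair

print(is_potentially_testable(path_a, bf_pairs))
print(is_potentially_testable(path_b, bf_pairs))
```

A path that passes this check is only potentially testable; an ATPG run is still needed to confirm it.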
To determine whether a potentially testable path is actually testable, we can use any
ATPG tool to generate input vectors that test the path. In the simulations, we
have used TranGen [77] for the test generation. It is a fast ATPG algorithm
based on SAT solvers.
4.5.2 Basis path set 
In path selection, it is also important to select independent paths. Consider
the following four paths in the circuit shown in Figure 4.7.
P4: c-j-g1-e-g3-h-g4-k
P5: b-g1-e-g3-h-g4-k
P6: c-j-g1-d-g2-g-g4-k
P7: b-g1-d-g2-g-g4-k
For the circuit, it is not hard to verify that
$$d_r(P_4) + d_r(P_7) = d_r(P_5) + d_r(P_6).$$
Thus, these four paths are not independent. Knowing the delays of any three of them
determines the delay of the fourth one.
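The dependency among these four paths can be checked numerically. The sketch below assumes an additive delay model in which the delay of a path is the sum of the contributions of the lines and gates along it, so each path becomes a 0/1 row of an incidence matrix; the variable names are chosen for the example only.

```python
import numpy as np

# Paths from the text; each dash-separated token is a line or gate on the path.
paths = {
    "P4": "c-j-g1-e-g3-h-g4-k",
    "P5": "b-g1-e-g3-h-g4-k",
    "P6": "c-j-g1-d-g2-g-g4-k",
    "P7": "b-g1-d-g2-g-g4-k",
}

elements = sorted({e for p in paths.values() for e in p.split("-")})
column = {e: i for i, e in enumerate(elements)}

# Incidence matrix: M[r, c] = 1 if path r passes through element c.
M = np.zeros((len(paths), len(elements)))
for r, p in enumerate(paths.values()):
    for e in p.split("-"):
        M[r, column[e]] = 1.0

rank = np.linalg.matrix_rank(M)            # number of independent paths
residual = M[0] + M[3] - M[1] - M[2]       # P4 + P7 - P5 - P6

print(rank)
print(np.allclose(residual, 0.0))
```

The rank is one less than the number of paths, which matches the statement that any three delays determine the fourth.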
To efficiently minimize the number of path delay measurements, we should 
restrict the path set to the independent paths. We have used the method proposed 
by Sharma et al. [65] to generate a testable basis set for the underlying circuit. 
It is based on the basis generation algorithm introduced in [42] and [18]. 
Chapter 5 
Applications 
In this chapter, we introduce a number of novel applications for the proposed
variation estimation methods. Our methods are fast, cheap, and applicable to
all combinational circuits. In contrast, the previously proposed methods for
variation estimation are expensive and design specific. Thus, they can hardly be
used in the following applications.
1. Improving modeling and simulation: Modeling a random variable is the
first step in finding its effects on a system. Modeling the process variation
is widely addressed in the literature [8,12,16,23,30,36,43,46,47,50,76,79].
However, there are a limited number of variation measurements that can
be used to fit a specific model and verify it. Our method provides a fast
way to acquire an accurate estimate of the variations in a specific IC.
The introduced variation estimation method can also be used in variation
simulations. Since there are a limited number of variation measurements,
researchers have to use imprecise parametric models of variations in their
simulations. Thus, the simulation results might not be accurate enough. Our
method provides a fast technique to estimate variations, so researchers can
use real variation measurements in their simulations and improve their
evaluations. These models can also be integrated within power simulator
tools for accurate and realistic simulation models.
2. Post-silicon optimization: Traditional VLSI design is based on the pre-
silicon optimizations. Various parameters of the design are considered by 
the designer and they are tuned to meet different constraints of the design. 
The variations are not considered at all; or only the statistical characteris-
tics of variations are considered. 
The static timing analysis (STA) is an example of pre-silicon optimization. 
The goal of STA is finding the longest delay in a specific circuit. The 
variations in delay are not considered in the STA. Delays of the interconnect 
wires and gates are deterministically modeled; then, using the graph model 
of the circuit, the longest path in the circuit is found. However, in the 
statistical static timing analysis (SSTA), the statistical characterizations of 
variations are utilized to improve the longest delay estimation in presence 
of the variations. 
Today's fabrication processes, with their high variability, make post-silicon
optimization necessary. When there is no variation or the variations are
very small, the designer can predict the behavior of the circuit with small
uncertainty. However, in modern fabrication processes, even considering the
statistical characteristics of the process variation might not be enough.
Optimizations after manufacturing (post-silicon) can improve the efficiency of the
IC dramatically [34,52,70,71].
Tschanz et al. [71] used bidirectional adaptive body bias to mitigate effects 
of the intra-die and inter-die variations on the circuits. They have consid-
ered frequency-leakage optimization in which the designer should optimize 
the circuit for the maximum frequency while it meets a number of leakage 
constraints. They vary the body bias to change the threshold voltage of 
the transistors in the circuit. If variations reduce the operating frequency 
then the threshold voltage should be decreased. If variations increase the 
leakage current then the threshold voltage should be increased. Thus, by 
increasing or decreasing the body bias, one can adjust the manufactured 
ICs to meet the frequency and leakage constraints. To mitigate the inter-die 
variation, they suggest optimizing the supply voltage based on the variation 
realization in each IC. Intra-die variations can also be handled using different
reference voltages in different parts of the IC. They need an estimate
of the variations to optimize each circuit separately. Our method can efficiently
provide this estimate.
Pre-silicon optimizations (gate sizing) and post-silicon optimizations (adap-
tive body bias) can be used to reduce the loss of the parametric yield. Mani 
et al. [52] propose a joint optimization method to mitigate effects of varia-
tions on the yield. They show that their method results in a reduction of 
5-35% in the leakage current. 
In all the mentioned post-silicon optimization methods, an estimation of 
variations is necessary to optimize each circuit separately. Our method can 
efficiently provide them such an estimation. 
3. Manufacturing process characterization: The proposed variation estimation
method can be used to characterize the statistical properties of a specific
manufacturing technique. In other words, one can characterize
variations based on the specific manufacturing technology. This characterization
can be used to optimize designs for a specific manufacturing technology.
It can also be used to modify the manufacturing technology in order
to decrease variations.
4. IC identification and fingerprinting: Variations are the result of complicated
nanoscale physical interactions and systematic imperfections of the manufacturing
tools. Thus, it is practically impossible to clone the variations in an
IC; i.e., the variations in each IC are unique and cannot be replicated. This
is an important property that can be used in IC identification and
fingerprinting.
A physical unclonable function (PUF) [31] is a security scheme that uses
the variations in a chip as its secret key. Delay-based PUFs use the delay variations
in the ICs to construct a function in the chips such that the output of the
function depends on the variations. Thus, for the same input, the output of
the function varies across different chips. This unique and unclonable
function in each IC can be used as the secret key.
5. Identifying hot spots: Various sections of an IC dissipate different power
levels. Hot spots are the sections that dissipate more power and become
hot sooner than other sections of the IC. Process variation also affects the
hot spots on the IC. Using the proposed variation estimation method, one can
determine the hot spots of a specific IC in the presence of variations. Thus, these
hot spots can be specifically controlled or cooled down to avoid possible
damage.
6. Workload scheduling: The maximum frequency of the various parts of the IC is
a function of the design and the variations. Knowing the variations in an IC helps
us to find the true power consumption and speed of the different parts of
the IC. Thus, one can develop software that considers process variations
and uses all the resources of the IC optimally.
An example of such software is proposed for workload management of
cache memories by Meng and Joseph [53]. They show that inter-die and
intra-die variations can dramatically affect the leakage current of the ways in
the cache; i.e., the ratio of maximum leakage to minimum leakage under variation might
be 10 to 100. Then, they introduce a way prioritization technique to select
low-leakage ways in cache management. The proposed technique can reduce the
leakage current by approximately 20%. It is important to note that the way
prioritization technique utilizes variation estimation. However, they do not
provide a fast and cheap method for the variation estimation.
Chapter 6 
Evaluation Results 
To verify the accuracy of the proposed methods, we simulated variations in a
number of MCNC benchmark circuits. Then, we used $\ell_1$-regularization, $\ell_2$-
minimization, and TUSC (see Section 3.4) to estimate the variations. The simulation
results show that using $\ell_1$-regularization and TUSC improves the estimates
dramatically.
6.1 Simulations setup 
• The variation model: As explained in Section 2.2.1, we have used a
multivariate Gaussian distribution to model the spatial correlation in the
variations. The model agrees well with the measurement data and is also
used by other researchers [22,32,47,69].
• The transistor model: We have used the BSIM4 model for 65nm technology
in the simulations [13]. The BSIM4 model is developed such that it can
accurately model the behavior of a transistor in the sub-100nm regime.
• Benchmark circuits: We have used a number of MCNC benchmark circuits
in our simulations. The MCNC benchmarks were introduced in 1985 on
magnetic tapes, and they are updated, modified, and enhanced regularly.
The benchmarks are widely used in the design automation community (for
example, see [37,54,74]).
• The $\ell_1$-regularization software: The SPGL1 software package [68] is used
for $\ell_1$-regularization. SPGL1 uses an iterative approach to solve the
LASSO problem. In each iteration, the radius of the $\ell_1$ ball is increased until
convergence. For more details, please see [72].
• The quadratically constrained quadratic program (QCQP) solver: We
have used the SeDuMi (self-dual minimization) software package [61] for
$\ell_2$-minimization and the QCQP in Section 3.4. SeDuMi is maintained at the
Advanced Optimization Lab at McMaster University. It can be used to
solve various symmetric cone problems.
• The ATPG tool: PathATPG [77] is used to identify testable paths and to
generate test input pairs for the testable paths. PathATPG is a fast ATPG
tool that is based on SAT solvers.
• Estimation in a subspace of the variations space: The measurement matrices in
Equations 3.5 and 4.9 are not full rank. Thus, we should not expect to
estimate the variations of all gates; i.e., the null space of the measurement matrix
$A$, $\mathcal{N}(A) = \{y \in \mathbb{R}^N \mid Ay = 0\}$, is not accessible.
Assume $A_K$ is a measurement matrix that includes $K$ measurements (delay
or power). For a large $K$ (say $K > 10N$, where $N$ is the number of
gates), the range of $A_K$ covers almost the whole variation space that can be
measured. Hence, we use the singular vectors of $A_K$ as the comparison space.
By estimation in an $n_e$-dimensional subspace, we mean estimation in the direction
of the first $n_e$ singular vectors of $A_K$.
• As explained in Section 3.2, we use multi-voltage power measurements
to construct the measurement matrix.
• We have used the exponential correlogram function to generate the variations
(see 2.2). We have used the same function as $\gamma$ in Section 3.4.
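To make the variation model in the setup above concrete, the following sketch draws one spatially correlated variation vector from a multivariate Gaussian whose covariance follows an exponential correlogram. The specific form `exp(-d / scale)`, the gate placement on a unit grid, and the values of `sigma` and `scale` are assumptions made for the example; the exact correlogram and parameters used in the simulations are those of Section 2.2.

```python
import numpy as np


def exponential_correlogram(dist, scale):
    """Correlation that decays exponentially with distance (assumed form)."""
    return np.exp(-dist / scale)


def sample_correlated_variation(xy, sigma, scale, rng):
    """Draw one zero-mean Gaussian variation vector for gates located at xy."""
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    cov = sigma ** 2 * exponential_correlogram(dist, scale)
    return rng.multivariate_normal(np.zeros(len(xy)), cov)


# Hypothetical placement: 25 gates on a 5 x 5 unit grid.
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
xy = np.column_stack([gx.ravel(), gy.ravel()])

rng = np.random.default_rng(1)
variation = sample_correlated_variation(xy, sigma=0.1, scale=2.0, rng=rng)
print(variation.shape)
```

Nearby gates receive strongly correlated values under this model, while distant gates are nearly independent.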
6.2 Power tomography results 
In this section, we evaluate the performance of the $\ell_2$-norm optimization, the $\ell_1$-norm
regularization, and TUSC for the chip tomography.
Figure 6.1: Singular values of the measurement matrix. 
6.2.1 Measurement matrix evaluation 
The functionality of the IC imposes dependencies in the logic gate status. Thus, the
power vectors for the input vectors (i.e., the rows of the measurement matrix
$A$) are not necessarily independent. In this section, we use the singular value
decomposition (SVD) to quantify the dependency of the rows of $A$.
A matrix with $N$ independent rows has $N$ non-zero singular values. The
sorted singular values of the C499 and C880 circuits are shown in Figure 6.1 for a
measurement matrix with $M = 6 \times N$ measurements, where $N$ is the number of
gates. In the figure, the singular values for each circuit are normalized such that
the largest singular value is 1. The figure demonstrates that the singular values
decay rapidly; the 20th singular value in both circuits is less than 10% (0.1) of the largest.
Figure 6.2: Variations estimation error vs. percent of the power measurement noise. 
This decay suggests that it is not possible to find the variation of all gates independently
because there is no information about the null space of the measurement
matrix, $\mathcal{N}(A) = \{y \in \mathbb{R}^N \mid Ay = 0\}$. Thus, we can only estimate the variation in
a subspace $S$ that does not contain $\mathcal{N}(A)$.
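The following toy example illustrates why the null space is inaccessible: adding any null-space vector to the variations leaves every measurement unchanged. The matrix size, the random data, and the rank tolerance are assumptions chosen for the illustration and do not correspond to any benchmark circuit.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))            # toy matrix: 4 measurements, 6 unknowns

_, s, Vt = np.linalg.svd(A)                # full SVD, Vt is 6 x 6
tol = s.max() * max(A.shape) * np.finfo(float).eps
rank = int(np.sum(s > tol))
null_basis = Vt[rank:].T                   # columns span the null space of A

x = rng.standard_normal(6)                 # some variation vector
y = null_basis @ rng.standard_normal(null_basis.shape[1])   # arbitrary null-space vector

# Both vectors produce the same measurements, so y cannot be recovered from them.
print(np.allclose(A @ x, A @ (x + y)))
```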
6.2.2 Tomography results in the power framework 
To study the performance of the proposed tomography method, we have simulated
the process variation on a number of MCNC benchmarks. A total of 12% variation
is assumed in the simulations. Based on the data in [16] and [76], 20% of the
total variation is inter-die variation, 60% is spatially correlated intra-die variation,
and 20% is random uncorrelated variation. To model the leakage current (static
power), we used the HSPICE simulator on 65nm CMOS transistor technology.
Figure 6.3: Variation estimation error vs. number of power measurements.
Figure 6.2 presents the variation estimation error for the C499 and the C880
benchmark circuits. The horizontal axis is the power measurement noise and
the vertical axis is the variation estimation error. The variation estimate is
calculated in an N/3-dimensional subspace, where N is the number of gates. Note
that by construction the estimation space is orthogonal to the null space of the
measurement matrix. Thus, for low-noise measurements the $\ell_1$-regularization and
TUSC are very similar. As the noise level increases, TUSC performs better than
the $\ell_1$-norm regularization. Note that $\ell_2$-minimization performs much worse than
$\ell_1$-regularization and TUSC; it is not shown in the figure. Please refer to Table
6.2 for this comparison.
The number of measurements also affects the estimation error. Figure 6.3
presents the variation estimation error versus the number of measurements. The
horizontal axis is the ratio of the number of measurements to the total number of gates in the
circuit. The variation is estimated in an N/4-dimensional subspace. M is 383 and
317 for C499 and C880, respectively (M denotes the number of measurements in
Table 6.2). Note that as the number of measurements increases, they cover most
of the identifiable directions. Thus, the sparsity and shape constraints have a similar effect for a
large number of measurements, and the errors of the $\ell_1$-regularization and TUSC
become nearly the same.
Table 6.1 shows the average number of independent power vectors for single
and multiple voltage measurements. The second column is the number of power
vectors (measurements). To find the number of independent vectors in each
measurement set, we first find their singular values, and then count the number of
non-zero singular values. The third and fourth columns show the number of
independent power vectors for single and triple voltage measurements, respectively.
The table shows that triple voltage measurements increase the number of
independent power vectors.
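The counting procedure described above can be sketched as follows. In floating-point arithmetic, "non-zero" has to be decided with a tolerance; the relative threshold `rel_tol`, the function name `count_independent_rows`, and the random toy matrices are assumptions of the example and are not the values behind Table 6.1.

```python
import numpy as np


def count_independent_rows(A, rel_tol=1e-10):
    """Count singular values that are non-zero relative to the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    if s.size == 0 or s[0] == 0.0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))


rng = np.random.default_rng(0)
base = rng.standard_normal((3, 5))
# Five rows, but the last two are combinations of the first three.
single = np.vstack([base, base[0] + base[1], 2.0 * base[2]])
# Stacking rows from a hypothetical second measurement condition adds new directions.
stacked = np.vstack([single, rng.standard_normal((2, 5))])

print(count_independent_rows(single), count_independent_rows(stacked))
```

The stacked matrix has more independent rows than the single-condition matrix, which mirrors the trend the table reports for triple voltage measurements.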
Table 6.2 shows tomography results on different benchmark circuits. We used 
the software package SIS [67] with NAND2, NAND3, NAND4, NOR2, NOR3, 
NOR4, and inverters to map the circuit to the logic gates. The second column 
shows the number of gates and the third column reports the number of input 
Table 6.1: Average number of independent power vectors for single and triple voltage
measurements.
Circuit   Number of measurements   Single-voltage   3-voltages
C432      185                      132.6            151.6
C499      383                      183.4            265.0
C880      317                      217.0            250.7
C1355     465                      184.4            251.5
C1908     553                      192.9            260.9
C2670     540                      322.9            350.3
alu2      324                      167.9            198.2
alu4      659                      312.9            351.8
comp      127                      84.5             112.7
cordic    79                       55.1             71.1
b9        101                      84.2             92.7
c8        138                      112.0            127.0
Table 6.2: Performance of the $\ell_2$-norm minimization, the $\ell_1$-norm regularization, and
TUSC for a number of MCNC benchmark circuits in the power framework.
Circuit properties                                             3% noise                   6% noise                   9% noise
name    #gates  #inputs  #meas  sigma_{N/2}/sigma_1  subspace  L1-reg.  L2-min.  TUSC     L1-reg.  L2-min.  TUSC     L1-reg.  L2-min.  TUSC
C432    206     36       185    0.0076               61        2.82     6.08     3.97     5.13     12.13    5.57     7.75     18.19    7.46
                                                     92        4.85     10.21    7.40     8.76     20.41    9.58     12.86    30.63    12.27
C499    532     41       383    0.0009               127       2.71     9.87     2.7      4.98     19.77    4.77     7.31     29.67    6.97
                                                     191       7.83     38.08    8.18     13.90    76.40    11.56    20.50    114      15.6
C880    353     60       317    0.004                105       3.20     8.61     2.99     6.06     17.27    5.66     9.01     25.94    8.39
                                                     158       6.03     16.00    5.59     11.27    32.11    10.12    16.72    25.94    8.39
C1355   517     41       465    0.0008               155       4.27     65.19    4.27     7.61     130.7    7.32     11.10    196.3    10.42
                                                     232       15.82    248.3    15.33    26.51    498.2    19.11    37.65    748.3    23.72
C1908   615     33       553    0.0002               184       4.89     44.77    5.19     9.29     89.69    8.35     13.77    134.6    11.87
                                                     276       14.71    113.4    13.05    22.53    227.1    16.78    30.60    340.9    21.83
C2670   900     233      540    4e-5                 180       4.05     5.43     3.76     7.29     10.87    6.95     10.70    16.30    10.24
                                                     270       8.53     11.52    8.37     15.17    23.04    13.75    22.25    34.56    19.56
alu2    360     10       324    0.0014               108       6.35     54.97    5.67     10.12    109.7    9.21     14.29    164.5    13.10
                                                     162       13.61    120.9    12.83    21.54    241.4    17.80    30.37    361.9    23.74
alu4    733     14       659    0.0008               219       6.70     64.01    5.82     11.56    127.96   10.74    16.73    191.9    15.81
                                                     329       13.61    129.53   11.66    21.91    258.9    19.75    31.06    388.4    28.44
comp    163     32       127    0.005                42        2.73     3.94     2.60     4.87     7.74     4.67     7.12     11.56    6.84
                                                     63        4.47     6.34     4.25     7.94     12.42    7.56     11.64    11.56    6.84
cordic  102     23       79     0.005                26        1.87     3.74     3.01     3.23     7.45     3.85     4.67     11.17    4.93
                                                     39        3.35     6.54     6.97     5.84     13.01    8.02     8.46     19.51    9.48
b9      113     41       101    0.014                33        2.51     4.02     3.66     4.68     8.02     5.01     6.90     12.02    6.66
                                                     50        4.00     6.84     6.79     7.42     13.63    8.50     10.97    20.44    10.67
c8      165     28       138    0.008                46        3.50     4.61     4.32     6.01     8.93     6.06     8.74     13.30    8.10
                                                     69        6.22     8.06     8.50     10.56    15.66    10.91    15.26    23.36    13.86
Figure 6.4: Singular values of the measurement matrices decay very fast. (Horizontal axis: sorted singular values index; curves: C880, C499, Random.)
pins. For each circuit, we have measured the path delays for a number of paths
in the testable basis set, reported in the fourth column. The fifth column shows
the ratio of the N/2-th singular value of the measurement matrix to the first one.
The $\ell_1$-regularization, the $\ell_2$-minimization, and the TUSC methods were evaluated
in the M/3- and the M/2-dimensional subspaces, the sizes of which are reported in
the sixth column (M is the number of measurements). The
remaining columns show the results for 3%, 6%, and 9% measurement
noise. On average, the $\ell_1$-regularization and the TUSC perform more than two
times better than the $\ell_2$-minimization in estimating the variations.
6.3 Delay evaluation results 
6.3.1 Measurement matrix and estimation in subspaces 
As mentioned in Section 4.2, due to the existence of ambiguities (path dependencies),
it may not be possible to find the variations for all gates in the circuit. In
other words, the measurement matrix, $A$, is not necessarily a full-rank matrix.
Most often the measurement matrix is ill-conditioned and its singular values
decay rapidly. Figure 6.4 shows the singular values of the measurement matrix for the
C880 and C499 circuits. The singular values are normalized to have the maximum
value equal to 1. The singular values decay to 10% of the maximum after almost
100 singular values. Note that C499 and C880 have 532 and 353 gates, respectively.
The figure also shows the singular values of a random Gaussian matrix. It
is clear that the singular values of the measurement matrices (for C499 and C880) decay
much faster than those of the random Gaussian matrix.
Hence, it is not possible to find the variations of all gates. We measured the
estimation error in the space of the singular vectors. The estimation error is smallest
in the direction of the singular vector corresponding to the largest singular value,
and so on. We say the estimation subspace has dimension $n_e$ when we project the
estimation error onto the space of the first $n_e$ singular vectors.
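The projection described above can be sketched as follows. The relative-error definition (norm of the projected error divided by the norm of the projected true vector), the function name `subspace_error`, and the toy data are assumptions of the example; the text does not state the exact normalization used in the reported results.

```python
import numpy as np


def subspace_error(A, x_true, x_est, n_e):
    """Relative estimation error projected onto the first n_e right singular vectors of A."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    V = Vt[:n_e].T                          # columns: leading right singular vectors
    err = V.T @ (x_est - x_true)            # error coordinates inside the subspace
    ref = V.T @ x_true                      # true vector coordinates inside the subspace
    return np.linalg.norm(err) / np.linalg.norm(ref)


# Toy data (assumed): 30 measurements of 20 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
x_true = rng.standard_normal(20)
_, _, Vt = np.linalg.svd(A, full_matrices=False)

# An error along the weakest singular direction is invisible in a 5-dimensional subspace.
x_est = x_true + 0.5 * Vt[-1]
print(subspace_error(A, x_true, x_est, n_e=5))     # close to zero
print(subspace_error(A, x_true, 2.0 * x_true, n_e=5))
```

Increasing `n_e` includes directions with smaller singular values, where the estimate is less reliable, so the reported error generally grows with the subspace dimension.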
6.3.2 Delay tomography results 
To evaluate the performance of the proposed methods, we simulate the variation 
model (Section 4.1.1) on a number of MCNC benchmark circuits. A total of 12% 
random variations is assumed. Correlated intra-die variation is 60% of the total 
variation [16] [76]; 20% of the total variation is uncorrelated intra-die variation 
and the remaining variation is allotted to the inter-die variation. 
Similar to Section 6.2, we have used SIS software to map the benchmark 
circuits to NAND2, NAND3, NAND4, NOR2, NOR3, NOR4, and inverter gates. 
Then, using Dragon, a placement software package [1], gates are placed on the IC. 
Since various gates cover different areas on the IC, gates are located on irregular 
grids. 
To calculate the falling and rising coefficients ($\xi_{f,g_u}$ and $\xi_{r,g_u}$ in Equation 4.7),
we implemented all the gates with 65nm CMOS transistor technology. Then, we
used the HSPICE software to fit the linear model for all gates.
Figure 6.5 shows the variation estimation error for the $\ell_2$-minimization and the
$\ell_1$-regularization methods. The horizontal axis is the delay measurement noise and
the vertical axis is the variation estimation error. The $\ell_1$-regularization yields more
than a 50% decrease in error over the $\ell_2$-minimization. The estimation subspace
dimension is 84 for both the C499 and C880 circuits. When the measurement noise is small, the delay
measurements provide enough information to estimate the variations accurately. As
measurement noise increases, sparsity does not provide significant information.
Figure 6.5: Variation (delay) estimation error vs. measurement error. (Horizontal axis: measurement noise %; curves: C499-L1 regularization, C499-L2 minimization, C880-L1 regularization, C880-L2 minimization.)
Thus, the advantage of the $\ell_1$-regularization over the $\ell_2$-minimization increases as
the measurement noise increases.
The effect of the number of measurements is illustrated in Figure 6.6. The
horizontal axis is the number of delay measurements divided by the number of
gates. Again, the $\ell_1$-regularization performs more than two times better than
the $\ell_2$-minimization. In the figure, the estimation subspace dimension is 84 for both the C499
and C880 circuits.
Next, we evaluate the basis path sets for the benchmark circuits. The method
introduced in Section 4.5.2 provides a heuristic procedure for basis path selection.
However, it does not necessarily result in an independent basis path set that covers the whole
space. Table 6.3 shows the number of basis paths in the benchmark circuits.
Table 6.3: Number of independent paths and independent linear equations. For each
path, rising and falling transitions result in different linear equations.
Circuit   # gates   # basis paths   # independent paths   # independent linear equations
C432      206       199             121                   153
C499      532       422             271                   375
C880      353       351             184                   253
C1355     517       480             233                   335
C1908     615       590             318                   414
C2670     900       979             422                   632
alu2      360       368             183                   230
alu4      733       693             337                   449
comp      163       131             84                    122
cordic    102       92              59                    77
b9        113       142             75                    90
c8        165       201             96                    119
Figure 6.6: Variation (delay) estimation error vs. the number of measurements. 
The fourth column is the number of independent paths in each basis path set. As
mentioned before, two linear equations can be written for each path (rising and
falling transitions). The last column of the table is the number of independent linear
equations that each basis path set provides.
Finally, Table 6.4 shows the results of variation estimation on 12 benchmark circuits.
After the benchmark's name, the first, the second, and the third columns
are the number of gates, the number of inputs in the circuit, and the number of
delay measurements, respectively. The fourth column is the ratio of the N/2-th
singular value to the first singular value of the measurement matrix (N is the number
of gates). This column shows how fast the singular values decay, or how
well conditioned the measurement matrix is. The fifth column is the estimation sub-
Table 6.4: Performance of $\ell_2$-norm minimization and $\ell_1$-norm regularization for a number
of MCNC benchmark circuits.
Circuit properties                                             3% noise            6% noise            9% noise
name    #gates  #inputs  #meas  sigma_{N/2}/sigma_1  subspace  L1 error  L2 error  L1 error  L2 error  L1 error  L2 error
C432    206     36       199    0.035                39        6.05      7.15      10.38     13.72     14.88     20.42
                                                     66        10.13     12.29     16.18     22.47     22.8      32.93
C499    532     41       422    0.022                84        7.31      13.15     10.82     25.72     15.29     38.41
                                                     140       11.10     20.47     16.12     39.0      22.69     57.94
C880    353     60       421    0.036                84        4.52      8.93      8.42      17.81     12.41     26.71
                                                     140       7.71      13.12     14.86     26.06     21.95     39.04
C1355   517     41       480    0.0211               96        5.00      8.19      9.04      16.39     12.61     24.58
                                                     160       6.35      9.50      11.90     19.00     17.07     28.50
C1908   615     33       590    0.020                118       4.89      7.51      8.87      14.66     13.0      21.89
                                                     196       7.9       12.54     13.92     24.30     20.32     36.20
C2670   900     233      979    0.022                194       8.68      21.76     11.34     41.48     14.99     61.47
                                                     326       10.42     21.83     14.61     41.37     19.52     61.29
alu2    360     10       368    0.015                73        5.20      6.06      7.75      9.83      10.66     13.99
                                                     122       10.22     11.59     14.53     17.98     19.43     25.11
alu4    733     14       693    0.010                138       5.94      10.06     9.84      19.89     14.21     29.79
                                                     231       10.60     16.51     15.70     32.76     21.99     49.10
comp    163     32       131    0.023                26        5.18      11.00     8.07      21.23     11.08     31.53
                                                     43        7.60      15.92     13.52     31.16     19.38     46.45
cordic  102     23       92     0.03                 18        4.43      26.72     7.11      53.41     10.09     80.11
                                                     30        9.75      62.83     14.57     125       20.27     188
b9      113     41       142    0.076                28        2.12      2.22      3.51      3.75      5.04      5.43
                                                     47        4.27      4.94      6.43      8.04      8.97      11.48
c8      165     28       201    0.039                40        11.03     17.51     16.15     31.19     21.52     45.52
                                                     67        25.70     41.12     33.43     74.30     43.00     109
space. The rest of the columns represent the estimation error (in percent) for $\ell_2$-
minimization and $\ell_1$-regularization with 3%, 6%, and 9% measurement
noise.
Chapter 7 
Conclusion 
We proposed a fast and inexpensive method for the gate-level variations estima-
tion in the power and the delay frameworks. In the power framework, the total 
power consumption is measured for a number of input vectors to the IC. Because 
of the variations, the power consumption of the gates in the circuit will be scaled. 
Using the leakage model of variations, we construct a linear equation for each 
power measurement with the scaling factors of the gates as the unknown vari-
ables. In the delay framework, the linear equations are constructed by measuring 
delays of a sensitizable basis path set. Here, unknown variables are the variations 
in the gate sizing that have a linear relationship with the delay. 
Next, we estimate the gate-level variations (power or delay) by solving the
appropriate system of linear equations. We can use the traditional $\ell_2$-minimization
to estimate the gate-level variations. Since there are not enough linearly
independent measurements, the $\ell_2$-minimization method performs poorly. However,
it is widely known that variations (power or delay) are spatially correlated; i.e.,
nearby gates are expected to have similar variations. Because of the spatial
correlations in the variations, there exists a basis in which the variations can be represented
sparsely. The sparse representation suggests using the compressive sensing theory.
We show how to use the compressive sensing theory to improve the post-silicon
characterization. We also modify the traditional $\ell_2$-minimization by adding the
spatial constraint directly. The spatial constraints force nearby gates to
have similar variations. The proposed method uses only the external input/output pins
of the IC for the estimation. In the power framework, first, a number of input
vectors are applied to the IC and the power consumption is measured for each input
vector. Next, we establish an optimization problem based on the power
measurements. Finally, we improve the optimization problem using the spatial correlation in
the variations. In the delay framework, we follow the same procedure as in
the power framework. However, one can measure path delays only on sensitizable
paths. Thus, here, the optimization problem is constructed based on the delay
measurements in a set of testable basis paths.
The variations can affect various properties of the IC, and estimating the variations
in an IC suggests a number of applications such as post-silicon optimizations.
Evaluation results verify our method. We showed that, compared to the traditional
$\ell_2$-minimization, $\ell_1$-regularization can improve the variation estimation by about 80%
on average.
Bibliography 
[1] http://er.cs.ucla.edu/dragon/.
[2] A. Agarwal, D. Blaauw, and V. Zolotov. Statistical clock skew analysis 
considering intra-die process variations. In International Conference on 
Computer-Aided Design, page 914, 2003. 
[3] A. Agarwal, D. Blaauw, and V. Zolotov. Statistical timing analysis for intra-
die process variations with spatial correlations. In IEEE/ACM International 
Conference on Computer-Aided Design, page 900, 2003. 
[4] A. Agarwal, D. Blaauw, V. Zolotov, S. Sundareswaran, M. Zhao, K. Gala, 
and R. Panda. Statistical delay computation considering spatial correlations. 
In Conference on Asia South Pacific Design Automation, pages 271-276, 
2003. 
[5] A. Agarwal, K. Kang, and K. Roy. Accurate estimation and modeling of 
total chip leakage considering inter- and intra-die process variations. In 
International Conference on Computer-Aided Design, pages 736-741, 2005. 
[6] A. Agarwal, V. Zolotov, and D. Blaauw. Statistical clock skew analysis
considering intradie-process variations. IEEE Transactions on Computer-
Aided Design of Integrated Circuits and Systems, 23(8):1231-1242, 2004.
[7] K. Agarwal, F. Liu, C. McDowell, S. Nassif, K. Nowka, M. Palmer, 
D. Acharyya, and J. Plusquellic. A test structure for characterizing local 
device mismatches. In VLSI Circuits, Digest of Technical Papers., pages 
67-68, 2006. 
[8] I. Ahsan, N. Zamdmer, O. Glushchenkov, R. Logan, E. Nowak, H. Kimura, 
J. Zimmerman, G. Berg, J. Herman, E. Maciejewski, A. Chan, A. Azuma, 
S. Deshpande, B. Dirahoui, G. Freeman, A. Gabor, M. Gribelyuk, S. Huang, 
M. Kumar, K. Miyamoto, D. Mocuta, and Mahoro. Rta-driven intra-die 
variations in stage delay, and parametric sensitivities for 65nm technology. 
In VLSI Technology, Digest of Technical Papers., pages 170-171, 2006. 
[9] M. Ashouei, M. M. Nisar, A. Chatterjee, A. D. Singh, and A. U. Diril. 
Probabilistic self-adaptation of nanoscale cmos circuits:' Yield maximization 
under increased intra-die variations. In International Conference on VLSI 
Design held jointly with 6th International Conference: Embedded Systems, 
pages 711-716, 2007. 
[10] R. Baraniuk. A lecture on compressive sensing. IEEE Signal Processing 
Magazine, 24(4):118-121, 2007. 
86 
[11] S. Bhardwaj and S. Vrudhula. A fast and accurate approach for full chip 
leakage analysis of nano-scale circuits considering intra-die correlations. In 
International Conference on VLSI Design held jointly with 6th International 
Conference: Embedded Systems, pages 589-594, 2007. 
[12] S. Bhardwaj, S. Vrudhula, P. Ghanta, and Y. Cao. Modeling of intra-die pro-
cess variations for accurate analysis and optimization of nano-scale circuits. 
In Conference on Design Automation, pages 791-796, 2006. 
[13] BSIM Research Group, http://www-device.eecs.berkeley.edu/ bsim3/bsim4.html, 
seen in June, 2008. 
[14] S. M. Burns, M. Ketkar, N. Menezes, K. A. Bowman, J. W. Tschanz, and 
V. De. Comparative analysis of conventional and statistical design tech-
niques. In Conference on Design Automation, pages 238-243, 2007. 
[15] E. Candes. Compressive sampling. In Int. Congress of Mathematics, pages 
1433-1452, 2006. 
[16] Y. Cao and L. T. Clark. Mapping statistical process variations toward circuit 
performance variability: an analytical modeling approach. In Conference on 
Design Automation, pages 658-663, 2005. 
[17] H. Chang and S. Sapatnekar. Statistical timing analysis under spatial cor-
relations. IEEE Transactions Computer-Aided Design of Integrated Circuits 
and Systems, 24(9):1467-1482, 2005. 
87 
[18] K. T. Cheng and H. C. Chen. Delay testing for non-robust untestable cir-
cuits. In IEEE International Test Conference on Designing, Testing, and 
Diagnostics - Join Them, pages 954-961, 1993. 
[19] S. H. Choi, B. C. Paul, and K. Roy. Novel sizing algorithm for yield im-
provement under process variation in nanometer technology. In Conference 
on Design Automation, pages 454-459, 2004. 
[20] B. Cline, K. Chopra, D. Blaauw, and Y. Cao. Analysis and modeling of CD 
variation for statistical static timing. In Conference on Design Automation, 
pages 60-66, 2006. 
[21] A. Datta, S. Bhunia, S. Mukhopadhyay, N. Banerjee, and K. Roy. Statistical 
modeling of pipeline delay and design of pipeline under process variation to 
enhance yield in sub-lOOnm technologies. In Conference on Design, Automa-
tion and Test in Europe, pages 926-931, 2005. 
[22] Q. Ding, R. Luo, H. Wang, H. Yang, and Y. Xie. Modeling the impact 
of process variation on critical charge distribution. In International SOC 
Conference, pages 243-246, 2006. 
[23] J. Doh, D. Kim, S. Lee, J. Lee, Y. Park, M. Yoo, and J. Kong. A unified 
statistical model for inter-die and intra-die process variation. In International 
Conference on Simulation of Semiconductor Processes and Devices, pages 
131-134, 2005. 
[24] D. L. Donoho. Compressed sensing. IEEE Transaction on Information 
Theory, 52(4): 1289-1306, 2006. 
[25] D. L. Donoho, M. Vetterli, R. A. DeVore, and I. Daubechies. Data com-
pression and harmonic analysis. IEEE Transaction on Information Theory, 
44(6):2435-2476, 1998. 
[26] M. Eisele, J. Berthold, D. Schmitt-Landsiedel, and R. Mahnkopf. The impact 
of intra-die device parameter variations on path delays and on the design for 
yield of low voltage digital circuits. IEEE Transactions on Very Large Scale 
Integration (VLSI) Systems, 5(4):360-368, 1997. 
[27] M. Eisele, J. Berthold, R. Thewes, E. Wohlrab, D. Schmitt-Landsiedel, and 
W. Weber. Intra-die device parameter variations and their impact on digital 
cmos gates at low supply voltages. In International Electron Devices Meeting, 
pages 67-70, 1995. 
[28] F. Fallah and P. Massoud. Standby and active leakage current control and 
minimization in cmos vlsi circuits. IEICE Trans Electron (Inst Electron Inf 
Commun Eng), E88-C(4):509-519, 2005. 
[29] Z. Feng, P. Li, and Y. Zhan. Fast second-order statistical static timing 
analysis using parameter dimension reduction. In Conference on Design 
Automation, pages 244-249, 2007. 
89 
[30] P. Friedberg, Y. Cao, J. Cain, R. Wang, J. Rabaey, and C. Spanos. Modeling 
within-die spatial correlation effects for process-design co-optimization. In 
International Symposium on Quality of Electronic Design, pages 5 i6 -521 , 
2005. 
[31] B. Gassend, D. Clarke, M. V. Dijk, and S. Devadas. Silicon physical random 
funct ions. In ACM Conference on Computer and Communications Security, 
pages 148-160, 2002. 
[32] P. Ghanta and S. Vrudhula. Analysis of power supply noise in the presence 
of process variations. In IEEE Design Test, pages 256-266, 2007. 
[33] P. Ghanta, S. Vrudhula, S. Bhardwaj, and R. Panda. Stochastic variational 
analysis of large power grids considering intra-die correlations. In Conference 
on Design Automation, pages 211-216, 2006. 
[34] J. Gregg and T. W. Chen. Post silicon power/performance optimization in 
the presence of processvariations using individual well adaptive body biasing 
(iwabb). In International Symposium on Quality Electronic Design, pages 
453-458, 2004. 
[35] E. T. Hale, W. Yin, and Y. Zhang. A fixed-point continuation method for 
Ll-regularization with application to compressed sensing. Rice University, 
CAAM Technical Report, (TR07-07), 2007. 
90 
[36] B. Hargreaves, H. Hult, and S. Reda. Intra-die process variations: How 
accurately can they be statistically modeled? In Conference on Asia-pacific 
Design Automation, pages 524-530, 2008. 
[37] J. Hlavicka and P. Fiser. A heuristic method of two-level logic synthesis. 
In World Multiconference on Systemics, Cybernetics and Informatics, pages 
524-530, 2001. 
[38] V. Iyengar, J. Xiong, S. Venkatesan, V. Zolotov, D. Lackey, P. Habitz, 
and C. Visweswariah. Variation-aware performance verification using at-
speed structural test and statistical timing. In International Conference on 
Computer-Aided Design, pages 405-412, 2007. 
[39] V. Khandelwal and A. Srivastava. A general framework for accurate sta-
tistical timing analysis considering correlations. In Conference on Design 
Automation, pages 89-94, 2005. 
[40] S.-J. Kim, K. Koh, M. L. ans S. Boyd, and D. Gorinevsky. An interior-point 
method for large-scale 11-regularized least squares. IEEE Journal of Selected 
Topics in Signal Processing, 1(4):606-617, 2007. 
[41] K. Lakshmikumar, R. A. Hadaway, and M. Copeland. Characterisation and 
modeling of mismatch in mos transistors for precision analog design. IEEE 
Journal of Solid-State Circuits, 21(6):1057-1066, 1986. 
91 
[42] J. D. Lesser and J. J. Shedletsky." An experimental delay test generator for 
lsi logic. IEEE Transactions on Computers, 29(3):235-248, 1980. 
[43] X. Li, J. Le, L. T. Pileggi, and A. Strojwas. Projection-based perfor-
mance modeling for inter/intra-die variations. In International Conference 
on Computer-Aided Design, pages 721-727, 2005. 
[44] X. Li, P. Li, and L. T. Pileggi. Parameterized interconnect order reduction 
with explicit-and-implicit multi-parameter moment matching for inter/intra-
die variations. In International Conference on Computer-Aided Design, 
pages 806-812, 2005. 
[45] B. Lin. methods to print optical images at low kl factors. SPIE, 1264:2-13, 
1990. 
[46] B. Liu. spatial correlation extraction via random field simulation and pro-
duction chip performance regression. In Conference of Design, Automation, 
and Test in Europ, pages - , 2008. 
[47] F. Liu. A general framework for spatial correlation modeling in vlsi design. 
In Conference on Design Automation, pages 817-822, 2007. 
[48] Q. Liu and S. Sapatnekar. Confidence scalable post-silicon statistical delay 
prediction under process variations. In Conference on Design Automation, 
pages 497-502, 2007. 
92 
[49] X. Lu, Z. Li, W. Qiu, D. M. H. Walker, and W. Shi. Longest path selec-
tion for delay test under process variation. In Conference on Asia South 
Pacific Design Automation: Electronic Design and Solution Fair, pages 9 8 -
103, 2004. 
[50] J. Luo, S. Sinha, Q. Su, J. Kawa, and C. Chiang. An ic manufacturing yield 
model considering intra-die variations. In Conference on Design Automation, 
pages 749-754, 2006. 
[51] H. Mangassarian and M. Anis. On statistical timing analysis with inter-
and intra-die variations. In Conference on Design, Automation and Test in 
Europe, pages 132-137, 2005. 
[52] M. Mani, A. Singh, and M. Orshansky. Joint design-time and post-silicon 
minimization of parametric yield loss using adjustable robust optimization. 
In International Conference on Computer-Aided Design, pages 19-26, 2006. 
[53] K. Meng and R. Joseph. Process variation aware cache leakage management. 
In International Symposium on Low Power Electronics and Design, pages 
262-267, 2006. 
[54] A. Mishchenko, S. Chatterjee, and R. Brayton. Dag-aware aig rewriting a 
fresh look at combinational logic synthesis. In Conference on Design Au-
tomation, pages 532-535, 2006. 
93 
[55] T. Mizuno, J. Okumtura, and A. Toriumi. Experimental study of threshold 
voltage fluctuation due tostatistical variation of channel dopant number in 
mosfet 's . IEEE Transactions on Electron Devices, 41(11):2216-2221, 1994. 
[56] A. Murakami, S. Kajihara, T. Sasao, I. Pomeranz, and S. M. Reddy. Selec-
tion of potentially testable path delay faults for test generation. In IEEE 
International Test Conference, page 376, 2000. 
[57] M. Orshansky, L. Milor, P. Chen, K. Keutzer, and C. Hu. Impact of system-
atic spatial intra-chip gate length variability on performance of high-speed 
digital circuits. In International Conference on Computer-Aided Design, 
pages 62-67, 2000. 
[58] A. Ramalingam, G. Nam, A. Singh, M. Orshansky, S. Nassif, and D. Pan. An 
accurate sparse matrix based framework for statistical static timing analy-
sis. In International Conference on Computer-Aided Design, pages 231-236, 
2006. 
[59] R. Rao, A. Srivastava, D. Blaauw, and D. Sylvester. Statistical estimation 
of leakage current considering inter- and intra-die process variation. In In-
ternational Symposium on Low Power Electronics and Design, pages 84-89, 
2003. 
[60] R. Rao, A. Srivastava, D. Blaauw, and D. Sylvester. Statistical analysis of 
subthreshold leakage current for vlsi circuits. IEEE Transactions on Very 
94 
Large Scale Integration (VLSI) Systems, 12(2): 131-139, 2004. 
[61] SeDuMi: self-dual minimization, http://sedumi.mcmaster.ca/, seen in June, 
2008. 
[62] D. Shamsi, P. Boufounos, and F. Koushanfar. Noninvasive leakage power 
tomography of integrated circuits by compressive sensing. In International 
Symposium on Low Power Electronics and Design, pages - , 2008. 
[63] D. Shamsi, P. Boufounos, and F. Koushanfar. Post-silicon timing character-
ization by compressed sensing. In International Conference on Computer-
Aided Design, pages - , 2008. 
[64] M. Sharma and J. Patel. Bounding circuit delay by testing a very small 
subset of paths. In IEEE VLSI Test Symposium, pages 333-341, 2000. 
[65] M. Sharma and J. Patel. Finding a small set of longest testable paths that 
cover every gate. In IEEE International Test Conference, pages 974-982, 
2002. 
[66] J.-B. Shyu, G. Temes, and K. Yao. Random errors in mos capacitors. IEEE 
Journal of Solid-State Circuits, 17(6): 1070-1076, 1982. 
[67] SIS: Synthesis of both synchronous and asynchronous sequential circuits. 
http://embedded.eecs.berkeley.edu/pubs/downloads/sis/index.htm, seen in 
June, 2008. 
95 
[68] SPGL1: A solver for sparse reconstruction. 
http://www.cs.ubc.ca/labs/scl/spgll/, seen in June, 2008. 
[69] A. Srivastava, D. Sylvester, and D. Blaauw. Statistical optimization of leak-
age power considering process variations using dual-vth and sizing. In Con-
ferenc on Design Automation e, pages 773-778, 2004. 
[70] J. L. Tsai, D. Baik, C.-P. Chen, and K. Saluja. A yield improvement method-
ology using pre- and post-silicon statistical clock scheduling. In International 
Conference on Computer-Aided Design, pages 611-618, 2004. 
[71] J. Tschanz, J. Kao, S. Narendra, R. Nair, D. Antoniadis, A. Chandrakasan, 
and V. De. Adaptive body bias for reducing impacts of die-to-die and within-
die parameter variations on microprocessor frequency and leakage, pages 
1396-1402, 2002. 
[72] E. van den Berg and M. P. Friedlander. Probing the pareto frontier for basis 
pursui t solutions. To appear in SIAM J. on Scientific Computing, 2008. 
[73] M. Vetterli. Wavelets, approximation, and compression. IEEE Signal Pro-
cessing Magazine, 18(5):59-73, 2001. 
[74] P. Vuillod, L. Benini, and G. D. Micheli. Generalized matching from the-
ory to applicat ion. In International Conference on Computer-Aided Design, 
pages 13-20, 1997. 
96 
[75] R. Wagner, R. Baraniuk, S. Du, D. Johnson, and A. Cohen. An architecture 
for distributed wavelet analysis and processing in sensor networks. In Inter-
national Conference on Information Processing in Sensor Networks, pages 
243-250, 2006. 
[76] J. Xiong, V. Zolotov, and L. He. Robust extraction of spatial correlation. 
In International Symposium on Physical Design, pages 2 -9 , 2006. 
[77] K. Yang, K. Cheng, and L. Wang. Trangen: a sat-based atpg for path-
oriented t rans i t ion faults . In Conference on Asia South Pacific Design Au-
tomation, pages 92-97, 2004. 
[78] Y. Zhan, A. J. Strojwas, X. Li, L. T. Pileggi, D. Newmark, and M. Sharma. 
Correlation-aware statistical timing analysis with non-gaussian delay distri-
butions. In Conference on Design Automation, pages 77-82, 2005. 
[79] W. Zhao, Y. Cao, F. Liu, K. Agarwal, D. Acharyya, and S. N. K. Nowka. 
Rigorous extraction of process variations for 65nm cmos design. In European 
Solid State Device Research Conference, pages 89-92, 2007. 
97 
