Analysis and Design of Resilient VLSI Circuits by Garg, Rajesh
ANALYSIS AND DESIGN OF RESILIENT VLSI CIRCUITS
A Dissertation
by
RAJESH GARG
Submitted to the Ofce of Graduate Studies of
Texas A&M University
in partial fulllment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
May 2009
Major Subject: Computer Engineering
Approved by:
Chair of Committee, Sunil P. Khatri
Committee Members, Peng Li
Krishna R. Narayanan
Duncan M. Walker
Kevin Nowka
Head of Department, Costas N. Georghiades
ABSTRACT
Analysis and Design of Resilient VLSI Circuits. (May 2009)
Rajesh Garg, B. Tech., Indian Institute of Technology-Delhi, India;
M.S., Texas A&M University
Chair of Advisory Committee: Dr. Sunil P. Khatri
The reliable operation of Integrated Circuits (ICs) has become increasingly difficult to
achieve in the deep sub-micron (DSM) era. With continuously decreasing device feature
sizes, combined with lower supply voltages and higher operating frequencies, the noise
immunity of VLSI circuits is decreasing alarmingly. Thus, VLSI circuits are becoming
more vulnerable to noise effects such as crosstalk, power supply variations and
radiation-induced soft errors. Among these noise sources, soft errors (errors caused by radiation
particle strikes) have become an increasingly troublesome issue for memory arrays as well
as combinational logic circuits. Also, in the DSM era, process variations are increasing
at an alarming rate, making it more difficult to design reliable VLSI circuits. Hence, it
is important to efficiently design robust VLSI circuits that are resilient to radiation
particle strikes and process variations. This dissertation presents several
analysis and design techniques with the goal of realizing VLSI circuits which are tolerant
to radiation particle strikes and process variations.
This dissertation consists of two parts. The first part proposes four analysis and two
design approaches to address radiation particle strikes. The analysis techniques for
radiation particle strikes include: an approach to analytically determine the pulse width
and the pulse shape of a radiation-induced voltage glitch in combinational circuits, a
technique to model the dynamic stability of SRAMs, and a 3D device-level analysis of the
radiation tolerance of voltage scaled circuits. Experimental results demonstrate that the
proposed techniques for analyzing radiation particle strikes in combinational circuits and
SRAMs are fast and accurate compared to SPICE. Therefore, these analysis approaches
can be easily integrated in a VLSI design flow to analyze the radiation tolerance of such
circuits, and harden them early in the design flow. From the 3D device-level analysis of the
radiation tolerance of voltage scaled circuits, several non-intuitive observations are made and
correspondingly, a set of guidelines is proposed, which are important to consider in order to
realize radiation hardened circuits. Two circuit level hardening approaches are also presented
to harden combinational circuits against a radiation particle strike. These hardening
approaches significantly improve the tolerance of combinational circuits against low and very
high energy radiation particle strikes respectively, with modest area and delay overheads.
The second part of this dissertation addresses process variations. A technique is
developed to perform sensitizable statistical timing analysis of a circuit, and thereby improve the
accuracy of timing analysis under process variations. Experimental results demonstrate that
this technique is able to significantly reduce the pessimism due to two sources of inaccuracy
which plague current statistical static timing analysis (SSTA) tools. Two design approaches
are also proposed to improve the process variation tolerance of combinational circuits and
voltage level shifters (which are used in circuits with multiple interacting power supply
domains), respectively. The variation tolerant design approach for combinational circuits
significantly improves the resilience of these circuits to random process variations, with a
reduction in the worst case delay and low area penalty. The proposed voltage level shifter
is faster, requires lower dynamic power and area, has lower leakage currents, and is more
tolerant to process variations, compared to the best known previous approach.
In summary, this dissertation presents several analysis and design techniques which
significantly augment the existing work in the area of resilient VLSI circuit design.
To my family
ACKNOWLEDGMENTS
I am very grateful to my advisor, Dr. Sunil P. Khatri, for giving me the opportunity to
work under him. Without his constant guidance, suggestions and encouragement, this work
would not have been possible. I owe him my gratitude for showing me this way of research.
He has supported and encouraged me whenever I needed him and answered all my
questions very openly. The informal talks that I had with him have been a constant source of
knowledge and inspiration. I also want to thank him for all the facilities and support he has
given to me. Thanks a lot, Sunil, for everything.
I would also like to thank my committee members, Dr. Peng Li, Dr. Hank Walker,
Dr. Kevin Nowka, and Dr. Krishna Narayanan, for giving me their valuable feedback and
suggestions. Their suggestions have been very helpful in improving my research work.
I would also like to express my sincere thanks to Kanupriya, Karandeep,
Charu, Suganth, and Nikhil, for their constant support, valuable comments and guidance.
They have also helped me in learning new things, given time to discuss the problems, and
have been a source of inspiration all along.
I would also like to thank my parents, brother and sister, who taught me the value of
hard work by their own example. I would also like to share my moment of happiness with
them. Without their encouragement and confidence in me, I would have never been able to
pursue and complete my Doctoral study.
Finally, I would like to thank all my friends, who directly and indirectly supported and
helped me in completing this dissertation.
TABLE OF CONTENTS
CHAPTER Page
I INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . 1
I-A. Background and Motivation . . . . . . . . . . . . . . . . . 2
I-A.1. Radiation Particle Strikes . . . . . . . . . . . . . . 2
I-A.1.a. Physical Origin of Radiation Particles . . . . 4
I-A.1.b. Charge Deposition Mechanisms . . . . . . . 5
I-A.1.c. Charge Collection Mechanisms . . . . . . . 7
I-A.1.d. Circuit Level Modeling of a Radiation
Particle Strike . . . . . . . . . . . . . . . . . 10
I-A.1.e. Impact of Technology Scaling on the
Radiation Tolerance of VLSI Design . . . . . 11
I-A.2. Process Variations . . . . . . . . . . . . . . . . . . 13
I-A.2.a. Impact of Technology Scaling on Pro-
cess Variations . . . . . . . . . . . . . . . . 15
I-B. Dissertation Overview . . . . . . . . . . . . . . . . . . . . 17
I-C. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 22
II RADIATION ANALYSIS - ANALYTICAL DETERMINATION
OF RADIATION-INDUCED PULSE WIDTH IN COMBINA-
TIONAL CIRCUITS . . . . . . . . . . . . . . . . . . . . . . . . . 23
II-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 23
II-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 26
II-C. Proposed Analytical Model for the Pulse Width of
Radiation-induced Voltage Glitch . . . . . . . . . . . . . . 28
II-C.1. Radiation Particle Strike at the Output of an Inverter 29
II-C.2. Classication of Radiation Particle Strikes . . . . . 31
II-C.3. Overview of the Model for Determining the
Pulse Width of the Voltage Glitch . . . . . . . . . . 32
II-C.4. Derivation of the Proposed Model for Deter-
mining the Pulse Width of the Voltage Glitch . . . 34
II-C.4.a. Voltage Glitch Magnitude VGM . . . . . . . . 35
II-C.4.b. Derivation of the Expression for t1 . . . . . . 38
II-C.4.c. Derivation of the Expression for t2 . . . . . . 38
II-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 42
II-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 45
III RADIATION ANALYSIS - ANALYTICAL DETERMINATION
OF THE RADIATION-INDUCED PULSE SHAPE . . . . . . . . . 47
III-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 47
III-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 49
III-C. Proposed Analytical Model for the Shape of Radiation-
induced Voltage Glitch . . . . . . . . . . . . . . . . . . . 49
III-C.1. Overview of the Proposed Model for Deter-
mining the Pulse Shape of the Voltage Glitch . . . 51
III-C.2. Derivation of the Model for Determining the
Shape of the Radiation-induced Voltage Glitch . . . 54
III-C.2.a. Voltage Glitch Magnitude VGM . . . . . . . . 55
III-C.2.b. Derivation of the Expressions for Case 3 . . . 60
III-C.2.c. Derivation of the Expressions for Case 2 . . . 61
III-C.2.d. Derivation of the Expressions for Case 1 . . . 62
III-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 63
III-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 69
IV RADIATION ANALYSIS - MODELING DYNAMIC STABIL-
ITY OF SRAMS IN THE PRESENCE OF RADIATION PAR-
TICLE STRIKES . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
IV-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 70
IV-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 72
IV-C. Proposed Model for the Dynamic Stability of SRAMs
in the Presence of Radiation Particle Strikes . . . . . . . . 73
IV-C.1. Weak Coupling Mode Analysis . . . . . . . . . . . 77
IV-C.2. Strong Feedback Mode Analysis . . . . . . . . . . 80
IV-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 81
IV-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 83
V RADIATION ANALYSIS - 3D SIMULATION AND ANALY-
SIS OF THE RADIATION TOLERANCE OF VOLTAGE SCALED
DIGITAL CIRCUITS . . . . . . . . . . . . . . . . . . . . . . . . 85
V-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 85
V-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 87
V-C. Simulation Setup . . . . . . . . . . . . . . . . . . . . . . 88
V-C.1. NMOS Device Modeling and Characterization . . . 90
V-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 91
V-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 101
VI RADIATION HARDENING - CLAMPING DIODE BASED RA-
DIATION TOLERANT CIRCUIT DESIGN APPROACH . . . . . . 103
VI-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 103
VI-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 104
VI-C. Proposed Clamping Diode based Radiation Hardening . . . 106
VI-C.1. Operation of Radiation-induced Voltage Clamp-
ing Devices . . . . . . . . . . . . . . . . . . . . . 106
VI-C.1.a. PN Junction Diode . . . . . . . . . . . . . . 108
VI-C.1.b. Diode Connected Device . . . . . . . . . . . 109
VI-C.2. Critical Depth for a Gate . . . . . . . . . . . . . . 110
VI-C.3. Circuit Level Radiation Hardening . . . . . . . . . 112
VI-C.4. Alternative Circuit Level Radiation Hardening . . . 112
VI-C.5. Final Circuit Selection . . . . . . . . . . . . . . . 114
VI-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 114
VI-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 126
VII RADIATION HARDENING - SPLIT-OUTPUT BASED RADI-
ATION TOLERANT CIRCUIT DESIGN APPROACH . . . . . . . 127
VII-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 127
VII-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 128
VII-C. Proposed Split-output Based Radiation Hardening . . . . . 129
VII-C.1. Radiation Tolerant Standard Cell Design . . . . . . 129
VII-C.2. Circuit Level Radiation Hardening . . . . . . . . . 135
VII-C.2.a. Identifying and Protecting Sensitive Gates
in a Circuit . . . . . . . . . . . . . . . . . . 136
VII-C.3. Critical Charge for Radiation Hardened Circuits . . 141
VII-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 144
VII-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 149
VIII VARIATION ANALYSIS - SENSITIZABLE STATISTICAL TIM-
ING ANALYSIS . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
VIII-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 150
VIII-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 152
VIII-C. Proposed Sensitizable Statistical Timing Analysis Approach 155
VIII-C.1. Phase 1: Finding Sensitizable Delay-critical
Vector Transitions . . . . . . . . . . . . . . . . . . 156
VIII-C.2. Propagating Arrival Times . . . . . . . . . . . . . 157
VIII-C.3. Phase 2: Computing the Output Delay Distribution 164
VIII-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 165
VIII-D.1. Determining the Number of Input Vector Tran-
sitions N . . . . . . . . . . . . . . . . . . . . . . . 174
VIII-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 175
IX VARIATION TOLERANT DESIGN - A VARIATION TOLER-
ANT COMBINATIONAL CIRCUIT DESIGN APPROACH US-
ING PARALLEL GATES . . . . . . . . . . . . . . . . . . . . . . 177
IX-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 177
IX-B. Related Previous Work . . . . . . . . . . . . . . . . . . . 178
IX-C. Process Variation Tolerant Combinational Circuit Design . 180
IX-C.1. Process Variations . . . . . . . . . . . . . . . . . . 180
IX-C.2. Variation Tolerant Standard Cell Design . . . . . . 181
IX-C.3. Variation Tolerant Combinational Circuits . . . . . 185
IX-D. Experimental Results . . . . . . . . . . . . . . . . . . . . 186
IX-E. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 198
X VARIATION TOLERANT DESIGN - PROCESS VARIATION
TOLERANT SINGLE SUPPLY TRUE VOLTAGE LEVEL SHIFTER 200
X-A. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 200
X-B. The Need for a Single Supply Voltage Level Shifter . . . . 201
X-C. Related Previous Work . . . . . . . . . . . . . . . . . . . 203
X-D. Proposed Single-supply True Voltage Level Shifter . . . . 206
X-E. Experimental Results . . . . . . . . . . . . . . . . . . . . 210
X-E.1. Performance Comparison with Nominal Pa-
rameters Value . . . . . . . . . . . . . . . . . . . . 210
X-E.2. Performance Comparison under Process and
Temperature Variations . . . . . . . . . . . . . . . 212
X-E.3. Voltage Translation Range for SS-TVLS . . . . . . 214
X-E.4. Layout of SS-TVLS . . . . . . . . . . . . . . . . . 216
X-F. Chapter Summary . . . . . . . . . . . . . . . . . . . . . . 220
XI CONCLUSIONS AND FUTURE DIRECTIONS . . . . . . . . . . 221
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
VITA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
LIST OF TABLES
TABLE Page
II.1 Pulse Width for INV1 Gate for Q = 150 fC, τα = 150ps and τβ = 50ps . . 43
II.2 Pulse Width for NAND2 gate for Q = 150 fC, τα = 150ps and τβ = 50ps . 43
III.1 RMSP Error of the Proposed Model for 3× Gates and Q = 150 fC,
τα = 150ps and τβ = 50ps . . . . . . . . . . . . . . . . . . . . . . . . . 67
III.2 RMSP Error of the Proposed Model for Different Gates Sizes and
Q = 150 fC, τα = 150ps and τβ = 50ps . . . . . . . . . . . . . . . . . . 67
IV.1 Comparison of Model with HSPICE . . . . . . . . . . . . . . . . . . . . 83
V.1 Q and Area of Voltage Glitch Versus Load Capacitance (Cload) . . . . . . 98
VI.1 Glitch Magnitude of PN Junction Clamping Diode for Rising Pulses
(Output at Logic 0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
VI.2 Glitch Magnitude of PN Junction Clamping Diode for Falling Pulses
(Output at Logic 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
VI.3 Glitch Magnitude of Diode-connected Clamping Device for Rising
Pulses (Output at Logic 0) . . . . . . . . . . . . . . . . . . . . . . . . . 116
VI.4 Glitch Magnitude of Diode-connected Clamping Device for Falling
Pulses (Output at Logic 1) . . . . . . . . . . . . . . . . . . . . . . . . . 116
VI.5 Delay, Area and Critical Depth of Cells . . . . . . . . . . . . . . . . . . 119
VI.6 Delay Overhead of the Proposed Radiation Hardened Design Approaches . 122
VI.7 Area Overhead of the Proposed Radiation Hardened Design Approaches . 123
VI.8 Total Number of Gates and Number of Hardened Gates in Different Designs 123
VI.9 Delay Overhead of the Improved Circuit Protection Approach . . . . . . . 124
VI.10 Area Overhead of the Improved Circuit Protection Approach . . . . . . . 125
VII.1 Area Overheads of the Radiation Hardened Design Approach Pro-
posed in This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
VII.2 Delay Overheads and Qcri of the Proposed Radiation Hardened De-
sign Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
VII.3 Area and Delay Overheads of the Proposed Radiation Hardened De-
sign Approach for 100% Coverage . . . . . . . . . . . . . . . . . . . . . 147
VIII.1 Transitions for a NAND Gate that Cause Its Output to Switch . . . . . . . 159
VIII.2 Parameters with Their Variation . . . . . . . . . . . . . . . . . . . . . . 166
VIII.3 Comparison of SSTA and StatSense for 75 Input Vector Transitions . . . . 171
VIII.4 Comparison of SSTA and StatSense for 50 and 25 Input Vector Transitions 173
VIII.5 Comparison of SSTA, StatSense50 and SSTA without False Paths . . . . . 175
IX.1 Comparison of Regular and Parallel Gates . . . . . . . . . . . . . . . . . 191
X.1 Low to High Level Shifting . . . . . . . . . . . . . . . . . . . . . . . . 212
X.2 High to Low Level Shifting . . . . . . . . . . . . . . . . . . . . . . . . 213
X.3 Process Variations Simulation Results for Low to High and High to
Low Level Shifting at T = 27◦ C . . . . . . . . . . . . . . . . . . . . . . 214
X.4 Process Variations Simulation Results for Low to High and High to
Low Level Shifting at T = 60◦ C . . . . . . . . . . . . . . . . . . . . . . 214
X.5 Process Variations Simulation Results for Low to High and High to
Low Level Shifting at T = 90◦ C . . . . . . . . . . . . . . . . . . . . . . 215
LIST OF FIGURES
FIGURE Page
I.1 Charge deposition and collection by a radiation particle strike . . . . . . . 8
I.2 Current pulse model for a radiation particle strike plotted for different
values of Q, τα and τβ . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
I.3 SER of an alpha processor for different technology nodes . . . . . . . . . 14
I.4 Variation in threshold voltage of devices for different technology nodes . . 16
II.1 a) Radiation-induced current injected at the output of inverter INV1,
b) Voltage glitch at node a . . . . . . . . . . . . . . . . . . . . . . . . . 29
II.2 Flowchart of the proposed model for pulse width calculation . . . . . . . 33
II.3 Voltage/Current due to a radiation particle strike at node a of INV1 of
Figure II.1 (a) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
III.1 Radiation-induced current injected at the output of inverter INV1 . . . . . 51
III.2 Flowchart of the proposed model for the shape of the radiation-induced
voltage glitch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
III.3 Radiation-induced voltage glitches obtained using the proposed model
and SPICE for different gates . . . . . . . . . . . . . . . . . . . . . . . 65
III.4 Radiation-induced voltage glitch at 2X-INV1 . . . . . . . . . . . . . . . 68
IV.1 Schematic of SRAM cell with noise current (access transistors are not
shown) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
IV.2 SRAM node voltages for the noise injected at node n2 . . . . . . . . . . . 76
IV.3 Flowchart of the proposed model for SRAM cell stability . . . . . . . . . 77
IV.4 Comparison of critical charge obtained using HSPICE and the pro-
posed model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
V.1 Inverter (INV) under consideration . . . . . . . . . . . . . . . . . . . . 90
V.2 NMOS device: ID versus VDS plot for different VGS values . . . . . . . 92
V.3 Radiation-induced voltage transient at the output of 4× INV with
VDD=1 V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
V.4 Radiation-induced drain current of the NMOS transistor of 4× INV
with VDD=1 V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
V.5 Charge collected at the output of INV for different values . . . . . . . . . 95
V.6 Area of voltage glitch versus VDD . . . . . . . . . . . . . . . . . . . . . 96
V.7 Comparison of charge collected (Q) obtained from the proposed model
versus 3D simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
VI.1 Diode based radiation-induced voltage glitch clamping circuit . . . . . . . 107
VI.2 Device based radiation-induced voltage glitch clamping circuit . . . . . . 107
VI.3 Layout of radiation-tolerant NAND2 gate (uses device based clamping) . . 111
VI.4 Output waveform during a radiation event on output . . . . . . . . . . . . 117
VI.5 Output waveform during a radiation event on protecting node . . . . . . . 118
VII.1 Design of a radiation tolerant inverter . . . . . . . . . . . . . . . . . . 131
VII.2 Radiation particle strike at out1p and out1n of INV1 of Figure VII.1d . . . 134
VII.3 a) Radiation tolerant 2-input NAND gate, b) modified regular 2-input
NAND gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
VII.4 Part of a circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
VII.5 Waveforms at nodes cp, cn and d of Figure VII.4 (b) . . . . . . . . . . . 139
VII.6 Radiation tolerant flip-flop . . . . . . . . . . . . . . . . . . . . . . . 140
VII.7 a) Circuit under consideration b) Waveform at different nodes . . . . . . . 143
VII.8 Area and delay overhead of our radiation hardening design approach
for different coverage . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
VIII.1 Arrival time propagation using a NAND2 gate . . . . . . . . . . . . . . . 158
VIII.2 Plot of arrival times at output of NAND2 gate calculated through var-
ious means for the transition 00 → 11 . . . . . . . . . . . . . . . . . . . 162
VIII.3 Plot of arrival times at output of NAND2 gate calculated through var-
ious means for the transition 11 → 00 . . . . . . . . . . . . . . . . . . . 163
VIII.4 Plot of arrival times at output of NOR2 gate calculated through vari-
ous means for the transition 00 → 11 . . . . . . . . . . . . . . . . . . . 164
VIII.5 Plot of arrival times at output of NOR2 gate calculated through vari-
ous means for the transition 11 → 00 . . . . . . . . . . . . . . . . . . . 165
VIII.6 Characterization of NAND2 delay for all input transitions which cause
a rising output. a) 11 → 00, b) 11 → 01 and c) 11 → 10 . . . . . . . . . . 167
VIII.7 Characterization of NAND2 delay for all input transitions which cause
a falling output. a) 00 → 11, b) 01 → 11 and c) 10 → 11 . . . . . . . . . 168
VIII.8 Delay histograms for a) SSTA, b) StatSense and c) SPICE (for apex7) . . 172
IX.1 4× Inverter implementations . . . . . . . . . . . . . . . . . . . . . . . . 182
IX.2 2 input NAND gate a) Regular b) Parallel . . . . . . . . . . . . . . . . . 183
IX.3 Capacitance of various nodes a) Regular inverter, b) Parallel inverter . . . 185
IX.4 Results for 4× regular and parallel inverters: a) Standard deviation of
delay, b) Worst case delay and c) Standard deviation of output slew . . . . 188
IX.5 Ratio of results of the proposed approach compared to regular circuits
for different values of P . . . . . . . . . . . . . . . . . . . . . . . . . . 194
IX.6 Delay σ, µ+3σ and area ratio of the proposed approach compared to
regular circuits for different values of P for area mapped designs . . . . . 195
IX.7 Delay σ, µ+3σ and area ratio of the proposed approach compared to
regular circuits for different values of P for DM1 designs . . . . . . . . . 196
IX.8 Delay σ, µ+3σ and area ratio of the proposed approach compared to
regular circuits for different values of P for DM2 designs . . . . . . . . . 197
X.1 Conventional voltage level shifter . . . . . . . . . . . . . . . . . . . . . 203
X.2 Multi-voltage system using CVLS . . . . . . . . . . . . . . . . . . . . . 204
X.3 Multi-voltage system using SS-TVLS . . . . . . . . . . . . . . . . . . . 205
X.4 Novel single supply true voltage level shifter . . . . . . . . . . . . . . . 208
X.5 Timing diagram for the proposed SS-TVLS . . . . . . . . . . . . . . . . 209
X.6 Combination of an inverter and SS-VLS by Khan et al. . . . . . . . . . . 211
X.7 Delay of SS-TVLS a) rising, b) falling . . . . . . . . . . . . . . . . . . . 217
X.8 Power of SS-TVLS a) rising, b) falling . . . . . . . . . . . . . . . . . . 218
X.9 Leakage current of SS-TVLS a) high, b) low . . . . . . . . . . . . . . . 219
X.10 Layout of the proposed SS-TVLS . . . . . . . . . . . . . . . . . . . . . 220
CHAPTER I
INTRODUCTION
Reliability of VLSI (Very Large Scale Integration) systems has always been a major
concern. Integrated circuits (ICs) have always been subjected to several reliability degrading
factors such as manufacturing defects (for example, wire shorts, wire opens, etc.),
electromigration, noise, etc. To deal with these issues, various forms of fault tolerance have been
built into digital systems for the past several decades. Recently, in the deep sub-micron
(DSM) era, with continuously decreasing device feature sizes, lowering supply voltages
and increasing operating frequencies, the tolerance of VLSI systems against these effects
has significantly decreased. In addition to this, several new factors such as process
variations, aging, etc. now further adversely affect digital VLSI system reliability. Therefore, in
the DSM regime, the design of reliable digital VLSI systems has become very challenging.
The journal model is IEEE Transactions on Automatic Control.
There are many types of noise effects in VLSI systems, such as power and ground noise,
capacitive coupling noise, radiation particle strikes or single event effects (SEEs), etc. With
technology scaling, ICs have become very sensitive to radiation particle strikes [1, 2, 3, 4,
5, 6, 7]. Radiation particle strikes affect the electrical behavior of a circuit temporarily,
and can result in functional errors. Such errors are often referred to as soft or transient
errors. Researchers expect about an 8% increase in soft error rate (SER) per logic state
bit each technology generation [6, 7]. Also, the number of logic state bits on a chip
doubles each technology generation. This further increases the sensitivity of ICs to radiation
particle strikes with technology scaling. It is expected that the SER for chips implemented
in 16 nm technology will be almost 100× the SER of chips implemented in 180 nm
technology [6, 7]. Also, with device scaling, the variations of key device parameters are
increasing at an alarming rate [8, 9, 10], making it difficult to predict the performance of
a VLSI design. Thus, both these issues (radiation particle strikes and process variations)
result in unpredictable behavior of circuits and hence severely degrade the reliability of
VLSI systems. Due to the widespread use of modern VLSI systems, it is necessary to
address these issues during the design phase, to improve system reliability and resilience to
radiation and process variations. This is the focus of this dissertation.
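As a rough illustration of how the two scaling trends quoted above compound, the following minimal sketch multiplies the per-bit SER growth (8%) by the growth in logic state bits (2×) for each technology generation. The function name `relative_ser`, the simple multiplicative model, and the number of generations considered are assumptions made here for illustration; they are not taken from [6, 7]. Under this simplification, six compounding steps already give a factor of roughly 100×.

```python
def relative_ser(generations, per_bit_growth=0.08, bit_growth=2.0):
    """Chip-level SER relative to the starting node.

    Assumes (for illustration only) that per-bit SER grows by
    `per_bit_growth` and the bit count grows by `bit_growth`
    in every technology generation, and that the two multiply.
    """
    return ((1.0 + per_bit_growth) * bit_growth) ** generations


# Tabulate the compounded factor for a hypothetical range of generations.
for g in range(1, 8):
    print(f"{g} generation(s): {relative_ser(g):7.1f}x")
```

The number of generations separating two nodes depends on which intermediate nodes are counted, so the result should be read as an order-of-magnitude check rather than a prediction.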
In the remainder of this chapter, Section I-A provides background information about
radiation particle strikes and process variations. It also describes how these issues affect
VLSI circuit operation. The goals of the research work presented in this dissertation are
stated in Section I-B. Section I-B also provides an outline of the remaining chapters of this
dissertation. Finally, a chapter summary is provided in Section I-C.
I-A. Background and Motivation
This section provides some background information about radiation particle strikes and
process variations, to aid in understanding the remainder of this dissertation. It also describes
how these issues affect VLSI system operation, and how they are expected to scale in future
technologies.
I-A.1. Radiation Particle Strikes
Single event effects (SEEs) are caused when radiation particles such as protons, neutrons,
alpha particles, or heavy ions strike sensitive regions (usually reverse-biased p-n junctions)
in VLSI designs. These radiation particle strikes can deposit a charge, resulting in a
voltage pulse or glitch at the affected node. This radiation-induced voltage glitch can result in
a soft or transient error.
Radiation particle strikes are very problematic for memories (latches, SRAMs and
DRAMs) since they can directly flip the stored state of a memory element, resulting in
a Single Event Upset (SEU) [1, 2]. Although radiation-induced errors in sequential
elements will continue to be problematic for high performance microprocessors, it is expected
that soft errors in combinational logic will dominate in future technologies [4, 11, 12], as
discussed later. Radiation strikes in combinational circuits are referred to as Single Event
Transients (SETs). In a combinational circuit, a voltage glitch due to a radiation particle
strike can propagate to the primary output(s) of the circuit, which can result in an incorrect
value being latched by the sequential element(s), hence resulting in single or multiple bit
upsets. Whether or not a voltage glitch induced by a radiation particle strike at any gate in
a combinational circuit propagates to the primary outputs (and results in a failure) depends
upon three masking factors. These masking factors are [4, 12]:
• Electrical masking occurs when a voltage glitch at a circuit node, induced by a
radiation particle strike, attenuates as it propagates through the circuit to the primary
outputs. Electrical masking can reduce the voltage glitch magnitude to a value which
cannot cause any soft errors.
• Logical masking occurs when there is no functionally sensitizable path from the node
in the circuit where a radiation particle strikes, to any primary output of the circuit.
Hence, logical masking properties of a gate can be estimated using logic information
alone.
• Temporal masking occurs if a voltage glitch due to a radiation particle strike reaches
the primary outputs of a circuit at an instant other than the latching window of the
sequential elements of the circuit. Temporal masking only depends upon the frequency
of operation of the circuit. Its influence is identical for all gates in the circuit (for
a given voltage glitch due to a particle strike). Therefore, it provides a circuit some
gratuitous radiation tolerance against soft errors.
Note that all these masking factors reduce the severity of a radiation particle strike in
combinational circuits. In other words, if a gate in a circuit is masked to a large extent by
any of these masking factors, then it is unlikely (low probability) that a radiation particle
strike at the output of that gate will have any effect on the primary outputs of the circuit.
Only those gates in a combinational circuit which exhibit a low degree of masking due to
these three factors (referred to as sensitive gates) contribute significantly to the failure of
the circuit due to soft errors.
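Since logical masking depends on logic information alone, it can be estimated by exhaustive enumeration for a small circuit. The sketch below is a hypothetical illustration, not a method from this dissertation: the three-input circuit out = (a AND b) OR c, the internal node n = a AND b, and the function names are all assumptions chosen for the example. It flips node n for every input vector and records the vectors for which the flip reaches the output.

```python
from itertools import product


def circuit(a, b, c, flip_n=False):
    """Hypothetical circuit: out = (a AND b) OR c, with internal node n."""
    n = a & b
    if flip_n:
        n ^= 1  # model a strike that inverts the logic value at node n
    return n | c


def sensitizing_vectors():
    """Return the input vectors for which a flip at n changes the output."""
    vectors = product((0, 1), repeat=3)
    return [v for v in vectors if circuit(*v) != circuit(*v, flip_n=True)]


vectors = sensitizing_vectors()
fraction = len(vectors) / 8  # 8 = number of input vectors for 3 inputs
print(vectors, fraction)
```

In this example the flip is logically masked whenever c = 1, because the OR gate output is then fixed regardless of n. A gate whose fraction of sensitizing vectors is high would be a candidate sensitive gate in the sense described above, subject to electrical and temporal masking as well.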
Until recently, radiation particle strikes were considered troublesome only for military
and space electronics. This is mainly due to the abundance of radiation particles in the op-
erating environment of such systems. In fact, the rst conrmed radiation-induced upsets
in space (four upsets in 17 years of satellite operation) was reported in 1975 [13]. However,
just four years later (i.e in 1979), soft errors were also observed in terrestrial microelectron-
ics [1]. Since then, with technology scaling, several cases of soft errors or upsets have been
observed in both space as well as terrestrial electronics [11]. Therefore, for applications
such as space, military and critical terrestrial (for example biomedical) electronics, which
place a stringent demand on reliable circuit operation, it is important to use radiation tolerant
circuits. To efficiently design radiation tolerant circuits, it is important to understand
the effects of radiation particle strikes on VLSI systems.
The rest of this section is devoted to a discussion on the physical origin of radiation
particles, how these particle strikes result in voltage transients, the modeling of a radiation
particle strike in circuit level simulations, and the impact of technology scaling on the
sensitivity of VLSI designs to radiation particle strikes.
I-A.1.a. Physical Origin of Radiation Particles
In space, cosmic rays that enter the solar system from outside are referred to as
galactic cosmic rays. These rays are high-energy charged particles, composed of protons,
electrons, and heavier nuclei [14]. These energetic particles are primarily responsible for soft
errors in space electronics [11]. Apart from galactic cosmic rays, solar event protons, and
protons trapped in the earth’s radiation belts are the other sources of protons present in the
earth’s atmosphere [11]. These are also capable of producing SEEs. Alpha-particles may
also originate from radioactive contaminants in IC packages [11]. In fact, the first soft
error reported for terrestrial electronics [1] was due to alpha-particles that originated from
IC packaging materials. Recently, flip-chip packages have been identified as a source of
radiation particles (from the Pb-Sn solder bumps). This aggravates the problem of radiation
hardening because a source of radiation particles is present extremely close to the die. Also,
at the surface of the earth, neutron-induced upsets have been found to be very problematic.
Several studies have found that the neutrons from cosmic rays are a significant source of
soft errors for SRAMs and DRAMs [11] operating at the earth’s surface. These atmospheric
neutrons result when high energy galactic cosmic rays collide with other particles in the
earth’s atmosphere. Thus, the neutron flux varies considerably with altitude and latitude [11, 4, 15].
The authors of [4] reported that the neutron flux at an altitude of 10,000 feet in Leadville,
CO is approximately 13× greater than that at sea level. Due to this, a large number of
neutron induced upsets were observed in DRAMs at 10,000 feet in Leadville, CO, while no
upsets were observed when the DRAM was placed 200m underground in a salt mine [11].
Different radiation particles such as protons, neutrons, alpha-particles and heavy ions
have different mechanisms by which they deposit charge in VLSI designs. These mecha-
nisms are explained next.
I-A.1.b. Charge Deposition Mechanisms
There are two methods by which a radiation particle deposits charge in VLSI designs:
direct ionization and indirect ionization.
Direct Ionization: A radiation particle generates electron-hole pairs along its path as
it passes through a semiconductor material, as shown in Figure I.1. In this process, the
radiation particle loses its energy. After losing all its energy, the particle comes to rest. The
energy transferred by the radiation particle is described by its linear energy transfer (LET)
value. LET is defined as the energy transferred (for electron-hole pair generation) by the
radiation particle per unit length, normalized by the density of the target material (for VLSI
designs, this is the density of silicon). Thus the unit of LET is MeV-cm²/mg. The LET
of a radiation particle also corresponds to the charge deposited by the radiation particle per
unit length. In silicon, the amount of charge deposited (QD, in pC/µm) by a radiation particle per unit
length (in microns) is calculated as QD = 0.01036 · LET. For example, a particle with an
LET of 97 MeV-cm²/mg deposits approximately 1 pC/µm. Heavy ions¹ and alpha-particles primarily
deposit charge in a semiconductor by direct ionization. Light particles such as protons and
neutrons do not deposit enough charge by direct ionization to cause a soft error.
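The LET-to-charge conversion above can be sketched in a few lines. The conversion factor is the one quoted in the text for silicon; the function names and the assumption of a constant LET along the track are illustrative simplifications, not part of the original analysis.

```python
# Conversion factor quoted in the text for silicon:
# pC/um of deposited charge per MeV-cm^2/mg of LET.
PC_PER_UM_PER_LET = 0.01036


def charge_per_micron(let):
    """Charge deposited per micron of track (pC/um) for an LET in MeV-cm^2/mg."""
    return PC_PER_UM_PER_LET * let


def deposited_charge(let, track_length_um):
    """Total deposited charge (pC), assuming the LET stays constant along the track.

    The constant-LET assumption is a simplification made for this sketch only.
    """
    return charge_per_micron(let) * track_length_um


if __name__ == "__main__":
    print("Q_D for LET = 97 MeV-cm^2/mg:", charge_per_micron(97.0), "pC/um")
```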
Indirect Ionization: Protons and neutrons typically deposit charge by indirect ionization,
which can result in significant numbers of soft errors [11, 4, 16]. When a high-energy
light radiation particle (such as a proton or a neutron) passes through a semiconductor material,
it can collide with nuclei, resulting in nuclear reactions. These nuclear reactions may
produce secondary particles such as alpha-particles or heavy ions. These secondary particles
then deposit charge by direct ionization, and if the charge is deposited at different
locations in a chip then multiple soft errors may occur [11, 16]. Thus, the charge deposited
by a light particle through indirect ionization heavily depends upon the location and the
angle of incidence of the particle strike.
When charge is deposited by a radiation event, this charge is collected by different terminals
of the devices, resulting in voltage and current transients in the device. The charge
deposited by a radiation particle strike may get collected through different charge collection
mechanisms, which are briefly described next.
¹ Heavy ions are ions whose atomic number is greater than or equal to 2 [11].
I-A.1.c. Charge Collection Mechanisms
There are three charge collection mechanisms as discussed below:
Drift-diffusion: Consider the NMOS transistor shown in Figure I.1. The source, gate
and bulk terminals of the NMOS transistor are connected to GND. The drain terminal is
connected to VDD. The drain-bulk junction is reverse-biased and hence there is a strong
electric field in the depletion region of this junction, directed from the drain to the bulk. Since a
radiation particle generates free electron-hole pairs, the electric field present in the depletion region
of the drain-bulk junction leads to the collection of electrons at the drain and of holes at
the bulk. Thus, the electric field of the reverse-biased junction leads to charge collection at the drain.
Therefore, reverse-biased junctions are the most sensitive to a radiation particle strike. Assume
that a radiation particle strikes this (drain-bulk) junction and generates electron-hole
pairs along its path as shown in Figure I.1. Immediately after the generation of this ionized
track, the depletion region collapses due to the separation of free electrons and holes by
the drift process in the depletion region. As mentioned earlier, charge (electron and hole)
separation occurs due to the presence of a high electric field, which pulls the electrons up
(towards the n+ diffusion) and pushes the holes down (towards the p− substrate). This
phenomenon reduces the width of the depletion region of the drain-bulk junction. As a
result, the potential drop across the depletion region decreases (before the radiation strike,
the potential drop across the depletion region was VDD). As the voltage between the drain
and the bulk terminals (n+ and p− substrate) is still VDD, the decrease in the potential
across the depletion region causes a voltage drop in the p− substrate region. This causes
the drain-bulk junction electric field to penetrate into the p− substrate region, beyond the
original depletion region, and hence enhances the flow of electrons from the substrate (these
electrons are generated by the radiation particle strike in the substrate region) to the depletion
region. This enhanced electron flow process is referred to as funneling, as shown in
Figure I.1. The electrons present in the depletion region drift to the drain (n+) diffusion
region and hence get collected. Thus, charge is said to be collected through the drift process
(or the funnel-assisted drift process). The funneling process increases the depth of the
region with a strong electric field beyond the original depletion region. Hence it increases
the amount of charge collection by the drift process [17, 18, 19, 11].
Fig. I.1. Charge deposition and collection by a radiation particle strike (cross-section of an NMOS transistor with terminals D, G, S and B, n+ diffusions in a p− substrate and the drain at VDD; the particle track, the depletion region, and the funneling and diffusion regions are marked)
As the electric eld continues to pull electrons up, it also pushes the holes down (away
from the depletion region) which allows the drain-bulk depletion region to recover and re-
gain its original width. After the recovery of the depletion region, the electrons which were
not collected by the funnel assisted drift process diffuse towards the depletion region (due
to their concentration gradient) and then get pulled by the junction electric eld towards
the n+ drain diffusion region. Thus the charge is also collected at the n+ region by the dif-
fusion process. It was reported in [18], that in a lightly doped substrate, most of the charge
9collection is through drift only whereas, more heavily doped substrates demonstrate charge
collection due to both the drift and the diffusion processes [20, 17, 18, 19, 11]. In the DSM
technologies, the substrate is heavily doped and hence charge collection at the drain node
occurs due to both drift and diffusion processes.
Bipolar Effect: Consider an NMOS transistor (an n-channel transistor located in a p-
well) in cut-off state, and with its gate and source terminals at GND and drain terminal at
VDD. The electrons generated by a radiation particle strike can be collected at either the
drain-well junction or the well-substrate junction. However, the radiation-induced holes
are left in the p-well, which reduces the source-well potential barrier (due to the increase
in the potential of the p-well). Thus, the source injects electrons into the channel which
can be collected at the drain. This increases the total amount of the charge collected at
the drain node and hence reduces the tolerance of the device to a radiation particle strike.
This effect is called the bipolar effect because the source-well-drain of the NMOS (PMOS)
transistor acts as an n-p-n (p-n-p) bipolar transistor. This effect mimics the on state of the
parasitic bipolar transistor. With technology scaling, the channel length decreases which
in turn reduces the base width (of the n-p-n transistor). Hence, this effect becomes more
pronounced in scaled technologies [19, 11, 21].
Alpha-particle Source-drain Penetration (ALPEN): This charge collection mechanism
results when a radiation particle strikes a MOS transistor at near-grazing incidence, such
that the particle penetrates through both the source and the drain regions of the transis-
tor. A radiation particle penetration through both the source and the drain regions of the
MOS transistor (nominally in the off state) perturbs the potential in the channel region. In
this case, the charge collection at the drain of the MOS transistor happens in three phases:
an initial funneling phase while there is no source/drain barrier, a bipolar phase as the
source/channel barrier recovers, and a subsequent diffusion phase (after the device potentials
have recovered). This process also mimics the on state of the transistor. It is reported
that the charge collection due to the ALPEN mechanism increases rapidly for effective gate
lengths below about 0.5 µm [19, 11]. This mechanism may increase the radiation susceptibility
of DSM devices.
Fig. I.2. Current pulse model for a radiation particle strike plotted for different values of Q,
τα and τβ
The charge collected (through any mechanism) at the drain node of a device results in
voltage transients at that node. These voltage transients in turn may result in soft errors.
I-A.1.d. Circuit Level Modeling of a Radiation Particle Strike
A radiation particle strike in a device induces current flow from the n type diffusion to the
p type diffusion. Traditionally, the radiation-induced current is modeled at the circuit level by
a double-exponential current pulse [20]. The expression for
this pulse is
$$ i_{seu}(t) = \frac{Q}{\tau_\alpha - \tau_\beta} \left( e^{-t/\tau_\alpha} - e^{-t/\tau_\beta} \right) \qquad (1.1) $$
Here Q is the amount of charge collected as a result of the ion strike, while τα is the
collection time constant for the junction and τβ is the ion track establishment constant. This
current pulse is injected at any node in a circuit, to simulate a radiation particle strike in
SPICE at that node. Typically τα is of the order of 100 ps and τβ is of the order of tens
of picoseconds [12, 11]. Figure I.2 shows iseu(t) for several values of Q, τα and τβ. The
minimum amount of charge required to result in an error is referred to as critical charge
(Qcri).
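The double-exponential pulse of Equation 1.1 can be evaluated numerically as in the following minimal sketch. The function names, the trapezoidal integration and the example values (Q = 100 fC, τα = 150 ps, τβ = 50 ps) are assumptions chosen for illustration. The sketch also checks a property that follows directly from Equation 1.1: the pulse integrates to Q, and it peaks at t = τα τβ ln(τα/τβ) / (τα − τβ).

```python
import math


def i_seu(t, q, tau_a, tau_b):
    """Equation 1.1: current (A) at time t (s) for charge q (C) and time constants (s)."""
    if t < 0:
        return 0.0  # no current before the strike
    return q / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def peak_time(tau_a, tau_b):
    """Time at which the pulse peaks, obtained by setting d(i_seu)/dt = 0."""
    return tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)


def collected_charge(q, tau_a, tau_b, t_end, steps=20000):
    """Trapezoidal integral of i_seu over [0, t_end]; approaches q for large t_end."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        left = i_seu(k * dt, q, tau_a, tau_b)
        right = i_seu((k + 1) * dt, q, tau_a, tau_b)
        total += 0.5 * (left + right) * dt
    return total


if __name__ == "__main__":
    # Illustrative values only (not taken from a specific experiment).
    q, tau_a, tau_b = 100e-15, 150e-12, 50e-12
    t_pk = peak_time(tau_a, tau_b)
    print("peak time (s):", t_pk)
    print("peak current (A):", i_seu(t_pk, q, tau_a, tau_b))
    print("integrated charge (C):", collected_charge(q, tau_a, tau_b, 3e-9))
```

In a circuit simulator the same waveform would typically be applied as a current source at the struck node; the pure-Python version here only makes the shape of the pulse and its dependence on Q, τα and τβ explicit.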
Note that in DSM devices, the radiation-induced current may be very different from
this double exponential pulse [19, 22]. This is because, in DSM devices, the substrate is
more heavily doped compared to older technologies. As mentioned earlier, heavily doped
substrates demonstrate charge collection due to both the drift and the diffusion processes [20,
17, 18, 19, 11]. Therefore, a significant amount of charge is collected in DSM devices due
to both the drift and the diffusion processes, whereas in older technologies the charge
was mainly collected by the drift process. Since the double exponential current pulse
of Equation 1.1 was derived for an older technology by using the fact that the charge is mainly
collected by the drift process [20], the radiation-induced current pulse can be different from
this double exponential current pulse in DSM devices. Therefore, for an accurate analysis,
device-level simulations of radiation particle strikes in transistors need to be performed.
However, for circuit level analysis and design, it is adequate to use the current model of
Equation 1.1 to model the worst case radiation particle strike [11, 12].
I-A.1.e. Impact of Technology Scaling on the Radiation Tolerance of VLSI Design
In the DSM era, the number of transistors on a chip is still increasing, in accordance with
Moore’s law [23]. This is facilitated by decreasing device and interconnect dimensions,
which have led to a reduction in the node capacitances of VLSI circuits. Hence, in modern
VLSI processes, even a small amount of charge deposited by a radiation particle (i.e., by a low
energy particle) is sufficient to cause a significant change in the voltage of a node. In other
words, DSM circuits are susceptible even to low energy radiation particle strikes. This is
further aggravated by decreasing supply voltages and increasing operating frequencies in
the DSM regime.
Although these technology scaling trends severely reduce the radiation tolerance of
VLSI circuits, there are a couple of factors associated with technology scaling which improve
the radiation tolerance of VLSI circuits. The area of transistors reduces with technology
scaling and hence, the probability with which a device in a circuit experiences a
radiation particle strike reduces as well. Also, the decreasing supply voltages reduce the
charge collection efficiency. Therefore, the devices implemented in newer technologies
(with lower supply voltages) collect less charge compared to the devices implemented in
older technologies (with higher supply voltages). A reduction in the amount of charge collected
(due to lowering supply voltages) with technology scaling improves the radiation
resilience of VLSI circuits.
The soft error rate (SER) is typically measured in failures in time (FIT), where one FIT
is defined as one failure in 10^9 hours of operation. Figure I.3 shows the SER
for an Alpha [24] processor, which was implemented using different technology nodes [4].
Figure I.3 shows the individual contributions of SRAMs, latches (for different pipeline
depths) and combinational logic (for different pipeline depths with a fanout factor of 4) to
the overall SER of the Alpha processor. Observe from Figure I.3 that the overall chip SER,
which is the sum of the contributions of SRAMs, latches and combinational logic, increases
with decreasing feature sizes. This verifies that radiation particle strikes are becoming
increasingly problematic for the reliability of VLSI systems, as predicted by the theory.
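To make the FIT unit concrete, the short sketch below sums per-component SER contributions into a chip-level SER and converts a FIT value into a mean time to failure. The component FIT values are hypothetical placeholders (they are not read from Figure I.3), and the function names are introduced for this example only.

```python
HOURS_PER_FIT_WINDOW = 1e9  # one FIT is one failure per 10^9 device-hours


def chip_fit(component_fits):
    """Overall chip SER (FIT) as the sum of the per-component contributions."""
    return sum(component_fits.values())


def fit_to_mttf_hours(fit):
    """Mean time to failure (hours) for a given FIT rate."""
    return HOURS_PER_FIT_WINDOW / fit


if __name__ == "__main__":
    # Hypothetical contributions, for illustration only.
    contributions = {"sram": 600.0, "latches": 300.0, "logic": 100.0}
    total = chip_fit(contributions)
    print("chip SER (FIT):", total)
    print("MTTF (hours):", fit_to_mttf_hours(total))
```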
Also, observe from Figure I.3 that in older technologies the contribution of the SRAMs
and latches to the overall chip SER was much higher than that of combinational
logic. Hence, traditionally, radiation particle strikes were mainly considered problematic
for memories (SRAMs, DRAMs and latches) only. However, as the feature size is reduced
below 45 nm, the SER contribution of combinational logic has increased by a large factor
(more than 10^9), whereas the SER contribution of SRAMs (in absolute terms) has stayed
relatively constant (as shown in Figure I.3). This is because of the fact that with technology
scaling, heavily pipelined circuits are increasingly used, which leads to a reduction in the
depth of combinational circuits. Due to this, the effect of the three masking factors (as described
earlier) reduces and hence, fewer SET events are masked. Hence, it is expected that
radiation particle strikes in combinational logic will be more problematic than in memories
in future technologies [4, 11, 12]. Note that the SER of the Alpha processor due to radiation
particle strikes in latches also increased slightly with decreasing feature sizes. Therefore,
it will be necessary to harden both combinational logic and memories, to improve the radiation
resilience of VLSI systems implemented using future DSM processes.
Many critical applications such as space, military and critical terrestrial electronics
(for example biomedical circuits and high performance servers) place a stringent
demand on reliable circuit operation. Therefore, efficient analysis and design techniques
are required to harden VLSI circuits (both combinational logic and memories)
against radiation events. Developing these is one of the two goals of this dissertation.
I-A.2. Process Variations
Another important problem encountered with technology scaling in the DSM era is the
increase in process variations. With the continuous scaling of devices and interconnects,
variations in key device and interconnect parameters such as channel length (L), threshold
voltage (VT ), oxide thickness (Tox), wire width (WM), and wire height (H) are increasing at
an alarming rate [8, 9, 10]. Due to this, the performance of different die of the same IC can
vary widely, resulting in a significant yield loss, which translates into higher manufacturing
costs.
Fig. I.3. SER of an Alpha processor for different technology nodes
The two major sources of variability in device parameters are a) limited control over
the manufacturing process (extrinsic causes of variations) and b) fundamental atomic-scale
randomness of the device (intrinsic causes of variations) [8]. The variability that arises due
to limited control over the manufacturing process is becoming more and more challenging
to control. This is because of the inability of the semiconductor industry to improve
manufacturing tolerances at the same pace as technology scaling [8]. For example, the light
source (with a wavelength of 193 nm) used in lithography in older technologies (≥ 130 nm)
is still used in newer technologies (45 nm and below). Therefore, it is becoming increasingly
difficult to control the channel length of transistors with technology scaling [8]. The
intrinsic causes of variations are also expected to be significantly problematic in future technologies
because of the fact that device dimensions are approaching the scale of silicon lattice
distances. At this scale, quantum physics needs to be used to explain device operation,
which is modeled as a stochastic process. Also, at this scale, the precise atomic configuration
of the material significantly affects the electrical properties of the device. Therefore, a
small variation in the silicon structure has a large impact on the device performance. For
example, the threshold voltage of a transistor heavily depends on the doping density of the
channel region. With technology scaling, the number of dopant atoms required to achieve
the desired doping density is getting smaller [8]. Since the placement of dopant atoms
in the silicon crystal structure is random, the final number of dopant atoms deposited in
the channel region of a transistor is a random variable. Therefore, the threshold voltage
of a transistor also becomes a random variable. Variations in interconnect parameters are
mainly caused by limited control over the manufacturing process. Processing steps such
as chemical mechanical polishing (CMP) and etching induce variations in interconnect (or
wire) dimensions [8].
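The effect of a shrinking dopant count can be illustrated with a simple calculation. The sketch below assumes that the number of dopant atoms in the channel follows a Poisson distribution, which is a common modeling assumption but is not stated in this text; under that assumption the relative fluctuation of the count is 1/sqrt(N) for a mean count N. The function name and the example counts are hypothetical.

```python
import math


def relative_dopant_fluctuation(mean_count):
    """sigma/mu of the dopant count, assuming a Poisson-distributed count.

    For a Poisson variable with mean N, the standard deviation is sqrt(N),
    so sigma/mu = 1/sqrt(N).
    """
    return 1.0 / math.sqrt(mean_count)


if __name__ == "__main__":
    # Hypothetical mean dopant counts, decreasing as devices shrink.
    for n in (10000, 1000, 100):
        print(n, "dopants -> sigma/mu =", relative_dopant_fluctuation(n))
```

Under this assumption, fewer dopant atoms per channel directly translate into a larger relative spread of the doping, and therefore of the threshold voltage.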
The process variations due to these sources can be classified as systematic variations
and random variations [8, 9, 25]. The systematic component is the predictable variation
trend across a chip, and is caused by spatial dependencies of device processing, such
as chemical mechanical polishing (CMP) variations [26] and optical proximity effects
[27]. The random component is caused by effects such as random fluctuations of
the number and location of dopants in the MOSFET channel, polysilicon gate line-edge
roughness, etc. [8, 9, 10].
Note that in terms of delay variability of a circuit, the contribution of variations in
device parameters dominates that of interconnect parameter variations [8]. The variation in
device parameters contributes close to 90% of the total variability of the delay of a realistic
design [8]. In future technologies, it is expected that the variation in device parameters will
continue to be the dominant source of delay variability of a circuit.
I-A.2.a. Impact of Technology Scaling on Process Variations
Figure I.4 shows the standard deviation of the threshold voltage of transistors (σVT ) imple-
mented in different technology nodes [28]. As shown in Figure I.4, σVT has increased by
a factor of ∼2× for a 45 nm technology compared to a 130 nm process. Note that the
absolute value of VT is higher for a 130 nm process (∼0.35 V) compared to a 45 nm process
(∼0.28 V). Similarly, the variation in other device (L and Tox) and interconnect (W, H,
etc.) parameters has also increased with technology scaling, as reported in [29]. Therefore,
unless significant advancements are made in process control, the variation in key device
and interconnect parameters is expected to further increase in future technologies.
Fig. I.4. Variation in threshold voltage of devices for different technology nodes
Additionally, as devices are scaled below 45 nm, the random component of the total
variations becomes significantly more problematic than the systematic component [30, 8].
Negligible spatial correlation was observed in the L and VT of devices in a test chip fabricated
using a 65 nm SOI process [30]. However, the random component of L and VT
variation was quite high in comparison (the standard deviation of L and VT variations was
5% and 9% of the mean value respectively). Thus, the L [30, 8, 26] and VT [30, 8, 31, 10]
variations are expected to be mostly random (or independent) in nature for future deep
submicron technologies.
With the increasing amount (σ/µ) of variations in device and interconnect parameters,
it becomes difficult to predict the performance of VLSI designs, and hence it becomes a
challenging task to design reliable VLSI systems. The second goal of this dissertation is
to develop efficient analysis and design techniques to address the process variation issue,
in order to facilitate the implementation of process variation resilient VLSI circuits. These
techniques help improve design yield and hence lower manufacturing costs.
I-B. Dissertation Overview
Section I-A indicates that radiation particle strikes and process variations can significantly
degrade the reliability of VLSI systems. Due to the widespread use of modern VLSI cir-
cuits, it is necessary to address these issues while designing VLSI systems, so as to im-
prove their reliability. Therefore, there is a critical need for analysis and design techniques
to enable the implementation of VLSI systems that are resilient to radiation and process
variation effects.
The goal of this dissertation is to develop several analysis and design techniques to
achieve circuit resilience against radiation particle strikes and process variations. This
dissertation consists of two parts.
In the rst part of this dissertation (Chapters II to VII), four analysis approaches
for analyzing the effects of radiation particle strikes in combinational circuits, SRAMs
and voltage scaled circuits [32, 33, 34] are presented. Two circuit level hardening ap-
proaches [35, 36] are also presented, to harden combinational circuits against a radiation
particle strike.
In the second part of this dissertation (Chapters VIII to X), a sensitizable statistical
timing analysis approach is presented to improve the accuracy of statistical timing analysis
of combinational circuits. Two design approaches are also presented to improve the process
variation tolerance of combinational circuits and voltage level shifters (which are used in
circuits with multiple interacting supply domains), respectively.
This dissertation is organized as follows.
In Chapter II, an analytical approach is developed to analyze radiation-induced transients
in combinational circuits. Efficient and accurate models for radiation-induced transients
are required to evaluate the radiation tolerance of a circuit. As mentioned earlier, a
radiation particle strike at a node may result in a voltage glitch. The pulse width of this volt-
age glitch is a good measure of radiation robustness of a design. Thus, an analytical model
to estimate the pulse width of the radiation-induced voltage glitch in combinational designs
is presented in this chapter. In this approach, a piecewise linear transistor IDS model is used
(instead of a linear RC gate model), and the effect of the ion track establishment constant
(τβ) of the radiation-induced current pulse is considered. Both these factors improve the
accuracy (in comparison with the best existing approach [37]) of the analytical model for
the pulse width computation. The model is applicable to any logic gate, with arbitrary gate
size and loading, and with different amounts of charge collected due to the radiation strike.
The model can be used to quickly (1000× faster than SPICE [38]) determine the suscepti-
ble gates in a design (the gates where a radiation particle strike can result in a voltage glitch
with a positive pulse width). The most susceptible gates can then be protected using circuit
hardening approaches, based on the degree of hardening desired.
In Chapter III, an analytical model is presented, which efficiently estimates the shape
of the voltage glitch that results from a radiation particle strike. A model for the output
terminal (load) current IGout(Vin,Vout) of the gate G is used. Again, the model
is applicable to any general combinational gate with different loading, and for arbitrary
values of collected charge (Q). The effect of the ion track establishment constant (τβ) of
the radiation particle induced current pulse is accounted for. The voltage glitch estimated
by this analytical model can be propagated to the primary outputs of a circuit using existing
voltage glitch propagation tools. The properties of the voltage glitch (such as its magnitude,
glitch shape and width) at the primary outputs can be used to evaluate the SEE robustness
of the circuit. Based on the result of this analysis, circuit hardening approaches can be
implemented to achieve the level of radiation tolerance required.
Chapter IV presents a model for the dynamic stability of an SRAM cell in the presence
of a radiation particle strike. Such models are required since SRAM stability analysis is cru-
cial from an economic viewpoint, given the extensive use of memory in modern processors
and SoCs. Static noise margin (SNM) based stability analysis often results in pessimistic
designs because SNM cannot capture the transient behavior of the noise. Therefore, to
improve analysis accuracy, a dynamic stability analysis is required. The model proposed
in this chapter utilizes the double exponential current pulse of Equation 1.1 for modeling a
radiation particle strike, and is able to predict (more accurately than the most accurate prior
approach [39]) whether a radiation particle strike will result in a state flip in a 6T-SRAM
cell (for given values of Q, τα and τβ). This model enables a designer to quickly (2000×
faster than SPICE) and accurately analyze SRAM stability during the design phase.
In Chapter V, an analysis of the effects of voltage scaling on the radiation tolerance
of VLSI systems is presented. For this analysis, 3D simulations of radiation particle strikes
on the output of an inverter (implemented using DVS and sub-threshold design) were
performed. The radiation particle strike on an inverter was simulated using Sentaurus-DEVICE
[40] for different inverter sizes, inverter loads, supply voltage values (VDD)
and radiation particle energies. From these 3D simulations, several non-intuitive
observations were made, which are important to consider during radiation hardening of
such DVS and sub-threshold circuits. Based on these observations, several guidelines are
proposed for radiation hardening of such designs. These guidelines suggest that traditional
radiation hardening approaches need to be revisited for DVS and sub-threshold designs. A
charge collection model for DVS circuits is also proposed, using the results of these 3D
simulations. The parameters of this charge collection model can be included in transistor
model cards in SPICE, to improve the accuracy of SPICE based simulations of radiation
events in DVS circuits.
Chapter VI presents a radiation tolerant combinational circuit design approach which
is based on diode clamping action. This diode clamping based hardening approach is based
on the use of shadow gates, whose task it is to protect the primary gate in case it experi-
ences a radiation strike. The gate to be protected is duplicated locally, and a pair of diode-
connected transistors (or diodes) is connected between the outputs of the original and the
shadow gate. These diodes turn on when the voltage across the two gate outputs deviates
(during a radiation strike). A methodology is also presented to protect specific gates of
the circuit based on electrical masking, in a manner that guarantees radiation tolerance for
the entire circuit and also keeps the area and delay overhead low. An improved circuit
level hardening algorithm is also proposed, to further reduce the delay and area overhead.
Note that the diode clamping based approach is suitable for hardening a circuit against low
energy particle strikes.
In Chapter VII, another radiation tolerant combinational circuit design approach is
presented, which is called the split-output based hardening approach. This hardening ap-
proach exploits the fact that if a gate is implemented using only PMOS (NMOS) transistors
then a radiation particle strike can result only in a logic 0 to 1 (1 to 0) transient. Based
on this observation, radiation hardened variants of regular static CMOS gates are derived.
Split-output based radiation hardened gates exhibit an extremely high degree of radiation
tolerance, which is validated at the circuit level. Hence, this approach is suitable for hard-
ening against medium and high energy radiation particles. Using split-output gates, circuit
level hardening is performed based on logical masking, to selectively harden those gates
in a circuit which contribute maximally to the soft error failure of the circuit. The gates
whose outputs have a low probability of being logically masked are replaced by their radi-
ation tolerant counterparts, such that the digital design achieves a soft error rate reduction
of a desired amount (typically 90%). The split-output based hardening approach is able to
harden combinational circuits with a modest layout area and delay penalty.
Chapter VIII presents the sensitizable statistical timing analysis (StatSense) methodology,
developed to remove the pessimism due to two sources of inaccuracy which plague
current statistical static timing analysis (SSTA) tools. Specifically, the StatSense approach
implicitly eliminates false paths, and also uses different delay distributions for different
input transitions for any gate. StatSense consists of two phases. In the first phase, a set
of N logically sensitizable vector transitions which result in the largest delays for a circuit
are obtained. In the second phase, these delay-critical sensitizable input vector transitions
are propagated using a Monte Carlo based technique to obtain the delay distribution at the
outputs. The specific input transitions at any gate are known after the first phase, and so
the gate delay distribution corresponding to these input transitions is utilized in the second
phase. The second phase performs Monte Carlo based statistical static timing analysis
(SSTA), using the appropriate gate delay distribution corresponding to the particular input
transition for each gate. The StatSense approach is able to significantly improve the
accuracy of SSTA. The circuit delay distribution obtained using StatSense closely
matches that obtained by SPICE based Monte Carlo simulations.
In Chapter IX, a process variation tolerant design approach for combinational circuits
is presented, which exploits the fact that random variations can cause a significant mismatch
in two identical devices placed next to each other on the die. In this approach, a large gate
is implemented using an appropriate number (> 1) of smaller gates, whose inputs and
outputs are connected to each other in parallel. This parallel connection of smaller gates
to form a larger gate is referred to as a parallel gate. Since the L and VT variations are
largely random and have independent variations in smaller gates, the variation tolerance
of the parallel gate is improved. The parallel gates are implemented as single layout cells.
By careful diffusion sharing in the layout of the parallel gates, it is possible to reduce the
input and output capacitance of the gates, thereby improving the nominal circuit delay as
well. An algorithm is also developed to selectively replace critical gates in a circuit by
their parallel counterparts, in order to improve the variation tolerance of the circuit. Monte
Carlo simulations demonstrate that this process variation tolerant design approach achieves
signicant improvements in circuit level variation tolerance.
In Chapter X, a novel process variation tolerant single-supply true voltage level shifter
(SS-TVLS) design is presented. It is referred to as true since it can handle both low to
high, or high to low voltage level conversions. The SS-TVLS is the first VLS design which
can handle both low-to-high and high-to-low voltage translation without a need for a control
signal. The use of a single supply voltage reduces circuit complexity, by eliminating the
need for routing both supply voltages. The proposed circuit was extensively simulated in a
90nm technology using SPICE. Simulation results demonstrate that the level shifter is able
to perform voltage level shifting with low leakage for both low to high, as well as high to
low voltage level translation. The proposed SS-TVLS is also more tolerant to process and
temperature variations, compared to a combination of an inverter along with the non-true
VLS solution [41].
Finally, in Chapter XI, this dissertation is concluded. This chapter also presents some
future directions for research, and a summary of the broader impact of this work.
I-C. Chapter Summary
In this chapter, two major issues (radiation particle strikes and process variations) which
are encountered while designing reliable VLSI systems, were introduced. With technology
scaling, it is expected that the effect of these issues on the reliability of VLSI designs will
become more severe. Thus, there is a critical need to address these issues while designing
VLSI systems.
The next chapter will describe the first radiation analysis approach for combinational
circuits.
CHAPTER II
RADIATION ANALYSIS - ANALYTICAL DETERMINATION OF
RADIATION-INDUCED PULSE WIDTH IN COMBINATIONAL CIRCUITS
II-A. Introduction
With technology scaling, radiation particle strikes are becoming increasingly problematic
for both combinational circuits and memory elements, as described in Chapter I. Many
critical applications such as biomedical circuits, as well as space and military electronics,
demand reliable circuit functionality. Therefore, the circuits used in these applications must
be tolerant to radiation particle strikes.
In order to design radiation tolerant VLSI systems efficiently, it is first necessary to
analyze the nature of radiation-induced voltage transients, and the effects of radiation particle
strikes both on combinational circuits and memory elements (like SRAM cells). Then,
based on the findings of this analysis, circuit hardening approaches can be implemented
to achieve radiation resilience while satisfying area, delay and power constraints. This
chapter and the next three chapters present radiation analysis approaches developed in this
dissertation for analyzing the effect of a radiation particle strike in combinational circuits
and SRAMs. Then two hardening approaches are presented in Chapters VI and VII.
Circuit hardening approaches [12, 42, 43] often employ selective gate hardening to
reduce the area and delay overhead associated with radiation hardening. This
is achieved by only protecting those gates in a circuit which are the significant contributors
to the soft error failure rate of the circuit. Hence, the radiation susceptibility of such gates
has to be examined to evaluate the radiation tolerance of the circuit. Also, for efficient
hardening, it is important to harden the circuit early in the design flow. This will help in
reducing the number of design iterations and reducing design turn-around time and cost.
However, this can be achieved only if radiation analysis techniques can quickly and ac-
curately simulate the effects of radiation events of different particle energies, for different
gates with different loading conditions. For this, it is important to evaluate the radiation
tolerance of a circuit using robustness metrics.
An exhaustive SPICE based simulation of radiation events in a combinational circuit
would be accurate; however, it would require a large number of simulations, since the circuit
can have a large number of nodes and a radiation particle strike can occur at any one of these
nodes. Also, the transient pulse resulting from a radiation particle strike depends upon
the node (node capacitance and the sizing characteristics of the gate driving that node),
the amount of charge collected due to the particle strike and the state of the circuit inputs.
Therefore, it is computationally intractable to use exhaustive SPICE-based simulation for
simulating the effect of radiation events in the early stages of the design flow. Thus, there is
a need for efficient and accurate analytical models for SET events in combinational circuits.
The modeling of radiation events in either combinational or sequential circuits in-
volves solving non-linear differential equations. Because of this, not much success has
been achieved in developing accurate and efficient models which are applicable across
different scenarios (such as different gate sizes, dumped charge, fanout loading, etc). Mod-
eling approaches in the past (explained in Section II-B) have made several assumptions and
approximations which limit the applicability of the resulting model due to the large error
involved.
In this chapter, an analytical model for the pulse width¹ of the radiation-induced voltage
glitch in combinational circuits is presented. The pulse width of the voltage glitch due
to a radiation particle strike is a good measure of radiation robustness because, if a gate is
more susceptible to radiation particle strikes, then a particle strike at the output node of that
gate would result in a voltage glitch with a larger pulse width. On the other hand, if a gate
is less susceptible to radiation events, then the pulse width of the voltage glitch will be lower.
Hence, the pulse width of the radiation-induced voltage glitch is often used as the radiation
robustness metric of choice.
¹ The pulse width of the radiation-induced voltage glitch is computed as the width of the
voltage glitch, measured at half the supply voltage.
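As a minimal sketch of this metric, the following Python function measures the pulse width of a sampled positive glitch at half the supply voltage. The function name, the sampled-waveform input format and the linear interpolation between samples are assumptions made for illustration; they are not part of the model developed in this chapter.

```python
def pulse_width(times, volts, vdd):
    """Width of a positive glitch measured at vdd/2.

    times, volts: equally long sequences of waveform samples (hypothetical format).
    Crossing times are found by linear interpolation between neighbouring samples.
    Returns 0.0 if the waveform does not rise above and fall below vdd/2.
    """
    half = vdd / 2.0
    t_rise = None
    t_fall = None
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t_rise is None and v0 < half <= v1:
            # Rising crossing of vdd/2
            t_rise = t0 + (half - v0) * (t1 - t0) / (v1 - v0)
        elif t_rise is not None and v0 >= half > v1:
            # Falling crossing of vdd/2
            t_fall = t0 + (half - v0) * (t1 - t0) / (v1 - v0)
            break
    if t_rise is None or t_fall is None:
        return 0.0
    return t_fall - t_rise
```

For a triangular glitch sampled at times 0, 1, 2, 3, 4 with voltages 0, 0.4, 0.8, 0.4, 0 and a 1 V supply, the crossings fall at 1.25 and 2.75, giving a width of 1.5 time units.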
The model for the pulse width of the radiation-induced voltage glitch presented in this
chapter uses a piecewise linear transistor IDS model (instead of a linear RC gate model
as done in previous approaches [37, 44]), and also considers the effect of the ion track
establishment constant (τβ) of the radiation-induced current pulse (Equation 1.1). Both
these factors improve the accuracy of the analytical model for the pulse width computation.
The proposed model is applicable to any logic gate, with arbitrary gate size and loading,
with different amounts of charge collected due to the radiation strike. The computation of
the pulse width of the voltage glitch at a gate using the model presented in this chapter is
very fast and accurate; therefore, it can be easily incorporated in a design flow to implement
radiation tolerant circuits. The proposed model can be used to quickly determine if the gates
in a design experience a positive pulse width as a consequence of a radiation strike. Such
gates can be up-sized to harden them, while accounting for logical masking [12]. This flow
can be iterated until the required tolerance against radiation particle strikes is achieved.
Note that previous approaches [37, 44] neglected the contribution of the ion track
establishment constant (τβ) to simplify their model. However, in [11], it was mentioned that
if the circuit response is faster than the time constants of the radiation event, then the shape
of the radiation-induced current pulse is critically important to accurately model the
radiation particle strike. For a 65nm PTM [45] model card, the delay of a minimum size inverter
driving a fanout of three minimum size inverters is about 13 ps, which is much smaller than
the typical time constants² associated with a radiation particle strike. Therefore, neglecting
the contribution of the τβ term of the current pulse of Equation 1.1 will lead to an inaccurate
analysis. Through experiments it was found that ignoring τβ results in an under-estimation
of the pulse width of the radiation-induced voltage glitch by 10%. Neglecting τβ therefore
effectively diminishes the severity of the radiation particle strike, and hence leads to an
optimistic estimate for the voltage glitch. Thus, it is important to consider τβ for an accurate
analysis. The model presented in this chapter considers the contribution of τβ.
² Typical rise times of a radiation-induced current pulse are in the range of 10-50 ps and fall
times are of the order of 100 ps [11, 12].
In the remainder of this chapter, Section II-B briefly discusses related previous work
on modeling radiation-induced transients in combinational circuits. The model for the pulse
width of the radiation-induced voltage glitch developed in this dissertation is described in
Section II-C. Experimental results are presented in Section II-D, followed by a chapter
summary in Section II-E.
II-B. Related Previous Work
A signicant amount of work has been done on the simulation and analysis of radiation
particle strikes in both combinational and sequential circuit elements [46, 1, 2, 37, 47, 48,
44]. Most of this work can be classied under one of three categories: device-level, circuit-
level and logic-level.
Device-level simulation approaches involve solving device physics equations to evalu-
ate the effect of a radiation particle strike. In [18], three-dimensional numerical simulation
is used to study the charge collection mechanism in silicon n+/p diodes. In [49], device
level three-dimensional simulation was performed to study the charge collection mecha-
nism and voltage transients from angled ion strikes. Although device-level approaches
result in very accurate analysis, they are extremely time-consuming in nature. Hence, these
approaches cannot be used for large circuits.
For circuit-level and logic-level simulation approaches, a double exponential cur-
rent pulse (Equation 1.1) is used to model a particle strike [20, 48, 12]. Logic-level ap-
proaches [47, 50] are utilized when the accuracy of the analysis is not very important but
the speed of the analysis is very important. In these approaches, the electrical effect of
a radiation-induced transient is abstracted into logic-level models, which are then used in
gate-level timing simulations to propagate the effects of a radiation particle strike to the
memory elements at the primary outputs of the circuit. The high level of inaccuracy of
these approaches makes them unattractive for robustness evaluation of circuits under radi-
ation particle strikes.
Circuit-level simulation approaches provide accuracy and runtimes which are interme-
diate between device and logic level methods. An exhaustive SPICE based simulation of
radiation events in a circuit would be relatively accurate; however it is still very time con-
suming since a large number of simulations are required to be performed due to the reasons
mentioned in Section II-A. Several approaches have been proposed to model radiation-
induced transients in combinational circuits [51, 52, 44, 37]. In [51], the authors presented
a methodology to analyze compound noise effects in circuits. Their approach utilizes look-
up tables and a database generated from SPICE simulations of all the cells in a library.
Many approaches [52, 44, 37] attempt to solve a non-linear differential equation (called
a Riccati differential equation) of the transistor to obtain a closed-form reduced
model for the radiation-induced transients. For this, several approximations were made in
these approaches which result in a large error.
The authors of [52] presented an exact solution of the Riccati equation using a
computationally expensive infinite power series solution. In [44], a switch-level simulator is
presented, where radiation-induced transient simulation is performed in two steps. In the
first step, a first order RC model is used to compute the pulse width due to a radiation
particle strike, and then in the second step, a set of rules is used for the propagation of
the transient pulse through simple CMOS circuit blocks. Electrical-level simulations are
performed to obtain pulse widths for given resistance (Rg) and capacitance (Cg) values that
model a gate. Then the pulse widths for other R and C values are obtained by using the
linear relationships between the pulse width obtained for Rg and Cg, and the new R and C
values. One major drawback of this approach is that it cannot be used for different values
of the radiation-induced current parameters (Q, τα and τβ). In [37], a closed-form model
is reported for radiation-induced transient simulation for combinational circuits. Again, a
linear RC gate model is used, which is derived using a SPICE-based calibration of logic
gates for a range of values of fanout, charge collected and gate size. In [37, 44], the circuit
simulation approaches assume a linear RC gate model which leads to higher inaccuracy. In
the DSM era, a gate cannot be accurately modeled by a linear RC model [53]. This will also
be demonstrated through an experiment in Section II-C. Also, these approaches neglect the
contribution of the ion track establishment constant (τβ) of the radiation-induced current
pulse of Equation 1.1, which further increases the inaccuracy of the analysis, as explained
in Section II-A. In contrast to these approaches, the model proposed in this chapter uses a
piecewise linear transistor IDS model and also considers the effect of τβ. Both these factors
improve the accuracy of the analytical model for the pulse width computation.
II-C. Proposed Analytical Model for the Pulse Width of Radiation-induced Voltage Glitch
This section describes the analytical model for the pulse width of the radiation-induced
voltage glitch developed in this dissertation. Section II-C.1 discusses the effect of a radiation
particle strike at the output of an inverter, using SPICE [38] simulations. The inverters used
in this discussion were implemented using a 65nm PTM [45] model card with VDD = 1V .
The radiation-induced transients are classified into 4 cases in Section II-C.2. The proposed
model for the pulse width computation, based on these cases, is introduced in Section II-C.3.
Section II-C.4 provides the derivation of the expression for the pulse width of the
radiation-induced voltage glitch.
II-C.1. Radiation Particle Strike at the Output of an Inverter
Consider an inverter INV1 driving three identical inverters as shown in Figure II.1 (a).
These inverters were implemented using a 65nm PTM [45] model card with VDD=1 V.
Note that these inverters were designed such that the switching threshold (VST) is VDD/2
(i.e. 0.5 V). Let node a be at logic value 0 when a radiation particle strikes the diffusion
of INV1. This is modeled by the injection of iseu(t) (described by Equation 1.1) at node a.
The voltage glitches that result from the radiation particle strike are shown in Figure II.1
(b), for four different inverter sizes (1X, 7X, 8X and 10X)³ and for Q=150 fC, τα = 150 ps
and τβ = 50 ps. Note that all 4 inverters of Figure II.1 (a) are identical.
Fig. II.1. a) Radiation-induced current iseu(t) injected at the output node a of inverter INV1
(NMOS M1, PMOS M2, input in), which drives INV2, INV3 and INV4, b) Voltage glitch at node a
³ The width of the NMOS (PMOS) transistor of the 1X inverter is 65 nm (195 nm). The channel
length of both the NMOS and PMOS transistors is 65 nm.
Note from Figure II.1 (b) that, in the case of the 10X inverter, the radiation particle strike
changes the node voltage by less than VST (which is designed to be VDD/2) and hence the
logic value does not change. Hence the radiation particle strike does not cause any error
in circuit operation in this case. In the case of the 8X inverter, the node voltage at a rises to
a value around 0.9 V. As the voltage of node a starts rising, the NMOS transistor M1 of
INV1 is in the linear region of operation. When the node voltage reaches $V^N_{dsat}$, M1 enters
the saturation region of operation. For this case, the PMOS transistor is always in cut-off
(since its VGS = 0). When the radiation particle strike occurs at the output node diffusion
of the 7X inverter, the magnitude of the voltage glitch is around 1.4 V. In this case as well,
M1 starts out in the linear region, and enters the saturation region when the node voltage
at a rises above $V^N_{dsat}$. However, in this case, the PMOS transistor M2 of INV1 also turns
on (in saturation mode) when the voltage of node a reaches VDD + |VTP| (here VTP is the
threshold voltage of the PMOS transistor) because the VGS of M2 becomes smaller than
VTP. In the case of the 1X inverter, the diode between the source diffusion and the bulk of
M2 also turns on (M1 and M2 both conduct in the saturation region under this condition)
when the voltage of node a reaches a value greater than VDD + Vdiode (i.e. 1.6 V). Here,
Vdiode is the diode turn-on voltage, which is 0.6 V for silicon. Therefore, the voltage of node a
gets clamped to a value around 1.6 V. Based on the above discussion, note that M1 and M2
operate in different modes of operation (cut-off, linear and saturation) during the
radiation-induced transient. Therefore, it is not accurate to model INV1 by a linear RC gate
model, as is done in [37, 44].
Based on the above discussion, note also that the inverters of four different sizes oper-
ate quite differently during the radiation-induced transient, and the maximum voltage glitch
magnitude (VGM) determines their behavior at different times during the transient. In fact,
when a radiation particle strikes the output node of INV1, there are 4 cases to consider. In
the next section, each of these cases (distinguished based on the VGM value) is described.
Based on this classification, the proposed analytical closed form expression for the pulse
width of the radiation-induced voltage glitch is derived in Section II-C.4.
II-C.2. Classication of Radiation Particle Strikes
The analysis presented in this chapter is for an inverter with its input at VDD and its output
at GND. The radiation particle strike results in a positive voltage glitch at the output of the
gate. However without loss of generality, the same analysis and the same analytical model
can be used for any type of gate (NAND, NOR, etc), and for any logic values applied to
its inputs. Handling of NAND, NOR, etc. gates is achieved by constructing an equivalent
inverter for the gate. The size of this inverter depends on the given input values of the
gate. The applicability of the proposed model to different gates was verified by applying
the model to a 2-input NAND gate (for all four input combinations). These results are
presented in Section II-D. Note that for multiple input gates, radiation particle strikes at
intermediate nodes of the gate were not considered, because the worst-case transient occurs
when the particle strike occurs at the output node of the gate.
Again consider the inverter INV1 of Figure II.1 (a). INV1 can operate in 4 different
cases during the radiation event transient, based on the maximum voltage glitch magnitude
VGM. The value of VGM depends upon the sizes of the devices M1 and M2, the gate load-
ing at the output node a and the value of Q, τα and τβ. The pulse width of the voltage
glitch is computed differently for these cases, due to the different behavior of M1 and M2
(Figure II.1 (a)) for these cases. The classification of the different cases is as follows.
• Case 1 - VGM ≥ VDD + Vdiode: In this case, with the increasing voltage of node a
(Va), M1 starts conducting in the linear region and enters the saturation region when
Va exceeds $V^N_{dsat}$. M2 starts conducting in the saturation mode once Va
crosses VDD + |VTP|. Eventually, when Va reaches VDD + Vdiode, the voltage between
the source diffusion and the bulk terminal of the PMOS transistor M2 becomes ≥
Vdiode. Therefore, the diode between these two terminals gets forward biased and
starts conducting heavily. Thus Va gets clamped to a value around VDD + Vdiode.
• Case 2 - VDD + |VTP| ≤ VGM < VDD + Vdiode: In this case as well, both M1 and M2
conduct similar to Case 1. However, the diode between the diffusion and the bulk
terminals of M2 remains off.
• Case 3 - VDD/2 ≤ VGM < VDD + |VTP|: Only M1 conducts in this case. M1 starts
conducting in the linear region and when Va crosses $V^N_{dsat}$, M1 enters the saturation
region. M2 remains off in this case.
• Case 4 - VGM < VDD/2: The voltage glitch magnitude is less than VDD/2 and hence
the radiation event does not result in a node voltage change of magnitude greater than
VDD/2.
Out of these 4 cases, the radiation event causes a node voltage glitch of size
greater than VDD/2 for Cases 1, 2 and 3, and thus the analysis is presented for these
cases.
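The classification above can be sketched as a small Python function. The function name and the numeric value used for |VTP| in the usage note are assumptions for illustration only; the thresholds themselves follow the four cases listed above.

```python
def classify_strike(vgm, vdd, vtp_abs, vdiode=0.6):
    """Return the case number (1 to 4) for a maximum glitch magnitude vgm.

    vtp_abs is |VTP| of the PMOS transistor, vdiode the diode turn-on voltage.
    """
    if vgm >= vdd + vdiode:
        return 1  # M1 and M2 conduct, the diffusion-bulk diode of M2 clamps Va
    if vgm >= vdd + vtp_abs:
        return 2  # M1 and M2 conduct, the diode remains off
    if vgm >= vdd / 2.0:
        return 3  # only M1 conducts
    return 4      # the glitch stays below VDD/2, so the pulse width is 0
```

With VDD = 1 V and a hypothetical |VTP| of 0.35 V, the glitch magnitudes quoted earlier for the 7X inverter (about 1.4 V) and the 8X inverter (about 0.9 V) fall into Case 2 and Case 3 respectively.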
II-C.3. Overview of the Model for Determining the Pulse Width of the Voltage Glitch
Figure II.2 (a) schematically illustrates a voltage glitch that results from a radiation strike at
the output node a of INV1. As shown in Figure II.2 (a), the node voltage rises and reaches
VDD/2 at time t1, and the node voltage falls to VDD/2 (after reaching a maximum value
of VGM) at time t2. Hence the width of the voltage glitch of Figure II.2 (a) is t2 − t1. The
goal of the proposed model is to compute t2 − t1 (the width of the glitch). In order to use
the proposed model to compute the pulse width, all the gates of different types and sizes in the
library (LIB) need to be characterized (this was done using SPICE [38]). For each gate (for
all input combinations), the current through the pull-down and pull-up stacks as a function
of the gate output voltage was computed, and stored in a look-up table. The input gate
capacitance (CG) and the output node diffusion capacitance (CD) were also computed as a
function of the input (output) node voltage and stored in look-up tables. For these look-up
table entries, the characterization was performed in discrete steps of 0.1 V. For example,
for INV1 of Figure II.1 (a), the drain to source current IDS through M1 was computed for
different VDS values across M1, when node in is at VDD. The IDS value for M2 was also
computed when in is at the GND value, for different values of VDS across M2. Thus, the
number of current look-up tables (the pull-up and the pull-down current tables) for any gate
is equal to 2^n (where n is the number of inputs of a gate). Similarly, CD was also computed
depending upon the input state of the gate. Therefore, for an n-input gate, the total sizes
of the look-up tables for CG, CD and the current through the pull-down (pull-up) stack are
23·n, 17·2^n and 17·2^n respectively. The saturation voltage Vdsat was also obtained for both
NMOS and PMOS transistors, for the nominal supply voltage value. Note that the proposed
model can be used for a circuit employing voltage scaling by obtaining the Vdsat values for
different supply voltage values. The gate characterization step needs to be performed once
for each gate in a library, and thus it does not affect the run-time of the model.
Fig. II.2. (a) Schematic of the voltage glitch v versus time, crossing VDD/2 at t1 and t2 with
peak value VGM, (b) Flowchart of the proposed model for pulse width calculation: given a gate G,
its input state, the gates in the fanout of G, Q, τα and τβ, and the cell library data (IDS, CG
and CD), determine the value of VGM and the case of operation; if Case 4, the pulse width is 0;
otherwise compute t1 using Eq. 2.7, compute t2 using the case-specific expression (Eq. 2.10 for
Case 1, and Eq. 2.14 or Eq. 2.15 for the remaining cases), and compute the pulse width as t2 - t1
Figure II.2 (b) shows the owchart of the algorithm used by the proposed model to
compute the values of t1 and t2 (and hence estimate the pulse width of the voltage glitch).
The input to the model is a gate G (the radiation event is to be simulated at the output node
of gate G), its input state, the list of gates which are driven by the gate G, and the values of
Q, τα and τβ. The model rst computes VGM and then determines the case that is applicable.
If VGM < V DD/2 (i.e. Case 4 applies), then the pulse width is 0 else t1 is computed. Note
that the expression of t1 is the same for cases 1, 2 and 3. After this, the time t2 is computed
using case specic expressions. Finally the pulse width of the voltage glitch (t2 − t1) is
returned. The steps of the proposed model to compute the pulse width of the voltage glitch
are explained in detail in the following sub-sections.
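The control flow described above can be sketched as follows. This is only a skeleton under stated assumptions: the function name is hypothetical, and the expressions for t1 and t2 are passed in as callables rather than implemented, since the case-specific t2 expressions are derived later in the chapter.

```python
def pulse_width_from_flow(case, compute_t1, compute_t2):
    """Follow the flow of Figure II.2 (b) once VGM has fixed the case.

    case:        1, 2, 3 or 4 (classification based on VGM)
    compute_t1:  callable returning t1 (shared by Cases 1, 2 and 3)
    compute_t2:  dict mapping a case number to a callable returning t2
    """
    if case == 4:
        # The glitch never reaches VDD/2, so no pulse is produced
        return 0.0
    t1 = compute_t1()
    t2 = compute_t2[case]()
    return t2 - t1
```

Passing the expressions as callables keeps the dispatch independent of how each time is evaluated, so neither t1 nor t2 is computed when Case 4 applies.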
II-C.4. Derivation of the Proposed Model for Determining the Pulse Width of the
Voltage Glitch
As mentioned earlier, the discussion of the proposed model assumes that INV1 (Figure II.1
(a)) has its input node in at VDD and the output node a at GND. A radiation particle strike
results in a positive voltage glitch at node a. To ensure that the model for radiation events
in combinational circuit elements is manageable, a piece-wise linear drain-source current
(IDS) expression was used. Consider an NMOS transistor with the input gate terminal at
VDD. Then IDS as a function of VDS can be written as:
$$
I_{DS}^{V_{DS}} =
\begin{cases}
V_{DS}/R_n & \text{linear } (V_{DS} < V^N_{dsat}) \\
K_3 + K_4 \cdot V_{DS} & \text{saturation } (V_{DS} \ge V^N_{dsat})
\end{cases}
$$
Here, Rn is the linear region resistance, which is calculated using the IDS versus VDS
lookup table for VDS values less than $V^N_{dsat}$. Similarly, the constants K3 and K4 are obtained
by using the IDS versus VDS lookup table, for VDS values greater than $V^N_{dsat}$.
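A minimal sketch of how Rn, K3 and K4 might be extracted from an IDS versus VDS look-up table is given below. The text does not state the fitting procedure, so the least-squares fits used here, the function names and the synthetic table values in the usage note are all assumptions for illustration.

```python
def ids_piecewise(vds, rn, k3, k4, vdsat):
    """Piecewise linear IDS model: linear region below vdsat, saturation above."""
    return vds / rn if vds < vdsat else k3 + k4 * vds


def fit_piecewise_ids(vds_table, ids_table, vdsat):
    """Fit Rn, K3 and K4 to look-up table samples (assumed least-squares fit)."""
    lin = [(v, i) for v, i in zip(vds_table, ids_table) if 0.0 < v < vdsat]
    sat = [(v, i) for v, i in zip(vds_table, ids_table) if v >= vdsat]
    # Linear region: IDS = VDS / Rn, a line through the origin
    conductance = sum(v * i for v, i in lin) / sum(v * v for v, _ in lin)
    rn = 1.0 / conductance
    # Saturation region: IDS = K3 + K4 * VDS, an ordinary straight-line fit
    mean_v = sum(v for v, _ in sat) / len(sat)
    mean_i = sum(i for _, i in sat) / len(sat)
    k4 = (sum((v - mean_v) * (i - mean_i) for v, i in sat)
          / sum((v - mean_v) ** 2 for v, _ in sat))
    k3 = mean_i - k4 * mean_v
    return rn, k3, k4
```

As a check, a synthetic table generated with `ids_piecewise` on a 0.1 V grid (hypothetical values Rn = 2 kΩ, K3 = 135 µA, K4 = 50 µS, Vdsat = 0.3 V) is reproduced by `fit_piecewise_ids`.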
To determine the case that is applicable, it is first required to calculate the value of
VGM. This is done as follows.
II-C.4.a. Voltage Glitch Magnitude VGM
A radiation event can result in a voltage glitch with positive pulse width only if
$I^{max}_{seu} > I^{V_{DD}/2}_{DS}$, where $I^{max}_{seu}$ is the maximum value of the radiation-induced
current pulse of Equation 1.1. This condition is used to check whether a radiation event will
result in a voltage glitch of positive pulse width or not. The differential equation for the
radiation-induced voltage transient at the output of INV1 of Figure II.1 (a) is given by:
$$
C \frac{dV_a(t)}{dt} + I^{V_a}_{DS} = i_{seu}(t) \tag{2.1}
$$
where C is the capacitance⁴ at node a. Equation 2.1 is accurate for values of Va between
0 V and VDD + |VTP|. It is used to calculate VGM. Note that if the estimated VGM
from Equation 2.1 is greater than VDD + 0.6 V, then it is assumed that Case 1 applies. In
some instances, a Case 2 VGM value can be diagnosed as a Case 1 situation, which results
in a pessimistic pulse width estimate. The above equation can be integrated with the initial
condition Va(t) = 0 at t = 0 to obtain Va(t). For deep sub-micron processes, Vdsat is much
lower than VGS − VT due to short channel effects. For the 65 nm PTM [45] model card used
in this work, Vdsat for both NMOS and PMOS transistors is lower than VDD/2. Therefore,
to obtain the VGM value, Equation 2.1 is first integrated from the initial condition using
the linear region equation for $I^{V_a}_{DS}$ until Va reaches the $V^N_{dsat}$ value. Then, Equation 2.1 is again
integrated using the saturation region equation for $I^{V_a}_{DS}$ to obtain the Va(t) expression. The
resulting expression for Va(t) is used to calculate the value of VGM.
⁴ The value of C is obtained by the addition of the capacitance of the output diffusion node of
INV1 (CD), interconnect capacitance and the input capacitance of the gates driven by INV1
(n · CG). Here, n is the fanout factor. Note that these capacitance values were obtained over
the operating voltage range.
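The positive-pulse-width condition can be sketched in Python as below. Equation 1.1 is not reproduced in this section, so the double exponential form used here, with amplitude Q/(τα − τβ), is an assumption chosen to agree with the definitions of In and of the peak time given later in this section; the function names are hypothetical.

```python
import math


def i_seu(t, q, tau_a, tau_b):
    """Double exponential current pulse (assumed form of Equation 1.1)."""
    return q / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def t_max_seu(tau_a, tau_b):
    """Time at which i_seu peaks, obtained by setting its derivative to zero."""
    return tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)


def can_glitch(q, tau_a, tau_b, ids_at_half_vdd):
    """True if the peak strike current exceeds the gate current at Va = VDD/2."""
    i_max = i_seu(t_max_seu(tau_a, tau_b), q, tau_a, tau_b)
    return i_max > ids_at_half_vdd
```

For Q = 150 fC, τα = 150 ps and τβ = 50 ps (the values used for Figure II.1), this form peaks at roughly 82.4 ps with a peak current of roughly 0.577 mA.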
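Integrating Equation 2.1 using the linear region equation for $I^{V_a}_{DS}$ and with the initial
condition Va(t) = 0 at t = 0 gives:
$$
V_a(t) = \frac{I_n}{C}\left(\frac{e^{-t/\tau_\alpha}}{X} - \frac{e^{-t/\tau_\beta}}{Y} - Z e^{-t/R_nC}\right) \tag{2.2}
$$
where,
$$
X = \frac{1}{R_nC} - \frac{1}{\tau_\alpha}, \quad
Y = \frac{1}{R_nC} - \frac{1}{\tau_\beta}, \quad
I_n = \frac{Q}{\tau_\alpha - \tau_\beta}, \quad
Z = \frac{1}{X} - \frac{1}{Y}
$$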
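The linear-region solution of Equation 2.2 can be checked numerically with a short Python sketch. The function name is hypothetical, and the values of Rn and C in the usage note are illustrative assumptions, not characterized values from this work.

```python
import math


def va_linear(t, q, tau_a, tau_b, rn, c):
    """Va(t) of Equation 2.2 (M1 in the linear region, Va(0) = 0)."""
    i_n = q / (tau_a - tau_b)
    x = 1.0 / (rn * c) - 1.0 / tau_a
    y = 1.0 / (rn * c) - 1.0 / tau_b
    z = 1.0 / x - 1.0 / y
    return i_n / c * (math.exp(-t / tau_a) / x
                      - math.exp(-t / tau_b) / y
                      - z * math.exp(-t / (rn * c)))
```

With, for example, Rn = 2 kΩ and C = 5 fF (assumed), the expression returns 0 at t = 0, and substituting it back into C·dVa/dt + Va/Rn recovers the injected double exponential current to within the error of a finite-difference derivative.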
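To obtain the time Tsat at which Va(t) reaches the $V^N_{dsat}$ value from Equation 2.2, Equation 2.2
is linearly expanded around the initial guess $T^a_{sat}$. The resulting expression for Tsat is:
$$
T_{sat} = T^a_{sat} +
\frac{V^N_{dsat} - \frac{I_n}{C}\left(\frac{e^{-T^a_{sat}/\tau_\alpha}}{X} - \frac{e^{-T^a_{sat}/\tau_\beta}}{Y} - Z e^{-T^a_{sat}/R_nC}\right)}
{\frac{I_n}{C}\left(-\frac{e^{-T^a_{sat}/\tau_\alpha}}{\tau_\alpha X} + \frac{e^{-T^a_{sat}/\tau_\beta}}{\tau_\beta Y} + \frac{Z}{R_nC} e^{-T^a_{sat}/R_nC}\right)} \tag{2.3}
$$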
To obtain the initial guess $T^a_{sat}$, approximate the rising part of the radiation-induced
current by a line between the origin and the point where iseu(t) of Equation 1.1 reaches
its maximum value $I^{max}_{seu}$. The radiation-induced current iseu(t) reaches $I^{max}_{seu}$ at $T^{max}_{seu}$. Then
substitute this approximated radiation-induced current in the RHS of Equation 2.1 and
integrate it from the initial condition Va(t) = 0 at t = 0 to Va(t) = $V^N_{dsat}$ at t = $T^a_{sat}$ using the
linear region equation for $I^{V_a}_{DS}$. After this, solve for $T^a_{sat}$ by performing a quadratic expansion
of the resulting equation around the origin. The expression for $T^a_{sat}$ is:
$$
T^a_{sat} = \sqrt{\frac{2 V^N_{dsat} \cdot C \cdot T^{max}_{seu}}{I^{max}_{seu}}} \tag{2.4}
$$
where,
$$
T^{max}_{seu} = \frac{\tau_\alpha \tau_\beta}{\tau_\alpha - \tau_\beta} \log\frac{\tau_\alpha}{\tau_\beta}
\quad \text{and} \quad I^{max}_{seu} = i_{seu}(T^{max}_{seu})
$$
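The two-step estimate of Tsat (the initial guess of Equation 2.4 followed by the single linearized correction of Equation 2.3) can be sketched as follows. The function name and the numerical values of Rn, C and Vdsat used in the usage note are assumptions for illustration, and the current pulse is assumed to be the double exponential with amplitude Q/(τα − τβ).

```python
import math


def tsat_estimate(q, tau_a, tau_b, rn, c, vdsat):
    """Return (initial guess, refined Tsat, Va function) for the linear region."""
    i_n = q / (tau_a - tau_b)
    x = 1.0 / (rn * c) - 1.0 / tau_a
    y = 1.0 / (rn * c) - 1.0 / tau_b
    z = 1.0 / x - 1.0 / y

    def va(t):  # Equation 2.2
        return i_n / c * (math.exp(-t / tau_a) / x - math.exp(-t / tau_b) / y
                          - z * math.exp(-t / (rn * c)))

    def dva_dt(t):  # time derivative of Equation 2.2
        return i_n / c * (-math.exp(-t / tau_a) / (tau_a * x)
                          + math.exp(-t / tau_b) / (tau_b * y)
                          + z / (rn * c) * math.exp(-t / (rn * c)))

    # Peak time and peak value of the (assumed) double exponential pulse
    t_max = tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)
    i_max = i_n * (math.exp(-t_max / tau_a) - math.exp(-t_max / tau_b))
    t_guess = math.sqrt(2.0 * vdsat * c * t_max / i_max)          # Equation 2.4
    t_sat = t_guess + (vdsat - va(t_guess)) / dva_dt(t_guess)     # Equation 2.3
    return t_guess, t_sat, va
```

With Q = 150 fC, τα = 150 ps, τβ = 50 ps and the hypothetical values Rn = 2 kΩ, C = 5 fF and Vdsat = 0.3 V, the initial guess is close to 20.7 ps, and the corrected Tsat brings Va much closer to Vdsat than the guess does.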
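So far the expression for Tsat is known, which is the time when Va(t) reaches $V^N_{dsat}$, or
the time when M1 enters the saturation mode. Now, again integrate Equation 2.1 with the
initial condition Va(t) = $V^N_{dsat}$ at t = Tsat, and using the saturation region current equation
for $I^{V_a}_{DS}$. The resulting expression for Va(t) is:
$$
V_a(t) = \frac{I_n}{C}\left(\frac{e^{-t/\tau_\alpha}}{X'} - \frac{e^{-t/\tau_\beta}}{Y'}\right) - \frac{K_3}{K_4} + Z' e^{-K_4 t/C} \tag{2.5}
$$
where,
$$
X' = \frac{K_4}{C} - \frac{1}{\tau_\alpha}, \quad Y' = \frac{K_4}{C} - \frac{1}{\tau_\beta}
$$
and
$$
Z' = V^N_{dsat}\, e^{K_4 T_{sat}/C}
- \frac{I_n}{C}\, e^{K_4 T_{sat}/C}\left(\frac{e^{-T_{sat}/\tau_\alpha}}{X'} - \frac{e^{-T_{sat}/\tau_\beta}}{Y'}\right)
+ \frac{K_3}{K_4}\, e^{K_4 T_{sat}/C}
$$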
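A short Python sketch of the saturation-region solution (Equation 2.5) is given below. The function name is hypothetical, and the values of C, K3, K4, Tsat and Vdsat in the usage note are illustrative assumptions rather than characterized values.

```python
import math


def make_va_saturation(q, tau_a, tau_b, c, k3, k4, t_sat, vdsat):
    """Return Va(t) of Equation 2.5, with Va(t_sat) = vdsat."""
    i_n = q / (tau_a - tau_b)
    xp = k4 / c - 1.0 / tau_a            # X'
    yp = k4 / c - 1.0 / tau_b            # Y'
    growth = math.exp(k4 * t_sat / c)
    zp = (vdsat * growth                 # Z'
          - i_n / c * growth * (math.exp(-t_sat / tau_a) / xp
                                - math.exp(-t_sat / tau_b) / yp)
          + k3 / k4 * growth)

    def va(t):
        return (i_n / c * (math.exp(-t / tau_a) / xp - math.exp(-t / tau_b) / yp)
                - k3 / k4 + zp * math.exp(-k4 * t / c))

    return va
```

For example, with C = 5 fF, K3 = 135 µA, K4 = 50 µS, Tsat = 17 ps and Vdsat = 0.3 V (all assumed), the returned function equals Vdsat at t = Tsat and satisfies C·dVa/dt + K3 + K4·Va = iseu(t) for the assumed double exponential pulse.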
To calculate the value of VGM, first differentiate Equation 2.5 and then equate dVa(t)/dt
to zero and solve for TVGM (the time at which Va(t) reaches its maximum value). Since the
equation dVa(t)/dt = 0 is also transcendental, linearly expand dVa(t)/dt = 0
around T maxseu and then solve for TVGM . The expression for TVGM is:
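\[
T_{VGM} = T^{max}_{seu} + \frac{\frac{e^{-T^{max}_{seu}/\tau_\alpha}}{\tau_\alpha X'} - \frac{e^{-T^{max}_{seu}/\tau_\beta}}{\tau_\beta Y'} + \frac{K_4 Z'}{C}\, e^{-K_4 T^{max}_{seu}/C}}{\frac{e^{-T^{max}_{seu}/\tau_\alpha}}{\tau_\alpha^2 X'} - \frac{e^{-T^{max}_{seu}/\tau_\beta}}{\tau_\beta^2 Y'} + \frac{K_4^2 Z'}{C^2}\, e^{-K_4 T^{max}_{seu}/C}} \tag{2.6}
\]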
Now, calculate VGM by substituting TVGM obtained from Equation 2.6 into Equation 2.5.
Note that by using this method, VGM can be evaluated to be greater than V DD +
0.6V , because the diode is not modeled in Equation 2.1. Therefore, if VGM > VDD+0.6V
then set VGM = V DD + 0.6V . Also note that the effect of M2 turning on (when Va(t)
reaches a value above V DD + |VTP|) is not included. This is done to keep the
analysis simple. It was found that neglecting the contribution of M2’s current minimally
affects the accuracy of the proposed model. The value of VGM determines the case which is
applicable. If Case 4 applies, then the pulse width is 0 since the radiation event does not
affect the logic level of INV1. Otherwise, the times t1 and t2 are computed to calculate the
pulse width of the voltage glitch at node a.
II-C.4.b. Derivation of the Expression for t1
As shown in the flowchart of the proposed model in Figure II.2, the method to compute t1
is identical for Cases 1, 2 and 3. To obtain the expression for t1, substitute t = t1 and Va(t1) =
V DD/2 in Equation 2.5 and then solve for t1 after expanding Equation 2.5 linearly around
the point ta1 (which is an initial guess for t1). Here ta1 = TsatV DD/(2V Ndsat), which is an
estimate of t1, obtained by extrapolating along the line between (0,0) and (Tsat ,V Ndsat ). The
expression for t1 is therefore:
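\[
t_1 = t^{a}_{1} + \frac{\frac{e^{-t^{a}_{1}/\tau_\alpha}}{X'} - \frac{e^{-t^{a}_{1}/\tau_\beta}}{Y'} + \frac{C}{I_n}\left(Z' e^{-K_4 t^{a}_{1}/C} - \frac{K_3}{K_4} - \frac{V_{DD}}{2}\right)}{\frac{e^{-t^{a}_{1}/\tau_\alpha}}{X' \tau_\alpha} - \frac{e^{-t^{a}_{1}/\tau_\beta}}{Y' \tau_\beta} + \frac{K_4 Z'}{I_n}\, e^{-K_4 t^{a}_{1}/C}} \tag{2.7}
\]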
Equation 2.7 gives the time at which the voltage at node a reaches V DD/2. Note that
the contribution of τβ is not ignored in the calculation of t1 (unlike [37, 44]).
II-C.4.c. Derivation of the Expression for t2
The method for obtaining the value of t2 depends upon the value of VGM (i.e. the case that
is applicable). The derivation of the expression for t2, for the different cases is as follows:
Case 1: Consider the voltage and current waveforms of the 1X inverter during the
radiation event as shown in Figure II.3. Figure II.3 shows the voltage of node a, IDS currents
of M1 and M2, and the radiation-induced current pulse (iseu). As shown in Figure II.3, when
iseu(t) becomes equal to the IDS of M1, then at that instant, the IDS of M2 is approximately
equal to 0 and the voltage at node a is VDD + |VTP|. This is an important observation
Fig. II.3. Voltage/Current due to a radiation particle strike at node a of INV1 of Figure II.1 (a)
because this information will be used as the initial condition when integrating the INV1
output node voltage differential equation (Equation 2.1). Let iseu(t) become equal to the
IDS of M1 at time t3. Then Va(t3) = VDD+ |VTP|. To calculate t3, ignore the contribution of
the e−t/τβ term of iseu(t). This is reasonable since τα is usually 3-4 times τβ and therefore
e−t/τβ approaches 0 much faster than the e−t/τα term. Thus the value of e−t/τβ around t3
(which is greater than T maxseu ) will be approximately equal to 0. The expression for t3,
obtained by equating iseu(t) (ignoring the e−t/τβ term) and IVDD+|VT P|DS , is:
\[
t_3 = -\tau_\alpha \log\frac{I^{V_{DD}+|V_{TP}|}_{DS}}{I_n} \tag{2.8}
\]
Now, the radiation-induced current after time t3 is modeled by a line, one of whose
end-points has a current value of IavgDS = 0.5·(IVDD+|VTP|DS + IVDD/2DS ) at a time value of t3.
The other end-point has a current value of 0 at time t∗. The value of t∗ is obtained by
equating the charge deposited by the actual radiation-induced current iseu(t) from time t3 to
infinity with the charge deposited by the linearized radiation-induced current. Hence
the expression for the linear radiation-induced current model is:
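\[
i^{m}_{seu}(t) = I^{avg}_{DS}\left(1 - \frac{t - t_3}{t^{*} - t_3}\right) = K_1 - K_2 t \tag{2.9}
\]
where,
\[
t^{*} = t_3 + 2\,\frac{I_n\left(\tau_\alpha e^{-t_3/\tau_\alpha} - \tau_\beta e^{-t_3/\tau_\beta}\right)}{I^{avg}_{DS}}
\]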
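The charge-matching construction behind Equations 2.8 and 2.9 can be checked with a short script. This is a sketch under stated assumptions: the two drain-current values are hypothetical placeholders (in the model they come from the characterized gate), and K1 and K2 are obtained here simply by expanding the line of Equation 2.9.

```python
import math

def t3_from_current(i_ds, q, tau_a, tau_b):
    """Equation 2.8 form: time at which I_n*exp(-t/tau_a) has decayed to i_ds."""
    i_n = q / (tau_a - tau_b)
    return -tau_a * math.log(i_ds / i_n)

def tail_charge(t3, q, tau_a, tau_b):
    """Charge delivered by the double-exponential pulse from t3 to infinity."""
    i_n = q / (tau_a - tau_b)
    return i_n * (tau_a * math.exp(-t3 / tau_a) - tau_b * math.exp(-t3 / tau_b))

def linearized_tail(t3, i_avg, q, tau_a, tau_b):
    """Return (t_star, K1, K2) of the linear current model in Equation 2.9."""
    t_star = t3 + 2.0 * tail_charge(t3, q, tau_a, tau_b) / i_avg
    k2 = i_avg / (t_star - t3)   # slope magnitude of the line
    k1 = i_avg + k2 * t3         # intercept, so that K1 - K2*t3 = i_avg
    return t_star, k1, k2

# Strike parameters used in this chapter's experiments.
Q, TAU_A, TAU_B = 150e-15, 150e-12, 50e-12
# Hypothetical drain currents of M1 at Va = VDD + |VTP| and at Va = VDD/2.
I_DS_HIGH, I_DS_HALF = 2.0e-4, 1.5e-4

T3 = t3_from_current(I_DS_HIGH, Q, TAU_A, TAU_B)
I_AVG = 0.5 * (I_DS_HIGH + I_DS_HALF)
T_STAR, K1, K2 = linearized_tail(T3, I_AVG, Q, TAU_A, TAU_B)

triangle = 0.5 * I_AVG * (T_STAR - T3)   # charge under the linearized current
print(f"t3 = {T3 * 1e12:.1f} ps, t* = {T_STAR * 1e12:.1f} ps")
print(f"tail charge = {tail_charge(T3, Q, TAU_A, TAU_B):.3e} C, triangle = {triangle:.3e} C")
```

The two printed charges agree by construction, which is the condition used in the text to fix t∗.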
Now substitute imseu(t) for iseu(t) in Equation 2.1, use the saturation region equation for
IVaDS and then integrate the resulting differential equation from time t3 to t2 (where Va(t3) =
V DD + |VTP| and Va(t2) = VDD/2). The resulting equation is solved for t2 by performing
a quadratic expansion around the ta12 point. The resulting expression for t2 is:
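\[
t_2 = t^{a1}_{2} + \frac{-Q + \sqrt{Q^2 - 4PR}}{2P} \tag{2.10}
\]
where,
\[
P = \frac{M K_4^2\, e^{-K_4 t^{a1}_{2}/C}}{2C^2}, \quad Q = \frac{K_2}{K_4} - \frac{M K_4\, e^{-K_4 t^{a1}_{2}/C}}{C}, \quad R = N + \frac{K_2 t^{a1}_{2}}{K_4} + M e^{-K_4 t^{a1}_{2}/C},
\]
\[
N = \frac{V_{DD}}{2} - \frac{K_1 - K_3}{K_4} - \frac{K_2 C}{K_4^2}, \quad M = e^{-K_4 t_3/C}\left(-V_{DD} - |V_{TP}| + \frac{K_1 - K_3}{K_4} - \frac{t_3 K_2}{K_4} + \frac{K_2 C}{K_4^2}\right)
\]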
To obtain the value of ta12 , again integrate Equation 2.1 but this time substitute IVaDS by a
constant current of value IVDD+|VT P|DS . The radiation-induced current is again modeled by a
line with one end-point having a current value of IVDD+|VTP|DS at a time value of t3. The other
end-point is again found by equating the charge deposited by the actual radiation-induced
current iseu(t) from time t3 to infinity with the charge deposited by the linearized
radiation-induced current. Equation 2.1 is then integrated from time t3 to ta12 , which
yields a closed-form expression for ta12 :
\[
t^{a1}_{2} = t_3 + \sqrt{\frac{C \cdot (V_{DD}/2 + |V_{TP}|) \cdot (t^{*} - t_3)}{I^{V_{DD}+|V_{TP}|}_{DS}}} \tag{2.11}
\]
Case 2: In this case, both M1 and M2 conduct because the magnitude of the voltage
glitch is between VDD + |VTP| and VDD + 0.6V . Similar to Case 1, at time t3, iseu(t)
becomes equal to IVDD+|VT P|DS and the voltage of node a is VDD + |VTP|. The value of t3 is
again obtained using Equation 2.8. To obtain the expression for t2, integrate Equation 2.1
with the initial condition Va(t3)=VDD+ |VTP|, using the saturation region current equation
for the IDS of M1. The resulting equation of Va(t) is:
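\[
V_a(t) = \frac{I_n}{C}\left(\frac{e^{-t/\tau_\alpha}}{X'} - \frac{e^{-t/\tau_\beta}}{Y'}\right) - \frac{K_3}{K_4} + Z'' e^{-K_4 t/C} \tag{2.12}
\]
where,
\[
Z'' = (V_{DD} + |V_{TP}|)\, e^{K_4 t_3/C} - \frac{I_n}{C}\, e^{K_4 t_3/C}\left(\frac{e^{-t_3/\tau_\alpha}}{X'} - \frac{e^{-t_3/\tau_\beta}}{Y'}\right) + \frac{K_3}{K_4}\, e^{K_4 t_3/C}
\]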
Now use Equation 2.12 to compute t2. For this, substitute t = t2 and Va(t2) = V DD/2
in Equation 2.12, expand it around the initial guess point ta22 and then solve for t2. Through
simulations and analysis, it was observed that ta22 (the time when iseu(t) falls to IVDD/2DS
after reaching Imaxseu ) can be used as an initial guess for t2 since the node voltage at that time
is close to VDD/2. To find an expression for ta22 , ignore the contribution of the e−t/τβ
term of iseu(t). The expression for ta22 is:
\[
t^{a2}_{2} = -\tau_\alpha \log\frac{I^{V_{DD}/2}_{DS}}{I_n} \tag{2.13}
\]
Now equate Equation 2.12 to VDD/2, expand it around ta22 (from Equation 2.13) and
then solve it for t2. The resulting expression for t2 is:
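\[
t_2 = t^{a2}_{2} + \frac{\frac{e^{-t^{a2}_{2}/\tau_\alpha}}{X'} - \frac{e^{-t^{a2}_{2}/\tau_\beta}}{Y'} + \frac{C}{I_n}\left(Z'' e^{-K_4 t^{a2}_{2}/C} - \frac{K_3}{K_4} - \frac{V_{DD}}{2}\right)}{\frac{e^{-t^{a2}_{2}/\tau_\alpha}}{X' \tau_\alpha} - \frac{e^{-t^{a2}_{2}/\tau_\beta}}{Y' \tau_\beta} + \frac{K_4 Z''}{I_n}\, e^{-K_4 t^{a2}_{2}/C}} \tag{2.14}
\]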
Case 3: In this case, only M1 of Figure II.1 (a) conducts because the magnitude of the
glitch voltage is less than V DD+ |VTP|. Therefore, the voltage of node a from Equation 2.5
can be used to compute t2. The initial guess for t2 is obtained in the same manner as Case
2 using Equation 2.13. Now equate Equation 2.5 to VDD/2, expand it around t a22 (from
Equation 2.13) and then solve it for t2. Hence the expression for t2 is:
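\[
t_2 = t^{a2}_{2} + \frac{\frac{e^{-t^{a2}_{2}/\tau_\alpha}}{X'} - \frac{e^{-t^{a2}_{2}/\tau_\beta}}{Y'} + \frac{C}{I_n}\left(Z' e^{-K_4 t^{a2}_{2}/C} - \frac{K_3}{K_4} - \frac{V_{DD}}{2}\right)}{\frac{e^{-t^{a2}_{2}/\tau_\alpha}}{X' \tau_\alpha} - \frac{e^{-t^{a2}_{2}/\tau_\beta}}{Y' \tau_\beta} + \frac{K_4 Z'}{I_n}\, e^{-K_4 t^{a2}_{2}/C}} \tag{2.15}
\]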
Using the values of t1 and t2 obtained in this section (for Cases 1, 2 and 3), the pulse
width of the radiation-induced voltage glitch at node a can be calculated. Note that τβ is
not ignored in the calculation of t2 as well as t1. The contribution of the e−t/τβ term of
iseu(t) was ignored only during the calculation of the initial guess for t2.
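Equations 2.7, 2.14 and 2.15 share one form: a single linear-expansion step on the saturation-region solution (Equations 2.5 and 2.12), differing only in the initial guess and in whether Z′ or Z″ is used. The sketch below illustrates that shared structure. The helper name make_saturation_model and the values of C, K3, K4, T_SAT and V_DSAT are hypothetical, chosen only for illustration.

```python
import math

def make_saturation_model(q, tau_a, tau_b, c, k3, k4, t_init, v_init):
    """Return (va, step) for the saturation-region solution.

    va(t) has the form of Equations 2.5 and 2.12. The pair (t_init, v_init)
    is the initial condition: (Tsat, Vdsat) gives Z', (t3, VDD + |VTP|)
    gives Z''.
    """
    i_n = q / (tau_a - tau_b)
    xp = k4 / c - 1.0 / tau_a   # X'
    yp = k4 / c - 1.0 / tau_b   # Y'

    def forced(t):
        return math.exp(-t / tau_a) / xp - math.exp(-t / tau_b) / yp

    growth = math.exp(k4 * t_init / c)
    z = (v_init * growth
         - (i_n / c) * growth * forced(t_init)
         + (k3 / k4) * growth)

    def va(t):
        return (i_n / c) * forced(t) - k3 / k4 + z * math.exp(-k4 * t / c)

    def step(t_guess, v_target):
        """One linear-expansion step (form of Equations 2.7, 2.14, 2.15)."""
        decay = math.exp(-k4 * t_guess / c)
        num = forced(t_guess) + (c / i_n) * (z * decay - k3 / k4 - v_target)
        den = (math.exp(-t_guess / tau_a) / (xp * tau_a)
               - math.exp(-t_guess / tau_b) / (yp * tau_b)
               + (k4 * z / i_n) * decay)
        return t_guess + num / den

    return va, step

# Strike parameters used in this chapter's experiments.
Q, TAU_A, TAU_B = 150e-15, 150e-12, 50e-12
# Hypothetical device values, for illustration only.
C, K3, K4 = 2e-15, 1e-4, 2e-5
VDD, V_DSAT, T_SAT = 1.0, 0.3, 9.5e-12

VA, STEP = make_saturation_model(Q, TAU_A, TAU_B, C, K3, K4, T_SAT, V_DSAT)
T1_GUESS = T_SAT * VDD / (2.0 * V_DSAT)   # initial guess used for t1
T1 = STEP(T1_GUESS, VDD / 2.0)
print(f"t1 guess = {T1_GUESS * 1e12:.2f} ps, refined t1 = {T1 * 1e12:.2f} ps")
```

Passing (t3, VDD + |VTP|) as the initial condition and the guess of Equation 2.13 would give the Case 2 computation of t2 in the same way.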
II-D. Experimental Results
The accuracy of the model proposed in this chapter for determining the pulse width of the
radiation-induced voltage glitch was compared with SPICE [38]. The model was implemented
in Perl and is much faster than SPICE simulation. In particular, for the results
shown in this section, the SPICE simulations for the inverter with input 1 (input 0) took
12.6 s (10.9 s) while the Perl script generated the results for both input 1 and input 0 in
0.008 s. Thus, the proposed model is more than 1000× faster. Note that all experiments
were conducted on a Linux-based 3.6 GHz Pentium 4 machine, with 3 GB of RAM.
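As a quick arithmetic check of the quoted speedup, the reported runtimes can be compared directly. The sketch assumes, as the text states, that the 0.008 s model runtime covers both input values.

```python
# Runtimes reported in the text (seconds).
SPICE_INPUT1_S = 12.6
SPICE_INPUT0_S = 10.9
MODEL_BOTH_INPUTS_S = 0.008

# Ratio of total SPICE time to total model time for the same two cases.
speedup = (SPICE_INPUT1_S + SPICE_INPUT0_S) / MODEL_BOTH_INPUTS_S
print(f"speedup = {speedup:.0f}x")
```

The ratio is comfortably above the 1000× figure quoted in the text.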
Table II.1. Pulse Width for INV1 Gate for Q = 150 fC, τα = 150ps and τβ = 50ps (all times and pulse widths in ps)

            INV1 with input 1                           INV1 with input 0
            SPICE            Model                      SPICE            Model
Load  Size    t1   t2  PW S    t1   t2  PW M  % Error     t1   t2  PW S    t1   t2  PW M  % Error
1     1        7  540   533     7  540   533     0.00      7  524   517     6  529   522     0.97
1     2       12  426   414    12  427   415     0.24     11  415   404    11  421   410     1.49
1     4       22  314   292    22  319   296     1.37     20  305   285    19  317   298     4.56
1     6       33  246   213    35  258   223     4.69     30  238   208    29  261   231    11.06
1     8       50  192   142    49  195   146     2.82     44  184   140    43  184   141     0.71
3     1       10  562   552     9  563   553     0.18      9  544   535     9  542   533    -0.37
3     2       16  448   432    15  450   434     0.46     15  435   420    14  434   420     0.00
3     4       28  336   308    27  342   315     2.27     25  326   301    24  330   306     1.66
3     6       42  269   227    42  281   239     5.29     37  258   221    36  257   221     0.00
3     8       62  209   147    61  214   152     3.40     53  200   147    51  199   148     0.68
AVG                                              2.07                                        2.15

Table II.2. Pulse Width for NAND2 gate for Q = 150 fC, τα = 150ps and τβ = 50ps (all pulse widths in ps)

            Inputs 11              Inputs 00              Inputs 01              Inputs 10
Load  Size   PW S  PW M  % Error    PW S  PW M  % Error    PW S  PW M  % Error    PW S  PW M  % Error
1     1       497   501     0.80     404   402     -0.5     521   523     0.38     531   531     0.00
1     2       382   388     1.57     284   288     1.41     408   410     0.49     417   418     0.24
1     4       259   270     4.25     140   141     0.71     289   297     2.77     298   304     2.01
1     6       172   192    11.63       -     -        -     211   228     8.06     220   220     0.00
3     1       512   518     1.17     413   415     0.48     539   538    -0.19     548   548     0.00
3     2       396   404     2.02     292   300     2.74     423   424     0.24     432   434     0.46
3     4       271   285     5.17     145   145      0.0     304   310     1.97     312   318     1.92
3     6       183   191     4.37       -     -        -     224   225     0.45     232   233     0.43
AVG                         3.87                   0.97                   1.82                   0.63

A standard cell library LIB was implemented using a 65 nm PTM [45] model card
with V DD = 1V . The library contains INV, NAND and NOR gates of different sizes and
different numbers of inputs. As mentioned in Section II-C.3, all gates in LIB were
pre-characterized. Specifically, look-up tables for the current through both the pull-up and
down stacks, the input gate capacitance CG and the output node diffusion capacitance CD
(for all input combinations) were obtained for all the gates in LIB. The method used to
obtain the stack current as well as, CG and CD look-up tables is explained in Section II-C.3.
For all experimental results reported in this section, Q = 150 fC, τα = 150ps and τβ = 50ps.
Similar results were obtained for the other values of Q, τα and τβ which are not reported
for brevity.
The proposed model was applied to inverters of different sizes (with both possible
input values) for determining the pulse width of the voltage glitch induced by a radiation
particle strike. The circuit under consideration is similar to Figure II.1 (a) where INV1 is
driving either 1 or 3 inverters of the same size, and a radiation particle strike occurs at the
output node of INV1. The results thus obtained from SPICE and the model are reported
in Table II.1. In Table II.1, Column 1 reports the number of inverters (of the same size
as INV1) present in the fanout of INV1. Column 2 reports the size of INV1 in terms of
multiples of a minimum-sized inverter. Columns 3 to 9 report the results when the input of
INV1 is at the logic value 1. Columns 3 and 4 report the values of times t1 and t2 obtained
using SPICE. Column 5 reports the pulse width (PW S) of the voltage glitch that results from
the radiation particle strike, obtained from SPICE. Columns 6, 7 and 8 report the values
of t1, t2 and the pulse width (PW M) calculated by the proposed model. The percentage
error of the proposed model in the estimation of the pulse width, compared to SPICE, is
reported in Column 9. Columns 10 to 16 report the same results as Columns 3 to 9 but
for the input value of 0. As reported in Table II.1, the proposed model estimates the pulse
width of the voltage glitch due to radiation events quite accurately. The absolute average
estimation error of the model is just 2.07% and 2.15% for the INV1 input values 1 and 0, respectively.
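The percentage error and the absolute average in Table II.1 can be reproduced from the tabulated pulse widths. The sketch below assumes that the error is computed relative to the SPICE pulse width, which matches the tabulated values, and uses the (PW S, PW M) pairs for INV1 with input 1.

```python
def pct_error(pw_spice, pw_model):
    """Signed percentage error of the model pulse width relative to SPICE."""
    return 100.0 * (pw_model - pw_spice) / pw_spice

# (PW S, PW M) pairs in ps, copied from Table II.1, INV1 with input 1.
ROWS_INPUT1 = [
    (533, 533), (414, 415), (292, 296), (213, 223), (142, 146),
    (552, 553), (432, 434), (308, 315), (227, 239), (147, 152),
]

errors = [pct_error(s, m) for s, m in ROWS_INPUT1]
avg_abs_error = sum(abs(e) for e in errors) / len(errors)
print(f"average absolute error = {avg_abs_error:.2f}%")
```

The result agrees with the AVG entry of 2.07% for input 1.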
To demonstrate the applicability of the model to multiple input gates, the model was
also applied to 2-input NAND gates of different sizes (for all input combinations). The
2-input NAND gate drives either 1 or 3 inverters of the same size as the equivalent inverter
of the NAND2 gate, and a radiation particle strike was assumed to occur at the output
node of the NAND2 gate. The results obtained from SPICE and the model are reported
in Table II.2 for all possible input states. Note that a ’-’ entry in Table II.2 means that a
Case 4 situation was found (no glitch). From Table II.2, observe that the absolute average
estimation error of the model is no larger than 3.87% (for input state 11). For the other
input states, the inaccuracy of the model is even lower. The slight inaccuracy of the
proposed model is due to three reasons: i) sometimes the model wrongly diagnoses a Case 2
situation as a Case 1 situation, as mentioned in Section II-C.4.a, ii) the contribution of the
capacitance of the internal node to the output node diffusion capacitance CD in NAND2 was
not accurately estimated, and iii) the Miller feedback from the output node of the loading
gates (like INV2 of Figure II.1) to the node where the radiation particle strikes affects the
pulse width of the voltage glitch. In the proposed model, the effects due to this feedback
were not considered.
To accurately estimate the contribution of the internal node capacitance to the output node
diffusion capacitance CD in NAND2, the approach of [53] can be used to characterize
NAND2 gates.
It can be concluded from Tables II.1 and II.2 that the proposed model for the pulse
width of the voltage glitch due to a radiation event is very accurate. The worst case average
estimation error for inverters and 2-input NAND gates is less than 4%. Compared to
previous approaches [44, 37], the error of the proposed model is much lower. Note that
these previous approaches neglected the contribution of τβ of the radiation-induced current
which leads to under-estimation of the pulse width of the voltage glitch by 10%. Hence,
the inaccuracy of these previous approaches is high.
II-E. Chapter Summary
With the increasing demand for reliable systems, it is necessary to design radiation tolerant
circuits efficiently. To achieve this, techniques are required to analyze the effects of a
radiation particle strike on a circuit and evaluate the circuit’s resilience to such events. By doing
this early in the design flow, significant design effort and resources can be saved. In this
chapter, an analytical model was presented for estimating the pulse width of the radiation-
induced voltage glitch in combinational circuits. The pulse width of the voltage glitch due
to a radiation event is a good measure of the radiation robustness of a design. The proposed
model efficiently and accurately computes the pulse width of the radiation-induced voltage
glitch for any combinational gate. The proposed approach uses a piecewise linear transistor
current model and also considers the effect of the ion track establishment constant τβ of
the radiation-induced current pulse, to improve the accuracy of the analysis. Experimental
results demonstrate that the proposed model is very fast (∼1000× faster than SPICE) and
accurate, with a low pulse width estimation error (less than 4% on average) compared to
SPICE. Thus, the proposed analytical model can be easily incorporated in a design flow to
implement radiation tolerant circuits.
CHAPTER III
RADIATION ANALYSIS - ANALYTICAL DETERMINATION OF THE
RADIATION-INDUCED PULSE SHAPE
III-A. Introduction
It was mentioned in the last chapter that circuit hardening approaches [12, 43, 42] often
employ selective gate upsizing to reduce the area and delay overhead of the resulting
hardened design. These approaches protect only those gates in a circuit which significantly
contribute to the soft error failure rate of the circuit. Such gates in the circuit are identified
based on three masking factors: logical, electrical and temporal masking [4, 12]. These
masking factors were introduced in Section I-A. All three masking factors reduce the
probability of failure due to radiation particle strikes in a combinational circuit. Therefore, for
efficient circuit hardening (with low area and delay overheads), it is important to consider
the effects of all three masking factors.
Of the three masking factors, both logical and temporal masking can be computed without
electrical simulations [4, 12]. However, electrical masking of a gate G in the circuit
depends heavily upon the electrical properties of all the gates along any sensitized path
from the output of G to any primary output of the circuit. Therefore, efficient and accurate
models/simulators for SET events in combinational circuits are required. These simulators
should quickly estimate the shape of the voltage glitch at the node where the radiation
particle strikes, and then propagate the effect of this voltage glitch to the primary outputs
of the circuit. Another reason why models/simulators for SET events are needed is that
when a voltage glitch propagates through the circuit, the pulse width of the voltage glitch
can increase, resulting in pulse spreading [54]. With efficient simulators, it will be possible
to accurately obtain the glitch width at the primary output of the circuit. This is important
for system level circuit hardening approaches [55, 56, 57] which use information about the
radiation-induced voltage glitch at the primary output for soft error detection and tolerance
mechanisms.
In this chapter, an analytical model is presented, which efficiently estimates the shape
of the voltage pulse or glitch that results from a radiation particle strike. The voltage glitch
estimated by this analytical model can be propagated to the primary outputs of the circuit
using voltage glitch propagation tools such as [44, 58, 59]. The properties of the voltage
glitch (such as the magnitude, glitch shape and width) at the primary outputs can be used
to evaluate the radiation robustness of the circuit. Based on the result of this analysis,
circuit hardening approaches can be implemented to achieve the level of radiation tolerance
required.
In the proposed approach for analytical determination of the shape of the radiation-
induced voltage glitch, a model for the load current IGout(Vin,Vout) at the output terminal
of the gate G is used. Note that the load current model of the gate is more accurate
than the piecewise linear transistor IDS model used in Chapter II. As before, the model is
applicable to any general combinational gate with different loading, and for arbitrary values
of collected charge (Q). The effect of the ion track establishment constant (τβ) of the
radiation particle induced current pulse is also considered. Experimental results presented
in Section III-D demonstrate that the proposed model for the shape of the radiation-induced
voltage glitch is fast and accurate.
The rest of the chapter is organized as follows. Section III-B briefly discusses some
additional previous work (in addition to the previous work presented in Chapter II) on
modeling of radiation-induced transients in combinational circuits. The model for the
shape of the radiation-induced voltage glitch developed in this dissertation is described
in Section III-C. Experimental results are presented in Section III-D, followed by a chapter
summary in Section III-E.
III-B. Related Previous Work
In addition to the previous work already discussed in Section II-B, the authors of [60] pre-
sented an iterative approach for soft error rate analysis of combinational circuits (while
accounting for electrical masking). As the approach of [60] estimates the effects of a radi-
ation particle strike iteratively, the speedup obtained over SPICE simulations is not high.
A great deal of research has been conducted on circuit-level modeling and simula-
tion for static timing analysis (STA) [53] and static noise analysis (SNA) [61]. These
approaches can be extended to estimate the shape of the radiation-induced voltage glitch in
combinational circuits. However, the approaches for STA [53] and SNA [61] are iterative,
and hence sometimes require a large number of iterations to converge. Thus, the speedup
obtained by such iterative approaches is not high (the speedup of [53] is 3-70× and that of [61] is
20× compared to SPICE), and also varies widely depending upon the simulation scenario.
In contrast to these iterative approaches, the analytical approach presented in this chapter is
at least 275× faster compared to SPICE for estimating radiation-induced transients at the
output of an inverter.
In [59], the authors developed a general methodology to analyze crosstalk effects in
combinational circuits. The authors developed an analytical model for crosstalk excita-
tion. They also developed an analytical model for propagating voltage glitches in combina-
tional circuits. Note that their voltage glitch propagation tool can be used to propagate the
radiation-induced voltage glitch estimated by the analytical model presented in this chapter.
III-C. Proposed Analytical Model for the Shape of Radiation-induced Voltage Glitch
Consider four identical inverters as shown in Figure III.1 (same as Figure II.1 (a),
replicated here for convenience). A radiation particle strike at the node a is modeled by the
injection of iseu(t) (described by Equation 1.1) at node a. As described in Section II-C.1,
INV1 (as shown in Figure III.1) operates quite differently during the radiation-induced
transient depending on its size, and the maximum voltage glitch magnitude (VGM) determines
the behavior of its MOSFETs at different times during the transient. The analytical model
proposed in this chapter also classifies INV1 (of Figure III.1) as operating in one of four
different cases during a radiation-induced transient. The classification is performed in the
same manner as described in Section II-C.2. The four cases are briefly described below for
completeness.
• Case 1 - VGM ≥ V DD +Vdiode: In this case, with the increasing voltage of node a
(Va), M1 starts conducting in the linear region and enters the saturation region when
the Va becomes more than V Ndsat . M2 starts conducting in the saturation mode once
Va crosses VDD+ |VTP|. Eventually when Va reaches VDD+Vdiode , the voltage be-
tween the source diffusion and the bulk terminal of the PMOS transistor M2 becomes
≥ Vdiode. Therefore, the diode between these two terminals gets forward biased and
it starts conducting heavily. Thus Va gets clamped to a value around V DD+Vdiode.
• Case 2 - VDD+ |VTP| ≤VGM < V DD+Vdiode: In this case as well, both M1 and M2
conduct similar to Case 1. However, the diode between the diffusion and the bulk
terminals of M2 remains off.
• Case 3 - VDD/2 ≤ VGM < V DD + |VTP|: Only M1 conducts in this case. M1 starts
conducting in the linear region and when Va crosses V Ndsat , M1 enters the saturation
region. M2 remains off in this case.
• Case 4 - VGM < V DD/2: This case corresponds to a voltage glitch of magnitude less
than V DD/2 and hence the radiation event does not result in a logic flip at the node.
The shape of the radiation-induced voltage glitch is computed differently for different
cases, due to the different behavior of M1 and M2 (Figure III.1) for these cases (i.e. for
Cases 1, 2 and 3).
An overview of the proposed model is provided in Section III-C.1. Then Section III-
C.2 provides details about the proposed method to determine the shape of the radiation-
induced voltage glitch.
Fig. III.1. Radiation-induced current injected at the output of inverter INV1
III-C.1. Overview of the Proposed Model for Determining the Pulse Shape of the
Voltage Glitch
Similar to the model for the pulse width presented in Chapter II, the analysis in this
chapter is presented for an inverter with its input at VDD and its output at
GND. The radiation particle strike results in a positive voltage glitch at the output of the
gate. Note that the same analysis (and the same analytical model) for the shape of the
radiation-induced voltage glitch can be used for any type of gate (NAND, NOR, etc), with
any logic values applied to its inputs. NAND, NOR, etc. gates are handled
by constructing an equivalent inverter for the gate. The size of this inverter depends on
the given input values of the gate. The applicability of the proposed model to different
gates is verified by applying the proposed model to a 2-input NAND gate (for all four input
combinations) and a 3-input NOR gate (for all eight input combinations). These results are
presented in Section III-D. Note that for multiple input gates, radiation particle strikes
are not considered at intermediate nodes of the gate, because the worst-case transient occurs
when the particle strike occurs at the output node of the gate. Therefore, the estimate of
the voltage glitch at the output node due to a particle strike at any intermediate node will
not be useful for circuit hardening. Hence, the analysis presented in this chapter is only for
radiation strikes at the output node of multi-input gates.
Figure III.2(a) (shown at the top left portion of Figure III.2) schematically illustrates
a voltage glitch that results from a radiation strike at the output node a of INV1. As shown
in Figure III.2(a), the node voltage rises and reaches V Ndsat at time T 1sat , V DD/2 at time t1,
V DD + |VTP| at time T 1P (for Cases 1 and 2), and then after reaching a maximum value of
VGM, the node voltage falls to VDD+ |VTP| at time T 2P (for Cases 1 and 2), VDD/2 at time
t2 and finally to V Ndsat at the time T 2sat . Hence the shape of the voltage glitch of Figure III.2(a)
is defined by the node a voltage equations over the time intervals: (0, T 1sat), {(T 1sat , T 1P ),
(T 1P , T 2P ) and (T 2P , T 2sat)} for Cases 1 and 2 or (T 1sat , T 2sat ) for Case 3, and (T 2sat ,∞) for all
cases. The goal of the proposed approach is to compute the values of all the variables
which form the end-points of these time intervals, and also the node voltage equations of
node a corresponding to these time durations. The proposed approach can also be used to
compute t1 and t2 to obtain the width of the voltage glitch (which is t2− t1).
All the gates in the library LIB (used in this work) were characterized using the same
approach as reported in [53]. For each gate (for all input combinations), the load current
of the gate (Iout(Vin,Vout)) was obtained as a function of its output node voltage, and stored
in a look-up table. The input gate capacitance CG (the output node diffusion capacitance
CD) was also obtained as a function of the input (output) node voltage and stored in a look-
up table. A step size of 0.1 V was used for these look-up table entries. For example, for
INV1 of Figure III.1, the current through the output terminal a (Ia(Vin,Va)) was obtained
for different Va voltage values at a, when the input node in is at VDD and GND (Vin = VDD
and Vin = GND). Thus, the number of current look-up tables for any gate is equal to 2^n
(where n is the number of inputs of a gate). Similarly, CD was also computed based on the
input state of the gate. Therefore, for an n-input gate, the total size of the look-up tables for
CG, CD and load current Iout is 23 · n, 17 · 2^n and 17 · 2^n respectively. This characterization
step is performed once for each gate in a library and thus it does not affect the runtime of
the proposed model. Also, n is typically ≤ 3, hence these lookup tables are quite tractable
in practice.
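A uniformly sampled look-up table of this kind can be queried as sketched below. Two points are assumptions made for illustration and are not stated in the text: that the 17 entries per table span 0 V to 1.6 V in 0.1 V steps, and that values between entries are obtained by linear interpolation. The table contents are synthetic.

```python
def make_voltage_grid(v_max=1.6, step=0.1):
    """Sample voltages 0, step, ..., v_max (17 points for the defaults)."""
    count = int(round(v_max / step)) + 1
    return [k * step for k in range(count)]

def lookup(table, v, step=0.1):
    """Linearly interpolate a uniformly sampled table at voltage v.

    Linear interpolation is an assumed scheme; queries outside the table
    range are clamped to the end-points.
    """
    pos = min(max(v / step, 0.0), float(len(table) - 1))
    lo = min(int(pos), len(table) - 2)
    frac = pos - lo
    return table[lo] + frac * (table[lo + 1] - table[lo])

def n_current_tables(n_inputs):
    """One load-current table per input state, i.e. 2^n tables."""
    return 2 ** n_inputs

GRID = make_voltage_grid()
# Synthetic load current (A) for illustration only: a 5 kOhm resistor.
TABLE = [v / 5e3 for v in GRID]

print(len(GRID), "entries per table")
print(n_current_tables(2), "current tables for a 2-input gate")
print(f"I(0.25 V) = {lookup(TABLE, 0.25):.3e} A")
```

For a 3-input gate this gives 8 current tables and 17 · 8 = 136 load-current entries, in line with the 17 · 2^n count above.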
[Figure III.2: (a) schematic of the voltage glitch at node a versus time, marking V Ndsat , VDD/2, V DD + |VT P| and VGM on the voltage axis and T 1sat , t1, T 1P , T 2P , t2 and T 2sat on the time axis; (b) flowchart of the model]
Fig. III.2. Flowchart of the proposed model for the shape of the radiation-induced voltage
glitch
Figure III.2 (b) shows the flowchart of the algorithm used in the proposed model to
compute the shape of the voltage glitch. The input to the model is a gate G (the radiation
event is to be simulated at the output node of gate G), its input state, the list of gates which
are driven by G, and the values of Q, τα and τβ. The model first computes VGM using
the gate current model for V Ndsat < Va < VDD + |VTP| and then determines the case that is
applicable. If VGM < VDD/2 (i.e. Case 4 applies), then no voltage glitch is reported.
Otherwise if V DD/2 < VGM < VDD+ |VT P| then Case 3 applies and the Case 3 equations are
used to obtain the shape of the voltage glitch. Otherwise VGM is again computed using the
gate current model for Va > VDD + |VTP|. Based on this new value of VGM , the operating
case of gate G is found (either Case 1 or Case 2) and then the corresponding equations are
used to compute the shape and the width of the voltage glitch. The steps of the algorithm
used by the model are explained in the following sub-sections.
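The two-stage case selection of the flowchart can be sketched as follows. The two VGM estimators are passed in as hypothetical zero-argument callables standing in for the analytical expressions derived in the following sub-sections. The diode voltage of 0.6 V is the value used in the text, while the |VTP| value in the usage lines is a placeholder.

```python
def classify_case(vgm, vdd, vtp_abs, v_diode=0.6):
    """Map a glitch magnitude VGM to Case 1-4 as listed in Section III-C."""
    if vgm < vdd / 2:
        return 4              # no logic flip
    if vgm < vdd + vtp_abs:
        return 3              # only M1 conducts
    if vgm < vdd + v_diode:
        return 2              # M1 and M2 conduct, diode stays off
    return 1                  # diode clamps the node voltage

def select_case(vgm_m1_only, vgm_m1_m2, vdd, vtp_abs, v_diode=0.6):
    """Two-stage selection following the flowchart of Figure III.2(b).

    vgm_m1_only: VGM from the K3 + K4*Va load current model.
    vgm_m1_m2:   VGM from the K5 + K6*Va load current model.
    Both are hypothetical callables supplied by the caller.
    """
    vgm = vgm_m1_only()
    case = classify_case(vgm, vdd, vtp_abs, v_diode)
    if case in (3, 4):
        return case, vgm
    # Both transistors conduct: recompute VGM with the second model and
    # resolve between Case 1 and Case 2 only.
    vgm = vgm_m1_m2()
    return (1 if vgm >= vdd + v_diode else 2), vgm

# Placeholder estimators returning fixed values, for illustration only.
print(select_case(lambda: 1.5, lambda: 1.45, vdd=1.0, vtp_abs=0.3))
print(select_case(lambda: 0.9, lambda: 2.0, vdd=1.0, vtp_abs=0.3))
```

Keeping the estimators as callables means the second, more detailed VGM computation is only evaluated when the first stage rules out Cases 3 and 4.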
III-C.2. Derivation of the Model for Determining the Shape of the Radiation-induced
Voltage Glitch
As mentioned earlier, the analysis is presented for INV1 (Figure III.1) with its input node
in at VDD and the output node a at GND. A radiation particle strike results in a positive
voltage glitch at node a. To ensure that the model for radiation events in combinational
circuit elements is manageable, the load current model I INV1a (Vin,Va) of INV1 was simplified.
Note that in the following analysis I INV1a (Va) is used instead of IINV1a (Vin,Va), since the
analysis is presented for Vin = V DD. With the input terminal of INV1 at VDD, I INV1a (Va)
can be written as:
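\[
I^{INV1}_{a}(V_a) =
\begin{cases}
V_a/R_n & V_a < V^{N}_{dsat} \\
K_3 + K_4 \cdot V_a & V^{N}_{dsat} \le V_a < V_{DD} + |V_{TP}| \\
K_5 + K_6 \cdot V_a & V_{DD} + |V_{TP}| \le V_a < V_{DD} + 0.6\,\mathrm{V}
\end{cases}
\]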
Here, Rn is the linear region resistance of M1 (since M2 is off in this region), which
is calculated using the IINV1a (Va) versus Va lookup table for Va values less than V Ndsat . The
constants K3 and K4 are obtained by fitting a linear equation to the points IINV1a (Va) versus
Va from the lookup table for Va values greater than V Ndsat and less than VDD+ |VTP|. When
Va > VDD+ |VTP|, IINV1a (Va) increases super-linearly with Va because both M1 and M2 are
ON. Thus, the constants K5 and K6 are obtained by fitting a least squares line to the points
(Va, IINV1a (Va) ) from the lookup table, for Va values greater than VDD+ |VTP| and less than
V DD+0.6V .
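The extraction of Rn, K3, K4, K5 and K6 from a load-current look-up table can be sketched as follows. Two choices are assumptions made for illustration: Rn is taken from a least-squares line through the origin (the text does not specify how Rn is computed), and the sample table is synthetic rather than characterized data.

```python
def fit_line(points):
    """Ordinary least-squares line through (v, i) points.

    Returns (intercept, slope).
    """
    n = len(points)
    sv = sum(v for v, _ in points)
    si = sum(i for _, i in points)
    svv = sum(v * v for v, _ in points)
    svi = sum(v * i for v, i in points)
    slope = (n * svi - sv * si) / (n * svv - sv * sv)
    return (si - slope * sv) / n, slope

def fit_load_current(points, v_dsat, v_on, v_clamp):
    """Fit the three-region load current model to (Va, I) samples.

    v_on stands for VDD + |VTP| and v_clamp for VDD + 0.6 V.
    """
    lin = [(v, i) for v, i in points if v < v_dsat]
    sat = [(v, i) for v, i in points if v_dsat <= v < v_on]
    both = [(v, i) for v, i in points if v_on <= v < v_clamp]
    # Assumed choice: conductance from a least-squares fit through the origin.
    r_n = sum(v * v for v, _ in lin) / sum(v * i for v, i in lin)
    k3, k4 = fit_line(sat)
    k5, k6 = fit_line(both)

    def i_load(va):
        if va < v_dsat:
            return va / r_n
        if va < v_on:
            return k3 + k4 * va
        return k5 + k6 * va

    return r_n, (k3, k4), (k5, k6), i_load

def _synthetic_current(v):
    """Synthetic piecewise-linear data, for illustration only."""
    if v < 0.25:
        return v / 5e3
    if v < 1.25:
        return 4e-5 + 2e-5 * v
    return -1e-3 + 1e-3 * v

POINTS = [(k / 10, _synthetic_current(k / 10)) for k in range(17)]
R_N, (K3, K4), (K5, K6), I_LOAD = fit_load_current(POINTS, 0.25, 1.25, 1.6)
print(f"Rn = {R_N:.1f} ohm, K3 = {K3:.3e}, K4 = {K4:.3e}, K5 = {K5:.3e}, K6 = {K6:.3e}")
```

The thresholds in the usage line are deliberately placed between grid points so that region membership does not depend on floating-point rounding at the boundaries.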
To determine the applicable case, it is first required to find the value of VGM. The
method of finding VGM is described next.
III-C.2.a. Voltage Glitch Magnitude VGM
A radiation event can result in a voltage glitch of magnitude greater than VDD/2 (a logic
flip) only if Imaxseu > IINV1a (V DD/2), where Imaxseu is the maximum value of the radiation-induced
current pulse (Equation 1.1). This is a necessary condition which is used to check whether a
radiation event will result in a significantly large voltage glitch. The differential equation for the
radiation-induced voltage transient at the output of INV1 of Figure III.1 is given by:
\[
C \frac{dV_a(t)}{dt} + I^{INV1}_{a}(V_a) = i_{seu}(t) \tag{3.1}
\]
where C is the capacitance at node a. The value of C is obtained by adding the average
values of n ·CG, CD and the interconnect capacitance over the operating voltage range,
where n is the fanout factor. The above equation can be integrated with the initial
condition Va(t) = 0 at t = 0 to obtain Va(t). For deep sub-micron processes, Vdsat is much
lower than VGS−VT due to short channel effects. For the 65nm PTM [45] model card used
in this work, Vdsat for both NMOS and PMOS transistors is lower than VDD/2. Therefore,
to obtain the VGM value, Equation 3.1 is first integrated from the initial condition using
IINV1a = Va/Rn until Va reaches the V Ndsat value. Then Equation 3.1 is again integrated using
IINV1a (Va) = K3 + K4 ·Va to obtain the Va(t) expression. Then, the maximum value VGM
attained by this Va(t) expression is obtained. If VGM <VDD+ |VTP| then INV1 is in Case 3.
Otherwise, INV1 operates in either Case 1 or Case 2. In Cases 1 and 2, both M1 and M2
conduct and hence the INV1 load current model K5 + K6 ·Va is used to obtain an accurate
value of VGM . This new value of VGM is used to resolve between Cases 1 and 2; the
methodology to decide between them is explained later. Now integrating Equation 3.1
using I INV1a (Va) = Va/Rn and with the initial condition Va(t) = 0 at t = 0, the expression
obtained for Va(t) is:
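\[
V_a(t) = \frac{I_n}{C} \left( \frac{e^{-t/\tau_\alpha}}{X} - \frac{e^{-t/\tau_\beta}}{Y} - Z e^{-t/(R_n C)} \right) \tag{3.2}
\]
where,
\[
X = \frac{1}{R_n C} - \frac{1}{\tau_\alpha}, \quad
Y = \frac{1}{R_n C} - \frac{1}{\tau_\beta}, \quad
I_n = \frac{Q}{\tau_\alpha - \tau_\beta} \quad \text{and} \quad
Z = \frac{1}{X} - \frac{1}{Y}
\]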
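As a sanity check, Equation 3.2 can be evaluated numerically and substituted back into Equation 3.1. The sketch below assumes that Equation 1.1 is the double-exponential pulse $i_{seu}(t) = I_n (e^{-t/\tau_\alpha} - e^{-t/\tau_\beta})$; the function name and the values of $R_n$ and $C$ in the usage note are hypothetical and chosen only for illustration.

```python
import math


def va_linear(t, q, tau_a, tau_b, rn, c):
    """Equation 3.2: node voltage while M1 is in its linear region (load = Va / Rn)."""
    i_n = q / (tau_a - tau_b)
    x = 1.0 / (rn * c) - 1.0 / tau_a
    y = 1.0 / (rn * c) - 1.0 / tau_b
    z = 1.0 / x - 1.0 / y
    return (i_n / c) * (math.exp(-t / tau_a) / x
                        - math.exp(-t / tau_b) / y
                        - z * math.exp(-t / (rn * c)))
```

For example, `va_linear(0.0, 150e-15, 150e-12, 50e-12, 5e3, 5e-15)` returns zero, which is the initial condition used in the derivation. The expression requires $R_n C$ to differ from both $\tau_\alpha$ and $\tau_\beta$, since $X$ and $Y$ would otherwise vanish.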
To obtain the time $T_{sat}^1$ when $V_a(t)$ reaches the $V_{dsat}^N$ value from Equation 3.2,
linearly expand Equation 3.2 around the initial guess $T_{sat}^{1a}$. The expression for
$T_{sat}^1$ thus obtained is:
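\[
T_{sat}^1 = T_{sat}^{1a} +
\frac{V_{dsat}^N - \frac{I_n}{C} \left( \frac{e^{-T_{sat}^{1a}/\tau_\alpha}}{X} - \frac{e^{-T_{sat}^{1a}/\tau_\beta}}{Y} - Z e^{-T_{sat}^{1a}/(R_n C)} \right)}
{\frac{I_n}{C} \left( -\frac{e^{-T_{sat}^{1a}/\tau_\alpha}}{\tau_\alpha X} + \frac{e^{-T_{sat}^{1a}/\tau_\beta}}{\tau_\beta Y} + \frac{Z}{R_n C} e^{-T_{sat}^{1a}/(R_n C)} \right)}
\tag{3.3}
\]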
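Equation 3.3 is one step of a linear expansion (a Newton step) applied to Equation 3.2. A minimal sketch follows, with hypothetical function names and parameter values; the model applies the step once, whereas the usage note repeats it only to show that it converges for these illustrative values.

```python
import math


def linear_region(q, tau_a, tau_b, rn, c):
    """Return callables for Va(t) of Equation 3.2 and for its time derivative."""
    i_n = q / (tau_a - tau_b)
    rc = rn * c
    x = 1.0 / rc - 1.0 / tau_a
    y = 1.0 / rc - 1.0 / tau_b
    z = 1.0 / x - 1.0 / y

    def va(t):
        return (i_n / c) * (math.exp(-t / tau_a) / x
                            - math.exp(-t / tau_b) / y
                            - z * math.exp(-t / rc))

    def dva_dt(t):
        return (i_n / c) * (-math.exp(-t / tau_a) / (tau_a * x)
                            + math.exp(-t / tau_b) / (tau_b * y)
                            + z * math.exp(-t / rc) / rc)

    return va, dva_dt


def refine_t1sat(t_guess, vdsat_n, va, dva_dt):
    """One application of Equation 3.3 around the guess t_guess."""
    return t_guess + (vdsat_n - va(t_guess)) / dva_dt(t_guess)
```

With `va, dva = linear_region(150e-15, 150e-12, 50e-12, 5e3, 5e-15)` and a starting guess of 20 ps, repeated calls to `refine_t1sat(t, 0.3, va, dva)` settle on the time at which the node reaches an assumed $V_{dsat}^N$ of 0.3 V.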
To obtain the initial guess $T_{sat}^{1a}$, approximate the rising part of the radiation-induced
current by a line between the origin and the point where $i_{seu}(t)$ of Equation 1.1 reaches its
maximum value $I_{seu}^{max}$. The radiation-induced current $i_{seu}(t)$ reaches $I_{seu}^{max}$
at $T_{seu}^{max}$. Then substitute this approximated radiation current in the RHS of Equation 3.1
and integrate it from the initial condition $V_a(t) = 0$ at $t = 0$ to $V_a(t) = V_{dsat}^N$ at
$t = T_{sat}^{1a}$ using $I_a^{INV1}(V_a) = V_a/R_n$. After this, solve for $T_{sat}^{1a}$ by
performing a quadratic expansion of the resulting equation around the origin. The expression for
$T_{sat}^{1a}$ is:
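\[
T_{sat}^{1a} = \sqrt{\frac{2 V_{dsat}^N \cdot C \cdot T_{seu}^{max}}{I_{seu}^{max}}} \tag{3.4}
\]
where,
\[
T_{seu}^{max} = \frac{\tau_\alpha \tau_\beta}{\tau_\alpha - \tau_\beta} \log \frac{\tau_\alpha}{\tau_\beta}
\quad \text{and} \quad
I_{seu}^{max} = i_{seu}(T_{seu}^{max})
\]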
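The peak of the current pulse and the initial guess of Equation 3.4 are simple to evaluate. The sketch below again assumes the double-exponential form of Equation 1.1 with $I_n = Q/(\tau_\alpha - \tau_\beta)$; the function names and the values of $V_{dsat}^N$ and $C$ in the usage note are hypothetical.

```python
import math


def seu_peak(q, tau_a, tau_b):
    """Return (T_seu_max, I_seu_max) for the double-exponential current pulse."""
    i_n = q / (tau_a - tau_b)
    t_max = tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)
    i_max = i_n * (math.exp(-t_max / tau_a) - math.exp(-t_max / tau_b))
    return t_max, i_max


def t1sat_initial_guess(vdsat_n, c, q, tau_a, tau_b):
    """Equation 3.4: initial guess for the time at which Va reaches VNdsat."""
    t_max, i_max = seu_peak(q, tau_a, tau_b)
    return math.sqrt(2.0 * vdsat_n * c * t_max / i_max)
```

For Q = 150 fC, τα = 150ps and τβ = 50ps the current peaks at about 82 ps, and `t1sat_initial_guess(0.3, 5e-15, 150e-15, 150e-12, 50e-12)` gives a guess of roughly 21 ps for the assumed node parameters.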
So far $T_{sat}^1$ (the time when $V_a(t)$ reaches $V_{dsat}^N$, or the time when M1 enters the
saturation mode) is known. Now, again integrate Equation 3.1 with the initial condition
$V_a(t) = V_{dsat}^N$ at $t = T_{sat}^1$, and using $I_a^{INV1}(V_a) = K_3 + K_4 \cdot V_a$. The
resulting expression for $V_a(t)$ is:
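\[
V_a(t) = \frac{I_n}{C} \left( \frac{e^{-t/\tau_\alpha}}{X'} - \frac{e^{-t/\tau_\beta}}{Y'} \right) - \frac{K_3}{K_4} + Z' e^{-K_4 t/C} \tag{3.5}
\]
where,
\[
X' = \frac{K_4}{C} - \frac{1}{\tau_\alpha}, \quad Y' = \frac{K_4}{C} - \frac{1}{\tau_\beta}
\]
and
\[
Z' = V_{dsat}^N e^{K_4 T_{sat}^1/C}
- \frac{I_n}{C} e^{K_4 T_{sat}^1/C} \left( \frac{e^{-T_{sat}^1/\tau_\alpha}}{X'} - \frac{e^{-T_{sat}^1/\tau_\beta}}{Y'} \right)
+ \frac{K_3}{K_4} e^{K_4 T_{sat}^1/C}
\]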
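Equation 3.5 is the solution of Equation 3.1 for a load current that is affine in $V_a$, with the constant $Z'$ fixed by the initial condition. The sketch below writes it for a generic starting point $(t_0, v_0)$ and generic constants $(k_a, k_b)$, which is a generalization introduced here for illustration: with $(T_{sat}^1, V_{dsat}^N, K_3, K_4)$ it reproduces Equation 3.5. The function name and the numerical values in the usage note are hypothetical.

```python
import math


def affine_segment(t, t0, v0, k_a, k_b, q, tau_a, tau_b, c):
    """Va(t) for a load current k_a + k_b * Va, with Va(t0) = v0."""
    i_n = q / (tau_a - tau_b)
    x_p = k_b / c - 1.0 / tau_a
    y_p = k_b / c - 1.0 / tau_b

    def forced(s):
        # Part of the response driven directly by the two exponentials of i_seu
        return (i_n / c) * (math.exp(-s / tau_a) / x_p
                            - math.exp(-s / tau_b) / y_p)

    # Z' chosen so that the initial condition Va(t0) = v0 holds
    z_p = math.exp(k_b * t0 / c) * (v0 - forced(t0) + k_a / k_b)
    return forced(t) - k_a / k_b + z_p * math.exp(-k_b * t / c)
```

Calling `affine_segment(t0, t0, v0, ...)` returns `v0`, which is a quick way to confirm that the constant of integration is consistent with the chosen initial condition.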
To calculate the value of $V_{GM}$, first differentiate Equation 3.5, equate $dV_a(t)/dt$ to
zero and solve for $T_{VGM}$ (the time at which $V_a(t)$ reaches its maximum value). Since the
equation $dV_a(t)/dt = 0$ is also a transcendental equation, linearly expand $dV_a(t)/dt$
around $T_{VGM}^a$ and then solve for $T_{VGM}$. The expression obtained for $T_{VGM}$ is:
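\[
T_{VGM} = T_{VGM}^a +
\frac{\frac{I_n}{C} \left( \frac{e^{-T_{VGM}^a/\tau_\alpha}}{\tau_\alpha X'} - \frac{e^{-T_{VGM}^a/\tau_\beta}}{\tau_\beta Y'} \right) + \frac{K_4 Z'}{C} e^{-K_4 T_{VGM}^a/C}}
{\frac{I_n}{C} \left( \frac{e^{-T_{VGM}^a/\tau_\alpha}}{\tau_\alpha^2 X'} - \frac{e^{-T_{VGM}^a/\tau_\beta}}{\tau_\beta^2 Y'} \right) + \frac{K_4^2 Z'}{C^2} e^{-K_4 T_{VGM}^a/C}}
\tag{3.6}
\]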
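The linear expansion of $dV_a(t)/dt = 0$ needs the first and second time derivatives of Equation 3.5. The sketch below obtains both by differentiating Equation 3.5 directly, so the factor $I_n/C$ multiplies the two exponential terms of the pulse; the function names and the sample values in the usage note are hypothetical.

```python
import math


def slope_terms(t, i_n, c, tau_a, tau_b, k4, z_p):
    """Return (dVa/dt, d2Va/dt2) of Equation 3.5 at time t."""
    x_p = k4 / c - 1.0 / tau_a
    y_p = k4 / c - 1.0 / tau_b
    ea = math.exp(-t / tau_a)
    eb = math.exp(-t / tau_b)
    ek = math.exp(-k4 * t / c)
    d1 = ((i_n / c) * (-ea / (tau_a * x_p) + eb / (tau_b * y_p))
          - (k4 * z_p / c) * ek)
    d2 = ((i_n / c) * (ea / (tau_a ** 2 * x_p) - eb / (tau_b ** 2 * y_p))
          + (k4 ** 2 * z_p / c ** 2) * ek)
    return d1, d2


def refine_peak_time(t_guess, i_n, c, tau_a, tau_b, k4, z_p):
    """One linear-expansion (Newton) step towards dVa/dt = 0."""
    d1, d2 = slope_terms(t_guess, i_n, c, tau_a, tau_b, k4, z_p)
    return t_guess - d1 / d2
```

A call such as `refine_peak_time(100e-12, 1.5e-3, 5e-15, 150e-12, 50e-12, 5e-5, z_p)` performs the single update of the guess, where `z_p` is the value of $Z'$ from Equation 3.5.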
Now, calculate $V_{GM}$ by substituting $T_{VGM}$, obtained from Equation 3.6, into Equation 3.5.
If $V_{GM} < V_{DD}/2$ then Case 4 applies and the radiation event does not flip the logic level
of the affected node. If $V_{DD}/2 \le V_{GM} < V_{DD} + |V_{TP}|$, then Case 3 is applicable.
Otherwise, either Case 1 or Case 2 is applicable. Before describing the methodology to decide
between Case 1 and Case 2, the method to obtain the value of $T_{VGM}^a$ is first discussed.
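The case selection can be summarized as a small decision function. This is a sketch with a hypothetical function name; it assumes, following the case descriptions later in this section, that Case 2 covers $V_{DD} + |V_{TP}| \le V_{GM} < V_{DD} + 0.6$ V and that Case 1 applies once the 0.6 V diode clamp is reached. For Cases 1 and 2 the value of $V_{GM}$ passed in should be the one recomputed with the $K_5 + K_6 \cdot V_a$ load current model.

```python
def classify_case(vgm, vdd, vtp_abs, v_diode=0.6):
    """Return the operating case (1 to 4) for a glitch magnitude vgm in volts."""
    if vgm < vdd / 2.0:
        return 4        # glitch too small to flip the logic level
    if vgm < vdd + vtp_abs:
        return 3        # only M1 conducts
    if vgm < vdd + v_diode:
        return 2        # M1 and M2 conduct, diode stays off
    return 1            # diffusion diode clamps the node voltage
```

For example, with VDD = 1 V and an assumed $|V_{TP}|$ of 0.3 V, a glitch magnitude of 0.9 V falls in Case 3.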
Note that the output node voltage of INV1 (i.e. $V_a(t)$ of Equation 3.5) always attains
its maximum value after $T_{seu}^{max}$ (the time at which $i_{seu}(t)$ of Equation 1.1 reaches
its maximum value $I_{seu}^{max}$). Therefore, integrate Equation 3.1 using a linear model
($i_{seu}^m(t)$) for the radiation-induced current for time $t > T_{seu}^{max}$ and with the
initial condition $V_a(t) = V_a^{sm}$ at $t = T_{seu}^{max}$ (obtained from Equation 3.5). The
radiation-induced linear current model $i_{seu}^m(t)$ has one of its end-points at
$I_{seu}^{max}$, at a time value of $T_{seu}^{max}$. The other end-point has a current value of
0, and its time value $t^*$ is obtained by equating the charge deposited by the actual
radiation-induced current $i_{seu}(t)$ from time $T_{seu}^{max}$ to $\infty$ and the charge
deposited by the linearized radiation-induced current equation. Hence the expression for the
radiation-induced linear current model is:
\[
i_{seu}^m(t) = I_{seu}^{max} \left( 1 - \frac{t - T_{seu}^{max}}{t^* - T_{seu}^{max}} \right) = P + M t \tag{3.7}
\]
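The end-point $t^*$ and the constants $P$ and $M$ follow from the charge-matching condition described above. The sketch below assumes the double-exponential form of Equation 1.1, for which the charge deposited after $T_{seu}^{max}$ is $I_n (\tau_\alpha e^{-T_{seu}^{max}/\tau_\alpha} - \tau_\beta e^{-T_{seu}^{max}/\tau_\beta})$; the function name is hypothetical.

```python
import math


def linear_tail_model(q, tau_a, tau_b):
    """Return (P, M, T_seu_max, t_star, I_seu_max) for Equation 3.7."""
    i_n = q / (tau_a - tau_b)
    t_max = tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)
    i_max = i_n * (math.exp(-t_max / tau_a) - math.exp(-t_max / tau_b))
    # Charge deposited by the actual pulse from t_max to infinity
    q_tail = i_n * (tau_a * math.exp(-t_max / tau_a)
                    - tau_b * math.exp(-t_max / tau_b))
    # The triangle 0.5 * i_max * (t_star - t_max) must hold the same charge
    t_star = t_max + 2.0 * q_tail / i_max
    m = -i_max / (t_star - t_max)
    p = i_max * t_star / (t_star - t_max)
    return p, m, t_max, t_star, i_max
```

By construction, `p + m * t_max` equals the peak current and `p + m * t_star` is zero.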
Now substitute $i_{seu}^m(t)$ for $i_{seu}(t)$ in Equation 3.1, use
$I_a^{INV1}(V_a) = K_3 + K_4 \cdot V_a$ and then integrate. After this, differentiate the
resulting equation for $V_a(t)$, equate $dV_a(t)/dt$ to zero and solve for $T_{VGM}^a$.
Deciding between Case 1 and Case 2: Before deciding whether INV1 is operating in
Case 1 or Case 2, it is first required to compute the time $t_1$ when $V_a(t)$ reaches
$V_{DD}/2$. Then $T_P^1$ (the time when $V_a(t)$ reaches $V_{DD} + |V_{TP}|$) is computed using
$t_1$. After this, integrate Equation 3.1 using the initial condition
$V_a(t) = V_{DD} + |V_{TP}|$ at $t = T_P^1$ and $I_a^{INV1}(V_a) = K_5 + K_6 \cdot V_a$, to
obtain the expression for $V_a(t)$. This expression for $V_a(t)$ is then used to decide between
Cases 1 and 2 using the $V_{GM}$ value. As shown in the flowchart of the proposed approach in
Figure III.2, the method to compute $t_1$ is identical for Cases 1, 2 and 3. To obtain the
expression for $t_1$, substitute $t = t_1$ and $V_a(t_1) = V_{DD}/2$ in Equation 3.5 and then
solve for $t_1$ after expanding it linearly around the point $t_1^a$ (which is an initial guess
for $t_1$). Here $t_1^a = T_{sat}^1 V_{DD}/(2 V_{dsat}^N)$. The expression for $t_1$ is therefore:
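\[
t_1 = t_1^a +
\frac{\frac{e^{-t_1^a/\tau_\alpha}}{X'} - \frac{e^{-t_1^a/\tau_\beta}}{Y'} + \frac{C}{I_n} \left( Z' e^{-K_4 t_1^a/C} - \frac{K_3}{K_4} - \frac{V_{DD}}{2} \right)}
{\frac{e^{-t_1^a/\tau_\alpha}}{X' \tau_\alpha} - \frac{e^{-t_1^a/\tau_\beta}}{Y' \tau_\beta} + \frac{K_4 Z'}{I_n} e^{-K_4 t_1^a/C}}
\tag{3.8}
\]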
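Equation 3.8 can be coded directly, and the same routine can be reused for any other threshold crossing of an affine-load segment by changing the target voltage and the constants. The sketch below uses hypothetical names: `v_target` is $V_{DD}/2$ for $t_1$, and `(k_a, k_b, z)` stand for $(K_3, K_4, Z')$ of Equation 3.5.

```python
import math


def crossing_step(t_guess, v_target, i_n, c, tau_a, tau_b, k_a, k_b, z):
    """One application of Equation 3.8 for the time at which Va(t) = v_target."""
    x_p = k_b / c - 1.0 / tau_a
    y_p = k_b / c - 1.0 / tau_b
    ea = math.exp(-t_guess / tau_a)
    eb = math.exp(-t_guess / tau_b)
    ek = math.exp(-k_b * t_guess / c)
    num = ea / x_p - eb / y_p + (c / i_n) * (z * ek - k_a / k_b - v_target)
    den = ea / (x_p * tau_a) - eb / (y_p * tau_b) + (k_b * z / i_n) * ek
    return t_guess + num / den
```

The update is the usual linear-expansion step, i.e. the guess plus (target minus current voltage) divided by the slope, with numerator and denominator both scaled by $C/I_n$.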
Then compute the time $t = T_P^1$ when $V_a(t)$ reaches $V_{DD} + |V_{TP}|$, since the load
current model of INV1 changes at this time instant. To obtain $T_P^1$, repeat the same steps
followed for the derivation of the $t_1$ expression with the condition
$V_a(t) = V_{DD} + |V_{TP}|$ at $t = T_P^1$ in Equation 3.5, and with the initial guess
$T_P^{1a} = t_1 + (V_{DD} + |V_{TP}| - V_{dsat}^N)/(V_{DD}/2 - V_{dsat}^N)$. The expression for
$T_P^1$ is therefore similar to Equation 3.8 with $t_1^a$ replaced by $T_P^{1a}$, $t_1$ by
$T_P^1$ and $V_{DD}/2$ by $V_{DD} + |V_{TP}|$.

Now integrate Equation 3.1 with the initial condition $V_a(t) = V_{DD} + |V_{TP}|$ at
$t = T_P^1$, and using $I_a^{INV1}(V_a) = K_5 + K_6 \cdot V_a$. The resulting expression for
$V_a(t)$ is:
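\[
V_a(t) = \frac{I_n}{C} \left( \frac{e^{-t/\tau_\alpha}}{X''} - \frac{e^{-t/\tau_\beta}}{Y''} \right) - \frac{K_5}{K_6} + Z'' e^{-K_6 t/C} \tag{3.9}
\]
where,
\[
X'' = \frac{K_6}{C} - \frac{1}{\tau_\alpha}, \quad Y'' = \frac{K_6}{C} - \frac{1}{\tau_\beta}
\]
\[
Z'' = (V_{DD} + |V_{TP}|) e^{K_6 T_P^1/C}
- \frac{I_n}{C} e^{K_6 T_P^1/C} \left( \frac{e^{-T_P^1/\tau_\alpha}}{X''} - \frac{e^{-T_P^1/\tau_\beta}}{Y''} \right)
+ \frac{K_5}{K_6} e^{K_6 T_P^1/C}
\]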
To calculate the maximum value of $V_a(t)$ of Equation 3.9 (i.e. $V_{GM}$, the maximum glitch
magnitude for Case 1 or Case 2), repeat the steps followed while calculating the maximum value
of $V_a(t)$ of Equation 3.5. After obtaining the value of $V_{GM}$, it can be decided whether
INV1 is operating in Case 1 or Case 2. Note that with this method, $V_{GM}$ can evaluate to a
value greater than $V_{DD} + 0.6$ V, because the diode is not modeled in Equation 3.1.
Therefore, if $V_{GM} > V_{DD} + 0.6$ V then $V_{GM}$ is set to $V_{DD} + 0.6$ V.

So far, the expression for $V_{GM}$ is known, which can be used to determine the operating
case of INV1. Also, expressions were derived for $T_{sat}^1$, $t_1$, $T_P^1$ and the INV1 output
node voltage equations for different time durations (Equations 3.2, 3.5, 3.9).
III-C.2.b. Derivation of the Expressions for Case 3
The derivation of the expressions for the shape of the voltage glitch in Case 3 is as follows.
First, derive the expression for $t_2$, i.e. the time when $V_a(t)$ falls to the $V_{DD}/2$
value. Note that in this case, only M1 of Figure III.1 (a) conducts because the magnitude of the
glitch voltage is less than $V_{DD} + |V_{TP}|$. Therefore, Equation 3.5 describes the voltage of
node a for all times $t$ such that $T_{sat}^1 \le t \le T_{sat}^2$. The expression for $t_2$ can
be obtained in a similar manner as $t_1$, with the substitution of $t = t_2$ and
$V_a(t_2) = V_{DD}/2$ in Equation 3.5 and with the initial guess point $t_2^a$. It was observed
that the time when $i_{seu}(t)$ falls to $I_{DS}^{V_{DD}/2}$ after reaching $I_{seu}^{max}$ can be
used as an initial guess ($t_2^a$) for $t_2$ since the node voltage at that time will be close to
$V_{DD}/2$. The contribution of the $e^{-t/\tau_\beta}$ term of $i_{seu}(t)$ was ignored when
calculating $t_2^a$. This is reasonable since $\tau_\alpha$ is usually 3-4 times $\tau_\beta$ and
therefore $e^{-t/\tau_\beta}$ approaches 0 much faster than the $e^{-t/\tau_\alpha}$ term. Thus
the value of $e^{-t/\tau_\beta}$ around $t_2^a$ (which is greater than $T_{seu}^{max}$) will be
approximately equal to 0. The expression for $t_2^a$ is
$-\tau_\alpha \log(I_{DS}^{V_{DD}/2}/I_n)$.
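The initial guess on the falling edge comes from inverting the slow exponential alone. A minimal sketch, with a hypothetical function name and a hypothetical target current in the usage note:

```python
import math


def falling_edge_guess(i_target, q, tau_a, tau_b):
    """Time at which I_n * exp(-t / tau_a) equals i_target.

    The exp(-t / tau_b) term of i_seu(t) is neglected, so the result is meant
    only as an initial guess for a later linear-expansion step.
    """
    i_n = q / (tau_a - tau_b)
    return -tau_a * math.log(i_target / i_n)
```

For Q = 150 fC, τα = 150ps and τβ = 50ps, `falling_edge_guess(2e-4, 150e-15, 150e-12, 50e-12)` is about 302 ps, well after the current peak near 82 ps, which is where neglecting the fast exponential is reasonable.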
Now, again substitute $t = T_{sat}^2$ and $V_a(T_{sat}^2) = V_{dsat}^N$ in Equation 3.5 and
solve for $T_{sat}^2$ in a similar manner as for $t_1$ (Equation 3.8) using the initial guess
$T_{sat}^{2a}$. The expression for $T_{sat}^{2a}$ is
$t_2 + (V_{dsat}^N - 0.5 \cdot V_{DD})/(dV_a(t)/dt|_{t=t_2})$.

To obtain the node a voltage equation for $t > T_{sat}^2$, integrate Equation 3.1 with the
initial condition $V_a(T_{sat}^2) = V_{dsat}^N$ and using $I_a^{INV1}(V_a) = V_a/R_n$. The
expression thus obtained is:
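\[
V_a(t) = \frac{I_n}{C} \left( \frac{e^{-t/\tau_\alpha}}{X} - \frac{e^{-t/\tau_\beta}}{Y} \right) + A_p e^{(T_{sat}^2 - t)/(R_n C)} \tag{3.10}
\]
where,
\[
A_p = V_{dsat}^N - \frac{I_n}{C} \left( \frac{e^{-T_{sat}^2/\tau_\alpha}}{X} - \frac{e^{-T_{sat}^2/\tau_\beta}}{Y} \right)
\]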
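The tail of the glitch can be checked in the same way as the earlier segments. The sketch below assumes that the homogeneous term enters as $+A_p e^{(T_{sat}^2 - t)/(R_n C)}$, which is the form that satisfies $V_a(T_{sat}^2) = V_{dsat}^N$ for the $A_p$ defined above; the function name and the numerical values in the usage note are hypothetical.

```python
import math


def va_tail(t, t2_sat, vdsat_n, q, tau_a, tau_b, rn, c):
    """Node voltage for t > T2sat, where M1 is back in its linear region."""
    i_n = q / (tau_a - tau_b)
    rc = rn * c
    x = 1.0 / rc - 1.0 / tau_a
    y = 1.0 / rc - 1.0 / tau_b

    def forced(s):
        return (i_n / c) * (math.exp(-s / tau_a) / x
                            - math.exp(-s / tau_b) / y)

    a_p = vdsat_n - forced(t2_sat)      # fixes Va(T2sat) = VNdsat
    return forced(t) + a_p * math.exp((t2_sat - t) / rc)
```

For instance, `va_tail(600e-12, 600e-12, 0.3, 150e-15, 150e-12, 50e-12, 5e3, 5e-15)` returns 0.3, the voltage at the start of the segment.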
Now the analytical expression of the radiation-induced voltage glitch for Case 3 is com-
plete. The voltage glitch is described by a set of 3 equations (Equations 3.2, 3.5 and 3.10),
as summarized below:
\[
V_a(t) =
\begin{cases}
\text{Eqn. 3.2} & t < T_{sat}^1 \\
\text{Eqn. 3.5} & T_{sat}^1 \le t \le T_{sat}^2 \\
\text{Eqn. 3.10} & t > T_{sat}^2
\end{cases}
\]
III-C.2.c. Derivation of the Expressions for Case 2
In this case, the magnitude of the voltage glitch $V_{GM}$ is between $V_{DD} + |V_{TP}|$ and
$V_{DD} + 0.6$ V. Therefore, both M1 and M2 of INV1 conduct for a time $t$ such that
$T_P^1 \le t \le T_P^2$. Hence node a's voltage is described by Equation 3.9 (this equation was
used to calculate the $V_{GM}$ value for Cases 1 and 2). To obtain the value of $T_P^2$,
substitute $V_a(T_P^2) = V_{DD} + |V_{TP}|$ for $t = T_P^2$ in Equation 3.9 and then solve for
$T_P^2$ by using $T_P^{2a}$ as the initial guess. The resulting expression for $T_P^2$ is:
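\[
T_P^2 = T_P^{2a} +
\frac{\frac{e^{-T_P^{2a}/\tau_\alpha}}{X''} - \frac{e^{-T_P^{2a}/\tau_\beta}}{Y''} + \frac{C}{I_n} \left( Z'' e^{-K_6 T_P^{2a}/C} - \frac{K_5}{K_6} - (V_{DD} + |V_{TP}|) \right)}
{\frac{e^{-T_P^{2a}/\tau_\alpha}}{X'' \tau_\alpha} - \frac{e^{-T_P^{2a}/\tau_\beta}}{Y'' \tau_\beta} + \frac{K_6 Z''}{I_n} e^{-K_6 T_P^{2a}/C}}
\tag{3.11}
\]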
The value of $T_P^{2a}$ is obtained using the following observation. When $i_{seu}(t)$ becomes
equal to the drain to source current ($I_{DS}$) of M1 of Figure III.1 (a), then at that instant,
the $I_{DS}$ of M2 is approximately equal to 0 and the voltage at node a is
$V_{DD} + |V_{TP}|$. Thus, the value of $T_P^{2a}$ is obtained by solving
$I_a^{INV1}(V_{DD} + |V_{TP}|) = i_{seu}(T_P^{2a})$ (since at this instant the $I_{DS}$ of M2 is
zero, the $I_{DS}$ of M1 is equal to $I_a^{INV1}(V_{DD} + |V_{TP}|)$). In this derivation, the
contribution of the $e^{-t/\tau_\beta}$ term of $i_{seu}(t)$ is ignored for the reason explained
in Section III-C.2.b. The expression for $T_P^{2a}$ is
$-\tau_\alpha \log(I_a^{INV1}(V_{DD} + |V_{TP}|)/I_n)$.
Now calculate the voltage equation of node a for the time duration
$T_P^2 \le t \le T_{sat}^2$. For this, integrate Equation 3.1 with the initial condition
$V_a(t) = V_{DD} + |V_{TP}|$ at $t = T_P^2$, and using $I_a^{INV1}(V_a) = K_3 + K_4 \cdot V_a$.
The resulting expression for $V_a(t)$ is:
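\[
V_a(t) = \frac{I_n}{C} \left( \frac{e^{-t/\tau_\alpha}}{X'} - \frac{e^{-t/\tau_\beta}}{Y'} \right) - \frac{K_3}{K_4} + Z^* e^{-K_4 t/C} \tag{3.12}
\]
where,
\[
Z^* = (V_{DD} + |V_{TP}|) e^{K_4 T_P^2/C}
- \frac{I_n}{C} e^{K_4 T_P^2/C} \left( \frac{e^{-T_P^2/\tau_\alpha}}{X'} - \frac{e^{-T_P^2/\tau_\beta}}{Y'} \right)
+ \frac{K_3}{K_4} e^{K_4 T_P^2/C}
\]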
Using Equation 3.12, the values of $t_2$ and $T_{sat}^2$ can be obtained for Case 2 in the same
manner as $t_2$ and $T_{sat}^2$ were derived for Case 3. After finding the values of $t_2$ and
$T_{sat}^2$, the voltage equation of node a for $t > T_{sat}^2$ is the same as Equation 3.10
(with the values of $t_2$ and $T_{sat}^2$ calculated for this case). Now all variables for this
case have been derived. The equation for the radiation-induced voltage glitch at node a is as
shown below:
\[
V_a(t) =
\begin{cases}
\text{Eqn. 3.2} & t < T_{sat}^1 \\
\text{Eqn. 3.5} & T_{sat}^1 \le t < T_P^1 \\
\text{Eqn. 3.9} & T_P^1 \le t < T_P^2 \\
\text{Eqn. 3.12} & T_P^2 \le t \le T_{sat}^2 \\
\text{Eqn. 3.10} & t > T_{sat}^2
\end{cases}
\]
III-C.2.d. Derivation of the Expressions for Case 1
In this case, both M1 and M2 of Figure III.1 (a) conduct, similar to Case 2. However, when
the voltage at node a reaches a value of $V_{DD} + 0.6$ V, the diffusion diode between node a
and the bulk terminal of M2 gets forward biased and starts conducting heavily. Thus $V_a(t)$
gets clamped to a value around $V_{DD} + 0.6$ V. Therefore, all expressions derived for Case 2
are also applicable to this case, with a slight modification to incorporate the effect of the
diode clamping action. In this case, when Equation 3.9 computes a value greater than
$V_{DD} + 0.6$ V for any time $t$, the voltage of node a is set to $V_{DD} + 0.6$ V. Thus, the
resulting equations for the voltage glitch for this case are:
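\[
V_a(t) =
\begin{cases}
\text{Eqn. 3.2} & t < T_{sat}^1 \\
\text{Eqn. 3.5} & T_{sat}^1 \le t < T_P^1 \\
\min(\text{Eqn. 3.9},\ V_{DD} + 0.6\,\mathrm{V}) & T_P^1 \le t < T_P^2 \\
\text{Eqn. 3.12} & T_P^2 \le t \le T_{sat}^2 \\
\text{Eqn. 3.10} & t > T_{sat}^2
\end{cases}
\]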
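The piecewise definitions for Cases 1, 2 and 3 share one structure: an ordered list of breakpoints selects which closed-form segment to evaluate, and Case 1 additionally clamps the segment given by Equation 3.9. A minimal sketch, in which the function name and the argument layout are hypothetical:

```python
def glitch_voltage(t, breakpoints, segments, v_clamp=None, clamp_index=2):
    """Evaluate a piecewise glitch waveform at time t.

    breakpoints: increasing times, e.g. [T1sat, T1P, T2P, T2sat] for Cases 1 and 2
                 or [T1sat, T2sat] for Case 3
    segments:    one callable per interval, e.g. [eq_3_2, eq_3_5, eq_3_9,
                 eq_3_12, eq_3_10]; must have len(breakpoints) + 1 entries
    v_clamp:     VDD + 0.6 V for Case 1, None when no clamping applies
    clamp_index: position in `segments` of the segment that is clamped
    """
    # Every breakpoint except the last is inclusive on the later segment;
    # the last segment (Eqn. 3.10) applies only for t strictly after T2sat.
    index = sum(1 for b in breakpoints[:-1] if t >= b)
    if t > breakpoints[-1]:
        index += 1
    value = segments[index](t)
    if v_clamp is not None and index == clamp_index:
        value = min(value, v_clamp)
    return value
```

Passing the five segment functions with `v_clamp=None` gives the Case 2 waveform, and passing `v_clamp` equal to $V_{DD} + 0.6$ V gives the Case 1 waveform.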
The equations for the radiation-induced voltage glitch derived in this Section (for
Cases 1, 2 and 3) determine the shape of the glitch. Note that $\tau_\beta$ was not ignored in
the derivation of the voltage glitch equations and in the calculation of all time variables of
the proposed model such as $T_{sat}^1$, $t_1$, $T_P^1$, etc. Sometimes, the contribution of the
$e^{-t/\tau_\beta}$ term of $i_{seu}(t)$ was ignored, but this was done only during the
calculation of the initial guess for these time variables.
III-D. Experimental Results
The accuracy of the proposed model for determining the shape of the radiation-induced
voltage glitch was compared with SPICE [38]. The proposed model was implemented in
Perl and was determined to be 275× faster than SPICE for the estimation of the
radiation-induced voltage glitch at the output of an inverter. For other gates such as NAND,
NOR, etc., SPICE takes more time to simulate a radiation particle strike, due to the larger
number of transistors in these gates, compared to an inverter. However, the runtime of the
proposed approach does not change significantly with different gate types, due to the
utilization of a load current model for all gates. Therefore, the speedup of the model proposed
in this chapter, compared to SPICE simulation, will be higher for NAND, NOR and complex
gates³.
A standard cell library LIB was implemented using a 65nm PTM [45] model card
with VDD = 1V. The library LIB contains INV, NAND and NOR gates of 5 different sizes
(1× to 5×) with different numbers of inputs. As mentioned in Section III-C.1, the look-up
tables for the load current model of the gate, the input gate capacitance CG and the output
node diffusion capacitance CD (for all input combinations) were obtained for all the gates
in LIB. The method to obtain the load current, CG and CD look-up tables was
explained in Section III-C.1.
To validate the applicability of the proposed model to different types of gates, radiation
particle strikes were simulated at the output of INVs, 2-input NANDs and 3-input NORs
using the proposed model. For each gate type, 5 different sizes (1× to 5×) were considered,
with all possible input states. The applicability of the model to different scenarios was also
validated by loading the gates with different loads, and by varying the values of Q, τα and τβ.
All gates were loaded with 1 and 3 inverters of the same size as the equivalent inverter of
the gate G. The radiation particle strikes were simulated for Q = 150 fC, τα = 150ps
and τβ = 50ps, and for Q = 100 fC, τα = 200ps and τβ = 50ps.
The radiation-induced voltage glitches obtained using the proposed model and SPICE
are shown in Figure III.3 for the INV, NAND2 and NOR3 gates, with different scenarios
(as mentioned in the figure). Figure III.3 also reports the operating case for the gate along
with the gate size and the input state. From Figure III.3, observe that the voltage glitch
waveforms obtained using the proposed model match very closely with the voltage glitch
obtained from SPICE. Note that INV, NAND2 and NOR3 of different sizes with all possible
input states and with different radiation-induced current pulses were simulated. However,
for brevity only a few representative waveforms are shown in Figure III.3.

³ For a 2-input NAND gate, the proposed model is 330× faster than SPICE simulations.

[Figure III.3: six panels, each plotting Voltage (V) versus time (ns) for the SPICE and MODEL waveforms.
(a) Case 1: 1X-INV with Input=1 (Q=0.15pC, τα=150ps, τβ=50ps, Load=3X)
(b) Case 3: 4X-INV with Input=0 (Q=0.1pC, τα=200ps, τβ=50ps, Load=3X)
(c) Case 2: 3X-NAND2 with Input=11 (Q=0.1pC, τα=200ps, τβ=50ps, Load=1X)
(d) Case 1: 2X-NAND2 with Input=00 (Q=0.15pC, τα=150ps, τβ=50ps, Load=1X)
(e) Case 3: 4X-NOR3 with Input=010 (Q=0.1pC, τα=200ps, τβ=50ps, Load=3X)
(f) Case 2: 5X-NOR3 with Input=000 (Q=0.15pC, τα=150ps, τβ=50ps, Load=1X)]
Fig. III.3. Radiation-induced voltage glitches obtained using the proposed model and SPICE
for different gates

The waveforms shown in Figure III.3 were chosen to demonstrate the applicability of the proposed model
to different scenarios. Figure III.3(b) corresponds to a Case 3 scenario in which a 4×
INV has its input at GND value and is driving three 4× INVs. In this case, the voltage glitch
predicted by the model deviates from SPICE when the affected node voltage drops to 0.2 V.
This is due to the Miller feedback from the switching of the output of the loading inverters
(three 4× INVs) to the node affected by the radiation strike. The effect of the Miller feedback
is more dominant for gates operating in Case 3 than for Cases 1 and 2. This is because
in Case 3, the effect of a radiation particle strike is lower than in Case 1 or 2, and hence
the Miller feedback has a significant impact on the voltage glitch. Slight mismatches can
also be observed in some of the voltage glitch waveforms of Figure III.3. These are due to
the modeling error which is introduced by gate characterization (which is performed with
a voltage step of 0.1 V).
The performance of the model was quantified by calculating the root-mean-square-
percentage (rmsp) error of the voltage glitches obtained using the model, compared to the
glitch waveforms obtained using SPICE. Note that the rmsp error was used to compare the
accuracy of the proposed model with SPICE because the goal of the proposed model is to
accurately estimate the radiation-induced voltage transient waveform (voltage glitch). This
voltage glitch can then be propagated to the primary outputs of the circuit using voltage
glitch propagation tools such as [44, 58, 59] to evaluate the radiation robustness of the
circuit. The rmsp error was computed over the time period for which the affected node
voltage value is greater (less) than VTN (VDD − |VTP|) for a positive (negative) glitch.
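The exact formula for the rmsp error is not spelled out here, so the following Python sketch assumes one common definition: 100 times the square root of the mean squared relative deviation from SPICE, restricted to samples inside the glitch window. The function name is hypothetical, and only the positive-glitch window (SPICE voltage above a threshold such as VTN) is handled, since the negative-glitch case would need a rule for samples near 0 V.

```python
import math


def rmsp_error(model, spice, v_window):
    """Assumed rmsp error (%) between two equally sampled waveforms.

    Only samples whose SPICE voltage exceeds v_window are included.
    """
    pairs = [(m, s) for m, s in zip(model, spice) if s > v_window]
    if not pairs:
        raise ValueError("no samples inside the glitch window")
    mean_sq = sum(((m - s) / s) ** 2 for m, s in pairs) / len(pairs)
    return 100.0 * math.sqrt(mean_sq)
```

For example, two in-window samples that each deviate from SPICE by 10% give an rmsp error of about 10%, regardless of how far samples outside the window deviate.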
Table III.1 reports the rmsp error of the model for 3× gates and with a radiation particle
strike with Q = 150 fC, τα = 150ps and τβ = 50ps, for all possible input states. Column
1 reports the number of inverters driven by the gate reported in Column 2. Note that the
loading inverters are of the same size as the equivalent inverter of the corresponding gate.
Columns 3 through 10 report the rmsp error of the voltage glitch estimated by the model,
compared to SPICE, for all possible input states. Column 11 reports the average rmsp error
for a 3× gate, averaged over all of its possible input states. A blank entry in Table III.1
indicates that the input state of the corresponding column is not applicable to the
corresponding gate. Observe from Table III.1 that the proposed model is able to predict the
radiation-induced voltage glitch for 3× gates with a small rmsp error of 5.06% (as reported by
the last row of Table III.1) averaged over all gates for all input states. Similar results were
obtained for Q = 100 fC, τα = 200ps and τβ = 50ps.

Table III.1. RMSP Error of the Proposed Model for 3× Gates and Q = 150 fC, τα = 150ps
and τβ = 50ps

              Input State (rmsp error, %)
Load  Gate    0      1      2      3      4      5      6      7      Avg. RMSP Err.
1     INV     2.86   2.62                                             2.74
1     NAND2   3.75   3.05   3.3    4.0                                3.52
1     NOR3    3.45   2.43   7.06   3.65   10.85  5.38   7.40   8.76   6.12
3     INV     3.46   4.94                                             4.2
3     NAND2   3.72   3.36   3.57   5.89                               4.13
3     NOR3    3.29   4.24   5.13   4.51   9.40   5.72   5.64   10.41  6.04
Avg.                                                                  5.06

Table III.2. RMSP Error of the Proposed Model for Different Gates Sizes and Q = 150 fC,
τα = 150ps and τβ = 50ps

              Gate Size (rmsp error, %)
Load  Gate    1×     2×     3×     4×     5×     Avg. RMSP Err.
1     INV     2.72   2.66   2.74   3.08   3.49   2.94
1     NAND2   3.45   3.27   3.52   3.93   3.80   3.6
1     NOR3    4.43   4.76   6.04   6.96   6.02   5.64
3     INV     3.66   3.95   4.20   4.61   5.15   4.3
3     NAND2   3.81   3.83   4.14   4.69   4.53   4.2
3     NOR3    4.77   4.99   6.12   6.98   7.12   6.00
Avg.                                             4.45
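The averages in Table III.1 can be reproduced with a few lines of Python. The dictionary below is a transcription of the table entries (the variable names are hypothetical). The check suggests that the 5.06% figure in the last row is the mean over all 28 individual entries rather than the mean of the six row averages.

```python
# rmsp errors (%) transcribed from Table III.1, keyed by (load, gate).
TABLE_III_1 = {
    (1, "INV"): [2.86, 2.62],
    (1, "NAND2"): [3.75, 3.05, 3.3, 4.0],
    (1, "NOR3"): [3.45, 2.43, 7.06, 3.65, 10.85, 5.38, 7.40, 8.76],
    (3, "INV"): [3.46, 4.94],
    (3, "NAND2"): [3.72, 3.36, 3.57, 5.89],
    (3, "NOR3"): [3.29, 4.24, 5.13, 4.51, 9.40, 5.72, 5.64, 10.41],
}

# Per-row average (last column of the table)
row_avg = {key: sum(vals) / len(vals) for key, vals in TABLE_III_1.items()}

# Average over every individual entry (last row of the table)
all_entries = [e for vals in TABLE_III_1.values() for e in vals]
overall_avg = sum(all_entries) / len(all_entries)

# Average of the six row averages, shown for comparison
avg_of_rows = sum(row_avg.values()) / len(row_avg)
```

Here `overall_avg` evaluates to about 5.07 and `avg_of_rows` to about 4.46, so the two ways of averaging differ noticeably because the NOR3 rows contribute eight entries each.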
Table III.2 reports the rmsp error of the proposed model, for different gate sizes (from
1× to 5×) with Q = 150 fC, τα = 150ps and τβ = 50ps, averaged over all possible input
states for the gate. Table III.2 shows that the proposed model for estimating the shape of
the radiation-induced voltage glitch is accurate, with an average rmsp error of 4.45% over all
simulated scenarios (different gate types, gate loading and gate sizes). Also, the proposed
approach is at least 275× faster than SPICE simulations.

Fig. III.4. Radiation-induced voltage glitch at 2X-INV1

Note that the best known previous analytical approach to predict the radiation-induced
voltage glitch is reported in [37] to be just 100× faster than SPICE. Also in [37], the authors
report that their approach sometimes yields a 15% error in the radiation-induced glitch,
compared to SPICE. Moreover, the authors ignore the effect of the ion track establishment
constant (τβ), by setting it to zero for both their model as well as for their SPICE
simulations. To evaluate the impact of ignoring τβ on the radiation-induced voltage glitch,
radiation particle strikes were simulated in SPICE (with and without the inclusion of τβ) at
the output of inverters of different sizes (1× to 5×). These simulations were performed for two
different sets of radiation strike parameter values (Q = 150 fC, τα = 150ps and τβ = 50ps, as
well as Q = 100 fC, τα = 200ps and τβ = 50ps) and for different loads on the inverters. For
Q = 150 fC, τα = 150ps and τβ = 50ps (Q = 100 fC, τα = 200ps and τβ = 50ps), it was found that
ignoring τβ results in an underestimation of the pulse width of the voltage glitch by 10% (8%).
The voltage waveforms at the output of a 2X inverter under a radiation particle strike with and
without the inclusion of the τβ term (for Q = 150 fC, τα = 150ps and τβ = 50ps) are shown
in Figure III.4. The rmsp error of the voltage glitch without τβ (shown in Figure III.4) is
40%, which is much higher than the error of the proposed approach. Thus, for an accurate
analysis, it is crucial to include the contribution of τβ. As mentioned earlier, the authors
of [37] ignore τβ and therefore the error of their approach can be much higher than reported
in [37], when compared with the shape of the radiation-induced voltage glitch obtained
while considering the contribution of τβ. The analytical model presented in this chapter
is the first to model the effect of both τα and τβ, for the estimation of the shape of the
radiation-induced voltage glitch. Thus, the proposed approach is more accurate than the
best known previous approach [37].
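The difference between the two current models can be seen directly from the pulse shapes. The sketch below assumes that Equation 1.1 is the double-exponential pulse with $I_n = Q/(\tau_\alpha - \tau_\beta)$ and that setting τβ to zero reduces it to $(Q/\tau_\alpha) e^{-t/\tau_\alpha}$; the function names, the integration window and the step count are hypothetical.

```python
import math


def i_seu(t, q, tau_a, tau_b=0.0):
    """Radiation-induced current; tau_b = 0 gives the single-exponential pulse."""
    if tau_b == 0.0:
        return (q / tau_a) * math.exp(-t / tau_a)
    return q / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def deposited_charge(q, tau_a, tau_b, t_end=3e-9, steps=30000):
    """Trapezoidal integral of i_seu from 0 to t_end."""
    h = t_end / steps
    total = 0.5 * (i_seu(0.0, q, tau_a, tau_b) + i_seu(t_end, q, tau_a, tau_b))
    total += sum(i_seu(k * h, q, tau_a, tau_b) for k in range(1, steps))
    return total * h
```

Both pulses deposit the same charge Q, but with τβ = 0 the current starts at its maximum value Q/τα at t = 0, whereas with τβ = 50ps it starts at zero and peaks later at a lower value. This shift in when the charge arrives is consistent with the waveform differences reported above.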
III-E. Chapter Summary
In this chapter, an analytical model for the determination of the shape of radiation-induced
voltage glitches in combinational circuits was presented. The radiation-induced voltage
glitch at an internal node of a circuit can be propagated to the primary outputs of the circuit
(using existing tools [44, 58, 59]) to account for the effects of electrical masking. This
enables an accurate and quick evaluation of the radiation robustness of a circuit. Experi-
mental results demonstrate that the proposed model is very accurate, with a very low root
mean square percentage error in the estimation of the shape of the voltage glitch (of 4.5%)
compared to SPICE. The model presented in this chapter gains its accuracy by using a
piecewise-linear model for the load current of the gate, and by considering the effect of τβ
of the radiation-induced current pulse. The analytical model is very fast (275× faster than
SPICE) and accurate, and can therefore be easily incorporated in a design flow to implement
radiation tolerant circuits. The next chapter presents a model for the dynamic stability
of an SRAM cell during a radiation particle strike.
CHAPTER IV
RADIATION ANALYSIS - MODELING DYNAMIC STABILITY OF SRAMS IN THE
PRESENCE OF RADIATION PARTICLE STRIKES
IV-A. Introduction
Static random access memories (SRAMs) are an integral part of modern microprocessors
and systems on chips (SoCs). Typically, SRAMs occupy more than half of the total chip
area [46]. Hence from an economic viewpoint, SRAM yield is very important. In the deep
submicron era, the IC supply voltage is often scaled to reduce power consumption [62, 63,
39]. At the same time, noise effects in VLSI designs are increasing [64]. Although voltage
scaling reduces the dynamic energy consumption of the IC quadratically, it also reduces the
noise tolerance of SRAM cells [63]. Thus SRAM stability analysis has become an essential
design task, required to improve the yield of processors and SoCs.
Traditionally, static stability analysis was performed for memory designs in nanometer
scale technologies. The static noise margin (SNM) [46] is one such metric traditionally
used for static SRAM stability analysis. SNM is the maximum amplitude of the voltage
deviation on an input node that can be tolerated, without causing a change in the memory
state. SNM is obtained by injecting static noise of constant amplitude (for an infinite
duration). However, transient noise will typically not be present for a long duration. Also it
is possible that noise of a larger amplitude may lead to a temporary disturbance in the SRAM,
but not affect the SRAM state at all. Both the amplitude and the duration of a noise event
together determine whether the SRAM state will flip or not. However, SNM based stability
analysis fails to capture the time-dependent properties of specific noise events. Hence, the
use of SNM to analyze the stability of an SRAM cell (during the design phase) unnecessarily
reduces design options, leading to overdesign.
To capture the effects of spectral and time-dependent properties of a noise signal on an
SRAM cell, dynamic or time-dependent stability analysis needs to be performed. Dynamic
noise margin (DNM) is one such metric for time-dependent stability analysis, which results
in a more realistic SRAM noise analysis. However, most of the dynamic stability analysis
methods proposed so far involve transistor and device level simulations which are quite
complex and time-consuming in nature [1, 2]. Recently, a model for dynamic stability
of SRAMs was reported in [39]. However, this model assumes a rectangular noise signal,
which is typically not realistic for practical noise sources. This results in a large error
compared to SPICE simulations. Therefore, there is a need to develop simple and accurate
models for the dynamic stability of SRAMs, which capture the time-dependent nature of
the radiation-induced noise signal more closely.
As described in Chapter I, with technology scaling, radiation particle strikes continue
to be problematic for SRAMs. Whether a radiation particle strike (or any other transient
noise event) results in the state of an SRAM cell being flipped or not depends upon both the
amount of charge dumped, as well as the time constants associated with the radiation
particle strike. In addition to this, the electric and geometric parameters of an SRAM cell
also play an important role in determining if the SRAM state flips. If the amount of charge
dumped by a radiation strike is not sufficient, then the strike will only cause a temporary
disturbance and will not result in a state flip. Hence, it is important to develop a compact
and accurate model for SRAM cell stability in the presence of radiation particle strikes.
Such a model would be very useful to an SRAM designer, allowing them to quickly and
accurately evaluate the radiation tolerance of their SRAM cell and make it more reliable. The
work presented in this chapter develops a model for the dynamic stability of a 6-T SRAM
cell in a holding state, in the presence of radiation events. The model proposed in this
chapter can predict the effect of radiation particle strikes quite accurately, and the average
error in the estimation of the critical charge of the proposed model is just 4.6% compared to
SPICE simulation. The extension of this work to evaluate the dynamic stability of SRAM
during other modes (read mode or write mode) of the SRAM cells is straightforward.
The rest of the chapter is organized as follows. Section IV-B briefly discusses some
previous work on modeling the static and dynamic noise margin of SRAMs. In Section IV-
C, the proposed model for the dynamic stability of an SRAM cell in the presence of a
radiation event is described. Experimental results are presented in Section IV-D, followed
by a chapter summary in Section IV-E.
IV-B. Related Previous Work
The stability analysis of an SRAM cell has been a topic of great interest for more than a
couple of decades, due to its importance in obtaining high yields for microprocessor and
SoC designs. A rich and well developed theory exists for static stability analysis of an
SRAM cell [65, 66, 46]. In [66], the authors proved the formal equivalence of four different
criteria for worst-case static noise margin. In [46], explicit analytical expressions
were presented for the static-noise margin (SNM) as a function of device parameters and
supply voltage. Several studies have also been performed to evaluate the effects of process
variations on SRAM cell stability, using the SNM [67, 68]. Although a lot of work has been
done in static stability analysis, not much work has been reported on the dynamic stability
analysis of SRAM cells. Most of the previous work on evaluating the effect of radiation
particle strikes (or transient noise) on SRAM stability has used either device level or
transistor level simulations [1, 2]. Thus, these methods are time consuming and cumbersome to
apply. Recently, in [39], an analytical model for SRAM dynamic stability was presented,
for a noise signal consisting of a rectangular current pulse. The authors used non-linear
system theory to derive the equation for the minimum duration of the noise current which
results in the flipping of the SRAM cell state, given the amplitude of the noise current. The
authors also attempted to apply their approach to perform transient noise analysis in the
presence of radiation particle strikes, but the error of their approach is quite large (11%)
compared to SPICE. Also they used a single exponential noise current to model radiation
strikes. However, the current due to a radiation particle strike is modeled more accurately
by a double exponential [1, 2, 3, 5, 69] current pulse (Equation 1.1). In contrast, the model
presented in this chapter utilizes a double exponential current pulse (Equation 1.1) to model
a radiation particle strike, and it is able to predict whether a radiation event will result in a
state flip in a 6T-SRAM cell with greater accuracy.
IV-C. Proposed Model for the Dynamic Stability of SRAMs in the Presence of Radiation
Particle Strikes
In this chapter, a model for the dynamic stability of a 6-T SRAM cell in the presence of a
radiation event is presented. Since radiation strikes are random events, when such an event
occurs, an SRAM cell will most likely be in a holding state. Thus, the proposed model is
presented only for the holding state. However, the extension of the approach presented in
this chapter to other states of the SRAM is straightforward.
The approach proposed in this chapter to model the dynamic stability of an SRAM
is inspired by the non-linear system theory based formulation presented in [39]. For
brevity, a limited description of the theoretical concepts used in the approach of [39]
is provided. Figure IV.1 shows the schematic of a 6-T SRAM cell (note that the ac-
cess transistors are not shown) with the radiation-induced current (iseu(t)) being injected
into node n2. The total capacitance seen at node n1 (and n2) is modeled by a capaci-
tor of value C, connected between n1 (and n2) and ground. The state of the SRAM cell
shown in Figure IV.1 is described by a pair of node voltages (Vn1,Vn2). From the voltage
transfer characteristics (VTC) of Inverter 1 and Inverter 2, the equilibrium points of the
SRAM cell are (VDD,GND), (GND,VDD) and (VDD/2,VDD/2). Out of these equilibrium
points, (VDD,GND) and (GND,VDD) are the stable equilibria whereas (VDD/2,VDD/2)
is a metastable equilibrium. The two VTCs (drawn on the same plot) of the inverters in
the SRAM cell also form the state space for the SRAM cell system. In the state space
of the SRAM cell, the region of attraction for the (VDD,GND) equilibrium point is the
region described by Vn1 > Vn2. This means that if the node voltages of the SRAM cell
satisfy the condition Vn1 > Vn2, then under no external input, the state of SRAM will reach
the (VDD,GND) equilibrium point. Similarly, Vn1 < Vn2 is the region of attraction for the
(GND,VDD) state. The metastable equilibrium point represents the condition Vn1 = Vn2.
Therefore, when an SRAM cell is in the metastable state, a small amount of noise at
node n1 or n2 will drive the state of the SRAM cell to one of the stable equilibrium
points. Thus, a radiation particle strike can flip the SRAM cell state if the radiation-induced
current can change the state of the SRAM cell from a stable equilibrium point to the region
of attraction of the other stable equilibrium point, or to the metastable point. This criterion
is used to evaluate the dynamic stability of the SRAM.
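The flip criterion described above can be illustrated with a minimal Python sketch. The function names and the tolerance argument `tol` are illustrative assumptions introduced here; they are not part of the formulation of [39].

```python
def region_of_attraction(vn1, vn2, tol=1e-9):
    """Classify the SRAM state (vn1, vn2) by the equilibrium it relaxes to.

    tol is a hypothetical numerical tolerance for treating Vn1 == Vn2.
    """
    if abs(vn1 - vn2) <= tol:
        return "metastable"
    return "(VDD,GND)" if vn1 > vn2 else "(GND,VDD)"


def strike_flips_cell(initial_state, vn1, vn2):
    """Return True if the perturbed state has left the initial region of attraction.

    Reaching the metastable point is counted as a flip, as in the text.
    """
    return region_of_attraction(vn1, vn2) != initial_state
```

For a cell that starts in the (VDD,GND) state, any perturbed state with Vn1 ≤ Vn2 is reported as a flip by this sketch.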
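[Figure: cross-coupled Inverter 1 and Inverter 2 (transistors M1 to M4), node capacitances C at n1 and n2 with node voltages Vn1(t) and Vn2(t), and noise current iseu(t) injected at n2]
Fig. IV.1. Schematic of SRAM cell with noise current (access transistors are not shown)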
An SRAM cell is a non-linear system [39] due to the presence of a back-to-back in-
verter connection. It is very often the case that mathematical tools are unable to analytically
solve such non-linear system equations. Therefore, to ensure that the SRAM dynamic sta-
bility model is manageable, a simple linear gate model [70] is used for the inverters. This
model assumes that at any given time, either the NMOS or the PMOS device conducts (i.e.
the short circuit current of an inverter is negligible). Let the input and output voltages of
the inverter be Vin and Vout. Then the driving current (current flowing through the output
node) of an inverter can be written as
\[
I_{inv}(V_{in}, V_{out}) =
\begin{cases}
0 & \text{cutoff} \\
V_{out}/R & \text{linear} \\
g_m (V_{in} - V_T) & \text{saturation}
\end{cases}
\]
Here, gm, R and VT represent the transconductance, linear-region resistance and threshold
voltage of whichever transistor of the inverter (PMOS or NMOS) is conducting.
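As a minimal sketch, the piecewise linear gate model can be written as a Python function. The text does not state how the operating region is selected, so the region is passed in explicitly here; the function and parameter names are illustrative assumptions.

```python
def inverter_current(v_in, v_out, region, g_m, r_lin, v_t):
    """Driving current of the linear gate model for the conducting transistor.

    region must be one of "cutoff", "linear" or "saturation"; choosing it is
    left to the caller because the selection rule is not given in the text.
    """
    if region == "cutoff":
        return 0.0
    if region == "linear":
        return v_out / r_lin
    if region == "saturation":
        return g_m * (v_in - v_t)
    raise ValueError("unknown region: " + region)
```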
Without loss of generality, the analysis presented in this chapter assumes that initially
Vin = Vn1 = VDD and Vout = Vn2 = GND. Therefore, the SRAM cell is in the (VDD, GND)
state before the noise current is injected. The same analysis can also be applied when
the initial SRAM state is (GND, VDD). Consider the SRAM cell of Figure IV.1. If a
sufciently large noise current is injected into node n2 then the SRAM node voltages Vn1
and Vn2 change as shown in Figure IV.2. Note that Figure IV.2 is provided for the purpose
of explanation only. In practice the temporal trajectory of Vn1 and Vn2 may be different
from what is shown in Figure IV.2, and depends heavily on the value of Q. The goal of the
model proposed in this chapter is to test whether an SRAM cell will indeed encounter a state
flip, for a given value of Q. Initially, Vn2 increases. However, Vn1 remains almost at VDD.
Then after the node voltage Vn2 crosses Vdsat (the saturation voltage of NMOS transistor),
Vn1 starts decreasing rapidly. The first phase, where Vn2 is increasing and Vn1 is constant is
referred to as the weak coupling mode (WCM), since the change in Vn2 does not affect Vn1.
The second phase, where both Vn1 and Vn2 change, is called strong feedback mode (SFM).
Figure IV.3 shows the owchart of the proposed model, for determining whether a
radiation particle strike results in the state of the SRAM cell to ip. The SRAM cell starts
in WCM mode when the noise current is injected into node n2. If the noise current is
sufciently large, then the SRAM will enter SFM. Otherwise, the SRAM continues to stay
in WCM and therefore the SRAM state does not change. After the SRAM cell enters SFM,
if the noise current is large enough, then Vn1 can become greater than or equal to Vn2,
resulting in a state ip or an SEU. Otherwise, the SRAM cell does not ip and it returns to
its initial state (VDD,GND). The steps of the proposed model are explained in detail in the
following sub-sections.
Fig. IV.2. SRAM node voltages for the noise injected at node n2
[Flowchart: given Q, τα and τβ, determine if the SRAM will enter SFM. If no, the SRAM cell state will not flip. If yes, calculate the time (Tw) at which the SRAM cell will enter SFM; then, in SFM, determine if the SRAM state will flip (yes: the SRAM cell state will flip; no: the SRAM cell state will not flip).]
Fig. IV.3. Flowchart of the proposed model for SRAM cell stability
IV-C.1. Weak Coupling Mode Analysis
In weak coupling mode, M2 is in the linear region while M4 is in cutoff. M4 is assumed
to remain in cutoff during this mode if the threshold voltage of the NMOS transistor (VTN)
does not differ much from Vdsat . This is true for deep sub-micron technologies (due to
short channel effects). As mentioned earlier, the node voltage Vn1 remains almost at VDD
in weak coupling mode. The equations governing the temporal behavior of the SRAM cell
of Figure IV.1 are as follows:
\[ \frac{dV_{n2}(t)}{dt} = -\frac{V_{n2}(t)}{R_n C} + \frac{i_{seu}(t)}{C} \tag{4.1} \]
\[ V_{n1}(t) = V_{DD} \tag{4.2} \]
Here, iseu(t) represents the radiation-induced current, as described by Equation 1.1.
Rn is the linear-region resistance of the NMOS transistor and C is the total capacitance
seen at node n1 (and n2). For a given radiation-induced current pulse, it is required to
determine rst whether the SRAM cell will enter the strong feedback mode or not. To do
this, the minimum value of charge (Qwc) required to take an SRAM cell to SFM for the
given values of time constants (τα and τβ) is computed. To simplify the expression for Qwc,
the e−t/τβ term in the noise current of Equation 1.1 is ignored. Note that e−t/τβ is ignored
only to determine whether the SRAM cell enters SFM or not. Also, ignoring e−t/τβ results
in a pessimistic analysis. Therefore, this assumption results in a lower bounded value of
Qwc being computed, and hence does not lead to any error in predicting the SRAM state
flip. Integrating Equation 4.1 with the initial condition Vn2(0) = 0 yields:
\[ V_{n2}(t) = \frac{Q}{C(\tau_\alpha - \tau_\beta)X}\left(e^{-t/\tau_\alpha} - e^{-t/R_n C}\right) \tag{4.3} \]
where
\[ X = \frac{1}{R_n C} - \frac{1}{\tau_\alpha} \]
Now, differentiate Equation 4.3 and equate dVn2(t)/dt to zero, to calculate the time
tVn2M at which Vn2(t) reaches its maximum value. If Vn2(tVn2M) ≥ Vdsat , then the cell enters
SFM. Substitute the expression for tVn2M for the value of t and Vn2 by Vdsat in Equation 4.3
to obtain the expression for Qwc as shown below.
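\[ Q_{wc} = C X (\tau_\alpha - \tau_\beta)\, \frac{V_{dsat}\, e^{t_{Vn2M}/R_n C}}{e^{t_{Vn2M} X} - 1} \tag{4.4} \]
where
\[ t_{Vn2M} = \frac{1}{X} \ln\left(\frac{\tau_\alpha}{R_n C}\right) \]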
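Equations 4.3 and 4.4 can be evaluated directly, as in the following Python sketch. The values C = 5.4 fF, τα = 150 ps and τβ = 38 ps come from the experiments reported later in this chapter, while Rn = 5 kΩ and Vdsat = 0.3 V used in the usage note are assumed, illustrative values that are not taken from the text. The sketch also assumes RnC ≠ τα so that X is non-zero.

```python
import math


def t_vn2_max(rn, c, tau_a):
    """Time at which Vn2(t) of Equation 4.3 peaks (t_Vn2M)."""
    x = 1.0 / (rn * c) - 1.0 / tau_a
    return math.log(tau_a / (rn * c)) / x


def vn2_single_exp(t, q, rn, c, tau_a, tau_b):
    """Equation 4.3: Vn2(t) in WCM with the exp(-t/tau_b) term ignored."""
    x = 1.0 / (rn * c) - 1.0 / tau_a
    scale = q / (c * (tau_a - tau_b) * x)
    return scale * (math.exp(-t / tau_a) - math.exp(-t / (rn * c)))


def q_wc(rn, c, tau_a, tau_b, v_dsat):
    """Equation 4.4: minimum charge that takes the cell into SFM."""
    x = 1.0 / (rn * c) - 1.0 / tau_a
    t_m = t_vn2_max(rn, c, tau_a)
    num = v_dsat * math.exp(t_m / (rn * c))
    return c * x * (tau_a - tau_b) * num / (math.exp(t_m * x) - 1.0)
```

With the assumed values above, `q_wc(5e3, 5.4e-15, 150e-12, 38e-12, 0.3)` evaluates to roughly 10 fC, and substituting that charge back into `vn2_single_exp` at `t_vn2_max` returns Vdsat, which is the defining property of Qwc.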
If the charge dumped (Q) by a radiation event is greater than Qwc then the SRAM cell
will enter SFM, otherwise it stays in WCM. If the SRAM enters SFM, then the state of
SRAM cell can ip. To determine if it indeed ips, it is required to calculate the time (Tw)
at which the SRAM cell enters SFM. Again, consider Equation 4.1, integrate it using iseu(t)
from Equation 1.1. The resulting equation for Vn2(t) is:
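\[ V_{n2}(t) = \frac{I_n}{C}\left(\frac{e^{-t/\tau_\alpha}}{X} - \frac{e^{-t/\tau_\beta}}{Y} - Z e^{-t/R_n C}\right) \tag{4.5} \]
where
\[ Y = \frac{1}{R_n C} - \frac{1}{\tau_\beta}, \qquad I_n = \frac{Q}{\tau_\alpha - \tau_\beta} \qquad \text{and} \qquad Z = \frac{1}{X} - \frac{1}{Y} \]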
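As a sanity check, the closed form of Equation 4.5 can be compared against a direct numerical integration of Equation 4.1. The Python sketch below assumes that Equation 1.1 has the form iseu(t) = Q/(τα − τβ)·(e^(−t/τα) − e^(−t/τβ)), which is consistent with the definition of In above. The fourth-order Runge-Kutta integrator and the values Rn = 5 kΩ and Q = 30 fC in the usage note are illustrative assumptions, not part of the proposed model.

```python
import math


def i_seu(t, q, tau_a, tau_b):
    """Double exponential radiation-induced current (assumed form of Eq. 1.1)."""
    return q / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def vn2_wcm(t, q, rn, c, tau_a, tau_b):
    """Equation 4.5: closed-form Vn2(t) in weak coupling mode."""
    x = 1.0 / (rn * c) - 1.0 / tau_a
    y = 1.0 / (rn * c) - 1.0 / tau_b
    z = 1.0 / x - 1.0 / y
    i_n = q / (tau_a - tau_b)
    return i_n / c * (math.exp(-t / tau_a) / x
                      - math.exp(-t / tau_b) / y
                      - z * math.exp(-t / (rn * c)))


def vn2_rk4(t_end, q, rn, c, tau_a, tau_b, steps=2000):
    """Integrate Equation 4.1 from Vn2(0) = 0 with classical RK4."""
    def f(t, v):
        return -v / (rn * c) + i_seu(t, q, tau_a, tau_b) / c

    h = t_end / steps
    t, v = 0.0, 0.0
    for _ in range(steps):
        k1 = f(t, v)
        k2 = f(t + h / 2, v + h * k1 / 2)
        k3 = f(t + h / 2, v + h * k2 / 2)
        k4 = f(t + h, v + h * k3)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return v
```

For example, with Q = 30 fC, Rn = 5 kΩ, C = 5.4 fF, τα = 150 ps and τβ = 38 ps, `vn2_wcm(50e-12, ...)` and `vn2_rk4(50e-12, ...)` should agree closely, which supports the reconstruction of X, Y and Z.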
To obtain Tw (the time when the SRAM enters SFM), substitute Vn2 = Vdsat and t =
Tw in Equation 4.5 and solve for Tw. Note that Equation 4.5 is a transcendental equation
in t and hence it is not possible to obtain the expression for Tw analytically. Therefore,
Equation 4.5 is linearly expanded in t around the point T_w^{ini} (which is expected to be close
to the actual value of Tw). To obtain a good expansion point T_w^{ini}, the radiation-induced
current is approximated by a rectangular pulse of magnitude Imax (which is the maximum
value of iseu(t)) and a pulse width of Q/Imax. Then the value of T_w^{ini} can be obtained using
Equation 4.6 (reported in [39]) for a rectangular noise current pulse for the same SRAM
cell as in Figure IV.1.
\[ T_w^{ini} = -R_n C \ln\left[1 - \frac{V_{dsat}}{I_{max} R_n}\right] \tag{4.6} \]
Note that the way in which the radiation-induced current pulse is modeled ensures
that T_w^{ini} is always smaller than the actual time (Tw) when the SRAM cell enters SFM. This
is due to the fact that a rectangular noise current pulse of magnitude Imax, depositing a
charge Q, has more severe effects on the node voltages than the actual radiation-induced
current pulse of Equation 1.1 dumping the same amount of charge. It is always better
to be conservative so that the SRAM cell state flip is always detected. This ensures that
an optimistic SRAM cell design is avoided. Also note from Equation 4.6 that another
condition which must be satisfied for an SRAM cell to enter SFM is ImaxRn > Vdsat. This
condition is checked after the condition imposed by Qwc is satisfied.
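To obtain an expression for Tw, first linearly expand Equation 4.5 in t around the
point T_w^{ini} (which is obtained from Equation 4.6) and then solve for t (= Tw). The resulting
expression for Tw is as given below.
\[ T_w = T_w^{ini} + \frac{V_{dsat} - \frac{I_n}{C}\left(\frac{e^{-T_w^{ini}/\tau_\alpha}}{X} - \frac{e^{-T_w^{ini}/\tau_\beta}}{Y} - Z e^{-T_w^{ini}/R_n C}\right)}{\frac{I_n}{C}\left(-\frac{e^{-T_w^{ini}/\tau_\alpha}}{\tau_\alpha X} + \frac{e^{-T_w^{ini}/\tau_\beta}}{\tau_\beta Y} + \frac{Z}{R_n C} e^{-T_w^{ini}/R_n C}\right)} \tag{4.7} \]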
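The computation of T_w^{ini} (Equation 4.6) and Tw (Equation 4.7) can be sketched in Python as follows. Two points are assumptions made here rather than statements from the text: Equation 1.1 is taken to be the double exponential Q/(τα − τβ)·(e^(−t/τα) − e^(−t/τβ)), and Imax is obtained from the peak time τατβ·ln(τα/τβ)/(τα − τβ), which follows from setting the derivative of that pulse to zero. The function names are illustrative.

```python
import math


def wcm_voltage_and_slope(t, q, rn, c, tau_a, tau_b):
    """Return Vn2(t) from Equation 4.5 and its time derivative."""
    x = 1.0 / (rn * c) - 1.0 / tau_a
    y = 1.0 / (rn * c) - 1.0 / tau_b
    z = 1.0 / x - 1.0 / y
    k = q / (tau_a - tau_b) / c  # In / C
    ea = math.exp(-t / tau_a)
    eb = math.exp(-t / tau_b)
    er = math.exp(-t / (rn * c))
    v = k * (ea / x - eb / y - z * er)
    dv = k * (-ea / (tau_a * x) + eb / (tau_b * y) + z * er / (rn * c))
    return v, dv


def sfm_entry_time(q, rn, c, tau_a, tau_b, v_dsat):
    """Return (T_w_ini, T_w), or None when Imax * Rn <= Vdsat."""
    # Peak of the double exponential pulse (derived here, see lead-in).
    t_pk = tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)
    i_max = q / (tau_a - tau_b) * (math.exp(-t_pk / tau_a) - math.exp(-t_pk / tau_b))
    if i_max * rn <= v_dsat:
        return None  # the condition Imax * Rn > Vdsat is not met
    t_ini = -rn * c * math.log(1.0 - v_dsat / (i_max * rn))  # Equation 4.6
    v, dv = wcm_voltage_and_slope(t_ini, q, rn, c, tau_a, tau_b)
    return t_ini, t_ini + (v_dsat - v) / dv  # Equation 4.7
```

Equation 4.7 amounts to a single Newton step on Vn2(t) = Vdsat starting from T_w^{ini}, so the returned Tw is an approximation whose quality depends on how close the expansion point is to the true crossing time.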
IV-C.2. Strong Feedback Mode Analysis
When the SRAM cell of Figure IV.1 enters strong feedback mode, the transistors M2 and
M4 are in the saturation region. In this mode, the node voltage Vn2 increases (due to the
noise current injected at node n2) which decreases the value of Vn1. The decrease in Vn1
further helps in increasing the value of Vn2. The node voltage Vn1 depends upon Vn2 and
vice-versa and hence, the equations governing the time-domain behavior of the SRAM cell
in the SFM are cross-coupled and non-linear in nature. These equations are given below.
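\[ \frac{dV_{n1}(t)}{dt} = -\frac{g_{mn} V_{n2}(t)}{C} + \frac{g_{mn} V_{TN}}{C} \tag{4.8} \]
\[ \frac{dV_{n2}(t)}{dt} = -\frac{g_{mn} V_{n1}(t)}{C} + \frac{g_{mn} V_{TN}}{C} + \frac{Q}{C(\tau_\alpha - \tau_\beta)}\left(e^{-t/\tau_\alpha} - e^{-t/\tau_\beta}\right) \tag{4.9} \]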
Subtracting Equation 4.9 from Equation 4.8 and using the transformation u(t) = Vn1(t)−
Vn2(t) gives
\[ \frac{du(t)}{dt} = \frac{g_{mn} u(t)}{C} - \frac{Q}{C(\tau_\alpha - \tau_\beta)}\left(e^{-t/\tau_\alpha} - e^{-t/\tau_\beta}\right) \tag{4.10} \]
As mentioned earlier, for the SRAM cell to flip, the noise current should change the
state of the SRAM cell from the stable equilibrium point (VDD, GND) to the metastable
equilibrium point (Vn2 = Vn1), or change the SRAM state to the region of attraction of the
other equilibrium point (Vn2 > Vn1). Therefore, if the SRAM cell flips, u(t) = Vn1(t)−
Vn2(t) should become equal to or less than 0. Now, integrate Equation 4.10 with the initial
condition u(Tw) = VDD−Vdsat and then find the limit of u(t) as t → ∞.
This is done because u(t) may become equal to 0 (i.e. Vn1 = Vn2) after the entire
charge (or most of the charge) has been deposited on node n2. Also, the feedback from
node n1 may increase the node voltage Vn2 after a large amount of time. Therefore, the
condition which must be satisfied for a radiation event to flip the SRAM state is as given
below.
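\[ Q \ge C(\tau_\alpha - \tau_\beta)\, e^{-g_{mn} T_w / C}\, \frac{V_{DD} - V_{dsat}}{e^{-T_w X'}/X' - e^{-T_w Y'}/Y'} \tag{4.11} \]
where
\[ X' = \frac{g_{mn}}{C} + \frac{1}{\tau_\alpha}, \qquad Y' = \frac{g_{mn}}{C} + \frac{1}{\tau_\beta} \]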
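The flip condition of Equation 4.11 can be checked with a short Python sketch. The value gmn = 100 µS and the entry time Tw = 39.4 ps used in the usage note are assumed, illustrative values (gmn, Rn and Vdsat are characterized in HSPICE in this work and their values are not listed in the text); the function names are likewise hypothetical.

```python
import math


def q_flip_threshold(t_w, c, g_mn, tau_a, tau_b, vdd, v_dsat):
    """Right-hand side of Equation 4.11 for a given SFM entry time t_w."""
    a = g_mn / c
    x_p = a + 1.0 / tau_a  # X'
    y_p = a + 1.0 / tau_b  # Y'
    denom = math.exp(-t_w * x_p) / x_p - math.exp(-t_w * y_p) / y_p
    return c * (tau_a - tau_b) * math.exp(-a * t_w) * (vdd - v_dsat) / denom


def state_flips(q, t_w, c, g_mn, tau_a, tau_b, vdd, v_dsat):
    """Return True if the deposited charge q satisfies Equation 4.11."""
    return q >= q_flip_threshold(t_w, c, g_mn, tau_a, tau_b, vdd, v_dsat)
```

Note that Tw itself depends on Q through Equation 4.7, so in a full evaluation the threshold has to be recomputed for each candidate charge rather than computed once.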
IV-D. Experimental Results
To compare the accuracy of the proposed model for the dynamic stability of the SRAM cell
with HSPICE [71], the SRAM cell of Figure IV.1 was designed using a PTM 90nm [45]
model card with VDD=1.2V. The device sizes are W/L = 0.18 µm/0.09 µm for M2 and M4
and W/L = 0.27 µm/0.09 µm for M1 and M3. The total node capacitance of nodes n1 and
n2 is 5.4 fF. The gate model characterization (computation of gmn, Rn and Vdsat ) was done
for different VDD values in HSPICE.
As dened in Section I-A, Qcri is the minimum amount of charge required to be de-
posited by a radiation particle, in order to ip the SRAM state. Figure IV.4 compares the
82
critical charge values (Qcri) obtained using HSPICE and the model proposed in this chapter
(for τα= 150 ps, τβ= 38 ps and for different values of VDD). To obtain the value of Qcri,
initially a small value of Q (i.e. 10 fC) is used for the radiation-induced current of Equation
1.1. Then, either of the two methods (HSPICE or the model proposed in this chapter) is used
iteratively, increasing Q in small increments (0.1 fC), to determine the value (Qcri) at
which the SRAM cell state flips. Figure IV.4 shows that the model proposed in this chapter
is very accurate with an average estimation error of 3.3% compared to HSPICE. Thus, the
proposed model is much more accurate than the model of [39], whose error is 11%. The
error of the model presented in this chapter is lower than that of [39] because unlike [39],
the approach of this chapter does not model the radiation-induced current by a rectangular
noise current pulse. Hence, the proposed model can capture the time-dependent nature of
the radiation-induced current more closely, which improves the accuracy of the model.
Fig. IV.4. Comparison of critical charge obtained using HSPICE and the proposed model
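The iterative search for Qcri described above can be sketched in Python as a simple upward sweep. The predicate `flips` stands in for either evaluation method (HSPICE or the proposed model); it and the safety bound `q_max` are hypothetical names introduced for this illustration.

```python
def critical_charge(flips, q_start=10e-15, q_step=0.1e-15, q_max=200e-15):
    """Sweep Q upward from q_start in steps of q_step until flips(Q) is True.

    flips is any callable returning True when the cell state flips for charge Q.
    q_max is an assumed guard against an endless sweep; None is returned if
    no flip is found up to q_max.
    """
    n = 0
    while True:
        q = q_start + n * q_step  # recomputed from n to avoid drift
        if q > q_max:
            return None
        if flips(q):
            return q
        n += 1
```

Because the sweep is linear in Q, its cost is dominated by the cost of each `flips` evaluation, which is where the runtime advantage of an analytical model over circuit simulation matters.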
Table IV.1 compares the critical charge (Qcri) obtained using the proposed model and
HSPICE for various values of τα, τβ and VDD. In Table IV.1, Columns 1 and 2 report the
values of τα and τβ under consideration. Column 3 reports the value of VDD. Column
4 reports the critical charge value (Q_cri^HSPICE) obtained using HSPICE. The critical charge
value evaluated by the proposed model (Q_cri^MOD) is reported in Column 5. Column 6 reports
the percentage error in the critical charge value obtained using the model, compared to
HSPICE. The ratio of the runtime of HSPICE and the model is reported in Column 7. As
reported in Table IV.1, the proposed model is able to obtain the Qcri value very accurately
(with a small average error of 4.6%). Note that the error of the model reported in [39] was
11%. Also the runtime of the model presented in this chapter is ∼2000× lower than the
HSPICE runtime. The runtime of HSPICE is on the order of tens of seconds, compared to
the runtime of the proposed model, which is on the order of 10 ms. Since SRAM design is
an iterative process, it is valuable to use the model proposed in this chapter to evaluate the
stability of an SRAM cell due to the significantly lower run-time of this model compared
to HSPICE. Note that Figure IV.4 and Table IV.1 also indicate that the proposed approach
is conservative.
Table IV.1. Comparison of Model with HSPICE
τα (ps)  τβ (ps)  VDD (V)  Q_cri^HSPICE (fC)  Q_cri^MOD (fC)  % Error  Run-time ratio
120      30       1.0      25.9               24.1            6.95     940
120      30       1.1      29.5               27.2            7.80     1478
120      30       1.2      33.0               30.7            6.97     1811
120      30       1.3      36.5               34.4            5.75     1830
120      30       1.4      39.8               37.4            6.03     2058
150      38       1.0      31.2               29.4            5.77     1390
150      38       1.1      35.6               33.7            5.34     2031
150      38       1.2      39.9               38.3            4.01     2020
150      38       1.3      44.0               43.0            2.27     2488
150      38       1.4      48.0               46.8            2.50     2820
150      50       1.1      37.9               36.6            3.43     2268
150      50       1.2      42.5               41.6            2.12     2221
150      50       1.3      46.9               46.7            0.43     2723
AVG                                                           4.57     2006
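The summary figures of Table IV.1 can be reproduced from its rows with a few lines of Python. The sketch assumes that the percentage error is computed relative to the HSPICE value, which matches the tabulated entries; the variable names are illustrative.

```python
# (tau_alpha ps, tau_beta ps, VDD V, Qcri HSPICE fC, Qcri model fC, run-time ratio)
ROWS = [
    (120, 30, 1.0, 25.9, 24.1, 940), (120, 30, 1.1, 29.5, 27.2, 1478),
    (120, 30, 1.2, 33.0, 30.7, 1811), (120, 30, 1.3, 36.5, 34.4, 1830),
    (120, 30, 1.4, 39.8, 37.4, 2058), (150, 38, 1.0, 31.2, 29.4, 1390),
    (150, 38, 1.1, 35.6, 33.7, 2031), (150, 38, 1.2, 39.9, 38.3, 2020),
    (150, 38, 1.3, 44.0, 43.0, 2488), (150, 38, 1.4, 48.0, 46.8, 2820),
    (150, 50, 1.1, 37.9, 36.6, 2268), (150, 50, 1.2, 42.5, 41.6, 2221),
    (150, 50, 1.3, 46.9, 46.7, 2723),
]


def pct_error(q_hspice, q_model):
    """Percentage error of the model relative to HSPICE."""
    return 100.0 * (q_hspice - q_model) / q_hspice


errors = [pct_error(r[3], r[4]) for r in ROWS]
avg_error = sum(errors) / len(errors)
avg_ratio = sum(r[5] for r in ROWS) / len(ROWS)
# The model is conservative if it never predicts a larger Qcri than HSPICE.
conservative = all(r[4] <= r[3] for r in ROWS)
```

In every row the model's critical charge is below the HSPICE value, which is the sense in which the approach is described as conservative.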
IV-E. Chapter Summary
SRAMs are extensively used in modern microprocessors and SoCs. Hence SRAM yield is
very important from an economic viewpoint. As a result, SRAM stability analysis has
become quite important in recent times. SRAM stability analysis based on static noise margin
(SNM) often results in pessimistic designs, because SNM cannot capture the transient be-
havior of the noise. Thus, SNM reduces design options, resulting in a highly conservative
design. Therefore, to improve SRAM design, dynamic stability analysis is required. The
model developed in this chapter performs dynamic stability analysis of an SRAM cell in the
presence of a radiation event. Experimental results demonstrate that the model proposed
in this chapter is compact and very accurate, with a low critical charge estimation error of
4.6% compared to HSPICE. The runtime of the proposed model is also significantly lower
(2000× lower) than the HSPICE runtime. Also, the results of the proposed model are al-
ways conservative. Thus this model enables SRAM designers to quickly and accurately
validate the stability of their SRAMs during the design phase.
The model presented in this chapter considers noise in SRAMs only due to radiation
particle strikes. However, there are other types of noise such as power and ground noise,
capacitive coupling noise, etc. Therefore, models similar to the one presented in this
chapter are required to perform dynamic stability analysis of an SRAM cell in the presence of
capacitive coupling noise, and power and ground noise. In future, the approach presented
in this chapter can be extended to include the effects of these noise sources as well.
CHAPTER V
RADIATION ANALYSIS - 3D SIMULATION AND ANALYSIS OF THE RADIATION
TOLERANCE OF VOLTAGE SCALED DIGITAL CIRCUITS
V-A. Introduction
In addition to the analysis of the effects of radiation particle strikes on combinational cir-
cuits and SRAMs, it is also important to study how voltage scaling affects the susceptibility
of VLSI circuits to radiation particle strikes. This is relevant since in recent times, power
has become a major issue in computing [72]. Low energy solutions are desired for many
applications such as Systems-on-chip (SoC), microprocessors, wireless communication cir-
cuits, etc. Both the dynamic and the leakage components of the power consumption of a
CMOS circuit depend upon the supply voltage; both decrease at least quadratically with
decreasing supply voltages. Therefore, in recent times, it is common to decrease the supply
voltage value in the non-critical parts of VLSI systems, in order to reduce the power and
energy consumption.
Modern VLSI systems extensively employ dynamic voltage scaling (DVS) to meet
the variable speed/power requirements that are imposed at different times during their op-
eration [73, 74, 75, 76, 77, 78]. DVS helps in reducing the circuit power consumption
especially when high speed circuit operation is not desired. Today, VLSI circuits are also
operated in the sub-threshold region of operation for a widening class of applications which
demand extreme low power consumption and can tolerate larger circuit delays [79, 80, 81].
Sub-threshold circuits operate with a supply voltage less than or equal to the device thresh-
old voltage. Since both DVS and sub-threshold circuits are extensively used to reduce
power consumption, the susceptibility of such circuits to radiation particle strikes can
significantly impact the reliability of VLSI systems based on these techniques. Hence, it
is important to analyze the effects of radiation particle strikes on DVS and sub-threshold
circuits. Based on the results of such an analysis, these circuits can be hardened against
radiation strikes to improve their reliability.
To understand the effect of voltage scaling on the radiation susceptibility of digital
VLSI circuits, in this dissertation, 3D simulations of radiation particle strikes (on the out-
put of an inverter implemented using DVS and sub-threshold design) were performed. 3D
simulation of radiation particle strikes aids in obtaining an accurate estimation of the ef-
fect of voltage scaling on the radiation susceptibility of the inverter. A radiation particle
strike on an inverter was simulated using Sentaurus-DEVICE [40] for different inverter
sizes, inverter loads, supply voltage values (VDD), and the energy of the radiation parti-
cles. From these 3D simulations, several non-intuitive observations were made, which are
important to consider during radiation hardening of such DVS and sub-threshold circuits.
Based on these observations, several guidelines are proposed for the radiation hardening
of such designs, as reported in Section V-D. These guidelines suggest that traditional ra-
diation hardening approaches need to be revisited for DVS and sub-threshold designs. A
charge collection model for DVS circuits is also proposed, using the results of these 3D
simulations. The charge collection model can accurately estimate (with an average error of
6.3%) the charge collected at the output of a gate as a function of the supply voltage, gate
size and particle energy (for medium and high energy particle strikes). The parameters of
this charge collection model can be included in transistor model cards in SPICE, to improve
the accuracy of SPICE based simulations of radiation events in DVS circuits.
The rest of the chapter is organized as follows. Section V-B discusses some of the
previous work in this area. In Section V-C, the 3D simulation setup used for the simulation
of a radiation particle strike at the output of inverter is described. In Section V-D, experi-
mental results are presented, and several observations from these results are discussed. The
corresponding design guidelines are also presented in this section. Finally, the chapter is
summarized in Section V-E.
V-B. Related Previous Work
Although radiation particle strikes in circuits operating at nominal supply voltages have
been extensively studied using 3D device simulation tools [18, 19, 11, 82, 83], DVS and
sub-threshold circuits have not received much attention. In [18], 3-D numerical simulation
is used to study the charge collection mechanism in silicon n+/p diodes. In [49], device
level three-dimensional simulation was performed to study the charge collection mecha-
nism and voltage transients from angled ion strikes. The authors of [82] used a 3D device
simulation tool to study the effect of radiation-induced transients and estimate the soft error
rate (SER) in static random access memory (SRAM) cells. In [84], an experimental study
of the effects of heavy ions in commercial SOI PowerPC microprocessors was conducted.
Microprocessors implemented using different technology nodes as well as different core
voltages were used in the experiment. It was also observed in [84] that the reduction of
feature size from 0.18µm to 0.13µm (and core voltage from 1.6 V to 1.3 V) had little effect
on the soft error rate. The sensitivity of several commercial SRAM devices to radiation,
as a function of their supply voltage, was experimentally studied in [85]. An increase in
the radiation susceptibility of SRAMs with decreasing supply voltage was observed. The
SRAMs used in these experiments were fabricated in older technologies (i.e. the feature
sizes were greater than 0.18µm). The authors of [83] analyzed the dependence of the soft
error rate on the critical charge (Qcri) and supply voltage, for a 0.6µm CMOS process. In
both [85] and [83], the study was performed through laboratory experiments, for nominal
supply voltage values (the minimum supply voltage value used was 1.5 V in [85] and 2.2 V
in [83]). Note that these were not DVS enabled circuits. Hence, the results of [85, 83] can-
not be used to predict the susceptibility of DSM VLSI circuits at lower (and sub-threshold)
voltages. Also, older process technologies were analyzed in [85, 83], and it is expected
that circuits implemented with recent deep submicron process technologies can exhibit a
very different behavior in response to radiation particle strikes than older processes [11].
In [84, 85, 83], no circuit level radiation hardening guidelines were proposed. In contrast,
in the work presented in this chapter, radiation strikes are modeled and analyzed for current
technologies, and a set of circuit hardening guidelines is presented based on the findings.
V-C. Simulation Setup
In this work, a radiation particle strike is considered at the NMOS transistor of an inverter
(INV) shown in Figure V.1. INV is implemented in a 65 nm bulk technology. The input
of the INV is at GND and hence, the PMOS transistor is ON and the NMOS transistor is
OFF. An industry-standard 3D device simulator (Sentaurus-DEVICE [40]) was used
to simulate the INV of Figure V.1 with a radiation particle strike at the drain of the NMOS
transistor. Sentaurus-DEVICE is a mixed-level device and circuit simulator. The NMOS
transistor of the INV was modeled in the 3D device domain as described in Section V-C.1.
The PMOS transistor of INV is modeled using a PTM [45] SPICE model (in the circuit
domain). Note that a radiation particle strike was not simulated at the PMOS transistor
since a particle strike at the PMOS transistor is expected to yield results similar to those
obtained from a particle strike at the NMOS transistor.
To analyze the sensitivity of sub-threshold circuits, and circuits which employ DVS,
to radiation particle strikes, the supply voltage (VDD) of INV was varied in the 3D simu-
lations. The size of INV of Figure V.1, as well as the LET of the radiation particle were
varied, to simulate different radiation scenarios. The supply voltage values used were 0.35
V, 0.5 V, 0.6 V, 0.7 V, 0.8 V, 0.9 V and 1 V. The threshold voltage of the PMOS (NMOS)
transistor was VTP = 0.365 V (VTN = 0.325 V). Hence, 0.35 V was chosen as the supply voltage
value for the sub-threshold INV. INVs of sizes 2×, 4× and 15× were simulated. The
width of the NMOS (PMOS) transistor in a 2× INV is 0.13µm (0.52µm). The INVs were
loaded with a load capacitance of value 3× their input capacitance. The radiation particle
LET values used were 2, 10 and 20 MeV-cm2/mg which represent low, medium and high
energy strikes respectively. A 4× INV with LET = 2 MeV-cm2/mg, and 10 MeV-cm2/mg,
and VDD = 1 V was also simulated for different load capacitances (0 fF, 1 fF, 3 fF, 5 fF and
6.3 fF) to study the effect of loading on the radiation susceptibility of the INV.
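The simulation conditions listed above can be enumerated as a parameter grid, as in the Python sketch below. Whether every combination of size, LET and VDD was actually simulated is an assumption made here for illustration; the text only lists the values used. The variable names are hypothetical.

```python
from itertools import product

VDD_VALUES = [0.35, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # V
INV_SIZES = [2, 4, 15]                               # multiples of minimum size
LET_VALUES = [2, 10, 20]                             # MeV-cm2/mg
LOAD_CAPS_FF = [0, 1, 3, 5, 6.3]                     # fF, load sweep only

# Main sweep (assumed full factorial): load is 3x the input capacitance.
main_sweep = [
    {"size": s, "let": let, "vdd": v, "load": "3x input capacitance"}
    for s, let, v in product(INV_SIZES, LET_VALUES, VDD_VALUES)
]

# Load sweep: 4x INV at VDD = 1 V for LET = 2 and 10 MeV-cm2/mg.
load_sweep = [
    {"size": 4, "let": let, "vdd": 1.0, "load_fF": c}
    for let, c in product([2, 10], LOAD_CAPS_FF)
]
```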
For each of these simulations, a radiation particle strike was simulated at the center
of the drain diffusion of the 3D NMOS transistor. The particle path was along the vertical
direction (normal to the surface of the drain diffusion). From simulations it was found that
a vertical strike corresponds to the worst case strike. Hence, in these 3D simulations, the
charge collection due to the ALPEN mechanism was not simulated. The total charge col-
lected at the drain node of the INV was due to the drift and diffusion mechanisms, as well
as the bipolar effect. The physical models used in the simulations included Shockley-Read-Hall
and Auger recombination, hydrodynamic transport models for electrons, bandgap narrowing
dependent intrinsic carrier concentration models, mobility models which included
the Philips unified mobility model, as well as high-field saturation and transverse field
dependence. The silicon region containing the 3D NMOS device was 10 µm × 10 µm in size.
Note that it is sufcient to model the NMOS transistor in 3D device domain and the
PMOS transistor using a SPICE model card to simulate a radiation particle strike. This
is because, in an n-well process, the PMOS transistor sits inside an n-well and the n-well
terminal is connected to VDD. Therefore, the holes generated by the radiation particle
(below the drain of the NMOS transistor in the p-substrate) cannot cross the n-well and
p-substrate junction to enter the n-well region. Note that the n-well and p-substrate junction is
reverse biased and the n-type diffusion collects only electrons. However, the drain (which
is a p-type diffusion) of the PMOS transistor can collect only holes. Thus, the radiation
particle strike at the NMOS transistor does not physically affect the PMOS transistor and
hence, it is appropriate to use a SPICE model card for the PMOS transistor. Also, when a
radiation particle strikes the NMOS transistor, the PMOS transistor is ON since the input of
INV is at GND. Due to this, both the source and the drain terminal of the PMOS transistor
are at VDD and hence, the drain-bulk junction of the PMOS transistor is not reverse biased.
For these reasons, it is common practice to model only the device of a circuit struck by a
radiation particle in 3D device domain (the NMOS transistor in this work) [86, 83, 21, 22].
Hence, the approach used in this work to model the INV is consistent with previous
work.
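[Figure: inverter with input "in" set to GND and output "out" loaded by Cload; transistors M1 and M2, with the NMOS transistor as a 3D device model (Sentaurus-DEVICE) receiving the radiation particle strike and the PMOS transistor as a SPICE model]
Fig. V.1. Inverter (INV) under consideration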
V-C.1. NMOS Device Modeling and Characterization
The Sentaurus-Structure editor tool [40] was used to construct the 3D NMOS transistor of
the INV in Figure V.1. The NMOS device was implemented in a 65 nm bulk technology.
The 3D 65 nm technology model was developed based on the data available in the
literature [72, 87, 88, 89, 90, 21, 22]. Based on these references, the values of the parameters
used are as follows: the gate length L = 35 nm, oxide thickness Tox = 1.2 nm, spacer width
= 30 nm and the height of the polysilicon gate = 0.12 µm. The threshold voltage,
punch through, halo and latchup implants were also modeled in the NMOS device. The de-
tails of these implants are as follows. For the threshold (punch through) implant, the peak
doping concentration of Boron atoms is 8e18 cm−3 (7e18 cm−3) at 2 nm (14 nm) below the
SiO2-channel interface, the doping concentration decreases with a Gaussian profile, and the
doping concentration reduces to 1e17 cm−3 (2e17 cm−3) at a depth of 14 nm (5 nm) below
the peak concentration surface. The peak concentration of Boron atoms for halo implants
is 2e19 cm−3, and these implants are in the channel region at source-bulk and drain-bulk
junctions. Again, the doping concentration reduces with a Gaussian profile. For the latchup
implant, the peak doping concentration of Boron atoms is 5e18 cm−3 at 1.25 µm below the
SiO2-channel interface, the doping concentration decreases with a Gaussian profile, and the
doping concentration reduces to 1e16 cm−3 at a depth of 0.4 µm. The contact of the p-well
was placed at 0.75µm from the source diffusion of the NMOS transistor. The 3-D NMOS
device constructed in this work was characterized using Sentaurus-DEVICE [40] to obtain
the drain current (ID) as a function of the drain to source voltage (V DS) for different gate to
source voltages (V GS). The I−V characteristic of the NMOS transistor with width=1µm
is shown in Figure V.2. Figure V.2 shows that the NMOS device constructed in this chapter
has good MOSFET characteristics. These characteristics were verified to substantially
match the 65 nm PTM NMOS device characteristics, using SPICE.
Fig. V.2. NMOS device: ID versus VDS plot for different VGS values
V-D. Experimental Results
Figure V.3 shows the voltage of the output of the 4× INV of Figure V.1 with VDD =
1 V, during a radiation particle strike (of three different LET values) at the drain node
of the NMOS transistor. Figure V.4 plots the radiation-induced current through the drain
terminal of the NMOS transistor of the 4× INV. Note that for a 65 nm technology, as
shown in Figure V.3, a radiation particle with an LET value as low as 2 MeV-cm2/mg is
capable of generating a significant voltage glitch (> 0.5VDD). For larger LET values, the
voltage at the output of the INV can become negative as shown in Figure V.3 (for LET =
10 and 20 MeV-cm2/mg). Hence, 65 nm devices are very susceptible to radiation particle
strikes even with medium energy particles. From the plots of the radiation-induced NMOS
drain current (shown in Figure V.4), observe that for low LET values (i.e. 2 MeV-cm2/mg)
the drain current looks like a double exponential current pulse. However, for larger LET
values (i.e. 10 and 20 MeV-cm2/mg), there is a plateau in the radiation-induced current. As
mentioned in Section I-A, a heavily doped substrate demonstrates charge collection due to
both the drift and the diffusion processes. In deep submicron technologies such as 65 nm,
the substrate is heavily doped and hence, the funnel collapses very rapidly (within 10-20
ps of the time of the radiation particle strike). As a result, a large amount of charge is left
in the substrate (after the funnel collapses) which then gets collected at the drain node of
the NMOS transistor through the diffusion process [91]. This results in a significant drain
current and hence, the radiation-induced current remains constant for a long time. Note
that this process is slow, as indicated in [91]. The current plateau was not observed for
LET = 2 MeV-cm2/mg since the radiation particle deposits a small amount of charge (20
fC/µm) in the substrate and most of this charge gets collected during the funnel assisted
drift collection phase. After this process, very little charge remains in the substrate, which
does not result in a significant drain current.
[Plot: voltage at output node (V) versus time (s), from 1.0 ns to 1.3 ns, for LET = 2, 10 and 20 MeV-cm2/mg]
Fig. V.3. Radiation-induced voltage transient at the output of 4× INV with VDD=1 V
[Plot: NMOS drain current (A) versus time (s), from 1.0 ns to 1.3 ns, for LET = 2, 10 and 20 MeV-cm2/mg]
Fig. V.4. Radiation-induced drain current of the NMOS transistor of 4× INV with VDD=1 V
The charge collected at the output of INV as a function of the supply voltage during
a radiation particle strike is plotted in Figure V.5, for different INV sizes and for different
linear energy transfer (LET) values. Figure V.6 plots the area of the radiation-induced
voltage glitch (at the output of INV) for these simulations. Note that in these simulations,
the INVs were loaded with a load capacitance of value 3× their input capacitance. The
charge collected at the output of the INV is obtained by integrating the drain current of
the NMOS transistor following a particle strike. The area of a voltage glitch is computed
by integrating the difference of the supply voltage and the voltage at the output of INV
($V_{DD} - V(out)$) following a radiation particle strike. Thus, for a radiation particle strike
occurring at time $t_1$ at the drain of M1 (shown in Figure V.1), the charge collected at out is
$Q = \int_{t=t_1}^{\infty} I^{d}_{M1}\,dt$ and the area of the voltage glitch is
$\int_{t=t_1}^{\infty} (V_{DD} - V(out))\,dt$. Note that the
area of the radiation-induced voltage glitch is a good measure of the susceptibility of an
INV (or any gate) to radiation particle strikes, because it incorporates both the magnitude
as well as the duration of the voltage glitch. Thus, it can be used for comparison of the
susceptibility of INVs across different supply voltage values. From Figures V.5 and V.6, several
interesting observations were made. These observations, along with their explanations, are
as follows.
1. Small devices collect less of the charge deposited by a radiation particle, compared
to larger devices. This phenomenon occurs mainly due to two reasons: i) in a small
device, the drain node voltage falls more quickly than in a large device. Therefore,
the strong electric field in the drain-bulk junction of the NMOS exists for a shorter
duration in the small device than in the large device. Thus, less charge is collected
initially during the funnel assisted drift collection phase, for a small device. ii) the
drain area is smaller in a small device compared to a large device. As a result, less
charge is collected through the diffusion process in the small device.
[Plot: Q (fC) versus VDD (V), from 0.3 V to 1 V, for 2X, 4X and 15X INVs with LET = 2, 10 and 20]
Fig. V.5. Charge collected at the output of INV for different values
2. For low energy radiation particle strikes, wide devices collect almost the same amount
of charge across different supply voltage values. In other words, the charge collection
efficiency of wide devices is high and largely independent of supply voltage.
As mentioned earlier, during a low energy radiation particle strike most of the deposited
charge gets collected within a few picoseconds after the particle strike. Also, in a
wide device, the drain voltage of the device takes longer to fall, even for low supply
voltages, during a low energy radiation strike. Thus, the electric field is present in
the drain-bulk junction for a long duration and a significant amount of charge gets
collected, even at low supply voltages.
3. The amount of charge collected due to a radiation particle strike reduces with decreasing
supply voltage. The charge collected due to the funnel-assisted drift process
depends on the strength of the electric field in the drain-bulk junction. At lower voltages,
the electric field in the drain-bulk junction is weaker than at higher voltages.
Also, the drain voltage of the device takes longer to fall for higher supply voltages
compared to lower supply voltages. Therefore, in case of high supply voltages, the
electric field in the drain-bulk junction is strong and present for a longer duration,
due to which a large amount of charge gets collected at the drain node (compared to
the case when the supply voltage is low).
[Plot: area of voltage glitch (V-ns) versus VDD (V), from 0.3 V to 1 V, for 2X, 4X and 15X INVs with LET = 2, 10 and 20]
Fig. V.6. Area of voltage glitch versus VDD
4. The effects of radiation particle strikes become severe for supply voltages less than
60% of the nominal value (which is slightly lower than twice the threshold
voltage of the PMOS transistor). As shown in Figure V.6, the area of the voltage
glitch increases with decreasing supply voltages. The PMOS transistor of the INV is
primarily responsible for recovering the voltage at the output node during a radiation
particle strike at the NMOS transistor. As the supply voltage (VDD) is decreased,
the PMOS transistor drive strength reduces and the PMOS transistor becomes significantly
weaker when the supply voltage is reduced below $2 \cdot V_T^P$. Note that the
decrease in the drive strength of the PMOS transistor with decreasing VDD value is
much higher for VDD < $2 \cdot V_T^P$ compared to VDD values greater than $2 \cdot V_T^P$. Hence, when
the supply voltage is less than $2 \cdot V_T^P$, the PMOS transistor takes longer to recover
the voltage at the output of the INV.
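The two metrics used in these observations, the collected charge Q and the area of the voltage glitch, are both time integrals of simulated waveforms taken from the strike time onward. A minimal sketch of how they could be computed from sampled waveforms is given below. The function names, the sampled-list representation and the use of the trapezoidal rule are assumptions made for illustration; they are not part of the simulation flow described in this chapter.

```python
def trapezoid(times, values):
    """Integrate sampled values over times using the trapezoidal rule."""
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (values[k] + values[k - 1]) * (times[k] - times[k - 1])
    return total


def collected_charge(times, drain_current, t_strike):
    """Q: integral of the NMOS drain current from the strike time onward."""
    kept = [(t, i) for t, i in zip(times, drain_current) if t >= t_strike]
    return trapezoid([t for t, _ in kept], [i for _, i in kept])


def glitch_area(times, v_out, vdd, t_strike):
    """Area of the glitch: integral of (VDD - V(out)) from the strike time onward."""
    kept = [(t, vdd - v) for t, v in zip(times, v_out) if t >= t_strike]
    return trapezoid([t for t, _ in kept], [d for _, d in kept])


if __name__ == "__main__":
    # Hypothetical triangular glitch, times in ns, voltages in V.
    t = [0.0, 1.0, 1.1, 1.2, 1.3]
    v = [1.0, 1.0, 0.0, 1.0, 1.0]
    print(glitch_area(t, v, vdd=1.0, t_strike=1.0))  # area in V-ns
```

With times in ns, the glitch area comes out in V-ns, the unit used in Figure V.6 and Table V.1. The infinite upper limit of the integrals is replaced here by the end of the sampled waveform.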
To study the effect of loading on the radiation susceptibility of a gate, a 4× INV
with LET = 2 MeV-cm2/mg and 10 MeV-cm2/mg, and VDD = 1 V was also simulated for
different load capacitances (0 fF, 1 fF, 3 fF, 5 fF and 6.3 fF). The results are reported in
Table V.1. In Table V.1, Columns 1 and 2 report the LET and the load capacitance values
under consideration. Column 3 reports the charge collected (Q) at the output of the INV.
The area of the radiation-induced voltage glitch is reported in Column 4. From Table V.1
the following observations were made:
1. For small devices with medium or high energy radiation particle strikes, the pulse
width of the voltage glitch increases with an increasing load capacitance (Cload) of
the gate. Due to a radiation particle strike of medium (or high) energy, the voltage at
the output of the INV of smaller sizes (such as 4× or smaller) becomes negative very
rapidly. After this the PMOS transistor of the INV starts recovering the voltage at the
output. If the INV is driving a higher load capacitance (Cload), then the PMOS will
take a longer time to restore the output voltage. Thus, the width of the voltage glitch
increases with the increasing load capacitance (Cload), contrary to popular belief.
2. However, for low energy radiation particle strikes, an increase in the load capacitance
(Cload) of the gate improves the radiation tolerance of the INV. The magnitude of the
voltage glitch is reduced with increasing load capacitance (which is due to increasing
fanout). This effect is more visible for low energy radiation particle strikes. For
high energy strikes, the difference in the magnitude of the voltage glitch for two
different loads is very small. As the voltage glitch magnitude is lower for low energy
strikes, the PMOS transistor of the INV has to recover a lower voltage swing at the
output node. Thus, the width of the voltage glitch reduces with the increasing load
capacitance. Hence, the INV becomes more tolerant to low energy radiation strikes
with the increasing load capacitance.
Table V.1. Q and Area of Voltage Glitch Versus Load Capacitance (Cload)
LET (MeV-cm2/mg)   Cload (fF)   Q (fC)   Voltage Glitch Area (V-ns)
2                  0            10.0     0.0434
2                  1            10.9     0.0448
2                  3            11.1     0.0361
2                  5            11.3     0.0314
2                  6.3          11.6     0.0303
10                 0            26.8     0.1224
10                 1            27.7     0.1284
10                 3            29.9     0.1409
10                 5            32.6     0.1549
10                 6.3          34.2     0.1629
3. The charge collected increases with the increasing load capacitance. This is again
due to the fact that the voltage of the drain node of the NMOS transistor falls slowly for
large load capacitances. Thus, the electric field is present in the drain-bulk junction
of the NMOS for a longer duration and hence, more charge gets collected.
The observations made above from Figures V.5 and V.6, and Table V.1 are important to
consider during radiation hardening of DVS and sub-threshold circuits. Based on the obser-
vations made, several design guidelines are presented for hardening DVS and sub-threshold
circuits. These guidelines suggest that the traditional radiation hardening approaches need
to be revisited.
1. If a gate is upsized to increase its radiation tolerance, then a higher value of charge
collected (due to a radiation particle strike) should be used. This is extremely impor-
tant for low voltage operation, since lowered voltage circuits are more likely to have
large voltage glitch areas.
2. For environments with low energy radiation particles, it is safe to assume that the
charge collected remains constant across different supply voltages for wide devices.
The collected charge also remains roughly constant across different gate sizes for
high or nominal voltages.
3. DVS designs should scale down the supply voltage of a circuit no lower than $2 \cdot V_T$ ($V_T$ is the
maximum of $V_T^P$ and $V_T^N$). Below this value, radiation susceptibility increases rapidly
as shown in Figure V.6. Also, a circuit with DVS should be hardened at the lowest
operating voltage, with the charge collected at that voltage. This will ensure radiation
tolerance at higher supply voltages. Sub-threshold circuits and circuits with a supply
voltage < $2 \cdot V_T$ require aggressive protection against radiation strikes.
4. The fanout load capacitance (Cload) of gates should be kept low in circuits operating
in high energy radiation particle environments. This is contrary to conventional wis-
dom. For low energy radiation environments, the fanout factor (load capacitance) of
the gates should be increased to improve their radiation tolerance.
Observe from Figure V.5 that the charge collected (Q) at the output of a gate has
a strong dependence on the size of the gate, the supply voltage (VDD) and the radia-
tion particle energy (LET). Therefore, simulating radiation particle strikes in DVS cir-
cuits in SPICE using a worst case collected charge (maximum possible charge collec-
tion) may lead to very pessimistic designs. To improve the accuracy of SPICE simula-
tions of radiation particle strikes in DVS circuits, a model for the charge collected (QM)
at the output node of a gate is also proposed, and 5 parameters from this model can
be appended to the SPICE model cards for MOSFETs (for example, the PTM model
cards for 65 nm [45]). Since Q directly depends on the size of a gate W (expressed
in µm), VDD and LET (expressed in MeV-cm2/mg), the model proposed in this work is
$Q_M = \min(K_{MAX} \cdot LET,\; K_Q \cdot W^{\beta_1} V_{DD}^{\beta_2} LET^{\beta_3})$. Here, KMAX, KQ, β1, β2 and β3 are obtained
by characterizing a process technology through 3D simulations of radiation particle
strikes. In the expression for QM, KMAX·LET represents the maximum amount of charge
that can be collected due to a radiation particle strike. The value of KMAX is obtained from
3D simulations of radiation particle strikes at the drain of a very wide NMOS transistor for
different LET values. Note that the drain terminal of this NMOS transistor was connected
to VDD (nominal value) and the source, gate and bulk terminals were connected to GND
to maximize the charge collection. From 3D simulations, KMAX was found to be 0.8. Note
that since KMAX = 0.8, the amount of charge that can be collected (Q) in the worst case is 80%
of the charge deposited (QD) by a radiation particle strike in the charge collection volume.
Therefore, the traditional approach, in which 100% of the charge deposited is assumed to
be collected in the worst case, is pessimistic. The values of KQ, β1, β2 and β3 (parameters
in the second term in the expression for QM) were estimated by fitting the model QM with
Q obtained through 3D simulations (shown in Figure V.5) for 2×, 4× and 15× INV, and
for VDD = 0.6 to 1.0 V (in steps of 0.1 V) and LET = 10 and 20 MeV-cm2/mg. The values
obtained are KQ = 16.54 fC, β1 = 0.704, β2 = 0.9 and β3 = 0.664. Note that the curve
fit was performed for medium and high energy particles, since hardening of DVS circuits
needs to be performed against radiation particles of such energies, to meaningfully improve
their radiation tolerance. As mentioned earlier, for low energy particle strikes, it is safe to
assume that the charge collected remains constant across different supply voltages in wide
transistors. Therefore, the QM model proposed in this chapter is applicable for medium
and high energy particle strikes. Also, as proposed earlier, DVS designs should scale the
supply voltage of a circuit down to no less than 60% of the nominal value, and therefore the QM model
was obtained for VDD = 0.6 to 1.0 V. To evaluate the accuracy of the proposed model, the
amount of charge collected at the output of the INV (shown in Figure V.1) predicted by the
proposed model (dark bar) and from 3D simulations (light bar) are plotted in Figure V.7.
Figure V.7 shows the charge collected with VDD = 0.6 to 1.0 V in steps of 0.1 V as the
outermost variable. For each voltage value the charge collected is reported for 2×, 4×
and 15× INVs with LET = 10 and 20 MeV-cm2/mg, as the legend indicates. Figure V.7
shows that the proposed model is very accurate, with an average error of 6.3%. Thus,
the proposed model for the charge collected at the output node of a gate can improve the
accuracy of the SPICE level simulation of radiation particle strikes in DVS circuits. For
sub-threshold circuits, it is difficult to find an accurate model since the charge collection
efficiency is very low and hence, 3D simulations should be performed to obtain the value
of the charge collected (Q) at the output node of a gate for different parameter values (W
and LET).
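The QM model described above can be evaluated with a few lines of code. The sketch below uses the fitted values reported in this chapter (KMAX = 0.8, KQ = 16.54 fC, β1 = 0.704, β2 = 0.9, β3 = 0.664). One point is an assumption: the cap term is computed as KMAX times the deposited charge QD, following the statement that the worst-case collected charge is 80% of QD. How QD follows from LET is not restated here, so it is passed in by the caller through the hypothetical parameter q_deposited_fc. The function name is also illustrative.

```python
def charge_model_fc(width_um, vdd, let, q_deposited_fc,
                    k_max=0.8, k_q=16.54,
                    beta1=0.704, beta2=0.9, beta3=0.664):
    """Estimate the collected charge QM in fC.

    width_um       : gate size W in um
    vdd            : supply voltage in V (the fit covers 0.6 V to 1.0 V)
    let            : LET in MeV-cm2/mg (the fit covers 10 and 20)
    q_deposited_fc : deposited charge QD in fC (assumed input, see lead-in)
    """
    # Second term of the model: power-law fit in W, VDD and LET.
    fitted = k_q * (width_um ** beta1) * (vdd ** beta2) * (let ** beta3)
    # First term of the model: worst-case cap, taken here as KMAX * QD.
    cap = k_max * q_deposited_fc
    return min(cap, fitted)


if __name__ == "__main__":
    # Hypothetical operating point: W = 1 um, VDD = 0.8 V, LET = 10.
    print(charge_model_fc(1.0, 0.8, 10.0, q_deposited_fc=1000.0))
```

Outside the fitted range (sub-threshold supplies or low energy strikes) the fit is not expected to hold, as noted in the text.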
As mentioned earlier, the radiation-induced current shown in Figure V.4 for LET =
10 and 20 MeV-cm2/mg is very different from the double exponential current pulse de-
scribed by Equation 1.1. Therefore, for accurate SPICE simulations of radiation particle
strikes in circuits, a new model is required to model the radiation-induced current in DSM
devices (for example, the current waveforms shown in Figure V.4 for LET = 10 and 20
MeV-cm2/mg). Note that it is difficult to model the radiation-induced current in DSM devices
by a simple expression such as a double exponential current pulse. However, the radiation-
induced current may be modeled by a combination of a double exponential pulse and a
rectangular pulse. The double exponential part models the charge collection by the drift
process, whereas the rectangular pulse models the charge collection by the diffusion process.
Note that the rectangular pulse can accurately model the plateau present in the
radiation-induced current in DSM devices as shown in Figure V.4.
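The combined current model suggested above can be sketched as the sum of a double exponential pulse and a rectangular pulse. The code below assumes the commonly used double exponential form I(t) = Q/(τα − τβ) · (exp(−t/τα) − exp(−t/τβ)), which integrates to Q; whether this matches Equation 1.1 term by term is an assumption. The function names and the plateau parameters (amplitude, start and end times) are hypothetical and would have to be extracted from 3D simulation waveforms such as those in Figure V.4.

```python
import math


def double_exponential(t, q, tau_alpha, tau_beta):
    """Drift component: double exponential pulse whose total charge is q."""
    if t < 0.0:
        return 0.0
    scale = q / (tau_alpha - tau_beta)
    return scale * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def rectangular(t, amplitude, t_start, t_end):
    """Diffusion component: constant plateau between t_start and t_end."""
    return amplitude if t_start <= t < t_end else 0.0


def strike_current(t, q_drift, tau_alpha, tau_beta,
                   plateau_amplitude, plateau_start, plateau_end):
    """Combined radiation-induced current at time t after the strike."""
    return (double_exponential(t, q_drift, tau_alpha, tau_beta)
            + rectangular(t, plateau_amplitude, plateau_start, plateau_end))


if __name__ == "__main__":
    # Units: fC and ps, so currents are in fC/ps (that is, mA).
    # The plateau values below are placeholders, not fitted data.
    for t in (0.0, 50.0, 200.0):
        print(t, strike_current(t, 24.0, 145.0, 45.0, 0.05, 100.0, 300.0))
```

The charge carried by the rectangular part is simply its amplitude times its duration, so the two components can be fitted to the drift and diffusion charge separately.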
Fig. V.7. Comparison of charge collected (Q) obtained from the proposed model versus 3D simulations
V-E. Chapter Summary
Radiation particle strikes are becoming an increasingly important problem for both combinational
and sequential circuits. At the same time, power has become a major issue in
computing. In recent times, it is common to decrease the supply voltage value in the
non-critical parts of VLSI systems, in order to reduce the power and energy consumption.
Reduced supply voltages further aggravate reliability issues due to radiation. With
increasing demand for reliable systems, it is necessary to design radiation tolerant circuits
efficiently. In this chapter, radiation particle strikes in DVS and sub-threshold circuits
were studied. 3D simulations for radiation particle strikes in an inverter were performed,
using Sentaurus-DEVICE. The sensitivity of DVS and sub-threshold circuits to radiation
particle strikes was studied by varying the inverter size, the inverter load, the supply voltage
(VDD) and the energy of the radiation particle. This was done using 3D simulations.
From these 3D simulations, several non-intuitive observations were made which are important
to consider during the radiation hardening of DVS and sub-threshold circuits. Based on
these observations, several guidelines were also proposed for radiation hardening of DVS
and sub-threshold circuit designs. A model for the charge collected at the output node of
a gate was also proposed which can improve the accuracy of SPICE simulations of radiation
events.
In the next two chapters, two approaches are presented for hardening a design against
radiation particle strikes.
CHAPTER VI
RADIATION HARDENING - CLAMPING DIODE BASED RADIATION TOLERANT
CIRCUIT DESIGN APPROACH
VI-A. Introduction
In Chapter I, the need to harden combinational circuits was discussed. Then in Chap-
ters II, III, IV and V, analysis approaches were presented to analyze radiation-induced
transients in combinational circuits and SRAMs. Based on the results of the analysis of the
effects of a radiation particle strike on a circuit, selective hardening of the gates in a cir-
cuit may be performed, to achieve the desired level of radiation tolerance while satisfying
area, delay and power constraints. For this, efficient circuit level hardening techniques are
required.
This dissertation proposes two circuit level hardening approaches, previously published
in [35, 36], to harden combinational circuits against a radiation particle strike. These
two approaches are referred to as the diode clamping based approach and the split-output
based hardening approach. The diode clamping based approach (which is presented in
this chapter) is suitable for hardening combinational circuits against low energy radiation
particle strikes. The split-output based hardening approach (described in the next chapter)
is suitable for high energy radiation particle environments.
The diode clamping based hardening approach is based on the use of shadow gates,
whose task is to protect the primary gate in case it experiences a radiation strike. The gate
to be protected is duplicated locally, and a pair of diode-connected transistors (or diodes)
is connected between the outputs of the original and shadow gates. These diodes turn on
when the voltages of the two gates deviate (during a radiation strike). Experimental results
show that at the level of a single gate, the area overhead is quite large. At the circuit level,
however, gates are selectively hardened. A methodology is presented to protect specific
gates of the circuit based on electrical masking, in a manner that guarantees radiation tolerance
for the entire circuit. This circuit level hardening methodology is able to harden
a circuit with low area and delay overheads. An improved circuit level hardening algorithm
is also proposed, to further reduce the delay and area overhead.
The remainder of this chapter is organized as follows: Section VI-B discusses previ-
ous work in the area of designing radiation tolerant VLSI circuits. Section VI-C describes
the diode clamping based radiation tolerant combinational circuit design approach. Exper-
imental results are presented in Section VI-D, while the chapter summary is provided in
Section VI-E.
VI-B. Related Previous Work
There has been a great deal of work on radiation hardened circuit design techniques. Sev-
eral papers focus on combinational circuits [3, 2, 92, 69, 12, 5], while others have focused
on memory design [1, 2, 93, 94, 95, 96, 97, 98]. Since memories are particularly susceptible
to radiation events, these efforts were crucial to space and military applications.
Circuit hardening approaches can be classified as device level, circuit level, and system
level [99, 5, 12, 42, 35, 43, 100, 36]. Device level approaches require fundamental
changes to the fabrication process to improve the radiation immunity of a design [99].
Silicon-on-insulator (SOI) devices are considered to be more tolerant than bulk CMOS de-
vices [11] due to lower charge collection volume in SOI devices. However, other hardening
techniques are still required to achieve a meaningful tolerance of an SOI based design to
radiation particle strikes [11].
Circuit level hardening approaches use special circuit design techniques that reduce
the vulnerability of a circuit to radiation strikes [5, 12, 43, 42, 101, 102, 103]. In [12],
the authors selectively upsize the gates in a digital design to increase the radiation toler-
ance of the design. A larger gate has higher drive capability, which increases its radiation
immunity in comparison to a smaller gate. The authors protect those gates in a circuit
which contribute maximally to the soft error failure rate of the circuit. These sensitive
gates in a circuit are identified by using a logical masking [12] analysis. The authors
of [42, 43, 101, 102, 103] also performed selective gate hardening in a circuit. Heijmen et
al. performed selective duplication of sensitive gates in [43] (i.e. connecting two gates in
parallel) to reduce soft error rate (SER). The authors reported that SER can be improved by
50% with an area penalty of 30%.
Device and circuit level approaches are typically fault avoidance approaches, while
system level approaches typically involve the use of fault detection and tolerance mech-
anisms. Triple modular redundancy (TMR) [55] is a classical example of a system level
design approach. In [93], the authors provide a built-in current sensor (BICS) to detect
radiation events in an SRAM, which can be used to trigger a re-computation.
Although the approaches discussed above increase the circuit reliability to radiation
events, the cost (in terms of area, delay and power) associated with these approaches is high,
typically unacceptable for high-volume mainstream applications. Also, these approaches
provide radiation tolerance only against radiation particles with moderate energy levels. In other
words, the increase in Qcri achieved by traditional approaches is not very high. In several
applications, high energy radiation particle strikes are encountered. Therefore, there is a
need for a radiation hardening approach which can provide radiation tolerance against very
large values of Q, with comparable or smaller overheads. At the same time, there is a need
for radiation hardening approaches which incur low delay penalties for low to medium energy
radiation particle strikes.
In this dissertation, two circuit level hardening approaches are proposed to harden
combinational circuits against a radiation particle strike. The first approach (the diode
clamping based approach, described in this chapter) is suitable for low energy radiation
particle strikes, in circuits which cannot tolerate a large delay overhead due to radiation
hardening. The second approach (the split-output based hardening approach) is presented
in the next chapter, and it is suitable for high energy radiation particle environments.
VI-C. Proposed Clamping Diode based Radiation Hardening
A radiation strike at a node in a circuit can result in a voltage glitch at that node. If the
magnitude of the voltage glitch is more than the switch-point of gates driven by that node,
then the radiation-induced transient may propagate to the primary outputs of the circuit.
This may result in an SEU. The clamping diode based circuit hardening approach ensures
that such a radiation-induced voltage glitch (at the node where the radiation particle strike
occurs) is clamped before it reaches the switch-point of gates in its fanout.
This section is divided into three subsections. Section VI-C.1 describes two circuit structures
(shown in Figures VI.1 and VI.2) which were investigated in order to create a radiation-
hardened standard cell. Section VI-C.2 discusses the notion of critical depth
for any protected library cell. A larger critical depth for any cell indicates that more logic
stages are needed for this cell to erase the effects of a radiation-induced voltage glitch.
Based on the notion of critical depth, Section VI-C.3 describes two algorithms proposed in
this chapter to selectively protect cells in a standard-cell based circuit, so as to minimize
the delay and area overheads.
VI-C.1. Operation of Radiation-induced Voltage Clamping Devices
A clamping diode can be used to suppress a glitch. However, this clamping diode should
not prevent (or delay) the switching of the logic during its normal functional operation
(when no radiation strike has occurred). Hence, another similarly sized driver (logic gate
GP) is required in parallel with the gate G that is to be protected. This is shown in Figures
VI.1 and VI.2.
[Schematic: gates G and GP share the input in and drive out and outP respectively; labeled voltages are 1 V, 0 V, 1.4 V and −0.4 V; diodes D1 and D2 connect out and outP]
Fig. VI.1. Diode based radiation-induced voltage glitch clamping circuit
[Schematic: gates G and GP share the input in and drive out and outP respectively; labeled voltages are 1 V, 0 V, 1.4 V and −0.4 V]
Fig. VI.2. Device based radiation-induced voltage glitch clamping circuit
When the outputs of G and GP deviate significantly (which would
occur when one of the gates undergoes a radiation strike), the clamping circuit turns on,
thereby protecting the gate G from a radiation event. As shown in Figures VI.1 and VI.2,
the supply voltages for the protecting gate (GP) are higher (VDD = 1.4 V and VSS = -
0.4 V). Hence thicker gate oxides are used for the protecting gate (GP) of Figures VI.1
and VI.2, and the diode connected devices of Figure VI.2, in order to avoid reliability problems.
Multiple oxide thicknesses have been used in the past for a 65 nm process as reported
in [90, 104, 105, 106]. Note that it is possible to use a generic CMOS process with a single
oxide thickness, but a thicker oxide is required. The thicker oxide will increase short channel
effects for the protected gate (G) which is powered by VDD and GND. The devices used
in the protecting gate have a higher VT ($V_T^p$ = -0.42 V and $V_T^n$ = 0.42 V) compared to the
regular devices used in the protected gate G (which have $V_T^n$ = 0.22 V and $V_T^p$ = -0.22 V).
This is to minimize the leakage through the protecting gate, which is important since the
inputs of GP are the same as those of the protected gate. The devices used for clamping also
have a higher VT, to make sure that they are off during regular operation (in the absence of
radiation events). In fact the clamping devices are on the verge of conduction (since $V_T^p$ =
-0.42 V and $V_T^n$ = 0.42 V). Ideally it would be desired for the protecting gate to have an even
higher VT (to minimize the leakage through this gate), but the proposed circuit hardening
approach restricts itself to two VT values. Note that the bulk terminals of the protecting gate
(GP) and the diode connected devices of Figure VI.2 are connected to the protecting gate
power supplies, i.e. VDD = 1.4 V and VSS = -0.4 V. This ensures that the bulk terminals of
these devices are not forward biased. Also, the dimensions of the devices used in both the
hardened and regular versions of cells are the same. In other words, the sizing of the G and GP
gates in Figures VI.1 and VI.2 is the same as that of a corresponding unhardened gate.
The clamping diodes used can either be regular PN junction type diodes (Figure VI.1)
or diode connected devices (Figure VI.2). Both these options were investigated in the work
presented in this chapter. Note that Schottky diodes can also be used as clamping diodes.
VI-C.1.a. PN Junction Diode
Consider the circuit of Figure VI.1. Assume that a radiation particle strike occurs at the
output of the protected gate (with its output at logic 0 under steady state) which results in a
positive voltage glitch at that node. Note that when the output of the protected gate is at 0
V then the output of the protecting gate is at -0.4 V. When the voltage on the out node starts
rising and when the voltage across the diode D2 (in Figure VI.1) reaches the diode turn-on
voltage, it begins to clamp the voltage across it. In this way the glitch due to the radiation
event is suppressed.
Now consider the case of a radiation particle strike at the output (outP) of the protect-
ing gate which is at logic 0. In this case the protected node is still protected (remains at
logic 0). This is because the protecting node is initially at a much lower voltage (-0.4 V)
and as the voltage at the protecting node rises, the diode D2 remains turned-off. Diode D1
turns on only when the voltage at the protecting node rises to a value greater than the diode
turn-on voltage (i.e. the voltage glitch magnitude is 0.4 + diode turn-on voltage). However,
the radiation particle which can cause such a glitch would have to have a high energy. As
mentioned earlier, the proposed clamping diode based circuit hardening approach is suit-
able for low energy radiation particle strikes which cannot result in such a large voltage
glitch. Therefore, a low energy radiation particle strike at outP will not affect the voltage
at out.
The working of the clamping structure for falling radiation-induced pulses, when the
output node is at logic 1, is similar to that discussed above.
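The clamping behaviour described in this subsection can be summarized with an idealized threshold model of the two diodes. The sketch below assumes that D2 conducts when out exceeds outP by the diode turn-on voltage and that D1 conducts when outP exceeds out by the same amount; this orientation is inferred from the text above. The function name and the turn-on voltage of 0.6 V are hypothetical. The only property relied on is that the turn-on voltage exceeds the 0.4 V steady-state offset between the two outputs, so that both diodes are off in normal operation.

```python
def conducting_diodes(v_out, v_out_p, v_on):
    """Return the set of clamping diodes that conduct (idealized model).

    v_out   : voltage at the protected gate output (V)
    v_out_p : voltage at the protecting gate output (V)
    v_on    : assumed diode turn-on voltage (V)
    """
    on = set()
    if v_out - v_out_p >= v_on:
        on.add("D2")  # out has risen far enough above outP
    if v_out_p - v_out >= v_on:
        on.add("D1")  # outP has risen far enough above out
    return on


if __name__ == "__main__":
    V_ON = 0.6  # hypothetical turn-on voltage
    print(conducting_diodes(0.0, -0.4, V_ON))   # steady state at logic 0
    print(conducting_diodes(0.25, -0.4, V_ON))  # positive glitch at out
    print(conducting_diodes(0.0, 0.3, V_ON))    # moderate glitch at outP
```

In this model a glitch at outP must raise it from -0.4 V to at least the turn-on voltage above out before D1 conducts, which corresponds to the glitch magnitude of 0.4 V plus the diode turn-on voltage stated above.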
VI-C.1.b. Diode Connected Device
Consider the circuit in Figure VI.2. Again assume that a radiation event causes a positive
voltage glitch at out in Figure VI.2, which was at logic 0 under steady state. At this time,
the steady state output of the protecting gate is at -0.4 V. When the voltage of out starts
rising, the clamping NMOS device starts to turn on, and conducts more strongly if the
voltage of out continues to rise, thus clamping the protected node. If the radiation particle
strikes the output of the protecting gate i.e. outP, the out node remains at logic 0. This
is because the protecting node is initially at a much lower voltage (-0.4 V) and as the
voltage at the protecting node rises, the clamping NMOS device turns off more. It is only
when the voltage of the protecting node rises above 0.4 V that the clamping PMOS device
starts turning on. This could cause the voltage of the protected node to rise. As discussed
in Section VI-C.1.a, a radiation event would need to have a high energy to cause such a
glitch.
In a similar manner, the clamping PMOS device helps protect a gate from a falling
voltage pulse due to a radiation event.
Both the device-based and diode-based clamping structures were implemented, and
had very similar protection characteristics, as shown later in this chapter. The layout area
penalty of the device-based clamping structure was determined to be lower than that for a
diode-based clamping structure. As a consequence, the experiments reported in this chapter
are all based on the device-based clamping structure. The performance of device-based
and diode-based clamping structures for an inverter is presented in Tables VI.1, VI.2, VI.3
and VI.4.
It was experimentally veried that a radiation strike at the output of the protecting gate
does not cause extra soft errors (for the given value of Q = 24 fC, τα = 145 ps and τβ = 45
ps). In particular, if there is a radiation particle strike at the output of protecting gate then
the Q required to turn on the diode connected devices and affect the protected node needs
to be much larger than 24 fC. Note that the clamping diode based hardening approach
is suitable for low energy radiation particle strikes with Q up to 24 fC. Also, the correct
operation of the proposed radiation tolerant gate (shown in Figure VI.2) was explicitly
veried by simulating a radiation particle strike at all nodes of the gates, for every gate
in the library (LIB) used in this approach to implement radiation tolerant combinational
circuits.
VI-C.2. Critical Depth for a Gate
Fig. VI.3. Layout of radiation-tolerant NAND2 gate (uses device based clamping)
Radiation hardened versions for all regular unhardened cells present in the library LIB were
designed using diode connected devices. Then the critical depth (which is based on the
electrical masking) of each radiation hardened cell was computed in the following manner.
Consider a sequence of n copies of the same library cell C, with the output of the ith
cell being one of the inputs of the (i+1)th cell. Let all the other inputs of the (i+1)th cell be
assigned to their non-controlling values. Assume that a radiation strike occurs at the output
of the cell at the first level, with Q = 24 fC, τα = 145 ps and τβ = 45 ps. Then the critical
depth of library cell C, denoted as ∆(C), is defined as the number of levels of logic that are
required for the magnitude of the glitch due to the radiation event to become smaller than
γ×VDD, where γ < 1. Note that ∆(C) is a function of Q, τα, τβ, the load driven by C and
the input ordering of C. The values of ∆(C) were estimated using SPICE simulations. The
worst case critical depth for any library cell C is obtained (by loading it with a single fanout
load) in these simulations. Also, for n-input gates, the output of each gate was connected
to the kth input of the subsequent gate, for each k from 1 to n. The critical depth was then
computed as the worst depth among all n possible input orderings. In this manner, the worst
case critical depths were computed for all the cells in the library LIB, for the given values
of Q, τα and τβ. Note that the definition of critical depth is applicable to static CMOS gates only.
VI-C.3. Circuit Level Radiation Hardening
A simplistic approach would be to protect each gate in the design using the standard cell
hardening approach proposed in this chapter. However, this would result in an exorbitant
delay and area overhead for the circuit. Instead, a selective hardening approach is presented
where the delay and area overheads are minimized, while guaranteeing radiation hardness for
the circuit.
Let ∆ = max_C(∆(C)) over all the cells C in the library LIB. Given any circuit, one could
protect all gates that are topologically ∆ or fewer levels away from any primary output of
the circuit. In this case, if there is a radiation strike on any protected cell, it would be
eliminated because the cell is protected. If there is a radiation strike on an unprotected
cell, it would be eliminated since the effect of the strike needs to traverse ∆ or more levels
of protected gates before it reaches the output. In either case, the circuit is tolerant to the
radiation event.
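The fixed depth scheme described above amounts to a bounded reverse traversal from the primary outputs. The following Python sketch is one way to express it, assuming that the circuit is given as a dictionary mapping each gate to the gates driving it, and that the gate driving a primary output is at level 1. The data structure and the function name are illustrative assumptions only.

```python
from collections import deque


def fixed_depth_protect(fanins, primary_outputs, delta):
    """Return the set of gates that lie within `delta` levels of any primary output.

    fanins: dict mapping a gate name to the list of gates that drive it.
    Assumption: the gate driving a primary output is at level 1.
    """
    if delta < 1:
        return set()
    level = {po: 1 for po in primary_outputs}
    queue = deque(primary_outputs)
    while queue:
        gate = queue.popleft()
        if level[gate] == delta:
            continue  # drivers of this gate are more than delta levels away
        for driver in fanins.get(gate, ()):
            if driver not in level:  # breadth-first order gives the shortest distance
                level[driver] = level[gate] + 1
                queue.append(driver)
    return set(level)
```

Every gate in the returned set would be replaced by its hardened version, while all remaining gates are left unprotected.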
A variant of the above approach, which is slightly more efficient, is based on variable
depth protection, and is described in Algorithm 1. It is based on a reverse topological
traversal of a circuit η from its primary outputs. Let deptharray() be the array of critical
depths of all the library cells used in the implementation of the circuit η. The algorithm
starts with a requirement to protect gates up to a reverse topological depth D = ∆(p), where
∆(p) is the critical depth of the gate at the primary output p. Whenever a gate C with
critical depth ∆(C) is encountered, the algorithm updates the depth to be protected as
D = min(D − l, ∆(C)). Here, l is the topological depth of gate C from the primary output p.
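As a rough illustration of the variable depth idea, the Python sketch below propagates a remaining protection budget backwards from each primary output. This is only one possible reading of the depth update rule: the budget is reduced by one per level and capped by the critical depth of each gate encountered, and a gate is protected while its budget is at least 1. It is not claimed to reproduce the exact bookkeeping of Algorithm 1, and the data structures and names are assumptions.

```python
def variable_depth_protect(fanins, primary_outputs, depth_of):
    """Return the set of gates to harden under a per-level budget rule.

    fanins: dict mapping a gate to the list of gates that drive it.
    depth_of: dict mapping a gate to the critical depth of its library cell.
    Assumed rule: budget(driver) = min(budget(gate) - 1, depth_of[driver]),
    and a gate is protected if its budget is at least 1.
    """
    budget = {}
    stack = [(po, depth_of[po]) for po in primary_outputs]
    while stack:
        gate, remaining = stack.pop()
        # Skip gates with no budget left, or already reached with a larger budget.
        if remaining < 1 or budget.get(gate, 0) >= remaining:
            continue
        budget[gate] = remaining
        for driver in fanins.get(gate, ()):
            stack.append((driver, min(remaining - 1, depth_of[driver])))
    return set(budget)
```

Under this reading, a chain of gates with critical depth 1 requires only the output gate to be hardened, whereas a cell with a large critical depth at the output forces protection further back into the circuit.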
Algorithm 1 Variable Depth Radiation Hardening for a Circuit
variable_depth_protect(η, deptharray)
  for each p ∈ PO(η) do
    D = ∆(p)
    for each cell C such that p ∈ fanout(C) do
      l = topological depth of C from p
      D = min(D − l, ∆(C))
      if D > 1 then
        Replace C by C_hardened
      end if
    end for
  end for
VI-C.4. Alternative Circuit Level Radiation Hardening
If a large number of gates with high critical depth are present near the primary outputs
of a circuit then it might be necessary to protect a significant portion of the circuit using
the variable depth protection approach. This will result in large area and delay overheads.
Column 8 of Table VI.5 reports the critical depth of all the gates in the library LIB. Observe
from this table that the critical depth of the inv2AA gate is much higher than the rest
of the gates in LIB. Therefore, if a large number of inv2AA gates are present near the
primary outputs of a circuit, then the area and delay overhead of the hardened circuit will
be large. Thus, to further reduce the area and delay overhead associated with the variable depth
protection scheme, an algorithm is presented, which attempts to reduce the number of gates
with large critical depth (such as inv2AA) near the primary outputs of a circuit.
The proposed approach to further reduce the area or delay overhead is described in
Algorithm 2. Let η be a mapped circuit obtained using library LIB with either area or
delay as a cost function. Also let η∗ be the circuit obtained after using the variable depth
protection algorithm on η. Now, partition η∗ into two parts, where the first part is the
unprotected portion of η∗, represented by ζ, and the second part is the protected portion of
η∗, represented by φ. Then modify the library LIB to obtain another library L∗ in which
a large area and delay cost are assigned to gates with large critical depths (for example
inv2AA). Re-synthesize φ with the new library L∗ to obtain φ∗, which will contain very few
gates of high critical depth because of the high cost associated with them. Then, append ζ
to φ∗ and apply the variable depth protection algorithm on the combined circuit to produce
a radiation tolerant circuit η′. The resulting circuit η′ is referred to as the re-synthesized
hardened circuit in the sequel.
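The flow described above can be sketched in Python as follows. The library modification step is shown concretely, with a hypothetical penalty factor and depth limit; the protection, partitioning, re-synthesis and append steps are passed in as callables, since they would be carried out by the synthesis tool. All names, the library representation and the default parameter values are assumptions made for illustration.

```python
def penalize_deep_cells(library, depth_of, depth_limit=2, penalty=1000.0):
    """Return a copy of the library with inflated costs for high-depth cells.

    library: dict mapping a cell name to {"area": float, "delay": float}.
    depth_limit and penalty are hypothetical tuning values.
    """
    modified = {}
    for cell, cost in library.items():
        scale = penalty if depth_of[cell] > depth_limit else 1.0
        modified[cell] = {"area": cost["area"] * scale, "delay": cost["delay"] * scale}
    return modified


def alternative_circuit_protect(circuit, library, depth_of,
                                protect, partition, resynthesize, append):
    """Mirror of the re-synthesis flow; the four callables are placeholders."""
    hardened = protect(circuit, depth_of)
    unprotected, protected = partition(hardened)
    penalized = penalize_deep_cells(library, depth_of)
    resynthesized = resynthesize(protected, penalized)
    return protect(append(unprotected, resynthesized), depth_of)
```

With the regular-cell area and delay values of inv2AA and nand2AA from Table VI.5, only inv2AA (critical depth 4) receives the inflated cost under these defaults, so a cost-driven mapper would tend to avoid it in the re-synthesized portion.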
Algorithm 2 Alternative circuit level radiation hardening
alternative_circuit_protect(η, L, deptharray)
  η∗ = variable_depth_protect(η, deptharray)
  Partition η∗ into (ζ, φ)
  L∗ = modify(L)
  φ∗ = re-synthesize(φ, L∗)
  ηc = append(ζ, φ∗)
  η′ = variable_depth_protect(ηc, deptharray)
VI-C.5. Final Circuit Selection
Two different radiation tolerant versions η∗ and η′ of a regular circuit η are obtained using
the approaches described in Sections VI-C.3 and VI-C.4. Now obtain the delay and area
associated with both η∗ and η′. The final radiation tolerant circuit can be obtained by
choosing η∗ or η′ such that the area or the delay is minimized. This approach is referred to
as the improved circuit protection approach.
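The selection step can be written as a simple minimum over the two candidates. The sketch below uses the area-mapped alu2 values from Tables VI.6 and VI.7; the dictionary layout and the names `eta_star` and `eta_prime` are illustrative assumptions.

```python
def select_final(candidates, objective):
    """Return the name of the candidate that minimizes the chosen objective.

    candidates: dict mapping a candidate name to {"delay": ps, "area": um^2}.
    objective: either "delay" or "area".
    """
    return min(candidates, key=lambda name: candidates[name][objective])


# Area-mapped alu2: delays from Table VI.6, areas from Table VI.7.
alu2 = {
    "eta_star": {"delay": 1165.100, "area": 1418.28},
    "eta_prime": {"delay": 1214.939, "area": 1215.22},
}
best_for_delay = select_final(alu2, "delay")
best_for_area = select_final(alu2, "area")
```

For this circuit the delay objective selects η∗ and the area objective selects η′, which agrees with the alu2 entries of Tables VI.9 and VI.10.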
VI-D. Experimental Results
The radiation tolerance of both radiation hardened gate structures shown in Figures VI.1
and VI.2 was simulated in SPICE [38]. A 65nm BPTM [107] model card was used, with
VDD = 1 V and VTN = |VTP| = 0.22 V.
Based on [12], τβ = 45 ps was used. The values of τα and Q were varied, to test the
proposed radiation hardened gate design against a variety of radiation conditions.
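For reference, the sketch below evaluates a radiation-induced current pulse for given Q, τα and τβ. It assumes the commonly used double-exponential pulse model, I(t) = Q/(τα − τβ) · (exp(−t/τα) − exp(−t/τβ)); this section does not restate the model, so the formula and the function names should be read as assumptions.

```python
import math


def strike_current(t, q, tau_alpha, tau_beta):
    """Double-exponential current (A) at time t (s) for a total charge q (C)."""
    if t < 0:
        return 0.0
    scale = q / (tau_alpha - tau_beta)
    return scale * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def peak_time(tau_alpha, tau_beta):
    """Time (s) of the current peak, obtained by setting dI/dt = 0."""
    return tau_alpha * tau_beta / (tau_alpha - tau_beta) * math.log(tau_alpha / tau_beta)


def injected_charge(q, tau_alpha, tau_beta, t_end, steps=20000):
    """Trapezoidal integral of the pulse from 0 to t_end (C)."""
    dt = t_end / steps
    total = 0.0
    prev = strike_current(0.0, q, tau_alpha, tau_beta)
    for i in range(1, steps + 1):
        cur = strike_current(i * dt, q, tau_alpha, tau_beta)
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total


Q, TAU_A, TAU_B = 24e-15, 145e-12, 45e-12  # 24 fC, 145 ps, 45 ps
t_peak = peak_time(TAU_A, TAU_B)
charge = injected_charge(Q, TAU_A, TAU_B, 20 * TAU_A)
```

Integrating the pulse over a window much longer than τα recovers the charge Q, which is a convenient sanity check when setting up the current source for a simulation.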
The performance of both radiation hardened gate designs is summarized in Tables VI.1,
VI.2, VI.3 and VI.4. These tables report the protection results (in terms of the magnitude
of the radiation-induced voltage glitch) for the INV-2X gate, which is the most radiation
sensitive gate in the library LIB. The first two tables report the simulation results for diode
based clamping, and the latter two describe the results for device based clamping. For both
styles, the glitch magnitude is reported for varying values of τα and Q. The first and third
tables report values of the glitch magnitudes when the output is at logic 0, while the second
and fourth correspond to an output at logic 1.
Table VI.1. Glitch Magnitude of PN Junction Clamping Diode for Rising Pulses (Output at
Logic 0)
        Decay time τα (ps)
Q (fC)  105   115   125   135   145   155   165   175   185
21      0.31  0.29  0.28  0.27  0.26  0.24  0.24  0.23  0.22
22      0.33  0.32  0.29  0.28  0.27  0.26  0.25  0.24  0.23
23      0.34  0.32  0.31  0.29  0.28  0.27  0.26  0.25  0.24
24      0.36  0.34  0.32  0.31  0.29  0.28  0.27  0.26  0.25
25      0.38  0.35  0.33  0.32  0.31  0.29  0.28  0.27  0.26
26      0.39  0.37  0.35  0.33  0.31  0.30  0.29  0.28  0.27
27      0.41  0.39  0.36  0.36  0.33  0.31  0.30  0.29  0.28
28      0.43  0.41  0.38  0.36  0.34  0.33  0.31  0.30  0.29
29      0.45  0.42  0.39  0.37  0.35  0.34  0.32  0.31  0.30
30      0.47  0.44  0.41  0.39  0.37  0.35  0.33  0.32  0.31
Table VI.2. Glitch Magnitude of PN Junction Clamping Diode for Falling Pulses (Output at
Logic 1)
        Decay time τα (ps)
Q (fC)  105   115   125   135   145   155   165   175   185
21      0.31  0.29  0.28  0.26  0.25  0.24  0.24  0.23  0.22
22      0.32  0.32  0.29  0.28  0.26  0.25  0.24  0.24  0.23
23      0.34  0.32  0.30  0.29  0.28  0.26  0.26  0.25  0.24
24      0.35  0.33  0.32  0.30  0.29  0.28  0.27  0.25  0.25
25      0.36  0.34  0.33  0.31  0.30  0.29  0.28  0.26  0.26
26      0.38  0.36  0.34  0.33  0.31  0.30  0.29  0.28  0.26
27      0.40  0.37  0.35  0.34  0.32  0.31  0.30  0.28  0.28
28      0.41  0.39  0.38  0.35  0.34  0.32  0.31  0.29  0.28
29      0.43  0.40  0.38  0.36  0.35  0.33  0.32  0.30  0.29
30      0.45  0.42  0.40  0.38  0.38  0.34  0.33  0.31  0.30
Based on these tables, it can be observed that the regular PN junction diode tended
to have better protection performance than the diode connected device for the same active
area. However, implementing the PN junction diodes requires a larger area on account of
the spacing requirements of the required wells, which are at different potentials. The diode
connected devices, on the other hand, share their well with the devices in the protecting gate,
and can be implemented in a more area-efficient manner. Also, the leakage current of the
regular PN junction diode will be higher under delay variations which can lead to a large
voltage drop across the diode. Therefore, the diode connected devices of Figure VI.2 were
used for hardening gates. Note that Schottky diodes can be used instead of regular
PN junction diodes. Schottky diodes can be implemented in a smaller area compared
to regular PN junction diodes. The library LIB consists of INV-2X, INV-4X, AND2,
AND3, AND4, OR2, OR3, OR4, NAND2, NAND3, NAND4, NOR2, NOR3 and NOR4
gates. Layouts were created for both the hardened and regular versions of all the gates in
the standard cell library LIB. Figure VI.3 shows the layout of the device based clamping
approach, for the NAND2 gate.
Table VI.3. Glitch Magnitude of Diode-connected Clamping Device for Rising Pulses (Output
at Logic 0)
        Decay time τα (ps)
Q (fC)  105   115   125   135   145   155   165   175   185
21      0.33  0.31  0.29  0.27  0.26  0.25  0.23  0.22  0.21
22      0.36  0.33  0.31  0.29  0.27  0.26  0.25  0.24  0.23
23      0.38  0.35  0.33  0.31  0.29  0.28  0.26  0.25  0.24
24      0.40  0.37  0.34  0.33  0.31  0.29  0.28  0.26  0.25
25      0.42  0.39  0.36  0.34  0.32  0.31  0.29  0.27  0.26
26      0.45  0.41  0.38  0.36  0.34  0.32  0.30  0.29  0.27
27      0.48  0.44  0.41  0.38  0.35  0.34  0.32  0.30  0.29
28      0.50  0.46  0.43  0.40  0.37  0.35  0.33  0.32  0.30
29      0.53  0.49  0.45  0.42  0.39  0.37  0.35  0.33  0.31
30      0.56  0.51  0.47  0.44  0.41  0.39  0.36  0.35  0.33
Table VI.4. Glitch Magnitude of Diode-connected Clamping Device for Falling Pulses
(Output at Logic 1)
        Decay time τα (ps)
Q (fC)  105   115   125   135   145   155   165   175   185
21      0.32  0.30  0.28  0.27  0.26  0.24  0.23  0.22  0.21
22      0.33  0.31  0.30  0.28  0.26  0.25  0.24  0.23  0.22
23      0.35  0.33  0.31  0.29  0.28  0.27  0.25  0.24  0.23
24      0.38  0.35  0.33  0.31  0.29  0.28  0.26  0.25  0.24
25      0.40  0.37  0.34  0.33  0.31  0.29  0.28  0.26  0.25
26      0.41  0.38  0.36  0.34  0.32  0.30  0.29  0.28  0.27
27      0.43  0.41  0.38  0.35  0.34  0.32  0.30  0.29  0.28
28      0.45  0.43  0.40  0.37  0.35  0.33  0.32  0.30  0.29
29      0.48  0.44  0.42  0.39  0.37  0.35  0.34  0.31  0.30
30      0.50  0.46  0.43  0.40  0.38  0.36  0.34  0.33  0.31
Figure VI.4 shows the voltage waveform at the output of INV-2X, when a radiation
particle strike corresponding to Q = 24 fC, τα = 145 ps and τβ = 45 ps was simulated at its
output node. The voltage waveform of the unprotected design experiences a large glitch. If
it were captured by a memory element, an incorrect value would be sampled. The proposed
device clamping based hardened INV-2X successfully clamps the voltage to a safe level.
Fig. VI.4. Output waveform during a radiation event on output
Figure VI.5 shows the voltage waveform at the output of a gate, when a current corre-
sponding to Q = 24 fC, τα = 145ps and τβ = 45ps is injected into the protecting node. The
voltage waveform of the output node is well within the noise margins of the gate.
Fig. VI.5. Output waveform during a radiation event on protecting node
Based on the fact that the device based protection scheme is used due to its reduced
layout area compared to the diode clamping approach, the largest value of Q that the INV-
2X cell can tolerate (from Tables VI.3 and VI.4) is 24 fC for τα = 145 ps. This corresponds
to γ = 0.35 (i.e. the designs can tolerate a glitch magnitude of 0.35×VDD).
Based on the values of Q = 24 fC, τα = 145 ps and τβ = 45 ps, the critical depth
∆(C) for each gate C in LIB was computed. The results of this exercise are presented in
Table VI.5, in Column 8. In addition to critical depth, Table VI.5 also reports the worst-
case delay (in picoseconds) and the layout area (in µm2) of each cell in LIB. Columns 2 and
3 report the worst case delay of the unprotected and protected versions of the cell. Column
4 reports the percentage overhead in the worst-case delay of the hardened version of each
cell compared to its regular version. Note that the worst-case delay of the protected cell
is on average just slightly larger than that of a regular cell. Also note that for some cells
(inv4AA, and3AA, etc.) the delay overhead is negative. This is possibly because
the leakage current of the hardened version of those cells is greater than that of the regular
cells, therefore resulting in faster output transitions. Columns 5 and 6 report the layout area
of the unprotected and protected versions of the cells. The area overhead of the hardened version
of each cell compared to its regular version is reported in Column 7. Note that the average
area overhead is about 277%, which is quite large. Therefore, the variable depth protection
and improved variable depth protection algorithms are used to harden a circuit, so that only
a few gates are replaced with their radiation tolerant versions. This helps in achieving a
reduced area overhead.
Table VI.5. Delay, Area and Critical Depth of Cells
Cell     Reg. Delay  Hard. Delay  Delay    Reg. Area  Hard. Area  Area     Depth
         (ps)        (ps)         % Ovh.   (µm2)      (µm2)       % Ovh.
inv2AA   24.04       26.24        9.16     1.53       8.15        433.33   4
inv4AA   23.91       22.75        -4.88    2.04       9.60        370.83   1
nand2AA  31.42       33.01        5.06     2.04       9.17        350.00   1
nand3AA  44.92       46.10        2.63     2.55       10.70       320.00   1
nand4AA  62.44       63.34        1.44     3.06       12.23       300.00   1
nor2AA   45.62       48.46        6.24     2.55       10.19       300.00   2
nor3AA   77.15       81.04        5.04     4.59       14.52       216.67   1
nor4AA   92.80       92.74        -0.07    7.13       18.86       164.29   1
and2AA   57.48       58.52        1.81     2.55       10.19       300.00   1
and3AA   76.90       75.67        -1.60    3.06       11.72       283.33   1
and4AA   98.75       99.60        0.86     3.57       12.74       257.14   1
or2AA    71.16       71.00        -0.23    3.57       12.23       242.86   1
or3AA    112.87      113.37       0.44     5.35       15.29       185.71   1
or4AA    125.17      123.51       -1.32    8.15       20.89       156.25   1
AVG                               1.76                            277.17
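The overhead columns and the AVG row of Table VI.5 follow from simple arithmetic, which the short Python sketch below reproduces. The overhead lists are copied from Columns 4 and 7 of the table; the function and variable names are illustrative only.

```python
from statistics import mean


def percent_overhead(regular, hardened):
    """Percentage overhead of the hardened value relative to the regular value."""
    return 100.0 * (hardened - regular) / regular


# Per-cell overheads from Table VI.5 (Column 4: delay, Column 7: area).
DELAY_OVH = [9.16, -4.88, 5.06, 2.63, 1.44, 6.24, 5.04,
             -0.07, 1.81, -1.60, 0.86, -0.23, 0.44, -1.32]
AREA_OVH = [433.33, 370.83, 350.00, 320.00, 300.00, 300.00, 216.67,
            164.29, 300.00, 283.33, 257.14, 242.86, 185.71, 156.25]

inv2_delay_ovh = percent_overhead(24.04, 26.24)  # inv2AA regular vs. hardened delay (ps)
avg_delay_ovh = round(mean(DELAY_OVH), 2)
avg_area_ovh = round(mean(AREA_OVH), 2)
```

The averages evaluate to 1.76 and 277.17, matching the AVG row. The overhead recomputed for inv2AA from the rounded delays agrees with the tabulated 9.16% only to within the rounding of the reported delay values.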
Table VI.6 reports the delay overhead of the proposed circuit hardening approaches
(η∗ and η′) for both area and delay mapped designs. The area overhead of the radiation
tolerant approaches is reported in Table VI.7. Tables VI.9 and VI.10 report the delay and the
area overhead respectively of the best radiation tolerant circuit (between η∗ and η′) using
delay or area based mapping. The circuits were optimized using technology independent
optimization in SIS (including redundancy removal), and were then mapped for area and
delay using the 65nm standard cell library LIB.
The delay penalty associated with applying the proposed radiation hardening approaches
(η∗ and η′) is presented in Table VI.6. Delays were computed using the sense [108] pack-
age in SIS [109], which computes the largest sensitizable delay for a mapped circuit. In
Table VI.6, Columns 2 and 3 report the delay (in picoseconds) of a regular design and a
radiation-hardened area-mapped design (before re-synthesis). Column 4 reports the percentage
delay overhead for the radiation-hardened design. Column 5 reports the delay of the
re-synthesized radiation-hardened area-mapped design (which is obtained as described
in Section VI-C.4) and Column 6 reports the percentage delay overhead for this design.
Similarly, Columns 7 and 8 report the delay (in picoseconds) of a regular design and a
radiation-hardened delay-mapped design (before re-synthesis). Column 9 reports the per-
centage delay overhead for the radiation-hardened design. Column 10 reports the delay
of the re-synthesized radiation-hardened delay-mapped design, and Column 11 reports the
percentage delay overhead for this design. As reported in Table VI.6, the circuit-level delay
overhead of the variable depth protection algorithm is as low as 2.92% on average for delay
mapped designs, and about 1.6% for area mapped designs (before re-synthesis). Note that
the radiation hardened designs are generated by replacing regular gates (which are topo-
logically close to the outputs) by hardened gates. This results in a large increase in the
load capacitance of the regular gates that drive these hardened gates. As a consequence,
the circuit level delay penalty in Table VI.6 is sometimes larger than the gate-level delay
penalty reported in Table VI.5. The circuit-level delay overhead of the re-synthesized
hardened circuit is 2.63% on average for delay mapped designs, and about 8.11% for area
mapped designs; the latter is higher than the delay overhead of the hardened circuit before
re-synthesis. For area mapped circuits, the delay overhead increases (for η′) because, for
resynthesis of the hardened circuit, first the hardened portion of the circuit obtained from
the variable depth protection algorithm is extracted. Then this sub-circuit is re-synthesized
with a high cost assigned to gates with a large critical depth, to minimize their utilization.
This increases the utilization of gates with a large input load capacitance and hence, the
load on the unprotected circuit increases, resulting in a delay increase. However, for delay
mapped designs, the delay overhead reduces due to the increased usage of low overhead
(and negative overhead) gates. Also note that in some circuits, the delay overhead of the
hardened circuit is negative. This is due to the increased usage of the hardened inv4AA
gate which has a negative delay overhead over the regular inv4AA gate.
Both the regular and the radiation hardened circuits were mapped using the library of
cells mentioned in the beginning of this section. The resulting designs were placed and
routed using SEDSM [110]. Note that the area overhead due to the routing of the addi-
tional power supplies has been accounted for. The additional supply lines (VDD=1.4 V
and GND=-0.4 V) were routed as regular signal lines. This was done because a single radi-
ation particle strike would result in the clamping action at only one gate in an entire circuit
and therefore, wider wires are not needed for additional supply lines. The area penalty
associated with applying the proposed protection algorithms (η∗ and η′) is presented in
Table VI.7. In Table VI.7, Columns 2 and 3 report the placed-and-routed area (in µm2) of a
regular design and the radiation-hardened area-mapped design (before re-synthesis). Col-
umn 4 reports the percentage area overhead for the radiation-hardened design. Column 5
reports the placed-and-routed area of the re-synthesized hardened area-mapped design and
Column 6 reports the percentage area overhead for this design. Similarly, Columns 7 and 8
report the area (in µm2) of a regular design and a radiation-hardened delay-mapped design
(before re-synthesis). Column 9 reports the percentage area overhead for the radiation-
hardened design. Column 10 reports the placed-and-routed area of the re-synthesized ra-
diation tolerant delay-mapped design and Column 11 reports the percentage area overhead
for this design. Observe from Table VI.7 that the area overheads on average are larger for
area-mapped designs, which is reasonable since the designs were mapped with an area-
based cost function to start with. The average area penalty was about 45% and 28% for
area and delay mapped designs, respectively, obtained using the variable depth protection
approach before re-synthesis. However, the area overhead was around 29% and 24% for the
re-synthesized area and delay mapped hardened designs. The area overhead of the re-synthesized
designs is lower than that of the hardened designs before re-synthesis, since fewer gates with
high critical depth are used in the re-synthesized circuit. The area overhead of either of the
proposed approaches is significantly lower than the area overheads associated with alternative
radiation hardening approaches, which commonly require logic duplication or triplication. Note
that some designs (such as frg2) have a low logic depth and a large number of inputs, and
consequently, their area overheads are higher.
Table VI.6. Delay Overhead of the Proposed Radiation Hardened Design Approaches
Area Mapping (Columns 2-6), Delay Mapping (Columns 7-11)
Ckt Regular η∗ %Ovh. η′ %Ovh. Regular η∗ %Ovh. η′ %Ovh.
alu2 1211.680 1165.100 -3.84 1214.939 0.27 1052.595 1066.158 1.29 1073.261 1.96
alu4 1405.975 1435.371 2.09 1533.967 9.10 1319.840 1329.837 0.76 1425.119 7.98
C1355 960.003 990.448 3.17 984.751 2.58 775.568 787.417 1.53 787.417 1.53
C1908 1376.626 1385.880 0.67 1486.142 7.96 1172.012 1184.320 1.05 1215.548 3.71
C3540 1682.691 1728.315 2.71 1772.920 5.36 1560.553 1571.991 0.73 1588.231 1.77
C499 960.003 990.448 3.17 984.751 2.58 775.568 787.417 1.53 787.417 1.53
C880 1606.093 1669.115 3.92 1323.711 -17.58 1544.077 1570.779 1.73 1239.997 -19.69
dalu 1325.516 1363.747 2.88 1415.225 6.77 1221.374 1233.771 1.02 1232.717 0.93
des 2170.999 1721.303 -20.71 2595.902 19.57 2016.371 2053.416 1.84 2272.788 12.72
frg2 910.514 930.828 2.23 991.758 8.92 911.745 957.092 4.97 870.592 -4.51
i2 462.161 477.990 3.42 478.714 3.58 377.435 386.718 2.46 417.151 10.52
i3 172.459 199.782 15.84 233.170 35.20 172.459 199.782 15.84 194.383 12.71
i10 2217.855 2335.109 5.29 2685.245 21.07 2246.170 2318.547 3.22 2315.502 3.09
AVG 1.60 8.11 2.92 2.63
Table VI.8 reports the total number of gates and the number of hardened gates in a
circuit resulting from the use of the proposed circuit hardening approaches (η∗ and η′) for
both area and delay mapping. In Table VI.8, Columns 2 and 3 report the total number
of gates and the number of hardened gates of a radiation-hardened area-mapped design
(before re-synthesis). Columns 4 and 5 report these numbers for the radiation-hardened
area-mapped design after re-synthesis. Similarly, Columns 6 and 7 report the total number
of gates and the number of hardened gates for radiation-hardened delay-mapped designs
(before re-synthesis), and Columns 8 and 9 report these quantities for radiation-hardened
delay-mapped designs after re-synthesis.
Table VI.7. Area Overhead of the Proposed Radiation Hardened Design Approaches
Area Mapping (Columns 2-6), Delay Mapping (Columns 7-11)
Ckt Regular η∗ %Ovh. η′ %Ovh. Regular η∗ %Ovh. η′ %Ovh.
alu2 1045.88 1418.28 35.61 1215.22 16.19 1397.26 1569.74 12.34 1569.74 12.34
alu4 1994.52 2470.09 23.84 2279.11 14.27 2470.09 2756.25 11.59 2756.25 11.59
C1355 1592.01 2121.52 33.26 1994.52 25.28 1728.90 2279.11 31.82 2279.11 31.82
C1908 1569.74 1994.52 27.06 1799.46 14.63 1799.46 2225.95 23.70 2279.11 26.66
C3540 3183.22 3916.26 23.03 3573.65 12.27 4022.10 4572.46 13.68 4515.84 12.28
C499 1569.74 2121.52 35.15 1994.52 27.06 1728.90 2279.11 31.82 2279.11 31.82
C880 1045.88 1752.26 67.54 1418.28 35.61 1397.26 1871.43 33.94 1764.00 26.25
dalu 2470.09 2996.47 21.31 2965.89 20.07 3310.85 4057.69 22.56 3573.65 7.94
des 9964.03 16842.85 69.04 13731.15 37.81 12139.63 17800.90 46.63 15490.29 27.60
frg2 1994.52 4201.63 110.66 3916.26 96.35 2611.21 4147.36 58.83 4238.01 62.30
i2 685.39 730.08 6.52 745.29 8.74 872.61 872.61 0.00 872.61 0.00
i3 495.51 670.81 35.38 600.25 21.14 495.51 656.38 32.47 600.25 21.14
i10 6037.29 12016.54 99.04 9304.53 54.12 7705.33 11231.76 45.77 11054.42 43.46
AVG 45.19 29.50 28.09 24.25
Table VI.8. Total Number of Gates and Number of Hardened Gates in Different Designs
       Area Mapping                      Delay Mapping
       η∗               η′               η∗               η′
Ckt    Total  Hardened  Total  Hardened  Total  Hardened  Total  Hardened
alu2   273    45        270    7         429    17        474    7
alu4   537    52        531    11        795    27        845    14
C1355  455    51        450    32        582    32        582    32
C1908  406    45        415    25        579    27        597    25
C3540  893    80        904    22        1290   46        1356   24
C499   455    51        450    32        582    32        582    32
C880   308    54        310    26        417    51        445    31
dalu   733    51        747    19        1064   38        1082   16
des    2795   545       2628   245       3812   365       4213   245
frg2   597    221       579    132       846    144       941    137
i2     151    2         151    3         228    3         230    1
i3     110    14        110    6         114    14        118    6
i10    1775   519       1787   233       2507   346       2792   243
Table VI.9. Delay Overhead of the Improved Circuit Protection Approach
Area Mapping (Columns 2-6), Delay Mapping (Columns 7-11)
Best Delay Best Area Best Delay Best Area
Ckt Regular min(η∗,η′) %Ovh. min(η∗,η′) %Ovh. Regular min(η∗,η′) %Ovh. min(η∗,η′) %Ovh.
alu2 1025.881 1165.1 -3.84 1214.939 0.27 883.78 1066.16 1.29 1073.26 1.96
alu4 1219.078 1435.371 2.09 1533.967 9.1 1126.59 1329.84 0.76 1329.84 0.76
C1355 835.347 984.751 2.58 984.751 2.58 670.35 787.42 1.53 787.42 1.53
C1908 1206.572 1385.88 0.67 1486.142 7.96 1028.11 1184.32 1.05 1184.32 1.05
C3540 1466.264 1728.315 2.71 1772.92 5.36 1343.16 1571.99 0.73 1588.23 1.77
C499 835.347 984.751 2.58 984.751 2.58 670.35 787.42 1.53 787.42 1.53
C880 1402.478 1323.711 -17.58 1323.711 -17.58 1319.60 1240.00 -19.69 1240.00 -19.69
dalu 1041.554 1363.747 2.88 1415.225 6.77 1008.48 1232.72 0.93 1232.72 0.93
des 1476.792 1721.303 -20.71 2595.902 19.57 1459.22 2053.42 1.84 2272.79 12.72
frg2 768.107 930.828 2.23 991.758 8.92 756.23 870.59 -4.51 957.09 4.97
i2 432.983 477.99 3.42 477.99 3.42 346.60 386.72 2.46 417.15 10.52
i3 160.868 199.782 15.84 233.17 35.2 160.87 194.38 12.71 194.38 12.71
i10 1879.402 2335.109 5.29 2685.245 21.07 1882.47 2315.50 3.09 2315.50 3.09
AVG -0.14 8.09 0.29 2.60
The delay penalty associated with applying the improved circuit protection approach
is presented in Table VI.9. Two different radiation hardened versions (η∗ and η′) are available
for each design and the best among them in terms of area or delay can be selected.
In Table VI.9, Column 2 reports the delay (in picoseconds) of a regular area-mapped de-
sign. Column 3 reports the delay of radiation-hardened area-mapped design with the best
delay. Column 4 reports the percentage delay overhead for this design. Column 5 reports
the delay of the radiation-hardened area-mapped design with the best area and Column 6
reports the percentage delay overhead for this design. Similarly, Column 7 reports the de-
lay (in picoseconds) of a regular delay-mapped design. Column 8 reports the delay of the
radiation-hardened delay-mapped design with the best delay. Column 9 reports the percent-
age delay overhead. Column 10 reports the delay of the radiation-hardened delay-mapped
design with the best area and Column 11 reports the percentage delay overhead for this
design. Note from this table that the circuit-level delay overhead of the improved circuit
protection algorithm is as low as 0.29% on average for delay mapped designs, and about
-0.14% for area mapped designs.
Table VI.10. Area Overhead of the Improved Circuit Protection Approach
Area Mapping (Columns 2-6), Delay Mapping (Columns 7-11)
Best Delay Best Area Best Delay Best Area
Ckt Regular min(η∗,η′) %Ovh. min(η∗,η′) %Ovh. Regular min(η∗,η′) %Ovh. min(η∗,η′) %Ovh.
alu2 1045.88 1418.28 35.61 1215.22 16.19 1397.26 1569.74 12.34 1569.74 12.34
alu4 1994.52 2470.09 23.84 2279.11 14.27 2470.09 2756.25 11.59 2756.25 11.59
C1355 1592.01 1994.52 25.28 1994.52 25.28 1728.9 2279.11 31.82 2279.11 31.82
C1908 1569.74 1994.52 27.06 1799.46 14.63 1799.46 2225.95 23.7 2225.95 23.7
C3540 3183.22 3916.26 23.03 3573.65 12.27 4022.1 4572.46 13.68 4515.84 12.28
C499 1569.74 1994.52 27.06 1994.52 27.06 1728.9 2279.11 31.82 2279.11 31.82
C880 1045.88 1418.28 35.61 1418.28 35.61 1397.26 1764 26.25 1764 26.25
dalu 2470.09 2996.47 21.31 2965.89 20.07 3310.85 3573.65 7.94 3573.65 7.94
des 9964.03 16842.85 69.04 13731.15 37.81 12139.63 17800.9 46.63 15490.29 27.6
frg2 1994.52 4201.63 110.66 3916.26 96.35 2611.21 4238.01 62.3 4147.36 58.83
i2 685.39 730.08 6.52 730.08 6.52 872.61 872.61 0 872.61 0
i3 495.51 670.81 35.38 600.25 21.14 495.51 600.25 21.14 600.25 21.14
i10 6037.29 12016.54 99.04 9304.53 54.12 7705.33 11054.42 43.46 11054.42 43.46
AVG 41.50 29.33 25.59 23.75
The placed-and-routed area penalty associated with applying the improved circuit protection
approach is presented in Table VI.10. In Table VI.10, Column 2 reports the placed-and-routed
area (in µm2) of a regular area-mapped design. Column 3 reports the area of
the radiation-hardened area-mapped circuits with the best delay. Column 4 reports the
percentage area overhead for the corresponding design. Column 5 reports the area of the
radiation-hardened area-mapped design with the best area and Column 6 reports the per-
centage area overhead for this design. Similarly, Column 7 reports the area (in µm2) of a
regular delay-mapped design. Column 8 reports the area of the radiation-hardened delay-
mapped circuit with the lowest delay. Column 9 reports the percentage area overhead for
the corresponding circuit. Column 10 reports the area of the radiation-hardened delay-
mapped designs with the least area and Column 11 reports the percentage area overhead
of the corresponding design. As reported in Table VI.10, the circuit-level area overhead of the
improved circuit protection algorithm is 23.75% on average for delay mapped designs, and
about 29.33% for area mapped designs.
VI-E. Chapter Summary
In this chapter, a novel circuit design approach was presented for radiation hardened digital
electronics. The proposed approach uses shadow gates to protect the primary gate in case it
is struck by radiation. The gate which is to be protected is locally duplicated, and a pair of
diode-connected transistors (or diodes) are connected between the outputs of the original
and the shadow gates. These transistors (diodes) turn on when the voltages of the two gates
deviate during a radiation strike. The delay overhead of the proposed approach per library
gate is about 1.76%. The area overhead of this approach is 277% per library gate.
In addition, a variable depth protection approach was also presented to perform circuit-
level radiation hardening with very low delay and area overheads. In this approach, the
number of gates that need to be protected is minimized, so that in practice only a small
fraction of the gates is hardened. The resulting circuit is made radiation hard, with a low
area and delay penalty (28% and 3% on average, for delay mapped designs) compared to an
unprotected circuit. Another approach was presented which reduces
the area and delay penalty based on the desired cost function. With the improved circuit
protection algorithm, radiation tolerant circuits are obtained with an area penalty
as low as 23.75% and a delay penalty as low as -0.14% on average.
It is possible to apply the proposed gate level hardening approach to memory elements,
or even to the gates that drive memory elements. In this way, the approach presented in this
chapter can protect both combinational and sequential circuits from radiation events. The
next chapter describes the split-output based hardening approach.
CHAPTER VII
RADIATION HARDENING - SPLIT-OUTPUT BASED RADIATION TOLERANT
CIRCUIT DESIGN APPROACH
VII-A. Introduction
This chapter presents the second circuit level radiation hardening approach (the split-output
based hardening approach) developed in this dissertation. The split-output based hardening
approach exploits the fact that if a gate is implemented using only PMOS (NMOS) transistors,
then a radiation particle strike can result only in a logic 0 to 1 (1 to 0) flip. Based on this
observation, radiation hardened variants of regular static CMOS gates are derived. In par-
ticular, the PMOS and NMOS devices of regular gates are separated, and the gate output is
split into two signals. One of these outputs of the radiation tolerant gate is generated using
PMOS transistors, and it drives other PMOS transistors (only) in its fanout. Similarly, the
other output (generated from NMOS transistors) only drives other NMOS transistors in its
fanout. Now, if a radiation particle strikes one of the outputs of the radiation tolerant gate,
then the gates in the fanout enter a high-impedance state, and hence preserve their output
values.
Split-output based radiation hardened gates exhibit an extremely high degree of ra-
diation tolerance, which is validated at the circuit level. Using these gates, circuit level
hardening is performed based on logical masking, to selectively harden those gates in a
circuit which contribute maximally to the soft error failure of the circuit. The gates whose
outputs have a low probability of being logically masked are replaced by their radiation
tolerant counterparts, such that the digital design achieves a soft error rate reduction of a
specied amount (typically 90%). Experimental results demonstrate that this reduction is
achieved with a modest layout area and delay penalty.
In the remainder of this chapter, Section VII-B briefly discusses some previous work
(in addition to the previous work presented in Chapter VI) on radiation hardening
of circuits. Section VII-C describes the split-output clamping based radiation tolerant com-
binational circuit design approach. Experimental results are presented in Section VII-D,
while the chapter summary is provided in Section VII-E.
VII-B. Related Previous Work
In addition to the existing circuit level hardening approaches discussed in Section VI-B,
some other approaches [97, 98, 111] use the fact that a particle hit induces a current which
always ows from the n-type diffusion to the p-type diffusion, through a PN junction. This
means that if a ip-op is made up of only PMOS (NMOS) transistors, then a radiation
particle strike cannot ip the node voltage from 1 to 0 (0 to 1). The authors of [97] use this
observation to design a radiation hardened ip-op (with two inputs and two outputs), by
separating the NMOS and the PMOS transistors in the ip-op. However, their ip-op
design has signicantly higher leakage currents since some nodes have non-rail voltages in
steady state. The authors of [98] alleviate this problem by adding a few more transistors to
the radiation tolerant ip-ip design of [97].
In [111], the author borrows the idea of [97] to design a radiation tolerant standard
cell library. However, these hardened cells have significantly larger leakage currents due
to non-rail voltage levels at the output nodes of the gates. This is a significant problem
because leakage power in today’s technologies is comparable to or greater than switching
power [112]. Further, the author of [111] did not describe a methodology to implement
a radiation tolerant circuit using the radiation tolerant standard cell library, and hence did
not report the area and delay overheads of the resulting radiation tolerant circuit. Also,
the transistors of the radiation tolerant standard cells of [111] have to be sized very carefully to
allow correct operation. In contrast, the radiation tolerant standard cells proposed in this
chapter do not suffer from these issues. This is described in Section VII-C.1.
VII-C. Proposed Split-output Based Radiation Hardening
In Section VII-C.1, the radiation-tolerant standard cell design approach proposed in this
chapter is described, along with an explanation of how these hardened gates are derived
from regular gates. The circuit level hardening approach is described in Section VII-C.2.a.
For circuit hardening, only those gates in a circuit which contribute maximally towards
the soft error failure rate of the circuit are replaced by their hardened counterparts. The
circuit level hardening approach presented in this chapter achieves a soft error failure rate
reduction of an order of magnitude (i.e. 90% reduction in the soft error rate) for several
ISCAS and MCNC benchmark circuits. Section VII-C.3 presents an analysis to estimate
the critical charge for the hardened circuit obtained by using the approach presented in this
chapter.
VII-C.1. Radiation Tolerant Standard Cell Design
As mentioned in Section VII-B, a radiation particle strike induces a current which always
flows from the n-type diffusion to the p-type diffusion through a PN junction [97]. This
implies that if a gate is made up of only PMOS (NMOS) transistors then a radiation particle
strike cannot flip the node voltage from 1 to 0 (0 to 1). In other words, if a particle strikes the
diffusion of a PMOS transistor of an inverter (constructed exclusively using PMOS devices)
whose output is at logic 1, then this particle strike will not cause the output node voltage
to experience an SET. Similarly, a particle strike at the diffusion of a NMOS transistor of
the inverter constructed exclusively using NMOS devices (with an output node at logic 0)
will not result in an SET. This is a key idea, which conveys that if a logic circuit is made up
of only PMOS (NMOS) transistors, then that logic circuit will be tolerant to node voltage
changes from logic values 1 to 0 (0 to 1). This concept is used in the hardening approach
presented in this chapter to design radiation tolerant standard cells.
Consider two regular inverters as shown in Figure VII.1 (a). Radiation particle strikes
at M1 and M2 of INV1 can result in either positive or negative voltage glitches at the out1
node (since the PMOS and NMOS transistors are both connected to the out1 node). The
voltage glitch at the out1 node can affect the node voltage of out2, which can lead to a soft
error. To avoid an error event due to radiation particle strikes at the diffusions of M1 or
M2, INV1 needs to be hardened. The steps to harden INV1 are as follows.
First, M1 (NMOS) and M2 (PMOS) of INV1 in Figure VII.1 (a) are separated from
each other, and the resulting circuit is shown in Figure VII.1 (b). The inverter INV1 shown
in Figure VII.1 (b) has 2 inputs (inp and inn) and 2 outputs (out1p and out1n). Both the
inputs and both the outputs of INV1 have the same polarity. Note that the output nodes
out1p and out1n of INV1 drive only PMOS and NMOS transistors of the gates in their
fanout respectively (out1p drives M4 of INV2 and out1n drives M3 of INV2 in Figure VII.1
(b)). Also, note that the inverter INV2 is modified such that two different input signals
(of the same polarity) drive the transistors M3 and M4. In this chapter, an n-input regular
gate (such as the inverter INV2 of Figure VII.1 (b)) whose inputs to PMOS and NMOS
transistors are separated is referred to as the modified regular gate. Note that such a gate
has 2n inputs. Also note that an n-input radiation tolerant gate has 2n inputs and 2 outputs.
However, in the INV1 circuit of Figure VII.1 (b), out1p (out1n) can only charge
(discharge) to 1 (0). To get the opposite transitions at node out1p and out1n, two ad-
ditional transistors M5 (NMOS) and M6 (PMOS) are added, and connected as shown in
Figure VII.1 (c). The inverter INV1 of Figure VII.1 (c) works as follows. Assume that
both inn and inp are at a logic 0 value, and out1p and out1n are at logic 1. Now assume
that both inn and inp transition to logic 1, due to which transistor M1 turns on and M2 turns
Fig. VII.1. Design of a radiation tolerant inverter, panels (a) through (e)
off. The turning on of M1 pulls the node out1n down to logic 0 which then turns on M6.
Since M6 is ON, out1p is pulled to a weak logic 0 (the node out1p falls to |$V_T^{M6}$| volts,
where $V_T^{M6}$ is the threshold voltage of PMOS transistor M6). Both the out1p and out1n
nodes are now at logic 0, due to which the output of INV2 (out2) switches to logic 1. Now
when both inputs of INV1 (inn and inp) change to logic 0, transistor M1 turns off and M2
turns on. As M2 is on, out1p charges to logic 1, which turns M5 on and hence node out1n is
pulled to a weak logic 1 (VDD - $V_T^{M5}$ volts). Thus, the circuit INV1 of Figure VII.1 (c)
behaves like an inverter, with the output node out1p (out1n) switching between VDD and
|$V_T^{M6}$| (VDD - $V_T^{M5}$ and GND). Note that the inverter INV1 of Figure VII.1 (c) has a
high leakage power dissipation because nodes out1p and out1n switch between non-rail
voltage values. Specifically, when inn and inp are at GND, out1p is at VDD and out1n is at
VDD - $V_T^{M5}$. Due to this, M6 is not fully turned off while M2 is completely on. Hence,
there is a static power dissipation through M2 and M6. Similarly, when inn and inp are at
the VDD value, there is a static power dissipation through M5 and M1. Also note that INV1
of Figure VII.1 (c)
is tolerant to radiation particle strikes at out1p and out1n. This will be explained shortly,
after discussing a modication to the proposed INV1 design of Figure VII.1 (c) which is
not only tolerant to a radiation particle strike, but also signicantly reduces the static power
dissipation. Experimental results show that this modication yields a reduction in leakage
of 2 orders of magnitude.
To reduce the static power dissipation in INV1 of Figure VII.1 (c), 2 more transistors
(M7 and M8) are added to INV1 and the resulting inverter is shown in Figure VII.1 (d).
Now when the inputs of INV1 (inn and inp) of Figure VII.1 (d) are at the GND value, again
the out1n node is at VDD - $V_T^{M5}$, due to which M6 is still not fully turned off. However, M8
is completely off (since inn is at the GND value) and hence there is no static power dissipation
through M2, M6 and M8. Similarly, there is no static power dissipation through M7,
M5 and M1 when the inputs inn and inp are at the VDD value. Note that the transistors
M5 and M6 of INV1 in Figure VII.1 (d) are selected to be low threshold voltage transistors
(indicated by a thicker line in the figure). This is done so as to increase the voltage swing
at nodes out1p and out1n, and bring them closer to the rail voltages. Note that the reduced
voltage swings at out1p and out1n do not increase the leakage currents in INV2 of Figure VII.1
(d). This is because when the node out1p is at the |$V_T^{M6}$| value, then out1n is at
the GND value, due to which M3 is completely turned off while M4 is turned on. Similarly,
when out1p is at the VDD value then out1n is at VDD - $V_T^{M5}$, and hence M3 is turned on
while M4 is completely turned off. Therefore, the leakage currents in INV2 do not increase
due to non-rail voltage swing at its inputs.
The inverter INV1 of Figure VII.1 (d) is tolerant to a radiation strike at out1p and
out1n. Consider the case when the nodes inp and inn are at VDD, which implies that
out1p and out1n are at |$V_T^{M6}$| and GND respectively, and out2 is at the VDD value. Now
suppose a radiation particle strikes at node out1p (the radiation particle strikes either M2
or M6) which increases the voltage at node out1p to VDD (due to the positive charge
collection at out1p). Due to this, M4 of INV2 turns off and M5 turns on. However, the
node out1n remains at the GND value because M7 is in cutoff. Therefore M3 also remains off.
Thus, the node out2 remains at the VDD value in a high impedance state. Eventually, the
charge collected at out1p dissipates through M6 and M8 (since inp and inn are at VDD)
which brings the voltage at the out1p node back to |$V_T^{M6}$|. At this point, M4 turns on again.
In this way, a radiation strike at out1p does not affect the node voltage of out2. Similarly,
a particle strike at out1n does not affect the node out2 when inn and inp are at the GND
value.
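The hold behavior described above can be illustrated with a minimal switch-level sketch. It assumes ideal switches (no threshold drops, timing or charge leakage), and the function names modified_inverter and glitch are illustrative rather than taken from this work. Each rail of the hardened gate is allowed only the glitch polarity it can physically see: positive on out1p, negative on out1n.

```python
from typing import Tuple


def modified_inverter(p_in: int, n_in: int, held: int) -> int:
    """Switch-level model of a modified regular inverter (INV2).

    p_in drives only the PMOS pull-up, n_in drives only the NMOS pull-down.
    held is the value currently stored on the output node capacitance.
    """
    pmos_on = (p_in == 0)  # PMOS conducts when its gate is low
    nmos_on = (n_in == 1)  # NMOS conducts when its gate is high
    if pmos_on and not nmos_on:
        return 1           # output pulled up to VDD
    if nmos_on and not pmos_on:
        return 0           # output pulled down to GND
    if not pmos_on and not nmos_on:
        return held        # high impedance: output keeps its old value
    raise ValueError("both devices conducting (contention)")


def glitch(out1p: int, out1n: int, node: str) -> Tuple[int, int]:
    """Apply the only glitch polarity that each rail can experience."""
    if node == "out1p":
        return 1, out1n    # PMOS-only node: positive glitch
    if node == "out1n":
        return out1p, 0    # NMOS-only node: negative glitch
    raise ValueError(node)


def out2_survives_all_strikes() -> bool:
    """Check that no single strike on out1p or out1n changes out2."""
    for out1 in (0, 1):                       # fault-free value on both rails
        out2 = modified_inverter(out1, out1, held=0)
        for node in ("out1p", "out1n"):
            p, n = glitch(out1, out1, node)
            if modified_inverter(p, n, held=out2) != out2:
                return False
    return True


print(out2_survives_all_strikes())
```

In this idealized model the check prints True: a strike either leaves the rail unchanged or turns off the one conducting device of INV2, so out2 is held.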
To summarize, a radiation particle strike at the out1p (out1n) node can only result in a positive
(negative) glitch since only PMOS (NMOS) transistors are connected to it. Also, this
positive (negative) glitch at out1p (out1n) does not propagate to out2. This is because the
out1p (out1n) node drives only the PMOS (NMOS) transistor of INV2 which goes into
cutoff mode when a positive (negative) glitch appears at out1p (out1n) node. A radiation
particle strike at M8 can be of any significance only when out1p is at the VDD value (since
a radiation particle strike at the NMOS transistor can only result in a negative glitch). How-
ever, when out1p is at VDD, M6 is turned off and hence a particle strike at M8 does not
affect the node voltage of out1p. Similarly, a radiation particle strike at M7 does not affect
the voltage of the out1n node. In this way, INV1 of Figure VII.1 (d) is tolerant to radiation
particle strikes since a particle strike at either of its output nodes does not affect the output
of its fanout gates (out2 of INV2 in Figure VII.1). To validate this analysis, inverters INV1
and INV2 of Figure VII.1 (d) were implemented using a 65nm PTM [45] model card with
VDD = 1.0 V. Radiation particle strikes were simulated at out1p (at time = 0.8 ns) and
out1n (at time = 2.4 ns) nodes, with Q = 150 fC, τα = 150 ps and τβ = 38 ps. The values
of Q, τα and τβ were obtained from [12]. The voltage waveforms at out1p, out1n and out2
are shown in Figure VII.2.
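The parameters Q, τα and τβ quoted above are the usual parameters of the double-exponential current pulse commonly used to model a particle strike. The sketch below assumes that standard model, i(t) = Q/(τα - τβ) · (exp(-t/τα) - exp(-t/τβ)), which this section does not restate; the constant and function names are illustrative.

```python
import math

Q = 150e-15      # deposited charge in coulombs (150 fC, as quoted in the text)
TAU_A = 150e-12  # tau_alpha in seconds (assumed to be the collection time constant)
TAU_B = 38e-12   # tau_beta in seconds (assumed to be the track establishment constant)


def strike_current(t, q=Q, tau_a=TAU_A, tau_b=TAU_B):
    """Double-exponential strike current in amperes at time t (seconds)."""
    if t < 0:
        return 0.0
    return q / (tau_a - tau_b) * (math.exp(-t / tau_a) - math.exp(-t / tau_b))


def peak_time(tau_a=TAU_A, tau_b=TAU_B):
    """Time of the current peak, obtained by setting di/dt = 0."""
    return math.log(tau_a / tau_b) * tau_a * tau_b / (tau_a - tau_b)


def collected_charge(t_end=5e-9, steps=50_000):
    """Trapezoidal integral of the pulse; it should approach Q."""
    dt = t_end / steps
    total = 0.5 * (strike_current(0.0) + strike_current(t_end))
    total += sum(strike_current(k * dt) for k in range(1, steps))
    return total * dt


print(f"peak at {peak_time() * 1e12:.1f} ps, "
      f"peak current {strike_current(peak_time()) * 1e3:.3f} mA, "
      f"integrated charge {collected_charge() * 1e15:.2f} fC")
```

Under this model the pulse integrates back to the deposited charge Q, which is a quick sanity check on the parameter values before they are used in a circuit simulation.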
Fig. VII.2. Radiation particle strike at out1p and out1n of INV1 of Figure VII.1 (d): voltage versus time (ns) for v(out1p), v(out1n) and v(out2)
Note that the inverter INV1 of Figure VII.1 (c) is also tolerant to radiation particle
strikes in a similar manner as INV1 of Figure VII.1 (d). The radiation tolerant standard cell
library designed by the author of [111] is similar to the inverter INV1 of Figure VII.1 (c).
However, there are many issues associated with it. First, there is a static power dissipation
in the INV1 of Figure VII.1 (c) as described earlier. Second, the transistors M2 and M6 (M1
and M5) of INV1 have to be sized very carefully to allow for correct inverter operation.
The transistor M1 (M2) is sized larger than M5 (M6). Last, when a radiation particle
strike at out1p results in a positive glitch at out1p, M5 turns on and hence, the voltage at
node out1n is determined by the relative drive strength of M1 and M5. Therefore, a small
positive glitch can occur at out1n (as M1 is sized larger than M5) which can turn on M3 for
long enough to pull the node out2 low. In contrast to the radiation tolerant inverter design
of [111] (or the INV1 of Figure VII.1 (c)), the radiation tolerant inverter design which
is proposed in this chapter (shown in Figure VII.1 (d)) does not suffer from these issues.
The leakage currents of INV1 of Figure VII.1 (c) and INV1 of Figure VII.1 (d) were also
extracted through SPICE simulations. The leakage currents of INV1 of Figure VII.1 (c) are
0.939 µA and 1.936 µA when both inputs (inp and inn) are at logic 0 and 1 respectively. In
contrast, the leakage currents of the radiation hardened inverter INV1 of Figure VII.1 (d)
are 12.7 nA and 13.5 nA (for logic 0 and 1 at both inputs). Therefore, as mentioned earlier,
the radiation hardened gates proposed in this chapter have 2 orders of magnitude lower
leakage compared to the hardened gates of [111]. Figure VII.1 (e) also shows the symbolic
diagram of the radiation tolerant inverter (INV1) and the modified regular inverter (INV2)
of Figure VII.1 (d).
The radiation hardening approach described above can be applied to any static CMOS
gate, including complex gates. Figure VII.3 (a) shows a radiation tolerant 2-input NAND
gate designed using this approach. Figure VII.3 (b) shows the modified regular 2-input
NAND gate circuit. As shown in Figure VII.3 (a), the radiation tolerant 2-input NAND gate
has a total of 4 inputs and 2 outputs. The inputs in1p and in1n (in2p and in2n) correspond
to the rst input in1 (second input in2) of a regular 2-input NAND gate. The two outputs
out1p and out1n of the radiation tolerant 2-input NAND gate of Figure VII.3 drive the
PMOS and the NMOS transistors of the gates in its fanout. In general, an n-input static
CMOS gate requires 4n + 2 transistors when implemented using the radiation hardening
approach proposed in this chapter, in contrast to 2n transistors for its regular static CMOS
counterpart.
VII-C.2. Circuit Level Radiation Hardening
To keep the area and delay overhead low, selective hardening needs to be performed by only
protecting those gates in a circuit which have a significant contribution to the soft error
failure rate of the circuit. Whether a voltage glitch induced by a radiation particle strike at
any gate in a circuit propagates to the primary outputs and results in a failure depends upon
three masking factors: logical, electrical and temporal masking, as described in Section I-
A.
Fig. VII.3. a) Radiation tolerant 2-input NAND gate, b) modified regular 2-input NAND gate
The sensitive gates in a circuit are those gates which have small values for these masking
factors, and hence these gates contribute significantly to the soft error failure of the
circuit. These are the gates which need to be protected by replacing them
with the hardened gates presented earlier, to significantly improve the radiation tolerance
of the circuit. The split-output based hardening approach uses logical masking to identify
such sensitive gates in a circuit. The approach used to identify these gates is described next.
VII-C.2.a. Identifying and Protecting Sensitive Gates in a Circuit
To identify the sensitive gates in a circuit, a measure of the logical masking at all gates in
the circuit is computed. The logical masking at a gate is computed as the probability of
the absence of a functionally sensitizable path from the gate to any primary output of the
circuit. The computation of the probability of logical masking at a gate is carried out in the
same manner as proposed in [12], as described later in this section. As mentioned in [12],
the probability of logical masking ($P_{LM}$) at a gate G is $1 - P_{Sen}^{G}$, where $P_{Sen}^{G}$ is the probability
of sensitization of gate G. The probability of sensitization is defined as the probability of
the existence of at least one functionally sensitizable path from the gate G to any primary
output of the circuit.
To calculate the probability of sensitization $P_{Sen}$, N random vectors were applied to the
primary inputs of a circuit. For each vector, fault simulation was performed on all gates
in the circuit to determine if the fault is sensitized and observable at one or more primary
outputs. For a gate G, the number of vectors ($S_G$) which were able to sensitize any fault (both
G-stuck-at-1 and G-stuck-at-0) at G to the primary output(s) is recorded. Note that $S_G$ is
the summation of the number of vectors which simulate the fault at G (when the output
of G is at logic 0 or logic 1). The sensitization probability for the gate G ($P_{Sen}^{G}$) was then
calculated as $S_G$/N. The value of N used was 10000. A gate which has a high probability of
sensitization is a sensitive gate which needs to be protected.
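A minimal sketch of this Monte Carlo estimate follows. The four-gate netlist, the signal names and the helper functions are hypothetical and serve only to illustrate the procedure; flipping a gate output for a given vector stands in for the stuck-at fault that the vector excites at that gate.

```python
import random

# Hypothetical toy netlist (not from this work): gate -> (type, inputs),
# listed in topological order so one pass evaluates the circuit.
NETLIST = {
    "g1": ("NAND", ["a", "b"]),
    "g2": ("INV", ["c"]),
    "g3": ("NAND", ["g1", "g2"]),
    "g4": ("NOR", ["g3", "d"]),
}
PRIMARY_INPUTS = ["a", "b", "c", "d"]
PRIMARY_OUTPUTS = ["g4"]


def eval_gate(kind, vals):
    """Evaluate one gate on a list of 0/1 input values."""
    if kind == "INV":
        return 1 - vals[0]
    if kind == "NAND":
        return 1 - int(all(vals))
    if kind == "NOR":
        return 1 - int(any(vals))
    raise ValueError(kind)


def simulate(vector, flipped=None):
    """Return the primary output values, optionally flipping one gate output."""
    values = dict(vector)
    for gate, (kind, ins) in NETLIST.items():
        out = eval_gate(kind, [values[i] for i in ins])
        if gate == flipped:
            out = 1 - out  # inject the fault excited by this vector
        values[gate] = out
    return [values[o] for o in PRIMARY_OUTPUTS]


def sensitization_probabilities(n_vectors=10000, seed=0):
    """Estimate S_G / N for every gate using random input vectors."""
    rng = random.Random(seed)
    s = {g: 0 for g in NETLIST}
    for _ in range(n_vectors):
        vec = {pi: rng.randint(0, 1) for pi in PRIMARY_INPUTS}
        good = simulate(vec)
        for g in NETLIST:
            if simulate(vec, flipped=g) != good:
                s[g] += 1  # fault at g is observable at a primary output
    return {g: s[g] / n_vectors for g in NETLIST}


for gate, p in sensitization_probabilities().items():
    print(gate, round(p, 3))
```

In this toy circuit the primary output gate g4 always has a sensitization probability of 1, while gates deeper in the logic are masked more often and receive lower estimates.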
After computing the sensitization probabilities (or logical masking probabilities) for
all the gates in the circuit (η), the sensitive gates are identified and protected using Algorithm 3.
For a given circuit η, all gates G ∈ η are sorted in decreasing order of their
sensitization probabilities, and stored in a list LIST. Then the top K gates in LIST are
protected (by replacing them with the hardened gates designed in Section VII-C.1) so that
the required tolerance against radiation particle strikes is achieved. The resulting hardened
circuit is referred to as η∗.
Algorithm 3 Radiation Hardening for a Circuit η
Harden_circuit(η)
  LIST = sort(G ∈ η, $P_{Sen}^{G}$)
  i = 0
  while required tolerance to radiation is not achieved do
    G = LIST(i)
    Replace G by G_hardened
    i = i + 1
  end while
  return η∗
In this work, a circuit is called protected when the soft error rate reduces by an order
of magnitude. To achieve this, it is required to protect gates in list LIST (in decreasing
Fig. VII.4. Part of a circuit: (a) regular gates G1, G2 and G3, (b) G1 replaced by a hardened INV gate and G2 by a modified 2-input NAND gate
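order of sensitization probability) until 90% coverage is achieved. The coverage is defined
as [12]:
\[
\text{coverage} = \frac{\sum_{\forall \text{ hardened gate } G \in \eta^{*}} P_{Sen}^{G}}{\sum_{\forall \text{ gate } G \in \eta} P_{Sen}^{G}} \cdot 100 \qquad (7.1)
\]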
It was demonstrated in [12] that coverage is a good estimate for soft error failure rate
reduction. 90% coverage corresponds to an order of magnitude reduction in the soft error
rate. To achieve 90% coverage, K gates have to be protected using Algorithm 3. Note that
a gate G is protected by replacing it with its hardened version, which is obtained by the
gate hardening technique as described in Section VII-C.1. For example, consider a circuit
fragment shown in Figure VII.4 (a). Note that all the gates in Figure VII.4 (a) are regular
gates. Suppose that the gate G1 has a very high sensitization probability and it needs to be
protected such that a radiation particle strike at its output should not affect the gates in its
fanout (G2). To achieve this, the gate G1 of Figure VII.4 (a) is replaced with the hardened
inverter gate of Figure VII.1 (d). The resulting circuit is shown in Figure VII.4 (b). While
replacing the gate G1 with its hardened version, it is also required to replace all the regular
Fig. VII.5. Waveforms at nodes cp, cn and d of Figure VII.4 (b): voltage versus time (ns) for v(cp), v(cn) and v(d)
gates in its fanout (G2) with their modified regular gates (see footnote 2), because a hardened gate has two
outputs (one output drives only PMOS transistors of the gates in its fanout, and the second
drives only the NMOS transistors of the fanout gates of G1). Therefore, G2 in Figure VII.4
(a) is also replaced by its modified regular version in Figure VII.4 (b). To verify that this
replacement strategy works, the circuit shown in Figure VII.4 (b) was implemented using
a 65nm PTM [45] model card, and radiation particle strikes were simulated at nodes cp (at
time = 0.8 ns) and cn (at time = 2.4 ns) with Q = 150 fC, τα = 150ps and τβ = 38ps. The
waveforms at nodes cp, cn and d are shown in Figure VII.5. Figure VII.5 shows that the
radiation particle strikes at cp and cn do not have any detrimental effects on the voltage
at node d. Hence, the hardening technique proposed in this chapter is able to selectively
harden G1 of Figure VII.4 (a).
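The selection loop of the hardening algorithm, with the coverage of Equation 7.1 as its stopping condition, can be sketched as below. The sensitization probabilities and fanout lists are hypothetical values for a three-gate fragment like that of Figure VII.4 (a), and harden is an illustrative name. The sketch also collects the unhardened fanout gates that must be swapped for their modified regular versions.

```python
def harden(p_sen, fanout, target=90.0):
    """Greedy gate selection until the coverage target (in percent) is met.

    p_sen:  dict mapping gate name -> sensitization probability
            (assumed to have a non-zero sum)
    fanout: dict mapping gate name -> list of gates driven by that gate
    Returns (hardened gates, fanout gates needing modified regular
    versions, achieved coverage in percent).
    """
    order = sorted(p_sen, key=p_sen.get, reverse=True)  # decreasing P_Sen
    total = sum(p_sen.values())
    hardened, covered = [], 0.0
    for gate in order:
        if 100.0 * covered / total >= target:
            break                      # required coverage already reached
        hardened.append(gate)
        covered += p_sen[gate]
    coverage = 100.0 * covered / total
    # Every unhardened gate driven by a hardened gate needs split inputs.
    modified = {f for g in hardened for f in fanout.get(g, [])
                if f not in hardened}
    return hardened, modified, coverage


# Hypothetical values for an internal circuit fragment (illustration only).
p_sen = {"G1": 0.95, "G2": 0.05, "G3": 0.05}
fanout = {"G1": ["G2"], "G2": ["G3"], "G3": []}

hardened, modified, coverage = harden(p_sen, fanout)
print(hardened, sorted(modified), round(coverage, 2))
```

With these illustrative numbers only G1 is hardened, and G2 is the single fanout gate that has to become a modified regular gate, which mirrors the replacement shown in Figure VII.4 (b).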
The gates at the primary output of a circuit are always sensitive and have a sensitiza-
tion probability equal to 1. Therefore, these gates are always replaced by their hardened
Footnote 2: As mentioned earlier, a modified regular n-input gate is the same as the regular n-input gate
with its inputs to the PMOS and the NMOS transistors disconnected from each other, resulting
in a total of 2n inputs. For example, INV2 of Figure VII.1 (d) is a modified regular
inverter. The gate of Figure VII.3 (b) is a modified regular NAND2 gate.
Fig. VII.6. Radiation tolerant flip-flop
counterparts. However, the replacement of a regular gate (which has one output) with its
hardened counterpart results in two outputs. Therefore, the two outputs of each hardened
gate that drives the primary outputs of the circuit now need to drive a flip-flop (which samples
the primary output values). In the proposed approach, the radiation tolerant flip-flop
design proposed in [98] (shown in Figure VII.6) is utilized. This flip-flop design is widely
used to implement radiation tolerant VLSI circuits. The radiation tolerant flip-flop of [98]
has dual inputs, which correspond to the input D of the regular flip-flop. One of the 2
inputs of the radiation tolerant flip-flop only drives PMOS transistors, and the other input
drives only NMOS transistors. Therefore, the hardened gates designed in Section VII-C.1
are compatible with the radiation tolerant flip-flop of [98].
VII-C.3. Critical Charge for Radiation Hardened Circuits
The waveforms shown in Figures VII.2 and VII.5 suggest that even a large amount of
charge dumped by a radiation strike at the output of the proposed hardened gate will not
affect the fanout gates’ output. Therefore, the approach of Section VII-C.2 for circuit level
radiation hardening provides 90% coverage against radiation particle strikes from very high
energy radiation particles. However, the frequency of circuit operation imposes a limit on
the magnitude of the charge dump that can be tolerated by a hardened circuit implemented
using the hardening approach proposed in this chapter. This is explained next.
Consider a portion of a hardened circuit as shown in Figure VII.7 (a). The waveform of
the various nodes, along with CLK, are shown in Figure VII.7 (b). In Figure VII.7 (b), dark
lines correspond to the normal operation (no radiation particle strike). The clock period
of the hardened circuit is T and the propagation delay of INV2 is d. Assume out1p and
out1n switch from logic low to high at t1. Now assume that a high energy radiation particle
strikes the out1p node sometime before t1. The particle induces a voltage glitch with the
pulse width greater than T and the voltage glitch rises before t1 and falls after T + t1.
As the node out1p switches to logic 1 before t1 (while out1n is at logic 0), the
node out2 enters the high impedance state. At time t1, out1n also switches high (since in
switches to a low value). Now out2 comes out of the high impedance state and switches
to logic 0 at the same time as in normal operation. Now at time T + t1, out1n switches to
logic 0, and hence the out2 node again enters a high impedance state (since out1p is still at
logic 1 due to the radiation strike). When out1p falls to logic 0, out2 switches to logic
1 as shown in Figure VII.7 (b). However, note that the rising out2 transition is delayed in
comparison to the normal operation. Due to this, the primary output computation may get
delayed, potentially resulting in a circuit failure. If the voltage glitch at out1p had fallen
on or before T + t1, then the out2 node would have switched at the same time as in normal
operation, and hence no circuit failure would have been encountered. Thus, the pulse width
of the voltage glitch induced by a radiation particle strike at out1p should be less than the
clock period T . Hence the critical charge (Qcri) for the circuit is the maximum amount
of charge deposited by a radiation particle such that a voltage glitch of pulse width T is
encountered in the circuit. As reported in Section VII-D, a very large amount of charge
must be dumped by a radiation particle in order to generate a voltage glitch with a
pulse width equal to the clock period of a design. This experiment was conducted for the
smallest (most sensitive to radiation) gate in the library used in the work presented in this
chapter. It was found that the proposed approach is
extremely robust to radiation strikes.
Now consider a radiation particle strike just after t1 + d, at node out1n. Due to the
particle strike, out1n switches to logic 0 at t1 + d, and out2 enters the high impedance state
with the correct logic value of 0. Even if the pulse width of the negative voltage glitch
at out1n is greater than T , it is of no consequence to the node voltage of out2. This is
because at time T + t1 out1p switches to logic 0, hence out2 switches to logic 1 at the
same time as in normal operation. However, if a radiation particle strikes the out1n node
between t1 and t1+d then out2 enters the high impedance state with the wrong logic value
of 1 (since out1n switched to logic 0 before the out2 node switches to logic 1). Hence,
out1n is vulnerable to a radiation strike while the gates that have out1n as one of their inputs are
switching their outputs.
To summarize, the maximum tolerable radiation-induced glitch width for the proposed
approach is T . Also, the hardened gates designed in this chapter are vulnerable to radiation
particle strikes during the time when their fanouts are computing their outputs. For the
circuit shown in Figure VII.7 (a), INV1 is vulnerable to radiation particle strikes only
between t1 and t1+d. However, the probability of a particle striking the out1n node during
Fig. VII.7. a) Circuit under consideration b) Waveform at different nodes
this time interval is very low (see footnote 3) and hence, it does not have any impact on the reduction of
the overall soft error rate obtained by the proposed approach.
The hardening approach proposed in this dissertation can tolerate radiation-induced
voltage glitches with a maximum width of T , the clock period of the design. Hence, the
proposed hardening approach can provide tolerance against radiation particles of very high
energy. In the experimental section, the critical charge (Qcri) values for various benchmark
circuits which are hardened using the proposed approach are reported, to support this claim.
Footnote 3: As per [113, 114], the maximum solar proton fluence for particles of energy > 1 MeV based
on the JPL-1991 model is 2.91×10^11/cm^2/year with 99% confidence. The maximum area
of a hardened gate in the library used in this work is 7.69×10^-8 cm^2 and the maximum
delay of any gate is 70 ps. Using these values, it can be shown that the probability of a
radiation particle striking out1n between t1 and t1+d is 4.96×10^-14.
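The arithmetic behind this estimate can be reproduced as follows. The sketch assumes a uniform strike rate over a year of 365.25 days and treats the expected number of strikes in the window as the probability, which is reasonable because the value is far below 1; the constant and function names are illustrative.

```python
FLUENCE_PER_CM2_YEAR = 2.91e11  # solar proton fluence for energy > 1 MeV (from the footnote)
GATE_AREA_CM2 = 7.69e-8         # largest hardened gate area (from the footnote)
WINDOW_S = 70e-12               # largest gate delay d, the vulnerable window (from the footnote)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def strike_probability(fluence=FLUENCE_PER_CM2_YEAR,
                       area=GATE_AREA_CM2,
                       window=WINDOW_S):
    """Expected number of strikes on one gate during one vulnerable window."""
    strikes_per_second = fluence * area / SECONDS_PER_YEAR
    return strikes_per_second * window


print(f"{strike_probability():.3e}")
```

The result is approximately 4.96×10^-14, in agreement with the value quoted in the footnote.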
Table VII.1. Area Overheads of the Radiation Hardened Design Approach Proposed in This Chapter
                           Area Map                                        Delay Map
Ckt      Regular  Hardened  % Ovh.   Total #  # of Hard    Regular  Hardened  % Ovh.   Total #  # of Hard
          (µm^2)    (µm^2)          of Gates      Gates     (µm^2)    (µm^2)          of Gates      Gates
alu2      667.89   1080.92   61.84       303        187     740.39   1160.24   56.71       398        218
apex7     417.43    699.96   67.68       214        148     465.76    748.96   60.80       263        163
C1355     949.54   1627.32   71.38       516        377    1015.89   1699.16   67.26       566        385
C1908     908.68   1486.05   63.54       479        310    1020.73   1624.68   59.17       572        347
C3540    2177.23   3312.64   52.15      1079        588    2401.32   3571.66   48.74      1271        639
C432      348.88    609.23   74.62       181        134     402.93    684.80   69.96       219        155
C499      974.59   1634.13   67.67       523        362    1069.50   1756.28   64.22       597        387
C880      772.90   1293.15   67.31       389        267     828.71   1361.26   64.26       457        294
dalu     1569.98   2458.44   56.59       781        444    1799.78   2822.27   56.81       969        545
alu4     4093.89   5945.08   45.22      1799        782    4543.40   6221.46   36.93      2469        880
frg2     1453.54   2302.90   58.43       723        422    1768.15   2736.36   54.76       988        571
AVG                          62.40                                             58.15
VII-D. Experimental Results
The performance of the circuit hardening approach proposed in this chapter was evaluated
by applying it to several ISCAS and MCNC benchmark circuits. A standard cell library
(LIB) was implemented using 65nm PTM [45] model cards, with VDD = 1.0 V. The library
(LIB) consists of regular INV2X, INV4X, NAND2, NAND3, NOR2 and NOR3 gates.
The modified regular versions, as well as the hardened versions of all the regular gates
in the library LIB were also designed. The layouts were created for all these gates using
CADENCE SEDSM [110] tools. Several ISCAS and MCNC benchmark circuits were
mapped using LIB, for both area and delay optimality. From a mapped design, first the
sensitization probability of all the gates in the design was computed. Then the sensitive
gates in the design were selectively hardened (to achieve a 90% reduction in soft error
rate) based on their sensitization probability, using Algorithm 1. The area and the delay
results of the regular (unhardened) and the hardened circuits are reported in Tables VII.1
and VII.2.

Table VII.2. Delay Overheads and Qcri of the Proposed Radiation Hardened Design Approach

        |------------- Area Map -------------|  |------------- Delay Map ------------|
Ckt      Regular  Hardened  % Ovh.   Qcri        Regular  Hardened  % Ovh.   Qcri
         (ps)     (ps)               (fC)        (ps)     (ps)               (fC)
alu2     1068.28   1309.45   22.58   >650         893.62   1129.06   26.35   >650
apex7     495.00    636.84   28.65    520         451.95    565.13   25.04    330
C1355     636.95    830.21   30.34   >650         639.86    799.49   24.95   >650
C1908     924.56   1206.91   30.54   >650         926.47   1205.01   30.06   >650
C3540    1217.71   1582.78   29.98   >650        1139.45   1530.20   34.29   >650
C432      856.80   1120.31   30.76   >650         839.02   1094.36   30.43   >650
C499      670.97    868.22   29.40   >650         655.22    784.30   19.70   >650
C880      923.22   1157.03   25.32   >650         879.10   1069.67   21.68   >650
dalu      909.35   1241.81   36.56   >650         821.68   1157.31   40.85   >650
alu4      679.15    818.10   20.46   >650         625.48    751.79   20.20   >650
frg2      679.32    905.38   33.28   >650         818.30   1098.05   34.19   >650
Average                      28.90                                   27.98

Table VII.1 reports the layout area results for several benchmark circuits which are
mapped for both area and delay optimality. Note that the layout area for a design was
computed by adding the layout area of all the gates in the circuit. Column 1 reports the
circuit under consideration. Columns 2 and 3 report the area (in µm2) of the regular and
hardened (obtained using the approach proposed in this chapter) area mapped designs,
respectively. Column 7 (8) reports the area of regular (hardened) delay mapped designs.
Column 4 (9) reports the percentage area overhead for the radiation-hardened area (delay)
mapped designs. Column 5 (10) reports the number of gates in the area (delay) mapped
designs. The number of hardened gates in the area (delay) mapped designs is reported in
Column 6 (11). Observe from Table VII.1 that the average area overhead for the proposed
hardening approach is 62.4% and 58.15%, for area and delay mapped designs respectively.
Also, on average, the ratio of the number of hardened gates and the total number of gates
in the area and delay mapped designs is 63.1% and 58.7%, respectively.
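
The percentage overheads in Table VII.1 follow directly from the regular and hardened areas. The short Python sketch below recomputes them for three rows copied from the table; the function and variable names are illustrative and do not come from the original work.

```python
# Regular and hardened layout areas (in square micrometers) copied from
# three rows of Table VII.1 (area-mapped designs).
AREA_MAP = {
    "alu2": (667.89, 1080.92),
    "C432": (348.88, 609.23),
    "alu4": (4093.89, 5945.08),
}


def percent_overhead(regular: float, hardened: float) -> float:
    """Overhead of the hardened design relative to the regular one, in percent."""
    return 100.0 * (hardened - regular) / regular


def average_overhead(designs: dict) -> float:
    """Arithmetic mean of the per-circuit percentage overheads."""
    overheads = [percent_overhead(r, h) for r, h in designs.values()]
    return sum(overheads) / len(overheads)


for name, (regular, hardened) in AREA_MAP.items():
    print(f"{name}: {percent_overhead(regular, hardened):.2f}%")
```

The AVG row of the table is consistent with averaging the per-circuit percentages, which is what `average_overhead` does, rather than with taking the ratio of summed areas.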
The delay penalty associated with applying the proposed radiation hardening approach
is presented in Table VII.2. Note that the delay for a design reported in Table VII.2 is the
summation of the combinational circuit delay (D), the setup time (Tsu) of the flip-flop and
the clock to output (Tcq) delay of the flip-flop. Therefore, Table VII.2 reports the clock
period (T = D + Tsu + Tcq) of a design. The delay of a regular design is obtained by using
a static timing analysis tool. The static timing analysis tool was modified to compute the
delay of the radiation hardened designs. First, all hardened gates were characterized to
construct 2-dimensional pin-to-output delay lookup tables for different load values on the
two outputs (outp and outn). Note that for any hardened gate, the output outp falls after
the falling of outn, and outn rises after the rising of outp. Therefore, the rising (falling)
delay of a hardened gate is obtained from the rising (falling) delay of the outn (outp) node.
After the characterization of all hardened gates, the modified static timing analysis tool
was used to compute the delay of the radiation hardened circuits using these 2-dimensional
delay lookup tables. Tsu and Tcq were obtained using an unhardened D flip-flop for the
regular design, and a radiation tolerant flip-flop [98] for the hardened design. Table VII.2
also reports the critical charge value for the radiation hardened design. Column 1 reports
the circuit under consideration. Columns 2 and 3 report the clock period (in ps) for a
regular area mapped design and the hardened area mapped design. Column 4 reports the
percentage delay overhead (or clock period overhead) for the radiation-hardened design.
Column 5 reports the critical charge (in fC) for the hardened design, computed as described
in Section VII-C.3. Note that the Qcri value is obtained for τα = 150 ps and τβ = 38 ps, as
reported in [12]. The smallest gate in LIB was used to find this value. Columns 6 to 9
report the same results as Columns 2 to 5, but for delay mapped designs. As reported in
Table VII.2, the average delay overhead of the proposed radiation hardening approach is
28.9% and 28% for area and delay mapped designs respectively. Also, the critical charge
for the radiation hardened design is a very large value. Traditional radiation hardening
approaches such as [12, 100] protect against radiation strikes of at most ∼150 fC. For all
but one hardened design obtained using the proposed approach, the critical charge is greater
than 650 fC (footnote 4) for τα = 150 ps and τβ = 38 ps. Therefore, for all practical purposes,
the proposed radiation hardening approach provides 90% coverage (i.e. a reduction of the
soft error rate by an order of magnitude) against very high energy radiation particle strikes.

Table VII.3. Area and Delay Overheads of the Proposed Radiation Hardened Design Approach
for 100% Coverage

        |-------- Area Map --------|   |-------- Delay Map -------|
Ckt      % Area Ovh.  % Delay Ovh.      % Area Ovh.  % Delay Ovh.
alu2     97.86        39.76             97.21        48.27
alu4     97.87        39.22             97.18        44.42
apex7    97.47        40.56             96.04        41.40
C1355    97.27        45.93             96.37        44.97
C1908    97.41        47.33             95.82        49.13
C3540    98.55        46.83             97.35        47.95
C432     98.49        44.62             97.06        47.32
C499     97.09        45.96             95.91        45.07
C880     97.7         44.90             96.66        46.83
dalu     97.98        49.53             96.4         52.05
frg2     97.4         42.85             95.4         56.31
AVG      97.74        44.32             96.49        47.61

Table VII.3 reports the area and delay overheads of the radiation hardened designs ob-
tained using the approach proposed in this chapter, to achieve 100% coverage for both area
and delay mapped designs. The overheads reported in Table VII.3 are the percentage area
and delay overheads of the hardened circuits, compared to their regular counterparts. Note
that the actual area and delay numbers of the regular circuits are reported in Tables VII.1
and VII.2. Table VII.3 shows that for 100% coverage, the proposed hardening approach re-
sults in a 97.74% area overhead and a 44.32% delay overhead on average, for area mapped
designs. For delay mapped designs, the area overhead is 96.49% and the delay overhead is
47.61%. Note that the area and the delay overheads for 100% radiation tolerance are
approximately 50-60% higher than the area and the delay overheads for 90% coverage. Note
that this design point is appealing since it protects 100% of the circuit against significantly
larger Qcri values than have been reported in the literature. Of course, for soft error rate
reductions lower than 100%, these overheads can be significantly reduced, as described next.

Footnote 4: The pulse width of the voltage glitch induced by a radiation particle strike with
Q > 650 fC saturates to a value of 660 ps.

Fig. VII.8. Area and delay overhead of our radiation hardening design approach for different
coverage: (a) Area Overhead, (b) Delay Overhead

Figure VII.8 shows the average area and delay overheads of the radiation hardened
designs, obtained by using the split-output based hardening approach for different cover-
age values. In Figure VII.8, AM (DM) corresponds to area (delay) mapped designs. As
shown in Figure VII.8, initially both the area and delay overheads increase linearly with
an increasing coverage value, for coverage less than ∼80%. However, for coverage val-
ues greater than 80%, the area and delay overheads increase super-linearly with increasing
coverage. Thus, the optimal coverage value for the split-output based hardening approach
is about 80%. For 80% coverage, the average area and delay overheads for area (delay)
mapped designs are 48.7% and 22.4% (44.7% and 20.8%) respectively.
From Figures VII.2 and VII.5 it can be concluded that the proposed radiation toler-
ant standard cells can tolerate high energy radiation particle strikes without affecting the
state of gates in their fanout. Also Tables VII.1 and VII.2 show that the circuit radiation
hardening technique proposed in this chapter provides good soft error rate reduction (by an
order of magnitude) with a modest area overhead of 60% and delay overhead of 29% on
average. The critical charge of the hardened circuit obtained using the proposed approach
is also a very large value (> 650 fC in all but one example), which ensures correct circuit
functionality in a heavily radiation prone environment.
VII-E. Chapter Summary
This chapter presents a new radiation tolerant CMOS standard cell library, and demon-
strates its effectiveness in implementing digital circuits. It is known that if a gate is im-
plemented using only PMOS (NMOS) transistors, then a radiation particle strike can result
only in a logic 0 to 1 (1 to 0) glitch. This concept was applied to derive radiation hardened
standard cells. The radiation hardened gates exhibit an extremely high degree of radiation
tolerance and significantly reduced leakage compared to competing approaches. This is
validated through circuit-level simulations. The work presented in this chapter
also implemented circuit level hardening using logical masking, to selectively harden
those gates in a circuit which contribute most to the soft error failure rate of the circuit.
The gates with a low probability of logical masking are replaced by radiation tolerant gates
from the new library, such that the digital design achieves 90% soft error rate reduction.
Experimental results validate the claims of high radiation tolerance. A 90% reduction in
SER is achieved with an area (delay) penalty of 62% (29%) for area mapped designs.
CHAPTER VIII
VARIATION ANALYSIS - SENSITIZABLE STATISTICAL TIMING ANALYSIS
VIII-A. Introduction
As described in Chapter I, with the continuous scaling of the minimum feature sizes of
VLSI fabrication processes, variations in key MOSFET and interconnect parameters are
increasing at an alarming rate [8, 9, 10]. The increasing variability of device and interconnect
parameters makes the task of designing reliable VLSI systems difficult. Thus, it is
essential to use a statistical analysis of timing to evaluate the performance of a combina-
tional circuit under variations. Also, design methodologies to implement variation tolerant
circuits need to be developed and validated by statistical timing analysis tools.
In this dissertation, a sensitizable statistical timing analysis approach is developed, to
improve the accuracy of timing analysis under process variations. The sensitizable statisti-
cal timing analysis approach is referred to as StatSense, and it is described in this chapter.
Two design approaches are also developed to improve the process variation tolerance of
combinational circuits and voltage level shifters (which are used in circuits with multiple
interacting supply domains), respectively. The process variation tolerant design approach
for combinational circuits is discussed in the next chapter. In Chapter X, the process varia-
tion tolerant voltage level shifter design is described.
In recent times, statistical static timing analysis (SSTA) has received significant attention
in both academe and industry. While a lot of research has suggested that statistical
static timing analysis is essential for timing closure in contemporary VLSI design, this
new method of timing analysis has not been readily accepted by chip designers. It is not
just the reticence of designers towards adopting a new design methodology that is preventing
or slowing the adoption of this new timing approach. There is also a legitimate concern
that the results of statistical static timing analysis tend to be overly pessimistic. Besides,
statistical static timing analysis takes longer to run. Also, the adoption of SSTA requires a
greater effort during the gate library characterization phase of VLSI design. Designers are
hence skeptical about the benefits of this new timing analysis methodology.
There are many factors that contribute to the inaccuracies of statistical static timing
analysis (SSTA), and many of them are dependent on the method used for the analysis.
Some of these factors are:
1. Spatial correlations of process variations
2. Correlations of the delays of different circuit paths
3. In block based SSTA [25], the PDF (Probability Density Function) that results from
the maximum operation of two PDFs is approximated to have a Gaussian distribution
4. False paths in the circuit
5. Representation of gate delay distribution. Usually, a single distribution for each pin
to output delay for a gate is used. However, accuracy would be improved if one
distribution is used for each input vector transition.
The last two issues mentioned above are also sources of pessimism for current SSTA
tools. Most static timing analysis tools (and their statistical counterparts) do not consider
false paths. Also these tools assume that the delay distributions of all gates in a design are
Gaussian. However, the delay distribution of a gate is not necessarily Gaussian. In fact,
the delay distribution for a multi-input gate is Normal for each input vector transition that
causes a change on the gate output.
The StatSense approach proposed in this chapter deals with these two issues. StatSense
consists of two phases. In the first phase, a set of N logically sensitizable vector
transitions that result in the largest delays for a circuit are obtained. In the second phase,
these delay-critical sensitizable input vector transitions are propagated using a Monte-Carlo
based technique, to obtain a delay distribution at the outputs. The specific input transitions
at any gate are known after the first phase, and so the gate delay distributions corresponding
to these input transitions are utilized in the second phase. The second phase performs
Monte Carlo based SSTA, using the appropriate gate delay distribution corresponding to
the input transition for each gate. In this way, StatSense also implicitly considers path
correlations and does not approximate gate delay PDFs to Gaussian distributions (since it does
not require the computation of the MAX of two PDFs).
The rest of the chapter is organized as follows. Section VIII-B briefly discusses previous
work on statistical timing analysis of combinational circuits. The proposed StatSense
approach is described in Section VIII-C. Experimental results are presented in
Section VIII-D, followed by a chapter summary in Section VIII-E.
VIII-B. Related Previous Work
The idea of statistical timing analysis has been a subject of research for several years. Some
of the early works in this field include [115, 116]. The recent growth of interest in this field
has been driven primarily by the fact that process variations are growing larger and less
systematic.
Most of the techniques that perform statistical timing analysis are based on the princi-
ples of Static Timing Analysis (STA). STA computes the (pessimistic) worst case delay of
a circuit by propagating the worst case arrival time at the nodes of a circuit (in a topological
manner from inputs to outputs). The arrival time at the output of each gate is the MAX of
the SUM of the gate’s delay and the arrival time of its inputs. Since most statistical timing
analysis approaches utilize STA to propagate delay, they are often called statistical static
timing analysis (SSTA) approaches.
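
The MAX/SUM propagation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool used in this work: it assumes a single worst-case delay per gate (real STA tools use per-pin delays), and the netlist, node names and delay values are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def static_timing_analysis(fanin, gate_delay, input_arrival):
    """Propagate worst-case arrival times from primary inputs to all nodes.

    fanin:         maps each gate output to the list of nodes driving it
    gate_delay:    maps each gate output to one worst-case gate delay
    input_arrival: maps each primary input to its arrival time
    """
    arrival = dict(input_arrival)
    # static_order() yields every node only after all of its predecessors.
    for node in TopologicalSorter(fanin).static_order():
        if node in arrival:  # primary input: arrival time is given
            continue
        # MAX over the fan-in arrival times, then SUM with the gate delay.
        arrival[node] = max(arrival[src] for src in fanin[node]) + gate_delay[node]
    return arrival


# Hypothetical two-gate circuit: n1 = f(a, b), n2 = g(n1, c); times in ps.
fanin = {"n1": ["a", "b"], "n2": ["n1", "c"]}
gate_delay = {"n1": 50.0, "n2": 40.0}
arrival = static_timing_analysis(fanin, gate_delay, {"a": 10.0, "b": 35.0, "c": 0.0})
print(arrival["n2"])  # 125.0
```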
There are two broad classes of statistical static timing analysis algorithms: path-based
and block-based. In path-based algorithms, a set of paths is selected for a detailed
statistical analysis, which is performed by Monte Carlo techniques. In each iteration, delay
computation is performed in a breadth first manner from circuit inputs to outputs, using
STA. In block-based algorithms, delay distributions are propagated by traversing the circuit
under consideration in a levelized breadth-first manner. The fundamental operations in a
block based SSTA tool are the SUM and the MAX operations. Most block based SSTA
algorithms rely on efficient ways to implement these SUM and MAX operations for delay
distributions, rather than for discrete delay values (which STA uses). While block-based
algorithms tend to be faster, path-based algorithms are more accurate and provide more
realistic statistical timing estimates [117].
In [25], the authors present a technique to propagate probability density functions
(PDFs) through a circuit in the same manner as arrival times of signals are propagated
during STA. Using PCA (Principal Component Analysis), they also demonstrate the abil-
ity to handle spatial correlations of process parameters. While the SUM operation used
(for adding 2 Gaussian distributions) yields another Gaussian distribution, the MAX of 2
or more Gaussian distributions is not a Gaussian distribution in general. For the sake of
simplicity and ease of calculation, the authors of [25] approximate the MAX of 2 or more
Gaussian distributions to be Gaussian as well.
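
The claim that the MAX of Gaussians is not Gaussian can be checked numerically. The Python sketch below is an illustration only (it is not taken from [25]): it samples the MAX of two independent standard Gaussian variables and estimates the skewness, which would be close to zero for a Gaussian. The sample count and seed are arbitrary choices.

```python
import random
import statistics


def sample_skewness(values):
    """Third standardized moment; close to 0 for a Gaussian sample."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


def max_of_two_gaussians(n_samples=100_000, seed=0):
    """Draw samples of MAX(X, Y) for independent X, Y ~ N(0, 1)."""
    rng = random.Random(seed)
    return [max(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_samples)]


samples = max_of_two_gaussians()
print(f"mean     = {statistics.fmean(samples):.3f}")
print(f"skewness = {sample_skewness(samples):.3f}")
```

The estimated skewness should come out clearly positive, so fitting a Gaussian to the result of a MAX operation discards the asymmetry of the true distribution.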
In [117], a canonical first-order delay model is proposed, and an incremental block
based timing analyzer is used to propagate arrival times and required times through a timing
graph in this canonical form. One of the major contributions of the algorithm proposed
in [117] is that it allows the statistical timing engine to be used in an incremental manner.
In [118, 119, 120], the authors note that accurate statistical static timing analysis can
become exponential. Hence, they propose faster algorithms that compute bounds on the
exact result.
In [121], a block based SSTA algorithm is discussed. By representing the arrival times
as CDFs (Cumulative Distribution Functions) and the gate delays as PDFs, the authors claim
to have an efficient method to do the SUM and MAX operations. They decompose the CDF
into a sum of ramps and the PDF into a sum of step signals. This discretization helps to
make the SUM and MAX operation more efficient. The accuracy of the algorithm can be
adjusted by choosing more discretization levels. Reconvergent fanouts are handled through
a statistical subtraction of the common mode.
In [122], the authors propagate gate delay distributions (PDFs) through a circuit. The
key contribution of [122] is that PDFs are discretized to help make the operation more
efficient. Here too, the accuracy of the result is dependent on the discretization.
The common theme in all the above works is that they are based on the static timing
analysis framework. Hence, only the structurally long paths are identified through these
algorithms. The authors of [123] identify this deficiency and come up with a statistical
static timing analysis flow that considers false paths. Their flow consists of 2 phases. In the
first phase, a regular SSTA (using worst-case statistical timing information) is performed
to identify the structurally long paths. Each of these paths is then checked to see if it
is logically sensitizable. In the second phase of the flow, another statistical static timing
analysis is performed for just the logically sensitizable paths, to check if they are 'timingly
true'. A path is timingly true if the transitions on the path cannot be invalidated by other
off-input paths. The statistical static timing analysis is done using Monte-Carlo based
techniques. While the authors of [123] reduce pessimism by considering false paths, they do
not address the pessimism that arises from using a single Gaussian gate delay distribution
for any gate.
In [124], a bound-based technique was proposed to identify the top timing-violating
paths in a circuit under variability. The authors of [124] verified the correctness and
accuracy of their approach and found that their approach finds delay critical paths of a circuit
under variations with good fidelity. Note that the approach of [124] can be extended to
obtain the sensitizable delay critical input vector transitions, which are required in the second
phase of the StatSense approach.
Thus, traditional approaches for statistical timing analysis of a circuit tend to be overly
pessimistic, and hence pose tighter design constraints on the VLSI circuit designers.
Therefore, more accurate statistical timing analysis tools are desired, to simplify the designer's
task.
VIII-C. Proposed Sensitizable Statistical Timing Analysis Approach
The StatSense approach eliminates false paths and also accounts for the fact that the delay
of a gate has different Normal distributions for different input transitions which cause an
output transition. As mentioned earlier, the StatSense approach consists of two phases. In
the first phase, a set of N logically sensitizable vector transitions that result in the largest
delays for the circuit are obtained. In the second phase, these delay-critical sensitizable
input vector transitions are propagated using a Monte-Carlo based technique, to obtain a
delay distribution at the outputs. The specific input transitions at any gate are known after
the first phase, and so the gate delay distributions corresponding to these input transitions
are utilized in the second phase. The second phase performs Monte Carlo based SSTA,
using the appropriate gate delay distribution corresponding to the input transition for each
gate.
In the remainder of this section, these two phases are described, along with a discus-
sion on how input arrival times are propagated for any gate.
VIII-C.1. Phase 1: Finding Sensitizable Delay-critical Vector Transitions
To ensure that time is not needlessly spent on performing statistical analysis on false
paths, a user-specified number N of sensitizable vector transitions (that result in the largest
delays for the circuit) are first obtained. This is done using the sense [108] package in
SIS [109]. Sense uses a Boolean satisfiability (SAT) [125, 126, 127] solver to verify if
a particular delay (initially set to the delay found from a static timing analysis run) is
sensitizable. As a consequence, sense is NP-complete. Efficient implementations of this
algorithm [108, 128] exist, and have been demonstrated to work on large designs. If there
is no satisfying input vector that produces the delay obtained by the STA, the delay value is
reduced in steps until a delay value D is reached that has a satisfying vector (a vector on the
primary inputs that has a delay D). In its original implementation, sense returns
only the maximum sensitizable delay of the circuit. Based on the theory of the sensitizable
timing analysis methodology of sense [108], the resulting delay that is reported by the
approach is the largest delay for the design to reach a stable output state. The sense routine
was modified to return the final vector on the primary inputs, as well as all the possible
initial vectors on the primary inputs, that cause the maximum sensitizable delay. A change
from any initial vector to a final vector is referred to as a vector transition. The set of N
input transitions is stored in an array for use in the second phase of the proposed statistical
timing flow.
After obtaining the first largest sensitizable delay vector, the complement of this delay
vector is inserted (as a SAT clause) in sense's SAT routine, and then sense is run again to
get the next critical vector. For example, if the vector V that produced the largest
sensitizable delay D at output z is abc (where a, b, and c are the primary inputs involved in the
sensitizable critical delay transition), then the clause added is (a' + b' + c' + z'), where '
denotes complementation. When sense is
called again, it returns the vector transition resulting in the next largest sensitizable delay.
Note that different vector transitions sensitize different structural paths. This process is
continued until the top N delay-critical, sensitizable vector transitions are collected. Note
that the value of N is specified by the user. It can be decided based on the desired accuracy,
and the time available for computation.
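
The iterate-and-block loop described above can be illustrated with a small Python sketch. It is not the sense implementation: brute-force enumeration stands in for the SAT solver, the blocking clause covers only the primary inputs (the output literal is omitted), and `toy_delay` is a hypothetical delay oracle invented for the example.

```python
from itertools import product


def blocking_clause(assignment):
    """Clause (list of (variable, polarity) literals) falsified only by `assignment`.

    Every variable appears with the opposite polarity of its assigned value,
    i.e. the clause is the complement of the cube describing the assignment.
    """
    return [(var, not value) for var, value in assignment.items()]


def satisfies(assignment, clause):
    """A clause is satisfied if at least one of its literals is true."""
    return any(assignment[var] == polarity for var, polarity in clause)


def top_n_vectors(variables, delay_of, n):
    """Collect the n assignments with the largest delay, one per round.

    Each round searches only assignments that satisfy every blocking
    clause added so far (enumeration replaces the SAT call here).
    """
    clauses, found = [], []
    for _ in range(n):
        candidates = [
            dict(zip(variables, bits))
            for bits in product([False, True], repeat=len(variables))
        ]
        candidates = [a for a in candidates if all(satisfies(a, c) for c in clauses)]
        if not candidates:
            break
        best = max(candidates, key=delay_of)
        found.append(best)
        clauses.append(blocking_clause(best))  # exclude `best` from later rounds
    return found


def toy_delay(v):
    """Hypothetical delay oracle: each asserted input adds a fixed delay."""
    return 10 * v["a"] + 5 * v["b"] + 1 * v["c"]


vectors = top_n_vectors(["a", "b", "c"], toy_delay, 3)
print([toy_delay(v) for v in vectors])  # [16, 15, 11]
```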
In [123], the authors find a set of logically sensitizable paths from a given set of
structurally long paths (found from an initial static timing analysis). Then from this set,
they separately find a subset of paths that are 'timingly true'.
In the proposed approach, sense returns vector transitions that are logically sensitiz-
able and timing true. There is no separation of the two properties (logic sensitizability and
timing trueness). Sense performs its analysis using nominal gate delay values. The statisti-
cal analysis is done only on the N primary input vector transitions that sense declared to be
critical and sensitizable.
Once sense returns a set of N critical vector transitions, these vector transitions are then
used to find the statistical distribution of the circuit delay due to these vector transitions. A
primary input vector transition may induce an input transition on each gate of the design, for
which the appropriate gate delay distribution is selected for further statistical processing.
Also, the arrival times for the gate are propagated in a manner that exploits the fact that the
input transitions at the gate are known. This is explained in the following section.
VIII-C.2. Propagating Arrival Times
In a regular static timing analysis, the structurally worst delay is obtained. However,
StatSense uses the fact that the specific transitions at the inputs of a gate, that cause the output
node to switch, are known. The details of how this is done are explained with the example
of a NAND2 gate, which is assumed to be part of a larger circuit. First, consider just the
nominal delay of a NAND2 gate.

Fig. VIII.1. Arrival time propagation using a NAND2 gate (the figure annotates the STA and
StatSense estimates of the output arrival time for inputs a and b arriving at 10 ps and 35 ps)

Table VIII.1. Transitions for a NAND Gate that Cause Its Output to Switch

Rising (output)
Transition #   ab → ab    Delay (ps)
1              11 → 00    30.5
2              11 → 01    50.5
3              11 → 10    53.0

Falling (output)
Transition #   ab → ab    Delay (ps)
1              00 → 11    55.3
2              01 → 11    46.5
3              10 → 11    42.7

Table VIII.1 is a list of input transitions that cause the output of the NAND gate to
change its logic value. Let $AT^{fall}_i$ denote the arrival time of a falling signal at node i and
$AT^{rise}_i$ denote the arrival time of a rising signal at node i.
In the case of regular STA, the rising time (delay) at the output c of a NAND2 gate is
calculated as
\[
AT^{rise}_c = \mathrm{MAX}\big[\,(AT^{fall}_a + \mathrm{MAX}(D_{11\to 00}, D_{11\to 01})),\;
(AT^{fall}_b + \mathrm{MAX}(D_{11\to 00}, D_{11\to 10}))\,\big]
\]
where $D_{xy\to pq}$ is the delay of the output when the inputs change from xy to pq. Also,
$\mathrm{MAX}(D_{11\to 00}, D_{11\to 01})$ is often referred to as the pin-to-output rising delay from the input
a, while $\mathrm{MAX}(D_{11\to 00}, D_{11\to 10})$ is referred to as the pin-to-output rising delay from the
input b.
Similarly, in STA, the falling time (delay) at the output c of a NAND2 gate is given by
\[
AT^{fall}_c = \mathrm{MAX}\big[\,(AT^{rise}_a + \mathrm{MAX}(D_{00\to 11}, D_{01\to 11})),\;
(AT^{rise}_b + \mathrm{MAX}(D_{00\to 11}, D_{10\to 11}))\,\big]
\]
where $\mathrm{MAX}(D_{00\to 11}, D_{01\to 11})$ is often referred to as the pin-to-output falling delay from
the input a, while $\mathrm{MAX}(D_{00\to 11}, D_{10\to 11})$ is referred to as the pin-to-output falling delay
from input b.
For example, if the worst case falling or rising arrival times at inputs a and b were 10 ps
and 35 ps respectively, then the rise delay at c would be calculated as MAX(10+50.5,
35+53.0) = 88.0 ps. Similarly, for a falling c output, the delay would be MAX(10+55.3, 35+55.3)
= 90.3 ps. However, this is a pessimistic method of calculating the delay. The StatSense
approach attempts to remove some of this pessimism.
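
The regular STA calculation above can be reproduced with the nominal delays of Table VIII.1. The following Python sketch is only an illustration of the two STA formulas; the dictionary layout and function names are assumptions made for this example.

```python
# Nominal NAND2 delays in ps, keyed by (initial "ab", final "ab"), from Table VIII.1.
NAND2_DELAY = {
    ("11", "00"): 30.5, ("11", "01"): 50.5, ("11", "10"): 53.0,  # output rises
    ("00", "11"): 55.3, ("01", "11"): 46.5, ("10", "11"): 42.7,  # output falls
}


def sta_rise_time(at_fall_a, at_fall_b, d=NAND2_DELAY):
    """Regular STA arrival time of a rising NAND2 output."""
    pin_a = max(d[("11", "00")], d[("11", "01")])  # pin-to-output rising delay, input a
    pin_b = max(d[("11", "00")], d[("11", "10")])  # pin-to-output rising delay, input b
    return max(at_fall_a + pin_a, at_fall_b + pin_b)


def sta_fall_time(at_rise_a, at_rise_b, d=NAND2_DELAY):
    """Regular STA arrival time of a falling NAND2 output."""
    pin_a = max(d[("00", "11")], d[("01", "11")])  # pin-to-output falling delay, input a
    pin_b = max(d[("00", "11")], d[("10", "11")])  # pin-to-output falling delay, input b
    return max(at_rise_a + pin_a, at_rise_b + pin_b)


print(f"rise: {sta_rise_time(10.0, 35.0):.1f} ps")  # rise: 88.0 ps
print(f"fall: {sta_fall_time(10.0, 35.0):.1f} ps")  # fall: 90.3 ps
```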
First consider the rising output. The output of the NAND2 gate switches high when
any of the two inputs switches low. From the output of sense, the actual vector transition
that causes the largest delay for a given circuit can be found. This primary input vector
transition induces a transition on the gate inputs. Assume that this input transition was 11
→ 00 for the NAND2 gate. A naive way of calculating the delay would be to state that the
delay would be given by
\[
AT^{rise}_c = \mathrm{MAX}(AT^{fall}_a, AT^{fall}_b) + D_{11\to 00}
\]
Assuming again that the arrival times at inputs a and b were 10 ps and 35 ps respectively,
the delay would then be calculated as MAX(10,35)+30.5 = 65.5 ps. However, it is
known that the output would start switching before 65.5 ps since signal a arrives earlier than
signal b. As a result, the gate effectively goes through the transition 11 → 01 → 00 rather
than 11 → 00 directly. Note that the output of the NAND2 gate rises for the vector 01 as
well. Hence, StatSense calculates the delay to be
\[
AT^{rise}_c = \mathrm{MIN}\big((AT^{fall}_a + D_{11\to 01}),\; (AT^{fall}_b + D_{11\to 00})\big)
\]
In this example, the delay estimated by StatSense is hence MIN(10+50.5, 35+30.5)
= 60.5 ps. Note that the minimum of two delays is used in this case since any one input
falling causes the output to switch. Also note that the delay calculated (60.5 ps) is much
smaller than the worst case delay calculated using regular STA (88.0 ps). The reduction in
pessimism in the proposed approach occurs due to the fact that the information about the
input transition for the gate is available.
Now consider the case of the falling output. The output of the NAND2 gate switches
low only when both the inputs switch high. Again, StatSense uses the fact that sense
provides the actual vector transition that caused the critical delay. Assume that the induced
input transition for the NAND2 gate was 00 → 11. A naive way of calculating the delay
would be to state that the delay is
\[
AT^{fall}_c = \mathrm{MAX}(AT^{rise}_a, AT^{rise}_b) + D_{00\to 11}
\]
Assuming again that the arrival times at inputs a and b were 10 ps and 35 ps respectively,
the delay would be calculated as MAX(10,35)+55.3 = 90.3 ps. However, a arrives
earlier than b. As a result, the gate effectively goes through the transition 00 → 10 → 11
rather than 00 → 11 directly. Hence, in the StatSense approach, the delay is calculated to
be
\[
AT^{fall}_c = \mathrm{MAX}\big((AT^{rise}_a + D_{00\to 11}),\; (AT^{rise}_b + D_{10\to 11})\big)
\]
In this example, the delay is hence MAX(10+55.3, 35+42.7) = 77.7 ps. Note that the
maximum of two delays is used in this case since both inputs need to switch to cause the
output to switch. Also note that the delay calculated (77.7 ps) is smaller than the worst case
delay calculated using regular STA (90.3 ps).
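
The two StatSense formulas for the NAND2 gate can be checked against the worked numbers with a short Python sketch. It is illustrative only: the function names are made up for this example, and both functions assume, as in the text, that input a switches no later than input b.

```python
# Nominal NAND2 delays in ps, keyed by (initial "ab", final "ab"), from Table VIII.1.
NAND2_DELAY = {
    ("11", "00"): 30.5, ("11", "01"): 50.5, ("11", "10"): 53.0,  # output rises
    ("00", "11"): 55.3, ("01", "11"): 46.5, ("10", "11"): 42.7,  # output falls
}


def statsense_rise_time(at_fall_a, at_fall_b, d=NAND2_DELAY):
    """Rising NAND2 output for the induced transition 11 -> 00.

    Either falling input can raise the output, so the earlier event wins (MIN).
    Assumes a falls no later than b, so the gate passes through 01.
    """
    return min(at_fall_a + d[("11", "01")], at_fall_b + d[("11", "00")])


def statsense_fall_time(at_rise_a, at_rise_b, d=NAND2_DELAY):
    """Falling NAND2 output for the induced transition 00 -> 11.

    Both inputs must rise before the output falls, so the later event wins (MAX).
    Assumes a rises no later than b, so the gate passes through 10.
    """
    return max(at_rise_a + d[("00", "11")], at_rise_b + d[("10", "11")])


print(f"rise: {statsense_rise_time(10.0, 35.0):.1f} ps")  # rise: 60.5 ps
print(f"fall: {statsense_fall_time(10.0, 35.0):.1f} ps")  # fall: 77.7 ps
```

Handling the case where b switches first would require the mirrored transitions (for example 11 → 10 in place of 11 → 01), which this sketch leaves out.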
These results are shown graphically in Figures VIII.2 and VIII.3. These plots show
the arrival time of the output c of a NAND2 gate, for the 00 → 11 and 11 → 00 transitions
respectively. The arrival time of one of the inputs, a, is fixed to zero and the arrival time of
the other input, b, is swept from -100 ps to 100 ps. The propagated delays are shown for
STA and StatSense, along with the delay found by SPICE [38]. Figures VIII.4 and VIII.5
show the same data for the NOR2 gate. As can be seen from these plots, the method which
is used by StatSense to calculate the arrival times for multiple switching inputs matches
SPICE quite accurately and is significantly better (less pessimistic) than a traditional STA
method for computing arrival times.

Fig. VIII.2. Plot of arrival times at output of NAND2 gate calculated through various means
for the transition 00 → 11 [plot: Arrival Time at output (ps) versus Time Difference (ps);
curves: SPICE, OURS, STA]

Similarly, equations are derived to calculate the arrival times for an arbitrary gate, de-
pending on the input transitions at that gate. Consider a NAND3 gate with inputs {a,b,c}.
First, assume that the inputs of the NAND3 gate change as follows:
000 → 100→ 110 → 111
The output of the NAND3 gate switches low only when the inputs are 111. Hence the
delay of the gate would be calculated as follows:
AT f allout = MAX [(AT risea +D000→111), (8.1)
163
-50
 0
 50
 100
 150
 200
-150 -100 -50  0  50  100  150
Ar
riv
al
 T
im
e 
at
 o
ut
pu
t(p
s)
Time Difference(ps)
SPICE
OURS
STA
Fig. VIII.3. Plot of arrival times at output of NAND2 gate calculated through various means
for the transition 11 → 00
(AT riseb +D100→111), (8.2)
(AT risec +D110→111)]
Now consider a NAND3 gate with its output rising. Let the inputs change as below
111 → 011→ 001 → 000
In this case, the output of the NAND3 gate starts switching high when at least one of
the inputs is logic 0. Hence the delay of the gate would be calculated as:
AT^{rise}_{out} = \mathrm{MIN}[(AT^{fall}_{a} + D_{111 \to 011}),    (8.3)
                               (AT^{fall}_{b} + D_{111 \to 001}),    (8.4)
                               (AT^{fall}_{c} + D_{111 \to 000})]
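The two NAND3 cases above can be sketched in a few lines of Python. This is only an illustration of the MAX and MIN rules of Equations 8.1 to 8.4; the function names, arrival times and delay values are hypothetical and are not taken from the characterized library.

```python
from typing import Sequence


def nand_fall_arrival(rise_ats: Sequence[float], delays: Sequence[float]) -> float:
    """Output falls only once every input is high, so the latest candidate wins (MAX)."""
    return max(at + d for at, d in zip(rise_ats, delays))


def nand_rise_arrival(fall_ats: Sequence[float], delays: Sequence[float]) -> float:
    """Output starts rising as soon as one input is low, so the earliest candidate wins (MIN)."""
    return min(at + d for at, d in zip(fall_ats, delays))


# Hypothetical arrival times (ps) of inputs a, b, c and hypothetical delays (ps).
rise_ats = [0.0, 20.0, 40.0]
fall_delays = [60.0, 50.0, 45.0]   # stand-ins for D_000->111, D_100->111, D_110->111
at_fall_out = nand_fall_arrival(rise_ats, fall_delays)   # max(60, 70, 85)

fall_ats = [0.0, 20.0, 40.0]
rise_delays = [30.0, 25.0, 20.0]   # stand-ins for D_111->011, D_111->001, D_111->000
at_rise_out = nand_rise_arrival(fall_ats, rise_delays)   # min(30, 45, 60)
```

With these made-up numbers the falling output is set by the last input to rise, while the rising output is set by the first input to fall.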
An extension to handling delay distributions is easily done by simply considering the
distribution to be made of several distinct delay values, obtained from the PDF of the gate
delay.
[Figure VIII.4: arrival time at output (ps) versus time difference (ps); curves: SPICE, OURS, STA]
Fig. VIII.4. Plot of arrival times at output of NOR2 gate calculated through various means
for the transition 00 → 11
VIII-C.3. Phase 2: Computing the Output Delay Distribution
In the second phase of StatSense, Monte Carlo analysis is performed on the sensitizable
vector transitions that result in the largest delays for the circuit (which were computed
in the first phase, described in Section VIII-C.1). In each of the STA runs for Monte
Carlo analysis, arrival times are propagated as described in Section VIII-C.2. Since the
primary input vector transitions may induce transitions on the input of each gate, the delay
distribution of the gate for the corresponding gate input transition is used. A random value
of the gate delay is computed from this distribution. This is done for each gate in the circuit.
Finally, STA is performed, using these delay values. The resulting maximum delay over all
the outputs is used to compute the worst case delay distribution of the circuit.
[Figure VIII.5: arrival time at output (ps) versus time difference (ps); curves: SPICE, OURS, STA]
Fig. VIII.5. Plot of arrival times at output of NOR2 gate calculated through various means
for the transition 11 → 00
In a NAND2 gate, there are 3 different input rising transitions that cause an output
falling transition (these are shown in the bottom half of Table VIII.1). For any iteration
of STA, if the value of delay for one of the 3 transitions (say 00 → 11) is chosen to be
µ_{00→11} + nσ_{00→11}, then the values used for the other two transitions (01 → 11, 10 → 11)
are µ_{01→11} + nσ_{01→11} and µ_{10→11} + nσ_{10→11} respectively.
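A minimal sketch of this shared-n sampling is given below. It assumes that n is drawn once per gate per iteration from a standard normal distribution; the dictionary layout, the function name and the (µ, σ) values are hypothetical.

```python
import random
from typing import Dict, Tuple

# Hypothetical (mu, sigma) pairs in ps for the NAND2 transitions that make the output fall.
NAND2_FALL_STATS: Dict[str, Tuple[float, float]] = {
    "00->11": (30.0, 2.0),
    "01->11": (26.0, 1.5),
    "10->11": (24.0, 1.2),
}


def sample_gate_delays(stats: Dict[str, Tuple[float, float]],
                       rng: random.Random) -> Tuple[float, Dict[str, float]]:
    """Draw one deviation n and reuse it for every transition of the same gate."""
    n = rng.gauss(0.0, 1.0)  # assumption: n is a standard normal deviate
    delays = {t: mu + n * sigma for t, (mu, sigma) in stats.items()}
    return n, delays


rng = random.Random(42)
n, delays = sample_gate_delays(NAND2_FALL_STATS, rng)
```

Because the same n is used for all three transitions, a gate that is slow for one transition in a given iteration is equally slow (in units of σ) for the others.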
VIII-D. Experimental Results
To demonstrate the effectiveness of the StatSense approach, it was tested on several circuits
from the ISCAS89 and MCNC91 benchmark suites. A 0.1µm BPTM process [107] model
card was used for all SPICE [38] simulations. A standard cell library LIB was designed
which consisted of 8 cells. The 8 cells were INV, INV2X, NAND2, NAND3, NAND4,
NOR2, NOR3, NOR4.
All standard cells in LIB were pre-characterized to construct a table of values for the
mean and standard deviation of the delay of each transition (that causes a change in the
output). This pre-characterization was done using SPICE, for a set of load capacitance
values. The parameters considered to be varying, along with their variations, are given in
Table VIII.2. As reported in this table, all parameters were modeled such that their σ is 5%
of their µ. The threshold voltages and the channel lengths of the devices in a gate were
assumed to vary in the same manner. Thus, all process parameters within a gate were
assumed to be perfectly correlated.
Table VIII.2. Parameters with Their Variation
Parameter   Nominal Value   σ
L           0.1µ            0.005µ
V^N_T       0.2607V         0.013V
V^P_T       0.3030V         0.01515V
The characterization results for a NAND2 gate (with a load capacitance of 6 fF) are
shown in Figures VIII.6 and VIII.7. Figure VIII.6 shows the delay histogram for the three
vector transitions which result in a rising output. These vector transitions are 11 → 00,
11 → 01 and 11 → 10. Note that each of these vector transitions exhibits a different output
delay distribution. Similarly, Figure VIII.7 shows the delay histogram for the three vector
transitions which result in a falling output. These vector transitions are 00 → 11, 01 → 11
and 10 → 11. Note that each of these vector transitions also exhibits a different output delay
distribution. The mean and standard deviation of the distributions of each of these vector
transitions were computed and used in the second phase of the proposed algorithm.
During the timing analysis phase of the StatSense approach, the mean and standard
deviation of the delay for a given load capacitance value were obtained by interpolating
between the capacitance values for which the pre-characterization was performed.
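The lookup can be sketched as follows. The text does not state which interpolation scheme was used, so this sketch assumes piecewise-linear interpolation and clamps to the end points outside the characterized range; the table contents and the function name are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical characterization rows: (load capacitance in fF, mu in ps, sigma in ps).
Row = Tuple[float, float, float]


def interpolate_stats(table: List[Row], c_load: float) -> Tuple[float, float]:
    """Linearly interpolate (mu, sigma) between the two nearest characterized loads.

    The table must be sorted by capacitance. Loads outside the table are clamped
    to the first or last row (an assumption of this sketch).
    """
    caps = [row[0] for row in table]
    if c_load <= caps[0]:
        return table[0][1], table[0][2]
    if c_load >= caps[-1]:
        return table[-1][1], table[-1][2]
    hi = bisect_right(caps, c_load)   # first row with capacitance > c_load
    c0, mu0, sg0 = table[hi - 1]
    c1, mu1, sg1 = table[hi]
    w = (c_load - c0) / (c1 - c0)
    return mu0 + w * (mu1 - mu0), sg0 + w * (sg1 - sg0)


table = [(2.0, 20.0, 1.0), (6.0, 40.0, 2.0), (10.0, 60.0, 3.0)]
mu_4fF, sigma_4fF = interpolate_stats(table, 4.0)
```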
Next, the first phase of the StatSense approach was carried out. Sense was used to
find the top few sensitizable critical delays and their corresponding input vector transitions.
The result of the first phase of the StatSense approach is a set of vector transitions on the
primary inputs of the circuit. The experiments were performed for N = 75 (or 50 or 25)
primary input vector transitions that result in the largest circuit delay.
[Figure VIII.6: three histograms of number of samples versus delay (ps); panels (a) 'nand1100.hist', (b) 'nand1101.hist', (c) 'nand1110.hist']
Fig. VIII.6. Characterization of NAND2 delay for all input transitions which cause a rising
output. a) 11 → 00, b) 11 → 01 and c) 11 → 10
[Figure VIII.7: three histograms of number of samples versus delay (ps); panels (a) 'nand0011.hist', (b) 'nand0111.hist', (c) 'nand1011.hist']
Fig. VIII.7. Characterization of NAND2 delay for all input transitions which cause a falling
output. a) 00 → 11, b) 01 → 11 and c) 10 → 11
For the second phase of the StatSense approach, these transitions were propagated
throughout the circuit. Since the input transition at each gate is known, the arrival time
propagation methodology explained in Section VIII-C.2 was used to compute the arrival
time at the gate output. The output delay of the circuit was obtained by performing a linear
traversal of the circuit in levelized order. This step of propagating circuit delays is done
1000 times (or as many times as is required to get a reasonably stable and accurate estimate
of the mean and standard deviation of the maximum delay of the circuit). For each of these
1000 iterations, a random value of delay is chosen for each gate, for the relevant input
vector transitions for that gate. This random value is chosen from a Gaussian distribution
with a µ and σ derived from the pre-characterized table of values for each gate, for the
appropriate input transition at that gate. Note that the µ and σ used for any gate correspond
to the vector transitions that appear at that gate when the primary input vector transition is
applied.
Tables VIII.3 and VIII.4 describe the results of experiments conducted to compare
StatSense with SSTA. The major goal of the work presented in this chapter is to make
statistical static timing analysis more accurate. Hence, the StatSense approach was
compared with Monte-Carlo based SSTA, which is considered to be the most accurate [129].
It is well known that block based SSTA sacrifices some accuracy for speed due to
approximations when propagating PDFs (especially when computing the MAX of two or more
PDFs). Both StatSense and Monte-Carlo based SSTA were implemented in SIS [109]. The
data (the mean and the standard deviation of the delay of each transition) obtained from
characterization of all standard cells in LIB was used for both StatSense and Monte-Carlo
based SSTA. The SSTA experiments in these tables were conducted using 10000 iterations.
The StatSense results were computed using 1000 iterations per primary input vector
transition.
StatSense computes the µ and σ of the delay of a circuit by taking the statistical
maximum of the delay distributions of all input vector transitions. The statistical maximum
was computed as follows. First, a random delay value for each input vector transition of a
circuit is obtained using the corresponding delay distribution. Then the maximum delay
value across all vector transitions is selected to obtain the delay of the circuit. This process
is repeated a large number of times (10000) to obtain the final delay distribution of the
circuit. Note that in all the results presented in this section, the average of the runtime ratios
is computed using a geometric mean, due to the high variability of these ratios. Also note
that the runtimes include the time required to obtain the delay-critical vector transitions
(Phase 1 of the StatSense approach) and the time required to perform Monte Carlo
iterations for all input vector transitions obtained from Phase 1.
In Table VIII.3, Column 1 lists the circuit under consideration. Columns 2 through 4 list
the µ, σ and µ + 3σ delays (in ps) returned by SSTA. Column 5 lists the SSTA runtime. All
runtimes in this table are in seconds. Columns 6 through 11 list the results for StatSense,
when N = 75 input vector transitions were simulated. Columns 6 through 8 list the µ, σ and
µ + 3σ delays (in ps) returned by StatSense. Column 9 reports the ratio of the µ + 3σ value
returned by StatSense, compared to that returned by SSTA. Note that StatSense, on average,
returns a much lower worst case circuit delay (the µ + 3σ delay) than SSTA. This illustrates
the pessimism of SSTA, and validates the claim that StatSense reduces this pessimism.
Columns 10 and 11 respectively list the runtime for StatSense and the ratio of this runtime
versus the runtime of SSTA. Note that, on average, StatSense (run with 75 input vector
transitions) requires about 2.5× the runtime of SSTA. In Table VIII.4, Columns 2 through 7
(and 8 through 13) have the same information as Columns 6 through 11 of Table VIII.3,
except that the StatSense simulations for these columns were performed using 50 (and 25)
input vector transitions (which result in the largest sensitizable circuit delay). The StatSense approach
with 50 (25) input vector transitions is referred to as StatSense50 (StatSense25). The
purpose of this experiment was to verify whether the StatSense runtime can be reduced by
simulating fewer input vector transitions. By comparing Column 9 of Table VIII.3 with
Columns 5 and 11 of Table VIII.4, it can be observed that there is no appreciable loss of
fidelity when 25 input vector transitions are used instead of 75 or 50. The worst case circuit
delay (the µ + 3σ delay), averaged over all designs, is almost identical in all cases. The
benefit of using 25 input vector transitions is indicated in Column 13 of Table VIII.4, which
shows that on average, StatSense (with 25 input vector transitions) requires 5% less runtime
than SSTA.
Table VIII.3. Comparison of SSTA and StatSense for 75 Input Vector Transitions
(Delays in ps, runtimes in seconds. First Ratio: µ+3σ relative to SSTA. Second Ratio: runtime relative to SSTA.)
          SSTA                             |   StatSense 75
Ckt       µ (ps)  σ (ps)     µ+3σ     Time |   µ (ps)  σ (ps)     µ+3σ  Ratio     Time  Ratio
alu2     1008.39   19.08  1065.63    278.8 |   701.30    7.46   723.68   0.68  5095.41  17.78
alu4     1234.77   18.21   1289.4    560.2 |   803.56    7.45   825.91   0.64  6607.43   11.8
apex5     539.04   19.29   596.91    727.6 |   464.68    5.84    482.2   0.81  1481.91   2.04
apex6     680.51   10.95   713.36    632.2 |   538.27    5.41    554.5   0.78  1673.73   2.64
apex7     489.85     8.3   514.75   180.66 |   490.20    5.24   505.92   0.98    323.9   1.79
C499      737.00   11.29   770.87    419.4 |   652.45    5.68   669.49   0.87   636.46   1.52
C880     1037.51   13.56  1078.19    324.9 |   909.20    7.40    931.4   0.86    651.8   2.01
C1355     714.82    8.59   740.59    484.8 |   445.46    4.49   458.93   0.62   673.81   1.39
cordic    669.99    8.60   695.79    657.0 |   620.17    7.14   641.59   0.92  1093.63    1.6
i6        496.16   22.80   564.56    353.0 |   495.55    7.89   519.22   0.92    836.8   2.36
i7        496.25   21.76   561.53    514.3 |   499.26    7.88    522.9   0.93   1019.4   1.98
i10      1858.92   18.72  1915.08  1900.12 |  1323.72   10.25  1354.47   0.71   3584.5   1.89
rot       781.23   13.75   822.48    571.0 |   542.55    6.71   562.68   0.68   1317.9   2.31
x1        319.34   10.40   350.54    261.5 |   303.41    6.14   321.83   0.92   321.68   1.23
AVG                                        |                             0.81            2.49
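The statistical maximum described in this section can be sketched as below. This is a hedged illustration only: it assumes that the delay distribution of each input vector transition is summarized by a Gaussian with its own (µ, σ), and the function name and the example distributions are hypothetical.

```python
import random
import statistics
from typing import List, Tuple


def statistical_max(dists: List[Tuple[float, float]],
                    samples: int = 10000,
                    seed: int = 0) -> Tuple[float, float]:
    """Estimate (mu, sigma) of the maximum over several per-transition delay distributions.

    Each trial draws one delay per input vector transition and keeps the largest one.
    """
    rng = random.Random(seed)
    maxima = [max(rng.gauss(mu, sigma) for mu, sigma in dists)
              for _ in range(samples)]
    return statistics.mean(maxima), statistics.stdev(maxima)


# Hypothetical per-transition circuit delay distributions (ps).
transition_dists = [(500.0, 5.0), (495.0, 6.0), (480.0, 4.0)]
mu_max, sigma_max = statistical_max(transition_dists)
```

The mean of the maximum is at least as large as the largest individual mean, which is why adding further critical transitions can only push the estimated circuit delay upward.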
In spite of the fact that SSTA conducts 10000 STA iterations, and StatSense conducts
75000 (or 50000 or 25000 for StatSense50 and StatSense25 respectively) iterations, the
runtime of StatSense is not 7.5× (or 5× or 2.5×) that of SSTA but rather 2.49× (or
1.74× or 0.95×) that of SSTA. This is because StatSense performs an event driven delay
simulation. Whenever there is no transition at the output of a gate g, delay computations
for gates in the fanout of g are avoided. This pruning is not possible in SSTA.
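The average runtime ratios quoted above are geometric means. As a small check, the sketch below recomputes the average for the StatSense75 runtime ratios listed in Table VIII.3; the helper name is hypothetical, and the arithmetic mean is included only to show how strongly the two outliers (alu2 and alu4) would otherwise dominate.

```python
import math
from typing import Sequence

# Runtime ratios (StatSense75 versus SSTA) as listed in Table VIII.3, alu2 through x1.
RUNTIME_RATIOS_75 = [17.78, 11.8, 2.04, 2.64, 1.79, 1.52, 2.01,
                     1.39, 1.6, 2.36, 1.98, 1.89, 2.31, 1.23]


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean computed in the log domain (all values must be positive)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


geo_avg = geometric_mean(RUNTIME_RATIOS_75)
arith_avg = sum(RUNTIME_RATIOS_75) / len(RUNTIME_RATIOS_75)
```

The geometric mean comes out close to the 2.49 reported in the AVG row, whereas the arithmetic mean of the same ratios is noticeably larger.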
Table VIII.4. Comparison of SSTA and StatSense for 50 and 25 Input Vector Transitions
(Delays in ps, runtimes in seconds. First Ratio: µ+3σ relative to SSTA. Second Ratio: runtime relative to SSTA.)
          StatSense 50                                     |   StatSense 25
Ckt       µ (ps) σ (ps)     µ+3σ  Ratio     Time  Ratio |   µ (ps) σ (ps)     µ+3σ  Ratio     Time  Ratio
alu2      701.12   7.38   723.26   0.68  4018.89  14.02 |   701.25   7.69   724.32   0.68   1232.5   4.42
alu4      800.62   7.80   824.02   0.64   3386.8   6.04 |   799.77   7.93   823.56   0.64   3217.9   5.74
apex5     463.58   6.07   481.79   0.81    981.2   1.34 |   460.52   7.46    482.9   0.81    526.7   0.72
apex6     538.48   5.38   554.62   0.78  1096.42   1.73 |   536.92   5.98   554.86   0.78    787.7   1.24
apex7     490.10   5.26   505.88   0.98   233.17   1.29 |   490.05   5.42   506.31   0.98    124.5   0.69
C499      650.35   6.18   668.89   0.87    481.6   1.15 |   646.23   7.08   667.47   0.87    238.8   0.57
C880      909.13   7.48   931.57   0.86    416.6   1.28 |   907.65   7.80   931.05   0.86   227.07   0.70
C1355     443.67   4.85   458.22   0.62    578.1    1.2 |   440.86   5.64   457.78   0.62    291.8   0.60
cordic    613.14   5.69   630.21   0.91   657.23   1.00 |   612.19   6.03   630.28   0.91    355.3   0.54
i6        493.86   8.39   519.03   0.92    609.5   1.73 |   488.51   9.70   517.61   0.92   293.79   0.83
i7        496.08   8.72   522.24   0.93    494.9   0.96 |   489.83  10.06   520.01   0.93    366.6   0.65
i10      1323.24  10.55  1354.89   0.71   2319.0   1.22 |  1316.12  11.71  1351.25   0.71     1188   0.62
rot       540.15   7.33   562.14   0.68   1343.6   2.35 |   535.59   8.41   560.82   0.68    810.4   1.42
x1        303.51   6.10   321.81   0.92   277.14   1.06 |   303.37   6.28   322.21   0.92    141.6   0.54
AVG                                0.81            1.74 |                            0.81            0.95
Figure VIII.8 illustrates the delay histogram obtained by SSTA (with 50000 STA
iterations) along with the delay histograms obtained by StatSense and SPICE (with 50 input
vector transitions simulated). These histograms were obtained for the apex7 example. For
each input vector transition in StatSense and SPICE, 1000 Monte Carlo iterations of delay
computation were performed. This figure shows how the pessimism of SSTA is alleviated
by StatSense. This figure also shows that the delay distribution obtained by StatSense
closely matches the delay distribution obtained by SPICE. However, the statistical
static timing analysis method of [123] (which is the best known previous approach) results
in 12% higher delay values than SPICE.
[Figure VIII.8: three histograms of number of samples versus delay (ps); panels (a) SSTA apex7, (b) StatSense apex7, (c) SPICE apex7]
Fig. VIII.8. Delay histograms for a) SSTA, b) StatSense and c) SPICE (for apex7)
As mentioned earlier, StatSense addresses two sources of pessimism in statistical static
timing analysis. These two sources are: false paths in a circuit, and the representation of
gate delay distributions. Tables VIII.3 and VIII.4 report the results obtained when both
these issues are addressed simultaneously. To evaluate the accuracy gained by addressing
each of these issues separately, Monte-Carlo based SSTA simulations were performed on 50
sensitizable paths (the same paths which are obtained by StatSense50). In other words,
Monte-Carlo based SSTA analysis was performed on sensitizable paths of the circuits (after
eliminating the false paths). Table VIII.5 compares the results obtained from Monte-Carlo
based SSTA with and without false path elimination, and StatSense (which eliminates false
paths and also models input vector transition based gate delays) with 50 input vector
transitions. For each of the 50 input vector transitions, 1000 Monte Carlo simulations were
performed. In Table VIII.5, Column 1 reports the circuit under consideration. Columns 2
through 4 list the µ, σ and µ + 3σ delays (in ps) returned by SSTA (without false path
elimination). Columns 5 through 7 list the µ, σ and µ + 3σ delays (in ps) returned by
StatSense50. Column 8 reports the ratio of the µ + 3σ value returned by StatSense50,
compared to that returned by SSTA. Columns 9 through 11 list the µ, σ and µ + 3σ delays
(in ps) returned by performing Monte-Carlo based SSTA on the 50 sensitizable paths which
are obtained by StatSense50 (henceforth referred to as SSTA with false path elimination).
Column 12 reports the ratio of the µ + 3σ value returned by SSTA with false path
elimination, compared to that returned by SSTA (without false path elimination). Observe
from Table VIII.5 that StatSense50, on average, reduces the error in the estimation of the
worst case circuit delay by 19% compared to the Monte-Carlo based SSTA (without false
path elimination). Out of this 19% reduction, 9% is due to the false path elimination (as
observed from the results of SSTA with false path elimination in Table VIII.5) and 10% is
due to the use of different delay distributions for different input transitions (which cause a
change in the output) for all gates in LIB. Therefore, to improve the accuracy of statistical
static timing analysis, it is important to consider both false paths in the circuit and also use
different delay distributions for different input transitions at a gate (as done by StatSense).
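The 19% = 9% + 10% split can be reproduced from the two ratio columns of Table VIII.5. The sketch below assumes that the AVG row of these delay-ratio columns is an arithmetic mean (the text specifies a geometric mean only for runtime ratios); the variable names are hypothetical.

```python
# µ+3σ ratios from Table VIII.5 (alu2 through x1), both relative to SSTA without
# false path elimination.
STATSENSE50_RATIOS = [0.68, 0.64, 0.81, 0.78, 0.98, 0.87, 0.86,
                      0.62, 0.91, 0.92, 0.93, 0.71, 0.68, 0.92]
SSTA_FPE_RATIOS = [0.78, 0.75, 0.95, 0.94, 1.00, 0.99, 0.99,
                   0.69, 1.01, 1.01, 1.00, 0.77, 0.82, 1.00]

# Assumption: the AVG row is an arithmetic mean rounded to two decimals.
avg_statsense50 = round(sum(STATSENSE50_RATIOS) / len(STATSENSE50_RATIOS), 2)
avg_ssta_fpe = round(sum(SSTA_FPE_RATIOS) / len(SSTA_FPE_RATIOS), 2)

total_reduction = round(1.0 - avg_statsense50, 2)        # both effects together
false_path_part = round(1.0 - avg_ssta_fpe, 2)           # false path elimination alone
distribution_part = round(total_reduction - false_path_part, 2)  # remainder
```

Under this assumption the computed averages match the AVG row of the table, and the remainder after removing the false path contribution is attributed to the per-transition delay distributions.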
Table VIII.5. Comparison of SSTA, StatSense50 and SSTA without False Paths
(Delays in ps. Ratio: µ+3σ relative to SSTA without false path elimination.)
          SSTA                     |   StatSense50                    |   SSTA with False Path Elimination
Ckt       µ (ps) σ (ps)     µ+3σ |   µ (ps) σ (ps)     µ+3σ  Ratio |   µ (ps) σ (ps)     µ+3σ  Ratio
alu2     1008.39  19.08  1065.63 |   701.12   7.38   723.26   0.68 |   810.23   7.59      833   0.78
alu4     1234.77  18.21   1289.4 |   800.62   7.80   824.02   0.64 |   937.73   9.20   965.33   0.75
apex5     539.04  19.29   596.91 |   463.58   6.07   481.79   0.81 |   538.89   9.04   566.01   0.95
apex6     680.51  10.95   713.36 |   538.48   5.38   554.62   0.78 |   657.18   4.72   671.34   0.94
apex7     489.85    8.3   514.75 |   490.10   5.26   505.88   0.98 |   501.49   4.45   514.84   1.00
C499         737  11.29   770.87 |   650.35   6.18   668.89   0.87 |   748.65   5.85    766.2   0.99
C880     1037.51  13.56  1078.19 |   909.13   7.48   931.57   0.86 |  1047.92   7.23  1069.61   0.99
C1355     714.82   8.59   740.59 |   443.67   4.85   458.22   0.62 |   496.77   4.17   509.28   0.69
cordic    669.99    8.6   695.79 |   613.14   5.69   630.21   0.91 |   683.75   5.46   700.13   1.01
i6        496.16   22.8   564.56 |   493.86   8.39   519.03   0.92 |   537.32  10.64   569.24   1.01
i7        496.25  21.76   561.53 |   496.08   8.72   522.24   0.93 |   532.27  10.37   563.38   1.00
i10      1858.92  18.72  1915.08 |  1323.24  10.55  1354.89   0.71 |  1454.71   8.34  1479.73   0.77
rot       781.23  13.75   822.48 |   540.15   7.33   562.14   0.68 |   656.28   5.37   672.39   0.82
x1        319.34   10.4   350.54 |   303.51   6.10   321.81   0.92 |   337.18   4.96   352.06   1.00
AVG                              |                            0.81 |                            0.91
VIII-D.1. Determining the Number of Input Vector Transitions N
The number of input vector transitions required to perform an accurate statistical timing
analysis can be obtained as follows. Note that as the number of input vector transitions
N is increased, the mean delay increases, while the standard deviation decreases. After
a certain number N1 of input vector transitions, the mean and the standard deviation of
the delay will not change with an increase in the number of input vector transitions. This
implies that when N ≥ N1, all delay critical input vector transitions have already been
considered and the new input vector transitions (the (N1 + 1)th, (N1 + 2)th, ... vectors)
never become critical under process variations. Therefore, N1 input vector transitions are
sufficient for an accurate statistical delay estimation. Although this method for calculating
N was not used in this work, based on the results it is expected that N1 is close to 75 for all
the benchmark circuits analyzed in this work.
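The stopping rule described above could be sketched as follows. Since this procedure was not used in the work itself, everything here is an assumption: the relative tolerance, the candidate values of N, the function names and the example estimates are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple


def find_stable_n(estimate: Callable[[int], Tuple[float, float]],
                  candidates: Iterable[int],
                  rel_tol: float = 0.01) -> Optional[int]:
    """Return the first N after which neither mu nor sigma changes by more than rel_tol.

    estimate(N) is assumed to run the timing analysis with the top-N input vector
    transitions and return (mu, sigma) of the circuit delay.
    """
    prev_n: Optional[int] = None
    prev_mu = prev_sigma = 0.0
    for n in candidates:
        mu, sigma = estimate(n)
        if prev_n is not None:
            mu_stable = abs(mu - prev_mu) <= rel_tol * prev_mu
            sigma_stable = abs(sigma - prev_sigma) <= rel_tol * prev_sigma
            if mu_stable and sigma_stable:
                return prev_n
        prev_n, prev_mu, prev_sigma = n, mu, sigma
    return None  # no stable point found among the candidates


# Hypothetical estimates: the mean grows and sigma shrinks until they settle.
fake_results = {25: (500.0, 8.0), 50: (505.0, 6.0), 75: (505.5, 5.98), 100: (505.6, 5.97)}
n1 = find_stable_n(lambda n: fake_results[n], [25, 50, 75, 100])
```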
VIII-E. Chapter Summary
In recent times, the impact of process variations has become increasingly significant.
Process variations have been growing larger and less systematic with each process generation.
In response to this, there has been much research in extending traditional static timing
analysis so that it can be performed statistically. The resulting statistical static timing analysis
(SSTA) approaches are, however, still quite pessimistic. This pessimism arises from the
fact that most static timing analysis tools and their statistical counterparts do not consider
false paths. The second major source of pessimism is that statistical static timing analyzers
assume rising and falling delay distributions for all gates in a design to be a single
Gaussian. However, the delay distribution of a gate is not necessarily Gaussian. In fact the delay
distribution for a multi-input gate is Gaussian for each input vector transition that causes a
change on the gate output.
This chapter presented a sensitizable statistical static timing analysis (which is referred
to as StatSense) technique to overcome the pessimism of SSTA. The StatSense approach
implicitly eliminates false paths, and also uses different delay distributions for the different
input transitions of any gate. These features enable the StatSense approach to perform less
conservative timing analysis than the SSTA approach. Experimental results show that on
average, the worst case (µ + 3σ) circuit delay reported by StatSense is about 19% lower
than that reported by SSTA.
The next chapter describes a process variation tolerant combinational circuit design
approach developed in this dissertation.
CHAPTER IX
VARIATION TOLERANT DESIGN - A VARIATION TOLERANT COMBINATIONAL
CIRCUIT DESIGN APPROACH USING PARALLEL GATES
IX-A. Introduction
The increasing variation in device parameters leads to large variations in the performance
of different die of the same wafer, resulting in a significant yield loss. This yield loss
translates into higher manufacturing costs. Therefore, it is important to design process
variation tolerant circuits, to improve yield and lower manufacturing costs.
In this dissertation, two design approaches are developed to improve the process varia-
tion tolerance of combinational circuits (described in this chapter) and voltage level shifters
(discussed in the next chapter), respectively.
The process variation tolerant design approach for combinational circuits proposed
in this chapter exploits the fact that random variations can cause a significant mismatch
in the electrical performance of two identical devices placed next to each other on the
die. In the proposed approach, a large gate is implemented using an appropriate number
(> 1) of smaller gates, whose inputs and outputs are connected to each other in parallel.
This parallel connection of smaller gates to form a large gate is referred to as a parallel
gate. Since the L and VT variations are largely random (as discussed in Chapter I) and
have independent variations in the smaller gates, the variation tolerance of the parallel
gate is improved. The parallel gates are implemented as single layout cells. By careful
diffusion sharing in the layout of the parallel gates, it is possible to reduce the input and
output capacitance of the gates, thereby improving the nominal circuit delay as well. An
algorithm is also presented to selectively replace critical gates in a circuit by their parallel
counterparts, in order to improve the variation tolerance of the circuit. Experimental results
presented in Section IX-D demonstrate that this process variation tolerant design approach
achieves significant improvements in circuit level variation tolerance.
The rest of the chapter is organized as follows. Section IX-B briefly discusses some
previous work in this area. In Section IX-C, the proposed process variation tolerant design
approach for combinational circuits is described. Experimental results are presented in
Section IX-D, followed by the chapter summary in Section IX-E.
IX-B. Related Previous Work
As mentioned in the previous chapter, process variation tolerant design has been an active
research topic for several decades. Various approaches have been developed to efficiently
analyze the effects of variations on the performance of a circuit [25, 117, 119, 120] as well
as to design process variation tolerant circuits [130, 131, 132, 133].
In order to perform statistical circuit analysis and optimization, it is important to iden-
tify and characterize variation sources. Different circuit structures are reported [134, 31, 9,
30] to characterize and extract process variations (for both random and systematic variation
components). In [31], significant variations were observed in the extracted threshold volt-
age values, and large mismatches were observed in adjacent SRAM devices fabricated in a
65 nm process. It was also argued that the large variation in VT is mainly due to the random
dopant fluctuations. The authors of [30] observed that the variations in L, VT and mobility
are major contributors to the overall variations in the performance of a circuit fabricated in
a 65 nm SOI process. The variations in L and VT were found to be normally distributed,
with negligible spatial correlation [30]. This suggests that random variations are becoming
more problematic than the systematic variations.
To evaluate the performance of a circuit under process variations, statistical timing
analysis of a circuit is typically performed [25, 117, 119, 120, 135]. Some of the statistical
timing analysis approaches have already been discussed in Chapter VIII.
In [130], the authors perform gate sizing to improve the variation tolerance of digital
circuits at the expense of an increase in the mean delay of the circuit. Thus, this approach
does not improve the timing yield. A bidirectional adaptive body bias (ABB) technique is
used to compensate for parameter variations in [136]. In this technique, a non-zero voltage
is applied between the body and the source terminal to control the threshold voltage of tran-
sistors (and hence the speed of a circuit). In [137, 131], the authors use both adaptive body
bias (ABB) as well as adaptive supply voltage (ASV) to reduce the impact of the process
variations. Using this technique, the number of die accepted in the highest three frequency
bins increases to 98% [137] from 58%. Although ABB with ASV is very effective in im-
proving yield, this technique can only be used to compensate for systematic variations. It
is not feasible to apply a different body bias (and/or different supply voltages) to different
gates in a circuit to compensate for random variations. Therefore, ABB (with or without
ASV) cannot be used to deal with random variations. Since the variations of L and VT are
mostly random in nature, there is the need to develop techniques to reduce the impact of
these random variations. Also, with diminishing feature sizes, the body effect coefficient
is decreasing [138] and therefore, ABB based approaches will not be effective for future
technologies.
In [139], the authors present a defect tolerant design technique for nanodevices. In
their approach, redundancy is added at the transistor level by replacing each transistor by a
quadded transistor. This is done to improve the functional reliability of a design against
permanent defects such as stuck-open, stuck-shorts, and bridges. Since each transistor in a
design is replaced by 4 transistors, the area overhead of this approach is very large (>100%
as reported in the paper). Another defect tolerant approach was presented in [140], where
the authors duplicate transistors in a voter circuit to improve the functional reliability of
triple modular redundancy based fault-tolerant systems. The area overhead of this approach
is also very high (∼228%). The approaches of [139] and [140] try to improve only the
functional reliability of a design at the cost of area and delay overheads. These approaches
do not reduce the variability in the performance of the design, which is the goal of the work
presented in this chapter. In contrast to these approaches, the proposed approach splits
transistors to reduce both the mean and the standard deviation of the delay of a circuit.
IX-C. Process Variation Tolerant Combinational Circuit Design
In Section IX-C.1, the variations considered in this chapter are described. Section IX-C.2
describes the proposed variation tolerant standard cell design approach. To improve the
variation tolerance of a circuit, the gates in the circuit whose random variations result in a
significant variability in the delay of the circuit are to be replaced by their variation tolerant
counterparts. A circuit level approach proposed to improve variation tolerance is described
in Section IX-C.3.
IX-C.1. Process Variations
In this work, random variations in the L and VT parameters of devices are considered, since
these are the key parameters for determining the performance of a circuit. The authors
of [30] extracted the variations in L and VT for devices fabricated in a 65 nm SOI process.
They found that the L and VT of transistors are normally distributed and vary independently.
The ratio of the standard deviation to the mean is 5% for L, and 9% for VT . Based on
this, both L and VT of transistors are assumed to vary independently. Also, the standard
deviation of L (σL) is taken to be 5% of its nominal value [30]. The standard deviation
of the threshold voltage (σVT) is inversely proportional to the square root of the width (W)
of a transistor, i.e. σVT(W) ∝ 1/√W [141, 8, 10]. In [10], it is also reported that the σVT of
transistors varies with the channel width by a factor of at most 2. In other words, the σVT of
a very large device is approximately half of the σVT of the smallest device [10]. This
observation is based on extracted data from several test chips. Thus, the σVT for the smallest
device (with a width of W = Wmin; see footnote 1) is taken to be 9% of the nominal threshold
voltage value [30], and the σVT of a very large device is taken to be 4.5%. The σVT for an
arbitrary device with a width of W is obtained using the following relation:
\sigma_{V_T}(W) = \max\left\{ \sigma_{V_T}(W_{min}) \sqrt{\frac{W_{min}}{W}}, \; \frac{\sigma_{V_T}(W_{min})}{2} \right\}    (9.1)
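Equation 9.1 translates directly into a small helper. The sketch below expresses σVT as a fraction of the nominal threshold voltage, using the 9% value for the smallest device and Wmin = 130 nm quoted in this section; the function and parameter names are hypothetical.

```python
import math


def sigma_vt(width_nm: float,
             sigma_vt_min_width: float = 0.09,
             w_min_nm: float = 130.0) -> float:
    """Relative sigma of VT for a device of the given width, following Equation 9.1.

    The 1/sqrt(W) scaling is floored at half of the minimum-width value.
    """
    if width_nm < w_min_nm:
        raise ValueError("width is below the minimum device width")
    scaled = sigma_vt_min_width * math.sqrt(w_min_nm / width_nm)
    return max(scaled, sigma_vt_min_width / 2.0)


sigma_smallest = sigma_vt(130.0)    # minimum-width device
sigma_2x = sigma_vt(260.0)          # twice the minimum width
sigma_large = sigma_vt(5200.0)      # very wide device, limited by the floor
```

In this model the floor takes over once a device is at least four times the minimum width, since sqrt(Wmin/W) then drops to 1/2 or below.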
IX-C.2. Variation Tolerant Standard Cell Design
Random variations can cause a significant mismatch in the electrical performance of two
identical devices placed next to each other. This phenomenon is utilized to design variation
tolerant standard cells. Consider the 4× inverter shown in Figure IX.1 (a). Assume that the
transistor M1 (M2) of the 4× INV of Figure IX.1 (a) is implemented as a single NMOS
(PMOS) transistor in the layout. This 4× INV is referred to as a regular inverter (an inverter
implemented using a single PMOS and a single NMOS transistor). The L and VT of M1
and M2 can vary randomly, which will directly affect the delay of the 4× INV, as well
as the slew at the node out. This can also increase the delay variability of the circuit in
which this inverter resides. To reduce the delay variability and the slew of the output of this
INV due to random variations in L and VT, the 4× INV is implemented by connecting two
2× inverters in parallel as shown in Figure IX.1 (b). This implementation of the 4× INV
(as shown in Figure IX.1 (b)) is referred to as a parallel inverter. The parallel 4× INV is
more tolerant to random variations than a regular 4× INV since the variations in the L
and VT of transistors M1 and M3 are independent and hence they tend to cancel each other.
Similarly, the L and VT variations of transistors M2 and M4 tend to cancel each other. Thus,
the impact of random variations on the delay (and the slew) of the output of the parallel 4×
INV is lower than that of the regular 4× INV. Monte Carlo simulations were performed to
verify that the parallel 4× INV is more tolerant to random variation than the regular 4×
INV. The results are presented in Section IX-D.
Footnote 1: The smallest device is a device with a width of 2× the feature size (or Lmin). For a 65 nm
process, the smallest device has a width Wmin = 130nm.
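The cancellation argument can be illustrated with a toy Monte Carlo experiment. This is not the SPICE-based experiment of Section IX-D: it assumes a first-order model in which the drive strength of a device is proportional to W/L, varies only L (with σ equal to 5% of nominal, independent of width), and ignores VT. All names and parameters are hypothetical.

```python
import random
import statistics


def relative_drive_spread(n_parallel: int,
                          total_width: float = 4.0,
                          sigma_l: float = 0.05,
                          trials: int = 20000,
                          seed: int = 1) -> float:
    """Relative spread (sigma/mean) of the total drive of n_parallel equal devices.

    Each device gets an independent normalized channel length L ~ N(1, sigma_l),
    and its drive is modeled as width / L (a first-order assumption).
    """
    rng = random.Random(seed)
    width = total_width / n_parallel
    drives = [sum(width / rng.gauss(1.0, sigma_l) for _ in range(n_parallel))
              for _ in range(trials)]
    return statistics.stdev(drives) / statistics.mean(drives)


regular = relative_drive_spread(1)    # one 4x device
parallel = relative_drive_spread(2)   # two 2x devices with independent L
spread_ratio = parallel / regular
```

Under these assumptions the spread of the two-device version is roughly 1/√2 of the single-device spread, which is the averaging effect the parallel gate relies on. The picture for VT is less simple, because narrower devices have a larger σVT according to Equation 9.1.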
Fig. IX.1. 4× inverter implementations: (a) regular 4× inverter (one 4× INV, transistors M1 and M2), (b) two 2× inverters connected in parallel (transistors M1 to M4)
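The cancellation argument can be illustrated with a small Monte Carlo sketch. This is not the SPICE setup used in this work. It is a toy first-order model which assumes that the drive strength of each device is proportional to W/L, that only L varies (with a relative σ of 5%, independently for each device), and that the delay is inversely proportional to the total drive. The function name delay_sigma and all normalized values are hypothetical.

```python
import random
import statistics


def delay_sigma(num_devices, sigma_rel=0.05, samples=20000, seed=1):
    """Standard deviation of a normalized delay for num_devices devices in parallel.

    Toy model (assumption): every device has width 1/num_devices and an
    independently varying normalized channel length, drive ~ W/L, and
    delay ~ 1 / (total drive).
    """
    rng = random.Random(seed)
    delays = []
    for _ in range(samples):
        drive = 0.0
        for _ in range(num_devices):
            length = rng.gauss(1.0, sigma_rel)  # normalized L of one device
            drive += (1.0 / num_devices) / length
        delays.append(1.0 / drive)
    return statistics.pstdev(delays)


regular = delay_sigma(1)   # one large device
parallel = delay_sigma(2)  # two half-width devices in parallel
print(f"sigma(delay), regular : {regular:.4f}")
print(f"sigma(delay), parallel: {parallel:.4f}")
print(f"ratio parallel/regular: {parallel / regular:.2f}")
```

Under these assumptions the ratio is close to 1/√2 for two devices in parallel. The VT contribution is deliberately left out, since σVT is computed separately using the method of Section IX-C.1, and the SPICE results of Section IX-D remain the actual measure of the improvement.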
Using this approach, a variation tolerant complex gate G of size k× is designed by
implementing it as a parallel connection of 2 (or more) smaller gates of size k/2× (or k/3×,
k/4×, ...), with the same functionality as the gate G. In other words, instead of using large
PMOS and large NMOS transistors to implement G, small PMOS and NMOS transistors
are used, and connected in parallel to improve the variation tolerance of G. Figure IX.2
shows the regular and the variation tolerant (parallel) versions of a 2-input NAND gate.
Note that the number of small transistors that can be connected in parallel to implement a
large transistor (inside a gate) is constrained by the width of the smallest device that can be
fabricated. This fact is taken into consideration while designing variation tolerant parallel
gates. For example, in a regular NOR2 gate of minimum size, both NMOS transistors are
of minimum width. Therefore, in the parallel version of this gate there are still 2 minimum
width NMOS transistors (connected to the two inputs). Since the PMOS transistors of
the regular NOR2 gates are 5× devices, they can be implemented using smaller PMOS
transistors connected in parallel. Layouts were created for regular and parallel versions of
all gates in the library (LIB) used in this work. The parallel gates were implemented as
single layout cells. The same cell height was used for the regular and the parallel versions
of any gate. This permits seamless placement and routing of a circuit using both regular
as well as parallel gates. Since the parallel gates utilize more transistors than their regular
counterparts and both have the same height, a parallel gate needs a larger cell width, and
hence a larger layout area, than the corresponding regular gate. In order to limit the area
overhead of the proposed approach,
regular gates in a circuit are selectively replaced by parallel gates to improve the variation
tolerance of the circuit. The approach used to select regular gates to be replaced is explained
in Section IX-C.3.

Fig. IX.2. 2-input NAND gate: (a) regular NAND2, (b) parallel NAND2
Apart from increasing the variation tolerance of a gate, another advantage of the ap-
proach proposed in this chapter is that the input capacitance (Cin) of any pin of a parallel
gate G is lower than the input capacitance of the corresponding pin of the regular gate. This
is explained next. Consider a regular and a parallel inverter of equivalent size, shown in
Figure IX.3. Figure IX.3 also shows the capacitance at the input and the output nodes of
both inverters. The capacitance CG is the sum of the gate capacitances of the transistors
M1 and M2 (M1, M2, M3 and M4) of the regular (parallel) inverter. CD (C′D) is the total
output diffusion capacitance of the regular (parallel) inverter. CM is the Miller capacitance
between the input and the output of both inverters. The parallel inverter has two PMOS
(NMOS) transistors in parallel. Therefore, in the layout of this inverter, transistors M2 and
M4 will share their drain diffusion. Similarly, M1 and M3 will also share their drain diffusion.
Thus, the total area of the output diffusion node is lower in the parallel inverter compared
to the regular inverter. This implies that C′D < CD. Note that CG and CM are identical for
both regular and parallel inverters, since the total width of PMOS and NMOS devices is
equal in both these inverters. The input capacitance of the regular (parallel) inverter Cin
(C′in) depends on CG, CM and CD (C′D). In particular, Cin = CG + CD·CM/(CD + CM) and
C′in = CG + C′D·CM/(C′D + CM), i.e. CG in parallel with the series combination of CM and
the output diffusion capacitance. Since this series term increases monotonically with the
diffusion capacitance and C′D < CD, the input capacitance of the parallel inverter is lower
than the input capacitance of the regular inverter. Note that since C′D < CD,
the intrinsic delay is also lower for the parallel inverter compared to the regular inverter.
The lower input capacitance and the lower intrinsic delay of the parallel gates help in
reducing the circuit level delay. Thus, the use of parallel gates in a circuit (instead of reg-
ular gates) can reduce both the mean (µ) and the standard deviation (σ) of the delay of the
circuit. The delay limited yield also improves since the worst case circuit delay (µ + 3σ)
decreases. This is demonstrated for several benchmark circuits in Section IX-D. Another
advantage of using parallel gates in a circuit is that the dynamic power consumption of the
circuit reduces because of the lower input and output capacitances of the parallel gates.
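The input capacitance expressions above can be evaluated numerically with a short sketch. The capacitance values below are hypothetical and chosen only for illustration, and the 15% reduction of the diffusion capacitance is an assumption, not a measured value for any particular cell.

```python
def input_capacitance(c_gate, c_miller, c_diff):
    """C_in = C_G + (C_D * C_M) / (C_D + C_M), as in the expressions above."""
    return c_gate + (c_diff * c_miller) / (c_diff + c_miller)


# Hypothetical capacitances in fF (illustrative only).
C_G = 2.0
C_M = 0.3
C_D_REGULAR = 1.0
C_D_PARALLEL = 0.85  # assumed 15% smaller because of diffusion sharing

cin_regular = input_capacitance(C_G, C_M, C_D_REGULAR)
cin_parallel = input_capacitance(C_G, C_M, C_D_PARALLEL)
reduction = 100.0 * (cin_regular - cin_parallel) / cin_regular

print(f"Cin  (regular) : {cin_regular:.4f} fF")
print(f"C'in (parallel): {cin_parallel:.4f} fF")
print(f"reduction      : {reduction:.2f} %")
```

Because CG and CM are unchanged, the reduction in the input capacitance is much smaller than the reduction in the diffusion capacitance itself, which is consistent with the small average input capacitance improvement reported in Section IX-D.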
Fig. IX.3. Capacitance of various nodes: (a) regular inverter (CG, CM, CD, Cin), (b) parallel inverter (CG, CM, C′D, C′in)
IX-C.3. Variation Tolerant Combinational Circuits
As mentioned earlier, the layout area of parallel gates is higher than that of regular gates.
Therefore, replacing regular gates with parallel gates in a circuit (to improve its tolerance to
random variations) would incur an area penalty. To minimize this area penalty, only those
gates which contribute signicantly to the performance variability of the circuit are replaced
by their variation tolerant parallel counterparts. In this work, such gates are identied based
on their deterministic slack value. It is reasonable to expect that delay variations of a gate
with a low slack value are likely to induce signicant variations in the performance of the
circuit. Therefore, regular gates in a circuit are replaced by parallel gates in increasing
order of their slack value. The number of gates in a circuit that will be replaced depends
on a user specied area constraint. A user specied number P which is the fraction of total
number of gates N in a circuit to be replaced with their variation tolerant counterparts, is
used in the proposed approach. This number P can be obtained from the amount of area
penalty that is acceptable.
First, the slack is computed for all the gates in the circuit (η) and then the critical gates
are identified and replaced using Algorithm 4. For a given circuit η, first sort all gates G ∈ η
in an increasing order of their slack values, and store them in a list L. Then replace the top
P ∗ N gates in the list L with their corresponding variation tolerant parallel gates.
The resulting variation tolerant circuit is referred to as η∗. This approach is
referred to as the percentage gate replacement approach.
Algorithm 4 Replacing critical gates in η to improve its variation tolerance
Increase_variation_tolerance(η, P)
  L = sort(G ∈ η, Slack(G))
  i = 0
  N = number_of_gates(η)
  while i < P ∗ N do
    G = L(i)
    Replace G by Gparallel
    i = i + 1
  end while
  return η∗
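The percentage gate replacement approach can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in this work: the Gate record, the convention of appending "P" to the cell name, and the example slack values are all hypothetical.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gate:
    name: str
    cell: str        # library cell, e.g. "INV4X"
    slack: float     # deterministic slack of the gate
    parallel: bool = False


def increase_variation_tolerance(gates, fraction):
    """Replace the lowest-slack fraction of the gates by their parallel versions."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    ranked = sorted(gates, key=lambda g: g.slack)        # increasing slack
    # Same count as the loop "while i < P * N"; the small epsilon guards
    # against floating-point round-off in fraction * N.
    budget = math.ceil(fraction * len(gates) - 1e-9)
    critical = {g.name for g in ranked[:budget]}
    return [
        replace(g, cell=g.cell + "P", parallel=True) if g.name in critical else g
        for g in gates
    ]


circuit = [
    Gate("g1", "NAND2X2", 0.10),
    Gate("g2", "INV4X", 0.02),
    Gate("g3", "NOR2X2", 0.30),
    Gate("g4", "INV2X", 0.05),
]
tolerant = increase_variation_tolerance(circuit, 0.5)
print([g.cell for g in tolerant])
```

With P = 0.5 the two gates with the lowest slack (g2 and g4 in this hypothetical netlist) are replaced, while the original gate order of the netlist is preserved.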
The identication of critical gates (under process variations) for circuit optimization is
an active research topic [142, 143, 144, 145, 146]. To keep the proposed approach efcient,
the deterministic slack value is used to identify critical gates in a circuit. However, it is
possible to use the approach of [142] to identify gates in a circuit which are critical under
variations, and replace them with the proposed variation tolerant parallel gates. Using the
approach of [142], the area overhead of the proposed approach may be lowered. It is also
possible to improve the reduction in the delay variability of a circuit.
IX-D. Experimental Results
Monte Carlo simulations of the 4× regular and parallel INVs of Figure IX.1 were performed
using SPICE [38]. A 65 nm PTM [45] model card was used, with VDD = 1 V. The
L and VT of the different transistors of these inverters were varied independently. Also,
σL was taken to be 5% of its nominal value [30] and σVT was computed using the method
described in Section IX-C.1. Figure IX.4 (a) shows the standard deviation of the rising and
the falling delay of both the regular and the parallel 4× INVs, for different loads (assuming
an input slew of 30 ps). Figure IX.4 (b) compares the worst case (µ+3σ) rising and falling
delays and Figure IX.4 (c) plots the standard deviation of the output slew. Figure IX.4
clearly shows that the parallel 4× INV is less impacted by random variations in the L and
VT of devices compared to the regular 4× INV. The worst case rising and falling delays of
the parallel INV are lower than those of the regular inverter by ∼ 8% and ∼ 4% respec-
tively. Therefore, the parallel INV of Figure IX.1 is more tolerant to random variations
compared to the regular inverter.
A standard cell library (LIB) was implemented using a 65 nm PTM [45] model card,
with VDD = 1.0 V. The standard cell library LIB consists of regular INV2X, INV4X, INV8X,
NAND2X2, NAND2X4, NAND3X2, NOR2X2 and NOR3X2 gates (see footnote 2). The variation
tolerant (parallel) versions of all the gates in LIB were also designed. For regular INV2X,
INV4X, NAND2X2, NAND3X2, NOR2X2 and NOR3X2 gates, one parallel version (INV2XP,
INV4XP, NAND2X2P, NAND3X2P, NOR2X2P and NOR3X2P) was created. Two parallel
versions were created for INV8X and NAND2X4. INV8X can be implemented as either
2 INV4X’s in parallel (INV8XP1) or 4 INV2X’s in parallel (INV8XP2). Similarly, the
parallel versions of NAND2X4 are NAND2X4P1 and NAND2X4P2, where NAND2X4P2
utilizes more NMOS and PMOS transistors connected in parallel than NAND2X4P1. The
minimum width of a transistor that can be fabricated in a 65 nm process is 130 nm. This
was taken into account while creating variation tolerant (parallel) gates. Note that in some
gates (INV2X, NOR2X2 and NOR3X2) only the PMOS devices could be parallelized since
the NMOS devices were minimum sized. Layouts were created for all regular and parallel
gates using CADENCE SEDSM [110] tools, paying careful attention to invoke diffusion
sharing whenever feasible.
2 INV2X is the smallest inverter that can be manufactured. The width of the NMOS (PMOS)
transistor of INV2X is 130 nm (325 nm).

Fig. IX.4. Results for 4× regular and parallel inverters: (a) standard deviation of delay, (b)
worst case delay and (c) standard deviation of output slew

The standard cells in LIB (both regular and parallel versions) were characterized to
construct 2-D lookup tables for the values of the mean and standard deviation of the input
pin to output delay, as well as the output slew. This pre-characterization was done for a set
of load capacitance and input slew values. Table IX.1 compares the mean (µ), the standard
deviation (σ) and the worst case delay (µ+3σ) of all regular gates in LIB along with their
variation tolerant counterparts, for a load value of 5 fF and an input slew of 30 ps. In this
Table the INV8X and NAND2X4 gates are compared with the INV8XP2 and NAND2X4P2
gates respectively. For multiple input gates, the input pin with the largest mean pin to output
delay value was used for this comparison. Table IX.1 also compares the layout area, and
the mean and standard deviation of the sub-threshold leakage current of the regular and the
parallel gates. The sub-threshold leakage of a gate is obtained for the input state which
maximizes its value. In Table IX.1, Column 1 reports gates under consideration. Columns
2 to 10 report the results obtained for the regular gates in LIB. Columns 11 to 19 report the
ratio of any quantity for the parallel gate compared to the value of the same quantity for the
regular gate. For example, Column 11 reports the ratio of the mean rising delay of a parallel
gate to the mean rising delay of the corresponding regular gate. As reported in Table IX.1,
on average the standard deviation of the rising (falling) delay of the parallel gates is lower
by 31% (15%) compared to the regular gates. Also, the mean and the worst case rising
(falling) delays of the parallel gates are lower by 2% and 10% (1% and 4%) respectively
compared to the regular versions. Hence, the proposed parallel gates are more tolerant to
random variations than the regular gates. The average layout area of the parallel gates is
higher by about 60% (an average area ratio of 1.61) compared to regular gates. The average (over all parallel gates) of the mean
(under process variations) sub-threshold leakage current is 1.01× that of the regular gates.
However, the average (over all parallel gates) of the standard deviation of the sub-threshold
leakage current is 31% lower compared to the same quantity for regular gates. Thus, the
approach proposed in this chapter also reduces the variability in the sub-threshold leakage
current, with a small increase in its mean value. Note that the input pin capacitance and the
output capacitance of the parallel gates are smaller by 2.5% and 15% on average, compared
to the corresponding capacitances of the regular gates. The improvements in the delay µ, σ
and µ+3σ are higher for rising transitions because the PMOS devices are nominally larger
than the NMOS devices, and hence offer more opportunities for parallelization.
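The averages quoted above can be cross-checked against the per-gate ratios of Table IX.1. The sketch below transcribes the ratio columns (Columns 11 to 19) of that table and recomputes the column averages; the dictionary and column names are hypothetical labels introduced here. Small differences in the last digit with respect to the AVG row are expected, since the per-gate ratios are themselves rounded to two decimals.

```python
# Parallel/regular ratios transcribed from Table IX.1 (Columns 11 to 19).
COLUMNS = (
    "rise_mu", "rise_sigma", "rise_worst",
    "fall_mu", "fall_sigma", "fall_worst",
    "area", "leak_mu", "leak_sigma",
)
RATIOS = {
    "INV2X":   (0.99, 0.79, 0.92, 0.98, 0.99, 0.98, 1.33, 1.05, 0.73),
    "INV4X":   (0.97, 0.81, 0.92, 0.99, 0.84, 0.96, 1.33, 0.92, 0.64),
    "INV8X":   (0.98, 0.55, 0.83, 0.98, 0.65, 0.90, 1.50, 1.04, 0.53),
    "NAND2X2": (0.99, 0.66, 0.88, 1.00, 0.82, 0.97, 1.50, 1.09, 0.89),
    "NAND2X4": (0.98, 0.56, 0.85, 1.01, 0.67, 0.96, 2.50, 1.10, 0.58),
    "NAND3X2": (0.97, 0.74, 0.89, 0.98, 0.83, 0.97, 1.60, 1.02, 0.74),
    "NOR2X2":  (0.98, 0.75, 0.93, 0.98, 1.01, 0.99, 1.50, 0.87, 0.66),
    "NOR3X2":  (0.99, 0.69, 0.93, 0.97, 0.97, 0.97, 1.60, 0.97, 0.71),
}

averages = {
    column: sum(row[i] for row in RATIOS.values()) / len(RATIOS)
    for i, column in enumerate(COLUMNS)
}

for column, value in averages.items():
    print(f"{column:11s} {value:.3f}  ({100.0 * (value - 1.0):+.1f} %)")
```

A ratio below 1 means that the parallel gate improves on the regular gate for that quantity, so for example the rising delay σ average corresponds to the 31% reduction quoted in the text.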
Several ISCAS and MCNC benchmark circuits were mapped using LIB, for both area
and delay optimality. For a mapped design, the slack of all the gates in the design was
computed, and then the gates were sorted in order of increasing slack. After this, the gates with
the lowest slack in the design were replaced by their variation tolerant counterparts using
the percentage based gate replacement approach described in Section IX-C.3 (until a fraction
P of the total number of gates in the design are replaced). For both regular area and delay
mapped designs (mapping was performed in SIS [109]), two variation tolerant versions
were generated using parallel gates. In the first version, the INV8XP1 and NAND2X4P1
parallel versions were used for the regular INV8X and NAND2X4 gates respectively (for all
other regular gates, note that there is only one parallel version). The resulting area and delay
mapped designs are referred to as AM1 and DM1 designs respectively. In the second version
of variation tolerant circuits, the INV8XP2 and NAND2X4P2 parallel versions were
used for regular INV8X and NAND2X4. The resulting area and delay mapped designs are
referred to as AM2 and DM2 respectively. Note that since INV8X and NAND2X4 have a
large area, the area mapped designs did not utilize these gates. Hence, the AM1 and AM2
results are identical, and the results are presented only for AM1. On the other hand,
delay mapped designs use these large gates heavily, and thus significant differences exist
between DM1 and DM2. The results are presented for both DM1 and DM2.
Table IX.1. Comparison of Regular and Parallel Gates

Regular gates
Gate      Rising Delay (ps)       Falling Delay (ps)      Area     Leakage (nA)
          µ      σ     µ+3σ       µ      σ     µ+3σ       (µm²)    µ        σ
INV2X     30.72  4.91  45.44      32.42  3.43  42.71      1.62     67.96    418.17
INV4X     20.24  2.85  28.80      20.59  2.12  26.95      1.62     130.70   762.48
INV8X     14.16  2.56  21.85      14.75  1.60  19.55      2.16     252.18   1622.38
NAND2X2   34.59  5.79  51.95      32.11  1.99  38.09      2.16     122.80   463.55
NAND2X4   24.13  3.46  34.51      22.65  1.13  26.05      2.16     257.26   1081.43
NAND3X2   40.81  6.86  61.40      34.81  1.56  39.50      2.70     204.50   700.15
NOR2X2    37.64  3.50  48.14      40.82  4.35  53.87      2.16     131.79   639.64
NOR3X2    46.50  3.68  57.54      54.50  5.82  71.97      2.70     157.04   708.34

Parallel gates (ratio of parallel gate value to regular gate value)
Gate      Rising Delay Ratio      Falling Delay Ratio     Area     Leakage Ratio
          µ      σ     µ+3σ       µ      σ     µ+3σ       Ratio    µ        σ
INV2X     0.99   0.79  0.92       0.98   0.99  0.98       1.33     1.05     0.73
INV4X     0.97   0.81  0.92       0.99   0.84  0.96       1.33     0.92     0.64
INV8X     0.98   0.55  0.83       0.98   0.65  0.90       1.50     1.04     0.53
NAND2X2   0.99   0.66  0.88       1.00   0.82  0.97       1.50     1.09     0.89
NAND2X4   0.98   0.56  0.85       1.01   0.67  0.96       2.50     1.10     0.58
NAND3X2   0.97   0.74  0.89       0.98   0.83  0.97       1.60     1.02     0.74
NOR2X2    0.98   0.75  0.93       0.98   1.01  0.99       1.50     0.87     0.66
NOR3X2    0.99   0.69  0.93       0.97   0.97  0.97       1.60     0.97     0.71
AVG       0.98   0.69  0.90       0.99   0.85  0.96       1.61     1.01     0.69

For regular area and delay mapped designs as well as for variation tolerant circuits
AM1, DM1 and DM2, Monte Carlo based statistical static timing analysis (SSTA) was
performed to obtain their delay distributions. Monte-Carlo based SSTA is considered to be
an accurate method to obtain the delay distribution of a circuit [129]. Monte-Carlo based
SSTA was implemented in SIS [109]. The data obtained from the characterization of all
standard cells (regular and parallel versions) in the library (LIB) was used for Monte-Carlo
based SSTA. Note that the characterization of all gates was done for a set of different load
capacitance and input slew values. 10000 iterations were performed for the Monte Carlo
based SSTA of each circuit. The mean (µ), the standard deviation (σ) and the worst
case delay (µ + 3σ) of all regular and variation tolerant designs were obtained from this
analysis. The layout area of all designs was also computed, by adding the layout areas of
all the gates in the circuit. Figure IX.5
shows the average (over 14 benchmark designs) ratio of the area and delay results of the
variation tolerant circuits compared to their regular counterparts, for values of P ranging
from 0 to 1. Recall that P is a user specified number which is the fraction of the total
number of gates in a circuit to be replaced with their parallel versions. Figure IX.5 (a) and
(b) plot the average (over all benchmark circuits) ratio of σ and µ + 3σ of the
delay of the variation tolerant circuits (AM1, DM1 and DM2) compared to regular designs.
Figure IX.5 (c) shows the average ratio of the layout area of the variation tolerant design
compared to the regular version (as a function of P). As shown in Figure IX.5 (a) and
(b), on average, both the σ and µ + 3σ of the delay of the AM1, DM1 and DM2 designs
reduce as the fraction of parallel gates (P) in the designs increases. When P reaches 0.7,
the σ and µ + 3σ of the delay of the variation tolerant designs saturate. At this point, on
average, the σ and µ + 3σ of AM1 are 20% and 7% lower than those of regular area mapped
designs. The area utilization of AM1 is around 34% more than regular designs. For DM1
(DM2), at P ∼= 0.7, the σ and µ + 3σ are 23% and 7% (25% and 8%) lower than regular
delay mapped designs, with an area penalty of 33% (44%). From this it can be concluded
that the variation tolerant design approach presented in this chapter reduces both the σ and
µ + 3σ of the delay of designs significantly and hence, increases the delay limited design
yield. Also, the DM2 designs perform better than DM1 on average but with a higher area
penalty. Although this is not shown explicitly, the mean delays of the AM1, DM1 and DM2
designs are also lower compared to regular area and delay mapped designs (by 6%, 6% and
6% respectively). The mean delays of AM1, DM1 and DM2 designs are lower because
the input pin capacitance and the output capacitance of the parallel gates are smaller (as
explained in Section IX-C.2) by 2.5% and 15% on average, compared to the corresponding
capacitances of the regular gates.
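The structure of a Monte Carlo based SSTA run can be sketched as follows. This is a simplified illustration and not the SIS implementation used in this work: the 5-gate netlist and the gate statistics are hypothetical, gate delays are assumed to be independent Gaussian variables, and the dependence of the characterized delay on load capacitance and input slew (handled through lookup tables in this work) is ignored.

```python
import random
import statistics

# Hypothetical netlist: gate -> fan-in gates (primary inputs are omitted).
FANIN = {"g1": [], "g2": [], "g3": ["g1", "g2"], "g4": ["g2"], "g5": ["g3", "g4"]}
ORDER = ["g1", "g2", "g3", "g4", "g5"]  # a topological order of FANIN


def mc_ssta(gate_stats, iterations=10000, seed=7):
    """Return (mu, sigma, mu + 3*sigma) of the circuit delay.

    gate_stats maps each gate to the (mean, sigma) of its delay in ps.
    """
    rng = random.Random(seed)
    circuit_delays = []
    for _ in range(iterations):
        arrival = {}
        for gate in ORDER:
            mean, sigma = gate_stats[gate]
            delay = max(0.0, rng.gauss(mean, sigma))  # sampled gate delay
            latest_input = max((arrival[f] for f in FANIN[gate]), default=0.0)
            arrival[gate] = latest_input + delay
        circuit_delays.append(max(arrival.values()))
    mu = statistics.fmean(circuit_delays)
    sigma = statistics.pstdev(circuit_delays)
    return mu, sigma, mu + 3.0 * sigma


# Hypothetical statistics: every regular gate has a 30 ps mean and 4 ps sigma.
regular_stats = {g: (30.0, 4.0) for g in ORDER}
# Assumed parallel gates: mean and sigma scaled by the average rising delay
# ratios of Table IX.1 (0.98 and 0.69).
parallel_stats = {g: (0.98 * 30.0, 0.69 * 4.0) for g in ORDER}

mu_r, sigma_r, worst_r = mc_ssta(regular_stats)
mu_p, sigma_p, worst_p = mc_ssta(parallel_stats)
print(f"regular : mu={mu_r:.2f} sigma={sigma_r:.2f} mu+3sigma={worst_r:.2f}")
print(f"parallel: mu={mu_p:.2f} sigma={sigma_p:.2f} mu+3sigma={worst_p:.2f}")
```

In this sketch all gates are replaced (P = 1). Replacing only a fraction of the gates amounts to mixing entries of the two hypothetical dictionaries according to the slack ordering.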
The σ and µ + 3σ of delay, as well as the area ratio of individual AM1 designs com-
pared to their corresponding regular version are plotted in Figure IX.6, for different values
of P. Notice from Figure IX.6 that for smaller values of P, the reduction in σ and µ + 3σ
for some benchmark circuits (with increasing P) is abrupt. When P reaches 0.6, then the σ
and µ+3σ of all benchmark circuits either saturate or decrease very slowly with increasing
P. Therefore, P = 0.7 is a reasonable value to be used in the percentage based gate replacement
approach. As expected, the area of AM1 designs increases linearly with increasing P.
Similar trends are also observed for DM1 and DM2. The σ, µ+3σ and area ratio plots for
individual DM1 and DM2 designs are shown in Figures IX.7 and IX.8.
From Table IX.1, it can be concluded that the mean, the standard deviation and the
worst case delay of the variation tolerant (parallel) gates are lower than those of their regular
counterparts. At the circuit level, as shown in Figure IX.5, both the σ and µ+3σ of variation
tolerant circuits (obtained by using parallel gates) are lower than those of the regular designs. Note
that it is possible to use the approach of [142] to identify the critical gates in a circuit under
variations and possibly further reduce the area overhead of the approach presented in this
chapter.
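The link between the worst case delay and the delay limited yield can be made concrete with a short sketch. It assumes that the circuit delay is normally distributed (an assumption made here for illustration, not a result of this work) and uses hypothetical values for the regular design and the clock period. The 6% and 23% reductions in µ and σ are the DM1 averages quoted above.

```python
import math


def delay_limited_yield(mu, sigma, clock_period):
    """Probability that a Gaussian circuit delay does not exceed clock_period."""
    z = (clock_period - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical regular design (values in ps).
MU_REGULAR, SIGMA_REGULAR = 1000.0, 50.0
CLOCK_PERIOD = 1050.0

# Variation tolerant design: mean lower by 6%, sigma lower by 23%.
mu_tolerant = 0.94 * MU_REGULAR
sigma_tolerant = 0.77 * SIGMA_REGULAR

yield_regular = delay_limited_yield(MU_REGULAR, SIGMA_REGULAR, CLOCK_PERIOD)
yield_tolerant = delay_limited_yield(mu_tolerant, sigma_tolerant, CLOCK_PERIOD)
print(f"yield, regular design           : {yield_regular:.4f}")
print(f"yield, variation tolerant design: {yield_tolerant:.4f}")
```

The absolute yield values depend entirely on the hypothetical clock period. The sketch only shows the direction of the effect: a lower µ and a lower σ both move more of the delay distribution below the timing target.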
The effect of implementing a large transistor by a parallel connection of small tran-
sistors, for a large 32× INV was also studied in this work. For this study, only the VT of
transistors was varied, and Monte Carlo simulations of different implementations of the
32× INV were performed. The different implementations of 32× are as follows: 1-32×,
2-16×, 4-8×, 8-4× and 16-2×. The results of these simulations show that initially, the σ
194






  	   
	
			




 
σ
 
σ
 
σ
 
σ
	







(a) Delay σ Ratio







     
	
			




	 µµ µµ
	
	
σσ σσ
	



	





(b) Delay µ+3σ Ratio






     
	
			



	



	

	
	
(c) Area Ratio
Fig. IX.5. Ratio of results of the proposed approach compared to regular circuits for differ-
ent values of P
195







  	   
	
			




	 σσ σσ
	



	

 


 
 
	 	
	 
 

 
(a) Delay σ Ratio







     
	
			




	 µµ µµ
	
	
σσ σσ
	



	
 	
	 
 
 
 
 

	 
(b) Delay µ+3σ Ratio





     
	
			



	



	
 
 
 
 
 
 
	 
(c) Area Ratio
Fig. IX.6. Delay σ, µ + 3σ and area ratio of the proposed approach compared to regular
circuits for different values of P for area mapped designs
196







	
  
   
	
			




	 σσ σσ
	



 
 
	 
	
 
	

 
 
 
(a) Delay σ Ratio







   	  
     
	
			




	 µµ µµ
	
	
σσ σσ
	



 
 
	

 
	
 	
 
 
 
(b) Delay µ+3σ Ratio





     
	
			



	



	
 
 
 
 
 
 
	 
(c) Area Ratio
Fig. IX.7. Delay σ, µ + 3σ and area ratio of the proposed approach compared to regular
circuits for different values of P for DM1 designs
197







	
  
   
	





 σσ σσ




 
 
	 
	
 
	

 
 
 
(a) Delay σ Ratio







     
	
			




	 µµ µµ
	
	
σσ σσ
	



	
 	
	 
 
 
 
 

	 
(b) Delay µ+3σ Ratio






     
	
			



	



	
 
 
 
 
 
 
	 
(c) Area Ratio
Fig. IX.8. Delay σ, µ + 3σ and area ratio of the proposed approach compared to regular
circuits for different values of P for DM2 designs
198
of delay of these implementations decreases as the number of smaller inverters connected
in parallel increases (until 8-4×) and then it saturates. The mean delay also goes down
initially (until 4-8×) due to a reduction in the diffusion area, and then it starts increasing.
When a large number of smaller inverters are used to implement a 32× INV, the perimeter
capacitance of the output diffusion node dominates the bottom plate capacitance. As
a result, the total capacitance at the output node of the 32× INV starts increasing with an
increase in the number of smaller inverters used to implement it. Therefore, a large gate
should be implemented using a parallel connection of an appropriate number (typically 2 to
4) of smaller gates. Using too many smaller gates to implement a large gate may increase
the mean delay and hence affect the delay limited yield.
IX-E. Chapter Summary
With the continuous scaling of devices, variations in key device parameters such as channel
length (L), threshold voltage (VT), and oxide thickness (Tox) are increasing at an alarming
rate. This has led to significant problems in terms of reliability, circuit resilience and yields.
In this chapter, a circuit design approach was proposed and validated, to alleviate this problem
for combinational circuits. The proposed approach implements a large gate using an
appropriate number (> 1) of smaller gates connected in parallel (with their inputs and outputs
connected to each other). Since the L and VT variations are largely random and vary
independently in the smaller gates, the variation tolerance of the parallel gates is improved.
The parallel gates were implemented as single layout cells, and have smaller delay µ and
σ compared to their traditional counterparts. This chapter also presented an algorithm
to selectively replace critical gates in a circuit by their parallel counterparts, to improve
circuit-level variation tolerance. Monte Carlo simulations demonstrate that the proposed
variation tolerant circuit design approach achieves significant improvements in the delay µ and
µ+3σ. On average, the proposed approach reduces the standard deviation (σ) of
the circuit delay by 23% for delay mapped designs, with an area overhead of 33% (compared
to regular circuits). This approach also reduces the worst case circuit delay under
variations (i.e. µ + 3σ) by 7% and hence significantly improves the design yield.
CHAPTER X
VARIATION TOLERANT DESIGN - PROCESS VARIATION TOLERANT SINGLE
SUPPLY TRUE VOLTAGE LEVEL SHIFTER
X-A. Introduction
System-on-chip (SoC) solutions and multi-core computing architectures are becoming increasingly
common in many applications. For such computing paradigms, energy
and power minimization is a crucial design goal. Both the dynamic and the leakage power
consumption of a CMOS circuit depend upon the supply voltage, and they decrease at least
quadratically with decreasing supply voltages. Therefore, in recent times, it is common to
decrease the supply voltage value in non-critical parts of SoCs and multi-core processors,
in order to reduce the power and energy consumption. This results in a situation where the
many blocks in an SoC design operate at different supply voltage levels, in order to min-
imize system power and energy values [147, 148]. Similarly, multi-core processors have
different cores operating at different supply voltage values, depending on the computational
demand.
When a signal traverses on-chip voltage domains, a level shifter is required. Invert-
ers can handle a high to low voltage shift with low delays and minimal leakage. For a
low to high voltage level translation, inverters tend to consume a large amount of leakage
power, and hence special circuits are needed for this type of level translation. Such special
circuits are called voltage level shifters (VLS). Moreover, different blocks/cores in SoCs
and multi-core processors may employ dynamic voltage scaling (DVS) to meet the variable
speed/power requirements at different times [76, 77, 78]. As a consequence, many voltage
domains are formed on a single IC or SoC, each operating at different supply voltage values
at different times of the computation. Therefore, the voltage level shifters (VLS) required
to interface these voltage domains should be able to efficiently convert any voltage level to
any other desired voltage level, and the voltage of the input to the VLS can in general be
either greater than or less than the voltage of the output. Also, since the key device parameters
vary significantly in the DSM era, it is required that the performance of the voltage
level shifter does not vary significantly due to these device parameter variations.
The rest of the chapter is organized as follows. The need for a novel single supply,
process variation tolerant, voltage level shifter (SS-VLS) is highlighted in Section X-B.
Section X-C discusses related previous work in the design of an SS-VLS. In Section X-D,
the design of the proposed process variation tolerant voltage level shifter is described. The
proposed voltage level shifter uses only one supply voltage, and it can convert any voltage
level to any other desired voltage level with low delay and power. Hence, it is referred to
as a single supply true¹ voltage level shifter (SS-TVLS). In Section X-E, experimental results
are presented which demonstrate that SS-TVLS outperforms the best known previous
approach. Finally, a summary of this chapter is presented in Section X-F.

1 The SS-TVLS is true in the sense that it can perform level conversion of a signal from a
lower voltage domain to a higher voltage domain and vice versa.

X-B. The Need for a Single Supply Voltage Level Shifter
A conventional voltage level shifter (CVLS) is shown in Figure X.1. It requires two voltage
supplies, the input domain voltage supply (VDDI) and the output domain voltage supply
(VDDO). The operation of the circuit is as follows. When the input signal in is at the VDDI
value (inb is at the GND value), MN1 turns on (MN2 is off). This pulls the outb signal to
GND. This transition of the outb signal turns on MP2, which pulls up the out signal to the
VDDO value. When in is at GND (inb is at the VDDI value), MN1 is off and MN2 is on.
MN2 pulls the out signal to GND, which turns on MP1. MP1 pulls up outb to the VDDO value. Although there are no high
leakage paths from VDDO to GND in this circuit, both VDDI and VDDO are required
for the voltage level conversion. This can be a hard requirement to satisfy, especially if
the VDDO and VDDI domains are separated by a large physical distance. Supply voltage
wires typically need to be quite wide (especially if VDDO and VDDI are physically far
apart), resulting in a large area penalty. Figure X.2 shows a multi voltage system where
four modules are interacting with each other using CVLS. A voltage level conversion at
the input of a particular voltage domain (V ) will require all the supply voltages of other
voltage domains {W} which drive at least one signal to V , and whose voltage level is lower
than the voltage level of V . This may result in routing congestion, excessive area utiliza-
tion and also may pose restrictions on module placement. From the schematic diagram of
the CVLS shown in Figure X.1, it can be observed that the routing of additional supply
voltages can be avoided by transmitting a differential signal to a different voltage domain
(i.e. by transmitting both in and inb). However, this strategy would require one additional
wire per signal and hence could lead to routing congestion as well. This problem is further
aggravated by the increasing number of voltage domains in SoCs and multi-core architec-
tures. Additional complexity is encountered if the voltage domains have variable voltages,
which requires a domain to receive the supply voltages of every other domain in the system.
In such a scenario, it is not known a priori whether VDDI < VDDO or VDDI > VDDO.
Therefore, a single supply voltage level shifter (SS-VLS) is desired, which can convert any voltage
level to any other desired voltage level with a predictable delay and low power, utilizing the
supply voltage of the VDDO domain alone. In addition, such a VLS should be true (i.e. it should
operate for both VDDI < VDDO and VDDI > VDDO). One such solution is proposed in
this dissertation, and is referred to as a single supply true voltage level shifter (SS-TVLS).
It was previously published in [149]. The use of a single supply voltage (VDDO) for level
conversion would help ease placement and routing constraints, enabling efficient physical design
of the IC. This would also help in reducing the number of input and output pins of a block.
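The supply routing requirement described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the four domain names, their voltages and the fully connected signal list are hypothetical and only loosely follow Figure X.2, and the helper names are invented for this sketch.

```python
# Hypothetical four-domain system (names and voltages are illustrative only).
domain_voltage = {"A": 0.8, "B": 1.0, "C": 1.2, "D": 1.4}

# Directed signal connections (driver, receiver); assumed fully connected here.
signals = [(src, dst) for src in domain_voltage for dst in domain_voltage if src != dst]


def extra_supplies_cvls(receiver, voltages, links):
    """Foreign supplies that must be routed to `receiver` with conventional
    level shifters: every driving domain whose voltage is lower than the
    receiver's voltage."""
    return sorted(
        {src for src, dst in links
         if dst == receiver and voltages[src] < voltages[receiver]}
    )


def extra_supplies_single_supply(receiver, voltages, links):
    """A single-supply level shifter only uses the receiver's own supply."""
    return []


cvls_routes = {d: extra_supplies_cvls(d, domain_voltage, signals) for d in domain_voltage}
total_cvls = sum(len(needed) for needed in cvls_routes.values())
total_single = sum(
    len(extra_supplies_single_supply(d, domain_voltage, signals)) for d in domain_voltage
)

for name, needed in cvls_routes.items():
    print(f"domain {name} ({domain_voltage[name]} V) needs foreign supplies: {needed}")
print("foreign supply routes with CVLS:", total_cvls)
print("foreign supply routes with a single-supply shifter:", total_single)
```

With fixed voltages, only the lower-voltage drivers contribute. If the domain voltages are variable, as discussed above, every driving domain would have to be counted, which makes the routing burden larger still.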
Figure X.3 shows a multi-voltage system, where four modules (with DVS) interact
with each other using SS-TVLS. Note that since the performance of the voltage level shifter
is crucial for the performance of the overall system, the voltage level shifter should perform
reliably under process variations. In this dissertation, the devices of the proposed SS-TVLS
were carefully sized to increase its tolerance to process variations, while maintaining a low
leakage and low power consumption. This is quantified through experimental results in
Section X-E.2.
Fig. X.1. Conventional voltage level shifter (schematic; devices MN1, MN2, MP1 and MP2; signals in, inb, out and outb; supplies VDDI and VDDO)
X-C. Related Previous Work
Several kinds of voltage level shifters have been proposed over the years, to minimize
power consumption [150, 151, 152, 153]. Most of these approaches utilize dual supply
voltages, which make them unattractive for SoCs and multi-core architectures for rea-
sons already discussed. The work of [150] focused on using bootstrapped gate drivers
Fig. X.2. Multi-voltage system using CVLS (four modules operating at 0.8 V, 1.0 V, 1.2 V and 1.4 V, exchanging signals through conventional level shifters)
to minimize voltage swings. This helps in reducing the switching power consumption in
the conventional level shifter, and also helps in increasing the speed of the level shifter.
In [151], the authors proposed a method of incorporating voltage level conversion into reg-
ular CMOS gates by using a second threshold voltage. They proposed a scheme to modify
the threshold voltage of high voltage gates (which are driven by the outputs of low voltage
gates) to achieve the level shifting functionality along with the logical operation. This work
focused on reducing power while using dual supply voltages. In [152], Wang et al. proposed
a level up-shifter along with a level down-shifter, to interface 1.0V and 3.3V voltage
domains. The level up-shifters use zero-Vt thick oxide NMOS devices to clamp the volt-
age to protect the 1V NMOS switches from high voltage stress across the gate oxide. The
level down-shifter used thick oxide NMOS devices with 1V supplies as both pull-up and
pull-down devices. This approach also requires dual supply voltages. In [153], the authors
presented a low to high voltage level shifter for use in a VLSI chip for MEMS applications.
Fig. X.3. Multi-voltage system using SS-TVLS (four voltage domains VD1 to VD4, each operating between 0.8 V and 1.4 V, exchanging signals through single supply true voltage level shifters)
The design uses a stack of devices in series between the rail voltages, biased by 5 different
bias voltages for the conversion.
The SS-VLS proposed in [154] uses a diode-connected NMOS device between the
supply and output, to convert a low level to a high voltage level. There is a threshold voltage
drop in this diode-connected NMOS device, which reduces the supply voltage to the input
inverter. This level shifter has a limited range of operation, and suffers from higher leakage
currents when the difference in the voltage levels of the output supply and the input signal is
larger than the threshold voltage. In [41], the authors present an SS-VLS design which tries
to address the issues associated with the design of [154]. However, their SS-VLS is only
able to convert a low voltage domain signal to a higher voltage domain (VDDI < VDDO).
Also, the leakage currents of the SS-VLS are relatively high. In contrast to these SS-VLS
implementations, the SS-TVLS proposed in this dissertation can convert any voltage level
to any other desired voltage level (i.e. it is a true voltage shifter) without using any control
signals. At the same time, the leakage currents of the proposed SS-TVLS design are very
low. Note that none of the previous approaches have analyzed the performance of their
VLS under process variations. The performance of the proposed SS-TVLS (under process
variations) is compared with the best known previous approach (in terms of functionality)
in Section X-E.2.
X-D. Proposed Single-supply True Voltage Level Shifter
The schematic diagram of the proposed SS-TVLS is shown in Figure X.4. This SS-TVLS
was implemented using a 90 nm PTM [45] technology. Note that devices with thick channel
lines are high-VT devices. Their VT is 0.49 V for NMOS and -0.44 V for PMOS, while
the nominal VT is 0.39 V for NMOS and -0.34 V for PMOS. Also note that the NOR gate
shown in Figure X.4 uses the VDDO supply. The sizes (width/length) of all devices (in
µm) are shown in the gure. The substrate terminals of all PMOS (NMOS) devices in this
gure are connected to VDDO (GND). The operation of the SS-TVLS can be explained
by considering two scenarios. The timing diagram of the SS-TVLS is shown in Figure X.5
and it is applicable to both scenarios. In the rst scenario, VDDO > VDDI (i.e. the VLS
has to convert a low voltage level to a high voltage level). In this case, when the input
signal in goes high (to the VDDI value), the output node outb starts falling due to the NOR
gate. However, the PMOS transistor of the NOR gate whose gate terminal is driven by
in is not in complete cut-off (i.e. it is leaking) because VDDI < VDDO. Thus there is a
temporary leakage path between VDDO and GND, which is eliminated by the rising of
node2 (the second input of the NOR gate) to the VDDO value. After the input signal in
goes high, M6 turns on and thus pulls node1 to GND. This causes M3 to turn on and hence
node2 is pulled up to VDDO, causing the output node outb to be pulled down to GND. The
previously mentioned leakage path between VDDO and GND is removed when node2 is
pulled to VDDO. During this phase, as in is high and it is at VDDI (< VDDO), M8 is
ON along with M2, which results in the charging of the ctrl node (whose capacitance is
dominated by the gate capacitance of MC) to a value which is the minimum of VDDI and
VDDO-VM8T (where VM8T is the threshold voltage of M8). Note that M1, M4, M5 and M7
are turned off when in is at the logic high value.
Now when the in node falls, M6 turns off while M1 turns on (because the gate to
source voltage of M1 is more than VM1T ). This leads to the discharge of node2 (and the
charging of node1) and thus the NOR gate output rises to VDDO (since both the inputs
of the NOR gate are at the GND value). In this phase, M3, M2, M6 and M7 are turned
off while M4 and M5 are turned on. The ctrl node discharges through M2 and M8 during
the time when M2 is turning off. The node capacitance of ctrl (implemented as the gate
capacitance of MC) is selected to be large enough to allow the discharge of node2. Note
that the NOR gate is used to balance the rising and the falling delays of the SS-TVLS. It
also provides the SS-TVLS the same load driving capability as a minimum size inverter.
Note that the SS-TVLS is an inverting voltage level shifter. An extra inverter is not required
at the output of the SS-TVLS because this polarity inversion can be subsumed in the logic
of the VDDO voltage domain. In the experiments, the VLS approach that the SS-TVLS is
compared with has the same inverting property.
In the second scenario, the SS-TVLS performs the conversion of a high voltage level
to a low voltage level (i.e. VDDO < VDDI). In this scenario as well, when the input in
goes high to the VDDI value, the output node outb falls to the GND value. In this scenario,
as VDDI > VDDO, the PMOS transistor of the NOR gate whose gate terminal is driven by
in, is in deep cut-off and hence, there is no leakage path between VDDO and GND. After
in goes high to VDDI, M6 turns on and pulls down its drain node. This turns on M3, which
then charges node2 to VDDO. During this phase, since VDDI > VDDO, M7 is ON
Fig. X.4. Novel single supply true voltage level shifter (schematic; devices M1 to M8 and MC, a NOR gate driving outb, internal nodes node1, node2 and ctrl, input in, supply VDDO; device sizes are annotated in the original figure)
and M2 is also ON. M8 is off in this case. Thus, the ctrl node voltage charges to a value
min(VDDO, VDDI-VM7T). Here VM7T is the threshold voltage of M7. Note that M1, M4
and M5 are turned off when in is at VDDI. The rest of the operation of the SS-TVLS when
in transitions to GND is identical to the first scenario. Note that the SS-TVLS works for
VDDI > VDDO as well as VDDI < VDDO because M1 never turns on when in is logically
high (regardless of whether VDDI > VDDO or VDDI < VDDO).
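The two expressions for the maximum ctrl node voltage can be collected into one small function. This is only a first-order sketch: it uses the threshold voltages quoted in this chapter (0.19 V for the low-VT device M8 and the nominal 0.39 V for M7), ignores the body effect, and the function name is invented for illustration. The case VDDI = VDDO is not covered by the text, so the sketch rejects it rather than guessing.

```python
def ctrl_node_voltage(vddi, vddo, vt_m7=0.39, vt_m8=0.19):
    """First-order estimate of the maximum voltage on the ctrl node.

    vddi, vddo : input and output domain supply voltages (V)
    vt_m7      : threshold voltage of M7 (nominal NMOS value from the text)
    vt_m8      : threshold voltage of M8 (low-VT value from the text)
    The body effect on M7 and M8 is ignored in this sketch.
    """
    if vddi < vddo:
        # Low-to-high conversion: ctrl is charged through M8.
        return min(vddi, vddo - vt_m8)
    if vddi > vddo:
        # High-to-low conversion: ctrl is charged through M7.
        return min(vddo, vddi - vt_m7)
    raise ValueError("VDDI == VDDO is not covered by the description in the text")


for vddi, vddo in [(0.8, 1.2), (1.2, 0.8), (0.8, 0.85)]:
    print(f"VDDI={vddi} V, VDDO={vddo} V -> ctrl <= {ctrl_node_voltage(vddi, vddo):.2f} V")
```

The last pair illustrates the situation where both supplies are small and close to each other: the threshold drop across M8, not VDDI, then sets the ctrl voltage.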
The SS-TVLS exhibits very low leakage currents as compared with the best known
voltage level shifter [41] for VDDI < VDDO. There are several reasons for this. Note
that the devices M4 and M6 are high VT devices, to reduce leakage currents. Also, all
the devices of the proposed SS-TVLS were carefully sized to improve its tolerance to process
variations, while considering the tradeoff between speed and leakage power.
Specifically, to improve the variation tolerance of the SS-TVLS, the transistors M1, M3,
M6 and MC were carefully sized since they are critical for the performance of the
SS-TVLS. As mentioned before, the maximum voltage value that the ctrl node can charge to
is the minimum of VDDI and VDDO-VM8T when VDDI < VDDO, and the minimum of VDDO and
VDDI-VM7T when VDDI > VDDO. Thus, when the voltage values of the VDDI and
VDDO domains are small and close to each other, the ctrl node charges to VDDO-VM8T.
Therefore, a low VT NMOS device (see footnote 2) is used for M8 to ensure that ctrl can charge
to a sufficiently large voltage value. This also helps in increasing the voltage translation
range of the SS-TVLS. Note that all other transistors (M1, M2, M3, M5, M7, MC and the
NOR gate transistors) are nominal VT devices.
Fig. X.5. Timing diagram for the proposed SS-TVLS (waveforms of in, node1, node2, ctrl and outb)
Footnote 2: This is indicated by a dark line at the gate of M8. The VT value of M8 is 0.19 V.
X-E. Experimental Results
The proposed SS-TVLS was simulated using SPICE [38], with a 90 nm PTM [45] model
card. An inverter is the best level shifter when VDDI > VDDO. However, if VDDI <
VDDO, the inverter cannot be used, due to the high leakage currents that result in such a
conversion. For such a scenario, the best known previous approach [41] yields low leakage
currents. Therefore, to compare the performance of SS-TVLS, a combination of an inverter
and the SS-VLS of [41] (as shown in Figure X.6) was simulated. For the SS-VLS of [41],
the devices used are of the same size as reported in [41]. Note that the combined VLS of
Figure X.6 requires a control signal which indicates whether VDDI is greater or smaller
than VDDO. The SS-TVLS does not require such a signal. For all simulation results, both
the SS-TVLS and the combined VLS were driven by identical inverters.
Note that the delays of the SS-TVLS as well as the SS-VLS of [41] are dependent on
the input sequence. The worst case is a 0-1-0-1-0... sequence on the inputs. For this
sequence, the voltage achieved at the ctrl node when the input switches to 0, is the lowest
across all sequences, resulting in a higher output rising delay. The delay numbers reported
in this chapter are the worst-case delays across all possible input sequences.
X-E.1. Performance Comparison with Nominal Parameter Values
Table X.1 reports the results obtained for voltage level shifting from 0.8 V to 1.2 V at a
temperature of 27◦ C. Column 1 reports the performance parameter under consideration.
Column 2 reports the results obtained for the proposed SS-TVLS. Column 3 reports the
results obtained for the combined VLS of Figure X.6. Column 4 reports the ratio of the
results obtained for the combined VLS compared to the corresponding results for the
SS-TVLS. Note that the rising (falling) delay is defined as the delay of the rising (falling)
output signal. Similarly, Leakage Current High (Low) in the table represents the leakage
current when the output signal is at the VDDO (GND) value. From Table X.1, observe that
the SS-TVLS performs significantly better than the combined VLS in terms of delay (5.6×
faster for a rising output and 1.5× faster for a falling output), power (2.6× lower for a rising
output, and 3.5× lower for a falling output) and leakage (7.6× lower for a high output, and
19.8× lower for a low output).
Fig. X.6. Combination of an inverter and SS-VLS by Khan et al. (schematic; an inverter and the SS-VLS share the input in, with a transmission gate at the input side and a multiplexer at the output side; both use the VDDO supply)
Table X.2 reports the results obtained for voltage level conversion from 1.2 V to 0.8
V at a temperature of 27◦ C. Column 1 reports the performance parameter under consider-
ation. Column 2 reports the results obtained for the proposed SS-TVLS. Column 3 reports
the results obtained for the combined VLS shown in Figure X.6. Column 4 reports the ratio
of the results obtained for the combined VLS compared to the corresponding results for the
SS-TVLS. As reported in Table X.2, the proposed SS-TVLS performs very well compared
to the combined VLS of Figure X.6 with very low leakage currents (4.5× lower for a high
output, and 9.3× lower for a low output). Also it is faster than the combined VLS (1.3×
faster for a rising output and 2.2× faster for a falling output). Note that the delay of the
combined VLS is the summation of the delays of the transmission gate (at the input side),
the multiplexer (at the output side) and the inverter. Therefore, the delay of the combined
Table X.1. Low to High Level Shifting
Performance Parameter       Proposed SS-TVLS   Combined VLS of Figure X.6   Ratio (Combined VLS/SS-TVLS)
Delay Rise (ps)             22.0               122.6                        5.6
Delay Fall (ps)             33.3               50.5                         1.5
Power Rise (µW)             27.6               71.87                        2.6
Power Fall (µW)             33.8               119.27                       3.5
Leakage Current High (nA)   20.8               157.2                        7.6
Leakage Current Low (nA)    3.6                71.1                         19.8
VLS is more than the inverter delay alone.
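The ratio column of Table X.1 can be reproduced directly from the two measurement columns. The short sketch below does this with decimal arithmetic so that the rounding to one decimal place is unambiguous; the variable and function names are chosen for this illustration only.

```python
from decimal import Decimal, ROUND_HALF_UP

# (proposed SS-TVLS, combined VLS) values copied from Table X.1.
table_x1 = {
    "Delay Rise (ps)": ("22.0", "122.6"),
    "Delay Fall (ps)": ("33.3", "50.5"),
    "Power Rise (uW)": ("27.6", "71.87"),
    "Power Fall (uW)": ("33.8", "119.27"),
    "Leakage Current High (nA)": ("20.8", "157.2"),
    "Leakage Current Low (nA)": ("3.6", "71.1"),
}


def improvement_ratio(ss_tvls, combined):
    """Combined VLS value divided by SS-TVLS value, rounded to one decimal."""
    ratio = Decimal(combined) / Decimal(ss_tvls)
    return ratio.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


ratios = {name: improvement_ratio(a, b) for name, (a, b) in table_x1.items()}

for name, value in ratios.items():
    print(f"{name:<28}{value}x")
```

A ratio above 1 means the SS-TVLS is better on that metric, since every row is a quantity to be minimized.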
X-E.2. Performance Comparison under Process and Temperature Variations
To demonstrate the process variation tolerance of the SS-TVLS, the SS-TVLS was simu-
lated under process and temperature variations. The temperature, the channel width, the
channel length and the threshold voltage of all devices in the SS-TVLS were varied. The
temperature of all the devices was varied together, while all other parameters were varied
independently. For channel lengths and widths, the mean was taken to be equal to the nominal
value and the standard deviation was taken to be 3.34% of the Lmin of the process
(i.e. 90 nm). For the threshold voltage, the mean was taken to be equal to the nominal value and
the standard deviation was taken to be 3.34% of the nominal value (so that three
times the standard deviation is approximately 10% of the nominal value). Three different values of
temperature were used (27◦, 60◦ and 90◦ C). 1000 Monte Carlo simulations were performed
for both cases, i.e. for high-to-low and low-to-high voltage conversion. These simulations
Table X.2. High to Low Level Shifting
Performance Parameter       Proposed SS-TVLS   Combined VLS of Figure X.6   Ratio (Combined VLS/SS-TVLS)
Rise Delay (ps)             34.9               46.5                         1.3
Fall Delay (ps)             15.7               35.2                         2.2
Power Rise (µW)             27.3               20.7                         0.8
Power Fall (µW)             59.3               56.8                         1.0
Leakage Current High (nA)   7.3                32.5                         4.5
Leakage Current Low (nA)    3.9                36.3                         9.3
were performed at each of the three temperatures mentioned above. In all Monte Carlo
simulations, the SS-TVLS was able to convert the voltage level correctly for all samples.
The outputs of both designs were loaded with a fixed capacitance of 1 fF.
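The parameter sampling described above can be sketched as follows. This is not the SPICE setup used for the experiments; it only illustrates how one set of per-device samples could be drawn under the stated distributions. The example device (180 nm wide, 90 nm long, nominal NMOS VT of 0.39 V), the seed and all names are assumptions made for the illustration.

```python
import random
import statistics

L_MIN_NM = 90.0             # minimum channel length of the process
SIGMA_FRACTION = 0.0334     # 3.34%, so that 3 sigma is roughly 10%
TEMPERATURES_C = (27, 60, 90)
NUM_SAMPLES = 1000


def sample_device(rng, width_nm, length_nm, vt_nominal):
    """Draw one independent (W, L, VT) sample for a single device."""
    sigma_geometry = SIGMA_FRACTION * L_MIN_NM       # applies to both W and L
    sigma_vt = SIGMA_FRACTION * abs(vt_nominal)      # relative to nominal VT
    return {
        "width_nm": rng.gauss(width_nm, sigma_geometry),
        "length_nm": rng.gauss(length_nm, sigma_geometry),
        "vt": rng.gauss(vt_nominal, sigma_vt),
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Temperature is common to all devices in a run; the other parameters vary per device.
runs = {
    temp: [sample_device(rng, 180.0, 90.0, 0.39) for _ in range(NUM_SAMPLES)]
    for temp in TEMPERATURES_C
}

lengths_27 = [s["length_nm"] for s in runs[27]]
print("mean L at 27 C (nm):", round(statistics.mean(lengths_27), 2))
print("sigma L at 27 C (nm):", round(statistics.stdev(lengths_27), 2))
```

In a real flow each sample would be written into the netlist of one Monte Carlo run, with every device of the level shifter receiving its own independent draw.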
The results obtained from the 1000 Monte Carlo simulations conducted at a tempera-
ture of 27◦ C are reported in Table X.3, for low-to-high and high-to-low voltage level con-
version. In Table X.3, Column 1 reports the performance parameter under consideration.
Columns 2 to 5 report the results for low-to-high voltage level conversion and Columns 6
to 9 report the results for high-to-low voltage level conversion. Columns 2 and 3 report
the mean and the standard deviation of the values obtained for the proposed SS-TVLS.
Columns 4 and 5 report the mean and the standard deviation for the combined VLS shown
in Figure X.6. Columns 6 to 9 report the same results as Columns 2 to 5 but for high-to-low
voltage level conversion. From this table, observe that the mean delay and power are close
to their nominal values. However, the mean value of the leakage current is different from
the nominal value, which is due to the exponential dependence of the leakage current on the
Table X.3. Process Variations Simulation Results for Low to High and High to Low Level Shifting at T = 27◦ C
                            Low to High                                     High to Low
                            Proposed SS-TVLS        Combined VLS (Fig. X.6) Proposed SS-TVLS        Combined VLS (Fig. X.6)
Performance Parameter       µ           σ           µ           σ           µ           σ           µ           σ
Delay Rise (ps)             22.08       1.1         129.4       27.4        35.1        2.4         52.0        3.9
Delay Fall (ps)             33.2        1.9         50.4        6.0         15.6        0.8         34.8        1.3
Power Rise (µW)             27.7        0.8         78.9        7.3         27.5        1.3         22.5        1.1
Power Fall (µW)             33.8        0.4         114.2       7.2         59.5        0.6         52.5        0.9
Leakage Current High (nA)   31.5        13.7        218.8       158.6       8.6         3.0         41.4        14.1
Leakage Current Low (nA)    3.8         3.8         102.9       75.4        3.6         1.3         32.3        9.0

Table X.4. Process Variations Simulation Results for Low to High and High to Low Level Shifting at T = 60◦ C
                            Low to High                                     High to Low
                            Proposed SS-TVLS        Combined VLS (Fig. X.6) Proposed SS-TVLS        Combined VLS (Fig. X.6)
Performance Parameter       µ           σ           µ           σ           µ           σ           µ           σ
Delay Rise (ps)             18.5        0.8         131.3       39.4        29.1        2.0         43.2        2.9
Delay Fall (ps)             29.9        1.5         48.4        5.7         14.3        0.6         30.3        1.0
Power Rise (µW)             27.4        0.7         86.0        11.1        27.1        1.1         22.4        1.0
Power Fall (µW)             33.6        0.34        123.7.2     8.3         60.1        0.7         53.3        0.9
Leakage Current High (nA)   30.4        13.4        202.4       130.2       7.9         3.2         40.3        12.8
Leakage Current Low (nA)    3.7         3.8         98.5        61.1        3.2         1.5         32.9        8.4
threshold voltage. The standard deviation of all performance parameters i.e. delay, power
and leakage current is much lower for the SS-TVLS as compared to the combined VLS of
Figure X.6. This demonstrates that the SS-TVLS is more tolerant to process and temperature
variations than the combined VLS. Monte Carlo simulation results for other temperatures
are presented in Tables X.4 (for T = 60◦ C) and X.5 (for T = 90◦ C). Note that the results
are substantially similar to those obtained at T = 27◦ C (Table X.3).
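As a quick check of the spread comparison, the standard deviations in the low-to-high columns of Table X.3 (T = 27◦ C) can be compared row by row. The sketch below only covers those columns, and the dictionary and variable names are chosen for this illustration.

```python
# (mean, sigma) pairs copied from the low-to-high columns of Table X.3 (T = 27 C).
ss_tvls = {
    "Delay Rise (ps)": (22.08, 1.1),
    "Delay Fall (ps)": (33.2, 1.9),
    "Power Rise (uW)": (27.7, 0.8),
    "Power Fall (uW)": (33.8, 0.4),
    "Leakage Current High (nA)": (31.5, 13.7),
    "Leakage Current Low (nA)": (3.8, 3.8),
}
combined_vls = {
    "Delay Rise (ps)": (129.4, 27.4),
    "Delay Fall (ps)": (50.4, 6.0),
    "Power Rise (uW)": (78.9, 7.3),
    "Power Fall (uW)": (114.2, 7.2),
    "Leakage Current High (nA)": (218.8, 158.6),
    "Leakage Current Low (nA)": (102.9, 75.4),
}

# How many times larger the combined VLS spread is, per parameter.
sigma_ratio = {
    name: combined_vls[name][1] / ss_tvls[name][1] for name in ss_tvls
}
# True where the SS-TVLS has the smaller standard deviation.
ss_tvls_is_tighter = {
    name: ss_tvls[name][1] < combined_vls[name][1] for name in ss_tvls
}

for name in ss_tvls:
    print(f"{name:<28}sigma ratio = {sigma_ratio[name]:.1f}x")
```

For these columns every parameter shows a smaller standard deviation for the SS-TVLS, with the largest gap in the rising delay.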
X-E.3. Voltage Translation Range for SS-TVLS
To evaluate the effectiveness of the SS-TVLS for SoCs and multi-core processors having
multiple voltage domains with DVS, VDDI and VDDO values were varied from 0.8 V to
Table X.5. Process Variations Simulation Results for Low to High and High to Low Level Shifting at T = 90◦ C
                            Low to High                                     High to Low
                            Proposed SS-TVLS        Combined VLS (Fig. X.6) Proposed SS-TVLS        Combined VLS (Fig. X.6)
Performance Parameter       µ           σ           µ           σ           µ           σ           µ           σ
Delay Rise (ps)             16.3        0.6         146.7       54.2        26.4        1.9         36.9        2.3
Delay Fall (ps)             27.8        1.3         47.8        5.9         13.5        0.6         27.4        0.9
Power Rise (µW)             27.3        0.6         96.8        16.1        23.4        0.8         22.4        0.8
Power Fall (µW)             33.6        0.35        134.8       9.4         51.4        0.6         53.9        1.1
Leakage Current High (nA)   28.1        11.3        200.8       128.6       7.6         3.1         39.7        13.9
Leakage Current Low (nA)    3.4         1.9         94.0        66.3        3.1         1.3         33.3        8.6
1.4 V in steps of 5 mV. The SS-TVLS was simulated for all VDDI and VDDO combinations,
and was able to translate voltage levels efficiently for all of them.
Figures X.7 and X.8 show the plot of rising and falling delays and
powers when VDDI and VDDO were varied between 0.8 V and 1.4 V. The leakage current
for a high output value and a low output value is shown in Figure X.9 for VDDI and VDDO
varying between 0.8 V and 1.4 V. These figures show that the rising and the falling delays and
powers change smoothly with changing VDDI and VDDO values, over the entire voltage
range. Similarly, the leakage current for a high output and a low output is also well behaved
across the operating range as shown in Figure X.9. Therefore, it can be concluded that the
SS-TVLS can effectively perform voltage level translation over a wide range of VDDI and
VDDO voltage values and hence it is suitable for SoCs and multi-core processors.
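The size of this sweep is easy to enumerate. The sketch below builds the grid of supply pairs, assuming both end points (0.8 V and 1.4 V) are included; working in integer millivolts avoids accumulating floating-point error in the 5 mV steps. The names are illustrative.

```python
START_MV, STOP_MV, STEP_MV = 800, 1400, 5

# Supply levels in millivolts, both end points included (an assumption).
levels_mv = list(range(START_MV, STOP_MV + 1, STEP_MV))

# Every (VDDI, VDDO) pair in volts.
sweep = [(vddi / 1000.0, vddo / 1000.0) for vddi in levels_mv for vddo in levels_mv]

low_to_high = sum(1 for vddi, vddo in sweep if vddi < vddo)
high_to_low = sum(1 for vddi, vddo in sweep if vddi > vddo)
equal = sum(1 for vddi, vddo in sweep if vddi == vddo)

print("levels per supply:", len(levels_mv))
print("total (VDDI, VDDO) pairs:", len(sweep))
print("low-to-high / high-to-low / equal:", low_to_high, high_to_low, equal)
```

Under that assumption each supply takes 121 values, which gives 14641 simulated combinations, split evenly between up-conversion and down-conversion apart from the 121 equal-voltage points.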
As mentioned earlier, the maximum voltage value that the ctrl node can charge to is
the minimum of VDDI and VDDO-VM8T when VDDI < VDDO, and the minimum of VDDO and
VDDI-VM7T when VDDI > VDDO. Therefore, none of the diffusion-bulk diodes of any device
(both PMOS and NMOS transistors of Figure X.4) get forward biased for any values of
VDDI and VDDO. Thus, the voltage translation range is not limited by the diode turn on
voltage.
It is possible to increase the voltage translation range, i.e. from a value less than 0.8
V to 1.4 V. To achieve this, the size of M6 needs to be increased. However, when VDDI
or VDDO < 0.8 V, the maximum voltage reached at the ctrl node of the SS-TVLS of
Figure X.4 is small, and hence M1 is not turned on sufficiently to discharge node2 when in
falls to GND. To address this, the bulk terminals of M7 and M8 transistors can be connected
to their respective source terminals instead of GND. This will help in avoiding the body
effect seen by M7 and M8 and will increase the maximum voltage value achieved at the
ctrl node. Note that in the SS-TVLS shown in Figure X.4, the body terminals of M7 and M8 are
connected to GND.
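The benefit of tying the bulk to the source can be seen from the standard first-order body effect expression for an NMOS device. This is textbook background rather than a result of this chapter, and the symbols $V_{T0}$ (zero-bias threshold voltage), $\gamma$ (body effect coefficient), $\phi_F$ (Fermi potential) and $V_{SB}$ (source-to-bulk voltage) are introduced here only for this illustration.

```latex
% First-order body effect for an NMOS device (standard textbook form)
V_T = V_{T0} + \gamma \left( \sqrt{\,2\phi_F + V_{SB}\,} - \sqrt{\,2\phi_F\,} \right)

% With the bulk at GND and the source at the ctrl node, V_{SB} = V_{ctrl} > 0,
% so V_T > V_{T0} and the ctrl node can only charge to about V_G - V_T(V_{SB}).
% With the bulk tied to the source, V_{SB} = 0 and V_T = V_{T0},
% so the ctrl node can charge to about V_G - V_{T0}.
```

Here $V_G$ stands for the gate voltage of the pass device (M7 or M8), so a lower effective threshold directly raises the voltage reached on ctrl.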
X-E.4. Layout of SS-TVLS
The layout of the proposed SS-TVLS was created in the Cadence Virtuoso layout editor and
is shown in Figure X.10. A layout versus schematic (LVS) check was done. The layout
area of the SS-TVLS is 4.47 µm² (the width is 0.837 µm and the height is 5.355 µm), which is
lower than the layout area of the combined VLS (∼12.53 µm²). The sizes of all the devices
of the SS-TVLS are shown in Figure X.4. Note that the devices of the SS-TVLS were sized
considering the tradeoff between delay and leakage power, while maximizing the process
and temperature variation tolerance of the SS-TVLS.
The experimental results clearly demonstrate that the proposed SS-TVLS performs
much better than the combined VLS of Figure X.6. When it is not known a priori whether
VDDI < VDDO or VDDI > VDDO, the SS-TVLS offers a great advantage over the
combined VLS of Figure X.6, due to its significantly lower leakage currents (7.6× (4.5×)
lower for a high output, and 19.8× (9.3×) lower for a low output, when VDDI < VDDO
(VDDI > VDDO)). Moreover, the SS-TVLS does not require any control signals and only
requires the VDDO supply. This helps in reducing the circuit complexity and also helps in
placement and routing.
Fig. X.7. Delay of SS-TVLS a) rising, b) falling (plotted over VDDI and VDDO, each varied from 0.8 V to 1.4 V)
Fig. X.8. Power of SS-TVLS a) rising, b) falling (plotted over VDDI and VDDO, each varied from 0.8 V to 1.4 V)
Fig. X.9. Leakage current of SS-TVLS a) high, b) low (plotted over VDDI and VDDO, each varied from 0.8 V to 1.4 V)
Fig. X.10. Layout of the proposed SS-TVLS
X-F. Chapter Summary
Modern ICs often have several voltage domains. Whenever a signal traverses voltage
domains, a level shifter is required. Moreover, these ICs often employ dynamic voltage
scaling, due to which it may not be possible to know a priori if a high-to-low or low-to-high
voltage level conversion is required. Thus, an efficient voltage level shifter is required
which can convert any voltage level to any other desired voltage level. Also, since process
variations are increasing with device scaling, the voltage level shifter should be tolerant to
process and temperature variations.
In this chapter, a process and temperature variation tolerant single-supply true volt-
age level shifter (SS-TVLS) was presented, which can handle both low-to-high and high-
to-low voltage translations. The use of a single supply voltage reduces layout congestion
by eliminating the need for routing both supply voltages. The proposed circuit was simu-
lated in a 90 nm technology using SPICE. Simulation results demonstrate that the proposed
SS-TVLS performs much better than the combined VLS of Figure X.6. The combined
VLS uses an inverter for high-to-low voltage translation and the best known previous ap-
proach [41] for low-to-high voltage level shifting. Also, the proposed SS-TVLS is more
tolerant to process and temperature variations than the combined VLS.
CHAPTER XI
CONCLUSIONS AND FUTURE DIRECTIONS
The focus of this dissertation is to improve the reliability of VLSI circuits by addressing
two major issues (radiation particle strikes and process variations) encountered in the deep
submicron era. This dissertation developed several analysis and design approaches to
facilitate the realization of VLSI circuits which are tolerant to radiation particle strikes
and process variations. This chapter summarizes the work presented in this dissertation and
presents some avenues for further research.
Chapter I described the effects of the radiation particle strikes and process variations
on VLSI circuits, thereby motivating the need to address these issues. This chapter also
provided the background information about these two topics, and also highlighted their
relevance in future technologies. This dissertation consists of two parts. The first part
(Chapters II to VII) of the dissertation presented four analysis and two design approaches
to address the radiation issue. The second part (Chapters VIII to X) addressed the process
variation issue by presenting one analysis and two design approaches. Thus, several analysis
and design techniques are presented in this dissertation, significantly augmenting the
existing work in the area of resilient VLSI circuit design.
Two analysis approaches were developed in this dissertation to analyze the effects of
radiation particle strikes in combinational circuits (Chapters II and III). Chapter II pre-
sented a model to estimate the pulse width of the radiation-induced voltage glitch, and
Chapter III described a model to approximate the shape of the radiation-induced voltage
glitch. Both these approaches used more accurate gate models (than a linear RC gate
model), and also considered the ion track establishment constant (τβ) of the radiation-
induced current pulse in the analysis. The previous modeling approaches [37, 44] used
a linear RC gate model, and ignored τβ which increases the inaccuracy of their analysis.
Therefore, the modeling approaches presented in Chapters II and III are more accurate than
the previous approaches. The proposed models are fast and accurate, and thus can easily
be incorporated in a design ow to implement radiation tolerant circuits.
It was mentioned in Chapter III that there exist efcient tools which can propagate
voltage glitches in a design. However, it may be possible to implement more efcient
tools exclusively for propagating radiation-induced voltage glitches. Therefore, a possible
future direction could be to develop efcient radiation-induced voltage glitch propagation
tools. Also, the work presented in Chapter III only considers radiation particle strikes at
the output of a gate. Therefore, this work can be extended to obtain the radiation-induced
voltage transients at the output of a gate due to radiation strikes at the internal nodes of the
gates. A combination of the models presented in Chapters II and III, along with a voltage
glitch propagation tool, and an approach which estimates the radiation-induced transients at
the output of a gate due to radiation strikes at the internal nodes, can be used for hardening
a circuit efciently. It could also be used for estimating the soft error rate (SER) of a circuit
operating in an environment where radiation is present.
SRAM yield is very important from an economic viewpoint, because of the exten-
sive use of memory in modern processors and SoCs. Therefore, SRAM stability analysis
tools have become essential. SRAM stability analysis based on static noise margin (SNM)
computation often results in pessimistic designs because SNM cannot capture the transient
behavior of the noise. Therefore, to improve accuracy, dynamic stability analysis tech-
niques are required. A model was developed in this dissertation to perform the dynamic
stability analysis of an SRAM cell in the presence of a radiation particle strike, as described
in Chapter IV. This model utilizes a double exponential current equation for modeling a
radiation particle strike, and it is able to predict (more accurately than [39]) whether a
radiation particle strike will result in a state flip in a 6T-SRAM cell (for given values of Q, τα
and τβ). Experimental results demonstrate that this model is very accurate, with a critical
charge estimation error of 4.6% compared to HSPICE. The runtime of this model is also
significantly lower (by ∼2000×) than the HSPICE runtime. Thus, this model enables an
SRAM designer to quickly and accurately analyze the stability of their 6T cell during the
design phase.
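For reference, a commonly used form of the double exponential current pulse can be sketched as below. The exact equation of Chapter IV is not reproduced in this chapter, so the expression, the function names and the example values (Q = 100 fC, τα = 200 ps, τβ = 50 ps) are assumptions made for illustration. Here τα is taken to be the collection time constant and τβ the ion track establishment constant.

```python
import math


def radiation_current(t, q, tau_alpha, tau_beta):
    """Double exponential current pulse (commonly used form):
    i(t) = Q / (tau_alpha - tau_beta) * (exp(-t/tau_alpha) - exp(-t/tau_beta))."""
    if t < 0:
        return 0.0
    scale = q / (tau_alpha - tau_beta)
    return scale * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def peak_time(tau_alpha, tau_beta):
    """Time at which the pulse is maximal (where di/dt = 0)."""
    return (tau_alpha * tau_beta / (tau_alpha - tau_beta)) * math.log(tau_alpha / tau_beta)


def collected_charge(q, tau_alpha, tau_beta, t_end, steps=20000):
    """Trapezoidal integral of the pulse from 0 to t_end."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        left = radiation_current(k * dt, q, tau_alpha, tau_beta)
        right = radiation_current((k + 1) * dt, q, tau_alpha, tau_beta)
        total += 0.5 * (left + right) * dt
    return total


# Illustrative values only (not taken from this chapter).
Q, TAU_ALPHA, TAU_BETA = 100e-15, 200e-12, 50e-12

t_peak = peak_time(TAU_ALPHA, TAU_BETA)
charge = collected_charge(Q, TAU_ALPHA, TAU_BETA, t_end=20 * TAU_ALPHA)

print(f"peak at {t_peak * 1e12:.1f} ps, "
      f"peak current {radiation_current(t_peak, Q, TAU_ALPHA, TAU_BETA) * 1e6:.1f} uA")
print(f"integrated charge {charge * 1e15:.2f} fC")
```

The integral of the pulse recovers Q, which is why the total deposited charge is a natural parameter when searching for the critical charge of a cell.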
The model for the dynamic stability of an SRAM cell presented in Chapter IV consid-
ers noise in SRAMs only due to radiation particle strikes. However, there are other types of
noise such as power and ground noise, capacitive coupling noise, etc. Therefore, models
similar to the one presented in Chapter IV are required to perform dynamic stability analysis of
an SRAM cell in the presence of capacitive coupling noise, and power and ground noise. In
the future, the proposed approach for modeling the dynamic stability of an SRAM cell can be
extended to include the effects of these noise sources as well. Also, another possible future
direction could be to extend the model presented in Chapter IV to incorporate the effect of
process variations on the dynamic stability of SRAMs in the presence of a radiation particle
strike.
In recent times, dynamic supply voltage scaling (DVS) has been extensively employed
to minimize the power and energy of VLSI systems. Also, sub-threshold circuits are be-
coming more popular. Therefore, the reliability of voltage scaled VLSI systems (when
subjected to a radiation event) has become a major concern. With the increasing demand
for reliable low power systems, it is necessary to harden DVS and sub-threshold circuits
efciently. This makes it necessary to understand the effects of voltage scaling on the ra-
diation tolerance of VLSI systems. To address this, the effects of voltage scaling on the
radiation tolerance of VLSI systems was analyzed in Chapter V. For this analysis, 3D sim-
ulations of radiation particle strikes on the output of an inverter (implemented using DVS
and sub-threshold design) were performed. The radiation particle strike on an inverter was
simulated using Sentaurus-DEVICE [40], for different inverter sizes, inverter loads, supply
voltages and radiation particle energies. From these 3D simulations, several non-intuitive
observations were made, which are important to consider during the radiation hardening of
such DVS and sub-threshold circuits. Based on these observations, several guidelines were
proposed for radiation hardening of such designs. These guidelines suggest that traditional
radiation hardening approaches need to be revisited for DVS and sub-threshold designs. A
charge collection model for DVS circuits was also proposed, which can be used to improve
the accuracy of SPICE based simulations of radiation events in DVS circuits. Note that this
work was done for a bulk CMOS process.
An extension of this work could be to perform a similar study for 3D devices such
as FinFETs. Since the structure and operation of 3D devices are very different from those of
bulk CMOS devices, the effect of voltage scaling on the radiation susceptibility of circuits
implemented using these 3D devices might differ from that observed for bulk CMOS devices.
As mentioned in Chapter V, a significant amount of the charge gets collected through
the diffusion process in DSM devices (since the substrate is heavily doped). Therefore,
another possible extension of this work is to perform 3D device simulations to study the
effect of different device implementation structures (for example, instead of implementing
one big device, 2 small devices connected in parallel may be used) on the charge collected
due to a radiation particle strike. This study can be useful in hardening a circuit by
presenting layout guidelines to enhance radiation resilience.
The results of the analysis of the effects of a radiation particle strike on a circuit
can be used for selective hardening of the gates in a circuit, to achieve a desired level of
radiation tolerance while satisfying area, delay and power constraints. For this, efficient
circuit level hardening techniques are required. Two hardening approaches were developed
in this dissertation for combinational circuits, as described in Chapters VI and VII. The
first hardening approach (referred to as the diode clamping based approach) is suitable for
hardening circuits against low energy radiation particle strikes, while the second approach
(the split-output based hardening approach) can harden circuits against very high energy
particle strikes. Both these hardening approaches use special gate structures to prevent
the occurrence/propagation of the radiation-induced voltage glitch. Also, in order to keep
the area and delay overheads low, only sensitive gates in a combinational circuit were
hardened. The gates which were hardened against radiation particle strikes are the gates
which contribute significantly to the soft error failure of the circuit. Experimental results
presented in Chapters VI and VII demonstrate the effectiveness of these approaches in
implementing radiation-tolerant combinational circuits.
In the second part of this dissertation, Chapter VIII presented the sensitizable statistical
timing analysis methodology developed in this dissertation (StatSense) to improve the
accuracy of statistical timing analysis of combinational circuits. StatSense improves the
accuracy of statistical timing analysis by eliminating false paths in a circuit, and by also
using different delay distributions for different input transitions for any gate. Experimental
results show that on average, the worst case (µ + 3σ) circuit delay reported by StatSense
is about 19% lower than that reported by SSTA. Thus, StatSense reduces the pessimism
involved in the statistical timing analysis.
The StatSense approach uses Monte Carlo simulations to estimate the delay distribution
of a circuit. Therefore, the StatSense approach, although more accurate, is slower than
the block-based SSTA approaches. It may be possible to combine the best of the StatSense
and the block-based SSTA approaches, to develop a StatSense-like fast block-based SSTA
approach. This would help improve the accuracy of SSTA tools while keeping runtimes small.
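As a rough illustration of the Monte Carlo step summarized above, the following Python
sketch estimates the worst case (µ + 3σ) delay of a toy two-path circuit, once over all
structural paths and once over only the paths assumed to be sensitizable. The gate names,
the (mean, sigma) delay values and the choice of which path is false are hypothetical and
are not taken from StatSense, which additionally uses transition-dependent delay
distributions and an actual sensitization check that are not modeled here.

```python
import random
import statistics


def mu_plus_3sigma(samples):
    """Return the (mu + 3*sigma) statistic of a list of delay samples."""
    return statistics.mean(samples) + 3 * statistics.pstdev(samples)


def monte_carlo_delay(paths, gate_delay, trials=20000, seed=0):
    """Sample every gate delay per trial; circuit delay is the max over the given paths."""
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        # One Gaussian delay sample per gate for this trial (assumed distribution).
        d = {g: rng.gauss(mu, sigma) for g, (mu, sigma) in gate_delay.items()}
        samples.append(max(sum(d[g] for g in path) for path in paths))
    return samples


# Hypothetical gate delay distributions (mean, sigma) in picoseconds.
GATE_DELAY = {
    "g1": (40.0, 4.0),
    "g2": (55.0, 5.0),
    "g3": (30.0, 3.0),
    "g4": (35.0, 3.5),
}
ALL_PATHS = [("g1", "g2", "g4"), ("g1", "g3", "g4")]
SENSITIZABLE_PATHS = [("g1", "g3", "g4")]  # assumption: the path through g2 is false

pessimistic = mu_plus_3sigma(monte_carlo_delay(ALL_PATHS, GATE_DELAY))
sensitizable = mu_plus_3sigma(monte_carlo_delay(SENSITIZABLE_PATHS, GATE_DELAY))
print(f"all paths:          {pessimistic:.1f} ps")
print(f"sensitizable paths: {sensitizable:.1f} ps")
```

In this toy setting the longest structural path is the false one, so dropping it lowers the
reported worst case delay. The size of the reduction depends entirely on the assumed numbers.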
Process variation tolerant circuit design approaches are required to improve yield and
lower manufacturing costs. In Chapter IX, a process variation tolerant design approach for
combinational circuits was presented. This approach exploits the fact that random variations
can cause a significant mismatch in two identical devices placed next to each other on
the die. In this approach, a large gate is implemented by connecting an appropriate number
(> 1) of smaller gates in parallel. This parallel connection of smaller gates is referred
to as a parallel gate. Since L and VT variations are largely random and have independent
variations in the smaller gates, the variation tolerance of the parallel gate is improved. The
parallel gates were implemented as single layout cells. By sharing the diffusion region in
the layout of the parallel gates, it is possible to reduce the input and output capacitance
of the gates. This helps in improving the nominal circuit delay as well. To keep the area
overhead low, only critical gates in a circuit were replaced by their parallel counterparts, to
improve the variation tolerance of the circuit. Experimental results from Monte Carlo simulations
demonstrate that this process variation tolerant design approach achieves significant
improvements in circuit level variation tolerance.
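The averaging effect behind the parallel gate idea can be sketched with a small Monte Carlo
experiment in Python. The sketch assumes that each small gate contributes an independent,
Gaussian-distributed drive strength with the same mean and sigma, and it reports the
relative spread (sigma/mean) of the summed drive for n gates in parallel. The function name
and all numerical values are hypothetical. The sketch only shows how independent variations
average out. It does not model how the spread of a single large gate compares with that of
its smaller counterparts, nor the layout effects discussed in Chapter IX.

```python
import random
import statistics


def relative_spread(n_gates, trials=20000, mean_i=1.0, sigma_i=0.1, seed=1):
    """Relative std. dev. (sigma/mean) of the summed drive of n parallel gates.

    Each gate's drive is an independent Gaussian sample (an assumption of
    this sketch, with hypothetical mean_i and sigma_i).
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(mean_i, sigma_i) for _ in range(n_gates))
        for _ in range(trials)
    ]
    return statistics.pstdev(totals) / statistics.mean(totals)


for n in (1, 2, 4):
    print(n, round(relative_spread(n), 4))
```

Under these assumptions the relative spread shrinks roughly as 1/sqrt(n), which is the
standard result for the sum of n independent, identically distributed quantities.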
With the increasing usage of dynamic voltage scaling (DVS) in SoCs and multi-core
ICs, the number of voltage domains in a single IC or SoC has significantly increased. To
interface these voltage domains, voltage level shifters (VLSs) are required. These VLSs
should be able to convert any voltage level to any other desired voltage level with a
predictable delay. Thus, process variation tolerant voltage level shifters are desired. A novel
process variation tolerant single-supply true voltage level shifter (SS-TVLS) design was
presented in Chapter X. The SS-TVLS is the first VLS design which can handle both
low-to-high and high-to-low voltage translation without the need for a control signal. The use of
a single supply voltage reduces circuit complexity, by eliminating the need for routing 2
supply voltages. The proposed circuit was extensively simulated in a 90 nm technology using
SPICE. Simulation results demonstrate that the level shifter is able to perform voltage
level shifting with low leakage for both low-to-high and high-to-low voltage level
translation. The proposed SS-TVLS is also more tolerant to process and temperature
variations than a combination of an inverter along with the VLS solution of [41]. Thus,
the proposed SS-TVLS is better than the best previously known VLS design.
REFERENCES
[1] T. May and M. Woods, Alpha-particle-induced soft errors in dynamic memories,
IEEE Transactions on Electron Devices, vol. ED-26, pp. 2-9, Jan 1979.
[2] J. Pickle and J. Blandford, CMOS RAM cosmic-ray-induced error rate analysis,
IEEE Transactions on Nuclear Science, vol. NS-29, pp. 3962-3967, 1981.
[3] W. Massengill, M. Alles, and S. Kerns, SEU error rates in advanced digital
CMOS, in Proc. of the European Conf. on Radiation and Its Effects on Components
and Systems, Sep. 1993, pp. 546-553.
[4] P. Shivakumar, M. Kistler, S. W. Keckler, D. Burger, and L. Alvisi, Modeling the
effect of technology trends on the soft error rate of combinational logic, in Proc. of
the Intl. Conf. on Dependable Systems and Networks, 2002, pp. 389-398.
[5] Q. Zhou and K. Mohanram, Transistor sizing for radiation hardening, in Proc. of
the Intl. Reliability Physics Symposium, April 2004, pp. 310-315.
[6] P. Hazucha, T. Karnik, J. Maiz, S. Walstra, B. Bloechel, J. Tschanz, G. Dermer,
S. Hareland, P. Armstrong, and S. Borkar, Neutron soft error rate measurements in
a 90-nm CMOS process and scaling trends in SRAM from 0.25-um to 90-nm generation,
in International Electron Devices Meeting, Dec. 2003, pp. 21.5.1-21.5.4.
[7] S. Borkar, Designing reliable systems from unreliable components: the challenges
of transistor variability and degradation, IEEE Micro, vol. 25, no. 6, pp. 10-16,
Nov.-Dec. 2005.
[8] M. Orshansky, S. R. Nassif, and D. Boning, Design for manufacturability and
statistical design: A constructive approach, US Springer, 2008.
[9] K. Agarwal and S. Nassif, Characterizing process variation in nanometer CMOS,
in Proc. of the Design Automation Conf., June 2007, pp. 396-399.
[10] K. Bernstein, D. J. Frank, A. E. Gattiker, W. Haensch, B. L. Ji, S. R. Nassif, E. J.
Nowak, D. J. Pearson, and N. J. Rohrer, High-performance CMOS variability in
the 65-nm regime and beyond, IBM Journal of Research and Development, vol.
50, pp. 433-449, July/Sept. 2006.
[11] P. E. Dodd and L. W. Massengill, Basic mechanisms and modeling of single-event
upset in digital microelectronics, IEEE Transactions on Nuclear Science, vol. 50,
no. 3, pp. 583-602, 2003.
[12] Q. Zhou and K. Mohanram, Gate sizing to radiation harden combinational logic,
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 25, no. 1, pp. 155-166, Jan. 2006.
[13] D. Binder, C. Smith, and A. Holman, Satellite anomalies from galactic cosmic
rays, IEEE Transactions on Nuclear Science, vol. NS-22, pp. 2675-2680, Dec.
1975.
[14] R. G. Harrison and D. B. Stephenson, Detection of a galactic cosmic ray influence
on clouds, Geophysical Research Abstracts, vol. 8, no. 07661, pp. 1, 2006.
[15] G. Cellere, A. Paccagnella, A. Visconti, and M. Bonanomi, Soft errors induced by
single heavy ions in floating gate memory arrays, in Proc. of the Intl. Symposium
on Defect and Fault Tolerance in VLSI Systems, 2005, pp. 275-284.
[16] A. Johnston, Scaling and technology issues for soft error rate, in Proc. of the
Annual Research Conf. on Reliability, Oct. 2000, pp. 1-8.
[17] L. D. Edmonds, A simple estimate of funneling-assisted charge collection, IEEE
Transactions on Nuclear Science, vol. 38, no. 2, pp. 828-833, Apr 1991.
[18] P. E. Dodd, F. W. Sexton, and P. S. Winokur, Three-dimensional simulation of
charge collection and multiple-bit upset in Si devices, IEEE Transactions on
Nuclear Science, vol. 41, pp. 2005-2017, 1994.
[19] P. E. Dodd, Device simulation of charge collection and single-event upset, IEEE
Transactions on Nuclear Science, vol. 43, no. 2, pp. 561-575, Apr. 1996.
[20] G. Messenger, Collection of charge on junction nodes from ion tracks, IEEE
Transactions on Nuclear Science, vol. 29, no. 6, pp. 2024-2031, 1982.
[21] O. A. Amusan, Analysis of single event vulnerabilities in a 130 nm CMOS
technology, M.S. thesis, Vanderbilt University, 2006.
[22] S. DasGupta, Trends in single event pulse widths and pulse shapes in deep
submicron CMOS, M.S. thesis, Vanderbilt University, 2008.
[23] G. E. Moore, Cramming more components onto integrated circuits, Electronics,
vol. 38, no. 8, pp. 1-4, April 1965.
[24] R. E. Kessler, The alpha 21264 microprocessor, IEEE Micro, vol. 19, no. 2, pp.
24-36, 1999.
[25] H. Chang and S. S. Sapatnekar, Statistical timing analysis under spatial
correlations, IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems, vol. 24, no. 9, pp. 1467-1482, Sept. 2005.
[26] L. He, A. B. Kahng, K. H. Tam, and J. Xiong, Simultaneous buffer insertion and
wire sizing considering systematic CMP variation and random Leff variation, Proc.
of the Intl. Conf. on Computer-Aided Design, vol. 26, no. 5, pp. 845-857, May 2007.
[27] K. Cao, S. Dobre, and J. Hu, Standard cell characterization considering lithography
induced variations, in Proc. of the Design Automation Conf., 2006, pp. 801-804.
[28] K. J. Kuhn, Reducing variation in advanced logic technologies: Approaches to
process and design for manufacturability of nanoscale CMOS, in International
Electron Devices Meeting, Dec. 2007, pp. 471-474.
[29] S. Nassif, Delay variability: Sources, impacts and trends, in Proc. of the Intl.
Solid State Circuits Conf., 2000, pp. 368-369.
[30] W. Zhao, Y. Cao, F. Liu, K. Agarwal, D. Acharyya, S. Nassif, and K. Nowka,
Rigorous extraction of process variations for 65nm CMOS design, in Proc. of the
European Solid State Device Research Conf., Sept. 2007.
[31] K. Agarwal, F. Liu, C. McDowell, S. Nassif, K. Nowka, M. Palmer, D. Acharyya,
and J. Plusquellic, A test structure for characterizing local device mismatches, in
Proc. of the Symposium on VLSI Circuits, 2006, pp. 67-68.
[32] R. Garg, C. Nagpal, and S. P. Khatri, A fast, analytical estimator for the
SEU-induced pulse width in combinational designs, in Proc. of the Design Automation
Conf., June 2008, pp. 918-923.
[33] R. Garg and S. P. Khatri, Efficient analytical determination of the SEU-induced
pulse shape, in Proc. of the Asia and South Pacific Design Automation Conf., Jan.
2009.
[34] R. Garg, P. Li, and S. P. Khatri, Modeling dynamic stability of SRAMs in the
presence of single event upsets (SEUs), in Proc. of the Intl. Symposium on Circuits
and Systems, May 2008, pp. 1788-1791.
[35] R. Garg, N. Jayakumar, S. P. Khatri, and G. Choi, A design approach for
radiation-hard digital electronics, in Proc. of the Design Automation Conf., July 2006, pp.
773-778.
[36] R. Garg and S. P. Khatri, A novel, highly SEU tolerant digital circuit design
approach, in Proc. of the Intl. Conf. on Computer Design, Oct. 2008, pp. 14-20.
[37] K. Mohanram, Closed-form simulation and robustness models for SEU-tolerant
design, in Proc. of the VLSI Test Symposium, 2005, pp. 327-333.
[38] L. Nagel, Spice: A computer program to simulate computer circuits, in University
of California, Berkeley UCB/ERL Memo M520, May 1995.
[39] B. Zhang, A. Arapostathis, S. Nassif, and M. Orshansky, Analytical modeling of
SRAM dynamic stability, in Proc. of the Intl. Conf. on Computer-Aided Design,
Nov. 2006, pp. 315-322.
[40] Synopsys Inc., Mountain View, CA, Sentaurus user’s manuals, 2007.12 edition.
[41] Q. A. Khan, S. K. Wadhwa, and K. Misri, A single supply level shifter for multi
voltage systems, in Proc. of the Intl. Conf. on VLSI Design, Jan. 2006, pp. 1-4.
[42] K. Mohanram and N. A. Touba, Cost-effective approach for reducing soft error
failure rate in logic circuits, in Proc. of the Intl. Test Conf., 2003, pp. 893-901.
[43] T. Heijmen and A. Nieuwland, Soft-error rate testing of deep-submicron integrated
circuits, in Proc. of the IEEE European Test Symposium, 2006, pp. 247-252.
[44] P. Dahlgren and P. Liden, A switch-level algorithm for simulation of transients in
combinational logic, in Proc. of the Intl. Symposium on Fault-Tolerant Computing,
June 1995, pp. 207-216.
[45] Nanoscale integration and modeling (NIMO) group (2007), ASU Predictive
Technology Model [On-line], Available: http://www.eas.asu.edu/~ptm.
[46] E. Seevinck, F. J. List, and J. Lohstroh, Static-noise margin analysis of MOS
SRAM cells, IEEE Journal of Solid-State Circuits, vol. SC-22, no. 5, pp. 748-754,
Oct. 1987.
[47] H. Cha, E. M. Rudnick, J. H. Patel, R. K. Iyer, and G. S. Choi, A gate-level
simulation environment for alpha-particle-induced transient faults, IEEE Transactions
on Computers, vol. 45, no. 11, pp. 1248-1256, 1996.
[48] A. Dharchoudhury, S. M. Kang, H. Cha, and J. H. Patel, Fast timing simulation
of transient faults in digital circuits, in Proc. of the Intl. Conf. on Computer-Aided
Design, 1994, pp. 719-722.
[49] P. E. Dodd, M. R. Shaneyfelt, and F. W. Sexton, Charge collection and SEU from
angled ion strikes, IEEE Transactions on Nuclear Science, vol. 44, pp. 2256-2265,
1997.
[50] H. Cha, E. M. Rudnick, G. S. Choi, J. H. Patel, and R. K. Iyer, A fast and accurate
gate-level transient fault simulation environment, in Proc. of the Symposium on
Fault-Tolerant Computing, 1993, pp. 310-319.
[51] C. Zhao, X. Bai, and S. Dey, A scalable soft spot analysis methodology for
compound noise effects in nano-meter circuits, in Proc. of the Design Automation
Conf., 2004, pp. 894-899.
[52] Y. Shih and S. Kang, Analytic transient solution of general MOS circuit primitives,
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 11, no. 6, pp. 719-731, 1992.
[53] C. Kashyap, C. Amin, N. Menezes, and E. Chiprout, A nonlinear cell macromodel
for digital applications, in Proc. of the Intl. Conf. on Computer-Aided Design,
2007, pp. 678-685.
[54] M. P. Baze and S. P. Buchner, Attenuation of single event induced pulses in CMOS
combinational logic, IEEE Transactions on Nuclear Science, vol. 44, pp. 2217-2223,
Dec. 1997.
[55] S. Mitra, N. Seifert, M. Zhang, and K. Kim, Robust system design with built-in
soft-error resilience, IEEE Transactions on Computers, vol. 38, pp. 43-52, Feb.
2005.
[56] M. Nicolaidis, Time redundancy based soft-error tolerance to rescue nanometer
technologies, in Proc. of the VLSI Test Symposium, April 1999, pp. 86-94.
[57] L. Anghel, D. Alexandrescu, and M. Nicolaidis, Evaluation of a soft error tolerance
technique based on time and/or space redundancy, in Proc. of the Symposium on
Integrated Circuits and Systems Design, Manaus, Brazil, 2000, pp. 237-242.
[58] A. Kasnavi, J. W. Wang, M. Shahram, and J. Zejda, Analytical modeling of
crosstalk noise waveforms using Weibull function, in Proc. of the Intl. Conf. on
Computer-Aided Design, 2004, pp. 141-146.
[59] W. Chen, S. K. Gupta, and M. A. Breuer, Analytical models for crosstalk excitation
and propagation in VLSI circuits, IEEE Transactions on Computer-Aided Design
of Integrated Circuits and Systems, vol. 21, no. 10, pp. 1117-1131, Oct. 2002.
[60] F. Wang, Y. Xie, R. Rajaraman, and B. Vaidyanathan, Soft error rate analysis for
combinational logic using an accurate electrical masking model, in Proc. of the
Intl. Conf. on VLSI Design, 2007, pp. 165-170.
[61] C. Forzan and D. Pandini, A complete methodology for an accurate static noise
analysis, in Proc. of the Great Lakes Symposium on VLSI, 2005, pp. 302-307.
[62] J. Rabaey, Digital Integrated Circuits: A Design Perspective, Prentice Hall
Electronics and VLSI Series. US Prentice Hall, 1996.
[63] K. Takeda, Y. Hagihara, Y. Aimoto, M. Nomura, Y. Nakazawa, T. Ishii, and
H. Kobatake, A read-static-noise-margin-free SRAM cell of low-vdd and high-speed
applications, IEEE Journal of Solid-State Circuits, vol. 41, no. 1, pp. 113-121, Jan.
2006.
[64] S. Rusu, M. Sachdev, C. Svensson, and B. Nauta, T3: Trends and challenges in
VLSI technology scaling towards 100nm, in Proc. of the Asia South Pacific Design
Automation Conf., 2002, pp. 16-17.
[65] K. Anami, M. Yoshimoto, H. Shinohara, Y. Hirata, and T. Nakano, Design
consideration of a static memory cell, IEEE Journal of Solid-State Circuits, vol. SC-18,
no. 4, pp. 414-418, Aug. 1983.
[66] J. Lohstroh, E. Seevinck, and J. D. Groot, Worst-case static noise margin criteria
for logic circuits and their mathematical equivalence, IEEE Journal of Solid-State
Circuits, vol. SC-18, no. 6, pp. 803-807, Dec. 1983.
[67] C. Tsai and M. Marek-Sadowska, Analysis of process variation’s effect on SRAM’s
read stability, in Proc. of the Intl. Symposium on Quality Electronic Design, 2006,
pp. 603-610.
[68] B. H. Calhoun and A. P. Chandrakasan, Static noise margin variation for
sub-threshold SRAM in 65-nm CMOS, IEEE Journal of Solid-State Circuits, vol. 41, pp.
1673-1679, July 2006.
[69] R. Garg, N. Jayakumar, S. P. Khatri, and G. Choi, A design approach for
radiation-hard digital electronics, in Proc. of the Design Automation Conf., July 2006, pp.
773-778.
[70] M. Horowitz, Timing models for MOS circuits, Ph.D. dissertation, Stanford
University, 1984.
[71] Synopsys Inc., Mountain View, CA, HSPICE User’s Manual, 2003.03 edition.
[72] ITRS, The international technology roadmap for semiconductors edition,
Available: http://public.itrs.net/, 2003.
[73] R. Gonzalez, B. M. Gordon, and M. A. Horowitz, Supply and threshold voltage
scaling for low power CMOS, IEEE Journal of Solid-State Circuits, vol. 32, no. 8,
pp. 1210-1216, Aug 1997.
[74] T. Burd, T. Pering, A. Stratakos, and T. Brodersen, A dynamic voltage scaled
microprocessor system, in Proc. of the Intl. Solid State Circuits Conf., 2000, pp.
294-295, 466.
[75] K. Flautner, S. Reinhardt, and T. Mudge, Automatic performance setting for
dynamic voltage scaling, Wireless Networks, vol. 8, no. 5, pp. 507-520, 2002.
[76] W. Kim, D. Shin, H. S. Yun, J. Kim, and S. L. Min, Performance comparison
of dynamic voltage scaling algorithms for hard real-time systems, in Proc. of the
Symposium on Real-Time and Embedded Technology and Applications, 2002, pp.
219-228.
[77] B. Zhai, D. Blaauw, D. Sylvester, and K. Flautner, Theoretical and practical limits
of dynamic voltage scaling, in Proc. of the Design Automation Conf., 2004, pp.
868-873.
[78] C. Duan and S. P. Khatri, Computing during supply voltage switching in DVS
enabled real-time processors, in Proc. of the Intl. Symposium on Circuits and
Systems, May 2006, pp. 5115-5118.
[79] S. Choi, B. Kim, J. Park, C. Kang, and D. Eom, An implementation of wireless
sensor network, IEEE Transactions on Consumer Electronics, vol. 50, no. 1, pp.
236-244, Feb 2004.
[80] A. Abidi, G. Pottie, and W. Kaiser, Power-conscious design of wireless circuits
and systems, Proceedings of the IEEE, vol. 88, no. 10, pp. 1528-1545, Oct 2000.
[81] N. Jayakumar, R. Garg, B. Gamache, and S. P. Khatri, A PLA based asynchronous
micropipelining approach for subthreshold circuit design, in Proc. of the Design
Automation Conf., July 2006, pp. 419-424.
[82] J. M. Palau, M. C. Calvet, P. E. Dodd, F. W. Sexton, and P. Roche, Contribution
of device simulation to SER understanding, in Proc. of the Intl. Reliability Physics
Symposium, March 2003, pp. 71-75.
[83] P. Hazucha, C. Svensson, and S. A. Wender, Cosmic-ray soft error rate
characterization of a standard 0.6-µm CMOS process, IEEE Journal of Solid-State Circuits,
vol. 35, no. 10, pp. 1422-1429, Oct 2000.
[84] F. Irom, F. F. Farmanesh, A. H. Johnston, G. M. Swift, and D. G. Millward,
Single-event upset in commercial silicon-on-insulator PowerPC microprocessors, IEEE
Transactions on Nuclear Science, vol. 49, no. 6, pp. 3148-3155, Dec 2002.
[85] O. Flament, J. Baggio, C. D’hose, G. Gasiot, and J. L. Leray, 14 MeV
neutron-induced SEU in SRAM devices, IEEE Transactions on Nuclear Science, vol. 51,
no. 5, pp. 2908-2911, Oct. 2004.
[86] P. E. Dodd, M. R. Shaneyfelt, J. A. Felix, and J. R. Schwank, Production and
propagation of single-event transients in high-speed digital logic ICs, IEEE Transactions
on Nuclear Science, vol. 51, no. 6, pp. 3278-3284, Dec. 2004.
[87] P. Bai, C. Auth, S. Balakrishnan, M. Bost, R. Brain, and V. Chikarmane et al., A
65nm logic technology featuring 35nm gate lengths, enhanced channel strain, 8 Cu
interconnect layers, low-k ILD and 0.57 µm² SRAM cell, International Electron
Devices Meeting, pp. 657-660, Dec. 2004.
[88] S. Chen, H. Huang, H. Lin, L. Shie, and M. Chen, Using simulation for characterize
high performance 65nm node planar N-MOSFETs, in Proc. of the Intl. Symposium
on NANO Science and Technology, Nov. 2004, pp. 1-2.
[89] Y. Taur and E. J. Nowak, CMOS devices below 0.1 um: How high will performance
go? International Electron Devices Meeting, pp. 215-218, Dec. 1997.
[90] T. Fukai, Y. Nakahara, M. Terai, S. Koyama, and Y. Morikuni et al., A 65 nm-node
CMOS technology with highly reliable triple gate oxide suitable for
power-considered system-on-a-chip, Proc. of the Symposium on VLSI Technology, pp.
83-84, June 2003.
[91] R. Baumann, Soft errors in advanced computer systems, IEEE Design & Test of
Computers, vol. 22, no. 3, pp. 258-266, May-June 2005.
[92] J. Wang, B. Cronquist, and J. McGowan, Rad-hard/hi-rel FPGA, in Proc. of the
ESA Electronic Components Conf., April 1997, pp. 1-4.
[93] B. Gill, M. Nicolaidis, F. Wolff, C. Papachristou, and S. Garverick, An efficient
BICS design for SEUs detection and correction in semiconductor memories, in
Proc. of the Conf. on Design Automation and Test in Europe, March 2005, pp. 592-597.
[94] G. Agrawal, L. Massengill, and K. Gulati, A proposed SEU tolerant dynamic
random access memory (DRAM) cell, in IEEE Transactions on Nuclear Science, Dec.
1994, vol. 41, pp. 2035-2042.
[95] M. Caffrey, P. Graham, E. Johnson, and M. Wirthli, Single-event upsets in SRAM
FPGAs, in Proc. of the Intl. Conf. on Military and Aerospace Programmable Logic
Devices, Sep. 2002, pp. 1-6.
[96] C. Carmichael, E. Fuller, M. Caffrey, P. Blain, and H. Bogrow, SEU mitigation
techniques for Virtex FPGAs in space applications, in Proc. of the Intl. Conf. on
Military and Aerospace Programmable Logic Devices, Sep. 1999, pp. 1-8.
[97] S. Whitaker, J. Canaris, and K. Liu, SEU hardened memory cells for a CCSDIS
Reed Solomon encoder, IEEE Transactions on Nuclear Science, vol. 38, no. 6, pp.
1471-1477, 1991.
[98] M. N. Liu and S. Whitaker, Low power SEU immune CMOS memory circuits,
IEEE Transactions on Nuclear Science, vol. 36, no. 6, pp. 1679-1684, 1992.
[99] J. S. Cable, E. F. Lyons, M. A. Stuber, and M. L. Burgener, United States patent
6531739: Radiation-hardened silicon-on-insulator CMOS device, and method
of making the same, Available: http://www.freepatentsonline.com/6531739.html,
Nov. 2003, pp. 1-16.
[100] C. Nagpal, R. Garg, and S. P. Khatri, A delay-efficient radiation-hard digital design
approach using CWSP elements, in Proc. of the Conf. on Design Automation and
Test in Europe, March 2008, pp. 354-359.
[101] J. P. Hayes, I. Polian, and B. Becker, An analysis framework for transient-error
tolerance, in Proc. of the VLSI Test Symposium, 2007, pp. 249-255.
[102] C. Zhao, S. Dey, and X. Bai, Soft-spot analysis: Targeting compound noise effects
in nanometer circuits, Design and Test of Computers, vol. 22, no. 4, pp. 362-375,
2005.
[103] C. Zhao and S. Dey, Improving transient error tolerance of digital VLSI circuits
using robustness compiler (ROCO), in Proc. of the Intl. Symposium on Quality
Electronic Design, 2006, pp. 133-140.
[104] Q. Lin, M. Ma, T. Vo, J. Fan, X. Wu, R. Li, and X. Li, Design-for-manufacture
for multi-gate oxide CMOS process, in Proc. of the Intl. Symposium on Quality
Electronic Design, 2007, pp. 339-343.
[105] B. Amelifard, F. Fallah, and M. Pedram, Reducing the sub-threshold and
gate-tunneling leakage of SRAM cells using dual-vt and dual-Tox assignment, in Proc.
of the Conf. on Design Automation and Test in Europe, 2006, pp. 995-1000.
[106] Altera Inc., San Jose, CA, Stratix III programmable power, May 2007, pp. 1-12,
Available: http://www.altera.com/literature/wp/wp-01006.pdf.
[107] Y. Cao, T. Sato, D. Sylvester, M. Orshansky, and C. Hu, New paradigm of
predictive MOSFET and interconnect modeling for early circuit design, in Proc. of IEEE
Custom Integrated Circuit Conf., June 2000, pp. 201-204.
[108] P. McGeer, A. Saldanha, R. Brayton, and A. Sangiovanni-Vincentelli, Logic
Synthesis and Optimization. Boston, MA: US Kluwer Academic Publishers, 1993, ch.
Delay Models and Exact Timing Analysis, pp. 167-189.
[109] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha,
H. Savoj, P. R. Stephan, R. K. Brayton, and A. L. Sangiovanni-Vincentelli, SIS: A
system for sequential circuit synthesis, Tech. Rep. UCB/ERL M92/41, Electronics
Research Laboratory, Univ. of California, Berkeley, May 1992.
[110] Cadence Design Systems, Inc., San Jose, CA, Envisia Silicon Ensemble
Place-and-route Reference Manuals, Nov 1999.
[111] J. Canaris, An SEU immune logic family, in Proc. of the NASA Symposium on
VLSI Design, Oct 1991, pp. 2.3.1-2.3.11.
[112] E. J. Nowak, Maintaining the benefits of CMOS scaling when scaling bogs down,
IBM Journal of Research and Development, vol. 46, no. 2-3, pp. 169-186, 2002.
[113] ECSS: European cooperation for space standardization, Energetic particle radiation,
May 2008, Available: http://www.spenvis.oma.be/spenvis/ecss/ecss09/ecss09.html.
[114] J. Feynman, G. Spitale, J. Wang, and S. Gabriel, Interplanetary proton fluence
model: JPL 1991, J. Geophys. Res., vol. 98, pp. 13281-13294, 1993.
[115] J. Benkoski and A. J. Strojwas, A new approach to hierarchical and statistical
timing simulations, in IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, Nov. 1987, vol. 6, pp. 1039-1052.
[116] H. Jyu and S. Malik, Statistical delay modeling in logic design and synthesis, in
Proc. of the Design Automation Conf., 1994, pp. 126-130.
[117] C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, and S. Narayan,
First-order incremental block-based statistical timing analysis, in Proc. of the Design
Automation Conf., 2004, pp. 331-336.
[118] A. Agarwal, V. Zolotov, and D. T. Blaauw, Statistical timing analysis using bounds
and selective enumeration, in IEEE Transactions on Computer-Aided Design of
Integrated Circuits and Systems, Sept. 2003, vol. 22, pp. 1243-1260.
[119] A. Agarwal, D. Blaauw, and V. Zolotov, Statistical timing analysis for intra-die
process variations with spatial correlations, in Proc. of the Intl. Conf. on
Computer-Aided Design, Nov. 2003, pp. 900-907.
[120] A. Agarwal, D. Blaauw, V. Zolotov, and S. Vrudhula, Statistical timing analysis
using bounds, in Proc. of the Conf. on Design Automation and Test in Europe,
March.
[121] A. Devgan and C. V. Kashyap, Block-based static timing analysis with
uncertainty, in Proc. of the Intl. Conf. on Computer-Aided Design, 2003, pp. 607-614.
[122] J. Liou, K. Cheng, S. Kundu, and A. Krstic, Fast statistical timing analysis by
probabilistic event propagation, in Proc. of the Design Automation Conf., 2001,
pp. 661-666.
[123] J. Liou, A. Krstic, L. Wang, and K. Cheng, False-path-aware statistical timing
analysis and efficient path selection for delay testing and timing validation, in Proc.
of the Design Automation Conf., 2002, pp. 566-569.
[124] L. Xie and A. Davoodi, Bound-based identification of timing-violating paths under
variability, in Proc. of the Asia and South Pacific Design Automation Conf., Jan.
2009, pp. 278-283.
[125] S. A. Cook, The complexity of theorem-proving procedures, in Proc. of the Third
Annual ACM Symposium on Theory of Computing, 1971, pp. 151-158.
[126] M. Davis, G. Logemann, and D. Loveland, A machine program for
theorem-proving, Communications of the ACM, vol. 5, no. 7, pp. 394-397, 1962.
[127] S. Malik, Y. Zhao, C. F. Madigan, L. Zhang, and M. W. Moskewicz, Chaff:
Engineering an efficient SAT solver, in Proc. of the Design Automation Conf., 2001,
pp. 530-535.
[128] Y. Kukimoto, W. Gosti, A. Saldanha, and R. K. Brayton, Approximate timing
analysis of combinational circuits under the XBD0 model, in Proc. of the Intl.
Conf. on Computer-Aided Design, 1997, pp. 176-181.
[129] M. C. T. Chao, L. Wang, K. Cheng, and S. Kundu, Static statistical timing analysis
for latch-based pipeline designs, in Proc. of the Intl. Conf. on Computer-Aided
Design, 2004, pp. 468-472.
[130] O. Neiroukh and X. Song, Improving the process-variation tolerance of digital
circuits using gate sizing and statistical techniques, in Proc. of the Conf. on Design
Automation and Test in Europe, 2005, pp. 294-299.
[131] J. Tschanz, K. Bowman, and V. De, Variation-tolerant circuits: Circuit solutions
and techniques, in Proc. of the Design Automation Conf., June 2005, pp. 762-763.
[132] G. Nabaa and F. N. Najm, Minimization of delay sensitivity to process induced
voltage threshold variations, in Proc. of the IEEE-NEWCAS Conf., June 2005, pp.
171-174.
[133] S. Bhunia, S. Mukhopadhyay, and K. Roy, Process variations and process-tolerant
design, in Proc. of the Intl. Conf. on VLSI Design, Jan. 2007, pp. 699-704.
[134] A. Gattiker, M. Bhushan, and M. B. Ketchen, Data analysis techniques for CMOS
technology characterization and product impact assessment, in Proc. of the Intl.
Test Conf., 2006, pp. 1-10.
[135] R. Garg, N. Jayakumar, and S. P. Khatri, On the improvement of statistical timing
analysis, in Proc. of the Intl. Conf. on Computer Design, Oct. 2006, pp. 3742.
[136] J. Tschanz, J. Kao, S. Narendra, R. Nair, D. Antoniadis, A. Chandrakasan, and Vivek
De, Adaptive body bias for reducing impacts of die-to-die and within-die parameter
variations on microprocessor frequency and leakage, in Proc. of the Intl. Solid State
Circuits Conf., Feb. 2002, pp. 422  478.
[137] J.W. Tschanz, S. Narendra, R. Nair, and V. De, Effectiveness of adaptive supply
voltage and body bias for reducing impact of. parameter variations in low power and
high performance microprocessors, IEEE Journal of Solid-State Circuits, vol. 38,
no. 5, pp. 826829, May 2003.
[138] B. C. Paul, A. Agarwal, and K. Roy, "Low-power design techniques for scaled
technologies," Integration, the VLSI Journal, vol. 39, no. 2, pp. 64–89, 2006.
[139] A. H. El-Maleh, B. M. Al-Hashimi, and A. Melouki, "Transistor-level based
defect tolerance for reliable nanoelectronics," in Proc. of the Intl. Conf. on Computer
Systems and Applications, April 2008, pp. 53–60.
[140] H. El-Razouk and Z. Abid, "A new transistor-redundant voter for defect-tolerant
digital circuits," in Proc. of the Canadian Conf. on Electrical and Computer
Engineering, May 2006, pp. 1078–1081.
[141] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, "Matching properties
of MOS transistors," IEEE Journal of Solid-State Circuits, vol. 24, no. 5, pp.
1433–1439, Oct. 1989.
[142] M. R. Guthaus, N. Venkateswaran, C. Visweswariah, and V. Zolotov, "Gate sizing
using incremental parameterized statistical timing analysis," in Proc. of the Intl.
Conf. on Computer-Aided Design, 2005, pp. 1029–1036.
[143] S. Raj, S. B. K. Vrudhula, and J. Wang, "A methodology to improve timing yield in
the presence of process variations," in Proc. of the Design Automation Conf., 2004,
pp. 448–453.
[144] A. Agarwal, K. Chopra, and D. Blaauw, "Statistical timing based optimization using
gate sizing," in Proc. of the Conf. on Design Automation and Test in Europe, 2005,
pp. 400–405.
[145] X. Bai, C. Visweswariah, and P. N. Strenski, "Uncertainty-aware circuit
optimization," in Proc. of the Design Automation Conf., 2002, pp. 58–63.
[146] S. H. Choi, B. C. Paul, and K. Roy, "Novel sizing algorithm for yield improvement
under process variation in nanometer technology," in Proc. of the Design
Automation Conf., 2004, pp. 454–459.
[147] D. E. Lackey, P. S. Zuchowski, T. R. Bednar, D. W. Stout, S. W. Gould, and
J. M. Cohn, "Managing power and performance for
SOC designs using voltage islands," in Proc. of the Intl. Conf. on Computer-Aided
Design, Nov. 2002, pp. 195–202.
[148] T. Hattori et al., "A power management scheme controlling 20 power domains for
a single-chip mobile processor," in Proc. of the Intl. Solid State Circuits Conf., Feb.
2006, pp. 540–541.
[149] R. Garg, G. Mallarapu, and S. P. Khatri, "A single-supply true voltage level
shifter," in Proc. of the Conf. on Design Automation and Test in Europe, March
2008, pp. 979–984.
[150] S. C. Tan and X. W. Sun, "Low power CMOS level shifters by bootstrapping
technique," IEEE Electronics Letters, pp. 876–878, Aug. 2002.
[151] A. U. Diril, Y. S. Dhillon, A. Chatterjee, and A. D. Singh, "Level-shifter free design
of low power dual supply voltage CMOS circuits using dual threshold voltages,"
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, pp.
1103–1107, Sept. 2005.
[152] W. Wang, M. Ker, M. Chiang, and C. Chen, "Level shifters for high-speed 1-V
to 3.3-V interfaces in a 0.13-μm Cu-interconnection/low-k CMOS technology," in
Proc. of the Intl. Symposium on VLSI Technology, Systems, and Applications,
18–20 April 2001, pp. 307–310.
[153] D. Pan, H. W. Li, and B. M. Wilamowski, "A low voltage to high voltage
level shifter circuit for MEMS application," in Proc. of the Biennial
University/Government/Industry Microelectronics Symposium, July 2003, pp. 128–131.
[154] R. Puri, L. Stok, J. Cohn, D. S. Kung, D. Z. Pan, D. Sylvester, A. Srivastava, and
S. Kulkarni, "Pushing ASIC performance in a power envelope," in Proc. of the
Design Automation Conf., June 2003, pp. 788–793.
VITA
Rajesh Garg received his B.Tech degree in Electrical Engineering (Power) from the
Indian Institute of Technology-Delhi (IIT-Delhi), India, in 2004 and his M.S. and Ph.D.
degrees in Computer Engineering from Texas A&M University, College Station, in
2006 and 2009, respectively. During his graduate and doctoral studies, he conducted research
and published papers in many aspects of VLSI including radiation tolerant circuit design,
process variation tolerant circuit design, circuit modeling, SRAM design, structured ASIC
design, logic synthesis, low power design, Viterbi decoder design and statistical timing
analysis. His current research is focused on resilient circuit design, circuit modeling and
statistical timing analysis. During May-August 2006, he worked as a research intern on low
power receiver implementation for UWB communication systems at Mitsubishi Electric
Research Lab (MERL), Cambridge, MA. He also worked at Intel Corporation, Austin,
TX during May-August 2007 as an intern on clock distribution, power gating blocks and
IO drivers. He received the President’s Silver medal and Best Project in Electrical
Engineering award at IIT-Delhi in 2004.
Rajesh Garg may be reached at:
331-G WERC, MS 3259,
Department of ECE,
Texas A&M University,
College Station, TX-77843
E-mail: rajesh.garg@yahoo.com
The typist for this dissertation was Rajesh Garg.
