High-performance subthreshold standard cell design and cell placement optimization by Amarchinta, Sumanth
Rochester Institute of Technology
RIT Scholar Works
Theses Thesis/Dissertation Collections
6-1-2009
Recommended Citation
Amarchinta, Sumanth, "High-performance subthreshold standard cell design and cell placement optimization" (2009). Thesis.
Rochester Institute of Technology. Accessed from
High-Performance Subthreshold Standard Cell Design and
Cell Placement Optimization
by
Sumanth Amarchinta
A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of
Master of Science in Computer Engineering
Supervised by
Dr. Dhireesha Kudithipudi
Department of Computer Engineering
Kate Gleason College of Engineering
Rochester Institute of Technology
Rochester, New York
June 2009
Approved By:
Dr. Dhireesha Kudithipudi
Assistant Professor, Department of Computer Engineering
Primary Advisor
Dr. James Moon
Associate Professor, Department of Electrical Engineering
Dr. Ken Hsu
Professor, Department of Computer Engineering
Dedication
To Family and GOD.
Acknowledgments
I sincerely thank my advisor, Dr. Dhireesha Kudithipudi, for her constant support throughout
my stay at RIT, which made this work possible. She helped me in every possible way
to attain my goal. I have learned a lot from Dr. Kudithipudi, especially the way she manages
a large research group. I am grateful to my thesis committee members, Dr. Moon and Dr.
Hsu, for their support and ideas, which helped me in my thesis. I would like to thank Dr.
Ruben Proano for his suggestions. I would also like to thank all the faculty and staff of
Computer Engineering for their support throughout my master's program at RIT.
Contents
Dedication
Acknowledgments
Abstract
1 Subthreshold Circuits
1.1 Introduction
1.2 MOS Transistor Characteristics
1.3 Summary
2 Motivation and Supporting Work
2.1 Supporting Work
2.2 Motivation
2.3 Thesis Objectives
2.4 Summary
3 Performance Enhancement of Subthreshold Circuits
3.1 Overview
3.2 Substrate Biasing
3.3 Charge Boosting
3.4 Summary
4 Cell Placement Optimization for Minimizing Energy Consumption
4.1 Overview
4.2 Optimization Algorithm
4.2.1 Computation of Early Event Time
4.2.2 Computation of Late Event Time
4.2.3 Total Float
4.3 Optimization Flow for Implementing CPM
4.4 Summary
5 Results and Analysis
5.1 Simulation Setup
5.2 Performance Enhanced Standard Cell Library
5.2.1 Inverter
5.2.2 AND
5.2.3 NAND
5.2.4 OR
5.2.5 NOR
5.2.6 XOR and XNOR
5.2.7 AND-OR and AND-OR-INVERT
5.2.8 OR-AND and OR-AND-INVERT
5.2.9 NOR0211
5.2.10 Summary of Performance-Enhanced Standard Cell Library
5.3 Implementation of CPM algorithm on Benchmark Circuits
6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
Bibliography
Appendices
List of Tables
3.1 Delay and power values for AND02 with Vdd = 0.3 V.
4.1 List of all nodes, their successors and predecessors.
4.2 List of all arcs, corresponding standard cells and their delays.
4.3 Early event time and latest event time for all nodes.
4.4 List of all arcs and their respective total float.
5.1 Delay and energy values of an inverter at 0.3 V for IBM 65 nm technology.
5.2 Delay values for inverter at 0.3 V for IBM 65 nm technology across FF, FS, FS and SF corners.
5.3 Delay and energy values for AND02 at 0.3 V for IBM 65 nm technology.
5.4 Delay and energy values for AND03 at 0.3 V for IBM 65 nm technology.
5.5 Delay and energy values for AND04 at 0.3 V for IBM 65 nm technology.
5.6 Delay and energy values for NAND02 at 0.3 V for IBM 65 nm technology.
5.7 Delay and energy values for NAND03 at 0.3 V for IBM 65 nm technology.
5.8 Delay and energy values for NAND04 at 0.3 V for IBM 65 nm technology.
5.9 Delay and energy values for OR02 at 0.3 V for IBM 65 nm technology.
5.10 Delay and energy values for OR03 at 0.3 V for IBM 65 nm technology.
5.11 Delay and energy values for OR04 at 0.3 V for IBM 65 nm technology.
5.12 Delay and energy values for NOR02 at 0.3 V for IBM 65 nm technology.
5.13 Delay and energy values for NOR03 at 0.3 V for IBM 65 nm technology.
5.14 Delay and energy values for NOR04 at 0.3 V for IBM 65 nm technology.
5.15 Delay and energy values for XOR at 0.3 V for IBM 65 nm technology.
5.16 Delay and energy values for XNOR at 0.3 V for IBM 65 nm technology.
5.17 Delay and energy values for AO21 at 0.3 V for IBM 65 nm technology.
5.18 Delay and energy values for AOI21 at 0.3 V for IBM 65 nm technology.
5.19 Delay and energy values for AO22 at 0.3 V for IBM 65 nm technology.
5.20 Delay and energy values for AOI22 at 0.3 V for IBM 65 nm technology.
5.21 Delay and energy values for AO221 at 0.3 V for IBM 65 nm technology.
5.22 Delay and energy values for AOI221 at 0.3 V for IBM 65 nm technology.
5.23 Delay and energy values for AO32 at 0.3 V for IBM 65 nm technology.
5.24 Delay and energy values for AOI32 at 0.3 V for IBM 65 nm technology.
5.25 Delay and energy values for AO321 at 0.3 V for IBM 65 nm technology.
5.26 Delay and energy values for AOI321 at 0.3 V for IBM 65 nm technology.
5.27 Delay and energy values for OA21 at 0.3 V for IBM 65 nm technology.
5.28 Delay and energy values for OAI21 at 0.3 V for IBM 65 nm technology.
5.29 Delay and energy values for OA32 at 0.3 V for IBM 65 nm technology.
5.30 Delay and energy values for OAI32 at 0.3 V for IBM 65 nm technology.
5.31 Delay and energy values for NOR0211 at 0.3 V for IBM 65 nm technology.
5.32 Design choice of a standard cell for delay, energy and energy-delay product as metrics.
5.33 Delay, power and energy values for Gate-Gate standard cell library at 0.3 V and 125 °C.
5.34 Delay, power and energy values for Drain-Drain standard cell library at 0.3 V and 125 °C.
5.35 Delay, power and energy values for Supply-Ground standard cell library at 0.3 V and 125 °C.
5.36 Delay, power and energy values for charge-boosting standard cell library at 0.3 V and 125 °C.
5.37 Delay values for the Benchmark circuits simulated at 0.3 V in IBM 65 nm technology.
5.38 Un-optimized energy values for benchmark circuits at 0.3 V in IBM 65 nm technology.
5.39 Optimized energy values for benchmark circuits at 0.3 V in IBM 65 nm technology.
5.40 Number of performance-enhanced cells inserted in benchmark circuits through CPM algorithm.
5.41 Un-optimized energy-delay product for benchmark circuits at 0.3 V.
5.42 Optimized energy-delay product for benchmark circuits at 0.3 V.
1 Delay, power and energy values for Gate-Gate standard cell library at 0.3 V and 25 °C.
2 Delay, power and energy values for Drain-Drain standard cell library at 0.3 V and 25 °C.
3 Delay, power and energy values for Supply-Ground standard cell library at 0.3 V and 25 °C.
4 Delay, power and energy values for charge-boosting standard cell library at 0.3 V and 25 °C.
List of Figures
1.1 Id vs. Vgs characteristics for IBM 65 nm technology at Vdd = 1 V.
1.2 Id vs. Vds characteristics for IBM 65 nm technology (a) Subthreshold (b) Superthreshold.
1.3 Inverter frequency characteristics for IBM 65 nm technology and Vdd = 0.1 V to 0.9 V.
1.4 Inverter power characteristics for IBM 65 nm technology and Vdd = 0.1 V to 0.9 V.
3.1 Inverter with various biasing schemes (a) Gate-Gate biasing (b) Drain-Drain biasing (c) Supply-Ground biasing.
3.2 Graphical representation of SNM for Gate-Gate biased inverter.
3.3 Graphical representation of SNM for Drain-Drain biased inverter.
3.4 Graphical representation of SNM for Supply-Ground biased inverter.
3.5 Frequency vs. Vdd of an inverter for IBM 65 nm technology and various biasing schemes (a) Gate-Gate biasing (b) Drain-Drain biasing (c) Supply-Ground biasing.
3.6 Power vs. Vdd of an inverter for IBM 65 nm technology and various biasing schemes (a) Gate-Gate biasing (b) Drain-Drain biasing (c) Supply-Ground biasing.
3.7 Buffer circuit designed to amplify an input signal of 0.3 V by a factor of 2.
3.8 Charge boosting buffer providing higher Vgs to an inverter with Vdd = 0.3 V.
3.9 Transient input-output characteristics of charge boosting buffer simulated in subthreshold for IBM 65 nm technology.
4.1 Network model of a CMOS circuit.
4.2 Predecessors of node A.
4.3 Successors of node A.
4.4 Optimization flow for implementing CPM on benchmark circuits.
5.1 Substrate biasing applied to a standard cell with Vdd = 0.3 V.
5.2 Charge boosting buffer providing higher Vgs to a standard cell with Vdd = 0.3 V.
5.3 Inverter delay characteristics with varying Vdd in IBM 65 nm technology.
5.4 Inverter energy characteristics with varying Vdd in IBM 65 nm technology.
5.5 Inverter energy-delay product with varying Vdd in IBM 65 nm technology.
5.6 AND02 delay characteristics with varying Vdd in IBM 65 nm technology.
5.7 AND02 energy characteristics with varying Vdd in IBM 65 nm technology.
5.8 AND02 energy-delay product with varying Vdd in IBM 65 nm technology.
5.9 AND03 delay characteristics with varying Vdd in IBM 65 nm technology.
5.10 AND03 energy characteristics with varying Vdd in IBM 65 nm technology.
5.11 AND03 energy-delay product with varying Vdd in IBM 65 nm technology.
5.12 AND04 delay characteristics with varying Vdd in IBM 65 nm technology.
5.13 AND04 energy characteristics with varying Vdd in IBM 65 nm technology.
5.14 AND04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.15 NAND02 delay characteristics with varying Vdd in IBM 65 nm technology.
5.16 NAND02 energy characteristics with varying Vdd in IBM 65 nm technology.
5.17 NAND02 energy-delay product with varying Vdd in IBM 65 nm technology.
5.18 NAND03 delay characteristics with varying Vdd in IBM 65 nm technology.
5.19 NAND03 energy characteristics with varying Vdd in IBM 65 nm technology.
5.20 NAND03 energy-delay product with varying Vdd in IBM 65 nm technology.
5.21 NAND04 delay characteristics with varying Vdd in IBM 65 nm technology.
5.22 NAND04 energy characteristics with varying Vdd in IBM 65 nm technology.
5.23 NAND04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.24 OR02 delay characteristics with varying Vdd in IBM 65 nm technology.
5.25 OR02 energy characteristics with varying Vdd in IBM 65 nm technology.
5.26 OR02 energy-delay product with varying Vdd in IBM 65 nm technology.
5.27 OR03 delay characteristics with varying Vdd in IBM 65 nm technology.
5.28 OR03 energy characteristics with varying Vdd in IBM 65 nm technology.
5.29 OR03 energy-delay product with varying Vdd in IBM 65 nm technology.
5.30 OR04 delay characteristics with varying Vdd in IBM 65 nm technology.
5.31 OR04 energy characteristics with varying Vdd in IBM 65 nm technology.
5.32 OR04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.33 NOR02 delay characteristics with varying Vdd in IBM 65 nm technology.
5.34 NOR02 energy characteristics with varying Vdd in IBM 65 nm technology.
5.35 NOR02 energy-delay product with varying Vdd in IBM 65 nm technology.
5.36 NOR03 delay characteristics with varying Vdd in IBM 65 nm technology.
5.37 NOR03 energy characteristics with varying Vdd in IBM 65 nm technology.
5.38 NOR03 energy-delay product with varying Vdd in IBM 65 nm technology.
5.39 NOR04 delay characteristics with varying Vdd in IBM 65 nm technology.
5.40 NOR04 energy characteristics with varying Vdd in IBM 65 nm technology.
5.41 NOR04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.42 XOR delay characteristics with varying Vdd in IBM 65 nm technology.
5.43 XOR energy characteristics with varying Vdd in IBM 65 nm technology.
5.44 XOR energy-delay product with varying Vdd in IBM 65 nm technology.
5.45 XNOR delay characteristics with varying Vdd in IBM 65 nm technology.
5.46 XNOR energy characteristics with varying Vdd in IBM 65 nm technology.
5.47 XNOR energy-delay product with varying Vdd in IBM 65 nm technology.
5.48 AO21 energy-delay product with varying Vdd in IBM 65 nm technology.
5.49 AOI21 energy-delay product with varying Vdd in IBM 65 nm technology.
5.50 AO22 energy-delay product with varying Vdd in IBM 65 nm technology.
5.51 AOI22 energy-delay product with varying Vdd in IBM 65 nm technology.
5.52 AO221 energy-delay product with varying Vdd in IBM 65 nm technology.
5.53 AOI221 energy-delay product with varying Vdd in IBM 65 nm technology.
5.54 AO32 energy-delay product with varying Vdd in IBM 65 nm technology.
5.55 AOI32 energy-delay product with varying Vdd in IBM 65 nm technology.
5.56 AO321 energy-delay product with varying Vdd in IBM 65 nm technology.
5.57 AOI321 energy-delay product with varying Vdd in IBM 65 nm technology.
5.58 OA21 energy-delay product with varying Vdd in IBM 65 nm technology.
5.59 OAI21 energy-delay product with varying Vdd in IBM 65 nm technology.
5.60 OA32 energy-delay product with varying Vdd in IBM 65 nm technology.
5.61 OAI32 energy-delay product with varying Vdd in IBM 65 nm technology.
5.62 NOR0211 energy-delay product with varying Vdd in IBM 65 nm technology.
Abstract
Digital subthreshold Complementary Metal-Oxide-Semiconductor (CMOS) circuits are
gaining importance because of their ability to serve as an ideal low-power solution. Sub-
threshold circuits can potentially replace superthreshold circuits in portable devices which
execute non-performance-critical tasks, thereby increasing the battery life. The drawback
of subthreshold circuits is their low operating speeds. By enhancing the speed of subthresh-
old circuits their application spectrum can be expanded.
Operating frequency is primarily dependent on the ON current (Ion) of the transistor.
Increasing Ion would improve the frequency of subthreshold circuits. Ion is dependent on
various parameters such as transistor threshold voltage (Vth), gate-source voltage (Vgs) and
supply voltage (Vdd). Ion can be increased either by boosting the Vgs or by lowering the
Vth of the MOS transistors through substrate biasing. This thesis presents a new approach
to substrate biasing and compares the results with two existing biasing techniques. A new
performance enhancement technique using charge boosting buffers to boost the Vgs of the
transistors is presented. A performance-enhanced subthreshold standard cell library was
built by implementing these techniques on a regular cell library for IBM 65 nm technol-
ogy. The performance-enhanced cell library when implemented on the ISCAS benchmark
circuits yielded a 10 times improvement in the frequency with approximately 2 times in-
crease in the energy-delay product (EDP). The optimization problem for minimizing the
overhead in the energy consumption without affecting the frequency is formulated as an
integer linear program (ILP). The optimization algorithm yielded a 50 % reduction in the
EDP.
1. Subthreshold Circuits
This chapter discusses the behavior of a MOS transistor and provides analytical expres-
sions for drain current and energy consumption in subthreshold. The different MOS tran-
sistor regions of operation and analytical expressions for subthreshold current and energy
consumption are presented in Section 1.1. Behavior of drain current with variation in sup-
ply voltage and gate voltage is explained in Section 1.2. Frequency and power charac-
teristics of an inverter operating in both subthreshold and superthreshold are presented in
Section 1.2. The key points discussed in this chapter are summarized in Section 1.3.
1.1 Introduction
A MOS transistor can operate in three regions, namely strong inversion, moderate inversion
and weak inversion. These regions of operation can be described as follows:
(a) the weak inversion region, also known as the subthreshold region, occurs when Vdd is less
than Vth; (b) as Vdd increases beyond Vth, the region of operation shifts to moderate
inversion; (c) the strong inversion region occurs when Vdd is sufficiently higher than Vth
and the substrate beneath the gate is strongly inverted.
Since this research focuses on the subthreshold region (the terms subthreshold region and
weak inversion region are used interchangeably in this document), the rest of this document
concentrates on this region of operation. In the weak inversion region of operation the
surface potential (φS) of the transistor falls between φF and 2φF, where φF is the Fermi
potential of extrinsic silicon [22]. Surface potential is defined as the total potential drop
from the surface to a neutral point in the bulk. The externally applied gate-body potential
(Vgb) is the sum of φS, the oxide potential (φox) and the sum of several contact potentials
(ψMS), as shown in Equation (1.1) [22].
$$V_{gb} = \phi_{ox} + \phi_S + \psi_{MS} \tag{1.1}$$
In subthreshold operation ON current is determined by the flow of charge through dif-
fusion. The drain current in subthreshold can be modeled as shown in Equation (1.2) [7].
$$I_D = \frac{W}{L_{eff}}\,\mu_{eff} C_{ox} (m-1) V_T^2 \exp\left(\frac{V_{gs}-V_{th}}{m V_T}\right)\left(1-\exp\left(\frac{-V_{ds}}{V_T}\right)\right) \tag{1.2}$$
where W is the width of the transistor, Leff is the effective length, µeff is the effective
mobility, m is the subthreshold slope, Vth is the transistor threshold voltage, and VT is the
thermal voltage, VT = kT/q.
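The exponential dependence of the drain current on Vgs in Equation (1.2) can be illustrated with a short numerical sketch. The parameter values below (Vth, m, W/Leff and the product µeff·Cox) are hypothetical placeholders chosen only for illustration; they are not IBM 65 nm values, and the function names are not part of this work.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage V_T = kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def subthreshold_current(vgs, vds, vth=0.4, m=1.5, w_over_l=2.0,
                         mu_cox=3.0e-4, temp_k=300.0):
    """Drain current from Equation (1.2).

    vth, m, w_over_l and mu_cox (mu_eff * C_ox, in A/V^2) are assumed
    illustrative values, not extracted device parameters.
    """
    v_t = thermal_voltage(temp_k)
    prefactor = w_over_l * mu_cox * (m - 1.0) * v_t ** 2
    gate_term = math.exp((vgs - vth) / (m * v_t))
    drain_term = 1.0 - math.exp(-vds / v_t)
    return prefactor * gate_term * drain_term


if __name__ == "__main__":
    for vgs in (0.1, 0.2, 0.3):
        i_d = subthreshold_current(vgs, vds=0.3)
        print(f"Vgs = {vgs:.1f} V -> Id = {i_d:.3e} A")
```

Under this model, raising Vgs by m·VT·ln(10) multiplies the current by ten, and for Vds of a few VT the drain term approaches one, so the current becomes nearly independent of Vds.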
Besides the subthreshold drain current, several leakage currents exist in subthreshold
that contribute to the total ON current. The key ones are the gate tunneling leakage
current and the gate-induced drain leakage (GIDL). Gate tunneling leakage current is caused
by the tunneling of carriers through the oxide layer, driven by the high electric fields
across it. As technology is scaled down, the oxide thickness is greatly reduced, resulting
in higher electric fields across the oxide layer and therefore higher gate tunneling leakage
current. However, gate tunneling leakage current can be considered negligible when compared
with the subthreshold drain current [2]. GIDL is a leakage current that appears under
particular Vgs conditions combined with high drain-source voltage (Vds) values. In
subthreshold operation GIDL can be considered negligible because Vds values are low. As the
drain current dominates over the other leakage currents, the current in the subthreshold
region can be equated to the subthreshold drain current.
Total energy (ET) in subthreshold is the summation of dynamic energy (EDYN) and
static energy (EL), as given by Equation (1.3).
$$E_T = E_{DYN} + E_L \tag{1.3}$$
Energy due to short circuit current can be considered negligible for subthreshold operation
[7]. Dynamic energy is the energy due to charging and discharging of load capacitances,
and is given by Equation (1.4) [2].
$$E_{DYN} = C_{eff} V_{dd}^2 \tag{1.4}$$
where Ceff is the averaged total switched capacitance, and Vdd is the supply voltage.
Dynamic energy holds a quadratic relation with Vdd, as seen from Equation (1.4). As the
Vdd decreases, dynamic energy reduces quadratically. Static energy EL is the energy due to
the leakage current, and is given by Equation (1.5) [2].
$$E_L = I_{leak} V_{dd} t_d = W_{eff} I_o \exp\left(\frac{-V_{th}}{m V_T}\right) V_{dd} t_d L_{DP} \tag{1.5}$$
where, Weff is the average total width, Io is the drain current when Vgs = Vth, Vth is the
transistor threshold voltage, m is the subthreshold slope, td is the delay of the circuit,
and LDP is the depth of critical path. Static energy is linearly dependent on the delay td,
as observed from Equation (1.5). Static energy is very high for a subthreshold operation
because of high delay. Hence static energy dominates over dynamic energy in subthreshold.
As the supply voltage increases, the delay reduces and dynamic energy would dominate
over static energy for superthreshold operation.
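As a minimal sketch of Equations (1.3) to (1.5), the functions below evaluate the dynamic, static and total energy for a given operating point. The function names and every numeric value in the usage example are assumptions made for illustration; none of them are measured values from this work.

```python
import math


def dynamic_energy(c_eff, vdd):
    """Equation (1.4): E_DYN = C_eff * Vdd^2."""
    return c_eff * vdd ** 2


def leakage_energy(w_eff, i_o, vth, m, v_t, vdd, t_d, l_dp):
    """Right-hand form of Equation (1.5)."""
    return w_eff * i_o * math.exp(-vth / (m * v_t)) * vdd * t_d * l_dp


def total_energy(c_eff, w_eff, i_o, vth, m, v_t, vdd, t_d, l_dp):
    """Equation (1.3): E_T = E_DYN + E_L."""
    return (dynamic_energy(c_eff, vdd)
            + leakage_energy(w_eff, i_o, vth, m, v_t, vdd, t_d, l_dp))


if __name__ == "__main__":
    # Hypothetical operating point, chosen only to exercise the formulas.
    params = dict(c_eff=1.0e-15, w_eff=10.0, i_o=1.0e-7, vth=0.4, m=1.5,
                  v_t=0.0259, vdd=0.3, t_d=30.0e-9, l_dp=20.0)
    print(f"E_T = {total_energy(**params):.3e} J")
```

The quadratic dependence of EDYN on Vdd and the linear dependence of EL on td can be checked directly: halving Vdd quarters the dynamic term, while doubling td doubles the static term.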
1.2 MOS Transistor Characteristics
The supply voltage and the current flowing through the transistors affect design parameters
such as power and frequency. The current-voltage (I-V) characteristics are thus important
in designing CMOS circuits. Channel current of a transistor is dependent on Vds,
Vgs, Vth and temperature.
(a) Id vs. Vgs
[Figure 1.1 plot: log(Id) in A (10^-11 to 10^-4) versus Vgs (V) (0.1 to 0.9).]
Figure 1.1: Id vs. Vgs characteristics for IBM 65 nm technology at Vdd = 1 V.
The behavior of Id with increasing Vgs is shown in Figure 1.1. Id behaves exponentially
in weak inversion region and holds a linear relationship in strong inversion region. Id vs
Vgs graph is used to extrapolate the threshold voltage of the MOSFET by looking at the
point where the graph deviates from its original exponential trajectory [2].
(b) Id vs. Vds
The behavior of Id with increasing Vds in the subthreshold and superthreshold regions is shown
in Figure 1.2 (a) and 1.2 (b), respectively. As observed from Figure 1.2, Id exhibits
an exponential behavior for low values of Vds, which is the subthreshold region, and then
behaves linearly for higher values of Vds. As Vds is increased further, the current flattens
and becomes roughly independent of Vds; this is called the saturation region.
[Figure 1.2 plots: Id (A) versus Vds (V) (0 to 1); panel (a) Id axis 0 to 4.5 x 10^-9, panel (b) Id axis 0 to 1.8 x 10^-5; legend entries Vgs = 0.2 V, 0.3 V, 0.7 V, 0.8 V and 0.9 V.]
Figure 1.2: Id vs. Vds characteristics for IBM 65 nm technology (a) Subthreshold (b)
Superthreshold.
(c) Dependence of Id on Temperature and Vth
Id is affected by Vth and temperature variations. Id increases exponentially with decreasing
Vth, as shown in Equation (1.2). Hence the circuit performance is higher with low
Vth transistors. Temperature has an impact on parameters such as carrier mobility, threshold
voltage and junction leakage, which vary the ON current in a MOS transistor. Carrier
mobility decreases with an increase in temperature. An approximate relation of carrier
mobility with temperature is shown in Equation (1.6) [23].
$$\mu(T) = \mu(T_r)\left(\frac{T}{T_r}\right)^{-k_\mu} \tag{1.6}$$
where T is the absolute temperature, Tr is the room temperature, and kµ is a fitting parameter
generally in the range of 1.2-2.0. Vth reduces linearly with increase in temperature and
can be approximated as shown in Equation (1.7) [23].
$$V_{th}(T) = V_{th}(T_r) - k_{vth}(T - T_r) \tag{1.7}$$
where kvth is a constant in the range of 0.5 to 3.0 mV/K. Junction leakage also increases
as the temperature is increased [23]. The overall effect of temperature on Id is different
for subthreshold and superthreshold operation. For subthreshold operation Id increases
with increasing temperature, and for superthreshold operation Id decreases with increase
in temperature [23]. Therefore, the circuit performance is best at high temperatures in
subthreshold, and worst at high temperatures for superthreshold operation. Superthreshold
circuits generally rely on additional cooling mechanisms such as heat sinks, water cooling,
and liquid nitrogen to improve performance; these are not required for subthreshold circuits.
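The temperature relations in Equations (1.6) and (1.7) can be sketched numerically as follows. The reference temperature of 300 K, the fitting values kµ = 1.5 and kvth = 1 mV/K (both inside the ranges quoted above), the reference Vth of 0.4 V and the slope m = 1.5 are all assumptions for illustration. The last function combines the two relations with the gate term of Equation (1.2); it omits the temperature-independent prefactor and assumes Vds is large enough that the drain term is close to one.

```python
import math

K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def mobility(temp_k, mu_ref=1.0, t_ref=300.0, k_mu=1.5):
    """Equation (1.6): mobility relative to its value at t_ref."""
    return mu_ref * (temp_k / t_ref) ** (-k_mu)


def threshold_voltage(temp_k, vth_ref=0.4, t_ref=300.0, k_vth=1.0e-3):
    """Equation (1.7): Vth in volts, with k_vth in V/K."""
    return vth_ref - k_vth * (temp_k - t_ref)


def relative_subthreshold_current(temp_k, vgs=0.3, m=1.5):
    """Temperature-dependent part of Equation (1.2), in arbitrary units."""
    v_t = K_BOLTZMANN * temp_k / Q_ELECTRON
    vth = threshold_voltage(temp_k)
    return mobility(temp_k) * v_t ** 2 * math.exp((vgs - vth) / (m * v_t))


if __name__ == "__main__":
    for temp_k in (300.0, 350.0, 400.0):
        print(f"T = {temp_k:.0f} K: mu = {mobility(temp_k):.3f}, "
              f"Vth = {threshold_voltage(temp_k):.3f} V, "
              f"Id (a.u.) = {relative_subthreshold_current(temp_k):.3e}")
```

With these assumed values the mobility falls as temperature rises, but the lower Vth and larger thermal voltage outweigh it in the exponential term, so the subthreshold current still increases with temperature.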
To understand the behavior of circuits in subthreshold and superthreshold, an inverter
is simulated using IBM 65 nm technology. Frequency and power characteristics of an
inverter operating in both subthreshold and superthreshold regions are shown in Figures
1.3 and 1.4, respectively. Power and frequency vary exponentially in the subthreshold region.
Power consumption of an inverter operating at 0.3 V was 3.3 pW compared to 46.3 pW
at 1.0 V supply. The power consumption in the subthreshold region is an order of magnitude
less than in strong inversion, because of the lower Vdd value in subthreshold operation.
The delay of an inverter operating at 0.3 V was 29.56 ns compared to 57.2 ps at 1.0 V.
The delay in the subthreshold region is nearly three orders of magnitude greater than in
strong inversion operation, because of the lower ON current in subthreshold operation.
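The ratios behind the order-of-magnitude statements above can be checked with a few lines of arithmetic. The sketch uses only the four values quoted in the text and takes the 57.2 ps delay to be the strong inversion (superthreshold) value; the variable names are illustrative.

```python
import math

# Values quoted in the text for the simulated inverter.
power_sub_w = 3.3e-12      # power at 0.3 V
power_super_w = 46.3e-12   # power at 1.0 V
delay_sub_s = 29.56e-9     # delay at 0.3 V
delay_super_s = 57.2e-12   # delay in strong inversion (assumed to be at 1.0 V)

power_ratio = power_super_w / power_sub_w
delay_ratio = delay_sub_s / delay_super_s

print(f"power ratio: {power_ratio:.1f}x "
      f"({math.log10(power_ratio):.2f} orders of magnitude)")
print(f"delay ratio: {delay_ratio:.1f}x "
      f"({math.log10(delay_ratio):.2f} orders of magnitude)")
```

The power ratio is about 14, slightly more than one order of magnitude, and the delay ratio is about 517, a little under three orders of magnitude.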
[Figure 1.3 plot: log(Frequency) in Hz (10^5 to 10^11) versus Vdd (V) (0.1 to 1).]
Figure 1.3: Inverter frequency characteristics for IBM 65 nm technology and Vdd = 0.1 V
to 0.9 V.
[Figure 1.4 plot: log(Power) in W (10^-13 to 10^-10) versus Vdd (V) (0.1 to 1).]
Figure 1.4: Inverter power characteristics for IBM 65 nm technology and Vdd = 0.1 V to
0.9 V.
1.3 Summary
The operation of a MOS transistor in subthreshold has been discussed. Equations for drain
current and total energy in subthreshold have been presented in this chapter. The exponen-
tial dependence of Id on Vdd and Vgs in subthreshold region has been shown. The variation
in Id with varying temperature is discussed. Frequency and power characteristics of an
inverter operating in both subthreshold and superthreshold have been shown.
2. Motivation and Supporting Work
This chapter presents the related work supporting this research and provides the research
objectives. The research work related to subthreshold design is presented in Section 2.1.
The motivation for the proposed research is explained in Section 2.2. The formulated
thesis objectives are presented in Section 2.3. The key points discussed are summarized
in Section 2.4.
2.1 Supporting Work
Digital subthreshold design is gaining importance, especially for applications where
leakage power dissipation is the primary design metric and speed is not a criterion. Leak-
age power dissipation in CMOS circuits is increasing exponentially as the technology is
being scaled down [23]. Subthreshold design which utilizes the leakage current to per-
form useful computations is evolving as an ideal low power solution [2]. Subthreshold
designs suffer from a drawback of low operating speeds when compared to superthreshold
designs [9]. Research related to reducing the performance gap between subthreshold and
superthreshold circuits is gaining momentum.
The operation of digital circuits in the subthreshold region was considered as early as
1970, in which theoretical limits on supply voltage scaling were derived for a CMOS
inverter [19]. This limit is important to determine the minimum operating frequency of
CMOS circuits. The minimum energy operation of a CMOS circuit occurs in the sub-
threshold region [1]. This suggests that subthreshold operation can serve as a low energy
solution in CMOS circuits. Energy minimization and transistor sizing for minimum en-
ergy operation were examined and analytical expressions for the minimum energy point
were derived [1][3][4]. The minimum energy point calculation is essential in designing
subthreshold circuits.
As the channel length reduces, several short channel effects (SCE) such as drain-induced
barrier lowering (DIBL) and electron/hole tunneling become important in CMOS circuits
[20]. These short channel effects have been examined in depth for subthreshold CMOS
operation, and an analytical model for channel current as a function of feature size has been
derived [20][21]. The effect of DIBL in subthreshold is lower compared to superthreshold
operation because of low drain voltages. Therefore, the need for high channel doping can
be eliminated in case of subthreshold circuits, which is otherwise required to overcome the
SCE in superthreshold circuits.
Subthreshold circuits are more sensitive to process, supply voltage and temperature vari-
ations compared to superthreshold circuits [8]. This higher sensitivity may cause sub-
threshold circuits to fail to function properly. To reduce the sensitivity to process, supply
voltage and temperature variations, newer logic families such as dual VT logic [8], variable
threshold voltage CMOS (VTCMOS) [18], and dynamic threshold voltage MOS (DTMOS)
[16] were proposed. Since DTMOS logic involves biasing the substrate with the input gate
signal, it can only be implemented in triple well technology. The increase in the process
complexity of DTMOS logic is compensated by its higher operating frequency.
The application spectrum of subthreshold circuits could be expanded by improving their
performance. Traditional logic families such as pseudo NMOS [15] and domino style [17]
have been implemented in subthreshold to achieve higher speeds. Pseudo NMOS offers
high operating speed in subthreshold region but is less desirable as it dissipates excess
static power when integrated in large scale. Dynamic circuits provide high speed operation
for subthreshold circuits but are less desirable due to additional overhead of charge keeper
transistors, which are needed to hold the value at dynamic output nodes. Dynamic output
nodes are highly susceptible to noise especially at low voltage levels, making them less
desirable for subthreshold operation.
A logic family based on body biasing has been realized in [16] to improve the subthresh-
old operating frequency. Several models based on body biasing have been suggested in
[11][16]. Models suggested in [11][16] have either the gate or drain of the transistors tied
to the substrate. The substrate voltage increases with the input gate voltage reducing the
Vth, thereby increasing the speed. The advantage of using substrate biasing in subthreshold
circuits compared to superthreshold circuits is that it does not require additional limiter
transistors due to low operating voltages. Limiter transistors are required in superthreshold
circuits to limit the body potential to be less than 0.7 V to prevent CMOS latchup.
A circuit design approach to enhance the performance of subthreshold circuits was sug-
gested in [9]. In [9], asynchronous micro-pipelining of levelized network of PLAs was
used. A method to increase the speed of subthreshold interconnects has been suggested
in [10][14]. In [10][14], the voltage of the global interconnects was boosted through addi-
tional boosting circuitry. Subthreshold circuit performance could be improved by providing
higher gate voltage, while maintaining the supply voltage at a constant level. This approach
of boosting the gate voltage of transistors to improve the circuit performance has not yet
been considered.
Boosting the gate voltage of each and every transistor in a circuit is not required to
enhance the performance of subthreshold circuits. Transistors along the critical path deter-
mine the speed of the circuit. Boosting the gate voltage of the transistors along the critical
path can change the critical path itself. Hence, an optimal solution for the placement of
boosting circuitry is required. An optimization solution for leakage power minimization
is discussed in [12]. In [12], the optimization problem is formulated as an integer linear
program (ILP). Delay optimization of CMOS circuits by transistor reordering is suggested
in [5]. In [5], the authors implemented a breadth-first search algorithm to determine the
order of transistors.
The analysis suggests that the performance of subthreshold circuits can be improved
either by substrate biasing or by charge boosting. Substrate biasing is biasing the body of
the MOS transistor. The Vth of the transistors can be lowered through substrate biasing,
which increases the ON current and thus improves the frequency of subthreshold circuits.
Charge boosting is boosting the Vgs of the transistors which leads to higher ON current.
The methods to improve the frequency of subthreshold circuits based on substrate biasing
and charge boosting are proposed in this research.
2.2 Motivation
The minimum energy operation of CMOS circuits occurs in subthreshold region of op-
eration. Therefore, subthreshold CMOS circuits can serve as an ideal low-energy solution.
However, subthreshold circuits suffer from a drawback of low operating speed. The ap-
plication spectrum of subthreshold circuits can be expanded by enhancing their frequency.
The frequency can be enhanced either by substrate biasing or by charge boosting. The
exponential dependence of the ON current on Vth in subthreshold makes substrate biasing
an effective method to improve the frequency of subthreshold circuits. Charge boosting
enhances the performance by increasing the Vgs of the transistors without causing a
large overhead in energy consumption, which makes it effective for subthreshold operation.
This motivation leads to the goal of enhancing the performance of subthreshold circuits.
The objectives of this thesis are discussed in the next section.
2.3 Thesis Objectives
The goal of this thesis is to design methods which enhance the performance of subthreshold
circuits. To achieve this goal the following objectives are formulated.
• Design of performance enhancement methods involving substrate biasing and charge
boosting.
• Design of standard cell libraries by implementing performance enhancement meth-
ods and characterization of the delay and energy variations with Vdd for standard
cells.
• Placement of standard cells for optimal delay and power through integer linear pro-
gramming (ILP) and implementation of standard cell library on the benchmark cir-
cuits.
2.4 Summary
The subthreshold circuit performance can be improved by increasing the ON current
flowing through the MOS transistors. The ON current can be increased either by substrate
biasing or charge boosting. Substrate biasing lowers the Vth of the transistors, thereby in-
creasing the ON current. Charge boosting enhances the frequency of subthreshold circuits
by boosting the Vgs of the transistors.
3. Performance Enhancement of Subthreshold Circuits
This chapter proposes two performance enhancement methods and presents analysis on
each method proposed. An overview of performance enhancement is discussed in Section
3.1. Section 3.2 presents two existing biasing techniques. A new approach to substrate bi-
asing is also discussed. A new performance enhancement technique using charge boosting
buffers is presented in Section 3.3. The key points discussed are summarized in Section
3.4.
3.1 Overview
The performance of a subthreshold circuit depends on the ON current flowing through
the channel of its MOS transistors. The ON current of a MOS transistor depends on Vth
and Vgs. The performance of subthreshold circuits can therefore be improved by reducing
the Vth and by increasing the Vgs. The two existing biasing methods and the new approach
to substrate biasing reduce the Vth of the CMOS devices. The proposed charge boosting
technique increases the Vgs of the transistors by using charge boosting buffers.
3.2 Substrate Biasing
Substrate biasing is providing a bias voltage to the body of a MOS transistor. By pro-
viding a positive voltage to the body of an NMOS transistor relative to the source, Vth can
be reduced. As Vth reduces, the ON current increases. Higher ON current results in faster
charging and discharging of the load capacitances, reducing the delay of the circuit and
thus improving the performance of subthreshold circuits. The threshold voltage of a four
terminal MOS transistor (Vth) is given by Equation (3.1) [22].
$$V_{th} = V_{th0} + \gamma\left(\sqrt{\phi_0 + V_{sb}} - \sqrt{\phi_0}\right) \tag{3.1}$$
where Vth0 is the threshold voltage with zero substrate bias, γ is the body effect
parameter, φ0 is the surface potential of the MOS transistor and Vsb is the source-to-body
bias voltage. As seen from Equation (3.1), the threshold voltage can be varied by varying
Vsb. Vth reduces when Vsb assumes negative values, and Vsb becomes negative when the
substrate of the device is forward biased. Thus, the Vth of the device can be reduced by
forward biasing the substrate of a MOS transistor.
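As a rough numerical illustration of Equation (3.1), the sketch below evaluates the threshold voltage shift for a few forward-bias values of Vsb. The values γ = 0.504 and φ0 = 0.975 are the approximate values quoted later in this chapter. The zero-bias threshold Vth0 = 0.4 V and the function name threshold_voltage are assumptions made only for this example.

```python
import math

GAMMA = 0.504  # body effect parameter (approximate value quoted in this chapter)
PHI0 = 0.975   # surface potential in V (approximate value quoted in this chapter)
VTH0 = 0.4     # zero-bias threshold voltage in V (assumed for illustration only)


def threshold_voltage(vsb, vth0=VTH0, gamma=GAMMA, phi0=PHI0):
    """Evaluate Equation (3.1) for a source-to-body voltage vsb in volts."""
    if phi0 + vsb < 0:
        raise ValueError("phi0 + vsb must be non-negative")
    return vth0 + gamma * (math.sqrt(phi0 + vsb) - math.sqrt(phi0))


# Negative Vsb corresponds to a forward-biased substrate and lowers Vth.
for vsb in (0.0, -0.1, -0.2, -0.3):
    shift_mv = (threshold_voltage(vsb) - VTH0) * 1e3
    print(f"Vsb = {vsb:+.1f} V -> Vth shift = {shift_mv:+.1f} mV")
```

The shift depends only on γ, φ0 and Vsb, so the assumed Vth0 does not affect the printed values.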
Figure 3.1: Inverter with various biasing schemes (a) Gate-Gate biasing (b) Drain-Drain
biasing (c) Supply-Ground biasing.
[Plot: Vout (V) versus Vin (V), both axes from 0 to 0.35 V. Annotated levels:
VOH = 0.294 V, VIH = 0.19 V, VIL = 0.12 V, VOL = 0.02193 V.]
Figure 3.2: Graphical representation of SNM for Gate-Gate biased inverter.
[Plot: Vout (V) versus Vin (V), both axes from 0 to 0.35 V. Annotated levels:
VOH = 0.2808 V, VIH = 0.2 V, VIL = 0.1 V, VOL = 0.0448 V.]
Figure 3.3: Graphical representation of SNM for Drain-Drain biased inverter.
[Plot: Vout (V) versus Vin (V), both axes from 0 to 0.35 V. Annotated levels:
VOH = 0.2738 V, VIH = 0.21 V, VIL = 0.1 V, VOL = 0.0449 V.]
Figure 3.4: Graphical representation of SNM for Supply-Ground biased inverter.
A method to improve the subthreshold circuit performance has been designed based on
substrate biasing. The two existing biasing methods and a new approach to substrate biasing
were applied to a CMOS inverter, and are shown in Figure 3.1 (a), (b) and (c), respectively.
The corresponding static noise margins (SNM) for the three biased inverters are shown in
Figures 3.2, 3.3 and 3.4, respectively. The noise margin high is calculated as VOH - VIH
and the noise margin low as VIL - VOL. The noise margin high and noise margin low are
0.104 V and 0.098 V for the circuit shown in Figure 3.1(a), 0.0808 V and 0.055 V for the
circuit in Figure 3.1(b), and 0.064 V and 0.055 V for the circuit shown in Figure 3.1(c).
The biasing mechanism shown in Figure 3.1(a) is termed Gate-Gate biasing [16], in which
the substrates of the PMOS and NMOS are biased using a connection between the respective
gates and substrates. The biasing mechanism shown in Figure 3.1(b) is termed Drain-Drain
biasing [11], in which the substrates are biased using a connection between the respective
drains and substrates. The proposed biasing mechanism shown in Figure 3.1(c) is termed
Supply-Ground biasing, in which the substrate of the NMOS is biased with the supply
voltage (Vdd) and the substrate of the PMOS is biased with ground.
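The noise margin arithmetic can be checked with a short sketch. The voltage levels are the values annotated on the plots of Figures 3.2 to 3.4. The assignment of each annotated value to VOH, VIH, VIL or VOL is an assumption based on those annotations, and the names LEVELS and noise_margins are introduced only for this example.

```python
# Voltage levels in volts, read from the annotations of Figures 3.2 to 3.4.
# The mapping of each value to VOH/VIH/VIL/VOL is an assumption.
LEVELS = {
    "Gate-Gate":     {"VOH": 0.294,  "VIH": 0.19, "VIL": 0.12, "VOL": 0.02193},
    "Drain-Drain":   {"VOH": 0.2808, "VIH": 0.20, "VIL": 0.10, "VOL": 0.0448},
    "Supply-Ground": {"VOH": 0.2738, "VIH": 0.21, "VIL": 0.10, "VOL": 0.0449},
}


def noise_margins(levels):
    """Return (NMH, NML) = (VOH - VIH, VIL - VOL) in volts."""
    return levels["VOH"] - levels["VIH"], levels["VIL"] - levels["VOL"]


for scheme, levels in LEVELS.items():
    nmh, nml = noise_margins(levels)
    print(f"{scheme:14s} NMH = {nmh:.4f} V, NML = {nml:.4f} V")
```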
[Plot: frequency in Hz (logarithmic axis, 10^5 to 10^10) versus Vdd (V), 0.1 to 0.45 V,
with curves for Gate-Gate, Drain-Drain and Supply-Ground biasing.]
Figure 3.5: Frequency vs. Vdd of an inverter for IBM 65 nm technology and various biasing
schemes (a) Gate-Gate biasing (b) Drain-Drain biasing (c) Supply-Ground biasing.
The circuits shown in Figure 3.1 were simulated in subthreshold for IBM 65 nm technology,
and the corresponding frequency and power characteristics with varying supply voltage are
plotted in Figures 3.5 and 3.6. Frequency and power increase exponentially with supply
voltage, as observed from Figures 3.5 and 3.6. It can also be observed that frequency and
power values are higher for the proposed Supply-Ground biasing compared to the existing
methods (Gate-Gate and Drain-Drain biasing). This is because both NMOS and PMOS are
biased at all times in Supply-Ground biasing, which is not the case with Gate-Gate and
Drain-Drain biasing. In Gate-Gate biasing either the NMOS or the PMOS is biased, depending
on an input logic level of '1' or '0', respectively. In Drain-Drain biasing either the NMOS
or the PMOS is biased, depending on an output logic level of '1' or '0', respectively.
Further, it can be observed from Figures 3.5 and 3.6 that frequency and power values are
higher for Gate-Gate biasing than for Drain-Drain biasing. This behavior is due to the
higher ON current in the case of Gate-Gate biasing. In the case of Gate-Gate biasing, since
[Plot: power in W (logarithmic axis, 10^-13 to 10^-7) versus Vdd (V), 0.1 to 0.45 V,
with curves for Gate-Gate, Drain-Drain and Supply-Ground biasing.]
Figure 3.6: Power vs. Vdd of an inverter for IBM 65 nm technology and various biasing
schemes (a) Gate-Gate biasing (b) Drain-Drain biasing (c) Supply-Ground biasing.
the input voltage is a step signal, the transistors are biased instantaneously when the input
signal is applied. In the case of Drain-Drain biasing, the output voltage changes gradually
and takes a time equal to the delay of the circuit to make a transition from one logic
level to another. Thus the applied substrate bias changes gradually from 0 to Vsbtd over the
interval from 0 to td seconds (where td is the delay of the gate and Vsbtd is the substrate
bias voltage at time td). The substrate bias voltage for Drain-Drain biasing as a function
of time can be modelled as a linear ramp, as shown in Equation (3.2).
$$V_{sb}(t) = V_{sbtd}\left(\frac{t}{t_d}\right) \tag{3.2}$$
where, Vsbtd is the substrate bias voltage at time t = td. Substituting the value of Vsb(t) in
Equation (3.1) results in variation of threshold voltage for Drain-Drain biasing as a function
of time, shown in Equation (3.3).
$$V_{th}(t) = V_{th0} + \gamma\left(\sqrt{\phi_0 + V_{sbtd}\left(\frac{t}{t_d}\right)} - \sqrt{\phi_0}\right) \tag{3.3}$$
The expression for ON current in subthreshold as given in Equation (2.2) can be simplified
as shown below in Equation (3.4).
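$$I_{ON} = A_1 \exp(A_2 - A_3 V_{th}) \tag{3.4}$$
where
$$A_1 = \frac{W}{L_{eff}}\,\mu_{eff} C_{ox} (m - 1) V_T^2 \left(1 - \exp\left(\frac{-V_{ds}}{V_T}\right)\right)$$
$$A_2 = \frac{V_{gs}}{m V_T}$$
$$A_3 = \frac{1}{m V_T}$$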
Substituting the expression for Vth(t) from Equation (3.3) into Equation (3.4) gives the
expression for ION as a function of time as shown in Equation (3.5).
$$I_{ON} = A_1 \exp\left(A_2 - A_3\left(V_{th0} + \gamma\left(\sqrt{\phi_0 + V_{sbtd}\left(\frac{t}{t_d}\right)} - \sqrt{\phi_0}\right)\right)\right) \tag{3.5}$$
Equation (3.5) can be simplified as shown in Equation (3.6).
$$I_{ON} = A_1 \exp\left(A_2 - A_4 - \gamma A_3 \sqrt{\phi_0 + V_{sbtd}\left(\frac{t}{t_d}\right)}\right) \tag{3.6}$$
where
$$A_4 = A_3 V_{th0} - \gamma A_3 \sqrt{\phi_0}$$
The average ON current can be obtained by integrating Equation (3.6) over time from 0 to
td seconds and dividing by td, as shown in Equations (3.7) and (3.8).
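$$I_{avg} = \frac{1}{t_d} \int_0^{t_d} I_{ON}\, dt \tag{3.7}$$
$$I_{avg} = \frac{2 A_1 \exp(A_2 - A_4)}{V_{sbtd}\,\gamma^2 A_3^2} \left[ \left(\gamma A_3 \sqrt{\phi_0 + V_{sbtd}} + 1\right) \exp\left(-\gamma A_3 \sqrt{\phi_0 + V_{sbtd}}\right) - \left(\gamma A_3 \sqrt{\phi_0} + 1\right) \exp\left(-\gamma A_3 \sqrt{\phi_0}\right) \right] \tag{3.8}$$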
The expression for ON current in Gate-Gate biasing is shown in Equation (3.9).
$$I_{\text{Gate-Gate}} = A_1 \exp\left(A_2 - A_4 - \gamma A_3 \sqrt{\phi_0 + V_{sbtd}}\right) \tag{3.9}$$
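Iavg can be written as shown below
$$I_{avg} = I_1 - I_2 \tag{3.10}$$
where
$$I_1 = I_{\text{Gate-Gate}}\,\frac{2\left(\gamma A_3 \sqrt{\phi_0 + V_{sbtd}} + 1\right)}{V_{sbtd}\,\gamma^2 A_3^2} \tag{3.11}$$
$$I_2 = I_{\text{Gate-Gate}}\,\frac{2\left(\gamma A_3 \sqrt{\phi_0} + 1\right) \exp\left(-\gamma A_3 \sqrt{\phi_0}\right)}{V_{sbtd}\,\gamma^2 A_3^2 \exp\left(-\gamma A_3 \sqrt{\phi_0 + V_{sbtd}}\right)} \tag{3.12}$$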
To calculate the values of I1 and I2 we need γ, φ0, Vsbtd and A3. The value of A3 can
be calculated as shown below.
$$A_3 = \frac{1}{m V_T} = \frac{1}{60 \times 10^{-3} \times 26 \times 10^{-3}} = 641.025$$
Substituting the approximate values γ = 0.504 [22], φ0 = 0.975 [22], Vsbtd = -Vdd = -0.3 V
and A3 = 641.025 in Equations (3.11) and (3.12) and calculating the values of I1 and I2,
we get
$$I_1 = 0.0375\, I_{\text{Gate-Gate}}$$
$$I_2 \approx 0$$
$$I_{avg} = I_1 - I_2 = 0.0375\, I_{\text{Gate-Gate}} \tag{3.13}$$
It can be observed from Equation (3.13) that the ON current in the case of Gate-Gate biasing
is approximately 26 times the average ON current in the case of Drain-Drain biasing
(1/0.0375 ≈ 26.7). This difference in the ON current affects the delay and power of the
standard cells. The delay and power values for AND02 with Gate-Gate and Drain-Drain biasing
are shown in Table 3.1. The delay in the case of Gate-Gate biasing is 75.28 ns, which is
considerably less than the 125.9 ns obtained with Drain-Drain biasing. The power consumption
in the case of Gate-Gate biasing is 1.467 nW, which is more than four times the 0.337 nW
consumed with Drain-Drain biasing.
Table 3.1: Delay and power values for AND02 with Vdd = 0.3 V.
Biasing        Delay       Power
Gate-Gate      75.28 ns    1.467 nW
Drain-Drain    125.9 ns    0.337 nW
In CMOS circuits, the substrate of the NMOS is typically tied to ground and the substrate
of the PMOS is tied to Vdd. This is done to avoid the possibility of CMOS latchup, which
typically occurs when the body potential is greater than 0.7 V. Substrate biasing in
superthreshold circuits can therefore cause CMOS latchup. However, due to the low operating
voltages in subthreshold operation, substrate biasing does not cause latchup. Further, in
the case of Drain-Drain biasing, output noise can cause some variation in delay and power
values compared to Gate-Gate and Supply-Ground biasing, but it has no effect on the
logical functionality of the device.
3.3 Charge Boosting
A new performance enhancement technique, namely charge boosting, improves the sub-
threshold circuit performance by increasing the Vgs of the transistors using charge boosting
buffers. The boosted Vgs provided to the transistors would increase the ON current. The
ON current of a transistor is exponentially dependent on Vgs in subthreshold region, as seen
in Figure 1.1. Therefore, a slight increase in Vgs would cause an exponential increase in
ON current, which causes the load capacitors to charge and discharge in short time. Hence
the delay of the circuit is reduced and the performance is enhanced.
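The exponential sensitivity to Vgs can be made concrete with a minimal sketch based on Equation (3.4), in which the ON current is proportional to exp(Vgs / (m VT)) when Vth and Vds are held fixed. The slope factor m = 1.5 is an illustrative assumption and not a value taken from this thesis; the thermal voltage of 26 mV and the function name on_current_ratio are likewise chosen only for this example. The relation holds only while the transistor remains in the subthreshold region.

```python
import math

V_T = 0.026  # thermal voltage in V near room temperature
M = 1.5      # subthreshold slope factor (assumed; not a value from this thesis)


def on_current_ratio(delta_vgs, m=M, v_t=V_T):
    """Ratio I_ON(Vgs + delta_vgs) / I_ON(Vgs) with Vth and Vds held fixed."""
    return math.exp(delta_vgs / (m * v_t))


for delta in (0.05, 0.10, 0.20):
    ratio = on_current_ratio(delta)
    print(f"dVgs = {delta:.2f} V -> ON current grows by a factor of {ratio:.1f}")
```

Doubling the boost squares the ratio, which is the sense in which a small increase in Vgs yields an exponential increase in ON current.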
The Vgs of the transistors can be increased by the use of charge boosting buffers. The
charge boosting buffer circuit which is designed to increase the Vgs is shown in Figure
3.7. The charge boosting buffer circuit is designed to amplify a step signal from 0 V to
0.3 V into a step signal from -0.1 V to 0.5 V, with -0.1 V as the voltage of sink and 0.5
V as the supply voltage of the buffer circuit. These buffers can be integrated into normal
standard cells to form a standard cell library with higher operating frequency. An inverter
integrated with a buffer is shown in Figure 3.8. The simulation of the buffer circuit with
output characteristics is shown in Figure 3.9.
Figure 3.7: Buffer circuit designed to amplify an input signal of 0.3 V by a factor of 2.
Figure 3.8: Charge boosting buffer providing higher Vgs to an inverter with Vdd = 0.3 V.
Figure 3.9: Transient input-output characteristics of charge boosting buffer simulated in
subthreshold for IBM 65 nm technology.
3.4 Summary
Two existing biasing methods and a new approach to substrate biasing have been pre-
sented, namely Gate-Gate biasing, Drain-Drain biasing and Supply-Ground biasing respec-
tively. A new performance enhancement technique using charge boosting buffers has been
proposed. An analytical expression comparing the ON current for the case of Gate-Gate
and Drain-Drain biasing has been derived. The equation derived indicates that ON current
in case of Gate-Gate biasing is 26 times that of the ON current in case of Drain-Drain
biasing. This higher ON current leads to lower delay and higher energy consumption in
case of Gate-Gate biasing when compared to Drain-Drain biasing. Each performance en-
hancement method comes with a cost of higher energy consumption. Hence, it is essential
to optimize the placement of the performance enhanced standard cells so as to achieve the
best performance with minimal additional overhead of energy consumption. An optimiza-
tion algorithm required to place these standard cells is presented in the next chapter.
4. Cell Placement Optimization for Minimizing Energy Consumption
This chapter presents an algorithm and a methodology to optimize the placement of stan-
dard cells for optimal delay and energy. An overview of optimization is discussed in Sec-
tion 4.1. An algorithm useful to find an optimal solution for placement of cells in CMOS
circuits is presented in Section 4.2. An optimization flow for implementing the optimiza-
tion algorithm is presented in Section 4.3. The key points discussed are summarized in the
summary section.
4.1 Overview
As discussed in Chapter 3, each design methodology improves the performance of the
subthreshold circuit by increasing the ON current flowing through the transistor. However,
due to an increase in the ON current the power consumption increases. Hence, it is es-
sential to optimize the placement of the high performance cells so as to achieve the best
performance with minimal additional cost of power consumption. The performance of a
large circuit is determined by the path with the longest delay, the critical path. Placing the
high performance cells along the critical path would change the original critical path in the
modified circuit. Hence an algorithm and methodology is required for the placement of the
cells to achieve the best performance with constraints of having the least additional power
consumption, and the original critical path remaining unchanged. The methodology and
the algorithm are discussed in the next section.
4.2 Optimization Algorithm
CMOS circuits can be represented in the form of a network. Network models can be
used as an aid in solving several optimization problems. The algorithm that is used to solve
the optimization problem is called Critical Path Method (CPM) [24]. CPM can be used to
determine the critical path of the circuit, and also to determine how long each activity can
be delayed without affecting the total performance of the circuit. An activity is defined as
the transition of inputs into outputs for any given standard cell in the circuit. A complete list
of all the activities that comprise the operation of the circuit is required to apply CPM. To
illustrate the algorithm, an example network is considered, shown in Figure 4.1. To apply
CPM, the network has to be a directed acyclic graph; acyclic means that there are no
feedback loops in the circuit.
Figure 4.1: Network model of a CMOS circuit.
Each node in the network shown in Figure 4.1 represents a signal and each arc in the
network represents a standard cell. The time taken for a signal to travel from one node
to the next adjacent node is given by the delay of the standard cell that the arc between
the nodes represents. The arcs between the start node and the input nodes are represented
by dummy cells with a delay of 0 seconds. Similarly, the arcs between the output nodes and
the end node are represented by dummy cells, as shown in Figure 4.1. Each node in the
network has a set of predecessors and successors. Predecessors of any node X are defined
as the adjacent nodes that are required to produce node X. The set of predecessors of node
A is shown in Figure 4.2. Successors of any node X are defined as the neighboring nodes
that depend on node X for their production. The successors of node A are shown in
Figure 4.3.
Figure 4.2: Predecessors of node A.
Figure 4.3: Successors of node A.
A list of all the nodes, their successors and predecessors, along with duration of all
activities or arcs, is required to apply CPM. A table representing the list of nodes, their
successors and predecessors is shown in Table 4.1 and the list of all arcs and their duration
is shown in Table 4.2.
The two key building blocks in CPM are the concepts of early event time (ET) and late
event time (LT) for a node. The early event time for a node i, represented by ET(i), is
defined as the earliest time at which the node i can be produced. The late event time for
a node i, represented by LT(i), is defined as the latest time at which the node i can be
produced without increasing the overall delay of the circuit. The early event time for any
node X in a circuit is equal to the total maximum delay
Table 4.1: List of all nodes, their successors and predecessors.
Node     Successors        Predecessors
Start    IN1, IN2, IN3     -
IN1      A                 Start
IN2      A, B              Start
IN3      B                 Start
A        OUT1, OUT2        IN1, IN2
B        OUT2              IN2, IN3
OUT1     END               A
OUT2     END               A, B
END      -                 OUT1, OUT2
Table 4.2: List of all arcs, corresponding standard cells and their delays.
Arc             Standard cell    Delay
Start → IN1     Dummy            0 ns
Start → IN2     Dummy            0 ns
Start → IN3     Dummy            0 ns
IN1 → A         AND02            177.1 ns
IN2 → A         AND02            177.1 ns
IN2 → B         XOR02            83.47 ns
IN3 → B         XOR02            83.47 ns
A → OUT1        INV02            29.56 ns
A → OUT2        OR02             208.2 ns
B → OUT2        OR02             208.2 ns
OUT1 → END      Dummy            0 ns
OUT2 → END      Dummy            0 ns
required to produce the signal at that node. The early event time for the output nodes will
be equal to the delay of the circuit.
4.2.1 Computation of Early Event Time
The computation of early event time for each node begins with the start node. Represent
the start node as node 1 and assign numbers to each node. Early event time for node 1
is assumed to be 0, ET(1) = 0. Then compute ET(2), ET(3), etc. and stop when ET(end
node) is calculated. To compute ET(i), the early event times of all the predecessors of i
are required. Thus, ET is computed in a particular order, from the start node to the end
node. Referring to the network shown in Figure 4.1 and the predecessors of node A shown in
Figure 4.2, the early event time of node A is calculated as shown below [24].
$$ET(A) = \max\left\{ ET(IN1) + \text{delay of AND gate},\; ET(IN2) + \text{delay of AND gate} \right\}$$
From the above example it is clear that computation of ET(i) requires the knowledge of
ET(1), ET(2), ....., ET(i-1). Computation of early event time for any general node i can be
summarized as follows:
STEP 1: Find all the predecessors of node i.
STEP 2: To the ET for each predecessor of node i add the delay of the gate or arc
connecting the predecessor to node i.
STEP 3: ET(i) equals the maximum of the sums computed in Step 2.
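The three steps can be sketched as a forward pass over the example network. The arc delays are those listed in Table 4.2; the dictionary layout, the node ordering list and the function name early_event_times are assumptions made for this illustration.

```python
# Arc delays in ns, taken from Table 4.2: (from_node, to_node) -> delay.
ARCS = {
    ("Start", "IN1"): 0.0, ("Start", "IN2"): 0.0, ("Start", "IN3"): 0.0,
    ("IN1", "A"): 177.1, ("IN2", "A"): 177.1,
    ("IN2", "B"): 83.47, ("IN3", "B"): 83.47,
    ("A", "OUT1"): 29.56, ("A", "OUT2"): 208.2, ("B", "OUT2"): 208.2,
    ("OUT1", "END"): 0.0, ("OUT2", "END"): 0.0,
}
# Nodes ordered so that every predecessor appears before its successors.
ORDER = ["Start", "IN1", "IN2", "IN3", "A", "B", "OUT1", "OUT2", "END"]


def early_event_times(arcs, order):
    """Forward pass: ET(i) = max over predecessors p of ET(p) + delay(p, i)."""
    et = {order[0]: 0.0}
    for node in order[1:]:
        # Steps 1 and 2: add the arc delay to the ET of every predecessor.
        sums = [et[p] + d for (p, succ), d in arcs.items() if succ == node]
        # Step 3: the early event time is the largest of these sums.
        et[node] = max(sums)
    return et


ET = early_event_times(ARCS, ORDER)
for node in ORDER:
    print(f"ET({node}) = {ET[node]:.2f} ns")
```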
4.2.2 Computation of Late Event Time
The computation of late event time for each node begins with the end node. To compute
the latest event time for each node, work backwards in descending order until LT(1) is
reached. Assume that LT(end) is equal to ET(end). Referring to the network shown in
Figure 4.1 and the successors of node A shown in Figure 4.3, the latest event time of
node A is calculated as shown below [24].
$$LT(A) = \min\left\{ LT(OUT1) - \text{delay of INVERTER},\; LT(OUT2) - \text{delay of OR gate} \right\}$$
Computation of latest event time for any general node i can be summarized as follows:
STEP 1: Find all the successors of node i.
STEP 2: From the LT of each successor of node i, subtract the delay of the gate or arc
connecting node i to that successor.
STEP 3: LT(i) is the smallest of the differences determined in step 2.
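A corresponding backward pass can be sketched in the same way. The arc delays are those of Table 4.2 and LT(end) is set to the circuit delay of 385.3 ns; the data layout and the function name late_event_times are assumptions made for this illustration.

```python
# Arc delays in ns, taken from Table 4.2: (from_node, to_node) -> delay.
ARCS = {
    ("Start", "IN1"): 0.0, ("Start", "IN2"): 0.0, ("Start", "IN3"): 0.0,
    ("IN1", "A"): 177.1, ("IN2", "A"): 177.1,
    ("IN2", "B"): 83.47, ("IN3", "B"): 83.47,
    ("A", "OUT1"): 29.56, ("A", "OUT2"): 208.2, ("B", "OUT2"): 208.2,
    ("OUT1", "END"): 0.0, ("OUT2", "END"): 0.0,
}
ORDER = ["Start", "IN1", "IN2", "IN3", "A", "B", "OUT1", "OUT2", "END"]
ET_END = 385.3  # early event time of the end node in ns, used as LT(end)


def late_event_times(arcs, order, lt_end):
    """Backward pass: LT(i) = min over successors s of LT(s) - delay(i, s)."""
    lt = {order[-1]: lt_end}
    for node in reversed(order[:-1]):
        # Steps 1 and 2: subtract the arc delay from the LT of every successor.
        diffs = [lt[s] - d for (pred, s), d in arcs.items() if pred == node]
        # Step 3: the late event time is the smallest of these differences.
        lt[node] = min(diffs)
    return lt


LT = late_event_times(ARCS, ORDER, ET_END)
for node in ORDER:
    print(f"LT({node}) = {LT[node]:.2f} ns")
```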
The early event time and latest event time computed for all the nodes in the network are
shown in Table 4.3.
Table 4.3: Early event time and latest event time for all nodes.
Node     Early event time    Latest event time
Start    0 ns                0 ns
IN1      0 ns                0 ns
IN2      0 ns                0 ns
IN3      0 ns                93.63 ns
A        177.1 ns            177.1 ns
B        83.47 ns            177.1 ns
OUT1     206.66 ns           385.3 ns
OUT2     385.3 ns            385.3 ns
END      385.3 ns            385.3 ns
4.2.3 Total Float
For any arc joining the nodes i and j, the total float, represented by TF(i,j), of the
standard cell or arc (i,j) is the amount of time by which the delay of the standard cell can
be extended without affecting the circuit performance and also without affecting the critical
path. The total float of the standard cells along the critical path is 0. The total float
for any arc (i,j) can be computed as shown in Equation (4.1).
$$TF(i, j) = LT(j) - ET(i) - t_{ij} \tag{4.1}$$
where tij is the delay of the standard cell represented by the arc (i,j).
The value of total float computed for all the arcs is shown in Table 4.4.
Since the arcs with total float equal to 0 fall on the critical path, by joining all such arcs
the critical path can be formed. The critical paths formed by joining such arcs from Table
4.4 are Start → IN1 → A → OUT2 → END and Start → IN2 → A → OUT2 → END.
From Table 4.4 it is clear that the arcs IN2 → B and IN3 → B, which are represented by XOR02,
Table 4.4: List of all arcs and their respective total float.
Arc             Total float
Start → IN1     0 ns
Start → IN2     0 ns
Start → IN3     93.63 ns
IN1 → A         0 ns
IN2 → A         0 ns
IN2 → B         93.63 ns
IN3 → B         93.63 ns
A → OUT1        178.64 ns
A → OUT2        0 ns
B → OUT2        93.63 ns
OUT1 → END      178.64 ns
OUT2 → END      0 ns
can be delayed by 93.63 ns without affecting the overall performance and the critical path.
Similarly, the arc A → OUT1 represented by INV02 can be delayed by 178.64 ns. Hence,
replacing the XOR02 and INV02 cells with modified cells of the same functionality that
have lower power consumption and higher delay, within the limits set by the total float,
will not affect the performance of the circuit.
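Equation (4.1) and the identification of critical arcs can be sketched as follows, using the arc delays of Table 4.2 and the event times of Table 4.3. The data layout, the rounding to two decimals and the function name total_float are assumptions made for this illustration.

```python
# Arc delays in ns (Table 4.2) and event times in ns (Table 4.3).
ARCS = {
    ("Start", "IN1"): 0.0, ("Start", "IN2"): 0.0, ("Start", "IN3"): 0.0,
    ("IN1", "A"): 177.1, ("IN2", "A"): 177.1,
    ("IN2", "B"): 83.47, ("IN3", "B"): 83.47,
    ("A", "OUT1"): 29.56, ("A", "OUT2"): 208.2, ("B", "OUT2"): 208.2,
    ("OUT1", "END"): 0.0, ("OUT2", "END"): 0.0,
}
ET = {"Start": 0.0, "IN1": 0.0, "IN2": 0.0, "IN3": 0.0, "A": 177.1,
      "B": 83.47, "OUT1": 206.66, "OUT2": 385.3, "END": 385.3}
LT = {"Start": 0.0, "IN1": 0.0, "IN2": 0.0, "IN3": 93.63, "A": 177.1,
      "B": 177.1, "OUT1": 385.3, "OUT2": 385.3, "END": 385.3}


def total_float(arcs, et, lt):
    """TF(i, j) = LT(j) - ET(i) - t_ij, rounded to hide floating-point noise."""
    return {(i, j): round(lt[j] - et[i] - t, 2) for (i, j), t in arcs.items()}


TF = total_float(ARCS, ET, LT)
# Arcs with zero total float lie on a critical path.
critical_arcs = [arc for arc, tf in TF.items() if tf == 0.0]

for (i, j), tf in TF.items():
    print(f"{i} -> {j}: total float = {tf:.2f} ns")
print("Critical arcs:", critical_arcs)
```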
4.3 Optimization Flow for Implementing CPM
As discussed in Section 4.1, the purpose of optimization is to find an optimal solution for
placement of high performance cells in the circuit with constraints of having the least addi-
tional power consumption, and the original critical path remaining unchanged. To achieve
this, first replace all the standard cells in the circuit with high performance cells. Apply
the CPM algorithm to determine total float of each cell present in the circuit. If the total
float of any particular high performance cell is greater than the difference of the delay of a
standard cell and its corresponding high performance cell, then replace that particular high
performance cell with a normal cell. Thus, by replacing all possible high performance cells
with normal cells, the power consumption is minimized and best performance is achieved.
A flow chart representing the methodology is shown in Figure 4.4.
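The replacement test in this flow can be sketched as a simple comparison per cell. The instance names and the numbers in EXAMPLE are hypothetical and are not results from this thesis; the function name cells_to_revert is likewise an assumption. Cells on the same path share float, so a practical implementation would probably re-run CPM after each replacement, which this sketch does not do.

```python
def cells_to_revert(cells):
    """Return the enhanced cells that can be swapped back to regular cells.

    cells maps a cell name to (total_float, regular_delay, enhanced_delay),
    with all three values in ns.
    """
    revert = []
    for name, (total_float, regular_delay, enhanced_delay) in cells.items():
        # Revert only if the float covers the delay penalty of the regular cell.
        if total_float > regular_delay - enhanced_delay:
            revert.append(name)
    return revert


# Hypothetical data for illustration only; not results from this thesis.
EXAMPLE = {
    "U1_INV02": (60.0, 29.56, 13.14),  # float exceeds the penalty: revert
    "U2_AND02": (0.0, 177.1, 75.28),   # zero float, critical path: keep
    "U3_XOR02": (20.0, 83.47, 40.0),   # float below the penalty: keep
}
print(cells_to_revert(EXAMPLE))
```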
4.4 Summary
A CPM algorithm has been discussed which is used to find the critical path of the circuit.
CPM is also used to determine how long the delay of each cell can be extended without
affecting the circuit performance and critical path. An optimization flow for implementing
the CPM algorithm to determine the placement of the performance-enhanced standard cells
has been presented. Placement of these high-performance cells is such that total power con-
sumption is minimized while the critical path remains unchanged and the best performance
is achieved.
[Flow chart: (1) Start from the transistor level SPICE netlist of a benchmark circuit.
(2) Replace all the standard cells in the netlist with performance-enhanced cells.
(3) Apply CPM and compute total float for each performance-enhanced cell in the netlist.
(4) For each performance-enhanced cell, check whether total float > (delay of normal cell −
delay of performance-enhanced cell). YES: replace the performance-enhanced cell with a
corresponding regular cell. NO: no modification is required.
(5) Output the SPICE netlist of the benchmark circuit with optimal cell placement.]
Figure 4.4: Optimization flow for implementing CPM on benchmark circuits.
34
5. Results and Analysis
This chapter presents the results, analysis, delay and energy characteristics obtained from
the simulations on each standard cell. The simulation setup is explained in Section 5.1. The
results obtained from the simulations for each standard cell are presented in Section 5.2.
The delay and energy variations are characterized and the analysis is presented for each
standard cell library in Section 5.2. The summary of the performance-enhanced standard
cell library is presented in Section 5.2. The implementation of the performance-enhanced
cell library designed on the benchmark circuits and the effectiveness of the optimization
algorithm are presented in Section 5.3.
5.1 Simulation Setup
An IBM 65 nm technology file is used to perform all the simulations for this thesis. The
transient analysis for the standard cells is performed using HSPICE. PERL scripts are used
to automate the simulation runs for standard cells with Vdd ranging from 0.2 V to 0.4 V. To
account for process variations, the simulations are performed for the performance corners
(FF, SS), the transistor mismatch corners (FS, SF) and the nominal corner (TT). To account
for temperature variations, the simulations are performed for a worst case temperature of
125 °C, a best case temperature of 0 °C, and a nominal temperature of 25 °C. The designed
standard cell library is implemented on the ISCAS 85 benchmark circuits; these benchmark
circuits are chosen because they form a network of directed acyclic graphs [6]. The general
setups used for the simulations on substrate-biased and charge-boosted standard cells are
shown in Figure 5.1 and Figure 5.2, respectively.
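The size of the resulting simulation matrix can be estimated with a short sketch. The runs in this thesis are automated with PERL scripts; the Python below only enumerates the combinations. The 0.05 V step for Vdd and the assumption that every corner is simulated at every temperature are not stated in the text and are chosen here for illustration.

```python
from itertools import product

CORNERS = ["TT", "FF", "SS", "FS", "SF"]  # nominal, performance, mismatch
TEMPERATURES_C = [0, 25, 125]             # best case, nominal, worst case
# Supply voltages from 0.2 V to 0.4 V; the 0.05 V step is an assumption.
VDD_VALUES = [round(0.2 + 0.05 * k, 2) for k in range(5)]

# Full-factorial sweep (assumed): one run per corner, temperature and Vdd.
RUNS = [
    {"corner": corner, "temp_c": temp, "vdd": vdd}
    for corner, temp, vdd in product(CORNERS, TEMPERATURES_C, VDD_VALUES)
]
print(len(RUNS), "simulation runs per standard cell")
```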
[Block diagram: a substrate biasing block applies Vsb to a standard cell supplied from
0.3 V, with input Vin and output Vout.]
Figure 5.1: Substrate biasing applied to a standard cell with Vdd = 0.3 V.
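[Block diagram: a charge boosting buffer, supplied between 0.5 V and −0.1 V, drives a
standard cell supplied from 0.3 V; Vin enters the buffer and Vout is taken from the
standard cell.]
Figure 5.2: Charge boosting buffer providing higher Vgs to a standard cell with Vdd = 0.3 V.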
5.2 Performance Enhanced Standard Cell Library
The delay and energy characteristics along with the analysis for each standard cell are
presented in this section. A performance-enhanced standard cell library is built by
implementing a performance enhancement method on a regular standard cell library. For
example, Drain-Drain biasing applied to a regular standard cell library results in a
Drain-Drain standard cell library. The results obtained by implementing the four performance
enhancement methods on standard cells such as the Inverter, AND, NAND, OR, NOR,
AND-OR, OR-AND, XOR and XNOR gates are discussed below.
5.2.1 Inverter
The regular inverter cell has a delay of 29.56 ns and an energy consumption of 0.08 fJ
at 0.3 V. The performance enhancement methods discussed earlier increase the ON current of
the transistors either by substrate biasing or by charge boosting. This leads to faster
charging and discharging of the load capacitances, reducing the delay. The higher ON current
also leads to higher energy consumption. Thus the delay of a performance-enhanced inverter is
lower and its energy consumption is higher compared to the regular inverter cell.
The delay and energy values for the regular inverter cell and the performance-enhanced
inverters operating at a 0.3 V supply are shown in Table 5.1. The differences in delay and
energy behavior among Gate-Gate biasing, Drain-Drain biasing, Supply-Ground biasing and
charge boosting are analyzed below.
Table 5.1: Delay and energy values of an inverter at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.956e-08 7.927e-17
Gate-Gate 1.314e-08 9.966e-15
Drain-Drain 1.394e-08 6.475e-15
Supply-Ground 8.740e-09 2.960e-14
Charge boosted 7.062e-09 1.320e-14
(a) Delay
The delay is lowest for charge boosting, followed by Supply-Ground biasing, Gate-Gate
biasing and Drain-Drain biasing, as observed from Table 5.1. The reason for the lower
delay of charge boosting compared to substrate biasing is its higher Ion.
Ion is exponentially related to Vgs - Vth. Hence an increase in Vgs and an equal decrease
in Vth change Ion by the same factor. The increase in Vgs in the case of charge boosting
for a 0.3 V Vdd is 0.2 V. The reduction in Vth due to substrate biasing can be calculated
from Equation (3.1) as shown in Equation (5.1).
\[ \Delta V_{th} = \gamma\left(\sqrt{\phi_0 + V_{sb}} - \sqrt{\phi_0}\right) = 0.504\left(\sqrt{0.975 - 0.3} - \sqrt{0.975}\right) \approx -0.083\ \mathrm{V} \tag{5.1} \]
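Equation (5.1) can be checked numerically with a short Python sketch. The constants are the ones appearing in the equation; the function name delta_vth is hypothetical. Evaluated as written, with Vsb = −0.3 V, the expression is negative, which corresponds to a reduction in Vth of roughly 0.08 V.

```python
import math

GAMMA = 0.504   # body-effect coefficient used in Equation (5.1)
PHI_0 = 0.975   # potential term phi_0 used in Equation (5.1), in V


def delta_vth(v_sb, gamma=GAMMA, phi_0=PHI_0):
    """Threshold-voltage shift given by the body-effect expression of Equation (5.1)."""
    return gamma * (math.sqrt(phi_0 + v_sb) - math.sqrt(phi_0))


shift = delta_vth(-0.3)   # Vsb = -0.3 V for Vdd = 0.3 V
delta_vgs = 0.2           # boost in Vgs quoted for charge boosting at Vdd = 0.3 V
print(f"delta Vth = {shift:.4f} V, delta Vgs = {delta_vgs:.1f} V")
```

The magnitude of the shift stays well below the 0.2 V gate boost, which is the comparison the argument relies on.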
Since ∆Vgs (0.2 V) is larger than the magnitude of ∆Vth (0.083 V), Ion in the case of charge
boosting is higher than with substrate biasing, leading to higher performance. The delay with
charge boosting is approximately 4 times smaller than that of the regular inverter due to the
0.2 V boost in Vgs, as observed from Table 5.1. The maximum boosted voltage at which a charge
boosting buffer with a 0.3 V Vdd remains functional is 0.29 V. With a boosted voltage of
0.29 V, however, any noise in the input signal results in a functional error of the charge
boosting buffer. Thus, to maintain a noise margin of at least 0.1 V for a 0.3 V Vdd, the
charge boosting buffer was designed to have a boosted voltage of 0.2 V. The boosted voltage
can be expressed as 0.66 ∗ Vdd. The boosted voltage decreases as the technology node scales
down: since Vth decreases with technology scaling, the operating Vdd for subthreshold
operation decreases, resulting in a lower boosted voltage.
The delay in the case of Supply-Ground biasing is the lowest among the three biasing methods.
This is because with Supply-Ground biasing the substrates of the PMOS and NMOS are
biased at all times and do not change dynamically, as they do with the other two methods.
With Gate-Gate biasing and Drain-Drain biasing the substrates of the PMOS and NMOS
are biased by the respective input and output transitions. Gate-Gate biasing has a lower delay
than Drain-Drain biasing because of its approximately 26 times higher Ion, as shown
in Equation (3.13). The delay variation with Vdd for the four performance enhancement methods
applied to an inverter is shown in Figure 5.3. The delay of all the performance
enhancement methods decreases with increasing Vdd: as Vdd increases, Ion increases and the
delay reduces. Further, the reduction in delay with increasing Vdd is exponential in nature
because of the exponential dependence of Ion on Vgs and Vth, as observed from Figure 5.3.
[Plot: log10 (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.3: Inverter delay characteristics with varying Vdd in IBM 65 nm technology.
(b) Energy
The energy due to leakage is the main component of energy consumption in subthreshold.
The dependence of leakage energy on Vth, Vdd and td from Equation (1.5) is shown in
Equation (5.2).
\[ E_L \propto e^{-V_{th}} \, V_{dd} \, t_d \tag{5.2} \]
Vth reduces with substrate biasing, and the energy increases exponentially with the
reduction in Vth. ∆Vth is highest for Supply-Ground biasing, followed by Gate-Gate
biasing and Drain-Drain biasing as discussed earlier, due to which the energy consumption
is lowest for Drain-Drain biasing, followed by Gate-Gate biasing and Supply-Ground
biasing. The variation of energy with varying Vdd is shown in Figure 5.4. The energy
increases exponentially with increasing Vdd in the case of substrate biasing.
As Vdd increases the substrate bias voltage Vsb assumes more negative values. This leads to
a decrease in Vth. As Vth reduces the energy consumption increases, as shown in
Equation (5.2). The energy variation in the case of charge boosting is different from substrate
biasing. With charge boosting, Vth remains the same as in the regular inverter cell. Hence
the energy consumption depends on Vdd and on the energy consumed by the buffer. As the
supply voltage increases, the increase in energy consumption is not exponential, unlike
the case with substrate biasing (Equation (5.2)). Due to this linear dependence on Vdd,
charge boosting consumes less energy than substrate biasing at higher values of
Vdd. Drain-Drain biasing has the lowest energy among the four methods up to 360 mV,
and charge boosting has the lowest energy for Vdd greater than 360 mV.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.4: Inverter energy characteristics with varying Vdd in IBM 65 nm technology.
(c) Energy-Delay Product
The energy-delay product is calculated as the product of the energy and the delay, so it
varies linearly with a variation in either the energy or the delay of the
performance-enhanced inverter. Drain-Drain biasing has the lowest energy-delay product among
the three substrate biasing methods because its lower energy dominates over its higher
delay. The energy-delay product of charge boosting is lower at higher Vdd values because of
its lower energy consumption at higher Vdd, as discussed earlier. Drain-Drain biasing has the
lowest energy-delay product up to 300 mV and charge boosting has the lowest energy-delay
product for Vdd greater than 300 mV, as shown in Figure 5.5.
The energy-delay product for all the substrate biasing techniques increases exponentially
with Vdd because the increase in energy dominates over the decrease in delay, as shown
in Figure 5.5. The energy-delay product of charge boosting decreases with increasing Vdd
because the exponential decrease in delay dominates over the linear increase in energy.
However, at a Vdd of 0.3 V the energy-delay product curve deviates from its original
trajectory. As Vdd increases the delay reduces exponentially. The energy consumption of the
charge boosting inverter is given by Equation (5.3)
\[ E_{cbb} = E_{buffer} + E_{inverter} \tag{5.3} \]
where Ebuffer is the energy consumed by the buffer and Einverter is the energy consumed
by the inverter. As Vdd increases Einverter increases linearly. Ebuffer depends on the supply
voltage of the buffer, which is 1.66 ∗ Vdd. As Vdd increases the supply voltage of the buffer
increases at the rate of 1.66 times Vdd. This causes the energy to increase more rapidly at
higher supply voltages than at lower supply voltages. Ecbb increases with an increase
in Vdd. The change in the energy value with an increase in Vdd from 0.26 V to 0.28 V is
given by Equation (5.4).
\[ \Delta E_{0.26-0.28} = E_{0.28} - E_{0.26} = 2.4 \times 10^{-15}\ \mathrm{J} \tag{5.4} \]
where E0.28 and E0.26 are the values of Ecbb at Vdd = 0.28 V and 0.26 V, respectively.
The change in the energy value with an increase in Vdd from 0.28 V to 0.3 V is given by
Equation (5.5).
\[ \Delta E_{0.28-0.3} = E_{0.3} - E_{0.28} = 6.22 \times 10^{-15}\ \mathrm{J} \tag{5.5} \]
where E0.3 is the value of Ecbb at Vdd = 0.3 V. ∆E0.28−0.3 is larger than ∆E0.26−0.28, as
observed from Equations (5.4) and (5.5). This larger increase in energy causes the
energy-delay product at Vdd = 0.3 V to deviate from the original trajectory. As Vdd increases
beyond 0.3 V the supply voltage of the buffer increases at a rate of 1.66 times Vdd, which
causes the energy-delay product to decrease. Further, the variation in energy-delay product
for Vdd greater than 0.3 V is not as smooth as with substrate biasing, because the
energy-delay product of charge boosting is not exponentially related to Vdd and depends on
Ebuffer, Einverter and the delay. For Vdd greater than 0.38 V the energy-delay product starts
to increase. This is because the region of operation starts to shift from weak inversion to
moderate inversion. In moderate inversion Ion is no longer exponentially dependent on Vgs,
leading to only moderate delay savings with an exponential overhead in energy.
Each of the four performance enhancement methods presented increases the performance
and also the energy consumption. A design choice among these four performance enhancement
methods for an inverter can be made depending on the user requirements. If minimum
delay is the requirement, then charge boosting is the design choice as it has the
least delay among the performance enhancement methods. With minimum
energy or minimum energy-delay product as the requirement, Drain-Drain biasing is
the design choice. Apart from enhancing the performance, substrate biasing also increases
the robustness of a standard cell to process variations.
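The design choice described above can be illustrated with a small Python sketch that picks the method minimizing a given metric. It uses only the 0.3 V values of Table 5.1, so it does not capture the crossovers at other supply voltages; the names METRICS and best_method are hypothetical.

```python
# Delay (s) and energy (J) of the inverter at Vdd = 0.3 V, copied from Table 5.1.
TABLE_5_1 = {
    "Gate-Gate":      (1.314e-08, 9.966e-15),
    "Drain-Drain":    (1.394e-08, 6.475e-15),
    "Supply-Ground":  (8.740e-09, 2.960e-14),
    "Charge boosted": (7.062e-09, 1.320e-14),
}

METRICS = {
    "delay": lambda delay, energy: delay,
    "energy": lambda delay, energy: energy,
    "edp": lambda delay, energy: delay * energy,   # energy-delay product in J-s
}


def best_method(table, metric):
    """Return the name of the method that minimizes the requested metric."""
    score = METRICS[metric]
    return min(table, key=lambda name: score(*table[name]))


choices = {metric: best_method(TABLE_5_1, metric) for metric in METRICS}
print(choices)
```

At 0.3 V this selects charge boosting for minimum delay and Drain-Drain biasing for minimum energy and minimum energy-delay product, in agreement with the text.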
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.5: Inverter energy-delay product with varying Vdd in IBM 65 nm technology.
(d) Process Variations
To determine the effectiveness of the substrate biasing methods with respect to process
variations, the Drain-Drain inverter cell was simulated across the four process corners,
namely FF, SS, FS and SF. The FF and SS performance corners are set to produce a 3σ
variation in ring oscillator delay. The N-to-P mismatch corners FS and SF are set to have a
3σ mismatch in ∆L and Vth. The corner models are present in the IBM 65 nm technology file and
are created by changing several BSIM4 model parameters from their nominal values. These
parameters are primarily those that control the device, such as L, W, Tox, Vth, mobility,
series resistance and capacitance.
The delay values for the Drain-Drain inverter at 0.3 V compared with the regular inverter
cell across the FS, SF, FF and SS process corners are shown in Table 5.2. The delay variation
for the regular inverter cell from the FF to the SS corner is 51.62 ns, compared with
18.83 ns for the Drain-Drain inverter across the same process corners. This corresponds to
63.53 % less variation for the Drain-Drain inverter. Similarly, from the FS to the SF corner
74.55 % less variation in delay is observed for the Drain-Drain inverter compared to the
regular inverter cell. This lower variation in delay suggests that the substrate biasing
method increases the robustness of the cell.
Table 5.2: Delay values for inverter at 0.3 V for IBM 65 nm technology across FF, SS, FS
and SF corners.
Methodology FF SS FS SF
Regular 11.69 ns 63.31 ns 27.82 ns 38.98 ns
Drain-Drain 6.072 ns 24.90 ns 12.63 ns 15.47 ns
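The variation figures can be recomputed directly from the corner delays of Table 5.2, as in the following Python sketch. The helper names spread and variation_reduction are hypothetical, and the "Regular" key denotes the unbiased inverter row of the table. Recomputing from the tabulated delays gives about 63.5 % less variation from FF to SS and about 74.5 % less from FS to SF.

```python
# Inverter delays in ns at Vdd = 0.3 V, copied from Table 5.2.
DELAY_NS = {
    "Regular":     {"FF": 11.69, "SS": 63.31, "FS": 27.82, "SF": 38.98},
    "Drain-Drain": {"FF": 6.072, "SS": 24.90, "FS": 12.63, "SF": 15.47},
}


def spread(cell, fast, slow):
    """Delay difference in ns between two corners of one cell."""
    return DELAY_NS[cell][slow] - DELAY_NS[cell][fast]


def variation_reduction(fast, slow):
    """Percentage by which Drain-Drain biasing shrinks the corner-to-corner spread."""
    regular = spread("Regular", fast, slow)
    biased = spread("Drain-Drain", fast, slow)
    return 100.0 * (1.0 - biased / regular)


for fast, slow in (("FF", "SS"), ("FS", "SF")):
    print(f"{fast}->{slow}: {variation_reduction(fast, slow):.2f} % less delay variation")
```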
5.2.2 AND
This section presents the results and analysis obtained by implementing performance
enhancement methods on AND02, AND03 and AND04 cells. The delay and energy variations
with varying Vdd are characterized.
AND02
The regular AND02 cell has a delay of 177.1 ns and an energy of 0.42 fJ at 0.3 V. The
delay value of a performance-enhanced AND02 cell is lower and energy consumption is
higher when compared to a regular AND02 cell. The reason for this is the higher Ion and is
similar to the case of an inverter, explained earlier. The delay and energy values for regular
and performance-enhanced AND02 cell are shown in Table 5.3.
Table 5.3: Delay and energy values for AND02 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 1.771e-07 4.239e-16
Gate-Gate 7.528e-08 6.254e-14
Drain-Drain 1.259e-07 2.017e-14
Supply-Ground 5.824e-08 1.674e-13
Charge boosting 3.761e-08 4.647e-14
The delay is lowest for charge boosting and the energy is lowest for Drain-Drain biasing,
as observed from Table 5.3. Compared to a regular AND02 cell, an approximately 5 times
reduction in delay is observed with charge boosting and a 47 times increase in energy
consumption is observed with Drain-Drain biasing. The reason for this is similar to the
case of an inverter. The variation of delay, energy and energy-delay product of AND02
behaves similarly to that of an inverter, as shown in Figures 5.6, 5.7 and 5.8,
respectively. However, for an inverter with Vdd greater than 0.34 V charge boosting had
lower energy than Drain-Drain biasing, which is not the case with AND02. This is because
the number of inputs is higher for AND02 than for an inverter, so more buffers are used,
which leads to higher energy consumption. Although the energy consumption of the
charge-boosted AND02 is high, the delay gap between charge boosting and Drain-Drain biasing
is larger, leading to a lower energy-delay product. Drain-Drain biasing has the lowest
energy-delay product up to 260 mV. However, for Vdd greater than 280 mV the energy-delay
product is lowest for charge boosting, as shown in Figure 5.8. Depending on the user
requirements of minimum delay, energy or energy-delay product, a design choice can be made
and is summarized in Section 5.2.10.
The energy-delay product curve at 0.3 V deviates from the original trajectory. However,
unlike the inverter, the energy-delay product at Vdd = 0.3 V does not shoot up.
This is because even though ∆E0.28−0.3 is greater than ∆E0.26−0.28, the delay savings
due to charge boosting are higher for AND02 than for the inverter. The additional
savings in delay prevent the energy-delay product from shooting up. The rest of the behavior
of the energy-delay product curve is similar to that of the inverter.
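The AND02 trade-off at 0.3 V can be quantified from Table 5.3 with a short Python sketch. The helper compare is hypothetical; it reports the speedup and energy ratio of each method relative to the regular cell, along with the energy-delay product. At this supply voltage the charge-boosted cell has a lower energy-delay product than the Drain-Drain cell despite its higher energy.

```python
# Delay (s) and energy (J) of AND02 at Vdd = 0.3 V, copied from Table 5.3.
AND02 = {
    "Regular":         (1.771e-07, 4.239e-16),
    "Gate-Gate":       (7.528e-08, 6.254e-14),
    "Drain-Drain":     (1.259e-07, 2.017e-14),
    "Supply-Ground":   (5.824e-08, 1.674e-13),
    "Charge boosting": (3.761e-08, 4.647e-14),
}


def compare(table, name, baseline="Regular"):
    """Return (speedup, energy ratio, EDP in J-s) of one method versus the baseline cell."""
    base_delay, base_energy = table[baseline]
    delay, energy = table[name]
    return base_delay / delay, energy / base_energy, delay * energy


for name in AND02:
    speedup, energy_ratio, edp = compare(AND02, name)
    print(f"{name:16s} speedup {speedup:5.2f}x  energy {energy_ratio:7.1f}x  EDP {edp:.3e} J-s")
```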
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.6: AND02 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.7: AND02 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.8: AND02 energy-delay product with varying Vdd in IBM 65 nm technology.
AND03 and AND04
The regular AND03 cell has a delay of 304.7 ns and an energy of 0.6 fJ at 0.3 V. The
regular AND04 cell has a delay of 393.2 ns and energy of 1.009 fJ at 0.3 V. The delay and
energy values for regular and high performance AND03 cell and AND04 cell are shown in
Table 5.4 and 5.5, respectively.
Table 5.4: Delay and energy values for AND03 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.047e-07 6.035e-16
Gate-Gate 1.255e-07 1.636e-13
Drain-Drain 2.455e-07 2.914e-14
Supply-Ground 9.737e-08 4.121e-13
Charge boosting 4.490e-08 1.363e-13
Table 5.5: Delay and energy values for AND04 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.932e-07 1.009e-15
Gate-Gate 1.609e-07 4.104e-13
Drain-Drain 3.288e-07 5.156e-14
Supply-Ground 1.275e-07 9.875e-13
Charge boosting 4.699e-08 3.448e-13
The delay is least in case of charge boosting and energy is least in case of Drain-Drain
biasing for both AND03 and AND04, similar to AND02, shown in Table 5.4 and Table
5.5. Approximately 7 times reduction in delay in case of charge boosted AND03 and 8
times reduction in delay in case of charge boosted AND04 is observed compared to the
regular AND03 and AND04 cells. Approximately 50 times increase in energy in case of
Drain-Drain AND03 and 51 times increase in energy in case of Drain-Drain AND04 is
observed when compared to the regular AND03 and AND04 cells. The behavior in case
of AND03 and AND04 is similar to that of AND02 and the variation of delay, energy and
energy-delay product with Vdd for AND03 and AND04 is shown in Figure 5.9, 5.10, 5.11,
5.12, 5.13 and 5.14, respectively.
The behavior of the energy-delay product is similar to the case of an inverter. The
difference is a shift in the voltage at which the energy-delay product shoots up.
For AND03 the energy-delay product shoots up at 0.32 V and for AND04 at 0.36 V, compared
to 0.3 V for the inverter. This is because the savings in delay increase as the cell size
increases. As the savings increase, the energy-delay product continues to reduce beyond
0.3 V for AND03 and AND04. However, when Vdd is higher than 0.3 V the increase
in energy consumption in AND03 and AND04 due to their large size dominates over the
savings in delay. This causes the energy-delay product to shoot up at a higher Vdd compared
to the inverter.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.9: AND03 delay characteristics with varying Vdd in IBM 65 nm technology.
Summary for AND Cells
The delay, energy and energy-delay product variations with Vdd for AND02, AND03 and
AND04 have been shown. The delay gap between charge boosting and the other performance
enhancement methods is larger for AND04 than for AND03 and AND02, as shown in
Figures 5.12, 5.9 and 5.6, respectively. This is because the delay with charge
boosting is lower than with substrate biasing for every single transistor, and as the size
of the cell increases the per-transistor effects accumulate in the total delay.
Similarly, the energy gap between Drain-Drain biasing and the other performance enhancement
methods is larger for AND04 than for AND03 and AND02.
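The growth of the delay gap with cell size can be seen at the 0.3 V operating point from Tables 5.3, 5.4 and 5.5, as in the following Python sketch. It covers a single supply voltage only, and the helper names are hypothetical. Supply-Ground biasing is used as the reference because it is the fastest substrate biasing method in all three tables.

```python
# (regular, charge boosting, Supply-Ground) delays in s at Vdd = 0.3 V,
# copied from Tables 5.3, 5.4 and 5.5.
AND_DELAYS = {
    "AND02": (1.771e-07, 3.761e-08, 5.824e-08),
    "AND03": (3.047e-07, 4.490e-08, 9.737e-08),
    "AND04": (3.932e-07, 4.699e-08, 1.275e-07),
}


def charge_boost_speedup(cell):
    """Delay of the regular cell divided by the delay of the charge-boosted cell."""
    regular, boosted, _ = AND_DELAYS[cell]
    return regular / boosted


def gap_to_supply_ground(cell):
    """Delay of the Supply-Ground cell divided by the delay of the charge-boosted cell."""
    _, boosted, supply_ground = AND_DELAYS[cell]
    return supply_ground / boosted


for cell in AND_DELAYS:
    print(f"{cell}: speedup {charge_boost_speedup(cell):.2f}x, "
          f"gap to Supply-Ground {gap_to_supply_ground(cell):.2f}x")
```

Both ratios increase monotonically from AND02 to AND04 for the tabulated values.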
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.10: AND03 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.11: AND03 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.12: AND04 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.13: AND04 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.14: AND04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.3 NAND
This section presents the results and analysis obtained by implementing performance
enhancement methods on NAND02, NAND03 and NAND04 cells. The delay and energy
variations with varying Vdd are characterized.
NAND02
The regular NAND02 cell has a delay of 82.93 ns and an energy of 0.2 fJ at 0.3 V. The
delay of a performance-enhanced NAND02 cell is lower and its energy consumption is
higher than that of the regular NAND02 cell. The reason for this is the higher Ion, as in
the case of the inverter explained earlier. The delay and energy values for the regular
and performance-enhanced NAND02 cells are shown in Table 5.6.
Table 5.6: Delay and energy values for NAND02 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 8.293e-08 2.056e-16
Gate-Gate 3.902e-08 3.971e-14
Drain-Drain 6.278e-08 6.884e-15
Supply-Ground 2.170e-08 1.060e-13
Charge boosting 8.807e-09 4.398e-14
The delay value is least in case of charge boosting and energy is least in case of Drain-
Drain biasing as observed from Table 5.6. Approximately 10 times reduction in delay is
observed in case of charge boosting and 33 times increase in the energy consumption is
observed in case of Drain-Drain biasing when compared with regular NAND02 cell. The
reason for this is similar to the case of an inverter. A similar behavior in the variation
of delay, energy and energy-delay product is observed in case of NAND02 as that of an
AND02, shown in Figure 5.15, 5.16 and 5.17, respectively.
The behavior of the energy-delay product of NAND02 is similar to that of the cells
discussed earlier. The energy-delay product shoots up at Vdd = 0.36 V. The NAND02 cell
has a stack of transistors in its pull-down network which resists leakage, whereas
in the AND cell the presence of an additional inverter is responsible for a larger
leakage compared to NAND cells. Since the energy consumption is lower, the savings in delay
dominate and the energy-delay product continues to reduce beyond Vdd = 0.3 V. As Vdd
increases to 0.36 V the energy consumption increases, which causes the shift observed in
Figure 5.17.
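The lower energy of the NAND cell relative to the AND cell can be checked against Tables 5.3 and 5.6 at 0.3 V, as in the following Python sketch. The tables report total energy rather than leakage alone, so this only illustrates the direction of the difference; the helper and_to_nand_ratio is hypothetical.

```python
# Energy in J at Vdd = 0.3 V, copied from Table 5.3 (AND02) and Table 5.6 (NAND02).
AND02_ENERGY = {
    "Regular": 4.239e-16, "Gate-Gate": 6.254e-14, "Drain-Drain": 2.017e-14,
    "Supply-Ground": 1.674e-13, "Charge boosting": 4.647e-14,
}
NAND02_ENERGY = {
    "Regular": 2.056e-16, "Gate-Gate": 3.971e-14, "Drain-Drain": 6.884e-15,
    "Supply-Ground": 1.060e-13, "Charge boosting": 4.398e-14,
}


def and_to_nand_ratio(method):
    """Energy of the AND02 cell divided by the energy of the NAND02 cell for one method."""
    return AND02_ENERGY[method] / NAND02_ENERGY[method]


for method in AND02_ENERGY:
    print(f"{method:16s} AND02 / NAND02 energy = {and_to_nand_ratio(method):.2f}")
```

For every method in the two tables the AND02 cell consumes more energy than the NAND02 cell, with the regular AND02 at roughly twice the regular NAND02.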
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.15: NAND02 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.16: NAND02 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.17: NAND02 energy-delay product with varying Vdd in IBM 65 nm technology.
NAND03 and NAND04
The regular NAND03 cell has a delay of 139.9 ns and an energy of 0.36 fJ at 0.3 V. The
regular NAND04 cell has a delay of 199.4 ns and energy of 0.63 fJ at 0.3 V. The delay and
energy values for regular and high performance NAND03 cell and NAND04 cell are shown
in Table 5.7 and 5.8, respectively.
Table 5.7: Delay and energy values for NAND03 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 1.399e-07 3.698e-16
Gate-Gate 6.882e-08 1.189e-13
Drain-Drain 1.235e-07 7.463e-15
Supply-Ground 3.946e-08 2.980e-13
Charge boosting 1.035e-08 1.300e-13
Table 5.8: Delay and energy values for NAND04 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 1.994e-07 6.305e-16
Gate-Gate 1.045e-07 3.167e-13
Drain-Drain 1.877e-07 8.072e-15
Supply-Ground 6.395e-08 7.601e-13
Charge boosting 1.200e-08 3.469e-13
The delay is lowest for charge boosting and the energy is lowest for Drain-Drain
biasing for both NAND03 and NAND04, similar to NAND02, as shown in Table 5.7 and Table
5.8, respectively. Approximately 14 times reduction in delay for the charge-boosted
NAND03 and 17 times reduction in delay for the charge-boosted NAND04 are observed
compared to the regular NAND03 and NAND04 cells. Approximately 20 times increase
in energy for the Drain-Drain NAND03 and 13 times increase in energy for the
Drain-Drain NAND04 are observed compared to the regular NAND03 and NAND04 cells. The
behavior of NAND03 and NAND04 is similar to that of NAND02, and the variation
of delay, energy and energy-delay product with Vdd for NAND03 and NAND04 is shown
in Figures 5.18, 5.19, 5.20, 5.21, 5.22 and 5.23, respectively.
The energy-delay product graph does not shoot up, unlike the case with inverter and
AND cells. The stacking of the transistors in their pull-down networks causes a further shift
in the Vdd, compared to NAND02, at which the energy-delay product shoots up. For this
reason the energy-delay product graph continues to reduce from 0.2 V to 0.38 V, whereas
it shoots up at 0.36 V in case of NAND02.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.18: NAND03 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.19: NAND03 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.20: NAND03 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.21: NAND04 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.22: NAND04 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.23: NAND04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.4 OR
This section presents the results and analysis obtained by implementing performance
enhancement methods on OR02, OR03 and OR04 cells. The delay and energy variations
with varying Vdd are characterized.
OR02
The regular OR02 cell has a delay of 208.2 ns and an energy of 0.35 fJ at 0.3 V. The
delay value of a performance-enhanced OR02 cell is lower and energy consumption is
higher when compared to regular OR02 cell. The reason for this is the higher Ion and is
similar to the case of an inverter, explained earlier. The delay and energy values for regular
and performance-enhanced OR02 cell are shown in Table 5.9.
Table 5.9: Delay and energy values for OR02 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.082e-07 3.554e-16
Gate-Gate 9.070e-08 5.265e-14
Drain-Drain 1.351e-07 2.249e-14
Supply-Ground 6.892e-08 1.672e-13
Charge boosting 4.550e-08 3.841e-14
The delay is lowest for charge boosting and the energy is lowest for Drain-Drain
biasing, as observed from Table 5.9. Approximately 5 times reduction in delay is
observed with charge boosting and 64 times increase in energy consumption is
observed with Drain-Drain biasing compared to a regular OR02 cell. The reason for this
is similar to the case of an inverter. The variation of delay, energy and
energy-delay product of OR02 behaves similarly to that of an inverter, as shown in
Figures 5.24, 5.25 and 5.26, respectively.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.24: OR02 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.25: OR02 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.26: OR02 energy-delay product with varying Vdd in IBM 65 nm technology.
OR03 and OR04
The regular OR03 cell has a delay of 344.4 ns and an energy of 0.8 fJ at 0.3 V. The
regular OR04 cell has a delay of 507.2 ns and energy of 1.5 fJ at 0.3 V. The delay and
energy values for regular and performance-enhanced OR03 cell and OR04 cell are shown
in Table 5.10 and 5.11 respectively.
Table 5.10: Delay and energy values for OR03 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.444e-07 8.118e-16
Gate-Gate 1.481e-07 1.098e-13
Drain-Drain 2.004e-07 3.810e-14
Supply-Ground 1.134e-07 4.091e-13
Charge boosting 5.318e-08 1.265e-13
Table 5.11: Delay and energy values for OR04 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 5.072e-07 1.568e-15
Gate-Gate 2.082e-07 2.279e-13
Drain-Drain 2.771e-07 6.931e-14
Supply-Ground 1.613e-07 9.614e-13
Charge boosting 6.373e-08 3.484e-13
The delay is least in case of charge boosting and energy is least in case of Drain-Drain
biasing for both OR03 and OR04, similar to OR02, shown in Table 5.10 and Table 5.11.
Approximately 6 times reduction in delay in case of charge boosted OR03 and 8 times
reduction in delay in case of charge boosted OR04 are observed compared to the regular
OR03 and OR04 cells. Approximately 46 times increase in energy in case of Drain-Drain
OR03 and 45 times increase in energy in case of Drain-Drain OR04 are observed compared
to the regular OR03 and OR04 cells. The behavior in case of OR03 and OR04 is similar
to that of OR02 and the variation of delay, energy and energy-delay product with Vdd for
OR03 and OR04 is shown in Figure 5.27, 5.28, 5.29, 5.30, 5.31 and 5.32, respectively.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.27: OR03 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.28: OR03 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.29: OR03 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log (delay) in s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.30: OR04 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (energy) in J versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.31: OR04 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log (EDP) in J-s versus Vdd (V), 0.22 V to 0.38 V; curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.32: OR04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.5 NOR
This section presents the results and analysis obtained by implementing performance
enhancement methods on NOR02, NOR03 and NOR04 cells. The delay and energy variations
with varying Vdd are characterized.
NOR02
The regular NOR02 cell has a delay of 71.4 ns and an energy of 0.18 fJ at 0.3 V. The
delay value of a performance-enhanced NOR02 cell is lower and energy consumption is
higher when compared to regular NOR02 cell. The reason for this is the higher Ion and is
similar to the case of an inverter, explained earlier. The delay and energy values for regular
and performance-enhanced NOR02 cell are shown in Table 5.12.
The behavior of energy-delay product in case of NOR02 is similar to that of an inverter.
The energy-delay product shoots up at Vdd = 0.3 V.
Table 5.12: Delay and energy values for NOR02 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 7.149e-08 1.866e-16
Gate-Gate 3.522e-08 3.695e-14
Drain-Drain 4.045e-08 1.111e-14
Supply-Ground 1.979e-08 1.078e-13
Charge boosting 1.079e-08 6.509e-14
The delay is least for charge boosting and, among the enhancement methods, the energy is
least for Drain-Drain biasing, as observed from Table 5.12. Approximately 7 times reduction
in delay is observed for charge boosting and approximately 60 times increase in energy
consumption is observed for Drain-Drain biasing compared to a regular NOR02 cell. The
reason for this is similar to the case of an inverter. The variation of delay, energy and
energy-delay product with Vdd for NOR02 is similar to that of an inverter and is shown in
Figures 5.33, 5.34 and 5.35, respectively.
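The ratios quoted above follow directly from the values in Table 5.12. A minimal Python sketch of that arithmetic is given here; the dictionary layout and the function name are illustrative assumptions and are not part of the characterization flow used in this work.

```python
# Delay (s) and energy (J) for NOR02 at 0.3 V, copied from Table 5.12.
NOR02 = {
    "Regular": (7.149e-08, 1.866e-16),
    "Gate-Gate": (3.522e-08, 3.695e-14),
    "Drain-Drain": (4.045e-08, 1.111e-14),
    "Supply-Ground": (1.979e-08, 1.078e-13),
    "Charge boosting": (1.079e-08, 6.509e-14),
}


def ratios(table, method):
    """Return (delay reduction, energy increase) of a method relative to the regular cell."""
    reg_delay, reg_energy = table["Regular"]
    delay, energy = table[method]
    return reg_delay / delay, energy / reg_energy


delay_gain, _ = ratios(NOR02, "Charge boosting")
_, energy_cost = ratios(NOR02, "Drain-Drain")
print(f"Charge boosting delay reduction: {delay_gain:.1f}x")
print(f"Drain-Drain energy increase: {energy_cost:.1f}x")
```

The same helper can be applied to any of the other delay and energy tables in this chapter by swapping in the corresponding values.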
[Plot: log(Delay) in s (10^-9 to 10^-6) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.33: NOR02 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(Energy) in J (10^-16 to 10^-11) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.34: NOR02 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-22 to 10^-20) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.35: NOR02 energy-delay product with varying Vdd in IBM 65 nm technology.
NOR03 and NOR04
The regular NOR03 cell has a delay of 177.5 ns and an energy of 0.60 fJ at 0.3 V. The
regular NOR04 cell has a delay of 307.4 ns and energy of 1.37 fJ at 0.3 V. The delay
and energy values for regular and performance enhanced NOR03 cell and NOR04 cell are
shown in Table 5.13 and 5.14, respectively.
Table 5.13: Delay and energy values for NOR03 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 1.775e-07 6.029e-16
Gate-Gate 7.028e-08 7.636e-14
Drain-Drain 8.999e-08 1.351e-14
Supply-Ground 5.630e-08 2.920e-13
Charge boosting 1.701e-08 1.248e-13
Table 5.14: Delay and energy values for NOR04 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.074e-07 1.376e-15
Gate-Gate 1.164e-07 1.523e-13
Drain-Drain 1.503e-07 1.583e-14
Supply-Ground 9.556e-08 7.272e-13
Charge boosting 2.467e-08 3.432e-13
The delay is least in case of charge boosting and energy is least in case of Drain-Drain
biasing for both NOR03 and NOR04, similar to NOR02, shown in Table 5.13 and Table
5.14. Approximately 10 times reduction in delay in case of charge boosted NOR03 and 12
times reduction in delay in case of charge boosted NOR04 are observed when compared to
the regular NOR03 and NOR04 cells. Approximately 22 times increase in energy in case
of Drain-Drain NOR03 and 11.5 times increase in energy in case of Drain-Drain NOR04
are observed compared to the regular NOR03 and NOR04 cells. The behavior of NOR03 and
NOR04 is similar to that of NOR02. The variation of delay, energy and energy-delay
product with Vdd is shown in Figures 5.36, 5.37 and 5.38 for NOR03 and in Figures 5.39,
5.40 and 5.41 for NOR04.
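As a rough illustration of how the energy-delay product is obtained from the tabulated values, the following Python sketch multiplies the delay and energy entries of Tables 5.13 and 5.14 and reports the enhancement method with the smallest product. It covers the 0.3 V operating point only, so it does not reproduce the Vdd dependence shown in the figures. The function names and data layout are assumptions made for this example.

```python
# Delay (s) and energy (J) at 0.3 V, copied from Tables 5.13 and 5.14.
NOR03 = {
    "Regular": (1.775e-07, 6.029e-16),
    "Gate-Gate": (7.028e-08, 7.636e-14),
    "Drain-Drain": (8.999e-08, 1.351e-14),
    "Supply-Ground": (5.630e-08, 2.920e-13),
    "Charge boosting": (1.701e-08, 1.248e-13),
}
NOR04 = {
    "Regular": (3.074e-07, 1.376e-15),
    "Gate-Gate": (1.164e-07, 1.523e-13),
    "Drain-Drain": (1.503e-07, 1.583e-14),
    "Supply-Ground": (9.556e-08, 7.272e-13),
    "Charge boosting": (2.467e-08, 3.432e-13),
}


def edp(table):
    """Energy-delay product (J-s) of every method in a table."""
    return {method: delay * energy for method, (delay, energy) in table.items()}


def best_enhanced(table):
    """Enhancement method (regular cell excluded) with the smallest energy-delay product."""
    products = edp(table)
    products.pop("Regular")
    return min(products, key=products.get)


for name, table in (("NOR03", NOR03), ("NOR04", NOR04)):
    print(name, best_enhanced(table))
```

For both cells the sketch selects Drain-Drain biasing at 0.3 V, which agrees with the corresponding entries of Table 5.32.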
The energy-delay product behavior is different compared to the inverter and AND cells.
The energy-delay product shoots up at Vdd = 0.28 V for NOR03 and at 0.26 V for NOR04.
In a NOR cell the transistors are stacked in the pull-up network, as opposed to the
pull-down network in the AND cells. The savings in delay for NOR cells are lower
compared to AND cells due to the stacking in the pull-up network. The energy consumption
is higher for a NOR cell compared to a NAND cell because in the AND cell the stacking is
present in the pull-down network, which resists the leakage. Thus, the energy-delay
product shoots up at a voltage lower than 0.3 V, which is the value observed for the
inverter.
[Plot: log(Delay) in s (10^-9 to 10^-6) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.36: NOR03 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(Energy) in J (10^-15 to 10^-11) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.37: NOR03 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-22 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.38: NOR03 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(Delay) in s (10^-9 to 10^-5) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.39: NOR04 delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(Energy) in J (10^-15 to 10^-10) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.40: NOR04 energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.41: NOR04 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.6 XOR and XNOR
This section presents the results and analysis obtained by implementing performance
enhancement methods on XOR and XNOR cells. The delay and energy variations with
varying Vdd are characterized.
The regular XOR cell has a delay of 83.4 ns and an energy of 0.22 fJ at 0.3 V. The regular
XNOR cell has a delay of 223.5 ns and energy of 0.38 fJ at 0.3 V. The delay and energy
values for regular and performance enhanced XOR cell and XNOR cell are shown in Tables
5.15 and 5.16, respectively.
Table 5.15: Delay and energy values for XOR at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 8.347e-08 2.265e-16
Gate-Gate 3.239e-08 4.387e-14
Drain-Drain 5.508e-08 2.286e-14
Supply-Ground 2.483e-08 1.449e-13
Charge boosting 8.044e-09 4.745e-14
Table 5.16: Delay and energy values for XNOR at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.235e-07 3.838e-16
Gate-Gate 1.139e-07 5.088e-14
Drain-Drain 2.059e-07 3.104e-14
Supply-Ground 8.558e-08 2.003e-13
Charge boosting 6.154e-08 5.196e-14
The delay is least in case of charge boosting and, among the enhancement methods, energy
is least in case of Drain-Drain biasing for both XOR and XNOR, as expected and as shown in
Table 5.15 and Table 5.16. Approximately 10 times reduction in delay in case of charge
boosted XOR and 4 times reduction in delay in case of charge boosted XNOR are observed when
compared to the regular XOR and XNOR cells. Approximately 100 times increase in energy in
case of Drain-Drain XOR and 81 times increase in energy in case of Drain-Drain XNOR are
observed compared to the regular XOR and XNOR cells. The behavior of XOR and XNOR is
similar to that of an inverter. The variation of delay, energy and energy-delay product
with Vdd is shown in Figures 5.42, 5.43 and 5.44 for XOR and in Figures 5.45, 5.46 and 5.47
for XNOR.
[Plot: log(Delay) in s (10^-9 to 10^-6) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.42: XOR delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(Energy) in J (10^-15 to 10^-11) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.43: XOR energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-22 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.44: XOR energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(Delay) in s (10^-9 to 10^-5) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.45: XNOR delay characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(Energy) in J (10^-15 to 10^-11) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.46: XNOR energy characteristics with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.47: XNOR energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.7 AND-OR and AND-OR-INVERT
This section presents the results and analysis obtained by implementing performance
enhancement methods on AND-OR and AND-OR-INVERT cells. The delay and energy
variations with varying Vdd are characterized.
AO21 and AOI21
This section discusses the characteristics of performance-enhanced AND-OR-21 (AO21)
cell and AND-OR-INVERT-21 (AOI21) cell. The AO21 cell is constructed by adding an
inverter to the AOI21 cell. The AO21 cell has a delay of 413.8 ns and energy of 0.63 fJ at
0.3 V. The AOI21 cell has a delay of 183.7 ns and energy of 0.40 fJ at 0.3 V. The delay and
energy values for regular and performance-enhanced AO21 and AOI21 cells are shown in
Tables 5.17 and 5.18, respectively.
Table 5.17: Delay and energy values for AO21 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 4.138e-07 6.365e-16
Gate-Gate 1.369e-07 7.015e-14
Drain-Drain 2.154e-07 2.286e-14
Supply-Ground 1.070e-07 2.173e-13
Charge boosting 4.804e-08 7.021e-14
Table 5.18: Delay and energy values for AOI21 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 1.837e-07 4.048e-16
Gate-Gate 5.818e-08 5.278e-14
Drain-Drain 9.653e-08 1.077e-14
Supply-Ground 4.850e-08 1.588e-13
Charge boosting 1.293e-08 6.878e-14
The delay value of AOI21 cell is lower compared to AO21 cell because of the additional
inverter present in AO21. Charge boosting has the least delay for AO21 and AOI21 cells,
compared with other methods. From Tables 5.17 and 5.18, approximately 9 times and 14 times
reduction in delay are observed for AO21 and AOI21, respectively, compared to the
corresponding regular cells. Drain-Drain biasing has the least energy consumption among
the enhancement methods for AO21 and AOI21 cells. Approximately 35 times and 27 times
increase in energy consumption are observed for AO21 and AOI21, respectively, compared to
the corresponding regular cells. The variation of energy-delay product with supply voltage
for AO21 and AOI21 is shown in Figures 5.48 and 5.49, respectively. The energy-delay
product is least in case of charge boosting for Vdd greater than or equal to 300 mV and it
is least in case of Drain-Drain biasing for Vdd less than 300 mV.
The behavior of the energy-delay product for both AO21 and AOI21 is similar to an
inverter. However, for AO21 the energy-delay product shoots up at 0.34 V while for AOI21 it
shoots up at 0.38 V. The additional inverter present in AO21 causes it to consume more
energy at a lower voltage compared to AOI21, due to which the Vdd at which the energy-delay
product shoots up shifts from 0.34 V for AO21 to 0.38 V for AOI21.
[Plot: log(EDP) in J-s (10^-21 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.48: AO21 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-22 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.49: AOI21 energy-delay product with varying Vdd in IBM 65 nm technology.
AO22 and AOI22
This section discusses the characteristics of performance-enhanced AND-OR-22 (AO22)
cell and AND-OR-INVERT-22 (AOI22) cell. The AO22 cell is constructed by adding an
inverter to the AOI22 cell. The AO22 cell has a delay of 419 ns and energy of 1.172 fJ at
300 mV. The AOI22 cell has a delay of 223.9 ns and energy of 0.89 fJ at 300 mV. The delay
and energy values for regular and performance-enhanced AO22 and AOI22 cells are shown in
Tables 5.19 and 5.20, respectively.
Table 5.19: Delay and energy values for AO22 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 4.190e-07 1.172e-15
Gate-Gate 1.671e-07 1.532e-13
Drain-Drain 2.717e-07 3.815e-14
Supply-Ground 1.346e-07 4.796e-13
Charge boosting 4.866e-08 1.732e-13
Table 5.20: Delay and energy values for AOI22 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.239e-07 8.897e-16
Gate-Gate 7.273e-08 1.216e-13
Drain-Drain 1.417e-07 1.464e-14
Supply-Ground 6.837e-08 3.643e-13
Charge boosting 1.173e-08 2.878e-13
The delay value of AOI22 cell is lower compared to AO22 cell because of the additional
inverter present in AO22. Charge boosting has the least delay for AO22 and AOI22 cells
when compared with other methods. Approximately 8 times and 20 times reduction in delay
are observed for AO22 and AOI22, respectively, when compared with the corresponding
regular cells. Drain-Drain biasing has the least energy consumption among the enhancement
methods for AO22 and AOI22 cells. Approximately 32 times and 16 times increase in energy
consumption are observed for AO22 and AOI22, respectively, when compared with the
corresponding regular cells. The variation of energy-delay product with supply voltage for
AO22 and AOI22 is shown in Figures 5.50 and 5.51, respectively. The energy-delay product
is least for charge boosting when Vdd is greater than or equal to 290 mV for AO22 and
310 mV for AOI22. For Vdd less than 290 mV for AO22 and 310 mV for AOI22, Drain-Drain
biasing has the least energy-delay product.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.50: AO22 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.51: AOI22 energy-delay product with varying Vdd in IBM 65 nm technology.
AO221 and AOI221
This section discusses the characteristics of performance-enhanced AND-OR-221 (AO221)
cell and AND-OR-INVERT-221 (AOI221) cell. The AO221 cell is constructed by adding
an inverter to AOI221 cell. AO221 cell has a delay of 543 ns and energy of 1.203 fJ at
0.3 V. AOI221 cell has a delay of 333.4 ns and energy of 0.95 fJ at 0.3 V. The delay and
energy values for regular and performance enhanced AO221 and AOI221 cells are shown
in Tables 5.21 and 5.22, respectively.
Table 5.21: Delay and energy values for AO221 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 5.430e-07 1.203e-15
Gate-Gate 2.052e-07 1.709e-13
Drain-Drain 3.261e-07 4.251e-14
Supply-Ground 1.640e-07 5.781e-13
Charge boosting 5.633e-08 2.171e-13
Table 5.22: Delay and energy values for AOI221 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.334e-07 9.509e-16
Gate-Gate 1.045e-07 1.403e-13
Drain-Drain 1.851e-07 1.715e-14
Supply-Ground 9.386e-08 4.615e-13
Charge boosting 1.785e-08 2.156e-13
The delay value of AOI221 cell is lower compared to AO221 cell because of the additional
inverter present in AO221. Charge boosting has the least delay for AO221 and AOI221 cells
when compared with other methods. Approximately 10 times and 20 times reduction in delay
are observed for AO221 and AOI221, respectively, compared to the corresponding regular
cells. Drain-Drain biasing has the least energy consumption among the enhancement methods
for AO221 and AOI221 cells. From Tables 5.21 and 5.22, approximately 35 times and 18 times
increase in energy consumption are observed for AO221 and AOI221, respectively, compared
to the corresponding regular cells. The variation of energy-delay product with supply
voltage for AO221 and AOI221 is shown in Figures 5.52 and 5.53, respectively. The
energy-delay product is least for charge boosting when Vdd is greater than or equal to
300 mV for AO221 and 310 mV for AOI221. For Vdd less than 290 mV for AO221 and 310 mV for
AOI221, Drain-Drain biasing has the least energy-delay product.
The behavior of the energy-delay product graph for AO221 and AOI221 is similar to
AO21 and AOI21. Due to the additional inverter present in AO221, the energy consumption
is higher compared to AOI221. This is the reason why the voltage at which the energy-delay
product shoots up shifts from 0.36 V for AO221 to 0.38 V for AOI221.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.52: AO221 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.53: AOI221 energy-delay product with varying Vdd in IBM 65 nm technology.
AO32 and AOI32
This section discusses the characteristics of high-performance AND-OR-32 (AO32) cell
and AND-OR-INVERT-32 (AOI32) cell. The AO32 cell is constructed by adding an inverter
to the AOI32 cell. The AO32 cell has a delay of 477.1 ns and energy of 1.03 fJ at 0.3 V.
The AOI32 cell has a delay of 274.7 ns and energy of 0.80 fJ at 0.3 V. The delay and energy
values for regular and performance-enhanced AO32 and AOI32 cells are shown in Tables
5.23 and 5.24, respectively.
Table 5.23: Delay and energy values for AO32 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 4.771e-07 1.036e-15
Gate-Gate 1.938e-07 1.957e-13
Drain-Drain 3.346e-07 3.645e-14
Supply-Ground 1.534e-07 5.660e-13
Charge boosting 5.092e-08 2.197e-13
Table 5.24: Delay and energy values for AOI32 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.747e-07 8.018e-16
Gate-Gate 9.372e-08 1.644e-13
Drain-Drain 1.949e-07 1.317e-14
Supply-Ground 8.379e-08 4.511e-13
Charge boosting 1.345e-08 3.688e-13
As shown in Tables 5.23 and 5.24, the delay value of AOI32 cell is lower compared to
AO32 cell because of the additional inverter present in AO32. Charge boosting has the
least delay for AO32 and AOI32 cells when compared with other methods. Approximately
10 times and 20 times reduction in delay are observed for AO32 and AOI32, respectively,
compared to the corresponding regular cells. Drain-Drain biasing has the least energy
consumption among the enhancement methods for AO32 and AOI32 cells. Tables 5.23 and 5.24
show approximately 35 times and 16 times increase in the energy consumption for AO32 and
AOI32, respectively, compared to the corresponding regular cells. The variation of
energy-delay product with supply voltage for AO32 and AOI32 is shown in Figures 5.54 and
5.55, respectively. The energy-delay product is least for charge boosting when Vdd is
greater than or equal to 300 mV for AO32 and 320 mV for AOI32. For Vdd less than 300 mV
for AO32 and 320 mV for AOI32, Drain-Drain biasing has the least energy-delay product.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.54: AO32 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.55: AOI32 energy-delay product with varying Vdd in IBM 65 nm technology.
AO321 and AOI321
This section discusses the characteristics of high-performance AND-OR-321 (AO321)
cell and AND-OR-INVERT-321 (AOI321) cell. The AO321 cell is constructed by adding
an inverter to the AOI321 cell. The AO321 cell has a delay of 612.1 ns and energy of 1.3 fJ
at 0.3 V. The AOI321 cell has a delay of 391.1 ns and energy of 1.049 fJ at 0.3 V. The delay
and energy values for regular and performance-enhanced AO321 and AOI321 cells are shown
in Tables 5.25 and 5.26, respectively.
Table 5.25: Delay and energy values for AO321 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 6.121e-07 1.300e-15
Gate-Gate 2.354e-07 2.126e-13
Drain-Drain 3.842e-07 4.293e-14
Supply-Ground 1.838e-07 6.679e-13
Charge boosting 5.872e-08 2.605e-13
Table 5.26: Delay and energy values for AOI321 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.911e-07 1.049e-15
Gate-Gate 1.266e-07 1.836e-13
Drain-Drain 2.371e-07 1.772e-14
Supply-Ground 1.096e-07 5.517e-13
Charge boosting 1.959e-08 2.609e-13
The delay value of AOI321 cell is lower compared to AO321 cell because of the additional
inverter present in AO321. For the same reason, the energy consumption is higher for AO321
when compared to AOI321. Charge boosting has the least delay for AO321 and AOI321 cells
when compared with other methods. Approximately 10 times and 20 times reduction in delay
are observed for AO321 and AOI321, respectively, compared to the corresponding regular
cells. Drain-Drain biasing has the least energy consumption among the enhancement methods
for AO321 and AOI321 cells. Approximately 33 times and 17 times increase in energy
consumption are observed for AO321 and AOI321, respectively, compared to the corresponding
regular cells. The variation of energy-delay product with supply voltage for AO321 and
AOI321 is shown in Figures 5.56 and 5.57, respectively. The energy-delay product is least
for charge boosting when Vdd is greater than or equal to 300 mV for AO321 and 340 mV for
AOI321, and it is least for Drain-Drain biasing when Vdd is less than 300 mV for AO321 and
340 mV for AOI321.
The behavior of the energy-delay product graphs of AO321 and AOI321 is similar to that of
an inverter. The energy-delay product shoots up in both cells at 0.36 V. However, the
energy-delay product of AOI321 decreases after 0.36 V, which is not the case with AO321.
The additional inverter present in AO321 causes this behavior.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.56: AO321 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.57: AOI321 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.8 OR-AND and OR-AND-INVERT
This section discusses the characteristics of OR-AND and OR-AND-INVERT cells.
OA21 and OAI21
This section discusses the characteristics of high performance OR-AND-21 (OA21) cell
and OR-AND-INVERT-21 (OAI21) cell. The OA21 cell is constructed by adding an in-
verter to OAI21 cell. OA21 cell has a delay of 302.6 ns and energy of 0.48 fJ at 0.3 V.
OAI21 cell has a delay of 146.4 ns and energy of 0.33 fJ at 0.3 V. The delay and energy
values for regular and performance-enhanced OA21 and OAI21 cells are shown in Tables
5.27 and 5.28, respectively.
Table 5.27: Delay and energy values for OA21 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 3.026e-07 4.815e-16
Gate-Gate 1.261e-07 6.893e-14
Drain-Drain 2.167e-07 1.851e-14
Supply-Ground 9.985e-08 1.848e-13
Charge boosting 4.975e-08 6.441e-14
Table 5.28: Delay and energy values for OAI21 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 1.464e-07 3.336e-16
Gate-Gate 6.977e-08 5.122e-14
Drain-Drain 9.656e-08 7.182e-15
Supply-Ground 3.988e-08 1.273e-13
Charge boosting 1.413e-08 6.975e-14
The delay value of OAI21 cell is lower compared to OA21 cell because of the additional
inverter present in OA21. For the same reason, the energy consumption is higher for OA21
when compared to OAI21. Charge boosting has the least delay for OA21 and OAI21 cells when
compared with other methodologies. Approximately 6 times and 10 times reduction in delay
are observed for OA21 and OAI21, respectively, compared to the corresponding regular
cells. Drain-Drain biasing has the least energy consumption among the enhancement methods
for OA21 and OAI21 cells. Approximately 38 times and 22 times increase in energy
consumption are observed for OA21 and OAI21, respectively, compared to the corresponding
regular cells. The variation of energy-delay product with supply voltage for OA21 and
OAI21 is shown in Figures 5.58 and 5.59, respectively. The energy-delay product is least
for charge boosting when Vdd is greater than or equal to 290 mV for OA21 and 340 mV for
OAI21, and it is least for Drain-Drain biasing when Vdd is less than 290 mV for OA21 and
340 mV for OAI21.
[Plot: log(EDP) in J-s (10^-21 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.58: OA21 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-22 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.59: OAI21 energy-delay product with varying Vdd in IBM 65 nm technology.
OA32 and OAI32
This section discusses the characteristics of high-performance OR-AND-32 (OA32) cell
and OR-AND-INVERT-32 (OAI32) cell. The OA32 cell is constructed by adding an inverter
to the OAI32 cell. The OA32 cell has a delay of 448.7 ns and energy of 0.95 fJ at 0.3 V.
The OAI32 cell has a delay of 243.8 ns and energy of 0.75 fJ at 0.3 V. The delay and energy
values for regular and performance-enhanced OA32 and OAI32 cells are shown in Tables
5.29 and 5.30, respectively.
Table 5.29: Delay and energy values for OA32 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 4.487e-07 9.528e-16
Gate-Gate 1.861e-07 1.546e-13
Drain-Drain 3.258e-07 3.835e-14
Supply-Ground 1.528e-07 4.951e-13
Charge boosting 5.467e-08 2.165e-13
Table 5.30: Delay and energy values for OAI32 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.438e-07 7.550e-16
Gate-Gate 9.930e-08 1.251e-13
Drain-Drain 1.845e-07 1.393e-14
Supply-Ground 8.025e-08 3.800e-13
Charge boosting 1.674e-08 2.156e-13
The delay value of OAI32 cell is lower compared to OA32 cell because of the additional
inverter present in OA32. For the same reason, the energy consumption is higher for OA32
when compared to OAI32. Charge boosting has the least delay for OA32 and OAI32 cells when
compared with other methodologies. Approximately 8 times and 15 times reduction in delay
are observed for OA32 and OAI32, respectively, compared to the corresponding regular
cells. Drain-Drain biasing has the least energy consumption among the enhancement methods
for OA32 and OAI32 cells. Approximately 40 times and 18 times increase in energy
consumption are observed for OA32 and OAI32, respectively, compared to the corresponding
regular cells. The variation of energy-delay product with supply voltage for OA32 and
OAI32 is shown in Figures 5.60 and 5.61, respectively. The energy-delay product is least
for charge boosting when Vdd is greater than or equal to 300 mV for OA32 and 330 mV for
OAI32, and it is least for Drain-Drain biasing when Vdd is less than 300 mV for OA32 and
330 mV for OAI32.
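The crossover voltages reported for the AND-OR and OR-AND families can be collected into a small lookup. The Python sketch below is illustrative only: the dictionary and function names are assumptions made for this example, and only crossover values stated in the text of Sections 5.2.7 and 5.2.8 are included.

```python
# Supply voltage (mV) at or above which charge boosting has the least
# energy-delay product; below it, Drain-Drain biasing has the least.
# Values are the crossover points quoted in Sections 5.2.7 and 5.2.8.
EDP_CROSSOVER_MV = {
    "AO21": 300, "AOI21": 300,
    "AO22": 290, "AOI22": 310,
    "AO32": 300, "AOI32": 320,
    "AO321": 300, "AOI321": 340,
    "OA32": 300, "OAI32": 330,
}


def min_edp_method(cell, vdd_mv):
    """Return the method with the least energy-delay product for a cell at vdd_mv."""
    if vdd_mv >= EDP_CROSSOVER_MV[cell]:
        return "Charge boosting"
    return "Drain-Drain"


print(min_edp_method("AO22", 300))
print(min_edp_method("AOI32", 300))
```

At 300 mV this lookup returns charge boosting for AO22 and Drain-Drain biasing for AOI32, in line with the design choices listed for these cells in Table 5.32.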
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.60: OA32 energy-delay product with varying Vdd in IBM 65 nm technology.
[Plot: log(EDP) in J-s (10^-21 to 10^-18) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.61: OAI32 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.9 NOR0211
A regular NOR0211 cell operated at 300 mV has a delay of 212.3 ns and consumes an energy
of 0.52 fJ. The energy and delay values of regular and performance-enhanced NOR0211 are
shown in Table 5.31. As observed from Table 5.31, the delay is least for charge boosting,
followed by Supply-Ground biasing, Gate-Gate biasing, Drain-Drain biasing and the regular
NOR0211 cell. Approximately 3 times reduction in delay is observed for charge boosting
when compared to the regular NOR0211 cell. The energy consumption is higher for the
high-performance cells, as expected, due to higher Ion. Drain-Drain biasing has the least
energy among the performance enhancement methods. Approximately 40 times increase in
energy consumption is observed for Drain-Drain biasing when compared with the regular
NOR0211 cell. The variation of energy-delay product with supply voltage is shown in
Figure 5.62. The energy-delay product of Drain-Drain biasing is the least for Vdd less
than 300 mV, and for Vdd greater than 300 mV charge boosting has the least energy-delay
product compared to other methodologies.
Table 5.31: Delay and energy values for NOR0211 at 0.3 V for IBM 65 nm technology.
Methodology Delay (s) Energy (J)
Regular 2.123e-07 5.222e-16
Gate-Gate 1.115e-07 5.568e-14
Drain-Drain 1.386e-07 2.058e-14
Supply-Ground 7.734e-08 1.634e-13
Charge boosting 6.372e-08 4.642e-14
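The ordering of the methods can be read off Table 5.31 by sorting its rows. A short Python sketch of that ranking is shown here; the variable names are assumptions introduced for illustration.

```python
# Delay (s) and energy (J) for NOR0211 at 0.3 V, copied from Table 5.31.
NOR0211 = {
    "Regular": (2.123e-07, 5.222e-16),
    "Gate-Gate": (1.115e-07, 5.568e-14),
    "Drain-Drain": (1.386e-07, 2.058e-14),
    "Supply-Ground": (7.734e-08, 1.634e-13),
    "Charge boosting": (6.372e-08, 4.642e-14),
}

# Rank the methods from smallest to largest delay and energy.
by_delay = sorted(NOR0211, key=lambda method: NOR0211[method][0])
by_energy = sorted(NOR0211, key=lambda method: NOR0211[method][1])

print("Delay, fastest first:", by_delay)
print("Energy, lowest first:", by_energy)
```

Sorting by delay places charge boosting first and the regular cell last, while sorting by energy places the regular cell first and Drain-Drain biasing first among the enhancement methods.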
[Plot: log(EDP) in J-s (10^-21 to 10^-19) versus Vdd (0.22 V to 0.38 V); curves: Gate-Gate, Drain-Drain, Supply-Ground, buffer.]
Figure 5.62: NOR0211 energy-delay product with varying Vdd in IBM 65 nm technology.
5.2.10 Summary of Performance-Enhanced Standard Cell Library
Four performance-enhanced standard cell libraries were designed in subthreshold, one
corresponding to each high-performance method. Depending on the design constraints and
user requirements, a particular standard cell library can be chosen. The user requirement
can be minimum delay, minimum energy or minimum energy-delay product. The best-case
method for each standard cell under these requirements is shown in Table 5.32. The cell
characteristics such as propagation delay, energy and power at Vdd = 0.3 V and 125 °C are
shown in Table 5.33 for Gate-Gate biasing, Table 5.34 for Drain-Drain biasing, Table 5.35
for Supply-Ground biasing and Table 5.36 for charge boosting. The same characteristics at
the nominal temperature of 25 °C and Vdd = 0.3 V are shown in Appendix A. The propagation
delay is calculated as shown in Equation (5.6).
Delay = tpLH + tpHL (5.6)
where, tpLH is the low to high propagation delay and tpHL is the high to low propagation
delay. The design choice in case of minimum delay is charge boosting for all the stan-
dard cells. Similarly Drain-Drain biasing is the design choice in case of minimum energy
consumption.
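Equation (5.6) can be checked against the tabulated library data. The Python sketch below recomputes the delay of three Gate-Gate cells from their tpLH and tpHL entries in Table 5.33 and compares the result with the reported delay column. The function name and the tolerance are assumptions chosen for this example; the tolerance only accounts for the four-significant-figure rounding of the table.

```python
import math

# (tpLH, tpHL, reported delay) in seconds for three Gate-Gate cells, from Table 5.33.
GATE_GATE = {
    "AND02": (1.016e-08, 6.756e-09, 1.691e-08),
    "NAND02": (7.270e-09, 1.184e-09, 8.454e-09),
    "INVERTER": (2.832e-09, 6.865e-10, 3.518e-09),
}


def propagation_delay(tplh, tphl):
    """Equation (5.6): delay as the sum of the low-to-high and high-to-low delays."""
    return tplh + tphl


for cell, (tplh, tphl, reported) in GATE_GATE.items():
    computed = propagation_delay(tplh, tphl)
    # The table is rounded to four significant figures, hence the relative tolerance.
    assert math.isclose(computed, reported, rel_tol=1e-3), cell
    print(f"{cell}: {computed:.3e} s")
```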
Table 5.32: Design choice of a standard cell for delay, energy and energy-delay product as
metrics.
Standard Cell Energy-Delay Product
AND02 Charge boosting
AND03 Charge boosting
AND04 Drain-Drain biasing
NAND02 Charge boosting
NAND03 Drain-Drain biasing
NAND04 Drain-Drain biasing
OR02 Charge boosting
OR03 Charge boosting
OR04 Drain-Drain biasing
NOR02 Drain-Drain biasing
NOR03 Drain-Drain biasing
NOR04 Drain-Drain biasing
INVERTER Drain-Drain biasing
XNOR Charge boosting
XOR Charge boosting
AO21 Charge boosting
AO22 Charge boosting
AO32 Charge boosting
AO221 Charge boosting
AO321 Charge boosting
AOI21 Charge boosting
AOI22 Drain-Drain biasing
AOI32 Drain-Drain biasing
AOI221 Drain-Drain biasing
AOI321 Drain-Drain biasing
OA21 Charge boosting
OA32 Charge boosting
OAI21 Drain-Drain biasing
OAI32 Drain-Drain biasing
NOR0211 Drain-Drain biasing
Table 5.33: Delay, power and energy values for Gate-Gate standard cell library at 0.3 V
and 125 °C.
Standard Cell TPLH (s) TPHL (s) Delay (s) Power (W) Energy (J) Contamination Delay (s)
AND02 1.016e-08 6.756e-09 1.691e-08 2.113e-10 1.092e-14 0.23e-09
AND03 2.522e-08 8.553e-09 3.377e-08 1.877e-10 1.920e-14 0.41e-09
AND04 2.701e-08 8.320e-09 3.533e-08 2.406e-10 4.844e-14 0.53e-09
NAND02 7.270e-09 1.184e-09 8.454e-09 1.109e-10 4.942e-15 0.11e-09
NAND03 1.847e-08 2.727e-09 2.119e-08 4.112e-11 4.927e-15 0.24e-09
NAND04 1.697e-08 2.055e-09 1.903e-08 1.490e-10 2.654e-14 0.37e-09
OR02 7.447e-09 1.327e-08 2.071e-08 2.071e-10 1.015e-14 0.57e-09
OR03 1.048e-08 2.377e-08 3.424e-08 2.404e-10 2.193e-14 0.83e-09
OR04 1.434e-08 3.589e-08 5.023e-08 2.668e-10 4.493e-14 1.2e-09
NOR02 5.338e-09 2.202e-09 7.541e-09 1.348e-10 5.736e-15 0.36e-09
NOR03 1.549e-08 4.682e-09 2.017e-08 1.898e-10 2.061e-14 0.39e-09
NOR04 2.771e-08 7.638e-09 3.535e-08 2.143e-10 3.211e-14 1.01e-09
INVERTER 2.832e-09 6.865e-10 3.518e-09 7.521e-11 1.657e-15 0.1e-09
XNOR 1.169e-08 9.091e-09 2.078e-08 2.474e-10 9.811e-15 0.26e-09
XOR 2.431e-09 5.343e-09 7.774e-09 1.865e-10 6.809e-15 2.9e-09
AO21 7.996e-09 2.786e-08 3.586e-08 2.606e-10 1.234e-14 0.25e-09
AO22 8.540e-09 2.975e-08 3.829e-08 3.115e-10 3.040e-14 2.9e-09
AO32 9.330e-09 3.412e-08 4.345e-08 3.583e-10 3.486e-14 2.4e-09
AO221 1.089e-08 4.242e-08 5.330e-08 3.546e-10 3.206e-14 2.6e-09
AO321 1.141e-08 4.517e-08 5.658e-08 4.142e-10 3.765e-14 3.1e-09
AOI21 1.867e-08 2.768e-09 2.143e-08 1.932e-10 8.283e-15 2.3e-09
AOI22 2.181e-08 2.464e-09 2.427e-08 2.378e-10 2.151e-14 1.8e-09
AOI32 2.583e-08 3.081e-09 2.891e-08 2.841e-10 2.593e-14 2.4e-09
AOI221 3.371e-08 4.556e-09 3.826e-08 2.909e-10 2.436e-14 1.3e-09
AOI321 3.890e-08 4.947e-09 4.385e-08 3.509e-10 2.999e-14 3.9e-09
OA21 9.236e-09 1.805e-08 2.728e-08 2.436e-10 1.210e-14 1.9e-09
OA32 1.094e-08 3.108e-08 4.203e-08 3.429e-10 3.225e-14 2.6e-09
OAI21 9.597e-09 3.544e-09 1.314e-08 1.655e-10 7.365e-15 1.1e-09
OAI32 2.332e-08 4.345e-09 2.766e-08 2.732e-10 2.383e-14 1e-09
NOR0211 8.907e-09 1.451e-08 2.341e-08 2.134e-10 1.040e-14 1.9e-09
Table 5.34: Delay, power and energy values for Drain-Drain standard cell library at 0.3 V
and 125 °C.
Standard Cell TPLH (s) TPHL (s) Delay (s) Power (W) Energy (J) Contamination Delay (s)
AND02 1.266e-08 8.203e-09 2.086e-08 3.055e-10 1.829e-14 0.35e-09
AND03 2.900e-08 9.057e-09 3.805e-08 2.670e-10 3.199e-14 0.47e-09
AND04 3.797e-08 1.053e-08 4.850e-08 2.512e-10 6.016e-14 0.52e-09
NAND02 1.089e-08 1.370e-09 1.226e-08 1.242e-10 7.417e-15 0.11e-09
NAND03 1.888e-08 1.883e-09 2.077e-08 1.040e-10 1.242e-14 0.35e-09
NAND04 2.762e-08 2.419e-09 3.003e-08 8.080e-11 1.926e-14 0.46e-09
OR02 9.163e-09 1.341e-08 2.257e-08 2.838e-10 1.682e-14 0.72e-09
OR03 1.267e-08 2.054e-08 3.321e-08 2.697e-10 3.198e-14 0.84e-09
OR04 1.698e-08 2.881e-08 4.579e-08 2.475e-10 5.846e-14 0.92e-09
NOR02 5.907e-09 2.668e-09 8.575e-09 1.605e-10 9.386e-15 0.15e-09
NOR03 1.323e-08 5.255e-09 1.848e-08 1.911e-10 2.243e-14 0.27e-09
NOR04 2.234e-08 8.409e-09 3.075e-08 1.828e-10 4.279e-14 0.35e-09
INVERTER 2.133e-09 1.140e-09 3.273e-09 9.578e-11 2.853e-15 0.31e-09
XNOR 1.425e-08 1.157e-08 2.582e-08 2.995e-10 1.565e-14 1.8e-09
XOR 3.434e-09 5.822e-09 9.256e-09 2.205e-10 1.100e-14 2.1e-09
AO21 9.998e-09 3.115e-08 4.115e-08 3.171e-10 1.888e-14 5.7e-09
AO22 1.085e-08 3.236e-08 4.321e-08 3.959e-10 4.736e-14 1.4e-09
AO32 1.195e-08 4.037e-08 5.232e-08 4.332e-10 5.190e-14 2.7e-09
AO221 1.328e-08 4.428e-08 5.756e-08 3.703e-10 4.411e-14 0.49e-09
AO321 1.420e-08 5.430e-08 6.850e-08 4.194e-10 5.007e-14 1.9e-09
AOI21 2.180e-08 3.310e-09 2.511e-08 2.155e-10 1.278e-14 0.27e-09
AOI22 2.455e-08 2.884e-09 2.743e-08 2.744e-10 3.277e-14 5.1e-09
AOI32 3.299e-08 3.776e-09 3.676e-08 3.085e-10 3.693e-14 2.9e-09
AOI221 3.809e-08 5.075e-09 4.316e-08 2.778e-10 3.301e-14 1.7e-09
AOI321 4.818e-08 5.693e-09 5.387e-08 3.262e-10 3.888e-14 2.3e-09
OA21 1.225e-08 2.236e-08 3.461e-08 3.149e-10 1.871e-14 1.4e-09
OA32 1.476e-08 3.476e-08 4.952e-08 4.067e-10 4.849e-14 1.8e-09
OAI21 1.225e-08 2.236e-08 3.461e-08 3.149e-10 1.871e-14 1.4e-09
OAI32 2.736e-08 5.932e-09 3.329e-08 2.955e-10 3.512e-14 3.1e-09
NOR0211 1.041e-08 1.394e-08 2.434e-08 2.901e-10 1.729e-14 1.4e-09
Table 5.35: Delay, power and energy values for Supply-Ground standard cell library at
0.3 V and 125 °C.
Standard Cell TPLH (s) TPHL (s) Delay (s) Power (W) Energy (J) Contamination Delay (s)
AND02 7.466e-09 5.477e-09 1.294e-08 4.499e-10 2.698e-14 0.26e-09
AND03 1.590e-08 5.856e-09 2.176e-08 4.693e-10 5.631e-14 0.41e-09
AND04 2.018e-08 6.874e-09 2.705e-08 5.090e-10 1.221e-13 0.65e-09
NAND02 6.050e-09 9.807e-10 7.031e-09 2.259e-10 1.355e-14 0.11e-09
NAND03 9.907e-09 1.357e-09 1.126e-08 2.659e-10 3.190e-14 0.22e-09
NAND04 1.432e-08 1.725e-09 1.605e-08 2.977e-10 7.144e-14 0.33e-09
OR02 5.609e-09 9.522e-09 1.513e-08 4.381e-10 2.613e-14 0.53e-09
OR03 8.333e-09 1.777e-08 2.610e-08 5.140e-10 6.138e-14 0.72e-09
OR04 1.158e-08 2.759e-08 3.917e-08 5.672e-10 1.356e-13 1.89e-09
NOR02 4.538e-09 1.463e-09 6.001e-09 2.736e-10 1.624e-14 0.26e-09
NOR03 1.313e-08 3.586e-09 1.671e-08 3.905e-10 4.653e-14 0.51e-09
NOR04 2.485e-08 6.083e-09 3.093e-08 4.602e-10 1.097e-13 0.81e-09
INVERTER 2.400e-09 5.496e-10 2.950e-09 1.549e-10 4.634e-15 0.1e-09
XNOR 8.889e-09 7.099e-09 1.599e-08 4.960e-10 2.605e-14 3.6e-09
XOR 2.336e-09 4.111e-09 6.446e-09 3.797e-10 1.915e-14 1.3e-09
AO21 6.478e-09 2.044e-08 2.692e-08 5.415e-10 3.241e-14 3.4e-09
AO22 7.734e-09 2.346e-08 3.120e-08 6.744e-10 8.083e-14 1.9e-09
AO32 8.212e-09 2.698e-08 3.519e-08 7.788e-10 9.341e-14 3.1e-09
AO221 9.444e-09 3.441e-08 4.385e-08 7.502e-10 8.981e-14 2.1e-09
AO321 9.750e-09 3.951e-08 4.926e-08 8.828e-10 1.058e-13 5.5e-09
AOI21 1.508e-08 2.299e-09 1.738e-08 4.018e-10 2.403e-14 1.4e-09
AOI22 1.946e-08 2.744e-09 2.220e-08 5.145e-10 6.164e-14 1.2e-09
AOI32 2.364e-08 3.174e-09 2.682e-08 6.157e-10 7.384e-14 1.9e-09
AOI221 3.102e-08 4.312e-09 3.533e-08 6.197e-10 7.413e-14 0.9e-09
AOI321 3.736e-08 4.507e-09 4.187e-08 7.515e-10 9.001e-14 2.9e-09
OA21 7.068e-09 1.317e-08 2.024e-08 5.080e-10 3.038e-14 3.4e-09
OA32 8.234e-09 2.381e-08 3.205e-08 7.508e-10 8.986e-14 2.8e-09
OAI21 7.917e-09 2.577e-09 1.049e-08 3.363e-10 2.006e-14 1.5e-09
OAI32 1.992e-08 2.860e-09 2.278e-08 6.009e-10 7.188e-14 3.5e-09
NOR0211 6.826e-09 9.769e-09 1.659e-08 4.507e-10 2.697e-14 2.2e-09
Table 5.36: Delay, power and energy values for charge-boosting standard cell library at
0.3 V and 125 °C.
Standard Cell TPLH (s) TPHL (s) Delay (s) Power (W) Energy (J) Contamination Delay (s)
AND02 3.611e-09 5.921e-09 9.532e-09 1.149e-08 6.911e-13 0.23e-09
AND03 4.404e-09 5.721e-09 1.013e-08 1.719e-08 2.063e-12 0.58e-09
AND04 4.974e-09 5.950e-09 1.092e-08 2.289e-08 5.496e-12 0.52e-09
NAND02 2.320e-09 8.810e-10 3.201e-09 1.144e-08 6.868e-13 0.74e-09
NAND03 4.020e-09 1.495e-09 5.515e-09 5.945e-09 7.147e-13 0.51e-09
NAND04 3.167e-09 1.266e-09 4.432e-09 2.284e-08 5.487e-12 0.82e-09
OR02 5.967e-09 3.813e-09 9.780e-09 1.026e-08 6.142e-13 0.94e-09
OR03 7.168e-09 4.586e-09 1.175e-08 1.597e-08 1.914e-12 0.91e-09
OR04 8.853e-09 5.457e-09 1.431e-08 2.228e-08 5.339e-12 1.52e-09
NOR02 2.266e-09 1.159e-09 3.425e-09 1.023e-08 6.137e-13 0.22e-09
NOR03 2.747e-09 2.160e-09 4.907e-09 1.595e-08 1.912e-12 1.89e-09
NOR04 3.265e-09 3.503e-09 6.768e-09 2.226e-08 5.352e-12 2.1e-09
INVERTER 2.388e-09 8.667e-10 3.255e-09 2.777e-09 8.332e-14 0.11e-09
XNOR 7.838e-09 3.759e-09 1.160e-08 1.252e-08 7.512e-13 5.39e-09
XOR 3.352e-10 2.362e-09 2.698e-09 1.250e-08 7.480e-13 1.3e-09
AO21 6.360e-09 4.689e-09 1.105e-08 1.725e-08 1.032e-12 4.9e-09
AO22 6.491e-09 4.851e-09 1.134e-08 2.286e-08 2.738e-12 1.3e-09
AO32 6.813e-09 5.067e-09 1.188e-08 2.856e-08 3.428e-12 3.1e-09
AO221 7.749e-09 5.479e-09 1.323e-08 2.856e-08 3.427e-12 2.6e-09
AO321 8.135e-09 5.673e-09 1.381e-08 3.426e-08 4.111e-12 1.3e-09
AOI21 2.865e-09 1.542e-09 4.407e-09 1.722e-08 1.033e-12 2.7e-09
AOI22 3.319e-09 1.701e-09 5.019e-09 1.581e-08 1.897e-12 2.1e-09
AOI32 2.977e-09 1.733e-09 4.710e-09 2.853e-08 3.422e-12 6.9e-09
AOI221 3.193e-09 2.516e-09 5.709e-09 2.853e-08 3.424e-12 3.4e-09
AOI321 3.309e-09 2.831e-09 6.140e-09 3.424e-08 4.108e-12 0.79e-09
OA21 6.792e-09 4.052e-09 1.084e-08 1.726e-08 1.036e-12 1.0e-09
OA32 7.156e-09 5.081e-09 1.224e-08 2.856e-08 3.430e-12 1.2e-09
OAI21 2.412e-09 1.862e-09 4.273e-09 1.722e-08 1.033e-12 2.1e-09
OAI32 3.061e-09 1.974e-09 5.034e-09 2.853e-08 3.424e-12 4.1e-09
NOR0211 9.345e-09 4.006e-09 1.335e-08 1.148e-08 6.884e-13 1.9e-09
5.3 Implementation of CPM algorithm on Benchmark Circuits
The performance-enhanced standard cell library was implemented on the ISCAS’85
benchmark circuits to evaluate its effectiveness. The performance-enhanced cells improve
the performance of the circuit at the cost of increased energy consumption, as discussed
in Chapter 3. It is therefore necessary to place these cells so that the best performance
is achieved with the least overhead in energy consumption. The optimization algorithm
discussed in Chapter 4 was applied to the benchmark circuits to determine the placement
of the performance-enhanced standard cells.
The optimization algorithm presented in Chapter 4 can be applied to network models
that are directed acyclic graphs (DAGs). An acyclic graph corresponds to a circuit without
feedback loops. Hence the ISCAS’85 benchmark circuits, which have no feedback loops [6],
were chosen. The ISCAS’85 circuits used for implementing the performance-enhanced
cell library are C432, C1908, C3540, C6288 and C7552; a brief description of each is given below.
• The C432 circuit is a 27-channel interrupt controller and has 168 gates with 36 inputs
and 7 outputs.
• The C1908 is a 16-bit error detection circuit and has 207 gates with 33 inputs and 25
outputs.
• The C3540 is an 8-bit ALU and has 744 gates with 50 inputs and 22 outputs.
• The C6288 is a 16-bit array multiplier and has 1600 gates with 32 inputs and 32
outputs.
• The C7552 is a 32-bit adder and has 1123 gates with 32 inputs and 32 outputs.
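The requirement that the network model be a directed acyclic graph can be checked mechanically before the optimization is run. The following Python sketch is one possible way to do this, assuming a netlist is represented as a dictionary that maps each gate to the gates driving its inputs. The function name `is_acyclic` and the two small netlists are hypothetical and serve only as an illustration.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(fanin):
    """Return True if the netlist graph has no feedback loop.

    fanin maps each gate name to the list of gates driving its inputs
    (an empty list for gates driven only by primary inputs).
    """
    try:
        # static_order() raises CycleError when the graph contains a cycle.
        list(TopologicalSorter(fanin).static_order())
        return True
    except CycleError:
        return False

# Hypothetical combinational netlist: g3 is driven by g1 and g2.
combinational = {"g1": [], "g2": [], "g3": ["g1", "g2"]}
# Hypothetical netlist with feedback: g1 and g2 drive each other.
with_feedback = {"g1": ["g2"], "g2": ["g1"]}

print(is_acyclic(combinational))
print(is_acyclic(with_feedback))
```

A netlist that passes this check also has a topological order, which is what a forward and backward pass over the gates relies on.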
The delay, energy and energy-delay product obtained by implementing the
performance-enhanced cell library and the CPM algorithm on the benchmark
circuits are discussed below.
Delay
The delay of each benchmark circuit is determined by its critical path.
The number of gates along the critical path and their individual delays determine the total
delay of the circuit. The delay values obtained by implementing the performance-enhanced
cell library on the benchmark circuits are shown in Table 5.37. The CPM algorithm, when
implemented on the benchmark circuits, has no effect on the delay. This is because the
CPM algorithm calculates, for each cell in the circuit, the time by which it can be delayed
without affecting the overall performance. The CPM algorithm replaces only those cells
which are not on the critical path.
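The idea of computing, for each cell, the time by which it can be delayed can be sketched with a generic critical-path-method pass over a small netlist. The Python code below is a minimal illustration under stated assumptions and is not the implementation used in this thesis: the four gates, their unit-less delays and the names `cpm_slack`, `fanin` and `slack` are all hypothetical. Cells with zero slack lie on the critical path, and cells with positive slack are the candidates that can be changed without affecting the circuit delay.

```python
from graphlib import TopologicalSorter

def cpm_slack(delay, fanin):
    """Return (circuit_delay, slack) for a combinational netlist.

    delay: gate -> propagation delay of that gate
    fanin: gate -> list of gates driving its inputs (empty for input-level gates)
    """
    order = list(TopologicalSorter(fanin).static_order())

    # Forward pass: earliest time at which each gate output settles.
    arrival = {}
    for g in order:
        arrival[g] = delay[g] + max((arrival[p] for p in fanin[g]), default=0)
    circuit_delay = max(arrival.values())

    # Build the fanout lists needed for the backward pass.
    fanout = {g: [] for g in delay}
    for g, drivers in fanin.items():
        for p in drivers:
            fanout[p].append(g)

    # Backward pass: latest settling time that does not lengthen the circuit delay.
    required = {}
    for g in reversed(order):
        required[g] = min((required[s] - delay[s] for s in fanout[g]),
                          default=circuit_delay)

    slack = {g: required[g] - arrival[g] for g in delay}
    return circuit_delay, slack

# Hypothetical netlist: g3 is driven by g1 and g2, g4 is driven by g2.
delay = {"g1": 3, "g2": 1, "g3": 2, "g4": 1}
fanin = {"g1": [], "g2": [], "g3": ["g1", "g2"], "g4": ["g2"]}

circuit_delay, slack = cpm_slack(delay, fanin)
critical = sorted(g for g, s in slack.items() if s == 0)
print(circuit_delay, slack, critical)
```

In this toy example the path through g1 and g3 sets the circuit delay, while g2 and g4 have positive slack and could be slowed down by that amount without changing it.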
The delay values of the benchmark circuits follow a similar trend across the four
performance enhancement methods. The charge boosting method has the least delay,
followed by Supply-Ground biasing, Gate-Gate biasing, Drain-Drain biasing and the
regular cell library. This mirrors the behavior of the individual standard cells
along the critical path of the circuit. As discussed earlier, charge boosting
had the least delay followed by Supply-Ground biasing, Gate-Gate biasing, Drain-Drain
biasing and the regular cell for each of the 30 standard cells designed. Since the total delay
of the circuit is the sum of the individual delays of the cells along the critical path,
the trend observed for the individual cells is reflected across the benchmark circuits.
Table 5.37: Delay values for the benchmark circuits simulated at 0.3 V in IBM 65 nm
technology.
Benchmark Circuit Regular (ns) Gate-Gate (ns) Drain-Drain (ns) Supply-Ground (ns) Charge Boosting (ns)
c432 3706.01 1450.86 1905.05 933.11 348.48
c1908 3191.71 1535.84 2377.86 1097.48 631.18
c3540 4399.97 1901.517 1867.403 1354.99 698.05
c6288 7595.2 3894.32 6679.86 2790.9 1857.94
c7552 4388.18 1843.34 2694.64 1369.46 909.02
The delay with charge boosting for any particular benchmark circuit is approximately
3 to 5 times lower than with Drain-Drain biasing (Table 5.37). This is because of the 0.2 V Vgs
boost given to all the cells, compared with approximately 0.08 V in case of Drain-Drain
biasing. The delay with Gate-Gate biasing is significantly lower than with Drain-Drain
biasing for all circuits except C3540, where the two are nearly equal, because of the
26 times higher Ion in case of Gate-Gate biasing compared to Drain-Drain biasing. The
effectiveness of the performance enhancement methods, in terms of delay savings, increases
with the depth of the critical path. This is because the delay saving obtained on each cell
accumulates along the critical path, so a larger number of cells on the critical path
yields a larger total saving in the overall delay of the circuit. The optimization
algorithm implemented on the benchmark circuits to determine the optimal placement of
the performance-enhanced cells does not affect the critical path. Hence the delay of the
circuit does not change when the optimization algorithm is applied. The effectiveness
of the optimization algorithm in minimizing the energy overhead is discussed in the
next subsection.
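The delay ratios discussed above can be recomputed directly from Table 5.37. The following Python sketch copies the tabulated delays and prints, for each circuit, the ratio of the Drain-Drain and regular-library delays to the charge boosting delay. The dictionary keys and the helper `ratio` are naming assumptions made for this illustration only.

```python
# Delay values in ns, copied from Table 5.37.
delay_ns = {
    "c432":  {"regular": 3706.01, "gate_gate": 1450.86, "drain_drain": 1905.05,
              "supply_ground": 933.11, "charge_boosting": 348.48},
    "c1908": {"regular": 3191.71, "gate_gate": 1535.84, "drain_drain": 2377.86,
              "supply_ground": 1097.48, "charge_boosting": 631.18},
    "c3540": {"regular": 4399.97, "gate_gate": 1901.517, "drain_drain": 1867.403,
              "supply_ground": 1354.99, "charge_boosting": 698.05},
    "c6288": {"regular": 7595.2, "gate_gate": 3894.32, "drain_drain": 6679.86,
              "supply_ground": 2790.9, "charge_boosting": 1857.94},
    "c7552": {"regular": 4388.18, "gate_gate": 1843.34, "drain_drain": 2694.64,
              "supply_ground": 1369.46, "charge_boosting": 909.02},
}

def ratio(circuit, slower, faster):
    """How many times larger the 'slower' delay is than the 'faster' delay."""
    return delay_ns[circuit][slower] / delay_ns[circuit][faster]

for circuit in delay_ns:
    dd_vs_cb = ratio(circuit, "drain_drain", "charge_boosting")
    reg_vs_cb = ratio(circuit, "regular", "charge_boosting")
    print(f"{circuit}: Drain-Drain/charge boosting = {dd_vs_cb:.2f}, "
          f"regular/charge boosting = {reg_vs_cb:.2f}")
```

Any other pair of methods can be compared the same way by changing the two method names passed to `ratio`.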
Energy
The total energy consumption of the circuit depends on the dynamic and static energy
of the individual cells present in the circuit. Static energy is the main component of
energy consumption in subthreshold circuits, as discussed earlier. Hence the gates which are
in static mode account for a significant portion of the total energy. The energy values
obtained by implementing the performance-enhanced cell library on the benchmark circuits
are shown in Table 5.38. Among the four performance enhancement methods, the energy for
each benchmark circuit is least with Drain-Drain biasing, followed by Gate-Gate biasing,
charge boosting and Supply-Ground biasing. This mirrors the behavior observed for the
individual standard cells. As discussed earlier, Drain-Drain biasing has the least energy
consumption and Supply-Ground biasing has the highest energy consumption for all the
standard cells. Since the total energy consumption depends on the energy of the individual
cells, the trend observed for the individual cells carries over to the benchmark circuits.
Table 5.38: Un-optimized energy values for benchmark circuits at 0.3 V in IBM 65 nm
technology.
Benchmark Circuit Regular (pJ) Gate-Gate (pJ) Drain-Drain (pJ) Supply-Ground (pJ) Charge Boosting (pJ)
c432 0.3591 5.795 3.352 16.78 6.350
c1908 1.337 9.044 3.107 29.87 18.15
c3540 1.639 27.14 6.259 80.92 46.58
c6288 2.183 38.24 17.63 170.3 53.93
c7552 1.852 52.74 8.096 104.2 59.12
The optimization algorithm is implemented on the benchmark circuits to determine the
optimal placement of the performance-enhanced cells. Its effectiveness can be evaluated
from the results shown in Table 5.39. The optimization algorithm minimizes the energy
consumption of the benchmark circuits and does not affect the delay of the circuits, as
discussed earlier. Significant savings in energy consumption are obtained by optimization.
As the size of the circuit increases, the optimization algorithm becomes more effective.
The reason is that the number of performance-enhanced cells inserted in the circuit
depends on the depth of the critical path and is independent of the size of the circuit.
The number of performance-enhanced cells inserted in each benchmark circuit is shown in
Table 5.40. The ratio of the number of performance-enhanced cells to the size of the
circuit is 0.33 for C432 compared to 0.05 for C6288. Because this ratio is much smaller
for C6288 than for C432, the energy savings in C6288 are significantly higher than in C432,
as shown in Table 5.39. The number of high-performance cells inserted in the circuit depends
on the structure of the circuit. If the circuit is wide and has few gates along
the critical path, then the number of high-performance cells inserted will be significantly
lower.
The energy with Drain-Drain biasing for any particular benchmark circuit is significantly
lower than with Supply-Ground biasing. Further, in the unoptimized case the energy gap
between Drain-Drain biasing and Supply-Ground biasing increases as the number of gates
increases. In contrast, with optimization the energy gap between Drain-Drain biasing and
Supply-Ground biasing does not increase. The reason is that the number of
performance-enhanced cells inserted is independent of the circuit size. In the
unoptimized case, the energy with Supply-Ground biasing is approximately 5 times that with
Drain-Drain biasing for C432 and approximately 10 times for C6288. In the optimized case,
the respective ratios are approximately 7.8 for C432 and 5.5 for C6288.
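The cell ratios, energy savings and energy gaps quoted in the two paragraphs above follow from Tables 5.38, 5.39 and 5.40. The Python sketch below recomputes them for C432 and C6288 with Drain-Drain and Supply-Ground biasing. The helper names `enhanced_ratio`, `energy_saving` and `energy_gap` are hypothetical and are introduced only to organize the arithmetic.

```python
# Energy in pJ from Table 5.38 (un-optimized) and Table 5.39 (optimized).
unoptimized_pj = {
    "c432":  {"drain_drain": 3.352, "supply_ground": 16.78},
    "c6288": {"drain_drain": 17.63, "supply_ground": 170.3},
}
optimized_pj = {
    "c432":  {"drain_drain": 1.673, "supply_ground": 13.07},
    "c6288": {"drain_drain": 3.222, "supply_ground": 17.82},
}
# (number of cells, number of performance-enhanced cells) from Table 5.40.
cells = {"c432": (168, 55), "c6288": (1600, 78)}

def enhanced_ratio(circuit):
    """Fraction of cells that are performance-enhanced after optimization."""
    total, enhanced = cells[circuit]
    return enhanced / total

def energy_saving(circuit, method):
    """Fractional energy reduction obtained by optimization."""
    return 1 - optimized_pj[circuit][method] / unoptimized_pj[circuit][method]

def energy_gap(table, circuit):
    """Supply-Ground energy divided by Drain-Drain energy."""
    return table[circuit]["supply_ground"] / table[circuit]["drain_drain"]

for circuit in cells:
    print(f"{circuit}: enhanced ratio = {enhanced_ratio(circuit):.2f}, "
          f"Drain-Drain saving = {energy_saving(circuit, 'drain_drain'):.1%}, "
          f"gap un-optimized = {energy_gap(unoptimized_pj, circuit):.2f}, "
          f"gap optimized = {energy_gap(optimized_pj, circuit):.2f}")
```

The remaining circuits and methods can be added to the dictionaries in the same form if a full comparison is needed.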
Table 5.39: Optimized energy values for benchmark circuits at 0.3 V in IBM 65 nm
technology.
Benchmark Circuit Gate-Gate (pJ) Drain-Drain (pJ) Supply-Ground (pJ) Charge Boosting (pJ)
c432 3.559 1.673 13.07 4.047
c1908 3.545 1.704 16.36 9.706
c3540 5.508 4.507 14.51 26.37
c6288 4.999 3.222 17.82 13.78
c7552 2.908 1.880 7.246 12.30
Table 5.40: Number of performance-enhanced cells inserted in benchmark circuits through
CPM algorithm.
Benchmark Circuit Number of Cells Number of Performance-Enhanced Cells
c432 168 55
c1908 207 62
c3540 744 38
c6288 1600 78
c7552 1123 43
Energy-Delay Product
The energy-delay product is calculated as the product of the delay and the energy. As
shown in Table 5.41, the energy-delay product is least with charge boosting for C432 and
C6288, and with Drain-Drain biasing for C1908, C3540 and C7552.
This difference arises from the energy-delay product of the individual cells present in
the respective circuits, shown in Table 5.32. The energy-delay product with Supply-Ground
biasing is the highest because of the large energy consumption due to high Ion.
The optimization algorithm implemented on the benchmark circuits reduces the energy,
leaving the delay unaffected. Because of this reduced energy, the energy-delay product is
also reduced. The energy-delay product values for the optimized benchmark circuits are
shown in Table 5.42. For the optimized benchmark circuits, the energy-delay product is
least with charge boosting for C432 and with Drain-Drain biasing for C1908,
C3540, C6288 and C7552. The energy-delay product is reduced by approximately 50 %
with optimization. This is because fewer performance-enhanced cells are placed in
the circuit, leading to lower energy consumption. As the size of the circuit increases, the
saving in the energy-delay product also increases. This is due to the saving in energy
discussed earlier.
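The definition of the energy-delay product can be illustrated with the C432 values. The Python sketch below multiplies the delays from Table 5.37 by the un-optimized energies from Table 5.38 and compares the result with the C432 row of Table 5.41. The function name `energy_delay_product` and the dictionary keys are assumptions made for this example, and the 1 % tolerance only accounts for the rounding of the tabulated values.

```python
import math

# C432 delay in ns (Table 5.37) and un-optimized energy in pJ (Table 5.38).
delay_ns = {"gate_gate": 1450.86, "drain_drain": 1905.05,
            "supply_ground": 933.11, "charge_boosting": 348.48}
energy_pj = {"gate_gate": 5.795, "drain_drain": 3.352,
             "supply_ground": 16.78, "charge_boosting": 6.350}
# C432 energy-delay product in J-s as listed in Table 5.41.
edp_table = {"gate_gate": 8.4e-18, "drain_drain": 6.39e-18,
             "supply_ground": 15.7e-18, "charge_boosting": 2.2e-18}

def energy_delay_product(delay_in_ns, energy_in_pj):
    """Energy-delay product in J-s from a delay in ns and an energy in pJ."""
    return (delay_in_ns * 1e-9) * (energy_in_pj * 1e-12)

edp = {m: energy_delay_product(delay_ns[m], energy_pj[m]) for m in delay_ns}
best_method = min(edp, key=edp.get)

for m in edp:
    matches = math.isclose(edp[m], edp_table[m], rel_tol=0.01)
    print(f"{m}: computed {edp[m]:.2e} J-s, tabulated {edp_table[m]:.2e} J-s, "
          f"match={matches}")
print("lowest energy-delay product:", best_method)
```

For C432 the lowest product is obtained with charge boosting, in agreement with the text above.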
Table 5.41: Un-optimized energy-delay product for benchmark circuits at 0.3 V.
Benchmark Circuit Gate-Gate (J·s) Drain-Drain (J·s) Supply-Ground (J·s) Charge Boosting (J·s)
c432 8.4e-18 6.39e-18 15.7e-18 2.2e-18
c1908 13.87e-18 7.4e-18 32.8e-18 11.5e-18
c3540 20.93e-18 11.7e-18 109.6e-18 32.5e-18
c6288 148.9e-18 117.8e-18 475.3e-18 100.2e-18
c7552 97.18e-18 21.75e-18 142.66e-18 53.61e-18
Table 5.42: Optimized energy-delay product for benchmark circuits at 0.3 V.
Benchmark Circuit Gate-Gate (J·s) Drain-Drain (J·s) Supply-Ground (J·s) Charge Boosting (J·s)
c432 5.16e-18 3.2e-18 12.2e-18 0.93e-18
c1908 5.44e-18 4.05e-18 17.95e-18 6.13e-18
c3540 10.47e-18 8.42e-18 19.66e-18 18.41e-18
c6288 19.47e-18 12.52e-18 49.73e-18 25.6e-18
c7552 5.35e-18 5.06e-18 9.93e-18 11.17e-18
Summary
The regular and performance-enhanced standard cell libraries were implemented on the
benchmark circuits. Significant delay savings are achieved by the performance-enhanced
cell library over the regular cell library. The energy consumption was higher with the
performance-enhanced cell library because of the higher Ion. The optimization algorithm
was implemented on the benchmark circuits, and significant savings in energy consumption
with no effect on the delay were observed. The effectiveness of the optimization algorithm
increases with the circuit size, as the number of performance-enhanced cells inserted
depends on the depth of the critical path and is independent of the size of
the circuit.
6. Conclusions and Future Work
6.1 Conclusions
This research presents two existing biasing methods and proposes a new approach to
substrate biasing which improves subthreshold circuit performance. A new performance
enhancement technique using a charge boosting buffer is also proposed. The performance
improvement is achieved by increasing the Ion of the transistors. To understand the
dependence of Ion on Vgs and Vth, extensive simulation analysis was performed. The results
showed the expected exponential dependence. The substrate biasing methods, namely Gate-Gate
biasing, Drain-Drain biasing and Supply-Ground biasing, reduce the Vth of the transistors,
thereby increasing the Ion. The biasing in case of Supply-Ground and Gate-Gate
is instantaneous in nature, whereas it changes dynamically with time in Drain-Drain
biasing, as the biasing is provided through a connection between the output of the cell and
the body of the transistors. To understand the relationship between Ion and the biasing
method applied, an analytical expression is derived for Drain-Drain biasing and Gate-Gate
biasing. The equation derived indicates that Ion with Gate-Gate biasing is 26 times higher
than with Drain-Drain biasing. The charge boosting method improves the performance by
increasing the Vgs, which results in higher Ion. Charge boosting buffers are used to provide
the higher Vgs required to improve the performance of subthreshold circuits. To minimize the
overhead in energy consumption, an optimization algorithm, namely CPM, is implemented on
the benchmark circuits.
Charge boosting buffers have the least delay, followed by Supply-Ground biasing,
Gate-Gate biasing and Drain-Drain biasing, among the performance-enhancement methods. The
energy consumption is least in case of Drain-Drain biasing, followed by Gate-Gate biasing
and Supply-Ground biasing, among the three substrate biasing methods. With varying Vdd,
the energy varies linearly for charge boosting, whereas it varies exponentially for
substrate biasing. Thus, for lower Vdd values, such as 0.2 V to 0.25 V, the charge boosting
method has higher energy consumption, and for Vdd values greater than 0.34 V it has lower
energy consumption compared to the substrate biasing methods.
The performance-enhanced standard cell library was implemented on the ISCAS’85
benchmark circuits. Charge boosting yielded a 10 times improvement in frequency, with
an approximately 2 times increase in the energy-delay product. The CPM
algorithm was applied to the benchmark circuits to minimize the overhead in energy
consumption without affecting the frequency of operation. The CPM algorithm yielded
an approximately 50 % reduction in the energy-delay product. The effectiveness of the
optimization algorithm increases with circuit size.
6.2 Future Work
As subthreshold circuits suffer from low operating speeds, performance enhancement
techniques for subthreshold circuits hold potential for research. The performance
enhancement techniques usually have the drawback of an overhead in energy consumption.
One solution is to implement low-power techniques which minimize the energy overhead
with no effect on frequency. Techniques such as clustered voltage scaling (CVS) and the use
of high-Vth transistors along the non-critical paths can be used to reduce the energy
consumption with no change in the frequency.
The substrate biasing technique presented in this thesis enhances the performance and
also increases the robustness to process variations. However, a limitation of substrate
biasing is the overhead in energy consumption. Techniques to counter the process variations
with minimum overhead in energy need to be researched. Further, the higher sensitivity
of subthreshold circuits compared to superthreshold circuits could result in soft errors. To
avoid soft errors, fault-tolerant architectures need to be implemented.
Equations derived for the average ON current in case of Drain-Drain biasing assume a
linear variation of Vsb with time. An empirical relation of Vsb with time can be derived by
statistical analysis of the variation in output voltage of an inverter. By using the empirical
model of Vsb a more accurate equation for ON current in case of Drain-Drain biasing can
be derived.
The optimization algorithm presented in this thesis is only applicable to directed acyclic
graphs. More complex algorithms suitable for cyclic graphs, which would serve circuits with
feedback loops, are a potential research area. Statistical analysis of the delay and energy
consumption of the standard cells in a circuit is necessary. Optimization algorithms
can be designed by incorporating the statistical data to achieve better savings in delay and
energy. A challenge in integrating subthreshold and superthreshold circuits on a single chip
is that they need separate placement and routing mechanisms.
Bibliography
[1] B. H. Calhoun and A. Chandrakasan. Characterizing and modeling minimum energy
operation for subthreshold circuits. In Proceedings of the International Symposium
on Low Power Electronics and Design, ISLPED ’04, pages 90–95, 2004.
[2] B. H. Calhoun, A. Chandrakasan, and A. Wang. Sub-threshold Design for Ultra Low-
Power Systems. Springer, 2006.
[3] B. H. Calhoun, A. Wang, and A. Chandrakasan. Modeling and sizing for mini-
mum energy operation in subthreshold circuits. IEEE Journal of Solid-State Circuits,
40(9):1778–1786, 2005.
[4] B. H. Calhoun, A. Wang, N. Verma, and A. Chandrakasan. Sub-threshold design: The
challenges of minimizing circuit energy. In Proceedings of the International Sympo-
sium on Low Power Electronics and Design, ISLPED’06, pages 366–368, 2006.
[5] B. S. Carlson and Suh-Juch Lee. Delay optimization of digital CMOS VLSI circuits by
transistor reordering. IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, 14(10):1183–1192, Oct 1995.
[6] M. C. Hansen, H. Yalcin, and J. P. Hayes. Unveiling the ISCAS-85 benchmarks: A
case study in reverse engineering. IEEE Design and Test of Computers, 16(3):72–80,
1999.
[7] S. Hanson, B. Zhai, K. Bernstein, D. Blaauw, A. Bryant, L. Chang, W. Das, W. Haensch,
E. Novak, and D. Sylvester. Ultralow-voltage, minimum-energy CMOS. IBM
Journal of Research and Development, 50(4/5):469–490, July/September 2006.
[8] Yoo Hoi-Jun. Dual-vT self-timed CMOS logic for low subthreshold current multigigabit
synchronous DRAM. IEEE Transactions on Circuits and Systems II: Analog and Digital
Signal Processing, 45(9):1263–1271, 1998.
[9] N. Jayakumar, R. Garg, B. Gamache, and S. P. Khatri. A PLA based asynchronous
micropipelining approach for subthreshold circuit design. In 43rd ACM/IEEE Design
Automation Conference, 2006, pages 419–424, 2006.
[10] Kil Jonggab, Gu Jie, and C. H. Kim. A high-speed variation-tolerant interconnect
technique for sub-threshold circuits using capacitive boosting. IEEE Transactions on
Very Large Scale Integration (VLSI) Systems, 16(4):456–465, 2008.
[11] L. A. P. Melek, M. C. Schneider, and C. Galup-Montoro. Body-bias compensation
technique for subthreshold CMOS static logic gates. In 17th Symposium on Integrated
Circuits and Systems Design. SBCCI 2004, pages 267–272, 2004.
[12] K. Prasad P. Elakkumanan, K. Thyagarajan and R. Sridhar. Optimal Vth assignment
and buffer insertion for simultaneous leakage and glitch minimization through integer
linear programming (ILP). In Proceedings of IEEE International Midwest Symposium
on Circuits and Systems, pages 1880–1883, 2005.
[13] J. M. Rabaey, A. Chandrakasan, and B. Nikolic. Digital Integrated Circuits: A Design
Perspective. Pearson Education, 2003.
[14] Lin Saihua, Wang Yu, Luo Rang, and Yang Huazhong. A capacitive boosted buffer
technique for high-speed process-variation-tolerant interconnect in UDVS application.
In Asia and South Pacific Design Automation Conference, ASPDAC ’08, pages 304–309,
2008.
[15] H. Soeleman and K. Roy. Ultra-low power digital subthreshold logic circuits. In
Proceedings of the International Symposium on Low Power Electronics and Design,
ISLPED ’99, pages 94–96, 1999.
[16] H. Soeleman, K. Roy, and B. Paul. Robust ultra-low power sub-threshold DTMOS
logic. In Proceedings of the International Symposium on Low Power Electronics and
Design, ISLPED ’00, pages 25–30, 2000.
[17] H. Soeleman, K. Roy, and B. Paul. Sub-domino logic: ultra-low power dynamic
sub-threshold digital logic. In Fourteenth International Conference on VLSI Design,
2001, pages 211–214, 2001.
[18] H. Soeleman, K. Roy, and B. C. Paul. Robust subthreshold logic for ultra-low
power operation. IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
9(1):90–99, 2001.
[19] R. M. Swanson and J. D. Meindl. Ion-implanted complementary MOS transistors in
low-voltage circuits. IEEE Journal of Solid-State Circuits, 7(2):146–153, 1972.
[20] Kim Tae-Hyoung, Eom Hanyong, J. Keane, and C. Kim. Utilizing reverse short chan-
nel effect for optimal subthreshold circuit design. In Proceedings of the Interna-
tional Symposium on Low Power Electronics and Design. ISLPED’06, pages 127–
130, 2006.
[21] Kim Tae-Hyoung, J. Liu, and C. H. Kim. An 8T subthreshold SRAM cell utilizing
reverse short channel effect for write margin and read performance improvement. In
Custom Integrated Circuits Conference. CICC ’07. IEEE, pages 241–244, 2007.
[22] Y. P. Tsividis. Operation and Modeling of the MOS Transistor. New York: McGraw-
Hill, 1987.
[23] N. H. E. Weste and D. Harris. CMOS VLSI Design: A Circuit and Systems Perspective.
Pearson Education, 2004.
[24] W. L. Winston. Operation Research: Applications and Algorithms. PWS publishers,
1987.
Appendix A
Table 1: Delay, power and energy values for Gate-Gate standard cell library at 0.3 V and
25 °C.
Standard Cell TPLH (s) TPHL (s) Delay (s) Power (W) Energy (J) Contamination Delay (s)
AND02 3.913e-08 3.615e-08 7.528e-08 1.467e-09 6.254e-14 0.96e-09
AND03 8.713e-08 3.837e-08 1.255e-07 1.808e-09 1.636e-13 1.16e-09
AND04 1.155e-07 4.544e-08 1.609e-07 2.195e-09 4.104e-13 1.36e-09
NAND02 3.334e-08 5.675e-09 3.902e-08 1.068e-09 3.971e-14 0.51e-09
NAND03 6.052e-08 8.300e-09 6.882e-08 1.449e-09 1.189e-13 0.92e-09
NAND04 9.370e-08 1.078e-08 1.045e-07 1.811e-09 3.167e-13 1.2e-09
OR02 4.112e-08 4.957e-08 9.070e-08 1.345e-09 5.265e-14 1.24e-09
OR03 6.891e-08 7.917e-08 1.481e-07 1.665e-09 1.098e-13 1.65e-09
OR04 9.849e-08 1.097e-07 2.082e-07 2.026e-09 2.279e-13 1.54e-09
NOR02 1.874e-08 1.648e-08 3.522e-08 1.147e-09 3.695e-14 1.21e-09
NOR03 3.394e-08 3.634e-08 7.028e-08 1.468e-09 7.636e-14 1.41e-09
NOR04 5.528e-08 6.109e-08 1.164e-07 1.805e-09 1.523e-13 1.6e-09
INVERTER 9.875e-09 4.061e-09 1.394e-08 6.340e-10 9.966e-15 0.24e-09
XNOR 7.017e-08 4.375e-08 1.139e-07 1.673e-09 5.088e-14 4.3e-09
XOR 1.042e-08 2.197e-08 3.239e-08 1.523e-09 4.387e-14 2.4e-09
AO21 4.732e-08 8.954e-08 1.369e-07 1.845e-09 7.015e-14 1.77e-09
AO22 5.253e-08 1.145e-07 1.671e-07 2.040e-09 1.532e-13 4.67e-09
AO32 5.633e-08 1.375e-07 1.938e-07 2.465e-09 1.957e-13 2.32e-09
AO221 7.045e-08 1.347e-07 2.052e-07 2.519e-09 1.709e-13 2.78e-09
AO321 7.440e-08 1.610e-07 2.354e-07 2.945e-09 2.126e-13 3.01e-09
AOI21 3.959e-08 1.859e-08 5.818e-08 1.618e-09 5.278e-14 6.36e-09
AOI22 5.677e-08 1.596e-08 7.273e-08 1.828e-09 1.216e-13 2.32e-09
AOI32 7.385e-08 1.987e-08 9.372e-08 2.253e-09 1.644e-13 3.76e-09
AOI221 7.160e-08 3.287e-08 1.045e-07 2.344e-09 1.403e-13 3.92e-09
AOI321 9.063e-08 3.601e-08 1.266e-07 2.783e-09 1.836e-13 4.32e-09
OA21 5.126e-08 7.481e-08 1.261e-07 1.586e-09 6.893e-14 2.68e-09
OA32 6.580e-08 1.203e-07 1.861e-07 2.132e-09 1.546e-13 3.21e-09
OAI21 4.376e-08 2.601e-08 6.977e-08 1.333e-09 5.122e-14 1.36e-09
OAI32 6.393e-08 3.537e-08 9.930e-08 1.948e-09 1.251e-13 1.82e-09
NOR0211 4.705e-08 6.444e-08 1.115e-07 1.512e-09 5.568e-14 1.98e-09
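In the rows of Table 1, the Delay column appears to equal TPLH + TPHL up to rounding in the last printed digit. The following minimal sketch checks that relation on four rows copied from the table. The function names and the tolerance are illustrative assumptions, not part of the original work.

```python
# Four rows copied from Table 1 (Gate-Gate library):
# cell name, TPLH (s), TPHL (s), Delay (s)
ROWS = """
AND02 3.913e-08 3.615e-08 7.528e-08
NAND02 3.334e-08 5.675e-09 3.902e-08
NOR02 1.874e-08 1.648e-08 3.522e-08
INVERTER 9.875e-09 4.061e-09 1.394e-08
"""


def parse_rows(text):
    """Split whitespace-separated rows into (cell, tplh, tphl, delay) tuples."""
    rows = []
    for line in text.strip().splitlines():
        cell, tplh, tphl, delay = line.split()
        rows.append((cell, float(tplh), float(tphl), float(delay)))
    return rows


def delay_matches_sum(tplh, tphl, delay, rel_tol=1e-3):
    """True if delay equals tplh + tphl within a relative tolerance.

    rel_tol=1e-3 is an assumed tolerance that absorbs the rounding of
    the four-significant-digit values printed in the table.
    """
    return abs((tplh + tphl) - delay) <= rel_tol * delay


def check(text):
    """Map each cell name to whether its Delay matches TPLH + TPHL."""
    return {cell: delay_matches_sum(a, b, d) for cell, a, b, d in parse_rows(text)}
```

Calling `check(ROWS)` returns one boolean per cell. The same check can be run on rows from the other tables by pasting their first four columns into a string of the same shape.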
Table 2: Delay, power and energy values for Drain-Drain standard cell library at 0.3 V and 25 °C.
Standard Cell  TPLH (s)  TPHL (s)  Delay (s)  Power (W)  Energy (J)  Contamination Delay (s)
AND02 7.516e-08 5.071e-08 1.259e-07 3.370e-10 2.017e-14 0.76e-09
AND03 1.894e-07 5.614e-08 2.455e-07 2.435e-10 2.914e-14 0.3e-09
AND04 2.625e-07 6.636e-08 3.288e-07 2.155e-10 5.156e-14 0.4e-09
NAND02 5.430e-08 8.477e-09 6.278e-08 1.154e-10 6.884e-15 0.6e-09
NAND03 1.116e-07 1.194e-08 1.235e-07 6.290e-11 7.463e-15 0.2e-09
NAND04 1.726e-07 1.513e-08 1.877e-07 3.434e-11 8.072e-15 1e-09
OR02 7.224e-08 6.288e-08 1.351e-07 3.785e-10 2.249e-14 1.19e-09
OR03 1.098e-07 9.053e-08 2.004e-07 3.210e-10 3.810e-14 1.7e-09
OR04 1.554e-07 1.217e-07 2.771e-07 2.926e-10 6.931e-14 2.1e-09
NOR02 1.537e-08 2.508e-08 4.045e-08 1.895e-10 1.111e-14 0.18e-09
NOR03 3.498e-08 5.501e-08 8.999e-08 1.170e-10 1.351e-14 0.9e-09
NOR04 5.922e-08 9.113e-08 1.503e-07 7.051e-11 1.583e-14 1.2e-09
INVERTER 8.916e-09 4.227e-09 1.314e-08 2.166e-10 6.475e-15 1e-09
XNOR 1.178e-07 8.807e-08 2.059e-07 5.637e-10 3.104e-14 4e-09
XOR 1.870e-08 3.639e-08 5.508e-08 4.265e-10 2.286e-14 5.1e-09
AO21 7.871e-08 1.367e-07 2.154e-07 3.842e-10 2.286e-14 7.3e-09
AO22 8.497e-08 1.867e-07 2.717e-07 3.193e-10 3.815e-14 2e-09
AO32 9.561e-08 2.389e-07 3.346e-07 3.045e-10 3.645e-14 5.2e-09
AO221 1.128e-07 2.133e-07 3.261e-07 3.571e-10 4.251e-14 1e-09
AO321 1.233e-07 2.608e-07 3.842e-07 3.602e-10 4.293e-14 4e-09
AOI21 6.606e-08 3.047e-08 9.653e-08 1.826e-10 1.077e-14 1e-09
AOI22 1.135e-07 2.817e-08 1.417e-07 1.233e-10 1.464e-14 6.5e-09
AOI32 1.579e-07 3.695e-08 1.949e-07 1.105e-10 1.317e-14 3e-09
AOI221 1.319e-07 5.317e-08 1.851e-07 1.458e-10 1.715e-14 1.4e-09
AOI321 1.753e-07 6.184e-08 2.371e-07 1.501e-10 1.772e-14 3e-09
OA21 8.288e-08 1.338e-07 2.167e-07 3.121e-10 1.851e-14 8e-09
OA32 1.304e-07 1.954e-07 3.258e-07 3.221e-10 3.835e-14 4.7e-09
OAI21 6.703e-08 2.953e-08 9.656e-08 1.238e-10 7.182e-15 2e-09
OAI32 1.217e-07 6.283e-08 1.845e-07 1.188e-10 1.393e-14 4.1e-09
NOR0211 5.730e-08 8.133e-08 1.386e-07 3.451e-10 2.058e-14 5.2e-09
Table 3: Delay, power and energy values for Supply-Ground standard cell library at 0.3 V and 25 °C.
Standard Cell  TPLH (s)  TPHL (s)  Delay (s)  Power (W)  Energy (J)  Contamination Delay (s)
AND02 3.063e-08 2.761e-08 5.824e-08 2.791e-09 1.674e-13 0.46e-09
AND03 6.742e-08 2.996e-08 9.737e-08 3.435e-09 4.121e-13 0.5e-09
AND04 9.299e-08 3.452e-08 1.275e-07 4.115e-09 9.875e-13 1.1e-09
NAND02 1.655e-08 5.149e-09 2.170e-08 1.767e-09 1.060e-13 0.3e-09
NAND03 3.219e-08 7.273e-09 3.946e-08 2.483e-09 2.980e-13 0.4e-09
NAND04 5.464e-08 9.304e-09 6.395e-08 3.167e-09 7.601e-13 1.1e-09
OR02 3.066e-08 3.827e-08 6.892e-08 2.789e-09 1.672e-13 1.15e-09
OR03 5.164e-08 6.179e-08 1.134e-07 3.412e-09 4.091e-13 1.3e-09
OR04 7.612e-08 8.520e-08 1.613e-07 4.009e-09 9.614e-13 2.8e-09
NOR02 1.173e-08 8.063e-09 1.979e-08 1.800e-09 1.078e-13 0.5e-09
NOR03 3.033e-08 2.598e-08 5.630e-08 2.437e-09 2.920e-13 1.2e-09
NOR04 4.998e-08 4.558e-08 9.556e-08 3.034e-09 7.272e-13 2.7e-09
INVERTER 6.056e-09 2.684e-09 8.740e-09 9.872e-10 2.960e-14 0.12e-09
XNOR 5.043e-08 3.514e-08 8.558e-08 3.612e-09 2.003e-13 10.5e-09
XOR 6.124e-09 1.870e-08 2.483e-08 2.689e-09 1.449e-13 3.3e-09
AO21 3.749e-08 6.953e-08 1.070e-07 3.624e-09 2.173e-13 8.4e-09
AO22 4.582e-08 8.877e-08 1.346e-07 3.997e-09 4.796e-13 2.7e-09
AO32 4.984e-08 1.036e-07 1.534e-07 4.717e-09 5.660e-13 4.3e-09
AO221 6.012e-08 1.039e-07 1.640e-07 4.820e-09 5.781e-13 2.5e-09
AO321 6.332e-08 1.205e-07 1.838e-07 5.568e-09 6.679e-13 8.9e-09
AOI21 3.347e-08 1.504e-08 4.850e-08 2.649e-09 1.588e-13 5.5e-09
AOI22 5.029e-08 1.808e-08 6.837e-08 3.036e-09 3.643e-13 15.1e-09
AOI32 6.278e-08 2.101e-08 8.379e-08 3.760e-09 4.511e-13 8.3e-09
AOI221 6.292e-08 3.094e-08 9.386e-08 3.848e-09 4.615e-13 2.6e-09
AOI321 7.652e-08 3.311e-08 1.096e-07 4.599e-09 5.517e-13 1.5e-09
OA21 4.090e-08 5.895e-08 9.985e-08 3.083e-09 1.848e-13 2.3e-09
OA32 4.990e-08 1.029e-07 1.528e-07 4.128e-09 4.951e-13 4.7e-09
OAI21 2.421e-08 1.567e-08 3.988e-08 2.124e-09 1.273e-13 8.3e-09
OAI32 6.162e-08 1.864e-08 8.025e-08 3.169e-09 3.800e-13 18.2e-09
NOR0211 3.180e-08 4.554e-08 7.734e-08 2.725e-09 1.634e-13 5.5e-09
Table 4: Delay, power and energy values for charge-boosting standard cell library at 0.3 V and 25 °C.
Standard Cell  TPLH (s)  TPHL (s)  Delay (s)  Power (W)  Energy (J)  Contamination Delay (s)
AND02 1.755e-08 2.005e-08 3.761e-08 7.596e-10 4.647e-14 0.31e-09
AND03 1.744e-08 2.746e-08 4.490e-08 1.112e-09 1.363e-13 0.71e-09
AND04 1.838e-08 2.861e-08 4.699e-08 1.465e-09 3.448e-13 0.6e-09
NAND02 5.651e-09 3.156e-09 8.807e-09 7.493e-10 4.398e-14 0.9e-09
NAND03 6.352e-09 4.001e-09 1.035e-08 1.111e-09 1.300e-13 0.6e-09
NAND04 7.178e-09 4.824e-09 1.200e-08 1.463e-09 3.469e-13 0.9e-09
OR02 2.924e-08 1.626e-08 4.550e-08 6.795e-10 3.841e-14 1.06e-09
OR03 3.570e-08 1.749e-08 5.318e-08 1.046e-09 1.265e-13 1.02e-09
OR04 4.476e-08 1.896e-08 6.373e-08 1.449e-09 3.484e-13 1.82e-09
NOR02 4.538e-09 1.500e-05 1.501e-05 2.935e-10 1.753e-14 0.26e-09
NOR03 6.282e-09 1.073e-08 1.701e-08 1.042e-09 1.248e-13 0.85e-09
NOR04 6.825e-09 1.785e-08 2.467e-08 1.445e-09 3.432e-13 2.31e-09
INVERTER 5.024e-09 2.038e-09 7.062e-09 3.833e-10 1.320e-14 0.13e-09
XNOR 4.526e-08 1.628e-08 6.154e-08 8.502e-10 5.196e-14 8.22e-09
XOR 7.685e-10 7.276e-09 8.044e-09 8.427e-10 4.745e-14 1.5e-09
AO21 3.018e-08 1.785e-08 4.804e-08 1.189e-09 7.021e-14 6.2e-09
AO22 3.115e-08 1.751e-08 4.866e-08 1.448e-09 1.732e-13 1.9e-09
AO32 3.302e-08 1.790e-08 5.092e-08 1.805e-09 2.197e-13 3.7e-09
AO221 3.772e-08 1.861e-08 5.633e-08 1.812e-09 2.171e-13 3.1e-09
AO321 3.982e-08 1.890e-08 5.872e-08 2.174e-09 2.605e-13 1.9e-09
AOI21 6.578e-09 6.353e-09 1.293e-08 1.188e-09 6.878e-14 3.1e-09
AOI22 6.165e-09 5.565e-09 1.173e-08 1.447e-09 2.878e-13 2.7e-09
AOI32 6.300e-09 7.150e-09 1.345e-08 1.804e-09 3.688e-13 12.1e-09
AOI221 6.341e-09 1.151e-08 1.785e-08 1.811e-09 2.156e-13 4.4e-09
AOI321 6.419e-09 1.317e-08 1.959e-08 2.173e-09 2.609e-13 1.1e-09
OA21 3.302e-08 1.672e-08 4.975e-08 1.184e-09 6.441e-14 1.2e-09
OA32 3.641e-08 1.827e-08 5.467e-08 1.808e-09 2.165e-13 1.7e-09
OAI21 5.804e-09 8.325e-09 1.413e-08 1.181e-09 6.975e-14 3.1e-09
OAI32 6.755e-09 9.982e-09 1.674e-08 1.806e-09 2.156e-13 6.2e-09
NOR0211 4.497e-08 1.875e-08 6.372e-08 7.584e-10 4.642e-14 2.7e-09
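The four tables can be compared cell by cell. As a rough sketch, the snippet below takes the Delay (s) values of three cells from Tables 1 to 4, finds the library with the lowest delay for each cell, and computes a delay ratio between two libraries. The dictionary layout and the function names are hypothetical and chosen only for illustration.

```python
# Delay (s) at 0.3 V and 25 °C, copied from Tables 1-4 for three cells.
DELAY_S = {
    "INVERTER": {
        "Gate-Gate": 1.394e-08,
        "Drain-Drain": 1.314e-08,
        "Supply-Ground": 8.740e-09,
        "charge-boosting": 7.062e-09,
    },
    "NAND02": {
        "Gate-Gate": 3.902e-08,
        "Drain-Drain": 6.278e-08,
        "Supply-Ground": 2.170e-08,
        "charge-boosting": 8.807e-09,
    },
    "AND02": {
        "Gate-Gate": 7.528e-08,
        "Drain-Drain": 1.259e-07,
        "Supply-Ground": 5.824e-08,
        "charge-boosting": 3.761e-08,
    },
}


def fastest_library(cell):
    """Return the name of the library with the smallest delay for a cell."""
    per_library = DELAY_S[cell]
    return min(per_library, key=per_library.get)


def delay_ratio(cell, baseline, candidate):
    """Return baseline delay divided by candidate delay for a cell.

    A value above 1 means the candidate library is faster than the baseline.
    """
    return DELAY_S[cell][baseline] / DELAY_S[cell][candidate]
```

For example, `delay_ratio("INVERTER", "Gate-Gate", "charge-boosting")` gives the factor by which the inverter delay changes between those two libraries. Power and energy can be compared the same way by adding the corresponding columns to the dictionary.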
