Integrated DC-DC Converters for Adaptive Ultra-Low-Energy Processors by Turnquist, Matthew
There is an emerging class of energy-
constrained adaptive processors that 
operate primarily at near-threshold 
voltages. Supplying the near-threshold 
voltage with high efﬁciency DC-DC 
converters is essential in realizing ultra-low 
energy consumption. The DC-DC converter 
should also be fully integrated to meet the 
increasingly small form factor requirements 
of modern ultra-portable electronics. 
  
This work presents the implementation of 
three fully integrated switched-capacitor 
DC-DC converters and two fully integrated 
adaptive NT processors. By designing the 
DC-DC converter, elements of the adaptive 
processor load, and considering practical 
(battery) input voltages, the author is able to 
identify new system design methodologies 
and approaches that reduce energy 
consumption. 
ISBN 978-952-60-6587-8 (printed) 
ISBN 978-952-60-6588-5 (pdf) 
ISSN-L 1799-4934 
ISSN 1799-4934 (printed) 
ISSN 1799-4942 (pdf) 
 
Aalto University 
School of Electrical Engineering 
Department of Micro- and Nanosciences 
www.aalto.fi 
Aalto University publication series 
DOCTORAL DISSERTATIONS 216/2016 
Integrated DC-DC Converters for 
Adaptive Ultra-Low-Energy 
Processors 
Matthew Turnquist 
A doctoral dissertation completed for the degree of Doctor of 
Science (Technology) to be defended, with the permission of the 
Aalto University School of Electrical Engineering, at a public 
examination held at the lecture hall S1 of the school on 21 January 
2016 at 12:00. 
Aalto University 
School of Electrical Engineering 
Department of Micro- and Nanosciences 
Supervising professor 
Prof. Kari Halonen 
 
Thesis advisor 
Adjunct Prof. Lauri Koskinen, University of Turku, Finland 
 
Preliminary examiners 
Prof. Alex Fish, Bar-Ilan University, Israel 
Prof. Eby Friedman, University of Rochester, USA 
 
Opponent 
Prof. Joseph Shor, Bar-Ilan University, Israel 
© Matthew Turnquist 
 
http://urn.fi/URN:ISBN:978-952-60-6588-5 
 
Unigrafia Oy 
Helsinki 2016 
 
Finland 
 
Abstract 
Aalto University, P.O. Box 11000, FI-00076 Aalto  www.aalto.fi 
Author 
Matthew Turnquist 
Name of the doctoral dissertation 
Integrated DC-DC Converters for Adaptive Ultra-Low-Energy Processors 
Publisher School of Electrical Engineering 
Unit Department of Micro- and Nanosciences 
Series Aalto University publication series DOCTORAL DISSERTATIONS 216/2016 
Field of research Electronic circuit design 
Manuscript submitted 19 August 2015 Date of the defence 21 January 2016 
Permission to publish granted (date) 19 November 2015 Language English 
Monograph Article dissertation (summary + original articles) 
Abstract 
There is an emerging class of energy-constrained adaptive processors that operate primarily
at near-threshold (NT) voltages. Operating at NT significantly reduces energy consumption but
avoids the large variance and performance penalties of sub-threshold. NT processors are useful
for sensor-based platforms within ultra-portable electronics that require minimum energy
consumption. Typically, the NT voltages are supplied by a DC-DC converter. Energy
consumption of the DC-DC converter/processor system is inversely proportional to the DC-DC
converter's conversion efficiency. Thus, supplying the NT voltage with high efficiency is
essential in realizing ultra-low-energy consumption. High efficiency over an increasingly large
power range (due to large differences in sleep and active power levels) also requires careful
consideration. Besides the challenges in efficiency, the DC-DC converter should be fully
integrated to meet the increasingly small form factor requirements of modern ultra-portable
electronics.

Research work presented in this thesis addresses these challenges by introducing new
NT DC-DC converters, new techniques for reducing the increasingly dominant DC-DC control
circuitry power losses, and new DC-DC/processor co-design methodologies. There are three
main focus areas within the thesis: SC DC-DC converters, adaptive processors, and DC-DC
converter/adaptive processor systems. The thesis gives the necessary background of DC-DC
converters and adaptive processors. The research work is demonstrated with five integrated
circuit (IC) implementations and ten publications.

Three of the IC implementations are fully integrated SC DC-DC converters. A step-down
Dickson topology in 28 nm CMOS is first presented. An improved Dickson converter and a
self-oscillating converter are then presented. Both of these converters are built in 28 nm UTBB
FD-SOI, and both take advantage of the strong impact of back-gate biasing in order to improve
efficiency. Practical design considerations of the converter's input voltage are shown with a
prototype Li-ion battery. All three of the implemented DC-DC converters are measured with
adaptive (processor) loads.

The final two IC implementations are adaptive processors. Both of the processors are built in
65 nm CMOS. The processors use an adaptive scheme called timing-error detection (TED) to
ensure reliability at NT voltages. One of the processors is the first-known TED processor
Keywords Switched-capacitor DC-DC converter, power management, minimum energy point, 
sub-threshold, weak inversion, MEP, UTBB FD-SOI, ultra-low-voltage, timing-
error detection, near-threshold, ultra-low-power 
ISBN (printed) 978-952-60-6587-8 ISBN (pdf) 978-952-60-6588-5 
ISSN-L 1799-4934 ISSN (printed) 1799-4934 ISSN (pdf) 1799-4942 
Location of publisher Helsinki Location of printing Helsinki Year 2016 
Pages 186 urn http://urn.fi/URN:ISBN:978-952-60-6588-5 

Preface
The work in this dissertation was conducted at the Department of Micro-
and Nanosciences (Electronic Circuit Design Group) at Aalto University.
The work was supported through a number of projects. Thanks to the
Academy of Finland (projects: 140340, 13139458, 270585, 124029), the
Finnish Graduate School of Electronics Telecommunications and Automa-
tion (GETA), the Technology Industries of Finland Centennial Foundation
(project MepMic), the EU grant 621439 (Almarvi), and Tekes 2948/31/2011.
Thanks also to the Ahlström Säätiö for their ﬁnancial support.
A warm thanks to Professor Alex Fish and Professor Eby G. Friedman
for reviewing this thesis. Thanks to Professor Joseph Shor for agreeing to
be the opponent.
Thanks to the Department of Micro- and Nanosciences for giving me the
opportunity to do research in the microelectronics ﬁeld. Thanks to Profes-
sor Kari Halonen for welcoming me into the ECDL. Thanks especially to
my supervisor Adjunct Professor Lauri Koskinen. You gave me a chance
on your team. You kept me on target and helped me come back to reality
many times. Thanks for letting me move into a subject that interested me
more. Most importantly of all, you introduced me to Eläkeläiset during
our tapeouts. But really, my work would not have been possible without
you. Professor Jussi Ryynänen – thanks for the feedback. And yes Jussi, I
think very positive now. Thanks to Jani Mäkipää and Arto Rantala (VTT)
for the help and great discussions. Thanks to TRC for welcoming me onto
your team.
To all the teams I have been part of – thanks. Erkka Laulainen, Markus
Hiienkari, and Jukka Teittinen – you were a pleasure to work with. We
accomplished a lot together. Thanks to Professor Nikolic at the Univer-
sity of California Berkeley (BWRC) for letting me visit. Thanks to Dr.
Hanh-Phuc Le and Dr. Ruzica Jevtic for the DC-DC tips. Thanks to Dr.
David Bol and Guerric de Streel. The fruitful discussions with Jarno Salo-
maa, Mika Pulkkinen, Tero Nieminen, Olli Viitala, Tero Tikka, Dr. Mikko
Kaltiokallio, Dr. Kim Östman, and Dr. Matti Paavola helped tremen-
dously. Thanks to William Martin for the great discussions and the thesis
review.
Now on to the folks that kept my day-to-day tasks moving along. Thanks
to Artturi Kaila and Dr. Marko Kosunen for keeping the servers (and me)
happy. Artturi – you brought me back to earth and reminded me that not
everything is related to circuits. I will also miss watching the American
“classics” with you in the break room. Lea Söderman – thanks for helping
me with all the practical tasks and helping me to improve my Finnish.
Thanks to Anita Bisi, Marja Leppäharju, and Arja Hjelt.
And thanks to my ofﬁce mates over the years for keeping me sane.
Jakub (Dr. Jgro) – you were a blast to work next to. Dr. Shailesh Chouhan
– I greatly appreciate your feedback and discussions. Thanks to Dr. Mikko
Kärkkäinen, Ali Vahdati, and Denizhan Karaca for the good discussions.
This work would not have been possible without my wife Tiia. In Chisu’s
words . . . “sä oot ihana” and deserve my utmost gratitude. You have
brought so much happiness to my life! Thanks to Tara for the occasional
entertainment at the lab. And to my girls (Isabel, Amelia, and Elinor) . . .
you mean the world to me! Thanks to my parents, brother, and Grandma
for the love and support in all my endeavors. Thanks to Tiia’s family for
making my time in Finland so special.
Thanks to the guys outside for revving your engines consistently for 5
years. Thanks to Finland for all the opportunities. Thanks to all my
friends who reminded me how much fun life is (especially the Suomi crew,
Ryan, and Larry). I know I forgot to thank at least a few people. But you
know who you are . . . so thanks.
In Espoo, December 21, 2015,
Matthew J. Turnquist
Contents
Preface
Contents
List of Publications
Author’s Contribution
List of Symbols
List of Abbreviations
1. Introduction
    1.1 Objective of the Thesis
    1.2 Organization of the Thesis
    1.3 Main Scientific Contributions
2. Switched-Capacitor DC-DC Converters
    2.1 Principles
    2.2 Operation Characteristics
    2.3 Switched-Capacitor Network
    2.4 Implementation Parameters
3. DC-DC Converter Prototypes
    3.1 3:1 Dickson DC-DC Converter in Bulk 28 nm
    3.2 SC DC-DC Converters in UTBB FD-SOI
        3.2.1 UTBB FD-SOI Background
        3.2.2 3:1 Dickson DC-DC Converter
        3.2.3 2:1 Self-Oscillating DC-DC Converter
    3.3 Implementation Summary
4. Adaptive Ultra-Low-Energy Processors
    4.1 Overview
    4.2 Timing-Error Detection (TED)
    4.3 Implementations
        4.3.1 Adder with TED
        4.3.2 8-bit Processor with TED
        4.3.3 32-bit Processor
    4.4 Summary
5. Co-Design of DC-DC Converters and Adaptive Processors
    5.1 Input Voltage Considerations
    5.2 DC-DC Converter Regulation Techniques
    5.3 System Results
        5.3.1 Ring Oscillator Load
        5.3.2 TEP Load
        5.3.3 32-bit Processor with TEP
    5.4 Summary
6. Conclusions
References
Errata
Publications
List of Publications
This thesis consists of an overview and of the following publications which
are referred to in the text by their Roman numerals.
I J. Mäkipää, M.J. Turnquist, E. Laulainen, L. Koskinen. Timing-error
detection design considerations in subthreshold: an 8-bit microprocessor
in 65 nm CMOS. Journal of Low Power Electronics, 2, pp. 180-196,
May 2012.
II M.J. Turnquist, G. de Streel, D. Bol, M. Hiienkari, L. Koskinen. Effects
of back-gate bias on switched-capacitor DC-DC converters in UTBB FD-
SOI. In IEEE SOI-3D-Subthreshold Microelectronics Technology Uniﬁed
Conference (S3S), San Francisco, California, October 2014.
III M.J. Turnquist, M. Hiienkari, J. Mäkipää, H.-P. Le, L. Koskinen. Re-
thinking DC-DC converter design constraints for adaptable systems that
target the minimum-energy point. In Symposium on Low Power Elec-
tronics and Design (ISLPED), Beijing, China, pp. 383-388, September
2013.
IV M.J. Turnquist, E. Laulainen, J. Mäkipää, M. Pulkkinen, L. Kosk-
inen. Measurement of a timing error detection latch capable of sub-
threshold operation. In NORCHIP Circuit Conference, Trondheim, Nor-
way, November 2009.
V M. Hiienkari, J. Mäkipää, M.J. Turnquist, J. Teittinen, A. Rantala, M.
Sopanen, M. Kaltiokallio, L. Koskinen. A 3.15pJ/cyc 32-bit RISC CPU
with timing-error prevention and adaptive clocking in 28 nm CMOS. In
Custom Integrated Circuits Conference (CICC), San Jose, CA, pp. 1-4,
September 2014.
VI M.J. Turnquist, E. Laulainen, J. Mäkipää, L. Koskinen. Measurement
of a System-Adaptive Error-Detection Sequential Circuit with Subthreshold
SCL. In NORCHIP Circuit Conference, Lund, Sweden, pp. 1-4, November
2011.
VII M.J. Turnquist, M. Hiienkari, J. Mäkipää, L. Koskinen. A Fully Inte-
grated Self-Oscillating Switched-Capacitor DC-DC Converter for Near-
Threshold Loads. In Asian Solid-State Circuits Conference (A-SSCC),
Xiamen, China, pp. TBD, November 2015.
VIII M.J. Turnquist, M. Hiienkari, J. Mäkipää, R. Jevtic, E. Pohjalainen,
T. Kallio, L. Koskinen. Fully Integrated DC-DC Converter and a 0.4V
32-bit CPU with Timing-Error Prevention Supplied from a Prototype
1.55V Li-ion Battery. In Symposium on VLSI Circuits, Kyoto, pp. c320-
c321, June 2015.
IX L. Koskinen, M. Hiienkari, J. Mäkipää, M.J. Turnquist. Implementing
Minimum-Energy-Point Systems with Adaptive Logic. In IEEE Trans-
actions on Very Large Scale Integration (VLSI) Systems, July 2015.
X K. Östman, M.J. Turnquist, K. Stadius, L. Koskinen, J. Ryynänen.
Supply Ripple Analysis for Receiver Front-Ends Powered by Fully In-
tegrated DC-DC Converters. IEEE Transactions on Power Electronics,
(submitted) April 2015.
Author’s Contribution
Publication I: “Timing-error detection design considerations in
subthreshold: an 8-bit microprocessor in 65 nm CMOS”
The author designed the reported EDS circuit and wrote the paper manuscript.
Mr. Mäkipää and Mr. Laulainen designed the microprocessor. The author
assisted Mr. Laulainen in the EDS circuit measurements. Dr. Koskinen
supervised the work.
Publication II: “Effects of back-gate bias on switched-capacitor
DC-DC converters in UTBB FD-SOI”
The author designed the SC DC-DC converter, assisted in the switch and
control circuitry analysis, and wrote the manuscript. Mr. de Streel as-
sisted in the switch and control circuitry analysis. Dr. Bol and Dr. Koski-
nen supervised the work.
Publication III: “Rethinking DC-DC converter design constraints for
adaptable systems that target the minimum-energy point”
The author performed the majority of the circuit analysis, wrote the paper
manuscript, and presented the paper. Dr. Le assisted with the circuit
analysis and revision of the manuscript. Mr. Mäkipää and Mr. Hiienkari
reviewed the manuscript. Dr. Koskinen supervised the work.
Publication IV: “Measurement of a timing error detection latch
capable of sub-threshold operation”
The author designed the reported EDS circuit, wrote the paper manuscript,
and presented the paper. Mr. Mäkipää and Mr. Laulainen designed the
combinational logic. Mr. Laulainen and Mr. Pulkkinen assisted in the
system measurements. Dr. Koskinen supervised the work.
Publication V: “A 3.15pJ/cyc 32-bit RISC CPU with timing-error
prevention and adaptive clocking in 28 nm CMOS”
Markus Hiienkari designed and implemented the majority of the digital
parts of the system, and wrote test software and the manuscript. The
author assisted in the optimization of the TBD cell.
Publication VI: “Measurement of a System-Adaptive Error-Detection
Sequential Circuit with Subthreshold SCL”
The author designed the reported EDS circuit, wrote the paper manuscript,
and presented the paper. Mr. Mäkipää and Mr. Laulainen assisted in the
uncertainty region model. Dr. Yücetaş assisted in the layout. The author
assisted Mr. Laulainen in the EDS circuit measurements. Dr. Koskinen
supervised the work.
Publication VII: “A Fully Integrated Self-Oscillating
Switched-Capacitor DC-DC Converter for Near-Threshold Loads”
The author designed the reported DC-DC converter, performed the circuit
analysis, and wrote the paper manuscript. Mr. Hiienkari assisted in the
measurements. Mr. Mäkipää and Mr. Hiienkari reviewed the manuscript.
Dr. Koskinen supervised the work.
Publication VIII: “Fully Integrated DC-DC Converter and a 0.4V 32-bit
CPU with Timing-Error Prevention Supplied from a Prototype 1.55V
Li-ion Battery”
The author designed the reported DC-DC converter and assisted in sys-
tem measurements with Mr. Hiienkari. The author wrote the paper
manuscript. The author measured the DC-DC and Mr. Hiienkari and Mr.
Mäkipää measured the CPU. The battery was designed and measured by
Mrs. Pohjalainen and Dr. Kallio. Dr. Jevtic and Dr. Koskinen supervised
the work.
Publication IX: “Implementing Minimum-Energy-Point Systems with
Adaptive Logic”
The author designed the reported DC-DC converter and assisted in the
system simulations. The manuscript was written by Dr. Koskinen. The
TEP adaptive load was designed by Mr. Hiienkari. Mr. Mäkipää and Dr.
Koskinen supervised the work.
Publication X: “Supply Ripple Analysis for Receiver Front-Ends
Powered by Fully Integrated DC-DC Converters”
The author designed the reported DC-DC converter. The author per-
formed the ripple analysis and measurement results in Section II. The
author wrote Section II and Dr. Östman wrote the remainder of the paper
manuscript. Dr. Mikko Kaltiokallio assisted in the layout of the DC-DC
converter. Dr. Stadius, Dr. Koskinen, and Prof. Ryynänen supervised the
work.
List of Symbols
A             area
Cfly          flying capacitance
EL            leakage energy
ESW           switching energy
ESYS          system energy per operation
ET            total energy
fSW           DC-DC converter switching frequency
fT            transition frequency at the asymptotic intersection of the SSL and FSL impedances
GON           switch on-conductance
Gtot          summed switch on-conductance
IOUT          output current
KOCR          optimal conversion ratio factor
L             length of a transistor
MEPSYS        system minimum energy per operation point
NMP           number of multiphase interleaved units
N             voltage conversion ratio
PCfly         inherent SC losses
PIN           input power
POUT,max      maximum output power
POUT,min      minimum output power
POUT,ratio    ratio of maximum output power to minimum output power
POUT          output power
Pbott-cap     bottom-plate losses
Pcond         switch conductance losses
Pcont         control circuitry losses
Pgate-cap     gate-plate losses
Ploss         DC-DC converter losses
RFSL          fast switching limit output impedance
RL            load impedance
RSSL          slow switching limit output impedance
Ro            DC-DC converter impedance
VBBN          back-gate voltage applied to NMOS
VBBP          back-gate voltage applied to PMOS
VBAT          battery input voltage
VDD           output voltage with a processor load
VDIFF         output voltage of the SC subtractor
VIN           input voltage
VOUT,OPT      output voltage at the MEP
VOUT          output voltage
Vgs           gate-source voltage of a transistor
Vth           threshold voltage of a transistor
W             width of a transistor
ΔVOUT         peak-to-peak ripple at VOUT
αv            velocity saturation index
α             processor activity factor
ηlin          efficiency of an ideal linear regulator
ηmax          maximum conversion efficiency
ηmin          minimum conversion efficiency over specified load range
η             conversion efficiency
φcharge       charge phase
φdischarge    discharge phase
List of Abbreviations
CMOS          complementary metal-oxide-semiconductor
EEF           efficiency enhancement factor
FBB           forward body bias
FSL           fast switching limit
IC            integrated circuit
ITRS          international technology roadmap for semiconductors
iVCR          ideal voltage conversion ratio
LDO           low-dropout regulator
LVT           low threshold voltage
MEP           minimum energy point
MOM           metal-oxide-metal
NMOS          n-type metal oxide semiconductor
NOC           non-overlapping clock generator
NT            near-threshold
oVCR          optimal voltage conversion ratio
PD            power density
PFM           pulse frequency modulation
PMOS          p-type metal oxide semiconductor
PMU           power management unit
PVTA          process, voltage, temperature, and ageing
RBB           reverse body bias
RVT           regular threshold voltage
SC            switched-capacitor
SCL           source-coupled logic
SCN           switched-capacitor network
SIR           scaled-input regulation
SSL           slow switching limit
STSCL         sub-threshold source-coupled logic
TED           timing-error detection
TEP           timing-error prevention
ULE           ultra-low-energy
ULV           ultra-low-voltage
UTBB FD-SOI   ultra-thin body and buried oxide fully-depleted silicon-on-insulator
VCR           non-ideal voltage conversion ratio
1. Introduction
As the size of electronics continues to decrease and the functionality re-
quirements increase, energy consumption is becoming a critical issue.
This is especially true for ultra-portable electronics such as smart watches,
ﬁtness tracking bracelets, smart rings, contact lenses, and smartdust (Fig.
1.1 (a)). Unfortunately, battery energy density has developed slowly rel-
ative to the energy needs of these devices. To make matters worse, bat-
tery energy density does not scale well with smaller sizes [1]. Conse-
quently, there is a growing energy gap between energy needed and energy
available. Harvesting techniques do help and can even allow for energy-
autonomous operation in some ultra-portable applications [2–5]. How-
ever, circuit-level solutions are still critical in these applications and in
the development of future ultra-portable electronics [6].
As shown in Fig. 1.1 (b), an integrated circuit (IC) for ultra-portable
electronics has four main blocks: (i) sensing, (ii) data processing, (iii) com-
munication, and (iv) a power management unit (PMU). The PMU is the
interface between the energy source and all of the blocks. The tasks of the
PMU include DC-DC conversion, energy source harvesting/monitoring,
and power gating [7]. Of the four blocks, communication (through data
transmission) has the potential to consume a large proportion of the over-
all energy. Unfortunately, within a wireless node, the energy consump-
tion needed to transmit a bit does not scale with Moore’s law as advan-
tageously as with digital processing. Thus, there is strong motivation to
minimize the amount of transmitted data by increasing the amount of
ultra-low-energy (ULE) intra-node processing.
Scaling the supply voltage with complementary metal-oxide-semiconductor
(CMOS) technology has been historically used to decrease energy per
operation of data processing blocks. However, according to the international
technology roadmap for semiconductors (ITRS), the impact of supply voltage
scaling is minimal for sub-90 nm technology [8]. This stagnation in
nominal voltage scaling has motivated a strong interest in operating at
near-threshold (NT) voltages. Operating at NT significantly reduces energy
consumption but avoids the large variance and performance penalties
of sub-threshold [9, 10]. The increasing focus on ULE processing also
increases the importance of the DC-DC converter (i.e. DC-DC1 in
Fig. 1.1 (b)).
Figure 1.1. (a) Relationship between power budget and the need for full integration; (b)
Typical IC architecture for an ultra-portable application.
In a typical ultra-portable device, the processing block’s (NT) supply
voltage is supplied by a DC-DC converter with a system supply or battery
input as shown in Fig. 1.1 (b). There are two main constraints for
the DC-DC converter in this ULE system. First, high efficiency is of
utmost importance. Based on state-of-the-art DC-DC converters [11], an
efficiency over 75 % is considered high efficiency in this thesis. The energy
drawn from the source by the DC-DC converter/processor system is inversely
proportional to the efficiency of the DC-DC converter. In other words,
saving energy in the processing block by scaling to NT voltages is only
worthwhile if the DC-DC converter supplying the NT voltage is efficient.
The second constraint for the DC-DC converter is that it needs to be fully
integrated due to the increasingly small form factors of typical
ultra-portable devices [2, 12].
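The dependence of system energy on converter efficiency can be illustrated with a minimal sketch. It assumes the simple relation E_source = E_load / η; the function name and all numeric values are hypothetical and are not taken from the thesis.

```python
def source_energy_per_op(load_energy_pj: float, efficiency: float) -> float:
    """Energy drawn from the source per operation, assuming E_source = E_load / eta."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return load_energy_pj / efficiency


# Hypothetical numbers chosen only for illustration (not measured results).
nominal = source_energy_per_op(load_energy_pj=10.0, efficiency=0.90)  # nominal-voltage load
nt_good = source_energy_per_op(load_energy_pj=3.0, efficiency=0.75)   # NT load, efficient converter
nt_poor = source_energy_per_op(load_energy_pj=3.0, efficiency=0.25)   # NT load, inefficient converter

print(f"nominal: {nominal:.2f} pJ, NT (eta=0.75): {nt_good:.2f} pJ, NT (eta=0.25): {nt_poor:.2f} pJ")
```

With these assumed values, the NT load saves energy at the source only when the converter is efficient; with the inefficient converter the saving in the load is lost in the conversion.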
In order to meet the efﬁciency and size constraints, the DC-DC converter
needs to be designed by keeping in mind the fact that processors operat-
ing at NT voltages have signiﬁcantly different characteristics than during
traditional super-threshold operation. Processors operating at such ultra-
low-voltage (ULV) typically have adaptivity to combat the high delay sen-
sitivity to variations. The adaptivity can be used to the advantage of the
DC-DC converter. A DC-DC converter supplying a ULV processor must
be capable of maintaining high efficiency over a large load power range
[13] since toggling between active and sleep modes can produce from 200x
[14] to 6000x [15] changes in load power. Sleep mode power levels are
particularly challenging due to control circuitry losses [16].
In general, a DC-DC converter designed for a super-threshold load does
not necessarily function efficiently for a sub- or near-threshold voltage
load. In this work, we explore how to make the DC-DC converter work well
specifically with a ULE load operating at NT voltages. Ensuring that
these two blocks operate efficiently together minimizes energy
consumption and ultimately helps enable ultra-portable electronics.
For the processing block, the DC-DC converter is typically a linear-mode
or switched-mode converter. Hybrid linear/switched-mode converters also
exist [17–19]. Linear-mode converters are typically low-dropout
regulators (LDOs). The LDO is not an ideal choice since its efficiency
is limited to the ratio of VOUT to VIN. Even with a low VIN of 1 V,
down-converting to an NT voltage of 0.3 V with an LDO would give an
efficiency of at most 30 %.
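The LDO limit quoted above follows directly from the ratio of output to input voltage. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the model ignores quiescent current, so it gives an upper bound only.

```python
def ldo_max_efficiency(v_out: float, v_in: float) -> float:
    """Upper bound on linear-regulator efficiency: eta_lin = VOUT / VIN."""
    if v_in <= 0.0 or not 0.0 <= v_out <= v_in:
        raise ValueError("require 0 <= v_out <= v_in and v_in > 0")
    return v_out / v_in


# Values from the text: VIN = 1 V, NT output voltage of 0.3 V.
eta_lin = ldo_max_efficiency(v_out=0.3, v_in=1.0)
print(f"ideal LDO efficiency: {eta_lin:.0%}")
```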
Switched-mode converters are a well-suited choice for meeting the de-
mands of ULE processors [5, 14, 20, 21]. They can provide high efﬁ-
ciency over a large range of output voltages and power levels. Switched-
mode converters transfer energy from the input to the output using pas-
sive components and switches. When the passive component is a ca-
pacitor (inductor), the converter is called a switched-capacitor (switched-
inductor) converter. In terms of power density, efﬁciency, and scaling,
switched-capacitor (SC) DC-DC converters are a more promising choice
than switched-inductor converters for achieving full integration. The main
reasons behind this choice are the fact that capacitors have 10-100 times
more energy per volume than inductors and that SC converters have im-
proved switch utilization [22–25].
In this work, we explore the characteristics of fully integrated SC DC-
DC converters designed for ULE adaptive processors. Overall, decreasing
the system energy consumption is the focus throughout the work. We
present a number of approaches to reduce the system energy consump-
tion: new DC-DC topologies, new regulation techniques, operation with
a prototype Li-ion battery, and utilization of back-gate biasing (within
UTBB FD-SOI).
Figure 1.2. Organization of the thesis. [Block diagram: energy source, DC-DC converter
with control circuitry, and adaptive ultra-low-energy (ULE) processor, annotated with
Chapters 2-3, 4, and 5.]
1.1 Objective of the Thesis
The objective of this thesis is to answer the following questions:
1. What are the DC-DC converter design constraints for ULE processors?
2. What are the co-design considerations of a DC-DC converter/processor
system?
These questions will be answered within the thesis by analysis in three
main areas: (i) SC DC-DC converters, (ii) ULE processors, and (iii) DC-DC
converter/processor systems.
1.2 Organization of the Thesis
This thesis consists of two parts. First, an introduction with six chapters
is given. Second, a compilation of scientific publications ([I]-[X]) by
the author is given. The six chapters within the introduction describe
the necessary SC converter background, implemented designs, and
comparison to state-of-the-art DC-DC converters. More specifically, chapter
2 provides the SC DC-DC converter background relevant to ULE processors.
Chapter 3 describes the implementation of three fully integrated SC
DC-DC converters with ULE loads. Chapter 4 examines the behavior of
ULE processors. The system effects of the DC-DC converter and ULE
processor loads are then given in chapter 5. Finally, the conclusion in chapter
6 summarizes the findings of the thesis and compares the implemented
DC-DC converters to the state-of-the-art.
1.3 Main Scientiﬁc Contributions
The scientiﬁc contributions of this work are presented in detail within
publications [I]-[X]. A summary of the work in these publications is found
in chapters 2-6 of this thesis. The most important scientiﬁc contributions
from the publications can be further summarized as:
1. A design approach to improve DC-DC converter efﬁciency over a large
load range by using a new topology and back-gate biasing. At the time
of publication, this DC-DC converter had the highest power density for
a fully integrated NT converter and is the ﬁrst-known implementation
of the self-oscillating step-down topology [VII].
2. Designed the ﬁrst-known EDS circuit capable of sub-threshold opera-
tion [IV].
3. A fully integrated Dickson DC-DC converter. It is the ﬁrst-known fully
integrated version of this topology [VIII].
4. Implemented an EDS circuit in sub-threshold source-coupled logic [VI].
This EDS circuit is used within the ﬁrst-known timing-error detection
(TED) processor capable of sub-threshold operation [I].
5. A design approach for improving DC-DC converter efﬁciency by using
back-gate biasing in order to reduce control circuitry leakage energy
[VIII].
6. A technique utilizing back-gate biasing in order to reduce switch sizes
and improve DC-DC converter efﬁciency [II].
7. A design approach for reducing energy consumption of a battery/DC-
DC converter/CPU system. The battery is a 1.55 V Li-ion prototype. At
the time of publication, this system had the lowest reported energy per
operation [VIII].
8. Identiﬁed key ripple characteristics of SC converters [X].
9. A DC-DC converter regulation methodology to reduce (battery/DC-DC
converter/CPU) system energy consumption [III, IX].
10. A method to detect the threshold voltage with a DC-DC converter and
a TED processor (chapter 6).
2. Switched-Capacitor DC-DC
Converters
This chapter provides the relevant background information on SC DC-DC
converters. The focus is on SC DC-DC converters that supply energy-
constrained NT processor loads. The main principles, operation charac-
teristics, and a description of the switched-capacitor network (SCN) are
given. Finally, the most relevant implementation parameters are consid-
ered.
2.1 Principles
A common method to model an SC DC-DC converter is with the SC out-
put impedance model [22, 26, 27]. This model (Fig. 2.1) consists of a
transformer and a series resistance (Ro). The transformer represents the
converter’s ideal voltage conversion ratio (iVCR). The iVCR is expressed
as:
$$\mathrm{iVCR} = \frac{V_{NL}}{V_{IN}} = \frac{1}{N}, \tag{2.1}$$
where VNL is the no-load voltage after conversion and N is the conversion
ratio of the transformer. For example, the divide-by-two and divide-by-three
converters presented in the next chapter have iVCRs of 1/2 and 1/3,
respectively. The iVCR is determined solely by the converter SCN topology.
When a load is connected to the converter, the output voltage VOUT
decreases below VNL and the non-ideal voltage conversion ratio (VCR) is
given by:
$$\mathrm{VCR} = \frac{V_{OUT}}{V_{IN}} \tag{2.2}$$
Figure 2.1. (a) SC output impedance model [22, 26, 27] and (b) the losses associated with
Ro: intrinsic (PCfly) and extrinsic (Pbott-cap, Pgate-cap, Pcont, Pcond).
The decrease in the output voltage VOUT below VNL is modeled by the
output impedance Ro. This impedance has a dual nature; it depends on
the contributions from the slow switching limit (SSL) and fast switching
limit (FSL). The SSL impedance is due to capacitor-dominated losses and
the FSL impedance results from resistive losses [23, 28]. The SSL and
FSL have asymptotic behavior, and there is a transition frequency (fT ) at
the intersection of the two asymptotes [26]. Ro is given by the following
equations in terms of SSL and FSL [25]:
$$R_o = \sqrt{R_{SSL}^2 + R_{FSL}^2} = \sqrt{\Bigg(\underbrace{\frac{K_c}{C_{tot} f_{SW}}}_{R_{SSL}}\Bigg)^2 + \Bigg(\underbrace{\frac{2K_s}{G_{tot}}}_{R_{FSL}}\Bigg)^2}, \tag{2.3}$$
where Kc and Ks depend on the topology, Ctot is the total ﬂy capacitance,
fSW is the converter switching frequency, and Gtot is the summed switch
on-conductance.
Understanding the losses associated with Ro is crucial for achieving
high conversion efﬁciency within the DC-DC converter. The losses are
both intrinsic and extrinsic. The intrinsic losses (PCfly) arise from charg-
ing a capacitor through a switch [22]. Extrinsic losses within the con-
verter are due to implementation, and they can be grouped into four loss
mechanisms. First, the bottom-plate losses (Pbott−cap) arise from the par-
asitic capacitance to the substrate of the ﬂy capacitors and switches. Sec-
ond, the gate capacitance switching losses (Pgate−cap) are caused by charg-
ing and discharging the gates of the switches. Third, the control circuitry
losses (Pcont) come from any control circuitry used to operate and/or
regulate the converter. Fourth, the conduction loss (Pcond) arises from the
finite on-conductance of the switches. The sum of the intrinsic and extrinsic
losses is:
$$P_{loss} = P_{Cfly} + P_{bott\text{-}cap} + P_{gate\text{-}cap} + P_{cont} + P_{cond} \tag{2.4}$$
Figure 2.2. 2:1 series-parallel SC converter (a) circuit and (b) operation.
2.2 Operation Characteristics
To illustrate the operation of an SC DC-DC converter, the circuit and
transient behavior of a 1/2 series-parallel DC-DC converter are shown
in Fig. 2.2 (a) and (b), respectively. SC converters have an SCN com-
posed of switches and capacitors. The switches typically operate with
two phases and with a 50 % duty cycle. Although it is possible to oper-
ate SC DC-DC converters with different duty cycles, a 50 % duty cycle
has been shown to be optimal for two-phase switching [22]. During the
charge phase (φcharge), charge is stored on the ﬂy capacitor Cfly and ﬂows
to the load. The stored charge on Cfly flows to the load during the
discharge phase (φdischarge). The output capacitor (COUT) does not contribute
to charge transfer with respect to the conversion; it only reduces the ripple
(ΔVOUT) at VOUT.
The speed of charge transfer, or fSW , has a direct impact on the Ro
impedance as seen from (2.3). In SSL, the Ro is inversely proportional
to fSW . In FSL, the fSW has a minimal impact on Ro unless fSW is much
higher than fT [26].
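As a rough numerical illustration of (2.3), the following sketch evaluates Ro
at several switching frequencies and locates the transition frequency fT by
equating the two asymptotes (RSSL = RFSL), which gives
fT = Kc*Gtot / (2*Ks*Ctot). The values of K_C, K_S, C_TOT, and G_TOT are
hypothetical placeholders chosen for readability; they are not taken from the
implemented converters.

```python
import math

def r_ssl(k_c, c_tot, f_sw):
    """Slow-switching-limit impedance: K_c / (C_tot * f_SW)."""
    return k_c / (c_tot * f_sw)

def r_fsl(k_s, g_tot):
    """Fast-switching-limit impedance: 2 * K_s / G_tot (frequency independent)."""
    return 2.0 * k_s / g_tot

def r_out(k_c, k_s, c_tot, g_tot, f_sw):
    """Output impedance from (2.3): root-sum-square of the two limits."""
    return math.hypot(r_ssl(k_c, c_tot, f_sw), r_fsl(k_s, g_tot))

def transition_frequency(k_c, k_s, c_tot, g_tot):
    """Frequency at which the SSL and FSL asymptotes intersect (R_SSL == R_FSL)."""
    return k_c * g_tot / (2.0 * k_s * c_tot)

# Hypothetical example values (assumptions, not from the thesis).
K_C, K_S = 0.25, 2.0   # topology-dependent constants
C_TOT = 200e-12        # total fly capacitance [F]
G_TOT = 0.1            # summed switch on-conductance [S]

F_T = transition_frequency(K_C, K_S, C_TOT, G_TOT)

# Sweep two decades below and above the transition frequency.
for f_sw in (F_T / 100, F_T, 100 * F_T):
    ro = r_out(K_C, K_S, C_TOT, G_TOT, f_sw)
    print(f"f_SW = {f_sw:.3e} Hz -> Ro = {ro:.2f} ohm")
```

With these placeholder values both limits equal 40 Ω at fT, so Ro there is
about 56.6 Ω. Well below fT, Ro tracks RSSL and falls as 1/fSW; well above
fT, it flattens out at RFSL.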
2.3 Switched-Capacitor Network
As shown previously, an SC DC-DC converter is composed of an SCN. In
the following text, we examine the design choices of the capacitors and
switches that form the SCN.
Capacitors
The type of fly capacitor has three important effects on the converter design.
First, the fly capacitor's capacitance density largely determines the
total converter area since the area of the fly capacitor dominates the design.
The converter designs in [X], [VIII], and [VII], and other state-of-the-art
implementations, confirm this. Second, the losses due to the bottom-plate
parasitic (Pbott−cap) limit the efficiency of the converter at the power levels
used within this thesis. The third effect stemming from the choice of fly
capacitor is functionality, since some capacitors (e.g. MOSCAPs) have
voltage dependencies.
Figure 2.3. Capacitance plot for an NMOS transistor in 28 nm UTBB FD-SOI. [Gate
capacitance (F) versus Vgs (V) for GNDS = 0 V, 1 V, and 2 V, i.e. increasing FBB.]
Depending on the process technology, multiple types of capacitors may
be available. A qualitative summary of different fly capacitor choices for
an integrated DC-DC converter is shown in Table 2.1. Deep trench capacitors,
which are built with embedded DRAM process options [29], are
an excellent choice due to their high capacitance density [30, 31]. However,
deep trench capacitors are still an exotic technology and not standard
components in current CMOS technologies. MIMs, which are standard
components in most current digital CMOS technologies, are the next best
choice for NT converters since they have low Pbott−cap, do not have voltage
dependencies, and have a medium capacitance density. Since MIMs are
formed between two high metal layers and a thin dielectric, the area
underneath them can also be utilized to increase their effective density [14].
For NT converters, the use of MOSCAPs for the fly capacitor is typically
avoided [32]. A MOSCAP, which is formed between the gate and the induced
channel of a MOS transistor [33], has a capacitance density that
decreases for capacitor voltages below Vth as shown in Fig. 2.3. Applying
forward body bias (FBB) to reduce Vth does help slightly in shifting the
capacitance decrease to lower Vgs. This improvement is, however, still not
adequate since the Vgs of a MOSCAP would still see only a fraction of the
NT output voltage (e.g. 0.1 V to 0.2 V). Additional challenges with leakage
also limit this approach. Avoiding the use of MOSCAPs is one reason why
it is challenging for NT converters to have power densities comparable to
those of super-threshold converters. In summary, the capacitor choice has a
large impact on the overall design of NT converters. As further described
in chapter 3, the implemented converters in this thesis use metal-oxide-metal
(MOM) [X] and MIM [VIII], [VII] fly capacitors.
Table 2.1. Fly capacitor choices for fully integrated SC converters.
                                      MIM      MOSCAP   MOM      Deep Trench
Voltage dependency?                   No       Yes      No       No
Capacitance density                   Medium   High     Low      High
Bottom-plate parasitics (Pbott-cap)   Low      High*    Medium   Low
Circuits underneath?                  Yes      No       Yes      Yes
*This is decreased with SOI transistors.
Figure 2.4. Simulation results from [II] showing the impact of leakage on a 2:1 DC-DC
converter with sleep mode load power. [ION,dc−dc/IOFF,dc−dc versus load power (10 nW
to 100 nW) for switch lengths LMIN, LMIN+10%, and LMIN+20%.] ©2014 IEEE
Switches
Switches are used to transfer charge to and from the fly capacitors (Cfly).
Switches require more attention in low power designs for two reasons.
First, drain-source leakage in the switches can cause functionality problems
at small load power [II] or when the converter needs to be powered
OFF [34]. As shown in Fig. 2.4, the switches within a 2:1 DC-DC converter
require a minimum ratio of ON-current (ION,dc−dc) to OFF-current
(IOFF,dc−dc) in order to maintain functionality at nW power levels.
Increasing the transistor length (L) reduces leakage, with a tradeoff of
increased losses at active mode load power levels.
The second reason switches require more attention is that achieving
adequate switch transconductance is challenging due to a low overdrive
voltage [26]. Low overdrive voltage is a result of low Vgs as discussed in
[VII]. Increasing transistor width (W) may help compensate for the low
overdrive voltage, but it also increases the drain-source leakage and
the switch driving losses. This tradeoff is especially important to consider
for large load power ranges.
2.4 Implementation Parameters
Efﬁciency
For energy-constrained processors, (power conversion) efficiency is one of
the most important metrics. The energy consumption of a DC-DC
converter/processor system is inversely proportional to the efficiency of the
converter. Efficiency is the ratio of the converter's output power (POUT) to
the input power (PIN) as follows:
$$\eta = \frac{P_{OUT}}{P_{IN}} = \frac{P_{OUT}}{P_{loss} + P_{OUT}} \tag{2.5}$$
where Ploss is the converter's losses previously defined in (2.4). The maximum
efficiency is bounded by the ratio of the VCR to the iVCR [26] as follows:
$$\eta_{max} = \frac{\mathrm{VCR}}{\mathrm{iVCR}} = \frac{V_{OUT}}{V_{NL}} = \frac{N V_{OUT}}{V_{IN}} \tag{2.6}$$
The VCR at which ηmax occurs is defined as the optimal voltage conversion
ratio (oVCR):
$$\mathrm{oVCR} = K_{OCR} \cdot V_{NL} \tag{2.7}$$
where KOCR is the optimal conversion ratio factor and is always less than
one. Typically, KOCR varies from 80 % to 95 % depending on a number of
factors (e.g. load, topology, etc.).
The minimum conversion efﬁciency over a speciﬁed load range is deﬁned
as ηmin. A high ηmin over a large load range is highly desirable for ULE
adaptive processors that have sleep and active modes (that induce large
changes in load power).
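The relations (2.5) and (2.6) can be checked with a few lines of code. The
sketch below is only illustrative: the function names and the operating point
(a 1.55 V input, N = 3, and a 0.45 V output) are assumptions made for the
example, and battery_energy_per_op simply expresses that the energy drawn from
the input equals the load energy divided by the conversion efficiency.

```python
def efficiency(p_out, p_loss):
    """Conversion efficiency from (2.5): P_OUT / (P_loss + P_OUT)."""
    return p_out / (p_loss + p_out)

def eta_max(v_out, v_in, n):
    """Upper efficiency bound from (2.6): VCR / iVCR = N * V_OUT / V_IN."""
    v_nl = v_in / n  # no-load output voltage, from (2.1)
    if not 0.0 < v_out <= v_nl:
        raise ValueError("V_OUT must lie in (0, V_NL] for a step-down SC converter")
    return v_out / v_nl

def battery_energy_per_op(e_proc, eta):
    """Energy drawn from the input per operation for a load energy e_proc."""
    return e_proc / eta

# Hypothetical operating point (assumed values for illustration only).
V_IN, N, V_OUT = 1.55, 3, 0.45
print(f"eta_max = {eta_max(V_OUT, V_IN, N):.3f}")

# Hypothetical power budget: 100 uW delivered with 25 uW of total losses.
eta = efficiency(100e-6, 25e-6)
print(f"eta = {eta:.2f}")
print(f"input energy per 1 pJ operation = {battery_energy_per_op(1e-12, eta):.3e} J")
```

For this assumed operating point the bound is about 87 %, so any measured
efficiency at VOUT = 0.45 V must fall below that value regardless of how small
the extrinsic losses are made.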
Efﬁciency Enhancement Factor
As presented in (2.6), the maximum efﬁciency (ηmax) in an SC converter is
bound by VCR/iVCR. The ηmax only accounts for the converted output volt-
age. Additional intrinsic and extrinsic losses as shown in (2.4) reduce the
actual efﬁciency. These losses typically increase as the VCR decreases (to
achieve a higher down-conversion), and thus, efﬁciency in SC converters
is proportional to VCR [35]. To account for this effect, and to benchmark
against a linear-mode converter, the efﬁciency enhancement factor (EEF)
[33] can be used as a ﬁgure of merit:
$$\mathrm{EEF} = 1 - \frac{\eta_{lin}}{\eta}, \tag{2.8}$$
where ηlin is the efficiency of an ideal linear-mode converter. The linear-mode
converter has the same VOUT and VIN as the SC converter. A positive
EEF indicates that the efficiency (η) of the SC converter is improved
over an equivalent linear-mode converter.
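A small worked example of (2.8) follows. It assumes that the ideal linear-mode
converter has no quiescent current, so that its efficiency is simply
VOUT/VIN; the function names and the numerical operating point are
hypothetical.

```python
def eta_linear(v_out, v_in):
    """Efficiency of an ideal linear-mode converter (assumed: V_OUT / V_IN)."""
    return v_out / v_in

def eef(eta_sc, v_out, v_in):
    """Efficiency enhancement factor from (2.8): 1 - eta_lin / eta."""
    return 1.0 - eta_linear(v_out, v_in) / eta_sc

# Hypothetical operating point: 1.5 V in, 0.45 V out, 80 % SC efficiency.
print(f"EEF = {eef(0.8, 0.45, 1.5):.3f}")

# An SC converter no better than the linear converter gives EEF = 0,
# and a worse one gives a negative EEF.
print(f"EEF = {eef(0.3, 0.3, 1.0):.3f}")
print(f"EEF = {eef(0.2, 0.3, 1.0):.3f}")
```

In the first case the linear converter would reach only 30 % efficiency, so an
SC converter at 80 % yields an EEF of 0.625.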
Load Power Range
The converter needs to function efﬁciently from the load’s minimum power
(POUT,min) to maximum power (POUT,max). This load power range can also
be expressed as a ratio:
POUT,ratio = POUT,max/POUT,min (2.9)
The POUT,ratio for a DC-DC converter is typically orders-of-magnitude for
energy-constrained processors [5, 14, 15] that utilize multiple power modes
(e.g. sleep and active modes) [13].
Ripple
SC DC-DC converters exhibit inherent ripple in their output voltage (VOUT)
due to pulse-like current patterns as shown previously in Fig. 2.2. The
ripple magnitude of an SC DC-DC converter's output voltage can be
understood by assuming that the converter operates in SSL. The worst-case
ripple amplitude is expected in SSL, because the parasitic resistances of
the switches, capacitors, and interconnects are neglected, and the charge
transfer is thus not damped by these parasitics. A general SSL approximation
for ripple can be described by rearranging the classic switched-capacitor
loss equation in [22] and accounting for multiphase interleaving:
$$\Delta V_{OUT} = \frac{2 I_{OUT}}{N_{MP} M_{CAP} f_{SW} C_{fly}} \tag{2.10}$$
where IOUT is the output current into RL, NMP is the number of multiphase
interleaved units, and MCAP is based on the converter architecture
[22]. All three of the DC-DC converters presented in the next chapter had
ΔVOUT less than 15% of VOUT.
Table 2.2. Characteristics of modern batteries.
                                          Li-ion prototype [36]   NiMH           Li-ion     Li-ion polymer   Alkaline
Nominal voltage                           1.55 V                  1.2 V          3.6 V      3.6 V            1.5 V
Cycle life (to 80% of initial capacity)   300-400                 300-500        500-1000   300-500          50
Voltage over 95% of discharge time        1.5 V-1.6 V             ~1.1 V-1.3 V   ~3 V-4 V   ~3 V-4 V         0.9 V-1.5 V
Gravimetric energy density (Wh/kg)        20                      60-120         110-160    100-130          80
Volumetric energy density (Wh/l)          925                     140-300        270        300              80
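To give a feel for the SSL ripple approximation in (2.10), the sketch below
evaluates it for one operating point and also inverts it to find the minimum
switching frequency for a given ripple budget. All numerical values,
including the architecture constant m_cap, are hypothetical and were chosen
only to keep the arithmetic easy to follow.

```python
def ripple(i_out, f_sw, c_fly, m_cap, n_mp=1):
    """SSL ripple estimate from (2.10): 2*I_OUT / (N_MP*M_CAP*f_SW*C_fly)."""
    return 2.0 * i_out / (n_mp * m_cap * f_sw * c_fly)

def min_f_sw(i_out, dv_max, c_fly, m_cap, n_mp=1):
    """Switching frequency needed to keep the ripple at or below dv_max."""
    return 2.0 * i_out / (n_mp * m_cap * dv_max * c_fly)

# Hypothetical operating point (assumed values, not measured data).
I_OUT = 100e-6    # output current [A]
F_SW = 20e6       # switching frequency [Hz]
C_FLY = 200e-12   # fly capacitance [F]
M_CAP = 2.0       # architecture-dependent constant (placeholder)

print(f"ripple, no interleaving : {ripple(I_OUT, F_SW, C_FLY, M_CAP) * 1e3:.2f} mV")
print(f"ripple, 4-way interleave: {ripple(I_OUT, F_SW, C_FLY, M_CAP, n_mp=4) * 1e3:.2f} mV")
print(f"f_SW for 25 mV ripple   : {min_f_sw(I_OUT, 25e-3, C_FLY, M_CAP) / 1e6:.1f} MHz")
```

Under these assumptions the ripple is 25 mV without interleaving and drops to
6.25 mV with four interleaved units, which illustrates why interleaving is an
effective ripple-reduction lever when its control overhead can be afforded.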
Battery Input
The majority of current portable electronic systems are powered by batteries.
Table 2.2 gives an overview of commonly used rechargeable batteries.
The high energy density of Li-ion based batteries ensures
minimal weight and size. These qualities make them an attractive option
for applications like smartphones or smartwatches. Li-ion batteries with
a 3.6 V nominal voltage are the most common choice for powering modern
portable electronics. However, as discussed in the next paragraph
and later in chapter 5, the lower (1.55 V) nominal voltage of the Li-ion
prototype in [36] is advantageous.
Modern portable electronics require that the battery voltage (or system
voltage) be efficiently down-converted to the processor's operating voltage
(VOUT). When VOUT is at or below NT voltages, maintaining high efficiency
in the converter becomes more challenging due to a larger
step-down conversion ratio, or equivalently, a decreased VCR. A smaller
VCR requires more switches, capacitors, and control circuitry. As shown
in Fig. 2.5, state-of-the-art DC-DC converters typically have lower
efficiency at lower VCR.
Figure 2.5. Efficiency of state-of-the-art DC-DC converters with various VCRs.
[ηmax (%) versus VCR = VOUT/VIN.]
To further clarify the trend in efficiency and VCR, the efficiency of a ladder
DC-DC converter in 28 nm CMOS with three topologies is shown in
Fig. 2.6 (a). Each topology's iVCR (i.e. 1/2, 1/3, and 1/7) is chosen based
on the input voltage. As the input voltage is reduced, the required iVCR
increases. A smaller iVCR requires more switches and capacitors, and thus
incurs more extrinsic losses (Pbott−cap, Pcont, and Pgate−cap). Note that
the efficiency curves were generated from the MATLAB model developed
in [37]. This model does not take into account Pcont. Thus, in practice the
efficiency of the 1/7 topology would be reduced even further relative to the
1/2 and 1/3 topologies. The effects of the input voltage ultimately affect the
energy consumption of a DC-DC converter/processor system as shown in
Fig. 2.6 (b). The 1/2 topology has a minimum energy point that gives a 36 %
reduction in energy/operation over the 1/7 topology. In summary, choosing
the lowest possible input voltage helps to make the DC-DC converter
more efficient, and consequently, the DC-DC converter/processor system
operates with lower energy consumption.
Power Density
The cost of a fully integrated DC-DC converter is dependent on its power
density (PD). The PD is the output power (POUT) divided by the silicon
area (A) required to perform the conversion:
$$PD = \frac{P_{OUT}}{A} \tag{2.11}$$
Achieving high PD is especially challenging for converters that supply NT
loads for two reasons. The first reason is that NT converters require that
the (area-dominant) fly capacitors be constructed from capacitors that are
not sensitive to the voltage across them. The only choice is thus MOM
or MIM capacitors, which have low to medium capacitance density (Table
2.1). Using a MOSCAP, which has a high capacitance density, is not possible
due to the drop-off in capacitance at NT voltages (see Fig. 2.3). The
second reason why it is challenging to achieve high PD is that the switches
within the SCN have low overdrive voltages at ULV. Using a larger W for
switches is one option to compensate for the low overdrive. Nevertheless,
a larger W comes at the cost of a reduced POUT,ratio (due to leakage) and
decreased efficiency (e.g. due to larger Pgate−cap).
Figure 2.6. (a) A ladder converter with three different topologies (i.e. 1/2, 1/3, and 1/7).
Each topology is regulating to the NT voltage range from a different input
voltage. (b) Effects of the ladder converters' efficiency on a ring oscillator
load. [(a) Efficiency (%) versus VOUT (V) for the 1/2 (VBAT = 1.2 V), 1/3
(VBAT = 1.55 V), and 1/7 (VBAT = 3.6 V) topologies; (b) normalized Eop versus
VOUT (V) for Eop,sys (1/2, 1/3, 1/7) and Eop,uP.]
3. DC-DC Converter Prototypes
Chapter 2 presented the principles, operation
characteristics, and implementation parameters of SC converters. In this
chapter, we turn our attention towards three implemented fully integrated
SC DC-DC converters. Each converter is designed to convert a system
supply or battery voltage down to an NT load. This chapter is divided
into three sections. The first section discusses a 3:1 Dickson converter in
(bulk) CMOS and the motivation for the Dickson topology. Next, two SC
converters built in ultra-thin body and buried oxide fully depleted
silicon-on-insulator (UTBB FD-SOI) CMOS are presented. The necessary
background and motivation for using UTBB FD-SOI CMOS are also discussed.
In this chapter, the author's contributions focus specifically on using
high-efficiency DC-DC converters for energy-constrained processors that
operate at NT voltages. These DC-DC converters serve a critical role
since the energy consumption of a DC-DC converter/processor system is
inversely proportional to the DC-DC converter's conversion efficiency. In
other words, scaling a processor's supply voltage to NT is only worthwhile
if the DC-DC converter can operate with high efficiency. This high
conversion efficiency must also be maintained over a wide load power range
since toggling between a processor's sleep and active modes can produce
changes in load power of up to 6000× [15]. While the first converter (in
section 3.1) is focused on efficiency, the final two converters (in section 3.2)
explore new methods of providing high efficiency over a wide load power
range. A performance summary is given for each of these three converters.
This chapter highlights the most signiﬁcant aspects of the original work.
Further analytic and implementation details can be found in the related
scientiﬁc publications: [VII], [VIII], and [X].
Figure 3.1. (a) Efficiency of three divide-by-three topologies (1/3 series-parallel, 1/3
Dickson, and 1/3 ladder) as a function of power density, based on equations from [22].
The target PD (~4 mW/mm2, at < 30% of processor area) is based on an area constraint
with respect to the processor area. (b) Constants needed to calculate efficiency in [22]:
                      MSW    Mbott   Num. of fly caps   Mcap
1/3 Series-Parallel   24.5   2.5     2                  1.6
1/3 Ladder            18     0.98    3                  0.8
1/3 Dickson           24.5   1       2                  1.3
3.1 3:1 Dickson DC-DC Converter in Bulk 28 nm
Introduction
The main requirement of the 3:1 Dickson converter is to efficiently convert
from a battery input voltage range (VIN) down to NT voltages (for
a processor load). The area of the DC-DC converter (ADCDC) is required
to be less than 30% of the area of the processor load (AProc). The processor
load is the same as in [V]. VIN was chosen based on the voltage range of a
prototype 1.55 V battery [38].
To choose the best DC-DC converter topology based on the requirements
discussed previously, the efficiencies of a ladder, a Dickson, and a
series-parallel topology are calculated using equations from [22]. A plot of
these efficiency results as a function of power density is shown in Fig. 3.1 (a).
Constants (MSW, Mbott, and Mcap) are needed in calculating the efficiency
of each topology (Fig. 3.1 (b)); details of these constants are given in [22]. The
Dickson step-down topology, which was first introduced in [39] with off-chip
fly capacitors, has the highest efficiency for the target PD.
Figure 3.2. (a) Implementation of the 3:1 Dickson converter from [X]. ©2015 IEEE. (b)
Details of S2a,b. (c) Capacitive level shifter.
Circuitry
The 3:1 Dickson converter is implemented in 28 nm bulk CMOS. The con-
verter consists of a core and control circuitry as shown in Fig. 3.2. The
core has eight switches (S1−8) and two ﬂy capacitors (Cfly1,2). A two-phase
non-overlapping clock generator (NOC) generates the switches’ control
signals. These signals are input to drivers (D) that form tapered buffers
[40, pp. 344-345]. VDIFF is an auxiliary source equal to VIN − 1 V. S2a and
S3−7 are triple-well n-type metal oxide semiconductor (NMOS) transistors
and S1 and S2b are p-type metal oxide semiconductor (PMOS) transistors.
The switches and ﬂy capacitors within the core are optimized for high
efﬁciency according to [22].
The NMOS/PMOS stack for S2a,b, as shown in Fig. 3.2 (b), was the most
challenging switch to design due to the VIN − VOUT voltage drop across the
switch while it is OFF. The NMOS/PMOS stack keeps the control
voltages below the breakdown voltage and reduces leakage when
S2a,b is OFF.
To ensure that S1, S2a and the drivers of S1 and S2a can be built with
thin-oxide transistors, rather than (inefficient) thick-oxide I/O devices, the
gates of S1 and S2a need to switch in the upper domain (i.e. between
VBAT and VDIFF). Switching in the upper domain does not violate the
breakdown voltage of a thin-oxide transistor. Level shifters are used to
translate the control signals of the lower domain (i.e. between V1V and 0
V) to the upper domain. The (capacitive) level shifter [22, 41], which also
uses thin-oxide transistors, is shown in Fig. 3.2 (c).
Figure 3.3. Simulation results of the 3:1 Dickson converter. For this simulation, the VCR
is fixed at 0.3. [Efficiency (%) and VOUT versus VIN (V) at POUT = 11 μW, 154 μW,
and 232 μW.]
The ﬂy capacitors are custom MOMs with a vertical parallel plate struc-
ture [42] utilizing metal layers M2 to M8. This choice of metal layers
provided a good tradeoff between capacitor density and Pbott−cap. The ca-
pacitor density was estimated (from HFSS and parasitic extraction) to be
5 fF/μm2 and Pbott−cap to be 1.2 %. MIMs would have been preferred due
to their superior performance (Table 2.1); however, at the time of
implementation, they were not available.
Measurement Results
The performance of the converter was characterized through simulations
and measurements. As shown in Fig. 3.3, the converter achieves high
(simulated) efficiency at NT output voltages. The VCR (VOUT/VIN) was
fixed in the simulation at 0.3. In other words, the (open-loop) converter
produced VOUT = 304 mV to 470 mV from VIN = 1 V to 1.7 V. The technique of
scaling the VOUT with VIN is called scaled-input regulation (SIR) and is
further discussed in chapter 5. For the target 1.55 V battery input, the
Dickson converter achieves 80 % efficiency as shown in Fig. 3.3.
The load used in the previous results was a current source load as shown
in Fig. 3.4. It was used to mimic the processor's average current (Iavg),
leakage current (Ileak), and peak current spikes (Ispike). The load profile
was built from both ring oscillator simulations and synthesis results.
The Dickson converter was implemented and measured. In the measurements,
the converter operated correctly with the on-chip processor
from [V]. The measurement results with this processor and a 1.55 V battery
input are described in [43]. However, a suspected error in the pads
and/or wirebonding meant that it was not possible to measure the
efficiency.
Figure 3.4. Simulation test setup for a processor load. [Load current profile with
Ileak, Iavg, and Ipeak = Ispike + Ileak, repeating with period 1/fproc.]
Although the converter efﬁciency could not be measured, measurements
of the voltage ripple magnitude as a function of the switching frequency
(fSW ) were possible as shown in Fig. 3.5. This measurement was per-
formed at the POUT,min (14 μW) and POUT,max (115 μW). The ripple was
less than 12 % of VOUT for both power levels. The analytically predicted
ripple, which was calculated with an optimal switching frequency (fSW,OPT,l)
inserted into (2.10), closely matched the measured ripple. The details of
these predictions and additional ripple measurements are found in [X].
Summary
This section showed that the Dickson topology was a good match for the
target power density and high efficiency requirement. With this topology,
the ηmax was 82 % with a ULE load power. For a 1/3 topology, this
produced a high EEF of 64 %. The high efficiency and EEF are due to
minimal control circuitry, thin-oxide transistors, a low input voltage, and
the optimization of the SCN using methods in [22]. The converter was
simulated and measured in an open-loop conﬁguration. It was not pos-
sible to measure the efﬁciency due to an unknown error. However, the
converter did function correctly when supplying a 32-bit adaptive proces-
sor. The performance of the Dickson DC-DC converter is summarized in
Table 3.1. An improved Dickson converter design, which is built in 28 nm
UTBB FD-SOI CMOS, is presented in the next section.
Figure 3.5. Measured and analytically predicted ripple of the 3:1 Dickson converter from
[X]. ©2015 IEEE.
Table 3.1. Performance summary of the 3:1 Dickson converter.
Parameter                         Value
Technology                        28 nm CMOS
Topology                          step-down SC
iVCR                              1/3
Tested input / output voltage     1-1.7 V / (0.304-0.470 V)
Load range                        14 μW-115 μW
Load range in ratio               1:8
Efficiency (η) @ sub-/near-Vth    75% @ 0.350 V (1)
Efficiency max                    82% @ 0.350 V (1)
EEFmax                            64% @ 0.350 V (1)
Power density @ η (mW/mm2)        1.66 @ 0.350 V
Cfly                              160 pF (MOM)
COUT                              200 pF (MOM/MOS)
fdc-dc,max                        32 MHz
(1) From simulations.
DC-DC Converter Prototypes
3.2 SC DC-DC Converters in UTBB FD-SOI
3.2.1 UTBB FD-SOI Background
UTBB FD-SOI CMOS has a number of advantages over traditional CMOS.
For example, UTBB FD-SOI has improved SCE immunity, improved vari-
ability, latch-up immunity, and extended body-biasing (a.k.a. back-gate
biasing) [44]. Back-gate biasing is more effective in changing a transis-
tor’s threshold voltage (Vth) than in advanced bulk CMOS for two reasons
[45]. First, the range of bias voltages is larger since the bulk node is iso-
lated from the drain and source. Second, the body-bias factor (γ) is higher.
For example, at the 28 nm node, in UTBB FD-SOI γ ≈ 85 mV/V while in
bulk CMOS γ ≈ 25 mV/V [46].
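The practical meaning of the two body-bias factors can be seen with a
first-order estimate. The sketch below assumes a purely linear relation
between the back-gate bias and the threshold-voltage shift, and adopts the
sign convention that a positive bias is forward (lowering Vth); both the
linearity and the bias values used are simplifying assumptions made for
illustration.

```python
# Body-bias factors quoted in the text for the 28 nm node [46], in mV/V.
GAMMA_FDSOI_MV_PER_V = 85.0
GAMMA_BULK_MV_PER_V = 25.0

def vth_shift_mv(gamma_mv_per_v, v_bb):
    """First-order Vth shift in mV for a back-gate bias v_bb in volts.

    Assumed convention: v_bb > 0 is forward body bias (Vth decreases),
    v_bb < 0 is reverse body bias (Vth increases).
    """
    return -gamma_mv_per_v * v_bb

# Hypothetical bias points: 1 V of forward bias and 3 V of reverse bias.
for name, gamma in (("UTBB FD-SOI", GAMMA_FDSOI_MV_PER_V), ("bulk", GAMMA_BULK_MV_PER_V)):
    print(f"{name:12s} FBB 1 V: {vth_shift_mv(gamma, 1.0):+.0f} mV, "
          f"RBB 3 V: {vth_shift_mv(gamma, -3.0):+.0f} mV")
```

For the same bias voltage, the estimated Vth shift in UTBB FD-SOI is 3.4 times
the bulk value, which is simply the ratio of the two body-bias factors.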
The application of back-gate bias is useful for increasing transistor
conductance with FBB [46–48] or decreasing leakage with reverse body bias
(RBB) [45]. For SC DC-DC converters, increasing the conductance with
FBB is beneficial since it allows for larger load power (i.e. large POUT,max).
Similarly, RBB can be used to reduce leakage energy. The ability to trade off
performance against leakage through back-gate biasing is a key benefit of the
UTBB FD-SOI technology.
With UTBB FD-SOI, RBB and FBB can be applied to regular threshold
voltage (RVT) and low threshold voltage (LVT) transistors, respectively.
As shown in Fig. 3.6, RVT transistors are implemented in a conventional
well, where NMOS and PMOS transistors are above p- and n-wells,
respectively. An FBB up to 0.6 V and RBB down to -3 V can be used with
RVT transistors. The design in section 3.2.2 uses only RVT devices; RBB
was applied to reduce leakage at low (nW) load power levels. As shown
in Fig. 3.6, LVT transistors are implemented as "flip-well" transistors in
which the NMOS and PMOS transistors are above the n- and p-wells,
respectively. FBB up to 3 V and RBB down to -0.3 V are allowed. The design
in section 3.2.3 uses only LVT devices; FBB was utilized to increase the
maximum load-handling capability.
The generation of the back-gate bias voltage requires additional circuitry.
Details of this circuitry lie outside the scope of this thesis. However,
the designs presented in the following sub-sections both give
measurement results with biasing schemes that have little or no impact on
performance. Self-biasing techniques [49] and simpler single-well biasing
techniques [50] will be explored in future implementations.
Figure 3.6. UTBB FD-SOI RVT and LVT transistors. The RVT and LVT transistors can
realize performance advantages from the application of RBB and FBB,
respectively [45].
3.2.2 3:1 Dickson DC-DC Converter
Introduction
The goal of the 3:1 Dickson DC-DC converter is to provide an efficient
conversion from a 1.55 V prototype battery to an NT processor load. During
95% of the battery's discharge time, the voltage is between 1.5 V and 1.6
V, and thus, 1.5 V - 1.6 V is the main input voltage testing criterion. Due to
the battery characteristics, and due to the fact that the NT processor load
has a flat energy profile from VOUT = 0.35 V to 0.45 V, a single 1/3 topology
is sufficient for this design.
Besides providing high efﬁciency under active load conditions, the DC-
DC converter also should be efﬁcient for sleep mode load conditions. As
previously mentioned, toggling between a processor’s sleep and active
mode can produce large changes in load power [51]. Although the in-
tended processor load only has an active mode, future implementations
of the processor will have a sleep mode. Achieving high efﬁciency at nW
loads (in sleep mode) is challenging due to control circuitry leakage energy
losses [7, 19]. We explore the application of back-gate biasing to reduce
control circuitry losses and increase efﬁciency for sleep mode. In UTBB
FD-SOI, back-gate biasing can be applied to RVT transistors to achieve
signiﬁcant reductions in leakage [45].
Circuitry
The 3:1 Dickson DC-DC converter is shown in Fig. 3.7 (a). The core and
control circuitry are designed to (efficiently) deliver an NT output voltage.
The DC-DC converter's area of 0.023 mm2 is mainly consumed by
the MIM capacitors Cfly1,2 within the core and output decoupling
capacitance COUT.
Figure 3.7. (a) Implementation of the 3:1 FD-SOI Dickson converter from [VIII]. The
Driver Control is similar to the previous Dickson implementation (section
3.1). (b) Single Boundary control. (c) Clocked comparator.
The core and control circuitry of the converter use only thin-oxide RVT
transistors. Multi-phase interleaving is not used since it adds large con-
trol circuitry losses at sleep mode load power levels. The core uses the
same (Dickson) topology, and a similar driver control, as in the previous
DC-DC converter in section 3.1. The core’s switches are driven by tapered
buffers [40, pp. 344-345]. Within the control circuitry, an intermediate
rail generator (for VDIFF) and hysteretic control circuitry are used. The
intermediate rail generator, which is an SC subtractor circuit, produces
VDIFF=VBAT-V1V for switches S1 and S2a,b. The SC subtractor circuit uses
fdcdc for its switching frequency. This switching frequency should be as
low as possible to reduce the power losses associated with switching the
subtractor circuit (see Fig. 3.11).
The Hysteretic Control block (Fig. 3.7) is used to regulate the DC-DC
converter's output voltage VOUT. This block uses discrete-time hysteretic
control, which is an all-digital pulse frequency modulation (PFM) technique
used to scale the DC-DC converter's switching frequency with load power.
Although most modern DC-DC converters use PFM, benefits have also been
shown with digital capacitance modulation [52]. The Hysteretic Control
block consists of a clocked comparator and an edge-triggered (toggled)
latch to enable Single Boundary control [41], as shown in Fig. 3.7 (b). The
clocked comparator drives VCOMP high if VOUT < VREF and low otherwise. The
Toggle Latch changes state every time the comparator detects a boundary
violation. Thus, the actual switching frequency of the DC-DC converter
is fSW rather than the input clock frequency fdcdc. The comparison and the
(latch) reset are done on the rising and falling edge of CLK, respectively.
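The behavior of this control loop can be illustrated with a small behavioral model. This is a sketch, not the implemented circuit: it assumes an ideal comparator, a fixed charge packet per toggle (`q_packet`, a hypothetical value), and a constant load current.

```python
def simulate_single_boundary(i_load, f_clk=25e6, v_ref=0.415,
                             c_out=110e-12, q_packet=15e-12, n_cycles=5000):
    """Return (toggle_rate_hz, mean_v_out) for a clocked single-boundary loop."""
    dt = 1.0 / f_clk
    v_out = v_ref
    toggles = 0
    v_sum = 0.0
    for _ in range(n_cycles):
        v_out -= i_load * dt / c_out   # load discharges COUT for one clock period
        if v_out < v_ref:              # clocked comparator detects a boundary violation
            toggles += 1               # toggle latch changes state
            v_out += q_packet / c_out  # assumed: one fixed charge packet per toggle
        v_sum += v_out
    return toggles / (n_cycles * dt), v_sum / n_cycles


for i_load in (100e-6, 300e-6):
    rate, v_mean = simulate_single_boundary(i_load)
    print(f"IL={i_load * 1e6:.0f} uA: toggle rate = {rate / 1e6:.2f} MHz, "
          f"mean VOUT = {v_mean:.3f} V")
```

In this model the toggle rate tracks the load current while the input clock stays fixed, which is the PFM behavior described above. The mean output also settles above VREF, because the output is only ever corrected upward from the boundary.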
The comparator is a key component of the Hysteretic Control block. For
this design, a low-power clocked comparator is used, as shown in Fig. 3.7
(c). Low NT input voltages (VREF, VOUT) and a wide input frequency range
(fdcdc=50 kHz – 50 MHz) require careful design of the comparator,
especially at low frequencies. The comparator architecture is similar to
[41] but differs in two features. First, a delayed clock signal is added
to the PMOS used for the latch reset. This delay in the latch reset ensures
that CLK is above Vt before the comparison is made, and thus, that
variations in IDSx are minimized. Second, all transistor lengths were
sized to 4 × Lmin to minimize (drain-source) leakage and ensure proper
operation down to kHz input frequencies.
Measurement Results
The DC-DC converter was measured with different load powers, input
voltages, output voltages, and temperatures. The temperature results are
discussed later in chapter 5 (subsection 5.3.3), whereas the performance
under different load powers, VBAT, and VOUT is discussed here. The
measured efficiency at NT voltages is shown in Fig. 3.8. The ULP mode
uses a back-gate bias of VBBP=1.55 V/VBBN=-1.55 V, while the LP mode
uses a back-gate bias of VBBP=0 V/VBBN=0 V. At VOUT=0.465 V, the
minimum efficiency was 71% for 104 nW to 140 μW loads. The peak
efficiency over this load power range was 81%. At VBAT=1.5 V and VBAT=1.6
V, the peak efficiency was 81.4% and 80.1%, respectively (Fig. 3.8 (b)).
Overall, the converter has high efficiency across a wide load power range.
The wide load power range is largely due to back-gate biasing. At sleep
mode power levels, the DC-DC converter is able to increase its efficiency
through increased RBB, as shown in the measurement results in Fig. 3.10
(a). By applying RBB (with VBBP=2 V and VBBN=-2 V) at POUT=0.6 μW,
the percentage of leakage power is reduced from 15.6% to 0.6%. Increasing
RBB at higher load powers (e.g. POUT=10.6 μW) has minimal effect
since leakage power is small relative to the converter's power loss Ploss
(as shown in (2.4)). The effect of RBB on efficiency is shown in Fig. 3.10
(b). Overall, RBB has a greater impact as POUT is reduced. The reason is
that the leakage power due to control circuitry losses (Pcont) increases as
a percentage of Ploss as POUT decreases; this was also confirmed in [II].

Figure 3.8. Measured converter efficiency (curves for VDD=0.465 V and VDD=0.415 V).
The ULP mode uses a back-gate bias of VBBP=1.55 V/VBBN=-1.55 V and LP mode
uses a back-gate bias of VBBP=0 V/VBBN=0 V [VIII]. ©2015 IEEE.

Figure 3.9. Effect of a load step on the 3:1 Dickson DC-DC converter. Scope
annotations: VREF=0.415 V, VBAT=1.55 V, fSW=25 MHz; IVDD=100 μA with
VDD,RMS=0.435 V and IVDD=300 μA with VDD,RMS=0.420 V (500 ns/div,
50 mV/div).

Figure 3.10. Measured characteristics of RBB. (a) Percentage of Ploss [%], split
into PSW and Pleak, at POUT=0.6 μW and POUT=10.6 μW with RBB (VBBP=2 V,
VBBN=-2 V) and without it (VBBP=0 V, VBBN=0 V). (b) Efficiency [%] versus
POUT [W] for the same two bias settings.

Figure 3.11. (a) Measurements of the DC-DC converter with the SC subtractor input
frequency of 25 MHz and (b) 7.5 MHz. Both panels plot efficiency [%] versus
VBAT [V] with SIR at IOUT=100 μA; the efficiency increase in (b) is due to
the lower input switching frequency of the SC subtractor.
When switching between different load powers, the DC-DC converter
should still perform correctly. As shown in Fig. 3.9, the converter was
measured with load steps and a fixed input frequency (fdcdc) of 25 MHz.
Load current steps between 100 μA and 300 μA are applied to model the
processor's (load) characteristics. There is a clear change in the actual
switching frequency (fSW) of the converter when changing between load
levels. As predicted by [41], hysteretic control is a fast and stable
control method. VOUT,RMS is slightly larger than VREF, mainly due to the
ripple [41]. However, the processor's input capacitance of 75 pF reduces
VOUT,RMS to ≈ 0.415 V under typical operation.
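A first-order estimate shows why extra load capacitance pulls the output closer to VREF. The sketch below assumes the output droops linearly at I/C between comparator decisions made at fdcdc; it ignores the size of the charge packets and is only meant to show the trend.

```python
def droop_between_decisions(i_load, f_clk, c_total):
    """Voltage lost by the output node during one clock period (I * T / C)."""
    return i_load / (f_clk * c_total)


c_out = 110e-12   # on-chip COUT (Table 3.2)
c_proc = 75e-12   # processor input capacitance quoted in the text

for label, c_total in (("COUT only", c_out), ("COUT + processor", c_out + c_proc)):
    dv = droop_between_decisions(100e-6, 25e6, c_total)
    print(f"{label}: droop per clock period = {dv * 1e3:.1f} mV")
```

Under these assumptions the droop per decision interval shrinks in proportion to the added capacitance, so the ripple-induced offset above VREF shrinks as well.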
During measurement of the DC-DC converter, it was realized that the
SC subtractor input frequency could have been reduced to improve efﬁ-
ciency. By re-conﬁguring the SC subtractor to switch with fSW=7.5 MHz
rather than with fdcdc=25 MHz, the efﬁciency of the DC-DC converter can
be improved. As shown in Fig. 3.11, by reducing the switching frequency
of the SC subtractor, the average efﬁciency was improved from 70.7 % to
80.1 %. Note that the DC-DC converter used SIR - a technique discussed
later in chapter 5.
Table 3.2. Performance summary of the 3:1 Dickson converter.

Parameter                              Value
Technology                             28 nm UTBB FD-SOI (RVT)
Topology                               step-down SC
iVCR                                   1/3
Tested Input / Output Voltage          1-1.9 V / (0.290-0.543 V)
Cfly                                   200 pF (MIM)
COUT                                   110 pF
fdc-dc,max                             38 MHz
Load Range                             209 nW-205 μW
Load Range in Ratio                    1:981
Load Range Min. Efficiency (ηMIN)      ηMIN=71%
Efficiency (η) @ Sub-/Near-Vth         76% @ 0.415 V (1)
Power Density @ η (mW/mm2)             5.5 @ 0.415 V
Efficiency max                         76% @ 0.415 V (1), 81% @ 0.465 V (1)
EEFmax                                 65% @ 0.415 V (1)

(1) VBATT=1.55 V and NBB (GNDS=VDDS=0 V)
Summary
A fully integrated SC DC-DC converter was presented (Fig. 3.7). While
connected to a Li-ion battery input, the DC-DC converter achieved a peak
efficiency of 81% and maintained an efficiency above 71% for 104 nW
to 140 μW loads. Back-gate biasing was used to reduce leakage power
at sub-5 μW loads. A single topology allowed for high efficiency across
a wide power range. Adding further topologies was unnecessary and
would have only increased complexity and losses.
3.2.3 2:1 Self-Oscillating DC-DC Converter
Introduction
The goal of the 2:1 self-oscillating DC-DC Converter is to achieve high ef-
ﬁciency over a wide load range. The motivation for the wide load range is
to account for active and sleep modes in current ULE processors. A new
(self-oscillating) topology is implemented and back-gate biasing is utilized
to further improve its performance. More speciﬁcally, back-gate biasing is
used to increase the maximum drive strength of switches in the SCN and
to increase the switching frequency. This is useful when operating during
a processor’s active mode. The self-oscillating topology also works well
at low (sleep mode) load power levels due to its minimal control circuitry
[53].
Figure 3.12. Implementation of the 2:1 self-oscillating converter from [VII]. (a)
Schematic of the self-oscillating 2:1 DC-DC converter. Cfly1−5 are each
a 27 pF MIM capacitor. (b) Delay block Di. (c) Simulated results of VOUT
and voltages (top,1/bot,1) from Stage 1, plotted as voltage [V] versus
time [μs]. ©2015 IEEE.
Circuitry
The self-oscillating 2:1 DC-DC converter is shown in Fig. 3.12 (a). The
converter is comprised of two stacked ring oscillators. Clock generation
and level conversion are inherent within the topology. During oscillation,
each stage alternates its conﬁguration in the same way as a conventional
2:1 converter. For example, in Stage 1, S1 and S3 turn ON at the same
time in order to charge Cfly,1 while S2 and S4 turn ON at the same time
during a discharge phase.
The self-oscillating converter is interleaved since each stage delivers
charge at different phase times. Each of the ﬁve stages consists of three
main blocks: a 27 pF MIM capacitor, charge transfer switches (S1 − S4),
and a delay cell (Di). The LVT charge transfer switches form inverters
that are approximately 2x larger than the largest design kit library in-
verter. The lengths of these inverters are sized 2x the minimum length
to ensure that the leakage does not affect functionality at nW load power
[54].
The variable delay cells (D1-D5) within each stage are used to adjust
the switching frequency (fSW) with changes in the load power (POUT =
VOUT ∗ IL). As shown in Fig. 3.12 (b), an individual delay cell consists
of two leakage-based delay elements, a pass transistor (TP), and a 0.2 pF
MIM coupling capacitor labeled CC. CC ensures synchronization of the
stacked ring oscillators. Additional details of the leakage-based delay can
be found in [7, 55]. VCTRL is used to adjust the amount of leakage within
the leakage-based delay elements, and thus, adjust the delay (Di) between
each stage. The circuitry required to enable closed-loop control of VCTRL
has minimal impact on conversion efficiency even at 1.7 nW load power [55].

Figure 3.13. (a) VBB range for LVT transistors [46], from -0.3 V (RBB) to +3 V (FBB).
(b) Back-gate bias connections in the converter. Within Di, the NMOS and
PMOS have VBB and -VBB voltages, respectively. (c) Simulated change in
ION,S4 due to VBB: normalized ION versus VGS [V] for VBB=0 V to 3 V,
annotated with an 8x increase at 0.430 V and a 1.5x increase at 0.8 V.
©2015 IEEE.
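To get a feel for the delay range the delay cells must cover, the textbook ring-oscillator relation f = 1 / (2 * N * t_stage) can be inverted. This is a sketch under the assumption that the five-stage converter behaves like an ideal five-stage ring oscillator in which each stage (inverter pair plus delay cell) contributes one lumped delay. The two frequencies are the extreme fSW values annotated in Fig. 3.15.

```python
def ring_osc_frequency(n_stages: int, t_stage: float) -> float:
    """Oscillation frequency of an ideal N-stage ring oscillator."""
    return 1.0 / (2.0 * n_stages * t_stage)


def stage_delay_for(f_sw: float, n_stages: int = 5) -> float:
    """Per-stage delay needed to oscillate at f_sw (inverse of the relation above)."""
    return 1.0 / (2.0 * n_stages * f_sw)


for f_sw in (9.6e3, 15e6):
    t_stage = stage_delay_for(f_sw)
    print(f"fSW = {f_sw:.3g} Hz -> per-stage delay = {t_stage:.3g} s")
```

Under this idealization the per-stage delay has to span more than three orders of magnitude, which is why a leakage-based, VCTRL-tunable delay element is attractive for nW to μW loads.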
The back-gate biasing scheme for the proposed self-oscillating converter
is shown in Fig. 3.13. The biasing scheme was chosen to allow the bias
voltages to be generated without sacrificing the efficiency of the converter.
For example, VBB (e.g. 1 V) can be generated from VIN, and -VBB (e.g. -1 V)
can be generated from a nW reverse body bias generator [56].
The effect of back-gate biasing on a single switch (i.e. S4) in triode
mode is shown in Fig. 3.13 (c). As Vgs approaches NT voltages, the gain
in the drive strength of S4 due to back-gate biasing becomes much more
significant than at super-threshold voltages. Since all switches in the
self-oscillating topology operate with an NT Vgs (assuming VIN=1
V - 1.2 V), there is strong motivation to apply back-gate biasing.
The effect of back-gate biasing on all switches within the proposed
DC-DC converter (Fig. 3.12) is found by examining the DC-DC converter's
output impedance Ro (see (2.3)). To construct an equation for Ro that
accounts for back-gate biasing, Gtot is needed (in order to calculate
RFSL). From [VII], Gtot can be described by

\[ G_{tot} = k \left( \sum_{i=1}^{m} K_i \frac{W_i}{L_i} \left( V_{GS,i} - V_{th,i} \right) \right) \tag{3.1} \]

where m is the number of switches in a stage, k is the number of
interleaved stages, Li and Wi are the common switch size parameters, Ki is
the technology constant, and VGSi is the voltage applied when the switch
is ON (and in triode mode). Note that decreasing Vth,i by increasing VBB
results in a larger overdrive voltage, and thus, a larger Gtot. By
applying back-gate biasing with the biasing scheme from Fig. 3.13 (b),
(3.1) expands to

\[ G_{tot} = 5 \left\{ K_n \frac{W_n}{L_n} \left( V_{IN} - 2V_{thn} + \gamma_n (2V_{BB} - V_{OUT}) \right) + K_p \frac{W_p}{L_p} \left( -V_{IN} - 2V_{thp} + \gamma_p (-2V_{BB} - V_{IN} - V_{OUT}) \right) \right\} \tag{3.2} \]

Figure 3.14. Analytical prediction of Ro from (2.3) with Kc=1/4, Ks=4, Ctot=135 pF, and
Gtot from (3.2) [VII]. The plot shows Ro, RSSL, and RFSL [Ω] versus fSW [Hz]
for VBB=0 V and VBB=1 V: RSSL decreases due to the larger fSW with VBB=1 V,
and RFSL decreases due to VBB=1 V on the switches. ©2015 IEEE.
Now that Gtot is known, Ro (from (2.3)) is plotted as shown in Fig. 3.14.
The frequency range shown is for higher load powers (above 5 μW); the
range of fSW is estimated from simulations. The application of back-gate
biasing has two important effects on the converter. First, a larger VBB
increases fSW by decreasing the delay of both the inverter pairs (S1/S2
and S3/S4) and the delay cells. Thus, the fSW range is shifted to higher
frequencies for a given VCTRL range. As a result, RSSL decreases. Second,
a larger VBB increases the overdrive voltage of the converter's switches,
and thus, decreases RFSL. The decreases in both RSSL and RFSL give a
lower Ro, and thus, a higher maximum power (POUT,max) since POUT ∝ 1/Ro.
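Both effects can be reproduced numerically. The sketch below evaluates (3.1) and then combines the two limits into Ro. Equation (2.3) is not reproduced in this section, so the code assumes the common SC converter model RSSL = Kc / (Ctot * fSW), RFSL = Ks / Gtot, and Ro = sqrt(RSSL^2 + RFSL^2). Kc, Ks, and Ctot come from the caption of Fig. 3.14; the switch parameters and the 80 mV threshold shift are hypothetical.

```python
import math


def g_tot(k, switches):
    """Eq. (3.1): k * sum(K_i * (W_i/L_i) * (V_GS,i - V_th,i)) over one stage."""
    return k * sum(K * w_over_l * (v_gs - v_th) for K, w_over_l, v_gs, v_th in switches)


def output_impedance(f_sw, g_total, k_c=0.25, k_s=4.0, c_tot=135e-12):
    """Return (Ro, RSSL, RFSL) under the assumed model described above."""
    r_ssl = k_c / (c_tot * f_sw)
    r_fsl = k_s / g_total
    return math.hypot(r_ssl, r_fsl), r_ssl, r_fsl


def stage(v_th):
    # Hypothetical stage: four identical switches, K = 200 uA/V^2, W/L = 20, V_GS = 0.5 V.
    return [(200e-6, 20.0, 0.5, v_th)] * 4


g_nobias = g_tot(5, stage(v_th=0.40))
g_biased = g_tot(5, stage(v_th=0.32))  # assumed: back-gate bias lowers Vth by 80 mV

for label, g in (("VBB = 0 V", g_nobias), ("VBB = 1 V (assumed shift)", g_biased)):
    r_o, r_ssl, r_fsl = output_impedance(10e6, g)
    print(f"{label}: RSSL = {r_ssl:.0f} ohm, RFSL = {r_fsl:.0f} ohm, Ro = {r_o:.0f} ohm")
```

With these placeholder numbers the lower threshold voltage raises Gtot and lowers RFSL, while any increase in fSW lowers RSSL, so Ro falls through both terms.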
Measurement Results
The DC-DC converter's efficiency was measured at VOUT=430 mV and
460 mV for nW to μW loads, as shown in Fig. 3.15 (a). Back-gate biasing
was applied for both VOUT. VCTRL was adjusted over the same power
range as in Fig. 3.15 (a) for VOUT=430 mV. With VBB=0 V, VCTRL
(markers A and B) provides a low fSW to meet low power loads, as shown
in Fig. 3.15 (b). Increasing the back-gate biasing to VBB=1 V gives a higher
fSW over a similar VCTRL voltage range. For a load power over ≈5 μW, it is
essential to use back-gate biasing of at least VBB=1 V in order to maintain
sufficient fSW and switch conductance. Using back-gate biasing, the
self-oscillating converter achieved a minimum efficiency of 75 % for 79 nW
to 200 μW loads and a peak efficiency of 87 %.

Figure 3.15. Measurement results of the proposed converter with back-gate biasing [VII].
(a) Efficiency vs. load power (POUT) at VOUT=430 mV and 460 mV. (b) VCTRL
at VOUT=430 mV; markers A-D are annotated with fSW values between
9.6 kHz and 15 MHz. (c) Efficiency vs. VOUT at POUT ≈5 μW. ©2015 IEEE.
The maximum load power of the self-oscillating converter was further
increased by applying VBB up to 3 V, as shown in Fig. 3.16. VCTRL
was held at a constant 0.7 V to ensure the maximum fSW at each VBB, and
the back-gate bias configuration was the same as in Fig. 3.13 (b). VBB was
swept from 0.5 V to 3 V in 0.5 V steps. The efficiency as a function of
load power for VIN=1.0 V is shown in Fig. 3.16 (a). The measured increase
in load power from VBB=1 V to 2 V was 1.86 x. Assuming FSL operation
(i.e. RSSL ≈ 0), an increase in load power requires a proportional
decrease in Ro. Under this assumption, (2.3) predicts that Ro decreases by
1.6 x when changing from VBB=1 V to 2 V. The efficiency as a function of
load power for VIN=1.2 V is shown in Fig. 3.16 (b). Higher POUT is possible
due to the larger overdrive voltages (i.e. the switches have a larger Vgs).

Figure 3.16. Measured efficiency versus POUT [μW] at (a) VIN=1 V (VOUT=430 mV and
460 mV) and (b) VIN=1.2 V (VOUT=430 mV and 515 mV), with VBB swept from
0.5 V to 3 V in 0.5 V increments [VII]. ©2015 IEEE.
Summary
A fully integrated SC DC-DC converter was presented (Fig. 3.12). The
DC-DC converter achieved a peak efﬁciency of 87 % and was able to achieve
efﬁciency over 75 % for 79 nW to 200 μW loads. Back-gate biasing was
used to increase both the switch conductance and switching frequency at
large loads. The self-oscillating topology proved to be advantageous at
sleep power levels due to its minimal control circuitry. Since the
converter is built from stacked ring oscillators containing inverters, full
synthesis of this design is possible.
An additional back-gate biasing scheme could be beneﬁcial for the self-
oscillating DC-DC converter presented. Instead of requiring two separate
back-gate bias voltages as previously shown, the DC-DC converter could
have used a gate-bulk connection, or DTMOS [57], on all the switches.
This may slightly decrease efﬁciency at active mode power levels due to
increased gate capacitance, but efﬁciency would be slightly improved in
the sleep mode power level (since no additional bias circuitry would be
required). Additionally, switching between different bias schemes would
be fast. For systems in which sleep mode power is dominant, the gate-bulk
back-gate biasing scheme would be worthwhile to examine.
Table 3.3. Performance summary of the 2:1 self-oscillating converter.

Parameter                              Value
Technology                             28 nm UTBB FD-SOI (LVT)
Topology                               self-oscillating step-down SC
iVCR                                   1/2
Tested Input / Output Voltage          1-1.2 V / (0.380-0.485 V)
Cfly                                   135 pF (MIM)
COUT                                   0 pF
fSW,max                                15 MHz
Load Range                             79 nW-200 μW
Load Range in Ratio                    1:2532
Load Range Min. Efficiency (ηMIN)      ηMIN=75%
Efficiency (η) @ Sub-/Near-Vth         75% @ 0.515 V (1), 77% @ 0.46 V (1), 77% @ 0.43 V (1)
Power Density @ η (mW/mm2)             62 @ 0.515 V, 19.2 @ 0.46 V, 24 @ 0.43 V
Efficiency peak                        87% @ 0.46 V (1)
EEFmax                                 47% @ 0.460 V (1)

(1) VBATT=1 V and VBB=1 V

3.3 Implementation Summary
Three fully integrated DC-DC converters with an NT output voltage were
presented. The first DC-DC converter, which was a 3:1 Dickson DC-DC
converter, showed the benefits of the Dickson topology. Next, an improved
3:1 Dickson DC-DC converter was shown to provide efficient conversion
over a wide load range from a 1.55 V Li-ion battery. This DC-DC converter
made use of back-gate biasing to improve the load power range. Finally,
a 2:1 self-oscillating DC-DC converter was given. This DC-DC converter,
which also utilized back-gate biasing, had a self-startup capability, high
conversion efficiency, and a wide load range. The performance of the three
converters is later compared to state-of-the-art SC converters in Table 6.1
(chapter 6). Each of the converters is well suited for adaptive ULE loads.
The next chapter examines the characteristics of adaptive ULE loads.
4. Adaptive Ultra-Low-Energy
Processors
In order to design DC-DC converters for ULE adaptive processors, it is es-
sential to understand the processor characteristics. This chapter presents
the details of three ULE adaptive processor loads. Each processor
operates at NT voltages and uses an adaptive timing scheme to save energy
and operate reliably at ULV. The adaptive schemes are based on
operating a system at a voltage and frequency point at which the timing of
critical paths fails intermittently; these failures are handled through
detection and correction.
A brief overview of ULV operation is ﬁrst given. Next, an introduction
to an adaptive timing scheme, called timing-error detection (TED), is pre-
sented. The details of three implemented processors with TED (or similar
schemes) are then given. Two of the processors were manufactured in 65
nm CMOS and one in 28 nm UTBB FD-SOI. The processor in subsection
4.3.2 is the first-known TED processor able to operate in sub-threshold.
The processor characteristics, which are most relevant to the DC-DC con-
verter design, are highlighted throughout the chapter and summarized at
the end. This chapter gives the most signiﬁcant aspects of the original
work. Additional details can be found in the related scientiﬁc publica-
tions: [IV], [VI], [I], and [VIII].
4.1 Overview
In order for ultra-portable electronics to operate with long lifetimes or
even autonomously, systems must operate with ULE. These systems rely
increasingly on low energy processing in order to reduce the
(energy-dominant) wireless data transmission [58]. This increased focus on
ULE processing has prompted an emerging class of processors that operate at
or near the threshold voltage [9, 14, 15, 21, 46, 59]. These ultra-low
voltages are where the minimum energy point (MEP) of modern digital
systems is located. Operating at the MEP can bring energy savings
of around 10 x (Fig. 4.1). In general, the MEP is dependent on both the
technology and architecture. Processors in older process nodes have an MEP
within the sub-threshold region, whereas newer CMOS processes have an
MEP close to the near-threshold region, as shown in Fig. 4.1. As the leakage
energy (EL) increases for new processes, the MEP shifts right toward NT
voltages [60].

Figure 4.1. Energy per operation and delay as a function of supply voltage in modern
CMOS. The figure marks the sub-Vth region (high sensitivity to PVTA
variation, high delay), the near-Vth region (medium sensitivity, medium
delay), and the super-Vth region (low sensitivity, low delay), together
with the MEP, the ESW, EL, and ET curves, and ~10X and ~50X annotations.
The MEP is largely influenced by the delay. The propagation delay for
an inverter operating in super-threshold is [61]:

\[ t_d = \frac{K C_g V_{DD}}{(V_{DD} - V_{th})^{\alpha_v}}, \tag{4.1} \]

where K is a delay-fitting parameter, Cg is the output capacitance of a
characteristic inverter, and αv is the velocity saturation index [62]. In
sub-threshold, the propagation delay of an inverter is [61]:

\[ t_{d,sub} = \frac{K C_g V_{DD}}{I_o \exp\left( \frac{V_{DD} - V_{th}}{n U_T} \right)}, \tag{4.2} \]

where Io is the ON current of the characteristic inverter, n is the
sub-threshold swing coefficient, and UT is the thermal voltage (kT/q).
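The different voltage sensitivity of (4.1) and (4.2) can be seen by evaluating both for a small change in VDD. The sketch below uses hypothetical parameter values (Vth, the lumped K*Cg product, Io, n, αv) chosen only for illustration; the delay ratios are the meaningful output, not the absolute numbers.

```python
import math


def delay_super(v_dd, v_th=0.35, k_cg=1e-15, alpha_v=1.5):
    """Eq. (4.1), super-threshold delay; only valid for v_dd > v_th."""
    return k_cg * v_dd / (v_dd - v_th) ** alpha_v


def delay_sub(v_dd, v_th=0.35, k_cg=1e-15, i_o=1e-6, n=1.5, u_t=0.026):
    """Eq. (4.2), sub-threshold delay with exponential current dependence."""
    return k_cg * v_dd / (i_o * math.exp((v_dd - v_th) / (n * u_t)))


# Delay penalty for lowering the supply by a small step in each region.
ratio_super = delay_super(0.9) / delay_super(1.0)   # 100 mV step at nominal voltage
ratio_sub = delay_sub(0.25) / delay_sub(0.30)       # 50 mV step in sub-threshold
print(f"super-threshold: 1.0 V -> 0.9 V slows the gate by {ratio_super:.2f}x")
print(f"sub-threshold:   0.30 V -> 0.25 V slows the gate by {ratio_sub:.2f}x")
```

Even though the sub-threshold step is half as large, the exponential term makes the delay penalty several times larger, which is the sensitivity that later motivates adaptive timing schemes.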
To better understand why the MEP is located at such low voltages, both
the switching and leakage energy need to be examined. The energy losses
due to short-circuit currents are negligible due to the exponential MOS
I-V characteristics at ULV [60]. The switching energy (assuming rail-to-rail
swing) is [61]:

\[ E_{SW} = C_{eff} V_{DD}^2 = C_{TOT}\,\alpha\,V_{DD}^2, \tag{4.3} \]

where Ceff is the average effective switched capacitance, α is the activity
factor, and CTOT is the total physical capacitance. The leakage energy is
[61]:

\[ E_L = (I_{LEAK} V_{DD})\,T_{op} \tag{4.4} \]
\[ \phantom{E_L} = W_{eff} K C_g L_{DP} V_{DD}^2 \exp\left( -\frac{V_{DD}}{n U_T} \right), \tag{4.5} \]

where Top is the time to complete an operation and LDP is the critical path
depth in characteristic inverter delays. The total energy per operation is
then expressed as:

\[ E_T = E_{SW} + E_L \tag{4.6} \]
\[ \phantom{E_T} = V_{DD}^2 \left[ C_{eff} + W_{eff} K C_g L_{DP} \exp\left( -\frac{V_{DD}}{n U_T} \right) \right]. \tag{4.7} \]
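Equation (4.7) can be minimized numerically to locate the MEP. The following is a sketch with normalized, hypothetical parameters: `c_eff` is set to 1 and `leak_coeff` stands in for the lumped product Weff*K*Cg*LDP. The search is restricted to 0.2 V to 1.0 V, since the model is not meaningful down to 0 V.

```python
import math


def energy_per_op(v_dd, c_eff=1.0, leak_coeff=7000.0, n=1.5, u_t=0.026):
    """Eq. (4.7) in normalized units; leak_coeff lumps Weff*K*Cg*LDP (hypothetical)."""
    return v_dd ** 2 * (c_eff + leak_coeff * math.exp(-v_dd / (n * u_t)))


def find_mep(v_min=0.2, v_max=1.0, steps=801, **params):
    """Grid search for the supply voltage that minimizes energy per operation."""
    grid = [v_min + i * (v_max - v_min) / (steps - 1) for i in range(steps)]
    v_opt = min(grid, key=lambda v: energy_per_op(v, **params))
    return v_opt, energy_per_op(v_opt, **params)


v_opt, e_opt = find_mep()
saving = energy_per_op(1.0) / e_opt
print(f"VDD,OPT = {v_opt:.3f} V, energy saving vs. 1.0 V = {saving:.1f}x")
```

With these placeholder values the minimum falls at a near-threshold voltage, where the falling switching energy is balanced by the rising leakage energy per operation.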
As shown in Fig. 4.1, EL dominates ET within the sub-threshold and
near-threshold regions because leakage is integrated over long delay
times. Within the super-threshold region, ET is dominated by ESW. The
smallest value of ET is called the MEP. The VDD location of the MEP, or
VDD,OPT, is of relevance for the DC-DC converter design. The VDD,OPT
location is affected by changes in the activity factor (α) of a processor
[60]. However, changes in the activity factor have a lesser impact on the
MEP with scaling since CTOT (from (4.3)) scales by 1/S (S>1). This behavior
implies that the ESW contribution reduces (the MEP moves to the right) and
the flatness of the MEP increases for newer processes. In other words,
uncertainty in VDD has less of an impact on the MEP. The increasingly flat
MEP region translates into an increasingly large operating voltage range
with only a minimal penalty for operating outside the MEP. Therefore, the
MEP region can be defined more generally as the region for which

\[ E_{cy} - E_{cy,MEP} \le x \cdot E_{cy,MEP} \tag{4.8} \]

where Ecy is the energy per cycle at VDD, Ecy,MEP is the energy per cycle at
the MEP, and x is the allowed relative (per cent) change from the MEP. As
further discussed in [III], the MEP region increases with a decreasing α.
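The definition in (4.8) translates directly into a voltage window. The sketch below reuses the energy model of (4.7) with normalized, hypothetical parameters (`c_tot` = 1, `leak_coeff` standing in for Weff*K*Cg*LDP) and reports the window for two activity factors. It illustrates the trend only and does not reproduce the analysis in [III].

```python
import math


def energy_per_cycle(v_dd, alpha, c_tot=1.0, leak_coeff=7000.0, n=1.5, u_t=0.026):
    """Eq. (4.7) with Ceff = alpha * CTOT; all parameter values are hypothetical."""
    return v_dd ** 2 * (alpha * c_tot + leak_coeff * math.exp(-v_dd / (n * u_t)))


def mep_region(alpha, x=0.10, v_min=0.2, v_max=1.0, steps=1601):
    """Return (v_low, v_high) of the voltages that satisfy Eq. (4.8) on a grid."""
    grid = [v_min + i * (v_max - v_min) / (steps - 1) for i in range(steps)]
    energies = [energy_per_cycle(v, alpha) for v in grid]
    e_mep = min(energies)
    inside = [v for v, e in zip(grid, energies) if e - e_mep <= x * e_mep]
    return min(inside), max(inside)


for alpha in (1.0, 0.1):
    v_low, v_high = mep_region(alpha)
    print(f"alpha = {alpha}: MEP region {v_low:.3f} V to {v_high:.3f} V "
          f"(width {1e3 * (v_high - v_low):.0f} mV)")
```

In this simplified model, lowering α shifts the window to higher voltages and widens it slightly, in line with the trend stated above.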
As the size of the MEP region continues to increase for new processes
and for systems with a low to medium activity factor, it allows for innova-
tive energy savings possibilities. Speciﬁcally, it can be utilized to remove
the closed-loop MEP tracking control and achieve the energy savings with
a simpliﬁed open-loop control. Further, the adaptive logic required to mit-
igate the MEP region variance effects allows for reduced design margins
of the circuits associated with the logic. Here, the reduced susceptibility
of the adaptive logic is used to relax the design constraints of the DC-DC
converter.
Although there is decreasing motivation to precisely track the MEP,
some MEP tracking systems have been shown to be useful. For example,
the on-chip energy sensor in [16] is able to dynamically track the MEP of
arbitrary digital circuits under different operating conditions. This sensor
enables savings of 50 % to 100 %. This approach requires energy overhead
(50 x larger than the energy of a 1-tap ﬁlter), and thus, it may need to be
duty-cycled depending on the digital circuit power level.
When operating at the MEP, there are both performance and robust-
ness limitations at low voltages for digital logic [60, 63]. The main reason
for the performance limitation is that process, voltage, temperature, and
ageing (PVTA) variations cause exponential changes in the current in the
sub-threshold region as evident from (4.2). PVTA variations can be clas-
siﬁed as either global (e.g. die-to-die, ageing, or temperature), or local
(e.g. within-die, IR drops, jitter) [64]. The impact of variations requires
large (speed) safety timing margins, which translates into higher energy
consumption, to ensure robustness.
Process variations are of particular concern since they adversely affect
the yield and may require (costly) individual post-fabrication measure-
ments to ensure robustness at ultra-low-voltages. Process variations are
due to die-to-die (lot-to-lot and wafer-to-wafer) variations and within-die
variations (from factors such as nondeterministic placement of dopant
atoms and variation in L) [60, 65]. Within-die variations have a large im-
pact on the transistor’s Vth. The sigma for Vth random doping ﬂuctuations
(RDF) is proportional to (WL)−1/2 [61], and thus, sizing considerations
are important in reducing performance limitations.
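The (WL)^(-1/2) dependence makes the benefit of upsizing easy to quantify. The sketch below assumes the usual area-scaling form sigma = A_vt / sqrt(W*L); the matching coefficient `a_vt_mv_um` and the device dimensions are hypothetical placeholders, so only the ratio between the two results matters.

```python
import math


def sigma_vth(width_um, length_um, a_vt_mv_um=2.0):
    """Vth standard deviation in mV, assuming sigma = A_vt / sqrt(W * L)."""
    return a_vt_mv_um / math.sqrt(width_um * length_um)


base = sigma_vth(0.2, 0.03)      # hypothetical minimum-size device
upsized = sigma_vth(0.4, 0.06)   # W and L both doubled, so the area is 4x larger
print(f"sigma(Vth): {base:.1f} mV -> {upsized:.1f} mV after a 4x area increase")
```

Quadrupling the gate area halves the random Vth spread, which is why moderate upsizing is a common way to regain timing margin at ultra-low voltages.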
Besides the performance limitations, digital logic also has functional
robustness limitations at sub-threshold voltages due to the previously
described PVTA variations (see Fig. 4.1). The high impact of variability
on the ION/IOFF current ratio at such low voltages can lead to degraded
output logic levels [9, 60, 63, 65]. This behavior in turn leads to
functional failures of subsequent gates, as shown in [V].

Figure 4.2. Relationships between output voltage (VDD) and energy for a TED system
with fixed operation frequency. As the error rate increases beyond the point
of first failure (PoFF), the recovery energy grows due to the effort required in
correcting the errors. The figure also marks the traditional safety margin,
the IPC, and the total, processor, and instruction replay energy.
To solve the performance and robustness limitations, adaptive methods
are needed. In super-threshold, a popular solution for overcoming the
need for large safety timing margins has been to use canary (replica) cir-
cuits. However, canary circuits can only compensate for PVTA variations
that are global and slow-changing [64]. The fact that they cannot compen-
sate for local variations (e.g. within-die), means that they are not suitable
for sub- or near-threshold operation. To compensate for both global and
local PVTA variations, timing-error detection (TED) is proposed in the
next section.
4.2 Timing-Error Detection (TED)
TED is an adaptive method that has been shown to reduce PVTA
variation-induced safety margins [64, 66–70], which are traditionally used
across all corners to ensure a sufficient yield. The reduced safety margins
with TED can be turned into power savings (i.e. lower VDD [67]) or a higher
yield [64]. As shown in Fig. 4.2, TED systems can scale the VDD until
the point-of-first-failure (PoFF). In other words, the system operates at
a voltage/frequency point at which the timing of critical paths fails
intermittently. The failed timing occurrences are detected and corrected,
for example, with an instruction replay system. If the error rate is
sufficiently low (e.g. 0.04% in a study by Blaauw et al. [67]), then an
energy consumption benefit is achieved as a result of operating at a lower
voltage. The error rate must be kept low to ensure that the instruction
replay portion of the TED system does not consume too much energy.

Figure 4.3. (a) TED operation with a dynamic node-style error-detection sequential
(EDS) circuit. The transition-generated PULSE signal is used to change the
state of a dynamic node and generate a timing error, or ERR. The timing
diagram shows CLK, D, PULSE, Q, and ERR with the Tdmin, Tdmeet, TED
window, Tdmax, and Tclk intervals. (b) Block diagram of a pipeline-based
TED system.
Important signals from a TED system are shown in Fig. 4.3 (a). To
operate the TED system without timing errors, data (D) should transition
after Tdmin and before the end of the CLK period (Tclk); this is called the
Tdmeet region. If data arrives before Tdmin, false errors result. To
meet this requirement, additional delay elements need to be added to
any paths that violate Tdmin. The delay elements add power consumption
overhead. Data transitions within the TED window are flagged as timing
errors. Data transitions beyond Tdmax are not detected, so the system
misses these timing errors. The timing constraints within a TED system can
be summarized in terms of the duty cycle (d) and period (Tclk) of the CLK
signal:

\[ T_{dmin} = d \cdot T_{clk} \tag{4.9} \]
\[ T_{dmax} = (1 + d) \cdot T_{clk} \tag{4.10} \]
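The four timing regions can be written as a small classifier. This is a sketch based only on (4.9), (4.10), and the description above. It assumes arrival times are measured from the launching clock edge and that the TED window spans from Tclk to Tdmax; the treatment of the exact boundaries and the category names are arbitrary choices for illustration.

```python
def classify_arrival(t_arrival: float, t_clk: float, duty: float) -> str:
    """Classify a data transition time relative to the TED timing regions."""
    t_dmin = duty * t_clk            # Eq. (4.9)
    t_dmax = (1.0 + duty) * t_clk    # Eq. (4.10)
    if t_arrival < t_dmin:
        return "false_error"     # too early: short path violates Tdmin
    if t_arrival <= t_clk:
        return "meets_timing"    # inside the Tdmeet region
    if t_arrival <= t_dmax:
        return "flagged_error"   # late, but inside the TED window
    return "missed_error"        # beyond Tdmax: not detected


t_clk, duty = 10e-9, 0.3  # hypothetical 100 MHz clock with a 30 % duty cycle
for t in (2e-9, 8e-9, 11e-9, 14e-9):
    print(f"arrival at {t * 1e9:.0f} ns: {classify_arrival(t, t_clk, duty)}")
```

The example makes the trade-off behind the duty cycle visible: a larger d widens the TED window, but it also raises Tdmin and therefore the number of short paths that need padding.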
The key component used to flag timing errors is called an error-detection
sequential (EDS) circuit. EDS circuits generate error signals when the
path setup timing fails (i.e. if D transitions within the TED window).
Within a TED system, the EDS circuits are placed at critical logic paths
where timing errors can occur as shown in Fig. 4.3 (b). When timing er-
rors are detected, the operation frequency (fOp) or supply voltage (VDD)
can be adjusted to prevent data transitions at Tdmax, and thus, ensure op-
eration without system failure. The TED window for the EDS circuits can
be tied to the clock signal [67], or it can be generated independently [64].
There are two main types of EDS architecture: a dynamic node [67, 71,
72] and a delayed shadow latch [66, 73]. Of these architectures, the dy-
namic node can achieve a lower power and lower clock node capacitance.
The dynamic node implementation typically uses an inverter delay chain
and a logic gate (e.g. XOR) to produce a signal pulse. The signal pulse,
or PULSE, as shown in Fig. 4.3 (a), is used to flip the state of a dynamic
node and generate a timing error signal. The inverters and logic gates
used to produce the PULSE signal require a high level of precision across
all PVTA variations, especially at sub- and near-threshold voltage levels.
In addition to being robust to PVTA variations, the width of PULSE should
be minimized since it limits the speed of the entire TED system. Designing
the inverters and gates that generate the PULSE signal is one of the most
challenging design tasks within the EDS circuit.
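The delay-chain-plus-XOR idea can be sketched behaviorally. The following Python model is only an illustration under simplifying assumptions: signals are ideal 0/1 samples on a uniform time grid, the inverter chain is reduced to a fixed delay of `delay_steps` samples, and the function names are hypothetical.

```python
def pulse_from_transitions(d_samples, delay_steps):
    """XOR D with a delayed copy of itself.

    The result is 1 for delay_steps samples after every transition of D,
    which is the behavior expected from a delay chain feeding an XOR gate.
    """
    pulse = []
    for i, d_now in enumerate(d_samples):
        # Before the delay line has filled, assume it holds the initial value of D.
        d_delayed = d_samples[i - delay_steps] if i >= delay_steps else d_samples[0]
        pulse.append(d_now ^ d_delayed)
    return pulse


def flags_error(d_samples, window_samples, delay_steps):
    """Return True if PULSE is high while the detection window is high."""
    pulse = pulse_from_transitions(d_samples, delay_steps)
    return any(p == 1 and w == 1 for p, w in zip(pulse, window_samples))


if __name__ == "__main__":
    d = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
    window = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
    print(pulse_from_transitions(d, delay_steps=2))
    print(flags_error(d, window, delay_steps=2))
```

In this model the pulse width equals the chain delay, which illustrates why delay variation across PVTA directly changes the width of PULSE.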
4.3 Implementations
4.3.1 Adder with TED
We designed and fabricated a TED system-level test circuit in 65 nm
CMOS. The test circuit, which is referred to as SystemTest1 in Fig. 4.4
(a), is composed of two main parts: an adder and EDS circuits (called
TDTBsub1). The EDS circuits are used to flag timing errors in the
combinational logic (i.e. the adder). Input data D is first serially loaded
into the 60 input shift registers. The data is then passed to 60 TDTBsub1
registers and then added together. The output of the adder is again passed
through 60 TDTBsub1 latches. From the TDTBsub1 latches, the output
is level-shifted (with levelS) and given to the output shift registers
(shiftO).
The schematic of TDTBsub1 is shown in Fig. 4.4 (b). It is a dynamic
node style EDS and it is similar to the (super-threshold) EDS in [66],
Figure 4.4. (a) SystemTest1 consists of an adder and an EDS circuit called TDTBsub1.
(b) Circuit diagram of TDTBsub1 [IV]. ©2009 IEEE.
except that it is designed to operate in sub-threshold. The operation of
TDTBsub1 is performed by a clock delay chain (CDC), a pulse generator
(PG), a switching network (SN), and a keeper. The CDC provides signals
for the transmission gate (TG). The PG provides a short voltage pulse
signal called PULSE that occurs at each transition of D. If D transitions
at the same time CLK is HIGH, the PULSE signal from PG allows the
SN to pull node K low. As K is pulled low, the keeper switches states
and ERROR is driven HIGH. During a reset of ERROR (at the negative
edge of CLK), P1 is able to reliably drive K HIGH by breaking the keeper
feedback with the TG [74]. All of TDTBsub1’s (HVT) transistor widths
were up-sized according to the sizing metric of [61]. To reduce leakage,
most transistors use a longer gate length (Long-Le, i.e. nominal +
10%), since Long-Le transistors have three times lower leakage with only
a 10 % reduction in speed [75].
Measurement results of SystemTest1 are shown in Fig. 4.5. As ex-
pected, reducing VDD provides large reductions in energy per operation. If
a DC-DC converter were to supply VDD, there is an important observation
that can be made from the TED system measurements. The energy per
operation becomes less sensitive to variations in VDD for sub- and near-
Figure 4.5. (a) Energy per operation as a function of supply voltage (VDD) and operation
frequency (fop); (b) Error rate (R) as a function of VDD and fop. As VDD is
reduced below 0.4 V, the error rate starts to increase for low fop due to the
leakage in N1 and N2 [IV]. ©2009 IEEE.
Figure 4.6. 8-bit sub-threshold processor with TED. The timing error signal propagation
paths (EPP) are highlighted in blue and the critical paths (CP) in red [I].
©2012 MDPI.
threshold regions. For example, the energy per operation increases 22.1
% and 33.3 % from 0.25 - 0.3 V and 0.3 - 0.35 V, respectively.
4.3.2 8-bit Processor with TED
To expand the previously described TED test system, an 8-bit processor
was designed with TED (Fig. 4.6). The processor, which is described
in [I], is capable of sub-threshold operation, has an 8-bit commercially-
compatible core, and is built in 65 nm CMOS. It is the ﬁrst-known TED
processor able to operate in sub-threshold. The architecture of the CPU
is an accumulator-based style in which the second operand is always the
accumulator register. The processor has a three-stage pipeline that can
generate timing errors from EDS circuits within the critical path. There
is a total of 20 EDS circuits; eight of them are in the accumulator register,
eight of them are in the register ﬁle write buffer, and four of them are
used for the arithmetic and logic unit (ALU) ﬂags.
Figure 4.7. (a) Bulk to drain PMOS and cross section; (b) STSCL; (c) Voltage swing con-
trol (VSC).
The EDS circuit for the 8-bit processor uses sub-threshold source-coupled
logic (STSCL) rather than traditional static CMOS. Depending on a ULE
digital system’s logic depth, leakage current, activity factor and operation
frequency, STSCL can have advantages over static CMOS [76]. STSCL
is targeted for ultra-low-power circuits which have relaxed speed require-
ments.
Similar to source-coupled logic (SCL) [77], an STSCL gate is composed
of a network of differential NMOS pairs, adjustable PMOS loads (M3, M4)
with output resistances of RP, and an adjustable tail current ISS as shown
in Fig. 4.7 (a). The NMOS pairs are used to construct logic gates. ISS and
the PMOS loads are used to generate a proper voltage swing within the
logic gates. The voltage swing is defined as VSW = |Vout1 − Vout1′| = RP · ISS.
The bulk connection of the PMOS load is what differentiates SCL and
STSCL. As shown in Fig. 4.7 (b), the bulk is connected to the drain
(source) in STSCL (SCL). By connecting the bulk to the drain in STSCL,
a more linear IDS-VDS characteristic is achieved [76, 78].
The bulk to drain connection does have one inherent limitation: the
voltage swing is limited to 400-500 mV to ensure that the source to bulk
diode does not become forward biased. However, in STSCL, the minimum
voltage swing is well below this limitation. With low ISS, the NMOS pairs
operate in sub-threshold, and thus, require a minimum voltage swing as
low as 4 · n · UT, or ≈ 150 mV (at room temperature and n = 1.5). The voltage
swing is maintained over global variations by dynamically adjusting the
size of the (PMOS load) RP and the magnitude of ISS.
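The 4 · n · UT estimate can be checked numerically. The sketch below assumes the usual thermal voltage UT = kT/q and uses VSW = RP · ISS from the definition of the voltage swing; the function names and the 1 nA tail current are illustrative assumptions, not measured values.

```python
BOLTZMANN = 1.380649e-23            # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin):
    """Thermal voltage UT = kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELEMENTARY_CHARGE


def min_swing(temp_kelvin, n):
    """Minimum STSCL voltage swing 4*n*UT in volts (n is the sub-threshold slope factor)."""
    return 4.0 * n * thermal_voltage(temp_kelvin)


def load_resistance(v_swing, i_ss):
    """PMOS load resistance RP needed for a given swing, from VSW = RP * ISS."""
    return v_swing / i_ss


if __name__ == "__main__":
    v_min = min_swing(temp_kelvin=300.0, n=1.5)
    print(f"minimum swing at 300 K: {v_min * 1e3:.1f} mV")
    # Illustrative tail current of 1 nA with a 200 mV swing.
    print(f"RP for 200 mV at 1 nA: {load_resistance(0.2, 1e-9) / 1e6:.0f} MOhm")
```

At 300 K the result is roughly 155 mV, in line with the ≈ 150 mV quoted above. The second calculation shows why the PMOS load must behave as a very large resistance at nanoampere bias currents.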
Figure 4.8. (a) TEDsc schematic; (b) TEDsc timing diagram.
The size of RP and the magnitude of ISS are both adjusted by the voltage
swing control (VSC) block shown in Fig. 4.7 (c). The VSC decreases the
dependence on global variations (e.g., supply noise, temperature ﬂuctua-
tions, and ageing) by adjusting VP through negative feedback. The bias
voltage (VP ) from one VSC can be used for a large number of TEDsc gates
[79]. The VSC for the EDS circuits is composed of a two-stage, Miller-
compensated opamp (ASW ). The opamp is able to maintain an open-loop
gain of 40 dB for all the global process corners (TT, FF, SS, SF, and FS).
Simulations at -40◦C to 90◦C showed that the VSC generated a VP to en-
able correct functionality of TEDsc.
The TEDsc schematic and timing diagram are shown in Fig. 4.8. TEDsc
is able to detect timing errors when transitions of data D occur within
the TED window. During a transition of D under a TED window, the
pull-down network (SPDN ) receives signals from the β-Delay block to pull
ERRn low, and consequently, to drive ERR high. After a delay (β), the new
values of ERRn and ERR are latched until an error reset (ERR–reset) at
the negative clock edge. The error reset is performed with the inverted
clock signal CLKd that is applied to the reset pull-down block SR.
TEDsc was measured within a test digital system consisting of three
TEDscs and combinational logic. The clock frequency of the digital sys-
tem was ﬁxed at 10.37 kHz. TEDsc and VSC used the following settings:
Vdd,scl=400 mV and VL=200 mV. Thus, VSW=Vdd,scl-VL=200 mV. With these
settings, the probability of a timing error for the three TEDscs was
measured as a function of time as shown in Fig. 4.9. A total of 500 points
were measured over one clock cycle. At each point there were 16384 (rising
and falling) transitions.
[Figure 4.9 plots: error probability (Pe) versus normalized time for TEDsc0, TEDsc1, and TEDsc2 with falling and rising D; the CLK LOW and CLK HIGH regions are marked.]
Graph  VDD     fCLK       tdD-ERR  ITEDsc   TEDsc power
(a)    300 mV  10.37 kHz  ~15 FO4  301 pA   150 pW
(b)    300 mV  10.37 kHz  ~3 FO4   1.56 nA  655 pW
Figure 4.9. Measurements of TEDsc with ITEDsc of (a) 300 pA and (b) 1.5 nA.
Figure 4.10. Energy per operation of Core 1 (with TED) and Core 2 (without TED).
©2012 MDPI.
Two bias currents were used in the measurement of Fig. 4.9. At the
lower bias current of 300 pA (Fig. 4.9 (a)), the probability of a timing
error is closer to the positive edge of CLK and farther from the negative
edge of CLK. This behavior is caused by two effects. First, a smaller bias
current means that the time required for PULSE and CLK to be HIGH
simultaneously is increased. Second, the effects of variation are more
evident at the lower bias current since transistors within the NMOS network
are deeper in sub-threshold. Recent techniques [80] could be applied in
future implementations of TEDsc to reduce this variation. In summary,
the power in TEDsc can be reduced by using lower bias currents, with the
tradeoff of increased variability.
To understand the beneﬁts of using TEDsc in a digital system, a TED-
enabled core (Core 1) and a non-TED core (Core 2) were designed in 65 nm
CMOS. Core 1 used 20 TEDscs. The energy per operation for both cores
is shown in Fig. 4.10. At 300 mV, Core 1 (TED) uses 28% less energy
per operation than Core 2. If Core 1 is supplied by a DC-DC converter,
Figure 4.11. Measured processor characteristics [VIII]. ©2015 IEEE.
the ﬂatness of the energy consumption between 300 mV and 400 mV for
Core 1 implies relaxed constraints on ripple and regulation. Additional
measurement details of the TED system can be found in [I] and [81].
4.3.3 32-bit Processor
Expanding on the previous two TED systems led to an improved adaptive
processor capable of sub- and near-threshold operation. The processor
is a customized 32-bit LatticeMico32 RISC [82] with timing-error-prevention
(TEP) in 28 nm UTBB FD-SOI. TEP is similar to TED in
that it detects failed critical paths. However, TEP uses adaptive timing
margining rather than instruction replay [9]. The author did not design
the processor but did contribute to system measurements of the processor
[VIII]. The goal of this subsection is to highlight the characteristics of the
processor that affect the DC-DC converter design.
Measurements of the processor’s power distribution, energy, performance,
and energy-delay product (EDP) with two back-gate bias conﬁgurations
(BB1, BB2) are shown in Fig. 4.11 (a)-(d). BB1 (VBBP = -0.5 V, VBBN = 0 V)
was used for enhanced performance while BB2 (VBBP = 0 V, VBBN = -0.5 V)
was used for low leakage. Larger back-gate bias voltages could have been
applied, but the I/O pad limited the voltage to +/- 0.5 V.
With regard to DC-DC converters, there are three observations from
the processor measurements in Fig. 4.11. First, even small changes in
back-gate bias generate large changes in the operation frequency, and
thus, load power. Similar ULE processors in UTBB FD-SOI have shown
large changes in performance and load power due to the strong effects of
back-gate biasing at low voltages [83]. A DC-DC converter needs to op-
erate efﬁciently over these load power changes. Second, the energy per
operation is ﬂat at sub- to near-threshold voltages (Fig. 4.11 (b)). Third,
as shown in Fig. 4.11 (d), the regulation of VDD is relaxed in terms of
processor functionality. For example, if the processor is operating at 20
MHz and the DC-DC converter is supposed to regulate to VDD=0.4 V, but
the actual VDD is 0.380 V, then the error rate is higher than expected.
However, the inaccuracy in the regulation does not cause a system failure
(since the TEP adaptive processor can handle high error rates). If
the error rate is too high, and consequently causes excessive energy
consumption, then a closed-loop controller can use the measured error rate
to adjust the operation speed (and/or VDD) [64].
4.4 Summary
A brief overview of ULV operation was first given. The benefits and
challenges associated with this region were explained. To achieve
robustness and ULE at ultra-low voltages, adaptive techniques such as TED
can be used. An introduction to TED was presented to give the neces-
sary background for three implemented TED processors. The ﬁrst-known
TED processor with sub-threshold operation was shown. Characteristics
of ULE processors relevant to DC-DC converter design were highlighted
throughout the chapter. The ﬂatness in TED energy per operation curves
near the MEP and adaptivity of the load allows for relaxed design con-
straints in the DC-DC converter. The next chapter focuses on designing a
DC-DC converter/processor system.
5. Co-Design of DC-DC Converters and
Adaptive Processors
The previous two chapters presented independent characteristics of DC-DC
converters and ULE processors. The goal of this chapter is to identify
the interdependencies of the DC-DC converter and the ULE processor load.
In other words, the chapter examines how these blocks operate together
and presents solutions to minimize the total system energy per operation.
The input voltage considerations for the DC-DC converter are presented
first. These affect a number of different design choices for the DC-DC
converter. The input voltage influences the system energy per operation,
and therefore, it is important to consider. Typically, DC-DC converters
use a fixed input voltage (e.g. 1 V), or a battery for their input voltage.
Two of the following DC-DC converters use a fixed input while the final
DC-DC converter uses a prototype 1.55 V Li-ion battery input. The low
and flat discharge characteristics of the prototype battery are well-suited
for future ULE systems.
The DC-DC converter steps down the battery voltage and regulates to
an NT output voltage. Both traditional and new regulation techniques for
the DC-DC converter are given in this chapter. A new technique called
SIR shows promising results for ULE processors. Next, system measure-
ment results show how the choices in input voltage, DC-DC converter
designs, and the ULE processor performance inﬂuence the system energy
per operation. Measurements with the prototype 1.55 V Li-ion battery,
a DC-DC converter, and a 32-bit adaptive processor show that ultra-low
(system) energy is possible. Finally, a summary is given to highlight the
most important considerations for DC-DC converter/ULE processor sys-
tems.
The most signiﬁcant aspects of the original work are given in this chap-
ter and additional details can be found in the related scientiﬁc publica-
tions [III], [IX], and [VIII].
[Figure 5.1 plots: VBAT [V] versus specific capacity [mAh/g]. (a) 25°C with ICPU,MAX = 300 μA and 2 × ICPU,MAX. (b) ICPU,MAX at 25°C, −20°C, and 70°C; at −20°C the battery operates within VBAT,range,−20°C for more than 95% of the total discharge time (Tdis,−20°C).]
Figure 5.1. Measured discharge characteristics of the prototype 1.55 V Li-ion battery
proposed in [36]. The battery has a flat discharge curve even with variations
in load current (i.e. ICPU) and temperature. This same battery is used as
the input voltage to the DC-DC converter in section 5.3.3. ©2015 IEEE.
5.1 Input Voltage Considerations
The input voltage to the DC-DC converter affects the system energy con-
sumption. First, the larger the range of discharge voltage, the more chal-
lenging it is to achieve high efﬁciency in the DC-DC converter since more
topologies and control circuitry are required. Second, as discussed in
chapter 2, a higher input voltage requires a lower VCR. Achieving high
efﬁciency with a low VCR is not possible with SC DC-DC converters.
Many portable applications use Li-ion batteries, which operate from
2.9 V - 4.2 V and nominally at 3.6 V. Traditional Li-ion batteries are an
adequate solution for older portable electronics with 1 V supplies and larger
breakdown voltages. But for future portable electronics that require near-
threshold voltages (0.3 - 0.5 V) and have sub-1.1 V breakdown voltages,
traditional Li-ion batteries are not a good solution. The VCR is too low
to achieve high efﬁciency and the high nominal voltage may cause over-
stress on the DC-DC converter’s switches [22]. To date, conversion from a
traditional Li-ion down to a near-threshold voltage has not been shown.
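The voltage conversion ratios involved can be compared with a short calculation. The sketch assumes that VCR is defined as VOUT/VIN (consistent with a higher input voltage requiring a lower VCR) and uses a 0.4 V near-threshold target as an illustrative output voltage; the function name is hypothetical.

```python
def vcr(v_out, v_in):
    """Voltage conversion ratio, assumed here to be VOUT / VIN."""
    if v_in <= 0:
        raise ValueError("v_in must be positive")
    return v_out / v_in


if __name__ == "__main__":
    V_OUT = 0.4  # illustrative near-threshold target in volts
    batteries = (
        ("traditional Li-ion, minimum", 2.9),
        ("traditional Li-ion, nominal", 3.6),
        ("traditional Li-ion, maximum", 4.2),
        ("prototype Li-ion, nominal", 1.55),
    )
    for label, v_in in batteries:
        print(f"{label}: VIN = {v_in} V -> VCR = {vcr(V_OUT, v_in):.3f}")
```

Under these assumptions the traditional Li-ion range needs a VCR of roughly 0.10 to 0.14, while the 1.55 V battery needs about 0.26, which is more than twice the value at the 3.6 V nominal voltage.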
Based on the previously mentioned challenges, there is strong motiva-
tion to use a battery that has similar (volumetric energy density) charac-
teristics to the traditional Li-ion but operates with a lower nominal volt-
age. One such battery is the prototype 1.55 V Li-ion battery proposed
in [36]. This battery also has the advantage of a ﬂat discharge curve as
shown in Fig. 5.1. We present measurements with this battery later in
section 5.3.3.
[Figure 5.2 annotations: (a) Vref = constant; VCR maximum: unable to regulate to a fixed VDD for lower VBAT; VCR minimum: falloff in efficiency for larger VBAT. (b) Vref is set from VBAT, KOCR, and (1/N); VCR maximum and VCR minimum are limited by the converter architecture. Both panels span VBAT = 1.1 V to 1.5 V and mark VDD,OPT.]
Figure 5.2. (a) Traditional and (b) scaled-input regulation techniques. ©2015 IEEE.
5.2 DC-DC Converter Regulation Techniques
Traditional Versus Scaled-Input Regulation (SIR)
For changes in the input voltage or load power, the DC-DC converter
needs to regulate to the desired VOUT. Traditional regulation for SC DC-DC
converters requires that the load has a fixed VOUT (i.e. a constant Vref
within the feedback circuitry) at each VIN. As shown in Fig. 5.2 (a), the
peak conversion efficiency can then only be reached at certain points of
VIN. Therefore, it is difficult to maintain high average efficiency for
large changes in VIN. In order to maintain high efficiency over a large
range of VIN, Vref can be adjusted linearly with VIN. In other words, VOUT
scales linearly with changes in VIN and the VOUT/VIN ratio is constant.
Allowing for a nearly constant VOUT/VIN for changing VIN enables ηmax
to be achieved for a wide VIN range. This scaling technique is called
scaled-input regulation (SIR). A conceptual diagram of the SIR technique
is shown in Fig. 5.2 (b). The SIR technique has been shown to provide
high efficiency across a large range of VIN [III]. Vref can be adjusted
linearly with VBatt using a resistor or diode voltage divider connected to
VBatt [84]. The benefits of the SIR technique are also discussed in [IX].
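A first-order comparison of the two regulation schemes can be sketched as follows. The model is an assumption made for illustration: it treats the converter as an ideal 1/N SC stage whose efficiency is bounded by the ratio of VOUT to the no-load output VIN/N, and it ignores switching and control losses. The SIR reference follows Vref = KOCR · VIN/N, the form used for the 3:1 converter in section 5.3.2; the 410 mV fixed output and KOCR = 0.9 are example values, and the function names are hypothetical.

```python
def ideal_sc_efficiency(v_out, v_in, n):
    """Efficiency bound of an ideal 1/N SC converter.

    Returns None when v_out is above the no-load output v_in / n,
    i.e. when this topology cannot regulate to v_out at all.
    """
    v_no_load = v_in / n
    if v_out > v_no_load:
        return None
    return v_out / v_no_load


def sir_output(v_in, n, k_ocr):
    """Output voltage under scaled-input regulation: K_OCR * VIN / N."""
    return k_ocr * v_in / n


if __name__ == "__main__":
    N, K_OCR, V_FIXED = 3, 0.9, 0.41
    for v_in in (1.1, 1.2, 1.3, 1.4, 1.5):
        eta_trad = ideal_sc_efficiency(V_FIXED, v_in, N)
        eta_sir = ideal_sc_efficiency(sir_output(v_in, N, K_OCR), v_in, N)
        trad = "cannot regulate" if eta_trad is None else f"{eta_trad:.1%}"
        print(f"VIN = {v_in:.1f} V  traditional: {trad}  SIR: {eta_sir:.1%}")
```

In this idealized model the fixed-output bound falls as VIN rises and regulation is lost once VIN/N drops below the target, whereas the SIR bound stays at KOCR over the whole range. Real converters add losses on top of both curves.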
[Figure 5.3 (d) measurements:]
Config  fcpu      VDD     TBE rate
BB1     25 MHz    0.4 V   0
BB1     21.6 MHz  0.39 V  0
BB1     18.7 MHz  0.38 V  0
BB1     16.1 MHz  0.37 V  0
BB1     13.8 MHz  0.36 V  0
BB1     11.9 MHz  0.35 V  1000
BB1     10.2 MHz  0.34 V  over 1000
BB2     11.3 MHz  0.4 V   0
BB2     9.85 MHz  0.39 V  0
BB2     8.51 MHz  0.38 V  1
BB2     7.34 MHz  0.37 V  overflow
Figure 5.3. (a) Threshold voltage tracking loop within a digital CMOS system. (b)
Conceptual diagram describing how the threshold voltage is found within the
system. The key observation is that the threshold can be found by detecting
a mismatch in Td,RO and Td,uPmin. (c) Threshold voltage tracking loop
flowchart. (d) Measurement results of the concept with a 32-bit (TEP)
processor and an ideal voltage source.
Threshold Voltage Tracking Loop
The threshold voltage tracking loop is shown at the system level in Fig.
5.3. It is composed of a digital system with timing-error prevention (TEP),
a critical path replica RO, a threshold voltage detector, and a DC-DC
converter. An important detail in this system is that the digital system and
the critical path replica RO do not have the same threshold voltage (i.e.
Vth,uP ≠ Vth,RO). The goal of the tracking loop is to find the digital
system’s threshold voltage (Vth,uP). This goal is accomplished by identifying
the VDD at which the error rate rises above a (typical) rate of 0.1 %, as
explained below in more detail.
The ﬁrst step in the tracking loop is the delay initialization. The delay of
the critical path replica RO signal (IN) is tuned (by adding or removing RO
stages) to match the minimum delay of the digital system. In other words,
Td,IN = Td,uPmin after the delay initialization (even though the critical path
replica RO and digital system have different threshold voltages). Note
that the delay initialization is done at a super-threshold VDD. The Td,IN
and Td,uPmin have matched delay until VDD ≈ Vth,uP. At this point, the
error rate rises quickly since the exponential delay change in Td,uPmin
takes effect. The VDD at which the error rate rises beyond 0.1 % is
then flagged as the location of the digital system’s threshold voltage. Since
the critical path replica RO has a lower threshold voltage than the digital
system, Td,IN still follows a quadratic change in delay (until VDD ≈
Vth,RO).
The threshold voltage tracking loop flowchart is shown in Fig. 5.3 (c).
After a sample reset, there is a sampling time in which the error rate of the
digital system is sampled. If the error rate is less than or equal to, for
example, 0.1 %, the threshold voltage tracking loop lowers VDD,REG. For VDD
above Vth,uP, a change in VDD,REG produces an (approximately) quadratic
change in the critical path replica RO delay. The sample reset and
sampling of the error rate then repeat. As this cycle continues, VDD,REG will
eventually reach Vth,uP. At this point, the error rate is greater than 0.1 %
and the threshold voltage of the digital system is detected.
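The flowchart can be summarized as a short loop. The sketch below is a simplified model, not the implemented controller: `sample_error_rate` is a hypothetical callback standing in for the sample reset and error-rate measurement, and the 10 mV step and the toy error model in the usage example are assumptions.

```python
from typing import Callable


def track_threshold(
    sample_error_rate: Callable[[float], float],
    v_start: float,
    v_step: float = 0.01,
    limit: float = 0.001,
    v_min: float = 0.0,
) -> float:
    """Lower VDD,REG step by step until the sampled error rate exceeds the limit.

    Returns the VDD,REG at which the error rate first rises above `limit`
    (0.1 % by default), which the loop treats as the detected threshold voltage.
    """
    v_reg = v_start
    while v_reg > v_min:
        if sample_error_rate(v_reg) > limit:
            return v_reg
        # Error rate still low: decrease the regulated supply and resample.
        v_reg = round(v_reg - v_step, 6)
    raise RuntimeError("error rate never exceeded the limit above v_min")


if __name__ == "__main__":
    # Toy error model (assumption): no errors above 0.355 V, many errors below.
    def toy_error_rate(vdd: float) -> float:
        return 0.0 if vdd > 0.355 else 0.5

    print(track_threshold(toy_error_rate, v_start=0.40))
```

The sketch stops at the first detection. The flowchart in Fig. 5.3 (c) instead raises VDD,REG when the error rate is too high, so that the supply keeps tracking the threshold over time.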
The threshold voltage detector concept was confirmed with measurements
of a 32-bit microprocessor and an ideal voltage source. The results are
shown in Fig. 5.3 (d). The frequency was reduced approximately
quadratically until an error rate increase was observed. The point at which the
error rate increases is (approximately) the threshold voltage (Vth,uP). The
first body bias configuration (BB1) results in a smaller threshold voltage
than BB2. Measurements confirm that the threshold voltage with BB1 is
detected at a lower voltage than with BB2; this confirms the threshold voltage
detector concept.
5.3 System Results
5.3.1 Ring Oscillator Load
A DC-DC converter and ring oscillator load are shown in Fig. 5.4. The DC-DC
converter’s SCN has a total fly capacitance of 200 pF and eight charge
transfer switches. The converter uses two-phase switching and a duty
cycle of 50 %. Although not shown, the control circuitry is composed of
drivers, a non-overlap clock generator, and level shifters. The DC-DC con-
Figure 5.4. Test system with a series-parallel 3:1 DC-DC converter. VREF was at the
digital load’s MEP for the traditional constraint and was set to VOCR for the
new constraint.
Figure 5.5. Effects of SIR and traditional regulation on (a) the efﬁciency of the DC-DC
converter and (b) energy per cycle of the system.
verter can be conﬁgured for traditional regulation or with the (NEW) SIR
technique. Traditional regulation is achieved by applying a ﬁxed VOUT to
VREF . SIR is achieved by connecting a voltage to VREF that scales with
VIN .
The test system from Fig. 5.4 was simulated with VIN from ≈ 1-1.55
V. The motivation for the input voltage is based on the prototype battery
from [36]. The SIR technique results in a higher efﬁciency than the tradi-
tional regulation over most of this VIN range. Additionally, the usable VIN
range with the SIR technique is larger. This characteristic is especially
pronounced when KOCR is reduced. A lower KOCR means lower load power
at each VDD, and thus, it is easier for the DC-DC converter to operate.
The effects of the DC-DC converter’s efﬁciency on the (system) energy
per cycle are shown in Fig. 5.5. The SIR technique scales VOUT with VIN
Figure 5.6. 3:1 DC-DC converter and TEP load schematic. ©2015 IEEE.
and gives lower energy per operation for most VIN voltages. This behavior
allows the converter to achieve close to ηmax at each VIN . The traditional
regulation uses a ﬁxed VREF=410 mV, or the MEP of the ring oscillator.
Using the SIR technique results in minimum energy savings of 8 % over
the VIN range [III]. Overall, the results of this system simulation show
that by taking advantage of the digital load’s energy proﬁle (i.e. with
SIR), energy savings can be achieved. As the energy per operation curves
ﬂatten with scaling (chapter 4), the SIR technique will become even more
beneﬁcial. In addition, as ultra-portable applications shift toward sub-5
μW load powers, the beneﬁt of having reduced control circuitry (due to
the SIR technique) will become more inﬂuential in improving efﬁciency at
such low power levels.
5.3.2 TEP Load
To provide a more realistic load for the DC-DC converter, a 3:1 Dickson
DC-DC converter was simulated with an adaptive TEP load in 28 nm
FD-SOI (Fig. 5.6). The DC-DC converter is the same one used in section
3.2.2. However, a resistor divider was added to enable the SIR technique.
The resistor divider generates a reference voltage for the feedback that is
proportional to VBatt. The load for the DC-DC converter is a TEP adaptive
load. It includes a dual-phase latch pipeline with 5 stages, time borrow
detection circuitry, and an adaptive clock generator. Additional details of
the TEP adaptive load are found in [IX].
The simulation results of the DC-DC converter/TEP adaptive load system
with local variations are shown in Fig. 5.7. The DC-DC converter uses
the SIR technique to produce VDD = (VBatt/3) · KOCR, where KOCR is 0.9.
The average, best, and worst Ecyc of the DC-DC converter/TEP load sys-
[Figure 5.7 annotations: energy per cycle (normalized) and efficiency [%] versus VBatt [V] from 1 V to 1.5 V at α = 30%. RSD values of 13.8%, 12.5%, 9.7%, 7.3%, 5.9%, and 6.3% are annotated for the non-adaptive load and 10.2%, 7.6%, 7.3%, 6.4%, 5.0%, and 4.0% for the adaptive load. Average efficiencies of 81.6% and 81.1% over VBatt = 1−1.5 V are annotated in panel (c).]
Figure 5.7. The average, best, and worst results from 10 Monte-Carlo runs on the DC-DC
converter with (a) a non-adaptive TEP load and (b) an adaptive TEP
load. Relative standard deviation (RSD) values are annotated in the boxes.
(c) The DC-DC converter average efficiency of 10 Monte-Carlo iterations at
each VBatt.
Figure 5.8. Schematic of the proposed battery/DC-DC converter/CPU system from [VIII].
©2015 IEEE.
tem with Monte-Carlo iterations are shown. The comparison of Fig 5.7
(a) and 5.7 (b) not only shows that the average (system) energy per opera-
tion is less with the TEP adaptive load, but also that the deviation in the
average energy per operation is reduced with the adaptive system.
The efficiency of the DC-DC conversion from the previous local-variation
simulation is shown in Fig. 5.7 (c). The DC-DC converter is able to
maintain an efficiency over 78 % for VBatt = 1-1.5 V with the TEP adaptive
load. We chose an α of 30 % as a worst-case scenario for the DC-DC
converter: at α = 30 %, the power drawn by the TEP adaptive load is the
largest. The DC-DC converter was thus sized to meet the (average) power
demand of the load at α = 30 %. For α less than 30 % within the pipeline,
the power at the load decreases, and the efficiency of the DC-DC converter
is greater than or equal to the conversion efficiency at α = 30 %.
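As a rough illustration of the SIR set-point used in these simulations, the sketch below evaluates VDD = (VBatt/3)*KOCR over the simulated battery range. The helper names `sir_vdd` and `ideal_efficiency` are hypothetical, and the efficiency bound assumes an ideal 3:1 switched-capacitor converter whose no-load output is VBatt/3. Under that assumption, regulating to a fixed fraction KOCR of the no-load output caps the charge-sharing efficiency at KOCR regardless of VBatt. Switching and control-circuitry losses, which the simulated efficiencies include, are not modelled here.

```python
# Sketch of the SIR set-point, assuming an ideal 3:1 switched-capacitor converter.
# sir_vdd and ideal_efficiency are hypothetical helper names, not from the thesis.

K_OCR = 0.9      # ratio of regulated VDD to the ideal no-load output (value from the text)
DIVIDE_BY = 3    # 3:1 step-down topology


def sir_vdd(v_batt: float, k_ocr: float = K_OCR) -> float:
    """Return VDD = (VBatt / 3) * KOCR in volts."""
    return (v_batt / DIVIDE_BY) * k_ocr


def ideal_efficiency(v_batt: float, k_ocr: float = K_OCR) -> float:
    """Return VDD / (VBatt / 3), the ideal bound, which reduces to KOCR."""
    return sir_vdd(v_batt, k_ocr) / (v_batt / DIVIDE_BY)


if __name__ == "__main__":
    for v_batt in (1.0, 1.1, 1.2, 1.3, 1.4, 1.5):
        print(f"VBatt={v_batt:.1f} V -> VDD={sir_vdd(v_batt):.3f} V, "
              f"ideal bound={ideal_efficiency(v_batt):.0%}")
```

Because the bound does not depend on VBatt in this idealized model, any variation of efficiency with VBatt must come from the unmodelled loss terms.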
5.3.3 32-bit Processor with TEP
A prototype Li-ion battery, fully integrated switched-capacitor DC-DC con-
verter, and CPU are proposed in Fig. 5.8. The main goal of the system is
to operate at or near the CPU’s minimum energy point despite changes
in temperature or battery voltage. The 3:1 DC-DC converter steps down
the Li-ion battery voltage to near-threshold voltages. The battery has an
improved (volumetric) energy density over traditional Li-ion batteries
(nominal 3.6 V), a long cycle life, and spends most of its discharge time
with minimal voltage variation (≈ 100 mV).
The prototype battery's characteristics allow for a more efficient DC-DC
converter. The low nominal voltage increases the VCR (as compared to a
traditional Li-ion battery). As a result, the DC-DC converter's SCN
requires fewer charge-transfer switches and capacitors, and can utilize
thin-oxide transistors without breakdown issues. Besides the advantageous
nominal voltage, the flat discharge curve of the battery allows a
single-topology SC DC-DC converter to meet the output voltage requirements.
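To make the VCR argument concrete, the following sketch compares the conversion ratio needed from a traditional 3.6 V Li-ion cell and from the 1.55 V prototype battery. It assumes VCR is defined as VOUT/VIN and uses a 0.4 V target output purely as an illustrative near-threshold value; the function name `vcr` is hypothetical.

```python
# Illustrative comparison of voltage conversion ratios (VCR = VOUT / VIN).
# The 0.4 V target is an assumed near-threshold set-point for illustration.

def vcr(v_out: float, v_in: float) -> float:
    """Return the voltage conversion ratio VOUT / VIN."""
    return v_out / v_in


if __name__ == "__main__":
    v_out = 0.4
    for name, v_in in (("traditional Li-ion", 3.6), ("prototype Li-ion", 1.55)):
        print(f"{name}: VIN={v_in} V, VCR={vcr(v_out, v_in):.3f}")
    # Ideal no-load output of a 3:1 topology over a ~100 mV battery swing
    for v_bat in (1.5, 1.55, 1.6):
        print(f"VBAT={v_bat} V -> ideal 3:1 output {v_bat / 3:.3f} V")
```

The larger VCR of the low-voltage battery means a shallower step-down ratio suffices, which is consistent with the reduced switch and capacitor count described above.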
Achieving high energy efficiency in a battery/DC-DC converter/CPU system
requires co-design. The energy profile of the CPU is the starting point
of the co-design. This energy profile is constructed by simulating the
current and maximum frequency at each VOUT. The DC-DC converter needs
one or more conversion ratios that allow for efficient down-conversion
from the battery voltage range to a target VOUT. The target VOUT depends
on whether the CPU is throughput-constrained or energy-constrained.
This thesis is mostly focused on energy-constrained systems, in which
regulating at or slightly above the MEP is the main goal. For current
technologies, the MEP is located near the threshold voltage.
The energy per cycle of digital logic can be defined as:

$$E_{DL} = P_{CPU} \cdot T_{CLK}, \tag{5.1}$$

where $P_{CPU}$ and $T_{CLK}$ describe the power and clock period, respectively,
of the CPU. The DC-DC converter's energy per cycle can also be described
in terms of $T_{CLK}$:

$$E_{DC-DC} = P_{DC-DC} \cdot T_{CLK}. \tag{5.2}$$

Thus, combining (5.1) and (5.2), and recognizing that
$P_{DC-DC} = P_{CPU} \cdot ((1/\eta_{max}) - 1)$, gives the system energy per cycle:

$$E_{SYS} = E_{DL} + E_{DC-DC} = E_{DL} \cdot \left(\frac{1}{\eta_{max}}\right). \tag{5.3}$$
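Equations (5.1) to (5.3) can be sketched numerically as follows. The operating point (100 μW at 25 MHz with a converter efficiency of 80 %) is a hypothetical example chosen only to give round numbers, and the function name is an assumption of this sketch.

```python
# Numerical sketch of Equations (5.1)-(5.3). The operating point is hypothetical.
import math


def system_energy_per_cycle(p_cpu_w: float, t_clk_s: float, eta_max: float):
    """Return (E_DL, E_DCDC, E_SYS) in joules per cycle."""
    e_dl = p_cpu_w * t_clk_s                    # Eq. (5.1)
    p_dcdc = p_cpu_w * ((1.0 / eta_max) - 1.0)  # converter loss power
    e_dcdc = p_dcdc * t_clk_s                   # Eq. (5.2)
    return e_dl, e_dcdc, e_dl + e_dcdc          # Eq. (5.3)


if __name__ == "__main__":
    e_dl, e_dcdc, e_sys = system_energy_per_cycle(100e-6, 1 / 25e6, 0.8)
    print(f"E_DL={e_dl * 1e12:.2f} pJ, E_DC-DC={e_dcdc * 1e12:.2f} pJ, "
          f"E_SYS={e_sys * 1e12:.2f} pJ")
    # The sum must equal E_DL / eta_max, as stated in Eq. (5.3)
    assert math.isclose(e_sys, e_dl / 0.8)
```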
For the desired (VOUT) operation range, which is at or slightly above the
MEP, the dynamic energy is dominant and thus $E_{DL} \approx C_{eff} V_{OUT}^2$.
Rewriting (5.3) based on this (EDL) approximation gives:

$$E_{SYS} = C_{eff} V_{OUT}^2 \cdot \left(\frac{1}{\eta_{max}}\right). \tag{5.4}$$

Expanding $\eta_{max}$ in terms of $V_{OUT}$ then gives:

$$E_{SYS} = C_{eff} V_{OUT}^2 \left(\frac{N \cdot V_{IN}}{V_{OUT}}\right), \tag{5.5}$$

$$E_{SYS} = C_{eff} V_{OUT} (N \cdot V_{IN}). \tag{5.6}$$

Equation (5.5) shows that the EDL contribution ($C_{eff} V_{OUT}^2$) scales ESYS
quadratically with VOUT, while the DC-DC contribution ($N \cdot V_{IN}/V_{OUT}$)
scales ESYS inversely with VOUT. The simplification in Equation (5.6) shows
that, as a net result, ESYS increases linearly with VOUT. In conclusion,
regulating to a higher VOUT in order to achieve higher efficiency in the
DC-DC converter does not decrease the energy consumption.

[Figure 5.9: one panel plots energy [pJ] versus VOUT [V] (0.3 to 0.5 V) for ECPU and for ESYS with 4:1, 3:1, and 2:1 DC-DC converters, with VOUT,OPT and the MEPsys points for the 4:1 and 3:1 converters marked and the annotation "Converting to VOUT > VOUT,OPT increases ESYS"; the other panel plots ηmax [%] versus VOUT [V] for the 4:1, 3:1, and 2:1 DC-DC converters.]

Figure 5.9. (a) Converting with higher efficiency to VOUT > VOUT,OPT results in larger system energy consumption. (b) Resultant ηmax for VIN = 1.55 V and multiple DC-DC topologies.
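The linear dependence in Equation (5.6) can be checked with a short numerical sketch. It assumes that N denotes the ideal conversion ratio (e.g. 1/3 for a 3:1 topology), so that the ideal efficiency bound is ηmax = VOUT/(N·VIN); the effective capacitance value is hypothetical, and leakage and non-ideal converter losses are ignored, so measured data need not follow this idealized slope.

```python
# Sketch of Equations (5.4)-(5.6) for an ideal SC converter.
# Assumptions: N is the ideal conversion ratio (1/3 for 3:1); C_EFF is hypothetical.
import math

V_IN = 1.55      # battery voltage in volts (prototype Li-ion, from the text)
C_EFF = 20e-12   # hypothetical effective switched capacitance in farads


def eta_max(v_out, n, v_in=V_IN):
    """Ideal efficiency bound: VOUT divided by the no-load output N * VIN."""
    return v_out / (n * v_in)


def e_sys_from_eta(v_out, n, c_eff=C_EFF, v_in=V_IN):
    """Equation (5.4): C_eff * VOUT^2 / eta_max."""
    return c_eff * v_out ** 2 / eta_max(v_out, n, v_in)


def e_sys_linear(v_out, n, c_eff=C_EFF, v_in=V_IN):
    """Equation (5.6): C_eff * VOUT * N * VIN."""
    return c_eff * v_out * n * v_in


if __name__ == "__main__":
    n = 1 / 3
    for v_out in (0.35, 0.40, 0.45, 0.50):
        print(f"VOUT={v_out:.2f} V: eta_max={eta_max(v_out, n):.1%}, "
              f"E_SYS={e_sys_linear(v_out, n) * 1e12:.2f} pJ")
    # Efficiency rises with VOUT, yet E_SYS still grows in proportion to VOUT
    assert math.isclose(e_sys_from_eta(0.5, n), e_sys_linear(0.5, n))
```

In this model, raising VOUT from 0.4 V to 0.5 V improves ηmax but increases ESYS by the same factor as VOUT itself.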
To illustrate the previous conclusion, the system energy is calculated
from the measured energy per operation of a CPU, ECPU [VIII], and a DC-DC
converter. Three different step-down topologies (i.e. 1/4, 1/3, 1/2) are
used to explore a large range of VOUT near the CPU's MEP, as shown in
Fig. 5.9 (a). Although the 1/3 and 1/2 topologies could both achieve
higher efficiency at VOUT > VOUT,OPT (see Fig. 5.9 (b)), the system
minimum energy point, MEPSYS, is at 0.4 V. This confirms the equations and
conclusions above: increasing VOUT above VOUT,OPT in order to achieve
higher efficiency in the DC-DC converter does not result in lower system
energy consumption. However, for systems that do not require operation at
the MEPSYS, it may be worthwhile to increase VOUT to improve performance.
Increasing VOUT from 0.4 V to 0.5 V would give a 240 % increase in
performance (i.e. 22.7 MHz to 77 MHz) and a 12.5 % increase in system energy.

[Figure 5.10: DC-DC efficiency [%] and Esys = EDC-DC + ECPU [pJ/cyc] versus VBAT [V] (1.3 to 1.8 V), with WC and BC curves at -20 °C, 25 °C, and 70 °C and the battery voltage range marked for each temperature. Inset table: fIN = 10 MHz at -20 °C, 25 MHz at 25 °C, and 20.8 MHz at 70 °C, with CPU back-gate bias BB1, BB1, and BB2, respectively.]

Figure 5.10. Measurement of the battery/DC-DC converter/CPU system.
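The performance/energy trade-off quoted for moving from 0.4 V to 0.5 V can be reproduced with simple arithmetic. The sketch below uses only the frequencies and the 12.5 % energy increase stated in the text; the helper name and the derived power ratio (power = energy per cycle × frequency) are additions of this sketch.

```python
# Arithmetic behind the 0.4 V -> 0.5 V trade-off, using the figures quoted in the text.

def percent_increase(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    f_04, f_05 = 22.7, 77.0      # MHz at VOUT = 0.4 V and 0.5 V
    energy_ratio = 1.125         # 12.5 % more system energy per cycle
    perf = percent_increase(f_04, f_05)
    power_ratio = (f_05 / f_04) * energy_ratio
    print(f"performance increase: {perf:.1f} %")
    print(f"implied system power ratio: {power_ratio:.2f}x")
```

The frequency figures give an increase of about 239 %, which the text rounds to 240 %.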
For the battery/DC-DC converter/CPU system in Fig. 5.8, the main concern
was energy consumption, and thus the target was the MEPSYS. We chose the
1/3 DC-DC converter topology, which gave the highest efficiency near the
MEPSYS (Fig. 5.9 (b)). The DC-DC converter and CPU shared the same input
frequency (i.e. fdcdc = fIN) for all measurements. The CPU used the
back-gate bias configuration BB1 (VBBP = -0.5 V, VBBN = 0 V) for high
performance and BB2 (VBBP = 0 V, VBBN = -0.5 V) for reduced leakage
energy at high temperatures. Since the CPU operates only in active mode
with power levels over 100 μW, the DC-DC converter losses are dominated
by parasitic and conduction losses. These losses were minimized by using
MIM fly capacitors and by optimizing transistor sizes according to [22],
respectively. Had the CPU included a sleep mode, RBB could have been used
in the converter to reduce the then-dominant control circuitry losses.
The measurement results of the battery/DC-DC converter/CPU system
operating together are shown in Fig. 5.10. At temperatures ranging from
-20 ◦C to 70 ◦C, and over 95 % of the battery discharge range, the average
system energy consumption (Esys) was 8 pJ/cyc and the average DC-DC
converter efficiency was 76.3 %. This 8 pJ/cyc figure is considered
state-of-the-art for a DC-DC converter/CPU system [VIII].
Since the previous measurements were done with traditional regulation
(i.e. the DC-DC converter regulating to a fixed VOUT), measurements were
also performed using the SIR technique, as shown in Fig. 5.11. Using the
SIR technique, the DC-DC converter provides good efficiency between 1.5 V
and 1.6 V. Below 1.5 V, the lowered VDD with SIR leads to increased power
due to leakage. Above 1.6 V, the power rises due to switching. The
increased power in each of these regions drives the DC-DC converter into
FSL. Although the SIR technique did function with the CPU and did provide
some advantages (e.g. a large VBAT range), the benefits in efficiency were
limited by the FSL operation. The FSL limitations were due to incorrect
POUT,max estimates (i.e. 1.5× higher) from the synthesis of the CPU.

[Figure 5.11: four panels, (a) to (d), plotting energy per cycle [pJ] versus VOUT [V] (0.35 to 0.5 V), versus VBAT [V] (1 to 2 V), and versus fCPU [MHz] (0 to 30 MHz), and efficiency [%] versus VBAT [V], for traditional regulation and SIR. Regions labelled "FSL of DC-DC" are marked.]

Figure 5.11. Measurements of the CPU with traditional regulation and SIR. BB2 is applied in the CPU.
5.4 Summary
The goal of this chapter was to identify the interdependencies of the
DC-DC converter and the ULE processor load. We examined how these blocks
operate together and presented solutions to minimize the total system
energy per operation. The SIR technique showed that system energy
consumption could be reduced by exploiting the characteristics of the
adaptive load. As process nodes scale and the load power decreases, the
SIR technique will allow for further improvements in efficiency. A
technique to identify the threshold voltage was also presented. This
technique will be useful as a part of a future DC-DC/processor regulation
system. Taking advantage of the prototype Li-ion battery characteristics
also helped us in achieving state-of-the-art system energy consumption.
6. Conclusions
The goal of this thesis was to design DC-DC converters for ULE adaptive
processor loads. The processors used an adaptive scheme called timing-
error detection (TED). One of the processors is the ﬁrst-known TED pro-
cessor capable of sub-threshold operation. The main design constraints
for the DC-DC converter were full integration and high efﬁciency. The
thesis and the related scientiﬁc publications ([I]-[X]) presented challenges
and solutions associated with achieving this goal.
Table 6.1 compares our three DC-DC converters to state-of-the-art SC
DC-DC converters [14, 19, 21, 46] with similar power levels and NT out-
put voltages. In general, our converters improved the state-of-the-art es-
pecially in terms of usable load range (with high efﬁciency), power den-
sity, and the use of practical input voltages (i.e. battery-connected). The
DC-DC converter in [21] by Kwong et al. was the most similar reference
to the converters in this thesis that the author could ﬁnd. Kwong et al.
used three topologies (1/3, 1/2, and 2/3) to step down from an input volt-
age of 1.2 V to NT voltages. They were able to achieve a large load power
range (2-500 μW) due to scalable switch widths. Using a single topology
in our converters helped us in achieving higher efﬁciency down to much
lower load power (i.e. nW). The DC-DC converter presented by Clerc et al.
[46] was designed for slightly higher load power. It also took advantage
of back-gate biasing in FD-SOI within its switches although details were
limited. Our self-oscillating converter had a higher power density even
assuming that the COUT in [46] was 0 pF (which was not reported). Bol
et al. reported a DC-DC converter with high efﬁciency over a large power
range in [14]. Unlike our proposed converters (which were also measured
with processor loads), this converter required a large off-chip COUT.
Additional DC-DC converters with NT output voltages such as [85, 86] were
not considered in Table 6.1 since their power levels were much higher
(5×-10×) than the target power levels of ultra-portable electronics
(i.e. nW to μW).
Three conclusions were found from our implemented fully integrated
DC-DC converters and adaptive loads. First, the ULE adaptive load char-
acteristics should be considered when designing the DC-DC converter.
Overall, ULE adaptive loads have relaxed constraints on ripple and regu-
lation in terms of energy consumption and functionality. At NT voltages,
the energy per operation ﬂattens out relative to the supply voltage. As
detailed in Publication III, taking advantage of this ﬂatness with the SIR
technique helps to reduce the system energy per operation. The mea-
surements of ULE adaptive loads also showed that the functionality was
insensitive to the supply voltage (from the DC-DC converter).
Second, as adaptive ULE load powers continue to reduce, the (increas-
ingly dominant) control circuitry losses of the DC-DC converter need to
be considered. Reducing the amount of control circuitry with a new topol-
ogy (self-oscillating) and reducing the leakage energy within the control
circuitry are both promising approaches. Due to its inherent ring oscilla-
tor, the self-oscillating converter was beneﬁcial in reducing the amount of
control circuitry components as detailed in chapter 3 and Publication VII.
The self-oscillating converter achieved a peak efﬁciency of 87 % and was
able to achieve efﬁciency over 75 % for 79 nW to 200 μW loads. Reducing
leakage through back-gate biasing within the control circuitry also proved
to be worthwhile in reducing losses within the Dickson DC-DC converter
(see chapter 5 and Publication VIII). At VOUT=0.465 V, the minimum ef-
ﬁciency of the Dickson DC-DC converter was 71% for 104 nW to 140 μW
loads.
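The usable load ranges quoted in this paragraph can be expressed as ratios, which makes converters with different absolute power levels easier to compare. The sketch below uses only the two ranges stated in the text; the helper name `load_range_ratio` is hypothetical.

```python
# Load-range ratios for the two converters, from the ranges quoted in the text.
import math


def load_range_ratio(p_min_w: float, p_max_w: float) -> float:
    """Return P_max / P_min for a usable load range."""
    return p_max_w / p_min_w


if __name__ == "__main__":
    ranges = {
        "self-oscillating (efficiency over 75 %)": (79e-9, 200e-6),
        "Dickson (efficiency of at least 71 %)": (104e-9, 140e-6),
    }
    for name, (p_min, p_max) in ranges.items():
        ratio = load_range_ratio(p_min, p_max)
        print(f"{name}: 1:{ratio:.0f} ({math.log10(ratio):.1f} decades)")
```

Both ranges span more than three decades of load power.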
Third, exploring new energy sources is worthwhile for systems operating
at NT. Regulating from a Li-ion battery (nominal 3.6 V) down to NT is
challenging. The DC-DC converter by Wieckowski et al. in [19], which
achieved a peak efficiency of 56 %, is the only known DC-DC converter
with a Li-ion (3.6 V) input voltage and an NT output voltage. The
challenges associated with stepping down from such a high input voltage
stem primarily from the need to avoid breakdown across the switches and
from the small VCR. By using a prototype Li-ion battery with a low
nominal voltage (1.55 V) and a flat discharge profile, a single DC-DC
converter topology can be used to convert to NT voltages at high
efficiency. Additional details of this work are given in chapter 5 and
Publication VIII.
Future SC DC-DC converters for ultra-portable electronics should focus
on improving power density in order to drive down the costs of full
integration. The power density needs to be improved to more closely match
the power density of DC-DC converters with super-threshold loads (e.g.
3.2 W/mm2 in [31]). The key focus of this effort needs to be on the fly
capacitor and the overdrive voltage of the switches. Specifically, efforts
should be made in identifying new methods of reducing voltage dependencies
in MOSCAPs, evaluating new capacitor technologies (e.g. deep trench
capacitors), and expanding on the back-gate biasing proposed in this work
to increase overdrive voltages. As ultra-portable electronics continually
push toward full autonomy, increased functionality, and improved security,
high-efficiency and fully integrated DC-DC converters will become
increasingly important.
Table 6.1. Comparison of the works presented here to state-of-the-art SC DC-DC converters.

Work | Technology | Topology | iVCR | Tested input / output voltage | Load range | Load range in ratio | COUT | Efficiency (η) @ VOUT | Power density (mW/mm2) @ VOUT
[14] | 65nm CMOS | step-down SC | 1/2 | 1-1.2 V / (0.32-0.48 V) | 5-320 μW (1) | 1:64 | 3.3 nF (off-chip) | 76% @ 0.4 V; 81% @ 0.4 V | 4.6 @ 0.4 V (2)
[21] | 65nm CMOS | step-down SC | 1/3, 1/2, 2/3 | 1.2 V / (0.3-0.6 V) | 2-500 μW (1) | 1:250 | 0 | 75% @ 0.5 V; 78% @ 0.5 V | 2.05 @ 0.5 V
[46] | 28nm UTBB FD-SOI | step-down SC | 1/3, 1/2 | 1.1 V / (0.33, 0.45 V) | 130-5000 μW (1) | 1:39 | N/R | 75% @ 0.45 V; 75% @ 0.45 V | 18.4 @ 0.45 V
[19] | 130nm CMOS | step-down SC / LDO | 1/5 | (2.5-3.6 V) / (0.44 V) | 100-350 nW (1) | 1:125 | 0 | 56% @ 0.4 V; 56% @ 0.44 V | 0.0006 @ 0.4 V
3:1 DC-DC (Section 3.1) | 28nm CMOS | step-down SC | 1/3 | 1-1.7 V / (0.304-0.470 V) | 14 μW-115 μW | 1:8 | 200 pF (MOM/MOS) | 82% @ 0.350 V; 64% @ 0.350 V; 75% @ 0.350 V (3) | 1.66 @ 0.350 V
3:1 DC-DC (Section 3.2) | 28nm UTBB FD-SOI (RVT) | step-down SC | 1/3 | 1-1.9 V / (0.290-0.543 V) | 209 nW-205 μW | 1:981 | 110 pF | 81% @ 0.465 V; 76% @ 0.415 V; 65% @ 0.415 V; 76% @ 0.415 V (4) | 5.5 @ 0.415 V
2:1 DC-DC (Section 3.3) | 28nm UTBB FD-SOI (LVT) | self-oscillating step-down SC | 1/2 | 1-1.2 V / (0.380-0.485 V) | 79 nW-200 μW | 1:2532 | 0 pF | 87% @ 0.46 V; 75% @ 0.515 V; 77% @ 0.46 V; 77% @ 0.43 V; 47% @ 0.460 V (5) | 62 @ 0.515 V; 19.2 @ 0.46 V; 24 @ 0.43 V

(1) Estimated from paper.
(2) Does not include area of COUT.
(3) From simulations.
(4) VBATT = 1.55 V and NBB (GNDS = VDDS = 0 V).
(5) VBATT = 1 V and VBB = 1 V.
N/R: not reported.

Load range minimum efficiency (ηMIN) entries, as listed: 75%, 70%, 30%, 77%, 71%, 75%, with one entry N/R. Further entries, as listed: 44%, N/R, 47%, 78%.
References
[1] A. Burdett. Tutorial t3: Ultra-low-power wireless systems. In IEEE Inter-
national Solid-State Circuits Conference, Feb. 2015.
[2] M.H. Ghaed, G. Chen, R.-U. Haque, M. Wieckowski, K. Yejoong, K. Gy-
ouho, L. Yoonmyung, L. Inhee, D. Fick, K. Daeyeon, S. Mingoo, K.D. Wise,
D. Blaauw, and D. Sylvester, "Circuits for a Cubic-Millimeter Energy-
Autonomous Wireless Intraocular Pressure Monitor," IEEE Transactions
on Circuits and Systems I: Regular Papers, vol. 60, no. 12, pp. 3152-3162,
Dec. 2013.
[3] A. Klinefelter, N.E. Roberts, Y. Shakhsheer, P. Gonzalez, A. Shrivastava,
A. Roy, K. Craig, M. Faisal, J. Boley, S. Oh, Y. Zhang, D. Akella, D.D. Went-
zloff, and B.H. Calhoun, "A 6.45 μW self-powered IoT SoC with integrated
energy-harvesting power management and ULP asymmetric radios," In
IEEE International Solid-State Circuits Conference Digest of Technical Pa-
pers, pp. 384-385, Feb. 2015.
[4] H. Kim, G. Kim, Y. Lee, D. Foo, D. Sylvester, D. Blaauw, and D. Wentzloff, "A
10.6mm3 Fully-Integrated, Wireless Sensor Node with 8GHz UWB Trans-
mitter," In IEEE Symposium on VLSI Circuits, pp.TBD, May 2015.
[5] I. Lee, G. Kim, B. Suyoung, A. Wolfe, R. Bell, S. Jeong, Y. Kim, J. Kagan,
M. Arias-Thode, B. Chadwick, D. Sylvester, D. Blaauw, and Yoonmyung Lee,
"System-On-Mud: Ultra-Low Power Oceanic Sensing Platform Powered by
Small-Scale Benthic Microbial Fuel Cells," IEEE Transactions on Circuits
and Systems I: Regular Papers, vol. 62, no. 4, pp. 1126-1135, April 2015.
[6] J. de Boeck, "IoT: The impact of things," In IEEE Symposium on VLSI
Circuits, pp. 82-83, June 2009.
[7] M. Fojtik, D. Kim, G. Chen, Y.-S. Lin, D. Fick, J. Park, M. Seok, M.T.
Chen, D. Foo, D. Blaauw, and D. Sylvester, "A Millimeter-Scale Energy-
Autonomous Sensor System With Stacked Battery and Solar Cells," IEEE
Journal of Solid-State Circuits, vol. 48, no. 3, pp. 801-813, March 2013.
[8] D. Sylvester, Tutorial t4: Low-power near-threshold design. In IEEE Inter-
national Solid-State Circuits Conference, Feb. 2015.
[9] M. Hiienkari, J. Teittinen, L. Koskinen, M. Turnquist, and M. Kaltiokallio,
"A 3.15pJ/cyc 32-bit RISC CPU with timing-error prevention and adaptive
clocking in 28nm CMOS," In IEEE Custom Integrated Circuits Conference,
pp.1-4, Sept. 2014.
[10] D. Markovic, C.C. Wang, L.P. Alarcon, Tsung-Te Liu, and J.M. Rabaey,
"Ultralow-Power Design in Near-Threshold Region," Proceedings of the
IEEE, vol. 98, no. 2, pp. 237-252, Feb 2010.
[11] A. Saraﬁanos and M. Steyaert, "Fully Integrated Wide Input Voltage Range
Capacitive DC-DC Converters: The Folding Dickson Converter," IEEE Jour-
nal of Solid-State Circuits, vol. 50, no. 7, pp. 1560-1570, July 2015.
[12] Y.-T. Liao, H. Yao, A. Lingley, B. Parviz, and B.P. Otis, "A 3- μW CMOS
Glucose Sensor for Wireless Contact-Lens Tear Glucose Monitoring," IEEE
Journal of Solid-State Circuits, vol. 47, no. 1, pp. 335-344, Jan. 2012.
[13] J. De Vos, D. Flandre, and D. Bol, "Switched-capacitor DC/DC converters
for empowering Internet-of-Things SoCs," In IEEE Faible Tension Faible
Consommation, pp. 1-2, May 2014.
[14] D. Bol, J. De Vos, C. Hocquet, F. Botman, F. Durvaux, S. Boyd, D. Flan-
dre, and J. Legat, "SleepWalker: A 25-MHz 0.4-V Submm2 μW/MHz Micro-
controller in 65-nm LP/GP CMOS for Low-Carbon Wireless Sensor Nodes,"
IEEE Journal of Solid-State Circuits, vol. 48, no. 1, pp. 20-32, Jan. 2013.
[15] S. Hanson, M. Seok, Y.-S. Lin, Z. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester,
and D. Blaauw, "A Low-Voltage Processor for Sensing Applications With
Picowatt Standby Mode," IEEE Journal of Solid-State Circuits, vol. 44, no.
4, pp. 1145-1155, April 2009.
[16] Y.K. Ramadass and A.P. Chandrakasan, "Minimum Energy Tracking Loop
With Embedded DC-DC Converter Enabling Ultra-Low-Voltage Operation
Down to 250 mV in 65 nm CMOS," IEEE Journal of Solid-State Circuits,
vol. 43, no. 1, pp. 256-265, Jan. 2008.
[17] X. Gong, J. Ni, Z. Hong, and B. Liu, "An 80% peak efﬁciency, 0.84mW
sleep power consumption, fully-integrated DC-DC converter with buck/LDO
mode control," In IEEE Custom Integrated Circuits Conference, pp. 1-4,
Sept. 2011.
[18] S. Kose, S. Tam, S. Pinzon, B. McDermott, and E.G. Friedman, "Active
Filter-Based Hybrid On-Chip DC-DC Converter for Point-of-Load Voltage
Regulation," IEEE Transactions on Very Large Scale Integration Systems,
vol. 21, no. 4, pp. 680-691, April 2013.
[19] M. Wieckowski, G.K. Chen, M. Seok, D. Blaauw, and D. Sylvester, "A hybrid
DC-DC converter for sub-microwatt sub-1V implantable applications," In
IEEE Symposium on VLSI Circuits, pp. 166-167, Nov. 2009.
[20] J. Myers, A. Savanth, R. Gaddh, D. Howard, P. Prabhat, and D. Flynn, "A
Subthreshold ARM Cortex-M0+ Subsystem in 65 nm CMOS for WSN Ap-
plications with 14 Power Domains, 10T SRAM, and Integrated Voltage Reg-
ulator," IEEE Journal of Solid-State Circuits, vol. PP, no. 99, pp. 1-14, 2015.
ISSN 0018-9200.
[21] J. Kwong, Y.K. Ramadass, N. Verma, and A.P. Chandrakasan, "A 65 nm
SubVt Microcontroller With Integrated SRAM and Switched Capacitor DC-
DC Converter," IEEE Journal of Solid-State Circuits, vol. 44, no. 1, pp.
115-126, 2009.
[22] H.-P. Le, S. R. Sanders, and E. Alon, "Design Techniques for Fully Inte-
grated Switched-Capacitor DC-DC Converters," IEEE Journal of Solid-
State Circuits, vol. 46, no. 9, pp. 2120-2131, Sept. 2011.
[23] S.R. Sanders, E. Alon, H.-P. Le, M.D. Seeman, M. John, and V.W. Ng, "The
Road to Fully Integrated DC-DC Conversion via the Switched-Capacitor Ap-
proach," IEEE Transactions on Power Electronics, vol. 28, no. 9, pp. 4146-
4155, Sept 2013.
[24] T. Piessens. Tutorial t9: Charge pump and capacitive dc-dc converter de-
sign. In IEEE International Solid-State Circuits Conference, Feb. 2014.
[25] A. Saraﬁanos and M. Steyaert, "The folding dickson converter: A step to-
wards fully integrated wide input range capacitive DC-DC converters," In
Proc. Eur. Solid-State Circ. Conf., pp. 267-270, Sept. 2014.
[26] J. De Vos, D. Flandre, and D. Bol, "A Sizing Methodology for On-Chip
Switched-Capacitor DC/DC Converters," IEEE Transactions on Circuits
and Systems I: Regular Papers, vol. 61, no. 5, pp. 1597-1606, May 2014.
[27] M. S. Makowski and D. Maksimovic, "Performance limits of switched-
capacitor DC-DC converters," In Proc. of Power Electronics Specialists Conf.,
pp. 1215-1221, June 1995.
[28] M.D. Seeman and S.R. Sanders, "Analysis and Optimization of Switched-
Capacitor DC-DC Converters," IEEE Transactions on Power Electronics,
vol. 23, no. 2, pp. 841-851, March 2008.
[29] N. Butt, K. McStay, A. Cestero, H. Ho, W. Kong, S. Fang, R. Krishnan,
B. Khan, A. Tessier, W. Davies, S. Lee, Y. Zhang, J. Johnson, S. Rombawa,
R. Takalkar, A. Blauberg, K.V. Hawkins, J. Liu, S. Rosenblatt, P. Goyal,
S. Gupta, J. Ervin, Z. Li, S. Galis, J. Barth, M. Yin, T. Weaver, J.H. Li,
S. Narasimha, P. Parries, W.K. Henson, N. Robson, T. Kirihata, M. Chudzik,
E. Maciejewski, P. Agnello, S. Stifﬂer, and S.S. Iyer, "A 0.039μm2 high per-
formance eDRAM cell based on 32nm High-K/Metal SOI technology," In
IEEE International Electron Devices Meeting, pp. 27.5.1-27.5.4, Dec. 2010.
[30] T.M. Andersen, F. Krismer, J.W. Kolar, T. Toiﬂ, C. Menolﬁ, L. Kull, T. Morf,
M. Kossel, M. Brandli, P. Buchmann, and P.A. Francese, "A sub-ns response
on-chip switched-capacitor DC-DC voltage regulator delivering 3.7W/mm2
at 90% efﬁciency using deep-trench capacitors in 32nm SOI CMOS," In
IEEE International Solid-State Circuits Conference Digest of Technical Pa-
pers, pp. 90-91, Feb. 2014.
[31] T.M. Andersen, F. Krismer, J.W. Kolar, T. Toifl, C. Menolfi, L. Kull, T. Morf,
M. Kossel, M. Brandli, and P.A. Francese, "A feedforward controlled on-chip
switched-capacitor voltage regulator delivering 10W in 32nm SOI CMOS,"
In IEEE International Solid-State Circuits Conference Digest of Technical
Papers, pp. 1-3, Feb. 2015.
[32] M. Steyaert, "DC-DC Converters: From Discrete Towards Fully Integrated
CMOS," In Proc. Eur. Solid-State Circ. Conf., Sept. 2011.
[33] M. Wens and M. Steyaert, Design and Implementation of Fully-Integrated
Inductive DC-DC converters in Standard CMOS. Springer, 2011.
[34] F. Botman, J. De Vos, S. Bernard, F. Stas, J.-D. Legat, and D. Bol, "Belle-
vue: A 50MHz variable-width SIMD 32bit microcontroller at 0.37V for
processing-intensive wireless sensor nodes," In IEEE International Sym-
posium on Circuits and Systems, pp. 1207-1210, June 2014.
[35] V. Ng and S. Sanders, "A 92%-efﬁciency wide-input-voltage-range switched-
capacitor DC-DC converter," In IEEE International Solid-State Circuits
Conference Digest of Technical Papers, pp. 282-284, Feb. 2012.
[36] E. Pohjalainen, J. Kallioinen, and T Kallio, "Comparative study of car-
bon free and carbon containing Li4Ti5O12 electrodes ," Elsevier Journal
of Power Sources, vol. 279, pp. 481-486, Dec. 2015.
[37] M.D. Seeman, "A Design Methodology for Switched-Capacitor DC-DC Con-
verters," PhD thesis.
[38] E. Pohjalainen, S. Räsänen, M. Jokinen, K. Yliniemi, D.A. Worsley, J. Kuusi-
vaara, J. Juurikivi, R. Ekqvist, T. Kallio, and M. Karppinen, "Water soluble
binder for fabrication of Li4Ti5O12 electrodes," Elsevier Journal of Power
Sources, vol. 226, pp. 134-139, Oct. 2013.
[39] Vincent Ng, "Switched Capacitor DC-DC Converter: Superior where the
Buck Converter has Dominated," PhD thesis.
[40] J.R. Baker, CMOS Circuit Design, Layout, and Simulation. IEEE Press,
2008.
[41] T.M. Van Breussegem and M.S.J. Steyaert, "Monolithic Capacitive DC-DC
Converter With Single Boundary-Multiphase Control and Voltage Domain
Stacking in 90 nm CMOS," IEEE Journal of Solid-State Circuits, vol. 46,
no. 7, pp. 1715-1727, July 2011.
[42] R. Aparicio and A. Hajimiri, "Capacity limits and matching properties of
integrated capacitors ," IEEE Journal of Solid-State Circuits, vol. 37, no. 3,
pp. 384-393, Mar. 2002.
[43] L. Koskinen, M. Hiienkari, T. Kallio, E. Pohjalainen, and M. Turnquist,
"Battery development for ultra-low-voltage systems," In IEEE Electrical
Electronics Engineers in Israel, pp. 1-4, Dec. 2014.
[44] T. Sakurai, A. Matsuzawa, and T. Douseki, Fully-Depleted SOI CMOS Cir-
cuits and Technology for Ultralow-Power Applications. Springer US, 2006.
[45] D. Jacquet, F. Hasbani, P. Flatresse, R. Wilson, F. Arnaud, G. Cesana,
T. Di Gilio, C. Lecocq, T. Roy, A. Chhabra, C. Grover, O. Minez, J. Ug-
inet, G. Durieu, C. Adobati, D. Casalotto, F. Nyer, P. Menut, A. Cathelin,
I. Vongsavady, and P. Magarshack, "A 3 GHz Dual Core Processor ARM
Cortex TM -A9 in 28 nm UTBB FD-SOI CMOS With Ultra-Wide Voltage
Range and Energy Efﬁciency Optimization," IEEE Journal of Solid-State
Circuits, vol. 49, no. 4, pp. 812-826, April 2014.
[46] S. Clerc, M. Saligane, F. Abouzeid, M. Cochet, J.-M. Daveau, C. Bot-
toni, D. Bol, J. De-Vos, D. Zamora, B. Coefﬁc, D. Soussan, D. Croain,
M. Naceur, P. Schamberger, P. Roche, and D. Sylvester, "A 0.33V/-40C pro-
cess/temperature closed-loop compensation SoC embedding all-digital clock
multiplier and DC-DC converter exploiting FDSOI 28nm back-gate bias-
ing," In IEEE International Solid-State Circuits Conference Digest of Tech-
nical Papers, pp. 1-3, Feb. 2015.
[47] J. Lechevallier, R. Struiksma, H. Sherry, A. Cathelin, E. Klumperink, and
B. Nauta, "A forward-body-bias tuned 450MHz Gm-C 3rd-order low-pass
ﬁlter in 28nm UTBB FD-SOI with >1dBVp IIP3 over a 0.7-to-1V supply,"
In IEEE International Solid-State Circuits Conference Digest of Technical
Papers, pp. 1-3, Feb. 2015.
[48] A. Larie, E. Kerherve, B. Martineau, L. Vogt, and D. Belot, "A 60GHz
28nm UTBB FD-SOI CMOS reconﬁgurable power ampliﬁer with 21% PAE,
18.2dBm P1dB and 74mW PDC," In IEEE International Solid-State Cir-
cuits Conference Digest of Technical Papers, pp. 1-3, Feb. 2015.
[49] Wenfeng Zhao, Yajun Ha, and M. Alioto, "Novel Self-Body-Biasing and
Statistical Design for Near-Threshold Circuits With Ultra Energy-Efﬁcient
AES as Case Study," IEEE Transactions on Very Large Scale Integration
Systems, vol. 23, no. 8, pp. 1390-1401, Aug. 2015. ISSN 1063-8210.
[50] O. Weber, E. Josse, F. Andrieu, A. Cros, E. Richard, P. Perreau, E. Bay-
lac, N. Degors, C. Gallon, E. Perrin, S. Chhun, E. Petitprez, S. Delmedico,
J. Simon, G. Druais, S. Lasserre, J. Mazurier, N. Guillot, E. Bernard,
R. Bianchini, L. Parmigiani, X. Gerard, C. Pribat, O. Gourhant, F. Abbate,
C. Gaumer, V. Beugin, P. Gouraud, P. Maury, S. Lagrasta, D. Barge, N. Lou-
bet, R. Beneyton, D. Benoit, S. Zoll, J.-D. Chapon, L. Babaud, M. Bidaud,
M. Gregoire, C. Monget, B. Le-Gratiet, P. Brun, M. Mellier, A. Pofelski, L.R.
Clement, R. Bingert, S. Puget, J.-F. Kruck, D. Hoguet, P. Scheer, T. Poiroux,
J.-P. Manceau, M. Raﬁk, D. Rideau, M.-A. Jaud, J. Lacord, F. Monsieur,
L. Grenouillet, M. Vinet, Q. Liu, B. Doris, M. Celik, S.P. Fetterolf, O. Faynot,
and M. Haond, "14nm FDSOI technology for high speed and energy efﬁcient
applications," In IEEE Symposium on VLSI Circuits, pp. 1-2, June 2014.
[51] J. De Vos, D. Flandre, and D. Bol, "A dual-mode DC/DC converter for ultra-
low-voltage microcontrollers," In IEEE Subthreshold Microelectronics Con-
ference, pp. 1-3, Oct. 2012.
[52] Y. K. Ramadass, A. A. Fayed, and A. P. Chandrakasan, "A Fully-Integrated
Switched-Capacitor Step-Down DC-DC Converter With Digital Capacitance
Modulation in 45 nm CMOS," IEEE Journal of Solid-State Circuits, vol. 45,
no. 12, pp. 2557-2565, 2010.
[53] W. Jung, S. Oh, S. Bang, Y. Lee, Z. Foo, G. Kim, Y. Zhang, D. Sylvester, and
D. Blaauw, "An Ultra-Low Power Fully Integrated Energy Harvester Based
on Self-Oscillating Switched-Capacitor Voltage Doubler," IEEE Journal of
Solid-State Circuits, vol. 49, no. 12, pp. 2800-2811, Dec 2014.
[54] M.J. Turnquist, G. de Streel, D. Bol, M. Hiienkari, and L. Koskinen, "Effects
of back-gate bias on switched-capacitor DC-DC converters in UTBB FD-
SOI," In IEEE SOI-3D-Subthreshold Microelectronics Technology Uniﬁed
Conference, pp. 1-2, Oct. 2014.
[55] W. Jung, S. Oh, S. Bang, Y. Lee, D. Sylvester, and D. Blaauw, "A 3nW fully
integrated energy harvester based on self-oscillating switched-capacitor
DC-DC converter," In IEEE International Solid-State Circuits Conference
Digest of Technical Papers, pp. 398-399, Feb. 2014.
[56] G.V. Pique and M. Meijer, "A 350nA voltage regulator for 90nm CMOS dig-
ital circuits with Reverse-Body-Bias," In Proc. Eur. Solid-State Circ. Conf.,
pp. 379-382, Sept. 2011.
[57] H. Soeleman, K. Roy, and B.C. Paul, "Robust subthreshold logic for ultra-
low power operation," IEEE Transactions on Very Large Scale Integration
Systems, vol. 9, no. 1, pp. 90-99, Feb. 2001.
[58] N. Saputra and J.R. Long, "A Fully Integrated Wideband FM Transceiver
for Low Data Rate Autonomous Systems," IEEE Journal of Solid-State Cir-
cuits, vol. 50, no. 5, pp. 1165-1175, May 2015.
[59] N. Ickes, Y. Sinangil, F. Pappalardo, E. Guidetti, and A.P. Chandrakasan,
"A 10 pJ/cycle ultra-low-voltage 32-bit microprocessor system-on-chip," In
Proc. Eur. Solid-State Circ. Conf, pp. 159 -162, Sept. 2011.
[60] M. Alioto, "Ultra-Low Power VLSI Circuit Design Demystiﬁed and Ex-
plained: A Tutorial," IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 59, pp. 3-29, Jan. 2012.
[61] A. Wang, B. Calhoun, and A. Chandrakasan, Sub-threshold Design for Ultra
Low-Power Systems. Springer, 2005.
[62] T. Sakurai and A.R. Newton, "Alpha-power law MOSFET model and its
applications to CMOS inverter delay and other formulas," IEEE Journal of
Solid-State Circuits, vol. 25, no. 2, pp. 584-594, Apr. 1990.
[63] D. Bol, R. Ambroise, D. Flandre, and J.-D. Legat, "Interests and Limitations
of Technology Scaling for Subthreshold Logic," IEEE Transactions on Very
Large Scale Integration Systems, vol. 17, no. 10, pp. 1508-1519, Oct. 2009.
[64] D. Bull, S. Das, K. Shivashankar, G. S. Dasika, K. Flautner, and D. Blaauw,
"A Power-Efficient 32 bit ARM Processor Using Timing-Error Detection and
Correction for Transient-Error Tolerance and Adaptation to PVT Variation,"
IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 18-31, Jan. 2011.
[65] O.S. Unsal, J.W. Tschanz, K. Bowman, V. De, X. Vera, A. Gonzalez, and
O. Ergin, "Impact of Parameter Variations on Circuits and Microarchitecture,"
IEEE Micro, vol. 26, no. 6, pp. 30-39, Nov. 2006.
[66] K. A. Bowman, J. W. Tschanz, S. L. Lu, P. A. Aseron, M. M. Khellah,
A. Raychowdhury, B. M. Geuskens, C. Tokunaga, C. B. Wilkerson, T. Karnik, and
V. K. De, "A 45 nm Resilient Microprocessor Core for Dynamic Variation
Tolerance," IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 194-208,
Dec. 2011.
[67] D. Blaauw, S. Kalaiselvan, K. Lai, W. Ma, S. Pant, C. Tokunaga, S. Das, and
D. Bull, "Razor II: In Situ Error Detection and Correction for PVT and SER
Tolerance," In IEEE International Solid-State Circuits Conference Digest of
Technical Papers, pp. 400-622, Feb. 2008.
[68] S. Kim and M. Seok, "Variation-Tolerant, Ultra-Low-Voltage Microprocessor
With a Low-Overhead, Within-a-Cycle In-Situ Timing-Error Detection
and Correction Technique," IEEE Journal of Solid-State Circuits, vol. 50,
no. 6, pp. 1478-1490, June 2015.
[69] S. Kim and M. Seok, "R-processor: 0.4V resilient processor
with a voltage-scalable and low-overhead in-situ error detection and correction
technique in 65nm CMOS," In IEEE Symposium on VLSI Circuits, pp.
1-2, June 2014.
[70] M. Fojtik, D. Fick, Y. Kim, N. Pinckney, D.M. Harris, D. Blaauw, and
D. Sylvester, "Bubble Razor: Eliminating Timing Margins in an ARM
Cortex-M3 Processor in 45 nm CMOS Using Architecturally Independent
Error Detection and Correction," IEEE Journal of Solid-State Circuits, vol.
48, no. 1, pp. 66-81, Jan. 2013.
[71] M.J. Turnquist and L. Koskinen, "Measurement of a Timing Error Detection
Latch Capable of Sub-threshold Operation," In IEEE NORCHIP Circuit
Conference, pp. 1-4, Nov. 2009.
[72] M.J. Turnquist, E. Laulainen, J. Makipaa, and L. Koskinen, "Measurement
of a system-adaptive error-detection sequential circuit with subthreshold
SCL," In IEEE NORCHIP Circuit Conference, pp. 1-4, Nov. 2011.
[73] J. Crop, E. Krimer, N. Moezzi-Madani, R. Pawlowski, T. Ruggeri, P. Chiang,
and M. Erez, "Error Detection and Recovery Techniques for Variation-Aware
CMOS Computing: A Comprehensive Review," MDPI Journal of Low
Power Electronics and Applications, vol. 1, no. 3, pp. 334-356, 2011.
[74] M.J. Turnquist and L. Koskinen, "Sub-threshold Operation of a Timing Error
Detection Latch," In IEEE Research in Microelectronics and Electronics
(PRIME), pp. 124-127, Feb. 2009.
[75] Simon T., S. Rusu, J. Chang, S. Vora, B. Cherkauer, and D. Ayers, "A 65nm
95W Dual-Core Multi-Threaded Xeon® Processor with L3 Cache," In IEEE
Asian Solid-State Circuits Conference, pp. 15-18, Nov. 2006.
[76] A. Tajalli, E. J. Brauer, Y. Leblebici, and E. Vittoz, "Subthreshold
Source-Coupled Logic Circuits for Ultra-Low-Power Applications," IEEE Journal
of Solid-State Circuits, vol. 43, no. 7, pp. 1699-1710, June 2008.
[77] M. Alioto and G. Palumbo, "Feature - Power-aware design techniques for
nanometer MOS current-mode logic gates: a design framework," IEEE Circuits
and Systems Magazine, vol. 6, no. 4, pp. 42-61, Fourth Quarter 2006.
[78] F. Cannillo, C. Toumazou, and T. S. Lande, "Nanopower Subthreshold
MCML in Submicrometer CMOS Technology," IEEE Transactions on Circuits
and Systems I: Regular Papers, vol. 56, no. 8, pp. 1598-1611, Aug. 2009.
[79] A. Tajalli and Y. Leblebici, "Leakage Current Reduction Using Subthreshold
Source-Coupled Logic," IEEE Transactions on Circuits and Systems II:
Express Briefs, vol. 56, no. 5, pp. 374-378, May 2009.
[80] M. Shoaran, A. Tajalli, M. Alioto, A. Schmid, and Y. Leblebici, "Analysis
and Characterization of Variability in Subthreshold Source-Coupled Logic
Circuits," IEEE Transactions on Circuits and Systems I: Regular Papers,
vol. 62, no. 2, pp. 458-467, Feb. 2015.
[81] E. Laulainen, M.J. Turnquist, J. Makipaa, and L. Koskinen, "Adaptive sub-
threshold timing-error detection 8 bit microcontroller in 65 nm CMOS," In
IEEE International Symposium on Circuits and Systems, pp. 2953-2956,
May 2012.
[82] Lattice Semiconductor. LatticeMico32 open, free 32-bit soft processor.
http://www.latticesemi.com/en/Products/DesignSoftwareAndIP/
[83] R. Wilson, E. Beigne, P. Flatresse, A. Valentian, F. Abouzeid, T. Benoist,
C. Bernard, S. Bernard, O. Billoint, S. Clerc, B. Giraud, A. Grover, J. Le Coz,
I. Miro Panades, J.-P. Noel, B. Pelloux-Prayer, P. Roche, O. Thomas, Y. Thonnart,
D. Turgis, F. Clermidy, and P. Magarshack, "A 460MHz at 397mV,
2.6GHz at 1.3V, 32b VLIW DSP, embedding FMAX tracking," In IEEE International
Solid-State Circuits Conference Digest of Technical Papers, pp.
452-453, Feb. 2014.
[84] D. El-Damak, S. Bandyopadhyay, and A.P. Chandrakasan, "A 93% efficiency
reconfigurable switched-capacitor DC-DC converter using on-chip ferroelectric
capacitors," In IEEE International Solid-State Circuits Conference Digest
of Technical Papers, pp. 374-375, Feb. 2013.
[85] J. Jiang, Y. Lu, C. Huang, W.H. Ki, and P.K.T. Mok, "A 2-/3-phase fully
integrated switched-capacitor DC-DC converter in bulk CMOS for energy-efficient
digital circuits with 14% efficiency improvement," In IEEE International
Solid-State Circuits Conference Digest of Technical Papers, pp. 1-3,
Feb. 2015.
[86] N. Krihely, S. Ben-Yaakov, and A. Fish, "Efficiency Optimization of a Step-Down
Switched Capacitor Converter for Subthreshold," IEEE Transactions
on Very Large Scale Integration Systems, vol. 21, no. 12, pp. 2353-2357, Dec.
2013.
Errata
Publication II
The contours in Fig. 3 (a) were incorrectly labeled as G_ON; they should
have been labeled as transconductance (g_m). Also, the (right) y-axis in
Fig. 3 (b) was incorrectly labeled; the correct leakage scale is 10^-11 to
10^-4.
