Load-Aware Power Conversion and Integration for Heterogeneous Systems by Carlo, Sergio
LOAD-AWARE POWER CONVERSION AND
INTEGRATION FOR HETEROGENEOUS SYSTEMS
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy
in
Electrical and Computer Engineering
School of Electrical and Computer Engineering
Georgia Institute of Technology
December 2015
Copyright© 2015 by Sergio Carlo
LOAD-AWARE POWER CONVERSION AND
INTEGRATION FOR HETEROGENEOUS SYSTEMS
Approved by:
Dr. Arijit Raychowdhury
Assoc. Professor, School of ECE
Georgia Institute of Technology
Dr. Saibal Mukhopadhyay, Advisor
Assoc. Professor, School of ECE
Georgia Institute of Technology
Dr. Madhavan Swaminathan
Professor, School of Interactive Computing
Georgia Institute of Technology
Dr. Sudhakar Yalamanchili
Professor, School of ECE
Georgia Institute of Technology
Dr. Satish Kumar
Asst. Professor, School of Mechanical Engineering
Georgia Institute of Technology
Date Approved: October 2015
To my family and friends.
ACKNOWLEDGMENTS
Thanks to everyone who has contributed both technical and emotional support. To my
contemporary GREEN laboratory members: Wen, Zakir, Monodeep, Boris, Denny, Amit,
Duckhwan, Jae Ha, and Jong Hwan. Special thanks to my wife for supporting my decisions
and for her patience and understanding during my time in graduate school. To my mom for
her love, for always believing in me, and for always helping me look forward. To my dad for
guiding me and helping me grow both intellectually and emotionally. To my grandparents, who
helped raise me and whom I love deeply. To my brother for all of his life advice and for being
a guide into graduate school. To my uncle and aunt, who have had a large influence on my
development, and my cousin Andres, who motivates me to be a role model for him. Thanks
to my friends Carlos, Andres and Luis for all their support and, at times, for offering some
very useful technical advice. I want to thank my advisor Saibal Mukhopadhyay for his support,
patience and guidance. Thanks to my committee members Dr. Raychowdhury, Dr.
Swaminathan, Dr. Yalamanchili and Dr. Kumar for their technical feedback and
thought-provoking questions. Finally, thanks to my undergraduate professors at Mayaguez
who helped form my technical development and guided me into graduate school, with special
thanks to professors Guillermo Serrano, Gladys Ducoudray, Manuel Toledo, Gerson
Beauchamp, Manuel Jimenez and Rogelio Palomera.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
SUMMARY
CHAPTER 1 INTRODUCTION
CHAPTER 2 LITERATURE SURVEY
2.1 Load behavior
2.2 Losses in the power delivery network
2.3 Computing power reduction techniques
2.3.1 Cloud computing
2.3.2 Hyper/helper threading
2.3.3 Energy monitoring
2.3.4 Dynamic thermal management (DTM)
2.3.5 Variable-supply-voltage scaling (VSV)
2.3.6 Clock gating
2.3.7 Power gating
2.3.8 Dynamic voltage-frequency scaling
2.4 Power supplies and DVFS
2.4.1 Single-inductor multiple-output topology and DVFS
2.5 State-of-the-art SIMO designs
CHAPTER 3 MULTIPLE VOLTAGE DOMAIN POWER CONVERTER MODELING
3.1 Background
3.2 Impedance models
3.3 Voltage ripple model
3.4 MIMO and SIMO power loss model
3.4.1 MIMO impedance model
3.4.2 SIMO impedance model
3.5 Simulation results and discussions
3.5.1 Efficiency comparison
3.5.2 Throughput and system power
3.5.3 SIMO and MIMO output voltage ripple
3.5.4 Open-loop impedance
3.6 Conclusions
CHAPTER 4 CLOSED-LOOP CONTROL OF MULTI-DOMAIN CONVERTERS
4.1 Introduction
4.2 Power Weighted CCM Controller
4.3 Floating Capacitor-Based Power Stage Filter
4.4 SIMO Stability Analysis
4.5 Closed loop AC and transient MIMO model
4.5.1 Closed loop AC and transient SIMO model
4.6 Closed loop results
4.7 Conclusions
CHAPTER 5 IMPLEMENTATION TECHNIQUES AND MEASUREMENTS OF POWER WEIGHTING SIMO CONTROL
5.1 Introduction
5.2 Circuit Techniques
5.2.1 Main power stage switches, drivers and parasitics
5.2.2 SIMO output stage switch design
5.2.3 Amplifiers and parallel linear regulators
5.2.4 Non-Overlapping Drivers
5.2.5 Hysteretic Comparator
5.2.6 Reference Tracking
5.3 Measurement Results
5.3.1 Fabricated die and test PCB
5.3.2 Reference tracking results
5.3.3 Transient cross-regulation results
5.3.4 DC cross-regulation and line regulation
5.3.5 Efficiency improvement
5.3.6 Comparison with prior arts
5.4 Conclusions
CHAPTER 6 PACKAGING AND VOLTAGE REGULATOR CO-DESIGN FOR HIGH-PERFORMANCE POWER DELIVERY
6.1 Introduction
6.2 Related work
6.2.1 Modeling of the power delivery network
6.3 Electrical domain models
6.3.1 Modeling of PDN considering the converter
6.3.2 Converter structure
6.3.3 Design of the loop filter
6.3.4 Feedback filter design
6.3.5 Modeling of VRM output impedance
6.4 Thermal domain models
6.4.1 Modeling of temperature-dependent VRM power losses
6.4.2 Effect of thermal coupling on converter loss
6.4.3 Thermal modeling environment
6.5 Analysis and co-design of 3D VRMs
6.5.1 Ideal PDN impedance with 3D VRM
6.5.2 Co-analysis and design of VRM and PDN with 3D integration
6.5.3 Transient performance analysis
6.5.4 Transient closed-loop thermal results
6.5.5 2D VRM and processor die
6.5.6 3D-stacked VRM with off-package inductor
6.5.7 3D-stacked VRM with interposed inductor
6.5.8 3D-stacked VRM with on-die inductor
6.5.9 Utilizing improved voltage robustness
6.5.10 Effect of closed-loop temperature sensitivity on the system
6.5.11 Design guidelines
6.6 Conclusions
CHAPTER 7 PACKAGING AND SINGLE-INDUCTOR MULTIPLE-OUTPUT (SIMO) REGULATOR CO-DESIGN
7.1 Introduction
7.2 SIMO integration
7.3 SIMO electrical model
7.4 SIMO power loss model
7.5 Thermoelectrical modeling framework
7.6 Thermoelectrical modeling results
7.7 Conclusions
CHAPTER 8 CONCLUSIONS
8.1 Summary of key contributions
8.2 Future research
REFERENCES
LIST OF TABLES
Table 1 State-of-the-art SIMO design comparison table (*Refers to reference tracking speed)
Table 2 PCB parameters
Table 3 Comparison of the design to the state of the art
Table 4 PDN model parameters
Table 5 Summary of advantages and disadvantages of all the integration schemes
LIST OF FIGURES
Figure 1 Block diagram of typical components in a mobile system
Figure 2 Buck converter power stage circuit schematic
Figure 3 Load current pattern input for efficiency model
Figure 4 Power loss results given Figure 3 input pattern
Figure 5 Illustration of a battery-powered distribution network
Figure 6 Illustration of a DVFS controller structure
Figure 7 Linear voltage regulator circuit schematic
Figure 8 Multiple Output Systems
Figure 9 Scaling problem posed for the partitioning of passives (L and C) as well as load current and switch sizes
Figure 10 Efficiency ratio as a function of total load power and number of cores over which it is distributed
Figure 11 Point of inflection where the SIMO is a better option as the number of cores increases
Figure 12 Conduction losses as a function of load partitioning
Figure 13 System power consumption as a function of load partitioning
Figure 14 Voltage ripple with increasing number of cores at a fixed frequency
Figure 15 Effect of the peak current with current and inductor scaling
Figure 16 AC response of the open-loop model
Figure 17 Transient response of the open-loop model
Figure 18 Single-Inductor Multiple Output (SIMO) power stage with N outputs generated through a single inductor.
Figure 19 Modified power stage with capacitors between outputs.
Figure 20 Effective switching frequency reduction (given fixed voltage ripple window) due to the flying capacitor assist currents.
Figure 21 Conceptual block diagram of the controller.
Figure 22 PWM controller simulated loop gain using transistor level schematics.
Figure 23 Control δ1 to Vo,1 AC, closed loop gain
Figure 24 Transient responses of both power stage filters (conventional and modified) with compensated system.
Figure 25 Closed loop cross-regulation (Impedance) as a function of frequency for both with and without the floating capacitor scheme.
Figure 26 Transient response of Vo,2 due to a step on Io,1 for traditional filter and modified filter
Figure 27 MIMO ideal transient and AC model schematic
Figure 28 SIMO ideal transient and AC model schematic
Figure 29 Frequency response of the loop gain for the type-3 compensator used
Figure 30 Time domain response of the loop gain for the type-3 compensator used
Figure 31 SIMO schematic-based model loop gain
Figure 32 SIMO schematic-based model closed loop transient response
Figure 33 Control block implementing CCM control law (power weighting) used as the reference for the PWM reference generator.
Figure 34 System level overview of the proposed SIMO regulator showing circuit components.
Figure 35 Effect of parasitics on internal (on-die) supply noise due to FET speed and suppression through on-die de-coupling capacitors.
Figure 36 Load switches with diodes and floating driver circuit to reduce the switching loss of the power FETs
Figure 37 Auxiliary linear regulators, these regulators do not conduct any transient current and are only enabled when the output experiences a load transient.
Figure 38 Non-overlapping circuit for load switch enable, the circuit also extends duty cycle to allow full charge of the i-th output before switching to the one that needs charge.
Figure 39 Effect of perturbation on extension and isolation current effect. If load 1 experiences a transient, the duty cycle extension will use part of the isolation current to charge the output until the high-threshold is met.
Figure 40 Hysteretic comparator circuit schematic, the hysteresis window is determined by the M ratio of the PFETs and the speed can be tuned by increasing the bias current in the input stage of the comparator.
Figure 41 Adjustable voltage reference for the hysteretic comparators. The design makes use of a slowed down reference (through Cref,i) to keep reference change within the control bandwidth and avoid voltage droops due to reference changing.
Figure 42 Efficiency and efficiency improvement as a function of load currents.
Figure 43 Integration schemes for the VRM, converter and filter
Figure 44 Modeling of the power delivery network
Figure 45 Different VRM integration schemes
Figure 46 Physics model of two-tiered stack
Figure 47 Integration schemes for the VRM, converter and filter with bidirectional heat flow
Figure 48 Effect of variable temperature on the converter power losses
Figure 49 Full electro-thermally coupled simulation framework
Figure 50 Simplified model of the thermal equivalent network
Figure 51 Effect of variable converter power loss temperature sensitivity
Figure 52 Effect of variable temperature on PDN impedance
Figure 53 Effect of converter on the overall impedance profile utilizing same loop shape as a 2D converter
Figure 54 Loop gain frequency shaping to compensate for improved bandwidth
Figure 55 Effect of converter on the overall impedance profile
Figure 56 Transient simulations for the analyzed converters
Figure 57 2D converter electro-thermal transient closed-loop simulation
Figure 58 3D converter with off-die LC electro-thermal transient closed-loop simulation
Figure 59 3D converter with interposer-based LC electro-thermal transient closed-loop simulation
Figure 60 Transient simulations for the on-die power stage filter
Figure 61 System power loss distribution of the different integration schemes
Figure 62 Transient response of fixed throughput test case
Figure 63 Transient dynamics of the system under varying loss sensitivity with respect to temperature
Figure 64 Single-Inductor Multiple-Output converter with integrated magnetic layer FIVR
Figure 65 3D drawing of the SIMO providing multiple voltages to the processor die
Figure 66 Effect of the temperature on the converter efficiency
Figure 67 Block-level schematic of the simulated thermoelectrical system
Figure 68 Voltage regulation capability of SIMO thermo-electrical simulation
Figure 69 Power and thermal transient characteristics of the SIMO
Figure 70 PDN+SIMO impedance seen across Vcore,1’s terminals
SUMMARY
In this thesis we develop a holistic co-design approach to optimize power conversion
systems, making the relevant trade-offs while taking into account the system-level converter,
load and packaging. We look at the interactions between processing power and the integration
scheme, and at how they affect the converter in terms of size, design and performance.
Chapter 1 introduces the problem to be addressed. Chapter 2 is a literature survey of
state-of-the-art power reduction techniques, together with a comparison of different
power converter topologies and how compatible they are with power saving techniques.
Chapter 3 compares behavioral models of switched-inductor (SI) based dynamic voltage
scaling topologies. The Single-Inductor Multiple-Output (SIMO) approach is compared
against a Multiple-Inductor Multiple-Output (MIMO) approach, and the results motivate
further research into SIMO control techniques. Chapter 4 presents a digitally adjustable
SIMO control technique to address cross-regulation and power loss reduction; the control
technique is then compared against classic MIMO closed-loop control techniques. Chapter 5
presents the circuit implementation and measurements of a test chip fabricated in a 130 nm
process as a proof of concept of the developed Power Weighting controller.
With the emergence of advanced packaging technologies, it is important to analyze the
potential of power converter integration schemes. Chapter 6 looks at different integration
options for the converter-processor system. With 3D-integrated chip stacking, converter
integration has the advantage of reduced parasitics but suffers from increased thermal
coupling. The SI converter is evaluated and compared using different integration schemes
(off-die, on-package, 3D), and it is shown how to properly design the converter to take
advantage of the denser integration. A thermoelectric modeling framework is created to
study the thermoelectric behavior of a fully integrated SI converter with a processor die.
The final technical chapter analyzes the integration of a SIMO converter that is 3D-stacked
with a multi-core processor die, discussing the regulation capabilities and thermal transient
simulations of the SIMO system. It is found that the SIMO is a suitable alternative for
multi-domain voltage conversion; however, the tighter integration and higher temperatures
mean that the parasitics of the passives play a large role in how efficient the conversion is.
Thus, improvements in the quality of passives are still required in order to achieve
high-quality and high-efficiency energy conversion. The thesis concludes with a summary of
the findings in terms of functional advantages of the




CHAPTER 1 INTRODUCTION
Silicon technologies, driven by Moore’s law and consumer demand, have been improving
and providing faster and more energy-efficient computing [1, 2]. However, progress on
the energy density and reliability of energy sources has occurred at a slower pace
[3]. Because of this, recent research has focused on reducing power consumption in order
to keep end-user devices unplugged as much as possible, i.e., short charging times
(large input power) and long battery life (low processing power consumption) [4, 5].
The broad spectrum of features available on embedded systems means that sensors,
CPUs, DSPs, memories, displays and audio use heterogeneous technologies with varying
power consumption profiles, which depend on the user as well as the hardware. Thus
system-level power reduction is a complex problem to model and optimize. Moreover,
within a single die, for example in processing cores, dynamically varying supplies are used
in order to reduce power consumption. These processing cores require high-quality voltage
domains in order to avoid reduced system throughput and logic (functional) errors;
similarly, analog circuits are sensitive to supply noise as well [6], [7], [8]. Thus
ensuring good power quality at low conversion loss with the added feature of compatibility




CHAPTER 2 LITERATURE SURVEY
Various techniques have been proposed to reduce system power consumption and improve
the integrity of the power that is delivered to the processing loads. In heterogeneous systems,
such as the one shown in Figure 1, this becomes even more complex due to the presence of
voltage domains that are far apart, such as those of displays, sensors, digital signal
processors and logic cores. Moreover, within a single homogeneous technology node,
multiple voltage domains are used as a technique to save power. In this chapter we
discuss different power saving techniques that have been developed through the years
and how they relate to power management, voltage (supply) regulation and the power
delivery system.
2.1 Load behavior
The first aspect to consider in a power management system is the load power behavior of
the system at hand. Oftentimes, when selecting a power converter for an application,
we look at the average load current and select a high-efficiency converter based on that
average. However, selecting a converter in this manner overlooks that the average
does not contain as much information as the probability distribution function (PDF) of the

Figure 2: Buck converter power stage circuit schematic
provides a more accurate estimate of the expected efficiency, since the converter efficiency
is load dependent. Figure 3 shows two models for the load current pattern and Figure 4 shows
the resulting converter loss for 4 different input patterns which have the same average
value but different PDFs. We can see that by weighting the power loss in the different
states, we obtain different weighted efficiencies.
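The weighting idea can be illustrated with a minimal Python sketch. The loss model (a fixed loss plus an I²R term), the parameter values and the two load profiles below are hypothetical and chosen only for illustration; they are not the model or the patterns behind Figures 3 and 4. The weighted efficiency is taken here as expected output power divided by expected input power.

```python
def converter_loss(i_load, r_eq=0.1, p_fixed=0.02):
    """Hypothetical loss model in watts: fixed loss plus I^2*R conduction loss."""
    return p_fixed + r_eq * i_load ** 2


def average_current(states):
    """states: list of (probability, load current in A); probabilities sum to 1."""
    return sum(p * i for p, i in states)


def weighted_efficiency(states, v_out=1.0):
    """Expected output power divided by expected input power over the load PDF."""
    p_out = sum(p * v_out * i for p, i in states)
    p_loss = sum(p * converter_loss(i) for p, i in states)
    return p_out / (p_out + p_loss)


# Two illustrative load profiles with the same 0.5 A average current.
flat_profile = [(1.0, 0.5)]
bursty_profile = [(0.9, 0.1), (0.1, 4.1)]

for name, profile in (("flat", flat_profile), ("bursty", bursty_profile)):
    print(name, average_current(profile), weighted_efficiency(profile))
```

Both profiles draw the same average current, yet the bursty one spends part of its time at a high current where the assumed I²R term dominates, so its weighted efficiency is lower.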
Modeling the load current density function represents the load behavior more realistically
than the average alone. It has been shown by Shye et al. that the power profile varies
dynamically depending on the user. It was found that idle power amounts to 47.3% of the
total power consumed, averaged over a day, while the phone is in the idle state close to 89%
of the time. On the other hand, active phone use consumes close to 53.7% of the power
although the phone stays active only 11% of the time [9]. Moreover, there is a similar trend
in systems which employ core boosting, where power profiles are very uneven [10].
As a measure towards achieving high weighted efficiency, the works in [11, 12] sought to
obtain a variable efficiency curve by modulating the switch sizes and selecting the optimum
size to minimize the loss. Looking at a generic inductive converter such as the one shown in
Figure 2, we can estimate the losses as:
$$P_{total} = P_{conduction} + P_{switching} + P_{controller} \qquad (1)$$
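where

$$P_{conduction} = D R_{ds,on,HS} I_{load}^2 + (1 - D) R_{ds,on,LS} I_{load}^2 \propto \frac{L_{eff} I_{load}^2}{W} \qquad (2)$$

$$P_{switching} = f C_{fet} V_{in}^2 \propto f C_{ox} W L_{drawn} V_{in}^2 \qquad (3)$$

Figure 3: Load current pattern input for efficiency model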
In these expressions, Rds,on is the triode-region resistance of the power FETs, D is the duty
cycle of the high-side switch, f is the switching frequency, W is the FET width, Ldrawn is the
drawn switch length and Leff is the effective length of the switch. Cfet is the effective gate
capacitance and Vin is the input voltage. We are neglecting the losses due to the switch
routing in the integrated circuit (IC) and in the printed circuit board (PCB). It is evident
from equations 2 and 3 that while the switching losses are proportional to the switch width,
the conduction losses are inversely proportional to it. Thus the works in [11, 12] measure
the load current and adjust the switch size to operate at maximum efficiency. This is the
point where the conduction losses equal the switching losses, considering that the switching
and conduction losses in the silicon are a function of the frequency, the load current, the
input voltage and the power transistor size.
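The trade-off behind equations 2 and 3 can be sketched numerically. In the Python sketch below, the lumped constants `k_r` (on-resistance times width) and `k_c` (gate capacitance per unit width) and all numeric values are hypothetical placeholders; only the proportionalities, conduction loss falling as 1/W and switching loss growing as W, come from the text. Setting the derivative of the sum with respect to W to zero gives the width at which the two losses are equal.

```python
import math


def conduction_loss(width, i_load, k_r):
    """Conduction loss, proportional to I_load^2 / W; k_r is in ohm*m (assumed)."""
    return k_r * i_load ** 2 / width


def switching_loss(width, f_sw, v_in, k_c):
    """Switching loss, proportional to f * W * V_in^2; k_c is in F/m (assumed)."""
    return f_sw * k_c * width * v_in ** 2


def optimal_width(i_load, f_sw, v_in, k_r, k_c):
    """Width that minimizes conduction + switching loss (d/dW of the sum = 0)."""
    return i_load * math.sqrt(k_r / (k_c * f_sw)) / v_in


# Hypothetical operating point and technology constants.
K_R, K_C = 1e-3, 1e-9           # ohm*m and F/m
F_SW, V_IN, I_LOAD = 10e6, 3.3, 0.5

w_opt = optimal_width(I_LOAD, F_SW, V_IN, K_R, K_C)
print("optimal width (m):", w_opt)
print("conduction loss (W):", conduction_loss(w_opt, I_LOAD, K_R))
print("switching loss (W):", switching_loss(w_opt, F_SW, V_IN, K_C))
```

Because the optimum scales linearly with the load current in this simplified model, a converter that tracks the load and resizes its switches can stay near the minimum-loss point over a wide load range, which is the motivation given for [11, 12].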
Figure 4: Power loss results given Figure 3 input pattern (the four panels report weighted
efficiencies of 90.256231, 90.652828, 91.057633 and 87.932584)

If there is an overlap between the high-side ON-time and the low-side ON-time, the
efficiency will drop drastically. Moreover, because the power FETs are sized to handle
a large amount of current, the resulting Vds·Id losses can burn and damage the power FETs,
or the high current can exceed the metallization current density limits, which can melt or
burn the interconnections. Thus a dead-time is introduced between ON-times; the loss
associated with it is equal to Vd·Iload·tdt·f. The controller losses depend on the design and
are usually negligible in "high-power" systems, where it is usually safe to keep them around
1-5% of the total power budget (this is not the case in ultra-low-power systems such as
energy-harvesting circuits and biomedical applications).
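As a rough numerical illustration of the dead-time term, the Python sketch below reads the expression as the product Vd·Iload·tdt·f. Interpreting Vd as the voltage drop across the conducting path during the dead-time is an assumption, and the numbers are hypothetical.

```python
def dead_time_loss(v_d, i_load, t_dt, f_sw):
    """Dead-time loss in watts, read as V_d * I_load * t_dt * f (assumed product form)."""
    return v_d * i_load * t_dt * f_sw


# Hypothetical values: 0.7 V drop, 1 A load, 2 ns dead-time, 10 MHz switching.
p_dt = dead_time_loss(v_d=0.7, i_load=1.0, t_dt=2e-9, f_sw=10e6)
print("dead-time loss (W):", p_dt)
```

The loss grows linearly with both the dead-time and the switching frequency, so a dead-time that is negligible at low frequency can become a visible part of the loss budget in a high-frequency integrated converter.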
This section shows that the losses in the power converter depend on a variety of factors.
Although the average power consumption is typically considered as the design target,
we have shown that it is more accurate, and more beneficial for reducing power losses, to
use a load profile and measure the weighted power loss. In the next section we discuss
how placement and packaging have also been shown to affect the losses.
2.2 Losses in the power delivery network
It has long been established that power distribution over long distances should be done
at high voltage and low current in order to reduce the I²R losses associated with the
distribution path. In PCB and chip design this fundamental concept applies and is very
important, more

Figure 5: Illustration of a battery-powered distribution network
[13, 14], supposing that the input source is constant (often legacy determines this), then
from one generation to the next, considering that the supply scales down as Dennard’s scal-






Since power consumption is generally related to throughput (which, at the very least,
increases slightly between technology nodes) and the supply voltage scales down, the
current drawn by a newer technology will typically increase. Because of
this, it is becoming common to use integrated voltage regulators (IVRs) [17].
New packaging techniques such as 3D and 2.5D (interposer-based) packaging allow
compact integration of systems, enabling high-speed communication, cleaner voltage rails
and reduced-loss systems [18, 19, 20, 21, 22, 23, 24, 25]. The advantages of integrated
voltage regulators are high-voltage/low-current distribution, fast response times
and a reduction in form factor. By using point-of-load (POL) regulators we can effectively
distribute at high voltage and convert to low voltage and high current at the point closest
to the load, meaning that the power losses related to power distribution are minimized. The
integration of the converter on-package helps reduce some of the parasitics associated with
PCBs and routing, but research on improving on-chip and on-package integration is
ongoing [21, 26, 2, 27, 28, 24].
However, obtaining high efficiency for large conversion ratios is a challenge [28, 23],
especially when high-voltage FETs such as DEMOS or LDMOS are not available. Moreover,
it has been shown that low-voltage transistors have a better figure of merit (FOM)
for power losses, and some research has aimed to reduce the voltage ratings of the
power stage [29, 30]. Although the POL concept is fundamentally sound, it is only valid
for switching regulators; it is worth pointing out that using a POL linear regulator will not
result in efficiency improvements.
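Both points of this section can be made concrete with a small Python sketch. The path resistance, voltages and load power below are hypothetical. The linear regulator is idealized (quiescent current ignored), so its input current equals its load current and its efficiency is simply Vout/Vin; this is why placing it at the point of load does not lower the current in the distribution path.

```python
def distribution_loss(p_load, v_dist, r_path):
    """I^2*R loss (W) when p_load watts are delivered at v_dist volts through r_path ohms."""
    i_dist = p_load / v_dist
    return i_dist ** 2 * r_path


def linear_regulator_efficiency(v_in, v_out):
    """Ideal linear regulator: input current equals load current, so efficiency is Vout/Vin."""
    return v_out / v_in


# Hypothetical case: 10 W load, 10 mOhm distribution path.
P_LOAD, R_PATH = 10.0, 0.01
print("loss when distributing at 1 V:", distribution_loss(P_LOAD, 1.0, R_PATH))
print("loss when distributing at 5 V:", distribution_loss(P_LOAD, 5.0, R_PATH))
print("5 V to 1 V linear POL efficiency:", linear_regulator_efficiency(5.0, 1.0))
```

Distributing at five times the voltage cuts the path loss by a factor of 25 when a lossless switching POL converter is assumed, whereas the idealized linear regulator dissipates the full voltage difference at the load current.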
2.3 Computing power reduction techniques
Another important factor is the set of power reduction techniques available in the
system at hand. In this section we briefly discuss some of the techniques that have
been proposed to reduce power consumption and show why there is motivation to work on
designs that are aware of these power reduction techniques [1].
2.3.1 Cloud computing
Proponents of cloud computing propose running compute-intensive applications on
high-performance, highly energy-efficient servers [31]. However, overheads due to the high
power consumption of transmit/receive radio-frequency power amplifiers (RFPAs) limit
the achievable efficiency. Voice recognition and image processing applications are still
attractive for this scheme, given that the large amount of computing power they need
exceeds the power overhead due to communication.
2.3.2 Hyper/helper threading
Hyper threading has been proposed as a means to improve energy efficiency by turning on
cores only when they are required to complete compute-intensive threads [32, 33]. By
scheduling and partitioning the threads, extra cores can turn on and help the computation
complete within a shorter time.
2.3.3 Energy monitoring
Energy monitoring is a critical part of reducing energy consumption: software-based
applications need information about energy use in order to decide how to implement
energy-efficient computing [34]. Software- and hardware-based power estimation have been
investigated in the literature but fall outside the scope of this text.
2.3.4 Dynamic thermal management (DTM)
Temperature-dependent power consumption has motivated research in Dynamic Thermal
Management (DTM). In [35] a scheduling scheme was developed to optimize throughput
by using cooler cores while allowing other cores to cool down, which ensures optimized use
of resources and reduced leakage power. DTM also minimizes hot spots, where the processor
temperature can be high enough to damage the interconnect, and thus improves reliability.
A lot of work has gone into characterizing this phenomenon [36] and alleviating it [37].
2.3.5 Variable-supply-voltage scaling (VSV)
VSV was proposed as a mechanism consisting of a state machine that monitors cache memory
access misses and scales the supply up when consecutive misses are detected. This effectively




2.3.6 Clock gating
Clock gating disables the switching signals going into registers or logic that are not
performing any computation. The overhead is the delay introduced by the enable circuitry.
2.3.7 Power gating
Power gating is a scheme used to reduce leakage. This problem is most pronounced in
sub-micron processes, where leakage power can account for a large percentage of the total
dissipated power. By placing a series switch, which adds some loss of its own, we can
reduce the "off" current when a block is not being used. Power gating can also be used in
memories, but data retention can become an issue since the virtual supply can collapse and
flip the data.
2.3.8 Dynamic voltage-frequency scaling
Dynamic Voltage-Frequency Scaling (DVFS) is one of the most efficient techniques to
reduce power consumption while maintaining fixed performance (alternatively, to improve
performance at fixed power through core boosting) [38, 1]. Figure 6 shows a block diagram
of a DVFS system. The concept relies on delivering just enough power to complete the
computations and on collapsing or reducing the voltage supplies during light computational
use or idle states.
In a digital system the power consumption can be summarized as the sum of the dynamic
(active), static (leakage or inactive), and short-circuit components [39, 38]:
$$P_{total} = P_{dyn} + P_{static} + P_{shortcircuit} \tag{5}$$
where the individual components are given by:
$$P_{dyn} = \alpha C V_{dd}^{2} f \tag{6}$$
[Figure 6 block labels: Variable Supply Units & Controllers; Cache]
Figure 6: Illustration of a DVFS controller structure
















$$P_d \propto V_{dd}^{3} \propto f^{3} \tag{10}$$
where α is the switching activity factor (varying from 0 to 1), C is the circuit
capacitance, f is the operating switching frequency, and Vdd is the supply voltage. The
static power is increasingly complex in sub-micrometer nodes and can be divided into
sub-threshold leakage and gate leakage. The important point, however, is that both are
sensitive to the supply voltage and both decrease as the supply voltage decreases. Reducing
the supply voltage of lightly loaded cores therefore reduces both dynamic and leakage power
consumption, lowering overall power by a significant amount. In [9], it was found that CPUs in









Figure 7: Linear voltage regulator circuit schematic
as power gating and DVFS should be actively used to reduce power consumption. However,
adjusting the voltage domain to the required target requires a power supply unit (PSU)
with a programmable output as well as a relatively fast response, where the required speed
depends on the frequency of supply transitions. To date, most DVFS implementations use
linear regulators.
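As a rough numerical illustration of the cubic relationship in equation 10, the following
Python sketch evaluates the usual CMOS dynamic power expression (α·C·Vdd²·f) while scaling
the voltage and frequency together. The linear relation between frequency and supply
voltage is an idealization, and the activity factor, capacitance, and nominal operating
point are hypothetical values chosen only for the example.

```python
def dynamic_power(alpha, cap, vdd, freq):
    """Dynamic switching power, P_dyn = alpha * C * Vdd^2 * f."""
    return alpha * cap * vdd ** 2 * freq


def scaled_operating_point(vdd_nom, f_nom, scale):
    """Scale voltage and frequency together (idealized f proportional to Vdd)."""
    return vdd_nom * scale, f_nom * scale


ALPHA, CAP = 0.2, 1e-9        # hypothetical activity factor and capacitance (F)
VDD_NOM, F_NOM = 1.0, 1e9     # hypothetical nominal operating point: 1 V, 1 GHz

p_nom = dynamic_power(ALPHA, CAP, VDD_NOM, F_NOM)
for scale in (1.0, 0.8, 0.5):
    vdd, freq = scaled_operating_point(VDD_NOM, F_NOM, scale)
    ratio = dynamic_power(ALPHA, CAP, vdd, freq) / p_nom
    print(f"scale={scale:.1f}  P/P_nom={ratio:.3f}")
```

Under these assumptions, halving both the voltage and the frequency reduces the dynamic
power to one eighth of its nominal value.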
2.4 Power supplies and DVFS
Voltage regulators have been receiving much attention for DVFS implementation [40,
30, 41, 42, 43, 44, 29, 45, 46, 47, 48]. Most implementations of DVFS have used linear
regulators such as the one shown in Figure 7, the main advantages being ease of design,
ease of integration, and speed. The trade-off is a limitation of the energy savings due to
the lower conversion efficiency. Although it might initially be thought that the saved
voltage drop will simply fall across the pass device, the load current is a function of the
load voltage, meaning that the total power drawn from the input is:
$$P = V_{ds,pass} I_{out} + V_{out} I_{out} \tag{11}$$
From equations 6, 7, and 8, decreasing the supply voltage reduces all of the power drawn
from the voltage regulator, since the voltage is fixed by the regulator and Iout decreases
with decreasing supply voltage. Although the scheme reduces the power consumption by
reducing the load current, the amount dissipated in the pass device is significant and
unused. Linear-regulator-driven DVFS is optimum where very fast supply transitions are
needed and conversion ratios are low, such that the regulator operates in the "drop-out"
region. However, in applications where wide DVFS ranges are required, most of the power is
dissipated in the pass devices. The lossy linear regulators have motivated the use of
integrated switching voltage regulators [41, 42, 30, 43].
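The trade-off in equation 11 can be made concrete with a short Python sketch. It splits the
input power of a linear regulator into the pass-device loss and the load power while the
output is scaled down. The load model (current equal to α·C·Vout·f with frequency
proportional to voltage), the input rail, and all numerical values are assumptions made for
illustration only.

```python
def load_current(alpha, cap, vout, freq):
    """Average current of a digital load, I = P_dyn / V = alpha * C * V * f."""
    return alpha * cap * vout * freq


def linear_regulator_power(vin, vout, iout):
    """Return (pass-device loss, load power, total input power) per equation 11."""
    p_pass = (vin - vout) * iout      # V_ds,pass * I_out
    p_load = vout * iout              # V_out * I_out
    return p_pass, p_load, p_pass + p_load


VIN = 1.2                                        # hypothetical fixed input rail (V)
ALPHA, CAP, F_NOM, V_NOM = 0.2, 1e-9, 1e9, 1.0   # hypothetical load parameters

for vout in (1.0, 0.8, 0.6):
    freq = F_NOM * vout / V_NOM                  # idealized f proportional to V
    iout = load_current(ALPHA, CAP, vout, freq)
    p_pass, p_load, p_in = linear_regulator_power(VIN, vout, iout)
    print(f"Vout={vout:.1f} V  eff={p_load / p_in:.2f}  "
          f"pass={p_pass * 1e3:.1f} mW  load={p_load * 1e3:.1f} mW")
```

In this idealized model the efficiency is simply Vout/Vin, so a wider DVFS range pushes a
growing share of the input power into the pass device.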
The advantage of switching regulators over linear regulators is higher efficiency that
depends only weakly on the conversion ratio; the drawbacks are increased complexity,
increased footprint, switching noise, and reduced speed. However, with the emergence of
integrated voltage regulators (IVRs), high-frequency switching regulators have been shown
to achieve increased bandwidths and smaller footprints.
DVFS is negatively affected by slow supply transitions, which can take from microseconds
to tens of microseconds. Ideally we want a fast and efficient regulator when using
fine-grained DVFS [38, 41]. Some argue that the speed requirements allow DVFS to be
applied at a coarse-grained instruction level and suggest that heterogeneous
microarchitectures could provide higher savings [49]. Desired DVFS supply-voltage tracking
speeds are in the 20 mV/ns range, which allows for fine-grained applications.
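To put the 20 mV/ns target in perspective, the small Python sketch below converts a
tracking slew rate into the time needed to traverse a supply step. The 400 mV step and the
slower 0.04 mV/ns rate are hypothetical values, picked so that the slower case lands in the
microsecond range mentioned above.

```python
def transition_time_ns(delta_v_mv, slew_mv_per_ns):
    """Time (ns) for the supply to traverse delta_v at a constant tracking slew rate."""
    return delta_v_mv / slew_mv_per_ns


STEP_MV = 400.0                                   # hypothetical DVFS voltage step
fast_ns = transition_time_ns(STEP_MV, 20.0)       # target rate quoted in the text
slow_ns = transition_time_ns(STEP_MV, 0.04)       # hypothetical slow regulator
print(f"fast: {fast_ns:.0f} ns, slow: {slow_ns / 1e3:.0f} us")
```

With these numbers the same step takes 20 ns at the target rate and about 10 µs at the
slower rate.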
2.4.1 Single-inductor multiple-output topology and DVFS
One of the main challenges of enabling DVFS is that multiple independent voltage domains
require the use of multiple regulators. Linear regulators are common because of their ease
of integration: they only require silicon devices, so the voltage regulators can be
implemented within the CMOS die. However, because their efficiency depends on the
conversion ratio, switching regulators are preferred. Two types of switching regulators
exist, switched-inductor and switched-capacitor. Both offer higher efficiency than their
linear counterparts, but switched-capacitor circuits use fixed conversion ratios, which
creates a challenge for enabling DVFS; for this reason switched-inductor regulators are
more common. When using multiple switching voltage regulators with multiple inductors,
besides the increase in cost and complexity, large noise levels and EMI issues arise [40].
Moreover, mutual inductance can affect nearby supply voltages if the inductors couple
together. This can be avoided by placing large isolation layers between the inductors, but
this comes at an area penalty.
Thus the Single-Inductor Multiple-Output (SIMO) topology has received much attention
in the literature due to its use of a single inductor to generate multiple independent
output voltages [40, 50, 51, 52, 53, 54, 55, 56]. Although it was initially introduced as a
means to reduce inductor count, and thus to generate as many voltage domains as the
application requires at low cost, the SIMO topology has also received attention as a means
to enable DVFS at high efficiency [40]. Moreover, the SIMO can even generate voltages that
exceed the input voltage, acting as a boost converter, as well as voltages that are
negative with respect to ground [50, 57]. This is well suited to applications where we can
use turbo boosting in conjunction with DVFS or want to reduce leakage by using a negative
rail. The SIMO is therefore attractive for applications that need to generate heterogeneous
voltage levels, such as mobile environments or DVFS systems.
2.5 State-of-the-art SIMO designs
The state of the art for the SIMO is shown in Table 1. Although it is very common for the
efficiency of switching converters to be in the 90%+ range, most published works show
a lower peak efficiency than single-output converters. The lower efficiency comes from
the extra switches added in the SIMO topology, which incur both conduction and switching
losses. However, the comparison with regular buck or boost converters is not fair
considering that the SIMO topology is more compact because it uses only one inductor, which
is usually the bulkiest and most expensive element in a power converter. The key metrics in
SIMO design are cross-regulation suppression (meaning how much a change in load current in
one regulated output affects the neighboring regulated voltages) and tracking speed
[58, 53], which is a critical parameter for fine-grained DVFS. Moreover, as a requirement
for size reduction and/or integration, passive component size is also an important metric
given that the volume of the passives is usually larger than that of the power regulator
silicon die itself. As part of the proposed work for this thesis we will try to improve the
tracking speed and reduce the filter size of the SIMO topology such that we can compete
with the state of the art. It is important to note that although the literature suggests
tracking speeds in the 20 mV/ns range [38], this is extremely challenging unless very
advanced technology nodes are used.
Table 1: State-of-the-art SIMO design comparison table (*Refers to reference tracking
speed)
Design       [59]    [60]    [54]    [61]    [53]     [62]   [63]     [64]    [58]
Technology   0.5um   0.5um   0.5um   0.5um   0.35um   65nm   0.35um   0.5um   0.35um
Area (mm2)





















































































88.4 82.3 80.8 80 82.8 83.1 N/A 82 87
CHAPTER 3
MULTIPLE VOLTAGE DOMAIN POWER CONVERTER
MODELING
The reduction of power consumption in microprocessors is of critical importance for em-
bedded systems. Dynamic Voltage-Frequency Scaling (DVFS) has emerged as a very pow-
erful technique to decrease leakage and switching power and extend battery life [65]. DVFS
relies on creating multiple core-voltage domains to adjust the supply voltage of each core
depending on the desired throughput. The multiple, independent power supply and voltage
regulation in DVFS presents intriguing design challenges for future processors. In systems
with off-chip voltage regulator modules (VRMs), the number of independent voltage is-
lands is limited by the number of available power pins, the on-chip decoupling capacitance,
and the power quality. The integration of VRMs with the processor core (Integrated Voltage
Regulators or IVRs) provides opportunities to improve power quality as well as increase
the number of power domains with limited on-chip decoupling capacitance. Therefore,
design and integration of IVRs has received significant attention in recent years including
demonstration of high-performance commercial processors [45]. While the linear regula-
tors provide quick response and are not difficult to integrate, their low efficiency, particu-
larly for high conversion ratio (i.e. low-voltage operation of processors) has been a major
challenge. More recently, integration of switched-inductor regulators is being investigated
due to the advantages of reduced power losses over wide input and output voltage ranges.
Prototype and commercial processors with switching (high-efficiency) voltage regulators
have been demonstrated [29],[30],[45],[46],[47],[48]. With on-going research for better
quality on-chip passives [26] and [27] we can move closer towards our goal of integra-
tion of low-power optimized systems. Challenges arise, however, when designing VRMs


































































Figure 8: Multiple Output Systems
algorithms such as DVFS. When looking at inductor-based converters, two main topologies
exist to create multiple, independently controlled voltage domains: 1) the
Multiple-Inductor Multiple-Output (MIMO), where an independent IVR with its own power stage
and inductor is used for each domain (a single-output example is shown in Figure 8), and
2) the Single-Inductor Multiple-Output (SIMO), where a single power stage and inductor
is multiplexed across multiple power domains (N-output SIMO schematic). The design
and demonstration of these topologies have been investigated extensively in the literature.
However, to the best of our knowledge, there is a lack of analysis of the system-level
trade-offs between these topologies considering their impact on the power efficiency
and robustness (i.e. power quality) of multi-core (multi-power-domain) processors. Such
an analysis is important to enable an optimal choice of IVR topology for different
multi-core processors. This chapter provides system-level insight into the effectiveness of
the two competing switched-inductor IVR topologies, namely MIMO and SIMO, in designing
multiple power domains with DVFS capabilities. This chapter makes the following
contributions:







[Figure 9 panel labels: Core 1, Core 2, Core 3, Core 4 (shown in both panels)]
Figure 9: Scaling problem posed for the partitioning of passives (L and C) as well as load
current and switch sizes
2. Power loss and efficiency comparison under constant and varying load core throughput
3. Effective worst-case output voltage ripple comparison under varying load partitioning
3.1 Background
Various topologies of switch-mode power supplies exist, but they fall mainly into two
categories: Switched-Capacitor (SC) and Switched-Inductor (SI). SC regulators are
attractive because capacitors are easier to integrate in CMOS processes. However,
fine-tuned voltage conversion is a design challenge because of the fixed conversion ratios
of SC regulators ([28], [40]). SI VRMs are not limited to fixed conversion ratios and can
achieve better regulation at high efficiency, which is attractive for DVFS ([41], [42]).
However, generating multiple outputs requires multiple inductors, which increases cost and
complexity; moreover, multiple switching inductors are a cause for EMI concerns
[40],[66]. The SIMO, on the other hand, shares the inductor, which reduces the cost and
complexity of having multiple inductors; moreover, the only switching is that of the main
inductor, so EMI concerns can be reduced. The control and regulation capabilities of the
SIMO, however, are a challenge. Many control mechanisms for the SIMO topology have been
proposed ([50], [59], [51], [58], [67], [64], [53], [63], [52], [55], [61], [62], [68],
[69], [56]) to address different concerns. Mainly, in SIMO systems it is possible that
current variations in one voltage domain affect the neighboring voltage islands due to the
sharing of the inductor and a finite controller speed; this effect is called
cross-regulation. In this chapter we seek to use previously developed models for the MIMO
and SIMO and evaluate metrics such as efficiency, system power, voltage ripple, and
regulation capability (load regulation and cross-regulation). We will pose the case of
having a fixed area in which we can either use a single large inductor or create small
islands with smaller-valued inductors, as shown in Figure 9. Moreover, the capacitors, load
currents, and transistor sizes of the power stage are also reduced as the overall number of
cores increases. The individual models will be discussed first; the results are then
discussed, giving some insight into the behavior and trade-offs of both schemes.
3.2 Impedance models
Modeling the equivalent output impedance of the converter is useful to show the inherent
quality of the VRM's load regulation (voltage droop in response to a load transient) for
digital cores. An impedance model gives us an idea of what to expect at high frequencies
that exceed the controller bandwidth. Moreover, as will be discussed in Chapter 4, tuning
the controller bandwidth, poles, and zeros is a way to improve the shape of the overall
Power Delivery Network (PDN) impedance. Since high switching frequencies lead to extremely
long simulations, linearized models, from which we can obtain voltage droops, are very
useful for the designer, who can use these estimates and tune the circuit overheads to
tolerate the supply fluctuations. What we effectively extract is the response of the
converter plus passives to a load current step (i.e. load-line regulation). We need to keep
in mind that the AC impedance is reduced through the use of feedback control. However, in
order to make a fair, controller-less comparison between the two topologies we will look at
the open-loop model for both. Closed-loop load regulation will be discussed in Chapter 4,
where a novel control scheme for the SIMO is presented and compared against a MIMO
closed-loop system.
3.3 Voltage ripple model
An expression for the output voltage ripple is found by adding the ripple due to the
equivalent series resistance (ESR) of the capacitor and the ripple due to the charge being
deposited into the output capacitor. This expression is important because the ripple must
remain within the bounds (in this analysis, within 10% of Vout) that keep a core
operational. The voltage ripple is a function of the converter frequency, the load current,
the inductor peak current, and the output capacitor. Moreover, it is also a function of the
ESR of the capacitor, which scales inversely with area.
$$V_{out,ripple} = V_{ripple,C} + V_{ripple,ESR,C} \tag{12}$$
The voltage ripple differs between the MIMO and the SIMO; moreover, for the SIMO there
are differences that depend on the conduction mode. For a single-output switched-inductor
VRM, the voltage ripple is:
$$V_{ripple,C} = \frac{(D_a + D_b)\, I_{ESR,C}}{f_{sw}\, C} \tag{13}$$
where Da is the ON time of the high-side switch of the power stage, Db is the ON time of
the low-side switch of the power stage, fsw is the switching frequency of the power stage,
C is the load






where Ipk is the peak current of the converter and Iload is the load current of the
converter. The ESR portion of the voltage ripple is given by:
$$V_{ripple,ESR,C} = R_{ESR,C} \cdot I_{ESR,C,pk} = R_{ESR,C} \cdot (I_{pk} - I_{load}) \tag{15}$$
These equations hold for both DCM and CCM because the high-side ON time (Da) and the
low-side ON time (Db) apply to both conduction modes. The difference is that in DCM there
is an extra interval (Dc) where both the high-side and low-side switches are off, but this
interval does not contribute any ripple voltage. For the SIMO, the voltage ripple equations
in DCM are equal to those in equations 13, 14 and 15. However, when going into CCM, the
inductor current builds up a DC level and the equations for the capacitor currents change,
as they gain an added term corresponding to the DC level of the inductor current. Equations
16 and 17 show the adjustments made to include the DC inductor current

























− 1) + (Ipk − Iload); (17)
Here δ represents the ON time of the load switch that is being served. Since the inductor
current is shared, the sum of all δi must be less than 1.
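The ripple expressions can be exercised numerically. The Python sketch below implements
equations 12, 13 and 15 as written, checks the result against the 10% bound used in this
analysis, and verifies the sharing condition on the load-switch ON times. The capacitor
current term of equation 13 is passed in directly as a parameter, and every numerical value
is a hypothetical operating point chosen only to exercise the formulas.

```python
def capacitor_ripple(d_a, d_b, i_cap, f_sw, cap):
    """Charge-related ripple term, equation 13: (Da + Db) * I_ESR,C / (f_sw * C)."""
    return (d_a + d_b) * i_cap / (f_sw * cap)


def esr_ripple(r_esr, i_pk, i_load):
    """ESR-related ripple term, equation 15: R_ESR,C * (I_pk - I_load)."""
    return r_esr * (i_pk - i_load)


def output_ripple(d_a, d_b, i_cap, f_sw, cap, r_esr, i_pk, i_load):
    """Total output ripple, equation 12: sum of the capacitor and ESR terms."""
    return capacitor_ripple(d_a, d_b, i_cap, f_sw, cap) + esr_ripple(r_esr, i_pk, i_load)


def ripple_within_bound(v_ripple, v_out, fraction=0.10):
    """True when the ripple stays within the given fraction of Vout (10% in the text)."""
    return v_ripple <= fraction * v_out


def load_on_times_valid(deltas):
    """SIMO sharing condition: the load-switch ON times must sum to less than 1."""
    return sum(deltas) < 1.0


# Hypothetical operating point, used only to exercise the formulas.
ripple = output_ripple(d_a=0.2, d_b=0.3, i_cap=0.05, f_sw=10e6, cap=1e-6,
                       r_esr=0.02, i_pk=0.4, i_load=0.1)
print(f"ripple = {ripple * 1e3:.2f} mV, within bound: {ripple_within_bound(ripple, 0.5)}")
print(f"ON times valid: {load_on_times_valid([0.2, 0.3, 0.25])}")
```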
3.4 MIMO and SIMO power loss model
Power losses in single-stage buck converters and DCM-controlled SIMOs have been well
discussed in the literature [68]. We adjust them to an N-output system where we can express
them as a




$$P_{loss} = \sum_{i=1}^{N} \left( P_{cond,i} + P_{SW,i} \right) \tag{18}$$
where N is the number of outputs and Pcond,i and PSW,i are the conduction and switching
losses, respectively.
3.4.1 MIMO impedance model
The state-space model for a buck converter operating in DCM has been derived in
[70]. A characteristic of a DCM power stage is that it acts as a single-pole
voltage-controlled current source. For the MIMO, the equivalent open-loop DC resistance
















































Figure 10: Efficiency ratio as a function of total load power and the number of cores over
which it is distributed
where L is the inductor value, fs the switching frequency, M the conversion ratio, and Da,i
the ON time of the high-side switch. An interesting fact is that in DCM the DC impedance
is unaffected by parasitics (RESR,L, RESR,C, and RDS,ON). The total expression for the AC















3.4.2 SIMO impedance model
Looking at the SIMO in DCM operation we can use a similar expression but we would
have to adjust the duty cycle. Here, to make the comparison easier we will use the high-
side switch ON-time for the i-th load Da,i based on the serving period, then we can express















































Figure 11: Point of inflection where the SIMO is a better option as the number of cores
increases
3.5 Simulation results and discussions
3.5.1 Efficiency comparison
First, we need to set the constraints for the case study; the total per-load current is set as a





We will also assume that some of the parasitics will scale down with the inductance and
capacitance values and that the power stage switches will be resized to equalize conduction




































Similarly for the SIMO, the constraints are:




$$L_i = L_{Total}, \quad R_{ESR,L} = R_{ESR,L,Total}, \quad C_i = \frac{C_{Total}}{N_{cores}}$$

































Given these constraints, we want to determine the losses for each topology. Furthermore,
we evaluate the interaction between implementing a DVFS system and the power losses. It is
important to notice that the inductance value does not need to scale down with the number
of cores for a SIMO since the inductor is shared. Figure 10 shows the efficiency ratio
ηSIMO/ηMIMO with varying power and core number; we can see that in the higher power and
higher partitioning range, the SIMO has an advantage over the MIMO. Figure 11 shows the
core inflection point (the core segmentation at which the SIMO becomes the better option)
as a function of system power. At high power and number of cores (i.e. higher throughput),
the SIMO is the better option, whereas for decreasing load currents the inflection point
increases. This means that at low currents the SIMO is only better when the partitioning of
cores is very high. Depending on the system, higher partitioning may or may not be
feasible. Increasing the partitioning makes the load on-time smaller, and there are limits
to the resolution of the pulse (minimum pulse width) based on the technology used, which
determines the MOSFET
[Figure 13 plot title: System Power Loss as a function of Number of Cores; legend: SIMO and
MIMO system loss at fmax = 250 MHz and at fmax = 700 MHz]
Figure 13: System power consumption as a function of load partitioning
transition speed. In order to explain why the partitioning inflection point shifts with
varying load current we need to take a look at the individual power loss components. Due to
the sizing restrictions, the switching losses are fixed even as the number of cores
increases. Figure 12 shows that the reason the SIMO is a better option at high partitioning
is related to the conduction losses. For the MIMO, the total inductance value and the total
load current are both divided by the number of cores, which effectively makes the MIMO peak
current independent of the number of cores. Interestingly, this is not the case in the
SIMO, where the inductor peak current actually decreases with an increasing number of cores
(lower DC current per core). In the SIMO case, $I_{pk,SIMO} \propto 1/\sqrt{N_{Cores}}$.
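The scaling of the peak current with the number of cores can be checked with a minimal
Python sketch. It assumes an ideal, lossless buck stage in DCM in which each output
receives one triangular inductor current pulse per switching period; it is a sketch of the
scaling argument only, not the model used to generate the figures, and it does not verify
that all pulses fit within one period. The function names and all numerical values are
hypothetical.

```python
import math


def dcm_peak_current(i_load, v_in, v_out, inductance, f_sw):
    """Peak inductor current of an ideal buck stage in DCM.

    Follows from I_load = I_pk * (Da + Db) / 2 with Da = I_pk * L * f_sw / (Vin - Vout)
    and Db = Da * (Vin - Vout) / Vout, ignoring all parasitics.
    """
    return math.sqrt(2.0 * i_load * v_out * (v_in - v_out) / (inductance * f_sw * v_in))


def mimo_peak_current(i_total, l_total, n_cores, v_in, v_out, f_sw):
    """MIMO: each core has its own inductor L_total / N and carries I_total / N."""
    return dcm_peak_current(i_total / n_cores, v_in, v_out, l_total / n_cores, f_sw)


def simo_peak_current(i_total, l_total, n_cores, v_in, v_out, f_sw):
    """SIMO: the shared inductor keeps L_total while each output carries I_total / N."""
    return dcm_peak_current(i_total / n_cores, v_in, v_out, l_total, f_sw)


# Hypothetical values, used only to show the trend with the number of cores.
V_IN, V_OUT, F_SW, I_TOTAL, L_TOTAL = 1.8, 0.6, 10e6, 0.01, 100e-9
for n in (1, 2, 4, 8):
    i_mimo = mimo_peak_current(I_TOTAL, L_TOTAL, n, V_IN, V_OUT, F_SW)
    i_simo = simo_peak_current(I_TOTAL, L_TOTAL, n, V_IN, V_OUT, F_SW)
    print(f"N={n}  Ipk_MIMO={i_mimo * 1e3:.1f} mA  Ipk_SIMO={i_simo * 1e3:.1f} mA")
```

Under these assumptions the MIMO peak current stays constant because the ratio of load
current to inductance does not change, while the SIMO peak current falls as the inverse
square root of the number of cores.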
3.5.2 Throughput and system power
Figure 13 shows the overall system power at two fixed target frequencies. We used an
established simulation framework to estimate the V-F curve and determine the power
consumption as a function of load core partitioning. We see that, although power reduces in
both approaches, using a SIMO seems like the better choice under high throughput and using
a MIMO is better for low throughput; this relationship agrees with our previous findings.
At lower partitioning (low NCores), the total output power is high (no DVFS advantage), but
the SIMO does not have an advantage at the high-load, low-NCores condition. As we increase
the number of cores, the throughput is fixed, so the power is effectively reduced by the
DVFS algorithm, meaning that the SIMO shifts away from its favorable mid-to-high current
application space. Arrows representing the direction of increasing and constant
throughput have been added to Figure 10 to highlight the application area for both the SIMO
and the MIMO.
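The effect of partitioning a fixed throughput over more cores can be sketched in a few
lines of Python. The linear voltage-frequency curve with a minimum-voltage floor below is a
hypothetical stand-in for the V-F curve obtained from the simulation framework, leakage is
ignored, and the per-core capacitance and activity factor are assumed values; only the
700 MHz target is taken from the legend of Figure 13.

```python
def core_voltage(freq, f_max, v_max, v_min):
    """Hypothetical linear V-F curve, clamped at a minimum operating voltage."""
    return max(v_min, v_max * freq / f_max)


def system_dynamic_power(n_cores, f_target, alpha, cap, f_max, v_max, v_min):
    """Total dynamic power when a fixed throughput f_target is split over n_cores."""
    f_core = f_target / n_cores
    v_core = core_voltage(f_core, f_max, v_max, v_min)
    return n_cores * alpha * cap * v_core ** 2 * f_core


F_TARGET = 700e6                      # fixed throughput target (Hz)
ALPHA, CAP = 0.2, 1e-9                # hypothetical per-core activity and capacitance
F_MAX, V_MAX, V_MIN = 700e6, 1.0, 0.3  # hypothetical V-F curve end points

for n in (1, 2, 4, 8):
    p = system_dynamic_power(n, F_TARGET, ALPHA, CAP, F_MAX, V_MAX, V_MIN)
    print(f"N={n}  P_dyn={p * 1e3:.1f} mW")
```

In this simplified model the total power drops with the square of the per-core voltage
until the minimum voltage is reached, after which further partitioning brings no additional
saving.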
3.5.3 SIMO and MIMO output voltage ripple
Figure 14 shows the worst-case output voltage ripple in both the SIMO and the MIMO. In
DCM, the peak-current ripple going into the load capacitor dominates the voltage ripple,
which is also a function of load current. The offset between the SIMO and the MIMO
is mostly due to the inductor peak current being lower in the SIMO. This is an important
outcome considering that supply ripple suppression is desirable for core performance
and reliability. The graph shows that parallelism has less effect on the voltage ripple of
the SIMO than on that of the MIMO. At NCore=4, a 2x reduction in voltage ripple is achieved
with respect to the MIMO counterpart. Figure 15 can be used to explain why splitting the
inductance can lead to increased current ripple. The inductor current charge rate is given
by VL/L, thus splitting an inductor (making it smaller) yields a higher slope. Since we are
splitting both the inductance and the load currents in the case of the MIMO, the peak
current remains the same (see the derivation below). In the SIMO, however, the peak current
is reduced when splitting the loads because the inductor size is fixed. Thus the SIMO will
have a lower ripple current than the MIMO when delivering the same average current and
operating in DCM.


































































Figure 15: Effect of the peak current with current and inductor scaling


















For the SIMO, however:





























Thus we can see, by comparing equations 28 and 32, that the inductor ripple is reduced
when splitting the load current in the SIMO (because of a longer ON-time Da), whereas in
the MIMO it is not. More simply, the ripple in the SIMO is lower due to the factor
$2\sqrt{N_{cores}}$ in the denominator. The splitting effect results in lower inductor
ripple losses for the SIMO as well as lower output voltage ripple.
3.5.4 Open-loop impedance
Figure 16 shows the equivalent output impedance of a two-output SIMO as well as the
equivalent output impedance of a single stage of a MIMO. In the SIMO we see that at
low frequencies the impedance is dominated by the converter's output impedance since
the capacitor acts as an open circuit. As frequency increases, the impedance starts to drop
due to the capacitor and saturates at the ESR value. In our analysis we neglect the very
high frequency equivalent series inductance of the capacitor, LESL,C. Typically, and
especially in low-power applications, current steps in that frequency range are small
enough that we do not see a significant voltage droop due to the ESL. Figure 17 shows the
transient results, overlaying the linearized model on an actual switching converter model.
We can see that the linearized models accurately predict the voltage droops due to a load
step. During startup, the model deviates from the actual response because we are using
an open-loop converter. During startup, in each on-off cycle the current builds up faster
than it dies down due to the net positive VL/L; thus the converter is not really in DCM
but in CCM, and the startup is faster than predicted by the model.
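The qualitative shape described above can be reproduced with a simple first-order Python
sketch. It treats the open-loop converter as a single DC output resistance in parallel with
the output capacitor and its ESR, and neglects the ESL as in the text. This lumped model
and its component values are assumptions made for illustration; they are not the
state-space model derived in this chapter.

```python
import math


def open_loop_output_impedance(freq_hz, r_out, cap, r_esr):
    """|Z| of a converter output resistance in parallel with a capacitor and its ESR.

    Z(f) = R_out || (R_ESR + 1 / (j * 2 * pi * f * C)); the capacitor ESL is neglected.
    """
    z_cap = complex(r_esr, -1.0 / (2.0 * math.pi * freq_hz * cap))
    return abs(r_out * z_cap / (r_out + z_cap))


# Hypothetical values: 50 ohm open-loop DC resistance, 1 uF capacitor, 20 mohm ESR.
R_OUT, CAP, R_ESR = 50.0, 1e-6, 0.02
for f in (1e1, 1e3, 1e5, 1e7, 1e9):
    z = open_loop_output_impedance(f, R_OUT, CAP, R_ESR)
    print(f"f={f:8.0e} Hz  |Z|={z:.4f} ohm")
```

At low frequency the magnitude approaches the converter output resistance, and at high
frequency it flattens out near the ESR value, which matches the behavior described for
Figure 16.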
3.6 Conclusions
This chapter presented the fundamentals, modeling, and comparison of MIMO- and
SIMO-based VRs. We have identified the relative weaknesses (more complex control,
higher switch count) and strengths (lower cost, high-load efficiency, and lower supply
voltage ripple) of the SIMO topology with respect to its MIMO counterpart. At the system
level, we have observed that the MIMO can be a good choice when the objective is to create
multiple power domains and to reduce power at a fixed throughput. We have also found that
the SIMO combined with DVFS could be a better option for increasing throughput at fixed
power. We have also found that the voltage ripple, under the constraints presented, can be
lower in the SIMO when trying to achieve multi-domain conversion, which motivates us to
further
Figure 17: Transient response of the open-loop model
explore power saving techniques by trading off voltage ripple. The challenge is now to
design a control scheme for the SIMO that can mitigate some of its disadvantages (cross-
regulation) while efficiently achieving a high power density. Chapter 4 will present the
control technique and modified power stage used to achieve these goals.
CHAPTER 4
CLOSED-LOOP CONTROL OF MULTI-DOMAIN CONVERTERS
4.1 Introduction
Low-power design and dynamic power management have become a critical part of processor
design. To improve power efficiency, processors have multiple, independent voltage domains
to supply multi-core designs [65, 42, 66]. Each power domain is expected to operate at an
ultra-low supply voltage, even lower than 0.3V, making use of near- or sub-threshold
operation [71, 72]; thus VRMs need to efficiently generate very low output voltages
from higher input voltages. Moreover, processors include workload-dependent dynamic
power management techniques like dynamic voltage-frequency scaling (DVFS) for each
power domain. The design of VRMs for low-power processors is a critical design problem
for mobile platforms. Light, compact, and low-cost VRMs, optimized to generate multiple,
independent, and programmable voltage domains, require research as well as careful
design. Moreover, to maximize the opportunity for power saving with DVFS, the VRMs
need to enable DVS and provide fast changes in the output voltage (i.e. fast reference
tracking and load response). The preceding characteristics must be achieved with good power
efficiency of the VRM to improve the system's battery life. Switched-inductor VRMs are
attractive for mobile processors as they can efficiently generate fine-grained voltage
levels compatible with DVFS [28] [27]. However, the need for multiple bulky inductors for
multiple output domains makes multiple-inductor multiple-output topologies impractical.
Although on-die inductive regulators and power magnetics have been improving [26] [45]
[73], avoiding multiple inductors is attractive since we can avoid the increase in size and
layout area/complexity as well as secondary effects such as coupling between the inductors
and adjacent circuits [66]. The passives (inductor and capacitors) contribute significantly
to the overall footprint, form factor, and cost of the VRM, and in fact, can dominate over the








































Figure 18: Single-Inductor Multiple Output (SIMO) power stage with N outputs generated
through a single inductor.
more suitable for mobile processors to meet the goal of light, compact and low-cost design.
The Single-Inductor Multiple-Output (SIMO) VR has been proposed as a solution to reduce
inductor count [40] [58]. As shown in Figure 18, for N output power domains, the SIMO
topology requires a single inductor and N capacitors, instead of N inductors and N
capacitors. Therefore, the design of SIMO regulators that can support the power management
needs of low-power mobile processors is an important problem to investigate. The SIMO has
new design challenges such as managing cross-regulation, i.e. the regulation interference
between supply rails that leads to voltage variation in one output due to load variations
(DC or transient) in other outputs. The literature shows extensive studies on techniques to
reduce cross-regulation [59] [52] [53] [69]. Most cross-regulation management schemes rely
on discontinuous conduction mode (DCM) or pseudo-continuous conduction mode (PCCM),
both of which limit the total power density of the system for a target power quality (output
ripple) and total passive size (see section II for a detailed discussion). Therefore, trade-offs



















[Figure 19a: conduction-mode waveforms; recoverable labels: Io,1, Io,2, Io,3, Io,4, PCCM]






































(b) CCM operation waveforms for load cur-
rents, their respective ON-time and Vx node
averaging (power weighting) effect
size, particularly the inductor size, facilitates faster reference and load transitions as well
as higher power density. In general, reducing passive size for a target power quality
requires a higher switching frequency and, hence, higher switching loss. In a SIMO, reducing
the inductor size requires a higher switching frequency in the main power stage. But the
switching frequencies of the N output devices are determined by the load currents and their
respective on-times, the target output ripple, and the output capacitors. Therefore, design
innovations that reduce the switching frequency of the output stages for a given total
capacitance (i.e., N times the capacitance of each output) and output ripple help improve the
efficiency of the regulator. To reiterate, reducing the passive size for a target total delivered
power and power quality improves the effectiveness of the design. This chapter presents a
SIMO design that addresses the preceding challenges for the low output voltages desired in
mobile processors: it delivers higher power using smaller passives (i.e., higher power density),
provides fast reference tracking, and ensures well-controlled voltage ripple and cross-regulation.
First, we present a power-weighted CCM controller topology for the power stage that
suppresses cross-regulation while allowing the converter to operate in the deep CCM region
[74]. Conventionally, CCM operation is known to result in poor cross-regulation; by
minimizing cross-regulation in CCM operation, the presented controller helps support higher
output power for a given power quality. Second, we present an innovative power stage
topology for the output devices that uses coupling capacitances between the outputs to
minimize the switching frequency of the output devices for a target load and voltage ripple. The
coupling capacitors leverage a Miller-like effect to increase the effective capacitance at each
output, thereby reducing the switching frequency of the output devices and improving
overall efficiency. The chapter presents the operating principles of the proposed controller and
discusses the stability analysis of the SIMO regulator. The circuit design methods to realize
the proposed SIMO regulator are presented in Chapter 5, emphasizing the innovations
used to enable very low output voltages and improve transient responses. The rest of the
chapter is organized as follows: Section 4.2 discusses the power-weighted CCM controller;
Section 4.3 presents the modified power stage and a figure of merit to analyze its
effectiveness in improving the power density; Section 4.4 discusses the stability analysis;
Sections 4.5 and 4.6 present the closed-loop schematic models for the SIMO and the
Multiple-Inductor Multiple-Output (MIMO) converters and compare their regulation capabilities;
Section 4.7 finishes with the summary of the contributions made by the power-weighting
controller as well as the comparison of the MIMO and SIMO.
4.2 Power Weighted CCM Controller
The SIMO power stage shown in Figure 18 requires a controller to manage DC regulation.
To implement a control scheme, we first need to select the conduction mode of the
inductor. There are three ways the inductor current can be controlled in the SIMO. Figure
19a shows the three main conduction modes: 1) Discontinuous Conduction Mode
(DCM) - when the current is allowed to fall to zero each time a particular output is served,
2) Continuous Conduction Mode (CCM) - when the current is non-zero when switching between
one output and the next, and 3) Pseudo-Continuous Conduction Mode (PCCM) - when a DC
current is purposely allowed to flow in the inductor to increase the power density, while a
large ripple-to-DC ratio is used to reduce cross-regulation [59] [52]. We can see that to deliver
the same output current, both DCM and PCCM require very high peak currents. The high
peak current results in large voltage ripple; to reduce this ripple, the filter size (L or C)
must increase, resulting in an increase in volume and/or cost. Moreover, high peak
currents increase the inductor AC power losses. They can also cause inductor saturation, exceed
current density limits for metal lines, and inject high electromagnetic interference into
nearby circuits [73]. A higher peak-to-average current also increases the voltage
ripple, thereby reducing power quality and increasing ripple loss (lower efficiency). The
problem becomes more challenging as the output current demand increases. Alternatively, a
higher switching frequency can be used to deliver high output current at low ripple loss, but
at the expense of increased FET switching loss. Moreover, at high frequencies the inductor
losses are further increased since, for a given inductor value, the quality factor (Q) reduces
as the switching frequency increases [72]. In this work, we address the power density,
voltage ripple, and cross-regulation challenges by designing a converter that operates in the
CCM region while managing cross-regulation. The switching frequency for the main
regulator power stage is constrained by package parasitics (discussed in section IV) and FET
power loss; it is selected by design as fsw = 20 MHz. Given the design switching frequency,
with increasing output power (power density), the power stage moves into the deep-CCM
region. In this region, the inductor has a very low peak-to-DC current ratio and can
be approximated as a DC current source. The inductor current is then multiplexed (shared)
among the outputs through hysteretic turn-on of the i-th switch Si. The i-th load switch
Si is turned ON until the high threshold voltage is reached and turned OFF until the lower
threshold voltage is reached. The hysteresis window fixes the voltage ripple, while the
frequency and ON time vary to keep the output voltage within the hysteresis bounds. Hysteretic
control has several advantages in this application: 1) it makes the control loops for the
individual outputs inherently stable; 2) it maintains a fixed ripple voltage and quick response
time; and 3) as will be discussed next, the hysteresis state can be used to estimate the
per-load current, which will be critical for the design of the CCM control loop. Controlling the
portion of the current delivered to the outputs can be done through hysteretic comparators,
but the key challenge is to set the correct level of inductor current. This problem was solved
using a power-weighted controller. Analyzing the SIMO CCM waveforms shown in Figure
19b and performing a power balance in the inductor, we find that the average voltage at
the Vx node is
\[ \bar{V}_x = \sum_{i=1}^{N} \delta_i V_{o,i} + V_{fw} \tag{33} \]
where δi is the i-th load ON time (determined by the hysteretic comparator), Vo,i is the i-th
output voltage, N is the number of outputs, and Vfw is an offset that can be introduced to
control the isolation period between outputs. The control parameter δi depends on the total





where Io,i is the DC current of the i-th output. Implementing the control law in a circuit
requires either current sensing or a current estimation technique.
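
As a minimal sketch of the control law in (33), the following Python computes the power-weighted reference from per-output load currents. It assumes, as an idealization of deep-CCM operation in which the inductor current is shared among the outputs with no free-wheel interval, that each ON-time fraction equals that output's share of the total load current (δi = Io,i / ΣIo,j). The function names and the numeric operating point are illustrative only and are not taken from the design.

```python
def on_time_fractions(load_currents):
    """Assumed deep-CCM sharing: delta_i = I_o,i / sum_j I_o,j."""
    total = sum(load_currents)
    if total <= 0:
        raise ValueError("total load current must be positive")
    return [i / total for i in load_currents]


def power_weighted_vx(load_currents, output_voltages, v_fw=0.0):
    """Average Vx-node voltage per (33): sum_i delta_i * V_o,i + V_fw."""
    if len(load_currents) != len(output_voltages):
        raise ValueError("need one output voltage per load current")
    deltas = on_time_fractions(load_currents)
    return sum(d * v for d, v in zip(deltas, output_voltages)) + v_fw


# Hypothetical four-output operating point (not measured data).
currents = [0.010, 0.020, 0.030, 0.040]  # load currents in amperes
voltages = [0.2, 0.3, 0.4, 0.6]          # output voltages in volts
print(f"average Vx = {power_weighted_vx(currents, voltages):.3f} V")
```

In this idealization the heavier loads pull the reference toward their own output voltage, which is the "power weighting" the text refers to; a non-zero v_fw simply shifts the reference.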
4.3 Floating Capacitor-Based Power Stage Filter
A critical challenge when increasing the power output of switching regulators is to reduce
the size of passives used in the design. This is because off-chip passives dominate system
footprint and cost of a switching regulator. The ability to reduce passives for a given output
power suggests potential for higher integration density using on-package or on-chip pas-



























Figure 19: Modified power stage with capacitors between outputs.





where Pout,max is the total maximum output power, LTotal is the sum of the inductor values
used (in henries), and Ctotal is the sum of all the load capacitor values (in farads). This FOM
is not dependent on passive density and reflects the ability of a design to scale delivered
output power as a function of the passives, not the active silicon area (silicon area has
been used historically but does not correlate well with solution size unless linear
regulators are used). The design challenge is to increase the FOM with minimal impact on the
power quality. In a SIMO design with N outputs, the required passives include one inductor
and N output capacitors. A direct approach to reducing passive size for a target power quality
is to increase the switching frequency. In a SIMO design, the switching frequency of
the main power stage fsw is determined by the conduction mode, parasitics, and switching
losses, while the switching frequency of the output switches is related to voltage ripple,
















Figure 20: Effective switching frequency reduction (given fixed voltage ripple window)
due to the flying capacitor assist currents.






We can see from the above equation that the maximum output current has a linear dependency
on the voltage ripple (Vripple), the total number of outputs (δi), the load filter (de-coupling)
capacitance (C), and the maximum switching frequency of the output NFETs (fs,i,max). We
propose a modified power stage to increase the overall power density in terms of the FOM.
Figure 19 shows the addition of floating capacitors labeled CT; these connect across all
load combinations and are smaller in value than the output capacitors connected to ground
(C1-4). However, the effective capacitance is larger than CT and, as will be shown later,
the floating capacitor effective capacitance depends on the number of SIMO outputs.
Conceptually, the addition of the coupling capacitance has a Miller-like effect, since the output
voltage ripples move in opposite phases (the one being served with respect to the ones
discharging), which makes the total capacitance at a given node seem "heavier". At any given
moment, no load is fully disconnected from the inductor; instead, a small current passes
through the floating capacitors and reduces the self-discharge speed. The effective change
in a commutation period can be seen in Figure 20. The effective switching frequency of an













To simplify the expression we can assume balanced N loads with equal ON-times δi and

















Therefore, we observe that given a fixed target ripple and load current, the switching fre-
quency of the output devices can be reduced using the proposed power stage. Looking at
the coefficient of the Ci term in the denominator of (11), we can estimate that for a 4-output
converter operating under balanced loads, the effective capacitance is 9/4 times larger than
if it were connected to ground. The switching loss associated with the load switches for the




\[ \ldots \; C_{fet} V_{in}^{2} \tag{41} \]
and









\[ \ldots \; C_{fet} V_{in}^{2} \tag{42} \]
The above discussion shows that, given a fixed voltage ripple, as the switching frequency
of the output devices is reduced using the modified power stage, the switching losses are
reduced as well. Note that with hysteretic control the switching frequencies of the
output devices are determined by the load current itself: a higher load current increases the
load FET switching frequency fs,i. Therefore, the relative reduction in the power loss, and
hence the efficiency improvement, using the proposed approach is more significant at higher load currents.
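
The argument above can be illustrated with an idealized hysteretic model. The sketch below assumes a deep-CCM inductor acting as a DC source I_L, an output capacitor that charges with I_L - I_o while its switch is ON and discharges with I_o while it is OFF, and a fixed ripple window; under these assumptions the output switch frequency is inversely proportional to the effective capacitance. The effective capacitance C + 3(1 + 2δ)C_T is borrowed from the coefficient of the modified-stage capacitor equations (57) onward for four outputs (reading the factor 3 as N - 1 is an assumption), and the loss expression f·C_fet·V_in² is an assumed form consistent with the C_fet V_in² factor in (41) and (42). All names and component values are hypothetical.

```python
def hysteretic_frequency(i_load, i_inductor, c_eff, v_ripple):
    """Ideal hysteretic switching frequency of one output switch."""
    if not 0 < i_load < i_inductor:
        raise ValueError("need 0 < i_load < i_inductor")
    t_on = c_eff * v_ripple / (i_inductor - i_load)  # charge across the window
    t_off = c_eff * v_ripple / i_load                # discharge across the window
    return 1.0 / (t_on + t_off)


def effective_capacitance(c_out, c_t, delta, n_outputs=4):
    """Assumed form C + (N - 1)(1 + 2*delta)*C_T, matching (57) for N = 4."""
    return c_out + (n_outputs - 1) * (1 + 2 * delta) * c_t


def gate_switching_loss(f_sw, c_fet, v_in):
    """Assumed first-order gate loss: f * C_fet * V_in^2."""
    return f_sw * c_fet * v_in ** 2


# Hypothetical balanced four-output case (illustrative values only).
i_l, i_o, delta = 0.100, 0.025, 0.25     # A, A, ON-time fraction
c_out, c_t, v_rip = 250e-9, 50e-9, 0.02  # F, F, V
f_conv = hysteretic_frequency(i_o, i_l, c_out, v_rip)
f_mod = hysteretic_frequency(i_o, i_l, effective_capacitance(c_out, c_t, delta), v_rip)
print(f"conventional: {f_conv / 1e6:.2f} MHz, modified: {f_mod / 1e6:.2f} MHz")
print(f"loss ratio: {gate_switching_loss(f_mod, 10e-12, 1.2) / gate_switching_loss(f_conv, 10e-12, 1.2):.3f}")
```

Because the loss is proportional to frequency in this model, the loss ratio equals the ratio of the conventional to the effective capacitance; the absolute saving therefore grows with load current, in line with the observation in the text.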
4.4 SIMO Stability Analysis
Figure 21 shows the conceptual block diagram of the system used. The stability analysis of
this system requires a state space averaged model for the power stage. The per-load regula-























Figure 21: Conceptual block diagram of the controller.
topologies or current ripple control [75], it has the advantage of being inherently stable.
The state-space averaged model is developed for the secondary feedback loops creating the
PWM, both for a CCM SIMO with the isolation current and for the modified power
stage with charge-transfer capacitors. The first task in performing a state-space averaging
analysis is to apply small-signal perturbations to the control variables. In the given
design we state the problem as:
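\begin{align}
d(t) &= D + \hat{d} \tag{43} \\
V_{in}(t) &= V_{in} + \hat{v}_{in} \tag{44} \\
V_{fw}(t) &= V_{fw} + \hat{v}_{fw} \tag{45} \\
d_{fw}(t) &= d_{fw} + \hat{d}_{fw} \tag{46} \\
I_L(t) &= I_L + \hat{i}_L \tag{47} \\
I_{o,i}(t) &= I_{o,i} + \hat{i}_{o,i} \tag{48} \\
V_{c,i}(t) &= V_{c,i} + \hat{v}_{c,i} \tag{49} \\
V_o(t) &= \sum_{i=1}^{n} \left( \delta_i V_{o,i} + \hat{\delta}_i V_{o,i} + \delta_i \hat{v}_{o,i} \right) \tag{50}
\end{align}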
where hatted quantities are small-signal components and unhatted quantities are DC components.
By performing inductor volt-second balancing, assuming a buck SIMO, and setting the deriva-
























= [ILδ̂i + îLδi − îo,i] + [îLδ̂i] (52)
Using the previous equations we can construct the matrices in the standard form:
ẋ = Ax + Bu (53)
y = Cx + Du (54)
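
As a hedged illustration of how the linearized capacitor equation (52) maps into this standard form, the sketch below assembles A and B for the conventional power stage with state x = [îL, v̂c,1, ..., v̂c,N] and input u = [d̂, δ̂1, ..., δ̂N]. The capacitor rows follow (52) with the second-order term dropped and the load-current perturbation treated as a disturbance. The inductor row is an assumption based on a textbook buck volt-second balance, L dîL/dt = Vin d̂ - Σ(δi v̂c,i + Vo,i δ̂i), and is not taken from this chapter. Component values are hypothetical.

```python
def simo_state_space(deltas, v_outs, v_in, i_l, ind, cap):
    """Return (A, B) as nested lists for x = [iL, vc_1..vc_N], u = [d, delta_1..delta_N]."""
    size = len(deltas) + 1
    a = [[0.0] * size for _ in range(size)]
    b = [[0.0] * size for _ in range(size)]
    b[0][0] = v_in / ind                 # assumed: duty cycle drives the inductor
    for k in range(1, size):
        delta, v_o = deltas[k - 1], v_outs[k - 1]
        a[0][k] = -delta / ind           # assumed: output k loads the inductor
        b[0][k] = -v_o / ind             # assumed: ON-time k changes the inductor voltage
        a[k][0] = delta / cap            # from (52): share of iL charging capacitor k
        b[k][k] = i_l / cap              # from (52): IL times the ON-time perturbation
    return a, b


def is_invertible_upper_triangular(m):
    """True if m is upper triangular with a non-zero diagonal."""
    for r in range(len(m)):
        if any(m[r][c] != 0.0 for c in range(r)) or m[r][r] == 0.0:
            return False
    return True


# Hypothetical balanced four-output operating point.
A, B = simo_state_space([0.25] * 4, [0.2, 0.3, 0.4, 0.6],
                        v_in=1.2, i_l=0.1, ind=500e-9, cap=250e-9)
print("B invertible:", is_invertible_upper_triangular(B))
```

With five inputs and five states, B is square and upper triangular in this formulation, so a non-zero diagonal is a sufficient (not necessary) check that every state can be driven from the input vector.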
Neglecting second-order terms and derivatives of DC terms, and analyzing only the control
variables for the system, which are d and δ1-4, we can construct the state-space representation

































Where the coefficient matrices of the vectors are:
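\[
A = \begin{bmatrix}
\cdots & \cdots & \cdots & \cdots & \cdots \\
\delta_1/C & 0 & 0 & 0 & 0 \\
\delta_2/C & 0 & 0 & 0 & 0 \\
\delta_3/C & 0 & 0 & 0 & 0 \\
\delta_4/C & 0 & 0 & 0 & 0
\end{bmatrix},
\qquad
B = \begin{bmatrix}
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & I_L/C & 0 & 0 & 0 \\
0 & 0 & I_L/C & 0 & 0 \\
0 & 0 & 0 & I_L/C & 0 \\
0 & 0 & 0 & 0 & I_L/C
\end{bmatrix}
\]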


















































[Plot annotations: Loop Frequency Response; Phase Margin = 81 deg; 0 dB frequency = 40 kHz]
Figure 22: PWM controller simulated loop gain using transistor level schematics.
C =

1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1

The IL to Vo,i transfer function uses a hysteretic controller that is inherently stable; as such,
we can simplify the stability analysis to consider only the relationship between the duty
cycle generated by the PWM and the output voltages. In other words, we need to study
the sensitivity of the outputs with respect to variations in duty cycle, output current, and load
ON-time. The stability of the system can be verified by closing the loop with the respective
controller poles and ensuring that φi ≠ 180° at f = f0,dB. This requires a simulation setup that
places a perturbation on the load ON-time and finds the equivalent change in inductor
current. A simulation to determine the controller poles for the closed-loop
system was run on a transistor-level linearized model. Using the state-space representation
developed previously, the system and control loop were closed in Matlab. The simulated
control loop block diagram is shown in Figure 21 and the corresponding frequency-domain
response for G(s) in Figure 22; the latter was obtained using the transistor-level model for the
circuit shown in Figure 33, where the PWM sawtooth modulator is linearized as having a
gain of Vin/Vramp. The Matlab pole placement mirrors the extracted loop gain. The four
output voltages as well as the inductor current are controllable through the 5-element input
vector. To determine the stability of the modified power stage (including the floating
capacitors), we need to modify the equations used in the state-space formulation: the
inductor volt-second balance is unchanged, but the load capacitor equation changes due






ILδ̂i + îLδi + (1 − 2δ + i)
n∑
i=1










where j ≠ i and:








where 1 ≤ j < N (56)
Since Vo,i(t) and Vo,i+1(t) are complementary in phase, the result is a higher Ic,t(t) for a given
C. Similarly, the output voltages not being served have a common-mode voltage ripple,
and the voltage differential across their Ct's is small. Thus we neglect coupling capacitor
currents between non-served outputs. These Ct currents increase during transient events
and increase cross-regulation; however, the effect is alleviated through the use of parallel
linear regulators at the outputs, which quickly react to droops in the output voltages, whether
they are due to load transients or cross-regulation (adjacent load transients). Constructing
the equations for the modified power stage results in a non-linear system of equations:
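\begin{align}
\left((1 + 2\delta_1)3C_T + C\right)\frac{d\hat{v}_{c,1}}{dt} &= I_L\hat{\delta}_1 + \hat{i}_L\delta_1 + 3(1 + 2\delta_1)\frac{C_T}{C}\left[(1 - \hat{\delta}_2)I_{o,2} + (1 - \hat{\delta}_3)I_{o,3} + (1 - \hat{\delta}_4)I_{o,4}\right] \tag{57} \\
\left((1 + 2\delta_2)3C_T + C\right)\frac{d\hat{v}_{c,2}}{dt} &= I_L\hat{\delta}_2 + \hat{i}_L\delta_2 + 3(1 + 2\delta_2)\frac{C_T}{C}\left[(1 - \hat{\delta}_1)I_{o,1} + (1 - \hat{\delta}_3)I_{o,3} + (1 - \hat{\delta}_4)I_{o,4}\right] \tag{58} \\
\left((1 + 2\delta_3)3C_T + C\right)\frac{d\hat{v}_{c,3}}{dt} &= I_L\hat{\delta}_3 + \hat{i}_L\delta_3 + 3(1 + 2\delta_3)\frac{C_T}{C}\left[(1 - \hat{\delta}_1)I_{o,1} + (1 - \hat{\delta}_2)I_{o,2} + (1 - \hat{\delta}_4)I_{o,4}\right] \tag{59} \\
\left((1 + 2\delta_4)3C_T + C\right)\frac{d\hat{v}_{c,4}}{dt} &= I_L\hat{\delta}_4 + \hat{i}_L\delta_4 + 3(1 + 2\delta_4)\frac{C_T}{C}\left[(1 - \hat{\delta}_1)I_{o,1} + (1 - \hat{\delta}_2)I_{o,2} + (1 - \hat{\delta}_3)I_{o,3}\right] \tag{60}
\end{align}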














































Figure 23: Control δ1 to Vo,1 AC closed-loop gain (plot legend: Modified Power Stage)









































Figure 24: Transient responses of both power
stage filters (conventional and modified) with
compensated system.
and the state space matrix coefficients are:
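\[
A = \begin{bmatrix}
\cdots & \cdots & \cdots & \cdots & \cdots \\
\dfrac{\delta_1}{(1 + 2\delta_1)3C_T + C} & 0 & 0 & 0 & 0 \\
\dfrac{\delta_2}{(1 + 2\delta_2)3C_T + C} & 0 & 0 & 0 & 0 \\
\dfrac{\delta_3}{(1 + 2\delta_3)3C_T + C} & 0 & 0 & 0 & 0 \\
\dfrac{\delta_4}{(1 + 2\delta_4)3C_T + C} & 0 & 0 & 0 & 0
\end{bmatrix}
\]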



















































1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1

The corresponding closed-loop AC simulation and closed-loop transient simulations,
for both the conventional power stage and the modified power stage, are shown in Figure
23 and Figure 24, respectively. We can see that the two responses are nearly identical in both
the frequency and the time domain. Although the open-loop responses are slightly
different (mainly high-frequency coupling due to the capacitors), the variation is not significant
and cannot be discerned after the loop is closed. The increased coupling due to the floating
capacitors also needs to be investigated.
In order to analyze the effect of the coupling capacitors on cross-regulation, a closed-loop
AC impedance model was used; the results are shown in Figure 25 and Figure 26.
Figure 25 shows 1) the closed-loop cross-regulation for the traditional power stage, 2)
the closed-loop cross-regulation for the modified power stage, and 3) the open-loop
cross-regulation due to the floating capacitors. The open-loop network cross-regulation of the
filter shows how it affects the coupling from io,1 to vo,2. At very low frequencies, there
is low coupling between outputs 1 and 2 as the Ct capacitors act as DC blocks. At higher
frequencies, the impedance of the coupling network reduces, thereby increasing the
cross-regulation; eventually the shunt output capacitors to ground dominate the impedance. The
closed-loop analysis shows a similar behavior, except that at low frequencies the
cross-regulation is due to the feedback loop. As the controller gain (loop gain) reduces at higher
frequencies, the controller is less effective in suppressing the cross-regulation, and hence
the cross-regulation increases for both approaches. This is the region where the
cross-regulation reaches a maximum. At much higher frequencies (no cross-regulation
suppression by the controller), attenuation due to the load capacitors shunted to ground dominates
the closed-loop cross-regulation, as shown in Figure 25 and Figure 26. The overall nature of
the closed-loop curves is similar for both approaches; however, the peak cross-regulation
in the mid-frequency range is marginally higher for the modified power stage, as the impedance
offered by the floating capacitors is also low in this region. Figure 26 shows the
equivalent transient response for the cross-regulation model. We observe that due to a 1mA

















































Figure 25: Closed loop cross-regulation
(Impedance) as a function of frequency for







[Plot title: Step Response for Vout,2]














Figure 26: Transient response of Vo,2 due to a
step on Io,1 for traditional filter and modified
filter
stage is considered. In summary, the proposed power stage does not change the DC cross-
regulation, and a marginal increase is observed in the transient cross-regulation.
4.5 Closed loop AC and transient MIMO model
In order to properly compare the multiple-inductor and the single-inductor approaches, we
also need to look at the closed-loop regulation capabilities, as it is unlikely that either
approach will be operated in open-loop conditions. Here we use a type-3 compensator
model for the multiple-inductor approach (which is in fairly common use in high-bandwidth
applications) and compare it to the power-weighted controller developed for Chapter 3.
The assumption for this analysis is that the controller power is equal for both approaches;
this, however, is not necessarily true, as the controller bandwidth and feedback loops for
the two approaches differ. However, the MIMO approach does not suffer from cross-regulation,
as its loops are independent and not linked through the inductor as in the SIMO.
The SPICE/Verilog-A-based model for the MIMO converter is shown in Figure 27. The converter
uses 4 individual feedback loops that compare the output voltages to the reference and
adjust the inductor current to keep the output voltages constant. The model for a




















































The frequency response corresponds to a type-3 frequency shaping filter with transfer function:
\[ A_c(s) = \frac{(s + z_1)(s + z_2)}{(s + p_1)(s + p_2)(s + p_3)} \tag{62} \]
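
To make the shape of (62) concrete, the following sketch evaluates its magnitude and phase at s = jω by accumulating the contribution of each zero and pole. The zero and pole locations are hypothetical placeholders chosen only to show the phase boost between the zeros and the high-frequency poles; they are not the values used in the MIMO model, and the optional gain argument is an addition for convenience.

```python
import math


def compensator_response(freq_hz, zeros, poles, gain=1.0):
    """Magnitude and phase (degrees) of (62) at s = j*2*pi*f.

    zeros and poles are the positive constants z_k and p_k in rad/s.
    A pole exactly at the origin would need separate handling at f = 0.
    """
    w = 2 * math.pi * freq_hz
    mag, phase = gain, 0.0
    for z in zeros:
        mag *= math.hypot(z, w)      # |jw + z|
        phase += math.atan2(w, z)    # angle of (jw + z)
    for p in poles:
        mag /= math.hypot(p, w)      # |jw + p|
        phase -= math.atan2(w, p)    # angle of (jw + p)
    return mag, math.degrees(phase)


# Hypothetical placement: low-frequency pole, double zero, two high-frequency poles.
zeros = [2 * math.pi * 100e3] * 2
poles = [2 * math.pi * 1e3, 2 * math.pi * 5e6, 2 * math.pi * 5e6]
for f in (1e3, 1e4, 1e5, 1e6, 1e7):
    mag, ph = compensator_response(f, zeros, poles)
    print(f"{f:10.0f} Hz  {20 * math.log10(mag):8.1f} dB  {ph:7.1f} deg")
```

Summing per-factor angles rather than taking the angle of the overall complex ratio avoids phase wrapping, which matters when the response is later combined with a double-pole LC filter.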
4.5.1 Closed loop AC and transient SIMO model
In order to properly analyze the regulation capabilities of the SIMO, we need to develop an
AC model. A CCM/DCM controller model was developed using the power-weighting
approach discussed in Chapter 3, where a control method using the average voltage at the
right-hand side of the inductor was proposed and verified through hardware. Although the
controller was initially developed for continuous conduction mode (CCM), the control law
is valid in any conduction mode. Thus a voltage feedback loop capable of implementing
the control law, with its frequency-dependent response, will suffice. Developing a closed-loop
model enables us to determine qualitatively, at a high system level, the capabilities
of the converter as well as its limitations. The system-level schematic is shown in Figure
28; there we can see that each of the outputs has an independent feedback loop that de-











































[Figure 28 schematic labels: V_{out,i} I_{out,i} / \sum_{j=1}^{N} I_{out,j}; (V_{ref,i} - V_{out,i}) A_{v,2}]
Figure 28: SIMO ideal transient and AC model schematic
the load currents are measured and the voltage is weighted by these currents, effectively












Similarly, the effect of the comparator induced offset observed and discussed in Chapter
3 is included in the model by adding an offset voltage to the output using the function of
equation 65.









4.6 Closed loop results
The closed loop model for the MIMO is shown in Figure 27, its corresponding frequency-



































































Figure 29: Frequency response of the loop gain for the type-3 compensator used
pole from the LC filter being compensated with the controller gain through the two
phase-boosting zeros. In the case of a MIMO, this is difficult to do since, as previously mentioned,
the inductance (L) is reduced by a factor of N, shifting the pole to higher frequencies
and requiring a very large controller bandwidth. The obtained transient results
are shown in Figure 30; we can extract the load and cross-regulation from these waveforms.
The load regulation, which is highly dependent on the controller, is small, and we can see
that there is no visible cross-regulation in the MIMO. The MIMO architecture does not
show any cross-regulation as each output branch has an independent power stage,
inductance, and controller. However, a non-ideal PDN can limit how independent the multiple
voltage domains are. If all the energy comes from a non-ideal input source, a very large
transient in one of the outputs might cause the input voltage to droop and thus affect nearby
voltage rails. The results for the small-signal loop gain of the SIMO model shown in Figure 28
are shown in Figure 31. We can see that the system behaves as a single-pole system. The
power-weighted controller has the advantage of using both voltage and current feedback; this
results in a single-pole frequency response that is relatively simple to design. If using only a




















































Figure 30: Time domain response of the loop gain for the type-3 compensator used
voltage feedback system and operating in CCM, the frequency response will have a double-pole
response similar to the MIMO case. Compensating such a system is difficult: using a
dominant pole slows down the response time, while a pole/zero-canceling
controller is complex and difficult to match across process, voltage, temperature,
and load current [76]. Looking at Figure 32, we can see that the controller capabilities heavily
influence properties like load regulation and cross-regulation. The controller bandwidth
directly correlates with response time (because of the single-pole response). The response time
of the load controllers also plays a role in the stability of the system: although not shown
in the results, using load-controller bandwidths smaller than the main controller bandwidth
results in instability. This is partly why the design implementation in Chapter 5 uses hysteretic
control for the load switches: the speed and stability of the load control are much greater than
those of the main power stage loop. In the final chapter of this thesis we will use this model as
part of an electrical and thermal framework used to estimate the overall voltage droops seen in a























































Figure 31: SIMO schematic-based model loop gain
4.7 Conclusions
We have presented a novel power-weighting CCM controller for the SIMO, along with its
regulation capabilities and stability considerations (for both a regular power stage and the
introduced modified power stage). The introduced control allows indirect current sensing
and cross-regulation suppression through the controller's loop gain. Moreover, models
analyzing the output impedance and cross-regulation were presented and show that the
modified filter does not dramatically impact the cross-regulation. As a follow-up to Chapter 3,
closed-loop models for both architectures were presented and their properties compared; we
can see that the MIMO has the inherent advantage of having no cross-regulation. Although
the models show the expected drawback of the SIMO (cross-regulation), we have found
other unexpected advantages of the SIMO, such as decreased voltage ripple with an increasing
number of voltage domains, high efficiency at high load currents, and that, although the control
complexity is increased (higher number of switches and feedback branches), the frequency-domain
response of the SIMO is a relatively simple single-pole response due to the inherent



















































Figure 32: SIMO schematic-based model closed loop transient response
voltage and current feedback loops present.
CHAPTER 5
IMPLEMENTATION TECHNIQUES AND MEASUREMENTS OF
POWER WEIGHTING SIMO CONTROL
5.1 Introduction
In order to obtain a high power-density Single-Inductor Multiple-Output converter, we
developed the control scheme of Chapter 4. In addition, we wish to evaluate the feasibility
of, and characterize, the modified power stage filter with floating capacitors. In this chapter
of the thesis, we discuss the circuits and techniques used to physically realize the
power-weighting controller and the modified power stage as a monolithic integrated circuit. The
circuit techniques are discussed in Section 5.2 and the measured results in Section 5.3. A
0.5 mm2 prototype test chip is designed in 130 nm CMOS technology to demonstrate a
single-inductor four-output regulator designed using the proposed approaches. The design
generates 4 outputs, each ranging from 200 mV to 600 mV, from a 1.2 V input. The
measurements show that the design operates at a 20 MHz switching frequency with a 500 nH
inductor and 1 µF total capacitance. The measurement results demonstrate 120 mV/µs reference
tracking, 73% peak efficiency, and a power density figure of merit of 150 mW/µH-µF. The
measurement results are presented, discussed, and compared against the state of the art.
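
The reported figure of merit can be related to the passive values with a short calculation. The sketch below assumes the FOM is the maximum output power divided by the product of total inductance and total capacitance, as its units (mW/µH-µF) suggest; the function names are illustrative, and the implied output power is only what this assumed definition yields from the numbers quoted above, not a separately reported measurement.

```python
def power_density_fom(p_out_max_mw, l_total_uh, c_total_uf):
    """Assumed FOM definition: P_out,max / (L_total * C_total), in mW/(uH*uF)."""
    if l_total_uh <= 0 or c_total_uf <= 0:
        raise ValueError("passive values must be positive")
    return p_out_max_mw / (l_total_uh * c_total_uf)


def implied_max_power_mw(fom, l_total_uh, c_total_uf):
    """Invert the assumed definition to get the output power a given FOM implies."""
    return fom * l_total_uh * c_total_uf


# Values quoted in the text: 500 nH inductor, 1 uF total capacitance, FOM of 150.
p_implied = implied_max_power_mw(150.0, l_total_uh=0.5, c_total_uf=1.0)
print(f"implied maximum output power: {p_implied:.1f} mW")
print(f"round trip FOM: {power_density_fom(p_implied, 0.5, 1.0):.1f} mW/(uH*uF)")
```

Under this definition, halving either passive at constant output power doubles the FOM, which is why the metric rewards smaller passives rather than smaller silicon area.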
5.2 Circuit Techniques
Figure 33 shows the system level schematic diagram of the CCM Control filter, as discussed










As we can observe from equation 67, the control law requires sensing the load currents.




































Figure 33: Control block implementing CCM control law (power weighting) used as the
reference for the PWM reference generator.
due to process and component variation. However, with the proposed system, the load cur-
rent ON-time can be measured directly through the hysteretic comparator on state (without
the need for direct current sensing). Figure 19b shows that the average on-time of the





As shown in Figure 33, the Vx node is sensed through Rfb,2 and the comparator outputs
are sensed through R1-4; both are then integrated (averaged), scaled, and inverted. Design
equations 4-6 show that Vi,1 and Vi,2 can be tuned to cancel each other such that
no offset Vfw is introduced. Similarly, these two voltages can be used to tune the isolation
time between loads. When an offset is added to the Vx node, the hysteretic loop forces the
remaining current to flow through the active-diode free-wheel path. The isolation time can
then be adjusted through Vfw to either reduce transient and DC cross-regulation (increase






























































































































































The power-weighted average computed by the circuit is then converted
into a pulse-width modulated (PWM) signal; this signal then proceeds to the power train
and filter, and Vfw is regulated as a result. To summarize, the advantage of this control
scheme is a tunable, cross-regulation-suppressing isolation current that requires no current
sensing. The CCM control scheme is also part of what allows a high power per passive
density, as discussed in the next section. This section describes the circuit-level design of
the proposed SIMO architecture. The system is designed to operate at high conversion
ratios, mainly to mimic the lower voltage domains used in low-power and mobile digital
processors in deep nanometer nodes. The design was implemented in a 130 nm process
using low-voltage (1.2 V) digital FETs designed with minimum channel length. The overall

















Figure 35: Effect of parasitics on internal (on-die) supply noise due to FET speed and
suppression through on-die de-coupling capacitors.
5.2.1 Main power stage switches, drivers and parasitics
Minimum-channel-length digital FETs are used as the power devices to facilitate efficient,
high-frequency operation of the converter. To minimize the losses, we
need to size the power stage devices and their drivers to minimize overlap losses, switching
losses, and conduction losses. To reduce the V-I overlap losses during transition events, we
wish to reduce the transition times (trise and tfall). Using the low-voltage, minimum-channel-length
digital transistors as power FETs facilitates lower transition times. The minimum-channel-length
devices also provide lower gate capacitance for a given output resistance.
A goal of the design is to demonstrate robust operation even under the constraint of a
low-cost package. As shown in Figure 35, using a low-cost bond-wire package means that a large
series inductance and resistance are present in all of the power pins. Thus, combining fast
rise/fall times with large series inductance and large on-to-off current transitions, the on-chip
input voltage to the power FETs can have a large voltage droop (L di/dt noise) and may even
collapse [77]. The supply noise is not symmetrical since the low-side FET and the
high-side FETs are complementary. The droop in the input rail reduces the power FET drive and,
hence, efficiency. To reduce this effect, the rise and fall times can be increased, but at the cost
of increased V-I overlap loss. The switching frequency would also be limited by the rise and










Figure 36: Load switches with diodes and floating driver circuit to reduce the switching
loss of the power FETs
metal-insulator-metal (MIM) decoupling capacitors across the input and ground voltages
of the power stage; schematically, these are shown in Figure 35 as Con-die. MOSCAPs
were placed on residual active area and the MIM caps were placed above active control
circuitry, thus not requiring any extra footprint. However, this could not be done on the
vertical layers on top of the power FETs, since the metallization required for low
resistance does not allow it. The reduction in the voltage droop can be estimated by finding the






To give some perspective, assuming no internal power stage de-coupling, an on-to-off time









where ΔiHS is the change in the high-side FET current; in the SIMO operating in CCM, this
becomes the sum of the output currents. Adding a 100pF on-die capacitor reduces the
estimated droop to 73mV. This means that, when available, adding on-die decap is an efficient










Figure 37: Auxiliary linear regulators, these regulators do not conduct any transient current
and are only enabled when the output experiences a load transient.
circuits use a separate power pin and are connected externally (at the PCB level) to the
input voltage source.
5.2.2 SIMO output stage switch design
The output switches of the SIMO are optimized for the low-output voltage. Normally, load
switches are implemented using PFET to support high output voltages but at the expense
of reduced power density. We use NFET based load switches to leverage the large input
to output voltage ratio which provides sufficient headroom to achieve the desired output
resistance at VGS (=Vin-Vo,i) lower than the full Vin, and hence, reducing switching losses.
The equivalent circuit used is shown in Figure 36. In order to reduce the switching loss
so that it is proportional to (Vin-Vo,i)², isolated drivers are required; otherwise,
discharging the gate capacitance to ground will still incur a loss proportional to Vin².
Secondarily, body diodes are present between Vo,i and the Vx nodes; under a high voltage
difference between outputs, a body diode can be turned ON when Vo,min < Vo,max-Vdiode.
However, in the proposed design, where the output voltages are very low, the diodes do not
turn ON during




































Figure 38: Non-overlapping circuit for load switch enable. The circuit also extends the duty
cycle to allow full charge of the i-th output before switching to the next output that needs charge.
5.2.3 Amplifiers and parallel linear regulators
The controller implementation requires several amplifiers, shown in Figure 33. OP1 through
OP3 are implemented as standard operational amplifiers with right-half-plane zero
compensation, providing high bandwidth at low quiescent currents. The major design
constraints are offsets and output drive: differential pair inputs are used to reduce offsets,
and the output stages are biased with sufficient current to sink/supply the load resistors.
Linear regulators such as the ones used in [53] (Figure 37) were used to improve the tracking
speed; moreover, the use of a modified power stage can increase the high-frequency
coupling between loads. The transient accelerators help reduce the load noise induced by
the coupling capacitors.
5.2.4 Non-Overlapping Drivers
Overlapping the load switch ON-times will result in a short circuit between output voltages;
this is undesirable in several respects. First, the load short will cause the two voltages to
fall or rise toward their average; this is cross-conduction. Second, a functional failure can
result from a large current spike that exceeds the maximum current density of the switches
or the metal routing. Thus, it is important that no more than one switch is turned on at any
given moment. Figure 38 shows the implementation of the non-overlapping circuit and Figure
























Figure 39: Effect of perturbation on extension and isolation current effect. If load 1 expe-
riences a transient, the duty cycle extension will use part of the isolation current to charge
the output until the high-threshold is met.
asynchronous ON-time generation, meaning that the i-th output will turn on whenever it is
required and no other load switch is on, effectively removing conditions where two switches
are on at the same time and short-circuit to each other. Second, the switch ON-time
will extend until the output being served reaches the upper threshold determined by the
comparator; only when the comparator signals that the charge is sufficient will the next
load be allowed to turn on. The design allows the ON-time of the switches to be modulated
as the distribution of loads among the outputs changes over time, a likely scenario in
multi-core digital processors. It is evident that, if one of the loads exceeds the maximum
power capacity of the converter, the serve switch for that output will remain on but never
reach its upper threshold and turn off; the remaining outputs of the regulator will then collapse.
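The serve behavior described above can be illustrated with a small behavioral sketch. This is not the circuit of Figure 38; it is a time-stepped model under assumptions that are not taken from the source: a constant inductor current, ideal switches and comparators, identical output capacitors, fixed-priority arbitration by output index, and hypothetical component values. The function name `simulate_serve` and all of its parameters are illustrative.

```python
def simulate_serve(loads_a, i_inductor_a, c_out_f=1e-6, v_ref=0.4,
                   hyst_v=0.01, dt_s=10e-9, t_end_s=50e-6):
    """Behavioral model of asynchronous, non-overlapping load-switch serving.

    Assumed (hypothetical) model: constant inductor current, ideal comparators,
    identical output capacitors, fixed priority by index, voltages clamped at 0 V.
    Returns per-output minimum voltage, maximum voltage, and serve counts.
    """
    n = len(loads_a)
    v = [v_ref] * n
    v_min = list(v)
    v_max = list(v)
    serve_counts = [0] * n
    serving = None  # index of the single load switch that is ON, if any

    for _ in range(int(round(t_end_s / dt_s))):
        # ON-time extension: the served output keeps its switch until it
        # reaches the upper hysteresis threshold.
        if serving is not None and v[serving] >= v_ref + hyst_v:
            serving = None

        # Asynchronous request: an output may turn ON only when it has dropped
        # to its lower threshold and no other switch is ON.
        if serving is None:
            for i in range(n):
                if v[i] <= v_ref - hyst_v:
                    serving = i
                    serve_counts[i] += 1
                    break

        # The served output receives the inductor current; every output is
        # discharged by its own load.
        for i in range(n):
            i_in = i_inductor_a if i == serving else 0.0
            v[i] = max(0.0, v[i] + (i_in - loads_a[i]) * dt_s / c_out_f)
            v_min[i] = min(v_min[i], v[i])
            v_max[i] = max(v_max[i], v[i])

    return v_min, v_max, serve_counts


if __name__ == "__main__":
    # Balanced case: four 10 mA loads, 50 mA inductor current (hypothetical).
    print(simulate_serve([0.010] * 4, 0.050))
    # Overload case: output 0 draws more than the inductor current can supply.
    print(simulate_serve([0.060, 0.010, 0.010, 0.010], 0.050))
```

In the balanced case every output is served in turn and stays near its hysteresis window. In the overload case the first output never reaches its upper threshold, so its switch is never released and the remaining outputs discharge to zero, which mirrors the collapse condition noted in the text.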
5.2.5 Hysteretic Comparator
The circuit topology of the hysteretic comparators is shown in Figure 40. The hysteresis
window is a function of the cross-coupled device sizes. To avoid offsets the input differen-










Figure 40: Hysteretic comparator circuit schematic, the hysteresis window is determined
by the M ratio of the PFETs and the speed can be tuned by increasing the bias current in
the input stage of the comparator.
was added to avoid kickback noise and to improve the input common-mode range, as well as
to reduce common-mode noise from ringing at the switch nodes. Moreover, the comparators
shown in Figure 39 are very sensitive to noise on the feedback node. If the feedback
signal is sensed internally, the ringing noise due to the bond-wires causes the comparators
to malfunction. Thus, separate pins are used to provide low-noise feedback points for the
load comparators. These feedback pins sense the voltage at the load capacitors, which damp
the ringing; since the voltage feedback path is not a power path, there is only small
ringing due to the capacitor ESL.
5.2.6 Reference Tracking
The output voltage reference tracking is accomplished by dynamically adjusting the resistor
ratio of the filter and the references, thus changing the weighting for the Vx node control.
The circuit used is shown in Figure 41; it uses binary-weighted resistors that are switched
ON and OFF to change the output voltage. The references for the hysteretic comparators
are adjusted by shunting resistor segments, i.e.:
$$V_{ref,i} = I_{bias} R_{bias} (1 + 2b_1 + b_0) \tag{74}$$
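As a quick numerical illustration of Equation 74, the following sketch enumerates the four reference levels selected by the two program bits. The bias current and resistor values are hypothetical and chosen only to make the arithmetic easy to follow; they are not the values used on the fabricated chip.

```python
def vref_v(i_bias_a, r_bias_ohm, b1, b0):
    """Reference voltage per Equation 74: Ibias * Rbias * (1 + 2*b1 + b0)."""
    if b1 not in (0, 1) or b0 not in (0, 1):
        raise ValueError("b1 and b0 must each be 0 or 1")
    return i_bias_a * r_bias_ohm * (1 + 2 * b1 + b0)


if __name__ == "__main__":
    # Hypothetical values: 10 uA bias current and a 15 kOhm unit resistor.
    for b1 in (0, 1):
        for b0 in (0, 1):
            level_mv = 1e3 * vref_v(10e-6, 15e3, b1, b0)
            print(f"b1={b1} b0={b0} -> {level_mv:.0f} mV")
```

With two bits the multiplier (1 + 2b1 + b0) takes the values 1 through 4, so the levels are evenly spaced multiples of Ibias·Rbias.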
The change of the reference voltages has to be fast in order for the reference tracking to be
quick; however, increasing the reference speed requires a higher Ibias. As a secondary effect,




















Figure 41: Adjustable voltage reference for the hysteretic comparators. The design makes
use of a slowed down reference (through Cref,i) to keep reference change within the control
bandwidth and avoid voltage droops due to reference changing.
references; these three factors (control speed, noise immunity, and power loss) limit the
speed of the reference tuning. Note that the reference tracking speed for a low-to-high
transition in the output voltage can be improved by increasing the controller bandwidth.
The high-to-low transition in the output voltage is realized using load self-discharge [9].
Alternative approaches to increase the track-down speed include providing a parallel path
to ground [78] or using the inductor to remove the charge (by building negative current) [67].
5.3 Measurement Results
5.3.1 Fabricated die and test PCB
The fabricated die micrograph and the two-layer test printed circuit board (PCB) are shown
in Figure 42a and Figure 42b. Table 2 summarizes the test PCB component parameters.
Several capacitors were placed in parallel to reduce the equivalent series resistance (ESR)
and equivalent series inductance (ESL) of the capacitors. The PCB routing was made as
compact as possible to reduce trace inductance. The addition of the floating capacitor
scheme helped reduce the ESR and ESL further. The system was tested with and without
the floating capacitors. To accurately characterize the transient response, the references
were internal to the chip and controlled through a serial digital interface. The serial
connections controlled the references as well as the dead-time of the main power
(a) Fabricated die on IBM 130nm process
(b) Fabricated test printed circuit board
stage and the hysteresis window for the comparators. The bits are written through a
microcontroller and a logic level shifter. For measurement of load transients, a current step
is generated through the circuit shown in Figure 42a and controlled through the
microcontroller. The transient waveforms are shown in Figure 42a. Through the use of 2
program bits for each output, there are 4 target levels possible for each output, generating
24 possible combinations for the output voltages. The step magnitude is a function of the
bias current as well, which is controllable through an off-chip potentiometer. The two
supplies, PVin and AVdd, are separate but connected through a ground plane that covers the
entire PCB to reduce EMI.
5.3.2 Reference tracking results
Figure 42a and Figure 42b show the voltage reference tracking speed. After the reference
change is enabled within the chip, the controller tracks the reference voltage closely, and
the ramp-up moves 240mV within 2µs, which corresponds to 120mV/µs; the control loop
achieves a two-fold improvement with respect to the state of the art. Figure 42a shows that
the converter is able to move output voltages simultaneously at the same 120mV/µs speed,
which is 2x faster than the DVS-enabled SIMO in [58]. Note that, due to the absence of a
switch to ground, the track-down speed is limited by the DC load current to Io,i/Co. This
self-discharge scheme for the high-to-low transition of the output voltage is used















(a) Reference tracking on Vout,4 while other outputs
remain fixed. The transient peak cross-regulation is
















(b) Simultaneous change in outputs 2 through 4, all
outputs are able to properly track at the target speed
of 120mV/µs.
in commercial DVFS-enabled VR architectures such as the one in [45].
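The tracking figures quoted in this subsection follow from simple slope arithmetic, which the sketch below reproduces. The ramp-up numbers (240mV in 2µs) come from the text; the load current and output capacitance used for the track-down estimate Io,i/Co are hypothetical values chosen for illustration.

```python
def slew_rate_mv_per_us(delta_v_mv, delta_t_us):
    """Average slope of a voltage ramp in mV/us."""
    return delta_v_mv / delta_t_us


def track_down_rate_mv_per_us(i_load_ma, c_out_uf):
    """Self-discharge slope I/C; 1 mA / 1 uF equals 1 mV/us."""
    return i_load_ma / c_out_uf


if __name__ == "__main__":
    # Ramp-up quoted in the text: 240 mV within 2 us.
    print(slew_rate_mv_per_us(240.0, 2.0), "mV/us ramp-up")
    # Hypothetical track-down case: 10 mA DC load on a 1 uF output capacitor.
    print(track_down_rate_mv_per_us(10.0, 1.0), "mV/us track-down")
```

Under these assumed values the track-down slope is an order of magnitude slower than the ramp-up, which is the expected consequence of relying on load self-discharge.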
5.3.3 Transient cross-regulation results
There are two types of cross-regulation in the system: cross-regulation due to reference
tracking, and cross-regulation due to a change in load current. Figure 42a shows that when
a reference tracking event occurs, the adjacent output voltages ring with a magnitude of
30mV and settle to the DC output voltage within 5µs. The transient load current step
response shows a case where the current is increased from 5mA to 23mA; the ringing seen
has a peak amplitude of 30mV and settles within 5µs.
5.3.4 DC cross-regulation and line regulation
Due to the high switching speed of the system, a DC cross-regulation effect that has not
been previously discussed in the literature is seen. The measured cross-regulation and load
regulation show that the output voltages exhibit a slight linear increase as the load current
increases. This DC cross-regulation and line regulation is characteristic of the system
because of a combination of the inductor current being the DC sum of the load currents and
a finite comparator delay. This issue is not seen in designs such as [40] [53] [51] [68],
where the load switch ON-time and switching






























(b) Transient load current step response,
transient load cross-regulation, and settling
time.
negligible.



















Given a finite comparator delay td, the output will not stop precisely at the desired














This means that the comparator delay adds a DC term to the output voltage equation;
this term is linearly dependent on the total load current and has a net positive effect because
the discharge slope of the converter is lower than the charge-up slope. Moreover, in the
proposed controller, the isolation current (which is an added term in the output sum) will
affect this DC term in the output voltages.
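The trend described above can be sketched with a simple first-order model. The model is an assumption made here for illustration and is not the derivation used in this work: the served output is taken to charge at (IL - Io,i)/Co and to discharge at Io,i/Co, the comparator reacts td late at both thresholds, and the mean of the resulting triangular ripple shifts by half the difference between overshoot and undershoot. The function name and all numerical values are hypothetical.

```python
def delay_induced_offset_v(i_inductor_a, i_load_a, c_out_f, t_delay_s):
    """DC shift of the output mean caused by comparator delay (assumed model).

    overshoot  = charge slope    * delay = (IL - Io) * td / C
    undershoot = discharge slope * delay = Io * td / C
    The midpoint of the ripple moves by (overshoot - undershoot) / 2.
    """
    overshoot_v = (i_inductor_a - i_load_a) * t_delay_s / c_out_f
    undershoot_v = i_load_a * t_delay_s / c_out_f
    return 0.5 * (overshoot_v - undershoot_v)


if __name__ == "__main__":
    # Hypothetical values: 10 mA load on the observed output, 1 uF, 5 ns delay.
    # The inductor current is the sum of all loads plus the isolation current.
    for i_total_ma in (50, 70, 90):
        offset_uv = 1e6 * delay_induced_offset_v(i_total_ma * 1e-3, 0.010, 1e-6, 5e-9)
        print(f"IL = {i_total_ma} mA -> offset = {offset_uv:.1f} uV")
```

In this model the offset grows linearly with the inductor current, and therefore with the total load and isolation currents, and it is positive whenever the charge-up slope exceeds the discharge slope.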
5.3.5 Efficiency improvement
Figure 42 shows both the peak efficiency graph and the efficiency improvement ratio of the
floating capacitor scheme versus the non-floating capacitor scheme. The improvement has a
load-current-dependent behavior. The large improvement slope originates from
































(a) Cross-regulation and load regulation as a















(b) Effect of the comparator delay and the total load current
on the DC offset of the output loads.
increased switching losses at high current levels (due to hysteretic control); a significant
improvement of 30-40% is seen in this region. This efficiency improvement occurs at
the point where the efficiency begins to deteriorate. Considering that the capacitor ESL
and ESR require the combination of various capacitors in parallel, the approach of having a
higher effective capacitance for a given capacitor size proves useful and realizable.
However, in applications where single capacitors are required instead of parallel banks of
capacitors, this is not a feasible approach.
5.3.6 Comparison with prior art
The performance summary and comparison to state-of-the-art works are shown in Table
3. The design offers several advantages over previous works, mainly in power density,
where we obtain a higher power density even when compared to designs in smaller process
nodes. The controller is also able to provide fast reference tracking, which is one of the
critical areas in DVFS-enabling architectures, through the use of both the parallel
regulators and a fast response time.
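The power density figure of merit in Table 3 can be cross-checked from the other rows of the table. The sketch below assumes that the FOM is the maximum output power divided by the product of inductance and capacitance; under that assumption the tabulated values are reproduced when the power is expressed in mW, the inductance in µH, and the capacitance in µF. The input numbers are copied from Table 3.

```python
# (max output power in mW, inductance in uH, capacitance in uF), from Table 3
DESIGNS = {
    "[53]": (300.0, 1.0, 44.0),
    "[64]": (1880.0, 4.7, 40.0),
    "[62]": (2232.0, 2.2, 23.5),
    "[58]": (2160.0, 4.7, 40.0),
    "This design": (75.0, 0.5, 1.0),
}


def power_density_fom(p_mw, l_uh, c_uf):
    """Assumed FOM definition: output power per unit L*C product."""
    return p_mw / (l_uh * c_uf)


if __name__ == "__main__":
    for name, (p_mw, l_uh, c_uf) in DESIGNS.items():
        print(f"{name:12s} FOM = {power_density_fom(p_mw, l_uh, c_uf):.1f}")
```

The large FOM of this design comes from the small passives (0.5µH and 1µF) rather than from the absolute output power, which is the lowest in the comparison.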
5.4 Conclusions
This chapter presented the operating principles, modeling/analysis, design, and test of a
Single-Inductor Multiple-Output DC-DC converter to provide multiple (four) low output

























































Figure 42: Efficiency and efficiency improvement as function of load currents.
voltages with dynamic voltage scaling capabilities. A novel DC control loop, referred to as
power-weighted CCM control, was introduced for CCM operation to deliver high output
power density while maintaining good power quality and operating with small passives.
The presented controller enables high-frequency, small-passive design and, combined with
high-frequency auxiliary linear regulators, significantly improves the response times for
reference tracking and load regulation. A modified power stage filter with floating capacitors
is presented to reduce the switching rate of the output devices for a target power quality
and small total capacitance, thereby improving the overall efficiency. Overall, the design
demonstrates an improvement in power density for a target passive size while maintaining
good power quality, efficiency, and a high conversion ratio targeted at low-voltage mobile
processors. The proposed power-weighted CCM controller is scalable, as the power-weighting
filter permits the addition of independent error loops that are summed to control the
inductor current, which is then locally (per load) regulated through hysteretic control. The
presented measurement results show significant improvement over the state of the art in
SIMO design in terms of the power density figure of merit and reference tracking speed.
However, in this chapter, the design is demonstrated using off-chip passives and low-quality
packages that ultimately limit the maximum performance achievable in measurement. The realization
Table 2: PCB parameters
Package Type LCC-28 Pin
Package size 130mm2





On-Chip Decoupling Capacitor 270pF
Inductor Footprint 49mm2
Capacitance Footprint 2mm2/load
Table 3: Comparison of the design to the state of the art
Design [53] [64] [62] [58] This design
Technology 0.35µm 0.5µm 65nm 0.35µm 0.13µm
Die Area 2.24mm2 13.3mm2 1.86mm2 5.4mm2 0.5 mm2
Input Voltage (V) 2.7-3.3 2.3-3.6 3.4-4.3 2.7-5 0.9-1.2
Output Voltage (V) 2 Fixed 4 Fixed 5 Fixed 4 Variable 4 Programmable(200mV-600mV)
Maximum Output Power 300mW 1,880mW 2,232 mW 2,160 mW 75mW
Inductor/Capacitor 1µH/44µF 4.7µH/40µF 2.2µH/23.5µF 4.7µH/40µF 0.5µH/1µF
Switching Frequency 500kHz 3MHz 1.2MHz 1MHz 20MHz
Voltage Ripple <20mV <80mV <40mV <30mV <20mV
Cross-Regulation N/A 0.41mV/mA 0.067mV/mA 0.04mV/mA 0.59mV/mA
Reference Tracking Speed N/A N/A N/A 60mV/µs 120mV/µs
Peak Efficiency 82.8% 82% 83.1% 87% 73%
Power Density FOM (mW/µF-µH) 6.8 10 43.2 11.5 150
of the proposed design as a Fully Integrated Voltage Regulator (FIVR) [29] [79] [80] [41]
with on-die or on-package components and faster transistors in smaller feature length
process nodes should further improve the transition speeds.
CHAPTER 6
PACKAGING AND VOLTAGE REGULATOR CO-DESIGN FOR
HIGH-PERFORMANCE POWER DELIVERY
6.1 Introduction
In this chapter we look at the potential of packaging the voltage regulation module
(VRM) and establish a thermo-electrical framework that will later help us analyze the
Single-Inductor Multiple-Output (SIMO) converter. The higher current demand and
decreasing operating voltage make power delivery and voltage regulation in high-performance
digital systems an increasingly difficult challenge in successive technology generations [81].
Due to package-die and package-board resonances, it is difficult to ensure power integrity
over a wide frequency range, making reliable power-saving techniques a challenging task [82].
Increased transient noise also implies the need for higher cost and complexity of on-die
decoupling. The addition of voltage margin to tolerate worst-case IR and L di/dt noise
may improve robustness but increases the average power dissipation. On-chip integration
of linear regulators (LDOs) may improve transient noise because of their bandwidth,
but at the expense of reduced efficiency. Fully integrated switched-capacitor converters
have also been reported to improve performance and noise margin [83] [84] [85]. As the
efficiency/performance of inductive converters tends to be better, inductive converters
with higher efficiency [86],[87], fast response [88, 29, 45, 46, 47, 48], and advanced
efficiency control over a wide load range [81],[89],[90] are being studied. Integrating the
inductive regulator on the same chip as the processor brings the advantages of on-chip
integration, but increases the design complexity due to the need to design the analog
converter in state-of-the-art digital processes. As an alternative, 3D integration of the
DC-DC converter with the processor has emerged as a design choice [82] [91] [92] [93]
[23] [94] (Figure 43).






































(d) 3D Stacked VRM with on-die filters
Figure 43: Integration schemes for the VRM, converter and filter
associated with printed circuit boards and C4 bumps. The reduced resistive parasitics
decrease the losses (RI²) in these power traces. In addition, eliminating the resistive
and inductive parasitics of these traces reduces the DC drops (RI) as well as the noise
(L dI/dt) in the voltage due to load transitions. The reduced DC and transient noise allow
the digital circuit designer to reduce the voltage margins used for supply tolerance. Other
advantages of 3D integration are reduced solution size and higher computing density.
Consequently, significant prior work has investigated the design of 3D integrated DC-DC
converters [82][91][92][93][23][94]. However, there are well-known challenges when stacking
silicon dies, such as reliability, noise, and thermal coupling. It is important to evaluate the
impact of such challenges to accurately analyze the potential of a 3D integrated DC-DC
converter. This chapter presents a detailed analysis of the 3D integrated DC-DC converter
considering electro-thermal coupling between multiple dies in a 3D stack. First, we present
the electrical models of the power delivery network (PDN) with a 3D DC-DC converter and
discuss the design of the loop filters of the converter considering on-board, on-interposer,
and on-chip passives. Second, the chapter presents a coupled electro-thermal analysis framework
to consider the die-to-die thermal coupling effects on the 3D DC-DC converter with
different integration scenarios for passives. The analysis considers a 3D integrated converter
with on-board, on-package (on-interposer), and on-chip passives (Figure 43). A comparative
analysis between the different scenarios is performed to evaluate their advantages and
challenges. Our analysis shows that 3D integration of the converter provides potential for
faster loop response, specifically with on-chip and on-interposer inductors. However,
thermal coupling imposes a major challenge for on-chip inductor integration, showing that
on-interposer integration of the inductor is an attractive option. This chapter advances the
findings of our earlier conference paper [18], where electrical models involving the PDN
impedance with a converter were evaluated. Although the literature has shown some of
the advantages of 3D power delivery [95] [96] [97] [98], this work extends the modeling
framework to incorporate self-consistent electro-thermal analysis to provide a more thorough
understanding. The key contribution of the chapter is a framework that allows the system
and package designer to understand the realistic advantages of investing in the cost of
integrating a converter in 3D. The analysis considers the temperature-dependent
VRM properties to provide the designer with tools to understand the role of the power
stage/filter loss with respect to load, processing core temperature, filter component
quality, and location (integration scheme). This chapter is organized as follows: Section 6.2
reviews related work on 3D power converters; Section 6.3 presents the model of the
electrical properties of the PDN with the VRM; Section 6.4 presents the motivation and model
for the coupled electro-thermal analysis; Section 6.5 discusses the simulation results for the
3D VRM; and Section 6.6 concludes with remarks on the results.
(a) PDN model (b) Mesh model
Figure 44: Modeling of the power delivery network
6.2 Related work
This chapter builds on three areas: first, power converters in 3D; second, on-package/on-chip
passives; and third, the bidirectional thermal effects between converter and processor
for a given integration scheme. Earlier work on 3D power converters focused on
chip-stacking technologies. Chip-stacked 3D buck regulators are discussed in [91]. Schrom
et al. proposed the use of flip-chip and through-hole packaging technology to vertically
integrate a converter with a processor [92]. More recently, Sun et al. explored the potential
of using inter-wafer-via (like TSV) based 3D integration of the buck converter [93]. The
authors designed a prototype buck converter in 2D BiCMOS technology and discussed the
potential for 3D integration of the converter. The 3D converter design operates interleaved
converters at a very high switching frequency to address the small area and poor quality of
on-chip inductance/capacitance. The peak efficiencies are quoted at 62% to 64%, which is
considerably lower than that of off-chip converters. While the design of the converter itself
is discussed in detail, the challenges of integrating it in 3D are only briefly mentioned. Using
interposed high-quality inductors helps improve the efficiency of 3D stacked converters
Table 4: PDN model parameters
Parameters   R (ohm)                  L (H)     C (F)
PCB          94µ (s), 166.6µ (p)      21p       240µ
PKG          1000µ (s), 541.5µ (p)    120p      26µ
BUMP         40m                      72p       -
GRID         28.1m                    3.1f      93.8p
TSV          7.735µ                   5.710p    313.2f
because the equivalent series resistance (ESR) can be very low. Hence, on-chip/in-package
high-quality inductors are also under intense investigation to meet the demand from
on-chip/in-package regulators. The interposer-based inductor in [85] opens the possibility of
high-quality in-package integration. Advanced on-chip inductor materials from [92][85] have
also made fully integrated regulators possible. However, the quality of advanced on-chip
inductors is still not on par with discrete or in-package components, thus limiting the
overall system efficiency.
6.2.1 Modeling of the power delivery network
Instead of the circuit design of the DC-DC converter, this work focuses on system-level
characterization considering the 3D VRM module, the LC loop filter, and the effects of the
thermal coupling from the processor to the VRM on the impedance, system loss, and
temperature. Using a unified frequency-domain, time-domain, and thermal simulation method
that considers a distributed RLC-based PDN and thermal model (including TSVs) and the
power converter, we study the reliability, performance, and losses associated with different
VRM/loop-filter/processor integration schemes. The co-analysis results and co-design
studies can provide guidelines to chip and system designers.
6.3 Electrical domain models
The on-chip PDN is modeled as a distributed RLC network. The RLC network uses an
equivalent distributed power mesh derived from the lumped impedance model of a Pentium 4
processor [82]. The grid has 12×12 grid nodes for VDD and a corresponding 12×12
grid nodes for ground. The die dimension is 12 mm × 12 mm and forms a unit cell dimension
of 250 µm × 250 µm. The off-chip impedances are modeled with RLC ladders as well, to
capture the low-frequency noise. The first segment of the ladder models the board-level
lumped impedance and the second segment of the ladder models the package impedance.
The package ladder is evenly distributed to points on the on-die grid with partially lumped
controlled collapse chip connection (C4) bump impedances. The values of the PDN
parameters used for simulations are shown in Figure 44 and Table 4. Note that both the VDD
and GND grids are modeled as distributed RLC networks to accurately account for the effect
of the return path. The PDN network for the 3D stacked VRM is composed of two planar
power meshes connected through vertical P/G TSVs. The P/G TSVs are modeled as lumped R,
L, and C. Depending on the TSV density and grid size, the number of TSVs per grid is
determined assuming uniform TSV placement. Based on the estimated number of TSVs, the
equivalent TSV impedance is attached to each grid in the planar meshes. The top planar mesh
represents the VRM die; the output of the power converter module is connected to one point
of the top planar mesh. The power-consuming processor is connected to the bottom planar
mesh. The power is provided to the processor from the output of the VRM module through
the on-die RLC grid of the VRM die, the distributed TSVs, and the on-die RLC grid of the
bottom die. Therefore, a complete distributed RLC model is considered for the 3D VRM
configuration as well. The power dissipation of the processor is assumed to be at the center
grid of the bottom plane. The one-point power delivery and power consumption model
helps account for the distributed RLC effect in both planes and the TSVs. The PDN models
are developed considering distributed decap, i.e., a fixed decap is attached per grid node during





































































































































(d) 3D Stacked VRM with on-die filters
Figure 45: Different VRM integration schemes
configurations. The PDN between the power stage of the VRM and the power dissipation in
the processor remains the same. However, the parasitic components between the power stage
output and the loop filter are different for the different cases, as shown in Figure 45. The
off-chip PDN models shown in Figure 44 are used to model these parasitic components.
No package/board-level decap is considered for the 3D VRM analysis.
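The step in this section that attaches an equivalent TSV impedance to each grid cell can be sketched as follows. The sketch assumes that the TSVs inside one cell are identical, electrically in parallel, and uncoupled (mutual inductance is neglected), and that the Table 4 TSV entries describe a single TSV. The TSV density, the helper names, and the use of the 250 µm cell pitch are illustrative assumptions rather than values confirmed by the simulation setup.

```python
def tsvs_per_cell(tsv_density_per_mm2, cell_pitch_mm):
    """Number of TSVs landing in one square grid cell, assuming uniform placement."""
    return max(1, round(tsv_density_per_mm2 * cell_pitch_mm ** 2))


def equivalent_tsv(n_tsv, r_ohm, l_h, c_f):
    """Lumped R, L, C of n identical, uncoupled TSVs connected in parallel."""
    if n_tsv < 1:
        raise ValueError("n_tsv must be at least 1")
    return r_ohm / n_tsv, l_h / n_tsv, c_f * n_tsv


if __name__ == "__main__":
    # Hypothetical density of 100 TSVs/mm^2 on a 250 um (0.25 mm) cell pitch.
    n = tsvs_per_cell(100.0, 0.25)
    # Table 4 TSV row, assumed here to be per-TSV values.
    r_eq, l_eq, c_eq = equivalent_tsv(n, 7.735e-6, 5.710e-12, 313.2e-15)
    print(n, r_eq, l_eq, c_eq)
```

Resistance and inductance scale down with the number of parallel TSVs while capacitance scales up, so a denser TSV array lowers the vertical series impedance at the cost of more shunt capacitance per grid node.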
6.3.1 Modeling of PDN considering the converter
A full-chip PDN model for one die is developed considering the impedances due to on-chip
grids (distributed RLC) and off-chip RLC ladders including decaps [99]. The model
is extended to the PDN of 3D die stacks, where the PDN is composed of two planar
power meshes connected through vertical P/G TSVs. In the 3D VRM case, one planar
mesh represents the output of the VRM module and provides power to the bottom mesh
through the P/G TSVs. The power-consuming processor is connected to the bottom planar
mesh. Section III.A describes the PDN models and the associated parameter values. The
framework is used to model the following PDN configurations including the VRM module.
First, the conventional 2D scenario (Figure 43a) is modeled, where the converter is off-chip
and connected through PCB routing, flip-chip bumps, and bond-wires. For the 3D
VRM-processor stack, the converter is brought in as a stacked die. For the 3D cases, we first
consider the case where the LC loop filter components of the VRM remain off-chip (Figure
43b). Next, we consider the case where the LC components are within the package but
outside the die (Figure 43c). This can be achieved using system-in-package integration
and/or interposers. Figure 43d shows the case where the filter is implemented on-chip on the
VRM die. PDN analysis is often performed assuming an ideal power converter source, which
overestimates the performance of the PDN. Using an ideal VRM also neglects the pole-zero
interactions between the network and the VRM. Therefore, we perform a detailed
converter-aware PDN analysis. The full-system PDN models for the different integration
scenarios are shown in Figure 45. Figure 45a shows the structure considering the off-chip
PDN, while Figures 45b, 45c, and 45d show the structure considering a 3D VRM with
off-chip, on-package, and on-die filters, respectively.
6.3.2 Converter structure
Our goal is to evaluate the possibility of using a high-efficiency converter with different
integration schemes in order to exploit the advantages of lower parasitics and achieve
performance similar to that of linear regulators. This work uses a pulse-width modulation
(PWM) based buck converter as the VRM. A closed-loop converter provides tolerance to
process, voltage, and temperature variations, and the use of a feedback sampling point
closer to the load provides better line and load regulation (lower impedance). 3D integration
provides the opportunity to sample very close to the load; this improves the regulation of
the on-chip PDN since the converter now compensates for the path losses due to the PCB,
packaging, and bumps (see Figure 44). The introduction of feedback and better control has a
price: it creates the possibility of instability, especially because a complex-conjugate
pole pair appears due to the filter. Therefore, it is important to know how the loop of the
whole signal chain behaves in order to compensate it and ensure stability while exploiting
the opportunity for performance. To ensure stability, we can either use dominant-pole
compensation (which sacrifices the gain at higher frequencies) or use a controller that
shapes the frequency response of the loop filter to effectively cancel out the undesired parts
of the overall loop response. As we will show in Section V.B., a simplistic converter design
cannot take full advantage of the 3D integration schemes. Furthermore, the design of the
VRM will ultimately determine the performance gain of the 3D integration scheme.
6.3.3 Design of the loop filter







where GC represents the controller transfer function and β is the feedback ratio. The controller
gain was modeled as:
$$G_C(s) = A_{DC}\,\frac{(s + z_1)(s + z_2)}{(s + p_1)(s + p_2)(s + p_3)} \qquad (79)$$
where ADC is the DC gain of the controller, p1, p2 and p3 are left-half-plane poles, and z1 and
z2 are left-half-plane zeros. More poles than zeros were introduced so that the controller
is physically realizable. Since the filter is part of the loop, we must also model its
transfer function:
$$H(s) = \frac{\omega_o^2}{s^2 + 2\zeta\omega_o s + \omega_o^2} \qquad (80)$$












$$\omega_o = \frac{1}{\sqrt{L_{filter}\,C_{filter}}} \qquad (82)$$
Since we want a stable loop gain (phase lag below 180° at the 0-dB crossing), we can use the
two zeros to cancel the resonant filter poles. The dominant pole p1 is placed at a very low
frequency to act as an integrator, and the second (p2) and third (p3) poles are used to
reduce the gain at very high frequencies, where there are gain peaks due to parasitics
such as the packaging, PCB, and bumps. We must be aware that the LC filter has non-idealities
associated with it. Parasitic zeros appear in the transfer function due to the resistive
components of the inductor and capacitor. The locations of these zeros are:
$$\omega_{z,ESR,C} = \frac{1}{R_{ESR,C}\,C} \qquad (83)$$
$$\omega_{z,ESR,L} = \frac{R_{ESR,L}}{L} \qquad (84)$$
where ESR stands for equivalent series resistance. The two zeros are fundamentally
different: the inductor ESR zero appears at low frequencies, and the impedance is
boosted as frequency increases. The capacitor ESR zero appears at very high frequencies,
at which the capacitor is effectively a short and the output impedance flattens out to a
minimum of RESR,C. Although these zeros boost the loop gain at high frequencies, they
have losses associated with them that impact the efficiency. Because of this, we try to
keep these two zeros at higher frequencies, outside the loop filter bandwidth.
Section 6.4.1 deals with the losses of a switching regulator, and both of these effects are
explained there. If the loop gain is large enough, the closed-loop transfer function across




$$V_o = \frac{T(s)}{1 + T(s)}\,\frac{V_i}{\beta} \approx \frac{V_i}{\beta} \qquad (85)$$
Then, assuming that a scaled-down version of the supply, Vi = DVdd, feeds our loop filter
and that the feedback ratio is β = 1, we obtain Vo = DVdd.
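The loop-shaping procedure of this section can be sketched numerically. The following is a minimal Python sketch, not the design used in this work: it assumes that the loop gain has the form T(s) = β GC(s) H(s), uses the controller of eq 79 and the filter of eq 80, and places the two zeros as the complex-conjugate pair that exactly cancels the filter poles (an idealization). All numeric values (filter resonance, damping, pole locations, the 10 MHz crossover target) and all function names are hypothetical.

```python
import cmath
import math

def controller(s, a_dc, zeros, poles):
    """Controller of eq 79: A_DC * prod(s + z_k) / prod(s + p_k)."""
    num = complex(a_dc)
    for z in zeros:
        num *= (s + z)
    den = complex(1.0)
    for p in poles:
        den *= (s + p)
    return num / den

def lc_filter(s, w_o, zeta):
    """Second-order LC filter of eq 80."""
    return w_o**2 / (s**2 + 2.0 * zeta * w_o * s + w_o**2)

# Hypothetical values, chosen only for illustration.
w_o = 2.0 * math.pi * 1e6        # filter resonance (rad/s)
zeta = 0.2                       # filter damping factor
beta = 1.0                       # feedback ratio

# Idealized zero placement: (s + z1)(s + z2) = s^2 + 2*zeta*w_o*s + w_o^2,
# so the zeros cancel the complex-conjugate filter poles exactly.
z1 = w_o * complex(zeta, math.sqrt(1.0 - zeta**2))
z2 = z1.conjugate()

# p1 at very low frequency (integrator-like), p2 and p3 well above crossover.
poles = [2.0 * math.pi * 10.0, 2.0 * math.pi * 100e6, 2.0 * math.pi * 100e6]

def loop_gain(s, a_dc):
    """Assumed loop gain T(s) = beta * G_C(s) * H(s)."""
    return beta * controller(s, a_dc, [z1, z2], poles) * lc_filter(s, w_o, zeta)

# Pick A_DC so that |T| = 1 at the target crossover frequency (10 MHz).
s_c = 1j * 2.0 * math.pi * 10e6
a_dc = 1.0 / abs(loop_gain(s_c, 1.0))
T_c = loop_gain(s_c, a_dc)
phase_margin_deg = 180.0 + math.degrees(cmath.phase(T_c))

# Closed-loop gain (1/beta) * T / (1 + T) at 1 kHz, where |T| >> 1.
T_lf = loop_gain(1j * 2.0 * math.pi * 1e3, a_dc)
closed_loop_lf = abs(T_lf / (1.0 + T_lf)) / beta

print(f"phase margin = {phase_margin_deg:.1f} deg, "
      f"low-frequency closed-loop gain = {closed_loop_lf:.4f}")
```

With the filter resonance cancelled, the loop behaves like an integrator up to the crossover frequency, so in this sketch the phase margin is set mainly by how far p2 and p3 sit above crossover, and the low-frequency closed-loop gain approaches 1/β.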
6.3.4 Feedback filter design
The feedback filter design was a very important aspect of this work. We used a controller
transfer function in the forward path to improve the stability of the system.
Implementations that can track circuit non-idealities and compensate for losses dynamically
are not easy to realize in real-life systems. However, we chose to keep our design
moderately complex (2 zeros and 3 poles) to maintain high performance while staying in the
realizable domain, i.e., a finite bandwidth and a reasonable number of pole-zero pairs.
Higher performance can be achieved through pole-zero cancellation of higher-frequency
non-idealities, but proper implementation, characterization, and PVT compensation are
extremely difficult to achieve. Various realistic buck converter compensators are
available; the one chosen for this work is a type III compensator.
If designed properly, this implementation maximizes the performance of the loop while
maintaining stability.
6.3.5 Modeling of VRM output impedance
The output impedance of the VRM plays a very important role in the impedance profile for
both the 2D and 3D VRM integration scenarios. The open-loop output impedance of the
converter can be calculated in the s-domain as:
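$$Z_{OL}(s) = R_{ESR,C}\,\frac{\left(s + \frac{1}{R_{ESR,C}\,C}\right)\left(s + \frac{R_{ESR,L}}{L}\right)}{s^2 + \frac{R_{ESR,L} + R_{ESR,C}}{L}\,s + \frac{1}{LC}} \qquad (86)$$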
We can see that the DC value of the output impedance is RESR,L and that the high-frequency
open-loop output impedance saturates to RESR,C. This neglects the inductance in series
with the capacitor, LESL,C, which becomes significant at much higher frequencies and adds to
RESR,C. The controller loop of choice uses shunt sampling feedback to obtain the benefit of




$$Z_{CL} = \frac{Z_{OL}}{1 + T(s)} \qquad (87)$$
where T(s) is the frequency-dependent loop gain of the converter. The net effect of shunt
feedback is that RESR,L is attenuated by the loop gain at low frequencies. At high
frequencies, however, RESR,C cannot be attenuated, since the loop gain of the converter
eventually dies out and ZCL = ZOL = RESR,C. We then see a limitation to how much improvement we





Figure 46: Physics model of two-tiered stack
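The open-loop and closed-loop output impedance behavior described in Section 6.3.5 can be checked with a short numerical sketch. It assumes an unloaded filter, so that the open-loop impedance is the inductor branch in parallel with the capacitor branch, and a hypothetical single-integrator loop gain T(s) = ωc/s. The component values and the names W_C, L_FILT and C_FILT are illustrative assumptions, not values from this work.

```python
import math

def z_open_loop(s, r_esr_l, r_esr_c, ind, cap):
    """Inductor branch (R_ESR,L + sL) in parallel with capacitor branch (R_ESR,C + 1/(sC))."""
    z_l = r_esr_l + s * ind
    z_c = r_esr_c + 1.0 / (s * cap)
    return z_l * z_c / (z_l + z_c)

def z_closed_loop(z_ol, t_loop):
    """Shunt-feedback reduction of eq 87: Z_CL = Z_OL / (1 + T)."""
    return z_ol / (1.0 + t_loop)

# Hypothetical component values.
R_ESR_L, R_ESR_C = 10e-3, 5e-3      # ohms
L_FILT, C_FILT = 10e-9, 1e-6        # henry, farad
W_C = 2.0 * math.pi * 1e6           # assumed crossover of the loop gain (rad/s)

def loop_gain(s):
    """Assumed single-integrator loop gain T(s) = w_c / s (illustration only)."""
    return W_C / s

for f_hz in (1.0, 1e6, 1e10):
    s = 1j * 2.0 * math.pi * f_hz
    z_ol = z_open_loop(s, R_ESR_L, R_ESR_C, L_FILT, C_FILT)
    z_cl = z_closed_loop(z_ol, loop_gain(s))
    print(f"{f_hz:8.0e} Hz  |Z_OL| = {abs(z_ol) * 1e3:9.4f} mOhm"
          f"  |Z_CL| = {abs(z_cl) * 1e3:9.4f} mOhm")
```

At low frequency the open-loop impedance equals RESR,L and is strongly attenuated by the loop; at very high frequency both curves converge to RESR,C because the loop gain has died out.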
6.4 Thermal domain models
Figure 46 shows the thermally coupled model used as well as the system parameters. The
thermal coupling in this system is represented using a distributed R-C network, where
R and C correspond to the thermal resistance and capacitance, respectively. The 3D stack
contains the package; the bulk silicon, active silicon, and back-end-of-line (BEOL)
metal connections of the processor layer; the processor-to-memory interface material; and
the bulk silicon, active silicon (devices), and a BEOL layer for routing the power signals
out of the power die. The thickness of each die is 350 µm, and the bonding, heat sink,
and package conductivities are from [82]. In 2D integration schemes the thermal coupling
between regulator and processor is negligible, because the converter, the filter, and
the processor lie on different packages and are connected only through the PCB trace. In a
3D stacked scenario, however, the close proximity of the processor die causes heat to flow
from the processor to the converter and vice versa. Figure 47 shows the different integration
scenarios of interest. Figure 47a shows the heat flow in a typical 2D integration
scenario, where the heat in the converter is low due to its high efficiency and is mainly
dissipated to the PCB. Figure 47b shows the scenario where the inductor is on-package with a
3D stacked converter; in this case the thermal coupling with the converter die is strong but



















(c) 3D Stacked VRM with on-die filters
Figure 47: Integration schemes for the VRM, converter and filter with bidirectional heat
flow
the converter is directly stacked on (and thus thermally coupled with) the processor core die.
In this scenario heat flows between the core and converter dies, causing the converter to
warm up further and reach thermal equilibrium with the processing core.
6.4.1 Modeling of temperature-dependent VRM power losses
In this section we study the temperature sensitivity of the power loss characteristics of
3D VRM scenarios. The losses in the power delivery can be separated into the losses
in the converter and the losses in the PDN. The DC loss can be obtained by considering
the PDN impedance profile at low frequency and the load current. Extensive studies in the
literature have evaluated the losses associated with synchronous buck
converters [83]. We can separate the power losses into two categories, the DC losses
and the AC (switching) losses:
$$P_{DC,HS} = D\,R_{ds,on,HS}\,I_{load}^2 \qquad (88)$$
$$P_{DC,LS} = (1 - D)\,R_{ds,on,LS}\,I_{load}^2 \qquad (89)$$
$$P_{PDN,Total} = (R_{PDN} + R_{ESR,L})\,I_{load}^2 \qquad (90)$$
where D is the duty cycle, i.e., the fraction of the switching period during which the
high-side (HS) FET is on (the low-side (LS) FET is on for the remaining 1 − D), Rds,on,HS and
Rds,on,LS are the equivalent on-resistances of the high-side and low-side FETs, respectively,
RPDN and RESR,L are the DC resistances of the PDN and the inductor, respectively, and
Iload is the load current consumed by the processor core. Eq 88 corresponds to the
conduction loss of the HS FET, eq 89 corresponds to the conduction loss of the LS FET, and eq
90 corresponds to the losses in the PDN and the inductor. The corresponding losses due to
the inductor current ripple are:
$$P_{HS} = D\,R_{ds,on,HS}\,I_{ripple}^2 \qquad (91)$$
$$P_{LS} = (1 - D)\,R_{ds,on,LS}\,I_{ripple}^2 \qquad (92)$$
$$P_{PDN,Total} = (R_{ESR,L} + R_{ESR,C} + R_{PDN})\,I_{ripple}^2 \qquad (93)$$
where Iripple is the ripple of the inductor current. Switching the transistors involves
injecting charge into their gate capacitance so that they can turn on; the switching loss
is estimated by:
$$P_{driver} = f_s\,C_{total,avg}\,V_i^2 \qquad (94)$$
where fs is the switching frequency of the power FETs, Ctotal,avg is the equivalent capacitance
that must be charged at the gates of the HS and LS power FETs, and Vi is the input voltage
(used to turn the power FETs on and off). Circuit simulations in 130 nm CMOS provide the
transistor parameters, which are coupled with the PDN for the power loss analysis. The effect
of temperature on the properties of the transistors and PDN components is considered in the
loss analysis. To evaluate the power loss advantage of a 3D VRM, we built a model
that estimates the losses using parameters from a 130 nm CMOS process. We then calculate
the total power delivery loss, i.e.:
$$P_{delivery} = P_{PDN} + P_{conv} \qquad (95)$$
where Pdelivery is the loss in converting and delivering power to the core, PPDN is the
power loss in the PDN, and Pconv is the power loss in the converter. Figure 48 shows the
VRM+PDN losses as a function of temperature for the different integration schemes.
For the 2D case, the losses increase as the PDN temperature increases; since the off-die
converter and LC would not be


































Figure 48: Effect of variable temperature on the converter power losses (legend: 3D with off-die LC; 3D with interposed LC; 3D with on-die LC, RESR,L = 10 mΩ; 3D with on-die LC, RESR,L = 30 mΩ)
close to the heat source (the processor), we neglect the temperature increase in the converter
power stage. Next, we can see that the 3D VRM with on-PCB LC has the highest system loss.
The high system loss for the off-package filter is due to the amount of interconnect needed
to route in and out of the package and back to the load. Even though low-ESR inductors are
available as PCB components, the large trace resistance (2 times RPCB) in the high-current
path increases the losses significantly and dominates over the inductor ESR. Note that in
the AC analysis performed later on, the small-signal impedance seen from the load for the
off-package filter is an artifact of the converter loop and reflects load regulation rather
than actual resistance. For the case of the interposed LC filter, we can see that a reasonable
assumption of inductor quality (10 mΩ ESR) leads to the lowest system power. The reduced
power path length in the interposed approach, combined with good component quality,
yields very low power dissipation and minimal temperature dependence (only the power
FETs are affected). Finally, we can see that the power loss of a fully integrated (on-die
filter) approach depends strongly on the quality of the component as well as the temperature
of operation. When using a high-quality component (ESR of 10 mΩ), the inductor loss









































Figure 49: Full electro-thermally coupled simulation framework
dominate as well as its temperature dependence.
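The loss terms of eqs 88 to 95 can be collected into a small calculator. The sketch below is only illustrative: it assumes that every resistance scales linearly with temperature with a single coefficient (`tempco`, a hypothetical value), whereas the results in this section come from circuit simulation. The function names and all parameter values are assumptions.

```python
def resistance_at(r_ref, temp_c, tempco=0.004, t_ref_c=25.0):
    """Assumed linear temperature scaling of a resistance (tempco in 1/degC)."""
    return r_ref * (1.0 + tempco * (temp_c - t_ref_c))

def delivery_loss(temp_c, duty, i_load, i_ripple, r_hs, r_ls,
                  r_esr_l, r_esr_c, r_pdn, f_s, c_gate, v_in):
    """Return (P_PDN, P_conv, P_delivery) in watts at temperature temp_c."""
    # Scale every resistance to the operating temperature (assumption).
    r_hs, r_ls, r_esr_l, r_esr_c, r_pdn = (
        resistance_at(r, temp_c) for r in (r_hs, r_ls, r_esr_l, r_esr_c, r_pdn))
    r_fet = duty * r_hs + (1.0 - duty) * r_ls
    p_fet = r_fet * (i_load**2 + i_ripple**2)               # eqs 88, 89, 91, 92
    p_pdn = ((r_pdn + r_esr_l) * i_load**2                  # eq 90
             + (r_esr_l + r_esr_c + r_pdn) * i_ripple**2)   # eq 93
    p_driver = f_s * c_gate * v_in**2                       # eq 94
    p_conv = p_fet + p_driver
    return p_pdn, p_conv, p_pdn + p_conv                    # eq 95

# Hypothetical operating point.
params = dict(duty=0.5, i_load=10.0, i_ripple=1.0, r_hs=10e-3, r_ls=10e-3,
              r_esr_l=10e-3, r_esr_c=5e-3, r_pdn=5e-3,
              f_s=100e6, c_gate=1e-9, v_in=1.2)

for temp in (25.0, 85.0):
    p_pdn, p_conv, p_total = delivery_loss(temp, **params)
    print(f"T = {temp:5.1f} C: P_PDN = {p_pdn:.3f} W, "
          f"P_conv = {p_conv:.3f} W, total = {p_total:.3f} W")
```

Following eqs 90 and 93, the inductor and capacitor ESR losses are booked under the PDN term, while the FET conduction and gate-drive losses form the converter term. Only the resistive terms change with temperature in this sketch; the gate-drive loss is kept constant.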
6.4.2 Effect of thermal coupling on converter loss
This section studies the effect of thermally coupling the converter; although the model
is not complete, it gives us some insight into what happens in a thermally coupled
system. Analysis in the literature has shown that the routing has to be thermally
aware, as 3D integration can increase the temperature and increase the drops in the
PDN [95]. The integration potential of 3D-integrated VRMs has been previously analyzed
[18]. However, the analysis presented there is aimed at the regulation capabilities and does
not include the thermal coupling (feedback) that will be present in a 2-tiered stack. In this
framework we develop an integrated system that allows us to determine the closed-loop
temperature of the core and converter dies. Although the designer might be tempted
to add the powers in order to calculate the temperature using an equivalent network, we
must understand that the system at hand has thermal feedback. Suppose we integrate the
converter. Looking at the block diagram in Figure 49, we know that the conversion loss
and voltage properties will vary with temperature; however, determining the temperature
of the processor/core dies is not trivial. Let us look at the thermal circuit shown in Figure
50. RTH,core represents the equivalent thermal dissipation path of the core die through the
heat sink, and RTH,conv represents the stray heat paths from the converter die to the
ambient (through bumps and PCB). Suppose that the core power is suddenly
increased; that will increase the temperature in the voltage regulator because the VRM is












Figure 50: Simplified model of the thermal equivalent network
not since the only heat dissipation comes from stray thermal paths to the bumps) and the
coupling resistance is small (as it is in TSVs), then the main mechanism of heat extraction
will be through the heat sink and both dies will be close in temperature. An increase in
power dissipation then increases the temperature, which in turn increases the power
dissipation in the VRM; this scenario occurs if the losses in the VRM are very sensitive to
temperature. To define the problem of integration and proper heat dissipation,
we should first look at the relationship between power loss and temperature, so that we
can properly understand the results that will be presented. Let us look at the temperature
dependency and explain the non-linear function of the power dissipated in a fully
integrated voltage regulator (FIVR). Intuitively, the core temperature can be determined from eq 96.
$$T_{core} = f(P_{core} + P_{PDN}) + f\big(P_{conv}(P_{core}, T_{core}, t)\big) \qquad (96)$$
where Tcore is the core temperature, Pcore is the core power consumption, PPDN is the PDN
loss, and Pconv is the converter loss, which is a function of the core power (through the
load current), the converter temperature, and time (the core current will vary). We can
simplify this relationship by making some assumptions that help us understand the behavior.
If we assume that the thermal coupling resistance is zero (RTH,C = 0) and that the converter
transfers its heat through the TSVs and the heat sink (RTH,conv = ∞), then the solution of the
thermal circuit is that the core and converter are at the same temperature and the heat
dissipates through the heat sink, i.e.:
$$T_{core} \approx T_{conv} \approx T_{sys} = R_{TH,core}\big(P_{core} + P_{PDN} + P_{conv}(P_{core}, T_{sys}, t)\big) \qquad (97)$$
which, assuming a steady-state power consumption, leads to:
$$T_{sys} = R_{TH}\left(P_{core} + P_{PDN} + P_o + \alpha T_{sys}\right) \qquad (98)$$
where Po is the converter loss at room temperature and α is the coefficient of power loss with
respect to temperature of the converter, which can be obtained by sweeping the converter









$$-\alpha R_{TH} = \frac{R_{TH}(P_{core} + P_{PDN} + P_o) - T_{sys}}{T_{sys}} \qquad (100)$$
Assuming that the temperature coefficient is constant, we find that the closed-loop
relationship for the temperature is given by:
$$T_{sys} = \frac{R_{TH}(P_{core} + P_{PDN} + P_o)}{1 - \alpha R_{TH}} \qquad (101)$$
Assuming that we know the temperature coefficient α of the converter power loss, we
can estimate the system temperature, or estimate the total system loss for a given
maximum temperature. Moreover, we find that if the converter loss is sufficiently
sensitive to temperature, both converter and core can experience a significant increase
in temperature. Figure 51 shows that the converter temperature sensitivity can considerably
increase the temperature. With α = 0 (temperature-independent loss) the system shows
a slight linear dependency with respect to the system temperature (as do all converters, given
their nearly flat efficiency); as the coefficient moves closer to α = 1/RTH, the temperature
tends to infinity. Intuitively, this is because the temperature sensitivity of the converter die
causes it to generate more heat than the heat sink is able to extract; this
would be the case for a poorly designed converter with large temperature variation. The
properties described here are for theoretical and analytical purposes only; the simulation
results make no assumptions about the thermal impedances and include all
coupling paths for heat (discussed at the beginning of Section 6.2.1).
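The closed-form result of eq 101 can be compared against a direct fixed-point iteration of eq 98. The sketch below treats Tsys as the temperature rise above the reference at which Po is defined (an assumption made here for simplicity); the thermal resistance, the powers, and the function names are hypothetical.

```python
def closed_loop_temperature(r_th, p_core, p_pdn, p_o, alpha):
    """Closed-form steady-state temperature rise of eq 101."""
    loop = alpha * r_th
    if loop >= 1.0:
        # The converter generates heat faster than the heat sink removes it.
        raise ValueError("thermal runaway: alpha * R_TH >= 1")
    return r_th * (p_core + p_pdn + p_o) / (1.0 - loop)

def iterate_temperature(r_th, p_core, p_pdn, p_o, alpha, steps=200):
    """Repeatedly apply eq 98, starting from zero temperature rise."""
    t_sys = 0.0
    for _ in range(steps):
        t_sys = r_th * (p_core + p_pdn + p_o + alpha * t_sys)
    return t_sys

# Hypothetical values: R_TH in K/W, powers in W, alpha in W/K.
R_TH, P_CORE, P_PDN, P_O = 2.0, 10.0, 1.0, 1.0

for alpha in (0.0, 0.1, 0.25, 0.4):
    t_closed = closed_loop_temperature(R_TH, P_CORE, P_PDN, P_O, alpha)
    t_iter = iterate_temperature(R_TH, P_CORE, P_PDN, P_O, alpha)
    print(f"alpha = {alpha:4.2f} W/K: closed form = {t_closed:7.2f} K, "
          f"iterated = {t_iter:7.2f} K")
```

The iteration converges only while α RTH < 1, which is the same condition that keeps the denominator of eq 101 positive; as α approaches 1/RTH the computed temperature rise grows without bound.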




































Figure 51: Effect of variable converter power loss temperature sensitivity
6.4.3 Thermal modeling environment
The thermo-electric modeling environment is shown in Figure 49. The voltage regulation
and loss loop consists of the feedback converter model discussed in section 6.3.3 together with
the electrical PDN network discussed in section 6.3 and shown in Figure 44. The challenge was
to develop a transient, temperature-dependent model that uses the core/processor outputs to
re-calculate the conversion loss dynamically. The model is also electrically coupled to the
core/processor stack model. The thermal feedback loop consists of the temperature-dependent
loss model for the converter (section 6.4.1) as well as the 3D and 2D thermal models
used [99]. Figure 44 shows the thermal model as well as the thermal property parameters.
The voltage regulation part determines and controls the ON/OFF times of the power stage;
the load current and the present converter temperature are then used to modulate the FET,
inductor, and PDN resistances based on the regulator function and the given temperature.
The regulator power dissipation is fed back as an input to the thermal 3D stack, which adds
the dissipated power to the proper part of the stack. As the regulation voltage, load
current, power, and temperature settle, the system converges to the temperatures (core and
converter) that correspond to the applied load and conversion efficiency. The models use
a mix of Verilog-A and SPICE-level components and are also used to run small-signal
(AC) analyses and to generate results quickly.
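The feedback structure described above can be illustrated with a much smaller lumped model. The sketch below is not the Verilog-A/SPICE framework of this work: it assumes only two thermal nodes (core and converter) in the spirit of the simplified network of Figure 50, a converter loss that grows linearly with converter temperature, and forward-Euler time stepping. All names and parameter values are hypothetical.

```python
def simulate(p_core, p_o, alpha, r_core, r_conv, r_c, c_core, c_conv,
             t_amb=25.0, dt=1e-3, t_end=60.0):
    """Two-node thermal RC network with temperature-dependent converter loss.

    r_core: core die to ambient (heat sink), r_conv: converter die to ambient
    (stray paths), r_c: coupling resistance between the dies (all in K/W).
    Returns the final (T_core, T_conv, P_conv).
    """
    t_core = t_conv = t_amb
    p_conv = p_o
    for _ in range(int(round(t_end / dt))):
        # Re-calculate the conversion loss from the present converter temperature.
        p_conv = p_o + alpha * (t_conv - t_amb)
        q_c = (t_core - t_conv) / r_c            # heat flow from core to converter
        d_core = (p_core - (t_core - t_amb) / r_core - q_c) / c_core
        d_conv = (p_conv - (t_conv - t_amb) / r_conv + q_c) / c_conv
        t_core += dt * d_core
        t_conv += dt * d_conv
    return t_core, t_conv, p_conv

# Hypothetical stack: strong die-to-die coupling, almost no direct converter path.
t_core, t_conv, p_conv = simulate(p_core=10.0, p_o=2.0, alpha=0.05,
                                  r_core=2.0, r_conv=1e6, r_c=0.05,
                                  c_core=1.0, c_conv=1.0)
print(f"T_core = {t_core:.2f} C, T_conv = {t_conv:.2f} C, P_conv = {p_conv:.3f} W")
```

In this configuration nearly all of the converter heat must leave through the core die and the heat sink, so the two dies settle within a fraction of a degree of each other and the converter loss ends up well above its room-temperature value.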






























Figure 52: Effect of variable temperature on PDN impedance (legend entry: 3D impedance with thermal)
6.5 Analysis and co-design of 3D VRMs
The on-chip PDN is modeled as a distributed RLC network considering a die dimension of
12 mm × 12 mm. The RLC network uses an equivalent distributed power mesh derived from
the lumped impedance model of a Pentium 4 processor [99]. Here we present the
results for the different integration schemes using the established framework.
6.5.1 Ideal PDN impedance with 3D VRM
Figure 52 shows the impedance plot of the PDN using an ideal voltage source termination
for a 2D PDN and a 3D PDN to demonstrate the potential of a 3D VRM. Also shown is the
effect of temperature on the PDN; the temperature and impedance are obtained using
the linearized converter/processor stack, which includes the PDN impedance as a
function of temperature. The most noticeable effects are a DC impedance drop and a shift of
the resonant peaks (i.e., bandwidth extension). The resonant peaks due to the packaging and
the bond wires are removed, and the only peak that remains is due to the TSV inductance.
A higher TSV density reduces the TSV non-idealities, resulting in a lower DC drop and a shift
of the L di/dt droop to higher frequencies. The effect of a small temperature increase is also
shown in the figure: although the impedance at room temperature is 10 mΩ,
an increase of 38 °C will increase the impedance by 14%; however, considering the overall





































Figure 53: Effect of converter on the overall impedance profile utilizing same loop shape
as a 2D converter
advantageous. The above analysis shows the potential of an ideal 3D VRM module. Next,
we discuss the impact of the converter characteristics in the above analysis.
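As a quick consistency check on the numbers quoted for Figure 52, the temperature coefficient implied by a 14% impedance increase over a 38 °C rise can be computed directly. The sketch assumes that the DC impedance scales linearly with temperature, which is an assumption made here and is not stated in this section; the function name is hypothetical.

```python
def implied_tempco(fractional_increase, delta_t):
    """Temperature coefficient (1/degC) implied by a linear resistance model."""
    return fractional_increase / delta_t

Z_ROOM = 10e-3                          # ohms, room-temperature DC impedance
alpha_pdn = implied_tempco(0.14, 38.0)
z_hot = Z_ROOM * (1.0 + alpha_pdn * 38.0)

print(f"implied coefficient = {alpha_pdn * 100:.3f} %/degC, "
      f"hot impedance = {z_hot * 1e3:.2f} mOhm")
```

The implied coefficient, about 0.37% per °C, is close to the temperature coefficient of resistance of copper near room temperature (roughly 0.39% per °C), which is consistent with metal interconnect dominating the DC impedance.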
6.5.2 Co-analysis and design of VRM and PDN with 3D integration
We perform a co-simulation of the compensator loop gain of the VRM plus filter plus PDN for
the different integration scenarios. Figure 53 shows the effect of the VRM on the PDN; the
closed-loop (integration-dependent) temperature was used for all of the plots, as well as the
same control filter frequency response. As expected, using a 2D off-chip converter adds a large
number of routing and package peaks, which limit the achievable bandwidth (without using
a specialized controller). The DC regulation at the load is also worse due to the IR
drop across the PCB and package, which do not fall inside the control loop. We also
observe that adding the VRM introduces a low-frequency impedance peak. This peak
occurs at the unity-gain crossover frequency, where the converter loses the ability to regulate
its output impedance. This peak therefore corresponds to the open-loop impedance
of the converter; it disappears at higher frequencies due to the impedance reduction of
the capacitor, and the impedance settles to RESR,C. Next, we consider the characteristics of the 3D VRM







































Figure 54: Loop gain frequency shaping to compensate for improved bandwidth
frequency response is used. The PDN impedance in Figure 52 shows that, although we
see a reduction in the PDN impedance at all frequencies, we still observe peaks at lower
frequencies compared to the ideal case; these are due to the use of a low-bandwidth controller
that does not adjust its shape to compensate for the parasitics. The 2D VRM bandwidth is
purposely limited because it has a large number of parasitics that appear at lower
frequencies than in its more integrated counterparts.
Taking into account that there are fewer parasitics in the paths that are 3D integrated, the
controller frequency shaping needs to be modified to optimize each integration scheme; the
poles and zeros are shifted to take advantage of the flat, low impedance of 3D integrated
structures. Figure 54 shows the loop gain response of the 2D off-chip converters and the
optimized 3D VRM converters, where the PDN is included as part of the loop gain. The
controller function described in section 6.3.3 was used to compensate the networks.
The parasitic reduction obtained from 3D integration can be used to extend the VRM
bandwidth, which can, in turn, be used to maximize the flatness of the impedance profile.
Moreover, the effect of temperature cannot be neglected, as the sensitivity and the
convergent temperature depend on the integration scheme. Figure 55 shows the impedance

















































Figure 55: Effect of converter on the overall impedance profile
3D VRM with off-chip LC: This approach improves the low-frequency impedance as
well as the regulation into the PDN. The converter can now compensate for losses in the
packaging and PCB, since it can sample very close to the load. However, the bump,
package, and PCB parasitics between the VRM module and the on-board loop filters
result in low- to medium-frequency peaks in the loop transfer function. Thus the routing
parasitics limit the bandwidth of the converter (and hence its response time). The upside of
having higher-quality off-chip components is offset by the PDN parasitics. The temperature
does not have a significant effect on the equivalent DC impedance, mainly because the
temperature increase is not large in this case.

3D VRM with on-package LC: This approach removes the low-frequency resonant peaks due to
the PCB, which allows an improvement in the VRM bandwidth. Higher bandwidth improves the
response time. However, we still observe a high-frequency peak due to the package
resonance, which is the main limiting factor for the bandwidth. This scheme provides the
lowest impedance, because the main losses of the converter come from the
(temperature-dependent) filter parasitics. Since the converter filter is not thermally coupled
and the quality of on-package components is higher than that of on-die/stack components, it
operates at a moderately low temperature; thus the effect of temperature on the PDN is also
not too large and would not impact the PDN drops significantly.

3D VRM with on-die LC: Using an on-die LC filter provides the largest bandwidth advantage of
3D integration. The impedance profile is very close to the ideal 3D behavior. Removing the
package and PCB components enables us to improve the bandwidth without sacrificing
stability, so the loop gain is now limited only by the resonance of the TSV impedance.
Additionally, using an on-chip LC requires us to lower the LC component values, such that the
bandwidth can be extended without the need for frequency shaping using the controller poles
and zeros. However, the limiting factor of this integration scheme is that the increased
converter, PDN, and core temperature increases the IR drop significantly. Figure 55 shows the
difference between a non-thermally coupled fully integrated converter and a
temperature-coupled converter. In a non-thermally coupled environment, the system
temperature is lower, and the ideal advantage of integrating the converter leads to the
conclusion that it offers the lowest PDN plus converter impedance. However, when including the
system temperature (obtained with the previously discussed thermo-electrical framework), we
see that, although flatter, the impedance profile is higher than the one obtained using either
the 3D with off-die LC or the 3D with interposed LC filters. Therefore, integration-induced
temperature can change which option is preferable. Although the numbers used for these
simulations show that the fully integrated approach is not the optimal solution, the purpose
of this analysis is to let the designer understand the effect of the integration. These results
are not set in stone, as using better-quality passives can reduce the VRM power dissipation,
leading to a lower system temperature and a lower PDN plus converter impedance. All of the AC
simulations were obtained with a core power of 10 W applied to the processing core.
6.5.3 Transient performance analysis
Figures 56a and 56b show the converter startup and its step response to a transient 10-20 A
current step applied at a node in the middle of the core PDN mesh. When the
current step is applied, two things happen. First, the local decap at the node is discharged
to provide charge quickly to the load, and the voltage droops. Second, when the converter senses





















(a) Startup transient waveform for converters (annotation: fast settling time due to reduced parasitics)























(b) Step load response (legend entry: 3D on-die/stack LC (no T); annotations: "Minimum IR drop reduction", "on IR drop for 3D with on-die filter")
Figure 56: Transient simulations for the analyzed converters
the droop, it begins injecting charge back and the voltage recovers. After
this, the voltage settles to the desired target voltage minus the IR drop in the PDN.
Figure 56b shows that the 3D VRM can significantly improve the response to load transients
and reduce the IR drop. As expected, the results agree with the impedance results obtained
in the previous section: the 3D VRM with on-die LC filter and no temperature coupling
shows the lowest droop, but once the effect of temperature is included it is actually worse
than the other 3D integrations, with the 3D VRM with on-package filter being the best in terms
of droop and DC drop. The startup response of the converter shown in Figure 56a clearly
indicates that the 3D VRM with on-die filters has a much faster start-up time than the
other 3D VRM options. The 2D response is not shown, given that it is much slower than its
3D counterparts and cannot be shown on the same timescale.
6.5.4 Transient closed-loop thermal results
The distributed electrical and thermal network models of the power delivery for the processor
and the converter enable accurate representation of transient phenomena in both the
converter and the processor die. Our interest is to understand the behavior of the converter/core
by co-simulating the VRM regulation and its temperature-dependent losses with the
complete electrical and thermal grids. Figure 49 shows the established framework to be used




































































Figure 57: 2D converter electro-thermal transient closed-loop simulation
in this section. The simulation framework was set up as follows: 1) The thermal grid
determines the temperature of the processor given a power output. 2) The temperature-
dependent converter model simulates the transient effects of the converter such as voltage
regulation, response time and conversion loss. 3) The configurations studied included off-
die VRM, on-die VRM with off-die (temperature decoupled) magnetics, on-die VRM with
on-die magnetics.
6.5.5 2D VRM and processor die
In this scenario, the 2D converter is decoupled from the heat of the processor die. Figure 57
shows that the converter power loss increases with a step in the load current. The system,
however, shows only moderate efficiency (output power divided by total consumption).
A large part of this is due to the loss in the PDN: a large amount of power is dissipated
in delivering high current from the inductor through the PCB to the core. This approach









































































Figure 58: 3D converter with off-die LC electro-thermal transient closed-loop simulation
6.5.6 3D-stacked VRM with off-package inductor
In Figure 58 we see the effect of converter/processor die temperature coupling. The
transient temperature of the processor and converter dies increases relative to the 2D case;
the increase results from the power dissipation of the converter adding to the thermal
stack. The increased temperature is mainly due to the added power loss caused by the
temperature sensitivity of the power FETs. The temperature of the converter and processor
dies is higher, which also leads to slightly higher power loss. The system efficiency drops
significantly, from 76% at light load to 61% at heavy load, due to the long interconnect
between the power stage, the off-package inductor, and the core.
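The efficiency figures quoted above can be translated into delivery losses using the definition of system efficiency as output power divided by total consumption. The sketch below assumes light and heavy core loads of 4 W and 12 W; these load values are hypothetical here and are used only to illustrate the arithmetic, and the function names are not from this work.

```python
def system_efficiency(p_out, p_loss):
    """Output power divided by total consumption."""
    return p_out / (p_out + p_loss)

def loss_for_efficiency(p_out, eta):
    """Delivery loss that yields efficiency eta at output power p_out."""
    return p_out * (1.0 / eta - 1.0)

# Assumed load levels (W) paired with the quoted efficiencies.
loss_light = loss_for_efficiency(4.0, 0.76)
loss_heavy = loss_for_efficiency(12.0, 0.61)

print(f"light load: {loss_light:.2f} W lost, heavy load: {loss_heavy:.2f} W lost")
```

Under these assumed loads, the drop from 76% to 61% efficiency means the delivery loss grows by roughly a factor of six while the load only triples, which is the behavior expected when resistive I²R losses in the interconnect dominate.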
6.5.7 3D-stacked VRM with interposed inductor
The interposer-based approach offers one of the best trade-offs of all the integration schemes.
Since the quality of the inductors in interposer-based approaches can be high, the conversion
loss is low while still providing the advantages of a wider bandwidth and a more robust
processing core supply. Figure 59 shows that the interposer-based solution achieves an






































































Figure 59: 3D converter with interposer-based LC electro-thermal transient closed-loop
simulation
efficiency from 80% to 89%. The temperature sensitivity is close to that of the previous
case, meaning that only the FETs show increased loss due to temperature; the
inductor has low heat coupling. The reduced PDN and inductor paths, and their correspondingly
lower losses, make this a very attractive solution for the system designer.
6.5.8 3D-stacked VRM with on-die inductor
This integration scheme is the most sensitive to both temperature and component quality.
Figure 60a was generated using a very high-quality on-die inductor (ESR = 10 mΩ). We can
see that, given a very good inductor (high efficiency and low temperature sensitivity), this
approach provides the highest system efficiency (89% at light load versus the low
80s for the other approaches). The caveat of this approach is the inductor quality:
on-die inductors have been improving, but obtaining high-quality components is expensive
and complex in terms of processing. Figure 60b shows the same simulation using
a lower-quality inductor (ESR = 50 mΩ). In that figure we can see that the interaction
of the increased conduction loss in the inductor, combined with higher conversion loss and



































































(a) Response using 10mOhm ESR



































































(b) Response using 50mOhm ESR
Figure 60: Transient simulations for the on-die power stage filter
higher temperature, drastically reduces the efficiency, from 88% to 72% at light load and
from 80% to 56% at heavy load. Using a 5X lossier inductor also means that the temperature
sensitivity of its loss increases, even though the inductor can be made of the same material.
It is important to note that the physical dimensions of the inductor also play a role
in the loss sensitivity, meaning that a low-ESR (and thus large) inductor will be better than a
high-ESR (but smaller) inductor. Figures 61a and 61b show the power distribution when the
load power is varied from 4 W (61a) to 12 W (61b) for all of the analyzed schemes. We can
see that when the power consumption is reduced, the converter loss reduces drastically; the
thermo-electrical simulation in the previous discussion shows that the system temperature
is sensitive to core power and converter loss. Thus reducing the core power and converter


































































































































































(b) Heavy load condition
Figure 61: System power loss distribution of the different integration schemes
6.5.9 Utilizing improved voltage robustness
One of the main advantages of 3D integration, as discussed in section 6.5.2, is that a more
robust supply (lower PDN impedance) allows the circuit designer to reduce the voltage
supply margins in their designs. If a smaller margin can now be tolerated because
the power delivery network is more robust, then the voltage supply can be scaled
down.




Suppose that the reduction in IR drop means that we can count on a stable supply at
high load. The designer can then reduce the voltage tolerance while keeping the throughput
fixed, and thereby reduce the core power. We want to look at the effect on the system power
when the core power is reduced. Figure 62 shows that the reduction of the core supply (and
the resulting reduction in core power) has a positive impact on the total system loss: using
a lower core voltage results in a power efficiency above 90% and, with a good inductor, the
dies do not reach 40 °C (below the temperature of the 2D stack). Improving robustness as a
means to reduce margins thus proves to be an effective approach to reducing the system power.
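As a rough illustration of why recovering supply margin pays off, the sketch below assumes that the core power is dominated by dynamic power, which scales with the square of the supply voltage at a fixed clock frequency (fixed throughput). This scaling law, the function name `scaled_core_power` and the example operating point (12 W at 1.0 V with 50 mV of recovered margin) are assumptions made for illustration; they are not taken from the simulations in this chapter, and leakage is ignored.

```python
def scaled_core_power(p_nominal_w, v_nominal, v_margin_saved):
    """Dynamic core power after removing part of the supply guard band.

    Assumption: P is proportional to V^2 at a fixed clock frequency
    (fixed throughput). Leakage and its voltage dependence are ignored.
    """
    if not 0.0 <= v_margin_saved < v_nominal:
        raise ValueError("margin must be non-negative and below the nominal supply")
    v_scaled = v_nominal - v_margin_saved
    return p_nominal_w * (v_scaled / v_nominal) ** 2


if __name__ == "__main__":
    # Hypothetical operating point: 12 W core at 1.0 V, 50 mV of margin recovered.
    p_new = scaled_core_power(12.0, 1.0, 0.05)
    print(f"core power: 12.00 W -> {p_new:.2f} W")
```

Under these assumptions a 5% reduction in supply voltage lowers the core power by roughly 10%, and the lower core power in turn reduces the heat that couples back into the converter die.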
6.5.10 Effect of closed-loop temperature sensitivity on the system
Figure 63 shows the effect of a varying temperature sensitivity on the inductor power losses.
An α of 0 means that the inductor temperature does not result in an increase in resistance; an
α of 0.005 means that a 1 °C increase in temperature results in a 0.5% increase in $R_{ESR,L}$,
Efficiency with ESR low
Efficiency with ESR high
Figure 62: Transient response of fixed throughput test case
and similarly for α = 0.01. A temperature coefficient of α = 0.01, however, is unlikely, since
most inductors are copper based (with α = 3.8e-3). The takeaway from this figure
is that the temperature sensitivity of the inductor material has only a secondary effect on the
system compared with the inductor quality (which we discussed in the previous sections).
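The relative weight of the two effects can be checked with a short calculation. The sketch below applies the linear temperature-coefficient model to the two ESR cases of Figure 60 and to the α values of Figure 63. The 27 °C reference is the one used in our simulations, while the RMS current, the inductor temperature and the function names are hypothetical values chosen only for illustration.

```python
def esr_at_temperature(r_ref_ohm, alpha_per_c, temp_c, temp_ref_c=27.0):
    """Linear model: R(T) = R_ref * (1 + alpha * (T - T_ref))."""
    return r_ref_ohm * (1.0 + alpha_per_c * (temp_c - temp_ref_c))


def conduction_loss_w(i_rms_a, r_ohm):
    """I^2 * R conduction loss in the inductor winding."""
    return i_rms_a ** 2 * r_ohm


if __name__ == "__main__":
    i_rms = 10.0    # hypothetical inductor RMS current in A
    temp_c = 67.0   # hypothetical inductor temperature, 40 C above the reference
    for r_ref in (0.010, 0.050):                 # ESR cases of Figure 60
        for alpha in (0.0, 3.8e-3, 5e-3, 10e-3):  # alpha values discussed above
            r_hot = esr_at_temperature(r_ref, alpha, temp_c)
            loss = conduction_loss_w(i_rms, r_hot)
            print(f"ESR={r_ref * 1e3:4.0f} mOhm  alpha={alpha:.4f}  loss={loss:5.2f} W")
```

With these numbers, moving from the 10 mΩ to the 50 mΩ inductor multiplies the conduction loss by five, whereas the copper temperature coefficient adds about 15% for a 40 °C rise, which is consistent with the material sensitivity being a secondary effect.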
6.5.11 Design guidelines
Table 5 shows a summary of the observations discussed. One of the main contributions of
this chapter is adding the effect of the converter to the die stack and analyzing the thermal
coupling effect on the VRM. When looking at higher integration, the benefits of the more
robust voltage supply could lead to unrealistic expectations, since 3D stacking comes at
the cost of increased temperature. As long as the voltage regulator is properly designed and
the quality of the components is not too low, the results show that the increased heat does
not offset the advantages of denser integration. The results also show, however, that the
losses are largely dependent on component quality. The lower the quality of the inductor,
the higher the loss of the converter die (which increases temperature); moreover, the
lower quality also results in higher temperature sensitivity (which again leads to increased
Efficiency with α = 0
Efficiency with α = 5e−3
Efficiency with α = 10e−3
Figure 63: Transient dynamics of the system under varying loss sensitivity with respect to
temperature
loss). The advantages of the integration schemes in terms of bandwidth and reduced drops
can be taken into consideration when designing the feedback controller, but determining
the conversion and delivery loss requires us to delve deeper, as the small-signal regulation
impedance is shaped by the feedback controller and is not equal to the power path
impedance. From the results we can see that both interposer-based and on-die passives offer
the best performance and allow a reduction in voltage supply margins, but
these advantages come at a high cost. It is the designer’s responsibility to investigate what is
realistically achievable from these schemes based on the available technology.
6.6 Conclusions
This chapter establishes the potential and challenges of switched-inductor 3D VRMs in reducing
losses and improving the performance of the power delivery system. 3D VRMs reduce both
the low-frequency and high-frequency PDN impedance, resulting in potentially much lower IR
and L di/dt noise in the digital circuits. As technologies scale down, digital circuits are becoming
highly sensitive to voltage noise, and thus lower supply noise leads to more robustness. The
reduced IR and L di/dt noise also provide opportunities for more aggressive voltage scaling
in the processor and hence lower operating power. The analysis of the PDN behavior considering
the effect of the converter shows that the 3D VRM can provide an appreciable advantage, but
the method of integrating the power stage filter of the converter has a strong
impact on the stability and the achievable PDN+VRM impedance. The analyses indicate a
reduction in the IR losses obtained from the 3D PDN and an opportunity to improve VRM
bandwidth, but also show that the temperature coupling has a strong effect on electrical
properties such as PDN impedance, as well as on the system efficiency. It is observed that
fully integrated (on-die filter) VRMs provide reliable voltage in terms of regulation, but the
increased heat and lower quality of passives can reduce these advantages. The 3D VRM
with the interposer filter configuration has great potential due to its low losses and lower
temperature sensitivity. However, the cost and complexity of interposing the passives need
to be taken into consideration. The cost associated with 3D integration is a factor that
needs to be considered, and whether it can be amortized by lower processor power needs to
be evaluated. A fully 3D integrated system might be more compact, and the cost of integrating
the magnetics can be lower than using on-package filters, but it can lead to temperatures
high enough to offset the gain of the denser integration; proper heat extraction then
becomes the challenge. Table 5 summarizes the findings to help designers evaluate the
advantages and challenges of the different voltage regulator schemes.
Table 5: Summary of advantages and disadvantages of all the integration schemes
| 2D Off-Chip VRM | 3D VRM Off-Chip Filter | 3D VRM Interposer Filter | 3D VRM with On-Chip Filter |
| --- | --- | --- | --- |
| Ease of integration for both VRM and loop filters | Integration of VRM requires 3D process, but loop filters are easy to integrate | Integration of VRM requires 3D process. Integration of loop filters on package has been demonstrated but is challenging | Integration of VRM requires 3D process. Integration of loop filters on die has been demonstrated but is challenging |
| Poor regulation due to long-distance feedback point | Good regulation due to feedback closer to the processor | Better regulation due to feedback closer to the processor | Best regulation due to feedback closer to the processor and elimination of all parasitics in the feedback path |
| High PDN IR drops | Less PDN IR drops | Lesser PDN IR drops | Least PDN IR drops |
| Higher L di/dt noise | Less L di/dt noise | Lesser L di/dt noise | Least L di/dt noise |
| Higher impedance at all frequencies | Small low-frequency impedance. High-frequency impedance is limited by package and PCB parasitics in the loop filter path | Small low-frequency impedance. High-frequency impedance is limited by package parasitics in the loop filter path | Low PDN impedance over a wide band. High-frequency impedance is limited by the TSV and on-chip parasitics |
| Low-ESR inductors are easily available | Low-ESR inductors are easily available | On-package inductors with moderate ESR are challenging but available | On-die inductors have higher ESR (10s to 100s of mΩ) |
| Die-to-die thermal coupling is not critical | High power loss in PDN + VRM | Minimum power loss in the system assuming previously reported ESR of on-chip inductors | Lesser power loss in PDN + VRM (depends on the ESR of the on-chip inductor) |
| Conventional system for the majority of applications | Suitable for high-power processors that need good regulation and easier integration of passive elements | Suitable for high-power processors with tight regulation and loss constraints, subject to availability of interposer technology | Suitable for low-to-moderate power processors with tight performance constraints, subject to availability of on-chip passives technology |
| Conventional converter design | High bandwidth converter can be used at the expense of complexity | Higher bandwidth converter can be used at the expense of complexity | Very high bandwidth converter can be used resulting in excellent |
| | Silicon switches are thermally coupled and converter loss is increased | Silicon switches are thermally coupled and converter loss is increased | Thermal sensitivity is high and can limit the power density of the converter, mainly due to the increased losses in the inductor |
CHAPTER 7
PACKAGING AND SINGLE-INDUCTOR MULTIPLE-OUTPUT
(SIMO) REGULATOR CO-DESIGN
7.1 Introduction
Leading up to this chapter we have presented the design and system-level analysis of the
Single-Inductor Multiple-Output (SIMO) topology. We have also presented the electrical
and thermal advantages of having an integrated converter. As the summarizing chapter of
this thesis, we will present the packaging concerns and the framework for an electro-thermal
simulation of a SIMO converter, which is shown in Figure 64. This is an important
problem that needs to be studied if we can only afford a single inductor and wish
to have multiple voltage domains (which is common) in the processing core. We will first
determine the simulation integration framework and discuss the design choices for the
converter. Since the electrical and thermal behaviors have been explained in Chapters 3, 4, 5
and 6, we will only briefly refresh some of the modeling here and focus on presenting the
results of a thermo-electrically coupled simulation.
7.2 SIMO integration
Figure 65 shows a fully integrated SIMO voltage regulator. The purpose of using such a
scheme is to enable Dynamic Voltage-Frequency Scaling (DVFS) applications where
independent voltage domains are required while using a single on-die inductor. The literature
has shown significant improvements in the quality of on-die passives, both for
capacitors [100] and inductors [22], [26], [29]. Although Switched-Capacitor (SC) voltage
regulators have shown great promise, the voltage granularity and wide-range efficiency of
the Switched-Inductor (SI) approach make it a competitive topology. Considering that
significant improvements need to be made to improve the inductor/capacitor quality and to






Figure 64: Single-Inductor Multiple-Output converter with integrated magnetic layer FIVR
multi-domain converter such as the SIMO.
7.3 SIMO electrical model
The linearized electrical model for the SIMO was developed in Chapter 3. We will use
the power-weighted controller and contrast the differences that the packaging makes in the
output impedance of the PDN+Converter. The main contribution is to take the previously
developed distributed electrical network and splice it into a multi-domain SPICE model that
we can then load to estimate the PDN plus converter voltage droops. Here we will see
that, for the characterized system, the main contributions to voltage droops in the multiple
domains are the lower line regulation capability of the SIMO as well as an increased
cross-regulation.
7.4 SIMO power loss model
The SIMO power losses were previously evaluated in Chapter 4. Here, the power loss model
is used as a Verilog-A component that is a function of load current, voltages, passive sizes (and
their quality factors) and temperature. The thermal dependency of both the silicon and
Figure 65: 3D drawing of the SIMO providing multiple voltages to the processor die
the copper-based inductor is modeled by including a temperature coefficient with response:
$$R_{ESR,L} = R_{ESR,L,T_o}\left(1 + \alpha (T - T_o)\right) \qquad (103)$$
where $T_o$ is room temperature (27 °C for our simulations), $R_{ESR,L,T_o}$ is the equivalent series
resistance of the inductor at room temperature and $\alpha$ is the linear temperature coefficient of
the material that the inductor is made of (in our case, as in most cases, copper). Similarly,
for the power FETs we have extracted the temperature coefficient using Cadence Virtuoso and
fitted equivalent resistors that can vary during the transient simulation, i.e.
$$R_{DS,MOSFET} = R_{DS,MOSFET,T_o}\left(1 + \alpha (T - T_o)\right) \qquad (104)$$
In Figure 66 we can see that the temperature of the converter affects the efficiency negatively,
meaning that at higher temperatures the converter loss is increased for a given output
power. We want to analyze the effect of the decreased efficiency on the overall thermal
characteristics, and of the thermal characteristics on the electrical behavior. The main advantage
of the model developed is that, similarly to the one in Chapter 5, the temperature of the converter
T is a function of time, i.e. T(t). Thus the die temperature and the regulation
properties can be extracted simultaneously.
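The feedback between loss and temperature can be illustrated with a much simpler model than the one used in this work. The sketch below replaces the distributed thermal network with a single thermal resistance and capacitance, lumps all converter loss into one temperature-dependent conduction term of the form of Equation (103), and integrates T(t) with a forward-Euler step. The one-node model, the ambient temperature being equal to the 27 °C reference, and every parameter value are assumptions made only for illustration.

```python
def simulate_electrothermal(i_load_a, p_core_w, r_ref_ohm, alpha_per_c,
                            r_th_c_per_w=2.0, c_th_j_per_c=0.5,
                            temp_ref_c=27.0, dt_s=1e-3, steps=20000):
    """Forward-Euler integration of a one-node thermal RC (assumed model).

    The node is heated by the core power plus a conduction loss whose
    resistance follows R(T) = R_ref * (1 + alpha * (T - T_ref)).
    Ambient is assumed to sit at temp_ref_c.
    Returns a list of (time_s, temperature_c, converter_loss_w) samples.
    """
    temp_c = temp_ref_c
    trace = []
    for n in range(steps):
        r_hot = r_ref_ohm * (1.0 + alpha_per_c * (temp_c - temp_ref_c))
        p_loss = i_load_a ** 2 * r_hot        # temperature-dependent converter loss
        p_heat = p_core_w + p_loss            # both dies heat the same stack
        # C_th * dT/dt = P_heat - (T - T_ambient) / R_th
        d_temp = (p_heat - (temp_c - temp_ref_c) / r_th_c_per_w) / c_th_j_per_c
        temp_c += d_temp * dt_s
        trace.append(((n + 1) * dt_s, temp_c, p_loss))
    return trace


if __name__ == "__main__":
    for alpha in (0.0, 3.8e-3):
        t_end, temp_end, loss_end = simulate_electrothermal(10.0, 4.0, 0.05, alpha)[-1]
        print(f"alpha={alpha:.4f}: T={temp_end:.2f} C, converter loss={loss_end:.2f} W")
```

In this simplified picture the loop settles only while the product of load current squared, reference resistance and α stays below 1/R_th; beyond that point the added loss per degree exceeds what the thermal path can remove, which is one way to see why temperature limits the power density of a tightly coupled converter.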
Effect of temperature on efficiency
Increasing Temperature
Figure 66: Effect of the temperature on the converter efficiency
7.5 Thermoelectrical modeling framework
Figure 67 shows the closed-loop thermoelectrical system schematic. The analysis framework
couples the thermal model of the converter with that of the stacked processor, using a
temperature-, time- and power-dependent converter model from which we can extract the voltage
regulation and the system temperature. The left-hand side of the figure shows the model
developed in Chapters 3 and 4 [74] for the power-weighted CCM converter. The inputs to
this block are the core voltages and the load currents, as well as the required output voltage
references. The function of the power weighting block is to set the inductor current to a
level that satisfies the per-load requirements of the N voltage cores. The center part of the
schematic shows the model for the power stage partitioning and the per-load δ regulation.
This block contains the parasitics of the power stage and outputs both a voltage and a
power, which are functions of the load currents, the desired core voltages and the
temperature. The right-hand partition of the schematic shows the distributed thermo-electrical
models that were developed in [101] and [102]; these models correlate to 3D stacks and are
Figure 67: Block-level schematic of the simulated thermoelectrical system
7.6 Thermoelectrical modeling results
The electrical portion of the thermo-electrical simulation results is shown in Figure 68. In
this figure we have plotted the output voltages for 4 different levels: 1.2 V, 1 V, 0.8 V and
0.6 V. The transient simulation shows that the voltages are within the desired
levels. We also see that, as expected, there is a dependency of the output voltages on the
load current. This effect is the sum of the line regulation effect (for the load experiencing
the increased power consumption) and cross-regulation (for the loads with fixed value that
experience a voltage variation). Interestingly, we find that the line regulation is a function
of the total power consumption; there are three contributing factors to this effect. First,
the increased load means that the losses in the power delivery network increase, and thus the
voltage droops. However, due to the feedback in the system, the voltage at the PWM node
(left-hand side of the inductor in Figure 68) will increase to compensate for these losses;
how much error the feedback loop corrects for depends on the feedback loop gain. For the
simulations shown, the loop gain was high enough that this was not a contributing
factor. The second reason that a voltage droop might occur is that the losses in the power
delivery network and the power converter are high enough that the feedback loop
cannot increase the voltage on the PWM node enough to compensate for them (i.e. 100%
duty cycle is reached). Two scenarios can trigger a 100% duty cycle event:
Figure 68: Voltage regulation capability of SIMO thermo-electrical simulation
the first is that the load power has exceeded the output power that the converter can deliver
because of the parasitics. Again, this is similar to the effects discussed in Chapter 5, i.e.
$$V_{core} = V_{in} - R_{SIMO,parasitic+PDN}\, I_{core} \qquad (105)$$
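The duty-cycle saturation described by Equation (105) can be sketched numerically. The code below assumes an idealized buck stage with infinite loop gain: as long as the required average PWM-node voltage stays below the input voltage the output is held at its reference, and once 100% duty cycle is reached the output follows Equation (105). The function names, the input and reference voltages, and the two parasitic resistance values (standing in for a cooler and a hotter die) are hypothetical.

```python
def max_regulated_current(v_in, v_ref, r_par_ohm):
    """Largest load current for which the loop can still hold v_ref."""
    return (v_in - v_ref) / r_par_ohm


def core_voltage(v_in, v_ref, i_core_a, r_par_ohm):
    """Return (v_core, duty) for an idealized buck stage with series parasitics.

    Assumption: infinite loop gain. The loop raises the average PWM-node
    voltage to v_ref + R * I; if that exceeds v_in the duty cycle saturates
    at 100% and the output follows Vcore = Vin - R * Icore.
    """
    v_pwm_needed = v_ref + r_par_ohm * i_core_a
    if v_pwm_needed <= v_in:
        return v_ref, v_pwm_needed / v_in
    return v_in - r_par_ohm * i_core_a, 1.0


if __name__ == "__main__":
    v_in, v_ref, i_core = 1.5, 1.2, 13.0     # hypothetical operating point
    for r_par in (0.020, 0.026):             # hypothetical cool and hot parasitics
        i_max = max_regulated_current(v_in, v_ref, r_par)
        v_core, duty = core_voltage(v_in, v_ref, i_core, r_par)
        print(f"R={r_par * 1e3:.0f} mOhm: Imax={i_max:.1f} A, "
              f"Vcore={v_core:.3f} V, duty={duty:.2f}")
```

With the lower resistance the 13 A load is still regulated, while a 30% increase in parasitic resistance pushes the same load into dropout, so the output droops below its reference.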
The interesting part is that by stacking the converter, $R_{SIMO,parasitic}$ is higher at higher
load currents because of the temperature coupling (this is illustrated by Figure 66 and the
fact that $R_{SIMO,parasitic} = R_{SIMO,parasitic}(t,T)$). The maximum power density of the SIMO is
reduced because of the increased temperature caused by the processor stacking. Figure
69 shows the transient waveforms for the output power, as well as the power loss and
efficiency of the SIMO. We can see that as the load experiences transient power steps, the
loss in the SIMO increases with increasing current (conduction loss effect), but we also see
that the temperature induces an increased loss in the power stage. The interesting part of
this simulation is that the temperatures of both the processor and converter dies converge (due
to thermal coupling) to a common value set by the sum of the power consumption of the
processor plus converter core,
Load (core) power consumption
Figure 69: Power and thermal transient characteristics of the SIMO
more explicitly put:
$$T_{core} = T_{SIMO} = f\left(P_{loss,core} + P_{loss,SIMO}\right) \qquad (106)$$
The core and converter are at thermal equilibrium due to the presence of through-silicon
vias. Moreover, the power that the heat sink must dissipate is increased,
since the converter die power now also needs to be thermally dissipated. The disadvantages
of the integration are offset by a smaller footprint, decreased strain on the power C4
bumps, and reduced system power consumption due to dynamic voltage scaling algorithms.
The denser integration will also reduce the parasitics and increase the usable bandwidth of
the PDN. Figure 70 shows the frequency-domain impedance plot of the SIMO converter
plus PDN. We can see that, as expected, the impedance profile is flat due to the control
bandwidth of the SIMO and the reduced parasitics resulting from the 3D integration. The
difference, however, is that there is a gain between one output and the other, i.e.
cross-regulation. Although it is smaller than the line regulation (7 mΩ line regulation
versus 2 mΩ cross-regulation), it is not negligible and can become a significant part
Figure 70: PDN+SIMO impedance seen across Vcore,1’s terminals
margins for the circuits such that the interference with neighboring cores does not affect
functionality. Although the multi-domain SIMO converter does decrease the robustness
compared with having completely independent voltage domains, it drastically reduces the
fabrication, layout and compensation complexity. Moreover, as shown in Chapter 3, having
a larger inductor will reduce the expected ripple currents (which will help the efficiency)
and, therefore, the voltage ripple. These advantages outweigh the disadvantages of a complex
power stage design and increased cross-regulation. Moreover, the line and cross-regulation
are directly related to the errors due to finite controller loop gain and comparator delay,
meaning that techniques that reduce the comparator delay and increase the controller loop
gain will help suppress both of these disadvantages even further.
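To give a feel for how the line and cross-regulation impedances quoted above translate into supply margin, the sketch below adds the droop caused by a rail's own load step to the droop coupled in from its neighbors. It treats the flat 7 mΩ and 2 mΩ values as quasi-static resistances, assumes the same cross-regulation impedance toward every neighboring rail, and assumes that all load steps coincide (a worst case). The step sizes and the function name are hypothetical.

```python
def droop_mv(z_self_mohm, z_cross_mohm, di_self_a, di_neighbors_a):
    """Quasi-static droop on one rail in mV (mOhm * A = mV).

    Assumptions: flat impedance, identical cross-regulation impedance
    toward every neighbor, and simultaneous load steps on all rails.
    """
    return z_self_mohm * di_self_a + z_cross_mohm * sum(di_neighbors_a)


if __name__ == "__main__":
    # Four voltage domains: one rail plus three neighbors, 1 A step on each.
    own_only = droop_mv(7.0, 2.0, 1.0, [0.0, 0.0, 0.0])
    all_step = droop_mv(7.0, 2.0, 1.0, [1.0, 1.0, 1.0])
    print(f"own step only:      {own_only:.1f} mV")
    print(f"all rails stepping: {all_step:.1f} mV")
```

Under these assumptions three neighboring rails stepping together contribute almost as much droop as the rail's own load step, which is why the cross-regulation term has to be included when setting the margins.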
7.7 Conclusions
Although a 3D packaged SIMO has drawbacks similar to those of its single-output counterpart,
multi-voltage domain regulators are a necessary evil. With increasing leakage power
consumption, the reduction of power in standby states is critical, and the most effective way of
reducing leakage power consumption is by scaling down the voltage supplies. In systems
with multiple voltage rails we can therefore expect reduced power consumption. Moreover,
on-die VRs are important when delivering very large currents to cores, which cannot
be handled by typical C4 bumps; reducing the input current drawn from the PCB can
thus reduce the number of bumps used to provide power. Although we have found
that the efficiency of the system is highly dependent on the quality of the passives, the SIMO
has proven that, if designed properly, it can provide reliable voltage regulation at high efficiency




Power conversion has historically been treated as a separate block that is designed without
knowledge of load patterns, parasitics, core constraints (such as voltage scaling speeds and
line regulation), and efficiency including electrical parasitics. This needs to change: with new
features becoming standard for power converters (such as voltage supply scaling and
light-load efficiency) and with tighter integration, co-design with information pertaining to
power patterns, thermal behavior and integration schemes is not a luxury but a necessity. As
state-of-the-art processors more frequently use fully-integrated voltage regulators, it is
important for regulator designers to understand the requirements of the load so that they
can optimize their design accordingly, knowing the trade-offs between topologies. In this
work we have presented the analysis, design and fabrication of multi-supply voltage regulators
and introduced a novel co-design in terms of multi-output voltage regulator controller,
power stage, packaging and thermoelectrical co-simulation.
8.1 Summary of key contributions
The key contributions of this thesis can be enumerated as:
1. Comparison between single and multiple inductor topologies with multi-voltage domain
outputs. The findings serve as a motivation to explore the Single-Inductor
Multiple-Output topology and control schemes to manage cross-regulation.
2. Introduced a novel control method for the SIMO based on a power-weighting function.
The fundamentals of the power weighting control method were modeled and a
test chip was designed, fabricated and measured.
3. Created a novel floating capacitor scheme for improving the conversion efficiency
of the SIMO. This scheme was tested as part of the fabricated test chip and the advantages
were measured and characterized. Moreover, the effect of the coupling capacitors
was modeled and analyzed, and the resulting cross-regulation was suppressed through the
control and circuit techniques used.
4. Established a framework for evaluating 3D packaged converters. The framework was
used to evaluate both a single-domain converter and the SIMO. The framework
combines the electrical and thermal property models of the processor and
converter dies and uses them to estimate system efficiency, temperature and electrical
characteristics. The results show electrical sensitivity and maximum power limiting
due to the temperature increase in tightly coupled converters.
8.2 Future research
The SIMO shows great promise in terms of regulation capabilities as well as multi-domain
conversion. The proposed control techniques and circuits offer a scalable approach toward
controlling the inductor current in multi-core application spaces. Due to limitations in terms
of process technologies, however, a fully integrated system was not actually built, meaning
that fabrication/integration challenges such as inductor placement and isolation and on-die
decoupling capacitor placement were not considered. In systems with high power density
the placement can significantly affect circuit parasitics. Another branch of the research has
to do with component quality. Although outside the scope of this thesis, the
passive density and parasitics can significantly increase the power loss in power conversion;
thus, although the framework developed is still applicable, the obtained results can vary with
component quality. The SIMO has yet to be used in a fully integrated system and,
from the author’s point of view, it has the potential to become a widely used scheme.
The thermoelectrical framework developed shows some critical sensitivity parameters
due to the tight coupling. As converters become more tightly integrated with processor
cores, the thermal feedback changes the electrical properties of the converter. The
framework discussed in this thesis will aid designers in understanding both the positive
effects of integration (reduced parasitics lead to higher usable bandwidth and fewer de-cap
constraints) as well as its negative effects (thermal coupling, increased cost and complexity).
The last chapter of the thesis also models the feasibility of a multi-core integrated




[1] A. Munir, S. Ranka, and A. Gordon-Ross, “High-performance energy-efficient mul-
ticore embedded computing,” Parallel and Distributed Systems, IEEE Transactions
on, vol. 23, pp. 684–700, April 2012.
[2] T. Skotnicki, C. Fenouillet-Beranger, C. Gallon, F. Buf, S. Monfray, F. Payet,
A. Pouydebasque, M. Szczap, A. Farcy, F. Arnaud, S. Clerc, M. Sellier, A. Cathig-
nol, J.-P. Schoellkopf, E. Perea, R. Ferrant, and H. Mingam, “Innovative materials,
devices, and cmos technologies for low-power mobile multimedia,” Electron De-
vices, IEEE Transactions on, vol. 55, pp. 96–130, Jan 2008.
[3] A. Amerasekera, “Ultra low power electronics in the next decade,” in Low-Power
Electronics and Design (ISLPED), 2010 ACM/IEEE International Symposium on,
pp. 237–237, Aug 2010.
[4] C. Limonard, “The future of power management in the mobile computing market.”
[5] T. Market, “Power management ic market is expected to reach usd 46 billion globally
in 2019: Transparency market research.”
[6] A. Arakali, S. Gondi, and P. Hanumolu, “Analysis and design techniques for supply-
noise mitigation in phase-locked loops,” Circuits and Systems I: Regular Papers,
IEEE Transactions on, vol. 57, pp. 2880–2889, Nov 2010.
[7] E. Seevinck, F. List, and J. Lohstroh, “Static-noise margin analysis of mos sram
cells,” Solid-State Circuits, IEEE Journal of, vol. 22, pp. 748–754, Oct 1987.
[8] A. Hajimiri, S. Limotyrakis, and T. Lee, “Jitter and phase noise in ring oscillators,”
Solid-State Circuits, IEEE Journal of, vol. 34, pp. 790–804, Jun 1999.
[9] A. Shye, B. Scholbrock, G. Memik, and P. A. Dinda, “Characterizing and modeling
user activity on smartphones: Summary,” SIGMETRICS Perform. Eval. Rev., vol. 38,
pp. 375–376, June 2010.
[10] H. K. Cho and S. Mahlke, “Embracing heterogeneity with dynamic core boosting,”
in Proceedings of the 11th ACM Conference on Computing Frontiers, CF ’14, (New
York, NY, USA), pp. 10:1–10:10, ACM, 2014.
[11] W. Lee, Y. Wang, D. Shin, N. Chang, and M. Pedram, “Optimizing the power de-
livery network in a smartphone platform,” Computer-Aided Design of Integrated
Circuits and Systems, IEEE Transactions on, vol. 33, pp. 36–49, Jan 2014.
[12] B. Flipsen, J. Geraedts, A. Reinders, C. Bakker, I. Dafnomilis, and A. Gudadhe,
“Environmental sizing of smartphone batteries,” in Electronics Goes Green 2012+
(EGG), 2012, pp. 1–9, Sept 2012.
[13] V. De and S. Borkar, “Technology and design challenges for low power and high
performance [microprocessors],” in Low Power Electronics and Design, 1999. Pro-
ceedings. 1999 International Symposium on, pp. 163–168, Aug 1999.
[14] S. Yang, “High performance logic technology-scaling trend and future challenges,”
in Solid-State and Integrated-Circuit Technology, 2001. Proceedings. 6th Interna-
tional Conference on, vol. 1, pp. 62–67 vol.1, 2001.
[15] D. Frank, R. Dennard, E. Nowak, P. Solomon, Y. Taur, and H.-S. P. Wong, “Device
scaling limits of si mosfets and their application dependencies,” Proceedings of the
IEEE, vol. 89, pp. 259–288, Mar 2001.
[16] R. Dennard, V. Rideout, E. Bassous, and A. LeBlanc, “Design of ion-implanted
mosfet’s with very small physical dimensions,” Solid-State Circuits, IEEE Journal
of, vol. 9, pp. 256–268, Oct 1974.
[17] S. Narenda, L. Fujino, and K. Smith, “Through the looking glass continued (iii):
Update to trends in solid-state circuits and systems from isscc 2014 [isscc trends],”
Solid-State Circuits Magazine, IEEE, vol. 6, pp. 49–53, winter 2014.
[18] S. Carlo, W. Yueh, and S. Mukhopadhyay, “On the potential of 3d integration of
inductive dc-dc converter for high-performance power delivery,” in Proceedings of
the 50th Annual Design Automation Conference, DAC ’13, (New York, NY, USA),
pp. 179:1–179:8, ACM, 2013.
[19] R. Beica, “Flip chip market and technology trends,” in Microelectronics Packaging
Conference (EMPC) , 2013 European, pp. 1–4, Sept 2013.
[20] H. Fuketa, Y. Shinozuka, K. Ishida, M. Takamiya, and T. Sakurai, “On-chip buck
converter with spiral ferrite inductor and reducing ir drop in 3d stacked integration,”
in Power Electronics Conference (IPEC-Hiroshima 2014 - ECCE-ASIA), 2014 In-
ternational, pp. 2228–2231, May 2014.
[21] J. Sun, J.-Q. Lu, D. Giuliano, T. Chow, and R. Gutmann, “3d power delivery for
microprocessors and high-performance asics,” in Applied Power Electronics Con-
ference, APEC 2007 - Twenty Second Annual IEEE, pp. 127–133, Feb 2007.
[22] P. Hazucha, G. Schrom, J. Hahn, B. Bloechel, P. Hack, G. Dermer, S. Narendra,
D. Gardner, T. Karnik, V. De, and S. Borkar, “A 233-mhz 80%-87% efficient four-phase dc-dc
converter utilizing air-core inductors on package,” Solid-State Circuits, IEEE Journal
of, vol. 40, pp. 838–845, April 2005.
[23] N. Sturcken, E. O’Sullivan, N. Wang, P. Herget, B. Webb, L. Romankiw, M. Pe-
tracca, R. Davies, R. Fontana, G. Decad, I. Kymissis, A. Peterchev, L. Carloni,
W. Gallagher, and K. Shepard, “A 2.5d integrated voltage regulator using coupled-
magnetic-core inductors on silicon interposer,” Solid-State Circuits, IEEE Journal
of, vol. 48, pp. 244–254, Jan 2013.
[24] W. Lambert, M. Hill, K. Radhakrishnan, L. Wojewoda, and A. Augustine, “Package
embedded inductors for integrated voltage regulators,” in Electronic Components
and Technology Conference (ECTC), 2014 IEEE 64th, pp. 528–534, May 2014.
[25] J. Knickerbocker, P. Andry, E. Colgan, B. Dang, T. Dickson, X. Gu, C. Haymes,
C. Jahnes, Y. Liu, J. Maria, R. Polastre, C. Tsang, L. Turlapati, B. Webb, L. Wiggins,
and S. Wright, “2.5d and 3d technology challenges and test vehicle demonstrations,”
in Electronic Components and Technology Conference (ECTC), 2012 IEEE 62nd,
pp. 1068–1076, May 2012.
[26] D. Gardner, G. Schrom, F. Paillet, B. Jamieson, T. Karnik, and S. Borkar, “Review
of on-chip inductor structures with magnetic films,” Magnetics, IEEE Transactions
on, vol. 45, pp. 4760–4766, Oct 2009.
[27] G. Villar-Pique, H. Bergveld, and E. Alarcon, “Survey and benchmark of fully inte-
grated switching power converters: Switched-capacitor versus inductive approach,”
Power Electronics, IEEE Transactions on, vol. 28, pp. 4156–4167, Sept 2013.
[28] M. Seeman and S. Sanders, “Analysis and optimization of switched-capacitor dc-dc
converters,” Power Electronics, IEEE Transactions on, vol. 23, pp. 841–851, March
2008.
[29] E. Burton, G. Schrom, F. Paillet, J. Douglas, W. Lambert, K. Radhakrishnan, and
M. Hill, “Fivr - fully integrated voltage regulators on 4th generation intel®
core™ socs,” in Applied Power Electronics Conference and Exposition (APEC),
2014 Twenty-Ninth Annual IEEE, pp. 432–439, March 2014.
[30] R. Jain and S. Sanders, “A 200ma switched capacitor voltage regulator on 32nm
cmos and regulation schemes to enable dvfs,” in Power Electronics and Applications
(EPE 2011), Proceedings of the 2011-14th European Conference on, pp. 1–10, Aug
2011.
[31] R.-S. Chang, J. Gao, V. Gruhn, J. He, G. Roussos, and W.-T. Tsai, “Mobile cloud
computing research - issues, challenges and needs,” in Service Oriented System En-
gineering (SOSE), 2013 IEEE 7th International Symposium on, pp. 442–453, March
2013.
[32] D. Kim, S.-W. Liao, P. Wang, J. del Cuvillo, X. Tian, X. Zou, P. Wang, D. Yeung,
M. Girkar, and J. Shen, “Physical experimentation with prefetching helper threads
on intel’s hyper-threaded processors,” in Code Generation and Optimization, 2004.
CGO 2004. International Symposium on, pp. 27–38, March 2004.
[33] J. Lu, A. Das, W.-C. Hsu, K. Nguyen, and S. Abraham, “Dynamic helper threaded
prefetching on the sun ultrasparc® cmp processor,” in Microarchitecture, 2005.
MICRO-38. Proceedings. 38th Annual IEEE/ACM International Symposium on,
pp. 12 pp.–, Nov 2005.
[34] A. Noureddine, A. Bourdon, R. Rouvoy, and L. Seinturier, “Runtime monitoring
of software energy hotspots,” in Automated Software Engineering (ASE), 2012 Pro-
ceedings of the 27th IEEE/ACM International Conference on, pp. 160–169, Sept
2012.
[35] Z. Zhou, J. Gu, and G. Qu, “Scheduling for multi-core processor under process
and temperature variation,” in Embedded Multicore Socs (MCSoC), 2012 IEEE 6th
International Symposium on, pp. 113–120, Sept 2012.
[36] H. Hamann, A. Weger, J. Lacey, Z. Hu, P. Bose, E. Cohen, and J. Wakil, “Hotspot-
limited microprocessors: Direct temperature and power distribution measurements,”
Solid-State Circuits, IEEE Journal of, vol. 42, pp. 56–65, Jan 2007.
[37] G. Kornaros and D. Pnevmatikatos, “Dynamic power and thermal management
of noc-based heterogeneous mpsocs,” ACM Trans. Reconfigurable Technol. Syst.,
vol. 7, pp. 1:1–1:26, Feb. 2014.
[38] S. Eyerman and L. Eeckhout, “Fine-grained dvfs using on-chip regulators,” ACM
Trans. Archit. Code Optim., vol. 8, pp. 1:1–1:24, Feb. 2011.
[39] M. L. Mui, K. Banerjee, and A. Mehrotra, “Supply and power optimization in
leakage-dominant technologies,” Computer-Aided Design of Integrated Circuits and
Systems, IEEE Transactions on, vol. 24, pp. 1362–1371, Sept 2005.
[40] D. Ma and R. Bondade, “Enabling power-efficient dvfs operations on silicon,”
Circuits and Systems Magazine, IEEE, vol. 10, pp. 14–30, First Quarter 2010.
[41] W. Kim, M. Gupta, G.-Y. Wei, and D. Brooks, “System level analysis of fast, per-
core dvfs using on-chip switching regulators,” in High Performance Computer Ar-
chitecture, 2008. HPCA 2008. IEEE 14th International Symposium on, pp. 123–134,
Feb 2008.
[42] S. Eyerman and L. Eeckhout, “Fine-grained dvfs using on-chip regulators,” ACM
Trans. Archit. Code Optim., vol. 8, pp. 1:1–1:24, Feb. 2011.
[43] K. K. Rangan, G.-Y. Wei, and D. Brooks, “Thread motion: Fine-grained power man-
agement for multi-core systems,” SIGARCH Comput. Archit. News, vol. 37, pp. 302–
313, June 2009.
[44] C.-Y. Tseng, L.-W. Wang, and P.-C. Huang, “An integrated linear regulator with
fast output voltage transition for dual-supply srams in dvfs systems,” Solid-State
Circuits, IEEE Journal of, vol. 45, pp. 2239–2249, Nov 2010.
[45] N. Kurd, M. Chowdhury, E. Burton, T. Thomas, C. Mozak, B. Boswell, M. Lal,
A. Deval, J. Douglas, M. Elassal, A. Nalamalpu, T. Wilson, M. Merten, S. Chen-
nupaty, W. Gomes, and R. Kumar, “5.9 haswell: A family of ia 22nm processors,”
in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE
International, pp. 112–113, Feb 2014.
[46] R. Jain, S. T. Kim, V. Vaidya, K. Ravichandran, J. W. Tschanz, and V. De, “Conduc-
tance modulation techniques in switched-capacitor DC-DC converter for maximum-
efficiency tracking and ripple mitigation in 22 nm tri-gate CMOS,” J. Solid-State
Circuits, vol. 50, no. 8, pp. 1809–1819, 2015.
[47] R. Jain, B. M. Geuskens, S. T. Kim, M. M. Khellah, J. Kulkarni, J. W. Tschanz, and
V. De, “A 0.45-1 V fully-integrated distributed switched capacitor DC-DC converter
with high density MIM capacitor in 22 nm tri-gate CMOS,” J. Solid-State Circuits,
vol. 49, no. 4, pp. 917–927, 2014.
[48] S. T. Kim, Y. Shih, K. Mazumdar, R. Jain, J. F. Ryan, C. Tokunaga, C. Augustine,
J. P. Kulkarni, K. Ravichandran, J. W. Tschanz, M. M. Khellah, and V. De, “8.6
enabling wide autonomous DVFS in a 22nm graphics execution core using a dig-
itally controlled hybrid ldo/switched-capacitor VR with fast droop mitigation,” in
2015 IEEE International Solid-State Circuits Conference, ISSCC 2015, Digest of
Technical Papers, San Francisco, CA, USA, February 22-26, 2015, pp. 1–3, 2015.
[49] A. Lukefahr, S. Padmanabha, R. Das, R. Dreslinski, Jr., T. F. Wenisch, and
S. Mahlke, “Heterogeneous microarchitectures trump voltage scaling for low-power
cores,” in Proceedings of the 23rd International Conference on Parallel Architec-
tures and Compilation, PACT ’14, (New York, NY, USA), pp. 237–250, ACM, 2014.
[50] D. Kwon and G. Rincon-Mora, “Single-inductor multiple-output switching dc-dc
converters,” Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 56,
pp. 614–618, Aug 2009.
[51] D. Trevisan, P. Mattavelli, and P. Tenti, “Digital control of single-inductor
multiple-output step-down dc-dc converters in ccm,” Industrial Electronics, IEEE
Transactions on, vol. 55, pp. 3476–3483, Sept 2008.
[52] D. Ma, W.-H. Ki, and C.-Y. Tsui, “A pseudo-ccm/dcm simo switching converter with
freewheel switching,” in Solid-State Circuits Conference, 2002. Digest of Technical
Papers. ISSCC. 2002 IEEE International, vol. 1, pp. 390–476 vol.1, Feb 2002.
[53] Y. Zhang and D. Ma, “A fast-response hybrid simo power converter with adaptive
current compensation and minimized cross-regulation,” Solid-State Circuits, IEEE
Journal of, vol. 49, pp. 1242–1255, May 2014.
[54] H.-P. Le, C.-S. Chae, K.-C. Lee, S.-W. Wang, G.-H. Cho, and G.-H. Cho, “A
single-inductor switching dc-dc converter with five outputs and ordered power-distributive
control,” Solid-State Circuits, IEEE Journal of, vol. 42, pp. 2706–2714, Dec 2007.
[55] E. Bayer and G. Thiele, “A single-inductor multiple-output converter with peak cur-
rent state-machine control,” in Applied Power Electronics Conference and Exposi-
tion, 2006. APEC ’06. Twenty-First Annual IEEE, pp. 7 pp.–, March 2006.
[56] M.-H. Huang and K.-H. Chen, “Single-inductor multi-output (simo) dc-dc convert-
ers with high light-load efficiency and minimized cross-regulation for portable de-
vices,” Solid-State Circuits, IEEE Journal of, vol. 44, pp. 1099–1111, April 2009.
[57] P. Patra, A. Patra, and N. Misra, “A single-inductor multiple-output switcher with
simultaneous buck, boost, and inverted outputs,” Power Electronics, IEEE Transac-
tions on, vol. 27, pp. 1936–1951, April 2012.
[58] D. Lu, Y. Qian, and Z. Hong, “4.3 an 87%-peak-efficiency dvs-capable
single-inductor 4-output dc-dc buck converter with ripple-based adaptive off-time
control,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC),
2014 IEEE International, pp. 82–83, Feb 2014.
[59] D. Ma, W.-H. Ki, C.-Y. Tsui, and P. Mok, “Single-inductor multiple-output switch-
ing converters with time-multiplexing control in discontinuous conduction mode,”
Solid-State Circuits, IEEE Journal of, vol. 38, pp. 89–100, Jan 2003.
[60] C.-S. Chae, H.-P. Le, K.-C. Lee, M.-C. Lee, G.-H. Cho, and G.-H. Cho, “A single-
inductor step-up dc-dc switching converter with bipolar outputs for active matrix
oled mobile display panels,” in Solid-State Circuits Conference, 2007. ISSCC 2007.
Digest of Technical Papers. IEEE International, pp. 136–592, Feb 2007.
[61] Y.-J. Woo, H.-P. Le, G.-H. Cho, G.-H. Cho, and S.-I. Kim, “Load-independent con-
trol of switching dc-dc converters with freewheeling current feedback,” in Solid-
State Circuits Conference, 2008. ISSCC 2008. Digest of Technical Papers. IEEE
International, pp. 446–626, Feb 2008.
[62] C.-W. Kuan and H.-C. Lin, “Near-independently regulated 5-output single-inductor
dc-dc buck converter delivering 1.2w/mm2 in 65nm cmos,” in Solid-State Circuits
Conference Digest of Technical Papers (ISSCC), 2012 IEEE International, pp. 274–
276, Feb 2012.
[63] K.-C. Lee, C.-S. Chae, G.-H. Cho, and G.-H. Cho, “A pll-based high-stability
single-inductor 6-channel output dc-dc buck converter,” in Solid-State Circuits Con-
ference Digest of Technical Papers (ISSCC), 2010 IEEE International, pp. 200–201,
Feb 2010.
[64] M. Belloni, E. Bonizzoni, E. Kiseliovas, P. Malcovati, F. Maloberti, T. Peltola,
and T. Teppo, “A 4-output single-inductor dc-dc buck converter with self-boosted
switch drivers and 1.2a total output current,” in Solid-State Circuits Conference,
2008. ISSCC 2008. Digest of Technical Papers. IEEE International, pp. 444–626,
Feb 2008.
[65] P. Macken, M. Degrauwe, M. Van Paemel, and H. Oguey, “A voltage reduction
technique for digital systems,” in Solid-State Circuits Conference, 1990. Digest of
Technical Papers. 37th ISSCC., 1990 IEEE International, pp. 238–239, Feb 1990.
[66] M. Kumar, Y. Tan, and J. Sin, “Excellent cross-talk isolation, high-q inductors, and
reduced self-heating in a tfsoi technology for system-on-a-chip applications,” Elec-
tron Devices, IEEE Transactions on, vol. 49, pp. 584–589, Apr 2002.
[67] H.-J. Chen, Y.-H. Wang, P.-C. Huang, and T.-H. Kuo, “20.9 an energy-recycling
three-switch single-inductor dual-input buck/boost dc-dc converter with 93% peak
conversion efficiency and 0.5mm2 active area for light energy harvesting,” in
Solid-State Circuits Conference - (ISSCC), 2015 IEEE International, pp. 1–3, Feb 2015.
[68] X. Jing and P. Mok, “Power loss and switching noise reduction techniques for single-
inductor multiple-output regulator,” Circuits and Systems I: Regular Papers, IEEE
Transactions on, vol. 60, pp. 2788–2798, Oct 2013.
[69] Z. Shen, X. Chang, W. Wang, X. Tan, N. Yan, and H. Min, “Predictive digital current
control of single-inductor multiple-output converters in ccm with low cross regula-
tion,” Power Electronics, IEEE Transactions on, vol. 27, pp. 1917–1925, April 2012.
[70] J. Chen, R. Erickson, and D. Maksimovic, “Averaged switch modeling of bound-
ary conduction mode dc-to-dc converters,” in Industrial Electronics Society, 2001.
IECON ’01. The 27th Annual Conference of the IEEE, vol. 2, pp. 844–849 vol.2,
2001.
[71] S. Vangal, S. Jain, and V. De, “A solar-powered 280mv-to-1.2v wide-operating-range
ia-32 processor,” in IC Design Technology (ICICDT), 2014 IEEE International Con-
ference on, pp. 1–4, May 2014.
[72] Y. Pu, J. Pineda de Gyvez, H. Corporaal, and Y. Ha, “An ultra-low-energy multi-
standard jpeg co-processor in 65 nm cmos with sub/near threshold supply voltage,”
Solid-State Circuits, IEEE Journal of, vol. 45, pp. 668–680, March 2010.
[73] P. Herget, N. Wang, E. O’Sullivan, B. Webb, L. Romankiw, R. Fontana, X. Hu,
G. Decad, and W. Gallagher, “A study of current density limits due to saturation
in thin film magnetic inductors for on-chip power conversion,” Magnetics, IEEE
Transactions on, vol. 48, pp. 4119–4122, Nov 2012.
[74] S. Carlo and S. Mukhopadhyay, “A high power density, dynamic voltage scaling
compatible, single-inductor four-output regulator using a power-weighted ccm con-
troller and a floating capacitor-based output filter,” Power Electronics, IEEE Trans-
actions on, vol. PP, no. 99, pp. 1–1, 2015.
[75] S. Dietrich, R. Wunderlich, and S. Heinen, “Stability considerations of hysteretic
controlled dc-dc converters,” in Ph.D. Research in Microelectronics and Electronics
(PRIME), 2012 8th Conference on, pp. 1–4, June 2012.
[76] L. Cheng, Y. Liu, and W.-H. Ki, “A 10/30 mhz fast reference-tracking buck converter
with dda-based type-iii compensator,” Solid-State Circuits, IEEE Journal of, vol. 49,
pp. 2788–2799, Dec 2014.
[77] Y. Ren, M. Xu, J. Zhou, and F. Lee, “Analytical loss model of power mosfet,” Power
Electronics, IEEE Transactions on, vol. 21, pp. 310–319, March 2006.
[78] Y. Zhang and D. Ma, “An output-capacitor-free buck converter with nano-second-
dvs and sub-switching-cycle load transient response,” in Solid-State and Integrated
Circuit Technology (ICSICT), 2012 IEEE 11th International Conference on, pp. 1–4,
Oct 2012.
[79] H.-P. Le, J. Crossley, S. Sanders, and E. Alon, “A sub-ns response fully integrated
battery-connected switched-capacitor voltage regulator delivering 0.19w/mm2 at
73% efficiency,” in Solid-State Circuits Conference Digest of Technical Papers
(ISSCC), 2013 IEEE International, pp. 372–373, Feb 2013.
[80] S. Govindan and S. Venkataraman, “Silicon-package power delivery co-simulation
with fully integrated voltage regulators on microprocessors,” in Electrical Design of
Advanced Packaging Systems Symposium (EDAPS), 2014 IEEE, pp. 113–116, Dec
2014.
[81] W. Lambert, R. Ayyanar, and S. Chickamenahalli, “Fast load transient regulation of
low-voltage converters with the low-voltage transient processor,” Power Electronics,
IEEE Transactions on, vol. 24, pp. 1839–1854, July 2009.
[82] G. Huang, M. Bakir, A. Naeemi, and J. Meindl, “Power delivery for 3-d chip stacks:
Physical modeling and design implication,” Components, Packaging and Manufac-
turing Technology, IEEE Transactions on, vol. 2, pp. 852–859, May 2012.
[83] M. Gildersleeve, H. Forghani-zadeh, and G. Rincon-Mora, “A comprehensive power
analysis and a highly efficient, mode-hopping dc-dc converter,” in ASIC, 2002. Pro-
ceedings. 2002 IEEE Asia-Pacific Conference on, pp. 153–156, 2002.
[84] G. Sizikov, A. Kolodny, E. Fridman, and M. Zelikson, “Frequency dependent
efficiency model of on-chip dc-dc buck converters,” in Electrical and Electronics
Engineers in Israel (IEEEI), 2010 IEEE 26th Convention of, pp. 651–654, Nov 2010.
[85] N. Sturcken, R. Davies, C. Cheng, W. E. Bailey, and K. Shepard, “Design of coupled
power inductors with crossed anisotropy magnetic core for integrated power con-
version,” in Applied Power Electronics Conference and Exposition (APEC), 2012
Twenty-Seventh Annual IEEE, pp. 417–423, Feb 2012.
[86] X. Zhang and A. Huang, “Monolithic/modularized voltage regulator channel,”
Power Electronics, IEEE Transactions on, vol. 22, pp. 1162–1176, July 2007.
[87] Q. Zhao and G. Stojcic, “Characterization of cdv/dt induced power loss in syn-
chronous buck dc-dc converters,” in Applied Power Electronics Conference and Ex-
position, 2004. APEC ’04. Nineteenth Annual IEEE, vol. 1, pp. 292–297 Vol.1, 2004.
[88] P. Li, D. Bhatia, L. Xue, and R. Bashirullah, “A 90-240 mhz hysteretic controlled
dc-dc buck converter with digital phase locked loop synchronization,” Solid-State
Circuits, IEEE Journal of, vol. 46, pp. 2108–2119, Sept 2011.
[89] M. Du, H. Lee, and J. Liu, “A 5-mhz 91% peak-power-efficiency buck regulator
with auto-selectable peak- and valley-current control,” Solid-State Circuits, IEEE
Journal of, vol. 46, pp. 1928–1939, Aug 2011.
[90] H.-W. Huang, K.-H. Chen, and S.-Y. Kuo, “Dithering skip modulation, width and
dead time controllers in highly efficient dc-dc converters for system-on-chip appli-
cations,” Solid-State Circuits, IEEE Journal of, vol. 42, pp. 2451–2465, Nov 2007.
[91] K. Onizuka, K. Inagaki, H. Kawaguchi, M. Takamiya, and T. Sakurai, “Stacked-
chip implementation of on-chip buck converter for distributed power supply system
in sips,” Solid-State Circuits, IEEE Journal of, vol. 42, pp. 2404–2410, Nov 2007.
[92] G. Schrom, P. Hazucha, J.-H. Hahn, V. Kursun, D. Gardner, S. Narendra, T. Karnik,
and V. De, “Feasibility of monolithic and 3d-stacked dc-dc converters for micro-
processors in 90nm technology generation,” in Low Power Electronics and Design,
2004. ISLPED ’04. Proceedings of the 2004 International Symposium on, pp. 263–
268, Aug 2004.
[93] J. Sun, D. Giuliano, S. Devarajan, J.-Q. Lu, T. Chow, and R. Gutmann, “Fully mono-
lithic cellular buck converter design for 3-d power delivery,” Very Large Scale Inte-
gration (VLSI) Systems, IEEE Transactions on, vol. 17, pp. 447–451, March 2009.
[94] N. Wang, E. O’Sullivan, P. Herget, B. Rajendran, L. Krupp, L. Romankiw, B. Webb,
R. Fontana, E. Duch, E. Joseph, S. Brown, H. Xiaolin, G. Decad, N. Sturcken,
K. Shepard, and W. Gallagher, “Integrated on-chip inductors with electroplated
magnetic yokes (invited),” Journal of Applied Physics, vol. 111, no. 7, p. (6 pp.), 2012.
[95] M. Pathak and S. K. Lim, “Performance and thermal-aware steiner routing for 3-
d stacked ics,” Computer-Aided Design of Integrated Circuits and Systems, IEEE
Transactions on, vol. 28, pp. 1373–1386, Sept 2009.
[96] S. Samal, K. Samadi, P. Kamal, Y. Du, and S. K. Lim, “Full chip impact study of
power delivery network designs in monolithic 3d ics,” in Computer-Aided Design
(ICCAD), 2014 IEEE/ACM International Conference on, pp. 565–572, Nov 2014.
[97] Y. Peng, B. W. Ku, Y. Park, K.-I. Park, S.-J. Jang, J. S. Choi, and S. K. Lim, “De-
sign, packaging, and architectural policy co-optimization for dc power integrity in
3d dram,” in Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE,
pp. 1–6, June 2015.
[98] M. Gupta, J. Oatley, R. Joseph, G.-Y. Wei, and D. Brooks, “Understanding voltage
variations in chip multiprocessors using a distributed power-delivery network,” in
Design, Automation & Test in Europe Conference & Exhibition, 2007. DATE ’07, pp. 1–
6, April 2007.
[99] W. Yueh, S. Chatterjee, A. Trivedi, and S. Mukhopadhyay, “Performance and robust-
ness of 3-d integrated sram considering tier-to-tier thermal and supply crosstalk,”
Components, Packaging and Manufacturing Technology, IEEE Transactions on,
vol. 3, pp. 943–953, June 2013.
[100] S. Sanders, E. Alon, H.-P. Le, M. Seeman, M. John, and V. Ng, “The road to fully in-
tegrated dc-dc conversion via the switched-capacitor approach,” Power Electronics,
IEEE Transactions on, vol. 28, pp. 4146–4155, Sept 2013.
[101] W. Yueh, S. Chatterjee, A. Trivedi, and S. Mukhopadhyay, “On the parametric fail-
ures of sram in a 3d-die stack considering tier-to-tier supply cross-talk,” in VLSI Test
Symposium (VTS), 2012 IEEE 30th, pp. 264–269, April 2012.
[102] S. Chatterjee, M. Cho, R. Rao, and S. Mukhopadhyay, “Impact of die-to-die thermal
coupling on the electrical characteristics of 3d stacked sram cache,” in Semiconduc-
tor Thermal Measurement and Management Symposium (SEMI-THERM), 2012 28th
Annual IEEE, pp. 14–19, March 2012.
