A Charge-Recycling Scheme and Ultra Low Voltage Self-Startup Charge Pump for Highly Energy Efficient Mixed Signal Systems-On-A-Chip by Ulaganathan, Chandradevi
University of Tennessee, Knoxville 
TRACE: Tennessee Research and Creative Exchange
Doctoral Dissertations Graduate School 
12-2012 
Follow this and additional works at: https://trace.tennessee.edu/utk_graddiss 
 Part of the Electrical and Electronics Commons, and the VLSI and Circuits, Embedded and Hardware 
Systems Commons 
Recommended Citation 
Ulaganathan, Chandradevi, "A Charge-Recycling Scheme and Ultra Low Voltage Self-Startup Charge Pump
for Highly Energy Efficient Mixed Signal Systems-On-A-Chip." PhD diss., University of Tennessee, 2012.
https://trace.tennessee.edu/utk_graddiss/1593
This Dissertation is brought to you for free and open access by the Graduate School at TRACE: Tennessee 
Research and Creative Exchange. It has been accepted for inclusion in Doctoral Dissertations by an authorized 
administrator of TRACE: Tennessee Research and Creative Exchange. For more information, please contact 
trace@utk.edu. 
To the Graduate Council: 
I am submitting herewith a dissertation written by Chandradevi Ulaganathan entitled "A Charge-
Recycling Scheme and Ultra Low Voltage Self-Startup Charge Pump for Highly Energy Efficient 
Mixed Signal Systems-On-A-Chip." I have examined the final electronic copy of this dissertation 
for form and content and recommend that it be accepted in partial fulfillment of the 
requirements for the degree of Doctor of Philosophy, with a major in Electrical Engineering. 
Benjamin J. Blalock, Major Professor 
We have read this dissertation and recommend its acceptance: 
Charles L. Britton, Jeremy Holleman, Xiaobing Feng 
Accepted for the Council: 
Carolyn R. Hodges 
Vice Provost and Dean of the Graduate School 
(Original signatures are on file with official student records.) 
A Charge-Recycling Scheme and
Ultra Low Voltage Self-Startup Charge Pump for
Highly Energy Efficient Mixed Signal Systems-On-A-Chip

A Dissertation
Presented for the
Doctor of Philosophy Degree

Copyright © 2012 by Chandradevi Ulaganathan 















I would like to express my sincere gratitude to my advisor Dr. Benjamin J. Blalock
whose constant support, guidance and patience helped me in completing this dissertation. I'm
thankful to him for providing me with employment at the Integrated Circuits and Systems
Laboratory (ICASL) for the past 6 years. Dr. Blalock's tireless efforts to secure funding for
cutting-edge research are very much appreciated. I also thank him for fostering a collaborative
atmosphere in the lab and for giving me the freedom to pursue my interests.
I'm grateful to Dr. Charles L. Britton for his valuable advice, help and for being a source
of inspiration during this research. I wish to thank Dr. Jeremy Holleman and Dr. Xiaobing Feng
for serving on my committee and providing me with helpful suggestions and encouragement
throughout this work. I thank Dr. Bimal K. Bose for his support and mentoring during my
graduate studies. I'm grateful to Ryan Lind, Clif Jones and the Home Audio Amplifiers group at
TI for their support.
I would like to thank all my current and former colleagues in ICASL for their support. It
has been a great pleasure working with you. I'm grateful to Drs. Neena Nambiar, Suheng Chen
and Robert Greenwell for their encouragement, friendship and support. In addition, I would like
to thank Pollob, Junjie and Kai for their help with the chip submissions. I wish to thank the
administrative and the IT staff at the Department of Electrical Engineering and Computer
Science for all their help.
I wish to express my deepest gratitude to my friends Aparna Thyagarajan, Ezhilarasi
Manickavasagam, Sukanya Iyer, Sangeetha Swaminathan, Srivatsan Sundararajan, Ankit Master
and Anton D'Silva. I'm grateful to Lalitha Aunty for her unconditional love and support.
Most importantly, I wish to thank my parents Mrs. Saroja and Mr. Ulaganathan, and my
sisters Arthi and Vaishnavi, for their unconditional love, support and faith in me. I'm eternally




The advent of battery-operated, sensor-based electronic systems has created a pressing
need for energy-efficient, ultra-low power integrated circuits as a means to improve
battery lifetime. This dissertation describes a scheme to lower the power requirement of a digital
circuit through the use of charge-recycling and dynamic supply-voltage scaling techniques. The
novel charge-recycling scheme proposed in this research demonstrates the feasibility of
operating digital circuits using the charge scavenged from the leakage and dynamic load currents
inherent to digital design. The proposed scheme efficiently gathers the “ground-bound” charge
into storage capacitor banks. The reclaimed charge is then recycled to power the
source digital circuit.
The charge-recycling methodology has been implemented on a 12-bit Gray-code counter
operating at frequencies of less than 50 MHz. The circuit has been designed in a 90-nm process,
and measurement results show a reduction of more than 41% in the average energy consumption of
the counter. When the power consumed in generating the control signals is included, the total
energy savings average 23%. The proposed methodology can be applied to an
existing digital path without any design change to the circuit and with only a small loss in
performance. Potential applications of this scheme are described, specifically in
wide-temperature dynamic power reduction and as a source for energy harvesters.
The second part of this dissertation deals with the design and development of a
self-starting, ultra-low voltage, switched-capacitor (SC) DC-DC converter that is essential to an
energy harvesting system. The proposed charge-pump based SC converter operates from a
125-mV input and thus enables battery-less operation in ultra-low voltage energy harvesters. The
charge pump does not require any external components or expensive post-fabrication processing
to enable low-voltage operation. This design has been implemented in a 130-nm CMOS process.
While the proposed charge pump provides significant efficiency enhancement in energy
harvesters, it can also be incorporated within charge-recycling systems to facilitate adaptable
charge-recycling levels.
In total, this dissertation provides key components needed for highly energy-efficient 




CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Research Goals
1.3 Dissertation Overview
CHAPTER 2 LITERATURE REVIEW
2.1 Introduction
2.2 Power Consumption
2.3 Scaling Trends for CMOS Technologies
2.4 Low Energy Design Techniques
2.4.1 Power Supply Voltage Scaling
2.4.2 Leakage power reduction
2.4.3 Charge-recycling systems
2.5 Conclusion
CHAPTER 3 DESIGN AND ANALYSIS OF THE PROPOSED CHARGE RECYCLING SCHEME
3.1 Introduction
3.1.1 Switching or Dynamic Power dissipation
3.1.2 Optimum power supply voltage
3.1.3 Feasibility of Charge Recycling
3.2 Proposed Charge-Recycling Methodology
3.3 Design of the proposed charge-recycling system
3.3.1 Charge-Recycling Process – Design and Control
3.3.2 Estimation of the Charge-Recycling Capacitor Size
3.3.3 Low-Power Comparator
3.4 Analysis of Energy Consumption in the Charge-Recycling Scheme
3.4.1 Estimation of Virtual Power Supply Voltage levels and CR cycle
3.4.2 Estimation of Energy Saved using CR Scheme
3.5 Implementation of the charge-recycling methodology
3.5.1 Design Methodology
3.5.2 Physical Implementation
3.6 Simulation Results and Performance Analysis
3.6.1 Energy Saving
3.6.2 Effect on Circuit's Speed and Delay
3.6.3 Leakage Current Reduction
3.7 Summary and Conclusions
CHAPTER 4 CHARACTERIZATION OF THE CHARGE-RECYCLING GRAY-CODE COUNTER
4.1 Test Setup
4.1.1 Power Supply Generation and Partitions on PCB
4.1.2 On-board Supply Regulators
4.1.3 Digital Buffers on PCB
4.1.4 Reset signal generation
4.2 Test Procedure
4.3 Measurement Results
4.3.1 Energy Reduction due to Charge-recycling
4.3.2 Efficiency of Charge-recycling scheme
4.4 Techniques to improve the proposed CR scheme
4.5 Summary and Conclusion
CHAPTER 5 DESIGN & ANALYSIS OF THE ULTRA-LOW VOLTAGE DC-DC CONVERTER
5.1 Introduction and Motivation
5.2 Literature Review
5.2.1 Switched Capacitor Charge Pump topologies
5.2.2 Low voltage self-startup in converters
5.2.3 Ultra-low voltage SC converters
5.3 Design of the proposed low-voltage, self-starting DC-DC converter
5.3.1 Architecture of the Converter
5.3.2 Choice of CP topology
5.3.3 Design of Ring Oscillator
5.3.4 Design of Non-Overlapping (NOV) phase generator
5.3.5 Design of the Proposed Linear Charge Pump (LCP)
5.4 Efficiency of the Proposed Charge Pump
5.4.1 Output voltage of the Charge Pump
5.4.2 Input current consumption of the Charge Pump
5.4.3 Conversion Efficiency of the Charge Pump
5.5 Implementation of the DC-DC Converter
5.6 Simulation Results
5.6.1 LCP simulation results
5.6.2 Performance of the Proposed CP topology with increase in number of CP stages
5.7 Summary
CHAPTER 6 CHARACTERIZATION OF LOW-VOLTAGE SELF-STARTUP CHARGE PUMP
6.1 Test Setup
6.1.1 Power Supply Partition & Generation on the PCB
6.1.2 Ring Oscillator control signal generation
6.2 Test Procedure
6.3 Prototype Characterization
6.3.1 Ring Oscillator
6.3.2 Charge Pump Output Voltage
6.3.3 Charge Pump Startup Voltage
6.3.4 Charge Pump Drive Capability
6.3.5 Charge Pump Efficiency
6.4 Techniques to improve efficiency
6.4.1 Adaptive Frequency control
6.4.2 Charge Recycling
6.5 Summary and Conclusion
CHAPTER 7 CONCLUSION
7.1 Original contributions
7.2 Directions for future work
7.2.1 Charge recycling based low power digital operation






List of Figures
Figure 2.1 Inverter illustrating current paths during operation
Figure 2.2 Energy-delay optimization and tradeoff in digital circuit design [7], [8]
Figure 2.3 Charging and discharging action in switch [16]
Figure 2.4 Charging and discharging action in switch [16], [17]
Figure 2.5 Charge recycling system based on vertically-stacked logic cells [24]
Figure 2.6 Conceptual schematic of the charge recycling scheme [26]
Figure 2.7 Conceptual schematic of charge-recycling scheme [28]
Figure 3.1 Block level schematic of FO4 inverter stages
Figure 3.2 NanoSim simulation results of power consumption across VDD variation in the FO4 buffer
Figure 3.3 Normalized power, delay and power-delay product across VDD variation in the FO4 buffer
Figure 3.4 Block level schematic of the charge-recycling scheme
Figure 3.5 Schematic illustrating operation during charge-accumulation phase
Figure 3.6 Schematic illustrating operation during charge-recycling phase
Figure 3.7 Schematic of the low-power comparator
Figure 3.8 Histogram of comparator's offset voltage distribution from Monte-Carlo simulations
Figure 3.9 Comparator schematic illustrating the leakage paths to be investigated
Figure 3.10 Virtual power supply voltage levels in the charge-recycling scheme
Figure 3.11 Duration of CR cycle as a function of CCR/CL across number of stages (N)
Figure 3.12 Virtual power supply voltage levels and power consumption in the CR system
Figure 3.13 Percentage energy saved by employing the proposed charge recycling technique
Figure 3.14 Functional schematic of partially-self powered charge-recycling system
Figure 3.15 Schematic illustrating time-multiplexed virtual VDD generation
Figure 3.16 Alternate implementation of CR scheme to support multiple target blocks
Figure 3.17 Simplified schematic of the analog buffer used to monitor the virtual supply nodes
Figure 3.18 Input referred offset voltage of the analog buffer across the input voltage range of interest
Figure 3.19 Layout of Gray-code counter with CR scheme in 90-nm process
Figure 3.20 Layout of Gray-code counter with CR scheme in 0.5-µm process
Figure 3.21 Transient simulation results of CR counter in 90-nm process
Figure 3.22 Simulated power consumption of the CR Gray-code counter
Figure 3.23 Power consumption in the 12-bit Gray code counter in 90-nm implementation
Figure 4.1 Die photo of the charge-recycling Gray-code counter in 90-nm process
Figure 4.2 Layout of test board to characterize charge-recycling Gray-code counter (90-nm process)
Figure 4.3 Cross-section of the test board showing the layer stack-up
Figure 4.4 Schematic of LDO voltage-regulator circuit on the test board [65]
Figure 4.5 12-bit Gray-code output from the charge recycling counter
Figure 4.6 Transient measurement results demonstrating power reduction in the counter
Figure 4.7 Measured input offset voltage of analog buffers
Figure 4.8 Transient measurement results illustrating virtual supply rails in the CR counter
Figure 4.9 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 1)
Figure 4.10 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by the control logic (Chip # 1)
Figure 4.11 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 2)
Figure 4.12 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by the control logic (Chip # 2)
Figure 4.13 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 3)
Figure 4.14 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by the control logic (Chip # 3)
Figure 4.15 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 4)
Figure 4.16 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by the control logic (Chip # 4)
Figure 4.17 Normalized energy in Counter versus normalized delay due to CR scheme
Figure 4.18 Normalized energy (including CR logic) versus normalized delay due to CR scheme
Figure 4.19 Normalized energy versus normalized delay as a result of reduction in VDD
Figure 5.1 Simplified schematic of a three-stage linear charge pump with charge transfer switch (CTS) implementations (a) CMOS diode CTS (b) CTS with gate control generator and (c) Bootstrapped CTS
Figure 5.2 Simplified schematic of a three-stage Fibonacci charge pump (a) Φ1 is active (b) Φ2 is active
Figure 5.3 Simplified schematic of exponential charge pump (a) Φ1 is active (b) Φ2 is active
Figure 5.4 Simplified schematic of linear charge pump's stage illustrating reverse current loss
Figure 5.5 Schematic of the low-voltage charge pump in [46]
Figure 5.6 Simplified schematic of the low-voltage charge pump in [45]
Figure 5.7 Simplified schematic of the low-voltage charge pump in [49]
Figure 5.8 Simplified schematic of the low-voltage exponential charge pump in [50]
Figure 5.9 Topology of the proposed self-starting converter
Figure 5.10 Model of inverter-based ring oscillator
Figure 5.11 RO's startup voltage across device ratio for typical process corner
Figure 5.12 RO's voltage across device ratio for worst-case process corners
Figure 5.13 Schematic of the 5-stage ring oscillator
Figure 5.14 Schematic of the non-overlapping phase generator
Figure 5.15 Simplified schematic of the proposed LCP
Figure 5.16 Schematic of the level-shifter used to generate the CTS control in the proposed LCP
Figure 5.17 Simplified schematic of the proposed LCP along with the GC generation
Figure 5.18 Schematic of the proposed self-starting linear charge pump
Figure 5.19 Schematic illustrating control signals and charge transfer in the first stage of LCP
Figure 5.20 Schematic illustrating control signals and charge transfer in the second stage of LCP
Figure 5.21 Schematic illustrating control signals and charge transfer in the third stage of LCP
Figure 5.22 Schematic illustrating control signals and charge transfer in the last stage of LCP
Figure 5.23 Schematic of the proposed improved self-starting charge pump - version 2
Figure 5.24 Schematic illustrating control signals and charge transfer in the second stage of LCP V2
Figure 5.25 [...] stage of LCP V2
Figure 5.26 Voltage across charge pump capacitors during charging (ΦC) and discharging (ΦD) phase of operation for (a) First stage CP capacitor, (b) i-th stage CP capacitor, (c) Last or N-th stage CP capacitor, and (d) the output capacitor
Figure 5.27 Layout of the low-voltage charge pump V1 in 130-nm process
Figure 5.28 Layout of the low-voltage charge pump V2 in 130-nm process
Figure 5.29 Ring oscillator's output frequency across input voltage variations
Figure 5.30 Ring oscillator's output frequency control with delay network
Figure 5.31 Post-layout simulation showing startup of the LCP V1
Figure 5.32 Post-layout simulation showing startup of the LCP V2
Figure 5.33 Plot of LCP's output voltage across varying input voltages for different load conditions
Figure 5.34 Plot of LCP's conversion gain across varying input voltages for different load conditions
Figure 5.35 Plot of charge pumps' efficiency across varying input voltages and different load conditions
Figure 5.36 Seven-stage LCP V2's output voltage across varying input voltages for different load conditions
Figure 5.37 Seven-stage LCP V2's efficiency across varying input voltages and different load conditions
Figure 5.38 Plot of LCP V2's output voltage across varying number of pumping stages (N)
Figure 5.39 Plot of LCP V2's efficiency across varying number of pumping stages (N)
Figure 6.1 Microphotograph of the fabricated die
Figure 6.2 Test board to characterize low-voltage charge pumps
Figure 6.3 Cross-section of the test board showing the layer stack-up
Figure 6.4 Schematic of LDO voltage-regulator circuit on the test board [65]
Figure 6.5 Measured charge pump output voltage across frequency of operation (VIN at 0.33 V)
Figure 6.6 Measured charge pump efficiency across frequency of operation, normalized to
Figure 6.7 Frequency tuning in ring oscillator (a) simplified schematic of RO illustrating delay cells, (b) Realized RO schematic, and (c) Required RO schematic, across different delay (switch) settings
Figure 6.8 Simulated RO's output frequency, normalized to the free-running frequency, across delay settings for VIN at 0.125 V and 0.33 V
Figure 6.9 Measured charge pump output voltage across variations in input voltage and DC load current for Chip 1
Figure 6.10 Measured charge pump output voltage across variations in input voltage and DC load current for Chip 2
Figure 6.11 Measured charge pump output voltage across variations in input voltage and DC load current for Chip 3
Figure 6.12 Measured charge pump output voltage across low input voltages at different DC load current conditions for Chip 1
Figure 6.13 Measured charge pump output voltage across low input voltages at different DC load current conditions for Chip 2
Figure 6.14 Measured charge pump output voltage across low input voltages at different DC load current conditions for Chip 3
Figure 6.15 LCP V1's conversion gain across low input voltages and load conditions
Figure 6.16 LCP V2's conversion gain across low input voltages and load conditions
Figure 6.17 Output voltage of LCP across varying load conditions for chip 1
Figure 6.18 Output voltage of LCP across varying load conditions for chip 2
Figure 6.19 Output voltage of LCP across varying load conditions for chip 3
Figure 6.20 Measured end-to-end efficiency of the DC-DC converter (LCP V1) across varying load conditions
Figure 6.21 Measured end-to-end efficiency of the DC-DC converter (LCP V2) across varying load conditions
Figure 6.22 Simulated LCP V2 efficiency and end-to-end converter efficiency (LCP V2)

List of Tables  
Table 2.1 Comparison of the state-of-the-art Charge-recycling based low power techniques
Table 3.1 Partially Self-Powered Circuit's performance with and without Charge-Recycling at 50 MHz
Table 3.2 Energy reduction of system with and without Charge-Recycling at 100 MHz (90-nm design)
Table 3.3 Propagation delay with and without Charge-Recycling
Table 4.1 Offset voltage of the comparators used in CR control logic
Table 4.2 Comparison of the state-of-the-art Charge-recycling based low power techniques
Table 5.1 Summary of the state-of-art low-voltage DC-DC converters with startup mechanisms
Table 6.1 Charge pump startup voltage across different load conditions
Table 6.2 Performance comparison of the proposed charge pump with state-of-the-art converters
 
Chapter 1 Introduction 
 
1.1 Motivation 
The fast-paced growth of the semiconductor industry has facilitated the development of 
highly integrated electronic systems with sophisticated functionalities. Our dependence on these 
ubiquitous, pervasive electronic systems and the increasing concerns over global warming have
made efficient energy consumption a significant topic of research. The International
Technology Roadmap for Semiconductors (ITRS) [1] has identified power consumption and 
leakage power consumption as focus topics for research for the next 15 years [2]. Also, reducing 
the power consumption of the integrated circuits (ICs) in portable electronics is critical for long 
battery life between recharge cycles and thus to the performance of the system. Energy 
efficiency is important in wired applications as well, where low power consumption directly 
translates to lower utility bills and improved reliability due to lower heat generation. 
With this pressing need for efficient power utilization, many low-power IC design 
techniques have been put forth over the past two decades. Foremost among these digital design 
techniques are supply and threshold voltage scaling [11], [21], sub-threshold operation [7], body-
biasing [20], power-gating [19], and adiabatic computation [13], [16]. Each of these techniques 
has a different power-performance tradeoff and has been successfully used in microprocessors 
along with power management systems. Sensor-based mixed-signal circuits represent a niche 
class of systems that requires low-cost, highly energy efficient operation. Often, the generation 
of multiple supply voltages or the use of sophisticated power management ICs presents a large
overhead on these systems. In such applications, highly aggressive low-power design with an
efficient, integrated power management scheme is required to meet the power constraints of the
system. 
Additionally, the CMOS device scaling trend has resulted in the degradation of analog 
device performance at Ultra-Deep-Submicron (UDSM) scale processes. This has necessitated the 
use of digital assist techniques to improve the overall system performance, improve 
reproducibility, and provide a simple and powerful interface [1]. With the increasing dependence 
on such digital housekeeping circuits for otherwise straightforward analog implementations, a 
power reduction technique that encompasses these power-constrained UDSM circuits would be 
very valuable. 
It is the opinion of the author that adaptive voltage scaling techniques are potentially the
most effective way to achieve energy efficiency, especially in the presence of process
and temperature variations. The Adaptive Voltage Scaling (AVS) technique automatically
adjusts the power supply voltage to achieve energy-efficient operation at a required performance
level, across variations in the operating conditions. The majority of voltage-scaling based energy
reduction techniques focus on the methods to reduce energy consumption by employing multiple 
supply voltages, but do not account for the generation of these multiple supply voltage levels. 
While the energy expended for the generation of multiple supply levels is validated by the 
significant energy savings achieved in large computation systems, it presents a major overhead to 
severely power-constrained, mixed-signal systems. Hence, this research addresses the challenge 
of adopting VS techniques in these small mixed-signal systems. 
This dissertation focuses on the design and implementation of charge-recycling (CR) 
based voltage-scaling (VS) schemes to improve the efficiency of power-constrained mixed-
signal circuits. Furthermore, the proposed scheme can be easily adapted to an AVS scheme. As with
any VS technique, the reduction in power comes at the expense of some performance degradation
[3]. The power-reduction and performance-loss tradeoff in the system is analyzed in this work.
The realization of a charge-recycling scheme with an adjustable power-supply voltage that
compensates for performance or temperature variations requires a low-voltage DC-DC converter 
that functions as the power delivering unit. Further, power autonomy can be achieved with a self-
starting DC-DC converter that is powered by the recycled charge (or harvested energy) to 
generate the output power-supply voltage. Additionally, ultra-low voltage charge pumps are 
widely exploited in energy harvesting systems, sensor nodes, as well as in smart power 
management ICs. Hence, the second part of this dissertation explores the design of an ultra-low 
voltage, self-starting charge pump which is an essential component in ultra-low voltage DC-DC 
conversion. Furthermore, unassisted self-startup capability in the ultra-low voltage regime 
provides significant efficiency enhancements in micro-energy harvesters and thus presents a very 
attractive challenge for research. 
1.2 Research Goals 
The goals of this research can be summarized as follows: 
• To investigate the feasibility of recycling charge from switching and leakage currents that are inherent to digital operation.
• To explore energy-efficient means to use the reclaimed charge.
• To design and implement a charge-recycling (CR) methodology that lowers the total power consumption of digital circuits in a mixed-signal system. Further, the proposed CR technique should be easily adaptable to existing digital circuits and independent of the process technology employed.
• To study the performance-power design trade-off as a result of implementing the proposed CR scheme.
• To investigate ultra-low voltage DC-DC conversion topologies that enable power autonomy, and offer variable conversion gains depending on the input voltage.
• To design and implement an unassisted self-starting, ultra-low voltage charge pump which forms an integral component in micro-energy harvesters.
• To characterize the startup voltage and the performance of the proposed charge pump.
• To determine possible applications of the proposed CR scheme and the ultra-low voltage charge pump.
To meet these goals, the state-of-the-art low-power designs have been studied and the 
various factors that limit power reduction have been understood. To investigate charge-recycling 
as a means to lower energy consumption, a methodology is proposed to analyze existing digital 
cells and the best possible way to adopt CR is studied. A prototype has been designed and 
implemented in 90-nm CMOS and 0.5-µm SiGe BiCMOS processes. The energy reduction for 
one cycle of operation in these test prototypes has been characterized and the performance of the 
proposed CR scheme has been discussed. 
In the second part of this dissertation, the factors that affect self-startup and operation of 
DC-DC converters at ultra-low voltages have been analyzed. A self-starting, switched-capacitor 
charge pump has been designed and implemented in a 130-nm CMOS process. The performance 
of the fabricated prototypes has been characterized across varying input and load conditions. 
Further, the performances of the designed prototypes have been compared with the current state-
of-the-art designs in order to study the effectiveness of the proposed circuits. 
1.3 Dissertation Overview 
Chapter 2 reviews trends in CMOS technology and the challenges of low-power design. 
A literature survey of state-of-the-art power reduction techniques is presented. Chapter 3 
provides an in-depth look at charge-recycling based dynamic voltage scaling in digital cells. The 
design considerations and methodology to optimize power consumption are presented. Further, 
an estimate of the percentage energy reduction due to charge-recycling is derived. The physical 
design and implementation of the proposed CR scheme for a 12-bit Gray-code counter are 
described. Furthermore, to show its effectiveness, the CR scheme has also been ported to a 0.5-
µm SiGe BiCMOS process and the efficiency of the two systems has been compared with 
simulation results. 
Chapter 4 describes the test-board design and the procedure employed to characterize the 
energy savings in the test chips. Further, the design trade-offs and the realized energy savings in 
the system are analyzed in detail. Directions for future work with techniques to improve the 
proposed CR scheme are discussed. The chapter concludes with the performance comparison of 
the proposed CR scheme with the current state-of-the-art low-power designs. The applications of 
the CR technique are also explored in this chapter. 
Chapter 5 reviews the current state-of-the-art self-starting DC-DC converters and 
establishes the design goals for this research. The proposed switched-capacitor-based, self-
starting DC-DC converter topology is introduced and its design is presented. Next, the 
operational losses and their effect on the efficiency of the converter are analyzed. Further, an 
output voltage equation that accurately models the CP's operational losses is derived. Finally, the
improvements obtained using the proposed topologies are illustrated with the simulation results.  
Chapter 6 presents the test-board design and the methodology employed to verify the 
operation of the proposed self-starting charge pumps. The startup voltage and the performance of 
the proposed low-voltage charge pumps are characterized and compared with the current state-
of-the-art designs. 
Chapter 7 provides a summary and conclusion to this work. The original contributions 
made in this research are presented. Finally, the possible directions for future work are discussed. 
 
Chapter 2 Literature Review 
 
2.1 Introduction 
Traditionally, digital design optimization was targeted at improving the computation 
speed until the advent of battery operated systems which necessitated power and energy 
optimization techniques to extend the lifetime of the battery. For more than a decade, energy 
optimization has been one of the actively researched areas in IC design. A literature review of 
the various low-power design techniques along with an analysis of the implementation 
challenges helps to illustrate the contribution of this dissertation research to the state-of-the-art
low-power methodologies. Section 2.2 presents a brief overview of the different sources of
power dissipation that are inherent to digital operation. Section 2.3 discusses the role of CMOS 
technology scaling in the design of energy-efficient mixed-signal ICs. Low-power, energy-
efficient design techniques and their challenges are explored in Section 2.4. Finally, Section 2.5 
summarizes and concludes this chapter. 
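2.2 Power Consumption
The average power consumed by a digital circuit over a time period T is given by
$$P_{avg} = \frac{1}{T}\int_{0}^{T} V_{DD}\, I_{VDD}(t)\, dt \qquad (2.1)$$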
where VDD is the power supply voltage, and IVDD(t) the instantaneous supply current consumed by 
the circuit in the time period, T. To examine the different components of power dissipation, 
consider the CMOS inverter schematic shown in Figure 2.1. At equilibrium state, only one of the 
transistors is ON thereby eliminating any conductive path between the supply rails. Hence, IVDD(t) 
current is primarily drawn from the supply only during a transition (or switching) of the output 
logic state. The load capacitance (CL) at the output node is charged to VDD during a LOW-to-
HIGH transition or discharged to ground during a HIGH-to-LOW logic transition. A charge of 
CLVDD is thus moved from VDD to CL through the pull-up network (PMOS transistor) or from CL 
to ground by the pull-down network (NMOS transistor) for the respective transitions. Thus, the 
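energy expended for a digital transition is given by $(1/2)C_L V_{DD}^2$. This switching current
component, which is essential for digital operation, represents one of the dynamic components of
power dissipation and is given by
$$P_{switching} = \alpha\, C_L V_{DD}^{2} \cdot \frac{1}{t}, \qquad (2.2)$$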
where α is the probability of a switching transition and 1/t is the frequency of operation. The load 
capacitance at the output node comprises the reverse-biased diffusion capacitances of the
driving circuit, gate capacitance of the fanout load, interconnection wire capacitance, and any 
external load capacitance connected to the output node. 
Furthermore, during the switching transitions, when one transistor is being turned ON, 
the other is being switched OFF. Therefore, there exists a short period of time when both the 
transistors are ON, resulting in a short-circuit current (ISC) through the direct conductive path 
from VDD to ground. This undesirable current component is dependent on the slope of input 
transition and can be controlled by proper transistor sizing and fanout ratios. However, this ISC 
forms a small percentage of the total dynamic power required for digital operation.  
Additionally, in current UDSM technologies, there exists a static component of power 
consumption that is caused by the small static leakage currents that flow through the transistors 
even in the OFF state. This static component of power dissipation is primarily determined by the 
 
 
Figure 2.1 Inverter illustrating current paths during operation 
 
subthreshold leakage and the gate leakage currents in UDSM processes [5]. Thus the total 
average power dissipated for digital operation can be summarized as [3] 
$$P_{total,avg} = \alpha\, C_L V_{DD}^{2} f_{CLK} + I_{SC} V_{DD} + I_{Lkg} V_{DD}, \qquad (2.3)$$
where the first two terms summarize the dynamic power dissipation, representing the switching 
and the short-circuit components while the static power dissipation due to leakage currents is 
represented by the last term. Typically, the switching component of dynamic power 
approximates the total power consumed and thus forms the target for most power reduction 
techniques. 
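As a rough illustration of (2.3), the following Python sketch evaluates the switching, short-circuit, and leakage terms for one operating point. The function name and every numeric value are hypothetical and are not taken from this work; the activity factor α is assumed to multiply the switching term as in (2.2).

```python
def average_power_components(alpha, c_load, vdd, f_clk, i_sc, i_leak):
    """Evaluate the three terms of (2.3); all arguments are in SI units."""
    p_switching = alpha * c_load * vdd ** 2 * f_clk   # switching (dynamic) term
    p_short_circuit = i_sc * vdd                      # short-circuit (dynamic) term
    p_leakage = i_leak * vdd                          # static leakage term
    return {
        "switching": p_switching,
        "short_circuit": p_short_circuit,
        "leakage": p_leakage,
        "total": p_switching + p_short_circuit + p_leakage,
    }


# Hypothetical operating point, chosen only for illustration.
example = average_power_components(
    alpha=0.1, c_load=10e-12, vdd=1.2, f_clk=50e6, i_sc=2e-6, i_leak=1e-6
)
for name, watts in example.items():
    print(f"{name:>13s}: {watts * 1e6:.2f} uW")
```

With these assumed values the switching term dominates the total, which is the situation the paragraph above describes as typical.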
2.3 Scaling Trends for CMOS Technologies  
The consistent scaling of CMOS process technologies has been driven by the ever-
increasing need for low power, high performance and high packing density. With constant-field 
CMOS scaling, the device dimensions (width W, length L), gate-oxide thickness (tox), threshold 
voltage (VTH), along with the power supply voltage (VDD) are scaled by a factor of S (about √2) 
for every process node. However, material parameters such as silicon bandgap and built-in 
junction potential do not scale with reduced voltages or dimensions and thus present challenges 
in device design [6]. So, the modern processes follow the generalized scaling model, where the 
device dimensions are scaled by S but the VDD, VTH and doping follow a different scaling factor 
(U) which is smaller than S [3].  
To gauge the effects of scaling on energy efficiency, the critical digital design parameters 
of intrinsic delay and power are considered in this section. The intrinsic delay associated with 
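a long-channel MOSFET (with VGS = VDD) is given by [3]
$$\tau = \frac{C_G V_{DD}}{I_{DSAT}} = \frac{C_{OX} W L\, V_{DD}}{\frac{1}{2}\mu C_{OX} \frac{W}{L}(V_{DD}-V_{TH})^{2}} = \frac{2 L^{2} V_{DD}}{\mu (V_{DD}-V_{TH})^{2}} \qquad (2.4)$$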
where CG is the gate capacitance, COX is the gate-oxide capacitance per unit area, IDSAT is the 
saturation drain current at VGS = VDD and µ is the MOSFET mobility. As seen in (2.4), the delay scales
with the device length as $L^2$, and thus device scaling results in a quadratic increase in the maximum
achievable speed (performance) for a given VDD and VTH. However, in modern UDSM processes,
the small device length presents undesirable short-channel effects such as carrier velocity
saturation, drain-induced barrier lowering (DIBL) and severe channel-length modulation. For 
nominal supply voltages, the electric field across short-channel lengths is high enough to cause 
velocity saturation and thus it is safe to assume that all modern logic devices are affected by 
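short-channel effects. Once velocity saturation occurs, the saturation drain current IDSAT has a
linear dependence on the gate overdrive voltage and is given by
$$I_{DSAT} = v_{sat}\, C_{OX} W \left(V_{GS} - V_{TH} - \frac{V_{DSAT}}{2}\right) \qquad (2.5)$$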
where VDSAT is the drain-source voltage at which velocity saturation occurs, and is usually less 
than VGS-VTH for short-channel devices. The saturated carrier velocity vsat is given by (VDSAT·µ)/L. 
Now, the intrinsic delay τ for short-channel devices can be derived similar to (2.4) and is linearly 
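dependent on L as shown below:
$$\tau = \frac{C_G V_{DD}}{I_{DSAT}} = \frac{L\, V_{DD}}{v_{sat}\left(V_{DD} - V_{TH} - \frac{V_{DSAT}}{2}\right)} \qquad (2.6)$$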
where vsat is constant above VDSAT. Thus there is a linear increase in the maximum attainable 
switching frequency (performance) with device scaling in short-channel length devices.  
 For most digital applications with VDD >> VTH, the switching power dissipation 
approximates the total power consumption and is proportional to the product of load capacitance 
CL and $V_{DD}^2$ as in (2.2). The dynamic power dissipation associated with the maximum operating
frequency $f_{max}$ is given by
$$P_{dyn} = \alpha\, C_L V_{DD}^{2} f_{max} \qquad (2.7)$$
where α is the switching activity factor. The power consumption scales quadratically with the 
power supply reduction. For a fixed frequency of operation, choosing a smaller feature size 
results in much lower active power, which is scaled by $1/(SU^2)$, where S is the tox (dimension)
scaling factor and U is the VDD scaling factor. Also, from (2.6) and (2.7), the energy per 
operation (i.e. power-delay product) is reduced by scaling technology. Further, the UDSM 
processes also provide different flavors of devices with varying VTH to cater to low-operating 
power (LOP), low-standby power (LSP), or high-performance (HP) circuit applications. Thus, 
depending on the application, the choice of technology and the available device types can be 
exploited to design for low energy consumption. Specifically, the excess performance offered by 
technology scaling can be exchanged for energy-efficient computation.    
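The first-order scaling relations discussed in this section can be tabulated with a short Python sketch. It assumes that the load capacitance tracks the device dimensions (scaled by 1/S), that all voltages including VDSAT scale by 1/U so the voltage ratio in the short-channel delay expression is unchanged, and that the clock frequency is held fixed for the power figure. The function name is hypothetical.

```python
def generalized_scaling(s, u):
    """Return new/old ratios when dimensions shrink by s and voltages by u."""
    capacitance = 1.0 / s                   # C_L tracks device dimensions
    delay = 1.0 / s                         # short-channel delay is linear in L
    power_at_fixed_f = capacitance / u ** 2  # alpha * C_L * VDD^2 * f, f constant
    energy_per_op = capacitance / u ** 2     # C_L * VDD^2
    return {
        "capacitance": capacitance,
        "delay": delay,
        "power_at_fixed_f": power_at_fixed_f,
        "energy_per_op": energy_per_op,
    }


# One process node with S = sqrt(2) and a weaker voltage scaling U = 1.2
# (U is an assumed value for illustration only).
for name, ratio in generalized_scaling(2 ** 0.5, 1.2).items():
    print(f"{name:>17s}: x{ratio:.3f}")
```

Because U is smaller than S in the generalized model, the power and energy reductions are less aggressive than constant-field scaling (U = S) would give.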
2.4 Low Energy Design Techniques 
Design techniques to lower computation energy have been implemented at different 
levels such as device level, logic level and at architectural level. Device level techniques employ 
threshold scaling, supply-voltage scaling or frequency scaling in order to lower the circuit's
power dissipation by varying the operating conditions of the devices. At the logic level, the 
choice of circuit topology and the logic implementation, such as dynamic versus static logic, true 
single phase clocking (TSPC) logic etc., is employed. Architectural level techniques include 
parallel-computing, time-multiplexed, or pipelined system architecture to improve the system 
performance when operating at low power levels [3]. Since this work applies to device level 
optimization techniques, this literature study is limited to device level techniques. 
2.4.1 Power Supply Voltage Scaling  
As evident from (2.3), VDD scaling presents a straightforward and effective method to
lower power consumption in digital systems. The extent of VDD scaling depends on the 
performance and energy-efficiency requirements in these circuits. However, the reduction in VDD 
results in an increase in the propagation delay and thus lowers the maximum frequency of 
operation. Hence, there exists a power-performance, power-delay or energy-delay tradeoff with 
supply-voltage scaling. Figure 2.2 [7], [8] illustrates the energy-delay tradeoff in digital circuits. 
The energy-delay curve represents the optimal operating point to achieve a given performance 
with minimal energy consumption, or to operate at the maximum speed possible within a fixed 
energy constraint. The optimal energy-delay curve is derived for a system based on a set of 
design parameters such as activity level, transistor size, and VTH levels. Thus any change in these 
design parameters would cause a shift in the energy-delay curve [8].   
An unoptimized design's energy efficiency can be improved by operating with minimum
energy dissipation that corresponds to the (Emin, Dmax) co-ordinates in Figure 2.2, by relaxing the
performance requirement, or by speeding up the system to operate at the (Emax, Dmin) point by
increasing the power consumption. Thus, VDD scaling techniques operate at an optimum VDD 
voltage for a required performance such that the power-delay product i.e. the energy per 
operation is at a minimum level [3], [7]. 
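The energy-delay tradeoff described above can be sketched numerically. The Python example below assumes a long-channel delay shape, τ = k·VDD/(VDD − VTH)², and a switching energy of CL·VDD², and it ignores leakage; the function names, the constant k, and the candidate voltages are all hypothetical. It picks the lowest-energy supply voltage that still meets a delay target.

```python
def gate_delay(vdd, vth, k=1.0):
    """Assumed long-channel delay shape; only defined for VDD > VTH."""
    if vdd <= vth:
        raise ValueError("model only covers VDD > VTH")
    return k * vdd / (vdd - vth) ** 2


def switching_energy(vdd, c_load):
    """Switching energy per operation, C_L * VDD^2 (leakage ignored)."""
    return c_load * vdd ** 2


def lowest_energy_vdd(max_delay, vth, c_load, vdd_candidates, k=1.0):
    """Return (vdd, delay, energy) for the cheapest candidate meeting max_delay."""
    feasible = [v for v in vdd_candidates
                if v > vth and gate_delay(v, vth, k) <= max_delay]
    if not feasible:
        return None  # no candidate voltage is fast enough
    best = min(feasible, key=lambda v: switching_energy(v, c_load))
    return best, gate_delay(best, vth, k), switching_energy(best, c_load)


candidates = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
print(lowest_energy_vdd(max_delay=3.0, vth=0.3, c_load=1e-12,
                        vdd_candidates=candidates))
```

Relaxing `max_delay` lets the search settle on a lower VDD and hence a lower energy, which mirrors moving along the energy-delay curve toward the (Emin, Dmax) end.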
2.4.1.1 Multiple VDD schemes 
Supply voltage islands or multiple VDD are commonly used to optimize the energy 
consumption across a chip [9], [10]. Circuits that have high activity, i.e. large switching power 
component, use low VTH devices and are biased at low VDD. Similarly, low activity circuits that 
have appreciable leakage power are biased at high VDD and employ high VTH devices. Generation 
of multiple supply-voltages results in an increase in the total chip power and area. 
2.4.1.2 Dynamic & Adaptive VDD schemes 
Conventional dynamic voltage scaling (DVS) schemes vary the supply voltage to 
establish low power levels for the required frequency of operation [9], [10]. The critical path's
delay associated with VDD is monitored using a ring-oscillator or FO4-based logic, and then a look-
up-table is used to specify the optimum VDD voltage that provides the required performance at 
the given temperature.    
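A look-up-table based DVS selection of this kind might be sketched as follows. The table contents, the temperature bands, and the function name are hypothetical and are not taken from the cited works; a real table would be populated from delay-monitor characterization.

```python
import bisect

# Hypothetical table: temperature band upper bound (deg C) ->
# list of (maximum supported frequency in MHz, VDD in volts), sorted by frequency.
VDD_TABLE = {
    25: [(50, 0.6), (100, 0.8), (200, 1.0)],
    85: [(50, 0.7), (100, 0.9), (200, 1.1)],
}


def lookup_vdd(freq_mhz, temp_c, table=VDD_TABLE):
    """Return the lowest tabulated VDD that supports freq_mhz at temp_c."""
    bands = sorted(table)
    band_index = bisect.bisect_left(bands, temp_c)   # first band covering temp_c
    if band_index == len(bands):
        raise ValueError("temperature above the characterized range")
    rows = table[bands[band_index]]
    freqs = [f for f, _ in rows]
    row_index = bisect.bisect_left(freqs, freq_mhz)  # first row fast enough
    if row_index == len(rows):
        raise ValueError("frequency above the characterized range")
    return rows[row_index][1]


print(lookup_vdd(80, 20))    # uses the 25 deg C band
print(lookup_vdd(150, 60))   # uses the 85 deg C band
```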
 
Figure 2.2 Energy-delay optimization and tradeoff in digital circuit design [7], [8] 
 
Some adaptive VDD techniques employ closed-loop control systems to preserve minimum 
energy operation with VDD variations. In [11], the authors present a control circuit that 
automatically adjusts both the VDD and VTH such that minimum power consumption is maintained 
across process and temperature variations. The control circuit determines the body bias voltage 
needed to obtain the optimum VTH so that the ratio of the dynamic switching current to the static 
leakage current remains at a fixed level. In another VDD adaptive low-power design work [12], 
the transistor gate-size ratio is varied with change in VDD in order to maintain the minimum energy
point across different operating regions of the gate. This ensures that the large transistor sizes
required for subthreshold operation do not increase the power when VDD is set above the
threshold voltage. A 6X reduction in the power-delay product was obtained.  
2.4.2 Leakage power reduction  
Highly energy-constrained applications, such as energy harvesters that operate at ultra-low
voltages (ULV) and low speeds, use subthreshold digital operation to achieve energy efficiency.
While low-voltage operation reduces the dynamic switching power, the static leakage of the 
devices increases. Thus the minimum energy point (MEP) in these applications is primarily set 
by the supply voltage for which the switching power is equal to the leakage power [18]. Since 
the static leakage currents in ULV operation form a considerable percent of the total power, 
techniques to reduce the leakage currents are necessary.  
In the stand-by mode of operation, the power dissipated by static leakage currents can be
largely eliminated by power gating [19]. Pass-transistors are used to connect the circuit to the power
supply or ground terminal. When the system is in the idle state, the pass-transistors are switched off,
thereby disconnecting the circuit from the supply terminal. Now, the power consumption is
mainly due to the leakage current flowing through the pass-transistor.  
The transistor stacking technique uses stacked transistor gates to reduce the subthreshold
leakage current. When both the transistors are switched “off”, the gate-to-source voltage of the
top transistor is negative and the increase in VTH due to the body effect results in lower subthreshold
leakage currents [19].
Several dynamic/adaptive body biasing techniques that vary the device VTH have been 
effectively used to reduce leakage current [20], [21]. To minimize the stand-by power, the VTH of 
devices is adaptively increased by varying levels of reverse body-bias. Some recent works have
shown that simultaneous supply voltage scaling and bidirectional body biasing is effective in achieving
high performance in standby as well as active mode of operation. 
2.4.3 Charge-recycling systems  
Charge-recycling (CR) or energy-recovery refers to the low-power design approach that 
recovers some of the charge stored in the load capacitances (CL). Figure 2.3 shows a simple RC 
circuit that models an inverter's switching action (charge-discharge of CL), with R representing
the on-resistance (Ron) of the active device and C representing the load capacitance. A digital transition
from LOW to HIGH requires that the load capacitance be charged to VDD. The energy expended
by the power supply for this transition is $C_L V_{DD}^2$, and the energy stored in the load capacitance is
$C_L V_{DD}^2/2$, while the switch (Ron) dissipates the remaining energy. A switching transition from
HIGH to LOW discharges the load capacitor to ground, and this process dissipates the energy
stored ($C_L V_{DD}^2/2$) in the capacitor. Thus, for one clock cycle of operation, the energy drawn
from the power supply is $C_L V_{DD}^2$ and the charge supplied is $C_L V_{DD}$. While adiabatic computing
techniques work on reducing the actual energy expended ($C_L V_{DD}^2$) by the power supply, charge-recycling
techniques aim at recovering the stored charge ($C_L V_{DD}$) from one computation and
reusing it for subsequent computations. Note that the entire charge from the load capacitor can
potentially be recycled in the charge-recovery process. 
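The energy bookkeeping for the LOW-to-HIGH transition can be checked with a small numerical experiment. The Python sketch below integrates the RC charging of the load capacitor with a forward-Euler step; the function name, step count, and component values are hypothetical choices for illustration.

```python
def charge_through_switch(r_on, c_load, vdd, steps=200_000, n_tau=20.0):
    """Simulate C_L charging from 0 V toward VDD through R_on.

    Returns (energy drawn from supply, energy stored on C_L,
    energy dissipated in R_on), all in joules.
    """
    dt = n_tau * r_on * c_load / steps   # simulate n_tau time constants
    v_cap = 0.0
    e_supply = 0.0
    e_resistor = 0.0
    for _ in range(steps):
        current = (vdd - v_cap) / r_on
        e_supply += vdd * current * dt
        e_resistor += current * current * r_on * dt
        v_cap += current * dt / c_load
    e_stored = 0.5 * c_load * v_cap ** 2
    return e_supply, e_stored, e_resistor


e_supply, e_stored, e_resistor = charge_through_switch(1e3, 1e-12, 1.0)
print(f"supply   : {e_supply:.4e} J")
print(f"stored   : {e_stored:.4e} J")
print(f"resistor : {e_resistor:.4e} J")
```

The split between stored and dissipated energy does not depend on the value of Ron for an abrupt supply step, which is why reducing Ron alone does not lower the switching energy.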
Several implementations of charge-recycling or energy-recovery systems have been 







Figure 2.3 Charging and discharging action in switch [16] 
 
reference to adiabatic computing. The latter part of this section focuses on charge-recycling 
approaches that use the recovered charge to supply power for other computations, and are 
directly relevant to this research effort.  
2.4.3.1 Adiabatic Energy-recovery Logic 
The principle behind energy-efficient adiabatic logic is to dissipate very little energy
during the charging of load capacitances and to recover most of the energy back to the power
supply during the discharge phase. This is accomplished by the use of a clocked power supply that
is derived from the clock signal. The energy recovery process and the power supply waveform 
are illustrated in Figure 2.4. With controlled, slow edge-transitions, the voltage drop across the 
PMOS transistors is small and the output voltage follows the slope of the power-supply clock.  
This results in lower peak charging-current and thus low energy dissipation during the charge of 
CL. During the hold phase of the power supply clock, the output voltage is sampled or evaluated 
in the subsequent stage. During the recovery phase, the power-supply transitions to zero and the 
charge stored in the load capacitor is transferred back to the supply. The total energy dissipated
is given by
$$E = \frac{RC}{T_S}\, C V_{DD}^{2} \qquad (2.8)$$
where R is the effective “on” resistance of the gate, C is the load capacitance, TS is the edge 
transition time of the power supply clock and VDD the power supply voltage. Ideally with very 
slow clock transitions, the energy expended per operation can be made very low. Multiple clock 
phases are employed to synchronize the stages such that the hold time of a stage falls during the 
evaluate phase of the subsequent stage. A four-phase clock with 90° phase shift is recommended
for effective energy saving [17]. 
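To give a feel for the benefit of slow supply edges, the Python sketch below compares the dissipation of an abrupt supply, C·VDD², with the commonly quoted first-order adiabatic estimate (RC/TS)·C·VDD². That form is an assumption here, it holds only when TS is much larger than RC, and the function names and component values are hypothetical.

```python
def conventional_dissipation(c_load, vdd):
    """Energy dissipated per charge-discharge cycle with an abrupt supply."""
    return c_load * vdd ** 2


def adiabatic_dissipation(r_on, c_load, vdd, t_edge):
    """First-order estimate (R*C/T_S) * C * VDD^2, assuming T_S >> R*C."""
    if t_edge <= 0:
        raise ValueError("edge transition time must be positive")
    return (r_on * c_load / t_edge) * c_load * vdd ** 2


r_on, c_load, vdd = 1e3, 1e-12, 1.0        # RC = 1 ns (assumed values)
for t_edge in (10e-9, 100e-9, 1e-6):
    ratio = (adiabatic_dissipation(r_on, c_load, vdd, t_edge)
             / conventional_dissipation(c_load, vdd))
    print(f"T_S = {t_edge * 1e9:7.1f} ns -> dissipation ratio {ratio:.4f}")
```

The ratio falls in proportion to RC/TS, which also makes the speed penalty discussed in the following paragraph explicit: each tenfold energy reduction costs a tenfold longer edge.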
The main challenge in adiabatic computing is the generation of efficient power supply 
sources that transfer bidirectional energy to and from the circuit [17]. The transition time of the 
power-clock which needs to be large enough to reduce the energy consumption also results in 
large propagation delay in the path. Additionally, the circuitry overhead associated with the 
energy-recovery process would reduce the overall efficiency. 
2.4.3.2 Vertically-stacked VDD techniques 
The charge-recycling technique based on vertically-stacked computation logic has been
explored by Rajapandian et al. [22]-[24] and Gu et al. [25]. Implicit charge-recycling is 
accomplished by vertically stacking identical logic units such that the ground-bound charge from 
a digital HIGH to LOW transition in upper tier (domain) logic cells can be reused for a LOW to 
HIGH transition in the lower stack (domain). Thus this scheme is effective for large computation 
systems that are made of several identical logic unit blocks with similar energy consumption, 
performance and concurrent operation. 
The goal of Rajapandian’s work was to achieve energy-efficient on-chip dc-dc 
conversion to supply power for digital circuits using CR. Figure 2.5 shows the charge-recycling 
prototype presented in [24]. A 16x16 carry-save array multiplier is split into 16 logic partitions 
(granules) and connected in vertical stacks with different supply voltages. A linear push-pull 
regulator controls the ground reference voltage of the upper domain which is also the power 
supply voltage of the lower domain granules. Additionally, the regulator monitors the transient 
current consumption of both the domains. When perfectly matched, the entire current demand of 
the lower domain granules is supplied by the upper domain. An imbalance between domains 
would require the regulator to supply the current difference. When a current source-sink 
imbalance of more than a permissible amount is detected between the domains, the granule 
switching control logic randomly chooses a granule from the appropriate domain and switches it 
over to the other domain. This automatic charge (current)-balancing scheme between the logic 
 
 
Figure 2.4 Charging and discharging action in switch [16], [17] 
 
domains improves the energy-efficiency of the system and a measured efficiency of 85% was 
reported for VDD/2 conversion [24].  
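The granule-switching idea can be illustrated with a toy model. The Python sketch below is not the control logic of [24]; it assumes that each granule draws a fixed current, that the regulator supplies the absolute current difference between the domains, and that the "appropriate" domain is the one drawing more current. All names and values are hypothetical.

```python
import random


def rebalance(upper, lower, max_imbalance, rng=random, max_moves=100):
    """Move randomly chosen granules out of the heavier domain until the
    current mismatch is within max_imbalance or max_moves is reached.

    upper, lower: lists of per-granule currents (amperes).
    Returns (upper, lower, regulator_current) without modifying the inputs.
    """
    upper, lower = list(upper), list(lower)
    moves = 0
    while abs(sum(upper) - sum(lower)) > max_imbalance and moves < max_moves:
        if sum(upper) > sum(lower):
            source, target = upper, lower
        else:
            source, target = lower, upper
        if len(source) <= 1:
            break  # keep at least one granule in each domain
        target.append(source.pop(rng.randrange(len(source))))
        moves += 1
    regulator_current = abs(sum(upper) - sum(lower))
    return upper, lower, regulator_current


# Sixteen identical granules, initially split 12 / 4 between the domains.
up, low, i_reg = rebalance([1.0] * 12, [1.0] * 4, max_imbalance=1.0,
                           rng=random.Random(0))
print(len(up), len(low), i_reg)
```

With unequal granule currents a single move can overshoot the balance point, which is why the sketch bounds the number of moves.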
Vertical voltage-stacking (logic-domain) reduces the total current flowing in each power 
supply route by 50% (without CR) for the same supply voltage levels. With charge-recycling 
between domains, the external (off-chip) current requirement is further decreased. The amount of 
current reduction depends on the number of logic domains employed and the switching activity 
in the domains. A direct consequence of lower supply current is a significant reduction in the IR 
and Ldi/dt noise associated with each supply. Furthermore, the current requirements on the linear 
regulator and thus the size of the pass transistor and on-chip decoupling capacitor are reduced. 
Hence, charge-recycling based VDD stacking proves to be an effective method of power delivery 
through dc-dc down conversion. 
Another implementation of multi-story logic based charge-recycling has been presented 
by Gu and Kim [25]. Their approach is similar to Rajapandian et al. [22]-[24] in the use of 
stacked logic to reuse the charge discarded by upper domain for computations in the lower 
domain. Their implementation differs in the current balancing technique where digital voltage 
 
Figure 2.5 Charge recycling system based on vertically-stacked logic cells [24] 
regulation is achieved by adjusting the switching activity of the functional units in all the logic 
stacks. This is accomplished by sensing the supply voltage variation and by controlling the 
digital inputs to the functional units such that the supply voltage is regulated. The focus of their 
efforts was to use logic stacking as a means of low-noise, power supply delivery. Ring 
oscillators, 16-bit LFSR and ALUs were used to evaluate the effectiveness of this scheme. 
Simulation results showed an IR noise reduction of 66% and an Ldi/dt noise
reduction of 67%, with total power savings of 5% [25].
The main drawback of the stacked VDD technique is that the current balancing between 
the logic stacks is critical to ensure the charge from higher stacks is efficiently recycled in the 
lower domain stacks. A current monitoring mechanism that regulates the current consumption 
across the stacked domains would dissipate power and might not be a viable option for all mixed 
signal systems. Voltage-level converters that are necessary to communicate between the different 
power domains would also consume power. Additionally, in order to avoid severe body-effect or 
possible device breakdown, the body connections of NMOS transistors in the upper voltage 
domains need to be isolated and tied to their respective ground reference voltages. So, silicon-on-
insulator (SOI) or triple-well processes would be essential for logic domain (VDD) stacking to 
work.  
2.4.3.3 Charge-pump based charge-recycling Schemes 
A charge-pump based recycling scheme using virtual supply and virtual ground has been 
published by Manne et al. [26]. This scheme works on the principle of collecting the ground-
bound charge during the discharge of load capacitance. Once the virtual ground reaches a 
threshold voltage, the charge is boosted by a charge pump to VDD and is then used as a supply 
voltage. Figure 2.6 shows the architecture of this recycling scheme.  
To improve the energy efficiency, the charge pump uses adiabatic charge boosting 
techniques where the charge transfer to the charge-pump's output capacitor happens in voltage 
increments. This energy conscious charge-pump action presents an inherent delay to boost the 
virtual VDD node to the required supply voltage. This delay is overlapped with the computational 
delays within the digital system. Hence, the application of this method is constrained to digital 
systems with considerable computation time and pipelined operation. Also, the generation of 
control signals, the threshold voltages and the incremental voltages for adiabatic charge-pump 
action would consume finite power that would reduce the system's energy efficiency. This 
design was implemented in digital signal processing (DSP) circuits to reduce the energy 
consumption by as much as 18% (9.9% on average) without perceptible loss in performance 
[26]. 
A modified version of this recycling scheme that reduces the power required to generate 
the control signals for monitoring the various virtual nodes was published by Keung et al. [28]. A 
single-stage charge pump along with a DC voltage source boosts the recycled charge to VDD.  
Figure 2.7 presents the conceptual schematic of the memory slice along with the virtual ground 
modules. This system uses two time-multiplexed charge accumulation capacitors to recycle 
charge. Once the capacitor reaches a fixed threshold voltage, it is disconnected from the 
“producer” slice and the stored voltage is pumped up to the required power supply voltage of the 












2.5 Conclusion  
The literature review is not a comprehensive discussion of all the reported low-power 
digital design techniques but includes design methodologies relevant to this work. The above 
discussed techniques have been targeted for large computational digital systems such as 
microprocessors, DSP chips, memories, etc. For a small mixed-signal system, such as an ADC 
where digital control circuits consume an appreciable amount of power, techniques that integrate 
the advantages of these low-power design techniques are required to accomplish energy-efficient 
computations. Also, a low-power technique that achieves energy-efficient, low-cost operation 
across a wide operating range offers a significant advantage. Hence, this research employs 
continuous voltage scaling as well as charge-recycling methodologies as a means of lowering 
power consumption in mixed-signal systems.   
Table 2.1 presents a comparison of the state-of-the-art charge-recycling based 
methodologies that reduce the power dissipated in digital systems.  While the vertical stacking of 
digital blocks [24] provides large power reduction, it is not conducive to mixed-signal systems 
where matching of current consumption might not be possible. While the single-stage 
charge-pump design [28] requires an external charge-pump clock, the three-stage charge-pump based 
voltage boosting design [26] also requires a considerable startup time that is masked within the 
 
 





logic paths. The literature study has thus directed this research effort to address these keys 




Table 2.1 Comparison of the state-of-the-art Charge-recycling based low power techniques 
Reference % Power Saving Features Comments 
[24] 
33 % (VDD stacks) 




[25] 5 % (dual VDD stacks) VDD stacks 
About 66 % reduction 
in noise 
[26] 10 % Adiabatic Charge pump 
External, multiple 




 Single-stage Charge pump 
External boost voltage 
required 
*
 Simulation results 
Chapter 3 Design and Analysis of the Proposed 
Charge Recycling Scheme 
 
This chapter is revised based on the paper published by Chandradevi Ulaganathan et al. 
[32]: 
C. Ulaganathan, C. L. Britton, Jr., J. Holleman, and B. J. Blalock, “A Novel Charge 
Recycling Approach to Low-Power Circuit Design,” Intl. Conference on Mixed Design of 
Integrated Circuits and Systems, pp. 208–213, May 2012. 
My primary contributions to this paper include (i) identification of the scope of research, 
(ii) compilation of a literature review of previously published work, (iii) design of the proposed 
solution, (iv) analysis of simulation results, and (v) preparation of the manuscript. 
The objectives of this research include: 
• To investigate the feasibility of charge-recycling (CR) based low-power design. 
• To propose a novel CR methodology that overcomes the limitations of the 
state-of-the-art CR designs.  
• To ensure that the proposed scheme is easily adaptable to existing digital 
circuits and independent of the process technology employed. 
3.1 Introduction 
An important figure-of-merit in energy efficient systems is the minimization of energy 
per cycle of operation, ETOT.  ETOT is the total power-delay product, given by PTOT •Tclk, which is 
the power (PTOT) consumed to perform all required computations within one clock cycle (Tclk). 
The minimization of ETOT maximizes the energy efficiency [4]. In this section, the factors 
governing digital power consumption, and the path or propagation delay are examined to explore 
the feasibility of charge-recycling based energy reduction techniques.  
3.1.1 Switching or Dynamic Power dissipation 
As discussed in chapter 2, the power consumption in a digital circuit can be approximated 
as  
  $P_{total} = C_L V_{DD}^{2} f_{CLK} + I_{SC} V_{DD} + I_{Lkg} V_{DD}$  (3.1) 
The switching component of total power dissipation can be reduced by minimizing the load 
capacitance CL that is charged or discharged at each logic transition, or by reducing the logic 
voltage level (i.e. the power-supply voltage VDD), or by lowering the frequency of the logic 
transitions (i.e. frequency of operation fCLK). The impact of these design parameters on power 
reduction and their tradeoffs will be examined in this section.  
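As a minimal numerical sketch of (3.1), the following Python snippet evaluates the switching, short-circuit, and leakage components for a single node. The function name and every component value (load capacitance, currents, clock frequency) are hypothetical placeholders chosen for illustration and are not taken from this work.

```python
def total_power(c_load, v_dd, f_clk, i_sc=0.0, i_lkg=0.0):
    """Evaluate (3.1); returns (switching, short_circuit, leakage, total) in watts."""
    p_switch = c_load * v_dd ** 2 * f_clk   # C_L * V_DD^2 * f_CLK
    p_sc = i_sc * v_dd                      # I_SC * V_DD
    p_lkg = i_lkg * v_dd                    # I_Lkg * V_DD
    return p_switch, p_sc, p_lkg, p_switch + p_sc + p_lkg


# Hypothetical node: 10 fF load, 200 MHz clock, 50 nA short-circuit
# current and 10 nA leakage current (illustrative values only).
nominal = total_power(10e-15, 1.2, 200e6, i_sc=50e-9, i_lkg=10e-9)
scaled = total_power(10e-15, 0.6, 200e6, i_sc=50e-9, i_lkg=10e-9)

# Halving V_DD cuts the switching term to one quarter.
print("switching power ratio:", scaled[0] / nominal[0])
```

In this simplified sketch the currents are held constant when the supply is halved; in a real circuit both the short-circuit and leakage currents also change with VDD.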
As evident in (3.1), the node capacitance (CL) and VDD determine the amount of charge 
drawn from the power supply for each digital transition. Hence, minimization of CL helps in 
reducing the power consumption. The load capacitance encompasses the diffusion and gate-drain 
(Miller) capacitances at the driver output, gate capacitance of the load, interconnect (wire) 
capacitance, and any external capacitance connected to that node. The device capacitances are in 
turn determined by the transistor sizing that is controlled by the drive-strength requirement (i.e. 
fanout) of the path, power and area specifications. Minimum-sized gates have the lowest 
diffusion and gate capacitances, and thus the least CL, and are therefore suitable for low-power, area-efficient 
designs. However, larger PMOS ratios are often essential to have symmetric rise and fall 
transitions, and good noise margin for robust circuit operation. The general design methodology 
is to optimize the sizing ratio of the transistors in order to achieve a given performance, area, 
energy or power specification [29].  
A CMOS buffer is used to illustrate how device sizing affects the power consumption and 






















t  (3.2) 
where CL is the load capacitance at the output node and ReqN, ReqP are the equivalent “ON” 
resistances of the NMOS and PMOS transistors. The load capacitance at the output is given by 
[29] 
  $C_L = C_{dn1} + \beta C_{dn1} + C_{gn2} + \beta C_{gn2} + C_w$  (3.3) 
where β is the sizing ratio of the PMOS to NMOS transistors, Cdn1 includes the diffusion and 
gate-drain Miller capacitance of the driver, Cgn2 is the gate capacitance of the load, and Cw is the 

























R  (3.4) 
where IDSAT is the saturation current, K’ is the process dependent transconductance parameter, 
VGS is the gate-to-source voltage, and VDSAT is the drain-to-source saturation voltage. From (3.4), 
the device sizing ratio, β, that achieves symmetric logic transitions at the output (i.e. tpHL = tpLH) 









 .  
Substituting (3.3) and (3.4) into (3.2) results in propagation delay of 

























169.0 21  (3.5) 
The optimum sizing ratio (βopt) for best performance (minimum tpd) can be found by equating 






















  (3.6) 
The βopt for the 90-nm process is simulated to be close to 3.6. This value represents the ratio of 
PMOS to NMOS transistor-sizing that is required for the best performance (i.e. least delay), and 
to have equal rise and fall times i.e. equal tpLH and tpHL values. Increasing the size beyond βopt 
would increase the strength of PMOS to reduce the tpLH, but would also increase the load 
capacitance to be driven by the NMOS and thus increase the tpHL. Therefore, beyond the 
calculated βopt, the increase in load capacitance would dictate an increase in the propagation 
delay due to self-loading. Additionally, if the wire capacitance is significant compared to the 
device capacitances, a larger value of βopt would result.  
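The dependence of the optimum ratio on wire capacitance can be illustrated with a short numerical sketch. It assumes the common first-order textbook inverter delay model, t_pd = 0.345·((1+β)(Cdn1+Cgn2)+Cw)·ReqN·(1+r/β), where r is the resistance ratio of identically sized PMOS and NMOS devices. That model, the function names, and all numeric values are assumptions made for illustration and may differ in detail from the expressions used in this work.

```python
import math


def t_pd(beta, c_dn1, c_gn2, c_w, r_eqn, r):
    """Assumed first-order delay model; beta is the PMOS-to-NMOS sizing ratio."""
    c_load = (1 + beta) * (c_dn1 + c_gn2) + c_w
    return 0.345 * c_load * r_eqn * (1 + r / beta)


def beta_opt_numeric(c_dn1, c_gn2, c_w, r_eqn, r, steps=100000):
    """Grid search for the beta in [0.5, 10] that minimizes the delay."""
    betas = [0.5 + 9.5 * i / steps for i in range(steps + 1)]
    return min(betas, key=lambda b: t_pd(b, c_dn1, c_gn2, c_w, r_eqn, r))


def beta_opt_analytic(c_dn1, c_gn2, c_w, r):
    """Closed-form minimum of the assumed model (set d t_pd / d beta = 0)."""
    return math.sqrt(r * (1 + c_w / (c_dn1 + c_gn2)))


# Hypothetical device values: 1 fF drain, 2 fF gate, 10 kOhm NMOS, r = 3.
no_wire = beta_opt_numeric(1e-15, 2e-15, 0.0, 10e3, 3.0)
with_wire = beta_opt_numeric(1e-15, 2e-15, 3e-15, 10e3, 3.0)
print("beta_opt without wire load:", round(no_wire, 3))
print("beta_opt with 3 fF wire load:", round(with_wire, 3))
```

Under these assumptions the grid search agrees with the closed form, and adding wire capacitance moves the minimum-delay ratio upward, consistent with the trend described above.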
For ratios just below βopt, the performance is slightly degraded, but the circuit's power 
consumption is reduced due to lower CL. The optimal transistor sizing for minimum energy 
consumption can be found for a required performance by following an approach similar to that of 
sizing for performance. The normalized energy of the inverter with respect to a reference circuit 

































where β is the sizing ratio and F is the overall effective fanout of the circuit, which is the ratio of 
external load capacitance to gate capacitance of the node. From (3.7), it can be inferred that the 
energy consumption can be reduced by device sizing and power supply voltage reduction for F 
greater than 1. The device sizing ratio βopt,energy is smaller than the βopt,performance [29].  
The optimum fanout (F) for best performance (minimum delay) is technology dependent 
and is approximately 2.7 to 5. Using an optimum fanout ensures that the path delay is within 5% 
of the minimum delay [31]. Hence the buffer used to study the power-performance tradeoff is 
designed with a fanout-of-4 (FO4) load at each stage. Four FO4 inverter stages are used as 
shown in Figure 3.1. The first inverter is used to shape the input signal, the second and third 
inverters form the buffer under study, while the last inverter is used to ensure a FO4-load for the 
buffer output. Further, load capacitor CL is placed at the output such that it presents a fanout-of-4 
 
 
Figure 3.1 Block level schematic of FO4 inverter stages 
 
load to the last stage and thus minimizes the Miller effect from the gate-drain capacitances of the last 
stage to the node N3. Although the transistor sizing ratio of 3.6 that corresponds to βopt,performance 
is not power efficient, it serves as a good case study to illustrate the design tradeoffs. The effect 
of supply voltage reduction on power and performance characteristics of the FO4 buffer stages is 
analyzed at a frequency of 200 MHz.  
Figure 3.2 shows the achievable reduction in power consumption by operating at lower 
supply voltages. Synopsys's NanoSim tool was used to simulate the FO4 inverter stages. As seen 
in the figure, the dynamic power, i.e. the switching power component, dominates the total power 
consumption. The total wasted power, which is contributed by the dynamic short-circuit current 
and the static leakage currents, is an insignificant part of the total power consumption. Lowering the 
supply voltage results in a quadratic reduction in total power consumption since the circuit nodes 
charge and discharge to a lower VDD voltage. Notice that for power supply voltages below 0.6 V, 
there is a sharp increase in the total wasted power along with a large drop in the total power 
consumption of the inverter. A VDD below 0.6 V marks the transition into the subthreshold 
region of operation where the devices operate with lower peak currents and large propagation 
 
 
Figure 3.2 NanoSim simulation results of power consumption across VDD variation in the FO4 buffer 
delays. Thus, there is an increase in the dynamic short-circuit current component of the total 
wasted power and a sharp reduction of the total switching component of power. At VDD of 0.5 V, 
the switching component of power is still the dominant component of the total power 
consumption. Further reduction in the VDD would result in deep subthreshold operation where the 
total wasted power becomes comparable to the total switching component of power 
consumption. 
Power reduction by lowering VDD comes with the penalty of degradation in circuit 
performance due to increased path delay. The propagation delay of the buffer increases due to 
the increase in “ON” resistance of the transistors as a result of lower IDSAT as shown in (3.4).  
Figure 3.3 presents the simulation results of normalized power, normalized delay and power-
delay product of the FO4 buffer at different VDD voltages. As evident in Figure 3.3, the 
normalized power and the energy per computation can be considerably reduced, while the 
normalized delay steeply increases with reduction in the circuit's power supply voltage. From 
(3.2), one approach to lower the propagation delay is to minimize CL by using the smallest device 
 
 
Figure 3.3 Normalized power, delay and power-delay product across VDD variation in the FO4 buffer 
 
size (βopt) possible, and to reduce the “ON” resistance (Req) by increasing the W/L ratio. 
However, the degree of performance improvement gained by increasing W/L is very small, since 
beyond the optimum device sizing, the device capacitances cause self-loading to increase the 
propagation delay. Hence, for achieving low-power operation, lowering the VDD presents a 
realistic approach only if the performance requirements can be met. 
The normalized power-delay product (PDP) and the corresponding energy-delay product 
(EDP) quantify the operational efficiency of the circuit with variation in VDD (Section 2.4.1, 
Figure 2.2). Further, the PDP or EDP curves exhibit a minimum at a VDD close to 0.8 V, which is 
slightly more than 2VTH of the devices. This VDD offers the optimal point of operation if energy 
and delay are given equal importance in this DUT (buffer).  
The optimal energy-delay curve is derived for a system based on a set of design 
parameters such as activity level, transistor size, and VTH levels. Thus, any change in these design 
parameters would cause a shift in the energy-delay curve [8]. Note that using a different device 
sizing ratio would shift the normalized power and delay values but the circuit would still exhibit 
a similar power-delay tradeoff. Therefore, the PDP or EDP curve can be employed to arrive at 
the minimum possible energy consumption to meet a given speed (delay) requirement. Similarly, 
for a given energy constraint, the maximum possible performance can be deduced. Thus, for a 
given performance requirement, the reduction of VDD along with optimum device sizing presents 
an attractive but challenging design path to minimize the circuit's energy consumption.  
3.1.2 Optimum power supply voltage 
The optimum power supply voltage for a circuit depends on various factors such as 
performance, power consumption, and reliability requirements. Operation at the nominal VDD 
offers the best possible performance with good timing margins in the logic paths, but the power 
consumption is non-optimal. Hence, for a given performance requirement, the power supply 
voltage that guarantees optimum operating point based on the energy-delay product can be 








  (3.8) 
where tp is the minimum propagation delay which defines the maximum frequency of operation. 
For short channel devices operating in velocity saturation, the propagation delay can be 























The optimum supply voltage can be derived from (3.10) by differentiating it with respect to VDD 




, DSATTHoptDD VVV   (3.11) 
Hence, operating at a reduced supply voltage of VDD,opt results in the optimum operating point 
that corresponds to minimum energy expended to operate at the required performance (minimum 
EDP). 
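The location of such an optimum can be illustrated numerically. The sketch below assumes a textbook first-order model for velocity-saturated devices in which the delay scales as VDD/(VDD − VTE) and the switching energy as CL·VDD². The effective threshold `v_te`, the function names, and the numeric values are illustrative assumptions and are not taken from this work. Under that assumed model the energy-delay product is proportional to VDD³/(VDD − VTE) and is minimized at 1.5·VTE.

```python
def edp(v_dd, v_te, c_load=1.0, k=1.0):
    """Energy-delay product under the assumed first-order model (arbitrary units)."""
    energy = c_load * v_dd ** 2                  # switching energy per transition
    delay = k * c_load * v_dd / (v_dd - v_te)    # velocity-saturated delay model
    return energy * delay


def optimum_vdd(v_te, v_max=1.2, steps=100000):
    """Grid search for the supply voltage that minimizes the EDP."""
    v_min = v_te + 0.01                          # stay above the effective threshold
    grid = [v_min + (v_max - v_min) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda v: edp(v, v_te))


v_te = 0.5                                       # hypothetical effective threshold (V)
v_opt = optimum_vdd(v_te)
print("numerical optimum VDD:", round(v_opt, 3), "V")
print("1.5 * v_te           :", 1.5 * v_te, "V")
```

The grid search is used instead of the closed form so that a different delay model can be substituted in `edp` without changing the rest of the sketch.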
3.1.3 Feasibility of Charge Recycling 
The power-performance analysis of the FO4 buffer demonstrates the fact that the power 
consumption of digital circuits, operating at nominal power-supply voltage of the process, is 
dominated by the switching current required to charge and discharge the internal load or gate 
capacitances. Hence, reducing the switching component would decrease the power consumption 
in the circuit. The charge required by a node for every digital transition from logic LOW (i.e. 
ground) to logic HIGH (i.e. VDD), or logic HIGH to LOW is given by  
  $Q_{S/W} = C_L V_{DD}$  (3.12) 
where CL is the load capacitance at the switching node. Thus for each clock cycle, charge QS/W is 
either drawn from the VDD supply to increase a node voltage to logic HIGH, or discharged from a 
node to ground to reduce that node voltage to logic LOW. The transition to logic LOW, which 
results in the loss of charge QS/W to the ground, presents an opportunity to recycle this ground-
bound charge. The readily available ground-bound switching-charge along with the possibility of 
energy-efficient operation at lower VDD voltages inspires charge-recycling in digital circuits. The 
design of the proposed charge-recycling scheme to reduce power consumption in medium-speed 
digital circuits will be discussed in the next section. 
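To give a feel for the amount of charge involved, the following sketch evaluates (3.12) for a block with a lumped switched capacitance and converts it to an average ground-bound current. The lumped capacitance, the `activity` factor, and the clock frequency are hypothetical values introduced only for illustration.

```python
def switching_charge(c_load, v_dd):
    """Charge moved per logic transition, Q_S/W = C_L * V_DD, in coulombs."""
    return c_load * v_dd


def ground_bound_current(c_load, v_dd, f_clk, activity=1.0):
    """Average current dumped to ground, assuming each active node discharges
    Q_S/W once per clock cycle (activity scales the switched fraction)."""
    return activity * switching_charge(c_load, v_dd) * f_clk


# Hypothetical block: 1 pF of switched capacitance, 1.2 V supply, 25 MHz clock.
q_sw = switching_charge(1e-12, 1.2)
i_gnd = ground_bound_current(1e-12, 1.2, 25e6)
print("charge per transition (C):", q_sw)
print("average ground-bound current (A):", i_gnd)
```

This average current is the quantity that a storage capacitor on the ground path could, in principle, intercept.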
3.2 Proposed Charge-Recycling Methodology 
This research focuses on lowering power consumption in digital circuits by recycling 
charge from the ground-bound switching currents. This is achieved by actively accumulating the 
ground-bound charge using a storage capacitor bank. In this procedure, the available power 
supply level to the circuit is lowered. If the reduced supply voltage is monitored to stay close to 
the optimum VDD that is required to realize a given performance, then the system's energy 
efficiency would be considerably improved. The advantage of this approach is that energy 
efficiency and power consumption are improved by both charge recycling (CR) and 
supply-voltage reduction. Furthermore, adopting the CR scheme to provide the power 
supply eliminates the need to use voltage regulators to generate the required multiple VDD levels. 
A block level schematic of the proposed charge-recycling scheme [32] is shown in Figure 
3.4. The source logic block and the target logic block could be different components of a digital 
system or the same logic circuit. Identification of potential blocks in a system is essential to 
maximize charge-recycling and power reduction. The choice of the source block is governed by 
the rate of switching activity in the logic and also by the ease with which the system can be 
separated into functional blocks. Also, it is critical to ensure that the required circuit performance 
is achieved with the reduced supply voltage resulting from charge-recycling. 
The charge-recycling scheme operates in two-phased cycles, namely charge-
accumulation phase and charge-recycling phase. During the charge-accumulation phase the 
source logic is powered up from VDD to a virtual ground (VVGND) to enable the collection of 
ground-bound charge by the charge-recycling or storage (CR) capacitor bank connected to the 
VVGND node. The VVGND node is monitored to ensure the functionality of the circuit is not 
degraded by the continuous reduction in the available power supply voltage. Once the VVGND 
node reaches a predetermined reference voltage, VRG, the CR control signals initiate the charge-
recycling phase.  
The charge-recycling phase begins with the boosting of the voltage across storage 
capacitors to the virtual VDD (VVDD) level. This is achieved by vertically stacking the capacitors 
that are charged to VVGND in order to generate the required VVDD. The target logic block is then 
powered up from VVDD to ground, VGND. Similar to the charge-accumulation phase, the VVDD node 
is monitored to prevent the supply voltage from discharging below a reference voltage level, VRD, 
to ensure the circuit's performance. 
The proposed architecture offers the advantage of power reduction without any design 
change to pre-existing digital circuits. An added advantage of power supply scaling by 
employing virtual supply nodes is the reduction in the leakage currents. This reduction is the 
result of increased threshold voltages due to body biasing, and also lower drain-to-source 
voltages that cause a reduction in the Drain-Induced Barrier Lowering (DIBL) component of 
leakage current [26]. The peak short-circuit current is also reduced with lower supply voltages. 
Additionally, the charge-recycling capacitors also accumulate charge from short-circuit currents 
 
 
Figure 3.4 Block level schematic of the charge-recycling scheme 
 
and leakage currents. Thus, the proposed charge-recycling scheme lowers the power 
consumption due to dynamic as well as static power dissipation components in (3.1) by 
dynamically varying the supply voltage, and recycling the ground-bound charge to further reduce 
the energy consumption in digital circuits.   
3.3 Design of the proposed charge-recycling system 
The design of the CR system and the generation of its control signals are discussed in 
this section. 
3.3.1 Charge-Recycling Process – Design and Control 
The schematic during the charge-accumulation phase is shown in Figure 3.5. A low-power 
comparator generates the control signals to enable the charging of the capacitors until the VVGND 
node reaches the maximum allowed voltage, VRG. VRG is determined such that the circuit's 
 
Figure 3.5 Schematic illustrating operation during charge-accumulation phase 
 
performance requirements are met at the reduced virtual supply voltage, even in the presence of 
comparator offset. This voltage in turn sets the number of CR capacitors (N) required to obtain 
VVDD. For this implementation, a moderate value of VDD/3 is chosen such that three equal-sized 
CR capacitors (N=3) are placed in series to provide the virtual supply VVDD.  
When VVGND reaches the fixed voltage VRG, the comparator output, ϕ, goes high, the 
switches (S2-S8) connecting the CR capacitors to VVGND are opened, and the circuit is directly 
connected to VGND through S1. This configuration defines the beginning of the charge-recycling 
phase. Since the switches connect from ground to a sufficiently low voltage, NMOS-only 
switches have been employed for switches S1 to S8. 
Figure 3.6 illustrates the switch settings of the capacitive stack necessary to establish the 
VVDD voltage. Switches S9 to S11 connect the top and bottom plates of adjacent capacitors. 
 
 
Figure 3.6 Schematic illustrating operation during charge-recycling phase 
 
Transmission gate switches are used here to provide low resistance as the voltages change due to 
the transfer of charge from VVDD to sustain the target circuit's operation. The VDD switches, S12 
and S13, are PMOS-only versions. Once the comparator detects the VVDD voltage to be less than 
or equal to VRD, a reset pulse is generated and the circuit reverts to the VDD power rail. Also, the 
CR capacitors are returned to the accumulation phase configuration, as in Figure 3.5. The end of 
the charge-recycling phase initiates the charge-accumulation phase and the cycle repeats. The 
value of VRD is set between 2VDD/3 and 3VDD/4. 
The first cycle of the charge-accumulation phase consumes more time when compared to 
subsequent cycles. This is due to the initial need to charge from ground to VRG while the following 
cycles only need an incremental charge from VRD/N to VRG. The system repeats the charge-
accumulation and charge-recycling phases as long as the clock to the source block is enabled, i.e. 
until the source block goes to idle mode or is powered OFF.  
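The two-phase behavior described above can be sketched as a simple behavioral loop. The model below is a deliberate simplification: it assumes the source block delivers, and the target block draws, a fixed charge per clock cycle, and it ignores switch resistance, comparator offset, and leakage. All names and numeric values are hypothetical and serve only to illustrate why the first accumulation phase is longer than the later ones.

```python
def simulate_cr(n_caps, c_each, v_rg, v_rd, dq_source, dq_target, n_rounds=3):
    """Return a list of (accumulation_cycles, recycling_cycles), one per round."""
    v_cap = 0.0                       # voltage on each CR capacitor, initially empty
    history = []
    for _ in range(n_rounds):
        # Charge-accumulation phase: capacitors sit in parallel on VVGND.
        acc = 0
        while v_cap < v_rg:
            v_cap += dq_source / (n_caps * c_each)
            acc += 1
        # Charge-recycling phase: capacitors are stacked in series to form VVDD.
        v_vdd = n_caps * v_cap
        rec = 0
        while v_vdd > v_rd:
            v_vdd -= dq_target / (c_each / n_caps)   # series stack capacitance
            rec += 1
        # Back to the parallel configuration; each capacitor keeps VVDD / N.
        v_cap = v_vdd / n_caps
        history.append((acc, rec))
    return history


# Hypothetical setup: N = 3, 30 pF capacitors, VRG = 0.4 V, VRD = 0.8 V.
history = simulate_cr(3, 30e-12, 0.4, 0.8, dq_source=1.2e-12, dq_target=0.2e-12)
for i, (acc, rec) in enumerate(history, start=1):
    print(f"round {i}: {acc} accumulation cycles, {rec} recycling cycles")
```

In this sketch the first round starts from empty capacitors, while later rounds start from roughly VRD/N, so they need fewer source cycles to reach VRG.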
3.3.2 Estimation of the Charge-Recycling Capacitor Size 
The charge-recycling capacitors need to be large enough to supply the energy required by 
the target logic blocks and maintain a virtual supply voltage of more than VRD to deliver power to 
the target block. Conventional CMOS logic cells consume energy of CL·VDD² to charge a load 
capacitor CL to VDD. A worst case estimate of the required energy can be given by CL·VDD² times 
the number of PMOS transistors in the target design [26], [33]. So the CR capacitors that form the 
virtual VDD need to be large enough to sustain the worst case energy requirements of the target 
logic. 
Another design parameter that needs to be studied for the capacitor sizing is the frequency 
of charge-accumulation and charge-recycling phases versus the power-reduction efficiency of the 
proposed scheme. The frequency of the recycling phases depends on the energy available for 
reuse in the CR capacitors as well as the activity factor and the power consumption of the source 
and target blocks. Since the control logic would also dissipate energy for generating the control 
signals, operating at a high cycle rate would mean more energy dissipation and lower energy 
efficiency. However, increasing the accumulation-recycle time implies large CR capacitors and 
longer charging times. Hence, an optimum value of CR capacitors and accumulation-recycle time 
is chosen to permit maximum energy efficiency. 
A reasonable estimate of the average power consumed by the digital cell is used to 
determine the effective size of the CR capacitors. Since the topology employs stacking of CR 
capacitors, the effective capacitance decreases with the number of stacks, N. Hence, the individual 
capacitors are sized by N times the required value. The price paid for eliminating charge-pump 
induced delay (as in [26]) between the accumulation and recycle phase is an increase in chip area 
needed for the CR capacitors. 
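One way to turn an average-power estimate into a capacitor value is sketched below. It assumes that the stacked capacitors must supply the target block's energy over one recycling interval while the virtual supply droops from N·VRG to VRD, so that ½·C_eff·((N·VRG)² − VRD²) equals the required energy. This energy-balance criterion, the function names, and the numbers are assumptions made for illustration; the sizing procedure actually used may differ.

```python
def effective_capacitance(p_avg, t_recycle, n_caps, v_rg, v_rd):
    """Stack capacitance needed to deliver p_avg for t_recycle while the
    virtual supply droops from n_caps * v_rg down to v_rd."""
    energy = p_avg * t_recycle
    v_start = n_caps * v_rg
    return 2.0 * energy / (v_start ** 2 - v_rd ** 2)


def individual_capacitance(c_eff, n_caps):
    """Series stacking divides capacitance by N, so each capacitor is N * c_eff."""
    return n_caps * c_eff


# Hypothetical target: 10 uW average power, 1 us recycling interval,
# N = 3, VRG = 0.4 V, VRD = 0.8 V.
c_eff = effective_capacitance(10e-6, 1e-6, 3, 0.4, 0.8)
c_individual = individual_capacitance(c_eff, 3)
print("effective stack capacitance (F):", c_eff)
print("individual CR capacitor (F):", c_individual)
```

The factor of N between the two results reflects the area cost of stacking noted above.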
3.3.3 Low-Power Comparator 
The low-power dynamic comparator used in this design is a regenerative latch that is 
commonly used as a sense amplifier in SRAM cells [29], [30]. The schematic of the high-speed, 
low-power comparator is presented in Figure 3.7. The positive regenerative action of the cross-
coupled inverter pairs provides the high-gain necessary for accurate comparison.  
The control signals ϕ1 and ϕ1b are derived from the input clock of the source logic 
block, thus enabling the comparator to run at the same frequency as the source digital block. 
During the sample/reset phase, ϕ1 is low and transistors MBN, MBP are OFF, thus disabling the 
latch. In this phase input switches S1 and S2 are closed and the inputs (VVGND and VRG for virtual 
GND comparison, VVDD and VRD for monitoring virtual VDD) are sampled at the high impedance 
nodes V1 and V2. Next, in the regeneration/active phase, ϕ1 goes high to activate the regenerative 
action and to open the input switches to isolate the nodes V1 and V2 from the input. The voltage 
difference between the sampled input signals is amplified and the output is available at the V1 and 
V2 nodes.  
In order to ensure low power dissipation in the nano-Watt range, the transistors MBN and 
MBP are sized to limit the maximum current available for regeneration. Further, the cross-coupled 
input pairs are sized with minimum lengths to keep the power consumption as low as possible. 
This design path results in random mismatch between the input pairs and thus higher input offset 
voltage. Two comparators are employed in the system to monitor the virtual ground and virtual 
VDD voltages. The outputs from the comparators are used to generate the control signals to switch 
between the charge-accumulation and charge-recycle phases. 
 
Monte-Carlo simulations with three-sigma variations were performed to study the effect of 
matching and process variations on the comparators' offset voltage. The histograms showing the 
offset distribution for the virtual ground and virtual VDD comparators, across 40 Monte-Carlo 
runs, are presented in Figure 3.8. Though the comparators exhibit a large offset voltage change 
across process and mismatch variations, the accuracy of the comparators is not as critical as their 
speed and power consumption. The power consumption needs to be kept low in order to limit 
total power dissipated by the control logic, and also to minimize the charge drained from the 
virtual supply nodes during the comparison. The simulated average power consumption of the 
comparator is less than 25 nW at 25 MHz clock frequency. Further, there is no quiescent current 
consumption with this topology. Hence, with this comparator implementation, accuracy is traded 
off in order to achieve lower power consumption and higher speed of operation.  
Analysis of Leakage Paths in the comparator 
The existence of leakage paths in the comparator needs to be checked in order to minimize 
undesirable effects of current flow and charge sharing between the power supply voltage (VDD), 
reference voltage (VRG or VRD), and the virtual supply (VVGND or VVDD) nodes. The leakage current 
paths from VDD or VRG to VVGND would increase the VVGND and thus the charge accumulated by 
 




the CR capacitors would not be entirely from the ground-bound charge. In the virtual VDD 
comparator, leakage from VVDD to VRD or ground (VSS) would result in additional losses and 
reduce the efficiency of the system. Since the existence of leakage currents impacts the efficiency 
of the CR scheme, the comparators and the control signals were designed to eliminate the leakage 
paths.  
High threshold voltage (HVT) devices, with threshold voltages of –490 mV for PMOS 
and 540 mV for NMOS, were employed to keep the subthreshold leakage low. Further, the body 
connections of the input devices were connected to VDD (PMOS) and VSS (NMOS), so as to 
exploit body-effect to increase the threshold voltages. In order to operate at high-speeds, the 
comparator has very small devices with minimum gate lengths and thereby reduces the amount of 
parasitic capacitance within the comparator.  
Figure 3.9 presents the virtual ground comparator with leakage paths that if present would 
corrupt the virtual ground node. The path P1 represents the leakage path from VRG through MN1 
and MN2 to VVGND while path P2 is the corresponding path from VRG through PMOS devices to 
VVGND. Since the maximum reference voltage is below 350 mV, the NMOS devices are always 
biased below their threshold voltages during the sampling phase. Further, as illustrated in Figure 
 
(a) (b) 
Figure 3.8 Histogram of comparator's offset voltage distribution from Monte-Carlo simulations  
(a) Virtual Ground comparator (b) Virtual VDD comparator 
3.9, when VVGND is close to ground, MN1 will be OFF. For the case when VVGND is close to VRG 
(330 mV), the leakage current through P1 cannot flow to VVGND since the common source node 
cannot be greater than 0.33 V.  Next, considering the path P2 with PMOS devices, the worst case 
leakage scenario is when VVGND is close to ground. The VGS of MP1 can be close to threshold 
voltage but VGS of MP2 is well below threshold. Hence, leakage from reference voltage to virtual 
ground through paths P1 and P2 is not possible with this setup. Employing HVT devices also 
minimizes the short circuit current flow from VDD to VVGND and VDD to VSS during the transition 
from sampling to regenerative phase.  
The analysis for the virtual VDD comparator is similar to that for the virtual ground comparator.
The minimum reference voltage of close to 0.7 V is well below the threshold voltage of the
PMOS devices while close to those of NMOS devices. However, there cannot be any leakage
current path through the NMOS devices since one of them will have a very low VGS. Further,
during the transition from the sampling to the regenerative phase, there can be a very small amount of
short-circuit current flowing from the inputs to the ground. The rise and fall times of the
transitions are on the order of a few tens of picoseconds and the peak current is severely limited by
the high threshold voltages; thus the short-circuit current does not cause any appreciable change at
the virtual supply node.

Figure 3.9 Comparator schematic illustrating the leakage paths to be investigated
3.4 Analysis of Energy Consumption in the Charge-Recycling Scheme 
To evaluate the efficiency of the system, it is important to analyze the energy consumption 
and energy savings in the charge-recycling scheme. The total energy EIN stored by the CR 
capacitors at the end of the accumulation phase is given by  
$$E_{IN} = \tfrac{1}{2}\, N\,(N \cdot C_{CR})\, V_{RG}^{2} \qquad (3.13)$$
where N·CCR is the individual charge-recycling capacitance, VRG is the maximum voltage at the 
virtual ground and N is the number of CR capacitors in parallel (VDD = N·VRG). Without the CR 
scheme, this energy EIN would have been lost as discharge to ground.  
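
As a quick numerical illustration of (3.13), the following Python sketch evaluates the energy stored by a bank of N capacitors, each of value N·CCR, charged in parallel to VRG. The function name and the example values (N = 3, CCR = 1 pF, VDD = 1 V) are assumptions chosen for illustration and are not taken from the designs described in this chapter.

```python
def stored_energy(n_stages, c_cr, v_rg):
    """Energy (J) held by n_stages capacitors of n_stages*c_cr each, all charged to v_rg."""
    c_individual = n_stages * c_cr          # size of one CR capacitor
    c_total = n_stages * c_individual       # parallel bank seen from the virtual ground
    return 0.5 * c_total * v_rg ** 2

# Hypothetical example values (not from the text): N = 3, C_CR = 1 pF, V_DD = 1 V.
N = 3
C_CR = 1e-12
V_DD = 1.0
V_RG = V_DD / N                             # V_DD = N * V_RG
E_IN = stored_energy(N, C_CR, V_RG)
print(f"E_IN = {E_IN * 1e12:.3f} pJ")
```

Since VDD = N·VRG, the expression reduces to ½·CCR·VDD², so for a fixed CCR the stored energy does not depend on the number of stages.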
Let ET,AVG be the average energy consumed per computation by the target block. During 
the charge-accumulation phase, ET,AVG is provided by VDD, whereas during the charge-recycling 
phase the CR capacitors furnish the required ET,AVG. As long as the available energy, EIN, is 
greater than the required, ET,AVG, the recycling system provides energy to the target block 
provided VVDD is above VRD. Thus, only a portion of the total recovered charge is used in each 
accumulation-recycle phase while the rest of the charge resides on the CR capacitors. Note that 
from the second cycle onwards, the CR capacitors need to recover only the difference in charge 





  (3.14) 
Let the energy consumed by the control logic in the CR scheme be given by EDISS which 
encompasses the energy dissipated by the clock to drive the switches, energy consumed by the 
comparators, control signal generation and also the resistive loss within the switches. The process-
related leakage currents and parasitic coupling to the substrate also add to EDISS. Note that ET,AVG 





 ,,  (3.15) 
where NT,RCYC is the number of clock cycles in the target logic during the charge-recycling phase. 
Thus, the system's energy efficiency is maximized by reducing EDISS. The percentage energy









E  (3.16) 









  (3.17) 
The total energy reduction in the system is the sum of energy saved in the source block as a 
result of dynamic voltage scaling during the charge-accumulation phase, and the energy saved in 
the target block (ESAVED) due to charge-recycling phase. 
3.4.1 Estimation of Virtual Power Supply Voltage levels and CR cycle 
To estimate the amount of energy savings using the CR scheme, the dynamic virtual 
power supply level is first examined. With this designed charge-recycling scheme of N stages, 
N·CCR is the size of individual charge-recycling capacitors. The maximum virtual ground voltage 
is set by the ground reference voltage of VRG (VRG = VDD/N), while the minimum virtual VDD 
voltage is the virtual VDD reference of VRD which is VDD–VRG. Figure 3.10 (a) and (b) illustrate the 
transient virtual VDD (VVDD) and virtual ground (VVGND) levels during the charge-accumulation 
(CA-phase) and charge-recycle phases (CR-phase). As seen in Figure 3.10 (b), the charge-
accumulation phase begins with an initial voltage of VRD/N since the recycle phase does not reuse 
the entire charge stored in the CR capacitors. The virtual supply level seen by the charge-
recycling circuit (source and target blocks are the same in this case) is shown in Figure 3.10 (c), 
which is the difference between VVDD and VVGND. The power delivered by the external power 
supply source is shown in Figure 3.10 (d). During the CR-phase, the power supply does not 
furnish any current to the circuit and so the power delivered is zero, neglecting the insignificant 





Figure 3.10 Virtual power supply voltage levels in the charge-recycling scheme
(a) virtual VDD (VVDD), (b) virtual ground (VVGND), (c) VVDD–VVGND, and
(d) power delivered by the external power supply
To estimate the energy saved in one cycle of operation, the time taken to collect ground-
bound charges (tCA) and the time (tCR) during which the target block is powered by the recycled 
charge are calculated. The rate of charging the CR capacitors during CA-phase depends on the 
activity factor, α, which is the probability of a HIGH-to-LOW (VDD-to-0) transition. Also, the 
size of charge-recycling capacitors relative to the effective load capacitance (CL) of the source 
block defines the increase in VVGND with each clock cycle (tCLK). Hence, the increase in VVGND 




















where VVGND(t) represents the virtual ground voltage at the clock event, and the CR capacitor 
bank has N stages of N·CCR capacitors connected in parallel from VVGND to VSS. Short-circuit and 
subthreshold leakage currents represent the other sources of charge to increase the VVGND. The 


















  (3.19) 
where CLKG,SC is the effective capacitance that represents the leakage currents. The total increase 
in the virtual ground voltage is the sum of (3.18) and (3.19). The loss of charge from the CR 
capacitor bank to charge the logic-ground (LOW) nodes within the source block from VVGND(t-1) 
to the current VVGND(t) value is very small. Further, in order to limit the analysis to the significant 
contributing factors, the charge lost from CR-bank to the internal ground nodes is not included. 
Thus, the increase in VVGND due to ground-bound switching and leakage current components after 
each clock cycle can be approximated as 
$$\Delta V_{VGND}(t) = (S + L)\,\bigl(V_{DD} - V_{VGND}(t)\bigr) = K\,\bigl(V_{DD} - V_{VGND}(t)\bigr) \qquad (3.20)$$
The virtual ground voltage can now be written as  















































Simplifying (3.22) gives
$$V_{VGND}(t+1) = (1-K)\,V_{VGND}(t) + K\,V_{DD} \qquad (3.23)$$
 
The boundary conditions can now be included to calculate the charge-accumulation time, 
tCA. During the first cycle of charge-accumulation, the initial VVGND is at VSS, while from the next 
cycle onwards the initial VVGND is at VRD/N. At time tCA, the final VVGND voltage is at VRG. So, the 




















The VVGND voltage at a time t can be visualized as being built up from small packets of charge flowing to the
CR bank at a period of tCLK. Thus, the charge-accumulation time comprises tCA/tCLK charge-flow
cycles, a number represented as 'a'. Using (3.23) and (3.24), VVGND can now be written as






















where the numbers 1 and 2 represent the number of clock periods, i.e. VVGND(2)=VVGND(2·tCLK).
The general equation for VVGND at the i-th clock period is given by













The solution for the geometric series in the second term can be integrated to result in  













The final boundary condition can be used to obtain tCA = a·tCLK as 























Thus, for a given number of charge-recycling capacitor stages (N), the time taken (a·tCLK) to
reach the virtual ground reference voltage (VRG) depends on K, which is the ratio of α·CL+CLKG,SC
to N²·CCR. For the very first CA-phase (with initial voltage at 0), the first terms in (3.29) and
(3.31) are zero; thus tCA0 is N times longer than in the subsequent cycles.
The charge-recycling time (tCR) can be derived using an approach similar to that used for tCA. The
initial VVDD voltage starts at VDD and subsequently decreases to VRD in time tCR that is equal to 
r·tCLK. The rate of discharging in CR-phase depends on the activity factor, κ which is the 
probability of a LOW-to-HIGH (0-to-VDD) transition. The relative sizes of charge-recycling 
capacitors and the effective load capacitance (CLT) of the target block define the reduction in 
VVDD with each clock cycle (tCLK). Hence, the decrease in VVDD due to LOW-to-HIGH transitions, 

















where the effective charge-recycling capacitance reduces to CCR due to vertical stacking of the 
capacitors in the CR-phase. The virtual supply voltage can now be written as  
  )1()1(  tVtVtV DDVDDVDD  (3.33) 






















The boundary conditions are given by 







)(;0  (3.35) 
The general equation for VVDD at the i-th clock period is now given by
$$V_{VDD}(i) = V_{DD}\,(1-B)^{i} \qquad (3.36)$$
The final boundary condition can be used to obtain tCR = r·tCLK as




















Thus, for a given number of charge-recycling capacitor stages (N), the time taken (r·tCLK) to 
reach the virtual supply reference voltage (VRD) depends on B which is the ratio of κ·CLT+CLKG,SC 
to CCR.  
Short-circuit and leakage current components are generally close to 20% of the switching
current consumption. However, with the effective supply voltage reduction (reduced DIBL) and
body effect, short-circuit and leakage currents are reduced. Hence, assuming an average of 10%
for the leakage factors results in CLKG,SC equal to 0.1·α·CL. Further, for the same source and target
blocks, the probabilities for HIGH-to-LOW and LOW-to-HIGH transitions, and the effective load
capacitances, can be assumed to be equal (α=κ and CL=CLT). Thus, equal amounts of charge are
accumulated in or recycled from the CR capacitor bank during each clock transition. Solving (3.31)
with N = 3, α = 0.5, and a capacitance ratio of CCR/CL = 50, the value of 'a' is 305 for the first
CA-phase, and 117 for the subsequent CA cycles. Solving (3.38) with the same set of values results
in the value of 'r' at 33. Thus, tCR (i.e. r) is approximately equal to tCA/N² for the first cycle, while
from the second cycle onwards the ratio is approximately 1/N. Figure 3.11 presents the duration of the
CR cycle, which is derived from (3.31) and (3.38), as a function of CCR/CL for different numbers of CR
capacitor stages (N). This plot provides insight into the design parameters of area, which is
defined by CCR/CL, and the amount of energy reduction, which is influenced by the duration of the
CR cycle (i.e. a+r).
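
The cycle counts can also be explored numerically. The Python sketch below steps the per-clock updates directly, assuming the virtual ground rises each clock period by K·(VDD − VVGND) with K = (α·CL+CLKG,SC)/(N²·CCR), and the virtual VDD falls by B·VVDD with B = (κ·CLT+CLKG,SC)/CCR. These update rules are our reading of the recurrences behind (3.23) and (3.36). The function names, the normalization VDD = 1 and CL = 1, and the stopping rule (first clock period at which the reference is reached) are assumptions. Because (3.31) and (3.38) are closed-form results, and because the counts are sensitive to the assumed leakage fraction, this iteration need not reproduce the quoted values of 'a' and 'r' exactly.

```python
def ca_cycles(k, v_start, v_rg, v_dd=1.0):
    """Clock periods for the virtual ground to rise from v_start to v_rg."""
    v, cycles = v_start, 0
    while v < v_rg:
        v += k * (v_dd - v)      # charge packet collected in one clock period
        cycles += 1
    return cycles

def cr_cycles(b, v_rd, v_dd=1.0):
    """Clock periods for the virtual VDD to fall from v_dd to v_rd."""
    v, cycles = v_dd, 0
    while v > v_rd:
        v -= b * v               # charge packet delivered in one clock period
        cycles += 1
    return cycles

# Values from the text: N = 3, alpha = 0.5, C_CR/C_L = 50, C_LKG,SC = 0.1*alpha*C_L.
N, alpha, ratio = 3, 0.5, 50.0
leak_fraction = 0.1              # leakage share of the switching charge
c_l = 1.0                        # normalized load capacitance (assumption)
c_cr = ratio * c_l
c_sw = alpha * c_l * (1.0 + leak_fraction)
K = c_sw / (N ** 2 * c_cr)
B = c_sw / c_cr                  # same block as source and target (alpha = kappa)

v_rg = 1.0 / N                   # virtual ground reference, V_DD / N
v_rd = 1.0 - v_rg                # virtual VDD reference, V_DD - V_RG

a_first = ca_cycles(K, 0.0, v_rg)
a_next = ca_cycles(K, v_rd / N, v_rg)
r = cr_cycles(B, v_rd)
print(a_first, a_next, r)
```

The ordering a_first > a_next > r, with a_next a few times larger than r, is the behaviour described above; changing `leak_fraction` or `ratio` shows how the phase durations scale with the capacitance ratio.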
These results can also be arrived at intuitively by examining the amount of voltage
change, per clock period, in the two phases. The total charge that is accumulated in the CA-phase is
$$Q_{CA} = N^{2} C_{CR}\left(V_{RG} - \frac{V_{RD}}{N}\right) = N\,C_{CR}\,V_{RG} \qquad (3.39)$$
The recycled charge amounts to
$$Q_{CR} = C_{CR}\,(V_{DD} - V_{RD}) = C_{CR}\,V_{RG} \qquad (3.40)$$
As evident from (3.39) and (3.40), with equal amounts of charge drawn per clock period, the
CA-phase would be N times longer than the CR-phase. This result is valid only when equal
amounts of charge flow during each clock period, which was assumed from using the same
source and target blocks with equal α·CL+CLKG,SC and κ·CLT+CLKG,SC. For different source and
target blocks, the ratios (α·CL+CLKG,SC)/(N²·CCR) and (κ·CLT+CLKG,SC)/CCR define the respective
times.
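
The N-to-1 ratio between the two phase durations follows from simple charge bookkeeping, which the short Python check below reproduces with exact fractions. It assumes that the CA-phase raises a parallel bank of total capacitance N²·CCR from VRD/N to VRG, and that the CR-phase lowers the stacked bank of effective capacitance CCR from VDD to VRD. The function name and the normalization VDD = 1 and CCR = 1 are assumptions made for illustration.

```python
from fractions import Fraction

def charge_ratio(n_stages):
    """Return Q_CA / Q_CR for an n_stages bank, with V_DD = 1 and C_CR = 1."""
    v_dd = Fraction(1)
    v_rg = v_dd / n_stages                 # virtual ground reference
    v_rd = v_dd - v_rg                     # virtual VDD reference
    q_ca = n_stages ** 2 * (v_rg - v_rd / n_stages)   # parallel bank, N^2 * C_CR
    q_cr = v_dd - v_rd                     # stacked bank, C_CR
    return q_ca / q_cr

for n in (2, 3, 4, 5):
    print(n, charge_ratio(n))
```

For every N the ratio equals N, so with equal charge drawn per clock period the accumulation phase needs N times as many clock periods as the recycle phase.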
3.4.2 Estimation of Energy Saved using CR Scheme 
The energy consumed by the CR system can be estimated from the instantaneous virtual 
supply voltage levels. The instantaneous power can be approximated as  
$$\text{Power}(t) = \gamma\, C_{L}\,\bigl(V_{VDD}(t) - V_{VGND}(t)\bigr)^{2} f_{CLK} \qquad (3.41)$$
where γ is the probability of a switching transition and fCLK is the frequency of operation. For
convenient reference, the instantaneous virtual supply and power levels at the circuit are included
again as Figure 3.12 (a) and (b). The circuit's power consumption follows the curve in Figure
3.12 (b), while the external power supply actually furnishes power only during the
charge-accumulation phase, as depicted in Figure 3.12 (c). The energy expended by the circuit for one
cycle of operation (T) can be approximated as
 
(a) (b) 
Figure 3.11 Duration of CR cycle as a function of CCR/CL across number of stages (N)  





Figure 3.12 Virtual power supply voltage levels and power consumption in the CR system
(a) VVDD–VVGND, (b) transient power consumption, γ·CL·(VVDD–VVGND)²·f, and
(c) power furnished by the external power supply












)(   (3.42) 
As seen in Figure 3.12 (b), the transient power is cyclic with a period of tCA+tCR. From 
the previous discussion on virtual supply levels, it is reasonable to assume that the virtual supply 
voltage decreases linearly by ΔV, which is equal to (3.20) in CA and (3.32) in CR, with each 
clock period, tCLK. The transient virtual supply voltage is derived separately for the two phases of 
operation in order to estimate the energy consumption and savings. The energy dissipated by the 





)(,)(,  (3.43) 
where M=T/(tCA+tCR). The power is estimated using the virtual supply values at the start and end 























































































































,   (3.45) 




































CMCAEnergy DDL  (3.46) 
Note that the power estimation uses γ, the probability of any switching transition, and not α,
the probability of a HIGH-to-LOW transition during the CA-phase. The probability
factor α defines the slope of the virtual power supply in the CA-phase and enters (3.46)
through the factor 'a'.
The energy expended by the circuit during the CR-phase can be calculated by following a 












































12,   (3.48) 

















CMCREnergy DDL  (3.49) 





























































L  (3.50) 










2    (3.51) 
Using the relationship T=M·(a+r)·tCLK to simplify the equation, results in 
2)(, DDL VCraMcktEnergy    (3.52) 












































































SavedEnergy  (3.54) 
Simplifying using the result r ≈ a/N, obtained by solving (3.31) and (3.38), gives
 










































Equation (3.50) presents the total energy consumption of the charge-recycling circuit. Since a 
percentage of this energy is being supplied by the recycled charges, the actual energy delivered 
by the external power source corresponds to the energy dissipated during the CA-phase. Thus, 
the percentage energy consumed by the charge-recycling circuit from the external power source 


































EnergyNormalized  (3.56) 





































SavedEnergy  (3.57) 




































SavedEnergy  (3.58) 
Thus, equation (3.54) represents the percentage of energy saved by the dynamic voltage scaling,
while (3.57) provides the percentage energy saved by integrating both dynamic voltage scaling
and charge recycling from the proposed charge-recycling scheme. The energy savings can be
estimated for a given number of CR stages, N, by calculating the values of 'a' and 'r' from
equations (3.31) and (3.38). For N=3, the values of 'a' and 'r' are 117 and 33, respectively, for
CCR/CL of 50. Substituting these values in (3.54) results in a 43% reduction in the energy
consumption, while from (3.57) the energy furnished by the power supply is reduced by 59%.
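
The two percentages quoted for N = 3 (about 43% saved through voltage scaling and about 59% less energy drawn from the supply) can be reproduced with a few lines of Python. The sketch assumes that the energy in each phase is estimated from the mean of the squared virtual supply at the start and end of that phase (7·VDD/9 to VRD during accumulation, VDD to VRD during recycling), normalized to operation at the full VDD. This endpoint averaging is our interpretation of the estimate, and the function and variable names are illustrative.

```python
def energy_savings(n_stages, a, r):
    """Return (saved by voltage scaling, saved at the external supply), as fractions."""
    v_rg = 1.0 / n_stages                  # references with V_DD normalized to 1
    v_rd = 1.0 - v_rg
    ca_start = 1.0 - v_rd / n_stages       # V_VDD - V_VGND at the start of the CA-phase
    ca_end = 1.0 - v_rg                    # ... and at its end (equal to V_RD)
    cr_start, cr_end = 1.0, v_rd           # virtual supply over the CR-phase

    # Mean of the squared supply at the two endpoints, weighted by phase length.
    e_ca = a * (ca_start ** 2 + ca_end ** 2) / 2
    e_cr = r * (cr_start ** 2 + cr_end ** 2) / 2
    e_ref = a + r                          # same activity at the full V_DD

    saved_scaling = 1 - (e_ca + e_cr) / e_ref
    saved_supply = 1 - e_ca / e_ref        # the supply only delivers the CA-phase energy
    return saved_scaling, saved_supply

scaling, supply = energy_savings(3, a=117, r=33)
print(f"{scaling:.1%} saved by voltage scaling, {supply:.1%} less energy from the supply")
```

Calling the function with r set to a/N gives the corresponding quick approximation as a function of N alone.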
Equations (3.55) and (3.58) provide a quick approximation to the percentage energy 
saved as a function of N. Figure 3.13 presents a plot with the estimated energy savings that can 
be achieved using the charge-recycling scheme (equations (3.55) and (3.58)), across different 
number of stages, N. This relationship holds true for the case with the same circuit block used as 
the source and as the target. For different configurations, the equations should be modified to 
include the appropriate energy components from the circuits. Also, the energy consumed by the 
charge-recycling control generation block is not included in (3.55) and (3.58). Hence, the total 
energy saved in the system would be offset from (3.58) by the percentage of energy consumed 
by CR control block to the energy dissipated by the system without CR.  
In summary, the equations necessary to estimate the achievable energy reduction using the
proposed charge-recycling scheme have been derived in this section. Also, the design equations
that define the energy-reduction-to-area-increase tradeoff have been derived in order to optimize
the CR implementation.

Figure 3.13 Percentage energy saved by employing the proposed charge recycling technique
3.5 Implementation of the charge-recycling methodology 
Different implementations of the proposed scheme are possible that depend on the 
requirements of the application. The charge-recycling approach can be employed at the circuit-
level or at the system-level. At the circuit-level, since the charge recovered from the source 
circuit is immediately boosted to virtual VDD, it is possible to use this VVDD to furnish power to 
the source circuit and thus realize a partially self-powered circuit. Figure 3.14 presents the 
functional schematic of the CR scheme applied at circuit-level in the system. 
At the system level, a time multiplexing approach may be employed wherein multiple CR 
banks can be used to provide continuous virtual VDD to the target blocks. Figure 3.15 presents a 
time-multiplexed CR system that can achieve power-autonomy in the target block by operating
entirely from recycled power and not the power supply voltage. Further, based on the energy
requirements, alternate topologies can be devised to employ one virtual VDD bank to furnish 
power to different target blocks, as illustrated in Figure 3.16. Thus, the target application defines 
the best strategy for implementing the proposed charge-recycling scheme.  
3.5.1 Design Methodology  
This section provides an overview of the design and implementation methodology of the 
CR scheme in mixed-signal systems. The design parameters that influence the energy-efficiency 
in this CR scheme include the reference voltages of VRG and VRD, number of stages N and size of 
CR capacitor (CCR). 
3.5.1.1 Determination of virtual supply voltage levels and number of stages N 
The source logic block and the target logic block could be different components of a
digital system or the same logic circuit. Identification of potential blocks in a system is essential
to maximize charge-recycling and power reduction. The choice of the source and target blocks is
governed by the rate of switching activity in the logic and also by the ease with which the system
can be separated into functional blocks. The dynamic virtual supply scaling as a result of
charge-recycling causes a decrease in the performance of the system. Thus, the critical factor that
determines the successful implementation is the energy-efficiency or performance requirement of
the application.

Figure 3.15 Schematic illustrating time-multiplexed virtual VDD generation

Depending on the application, since the reduction in performance of functional blocks in 
the non-critical delay path does not affect the overall system performance, these blocks can be 
exploited for charge-recycling. Characterization of the source and target blocks with reduced 
supply voltage is essential in order to estimate the amount of charge that can be recycled, and 
thus the virtual supply levels. Once the minimum VDD (VDD,min) that guarantees the system's
required performance has been estimated, the values of the maximum virtual ground (VRG) and 
virtual VDD (VRD) references are set as VDD−VDD,min and VDD,min, respectively. The number of CR 
capacitor stages, N, is obtained from the voltage boosting ratio of VDD/VRG. 
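
The selection of the reference voltages and the stage count described above can be sketched in Python as follows. The function name is hypothetical, and rounding VDD/VRG to the nearest integer is an assumption, since the text does not state how a non-integer boosting ratio is handled.

```python
def cr_design_point(v_dd, v_dd_min):
    """Return (V_RG, V_RD, N) for a nominal supply v_dd and minimum usable supply v_dd_min."""
    if not 0 < v_dd_min < v_dd:
        raise ValueError("v_dd_min must lie between 0 and v_dd")
    v_rg = v_dd - v_dd_min          # maximum virtual ground level
    v_rd = v_dd_min                 # minimum virtual VDD level
    n_stages = round(v_dd / v_rg)   # boosting ratio, rounded (assumption)
    return v_rg, v_rd, n_stages

# Supply levels reported for the two implementations in Section 3.6.
for label, v_dd, v_dd_min in (("90-nm", 1.0, 0.7), ("0.5-um", 2.5, 1.75)):
    v_rg, v_rd, n = cr_design_point(v_dd, v_dd_min)
    print(f"{label}: V_RG = {v_rg:.2f} V, V_RD = {v_rd:.2f} V, N = {n}")
```

Both reported supply configurations give a boosting ratio of about 3.3, which this sketch rounds to N = 3.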
Once the number of CR stages N has been obtained, an estimate of the percentage energy 
reduction can be obtained from the plot in Figure 3.13 or using the equations (3.55) and (3.58). 
 
Figure 3.16 Alternate implementation of CR scheme to support multiple target blocks
3.5.1.2 Estimation of Charge-recycling capacitor size 
As discussed in section 3.3.2, the value of charge-recycling capacitor depends on the 
power consumption of the target block, area requirements of CR capacitors, and the duration of 
CR-cycle which determines the energy-efficiency of the implementation. The analysis presented 
in the previous section provides the vital relationship between the CR capacitor size, power 
consumption of the target or source block and the CR-cycle. Hence, equations (3.31) and (3.38) 
can be effectively used to find the optimum value of CCR for the desired N. The optimum value
of CCR is arrived at iteratively, using (3.31) and (3.38) to determine the values of 'a' and
'r' for different CCR sizes. The values of 'a' and 'r' are then substituted in (3.57) to estimate the
energy reduction. The CCR that provides the best energy reduction within the planned chip
area should be used in the system.
3.5.1.3 Power Budget 
The power consumption of the logic block that controls the CR system contributes to the 
total power consumption. Careful power budgeting is therefore required to be able to realize 
power or energy savings using the CR scheme. The components that influence the power 
consumption in the control block include the comparators, buffers and the switching losses in the 
control switches that connect the CR capacitors in different CR-phases. Hence, longer duration 
of CR-cycle implies lower switching losses and reduced power consumption in the control logic 
block. The power consumption of the control logic can be incorporated in the total energy 




























































where δ is the probability of switching event in the control block, CLCB is the effective load 
capacitance and VDD is the nominal power supply voltage. The factor M·(a+r) is included to 
account for the fact that the control block is clocked by the same clock as the application. Thus, 
the power budget for the control block to ensure the required percentage energy reduction can be 
obtained using (3.59).
3.5.2 Physical Implementation 
A 12-bit Gray-code counter within a 12-bit, 8-channel low-power Wilkinson ADC [34],
designed to operate at 10 kSps, has been employed to demonstrate the effectiveness of the
recycling scheme. Since the counter consumes approximately 30% of the ADC's total power
[34], power reduction at the counter would be highly beneficial. The CR scheme has been
implemented in a 90-nm process where leakage effects are more pronounced. Additionally, the 
CR Gray-code counter design has been implemented in a 0.5-µm process to study the efficiency 
across different process technologies. Both the designs accumulate the ground-bound charges 
from the Gray-code counter and subsequently reuse this charge to furnish their power-supply 
requirements, thereby realizing a partially self-powered Gray-code counter. 
Analog buffers were integrated in order to monitor the virtual supply and ground nodes
for testing purposes in the 90-nm design. The simplified schematic of the two-stage operational
amplifier (opamp) used as the analog buffer is presented in Figure 3.17. The buffers were designed
with thick-oxide devices and powered by a 2.5 V supply. This facilitates the use of the same
opamp circuit to buffer both the virtual ground and virtual VDD voltages. Figure 3.18 presents the
offset voltages, for different inputs, from Monte-Carlo simulation results for 100 runs with
three-sigma process and mismatch variations. The buffer has an average systematic offset voltage of
approximately 5 mV in the range of interest. A third buffer was also integrated in order to be
able to characterize the input-referred offset voltage of the buffers.

Figure 3.17 Simplified schematic of the analog buffer used to monitor the virtual supply nodes

Figure 3.19 and Figure 3.20 present the layouts of the CR based Gray-code counter in the
90-nm and 0.5-µm processes, respectively. The CR capacitors dominate the additional area
required for the charge-recycling system. High-density, low-leakage metal-insulator-metal
(MIM) on-chip capacitors were used for the CR capacitors. In the 90-nm implementation, the
area of the CR system (counter, CR capacitors and control logic) is 13,500 µm², while the counter
alone occupies an area of 7,200 µm². This increase is mainly due to the capacitors, which
occupy an area of 6,300 µm² for a total of 15 pF. In the 0.5-µm design, the counter alone occupies
an area of 0.471 mm², and the 75 pF total capacitance and logic take 0.077 mm², which is a 16%
increase in the total area. Note that the source block used in this work is a simple 12-bit counter
and the increase in overall area would be less substantial with a more complex source block or a
system such as the ADC [30] where the area increase is only 1.3%. The power savings realized
using the charge-recycling scheme justify the increase in area.
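
The area figures above can be cross-checked with simple arithmetic. The Python snippet below uses only the numbers quoted in this paragraph; the helper name and the derived capacitance density are illustrative and are not values reported in the text.

```python
def percent_increase(base_area, added_area):
    """Added area as a percentage of the base area."""
    return 100.0 * added_area / base_area

# 90-nm implementation (areas in um^2).
counter_90 = 7_200
system_90 = 13_500
added_90 = system_90 - counter_90           # 6,300 um^2, the quoted capacitor area
density_90 = 15e-12 / added_90              # F per um^2 for the 15 pF bank

# 0.5-um implementation (areas in mm^2).
counter_05 = 0.471
added_05 = 0.077

print(f"90-nm: +{percent_increase(counter_90, added_90):.1f}% area, "
      f"{density_90 * 1e15:.2f} fF/um^2")
print(f"0.5-um: +{percent_increase(counter_05, added_05):.1f}% area")
```

The 0.5-µm result agrees with the quoted 16% increase, and the 90-nm difference between system and counter area matches the quoted capacitor area.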
 
 







Figure 3.20. Layout of Gray-code counter with CR scheme in 0.5-µm process 
 
 




3.6 Simulation Results and Performance Analysis 
SPICE simulations were performed on the system to evaluate the power reduction and 
energy efficiency of the proposed scheme. For the 0.5-µm design, the power supply was set at 
2.5 V, with the minimum virtual VDD of 1.75 V. The 90-nm implementation has VDD of 1 V with 
minimum VVDD of 0.7 V. At the maximum ADC conversion rate, the Gray-code counter runs at 
44 MHz. Hence the CR Gray-code counter was characterized at 50 MHz. In addition, the
effectiveness of the recycling scheme was also assessed using different target logic, such as a
10-bit binary counter, and at a higher operating frequency of 100 MHz. Figure 3.21 presents the
virtual VDD, ground and counter output bit (before and after level-shifting) from the post-layout 
simulations at 50 MHz. As seen in Figure 3.21, once the VVGND reaches VRG of 0.3 V, the VVDD is 
used to power up the circuit while VVGND is connected to ground (seen as a drop in VVGND node 
from VRG to VSS). The VVDD node discharges with charge being drawn from the virtual supply for 
LOW-to-HIGH transitions during each clock cycle. Once the VVDD reaches the VRD reference of 
0.7 V, the control switches over to charge-accumulation phase where VVGND collects the ground-
bound charge from digital transitions and leakage currents. 
The transient power consumption of the charge-recycling Gray-code counter is presented
in Figure 3.22. The power consumption decreases gradually during the charge-accumulation
phase due to the increase in the virtual ground voltage level. During the charge-recycle phase, the
virtual VDD furnishes the required supply voltage and so the power delivered by the power supply
is zero. The higher power dissipation in the recycle phase is due to the higher levels of dynamic VDD
(VDD to VRD) in the recycle phase as compared to the charge-accumulation phase, which has a
power supply voltage variation of 7·VDD/9 to VRD.

Figure 3.21 Transient simulation results of CR counter in 90-nm process

3.6.1 Energy Saving 
The energy reduction of the partially self-powered counter was estimated by simulating at 
a fixed performance (at frequency of 50 MHz) with and without employing charge-recycling 
methodology. Figure 3.23 presents the power consumed by individual blocks in the counter and 
the amount of power savings achieved by charge-recycling scheme in the 90-nm implementation. 
The charge-recycling scheme reduces the power consumed by the Gray-code counter by 52%,
which is very close to the estimated 60% from Figure 3.13 in Section 3.4.2. The losses due to the
finite voltage drop across the series switches, and the parasitic top and bottom plate capacitances 
contribute to the difference between the simulated and estimated values.   
The total power saving that includes the power expended by the control logic amounts to 
31% in the 90-nm CR counter design. The simulation results of energy consumption and the 
percentage energy reduction for both the implementations are presented in Table 3.1. The energy 
saved including the energy dissipated by the control circuitry is close to 30% in both the counter 
implementations. Since the 90-nm process has more leakage current contributions, more charge is
recycled and the dynamic VDD reduction also reduces leakage currents and DIBL effect, thus 
contributing to increased energy savings in the 90-nm design. Furthermore, multiple threshold 
devices that are available in the 90-nm process were exploited to limit the power consumption of 
the level-shifters and comparators in the control logic. The comparable energy savings between 
the two designs verifies that the CR scheme can be employed to efficiently recycle switching-
dominated charge as well as charge from leakage currents. The percentage savings can be further 
improved by minimizing the power dissipation in the comparator and control signals generation.  
 
Figure 3.23 Power consumption in the 12-bit Gray code counter in 90-nm implementation
(a) counter without charge-recycling (b) counter with charge-recycling













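Table 3.1 Energy consumption and percentage energy reduction for both implementations
Process   Energy without CR   Energy with CR   Energy saved (counter)   Energy saved (incl. control logic)
0.5-µm    15.29               8.19             46%                      26%
90-nm     0.241               0.116            52%                      31.4%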
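
The counter-only percentages in Table 3.1 follow directly from the two energy values listed for each process, as the short Python check below shows. It assumes that the first value is the energy without charge recycling and the second the energy with it; the units cancel in the ratio, and the helper name is illustrative.

```python
def percent_saved(e_without, e_with):
    """Percentage energy reduction relative to the design without CR."""
    return 100.0 * (e_without - e_with) / e_without

table_3_1 = {
    "0.5-um": (15.29, 8.19),
    "90-nm": (0.241, 0.116),
}
savings = {proc: percent_saved(*pair) for proc, pair in table_3_1.items()}
for proc, value in savings.items():
    print(f"{proc}: {value:.1f}% saved")
```

Rounded to whole percentages, the results agree with the 46% and 52% entries in the table.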
 
 
The energy efficiency using different source and target blocks was also investigated, in 
the 90-nm design, with the Gray-code counter configured as the source and a different logic 
block (10-bit binary counter) as the target. Table 3.2  presents a comparison of the percentage 
energy saved by supplying the power to the Binary counter from the charge recycled in Gray-
code counter, and vice versa. The system was simulated for the time required to complete one 
full cycle of counting at the target block, operating at a frequency of 100 MHz. Since both the 
counters have similar energy requirements and some of the recycled charge is dissipated at the 
switches, the charge-accumulation phase lasts longer than the charge-recycle phase. So, the 
energy reduction at the source block is more than that at the target block. Improving the 
efficiency of virtual VDD generation would certainly increase the total energy saved at the target 
counter.  
3.6.2 Effect on Circuit’s Speed and Delay 
Powering digital blocks from virtual VDD and virtual ground introduces variations in the 
available supply voltage. In this CR scheme, the reduction in the available power supply and the 
use of body-effect to increase VTH, as a means to reduce leakage currents, would result in 
increased circuit delay [29]. The propagation delay (td) of the counters operating at 50 MHz is 
presented in Table 3.3. The change in delay is expected due to the variation in power supply 
during both the charge-accumulation and the charge-recycling phases. It should be emphasized 
that the increase in delay does not degrade the performance of the counters and there are no 
missing counts at the output. Since this scheme is targeted for medium-speed low-power circuits, 

















Energy (source) 58.85 35.15 − 40.3% 
Energy (target) 60.84 45.47 − 25% 





Energy (source) 243.3 131.4 − 46% 
Energy (target) 235.5 190.6 − 19% 









Process   Without CR (ns)   With CR (ns)             Delay increase (ns)
0.5 µm    2.00              2.66 – 3.13 (Avg: 2.9)   0.90
90 nm     1.11






3.6.3 Leakage Current Reduction 
One of the advantages of implementing a virtual supply and ground is the reduction in 
leakage currents [26]. The leakage current is dominated by sub-threshold leakage and DIBL 



























, µ0 is the zero-bias carrier mobility, Cox is the gate-oxide 
capacitance, Leff is the transistor effective channel length, W is the transistor width, η is the DIBL 
coefficient, γ is the linearized body-effect coefficient, n is the transistor sub-threshold swing 
coefficient and vT is the thermal voltage (kT/q) [26]. 
The supply voltage reduction lowers the drain-source voltage and thus reduces the DIBL 
current. The leakage is also suppressed by the reverse body-bias voltage (VSB). The smaller 
feature sizes in the 90-nm process node have more leakage compared to the 0.5-µm process. 
Therefore, the two CR-based designs provide good insight into the leakage current reduction
achieved by using virtual power supplies in these processes.
3.7 Summary and Conclusions 
Charge-recycling has been demonstrated as a viable option to improve the energy 
efficiency of digital circuits. This chapter presented the design of the proposed CR scheme to
scavenge charge from the leakage and dynamic load currents that are inherent to digital design. 
The power-delay tradeoff and energy efficiency analysis of the CR scheme have been examined. 
The presented design methodology and the energy savings estimation can be effectively reused 
to implement the CR scheme to cater to the energy requirements of any system. 
The novel CR scheme has been designed and implemented for a 12-bit Gray-code 
counter in 0.5-μm and 90-nm processes. The proposed CR scheme avoids the delay introduced in 
charge-pump based voltage boosting techniques, and eliminates the need to match the current-
consumption in vertically-stacked CR digital blocks. Simulation results demonstrate a total 
average energy reduction of 31% with this charge-recycling scheme. The average energy of the 
counter alone is decreased by 52% by recycling charge. The characterization and measurement 
results of the 90-nm prototypes are presented in the next chapter. 
Chapter 4 Characterization of the Charge-Recycling 
Gray-code Counter 
 
This chapter presents the measurement results and analysis of the charge-recycling (CR) 
Gray-code counter design in a 90-nm CMOS process. The goal is to characterize the energy 
savings realized by employing the CR scheme. The power-performance tradeoffs incurred in this 
implementation are analyzed. Further, the efficiency of the CR scheme is characterized across 
changes in frequency of operation and virtual supply voltage variations. The dependency of the 
CR scheme on process variations is also examined. 
4.1 Test Setup 
The chip was fabricated in a 90-nm CMOS process offered by MOSIS [69] and
packaged in a 52-pin low-profile quad flat package (LQFP). The chips were received in
November 2011, and four test chips were subsequently characterized in spring 2012. Figure 4.1
presents the microphotograph of the fabricated die. CadSoft's Eagle layout editor was used to
design the FR4 printed circuit board (PCB) on which the Gray-code counter was characterized.
Figure 4.2 presents the designed test board, which accommodates one test chip. The 52-pin LQFP
is soldered directly onto the board in order to avoid socket parasitics. The test board
comprises four copper layers to provide a quiet ground-return path for the transients
inherent to digital circuits. The layer stack-up in the PCB cross-section is illustrated in Figure
4.3. Since most of the components used on the board are surface-mount devices, all the signal
routing is on the topmost copper layer of the PCB. Further, as analog buffers are used to probe
the virtual VDD and virtual ground nodes, careful power supply distribution is crucial to minimize
noise coupling between the analog and digital domains.
4.1.1 Power Supply Generation and Partitions on PCB 
Power supply planes and supply partitions are essential to avoid noise coupling and to 
preserve signal integrity on the test board. The fabricated mixed-signal test chip has individual 






Figure 4.2 Layout of test board to characterize charge-recycling Gray-code counter (90-nm process) 
 
Figure 4.1 Die photo of the charge-recycling Gray-code counter in 90-nm process 
 
supply of 2.5 V is used while the digital cells are powered by a 1-V supply. To provide a short,
quiet return path for current transients during switching events, supply planes are utilized on the 
board. The inner copper supply layers also act as decoupling capacitors. Additional decoupling 
capacitors are provided close to the supply pins of the chip and small decoupling capacitors are 
also provided on-chip.   
As highlighted on the test board in Figure 4.2, the board has three supply partitions 
namely digital buffer, digital supply and analog supply for the test chip. The grounds of the three 
partitions are connected at only one point on the board in order to eliminate ground-loops. A 
tight, short ground-plug is used to connect the grounds and thereby eliminate the inductive 
effects of banana plugs.  
4.1.2 On-board Supply Regulators 
Onboard voltage regulators are employed to provide stable, low-noise supply voltages to
the chip. Linear voltage regulators are preferred over switching regulators because of their low
output noise. A low-dropout, linear voltage regulator (TI's LP38512) [65] is used on the test board.
This regulator provides output voltages in the range of 0.5 V to 4.5 V for an input voltage range
of 2.25 V to 5.5 V. In this design, multiple power supply signals from the CR Gray-code counter
have been padded out. In order to avoid the use of several power supply regulators, the non-critical
power supplies, such as the pad ESD supply and the digital output pad-drive buffer supply, are
supplied by a single regulator. The analog supply of 2.5 V is generated using a separate regulator
and the analog ground is isolated from the digital section. Figure 4.4 presents the schematic of
the voltage regulator. The required output voltage is realized by varying the ratio of resistors R1
and R2 and is given by
 






















An input voltage of 5 V is applied to the regulators in order to obtain digital and analog power 
supply voltages of 1 V and 2.5 V, respectively. 
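As a rough check of the divider settings, the sketch below computes the R1/R2 ratio needed for each rail. It assumes the usual adjustable-LDO relation VOUT = VADJ × (1 + R1/R2) with VADJ = 0.5 V. Both the relation and the 0.5 V reference are assumptions, chosen to be consistent with the 0.5 V minimum output quoted above, and the function name is hypothetical; the regulator data sheet [65] remains the authority.

```python
# Sketch only: assumes VOUT = V_ADJ * (1 + R1/R2) with V_ADJ = 0.5 V (assumed).
V_ADJ = 0.5  # volts, assumed feedback reference of the adjustable regulator


def divider_ratio(v_out, v_adj=V_ADJ):
    """Return the R1/R2 ratio that sets the requested output voltage."""
    if v_out < v_adj:
        raise ValueError("output cannot be set below the feedback reference")
    return v_out / v_adj - 1.0


digital_ratio = divider_ratio(1.0)  # 1-V digital supply
analog_ratio = divider_ratio(2.5)   # 2.5-V analog supply
print(digital_ratio, analog_ratio)
```

Under these assumptions, the digital rail needs equal resistors and the analog rail needs R1 four times R2.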
4.1.3 Digital Buffers on PCB 
The digital output from the Gray-code counter is buffered on the PCB so that it can drive
the load presented by oscilloscope cables. The on-board buffers also reduce the effective loading
seen by the on-chip pad-drive buffers. This results in lower transient peak currents within the
chip and lower noise coupling from bond wire inductances. TI's SN74AUC2G34 dual gate
buffer [66] is used on this test board.
4.1.4 Reset signal generation  
The fabricated design includes a counter enable (active "low"), or clock reset, signal to
disable the chip's operation (i.e. disable counting). Also, in order to measure the computation
efficiency of the counter with and without the charge-recycling scheme, a charge-recycle reset
signal has been included that disables only the charge-recycling mechanism while the Gray-code
counter remains operational. These reset signals are generated on the board using SPST switches
that utilize a make-before-break transition. These signals are derived from the digital power
supply voltage of the chip.
 
 






4.2 Test Procedure 
The characterization of the CR counter begins with ensuring that the DUT is supplied
with appropriate, stable power supply voltages from the onboard regulators and the external power
supplies. Before power is applied to the Gray-code counter and the CR control blocks, the respective
reset signals are set HIGH in order to power up the circuits in reset mode. Also, the onboard
resistive voltage dividers that set the virtual power supply references are verified to be within the
operational range of the circuits. The Gray-code counter and the CR control
block are powered by a Keithley 2400 SourceMeter, and the current consumed by these
circuits is monitored using the Keithley 2400 SourceMeter and a Keithley 6485 picoammeter.
The counter's clock is provided by a LeCroy pulse generator. LabVIEW is used to interface with
and control the test equipment.
Once the proper power supply voltages are set, the counter enable signal and then the
counter clock are applied. The 12-bit Gray-code output bits from the counter are sampled using
the Agilent MSO6034A mixed-signal oscilloscope. The sampled data is checked to verify correct
operation of the counter. Then, the virtual supply nodes are set with proper reference voltages
and the CR scheme is enabled by releasing the CR reset signal. For this set of virtual supply
values, the counter's output is again sampled to verify satisfactory operation without any missing
codes. Figure 4.5 presents a picture of the sampled output from the charge-recycling Gray-code
counter. Once the counter operation has been verified, the power consumed by the CR counter
along with the CR control logic is measured. The efficiency of the CR scheme is assessed using
the measured current consumption of the DUT. The power supply to the level-shifting buffers
was inadvertently connected to the supply of the pad-drive buffers, so the power expended to
perform level-shifting has not been included in the total power reduction calculations. In practice,
to keep the power consumption low, the level-shifting operation would be integrated into a
flip-flop available in the data path of the system. Since the counter's output bits were padded out for
test purposes, a separate level-shifting block is required for this prototype. Furthermore, the
simulation results include the level-shifters, which represent less than 2% of the total power
dissipated. Although the total measured power reduction reported here is therefore optimistic,
the close correlation between the simulation results and the measurement results
indicates that the measured % energy reduction is accurate to within 2%.
 
Figure 4.6 illustrates the transient power consumption of test chip # 2 when the CR
scheme is enabled. The figure also demonstrates the test procedure, wherein the counter (without
CR) is enabled after startup and the CR blocks are then activated to lower the energy
consumption of the circuit. The data was collected from the Keithley instruments using the LabVIEW
interface. During this test, the counter was operating at a frequency of 35 MHz and the virtual
ground and virtual VDD reference voltages (VRG & VRD, respectively) were set at 310 mV and 750
mV, respectively. From Figure 4.6, it can be inferred that CR confers a 40% reduction in the average
power consumption of the counter. The total reduction, which includes the power consumed by
the CR control logic, is about 22%.
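The two percentages quoted for chip # 2 also fix the share of power taken by the CR control logic. The sketch below works this out under the assumption that both percentages are referred to the same baseline, namely the counter power without CR; the variable names are illustrative only.

```python
# Sketch: infer the CR control-logic overhead from the two reported reductions.
# Assumption: both reductions are relative to the counter power without CR.
counter_reduction = 0.40  # counter alone (chip # 2)
total_reduction = 0.22    # counter plus CR control logic (chip # 2)

counter_with_cr = 1.0 - counter_reduction  # normalized counter power with CR
total_with_cr = 1.0 - total_reduction      # normalized total power with CR
control_overhead = total_with_cr - counter_with_cr

print(f"control logic overhead: {control_overhead:.2f} of baseline power")
```

Under that assumption, the control logic accounts for roughly 18% of the baseline counter power.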
 
Figure 4.5 12-bit Gray-code output from the charge recycling counter 
 
In order to provide insight into the charge-recycling operation, and to aid in
debugging, the virtual power-supply rails (VVDD & VVGND) of the counter are padded out in the
test chips. Analog buffers are employed to shield the virtual supply nodes from the output pad
and probe loads. In addition, a copy of the opamp used in the buffer cell is placed close to
the two buffers and is available to characterize the input offset voltage of the buffer across the power
supply range of the counter. Figure 4.7 presents the measured input offset voltage of the opamps
across varying input voltages for the four test chips. The measured offset of the analog buffer
(i.e. opamp) gives a realistic estimate of the charge recycling operation and also provides an
option to compensate for the comparator's offset within the CR control generation circuitry.
Figure 4.8 presents an oscilloscope screenshot of the VVDD and VVGND nodes in the test 
chip # 2. The figure verifies the operation of the CR scheme where, during the charge-
accumulation phase, the ground-bound charges from the circuits are collected by the storage 
 
Figure 4.6 Transient measurement results demonstrating power reduction in the counter  
using the CR scheme (includes counter and CR control) 
capacitors, as seen in the bottom signal (VVGND) that increases with time, while VVDD (top
signal) is held at a constant VDD. Once VVGND reaches the reference voltage VRG, the charge-recycling
phase begins, wherein the VVDD node supplies power to the counter and therefore decreases
with each clock cycle until it reaches the virtual VDD reference voltage, VRD. During the charge-recycling
phase, the offset of the buffer (60 mV at an input of 0 V) masks the actual value of
VVGND, which is held at ground. From the varying time period for one full cycle of operation,
which includes the charge-accumulation (tCA) and charge-recycling (tCR) phases, it can be inferred
that the amount of charge, or the probability of a node discharging to ground, varies with the
output of the counter.
4.3 Measurement Results 
The test procedure outlined in the previous section was performed on all four test
chips to verify their operation. As discussed in Chapter 3, employing the charge recycling
scheme to reduce the energy consumption also increases the propagation delay and thus lowers
 
Figure 4.7 Measured input offset voltage of analog buffers  
the maximum achievable frequency of operation. Thus, it is essential to investigate the power
reduction versus the propagation delay increase in the Gray-code counter as a result of the CR scheme.
An accurate measurement of the delay increase in each output bit of the counter is not directly
possible since the transient virtual power supply values result in different propagation delays that
depend on the phase of the charge-recycling cycle. Further, a very high speed PCI-express data
acquisition system with a data rate of more than 250 MHz would be required to record the high-speed
virtual power supply voltage variations. To circumvent these test limitations, the delay increase due to
the CR scheme was obtained from the measurement of the maximum possible frequency of operation
without any missing counts at the counter output bits. Since the counter's critical delay path
determines the maximum frequency of operation, an increase in the propagation delay due to CR
will also reduce the frequency of operation.
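The indirect delay measurement described above can be sketched as a simple ratio. The sketch assumes that the maximum error-free clock frequency is inversely proportional to the critical-path delay; the function name and the example frequencies are hypothetical and are not measured values.

```python
def normalized_delay(f_max_without_cr, f_max_with_cr):
    """Ratio of the critical-path delay with CR to the delay without CR.

    Assumes the maximum error-free clock frequency is inversely
    proportional to the critical-path delay.
    """
    if f_max_without_cr <= 0 or f_max_with_cr <= 0:
        raise ValueError("frequencies must be positive")
    return f_max_without_cr / f_max_with_cr


# Hypothetical frequencies in MHz, chosen only to illustrate the calculation.
example = normalized_delay(100.0, 50.0)
print(example)
```

A result of 2 would mean that enabling CR doubles the effective critical-path delay at the chosen virtual supply levels.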
 
Figure 4.8 Transient measurement results illustrating virtual supply rails in the CR counter  
Additionally, in order to quantify the efficiency of the proposed charge-recycling scheme,
the counter's energy reduction and delay increase were characterized across variation in the
frequency of operation, and across different virtual power supply levels. The range of virtual
power supply levels was established so as to ensure that the peak VVDD (set by VRG) was always
below the maximum rated value of 1.2 V for the 90-nm process. The choice for the minimum
VVDD (set by VRD) was governed by the requirement to avoid an excessive increase in the
propagation delay that brings down the efficiency. Further, the offsets of the two comparators that
monitor the virtual supply rails against the predefined reference voltages determine the
actual power supply voltage realized at the CR counter. Thus, the comparator offset voltage, and
the propagation delay from the comparator's output to the switches that control the charge-recycling
capacitors, will affect the charge-recycling efficiency. Since an independent
comparator was not available for characterization, the offset was estimated from the difference
between the maximum virtual ground voltage and the virtual ground reference (VRG), and the
deviation of the minimum virtual VDD voltage from VRD. Table 4.1 presents the average offset
voltage of the comparators used in the CR logic path.
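The offset estimate described above, and the early/late labels used in Table 4.1, can be sketched as follows. The function names are hypothetical; the sign convention (observed trip level minus reference) follows the table headings, and voltages are handled in millivolts to keep the arithmetic exact.

```python
def ground_comparator_offset(v_vgnd_max_mv, v_rg_mv):
    """Offset at the virtual-ground comparison: peak VVGND minus VRG (mV)."""
    return v_vgnd_max_mv - v_rg_mv


def vdd_comparator_offset(v_vdd_min_mv, v_rd_mv):
    """Offset at the virtual-VDD comparison: minimum VVDD minus VRD (mV)."""
    return v_vdd_min_mv - v_rd_mv


def trip_timing(offset_mv, rising):
    """Classify a comparator trip as 'late', 'early' or 'on time'.

    rising=True  : the monitored node rises toward its reference (VVGND).
    rising=False : the monitored node falls toward its reference (VVDD).
    """
    if offset_mv == 0:
        return "on time"
    overshoot = offset_mv > 0 if rising else offset_mv < 0
    return "late" if overshoot else "early"


# Chip # 2 offsets from Table 4.1: -28 mV (ground) and -70 mV (VDD).
print(trip_timing(-28, rising=True), trip_timing(-70, rising=False))
```

A rising node that overshoots its reference, or a falling node that undershoots it, indicates a late trip; the opposite sign indicates an early trip.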
As discussed in Section 3.3.3, the high-speed comparator topology was chosen with an
emphasis on keeping its power consumption to a minimum, at the expense of accuracy. The
 
 
Table 4.1 Offset voltage of the comparators used in CR control logic
Chip   Offset at Virtual Ground Comparison   Offset at Virtual VDD Comparison
       (Vtlgnd – VRG)                        (Vtlvdd – VRD)
1      54 mV (late)                          15 mV (early)
2      −28 mV (early)                        −70 mV (late)
3      40 mV (late)                          −40 mV (late)




measured offset voltages fall within the offset voltage range predicted by the Monte Carlo
simulations performed across 3-sigma variations in process and mismatch. The CR scheme's
reference voltages can be adjusted to compensate for the comparator offset and thereby achieve
higher, more reliable efficiency. The measurement results for each CR prototype are presented in this
section. Comparison of the measurement results across the four test chips provides an insight
into the CR efficiency across process variations.
4.3.1 Energy Reduction due to Charge-recycling 
Figure 4.9 presents the measured energy savings of the Gray-code counter for different
charge-recycling voltage levels or virtual power supplies. The corresponding data for the total
energy reduction, which includes the power dissipated by the CR control logic, is illustrated in
Figure 4.10. The virtual ground comparator's offset of 54 mV (see Table 4.1) implies that the
actual VVGND is 54 mV higher than the virtual ground reference voltage, VRG. So, VRG was limited
 
 
Figure 4.9 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 1) 
to a maximum of 0.32 V in order to ensure that the boosted VVDD does not exceed the maximum
rated power supply voltage. While the minimum VVDD could be lower than 0.57 V, a very low
VVDD would cause a large increase in the propagation delay and thereby reduce the efficiency.
As verified by the data in Figure 4.9 and Figure 4.10, for a given frequency of operation,
the power or energy consumption can be reduced by decreasing the power supply voltage. For a
very low VRG of 0.24 V, the charge-storage capacitors accumulate the ground-bound charge to
reach a maximum of 0.294 V, including the 54 mV offset of the comparator. However, with
0.294 V, the maximum boosted VVDD voltage is less than 0.9 V and so the VVDD node
discharges to VRD quickly. Thus, for low VRG values, both the charge-accumulation phase (tCA) and
the charge-recycling phase (tCR) are short and therefore the percentage of energy saved is lower.
Further, the VRD voltage is not set at VDD–VRG but is smaller by about 20 mV (after compensating
for the comparator offsets) in order to account for the series voltage drops in the charge transfer
 
Figure 4.10 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by 
the control logic (Chip # 1) 
switches present within the capacitive power supply circuit. Thus, for a reasonable value of VRG
that facilitates charge-recycling, the average power saved in the counter is approximately
constant across frequency of operation and is close to 42% for the counter alone, and 24%
including the CR scheme's control logic.
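The chip # 1 numbers above can be cross-checked with a short calculation. The sketch below assumes that the stacked capacitor bank provides an ideal voltage gain of three. That gain is an assumption inferred from the 0.294 V and "less than 0.9 V" figures quoted above, not a value stated in this section, and the function names are hypothetical.

```python
V_MAX_RATED = 1.2   # volts, maximum rated supply for the 90-nm process
OFFSET_GND = 0.054  # volts, chip # 1 virtual-ground comparator offset
STACK_GAIN = 3      # ASSUMED ideal gain of the stacked capacitor bank


def peak_vvgnd(v_rg, offset=OFFSET_GND):
    """Peak virtual-ground voltage once the comparator offset is included."""
    return v_rg + offset


def ideal_boosted_vvdd(v_rg, offset=OFFSET_GND, gain=STACK_GAIN):
    """Ideal (lossless) boosted VVDD at the start of the recycling phase."""
    return gain * peak_vvgnd(v_rg, offset)


for v_rg in (0.24, 0.32):
    boosted = ideal_boosted_vvdd(v_rg)
    print(f"VRG={v_rg:.2f} V -> peak VVGND={peak_vvgnd(v_rg):.3f} V, "
          f"ideal boosted VVDD={boosted:.3f} V, "
          f"within rating: {boosted <= V_MAX_RATED}")
```

With the assumed gain, a VRG of 0.24 V gives a boosted rail below 0.9 V, while the 0.32 V limit keeps the boosted rail under the 1.2 V rating.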
Figure 4.11 and Figure 4.12 present the measured energy savings of the Gray-code
counter alone, and of the counter together with the control logic, for different charge-recycling
voltage levels in chip # 2.
Since the virtual ground voltage sensing comparator has a negative offset voltage, the
comparator trips at 28 mV below VRG and so the charge-recycling phase starts with a lower VVDD.
The virtual VDD comparator triggers the charge-accumulation phase after the VVDD decreases
below VRD by 70 mV. Hence, the reference voltages for the virtual supply levels are
appropriately chosen to compensate for the offset and thereby achieve good performance. The
curve for reference voltages of 0.75 V (VRD) and 0.31 V (VRG) represents the case with
uncompensated offset and exhibits close to 37% energy reduction. With offset compensation,
similar to chip # 1, this DUT's average energy reduction is 40% for the counter and 23% with
the control logic's energy consumption included.
 
 
Figure 4.11 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 2) 
 
Figure 4.13 and Figure 4.14 present the respective measured energy savings of the
Gray-code counter, and the counter along with control logic, for different charge-recycling voltage
levels in chip # 3. Chip # 3 represents the process corner where both the comparators trigger
late, i.e. after the input has increased or decreased past the reference voltage. Hence, for an
uncompensated reference voltage setting, this process corner would yield the highest energy
savings but would also increase the penalty due to the increase in propagation delay. With offset
compensation, this chip provides an average energy reduction of 41% at the counter and 23%
including the control logic's energy consumption. The reduction in power savings at low
frequencies, especially below 20 MHz, is primarily due to the increase in the losses incurred in




Figure 4.12 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by
the control logic (Chip # 2)





Figure 4.14 Measured energy reduction in the charge-recycling Gray-code counter, including power dissipated by 
the control logic (Chip # 3) 
 
Figure 4.13 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 3) 
Figure 4.15 and Figure 4.16 present the measured energy savings data at different charge-recycling
voltage levels for chip # 4. Chip # 4 represents the process corner where both the
comparators switch their outputs early, i.e. before the virtual supply nodes have reached their
reference voltages. Therefore, for uncompensated reference voltages, this process corner
represents the worst condition for this CR implementation. With offset compensation, this chip
provides an average energy reduction of 40% at the counter and 21% including the control logic.
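The per-chip, offset-compensated figures reported in this section can be tallied with a few lines of code. The sketch below simply averages the quoted percentages; the dictionary names are illustrative, and the rounded per-chip values are taken from the text rather than from the underlying measurement data.

```python
from statistics import mean

# Offset-compensated average energy reductions (%) quoted for each chip.
counter_only = {1: 42, 2: 40, 3: 41, 4: 40}
with_control = {1: 24, 2: 23, 3: 23, 4: 21}

avg_counter = mean(list(counter_only.values()))
avg_total = mean(list(with_control.values()))
print(avg_counter, avg_total)
```

The spread across the four chips is about two percentage points for the counter alone and three percentage points when the control logic is included.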
In summary, the four tested prototypes demonstrate a consistent reduction of approximately
40% (counter alone) and 22% (counter and the control logic) in the energy consumption, across
the frequency range of interest. The virtual supply's reference voltages can be exploited to adjust




Figure 4.15 Measured energy reduction in the charge-recycling Gray-code counter (Chip # 4) 
 
4.3.2 Efficiency of Charge-recycling scheme 
The efficiency of the energy reduction can be deduced by examining the energy-delay
tradeoff in the CR scheme. When the virtual supply voltage levels are reduced, the maximum possible
frequency of operation also decreases due to the increase in propagation delay. For a given
frequency of operation, the energy expended by the counter is normalized to the energy spent by the
counter without CR. Figure 4.17 presents the normalized energy versus normalized delay for the
counter's operation, across variations in virtual supply voltages for all four tested samples.
Figure 4.18 presents the corresponding normalized energy versus delay results that include the
energy consumption of the CR control logic. As discussed in the previous section, chip # 4
represents the 'slow-slow' process corner and so a larger delay increase is needed to attain the same
energy savings as the other samples. Both of these plots verify that the charge-recycling scheme




Figure 4.16 Measured energy reduction in the charge-recycling Gray-code counter, including power
dissipated by the control logic (Chip # 4)





Figure 4.18 Normalized energy (including CR logic) versus normalized delay due to CR scheme 
 
Figure 4.17 Normalized energy in Counter versus normalized delay due to CR scheme 
Another measure of the effectiveness of this CR scheme is the comparison to the
maximum possible energy reduction that can be achieved with power supply scaling. The energy
consumption and the propagation delay of the Gray-code counter were measured across
reduction in the power supply voltage. Figure 4.19 presents the measured normalized energy
versus normalized delay results of the counter across different VDD levels. As evident in the
figure, lowering VDD results in a quadratic reduction of the energy consumed per operation in the
circuit. The figure also includes the measurement results from the charge-recycling counter of
chip # 1 in order to facilitate the comparison.
The normalized energy results in Figure 4.19 present the maximum possible energy
reduction from supply voltage scaling. The normalized energy per operation for chip # 1's
counter, including the CR logic, is close to 0.73 for a normalized delay of 2, while that from
VDD reduction is about 0.47 for a delay of 2. Since the VDD reduction data does not include the
 
Figure 4.19 Normalized energy versus normalized delay as a result of reduction in VDD 
energy consumed by voltage regulators that generate VDD, comparing the energy saving data
from just the CR counter would provide a reasonable evaluation of the CR scheme. As seen in
the figure, the CR counter's normalized energy of 0.57 is close to that of VDD reduction. The
difference in the realized energy reduction is due to the fact that the CR scheme has a dynamic
virtual power supply level while the VDD reduction methodology operates at a fixed VDD. Further,
the CR scheme was implemented to realize a partially self-powered circuit, which meant the
critical path delay was being increased with voltage scaling as a result of charge-recycling. The
normalized energy-delay product can be further reduced by implementing the CR scheme on
non-critical paths in the system. Another technique to lower the normalized energy expended by
the CR counter is to time-multiplex several charge-recycling banks to provide power to the target
circuit entirely from recycled charge.
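The comparison above can be restated in terms of the normalized energy-delay product. The sketch below uses the three normalized energy values quoted for chip # 1. It assumes that the 0.57 counter-only value is also read at a normalized delay of 2, and that energy per operation scales with the square of VDD; the names are illustrative.

```python
import math


def energy_delay_product(norm_energy, norm_delay):
    """Normalized energy-delay product (both inputs already normalized)."""
    return norm_energy * norm_delay


NORM_DELAY = 2.0  # assumed common operating point for all three cases
cases = {
    "CR counter + control logic": 0.73,
    "CR counter alone": 0.57,
    "VDD scaling": 0.47,
}
edp = {name: energy_delay_product(e, NORM_DELAY) for name, e in cases.items()}

# Under E ~ VDD^2, a normalized energy of 0.47 corresponds to this VDD ratio.
vdd_ratio = math.sqrt(cases["VDD scaling"])

for name, value in edp.items():
    print(f"{name}: normalized EDP = {value:.2f}")
print(f"equivalent VDD ratio for supply scaling: {vdd_ratio:.2f}")
```

Under these assumptions, supply scaling to roughly 0.69 of the nominal VDD gives the lowest energy-delay product, and the counter-only CR result lies between it and the figure that includes the control logic.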
4.4 Techniques to improve the proposed CR scheme 
Some enhancements to improve the efficiency and the application of the proposed 
charge-recycling approach are examined in this section. 
Virtual VVDD Generation:  
The premise for the selection of the current CR topology was to limit the power consumption in
the CR logic to a minimum. However, the efficiency of the CR scheme can be improved by
reducing the losses within the CR control block. In this CR implementation, the dominant
component of power consumption is the switching loss in the digital logic and in the storage or
recycling capacitors. The generation of VVDD by stacking the charge-storage capacitors results in
charge loss due to charging and discharging the capacitor plates and the parasitic capacitances
of both the top and bottom plates. The switching and charge redistribution losses in this
switched capacitor power supply can be reduced by charge-sharing or charge-recycling
techniques [38] applied at the bottom plates of the capacitors during the transition from the
charge-recycling to the charge-accumulation phase, when the bottom-plate charges are otherwise lost to ground.
The transitions between the two CR phases are controlled by complementary signals.  
Therefore, there is a possibility of current flowing from the storage capacitors to the circuit 
ground, and from the VDD to the CR capacitors, during CR phase transitions. While the loss to 
ground reduces the amount of recycled charge, the charge from VDD increases the VVGND but 
reduces the charge-accumulation time and thus increases the frequency of the CR cycle and the
power consumption of the CR control. Hence, the control of the CR capacitors with non-overlapping
(NOV) clock phases is desirable. With accurate control of the charge recycling phases, the
charge lost during the transitions between the CR phases would be reduced. However, the
generation of all the required control signals would also increase the total power consumption of
the CR control logic. If the dead-time in the NOV control can be exploited to perform charge-sharing
between the capacitors' bottom plates, the benefits outweigh the increase in power
consumption.
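The idea of non-overlapping control can be illustrated with a small behavioral model. The sketch below is not the control circuit of this work. It is a sampled-data illustration, with hypothetical names, in which each phase is asserted only after the control signal has been stable for the dead-time plus one sample, so both phases are low during every transition.

```python
def non_overlapping_phases(ctrl, dead):
    """Derive two non-overlapping phases from a sampled control signal.

    ctrl : list of 0/1 samples (1 = accumulate, 0 = recycle; hypothetical).
    dead : dead-time in samples after each transition of ctrl.
    Returns (phi_ca, phi_cr); both are 0 during the dead-time.
    """
    phi_ca, phi_cr = [], []
    for i in range(len(ctrl)):
        # Look back over the current sample and the preceding `dead` samples.
        window = ctrl[max(0, i - dead): i + 1]
        phi_ca.append(1 if all(v == 1 for v in window) else 0)
        phi_cr.append(1 if all(v == 0 for v in window) else 0)
    return phi_ca, phi_cr


ctrl = [1, 1, 1, 0, 0, 0, 1, 1, 1]
phi_ca, phi_cr = non_overlapping_phases(ctrl, dead=1)
print(phi_ca)
print(phi_cr)
```

The samples where both phases are low correspond to the dead-time that, as noted above, could be used for bottom-plate charge-sharing.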
Adaptive virtual power-supply reference voltages:  
The designed charge-recycling approach to generate a virtual power supply in low-power
digital circuits works effectively for a small range of power-supply levels. Across large
temperature variations, the changes in the device threshold voltage and mobility limit the amount
of recoverable charge (from dynamic switching currents) for a given performance requirement.
Further, the virtual ground and virtual VDD reference voltages that offer the best computation
efficiency depend on the target circuit's topology, the required frequency of operation, and the threshold
voltage of the devices. As seen in the measurement results, the variation in process parameters
affects the comparators' offset voltage and thus the actual power supply voltage that is supplied
to the counter. Hence, the reference voltages can be employed to compensate for the variation in
performance with process and temperature changes. To this effect, a control loop can be devised
that provides dynamic/adaptive virtual VDD and virtual ground voltages by tracking the optimal
supply voltage to ensure performance across process and temperature variations.
Simulations were performed on FO4 buffer stages to quantify the optimal variation in the
virtual reference voltages in order to guarantee a maximum delay increase of two, across a
temperature range of –55 °C to 125 °C. With dynamic virtual supply rails, the maximum voltage
from the CR banks ranges from 100 to 300 mV to meet the performance requirement in the buffer at
100 MHz, across –55 °C to 125 °C. This range of VRG or maximum VVGND necessitates a variable
number of storage capacitors and adaptive gain to generate VVDD. Therefore, an ultra-low-voltage
capable DC-DC converter is essential to boost the voltage from the capacitive storage banks to
the VDD level. Furthermore, self-starting converters with variable gain ratios facilitate the use of
recycled charge for the voltage boosting operation. Thus, a self-starting DC-DC converter used in
conjunction with the CR scheme would not impose any power overhead on the system. The
second part of this dissertation explores the design of a very-low-voltage capable DC-DC
converter.
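The need for adaptive gain can be made concrete with a short calculation. The sketch below assumes an ideal, lossless converter with integer gain ratios and a 1-V target rail, as used for the digital supply on the test board; the function name is hypothetical.

```python
def required_gain(v_target_mv, v_bank_mv):
    """Smallest integer gain that lifts v_bank_mv to at least v_target_mv.

    Assumes an ideal, lossless converter with integer gain ratios.
    Voltages are integers in millivolts to keep the arithmetic exact.
    """
    if v_bank_mv <= 0:
        raise ValueError("bank voltage must be positive")
    return -(-v_target_mv // v_bank_mv)  # ceiling division on integers


VDD_MV = 1000  # assumed 1-V target rail
gains = {v: required_gain(VDD_MV, v) for v in (100, 200, 300)}
print(gains)
```

Across the 100 mV to 300 mV range quoted above, the ideal gain requirement changes by more than a factor of two, which illustrates why a fixed capacitor stack cannot cover the full temperature range.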
Time-multiplexed Charge-recycling:  
Time-multiplexed charge-recovery banks can be employed to provide constant, 
uninterrupted power supply to other digital blocks [26]. Depending on the application, time 
multiplexing can be designed to recycle charge from multiple sources to sustain a single or 
several target blocks. While time-multiplexed CR offers an attractive method to provide power 
supply, energy overhead is involved with the generation of control signals to manage several 
independent, local CR capacitor banks.  
4.5 Summary and Conclusion 
This research has demonstrated the feasibility of operating digital circuits using the 
charge scavenged from the leakage and dynamic load currents that are inherent to digital 
operation. The target application defines the actual implementation of the proposed CR scheme. 
The time-multiplexed CR approach is beneficial in a large system, while the partially self-powered
approach is the best choice to reduce the power consumption in small independent circuits. The
novel charge recycling scheme has been verified on a 12-bit Gray-code counter in a 90-nm process
technology. The measurement results demonstrate an average energy reduction of 23% with the 
charge-recycling scheme. The average energy consumption of the counter alone was decreased 
by 42% by recycling its ground-bound charge.  
A comparison of the efficiency of the proposed charge-recycling scheme with the state-
of-the-art techniques is presented in Table 4.2. The energy reduction figures achieved in this 
work show improvement over previously reported CR-based designs of [26], [27] and [28]. 
Furthermore, the proposed CR scheme avoids the delay introduced in charge-pump based 
voltage boosting techniques, and eliminates external clocks and the need to match the current-
consumption in vertically-stacked CR digital blocks. 
 
Table 4.2 Comparison of the state-of-the-art Charge-recycling based low power techniques 
Reference % Power Saving Features Comments 
[24] 
33 % (VDD stacks) 




[25] 5 % (dual VDD stacks) VDD stacks 
About 66 % reduction 
in noise 
[26] 10 % Adiabatic Charge pump 
External, multiple 




 Single-stage Charge pump 







Capacitive Stack DC-DC 
conversion 
No External CP Clock 
or Voltage boosting 




; 23 % (Avg.) 
*





Chapter 5 Design & Analysis of the Ultra-low 
Voltage DC-DC Converter 
 
This chapter is revised based on the paper published by Chandradevi Ulaganathan et al. 
[35]: 
C. Ulaganathan, B. J. Blalock, J. Holleman, and C. L. Britton, Jr., “An Ultra-Low 
Voltage Self-Startup Charge Pump for Energy Harvesting Applications,” The 55th Intl. Midwest 
Symposium on Circuits and Systems, pp. 206-209, Aug. 2012. 
My primary contributions to this paper include (i) identification of the scope of research,
(ii) compilation of a literature review of previously published work, (iii) design and analysis of 
the proposed solution, (iv) implementation and simulation of the circuit, and (v) preparation of 
the manuscript. 
5.1 Introduction and Motivation 
The proliferation of portable, battery-operated electronic systems has increased the need 
for highly efficient, low-voltage-capable DC-DC boost converters. Further, the emerging niche 
classes of wireless, sensor-based electronic products such as smart sensors, biomedical implants, 
etc., necessitate power-autonomy to provide essential operations for several years without the 
need for battery replacement. In these systems, power-autonomy is realized by harvesting or 
scavenging energy from the surroundings using transducers such as thermoelectric generators 
(TEG), photovoltaic cells (PV), and piezoelectric sensors. However, due to variations in the 
operating conditions, these transducers do not generate a constant output. For instance, the output 
voltage from a PV cell could range from 100 mV to 500 mV due to variations in the incident 
light [36].  
For “on-the-body” applications, as the core body temperature is regulated at 37 °C, there 
exists a temperature difference between the muscle temperature and the skin surface temperature. 
TEGs are employed to harvest energy from this temperature gradient that is readily available 
across the low thermal-conductive fat layer which is sandwiched between the muscles and the 
epidermis (skin). However, the temperature differences vary significantly depending on the 
sensor's position on the body, ambient environment and physical activities [37]. The nominal 
temperature difference perceived by the TEGs ranges from 1 K to 5 K and depends on the 
thickness of the fat layer. Hence, state-of-the-art TEGs require at least 1000 thermocouples 
(occupying an area of 1.3 cm²) to convert a temperature difference of 5 K into an output voltage 
of 1 V that can provide 100 μW of power to the load [37]. While increasing the number of 
thermocouples offers more power output, the power-density from the TEG-cell is independent of 
the number of couples. Hence, efficient energy harvesting can be realized with minimum sensor 
area by employing DC-DC converters that transform the very-low sensor output voltage to useful 
power supply voltage levels in circuits. 
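As a rough check of the TEG figures quoted from [37], the following Python sketch derives the per-thermocouple sensitivity and the areal power density they imply. The function names are illustrative assumptions, and the calculation assumes that the TEG voltage scales linearly with the number of series-connected couples and with the temperature difference, which is an idealization.

```python
def teg_implied_figures(v_out=1.0, n_couples=1000, delta_t=5.0,
                        p_load=100e-6, area_cm2=1.3):
    """Return (V per couple per kelvin, W per cm^2) implied by the quoted TEG numbers.

    Defaults are the values quoted in the text: 1 V from 1000 couples at 5 K,
    delivering 100 uW from 1.3 cm^2.
    """
    volts_per_couple_per_kelvin = v_out / (n_couples * delta_t)
    power_density = p_load / area_cm2
    return volts_per_couple_per_kelvin, power_density


def teg_output_voltage(delta_t, n_couples=1000, volts_per_couple_per_kelvin=2e-4):
    """Estimate the TEG voltage at another temperature difference (linear assumption)."""
    return n_couples * volts_per_couple_per_kelvin * delta_t


if __name__ == "__main__":
    sensitivity, density = teg_implied_figures()
    print(f"sensitivity per couple: {sensitivity * 1e3:.2f} mV/K")
    print(f"power density: {density * 1e6:.1f} uW/cm^2")
    print(f"voltage at 1 K: {teg_output_voltage(1.0) * 1e3:.0f} mV")
```

Under these assumptions the same 1000-couple TEG delivers only about a fifth of its 5 K voltage at a 1 K difference, which illustrates why the converter must accept inputs well below 1 V.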
In order to guarantee efficient, uninterrupted energy harvesting and thereby achieve 
complete power-autonomy, energy harvesting systems need to operate across a wide range of 
input voltages that includes the ultra-low-voltage regime. State-of-the-art energy harvesters 
employ power management circuits (PMCs) to facilitate power autonomy. The critical 
components of a PMC include the DC-DC converter and the control-clock generator that enables 
the converter's startup and operation. The PMC works as an interface between the energy harvester 
and the load (application) in order either to provide a regulated output voltage or to ensure 
maximum power transfer (MPT) to the load [39]-[41], [44]-[49]. This is accomplished by 
varying the converter's DC-DC gain to provide a constant output voltage or by varying the 
frequency of operation to guarantee MPT. Therefore, it is essential that the circuits in the power 
management block are capable of reliable operation across the entire range of the transducer's 
output voltage. 
The charge recycling scheme proposed in Chapter 3 can be considered as a type of 
energy harvester that is capable of enabling power autonomy to a different circuit within the 
same chip. As discussed in Section 3.3, the charge available for recycling varies with the 
circuit's frequency of operation, the threshold voltage of the devices and the circuit's power 
supply voltage. Hence, adopting a CR topology with multiple charge-collection banks (Figure 
3.15) could result in variable input levels to the voltage booster. Further, temperature changes 
would also necessitate adjustment of the charge collection threshold so that the circuit's 
performance can be maintained across temperature. For instance, in a CMOS inverter, allowing a 
maximum delay increase of 2 times, the available charge-bank voltage would range from 100 
mV to 300 mV across –55 °C to 125 °C, in the 90-nm process. Hence, a CR system that adapts to 
input voltage variations across PVT and multiple capacitor banks would truly augment its 
application. 
Thus, solving the problem of reliable DC-DC conversion across a variable input voltage 
range, which extends down to very low input voltages, is critical to energy harvesting 
applications and also improves the energy efficiency of the CR scheme. Hence, the design 
of a low-voltage, self-starting DC-DC converter presents an attractive challenge with many 
applications. The goal of this research is to design a low-voltage-input capable converter that 
does not require external excitation or post-fabrication processing for startup. Furthermore, a 
switched-capacitor topology that enables on-chip integration without the need for external 
components, and with an adaptable number of gain stages, is preferred. Ideal target applications 
include ultra-low-energy products such as real-time clocks and wrist watches. 
5.2 Literature Review 
DC-DC converters can be broadly classified into inductor-based and switched-capacitor 
(SC) based converters. Inductive boost converters are capable of efficiently boosting ultra-low 
input voltages on the order of a few tens of millivolts [39], [40]. However, to avoid the high area 
overhead associated with realizing on-chip inductors and to facilitate integrated multiple charge-
recycling banks (as in Fig 3.x), switched-capacitor based DC-DC converters are explored in this 
research.  
This section starts with a brief introduction to the voltage boosting mechanism in charge 
pumps (CP), followed by a discussion of the losses that affect the conversion efficiency. 
Then, the operation and challenges of state-of-the-art low-voltage converters and their ability to 
self-start are studied. 
5.2.1 Switched Capacitor Charge Pump topologies 
Several CP topologies have been published in literature, each designed to meet specific 
requirements such as high gain, high drive capability, high efficiency, and small area [42]. The 
differences in the architectures lie primarily in the mechanisms employed to transfer charge 
between the pumping stages and the implementation of the charge transfer switches (CTS). This 
section discusses the operation and control of the most commonly employed CP topologies. 
5.2.1.1 Linear Charge pump (LCP) 
The most widely used LCP topology is based on the scheme published by Dickson in 
1976 [43] and has been extensively studied since then to improve its performance. Figure 5.1 
presents a simplified schematic of a three-stage LCP with each CP stage comprising a 
pumping capacitor, CPi, and a charge transfer switch (CTS), Si. The load is represented by a DC 
current sink, IL, and a load capacitor, CL. Figure 5.1 (a) to (c) illustrate the commonly employed 
implementations of CTS in charge pumps. For this discussion, let us consider the switches to be 
ideal and the clocks Φ1 and Φ2 to be complementary phases with a voltage swing equal to the CP's 
input voltage, i.e. VDD.  
During the first cycle of operation, switches S1 and S3 are closed while S2 and S4 are open. 
The capacitor CP1 is charged to VDD, while charge is transferred from CP2 to CP3, charging CP3 to 
approximately 3VDD (in steady state). At the output node, the load current is supplied by CL 
which slowly reduces the output voltage by ILT/(2CL), where T is the clock's time period. During 
the subsequent cycle, switches S2 and S4 are closed while S1 and S3 are open. Since the bottom 
plate of CP1 is pushed up to VDD, node N1 is increased from VDD to 2VDD which results in charge 
transfer to CP2 through the closed switch S2. Similarly, node N3 is pumped to 4VDD and charge 
lost by CL during the previous phase is replenished from CP3. Additionally, charge of ILT/2 is 
transferred from CP3 to sustain the load current during this phase. Hence in each clock period, a 
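charge equal to ILT is transferred between adjacent capacitors towards the direction of the load, 
and the steady-state output voltage of the CP is given by 

$$V_{OUT} = (N+1)\,V_{DD} - \frac{N\,I_L\,T}{C_P}, \qquad (5.1)$$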
where N is the number of pumping stages. The second term in (5.1) represents the reduction in 
output voltage due to the charge redistribution loss incurred in transferring charge from the input to the DC 












Figure 5.1 Simplified schematic of a three-stage linear charge pump with charge transfer switch (CTS) 
implementations (a) CMOS diode CTS (b) CTS with gate control generator and (c) Bootstrapped CTS. 
 
 (N+1)·VDD. Thus, a charge pump extracts charge from the input source and pumps this charge 
through the different pumping capacitors to a higher voltage at the output. This voltage boosting 
is accomplished in a linear manner from one stage to the next in the LCP topology. 
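The linear boosting described above can be illustrated numerically. The following Python sketch is a minimal model that assumes ideal switches, no parasitic capacitance and equal pumping capacitors, and uses the standard ideal-switch relation for a Dickson-type pump, V_OUT = (N+1)·VDD − N·IL·T/CP. The per-stage drop IL·T/CP follows from the charge IL·T moved between adjacent capacitors in each period. The function names and the example component values are assumptions made for illustration only.

```python
def lcp_output_voltage(n_stages, vdd, i_load=0.0, period=0.0, c_pump=1.0):
    """Steady-state output of an ideal N-stage linear charge pump.

    Each stage adds VDD; moving a charge of i_load*period through a pumping
    capacitor c_pump costs a voltage step of i_load*period/c_pump per stage.
    """
    drop_per_stage = i_load * period / c_pump
    return (n_stages + 1) * vdd - n_stages * drop_per_stage


def output_droop(i_load, period, c_load):
    """Droop of the output node while the load capacitor alone feeds the load
    for half a clock period (IL*T / (2*CL))."""
    return i_load * period / (2.0 * c_load)


if __name__ == "__main__":
    # Hypothetical operating point: 3 stages, 300 mV input, 1 uA load,
    # 1 MHz clock, 100 pF pumping capacitors, 1 nF load capacitor.
    print(lcp_output_voltage(3, 0.3))
    print(lcp_output_voltage(3, 0.3, i_load=1e-6, period=1e-6, c_pump=100e-12))
    print(output_droop(1e-6, 1e-6, 1e-9))
```

With no load the three-stage example returns 4·VDD, matching the 4VDD level to which node N3 is pumped in the description above.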
 Variations in the LCP topology can be realized by different implementations of the CTS 
and control clocks. Figure 5.1 (a) illustrates the case where the CTS is realized as a MOS diode. 
This topology eliminates the need for switch control generation but results in a lower output 
voltage due to the presence of a threshold voltage drop in each CP stage. This threshold voltage 
drop across each stage renders this topology unsuitable for low-voltage applications. Figure 5.1 
(b) shows an NMOS CTS and its associated gate control (GC) signal generation circuit [59] that 
employs dynamic level shifters using the pumped node voltages from different CP stages. The 
challenge with this approach lies in the choice of appropriate level-shifting voltages and their 
relative phase transitions.  Depending on the relative phase of control signal used for the 
pumping capacitors, reverse current paths could exist between these CP capacitors. Also, 
dynamic switching loss in the form of short-circuit current might result due to signal transitions 
at different phases at the level-shifter rails. Hence, careful implementation and analysis are 
essential to take advantage of this topology for low-voltage operation. This topology is further 
discussed in detail during the design of the proposed CP. 
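To make the effect of the per-stage threshold drop in the diode-connected CTS concrete, the following Python sketch compares the no-load output of an ideal-switch LCP with that of a diode-based LCP. It assumes the textbook no-load relation for a diode-connected Dickson pump, (N+1)·(VDD − VTH), with a clock swing equal to VDD and no parasitic capacitance. It also crudely treats the diodes as fully off when VDD is below VTH, ignoring subthreshold conduction. The function names and voltage values are hypothetical.

```python
def ideal_lcp_no_load(n_stages, vdd):
    """No-load output of an LCP with ideal charge transfer switches."""
    return (n_stages + 1) * vdd


def diode_lcp_no_load(n_stages, vdd, vth):
    """No-load output of an LCP with diode-connected MOS switches.

    Every diode (N between stages plus one at the input) loses one VTH.
    Subthreshold conduction is ignored, so the result is clamped at zero
    when VDD < VTH (a simplifying assumption).
    """
    return (n_stages + 1) * max(vdd - vth, 0.0)


if __name__ == "__main__":
    for vdd in (1.0, 0.5, 0.3):
        print(vdd, ideal_lcp_no_load(3, vdd), diode_lcp_no_load(3, vdd, vth=0.4))
```

In this simplified model the diode-based pump loses a fixed VTH per stage regardless of VDD, so the relative penalty grows as the input voltage falls and the boost vanishes once VDD approaches VTH.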
Figure 5.1 (c) presents the bootstrap-capacitor-based implementation of the CTS along with 
the non-overlapping (NOV) control signals. To understand the operation, consider the case when 
switch Si is open, Φ1 is HIGH (at VDD), and Φ2 and ΦB1 are LOW (at ground). The bootstrap switch 
SBi is closed (VGS,SBi = 2VDD) and the capacitor at Si's gate node is charged to the previous stage's 
node voltage (Vi-1). During the next clock phase, Φ2 is HIGH (at VDD), ΦB1 is also HIGH (but 
at 2VDD) while Φ1 is LOW (at ground). The bootstrap switch SBi is open since its VGS is zero. The 
top plate of the bootstrap capacitor is at high impedance, so the voltage at Si's gate increases 
to Vi-1 + 2VDD when ΦB1 is at 2VDD. This sets a VGS of VDD at the charge transfer switch Si and 
turns it ON for charge redistribution. To eliminate the 2VDD amplitude required for ΦB1, a four-
clock-phase control methodology has been developed in [52], wherein the smaller bootstrap 
capacitor CBi charges to Vi-1 + VDD for a very short period of time before ΦB1 goes HIGH to VDD. 
This provides a VGS of VDD at Si to enable the charge redistribution to the next stage's CP. Thus, 
the bootstrapped topology is well suited for low-voltage operation but requires complex four-
phase clock signal generation, which is power hungry, and also occupies a larger area due to the 
bootstrap capacitors [61]. 
5.2.1.2 Fibonacci Charge Pump (FCP) 
The Fibonacci charge pump presented in Figure 5.2 produces an ideal voltage gain 
equal to the (N+3)th number in the Fibonacci sequence. For instance, the three-stage topology 
would have a gain of 5 V/V, which is the 6th number in the sequence 0, 1, 1, 2, 3, 5. As shown in 
Figure 5.2 (a), during the Φ1 phase (i.e. HIGH), the odd-stage pumping capacitors are charged to 
their steady-state values while the even-stage capacitors are connected in series to boost their 
top-plate voltage so as to charge the next stage capacitor. In the next phase (presented in Figure 
5.2 (b)), the odd-stage capacitors are connected in series, while the even-stage capacitors are 
replenished with charge to reach their steady-state values. Thus, with a series-parallel connection 
between the adjacent CP stages, higher conversion gains than the LCP can be achieved. The 
improvement in gain is more significant with increase in the number of pumping stages. The 
main disadvantage with this topology is the considerable reduction in output voltage, compared 
to the ideal value, due to parasitic capacitances with increasing number of stages. 
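The gain rule stated above can be checked with a few lines of Python. This sketch counts the sequence 0, 1, 1, 2, 3, 5, ... starting from 0 as the first number, as in the text, and returns the ideal (parasitic-free) FCP gain for N stages. The function names are illustrative assumptions.

```python
def fibonacci_number(k):
    """Return the k-th number (1-indexed) of the sequence 0, 1, 1, 2, 3, 5, ..."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    a, b = 0, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a


def fcp_ideal_gain(n_stages):
    """Ideal voltage gain of an N-stage Fibonacci charge pump: the (N+3)-th number."""
    return fibonacci_number(n_stages + 3)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, fcp_ideal_gain(n), n + 1)  # stages, ideal FCP gain, ideal LCP gain
```

The printed table shows the FCP gain pulling away from the LCP gain of N+1 as stages are added, which is the behaviour the paragraph describes. The parasitic-induced reduction noted above is not modelled.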
5.2.1.3 Exponential Charge Pump (ECP) 
The exponential CP topology has gained importance in low-voltage applications where 
large conversion gains are needed with tight area constraints. Figure 5.3 presents a simplified 
schematic of a three-stage ECP that provides an ideal voltage gain of 8 V/V (which corresponds 
to 2³). The operation of the CP can be readily understood by referring to the steady state 
operation shown in Figure 5.3 (a) for the case when Φ1 is HIGH and Φ2 is LOW. In this phase, 
the pumping capacitors in the top branch are connected in series and provide the steady-state 
output voltage of 8VDD. The capacitors in the bottom branch are connected to the charge 
replenishing nodes from the top capacitors, and thereby charge up to VDD, 2VDD and 4VDD 
respectively. As illustrated in Figure 5.3 (b), during the next phase (Φ2 is HIGH), the capacitors 
interchange their roles to sustain the steady state output voltage of 8VDD and to refill the charge 





















The main disadvantage with this topology is the higher output resistance when compared 
to the LCP and FCP topologies. Hence, the series voltage drop across the CTS increases with the 
output load current and thus the performance of the CP is reduced [42]. Further, since the 
pumping capacitors are connected in series, the presence of parasitic bottom plate capacitances 
deteriorates the maximum possible output voltage.  
5.2.1.4 Operational Losses in CPs 
The efficiency of the CP's voltage boosting operation is reduced by the operational losses 
incurred during the process of transferring charge from the CP's source to the load. Thus, a study 
of the different mechanisms of loss is essential to improve the efficiency of the CP topologies. 
Switching loss 
The current required to charge and discharge the bottom plate capacitance of the charge pump 
capacitors, and also the gate capacitance of the charge transfer switches contributes to the 
switching loss. Additionally, topologies that employ gate control circuits to generate the control 
signals for the CTS incur additional losses due to shoot-through (or short-circuit) current that 
can flow between the CP nodes used as supply rails during switching transients. Reducing the 
CP's operating frequency would reduce the switching losses, but the CP's output voltage would 
experience a larger droop as the clock slows. Hence, the solution to lower 
switching losses is to operate the CP at an optimal frequency that produces an acceptable output 
ripple voltage, and to size the CP capacitors and CTS for optimal performance. 
Conduction loss 
The finite “on” resistance of the CP's charge transferring switches results in a small 
voltage drop in each stage and thus causes the CP's conduction loss. The switch's “on” 
resistance in the triode region is given by 

$$R_{ON} = \frac{1}{\mu\,C_{OX}\,\frac{W}{L}\,(V_{GS} - V_{TH})},$$

where μ is the mobility, COX is oxide capacitance, W/L is the CTS device size, VTH is the 
threshold voltage and VGS is the gate-to-source voltage. This series resistance results in a voltage 
drop that depends on the amount of charge transferred by the CP. One way to mitigate this loss is 
to increase the width of the MOS switch and thereby reduce RON and the series voltage lost 
during conduction. However, very large devices introduce a significant increase in parasitic 
capacitances, such as gate-to-diffusion and gate-to-body capacitances, that require more current for 
their charge and discharge during switching. Thus, an optimal switch size is required to keep the 
sum of the conduction and switching losses to a minimum in the CP. 
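The sizing trade-off can be sketched with a simple first-order model. The Python example below assumes the square-law triode expression for the on-resistance, 1/(μCOX·(W/L)·(VGS − VTH)), which holds only for a switch driven above threshold. It models the conduction loss as I²·RON and the switching loss as f·C·V², with a gate capacitance that grows linearly with W/L. All parameter names and numerical values are hypothetical and serve only to show that the two loss terms have a minimum at a finite switch size.

```python
import math


def r_on(mu_cox, w_over_l, vgs, vth):
    """Triode-region on-resistance of a MOS switch (square-law model, small VDS)."""
    overdrive = vgs - vth
    if overdrive <= 0:
        raise ValueError("square-law model requires VGS > VTH")
    return 1.0 / (mu_cox * w_over_l * overdrive)


def total_loss(w_over_l, i_rms, freq, v_swing, mu_cox, vgs, vth, c_gate_per_wl):
    """Sum of conduction loss (I^2 * Ron) and gate switching loss (f * C * V^2)."""
    conduction = i_rms ** 2 * r_on(mu_cox, w_over_l, vgs, vth)
    switching = freq * (c_gate_per_wl * w_over_l) * v_swing ** 2
    return conduction + switching


def optimal_w_over_l(i_rms, freq, v_swing, mu_cox, vgs, vth, c_gate_per_wl):
    """Minimize a/x + b*x over x = W/L; the minimum lies at sqrt(a/b)."""
    a = i_rms ** 2 / (mu_cox * (vgs - vth))
    b = freq * c_gate_per_wl * v_swing ** 2
    return math.sqrt(a / b)


if __name__ == "__main__":
    # Hypothetical values: 10 uA rms, 1 MHz, 0.5 V swing, 200 uA/V^2,
    # 0.2 V overdrive, 1 fF of gate capacitance per unit W/L.
    params = dict(i_rms=10e-6, freq=1e6, v_swing=0.5, mu_cox=200e-6,
                  vgs=0.5, vth=0.3, c_gate_per_wl=1e-15)
    best = optimal_w_over_l(**params)
    print(best, total_loss(best, **params))
```

At ultra-low supply voltages the switches operate near or below threshold, where this square-law model no longer applies, so the sketch should be read as a qualitative illustration of the trade-off only.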
Reversion loss 
Reversion loss occurs when reverse currents flow from the later stages to the previous 
stages in a charge pump, i.e. current flowing in the direction opposite to the CP's charge transfer. 
The most common cause of this reverse current is the control of adjacent CTS gates with 
complementary clock phases. As illustrated in Figure 5.4, during switching transitions, when switch 
S2 is being opened and S1 and S3 are being closed, a brief period of time exists when adjacent 
switches conduct simultaneously. This results in the flow of reverse current from the higher potential at 
node N2 to node N1 and thereby causes loss of charge in the direction opposite to the power 
transfer in the CP. Reversion loss can be eliminated by implementing the adjacent CTS with a 
'break-before-make' action wherein a switch is first opened before closing the adjacent switches 
for charge redistribution. This can be accomplished by employing non-overlapping clock signals 




Figure 5.4 Simplified schematic of a linear charge pump stage illustrating reverse current loss.  
 
Charge Redistribution loss 
Redistribution loss is due to the transfer of charge between the charge pump capacitors. 
Charge redistribution loss is independent of the “on” resistance and is proportional to the 
frequency of operation and the load current. Maintaining a low ripple voltage results in smaller 
charge redistribution between the CP stages and thus lowers the charge redistribution loss 
component in the charge pump. 
Equipped with this brief introduction to the operation and losses in charge pump 
topologies, the rest of this section discusses the state-of-the-art DC-DC converters that are 
pertinent to ultra-low voltage, energy harvesting applications.  
5.2.2 Low voltage self-startup in converters 
The self-startup feature denotes the unique capability of the converter to generate all the 
required control signals and thus enables autonomous operation as long as a valid input is 
available for conversion. Several self-starting converters for energy harvesting applications have 
been published in literature [39]-[41], [44]-[50]. A significant number of these designs have 
demonstrated autonomous operation when the input voltage is above a certain threshold level 
called the startup voltage. However, these designs do not operate unaided at very-low input 
voltages. The low-voltage startup aids in these designs include external excitation, such as a 
battery [47], mechanical vibrator [39], or transformer [41]. External battery initiated startup is 
the most commonly used methodology at low voltages as it avoids the need for interface circuits 
required in vibrators. 
Of the few designs that eliminate start-up aids, post-fabrication threshold voltage (VTH) 
tuning is performed in [44] to ensure low voltage startup, whereas [45] uses forward body-
biasing to lower the device threshold voltages in the SC charge pump. These designs offer low 
startup voltages of 100 mV and 180 mV respectively. The energy harvester presented in [49] is a 
standalone, low-voltage, self-starting DC-DC converter that is targeted for energy harvesting 
from photovoltaic cells and is capable of boosting inputs as low as 270 mV without any external 
excitation. A recently published work [41] exploits white noise to trigger oscillations in a 
transformer-capacitor start-up circuit. The transformer-based converter is capable of an ultra-low 
startup voltage of 20 mV in a 130-nm process. The research in [50] employs an exponential 
charge pump (ECP) to achieve a very low startup voltage of 150 mV in a 65-nm process. 
Table 5.1 presents a comparison of the state-of-the-art ultra-low voltage DC-DC converters and 
their startup characteristics. 
5.2.3 Ultra-low voltage SC converters 
This section discusses in detail the SC-based converter designs of [49], [50] and [45] that 
do not require any external excitation for starting, and operate at very low input voltages. 
An ultra-low voltage, four-stage linear charge pump [46] that works down to 0.7 V of 
input voltage is illustrated in Figure 5.5. This CP employs cross-coupled branches to reduce the 
output ripple voltage. During each clock phase, the CTS are controlled such that one branch (top 
or bottom) pumps charge towards the load while the capacitors in the other branch replenish their 
 
 
Figure 5.5 Schematic of the low-voltage charge pump in [46]  
charge to reach the respective steady state levels. Controlling the two symmetric CP branches 
with complementary clock phases facilitates the use of symmetric nodes from the opposite 
branch to generate the CTS gate-control. The main disadvantage of this CP is the presence of 
reverse current paths across all the pumping stages. This significantly reduces the maximum 
achievable output voltage. Furthermore, for self-startup topologies where the clock signals would 
have to be generated from the low input voltages, reverse current loss would drastically increase 
due to slower clock transitions and thereby impede CP startup.  
Figure 5.6 presents the low-voltage capable charge pump published in [45]. The 3-stage 
charge pump employs cross-coupled, voltage doubler cells [57] that are connected in series to 
achieve the required gain. Similar to [46], employing complementary clock signals to control the 
cross-coupled branches helps in the generation of CTS control signals. Additionally, the CP 
incorporates forward body biasing (FBB) to reduce the threshold voltages of the serially 
connected charge transferring switches. The body-bias voltage for the PMOS is derived from the 
input of the previous stage while the NMOS body is biased by the output of the next stage. An 
extra CP stage with small pumping capacitors that does not contribute to the output voltage, is 
 
Figure 5.6 Simplified schematic of the low-voltage charge pump in [45] 
required to generate the third-stage NMOS device's body bias voltage, which is higher than VOUT 
of the CP. This charge pump design was employed to boost a 180 mV input to a 0.5 V output 
that drives the clock generation circuitry to control the inductor-based boost converter.  
Since this topology includes two series switches in each CP stage, the switches have to be 
sized large enough to limit the conduction loss. Further, as the body biasing for the CTS is 
derived from the pumping nodes, charge is lost in the process of charging and discharging the 
body (diffusion well). Another significant loss mechanism exists during the switching transitions 
when there is a possibility of all the switches being ON for a brief period of time and large 
reverse current could flow from the output to the source. Thus, the efficacy of this CP is eclipsed 
by the existence of several loss mechanisms. 
The simplified schematic of the low-voltage charge pump implemented in [49] is shown 
in Figure 5.7. The charge pump uses two parallel charge-transfer switches (CTS) in each stage to 
provide efficient charge transfer across a wide range of input voltages (Vin). The components 
MAX, MBX, Cbx, and Cpx form the conventional four-phase Dickson CP cell [52] that is controlled 
by boosted gate voltages derived from Vin. The gate controls for the Sx switches are derived from 
the output voltages. At very low input voltages, the output voltage is also low so the charge 
 





transfer to the subsequent stage is achieved by the boosted gate control switches MAX. For high 
input voltages, the switches with gate control derived from Vout dominate the charge transfers.  
This self-starting converter, designed in a 130-nm process technology, employs a four-stage, 
four-phase SC charge pump (CP) to provide a regulated output voltage of 1.4 V at 58% 
efficiency for a low input of 450 mV. The output voltage from the CP is regulated to within 23 
mV of 1.4 V by comparing a scaled version of the output with an on-chip bandgap voltage 
reference. Also, this work uses the input voltage to power up the clock and phase generation 
circuits and thereby eliminates the need for an external start-up voltage [49]. The system clock is 
synchronized (latched) with a comparator such that charge transfers along the CP stages are 
enabled only when the output voltage drops below the permissible ripple voltage range. The 
unregulated converter starts at a low voltage of 270 mV while providing a gain of 3. Thus, this 
self-starting converter is well suited for low-voltage photovoltaic cell applications but is not 
conducive to ultra-low voltage TEG systems. 
An ultra-low voltage exponential charge pump (ECP)-based converter capable of 150 mV 
startup voltage has been recently published [50]. The schematic of the CP core is shown in 
Figure 5.8. Since the two-phase (Φ1, Φ2) clock generator is powered from the input voltage, the 
ECP requires bootstrapped switches to control charge redistribution between the CP stages. The 
CP was implemented in a 65-nm process with a total on-chip capacitance of 5 nF. With an input of 
150 mV, simulation results illustrate an output of 850 mV at 1 µA of DC load current with an 
efficiency of 30% [50]. This CP lends itself well to low-voltage energy harvesting applications, 
but the problems with the ECP lie in the generation of the CTS control signals and the inability to 
provide multiple conversion gains.  
Table 5.1 summarizes the characteristics of these converters along with other state-of-the-art 
ultra-low startup designs. The designs of [39], [40], [47], [48], [41], and [68] (first six rows in 
the table) that employ external or assisted startup are included in this table to provide a 
comprehensive review of state-of-the-art converters. Even though the inductor-based converters 
are capable of boosting ultra-low input voltages (20 mV), it is to be noted that they require 




converters represent efficient but expensive DC-DC conversion. Charge pump based solutions 
offer the possibility of cheap designs that are capable of self-startup. However, innovative 
control of the CP stages is required to improve the efficiency of these SC-based boost converters.  
From the study of these SC converters, it can be inferred that efficient low-voltage 
operation is highly dependent on the CTS implementation and its control strategy. For example, 
in a linear charge pump topology, the unassisted minimum startup voltage published in literature 
is 270 mV (130 nm process) [49], while the VTH tuning using forward body-biasing reduces the 
startup voltage to 180 mV in a 65 nm process [45]. Even though the design in [50] does not state 
a self-startup feature, the exponential CP design can operate with input voltages as low as 150 
mV, provided the control signals are available. Furthermore, the CTS implementation in these 
topologies and their relative phases at the adjacent CP stages determine the reverse current 
component that deteriorates the maximum attainable output voltage. Thus, an optimal solution 
 
 
Figure 5.8 Simplified schematic of the low-voltage exponential charge pump in [50] 


















35 mV Inductive 50 mV 1.8 V 58 % 350-nm 
[40] External battery 650 mV Inductive 20 mV 1 V 75 % 130-nm 
[47] External battery 2 V LCP 0.6 V 2 V 70 % 350-nm 
[48] Hybrid (External inductors, CP) 200 mV Inductive 0.2 V 1.2 V 36 % 180-nm 
[68] undescribed 20 mV Transformer 20 mV 2.35-5 V 40 % undescribed 
[41] White noise into transformer, cap 40 mV Transformer 40 mV 1.2 V 








Ext. CP cap 








180 mV 0.7 V N/A 65-nm 
[49] Charge pump 270 mV LCP 450 mV 1.4 V 58% 130-nm 
[50] Charge pump 150 mV ECP 150 mV 0.85 V 30 to 80% 65-nm 
 
that minimizes the operational losses in the CTS control would significantly improve the 
efficiency and the startup voltage in low-voltage charge pumps.  
In conclusion, the literature study of the different low-voltage CP topologies has clearly 
emphasized the need for a voltage converter that both offers ultra-low voltage operation and is 
truly autonomous, i.e. self-starting without the need for external excitation. Further, a switched-
capacitor based converter that eliminates the need for external components and facilitates an 
adaptable number of gain stages would greatly improve the efficiency of energy harvested using 
transducers. 
5.3 Design of the proposed low-voltage, self-starting DC-DC converter 
The proposed low-voltage DC-DC converter is intended for energy harvesting 
applications, as well as for extending the application of the charge recycling scheme to operate 
efficiently across process, voltage and temperature variations. The design goals for the DC-DC 
converter include: 
• Ultra-low voltage operation in the 100 mV range to minimize the amount of 
wasted energy from transducers and the recycled charge across PVT variations. 
• Unassisted self-startup ability to enable power autonomy in harvesters and to 
reduce the energy overhead on the charge recycling system.   
• Easily variable conversion gain that adapts to changes in the input voltage. 
• Area-efficient, on-chip integration of the converter to facilitate multiple charge-
recycling systems in a single chip and thereby increase energy efficiency by 
virtue of dynamic voltage scaling (DVS). 
5.3.1 Architecture of the Converter 
The block level schematic of a power management circuit (PMC) that includes the 
proposed self-starting converter is presented in Figure 5.9. The low output voltage (VIN) from 
energy scavengers is boosted by the charge pump to either meet the power supply requirements 
of the target application, or, to be stored in the buffer for later use. The control signals required 
for the CP operation are generated by the five-stage ring oscillator (RO) and the phase generator 
(PG) circuits. These circuits are powered directly from VIN in order to enable power-autonomy in 
the converter. An output control block is employed in the PMC to provide maximum power 
transfer (MPT) from the CP to the load by controlling the number of conversion stages in the CP 
[47] or by varying the frequency of operation [40].  
At startup, the system clock and the CP control signals are generated as soon as the 
harvested voltage (VIN) reaches the minimum power supply voltage (VDDMIN) required by the RO. 
VDDMIN is limited by the current matching in the PMOS-NMOS devices, the VTH, and the 
variation in device characteristics across process corners. Additionally, since the circuits operate 
in the subthreshold region at low startup voltages, the device characteristics are more sensitive to 
process and mismatch variations. Hence, low-threshold voltage (LVT) devices with proper device 
sizing (as in [51]) are essential to minimize variation in the inverter's switching point (i.e. 
PMOS-NMOS ID and VTH) and thus to guarantee oscillation at low startup voltages.  
A conventional non-overlapping (NOV) phase generator is employed to produce the 
clock phases for the different stages in the CP. The NOV clocks minimize the reduction in output 
voltage due to reverse currents flowing from subsequent stages to previous stages along the CP. 
The charge pump performs the voltage boosting operation to provide the system power to the 
application. The remainder of this section discusses the design details of the RO, phase 
generator and the proposed low-voltage CP circuits.  
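The break-before-make behaviour that non-overlapping clocks provide can be illustrated with a small behavioural model. The Python sketch below is not the phase generator circuit used in this work. It is a sampled-time abstraction, assumed here for illustration, that derives two phases from a single clock and a delayed copy of it, so that each phase rises only after the other has been low for a chosen dead time. The function name and the dead-time parameter are hypothetical.

```python
def non_overlapping_phases(clk, dead_time):
    """Derive two non-overlapping phases from a sampled clock.

    clk       : list of 0/1 samples of the input clock
    dead_time : number of samples during which both phases stay low
                after either phase falls

    phi1 is high only when clk and its delayed copy are both high;
    phi2 is high only when both are low. The two can therefore never be
    high at the same time.
    """
    if not 0 <= dead_time < len(clk):
        raise ValueError("dead_time must satisfy 0 <= dead_time < len(clk)")
    # Delay the clock by dead_time samples, holding the first sample value.
    delayed = [clk[0]] * dead_time + clk[:len(clk) - dead_time]
    phi1 = [int(c and d) for c, d in zip(clk, delayed)]
    phi2 = [int((not c) and (not d)) for c, d in zip(clk, delayed)]
    return phi1, phi2


if __name__ == "__main__":
    clock = [0] * 4 + [1] * 4 + [0] * 4 + [1] * 4
    p1, p2 = non_overlapping_phases(clock, dead_time=1)
    print(p1)
    print(p2)
```

In this model every falling edge of one phase is followed by at least the dead time before the other phase rises, which is the property that blocks the reverse current path between adjacent stages.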
 
 
Figure 5.9 Topology of the proposed self-starting converter 
5.3.2 Choice of CP topology 
SC charge-pumps provide a boosted output voltage by phased charge-transfer from the 
input source to the output load through the CP stages. The conversion ratio depends on the 
number of CP stages and the topology of CP cell employed in each stage. This section 
investigates the suitability of linear CP (LCP), voltage doubler, Fibonacci CP (FCP), and 
exponential CP (ECP) to autonomous, low-voltage operation.  
High conversion gain with fewer CP stages than in LCP designs can be 
realized by cascading voltage doubler cells [57]. This topology doubles the voltage at each CP 
stage by driving the bottom plate of the CP capacitor with a clock amplitude equal to the top 
plate's steady-state voltage. Hence, the main disadvantage of this approach is the requirement to 
generate CP clocks with different voltage swings that depend on the stage in the CP. This can 
be accomplished by bootstrapped clock generation from the previous stage's output, or the clocks can be 
derived from the output voltage of the CP's last stage. However, these clock generation 
techniques add to the parasitic charge losses and thus degrade the efficiency. At low-voltage 
operation, larger pumping capacitors in each stage and bigger switch sizes are required to reduce 
the “on” resistance and thereby improve the efficiency of the pump.  
FCP- and ECP-based converters also provide higher boost ratios with fewer CP 
stages than LCPs, which makes them desirable for low-input-voltage, high-
gain conversion requirements as demonstrated in [50]. Both the FCPs and ECPs implement 
series-parallel connections between the CP stages in order to realize the required gain. The most 
challenging part in the design of these charge pumps is the control generation for the CTS, 
which requires bootstrapped gate drive. The main drawback of these 
topologies is that the number of stages cannot be easily adjusted to obtain multiple conversion 
ratios with dynamically varying input voltages. Also, in the presence of parasitic capacitors, 
LCPs achieve a higher output voltage than FCPs and ECPs [58].  
The LCP topology facilitates autonomous operation as a constant clock voltage swing of 
VIN, across all the CP stages, is sufficient to provide a linear boost of VIN for each stage. Further, 
since the CP is intended for use with low-voltage, low-power applications, the maximum number 
of stages is about 5 to 6, which is not high enough to introduce a significant loss in efficiency. Hence, 
a three-stage LCP that provides a theoretical gain of 4 V/V is adopted in the proposed DC-DC 
voltage converter. 
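As a rough illustration of the comparison above, the following sketch tabulates the ideal (lossless) conversion gain against the number of stages. The function names are hypothetical. Only the (N+1) gain of the LCP is stated in this work; the doubler and Fibonacci expressions are the usual textbook ideal ratios and are assumed here.

```python
def lcp_gain(n_stages: int) -> int:
    """Ideal gain of a linear charge pump: one VIN is added per stage."""
    return n_stages + 1


def doubler_gain(n_stages: int) -> int:
    """Ideal gain of cascaded voltage doublers (assumed textbook value)."""
    return 2 ** n_stages


def fibonacci_gain(n_stages: int) -> int:
    """Ideal gain of a Fibonacci charge pump (assumed textbook value)."""
    a, b = 1, 1
    for _ in range(n_stages):
        a, b = b, a + b
    return b


if __name__ == "__main__":
    # Columns: stages, LCP, cascaded doubler, Fibonacci
    for n in range(1, 7):
        print(n, lcp_gain(n), doubler_gain(n), fibonacci_gain(n))
```

For the three-stage LCP adopted here, `lcp_gain(3)` gives the theoretical gain of 4 V/V quoted above.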
5.3.3 Design of Ring Oscillator 
The simplified schematic of an inverter-based ring oscillator topology is shown in Figure 
5.10. According to the Barkhausen stability criterion, the necessary conditions for oscillations in a 
feedback loop are: 
• The loop gain should be equal to unity, and 
• The total phase shift in the feedback loop should be a multiple of 2π. 
The feedback loop in an RO with identical inverter cells implies that each stage would 
provide a phase shift of 2πk/N at the frequency of oscillation, where k is an integer and N is the 
number of inverter stages. Since an inverter has an inherent phase shift of π, the stability criterion 
is met in an RO with a phase shift of π + θ at each stage. Further, in order to generate a high 
frequency of oscillation at low power consumption, the number of inverter stages in the RO is 
kept as small as possible. As shown in Figure 5.13, this implies that N·θ = π and results in  
$$ \text{Phase shift},\ \theta = \pi / N \qquad (5.3) $$
for each stage [53]. This sets the practical minimum number of stages in a single-ended RO as 3. 
With an even number of stages in an RO, there exists a possibility that the loop might settle at a dc 
condition and not produce any oscillations.  
 
 
Figure 5.10 Model of inverter-based ring oscillator 
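For sustained oscillations, the gain requirement should also be satisfied and this entails  
$$ |A(j\omega_0)|^{N} \ge 1 \qquad (5.4) $$
with the transfer function of each stage given by
$$ A(j\omega) = -\frac{g_m R}{1 + j\omega RC} \qquad (5.5) $$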
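where gm is the transconductance of the stage, and R and C represent the resistive and capacitive 
loads for each stage. At the frequency of oscillation ω0, the phase of the transfer function in (5.5), 
in magnitude and beyond the π contributed by the inversion, is given by [53] 
$$ \theta = \tan^{-1}(\omega_0 RC) \qquad (5.6) $$
so that the frequency of oscillation is
$$ \omega_0 = \frac{\tan\theta}{RC} \qquad (5.7) $$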
The gain criterion can be derived from (5.5) as  
$$ g_m R \ge \frac{1}{\cos\theta} \qquad (5.8) $$
which is 2 for a three-stage RO and 1.24 with five inverter stages. Equation (5.8) shows that it 
is easier to meet the gain criterion with long inverter chains due to the smaller gain requirement from 
each stage. This result is especially important for low-voltage, subthreshold operation regimes 
where the gain can be very low. Hence, a five-stage inverter ring is adopted in this design. 
Furthermore, a 5-stage topology keeps the power consumption low when compared to higher N 
and also allows for a good range of tuning of the oscillation frequency.  
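The per-stage gain figures quoted above can be checked numerically. The sketch below assumes the per-stage phase shift θ = π/N and the gain criterion g_m·R ≥ 1/cos θ; the function name is hypothetical.

```python
import math


def min_stage_gain(n_stages: int) -> float:
    """Minimum g_m*R per stage for an N-stage ring, assuming theta = pi/N."""
    theta = math.pi / n_stages
    return 1.0 / math.cos(theta)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, round(min_stage_gain(n), 3))
```

The required gain falls towards unity as N grows, which is why a longer ring is easier to start in the low-gain subthreshold regime, at the cost of extra stages that consume power.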
The frequency of the generated clock is dependent on the power supply voltage, which is 
the input voltage from the transducer. For the RO in Figure 5.13, the load resistance R in (5.7) is 
the “on” resistance of the device and is dependent on the applied VDD. The RO frequency can 
be expressed as
$$ f_0 = \frac{1}{N\,(t_{dr} + t_{df})} \qquad (5.9) $$
where tdr and tdf are the propagation delays for the rising and falling transitions in each stage, and 
N is the number of stages. In this research, capacitive delay cells have been included at the 
output of each stage in order to be able to control the output frequency for a given input, VIN. A 
3-bit decoder is employed to add or remove delay cells so as to tune the frequency of the 
generated CP clock.  
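A minimal numerical sketch of the delay-based frequency estimate follows, assuming the usual ring-oscillator expression f0 = 1/(N·(tdr + tdf)). The function name and the delay values are hypothetical and are not simulation results from this work.

```python
def ro_frequency(n_stages: int, t_dr: float, t_df: float) -> float:
    """Oscillation frequency in Hz from per-stage rise/fall delays in seconds."""
    return 1.0 / (n_stages * (t_dr + t_df))


if __name__ == "__main__":
    # Hypothetical per-stage delays of 1 us each in a 5-stage ring.
    print(ro_frequency(5, 1e-6, 1e-6))
```

Adding load capacitance at each stage lengthens tdr and tdf, so switching in more delay cells lowers the clock frequency.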
5.3.3.1 Low-voltage startup in Ring Oscillators 
The ring oscillator's startup voltage defines the minimum possible input voltage that can 
be boosted by the charge pump. Hence, it is critical to ensure that the RO is capable of providing 
a stable system clock even at very low supply voltages. The design methodology adopted here is to 
set the sizing ratio of the CMOS devices to facilitate subthreshold operation [51].  
The RO's startup voltage is determined by the current matching in the PMOS-NMOS 
devices, the VTH, and the variation in device characteristics across process corners. The sizing 
ratio of PMOS-to-NMOS device widths can be used to make small adjustments to 
the inverter's switching threshold voltage, VM. In ROs, the inverter's switching threshold voltage 
establishes the duty cycle of the generated clock. Hence, setting the switching threshold at VDD/2 
results in a 50% duty cycle, maximizes the noise margin, and achieves symmetrical 
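characteristics. The required PMOS-NMOS ratio can be determined by equating the CMOS 
inverter's PMOS and NMOS drain currents at the switching threshold. In the subthreshold region, 
the drain current is given by
$$ I_D = I_0 \exp\left(\frac{V_{GS} - V_{TH}}{n V_T}\right) \qquad (5.10) $$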
where the gate-to-source voltage (VGS) is set by the input VIN, VTH is the device threshold voltage, 
VT is the thermal voltage given by kT/q ≈ 26 mV, n is the subthreshold slope factor, and I0 is the 
technology-dependent drain current when VGS = VTH and is given by [54] 
$$ I_0 = \mu_0 C_{OX} \frac{W}{L} (n - 1) V_T^{2} \qquad (5.11) $$




























For the 130-nm process used here, the subthreshold slope is 82 mV/decade [55]. Substituting the 
nominal values of mobility μ and VTH, and a VDD of 100 mV, in (5.12) results in an optimal device ratio 
of 3.4 to obtain a switching threshold of VDD/2.  
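The subthreshold slope factor n used in these expressions can be estimated from the quoted slope. The sketch below assumes the standard relation S = n·VT·ln(10) and a temperature of 300 K; both the relation and the function names are assumptions added here for illustration.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temp_k: float = 300.0) -> float:
    """Thermal voltage kT/q in volts."""
    return K_B * temp_k / Q_E


def slope_factor(s_mv_per_decade: float, temp_k: float = 300.0) -> float:
    """Slope factor n from the subthreshold slope, assuming S = n*VT*ln(10)."""
    return (s_mv_per_decade * 1e-3) / (thermal_voltage(temp_k) * math.log(10))


if __name__ == "__main__":
    print(round(slope_factor(82.0), 2))
```

With the 82 mV/decade slope quoted for this process, the estimate comes out a little under 1.4.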
Equation (5.12) illustrates the exponential dependence of the PMOS-NMOS sizing ratio 
on the power supply and threshold voltages. Further, changes in the device mobility and VTH 
values with process variations also affect the optimal device ratio. Hence, to ensure reliable 
operation at low voltages, the RO was simulated across process corners to determine the optimal 
device sizing for this application. The startup voltage of the RO is defined as the supply voltage 
at which the RO's output swings from at most 10% to at least 90% of the supply voltage. Figure 5.11 shows 
the RO's startup voltage as a function of device sizing for the nominal process corner. For large 
device ratios, the PMOS can become large enough that its leakage current is 
comparable to the drain current of the NMOS, which results in a weak pull-down, i.e. an output low level 
higher than 10% of VDD. So, there is an increase in the minimum VDD required to establish an output low 
level of less than 10% of VDD. Similarly, for small PMOS sizes, the pull-up network is weak such that the 
output cannot reach 90% of VDD, which increases the minimum VDD. The inequality in the minimum VDD 
for 10% and 90% implies that the current in the inverter is not matched and so the switching 
threshold is not centered at mid-VDD. The device ratio of about 3.2, which ensures equal minimum 
VDD, presents the best possible device ratio and also closely matches the hand-calculated 
value of 3.4 from (5.12). 
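The 10%-to-90% swing criterion can be expressed as a simple check on simulated output samples. The sketch below is only an illustration of the definition; the function names and the sample waveforms are hypothetical and do not reproduce the simulations of this work.

```python
def swing_ok(samples, vdd):
    """True if the output reaches below 10% and above 90% of the supply."""
    return min(samples) <= 0.1 * vdd and max(samples) >= 0.9 * vdd


def startup_voltage(sweep):
    """Lowest supply in a list of (vdd, samples) pairs that meets the criterion."""
    for vdd, samples in sorted(sweep, key=lambda pair: pair[0]):
        if swing_ok(samples, vdd):
            return vdd
    return None


if __name__ == "__main__":
    # Hypothetical output extremes at two supply voltages (volts).
    sweep = [(0.08, [0.02, 0.06]), (0.10, [0.005, 0.095])]
    print(startup_voltage(sweep))
```

Evaluating the low and high conditions separately, as in the discussion above, shows which of the pull-down or pull-up networks limits the startup voltage for a given device ratio.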
The worst-case startup scenario is represented by the fast-slow (fs) and slow-fast (sf) 
process corners. Figure 5.12 presents the corresponding plots of the RO's startup voltage for the 
worst-case process corners. As expected, the fast-slow case, which represents a strong PMOS 
device and a weak NMOS, requires a larger VDD to provide an output low level of less than 
10%·VDD. Similarly, the slow-fast corner, representing a weak PMOS with a strong NMOS, needs 
a larger VDD to attain a maximum output voltage of at least 90%·VDD. Similar to the nominal 
corner, a device ratio close to 3.2 presents the optimal size for low startup voltage. In the 
final design, the smallest possible device ratio of 3 that guarantees a low startup 
voltage across all corners was chosen. The decision to use the small device ratio helps to 
improve the energy efficiency by minimizing the switching loss and also facilitates a high-
frequency clock output.  
 
 































A load-capacitance-based delay network has been added at the output of each inverter stage 
in the RO so as to provide an option to control the oscillator's output frequency. The delay 
network is controlled by a 3-to-8 decoder that connects different sets of delay capacitors to 
obtain the required output frequency with an approximately 50% duty cycle. Figure 5.13 shows the 
implemented RO along with the device sizing.  
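The decoder function can be sketched behaviourally as below. It assumes a conventional one-hot 3-to-8 decode, in which each 3-bit code asserts exactly one of eight select lines; which set of delay capacitors each line connects is not specified here, and the function name is hypothetical.

```python
def decode_3to8(code: int) -> list:
    """One-hot decode of a 3-bit code into eight select lines."""
    if not 0 <= code <= 7:
        raise ValueError("code must be a 3-bit value (0 to 7)")
    return [1 if line == code else 0 for line in range(8)]


if __name__ == "__main__":
    for code in range(8):
        print(code, decode_3to8(code))
```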
5.3.4 Design of Non-Overlapping (NOV) phase generator 
The non-overlapping clock phases that control the charge pump are generated from the 
ring oscillator's output using a conventional phase generator circuit [29]. The schematic of the 
NOV phase generator, which is implemented with NAND gates and delay cells, is shown in 
Figure 5.14. Similar to the ring oscillator design, the CMOS device ratios are adjusted to ensure 
low-voltage operation. For instance, the NAND device sizing is different from that of the inverter in 
order to compensate for the loss in the NMOS drive strength due to stacked devices in the 
NAND topology. An added advantage of stacked devices is the reduction in leakage current due 
to the increase in VTH as a result of the body effect. The delay cell used for the non-overlapping phase 
generator incorporates a 64 fF vertical natural capacitor (vncap) formed between metal 1 and 
metal 3 to generate a dead-time of about 0.67 μs at 100 mV input voltage, and 0.36 μs at 125 mV.




Figure 5.13 Schematic of the 5-stage ring oscillator 
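The non-overlap property of the phase generator can be illustrated with a behavioural sketch on a sampled clock. This is an abstraction and not the NAND-based circuit itself: each phase is formed from the clock and a delayed copy, so that a dead-time of `dead` samples separates the two phases. The function name and the sample values are hypothetical.

```python
def nov_phases(clk, dead):
    """Return (phi1, phi2) for a sampled 0/1 clock with a dead-time in samples."""
    # Delayed copy of the clock; the first samples repeat the initial value.
    delayed = [clk[max(i - dead, 0)] for i in range(len(clk))]
    # phi1 is high only when both clock and delayed clock are high.
    phi1 = [c & d for c, d in zip(clk, delayed)]
    # phi2 is high only when both clock and delayed clock are low.
    phi2 = [(1 - c) & (1 - d) for c, d in zip(clk, delayed)]
    return phi1, phi2


if __name__ == "__main__":
    clk = [0, 0, 0, 0, 1, 1, 1, 1] * 2
    phi1, phi2 = nov_phases(clk, dead=1)
    print(phi1)
    print(phi2)
```

At every clock edge there is an interval in which both phases are low, which corresponds to the dead-time set by the delay cell.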
 
5.3.5 Design of the Proposed Linear Charge Pump (LCP) 
As discussed in Section 5.3.2, a linear charge pump topology offers the best choice for 
ultra-low-voltage DC-DC conversion with adaptable conversion gains. This work exploits a 
cross-coupled LCP topology that comprises two similar charge pump branches that operate at 
complementary clock phases to provide the output voltage. The cross-coupled stages reduce the 
output ripple voltage and thereby allow lower-frequency operation to achieve the same 
performance. Further, the reduction in output ripple voltage implies lower charge redistribution 
loss between the charge pump capacitors. Once the RO's clock is available, the charge transfer 
switches (CTS) between each pumping stage and their gate control (GC) circuits determine the 
actual startup voltage of the charge pump. Hence the design, implementation, and control of the 
CTS play a critical role in the performance of the CP. This section describes the design and 
implementation of the various CP components.  
5.3.5.1 Charge transfer switches 
The requirement to design for low-voltage operation determines the type of CTS used for 
each CP stage. When compared to PMOS, the NMOS switches offer a smaller device size, and thus 
smaller parasitic capacitance, for the same ON resistance of the switch. However, NMOS devices 
suffer an increase in threshold voltage with increase in the source-to-body voltage due to the body 
effect. This results in a higher CTS threshold along the CP stages. For low-voltage operation, this 
increase in CTS threshold would severely constrain the minimum achievable startup voltage and 
CP efficiency. The 130-nm process used in this work offers triple-well NMOS devices but the 
choice of PMOS switches presents lower parasitic capacitances at the pumping nodes, which is 
crucial for efficient voltage boosting along the CP stages.  

Figure 5.14 Schematic of the non-overlapping phase generator 
The ease of gate control generation is another factor to be considered in the CTS design. 
A CTS implemented with PMOS would need a gate voltage that is lower than its source and 
drain voltage to allow charge transfer when the switch is ON. During the OFF state, the gate 
voltage could be equal to the source, i.e. VGS = 0, or a higher gate voltage could also be derived 
from subsequent stages. The NMOS switch, in contrast, requires a higher gate voltage than its source 
and drain to remain ON and a lower gate voltage for the OFF state. Obtaining a higher voltage is 
essential for the charge transfer in NMOS switches and this necessitates bootstrapped-capacitor 
gate control, especially for the last couple of stages in the CP.  
As illustrated in Figure 5.1(c), bootstrapped switch control comprises a small bootstrap 
capacitor (CB) that is charged to the steady-state voltage of the previous CP stage. Driving the 
bottom plate of CB with a clock generates the required gate control for the CTS. The 
disadvantage of adopting a bootstrap switch control is the need for additional clock phases to 
control the bootstrap capacitor. Furthermore, the operational losses incurred to charge and 
discharge CB and the associated switches reduce the efficiency of the CP. Hence, this work 
employs PMOS switches for the transfer of charge along the CP stages. 
Figure 5.15 presents a simplified schematic of the proposed 3-stage CP along with the 
CTS and the boosted node voltages. The first stage of the CP uses NMOS charge transfer 
switches (CTS) for ease of switch control generation and also to take advantage of the lower 
NMOS threshold voltage. The charge-transfer switches from the second stage onwards are 
implemented using PMOS transistors to avoid the increase in device threshold voltage due to the body 
effect. As illustrated in Figure 5.15, the switches were sized to keep the conduction loss due to 
finite RON and the switching losses of the CTS gate drive as low as possible for the required load drive 
capability. The simulated ON resistance of the CTS is 25 kΩ at 125 mV input. As the node 
voltages on each side of the charge-transfer switches change with the CP's phase of operation 
and with the CP stage, the gate control (GC) voltages that control the CTS operation need to be 




from the pumping nodes, an interface that generates the appropriate control voltages is 
necessary.  
5.3.5.2 Gate Control (GC) Generation 
The CTS gate control generation is one of the critical components in the CP design as it 
establishes and controls the transfer of power from the input to the CP's output. The challenge 
with GC generation is the need to generate precise voltage levels and relative phase relationships 
between the GC signals of the CP. To ensure effective charge transfer, the GC signals must 
provide gate-to-source bias voltages that clearly define the ON and OFF states of the CTS. 
Further, the CTS node voltages increase with the CP stage, thus necessitating the generated gate 
control voltages to scale along the CP stages. Furthermore, the relative phase difference between 
the adjacent CP stages determines the amount of reverse current loss. Thus, the GC generation 
circuitry greatly impacts the pumping efficiency and the achievable gain of the CP.  
As two CTS are connected to either side of the pumping capacitors in each stage, a break-
before-make action is commonly employed in these switches to avoid reverse current flow. 
 
Figure 5.15 Simplified schematic of the proposed LCP 
Numerous GC implementation techniques that improve the efficiency of CPs have been studied 
and demonstrated in the literature [44]-[50], [56]-[59]. The most commonly adopted methods 
generate the required control signals using the relative voltage difference between the different 
CP stages. Bootstrapped gate control methodology uses the voltage from the previous stage [49] 
while level-shifter based techniques [56], [58], [59] use voltages from both the previous and 
subsequent stages of the CP. In this work, the gate control signals are derived from the 
complementary cross-coupled stages using level-shifters. Some of the gate control strategies for 
efficient charge pumps from [59] are adapted to this low-voltage design. Specifically, the level-
shifter based technique has been optimized to ensure operation at low voltages. 
Figure 5.16 presents the inverter-based CMOS level-shifter (LS) that is employed in this 
CP. Level-shifting of the CP's clock phases (Φ) is achieved by varying the VDD, VSS and VG 
voltages in order to obtain the required signal levels that control CTS operation. The CMOS LS 
devices are sized for low-voltage operation as discussed in Section 5.3.3.1. Further, low-threshold 
voltage devices with a threshold of 230 mV have been employed for the CTS and the GC circuits. 
Figure 5.17 presents the proposed 3-stage CP along with the level-shifter circuits for GC 
generation. The top CP branch shows the required gate control voltage levels for the CTS, while 
the bottom CP branch illustrates the LSs employed to obtain the GC signals. The CTS in the top 
CP branch employ level-shifters similar to those in the bottom branch. The node voltages of the 
different CP stages in the top and the bottom branches are enumerated in the figure as (VX, VY), 
where VX corresponds to the CP phase when Φ1 is at “0” (LOW) and Φ2 is at “1” (HIGH), 
while VY corresponds to the complementary state with Φ1 HIGH and Φ2 LOW. In this design, 
the level-shifter's VDD, VSS and input voltages are chosen from the existing CP nodes to enable 
self-sustained voltage boosting by the CP. 

Figure 5.17 Simplified schematic of the proposed LCP along with the GC generation 
The first stage of the CP uses NMOS switches to benefit from their lower VTH compared to PMOS 
switches. This necessitates a voltage higher than VIN to turn ON the NMOS switch. With the 
complementary cross-coupled topology, a voltage of 2VIN is readily available at the cross-coupled 
node, thereby eliminating the need for extra circuitry in GC generation. Hence, the first CP stage 
utilizes a conventional voltage-doubler topology as published in [60]. The CTS from 
the second stage onwards use PMOS switches and require individual GC generation to guarantee 
good pumping efficiency. As PMOS switches do not need a control voltage that is higher than 
the CP node voltage, the GC voltage can be easily generated using the available CP node voltages. The 
tables included in Figure 5.17 present the voltages required at the level-shifter nodes in order to 
generate the appropriate GC signal levels for the switches.   
Large voltage swings on GC signals would result in strong ON and OFF switch states, 
but they would also deplete charge from the pumping capacitors and thus reduce the CP's 
maximum achievable VOUT. Hence, in order to keep the switching losses due to GC generation at 
a minimum, the level-shifters are biased with a maximum voltage swing of 2VIN (VDD to VSS). 
Based on this design condition, the permissible voltage levels that are employed in the CTS 
control are highlighted in the table included in Figure 5.17. From Figure 5.17, it can be 
inferred that the level-shifter's ground node can be connected to the complementary branch's CP 
node of the previous stage, the VDD can be connected to the current branch's next stage, while the 
gate voltage is controlled by the same branch's previous stage. The phase 
relationships between these node transitions determine the reverse current loss and will be 
discussed in detail in the next section. 
The gate control voltage for the i-th stage PMOS CTS is designed to swing from (i-1)·VIN 
to (i+1)·VIN (i.e. a swing of 2VIN) in two steps in order to control the transfer of charge to the subsequent 
stage. During startup, the voltage difference between the adjacent stages is much smaller than 
VIN. This results in a deep subthreshold region of operation until the CP reaches steady-state 
operation with a VIN – ΔV increase in voltage at each CP stage. Hence, it is vital that the level 
shifters are sized to operate at very low voltages.  
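The two-step gate swing can be tabulated per stage. The sketch below assumes ideal steady-state levels (ΔV = 0) and applies only to the PMOS CTS between pumping stages (i ≥ 2); the last CTS, which drives the load, uses different levels. The function name is hypothetical.

```python
def pmos_gate_levels(stage: int, vin: float):
    """Ideal (ON, intermediate, OFF) gate levels for the i-th stage PMOS CTS."""
    if stage < 2:
        raise ValueError("PMOS CTS are used from the second stage onwards")
    return ((stage - 1) * vin, stage * vin, (stage + 1) * vin)


if __name__ == "__main__":
    vin = 0.125  # volts, one of the input voltages considered in this chapter
    for i in (2, 3):
        print(i, pmos_gate_levels(i, vin))
```

For i = 2 this gives levels of VIN, 2VIN and 3VIN, which correspond to the ON state, the dead-time and the OFF state of the second-stage switch.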
The full schematic of the proposed 3-stage cross-coupled LCP is illustrated in Figure 
5.18. The proposed CP accomplishes linear voltage boosting of VIN by utilizing just two NOV 
clock phase signals that are generated by the phase generator. The NMOS-only and PMOS-only 
charge transfer switches reduce the switching losses, and are optimized for low-voltage startup 
and operation. The generation of GC using internal CP nodes facilitates extension to any number 
of CP stages so as to meet the gain requirements of the target application. In Figure 5.18, Φ1 
and Φ2 represent two non-overlapping clock phases that are distributed along the CP stages such 
that two adjacent CP stages are controlled by NOV clocks in order to mitigate losses during 
switching transients. Once the CP reaches steady-state operation, the voltage at the output of the 
N-th stage is approximately (N+1)VIN – ΔV, where ΔV represents losses due to the voltage drop 
across the switches caused by finite “on” resistance, and due to parasitic and leakage currents. The dotted lines 
for the ground connection in the 2nd stage's level-shifter represent an alternate connection.  
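The steady-state output expression can be turned into a simple loss budget. In the sketch below, the function names and the example target voltage are hypothetical, and ΔV is treated as a single lumped loss term.

```python
def lcp_output(n_stages: int, vin: float, delta_v: float = 0.0) -> float:
    """Steady-state output of an N-stage LCP: (N+1)*VIN minus lumped loss."""
    return (n_stages + 1) * vin - delta_v


def loss_budget(n_stages: int, vin: float, vout_target: float) -> float:
    """Largest lumped loss that still meets a target output voltage."""
    return (n_stages + 1) * vin - vout_target


if __name__ == "__main__":
    # Three stages at 125 mV input; the 0.4 V target is a hypothetical example.
    print(lcp_output(3, 0.125))
    print(loss_budget(3, 0.125, 0.4))
```

A negative budget indicates that the target cannot be met with the chosen number of stages even in the ideal case, so more stages would be needed.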
As discussed in Section 5.2.3, similar LCP topologies have been utilized for low-voltage 
operation in [45] and [46]. This work reduces the considerable reverse current loss inherent in 
[45], [46] and also eliminates the need for two series switches for each charge transfer stage as in 
[45]. Low-threshold voltage devices with a threshold of about 230 mV have been employed for 
the entire CP design including the charge-transfer switches (CTS) and the control generation. 
5.3.5.2.1 Design for low-voltage operation  
To ensure low-voltage operation, the proposed LCP topology has been designed to 
minimize the losses associated with the CTS. Furthermore, LVT devices have been used for all 
the CT switches to facilitate low-voltage operation and to offer lower resistance in the charge 
transfer path.  
To investigate the efficiency of the proposed CP, a detailed study of the various 
operational losses is required. As discussed in Section 5.2.1.4, the power losses associated with 
LCP topologies include the conduction loss, redistribution loss, reversion loss, and switching 







Figure 5.18 Schematic of the proposed self-starting linear charge pump 
caused by the transfer of charge along the CP capacitors. Reversion loss occurs when a reverse 
current flows from the later stages to the previous stages in the CP. The current required to 
charge and discharge the bottom plate of CP capacitors and the gate-source capacitance of the 
switches contribute to the switching loss. With optimal sizing of the switches, the total 
conduction and switching losses have been reduced, while the redistribution loss is lowered with 
low output-ripple voltage [60].  
Influence of GC Signals on Reverse Current Loss 
The relative phase-transitions of gate-control signals at adjacent CP stages determine the 
reversion loss. Employing NOV clock phases for controlling adjacent stages is the most effective 
methodology to minimize undesirable current paths in a single-branch LCP. However, utilizing 
CP node voltages from cross-coupled nodes introduces complexity in the GC generation and 
entails careful design. With the aid of the CP's timing diagram derived from Figure 5.18, the 
different reverse current components will be studied. While the following discussion is based on 
the CP's bottom branch operation, the symmetrical topology of this charge pump makes this 
pertinent to the top branch as well.  
Charge transfer from VIN to Stage 1 of the Charge Pump 
The designs in [45], [46] employ complementary clock phases to control adjacent CP 
stages. To investigate the possibility of reverse currents in complementary phase control, 
consider the first CP stage in Figure 5.19, which is similar to the designs in [45], [46]. If Φ1 and 
Φ2 were complementary signals, during a clock transition, when the voltage at node N1 
transitions from VIN to 2VIN, the node N2 would also transition from 2VIN to VIN. During this 
transition, the NMOS switch connecting to N1 is not completely switched OFF and provides a 
path for current to flow from N1 back to source VIN. This reverse current reduces the effective 
charge transferred to the next CP stage and also limits the low-voltage operation. Hence, NOV 
clock phases have been employed in this proposed LCP.  
With NOV clocks, when the node N1 transitions from VIN to 2VIN, the node N2 remains at 




 increase in VGS from 0 to VIN and results in charge transfer from input to CP1T. Similarly, the 
switch S2, which is controlled by N1 (Φ1), remains completely switched OFF during the rising-edge 
transition of N2 (Φ2), while S1 is turned ON. Thus, NOV clock control eliminates reverse 
current paths from the first stage of the CP to the input source. 
Charge transfer from Stage 1 to Stage 2 of the Charge Pump 
The gate-control generation from the second stage onwards uses signals derived from 
both Φ1 and Φ2 and therefore NOV switch control can be realized. To study the GC operation 
and to examine the existence of leakage current paths in the CP's second stage, refer to 
Figure 5.20, which illustrates the transient node voltages at N1, N2, G3, and N3. The switch S3 is 
ON during the charging phase (Φ1 HIGH) which results in charge transfer from the first stage 
pumping capacitor (CP1B) to the second stage capacitor (CP2B) in order to replenish the charge 
lost in CP2B. In the discharge phase (Φ2 HIGH), S3 is switched OFF and node N3 is boosted 
from 2VIN to 3VIN to transfer charge (i.e. discharge CP2B) to the next stage, while CP1B is charged 
to VIN. 
Figure 5.19 Schematic illustrating control signals and charge transfer in the first stage of LCP 

As shown in Figure 5.20, the gate control voltage (G3) swings from VIN to 3VIN to dictate 
the switch's ON and OFF states, respectively. During the dead-time between the charge and 
discharge phases, G3 is maintained at 2VIN, with a VGS of 0 V to ensure an OFF state. A reverse 
current path from CP2B to CP1B is possible only when the switch S3 conducts while the node 
voltage at N3 is higher than that at N1. This is possible when N3 transitions from 3VIN 
to 2VIN while N1 is at VIN, or when N1 changes from 2VIN to VIN while N3 is at 2VIN. Since the 
CTS control signal G3 is in-phase with Φ1 and the node voltage at N3 is controlled by the NOV phase 
Φ2, switch S3 is never ON or changing state when N3 transitions. However, after the charge 
phase, when N1 transitions from 2VIN to VIN, S3 is being switched OFF with VGS at 0 V while N3 
(Φ2) does not change. With the level-shifter's threshold set at VDD/2, node N1 has to change to 
1.5·VIN before the CTS's gate voltage is pulled up to 2VIN to switch S3 OFF. So, there exists a 
possibility of a small amount of reverse current flow from CP2B to CP1B during this transition. 
Note that during the transition of N3 from 2VIN to 3VIN, the CTS gate voltage follows N3 and so 
only a small, insignificant amount of charge is used up to ensure S3 remains OFF during the 
discharge phase. 
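The ON and OFF conditions of S3 described above can be checked against the ideal node voltages. The sketch below assumes ideal steady-state levels (ΔV = 0) for N1, N3 and G3 in each phase, as read from the discussion of Figure 5.20, and takes the higher of the two switch terminals as the PMOS source. The function names are hypothetical.

```python
def source_gate_voltage(n1: float, n3: float, g3: float) -> float:
    """Source-to-gate voltage of the PMOS switch S3 (source = higher terminal)."""
    return max(n1, n3) - g3


def stage2_states(vin: float) -> dict:
    """Ideal N1, N3 and G3 voltages in each phase of the second stage."""
    return {
        "charge":    {"n1": 2 * vin, "n3": 2 * vin, "g3": 1 * vin},
        "dead_time": {"n1": 1 * vin, "n3": 2 * vin, "g3": 2 * vin},
        "discharge": {"n1": 1 * vin, "n3": 3 * vin, "g3": 3 * vin},
    }


if __name__ == "__main__":
    for phase, v in stage2_states(0.125).items():
        print(phase, source_gate_voltage(v["n1"], v["n3"], v["g3"]))
```

Only the charge phase gives a non-zero source-to-gate voltage (equal to VIN), so in the ideal case S3 conducts only while charge is transferred from CP1B to CP2B. The reverse current discussed above arises from the finite transition times that this static view ignores.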
Since the level-shifter (LS) is biased from the charge pump nodes, the effect of short-
circuit currents during switching transients needs to be analyzed. Short-circuit current can flow 
from VDD to VSS when the LS's output changes, or when the rail voltages change with the CP's phase 
of operation. During one full cycle of operation, the LS's output voltage transitions from VIN 
through 2VIN to 3VIN. The transitions of the input (N1) between VIN and 2VIN cause one 
of the LS devices to switch OFF while the other is being switched ON. This transition could 
result in a shoot-through or short-circuit current flowing from N3 (VDD) to N2 (VSS) for the brief 
period of time when both the LS devices are ON. However, during the transition from 2VIN to 
3VIN, the pull-up device (PMOS) remains ON and follows the change in its source voltage (N3) 
and so short-circuit current cannot flow in this case.  

Figure 5.20 Schematic illustrating control signals and charge transfer in the second stage of LCP 
Charge transfer from Stage 2 to Stage 3 of the Charge Pump 
As illustrated in Figure 5.21, the gate-control generation for the CP's third stage is similar 
to that of the second stage. The CTS S5 conducts in-phase with Φ2 to charge CP3B to 3VIN. 
Similar to the second stage, the switching transients in the level-shifter could result in short-
circuit current from N5 (VDD) to N4 (VSS) of the complementary CP branch. Also, there exists a 
possibility of reverse current flow from N5 to N3 when switch S5 is being switched OFF after 
the charging phase.  
Charge Transfer from Stage 3 to Load Capacitor of Charge Pump 
The last stage's CTS control signals, G7 and G8, differ from those of previous stages 
due to the relatively constant VOUT that is used to bias the LS's VDD rail. Hence, as shown in 
Figure 5.22, the G7 (G8) voltage swings from 3VIN to 4VIN to control the ON and 
OFF states of S7 (S8), respectively. Similar to the previous discussion, reverse current components exist 
during the switching-OFF transition of the CTS, which could degrade the charge pumping efficiency 
in this stage. 

Figure 5.21 Schematic illustrating control signals and charge transfer in the third stage of LCP 
In summary, there exists a possibility for reverse current flow during switching transients 
from one stage to its previous stage along the same branch and to the complementary branch. 
However, even in worst-case operation with slow transitions, since the CP stages are controlled 
by NOV clock phases, a reverse current path cannot exist along all the CP stages. Hence, reverse 
current losses cause only a small reduction in the achievable efficiency of this topology, smaller 
than that in [45] and [46].  
5.3.5.3 Improved Version of the Proposed Linear Charge Pump – Version 2 
The proposed LCP introduced in the previous section, henceforth referred to as LCP V1, 
demonstrates lower losses when compared to the existing low-voltage LCP designs of [45], [46] by 
eliminating reverse current paths along the CP stages. However, the existence of shoot-through or 
short-circuit current paths could result in reverse current to the previous CP stage. Although this 
reverse current flows only to the previous stage and never from the load capacitor to the CP 
input, the loss of charge would result in a lowered CP output voltage and an increased startup voltage 
in the CP. 
 
Figure 5.22 Schematic illustrating control signals and charge transfer in the last stage of LCP 
Efforts were taken to mitigate this reversion loss and to improve the startup capability of the CP. 
Figure 5.23 presents the improved version of the proposed LCP that will be referred to as LCP 
V2 hereafter. The LCP V2 topology comprises the same CTS and GC implementations as in 
LCP V1. The difference lies in the method of controlling the adjacent stages along the CP 
branches. In contrast to LCP V1, where NOV signals were employed across all adjacent stages, in 
LCP V2 complementary clock signals control adjacent CP stages in a branch, while the 
corresponding stage in the cross-coupled branch is managed by NOV clock signals. Since the control 
signals to the level-shifter-based GC generation define the charge transfer between the CP stages, 
NOV control can still be realized in LCP V2. The improvement in CP efficiency is 
accomplished primarily by reducing the reverse current losses between adjacent CP stages.  
5.3.5.3.1 Low-voltage operation in LCP V2  
To guarantee low voltage operation, the design methodologies described in Section 
5.3.5.2.1 were followed in this design as well to minimize the losses due to switching, 
conduction, and charge redistribution components. This section examines the CTS control logic 
and the associated reverse current paths in the CP topology. Figure 5.23 illustrates the timing 
sequence of the control signals and the node voltages in the LCP V2 design.  
Charge Transfer from VIN to Stage 1 of the Charge Pump 
The control signals in the first stage of the CP are the same in both CP versions. As 
demonstrated in the previous discussion (see Figure 5.19 and Figure 5.23), the NOV clock 
control eliminates reverse current paths from the first stage of the CP to the input source. 
Furthermore, to aid in the CP start-up, a dynamic-threshold MOS (DTMOS) diode is included in 
parallel to the CTS in the first stage. During startup, the subthreshold current flowing through the 
diode charges the CP capacitors in the first stage. Once the NMOS CTS transistors have enough 
gate-source voltage, the diode is bypassed from the CP action. Thus, the diodes provide 
assistance for very-low voltage operation.  However, the presence of the DTMOS diode adds to 









Figure 5.23 Schematic of the proposed improved self-starting charge pump - version 2 
Charge Transfer from Stage 1 to Stage 2 of the Charge Pump 
The gate-control generation for the lower CP branch's second stage is derived from Φ1, 
Φ̄1 and Φ2. Figure 5.24 presents the transient node voltages at N1, N2, G3, and N3. The switch 
S3 is ON during the charging phase (Φ1 HIGH) to replenish the charge lost in the second stage 
capacitor, CP2B. In the discharge phase (Φ̄1 HIGH), S3 is switched OFF and node N3 is pumped 
from 2VIN to 3VIN to transfer charge to CP3B, while CP1B is charged to VIN. 
As shown in Figure 5.24, the gate control (G3) signal's transition between VIN and 
3VIN is dictated by the node voltage at N1 (Φ1). So, the level-shifter's ground rail connection to 
the NOV phase at N2 does not influence the CTS's (G3) control. Hence, the ground connection can 
be safely tied to VIN, thereby avoiding added parasitic capacitance at N2. Since the source and 
drain terminals of the CTS are controlled by complementary signals, and the gate G3 is also 
synchronized with Φ1, reverse current can flow from CP2B to CP1B. When CTS S3 is being 
switched ON, N1 increases from VIN to 2VIN, N3 decreases from 3VIN to 2VIN, while the G3 voltage 
decreases from 3VIN to VIN. For the switch S3 to conduct, the gate voltage (G3) needs to be lower 
than the source (N3), and since the level-shifter's switching threshold is set at mid-rail, G3 starts 
to decrease from 3VIN only when N1 and N3 have completed more than 50% of their respective 
transitions. Further, since the level-shifter's positive rail voltage (i.e. N3) decreases while the 
input voltage (N1) increases, the LS's PMOS device is switched OFF at a faster rate, which 
minimizes the short-circuit component of reverse current from CP2B to VIN. Hence, during the 
CTS S3's ON transition, reverse current losses from CP2B to CP1B or VIN are very small.  

Figure 5.24 Schematic illustrating control signals and charge transfer in the second stage of LCP V2 

Consider S3's switching-OFF transition, when node N1 decreases from 2VIN to 
VIN, N3 increases from 2VIN to 3VIN, and the G3 voltage increases from VIN to 3VIN. When this 
switching transition begins, the switch is conducting and reverse current could flow from N3 to 
N1 until the switch is turned OFF. The opposite-direction transitions at the level-shifter 
PMOS device's gate (N1) and source (N3) accelerate the rise in G3 voltage to follow 
N3, and thus switch OFF the CTS. The short-circuit current path from N3 to VIN is again 
minimized by the fast PMOS switch-ON. Thus, the reverse current loss, though finite, is 
restricted by the fast transition that turns OFF the leakage path. It is to be emphasized here that 
symmetrical matched routing for the complementary signals is critical to ensure efficient 
operation. 
Charge Transfer from Stage 2 to Stage 3 of the Charge Pump 
The gate-control generation for the bottom CP branch's third stage is derived from the Φ1, 
Φ̄1 and Φ̄2 signals. In this improved version of the LCP, close to ideal charge transfer can be 
achieved by proper NOV control of the CTS. The transient sequence of operation is illustrated in 
Figure 5.25. Consider the steady-state operation where the charge in CP3B is being discharged to 
the next stage, i.e. COUT (Φ1 HIGH), CP2B is being charged to 2VIN, and the CTS (S5) is switched 
OFF with its VGS at 0 V. The next transition is triggered by the change of the Φ1 signal to LOW, 
which results in N3 transitioning from 2VIN to 3VIN and N5 changing from 4VIN to 3VIN while N4 
remains at 3VIN. The output of the level-shifter is in a high-impedance state, with the charge from 
the PMOS parasitic capacitors redistributing back to CP3B, and the CTS S5 remains in the OFF state. 
Thus, the finite reverse current component that exists in the previous stage's transition from 
discharge to charging phase is completely eliminated here.  
After a brief dead-time period, Φ̄2 goes LOW, which lowers the level-shifter's ground 
rail (N4) from 3VIN to 2VIN and thereby switches ON the NMOS device to lower the G5 node 
voltage to 2VIN. Since the PMOS device in the LS remains OFF during this transition, there is no 
possibility of short-circuit current from N5 to N4. Furthermore, when switch S5 turns ON, the 
nodes N3 and N5 are already at steady-state voltages of 3VIN and 3VIN-ΔV respectively, thus 
removing any chance of reverse current flow. After this charging phase, the clock phase Φ̄2 goes 
HIGH, which raises the level-shifter's ground rail to 3VIN with the NMOS being switched 
OFF. This transition is again devoid of any short-circuit current, as the LS's PMOS 
remains OFF throughout this operation. Since the node voltages at N3 and N5 remain equal 
during S5's switching-OFF event, there is no reverse current at this stage. 
The final transition to be considered is when the Φ1 signal goes HIGH to initiate the 
discharge phase. An increase in N5 only increases the gate voltage at S5 to follow N5 with the 
CTS staying switched OFF, so any reverse current flow from N5 to N3 due to the delay of G5 in 
keeping up with N5 is negligible. Thus, in LCP V2, the alternate CP stages do not suffer from 
losses due to reverse currents. 
 
 




Figure 5.25 Schematic illustrating control signals and charge transfer in the third stage of LCP V2 
Charge Transfer from Stage 3 to the Load Capacitance of the Charge Pump 
The gate-control generation for the final set of CTS is derived from Φ1, Φ2 and VOUT. 
The CTS control signals of G7 and G8 are generated in a similar manner to those at G3 and G4, 
with the only difference being that the LS's VDD rail is biased by the constant VOUT voltage. As 
presented in the LCP V1 design (see Figure 5.22), the G7 (G8) voltage swings from 3VIN to 4VIN 
to control S7's (S8's) ON and OFF states respectively. Similar to the LCP V1 discussion, 
reverse current components could flow during the switching-OFF transition of S7 (S8), and this 
could degrade the charge pumping efficiency in this stage.   
In summary, LCP V2 improves the efficiency by completely eliminating the short-
through currents in the level-shifters and the reverse current paths in alternate CP stages. 
However, this improvement comes with the small penalty of a very small amount of reverse 
current in the remaining CP stages. It is emphasized that every small enhancement 
is especially significant in the ultra-low-voltage subthreshold regime of operation and thus offers 
better CP performance.  
5.3.5.4 Adiabatic Gate Control of the Charge-transfer Switches  
The energy required to charge or discharge a node can be approximated as (1/2)·C·V², 
where C and V are the respective capacitance and voltage at that node. Adiabatic switching 
techniques reduce the dissipated energy by reducing the voltage swing [14]. In this LCP, the gate 
control signal's voltage is switched in two steps of VIN to control the ON or OFF state. As shown 
in Figure 5.18 and Figure 5.23, the gate-source voltage of PMOS switch P3 is switched from 
-VIN during ON, to zero, and then to +VIN for the hard-OFF state. This step-wise charging and 
discharging reduces the peak current required for switching and thus halves the energy dissipated 
due to switching.  
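The halving can be checked with a short numerical sketch. It assumes an ideal capacitor charged through a resistive switch from ideal step sources, so that each step of size ΔV dissipates (1/2)·C·ΔV² regardless of the switch resistance. The function name and the 1 pF gate capacitance are illustrative assumptions, not values from this design.

```python
def stepwise_dissipation(c_farads, v_swing, steps):
    """Energy dissipated when charging C through `steps` equal voltage steps.

    Each step of size v_swing/steps dissipates (1/2)*C*(step)^2 in the
    switch resistance, independent of the resistance value.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    step = v_swing / steps
    return steps * 0.5 * c_farads * step ** 2


C_GATE = 1e-12   # hypothetical 1 pF gate capacitance (assumption)
V_IN = 0.125     # 125 mV input, an operating point used in this chapter

# Total gate swing is 2*VIN (from -VIN to +VIN), taken in one or two steps.
single = stepwise_dissipation(C_GATE, 2 * V_IN, steps=1)
double = stepwise_dissipation(C_GATE, 2 * V_IN, steps=2)
print(f"one step: {single:.3e} J, two steps: {double:.3e} J, "
      f"ratio: {double / single:.2f}")
```

Under these assumptions the dissipation scales as 1/steps, so the two-step gate drive dissipates half of what a single 2·VIN step would.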
5.3.5.5 Capacitor size calculations  
The design of the charge pump involves the estimation of optimum values of the CP 
parameters such as the CP capacitors (CP), frequency of operation (f), output capacitor (COUT) 
and the number of stages N for a given VIN. The output voltage for an N-stage CP can be 










 1,  (5.13) 
where IL is the load current of the charge-pump. The first term in (5.13) is the ideal voltage gain 
that can be obtained with an N-stage LCP, and the second term represents the charge-
redistribution voltage loss due to the presence of load current. For given N and IL, the values of 
CP and f determine the maximum attainable VOUT. In order to reduce the switching losses, it is 
desirable to operate the CP at frequencies as low as possible. Since the frequency determined by 
the RO has only a small tuning range, the value of CP is maximized for the available area in the 
chip. In this design, dual-MIM capacitors, available in the 130-nm process node, are used to 
realize the on-chip capacitors. The capacitance per unit area for dual-MIM capacitors is 
4.1 fF/µm², and the ratio of bottom-plate parasitic capacitance to the actual capacitance is about 
2.6% [62]. With an area of 35,000 µm², the total dual-MIM capacitance that can be realized, 
including area for routing, is about 100 pF. For the 3-stage LCP (N=3), with complementary 
cross-coupled branches, the CP capacitance for each CP stage is about 16 pF. 
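The capacitor budget above can be reproduced with a few lines of arithmetic. This is only a sketch using the numbers quoted in the text; the variable names are illustrative, and the gap between the raw density-times-area figure and the quoted 100 pF is taken here, as the text states, to be area spent on routing.

```python
DENSITY_FF_PER_UM2 = 4.1      # dual-MIM capacitance density quoted in the text
AREA_UM2 = 35_000             # capacitor area quoted in the text
BOTTOM_PLATE_RATIO = 0.026    # bottom-plate parasitic ratio quoted in the text
USABLE_TOTAL_PF = 100.0       # usable capacitance quoted after routing overhead
N_STAGES = 3
N_BRANCHES = 2                # complementary cross-coupled branches

# Upper bound if every square micron were capacitor plate (fF -> pF).
raw_pf = DENSITY_FF_PER_UM2 * AREA_UM2 / 1000.0

# The usable budget is shared by one pumping capacitor per stage per branch.
per_cap_pf = USABLE_TOTAL_PF / (N_STAGES * N_BRANCHES)

# Bottom-plate parasitic of a single 16 pF pumping capacitor, in fF.
parasitic_ff = BOTTOM_PLATE_RATIO * 16.0 * 1000.0

print(f"raw capacitance      : {raw_pf:.1f} pF")
print(f"per pumping capacitor: {per_cap_pf:.1f} pF")
print(f"bottom-plate parasitic of 16 pF: {parasitic_ff:.0f} fF")
```

The per-capacitor share of roughly 16.7 pF is consistent with the "about 16 pF" stated above.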
The required frequency of operation can be determined from (5.13) for a given load 
current. In this self-starting CP, the frequency of the clock generated by the ring oscillator 
depends on the applied input voltage. To accommodate a small range of input voltages, the 
clock frequency from the RO can be tuned using delay cells, controlled by a 3-bit decoder, in 
order to maximize the efficiency of the conversion. For an input voltage of 125 mV, the RO 
oscillates at a frequency of 360 kHz. With a 100 mV input, the frequency reduces to about 200 
kHz. 
The required value of COUT is determined by the load current, frequency of operation, and 



















  (5.14) 
where CTotal is the total CP capacitance, which is 96 pF (=16·3·2 pF). For a ripple voltage of 
15 mV, VIN of 150 mV, and VOUT of 500 mV, the value of COUT needs to be greater than 75 pF. A 100 
pF dual-MIM capacitor with 0.05 mm² area is employed in this design.  
5.4 Efficiency of the Proposed Charge Pump 
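The operational losses determine the maximum achievable efficiency of the proposed 
charge pump. The power conversion efficiency (η) is given by 
  $\eta = \frac{V_{OUT} \cdot I_L}{V_{IN} \cdot I_{IN}}$  (5.15) 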
where IIN is the current drawn by the CP from the input source, VIN, in order to sustain a load 
current of IL at a boosted output voltage of VOUT. To determine the efficiency of the proposed CP, 
the CP's output voltage (VOUT) and the current consumption from the input source (IIN) are 
derived at steady-state operating conditions. The charge redistribution, conduction, and reverse 
current loss components that reduce the output voltage of the CP are integrated in the VOUT 
calculation, while the switching loss is included in the IIN equation. The charge-balance analysis 
discussed in [58] is employed in this work to derive the output voltage of the charge pump. 
5.4.1 Output voltage of the Charge Pump 
The output voltage of CP can be derived based on the charge balance law which states 
that “In a system of capacitors, the sum of all charges leaving a node at any instance of charge 
transfer is equal to zero.” [58]. This law is based on the charge conservation principle which 
establishes that the total charge in the system before and after any charge transfer is always 
equal.  
The ideal output voltage level of (N+1)·VIN can never be realized in a charge pump 
because of the charge redistribution loss that is inherent in switched-capacitor circuits. Charge 
redistribution loss results due to the sharing of charge when two capacitors with different initial 
voltages are connected together. Since the voltage boosting action is realized by phased charging 
and discharging between the CP capacitors, charge redistribution loss occurs along each stage of 
the CP. 
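The charge-sharing loss described above can be illustrated numerically. The sketch below assumes two ideal capacitors connected through a lossy switch and allowed to settle fully. The function name and the example voltages are hypothetical; only the 16 pF value is taken from this chapter.

```python
def share_charge(c1, v1, c2, v2):
    """Connect two capacitors and return (final voltage, energy lost).

    Charge is conserved, so the final voltage is the charge-weighted
    average; the energy difference is dissipated in the switch.
    """
    q_total = c1 * v1 + c2 * v2
    v_final = q_total / (c1 + c2)
    e_before = 0.5 * c1 * v1 ** 2 + 0.5 * c2 * v2 ** 2
    e_after = 0.5 * (c1 + c2) * v_final ** 2
    return v_final, e_before - e_after


C_P = 16e-12                 # pumping capacitor size from this chapter
V_A, V_B = 0.250, 0.200      # hypothetical initial voltages (assumption)

v_final, e_lost = share_charge(C_P, V_A, C_P, V_B)

# Closed form for comparison: (1/2) * (C1*C2/(C1+C2)) * (V1 - V2)^2
e_closed_form = 0.5 * (C_P * C_P / (2 * C_P)) * (V_A - V_B) ** 2
print(f"final voltage: {v_final * 1e3:.1f} mV, energy lost: {e_lost:.3e} J")
```

The loss grows with the square of the initial voltage difference, which is why keeping the per-stage voltage step small limits the redistribution loss.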
The top-plate parasitic capacitance of the CP capacitors along with the parasitic 
capacitances due to charge transfer switches, and the level-shifters connected to the charge pump 
nodes (N1 to N6) consume charge during the CP operation. This contributes to additional loss of 
charge along the CP stages and reduces the attainable VOUT. The simplified schematic (see Figure 
5.18 and Figure 5.23) of the proposed three-stage CP, used in the VOUT derivation, includes the 







  (5.16) 
where α·CP is the top plate parasitic capacitance of the charge pump capacitor, CGS,CTS and CGS,LS 
are the gate-drive capacitances of the CTS and LS respectively, and CSB,NMOS,LS is the source-to-
body parasitic capacitance of the LS's NMOS device. The charge redistribution and parasitic 
losses are included in the VOUT derivation, while the conduction and reverse current loss 
components will be added at the end. Further, since the NOV dead-time is much smaller than the 
time period (T), the charge pump's charging and discharging phases are assumed to be of equal 
duration, i.e. T/2.  
Figure 5.26 illustrates the node voltages across the CP capacitors during the CP 
operation. The voltage Vx represents the voltage across the CP capacitor. ΦC and ΦD 
represent the charging and discharging phases, where the i-th stage CP capacitors are charged to 
Vi-1+VIN and discharged to the steady-state value of Vi. During the charging phase, both CP 
and γ·CP are charged to the voltage Vi-1+VIN from the previous stage CP capacitor. When the control 
signals transition to the discharging phase, the CPi capacitor's bottom plate is raised to VIN, which 
results in transfer of charge from the top plate to charge the parasitic capacitance (at the CP 
node) and also to discharge to the next stage CP capacitor. After the discharge phase, the bottom 
plate is brought back to ground, while the top plate voltage is at the steady-state value of Vi. 
During this transition, charge redistribution occurs from the parasitic capacitors back to the 
top plate of the CP capacitor. 
Equating the charge before and after the transfer, gives  
     







































In the cross-coupled CP topology, charge redistribution at the last stage results in transfer of 
charge from capacitor (CN) to replenish the output capacitor (COUT), and to support the load 
current. Hence, a total charge of ILT/2 is transferred from CN during this time period of T/2. 
Therefore, at steady-state of operation, charge of ILT/2 is transferred from one CP stage to the 
next stage to be able to sustain the load requirements. The charge redistribution equation at the 
output stage can be written as 














  (5.18) 
Substituting (5.18) in (5.17) and simplifying the equation results in   
















  (5.19) 
Equation (5.19) can be rearranged to provide the voltage at i-th stage as 











1  (5.20) 
Figure 5.26 Voltage across charge pump capacitors during charging (ΦC) and discharging (ΦD) phase of operation 
for (a) First stage CP capacitor, (b) i-th stage CP capacitor, (c) Last or N-th stage CP capacitor, and (d) the 
output capacitor  

Substituting the values of Vi-1 recursively presents Vi in terms of V1, VIN and IL. At the CP's first 
stage, during the charging phase CP1 is charged to VIN, where IL·T/2 is transferred from VIN to 
replenish CP1 of the charge lost in the previous discharging phase. Hence the charge equation is 
given by 
















  (5.21) 
The resulting V1 equation is  








  (5.22) 
Applying the boundary condition from the CP's first stage into (5.20), and using equal CP 
capacitors (CP), results in  














The output voltage can be obtained using VOUT = VO2 = VN + VIN. The output is now  
 

















This CP output voltage of (5.24) is equal to that derived in [58], where the charge redistribution 
after the discharge phase from parasitic capacitors to CP capacitors was not accounted for. The 
two VOUT equations do not differ since the amount of charge redistribution from parasitic back to 
CP capacitors is very small.  
The VOUT equation (5.24) includes the voltage reduction from charge supplied to the load. 
The average output voltage can now be written as 
 

























 term represents the average ripple voltage at the charge pump output, and α is the 
top plate parasitic capacitance factor for this process. Equation (5.25) gives the CP's average 
output voltage that accounts for the charge redistribution loss and the loss due to parasitic 
capacitances associated with the gate drive of CTS and level-shifter circuits. As seen in (5.25), 
the output voltage is always lower than the ideal output of N+1 times VIN even when the load 
current is not present. Similarly, even without parasitic losses, the output voltage reduces with 
increase in load current. 
Conduction loss 
The conduction loss due to the series voltage drop in the CTS of each pumping stage will 














For the low-power application of this LCP, the load current is in the microampere range. Hence, with a 
reasonable RON, the conduction loss can be designed to be very small. As the number of CP stages 
increases, the VTH of the LS's NMOS device also increases due to body effect. So, the level-
shifter's output-low voltage (i.e. gate control to switch ON the CTS) might not be low enough to 
provide a |VGS| of at least VIN for the CTS. This would increase the RON of the CTS along 
the CP stages and thus increase the conduction losses with increasing N. For the proposed 
three-stage LCP, the output voltage which reflects the conduction loss can be approximated as 
   























where VRON represents the series voltage drop due to finite ON resistance in the CTS, and is 
equal to ILRON. 
Reverse-current loss 
As discussed in Section 5.3.5.3, the reverse-current flow that reduces the CP's output 
voltage can be viewed as charge (ΔQ) that is transferred from CPi to CPi-1. Since at steady-
state operation a charge transfer of IL·T/2 is necessary to sustain the load current, an extra amount of 
charge proportional to ΔQ needs to be transferred from CPi-1 to CPi. A portion of this extra 
charge feeds the reverse current component that flows back to the previous stage, thus increasing 
the loss due to charge redistribution and the total energy dissipation. In order to avoid 
complicating this analysis, the reverse current loss can be incorporated in VOUT as 
   






























where IREV represents the total reverse current in the charge pump, trise (tfall) is the rise (fall) time 
of the CTS's gate control voltage, and CP is the pumping capacitor. Equation (5.28) gives the 
CP's average output voltage that accounts for the charge redistribution, parasitic losses at the CP 
node, conduction loss, and reverse current loss.  
5.4.2 Input current consumption of the Charge Pump 
The input current consumed by the CP is made of the charge transferred along the CP 
stages to support the IL, and the current required to charge and discharge the bottom plate 
parasitic capacitances of the charge pump capacitors. As shown in the previous discussion, a 
charge (ΔQ) of ILT/2 is transferred by one stage of the cross-coupled CP to the next. During the 
charging phase in the bottom CP branch (i.e. Φ1 HIGH, see Figure 5.23), ΔQ charge is 
transferred from input to CP1B, and CP2B to CP3B, while the top CP branch is in the discharging 
phase where CP1T transfers ΔQ to CP2T, and CP3T transfers to COUT and load. Thus, for one half-
cycle (T/2) of operation, we have (N+1)·ΔQ charge transferred across the CP. For full-cycle of 
operation, the total charge transferred and thus the total current flowing in the CP branches is 
given by [63] 
  $I_{IN} = (N+1) \cdot I_L$  (5.29) 
Equation (5.29) presents the current drawn from input source to support IL, but does not include 











where the bottom plate parasitic capacitance, Cbottom plate,par is proportional to pumping 
capacitance CP by a factor α that depends on the process technology. The frequency of CP 
operation is f, with time period T. Since the individual pumping-capacitors in a cross-coupled CP 
are half the size of those in a single-branch CP, the total switching current loss is the same in 
both these topologies. Hence, the total current consumption is  
  $I_{IN} = (N+1) \cdot I_L + 2 \cdot N \cdot \alpha \cdot C_P \cdot V_{IN} \cdot f$  (5.31) 
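As a rough numerical illustration, the sketch below evaluates the input current by reading (5.31) as I_IN = (N+1)·I_L + 2·N·α·C_P·V_IN·f and then applies the usual power-ratio definition of efficiency, η = (V_OUT·I_L)/(V_IN·I_IN). The component values are those quoted in this chapter (N = 3, C_P = 16 pF, α = 3%, 360 kHz at 125 mV, 0.1 μA load). The conversion gain of 3.25 V/V is the figure quoted for VIN of 125 mV in the simulation results, and it is used here only as an example operating point, so the resulting efficiency is illustrative and is not meant to reproduce Figure 5.35. The function names are assumptions.

```python
def input_current(n_stages, i_load, alpha, c_pump, v_in, freq):
    """Input current: load-transfer term plus bottom-plate switching term."""
    load_term = (n_stages + 1) * i_load
    switching_term = 2 * n_stages * alpha * c_pump * v_in * freq
    return load_term + switching_term


def efficiency(v_out, i_load, v_in, i_in):
    """Power conversion efficiency: output power over input power."""
    return (v_out * i_load) / (v_in * i_in)


N = 3            # pumping stages
I_L = 0.1e-6     # load current, A
ALPHA = 0.03     # bottom-plate parasitic ratio
C_P = 16e-12     # pumping capacitor, F
V_IN = 0.125     # input voltage, V
F_CLK = 360e3    # ring-oscillator frequency at 125 mV, Hz
GAIN = 3.25      # example conversion gain (illustrative operating point)

i_in = input_current(N, I_L, ALPHA, C_P, V_IN, F_CLK)
eta = efficiency(GAIN * V_IN, I_L, V_IN, i_in)
print(f"I_IN = {i_in * 1e6:.4f} uA, efficiency = {100 * eta:.1f} %")
```

With these numbers the switching term is about a quarter of the total input current, which shows why lowering the clock frequency matters at microampere load levels.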
5.4.3 Conversion Efficiency of the Charge Pump 
The conversion efficiency of the charge pump can be derived from the output voltage and 







  (5.32) 
In order to simplify η, the input current consumption IIN can be expressed in terms of IL by 
substituting the value of pumping capacitor (CP) from (5.28). To estimate CP, the series voltage 
loss in the CTS (VRON), the reverse current (IREV) and ripple voltage terms are assumed to be very 
small. Thus, the CP can be approximated from the output voltage as  











Substituting CP into (5.31) gives 
 







































Using (5.34) to calculate η results in  
 




























Thus, the efficiency of the linear charge pump can be approximated from the voltage gain 
(VOUT/VIN), the number of CP stages (N), and the process-dependent factor γ which is given by 
(5.16). The η equation of (5.35) includes the conduction loss, switching loss and reverse current 
losses in the output voltage term. Further approximation of equation (5.35) with γ=0 results in 
















where the factor K is the voltage gain VOUT/VIN.  
5.5 Implementation of the DC-DC Converter 
The two versions of the cross-coupled, self-starting, low-voltage linear charge pump with 
three stages were implemented in a 130-nm CMOS process. The total charge pump capacitance 
is 196 pF, with an individual pumping capacitor size of 16 pF and a load capacitance of 100 pF. The 
charge pumps occupy an area of approximately 0.1 mm² each, and the layouts with highlighted 
blocks are presented in Figure 5.27 and Figure 5.28. In ultra-deep submicron (UDSM) processes, 
the threshold voltage of devices placed near N-well edges increases. For this 
process technology, a distance of at least 2 µm is required from the active diffusion edge to the 
N-well edge to avoid the VTH increase due to the N-well proximity effect. As low-voltage operation is 
critical to this design, all devices follow this rule to avoid an increase in VTH. Furthermore, since the 
efficiency of the charge transfer across the various CP stages depends on the relative phase of the 
control signals distributed to the CP capacitors and charge-transfer switches, extra care has been 
taken to ensure symmetrical routing to match the delay in clock signals. The decoder is powered 
by a separate power supply and operates at the nominal VDD. This ensures that the switches for 
the delay cells in the RO have low ON resistance and are not affected by the low-voltage 
constraint of the input. Decoupling the decoder's power from the CP also facilitates 
measurement of the CP input current and the efficiency of the CP. 
5.6 Simulation Results  
The charge pump circuits were simulated from the schematic, as well as with the parasitic 
capacitances and resistances from the layout extraction. The simulation results characterizing the 
LCP designs are presented in this section. The results from the ring oscillator and the NOV 
blocks are presented first, as they are common to both LCP versions. Since the converter 
is powered by the input voltage, any change in VIN results in variations in the ring oscillator's 
output clock frequency and in the dead-time of the NOV phase generator's output. Figure 5.29 
presents the change in the ring oscillator's output frequency across variations in VIN. The delay 
networks along with the decoder offer an option to adjust the RO's output frequency. The ability 
to tune the CP's clock frequency provides a technique to maximize the conversion efficiency for 
a given load condition. 

Figure 5.29 Ring oscillator's output frequency across input voltage variations 

Figure 5.30 illustrates the output frequency range that can be obtained for 
various delay settings in the RO. The nominal dead-time introduced by the NOV generator at VIN 
of 125 mV is 360 ns. 
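To give a feel for these numbers, the sketch below models an idealized pair of non-overlapping phases with the quoted 360 ns dead-time and a period set by the 360 kHz clock reported for VIN of 125 mV. The waveform shape (instantaneous edges, dead-time inserted before each rising edge) and the function name are assumptions for illustration; the actual NOV generator circuit is not modeled.

```python
def nov_phases(t, period, dead_time):
    """Return (phi1, phi2) logic levels at time t for an idealized NOV clock.

    phi1 is high from 0 to T/2 - dead_time; phi2 is high from T/2 to
    T - dead_time, so both phases are low during each dead-time window.
    """
    half = period / 2.0
    tau = t % period
    phi1 = tau < half - dead_time
    phi2 = half <= tau < period - dead_time
    return phi1, phi2


PERIOD = 1.0 / 360e3   # clock period at VIN = 125 mV
DEAD = 360e-9          # nominal dead-time quoted in the text

# Sample three clock periods at 1000 points per period.
samples = [nov_phases(k * PERIOD / 1000.0, PERIOD, DEAD) for k in range(3000)]
overlap = any(p1 and p2 for p1, p2 in samples)
both_low_fraction = sum(1 for p1, p2 in samples if not p1 and not p2) / len(samples)

print(f"phases overlap: {overlap}")
print(f"fraction of period with both phases low: {both_low_fraction:.3f}")
```

In this idealized model the two dead-time windows together occupy 2·(360 ns)/(2.78 µs), or roughly a quarter of the clock period.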
5.6.1 LCP simulation results 
The proposed LCP designs were simulated to characterize their startup voltages, 
conversion gains and efficiencies at different load conditions. The simulations were performed in 
the free-running frequency mode of the ring oscillator, wherein the delay cells in the RO were 
not employed. The startup voltage of the CP was characterized based on the CP's output voltage 
and conversion efficiency at a given VIN.  
For an input voltage of 100 mV, the control system, including the RO and NOV 
generator, was able to generate the required clock signals for the CP's startup and operation. The 
generated clock frequency from the RO was 200 kHz. The CP's clock and the output voltage 
demonstrating the startup behavior of the LCP V1 and LCP V2 designs, at an input voltage of 
100 mV, are presented in Figure 5.31 and Figure 5.32, respectively. For the no-load condition, 
the output voltage reaches a steady-state value of 234 mV in LCP V1, while LCP V2 provides 
a 250 mV output, within 1 ms. The figures also illustrate the zoomed-in view of the outputs with 
details on the RO's output swing of about 95% of VIN and the CP's output ripple voltage of less 
than 1.2 mV for both designs. 
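The no-load figures quoted above translate into conversion gains as follows. This small sketch only restates the simulated output voltages from the text as gains and compares them with the ideal (N+1) gain of a three-stage pump; the function and variable names are illustrative.

```python
def conversion_gain(v_out, v_in):
    """Voltage conversion gain in V/V."""
    return v_out / v_in


N_STAGES = 3
IDEAL_GAIN = N_STAGES + 1          # ideal output is (N+1)*VIN
V_IN = 0.100                       # 100 mV input

# No-load steady-state outputs quoted in the text for the two versions.
no_load_outputs = {"LCP V1": 0.234, "LCP V2": 0.250}

gains = {name: conversion_gain(v, V_IN) for name, v in no_load_outputs.items()}
for name, gain in gains.items():
    print(f"{name}: gain {gain:.2f} V/V, "
          f"{100 * gain / IDEAL_GAIN:.1f}% of the ideal gain of {IDEAL_GAIN}")
```

At 100 mV both versions reach well under two thirds of the ideal gain, in line with the later observation that the usable input range starts at about 125 mV.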
Figure 5.33 and Figure 5.34 present the 3-stage charge pumps' output voltages and 
conversion gains for low input voltages and across different DC current loads. It is evident from 
the figures that, above the startup voltage, both the charge pumps provide a linear increment in 
the output voltage with increase in VIN. Above the startup voltage, the improvement in the output 
voltage of the LCP V2 when compared to the LCP V1 is clearly visible in the conversion gain 
plots of Figure 5.34. 
At 100 mV input voltage, although the RO provides the necessary control clocks to the 
charge pumps, the output voltage and conversion gain are low due to weak drive strengths in 
deep subthreshold regime of operation, and also increased losses along the pumping stages.  
Hence, the usable range of VIN starts from about 125 mV, where the conversion gain is 3.25 V/V 


















Figure 5.33 Plot of LCP's output voltage across varying input voltages for different load conditions  

Figure 5.34 Plot of LCP's conversion gain across varying input voltages for different load conditions 







state value in less than 1 ms for both the CP versions. At VIN of 125 mV, the RO oscillates at 360 
kHz and the output ripple voltage is 1.2 mV for a DC load current of 0.1 μA.  






















where K is the gain factor, VOUT/VIN, and α is a technology-dependent parameter given by the ratio 
of bottom-plate parasitic capacitance to CP capacitance, CPar,P/CP. Thus, the efficiency of the CP 
depends on the number of CP stages, the value of CP capacitance and the frequency of operation 
[61]. With the simulated conversion gain, the efficiency of the CP can be estimated for α of 3% 
in the 130-nm process node [62]. Figure 5.35 presents the estimated efficiency of the charge 
pumps from simulation results, across varying input voltages and load conditions. As indicated 
by the low gain ratio, the CP's efficiency at 100 mV input is 34%, and it increases to above 65% 
for input voltages above 125 mV at 0.1 μA DC current load. At the no-load condition, ηp of less than 
100% is due to the switching losses and losses in the charge transfer switches in each stage of the 
charge pump. As evident in Figure 5.35, the LCP V2 version provides an improvement in the 




Figure 5.35 Plot of charge pumps' efficiency across varying input voltages and different load conditions 







5.6.2 Performance of the Proposed CP topology with increase in number of CP stages  
In order to demonstrate the ease of gate control generation that facilitates extension of the 
number of CP stages, the proposed LCP topology was simulated with seven charge pump stages. 
Figure 5.36 and Figure 5.37 present the output voltage and estimated efficiency for the 7-stage 
CP topology. The CP boosts the 125 mV input to 745 mV at a maximum efficiency of 69%.  
Further, the CP was designed with different numbers of stages (N) to study the topology's 
startup and conversion features as a function of pumping stages. Figure 5.38 and Figure 5.39 
present the output voltage and estimated efficiency as a function of the number of cascaded 
pumping stages. The input voltage is varied from 125 mV to 200 mV with a constant load of 0.1 
μA. The output voltage increases linearly with increase in the number of CP stages (N). The 
reduction in ηp at higher N, characteristic of LCPs, is due to the increase in conduction, 
parasitic, and charge redistribution losses with the added CP stages.  
 
 










Figure 5.38 Plot of LCP V2's output voltage across varying number of pumping stages (N)  
















A low-voltage-capable, self-starting charge pump has been designed and implemented in 
the 130-nm process node. This charge pump topology, with an increased number of stages and drive 
strength, is targeted to be used as the core component in a DC-DC boost converter design to 
facilitate charge-recycling-based low-power digital design. The self-starting feature makes this 
charge pump suitable for use in energy harvesting systems as well. Low-voltage DC-DC converters are 
essential for energy harvesters that operate from various ultra-low-voltage, low-power 
sources such as thermoelectric generators (TEG), photovoltaic cells etc. The measurement results 




Figure 5.39 Plot of LCP V2's efficiency across varying number of pumping stages (N)  







Chapter 6 Characterization of Low-Voltage Self-startup Charge Pump 
 
This chapter presents the measurement results and analysis of the low-voltage self-
starting charge pump (CP). The primary goal is to characterize the startup voltage of the charge 
pump across different DC load current conditions. This is accomplished by measuring the charge 
pump's output voltage for various input voltages and by evaluating the conversion gain across 
the input range. Further, comparing the measurement results between the two versions of the CPs 
would illustrate the improvement realized by reducing the losses due to reverse currents. 
Correlating the measurement data with simulation results offers a better understanding of the 
circuit performance across process variations. 
6.1 Test Setup 
The chip was fabricated in the IBM 8RF CMOS process through MOSIS [69] and 
packaged in a 64-pin LQFP (Low-Profile Quad Flat Package) for testing. The chips were 
received and tested in June 2012. A microphotograph of the fabricated die, with highlighted
circuit blocks, is shown in Figure 6.1. The Eagle layout editor was used to design the four-layer
FR4 PCB to characterize the charge pumps. Figure 6.2 presents the designed test board, which
accommodates all three chips that were tested. The layer stack-up in the PCB cross-section is
illustrated in Figure 6.3. The majority of the signal routings are on the topmost layer of the PCB, 
while the inner copper planes enable quiet power supply and ground connections. The inner 
copper supply layers also act as power supply decoupling capacitors. Additional decoupling 
capacitors are added close to the supply pins of the chips and small decoupling capacitors are 
also provided on-chip. 
6.1.1 Power Supply Partition & Generation on the PCB 
Power supply planes and their partitions are essential to avoid noise coupling and to 













different supply voltage requirements of 2.5 V and 1.2 V. The decoder that is used to control the 
ring oscillator's (RO) frequency of operation and its pad frame use a 2.5-V power supply since
they employ thick-oxide CMOS devices. The pad ring for the core charge pump circuit is
powered by a 1.2-V supply voltage. The bond-pads were positioned such that the pins on the
chip's top side are powered by 2.5 V, while the 1.2-V power supply is required by pins on both
sides of the chip. The pad (signal) placement and the corresponding bonding diagram facilitate
power plane partition on layers 2 & 3.
Onboard voltage regulators are employed to provide stable, low-noise supply voltages to 
the chip. Linear voltage regulators are preferred over switching regulators because of their lower
output noise. A low-dropout linear voltage regulator (TI's LP38512) [65] is used in this design.
This regulator provides output voltages in the range of 0.5 V to 4.5 V for an input voltage range 
of 2.25 V to 5.5 V. The schematic of the voltage regulator is presented in Figure 6.4. The 














V_{OUT} = V_{ADJ} \left( 1 + \frac{R_1}{R_2} \right)  (6.1)
An input voltage of 5 V is applied to the regulators and the ratio of resistors is set to obtain 
output voltages of 1.2 V and 2.5 V. The grounds of the two supply partitions are connected at 
only one point on the board in order to eliminate ground-loops. A tight, short ground-plug is used 
to connect the grounds and thereby eliminate the inductive effects of banana plugs. 
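
As an illustrative check of the resistor ratios, the sketch below assumes the usual adjustable-regulator relation, in which the output equals the adjust-pin reference multiplied by (1 + R1/R2). The reference value of 0.5 V is taken here from the lower bound of the output range quoted above; the resistor names R1 and R2 and the function name are assumptions, not values read from the test board.

```python
def feedback_ratio(v_out, v_adj=0.5):
    """Return R1/R2 such that v_out = v_adj * (1 + R1/R2).

    v_adj = 0.5 V is an assumed adjust-pin reference (hypothetical default).
    """
    if v_out < v_adj:
        raise ValueError("output cannot be set below the adjust-pin reference")
    return v_out / v_adj - 1.0


# The two supply rails used on the test board.
for target in (1.2, 2.5):
    print(f"{target} V rail -> R1/R2 = {feedback_ratio(target):.2f}")
```

Under these assumptions, only the ratio of the two resistors matters; their absolute values would be chosen from the regulator's recommended divider current.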
 




6.1.2 Ring Oscillator control signal generation  
A 3-to-8 decoder is included on-chip in order to vary the ring oscillator's delay setting
and thereby provide a means to control its output frequency. The test board includes single-pole,
single-throw (SPST) switches to generate the three-bit control signal for each charge pump.
As illustrated in Figure 6.1, the fabricated chip includes both versions of the charge
pump circuit. The three test chips were soldered directly onto the test board to reduce socket
parasitics. However, during soldering, the power supply pin to the CP version 1 in chip 2 was
damaged. Hence, testing of the LCP V1 design was limited to two prototypes, while three
circuits of the LCP version 2 were available for characterization.
6.2 Test Procedure 
The low-voltage charge pump characterization begins with ensuring stable, onboard 
power supply voltages for the test chips. Then, an input voltage is applied using the Keithley 
2400 sourcemeter to the CP-under-test while the DC load current is drawn out of the output node 
using another Keithley 2400 sourcemeter. The boosted CP output voltage is measured using a 
digital multimeter and the current supplied by the input voltage is also monitored in order to 
determine the CP efficiency.  An output voltage close to the expected value from the three-stage 
CP verifies the functionality of the charge pump, the associated ring oscillator, and phase 
generator circuits. Once the charge pump startup and operation are ascertained, the control bits to 
the decoder are changed to vary the frequency of the CP control signals. For a fixed load 
 






condition, the change in frequency of CP control signals would directly influence the input 
current consumption and the output voltage and thus the CP efficiency.  The switch setting that 
corresponds to the maximum output voltage and maximum efficiency is determined for various 
input voltage and output load conditions. With the appropriate control bits to the decoder, the CP 
is then characterized across different input voltages and load conditions.  
6.3 Prototype Characterization 
6.3.1 Ring Oscillator  
Figure 6.5 presents the CP output voltage across different frequencies of operation,
obtained by varying the delay in the ring oscillator, for an input voltage of 0.33 V and a DC load
of 0.1 μA. Figure 6.6 presents the normalized efficiency of the CP designs across the same
delay settings and setup conditions as in Figure 6.5. The plots show
only a small variation in the output voltage and efficiency across frequency for both CP
versions. Examination of the final fabricated design revealed that the decoder circuit,
which was intended to output a thermometer code, was erroneously implemented as
a 3-to-8 binary decoder with only one active (high) output selected by
the input bits. Figure 6.7 (a) illustrates the delay network designed to control the ring oscillator's
output frequency. Figure 6.7 (b) and (c) highlight the difference between the realized delay
and the desired delay for each switch setting. From Figure 6.7 it is evident that the fabricated
frequency control circuit does not achieve the maximum possible range of frequency variation
and is constrained to approximately three different frequency settings. Figure 6.8 presents the
simulated RO frequency variation, normalized to the free-running case, for both the
thermometer-code and binary decoder controlled delay settings. The thermometer decoder provides
at least 2X frequency variation, while the binary decoder offers less than 20% change. This
explains the lack of variation in the CP's measured performance across delay settings in
Figure 6.5 and Figure 6.6.
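
To make the decoder discrepancy concrete, the following sketch contrasts the two decoding schemes for a three-bit input. It assumes, purely for illustration, that a thermometer code for input k asserts k of seven control lines, whereas the 3-to-8 binary decoder asserts exactly one of eight lines. The function names and line counts are hypothetical, and the actual delay network of Figure 6.7 is not modelled.

```python
def one_hot_decode(code, width=8):
    """3-to-8 binary decoder: exactly one output line is high."""
    return [1 if i == code else 0 for i in range(width)]


def thermometer_decode(code, width=7):
    """Thermometer decoder (assumed form): the lowest `code` lines are high."""
    return [1 if i < code else 0 for i in range(width)]


for code in range(8):
    oh = one_hot_decode(code)
    th = thermometer_decode(code)
    print(f"{code:03b}  one-hot active={sum(oh)}  thermometer active={sum(th)}")
```

With a one-hot output, the number of asserted control lines never changes, so the accumulated delay depends only on which single element is selected. A thermometer output lets the asserted elements accumulate with the input code, which is consistent with the wider simulated tuning range reported for the thermometer decoder.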
As seen in Figure 6.6, the best CP performance that corresponds to a maximum output 
voltage and minimum input current (i.e. maximum efficiency), is achieved at the switch setting 





Figure 6.6 Measured charge pump efficiency across frequency of operation, normalized to the
free-running RO's efficiency (VIN at 0.33 V)
 












 Figure 6.7 Frequency tuning in ring oscillator (a) simplified schematic of RO illustrating delay cells, (b) Realized 




performed with this switch (delay) setting that provides the best possible conversion efficiency 
for the charge pumps. Further, note that an input of “110” to the decoder corresponds to the 
smallest frequency (maximum delay) in the ring oscillator. Thus, the measured CP efficiency 
could be further improved with a lower frequency of operation, depending on the load 
conditions.  
6.3.2 Charge Pump Output Voltage 
The output versus input voltage characteristics across different load conditions are
compared between the two versions of the CPs in Figure 6.9 to Figure 6.11. The input voltage is
varied such that, at the maximum input, the CP's output extends to the maximum nominal supply
voltage for this process (i.e. 1.2 V). The improved LCP version 2 reduces the reverse-current 
losses between the CP stages and thus delivers a higher output voltage than the LCP version 1 
which is based on conventional non-overlapping clock phase technique.  
 
Figure 6.8 Simulated RO's output frequency, normalized to the free-running frequency,
across delay settings for VIN at 0.125 V and 0.33 V.
 
Figure 6.10 Measured charge pump output voltage across variations in input voltage and DC load current for Chip 2 
 
 




Also, note that the difference in the output voltages increases with the input and is evident for 
inputs above 0.2 V in Figure 6.9 and Figure 6.11. This sizeable difference in the output voltages 
between the two CPs is due to the increase in the amount of reverse current and switching losses 
with increase in the frequency of operation at higher input voltages.  
Figure 6.12 to Figure 6.14 present a closer look at the low-voltage startup and operation 
of these charge pumps. From the plots, it can be observed that even at very low voltages, the 
LCP V2 performs better than the LCP V1. At low voltages, the charge transfer switches and the 
CP control generation circuits operate in sub-threshold regime and thus have slow transition 
edges. As described in Section 5.3.5.2, these slow edges result in increased reverse current losses 
in the odd pumping stage of the LCP V2 and thus the output voltages of both the CP versions 
start to converge at very low input voltages i.e. around startup voltage. 
 





Figure 6.13 Measured charge pump output voltage across low input voltages at different DC load current 
conditions for Chip 2 
 
 
Figure 6.12 Measured charge pump output voltage across low input voltages at different DC load current 




6.3.3 Charge Pump Startup Voltage 
The conversion gain characteristics of the LCPs across input voltages are employed to
quantify the CP's startup voltage. The CP conversion gains across the input voltage range for
no-load and 0.1 μA load conditions are illustrated in Figure 6.15 and Figure 6.16. Even though the
CPs can furnish an output at very low voltages, the gain, and thus the efficiency of operation,
drops steeply once the input voltage is reduced beyond a certain point. From the gain plots, it can be
inferred that the practical startup voltage lies around the knee of the curves, below
which the conversion gain degrades sharply. The startup voltages for the CP versions are
tabulated in Table 6.1 for different load conditions. The LCP V2 has lower-voltage startup
capability due to the DTMOS diodes in the first pumping stage (Section 5.3.5.3) that provide a
path for the pumping capacitors to start charging at low voltages. Also, note that the startup
voltage depends on the CMOS threshold voltages and thus is process dependent. In this small
sample of test chips, the threshold voltages of the devices in chip 1 are higher than those of other
samples. Hence, the LCP designs in chip 1 require a higher startup voltage when compared to
 
Figure 6.14 Measured charge pump output voltage across low input voltages at different DC load current 
conditions for Chip 3 
 
Figure 6.16 LCP V2's conversion gain across low input voltages and load conditions
 
 




Table 6.1 Charge pump startup voltage across different load conditions
                     Startup Voltage (mV)
DC Current load      LCP V1            LCP V2
(μA)                 Chip 1  Chip 3    Chip 1  Chip 2  Chip 3
No load              150     140       130     125     130
0.05                 155     150       140     130     135
0.1                  170     160       150     140     140
 
 
those in other chips. Both linear charge pump versions exhibit a startup voltage variation of
approximately 10 mV across the tested prototypes at 0.1 μA DC load current. A larger number of
test samples is required to quantify the actual variation in the startup voltage.
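
The chip-to-chip spread quoted above can be checked directly against the entries of Table 6.1. The short sketch below does this; the dictionary layout and the helper name are illustrative choices, while the numbers are the tabulated startup voltages in millivolts.

```python
# Startup voltages (mV) from Table 6.1, listed per chip in table order.
STARTUP_MV = {
    "LCP V1": {"no load": [150, 140], "0.05 uA": [155, 150], "0.1 uA": [170, 160]},
    "LCP V2": {"no load": [130, 125, 130], "0.05 uA": [140, 130, 135], "0.1 uA": [150, 140, 140]},
}


def spread_mv(values):
    """Difference between the highest and lowest startup voltage."""
    return max(values) - min(values)


for version, loads in STARTUP_MV.items():
    for load, values in loads.items():
        print(f"{version:7s} {load:8s} spread = {spread_mv(values)} mV")
```

With only two or three samples per design, this spread is a range rather than a statistical estimate, which is why more test samples are needed.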
6.3.4 Charge Pump Drive Capability  
Figure 6.17 to Figure 6.19 illustrate the load driving capability of the charge pump 
designs. For a fixed frequency of operation (f), the output voltage reduces with increased load 
current due to the increase in losses associated with the CP operation. The CP's output voltage









 1,  (6.2) 
where the second term represents the reduction in output voltage due to charge-redistribution 
loss incurred to sustain the load current. For input voltages close to the startup voltage, the 
charge pumps are able to sustain 0.2 μA of DC load current. They can support 0.5 μA loads at 
input voltages above startup. As seen in the plots, for inputs above the startup voltage, the LCP 
V2 clearly surpasses the LCP V1 design. However, near the startup voltage, LCP V2 is only 




Figure 6.18 Output voltage of LCP across varying load conditions for chip 2 
 
 




Figure 6.12 to Figure 6.14, for very low input voltages, the increase in reverse current losses in
the odd-numbered stages of the LCP V2 diminishes the improvement gained in the even stages,
so LCP V2's output is only slightly better than that of LCP V1. Neither LCP design was
intended to sustain large DC load currents at very low input voltages; consequently, at high
load currents the outputs merge together and drop
steeply. Also, since the frequency of operation was not changed during this test, the measured
output voltages do not reflect the maximum possible efficiency to drive the DC load current.
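
As a rough illustration of how load current pulls down the output at a fixed frequency, the sketch below uses the textbook ideal Dickson (linear) charge pump relation, V_OUT = (N + 1)·V_IN − N·I_L/(f·C). This is an assumption standing in for the general trend and is not a reproduction of Equation 6.2: it ignores parasitic, conduction, and reverse-current losses. The stage count of three follows the three-stage CP mentioned in Section 6.2, while the per-stage capacitance and clock frequency are hypothetical values chosen only to produce readable numbers.

```python
def ideal_cp_vout(v_in, i_load, n_stages=3, c_stage=50e-12, f_clk=100e3):
    """Ideal Dickson output voltage under a DC load.

    c_stage and f_clk are hypothetical; only the charge-redistribution
    droop N * I_L / (f * C) is modelled.
    """
    droop = n_stages * i_load / (f_clk * c_stage)
    return (n_stages + 1) * v_in - droop


# Sweep the DC load at a fixed input voltage and fixed clock frequency.
for i_load in (0.0, 0.05e-6, 0.1e-6, 0.2e-6, 0.5e-6):
    v_out = ideal_cp_vout(0.2, i_load)
    print(f"I_L = {i_load * 1e6:4.2f} uA -> V_OUT = {v_out:.3f} V")
```

In this simplified model the droop scales inversely with frequency, which illustrates why a fixed frequency setting cannot be optimal for every load.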
6.3.5 Charge Pump Efficiency 







  (6.3) 
where IPower is the power supply current consumed by the CP to provide the load with an output
voltage (VOUT) and load current of IL, for an input voltage of VIN. To minimize chip area, the
input pad to the CP had to be shared with the power supply pin of the ring oscillator and the
phase generator circuits. Hence, the measurement results represent
 
Figure 6.19 Output voltage of LCP across varying load conditions for chip 3 
 
the end-to-end converter efficiency, which includes the power consumed by the ring oscillator,
phase generator, and charge pump, as well as leakage currents in the ESD diodes. Figure 6.20 and
Figure 6.21 present the end-to-end efficiency of the converter designs at the startup voltage, across
load current variations. At the startup voltage, the CP designs provide comparable conversion
efficiency to drive DC load currents. As seen in the previous characterization results, LCP V2
provides a lower startup voltage than LCP V1 and better performance for inputs just above the
startup voltage. During this measurement, the capacitive delay network within the RO was
maintained at a fixed maximum value that corresponds to the switch setting of “110” (Section
6.3.1) across the input voltage range. Since the tuning of the RO's output frequency was
constrained by the error in the decoder implementation, the measured results do not reflect the
maximum possible efficiency that can be achieved to drive the DC load current for the input
range.
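
Based on the symbol definitions given for Equation 6.3 (output power VOUT·IL over input power VIN·IPower), the end-to-end efficiency can be computed from four bench readings as sketched below. The function name and the sample readings are hypothetical and are not measured data from this work.

```python
def end_to_end_efficiency(v_out, i_load, v_in, i_power):
    """Ratio of power delivered to the load to power drawn from the input."""
    if v_in <= 0 or i_power <= 0:
        raise ValueError("input voltage and supply current must be positive")
    return (v_out * i_load) / (v_in * i_power)


# Hypothetical readings: 0.15 V input drawing 1 uA, 0.375 V output at 0.1 uA.
eta = end_to_end_efficiency(v_out=0.375, i_load=0.1e-6, v_in=0.15, i_power=1.0e-6)
print(f"end-to-end efficiency = {eta * 100:.1f} %")
```

Because the shared input pad also feeds the ring oscillator and phase generator, a reading of IPower taken this way includes their consumption, so the result is an end-to-end figure rather than the efficiency of the charge pump alone.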
With a thermometer decoder-based control of the RO's delay, the tuning range of the
RO's output frequency can be further increased to provide optimal performance of the CP
designs across variation in input voltage and load currents. Figure 6.22 illustrates the 
  
 




Figure 6.22 Simulated LCP V2 efficiency and end-to-end converter efficiency (LCP V2)  
across varying load conditions and input voltages 
 
Figure 6.21 Measured end-to-end efficiency of the DC-DC converter (LCP V2) across varying load conditions  
improvement in the simulated efficiency of the charge pump, ηCP (LCP V2) and the end-to-end 
conversion efficiency (ηEEC) at startup voltage, across different delay settings from the 
thermometer decoder. For inputs close to the startup voltage of 130 mV, the end-to-end 
conversion efficiency increases from 22% to 27%, while the efficiency of the charge pump 
ranges from 62% to 73% for variations in the RO's output frequency. The efficiency at 200 mV
input voltage, well above the startup voltage, is also included to illustrate the higher efficiency 
capability of the charge pump (ηCP ≈ 90% & ηEEC ≈ 40%) for inputs above startup voltage. Thus, 
from simulation results, the maximum conversion efficiency is obtained at the lowest frequency 
setting for both charge pumps. 
From the efficiency measurement plots of Figure 6.20 and Figure 6.21, the improvement 
in the measured end-to-end efficiency with increase in load current confirms that the conversion 
efficiency can be improved by reducing the frequency of operation, with respect to very low 
input voltage and load conditions. Furthermore, as seen in startup voltage measurements, chip 1 
has a process corner with higher threshold voltage and so the ring oscillator's output frequency is
lower than that in the other chips. This results in a higher efficiency when compared to the other 
samples. Thus, from the examination of the measurement results, it can be inferred that the 
measured conversion efficiency can be enhanced by tuning the frequency of operation with 
respect to the operating conditions. 
In micro-energy harvester applications, a low startup voltage is the critical requirement,
while the end-to-end efficiency of 25% is of lesser importance. For instance, the charge pump could
be integrated into a power management system where it is used as a low-voltage multiplier that
supplies reference or bias levels to other circuits. Hence, the ultra-low voltage startup
demonstrated by this self-starting linear charge pump makes it indispensable in micro-energy
applications. For applications that necessitate high charge pump efficiency at low input voltages,
this charge pump topology can be adopted without the self-starting feature and thereby achieve
efficiencies close to 75%, based on simulation results.
6.4 Techniques to improve efficiency 
Reducing the losses incurred in the charge pump stages is the most effective method to 
improve the efficiency of a linear charge pump. The switching losses due to charging and 
discharging the bottom plate of CP capacitors, and the gate-drive loss at the charge transfer 
switches, form a significant percentage of the total losses in the linear charge pump topologies. 
Some of the techniques that can be adopted to improve the CP's efficiency include adaptive
frequency control and charge recycling schemes.  
6.4.1 Adaptive Frequency control 
The switching losses can be reduced by lowering the frequency of the charge pumping
operation. However, a reduction in operating frequency also results in increased output ripple
and increased charge redistribution loss. Therefore, there exists an optimum frequency of
operation that corresponds to maximum efficiency in the CP for given input and load
conditions. This work aimed at variable-delay based frequency control to maximize conversion
efficiency. With proper implementation of the delay control logic, the frequency of operation can
be tuned across a large range to accommodate variation in input voltage and load conditions.
Further, adaptive frequency control can also be realized by adjusting the frequency of operation
based on the monitored output ripple voltage of the CP [67].
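
The existence of an optimum frequency can be illustrated with a first-order loss model, sketched below under stated assumptions. Switching loss is taken as proportional to frequency (a parasitic capacitance charged to VIN in every stage each cycle), and charge-redistribution loss is taken as inversely proportional to frequency (load current squared times an output resistance of N/(f·C)). Both expressions, the function names, and all component values are hypothetical simplifications and are not taken from the fabricated design.

```python
import math


def loss_coefficients(v_in, i_load, n_stages, c_stage, c_par):
    """Return (a, b) for the assumed loss model P(f) = a*f + b/f."""
    a = n_stages * c_par * v_in ** 2      # switching energy per cycle (J)
    b = n_stages * i_load ** 2 / c_stage  # redistribution loss times f (W*Hz)
    return a, b


def total_loss(f, a, b):
    """Total modelled loss in watts at clock frequency f."""
    return a * f + b / f


def optimum_frequency(a, b):
    """Frequency that minimises a*f + b/f, found by setting dP/df = 0."""
    return math.sqrt(b / a)


a, b = loss_coefficients(v_in=0.2, i_load=0.1e-6, n_stages=3,
                         c_stage=50e-12, c_par=5e-12)
f_opt = optimum_frequency(a, b)
print(f"optimum frequency ~ {f_opt / 1e3:.1f} kHz")
```

In this model the optimum scales linearly with load current, which is one way to see why a frequency setting tuned for one load condition is suboptimal for another.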
6.4.2 Charge Recycling 
Charge-recycling based methodologies reduce the switching loss at the CP bottom plates 
by recycling the charge at the end of the discharge phase. The common implementation is to 
employ the non-overlapping dead-time between the charge and discharge phases of CP, to 
redistribute charge among the bottom plates of adjacent CP stage capacitors [38], [67].  
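
The benefit of bottom-plate charge sharing can be estimated with a simple energy count, sketched below. It assumes two equal bottom-plate parasitic capacitances, one fully charged and one fully discharged, that are shorted together during the dead time so both settle at half the clock amplitude. The function name and component values are hypothetical, and the energy needed to drive the sharing switch is ignored.

```python
def supply_energy_per_transition(c_bp, v_clk, recycle):
    """Energy drawn from the supply to raise one bottom plate to v_clk.

    With recycling, the plate starts from v_clk / 2 after ideal charge
    sharing with an equal capacitance; without it, the plate starts from 0 V.
    """
    v_start = v_clk / 2 if recycle else 0.0
    charge = c_bp * (v_clk - v_start)  # charge delivered by the supply
    return charge * v_clk              # the supply delivers it at v_clk


c_bp, v_clk = 5e-12, 0.2  # hypothetical parasitic capacitance and clock swing
e_plain = supply_energy_per_transition(c_bp, v_clk, recycle=False)
e_shared = supply_energy_per_transition(c_bp, v_clk, recycle=True)
print(f"supply energy ratio with sharing: {e_shared / e_plain:.2f}")
```

Under these idealised assumptions the supply energy spent on the bottom plates is halved, which bounds the saving that such a scheme can offer.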
6.5 Summary and Conclusion 
From the measurement results, it can be concluded that the proposed CP topologies are
capable of very-low voltage operation with self-startup capability. It has been demonstrated that
reducing the losses associated with the linear charge pump topologies improves the performance,
even under very-low voltage operation. Since the charge pumps operate in the ultra-low voltage
regime, the startup voltage is dependent on the CMOS device threshold voltage and can therefore
be sensitive to process variations. Of the three tested samples, the variation in startup voltage due
to process corners is within 10 mV, but more samples are required to quantify the percentage
  
























[44] Charge pump 
with VTH tuning 
95 mV 
Inductive, 
Ext. CP cap 
N/A
†












4 180 mV 0.7 V N/A N/A 0.29 65-nm 
[49] Charge pump 270 mV Linear CP 4 450 mV 1.4 V 58% 15 mV 0.42 130-nm 
[50] Charge pump 150 mV 
Exponential 
CP 
8 150 mV 0.85 V 30 % (sim.) N/A N/A 65-nm 
This work 
LCP V1 






70 % (CP, sim) 
1.5 mV 0.15 130-nm 
This work  
LCP V2 






73 % (CP, sim) 
1.5 mV 0.15 130-nm 
*





A comparison of the performance of the proposed CP designs with the state-of-the-art
low voltage, CP-based self-starting converters is provided in Table 6.2. Note that this
self-starting CP outperforms the state-of-the-art charge pump based converters with respect to
low startup voltage. The very high efficiency of the standalone charge pump proposed in this
work is achieved at ultra-low startup voltage, without the need for external excitation, external
components, or post-fabrication trims. Furthermore, the charge pumps with total capacitance of
200 pF occupy a very small area of 0.15 mm².
The very-low startup voltage capability renders this CP suitable for ultra-low voltage 
energy harvesting systems such as TEGs. Also, this CP could be employed in kick-start 
applications for boost converters, and in battery-recharging systems from energy harvesters. 
 
Chapter 7 Conclusion 
 
The feasibility of charge-recycling based low-power digital operation has been 
demonstrated in this work. The proposed CR scheme has been designed and implemented to 
improve the energy-efficiency of a 12-bit Gray-code counter. This CR technique advances the 
state-of-the-art CR designs by eliminating the delay incurred in charge-pump based voltage 
boosting, and removing the need for current-balancing between vertically-stacked digital blocks. 
Additionally, the proposed scheme makes it possible to conceive partially self-powered circuits 
that reuse the recycled power harvested from their own operation. For a 2X reduction in the 
maximum frequency of operation, the proposed scheme offers 41% energy reduction in the 
source block while the total energy savings, including the control logic, aggregates to 25%, per 
cycle of operation. The energy reduction accomplished by the proposed scheme is more than that 
of other CR schemes reported in literature. Furthermore, this CR implementation realizes the 
25% energy reduction without the need to generate multiple, regulated power supply voltages. 
Thus, the proposed CR methodology clearly improves the energy efficiency in medium speed, 
digital systems, and advances the current state-of-the-art CR techniques. 
The second part of this research presents the design of an ultra-low voltage, switched-
capacitor based charge pump that broadens the application of the charge-recycling scheme to 
compensate for process, voltage and temperature (PVT) variations. The unassisted, self-startup 
capability in the ultra-low voltage regime renders this charge pump indispensable to energy 
harvesting applications. Further, the proposed low voltage, self-starting charge pump can be 
employed to kick start a boost converter and thereby improve the efficiency of micro-energy 
harvesters. 
The CR scheme, in conjunction with the proposed self-starting, low-voltage charge pump,
facilitates harvesting of energy from digital circuits. The characterized prototypes validate the
effectiveness of the proposed design methodologies to enable highly energy efficient operation.
In summary, this research has demonstrated successful design and implementation of key
components required to realize highly energy-efficient mixed-signal systems.
7.1 Original contributions 
The original contributions of this work include:
• A novel charge-recycling approach to lower power consumption in medium-speed
digital circuits.
• Analysis of the amount of energy-reduction that can be achieved using CR
techniques in conjunction with dynamic voltage scaling.
• A methodology to implement charge-recycling in existing digital circuits. The
absence of delay in the voltage boosting path facilitates the application of the
proposed scheme in low-power systems such as portable electronics, and sensor
based systems that have intermittent operation spread within long idle states.
• Design of a self-starting, ultra-low voltage, charge pump that extends the
application of the CR methodology to compensate for PVT variations and enables
tracking of maximum performance.
• A self-starting charge pump that can operate autonomously from the recycled
charge or from the energy harvested from digital circuits.
• Investigation of the various losses incurred in the proposed low-voltage charge
pump topology.
• Successful design and characterization to demonstrate the effectiveness of the
proposed CR methodology and the ultra-low voltage self-startup charge pump.
7.2 Directions for future work 
There are several interesting, open-ended problems that can be solved in order to improve 
the energy efficiency of low power circuits. A few directions to enhance the effectiveness and 
application of this work are included here. 
7.2.1 Charge recycling based low power digital operation
• The efficiency of the virtual VDD generation can be improved by charge-sharing or
charge-redistribution from the bottom plates of the CR capacitors.
• The offset of the comparators employed to monitor the virtual ground and virtual
VDD voltages affects the overall efficiency of the CR scheme. Hence, a
compromise can be made with power consumption of the comparator in order to
increase the accuracy and thus improve the efficiency across PVT variations.
• Comprehensive characterization can be performed on the CR prototype to include
the effect of temperature variations on the CR efficiency.
• The CR reference voltage generation circuitry can be integrated within the control
logic. This facilitates the generation of dynamic virtual power supply levels that
adapt with process and temperature variations so as to guarantee consistent
performance across varying operating conditions. Furthermore, this also enables
tracking of the maximum energy efficiency point across PVT variations.
• The proposed CR scheme can be implemented on a digital system that allows for
multiple supply levels of operation. Furthermore, a time-multiplexed CR approach
can be explored to enhance the energy efficiency of the system.
7.2.2 Charge pump
• The efficiency of the proposed linear charge-pump can be improved by recycling
the charge from the bottom plates of the pumping capacitors. Charge
redistribution techniques presented in [38] can be adopted to share the charge in
the bottom plates of the capacitors during the dead-time in between the charging
and pumping phases.
• The application of the ultra-low voltage charge pump can be further broadened by
increasing the drive strength of the charge pump.
• The prototype can be characterized to examine the change in the startup voltage
and efficiency across variation in the temperature.
• The ultra-low voltage charge pump can be decoupled from the self-startup circuits
in order to characterize the maximum efficiency of the charge-pump as a function
of the frequency of operation, for different input and load conditions.
Finally, the low-voltage self-startup charge pump can be integrated into a charge-
recycling mixed-signal system in order to realize an autonomous energy harvester that operates 




[1] International Technology Roadmap for Semiconductors, 2010, 
http://www.itrs.net/Links/2010ITRS/Home2010.htm 
 
[2] International Technology Roadmap for Semiconductors, Design Chapter 2010, 
http://www.itrs.net/Links/2010ITRS/Home2010.htm 
 
[3] Anantha P. Chandrakasan et al., “Low-Power CMOS Digital Design”, IEEE JSSC, vol. 
27, no. 4, pp. 473–483, Apr. 1992. 
 
[4] Rahul Sarpeshkar, Ultra Low Power Bioelectronics: Fundamentals, Biomedical 
Applications and Bio-inspired Systems, Cambridge University Press, New York, 2010. 
 
[5] Neil H. E. Weste and David Harris, CMOS VLSI Design, A Circuits and Systems
Perspective, 3rd Edition, Pearson Education Inc., Boston, 2005.
 
[6] Wai-Kai Chen, The Electrical Engineering Handbook, Academic Press, 2004. 
 
[7] Dejan Markovic et al., “Ultralow-Power Design in Near-Threshold Region,” Proc. of the 
IEEE, vol.98, no.2, pp. 237–252, 2010. 
 
[8] J. Rabaey, “Low Power Design Essentials,” Series on Integrated Circuits and Systems, 
DOI 10.1007/978-0-387-71713-5_4, Springer Science + Business Media, 2009. 
 
[9] J.-M. Chang and M. Pedram, “Energy Minimization using Multiple Supply Voltages,” 
IEEE Transactions on VLSI Systems, vol. 5, no. 4, pp. 436–443, Dec. 1997. 
 
[10] J. Cai et al., “Supply Voltage Strategies for Minimizing the Power of CMOS Processors,” 
Symposium on VLSI Technology, pp. 102–103, 2002. 
 
[11] Vincent R.von Kaenel et al., “Automatic Adjustment of Threshold & Supply Voltages for 
Minimum Power Consumption in CMOS Digital Circuits,” IEEE Symposium on Low 
Power Electronics, pp. 78–79, 1994. 
 
[12] Sami Kirolos and Yehia Massoud, “Supply Voltage Adaptive Low-Power Circuit
Design,” IEEE Workshop on Design, Applications, Integration and Software,
pp. 131–134, 2006.
 
[13] Yibin Ye and Kaushik Roy, “Low-Power Circuit Design Using Adiabatic Switching 
Principle,” IEEE Midwest Symposium on Circuits and Systems, pp. 1189–1192, 1995. 
 
[14] W. C. Athas, L. Svensson, J. Koller, N. Tzartzanis, and Y.-C. Chou, “Low-power digital 
systems based on adiabatic-switching principles,” IEEE TVLSI, vol. 2, pp.398–407, Dec. 
1994. 
 
[15] John S. Denker, “A Review of Adiabatic Computing,” IEEE Symposium on Low Power 
Electronics, pp. 94–97, 1994. 
 
[16] Suhwan Kim, Conrad H. Zeisler, and Marios C. Papaefthymiou, “Charge-Recovery 
Computing on Silicon,” IEEE Transactions on Computers, vol.54, no. 6, pp.651–659, 
2005. 
 
[17] Yong Moon and Deog-Kyoon Jeong, “An Efficient Charge Recovery Logic Circuit,”
IEEE JSSC, vol. 31, no. 4, pp.514–522, Apr. 1996. 
 
[18] Bo Zhai et al., “Extended Dynamic Voltage Scaling for Low Power Design,” Proc. of Int. 
SOC Conference, pp. 389–394, 2004.  
 
[19] A. Chakraborty et al., “Implications of Ultra Low-Voltage Devices on Design Techniques 
for Controlling Leakage in NanoCMOS Circuits,” IEEE Int. Symposium on Circuits and 
Systems, pp. 33–36, 2006. 
 
[20] Kyung Ki Kim, and Yong-Bin Kim, “Novel Adaptive Design Methodology for Minimum 
Leakage Power Considering PVT Variations on Nanoscale VLSI Systems,” IEEE 
Transactions on VLSI Systems, vol. 17, no. 4, pp.517–528, 2009. 
 
[21] M. Nomura et al., “Delay and power monitoring schemes for minimizing power 
consumption by means of supply and threshold voltage control in active and standby 
modes,” IEEE JSSC, vol. 41, no.4, pp.805–814, Apr. 2006. 
 
[22] S. Rajapandian, X. Zheng, and K. L. Shepard, “Charge-recycling voltage domains for 
energy-efficient low-voltage operation of digital CMOS circuits,” Proc. Int. Conf. 
Computer Design, pp. 98–102, 2003. 
 
[23] S. Rajapandian, Z. Xu, and K. L. Shepard, “Energy-Efficient Low-Voltage Operation of 
Digital CMOS Circuits Through Charge-Recycling,” Symposium on VLSI Circuits Digest 
of Technical Papers, pp. 330–333, 2004. 
 
[24] S. Rajapandian et al., “Implicit DC-DC Downconversion Through Charge-Recycling”, 
IEEE JSSC, vol. 40, no. 4, pp. 846–852, Apr. 2005. 
 
[25] Jie Gu and Chris H. Kim, “Multi-Story Power Delivery for Supply Noise Reduction and 
Low Voltage Operation,” ISLPED, pp. 192–197, 2005.  
 
[26] K. Keung, V. Manne and A. Tyagi, “A Novel Charge Recycling Design Scheme Based 
on Adiabatic Charge Pump,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, 
no. 7, pp.733–745, Jul. 2007. 
 
[27] V. Manne and A. Tyagi, “An Adiabatic Charge Pump Based Charge Recycling Design 
Style,” PATMOS 2003, Lecture Notes in Computer Science, vol. 2799, pp.299–308, 
2003. 
 
[28] K. Keung, and A. Tyagi, “SRAM CP: A Charge Recycling Design Schema for SRAM,” 
PATMOS 2006, Lecture Notes in Computer Science, vol. 4148, pp. 95–106, 2006.
 
[29] J. M. Rabaey, A. Chandrakasan and B. Nikolic, Digital Integrated Circuits, 2nd ed., 
Pearson Education, Delhi, 2004. 
 
[30] B. Razavi and B. A. Wooley, “Design Techniques for High-Speed, High-Resolution 
Comparators,” IEEE Journal of Solid-State Circuits, vol. 27, no. 12, pp. 1916–1926, Dec. 
1992. 
 
[31] David Harris et al., “The Fanout-of-4 Inverter Delay Metric,” 
http://www3.hmc.edu/~harris/research/FO4.pdf  
 
[32] C. Ulaganathan, C. L. Britton Jr., J. Holleman, and B. J. Blalock, “A Novel Charge 
Recycling Approach to Low-Power Circuit Design,” Intl. Conference on Mixed Design of 
Integrated Circuits and Systems, pp. 208–213, May 2012. 
 
[33] K. Roy and S. C. Prasad, Low-Power CMOS VLSI Circuit Design, Wiley, New York, 
2000. 
 
[34] N. Nambiar et al., “SiGe BiCMOS 12-bit 8-channel low power Wilkinson ADC,” 
Midwest Symposium on Circuits and Systems, pp. 650–653, Aug. 2008. 
 
[35] C. Ulaganathan, B. J. Blalock, J. Holleman, and C. L. Britton Jr., “An Ultra-Low Voltage 
Self-Startup Charge Pump for Energy Harvesting Applications,” Accepted for publication 
at The 55th Intl. Midwest Symposium on Circuits and Systems, pp. 206–209, Aug. 2012. 
 
[36] Nathan Bourgoine, “Harvest Energy from a Single Photovoltaic Cell,” Journal of Analog 
Innovation, Linear Technology, vol. 21, no. 1, Apr. 2011. 
 
[37] Alic Chen, “Thermal Energy Harvesting with Thermoelectrics for Self-powered Sensors: 
With Applications to Implantable Medical Devices, Body Sensor Networks and Aging in 
Place,” Ph.D. Dissertation, University of California, Berkeley, 2011. 
 
[38] Hye-Won Hwang, Jung-Hoon Chun, and Kee-Won Kwon, “A Low Power Cross-
Coupled Charge Pump with Charge Recycling Scheme,” IEEE Intl. Conference on 
Signals, Circuits and Systems, pp. 1–5, 2009. 
 
[39] Y. K. Ramadass and A. P. Chandrakasan, “A battery-less thermoelectric energy 
harvesting interface circuit with 35 mV startup voltage,” IEEE Journal of Solid-State 
Circuits, vol. 46, no. 1, pp. 333–341, Jan. 2011. 
 
[40] E. Carlson, K. Strunz, and B. Otis, “20 mV input boost converter for thermoelectric 
energy harvesting,” IEEE Symp. VLSI Circuits Dig. Tech. Papers, pp. 162–163, Jun. 
2009. 
 
[41] J.-P. Im et al., “A 40 mV transformer-reuse self-startup boost converter with MPPT 
control for thermoelectric energy harvesting,” Proc. IEEE ISSCC Dig. Tech. Papers, pp. 
104–106, Feb. 2012. 
 
[42] L. Gobbi, A. Cabrini, and G. Torelli, “A Discussion on Exponential-Gain Charge Pump,” 
ECCTD, pp. 615–618, Aug. 2007. 
 
[43] J. F. Dickson, “On-chip high-voltage generation in NMOS integrated circuits using an 
improved voltage multiplier technique,” IEEE Journal of Solid-State Circuits, vol. 11, no. 
3, pp. 374–378, Jun. 1976. 
 
[44] P.-H. Chen et al., “A 95 mV-startup step-up converter with VTh-tuned oscillator by fixed-
charge programming and capacitor pass-on scheme,” Proc. IEEE ISSCC Dig. Tech. 
Papers, pp. 216–218, Feb. 2011. 
 
[45] P.-H. Chen et al., “0.18-V Input Charge Pump with Forward Body Biasing in Startup 
Circuit using 65nm CMOS,” IEEE CICC, pp. 1–4, 2010. 
 
[46] J. Che, C. Zhang, Z. Liu, Z. Wang, and Z. Wang, “Ultra-low-voltage low-power charge 
pump for solar energy harvesting systems,” ICCCAS, pp. 674–677, 2009.  
 
[47] I. Doms, P. Merken, R. Mertens, and C. Van Hoof, “Integrated capacitive power-
management circuit for thermal harvesters with output power 10 to 1000 μW,” Proc. 
IEEE ISSCC Dig. Tech. Papers, pp. 300–301, Feb. 2009. 
 
[48] Anna Richelli, Luigi Colalongo, Silvia Tonoli, and Zsolt M. Kovacs-Vajna, “A 0.2–1.2 V 
DC/DC Boost Converter for Power Harvesting Applications,” IEEE Trans. on Power 
Electronics, vol. 24, no. 6, pp. 1541–1546, Jun. 2009. 
 
[49] Yi-Chun Shih and Brian P. Otis, “An Inductorless DC-DC Converter for Energy 
Harvesting With a 1.2-µW Bandgap-Referenced Output Controller,” IEEE Trans. on 
Circuits and Systems–II: Express Briefs, vol. 58, no. 12, pp. 832–836, Dec. 2011. 
 
[50] M. AbdElFattah, A. Mohieldin, A. Emira, and E. Sanchez-Sinencio, “A Low-Voltage 
Charge Pump for Micro Scale Thermal Energy Harvesting,” IEEE Intl. Symposium on 
Industrial Electronics, pp. 76–80, 2011.  
 
[51] A. Wang, B. H. Calhoun, and A. P. Chandrakasan, “Sub-threshold design for ultra-low 
power systems,” Springer, pp. 75–102, 2006. 
 
[52] A. Umezawa et al., “A 5 V-only operation 0.6-μm Flash EEPROM with row decoder 
scheme in triple-well technology,” IEEE Journal of Solid-State Circuits, vol. 27, no. 11, 
pp. 1540–1546, Nov. 1992. 
 
[53] Y. A. Eken, “High Frequency Voltage Controlled Ring Oscillators in Standard CMOS,” 
Ph.D. Dissertation, Georgia Institute of Technology, Nov. 2003. 
 
[54] C. Enz, F. Krummenacher, and E. Vittoz, “An Analytical MOS Transistor Model Valid in 
All Regions of Operation and Dedicated to Low-Voltage and Low-Current Applications,” 
Special issue of the Analog Integrated Circuits and Signal Processing Journal on Low-Voltage 
and Low-Power Design, vol. 8, pp. 83–114, Jul. 1995. 
 
[55] CMOS 8RF Design Manual, http://www.mosis.com/vendors/view/ibm/8rf-dm, 2010. 
 
[56] A. Richelli et al., “Charge Pump Architectures Based on Dynamic Gate Control of the 
Pass-Transistors,” IEEE Transactions on Very Large Scale Integration Systems, vol. 17, 
no. 7, pp. 964–967, Jul. 2009. 
 
[57] Y. Nakagome et al., “An experimental 1.5 V 64 Mb DRAM,” IEEE Journal of Solid-State 
Circuits, pp. 465–472, Apr. 1991. 
 
[58] Wing-Hung Ki, Yan Lu, Feng Su, and Chi-Ying Tsui, “Design and Analysis of On-Chip 
Charge Pumps for Micro-Power Energy Harvesting Applications,” IEEE Intl. Conference 
on VLSI and System-on-Chip, pp. 374–379, 2011. 
 
[59] Feng Su, Wing-Hung Ki, and Chi-Ying Tsui, “Gate Control Strategies for High 
Efficiency Charge Pumps,” IEEE Int. Symposium on Circuits and Systems, pp. 1907–
1910, 2005. 
 
[60] L. Su and D. Ma, “Design and optimization of integrated low-voltage low-power 
monolithic CMOS charge pumps,” IEEE SPEEDAM, pp. 43–48, 2008. 
 
[61] G. Palumbo and D. Pappalardo, “Charge Pump Circuits: An Overview on Design 
Strategies and Topologies,” IEEE Circuits and Systems Magazine, First Quarter, pp. 31–
45, 2010. 
 
[62] IBM Microelectronics Division, “CMOS8RF Design Manual,” Nov. 2010.  
 
[63] G. Palumbo, D. Pappalardo, and M. Gaibotti, “Charge-Pump Circuits: Power-
Consumption Optimization,” IEEE Transactions on Circuits and Systems–I, vol. 49, no. 
11, Nov. 2002. 
 
[64] Wing-Hung Ki, Feng Su, and Chi-Ying Tsui, “Charge Redistribution Loss Consideration 
in Optimal Charge Pump Design,” IEEE Int. Symposium on Circuits and Systems, pp. 
1895–1898, May 2005. 
 
[65] Texas Instruments Datasheet, “LP38512-ADJ 1.5A Fast Transient Response Adjustable 
Low-Dropout Linear Voltage Regulator,” http://www.ti.com/product/lp38512-adj , Feb. 
2009. 
 
[66] Texas Instruments Datasheet, “SN74AUC2G34 Dual Buffer Gate,” 
http://www.ti.com/lit/ds/symlink/sn74auc2g34.pdf , Nov. 2003. 
 
[67] Y. K. Ramadass, “Energy Processing Circuits for Low-Power Applications,” Ph.D. 
Dissertation, Massachusetts Institute of Technology, Jun. 2009. 
 
[68] Linear Technology Datasheet, “LTC3108 Ultralow Voltage Step-up Converter and Power 
Manager,” http://cds.linear.com/docs/Datasheet/3108fb.pdf, 2010. 
 





Chandradevi Ulaganathan was born in Pondicherry, India. She attended the St. Joseph of 
Cluny Higher Secondary School. She obtained a Bachelor of Technology degree in Electronics 
and Communication Engineering from Pondicherry University, India, in May 2001. She worked 
as a Software Engineer at Infosys Technologies for two years before moving to the United States 
to further her education. She obtained a Master of Science degree in Electrical Engineering from 
the University of Tennessee, Knoxville, in May 2007. 
Since June 2007, Chandradevi has been pursuing the Doctor of Philosophy degree in 
Electrical Engineering at UT under the supervision of Dr. Benjamin J. Blalock. In June 2012, 
she joined Texas Instruments, Knoxville, Tennessee, where she works in the Home Audio 
Amplifiers group. 
