Low Power Circuit Design in Sustainable Self Powered Systems for IoT Applications by Abuellil, Amr Mohamed Abdelaziz




AMR MOHAMED ABDELAZIZ ABUELLIL
Submitted to the Office of Graduate and Professional Studies of
Texas A&M University
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Chair of Committee, Edgar Sánchez-Sinencio
Committee Members, Sunil Khatri
Kamran Entesari
Mahmoud El-Halwagi
Head of Department, Miroslav M. Begovic
May 2019
Major Subject: Electrical Engineering
Copyright 2019 Amr Mohamed AbdelAziz Abuellil
ABSTRACT
The Internet-of-Things (IoT) network is being vigorously pushed forward on many fronts in
diverse research communities. Many problems remain to be solved, and challenges exist at each
of its many levels of abstraction. In this thesis we give an overview of recent developments
in circuit design for ultra-low power transceivers and energy harvesting management units for the
IoT.
The first part of the dissertation studies energy harvesting interfaces and the optimization of
power extraction, followed by power management for energy storage and supply regulation. We
give an overview of the recent developments in circuit design for ultra-low power management
units, focusing mainly on the architectures and techniques required for energy harvesting from
multiple heterogeneous sources. Three projects are presented in this area to reach a solution that
provides reliable continuous operation for IoT sensor nodes in the presence of one or more natural
energy sources to harvest from.
The second part focuses on wireless transmission. To reduce the power consumption and boost
the Tx energy efficiency, a novel delay cell exploiting current reuse is used in a ring oscillator
employed as the local oscillator generator scheme. In combination with an edge-combiner power
amplifier, the Tx showed a measured energy efficiency of 0.2 nJ/bit and a normalized energy




To God Almighty, my provider and aid. To my loving parents and sister Nora. To my good
friends and team. To Basma Serry
ACKNOWLEDGMENTS
I would like to thank Prof. Edgar Sanchez for putting his trust in me, supporting my whole
PhD program, and providing unmatched care and advice throughout my time here. You have been a
great support, teacher, and father to me. No words can express the gratitude I have for you; I will
carry it with me through life.
To Texas A&M University: I learned a lot through these few years, more than I ever did in my
entire life. Traveling here and seeing everything this country has to offer has been an eye-opening
experience. Thank you for making me enjoy every moment of it.
My parents, you instilled in me hard work, doing things in the best way possible, and always
striving to become a better person. These good qualities are in me because of you, even
though it took me time to actually apply them in my life. Thank you for everything you did
for me, Ebtisam and Mohamed.
Basma, even though we are not together anymore, this victory must be celebrated by you too;
there is no way I could have made it without you at every step of this. The last few months of this
PhD were a natural result of every word of kindness and support you have given me. I appreciate
everything you did and wish all good things for you.
My brothers and sisters in arms, Alfredo, Johan, Aditya, Jorge, Sungjun, Zzengz, Joseph,
Omar, Hatem, Mohamed Abuzied, Fernando, Adriana and Sergio: I am honored to know you and
to have worked by your side. I had the best time hanging out with you and talking about all sorts
of things along the way. Much love to you all. I will use every opportunity to maintain this
relationship and keep the connection with you all.
My sister Nora, your presence in the last few weeks before submission was a great support
before the finish line. I can't thank you enough.
All of this happened because of your mercy, thank you God. And thank you for giving me the
ability to thank you.
CONTRIBUTORS AND FUNDING SOURCES
Contributors
The introduction contains work done by Johan Estrada and Alfredo Costilla, with the
thesis author as collaborator. The work in Chapter 4 was led by Jorge Zarate, with the thesis
author as collaborator. All other work conducted for the dissertation was completed by the student
independently.
Funding Sources
Graduate study was supported by a fellowship from Silicon Labs and Texas Instruments through




NOMENCLATURE
ASK Amplitude Shift Keying
BFSK Binary Frequency Shift Keying
BW Bandwidth
DAC Digital-to-Analog Converter
DC Direct Current
DCO Digitally Controlled Oscillator
DCVSL Differential Cascode Voltage Switch Logic
EH Energy Harvesting
FFT Fast Fourier Transform
FPGA Field-Programmable Gate Array
FSK Frequency Shift Keying
GFSK Gaussian Frequency Shift Keying
HPF High Pass Filter
IF Intermediate Frequency
IoT Internet of Things
ISM Industrial, Scientific, and Medical
LNA Low Noise Amplifier
LO Local Oscillator
LSB Least Significant Bit
MIM Metal-Insulator-Metal
MPPT Maximum Power Point Tracking





OQPSK Offset Quadrature Phase Shift Keying
PA Power Amplifier
PLL Phase Locked Loop
PMOS P-type MOSFET
PVT Process-Voltage-Temperature
QAM Quadrature Amplitude Modulation
RMS Root Mean Square
RO Ring Oscillator
RSSI Received Signal Strength Indication




VCO Voltage Controlled Oscillator
VRO Vertical Ring Oscillator
WBAN Wireless Body Area Network





TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
CONTRIBUTORS AND FUNDING SOURCES
NOMENCLATURE
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
1. INTRODUCTION
1.1 Internet of things (IoT) Overview
1.1.1 IoT Market Growth and Applications Spectrum
1.1.2 Inside a Low Power Wireless Network (WSN)
1.1.3 Key challenges and Research trends
1.2 Thesis Framework
1.2.1 Research Scope
1.2.2 Thesis Organization
2. LITERATURE REVIEW AND PROPOSED SYSTEM LEVEL
2.1 Transmitters Low Power Strategies Review
2.1.1 Block Level Optimization
2.1.2 Architecture Level Optimization
2.1.3 Cross Layer Optimization
2.1.4 Low Power Features Summary
2.2 Energy Harvesting and Power management units
2.2.1 Single Harvester
2.2.2 Multiple Harvesters
2.2.3 Energy harvesting PMU Low Power Features Summary
2.3 Proposed System Block Diagram
2.3.1 Harvesting Interface
2.3.2 Power Management
2.3.3 Wireless Transmitter
3. MULTIPLE-INPUT HARVESTING PMU WITH ENHANCED BOOSTING SCHEME FOR IOT APPLICATIONS
3.1 Introduction
3.2 System Level Architecture
3.3 Circuit Level Design
3.3.1 Multiple Input Start-up Circuit
3.3.2 Combiner, MPPT and Energy Flow Detection (EFD)
3.3.3 Enhanced Booster Scheme
3.3.4 Pulse Battery Charging
3.3.5 Hybrid LDO
3.4 Measurement Results
3.5 Conclusion
4. 0.2 NJ/BIT FAST START-UP ULTRA-LOW POWER WIRELESS TRANSMITTER FOR IOT APPLICATIONS
4.1 Introduction
4.2 Proposed Architecture
4.3 Circuit Design
4.3.1 VRO Analysis and Design
4.3.2 Edge Combiner Power Amplifier
4.3.3 Digital Frequency Calibration
4.3.4 Low Power, Monotonic Digital to Analog Converters
4.3.5 Temperature Insensitive Biasing for Crystal Oscillator
4.4 Measurement Results
4.5 Summary and Conclusion
5. CONCLUSION
5.1 Summary of Contributions
5.2 Suggested Improvements and Future Directions
REFERENCES




LIST OF FIGURES
1.1 IoT Market Growth and share by sub-sectors (data used with permission from GrowthEnabler ©Market pulse report, IoT April 2017 [1])
1.2 Spending on IoT in 2015 versus 2020
1.3 IoT node hardware block diagram
1.4 IoT layers stack
1.5 Power consumption patterns in different duty cycled applications
1.6 Discrete components implementation for Self powered IoT Node. ©[2006] IEEE
2.1 LP design using current stacking. ©[2012] IEEE
2.2 LP design for TxRx architectures
2.3 Crystal oscillator startup enhancement circuits. ©[2014] IEEE
2.4 MDLL circuit. ©[2014] IEEE
2.5 PLL-less transmitter architecture. ©[2011] IEEE
2.6 Preamble synchronization between 2 nodes in a network
2.7 ID-passive wake up. ©[2012] IEEE
2.8 Harvester model TEG (left), Solar (right)
2.9 Harvesting from motion through coils and moving magnets, equivalent electrical model
2.10 Negative voltage converter (NVC) in passive (left) and active (right) modes
2.11 System level for magnetic motion harvesting PMU
2.12 (a) Series voltage combiner [2] ©[2018] IEEE, (b) Parallel current combiner [3] ©[2012] IEEE
2.13 Proposed system block diagram
2.14 Tx/Rx Node power cycle and transmission overhead
3.1 (a) Proposed PMU block diagram, (b) Operation timing chart
3.2 (a) Simplified power flow diagram, (b) System flow chart
3.3 Start-up circuit diagram
3.4 Measured start-up with CSU=68 µF in presence of 4 inputs
3.5 (a) Combiner unit circuit, (b) Combiner waveforms, (c) Energy flow detection (EFD) circuit
3.6 OSCmain/su circuit diagram showing "EFD" control
3.7 (a) Harvester model and OCC pulse generation circuit, (b) VMPPT generation circuit, (c) Measured VMPPT search and harvester input during OCC pulse
3.8 MPPT tracking accuracy, combiner power efficiency and power consumption versus Pout
3.9 (a) Booster top level diagram, (b) Voltage doubler CP circuit with MIMCAP reuse
3.10 Operation cycle for harvester/battery powered systems
3.11 CSTR charging by smart boost, calculated efficiency with time
3.12 (a) Pulse charger circuit, (b) Pulse charger operation, (c) Measured VBATT and VACC during charging
3.13 Hybrid-LDO Circuit
3.14 Hybrid-LDO output voltage at different modes of operation
3.15 Zoomed-in die corner: (0) Cold start-up and OSCSU, (1) Four combiner units, (2) Booster and MIM caps, (3) Pulse charger and 3V-CP, (4) Hybrid LDO, (5) Energy flow detector and (6) OSCmain
3.16 Measured block consumption pie charts. (a) EFD=0, (b) EFD=1
3.17 Input waveforms showing four channel harvesting and MPP emulation
3.18 Measured block and end-to-end efficiencies
3.19 Measurements setup diagram
3.20 Measured PMU waveforms showing system operation through different states
4.1 Top-level Tx block diagram
4.2 BFSK modulation with a) small ∆f (LO phase noise buries f1 and f2) and b) large ∆f (negligible LO phase noise effect in f1 and f2)
4.3 a) Typical DCVSL delay cell and b) Proposed vertical delay cell
4.4 Equivalent models for the vertical delay cell during its two possible states showing the corresponding common-mode levels: a) Logic high (low) at vout+ (vout−) and b) Complementary logic low (high) at vout+ (vout−)
4.5 Vertical delay cell waveforms and signal transitions details for a) τpHL and b) τpLH in the vout+ (upper) output
4.6 VRO and its building blocks: a) vertical delay cell and b) RC delay tuning cell
4.7 Delay variations due to local mismatches for the VRO stage
4.8 Delay variations due to ±10% supply variation for the VRO stage
4.9 Delay variations due to temperature variation for the VRO stage
4.10 Edge-combiner PA and the pre-amplifier used to interface with the VRO
4.11 Digital calibration flow chart showing calibration duration under different operating conditions
4.12 Resistive-string-based DACs used to generate the tuning voltages for the RC delay tuning cell
4.13 Crystal oscillator schematic including bias current generation and buffering stage
4.14 Die microphotograph of the Tx
4.15 VRO measured tuning range 50–350 MHz translates into a PA RF frequency of 0.15–1.05 GHz
4.16 XO frequency stability across temperature
4.17 Phase noise of a) VRO with fVRO of 300 MHz for -106 dBc/Hz @ 1 MHz and b) Tx carrier at 900 MHz with -94 dBc/Hz @ 1 MHz
4.18 BFSK tones at the PA output for the 2 MHz frequency deviation case
4.19 a) Transmitted bit pattern (1010110011110000) at 1 Mbps and 3 Mbps received in a signal analyzer and b) Eye diagram using a PRBS7 pattern at 3 Mbps and 2 MHz tone spacing
4.20 Lab setup for a) digital calibration and data modulation and b) BER
4.21 Binary search algorithm for f1 and f2 performed once in the initial calibration cycle
4.22 BER versus frequency deviation at different data rates
4.23 Power amplifier output power versus PA supply
4.24 Tx power consumption per circuit block
5.1 DC/AC universal harvester interface




LIST OF TABLES
1.1 IoT applications spectrum
1.2 IoT wireless standards comparison
2.1 Harvesting sources comparison
2.2 Harvester Interface target specifications
2.3 Power management unit target specifications
2.4 Transceiver target specifications
3.1 Comparison against State of The Art Energy Harvesting PMUs
4.1 Delay variations of the VRO stage across process corners
4.2 Calibration time under various operation conditions
4.3 Comparison between PLL and PLL-less approaches
4.4 Performance Summary and Comparison with the State-of-the-Art Systems
1. INTRODUCTION
1.1 Internet of things (IoT) Overview
A simple definition of the IoT is the installation of sensors to monitor almost everything in life,
in an effort to achieve a full digital transformation of the world we live in. This should change
the way we track merchandise, practice agriculture, monitor our health, drive cars, and do
business in general. With almost 50 billion devices forecast in early 2011, a trend that still holds
today [4], far more knowledge is suddenly available from the sensors around you. These data can
be as simple as the temperature in your room right now, or the vibration level in a factory machine
or a self-driving car motor, reported from millions of nodes at the same time. Extrapolating this
trend, it is easy to see that the challenge lies in managing the data streams, both in wireless
transmission and in the cloud, and in filtering what is needed in the long run from what can be
processed and deleted immediately.
From a software perspective, the term "Big Data" emerges here, and it will be a growing pain
from many angles (data storage, wireless congestion, and security). Security is the main concern,
as millions of devices are sold worldwide without any security checks or algorithms implemented
in them [5], and security breaches have already occurred. In Finland in 2016, cyber-criminals were
able to shut down the heating in two buildings by continuously rebooting the heating system so
that the heaters never got a chance to start; given the freezing temperatures Finland can reach
in winter, this attack was significant. Going forward, every IoT device should ship with
up-to-date firmware and be updatable as new vulnerabilities are found.
From a hardware point of view, any device can be turned into a smart device as long as it can
be monitored or controlled over the internet, whether it is a toy or a passenger airplane. With all
the sensors embedded in those devices, the main limitation is the infrastructure needed to power
them and to send and receive their data. The gap between battery energy capacity scaling (+5%
annually since the 1990s) and Moore's law (half the energy required every 18 months for the same
functions) should theoretically be closing with time. However, growing demands, added
functionality, and complex wireless standards (5G, Wi-Fi) have been pushing up the power
consumption of the analog side (power amplifiers and LNAs), and the gap is far from closed. An
adequate energy source is vital to reach the goal of self-sustained devices without an elevated
maintenance cost for replacing batteries or adding infrastructure to provide power.
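The scaling argument above can be made concrete with a short calculation. The sketch below is
illustrative only: it takes the two rates quoted in the text (battery capacity growing about 5% per
year, energy per function halving every 18 months) at face value and assumes they compound
smoothly. The function names are hypothetical and are not taken from the thesis.

```python
def battery_capacity_gain(years, annual_growth=0.05):
    """Relative battery capacity after `years`, assuming smooth annual compounding."""
    return (1.0 + annual_growth) ** years


def energy_per_function(years, halving_period_years=1.5):
    """Relative energy needed for a fixed function after `years` (halves every 18 months)."""
    return 0.5 ** (years / halving_period_years)


def functions_per_charge(years):
    """Fixed-cost functions supported by one charge, relative to year 0."""
    return battery_capacity_gain(years) / energy_per_function(years)


if __name__ == "__main__":
    # Print the three relative quantities at a few sample horizons.
    for years in (0, 3, 6, 9):
        print(
            years,
            round(battery_capacity_gain(years), 3),
            round(energy_per_function(years), 4),
            round(functions_per_charge(years), 2),
        )
```

Under these assumptions a fixed digital workload becomes cheaper much faster than batteries
improve. The model deliberately leaves out the added functionality and the analog radio power
that the text identifies as the reason the gap remains open.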
1.1.1 IoT Market Growth and Applications Spectrum
The IoT platform has already enabled advances in multiple applications, such as healthcare,
wearables, asset monitoring, smart homes and buildings, and smart infrastructure for smart
cities. In industry, the main driving force for IoT is the ability to detect when and where
to perform preventive or predictive maintenance, as this can save hours on an active production
line, which in most cases means millions of dollars. In outdoor and indoor agriculture, sensing
soil temperature, moisture, and nutrient levels allows heating and watering systems to be turned
on automatically, reducing the amount of water used and lowering labor cost while maximizing
production.
In environmental monitoring, early detection of wildfires in forests or other disasters can save
many lives and resources, and early detection of radiation or air pollution levels can allow for
early evacuation. In inventory management, thousands of RFID/NFC tags can be deployed so that
the exact location of each item is known, saving search time and streamlining the infrastructure.
The share of each application in the worldwide market is shown in Fig. 1.1 (data used with
permission from GrowthEnabler ©Market pulse report, IoT April 2017 [1]). The IoT market is
estimated to grow at a compound annual growth rate (CAGR) of 28.5% from 2016 to 2020,
reaching $457 B. The predicted spending on each sector in 2015 versus 2020 is shown in Fig. 1.2.
The spending trends show how the market views the opportunities IoT provides to save future
resources. The IoT promise is essentially based on one word, "efficiency", which is the ability to
do the same with less. A summary of applications is shown in Table 1.1.
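As a quick check of what the quoted growth figures imply, the sketch below applies the standard
compound-growth relation, end = start × (1 + CAGR)^n. It assumes that 2016 to 2020 is counted
as four annual compounding steps, which the text does not state explicitly; the function names
are hypothetical.

```python
def compound(start_value, rate, periods):
    """Value after `periods` steps of growth at `rate` per step."""
    return start_value * (1.0 + rate) ** periods


def implied_start(end_value, rate, periods):
    """Starting value that grows to `end_value` after `periods` steps at `rate`."""
    return end_value / (1.0 + rate) ** periods


END_2020_BUSD = 457.0  # market size quoted in the text, in billions of dollars
CAGR = 0.285           # growth rate quoted in the text
PERIODS = 4            # assumption: 2016 -> 2020 counted as four annual steps

if __name__ == "__main__":
    start = implied_start(END_2020_BUSD, CAGR, PERIODS)
    print(f"implied 2016 market size: {start:.1f} B$")
    # Walk the series forward again to show the year-by-year growth.
    for step in range(PERIODS + 1):
        print(2016 + step, round(compound(start, CAGR, step), 1))
```

If the report counts the period differently (for example five steps), only `PERIODS` needs to
change.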
All these sensors can be deployed while keeping a human in the loop to decide what to do with
the data collected from the different sensors, which means that control remains in human hands.
A further level is the adoption of intelligent actions within the IoT (M2M direct control and deep
learning), which lowers production cost and time in addition to improving functionality. This is
becoming easier than before, as the data pools being generated provide more insight and supply
training vectors that allow deep learning algorithms to start from a rationale based on
human-coded logic and learn to improve it over time to achieve certain targets.
Figure 1.1: IoT Market Growth and share by sub-sectors (data used with permission from
GrowthEnabler ©Market pulse report, IoT April 2017 [1])
Figure 1.2: Spending on IoT in 2015 versus 2020
Table 1.1: IoT applications spectrum
1.1.2 Inside a Low Power Wireless Network (WSN)
Node hardware
The basic hardware subsystems that a wireless sensor node needs to perform its basic operations
are shown in Fig. 1.3.
Figure 1.3: IoT node hardware block diagram.
Power unit
This is the infrastructure that provides the required power to the sensor, processing unit, and
radio blocks. It can be a traditional battery pack, an inductor-based wireless power transfer
module, or ultimately an energy harvester that scavenges vibration or solar power from the
environment. In any of these cases the node operates on a tight power budget, but the last two
options drastically reduce the maintenance cost.
Sensing transducer
This is the node interface with the surrounding world. Sensor technology took big leaps
forward after MEMS emerged, allowing a sensor that used to be a few centimeters in size to be
integrated into the same package as the rest of the modules. The sensor is connected to an ADC
to interface it with the processing unit.
Processing/Storage
These are used to handle on-board data processing and manipulation, transient and short-term
storage, encryption, forward error correction (FEC) and digital modulation. WSNs have
computational requirements typically ranging from an 8-bit microcontroller to a 64-bit
microprocessor; storage requirements are typically around 1 MB.
Communication module
Radio transmission is how stored sensor data is communicated to other nodes in multi-hop
networks until it reaches the central node, which is called the "data sink". The last two
subsystems combined are called a "mote".
Node software
Unlike communication over a guided medium in wired networks, communication in wireless
networks is achieved in the form of radio signals through the air. This common transmission
medium must therefore be shared by all sensor network nodes in a fair manner. To achieve this
goal, a medium access control protocol must be utilized. The main functions of the MAC layer are
framing, medium access, reliability and flow/error control. Energy waste occurs due to collisions,
control packet overhead, idle listening and overhearing. It is the MAC's responsibility to minimize
these, as in Fig.1.4.
Figure 1.4: IoT layers stack.
1.1.3 Key challenges and Research trends
The three main technological challenges faced by IoT are listed below. It is important to know
that there are other business and social challenges opposing this rising revolution, but
here we focus on the technological side:
Security
IoT has already turned into a security issue that has drawn the attention of big technology firms
and governments across the world. Imagine how much disruption could be caused in a hospital by
someone who gains control over its cameras, fridge thermostats, baby monitors and drug infusion
pumps. Anyone with malicious intent will find a good number of security loopholes in current
systems, and the possible attack vectors are endless. The more IoT gets integrated into our lives,
the more is at risk beyond personal information: our very health and life can become a target of
IoT attacks [6]. With the industry rushing to increase profit without looking into these concerns,
security will remain a last priority, and the problem will only grow until people realize the
massive losses this will cause in the near future.
Connectivity and standards compatibility
The problem here lies in the architecture of current communication systems, which are built in
a centralized server/client fashion to authorize and connect nodes in the network. Such a system
will be a major bottleneck in many future IoT network implementations, as we are trying to
connect billions of devices together. Peer-to-peer communication, together with protocols for
exchanging information and authenticating without the need for a master node, creates networks
without a single choke point, which in turn reduces the chances of congestion and failure [7].
Many wireless standards are being developed to lower the complexity for such energy-constrained
systems. A summary of the most popular standards competing for IoT adoption is
shown in Table.1.2. Zigbee remains the most popular among IoT applications so far. Zigbee is
a low-power, low-data-rate, close-proximity ad hoc wireless network supporting a mesh network
topology. It is especially suited to devices located in a small area: it only works over distances
of 10 to 100 meters, providing data rates of 250 kbps, 40 kbps and 20 kbps. Lower data rates and
proximity allow batteries to last for years, or even allow battery-less operation when harvesting
is utilized. In 2017, the Zigbee Alliance launched its anticipated IoT basic language, "Dotdot",
which makes it possible for smart objects to work together on any network. Some of these
technologies will become obsolete within a few years, so striving for standard unification is a
pressing need. It is still difficult to achieve a "one serves all" standard without compromising
specs that some applications need. It seems that 2-3 standards will dominate the market while the
others slowly fade out, reaching a good compromise.
Table 1.2: IoT wireless standards comparison
Providing power
As an example, the global smart home market is forecast to reach a value of almost
$138 billion by 2023, with many companies providing solutions in this domain. One of the
challenges is that most homes are already built without the infrastructure required for automation.
Thus, it is important to create plug-and-play devices that are self-powered and use custom Low
Power (LP) wireless standards to facilitate installation. A battery-free device is not a luxury
feature either, as tens of these devices will be installed in one home and changing the batteries
for all of them will decrease comfort and convenience for end users. Energy Harvesting (EH) is a
hot topic and a rising trend for powering the sensor nodes scattered across various applications,
especially when changing the battery or connecting wires is very costly or impossible (agriculture,
inside machines and tunnels). Harvesters in these small form factors provide only tens of
µW, which is not sufficient to power these systems continuously. Fortunately, these sensor nodes
operate in a duty-cycled fashion where they consume mW only in active mode, which is less than 1%
of the total time as shown in Fig. 1.5. This results in an average power in the µW range, which the
harvester can supply if given enough time to collect the energy to be used in burst mode, that is,
accumulating a large amount of energy by harvesting little power over a long period of time, as
explained in detail in [8]. In [9], an energy harvesting circuit for a piezoelectric pushbutton is
implemented in a wireless radio frequency (RF) transmitter. This self-powered wireless transmitter
is capable of transmitting a 12-bit digital word using the mechanical energy
harvested from pressing the pushbutton. This fully discrete design with off-the-shelf components is
shown in Fig.1.6.
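The duty-cycling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch only: the function names and all numeric values (2 mW active power, 1 µW sleep power, 1% duty cycle, 30 µW harvester) are hypothetical and are not measurements from this work.

```python
def average_power(p_active_w, p_sleep_w, duty_cycle):
    """Time-averaged power of a node that is active for a fraction duty_cycle of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * p_active_w + (1.0 - duty_cycle) * p_sleep_w


def is_self_sustained(p_harvest_w, p_active_w, p_sleep_w, duty_cycle):
    """True if the harvester covers the average consumption (storage losses ignored)."""
    return p_harvest_w >= average_power(p_active_w, p_sleep_w, duty_cycle)


# Hypothetical operating point, chosen only for illustration.
P_ACTIVE_W = 2e-3    # active-mode consumption
P_SLEEP_W = 1e-6     # sleep-mode consumption
DUTY_CYCLE = 0.01    # active 1% of the time
P_HARVEST_W = 30e-6  # harvested power

if __name__ == "__main__":
    p_avg = average_power(P_ACTIVE_W, P_SLEEP_W, DUTY_CYCLE)
    print(f"average power: {p_avg * 1e6:.2f} uW")
    print("self-sustained:", is_self_sustained(P_HARVEST_W, P_ACTIVE_W, P_SLEEP_W, DUTY_CYCLE))
```

Under these assumed numbers the average consumption is about 21 µW, so a 30 µW harvester would suffice, while raising the duty cycle to 5% would push the average to roughly 100 µW and break the budget.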
Research from the hardware (HW) perspective was initially focused on creating power-efficient
circuits with LP design techniques and new topologies. Recently, researchers started shifting
toward new architectures for systems that had been implemented in the same way
for years, trying to get past the bottleneck of saturating improvements. A
similar shift is happening in software development too. The third phase/trend of research is the
cross-layer optimization phase, where people working on different parts of the solution (which are
usually handled by different teams or companies) team up to co-optimize network layers with
physical layers, or application layers with network layers, to achieve the best possible overall
efficiency, as most of the energy is wasted between layers. Chapter 2 is dedicated to covering the
research done in these three phases and highlighting important state-of-the-art work in each.
Figure 1.5: Power consumption patterns in different duty cycled applications
Figure 1.6: Discrete components implementation for Self powered IoT Node. ©[2006] IEEE
1.2 Thesis Framework
1.2.1 Research Scope
This dissertation focuses on enabling technologies for self-powered IoT devices, targeting
sensing and monitoring tasks with immediate transmission of data. To this end, three main
categories of systems/circuits are implemented in this thesis, with LP design being the target.
The first is the harvesting interface, whose main function is to create matching conditions between
the transducer (e.g. solar panels, thermoelectric generators or piezoelectric material) and the
power management circuit in order to extract the maximum power available. It also needs to provide
power combining in the case of multiple inputs while keeping them isolated from each other.
The second type of circuit is the power management circuit, where collected energy is
managed to create a stable supply for the load, and excess energy is used to charge auxiliary
storage batteries. Supply ripple and noise suppression are important, as the load is usually a
sensitive circuit (10-14 bit ADCs, Low Noise Amplifiers (LNA) for receivers, etc.). Lastly, the
third category is wireless transmitters: the data collected by the sensor node, or data received
from a previous node, needs to be relayed forward in the network to propagate to the network sink
point for post-processing. These transmitters consume the highest power in the system and also run
for the shortest time. In most cases, the overhead from the circuits before the power
amplifier can dominate the power consumption in low-output-power, short-range communication, so
new architectures are needed to substantially reduce the power consumption.
As previously mentioned, even though [9] is a fully discrete design with off-the-shelf
components, it is still able to perform simple data transmission. If all these functions are
integrated in a single chip, we expect improved efficiency and performance and a vast cost
reduction. Creating such integrated systems is the main motivation of this work.
1.2.2 Thesis Organization
This chapter gave an overview of the IoT platform, its expected growth and application
spectrum, with insight into the challenges and fears surrounding such advancement. Lastly, it
defined the research scope of this work and the order in which it will be presented.
Chapter 2 summarizes the strategies for achieving low power operation, from the block level
to the architectural and system levels and ending with cross-layer optimization, followed by a
summary of the LP features required in any IoT product to achieve reliable self-sustained
operation, both on the transmission and the power management side. Finally, it proposes a complete
IoT hardware solution, highlighting the main specs of each sub-system.
Chapter 3 introduces an end-to-end PMU. This work is the first in its class to provide
simultaneous impedance matching for 4 harvesters with a single oscillator, simultaneously combining
their power and efficiently boosting the voltage with 1.5x higher charging efficiency than a
conventional SC-voltage CP. The stored energy is used to generate a regulated supply for sensitive
loads, also providing pulsed battery charging capabilities in a fully integrated design. The energy
flow detection circuit scales the system power consumption down by 9x to 1.55 µW when no input
energy is available, while maintaining supply regulation and the ability to return to normal
harvesting operation in < 0.2 ms. It achieved a maximum power delivery of 2.6 mW and a maximum
end-to-end efficiency of 70% at 40 µW. With a quiescent current of 950 nA and an area of 0.46 mm²,
this chip enables integration with a sensor interface and wireless transmitter, providing a
reliable, self-powered IoT node in a single package with 2-3 external capacitors.
Chapter 4 explains the details of an ultra-low power, fast start-up, PLL-less and energy-efficient
transmitter for IoT applications in the 900 MHz ISM band with data rates of up to 3 Mbps.
A current-reusing vertical delay cell is introduced, analyzed, and used to build a vertical
ring-oscillator (VRO) with ultra-low power consumption. By employing wideband BFSK modulation, it is
possible to use the vertical ring-oscillator as the Tx local oscillator generator. A
digitally-assisted frequency correction and calibration scheme is implemented to compensate for the
frequency variations that the oscillator might experience, reaching a carrier frequency accuracy of
±3fREF (±98 kHz). The concept was tested with an FPGA that required division by 8 to provide the
VRO signal on board. The power amplifier is based on an edge-combiner, which effectively multiplies
the oscillator frequency by a factor of 3, relaxing the maximum frequency required from the
ring-oscillator. The Tx was fabricated in 0.18 µm CMOS technology and occupies an active area of
0.112 mm². Measurement results showed an RF carrier tunability of 0.15 GHz to 1.05 GHz. At 900 MHz,
−10 dBm output power, and 3 Mbps data rate, the normalized energy efficiency is 3.1 nJ/(bit·mW).
The proposed architecture achieves superior efficiency not only in transmission mode but also over
the overall operation cycle, due to minimized start-up/calibration time and energy.
Finally, Chapter 5 lays out this dissertation's contributions to the state of the art
and the areas for improvement found along the way. It then draws conclusions
to give a clear vision of future directions to follow.
2. LITERATURE REVIEW AND PROPOSED SYSTEM LEVEL
2.1 Transmitters Low Power Strategies Review
The objective of this chapter is to survey current market and research efforts for optimizing power
consumption, building up the case that an architectural makeover of current analog systems is
needed to reach the power levels required for sustainable self-powered nodes.
It then takes this a step further with cross-layer optimization, where not only a single circuit or
group of circuits is co-optimized, but multiple layers are designed as one large system with the
target of achieving maximum efficiency.
2.1.1 Block Level Optimization
Stacking and current reuse
Block stacking has been a trend for more than 10 years: the current in
a branch is reused by passing it through another block stacked below on the same supply. In [10] a
stacked VCO was implemented over multiple RF blocks, reusing the current needed for the VCO gm
requirements to pass through the Low Noise Amplifier (LNA), Intermediate Frequency (IF) amplifier
and PA. This maximizes the use of the 3 V lithium battery supply to provide headroom for the
stacked blocks. In this case there is no need for a high-efficiency buck converter to supply
multiple low-voltage blocks; instead the main battery is used directly, achieving a power
consumption of 1.3 mW for -6 dBm output power and -94 dBm Rx sensitivity. The stacked VCO and PA
are shown in Fig.2.1(a).
Figure 2.1: LP design using current stacking. ©[2012] IEEE
The same technique was used recently in [11], while mitigating a drawback of this
technique, namely VCO pulling, which occurs if both the VCO and PA operate at the same frequency
without proper isolation in between. This usually happens through load pulling, coupling through
the substrate/supply and magnetic coupling. Isolation is implemented here by using separate current
mirrors for each block, very strong AC grounding of the common point and, most importantly, a
resistive buffer on the same stack that prevents the PA internal matching network from affecting
the VCO tank and causing inter-symbol interference for FSK modulation. In [12], stacking was used
inside the PLL for the VCO and quadrature divider, with the VCO running at double the frequency to
avoid pulling effects. Fig.2.1(b),(c) shows circuits for the two stacking techniques.
On the Rx side, a new method of stacking is to directly convert the input RF signal into current
and stack the whole receiver chain on top of the transconductance stages, eliminating the need for
coupling capacitors between blocks and hence saving considerable area. This technique also benefits
greatly from technology node advancement, which offers higher Ft and lower Vth, enabling more
stacking headroom with slightly elevated supplies. In [13] the RF input is required to be
differential, so a balun was installed on board to drive the transconductance stage, followed by I
and Q paths stacked on top of it providing down-mixing and amplification. The design in [14] stacks
a current-based hybrid filter and also eliminates the external balun by generating the differential
current signal from a single-ended RF input using a special transconductance stage. It is worth
noting that the receiver in this project uses the same method.
Reducing active blocks
In many short-range communication networks, the sensitivity requirement is in the -80 dBm
range due to the dense deployment of such networks, with nodes in close proximity to each other. If
we extract the required Noise Figure (NF) for such a case, it will be 16 dB according to Eqn.2.1,
where SNR is the demodulator SNR, assumed to be 8 dB, with a noise floor of -173.8 dBm/Hz, a
sensitivity of -90 dBm and a signal bandwidth of 500 kHz.
P_{sens} = -173.8\,\mathrm{dBm/Hz} + 10\log_{10}(BW) + NF_{max,dB} + SNR_{dB} \qquad (2.1)
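Eqn.2.1 can be rearranged to give the maximum tolerable noise figure for a target sensitivity. The sketch below is an illustration only; the function name is hypothetical, and the result depends strongly on the bandwidth assumed: for a -90 dBm sensitivity and 8 dB SNR it evaluates to about 18.8 dB at 500 kHz and about 15.8 dB at 1 MHz.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -173.8  # noise floor used in Eqn.2.1


def max_noise_figure_db(sensitivity_dbm, bandwidth_hz, snr_db):
    """Rearranged Eqn.2.1: NF_max = P_sens - noise floor - 10*log10(BW) - SNR."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth_hz must be positive")
    return (sensitivity_dbm
            - THERMAL_NOISE_DBM_PER_HZ
            - 10.0 * math.log10(bandwidth_hz)
            - snr_db)


if __name__ == "__main__":
    for bw_hz in (500e3, 1e6):
        nf = max_noise_figure_db(sensitivity_dbm=-90.0, bandwidth_hz=bw_hz, snr_db=8.0)
        print(f"BW = {bw_hz / 1e3:.0f} kHz -> NF_max = {nf:.1f} dB")
```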
With this spec the LNA can be removed while still achieving the required sensitivity without
degrading selectivity, because channel selection is performed in the complex baseband after a
quadrature down-conversion stage. Therefore, two ingredients are indispensable in the receiver: a
quadrature down-conversion stage and quadrature local oscillator (LO) signals. Many efforts have
been made on mixer-first Rx chains, as in [15] and [16], both implementing a passive mixer at the
beginning of the chain, so that the first gain seen by the signal is at the baseband amplifier
after the mixers. They achieve an NF of 11 to 17 dB, which is sufficient, saving the 1-2 mW spent
on a carefully designed LNA in conventional transceivers.
During transmission, active blocks are reduced by opening the PLL loop and modulating the
free-running VCO directly through varactors or capacitor bank switching, which is feasible because
the maximum packet length in BLE is 376 µs. This is impossible with standards having large
payloads, such as Wi-Fi and standard Bluetooth (in milliseconds), as the VCO frequency drifts due
to loop control voltage leakage, pushing the VCO out of channel. This technique is used in many
industry chips and also in [16]. Fig.2.2(a),(b) shows block diagrams for both Rx/Tx techniques.
Figure 2.2: LP design for TxRx architectures.
Reducing calibration/startup times
The last technique is to speed up the start-up/power-down time of blocks. As previously
mentioned, in small-payload applications the active power is dominated by the calibration time,
which is usually set by the start-up and stabilization of the slowest block in the system. The
slowest two operations are usually crystal oscillator start-up and PLL calibration
and locking. For the latter, not much can be done without altering the whole PLL transfer
function via adaptive bandwidth control [17] or other complex techniques such as dynamic
phase error compensation via an auxiliary charge pump [18]. Without enhancements, PLL settling
and locking take 100−200 µs, leaving the crystal oscillator as the bottleneck for system
start-up. Crystals in the MHz range take up to a few milliseconds to start up, while those in
the kHz range take up to a few seconds. Waiting all this time to send a 100 µs sensor data packet
is a complete waste of energy. Two main factors govern crystal start-up behavior: the negative
resistance seen by the crystal and the initial noise seed. The oscillation voltage during
start-up is given in Eqn.2.2, where R is the negative resistance, L and C are the equivalent crystal
inductance and capacitance and K represents the initial noise seed. This equation is derived from
the Laplace transform of a simple RLC network, as explained in Appendix A.
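V(t) = K\,e^{-\frac{R}{2L}t}\left[c_1\cos(\omega t) + c_2\sin(\omega t)\right], \quad \omega = \sqrt{\frac{1}{LC} - \left(\frac{R}{2L}\right)^2} \qquad (2.2)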
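The roles of the negative resistance and the noise seed can be illustrated numerically. The sketch below assumes the standard series-RLC result in which the start-up envelope grows as K·exp(|R|t/2L) for a net negative resistance R, so the time to reach a target amplitude is (2L/|R|)·ln(V_target/K). The function name and the component values (5 mH motional inductance, -100 Ω net resistance, 1 µV seed) are hypothetical and are not taken from this work.

```python
import math


def startup_time_s(l_motional_h, r_net_ohm, v_seed, v_target):
    """Time for an envelope v_seed * exp(|R| t / (2 L)) to reach v_target.

    r_net_ohm is the net (negative) series resistance seen by the crystal.
    """
    if r_net_ohm >= 0:
        raise ValueError("oscillation only grows for a net negative resistance")
    if not 0 < v_seed < v_target:
        raise ValueError("require 0 < v_seed < v_target")
    tau = 2.0 * l_motional_h / abs(r_net_ohm)  # envelope time constant
    return tau * math.log(v_target / v_seed)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    base = startup_time_s(5e-3, -100.0, 1e-6, 1.0)
    boosted_r = startup_time_s(5e-3, -200.0, 1e-6, 1.0)  # stronger negative resistance
    bigger_seed = startup_time_s(5e-3, -100.0, 1e-3, 1.0)  # larger injected seed
    print(f"baseline:        {base * 1e3:.2f} ms")
    print(f"2x |R|:          {boosted_r * 1e3:.2f} ms")
    print(f"1000x seed:      {bigger_seed * 1e3:.2f} ms")
```

Under this model, doubling the negative resistance halves the start-up time, while a larger seed only helps logarithmically, which is consistent with the two enhancement routes discussed in the text.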
Figure 2.3: Crystal oscillator startup enhancement circuits. ©[2014] IEEE
Conventionally, negative resistance boosting was done with a current control loop that senses the
oscillator output and feeds back a current to be mirrored into the oscillator core, as done in
[19]. This shortens the crystal start-up time to 450 µs for 26 MHz crystals, but at the expense of
power, as the core consumes current on the order of mA during this time until the amplitude grows
and the control loop starts limiting the current. As for noise seed injection, a technique used in
[20] and later in [12] calibrates an auxiliary ring oscillator (RO) close to the natural crystal
frequency, stores the calibration word and reuses it at every start-up, connecting
the RO for a few clock cycles to inject noise into the extremely small BW of the crystal
(the effective Q of the crystal is around 10,000). Start-up then takes much less time, around
50 µs for a 26 MHz crystal. This solution requires pre-calibration before the chip's first start-up
and, more importantly, temperature-compensated biasing for the RO to maintain its accuracy, as a
frequency shift of 1 kHz or less is sufficient to block the injected noise from reaching the
crystal, which makes the implementation impractical. Recently, a compromise between start-up time
and solution complexity was presented in [21], where pre-calibration is no longer needed and is
replaced by a chirp oscillator that scans all frequencies from above the crystal frequency down to
zero, injecting noise when it crosses the crystal frequency. Negative resistance boosting was
implemented by adding multiple stages in parallel that are disconnected after start-up to save
current, achieving a start-up time of around 150 µs. Fig.2.3(a),(b) shows the system block diagram
and the chirp oscillator circuit, and (c) shows the start-up behavior with the applied technique.
2.1.2 Architecture Level Optimization
PLL and Clock generation
This is a new trend that targets replacing the PLL with low power alternatives. Multiplying Delay
Locked Loops (MDLLs) are becoming popular in wireless applications in an attempt to replace
the large PLLs based on LC-tank oscillators, although MDLLs have always been much worse in terms
of phase noise. This paves the way for inductor-less radio chips, removing all magnetic coupling
problems and the layout headaches of isolating inductors from their surroundings. An MDLL
resembles a ring oscillator in which the signal edge travelling along the delay line is
periodically refreshed by a clean edge of the reference clock, as shown in Fig.2.4. In this manner,
the phase noise of the ring oscillator is filtered up to half the reference frequency and the total
output jitter is reduced significantly. MDLLs are in theory limited to integer operation only. In
2014, an attempt to reach phase noise close to that of conventional PLLs was successful [22]: the
prototype synthesizes frequencies between 1.6 and 1.9 GHz with 190 Hz resolution, fractional
operation is introduced by a 1-bit Time-to-Digital Converter (TDC), and it achieves an RMS
integrated jitter of 1.4 ps at 3 mW power consumption. With some power optimization this will be a
true replacement for PLLs in low power applications.
Figure 2.4: MDLL circuit. ©[2014] IEEE
Breaking the mW barrier for power consumption required complete elimination of the
PLL. On the Tx side, [23] uses a crystal reference for injection locking of a ring oscillator
(ILRO) to replace the integer PLL, with direct modulation by altering the load capacitance seen by
the crystal, in addition to a PA that combines phases from the RO to generate a carrier frequency 9
times the LO frequency. Fig.2.5(a) shows the system-level block diagram, and (b),(c) show the edge
combiner waveforms and circuit. This method achieved a total power of 90 µW, with one obvious
disadvantage: the design can only work on a single channel, as the reference frequency cannot be
changed, limiting its usage to single-channel applications like Wireless Body Area Networks (WBAN).
On the Rx side, [24] used the ILRO to directly convert the RF input to CMOS levels and then
demodulate it using a PLL-based demodulator, achieving a sensitivity of -63 dBm with 38 µW power
consumption. Chapter 4 builds on this direction, creating a low power Tx by replacing the PLL with
a specially designed digital calibration loop for fast start-up and replacing the high-frequency
crystal with a low-frequency LP alternative.
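The frequency multiplication obtained by combining ring-oscillator phases can be illustrated with idealized square waves. The sketch below is not the combiner circuit of [23] or of Chapter 4: it assumes n evenly spaced 50% duty-cycle phases (n odd, as in a ring oscillator) and uses an XOR (parity) of the phases as a stand-in for the combiner logic; all function names are hypothetical.

```python
def phase_level(t, period, k, n):
    """Logic level (0 or 1) of tap k of an idealized n-phase ring oscillator at time t."""
    shifted = (t - k * period / n) % period
    return 1 if shifted < period / 2 else 0


def edge_combine(t, period, n):
    """XOR (parity) of all n phases; a stand-in for the edge-combiner logic."""
    out = 0
    for k in range(n):
        out ^= phase_level(t, period, k, n)
    return out


def rising_edges_per_period(n, steps=3600):
    """Count rising edges of the combined output over one oscillator period."""
    period = 1.0
    # Sample at bin centres so that no sample falls exactly on a phase edge.
    samples = [edge_combine((i + 0.5) * period / steps, period, n)
               for i in range(steps)]
    # samples[i - 1] wraps around at i = 0 because the waveform is periodic.
    return sum(1 for i in range(steps)
               if samples[i - 1] == 0 and samples[i] == 1)


if __name__ == "__main__":
    for n in (3, 9):
        print(n, "phases ->", rising_edges_per_period(n), "output cycles per LO period")
```

With 3 phases the combined output completes 3 cycles per oscillator period, and with 9 phases it completes 9, which is the multiplication factor exploited to keep the ring oscillator itself at a low frequency.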
Figure 2.5: PLL-less transmitter architecture. ©[2011] IEEE
2.1.3 Cross Layer Optimization
All previously mentioned techniques optimize power consumption in the physical layer of
the system. Taking a step back to node-to-node communication, the efficiency of such networks
depends on the rate at which successful transmission and data propagation occur, versus the need
for retransmission due to loss of Tx/Rx synchronization. Thus, achieving efficient wireless
communication between nodes is not solely a function of reducing the power consumption of the
circuits involved. Network synchronization and node start-up time have proven to be equally
important [25]. Network synchronization leads to extra power consumption, especially in the
cases of small payloads and heavily duty-cycled systems. However, some sort of synchronization
is required, as there is no guarantee that the receiver (RX) is listening at the same time the
transmitter (TX) is sending data. The simplest method is "preamble synchronization," which
requires the transmitter to send a certain repetitive pattern to cue the receiver to start
receiving data, as shown in Fig.2.6. A major drawback clearly shows up for duty-cycled systems:
sending a single data packet of a few bytes (i.e. < 1 ms at moderate data rates) to a receiver that
wakes up once every minute to sense the channel requires the transmitter to send a preamble longer
than a minute to ensure synchronization. Note that saving power at the receiver side by further
duty-cycling impacts the transmitter consumption, creating an unequal link budget and increasing
the synchronization power overhead.
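The size of this overhead follows directly from the numbers in the example above. The sketch below is a simplified model that assumes the preamble must span one full receiver wake-up interval; the function names and the 1 mW transmit power are hypothetical.

```python
def preamble_overhead(rx_wake_interval_s, payload_s):
    """Return (minimum preamble airtime, preamble-to-payload airtime ratio).

    Simplifying assumption: the preamble must last at least one full
    receiver wake-up interval to guarantee that the receiver hears it.
    """
    if rx_wake_interval_s <= 0 or payload_s <= 0:
        raise ValueError("durations must be positive")
    preamble_s = rx_wake_interval_s
    return preamble_s, preamble_s / payload_s


def tx_energy_per_packet_j(p_tx_w, rx_wake_interval_s, payload_s):
    """Transmit energy for one packet, preamble included."""
    preamble_s, _ = preamble_overhead(rx_wake_interval_s, payload_s)
    return p_tx_w * (preamble_s + payload_s)


if __name__ == "__main__":
    # Receiver wakes once a minute, payload lasts 1 ms (values from the text).
    preamble_s, ratio = preamble_overhead(60.0, 1e-3)
    print(f"preamble: {preamble_s:.0f} s, {ratio:.0f}x the payload airtime")
    # Hypothetical 1 mW transmitter power.
    print(f"energy per packet: {tx_energy_per_packet_j(1e-3, 60.0, 1e-3) * 1e3:.3f} mJ")
```

In this model the transmitter spends about 60,000 times longer on the preamble than on the payload, so almost all of the packet energy goes into synchronization.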
Figure 2.6: Preamble synchronization between 2 nodes in a network.
To solve the problems created by preamble synchronization, wake-up radio functionality has
been proposed for modern wireless networks. The implementation can be as simple as a wake-on
RF power topology, as shown in [26]. Though very efficient at the circuit level, in most practical
cases (congested networks or co-existence of multiple standards) it still cannot achieve high
network efficiency, due to the occurrence of false wake-ups. To avoid this behavior, [12] designed
a circuit allowing a single receiver to wake up when a unique on-off keying (OOK) pattern is
recognized, as shown in Fig.2.7. This circuit consumes hundreds of nA and completely eliminates the
need to periodically enable the full receiver chain for preamble sensing.
Figure 2.7: ID-passive wake up. ©[2012] IEEE
A more extensive survey of wake-up circuits and their various categories is presented in [27].
It is important to observe that the majority of these wake-up techniques cannot be integrated as
part of conventional wireless standards (Bluetooth LE, Wi-Fi). However, recent work shows that
enhancing these standards is a possibility through a custom media access control (MAC) layer
[28].
2.1.4 Low Power Features Summary
Certain features are required in sensor nodes to overcome the limitations in deployment,
communication and energy availability. We focus here on the radio features needed to
overcome such challenges.
TxRx features
• Low sleep current
• Minimum processor activity
• Minimum duty cycle
• Wake on radio capabilities
• Fast frequency settling in PLL calibration
• Fast crystal oscillator start-up time
• Adaptive output power via RSSI
• Minimum external components
• Adaptation to harsh environmental conditions (temperature, humidity and pressure)
• Large voltage range operation (use switching regulators for efficiency)
• Compatible with IEEE 802.15.4 standard/ZigBee communication protocol
• Low peak current for energy harvester compatibility
• Low power Carrier sense ability for carrier sense multiple access with collision avoidance
(CSMA-CA) protocol dependencies
2.2 Energy Harvesting and Power management units
The large ratio between the stand-by power of IoT applications (100s of nW to µW) and their active
power consumption (10s of mW) creates an opportunity to supply them from harvested power.
This is enabled by aggressive duty cycling (less than 0.1% active-to-sleep time ratio), which
results in an average power consumption that can be sustained by harvesting as long as an energy
storage element exists to hold the voltage during peak consumption periods. A practical application
would be a mesh network of transmitters, where battery status monitoring and replacement can be a
huge burden or even impossible in some applications. These nodes usually transmit every few minutes
or even hours, creating an ideal scenario where little energy can be extracted from the environment
to recharge the energy storage.
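The role of the storage element can be made concrete with a first-order sizing estimate. The sketch below assumes an ideal, lossless capacitor that alone supplies the active burst (harvester contribution during the burst is ignored); the function names and all values (10 mW for 1 ms, a 1.8 V to 1.6 V droop, 20 µW harvested, 1 µW sleep power) are hypothetical.

```python
def min_storage_capacitance_f(p_active_w, t_active_s, v_start, v_min):
    """Smallest ideal capacitor that supplies one burst while drooping from v_start to v_min.

    Uses E = C * (v_start^2 - v_min^2) / 2.
    """
    if not 0 <= v_min < v_start:
        raise ValueError("require 0 <= v_min < v_start")
    energy_j = p_active_w * t_active_s
    return 2.0 * energy_j / (v_start ** 2 - v_min ** 2)


def recharge_time_s(p_harvest_w, p_sleep_w, p_active_w, t_active_s):
    """Time needed in sleep mode to harvest back the energy of one burst."""
    surplus_w = p_harvest_w - p_sleep_w
    if surplus_w <= 0:
        raise ValueError("harvested power must exceed sleep power")
    return p_active_w * t_active_s / surplus_w


if __name__ == "__main__":
    # Hypothetical burst and supply window, for illustration only.
    c = min_storage_capacitance_f(10e-3, 1e-3, 1.8, 1.6)
    t = recharge_time_s(20e-6, 1e-6, 10e-3, 1e-3)
    print(f"storage capacitor: {c * 1e6:.1f} uF")
    print(f"recharge time:     {t:.2f} s")
```

With these assumed numbers, roughly 29 µF holds the supply through the burst and about half a second of harvesting restores the charge, which is well within a transmit interval of minutes.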
A literature survey of the current state of the art is introduced in this section, starting from
single-input systems interfacing with the two main categories of energy harvesters (DC and AC), introducing




Solar cells and thermoelectric generators (TEGs) are considered DC sources and can be modeled by a
current source and a voltage source, respectively, as shown in Fig.2.8. The matching condition is
different for each and is approximated by the fractional open-circuit voltage method, at 0.5-0.8 of
the open-circuit voltage. DC-DC converters can manipulate their input impedance by changing
frequency to create the matching conditions critical to extracting maximum power from these
harvesters. Another important system requirement is passive start-up, using the harvester itself
rather than relying on pre-stored energy to supply the circuit. Much research has focused on
optimizing power harvesting from a single source such as light [29] or heat [30].
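For the TEG case, the 0.5 fraction follows from the voltage-source model: a source with open-circuit voltage Voc and internal resistance R delivers P = V(Voc − V)/R when the converter holds its input at V, which peaks at V = Voc/2. The sketch below verifies this by a brute-force sweep; the function names and the values (0.2 V, 5 Ω) are hypothetical, and the solar case (current-source model, fraction closer to 0.8) is not modeled here.

```python
def teg_output_power_w(v_oc, r_int_ohm, v_in):
    """Power drawn from a Thevenin source when the converter input is held at v_in."""
    return v_in * (v_oc - v_in) / r_int_ohm


def best_fraction(v_oc, r_int_ohm, steps=1000):
    """Sweep v_in from 0 to v_oc and return the fraction of v_oc giving maximum power."""
    best_k = max(range(steps + 1),
                 key=lambda k: teg_output_power_w(v_oc, r_int_ohm, v_oc * k / steps))
    return best_k / steps


if __name__ == "__main__":
    V_OC = 0.2   # hypothetical open-circuit voltage, volts
    R_INT = 5.0  # hypothetical internal resistance, ohms
    k = best_fraction(V_OC, R_INT)
    p_max = teg_output_power_w(V_OC, R_INT, k * V_OC)
    print(f"optimum input voltage fraction: {k:.2f}")
    print(f"maximum power: {p_max * 1e3:.2f} mW")
```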
Figure 2.8: Harvester models: TEG (left), solar (right).
Motion-based and RF harvesters generate an AC signal that needs to be rectified before
interfacing with DC-DC converters [31]. In [32], the circuit is designed to provide both self-start
and active harvesting from a magnet moving in a coil chamber, as shown in Fig.2.9. The equivalent
electrical model has V as the maximum open-circuit peak-to-peak voltage, with R and LMH being the
DC coil resistance and inductance, respectively. The intended application is wearable electronics,
where these harvesters can extend battery life, using the reconfigurable passive/active rectifier
in Fig.2.10.
Figure 2.9: Harvesting from motion through coils and moving magnets, equivalent electrical
model.
The rectifier topology presented in this work is a reconfigurable rectifier that
converts the AC voltage from the EM transducer to DC and reconfigures itself depending on the
power available in the system. Transistors M1–M4 are reused in the main power path of the proposed
circuit for both the passive and active rectification configurations. When the energy stored in the
system is not sufficient to power any external control circuit for active rectification, a passive
rectification mode is employed. The transmission gates TG1–TG4 have no gate control and the PMOS
inside each is conducting, thus passive rectification is achieved.
Figure 2.10: Negative voltage converter (NVC) in passive (left) and active (right) modes.
This causes the rectifier to act as a standard passive negative voltage converter (NVC), where
the difference between the EM transducer voltages at VInp and VInn enables either M1 and M4 or
M2 and M3 for positive and negative AC input voltage swings, respectively. The rectified signal
is VREG. Once the voltage on the storage capacitor reaches a minimum of 1.8 V, the active
configuration of the rectifier is enabled. Transmission gates TG1–TG4 are disabled, disconnecting
the original NVC topology and in turn allowing M1–M4 to be controlled by C1–C4. Low input voltage
swings, common in low-pace movements such as walking, can be rectified more efficiently in the
active rectification mode. Depending on the polarity of the EM transducer and the rectified
voltage, branch M1 and M4 or branch M2 and M3 is enabled through two nanopower comparators.
Transistors M1–M4 can also be turned off to fully isolate the transducer from the EH system power
path and prevent back leakage, which is an important feature of the active rectification
topology. The NVC is connected to the system as shown in Fig.2.11, where the active rectification
buffers are driven by comparators that decide which side of the NVC to switch. The output of the
rectifier is further boosted and regulated to supply the system and load.
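The mode selection described above can be summarized with a small behavioral sketch. It is not the circuit itself: the function name, the idealized (threshold-free) conduction in passive mode, and the rule that the active branch conducts only when the input magnitude exceeds VREG are assumptions made for illustration. Only the 1.8 V threshold and the pairing of M1/M4 and M2/M3 with positive and negative swings come from the text.

```python
from dataclasses import dataclass

V_ACTIVE_THRESHOLD = 1.8  # V on the storage capacitor that enables active mode (from the text)


@dataclass
class RectifierState:
    mode: str          # "passive" or "active"
    conducting: tuple  # transistor pair in the power path, () when isolated


def rectifier_state(v_inp, v_inn, v_store, v_reg):
    """Behavioral model of the reconfigurable NVC (hypothetical helper)."""
    # Positive swing -> M1 and M4, negative swing -> M2 and M3.
    pair = ("M1", "M4") if v_inp > v_inn else ("M2", "M3")
    if v_store < V_ACTIVE_THRESHOLD:
        # Passive mode: TG1-TG4 conduct and the cross-coupled NVC steers the input.
        # Device threshold voltages are ignored in this idealized model.
        return RectifierState("passive", pair)
    # Active mode (assumed rule): the comparators enable a branch only when the
    # input magnitude exceeds the rectified voltage; otherwise M1-M4 stay off
    # so that no charge leaks back into the transducer.
    if abs(v_inp - v_inn) > v_reg:
        return RectifierState("active", pair)
    return RectifierState("active", ())
```

For example, `rectifier_state(0.2, 0.0, 2.0, 0.3)` returns an active state with an empty conducting pair, which corresponds to the isolation feature mentioned above.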
Figure 2.11: System level for magnetic motion harvesting PMU.
2.2.2 Multiple Harvesters
Multiple-input harvesters have recently become a research focus because they sustain operation
even when some energy sources are unavailable, creating a reliable approach to self-powered
electronics. Multiple harvesters can be combined in series or in parallel. Series combining,
shown in Fig.2.12(a), provides automatic voltage boosting, but its efficiency drops as the number
of inputs grows, and source ranking/detection is required to maintain high efficiency. In [2],
bypass switches were added to disable any input whose voltage is too low to drive the doubler
circuit. Parallel combiners, shown in Fig.2.12(b), on the other hand keep efficiency high for an
arbitrary number of inputs and need no ranking, since any input can be disconnected without
affecting the combiner's operation. However, simultaneous harvesting cannot be achieved, because
only a single harvester can be connected at a time in order to avoid shorting the harvesters
together and to achieve MPPT.
Figure 2.12: (a) Series voltage combiner [2] ©[2018] IEEE, (b) Parallel current combiner [3]
©[2012] IEEE.
In the literature, different system design methodologies [33] and approaches have been pro-
posed to combine power from multiple sources [34]. Limitations and trade-offs in recent work are
categorized below.
Truly simultaneous means that power is extracted from all sources without any interruption
while maintaining maximum power point tracking (MPPT). In previous works, a single core circuit
is shared between inputs; therefore, time slotting [35, 36] or priority assignment [37]
algorithms are required to generate the required impedance matching for each input while the
other harvesters are disconnected. A bigger input capacitance helps retain the collected energy
during the disconnect period, but it trades off with the speed at which an MPPT circuit acquires
the open circuit voltage (OCV). This was solved in [36] at the expense of using two input chip
pins for each harvester, severely limiting the integration capabilities of such power management
units (PMUs). In [2], three sources are directly connected to the same voltage doubler core,
creating simultaneous harvesting but inherently losing the ability to provide a unique MPPT for
each input, as each of the two cores runs at a single frequency. Thus, the input impedances of
all ports track each other and cannot be controlled independently.
Universal interface gives a circuit the ability to interface with harvesters under different MPPT
conditions (0.5VOC−0.8VOC), voltage levels, and impedance ranges. To the best of our knowledge,
cold start from any arbitrary input is a missing feature in previous work on multiple-input PMUs,
except when a power-ORing parallel connection with diodes is used [36, 38], which increases the
start-up voltage and limits the use of small harvesters in most applications. Another drawback of
current solutions is that the user must know the power level of each harvester and select the
proper port [36, 39], which is impractical in time-varying environments. Thus, a universal
circuit with a programmable interface is needed to serve all harvesters, allowing “plug and play”
installation across different IoT applications.
Power scaling increases output power by expanding the number of inputs without harming the
harvesting efficiency, which is not achieved in series voltage adder structures [2]. A
frequency-independent MPPT technique is required to avoid adding extra circuitry with each input,
allowing easy expansion either on-chip or off-chip.
Adaptive power control is needed because, as noted in [40], temporal variations in load and in
available harvesting/battery power must be part of the system design. With excess ambient energy,
the circuit can go from direct load supply mode to battery charge mode while the load is supplied
by an intermediate super-capacitor. The battery is used later to supply the load in moments of
energy drought. In [2], the core oscillator is supplied by the harvester directly so that the
power consumption scales with the input power. Reference [41] reports a system efficiency of up
to 50% with 0.3 to 2 µW input power. Thus, a power management unit (PMU) that can detect energy
availability and flow and adapt its power consumption accordingly is essential to avoid depleting
the main battery during long periods of energy drought.
2.2.3 Energy harvesting PMU Low Power Features Summary
Certain features are required in sensor nodes to overcome limitations in deployment,
communication and energy availability. We focus here on the power management and harvesting
features needed to overcome such challenges.
PMU/EH features
• Adaptive power consumption depending on chip state
• Detection and ranking of harvesting sources
• Ripple-free supply for sensitive loads
• Simultaneous harvesting in multiple-input systems
• Maximum power transfer through impedance matching over a wide range
• Self-start-up capability when no energy is stored: the system can kick-start or be supplied
from the harvester directly
• High efficiency at both high and low energy availability, and during long energy drought periods
• Capability to interface with harvesters of different nature (AC / DC / high voltage / high current)
2.3 Proposed System Block Diagram
The target of this thesis is to create a hardware platform with the features explained above.
This platform can transform objects into smart ones that become part of the IoT network. Fig.2.13
shows the top-level diagram of such a system on chip (SoC). Even though each part is fabricated
on a separate chip, the number of ports, total area and co-design are taken into consideration to
make full system integration feasible.
Figure 2.13: Proposed system block diagram.
The main functionality of each sub-system and its required specifications (specs) are explained
below.
2.3.1 Harvesting Interface
Table 2.1: Harvesting sources comparison
Each harvester has a different nature regarding its output impedance, voltage levels, output
waveform, and the ability to extract maximum power from it. Table 2.1 summarizes the properties
and power ranges of the energy sources. Based on the target WSN application, the specs are
summarized in Table 2.2.
Table 2.2: Harvester Interface target specifications
Parameter                       Target
Power range                     1 µW to 5 mW
Impedance range                 100 kΩ to 300 Ω
MPPT voltage accuracy           < 5%
Number of harvesters            > 2
Voltage range                   50 mV to 2 V
Power detection capabilities    Required
Active consumption/input        < 5 µW
Standby power/input             < 100 nW
Area/input                      < 0.15 mm²
Pins/input                      1
Input waveform                  AC, DC and RF
Efficiency                      > 70%
2.3.2 Power Management
After each harvester output is conditioned for MPP, the combined power flows into the power
management unit (PMU), where the voltage usually needs boosting to be adequate for supplying the
load or charging an external battery. The PMU needs to be able to self-start (passive start-up)
from a zero-energy state using the harvesters directly, and to adapt its power consumption to the
load demands. A ripple-free supply is needed only in active mode, when the main sensor ADC is
operational or the transceiver unit is sending data. In stand-by or sleep mode this supply
sensitivity spec can be relaxed, since voltage is needed only to retain data in flip-flops (FFs)
or to supply a low-frequency oscillator for system timers. The required design parameters are
summarized in Table 2.3.
Table 2.3: Power management unit target specifications
Parameter                       Target
Load current (sleep)            < 1 µA
Load current (stand-by)         < 1 mA
Load current (active)           1 mA to 10 mA
Quiescent current               < 1 µA
Battery charging                up to 4.2 V
Charging power                  > 100 µW
Voltage boosting ratio          up to 16x
Supply ripple                   < 10 mV
Load supply range               1 to 2 V
Battery voltage range           1.5 to 4.2 V




Figure 2.14: Tx/Rx Node power cycle and transmission overhead.
The data sensed by the sensor ADC is stored as a digital bit stream to be transmitted to the
network processing center, where control action is taken. Wireless transmission usually consumes
the highest power and lasts for the shortest time in comparison to other functions, as shown in
Fig.2.14. The data size is usually small (tens of bits) and the communication distance is less
than 10 m, allowing extra flexibility in the PLL and power amplifier (PA) design. The figure also
emphasizes the need to reduce calibration time and sleep leakage, as they easily dominate the
transmission energy when integrated over long times. Table 2.4 shows the target specs.
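The claim that sleep leakage can dominate the transmission energy is easy to check with a back-of-the-envelope budget. The sketch below uses the upper bounds from the target specifications (1 mW for a 1 ms packet, 1 µA sleep current); the 1.8 V supply and the 10 s interval between transmissions are illustrative assumptions, and the function name is hypothetical.

```python
def cycle_energy_uj(p_tx_mw, t_tx_ms, i_sleep_ua, v_supply, t_sleep_s):
    """Return (transmit energy, sleep energy) in microjoules for one duty cycle."""
    e_tx = p_tx_mw * t_tx_ms                     # mW * ms = uJ
    e_sleep = i_sleep_ua * v_supply * t_sleep_s  # uA * V * s = uJ
    return e_tx, e_sleep


# 1 mW for 1 ms against 1 uA at an assumed 1.8 V over an assumed 10 s sleep interval
e_tx, e_sleep = cycle_energy_uj(1.0, 1.0, 1.0, 1.8, 10.0)
print(f"Tx burst: {e_tx:.1f} uJ, sleep: {e_sleep:.1f} uJ")
```

Under these assumptions the sleep interval costs roughly eighteen times the energy of the transmit burst, which is why the sleep and calibration phases deserve as much design attention as the PA.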
Table 2.4: Transceiver target specifications
Parameter                               Target
Communication distance                  > 5 m
Output power range                      -20 dBm to -5 dBm
Power consumption                       < 1 mW
Rx sensitivity                          < -65 dBm
Start-up time/energy                    < 1 ms / < 0.µJ
Frequency range                         300 MHz to 2.4 GHz
Max data packet length                  1 ms
RSSI capability                         Required
Crystal oscillator frequency/power      32 kHz / < 1 µW
Frequency accuracy                      1%
Data rate (min to max)                  100 kbps to 3 Mbps
Energy/bit                              < 0.5 nJ/bit
3. MULTIPLE-INPUT HARVESTING PMU WITH ENHANCED BOOSTING SCHEME
FOR IOT APPLICATIONS
Powering IoT sensor nodes in expanding application platforms requires a highly efficient power
management unit (PMU) whose power consumption adapts to harvester energy availability and load
conditions. To provide continuous, reliable operation, this work presents a PMU able to extract
energy from four energy sources while providing independent maximum power point tracking (MPPT)
for each harvester. With MPPT achieved independently of voltage boosting, a novel step-up ratio
technique is implemented that shortens charging time by 33.5% and improves charging efficiency by
up to 3.5x. A fully digital technique to detect harvester energy flow is designed to scale down
system power during energy drought periods, and the system can cold-start from inputs as low as
0.4 V. The stored energy is used by the load through a hybrid analog-digital LDO or further
boosted for pulsed battery charging. The chip is fabricated in a 180 nm process with an area of
0.46 mm², and achieves a maximum power delivery of 2.6 mW and a maximum end-to-end efficiency of
70% at 40 µW.
3.1 Introduction
The wide application spectrum and market growth are the main drivers for expansion in the
Internet of Things (IoT) field. From factory and home automation sensors to fitness trackers and
implantable medical products, these devices operate in a duty-cycled fashion, giving an average
consumption of tens of µW, which enables the use of energy harvesters. However, an energy storage
element is still needed to hold the voltage during the short bursts of peak consumption. Under
these working conditions, where constant recharging is needed, a battery suffers a degraded
lifetime. A two-level energy storage is preferable, where a super-capacitor is used in the direct
charge/discharge path since it has an orders-of-magnitude longer lifetime at the expense of
higher leakage [42]. An extra battery with bigger capacity and lower leakage is used as secondary
storage: harvested power in excess of the load demand is stored in this battery, to be used later
when no energy is available.
Recently, systems handling multiple-input harvesting have been gaining attention as a possible
solution to the energy drought periods faced when using a single harvester (e.g. solar panels at
night or a thermoelectric generator with no temperature difference present). Combining power from
non-homogeneous harvesters provides reliable operation under varying ambient conditions [43, 44].
In the literature, different system design methodologies [33] and approaches have been proposed
to combine power from multiple sources [34]. On the architectural level, all recent
state-of-the-art designs (to the author’s knowledge) are similar in interfacing all harvesters to
the same voltage boosting circuit. This creates two design limitations, discussed below.
Multi input harvesting means that power is extracted from the sources without interruption
while maintaining maximum power point tracking (MPPT). This cannot be achieved with a single
boosting circuit shared between inputs, as it can only create a single input matching impedance
at a time; therefore, time slotting [35, 36] or priority assignment algorithms [37] are required
to control which input to operate on while the other harvesters are disconnected. A bigger input
capacitance helps retain the collected energy during the disconnect period, but it trades off
with the speed at which an MPPT circuit acquires the open circuit voltage (OCV). This was solved
in [36] at the expense of using two input chip pins for each harvester, severely limiting the
integration capabilities and increasing the cost of such power management units (PMUs). In [2],
three sources are directly connected to the same voltage doubler core, creating simultaneous
harvesting but inherently losing the ability to provide a unique MPPT for each input, as the
circuit runs at a single frequency. Thus, the input impedances of all ports track each other and
cannot be controlled independently.
Booster design carries all the burden of the system performance, as the booster has to both
create a wide range of input impedance for MPPT and be optimized for efficiency. In integrated
switched capacitor (SC) converters, maximum efficiency is usually achieved at a certain
load/frequency. Maintaining high efficiency with the changing nature of energy harvesters thus
becomes challenging, as it trades off with the dynamic range of harvester powers the circuit can
match. Switched-inductor converters relax this trade-off at the expense of not achieving full
integration and increasing solution cost.
To address these limitations, this work proposes an intermediate stage that handles input
matching and automatic selection of the highest available input prior to boosting. This
eliminates the need for ranking algorithms to select the most active harvester and allows the
booster stage to be designed independently and optimized for efficiency, selecting the frequency
and conversion ratio (CR) for the highest charging efficiency. Since the booster load is mostly
capacitive, the concept of charging energy efficiency is introduced along with a novel boosting
scheme to achieve faster and more efficient charging. To provide a complete solution, the PMU
includes self-start-up, load regulation and battery charging capabilities optimized for IoT
applications. Lastly, the combiner circuit has a digital energy flow detector that puts the chip
in a low-power standby mode when no harvester energy is available. This chapter is organized as
follows: Section 3.2 shows the system-level architecture. Detailed circuit-level design is
described in Section 3.3. Section 3.4 presents the measured results and performance of the
proposed solution. Section 3.5 summarizes the main contributions of this work and presents
conclusions.
3.2 System Level Architecture
Figure 3.1: (a) Proposed PMU block diagram, (b) Operation timing chart
Figure 3.2: (a) Simplified power flow diagram, (b) System flow chart
A multi-input SC converter provides MPPT for each input independently, harvests DC power ranging
from µW to mW, and provides an efficient boosting technique to store energy for further use. The
system architecture is shown in Fig.3.1(a). The timing diagram for the main blocks in Fig.3.1(b)
illustrates chip operation depending on the harvester’s energy availability and the CSTR voltage
level (CSTR being the primary energy storage at the output of the booster), detected by the
energy flow detection (EFD) and Fsen flags, respectively. The system cold-starts from any
arbitrary input using the passive maximum voltage selector followed by a charge pump (CP),
allowing start-up from voltages as low as 0.4 V. When the voltage on CSU reaches 1.4 V, a power
good (PG) flag goes high, enabling the combiner and placing the booster in bypass mode (1x), so
that CSTR is charged directly from the combiner without extra boosting. The booster then
increases CR in steps from 2x to 8x until CSTR is charged to 2 V (Fsen = 1). This ensures that
the output voltage of the booster stays close to the ideal VSTR = Vcomb · CR, maintaining high
efficiency during the CSTR charging period, which can reach hours [45]. A simplified power flow
diagram and flow chart are shown in Fig.3.2(a, b). Detailed analysis of the enhanced boost scheme
is presented in Sect. 3.3.3.
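The start-up and boosting sequence can be captured in a few lines of behavioral code. The thresholds (1.4 V for PG, 2 V for Fsen) and the 1x to 8x range are taken from the description above. The rule used to advance CR (step up once VSTR reaches 90% of Vcomb · CR) and the function name are assumptions, chosen only to make the sequence concrete.

```python
PG_THRESHOLD = 1.4  # V on CSU that raises the power good flag (from the text)
V_STR_FULL = 2.0    # V on CSTR that sets Fsen = 1 (from the text)
CR_MAX = 8          # highest conversion ratio of the booster (from the text)


def pmu_step(v_su, v_str, v_comb, cr, margin=0.9):
    """One control decision of the PMU; returns (pg, fsen, cr).

    cr == 0 means the booster is not yet enabled. The margin-based CR
    increment is an assumed policy, not the one used on the chip.
    """
    pg = v_su >= PG_THRESHOLD
    if not pg:
        return False, False, 0   # only the start-up circuit is running
    fsen = v_str >= V_STR_FULL
    if fsen:
        return True, True, cr    # CSTR is full, stop stepping CR
    cr = max(cr, 1)              # PG first enables bypass mode (1x)
    if v_str >= margin * cr * v_comb and cr < CR_MAX:
        cr += 1                  # output is near the ideal Vcomb * CR
    return True, False, cr
```

Calling `pmu_step` repeatedly while CSTR charges walks CR upward one step at a time, which keeps VSTR close to the ideal output of the currently selected ratio.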
When CSTR is fully charged, the digital part of the hybrid LDO (H-LDO) provides a stable 1.8 V
supply for the load in sleep or standby mode (1 µA to 1 mA) with only 300 nA quiescent current.
In active mode, an auxiliary analog LDO enabled in parallel with the digital LDO (DLDO) supplies
the demanded current (≈ 10 mA), providing a ripple-free supply for sensitive blocks by
eliminating steady-state switching. CSTR is kept between 1.8 and 2 V by the charge and discharge
actions of the booster and battery charger, respectively. With only one of them enabled at a
time, the 300 pF metal-insulator-metal (MIM) capacitors used in the booster circuit are shared
(reused) in the battery charger CP, saving 0.18 mm² (more than one-third of the circuit area).
The last part of the timing diagram in Fig.3.1(b) shows the battery-assisted phase of operation,
when the load is demanding current while CSTR is falling out of regulation. A simple switch
connects CBATT to charge CSTR back into the regulation window.
When EFD = 0, indicating that no input power is available, the whole system switches to a
low-power mode while maintaining the ability to sense the harvester’s power availability and
provide a stable supply for the load through CSTR or with battery assistance. This is achieved by
turning off the main oscillator (OSCmain) using the EFD signal and relying on OSCSU (≈ 5 kHz)
instead. Since all circuits (including comparators) consume power at clock edges only, this
lowers the system consumption considerably, a critical feature in a time-variant environment
under different load power profiles.
3.3 Circuit Level Design
3.3.1 Multiple Input Start-up Circuit
Figure 3.3: Start-up circuit diagram
The start-up (SU) circuit design is shown in Fig.3.3. To enable start-up from any input source,
a tree of cross-coupled PMOS devices, as in a dynamic body biasing (DBB) structure, allows the
maximum input voltage to propagate to Vmax. The maximum selectors present a low-resistance path
if there is a voltage difference of at least Vth between the compared inputs. This simple yet
effective circuit eliminates the Vth drop of traditional multi-input start-up [36] and still
provides high-impedance isolation (1 GΩ) in the path of the three lowest inputs, preventing back
leakage. The drawback of this technique shows up when the inputs are within Vth (400 mV) or less
of each other: the start-up slows down (by up to 10x) as all max selectors present high
impedance. This condition can be mitigated if complementary sources are selected for the EH1-EH2
and EH3-EH4 pairs.
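The behavior of the maximum selector tree, including its slow-start corner case, can be illustrated with a simple model. The two-level arrangement (EH1 against EH2, EH3 against EH4, then the two winners) is inferred from the pairing mentioned above and is an assumption, as are the function names. Only the 400 mV threshold is taken from the text.

```python
V_TH = 0.4  # V, PMOS threshold quoted in the text


def max_select(a, b, v_th=V_TH):
    """Cross-coupled PMOS pair: passes the larger input.

    The second return value is True when the inputs differ by at least
    v_th, i.e. when the selector presents a low-resistance path.
    """
    return max(a, b), abs(a - b) >= v_th


def startup_vmax(eh1, eh2, eh3, eh4):
    """Assumed two-level selector tree; returns (Vmax, low_resistance_path)."""
    m12, low12 = max_select(eh1, eh2)
    m34, low34 = max_select(eh3, eh4)
    vmax, low_top = max_select(m12, m34)
    # The winning input passes through its own first-level selector and the
    # top-level selector; both must be low-resistance for a fast start-up.
    low_first = low12 if m12 >= m34 else low34
    return vmax, low_first and low_top
```

With inputs of 1.0, 0.2, 0.5 and 0.0 V the 1.0 V source reaches Vmax through a low-resistance path, while four inputs between 0.5 and 0.6 V still select the correct maximum but only through high-impedance selectors, which is the slow start-up case.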
Figure 3.4: Measured start-up with CSU=68 µF in presence of 4 inputs
Fig.3.4 shows measured start-up waveforms in the presence of three DC inputs and one rectified
AC input charging a start-up capacitor CSU of 68 µF. Extra feed-forward paths are added via
diode-connected native (0 Vth) transistors to accelerate the start-up in the presence of
high-voltage inputs. Protection from high-voltage stress is provided in two ways. First, VSU is
monitored externally to set PG = 1 as it reaches 1.4 V, enabling the entire PMU and disabling the
start-up circuit. Second, intermediate CP nodes are kept at a maximum value of VSU using the same
diodes, since a high input voltage with a fixed boosting ratio of 4 could otherwise easily cause
a higher voltage to build up.
3.3.2 Combiner, MPPT and Energy Flow Detection (EFD)
Figure 3.5: (a) Combiner unit circuit, (b) Combiner waveforms, (c) Energy flow detection (EFD)
circuit
Figure 3.6: OSCmain/su circuit diagram showing "EFD" control
To realize multiple-harvester management and independent MPPT for each source, a 1:1 SC
converter is designed as in Fig.3.5(a) to provide a matching condition (input impedance equal to
the harvester equivalent resistance, Rin = REH). Unlike a normal SC resistor implementation, the
clocked comparator (CMP1) operates the circuit in a pulse-skipping mode to regulate VC around
VMPPT, as shown in Fig.3.5(b). This allows each input to run at a unique local frequency fcomb
that depends on the harvester's available power. The comparator clock frequency only needs to be
higher than the emulation frequency of the lowest harvester resistance (RINmin = 2/(fclk · C)).
All the outputs of the EH units are shorted together and connected to the input of the booster,
and a 1 nF Ccomb is used to stabilize the booster input fluctuations. Reverse current is
eliminated by design: if M2 turns on while Vcomb > VC, the charges flow backward and get trapped
on C1−4, raising its voltage, and the comparator keeps M1 off, eliminating short-circuit current
between harvesters. A drawback of current-mode combining is that the higher-voltage branches can
block energy flow from the lower-voltage inputs unless Iboost is big enough to drain Vcomb below
min(VMPPT). This degrades the efficiency, as 1:1 SC converters need nearly equal input and output
voltages to achieve high power efficiency. Simultaneous harvesting with high efficiency is
achievable when all input voltages are very close to each other. With different input voltage
levels, the combiner makes the path of highest power dominate and blocks the lower inputs. The
circuit reaches steady-state operation when Iboost is equal to the sum of the input currents. If
the highest-voltage input can provide such current, harvesting from this channel is sustained
with high efficiency (Vcomb ≈ VMPPT). If that harvester cannot supply all the required current,
Vcomb drops, allowing more harvesters to be combined at the cost of lower efficiency for the
higher inputs. The loss factor is dominated by the output-to-input voltage ratio of the 1:1 SC
converter, Vcomb/VMPPTn, degrading the power conversion efficiency by ≈50% for the highest input
if the combined voltage is half that input's voltage.
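Two relations in this discussion lend themselves to a quick numerical check: the resistance emulated by the switched capacitor, written in the text as R = 2/(f · C), and the voltage-ratio limit on the efficiency of a 1:1 SC converter. The sketch below is a minimal illustration; the 100 pF capacitor and 10 kΩ harvester resistance in the usage lines are hypothetical values, and treating a branch with Vcomb above VMPPT as fully blocked is a simplification.

```python
def emulated_resistance(f_hz, c_farad):
    """Input resistance of the combiner branch, R = 2 / (f * C), as quoted in the text."""
    return 2.0 / (f_hz * c_farad)


def matching_frequency(r_eh_ohm, c_farad):
    """Switching frequency at which the branch matches a harvester resistance."""
    return 2.0 / (r_eh_ohm * c_farad)


def sc_1to1_efficiency(v_comb, v_mppt):
    """Voltage-ratio-limited efficiency of a 1:1 SC branch.

    A branch whose output node sits above its input is treated as blocked.
    """
    if v_comb > v_mppt:
        return 0.0
    return v_comb / v_mppt


# Hypothetical example: 10 kOhm harvester with an assumed 100 pF flying capacitor
print(matching_frequency(10e3, 100e-12))  # switching frequency in Hz
print(sc_1to1_efficiency(0.3, 0.6))       # combined node at half the input voltage
```

The second call reproduces the ≈50% efficiency penalty mentioned above for an input whose VMPPT is twice the combined voltage.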
The EFD circuit shown in Fig.3.5(c) detects toggling activity of the four input harvesters.
Delay cells are added to make sure the odd-parity output goes high even if all harvesters are
switching at the same frequency and phase. A dead clock detector counts 4 clock cycles (≈ 0.1 s)
and sets EFD to 0 when none of the comparator outputs changes state. This puts the entire PMU in
a low-power (LP) state by disabling OSCmain and maintaining operation on OSCSU. With the first
change in any of the comparator states, EFD goes high and the system returns to normal-frequency
operation to effectively harvest the available energy. Both oscillators are current-starved ring
oscillators, with OSCmain having an extra control knob that pushes it to the MHz range using the
EFD flag, as shown in Fig.3.6.
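The dead clock detection can be modeled as a small state machine. The sketch below abstracts the delay cells and parity logic into a direct comparison of successive comparator states, which is an assumption made for brevity; the class and method names are hypothetical. The four-cycle timeout comes from the text.

```python
class EnergyFlowDetector:
    """Behavioral model of the EFD flag (hypothetical class)."""

    def __init__(self, dead_cycles=4):
        self.dead_cycles = dead_cycles  # idle clock cycles before EFD drops
        self.prev = (0, 0, 0, 0)        # last sampled comparator outputs
        self.idle = 0                   # consecutive cycles without a change
        self.efd = 1

    def clock(self, cmp_outputs):
        """Sample the four comparator outputs on a clock edge and return EFD."""
        state = tuple(cmp_outputs)
        if state != self.prev:
            # Any comparator toggling means energy is flowing again.
            self.idle = 0
            self.efd = 1
        else:
            self.idle += 1
            if self.idle >= self.dead_cycles:
                self.efd = 0            # PMU falls back to OSCSU only
        self.prev = state
        return self.efd
```

Feeding the model four identical samples drives EFD low on the fourth one, and the first sample that differs brings it back high immediately.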
Since four independent VMPPTs need to be maintained during harvesting, four independent MPPT
circuits are required. The fractional open circuit voltage (FOCV) can be held on sampling
capacitors as in [37, 46], which requires continuous refresh due to leakage. This is acceptable
if the time it takes to reach and sample the OCV is negligible compared to the refresh repetition
period. Commercial PV cells have 50 nF/cm² output capacitance at MPP [47], which increases
exponentially approaching OCV. A small solar panel can have a time constant of 50 ms (τ = REH ·
CEH = 10 kΩ · 0.5 µF). The OCV would then have to be held for at least a few seconds so that the
total disconnect time is much smaller than the energy extraction period, which is not feasible
using integrated capacitors with the leakage levels of current technologies. An example of
capacitor-sampling MPPT refresh rate compared to energy extraction time is shown in [48]: the
restoration time for the solar panel and MPP search is 8.5% of the total time, lowering the
harvester utilization and system efficiency by the same factor.
Figure 3.7: (a) Harvester model and OCC pulse generation circuit, (b) VMPPT generation circuit,
(c) Measured VMPPT search and harvester input during OCC pulse
Figure 3.8: MPPT tracking accuracy, combiner power efficiency and power consumption versus
Pout.
The open circuit condition (OCC) is programmable to accommodate a wide range of restoration
times, from 50 µs to 250 ms, with a repetition period of up to 10 s. The OCC pulse generation
circuit and the equivalent harvester model are shown in Fig.3.7(a). To avoid using large
capacitors to hold the OCV while still maintaining the sampled VMPPT for indefinitely long
periods, the proposed six-bit current DAC in Fig.3.7(b) generates a staircase ramp that is
compared against the FOCV obtained from a resistive divider. When the comparator toggles, the
voltage is held by the current DAC. The circuit adds only a 10% area overhead per harvester unit,
and its current consumption scales from 5 nA to 320 nA depending on the available OCV. The
measured MPPT tracking accuracy, combiner efficiency and consumed power Pvdd are shown in
Fig.3.8; high efficiency is maintained as Pvdd scales with input power. Power efficiency is
calculated here as ηP = Pout/(Pout + Pvdd) at Vi from 0.5 to 1.3 V (actual combiner input from
0.25 to 0.65 V at MPPT). The MPPT tracking accuracy degrades at low input power (high REH), as
the FOCV is altered by the ratio of REH to the total resistance of the resistive divider (0.6-1.2
MΩ), as shown in Fig.3.7(a-b). The combiner and MPPT are designed to interface with DC-based
harvesters (e.g. solar), whereas AC harvesters (e.g. piezo) can be accommodated only during
start-up with pre-rectification [32].
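The staircase search performed by the six-bit DAC can be sketched as follows. The DAC is modeled as an ideal voltage staircase, and the 25 mV step size, the fraction k = 0.5 and the function name are illustrative assumptions; the text specifies only the resolution (six bits) and the stop-on-comparator-toggle behavior. Voltages are kept in millivolts so that the comparison is exact.

```python
def mppt_dac_search(v_oc_mv, k=0.5, bits=6, lsb_mv=25):
    """Ramp the DAC code until the staircase reaches the FOCV target.

    Returns (code, held voltage in mV). The code at which the comparator
    toggles is held digitally, so no sampling capacitor is needed.
    """
    target_mv = k * v_oc_mv                # output of the resistive divider
    last_code = 2 ** bits - 1
    for code in range(last_code + 1):
        if code * lsb_mv >= target_mv:     # comparator toggles here
            return code, code * lsb_mv
    return last_code, last_code * lsb_mv   # ramp saturated at full scale
```

For an open circuit voltage of 1 V the search stops at code 20 and holds 500 mV. For 910 mV it stops at code 19 and holds 475 mV instead of the ideal 455 mV, which shows that the quantization error is bounded by one DAC step.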
3.3.3 Enhanced Booster Scheme
Figure 3.9: (a) Booster top level diagram, (b) Voltage doubler CP circuit with MIMCAP reuse
The circuit shown in Fig.3.9(a) is based on an integer-CR version of the CP in [49], with four
muxes reconnecting the inputs of each module to choose from bypass mode (1x) up to 8x. Each of
the three voltage doubler units is built as shown in Fig.3.9(b). The MIM caps are first charged
via Vin1−2 with the bottom plates grounded; the bottom plates are then driven via Vin3−4 with the
upper switches closed, which raises the upper plates to Vin1−2 + Vin3−4. The muxes reconnect
Vin3−4 in each cell to the previous stage output, ground or Vcomb to achieve different CRs.
Figure 3.10: Operation cycle for harvester/battery powered systems
A scheme is proposed to optimize the booster efficiency for purely capacitive loads undergoing
gradual charging. The motivation is derived from the start-up sequence of battery-powered systems
and how it differs from that of harvesting-based systems, as shown in Fig.3.10. Charge time is
needed to charge/discharge external capacitors. In battery-powered systems these capacitors are
on the order of µF (e.g. buck-converter external cap, LDO stabilization cap), and optimizing
power in these periods has a smaller effect than in harvesting-based systems, where the storage
capacitors are much bigger (mF to farads), making charging energy optimization a necessity. For
example, charging a 1 F capacitor directly from a 1 V source takes an energy Esupp = q · V = C ·
V² = 1 J from the charging source, while the final energy on the capacitor is only 0.5 · C · V² =
0.5 J; the difference is commonly explained as loss in the charging resistance. In other words,
charging a capacitor from 0 V to its final voltage wastes
50% of the total energy. Step-wise charging, shown in [50], provides considerable energy savings
as the number of charging steps (N) increases, reducing the total dissipated energy by a factor of
N. The energy efficiency of a charging step is
\[ \eta_E = \frac{E_C}{E_C + E_R} = 0.5 \cdot \left(1 + \frac{V_{ini}}{V_f}\right) \qquad (3.1) \]
where EC, ER, Vini and Vf are the energy stored on the capacitor, the energy lost in the charging
resistance, and the initial and final voltages of the charging step, respectively. The closer the
fraction Vini/Vf is to unity, the higher the energy efficiency. In the proposed system, the
programmable booster is responsible for the step-wise charging of CSTR. Starting from the CP
power efficiency at steady state [52], the expression is solved across time to analyze the
efficiency during the charging period.
For current MIM capacitor technology, α is lower than 10%. This suggests that the efficiency is
close to 100% when the output voltage is close to the ideal final value at the selected CR.
Fig.3.11 plots the charging of CSTR to 1.8 V by the booster with a fixed 8x CR alongside the
“enhanced boost” configuration. The efficiency versus time is calculated in both cases, showing
up to 3.5x improvement, as the output voltage is kept close to the target value with step-by-step
CR increments. The enhanced scheme also achieves 33.5% faster charging at 0.6 V input voltage and
CSTR = 1 mF. With bigger CSTR and higher input power, a substantial charging time reduction can
be achieved (50% faster at 1 V input and CSTR > 1 mF), as the input impedance of the booster and
its output current allow faster charging at lower CR. The measured booster power conversion
efficiency (PCE) is 70 to 80% for input voltages > 0.4 V across all CRs at Pout < 80 µW. In
bypass mode (1x), Pout can reach the maximum power delivered by the combiner, ≈ 2.6 mW. The CR in
the enhanced boosting scheme is controlled manually in this work; one possible implementation is
to periodically compare fractions of the output voltage with the input voltage of the booster and
decide the CR accordingly (e.g. when the output voltage approaches 2 times the input voltage,
switch from CR = 2 to 3). Lastly, to regulate the CSTR voltage, the OCC pulse enables a
hysteresis comparator that senses whether VSTR > 2 V, enabling the battery charger to transfer
excess charge to the battery until VSTR drops back to 1.8 V, at which point the booster is
enabled again, and so on.
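The energy argument behind step-wise charging can be verified numerically. The sketch below charges an ideal capacitor through a sequence of ideal voltage sources, one per step, and compares the stored energy with the energy drawn from the sources. This is a simplification of the booster, which is not an ideal source, and the function names and example voltages are chosen for illustration only.

```python
def step_efficiency(v_ini, v_f):
    """Energy efficiency of one charging step from v_ini to v_f, Eq. (3.1)."""
    return 0.5 * (1.0 + v_ini / v_f)


def charge_in_steps(c, targets):
    """Charge a capacitor from 0 V through the given target voltages.

    Each step is supplied by an ideal source at the step's final voltage.
    Returns (stored energy, supplied energy, overall efficiency).
    """
    v = 0.0
    e_supply = 0.0
    for v_f in targets:
        e_supply += c * (v_f - v) * v_f  # charge drawn times source voltage
        v = v_f
    e_stored = 0.5 * c * v * v
    return e_stored, e_supply, e_stored / e_supply


# Single step: the 1 F, 1 V example from the text
print(charge_in_steps(1.0, [1.0]))
# Four equal steps to 1.8 V on an assumed 1 mF capacitor
print(charge_in_steps(1e-3, [0.45, 0.9, 1.35, 1.8]))
```

The single-step case reproduces the 50% loss discussed above. With four equal steps the overall efficiency rises to N/(N + 1) = 80%, because the dissipated energy is divided by the number of steps while the stored energy is unchanged.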
Figure 3.11: CSTR charging by smart boost, calculated efficiency with time
3.3.4 Pulse Battery Charging
As most Li-Ion batteries operate from 3 V at deep discharge to 4.2 V at end of charge, the
battery charger is implemented as an extra voltage doubler CP similar to the one in Sect.3.3.3,
using thick-oxide 3 V devices (block diagram shown in Fig.3.1). With CSTR as the input, the
required input voltage range is 1.5 to 2.1 V. Reusing the MIM caps of the main booster saves 30%
of the total chip area, as only one of the two circuits operates at a time, depending on Fsen.
The measured maximum output power is 350 µW, with a maximum efficiency of 86% at 100 µW. The
extra circuit shown in Fig.3.12(a) adds charge accumulation and simplified pulse charging
capabilities to the battery charger system. Previous works [53, 54] show the improvements of
pulse charging in charging efficiency, charging time and temperature control versus conventional
linear battery chargers.
The circuit contains a clocked comparator comparing divided versions of the battery voltage and
the accumulation capacitor (CACC) voltage. The potential dividers are implemented with
long-channel diode-connected PMOS devices that draw only 50 nA on each side. With each charging
pulse, the potential divider on the battery side is changed by M2 to force the charging switch M1
to shut off at the next clock cycle. The pulsed current profile decays exponentially from a peak
of Ip = (VACC − VBATT)/RM1. The voltage difference is programmable through the sizing of the
diode-connected device M3, creating current pulses from 800 µA down to 80 µA. The pulse frequency
is a function of the 3V-CP operation frequency and the CACC value. A simple two-flip-flop circuit
with a low-frequency clock clk1Hz detects the end of charging when 2 clock cycles (4 seconds)
pass without a single occurrence of the charging pulse CHR, raising the charging-done flag
CHRDONE as shown in Fig.3.12(b). This signal feeds back to the system to disable the battery
charger and return to boosting mode, charging CSTR back to 2 V, at which point the battery
charger is enabled again. The measured charging operation is shown in Fig.3.12(c).
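A short behavioral sketch makes the pulse current expression and the charging-done detection concrete. The 500 Ω on-resistance used in the usage line is a hypothetical value (RM1 is not given in the text), and the flip-flop pair is abstracted into a counter of idle clock cycles; the function names are assumptions.

```python
def peak_pulse_current(v_acc, v_batt, r_m1):
    """Peak of the charging pulse, Ip = (VACC - VBATT) / RM1."""
    return (v_acc - v_batt) / r_m1


def charging_done(chr_per_cycle, idle_cycles=2):
    """Return True once `idle_cycles` consecutive slow-clock cycles pass
    without any charging pulse (models the CHRDONE flag).

    chr_per_cycle lists the number of CHR pulses seen in each slow-clock cycle.
    """
    idle = 0
    for pulses in chr_per_cycle:
        idle = 0 if pulses else idle + 1
        if idle >= idle_cycles:
            return True
    return False


# Assumed 0.4 V difference across an assumed 500 Ohm switch
print(peak_pulse_current(4.0, 3.6, 500.0))
print(charging_done([3, 1, 0, 2, 0, 0]))
```

With these assumed values the peak current is about 800 µA, the upper end of the programmable range, and the done flag rises only after two consecutive idle cycles, so an isolated cycle without pulses does not end the charging phase.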
Figure 3.12: (a) Pulse charger circuit, (b) Pulse charger operation, (c) Measured VBATT and VACC
during charging
3.3.5 Hybrid LDO
IoT applications often require both digital circuits for data processing and analog circuitry
for data transmission. The widely different power and noise requirements of these mixed-mode
circuits call for a design with different operating states that maintains efficiency in
“Sleep < 1 µA”, “Standby < 1 mA” and “Active ≈ 10 mA” modes.
The proposed H-LDO, shown in Fig.3.13, contains a digital LDO in parallel with an analog
LDO. In the sleep state, both LDOs are switched off and the load is connected directly to CSTR,
which is regulated by the Fsen circuit. During “Standby” the digital LDO is enabled. It comprises
15 PMOS pass transistors capable of sourcing 1 µA to 1 mA. Each of these transistors is controlled
by a bit in a barrel shifter which, in turn, is governed by a clocked window comparator. The
comparator senses the output voltage, enabling more transistors if the output voltage drops below
Vref−L and disabling them if it rises above Vref−H. Like a typical digital LDO [55], this
architecture exhibits output ripple even for a DC load current; the ripple is determined by Vref−L,
Vref−H and FCLK. In our chip, these are chosen to be 1.75 V, 1.85 V and 2 MHz, respectively. In
active mode, the auxiliary analog LDO is powered up. This regulator smooths the ripple on the
output voltage and is capable of sourcing up to 10 mA. When the analog LDO is enabled, the state
of the PMOS pass transistors in the digital LDO is frozen and regulation is set by Vref−M at 1.8 V.
If the voltage leaves the regulation window, the barrel shifter is enabled to assist the analog LDO
for faster settling. Both LDOs share the same feedback network, which consists of 5 diode-connected
PMOS devices and compensation capacitors. The analog LDO opamp is a conventional two-stage
design with an NMOS-input differential pair and a class-AB push-pull output stage. It requires a
minimum compensation capacitance of 100 pF at the LDO output, which is easily provided by
any typical load.
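The window-comparator and barrel-shifter loop of the digital LDO can be sketched behaviorally as
follows. The transistor count and the two thresholds are taken from the text; the function names,
the one-position-per-clock shift and the sampled voltages are assumptions for illustration and do
not model the actual circuit dynamics.

```python
N_PASS = 15                    # PMOS pass transistors (from the text)
VREF_L, VREF_H = 1.75, 1.85    # comparator window in volts (from the text)


def shifter_step(enabled, vout):
    """Return the number of enabled pass transistors after one clock edge.

    Assumption: the barrel shifter moves by one position per clock cycle.
    """
    if vout < VREF_L:
        return min(enabled + 1, N_PASS)   # output too low: add drive
    if vout > VREF_H:
        return max(enabled - 1, 0)        # output too high: remove drive
    return enabled                        # inside the window: hold


def run(enabled, vout_samples):
    """Apply shifter_step() to a sequence of sampled output voltages."""
    history = []
    for v in vout_samples:
        enabled = shifter_step(enabled, v)
        history.append(enabled)
    return history


# Hypothetical samples: two low readings, one in-window, one high, one in-window.
print(run(3, [1.70, 1.72, 1.80, 1.90, 1.80]))
```

Because the loop only acts when the output leaves the window, a constant load leaves the output
hovering between the two thresholds, which is the ripple mechanism mentioned above.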
Figure 3.13: Hybrid-LDO Circuit
In digital mode, the LDO consumes 300 nA in steady-state operation and can work with a dropout
as low as 50 mV. In hybrid mode, it consumes 0.95 µA with a 150 mV dropout, achieving a 6x ripple
reduction and enhanced low-frequency PSRR. Fig.3.14 shows the hybrid LDO output voltage and
the “over OR under” flag, with the ripple in each mode, at different load currents chosen to mimic
an IoT sensor profile. The 3 voltage references are provided externally in this design but can be
generated with a single current-mode bandgap with a multiple-output resistor string to guarantee
monotonicity in the presence of mismatch.
Figure 3.14: Hybrid-LDO output voltage at different modes of operation
3.4 Measurement Results
Figure 3.15: Zoomed in die corner (0) Cold start-up and OSCSU , (1) Four combiner units, (2)
Booster and MIM caps, (3) Pulse charger and 3V −CP , (4) Hybrid LDO, (5) Energy flow detector
and (6) OSCmain
The PMU was designed and implemented in the bottom-left corner of a 2 mm x 2 mm die in
a 180 nm CMOS process with thick-oxide (3.6 V) and 0VTH transistors available. Fig.3.15 shows
the die photograph, with the circuits occupying an area of 0.46 mm2. This allows easy integration
of a sensor interface and transceiver on the same die, creating a self-powered IoT node solution on
a single chip with a small number of external components (CSU, CSTR, CBATT = 68 µF, 1 mF and
10 mF, respectively).
Figure 3.16: Measured block consumption pie charts. (a) EFD=0, (b) EFD=1
Fig.3.16(a) shows the measured power breakdown by block when EFD = 0. With OSCSU running
at 5 kHz, the PMU maintains the ability to sense energy availability and provide a stable supply
for the load. Since the booster and pulse charger work interchangeably, only the booster is added
to the total power consumption (1.55 µW). The pulse charger and the Fsen circuit ensure that
excess energy in CSTR is stored in the battery. When EFD = 1, OSCmain is enabled to provide
the frequency needed for combiner matching, the booster and faster LDO regulation. In this state
the chip consumes 14.1 µW, with a harvested output power of ≈ 90 µW measured at the LDO output.
Figure 3.17: Input waveforms showing four channel harvesting and MPP emulation
Figure 3.18: Measured block and end-to-end efficiencies
Figure 3.19: Measurements setup diagram
Figure 3.20: Measured PMU waveforms showing system operation through different states
Fig.3.17 demonstrates independent MPPT for the individual channels, along with the harvesters'
RC model and the input waveforms during power extraction. The combiner output Vcomb is kept
below 200 mV to maintain energy flow in the lower channels. It is important to note that the lowest
PCE (30%) is associated with the highest input voltage, because the combiner input and output
voltages are not close to each other when different input voltages are used, as in this
demonstration. Fig.3.18 shows PCE versus Pout for each block together with the end-to-end
efficiency. This is measured with all inputs connected to the same voltage level to ensure efficient
combiner operation with multiple inputs harvested concurrently; the same results would be valid
for a single harvested input, in the case of one dominant input. PCE requires block testing with
purely resistive loads, but the system loads are mainly capacitive most of the time. When most of
the load current is provided by CSTR and not the battery, most power flows through the combiner
and booster to gradually fill CSTR up using the enhanced boosting scheme. A high power burst is
then extracted by the load through the LDO, or used for battery charging. Thus, an end-to-end
efficiency curve for the combiner-booster can be constructed and, depending on load and
battery power levels, the overall system efficiency can be calculated. The time-variant nature of
capacitive loads and the enhanced boosting scheme is not part of the PCE measurements. Therefore,
multiplying the PCE curves across power gives a pessimistic view of the system’s overall energy
efficiency, especially with a big storage capacitor CSTR, as the power flow happens in phases and
not across all blocks at the same time and power.
Table 3.1: Comparison against State of The Art Energy Harvesting PMU’s
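The pessimistic estimate mentioned above, obtained by multiplying block PCEs, can be sketched as
follows. The three stage efficiencies are hypothetical placeholders, not values read from Fig.3.18,
and the function name is chosen only for this illustration.

```python
from math import prod


def cascaded_efficiency(stage_pce):
    """Product of per-stage power conversion efficiencies (each in 0..1)."""
    for eta in stage_pce:
        if not 0.0 <= eta <= 1.0:
            raise ValueError(f"efficiency out of range: {eta}")
    return prod(stage_pce)


# Hypothetical combiner, booster and LDO efficiencies at one power level.
eta_total = cascaded_efficiency([0.90, 0.85, 0.90])
print(f"cascaded PCE = {eta_total:.4f}")
```

This product assumes every block processes the same power at the same time, which is exactly the
assumption the phased, capacitor-buffered operation of the PMU does not satisfy.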
To show the system dynamics through the different phases of operation, Fig.3.19 shows the test
setup and Fig.3.20 shows the results. Operation starts at VSU = 1.4 V, where PG goes high and
harvesting is initiated with all four inputs connected to harvesters modeled at 0.8 V OCV (0.4 V at
MPP) to operate in simultaneous harvesting mode. As long as energy is flowing through the
combiner (EFD = 1), the booster fills up CSTR from 1x to 8x mode in steps for higher charging
efficiency. The LDO is then enabled to supply the load in digital and, later, in hybrid mode. Energy
in excess of the load demand is stored via battery charging for later use during energy drought
periods: the Fsen flag toggles repeatedly within the hysteresis comparator window (1.85 to 2.1 V),
sending energy packets to the battery. CSTR is recharged by the harvester through the
combiner/booster after battery charging is done, as both circuits share the same MIM capacitors.
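The toggling of the Fsen flag within its hysteresis window can be illustrated with a small
behavioral model. The 1.85 V and 2.1 V thresholds come from the text; the flag polarity (high
meaning that CSTR holds excess energy and the battery charger may run), the function name and
the sampled voltages are assumptions for illustration.

```python
def fsen_flag(v_cstr_samples, v_low=1.85, v_high=2.1, start=False):
    """Hysteresis comparator on the storage-capacitor voltage.

    The flag is set when the voltage reaches v_high and cleared when it
    falls to v_low; in between it keeps its previous value.
    """
    flag = start
    out = []
    for v in v_cstr_samples:
        if v >= v_high:
            flag = True
        elif v <= v_low:
            flag = False
        out.append(flag)
    return out


# CSTR charges up, hands an energy packet to the battery, then recharges.
print(fsen_flag([1.80, 2.00, 2.10, 2.00, 1.90, 1.85, 2.00]))
```

Each high interval of the flag corresponds to one energy packet sent to the battery before the
system returns to boosting mode.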
Table 3.1 includes a performance summary of the proposed PMU, compared with the state of the
art. From a cost perspective, this work achieves a high power-to-area ratio with no inductors
required. Implemented in standard CMOS with a minimum pin count, and offering regulation and
battery charging capabilities, it allows IoT node integration on a single chip. The achieved
efficiency and quiescent current compare favorably to the other SC topologies in the table due to
the boosting technique used and the scaling of system power consumption with input power. The
booster by-pass mode allows the maximum power to reach the mW range.
3.5 Conclusion
This work is the first in its class to provide impedance matching for 4 harvesters with a single
oscillator, harvesting from the highest source or combining their powers in the case of equal input
voltages, and efficiently boosting the voltage with up to 3.5 times the charging efficiency and a
33.5% reduction in charging time versus a conventional SC-voltage CP. The stored energy is used to
generate a regulated supply for sensitive loads and also provides pulsed battery charging
capabilities in a fully integrated, inductor-less design. The energy flow detection circuit scales
the PMU consumption down 9-fold to 1.55 µW when no input energy is available, while maintaining
supply regulation and the ability to return to harvesting operation in < 0.2 msec. A maximum power
delivery of 2.6 mW was achieved, as well as a maximum end-to-end efficiency of 70% at 40 µW. With
a quiescent current of 950 nA and an area of 0.46 mm2, this chip enables the integration of a sensor
interface and wireless transmitter, providing a reliable, self-powered IoT node in a single package
with only 2-3 external capacitors.
4. 0.2 NJ/BIT FAST START-UP ULTRA-LOW POWER WIRELESS TRANSMITTER FOR
IOT APPLICATIONS 1
This chapter [56] presents an ultra-low power, PLL-less, power-efficient transmitter (Tx) for
Internet-of-Things (IoT) applications. To reach sub-1 mW consumption, an architectural innovation
is proposed to eliminate the PLL. At the block level, a current-recycling vertical delay cell is
designed and used to build a ring oscillator, referred to here as the vertical ring oscillator
(VRO). Wireless Tx targeting IoT applications must meet tough end-to-end efficiency requirements.
The frequency synthesis problem is usually solved by incorporating a variant of the phase locked
loop (PLL). However, power hungry dividers and large loop time constants hurt the aggregated Tx
power consumption and produce systems with slow start-up and turnaround times, particularly when
operating at low output power. This work demonstrates an agile, ultra-low power and energy
efficient transmitter architecture for IoT applications to address these concerns. The Tx leverages
the characteristics of wideband frequency-shift keying (FSK) modulation and uses an open loop ring
oscillator based on a vertical delay cell as its local oscillator (LO) generator. When followed by
an edge-combiner-type power amplifier, the required LO operating frequency drops to one-third of
the RF frequency, which further reduces the Tx power consumption. Moreover, LO frequency
correction is achieved through a digitally-assisted scheme with specially designed delay cells for
fast frequency calibration. The Tx was fabricated in 0.18 µm CMOS technology and occupies an
active area of 0.112 mm2. Experimental results show a Tx energy efficiency of 0.2 nJ/bit for a
3 Mbps data rate, and a normalized energy efficiency of 3.1 nJ/(bit-mW) when operating at maximum
output power of −10 dBm.
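As a quick sanity check on the headline figures, the sketch below converts the reported energy per
bit and data rate into an average power, and the maximum output power from dBm to mW. It assumes
that the energy-per-bit figure is simply the Tx power divided by the data rate; the function names
are illustrative. The normalized figure of 3.1 nJ/(bit-mW) is not recomputed here, since the text
does not give the operating point used for it.

```python
def average_tx_power_mw(energy_nj_per_bit, data_rate_mbps):
    """nJ/bit times Mbit/s gives mW directly (1e-9 J * 1e6 1/s = 1e-3 W)."""
    return energy_nj_per_bit * data_rate_mbps


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


print(f"implied Tx power: {average_tx_power_mw(0.2, 3.0):.2f} mW")
print(f"max output power: {dbm_to_mw(-10.0):.2f} mW")
```

Under that assumption, 0.2 nJ/bit at 3 Mbps corresponds to roughly 0.6 mW, in line with the
sub-1 mW target stated above.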
1Reprinted with permission from "0.2-nJ/b Fast Start-Up Ultra Low Power Wireless Transmitter for IoT
Applications" by J. Zarate-Roldan, A. Abuellil et al. in IEEE Transactions on Microwave Theory and
Techniques, vol. 66, no. 1, pp. 259-272, Jan. 2018. Copyright 2018 IEEE.
4.1 Introduction
When fully deployed, the Internet-of-Things (IoT) network will enable people, animals and
inanimate objects to establish on-demand, robust wireless communications links. The perpetual
connectivity state of the IoT nodes opens up limitless possibilities and applications in multiple
consumer segments as diverse as health care, structural condition monitoring, home and industrial
automation, and traffic management [57]. Ultra-low power operation sits at the top of the technical
challenges list that an IoT-competent device must overcome [58]. Reducing the power consump-
tion of the IoT nodes is critical to enable long intervals between replacements and/or depletion of
the local energy reservoir. This is of particular interest in applications operating in remote areas or
confined spaces. Furthermore, since the typical energy source of an IoT node is either a small form
factor battery or an energy harvesting unit [59, 60], it is mandatory for the IoT node to operate at
low power levels and in an energy-efficient fashion. While an IoT node is a complex device that
might include power management circuits, memory, microcontroller, sensors, analog front-ends,
and a wireless radio, it is typically the latter element which consumes the highest power [59, 61].
In most IoT applications, each node is required to broadcast a payload at intervals ranging
from a few seconds to hours, depending on the wake-up mechanism or periodicity of the event
triggering the transmission. However, the actual transmission time of payloads containing a few
bytes is generally less than 1 ms when operating at moderate data rates. Thus, it is critical to reduce
the power spent and duration of the incidental steps per transmission cycle (e.g. wake-up, start-up,
calibration) to avoid degrading the overall Tx efficiency. A conventional Tx architecture using
a PLL requires a crystal oscillator in the MHz range as a frequency reference. The associated
crystal’s start-up time and power consumption not only limit the start-up speed but ultimately
decrease the system’s energy and power efficiencies.
The need to reduce the power consumption of the IoT node’s wireless gateway has motivated
the development of highly efficient radio transmitters [16, 23, 62–70]. Some of the approaches
used to reduce the Tx power consumption include: using subharmonic injection-locked (IL)
oscillators to apply phase multiplexing techniques [62]; implementing an edge-combiner power
amplifier (PA) driven by cascaded IL ring-oscillators via a high-frequency crystal oscillator [23, 63];
using an intermediate-frequency (IF) quadrature backscatter technique to avoid the use of active
RF circuitry and support spectrally efficient modulations [64]; co-designing and optimizing the
voltage-controlled oscillator (VCO) and frequency divider for an efficient polar Tx [65]; employ-
ing a 2-tone RF signal as a transmission vehicle in a digitally-assisted Tx architecture [66]; using
a power-VCO in a heavily duty-cycled direct-RF Tx architecture (µ% duty cycle) [67]; using a
delta-sigma digitally-controlled polar PA in a direct digital-to-RF-envelope Tx [68]; implement-
ing sophisticated three-point modulation loops in a polar PA-based Tx [69]; using a subharmonic
ILRO with digital PA supporting high modulation rates [70]; or using an open loop direct modula-
tion approach [16].
From the previous discussion, it is possible to identify two trends in the energy efficient wireless
Tx design: i) architectures that strive to reduce the absolute power consumption and deliberately
use modulation schemes with low spectral efficiency to reduce complexity and ultimately power
[23, 66, 67]; and ii) Tx chains adopting more complex modulations that maximize spectral effi-
ciency and support higher data rates while consuming power levels well above 1 mW or radiating
low output powers (Pout) [62,64,65,68–70]. Both approaches have advantages and disadvantages:
i) is useful in applications that require the sporadic transmission of small data packets and are well
suited for energy harvesting-powered systems. Conversely, transmitters in ii) are able to transmit
a higher volume of information while making efficient use of the available power in their energy
reservoir. However, the Tx architectures in ii) pose tougher requirements on their power man-
agement units. This paper discusses the implementation of an ultra-low power Tx architecture for
operation in the 902-928 MHz industrial, scientific, and medical (ISM) radio band. The proposed
Tx uses system- and circuit-level techniques to reduce its power consumption while supporting
data rates up to 3 Mbps. As such, our Tx sits in between categories i) and ii), achieving energy
efficiency better than or comparable to Tx implementations operating at sub-200 kbps data rates
[23,66,67] while transmitting a maximum Pout of −10 dBm for a 15 times faster throughput.
Furthermore, the PLL-less approach adopted in this Tx architecture replaces the high-end MHz
crystal with a low frequency reference (32 kHz) used for digital calibration at start-up. Due to its
much lower power
consumption, the 32 kHz oscillator can be used during sleep and active states, for timing control,
and data transmission, respectively. As will be later discussed, even though typical start-up times
for low frequency (32 kHz) crystals can reach up to 2 seconds, the total energy required to start-up
such crystals is still comparable to that required by high frequency (MHz) crystals.
Key to the proposed Tx concept is the use of an unbalanced, yet differential, ultra-low power
delay cell termed the vertical delay cell. A three-stage vertical ring-oscillator (denoted as VRO)
based on vertical delay cells is employed as the Tx LO. The VRO includes a custom, digital RC
delay tuning scheme designed to minimize the frequency calibration time for the core oscillator. An
edge-combiner switching PA enables a 3x multiplication of the VRO frequency (fVRO ), allowing
the VRO to nominally operate at one-third of the RF frequency (fRF ), which further reduces
the overall power consumption. The chapter is organized as follows: Section 4.2 describes the
Tx architecture at the conceptual level and presents the system-level considerations. Section 4.3
introduces the vertical delay cell and illustrates its use in the VRO. It also discusses the
edge-combiner power amplifier and elaborates on the digitally-assisted frequency correction scheme.
Section 4.4 discusses the experimental results and finally, Section 4.5 provides the conclusions of
this work.
4.2 Proposed Architecture
The proposed Tx architecture is shown in Fig. 4.1. The Tx is optimized for operation in the
902-928 MHz band. By jointly designing and optimizing the VRO employed as the LO generation
stage and the edge-combiner PA, it is possible to reduce the power dissipation of the highest power
consumers in the Tx chain. The delay cell used in the VRO is labeled “vertical” due to its stacked
arrangement. By construction, the structure of the vertical delay cell enforces the reuse of current
(charge) drawn from the supply during its operation, leading to an intrinsically lower power
consumption than other delay cells. Furthermore, by using a three-stage VRO in combination with a
three-leg switching edge-combiner PA similar to [23], the VRO needs to run at only 1/3 fRF.
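The frequency relation between the VRO and the RF output can be illustrated with a short
calculation over the 902-928 MHz band. The factor of three comes from the text; the function and
constant names are illustrative, and 915 MHz is simply the arithmetic center of the band.

```python
EDGE_COMBINER_FACTOR = 3  # three-stage VRO driving a three-leg edge combiner


def vro_frequency_hz(f_rf_hz, factor=EDGE_COMBINER_FACTOR):
    """LO frequency needed so that edge combining yields f_rf_hz."""
    return f_rf_hz / factor


for f_rf in (902e6, 915e6, 928e6):
    f_vro = vro_frequency_hz(f_rf)
    print(f"fRF = {f_rf / 1e6:.0f} MHz -> fVRO = {f_vro / 1e6:.2f} MHz")
```

The oscillator therefore only needs to cover a range of roughly 300 to 310 MHz to address the
whole band.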
Figure 4.1: Top-level Tx block diagram
A digital calibration algorithm is employed for fVRO correction. The required frequency
reference signal (fREF) is provided to the calibrator by an on-chip Pierce oscillator circuit with a
32 kHz crystal. This calibration allows the Tx to fully exploit the wide tuning range of the VRO.
The open-loop architecture employed as the LO interfaces with an external FPGA. The latter executes
the calibration algorithm to calculate both coarse and fine digital words for the VRO delay tuning.
Two 7-bit, resistive-string-based digital-to-analog converters (DACs) translate the FPGA-generated
calibration words into control voltages for the RC tuning stage within the VRO.
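To make the coarse/fine word idea concrete, the sketch below runs a generic successive-approximation
search over two 7-bit codes. This is not the calibration algorithm of this work, which is described
in Section 4.3.3; it is a minimal illustration under stated assumptions. In particular,
`toy_vro_hz` is a hypothetical linear, monotonic tuning model with made-up step sizes, and in
practice the `measure` callback would be a cycle count of the VRO against fREF.

```python
def sar_search(measure, target, bits=7):
    """Largest code in [0, 2**bits - 1] with measure(code) <= target.

    Assumes measure() increases monotonically with the code.
    """
    code = 0
    for b in reversed(range(bits)):
        trial = code | (1 << b)
        if measure(trial) <= target:
            code = trial
    return code


def toy_vro_hz(coarse, fine):
    """Hypothetical tuning model: 1 MHz per coarse LSB, 10 kHz per fine LSB."""
    return 250e6 + coarse * 1.0e6 + fine * 10e3


def calibrate(target_hz):
    """Find the coarse word first (fine = 0), then the fine word."""
    coarse = sar_search(lambda c: toy_vro_hz(c, 0), target_hz)
    fine = sar_search(lambda f: toy_vro_hz(coarse, f), target_hz)
    return coarse, fine


coarse, fine = calibrate(305.5e6)
print(coarse, fine, toy_vro_hz(coarse, fine))
```

Each 7-bit search needs only seven frequency measurements, which is one reason a bit-by-bit search
is a common choice when calibration time matters.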
Replacing the PLL frequency synthesizer (PLL-FS) with an open-loop LO is a challenging
but crucial task to enable the Tx low-power operation. PLL-FSs are not only power-hungry, but
demand a large area to implement an adequate loop filter, and exhibit long settling times which can
hamper performance in heavily duty-cycled applications [23].
Figure 4.2: BFSK modulation with a) small ∆f (LO phase noise buries f1 and f2) and b) large ∆f
(negligible LO phase noise effect in f1 and f2).
The open-loop LO proposed in this work exhibits short wake-up and turn-around times, and
enables fast modulation rates when the modulation is applied directly to the VRO. Unfortunately,
using an open-loop, free-running LO instead of a PLL-FS LO generally implies higher close-in phase
noise levels
(due to the lack of noise-suppressing loop gain [71]). Furthermore, the LO phase noise directly
appears in the transmitted signal [72]. Thus, high LO phase noise levels in the Tx could lead to
spectral corruption between the two tones (f1 and f2) used to represent the mark and space states
in the binary version of FSK used in the proposed Tx architecture.
In the worst case scenario, f1 and f2 might be indistinguishable on the receiver side. To
alleviate this concern, we have used a wideband FSK modulation. In this scheme, the frequency
deviation (∆f) of f1 and f2 from the center frequency (fc) is increased. Since the modulation
index (h) is proportional to ∆f (h = ∆f/DR, where DR is the data rate), h also increases,
which concentrates the transmitted signal power around fc ± ∆f instead of around fc [73, 74]. If
∆f is equal to the frequency offset from fVRO for which the phase noise of the VRO has already
rolled-off (e.g., 1-2 MHz), it is the far-out phase noise of the VRO that becomes important for
the adequate transmission and reception of f1 and f2. Unlike the close-in phase noise, the far-out
phase noise (at sufficiently large offset) of a well-designed free-running LO is not so different from
that of a narrow bandwidth PLL-FS LO (since it is typically dominated by the oscillator [75]). This
concept is key to the use of a digitally-calibrated, free-running LO in our ultra-low power, energy
efficient Tx and is illustrated in Fig. 4.2.
Admittedly, using wideband FSK degrades the effective spectral efficiency; however, many
IoT applications require the transmission of small payloads in heavily duty-cycled Txs (short
transmission bursts). Thus, in the proposed architecture, spectral efficiency is traded off against
power consumption to specifically address the needs of IoT applications without significantly
affecting the overall network capacity and performance, particularly when specific detection schemes
are implemented [76]. Choosing, for example, a ∆f of 1 MHz for a 1 Mbps data rate, our Tx output RF
signal occupies a bandwidth of 3 MHz. While this is higher than the typical sub-MHz bandwidth
used by FSK transmitters, the adopted scheme leads to high energy efficiency. Furthermore, a 3 MHz
bandwidth (8 MHz in our best test case @ 3 Mbps) is still small compared with the bandwidth
required when using more complex schemes such as UWB (ultra wideband). Moreover, when the
target data rates are <10 Mbps, as is the case for most of the IoT applications targeted by our
solution, using a UWB-like scheme is unnecessary and disproportionate, as it also increases the
complexity of the receiver.
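The modulation index and occupied bandwidth quoted above can be reproduced with a short
calculation. The index uses the definition given in the text (h = ∆f/DR). The bandwidth estimate
uses a Carson-type approximation, BW ≈ 2∆f + DR, which is an assumption here: the text does not
state how the 3 MHz figure was obtained, but this rule reproduces it for ∆f = 1 MHz and
DR = 1 Mbps.

```python
def modulation_index(delta_f_hz, data_rate_bps):
    """h = delta_f / DR, following the definition used in the text."""
    return delta_f_hz / data_rate_bps


def carson_bandwidth_hz(delta_f_hz, data_rate_bps):
    """Carson-type estimate of the occupied bandwidth (assumed rule)."""
    return 2 * delta_f_hz + data_rate_bps


h = modulation_index(1e6, 1e6)
bw = carson_bandwidth_hz(1e6, 1e6)
print(f"h = {h:.1f}, BW = {bw / 1e6:.1f} MHz")
```

Under this approximation, widening ∆f at a fixed data rate raises both h and the occupied
bandwidth, which is the spectral-efficiency cost accepted in exchange for tolerating the phase noise
of the free-running LO.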
4.3 Circuit Design
4.3.1 VRO Analysis and Design
As shown in Fig. 4.3, the vertical delay cell –which is the heart of the VRO– can be thought
of as an unwrapped version of the differential cascode voltage-switch-logic (DCVSL) delay cell.
Due to its small input capacitance, reduced switching noise, low power, and speed, the DCVSL
style is a natural candidate for circuits operating at high frequency. Its use in ring oscillators
and frequency dividers is well documented [17, 77, 78]. However, the DCVSL cell, like most
differential structures, has two mirrored paths that process the positive and negative inputs and
generate the differential output (Vdiff). Thus, for an output voltage swing of 1/2 Vdiff, a
single-ended implementation consumes roughly half the power of its differential counterpart.
Nonetheless, as enticing as halving the power is, the multiple benefits of differential signaling
generally outweigh this power-integrity trade-off; hence, differential signaling is preferable
despite its doubled power budget. To provide an alternative to this conundrum, we propose a delay
cell based on the DCVSL structure that uses a single path to ground but is also able to generate
differential output signals. As a result, power consumption is decreased while the characteristics
of differential signaling are preserved.
Figure 4.3: a) Typical DCVSL delay cell and b) Proposed vertical delay cell.
Intuitively, in the DCVSL cell of Fig. 4.3a, a packet of charge Q1 = CL·VDD is drawn from
VDD every time either of the driver transistors (MD1 and MD2) is turned off to represent a logic
high state at its corresponding output. Later on, when the input switches and the driver transistor
is turned on, Q1 is simply discarded to ground to represent a logic low state at the output. This
mechanism is continuously executed in the two branches forming the DCVSL cell. However, in
the presence of differential inputs, only one of the two outputs is a logic high at a given instant
(while the other is a logic low). Thus, it is conceptually possible to reuse Q1 (instead of
discarding it as in DCVSL) to represent the logic high state of the complementary output when the
branch that initially drew it needs to switch to a logic low state. Shown in Fig. 4.3b is the
proposed vertical cell, which operates using the Q1-reusing concept. By fully unwrapping the DCVSL
cell and making the necessary structural changes, the same packet of charge Q1 can be used to
represent two consecutive high states (one at each output). In the vertical delay cell, transistors
MD2 and ML2 are of the complementary type with respect to the DCVSL cell (Fig. 4.3). Under the
assumption of differential input signals vin− and vin+, the lower part of the vertical delay cell
(MD2 and ML2 in Fig. 4.3b) is able to accept Q1 at the precise instant that the upper part (MD1 and
ML1 in Fig. 4.3b) needs to discard it to switch states. Since there is only one charge packet
circulating at a time, the vertical delay cell enforces a differential operation while exploiting
verticality. To further understand the operation of the vertical delay cell, Fig. 4.4 illustrates
the two possible states that the cell can temporarily settle into. Both outputs experience a similar
voltage swing; however, the signals vout+ and vout− have different common-mode voltage levels due
to the vertical nature of the structure. The middle point of the vertical cell, Vmid, is set at
1/2 VDD. As a result, vout+ (upper output) can reach voltages between 1/2 VDD and VDD, whereas
vout− (lower output) might swing between GND and 1/2 VDD, yielding common-mode levels of 3/4 VDD
and 1/4 VDD for vout+ and vout−, respectively. For low power operation, 1/4 VDD is typically below
the transistor threshold voltage (VTH), and the vertical delay cell preserves the zero-quiescent
current feature (disregarding leakage) of its DCVSL predecessor.
Figure 4.4: Equivalent models for the vertical delay cell during its two possible states showing the
corresponding common-mode levels: a) Logic high (low) at vout+(vout−) and b) Complementary
logic low (high) at vout+(vout−).
Figure 4.5: Vertical delay cell waveforms and signal transitions details for a) τPHL and b) τPLH in
the vout+ (upper) output.
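The charge-reuse argument and the resulting common-mode levels can be put into numbers with a
first-order sketch. It follows the intuitive description above literally: the DCVSL cell draws one
packet Q1 per output per period, whereas the vertical cell draws a single packet that serves both
outputs. Leakage, short-circuit current and the reduced output swing of the vertical cell are
ignored, and the load capacitance and supply values are hypothetical.

```python
def supply_charge_per_period(c_load, vdd, reuse):
    """Charge drawn from VDD per period under the idealized packet picture."""
    q1 = c_load * vdd                 # Q1 = CL * VDD, as in the text
    return q1 if reuse else 2 * q1    # reuse: one packet serves both outputs


def output_common_modes(vdd):
    """Common-mode levels of vout+ and vout- with Vmid at VDD/2."""
    vmid = vdd / 2
    upper = (vmid + vdd) / 2          # vout+ swings between VDD/2 and VDD
    lower = (0.0 + vmid) / 2          # vout- swings between GND and VDD/2
    return upper, lower


C_LOAD = 10e-15   # hypothetical 10 fF load
VDD = 1.0         # hypothetical 1 V supply

ratio = (supply_charge_per_period(C_LOAD, VDD, reuse=False)
         / supply_charge_per_period(C_LOAD, VDD, reuse=True))
print(f"DCVSL / vertical supply charge ratio: {ratio:.1f}")
print("common-mode levels (vout+, vout-):", output_common_modes(VDD))
```

With a 1 V supply the two outputs sit around 0.75 V and 0.25 V, matching the 3/4 VDD and 1/4 VDD
levels stated above.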
Shown in Fig. 4.4a is the first state of the delay cell, where both inputs, vin− and vin+, are
at a 1/2 VDD level, which corresponds to a logic low for the upper section and a logic high for the
lower section. Under these conditions, the driver devices MD1−2 experience a close-to-zero voltage
between their gate and source terminals. This is represented by open paths between Vmid and vout+
and vout− in Fig. 4.4a. Due to the lack of a ground path, the CL capacitors are charged to VDD and
GND (vout+ and vout−, respectively) through the on-resistance of the latch devices ML1−2, which
operate in the triode region during this state. The second state in which the vertical delay cell
might operate is shown in Fig. 4.4b. In this case, inputs vin− and vin+ are at the opposite voltage
rails, VDD and GND, respectively. Depending on the VDD level and the threshold voltage of MD1−2,
these driver devices may operate in either the triode or the subthreshold region. For fast
switching, it is important to guarantee a VDD voltage high enough to put transistors MD1−2 in the
triode region (1/2 VDD > max(VTH)) while operating in the second state. Conversely, transistors
ML1−2 operate in saturation during this state to allow both output signals to reach 1/2 VDD. While
there is a weak path to ground in the second state, the large resistance of the saturated latch
devices still enables low power consumption of the delay cell.
Figure 4.6: VRO and its building blocks: a) vertical delay cell and b) RC delay tuning cell.
The delay of the cell in Fig. 4.3b can be estimated using the model originally proposed in
[79] and further expanded in [78] for DCVSL cells. Despite its vertical nature, the proposed cell
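inherits the well-known delay asymmetry from its DCVSL counterpart [78], leading to high-to-low
(τPHL) and low-to-high (τPLH) propagation delays given by (4.1) and (4.2).
[Equations (4.1) and (4.2): expressions for τPHL and τPLH; not recoverable from the extracted text.]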
In (4.1) and (4.2), CL is the load capacitor; αN and αP are technology-dependent parameters
for N- and P-type devices, respectively [79]; IDON and IDOP are the drain currents of the N and
P transistors when VGS = VDS = VDD; vTN and vTP are the VTH-to-VDD ratios (VTHN/VDD and
VTHP/VDD, respectively); and Ttn and Ttp are the transition times for rising-input and falling-input
signals approximated in (4.3) and (4.4) [79]. Ttn and Ttp are calculated using IDON and IDOP and
[Equations (4.3) and (4.4): approximations for Ttn and Ttp; not recoverable from the extracted text.]
Also in (4.1) and (4.2), KHL and KLH are correction factors given by (4.5) and (4.6),
respectively, and calculated using empirical parameters (γN,P and ζN,P) obtained from simulations
[78].
Finally, WN/WP denotes the N-to-P width ratio of the delay cell (for the same length).
[Equations (4.5) and (4.6): correction factors KHL and KLH, with (4.5) written in the form 1/KHL;
remainder not recoverable from the extracted text.]
In the delay expressions for the upper output at vout+ in Fig. 4.3b, τPHL includes only the
time it takes a rising input (Ttn) at vin− to turn on MD1 such that Vmid is propagated to vout+
(which represents a logic-low level). Conversely, τPLH at vout+ has two components [78]: t1, which
is the time it takes for a rising input at vin+ to increase the resistance of MD2 such that vout−
toggles, and t2, which is the time it takes a falling edge in vout− to reduce the ML1 resistance
such that VDD propagates to vout+ (representing a logic-high level). Unlike the DCVSL cell case,
where t1 is due to an NMOS and t2 is due to a PMOS, both τPLH components (at vout+) in the vertical
cell are due to P-type devices. Hence, note that the expression for τPLH (4.2) in the vertical delay
cell is different from its DCVSL counterpart [78].
Fig. 4.5 shows the waveforms and details the signal transitions for the delay quantification at
the vout+ output. Due to the duality of the upper and lower parts of the delay cell, the delays
(τPHL and τPLH) for the lower output (vout−) exhibit the opposite behavior to those in the upper
section of the cell, meaning that for vout−, τPHL is longer than τPLH (contrary to vout+). The

























To mitigate the delay asymmetry, two tiny helper transistors can be used to accelerate the
transition of the lagging delay path (τPHL for vout−, and τPLH for vout+). The final form of the
vertical delay cell is shown in Fig. 4.6a. For a low-to-high transition at vout+ to occur (going from
the second to the first state, as shown in Fig. 4.4 b→a), ML2 needs to first switch to the triode
region in order to eventually turn on ML1. To quickly push ML2 into the triode region without
increasing the size of MD1 (which would increase the input capacitance), transistor MB1 in Fig. 4.6a
takes advantage of the falling input at vin− (which originally triggered the low-to-high transition)
to help raise the ML2 gate voltage. Similarly, for a high-to-low transition at vout−, ML1 needs to
switch to the triode region for ML2 to fully turn on and propagate GND to vout−. In this case, MB2
uses the rising input at vin+ to contribute to the discharge of the ML1 gate. A previous cell that
also exploited verticality (albeit without formally identifying it) was presented in [80]. The
oscillator presented in [80] relies fully on a leakage path to define the delay per stage, which
makes it less suitable for applications where fine delay control is required. Conversely, the delay
cell used in the VRO relies on a controlled transition in the region of operation of the driver
devices to shift its state and, in turn, determine its delay.
Three vertical delay cells form the VRO. While the delay cell is differential and it is possible
to use an even number of stages to build a ring-oscillator, due to the different common-mode levels
of the differential outputs, an odd number is employed to avoid the use of an inter-stage level
shifter. Fig. 4.6b shows the tuning circuit used to achieve a wide delay range of operation. A
resistor-capacitor combination is used to modulate the total delay. The coarse control is obtained
via a voltage-controlled resistor implemented with a PMOS (Rtune−up) and an NMOS (Rtune−dw).
Using NMOS and PMOS for coarse control in the upper and lower part of the circuit helps to
compensate for the different common-mode levels of vout+ and vout−. The fine control is provided
by a voltage-controlled varactor (Ctune) in series with a fixed capacitor CC. Resistor Rb is used
to independently set the varactor’s bias point to maximize the linear frequency control region. It
will be shown in Section 4.3.3 that the coarse and fine tuning voltages are derived from a digital
calibration algorithm which minimizes the VRO frequency error.
The implemented tuning scheme desensitizes the fine-tuning slope to voltage and/or temperature
variations, reducing the re-calibration time. Thus, to compensate for such variations, it is
sufficient to run the calibration at the channel center frequency (f0), while the spacing of the fine
points from the center remains unchanged (e.g., the fine calibration words related to the f1 and f2 tones in
FSK). This is achieved by implementing the tuning capacitors (CC and Ctune) with back-to-back
PMOS devices and a MOS varactor (Fig. 4.6b), both of which have the same temperature dependence,
creating a constant tuning slope. This arrangement also cancels the variation of the capacitance
(CC) with supply and varactor biasing. All the control voltages for the tuning circuit in Fig. 4.6b
are generated using a resistive string-based DAC. These tuning voltages are referred to the VRO
supply, allowing for supply voltage changes to be mapped to the tuning circuit. Thus, the varactor
Table 4.1: Delay variations of the VRO stage across process corners






bias (Vb) and fine control (Vfine) track the VRO supply to the first order, and the voltage differ-
ence across the tuning capacitors (Ctune) is kept constant. In this way, the delay fluctuation due to
unexpected variations in the VRO supply is minimized. The linear range of the fine tuning
is designed to cover multiple coarse tuning steps, providing linear tuning under any operating
condition. The simulated variation of the fine-tuning slope is less than 1.5% across temperature, which is
key to reducing the calibration time, as discussed in the digital calibration scheme (Section
4.3.3).
Although the VRO is a custom design targeted to operate in the ecosystem of our Tx architecture,
it is useful to compare its power consumption with that of LO generators used in previous works
that also target low power consumption. For example, the PLL in [65]
(40 nm node) consumes 520 µW to generate a high-quality reference at 450 MHz; similarly, the
synthesizer in [69] (0.13 µm node) needs 1.7 mA to generate a 900 MHz reference. Conversely,
simulation results show that our VRO (0.18 µm) needs only about 135 µW to generate a 300 MHz
carrier. Although the LO requirements in [65], [69], and our work are different, the comparison
indicates that the VRO is a good design alternative for our proposed solution.
Due to its nature, the VRO is susceptible to variations in the frequency (delay per stage) when
the oscillator operates at different temperatures, under a non-regulated supply voltage, or simply
experiences process variations. To quantify these variations, the delay-per-stage of the VRO is
simulated a) in 100 process+mismatch Monte Carlo runs (Fig. 4.7); b) across selected process
corners (Table 4.1); c) under the assumption that its supply voltage varies ±10% from a nominal 1.4
V supply (Fig. 4.8); and d) across the typical temperature range of operation (Fig. 4.9). The
Figure 4.7: Delay variations due to local mismatches for the VRO stage
results of the Monte Carlo analysis show that a vertical delay cell within a three-stage VRO (e.g.,
a cell loaded with another identical cell) has a delay distribution centered on 556 ps (for an estimated
fVRO of ~300 MHz) with a standard deviation of 85.7 ps, which maps to an estimated error in the
uncalibrated fVRO of ~40-50 MHz (an error that can be corrected during our calibration).
Table 4.1 compares the delay per stage across common process corners (TT, SS, FF, SF,
and FS). As expected, the extreme SS and FF corners lead to large deltas in the delay per stage
with respect to the typical (TT) case. One way to account and compensate for large differences
at such extreme corners is to modulate the supply voltage accordingly. While an unregulated
supply voltage can lead to undesired variations in the delay and oscillation frequency, nowadays
it is customary to employ an LDO (low-dropout) regulator to provide accurate supply voltage to
the oscillation core and avoid supply-induced variations in the oscillation frequency. The delay
variation of the VRO versus supply voltage is shown in Fig. 4.8. In the scenario where the supply
variation is restricted to +/-25 mV (which can be guaranteed by an LDO with sufficient loop gain),
the delay variation is reduced to ~15% from the delay value at nominal supply. The LDO design
has been omitted in our proposed solution and is outside the scope of this work. To complete the
delay-cell characterization, the delay vs. temperature curve is shown in Fig. 4.9.
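The mapping from delay per stage to oscillation frequency quoted above can be checked with a short calculation. The sketch below assumes the usual first-order ring-oscillator relation f = 1/(2·N·td) with identical stages, which the text does not state explicitly; the ±1σ shift is applied to all three stages at once as a rough bound.

```python
# Minimal sketch: delay per stage -> ring-oscillator frequency.
# Assumption (not stated in the text): f = 1 / (2 * N * t_d), identical stages.

N_STAGES = 3          # three vertical delay cells form the VRO
T_DELAY = 556e-12     # mean delay per stage from the Monte Carlo runs, in s
SIGMA = 85.7e-12      # standard deviation of the delay per stage, in s


def ring_frequency(delay_per_stage, n_stages=N_STAGES):
    """One oscillation period is two traversals of the ring."""
    return 1.0 / (2 * n_stages * delay_per_stage)


f_nom = ring_frequency(T_DELAY)
f_slow = ring_frequency(T_DELAY + SIGMA)   # all stages one sigma slower
f_fast = ring_frequency(T_DELAY - SIGMA)   # all stages one sigma faster

print(f"nominal fVRO : {f_nom / 1e6:.1f} MHz")
print(f"-1 sigma     : {(f_nom - f_slow) / 1e6:.1f} MHz below nominal")
print(f"+1 sigma     : {(f_fast - f_nom) / 1e6:.1f} MHz above nominal")
```

Under these assumptions the nominal frequency lands close to 300 MHz and the one-sigma excursions are a few tens of MHz, in line with the ~40-50 MHz uncalibrated error quoted in the text.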
Figure 4.8: Delay variations due to ±10% supply variation for the VRO stage
Figure 4.9: Delay variations due to temperature variation for the VRO stage
4.3.2 Edge Combiner Power Amplifier
Intuitively, the use of a current recycling VRO reduces the overall power consumption of the
Tx. However, a careful interfacing between the VRO and the PA is required to fully leverage the
ultra-low power characteristics of the VRO. As illustrated in Fig. 4.6, the vertical nature of the
delay cells used in the VRO results in differential output signals with similar voltage swings but
Figure 4.10: Edge-combiner PA and the pre-amplifier used to interface with the VRO
centered on different common-mode voltages (VCM). The circuit schematics of the VRO buffer
acting as a swing restoring stage and the edge-combiner PA are shown in Fig. 4.10.
Out of the six 60°-spaced phases available from the VRO, three have a 1/2VDD to VDD swing,
and the other three swing between GND and 1/2VDD. To simplify the design of the edge combiner
and increase the isolation between VRO and PA, the ac-coupled, single-ended, pseudo-differential
buffer shown in Fig. 4.10 is used. The ac-coupling removes VCM from the VRO outputs. Using
RCM to independently bias the buffer for Class-B operation maximizes the output swing while
minimizing the quiescent current. Furthermore, using the complementary phases of the VRO
at every buffer doubles the signal swing available at the input of the edge-combiner PA, which
minimizes the on-resistance of the PA switch transistors.
It is important to remark that while the buffer in Fig. 4.10 is used to remove the different
common-mode levels of the VRO outputs, it doubles up as the important isolation stage between
LO and PA used in most Tx designs to avoid frequency pulling/pushing effects. As specialized and
customized as this buffer is, its presence is motivated not only by the VRO characteristics but also
by the need to increase isolation between the LO generation stage and the PA.
Unlike [23], which uses a 9x frequency multiplication factor, we have used a 3x factor in the
edge-combiner to reduce the effect of delay mismatches in the VRO at the PA output. Mismatches
across the delay cells introduce spurious tones at the PA output at an offset of fVRO with respect
to fRF. In our implementation, due to the 3x multiplication factor, the potential added spur would
appear ±300 MHz away from fRF. Thus, this potential mismatch-induced spur can be heavily
attenuated at the PA output. While it might be argued that the low reference frequency required
in edge-combiners using larger multiplication factors [23, 81] enables further reduction in the LO
power consumption, this does not necessarily hold when an ultra-low power delay cell is used
to generate a higher frequency reference. For the PA termination, an inductor-tapped matching
network transforms the 50 Ω antenna impedance into a target impedance at the drain of the PA
switches of ~2.5 kΩ. This impedance transformation improves the efficiency by allowing a larger
voltage swing to develop at the PA drain, with a maximum swing equal to the PA supply voltage
to avoid using expensive extended drain devices.
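The 3x multiplication performed by the edge combiner can be illustrated with a purely logical model. The sketch below is a Boolean abstraction, not the transistor-level PA of Fig. 4.10: it assumes ideal 50%-duty square waves and an assumed pairing of phases spaced 120° apart, each pair contributing one 60°-wide pulse per VRO period. It also evaluates the offset at which a mismatch-induced spur would appear.

```python
# Minimal sketch of edge combining (logical model only, hypothetical pairing).

def phase(theta_deg, offset_deg):
    """Ideal 50%-duty square wave: high for 180 degrees after its offset."""
    return ((theta_deg - offset_deg) % 360) < 180


def edge_combiner(theta_deg):
    """OR of three pairwise ANDs of phases spaced 120 degrees apart."""
    p0, p120, p240 = (phase(theta_deg, off) for off in (0, 120, 240))
    return (p0 and p120) or (p120 and p240) or (p240 and p0)


def count_rising_edges(signal):
    """Rising edges over one VRO period, sampled at 1-degree steps."""
    samples = [signal(theta) for theta in range(360)]
    # samples[-1] wraps around, so the period is treated as circular
    return sum(1 for i in range(360) if samples[i] and not samples[i - 1])


f_vro = 300e6
multiplication = count_rising_edges(edge_combiner)
f_rf = multiplication * f_vro
spur_freqs = (f_rf - f_vro, f_rf + f_vro)   # mismatch spurs at fRF +/- fVRO

print(multiplication, f_rf / 1e6, [f / 1e6 for f in spur_freqs])
```

With a 300 MHz VRO the combined output toggles three times per VRO period, and the potential spurs sit 300 MHz away from the carrier, where the output matching network can attenuate them.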
4.3.3 Digital Frequency Calibration
The characteristics of the wideband BFSK modulation employed allow for an LO generation
based on an open-loop VRO . While this approach reduces the LO power consumption, ring-
oscillators tend to suffer large frequency drifts due to temperature and process variations [82].
Thus, some form of frequency correction is necessary. The digitally-assisted calibration scheme
implemented for this purpose is shown in Fig. 4.1. The adopted frequency correction concept
relies on the availability of an accurate on-chip reference frequency with period TREF (1/fREF)
and a divided-by-8 version of fVRO (fVRO/8). An Altera® Cyclone-V FPGA compares the period
(TVRO/8) of the fVRO/8 signal against the TREF period to determine whether fVRO is above or below
target. After resolving the sign of the error, a binary search algorithm implemented in the
FPGA is employed to speed up or slow down the VRO.
Two 7-bit resistive string-based digital-to-analog converters (DACs) are used to translate Fine[6:0]
Figure 4.11: Digital calibration flow chart showing calibration duration under different operating
conditions.
and Coar[6:0] into analog voltages. The DAC outputs are applied to the variable RC-delay
circuit of Fig. 4.6b to tune fVRO. Note that while we have used an FPGA for the digitally-assisted
calibration approach to demonstrate our ultra-low power Tx concept, the synthesis and integration
of binary search algorithms have been previously demonstrated [83] and can be seamlessly inte-
grated for a fully monolithic solution. It is estimated that an on-chip state machine implementing
the calibration consumes an average of 200 µW (required only during the calibration phase). This
average is estimated via a simulated digital 7-bit counter running at max VRO frequency +20%
extra margin for combinational logic.
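The counting-based binary search described above can be sketched in a few lines. This is an illustrative successive-approximation loop, not the FPGA implementation: the tuning model `vro_frequency`, its 270 MHz starting point, and the use of a mid-scale fine code during the coarse search are all assumptions made for the example (the 600 kHz and 60 kHz step sizes are taken from the measured values in Section 4.4).

```python
# Minimal sketch of a counting-based binary search (hypothetical tuning model).

F_REF = 32_768.0      # Hz, crystal reference (Section 4.3.5)
DIVIDER = 8           # fVRO is divided by 8 before being counted
BITS = 7              # width of the Coar[6:0] and Fine[6:0] words
F_TARGET = 300e6      # Hz, desired fVRO


def vro_frequency(coarse, fine):
    """Assumed monotonic tuning law; NOT the measured tuning curve."""
    return 270e6 + 600e3 * coarse + 60e3 * fine


def count_cycles(f_vro):
    """Whole cycles of fVRO/8 that fit in one reference period TREF."""
    return int(f_vro / DIVIDER / F_REF)


def binary_search(measure, target_count, bits=BITS):
    """Set bits from MSB to LSB, keeping each bit that does not overshoot."""
    code, windows = 0, 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        windows += 1                      # one counting window per decision
        if measure(trial) <= target_count:
            code = trial
    return code, windows


target = count_cycles(F_TARGET)
coarse, w_coarse = binary_search(
    lambda c: count_cycles(vro_frequency(c, 64)), target)
fine, w_fine = binary_search(
    lambda f: count_cycles(vro_frequency(coarse, f)), target)
f_final = vro_frequency(coarse, fine)

print(coarse, fine, w_coarse + w_fine, (f_final - F_TARGET) / 1e3, "kHz")
```

Two 7-bit searches take 14 counting windows in this model. With a single TREF window per decision the model only resolves fVRO to within one count of the divided clock (8·fREF), so it does not attempt to reproduce the accuracy figures quoted in the text.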
To perform a BFSK transmission, four 7-bit calibration words (Coarf0, Finef0, Finef1, and
Finef2) are required to maintain an accuracy of ±fREF on the corrected fVRO (300 MHz) signal.
They represent the center frequency of the channel and the two BFSK tones (f1 and f2). The flow
chart of the calibration is shown in Fig. 4.11. Due to the immunity of the fine tuning circuit to
temperature and voltage variation, only the center frequency needs calibration pre-transmission.
Table 4.2 summarizes the calibration time required and number of counting cycles for multiple
operation conditions.











±12°C/±5 mV, ∆fe ≤ 20 MHz
Operation outside ±12°C/±5 mV, ∆fe > 20 MHz: 14, 437.5
Table 4.3 compares a typical state-of-the-art PLL transmitter [84] with the proposed
PLL-less approach. The PLL clearly adds significant overhead to the
overall energy per transmission cycle. The PLL-less approach provides a more than 20-fold
improvement in the energy efficiency per transmission cycle. A low-frequency crystal oscillator
is assumed to be always on for other timing functionalities in deep-sleep/standby modes; thus, it
is not included in the total operation time and energy for the PLL-less approach. The PA power
consumption is assumed to be the same for both approaches.
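The improvement factor follows directly from the totals listed in Table 4.3. The short calculation below uses only those totals (in the units of the table) and pairs the extremes of the two ranges to bound the ratio.

```python
# Minimal sketch: bounds on the energy ratio from the totals of Table 4.3.

PLL_ENERGY = (16.6, 17.8)          # total energy range, PLL approach
PLL_LESS_ENERGY = (0.631, 0.732)   # total energy range, PLL-less approach

ratio_min = PLL_ENERGY[0] / PLL_LESS_ENERGY[1]   # least favourable pairing
ratio_max = PLL_ENERGY[1] / PLL_LESS_ENERGY[0]   # most favourable pairing

print(f"energy ratio between {ratio_min:.1f}x and {ratio_max:.1f}x")
```

Even the least favourable pairing of the two ranges stays above a factor of 20.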
4.3.4 Low Power, Monotonic Digital to Analog Converters
The resistive-string-based DAC used is shown in Fig. 4.12. Two DACs are used to generate
the Vfine, Vcoarse_up and Vcoarse_dw input signals for the VRO tuning circuit. While the Vfine DAC
operates in a standalone fashion, the Vcoarse_up and Vcoarse_dw signals are the outputs of the same
DAC string to save power and share a 1/2VDD voltage reference to define the maximum and min-
imum levels of the full-scale voltage, respectively. The different full-scale ranges for the up and
down coarse controls compensate for the different VCM in the upper and lower loops in the VRO ,
keeping symmetry on the delay added by the tuning stage on both upper and lower loops. The
intrinsic monotonicity and temperature insensitivity (the output voltage is a function of resistor ratios) of this
DAC structure make it an ideal candidate for our application. Furthermore, a segmented, two-stage
resistor string [85] decreases the number of resistors required to implement every 7-bit DAC from
Table 4.3: Comparison between PLL and PLL-less approaches
Approach          Duration     Power   Energy
PLL (total)       1800-2100    –       16.6-17.8
PLL-less (total)  1000-1300    –       0.631-0.732
128 to only 24. In this case, the first DAC stage resolves the 3 MSBs (N=3) and the second stage
resolves the 4 LSBs (M=4). Due to the direct, unbuffered connection between the two resistor
segments [86], the resistor string corresponding to the LSBs appears in parallel with the selected
resistor from the MSBs string. For R = R1,1−N = R2,1−M, the effective selected resistor in the MSB
string becomes (15/16)R, which represents an induced error of 1 LSB (of the second DAC segment).
Most importantly, this slight deviation from the ideal R and its equivalent voltage drop neither
impacts the monotonicity of the DAC nor jeopardizes the convergence of the search algorithm. Note
that while the DAC output impedance changes for every code, this is not a concern since the DACs
drive purely capacitive loads in the tuning circuit (Fig. 4.6). By guaranteeing monotonicity, the
selected DAC structure meets the first of the two most important characteristics for our application,
with the second being low power consumption. To achieve the latter, R is selected to be ~30 kΩ.
The required 9 to 2 and 16 to 1 MUXes are based on a transmission-gate implementation, and the
digital logic to decode both MUX inputs only needs to operate at low frequencies (equal to the data
Figure 4.12: Resistive-string-based DACs used to generate the tuning voltages for the RC delay
tuning cell.
rate).
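The monotonicity argument for the unbuffered two-stage string can be checked numerically. The sketch below is an idealized model with unit resistors and ideal switches. The number of unit resistors placed across the selected MSB resistor is kept as a parameter `n_lsb_res` (a name introduced here): the quoted (15/16)R corresponds to fifteen unit resistors in parallel with the selected one, while a 24-resistor count would correspond to sixteen, so both cases are evaluated.

```python
# Minimal sketch: segmented, unbuffered resistor-string DAC (idealized model).

def effective_msb_resistor(n_lsb_res, r=1.0):
    """Selected MSB resistor in parallel with the LSB string."""
    return r * (n_lsb_res * r) / (r + n_lsb_res * r)


def dac_output(code, n_lsb_res=15, vref=1.0, msb_bits=3, lsb_bits=4):
    """Output voltage for a 7-bit code (3 MSBs pick a segment, 4 LSBs a tap)."""
    k = code >> lsb_bits                  # selected MSB segment
    j = code & ((1 << lsb_bits) - 1)      # selected tap on the LSB string
    r = 1.0
    r_sel = effective_msb_resistor(n_lsb_res, r)
    total = ((1 << msb_bits) - 1) * r + r_sel   # loaded string resistance
    v_bottom = vref * k * r / total
    v_top = vref * (k * r + r_sel) / total
    return v_bottom + (v_top - v_bottom) * j / n_lsb_res


def is_monotonic(n_lsb_res):
    levels = [dac_output(code, n_lsb_res) for code in range(128)]
    return all(b > a for a, b in zip(levels, levels[1:]))


print(effective_msb_resistor(15), is_monotonic(15), is_monotonic(16))
```

In this model the loading changes the step size at segment boundaries but never reverses the order of the output levels, which is the property the binary search relies on.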
4.3.5 Temperature Insensitive Biasing for Crystal Oscillator
A Pierce oscillator circuit (Fig. 4.13) is implemented using an external 32.768 kHz crystal.
The crystal oscillator (XO) output signal fREF is used by the FPGA to calculate the frequency
error in the fVRO signal. In Fig. 4.13, transistors M1 through M5 operate in the subthreshold
region, and together with R1 form a bias current (ibias) generator based on the mutual mobility
and threshold voltage temperature compensation principle [87]. In this approach, it is possible to
carefully size M5 (240 nm/40 µm) to push it into deep subthreshold, such that it operates around its
zero-temperature-coefficient region when biased with a constant voltage (via M1-M4 subthreshold
divider).
Figure 4.13: Crystal oscillator schematic including bias current generation and buffering stage.
This circuit yields a nearly-temperature-stable current ibias that is mirrored to the core of the
crystal oscillator (XO core). Since the crystal frequency gets pulled by a few ppm with load
capacitance changes, this bias current leads to a constant Cgs with temperature, thus decreasing the
frequency sensitivity to temperature. As a result, the counting window (TREF) varies by less than
1 count (TVRO), maintaining the calibration accuracy despite temperature variation (more details
in Section 4.4). A 50 nA current ibias is used to provide negative resistance at the XO core five
times larger than the crystal losses. This guarantees the XO start-up and a consistent oscillation
frequency across temperature variations. The stabilized XO signal is further amplified and buffered
via the structure formed by M9-M15 and the two buffers in the XO amplification and buffer stage
in Fig. 4.13. A 12:1 scaled replica of the XO core is used for the generation of a reference voltage
(Xref ) used in the XO amplification stage. Xref tracks the operating point of the XO-stabilization
inverter across PVT variations. Since a fully differential operation is not provided with this Pierce
topology, the selected amplification and buffering stage meets the phase noise, low power and PVT
resilience needs of our application.
Figure 4.14: Die microphotograph of the Tx.
4.4 Measurement Results
The Tx was implemented and fabricated in 0.18 µm CMOS technology and encapsulated in a
QFN56 package. The die microphotograph is shown in Fig. 4.14. While the total die area is 2 mm × 2
mm, the active area occupied by the Tx is only 0.112 mm², including buffers for testing. The
measurement results for the achievable tuning range of the VRO are shown in Fig. 4.15. A third
degree of frequency control is added (beside coarse and fine tuning) by varying the VRO supply.
As a result, the total observed range is between 50 MHz and 350 MHz with a maximum supply of
1.75 V. Furthermore, since fVRO is effectively multiplied by 3 at the PA output, the VRO opens the
possibility for a self-contained Tx operating from 150 MHz to 1.05 GHz, with proper multi-band
matching and antenna.
A single tuning curve is able to cover process, temperature (-10 to 100 °C) and VRO sup-
ply variation (up to ±10 mV). Using the coarse control word, it is possible to discretely tune the
frequency in 600 kHz steps (coarse LSB). Conversely, the fine control word allows a 60 kHz/step
programmability. The measured frequency stability against temperature of the XO is shown in Fig.
4.16. Due to the ibias characteristics, there is only a total variation of 7 Hz from −10 °C to 100 °C,
Figure 4.15: VRO measured tuning range 50–350MHz translates into a PA RF frequency of
0.15–1.05 GHz.
which accounts for a total accuracy of 213 ppm and 3 ppm/°C at temperatures > 30 °C. As a result,
the counting window (TREF) varies by less than 1 count (TVRO) in this range, which maintains
the calibration accuracy despite temperature variation.
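The link between the measured 7 Hz drift and the "less than 1 count" statement can be made explicit. The calculation below assumes the counter tallies cycles of fVRO/8 (with fVRO at 300 MHz) during one period of the 32.768 kHz reference, and uses a first-order approximation for the change in the window length.

```python
# Minimal sketch: XO drift in ppm and its effect on the counting window.

F_XO = 32_768.0     # Hz, nominal crystal frequency
DELTA_F = 7.0       # Hz, total measured drift from -10 C to 100 C
F_VRO = 300e6       # Hz, assumed VRO frequency during calibration
DIVIDER = 8         # fVRO is divided by 8 before being counted

drift_ppm = DELTA_F / F_XO * 1e6
counts_per_window = F_VRO / DIVIDER / F_XO
# First order: a fractional change in TREF shifts the count by the same fraction.
count_shift = counts_per_window * DELTA_F / F_XO

print(f"drift        : {drift_ppm:.1f} ppm")
print(f"counts/window: {counts_per_window:.1f}")
print(f"count shift  : {count_shift:.2f} counts")
```

Under these assumptions the full-range drift moves the count by roughly a quarter of a count, consistent with the statement above.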
The maximum Pout available at the antenna was measured to be −10 dBm as shown in Fig. 4.18.
It also shows the BFSK f1 and f2 tones with 2 MHz ∆f . In this case, the inter-tone interference
is 42 dB below f1 and f2. The phase noise of the VRO and the Tx carrier are shown in Fig.
4.17a and b, respectively. While the close-in phase noise has the typical flat-looking shape of
free-running oscillators, the measured phase noise at 1 MHz offset is –106 dBc/Hz in the case
of the VRO running at 300 MHz, and –94 dBc/Hz in the case of the PA carrier output at 3fVRO
(900 MHz). To test the Tx, a bit pattern was transmitted at 1 and 3 Mbps data rates. The time
representation of such bit patterns received using a signal analyzer is shown in Fig. 4.19a. The
correct reception of the patterns shows that using the VRO for LO purposes does not introduce
significant interference between f1 and f2 due to the ∆f employed and validates the far-out phase
noise design assumption made (spectral efficiency vs. power consumption trade-off). The eye-
diagram in Fig. 4.19b is measured at 3 Mbps and -10 dBm output power with 2 MHz tone spacing,
showing a clear eye opening at such data rates. While a general rule of thumb on selecting ∆f in
Figure 4.16: XO frequency stability across temperature.
our Tx architecture is to avoid the 1/f phase noise region, it can be shown [88] that a frequency
discriminator can sustain higher phase noise if ∆f is increased while maintaining constant BER.
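The two phase-noise figures quoted above can be related through the standard rule for ideal frequency multiplication, under which phase noise at a given offset rises by 20·log10(N). The sketch below applies that rule with N = 3; the rule itself is a textbook assumption and is not stated in the text, and the remaining difference is simply reported, not attributed to any particular block.

```python
# Minimal sketch: phase-noise scaling under ideal x3 frequency multiplication.
import math


def multiplied_phase_noise(pn_dbc_hz, factor):
    """Ideal multiplication by `factor` adds 20*log10(factor) dB."""
    return pn_dbc_hz + 20 * math.log10(factor)


PN_VRO = -106.0       # dBc/Hz at 1 MHz offset, VRO at 300 MHz (measured)
PN_CARRIER = -94.0    # dBc/Hz at 1 MHz offset, carrier at 900 MHz (measured)

pn_ideal = multiplied_phase_noise(PN_VRO, 3)
excess_db = PN_CARRIER - pn_ideal

print(f"ideal carrier phase noise: {pn_ideal:.2f} dBc/Hz")
print(f"measured minus ideal     : {excess_db:.2f} dB")
```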
The setup used to obtain the eye diagram is also used for the FPGA-based binary search
for f1 and f2; the setup is shown in Fig. 4.20a and the results are shown in Fig. 4.21. The
bit error rate (BER) setup in Fig. 4.20b compares long data streams from the Tx and Rx using
computer software. The BER tests are performed using a SiLabs receiver (Si446x-C) [84] with a
PRBS7 data stream of 1 million bits. BER is less than 0.1% at a distance of 3 m and less than 2%
at a distance of 10 m with a wall in between with Pout of –15 dBm at 100 kbps to match transmit-
ted signal to the maximum receiver bandwidth. To illustrate the effect of close-in phase noise in
open loop transmission, the BER is measured at different frequency deviation with different data
rates. The results are shown in Fig. 4.22, which agree with measurements in Fig. 4.18, where
small tone spacing will lead to high inter symbol interference degrading the BER due to spectral
corruption. It is possible to increase the transmission distance with increased output power but
this will limit the node compatibility with energy harvesting modules for a no-battery operation.
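The BER comparison described above amounts to generating a known PRBS7 stream, capturing the received bits, and counting mismatches. The sketch below assumes the common PRBS7 polynomial x^7 + x^6 + 1 (the text does not state which polynomial was used) and injects artificial errors in place of a real receiver capture.

```python
# Minimal sketch: PRBS7 generation and BER computation (software model only).

def prbs7(n_bits, seed=0x7F):
    """Fibonacci LFSR for x^7 + x^6 + 1 (assumed polynomial), period 127."""
    state = seed & 0x7F
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def bit_error_rate(sent, received):
    """Fraction of positions where the two streams differ."""
    if len(sent) != len(received):
        raise ValueError("streams must have the same length")
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)


tx_bits = prbs7(1_000_000)
rx_bits = list(tx_bits)
for i in range(0, len(rx_bits), 2000):   # hypothetical: flip 500 bits
    rx_bits[i] ^= 1

ber = bit_error_rate(tx_bits, rx_bits)
print(f"BER = {ber:.4%}")
```

With one million bits, a BER below 0.1% corresponds to fewer than 1000 bit errors in the captured stream.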
Figure 4.17: Phase noise of a) VRO with fVRO of 300 MHz for -106 dBc/Hz @ 1 MHz and b) Tx
carrier at 900 MHz with -94 dBc/Hz @ 1 MHz.
Due to its wide operating range, when paired with the corresponding baseband circuitry, the proposed
Tx is compatible with standards (with relaxed BW limits) such as Wireless MBUS, Konnex-RF,
IEEE 802.15.4g-SUN and IEEE 802.15.4k-LECIM, which use M-FSK/GFSK/MSK/GMSK
modulations. The implemented fine tuning curve steps are small enough to generate any required
frequency shift patterns (e.g., GFSK) for enhanced spectral efficiency.
The total current consumption of the Tx for a −15 dBm Pout is 363 μA from 1.65 V (VRO),
and 1.8 V (PA and digital circuitry). The power consumption breakdown is shown in Fig. 4.24.
Figure 4.18: BFSK tones at the PA output for the 2 MHz frequency deviation case.
Under these conditions the Tx energy efficiency is 0.2 nJ/bit. Furthermore, the normalized Tx
efficiency at the maximum Pout is 3.1 nJ/(bit·mW). The output power vs. PA supply voltage is shown
in Fig. 4.23, adding a degree of freedom in power control. To put these results in context, Table 4.4
shows the measurement results summary and the performance comparison with other state-of-the-art,
low-power transmitters. Note that [23] and [64] have absolute power consumptions lower than our work.
However, [23] operates at a single frequency (400.5 MHz) due to the direct injection locking from
a high frequency crystal oscillator (44.5 MHz) with a maximum 200 kbps data rate. The reported
Pout in [23] is −17 dBm. Similarly, [64] operates at only -28.6 dBm output power. Conversely,
we have measured the energy efficiency at 3 Mbps for both −10 dBm and −15 dBm Pout. The 0.2
nJ/bit energy efficiency is superior to most works in Table 4.4. Furthermore, the 3.1 nJ/(bit·mW)
normalized energy efficiency at −10 dBm Pout is the best amongst the compared works. Moreover,
the proposed Tx has the largest available tuning range, which potentially enables a wide operating
range while maintaining its low power consumption characteristics.
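The two efficiency metrics can be tied back to the measured current and supply values. Since the split of the 363 µA between the 1.65 V and 1.8 V supplies is not given here, the sketch below bounds the energy per bit by assuming all the current flows from one supply or the other. The energy per bit at −10 dBm is derived from the normalized figure and is not a separately reported measurement.

```python
# Minimal sketch: energy per bit and normalized energy efficiency.

I_TOTAL = 363e-6            # A, total Tx current at -15 dBm Pout
V_MIN, V_MAX = 1.65, 1.8    # V, VRO supply and PA/digital supply
DATA_RATE = 3e6             # bit/s


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


# Bounds: all current drawn from the lower or from the higher supply.
e_bit_low = I_TOTAL * V_MIN / DATA_RATE
e_bit_high = I_TOTAL * V_MAX / DATA_RATE

# Normalized efficiency is energy per bit divided by output power in mW,
# so 3.1 nJ/(bit*mW) at -10 dBm implies the following energy per bit.
e_bit_at_m10dbm = 3.1e-9 * dbm_to_mw(-10)

print(f"{e_bit_low * 1e9:.3f} to {e_bit_high * 1e9:.3f} nJ/bit at -15 dBm")
print(f"{e_bit_at_m10dbm * 1e9:.2f} nJ/bit implied at -10 dBm")
```

Both bounds round to the 0.2 nJ/bit quoted for the −15 dBm case.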
Figure 4.19: a) Transmitted bit pattern (1010110011110000) at 1 Mbps and 3 Mbps received in a
signal analyzer and b) Eye diagram using a PRBS7 pattern at 3 Mbps and 2 MHz tone spacing.
4.5 Summary and Conclusion
This work presented an ultra-low power, fast start-up, PLL-less and energy efficient transmitter
for IoT applications in the 900 MHz ISM band with data rates of up to 3 Mbps. A current reusing,
vertical delay cell is introduced, analyzed, and used to build a vertical ring-oscillator with ultra-low
power consumption. By employing a wideband BFSK modulation, it is possible to use the vertical
ring-oscillator as the Tx local oscillator generator. A digitally-assisted frequency correction and
calibration scheme is implemented to compensate for the frequency variations that the oscillator might
experience, reaching a carrier frequency accuracy of ±3fREF (±98 kHz). The concept was tested
with an FPGA, which required a division by 8 to bring the VRO signal on board. The power amplifier
is based on an edge-combiner, which effectively multiplies the oscillator frequency by a factor of 3,
relaxing the maximum required frequency at the ring-oscillator. The Tx was fabricated in 0.18 µm
CMOS technology and occupies an active area of 0.112 mm². Measurement results showed an
RF carrier tunability of 0.15 GHz to 1.05 GHz. At 900 MHz, −10 dBm output power, and 3 Mbps
Figure 4.20: Lab setup for a) digital calibration and data modulation and b) BER
data rate, the normalized energy efficiency is 3.1 nJ/(bit·mW). The proposed architecture achieves
superior efficiency not only in the transmission mode but also over the overall operation cycle, owing to
the minimized start-up/calibration time and energy.
Figure 4.21: Binary search algorithm for f1 and f2 performed once in the initial calibration cycle.
Figure 4.22: BER versus frequency deviation at different data rates
Figure 4.23: Power amplifier output power versus PA supply
Figure 4.24: Tx power consumption per circuit block.
Table 4.4: Performance Summary and Comparison with the State-of-the-Art Systems
5. CONCLUSION
This thesis offers multiple key building blocks for a self-powered IoT node: various harvester
interfaces with MPPT capabilities, power management systems (to handle energy storage and supply
regulation efficiently) and a low-power wireless transmitter. These designs are co-optimized and
co-designed for high end-to-end efficiency. The first two chapters introduced the importance of such
design methodology for next generation applications to achieve a reliable continuous operation in
increasingly demanding systems.
5.1 Summary of Contributions
As a second author, the writer of this thesis was a main contributor to system design, circuit
design and layout, and participated in PCB design, the measurement process and journal writing. These
journal papers are co-authored work.
• Technology Enabling circuits and systems for IoT: An Overview, a review paper on IoT
HW design from the physical/MAC/network layers’ standpoint, emphasizing cross-layer
optimization for efficient hardware, in addition to circuit- and system-level optimization.
This work was presented at the ISCAS 2018 conference in Italy.
• Multiple Input Energy Harvesting Systems for Autonomous IoT End-Nodes, a review
paper summarizing the state of the art work in multiple input energy harvesting systems. It
explores different topologies, circuits and comes up with conclusions and recommendations
on the most efficient implementations. Finally, it outlines directions for future work toward reliable
systems for expanding IoT applications. This work was the starting point of the author’s
main work. This work is submitted to JLPEA.
• A Fully Integrated Maximum Power Tracking Combiner for Energy Harvesting IoT
Applications, a voltage adder circuit used as an energy combiner from two DC harvesters.
This is a smart ranking system that picks the highest two out of four input harvesters to be
combined and provides MPPT. This work is submitted to IEEE Transactions on Industrial Electronics (TIE).
• Reconfigurable System for Electromagnetic Energy Harvesting with Inherent Activity
Sensing Capabilities for Wearable Technology, focusing on harvesting energy from hu-
man body motion using a custom made electro-magnetic transducer. Focusing on fitness
trackers integration, the circuit interface not only rectifies the AC signal generated for en-
ergy harvesting, but also detects the activity levels and can sense basic gestures for added
user control functionalities. The passive activity detector also puts the harvesting circuits in
LP sleep mode when no motion is detected. This system is presented in TCAS-II [32]. An
updated, more optimized and fully integrated version of this work is in progress and is expected to be
published in the IEEE Sensors Journal in a few months.
• 0.2 nJ/bit Fast Start-up Ultra-Low Power Wireless Transmitter for IoT Applications,
the author’s first work, a LP wireless transmitter for short distance communication sensor
nodes, with a focus on fast start-up time and high data rates to shorten the communication
time. An always-on sub-µW crystal oscillator provides a frequency reference to a digital
calibration algorithm to set the oscillator frequency with ±100 kHz accuracy for a 900 MHz
carrier frequency. A novel architecture is proposed that allows PLL-less operation at one
third of the carrier frequency, as the power amplifier combines multiple phases of the oscillator
to triple the frequency at the antenna output. This work can be found in
MTT.
• Highly Linear Low Power Wireless RF Receiver for WSN, the multi-phase idea and oscillator
design are reused from the previous project, which is why the writer was added as an
author of this work. No extra contribution was provided to this work. Thanks are due to Omar
El-Sayed for such generosity. This work is submitted to TVLSI.
As a first author, with experience acquired from previous work in energy harvesting, the writer
carried out a full system and circuit design to address the limitations faced previously. Alfredo
Costilla, Johan Estrada and Aditya B. helped to finish the layout and finalize the circuit design of
such a large system (more than 20 blocks); this work would not have seen the light in such a short
time frame without them.
• Multiple Inputs Harvesting System with Smart Boost, Energy Flow Detection, Pulse
Charging and Hybrid Regulation Capabilities, this system provides a full solution to
powering IoT systems from multiple harvesters of different natures. With battery charging
and load supply regulation capabilities in a fully integrated solution, the system is augmented
with energy flow detection to optimize quiescent current at periods of energy drought. The
End-to-End efficiency achieved is the highest reported among switched capacitor structures,
owing to a smart boost sequence proposed to enhance charging efficiency. This work is
presented to TIE.
5.2 Suggested Improvements and Future Directions
Going through the last few projects, some ideas to improve the current work came to mind, and
several lines of work that are worth exploring as future directions emerged.
• Field testing for 3 systems combined (on PCB level): the RF transmitter, AC electromag-
netic interface and the multiple input harvesting PMU. The addition of a sensor, ADC and a
microcontroller to interface between the ADC and the Tx will prove the concept of having
a battery-less self powered node that operates periodically and sends the sensed data to a
central node for further processing and control.
• On chip integration of the 3 projects mentioned above, and comparing the efficiency im-
provement by such integration with PCB-to-PCB integration
• For the project presented in Chapter 3, the combiner circuit can be expanded to accommodate AC
harvesters directly without the need for a pre-rectification stage, as shown in Fig. 5.1. The idea
is to use two harvester units and add a comparator to decide which side of the harvester is
connected to the rectifier unit and connect the other terminal to ground, depending on signal
polarity. With a single comparator and a switch, the system achieves a universal interface
that can provide MPPT for both DC/AC harvesters. RF harvesting will still need a pre-charge
pump to raise the voltage to 100mV before using the existing interface. This will allow it to
interface with Piezo-electric and kinetic magnetic transducers.
Figure 5.1: DC/AC universal harvester interface.
• For the low-power Tx in Chapter 4, the frequency calibration algorithm is to be integrated on chip
for faster turn-around times and lower power consumption. Currently it is
implemented on a Xilinx FPGA.
• For the hybrid LDO presented in Chapter 3, the PSRR enhancement due to the aux-analog LDO
has not been quantified by measurements. The digital LDO is connected in parallel and acts as an
extra path from the supply to the LDO output, which in theory degrades the PSRR. However, this
does not affect the low-frequency PSRR, because the opamp in the analog part corrects both loops
and provides supply rejection, as the output node is shared. Further measurements to demonstrate
this improvement are required in the future.
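The polarity-steering idea behind the DC/AC universal interface of Fig. 5.1 can be illustrated with a small behavioral sketch. It is not the proposed circuit: it assumes an ideal comparator and ideal switches, models a single AC transducer with two terminals (named P and N here purely for illustration), and drives it with a hypothetical 0.3 V, 50 Hz sinusoid. All function names and numerical values are assumptions.

```python
import math


def steer_terminals(v_p, v_n):
    """Ideal comparator plus switch (behavioral assumption, not the real circuit).

    Returns the name of the terminal routed to the rectifier and the voltage
    the rectifier sees, with the other terminal assumed tied to ground.
    """
    if v_p >= v_n:              # comparator high: P is the positive side
        return "P", v_p - v_n
    return "N", v_n - v_p       # comparator low: N is the positive side


def simulate(amplitude_v=0.3, freq_hz=50.0, n_samples=8):
    """Sample one period of a hypothetical differential AC harvester output."""
    samples = []
    for k in range(n_samples):
        t = k / (n_samples * freq_hz)
        v_diff = amplitude_v * math.sin(2 * math.pi * freq_hz * t)
        # Split the differential signal symmetrically between the terminals.
        side, v_rect = steer_terminals(+v_diff / 2, -v_diff / 2)
        samples.append((t, side, v_rect))
    return samples


if __name__ == "__main__":
    for t, side, v_rect in simulate():
        print(f"t = {t * 1e3:6.2f} ms  rectifier <- {side}  v = {v_rect * 1e3:7.2f} mV")
```

Under these assumptions the rectifier input is never negative, so the existing DC-oriented interface would see a unipolar signal regardless of the harvester polarity.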
REFERENCES
[1] GrowthEnabler, “Market pulse report, internet of things (iot).”
https://growthenabler.com/flipbook/pdf/IOT
[2] M. A. Abouzied, H. Osman, V. Vaidya, K. Ravichandran, and E. Sánchez-Sinencio, “An
integrated concurrent multiple-input self-startup energy harvesting capacitive-based dc adder
combiner,” IEEE Transactions on Industrial Electronics, vol. 65, no. 8, pp. 6281–6290, 2018.
[3] S. Bandyopadhyay and A. P. Chandrakasan, “Platform architecture for solar, thermal, and
vibration energy combining with MPPT and single inductor,” IEEE J. Solid-State Circuits,
vol. 47, no. 9, pp. 2199–2215, 2012.
[4] D. Evans, “The internet of things: How the next evolution of the internet is changing every-
thing,” CISCO white paper, vol. 1, no. 2011, pp. 1–11, 2011.
[5] Z.-K. Zhang, M. C. Y. Cho, C.-W. Wang, C.-W. Hsu, C.-K. Chen, and S. Shieh, “Iot se-
curity: ongoing challenges and research opportunities,” in Service-Oriented Computing and
Applications (SOCA), 2014 IEEE 7th International Conference on, pp. 230–234, IEEE, 2014.
[6] K. Narayanan, “Addressing the challenges facing iot adoption,” MICROWAVE JOURNAL,
vol. 60, no. 1, pp. 110–118, 2017.
[7] K. Rose, S. Eldridge, and L. Chapin, “The internet of things: An overview,” The Internet
Society (ISOC), pp. 1–50, 2015.
[8] A. Romani, M. Tartagni, and E. Sangiorgi, “Doing a lot with a little: Micropower conversion
and management for ambient-powered electronics,” Computer, vol. 50, no. 6, pp. 41–49,
2017.
[9] Y. Tan, K. Hoe, and S. Panda, “Energy harvesting using piezoelectric igniter for self-powered
radio frequency (rf) wireless sensors,” in Industrial Technology, 2006. ICIT 2006. IEEE In-
ternational Conference on, pp. 1711–1716, IEEE, 2006.
108
[10] A. Molnar, B. Lu, S. Lanzisera, B. W. Cook, and K. S. Pister, “An ultra-low power 900 mhz
rf transceiver for wireless sensor networks,” in Custom Integrated Circuits Conference, 2004.
Proceedings of the IEEE 2004, pp. 401–404, IEEE, 2004.
[11] H. Amir-Aslanzadeh, E. J. Pankratz, C. Mishra, and E. Sánchez-Sinencio, “Current-reused 2.
4-ghz direct-modulation transmitter with on-chip automatic tuning,” IEEE Transactions on
Very Large Scale Integration(VLSI) Systems, vol. 21, no. 4, pp. 732–746, 2013.
[12] Y.-I. Kwon, S.-G. Park, T.-J. Park, K.-S. Cho, and H.-Y. Lee, “An ultra low-power cmos
transceiver using various low-power techniques for lr-wpan applications,” IEEE Transactions
on Circuits and Systems I: Regular Papers, vol. 59, no. 2, pp. 324–336, 2012.
[13] D. Ghosh and R. Gharpurey, “A power-efficient receiver architecture employing bias-current-
shared rf and baseband with merged supply voltage domains and 1/f noise reduction,” IEEE
Journal of Solid-State Circuits, vol. 47, no. 2, pp. 381–391, 2012.
[14] Z. Lin, P.-I. Mak, and R. Martins, “A 1.7 mw 0.22 mm 2 2.4 ghz zigbee rx exploiting
a current-reuse blixer+ hybrid filter topology in 65nm cmos,” in Solid-State Circuits Con-
ference Digest of Technical Papers (ISSCC), 2013 IEEE International, pp. 448–449, IEEE,
2013.
[15] M. Lont, D. Milosevic, G. Dolmans, and A. H. van Roermund, “Mixer-first fsk receiver with
automatic frequency control for body area networks,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 60, no. 8, pp. 2051–2063, 2013.
[16] J. Masuch and M. Delgado-Restituto, “A 1.1−mw−rx 81.4 −dbm sensitivity cmos
transceiver for bluetooth low energy,” IEEE Transactions on Microwave Theory and Tech-
niques, vol. 61, no. 4, pp. 1660–1673, 2013.
[17] J. Lee and B. Kim, “A low-noise fast-lock phase-locked loop with adaptive bandwidth con-
trol,” IEEE Journal of solid-state circuits, vol. 35, no. 8, pp. 1137–1145, 2000.
109
[18] W.-H. Chiu, Y.-H. Huang, and T.-H. Lin, “A dynamic phase error compensation technique
for fast-locking phase-locked loops,” IEEE Journal of Solid-State Circuits, vol. 45, no. 6,
pp. 1137–1149, 2010.
[19] J. Bloks, “Design of an Ultra-Low Power Time Reference Module,” 2009.
[20] S. A. Blanchard, “Quick start crystal oscillator circuit,” in University/Government/Industry
Microelectronics Symposium, 2003. Proceedings of the 15th Biennial, pp. 78–81, IEEE, 2003.
[21] S. Iguchi, H. Fuketa, T. Sakurai, and M. Takamiya, “92% start-up time reduction by variation-
tolerant chirp injection (ci) and negative resistance booster (nrb) in 39mhz crystal oscillator,”
in VLSI Circuits Digest of Technical Papers, 2014 Symposium on, pp. 1–2, IEEE, 2014.
[22] G. Marucci, A. Fenaroli, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, “21.1 a
1.7 ghz mdll-based fractional-n frequency synthesizer with 1.4 ps rms integrated jitter and
3mw power using a 1b tdc,” in Solid-State Circuits Conference Digest of Technical Papers
(ISSCC), 2014 IEEE International, pp. 360–361, IEEE, 2014.
[23] J. Pandey and B. P. Otis, “A sub-100 µW MICS/ISM band transmitter based on injection-
locking and frequency multiplication,” IEEE Journal of Solid-State Circuits, vol. 46,
pp. 1049–1058, May 2011.
[24] H. Cho, J. Bae, and H.-J. Yoo, “A 37.5/spl mu/w body channel communication wake-up
receiver with injection-locking ring oscillator for wireless body area network,” IEEE Trans-
actions on Circuits and Systems I: Regular Papers, vol. 60, no. 5, pp. 1200–1208, 2013.
[25] T. Instruments, Low Power RF Designer’s Guide to LPRF, SLYA020a Application Note, 2010.
[26] N. E. Roberts and D. D. Wentzloff, “A 98nw wake-up radio for wireless body area networks,”
in Radio Frequency Integrated Circuits Symposium (RFIC), 2012 IEEE, pp. 373–376, IEEE,
2012.
[27] R. Piyare, A. L. Murphy, C. Kiraly, P. Tosato, and D. Brunelli, “Ultra low power wake-up
radios: A hardware and networking survey,” IEEE Communications Surveys & Tutorials,
vol. 19, no. 4, pp. 2117–2157, 2017.
110
[28] D. Giovanelli, B. Milosevic, D. Brunelli, and E. Farella, “Enhancing bluetooth low energy
with wake-up radios for iot applications,” in 2017 13th International Wireless Communica-
tions and Mobile Computing Conference (IWCMC), pp. 1622–1627, IEEE, 2017.
[29] H. Shao, C.-Y. Tsui, and W.-H. Ki, “The design of a micro power management system for
applications using photovoltaic cells with the maximum output power control,” IEEE Trans.
Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 8, pp. 1138–1142, 2009.
[30] S. Carreon-Bautista, A. Eladawy, A. N. Mohieldin, and E. Sanchez-Sinencio, “Boost con-
verter with dynamic input impedance matching for energy harvesting with multi-array ther-
moelectric generators,” IEEE Trans. Ind. Electron., vol. 61, no. 10, pp. 5345–5353, 2014.
[31] Y. K. Ramadass and A. P. Chandrakasan, “An efficient piezoelectric energy harvesting in-
terface circuit using a bias-flip rectifier and shared inductor,” IEEE J. Solid-State Circuits,
vol. 45, no. 1, pp. 189–204, 2010.
[32] A. Costilla-Reyes, A. Abuellil, J. J. Estrada-López, S. Carreon-Bautista, and E. Sánchez-
Sinencio, “Reconfigurable system for electromagnetic energy harvesting with inherent activ-
ity sensing capabilities for wearable technology,” IEEE Transactions on Circuits and Systems
II: Express Briefs, 2018.
[33] F. Yahya, C. Lukas, and B. Calhoun, “A top-down approach to building battery-less self-
powered systems for the internet-of-things,” Journal of Low Power Electronics and Applica-
tions, vol. 8, no. 2, p. 21, 2018.
[34] J. J. Estrada-López, A. Abuellil, Z. Zeng, and E. Sánchez-Sinencio, “Multiple input energy
harvesting systems for autonomous iot end-nodes,” Journal of Low Power Electronics and
Applications, vol. 8, no. 1, p. 6, 2018.
[35] S. Bandyopadhyay and A. P. Chandrakasan, “Platform architecture for solar, thermal and
vibration energy combining with mppt and single inductor,” in VLSI Circuits (VLSIC), 2011
Symposium on, pp. 238–239, IEEE, 2011.
111
[36] M. Dini, A. Romani, M. Filippi, V. Bottarel, G. Ricotti, and M. Tartagni, “A nanocurrent
power management IC for multiple heterogeneous energy harvesting sources,” IEEE Trans.
Power Electron., vol. 30, no. 10, pp. 5665–5680, 2015.
[37] G. Chowdary, A. Singh, and S. Chatterjee, “An 18 nA, 87% efficient solar, vibration and RF
energy-harvesting power management system with a single shared inductor,” IEEE J. Solid-
State Circuits, vol. 51, no. 10, pp. 2501–2513, 2016.
[38] Y. K. Tan and S. K. Panda, “Energy harvesting from hybrid indoor ambient light and ther-
mal energy sources for enhanced performance of wireless sensor nodes,” IEEE Trans. Ind.
Electron., vol. 58, no. 9, pp. 4424–4435, 2011.
[39] Y.-K. Teh and P. K. Mok, “Dtmos-based pulse transformer boost converter with complemen-
tary charge pump for multisource energy harvesting,” IEEE Transactions on Circuits and
Systems II: Express Briefs, vol. 63, no. 5, pp. 508–512, 2016.
[40] J. Li, J.-s. Seo, I. Kymissis, and M. Seok, “Triple-mode, hybrid-storage, energy harvesting
power management unit: Achieving high efficiency against harvesting and load power vari-
abilities,” IEEE Journal of Solid-State Circuits, vol. 52, no. 10, pp. 2550–2562, 2017.
[41] G. Chowdary and S. Chatterjee, “A 300-nw sensitive, 50-na dc-dc converter for energy har-
vesting applications,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 62,
no. 11, pp. 2674–2684, 2015.
[42] A. A. Blanco and G. A. Rincón-Mora, “Compact fast-waking light/heat-harvesting 0.18-µm
cmos switched-inductor charger,” IEEE Transactions on Circuits and Systems I: Regular Pa-
pers, vol. 65, no. 6, pp. 2024–2034, 2018.
[43] F. Deng, X. Yue, X. Fan, S. Guan, Y. Xu, and J. Chen, “Multisource energy harvesting system
for a wireless sensor network node in the field environment,” IEEE Internet of Things Journal,
2018.
[44] S. S. Amin and P. P. Mercier, “Misimo: A multi-input single-inductor multi-output energy
harvester employing event-driven mppt control to achieve 89% peak efficiency and a 60,000
112
x dynamic range in 28nm fdsoi,” in Solid-State Circuits Conference-(ISSCC), 2018 IEEE
International, pp. 144–146, IEEE, 2018.
[45] S. Carreon-Bautista, L. Huang, and E. Sanchez-Sinencio, “An autonomous energy harvesting
power management unit with digital regulation for iot applications,” IEEE Journal of Solid-
State Circuits, vol. 51, no. 6, pp. 1457–1474, 2016.
[46] T. Instruments, “Bq25570 nano power boost charger and buck converter for energy harvester
powered applications,” 2016.
[47] R. A. Kumar, M. Suresh, and J. Nagaraju, “Effect of solar array capacitance on the perfor-
mance of switching shunt voltage regulator,” IEEE transactions on power electronics, vol. 21,
no. 2, pp. 543–548, 2006.
[48] Y.-H. Wang, Y.-W. Huang, P.-C. Huang, H.-J. Chen, and T.-H. Kuo, “A single-inductor dual-
path three-switch converter with energy-recycling technique for light energy harvesting,”
IEEE Journal of Solid-State Circuits, vol. 51, no. 11, pp. 2716–2728, 2016.
[49] X. Liu, L. Huang, K. Ravichandran, and E. Sánchez-Sinencio, “A highly efficient reconfig-
urable charge pump energy harvester with wide harvesting range and two-dimensional mppt
for internet of things,” IEEE Journal of Solid-State Circuits, vol. 51, no. 5, pp. 1302–1312,
2016.
[50] L. J. Svensson and J. G. Koller, “Driving a capacitive load without dissipating fcv/sup 2,” in
Low Power Electronics, 1994. Digest of Technical Papers., IEEE Symposium, pp. 100–101,
IEEE, 1994.
[51] S. Arslan, S. A. A. Shah, J.-J. Lee, and H. Kim, “An energy efficient charging technique for
switched capacitor voltage converters with low-duty ratio,” IEEE Transactions on Circuits
and Systems II: Express Briefs, vol. 65, no. 6, pp. 779–783, 2018.
[52] G. Palumbo, D. Pappalardo, and M. Gaibotti, “Charge-pump circuits: power-consumption
optimization,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Ap-
plications, vol. 49, no. 11, pp. 1535–1542, 2002.
113
[53] J. M. Amanor-Boadu, M. A. Abouzied, and E. Sánchez-Sinencio, “An efficient and fast li-ion
battery charging system using energy harvesting or conventional sources,” IEEE Transactions
on Industrial Electronics, vol. 65, no. 9, pp. 7383–7394, 2018.
[54] L.-R. Chen, “A design of an optimal battery pulse charge system by frequency-varied tech-
nique,” IEEE Transactions on Industrial Electronics, vol. 54, no. 1, pp. 398–405, 2007.
[55] Y. Okuma, K. Ishida, Y. Ryu, X. Zhang, P.-H. Chen, K. Watanabe, M. Takamiya, and T. Saku-
rai, “0.5-v input digital ldo with 98.7% current efficiency and 2.7-µa quiescent current in
65nm cmos,” in Custom Integrated Circuits Conference (CICC), 2010 IEEE, pp. 1–4, IEEE,
2010.
[56] J. Zarate-Roldan, A. Abuellil, M. Mansour, O. Elsayed, F. A. Hussien, A. Eladawy, and
E. Sánchez-Sinencio, “0.2-nj/b fast start-up ultralow power wireless transmitter for iot appli-
cations,” IEEE Transactions on Microwave Theory and Techniques, vol. 66, pp. 259–272, Jan
2018.
[57] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi, “Internet of things for smart
cities,” IEEE Internet of Things Journal, vol. 1, pp. 22–32, Feb 2014.
[58] D. Blaauw, D. Sylvester, P. Dutta, Y. Lee, I. Lee, S. Bang, Y. Kim, G. Kim, P. Pannuto,
Y. S. Kuo, D. Yoon, W. Jung, Z. Foo, Y. P. Chen, S. Oh, S. Jeong, and M. Choi, “IoT design
space challenges: Circuits and systems,” in Symp. VLSI Tech. (VLSI-Tech): Dig. Tech. Papers,
pp. 1–2, June 2014.
[59] A. Burdett, “Ultra-low-power wireless systems: Energy-efficient radios for the internet of
things,” IEEE Solid-State Circuits Magazine, vol. 7, pp. 18–28, Spring 2015.
[60] M. A. Abouzied and E. Sanchez-Sinencio, “Low-Input Power-Level CMOS RF Energy-
Harvesting Front End,” IEEE Transactions on Microwave Theory and Techniques, vol. 63,
pp. 3794–3805, Nov 2015.
[61] K. Philips, “Ultra low power short range radios: Covering the last mile of the IoT,” in Euro-
pean Solid State Circuits Conference (ESSCIRC), pp. 51–58, Sept 2014.
114
[62] M. Rahman, M. Elbadry, and R. Harjani, “An IEEE 802.15.6 standard compliant 2.5 nJ/bit
multiband WBAN transmitter using phase multiplexing and injection locking,” IEEE Journal
of Solid-State Circuits, vol. 50, pp. 1126–1136, May 2015.
[63] H. C. Chen, M. Y. Yen, Q. X. Wu, K. J. Chang, and L. M. Wang, “Batteryless transceiver
prototype for medical implant in 0.18- µm CMOS technology,” IEEE Transactions on Mi-
crowave Theory and Techniques, vol. 62, pp. 137–147, Jan 2014.
[64] A. Shirane, H. Tan, Y. Fang, T. Ibe, H. Ito, and K. M. N. Ishihara, “A 5.8GHz RF-powered
transceiver with a 113 µW 32-QAM transmitter employing the IF-based quadrature backscat-
tering technique,” in IEEE Int. Solid- State Circuits Conf. (ISSCC), pp. 1–3, Feb 2015.
[65] M. Vidojkovic, X. Huang, X. Wang, C. Zhou, A. Ba, M. Lont, Y. H. Liu, P. Harpe, M. Ding,
B. Busze, N. Kiyani, K. Kanda, S. Masui, K. Philips, and H. de Groot, “A 0.33nJ/b
IEEE802.15.6/proprietary-MICS/ISM-band transceiver with scalable data-rate from 11kb/s
to 4.5Mb/s for medical applications,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig.
Tech. Papers, pp. 170–171, Feb 2014.
[66] X. Huang, A. Ba, P. Harpe, G. Dolmans, H. D. Groot, and J. Long, “A 915MHz 120 µW-
RX/900 µW-TX envelope-detection transceiver with 20dB in-band interference tolerance,”
in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, pp. 454–456, Feb 2012.
[67] P. P. Mercier, S. Bandyopadhyay, A. C. Lysaght, K. M. Stankovic, and A. P. Chandrakasan,
“A sub-nW 2.4 GHz transmitter for low data-rate sensing applications,” IEEE Journal of
Solid-State Circuits, vol. 49, pp. 1463–1474, July 2014.
[68] Y. H. Liu, X. Huang, M. Vidojkovic, A. Ba, P. Harpe, G. Dolmans, and H. d. Groot, “A
1.9nJ/b 2.4GHz multistandard (Bluetooth low energy/Zigbee/IEEE802.15.6) transceiver for
personal/body-area networks,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech.
Papers, pp. 446–447, Feb 2013.
[69] A. Wong, M. Dawkins, G. Devita, N. Kasparidis, A. Katsiamis, O. King, F. Lauria, J. Schiff,
and A. Burdett, “A 1V 5mA multimode IEEE 802.15.6/Bluetooth low-energy WBAN
115
transceiver for biotelemetry applications,” in IEEE Int. Solid-State Circuits Conf. (ISSCC)
Dig. Tech. Papers, pp. 300–302, Feb 2012.
[70] X. Liu, M. M. Izad, L. Yao, and C. H. Heng, “A 13 pJ/bit 900 MHz QPSK/16-QAM band
shaped transmitter based on injection locking and digital PA for biomedical applications,”
IEEE Journal of Solid-State Circuits, vol. 49, pp. 2408–2421, Nov 2014.
[71] Gardner and F. M., PLL Frequency Synthesizers. John Wiley & Sons, Inc., 2005.
[72] A. A. Abidi, “Direct-conversion radio transceivers for digital communications,” IEEE Jour-
nal of Solid-State Circuits, vol. 30, pp. 1399–1410, Dec 1995.
[73] M. Lont, D. Milosevic, A. H. M. van Roermund, and G. Dolmans, “Ultra-low power FSK
receiver for body area networks with automatic frequency control,” in Proc. Europ. Solid-
State Circuits Conf. (ESSCIRC), pp. 430–433, Sept 2012.
[74] B. W. Cook, A. Berny, A. Molnar, S. Lanzisera, and K. S. J. Pister, “Low-Power 2.4-GHz
Transceiver With Passive Rx Front-End and 400-mV Supply,” IEEE Journal of Solid-State
Circuits, vol. 41, pp. 2757–2766, Dec 2006.
[75] Berny, A. Dominique, Meyer, R. G., and A. Niknejad, Analysis and Design of Wideband LC
VCOs. PhD thesis, EECS Department, University of California, Berkeley, May 2006.
[76] H. X. Nguyen, H. H. Nguyen, and T. Le-Ngoc, “Amplify-and-forward relaying with M-
FSK modulation and coherent detection,” IEEE Transactions on Communications, vol. 60,
pp. 1555–1562, June 2012.
[77] T. K. Jang, J. Kim, Y. G. Yoon, and S. Cho, “A highly-digital VCO-based analog-to-digital
converter using phase interpolator and digital calibration,” IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, vol. 20, pp. 1368–1372, Aug 2012.
[78] D. Z. Turker, S. P. Khatri, and E. Sanchez-Sinencio, “A DCVSl delay cell for fast low power
frequency synthesis applications,” IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 58, pp. 1225–1238, June 2011.
116
[79] T. Sakurai and A. R. Newton, “Alpha-power law MOSFET model and its applications to
CMOS inverter delay and other formulas,” IEEE Journal of Solid-State Circuits, vol. 25,
pp. 584–594, Apr 1990.
[80] Liu, Xiaosen, and E. Sanchez-Sinencio, “21.1 A single-cycle MPPT charge-pump energy har-
vester using a thyristor-based VCO without storage capacitor,” in Solid-State Circuits Con-
ference (ISSCC), 2016 IEEE International, pp. 364–365, IEEE, 2016.
[81] G. Chien and P. R. Gray, “A 900-MHz local oscillator using a DLL-based frequency
multiplier technique for PCS applications,” IEEE Journal of Solid-State Circuits, vol. 35,
pp. 1996–1999, Dec 2000.
[82] X. Zhang and A. B. Apsel, “A low-power, process-and- temperature- compensated ring os-
cillator with addition-based current source,” IEEE Transactions on Circuits and Systems I:
Regular Papers, vol. 58, pp. 868–878, May 2011.
[83] K.-S. Lee, E.-Y. Sung, I.-C. Hwang, and B.-H. Park, “Fast AFC technique using a code esti-
mation and binary search algorithm for wideband frequency synthesis,” Proc. Europ. Solid-
State Circuits Conf. (ESSCIRC), pp. 181–184, Sept 2005.
[84] “Silicon Labs high performance, low current Sub-1 GHz Transceiver "Si4463/61/60-
C",data sheet is provided by Si-Labs. [Online].” Available:http://www.silabs.com/
Support%20Documents/TechnicalDocs/Si4463-61-60-C.pdf.
[85] H. U. Post and K. Schoppe, “A 14-bit monotonic NMOS D/A converter,” IEEE Journal of
Solid-State Circuits, vol. 18, pp. 297–301, June 1983.
[86] Dempsey, Dennis, Gorman, and Christopher, “Digital to analog converter,” October 1999.
[87] I. M. Filanovsky and A. Allam, “Mutual compensation of mobility and threshold voltage
temperature effects with applications in CMOS circuits,” IEEE Transactions on Circuits and
Systems I: Fundamental Theory and Applications, vol. 48, pp. 876–884, Jul 2001.
117
[88] M. Lont, D. Milosevic, A. H. M. van Roermund, and G. Dolmans, “Ultra-low power FSK
Wake-up Receiver front-end for body area networks,” in 2011 IEEE Radio Frequency Inte-
grated Circuits Symposium, pp. 1–4, June 2011.
118
APPENDIX A
START-UP TIME FOR CRYSTAL OSCILLATORS
The startup time of a crystal oscillator may have many different definitions depending on the
type of system. The definition of startup time for a microprocessor system is often the time from
initial power application to the time a stable clock signal is available. The definition of startup
time for a phase locked loop (PLL) is often the time from initial power application to the time
a stable reference signal is available, often settled to within an acceptable frequency offset from
the final steady state oscillation frequency. The startup time of a crystal oscillator is determined
by the initial noise or transient conditions at turn-on; the small-signal envelope expansion due to
negative resistance; and the large-signal final amplitude limiting due to finite power consumption.
The envelope expansion is a function only of the total negative resistance and the motional
inductance of the crystal. The simplified equivalent series RLC circuit contains the motional
inductance (L); the net resistance (R), which is the sum of the applied negative resistance of
the three-point oscillator (−Rn) and the motional resistance of the crystal (RL); and the
effective series capacitance of the entire network (C), which is dominated by the motional
capacitance, as shown in Fig. A.1.
Figure A.1: RLC equivalent circuit for crystal oscillator
The following Laplace-domain equation applies for the network (with no driving
function):
$$s^2 + \frac{R}{L}\,s + \frac{1}{LC} = 0 \qquad \text{(A.2)}$$
Because the value of the net resistance R is negative, the poles of this system are in the right-
half plane, and the resulting time-domain solution for this differential equation is:
$$V(t) = K\, e^{\left|R/2L\right|\, t}\, \sin\!\left(t\sqrt{1/(LC)} + \theta\right) \qquad \text{(A.5)}$$
where K is a constant and θ is an arbitrary phase, both set by the initial startup condition.
(Note that the exponential expansion is valid only for small-signal conditions, as the power
available to the circuit is limited.) The growth rate of the envelope, |R|/(2L), is positive; it
is proportional to the magnitude of the net negative resistance (the negative resistance of the
three-point oscillator less the motional resistance) and inversely proportional to the motional
inductance, so the expansion time constant is 2L/|R|. Due to the large motional inductance of
crystals and the limited net negative resistance, crystal oscillators have very long startup
times. As an example of the envelope expansion time constant of a crystal oscillator startup,
assume a crystal with 5 fF motional capacitance and an oscillator with a negative resistance
magnitude of 1500 Ω operating at 10 MHz. Using the motional capacitance and the operating
frequency, a motional inductance of 50.66 mH can be determined from $L = 1/(C\omega^2)$. This
motional inductance yields an oscillation envelope expansion time constant of
$\tau = 2L/|R| = 67.55\ \mu\text{s}$.
Note that a trade-off exists between smaller frequency pulling due to low motional capacitance
and longer startup times due to high motional inductance, since high motional inductance is a
direct result of low motional capacitance. A mitigating factor is that smaller motional capacitances
are also associated with smaller shunt capacitances, which yield larger negative resistances
and thereby improve startup time. Startup time is an important design consideration in many
battery-powered applications where systems are duty-cycled between off and on operating states.
A shorter crystal-oscillator startup time limits the power wasted during full-chip warmup in
low-power radio systems.
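The numerical example above can be checked with a short script. This is a minimal sketch, assuming the 1500 Ω figure is used directly as the magnitude of the net resistance R (as the calculation in the text does) and taking the characteristic equation of the series RLC network as s² + (R/L)s + 1/(LC) = 0. The function names are illustrative only.

```python
import cmath
import math


def motional_inductance(c_m, f_osc):
    """L = 1 / (C * w^2) with w = 2*pi*f (series resonance of the motional arm)."""
    w = 2.0 * math.pi * f_osc
    return 1.0 / (c_m * w * w)


def envelope_time_constant(l_m, r_net):
    """tau = 2L / |R| for an envelope growing as exp(|R/2L| * t)."""
    return 2.0 * l_m / abs(r_net)


def poles(l_m, c_m, r_net):
    """Roots of s^2 + (R/L) s + 1/(LC) = 0 via the quadratic formula."""
    alpha = -r_net / (2.0 * l_m)                  # positive when R is negative
    disc = cmath.sqrt(alpha * alpha - 1.0 / (l_m * c_m))
    return alpha + disc, alpha - disc


if __name__ == "__main__":
    C_M = 5e-15        # motional capacitance from the text, 5 fF
    F_OSC = 10e6       # operating frequency from the text, 10 MHz
    R_NET = -1500.0    # assumed net resistance: negative, magnitude 1500 ohm

    l_m = motional_inductance(C_M, F_OSC)
    tau = envelope_time_constant(l_m, R_NET)
    p1, p2 = poles(l_m, C_M, R_NET)

    print(f"L   = {l_m * 1e3:.2f} mH")
    print(f"tau = {tau * 1e6:.2f} us")
    print(f"pole real part = {p1.real:.1f} 1/s (right-half plane if positive)")
```

With the values from the text, the script reproduces L = 50.66 mH and τ = 67.55 µs, and the real part of both poles equals 1/τ, confirming that a negative net resistance places them in the right-half plane.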
