Digital ADCs and Ultra-Wideband RF Circuits for
Energy Constrained Wireless Applications
by
Denis Clarke Daly
B.A.Sc., University of Toronto (2003)
S.M., Massachusetts Institute of Technology (2005)
Submitted to the Department of Electrical Engineering and Computer
Science
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2009
© Massachusetts Institute of Technology 2009. All rights reserved.
Author
Department of Electrical Engineering and Computer Science
May 22, 2009
Certified by
Anantha P. Chandrakasan
Joseph F. and Nancy P. Keithley Professor of Electrical Engineering
Thesis Supervisor
Accepted by
Terry P. Orlando
Chairman, Department Committee on Graduate Students

Digital ADCs and Ultra-Wideband RF Circuits for Energy
Constrained Wireless Applications
by
Denis Clarke Daly
Submitted to the Department of Electrical Engineering and Computer Science
on May 22, 2009, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy
Abstract
Ongoing advances in semiconductor technology have enabled a multitude of portable,
low power devices like cellular phones and wireless sensors. Most recently, as tran-
sistor device geometries reach the nanometer scale, transistor characteristics have
changed so dramatically that many traditional circuits and architectures are no longer
optimal and/or feasible. As a solution, much research has focused on developing
'highly digital' circuits and architectures that are tolerant of the increased leakage,
variation and degraded voltage headrooms associated with advanced CMOS processes.
This thesis presents several highly digital, mixed-signal circuits and architectures
designed for energy constrained wireless applications. First, as a case study, a highly
digital, voltage scalable flash ADC is presented. The flash ADC, implemented in
0.18 µm CMOS, leverages redundancy and calibration to achieve robust operation at
supply voltages from 0.2 V to 0.9 V.
Next, the thesis expands in scope to describe a pulsed, noncoherent ultra-wideband
transceiver chipset, implemented in 90 nm CMOS and operating in the 3-to-5 GHz
band. The all-digital transmitter employs capacitive combining and pulse shaping in
the power amplifier to meet the FCC spectral mask without any off-chip filters. The
noncoherent receiver system-on-chip achieves both energy efficiency and high perfor-
mance by employing simple amplifier and ADC structures combined with extensive
digital calibration.
Finally, the transceiver chipset is integrated in a complete system for wireless
insect flight control. Through the use of a flexible PCB and 3D die stacking, the
total weight of the electronics is kept to 1 g, within the carrying capacity of an adult
Manduca sexta moth. Preliminary wireless flight control of a moth in a wind tunnel
is demonstrated.
Thesis Supervisor: Anantha P. Chandrakasan
Title: Joseph F. and Nancy P. Keithley Professor of Electrical Engineering

Acknowledgments
The six years I have spent at MIT have been a great experience and I would like
to acknowledge several people for their contributions to this thesis, as well as to my
personal and professional development.
Anantha, thank you very much for your guidance and mentorship. You have made
me into a much stronger researcher and scientist than I would have been otherwise.
I would like to acknowledge Professor Joel Voldman and Professor Hae-Seung Lee
for serving on my thesis committee and providing invaluable research advice.
Throughout graduate school, I have always enjoyed working as part of a group,
and being a member of the UWB group has been no exception. Patrick - I have
learned much from you and enjoyed working so closely with you. I wouldn't have
been able to put in such long hours during tapeout without you. The hockey games
have been fun, too! Manish, our conversations have been inspiring. Helen, Dave, Fred
L., Sinit, Tina and Parth - it has been a pleasure working with you all.
Anantha has put together a great group of students in ananthagroup, and I would
like to acknowledge the current and former members who I have worked closely with.
Daniel, Naveen, Payam, Vivienne, Joyce, Marcus, and Bonnie - thank you for bringing
some Canadian pride to ananthagroup. Yogesh - go Manchester United! Jose, thank
you for your RF circuit advice. The many ananthagroup alumni, including Alice, Ben,
Nisha, Frank, Raul, Johnna, Alex, Brian, and Nathan have been a source of wisdom,
helping me figure out how to drink from the firehose. ananthagroup is so large that I
could spend half my thesis thanking everyone individually! Thank you Saurav, Fred
C., Hye Won, Nigel, Yildiz, Jae, Masood, Rahul, Mahmut and Courtney. Margaret
Flaherty also deserves thanks for keeping everything in 38-107 running smoothly.
I have had the pleasure to work with several other research groups within MIT
as well as with the University of Arizona and the University of Washington. It has
been rewarding to broaden my horizons and learn about both MEMS and biology. In
particular, Alice Stone has played a key role in my research - thank you! The moths
at the University of Arizona and Washington also deserve thanks and recognition.
Several times throughout my Ph.D. research I have been assisted by experts from
outside MIT. I would like to acknowledge Kush Gulati of Cambridge Analog, Paul
Yu of Marvell, Tony Chan Carusone of the University of Toronto, Terry Mayhugh of
Texas Instruments, and Rob Gilmore of Qualcomm.
My Ph.D. has been supported by funding from the Natural Sciences and Engi-
neering Research Council of Canada (NSERC), the Center for Circuit and System So-
lutions (C2S2), and the DARPA HI-MEMS project. In addition, fabrication services
were generously provided by National Semiconductor and ST Microelectronics. Peter
Holloway from National Semiconductor and Jean-Francois Paillotin and Kholdoun
Torki from CMP have been very helpful throughout the tapeout process.
Boston has felt like home over the past six years thanks to the many friends
who I have shared good times with. Nir, Mehdi, Kenny and Ardavan were excellent
flatmates. Jeremy, Benjamin, Eric, Jasmine, Grace, Lulu and Maryam have been
true friends and brought some Canadian flavour to Boston. Mike, Jeff, Anand, Vik,
Tia, Janel - thank you for coming to visit me and for your friendship!
I thank my family for their love and support throughout my Ph.D. Mom and Dad,
thank you for always being there to support me, and for giving me the chance to grow
and solve problems on my own. Barbara, Dave, Alex, and Jillian - thank you for the
many photos! My extended family, as well as my future in-laws, also deserve thanks
for their love.
Finally, Sarah Z., thank you for your love, support, encouragement, patience and
faith. You held the fort when I was practically living at lab and have made my Ph.D.
a much better experience than it would have been without you. I look forward to our
future together - the best has yet to come!
Contents
1 Introduction
1.1 Process Scaling: Benefits and Challenges
1.2 Highly Digital Circuits
1.2.1 Analog to Digital Converters
1.2.2 Wireless Transceivers
1.3 Thesis Contributions
1.4 Thesis Outline
2 Highly Digital, Voltage Scalable ADC
2.1 ADC Architecture
2.1.1 Background and Theory
2.1.2 Overview
2.2 ADC Circuits
2.2.1 Sampling Network
2.2.2 Comparator Array
2.2.3 Analysis of Device Stacking in the Subthreshold Regime
2.2.4 Wallace Tree Adder and Memory
2.3 Measurement Results
2.3.1 Static and Dynamic Performance
2.3.2 Power Consumption
2.3.3 Calibration and Common-Mode Rejection
2.4 Future Directions
2.5 Summary
3 Low Data Rate UWB Architecture, Physical Layer, and Transmitter
3.1 Background and History
3.2 IEEE 802.15.4a Standard
3.2.1 Chirp spread spectrum PHY
3.2.2 Ultra-wideband PHY
3.3 Noncoherent Demodulation
3.4 Previous Work
3.5 Transceiver Architecture and Packet Structure
3.6 UWB Transmitter
3.6.1 Transmitter Architecture
3.6.2 Dual-Digital Power Amplifiers
3.6.3 Digital Oscillator
3.6.4 Measurement Results
3.7 Summary
4 UWB Receiver SoC Circuits
4.1 Architecture
4.2 RF Front End
4.2.1 Single ended vs. differential architecture
4.2.2 Resonant vs. non-resonant load
4.2.3 Core amplifier structure
4.2.4 Low Noise Amplifier
4.2.5 Multi-stage RF amplifier
4.3 Squarer
4.4 Baseband Amplifier
4.5 Integrator and ADC
4.5.1 Integrator and ADC Clocking
4.6 Crystal Oscillator
4.7 Delay-Locked Loop
4.8 Digital Shift Register
4.9 Digital State Machine
4.10 Test Circuits
4.11 Summary
5 UWB Receiver SoC Measurement Results
5.1 Layout, Packaging and PCBs
5.2 Bit Error Rate and Interference Measurements
5.2.1 Interference Measurements
5.3 Power Consumption & Energy/bit
5.4 Wireless Demo of Transceiver Chipset
5.5 RF Front End
5.6 Squarer
5.7 Baseband Amplifier
5.8 Integrator and ADC
5.9 Crystal Oscillator
5.10 Delay Locked Loop
5.11 Summary
6 UWB Receiver SoC System Demonstration: Cyborg Moths
6.1 Previous Hybrid-Insect Flight Control Research and Systems
6.2 Manduca sexta hawkmoth
6.2.1 Flight control through abdominal deflection
6.3 Hybrid-Insect Flight Control System
6.3.1 Electronics
6.3.2 Stimulation Logic
6.4 Attaching PCB to Moth
6.5 Flight Tests
6.6 Summary
7 Conclusions
7.1 Thesis Summary
7.2 Future Work
7.2.1 Analog to Digital Converters
7.2.2 Wireless Transceivers
7.2.3 Wireless Moth Flight Control
A Analysis of Device Stacking in Subthreshold Regime
B Acronyms
List of Figures
1-1 Spectrum of existing mobile devices that consume power ranging from
micro-Watts to Watts.
1-2 Conceptual diagram of a generic mobile device.
1-3 Simulated effect of scaling on NMOS (a) Ion/Ioff, (b) Vth/VDD, and
(c) gm/gds. For simulations, L = Lmin and W = 3 Lmin. Device models
include MOSIS provided models [1] and Predictive Technology Models
(PTMs) [2, 3]. Both high performance (HP) and low power (LP)
MOSFET data is shown.
2-1 Conceptual block-diagram of an inverter-based ADC.
2-2 (a) Nominal threshold voltage of comparators for a 6-level, flash ADC
with three redundant comparators at each threshold voltage. (b) Example
of the ADC with threshold variation. The comparators that are
selected after calibration are highlighted.
2-3 Yield of ADC versus redundancy factor for ADCs with varying levels of
comparator reconfigurability. Redundancy factor refers to the number
of comparators assigned to a given threshold. Additional comparators
are placed at the edges of the input range to compensate for edge
effects. INL values are not corrected for gain and offset errors.
2-4 Block diagram of the flash ADC.
2-5 Four possible circuit implementations for the sampling network. All
four circuits are sized for equal 'on' conductance.
2-6 Transient leakage simulation demonstrating the relative 'off' leakage
for the four sampling switch options presented in Figure 2-5. V1, V2,
V3, and V4 correspond to the nodes labeled in Figure 2-5. Simulation
results are for typical devices at 100 °C.
2-7 Circuit schematic for voltage boosting circuit.
2-8 Comparator schematic. The comparator consists of many stacked
PMOS devices, each of which consists of a variable number of
minimum-sized PMOS devices in parallel.
2-9 Multiple NMOS transistors stacked. All nodes are initially precharged
to VDD.
2-10 Transient (a) ODE and (b) piecewise linear solution for V1, V2, ..., V10
for ten stacked NMOS transistors.
2-11 Block diagram of 127 by 9-bit memory with embedded counters used
for calculating the estimated cumulative distribution function (CDF)
of comparator thresholds.
2-12 Die micrograph of the ADC in 0.18 µm CMOS.
2-13 (a) Maximum sampling frequency and (b) energy per conversion versus
supply voltage, indicating presence of minimum energy per conversion
at VDD = 0.4 V.
2-14 INL and DNL of ADC in single-ended 6-bit mode at VDD = 0.4 V.
2-15 FFT of ADC in (a) single-ended and (b) pseudo-differential 6-bit mode
at VDD = 0.4 V.
2-16 Statistical variation of ENOB in (a) single-ended and (b)
pseudo-differential mode at VDD = 0.4 V before and after calibration.
3-1 Receiver energy/bit values vs. data rate for UWB and narrowband
receivers recently published at ISSCC [4]. Data for this plot are found
in Table 5.4.
3-2 Three-channel frequency plan and narrowband interferers.
3-3 Packet structure and receiver states for the wireless link.
3-4 Simplified block diagram of (a) receiver and (b) transmitter that form
the custom low-data-rate transceiver system.
3-5 Transmitter expanded block diagram.
3-6 Digitally controlled oscillator schematic.
3-7 Die photo of fabricated transmitter.
3-8 Measured transient waveform of a burst of five individually
BPSK-modulated pulses.
3-9 Overlaid power spectral densities of the three channels in the low-band
of the 802.15.4a proposal.
4-1 Detailed block diagram of receiver SoC.
4-2 Block diagram of noncoherent UWB receiver presented in [4].
4-3 Circuit schematic of (a) resonant and (b) non-resonant amplifiers with
equal power consumption. Simulation results comparing the gain of
the amplifiers are presented in Figure 4-4.
4-4 Gain of the resonant and non-resonant RF amplifiers presented in
Figure 4-3.
4-5 Four possible implementations of an RF gain stage with a resonant load.
4-6 Gain of an NMOS-based RF amplifier operating at 0.57 V (Fig. 4-5(a)),
an NMOS-based cascode RF amplifier operating at 1 V (Fig. 4-5(b)),
and an inverter-based, complementary RF amplifier operating at 1 V
(Fig. 4-5(d)), with all amplifiers consuming equal power.
4-7 Digitally tunable load capacitor of RF amplifier. All NMOS devices
are minimum length and their width is shown.
4-8 Schematic showing how differential signals can be leveraged to reduce
transistor size. The two circuits shown have the same radio-frequency
(RF) impedance, but by connecting a transistor between the positive
and negative terminals, the overall transistor width is reduced by four
times.
4-9 Schematic of low noise amplifier.
4-10 Schematic of 6-stage RF amplifier, including the LNA. A variable
number of stages can be enabled depending on the gain required.
4-11 Schematic of differential, passive squarer.
4-12 Output amplitude magnitude of differential squarer versus amplitude
of 4.5 GHz RF input.
4-13 Transient simulation of RF front end at its maximum gain setting and
squarer in response to a pulsed LNA RF input.
4-14 Baseband signal chain, consisting of a baseband amplifier, an ADC and
digital calibration logic.
4-15 Examples of offset compensation approaches for the baseband amplifier.
(a) Discrete-time, analog offset compensation, (b) Continuous-time,
analog offset compensation, and (c) Discrete-time, digital offset
compensation.
4-16 Circuit of current-mode DAC that is used to cancel baseband amplifier
offsets.
4-17 Table showing relationship between digital-to-analog converter (DAC)
code and current routing. Based on the DAC code, the DAC Enable[4:0]
and Plus/Minus switches are opened or closed (Figure 4-16).
In the table, shaded cells correspond to a switch being closed.
4-18 Two example circuits implementing a windowed integrator: (a) an op
amp based integrator, and (b) a transconductor based integrator. In
both integrators, φ1 controls a switch that resets the integrator output.
4-19 An example implementation of a time interleaved integrator and ADC.
Shown below the circuit diagram is a timing diagram.
4-20 An example implementation of a cascaded integrator and ADC. Shown
below the circuit diagram is a timing diagram.
4-21 Six stage cascaded integrator and ADC.
4-22 Integrator and ADC clocking circuit that generates the level sensitive
and clock phases. Each delay element has a nominal delay of 2 ns.
4-23 Transient simulation of integrator and ADC for two integration periods.
The first integration period begins at 10 ns and the second
integration period begins at 42 ns.
4-24 Pierce one-transistor crystal oscillator with amplitude control circuitry.
4-25 Pierce two-transistor, inverter-based, crystal oscillator.
4-26 Crystal oscillator output buffer, to amplify the oscillator output to full
digital levels.
4-27 Delay-locked loop.
4-28 RF output buffer to drive the amplified RF signal off-chip.
5-1 Die photograph of pulsed UWB receiver SoC.
5-2 Bare die packaged in (a) a 40-pin QFN package and (b) a 64-pin QFN
package. The 40-pin QFN package is 6 mm x 6 mm and the 64-pin QFN
package is 9 mm x 9 mm. The two photos are equally scaled.
5-3 Block diagram of the three PCBs used for testing the receiver.
5-4 Opal Kelly XEM3001 FPGA module. (Actual size 3.5" by 2.0")
5-5 Middle PCB of three PCB stack. (Actual size 4.5" by 2.4")
5-6 Top PCB (daughterboard) of three PCB stack. This specific
daughterboard is designed for the 40-pin QFN package. (Actual size 1.8" by
1.3")
5-7 Test setup used to measure the BER of the receiver both with and
without narrowband interferers.
5-8 BER of receiver (a) at its highest gain setting at the three center
frequencies, and (b) at the different gain settings with fc = 4.0 GHz.
5-9 Sensitivity of the receiver versus core supply voltage at its highest gain
setting (fc = 4.0 GHz).
5-10 Two comparison plots of receiver with previously published work: (a)
energy/bit versus data rate, and (b) normalized sensitivity versus
energy/bit. In both plots, a point is shown for the receiver at its highest
and its lowest gain setting. Data for these plots are found in Table 5.3.
5-11 Top level block diagram of the UWB wireless image transmission system.
(Courtesy Helen Liang)
5-12 Measured S11 of the receiver, comparing results from the 40-pin and
64-pin QFN packages and comparing a microstrip transmission line to
a grounded coplanar waveguide transmission line. In all three
configurations the receiver is tuned to a frequency of 3.5 GHz.
5-13 Measured S11 of the receiver with a center frequency of 3.5, 4.0 and
4.5 GHz packaged in a 40-pin QFN package with grounded coplanar
waveguide transmission lines.
5-14 Output spectrum of the RF front end when the low noise amplifier
(LNA) input is terminated to 50 Ω. The spectrum analyzer RBW and
VBW are set to 3 MHz, and the detector is set to average mode.
5-15 Frequency dependent gain of the RF front end at its maximum gain
setting at the three channel frequencies.
5-16 Output amplitude vs. input amplitude of the receiver RF front end at
its highest gain setting, with an input frequency of 4 GHz. From this
plot, the P1dB at the highest gain setting is measured to be -41 dBm.
5-17 Noise figure of the RF front end when operating in each of three channels.
5-18 Positive and negative output amplitudes of the squarer versus LNA
input power at its third of six RF gain settings (fc = 4.0 GHz).
5-19 Measured differential output amplitude of the squarer versus LNA
input power for the six RF gain settings.
5-20 Differential output amplitude of the squarer versus LNA input power
at its third of six RF gain settings, demonstrating a linear relationship
between input power and output amplitude (fc = 4.0 GHz).
5-21 Differential output amplitude of the baseband amplifier versus LNA
input power for the six RF gain settings (fc = 4.0 GHz).
5-22 (a) DNL and (b) INL of integrator and ADC.
5-23 Linearity of ADC when provided a pulse input of varying duration.
5-24 (a) DNL and (b) INL of DLL based on an LSB of i x 31.25 ns. The
DNL and INL shown in (c) and (d) are based on an LSB of 2.04 ns.
6-1 Lateral view of Manduca sexta moth attempting to feed from an
artificial flower. (Photo courtesy of Armin Hinterwirth, University of
Washington)
6-2 Photograph of two, 4-electrode tungsten probes.
6-3 Measured abdominal deflection of a moth in response to pulse stimuli of
varying voltage and frequency. In (a), the voltage of stimulation pulses
is varied. T represents the pulse period, N represents the number
of pulses and D represents the duty cycle. In (b), the frequency of
stimulation pulses is varied, for a 0.5 s burst of 2.5 V, 1 ms pulses.
6-4 Transient measurement of current delivered through a tungsten
electrode when applying a 2.5 V pulse with pulse duration of 1 ms.
6-5 Photos showing abdominal deflection of a moth in response to pulse
stimulus. The abdomen is shown (a) before pulse stimulus and (b) during
a pulse stimulus of 2.5 V, 100 Hz with duty cycle of 10%. The radial
lines on the graph paper are spaced 20, indicating a total abdominal
deflection of 10.
6-6 Hybrid-insect flight control system.
6-7 Block diagram of electronics mounted on a flexible PCB and attached
to a moth.
6-8 Flexible PCB (a) top, (b) bottom, and (c) side.
6-9 Photo of the UWB receiver SoC and a Texas Instruments microcontroller
vertically stacked in a single package. The UWB receiver SoC
is placed on top of the microcontroller.
6-10 Block diagram of stimulation logic included in the UWB receiver SoC.
6-11 Photo of the moth with (a) dorsal PCB mounting and (b) ventral PCB
mounting.
6-12 Photos of the surgical procedure to attach the hardware to the moth.
In (a), the harness is attached to the moth and in (b), the tungsten
probe is inserted through the cuticle into the connective tissue adjacent
to the nerve connectives.
6-13 Photograph of wind tunnel where moth flight tests were conducted.
6-14 Time-lapse photos of a moth in a wind tunnel responding to pulse
stimulus. Pulse stimulus is generated by the receiver SoC after the
receiver successfully decodes a transmitted packet. Photos of the moth
are shown (a) prior to stimulation, (b) at the moment when stimulation
begins, (c) when the moth's flight path begins to respond to the pulses,
and (d) after having responded to the stimulation pulses. In response
to the stimulation pulses, the moth's flight path turns to the left.
A-1 Propagation delay versus number of stacked NMOS devices for ODE
simulation and mathematical approximation given in Equation A.10.
List of Tables
1.1 Example existing and emerging applications for energy constrained
mobile devices.
2.1 Summary of Results for ADC.
3.1 FCC emissions limit for indoor and outdoor UWB communication.
5.1 List of pad connections on receiver SoC.
5.2 Out-of-band interference robustness.
5.3 Receiver power consumption breakdown. In idle mode, the crystal
oscillator is enabled.
5.4 Comparison of receiver with previously published work.
5.5 Crystal oscillator frequency for different chips, supply voltages and
oscillator topologies.
5.6 Crystal oscillator phase noise at various offset frequencies for both
one-transistor and two-transistor Pierce oscillator topologies.
6.1 Weight breakdown of components attached to the moth.
6.2 Comparison between this work and previous published wireless
transmitters and/or receivers for miniature biological systems.
Chapter 1
Introduction
Mobile, battery powered devices have for decades fueled growth in the semiconductor
industry. One of the defining moments in the semiconductor industry that kick-
started this rapid growth was the invention of the integrated circuit by Jack Kilby
in 1958 [5]. Whereas previous electronic circuits consisted of several discrete semi-
conductor devices, integrated circuits allowed for multiple semiconductor devices to
be manufactured within a single piece of material, resulting in a significant reduction in
area and increased reliability. These benefits enabled many portable, battery pow-
ered devices beginning with the handheld calculator, which was designed by Texas
Instruments in the late 1960's and first sold by Canon in 1970 as the 'Pocketronic'
[6]. A second key invention that has contributed to the growth of portable electronic
devices was the invention of CMOS technology in 1963 [7], which allowed for a signifi-
cant improvement in the energy efficiency of digital and analog circuits. Early CMOS
powered mobile devices included low power wrist watches [8] and calculators [9]. Since
then, CMOS processes have continued to improve and scale, following Moore's law,
such that it is now possible for a cellular phone to achieve comparable performance to
a supercomputer of 30 years ago.
Wireless connectivity has been a key feature of portable devices, resulting in the
exponential growth of cellular phones, now achieving annual sales of over one billion
units. Laptop sales are now outnumbering desktop computer sales, and the emergence
of the One Laptop per Child (OLPC) project and low cost 'sub-notebooks' will
allow even greater penetration of portable, mobile devices. The majority of mobile
devices consume power ranging from micro-Watts to tens of Watts, and a spectrum
of applications are shown in Figure 1-1 and detailed in Table 1.1. In these applications,
energy is becoming more and more the limiting factor rather than performance.
This means that battery lifetime dictates performance specifications rather than
performance dictating battery lifetime. This is particularly true for emerging systems
such as implanted electronics. For example, roughly half the area of a pacemaker is
occupied by a battery to ensure it can achieve 10 years of battery lifetime [10].

Figure 1-1: Spectrum of existing mobile devices that consume power ranging from
micro-Watts to Watts.

Table 1.1: Example existing and emerging applications for energy constrained mobile
devices

Application                   Sensors                               Wireless interfaces              Power cons.  Battery lifetime
Wristwatch [8]                None                                  None                             1 µW         Multiple years
Pacemaker [10]                Pacing leads                          Inductive link                   10 µW        10 years
Vital signs monitor [11, 12]  ECG, heart rate, temperature          900 MHz ISM                      1-8 mW       24 hours (intensive care)
Bluetooth headset [13]        Acoustic mic.                         Bluetooth                        70 mW        6 hours (talk time)
Smart phone [14]              Acoustic mic., accelerometer, camera  Bluetooth, WiFi, GSM, HSDA, GPS  1 W          3-5 hours (talk time)
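
As a rough illustration of how battery lifetime constrains average power, the sketch below computes the energy that an ideal battery would have to store to sustain a few of the entries in Table 1.1. The function name and the ideal-battery assumption (constant average power, no self-discharge, no converter losses) are introduced here for illustration only and are not taken from the cited references.

```python
SECONDS_PER_HOUR = 3600.0
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR


def required_energy_wh(avg_power_w, lifetime_s):
    """Energy (Wh) an ideal battery must store to supply avg_power_w for lifetime_s.

    Assumes constant average power and ignores self-discharge and
    converter losses (illustrative assumption, not from the thesis).
    """
    return avg_power_w * lifetime_s / SECONDS_PER_HOUR


# Power and lifetime figures taken from Table 1.1.
pacemaker_wh = required_energy_wh(10e-6, 10 * SECONDS_PER_YEAR)  # 10 uW, 10 years
headset_wh = required_energy_wh(70e-3, 6 * SECONDS_PER_HOUR)     # 70 mW, 6 hours
smartphone_wh = required_energy_wh(1.0, 3 * SECONDS_PER_HOUR)    # 1 W, 3 hours

for name, wh in [("Pacemaker", pacemaker_wh),
                 ("Bluetooth headset", headset_wh),
                 ("Smart phone", smartphone_wh)]:
    print(f"{name}: {wh:.2f} Wh")
```

Under these assumptions the pacemaker needs somewhat less than 1 Wh over its ten-year lifetime, while the smart phone needs several Wh for a few hours of talk time. Holding the battery fixed, a lower average power translates directly into a longer lifetime.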
A conceptual diagram for a generic mobile device appears in Figure 1-2. The core
components of the device include data conversion, signal processing, and communication
subsystems, and these interface with the environment through sensors and
actuators. Additionally, an energy subsystem is required to efficiently power the
electronics. This thesis focuses on the core mixed-signal components of a mobile device,
namely the interface to the sensors, actuators and antenna via the analog-to-digital
converter (ADC) and the wireless communication subsystem.

Figure 1-2: Conceptual diagram of a generic mobile device.

Four decades of Moore's law and semiconductor process scaling have made it pos-
sible now to integrate billions of transistors on a single chip. With device geometries
on the order of nanometers, device characteristics have changed so dramatically that
many traditional circuits and architectures are no longer optimal and/or feasible.
For example, reduced device breakdown voltages have forced circuits to operate at
supply voltages near 1 V. As the number of dopant atoms per transistor decreases,
random dopant fluctuations (RDFs) are becoming increasingly problematic, resulting
in significant device variation. Subthreshold and gate leakage have increased relative
to a device's 'on' conductance such that leakage can no longer be ignored. These
examples are just a handful of the many challenges encountered with advanced CMOS
processes, which have forced circuit designers to develop entirely new circuit topolo-
gies and architectures.
An emerging trend in analog and mixed signal circuit design in nanometer-scale
CMOS processes is the use of highly digital circuit topologies. There is no exact
definition of what constitutes a highly digital circuit or architecture, but they can be
broadly defined as consisting of significant digital logic, using simple 'digital' struc-
tures like comparators, switches, and capacitors, and using techniques like digital
feedback. Such structures can be readily implemented in advanced CMOS processes
whereas it is becoming increasingly challenging to design 'analog' components such as
op amps and employ techniques like analog feedback or analog offset storage.
This thesis is focused on developing highly digital, mixed signal circuits and ar-
chitectures for energy-constrained mobile applications. As the spectrum of energy
constrained devices is quite large, this thesis focuses specifically on the subset of de-
vices that consume on average only a few milli-Watts or less, significantly less power
than cellular phones. Typically, these low power devices require only a short-range
(<100 m), low data rate (<100 kb/s) wireless radio and do not include power hungry
components like a liquid crystal display (LCD).
To provide a broad perspective into highly digital circuits and architectures, this
thesis is structured to progressively expand in scope, moving from a component level
to a system-on-chip level and then finally to a full system demonstration. First, at
the component level, a voltage scalable ADC is presented that operates from 0.2 V
to 0.9 V and consists of almost entirely digital logic and comparators. Next, at the
system-on-chip level, a wireless ultra-wideband (UWB) transceiver chipset designed
in 90 nm CMOS is presented that achieves both high performance and low energy op-
eration. The chipset includes an all-digital transmitter that employs an inverter-based
power amplifier (PA) with capacitive combining to generate Federal Communications
Commission (FCC) compliant, large output swing signals while consuming very little
energy. Also included in the chipset is a fully integrated receiver system-on-chip (SoC)
that leverages a simple, differential inverter-based RF front end combined with exten-
sive digital calibration to realize robust operation in a miniature form factor system.
Finally, the UWB transceiver chipset is demonstrated in a full system for insect flight
control.
[Figure 1-3 plots: panels (a), (b), and (c), each plotted against Process Node (nm), with HP and LP device curves.]
Figure 1-3: Simulated effect of scaling on NMOS (a) Ion/Ioff, (b) Vth/VDD, and (c)
gm/gds. For simulations, L = Lmin and W = 3Lmin. Device models include MOSIS
provided models [1] and Predictive Technology Models (PTMs) [2, 3]. Both high
performance (HP) and low power (LP) MOSFET data is shown.
1.1 Process Scaling: Benefits and Challenges
A key driving force behind the growth in mobile, battery powered devices has been
semiconductor process scaling. The world's first supercomputer, the Cray-1, which
was released in 1976, had a 64-bit processor that operated at 80 MHz, used 8 MB
of memory, weighed 5.5 tons and consumed about 115 kW of power [15]. Modern
embedded processors, like the Marvell PXA320 XScale processor used in the LG-KC1
smartphone, outperform the Cray-1 while occupying board
space equivalent to a credit card and consuming less than 1 W of power [16]. These
dramatic reductions in energy and volume along with increased functionality have led
to the emergence of highly complex mixed-signal SoCs.
A key challenge associated with semiconductor process scaling, however, is that
traditional circuit architectures are not compatible with nanometer-scale processes.
Traditional architectures are either inefficient, negating many advantages of scaling,
or cannot function properly. Circuit designers must overcome severe local and global
process variation, increased gate and subthreshold leakage currents, reduced supply
voltages, and reduced intrinsic gain of devices.
Figure 1-3 shows three separate plots of how transistor properties fundamentally
change in advanced process nodes, each of which highlights a different challenge facing
nanometer-scale CMOS circuit design. In Figure 1-3(a), Ion/Ioff is shown to rapidly
decrease in advanced CMOS processes, which can result in significant power consumption
when circuits are disabled and also makes it challenging to hold analog voltages
on capacitors. In Figure 1-3(b), the ratio of device threshold voltage, Vth, to sup-
ply voltage, VDD, increases with scaling, which effectively reduces voltage headroom
and makes it difficult to stack multiple devices and employ techniques like cascod-
ing. Voltage headroom is also reduced due to the fact that Vdsat does not scale with
VDD. In Figure 1-3(c), the gm/gds of CMOS devices is shown to decrease in advanced
processes, making it difficult to obtain a large gain in single-stage amplifiers.
The data from Figure 1-3 is derived from Predictive Technology Models (PTMs)
[2, 3] for 16 nm to 180 nm processes, and from MOSIS provided models for 0.25 µm,
0.35 µm, and 0.5 µm processes [1]. For the 16 nm to 45 nm processes, both high performance
(HP) and low power (LP) device models are used. It is seen that a significant
reduction in leakage can be achieved by using the low power devices, but Vth/VDD
increases, reducing speed and decreasing voltage headroom. All simulations are with
minimum length NMOS devices with a width three times the minimum length. In
advanced processes, it is possible to increase gm/gds by increasing the device length,
but this increases the transistor area and gate capacitance, removing some
of the advantages of scaling. For leakage measurements in Figure 1-3(a), the NMOS
devices are operated at the nominal supply voltage, which is higher for older processes.
The measurements in Figures 1-3(b) and 1-3(c) are based on diode-connected
NMOS devices each supplied 1 µA of current.
One additional problem associated with advanced CMOS processes that is not
shown in Figure 1-3 is increased device variation, due to RDF and processing variation
[17]. Both of these effects significantly change the resulting threshold voltage of
devices and in a 65 nm process it has been shown that a ±4σ variation from RDF
alone can change the drain current by over three orders of magnitude [17].
Device variation most acutely affects digital circuits and memory, where devices are
typically minimum-sized to reduce area. In contrast, transistors in analog circuits
are typically not minimum-sized because the devices consume minimal area relative
to passive components such as capacitors and inductors, which are not subject to
RDF. Moreover, analog transistors are often sized for biasing and noise constraints
that result in relatively large devices.
1.2 Highly Digital Circuits
An emerging design approach that offers much promise to address many of the prob-
lems associated with advanced semiconductor processes is the use of highly digital,
variation-tolerant circuits and architectures. As described earlier in this chapter,
highly digital circuits and architectures can be broadly defined as consisting of sig-
nificant digital logic, using simple 'digital' structures like comparators, switches, and
capacitors, and using techniques like digital feedback.
Highly digital circuits and architectures are the subject of this thesis with a focus
on the design of highly digital ADCs and UWB wireless circuits for energy con-
strained applications. The circuits and systems presented are designed to leverage
the benefits of advanced nanometer-scale CMOS processes while overcoming the many
challenges faced. The following sub-sections briefly introduce both ADCs and wireless
transceivers, and provide some background into existing highly digital circuits and
architectures.
1.2.1 Analog to Digital Converters
In systems that interface with the physical world, such as sensors or wireless radios,
signals must be converted from the analog domain to the digital domain by an ADC
so that they can be processed digitally and take advantage of the sophisticated ca-
pabilities of a digital signal processor (DSP). Precisely how much processing is done
before and after the ADC is a matter of system level optimization. ADC requirements
depend on system characteristics, namely bandwidth and dynamic range, and system
optimization must consider ADC power, which can be a significant portion of the total
power. As one might expect, the energy per conversion, which is an important metric
for ADCs, increases as the dynamic range and sampling rate requirements increase.
An empirical figure of merit (FoM) for ADCs normalizes their power consumption,
P, to the input bandwidth, BW, and the dynamic range, expressed as $2^{ENOB}$ (where
ENOB is the effective number of bits output) [18]:
$$\mathrm{FoM} = \frac{P}{2 \cdot BW \cdot 2^{ENOB}} \tag{1.1}$$
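As a rough numerical illustration of this figure of merit (power divided by twice the input bandwidth and by 2 raised to the ENOB), the following Python sketch evaluates it and inverts it. The function names and the example operating point are illustrative assumptions, not values measured in this thesis.

```python
def adc_fom(power_w, bandwidth_hz, enob):
    """Energy per conversion step in joules: P / (2 * BW * 2**ENOB)."""
    return power_w / (2.0 * bandwidth_hz * 2.0 ** enob)


def power_for_fom(fom_j, bandwidth_hz, enob):
    """Power budget in watts implied by a target FoM at a given bandwidth and ENOB."""
    return fom_j * 2.0 * bandwidth_hz * 2.0 ** enob


# Hypothetical operating point: 1 uW, 50 kHz input bandwidth, 8 effective bits.
example_fom = adc_fom(1e-6, 50e3, 8)
```

Because the dynamic range enters as 2 raised to the ENOB, each additional effective bit at a fixed FoM doubles the power budget returned by `power_for_fom`.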
State-of-the-art converters today achieve an FoM as low as 4.4 fJ per conversion-
step [19]; however, generally, dynamic ranges beyond those yielded by eight-bit con-
verters have a steeper power increase due to device noise limitations in the ADC cir-
cuits; the same is true when sampling-rates exceed tens of megahertz because devices
must be biased further above their threshold voltage, VT, which results in degraded
energy efficiency. In advanced CMOS technologies, the observed empirical trend has
been that the FoM decreases by 1.8 times with each new process generation [20]. This
trend in energy reduction offers significant incentive to develop highly digital ADC
architectures customized for nanometer-scale CMOS processes. A few key emerging
ADC trends are highlighted below.
Variation-tolerant ADCs. Mismatch and variation ultimately limit performance
and yield of ADCs. There are several well known methods to reduce these effects,
including increasing device sizes, common-centroid layout, auto-zeroing, cancelling
offsets through feedback, and redundancy. Of these, redundancy is a promising, emerging
technique that is well suited for highly scaled processes where more traditional
methods are impractical and the selection of which devices to enable and disable can
be implemented with simple digital switches and logic. Redundancy has been applied
to SRAM sense-amplifiers to allow for reduced area and power compared to simply
increasing the device size of a single sense-amplifier [21]. Redundancy offers much
promise for highly-parallel systems, such as time-interleaved ADCs [22] or flash ADCs
[23].
Low-voltage ADCs. For low-resolution ADCs, thermal noise is not a challenging
design constraint and the supply voltage can be reduced to enable low energy opera-
tion. Traditional analog circuit blocks cannot easily operate at low-voltages; however,
a 0.5 V pipelined ADC has been demonstrated that uses operational transconduc-
tance amplifiers (OTAs) biased in weak-inversion and does not require any voltage
boosting [24]. A more prevalent way to realize low-voltage operation is through the
use of highly digital ADC structures such as successive approximation register (SAR)
ADCs. SAR is an excellent topology for micro-power low-voltage operation, as no lin-
ear amplifiers are required for conversion. A 0.5 V SAR ADC has been demonstrated
that achieves an FoM of 0.12 pJ per conversion step [25].
Comparator Based ADCs and Circuits. Comparator-based structures allow the
implementation of switched-capacitor ADCs and analog circuits without the need for
analog feedback - comparators replace the functionality of operational amplifiers by
controlling energy transfer through comparator switching events rather than forcing
a virtual ground through feedback. This approach offers the potential to reduce power
consumption and to address scaling issues in emerging technologies [20].
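The SAR topology mentioned among the trends above can be illustrated with a short behavioral model. This is a generic textbook binary search with an ideal comparator and an ideal DAC, not the circuit of [25]; the function name, the reference voltage, and the resolution are assumptions made for illustration.

```python
def sar_convert(vin, vref=0.5, bits=6):
    """Behavioral SAR conversion: one comparator decision per bit, MSB first."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)            # tentatively set this bit
        v_dac = vref * trial / (1 << bits)   # ideal DAC output for the trial code
        if vin >= v_dac:                     # the comparator makes the only analog decision
            code = trial                     # keep the bit
    return code
```

The loop needs only a comparator, a DAC, and digital logic, which is why no linear amplifier appears anywhere in the conversion.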
This thesis presents two ADCs that use these emerging trends to maximize energy-
efficiency. A 6-bit flash ADC described in Chapter 2 leverages extensive comparator
redundancy and reconfigurability to enable operation down to 0.2 V. A separate, 5-
bit ADC described in Chapter 4, uses a multi-stage, dynamic integrator structure
similar to an integrating ADC to generate the two bits of coarse quantization while
a traditional flash ADC generates an additional three bits of fine quantization.
1.2.2 Wireless Transceivers
Wireless communication links found in energy constrained systems typically dominate
the system's overall energy consumption. To reduce energy consumption, an emerging
trend in radio design is to leverage highly digital architectures in advanced CMOS
processes. There are several advantages associated with designing radios in advanced
CMOS processes. One key advantage is that the radio can be integrated in a mixed-
signal chip along with digital circuit blocks, thereby allowing for a highly compact
implementation. A SoC radio, in which the RF, analog, and digital circuits are all
integrated on the same die, offers the benefit of reduced parasitic capacitances between
the many sub-blocks, thereby resulting in lower power consumption than a multi-
chip solution. SoCs offer the additional opportunity for more flexible system-level
optimization.
Much like low resolution ADCs, radios benefit from process scaling, particularly
when the architecture is optimized for the process. Highly-digital radios offer sev-
eral compelling advantages over traditional radio architectures in nanometer-scale
processes. Nanometer-scale processes offer very low-impedance switches with low
parasitic capacitance, thereby enabling highly reconfigurable and scalable radio ar-
chitectures. Reconfigurability is becoming increasingly important as wireless devices
are now being required to support multiple standards and frequency bands on a
single-chip. This has led to a need for software-defined radios (SDRs), which can
be reprogrammed when necessary to support a wide range of wireless standards [26].
A separate benefit of process scaling is increasing device fT, which has resulted in
untuned, non-resonant circuits becoming increasingly energy efficient such that they
can be used in place of tuned, resonant circuits [27]. These untuned circuits occupy
significantly smaller area than tuned circuits.
UWB communication, a technology that was first demonstrated by Marconi with
spark gap transmitters, has recently gained traction in the research community in part
due to its compatibility with advanced CMOS processes. UWB technology is a form
of wireless communication in which signals occupy a wide bandwidth, greater than the
lesser of 500 MHz or 20% of the center frequency of the signal, according to the FCC.
There are several reasons why UWB communication is well suited to nanometer-scale
CMOS and highly digital architectures. For high data rate UWB communication,
operating at hundreds of Mb/s and above, only advanced CMOS processes are fast
enough and sufficiently low power to decode and process the received bits. Addition-
ally, due to the wide bandwidth of UWB signals, they can be efficiently amplified and
processed with wide-bandwidth, low Q resonant or non-resonant circuits, which can
be easily integrated on-chip with minimal area [28]. One specific method of UWB
communication, termed impulse radio ultra-wideband (IR-UWB), is to encode data
in short pulses, on the order of nanoseconds. IR-UWB signaling is highly compati-
ble with digital architectures, and very simple digital pulse transmitters consisting of
only digital logic and delay elements have been successfully demonstrated [29].
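The bandwidth criterion quoted above (wider than the lesser of 500 MHz or 20% of the center frequency) can be written as a one-line check. This is a minimal sketch of the definition as stated in this section; the function name is hypothetical, and how the bandwidth itself is measured is outside its scope.

```python
def is_uwb(bandwidth_hz, center_hz):
    """True if the bandwidth exceeds the lesser of 500 MHz or 20% of the center frequency."""
    threshold_hz = min(500e6, 0.20 * center_hz)
    return bandwidth_hz > threshold_hz
```

For a 4 GHz center frequency the 500 MHz limit applies, whereas for a 1 GHz center frequency the 20% rule (200 MHz) is the lesser of the two.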
1.3 Thesis Contributions
This thesis broadly examines the design of highly digital, mixed-signal circuits and
architectures focusing on ADCs and UWB wireless transceivers. The scope of this
thesis slowly expands, moving from a component level to a system-on-chip level and
then finally to a full system demonstration. The main contributions of this thesis are
in the following five areas.
1. Voltage/frequency scalable ADC - This thesis presents a voltage scalable flash
ADC that can operate down to a supply voltage of 200 mV, which is the lowest
reported operating voltage for a Nyquist rate ADC [30]. To achieve such low
supply voltage operation requires a highly digital architecture that leverages
redundancy to improve linearity and yield. Several circuit techniques are used to
reduce the effects of leakage and the ADC implements common-mode feedback
digitally.
2. Analysis of stacking in subthreshold regime - Due to comparator redundancy in
the voltage scalable flash ADC, significant variation in the comparator's switching
thresholds can be tolerated. This allows for the use of a non-traditional comparator,
where the switching voltage is set through device stacking. This thesis
presents an analysis of device stacking in the subthreshold regime, demonstrat-
ing that stacking results in a quadratic change in effective device strength.
3. All-digital UWB transmitter - Two key problems inherent in many low power
UWB transmitters are that they either are not compliant with the FCC spectral
mask without off-chip filters [29] or the output pulse amplitude is extremely low,
fundamentally limiting range [31, 32]. A highly digital IR-UWB transmitter is
presented that employs capacitive combining to generate large pulse amplitudes
that are FCC compliant while being energy efficient [33].
4. Highly integrated, noncoherent UWB receiver SoC - UWB receivers are tradi-
tionally more complex than transmitters, and it is particularly challenging to
achieve low energy, high performance operation in a highly integrated form fac-
tor. This thesis presents the first fully integrated noncoherent UWB receiver
with embedded synchronization logic [34, 35]. The receiver employs several
highly digital amplifier and ADC structures, combined with extensive digital
calibration to realize robust, low-energy operation in a 90 nm CMOS process.
5. System demonstration of UWB chipset for moth flight control - Insect flight
control is an emerging area of research where electronics and microelectrome-
chanical systems (MEMS) are placed on and within insects to wirelessly control
their flight. This thesis presents the first high performance radio that has been
successfully demonstrated in wireless flight control of a moth. In this prelim-
inary demonstration, the receiver SoC is implemented on a miniature, battery
powered printed circuit board (PCB) weighing only 1 g and consuming only a
few milli-Watts of average power.
1.4 Thesis Outline
This thesis addresses the design of highly digital mixed-signal circuits and
architectures. The thesis is structured to slowly expand in scope, moving from a
component level to a system-on-chip level and then finally to a full system demon-
stration. Chapter 2 focuses on a specific component, an ADC, describing a voltage
and frequency scalable flash ADC. While this specific ADC is not used in the re-
mainder of the thesis, the chapter highlights several highly digital design techniques,
some of which are used later. Next, the thesis moves to the SoC level, describing a
UWB transceiver chipset. Chapter 3 introduces the low data rate UWB architecture
and presents a highly digital transmitter. The receiver SoC circuits and measurement
results are presented in Chapters 4 and 5, respectively. Finally, the thesis presents
a system demonstration of the UWB transceiver chipset in Chapter 6, in an insect
flight control system. Conclusions and a discussion of future work are presented in
Chapter 7.
Chapter 2
Highly Digital, Voltage Scalable ADC
Microsensor wireless networks and implanted biomedical devices have emerged as
exciting new application domains. These applications are highly energy constrained
and require flexible, integrated, energy-efficient ADC modules that can ideally operate
at the same supply voltage as digital circuits. In many applications, the performance
requirements are quite modest (~100 kS/s). In systems with extensive digital signal
processing, an additional demand faced by these ADCs is that they be compatible with
advanced digital CMOS processes. As CMOS processes advance, digital switching
energy reduces and scaling allows for increasingly complex algorithms with minimal
energy overhead but key challenges such as increased leakage and device variation
emerge.
In recent years, highly digital ADC architectures like SAR and ΣΔ modulators
have gained popularity due to their compatibility with advanced CMOS processes.
In [36], a frequency-to-digital ΣΔ modulator is presented that uses only inverters and
digital logic gates, operating at a supply voltage of 0.2 V. In many of these ADCs, the
overall digital (CV²) power consumption is greater than analog power consumption,
allowing for significant digital energy savings through voltage scaling. Voltage scal-
ing can also be applied to analog circuits to reduce power consumption, particularly
in low-resolution ADCs where thermal noise is not a limiting constraint; however,
Figure 2-1: Conceptual block-diagram of an inverter-based ADC.
care must be taken to minimize the impact of power supply noise. Moreover, when
operating analog circuits at low supply voltages, device leakage and variation, al-
ready serious concerns in advanced CMOS processes, become increasingly severe and
traditional circuits and architectures are often impractical. To overcome these chal-
lenges, highly digital architectures must be employed and combined with techniques
like redundancy and reconfigurability.
Inspired by the aforementioned scaling trends, much research has focused on re-
alizing highly digital ADCs with the ultimate goal of a synthesizable ADC. Imagine,
for instance, a highly digital flash ADC consisting solely of a sea of many redundant
and reconfigurable inverter-based comparators combined with digital backend logic
for calibration, as shown in Figure 2-1. Due to the redundancy, there are a very
large number of comparators, and during normal operation only a small subset of the
comparators are enabled. This extreme redundancy allows for the tolerance of large
comparator voltage offsets. In a flash ADC reconfigurability can be combined with
redundancy to allow any comparator to be assigned to any ADC threshold. If, after
calibration, only a subset of inverters are enabled such that their switching thresholds
are linearly spaced, an energy efficient, highly digital ADC can be realized.
This chapter presents a highly-digital, voltage scalable flash ADC implemented
in a 0.18 µm CMOS process, inspired by the vision of an inverter-based ADC [30].
Section 2.1 describes the ADC architecture, highlighting how redundancy and reconfigurability
are used to improve linearity and how extensive processing is moved to the
digital domain. Section 2.2 presents the key ADC circuit blocks, including the front-
end sampling switch and the clocked comparator array. Transistor sizing and stacking
are used to vary comparator switching thresholds, and a mathematical analysis of the
relationship between transistor stacking and comparator switching thresholds in the
subthreshold regime is presented. Finally, measurement results are presented in Sec-
tion 2.3.
2.1 ADC Architecture
2.1.1 Background and Theory
To achieve energy efficiency, the ADC presented in this chapter is designed to operate
at low voltages, where the energy per conversion is minimized. This operating voltage
is akin to the minimum energy point for digital circuits [37] and is based on the
assumption that the ADC sample rate scales with varying supply voltages. For ADCs,
the energy per conversion is minimized when the sum of leakage energy and active
energy is minimized, which for the ADC presented in this chapter occurs at supply
voltages near MOSFET threshold voltages. Low voltage operation allows for improved
energy efficiency but limits the maximum operating frequency. If higher operating
frequencies and input bandwidths are required, interleaving and parallelism can be
used to recover performance [22]. Low voltage operation also causes many analog
design challenges that must be addressed. Two key architectural challenges are that
increased variation in the subthreshold regime causes significant comparator offsets,
and that traditional differential architectures are impractical.
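The trade-off between leakage energy and active energy described above can be sketched with a deliberately simple model. Everything in it is an assumption made for illustration: active energy scales as C·VDD², the conversion time grows exponentially as VDD drops below threshold, and the constants `k_leak` and `n_vt` are hypothetical rather than fitted to this ADC.

```python
import math


def energy_per_conversion(vdd, k_leak=1e4, n_vt=0.039, c_eff=1.0):
    """Normalized energy per conversion under a simple subthreshold model.

    Active energy scales as C * VDD**2. Leakage energy is leakage power
    integrated over a conversion time that grows exponentially as VDD
    drops, which gives the exp(-VDD / n_vt) term. Parameters are hypothetical.
    """
    active = c_eff * vdd ** 2
    leakage = c_eff * vdd ** 2 * k_leak * math.exp(-vdd / n_vt)
    return active + leakage


def minimum_energy_point(v_min=0.1, v_max=1.0, steps=901):
    """Sweep VDD and return (vdd, energy) at the lowest energy per conversion."""
    sweep = [v_min + i * (v_max - v_min) / (steps - 1) for i in range(steps)]
    best = min(sweep, key=energy_per_conversion)
    return best, energy_per_conversion(best)
```

With these placeholder constants the sweep finds an interior minimum: below it the exponentially longer conversion time makes leakage energy dominate, and above it the quadratic active energy dominates.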
A key block in flash ADCs is the comparator network, including the peripheral
circuitry that ensures each comparator has an appropriate switching threshold. In
traditional flash ADCs, where there is a 1:1 correspondence between comparator
and output code, the combined comparator and reference voltage offset must be
significantly less than 1 least significant bit (LSB) to ensure a reasonable linearity.
For example, assuming a Gaussian distribution, a 6-bit ADC requires an offset, σoffset,
to be smaller than 0.2 LSB to achieve a 99% yield of INL < 1 LSB [38]. Maintaining
Figure 2-2: (a) Nominal threshold voltage of comparators for a 6-level, flash ADC
with three redundant comparators at each threshold voltage. (b) Example of the
ADC with threshold variation. The comparators that are selected after calibration
are highlighted.
low offsets requires large transistors, resulting in significant parasitic capacitance
and area. Alternatively, offsets can be cancelled through analog and mixed-signal
techniques such as a feedback DAC [39, 40] or correlated double sampling (CDS)
[41]. In [40], large offsets in a flash ADC preamplifier are cancelled by embedding a
5-bit DAC within each preamplifier.
As it is difficult to realize analog offset compensation at low supply voltages, the
ADC architecture leverages digital calibration combined with redundancy [23]. Many
redundant digital regenerative comparators with large input-referred offset variation
are used in place of a small number of precise comparators and reference voltages.
Any comparator can be assigned to any specific threshold, and there are many more
comparators available than thresholds required. By increasing the number of redun-
dant comparators, the ADC can achieve the required yield even in the presence of
very large variation in each comparator's threshold, with a standard deviation, σ,
much larger than one LSB.
Figure 2-2 graphically demonstrates how redundancy and reconfigurability are
used by the flash ADC. In the figure, the comparator threshold voltages of a 6-level
flash ADC are shown where three comparators are inserted for each threshold voltage.
Figure 2-2(a) shows the comparator thresholds when there is no comparator threshold
variation. In this situation, the redundancy and reconfigurability are not needed.
Figure 2-2(b) shows the comparator thresholds in the presence of comparator threshold
variation. The comparators that are selected after calibration are highlighted. In the
example, comparator 2b has a significant voltage offset, but is selected to represent
the fourth switching threshold. There is no comparator that perfectly aligns with the
fifth switching threshold, so the closest comparator, 5c, is selected.
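The selection step illustrated in Figure 2-2(b) can be sketched as a greedy nearest-threshold assignment. This is only one plausible selection rule, assumed here for illustration; the function name, the data layout, and the example voltages are hypothetical.

```python
def select_comparators(measured, ideal):
    """Greedily assign to each ideal threshold the closest unused comparator.

    measured: dict mapping comparator name -> measured threshold (volts)
    ideal:    list of ideal switching thresholds (volts)
    Requires at least as many comparators as ideal thresholds.
    Returns a list of (ideal_threshold, comparator_name) in the order of `ideal`.
    """
    available = dict(measured)
    selection = []
    for target in ideal:
        name = min(available, key=lambda k: abs(available[k] - target))
        selection.append((target, name))
        del available[name]   # each comparator is enabled at most once
    return selection


# Hypothetical measured thresholds: comparator "2b" has a large offset,
# yet it ends up as the best match for the fourth ideal threshold.
measured = {"1a": 0.11, "1b": 0.08, "2a": 0.19, "2b": 0.41, "3a": 0.33, "4a": 0.52}
chosen = select_comparators(measured, [0.1, 0.2, 0.3, 0.4])
```

Because any comparator may serve any threshold, a large offset does not disqualify a comparator; it only changes which threshold it is assigned to.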
The concept of comparator redundancy and reconfigurability can be analyzed
numerically to gain a quantitative understanding of the relationship between redundancy
and yield. When σ < 1 LSB, the probability distributions of individual
comparator thresholds are narrow around their respective mean thresholds,
whereas when σ > 1 LSB, the probability distribution of each comparator threshold
significantly overlaps those of comparators with nearby thresholds. In this scenario
(σ > 1 LSB), the number of comparators within a given voltage range is proportional
to the size of the voltage range, ignoring edge effects at the boundaries of the input
range. Thus, the number of thresholds within a given range is approximately Poisson
distributed. If we assume N comparator
thresholds over an input range of Vdr, and a redundancy factor of R, the probability
that there are no thresholds within a voltage range of x can be calculated to be:
$$\text{Probability} = e^{-\frac{xNR}{V_{dr}}} \tag{2.1}$$
From this equation, we can calculate the expected probability that INL < 1 LSB,
assuming no correction for gain and offset errors. Here, INL is defined as the maximum
difference between the ideal and actual code transition levels after correcting for gain
and offset [42]. When INL < 1 LSB, there cannot be a gap of ±1 LSB around any
of the N ideal switching thresholds (or code transition levels) with no comparator
thresholds, and thus $x = 2V_{dr}/N$. Thus, the following approximation for INL is derived:
$$\text{Probability}(\mathrm{INL} > 1\ \mathrm{LSB}) \approx N e^{-2R} \tag{2.2}$$
This approximation is only valid when $e^{-2R} \ll 1$. The approximation is presented
in Figure 2-3 along with Monte Carlo results showing how redundancy and reconfig-
urability can be combined to achieve a required linearity and yield in the presence of
[Figure 2-3 plot: x-axis is Redundancy Factor, R (0 to 15). Legend: no reconfiguration, σ=2; reconfiguration up to 1 LSB, σ=3; full reconfiguration, σ=3; approximation for full reconfiguration.]
Figure 2-3: Yield of ADC versus redundancy factor for ADCs with varying levels of
comparator reconfigurability. Redundancy factor refers to the number of comparators
assigned to a given threshold. Additional comparators are placed at the edges of the
input range to compensate for edge effects. INL values are not corrected for gain and
offset errors.
large comparator threshold variation (σ > 1 LSB). In the Monte Carlo simulations,
no offset error and no gain error are tolerated. The ADC yield can significantly
improve if offset and gain errors are tolerated. As edge effects reduce yield, addi-
tional comparators are inserted with switching thresholds at the edges of the input
range. Compared to increasing device sizes, redundancy and reconfigurability have
been shown to offer an improved trade-off between power/area and linearity [23].
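The relationship between redundancy and yield can be explored with a small Monte Carlo sketch and compared against the N·exp(-2R) approximation. This is a simplified model under stated assumptions, not the simulation behind Figure 2-3: thresholds are Gaussian around nominal levels spaced 1 LSB apart, a trial fails if any ideal level has no threshold within ±1 LSB (the gap argument used in the text), and the requirement that each level use a distinct comparator is ignored. All names and default values are hypothetical.

```python
import bisect
import math
import random


def approx_failure_probability(n_levels, redundancy):
    """Approximation from the text: P(INL > 1 LSB) ~ N * exp(-2R)."""
    return n_levels * math.exp(-2.0 * redundancy)


def simulate_failure_probability(n_levels=63, redundancy=3, sigma_lsb=3.0,
                                 edge_levels=9, trials=500, seed=1):
    """Estimate the probability that at least one ideal level has no
    comparator threshold within +/-1 LSB. All voltages are in units of LSB."""
    rng = random.Random(seed)
    # Nominal thresholds, padded at both edges to reduce edge effects.
    nominal = range(1 - edge_levels, n_levels + edge_levels + 1)
    failures = 0
    for _ in range(trials):
        thresholds = sorted(rng.gauss(mu, sigma_lsb)
                            for mu in nominal for _copy in range(redundancy))
        for level in range(1, n_levels + 1):
            # Index of the first threshold at or above (level - 1 LSB).
            i = bisect.bisect_left(thresholds, level - 1.0)
            if i == len(thresholds) or thresholds[i] > level + 1.0:
                failures += 1   # gap wider than +/-1 LSB around this level
                break
    return failures / trials
```

Sweeping `redundancy` in both functions shows the failure probability falling steeply as R grows. The two results should not be expected to agree exactly, since the sketch uses Gaussian rather than ideal Poisson-distributed thresholds and gaps at adjacent levels are correlated.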
2.1.2 Overview
Figure 2-4 shows a block diagram of the ADC. The ADC can be configured in
either a single-ended or pseudo-differential configuration. It consists of a sampling
network, two arrays of 127 dynamic digital clocked comparators and a digital backend.
Instead of a traditional reference ladder that draws static current, the ADC uses
dynamic comparators with static voltage offsets to generate comparator thresholds.
The digital dynamic comparators are based on a sense-amplifier flip-flop and are
described in detail in Section 2.2. The digital backend consists of two 127-bit Wallace
Figure 2-4: Block diagram of the flash ADC.
tree adders, two on-chip 127 x 9b memories with calibration logic, and an infinite-
impulse response (IIR) common-mode rejection filter. The Wallace tree adders sum
the individual thermometer encoded comparator outputs and generate binary values.
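Functionally, summing the comparator outputs amounts to counting how many comparators report a logic one. The sketch below models only that arithmetic result, not the Wallace tree structure itself; the function name is hypothetical.

```python
def ones_count_encode(comparator_outputs):
    """Sum 1-bit comparator decisions into a binary output code."""
    return sum(1 for bit in comparator_outputs if bit)


# The result does not depend on the physical order of the comparators.
ordered = ones_count_encode([1, 1, 1, 0, 0])
shuffled = ones_count_encode([1, 0, 1, 0, 1])
```

Because the code is a count rather than the position of a single transition, the encoder does not require the enabled comparators to be physically ordered by threshold.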
The ADC is designed for a maximum of 6 bits of resolution, so in nominal mode
no more than 63 comparators are enabled, and 64 comparators are disabled. For this
implementation, a redundancy factor of 2 was used to reduce area overhead at the
cost of degraded linearity when compared to an ADC with higher redundancy factor.
Before nominal operation can commence, the ADC must be calibrated by applying
an input with known distribution such as a triangle wave. In single-ended mode,
calibration can be applied in a ping-pong process, whereas in pseudo-differential mode
calibration must be foreground. While calibrating, the Wallace tree adder is bypassed
and each comparator is assigned to a specific 9-bit accumulator. An estimate of the
CDF of the input is generated in on-chip memory, and the comparator thresholds
are back-calculated from this data off-chip. Based on these thresholds, an off-chip
algorithm determines which comparators to enable. Once the appropriate subset of
comparators are enabled, the ADC can operate in nominal mode with the output
code taken at the output of the Wallace tree adders.
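The back-calculation of comparator thresholds from accumulator counts can be sketched as follows, assuming a calibration input that is uniformly distributed over the input range (as an ideal triangle wave is). The function names, the 511-sample count chosen to fit a 9-bit accumulator, and the example thresholds are illustrative assumptions; the actual off-chip algorithm is not specified here.

```python
def estimate_threshold(ones_count, n_samples, v_min, v_max):
    """Back-calculate a comparator threshold from its accumulator count,
    assuming the calibration input is uniformly distributed on [v_min, v_max]."""
    fraction_high = ones_count / n_samples   # estimate of P(vin > threshold)
    return v_max - fraction_high * (v_max - v_min)


def calibrate(comparator_thresholds, n_samples=511, v_min=0.0, v_max=1.0):
    """Simulate calibration with an ideal ramp (one half-period of a triangle wave)."""
    ramp = [v_min + (i + 0.5) * (v_max - v_min) / n_samples for i in range(n_samples)]
    estimates = []
    for vth in comparator_thresholds:
        count = sum(1 for v in ramp if v > vth)   # fits in a 9-bit accumulator (<= 511)
        estimates.append(estimate_threshold(count, n_samples, v_min, v_max))
    return estimates
```

In this sketch the resolution of each estimate is limited by the number of samples accumulated, about 1/511 of the input range for a single ramp.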
As true differential architectures are not amenable to low-voltage operation, the
ADC attempts to mimic the advantages of differential circuits through digital sig-
nal processing. Low-frequency common-mode rejection is implemented in pseudo-
differential mode with an IIR filter and a 5-bit capacitive feedback DAC, which injects
charge on the sampling capacitor to cancel common-mode offsets. The two single-
ended ADC outputs are averaged and compared to the desired mid-scale code. This
technique is advantageous for full-swing inputs where common-mode offsets can re-
sult in clipping and reduced performance. In an integrated system with a differential
amplifier driving the ADC input, the feedback DAC can be removed and instead the
IIR filter output can directly vary the common-mode output of the amplifier.
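The common-mode feedback loop can be illustrated with a small behavioral sketch. It assumes a first-order IIR accumulator, a signed 5-bit correction code with a step of one LSB, ideal single-ended ADCs, and a zero differential signal. The coefficient `alpha` and all other numeric choices are hypothetical and do not describe the implemented filter.

```python
import math


def simulate_cm_loop(cm_offset_lsb, n_steps=200, alpha=0.125, mid_code=31.5):
    """Return the residual common-mode offset (in LSBs) seen at each step.

    Assumed model: the two single-ended codes are averaged, compared to the
    mid-scale code, accumulated by a first-order IIR filter, and the rounded
    filter state is applied as a signed 5-bit correction (-16..15).
    """
    state = 0.0      # IIR filter state, in LSBs
    dac = 0          # signed 5-bit feedback code
    residuals = []
    for _ in range(n_steps):
        residual = cm_offset_lsb - dac            # offset left after correction
        residuals.append(residual)
        code_p = mid_code + residual              # differential signal taken as zero
        code_n = mid_code + residual
        error = (code_p + code_n) / 2.0 - mid_code
        state += alpha * error                    # y[n] = y[n-1] + alpha * e[n]
        dac = max(-16, min(15, math.floor(state + 0.5)))
    return residuals


if __name__ == "__main__":
    trace = simulate_cm_loop(5)
    print(trace[0], trace[-1])
```

With an offset that is an integer number of DAC steps and within the DAC range, the residual in this model settles to zero. Offsets outside the 5-bit range saturate the correction code and leave a residual.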
An alternate architecture that does not require large on-chip memories or sig-
nificant calibration computation is described in [43], whereby the inherent Gaussian
variation in comparator thresholds is used to obtain linearity over an input-range.
The stochastic ADC in [43] is fundamentally different from this work, as variation is
leveraged in [43], whereas in this work variation is tolerated.
2.2 ADC Circuits
To achieve good ADC performance at low supply voltages, there are several circuit
challenges that must be addressed in the sampling network, comparator array and
digital backend. This section describes the ADC circuit blocks in detail.
2.2.1 Sampling Network
At low supply voltages, it becomes challenging to realize good sampling switches
due to the degraded ratio of 'on' conductance to 'off' current. The sampling switch
must have a sufficiently high 'on' conductance and/or linearity such that it does not
introduce distortion, and the 'off' current must not result in input-dependent ADC
errors. To improve the linearity of the 'on' conductance, one can use resistor-based
sampling techniques [44] and constant-VGS bootstrapping techniques [45]. As these
techniques can be challenging to implement in combination with extreme voltage and
frequency scaling, in this work we focus on techniques solely to increase the 'on'
conductance.
Figure 2-5: Four possible circuit implementations for the sampling network. All four
circuits are sized for equal 'on' conductance.
Figure 2-6: Transient leakage simulation demonstrating the relative 'off' leakage for
the four sampling switch options presented in Figure 2-5. V1, V2, V3, and V4 correspond
to the nodes labeled in Figure 2-5. Simulation results are for typical devices at 100 °C.
To improve the ratio of 'on' conductance to 'off' current, device stacking [24],
voltage boosting [25], and leakage feedback cancellation can be employed. To compare
these techniques, Figure 2-5 presents four sampling switch circuit implementations.
The four implementations are all sized for equal 'on' conductance. Figure 2-5(a)
presents a simple, single transistor sampling switch. At low supply voltages, the
gate overdrive can be as low as a few hundred millivolts and thus the switch must
be sized very large, resulting in large 'off' leakage current and significant switching
energy. If the 'off' current is sufficiently large, it can result in ADC errors while the
comparators are resolving. Voltage boosting can be employed to increase the 'on'
conductance while not increasing the 'off' current, as shown in Figure 2-5(b), as long
as device reliability is not a problem. Additionally, connecting devices in series can be
employed to reduce leakage, as shown in Figure 2-5(c). Connecting devices in series
has been shown to result in significant leakage reduction compared to a single device
[46]. While this results in only a minimal improvement in the ratio of 'on' conductance
to 'off' current, when combined with a feedback amplifier as shown in Figure 2-5(d),
a substantial reduction in leakage can be achieved. The feedback amplifier serves to
actively drive the internal node to the same voltage as the sampling capacitor, thus
reducing the VDS and IDS of the sampling switch closest to the sampling capacitor.
The feedback amplifier consists of self-biased NMOS and PMOS source followers and
consumes only leakage current. The transient plot in Figure 2-6 shows how these
techniques reduce the leakage on the sampling capacitor when the sampling switch
is open. Voltage boosting results in a dramatic decrease in leakage and the feedback
amplifier reduces leakage by an additional ~40%.
In this work, the sampling switch of Figure 2-5(d) is implemented. In parallel
with the load capacitor CL is a 5-bit capacitive DAC used to cancel low frequency
common-mode offsets. Figure 2-7 presents the voltage boosting circuit that drives the
sampling switches. The final stage inverter of the voltage boosting circuit is designed
so that the clock output can never drop below VDD due to leakage when it should be
held high. Due to parasitic capacitances, the output voltage is simulated to reach a
maximum of 510 mV when VDD equals 300 mV.
Figure 2-7: Circuit schematic for voltage boosting circuit (input: Clock In; output:
Boosted Clock).
2.2.2 Comparator Array
The digital dynamic comparators used in the ADC are based on a sense-amplifier flip-
flop [47]. A simplified schematic of the flip-flop is shown in Figure 2-8. The sampled
analog voltage is applied to one input of the flip-flop, and a reference voltage of 0 V
is applied to the other input. The reference voltage could potentially be adjusted to
cancel global voltage offsets. Alternatively, body biasing could be employed to cancel
global voltage offsets. Comparator thresholds are varied by adjusting the effective
strength of the input PMOS devices. A variable number of minimum sized PMOS
input devices are connected in parallel and series. To reduce kick-back, the gates
of dummy PMOS devices are connected to the sampling capacitor and their drain
and source nodes are driven in counterphase to the internal flip-flop voltages. The
single stage flip-flop uses positive feedback to achieve a superior power-delay product
compared to a linear amplifier. Even though regenerative amplifiers are subject to
large input-referred offsets, these offsets are acceptable given the redundancy and
reconfigurability.
[Figure 2-8: Simplified schematic of the comparator flip-flop. Labeled elements: dummy
input devices, Clock, Voltage In, Voltage Reference, M1, M3, and Vout; input devices are
connected in series and in parallel.]
The comparator structure is designed to operate at supply voltages both above
and below VT. At low supply voltages, the comparator threshold range decreases
and it becomes increasingly difficult to realize a large threshold range through device
sizing. In the subthreshold regime, due to the exponential dependence of current on
gate voltage, to achieve a threshold range of 200 mV solely by varying device width,
a device must be varied in width by over 100 times. Stacking devices in series is
preferred to linear width scaling as the device strength decreases quadratically rather
than linearly in proportion to device area. This allows for a smaller comparator im-
plementation and consumes less power than setting comparator thresholds by scaling
device widths or by adding capacitors at the drain or source nodes of transistors M1
and M3 in Figure 2-8 [48]. For example, when the comparator operates at a supply
voltage of 300 mV, the switching threshold changes by 108 mV when increasing from
one device to six stacked series devices. Alternatively, if the width of a device is
increased or decreased by six times, the switching threshold changes by only 65 mV.
By using many instances of a single device of minimum size rather than varying its
width or length, the comparator thresholds can be estimated by only characterizing
a single device. A numerical proof of the quadratic relationship between stacking
and effective device strength in the subthreshold regime is presented in the following
sub-section.
2.2.3 Analysis of Device Stacking in the Subthreshold Regime
The effect of stacking transistors in digital CMOS logic has been well studied in
literature at supply voltages above VT. In this regime, transistors that are 'on' can
be modeled as resistors [49] and stacking transistors results in a quadratic increase in
propagation delay. However, in the subthreshold regime, transistors are not accurately
modeled by resistors, and this relationship must be re-evaluated.
For the comparator shown in Figure 2-8, the switching threshold is determined
by what input voltage causes the input pull-up network to be equal in strength to
the reference pull-up network. As an approximation, the switching threshold can
be estimated as when the two pull-up networks have equal propagation delay if the
positive feedback load is removed and the pull-up network is analyzed as if it were
a dynamic digital gate. Such a structure is shown in Figure 2-9, but with NMOS
input devices instead of PMOS devices. By characterizing the effect of input voltage,
stacking, and device width on propagation delay, one can estimate the switching
threshold of the associated comparator.
Figure 2-9: Multiple NMOS transistors stacked. All nodes are initially precharged to
VDD.
For the mathematical analysis, we first assume that we have N stacked NMOS
devices as shown in Figure 2-9. All internal nodes are initially precharged to VDD.
Cp represents the parasitic capacitance seen at internal nodes, and CL represents the
capacitance at the load node.
We can represent the circuit in Figure 2-9 with the following set of differential
equations.
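\begin{align}
\frac{dV_1}{dt} &= \frac{1}{C_P}\left(I_{D,M_2} - I_{D,M_1}\right) \tag{2.3a} \\
\frac{dV_{N-1}}{dt} &= \frac{1}{C_P}\left(I_{D,M_N} - I_{D,M_{N-1}}\right) \tag{2.3b} \\
\frac{dV_N}{dt} &= \frac{1}{C_L}\left(-I_{D,M_N}\right) \tag{2.3c}
\end{align}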
Figure 2-10: Transient (a) ODE and (b) piecewise linear solution for V1, V2, ..., V10
for ten stacked NMOS transistors.
In the subthreshold regime, these equations can be expanded by using the following
equation for subthreshold current [50]:
\[ I_{D,M_i} = I_S \, e^{(V_{GS} - V_{tn})/(nV_{th})} \left(1 - e^{-V_{DS}/V_{th}}\right) \tag{2.4} \]
where VGS and VDS are the gate-source and drain-source voltages of device Mi, Vtn is
the threshold voltage, Vth is the thermal voltage, Is is a process and geometry
dependent parameter and n is a process dependent parameter. For additional accuracy,
Vtn can be modified to include the
body effect. Although equation (2.3) cannot be easily analyzed analytically, it can
be analyzed using an ordinary differential equation (ODE) numerical solver. As an
example, we examine the scenario with N = 10, when all NMOS devices are minimum
sized with a gate voltage of 300 mV, a Vtn of 400 mV, and a supply voltage of 400 mV.
Cp is assumed to be 1.5 fF and the load capacitance CL is assumed to be 5 fF, to
represent device and parasitic capacitance. The delay is calculated to be the time
when the load voltage, V10, equals half of the supply voltage (i.e. 200 mV).
A transient solution of the ODE is shown in Figure 2-10(a). An interesting char-
acteristic of the transient plot is that only one node appears to be discharging at
a time. Moreover, each node appears to be discharging at a different but constant
rate, with the rate decreasing as later nodes are discharged. To simplify analysis, this
system can be represented by a piecewise-linear (PWL) approximation as shown in
Figure 2-10(b) and derived in Appendix A. The PWL approximation achieves a very
good match to the ODE solution.
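The numerical solution of Equation (2.3) can be sketched with a simple forward-Euler integrator. This is an illustration only: the values of `i_s`, `n`, `v_th`, the time step and the time limit are assumptions (they are not given in the text), the source of M1 is taken as an ideal ground, and the body effect is ignored. The absolute delays it produces are therefore not expected to match Figure 2-10.

```python
import math


def stack_delay(n_stack, v_in=0.3, v_dd=0.4, v_tn=0.4, c_p=1.5e-15, c_l=5e-15,
                i_s=1e-6, n=1.5, v_th=0.026, dt=2e-11, t_max=1e-5):
    """Forward-Euler integration of Eq. (2.3) with the current model of Eq. (2.4).

    Returns the time at which the load node falls to v_dd / 2.
    i_s, n, v_th, dt and t_max are assumed values chosen for illustration.
    """
    v = [0.0] + [v_dd] * n_stack            # v[0]: grounded source of M1
    caps = [c_p] * (n_stack - 1) + [c_l]    # capacitance at nodes 1..N
    t = 0.0
    while v[n_stack] > v_dd / 2:
        if t > t_max:
            raise RuntimeError("load node did not discharge within t_max")
        # i_d[k] is the current through M(k+1), flowing from node k+1 to node k.
        i_d = [i_s * math.exp((v_in - v[k] - v_tn) / (n * v_th))
               * (1.0 - math.exp(-(v[k + 1] - v[k]) / v_th))
               for k in range(n_stack)]
        for k in range(1, n_stack + 1):
            i_in = i_d[k] if k < n_stack else 0.0   # current from the device above
            v[k] += dt * (i_in - i_d[k - 1]) / caps[k - 1]
        t += dt
    return t


if __name__ == "__main__":
    for depth in (1, 2, 4, 10):
        print(depth, stack_delay(depth))
```

Because every node above the first starts with zero drain-source voltage and a reduced gate-source voltage, the upper devices conduct very little until the node below them has discharged, which reproduces the one-node-at-a-time behavior noted above.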
Based on the PWL mathematical model, the following expression for the total
propagation delay, td is derived in Appendix A:
ta l (n+i- 1 (-v '-Vtn)) (n ii n1 1)
td C Dn e -nVthln - i + 1 - -)) (2.5)
Equation (2.5) can be further simplified if we assume the effect of the logarithm
is negligible, meaning that the VDD - nVth ln(...) term of Equation (2.5) can be
considered to be constant over the entire range of i. Based on this assumption, the
delay can be represented with the following second-order equation:
\[ t_d = e^{-V_{in}/(nV_{th})} \left(C_1 N^2 + C_2 N + C_3\right) \tag{2.6} \]
where C1, C2, and C3 are constants. This agrees with existing analysis of
above-threshold logic elements [49]. At the switching threshold of the flip-flop, the
delay of the input side of the flip-flop can be approximately assumed to equal the
delay of the reference side (Tref). Thus, one can calculate the relationship between
comparator switching threshold, V, and the amount of stacking, N, by setting td = Tref:
\[ V = nV_{th} \ln\left(\frac{C_1 N^2 + C_2 N + C_3}{T_{ref}}\right) \tag{2.7} \]
As N increases, the N^2 term in Equation (2.7) will dominate the numerator of
the logarithm and thus the switching threshold will vary twice as quickly compared
to adjusting the input device width.
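The factor of two can be made explicit with a short worked comparison. It assumes that scaling the input device width by a factor k scales the delay by 1/k (an assumption consistent with, but not stated in, the analysis above) and that N is large enough for the N^2 term to dominate Equation (2.7).

```latex
\begin{align*}
\text{width scaled by } k:\quad
  & t_d \propto \tfrac{1}{k}\, e^{-V_{in}/(nV_{th})}
  && \Rightarrow\quad |\Delta V| = nV_{th}\,\ln k \\
\text{stack of } N \text{ devices}:\quad
  & V \approx nV_{th}\,\ln\!\left(\frac{C_1 N^2}{T_{ref}}\right)
  && \Rightarrow\quad |\Delta V| \approx 2\,nV_{th}\,\ln N
\end{align*}
```

For the values quoted in Section 2.2.2 (65 mV for a six-fold width change and 108 mV for six stacked devices), the ratio is about 1.7 rather than 2. This would be expected if C2 and C3 are positive, since the lower-order terms still contribute at N = 6.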
2.2.4 Wallace Tree Adder and Memory
The 127-bit thermometer output of each comparator array must be encoded to a
7-bit binary value to generate the digital output code. The encoder is realized with
a Wallace tree adder that allows any combination of comparators to be enabled and
guarantees ADC monotonicity. The Wallace tree adder consists of multiple full- and
half-adders arranged in a tree to allow for 127 one-bit values to be added together in
minimal area and propagation delay. Comparators are not assigned to any specific
code and can be reassigned arbitrarily. When comparators are disabled, their output
to the Wallace tree adder is gated and held at GND.
Figure 2-11: Block diagram of 127 by 9-bit memory with embedded counters used for
calculating the estimated CDF of comparator thresholds.
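The behavior of an adder-based encoder can be sketched in software. The example below counts the ones in a 127-bit comparator word using only full- and half-adder primitives arranged in carry-save columns. It is a functional illustration and does not reproduce the layer-by-layer structure of the fabricated Wallace tree; the function names are hypothetical.

```python
from collections import deque


def full_adder(a, b, c):
    """Return (sum, carry) for three one-bit inputs."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b


def adder_tree_count(bits):
    """Count the ones in `bits` by compressing columns of equal weight."""
    columns = {0: deque(bits)}      # bit weight exponent -> bits of that weight
    total, weight = 0, 0
    while weight in columns:
        col = columns[weight]
        while len(col) > 1:
            if len(col) >= 3:
                s, carry = full_adder(col.popleft(), col.popleft(), col.popleft())
            else:
                s, carry = half_adder(col.popleft(), col.popleft())
            col.append(s)                                   # sum stays in this column
            columns.setdefault(weight + 1, deque()).append(carry)
        if col:
            total |= col[0] << weight
        weight += 1
    return total


if __name__ == "__main__":
    word = [1] * 40 + [0] + [1] * 2 + [0] * 84   # thermometer code with one bubble
    print(adder_tree_count(word))
```

Because the output depends only on how many comparators are high and not on their positions, a bubble in the thermometer code or a reassigned comparator changes the output code by at most the number of affected comparators, and raising the input can never lower the count.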
The Wallace tree adder implements an energy efficient encoder; however, it is not
suitable for generating an estimated CDF as it breaks the link between comparators
and their associated thresholds. To generate the estimated CDF, the comparator
outputs are directly fed in parallel into a 127 by 9-bit memory (Figure 2-11). Nine
bits of memory are associated with each comparator to allow sufficient threshold
accuracy. Each block of memory has an associated counter that is used for CDF
generation. When the estimated CDF is being generated, each block of memory is
incremented by one whenever its associated comparator output is high. By the end
of the calibration period, the blocks of memory that are associated with comparators
that have low thresholds will have a high value stored in memory. The memory is
realized with CMOS latches to enable operation down to 0.2 V and operates off an
independent power supply so that it can be power gated when calibration is complete.
Figure 2-12: Die micrograph of the ADC in 0.18 µm CMOS (1.4 mm by 1.4 mm).
Active Die Area            1.4 mm by 1.4 mm
Supply Voltage             0.2 V to 0.9 V
Sampling Frequency         2 kS/s to 17.5 MS/s
Performance at 0.4 V, single-ended, post-calibration:
  Dynamic Performance      5.05-bit ENOB
  Power Consumption        1.66 µW
  FoM                      125 fJ/conversion-step
  DNL                      +1.23/-0.91 LSB
  INL                      +0.72/-0.90 LSB
Table 2.1: Summary of Results for ADC.
2.3 Measurement Results
The ADC is fabricated in a 0.18 µm 5M2P CMOS process and occupies 2 mm²
(Figure 2-12). It was packaged in a 0.5 mm pitch TQFP package. The ADC can
operate above 0.9 V, but the voltage boosting circuit must then be disabled, the ADC
speed plateaus, and CV² losses significantly degrade energy efficiency. The ADC
operates from 2 kS/s at 0.2 V to 17.5 MS/s at 0.9 V, as shown in Figure 2-13(a). The
remainder of this section describes how the prototype was tested and its measured
performance. A summary of results is presented in Table 2.1.
Figure 2-13: (a) Maximum sampling frequency and (b) energy per conversion versus
supply voltage, indicating presence of minimum energy per conversion at VDD ≈
0.4 V.
2.3.1 Static and Dynamic Performance
Static linearity ADC measurements were conducted at a supply voltage of 400 mV
and a sampling frequency of 400 kS/s. The code density test was conducted using
a full-swing, differential sinusoidal input with amplitude of 110 mV and frequency
of 1.52625 kHz [51]. In single-ended mode, the maximum DNL and INL
are +1.23/-0.91 LSB and +0.72/-0.90 LSB, respectively (Figure 2-14). In
pseudo-differential mode, the maximum DNL and INL are +0.98/-0.78 LSB and
+0.73/-0.61 LSB, respectively. To improve the DNL and INL, additional redundancy
is required.
Figure 2-14: (a) DNL and (b) INL of ADC versus digital code in single-ended 6-bit mode
at VDD = 0.4 V.
Figure 2-15: FFT of ADC in (a) single-ended and (b) pseudo-differential 6-bit mode
at VDD = 0.4 V. Annotations legible on the plots include Fin = 195.3 kHz, Fs = 400 kHz,
THD = -37.7 dB and SNDR = 32.5 dB.
The signal-to-noise-plus-distortion ratio (SNDR) and effective number of bits
(ENOB) of the ADC were derived using tone testing at supply voltages from 0.2 V
to 0.9 V. As the comparator thresholds vary at different supply voltages, the ADC
is recalibrated at each supply voltage. The FFT of the ADC in single-ended and
pseudo-differential mode at a supply voltage of 0.4 V is shown in Figure 2-15.
ENOBs of 5.05 and 5.56 are achieved in single-ended and pseudo-differential modes,
respectively. The THD in pseudo-differential mode is 6 dB better than in single-ended
mode, most likely due to the matching of the two signal paths and cancellation of
even order harmonics.
Harmonic levels annotated on the FFT plot of Figure 2-15: HD2 = -39.1 dB,
HD3 = -46.8 dB, HD4 = -59.5 dB, HD5 = -62.2 dB.
Figure 2-16: Statistical variation of ENOB in (a) single-ended and (b) pseudo-
differential mode at VDD = 0.4 V before and after calibration.
2.3.2 Power Consumption
The total power consumption of the ADC at 0.4 V, 400 kS/s is 2.84 µW and 1.66 µW
in pseudo-differential and single-ended mode, respectively, of which 135 nW is leakage
power. Shown in Figure 2-13(b) is the energy per conversion of the ADC in
single-ended mode versus supply voltage. At low voltages, the leakage current degrades
the energy per conversion due to low sampling rates, whereas at high voltages, CV²
losses degrade the energy per conversion, leading to the emergence of a minimum energy
per conversion supply voltage of 0.4 V [37]. A widely used FoM normalizes the ADC
power consumption to the input bandwidth it can digitize and its dynamic range:
\[ \mathrm{FoM} = \frac{P}{2 \cdot BW \cdot 2^{ENOB}} \tag{2.8} \]
where P is the power consumption and BW is the input bandwidth.
At 0.4 V, the ADC achieves an FoM of 125 fJ/conversion-step in single-ended mode
(5.05 ENOB) and 150 fJ/conversion-step in pseudo-differential mode (5.56 ENOB).
The highly digital flash ADC has no bias currents and thus energy is only dissipated
through switching events (CV²) and by leakage currents.
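The reported figures of merit can be checked with a few lines of arithmetic. The sketch below assumes that the input bandwidth equals the Nyquist bandwidth, i.e. half of the 400 kS/s sampling rate; under that assumption the power and ENOB values quoted above reproduce the stated FoM values. The function name is hypothetical.

```python
def adc_fom(power_w, bandwidth_hz, enob):
    """Energy per conversion step: P / (2 * BW * 2**ENOB), in joules."""
    return power_w / (2.0 * bandwidth_hz * 2.0 ** enob)


fs = 400e3          # sampling frequency in S/s
bw = fs / 2.0       # assumed Nyquist input bandwidth

fom_single_ended = adc_fom(1.66e-6, bw, 5.05)
fom_pseudo_diff = adc_fom(2.84e-6, bw, 5.56)

print(f"single-ended:        {fom_single_ended * 1e15:.1f} fJ/conversion-step")
print(f"pseudo-differential: {fom_pseudo_diff * 1e15:.1f} fJ/conversion-step")
```

Pseudo-differential mode gains about half a bit of ENOB but uses two comparator arrays, so its power rises by more than the factor of roughly 1.4 that the extra resolution is worth in this metric.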
2.3.3 Calibration and Common-Mode Rejection
The comparators have a measured offset standard-deviation of approximately 8 mV,
which is larger than 1 LSB. Figure 2-16 presents statistical measurements of the
ENOB for the ADC, before and after redundancy calibration. In pseudo-differential
mode with a total of 126 comparators enabled, the ADC nominally has an average
ENOB of 5.56 at 400 kS/s. If redundancy calibration is not used and the same com-
parators are enabled on all chips, the average ENOB reduces to 3.84. The comparator
thresholds vary with temperature and ADC recalibration is required to maintain
linearity. In single-ended 6-bit mode, the ADC ENOB degrades from 5.05 at 25 °C to
4.28 at 75 °C without recalibration. After recalibration the ENOB returns to 5.08.
When a full-scale sinusoid input is in the presence of a -12 dBFS common-mode
signal at 0.005 Fs, the ENOB degrades by 0.5-bit compared to a 1.3-bit degradation
when the common-mode rejection is disabled. Due to latency of the digital circuits,
the common-mode feedback is only capable of cancelling low-frequency components
and improving ENOB at frequencies less than approximately 0.04 Fs.
2.4 Future Directions
In a practical system, the calibration input and logic need to be integrated on chip,
and the hardware required for this is briefly highlighted below. To generate the
calibration input, a triangle wave can be implemented with a high impedance cur-
rent source, an op-amp based integrator, or a sigma delta modulator. As it can be
challenging to implement a highly linear triangle wave, it is possible to use a sinu-
soidal input instead, and through significant bandpass filtering the linearity can be
improved. Calibration logic is required to calculate comparator thresholds from the
estimated CDF and to select the appropriate comparators to enable. To calculate
comparator thresholds when a triangle wave calibration input is used, a digital mul-
tiplier and adder are required. When a sinusoidal calibration input is used, a look-up
table is also likely required to compensate for the non-uniform CDF. To select the
appropriate comparators to enable, a simple closest neighbor searching algorithm can
be employed [23].
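The comparator selection step can be sketched as a greedy closest-neighbor search. This is an illustrative version only: the algorithm of [23] is not reproduced here, and the function name, tie-breaking rule, and example values are assumptions.

```python
def select_comparators(thresholds, ideal_levels):
    """Pick, for each ideal level, the unused comparator with the closest threshold.

    Returns the indices of the comparators to enable, in the order of
    `ideal_levels`. Ties are broken by the lower comparator index.
    """
    if len(ideal_levels) > len(thresholds):
        raise ValueError("more levels requested than comparators available")
    available = set(range(len(thresholds)))
    chosen = []
    for level in ideal_levels:
        best = min(available, key=lambda i: (abs(thresholds[i] - level), i))
        chosen.append(best)
        available.remove(best)
    return chosen


if __name__ == "__main__":
    measured = [0.10, 0.31, 0.19, 0.22, 0.05]    # hypothetical thresholds in volts
    print(select_comparators(measured, [0.1, 0.2, 0.3]))
```

A greedy search of this kind needs only comparisons and subtractions, which fits the aim of keeping on-chip calibration logic simple. It does not guarantee the globally best assignment when several ideal levels compete for the same comparator.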
As CMOS process geometries shrink, it is likely that the ADC architecture
will improve in energy efficiency and the die area will decrease. At the majority
of operating conditions, power consumption is dominated by switching losses, which are
proportional to fCVDD². The ADC operates at a low supply voltage and thus there is
little remaining energy savings to be had from voltage scaling; however, energy savings
can be achieved by reducing total switching capacitance. It is challenging to predict
how switching losses will scale with device scaling, in large part due to the effects of
interconnect capacitance, but it is likely that capacitance (and switching loss) will
decrease at a linear to quadratic rate in relation to the minimum device length. As the
majority of ADC area is occupied by digital structures with minimum-sized devices,
the overall die area will likely decrease at a quadratic rate in relation to the minimum
device length.
2.5 Summary
A highly digital flash ADC has been presented that can operate from supply voltages
of 200 mV to 900 mV. The architecture can tolerate large comparator and reference
voltage offsets due to redundancy and reconfigurability of the comparator array. This
allows for the use of a sense-amplifier based flip-flop with embedded offsets intro-
duced through device stacking and sizing. Device stacking has been analyzed in the
subthreshold regime and shown to result in a quadratic change in effective device
strength.
Chapter 3
Low Data Rate UWB Architecture,
Physical Layer, and Transmitter
This chapter describes the high level architecture and physical layer (PHY) specifica-
tions of the low data rate UWB system. In addition, key circuits and measurement
results of a fabricated UWB transmitter are presented. First, a brief history of UWB
technology is provided, focusing in particular on the low data rate, IEEE 802.15.4a
standard. This standard forms the core specifications of the UWB system presented
in this thesis, but some modifications have been made to reduce system power and
overall complexity. A discussion of the trade-offs between noncoherent and coherent
signaling is presented, and the transceiver architecture is described.
3.1 Background and History
UWB technology is a form of wireless communication in which signals occupy a wide
bandwidth, greater than the lesser of 500 MHz or 20% of the center frequency of
the signal, according to the FCC. UWB signaling was first demonstrated by Marconi
with spark gap transmitters, and commercial UWB research has flourished since 1998,
when the FCC announced it would investigate the possibility of permitting UWB radio
systems on an unlicensed basis [52]. It is hoped that UWB communication will provide
an improved method for radar applications where precise distance resolution is
required and for covert voice or data communications that overcome multipath
problems [52]. As data is encoded in very short, wide bandwidth pulses rather than
long duration narrow bandwidth signals, UWB technology allows for alternate and
potentially lower power radio architectures. In 2002, the FCC published a Report and
Order authorizing the unlicensed use of UWB in the 3.1 to 10.6 GHz frequency band [53].
Emissions limits were set by the FCC to ensure co-existence with existing narrow-band
communication devices, and these limits are presented in Table 3.1. Worldwide, these
limits vary significantly from country to country. For example, in Europe, the European
Commission is constraining UWB communication to the 6.0 to 8.5 GHz frequency
band, although other frequency bands can be used if specified interference mitigation
techniques are applied [54].
Frequency (GHz)    Indoor EIRP Limit (dBm/MHz)    Outdoor EIRP Limit (dBm/MHz)
Below 0.96         At or below 15.209 limits [53]
0.96-1.61          -75.3                          -75.3
1.61-1.90          -53.3                          -63.3
1.90-3.10          -51.3                          -61.3
3.10-10.60         -41.3                          -41.3
Above 10.60        -61.3                          -61.3
Table 3.1: FCC emissions limit for indoor and outdoor UWB communication.
Based on the properties of UWB signals and regulatory emissions limits, UWB
communication standards have been developed for two key systems: a short range
(<10 m), high data rate system and a moderate range (<100 m), low data rate
system. The high data rate system, specified in the ECMA-368 standard, achieves
data rates up to 480 Mb/s and is envisioned as a wireless replacement for USB within
the home [55]. This standard is based on the work done by the now defunct IEEE
802.15.3a task group. The commercial success of the technology is still uncertain.
Most recently, the financial crisis and intense competition have resulted in several
high data rate UWB companies going out of business [56], and design efforts from
many large companies have ceased; however, several small and established companies
continue to compete in the high data rate space.
The low data rate system is specified in the IEEE 802.15.4a amendment to the
802.15.4 wireless personal area network (WPAN) standard, which was approved in 2007
and adds UWB signaling as a physical layer option [57]. The 802.15.4a standard is
described in detail in the following section.
3.2 IEEE 802.15.4a Standard
IEEE 802.15.4a is an amendment to IEEE 802.15.4 that specifies two additional PHYs
to be added to the standard: a chirp spread spectrum (CSS) PHY operating in the
2.4 GHz band and a UWB PHY operating both in the sub-gigahertz band and the
3-10 GHz band. The key goals of the amendment were to add ranging support and
higher throughput capabilities to the 802.15.4 standard, while still maintaining low
power operation.
3.2.1 Chirp spread spectrum PHY
A chirp is a linear frequency modulated pulse. Essentially, it involves sweeping the
transmitted frequency very rapidly while transmitting a pulse. Key advantages of CSS
over the 802.15.4 PHY are improved range and support for links with fast-moving
mobile devices.
In 802.15.4a, CSS communication operates at a center frequency of 2.45 GHz and a
nominal data rate of 1 Mb/s. The channel plan for the CSS PHY is identical to that of
IEEE 802.11 high rate direct sequence spread spectrum (DSSS) systems. Nominally,
a compliant receiver must have a sensitivity of -85 dBm or better for 1 Mb/s. This
thesis focuses on UWB communication, and thus CSS will not be described in more
detail.
3.2.2 Ultra-wideband PHY
The UWB PHY specifies three independent bands for operation: the sub-gigahertz
band (250-750 MHz), the low band (3.1-5 GHz) and the high band (6-10.6 GHz). Of
these three bands, the low band and high band are most commonly used for wireless
communication. Transmitted UWB signals are band limited pulses with bandwidths
of approximately 500 MHz or 1.5 GHz at a maximum pulse repetition frequency
(PRF) of 499.2 MHz. Any UWB pulse shape is supported, so long as it matches
the provided reference pulse sufficiently closely. Symbol rates from 0.12 Mb/s to
31.2 Mb/s are supported, and data is modulated with a combination of pulse-position
modulation (PPM) and binary phase-shift keying (BPSK).
The UWB PHY specifies forward error correction to be implemented with an outer
Reed-Solomon systematic block code and an inner half-rate systematic convolutional
code [58]. An interesting characteristic of the UWB PHY is that both coherent and
noncoherent signaling are supported. With noncoherent signaling, the receiver can
only demodulate the PPM modulated data and not the BPSK modulated data. Thus,
the overall data rate is lowered, but simpler, energy-detection receiver architectures
are supported.
Included in 802.15.4a is a packet structure and frame format for the UWB PHY.
The frame consists of a synchronization header, a start of frame delimiter (SFD),
a packet header and a data field. The synchronization header provides time for the
receiver to detect a signal, realize automatic gain control (AGC), synchronize
with the transmitter, and implement frequency tracking and several other functions.
Embedded in the synchronization header are length 31 or length 127 ternary codes
which are repeatedly sent by the transmitter.
3.3 Noncoherent Demodulation
The choice of noncoherent versus coherent signaling and demodulation is a key system
level trade-off. With coherent demodulation, the receiver must recover both the fre-
quency and phase of the received signal. In these systems, information is traditionally
encoded in the phase of transmitted waveforms through modulation techniques like
BPSK. Coherent demodulators typically require a phase locked loop (PLL) operat-
ing at the carrier frequency to maintain synchronization with the received carrier. In
contrast, for noncoherent receivers, the phase of the received carrier is not recovered,
and the receiver only needs to maintain synchronization with the bit period. As the
phase of the carrier is not recovered, information must be encoded in the frequency,
time or amplitude of the wireless signal and thus, frequency-shift keying (FSK), PPM,
or amplitude-shift keying (ASK) must be employed.
Noncoherent demodulators can tolerate significant RF frequency and phase offsets
because synchronizing to the bit period typically requires timing accuracies an order-
of-magnitude less stringent than synchronizing to the carrier. For example, the IEEE
802.15.4a standard specifies an RF frequency accuracy requirement of ±20 ppm for
coherent signaling, whereas noncoherent UWB signaling can tolerate RF frequency
accuracies over ±1000 ppm¹. Because of these relaxed RF frequency tolerances,
noncoherent receivers typically do not require an RF PLL or advanced phase tracking
hardware, and the overall system power consumption can be reduced. Indeed, many
noncoherent receivers do not need an RF clock at all.
3.4 Previous Work
As 802.15.4a is a recent standard, there are few, if any, commercial 802.15.4a
compliant parts; however, there has been extensive research published targeting low
data rate UWB systems. Published receivers include sub-gigahertz band receivers
[59, 60], and higher frequency noncoherent receivers [4, 61, 62, 63, 64, 65] and coherent
receivers [66, 67, 68]. In addition, several RF front ends have been presented [69, 32].
Complementing the UWB receiver research has been extensive transmitter research.
Several IR-UWB transmitter architectures have been shown to be amenable
to low-power solutions [70, 71]. In particular, transmitter signal generators and PAs,
two blocks that typically dominate the power budget of narrowband systems, can be
replaced by simple digital pulse generators and CMOS buffers [71, 72, 73]. Furthermore,
noncoherent communication relaxes center frequency tolerances, thus allowing
for reduced hardware complexity and enabling the use of highly digital transmitter
architectures [61, 29].
¹ It is important to note, however, that it is still optimal for a noncoherent receiver
to have baseband, bit-period, clock accuracies of ±20 ppm.
Energy/Bit Metric
Key advantages of coherent signaling are that bandwidth is utilized more efficiently
and coherent demodulators achieve better sensitivity than noncoherent demodula-
tors; however, these benefits often result in increased power consumption due to the
required phase tracking hardware and accurate RF clocks. For energy constrained
wireless links, the overall link energy efficiency must be considered when deciding
whether to employ coherent or noncoherent signaling and demodulation.
A key metric used to quantify the energy efficiency of wireless radios is energy/bit.
Energy/bit corresponds to the energy required by the receiver or transmitter elec-
tronics to transmit or receive a single bit of information. Most often, energy/bit is
calculated as the instantaneous power consumption divided by the instantaneous data
rate. Generally, modern 802.15.4 and Bluetooth radios consume tens of milliwatts,
corresponding to 15 to 200 nJ/bit for data rates of 250 kb/s to 3 Mb/s [74, 75]. At
the system level, one effective approach to minimizing energy/bit is to aggressively
duty cycle a radio operating at a fast instantaneous data rate. This results in energy
savings because high data rate radios typically consume less energy per bit than low
data rate radios.
At high data rates and for short packets the simple energy/bit metric is less
meaningful as it does not account for two key sources of energy loss: the energy
overhead associated with turning on and off the receiver and transmitter, as well as
the energy overhead associated with synchronization. Thus, several improvements
to the energy/bit metric have been proposed, to account for both the radio turn-
on and turn-off time [76] and the synchronization energy [77]. In practical systems,
to minimize the energy per useful bit, optimizations must be made to reduce the
turn on/off time, the synchronization time, and the power consumption of the radio
while concurrently increasing the data rate and packet length. Even when accounting
for these additional sources of energy overhead, the energy/bit metric can be easily
manipulated, as it does not account for receiver sensitivity, transmit output power,
and many other specifications. Regardless, the metric still serves as a useful way to
compare similar radios with one another [78].
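As a rough illustration of why the simple metric understates the cost of short packets, the following Python sketch compares energy/bit with an energy-per-useful-bit figure that charges turn-on and synchronization time to the payload. The function names, the constant-power assumption, and all numeric values (20 mW, 16 Mb/s, 10 us startup, 35 us synchronization) are hypothetical and are not taken from [76] or [77].

```python
def energy_per_bit(power_w, data_rate_bps):
    """Simple metric: instantaneous power divided by instantaneous data rate."""
    return power_w / data_rate_bps


def energy_per_useful_bit(power_w, data_rate_bps, payload_bits,
                          t_startup_s=0.0, t_sync_s=0.0):
    """Energy per payload bit, including startup and synchronization time.

    Assumption: the radio draws the same power while starting up,
    synchronizing and receiving the payload.
    """
    t_payload_s = payload_bits / data_rate_bps
    total_energy_j = power_w * (t_startup_s + t_sync_s + t_payload_s)
    return total_energy_j / payload_bits


# Hypothetical radio: 20 mW at 16 Mb/s, 10 us startup, 35 us synchronization.
simple = energy_per_bit(20e-3, 16e6)
short_packet = energy_per_useful_bit(20e-3, 16e6, 256, 10e-6, 35e-6)
long_packet = energy_per_useful_bit(20e-3, 16e6, 8192, 10e-6, 35e-6)

print(f"simple metric : {simple * 1e9:.2f} nJ/bit")
print(f"256-bit packet: {short_packet * 1e9:.2f} nJ/bit")
print(f"8192-bit packet: {long_packet * 1e9:.2f} nJ/bit")
```

Under these assumed numbers the overhead dominates the short packet, while the long packet approaches the simple metric, which is the reason the text recommends reducing overhead time while increasing packet length.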
A key opportunity of UWB communication is that high instantaneous data rates
are inherent to the system due to the wide bandwidth signals. For low data rate
systems, this means that the high rate transceiver can be aggressively duty cycled. In
high instantaneous data rate radios, the fixed cost of analog and RF bias currents is
amortized over more bits/second as the data rate increases, resulting in a general trend
of improved energy efficiency compared to narrowband low rate radios. Figure 3-1
shows recently published receiver energy/bit values [4]. In Figure 3-1, there is an
approximate trend that noncoherent receivers attain better energy/bit than coherent
receivers. The one data point that is inconsistent with this trend is the coherent
receiver that requires only 0.159 nJ/bit [60]. This inconsistency arises because
the receiver operates at a low carrier frequency of 500 MHz, and thus its RF and
clocking circuits consume considerably less power.
3.5 Transceiver Architecture and Packet Structure
For this thesis, a low data rate UWB transceiver chipset is developed that is designed
for the IEEE 802.15.4a standard. To minimize power consumption, noncoherent
signaling is employed. The radio communicates in one of three channels in the 3.1-
to-5 GHz band, as shown in Figure 3-2. The high band is not supported to reduce
transceiver complexity, as it is challenging to obtain a wide tuning range from 3.1-to-
10 GHz. Three channels are used to avoid potential in-band interferers and to add
frequency diversity for multiple users.
Figure 3-1: Receiver energy/bit values vs. data rate (kb/s, logarithmic axis) for UWB
and narrowband receivers recently published at ISSCC [4]. The legend distinguishes
coherent narrowband, coherent UWB, noncoherent narrowband, and noncoherent UWB
receivers. Data for this plot are found in Table 5.4.
Figure 3-2: Three-channel frequency plan and narrowband interferers. The frequency
axis spans 3.1 GHz to 5.8 GHz and shows the FCC UWB mask.
Figure 3-3: Packet structure and receiver states for the wireless link. The packet
consists of 32 repetitions of a 32b OOK modulated sync. code, an SFD code, and a
PPM modulated payload that begins with an 8b header. The receiver states are
detection and synchronization, SFD detection, and demodulation.
The packet structure for the wireless link is shown in Figure 3-3. The packet
consists of a combination of on-off keying (OOK) and binary PPM encoding. OOK is
used to encode the synchronization preamble, and PPM is used to encode the payload
data. With OOK, the presence of pulses in a time slot indicates a '1', whereas the
absence of pulses indicates a '0'. In contrast, with binary PPM, pulses are sent in one
of two time slots. If pulses are sent in the first time slot, a '0' is sent, and if pulses
are sent in the second time slot, a '1' is sent. Thus, for binary PPM two time slots
correspond to a single bit of information.
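The two encodings can be sketched as a mapping from bits to time slots, where a slot value of 1 means that a burst of pulses is sent in that slot. This is only an illustration of the slot mapping described above; the function names are hypothetical and no pulse waveform is modeled.

```python
def ook_slots(bits):
    """OOK: one slot per bit; pulses present for '1', absent for '0'."""
    return [1 if b else 0 for b in bits]


def ppm_slots(bits):
    """Binary PPM: two slots per bit.

    '0' -> pulses in the first slot, '1' -> pulses in the second slot.
    """
    slots = []
    for b in bits:
        slots.extend([0, 1] if b else [1, 0])
    return slots


print(ook_slots([1, 0, 1]))   # one slot per bit
print(ppm_slots([0, 1]))      # two slots per bit
```

Note that every binary PPM bit contains exactly one active slot, whereas the number of active OOK slots depends on the data.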
The packet begins with a 32-bit, OOK modulated synchronization code that is
repeated 32 times. Each bit of this code corresponds to a 31.2 ns period, and the
bit value determines whether a burst of sixteen 1.95 ns pulses is sent ('1') or whether
no pulses are sent ('0'). When receiving this synchronization code, the receiver first
detects and then synchronizes to the received signal. Following the synchronization
code in the packet is an SFD code, which consists of a 5-bit outer code applied to the
32-bit synchronization code, resulting in a 160-bit total SFD length. Finally, the packet
ends with PPM encoded payload data, which consists of an 8-bit header followed
by data. Each PPM period consists of two adjacent time slots, and whether greater
energy is received in the first or second slot determines whether a '0' or '1' is received,
respectively. The slot period is nominally 31.2 ns, corresponding to a PPM bit period
of 62.4 ns and a data rate of 16 Mb/s; however, the slot period can be increased
digitally.
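The timing figures quoted above follow from the 31.2 ns slot period. The short Python sketch below reproduces the arithmetic. It assumes that every SFD bit occupies one 31.2 ns slot like the synchronization code, which the text implies but does not state explicitly; the variable names are illustrative.

```python
SLOT_NS = 31.2          # nominal slot period from the text
PULSES_PER_BURST = 16
CODE_BITS = 32          # length of the synchronization code
CODE_REPEATS = 32       # repetitions of the synchronization code
SFD_OUTER_BITS = 5      # outer code applied to the 32-bit code

pulse_ns = SLOT_NS / PULSES_PER_BURST
preamble_us = CODE_BITS * CODE_REPEATS * SLOT_NS / 1e3
sfd_bits = SFD_OUTER_BITS * CODE_BITS
sfd_us = sfd_bits * SLOT_NS / 1e3   # assumes one slot per SFD bit
ppm_bit_ns = 2 * SLOT_NS
data_rate_mbps = 1e3 / ppm_bit_ns

print(f"pulse duration : {pulse_ns:.2f} ns")
print(f"sync preamble  : {preamble_us:.2f} us")
print(f"SFD            : {sfd_bits} bits, {sfd_us:.2f} us")
print(f"PPM bit period : {ppm_bit_ns:.1f} ns -> {data_rate_mbps:.2f} Mb/s")
```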
Figure 3-4: Simplified block diagram of (a) receiver and (b) transmitter that form
the custom low-data-rate transceiver system.
A simplified block diagram of the transceiver chipset presented in this thesis is
shown in Figure 3-4. The receiver is a noncoherent energy detector that squares the
incoming signal at RF; therefore no RF local oscillator or PLL is required for
downconversion. After the squaring operation, the baseband signal is integrated in a 31.2 ns
window, generating an analog representation of the amount of energy received within
that given time window. Next, the integrated analog value is converted to a digital
value by an ADC and processed by the baseband synchronizer and demodulator. The
receiver integrator, ADC and digital logic are clocked by a crystal resonator stabilized
oscillator. Clocking circuits are required to generate the appropriate clock phase for
the integrator and ADC.
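The receive chain described above (square, integrate over a slot, quantize, then compare slot energies) can be sketched behaviorally in Python. This is a minimal discrete-time model, not the implemented receiver: the sample rate, tone frequency, noise level, ADC resolution, and all function names are assumptions chosen for illustration, and an ideal quantizer stands in for the ADC.

```python
import math
import random


def integrate_energy(samples):
    """Square each sample and sum: a discrete stand-in for squaring at RF
    followed by windowed integration over one slot."""
    return sum(s * s for s in samples)


def quantize(value, full_scale, bits):
    """Ideal ADC: clip to [0, full_scale] and map to an integer code."""
    levels = (1 << bits) - 1
    clipped = min(max(value, 0.0), full_scale)
    return round(clipped / full_scale * levels)


def demodulate_ppm(slot_codes):
    """More energy in the second slot of a pair -> '1', otherwise '0'."""
    return [1 if slot_codes[i + 1] > slot_codes[i] else 0
            for i in range(0, len(slot_codes) - 1, 2)]


def simulate(bits, samples_per_slot=64, amplitude=1.0, noise_rms=0.2,
             adc_bits=5, seed=0):
    """Send binary PPM bits through the toy energy-detection chain."""
    rng = random.Random(seed)
    full_scale = samples_per_slot * amplitude ** 2
    codes = []
    for b in bits:
        for slot in (0, 1):
            active = (slot == b)  # pulses go in slot 0 for '0', slot 1 for '1'
            samples = [
                (amplitude * math.sin(2 * math.pi * 0.125 * n) if active else 0.0)
                + rng.gauss(0.0, noise_rms)
                for n in range(samples_per_slot)
            ]
            codes.append(quantize(integrate_energy(samples), full_scale, adc_bits))
    return demodulate_ppm(codes)


sent = [0, 1, 1, 0, 1, 0, 0, 1]
print(simulate(sent) == sent)
```

Because the decision uses only the energy in each slot, the model needs no knowledge of the carrier phase, which mirrors the reason no RF local oscillator or PLL is required.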
The transmitter generates bursts of 2 ns UWB pulses centered at 3.5 GHz, 4.0 GHz
or 4.5 GHz. The RF signal is generated by a digital ring oscillator and amplified
by an inverter-based PA. BPSK phase shifting is employed to reduce spectral lines.
To meet the FCC spectral mask without needing an off-chip filter, the transmitter
employs pulse shaping. Although the transmitter is not the focus of this thesis, it is
a key component of the system and is briefly summarized in the following section.
Figure 3-5: Transmitter expanded block diagram.
3.6 UWB Transmitter
There has been extensive literature published on UWB transmitters, in part due to the
potential for very simple, low power implementation. Two key problems inherent in
many previously published low power UWB transmitters are that they either are not
compliant with the FCC spectral mask without off-chip filters [29] or the output pulse
amplitude is extremely low, fundamentally limiting range [31, 32]. In a collaborative
research project, an all-digital UWB transmitter has been designed to address these
problems [33].² The transmitter architecture, key circuit blocks, and measurement
results are presented in the following sub-sections.
3.6.1 Transmitter Architecture
An expanded block diagram of the transmitter is shown in Figure 3-5. The transmitter
is activated on the rising edge of the off-chip Start-TX signal. This edge enables a
digitally-controlled oscillator (DCO), whose output is synchronously divided to a
499.2 MHz clock as specified by the 802.15.4a standard. The divided DCO signal is
the transmitter's global clock, which activates a programmable counter to control the
number of pulses transmitted per burst. Pulses are generated by BPSK-scrambling
the DCO output via a run length limiting linear feedback shift register (LFSR),
and buffering the resulting signal through dual single-ended digital PAs employing
capacitive combination. Several phases of the 499.2 MHz divided clock are used by
pulse shaping circuitry to dynamically shape the PA envelope to one of four discrete
levels. The DCO output frequency is calibrated and dynamically adjusted using an
early-late detector in a digital frequency-locked loop (FLL).
² Patrick Mercier led the design and test efforts, and I assisted with architecture design and did
the design and layout of the oscillator and single-to-differential converter.
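The BPSK scrambling step in the transmitter architecture can be illustrated with a small LFSR that assigns a pseudo-random polarity to each pulse. The sketch below uses the common PRBS7 polynomial x^7 + x^6 + 1, which is an assumption: the text does not give the polynomial or register length used on chip, and the run length limiting logic is not modeled. Function names are hypothetical.

```python
def prbs7(count, state=0x7F):
    """Return `count` bits from a Fibonacci LFSR, polynomial x^7 + x^6 + 1.

    The polynomial and seed are assumptions for illustration only.
    """
    bits = []
    for _ in range(count):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def bpsk_polarities(pulse_count, state=0x7F):
    """Map scrambler bits to pulse polarities: 1 -> +1, 0 -> -1 (inverted)."""
    return [1 if b else -1 for b in prbs7(pulse_count, state)]


# Polarities for one burst of 16 pulses.
print(bpsk_polarities(16))
```

Randomizing the pulse polarity in this way spreads the energy that would otherwise concentrate in spectral lines at multiples of the pulse repetition rate.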
3.6.2 Dual-Digital Power Amplifiers
A key challenge in IR-UWB PA design is how to be energy efficient and spectrally
compliant while requiring as little chip or circuit board area as possible. Traditional
differential analog PAs can easily be spectrally compliant, but typically achieve poor
power efficiency [79]. Highly digital PAs can achieve much better power efficiency,
but often require off-chip filters or baluns to enable BPSK modulation and/or be
spectrally compliant [29, 73, 80]. The proposed digital power amplifier is both power
efficient and spectrally compliant while requiring minimal area through two key ap-
proaches:
* Dual-capacitively coupled digital power amplifiers allow for the nulling of common-
mode components that occur during turn-on and turn-off of the single-ended
PAs. This approach does not use any analog circuits and removes the need for
an area-expensive balun to interface with a single-ended antenna. Moreover,
this approach allows for BPSK modulation to be easily realized.
* The dual PAs each comprise several parallel drivers. Digital pulse shaping is re-
alized by dynamically switching drive strength, resulting in spectrally compliant
pulses.
Figure 3-6: Digitally controlled oscillator schematic.
The dual digital PAs generate pulses which are in-phase at RF but have counter-phase
common mode components. By capacitively combining the two paths, the
opposite common modes are cancelled and the zero-dc RF signal propagates to the
single-ended antenna. BPSK modulation is implemented by simply inverting the
oscillator signal while maintaining opposite common modes on the dual paths. The
coupling capacitors also form a first order high-pass filter between the outputs of
the PAs and the antenna, thereby attenuating low frequency out-of-band spectral
components.
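The common-mode cancellation can be illustrated with an idealized numerical model. In the sketch below, one path idles at VDD and the other at GND, both move toward mid-rail as the pulse envelope rises, and equal coupling capacitors are modeled as an average of the two paths followed by ideal dc removal. The waveform shapes, the envelope values, and the function names are assumptions for illustration and do not come from the measured design.

```python
import math

VDD = 1.0


def pa_paths(envelope, polarity=+1, cycles_per_sample=0.25):
    """Return the two single-ended PA outputs for a given pulse envelope.

    Both paths carry the same RF component but opposite common-mode steps.
    """
    path_a, path_b = [], []
    for n, env in enumerate(envelope):
        rf = polarity * 0.5 * VDD * env * math.sin(2 * math.pi * cycles_per_sample * n)
        cm_a = VDD - 0.5 * VDD * env  # idles at VDD, moves toward mid-rail
        cm_b = 0.5 * VDD * env        # idles at GND, moves toward mid-rail
        path_a.append(cm_a + rf)
        path_b.append(cm_b + rf)
    return path_a, path_b


def combine(path_a, path_b):
    """Equal coupling capacitors: average the paths, then block the dc level."""
    return [(a + b) / 2 - VDD / 2 for a, b in zip(path_a, path_b)]


envelope = [0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 0.5, 0.0]
out_pos = combine(*pa_paths(envelope, polarity=+1))
out_neg = combine(*pa_paths(envelope, polarity=-1))
print([round(v, 3) for v in out_pos])
print([round(v, 3) for v in out_neg])
```

In this model the common-mode steps sum to a constant, so only the RF component reaches the antenna, and inverting the oscillator polarity inverts the output pulse, as required for BPSK.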
The dual PAs each comprise 30 parallel tri-state inverters. Individual inverters
are dynamically enabled and disabled by pulse shaping logic to adjust the PA drive
strength. Pre-charge and pre-discharge transistors set the PA output common mode
levels to VDD and GND when the PAs are tri-stated between pulses.
3.6.3 Digital Oscillator
Since the transmitter is designed for use in noncoherent UWB systems, precise phase
and frequency accuracy of an oscillator is not required. For example, receiving a non-
coherent 500 MHz signal whose center frequency is only accurate to within 6000 ppm
results in a maximum of 0.04 dB received power loss based on a MATLAB simulation.
What is required, however, is a frequency tuning range from 3.5 GHz to 4.5 GHz.
Thus, it is not possible to employ a fixed-frequency off-chip resonator like a film
bulk acoustic resonator (FBAR) [81]. Taking this into account, the resulting DCO
is a 3-stage current starved ring oscillator, shown in Figure 3-6. The highly digital,
single-ended structure is designed to have a fast turn-on time on the order of 2 ns
in the typical case to reduce energy consumption in duty-cycled operation. Com-
pared to a differential structure, the single-ended oscillator requires half of the power
consumption but is more susceptible to power supply noise.
Coarse DCO frequency tuning is provided by switchable load capacitors, while
fine frequency tuning is provided by NMOS and PMOS current starving DACs. To
simplify the frequency locking algorithm, all three current starving DACs are set to the
same digital value, except that the second and third stage DACs can be individually
incremented by one for increased resolution. This technique results in a resolution
of 7.5 bits from the DACs and 2 bits from the 3 thermometer encoded capacitors,
totaling 9.5 bits. This resolution of frequency control is sufficient to meet noncoherent
wideband receiver sensitivity requirements. The worst-case measured frequency step
size in the 3-to-5 GHz band is 10 MHz, corresponding to an accuracy of 2800 ppm.
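The resolution and accuracy figures above can be checked with a few lines of Python. The carrier frequency used to convert the 10 MHz step into ppm is an assumption (the lowest channel, 3.5 GHz); the text does not state which frequency was used to obtain the 2800 ppm figure.

```python
import math

DAC_BITS = 7.5       # effective resolution of the current starving DACs
CAP_LEVELS = 3 + 1   # three thermometer-coded capacitors give four settings

cap_bits = math.log2(CAP_LEVELS)
total_bits = DAC_BITS + cap_bits
tuning_codes = 2 ** total_bits


def step_ppm(step_hz, carrier_hz):
    """Frequency step expressed in parts per million of the carrier."""
    return step_hz / carrier_hz * 1e6


worst_case_ppm = step_ppm(10e6, 3.5e9)  # assumed 3.5 GHz carrier

print(f"total resolution: {total_bits} bits (~{tuning_codes:.0f} codes)")
print(f"10 MHz step at 3.5 GHz: {worst_case_ppm:.0f} ppm")
```

At the assumed carrier the step corresponds to roughly 2900 ppm, the same order as the quoted 2800 ppm and well inside the 6000 ppm tolerance mentioned earlier in this section.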
Phase scrambling is implemented by passing the DCO output through a single-
ended to differential converter that contains delay-matched paths: the inverted path
consists of a static CMOS inverter, while the non-inverted path consists of a trans-
mission gate that is sized for equivalent delay [82]. Simulation results show that up
to 10 ps of delay mismatch is tolerated before spectral lines with 1 dB amplitude
above the regular spectrum begin to appear. Depending on the process corner, there
is a variation between the two paths of +2 ps/-1 ps around the nominal 180° phase
shift. Monte Carlo analysis of circuit variation indicates that the standard deviation
is 1.2 ps, which is sufficiently small to not have a noticeable impact on the resulting
spectrum.
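To put the delay figures in perspective, a delay mismatch can be converted to an equivalent phase error at the carrier. The sketch below assumes a 4 GHz carrier (the middle channel); the function name is illustrative.

```python
def phase_error_deg(delay_s, carrier_hz):
    """Phase error in degrees caused by a path delay mismatch."""
    return 360.0 * carrier_hz * delay_s


CARRIER_HZ = 4.0e9  # assumed carrier for this example

sigma_deg = phase_error_deg(1.2e-12, CARRIER_HZ)  # Monte Carlo standard deviation
limit_deg = phase_error_deg(10e-12, CARRIER_HZ)   # simulated tolerance

print(f"1.2 ps -> {sigma_deg:.2f} degrees")
print(f"10 ps  -> {limit_deg:.1f} degrees")
```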
3.6.4 Measurement Results
The transmitter was fabricated in 90 nm CMOS and operates on a 1 V supply. A die
photo is shown in Figure 3-7. Figure 3-8 presents a 3.5 GHz transient waveform with
five pulses concatenated into a single burst. The resulting bursts are both indoor and
outdoor FCC compliant in all three channels without the use of an off-chip filter, as
shown in Figure 3-9. The four-level pulse shaping results in nearly 20 dB of sidelobe
rejection. The dual digital PAs have a gain scalability of 13 dB and a maximum output
swing of 710 mVpp. The transmitter consumes 17.5 pJ/pulse including electrostatic
discharge (ESD) and input/output (I/O) supplies.
Figure 3-7: Die photo of fabricated transmitter.
Figure 3-8: Measured transient waveform of a burst of five individually BPSK-
modulated pulses (horizontal axis: time, 0 to 18 ns).
Figure 3-9: Overlaid power spectral densities of the three channels in the low-band
of the 802.15.4a proposal (horizontal axis: frequency, 0 to 6 GHz).
3.7 Summary
This chapter introduced the UWB architecture and PHY that forms the basis of
the remainder of this thesis. The architecture is designed for near compatibility
with IEEE 802.15.4a and employs noncoherent signaling to improve energy efficiency.
Through pulse-shaping and capacitive combining, the all-digital UWB transmitter
generates FCC compliant signals without requiring any off-chip filters.
Chapter 4
UWB Receiver SoC Circuits
This chapter describes the key circuit blocks in the UWB receiver SoC. Circuit design
decisions are driven by the goal to minimize power consumption while maximizing
performance, integration and robustness of the wireless receiver. The RF front end
consumes the majority of power in the receiver, and this chapter describes the steps
taken to minimize this power consumption. The receiver baseband analog, clocking
and ADC circuits are also described in detail.
4.1 Architecture
As described in Chapter 3, the receiver is a noncoherent, energy detection based
IR-UWB receiver that is designed for the 802.15.4a wireless standard. The receiver is
nearly compliant with 802.15.4a, but uses a modified packet structure to reduce power
consumption, reduce receiver complexity and allow for an improved synchronization
algorithm. Noncoherent signaling is employed to reduce power consumption on the
receiver as it allows for a simple architecture without any high frequency clocks and
also allows for the use of highly digital circuits due to relaxed frequency tolerances
and wide signal bandwidths. A detailed block diagram of the receiver SoC is shown
in Figure 4-1.
Figure 4-1: Detailed block diagram of receiver SoC.
The first stage of the receiver signal chain is an RF front end that amplifies
the received signal by up to 40 dB while attenuating out-of-band interferers. This
amplified RF signal is then squared, resulting in the RF signal being mixed to baseband.
Because of the squaring operation, the amplitude of this baseband signal is
proportional to the instantaneous received power in the input RF bandwidth. Thus,
by integrating the baseband signal over a period of time, the receiver generates a
signal proportional to the energy received over that period of time. Following the
squarer is a baseband amplifier, and then the amplified signal is integrated and quantized
by an ADC. The ADC values are passed to a digital backend, which performs
packet detection, synchronization and decoding.¹ Also included in the receiver SoC
are a crystal oscillator and a delay-locked loop (DLL). The entire receiver is clocked by
a fixed, 32 MHz clock. After synchronization, the appropriate DLL phase is selected
and is used by the windowed integrator and ADC. Each of the specific components
of the receiver SoC is described in the following subsections, except for the digital
synchronizer and demodulator [35].
The receiver architecture shares many similarities with, and was inspired by, the
energy detection receiver presented in [4], whose block diagram
is shown in Figure 4-2. The key differences between the receiver presented in this
thesis and [4] are briefly described below, and explained in more detail throughout
this chapter.
¹ The digital backend was designed by Patrick Mercier and Manish Bhardwaj.
Figure 4-2: Block diagram of noncoherent UWB receiver presented in [4]. Labeled
blocks include the LNA, a 6-stage RF gain and tunable BPF, baseband gain, and an
integrator with period Tint that produces the output bits.
* This work is a full SoC, including the necessary clocking circuits, digital cali-
bration, and digital synchronization algorithms.
* In this work, the supply voltage is fixed at 1 V and gain scalability is imple-
mented by disabling RF gain stages, whereas in [4], gain scalability is imple-
mented by scaling the supply voltage from 0.5 V to 0.65 V.
* In this work, a multi-bit ADC is used to measure the energy received in a given
time window, whereas in [4], a clocked comparator measures the difference in
energy received between two adjacent time windows and provides a single bit
of information.
4.2 RF Front End
For noncoherent receivers, significant gain is required prior to the squarer to obtain a
sufficient signal swing such that semiconductor device nonlinearity can be exploited
in the squaring element. Passive and active squarers require input voltages on the
order of millivolts whereas LNA input voltages can be on the order of tens of microvolts,
thus requiring voltage gain of approximately 40 dB. To realize such large gain,
noncoherent receivers typically employ one of two methods: a super-regenerative ar-
chitecture [64, 65] or a multi-stage linear amplifier [4, 63, 61]. Although a multi-stage
linear amplifier requires more power than a super-regenerative amplifier, it allows
for simple support of any arbitrary squaring and integration interval. Moreover, a
multi-stage linear amplifier is less subject to RF leakage out of the antenna, which
can potentially result in FCC spectrum violations or require the use of an RF isola-
tion amplifier. Based on these advantages, a multi-stage linear amplifier topology is
selected.
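The 40 dB figure quoted above follows directly from the ratio of the two voltage levels. A minimal check, assuming representative values of 1 mV at the squarer input and 10 uV at the LNA input (the text gives only orders of magnitude):

```python
import math


def voltage_gain_db(v_out, v_in):
    """Voltage gain in decibels."""
    return 20.0 * math.log10(v_out / v_in)


# Assumed representative levels: 1 mV at the squarer, 10 uV at the LNA input.
required_db = voltage_gain_db(1e-3, 10e-6)
print(f"required RF gain: {required_db:.1f} dB")
```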
Given the design decision to use a multi-stage linear amplifier, the next choice is
whether to amplify the signal solely at RF prior to the squarer or whether to mix
the signal to baseband (or a low frequency) and amplify the signal both at RF and
baseband. In general, gain at baseband can be implemented at much less power
consumption than gain at RF; however, to mix the RF signal to baseband requires
a mixer and oscillator, both of which consume power comparable to that of an RF
gain stage. Moreover, after mixing to baseband, both in-phase and quadrature signal
paths are required [66], doubling the baseband amplifier power consumption and
potentially the ADC power consumption. An additional disadvantage of mixing the
signal to baseband is that dc offsets at baseband must be cancelled. These dc offsets
can degrade performance and are particularly challenging to cancel digitally, as the
nonlinear squaring element makes it difficult to measure the offsets.
Given the simplicity, reliability, and minimal power consumption overhead asso-
ciated with solely amplifying the signal at RF prior to the squarer, this topology is
selected. Thus, the RF front end consists of several RF gain stages and the output of
the RF front end is connected directly to the squarer. The key design decisions for
the RF front end include whether to have a single ended or differential architecture,
a resonant or non-resonant load, and what core amplifier structure to use.
4.2.1 Single ended vs. differential architecture
Previous research has demonstrated single-ended RF front ends [69], fully differential
RF front ends [71, 66], and hybrid RF front ends where the LNA input is single-ended
but the signal is internally converted to a differential signal at RF [4, 32]. As the RF
front end is integrated on the same chip as digital logic and baseband analog circuits,
a differential architecture offers significant advantages in terms of substrate noise and
power supply immunity. For example, harmonics of low-frequency signals can more
easily couple into single ended architectures, thereby swamping out signals of interest
and potentially resulting in instability. This potential for instability is particularly
worrisome due to the large RF gain associated with the chosen architecture.
As robustness is of paramount concern, a differential RF architecture is selected;
however, as all commercially available UWB antennas are single ended, the LNA has
a single ended input. Single-ended to differential conversion is realized by the LNA
and all later stages are differential. Employing a differential architecture results in
several additional benefits outlined below:
* Reduced VDD decoupling capacitance is required.
* Higher quality factor inductors are available.
* Virtual ground 'center-tap' can be exploited in both differential capacitors and
inductors.
* Unlike with single ended amplifiers, bias currents of differential amplifiers can
be set without requiring ac coupling or large decoupling capacitors.
Despite the many benefits associated with differential architectures, one key dis-
advantage of differential architectures is that both area and power consumption are
often doubled; however, given the many benefits previously mentioned, this disad-
vantage is acceptable.
4.2.2 Resonant vs. non-resonant load
In many RF amplifiers, resonant loads consisting of inductors, resistors, and capaci-
tors are used to match impedances, cancel parasitics, and filter out unwanted signals
[83]. For instance, a parallel RLC tank placed at the output of an amplifier results
in a second-order bandpass transfer function. While a bandpass transfer function can
be implemented using only capacitors and resistors, a key difference is that the inductor
causes the tank to become resonant. At the resonant frequency, referred to as ω0, the
admittances of the inductor and capacitor cancel.
By exploiting resonance of RLC networks, it is possible to obtain higher gain at
RF than would otherwise be possible with non-resonant circuits. Figure 4-3 shows
circuit schematics of simple resonant and non-resonant amplifiers, and Figure 4-4
presents the simulated gain of these two amplifiers versus frequency. To ensure a fair
comparison between the two amplifier structures, several steps were taken. First, as
the resonant amplifier requires less voltage headroom, its supply voltage is reduced to
0.6 V compared to 1 V for the non-resonant amplifier. Second, the power consumption
of the two amplifiers is matched. Finally, parasitic capacitances, Miller effect
capacitances, and RF device models are included in the schematics.
Figure 4-3: Circuit schematic of (a) resonant and (b) non-resonant amplifiers with
equal power consumption. Simulation results comparing the gain of the amplifiers
are presented in Figure 4-4. Labeled values: (a) 0.6 V supply, 1.2 nH, 1.3 pF, 5 mA
tail current; (b) 1 V supply, 266 Ω loads with parasitic capacitance CP, 3 mA tail current.
In the given 90 nm CMOS process, the resonant differential pair shown in Fig-
ure 4-3(a) achieves a gain of 3.2 at 4 GHz with a supply current of 5 mA (VDD =
0.6 V). The RLC load has a quality factor of 5.75, which results in a -3 dB band-
width of 695 MHz. In contrast, the differential pair with non-resonant load shown
in Figure 4-3(b) achieves a gain of 2.2 at 4 GHz with a supply current of 3 mA
(VDD = 1 V). Although the gain at dc is 3.1, the gain at 4 GHz is degraded due
to the low pass -3 dB frequency near 4.2 GHz. This low pass frequency is deter-
mined by the parasitic wiring capacitance as well as capacitances of loading devices.
At sub-GHz frequencies, these parasitic capacitances might be sufficiently small such
that an amplifier with non-resonant load can realize superior gain at the same power
consumption as an amplifier with a resonant load [27]; however, in this system the
resonant amplifier achieves superior gain.
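The numbers in this comparison are mutually consistent, which can be checked with first-principles formulas. The sketch below assumes that the 1.2 nH and 1.3 pF labels in Figure 4-3 are the tank values of the resonant amplifier, models the resonant load as an ideal second-order bandpass, and models the non-resonant load as a single-pole low-pass. These are textbook approximations, not the simulated device models, and the function names are illustrative.

```python
import math


def resonant_frequency_hz(l_h, c_f):
    """f0 = 1 / (2*pi*sqrt(L*C)) for an LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))


def bandwidth_hz(f0_hz, q):
    """-3 dB bandwidth of a second-order bandpass with quality factor Q."""
    return f0_hz / q


def resonant_gain(f_hz, f0_hz, q, peak_gain):
    """Magnitude response of an ideal parallel RLC load."""
    detune = f_hz / f0_hz - f0_hz / f_hz
    return peak_gain / math.sqrt(1.0 + (q * detune) ** 2)


def lowpass_gain(f_hz, dc_gain, corner_hz):
    """Magnitude response of a single-pole low-pass amplifier."""
    return dc_gain / math.sqrt(1.0 + (f_hz / corner_hz) ** 2)


f0 = resonant_frequency_hz(1.2e-9, 1.3e-12)   # about 4.03 GHz
bw = bandwidth_hz(4e9, 5.75)                   # about 696 MHz
g_nonres = lowpass_gain(4e9, 3.1, 4.2e9)       # about 2.2

print(f"tank resonance     : {f0 / 1e9:.2f} GHz")
print(f"-3 dB bandwidth    : {bw / 1e6:.0f} MHz")
print(f"non-resonant gain  : {g_nonres:.2f} at 4 GHz")
```

Under these assumptions the tank resonates near 4 GHz, the bandwidth matches the quoted 695 MHz, and a dc gain of 3.1 with a 4.2 GHz corner yields the quoted gain of about 2.2 at 4 GHz.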
Figure 4-4: Gain of the resonant and non-resonant RF amplifiers presented in Figure 4-3
(gain vs. frequency in MHz, logarithmic frequency axis).
In addition to the gain advantage offered by the resonant RLC load at 4 GHz, the
resonant amplifier has less total integrated output noise, due to the reduced output
bandwidth. For a self mixing based receiver, the total integrated output noise can
limit sensitivity and thus must be minimized. One disadvantage associated with
the resonant load is that the required inductors and capacitors can be quite large.
In addition, the resonant load must be able to be tuned to each of the frequency
channels or have sufficiently wide bandwidth to cover all of the frequency channels
simultaneously. In the given CMOS process, the area overhead of the resonant loads
is acceptable and the tuning range of 3.5 to 4.5 GHz is technically feasible. Thus, a
parallel LC resonant load is employed in each gain stage.
4.2.3 Core amplifier structure
Given the design choice of a differential amplifier with resonant load, the final key
design choice is what amplifier structure to use. One of the most common topologies
is the differential pair, shown in Figure 4-5(a). The common-mode rejection
ratio (CMRR) of this amplifier can be improved by adding a source degeneration
inductor and capacitor [4]. This amplifier can operate at very low supply voltages,
and in fact, for maximum energy efficiency it needs to operate off very low supply
voltages. The amplifier shown in Figure 4-5(a), but with source degeneration, has
been demonstrated operating at a supply voltage as low as 0.5 V [4]. In a practical
system, however, there are several reasons why such a low supply voltage is not ideal.
In mixed signal SoCs with RF, baseband analog and digital logic all on a single chip,
additional complexity is required to generate multiple supply voltages. Moreover,
there are voltage headroom advantages to operating both RF and baseband analog
circuits at higher voltages. For example, NMOS switches operating off a higher supply
voltage have an improved Ion to Ioff ratio. Thus, a key design goal is to design
an energy efficient RF gain topology that can operate off a 1 V supply.
Figure 4-5: Four possible implementations of an RF gain stage with a resonant load.
Two popular approaches for RF amplifiers to take advantage of increased voltage
headroom are by cascoding [67] and current reuse [66]. Figure 4-5(b) shows an ex-
ample of an RF gain stage with cascoding, and Figure 4-5(c) shows an example of
current reuse by using PMOS input devices in tandem with NMOS input devices. In
Figure 4-5(c), two current sources are used to regulate the current as well as cancel
common-mode components, whereas in Figure 4-5(d) only one current source is used
to regulate current. When only a single current source is used, the power supply can
be reduced resulting in improved energy efficiency; however, this comes at the cost of
degraded common-mode and power supply rejection.
Figure 4-6: Gain of an NMOS-based RF amplifier operating at 0.57 V (Fig. 4-5(a)),
an NMOS-based cascode RF amplifier operating at 1 V (Fig. 4-5(b)), and an inverter-
based, complementary RF amplifier operating at 1 V (Fig. 4-5(d)), with all amplifiers
consuming equal power (gain vs. frequency in MHz).
The circuits presented in Figures 4-5(a), 4-5(b) and 4-5(d) were simulated to determine
which topology has the maximum gain at a given power consumption, and
the simulated results are presented in Figure 4-6. To normalize power consumption,
the circuit of Figure 4-5(a) is supplied a voltage of 0.57 V and a current of 5.26 mA,
whereas the other two circuits are supplied a voltage of 1 V and a current of 3 mA.
All three topologies have similar performance, except that the cascode amplifier provides
slightly less peak gain and has a narrower bandwidth. It might be somewhat
surprising that the inverter-based amplifier shown in Figure 4-5(d) has comparable
performance to the NMOS amplifier shown in Figure 4-5(a) considering that the
transconductance efficiency of PMOS devices is less than that of NMOS devices. While the
lower transconductance efficiency of PMOS devices does reduce performance, this is
offset by the lack of a current source above the PMOS input devices. Thus, the power
consumption of the current source can be amortized by both the NMOS and PMOS
input devices.
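The power normalization used in this comparison can be verified with a one-line product of supply voltage and current for each operating point quoted in the text. The dictionary keys below are descriptive labels chosen for this example.

```python
def power_mw(supply_v, current_ma):
    """DC power in milliwatts from supply voltage and supply current."""
    return supply_v * current_ma


operating_points = {
    "NMOS differential pair, Fig. 4-5(a)": (0.57, 5.26),
    "NMOS cascode, Fig. 4-5(b)": (1.0, 3.0),
    "inverter-based, Fig. 4-5(d)": (1.0, 3.0),
}

powers = {name: power_mw(v, i) for name, (v, i) in operating_points.items()}
for name, p in powers.items():
    print(f"{name}: {p:.3f} mW")
```

All three operating points come out at approximately 3 mW, so the gain curves in Figure 4-6 are compared at equal power.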
As the inverter-based RF amplifier shown in Figure 4-5(d) achieves comparable
performance to the NMOS amplifier shown in Figure 4-5(a), but operates off a 1 V
supply rather than a 0.57 V supply, this topology is chosen as the core RF amplifier.
To obtain a sufficiently large tuning range, the load capacitor is implemented with
metal-oxide-metal (MOM) capacitors with 5 bits of tuning as shown in Figure 4-7.
This capacitor implementation results in a superior tuning range and higher qual-
ity factor than a varactor based load capacitor. The tuning structure leverages the
differential RF signals to reduce the size of the NMOS switches, thereby reducing
parasitic capacitance. Figure 4-8 shows graphically how differential signals are lever-
aged to reduce overall transistor size. The two circuits in Figure 4-8 have the same
RF impedance to GND, but in Figure 4-8(a), the overall transistor width is reduced
by a factor of four. In the implemented circuit, additional NMOS switches to GND are
inserted to dc bias the switches and tolerate mismatch. The capacitor tuning requires
only NMOS switches and thus incurs no subthreshold leakage, as all internal nodes
are nominally biased at 0 V.
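The factor-of-four width saving can be reproduced with a simple switch model in which on-resistance is inversely proportional to transistor width. The resistance-width product below is a hypothetical value, and the function names are illustrative; only the ratio between the two configurations matters.

```python
R_UNIT_OHM_UM = 500.0  # hypothetical on-resistance x width product


def r_on(width_um):
    """Switch on-resistance, assumed inversely proportional to width."""
    return R_UNIT_OHM_UM / width_um


def bridging_switch(w0_um):
    """One transistor of width W0 connected between RF+ and RF-."""
    return {"differential_ohm": r_on(w0_um), "total_width_um": w0_um}


def grounded_switches(w0_um):
    """Two transistors of width 2*W0, one from each terminal to ground.

    Differentially the two switches appear in series.
    """
    return {"differential_ohm": 2 * r_on(2 * w0_um),
            "total_width_um": 2 * (2 * w0_um)}


bridge = bridging_switch(32.0)
grounded = grounded_switches(32.0)
print(bridge)
print(grounded)
print(grounded["total_width_um"] / bridge["total_width_um"])
```

Both configurations present the same differential resistance, but the bridging switch needs one quarter of the total width, which in turn reduces the parasitic capacitance added to the tank.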
93 fF
RF + RF -
1.5pF 1.5 pF128 pm
128 pm 128 pm
Tune[4]
Figure 4-7: Digitally tunable load capacitor of RF amplifier. All NMOS devices are
minimum length and their width is shown.
W=WoRF + .Wt - RF -
T
VDD
W= 2Wo W=2WoRF +.-. .w .W... RF -
T T
VDD VDD
Figure 4-8: Schematic showing how differential signals can be leveraged to reduce
transistor size. The two circuits shown have the same RF impedance, but by con-
necting a transistor between the positive and negative terminals, the overall transistor
width is reduced by four times.
Figure 4-9: Schematic of low noise amplifier.
4.2.4 Low Noise Amplifier
The core RF amplifier described and motivated in the preceding sections serves as
the basis of the LNA, shown in Figure 4-9. The LNA is an inverter-based RF am-
plifier that performs single-ended to differential conversion. The LNA is essentially
a common-source, common-gate (CS-CG) amplifier [26], with wideband matching to
the 50 Ω antenna through the PMOS common-gate amplifier. The LNA is designed
and laid out to be as symmetric as possible, so that noise on nodes like VCM is can-
celled. The LNA input is dc biased by Li, to VDD. Alternatively, if the input needs
to be dc biased to GND, the LNA can be designed with an NMOS common-gate
amplifier with no degradation in performance.
When the LNA is enabled, the switch en is closed, connecting the dc output of the
differential inverters with the dc input of the inverters. Through negative feedback,
the dc voltages at all of the nodes normalize to the same value, VCM. To allow the
LNA to turn on rapidly, switches are placed in parallel with RS1 and RS2 and these
switches are briefly enabled while the LNA turns on. In normal operation, RS1 and
RS2 are sufficiently large that the negative feedback does not degrade gain. When
the LNA is disabled, the switch en is opened, IDC is set to 0 A, and VCM is actively
driven to VDD. This allows the output dc voltage to freely float, which is necessary
for proper calibration of the receiver (described in Section 4.4).
(Schematic labels: LNA In; RF stage enables en_rf,1 through en_rf,6; squarer enables en_sq; Squarer Output.)
Figure 4-10: Schematic of 6-stage RF amplifier, including the LNA. A variable number
of stages can be enabled depending on the gain required.
4.2.5 Multi-stage RF amplifier
Following the LNA are five stages of RF gain, providing approximately 40 dB of gain
to amplify the received wireless signal from the micro-Volt to the milli-Volt level.
Figure 4-10 presents the schematic of the multi-stage RF amplifier, including the
LNA. The first stage of RF amplification is the LNA that is also shown in Figure 4-9.
Later stages of RF amplification are differential inverters with resonant loads as shown
in Figure 4-5(d).
To dc bias the RF gain stages, the center tap of each stage's inductor is connected
to the center taps of adjacent stages' inductors. Due to the differential voltage across
each inductor, these center tap nodes are virtual grounds. Moreover, as all RF am-
plifiers are biased with the same current density, these nodes are nominally at the
same dc voltage. By connecting these nodes together with a low impedance connec-
tion, the CMRR is superior to what is obtained with more traditional common-mode
feedback (CMFB) techniques like resistive feedback. The simulated common-mode
gain of the five stages of RF gain after the LNA is less than 3 dB over a wide input
bandwidth.
Each gain stage has a squarer at its output, although at any time only one squarer
is enabled. Depending on how much RF gain is needed, a variable number of RF gain
stages are enabled, as well as the appropriate squarer.
Figure 4-11: Schematic of differential, passive squarer.
4.3 Squarer
A squarer serves two functions in the receiver: to frequency shift (or mix) the received
RF signal to baseband and to square its amplitude. Although these two functions
are typically implemented in a single circuit, the frequency shifting and squaring
operations can be de-embedded from one another. For example, a mixer can be used
to mix a signal to baseband, and then after several baseband gain stages, the signal
can be squared. This can be useful to reduce RF gain, as micro-Volt level signals can
be processed by a mixer, whereas squaring circuits typically rely on device nonlinearity
and require milli-Volt level signals.
Most noncoherent UWB receivers implement a squarer by mixing the signal with
itself [65, 62, 4]. It is possible to design an entirely passive squarer that consumes no
dc bias current; however, these passive squaring circuits are traditionally single ended
[4] or pseudo-differential [62]. In this work, a passive, differential squarer is employed
that uses transistors biased in the triode region (Figure 4-11). The differential squarer
is made possible by the inverter-based RF amplifier, as the output voltage of the RF
amplifier is nominally mid-range, thereby allowing both NMOS and PMOS devices to
have sufficient gate overdrive. The squarer consumes no static bias current and has
near zero dc output voltage offset. A key advantage of this structure is that fairly well
matched differential outputs are generated. Due to its nonlinear transfer function,
the squarer requires RF inputs with amplitudes above approximately 10 mV. At a
10 mV RF input, the single-ended output voltage amplitude is ~0.7 mV, as shown
in Figure 4-12.
Figure 4-12: Output amplitude magnitude of differential squarer versus amplitude of
4.5 GHz RF input. (Plot: positive and negative outputs versus RF amplitude, 0 to 50 mV.)
Figure 4-13 presents a transient simulation of the RF front end at its maximum
gain setting and squarer in response to a pulsed LNA RF input. The transient plot
shows how 80 µV input signals are amplified to the milli-Volt level, and then squared
at RF, mixing the signal to baseband.
4.4 Baseband Amplifier
Following the squarer is a baseband signal chain consisting of a three-stage amplifier
followed by an integrator and ADC (Figure 4-14). The baseband amplifiers are simple
differential pairs with resistive loads. The cumulative differential gain of the baseband
amplifier chain is simulated to be 83 V/V and the 3 dB bandwidth is 230 MHz. The
large baseband gain is required to amplify the squarer output from amplitudes as low
as 0.5 mV. Each differential pair operates off a 1 V supply, is supplied 320 µA of
current, and has resistive and capacitive loads of 2.5 kΩ and 150 fF, respectively. A
multi-stage amplifier is used rather than an op amp due to the wide signal bandwidths
and because a high Q filter is not required.
(Plot panels versus time, 0 to 60 ns: LNA input, 80 µV pk-pk; LNA output +, 115 µV pk-pk;
RF Gain 4/6 output +, 3 mV pk-pk; RF Gain 6/6 output +, 24 mV pk-pk; squarer
output, 1.3 mV pk-pk.)
Figure 4-13: Transient simulation of RF front end at its maximum gain setting and
squarer in response to a pulsed LNA RF input.
Figure 4-14: Baseband signal chain, consisting of a baseband amplifier, an ADC and
digital calibration logic.
Due to the small input levels and high gain, offset compensation is a critical
component of the baseband amplifier. An input referred offset of merely 10 mV
would saturate the baseband amplifier. Traditionally, the goal of offset compensation
is to establish a 0 V differential output voltage given a 0 V differential input voltage;
however, in this system a fixed offset at the output needs to be established to maximize
dynamic range. This fixed output offset is required because the baseband signal
generated by the squarer is monopolar, meaning that the positive squarer output only
increases from its 'zero-input' level and the negative squarer output only decreases.
Thus, the positive baseband amplifier output should nominally be biased near the
bottom of the amplifier's dynamic range.
Figure 4-15: Examples of offset compensation approaches for the baseband ampli-
fier. (a) Discrete-time, analog offset compensation, (b) Continuous-time, analog offset
compensation, and (c) Discrete-time, digital offset compensation.
A variety of techniques can be employed to cancel offsets, including autozeroing
and chopper stabilization [41]. Given the wide signal bandwidth, autozeroing is pre-
ferred to chopper stabilization and can be implemented by analog or digital means,
and in a discrete time or continuous time process. In discrete time analog offset
storage (e.g. Figure 4-15(a)), the differential inputs are usually shorted together and
offsets are sampled onto capacitors at the input or output of amplifiers. Due to
leakage, the voltage stored on these capacitors must be periodically refreshed. In
advanced CMOS processes where gate leakage is significant, the refresh period may need
to be less than 1 µs, making this form of offset cancellation impractical. Alternatively,
analog offset storage can be implemented with a continuous time, RC high pass filter
between stages (e.g. Figure 4-15(b)); however, this will cancel low-frequency inputs.
Again, due to leakage, the high pass corner frequency may need to be on the order
of 1 MHz. While 1 MHz is not significant relative to the 250 MHz signal, the key
problem associated with continuous-time filtering is that a long sequence of received
pulses (or noise) over time can result in an identical output as a zero-input signal.
This prevents accurate measurement of the absolute energy received, and thus only
relative measurements can be made.
Given the aforementioned considerations, offset compensation is implemented dig-
itally in a discrete time process, and a simplified block diagram of the offset compen-
sation is shown in Figure 4-15(c). During calibration, the LNA is disabled and the
baseband inputs are shorted to the same dc value. Next, the integrator and ADC
convert the baseband output to a digital value. The ADC output code is processed
by a slope tracking state machine to adjust a 5 bit DAC until the ADC output code
approaches the desired ADC value. In the slope tracking algorithm, the DAC value is
incremented or decremented by one codeword until the minimum difference between
the ADC value and the desired ADC value is observed. While this could potentially
require many more ADC samples than a binary search algorithm, in the majority
of situations, only a few ADC samples are required. The slope tracking algorithm
begins searching from the DAC value calculated previously, and in practice this DAC
value does not need to change much over time.
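The slope tracking search described above can be sketched as follows. This is a minimal behavioral model, not the synthesized state machine: the `measure` callback, the toy transfer function `toy_adc`, and the target value of 3 are hypothetical, and the ADC code is assumed to vary monotonically with the DAC code.

```python
def slope_track(measure, start_code, target, n_bits=5):
    """Step a DAC code one LSB at a time toward the ADC target.

    measure(code) returns the ADC output code for a given DAC code and is
    assumed to be monotonic in the DAC code (hypothetical interface).
    Returns the chosen DAC code and the number of ADC samples used.
    """
    max_code = (1 << n_bits) - 1
    code = start_code
    best_err = abs(measure(code) - target)
    samples = 1
    for step in (+1, -1):               # try stepping up first, then down
        while 0 <= code + step <= max_code:
            err = abs(measure(code + step) - target)
            samples += 1
            if err > best_err:          # error grew: the minimum was passed
                break
            code, best_err = code + step, err
    return code, samples


def toy_adc(dac_code):
    """Hypothetical calibration path: ADC code rises 3 LSBs per DAC step."""
    return max(0, min(31, 3 * dac_code - 30))


print(slope_track(toy_adc, start_code=10, target=3))
```

Starting one code away from the optimum, the search settles after four ADC samples in this toy model, which illustrates why resuming from the previously calibrated DAC value keeps the calibration short.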
(Schematic labels: DAC Enable[14:0], DAC Current[6:0].)
Figure 4-16: Circuit of current-mode DAC that is used to cancel baseband amplifier
offsets.
Given the 83 V/V gain of the three-stage baseband amplifier and that input signals
can be as small as half a milli-Volt, input-referred offsets greater than ~100 µV must
be cancelled. A standard method to implement this offset compensation is to connect
a current-mode, multi-bit DAC to the output of the first baseband amplifier stage.
As each baseband amplifier stage has a gain of 4.3 V/V, a 100 µV input referred offset
corresponds to a 430 µV offset at the output of the first amplifier stage. Given an
amplifier load resistance of 2.5 kΩ, this forces the current-mode DAC to have an LSB
of approximately 2 µA. Such a low current is challenging to generate from the 320 µA
current supplied to the amplifier stage. Two key challenges with generating the 2 µA
current are that extreme device ratios are required to step down from 320 µA to 2 µA
and that such small currents result in slow time constants, potentially switching
between 'on' and 'off' too slowly for either the radio calibration logic or the radio
turn-on sequence.
One possible solution to address this problem is to connect the current-mode
DAC to the output of a later baseband amplifier stage. This allows for the LSB
current source to increase by many times; however, this technique comes at the cost
of increased power consumption. Additionally, this technique does not work if the
offset of the first amplifier stage is too large and later stages are saturated.
To address these problems, this receiver uses a current-mode DAC
that can be connected to the output of all three baseband amplifier stages rather than
just a single baseband amplifier stage. The DAC consists of seven identical current
sources which can be independently routed to one of the three amplifier stages, or
none of them (Figure 4-16). To ensure monotonicity, as the DAC code increases, the
current sources transition from being unconnected, to being connected to the final
amplifier stage, to eventually being connected to earlier amplifier stages. For example,
if there is zero input-referred offset, the current sources are disconnected from the
baseband amplifier. If there is a small input-referred offset, the current sources are
only connected to the output of the final baseband amplifier stage. However, if there
is a large input-referred offset, the current sources are connected to the output of the
first and second baseband amplifier stages.
Depending on whether a positive or negative offset needs to be cancelled, the
current sources can connect to the positive or negative output nodes. This sign
control is set by the Plus/Minus control signal. A table showing how the DAC code
relates to current routing is shown in Figure 4-17. Based on the DAC code, the DAC
Enable[14:0] and Plus/Minus switches are opened or closed (Figure 4-16).
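The benefit of routing identical current sources to different stages can be illustrated with a small model. The per-stage gain, load resistance, and number of sources come from the text; the unit current `I_UNIT` and the code-to-routing order in `routing` are assumptions made for illustration and do not reproduce the table of Figure 4-17. The sign control is omitted.

```python
STAGE_GAIN = 4.3   # V/V per baseband stage (from the text)
R_LOAD = 2.5e3     # ohms, load resistance of each stage (from the text)
N_SOURCES = 7      # identical current sources (from the text)
I_UNIT = 1e-6      # amps per current source; assumed for illustration


def input_referred(stage, current=I_UNIT):
    """Input-referred voltage produced by one source at a stage output.

    stage is 1, 2 or 3; a voltage at the output of stage k is divided by
    the gain of the k stages between the input and that node.
    """
    return current * R_LOAD / STAGE_GAIN ** stage


def routing(code):
    """Hypothetical routing: each code step moves one source one hop along
    unconnected -> stage 3 -> stage 2 -> stage 1. None means unconnected."""
    stages = []
    for i in range(N_SOURCES):
        hops = sum(1 for p in range(3) if code - N_SOURCES * p > i)
        stages.append(None if hops == 0 else 4 - hops)
    return stages


def correction(code):
    """Total input-referred correction for a DAC code (magnitude only)."""
    return sum(input_referred(s) for s in routing(code) if s is not None)


for code in (0, 1, 7, 8, 21):
    print(code, routing(code), correction(code))
```

With the assumed 1 µA unit current, one source at the final stage corresponds to roughly 31 µV at the input, while the same source at the first stage corresponds to several hundred microvolts. Because every code step moves a source to a node with a larger input-referred weight, the correction increases monotonically with the code in this sketch.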
4.5 Integrator and ADC
Following the baseband amplifier is a windowed integrator and ADC. Both the inte-
grator and ADC are clocked at 32 MHz, resulting in an integration period of 31.25 ns.
The output of the ADC is a digital representation of the total RF energy received
within the 31.25 ns integration period. This absolute measurement of energy is pre-
ferred to a relative measurement of energy, because it allows for demodulation of both
PPM and OOK data.
Background and Motivation
An integrator can be represented as an analog filter, and the two most common
techniques to implement an analog filter are with a switched-capacitor filter or with
a continuous time filter. Due to the 250 MHz signal bandwidth, it is challenging to
implement a low power switched capacitor filter and thus a continuous time filter is
preferred. One common approach to implement an integrator is to use an op amp
based integrator (Figure 4-18(a)). In an op amp integrator, feedback is used to create
a virtual ground node at the op amp's negative input terminal. This virtual ground
node in combination with resistor R allows for a highly linear conversion of voltage
Vin to current. Alternatively, the resistor R can be replaced with a transconductor
to form a GM-C op amp integrator (also called a Miller integrator) [84]. For optimal
performance, a high gain op amp is required that has a unity gain bandwidth much
larger than the input signal bandwidth. This is particularly challenging for the UWB
system due to the 250 MHz baseband signal bandwidth. Moreover, it is difficult to
realize op amps in advanced CMOS processes.
(Table columns: Code; DAC Enable 0 to 14; Plus/Minus.)
Figure 4-17: Table showing relationship between DAC code and current routing.
Based on the DAC code, the DAC Enable[14:0] and Plus/Minus switches are opened
or closed (Figure 4-16). In the table, shaded cells correspond to a switch being closed.
Figure 4-18: Two example circuits implementing a windowed integrator: (a) an op
amp based integrator, and (b) a transconductor based integrator. In both integrators,
φ1 controls a switch that resets the integrator output.
As an alternative to op amp based integrators, it is possible to design a transcon-
ductor (GM) based integrator that does not require op amps (Figure 4-18(b)). The
transconductance element can be easily implemented with a differential pair [85], a
common-source amplifier, or even an inverter [4]. One limitation, however, is that the
transconductance element requires a high output impedance. To increase the output
impedance, positive feedback can be leveraged [85]. Alternatively, it is possible to
use a dynamic output node [62], which also provides a significant increase in head-
room. For example, an integrator can be realized with a single-transistor, NMOS
common-source amplifier that discharges a load capacitor from VDD. One disadvan-
tage with this approach is that without an active load device like a current source,
any dc bias current generated by the transconductance element is integrated on the
output capacitor along with the integrated signal of interest. Despite this drawback,
a dynamic, transconductor based integrator is selected owing to its simplicity.
A key design challenge is how to design the ADC that converts the analog
integrator output to a multi-bit digital value. For maximum performance, the windowed
integrator must be able to integrate time periods of 31.25 ns that are immediately
after one another, which leaves no time for the integrator to reset its output between
integration windows and for the ADC to quantize the analog value. Although a
dynamic, transconductor based integrator can reset its output sufficiently fast (< 1 ns)
such that there is no significant degradation in performance, most practical ADCs
require multiple nanoseconds to quantize the analog value. To address this problem,
the ADC requires a sample-and-hold circuit, or interleaving or pipelining must be used.
For example, it is possible to time-interleave two integration capacitors and ADCs as
shown in Figure 4-19. When the input signal is being integrated onto capacitor C1,
an ADC quantizes the integrated value stored on capacitor C2, and vice versa.
Immediately after an ADC completes its quantization, the appropriate capacitor voltage is
reset to VDD in anticipation of the next integration period.
Figure 4-19: An example implementation of a time interleaved integrator and ADC.
Shown below the circuit diagram is a timing diagram.
As mentioned earlier, in practical transconductor based integrators with dynamic
output nodes, the transconductor output current consists of both a dc current (Idc)
and an input-dependent current (gmvin). The dc current generates a fixed voltage
offset at the output of the integrator, which can compress the dynamic range of
the output. As a solution, it is possible to cascade multiple integrators back-to-back
as shown in Figure 4-20. In this integrator, the transconductor first discharges
the voltage on capacitor C1 until it is discharged past the switching threshold of
the comparator connected to capacitor C1. When the comparator's output flips,
the transconductor output current is switched to discharge the voltage on capacitor
C2, until the end of the integration period. Capacitor C1 can be sized to cancel the
voltage offset introduced by Idc, such that only the signal of interest is integrated onto
capacitor C2. At the end of the integration period, only one ADC is required which
samples the integrated voltage on capacitor C2. The cascaded integrator and ADC is
similar to a zero-crossing based, pipeline ADC in that an asynchronous comparator
switching event indicates that a stage has completed [20]. The integrator and ADC
structure is also similar to a single-slope integrating ADC.
Figure 4-20: An example implementation of a cascaded integrator and ADC. Shown
below the circuit diagram is a timing diagram.
Compared to the time interleaved integrator shown in Figure 4-19, when a dynamic
integrator is used, the cascaded integrator and ADC requires comparable area but
increases the ADC dynamic range. Thus, the cascaded integrator and ADC is used
in this work. The following sub-section describes the actual implementation of the
integrator and ADC.
Implementation
The ADC consists of two single ended ADCs, operating on the positive and negative
integrator outputs and each generating 5 bits of information. The difference between
these ADC values generates a 6 bit output code, although if perfect matching is
assumed, only 5 bits of useful information are generated. Despite this limitation, the
pseudo-differential structure offers improved power supply rejection and common-mode
rejection compared to a single ended, 5 bit structure. Additionally, the
pseudo-differential outputs can allow for CMFB during normal operation, as described in
Chapter 2.
Figure 4-21: Six stage cascaded integrator and ADC.
Having the integration output quantized to multiple bits is useful for gain control
and for accurate timing synchronization. Due to the 5 bits of ADC information
combined with coding on the transmitter, the receiver is able to synchronize with an
accuracy of ±1 ns while being clocked with a period of 31.25 ns [35].
A detailed block diagram of the integrator and ADC is shown in Figure 4-21.
The differential inputs are first passed through a differential transconductor to convert
the input voltage to a current. This current discharges up to six stages from VDD in
succession, similar to that of a dynamic inverter. The differential rate of discharge
between the positive and negative ADCs is based on the differential input voltage,
and thus an integration function is realized. Based on the number of stages that
are discharged in the integration period, 2 bits of coarse quantization are generated.
Only 2 bits of information are generated from the six stages because the first two
stages are not considered in the coarse quantization. The first two stages should
ideally always be discharged by the end of an integration period and thus do not
contribute information. These first two stages serve to cancel out the static, zero-
input dc current of the differential transconductor that is required to appropriately
bias the transconductor in a linear region. Additionally, the time while these first
two stages are being discharged is leveraged by the final four stages to evaluate the
previous integration value.
The ADC generates an additional 3 bits of fine quantization that are combined
with the 2 bits of coarse quantization. These 3 bits are generated by quantizing the
capacitor voltage of the stage that was being discharged at the end of the integration
period with a flash ADC. The capacitor voltages on stages three through six are
temporarily held constant while the appropriate flash ADC resolves. During this
time period, the next integration period has already begun by discharging stage one.
A simple flash ADC with a resistive ladder DAC is used to generate these 3 bits.
Thus, 5 bits of data are generated by the integrator and ADC. Both positive and
negative outputs of the transconductor are independently processed by this integrator
and ADC structure, and thus a pseudo-differential output is generated.
The integrator and ADC are designed to operate at 1 V and do not require any
op amps. The architecture is similar to a single-slope integrator, but with a few key
modifications. First, the single-slope is divided over multiple stages, so that back-
to-back integrations can occur on the same signal path. Second, the 5 bit ADC
output code is generated by combining the outputs of a 3 bit flash ADC with 2 bits
of information derived from the number of stages that are discharged.
To describe the operation of the integrator and ADC in more detail, an example
integration and conversion period is described below.
Before integration begins, the first stage is precharged to VDD. Next, the inte-
gration period begins, and stage 1 is discharged from VDD. The rate of discharge
depends on the input voltage. Simultaneously, stage 2 is precharged. Once stage 1
has discharged past a certain threshold, the comparator connected to stage 1 toggles.
This comparator triggers the next stage to begin discharging. While stage 2 is dis-
charging, stage 3 is precharged. This process again continues until the comparator
connected to stage 2 toggles, signaling the integrator to begin discharging stage 3.
This process continues until the end of the integration period. Depending on the
input voltage, a variable number of stages have been discharged by the end of the
integration period. This number will be later used to do coarse ADC quantization
of the integration value. Right before the end of the integration period, stage 1 is
precharged to VDD, in anticipation of the next integration period. Thus, once the
integration period is over, a new integration period can begin by discharging stage
1. Simultaneously, stages 2 through 6 are evaluated by flip-flops to implement the
coarse ADC quantization of the integration value. Next, the appropriate flash ADC
from stages 3 through 6 is enabled to do fine quantization of the ADC value. The 2
bits of coarse data and 3 bits of fine data are combined to generate a 5 bit output
code.
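The way the coarse and fine bits combine can be sketched with a behavioral model of one single-ended output. The model below is an idealization under stated assumptions: all six stages are taken to hold equal charge, the flash ADC is taken to divide the active stage's discharge range linearly, and the total discharged charge is normalized to the charge of one stage. The function name `adc_code` is hypothetical.

```python
N_STAGES = 6        # cascaded integrator stages (from the text)
OFFSET_STAGES = 2   # first two stages only absorb the zero-input current
FINE_BITS = 3       # flash ADC resolution (from the text)


def adc_code(charge):
    """Behavioral model of one single-ended integrator/ADC output.

    charge is the total charge removed during the integration window, in
    units of the charge needed to discharge one stage to its comparator
    threshold (a normalization assumed here for simplicity).
    """
    # Clamp to the range the last four stages can represent.
    charge = min(max(charge, float(OFFSET_STAGES)), N_STAGES - 1e-9)
    done = int(charge)                             # stages fully discharged
    coarse = done - OFFSET_STAGES                  # 0..3, the 2 coarse bits
    fine = int((charge - done) * 2 ** FINE_BITS)   # 0..7, the 3 fine bits
    return (coarse << FINE_BITS) | fine


print(adc_code(2.375), adc_code(4.75))
```

In this model, four fully discharged stages with the fifth stage three quarters discharged give a coarse value of 2 and a fine value of 6, that is, an output code of 22, while a window that only reaches partway into the third stage gives a low code such as 3.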
4.5.1 Integrator and ADC Clocking
The integrator and ADC are almost entirely self timed, in that each stage is discharged
immediately after the preceding stage has finished discharging. Due to this feature,
no high frequency clocks above 32 MHz are required; however, four clock phase and
level sensitive signals are required:
1. A level to control when to precharge or discharge stage 1.
2. A level to control when the voltages at the output of stages 3 through 6 need
to be held constant for flash ADC conversion.
3. A clock to enable the flip-flops at the output of each integration stage.
4. A clock to enable the appropriate flash ADC.
(Schematic labels: Clock In, Clock Gating, Flip-Flop Clock, Flash ADC Clock, Precharge/Discharge Stage 1, Hold Stages 3-6.)
Figure 4-22: Integrator and ADC clocking circuit that generates the level sensitive
and clock phases. Each delay element has a nominal delay of 2 ns.
Figure 4-22 shows how these clock phases and levels are derived from the master
clock provided to the ADC. In each of the six integrator stages, only continuous time,
boolean logic needs to be combined with these four signals to determine when each
stage should be discharged, precharged and evaluated. Some additional signal and
clock gating signals are not shown, specifically the logic that deals with the first and
last integration periods in a series of back-to-back integration periods. For example,
in the first integration period, there is no need to enable the flip-flop clock and the flash
ADC clock, as these would evaluate the result of the preceding integration period,
which does not exist.
Figure 4-23 presents a transient simulation of the integrator and ADC for two
integration periods. The first integration period begins at 10 ns and the positive
ADC output code is 3. This low ADC output code is expected when there are no
received pulses. The negative ADC output code is not shown, but its output code is
nominally 31 minus the positive ADC output code, so long as the positive and negative ADCs are well matched
and the differential transconductor is properly biased. The second integration period
begins at 42 ns and the ADC output code is 22. By the end of the second integration
period, four integrator stages are completely discharged and the fifth stage is being
discharged.
(Plot panels versus time, 0 to 90 ns: Enable Conversion, Precharge Stage 1, Clock Flip-Flops, Clock Flash ADC, Hold Stages 3-6, and the ADC output code, which reads 3 and then 22.)
Figure 4-23: Transient simulation of integrator and ADC for two integration periods.
The first integration period begins at 10 ns and the second integration period begins
at 42 ns.
The average of the positive and negative ADC output codes represents the common-
mode of the integrator and ADC, and this average is used for CMFB by adjusting the
integrator current. The CMFB uses the same slope-tracking algorithm that is used
for offset compensation of the baseband amplifier. Typically, the CMFB is executed
prior to offset compensation.
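The relationship between the two single-ended codes, the 6 bit differential output, and the common-mode value used for CMFB can be written out directly. The sign convention (positive minus negative) and the function name `combine` are assumptions for illustration.

```python
def combine(pos_code, neg_code):
    """Return (differential, common_mode) from two 5 bit single-ended codes."""
    differential = pos_code - neg_code        # spans -31..31, hence 6 bits
    common_mode = (pos_code + neg_code) / 2   # average used for CMFB
    return differential, common_mode


pos = 22
neg = 31 - pos   # nominal negative code when the two ADCs are well matched
print(combine(pos, neg))
```

When the two ADCs are well matched, the common-mode value stays at 15.5 regardless of the input, so any deviation of the average from that value indicates a common-mode error that the integrator current can be adjusted to correct.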
4.6 Crystal Oscillator
The SoC is designed to be clocked off a fixed 32 MHz oscillator that is always en-
abled. Due to the noncoherent signaling, clock frequency and timing synchronization
accuracy requirements between transmitter and receiver are dramatically reduced.
The receiver does not need to track the phase of the RF carrier, as no information is
encoded in the phase. Instead, the receiver only needs to track the phase of the bit
periods, which are nominally 31.25 ns.
By stabilizing the frequency of an oscillator with a quartz crystal, it is possible to
achieve frequency accuracies on the order of ±20 ppm [86]. Such accurate frequencies
allow for the transmitter and receiver to require only one synchronization per packet,
without any phase tracking during the packet payload. For example, with a 32 MHz
oscillator, the transmitter and receiver have a worst-case offset frequency of 640 Hz.
If up to 2 ns of clock drift can be tolerated by the system, the receiver can operate for
100 µs after synchronization. At the maximum data rate of 16 Mbps, this corresponds
to a payload length of 1600 bits, which is more than sufficient for our application.
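The numbers in this paragraph follow from a few lines of arithmetic. The sketch below treats 20 ppm as the total relative frequency error between transmitter and receiver, which is the interpretation that reproduces the 640 Hz figure; the variable names are illustrative.

```python
F_CLK = 32e6        # Hz, system clock (from the text)
TOL = 20e-6         # 20 ppm relative frequency error (from the text)
MAX_DRIFT = 2e-9    # s, tolerable clock drift (from the text)
DATA_RATE = 16e6    # bit/s, maximum data rate (from the text)

offset_hz = F_CLK * TOL              # worst-case offset frequency
run_time = MAX_DRIFT / TOL           # time until the drift reaches 2 ns
payload_bits = DATA_RATE * run_time  # bits received in that time

print(offset_hz, run_time, payload_bits)
```

The results are approximately 640 Hz, 100 µs, and 1600 bits, matching the values quoted above.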
The receiver includes two Pierce crystal oscillators, although only one oscillator
can be enabled at any time. The two designs were included to compare a traditional,
single-transistor oscillator (Figure 4-24), with an inverter based, two-transistor oscil-
lator (Figure 4-25). Both circuits operate off a 1 V supply and nominally consume
150 µA of current. The single transistor oscillator includes an amplitude feedback
loop to prevent oscillation amplitudes from becoming too large [87]. A simple buffer
circuit, shown in Figure 4-26, is used to amplify the oscillator output.
(Blocks: Pierce Oscillator, Amplitude Feedback.)
Figure 4-24: Pierce one-transistor crystal oscillator with amplitude control circuitry.
Figure 4-25: Pierce two-transistor, inverter-based, crystal oscillator.
Figure 4-26: Crystal oscillator output buffer, to amplify the oscillator output to full
digital levels.
(Schematic labels: Delay[5:0], Calibration & Selection Logic, φ out.)
Figure 4-27: Delay-locked loop.
4.7 Delay-Locked Loop
For the receiver to successfully decode data, the integrator and ADC must be phase
aligned with the received data. Phase alignment is implemented with a digital syn-
chronization algorithm and a DLL. Based on the result of the digital synchronization,
an appropriate phase from the DLL is used to clock the integrator and ADC. During
synchronization, the DLL is bypassed and the integrator and ADC are provided the
same clock phase as the rest of the digital logic. As the DLL is not being used, the
DLL can be calibrated during this time by a SAR state machine.
The digital baseband achieves synchronization accuracy of ±1 ns in an integration
window of 31.25 ns, and the DLL is designed to match these specifications. The DLL
has 16 outputs, each nominally spaced 1.95 ns apart from one another. Due to the
noncoherent signaling, the DLL does not need to have good linearity, and thus it is
possible to use simple delay elements and calibration logic. Figure 4-27 presents the
schematic of the DLL. The core delay element consists of a current starved inverter,
and a DAC is used to control the bias current of the inverter. All outputs of the DLL
are passed to a digital, synthesized state machine.
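A successive approximation search of the kind a SAR state machine performs can be sketched as follows. The model is hypothetical: a 6 bit control word is assumed (the Delay[5:0] label in Figure 4-27 suggests this width), a larger code is assumed to give more bias current and therefore less delay, and `toy_line_delay` is an invented delay-versus-code curve used only to exercise the search.

```python
PERIOD = 31.25e-9   # s, one 32 MHz clock period (from the text)
N_TAPS = 16         # DLL outputs (from the text)
N_BITS = 6          # assumed width of the delay control word


def toy_line_delay(code):
    """Hypothetical total delay of the 16-element line versus DAC code.
    A larger code is assumed to mean more bias current and less delay."""
    return N_TAPS * 5e-9 / (1 + 0.05 * code)


def sar_calibrate(line_delay, period=PERIOD, n_bits=N_BITS):
    """Successive approximation: one comparison per bit, MSB first.

    Returns the largest code whose total line delay is still at least one
    clock period, assuming delay decreases monotonically with the code.
    """
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        if line_delay(trial) >= period:   # line not yet too fast: keep bit
            code = trial
    return code


code = sar_calibrate(toy_line_delay)
tap_spacing = toy_line_delay(code) / N_TAPS
print(code, tap_spacing)
```

The search needs only six comparisons for a 6 bit word, and in this toy model it lands on a tap spacing close to the nominal 1.95 ns (31.25 ns divided by 16 outputs).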
As the integrator and ADC operate off a different clock phase than the rest of
the digital logic, there is a potential for timing violations or clock offsets at the
interface. To address this problem, retiming registers connect to the ADC outputs.
These retiming registers can be either positive or negative-edge triggered to ensure
sufficient setup and hold time.
4.8 Digital Shift Register
The receiver is digitally programmed and monitored through a digital serial peripheral
interface (SPI) bus. The receiver operates as a slave device and the SPI bus consists
of 5 wires: SPI clock, master-output slave-input (MOSI) data, master-input slave-
output (MISO) data, SPI enable, and reset. The SPI clock is separate from the
32 MHz system clock, and is retimed by the system clock in the digital shift register
logic. Due to the retiming, the SPI clock must have a lower frequency than the system
clock.
The shift register consists of 3788 total bits, which are divided into 25 separate
sub-shift registers. Each sub-shift register is hard coded as a read/write, read-only,
or trigger shift register:
* Read/write shift registers are programmable based on MOSI data, and their
values can be read out through MISO. For example, the LNA bias current is
controlled by 6 bits in a read/write shift register.
* Read-only shift registers cannot be programmed, and instead are used to read
out internal states of the receiver through MISO. For example, the calibrated
integrator bias current is stored in a read-only shift register.
* Trigger shift registers are used to generate single pulse 'trigger' signals at the
32 MHz system clock. For example, a trigger can be generated to begin cali-
bration of the integrator bias current.
Of the 3788 total shift register bits, 343 bits are dedicated to the RF front end
and baseband analog circuits, including the ADC. In combination with a digital
state machine, these bits control which RF gain stages are enabled, default DLL bias
current settings, testability modes, etc.
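The three kinds of sub-shift register can be modeled in a few lines of Python. This is a behavioral sketch only: the class names, the frame-at-a-time transfer, the bit ordering, and the example register names and widths (other than the 6-bit LNA bias field) are assumptions, not details taken from the receiver.

```python
# Behavioral model only: names, widths, and bit order are assumptions.
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubRegister:
    name: str
    width: int
    kind: str                                   # "rw", "ro", or "trigger"
    bits: List[int] = field(default_factory=list)

    def __post_init__(self):
        if not self.bits:
            self.bits = [0] * self.width


class ShiftChain:
    """Chain of sub-shift registers accessed through MOSI and MISO."""

    def __init__(self, registers):
        self.registers = registers

    def total_bits(self):
        return sum(reg.width for reg in self.registers)

    def transfer(self, mosi_bits):
        """Shift one full frame in; return (MISO bits, fired triggers)."""
        if len(mosi_bits) != self.total_bits():
            raise ValueError("frame length must equal the chain length")
        miso_bits, fired, pos = [], [], 0
        for reg in self.registers:
            chunk = list(mosi_bits[pos:pos + reg.width])
            pos += reg.width
            miso_bits.extend(reg.bits)          # previous contents are read out
            if reg.kind == "rw":
                reg.bits = chunk                # programmable from MOSI data
            elif reg.kind == "trigger" and any(chunk):
                fired.append(reg.name)          # single pulse, nothing stored
            # read-only registers ignore MOSI data
        return miso_bits, fired


if __name__ == "__main__":
    chain = ShiftChain([
        SubRegister("lna_bias", 6, "rw"),
        SubRegister("integrator_bias_cal", 4, "ro", [1, 0, 1, 1]),
        SubRegister("start_integrator_cal", 1, "trigger"),
    ])
    print(chain.transfer([1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1]))
```

In this model a read-only register returns its state on MISO regardless of what is shifted in, and a trigger register produces an event without retaining the written bit.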
4.9 Digital State Machine
The receiver is designed to be duty cycled, where it periodically turns on and attempts
to receive a packet. Duty cycling allows for a reduction in average power consumption
at the cost of reduced data rates. This duty cycling capability is embedded within the
receiver through a digital state machine, consisting of a sleep counter and controllers
for the receiver and stimulator. In between packets, the radio and modem are disabled
and all digital logic is clock gated except for the sleep counter. This low power sleep
mode continues until the sleep counter reaches a programmable count value. At this
point, the receiver state machine is triggered, and the receiver attempts to receive a
packet.
Once the receiver is triggered, but before the receiver modem performs packet de-
tection, the receiver state machine performs calibration of the DLL, baseband ampli-
fier and integrator. This calibration only takes a few microseconds, and is performed
before every packet reception to account for any change in temperature or supply
voltage since the last packet reception attempt.
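The trade-off between duty cycle and average power can be sketched as follows. The power and timing values in the example are placeholders chosen for illustration, and clocking the sleep counter from the 32 MHz system clock is an assumption; neither is a measured or documented property of the receiver.

```python
# Illustrative sketch: example numbers are placeholders, not measurements.

def average_power_mw(p_active_mw, p_sleep_mw, t_active_s, t_period_s):
    """Average power of a receiver that is active for t_active_s per period."""
    if not 0 < t_active_s <= t_period_s:
        raise ValueError("active time must be positive and fit in the period")
    duty = t_active_s / t_period_s
    return duty * p_active_mw + (1.0 - duty) * p_sleep_mw


def sleep_count(t_sleep_s, f_clk_hz=32e6):
    """Programmable count for a sleep interval (counter clock is assumed)."""
    return round(t_sleep_s * f_clk_hz)


if __name__ == "__main__":
    # Placeholder scenario: 1 ms of reception every 100 ms.
    print(average_power_mw(20.0, 1.0, 1e-3, 100e-3), "mW average")
    print(sleep_count(99e-3), "clock cycles of sleep")
```

At low duty cycles the average is dominated by the sleep power, which is why the sleep counter is the only logic left running between packets.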
4.10 Test Circuits
For test purposes, additional circuitry is included in the receiver SoC. One key
circuit is the RF output buffer, which buffers the output of the RF gain chain off-
chip. The output buffer allows for noise figure and gain measurements of the RF
front end. The RF output buffer is shown in Figure 4-28 and consists of an NMOS
source follower with an NMOS common-source amplifier. As the majority of test
equipment is single-ended whereas the amplified RF signal is differential, the output
buffer converts the differential signal to a single-ended output. One advantage of this
circuit over a single-ended inverter-based amplifier or a simple source follower is that
to a first order, this circuit equally loads the positive and negative outputs of the RF
gain chain. Additionally, the circuit realizes greater gain than a source follower.
A second key circuit for testability is an output buffer for the baseband amplifier.
Figure 4-28: RF output buffer to drive the amplified RF signal off-chip.
The baseband amplifier differential pairs are not designed to drive a 50 Ω output
impedance or a large capacitive load, and thus an output buffer is required. Source
followers are used to drive both the positive and negative baseband amplifier outputs
off-chip. Finally, to allow the integrator and ADC to be characterized, the receiver
includes switches that allow off-chip signals to connect to the integrator inputs.
4.11 Summary
This chapter described the key RF and analog circuit blocks of the UWB receiver.
Highly digital, differential structures are used throughout the receiver to obtain good
performance and reject power supply and substrate noise. A differential inverter-
based amplifier with resonant load provides a good energy/gain trade-off while achiev-
ing comparable common-mode rejection to more traditional amplifier structures. The
cascaded integrator consists of a differential transconductor that discharges multiple
nodes from VDD in succession, and a flash ADC quantizes the integration result. To
improve system reliability and reduce off-chip components, the receiver SoC includes
extensive digital calibration logic.
Chapter 5
UWB Receiver SoC Measurement Results
This chapter describes the test setup and measurement results of the UWB receiver
SoC described in Chapter 4. The performance of the full UWB receiver SoC, as well
as that of its individual components, is described in this chapter. In addition, the performance
of the receiver is compared with other state-of-the-art receivers. A key challenge in
testing a UWB chip is that the high frequency RF circuits are highly susceptible to
parasitic sources of loss. This chapter describes the steps taken to avoid such loss.
5.1 Layout, Packaging and PCBs
The receiver is implemented in a 90 nm CMOS process and a die photo of the chip is
shown in Figure 5-1. The die area is 2.6 mm by 2.1 mm, and the area is dominated
by digital logic, which occupies the right side of the die.
Due to the significant amount of digital logic integrated on the same die as the
RF front end, there is significant potential for digital supply and substrate noise
to result in degraded analog and RF performance. This motivated the use of a
differential receiver architecture, as explained in Chapter 4. Additionally, several
layout techniques are employed to mitigate the effects of digital noise.
* Guard rings are placed around each of the sensitive RF and analog blocks, and
a separate guard ring is placed around the digital logic.
* A guard ring barrier is placed between the digital logic and all other circuits.
* Analog and digital I/O pads have separate ESD devices.
* The analog and digital circuits have separate power supply domains on-chip.
Figure 5-1: Die photograph of pulsed UWB receiver SoC.
Table 5.1: List of pad connections on receiver SoC.
# Pads   Description
36       GND
34       VDD (RF, analog, digital, ...)
1        RF LNA input
1        RF test output
3        Analog test I/Os
2        Crystal resonator connection
12       Digital output pads
13       Digital input pads
The receiver has a total of 102 I/O pads, and a breakdown of the pad connections
is presented in Table 5.1. To reduce the impedance of the power supply network,
over half of the pads are dedicated to either GND or VDD.
The chip is packaged in a quad flat no-lead (QFN) package with die attach pad
(DAP) for ground connections to limit the impact of bondwire inductance on the
Figure 5-2: Bare die packaged in (a) a 40-pin QFN package and (b) a 64-pin QFN
package. The 40-pin QFN package is 6mm x 6mm and the 64-pin QFN package is
9mm x 9mm. The two photos are equally scaled.
sensitive input and output RF pins. Two different packages are used: a QFN 64-
pin package for comprehensive characterization and a smaller 40-pin QFN package
optimized for RF measurements. Both packages have a pad pitch of 0.5 mm. When
the receiver is packaged in the 40-pin QFN package, many digital I/Os and VDD
pads on the die are left unconnected. Inside the QFN packages, the die is positioned
closer to one side to minimize the bondwire inductance of the RF input. Given that
bondwires have an inductance of approximately 1 nH/mm, positioning the die can
result in a significant improvement in performance. Photographs of the UWB receiver
bonded in the 40-pin and 64-pin packages are shown in Figure 5-2. In both die photos,
the ground downbonds to the DAP are visible.
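To see why bondwire length matters at these frequencies, the series reactance of a bondwire can be estimated from the 1 nH/mm rule of thumb quoted above. The sketch below is a first-order estimate that ignores mutual coupling and parasitic capacitance, and the bondwire lengths in the example are hypothetical values, not dimensions from the packaged parts.

```python
# First-order estimate; the example lengths are hypothetical.
import math

NH_PER_MM = 1.0  # approximate bondwire inductance per unit length (from the text)


def bondwire_reactance_ohm(length_mm, freq_hz, nh_per_mm=NH_PER_MM):
    """Series reactance X = 2*pi*f*L of a bondwire of the given length."""
    inductance_h = length_mm * nh_per_mm * 1e-9
    return 2.0 * math.pi * freq_hz * inductance_h


if __name__ == "__main__":
    for length_mm in (1.0, 2.0):
        x = bondwire_reactance_ohm(length_mm, 4.0e9)
        print(f"{length_mm:.1f} mm bondwire at 4 GHz: {x:.1f} ohm")
```

Even a millimeter of bondwire presents a reactance that is a substantial fraction of 50 Ω at 4 GHz, so shortening the RF input bondwire by offsetting the die is a meaningful change.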
Figure 5-3 shows a block diagram of the three FR4 PCBs used in the test setup.
The PCBs are stacked with the receiver mounted on top.
1. The bottom PCB (Figure 5-4) is an Opal Kelly XEM3001 integration mod-
ule based on a 400,000 gate Xilinx Spartan-3 field programmable gate array
(FPGA). The XEM3001 interfaces to a PC and is powered via a Universal Se-
rial Bus (USB) connection. In the test setup, the XEM3001 serves as a digital
pattern generator, a digital logic analyzer, and a power source.
2. The middle PCB (Figure 5-5) interfaces between the bottom XEM3001 and the
receiver chip on the top PCB. It includes level converters to convert the 3.3 V
digital signals on the XEM3001 to and from the 2.5 V digital signals on the
UWB receiver. It also includes low frequency analog inputs and power supply
regulators.
3. The top PCB (Figure 5-6) is a daughterboard and includes the packaged receiver,
decoupling capacitors, and a crystal resonator. SMA connectors on the
board connect to the RF input and RF test output of the receiver.
Figure 5-3: Block diagram of the three PCBs used for testing the receiver.
Given that the QFN package needs to be directly soldered to a PCB, it is challeng-
ing to rapidly test multiple chips. Sockets are available for QFN packages, but they
cost thousands of dollars and require clearance around the QFN package, resulting in
longer traces with increased parasitic RF losses. Without a socket, it is necessary
either to completely populate a new PCB for every chip to be tested, or to unsolder
one chip and solder a new chip in its place. Unfortunately, both of these options are time consuming.
As an alternative, by using a daughterboard for the radio chip, multiple chips can be
rapidly tested. For both the QFN 40-pin and 64-pin packages, the same middle PCB
is used and only the daughterboard is different.
Figure 5-4: Opal Kelly XEM3001 FPGA module. (Actual size 3.5" by 2.0")
Figure 5-5: Middle PCB of three PCB stack. (Actual size 4.5" by 2.4")
Figure 5-6: Top PCB (daughterboard) of three PCB stack. This specific daughterboard
is designed for the 40-pin QFN package. (Actual size 1.8" by 1.3")
5.2 Bit Error Rate and Interference Measurements
The receiver sensitivity is perhaps the best measure of overall performance of the
receiver SoC. Here, sensitivity is defined as the average input power level that results
in a given bit error rate (BER) at a specified data rate. Testing sensitivity of the
receiver requires a UWB transmitter with power and frequency control. Figure 5-7
shows the test setup employed to measure the receiver BER. The test setup consists of
the UWB receiver, power supplies, a laptop, a spectrum analyzer, and an ideal UWB
transmitter consisting of an arbitrary waveform generator (AWG), a vector signal
generator (VSG), and an RF interference generator. The ideal UWB transmitter has
the following properties:
* The AWG (Tektronix AWG710) generates baseband pulse bursts that will be
upconverted to PPM data. The AWG is clocked at 512 MHz. Each pulse
burst consists of 12 back-to-back pulses of 1.95 ns. This 23.4 ns pulse burst was
found to result in maximum receiver sensitivity. To ensure a wideband frequency
spectrum, the individual baseband pulses are modulated with a pseudo-random
pattern, generating a BPSK modulated pulse burst.
* The VSG (Agilent E8267C) generates the PPM UWB signal. The baseband
pulse stream from the AWG is passed to the VSG wideband modulator input,
which upconverts the baseband signal to RF. The VSG includes precise output
power and frequency control.
* The RF interference generator (Agilent E8362B) generates an RF frequency
tone with programmable output power. For interference measurements, this
tone is combined with the VSG output.
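The timing of the baseband pulse burst follows directly from the AWG clock rate. The sketch below reproduces that arithmetic and generates one pseudo-random BPSK chip sequence. Treating each pulse as a single AWG sample is an inference from the matching numbers, and the random generator and seed are arbitrary choices, not the pattern used in the measurements.

```python
# Illustrative sketch: one AWG sample per pulse is an inference, and the
# pseudo-random pattern here is arbitrary.
import random

AWG_CLOCK_HZ = 512e6
PULSES_PER_BURST = 12


def pulse_width_ns(clock_hz=AWG_CLOCK_HZ):
    """Duration of one pulse if it occupies one AWG sample."""
    return 1e9 / clock_hz


def burst_duration_ns(n_pulses=PULSES_PER_BURST, clock_hz=AWG_CLOCK_HZ):
    """Duration of a burst of back-to-back pulses."""
    return n_pulses * pulse_width_ns(clock_hz)


def bpsk_burst(n_pulses=PULSES_PER_BURST, seed=0):
    """Pseudo-random +1/-1 polarity for each pulse in the burst."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n_pulses)]


if __name__ == "__main__":
    print(f"pulse width    = {pulse_width_ns():.3f} ns")
    print(f"burst duration = {burst_duration_ns():.2f} ns")
    print("polarities     =", bpsk_burst())
```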
Before BER tests begin, the laptop must program both the receiver chip and
the ideal transmitter. The laptop interfaces to the receiver PCB test boards via
Figure 5-7: Test setup used to measure the BER of the receiver both with and without
narrowband interferers.
Figure 5-8: BER of receiver (a) at its highest gain setting at the three center frequencies,
and (b) at the different gain settings with fc = 4.0 GHz. (Both panels plot BER versus
input power in dBm.)
a USB interface, sending commands to and receiving data from the FPGA on the
XEM3001 PCB. The receiver is first programmed through its digital shift register to
an appropriate gain and frequency setting. Next, the laptop configures the ideal UWB
transmitter to generate a packet of many thousand back-to-back PPM encoded UWB
signals. The FPGA provides a 10 MHz reference clock to the AWG to ensure there
is no clock drift during the duration of a packet. At the start of a packet, the FPGA
generates a pulse that triggers the AWG to begin sending the packet. Simultaneously,
the FPGA stores the receiver ADC outputs or baseband demodulator outputs. From
this data, the BER of the receiver is determined.
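A minimal sketch of this post-processing step is shown below: counting bit errors against the known transmitted pattern, and interpolating a measured BER curve to find the input power at which a target BER is reached. The function names, the log-linear interpolation, and the example data are assumptions for illustration and are not taken from the actual test scripts.

```python
# Illustrative sketch: names, interpolation method, and data are assumptions.
import math


def bit_error_rate(sent_bits, received_bits):
    """Fraction of received bits that differ from the transmitted bits."""
    if not sent_bits or len(sent_bits) != len(received_bits):
        raise ValueError("bit sequences must be non-empty and equal in length")
    errors = sum(s != r for s, r in zip(sent_bits, received_bits))
    return errors / len(sent_bits)


def sensitivity_from_ber_curve(powers_dbm, bers, target_ber=1e-3):
    """Input power at which log10(BER) crosses the target, by interpolation."""
    points = sorted(zip(powers_dbm, bers))
    for (p0, b0), (p1, b1) in zip(points, points[1:]):
        if b0 >= target_ber >= b1 > 0 and b0 != b1:
            frac = (math.log10(b0) - math.log10(target_ber)) / (
                math.log10(b0) - math.log10(b1))
            return p0 + frac * (p1 - p0)
    raise ValueError("target BER is not bracketed by the measurements")


if __name__ == "__main__":
    print(bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0]))
    # Hypothetical BER curve, not measured data.
    print(sensitivity_from_ber_curve([-80, -70], [1e-2, 1e-4]))
```

Resolving a BER near 10^-3 with reasonable confidence requires many more than a thousand bits per measurement point, which is consistent with the use of packets containing many thousands of symbols.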
Figure 5-8 presents the BER of the receiver in different frequency channels at its
highest gain setting and at different gain settings with fc = 4.0 GHz. The receiver
achieves a maximum sensitivity of -76 dBm at a data rate of 16 Mbps and a BER of
10^-3. The sensitivity scales by 35 dB from the lowest to highest gain setting, allowing
for a trade-off of power consumption for sensitivity.
The BER of the receiver has been characterized in the presence of varying supply
voltages, to determine the resilience of the receiver to power supply variation. Fig-
ure 5-9 presents the sensitivity of the receiver versus core supply voltage at its highest
gain setting with fc = 4.0 GHz. A variation in the core supply voltage of ±50 mV
results in only a 2 dB degradation in sensitivity. Throughout these measurements,
none of the receiver components are recalibrated from the nominal values, including
the baseband offset compensation DAC, the RF front end capacitor tuning, and
the differential transconductor current. All BER measurements were conducted with
an I/O supply voltage of 1.2 V rather than the standard 2.5 V, to minimize digital
switching noise being coupled into the sensitive RF and analog circuits.
Figure 5-9: Sensitivity of the receiver versus core supply voltage at its highest gain
setting (fc = 4.0 GHz).
5.2.1 Interference Measurements
In a practical wireless system, the receiver must reject interfering signals, both
in-band (in the 3.5, 4.0, and 4.5 GHz channels) and out-of-band. Key potential sources of
out-of-band interferers include 802.11 at 2.45 GHz and 5.25 GHz, as well as Bluetooth
at 2.45 GHz. Table 5.2 presents the maximum tolerable out-of-band interferer power
at 2.45 GHz and 5.25 GHz. In these measurements, the receiver is first set to its
maximum gain setting and the UWB input power is set such that the BER of the
receiver is 10^-6. When fc = 4.0 GHz, this corresponds to an input power level
of -73 dBm. Next, the interferer tone is generated and combined with the UWB
signal. The maximum tolerable out-of-band power level corresponds to the maximum
interferer power level at which the BER remains below 10^-3. Similar measurements have
been taken with in-band interferers, and the maximum tolerable in-band interferer
power ranges from -1 dB to 3 dB relative to the UWB signal power.
Maximum Interferer Power (dBm)
Interferer   fc = 3.5 GHz   fc = 4.0 GHz   fc = 4.5 GHz
2.45 GHz     -22 dBm        -19 dBm        -23 dBm
5.25 GHz     -30 dBm        -30 dBm        -41 dBm
Table 5.3: Receiver power consumption breakdown. In idle mode, the crystal oscillator
is enabled.
Receiver component          Power Consumption
Leakage                     0.64 mW
Crystal oscillator          0.15 mW
Clock tree (idle)           0.13 mW
Delay locked loop           0.05 mW
Baseband amplifier & ADC    1.51 mW
LNA                         5.90 mW
RF Amplifier                0 mW to 14.30 mW
Total idle power            0.92 mW
Total active power          8.38 mW to 22.69 mW
5.3 Power Consumption & Energy/bit
As the receiver SoC is targeted for low power, highly energy constrained applications,
significant effort was spent to minimize overall power consumption and energy/bit.
A breakdown of power consumption is shown in Table 5.3. Due to the extensive
digital logic and the absence of power gating switches, the total leakage power is
0.64 mW. When the receiver is in idle mode, the clock tree is extensively gated;
however, 0.13 mW of power is still consumed. The overall receiver power consumption
is dominated by the LNA and the RF amplifiers that follow the LNA. Each individual
RF amplifier consumes approximately 2.85 mW, and the five-stage
RF amplifier consumes a total of 14.30 mW of power when all five stages are
enabled. At the lowest gain setting, the entire receiver consumes 8.38 mW of power
and at the highest gain setting, the receiver consumes 22.69 mW of power. The power
Table 5.2: Out-of-band interference robustness.
consumption is constant regardless of the RF center frequency and includes the power
of the digital backend when decoding data; however, these power measurements do
not account for the energy required for synchronization at the start of a packet.
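The totals in Table 5.3 can be cross-checked by summing the individual entries. The sketch below does this with the table values; grouping the leakage, crystal oscillator, and idle clock tree into the idle total, and including all three of them in the active total, is an assumption that is consistent with the published sums.

```python
# Cross-check of Table 5.3; the grouping of entries is an assumption.
IDLE_COMPONENTS_MW = {
    "leakage": 0.64,
    "crystal_oscillator": 0.15,
    "clock_tree_idle": 0.13,
}
ACTIVE_ONLY_COMPONENTS_MW = {
    "delay_locked_loop": 0.05,
    "baseband_amplifier_and_adc": 1.51,
    "lna": 5.90,
}
RF_AMPLIFIER_MAX_MW = 14.30  # all five RF amplifier stages enabled


def idle_power_mw():
    return sum(IDLE_COMPONENTS_MW.values())


def active_power_mw(rf_amplifier_mw):
    """Active power for a given RF amplifier power (0 to 14.30 mW)."""
    return (idle_power_mw()
            + sum(ACTIVE_ONLY_COMPONENTS_MW.values())
            + rf_amplifier_mw)


if __name__ == "__main__":
    print(f"idle:           {idle_power_mw():.2f} mW")
    print(f"active (min):   {active_power_mw(0.0):.2f} mW")
    print(f"active (max):   {active_power_mw(RF_AMPLIFIER_MAX_MW):.2f} mW")
```

The summed maximum differs from the tabulated 22.69 mW by 0.01 mW, presumably because the individual entries are rounded.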
As the receiver operates at an instantaneous data rate of 16 Mb/s, the energy/bit
of the receiver is 0.5 to 1.4 nJ/bit depending on the gain setting. Table 5.4 and
Figure 5-10 present the energy/bit of the receiver compared to previously published
receivers, both narrowband and wideband as well as coherent and noncoherent. The
receiver achieves one of the lowest energy/bit values; however, the energy/bit metric
is flawed in that it does not account for receiver sensitivity. Many radios that achieve
a low energy/bit achieve a very poor sensitivity and range, and thus have limited
utility in practical systems. As an attempt to better compare receivers of different
energy/bit, data rates, and sensitivities, it is possible to normalize receiver sensitivity
to a constant data rate. This normalized sensitivity is similar to noise figure, but offers
a measure of the entire receiver performance rather than just the noise performance
of the RF front end. The following tenet forms the basis of the normalized value:
A 10 dB improvement in receiver sensitivity is equivalent to a decrease in
data rate by 10x at a constant energy/bit.
This tenet is based on the fact that for a given modulation scheme and a fixed noise
figure, data rate scales linearly with bandwidth, and a 10x decrease in bandwidth
results in a 10 dB improvement in sensitivity based on the following equation:
\[ P_{r,\min} = -174\,\frac{\text{dBm}}{\text{Hz}} + 10\log(BW_{\text{Hz}}) + SNR_{\min} + NF \qquad (5.1) \]
In equation (5.1), Pr,min represents the sensitivity, BW represents the noise band-
width, SNRmin represents the minimum signal-to-noise ratio (SNR) required at the
output of the receiver, and NF represents the noise figure of the receiver. If NF
and SNRmin remain constant, then a 10x decrease in bandwidth results in a 10 dB
improvement in sensitivity.
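Equation (5.1) is simple to evaluate numerically. The sketch below implements it and confirms the 10 dB change for a 10x change in bandwidth; the SNR and noise figure values used in the example are placeholders, not values measured for this receiver.

```python
# Sketch of equation (5.1); the SNR and NF example values are placeholders.
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0


def min_detectable_power_dbm(bandwidth_hz, snr_min_db, nf_db):
    """Sensitivity from noise bandwidth, required SNR, and noise figure."""
    return (THERMAL_NOISE_DBM_PER_HZ
            + 10.0 * math.log10(bandwidth_hz)
            + snr_min_db
            + nf_db)


if __name__ == "__main__":
    wide = min_detectable_power_dbm(500e6, snr_min_db=10.0, nf_db=8.0)
    narrow = min_detectable_power_dbm(50e6, snr_min_db=10.0, nf_db=8.0)
    print(f"500 MHz bandwidth: {wide:.1f} dBm")
    print(f" 50 MHz bandwidth: {narrow:.1f} dBm")
    print(f"improvement:       {wide - narrow:.1f} dB")
```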
The tenet is also based on the approximation that noise bandwidth should ideally
scale linearly with power consumption, resulting in a constant energy/bit. While this
approximation is not fundamental, the following two examples provide some justifi-
cation. For example, if ten identical narrowband radios are operated simultaneously,
each in a different frequency band, the average data rate increases by 10x and the
average received power increases by 10 dB, but the energy/bit remains constant. Alternatively,
if a radio is duty cycled by 10x, the average data rate decreases
by 10x and the average received power decreases by 10 dB, but the energy/bit remains
constant.1 In practice, increasing data rates usually results in a sub-linear increase in
power consumption, resulting in high data rate radios achieving a better energy/bit
at the same scaled sensitivity as low data rate radios. Despite these limitations, the
scaled sensitivity metric serves as an effective number (like noise figure) to compare
the performance of a wide variety of receivers, both coherent and noncoherent and of
varying data rates.
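Both metrics reduce to one-line formulas. The sketch below computes energy/bit from power and data rate, and scales a sensitivity to a 100 kbps reference rate using the 10 dB per decade tenet. The function names are illustrative; the example inputs are the values quoted in this chapter for this receiver (22.69 mW, 16 Mb/s, -76 dBm).

```python
# Sketch of the two comparison metrics; function names are illustrative.
import math


def energy_per_bit_nj(power_mw, data_rate_kbps):
    """Energy per bit in nJ: 1 mW / 1 kbps = 1 uJ/bit = 1000 nJ/bit."""
    return power_mw / data_rate_kbps * 1e3


def scaled_sensitivity_dbm(sensitivity_dbm, data_rate_kbps, ref_rate_kbps=100.0):
    """Sensitivity normalized to a reference data rate (10 dB per decade)."""
    return sensitivity_dbm - 10.0 * math.log10(data_rate_kbps / ref_rate_kbps)


if __name__ == "__main__":
    print(f"energy/bit at highest gain: {energy_per_bit_nj(22.69, 16000):.2f} nJ/bit")
    print(f"sensitivity scaled to 100 kbps: "
          f"{scaled_sensitivity_dbm(-76.0, 16000):.1f} dBm")
```

A receiver that already operates at 100 kbps is unchanged by the scaling, while a faster receiver is credited with the sensitivity it would gain by trading data rate for bandwidth.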
Normalized sensitivity is included as a column in Table 5.4 and Figure 5-10(b)
plots normalized sensitivity versus energy/bit for previously published receivers and
this work. Ideally a receiver is positioned at the lower-left corner of this plot, achieving
a good normalized sensitivity and a minimum energy/bit. The receiver presented in
this thesis compares favorably with previously published work. From the plot, there
is a definite trade-off between energy/bit and normalized sensitivity. Although it is
possible to introduce a single FoM that captures both energy/bit and normalized
sensitivity, determining a fair FoM is difficult because the trade-off between
energy/bit and sensitivity is hard to quantify. Receiver sensitivity can
be improved by increasing energy/bit through a variety of techniques including coding,
reducing receiver noise figure, and using multiple antennas. For coherent receivers,
averaging two received bits can produce a 3 dB improvement in sensitivity, whereas
using a rate 1 convolutional code can produce a 6 dB improvement in sensitivity.
For noncoherent receivers, averaging offers less than 3 dB improvement in sensitivity,
depending on the SNR.
1 It is important to note that under duty cycling or parallelism, the instantaneous
sensitivity (or minimum detectable signal) does not change, and thus it is misleading to imply that the
sensitivity of the receiver has improved or worsened.
Table 5.4: Comparison of receiver with previously published work.
Author           Data Rate   Power    E/bit      Sens. at data rate   Sens. scaled to 100 kbps
                 (kbps)      (mW)     (nJ/bit)   (dBm)                (dBm)
Porret [88]      24          1        41.6       -95                  -89
Choi [89]        200         21       105        -82                  -85
Emira [90]       11000       114      10.3       -86                  -106
Otis [81]        5           0.4      80         -101                 -88
Darabi [91]      11000       360      32.7       -88                  -108
Chen [92]        500         2.8      5.6        -80                  -87
Lee [4]          16700       42       2.5        -77                  -99
Marholev [75]    3000        43       14.3       -83                  -98
Pletcher [93]    100         0.052    0.5        -72                  -72
Zheng [68]       15600       102      6.51       -75                  -97
Weber [94]       2000        36       17.8       -90                  -103
Bohorquez [95]   120         0.4      3.3        -93                  -94
Retz [96]        250         30.25    121        -96                  -100
Verhelst [60]    20000       3.1      0.159      -65                  -88
This work        16000       22.5     1.4        -76                  -98
This work        16000       11       0.7        -50                  -72
Figure 5-10: Two comparison plots of receiver with previously published work: (a)
energy/bit versus data rate, and (b) normalized sensitivity versus energy/bit. In both
plots, a point is shown for the receiver at its highest and its lowest gain setting. Data
for these plots are found in Table 5.4.
5.4 Wireless Demo of Transceiver Chipset
This UWB transceiver chipset has been demonstrated in a wireless image transmission
platform.2 A top level block diagram of the wireless image transmission system is
shown in Figure 5-11. The transmitter consists of a PC that generates packets, which
are processed by an FPGA module and wirelessly transmitted by the UWB transmitter
chip described in Chapter 3. The receiver consists of the UWB receiver chip described
in Chapter 4, which wirelessly receives packets; these are then processed by an FPGA
module, and the received data is displayed on a PC.
A unidirectional wireless link is used to minimize system complexity as the UWB
receiver is not integrated on the same die as the transmitter. To improve the reliability
of the wireless link, a feedback path from receiver to transmitter is implemented
through an internet connection. Graphical user interfaces are set up in MATLAB
on both PCs. The wireless link communicates at a maximum instantaneous data
rate of 8 Mb/s; however, the actual data rate in the system is reduced by over an
order of magnitude due to delays in the MATLAB algorithms. The wireless platform
is able to transmit data reliably, with above 95% packet reception rate and below
2 × 10^-5 BER, for distances up to 16 m; however, the transmitted waveform violates
the FCC spectral mask by a few dB. At 16 m distance, the wireless link communicates
at a maximum instantaneous data rate of 2.67 Mb/s. Further details on the image
transmission system are presented in [97].
Whereas the preceding sections in this chapter have presented high level system
measurements of the full receiver SoC, the following sections present detailed mea-
surement results of the many individual circuit components.
5.5 RF Front End
The RF front end is characterized by applying inputs to the LNA input and measuring
outputs at the RF output buffer. Additionally, s11 measurements are made by
2 The wireless image transmission platform was developed by Helen Liang as part of her Master
of Engineering thesis.
[Figure 5-11: block diagram of the wireless image transmission system (transmitter
node and receiver node, each with an Opal Kelly XEM3001v2 FPGA module; feedback on
missing packets from receiver to transmitter through an internet connection).]
Figure 5-12: Measured s11 of the receiver, comparing results from the 40-pin and
64-pin QFN packages and comparing a microstrip transmission line to a grounded
coplanar waveguide transmission line. In all three configurations the receiver is tuned
to a frequency of 3.5 GHz. (Traces: QFN64 with microstrip, QFN40 with microstrip,
QFN40 with grounded CPW; horizontal axis: frequency in MHz.)
connecting the LNA input to a network analyzer. As the LNA input is connected
on-chip to VDD by an inductor, it is necessary to ac couple test equipment to the
LNA input if the test equipment cannot tolerate dc voltage offsets. To implement the
ac coupling, a Pasternack PE8210 dc blocking sub-miniature A (SMA) male to SMA
female connector is used.
S-Parameter: s11
The s11 of the receiver is characterized by connecting the LNA input to an Agilent
E8362B network analyzer. An s11 < -10 dB is desired for a good 50 Ω match
to the antenna. If s11 > -10 dB, excess power is reflected from the LNA to the
antenna, degrading performance. s11 measurements of the LNA are presented in
Figures 5-12 and 5-13. In Figure 5-12, the s11 of the receiver is measured in three
different configurations: a 64-pin QFN package with a microstrip transmission line, a
40-pin QFN package with a microstrip transmission line, and a 40-pin QFN package
Figure 5-13: Measured s11 of the receiver with a center frequency of 3.5, 4.0 and
4.5 GHz packaged in a 40-pin QFN package with grounded coplanar waveguide transmission
lines.
with a grounded coplanar waveguide (CPW) transmission line. For all three test
boards, the receiver is tuned to a center frequency of 3.5 GHz. The 64-pin QFN
receiver has an s11 < -5 dB only over an extremely narrow frequency range and an
average s11 of -2.14 dB in the 3.5 GHz band. In contrast, the receivers in 40-pin QFN
packages have s11 < -5 dB over a much wider frequency range. The 40-pin QFN
packaged receivers with microstrip and grounded CPW transmission lines have an
average s11 of -8.2 dB and -10.5 dB, respectively, in the 3.5 GHz band. From these
results, it is evident that the reduced bondwire lengths associated with the 40-pin
QFN package result in improved performance and the grounded CPW transmission
line is superior to the microstrip transmission line. Figure 5-13 presents the measured
s11 of the receiver tuned to a center frequency of 3.5, 4.0, and 4.5 GHz packaged in
a 40-pin QFN package with grounded CPW transmission lines. The receiver has an
average s11 of -10.5 dB, -9.8 dB, and -7.8 dB in the 3.5, 4.0, and 4.5 GHz bands,
respectively.
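The practical meaning of these s11 values can be illustrated by converting them to reflected power and mismatch loss. The sketch below uses the standard relations for a scalar return loss; applying them to band-averaged s11 values, as in the example, is an approximation made here for illustration.

```python
# Standard return-loss relations; applying them to band-averaged s11
# values is an approximation.
import math


def reflected_fraction(s11_db):
    """Fraction of incident power that is reflected, |s11|^2."""
    return 10.0 ** (s11_db / 10.0)


def mismatch_loss_db(s11_db):
    """Loss in delivered power caused by the reflection."""
    return -10.0 * math.log10(1.0 - reflected_fraction(s11_db))


if __name__ == "__main__":
    for label, s11_db in (("QFN64, microstrip", -2.14),
                          ("QFN40, microstrip", -8.2),
                          ("QFN40, grounded CPW", -10.5)):
        print(f"{label}: {100 * reflected_fraction(s11_db):.0f}% reflected, "
              f"{mismatch_loss_db(s11_db):.2f} dB mismatch loss")
```

At s11 = -10 dB one tenth of the incident power is reflected and the mismatch loss is under half a decibel, whereas at -2.14 dB more than half of the incident power is reflected.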
Figure 5-14: Output spectrum of the RF front end when the LNA input is terminated
to 50 Ω. The spectrum analyzer RBW and VBW are set to 3 MHz, and the detector
is set to average mode. (Panels: fc = 3.5 GHz, fc = 4.0 GHz, fc = 4.5 GHz.)
RF Gain
For any measurements that require two ports, including noise figure and P1dB, mea-
surements are constrained due to the fact that the RF output buffer requires all RF
gain stages to be enabled as it is connected to the output of the final RF gain stage.
Thus, it is not possible to measure the transfer function of an individual RF gain stage.
As the RF output buffer is single ended, and has amplitudes over 30 dB greater than
the single ended LNA input, there exists the potential for oscillations to occur due to
substrate and power supply coupling. Despite efforts to minimize the chance of such
oscillations, they are observed when the receiver is tuned to a center frequency of
4.0 GHz. These oscillations are not visible when the receiver is tuned to the other
two channels. It is important to note that these oscillations do not occur when the
RF test output is disabled during normal operation. Figure 5-14 presents the output
spectrum of the RF front end when the LNA input is terminated to 50 Ω. For these
plots, the spectrum analyzer has an RBW and VBW of 3 MHz, and thus a thermal
noise power of -109 dBm is expected. From these plots, the maximum noise power
is approximately -65 dBm, corresponding to a cumulative gain and noise figure of
44 dB. When the receiver is tuned to a center frequency of 4.0 GHz, an oscillation
Figure 5-15: Frequency dependent gain of the RF front end at its maximum gain
setting at the three channel frequencies.
tone nearby 3.7 GHz is visible.
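The noise floor arithmetic in this paragraph can be reproduced directly. The sketch below computes the thermal noise power in the 3 MHz resolution bandwidth and the combined gain plus noise figure implied by the measured output noise; the function names are illustrative.

```python
# Reproduces the noise floor arithmetic; function names are illustrative.
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0


def thermal_noise_dbm(bandwidth_hz):
    """Thermal noise power in the given bandwidth."""
    return THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bandwidth_hz)


def gain_plus_noise_figure_db(measured_noise_dbm, rbw_hz):
    """Combined gain and noise figure implied by the output noise level."""
    return measured_noise_dbm - thermal_noise_dbm(rbw_hz)


if __name__ == "__main__":
    print(f"thermal noise in 3 MHz: {thermal_noise_dbm(3e6):.1f} dBm")
    print(f"gain + noise figure:    {gain_plus_noise_figure_db(-65.0, 3e6):.1f} dB")
```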
The frequency dependent gain of the RF front end at its maximum gain setting
at the three channel frequencies is shown in Figure 5-15. In each of the three channels,
a maximum gain of 33 dB to 35 dB is measured and the -3 dB bandwidth is
approximately 250 MHz. As will be described later, this measurement of gain likely
underestimates the actual gain by 5 dB due to attenuation caused by the output
buffer. The -6 dB bandwidth is approximately 500 MHz, and the bandwidth in-
creases at higher channel frequencies at a greater rate than the frequency increases,
indicating a degradation in the quality factor of the resonant tank at high frequencies.
Due to the oscillations at 3.7 GHz when the receiver is tuned to a center frequency of
4.0 GHz, the data between a frequency of 3.65 GHz and 3.8 GHz is removed from the
plot for fc = 4.0 GHz. The measured gain of the 4.0 GHz channel is likely reduced
compared to the other two channels due to non-linear behavior introduced by the
oscillation tone at 3.7 GHz.
Linearity: P1dB
For extremely large input signals, the RF front end can saturate, degrading
synchronization performance and BER. The linearity of the RF front end is measured
by applying a tone of varying amplitude to determine the P1dB. Figure 5-16
presents the output amplitude of the complete RF gain chain versus LNA input
amplitude, with the RF front end tuned to the RF input frequency of 4 GHz. From
this plot, the measured P1dB is -41 dBm and the total gain is measured to be 32 dB.
In this measurement, the P1dB is likely dominated by the final amplifier in the chain
of six RF amplifiers (including the LNA). In a later section, once off-chip losses
are accounted for, it is shown that the gain from LNA input to the input of the
final RF amplifier is approximately 31 dB. Based on this value, the expected input
P1dB of each amplifier can be estimated to be -41 dBm + 31 dB = -10 dBm. At the
lowest gain setting, assuming a 0 dBm transmit output power, the LNA input power
will be -10 dBm at a communication distance of approximately 3 cm. As the
transceiver chipset is not designed for such short distance communication, the
linearity of the RF front end is acceptable.
Figure 5-16: Output amplitude vs. input amplitude of the receiver RF front end at
its highest gain setting, with an input frequency of 4 GHz. From this plot, the P1dB
at the highest gain setting is measured to be -41 dBm.
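
As a rough illustration of how a 1 dB compression point can be read from a power sweep like the one in Figure 5-16, the sketch below fits the small-signal gain from the lowest-power points and interpolates the input power at which the output falls 1 dB below the ideal line. The sweep data is synthetic: it assumes a 32 dB small-signal gain and a made-up compression law chosen so that the answer lands at -41 dBm. It is not the measured data, and the function name is hypothetical.

```python
import numpy as np


def input_p1db_dbm(pin_dbm, pout_dbm, n_small_signal=3):
    """Interpolate the input power at which gain has compressed by 1 dB."""
    pin = np.asarray(pin_dbm, dtype=float)
    pout = np.asarray(pout_dbm, dtype=float)
    # Small-signal gain estimated from the lowest-power points of the sweep.
    small_signal_gain_db = np.mean(pout[:n_small_signal] - pin[:n_small_signal])
    compression_db = (pin + small_signal_gain_db) - pout
    crossed = np.nonzero(compression_db >= 1.0)[0]
    if crossed.size == 0:
        return None  # the sweep never reaches 1 dB of compression
    i = int(crossed[0])
    if i == 0:
        return float(pin[0])
    slope = (compression_db[i] - compression_db[i - 1]) / (pin[i] - pin[i - 1])
    return float(pin[i - 1] + (1.0 - compression_db[i - 1]) / slope)


# Synthetic sweep (assumption): 32 dB gain, compression growing above -45 dBm.
pin = np.arange(-80.0, -19.0, 2.0)
compression = np.clip(0.25 * (pin + 45.0), 0.0, None)
pout = pin + 32.0 - compression

system_p1db = input_p1db_dbm(pin, pout)
# Referring the system P1dB to the input of the final amplifier (31 dB of gain ahead of it).
final_stage_p1db = system_p1db + 31.0
print(system_p1db, final_stage_p1db)
```

The last step mirrors the estimate in the text: adding the 31 dB of gain that precedes the final amplifier to the -41 dBm system-level P1dB gives the roughly -10 dBm per-amplifier figure.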
Noise Figure
Figure 5-17: Noise figure of the RF front end when operating in each of three channels.
The noise figure of the RF front end directly impacts the sensitivity of the receiver
and is ideally minimized; however, there is a trade-off between power consumption
and noise figure. From Figures 5-14 and 5-15 it is possible to derive the noise figure
of the RF front end; however, precise noise figure measurements require the use of
a temperature calibrated noise source. Thus, noise figure measurements are made
with an Agilent N4002A noise source connected to an Agilent MXA N9020 Signal
Analyzer. Figure 5-17 presents the frequency dependent noise figure of the RF front
end when operating in each of the three channels. The minimum noise figure in the
3.5, 4.0, and 4.5 GHz channels is measured to be 7.7, 9.0, and 9.1 dB, respectively.
The average noise figure in the 3.5, 4.0, and 4.5 GHz channels is measured to be
8.3, 9.7 and 10.2 dB, respectively. Given that the receiver is targeted for energy
constrained applications where power consumption is as important as performance,
the measured RF front end noise figure is acceptable.
The noise figure measurements match what is expected based on the zero-input
measurements in Figure 5-14 and the gain measurements in Figure 5-15; however, the
noise figure measurements are significantly higher than simulated values, likely due
to the inability to run full RC extracted simulations that include the power supply
nodes, and also due to unaccounted for parasitic resistances, inductances and capac-
itances associated with PCB traces, on-die traces and bondwires. In an unextracted
simulation with no bondwires, the RF front end has a simulated noise figure of 3.1 dB
at 4 GHz. A significant degradation in noise figure is seen once on-die traces are
included in simulations after RC extraction. The simulated noise figure of the RC
extracted RF front end is 4.8 dB, ignoring any RC losses on the power supply or
ground nodes. In the following section, additional measurements of RF gain provide
evidence that the difference between simulated and measured noise figure is due to
degraded LNA gain.
Figure 5-18: Positive and negative output amplitudes of the squarer versus LNA
input power at its third of six RF gain settings (fc = 4.0 GHz).
5.6 Squarer
The squarer is characterized by applying RF input signals to the LNA and directly
measuring the outputs of the differential squarers. On the chip are programmable
switches to connect the squarer outputs to the analog test pads.
The matching performance and transfer function of the squarer are characterized
over the desired differential output voltage range of 0-to-8 mV. Ideally, the output
voltage amplitude should linearly increase with power and the positive and negative
outputs should be identical. The measured results, shown in Figure 5-18, achieve
good power-voltage linearity and matching.
Figure 5-19: Measured differential output amplitude of the squarer versus LNA input
power for the six RF gain settings.
Figure 5-19 presents the differential output amplitude of the squarer versus LNA
input power for the six RF gain settings. From Figure 5-19, one can determine the
absolute gain of each RF gain setting by using the simulation result that an RF
amplitude of 40 mV will generate a differential squarer output amplitude of 20 mV.
While the accuracy of this simulation result cannot be guaranteed, the squarer is less
subject to RF parasitics than the RF gain chain. Using this simulation result, the
single ended voltage gain of the RF front end is 39 dB at its highest gain setting and
0 dB at its lowest gain setting. As the LNA input is single-ended and all internal
signals are differential, the differential gain can be considered to be 6 dB greater than
the single ended gain, making the LNA voltage gain 6 dB. On average, the RF gain
increases by 8 dB when increasing the RF gain by one setting.
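
The gain extraction described above can be sketched numerically. The Python example below assumes an ideal square-law squarer calibrated by the simulated point (40 mV RF amplitude giving 20 mV differential output), a 50 Ω source, and peak (not RMS) amplitudes. All three are assumptions made for illustration. The -57 dBm operating point is hypothetical and is not a value read from Figure 5-19.

```python
import math

# Square-law constant from the simulated point: 20 mV out for 40 mV RF amplitude.
SQUARER_K_PER_V = 0.020 / 0.040 ** 2   # about 12.5 1/V
SOURCE_OHMS = 50.0                     # assumed source impedance


def dbm_to_peak_volts(power_dbm, resistance_ohms=SOURCE_OHMS):
    """Peak voltage of a sine wave delivering power_dbm into resistance_ohms."""
    power_w = 1e-3 * 10.0 ** (power_dbm / 10.0)
    return math.sqrt(2.0 * resistance_ohms * power_w)


def rf_amplitude_from_squarer(squarer_output_v):
    """Invert the assumed square law v_out = k * v_rf**2."""
    return math.sqrt(squarer_output_v / SQUARER_K_PER_V)


def rf_gain_db(lna_input_dbm, squarer_output_v):
    """Single-ended voltage gain from LNA input to squarer input, in dB."""
    v_in = dbm_to_peak_volts(lna_input_dbm)
    v_rf = rf_amplitude_from_squarer(squarer_output_v)
    return 20.0 * math.log10(v_rf / v_in)


# Hypothetical operating point: -57 dBm at the LNA input producing 20 mV at the squarer.
example_gain_db = rf_gain_db(-57.0, 0.020)
print(f"{example_gain_db:.1f} dB")
```

With these assumed inputs the sketch returns about 39 dB, which is the order of the highest-gain result quoted in the text. A different amplitude convention (RMS instead of peak) would shift the result by 3 dB.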
From Figure 5-19, the maximum RF gain is calculated to be 39 dB, which is
significantly larger than the gain of 33-to-35 dB measured in the preceding section.
These conflicting measurement results provide evidence that the RF output buffer
likely attenuates the RF output by approximately 5 dB. Moreover, the measured
LNA single ended gain of 0 dB is 4 dB below what is determined from RC extracted
simulations; however, the 8 dB gain of later gain stages closely matches simulation
results. Thus, there are likely unaccounted for parasitics in the LNA input network,
including the bondwire, bond pad, and PCB traces. These unaccounted for parasitics
would degrade both the noise figure and gain and explain the measured noise figure
of 8-to-10 dB versus the simulated noise figure of 5 dB.
Figure 5-20: Differential output amplitude of the baseband amplifier versus LNA input
power at its third of six RF gain settings, demonstrating a linear relationship between
input power and output amplitude (fc = 4.0 GHz).
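
The link between reduced LNA gain and a higher front-end noise figure can be illustrated with the Friis cascade formula. The sketch below is only a qualitative illustration: the per-stage noise figures and power gains are hypothetical placeholders, not values from the thesis, and the function names are made up for this example.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def cascade_noise_figure_db(stages):
    """Friis formula. stages: list of (power_gain_db, noise_figure_db), first stage first."""
    total_factor = 0.0
    gain_so_far = 1.0
    for index, (gain_db, nf_db) in enumerate(stages):
        factor = db_to_linear(nf_db)
        if index == 0:
            total_factor = factor
        else:
            # Later stages contribute their excess noise divided by the preceding gain.
            total_factor += (factor - 1.0) / gain_so_far
        gain_so_far *= db_to_linear(gain_db)
    return 10.0 * math.log10(total_factor)


# Hypothetical chain: an LNA followed by five identical gain stages.
later_stages = [(8.0, 10.0)] * 5
nominal_nf_db = cascade_noise_figure_db([(10.0, 3.0)] + later_stages)
degraded_nf_db = cascade_noise_figure_db([(6.0, 3.0)] + later_stages)  # 4 dB less LNA gain
print(f"nominal: {nominal_nf_db:.2f} dB, degraded LNA gain: {degraded_nf_db:.2f} dB")
```

With less gain in the first stage, the noise of the following stages is divided by a smaller factor, so the cascade noise figure rises even if the LNA's own noise figure is unchanged.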
5.7 Baseband Amplifier
The baseband amplifier is characterized in a similar way to the squarer, by applying
RF input signals to the LNA and directly measuring the differential output of the
3-stage baseband amplifier. On the chip are programmable switches to connect the
baseband outputs to the analog test pads. Using the offset cancellation current DACs
and the digital calibration, the baseband amplifier is calibrated to nominally have an
output offset of -230 mV with no RF input. As described in Chapter 4, this fixed
offset maximizes the dynamic range of the baseband amplifier due to the monopolar
squarer output. The zero input voltage offset is visible in the transfer function of the
baseband amplifier, which is presented in Figure 5-20.
Figure 5-21 presents the differential output amplitude of the baseband amplifier
versus LNA input power for the six RF gain settings. By combining the results in
Figure 5-21 with the squarer results in Figure 5-19, the baseband differential gain is
measured to be 80 V/V, closely matching simulation results.
Figure 5-21: Differential output amplitude of the baseband amplifier versus LNA
input power for the six RF gain settings (fc = 4.0 GHz).
Figure 5-22: (a) DNL and (b) INL of integrator and ADC.
5.8 Integrator and ADC
To test the integrator and ADC, off-chip signals are applied to the input of the
differential transconductor through the analog I/O pads. ADC static and dynamic
measurements are complicated by the fact that the integrator cannot be de-embedded
from the ADC and that there is no sampling switch. Thus, inputs are effectively time
averaged, making it difficult to measure ADC performance by applying sinusoidal
inputs. Instead, to measure linearity, dc inputs need to be applied. Figure 5-22
presents the differential non-linearity (DNL) and integral non-linearity (INL) of the
ADC. The measured DNL is less than 1 LSB and the INL is less than 2 LSB. The
symmetric, non-linear nature of the INL is due to the differential transconductor.
Figure 5-23: Linearity of ADC when provided a pulse input of varying duration.
Dynamic performance of the integrator and ADC is measured by applying pulses of
varying width, simulating the receiver in normal operation (Figure 5-23). By applying
pulse inputs rather than dc inputs, the bandwidth of the integrator is tested. The
measurement results indicate a monotonic response and fairly good linearity, with the
linearity likely limited by the differential transconductor.
5.9 Crystal Oscillator
The crystal oscillator consists of an off-chip, Abracon ABM10 crystal resonator [86]
and an on-chip Pierce oscillator. Two types of Pierce oscillators are included on the
receiver chip: a one-transistor, NMOS based oscillator and a two-transistor, inverter
based oscillator. Table 5.5 presents the measured oscillation frequency of the crystal
oscillator for these two topologies at varying supply voltages and on multiple chips. The
measured oscillation frequency does not vary significantly over all of the different
chips, voltages, and topologies. The measured phase noise of the crystal oscillators is
shown in Table 5.6.
Table 5.5: Crystal oscillator frequency for different chips, supply voltages and
oscillator topologies.
                             Oscillation Frequency (Hz)
Chip #   Pierce Oscillator   VDD = 0.95 V   VDD = 1.0 V   VDD = 1.05 V
1        2-Transistor        32,000,678     32,000,673    32,000,667
1        1-Transistor        32,000,719     32,000,720    32,000,722
2        2-Transistor        32,000,748     32,000,744    32,000,739
2        1-Transistor        32,000,776     32,000,778    32,000,780
Table 5.6: Crystal oscillator phase noise at various offset frequencies for both
one-transistor and two-transistor Pierce oscillator topologies.
                                   Phase Noise
Offset Frequency   One-transistor Oscillator   Two-transistor Oscillator
10 kHz             -99 dBc/Hz                  -107 dBc/Hz
100 kHz            -114 dBc/Hz                 -113 dBc/Hz
1 MHz              -122 dBc/Hz                 -126 dBc/Hz
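
To put the frequency values of Table 5.5 in perspective, the sketch below converts each measurement to an offset in parts per million. It assumes a nominal crystal frequency of exactly 32 MHz (matching the 32 MHz system clock); the dictionary layout and function name are illustrative only.

```python
NOMINAL_HZ = 32_000_000  # assumed nominal crystal frequency

# Measured frequencies from Table 5.5 at VDD = 0.95 V, 1.0 V and 1.05 V.
MEASURED_HZ = {
    (1, "2-transistor"): (32_000_678, 32_000_673, 32_000_667),
    (1, "1-transistor"): (32_000_719, 32_000_720, 32_000_722),
    (2, "2-transistor"): (32_000_748, 32_000_744, 32_000_739),
    (2, "1-transistor"): (32_000_776, 32_000_778, 32_000_780),
}


def offset_ppm(frequency_hz, nominal_hz=NOMINAL_HZ):
    """Frequency offset from nominal, in parts per million."""
    return (frequency_hz - nominal_hz) / nominal_hz * 1e6


all_ppm = [offset_ppm(f) for row in MEASURED_HZ.values() for f in row]
spread_ppm = max(all_ppm) - min(all_ppm)
print(f"offset range: {min(all_ppm):.1f} to {max(all_ppm):.1f} ppm, spread {spread_ppm:.1f} ppm")
```

Under the 32 MHz assumption, all twelve measurements sit roughly 21 to 24 ppm above nominal, and the spread across chips, supply voltages and topologies is about 3.5 ppm.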
Figure 5-24: (a) DNL and (b) INL of DLL based on an LSB of 1/16 × 31.25 ns. The
DNL and INL shown in (c) and (d) are based on an LSB of 2.04 ns.
5.10 Delay Locked Loop
As the digital synchronization algorithm is only able to achieve accuracies to ±1 ns,
the DLL does not need to be extremely linear. Figure 5-24 presents the DNL and
INL of the DLL, and both DNL and INL are less than 1 LSB. The INL of the DLL,
shown in Figure 5-24(b), gradually increases with codeword, indicating that each delay
element has too large a delay. Whereas each output phase should nominally be spaced
by 1.95 ns, the average measured spacing is 2.04 ns. This difference is likely due to
the limited tuning resolution of the DLL combined with mismatch in timing paths
of the DLL logic. If the INL is normalized to an LSB of 2.04 ns, very good INL is
observed (Figure 5-24(d)).
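
The effect of the LSB choice on the DLL linearity numbers can be sketched as follows. The example computes DNL and INL from a list of output phase edge times, first against the nominal LSB (one 31.25 ns clock period divided by 16 phases, a phase count inferred from the 1.95 ns nominal spacing) and then against the average measured spacing. The edge times are synthetic and perfectly uniform at 2.04 ns; they stand in for measured data that is not reproduced here, and the function name is hypothetical.

```python
import numpy as np

CLOCK_PERIOD_NS = 31.25                           # one period of the 32 MHz clock
NUM_PHASES = 16                                   # inferred number of DLL output phases
NOMINAL_LSB_NS = CLOCK_PERIOD_NS / NUM_PHASES     # about 1.95 ns


def dnl_inl(edge_times_ns, lsb_ns):
    """Return (DNL per step, INL per code) in LSB for the given edge times."""
    steps = np.diff(np.asarray(edge_times_ns, dtype=float))
    dnl = steps / lsb_ns - 1.0
    inl = np.concatenate(([0.0], np.cumsum(dnl)))
    return dnl, inl


# Synthetic phases (assumption): uniform spacing equal to the 2.04 ns measured average.
edges_ns = 2.04 * np.arange(NUM_PHASES)

dnl_nominal, inl_nominal = dnl_inl(edges_ns, NOMINAL_LSB_NS)
measured_lsb_ns = float(np.mean(np.diff(edges_ns)))
dnl_fit, inl_fit = dnl_inl(edges_ns, measured_lsb_ns)

print(f"INL at last code, nominal LSB: {inl_nominal[-1]:.2f} LSB")
print(f"max |INL|, normalized LSB: {np.max(np.abs(inl_fit)):.2e} LSB")
```

With the nominal LSB, every step is slightly too long, so the INL ramps steadily with codeword while staying below 1 LSB. Normalizing to the measured spacing removes the ramp and leaves only the step-to-step mismatch, which is zero for this idealized data.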
5.11 Summary
This chapter presented measurement results of the UWB receiver and compared the
receiver performance to existing work. Several steps were taken in layout, packaging,
and board design to reduce parasitic sources of loss. By using a smaller package with
shorter bondwires and improving the RF PCB traces, a significant improvement in
S11 was measured. The receiver operates at a 16 Mb/s instantaneous data rate and
achieves a sensitivity of -76 dBm at 10^-3 BER. The receiver SoC instantaneous power
scales from 8-to-22.7 mW while demodulating data, yielding 0.5-to-1.4 nJ/bit. When
compared to receivers that achieve similar normalized sensitivities, the receiver has a
very good energy/bit. The receiver has been combined with a UWB transmitter in a
UWB image transmission demo, and reliable communication is observed at distances
up to 16 m.
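
The energy-per-bit range quoted in this summary is the instantaneous power divided by the instantaneous data rate. A minimal Python check of that arithmetic is given below; the function name is illustrative.

```python
def energy_per_bit_nj(power_mw, data_rate_mbps):
    """Energy per bit in nJ for a given power (mW) and data rate (Mb/s)."""
    return (power_mw * 1e-3) / (data_rate_mbps * 1e6) * 1e9


low_nj = energy_per_bit_nj(8.0, 16.0)     # lowest power while demodulating
high_nj = energy_per_bit_nj(22.7, 16.0)   # highest power while demodulating
print(f"{low_nj:.2f} to {high_nj:.2f} nJ/bit")
```

This gives 0.5 nJ/bit at 8 mW and about 1.4 nJ/bit at 22.7 mW, matching the range stated above.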
Chapter 6
UWB Receiver SoC System
Demonstration: Cyborg Moths
In 1964, Jose Delgado, a scientist at Yale, stood in a bullring in Córdoba, Spain, facing
a charging bull. Moments before it could gore him, Delgado pressed a button on a
radio transmitter that activated electrodes implanted in the bull's brain, braking it to
a halt [98]. Since that demonstration, scientists and engineers have been fascinated
by cybernetic organisms, or cyborgs, that fuse artificial and natural systems. Cyborgs
enable harnessing biological systems that have been honed by evolutionary forces over
millennia to achieve astounding feats. Male moths can detect a single pheromone
molecule, a sensitivity of roughly 10^-21 grams. Swarms of cyborg insects could patrol
millions of acres of forest land, relay real-time traffic patterns, inspect bridges, and
conduct chemo- and nuclear surveillance over entire continents, all at extremely low
costs. Semiconductor technology is central to realizing this vision, and, despite a rate
of advance arguably unmatched in human endeavor, requires fundamental innovations
to catch up to the formidable capabilities of biological systems. Cyborg applications
are highly constrained in terms of energy, volume, and weight, demanding careful
system and circuit level engineering.
An emerging cyborg application is hybrid-insect flight control, where electron-
ics and MEMS devices are placed on and within insects to alter flight direction.
Compared to existing micro- and nano-air vehicles used by the military and other
government agencies, insects are appealing because they are small, can travel signifi-
cant distances, and can carry relatively large payloads. Such a hybrid-insect system
would take the best qualities of biology (energy storage, efficient flight control, and
highly adapted sensing) and combine them with the best qualities of electronics (low
weight, small size, deterministic control, and interfacing with computation). A critical
component of the hybrid-insect system is the communication link, which provides flight
control commands to the insect. A wireless communication link is required for maxi-
mum system range and versatility.
This chapter presents work on a hybrid-insect wireless system being developed
by engineers and scientists at the Massachusetts Institute of Technology, the Uni-
versity of Arizona, and the University of Washington. The UWB receiver that is
presented in the previous three chapters of this thesis is a key component of the
hybrid-insect system. The focus of this chapter is on the wireless transceiver and sys-
tem integration challenges, rather than the biological challenges. First, background
on hybrid-insect systems is provided and the Manduca sexta hawkmoth is introduced.
Next, an overview of the wireless test board and key circuits that interface with the
moth is presented. Finally, system results during untethered flight in a wind tunnel
are presented.
6.1 Previous Hybrid-Insect Flight Control Research
and Systems
For millennia, humans have been able to control the motion or flight of animals
through a variety of techniques. Horses are provided directional commands from hu-
mans by reins. Homing pigeons have been selectively bred to be able to find their way
home over extremely long distances and are used to deliver messages. Dogs, dolphins,
chimpanzees and many other animals can be trained to follow human instructions.
Hybrid-insect flight control systems are an example of these existing motion control
systems pushed to the micro-scale. A pivotal, early development leading to hybrid-
insect systems was the discovery of electrophysiology by Luigi Galvani in the late
1700s; he discovered that electrical sparks caused leg motion in a dead frog. The
electrical properties of biological cells and tissues make it possible to interface elec-
tronic devices with cells and tissues to deliver electrical stimulation and to obtain
electrical recordings.
One of the most influential developments in electrophysiology has been the devel-
opment of the cardiac pacemaker and defibrillator, where the beating of the heart is
measured electrically, and electric pulses are sent to the heart, causing it to either
beat at a normal rate or restart beating if it has stopped. Pacemakers are implanted
within the human body and consist of a battery with ultra-low-power electronics, ca-
pable of achieving ten years of battery lifetime [10]. In 2005, approximately 800,000
pacemakers were implanted worldwide [99]. Pacemakers share many similarities with
hybrid-insect systems in that electronics must interface with an organism and power,
weight, and volume are all highly constrained.
In the last few decades, due to ongoing miniaturization of electronics, researchers
have developed 'backpacks' to place on animals, for both electrical stimulation and
recording. Demonstrated systems include a stimulator for cockroaches [100], a dis-
crete wireless transmitter for transmitting muscle potentials of a flying locust [101], a
discrete frequency modulation (FM) telemetry system for recording electromyogram
(EMG) signals from moths [102, 103], an integrated FM telemetry system for record-
ing neural activity of monkeys [104], and many other systems [105, 106, 107, 108, 109].
The majority of published work involves electrical recorders and wireless transmit-
ters rather than electrical stimulators and wireless receivers, because recording neural
activity has historically been more important than stimulating neurons.
As scientists have come to better understand the electrical properties of organisms,
it is becoming increasingly feasible to control motion through electrical stimulation. In
[100], directional locomotion control of a cockroach is implemented through electrical
stimulation of afferent nerve fibers on the antennae. Recent work has controlled the
wing flapping of a moth through electrical stimulation of muscle groups [110]. A
similar stimulation approach has been used to control the flight of a beetle [111]. An
alternative to muscle stimulation is to directly stimulate the central nervous system
(CNS), which was demonstrated for moth flight control in [112].
From the aforementioned discussion, it is clear there are several possible insects
that can be used in flight control systems, and for each insect there are several pos-
sible ways to electrically control the flight direction through stimulation. For this
thesis, the Manduca sexta hawkmoth was chosen and preliminary flight control was
realized through direct stimulation of the CNS. The following section provides some
background on the Manduca sexta hawkmoth and introduces the flight-control sys-
tem.
6.2 Manduca sexta hawkmoth
In our research, we used the hawkmoth Manduca sexta, which is commonly found on
the American continent. During their lifespan, moths undergo complete metamor-
phosis, going from egg, to larva, to pupa, and then finally to adult moth, which can
live in captivity for up to two weeks. There are multiple reasons why moths are ideal
for insect flight control systems. First, moths are easily reared in laboratories; the
moth colonies at the University of Arizona and University of Washington have existed
for decades. Second, an adult hawkmoth has a wingspan of 10 cm and a carrying
capacity of approximately 1 g, which is large enough to carry the required electronics.
In addition, moths have been studied extensively by neurobiologists and physiologists
for decades, and thus there is extensive data on their flight control mechanisms. A
moth flaps its wings at 25 Hz, and subtle variations in the wing movement alter the
direction of flight.
6.2.1 Flight control through abdominal deflection
One of the most promising approaches to altering the flight direction of a moth is to
elicit abdominal movements via neural stimulation. The moth's abdomen plays an
important role in flight stabilization. For instance, when the moth flies upward, its
abdomen deflects downward. It has been shown that pulsed stimulation of the nervous
system can generate abdominal movements and thereby bias flight direction [112].
This is a fundamentally different process from stimulating muscles: by stimulating
the nervous system, the pulses consume negligible energy and do not need to be
synchronized with wing flapping.
Figure 6-1: Lateral view of Manduca sexta moth attempting to feed from an artificial
flower. (Photo courtesy of Armin Hinterwirth, University of Washington)
In our system, stimulation pulses are delivered to the nervous system via a 4-
electrode tungsten probe (Figure 6-2). The electrodes are implanted at the ventral
junction between the abdomen and thorax, and interface with the nervous system.
Figure 6-2: Photograph of two, 4-electrode tungsten probes.
Figure 6-3: Measured abdominal deflection of a moth in response to pulse stimuli of
varying voltage and frequency. In (a), the voltage of stimulation pulses is varied for
three pulse settings: N=30, T=10 ms, D=10%; N=30, T=10 ms, D=50%; and N=60,
T=5 ms, D=20%. T represents the pulse period, N represents the number of pulses
and D represents the duty cycle. In (b), the frequency of stimulation pulses is varied,
for a 0.5 s burst of 2.5 V, 1 ms pulses.
Tungsten wire is used because of its combination of tensile strength, resistance to
corrosion, and conductive properties.1 The electrodes consist of ~15 mm long, 0.004"
diameter tungsten tines electrically coupled to 0.002" diameter steel wire, which
in turn interfaces with the probe. The steel-tungsten junction is encapsulated in
hardened-resin epoxy which facilitates handling. The four stimulation sites of the
tungsten probe enable multi-directional flight control.
To elicit abdominal deflections, simple monopolar pulses with pulse duration of
1 ms are applied to the CNS via the tungsten probe. Of the four electrodes, at least
one electrode needs to be grounded and at least one electrode needs to be provided
pulses. Figure 6-3(a) plots the abdominal deflection of a moth versus pulse amplitude.
The direction of this abdominal deflection depends on the specific electrode which
is pulsed, and this varies from moth to moth. Generally, the maximum amount
of abdominal deflection that can be introduced is on the order of 7° to 10°. A
pulse amplitude of approximately 1.5 V or above is required to introduce abdominal
deflections. At pulse amplitudes above 5 V, the abdominal motions become less
smooth and would likely disrupt natural flight. Figure 6-3(b) plots the abdominal
deflection versus pulse frequency of a different moth. By varying the pulse frequency
from 2 Hz to 200 Hz, the amount of abdominal deflection can be varied from 0° to
7°.
Figure 6-4: Transient measurement of current delivered through a tungsten electrode
when applying a 2.5 V pulse with pulse duration of 1 ms.
1 The electrodes are designed by Tom Daniel and Armin Hinterwirth, and produced by
Susan Loudon, all at the University of Washington.
Based on a transient current measurement (Figure 6-4), the CNS connection between
two tungsten electrodes can be modeled as a resistor with a resistance of
approximately 25 kΩ. For a 2.5 V pulse stimulus, the average current during the 1 ms
pulse is approximately 100 µA. Thus, for a pulse duty cycle of 10%, the average
power consumption is 25 µW. In addition to the 25 kΩ resistive component, there is an
inductive or capacitive component that results in small current spikes at each pulse
edge. Figure 6-5 shows two photos of a moth, before and during pulse stimulus.
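
The stimulation power estimate follows directly from Ohm's law applied to the purely resistive model of the electrode interface. The sketch below repeats the calculation in Python; it ignores the reactive component and the current spikes at the pulse edges, and the function names are illustrative.

```python
def pulse_current_a(pulse_voltage_v, resistance_ohms):
    """Current during a pulse for a purely resistive load."""
    return pulse_voltage_v / resistance_ohms


def average_power_w(pulse_voltage_v, resistance_ohms, duty_cycle):
    """Average power of a rectangular pulse train into a resistive load."""
    return pulse_voltage_v ** 2 / resistance_ohms * duty_cycle


current_a = pulse_current_a(2.5, 25e3)              # 2.5 V into about 25 kΩ
power_w = average_power_w(2.5, 25e3, 0.10)          # 10% duty cycle
energy_per_pulse_j = 2.5 ** 2 / 25e3 * 1e-3         # one 1 ms pulse
print(f"{current_a * 1e6:.0f} uA, {power_w * 1e6:.0f} uW, {energy_per_pulse_j * 1e9:.0f} nJ per pulse")
```

The result, about 100 µA during the pulse and 25 µW on average, is small next to the milliwatt-level budget of the receiver electronics.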
6.3 Hybrid-Insect Flight Control System
A block diagram of the hybrid-insect flight control system is presented in Figure 6-6.
The flight control system is motivated by a DARPA project with the goal to direct the
flight of an insect to a target 100 m away. A basestation wireless transmitter sends
directional commands to the moth, and a wireless receiver on the moth decodes these
commands and sends pulses to the moth's CNS via tungsten probes to influence flight
direction. The electronics on the moth are powered by a miniature battery, although
the long term goal of the project is to harvest energy from the motion of the moth.
Figure 6-5: Photos showing abdominal deflection of a moth in response to pulse
stimulus. The abdomen is shown (a) before pulse stimulus and (b) during a pulse
stimulus of 2.5 V, 100 Hz with duty cycle of 10%. The radial lines on the graph paper
are spaced 2°, indicating a total abdominal deflection of 10°.
Figure 6-6: Hybrid-insect flight control system. (Blocks shown: transmitter, receiver
SoC and stimulation logic, battery, and tungsten stimulator.)
Given the extreme weight and volume constraints imposed by the moth, the elec-
tronics it is to carry need to be extremely lightweight, low power and to occupy
minimal volume. The total weight must be less than 1 g, the average power
consumption must be on the order of a few milliwatts, and the volume needs to be on
the order of 1 cm^3 or less. In addition, the receiver must have a range of 10-to-100 m.
Meeting these stringent specifications requires a high performance, highly integrated,
energy efficient receiver SoC. Two factors that help to meet these constraints are that
low data rates (kb/s) are sufficient and that no data needs to be gathered on the moth
and transmitted back to the basestation. Thus, the system employs a unidirectional
wireless link to reduce the complexity and power consumption of the electronics on
the moth; however, by not having a reverse link from moth to basestation, no packet
acknowledgements can be sent, reducing the reliability of the wireless link. As a
workaround, the transmitter employs packet repetition, thus trading off transmitter
power consumption for receiver power consumption. In future work, it would be ad-
vantageous to employ a bidirectional wireless link, both to improve the reliability of
the link and to enable data collection from sensors attached to the moth.
6.3.1 Electronics
Although the receiver is highly integrated, in the hybrid-insect system, additional elec-
tronic components are required. Figure 6-7 shows a block diagram of the electronics
that are used. The key components include the receiver SoC, a microcontroller, 2.5 V
dc-dc converter, 1 V low drop-out (LDO) regulator, miniature coin cell battery, on-off
switch, crystal resonator, LED, antenna, and discrete inductors, resistors and capac-
itors. The electronic components are soldered to a flexible, 4-layer PCB. A flexible
PCB allows for a 60-70% reduction in weight and thickness compared to a rigid PCB.
The electronic components are described in more detail below and photos of the PCB
are shown in Figure 6-8.
Figure 6-7: Block diagram of electronics mounted on a flexible PCB and attached to
a moth. (Blocks shown: 1.4 V battery, 1.0 V LDO, 2.5 V dc-dc converter, UWB RX SoC
and MSP430 microcontroller in a stacked package, flash programming interface, LED,
and tungsten interface.)
Figure 6-8: Flexible PCB (a) top, (b) bottom, and (c) side.
The electronics are powered by a 1.4-to-1.6 V Silver-Oxide, size 362 coin cell
battery. The battery has a typical capacity of 27 mAh, weighs 0.32 g, and has an
impedance at 40 Hz of 10-to-20 Ω. Several Zinc-Air and Silver-Oxide batteries were
tested, and this specific battery was found to be best suited to the high current
drawn by the electronics. At a given weight, thin-film batteries had significantly
worse energy capacity and impedance compared to Silver-Oxide batteries.
As the receiver SoC requires 1.0 V and 2.5 V supply voltages, dc-dc converters
are used to generate the required voltages from the Silver-Oxide battery. A Linear
Technology LTC3526 boost dc-dc converter generates the 2.5 V supply and a Linear
Technology LTC3025 LDO linear regulator generates the 1.0 V supply. A key
advantage of the LDO is that its embedded linear regulator directly connects to the
battery voltage but the internal logic operates off the 2.5 V supply voltage, allowing
for high efficiencies and a very low drop-out on the order of tens of millivolts. Both
dc-dc converters are packaged in a 2 mm by 2 mm dual flat no-lead (DFN) package
to reduce form factor and weight. To further reduce form factor and weight, only a
single decoupling capacitor is used for each supply voltage. A miniature on-off power
switch is used to enable the dc-dc converters, so that the receiver does not consume
any static current when turned off.
A key limitation of the receiver SoC is that it has no embedded flash memory
and that on power-up it must be programmed to an appropriate state through its
digital shift register. After being programmed, the embedded controller within the
receiver SoC is able to autonomously receive packets periodically. A Texas Instru-
ments MSP430 microcontroller with embedded flash memory is included in the system
and serves as a power-up programmer. To reduce the form factor and weight, the
MSP430 microcontroller is combined in a single QFN package with the receiver SoC
by stacking the receiver bare die on top of the microcontroller bare die. A photo of
the stacked die is shown in Figure 6-9.
As the receiver cannot send an acknowledgement when a packet has been received,
it is not possible for the operator to determine that the receiver is successfully
receiving packets and stimulating the moth. As a workaround, a miniature red LED is
attached to the PCB and is connected to one of the four stimulation channels. Thus,
when the moth is being stimulated, the LED rapidly turns on and off and is visible
to the naked eye. A red LED is used because moths cannot see the color and it does
not influence their flight.
Figure 6-9: Photo of the UWB receiver SoC and a Texas Instruments microcontroller
vertically stacked in a single package. The UWB receiver SoC is placed on top of the
microcontroller.
Table 6.1: Weight breakdown of components attached to the moth.
System Component                      Weight
Packaged RX SoC & Microcontroller     161 mg
1.5 x 2.6 x 0.002 cm PCB              100 mg
Antenna                               164 mg
Silver Oxide Battery                  320 mg
Harness                               85 mg
Other Components                      170 mg
Total                                 1000 mg
A Fractus UWB antenna is mounted on the PCB to transduce the electrical RF
signals to electromagnetic waves. The antenna occupies 10 x 10 x 0.8 mm, has a
radiation efficiency > 60% and weighs 164 mg.
Table 6.1 presents a weight breakdown of the components attached to the moth.
The total weight of all components is 1 g, including the tungsten probe and a harness
that is used to attach the PCB to the moth. The Silver-Oxide battery accounts for nearly
one-third of the total weight. Although lighter batteries exist, none were found that
could provide the peak and average current levels required by the receiver SoC.
Figure 6-10: Block diagram of stimulation logic included in the UWB receiver SoC.
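The weight budget of Table 6.1 can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary keys and function names are hypothetical, and the weights are copied from the table.

```python
# Component weights in milligrams, copied from Table 6.1.
# The dictionary keys and function names are illustrative, not from the thesis.
WEIGHTS_MG = {
    "Packaged RX SoC & Microcontroller": 161,
    "PCB": 100,
    "Antenna": 164,
    "Silver Oxide Battery": 320,
    "Harness": 85,
    "Other Components": 170,
}


def total_weight_mg(weights):
    """Sum all component weights (mg)."""
    return sum(weights.values())


def weight_fraction(weights, component):
    """Fraction of the total payload weight contributed by one component."""
    return weights[component] / total_weight_mg(weights)


if __name__ == "__main__":
    print("Total payload: %d mg" % total_weight_mg(WEIGHTS_MG))
    print("Battery share: %.0f%%"
          % (100 * weight_fraction(WEIGHTS_MG, "Silver Oxide Battery")))
```

The battery share evaluates to 320/1000, which is the "nearly one-third" figure quoted in the text.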
6.3.2 Stimulation Logic
A key circuit block on the receiver SoC is the stimulator logic which drives the stim-
ulator electrodes. The stimulator logic consists of a frequency divider, a pulsewidth
modulation (PWM) generator and output logic. The clock divider divides the 32 MHz
system clock to a stimulator clock of approximately 1 kHz. This low frequency clock
serves as the master clock for the PWM generator, which generates a pulse burst with
programmable on time, off time, and number of pulses in a burst. The receiver offers
support for up to 8 output channels, and each output channel can be individually set
to ground, to the pulse signal, or to a high impedance state. Level converters convert
the output signals to 2.5 V, as 1 V is not sufficient to elicit abdominal deflection of
the moth.
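The behavior of the stimulation logic can be sketched as a small tick-level model in Python. This is not the on-chip implementation: the divide ratio of 2^15 is an assumption chosen only because it maps 32 MHz to roughly 1 kHz, and the names `ChannelMode`, `pulse_burst` and `drive_channels` are hypothetical. Level conversion to 2.5 V is not modeled.

```python
from enum import Enum

SYSTEM_CLOCK_HZ = 32_000_000  # system clock frequency stated in the text
DIVIDE_RATIO = 2 ** 15        # assumption: power-of-two divider giving ~1 kHz


class ChannelMode(Enum):
    GROUND = "ground"
    PULSE = "pulse"
    HIGH_Z = "high-z"


def stimulator_clock_hz():
    """Frequency of the divided-down stimulator clock."""
    return SYSTEM_CLOCK_HZ / DIVIDE_RATIO


def pulse_burst(on_ticks, off_ticks, num_pulses):
    """Return one burst as 0/1 samples, one sample per stimulator clock tick."""
    if on_ticks < 1 or off_ticks < 0 or num_pulses < 0:
        raise ValueError("invalid burst parameters")
    burst = []
    for _ in range(num_pulses):
        burst.extend([1] * on_ticks)   # programmable on time
        burst.extend([0] * off_ticks)  # programmable off time
    return burst


def drive_channels(burst, modes):
    """Map the burst onto each channel: 0 (ground), the pulse, or None (high-Z)."""
    outputs = []
    for mode in modes:
        if mode is ChannelMode.GROUND:
            outputs.append([0] * len(burst))
        elif mode is ChannelMode.PULSE:
            outputs.append(list(burst))
        else:
            outputs.append([None] * len(burst))
    return outputs


if __name__ == "__main__":
    burst = pulse_burst(on_ticks=2, off_ticks=1, num_pulses=3)
    modes = [ChannelMode.PULSE, ChannelMode.GROUND, ChannelMode.HIGH_Z]
    for mode, samples in zip(modes, drive_channels(burst, modes)):
        print(mode.value, samples)
```

With the assumed divider, each tick lasts about 1.02 ms, so on and off times are set in multiples of roughly one millisecond.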
Figure 6-11: Photo of the moth with (a) dorsal PCB mounting and (b) ventral PCB
mounting.
6.4 Attaching PCB to Moth
As the average adult moth weighs only 2.5 g, it is a challenge for it to carry
the 1 g of hardware. In fact, previous literature has only demonstrated a carrying
capacity of 0.7 g [102]. For maximal carrying capacity, any weight attached to the
moth must be located near the moth's center of mass. Two alternatives for attaching
electronics to the moth are dorsal mounting by placing the board like a fin or mo-
hawk (Figure 6-11(a)), or ventral mounting by attaching the board with a harness
(Figure 6-11(b)).2 The harness is designed to stabilize and position the hardware
at the moth's center of gravity. Based on multiple tests, we determined that dorsal
mounting allows for a maximum carrying capacity of 0.5 g whereas ventral mounting
with the four-point harness allows for a maximum carrying capacity of 1 g, and thus
ventral mounting is used.
Several steps are required to attach the PCB to the moth and insert the tungsten
probe. It is particularly important not to damage the moth, either by damaging its
wings or by subjecting it to a level of trauma such that it no longer is interested
in flying. The first step of surgery is to remove scales from several locations on the
moth - if these scales are not removed, it is very difficult to attach the PCB and
insert the tungsten electrodes. Next, the moth is anaesthetized by submerging it in a
container of ice to rapidly cool the moth. This allows for approximately 15 minutes
of surgical time before the moth wakes up. During this time, the PCB is attached
to the moth via a harness (Figure 6-12(a)). The harness is attached to the moth
by crimping little metal beads and gluing them to the dorsal surface. By this time,
the moth is usually waking up, so its wings are temporarily restrained with clips and
then the tungsten probe is inserted into the nervous connective tissue at the ventral
junction of thorax and abdomen (Figure 6-12(b)). Wax is used to keep the tungsten
probe in place. After surgery, the moth is placed in a humidified incubator with
constant light exposure and given at least 8 hours to recover from surgery prior to
flight testing.
2 For ventral mounting, a harness developed by Goggy Davidowitz at the University of Arizona
is used.
Figure 6-12: Photos of the surgical procedure to attach the hardware to the moth.
In (a), the harness is attached to the moth and in (b), the tungsten probe is inserted
through the cuticle into the connective tissue adjacent to the nerve connectives.
Figure 6-13: Photograph of wind tunnel where moth flight tests were conducted.
6.5 Flight Tests
Moth flight tests were conducted in a wind tunnel at the University of Arizona.3
Figure 6-13 presents a photo of the wind tunnel test setup. Multiple video cameras,
both high speed and regular speed, were used to capture moth movement. The UWB
transmitter [33] was placed on the top of the wind tunnel, and a reference receiver
was placed on the bottom of the wind tunnel in view of the cameras. The reference
receiver serves as a secondary LED indicator, to indicate when the moth is being
stimulated with pulses. To encourage flight, the wind tunnel was set up with a 30
cm/s wind flow, and all tests were conducted in the dark, as moths typically fly at
dusk in an upwind direction. In addition, a sex-pheromone lure was placed upwind
from the moth, so that the moth would be encouraged to fly to it. The stripes on
the walls of the wind tunnel are used to give the moth a normal visual flow, as if
flying past plants, rocks etc. The red spots are invisible to the moth, but are useful
for interpreting video sequences.
3 The flight tests were conducted in collaboration with Alice Stone, a senior research specialist
at the University of Arizona, and Zane Aldworth, a postdoctoral researcher at the University of
Washington.
Multiple moths were tested in the wind tunnel, and a host of system problems were
initially encountered. For example, the zinc-air batteries that were used initially could
not tolerate the current consumption requirements, and the moth electronics would
only work for a few minutes. A second problem was that early surgeries subjected
moths to too much trauma, resulting in the moth being unwilling to fly. Through
practice, the surgical procedure became faster and the moths generally appeared
stronger and more willing to fly in the wind tunnel.
In a preliminary flight control experiment, a moth was able to fly while carry-
ing the electronics, and the moth's flight direction changed in response to a pulse
stimulus. The photos shown in Figure 6-14 present a time-lapsed view of one of two
consecutive flight trajectories of a moth while being stimulated. In the photos, the
moth is originally flying to the lower right hand corner. Next, the receiver receives
a wireless packet with a stimulation command and begins stimulating the moth.
Shortly after stimulation begins, the moth's direction of flight turns to the left. The
time between pulse stimulation and the change in flight direction is roughly 200 ms,
which corresponds with what is expected from prior tests [113]. In both of the flight
trajectories, the moth responds to the pulse stimulus with a leftward turn, with a
change of bearing of 195° during 500 ms of stimulation in the first trajectory and a
change of 162° during 350 ms of stimulation in the second trajectory.
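To put the two trajectories on a common footing, the bearing changes can be converted to mean turn rates. The sketch below reads the reported bearing changes as 195 and 162 degrees and divides by the full stimulation duration, so it includes the roughly 200 ms response latency. The function name is hypothetical, and the result is a rough average rather than a measured angular velocity.

```python
def mean_turn_rate_deg_per_s(bearing_change_deg, duration_ms):
    """Average turn rate over the stimulation window, in degrees per second."""
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    return bearing_change_deg / (duration_ms / 1000.0)


if __name__ == "__main__":
    # (bearing change in degrees, stimulation duration in ms) from the text
    trajectories = [(195.0, 500.0), (162.0, 350.0)]
    for change_deg, duration_ms in trajectories:
        rate = mean_turn_rate_deg_per_s(change_deg, duration_ms)
        print("%.0f deg in %.0f ms: %.0f deg/s" % (change_deg, duration_ms, rate))
```

Under these assumptions the first trajectory averages 390 deg/s and the second about 463 deg/s.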
It is important to note that these results are preliminary, and much work remains
to be done. Significant advances in understanding the neural control of flight in the
moth are required before robust, multi-directional flight control can be achieved. For
this moth, only a single stimulation site was tested. What is significant in this work is
that it demonstrated successful operation of the receiver and stimulator on the moth.
Moreover, this is one of the first demonstrated wireless receivers on a tetherless,
free-flying moth that has enabled a wireless change in flight direction. Table 6.2 compares
the performance of the wireless receiver system with previously published transmitters
and receivers for miniature biological systems.
Figure 6-14: Time-lapse photos of a moth in a wind tunnel responding to pulse
stimulus. Pulse stimulus is generated by the receiver SoC after the receiver successfully
decodes a transmitted packet. Photos of the moth are shown (a) prior to stimulation,
(b) at the moment when stimulation begins, (c) when the moth's flight path begins
to respond to the pulses, and (d) after having responded to the stimulation pulses.
In response to the stimulation pulses, the moth's flight path turns to the left.
6.6 Summary
This chapter described a hybrid-insect flight control system wherein electronics are
placed on a Manduca sexta moth and the flight direction of the moth is controlled
wirelessly. In the system, the UWB receiver described in previous chapters was
mounted on a miniature, flexible PCB and attached to a moth with a harness. As
adult moths have a maximum carrying capacity of 1 g, several steps were taken
to reduce overall system weight, including stacking dies and using a flexible PCB.
Preliminary flight control of a moth was demonstrated, with a battery powered UWB
receiver successfully receiving a packet and stimulating the moth, thereby changing
the moth's direction of flight.
Table 6.2: Comparison between this work and previously published wireless transmitters and/or receivers for miniature biological
systems.
              This Work        [104]          [105]         [108]         [114]                [106]
Frequency     3-5 GHz          94-98 MHz      88-108 MHz    3.2 GHz       2.64 MHz/433 MHz     80-90 MHz
Comm. Scheme  Noncoherent UWB  TDM/Analog FM  FM            FM            ASK/FSK              FM
Tx/Rx?        Rx               Tx             Tx            Both          Both                 Tx
Power Diss.   2 mW             1.5-2 mW       -             5-8 mW        13.5 mW              10 mW
Weight        1.0 g            1.1 g          >3.1 g        -             -                    0.1 g
Range         <10 m            0.5 m          A few meters  0.5 m         Tens of centimeters  16 m
Process       90 nm CMOS       1.5 µm CMOS    -             0.35 µm CMOS  0.5 µm CMOS
Chapter 7
Conclusions
7.1 Thesis Summary
In low power, mobile devices, energy is becoming more and more the limiting factor
rather than performance. Additionally, due to decades of semiconductor process scaling,
device geometries are now on the order of nanometers and device characteristics
have changed so dramatically that many traditional 'analog' circuits and architectures
are no longer optimal and/or feasible. To continue to improve energy efficiency while
leveraging advanced CMOS processes requires the use of highly digital circuits and
architectures. This thesis focused on developing highly digital, mixed signal circuits
and architectures for energy-constrained mobile applications. Several circuits and
architectures were presented in this thesis, progressively expanding in scope. First, a
single component, an ADC, was described, and then the thesis moved to the system-on-chip
level, describing a UWB wireless transceiver chipset designed for the IEEE
802.15.4a standard. Finally, the transceiver chipset was presented in a full system
demonstration of wireless insect flight control.
Overall, highly digital circuit design is well suited to nanometer CMOS processes.
Through proper circuit- and system-level optimization, high performance, energy effi-
cient operation can be realized. Voltage scaling can be readily combined with digital
circuits to improve energy efficiency at the cost of reduced performance. This cost,
however, can be acceptable in systems with time varying performance requirements,
such as sensor networks. A voltage scalable flash ADC was presented that can operate
down to a supply voltage of 200 mV, which is the lowest reported operating voltage
for a Nyquist rate ADC. Operating at such a low supply voltage required a
highly digital architecture that leveraged redundancy to improve linearity and yield.
In many high performance systems it can be advantageous to forgo voltage scaling
and instead operate off a fixed supply that maximizes voltage headroom and mini-
mizes system complexity. Both wireless transmitters and ADCs benefit from large
supply voltages to generate large output waveforms and increase dynamic range,
respectively. Simple, inverter-based amplifiers can be used both to amplify
small signals efficiently and to drive large signals off-chip. This thesis presented a highly
digital IR-UWB transmitter that employs capacitive combining to generate large pulse
amplitudes that are FCC compliant while being energy efficient.
In SoCs where extensive digital logic is integrated on the same die as sensitive
analog and RF circuits, there is significant power supply and substrate noise which
can degrade performance. Differential circuit structures provide excellent rejection of
this noise but come at the cost of increased power consumption. Except at extremely
low voltages, differential structures can be readily implemented in advanced CMOS
processes and with digital structures. For example, this thesis presented the first fully
integrated noncoherent UWB receiver with embedded synchronization logic. The
receiver employs several highly digital, differential amplifier and ADC structures,
combined with extensive digital calibration to achieve robust, low-energy operation
in a 90 nm CMOS process.
Finally, circuit optimizations alone are not sufficient to achieve an order-of-magnitude
reduction in power consumption. Such a large reduction in power consumption re-
quires optimization at the system-level, by considering the interaction between differ-
ent circuit blocks and making high level trade-offs. For instance, noncoherent UWB
signaling allowed for the use of a low power DCO and FLL in the transmitter rather
than a PLL. Three dimensional device stacking is another technique that offers the
potential for dramatic reductions in area and/or energy consumption. Through these
system-level optimizations, the UWB receiver SoC was successfully demonstrated in
a wireless flight control system for a moth.
7.2 Future Work
There are many exciting challenges that remain for highly digital mixed-signal circuit
and system design. CMOS process scaling is expected to continue for at least a decade
longer, and energy constrained mobile devices will become increasingly pervasive. To
maximize functionality while minimizing energy of mobile devices requires further
research and optimization at all levels.
7.2.1 Analog to Digital Converters
While this thesis presented an ADC that is able to scale voltage and frequency, it
would be advantageous for an ADC to also offer resolution scalability. Some prior
research has attempted to offer this functionality. In [115], a reconfigurable, op amp
based ADC is presented that can switch between pipeline and delta-sigma modes,
allowing for a bandwidth range of 0-10 MHz and a resolution range of 6-16 bits. In
[116], a SAR ADC supports both 8-bit and 12-bit modes, switching between resolution
modes by switching between two bit-resolution comparators. In systems like SDRs
and biomedical sensors, the required ADC resolution and frequency can change over
time. Voltage scaling has not yet been extensively leveraged in resolution scalable
ADCs, although it offers the potential to improve energy efficiency when scaling
resolution, particularly for low-resolution ADCs (<75 dB SNR) where thermal noise
does not limit performance [117].
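For orientation, the 75 dB SNR threshold can be translated into an equivalent resolution using the standard ideal-quantizer relation SNR = 6.02 N + 1.76 dB for a full-scale sine wave. The sketch below assumes that ideal relation applies and treats SNR and SNDR interchangeably. The function names are hypothetical.

```python
def ideal_snr_db(bits):
    """SNR of an ideal N-bit quantizer driven by a full-scale sine wave."""
    return 6.02 * bits + 1.76


def effective_bits(snr_db):
    """Invert the ideal relation to get an equivalent number of bits."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    print("75 dB SNR is about %.1f bits" % effective_bits(75.0))
    for bits in (6, 8, 12, 16):
        print("%2d bits: ideal SNR %.2f dB" % (bits, ideal_snr_db(bits)))
```

Under this assumption, 75 dB corresponds to roughly 12 bits, so the 6-bit flash ADC of this thesis and the 8-bit mode of [116] fall in the range where thermal noise is not the limit.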
One key limitation of the voltage scalable ADC described in this thesis is the
significant overhead associated with calibration. An area for future work is to de-
termine methods to rapidly calibrate similar ADC architectures, or perhaps modify
the architecture to remove the need for extensive calibration. For instance, in [43],
the stochastic ADC is designed to leverage variation and does not require extensive
calibration.
7.2.2 Wireless Transceivers
While cellular phones have been the driving force in the wireless industry in the
past decade, it is likely that short range, low power radios will play a significant role
in the next decade. Short-range wireless standards such as Bluetooth and 802.15.4
have gained traction in the market, and there are emerging standards for implanted
devices, such as Medical Implant Communications Service (MICS), and also for body
area networks (BANs). The IEEE 802.15 Task Group 6 is developing a communication
standard optimized for communication on, in or around the human body. The growing
biomedical space in particular provides many new challenges and offers the potential
for significant innovation.
Although the UWB receiver SoC was demonstrated in a complete system, and
achieves one of the best reported energy/bit values while maintaining high perfor-
mance operation, several modifications would be required if it were released as a
commercial part. One key modification would be to add power gating switches on
the digital logic to reduce the subthreshold leakage when the receiver is turned off.
To reduce off-chip component count, it would be advantageous to integrate dc-dc
converters and LDOs on die, such that the receiver SoC can directly connect to a
battery.
The receiver SoC uses multiple resonant LC loads in the RF front end to amplify
the received signal by approximately 40 dB. In this thesis, resonant loads were shown
to result in higher gain at a given power consumption than non-resonant loads.
Unfortunately, resonant loads occupy significant die area and have a limited tuning
range. For a commercial UWB transceiver with world-wide compliance, it is likely
that the entire 3.1 to 10.6 GHz frequency band needs to be supported. The receiver
architecture presented in this thesis would likely require parallel RF front ends to
support this entire frequency band, which would occupy excessive die area. To reduce
area, it may be possible to design a non-resonant, wideband RF amplifier structure
or to mix the RF signal to baseband after only one or two RF gain stages.
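The scale of the problem can be illustrated with two quick calculations: the linear voltage gain corresponding to 40 dB, and the fractional bandwidth of the full UWB band compared with the 3-5 GHz band of the present receiver. The sketch below uses only standard definitions, and the function names are hypothetical.

```python
def db_to_voltage_ratio(gain_db):
    """Convert a gain in dB to a linear voltage ratio (20*log10 convention)."""
    return 10 ** (gain_db / 20.0)


def fractional_bandwidth(f_low_hz, f_high_hz):
    """Bandwidth divided by the arithmetic center frequency."""
    if not 0 < f_low_hz < f_high_hz:
        raise ValueError("require 0 < f_low < f_high")
    center = (f_low_hz + f_high_hz) / 2.0
    return (f_high_hz - f_low_hz) / center


if __name__ == "__main__":
    print("40 dB = %.0fx voltage gain" % db_to_voltage_ratio(40.0))
    print("3-5 GHz: %.0f%% fractional bandwidth"
          % (100 * fractional_bandwidth(3.0e9, 5.0e9)))
    print("3.1-10.6 GHz: %.0f%% fractional bandwidth"
          % (100 * fractional_bandwidth(3.1e9, 10.6e9)))
```

The full band has a fractional bandwidth of roughly 109%, about twice the 50% of the 3-5 GHz band, which is one way to see why a single tuned front end is unlikely to cover it.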
7.2.3 Wireless Moth Flight Control
The moth flight control results presented in this thesis are preliminary, and much work
remains before reliable flight control can be demonstrated. The electronics themselves
need to be significantly reduced in weight from 1 g to less than 0.5 g. Although some
moths are capable of carrying a 1 g payload in flight, they need encouragement to fly
with such a large payload and their flying capabilities are degraded. A 0.5 g payload
can likely be carried by all adult moths without a significant degradation in their
flight capabilities.
The most significant contributor to weight in the presented system is the Silver
Oxide battery, which weighs 320 mg. Smaller, lighter batteries exist, but none were
able to deliver the milliamp-level currents required by the radio. Emerging battery
technology like thin-film batteries offers much potential for reduced battery impedance
and increased capacity, ideally allowing for a weight reduction on the order of a few
hundred milligrams. It is possible to further reduce the weight by 100 to 250 mg
by embedding power management and non-volatile memory within the receiver SoC,
thereby reducing overall component count and PCB size. By directly attaching the
receiver SoC die to the PCB, the weight of the QFN package can also be removed.
All of these modifications in combination would result in an overall system weight of
approximately 0.5 g. Finally, to further reduce the weight, a lighter antenna can be
designed, potentially directly on the PCB.
In combination with lighter electronics, significant advancements in understand-
ing moth electrophysiology, neurobiology and flight control are required before robust,
multi-direction flight control can be achieved. Indeed, even in a tethered, highly con-
trolled system, it is not possible to reliably control a moth's flight direction. Recently,
Jenna Atema, a biologist at Boston University, questioned whether the goal of reli-
able flight control is even feasible, in part due to the strong response of moths to
pheromones, which could override attempts at remote electronic control [118]. Dr.
Atema's questions are valid, and can only be answered through further research. As
part of this research, it would be useful to record EMG signals during stimulation and
flight [102, 103]. This functionality can be added to the existing system by integrating
an analog front end, an ADC and a wireless transmitter on the SoC.
Appendix A
Analysis of Device Stacking in
Subthreshold Regime
This appendix derives an analytical expression for total propagation delay of the
circuit shown in Figure 2-9 when biased in the subthreshold regime. This expression
is then used in Section 2.2 to estimate the switching threshold of a clocked comparator
depending on the amount of device stacking.
As discussed in Section 2.2, the ODE numerical solution (Figure 2-10(a)) can be
approximated with a piecewise-linear model (Figure 2-10(b)). A key observation is
that once V2 of Figure 2-9 has discharged, V1 is slightly reduced from the voltage it
originally discharged to. This is expected, as the current through M is assumed to
be equal to the discharge current, which decreases with time.
For the following analysis we consider the situation when the Lth node is dis-
charging (L < N). In this scenario, as only VL is being discharged, the current
through devices M1 through ML is equal and there is no current through devices
ML+1 through MN. We will refer to this current as IM,L. Thus, we have the following
set of equations:
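\begin{align}
I_{M,L} &= I_S\, e^{\frac{V_{in} - V_{L-1} - V_{tn}}{n V_{th}}} \left(1 - e^{-\frac{V_L - V_{L-1}}{V_{th}}}\right) \tag{A.1a} \\
&= I_S\, e^{\frac{V_{in} - V_{L-2} - V_{tn}}{n V_{th}}} \left(1 - e^{-\frac{V_{L-1} - V_{L-2}}{V_{th}}}\right) \tag{A.1b} \\
&\;\;\vdots \notag \\
&= I_S\, e^{\frac{V_{in} - V_{tn}}{n V_{th}}} \left(1 - e^{-\frac{V_1}{V_{th}}}\right) \tag{A.1c}
\end{align}
Initially $V_L$ is precharged to $V_{DD}$. As $V_{L-1}$ has already discharged, it is at a voltage
much less than $V_{DD}$ and thus we can assume that:
\begin{equation}
1 - e^{-\frac{V_L - V_{L-1}}{V_{th}}} \approx 1 \tag{A.2}
\end{equation}
To simplify equation (A.1), we substitute $a_i = e^{-\frac{V_i}{n V_{th}}}$. After dividing out the
common factor of $I_S\, e^{\frac{V_{in} - V_{tn}}{n V_{th}}}$ we are left with the following set of equations:
\begin{align}
I_{M,L} &\propto a_{L-1} \tag{A.3a} \\
&= a_{L-2} \left(1 - \left(\frac{a_{L-1}}{a_{L-2}}\right)^n\right) \tag{A.3b} \\
&\;\;\vdots \notag \\
&= a_1 \left(1 - \left(\frac{a_2}{a_1}\right)^n\right) \tag{A.3c} \\
&= \left(1 - a_1^n\right) \tag{A.3d}
\end{align}
We can manipulate the above equations as follows:
\begin{align}
\frac{a_{L-1}}{a_{L-2}} &= 1 - \left(\frac{a_{L-1}}{a_{L-2}}\right)^n \tag{A.4a} \\
\frac{a_{L-1}}{a_{L-2}} \frac{a_{L-2}}{a_{L-3}} \cdots \frac{a_2}{a_1} &= 1 - \left(\frac{a_2}{a_1}\right)^n \tag{A.4b} \\
\frac{a_{L-1}}{a_{L-2}} \frac{a_{L-2}}{a_{L-3}} \cdots \frac{a_2}{a_1}\, a_1 &= 1 - a_1^n \tag{A.4c}
\end{align}
As $\frac{a_i}{a_{i-1}} \approx 1$ for $i < L$, we can use the approximation that
$\left(\frac{a_i}{a_{i-1}}\right)^n \approx 1 - n\left(1 - \frac{a_i}{a_{i-1}}\right)$. Thus:
\begin{align}
\frac{a_{L-1}}{a_{L-2}} &= n\left(1 - \frac{a_{L-1}}{a_{L-2}}\right) \tag{A.5a} \\
\frac{a_{L-1}}{a_{L-2}} \frac{a_{L-2}}{a_{L-3}} \cdots \frac{a_2}{a_1} &= n\left(1 - \frac{a_2}{a_1}\right) \tag{A.5b} \\
\frac{a_{L-1}}{a_{L-2}} \frac{a_{L-2}}{a_{L-3}} \cdots \frac{a_2}{a_1}\, a_1 &= n\left(1 - a_1\right) \tag{A.5c}
\end{align}
We need to solve for $a_{L-1}$ to determine $I_{M,L}$. From Equation A.5, we can iteratively
arrive at the solution $a_{L-1} = \frac{n}{n+L-1}$. Thus,
$I_{M,L} = \frac{n}{n+L-1}\, I_S\, e^{\frac{V_{in} - V_{tn}}{n V_{th}}}$.
Now that we have solved for the current as each node discharges, we can calculate
the total delay for all nodes to discharge:
\begin{align}
t_{tot} &= \sum_{i=1}^{N} t_i \tag{A.6} \\
&= \sum_{i=1}^{N} \frac{C_i}{I_{M,i}} \left(V_{DD} - V_{i,low}\right) \tag{A.7}
\end{align}
A good approximation for $V_{i,low}$ is the source voltage of the top-most 'on' transistor
(i.e. $V_{L-1}$). Thus:
\begin{equation}
V_{i,low} = V_{L-1} \big|_{L=i} \tag{A.8}
\end{equation}
As $a_{L-1} = e^{-\frac{V_{L-1}}{n V_{th}}} = \frac{n}{n+L-1}$ we obtain:
\begin{equation}
V_{i,low} = n V_{th} \ln\left(\frac{n+i-1}{n}\right) \tag{A.9}
\end{equation}
This can be substituted back into equation (A.6) to obtain an expression for the
total propagation delay:
\begin{equation}
t_{tot} = \sum_{i=1}^{N} \frac{C_i}{n I_S} \left(n+i-1\right) e^{-\frac{V_{in} - V_{tn}}{n V_{th}}} \left(V_{DD} - n V_{th} \ln\left(\frac{n+i-1}{n}\right)\right) \tag{A.10}
\end{equation}
Figure A-1: Propagation delay versus number of stacked NMOS devices for ODE
simulation and mathematical approximation given in Equation A.10.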
Figure A-1 presents data based on this expression and compares it to ODE simulation
results. Equation A.10 closely matches the ODE simulation and can also be
accurately represented by a second-order equation. Thus, a quadratic relationship
exists between the amount of device stacking and the propagation delay in the sub-
threshold regime. The system was resimulated taking into account the body effect,
and results were found to be consistent.
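The trend described above can be reproduced numerically. The sketch below evaluates Equation A.10, read here as a sum over nodes of C_i (V_DD - V_i,low) / I_M,i with I_M,i = n/(n+i-1) I_S exp((V_in - V_tn)/(n V_th)) and V_i,low = n V_th ln((n+i-1)/n). All device parameters are illustrative assumptions rather than values from the thesis, and every node is assumed to have the same capacitance.

```python
import math


def stack_delay(num_devices, c_node=1e-15, i_s=1e-7, n=1.5, v_th=0.0259,
                v_in=0.3, v_tn=0.4, v_dd=0.3):
    """Total discharge delay of a stack of NMOS devices, per Equation A.10.

    All default parameter values are assumed for illustration only.
    """
    total = 0.0
    for i in range(1, num_devices + 1):
        scale = n / (n + i - 1)                      # a_{i-1} from the appendix
        current = scale * i_s * math.exp((v_in - v_tn) / (n * v_th))
        v_low = n * v_th * math.log((n + i - 1) / n)  # Equation A.9
        total += c_node * (v_dd - v_low) / current
    return total


if __name__ == "__main__":
    for count in range(1, 11):
        print("%2d devices: %.3e s" % (count, stack_delay(count)))
```

Because the per-node term grows roughly linearly with the node index, the running sum grows roughly quadratically with the number of stacked devices, in line with the conclusion above.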
Appendix B
Acronyms
ADC analog-to-digital converter
AGC automatic gain control
ASK amplitude-shift keying
AWG arbitrary waveform generator
BAN body area network
BER bit error rate
BPSK binary phase-shift keying
CDF cumulative distribution function
CDS correlated double sampling
CS-CG common-source, common-gate
CMOS Complementary Metal Oxide Semiconductor
CMRR common-mode rejection ratio
CMFB common-mode feedback
CNS central nervous system
CPW coplanar waveguide
CSS chirp spread spectrum
DSSS direct sequence spread spectrum
DAC digital-to-analog converter
DAP die attach pad
DCO digitally-controlled oscillator
DFN dual flat no-lead
DLL delay-locked loop
DNL differential non-linearity
DSP digital signal processor
ENOB effective number of bits
EMG electromyogram
ESD electrostatic discharge
FCC Federal Communications Commission
FLL frequency-locked loop
FM frequency modulation
FoM figure of merit
FPGA field programmable gate array
FSK frequency-shift keying
IIR infinite-impulse response
INL integral non-linearity
IR-UWB impulse radio ultra-wideband
I/O input/output
LCD liquid crystal display
LDO low drop-out
LFSR linear feedback shift register
LNA low noise amplifier
LSB least significant bit
MEMS microelectromechanical systems
MICS Medical Implant Communications Service
MISO master-input slave-output
MOSI master-output slave-input
MOM metal-oxide-metal
ODE ordinary differential equation
OLPC One Laptop per Child
OTA operational transconductance amplifier
OOK on-off keying
PA power amplifier
PCB printed circuit board
PLL phase locked loop
PPM pulse-position modulation
PRF pulse repetition frequency
PHY physical layer
PWL piecewise-linear
PWM pulsewidth modulation
QFN quad flat no-lead
RDF random dopant fluctuation
RF radio-frequency
SAR successive approximation register
SDR software-defined radio
SMA sub-miniature A
SNDR signal-to-noise-plus-distortion ratio
SNR signal-to-noise ratio
SPI serial peripheral interface
SFD start of frame delimiter
SoC system-on-chip
USB Universal Serial Bus
UWB ultra-wideband
WPAN wireless personal area network
VSG vector signal generator
Bibliography
[1] MOSIS, "Wafer Electrical Test Data and SPICE Model Pa-
rameters for Classroom Instructional Purposes." [Online]. Available:
http:// www.imosis.com/ T(: chnmical/ T( stdata/mie l nu-testda ta mep.html
[2] Arizona State University, "Predictive Technology Model." [Online]. Available:
http://www.eas.asu.edu/~ptm
[3] W. Zhao and Y. Cao, "New Generation of Predictive Technology Model for
Sub-45 nm Early Design Exploration," IEEE Transactions on Electron Devices,
vol. 53, no. 11, pp. 2816-2823, Nov. 2006.
[4] F. Lee and A. Chandrakasan, "A 2.5nJ/b 0.65V 3-to-5GHz Subbanded UWB
Receiver in 90nm CMOS," in IEEE Int. Solid-State Circuits Conf. Dig. Tech.
Papers, 2007, pp. 116-590.
[5] J. S. C. Kilby, "Turning Potential into Realities: The Invention of the Integrated
Circuit (Nobel Lecture)," ChemPhysChem, vol. 2, no. 8-9, pp. 482-489, Aug.
2001.
[6] K. B. Hamrick, "The History of the Hand-Held Electronic Calculator," The
American Mathematical Monthly, vol. 103, no. 8, pp. 633-639, Oct. 1996.
[7] F. M. Wanlass, Low Stand-by Power Complementary Field Effect Circuitry,
June 1963, United States Patent #3,356,858.
[8] R. Daniels and R. Burgess, "The electronic wristwatch: An application for Si-
gate CMOS ICs," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers,
vol. XIV, Feb. 1971, pp. 62-63.
[9] Y. Suzuki, M. Hirasawa, and K. Odagawa, "Clocked CMOS calculator cir-
cuitry," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, vol. XVI,
Feb. 1973, pp. 58-59.
[10] L. Wong, S. Hossain, A. Ta, J. Edvinsson, D. Rivas, and H. Naas, "A very low-
power CMOS mixed-signal IC for implantable pacemaker applications," IEEE
Journal of Solid-State Circuits, vol. 39, no. 12, pp. 2446-2456, Dec. 2004.
[11] A. Wong, G. Kathiresan, C. Chan, O. Eljamaly, O. Omeni, D. McDonagh,
A. Burdett, and C. Toumazou, "A 1 V Wireless Transceiver for an Ultra-
Low-Power SoC for Biotelemetry Applications," IEEE Journal of Solid-State
Circuits, vol. 43, no. 7, pp. 1511-1521, 2008.
[12] "Sensium Life Pebble." [Online]. Available:
http.Y / /ww. toum az. /pubic/page.phppage-se nsimpbt)chble
[13] Nokia, "Nokia Bluetooth Headset BH-216: Data Sheet," 2009.
[Online]. Available: http://wwxvw.nokiausa.comiNOIKIAUSA_ _64/Landing_
pages/ctia/pressrooflm/pdf/Nokia Blueto oth%2( IIeadset _BH-2 16_Product_
SheetFINAL.pdf
[14] O. Silven and K. Jyrkki, "Observations on Power-Efficiency Trends in Mobile
Communication Devices," EURASIP Journal on Embedded Systems, no. 56976,
2007, 10.1155/2007/56976.
[15] R. M. Russell, "The CRAY-1 computer system," Commun. ACM, vol. 21, no. 1,
pp. 63-72, 1978.
[16] Strategic Test Corporation, "TRITON-320 PXA320 module." [Online].
Available: -ihttp:// www.strategic-e :.mbed.comi/dataslleetspxa32O)pxa270
pxa255/TRITON-320 Datasheet.pdf
[17] N. Verma, J. Kwong, and A. Chandrakasan, "Nanometer MOSFET Variation
in Minimum Energy Subthreshold Circuits," IEEE Transactions on Electron
Devices, vol. 55, no. 1, pp. 163-174, Jan. 2008.
[18] R. Walden, "Analog-to-digital converter survey and analysis," IEEE Journal
on Selected Areas in Communications, vol. 17, no. 4, pp. 539-550, Apr. 1999.
[19] M. van Elzakker, E. van Tuijl, P. Geraedts, D. Schinkel, E. Klumperink, and
B. Nauta, "A 1.9µW, 4.4fJ/Conversion-step 10b 1MS/s Charge-Redistribution
ADC," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, 2008, pp.
244-245.
[20] H.-S. Lee and C. Sodini, "Analog-to-Digital Converters: Digitizing the Analog
World," Proceedings of the IEEE, vol. 96, no. 2, pp. 323-334, Feb. 2008.
[21] N. Verma and A. Chandrakasan, "A 256 kb 65 nm 8T Subthreshold SRAM Em-
ploying Sense-Amplifier Redundancy," IEEE Journal of Solid-State Circuits,
vol. 43, no. 1, pp. 141-149, Jan. 2008.
[22] B. P. Ginsburg and A. P. Chandrakasan, "Highly Interleaved 5b 250MS/s ADC
with Redundant Channels in 65nm CMOS," in IEEE Int. Solid-State Circuits
Conf. Dig. Tech. Papers, 2008, pp. 240-241.
[23] M. Flynn, C. Donovan, and L. Sattler, "Digital calibration incorporating re-
dundancy of flash ADCs," IEEE Transactions on Circuits and Systems-Part
II: Analog and Digital Signal Processing, vol. 50, no. 5, pp. 205-213, 2003.
[24] J. Shen and P. R. Kinget, "A 0.5-V 8-bit 10-MS/s Pipelined ADC in 90-nm
CMOS," IEEE Journal of Solid-State Circuits, vol. 43, no. 4, pp. 787-795, Apr.
2008.
[25] S. Gambini and J. Rabaey, "Low-Power Successive Approximation Converter
With 0.5 V Supply in 90 nm CMOS," IEEE Journal of Solid-State Circuits,
vol. 42, no. 11, pp. 2348-2356, Nov. 2007.
[26] R. Bagheri, A. Mirzaei, S. Chehrazi, M. Heidari, M. Lee, M. Mikhemar,
W. Tang, and A. Abidi, "An 800MHz to 5GHz Software-Defined Radio Re-
ceiver in 90nm CMOS," in IEEE Int. Solid-State Circuits Conf. Dig. Tech.
Papers, 2006, pp. 1932-1941.
[27] D. C. Daly and A. P. Chandrakasan, "An Energy-Efficient OOK Transceiver
for Wireless Sensor Networks," IEEE Journal of Solid-State Circuits, vol. 42,
no. 5, pp. 1003-1011, May 2007.
[28] D. Wentzloff, F. Lee, D. Daly, M. Bhardwaj, P. Mercier, and A. Chandrakasan,
"Energy Efficient Pulsed-UWB CMOS Circuits and Systems," in IEEE Inter-
national Conference on Ultra-Wideband, Sept. 2007, pp. 282-287.
[29] D. Wentzloff and A. Chandrakasan, "A 47pJ/pulse 3.1-to-5GHz All-Digital
UWB Transmitter in 90nm CMOS," in IEEE Int. Solid-State Circuits Conf.
Dig. Tech. Papers, Feb. 2007, pp. 118-591.
[30] D. C. Daly and A. P. Chandrakasan, "A 6-bit, 0.2V to 0.9V Highly Digital
Flash ADC with Comparator Redundancy," in IEEE Int. Solid-State Circuits
Conf. Dig. Tech. Papers, Feb. 2008, pp. 554-555.
[31] J. Ryckaert, G. Van der Plas, V. De Heyn, C. Desset, B. Van Poucke, and
J. Craninckx, "A 0.65-to-1.4 nJ/Burst 3-to-10 GHz UWB All-Digital TX in 90
nm CMOS for IEEE 802.15.4a," IEEE Journal of Solid-State Circuits, vol. 42,
no. 12, pp. 2860-2869, 2007.
[32] M. Cavallaro, A. Italia, G. Sapone, and G. Palmisano, "A 3-5 GHz low-
complexity ultra-wideband CMOS RF front-end for low data-rate WPANs,"
in ESSCIRC, Sept. 2008, pp. 438-441.
[33] P. P. Mercier, D. C. Daly, and A. P. Chandrakasan, "A 19pJ/pulse UWB Trans-
mitter with Dual Capacitively-Coupled Digital Power Amplifiers," in IEEE Ra-
dio Frequency Integrated Circuits Symposium, June 2008.
[34] D. C. Daly, P. P. Mercier, M. Bhardwaj, A. L. Stone, J. Voldman, R. B. Levine,
J. G. Hildebrand, and A. P. Chandrakasan, "A Pulsed UWB Receiver SoC for
Insect Motion Control," in IEEE Int. Solid-State Circuits Conf. Dig. Tech.
Papers, Feb. 2009, pp. 200-201.
[35] P. P. Mercier, M. Bhardwaj, D. C. Daly, and A. P. Chandrakasan, "A 0.55V
16Mb/s 1.6mW Non-Coherent IR-UWB Digital Baseband with ±1ns Synchronization
Accuracy," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers,
Feb. 2009, pp. 252-253.
[36] U. Wismar, D. Wisland, and P. Andreani, "A 0.2V 0.44 µW 20 kHz Analog to
Digital ΣΔ Modulator with 57 fJ/conversion FoM," in ESSCIRC, Sept. 2006,
pp. 187-190.
[37] A. Wang and A. Chandrakasan, "A 180mV FFT processor using subthreshold
circuit techniques," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers,
2004, pp. 292-529 Vol.1.
[38] K. Uyttenhove and M. Steyaert, "Speed-power-accuracy trade-off
in high-speed CMOS ADCs," IEEE Transactions on Circuits and Systems-
Part II: Analog and Digital Signal Processing, vol. 49, no. 4, pp. 280-287, 2002.
[39] H.-S. Lee, D. Hodges, and P. Gray, "A self-calibrating 15 bit CMOS A/D con-
verter," IEEE Journal of Solid-State Circuits, vol. 19, no. 6, pp. 813-819, 1984.
[40] C.-Y. Chen, M. Le, and K. Y. Kim, "A Low Power 6-bit Flash ADC With
Reference Voltage and Common-Mode Calibration," IEEE Journal of Solid-
State Circuits, vol. 44, no. 4, pp. 1041-1046, Apr. 2009.
[41] C. Enz and G. Temes, "Circuit techniques for reducing the effects of op-amp
imperfections: autozeroing, correlated double sampling, and chopper stabiliza-
tion," Proceedings of the IEEE, vol. 84, no. 11, pp. 1584-1614, 1996.
[42] "IEEE standard for terminology and test methods for analog-to-digital convert-
ers," IEEE Std 1241-2000, 2001.
[43] S. Weaver, B. Hershberg, D. Knierim, and U.-K. Moon, "A 6b Stochastic Flash
Analog-to-Digital Converter without Calibration or Reference Ladder," in IEEE
Asian Solid-State Circuits Conf., 2008.
[44] G.-C. Ahn, D.-Y. Chang, M. Brown, N. Ozaki, H. Youra, K. Yamamura,
K. Hamashita, K. Takasuka, G. Temes, and U.-K. Moon, "A 0.6-V 82-dB delta-
sigma audio ADC using switched-RC integrators," IEEE Journal of Solid-State
Circuits, vol. 40, no. 12, pp. 2398-2407, Dec. 2005.
[45] M. Dessouky and A. Kaiser, "Very low-voltage digital-audio ΔΣ modulator
with 88-dB dynamic range using local switch bootstrapping," IEEE Journal of
Solid-State Circuits, vol. 36, no. 3, pp. 349-355, Mar. 2001.
[46] S. Narendra, S. Borkar, V. De, D. Antoniadis, and A. Chandrakasan, "Scaling
of stack effect and its application for leakage reduction," in IEEE International
Symposium on Low Power Electronics and Design, Aug. 2001, pp. 195-200.
[47] M. Matsui, H. Hara, Y. Uetani, L.-S. Kim, T. Nagamatsu, Y. Watanabe,
A. Chiba, K. Matsuda, and T. Sakurai, "A 200 MHz 13 mm² 2-D DCT macrocell
using sense-amplifying pipeline flip-flop scheme," IEEE Journal of Solid-State
Circuits, vol. 29, no. 12, pp. 1482-1490, Dec. 1994.
[48] G. Van der Plas, S. Decoutere, and S. Donnay, "A 0.16pJ/Conversion-Step
2.5mW 1.25GS/s 4b ADC in a 90nm Digital CMOS Process," in IEEE Int.
Solid-State Circuits Conf. Dig. Tech. Papers, 2006, pp. 2310-2311.
[49] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits,
C. G. Sodini, Ed. Pearson Education, Inc., 2003.
[50] Y. Tsividis and R. Ulmer, "A CMOS voltage reference," IEEE Journal of Solid-
State Circuits, vol. 13, no. 6, pp. 774-778, Dec. 1978.
[51] J. Doernberg, H.-S. Lee, and D. Hodges, "Full-speed testing of A/D converters,"
IEEE Journal of Solid-State Circuits, vol. 19, no. 6, pp. 820-827, Dec. 1984.
[52] Federal Communications Commission, "FCC 98-208," August 1998.
[53] Federal Communications Commission, "FCC 02-48," February 2002.
[54] W. Hirt, "The European UWB Radio Regulatory and Standards Frame-
work: Overview and Implications," in IEEE International Conference on Ultra-
Wideband, Sept. 2007, pp. 733-738.
[55] ECMA International, "ECMA-368: High data Rate Ultra Wideband PHY and
MAC Standard," Tech. Rep., Dec. 2007.
[56] P. Mannion, "UWB startup TZero shuts down," EETimes, Feb. 2009.
[57] "IEEE Standard for Information Technology - Telecommunications and infor-
mation exchange between systems - Local and metropolitan area networks - spe-
cific requirement Part 15.4: Wireless Medium Access Control (MAC) and Phys-
ical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks
(WPANs)," IEEE Std 802.15.4a-2007 (Amendment to IEEE Std 802.15.4-
2006), pp. 1-203, 2007.
[58] J. G. Proakis, Digital Communications, 4th ed. McGraw-Hill, 2001.
[59] I. O'Donnell and R. Brodersen, "An ultra-wideband transceiver architecture
for low power, low rate, wireless systems," IEEE Transactions on Vehicular
Technology, vol. 54, no. 5, pp. 1623-1631, Sept. 2005.
[60] M. Verhelst, N. van Helleputte, G. Gielen, and W. Dehaene, "A Reconfigurable,
0.13µm CMOS 110pJ/pulse, Fully Integrated IR-UWB Receiver for Communication
and Sub-cm Ranging," in IEEE Int. Solid-State Circuits Conf. Dig.
Tech. Papers, Feb. 2009, pp. 251-252.
[61] L. Stoica, A. Rabbachin, H. Repo, T. Tiuraniemi, and I. Oppermann, "An ul-
trawideband system architecture for tag based wireless sensor networks," IEEE
Transactions on Vehicular Technology, vol. 54, no. 5, pp. 1632-1645, Sept. 2005.
[62] T.-A. Phan, V. Krizhanovskii, and S.-G. Lee, "Low-Power CMOS Energy De-
tection Transceiver for UWB Impulse Radio System," Sept. 2007, pp. 675-678.
[63] Y. Gao, Y. Zheng, and C.-H. Heng, "Low-Power CMOS RF front-end for non-
coherent IR-UWB receiver," in ESSCIRC, Sept. 2008, pp. 386-389.
[64] M. Anis, R. Tielert, and N. Wehn, "Super-regenerative UWB impulse detector
with synchronized quenching mechanism," in ESSCIRC, Sept. 2008, pp. 390-
393.
[65] P. Thoppay, C. Dehollain, and M. Declercq, "A 7.5mA 500 MHz UWB receiver
based on super-regenerative principle," in ESSCIRC, Sept. 2008, pp. 382-385.
[66] J. Ryckaert, M. Badaroglu, V. De Heyn, G. Van der Plas, P. Nuzzo, A. Baschi-
rotto, S. D'Amico, C. Desset, H. Suys, M. Libois, B. Van Poucke, P. Wambacq,
and B. Gyselinckx, "A 16mA UWB 3-to-5GHz 20Mpulses/s Quadrature Analog
Correlation Receiver in 0.18µm CMOS," in IEEE Int. Solid-State Circuits
Conf. Dig. Tech. Papers, 2006, pp. 368-377.
[67] F. Zhang, R. Gharpurey, and P. Kinget, "A 3.1-9.5 GHz agile UWB pulse radio
receiver with discrete-time wideband-IF correlation in 90nm CMOS," in IEEE
Radio Frequency Integrated Circuits Symposium, June 2008, pp. 31-34.
[68] Y. Zheng, M. Annamalai Arasu, K.-W. Wong, Y. J. The, A. Suan, D. D. Tran,
W. G. Yeoh, and D.-L. Kwong, "A 0.18µm CMOS 802.15.4a UWB Transceiver
for Communication and Localization," in IEEE Int. Solid-State Circuits Conf.
Dig. Tech. Papers, Feb. 2008, pp. 118-600.
[69] G. Cusmai, M. Brandolini, P. Rossi, and F. Svelto, "An interference robust
0.18µm CMOS 3.1-8GHz receiver front-end for UWB radio," Sept. 2005, pp.
157-160.
[70] Y. Zheng, Y. Tong, C. W. Ang, Y.-P. Xu, W. G. Yeoh, F. Lin, and R. Singh,
"A CMOS carrier-less UWB transceiver for WPAN applications," Feb. 2006,
pp. 378-387.
[71] T. Terada, S. Yoshizumi, M. Muqsith, Y. Sanada, and T. Kuroda, "A CMOS
ultra-wideband impulse radio transceiver for 1Mb/s data communications and
±2.5cm range findings," IEEE Journal of Solid-State Circuits, vol. 41, pp. 891-
898, Apr. 2006.
[72] H. Kim, D. Park, and Y. Joo, "All-digital low-power CMOS pulse generator for
UWB system," IEE Electron. Lett., pp. 1534-1535, Nov. 2004.
[73] L. Smaini, C. Tinella, D. Helal, C. Stoecklin, L. Chabert, C. Devaucelle, R. Cat-
tenoz, N. Rinaldi, and D. Belot, "Single-chip CMOS pulse generator for UWB
systems," IEEE Journal of Solid-State Circuits, vol. 41, pp. 1551-1561, July
2006.
[74] W. Kluge, F. Poegel, H. Roller, M. Lange, T. Ferchland, L. Dathe, and D. Eg-
gert, "A Fully Integrated 2.4GHz IEEE 802.15.4 Compliant Transceiver for Zig-
Bee Applications," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers,
Feb. 2006, pp. 1470-1479.
[75] B. Marholev, M. Pan, E. Chien, L. Zhang, R. Roufoogaran, S. Wu, I. Bhatti,
T.-H. Lin, M. Kappes, S. Khorram, S. Anand, A. Zolfaghari, J. Castaneda,
C. Chien, B. Ibrahim, H. Jensen, H. Kim, P. Lettieri, S. Mak, J. Lin, Y. Wong,
R. Lee, M. Syed, M. Rofougaran, and A. Rofougaran, "A Single-Chip Bluetooth
EDR Device in 0.13µm CMOS," in IEEE Int. Solid-State Circuits Conf. Dig.
Tech. Papers, Feb. 2007, pp. 558-759.
[76] R. Min, M. Bhardwaj, S.-H. Cho, N. Ickes, E. Shih, A. Sinha, A. Wang, and
A. Chandrakasan, "Energy-centric enabling technologies for wireless sensor net-
works," IEEE Wireless Communications, vol. 9, no. 4, pp. 28-39, Aug. 2002.
[77] J. Ammer and J. Rabaey, "The Energy-per-Useful-Bit Metric for Evaluating
and Optimizing Sensor Network Physical Layers," in Annual IEEE Communi-
cations Society on Sensor and Ad Hoc Communications and Networks, vol. 2,
Sept. 2006, pp. 695-700.
[78] S. Cho and A. Chandrakasan, "A 6.5-GHz energy-efficient BFSK modulator for
wireless sensor applications," IEEE Journal of Solid-State Circuits, vol. 39,
no. 5, pp. 731-739, May 2004.
[79] D. D. Wentzloff and A. P. Chandrakasan, "Gaussian pulse generators for sub-
banded ultra-wideband transmitters," IEEE Transactions on Microwave The-
ory and Techniques, vol. 54, pp. 1647-1655, June 2006.
[80] T. Norimatsu, R. Fujiwara, M. Kokubo, M. Miyazaki, A. Maeki, Y. Ogata,
S. Kobayashi, N. Koshizuka, and K. Sakamura, "A UWB-IR transmitter with
digitally controlled pulse generator," IEEE Journal of Solid-State Circuits,
vol. 42, pp. 1300-1309, June 2007.
[81] B. Otis, Y. Chee, and J. Rabaey, "A 400 µW-RX, 1.6mW-TX super-regenerative
transceiver for wireless sensor networks," in IEEE Int. Solid-State Circuits Conf.
Dig. Tech. Papers, Feb. 2005, pp. 396-606 Vol. 1.
[82] B. Razavi, K. Lee, and R.-H. Yan, "A 13.4-GHz CMOS frequency divider,"
Feb. 1994, pp. 176-177.
[83] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. Cam-
bridge University Press, 2004.
[84] D. A. Johns and K. Martin, Analog Integrated Circuit Design. John Wiley &
Sons, 1997.
[85] J. Ryckaert, M. Verhelst, M. Badaroglu, S. D'Amico, V. De Heyn, C. Des-
set, P. Nuzzo, B. Van Poucke, P. Wambacq, A. Baschirotto, W. Dehaene, and
G. Van der Plas, "A CMOS Ultra-Wideband Receiver for Low Data-Rate Com-
munication," IEEE Journal of Solid-State Circuits, vol. 42, no. 11, pp. 2515-
2527, Nov. 2007.
[86] "ABM10: Ceramic SMD Ultra Miniature Quartz Crystal," Abracon, Tech.
Rep., 2008.
[87] E. Vittoz, M. Degrauwe, and S. Bitz, "High-performance crystal oscillator cir-
cuits: theory and application," IEEE Journal of Solid-State Circuits, vol. 23,
no. 3, pp. 774-783, June 1988.
[88] A.-S. Porret, T. Melly, D. Python, C. Enz, and E. Vittoz, "An ultralow-power
UHF transceiver integrated in a standard digital CMOS process: architecture
and receiver," IEEE Journal of Solid-State Circuits, vol. 36, no. 3, pp. 452-466,
Mar. 2001.
[89] P. Choi, H. Park, I. Nam, K. Kang, Y. Ku, S. Shin, S. Park, T. Kim, H. Choi,
S. Kim, S. M. Park, M. Kim, S. Park, and K. Lee, "An experimental coin-sized
radio for extremely low power WPAN (IEEE802.15.4) application at 2.4GHz,"
in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2003, pp. 92-
480 vol.1.
[90] A. Emira, A. Valdes-Garcia, B. Xia, A. Mohieldin, A. Valero-Lopez, S. Moon,
C. Xin, and E. Sanchez-Sinencio, "A dual-mode 802.11b/Bluetooth receiver in
0.25µm BiCMOS," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers,
Feb. 2004, pp. 270-527 Vol.1.
[91] H. Darabi, S. Khorram, Z. Zhou, T. Li, B. Marholev, J. Chiu, J. Castaneda,
E. Chien, S. Anand, S. Wu, M. Pan, R. Roufoogaran, H. Kim, P. Lettieri,
B. Ibrahim, J. Rael, L. Tran, E. Geronaga, H. Yeh, T. Frost, J. Trachewsky,
and A. Rofougaran, "A fully integrated SoC for 802.11b in 0.18 µm CMOS," in
IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2005, pp. 96-586
Vol. 1.
[92] J.-Y. Chen, M. Flynn, and J. Hayes, "A Fully Integrated Auto-Calibrated
Super-Regenerative Receiver," in IEEE Int. Solid-State Circuits Conf. Dig.
Tech. Papers, Feb. 2006, pp. 1490-1499.
[93] N. Pletcher, S. Gambini, and J. Rabaey, "A 2GHz 52µW Wake-Up Receiver
with -72dBm Sensitivity Using Uncertain-IF Architecture," in IEEE Int. Solid-
State Circuits Conf. Dig. Tech. Papers, Feb. 2008, pp. 524-633.
[94] D. Weber, W. Si, S. Abdollahi-Alibeik, M. Lee, R. Chang, H. Dogan, S. Luschas,
and P. Husted, "A Single-Chip CMOS Radio SoC for v2.1 Bluetooth Applica-
tions," in IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers, Feb. 2008,
pp. 364-620.
[95] J. Bohorquez, J. Dawson, and A. Chandrakasan, "A 350µW CMOS MSK transmitter
and 400µW OOK super-regenerative receiver for Medical Implant Communications,"
in Symp. VLSI Circuits Dig. Tech. Papers, Feb. 2008, pp. 32-33.
[96] G. Retz, H. Shanan, K. Mulvaney, S. O'Mahony, M. Chanca, P. Crowley,
C. Billon, K. Khan, and P. Quinlan, "A Highly Integrated Low-Power 2.4GHz
Transceiver Using a Direct-Conversion Diversity Receiver in 0.18µm CMOS
for IEEE802.15.4 WPAN," in IEEE Int. Solid-State Circuits Conf. Dig. Tech.
Papers, Feb. 2009, pp. 414-415.
[97] H. H. Liang, "A High Speed Image Transmission System for Ultra-Wideband
Wireless Links," Master of Engineering Thesis, Massachusetts Institute of Tech-
nology, June 2009.
[98] J. A. Osmundsen, "Matador with a radio stops wired bull," The New York
Times, May 1965.
[99] V. L. Gott, "Critical Role of Physiologist John A. Johnson in the Origins of Min-
nesota's Billion Dollar Pacemaker Industry," The Annals of Thoracic Surgery,
vol. 83, no. 1, pp. 349-353, Jan. 2007.
[100] R. Holzer and I. Shimoyama, "Locomotion control of a bio-robotic system via
electric stimulation," in Intelligent Robots and Systems, 1997. IROS '97., Pro-
ceedings of the 1997 IEEE/RSJ International Conference on, vol. 3, Sept. 1997,
pp. 1514-1519 vol.3.
[101] W. Kutsch, G. Schwarz, H. Fischer, and H. Kautz, "Wireless Transmission of
Muscle Potentials During Free Flight of a Locust," Journal of Experimental
Biology, vol. 185, no. 1, pp. 367-373, 1993.
[102] P. Mohseni, K. Nagarajan, B. Ziaie, K. Najafi, and S. Crary, "An ultralight
biotelemetry backpack for recording EMG signals in moths," IEEE Transactions
on Biomedical Engineering, vol. 48, no. 6, pp. 734-737, June 2001.
[103] N. Ando, I. Shimoyama, and R. Kanzaki, "A dual-channel FM transmitter for
acquisition of flight muscle activities from the freely flying hawkmoth, Agrius
convolvuli," Journal of Neuroscience Methods, vol. 115, no. 2, pp. 181-187, Apr.
2002.
[104] P. Mohseni, K. Najafi, S. Eliades, and X. Wang, "Wireless multichannel biopo-
tential recording using an integrated FM telemetry circuit," IEEE Transactions
on Neural Systems and Rehabilitation Engineering, vol. 13, no. 3, pp. 263-271,
Sept. 2005.
[105] A. Nieder, "Miniature stereo radio transmitter for simultaneous recording of
multiple single-neuron signals from behaving owls," Journal of Neuroscience
Methods, vol. 101, no. 2, pp. 157-164, Sept. 2000.
[106] S. Takeuchi and I. Shimoyama, "A radio-telemetry system with a shape mem-
ory alloy microelectrode for neural recording of freely moving insects," IEEE
Transactions on Biomedical Engineering, vol. 51, no. 1, pp. 133-137, Jan. 2004.
[107] J. Parramon, P. Doguet, D. Marin, M. Verleyssen, R. Munoz, L. Leija, and
E. Valderrama, "ASIC-based batteryless implantable telemetry microsystem
for recording purposes," in Engineering in Medicine and Biology Society, 1997.
Proceedings of the 19th Annual International Conference of the IEEE, vol. 5,
1997, pp. 2225-2228 vol.5.
[108] P. Irazoqui-Pastor, I. Mody, and J. Judy, "In-vivo EEG recording using a wire-
less implantable neural transceiver," in Neural Engineering, 2003. Conference
Proceedings. First International IEEE EMBS Conference on, Mar. 2003, pp.
622-625.
[109] J. Mavoori, B. Millard, J. Longnion, T. Daniel, and C. Diorio, "A miniature
implantable computer for functional electrical stimulation and recording of neu-
romuscular activity," in Biomedical Circuits and Systems, 2004 IEEE Interna-
tional Workshop on, 2004, pp. S1/7/INV-S1/13-16.
[110] A. Bozkurt, R. Gilmour, D. Stern, and A. Lal, "MEMS based Bioelectronic
Neuromuscular Interfaces for Insect Cyborg Flight Control," in 21st IEEE In-
ternational Conference on Micro Electro Mechanical Systems (MEMS 2008),
Jan. 2008, pp. 160-163.
[111] H. Sato, C. W. Berry, B. E. Casey, G. Lavella, Y. Yao, J. M. VandenBrooks,
and M. M. Maharbiz, "A Cyborg Beetle: Insect Flight Control Through An
Implantable, Tetherless Microsystem," in 21st IEEE International Conference
on Micro Electro Mechanical Systems (MEMS 2008), Jan. 2008, pp. 164-167.
[112] W. M. Tsang, Z. Aldworth, A. Stone, A. Permar, R. Levine, J. G. Hildebrand,
T. Daniel, A. I. Akinwande, and J. Voldman, "Insect Flight Control by Neural
Stimulation of Pupae-Implanted Flexible Multisite Electrodes," in The Twelfth
International Conference on Miniaturized Systems for Chemistry and Life Sciences
(µTAS 2008), Oct. 2008.
[113] S. Loudon, Z. Aldworth, and T. L. Daniel, "Perturbing flight paths in Lepi-
doptera by inducing abdominal flexion," in SICB, Jan. 2009.
[114] R. R. Harrison, P. T. Watkins, R. J. Kier, R. O. Lovejoy, D. J. Black,
B. Greger, and F. Solzbacher, "A Low-Power Integrated Circuit for a Wireless
100-Electrode Neural Recording System," IEEE Journal of Solid-State Circuits,
vol. 42, no. 1, pp. 123-133, Jan. 2007.
[115] K. Gulati and H.-S. Lee, "A low-power reconfigurable analog-to-digital con-
verter," IEEE Journal of Solid-State Circuits, vol. 36, no. 12, pp. 1900-1911,
Dec. 2001.
[116] N. Verma and A. Chandrakasan, "An Ultra Low Energy 12-bit Rate-Resolution
Scalable SAR ADC for Wireless Sensor Nodes," IEEE Journal of Solid-State
Circuits, vol. 42, no. 6, pp. 1196-1205, Dec. 2007.
[117] B. Murmann, "A/D converter trends: Power dissipation, scaling and digitally
assisted architectures," Sept. 2008, pp. 105-112.
[118] S. Adee, "Cyborg Moth Gets a New Radio," IEEE Spectrum Online, Feb.
2009. [Online]. Available: http://www.spectrum.ieee.org/feb09/7747
