New Frontiers In Ultra-Fast Cmos Electronics: High Power Tunable Terahertz Sources Based On Coupled Oscillators And Beyond by Mesgarpour Tousi, Yahya
NEW FRONTIERS IN ULTRA-FAST CMOS
ELECTRONICS:
HIGH POWER TUNABLE TERAHERTZ SOURCES
BASED ON COUPLED OSCILLATORS AND BEYOND
A Dissertation
Presented to the Faculty of the Graduate School
of Cornell University
in Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
by
Yahya Mesgarpour Tousi
August 2012
© 2012 Yahya Mesgarpour Tousi
ALL RIGHTS RESERVED
NEW FRONTIERS IN ULTRA-FAST CMOS ELECTRONICS:
HIGH POWER TUNABLE TERAHERTZ SOURCES BASED ON COUPLED
OSCILLATORS AND BEYOND
Yahya Mesgarpour Tousi, Ph.D.
Cornell University 2012
The terahertz frequency spectrum is gaining increasing interest due to its intriguing
applications in molecular spectroscopy, imaging, and sensing. This band, though
very useful, has been one of the hardest to access within the electromagnetic
spectrum because of the limitations of both optics and electronics. In this work, we
look into some of the most challenging electronic functions that will pave the way
toward the realization of an ultra-high speed integrated system in this band. Our
goal is to replace the expensive and bulky solutions in this frequency band with
cheap and versatile CMOS circuits.
As far as signal generation is concerned, one of the major bottlenecks for
terahertz electronics is the limited achievable power with conventional electronic
sources. To address this challenge, we propose signal generation solutions with
power levels that are significantly beyond the limit of conventional designs. In
order to do so, we investigate the general theory of nonlinear dynamics and seek
sophisticated oscillatory systems that are highly effective for high frequency elec-
tronic sources. We present prototypes of these terahertz sources which show 4
orders of magnitude improvement compared to previous designs. The presented
methods are highly scalable and can be applied to generate even higher output
power levels on the same process, or significantly higher power levels using com-
pound semiconductors.
The other side of any transmission system in the mm-wave and
sub-mm-wave frequency range is a wideband analog signal processing block. In the
second part of this work, we present novel solutions to challenges in wideband analog processing.
We present the Electrical Interferometer as an energy-efficient method for analog
signal detection above 10 GHz. This structure is based on wave interference in a
passive electrical medium and provides a high-speed, energy-efficient way to detect
and quantize a wideband signal. Furthermore, we propose a delay-line-based
data converter for GHz-range data conversion. This scalable structure exploits the
fine time resolution available in CMOS for low-power, high-speed data conversion.
BIOGRAPHICAL SKETCH
Yahya Mesgarpour Tousi received his B.S. degree in 2004 and his M.S. degree in
2006 both in Electrical Engineering from Sharif University of Technology, Tehran,
Iran. In 2007 he joined Cornell University to pursue the Ph.D. degree in Electrical
Engineering. During his graduate study, he has been interested in novel high-
speed integrated systems for communication, biomedical, and signal processing
applications.
Yahya is the recipient of the 2009 Cornell Jacob Fellowship and the 2011 IEEE
Microwave Theory and Techniques Society Graduate Fellowship. He is also the
winner of the Graduate Research Competition at IMS 2011, and the recipient of
the 2011-2012 IEEE Solid-State Circuits Society Pre-Doctoral Achievement Award.
To my wife and my parents
ACKNOWLEDGEMENTS
As much as a thesis represents the accomplishments of a researcher, it is a result
of collaborations, advice and support from many other individuals. During my
years as a graduate student at Cornell, I had the opportunity to work in an open
and collaborative environment and with some of the most talented and motivated
people I had ever met. Looking back I feel very lucky for knowing many of these
individuals, either professionally or personally and having the opportunity to work
with them.
First, I have to acknowledge my advisor Professor Ehsan Afshari. In the last
five years, I have had the chance to gain a great deal from his bright vision and
profound insight. During my graduate study, his high academic standards have
always inspired my research goals. I cannot be more satisfied with my decision to
join his research group at Cornell.
I am also grateful to the members of my graduate committee: Professor Alyssa
Apsel, Professor Alyosha Molnar, and Professor Rajit Manohar. They have been
a valuable resource for technical advice throughout my graduate study. I am also
thankful to several faculty members including Professors Sandip Tiwari, Edwin
Kan, Farhan Rana, Richard Shealy, and Steven Strogatz for all I have learned
from them through various conversations and lectures. I also appreciate the great
support I have received from the department staff, especially Scott Coldren, Daniel
Richter, Zhila Sadri, and Sue Bulkley.
I have been very fortunate to carry out my study in our research group alongside
some of the smartest and most creative graduate students. Thank
you Guansheng Li, Omeed Momeni, Wooram Lee, Muhammad Adnan, Georgios
Lilis, Ruonan Han, Vahnoud Pourahmad, and Hamidreza Aghasi for being sup-
portive colleagues and great friends.
During my time in Ithaca, I have met wonderful people and made friendships
that will remain as a valuable part of my experience. Thank you Amir Sattarzadeh,
Elham Bandezadeh, Hossein Jarrahi, Maryam Sharif, Monir Azarmnia, Hadi Hosseinzadegan,
and many others for making my graduate life full of pleasant moments.
My parents have always been a part of my success. They have been a constant
source of unconditional love and support and I can never be grateful enough for having
them. I can only wish to give back to others the same way they did to me.
Finally, I can only finish my appreciation by thanking my wife, Nafiseh. She
has provided me with unparalleled support during our many years together, and
has been key for making this whole accomplishment possible. This is all I can tell
her: Now I have the rest of our lives to give back for all your sacrifice and support.
In the end, I am grateful for the research opportunity I was offered at Cornell
and the fascinating research environment in which I have been immersed. Without
a doubt, my graduate life has been the most scientifically educational and enjoyable
time I have ever had. I owe this to the great people I have met along this path,
among faculty, staff, students, and friends.
TABLE OF CONTENTS
Biographical Sketch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Challenges in mm-wave and Terahertz Electronics 1
1.1 High-Speed Integrated Circuits . . . . . . . . . . . . . . . . . . . . 1
1.2 Electronics in the Terahertz Spectrum . . . . . . . . . . . . . . . . 3
1.3 Ultra-Fast Signal Processing . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Low-Power Signal Quantization . . . . . . . . . . . . . . . . . . . . 7
2 CMOS High-Power Terahertz Sources Based on Coupled Oscilla-
tors 9
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Theory of Coupled Oscillators . . . . . . . . . . . . . . . . . . . . . 11
2.2.1 Actively coupled oscillators . . . . . . . . . . . . . . . . . . 13
2.2.2 Stability Theory . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.3 Coupling Modes . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.4 Frequency Tuning . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3 Terahertz Harmonic VCO . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.1 Oscillator Design . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.2 Harmonic Power Generation . . . . . . . . . . . . . . . . . . 25
2.3.3 Coupling Circuit . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.1 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4.2 Measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3 Electrical Interferometer: A Novel High-Speed Quantizer 39
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2 Wave propagation and interference in lattices . . . . . . . . . . . . 41
3.2.1 Exponential tapering with constant L/C . . . . . . . . . . . 42
3.2.2 Interference pattern in 1-D LC ladders . . . . . . . . . . . . 46
3.2.3 Generalization to 2-D . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Interferometric quantization . . . . . . . . . . . . . . . . . . . . . . 50
3.3.1 The MOS Varactor . . . . . . . . . . . . . . . . . . . . . . . 51
3.3.2 2-D lattice quantization . . . . . . . . . . . . . . . . . . . . 51
3.3.3 The effect of tapering . . . . . . . . . . . . . . . . . . . . . . 56
3.4 Quantizer design and simulation . . . . . . . . . . . . . . . . . . . . 57
3.4.1 The 2-D lattice . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.4.2 Specifying the input-output characteristic . . . . . . . . . . 62
3.4.3 Detector and memory . . . . . . . . . . . . . . . . . . . . . 64
3.4.4 Design summary and comparison . . . . . . . . . . . . . . . 66
3.4.5 Effect of noise and phase mismatch . . . . . . . . . . . . . . 70
4 Delay-Based Analog-to-Digital Conversion: A Scalable Solution 73
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2 Delay-Line-Based Signal Quantization . . . . . . . . . . . . . . . . . 75
4.2.1 Time-Based Signal Quantization . . . . . . . . . . . . . . . . 76
4.2.2 Delay-Based Signal Quantization . . . . . . . . . . . . . . . 77
4.3 ADC Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.1 Delay-Cell Design . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.2 The Differential Delay-Line . . . . . . . . . . . . . . . . . . 80
4.3.3 Maximizing The Dynamic Range . . . . . . . . . . . . . . . 81
4.4 Non-Ideal Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.4.1 Time-Domain Averaging . . . . . . . . . . . . . . . . . . . . 83
4.4.2 Noise and Jitter . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.4.3 Mismatch . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.4.4 Design Methodology . . . . . . . . . . . . . . . . . . . . . . 89
4.5 Circuit Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.5.1 Sample and Hold . . . . . . . . . . . . . . . . . . . . . . . . 91
4.5.2 Delay Adjustment Circuit . . . . . . . . . . . . . . . . . . . 93
4.5.3 The Delay-Cell . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5.4 Clock Phase Generator . . . . . . . . . . . . . . . . . . . . . 97
4.6 Prototype Measurement . . . . . . . . . . . . . . . . . . . . . . . . 99
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5 Conclusion and Future Direction 106
Bibliography 109
LIST OF FIGURES
2.1 A loop structure consisting of N coupled oscillators. . . . . . . . . . 13
2.2 Stable regions for different modes of a system of 4 coupled oscillators. 18
2.3 Mode transition in the system of 4 coupled oscillators. Top: Transition
from ψ = 0 to ψ = π/2. Bottom: Transition from ψ = π/2
to ψ = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Relative frequency as a function of φc for two values of K. The
jump in the output frequency corresponds to the change in the
coupling mode from Ψ = 0 to Ψ = π/2. . . . . . . . . . . . . . . . 21
2.5 Simulated harmonic current generated at the drain of a 65nm
CMOS transistor in a basic cross-coupled oscillation scheme. . . . . 23
2.6 The modified cross-coupled scheme employed and the equivalent
circuit at the fundamental frequency. . . . . . . . . . . . . . . . . . 24
2.7 Half-circuit equivalent of the core oscillator at the fourth harmonic
frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.8 Top view of the proposed four core coupled oscillator VCO showing
the cores, the coupling block and the power combiner. . . . . . . . 27
2.9 Simulated output matching of the oscillator at the fourth harmonic. 27
2.10 Simulated waveforms of the core oscillators and the output node
oscillating at the fourth harmonic frequency. . . . . . . . . . . . . . 28
2.11 The differential phase shifter followed by the coupler to the next
core. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.12 Simulated phase shift of the coupling block with respect to the
control voltage and the resulting change in the locking frequency
of the core oscillators. . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.13 The chip micro photograph. . . . . . . . . . . . . . . . . . . . . . . 31
2.14 Metal layers and the dimensions of transmission lines. . . . . . . . 32
2.15 Top: Test setup for measuring the output frequency. Bottom: Test
setup for measuring the output power. . . . . . . . . . . . . . . . . 34
2.16 A typical measured output spectrum downconverted by the 16th
harmonic of the LO. . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.17 Measured output power and frequency tuning for the 290 GHz VCO. 35
2.18 Measured output power and frequency tuning for the 320 GHz VCO. 35
2.19 Measured peak output power vs. DC power. . . . . . . . . . . . . . 36
3.1 2-D lattice as an Electrical Interferometer. . . . . . . . . . . . . . . 40
3.2 Discrete 1-D transmission line. . . . . . . . . . . . . . . . . . . . . 41
3.3 Time delay vs. section number in a 1-D tapered line, for different
tapering coefficients. Simulation is done with L0 = 50 pH, C0 =
10 fF, and ω = 2π · 60 GHz, and it is compared with our analysis. 45
3.4 Incident and reflected waves across the two sections of the tapered
1-D transmission line. . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Discrete 2-D transmission lattice. . . . . . . . . . . . . . . . . . . . 49
3.6 The characteristic of an accumulation mode MOS varactor. . . . . 50
3.7 2-D lattice interference pattern and its central lobe. The contours
represent nodes of the lattice with equal voltage amplitudes, and
the dashed lines represent nodes with zero amplitude. The voltage
levels are normalized with respect to the input voltage level. . . . . 53
3.8 Analysis vs. circuit simulation of the quantizer input-output relation
for a uniform lattice with L0 = 56 pH and C0 as in Fig. 3.6.
The comparison is performed for VT = 70 mV (blue), VT = 80 mV
(red), and VT = 90 mV (green). In order to take into account the
effect of loss in the analysis, a smaller input swing is applied to
(3.37). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.9 A 2-D lattice with central symmetric LC tapering. . . . . . . . . . 56
3.10 Inductor and capacitor quality factor vs. frequency. The design
point is shown by the dashed line. . . . . . . . . . . . . . . . . . . 59
3.11 Uniform vs. tapered lattice output. Top: uniform lattice with
L = L0 and C = C0. Bottom: constant L/C tapered lattice with
tapering factor of ρ = 0.09. The tapered lattice focuses the first
interference lobe to fit inside the 16×16 lattice. Simulation is done
with source amplitudes of 50mV . . . . . . . . . . . . . . . . . . . . 60
3.12 Quantizer’s output characteristic with and without tapering. All
inputs are ideal matched sources with input amplitude of 50mV.
The simulation is done for VT = 70mV in blue, VT = 80mV in
red, and VT = 90mV in green. We observe the effect of tapering
which is focusing the variation in smaller number of nodes while
maintaining the desired input-output trend. . . . . . . . . . . . . . 62
3.13 Quantizer output vs. input voltage for two different selection sets
showing two examples of various achievable input-output charac-
teristics. Top left: A 16-level or 4-bit linear quantizer. Top right:
Its selection set for output nodes. Bottom left: A logarithmic (sat-
urating) quantizer. Bottom right: Its selection set for output nodes 63
3.14 Voltage detector circuit with input and reference inputs and the
differential output. . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.15 The N th stage of the cascaded amplification circuit with the
switches and the four clock phases. . . . . . . . . . . . . . . . . . 66
3.16 Top: Detector’s transient response for Vin = 55 mV and VT = 50 mV.
Bottom: First gain stage (blue) and second gain stage (red) sampled
voltages for the above detector output. The first gain stage
samples the detector’s output and amplifies it in a 50 ps time window.
For the next 50 ps the second stage amplifies the previous
output, while the first stage is being reset. The sign of the detector
output is the logic state, which in this case is positive (Vin > VT ). . 67
3.17 Time domain simulation. Input sinusoid at 1.7GHz (blue) and
the digital output (red). All other parameters are the same as the
constant input simulations we performed before. . . . . . . . . . . 68
3.18 The digital output Fourier transform for 256 samples of the output. 68
3.19 Noise and mismatch limits for different values of SNDR. The lat-
tice parameters are as specified in Section 3.4.4. The input Vin is
assumed to be a sinusoid with an amplitude of 150mV. . . . . . . . 71
4.1 A delay-line with a digital pulse applied to the first delay-cell. The
propagation length of the pulse is quantized inside the delay-line
through collecting the digital outputs of the delay-cells. . . . . . . 75
4.2 The proposed controllable delay-cell. The current source linearly
discharges the capacitive node when the start switch turns on by
the previous stage. The delay-cell is reset to its high-level voltage
at the end of the time window. . . . . . . . . . . . . . . . . . . . . 78
4.3 The proposed differential delay-line based data converter. The in-
put pulse is applied to both P-cells and N-cells and propagates in
the two lines with different speeds. The outputs of P-cells and N-
cells are latched separately and subsequently encoded as the digital
output. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.4 Simulated Input-Output relation comparison between a single
delay-line quantizer and a differential quantizer in the employed
65nm CMOS process. The differential quantizer generates a higher
dynamic range and linearity. . . . . . . . . . . . . . . . . . . . . . 81
4.5 Left: The quantization error separated to different regions based
on the bias point. Right: The noise variance corresponding to sub-
optimal and optimal bias points. . . . . . . . . . . . . . . . . . . . 84
4.6 Signal to quantization noise ratio as a function of the bias point
from theoretical analysis of the differential delay-line compared to
the simulation results from the time domain spectrum in the em-
ployed 65nm CMOS process. . . . . . . . . . . . . . . . . . . . . . 84
4.7 Sources of noise and jitter in the delay-cell. . . . . . . . . . . . . . 86
4.8 Comparison of mismatch limited signal-to-noise ratio between ba-
sic flash and the delay-line structure. Monte-Carlo simulation is
performed for the 65nm process and the SER is calculated assum-
ing similar input voltage range. The horizontal arrows indicate the
regions in which each structure is favorable. The vertical arrows
indicate the direction of further scaling which is in favor of the
delay-line structure. . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.9 FOM as a function of sampling rate and resolution for the used
65nm CMOS process. The selected sampling frequency is shown
with the dashed line. Above fmin the delay-line is in the quanti-
zation limited region where sampling rate can be traded for higher
resolution while maintaining minimum energy consumption per bit. 91
4.10 Input sampling switches and the delay adjustment circuit. . . . . . 92
4.11 Top: Transfer function of the input voltage to the inverse delay.
The dashed line represents the bias point of the delay-cells. Bottom:
INL of the resulting differential quantizer. . . . . . . . . . . . . . . 94
4.12 Delay-cell followed by the latch. The delay-cell is triggered by the
previous stage but all delay-cells are reset at the same time. . . . . 95
4.13 Time response of the delay-line outputs. If a cell is latched during
the transition time Δtm, the resulting bit is not clear. The uncertainty
is kept below the noise floor by ensuring sharp transitions for
the delay-cells. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.14 Edge triggered pulse generator (top), and the two generated phases
(bottom). The delay-cells are similar to the core delay-cells to
ensure robustness to temperature and process variations. . . . . . . 98
4.15 The effect of process and temperature variation in the delay-line for
the employed adaptive pulse width technique compared with using
a fixed pulse width. . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.16 Chip photograph. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.17 Measured INL (top) and DNL (bottom) of the two channels. . . . . 101
4.18 Top: Measured SFDR and SNDR at Nyquist rate vs. sampling
rate. Bottom: Measured SFDR and SNDR vs. input frequency at
fs =1.2GS/s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.19 Top: Measured low frequency spectrum for one channel. Bottom:
Measured Nyquist rate output spectrum, when both channels are
interleaved. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
LIST OF TABLES
2.1 Performance comparison of the state-of-the-art terahertz signal
sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.1 Performance Comparison of State-of-the-Art ADCs above 10GS/s . 69
4.1 Performance Summary . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.2 Performance Comparison of State-of-the-Art ADCs . . . . . . . . . 105
CHAPTER 1
CHALLENGES IN MM-WAVE AND TERAHERTZ ELECTRONICS
1.1 High-Speed Integrated Circuits
Modern electronics is an exciting area for research due to the wealth of possibilities
that have emerged from deep sub-micron technology. Over the last few
decades, the cost-efficiency and reliability of silicon technology have been the main
drivers for higher integration of various circuits and functionalities on a single die.
Such integration has resulted in explosive growth in various fields such as data
communication, memory, signal processing, imaging, and medicine.
The exponential growth in the technology of integration has been best encap-
sulated by Moore’s law. Today, as a result of such an advancement in fabrica-
tion technology, devices with cut-off frequencies well above 100 GHz are available
within standard CMOS technologies. Such high performance transistors have not
only enabled highly integrated analog and RF circuits but they have also resulted
in the emergence of new areas that were not traditionally associated with inte-
grated circuits. Today, more and more microwave and mm-wave applications are
implemented on chip using standard silicon technologies. Examples of these new
areas of applications include ultra-wide-band communication and mm-wave radar
sensing and imaging.
The scaling in integrated circuits has not been without drawbacks and challenges.
Besides higher leakage current in digital circuits, scaling down the oxide
thickness has resulted in lower supply voltages for analog circuit design. This
imposes challenges in power generation and amplification for RF and mm-wave
applications. The other challenge with reduced supply voltage is the reduced dynamic
range available for voltage-based analog signal processing. Another issue
with device scaling is that metal parasitics and passive elements, which do
not scale proportionally, eventually become more significant. In particular, at higher
frequencies the proper design and consideration of passive elements
become crucial to the overall performance of the system.
Generally, a high-speed integrated system relies on two important tasks: signal
generation and transmission on one side, and processing of the received signal on the
other side. With the scaling of CMOS transistors, higher operating frequencies
are possible for integrated circuits. However, CMOS scaling, which is mostly directed
toward digital gates, has not been completely favorable for analog circuits. For
instance, the lower supply voltage of transistors has resulted in limited dynamic
range in voltage-based analog circuits. Besides, many aspects of analog integrated
circuits, such as the quality factor of passive devices and line connections, do not
scale in proportion to the device performance. As a result, it is not always efficient
to employ signal generation and processing methods that have been traditionally
successful for sub-GHz and RF frequency ranges.
In order to design optimal systems on chip, circuit design should take advantage of
advances in other fields such as microwaves, optics, and solid-state physics that share
various physical and mathematical concepts. An emerging challenge in one field
might find its solution in a different field. More importantly, modern technological
challenges tend to remove artificial boundaries. For instance, left-handed materials
originated in microwave engineering, but today they are applied in optics for cloaking
devices.
In this thesis we present solutions to the challenges that exist in generating
and processing signals in the mm-wave and terahertz range. In order to approach
the fundamental potential of integrated circuits, we present novel solutions
that address two main issues: 1) High-power signal generation and transmission
at sub-mm-wave frequencies for wideband communication systems, imaging, and
spectroscopy. 2) Power-efficient multi-GHz analog signal processing for wideband
analog front ends. Our proposed architectural innovations will bring integrated
circuits one step closer to realizing efficient transceiver, imaging, and spectroscopy
systems at frequencies close to or even above the cut-off frequency of CMOS transistors.
The organization of this thesis is as follows: The rest of this chapter
introduces the main contributions of our work in each of the aforementioned areas.
Next, in Chapter 2 we present a novel scalable approach to generate signals at
mm-wave and terahertz frequencies, which can be applied to a wide range of applications. On
the processing side, Chapter 3 and Chapter 4 present novel approaches for processing
ultra-high-speed signals based on quasi-optical interference and time-based
quantization. Finally, we conclude the thesis by summarizing our contributions
and suggesting possible future directions.
1.2 Electronics in the Terahertz Spectrum
The terahertz frequency range (0.3 THz - 3 THz) is one of the last frontiers in the
electromagnetic spectrum. It is a gray area between optics and electronics with
important features. Many molecules have absorption bands inside this frequency
range, which gives terahertz niche applications in medical imaging and molecular
spectroscopy. For instance, detecting the water content of cells can be used to
diagnose epithelial diseases. Moreover, the high contrast between the transmission of
terahertz radiation (T-rays) in metallic and nonmetallic materials makes this band useful for detecting
concealed objects. In addition, T-rays are non-ionizing, which is crucial for non-invasive
imaging.
A terahertz transmission and detection solution is challenging to implement
using solid-state transistors. Signal generation is one of the main challenges due
to the limited cut-off frequency of transistors and the low quality factor of passive
elements. The transistor fmax limits the highest oscillation frequency, and the low
quality factor limits the generated power and degrades the phase noise. The other challenge is
tuning the generated frequency when it is close to or higher than fmax. Varactors are very lossy
at these frequencies and their tuning ability is also small due to device parasitics.
As a matter of fact, CMOS VCOs at these frequencies have only sub-µW power
levels and poor tunability.
In the presence of these limits, we propose novel architectures that make terahertz
integrated systems a possibility. Our innovative solution is significant in at least
two directions: frequency tuning and power generation [1].
Terahertz frequency tuning: We propose a fundamentally different method for
tuning the oscillation frequency. Instead of changing the core frequency using
varactors, multiple core oscillators are coupled together through tunable coupling
blocks. The operation of the oscillator is an example of the synchronization phe-
nomenon in nonlinear dynamical systems. The coupling between the cores acts as
a bifurcation variable that controls the coupling dynamics and the synchronized
frequency of operation. The advantage of this tuning method is that each core
oscillator is free of varactors and the resonator can well sustain and deliver the
generated power from the core devices.
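As a rough illustration of this tuning mechanism, the sketch below integrates a generic first-order phase model of N identical oscillators coupled in a unidirectional loop. The model itself, the function names `ring_rates`, `simulate_ring`, and `locked_frequency`, and all parameter values (`omega0`, `k`, `phi_c`) are assumptions chosen for illustration; they are not taken from the derivation in Chapter 2. In this simplified model, a locked state with uniform phase step psi oscillates at omega0 + k·sin(phi_c − psi), so the coupling phase alone shifts the synchronized frequency.

```python
import math

def ring_rates(theta, omega0, k, phi_c):
    """Instantaneous frequency of each core in the assumed toy model:
    dtheta_i/dt = omega0 + k * sin(theta_{i-1} - theta_i + phi_c)."""
    n = len(theta)
    # theta[i - 1] wraps around for i = 0, which closes the loop.
    return [omega0 + k * math.sin(theta[i - 1] - theta[i] + phi_c)
            for i in range(n)]

def simulate_ring(n=4, omega0=1.0, k=0.1, phi_c=0.3, dt=0.01, steps=20000):
    """Forward-Euler integration; returns the mean frequency after settling."""
    # Start near the in-phase state with a small perturbation.
    theta = [0.05 * i for i in range(n)]
    for _ in range(steps):
        rates = ring_rates(theta, omega0, k, phi_c)
        theta = [t + dt * r for t, r in zip(theta, rates)]
    return sum(ring_rates(theta, omega0, k, phi_c)) / n

def locked_frequency(omega0, k, phi_c, psi=0.0):
    """Frequency of a locked state with uniform phase step psi between cores."""
    return omega0 + k * math.sin(phi_c - psi)

if __name__ == "__main__":
    for phi_c in (0.0, 0.3, 0.6):
        print(phi_c, simulate_ring(phi_c=phi_c), locked_frequency(1.0, 0.1, phi_c))
```

In this toy model the in-phase state is stable only while cos(phi_c) is positive; beyond that range a different phase step takes over, which is one simple way to picture a jump between coupling modes.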
Scalable power generation and combining: We propose a scalable CMOS terahertz
power generator based on the proposed scheme. The method consists of
two steps. First, the core transistors are locally matched to generate and deliver
maximum harmonic power. Next, the harmonics from all cores are combined at
the center of the structure. Based on this concept, we fabricated a four-core prototype
in a 65 nm low-power bulk CMOS process. This oscillator generates 0.76 mW
of output power at 290 GHz with 13 GHz of tuning range. This level of power is
10,000 times higher than that of other sub-mm-wave VCOs in CMOS. In fact, the output
power and tuning range of this structure exceed those of compound semiconductor designs
at this frequency, which have considerably higher fmax.
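The combining step can be pictured with a simple phasor sum. The sketch below assumes ideal, identical cores whose fundamental phases differ by a uniform step psi around the loop; the function name `combined_amplitude` and the equal-amplitude assumption are illustrative and do not describe the measured prototype. With four cores and psi = π/2, the fundamentals cancel at the common node, while the fourth harmonics, whose phases advance by 4·psi = 2π, add in phase.

```python
import cmath
import math

def combined_amplitude(n_cores, psi, harmonic, amplitude=1.0):
    """Magnitude of the sum of one harmonic from n_cores ideal cores.

    Core i is assumed to have fundamental phase i * psi, so its
    harmonic of order `harmonic` has phase harmonic * i * psi.
    """
    total = sum(amplitude * cmath.exp(1j * harmonic * psi * i)
                for i in range(n_cores))
    return abs(total)

if __name__ == "__main__":
    for h in (1, 2, 3, 4):
        print(h, round(combined_amplitude(4, math.pi / 2, h), 6))
```

As a separate piece of arithmetic on the figures quoted above, a factor of 10,000 in power is 40 dB, so 0.76 mW (about -1.2 dBm) is being compared with sources near 76 nW, which is consistent with the sub-µW levels mentioned earlier.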
1.3 Ultra-Fast Signal Processing
On the processing side, for wideband analog signals, we present quasi-optical methods
that result in fast and efficient processing blocks. Electrical media can be
utilized for controlled implementation of wave phenomena aimed toward a desired
outcome. Selecting particular values for the inductors and capacitors in a
1-dimensional transmission line or a 2-dimensional lattice provides a large design
space for implementing different phenomena. For instance, it is possible to reproduce
wave interference, refraction, and bandgaps by designing particular profiles
of inductors and capacitors. Based on such a method, analog operations can be
carried out using only passive elements.
Such operations are not limited by the cut-off frequency of the transistors and
can function at extremely high speeds. The speed of this type of signal processing
is limited by the cut-off frequency of the passive network. Furthermore, such
a passive operator will potentially consume significantly less power compared to
an active circuit. Based on this concept, processing rates on the order of tens of
gigahertz are achievable with significantly better efficiency than active structures [2].
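To give a sense of scale for the cut-off frequency of a passive network, the sketch below evaluates the standard textbook expressions for an ideal, lossless LC ladder: per-section delay sqrt(LC), characteristic impedance sqrt(L/C), and cut-off frequency 1/(π·sqrt(LC)). The element values of 50 pH and 10 fF are borrowed from the caption of Figure 3.3 purely as an example; the function name `lc_section` is hypothetical, and loss and parasitics are ignored.

```python
import math

def lc_section(l_henry, c_farad):
    """Low-frequency delay, impedance, and cut-off of an ideal LC ladder section."""
    delay_s = math.sqrt(l_henry * c_farad)      # delay per section, seconds
    z0_ohm = math.sqrt(l_henry / c_farad)       # characteristic impedance, ohms
    f_cutoff_hz = 1.0 / (math.pi * delay_s)     # ladder cut-off frequency, Hz
    return delay_s, z0_ohm, f_cutoff_hz

if __name__ == "__main__":
    delay_s, z0_ohm, f_cutoff_hz = lc_section(50e-12, 10e-15)
    print(f"delay per section: {delay_s * 1e12:.3f} ps")
    print(f"characteristic impedance: {z0_ohm:.1f} ohm")
    print(f"cut-off frequency: {f_cutoff_hz / 1e9:.0f} GHz")
```

With these example values the ideal cut-off lands in the hundreds of gigahertz, which is why such a network is not bound by the transistor cut-off frequency.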
Electrical Interferometer: We show that analog processing blocks can be im-
plemented using wave propagation in a 2-dimensional LC lattice. In particular,
we have proposed an ultra-fast signal quantizer based on wave interference in a
2-dimensional rectangular lattice. In our proposed scheme sinusoidal mm-wave
sources are applied to a rectangular lattice in 4 perpendicular directions and form
an interference pattern inside the lattice.
In this scheme the analog signal changes the propagation velocity of the wave
by controlling accumulation mode MOS capacitors in the lattice. As a result,
the analog input modulates the location of peaks and notches in the interference
pattern. Sensitive nodes on the lattice are chosen to detect the pattern change and
subsequently quantize it. Based upon this interferometric technique, we design a
20Gsample/sec 200mW quantizer. This quantizer is the fastest and most power
efficient single channel data converter on CMOS.
Electrical Lens: We propose a tapering profile on this 2-dimensional lattice that
acts as a lens upon the interference pattern. This profile keeps the characteristic
impedance constant while changing the propagation velocity gradually toward the
center. In effect this tapering results in refraction of the plane wave toward the
center of the lattice. The interference peak becomes more concentrated at the
center which is desired for efficient pattern detection.
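A minimal sketch of the tapering idea follows, assuming the wave is slowed toward the center as in a converging graded-index lens. Scaling L and C by the same factor leaves the characteristic impedance unchanged while scaling the per-section delay. The parabolic profile `index` and all element values are hypothetical choices for illustration, not the profile used in this work.

```python
import math

def tapered_profile(l0, c0, n_of_x, positions):
    """Scale L and C by the same factor n(x) at each position.

    Z0 = sqrt(L/C) is unchanged, while the per-section delay sqrt(L*C)
    grows by n(x), so the wave travels more slowly where n(x) is larger.
    """
    profile = []
    for x in positions:
        n = n_of_x(x)
        inductance, capacitance = n * l0, n * c0
        profile.append((x,
                        math.sqrt(inductance / capacitance),
                        math.sqrt(inductance * capacitance)))
    return profile

def index(x):
    # Hypothetical parabolic profile: 1.0 at the edges (x = +/-1), 1.5 at the center.
    return 1.5 - 0.5 * x * x

rows = tapered_profile(50e-12, 20e-15, index, [-1.0, -0.5, 0.0, 0.5, 1.0])
for x, z0, delay in rows:
    print(f"x = {x:+.1f}: Z0 = {z0:.1f} ohm, delay = {delay * 1e12:.3f} ps")
```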
1.4 Low-Power Signal Quantization
Physical representation of an analog signal plays an important role in its subsequent
use. Circuit theory represents any signal with either a voltage across two terminals
or a current inside a wire. This lumped element abstraction despite its versatility
does not represent all possible functions. The Mach-Zehnder interferometer is an
example of signal modulation using the phase of the electromagnetic wave. This
kind of modulation exploits the distributed nature of wave propagation that can be
modulated by changing the transmission properties of the waveguide. Similarly, in
the electrical domain, a propagating signal in a transmission line has analog properties
such as phase, propagation velocity and wavelength that can be exploited for novel
signal processing schemes.
The underlying idea is the following: An input signal in the form of voltage or
current modulates another physical quantity such as time, phase or wavelength.
The outcome of this modulation is detected and converted back into a voltage
or current. The Electrical Interferometer explained in the previous section is an
example of a structure that exploits wavelength modulation for signal quantization.
Here we show another example: time-domain signal processing [3].
Delay-line based data converter: If bandwidth is the favorable aspect of device
scaling, can dynamic range also benefit from it? We answer this question positively
by proposing a novel data converter that relies on time quantization. In this
architecture the analog signal modulates a delay-line which carries a digital pulse.
The modulation of the delay line results in a change in the propagation speed of
the pulse during a fixed time. As a result the propagation length of the pulse
becomes a function of the analog signal. The number of triggered delay-cells is in
fact a quantized version of the propagation length of the pulse. By proper design
of the delay-cell we can get a linear quantization characteristic.
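The time-quantization idea can be sketched with a simple behavioral model. This is not the circuit of this work: it assumes a normalized input and a hypothetical cell whose delay is inversely proportional to the input, which is one way to obtain the linear characteristic mentioned above. The function name `delay_line_code` and its parameters are illustrative.

```python
import math

def delay_line_code(v_in, n_cells=16, window=1.0):
    """Behavioral model of a delay-line quantizer.

    v_in is a normalized input in (0, 1). The per-cell delay is assumed to be
    window / (n_cells * v_in), so the pulse speed is proportional to v_in and
    the number of cells reached within the fixed window is linear in v_in.
    """
    if not 0.0 < v_in < 1.0:
        raise ValueError("v_in must lie strictly between 0 and 1")
    cell_delay = window / (n_cells * v_in)
    cells_reached = math.floor(window / cell_delay)
    return min(cells_reached, n_cells - 1)

# Sixteen cells correspond to a 4-bit output code.
codes = [delay_line_code(v) for v in (0.10, 0.53, 0.97)]
print(codes)
```

In this model a finer cell delay directly yields more output levels for the same window, which mirrors the argument that resolution benefits from faster digital cells.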
The dynamic range of this structure is proportional to the time resolution of
the digital delay cells; in other words, it is a quantity that scales favorably with
device scaling. We demonstrate this theoretically and, as a proof of concept, we implement
a 4bit 1.2GSample/sec prototype with only 0.01 mm² of active area. This performance
is achieved without any digital assistance or post-processing. Compared to
calibration-free voltage-based data converters above 1GSample/sec, the proposed
delay-line based data converter achieves the best energy efficiency. Furthermore, due
to the mostly digital quantization scheme, this structure occupies the smallest area
compared to all ADCs.
CHAPTER 2
CMOS HIGH-POWER TERAHERTZ SOURCES BASED ON
COUPLED OSCILLATORS
2.1 Introduction
The last few years have witnessed a surge in mm-wave integrated circuits mainly
because of the increased cut-off frequency of CMOS transistors. These integrated
systems target applications such as wide-band communication, remote sensing and
medical imaging [4–15]. Moreover, recent works have shown the possibility of
implementing CMOS circuits at sub-mm-wave and terahertz frequencies defined
from 300 GHz to 3 THz. This is motivated by the higher communication bandwidth
and better imaging resolution in addition to new applications such as integrated
terahertz spectroscopy [14–22]. Despite all recent advances in this frequency range,
a high power tunable signal source has remained a challenging yet essential circuit
block toward the realization of a complete terahertz system.
In LC-resonator-based voltage-controlled oscillators (VCO), varactors are
placed in the resonator in order to tune the oscillating frequency. This tuning
method works well at radio frequencies and achieves moderate tunability at lower
mm-wave frequencies (<100 GHz) [5–7]. However, there are at least two major
challenges in using varactors for frequency tuning above 100 GHz. First, at these
frequencies the varactor quality factor is low. This lowers the achievable out-
put power and degrades the phase noise performance. Secondly, as the operation
frequency increases, the parasitic capacitances dominate the tank, limiting the
tuning capability of varactors. These challenges impose an important trade-off in
the design of high frequency oscillators. CMOS oscillators above 100 GHz with
high output power do not use varactors and, as a result, their frequency cannot be
tuned [16, 17]. On the other hand tunable oscillators at these frequencies provide
very low output powers (<1µW) due to the use of tuning varactors [18, 19]. As
a result of these challenges, power generation along with frequency tuning above
150 GHz is dominated by frequency multipliers [22–24]. Frequency multiplication
requires a high-power external source which is not desirable in a fully integrated
terahertz source.
To address this challenge, prior works at lower frequencies have focused on tun-
ing the oscillation frequency without using varactors [25–27]. The magnetically-
tuned and the transconductance-tuned VCOs are two examples where instead of
capacitive tuning, the effective inductor of the tank is tuned. Both of these tech-
niques still require additional active devices inside the oscillator tank. A recent
work has used an interpolative-phase-tuning technique in an LC ring oscillator at
the mm-wave frequency range [28]. All these techniques have been used to generate
output power at frequencies well below the cut-off frequency of the transistors.
In order to realize a high power VCO at the sub-mm-wave and terahertz band,
three requirements need to be satisfied: first, the signal source should be able
to generate high harmonic power above the device fmax. Second, the generated
power should be efficiently delivered to the output load. Third, a frequency tun-
ing mechanism should be proposed that would not adversely affect the first two
requirements.
In this chapter we introduce a novel VCO architecture based on coupled oscil-
lators in a loop configuration that is suitable for terahertz power generation. We
show that because of the new approach to frequency control, the trade-off between
frequency tuning and power generation in conventional VCOs is largely resolved.
As a result, frequency tuning is achieved while maintaining high output power in
the sub-mm-wave frequency range. Furthermore, the proposed technique provides
an effective way to generate and combine the harmonics of the fundamental fre-
quency from multiple core oscillators. Using a standard low-power 65 nm CMOS
process, we demonstrate a 290 GHz VCO with 0.76 mW (-1.2 dBm) output power
and 13 GHz tuning range and a 320 GHz VCO with 0.46 mW (-3.3 dBm) out-
put power and 8.4 GHz tuning range [29]. These VCOs achieve the highest output
power among all reported CMOS VCOs as well as compound semiconductor VCOs
around 300 GHz.
The rest of this chapter is organized as follows: Section 2.2 describes the the-
ory of the proposed coupled oscillator system, Section 2.3 explains the VCO ar-
chitecture and Section 2.4 describes the implementation and measurement results.
Finally, Section 2.5 summarizes the chapter.
2.2 Theory of Coupled Oscillators
The interaction between two electrical oscillators and its dynamics has been studied
by Adler in the context of injection locking [30]. Based on this analysis when
an oscillator at ω1 injects energy to another one with a free-running oscillation
frequency of ω0, under locking conditions the second oscillator follows the first
with an equal frequency and a phase shift of
$$\Delta\phi = \sin^{-1}\left(2Q\,\frac{I_{core}}{I_{inj}}\cdot\frac{\Delta\omega}{\omega_0}\right), \tag{2.1}$$
where ∆φ is the phase difference developed as a result of the frequency difference
∆ω = ω1 − ω0. In (2.1), Q represents the quality factor of the resonator and Icore
and Iinj are the currents delivered to the resonator from its active element and the
injecting source, respectively.
The above relation also yields the locking range by finding the frequency difference
∆ω at which |∆φ| = π/2. As (2.1) implies, lowering the quality factor and/or
increasing the amplitude of the injected signal results in a wider locking range.
The lower and upper bounds of the locking range are derived by finding the
frequencies where the phase shift equals −π/2 and π/2, respectively. Intuitively,
at these frequencies the in-phase component of the injected signal disappears; as
a result, the injected signal has no effect on the dynamics of the main oscillator.
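To make (2.1) concrete, the sketch below evaluates the locked phase shift and the one-sided locking range. The numerical values (a 75 GHz free-running frequency, Q = 10, and an injection ratio of 0.5) and the function names are assumptions chosen for illustration only.

```python
import math

def adler_phase_shift(delta_omega, omega0, q, i_core, i_inj):
    """Steady-state phase shift from (2.1); raises outside the locking range."""
    arg = 2.0 * q * (i_core / i_inj) * (delta_omega / omega0)
    if abs(arg) > 1.0:
        raise ValueError("frequency offset is outside the locking range")
    return math.asin(arg)

def locking_range(omega0, q, i_core, i_inj):
    """One-sided locking range: the |delta_omega| at which |phase| = pi/2."""
    return omega0 / (2.0 * q) * (i_inj / i_core)

# Hypothetical values, not taken from the measured design.
OMEGA0 = 2.0 * math.pi * 75e9
Q = 10.0
I_CORE, I_INJ = 1.0, 0.5

half_range = locking_range(OMEGA0, Q, I_CORE, I_INJ)
phase_mid = adler_phase_shift(0.5 * half_range, OMEGA0, Q, I_CORE, I_INJ)
print(f"locking range: +/-{half_range / (2.0 * math.pi) / 1e9:.3f} GHz")
print(f"phase shift at half of that offset: {math.degrees(phase_mid):.1f} degrees")
```

Halving Q or doubling the injected current doubles the range in this model, which is the trend stated in the text.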
This frequency locking phenomenon has been applied to electrical oscillators for
applications including frequency division, coupled radiation systems and beam-
steering [31–37]. The main similarity between all these systems is that there is
an external signal source and one or more oscillators are coupled and eventually
synchronized to that external source.
Coupled systems do not necessarily need an external source and can rely on in-
ternal couplings to synchronize. A study by York exploits passive mutual coupling
between an array of oscillators to synchronize them in order to facilitate spatial
power combining [38]. In general, the method of coupling and its dynamics as well
as the geometry of the coupled system can be engineered for a diverse range of
functionalities. In this work, we propose a particular coupling topology between
N core oscillators that controls the frequency of the oscillators and imposes a con-
stant desired phase shift between them. As we show later, this constant phase
shift is critical in harmonic power combining from core oscillators.
Figure 2.1: A loop structure consisting of N coupled oscillators.
2.2.1 Actively coupled oscillators
An insightful observation from (2.1) suggests that while a phase shift is developed
as a result of the frequency difference between the cores, the reverse phenomenon is
also possible. In a system of coupled oscillators, if the phase shift between the
injected signal and the main oscillator is changed, the locking frequency is forced
to change. In the conventional locking scheme this cannot happen because the
frequency of the injecting signal is fixed and the phase difference is dictated by
(2.1). In contrast, in a coupling scheme that only relies on internal couplings, the
locking frequency is determined based on internal parameters of the system. The
scheme in Fig. 2.1 shows a ring of N coupled oscillators. In this architecture, the
ith oscillator injects energy to the (i + 1)th oscillator while receiving energy from
the (i − 1)th oscillator. The coupling between the consecutive cores is performed
by a tunable phase shift.
In order to analyze the dynamics of this system, we assume sinusoidal coupling
between the adjacent cores, which is a good approximation close to the resonance
frequency [30]. Under this assumption, if φi is the rotating phase of the ith oscillator
and all cores have a similar free running oscillating frequency of ω0, the phase
dynamics can be described by
$$\dot{\phi}_i = \omega_0 + K\sin(\phi_{i-1} + \phi_c - \phi_i), \tag{2.2}$$
where $K = \frac{I_{inj}}{I_{core}}\cdot\frac{\omega_0}{2Q}$ is the coupling factor and φc is the phase shift resulting from
the coupling block. In this scheme all phase shifts are assumed to be equal.
By defining ψi = φi−φi−1 as the instantaneous phase shift between the adjacent
oscillators, we can rewrite the set of equations described by (2.2) as
$$\dot{\psi}_i = K\sin(\phi_c - \psi_i) - K\sin(\phi_c - \psi_{i-1}). \tag{2.3}$$
This relation describes the dynamics of ψi, the phase difference between adjacent
cores in the ring. For the complete description of the dynamics the boundary
condition from the loop should be included. The total phase shift around the loop
is fixed:
$$\sum_i \psi_i = 2k\pi, \tag{2.4}$$
where k is an integer. The solution to the above set of differential equations de-
scribes the phase dynamics of the system which also results in the stable solutions.
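The phase dynamics in (2.2) can be explored numerically. The sketch below is a minimal forward-Euler integration in the frame rotating at ω0, with a normalized coupling factor K = 1, N = 4, and an arbitrary small initial perturbation. These settings, the step size, and the helper names are assumptions for illustration and do not reproduce the simulations of this work.

```python
import math

def simulate_ring(phases, k_coupling, phi_c, t_end=60.0, dt=0.01):
    """Integrate (2.2) with forward Euler in the frame rotating at omega_0.

    Returns the final phases and each oscillator's final frequency offset
    from omega_0 (all offsets coincide once the ring has locked).
    """
    phases = list(phases)
    n = len(phases)
    rates = [0.0] * n
    for _ in range(int(round(t_end / dt))):
        # phases[i - 1] wraps to the last oscillator when i = 0, closing the loop.
        rates = [k_coupling * math.sin(phases[i - 1] + phi_c - phases[i])
                 for i in range(n)]
        phases = [p + dt * r for p, r in zip(phases, rates)]
    return phases, rates

def wrap(angle):
    """Map an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

# Hypothetical setup: start from a small perturbation of the in-phase state.
K, PHI_C = 1.0, 0.3
final_phases, offsets = simulate_ring([0.0, 0.2, -0.1, 0.15], K, PHI_C)
psi = [wrap(final_phases[i] - final_phases[i - 1]) for i in range(4)]
print("phase differences:", [round(p, 6) for p in psi])
print("frequency offset:", round(offsets[0], 6))
```

Starting near the in-phase state, the phase differences in this run decay toward zero and the common frequency offset approaches K sin(φc), in line with the locked solution of (2.2).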
2.2.2 Stability Theory
We are interested in the stable solutions of the system when all core oscillators are
synchronized. Under locking conditions, the phase difference between oscillators
should be constant, resulting in $\dot{\psi}_i = 0$ in (2.3). This can be written in matrix
form as:
$$\mathbf{K}\mathbf{x} = 0, \tag{2.5}$$
where $x_i = \sin(\phi_c - \psi_i^o)$ and $\psi_i^o$ represents the steady state solution of ψi. The
coupling matrix $\mathbf{K}$ is equal to
$$\mathbf{K} = \begin{bmatrix} K & 0 & \cdots & \cdots & -K \\ -K & K & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & & \vdots \\ 0 & \cdots & \cdots & -K & K \end{bmatrix}. \tag{2.6}$$
The trivial solution of (2.5) is $x_i = 0$. This along with (2.4) results in φc = 2nπ/N.
Such a solution imposes discrete values for φc, which is not physically stable.
The nontrivial solution of (2.5) is in the form of $x_i = x_1$, which leads to two
choices for $\psi_i^o$:
$$\psi_i^o = \psi_1^o \pm 2n\pi, \tag{2.7a}$$
$$\psi_i^o = 2\phi_c - \pi - \psi_1^o \pm 2n\pi. \tag{2.7b}$$
This suggests $2^{N-1}$ distinct coupling modes for a given $\psi_1^o$. However, in the next
theorem we show that only solutions with all $\psi_i^o$ chosen from (2.7a) are stable.
Theorem. If m of the $\psi_i^o$ are chosen from (2.7b) and N − m from (2.7a), only
solutions corresponding to m = 0 are stable.
Proof. In order to show this, we consider the nontrivial case of N ≥ 3. We perturb
(2.5) around each solution. By letting $\psi_i = \psi_i^o + \eta_i(t)$ and linearizing the
perturbed equations, we obtain:
$$\dot{\boldsymbol{\eta}} = J\{\psi_1^o \ldots \psi_n^o\}\,\boldsymbol{\eta}, \tag{2.8}$$
where the elements of the Jacobian matrix are equal to $J_{ij} = -K_{ij}\cos(\phi_c - \psi_i^o)$.
For a solution to be stable, all eigenvalues of the Jacobian matrix should be
non-positive [39]. It is straightforward to find the characteristic polynomial of J to
be:
$$P(\lambda) = \prod_{i=1}^{N}(J_{ii} - \lambda) - \prod_{i=1}^{N} J_{ii} = (-1)^N\left(\lambda^N + P_{N-1}\lambda^{N-1} + \ldots + P_1\lambda\right). \tag{2.9}$$
Thus, λ1 = 0 is an eigenvalue of the Jacobian. We show that for stable solutions,
all other eigenvalues are nonzero. By elaborating on the characteristic polynomial
we find that the sum (Σ) and product (Π) of the other N − 1 eigenvalues are:
$$-P_{N-1} = \sum_{i=2}^{N}\lambda_i = J_{11}(N - 2m), \tag{2.10a}$$
$$(-1)^{N-1}P_1 = \prod_{i=2}^{N}\lambda_i = J_{11}^{N-1}(N - 2m)(-1)^m. \tag{2.10b}$$
Without loss of generality we can assume $J_{11} \neq 0$ and thus, a second zero only
happens when N = 2m, requiring both Π and Σ to be zero. Since in a stable
solution no eigenvalue can be positive, Σ = 0 forces all eigenvalues to be zero
which is not possible for the nonzero Jacobian matrix. As a result, for a stable
solution all the other N − 1 eigenvalues of the coupled system are nonzero.
Next, based on an extended Gershgorin theorem [40], the eigenvalues of J are
inside circles centered at Jii with equal radii of |J11|. These are essentially two
circles, one on the left half plane and the other on the right half plane which
overlap at the origin.
If m = 1 and J11 > 0, N−m(≥ 2) Gershgorin circles lie on the right side, hence
at least one eigenvalue is positive. On the other hand if J11 < 0 we consider two
cases: 1) For an even N, from (2.10b), the product of an odd number of eigenvalues
becomes positive which means that at least one eigenvalue is positive. 2) For an
odd N , the product of even number of eigenvalues becomes negative which again
means that at least one eigenvalue is positive. Similarly, one can show that the
solutions corresponding to m = N − 1 are also unstable.
For 2 ≤ m ≤ N − 2 there is more than one Gershgorin circle in each half
plane and thus, at least one positive eigenvalue exists, which again means that the
corresponding steady-state solution is unstable.
Finally, the only stable solutions correspond to m = 0 and J11 < 0 which
confines all Gershgorin circles and eigenvalues to the left half plane.
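The statement of the theorem can be spot-checked numerically for a small ring. The sketch below builds the Jacobian from (2.6) and the definition of J, flips the sign of the cosine for the m phase differences taken from (2.7b), and reports the largest real part among the eigenvalues. The values N = 4, K = 1, and cos(φc − ψ1) = 0.8 are assumptions for illustration; this is a numerical check for one case, not a substitute for the proof.

```python
import numpy as np

def jacobian(n, m, k_coupling=1.0, cos_ref=0.8):
    """Jacobian of (2.3) when m phase differences come from (2.7b).

    cos_ref stands for cos(phi_c - psi_1^o). Phase differences taken from
    (2.7b) have the opposite cosine, so their rows change sign.
    """
    c = np.full(n, cos_ref)
    if m > 0:
        c[n - m:] = -cos_ref        # last m entries use (2.7b); psi_1 stays on (2.7a)
    # Coupling matrix (2.6): K on the diagonal, -K on the cyclic subdiagonal.
    k_matrix = k_coupling * (np.eye(n) - np.roll(np.eye(n), 1, axis=0))
    return -k_matrix * c[:, None]   # J_ij = -K_ij * cos(phi_c - psi_i^o)

def max_growth_rate(n, m):
    """Largest real part among the eigenvalues of the Jacobian."""
    return float(np.max(np.linalg.eigvals(jacobian(n, m)).real))

for m in range(4):
    print(f"m = {m}: largest real part = {max_growth_rate(4, m):+.3f}")
```

For this case only m = 0 gives no eigenvalue with a positive real part; the eigenvalue at zero corresponds to λ1 = 0 in (2.9).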
As a result of the above theorem, by applying (2.4) we find N distinct stable
modes of oscillation,
$$\psi_k^o = \frac{2k\pi}{N}, \quad 0 \le k \le N - 1, \tag{2.11}$$
where the kth mode is stable as long as $J_{11} < 0$, which means
$$2\pi n - \frac{\pi}{2} + \psi_k^o < \phi_c < 2\pi n + \frac{\pi}{2} + \psi_k^o. \tag{2.12}$$
2.2.3 Coupling Modes
Fig. 2.2 shows the stable regions as a function of the phase shift φc. Interestingly,
there is more than one stable solution for a given phase shift and, depending on
the initial conditions the system settles to one of the modes. However, based on
our analysis as long as the system in a particular mode satisfies (2.12) it will remain
in that mode [41].
If the phase shift is changed beyond the stable conditions of the current mode,
Figure 2.2: Stable regions for different modes of a system of 4 coupled oscil-
lators.
the system becomes unstable and moves out of this mode toward a stable mode.
The transition from one mode to the other mode can be described by the general
non-linear differential equation (2.2). Using the theory of nonlinear dynamics, it
can be shown that for N = 4 the system settles to the directly next state in the
diagram of Fig. 2.2 [41]. This means that for a stable solution at Ψk, increasing φc
eventually moves the system into Ψk+1, which is the next stable mode. Similarly,
reducing φc moves the system into Ψk−1. During this transition the phase difference
ψi gradually increases (or decreases) and finally settles to the next stable mode.
This behavior can be seen in Fig. 2.3 for a system with 4 coupled oscillators. A
small step in φc moves the system one step up to the next mode or one step down
to the prior mode.
A useful feature of this actively coupled oscillator scheme is its ability to delib-
erately select any desired mode. As shown in Fig. 2.2 for a given φc, the system
Figure 2.3: Mode transition in the system of 4 coupled oscillators. Top:
Transition from ψ = 0 to ψ = π/2. Bottom: Transition from
ψ = π/2 to ψ = 0. [Plots: ψ and φc versus time (nsec).]
can settle to one out of a few modes depending on the initial conditions. However,
by changing φc in a deterministic way, a particular mode can be selected. After
choosing this mode, the inherent hysteresis in the system will keep the system in
that mode as long as the stability conditions are satisfied. This is in contrast to
the quadrature oscillators where a phase uncertainty is inherent to the system.
We can consider the quadrature oscillator a special case of this coupled system
for N = 2, resulting in two stable modes at Ψ = π/2 and Ψ = −π/2. Without
the tunable phase shifters, for φc = 0 the system randomly chooses either of the
coupling modes, resulting in ambiguity in the locking frequency [42].
2.2.4 Frequency Tuning
When the coupled oscillators are locked together at the kth coupling mode, by
substituting Ψk = φi − φi−1 into (2.2), the frequency of all core oscillators, ω
becomes
$$\omega = \omega_0 + K\sin(\phi_c - \Psi_k). \tag{2.13}$$
Note that from (2.12), φc − Ψk is always between −π/2 and π/2 in the stable region
of any given coupling mode. By substituting for K, the maximum frequency
tuning range is equal to
$$\frac{\Delta\omega}{\omega_0} = \pm\frac{1}{2Q}\cdot\frac{I_{inj}}{I_{core}}. \tag{2.14}$$
The frequency tuning range can be increased by increasing the coupling factor.
Since reducing Q is not desirable for the purpose of harmonic power generation
and low phase-noise, the best way to increase the tuning range is by increasing
Iinj. When the amount of phase shift exceeds the stability limit of the operating
mode, the coupling mode changes to the next mode resulting in an abrupt step in
the output frequency as shown in Fig. 2.4.
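The tuning behavior of (2.13), including the step at a mode change, can be sketched as follows. The code assumes an increasing sweep of φc that starts inside the stable interval of the initial mode and uses the next-mode transition described above for N = 4. The function name `sweep_up` and the normalized coupling value 0.1 are illustrative assumptions.

```python
import math

def sweep_up(phi_c_values, k_rel, n=4, start_mode=0):
    """Relative frequency offset from (2.13) for an increasing phi_c sweep.

    k_rel is K / omega_0. The mode index advances by one whenever
    phi_c - psi_k reaches the upper stability limit pi/2 of (2.12).
    """
    mode = start_mode
    curve = []
    for phi_c in phi_c_values:
        while phi_c - 2.0 * math.pi * mode / n >= math.pi / 2:
            mode += 1
        offset = k_rel * math.sin(phi_c - 2.0 * math.pi * mode / n)
        curve.append((phi_c, mode % n, offset))
    return curve

points = sweep_up([0.0, math.pi / 4, 3 * math.pi / 4], k_rel=0.1)
for phi_c, mode, offset in points:
    print(f"phi_c = {phi_c:.3f} rad, mode {mode}, relative offset = {offset:+.4f}")
```

The offset never exceeds ±k_rel in this model, which is the bound expressed by (2.14).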
2.3 Terahertz Harmonic VCO
We exploit the frequency tuning concept of the actively coupled oscillator structure
for generation of high power and tunable signal sources around 300 GHz. In order
to generate power above the transistor fmax, we employ harmonic generation. The
generated harmonic power should be maximized and optimally delivered to the
output node. Prior work has been able to produce > 100µW of output power
on silicon at sub-mm-wave frequencies by optimizing the device embedding for
Figure 2.4: Relative frequency as a function of φc for two values of K. The
jump in the output frequency corresponds to the change in the
coupling mode from Ψ = 0 to Ψ = π/2. [Plot: ∆ω/ω0 versus φc for K = 0.1 and K = 0.15.]
efficient generation and delivery of the harmonic power to the load [16]. For such
a design, a reasonable quality factor is required in the signal path which is only
possible by avoiding varactors. This limits the oscillator to a single frequency.
Our proposed structure provides an effective method to tune the frequency while
maintaining the high output power.
First, the oscillator core is optimized to generate the highest possible power
at the harmonic of the fundamental frequency. Then, four cores are coupled and
the generated powers from all cores are combined and delivered to the load. In
addition to harmonic generation, the coupling dynamic performs two tasks: (1) it
determines the coupling mode and (2) in that given mode it controls the operating
frequency of the core oscillators. The coupling circuits (i.e., phase shifters) operate
at the fundamental frequency and are buffered from the cores. Hence, the tasks
of power generation and frequency tuning are separated, resulting in a high power
tunable source.
2.3.1 Oscillator Design
For a given target frequency in a given technology, the first design step is to find the
best harmonic of the fundamental to use as the output frequency. The advantage
of using a higher harmonic number is that the fundamental frequency is lower and
generates larger amplitudes resulting in more non-linearity. However, the power
generated at higher harmonics is significantly smaller than lower harmonics. On
the other hand, using lower harmonic numbers forces the fundamental frequency
to become close to the fmax, resulting in a low fundamental power to begin with.
As a result, there is a trade-off in choosing the harmonic number and the lowest
harmonic number is not necessarily the best choice.
In this design, for a target frequency of 300 GHz, the fourth harmonic is chosen
for the highest power generation. The simulated fmax of this process is around
200 GHz and as a result, using the second harmonic results in a fundamental of
150 GHz which is too close to the fmax. As shown in Fig. 2.5, the fourth harmonic
generates a higher harmonic current compared to the second harmonic. This is
mainly because of the larger nonlinearity produced by a fundamental frequency
around 75 GHz. As simulations indicate, the third harmonic creates slightly higher
levels of harmonic current. However, in this design even harmonics were favored
because of the resulting symmetry in implementation of the coupled system. By
using the fourth harmonic in this design, the cross-coupled LC oscillator can be
used as the core oscillator.
The LC resonator design should result in optimum performance both at the
fundamental and the fourth harmonic. Consider the circuit in Fig. 2.6 where the
conventional cross coupled structure is extended into a more general resonator. The
fundamental frequency is determined by the resonance frequency of the network.
Figure 2.5: Simulated harmonic current generated at the drain of a 65nm
CMOS transistor in a basic cross-coupled oscillation scheme.
[Plot: harmonic current (mA) versus harmonic frequency (GHz) for the 2nd (I2), 3rd (I3), and 4th (I4) harmonics.]
Moreover, at the fourth harmonic the network has to deliver maximum power
from the transistor to the load. Both the fundamental frequency and maximum
harmonic power delivery can be simultaneously achieved by adding an impedance
to the gate of the transistor. For practical purposes we employ two transmission
lines for the gate and drain lines. Let us assume the gate and drain lines have
characteristic impedances of Zg and Zd with lengths of lg and ld, respectively. For
this analysis, we model the transistor only with the gate capacitance Cg and a
transconductance of gm.
The equivalent circuit of the cross-coupled pair at the fundamental frequency is
shown in Fig. 2.6. To find the impedance of the resonator, we write the governing
equation at the output node. By applying a differential voltage source vx, the
resulting current ix equals:
$$i_x = g_m v_g + \frac{v_x}{2Z_{gate}} + \frac{v_x}{2Z_{drain}}, \tag{2.15}$$
where vg is the voltage at the gate, and Zgate and Zdrain are the impedances looking
Figure 2.6: The modified cross-coupled scheme employed and the equivalent
circuit at the fundamental frequency.
into the gate and drain lines, respectively. By calculating Zgate, Zdrain and vg using
transmission line analysis, the resonator transconductance Gm can be derived [43]:
$$G_m = \frac{i_x}{v_x} = -\frac{g_m}{2}\cdot\frac{1+\Gamma}{e^{j\beta l_g} + \Gamma e^{-j\beta l_g}} + \frac{1}{2Z_g}\cdot\frac{1-\Gamma e^{-j2\beta l_g}}{1+\Gamma e^{-j2\beta l_g}} + \frac{1}{2Z_d}\cdot\frac{1+e^{-j2\beta l_d}}{1-e^{-j2\beta l_d}}, \tag{2.16}$$
where Γ is the reflection coefficient at the gate of the transistor and β is the
propagation constant. While this analysis neglects the effect of loss of the lines
and the transistor, it is helpful for finding the resonance frequency. In this design lg
and ld are significantly smaller than the wavelength at the fundamental frequency
of 75 GHz, or equivalently $\beta l_g \ll 1$ and $\beta l_d \ll 1$. By applying these approximations
and substituting for Γ, (2.16) can be simplified as
$$G_m \simeq -\frac{g_m}{2} + \frac{jC_g\omega}{2} + \frac{1}{2Z_d\cdot j\beta l_d}. \tag{2.17}$$
This suggests that in the proposed oscillator the gate transmission line has
minimal effect on the fundamental frequency. This frequency, ω0 can be calculated
by setting the imaginary part of Gm equal to zero:
$$\omega_0 = \sqrt{\frac{v_p}{C_g Z_d l_d}}, \tag{2.18}$$
where vp = ω/β is the propagation velocity of the wave in silicon dioxide.
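As a worked illustration of (2.17) and (2.18), the sketch below inverts (2.18) for the drain-line length that gives a 75 GHz fundamental and confirms that the imaginary part of the approximate Gm vanishes there. The propagation velocity, gate capacitance, and line impedance are hypothetical round numbers, not the values of the fabricated design. The printed βld lets the reader judge how well the short-line assumption holds for a given choice.

```python
import math

def fundamental_frequency(v_p, c_g, z_d, l_d):
    """omega_0 from (2.18), in rad/s."""
    return math.sqrt(v_p / (c_g * z_d * l_d))

def drain_length_for(omega0, v_p, c_g, z_d):
    """Invert (2.18) for the drain-line length."""
    return v_p / (c_g * z_d * omega0 ** 2)

def gm_imag(omega, v_p, c_g, z_d, l_d):
    """Imaginary part of the approximate G_m in (2.17)."""
    beta = omega / v_p
    return c_g * omega / 2.0 - 1.0 / (2.0 * z_d * beta * l_d)

# Hypothetical values, assumed for illustration only.
V_P = 1.5e8          # m/s, roughly the speed of light divided by sqrt(3.9)
C_G = 60e-15         # F
Z_D = 80.0           # ohm
OMEGA_TARGET = 2.0 * math.pi * 75e9

l_d = drain_length_for(OMEGA_TARGET, V_P, C_G, Z_D)
omega0 = fundamental_frequency(V_P, C_G, Z_D, l_d)
print(f"l_d = {l_d * 1e6:.1f} um, beta*l_d = {omega0 * l_d / V_P:.2f}")
print(f"f_0 = {omega0 / (2.0 * math.pi) / 1e9:.2f} GHz")
print(f"Im(G_m) at f_0 = {gm_imag(omega0, V_P, C_G, Z_D, l_d):.2e} S")
```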
2.3.2 Harmonic Power Generation
At the fourth harmonic, the cross-coupled pair operates in common mode resulting
in a half-circuit equivalent shown in Fig. 2.7. Nonlinearity of the device generates
a harmonic current Ih at the drain of the transistor. For the highest power delivery
this current has to be optimally delivered to the load.
As shown in Fig. 2.7, Ih is divided into the two branches, Iout which proceeds to
the output load and Igate which returns to the gate of the transistor. To maximize
the output power, Ih should flow to the output node which means Zgate has to be
maximum at the fourth harmonic to minimize Igate. By using an analysis similar
to Section 2.3.1, the gate impedance equals:
$$Z_{gate} = \frac{1}{2Z_g}\cdot\frac{1-\Gamma e^{-j2\beta l_g}}{1+\Gamma e^{-j2\beta l_g}}. \tag{2.19}$$
Zgate is maximum at its resonance frequency. By proper design of lg and Zg, this
resonance can be placed around the fourth harmonic.
The outputs of the four cores are connected to a power combiner. To design a
VCO at the fourth harmonic, the coupling mode is set to Ψ = π/2. In this mode
the second harmonics from the four cores cancel out at the output. The first and
third harmonics are also canceled because of the differential structure. As a result,
the lowest harmonic frequency at the output is the fourth harmonic. Furthermore,
for optimal power combining, the drain impedance of the transistor has to be power
matched. The VCO including the four coupled cores and the combining network
is shown in Fig. 2.8. The drain lines are first connected together and subsequently
connected to the load. The impedance and length of the lines are designed for
matching at the fourth harmonic. The resulting output matching is shown in Fig.
2.9. Fig. 2.10 shows the waveforms of the core oscillators at the fundamental
Figure 2.7: Half-circuit equivalent of the core oscillator at the fourth har-
monic frequency.
frequency and the output node at the fourth harmonic.
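The harmonic cancellation described above can be checked with a short phasor sum. The sketch assumes ideal, equal-amplitude contributions: each core has two half-circuits that are π apart at the fundamental, adjacent cores are Ψ = π/2 apart, and the nth harmonic of a signal with fundamental phase θ has phase nθ. The function name `combined_harmonic` is illustrative.

```python
import cmath
import math

def combined_harmonic(n_harm, n_cores=4, psi=math.pi / 2):
    """Magnitude of the n-th harmonic summed at a common output node.

    Every half-circuit contributes a unit-amplitude phasor whose phase is
    n_harm times its fundamental phase.
    """
    total = 0j
    for core in range(n_cores):
        for half in (0.0, math.pi):
            total += cmath.exp(1j * n_harm * (core * psi + half))
    return abs(total)

for n_harm in range(1, 5):
    print(f"harmonic {n_harm}: combined magnitude = {combined_harmonic(n_harm):.3f}")
```

Under these idealized assumptions only the fourth harmonic adds in phase, with all eight half-circuit contributions summing at the output.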
2.3.3 Coupling Circuit
As explained in Section 2.2, the coupling block requires a tunable phase shifter in
order to control the frequency and the coupling mode. Besides, this block should be
buffered from the core oscillators to minimize the power loss due to coupling. Fig.
2.11 shows the designed coupling network. The circuit consists of three stages. The
first two stages composed of transistors M1 and M2 act as variable phase shifters
and the third stage M3 acts as a buffer prior to connecting the phase shifters to
the next core oscillator. A fully differential scheme is implemented and the supply
of the phase shifters is provided through the virtual ground.
Figure 2.8: Top view of the proposed four core coupled oscillator VCO show-
ing the cores, the coupling block and the power combiner.
Figure 2.9: Simulated output matching of the oscillator at the fourth harmonic.
[Plot: S11 (dB) versus frequency (GHz).]
Figure 2.10: Simulated waveforms of the core oscillators and the output node
oscillating at the fourth harmonic frequency.
[Plot: voltage (V) versus time (psec) for Core1 to Core4 and the output.]
Figure 2.11: The differential phase shifter followed by the coupler to the next
core.
The two stage phase shifter is composed of two resonating amplifiers. The
resonance frequency is designed to be close to the fundamental frequency of the
core oscillators. The varactor tunes the resonance frequency and subsequently
changes the overall phase shift. To better understand the process let us consider
the resonator at frequencies close to the resonance frequency [30]. It can be shown
that the total phase shift of the phase shifter is
$$\phi_c(\omega) = \phi_c^o + \frac{4Q_c}{\omega_r}\cdot(\omega - \omega_r), \tag{2.20}$$
where ωr and Qc are the center frequency and quality factor of each resonator,
respectively. $\phi_c^o$ represents the phase shift at ω = ωr from the output of the prior
core to the injecting current from M3 to the next core. The total phase shift φc
includes the variable phase shifts from the first two stages and the fixed phase
shift from M3.
In order to find the resulting frequency we place (2.20) into the coupling equa-
tion in (2.13):
$$\omega = \omega_0 + K\sin\left(\phi_c^o + \frac{4Q_c}{\omega_r}\cdot(\omega - \omega_r) - \Psi_k\right). \tag{2.21}$$
Finding the exact tuning range requires numerical simulation. However, it is in-
structive to find the trend in (2.21) by finding the change in the oscillating fre-
quency with respect to the change in ωr:
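$$\frac{d\omega}{d\omega_r} = \frac{-\dfrac{4KQ_c}{\omega_r^2}\,\omega\cos\left(\phi_c^o - \Psi_k + \dfrac{4Q_c}{\omega_r}(\omega - \omega_r)\right)}{1 - \dfrac{4KQ_c}{\omega_r}\cos\left(\phi_c^o - \Psi_k + \dfrac{4Q_c}{\omega_r}(\omega - \omega_r)\right)}. \tag{2.22}$$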
The tuning method is based on changing ωr of the phase shifter by tuning the
resonance capacitors. Interestingly as (2.22) suggests, an optimum K results in
maximum change of ω with respect to ωr.
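Because (2.21) is implicit in ω, a numerical solve is a natural way to see the trend. The sketch below solves it by bisection in normalized units (ω0 = 1) and compares a finite-difference slope against (2.22). The parameter values are hypothetical and are chosen so that 4KQc/ωr < 1, which keeps the residual monotonic and the root unique; they are not the design values.

```python
import math

def locked_frequency(omega_r, omega0, k, q_c, phi0, psi_k):
    """Solve the implicit relation (2.21) for omega by bisection.

    Assumes 4*k*q_c/omega_r < 1, so the residual is increasing and the
    root is unique inside [omega0 - k, omega0 + k].
    """
    def residual(omega):
        theta = phi0 - psi_k + 4.0 * q_c * (omega - omega_r) / omega_r
        return omega - omega0 - k * math.sin(theta)

    low, high = omega0 - k, omega0 + k
    for _ in range(200):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)

def slope_from_2_22(omega, omega_r, k, q_c, phi0, psi_k):
    """d(omega)/d(omega_r) as given by (2.22)."""
    c = math.cos(phi0 - psi_k + 4.0 * q_c * (omega - omega_r) / omega_r)
    return (-4.0 * k * q_c * omega * c / omega_r ** 2) / (1.0 - 4.0 * k * q_c * c / omega_r)

# Hypothetical normalized parameters (omega_0 = 1).
PARAMS = dict(omega0=1.0, k=0.05, q_c=2.0, phi0=math.pi / 2, psi_k=math.pi / 2)
OMEGA_R, STEP = 1.02, 1e-5

omega = locked_frequency(OMEGA_R, **PARAMS)
numeric = (locked_frequency(OMEGA_R + STEP, **PARAMS)
           - locked_frequency(OMEGA_R - STEP, **PARAMS)) / (2.0 * STEP)
analytic = slope_from_2_22(omega, OMEGA_R, PARAMS["k"], PARAMS["q_c"],
                           PARAMS["phi0"], PARAMS["psi_k"])
print(f"omega = {omega:.6f}, numeric slope = {numeric:.6f}, (2.22) slope = {analytic:.6f}")
```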
Fig. 2.12 shows the simulated phase shift for each phase shifter at ω0 as a
function of the control voltage. Note that the actual phase shift is determined by
the locking frequency that is slightly different from ω0. The resulting change in
the locking frequency of the oscillators is also shown in Fig. 2.12. The control
voltage used in this plot is the DC voltage across the varactors. Since the gates of
all varactors are biased at 1.2 V, all voltages are positive in the entire range.
Figure 2.12: Simulated phase shift of the coupling block with respect to the
control voltage and the resulting change in the locking frequency
of the core oscillators.
[Plot: frequency (GHz) and phase shift (degree) versus control voltage (V).]
In practice K also changes both with respect to the control voltage and the
frequency. For an accurate picture of the frequency tuning one also has to consider
the change in the gain of the coupling amplifiers. However, in this design most
of the frequency tuning is achieved within the voltage range where the coupling
amplitude can be considered constant.
2.4 Implementation
The coupled oscillator VCO is designed and implemented in a 65 nm low-power
(LP) bulk CMOS process. The chip photo is shown in Fig. 2.13. This section will
discuss the steps taken for proper modeling and simulation of the VCO, followed
by the measurement setup and the measured results.
[Chip photograph, 600 µm × 600 µm: four core oscillators, four coupling blocks, and the combining network]
Figure 2.13: The chip micro photograph.
2.4.1 Simulation
Proper simulation of all connections and couplings is crucial in this design. This
is particularly important in the core oscillators, where the matching and power
combining networks operate around 300 GHz and all lines are comparable in length to
the wavelength. The core is designed based on the design methodology described
in Section 2.3. First, the center frequency is set by ld and Zd of the drain
transmission line. Next, harmonic matching is achieved by tuning lg and Zg of the
gate transmission line. We use shielded RF devices with a finger width of 2µm
both for the core and the coupling circuits. The width of the transistors in the
core oscillator is 36µm. The sizes of M1, M2, and M3 in the coupling block of Fig.
2.11 are 12µm, 24µm, and 24µm, respectively.
[Cross-section of the transmission-line metal stack with labeled dimensions 3.4 µm, 4.2 µm, 4 µm, 6 µm, and 6 µm]
Figure 2.14: Metal layers and the dimensions of transmission lines.
Ground shielding is placed under all transmission lines by using the first and
second metal layers in parallel. Fig. 2.14 shows the structure used for the trans-
mission lines. Because of the proximity between the gate and drain lines the effect
of coupling between the lines should be simulated. In order to do so the entire
metallization of the core oscillator is modeled as a 5-port device, four connected
to the gates and drains of the cross-coupled pair and one connecting the output
to the combining network. A symmetric layout is vital for maximizing the fourth
harmonic power generation and canceling lower harmonics. Each core oscillator is
shielded from other blocks using grounded metal walls.
The differential coupling block is designed to achieve equal lengths for the
two differential signal paths. To achieve a fully differential scheme the inductors
of the phase shifter are designed to be differential and their middle points are
wirebonded for the DC supply. The varactors in resonators are buffered from
the cores and as a result, they have minimal effect on the quality factor of the
oscillators. All varactors are connected to bypass capacitors and are ESD protected
prior to connecting to the pads. The phase shifters are designed to achieve good
phase tuning and sufficient energy coupling at the same time. The front-end buffer
device, M1, is smaller than the next two stages in order to minimize the loading
effect of the coupling circuit on the injecting oscillator. The power combiner is
designed for matching at the fourth harmonic frequency as shown in Fig. 2.9.
Since the reactance of the pad capacitance is significant at the target frequency,
its effect is considered in designing the combiner network.
2.4.2 Measurement
We use two separate setups to measure the output frequency and power. These
two test setups are shown in Fig. 2.15. The output of the VCO is connected to a
Cascade i325-GSG probe with a built-in bias tee that provides the DC current to
the four cores. An alternative to using a bias tee is to provide the DC current using
wirebonds and use a quarter wavelength stub to connect the supply to the output
node. To measure the output frequency, a VDI WR3.4EHM harmonic mixer is
used for down-converting the sub-mm-wave signal. The LO is set so that the
down-converted IF is below 1 GHz. In order to calculate the output frequency, we
first find the harmonic number of the LO that mixes with the RF signal.
This is done by changing the LO frequency and finding the ratio of the resulting
frequency change in the IF to the frequency change in the LO. Fig. 2.16 shows the IF spectrum
from the 16th harmonic of the LO.
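The harmonic-number procedure described above can be sketched in a few lines of Python. This is only an illustration: the function names and the LO and IF readings are hypothetical, and the sketch assumes that the IF does not cross zero between the two LO settings, so that the sign of the slope indicates on which side of the LO harmonic the RF signal lies.

```python
def harmonic_number(lo1, if1, lo2, if2):
    """Return (N, slope), where slope = (change in IF) / (change in LO)."""
    slope = (if2 - if1) / (lo2 - lo1)
    return round(abs(slope)), slope

def rf_frequency(lo, if_freq, n, slope):
    """Recover the RF frequency from one LO/IF reading.

    If IF = RF - N*LO, the IF falls as the LO rises (slope < 0).
    If IF = N*LO - RF, the IF rises with the LO (slope > 0).
    """
    if slope < 0:
        return n * lo + if_freq
    return n * lo - if_freq

# Hypothetical readings in GHz, not measured data.
n, slope = harmonic_number(18.10, 0.40, 18.11, 0.24)
print("harmonic number:", n)
print("RF frequency (GHz):", rf_frequency(18.10, 0.40, n, slope))
```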
By selecting two different values for the drain inductor Ld, two versions of the
chip are designed at two different center frequencies. Fig. 2.17 and Fig. 2.18 show
the measured output frequency for the two VCOs as a function of the control
voltage. The first VCO has around 13 GHz of tuning range around 290 GHz while
the second VCO has a tuning range of 8.4 GHz around 320 GHz. The second
VCO has a lower tuning range because the coupling circuit is the same for both versions,
while the tuning range of the phase shifter is centered around 290 GHz.
As shown in Fig. 2.12, the tuning range of the phase shifter is less than 180 degrees,
[Block diagram, frequency setup: DUT, Cascade i325-GSG probe with bias tee (Vdd from a power supply), WR-3.4 waveguide, VDI WR-3.4 EHM harmonic mixer, and diplexer, with the LO from an Agilent E8257D signal generator (0 - 40 GHz) and the IF to an Agilent 8564E spectrum analyzer]
[Block diagram, power setup: DUT, Cascade i325-GSG probe with bias tee (Vdd from a power supply), WR-3.4 bend, WR-3.4 to WR-10 tapered waveguide, and Erickson PM4 power meter with sensor head]
Figure 2.15: Top: Test setup for measuring the output frequency. Bottom:
Test setup for measuring the output power.
Figure 2.16: A typical measured output spectrum downconverted by the 16th
harmonic of the LO.
[Plot: output power (dBm) and frequency (GHz) versus control voltage (V)]
Figure 2.17: Measured output power and frequency tuning for the 290 GHz
VCO.
[Plot: output power (dBm) and frequency (GHz) versus control voltage (V)]
Figure 2.18: Measured output power and frequency tuning for the 320 GHz
VCO.
thus, frequency tuning range can be further increased by adding another stage to
the phase shifter.
For output power measurement, a wide-band Erickson PM4 power meter is
employed. As shown in Fig. 2.15, the probe is connected to a 90 degree bend, which
is followed by a WR3.4 to WR10 tapered line. The losses of all the components are
calibrated using a 500 GHz network analyzer. Fig. 2.19 shows the peak output
[Plot: peak output power (dBm) versus core DC power (mW) for VCO1 and VCO2, with the design point marked]
Figure 2.19: Measured peak output power vs. DC power.
power as a function of the input DC power to the cores. The design point for both
chips is chosen at the point with the best DC to RF power ratio. Fig. 2.17 and
Fig. 2.18 show the output power as a function of the control voltage. The variation
in the output power is mostly due to variation in the gain of the phase shifters
for different control voltages. Although the oscillator systematically cancels all the
lower harmonics, mismatch between cores can lead to some undesired harmonic
leakage to the output. However, the amount of this leaked output power is small
compared to the main harmonic. Any leaked power is largely filtered out by the
output matching network and the high-pass nature of the rectangular waveguides.
This is verified by observing the output spectrum of the lower harmonics which
are at least 15 dB lower than the fourth harmonic.
For the first VCO, the peak output power is -1.2 dBm and the tuning range
is 4.5% around 290 GHz. The second VCO has a peak output power of -3.3 dBm
with a 2.6% tuning range around 320 GHz. The measured phase noises of the
290 GHz and 320 GHz sources at the 1 MHz offset frequency are -78 dBc/Hz and
-77 dBc/Hz, respectively. Each oscillator core consumes 81 mW from a 1.3 V
supply voltage and the power consumption of each coupling block is 31 mW from
a 1.1 V supply voltage. Because of the large DC current flowing into the cores,
the supply of the core oscillators is slightly higher than the nominal value of 1.2 V
in order to compensate for the voltage drop from the pad to the drain of the transistors.
Table 2.1 compares the performance of this VCO with the best signal
sources around 300 GHz. Prior to this work, the highest output powers in CMOS
were achieved with non-tunable oscillators. This VCO achieves both the highest
output power and the highest tunability among all the oscillators, including
those in compound semiconductor technologies that have significantly higher cut-off
frequencies. The DC to THz conversion efficiency of this work is higher than that of any
other CMOS tunable source in this frequency range. However, this efficiency is
still below 1%. This is mainly because a large portion of the DC power is converted
to the fundamental frequency as well as undesirable harmonics. It is intriguing to
explore novel harmonic generation schemes to maximize the conversion from DC
to the desired harmonic by minimizing the undesired harmonic generation.
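The efficiency and tuning-range figures quoted above follow from simple arithmetic, sketched here in Python. The function names are illustrative; the inputs are the measured values stated in the text and in Table 2.1, and the results agree with the tabulated percentages only to within rounding.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def dc_to_thz_efficiency_percent(p_out_dbm, p_dc_mw):
    """Ratio of output power to DC power, in percent."""
    return 100.0 * dbm_to_mw(p_out_dbm) / p_dc_mw

def tuning_range_percent(span_ghz, center_ghz):
    """Tuning span relative to the center frequency, in percent."""
    return 100.0 * span_ghz / center_ghz

# Values quoted in the text and in Table 2.1.
print("290 GHz VCO efficiency (%):", dc_to_thz_efficiency_percent(-1.2, 325.0))
print("320 GHz VCO efficiency (%):", dc_to_thz_efficiency_percent(-3.3, 339.0))
print("290 GHz VCO tuning range (%):", tuning_range_percent(13.0, 290.0))
print("320 GHz VCO tuning range (%):", tuning_range_percent(8.4, 320.0))
```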
2.5 Summary
In this chapter we propose a novel oscillator architecture that is suitable for high-
power generation in the terahertz region. The coupled oscillator structure in-
troduces a unique power generation and combining method that produces high
harmonic power above the transistor fmax. As a result of this architecture, power
generation and tuning can be performed without affecting each other. The measured
output power and tuning range of this VCO are the highest among
CMOS VCOs in this range and are comparable to those of compound semiconductor
oscillators with much higher fmax.

Table 2.1: Performance comparison of the state-of-the-art terahertz signal
sources

Ref.      | Freq. (GHz) | Output Power (dBm) | Tuning Range | Phase Noise (dBc/Hz) | DC Power (mW) | DC to THz Efficiency | Technology               | Source Type
This Work | 290         | -1.2               | 4.5%         | -78 @ 1 MHz          | 325           | 0.23%                | 65 nm LP Bulk CMOS       | Osc.
This Work | 320         | -3.3               | 2.6%         | -77 @ 1 MHz          | 339           | 0.13%                | 65 nm LP Bulk CMOS       | Osc.
[16]      | 482         | -7.9               | Non-tunable  | -76 @ 1 MHz          | 61            | 0.26%                | 65 nm Bulk CMOS          | Osc.
[17]      | 291         | -13.9 (*)          | Non-tunable  | NA                   | 19            | 0.21%                | 45 nm SOI CMOS           | Osc.
[18]      | 324         | -46                | 1.2%         | -78 @ 1 MHz          | 12            | 0.0002%              | 90 nm Bulk CMOS          | Osc.
[21]      | 296         | -3.9               | 4%           | -78 @ 1 MHz          | 115           | 0.35%                | InP HBT, fmax > 800 GHz  | Osc.
[22]      | 190         | 0                  | 13.6%        | NA                   | 91            | 1.1%                 | 45 nm SOI CMOS           | Freq. Mult.
[24]      | 325         | -3                 | 6.2%         | -101 (**) @ 1 MHz    | 420           | 0.12%                | 130 nm SiGe HBT          | Freq. Mult.

(*) Generated power from each oscillator assuming antenna efficiency of 50%.
(**) Employs a high power low noise external source.
CHAPTER 3
ELECTRICAL INTERFEROMETER: A NOVEL HIGH-SPEED
QUANTIZER
3.1 Introduction
The growing demand for higher data rates is drawing increasing attention to extremely
fast signal quantization. The applications include equalization and detection in
serial data links, wide-band radar and RF receivers, and high-speed instrumentation
[44–52]. High-speed data conversion faces different challenges both in the
fundamental aspect due to limited active device cut-off frequency and in imple-
mentation aspects due to issues such as sampling jitter and clock/data skew [49].
The fastest reported quantizers use high-performance compound-semiconductor
processes, and/or time interleaving to go beyond the speed limit of a single chan-
nel data converter. However, clock jitter and channel mismatch remain as the
main challenges of the time-interleaving approach [46–50]. To address the issue
of sampling jitter, optically assisted sampling has been proposed, which can lower
the jitter noise floor while increasing the fabrication cost [48,54].
Although conventionally analog signals are converted from voltage or current
domain directly to the digital domain, it is not necessarily the most effective
method because of active device limitations. In [53], time stretching in an op-
tical medium is proposed for fast quantization. In [54] and [55] optical sam-
pling/quantization using optical phase shifters and power detectors has been pro-
posed as another alternative for achieving higher sampling rates. In our previous
work we proposed the delay-line-based quantization which exploits CMOS fast
delay cells for GSample/s and power-efficient data conversion [56,57].
[Diagram: the analog input is applied to a 2-D lattice medium with impedance Z0; detectors at lattice nodes feed a digital encoder and memory, which produces the digital output]
Figure 3.1: 2-D lattice as an Electrical Interferometer.
In this chapter, we will propose wave propagation in electrical lattices as an-
other means of signal quantization that can be used for extremely high-speed data
conversion on silicon. 1-D and 2-D LC lattices have been previously studied and
exploited in extremely high-speed signal generation and transmission [58–64]. Another
area in which LC lattices have shown promising potential is fast signal processing
[65–70]. Wave propagation in discrete lattices is shown to behave similarly
to that in a continuous medium provided the wavelength is considerably larger than one
lattice spacing. In this region of operation, 2-D lattices behave similarly to an optical
medium, and properties such as diffraction and refraction are observed [67–69].
The general idea of using an LC lattice as a quantizer is portrayed in Fig. 3.1.
The analog signal changes the properties of the medium, which will in turn change
the interference pattern inside the wave-propagating medium. The properties of
the wave can be changed by using varactors in the LC lattice and controlling their
bias points. Furthermore, we can engineer the values of inductors and capacitors at
different points of the lattice to generate the desirable interference pattern. Since
this kind of change in lattice properties does not involve any signal processing
with active devices, it is inherently fast. Another way to describe the process
is that the analog signal is spread across the entire lattice, making it a spatial
quantizer. To the best of our knowledge, this is the first structure that quantizes
the analog signal in a 2-D electrical lattice. Depending upon the way the LC lattice
is designed, certain nodes in the lattice become more vital for signal detection.
Since the amplitude of the pattern at these nodes matters, a power detector will
be used to compare the power level with a certain threshold.
The rest of this chapter is organized as follows. In Section 3.2, we will review
the behavior of waves in 1-D and 2-D LC lattices. The analysis of the proposed
quantizer is presented in Section 3.3. Finally, in Section 3.4 we propose the design
methodology and present the simulation results.
3.2 Wave propagation and interference in lattices
Let us consider a 1-D discrete transmission line made of inductors and capacitors
as shown in Fig. 3.2.
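[Circuit diagram: ladder of series inductors Ln-1, Ln, Ln+1 and shunt capacitors Cn-1, Cn, Cn+1, with node voltages Vn-1, Vn, Vn+1 and currents In-1, In, In+1]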
Figure 3.2: Discrete 1-D transmission line.
By applying KCL at node n, whose voltage with respect to ground is Vn, and
applying KVL across the two inductors connected to this node, one can show that
the voltages of adjacent nodes on this transmission line are related via
$$\frac{1}{L_n}(V_n - V_{n-1}) + C_n\frac{d^2V_n}{dt^2} + \frac{1}{L_{n+1}}(V_n - V_{n+1}) = 0. \tag{3.1}$$
This differential equation can be analytically solved if inductors and capacitors
are identical [69]. For an inhomogeneous LC line where (3.1) is different for each
section, a continuum model can be used to find the response. This model is based
on the assumption that the electrical length of each section is considerably shorter
than the wavelength; thus, dispersion caused by discreteness is negligible. This
approach has been used in [63] for analyzing an exponentially tapered transmission
line with a constant L and C product. In this work, we perform tapering by keeping
the L/C ratio constant, thus minimizing fluctuation in the characteristic impedance
Z0. Keeping Z0 constant ensures power matching along the line and consequently
minimizes standing-wave formation, which can cause oscillation. Hence, keeping
Z0 constant reduces the settling time of the lattice in response to the input change.
3.2.1 Exponential tapering with constant L/C
In order to analyze a constant L/C tapering, we choose the inductance and capac-
itance of the nth section to be
$$C_n = C_0\exp(\rho n), \qquad L_n = L_0\exp(\rho n), \tag{3.2}$$
where ρ, C0, and L0 are constants. By substituting these values into (3.1) and
assuming that the values of inductance and capacitance do not change abruptly
between adjacent nodes (ρ ≪ 1), we can use the continuum limit to obtain a
single differential equation for the line. In this case, we can approximate (3.1)
with respect to its spatial derivatives assuming that the node spacing is δ and
the node location x is nδ. By performing the Taylor series expansion of Vn+1 and
Vn−1 around Vn and neglecting third- and higher-order terms, we can derive the
continuum limit differential equation:
$$\frac{d^2v}{dt^2} = \frac{\exp(-2\mu x)}{l_0c_0}\,\frac{d^2v}{dx^2}, \tag{3.3}$$
where l0 = L0/δ and c0 = C0/δ are the per-unit-length inductance and capacitance of the line,
µ = ρ/δ is the continuous tapering coefficient, and v(x) is the continuum limit of
Vn. Next, since Z0 is constant, we assume a one-way traveling-wave solution:
$$V(x,t) = A(x)\cos(\omega t - f(x)), \tag{3.4}$$
where we seek to find the function f(x) that fits best to the phase shift function.
Inserting (3.4) into (3.3) results in
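$$-\omega^2 l_0c_0\exp(2\mu x)\,A(x)\cos(\omega t - f(x)) = \cos(\omega t - f(x))\left[\frac{d^2A}{dx^2} - A(x)\left(\frac{df}{dx}\right)^2\right] + \sin(\omega t - f(x))\left[2\frac{dA}{dx}\cdot\frac{df}{dx} - A(x)\frac{d^2f}{dx^2}\right]. \tag{3.5}$$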
Based on sine and cosine coefficients, two independent equations can be derived
from (3.5):
$$2\frac{dA}{dx}\cdot\frac{df}{dx} - A(x)\frac{d^2f}{dx^2} = 0, \tag{3.6}$$
$$-\omega^2 l_0c_0\exp(2\mu x)\,A(x) = \frac{d^2A}{dx^2} - A(x)\left(\frac{df}{dx}\right)^2. \tag{3.7}$$
Solving (3.6) leads to
$$\frac{df}{dx} = cA(x)^2, \tag{3.8}$$
where c is a constant. By substituting (3.8) into (3.7), we get a single differential
equation in terms of A(x):
$$-\omega^2 l_0c_0\exp(2\mu x)\,A(x) = \frac{d^2A}{dx^2} - c^2A(x)^5. \tag{3.9}$$
Assuming smooth tapering in the line, µ ≪ 1, we can assume the second-order derivative
of the amplitude to be small. Thus, by neglecting the first term on the right-hand
side of (3.9), A(x) can be solved as
$$A(x) = \sqrt[4]{\frac{\omega^2 l_0c_0}{c^2}}\cdot\exp(\mu x/2). \tag{3.10}$$
Consequently, by substituting (3.10) into (3.8), f(x) can be derived as
$$f(x) = \frac{\sqrt{\omega^2 l_0c_0}}{\mu}\cdot\exp(\mu x). \tag{3.11}$$
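Now we return to the discrete node analysis by setting x = δn in (3.4) and
replacing A(x) and f(x) with A(n) and f(n):
$$\begin{aligned} V_n(t) &= A(n)\cos(\omega t - f(n)) \\ A(n) &= \sqrt[4]{\frac{\omega^2L_0C_0}{(\delta c)^2}}\cdot\exp(\rho n/2) \\ f(n) &= \frac{\sqrt{\omega^2L_0C_0}}{\rho}\cdot\exp(\rho n) \end{aligned} \tag{3.12}$$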
From (3.12) and by looking at the exponential terms we can conclude that the
variation in A(n) is slow compared to f(n); thus, for our further analysis we neglect
the amplitude variation across the line and only consider the phase variation. The
delay per section can be found as
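$$\begin{aligned} T_d(n)\cdot\omega &= f(n) - f(n-1) \simeq \left.\frac{df(x)}{dx}\right|_{x=\delta n}\cdot\delta \\ T_d(n) &= \sqrt{L_0C_0}\,\exp(\rho n). \end{aligned} \tag{3.13}$$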
Fig. 3.3 shows Td for different values of ρ, where simulation closely follows our
analysis.
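To make (3.12) and (3.13) concrete, the following Python sketch evaluates the per-section delay in two ways: from the closed form in (3.13), and from the discrete phase difference f(n) − f(n − 1) before the derivative approximation is applied. It uses the component values and frequency quoted in the caption of Fig. 3.3; the function names are illustrative, and this is a check of the analytical expressions only, not a reproduction of the circuit simulation.

```python
import math

L0 = 50e-12                  # section inductance (H), from the Fig. 3.3 caption
C0 = 10e-15                  # section capacitance (F), from the Fig. 3.3 caption
OMEGA = 2 * math.pi * 60e9   # angular frequency (rad/s), from the Fig. 3.3 caption

def phase(n, rho):
    """f(n) from (3.12); only defined for rho != 0."""
    return OMEGA * math.sqrt(L0 * C0) / rho * math.exp(rho * n)

def delay_continuum(n, rho):
    """Td(n) from the closed form in (3.13)."""
    return math.sqrt(L0 * C0) * math.exp(rho * n)

def delay_difference(n, rho):
    """(f(n) - f(n-1)) / omega, i.e. the first line of (3.13) without the derivative step."""
    return (phase(n, rho) - phase(n - 1, rho)) / OMEGA

for rho in (0.025, 0.05, 0.075, 0.1):
    print(rho, delay_continuum(10, rho), delay_difference(10, rho))
```

The two values differ by the factor (1 − exp(−ρ))/ρ, which approaches one as ρ becomes small, consistent with the smooth-tapering assumption.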
We can furthermore elaborate on (3.12) by approximating f(n) with a second
order polynomial for a small tapering coefficient. By only keeping the first three
terms we get
$$V_n(t) = A(n)\cos(\omega t - \phi_0 - k(n)\cdot n), \tag{3.14}$$
[Plot: section delay (psec) versus transmission line section number (1 to 19), simulation and analysis, for ρ = 0, 0.025, 0.05, 0.075, and 0.1]
Figure 3.3: Time delay vs. section number in a 1-D tapered line, for different
tapering coefficients. Simulation is done with L0 = 50 pH, C0 =
10 fF, and ω = 2π · 60 GHz and it is compared with our analysis.
where φ0 is a constant phase shift and
$$k(n) = \omega\sqrt{L_0C_0}\left(1 + \frac{\rho n}{2}\right) \tag{3.15}$$
is defined as the effective wave number at node n. From the above, we conclude
that the exponential tapering will cause the wave number to increase (decrease)
linearly along the line for positive (negative) values of ρ. This also means that
we get a linearly decreasing (increasing) wavelength across the transmission line.
Engineering the wavelength along the wave path is an effective way to change the
topography of the interference pattern. We will exploit this property in designing
interference patterns.
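A short Python sketch of (3.15) shows how the effective wave number and the local wavelength (measured in lattice sections) vary along a tapered line. The component values are those quoted in the caption of Fig. 3.3, the function names are illustrative, and the tapering coefficient in the loop is an arbitrary example value.

```python
import math

L0 = 50e-12                  # section inductance (H), from the Fig. 3.3 caption
C0 = 10e-15                  # section capacitance (F), from the Fig. 3.3 caption
OMEGA = 2 * math.pi * 60e9   # angular frequency (rad/s), from the Fig. 3.3 caption

def wave_number(n, rho):
    """Effective wave number k(n) from (3.15), in radians per section."""
    return OMEGA * math.sqrt(L0 * C0) * (1.0 + rho * n / 2.0)

def wavelength_in_sections(n, rho):
    """Local wavelength 2*pi/k(n), expressed as a number of sections."""
    return 2.0 * math.pi / wave_number(n, rho)

for n in (0, 5, 10, 15):
    print(n, wave_number(n, 0.05), wavelength_in_sections(n, 0.05))
```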
3.2.2 Interference pattern in 1-D LC ladders
Let us consider a 1-D discrete transmission line with N sections that is terminated
with matched loads at both ends. We apply in-phase sources to the two ends and
observe the interference pattern. The forward- and backward-propagating waves
at node n can be written as
$$\begin{aligned} V_n^+(t) &= \mathrm{Re}\{V_0\exp(j\omega t - \gamma n)\} \\ V_n^-(t) &= \mathrm{Re}\{V_0\exp(j\omega t - \gamma[N-n])\}, \end{aligned} \tag{3.16}$$
where γ is the complex propagation constant. By replacing γ with α + jβ and
applying superposition at the nth node of the line, the node
voltages are derived as
$$\begin{aligned} V_{tot} &= V_n^+ + V_n^- \\ &= V_0\cos(\omega t - \beta n)\,e^{-\alpha n} + V_0\cos(\omega t - \beta(N-n))\,e^{-\alpha(N-n)} \\ &= V_0\left[\cos(\omega t - \beta(N/2 - n'))\,e^{-\alpha(N/2-n')} + \cos(\omega t - \beta(N/2 + n'))\,e^{-\alpha(N/2+n')}\right], \end{aligned} \tag{3.17}$$
where we define
$$n' = N/2 - n \tag{3.18}$$
in order to have a symmetric form around the center. Here, α is the loss of the
line and β is the propagation constant for a lossless line. In (3.17), if the loss is
small (α ≪ 1) or if the nodes of interest are around the center of the transmission
line (n′ ≪ N/2), the two loss exponential terms in (3.17) are approximately equal.
Thus, we can rewrite (3.17) as
$$V_{tot} = 2V_0\exp(-\alpha N/2)\cos(\beta n')\cos(\omega t - \beta N/2). \tag{3.19}$$
The above equation shows that due to interference, the voltage amplitudes of the
nodes are modulated across the line, by the periodic function, cos(βn′).
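The quality of the approximation in (3.19) can be examined with a small Python sketch that compares the amplitude of the exact two-wave superposition with the cos(βn′) envelope. The phasor convention assumed here is that each wave decays as exp(−αn) and accumulates phase as exp(−jβn) over n sections; the function names and the values of N, α, and β are hypothetical.

```python
import cmath
import math

def amplitude_exact(n, N, alpha, beta, v0=1.0):
    """Amplitude at node n of the sum of the forward and backward waves."""
    gamma = complex(alpha, beta)   # per-section loss and phase
    return abs(v0 * cmath.exp(-gamma * n) + v0 * cmath.exp(-gamma * (N - n)))

def amplitude_approx(n, N, alpha, beta, v0=1.0):
    """Amplitude predicted by (3.19): 2*V0*exp(-alpha*N/2)*|cos(beta*n')|."""
    n_prime = N / 2.0 - n
    return abs(2.0 * v0 * math.exp(-alpha * N / 2.0) * math.cos(beta * n_prime))

# Hypothetical line: 40 sections, small loss, about 0.27 rad of phase per section.
N, ALPHA, BETA = 40, 0.01, 0.2666
for n in range(0, N + 1, 5):
    print(n, amplitude_exact(n, N, ALPHA, BETA), amplitude_approx(n, N, ALPHA, BETA))
```

For a lossless line the two expressions coincide at every node; with loss they coincide exactly at the center and drift apart toward the ends, which is the regime the approximation excludes.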
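[Diagram: LC product versus section number (n = 1, n = N/2, n = N), with the levels L0C0 and L0C0 e^{ρN} marked, and the wave components V_+^+, V_-^+, V_+^-, and V_-^- in the two halves of the line]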
Figure 3.4: Incident and reflected waves across the two sections of the tapered
1-D transmission line.
The same analysis can be performed to find the interference pattern inside a
tapered line. We consider a centrally symmetric tapering by choosing the values
of inductors and capacitors as
$$C_n = C_0\exp(\rho|n'|), \qquad L_n = L_0\exp(\rho|n'|). \tag{3.20}$$
As shown in Fig. 3.4, we can write the response of the transmission line to
(3.16) as the sum of the transmitted and reflected waves in the two regions, where
exponential tapering results of (3.14) are valid. In this case, the voltage of the line
can be expressed as
$$V_{tot} = \begin{cases} V_+^+\exp(-i\beta_{n'}n') + V_-^+\exp(i\beta_{n'}n'), & n < N/2 \\ V_+^-\exp(-i\beta_{n'}n') + V_-^-\exp(i\beta_{n'}n'), & n > N/2, \end{cases} \tag{3.21}$$
where
$$\beta_{n'} = \sqrt{L_0C_0}\,\omega\,(1 + \rho N/4 - \rho|n'|/2) \tag{3.22}$$
is the varying propagation constant, which is the same as k(n) in (3.14). Since the
characteristic impedance is kept constant across the line, no reflection happens at
the center boundary, which means $V_+^+(n = N/2) = V_-^-(n = N/2)$ and
$V_-^+(n = N/2) = V_+^-(n = N/2)$. By applying this condition and the boundary conditions
at two ends of the line, the solution becomes
$$V_{tot} = 2V_0\cos(\beta_{n'}n')\cos(\omega t - \beta_{n'}N/2). \tag{3.23}$$
This solution is quite similar to (3.19), when we replace β with βn′ . We did not
consider the effect of loss in the tapered line, but by comparing (3.23) to the
uniform line result of (3.19), we can observe that the loss should similarly result
in an amplitude peak lower than 2V0.
3.2.3 Generalization to 2-D
A 2-D lattice composed of inductors and capacitors is shown in Fig. 3.5. This
lattice gives more degrees of freedom in designing circuits as compared with a 1-D
line. Although feeding the input to a 2-D lattice is more complicated than feeding a 1-D
transmission line and requires a power division network, for a given number of
inductors, the propagation length from the edge of the 2-D lattice to its center is
much shorter than that of a 1-D transmission line, which makes 2-D structures more desirable
in the presence of loss.
In this lattice, we assume that all four sides are terminated to matched loads.
In addition, we assume an N ×M rectangular lattice and apply in-phase signal
sources to all four boundaries. Assuming no reflection on the sides, we can write
the effect of all four waves at each node of the lattice as
$$\begin{aligned} V_{n,m}(t) = \mathrm{Re}\{&V_0\exp(i\omega t - ik_xn) + V_0\exp(i\omega t - ik_x(N-n)) + \\ &V_0\exp(i\omega t - ik_ym) + V_0\exp(i\omega t - ik_y(M-m))\}, \end{aligned} \tag{3.24}$$
where kx and ky are the propagation constants in the two perpendicular directions.
Since the four plane waves of the right side of (3.24) only propagate in the x or y
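[Diagram: 2-D lattice of nodes V_{n,m} and its eight neighbors, connected by inductors L^x_{n,m} and L^y_{n,m} with a capacitor C_{n,m} at each node; waves with wave numbers kx, -kx, ky, and -ky enter from the four sides]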
Figure 3.5: Discrete 2-D transmission lattice.
direction, we can treat them as the solutions of orthogonal 1-D transmission lines.
Using a change of variable,
$$n' = N/2 - n, \qquad m' = M/2 - m, \tag{3.25}$$
equation (3.24) is simplified to
$$V_{n,m}(t) = 2V_0\cos(\omega t - k_xN/2)\cos(k_xn') + 2V_0\cos(\omega t - k_yM/2)\cos(k_ym'). \tag{3.26}$$
If the lattice is centrally symmetric (i.e., kx = ky = k and N = M), (3.26) becomes
$$V_{n,m}(t) = 4V_0\cos\!\left(k\,\frac{n'+m'}{2}\right)\cos\!\left(k\,\frac{n'-m'}{2}\right)\cos(\omega t - kN/2). \tag{3.27}$$
[Plot: capacitance (fF) versus DC bias (V), from -0.5 V to 0.5 V]
Figure 3.6: The characteristic of an accumulation mode MOS varactor.
The solution has a constant envelope (4V0), modulated by two diagonal periodic
functions. This 2-D modulation provides a pattern that is well suited for the
purpose of our quantizer.
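The step from the four-wave superposition in (3.24) to the product form in (3.27) can be checked numerically. The Python sketch below, with illustrative function names and hypothetical values of N and k, compares the amplitude of the phasor sum of the four waves with the two-cosine envelope for a square, lossless lattice with kx = ky = k.

```python
import cmath
import math

def phasor_sum(n, m, N, k, v0=1.0):
    """Phasor of the four waves in (3.24) for kx = ky = k and M = N."""
    return v0 * (cmath.exp(-1j * k * n) + cmath.exp(-1j * k * (N - n))
                 + cmath.exp(-1j * k * m) + cmath.exp(-1j * k * (N - m)))

def envelope_product(n, m, N, k, v0=1.0):
    """Spatial envelope of (3.27), before the cos(wt - kN/2) carrier."""
    n_p, m_p = N / 2.0 - n, N / 2.0 - m
    return 4.0 * v0 * math.cos(k * (n_p + m_p) / 2.0) * math.cos(k * (n_p - m_p) / 2.0)

# Hypothetical lattice: 40 x 40 sections, 0.2 rad of phase per section.
N, K = 40, 0.2
for n, m in ((20, 20), (20, 25), (12, 30), (5, 5)):
    print(n, m, abs(phasor_sum(n, m, N, K)), abs(envelope_product(n, m, N, K)))
```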
3.3 Interferometric quantization
Based on our analysis in the previous section, we propose a method for efficiently
changing the interference pattern of an LC lattice using MOS varactors. The
change in the capacitors will change the propagation constant and wavelength,
which will in turn change the voltage across lattice nodes. This change in the
voltage across the lattice will be detected for the purpose of signal quantization.
3.3.1 The MOS Varactor
An MOS varactor is a device operating in depletion or accumulation mode, where
changing the gate voltage changes the effective capacitance [71]. The C-V curve
of an accumulation mode MOS varactor is shown in Fig. 3.6. The small signal
capacitance is defined as
$$C = \frac{dQ}{dV}. \tag{3.28}$$
For small variations, the C-V characteristic of Fig. 3.6 can be approximated as
$$C(V) = \frac{C_0}{(1 - bV)}, \tag{3.29}$$
where V is the bias voltage across the varactor and C0 and b are constants.
Even though (3.29) describes a nonlinear system, for small input amplitudes
around a bias voltage of Vc we can neglect the effect of nonlinearity and approxi-
mate the current across the capacitor by
$$I \approx \frac{C_0}{(1 - bV_c)}\,\frac{dV}{dt} = C(V_c)\,\frac{dV}{dt}. \tag{3.30}$$
Thus, if capacitances in Fig. 3.2 are replaced by varactors, we can simply
rewrite (3.1) by replacing all capacitors with the varactor small-signal capacitance,
C(Vc). Based on this linear model, we use varactors to change wave propagation
and its interference pattern in an LC lattice.
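The linearized varactor model can be sketched in Python as follows. The capacitance function implements (3.29), and the phase shift per section uses ω√(L0·C(Vc)), which is the uniform-line (ρ = 0) case of (3.15) with the fixed capacitor replaced by the varactor small-signal capacitance. The function names and all numerical values (C0, b, L0, and the frequency) are hypothetical and are not fitted to Fig. 3.6.

```python
import math

def varactor_capacitance(v, c0, b):
    """Small-signal capacitance C(V) = C0 / (1 - b*V) from (3.29)."""
    denom = 1.0 - b * v
    if denom <= 0.0:
        raise ValueError("model is only valid for b*V < 1")
    return c0 / denom

def phase_per_section(v_c, omega, l0, c0, b):
    """Phase shift per lattice section, omega*sqrt(L0*C(Vc)), in radians."""
    return omega * math.sqrt(l0 * varactor_capacitance(v_c, c0, b))

# Hypothetical values for illustration only.
OMEGA, L0, C0, B = 2 * math.pi * 30e9, 56e-12, 8e-15, 0.8
for v_c in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(v_c, varactor_capacitance(v_c, C0, B), phase_per_section(v_c, OMEGA, L0, C0, B))
```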
3.3.2 2-D lattice quantization
In the N × N 2-D lattice described in the previous section, if we replace the
capacitors with varactors the voltage swing at the nth row and mth column would
be
$$V_{n,m} = 4V_0\cos\left(\omega t - k(V_c)N/2\right)\cdot\cos\!\left[k(V_c)\,\frac{n'+m'}{2}\right]\cos\!\left[k(V_c)\,\frac{n'-m'}{2}\right], \tag{3.31}$$
where
$$k(V_c) = \sqrt{\frac{L_0C_0}{(1 - bV_c)}}\;\omega. \tag{3.32}$$
We propose the following quantization method:
1. Apply a sample of the analog signal to the bottom plate of the varactors,
resulting in a change of their capacitance by changing the bias point: Vc =
Vin.
2. Apply small-signal, in-phase sinusoidal inputs to all sides of the lattice.
3. Four plane waves generated by the input sources propagate inside the lattice
and form the interference pattern. This pattern depends on the value of the
varactors, and hence, is a function of the analog signal.
4. Using a power detector, we detect the output power at certain nodes of the
lattice.
5. The detected output is compared with a threshold level, VT. The nodes above
VT are defined as logic 1, and the nodes below it are defined as logic 0.
6. The total number of logic 1 outputs, Stot, is the digital output code.
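The six steps above can be sketched as an idealized Python model. The sketch assumes a lossless, uniform lattice whose node amplitudes follow (3.31) exactly, places the detectors on the central lobe as argued in the following paragraphs, and treats the detector as an ideal amplitude comparison. The function names and all numerical values (frequency, C0, b, V0, VT, and the lattice half-width) are hypothetical; only L0 = 56 pH is taken from the caption of Fig. 3.8.

```python
import math

def wave_number(v_in, omega, l0, c0, b):
    """k(Vc) from (3.32) with Vc = Vin, in radians per lattice section."""
    return omega * math.sqrt(l0 * c0 / (1.0 - b * v_in))

def quantize(v_in, omega, l0, c0, b, v0, v_t, half_width=30):
    """Count central-lobe nodes whose amplitude from (3.31) exceeds v_t (steps 4 to 6)."""
    k = wave_number(v_in, omega, l0, c0, b)
    count = 0
    for n_p in range(-half_width, half_width + 1):      # n' = N/2 - n
        for m_p in range(-half_width, half_width + 1):  # m' = M/2 - m
            u = k * (n_p + m_p) / 2.0
            w = k * (n_p - m_p) / 2.0
            if abs(u) > math.pi / 2 or abs(w) > math.pi / 2:
                continue  # node lies outside the first interference lobe
            if 4.0 * v0 * math.cos(u) * math.cos(w) > v_t:
                count += 1  # this node reads as logic 1
    return count

# Hypothetical operating point; L0 follows the Fig. 3.8 caption.
OMEGA, L0, C0, B, V0, VT = 2 * math.pi * 30e9, 56e-12, 8e-15, 0.8, 0.025, 0.08
for v_in in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(v_in, quantize(v_in, OMEGA, L0, C0, B, V0, VT))
```

In this idealized model a larger input voltage increases the varactor capacitance and the wave number, which shrinks the region above the threshold, so the output code decreases monotonically with the input.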
In order to show how Stot is related to the analog signal, we need to know which
outputs are suitable for this kind of quantization. There is no unique way to
select the output as long as a monotonic relation between the analog input and
the quantized output holds. We will show that by using the lattice nodes at the
first lobe of the interference pattern (in the center of the lattice) we obtain the
[Contour plot: normalized voltage amplitude versus section number in the X direction and section number in the Y direction, for a lattice of about 140 × 140 sections]
Figure 3.7: 2-D lattice interference pattern and its central lobe. The contours
represent nodes of the lattice with equal voltage amplitudes, and
the dashed lines represent nodes with zero amplitude. The volt-
age levels are normalized with respect to the input voltage level.
desired relationship. We can find other selection sets that can result in the same
relation, but in terms of implementation it is desirable to have processing nodes
close to each other to minimize routing issues and to share similar circuit blocks.
Also, from loss analysis done in the previous section we concluded that the effect
of loss on the amplitude modulation will be negligible at the central nodes.
A typical interference pattern for a uniform 2-D lattice is shown in Fig. 3.7.
The first interference lobe is surrounded by the dashed lines in the center of the
lattice. We are interested in finding the number of nodes that satisfy Vn,m > VT
inside this region. From (3.31), the first lobe can be mathematically described as:
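$$\begin{aligned} -\frac{\pi}{2} &\le k(V_{in})\,\frac{n'+m'}{2} \le \frac{\pi}{2} \\ -\frac{\pi}{2} &\le k(V_{in})\,\frac{n'-m'}{2} \le \frac{\pi}{2}. \end{aligned} \tag{3.33}$$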
For the nodes close enough to the center, we can perform a Taylor series ex-
pansion on the voltage waveform of (3.31) with respect to n′ + m′ and n′ − m′
and neglect third- and higher-order terms to obtain
$$4V_0\left(1 - k(V_{in})^2\cdot\frac{n'^2 + m'^2}{4} + O(n^4)\right) > V_T. \tag{3.34}$$
Neglecting higher-order terms and substituting (3.32) into (3.34), we have
$$n'^2 + m'^2 < [1 - bV_{in}]\,\frac{\omega_T^2}{\omega^2}\left(1 - \frac{V_T}{4V_0}\right), \tag{3.35}$$
where
$$\omega_T = \frac{2}{\sqrt{L_0C_0}} \tag{3.36}$$
is the lattice cut-off frequency in the orthogonal direction. Note that in 2-D lattices
ωT depends on the wavefront direction [70]. The nodes that satisfy (3.34) are within
a circle with a radius R equal to the square root of the right-hand side of (3.35).
The number of nodes inside this circle is equal to its area:
$$S_{tot}(V_{in}) = \lfloor\pi R^2\rfloor = \left\lfloor\pi[1 - bV_{in}]\,\frac{\omega_T^2}{\omega^2}\left(1 - \frac{V_T}{4V_0}\right)\right\rfloor. \tag{3.37}$$
Thus, the quantized output code is, approximately, linearly proportional to the
input analog voltage. Note that the proportionality factor will be
$$\frac{\Delta S_{tot}}{\Delta V_{in}} \approx \frac{d(\pi R^2)}{dV_{in}} = -b\pi\,\frac{\omega_T^2}{\omega^2}\left(1 - \frac{V_T}{4V_0}\right). \tag{3.38}$$
From the above, we can conclude that the quantizer will have higher resolution
when capacitor variation (b) is higher, the operating frequency is much less than
the cut-off, and the threshold voltage (VT) is lower. As shown in Fig. 3.8, simulation
results follow the predicted behavior in (3.37) with minor deviations, which
are mostly due to the linear and continuum approximations we made in the
analysis.
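The closed-form results (3.36) to (3.38) are straightforward to evaluate. The Python sketch below does so with illustrative function names; L0 = 56 pH follows the caption of Fig. 3.8, while the frequency, C0, b, V0, and VT are hypothetical values, so the printed codes are not meant to reproduce the curves in that figure.

```python
import math

def cutoff_frequency(l0, c0):
    """Lattice cut-off frequency omega_T from (3.36)."""
    return 2.0 / math.sqrt(l0 * c0)

def s_tot(v_in, omega, l0, c0, b, v0, v_t):
    """Quantizer output code from (3.37)."""
    ratio = (cutoff_frequency(l0, c0) / omega) ** 2
    return math.floor(math.pi * (1.0 - b * v_in) * ratio * (1.0 - v_t / (4.0 * v0)))

def code_slope(omega, l0, c0, b, v0, v_t):
    """Proportionality factor from (3.38), in codes per volt."""
    ratio = (cutoff_frequency(l0, c0) / omega) ** 2
    return -b * math.pi * ratio * (1.0 - v_t / (4.0 * v0))

# Hypothetical operating point; L0 follows the Fig. 3.8 caption.
OMEGA, L0, C0, B, V0, VT = 2 * math.pi * 30e9, 56e-12, 8e-15, 0.8, 0.025, 0.08
for v_in in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(v_in, s_tot(v_in, OMEGA, L0, C0, B, V0, VT))
print("slope (codes/V):", code_slope(OMEGA, L0, C0, B, V0, VT))
```

Lowering the operating frequency relative to the cut-off, increasing b, or lowering VT each increases the magnitude of the slope, which is the resolution trend stated above.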
[Plot: quantizer's output Stot versus input voltage (V), from -0.2 V to 0.19 V, for a uniform lattice; simulation and analysis curves]
Figure 3.8: Analysis vs. circuit simulation of the quantizer input-output
relation for a uniform lattice with L0 = 56 pH and C0 as in Fig. 3.6.
The comparison is performed for VT = 70 mV (blue), VT = 80 mV
(red), and VT = 90 mV (green). In order to take into account the
effect of loss in the analysis, a smaller input swing is applied
to (3.37).
It is worth mentioning that the circular region is accurate when the threshold
voltage is close to the peak voltage, 4V0. As the threshold decreases, the circular
region gradually transforms to a rectangular region. This transformation can be
clearly seen in the constant envelope contours of Fig. 3.7. For VT = 0, we have
the whole square-shaped central lobe, with its diagonal equal to the wavelength λ,
and we can write
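$$S_{tot}(V_{in}) = \left\lfloor\frac{\lambda^2}{2}\right\rfloor = \left\lfloor\pi^2[1 - bV_{in}]\,\frac{\omega_T^2}{2\omega^2}\right\rfloor \tag{3.39}$$
$$\frac{\Delta S_{tot}}{\Delta V_{in}} \approx -b\,\frac{\pi^2}{2}\,\frac{\omega_T^2}{\omega^2}. \tag{3.40}$$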
This result is slightly different than (3.38) in the proportionality factor, but the
input-output relation is similar.
[Surface plot: LC/L0C0 (from 1 to 2) versus X section number (n) and Y section number (m)]
Figure 3.9: A 2-D lattice with central symmetric LC tapering.
3.3.3 The effect of tapering
The above analysis is based on a uniform lattice, but this is not necessarily the best
implementation. Since we are only interested in the central lobe of the interference
pattern, a uniform profile is not optimal. In the previous section, we showed
that a constant-L/C-ratio tapering in 1-D gradually decreases the wavelength,
which means that the interference pattern has a shorter wavelength at the center.
In other words, the central lobe is focused with respect to the other lobes.
In a 2-D lattice, by applying this tapering in both dimensions, for a positive ρ
the central lobe becomes smaller and more focused, while for a negative ρ the cen-
tral lobe becomes larger and more dispersed. In Fig. 3.9, such a central symmetric
tapering with a positive ρ is shown. We can use tapering to get better resolution
on the nodes of interest.
3.4 Quantizer design and simulation
In order to get the best performance out of the proposed quantizer, we need to
optimally design different aspects of the structures. From (3.38), the quantization
factor is a strong function of the ratio of the lattice cut-off frequency to the carrier
frequency, and a higher ratio provides a better resolution.
On the other hand, the lattice response time plays a major role in the overall
quantization speed. Because we maintain matching conditions in the entire lattice,
reflection over the boundaries is minimal. Hence, the time it takes for the lattice
to respond to a change in the analog input will be the time it takes for the wave to
travel through the lattice sections. For frequencies reasonably below the cut-off,
we can write this time as
\[ T_D = \sum_{i=1}^{N} \sqrt{L_i C_i}, \qquad (3.41) \]
where Li and Ci are the inductance and capacitance of the ith section, respectively.
One can achieve the fastest response time by setting every Li equal to L0 and every Ci equal
to C0:
\[ T_{min} = \sum_{i=1}^{N} \sqrt{L_0 C_0} = \frac{2N}{\omega_T}, \qquad (3.42) \]
where we have substituted for ωT from (3.36). Note that, if the nodes of interest
are close to the center of the lattice, only half of the lattice sections contribute to the
overall delay, which will significantly reduce the response time.
Equations (3.38) and (3.42) suggest that a higher ωT will result in higher res-
olution and higher speed, respectively. Thus, the reasonable way to design the
lattice is to choose ωT to be the highest possible value implementable in a certain
technology. In a properly designed structure, because of the high cut-off frequency
of the lattice, we expect the rest of the circuit, including the power detector and
the memories, to be the overall limiting factor. For this study we use TSMC’s
65nm standard CMOS technology, and we use Cadence for the simulations [72].
3.4.1 The 2-D lattice
The 2-D lattice shown in Fig. 3.5 gives us a vast degree of freedom in choosing the
inductors and capacitors as long as the L/C ratio remains constant everywhere.
We start by choosing C0 and L0 for a maximum ωT. The smallest varactor
in the process, with minimum dimensions, has an average C0 of around 9fF. Below
this value, the parasitic capacitance becomes significant. The characteristic of
this varactor is shown in Fig. 3.6. Furthermore, we select L0 = 56pH, since
this is the smallest lumped inductor that is modeled in the process. Custom
inductors smaller than this value can also be designed by performing extensive
electromagnetic modeling and simulation. With these values, the characteristic
impedance of the line will be Z0 = √(L0/C0) = 78Ω, which is a practical value
for implementation. Also, from (3.36), fT = ωT/(2π) = 425GHz, which is much
higher than the carrier frequencies that CMOS devices can handle. Thus, the signal
wavelength is much longer than the lattice spacing and our analysis based on a
continuum model is valid.
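The two numbers above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the cut-off relation ωT = 2/√(L0C0) that is implied by (3.42); the function names are illustrative. Because C0 is only quoted as "around 9fF", the computed cut-off is sensitive to the exact capacitance: 9fF gives a value somewhat above 425GHz, while an effective capacitance near 10fF (an assumption, not a value from the text) reproduces it.

```python
import math

L0 = 56e-12   # section inductance quoted in the text (H)
C0 = 9e-15    # "around 9fF" in the text (F); the exact value is an assumption


def characteristic_impedance(l_sec, c_sec):
    """Z0 = sqrt(L0 / C0) of the lumped line."""
    return math.sqrt(l_sec / c_sec)


def cutoff_frequency_hz(l_sec, c_sec):
    """fT = omegaT / (2*pi), assuming omegaT = 2 / sqrt(L0 * C0) as implied by (3.42)."""
    return 2.0 / math.sqrt(l_sec * c_sec) / (2.0 * math.pi)


z0 = characteristic_impedance(L0, C0)
f_t = cutoff_frequency_hz(L0, C0)
print(f"Z0 = {z0:.1f} ohm, fT = {f_t / 1e9:.0f} GHz")
```
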
In order to estimate the effect of loss in the lattice, the quality factors of
the inductor (QL) and capacitor (QC) are plotted in Fig. 3.10. In a continuum
approximation, for the lattice quality factor we have Q = β/(2α), where
\[ \alpha = \frac{\omega\sqrt{L_0C_0}}{2}\left(\frac{1}{Q_L}+\frac{1}{Q_C}\right) \qquad (3.43) \]
and
\[ \beta = \frac{1}{\sqrt{L_0C_0}} \qquad (3.44) \]
[Figure 3.10 plot: quality factor of L0 and C0 vs. frequency (GHz), from 10 to 100GHz.]
Figure 3.10: Inductor and capacitor quality factor vs. frequency. The design
point is shown by the dashed line.
are the real and imaginary parts, respectively, of the propagation constant, γ [73].
The loss of the varactor is clearly dominant at high frequencies and therefore sets
the overall lattice quality factor. We concluded from our analysis in the previous
section that loss should be small for the interference pattern to form effectively,
and simulation results suggest that a quality factor higher than 5 is
required. As a result, from Fig.
3.10 we choose our carrier frequency to be 60GHz.
The next step is to decide on lattice dimensions and any possible tapering.
From our analysis in Section 3.3, and as a rule of thumb, we desire to have the
entire central lobe of the interference pattern inside the lattice. Thus, from Fig.
3.7 and (3.39) the dimensions should be around λ, or
\[ N = \pi\,\frac{\omega_T}{\omega}. \qquad (3.45) \]
For fT = 425GHz and f = 60GHz, we get N = 23. This is a relatively large
number considering the amount of loss in the lattice. Furthermore, since power
[Figure 3.11 plots (top and bottom): amplitude RMS (V) vs. X section number (n) and Y section number (m).]
Figure 3.11: Uniform vs. tapered lattice output. Top: uniform lattice with
L = L0 and C = C0. Bottom: constant L/C tapered lattice with
tapering factor of ρ = 0.09. The tapered lattice focuses the first
interference lobe to fit inside the 16 × 16 lattice. Simulation is
done with source amplitudes of 50mV.
splitting is more efficient for powers-of-two numbers of inputs, we choose N = 16
as the closest one and will use tapering in order to focus the pattern in this smaller
dimension.
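The sizing step can be sketched as follows. The helper names are hypothetical, and "closest" is interpreted here as closest in linear distance, which is an assumption about the intended rounding rule. Note that (3.45) evaluates to a value between 22 and 23, which the text quotes as N = 23.

```python
import math


def lattice_size(f_t_hz, f_hz):
    """Sections per side needed to hold the central lobe: N = pi * fT / f, from (3.45)."""
    return math.pi * f_t_hz / f_hz


def nearest_power_of_two(x):
    """Power of two closest to x in linear distance (assumed rounding rule)."""
    lower = 2 ** math.floor(math.log2(x))
    upper = 2 * lower
    return lower if (x - lower) <= (upper - x) else upper


n_ideal = lattice_size(425e9, 60e9)
n_chosen = nearest_power_of_two(n_ideal)
print(f"N from (3.45) = {n_ideal:.2f}, chosen lattice size = {n_chosen}")
```
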
Although Fig. 3.10 shows that going to lower frequencies enhances the lattice quality
factor, the detector needs a few periods of the carrier to
detect the voltage level, so the carrier frequency needs to be higher than the sampling
frequency. In other words, the maximum value of the carrier frequency is set
by the quality factor of the varactors and its minimum is determined by the speed
of the detectors. In order to estimate the maximum achievable sampling rate,
we should take into account both the lattice delay, TD, and the detection time.
By dedicating two periods of the carrier to the detector, we can estimate the
maximum sampling rate as:
\[ F_S = \frac{1}{T_D + 2/f}. \qquad (3.46) \]
The fastest sampling rate can be achieved in a uniform lattice, where for TD =
Tmin, we get FSmax = 22GS/s. Due to tapering in our final design, TD becomes
slightly higher than Tmin. However, the detector response time remains as the
limiting factor in (3.46); thus, we choose a 50ps response time corresponding to
FS = 20GS/s for this design.
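The uniform-lattice estimate can be checked numerically. The sketch below simply evaluates (3.42) and (3.46) with N = 16, fT = 425GHz, and f = 60GHz; the function names are illustrative.

```python
import math


def lattice_delay_uniform(n_sections, f_t_hz):
    """T_min = 2N / omegaT for a uniform lattice, from (3.42)."""
    return 2.0 * n_sections / (2.0 * math.pi * f_t_hz)


def max_sampling_rate(t_d, f_carrier_hz, detector_periods=2):
    """F_S = 1 / (T_D + 2/f), from (3.46), with two carrier periods for detection."""
    return 1.0 / (t_d + detector_periods / f_carrier_hz)


t_min = lattice_delay_uniform(16, 425e9)
fs_max = max_sampling_rate(t_min, 60e9)
print(f"T_min = {t_min * 1e12:.1f} ps, FS_max = {fs_max / 1e9:.1f} GS/s")
```

With these inputs the lattice delay is roughly 12ps against roughly 33ps of detection time, which is consistent with the statement that the detector remains the limiting factor in (3.46).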
From our discussion in Section 3.2, a tapered line with ρ > 0 decreases the
wavelength, and we expect a smaller lobe compared with a uniform profile. Our
simulation results in Fig. 3.11 show that ρ = 0.09 is sufficient to fit the first
lobe into the 16 × 16 lattice. Also from Fig. 3.11, the tapering gives a larger peak
amplitude, which is desirable in the detection process.
Fig. 3.12 shows the quantized output versus the analog input for different VT ’s.
By performing simulations with different threshold levels, VT , we can choose the
[Figure 3.12 plot: quantizer's output Stot vs. input voltage (V), from −0.2 to 0.19. Curves: tapered lattice and uniform lattice.]
Figure 3.12: Quantizer’s output characteristic with and without tapering.
All inputs are ideal matched sources with input amplitude of
50mV. The simulation is done for VT = 70mV in blue, VT =
80mV in red, and VT = 90mV in green. We observe that tapering
focuses the variation into a smaller number of
nodes while maintaining the desired input-output trend.
best value for our design. In this design, we choose VT = 90mV to get a sufficiently
large dynamic range.
3.4.2 Specifying the input-output characteristic
The derived characteristic of Fig. 3.12 assumes that the outputs of all lattice nodes
in the first lobe are counted. This is neither practical nor useful, since the
resulting input-output curve is not desirable for many ADC applications.
For a more practical input-output characteristic, we will choose a subset of these
nodes. An input-output curve, Ssub(Vin), can be extracted from the original curve
[Figure 3.13 plots. (a) Left: Ssub(Vin) vs. Vin (Volt); right: selection set on X direction vs. Y direction. (b) Left: Ssub(Vin) vs. Vin (Volt); right: selection set on X direction vs. Y direction.]
Figure 3.13: Quantizer output vs. input voltage for two different selection
sets showing two examples of various achievable input-output
characteristics. Top left: A 16-level or 4-bit linear quantizer.
Top right: Its selection set for output nodes. Bottom left: A
logarithmic (saturating) quantizer. Bottom right: Its selection
set for output nodes
provided that,
\[ 0 \le \frac{\Delta S_{sub}}{\Delta V_{in}} \le \frac{\Delta S_{tot}}{\Delta V_{in}}. \qquad (3.47) \]
For any curve satisfying (3.47) across the whole input range, the total number of
node transitions from logic 0 to logic 1 available at the desired input levels, ∆Stot/∆Vin, is
equal to or greater than the required number, ∆Ssub/∆Vin. Thus, one should select
the subset of desired transition points and the nodes corresponding to them. At
transition points where more than one node is available, we tend to
choose the nodes in a uniform pattern across the lattice. This minimizes the
loading effect of the detectors on the lattice characteristic impedance. Figure
3.13(a) shows a 16-level (4-bit) linear characteristic, satisfying (3.47), achieved by
choosing only 16 nodes of the lattice.
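The selection rule can be illustrated with a small sketch. Everything about the data layout is an assumption made for illustration: the input range is split into steps, each step lists the nodes (as hypothetical (n, m) tuples) that toggle within it, and the function name `pick_nodes` is invented. Spreading the picks evenly through each candidate list is a simple stand-in for the uniform spatial pattern described above, not the procedure actually used for Fig. 3.13.

```python
def pick_nodes(candidates_per_step, required_per_step):
    """Choose nodes for each input step of the desired characteristic.

    candidates_per_step[k]: nodes that toggle in input step k (available
        transitions, Delta S_tot).
    required_per_step[k]: number of transitions wanted in step k (Delta S_sub).
    """
    if len(candidates_per_step) != len(required_per_step):
        raise ValueError("both lists must cover the same input steps")
    selection = []
    for k, (cands, need) in enumerate(zip(candidates_per_step, required_per_step)):
        # Condition (3.47): 0 <= Delta S_sub <= Delta S_tot in every step.
        if not 0 <= need <= len(cands):
            raise ValueError(f"step {k}: need {need}, have {len(cands)}")
        # Spread the picks evenly through the candidate list.
        picks = [cands[(2 * j + 1) * len(cands) // (2 * need)] for j in range(need)]
        selection.append(picks)
    return selection


example = pick_nodes([[(0, 0), (0, 1), (1, 0), (1, 1)], [(2, 2)], []], [2, 1, 0])
print(example)
```

A request that violates (3.47) in any step raises an error, which mirrors the statement that only curves bounded by the original characteristic can be extracted.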
Even though a linear quantizer is usually used for communication applications,
other characteristics may also be desired. Some signals tend to occur more frequently
at low amplitudes, which makes a saturating characteristic better suited
to their quantization [74]. To demonstrate this possibility, a saturating quantizer
is designed as shown in Fig. 3.13(b). Also, by dynamically changing the selection
set, we can dynamically change the quantizer’s characteristic, which
can potentially be very useful. As an example, in RF receivers where the received
signal power changes from time to time, we can use a variable quantizer in addition
to the variable-gain amplifiers to dynamically tune the total gain of the signal
path.
3.4.3 Detector and memory
Figure 3.14 shows the detector circuit, which is composed of two parallel self-
mixers. The input node with amplitude Vin is connected to M1, M6, and M7.
Since the input node is single-ended, the other three nodes are connected to bias
voltages. The reference signal with an amplitude VT is applied to the other mixer
through M3, M10, and M11. The output voltage is proportional to Vin − VT with
a proportionality factor that depends on the device transconductances. The width and
length of M1–M12 are 4µm and 60nm, respectively. We have designed the detector
to simultaneously minimize its loading on the lattice and its response time.
[Figure 3.14 schematic: detector circuit with transistors M1–M14, bias voltages Vb1, Vb2, and Vb3, inputs Vin and VT, and differential outputs Vout+ and Vout−.]
Figure 3.14: Voltage detector circuit with input and reference inputs and the
differential output.
The sign of the differential output is the actual logic level of the detector. Since
the differential output is still small at this point, we amplify the signal to
the desired level. In order to maximize the sampling speed, we propose a pipelined
amplification technique. In this method, the output of the detector is applied to a
cascade of sampled-mode gain stages. Each gain stage samples the output of the
previous stage, amplifies it to the maximum possible level during the time window,
and provides the higher voltage level to the next stage. Fig. 3.15 shows one such stage,
in which the output of the Nth stage is sampled on a capacitor and, in
the next phase, applied to the (N + 1)th stage. Capacitors C1 and C2 are both
5fF and they sample the signal in turns, meaning that while one is sampling the
other is being reset. The switched capacitors are controlled by NMOS switches
with phases S1 to S4.
This technique is in fact a high-speed analog shift register, in which the logic
bit is shifted and amplified across the stages and the gain stages can also be
considered as part of the memory. The gain stages are designed for maximum gain
in the provided 50ps settling time. A two-stage differential pair shows optimum
functionality for each gain-stage. The transient response of the detector and the
[Figure 3.15 schematic: Nth and (N+1)th gain stages with differential voltages V(N) and V(N+1), sampling capacitors C1 and C2, switches S1–S4, and four 50ps clock phases.]
Figure 3.15: The Nth stage of the cascaded amplification circuit with the
switches and the four clock phases.
first two gain-stages is shown in Fig. 3.16. The results show that the detector
output response is fast enough to be sampled in the 50ps time window. Fig. 3.17
shows the time domain simulation of the lattice connected to the detector circuit.
The corresponding spectrum of the digital output is plotted in Fig. 3.18.
3.4.4 Design summary and comparison
We showed that with a 16×16 tapered lattice we can achieve at least 4 bits of linear
quantization. The detector circuit’s response time, combined with the lattice delay
in (3.46), yields a sampling rate of 20GS/s. The DC current of the detector
is 5mA from a 1.2V supply, and with double sampling performed to achieve the
desired sampling rate, each detection node consumes 10mA of current. The gain-stage
amplifiers consume 3mA each. The number of gain-stages used depends on
the logic output level required for the memory. For our design, we require detection
[Figure 3.16 plots: amplitude (mV) and amplitude (µV) vs. time (ps), from 0 to 400ps; legend: First Stage, Second Stage.]
Figure 3.16: Top: Detector’s transient response for Vin = 55mV and VT =
50mV. Bottom: First gain stage (blue) and second gain stage
(red) sampled voltages for the above detector output. The first
gain-stage samples the detector’s output and amplifies it in a
50ps time window. For the next 50ps the second stage amplifies
the previous output, while the first stage is being reset. The sign
of the detector output is the logic state, which in this case is
positive (Vin > VT).
[Figure 3.17 plot: input amplitude (mV) and digital output vs. time (nsec), from 0 to 3.4ns.]
Figure 3.17: Time domain simulation. Input sinusoid at 1.7GHz (blue) and
the digital output (red). All other parameters are the same as
the constant input simulations we performed before.
[Figure 3.18 plot: spectrum amplitude (dB), from −60 to 0dB, vs. frequency (GHz), from 0 to 10GHz.]
Figure 3.18: Fourier transform of the digital output for 256 samples of the
output.
of Vin − VT > 1mV, and from simulation, for Vin − VT = 1mV, the detector output
is around 50µV after 50ps. Thus, to achieve an output level as large as 0.2V the
number of stages will be
\[ N_s = 2 \times \left(\log_{3.5} 0.2 - \log_{3.5} 50\mu\right) \approx 13, \qquad (3.48) \]
where similar to the detection part, the factor of two comes from double sampling.
The power required to drive the 16×16 lattice at all four sides for the 50mV input
Table 3.1: Performance Comparison of State-of-the-Art ADCs above 10GS/s

Ref.       Architecture       Quantizer Power  Total Power  Fs (S/s)  Resolution  Technology
[46]       Time interleaving  N/A              1.2W         24G       6 bits      90nm CMOS
[47]       Time interleaving  N/A              9W           20G       8 bits      0.18µm CMOS
[49]       Flash              N/A              4.5W         35G       4 bits      SiGe BiCMOS
[50]       Flash              N/A              4.8W         20G       5 bits      SiGe BiCMOS
[51]       Flash              N/A              3.6W         10G       5 bits      SiGe BiCMOS
This Work  Interferometer     0.194W           0.943W       20G       4 bits      65nm CMOS
amplitude is
\[ P_{lattice} = 4 \times 16 \times \frac{V^2}{Z_0} = 2\,\mathrm{mW}. \qquad (3.49) \]
The total detection power is the sum of the detector and the lattice power:
Pdetection = Plattice + 16× 10mA× 1.2V. (3.50)
Table 3.1 compares the proposed quantizer with other reported designs. The
comparison is performed both with and without taking into account the analog
memory. The reason is that the memory is not the essential part for the detection,
and as soon as the output is quantized, the data can be stored in many ways. For
example, time interleaving can be used to design a memory with a considerably
lower sampling rate and power consumption.
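The power figures in Table 3.1 can be retraced with the numbers quoted in this section. The sketch below makes two assumptions that the text does not state explicitly: the logarithm base 3.5 in (3.48) is read as the per-stage voltage gain, and the memory power is taken as 16 nodes times Ns gain stages times 3mA at 1.2V. Under those assumptions the totals agree with the table entries.

```python
import math

V_IN = 50e-3        # source amplitude (V)
Z0 = 78.0           # lattice characteristic impedance (ohm)
VDD = 1.2           # supply voltage (V)
N_SIDE = 16         # sources per lattice side
N_NODES = 16        # detection nodes in the 4-bit selection set
I_DETECTOR = 10e-3  # per-node detector current with double sampling (A)
I_STAGE = 3e-3      # per gain-stage current (A)
STAGE_GAIN = 3.5    # assumed: the logarithm base in (3.48)


def gain_stages(v_target, v_start, gain):
    """Number of stages from (3.48), including the factor of two for double sampling."""
    return 2 * math.log(v_target / v_start, gain)


n_s = round(gain_stages(0.2, 50e-6, STAGE_GAIN))
p_lattice = 4 * N_SIDE * V_IN ** 2 / Z0               # (3.49)
p_detection = p_lattice + N_NODES * I_DETECTOR * VDD  # (3.50)
p_memory = N_NODES * n_s * I_STAGE * VDD              # assumed breakdown
p_total = p_detection + p_memory
print(f"Ns = {n_s}, P_detection = {p_detection:.3f} W, P_total = {p_total:.3f} W")
```
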
Note that we do not have measurement results for this structure, so
the comparison might not be fair. As a result, we do not draw any conclusions
beyond the fact that this structure shows great potential as a high-speed,
power-efficient quantizer.
3.4.5 Effect of noise and phase mismatch
The main effects that can degrade the performance of the interferometric quantizer
are thermal noise and phase mismatch between input sources. Thermal noise is
caused by the source impedance and also the detector circuit, while phase mismatch
between the inputs arises randomly, mostly due to process variation across the
substrate. To capture both of these effects, we model the ith input source as
Vi = (V0 + vi) cos(ωt+ θi), (3.51)
where vi is the additive white Gaussian noise and θi is a random phase. For
θi we assume a Gaussian distribution with zero mean and variance σφ². From
our analysis in Section 3.2.3, the voltage of any node inside the 2-D lattice is a
superposition of four input sources that are in the same column and row as that
node. This assumes that the phase mismatch between sources on one edge
of the lattice is not large and hence the direction of the wave is not significantly
changed. By applying the non-ideal sources of (3.51) to (3.24), we can write the
voltage of each node as
\[ V_{n,m} = \sum_i \left[(V_0 + v_i)\cos(\omega t - \Delta\phi_i + \theta_i)\right], \qquad (3.52) \]
where we have indexed the independent noise and mismatch sources and
∆φi represents the phase shift due to the wave propagating along the lattice.
Assuming that vi ≪ V0 and σφ ≪ 1, we can simplify the superposition of the four
sources in (3.52) to
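\[ V_{n,m} = 4V_0\cos\left(k\,\frac{n'+m'}{2}\right)\cos\left(k\,\frac{n'-m'}{2}\right)\cos\left(\omega t - k(V_c)N/2\right) + \sum_i \cos(\omega t-\Delta\phi_i)\cdot v_i - \sum_i V_0\sin(\omega t-\Delta\phi_i)\cdot\theta_i. \qquad (3.53) \]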
The first term in (3.53) is the desired amplitude coming from (3.27), while
the next two terms are the effects of noise and phase mismatch, respectively. We
[Figure 3.19 plot: 3σ of phase mismatch (degree) vs. noise rms voltage (mV); curves for SNDR = 35dB, 30dB, 25dB, and 20dB.]
Figure 3.19: Noise and mismatch limits for different values of SNDR. The
lattice parameters are as specified in Section 3.4.4. The input
Vin is assumed to be a sinusoid with an amplitude of 150mV.
now substitute (3.53) into (3.34) and follow the same approximation to get the
quantizer output Stot as
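\[ S_{tot}(V_{in}) = \left\lfloor \pi\,(1-bV_{in})\,\frac{\omega_T^2}{\omega^2}\cdot\left(1-\frac{V_T - \sum_i \cos(\omega t-\Delta\phi_i)\cdot v_i + \sum_i V_0\sin(\omega t-\Delta\phi_i)\cdot\theta_i}{4V_0}\right)\right\rfloor. \qquad (3.54) \]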
In order to find the signal-to-noise-plus-distortion ratio (SNDR), we calculate the
output power by assuming independent noise and mismatch sources. We replace
the noise terms with a noise power of vn² and the mismatch terms with σφ² to get
\[ S_{tot}^2(V_{in}) = \pi^2\,\frac{\omega_T^4}{\omega^4}\left(1 + b^2V_{in}^2\left(1-\frac{V_T}{4V_0}\right)^2 + \frac{2v_n^2 + 2V_0^2\sigma_\phi^2}{16V_0^2}\right). \qquad (3.55) \]
The SNDR can be calculated from (3.55), by taking into account the signal and
noise powers
\[ \mathrm{SNDR} = 8b^2\left(V_0 - V_T/4\right)^2\cdot\frac{V_{in}^2}{v_n^2 + V_0^2\sigma_\phi^2}. \qquad (3.56) \]
From (3.56), one can find the requirement on noise and phase mismatch for a
particular SNDR. Fig. 3.19 shows the relation between these two noise sources for
achieving a desired SNDR based on our analysis.
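To make the step from (3.55) to (3.56) concrete, the sketch below evaluates (3.56) directly and also as the ratio of the signal term to the noise-plus-mismatch term inside (3.55); the two should agree. The capacitor variation b = 1.0 per volt, the 0.5mV noise level, and the 1 degree mismatch are placeholder values chosen only for illustration and are not taken from the design.

```python
import math


def sndr_closed_form(b, v0, vt, vin, vn_rms, sigma_phi):
    """SNDR from (3.56); sigma_phi is in radians."""
    return 8 * b**2 * (v0 - vt / 4) ** 2 * vin**2 / (vn_rms**2 + v0**2 * sigma_phi**2)


def sndr_from_power_terms(b, v0, vt, vin, vn_rms, sigma_phi):
    """Ratio of the signal term to the noise-plus-mismatch term in (3.55)."""
    signal = b**2 * vin**2 * (1 - vt / (4 * v0)) ** 2
    noise = (2 * vn_rms**2 + 2 * v0**2 * sigma_phi**2) / (16 * v0**2)
    return signal / noise


def to_db(ratio):
    return 10 * math.log10(ratio)


# Placeholder operating point (b, noise, and mismatch are hypothetical).
params = dict(b=1.0, v0=50e-3, vt=90e-3, vin=150e-3,
              vn_rms=0.5e-3, sigma_phi=math.radians(1.0))
sndr_a = sndr_closed_form(**params)
sndr_b = sndr_from_power_terms(**params)
print(f"SNDR = {to_db(sndr_a):.1f} dB (both forms agree: {math.isclose(sndr_a, sndr_b)})")
```
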
CHAPTER 4
DELAY-BASED ANALOG-TO-DIGITAL CONVERSION: A
SCALABLE SOLUTION
4.1 Introduction
High-speed data conversion with modest resolution is used in high data-rate serial
links and wideband wireless receivers [75–78]. Multiprocessor systems and multiband
wireless receivers are two examples of such applications, where, besides energy
efficiency, area occupancy per ADC is also critical. In these applications it is desirable
to have multiple channels on the same die in order to achieve the highest
level of integration [79–81].
A diverse collection of voltage-based architectures ranging from flash to pipeline
and successive approximation (SAR), with different variations, has been reported
for these low-resolution, high-speed applications [75–78, 82–92]. In a flash ADC,
minimum-size devices are normally used in order to achieve the highest sampling
rate with the lowest power consumption. However, using minimum-size devices
causes the highest amount of mismatch between individual comparators, which
necessitates the use of calibration techniques. Background calibration is usually
complex and imposes significant area and power overhead [82, 91]. On the other
hand, foreground calibration requires periodic interruption of the ADC [83,85,86].
Alternatively, a SAR ADC uses a single comparator and interleaves multiple cores
to achieve a high sampling rate. In this case, calibration is still needed in order
to match the gain and sampling phase of the parallel paths [89, 90]. In general,
the underlying challenge in all of these architectures is that voltage-based
comparators do not scale as favorably as digital circuits in CMOS. Although
smaller CMOS gate lengths potentially enable faster sampling rates, mismatch
still remains a fundamental issue in analog circuits. As a result, voltage-based
high-speed data converters are commonly accompanied by calibration.
Time-based signal processing has received much attention in applications such
as digital phase-locked loops and time-of-flight measurements, where a particular
delay has to be accurately measured [93–98]. Vernier-based structures and time
interpolation techniques are used to increase the time resolution beyond the minimum
delay of a single cell [97–100]. Coarse-fine delay quantization has recently been
introduced to address the long length and difficult calibration required in single-step
structures [93, 94]. CMOS scaling has provided faster delay-cells, which are
promising for time-to-digital quantization.
The concept of time-to-digital quantization can also be used in analog-to-digital
conversion. In this case, the sampled input is translated into the time domain
and subsequently quantized using a delay-line structure. Such an ADC
has previously been realized at low sampling rates [101–103]. VCO-based quantizers that
use a ring oscillator and perform frequency-to-digital conversion have also been
reported [104,105]. In this work, we propose and demonstrate a novel time-based
ADC that can operate at high data rates. We show how implementing the quantization
process in the time domain has distinct advantages compared to voltage-domain
quantization. Based on our proposed architecture we implement a 4b 1.2GS/s
delay-line-based data converter with no calibration. This miniature ADC consumes
less than 2mW of power in only 0.01mm² of active area. To the best of
our knowledge, this is both the most compact and the most power-efficient ADC
among high-speed data converters that do not rely on manual calibration.
The rest of the chapter is organized as follows: Sections 4.2 and 4.3 explain
[Figure 4.1 diagram: delay-line of Cell1 to Cell5 with the applied pulse, Start and Reset signals, and timing labels Ts and Tp.]
Figure 4.1: A delay-line with a digital pulse applied to the first delay-cell.
The propagation length of the pulse is quantized inside the delay-
line through collecting the digital outputs of the delay-cells.
the theory of the delay-line based ADC and the proposed architecture. Section 4.4
describes the advantages of the delay-line based structure compared to voltage-
mode quantization, and introduces a design methodology to decide which domain
is more energy efficient for a given design specification and technology. Sections
4.5 and 4.6 explain the implementation of the structure and the circuit level design
followed by the measurement results.
4.2 Delay-Line-Based Signal Quantization
Analog signals are not necessarily represented by voltage, and hence quantization
is not necessarily made by voltage comparison. Contrary to comparator-based
voltage quantization, in the delay-line-based quantizer a time that is proportional
to the analog signal is quantized. A delay-line consisting of a series of delay-cells
is shown in Fig. 4.1. A digital pulse is applied to the beginning of the delay-line
and travels for a time window of Ts.
The delay-cell outputs are initially set to “0” and switch to “1” after the pulse
propagates through them. At the end of the time window, Ts, the number of
triggered delay-cells is the value of N that satisfies
\[ \sum_{i=1}^{N} D_i < T_s < \sum_{i=1}^{N+1} D_i \qquad (4.1) \]
where Di is the delay of the ith delay-cell. Here, we assumed that the line is long
enough so that the pulse does not reach the end of the line during the time Ts. If
this assumption is not valid, one can simply form a loop and take into account the
number of rotations of the pulse.
For the simplest implementation, all delay-cells are assumed to have the same
amount of delay equal to D. This results in N equal to
\[ N = \left\lfloor \frac{T_s}{D} \right\rfloor. \qquad (4.2) \]
This equation suggests that the process of time domain quantization can operate
by either changing the time window or changing the amount of delay in each delay-
cell [106]. These two methods result in two classes of time domain quantization
which we call time-based quantization and delay-based quantization, respectively.
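The counting rule in (4.1) and its uniform special case (4.2) can be sketched as below. The function names and the integer picosecond delays are illustrative. At an exact tie, where the pulse arrives precisely at the end of the window, this sketch counts the cell as triggered so that it matches the floor in (4.2); that tie-breaking choice is an assumption, since (4.1) uses strict inequalities.

```python
from itertools import accumulate


def triggered_cells(delays, t_s):
    """Number of cells the pulse has passed after a window t_s, following (4.1)."""
    count = 0
    for arrival in accumulate(delays):
        if arrival > t_s:
            break
        count += 1
    return count


def triggered_cells_uniform(delay, t_s):
    """Special case of identical cells: N = floor(Ts / D), from (4.2)."""
    return int(t_s // delay)


# Hypothetical delays in picoseconds.
n_uniform = triggered_cells([10] * 8, 45)
n_mismatched = triggered_cells([10, 12, 9, 11], 30)
print(n_uniform, triggered_cells_uniform(10, 45), n_mismatched)
```

If the window outlasts the whole line, the function returns the line length, which corresponds to the case where the long-line assumption above no longer holds.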
4.2.1 Time-Based Signal Quantization
In this technique delay-cells have a fixed delay and the time window is proportional
to the input signal. Although this method is used in applications such as digital
PLLs and time-of-flight measurement [93–98], it is not well suited for sampled-mode
circuits, because systems operating with a fixed
sampling rate need a fixed processing time in order to operate efficiently.
Since the time window varies in the time-based quantizer, for small values of Ts,
there will be significant idle time in the system. Nevertheless, this structure is
quite suitable for data conversion in event-driven applications where the system is
only expected to wake up when a new event arrives [107]. In these applications the
delay-line operates only when a pulse arrives, while during the idle time it requires
no DC current.
4.2.2 Delay-Based Signal Quantization
Alternatively, one can keep Ts constant and change the delay value of the delay-cells
according to the analog input. Since Ts is fixed, this structure is advantageous for
data converters. In most systems a sampled input is usually in the form of voltage
or current. Assuming a voltage mode sampling, a delay adjustment block (DA) is
required to apply the proper delay to the delay-cells based on the sampled voltage.
The input-output relation from (4.2) becomes
\[ N = \left\lfloor \frac{T_s}{D(v_{in})} \right\rfloor, \qquad (4.3) \]
where D(vin) is the transfer function of the voltage-to-delay conversion and vin is
the sampled input voltage. If the DA generates an ideal delay relation as
D(vin) = η/vin, (4.4)
where η is constant, the quantizer output becomes a linear function of the input
voltage
\[ N = \left\lfloor \frac{T_s}{\eta}\cdot v_{in} \right\rfloor. \qquad (4.5) \]
In other words, for the pulse propagation speed and the resulting quantized
length to be proportional to the input, the delay of each cell should be inversely
proportional to the input.
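A short sketch of (4.3) to (4.5) shows the resulting linearity. The window Ts = 1ns and the constant η = 10ps·V are hypothetical values chosen so that the output is 100 codes per volt; they are not design values from this work.

```python
import math

T_S = 1e-9    # time window (s); hypothetical
ETA = 1e-11   # delay constant in (4.4), in volt-seconds; hypothetical


def ideal_delay(v_in, eta=ETA):
    """Ideal voltage-to-delay relation D(vin) = eta / vin, from (4.4)."""
    return eta / v_in


def quantize(v_in, t_s=T_S, eta=ETA):
    """Delay-based quantizer output N = floor(Ts / D(vin)), from (4.3)."""
    return math.floor(t_s / ideal_delay(v_in, eta))


codes = [quantize(v) for v in (0.105, 0.155, 0.205)]
print(codes)
```

Equal 50mV input steps produce equal code steps, as (4.5) predicts for an ideal delay adjustment block.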
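[Figure 4.2 schematic: delay-cell with the delay adjustment block (DA), current source M1 carrying I(Vin), capacitor C, VDD, and start and reset switches; the output edge moves from t0 to t0+Di.]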
Figure 4.2: The proposed controllable delay-cell. The current source linearly
discharges the capacitive node when the start switch turns on by
the previous stage. The delay-cell is reset to its high-level voltage
at the end of the time window.
4.3 ADC Architecture
4.3.1 Delay-Cell Design
Variable delay-cells have been previously implemented by various techniques such
as current-starved inverters and supply control [101,108]. In order to maximize the
quantizer’s resolution, the delay-cell should maintain the relation in (4.4) for a wide
range of input voltages. To achieve this, we propose the delay-cell shown in Fig.
4.2. A capacitive node is linearly discharged using a controllable current source,
M1. The inverter succeeding the capacitor triggers at its switching threshold
voltage, VT [109]. The resulting delay value from the proposed circuit is
\[ D(v_{in}) = \frac{C\cdot V_T}{I(v_{in})} + T_0, \qquad (4.6) \]
where C is the capacitance at the charging node, T0 is the additional delay caused
by the inverter, and I(vin) is the current of M1, which is linearly proportional
to vin through Gm, the transconductance of the DA. The delay offset caused by
the inverter is a deviation from the ideal relation in (4.4), which limits both the
- 
V
in
 +
DAS/H
Clock 
Phases
P1 P2 PK
P-Cells
E
n
c
o
d
e
r
n
N1 N2 NK
N-Cells
resetDigital 
Pulse
P1
P3
P2
N1
Fast 
Line
Slow 
Line
start
P4
P5...PK
N5...NK
 
 
P-Latches
N-Latches
Figure 4.3: The proposed differential delay-line based data converter. The
input pulse is applied to both P-cells and N-cells and propagates
in the two lines with different speeds. The outputs of P-cells and
N-cells are latched separately and subsequently encoded as the
digital output.
dynamic range and the linearity of the delay-based quantizer. In order to minimize
this effect, T0 should be small compared to D(vin). This criterion sets a lower
limit on D(vin). Assuming this lower limit is Dmin, from (4.3) the maximum level
of quantization becomes Nmax ≤ Ts/Dmin. It is possible to increase Nmax by
increasing Ts, but this longer propagation time results in a lower sampling rate.
4.3.2 The Differential Delay-Line
To surmount the barrier caused by the inverter delay, a differential delay-line is
proposed. As shown in Fig. 4.3, a differential DA controls the delay of the two
lines as a function of the sampled input, vin. P-cells and N-cells in the two lines
are similar, but their delay is set in a differential manner. This means that when
P-cells are fast, N-cells are slow and vice-versa. First, we rewrite (4.6) as
\[ D(v_{in}) = \frac{C\cdot V_T}{G_m(V_b + v_{in})} + T_0, \qquad (4.7) \]
where Gm is the transconductance of the DA and Vb is a bias voltage. Since
we are interested in the nonlinearity of the delay-line, we neglect any preceding
nonlinearity caused by the sampling network and the DA, thus Gm is assumed to
be constant. By defining the delay-line conversion gain GD as
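\[ G_D = \frac{T_s G_m}{C\cdot V_T}, \qquad (4.8) \]
the quantized output for each of the two delay-lines becomes
\[ N_P = \left\lfloor \frac{T_s}{\frac{C\cdot V_T}{G_m(V_b+v_{in})} + T_0} \right\rfloor = \frac{G_D(V_b+v_{in})}{1 + \frac{T_0}{T_s}\cdot G_D(V_b+v_{in})} - Q(N_P), \qquad (4.9) \]
and
\[ N_N = \left\lfloor \frac{T_s}{\frac{C\cdot V_T}{G_m(V_b-v_{in})} + T_0} \right\rfloor = \frac{G_D(V_b-v_{in})}{1 + \frac{T_0}{T_s}\cdot G_D(V_b-v_{in})} - Q(N_N), \qquad (4.10) \]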
where 0 ≤ Q(N) < 1 represents the quantization error. NP and NN are the digital
outputs of the P-cells and N-cells, respectively. Similar to the single delay-cell, (4.9)
and (4.10) show that for a linear quantization the delay of the inverters should be
small, i.e., T0 ≪ D(vin). The differential output, defined as Nd = NP − NN, equals
\[ N_d = \frac{2G_Dv_{in}}{1 + \frac{2T_0}{T_s}G_DV_b + \frac{T_0^2}{T_s^2}G_D^2\left(V_b^2 - v_{in}^2\right)} - Q(N_P) + Q(N_N). \qquad (4.11) \]
[Figure 4.4 plot: single-line output code and differential-line output code vs. ∆Vin (mV), from −150 to 150mV.]
Figure 4.4: Simulated Input-Output relation comparison between a single
delay-line quantizer and a differential quantizer in the employed
65nm CMOS process. The differential quantizer generates a
higher dynamic range and linearity.
Two important advantages of the differential delay-line compared to a regular
delay-line can be observed from (4.11). First, the new quantizer conversion gain
is 2GD, which means that the dynamic range is doubled compared to the single
delay-line. Second, the signal dependent term in the denominator is of the second
order which is significantly weaker than the first order terms in (4.9) and (4.10).
Therefore, the differential delay-line provides both a higher dynamic range and a
better linearity. The quantizer’s input-output relation is plotted in Fig. 4.4 for
the two cases.
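The two advantages read off from (4.11) can be checked numerically. The following is a minimal sketch that evaluates the unquantized fractions of (4.9), (4.10), and (4.11) and compares their deviation from a straight line. The parameter values (`G_D`, `V_B`, `V_MAX`, `T0_TS`) are hypothetical and chosen only for illustration; they are not taken from the fabricated design.

```python
def line_count(g_d, v_bias, v_in, t0_over_ts):
    """Unquantized count of one delay-line: the fraction in (4.9)/(4.10)."""
    x = g_d * (v_bias + v_in)
    return x / (1.0 + t0_over_ts * x)


def differential_count(g_d, v_bias, v_in, t0_over_ts):
    """Unquantized N_d = N_P - N_N, equivalent to the fraction in (4.11)."""
    return (line_count(g_d, v_bias, v_in, t0_over_ts)
            - line_count(g_d, v_bias, -v_in, t0_over_ts))


def worst_case_nonlinearity(transfer, v_max, points=201):
    """Largest deviation from the end-point line, as a fraction of the output span."""
    vs = [-v_max + 2.0 * v_max * k / (points - 1) for k in range(points)]
    ys = [transfer(v) for v in vs]
    span = ys[-1] - ys[0]
    refs = [ys[0] + span * k / (points - 1) for k in range(points)]
    return max(abs(y - ref) for y, ref in zip(ys, refs)) / abs(span)


# Hypothetical illustrative values (assumptions, not the measured design).
G_D, V_B, V_MAX, T0_TS = 40.0, 0.3, 0.15, 0.02

single = worst_case_nonlinearity(lambda v: line_count(G_D, V_B, v, T0_TS), V_MAX)
diff = worst_case_nonlinearity(lambda v: differential_count(G_D, V_B, v, T0_TS), V_MAX)
print(f"single-line nonlinearity:  {single:.4f} of span")
print(f"differential nonlinearity: {diff:.4f} of span")
```

Under these assumed values the differential transfer is much closer to a straight line than the single-line transfer, because the even-order terms cancel in the subtraction.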
4.3.3 Maximizing The Dynamic Range
In order to quantify the dynamic range, the quantization error should be derived.
By neglecting the nonlinear components from (4.9) and (4.10), Nd can be written
as
$$N_d = \lfloor G_D (V_b + v_{in}) \rfloor - \lfloor G_D (V_b - v_{in}) \rfloor. \tag{4.12}$$
Next, we separate the integer and non-integer parts so that GDVb = nb + rb and
GDvin = ni + ri, where nb and ni are integers and rb and ri are between 0 and 1.
The resulting quantization error can be written as
$$Q_d = 2 G_D v_{in} - N_d = 2 r_i - \lfloor r_b + r_i \rfloor + \lfloor r_b - r_i \rfloor. \tag{4.13}$$
Assuming that the input is a uniform random signal, ri becomes a uniformly
distributed random signal between zero and one [110]. As shown in Fig. 4.5,
the quantization noise expression in (4.13) has four regions separated based on rb
and ri:
$$Q_d = \begin{cases}
2 r_i, & r_b < 1 - r_i,\ r_b > r_i \\
2 r_i - 1, & r_b > 1 - r_i,\ r_b > r_i \\
2 r_i - 1, & r_b < 1 - r_i,\ r_b < r_i \\
2 r_i - 2, & r_b > 1 - r_i,\ r_b < r_i.
\end{cases} \tag{4.14}$$
From (4.14), it can be shown that the average of the quantization noise is zero, i.e.,
$\overline{Q_d} = 0$. Consequently, by applying its definition, the quantization noise variance
can be computed from (4.14):
$$\sigma_Q^2 = \int_0^1 \left(Q_d - \overline{Q_d}\right)^2 dr_i = \frac{1}{3} - 2\,|r_b - 0.5| + 4\,(r_b - 0.5)^2. \tag{4.15}$$
Equation (4.15) shows that the optimal bias point, corresponding to rb = 1/4
or rb = 3/4, results in the minimum value of $\sigma_Q^2 = 1/12$. Fig. 4.5 shows the bias regions
and compares the quantization noise for optimal and sub-optimal bias points. The
signal-to-noise ratio from this analysis is compared with the result from time-
domain response of the delay-line in Fig. 4.6, which shows a reasonable match.
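As a quick sanity check of (4.13) through (4.15), the sketch below integrates the quantization error numerically over a uniformly distributed ri and compares the result with the closed-form variance. The function names and the midpoint-rule integration are illustrative choices, not part of the original analysis.

```python
import math


def q_d(r_b, r_i):
    """Differential quantization error from (4.13)."""
    return 2.0 * r_i - math.floor(r_b + r_i) + math.floor(r_b - r_i)


def variance_closed_form(r_b):
    """Closed-form quantization noise variance from (4.15)."""
    return 1.0 / 3.0 - 2.0 * abs(r_b - 0.5) + 4.0 * (r_b - 0.5) ** 2


def moments_numeric(r_b, steps=20000):
    """Midpoint-rule mean and variance of Q_d for r_i uniform on [0, 1)."""
    samples = [q_d(r_b, (k + 0.5) / steps) for k in range(steps)]
    mean = sum(samples) / steps
    var = sum((s - mean) ** 2 for s in samples) / steps
    return mean, var


for r_b in (0.1, 0.25, 0.5, 0.75):
    mean, var = moments_numeric(r_b)
    print(f"r_b={r_b:.2f}  mean={mean:+.4f}  numeric var={var:.4f}  "
          f"closed form={variance_closed_form(r_b):.4f}")
```

The numeric variance should track the closed form across bias points, with the minimum of 1/12 at rb = 0.25 and rb = 0.75 and the maximum of 1/3 at rb = 0, 0.5, and 1.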
By combining (4.15) and (4.12), the dynamic range of the differential quantizer
becomes
$$DR = \frac{N_d^2}{\sigma_Q^2} = 6 \left(G_D v_{max}\right)^2, \tag{4.16}$$
where vmax is the maximum range of the input voltage. Since in reality, the
dynamic range of the delay-cell is the actual limit, it is useful to rewrite (4.16)
in terms of the minimum delay. By substituting (4.8) and (4.7) into (4.16) we get
$$DR = \frac{N_d^2}{\sigma_Q^2} = 6 \left(\frac{T_s}{D_{min}}\right)^2, \tag{4.17}$$
where Dmin is the minimum delay corresponding to vmax and is a property of the
process. By defining Nmax as the output corresponding to the minimum delay
Dmin, the dynamic range equals
$$DR = 6 (N_{max})^2. \tag{4.18}$$
For a fixed sampling rate, Nmax provides a useful benchmark to evaluate how
various properties of the delay-line-based structure improve with scaling. From
(4.18), doubling either the delay-cell speed or the time window yields one additional bit.
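The "one additional bit" statement follows from combining (4.18) with the ideal-quantizer relation between signal-to-noise ratio and resolution, SNR = 1.5 · 2^(2·bits), which is the same relation used later in (4.27). A minimal sketch, with illustrative function names:

```python
import math


def dynamic_range(n_max):
    """Dynamic range from (4.18), as a power ratio."""
    return 6.0 * n_max ** 2


def dynamic_range_db(n_max):
    """Dynamic range from (4.18), in dB."""
    return 10.0 * math.log10(dynamic_range(n_max))


def equivalent_bits(n_max):
    """Resolution b that satisfies DR = 1.5 * 2**(2*b)."""
    return 0.5 * math.log2(dynamic_range(n_max) / 1.5)


for n_max in (4, 8, 16, 32):
    print(f"N_max={n_max:2d}  DR={dynamic_range_db(n_max):5.2f} dB  "
          f"bits={equivalent_bits(n_max):.2f}")
```

With this relation the resolution reduces to log2(2·Nmax), so Nmax = 8 corresponds to 4 bits and each doubling of Nmax adds exactly one bit.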
4.4 Non-Ideal Effects
4.4.1 Time-Domain Averaging
Consider a delay-line consisting of similar delay-cells. All the nonideal effects of
the ith delay-cell can be combined into an input referred delay error, di, followed
by an ideal delay-cell with a delay equal to D(vin). We assume di has a Gaussian
distribution with zero mean and standard deviation of σd. The governing equation
[Figure 4.5 plots: axes rb and ri (0 to 1), and Qd versus ri; design points are marked at rb = 0.25 and rb = 0.75.]
Figure 4.5: Left: The quantization error separated to different regions based
on the bias point. Right: The noise variance corresponding to
sub-optimal and optimal bias points.
[Figure 4.6 plot: signal-to-quantization-noise ratio (dB) versus rb from 0 to 1, analysis compared with transient simulation.]
Figure 4.6: Signal to quantization noise ratio as a function of the bias point
from theoretical analysis of the differential delay-line compared
to the simulation results from the time domain spectrum in the
employed 65nm CMOS process.
from (4.1) in the presence of errors becomes
$$N \cdot D(v_{in}) + \sum_{i=1}^{N} d_i < T_s < (N + 1) \cdot D(v_{in}) + \sum_{i=1}^{N+1} d_i. \tag{4.19}$$
This means that N is a random variable whose standard deviation represents the
output error. For this analysis we neglect the effect of quantization error. We
also assume that the error of each delay-cell is much smaller than its delay, i.e.,
σd ≪ D(vin). With these assumptions, (4.19) can be written as
$$N = \frac{T_s - \sum_{i=1}^{N} d_i}{D(v_{in})}. \tag{4.20}$$
By taking the variance of both sides, we arrive at
$$\sigma_N^2 = \frac{N \sigma_d^2}{D(v_{in})^2}. \tag{4.21}$$
By following the analysis similar to Section 4.3, we can obtain the signal-to-error
ratio (SER) as
$$SER = \frac{N_d^2}{\sigma_{N_{max}}^2} = \frac{1}{2} N_{max} \left(\frac{D_{min}}{\sigma_d}\right)^2. \tag{4.22}$$
Note that (4.22) is evaluated at Nmax, where the delay-cells have their minimum delay Dmin.
From (4.21), this operating point produces the highest output noise, so (4.22) is a lower bound for the
SER.
A curious result of this analysis is that the delay-line inherently averages out
the error contribution from the individual cells. As a result a longer delay-line cor-
responding to a higher Nmax results in a higher signal to error ratio. This averaging
effect is a distinct feature of the delay-based quantization. In the following, we use
this analysis to determine the effects of noise and mismatch on this quantizer.
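The averaging effect in (4.21) can be illustrated with a small Monte-Carlo experiment. The sketch below is a behavioral model under stated assumptions: delays are normalized so that D = 1, each cell gets an independent Gaussian error with σd = 0.1, and the output is the integer number of cells traversed within Ts. Because the output is an integer, the simulated variance includes a small quantization contribution that (4.21) neglects. All numeric values are illustrative.

```python
import random
import statistics


def count_cells(t_s, delay, sigma_d, rng):
    """Number of cells the pulse fully traverses within the window t_s."""
    elapsed, n = 0.0, 0
    while True:
        elapsed += delay + rng.gauss(0.0, sigma_d)  # nominal delay plus error d_i
        if elapsed > t_s:
            return n
        n += 1


def simulate_variance(t_s, delay, sigma_d, trials=2000, seed=1):
    """Mean and variance of the output count over many random delay-lines."""
    rng = random.Random(seed)
    counts = [count_cells(t_s, delay, sigma_d, rng) for _ in range(trials)]
    return statistics.mean(counts), statistics.pvariance(counts)


DELAY, SIGMA_D = 1.0, 0.1  # normalized, hypothetical values
for t_s in (100.5, 400.5):
    mean_n, var_n = simulate_variance(t_s, DELAY, SIGMA_D)
    predicted = mean_n * SIGMA_D ** 2 / DELAY ** 2  # right-hand side of (4.21)
    rel_error = var_n ** 0.5 / mean_n
    print(f"Ts={t_s:6.1f}  N={mean_n:6.1f}  var={var_n:.2f}  "
          f"(4.21) predicts {predicted:.2f}  sigma_N/N={rel_error:.4f}")
```

In this model the variance of N grows linearly with N, so the relative error σN/N falls as the line gets longer, which is the averaging behavior described above.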
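[Figure 4.7 schematic: charging node (current I, transistor M1, capacitor C, node voltage VC) followed by the inverter modeled as A(jω), with noise sources In, Vn, and Vs.]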
Figure 4.7: Sources of noise and jitter in the delay-cell.
4.4.2 Noise and Jitter
The two major noise sources in a delay-line are the device noise and supply noise.
In Fig. 4.7, the charging node is followed by an inverter. For noise analysis, the
small signal model of the inverter which is a first order amplifier with a gain of A0
and the unity gain cut-off frequency of ω0 can be used. The input referred noise of
the amplifier and the current noise of M1 are represented by Vn and In, and the
supply noise is represented by Vs. The time domain response of the delay-cell in
the linear region without noise sources becomes
$$V_o(t) = \frac{A_0 I}{C} \cdot \left(t - \frac{e^{-\omega_0 t} - 1}{\omega_0}\right). \tag{4.23}$$
According to (4.23), with the typical parameter values, the time response of
the inverter is much faster than the overall delay. This means that the noise of
the previous delay-cell only changes the trigger time but has negligible effect on
the noise of the following cell. Hence, the delay variance can be estimated based
on the time the inverter reaches VT . By referring all noises to the input of the
inverter, we have
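$$D_n = \frac{C}{I} \cdot \left(V_T - V_n - V_s - \frac{1}{C} \int_0^D I_n \, dt\right)$$
$$\sigma_n^2 = \left(\frac{C}{I}\right)^2 \cdot \left(\sigma_{V_n}^2 + \frac{D}{C^2} \cdot I_n^2 + \sigma_{V_s}^2\right) = \left(\frac{C}{I}\right)^2 \cdot \left(\sigma_{V_{dn}}^2 + \sigma_{V_s}^2\right), \tag{4.24}$$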
where Dn is the delay in the presence of noise, $\sigma_n^2$ is the jitter of a single delay-cell,
and $\sigma_{V_{dn}}^2$ represents the total device noise. Hence, the jitter consists of two
components: The first component is due to the noise of the inverter and the current
source and the second component comes from all externally induced noises, most
importantly the supply noise.
In the differential delay-line scheme, the supply noise affects both sides equally
when the noise bandwidth is low compared to the sampling rate. Thus, when high
frequency components are sufficiently filtered, the differential structure cancels
out the effect of the supply noise on jitter. Moreover, the device noise sources
are independent across the delay-line, hence by substituting (4.24) into (4.21) and
neglecting T0, the signal-to-noise ratio equals
$$SNR = \frac{1}{2} \cdot \frac{V_T^2}{\sigma_{V_{dn}}^2} \cdot N_{max}. \tag{4.25}$$
As a direct result of averaging, SNR increases by increasing the number of
delay-cells. This means that a faster process results in a higher Nmax, increasing
the SNR of this quantizer.
4.4.3 Mismatch
Mismatch between delay-cells comes from random variations in device dimensions
and the threshold voltage [84]. From (4.7), variation in either of these parameters
affects the delay value. By calculating σd from (4.7) and substituting into (4.22),
the signal-to-error ratio becomes
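$$SER = \frac{1}{2} \cdot \frac{N_{max}}{\left(\frac{\sigma_{V_T}}{V_T}\right)^2 + \left(\frac{\sigma_C}{C}\right)^2 + \left(\frac{\sigma_{G_m}}{G_m}\right)^2}, \tag{4.26}$$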
where σVT , σC , and σGm are the variations of the threshold voltage, charging
capacitor, and Gm, respectively. The terms in the denominator of (4.26) depend
on process properties. The last term in the denominator also depends on the bias
point and is represented by its average value. Equation (4.26) shows that the effect
of mismatch in a sufficiently long delay-based quantizer is less than a comparator-
based quantizer. For instance, in a calibration-free flash ADC, for an input voltage
range of Vswing, the ratio σVT /Vswing should be smaller than the LSB. However, in
the delay-line structure because of the mismatch averaging, this ratio is relaxed by
a factor of $\sqrt{N_{max}}$.
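To make the scaling in (4.26) concrete, the sketch below evaluates the mismatch-limited SER for a few line lengths. The relative mismatch values are hypothetical placeholders, since the actual figures depend on the process and device sizes.

```python
import math


def ser_mismatch(n_max, rel_vt, rel_c, rel_gm):
    """Mismatch-limited signal-to-error ratio from (4.26), as a power ratio."""
    return 0.5 * n_max / (rel_vt ** 2 + rel_c ** 2 + rel_gm ** 2)


def to_db(ratio):
    """Convert a power ratio to dB."""
    return 10.0 * math.log10(ratio)


# Hypothetical relative mismatches: sigma_VT/VT, sigma_C/C, sigma_Gm/Gm.
REL_VT, REL_C, REL_GM = 0.03, 0.01, 0.02

for n_max in (4, 8, 16, 32, 64):
    ser = ser_mismatch(n_max, REL_VT, REL_C, REL_GM)
    print(f"N_max={n_max:2d}  SER={to_db(ser):5.2f} dB")
```

Each doubling of Nmax raises the SER by about 3 dB, which is the same as relaxing the tolerable per-cell mismatch by a factor of the square root of two.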
Fig. 4.8 compares the mismatch-limited SER for flash and the delay-line-based
structure based on Monte-Carlo analysis in a 65nm CMOS process. For a fair
comparison, the device sizes and input voltage swings are set to be equal. As
a result of mismatch averaging, the delay-line is advantageous above a certain
number of cells. As indicated in Fig. 4.8, the intersection between the two plots
also scales, as faster devices move the delay-line to higher SERs and lower supply
voltages move the SER of flash structures further down. As a result, the
delay-line based structure is favorable for deep sub-micron CMOS scaling.
4.4.4 Design Methodology
In order to find the optimal number of delay-cells, the energy efficiency for the
ADC should be evaluated using the figure-of-merit [87]
$$FOM = \frac{P}{2^{ENOB} \cdot f_s} = \frac{E_D}{\left(\frac{2}{3} \cdot SNDR\right)^{1/2}}, \tag{4.27}$$
where ED is the energy consumed per conversion. The power consumption during
pulse propagation comes from the delay-cells, and most of it is
spent charging their capacitive nodes. The energy per
transition in a delay-cell, defined as E0, is equal to $C V_{dd}^2 / 2$, which is constant. As a
result, the energy consumption in the delay-line becomes
$$E_D = E_0 \cdot (N_P + N_N) = E_0 \cdot N_{max}, \tag{4.28}$$
where NP and NN are replaced from (4.9) and (4.10). According to simulation,
for the target number of bits (4-6 bits) mismatch and quantization noise are the
dominant sources of error and the effect of the jitter induced by the device noise
is negligible. Using this result and by substituting (4.18) and (4.22) into (4.27),
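the energy per conversion step becomes
$$FOM = E_D \cdot \left[\frac{3}{2} \cdot \left(DR^{-1} + SER^{-1}\right)\right]^{1/2} = \frac{E_0}{2} \cdot \left(1 + \frac{12 \sigma_d^2}{f_s \cdot D_{min}^3}\right)^{1/2}, \tag{4.29}$$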
where $f_s = T_s^{-1}$ is the highest possible sampling rate. Fig. 4.9 shows the variation
of FOM in the employed 65nm process as a function of the sampling frequency. At
high frequencies, the delay-line has a constant FOM limited by the quantization
noise. In this range, the delay-line-based ADC can have a variable sampling rate
with a fixed energy efficiency. At lower frequencies, the effect of mismatch increases
the required energy per bit. Thus, it is desirable to design the ADC above the knee
of this curve which can be calculated from (4.29) as
$$f_{min} = \frac{12 \sigma_d^2}{D_{min}^3}. \tag{4.30}$$
[Figure 4.8 plot: signal-to-error ratio (dB) versus number of cells (Nmax, 1 to 64) for the flash and delay-line structures; annotations mark the voltage-based-preferred and time-based-preferred regions and the scaling directions (lower swing, faster cells).]
Figure 4.8: Comparison of mismatch limited signal-to-noise ratio between
basic flash and the delay-line structure. Monte-Carlo simulation
is performed for the 65nm process and the SER is calculated as-
suming similar input voltage range. The horizontal arrows indi-
cate the regions in which each structure is favorable. The vertical
arrows indicate the direction of further scaling which is in favor
of the delay-line structure.
From (4.30), fmin will significantly increase for faster processes, which results in
a higher resolution in the low-power region. In this design, the knee frequency is
fmin = 140MS/s and we choose a sampling rate of fs = 600MS/s, which is well
above fmin. This sampling rate also provides an Nmax which is high enough
for a 4 bit quantizer.
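The shape of the FOM curve in (4.29) and the knee in (4.30) can be reproduced with a few lines of code. In this sketch, Dmin = 80 ps and fmin = 140 MS/s are the values quoted in the text, while `SIGMA_D` is an assumption: it is back-computed from those two numbers through (4.30) rather than taken from a reported measurement.

```python
import math


def knee_frequency(sigma_d, d_min):
    """Knee frequency f_min from (4.30)."""
    return 12.0 * sigma_d ** 2 / d_min ** 3


def fom_normalized(f_s, sigma_d, d_min):
    """FOM / E0 from (4.29)."""
    return 0.5 * math.sqrt(1.0 + knee_frequency(sigma_d, d_min) / f_s)


D_MIN = 80e-12   # s, minimum cell delay quoted in the text
F_MIN = 140e6    # samples/s, knee frequency quoted in the text
# Assumption: per-cell delay error back-computed from (4.30), about 2.4 ps.
SIGMA_D = math.sqrt(F_MIN * D_MIN ** 3 / 12.0)

for f_s in (50e6, 140e6, 600e6, 2e9):
    print(f"fs={f_s / 1e6:7.0f} MS/s  FOM/E0={fom_normalized(f_s, SIGMA_D, D_MIN):.3f}")
```

Under these assumptions the normalized FOM is about 0.71 at the knee and about 0.56 at the chosen 600 MS/s, approaching the quantization-limited floor of 0.5 at higher sampling rates.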
4.5 Circuit Design
To show the feasibility of the proposed approach, we design and simulate a 4 bit
600 MS/s ADC on a standard 65nm CMOS process. Next, two of these core ADCs
[Figure 4.9 plot: FOM (normalized to E0) versus fs (GS/s, 0.01 to 10) and (Nmax)^-1, showing the matching-limited region, fmin, the quantization-limited (energy-efficient) region, and the design point.]
Figure 4.9: FOM as a function of sampling rate and resolution for the used
65nm CMOS process. The selected sampling frequency is shown
with the dashed line. Above fmin the delay-line is in the quantiza-
tion limited region where sampling rate can be traded for higher
resolution while maintaining minimum energy consumption per
bit.
are time-interleaved to achieve the sampling rate of 1.2 GS/s. In this section, we
discuss the key circuit blocks of the core delay-line-based ADC.
4.5.1 Sample and Hold
In most applications, the input signal is in the form of voltage or current. In order
to use the concept of delay-based quantization in these applications, a voltage-
mode sampling circuit is designed. The front-end switching network is shown
in Fig. 4.10. The input uses all-NMOS transistors that operate with two non-
overlapping phases.
[Figure 4.10 schematic: sample-and-hold (S/H) switches S1-S8 with sampling capacitors Cs, and the delay adjustment (DA) circuit with transistors M1-M10, (W/L)1,2 = 10µm/60nm and R = 1kΩ; the outputs drive the P-cells and N-cells.]
Figure 4.10: Input sampling switches and the delay adjustment circuit.
For best performance, the RC time constant of the switching network should
be small compared to the period of the input signal. Hence, larger switches (S1
- S8) and a smaller sampling capacitor (Cs) are desired. A small capacitor is
also advantageous to reduce the loading effect on the prior stage. However, the
capacitor needs to be sufficiently large to store enough charge during the hold
phase. In this design, simulation shows that Cs = 25fF meets the requirement
for this frequency and resolution. In addition, the maximum tolerable switching
charge injection sets the upper limit on the size of the switches.
The common mode voltage is set to Vcm = 300mV as it should be low for the
best linearity in the switches but high enough for the switching network to have
acceptable voltage swing. A source follower acts as a buffer stage between the
sampling circuit and the differential pair. The buffer reduces the amount of charge
loss in the sampling network, allowing a smaller sampling capacitor. Therefore,
adding the buffer enables a faster sampling network for a given resolution.
4.5.2 Delay Adjustment Circuit
Delay adjustment as shown in Fig. 4.10 operates using a degenerated differential
pair composed of M1 and M2. During the hold phase, the sampling capacitors are
connected to the differential pair and the differential input voltage is transformed
into a differential current in M3 and M4. PMOS transistors are selected as the
input differential pair to lower the bias voltage of the preceding switches. Also
this allows the load to be NMOS, resulting in a faster settling time and faster
delay-cells. The transconductance gain of the differential pair equals gin/(2 +
ginR), where gin is the transconductance of M1 and M2 and R is the degeneration
resistor. Because of the separating buffer, the input transistors can be designed
sufficiently large to ensure ginR ≫ 1 across the entire input range, resulting in
minimal variation in Gm.
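The benefit of a large ginR can be quantified by differentiating the expression above: the relative change in Gm per relative change in gin is 2/(2 + ginR). The sketch below evaluates this with R = 1 kΩ, as labeled in Fig. 4.10; the gin values are hypothetical and only illustrate the trend.

```python
def effective_gm(g_in, r_deg):
    """Transconductance of the degenerated pair: g_in / (2 + g_in * R)."""
    return g_in / (2.0 + g_in * r_deg)


def gm_sensitivity(g_in, r_deg):
    """(dGm/Gm) / (dg_in/g_in) = 2 / (2 + g_in * R)."""
    return 2.0 / (2.0 + g_in * r_deg)


R_DEG = 1e3  # ohm, degeneration resistor labeled in Fig. 4.10

# Hypothetical device transconductances, for illustration only.
for g_in in (2e-3, 5e-3, 20e-3):
    gm = effective_gm(g_in, R_DEG)
    sens = gm_sensitivity(g_in, R_DEG)
    print(f"g_in={g_in * 1e3:4.1f} mS  g_in*R={g_in * R_DEG:4.1f}  "
          f"Gm={gm * 1e3:.3f} mS  sensitivity={sens:.3f}")
```

As ginR grows, Gm approaches 1/R and a given swing in gin across the input range produces a proportionally smaller change in Gm.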
The bias current of M3 and M4 is set based on the difference between the
NMOS and PMOS current sources (M9 and M10). These two current sources
copy the same bias current with different ratios based on their respective sizes.
This bias current is important for two reasons. First, since the delay-cells consume
power only during the transitions, the differential pair is the only circuit that
draws DC current. This means that lowering the bias current decreases the overall
power consumption. However, the bias current has a lower bound which is set
by the required settling time of the DA. Second, the bias current determines Vb
in (4.7) which should be set to minimize the differential quantization error, Qd.
[Figure 4.11 plots: 1/Delay (ns^-1) of the P-cells and N-cells versus differential input (mV, -200 to 200) with the bias point marked, and INL versus digital code.]
Figure 4.11: Top: Transfer function of the input voltage to the inverse de-
lay. The dashed line represents the bias point of the delay-cells.
Bottom: INL of the resulting differential quantizer.
Since this minimum error occurs for several optimal bias points, both of these
optimizations can be achieved simultaneously. The inverse delay as a function of
the input voltage for both P-cells and N-cells and the nonlinearity resulting from
the nonideal input-output relation are plotted in Fig. 4.11.
[Figure 4.12 schematic: delay-cell (transistors M1-M5, reset input, DA-controlled current source) between the prior and next stages, followed by the regenerative latch and the encoder.]
Figure 4.12: Delay-cell followed by the latch. The delay-cell is triggered by
the previous stage but all delay-cells are reset at the same time.
4.5.3 The Delay-Cell
Fig. 4.12 shows the circuit realization of the delay-cell proposed in Section 4.3.1.
M1 acts as an adjustable current source and its current is set by DA transistors
M3 and M4 for P-cells and N-cells, respectively. M2 is an NMOS switch and is
triggered by the pulse from the prior stage. The succeeding inverter consisting of
M4 and M5 provides the capacitance of the charging node of the delay-cell.
The operation of the delay-cell has two phases. During the pulse propagation
when the pulse arrives at the gate of M2, this switch turns on and the current
source starts discharging the capacitive node from AVDD to ground. Eventually
the inverter flips and its output is applied to the next delay-cell. At the end of
the time window, Ts, the output of the inverter is latched and the capacitive node
is reset by M3. In other words, the delay-cells are charged sequentially, but their
outputs are all latched and reset at the same time to record the quantized output
for the current sample.
[Figure 4.13 plot: voltage (V) versus time (ps) for the input pulse and the outputs of cells 2, 4, and 6, with the transition time ∆tm marked.]
Figure 4.13: Time response of the delay-line outputs. If a cell is latched during
the transition time ∆tm, the resulting bit is not clear. The
uncertainty is kept below noise floor by ensuring sharp transi-
tions for the delay-cells.
Since the capacitive node is floating during pulse propagation, we should con-
sider the charge leakage. This charge leakage can change the characteristic of the
delay-cell and introduce nonlinearity. In order to minimize this effect, both M1 and
M2 have gate lengths above the minimum length and their subthreshold current
is ensured to be small compared to the saturation current of M1 during discharge.
In this design, the amount of charge leakage during the pulse propagation time has
negligible effect on the ADC performance.
The regenerative latch at the output of each delay-cell is optimized for low-
power and high-speed operation. For a more reliable timing scheme, a master-slave
topology is used [109]. In order to minimize the loading effect and the switching
noise of the latch, we place a buffer between the latch and the delay-cell. The
resulting output of each delay-cell is a single bit. The output bits from all delay-
cells are connected to a digital encoder to generate the final digital code.
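Since the cells flip one after another, the latched bits of each line form a thermometer-style pattern. The text does not specify the encoder logic, so the sketch below is only one plausible behavioral model: it counts the flipped cells in each line and takes the difference, following Nd = NP − NN. The function names are hypothetical.

```python
def count_flipped(bits):
    """Number of delay-cells that flipped before the window closed."""
    return sum(1 for b in bits if b)


def differential_code(p_bits, n_bits):
    """Differential output N_d = N_P - N_N from the two latched bit vectors."""
    return count_flipped(p_bits) - count_flipped(n_bits)


# Example with 8 cells per line: the pulse passed 6 P-cells and 2 N-cells.
p_line = [1, 1, 1, 1, 1, 1, 0, 0]
n_line = [1, 1, 0, 0, 0, 0, 0, 0]
print(differential_code(p_line, n_line))  # 4
```

Counting ones rather than locating the first zero is a design choice in this sketch: it keeps the code well defined even if a single cell latches an uncertain value.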
Using a regenerative latch brings the concern of metastability [109]. A latch
requires a minimum setup time and hold time in order to function properly. The
delay-cell outputs are high or low for most of the time. However, as shown in Fig.
4.13, there is a chance of metastability if the time window Ts ends when a delay-cell
is in transition. In this case the final latched value is decided mainly by noise. The
resulting uncertainty in the output of one delay-cell across the delay-line adds to
the overall quantization error. In order to minimize the effect of metastability, the
transition time of the delay-cell should be small compared to the minimum delay
value, Dmin. This is ensured by the output inverter which makes sharp transitions.
In our design, the transition time is around 40ps which is smaller than the Dmin
of 80ps. Compared to voltage comparators where preamplification is necessary to
overcome the latching uncertainty, delay-line based quantization has the advantage
of being robust to this kind of error especially in faster processes.
4.5.4 Clock Phase Generator
The ADC operates in two non-overlapping clock phases. During the first phase
with the time window of Ts, the sample and hold tracks the input voltage. At
the same time, the previous sample is applied to the delay-cells and the pulse
propagates through the delay-line. At the second phase with a time length of Tr,
the sampled input sets the current of M3 and M4 in the DA as shown in Fig. 4.10.
Also during this phase, the outputs of the delay-cells are latched and the cells are reset.
To optimize the FOM of the ADC, we have to carefully select the ratio of Ts
and Tr. As discussed in Section 4.4.4 and Fig. 4.9, for the best FOM, the ADC
operates in the quantization noise limit, which means that resolution is a linear
function of Ts. As a result, for the highest resolution, the larger portion of the
[Figure 4.14: input clock and the generated Ts and Tr phases, voltage (V) versus t (ns, 0 to 2).]
Figure 4.14: Edge triggered pulse generator (top), and the two generated
phases (bottom). The delay-cells are similar to the core delay-
cells to ensure robustness to temperature and process variations.
period should be dedicated to this phase. The remaining time in each period is Tr,
which is determined by the settling time of the DA. However, a DA with shorter
settling time increases the power consumption. Hence, Tr should be long enough
to keep the power consumption of the DA low compared to the entire ADC.
The absolute delay of each cell is sensitive both to temperature and process
variations. In the differential structure, these variations happen in common mode.
Variation in the common mode of the delay-line changes the bias point, causing
suboptimal performance. The solution of this problem lies in the fact that the
quantized output is a function of the ratio Ts/D(vin). Thus, if Ts changes propor-
tional to D(vin), the resulting ratio becomes robust to variations. To implement
this, an edge triggered pulse generator is used as shown in Fig. 4.14. The input
pulse triggers a delay-line and the output of the delay-line produces a time window
equal to the total delay of the line. The delay-line produces Ts and the following
[Figure 4.15 plot: ∆Nmax at the SS, TT, and FF corners and temperatures of -40, 27, and 85 °C, for fixed Ts and adaptive Ts.]
Figure 4.15: The effect of process and temperature variation in the delay-line
for the employed adaptive pulse width technique compared with
using a fixed pulse width.
logic generates two non-overlapping phases. The delay-line in the pulse generator
uses delay-cells similar to the core delay-cells. As a result, the pulse length (Ts)
varies proportional to the absolute delay of the cells (D(vin)). Fig. 4.15 shows the
simulated variation in the delay-line output for the employed technique compared
with using a fixed time window. ∆Nmax is defined as the difference between Nmax
at a given process corner and temperature and its value at TT. Fig. 4.15 demonstrates that
using the proposed adaptive time window, ∆Nmax remains below 1 LSB while a
similar quantizer with a fixed time window is more sensitive to process/temperature
variations.
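The reason the adaptive window helps can be seen from N = ⌊Ts/D⌋: if Ts and D scale by the same factor, the ratio is unchanged. The sketch below is an idealized model that assumes the replica cells track the core cells perfectly. It uses the Dmin = 80 ps and Ts = 970 ps values quoted in the text, while the corner scaling factors are hypothetical.

```python
import math


def line_output(t_s, cell_delay):
    """Quantizer output N = floor(Ts / D) for one delay-line."""
    return math.floor(t_s / cell_delay)


NOMINAL_DELAY = 80e-12    # s, D_min quoted in the text
NOMINAL_WINDOW = 970e-12  # s, T_s quoted in the text
# Hypothetical delay scaling per process corner (assumption for illustration).
CORNER_SCALE = {"SS": 1.25, "TT": 1.0, "FF": 0.8}


def outputs(adaptive):
    """Output count per corner, with either an adaptive or a fixed window."""
    result = {}
    for corner, scale in CORNER_SCALE.items():
        delay = NOMINAL_DELAY * scale
        # Adaptive: the window is generated by replica cells, so it scales too.
        window = NOMINAL_WINDOW * scale if adaptive else NOMINAL_WINDOW
        result[corner] = line_output(window, delay)
    return result


print("fixed Ts:   ", outputs(adaptive=False))
print("adaptive Ts:", outputs(adaptive=True))
```

In this idealized model the adaptive window gives the same count at every corner, whereas the fixed window shifts the count by several codes. The simulated result in Fig. 4.15 shows a residual variation below 1 LSB, since real replica tracking is not perfect.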
4.6 Prototype Measurement
The proposed ADC is fabricated in a 65nm CMOS process. To have 4 bits, each
delay-line consists of 8 delay-cells followed by a dummy cell at the end. The ADC
[Figure 4.16 photograph labels: ADC1 and ADC2, each with S/H-DA, P-cells, and N-cells blocks; clock generator and digital encoder; dimensions 105µm by 110µm.]
Figure 4.16: Chip photograph.
including the digital section occupies an active area of only 110µm × 50µm. The
miniature size of this structure is a direct result of using delay-line quantization
instead of comparators. The small size of the delay-line also helps in avoiding global
variations on chip. As a result, cell to cell variations maintain similar statistical
properties, which is required for effective averaging. The fabricated chip includes
16 cores that can operate independently or together in a time interleaved fashion.
In the time interleaving mode, the input clock at 9.6GHz is divided into 16 phases
for each core. This results in a sampling rate of 600MS/s for each core. In this
design, only two channels are interleaved in order to avoid the need for mismatch
and clock-skew calibration for the target resolution. Multiple chip measurements
have confirmed that these two channels can be reliably interleaved without any
calibration. The analog input and the clock are applied using RF probes and a
differential buffer connects the analog input to the cores.
The resulting two core delay-line ADC operates with a sampling rate of
1.2GS/s. The measured data corresponds to Tr = 610ps and Ts = 970ps. Both
[Figure 4.17 plots: DNL and INL versus digital output code (1 to 15) for Channel 1 and Channel 2.]
Figure 4.17: Measured INL (top) and DNL (bottom) of the two channels.
cores have the same time window and share the same bias current for the DA
which is applied externally. The chip photograph with individual blocks is shown
in Fig. 4.16. Most of the core area is covered by the sample-and-hold circuit and
the digital interface. The INL and DNL of the two channels plotted in Fig. 4.17
are approximately within ±0.8LSB and ±0.5LSB, respectively. The nonlinearity in the INL curve
is mostly deterministic. This suggests that the resolution is limited by nonideal
characteristic of the delay-cells, not by device mismatch. This is expected from
Fig. 4.9, as our target resolution is lower than the maximum achievable Nmax for
this sampling frequency. As a result, by using smaller delay cells and a nonuni-
form delay-line, the deterministic non-linearity can be compensated and a higher
dynamic range is possible.
[Figure 4.18 plots: SFDR and SNDR (dB) versus fs (GHz) and versus fin (MHz).]
Figure 4.18: Top: Measured SFDR and SNDR at Nyquist rate vs. sampling
rate. Bottom: Measured SFDR and SNDR vs. input frequency
at fs =1.2GS/s.
At high input frequencies, the limited bandwidth of the sampling switches and
buffers lowers the voltage to delay conversion gain. As a result, the differential
input range has to be increased in order to maintain full scale quantization. Since
nonlinearity is not limited by the input buffers or switches, this increase does
not affect the overall linearity of the ADC. High frequency measurement is done
both by sweeping the input frequency up to the Nyquist rate and by sweeping the
sampling frequency up to 1.2GHz. The measurement results are shown in Fig.
4.18. The SNDR remains above 20.4dB and the SFDR remains above 29.23dB for
the entire range. The measured output spectrum is plotted in Fig. 4.19.
[Figure 4.19 plots: output spectrum (dBFS) versus frequency (MHz). Low-frequency spectrum: fin = 19.62 MHz, fs = 600 MHz, SFDR = 31.02 dB, SNDR = 20.84 dB. Nyquist-rate spectrum: fin = 602.4 MHz, fs = 1.2 GHz, SFDR = 30.11 dB, SNDR = 20.3 dB.]
Figure 4.19: Top: Measured low frequency spectrum for one channel. Bot-
tom: Measured Nyquist rate output spectrum, when both chan-
nels are interleaved.
The total power consumption of the two-core ADC excluding the buffers is
2mW. The share of the sample-and-hold and the DA is 480µW per channel and the
core delay-line consumes 520µW per channel. From the analysis in Section 4.4.4,
the power consumption of the delay-line is expected to further reduce in faster
processes. The supply of the delay-line is separated and decoupled to minimize
jitter. Low frequency ENOB is 3.6, and at the highest sampling rate the ENOB
is equal to 3.1, which results in a FOM equal to 196fJ/Step. The performance of
Table 4.1: Performance Summary
Technology               TSMC 65nm CMOS
Resolution               4 bits
Sampling Rate            1.2 GS/s
Power Supply             1.2V
Input Range @ Nyquist    600 mV (pp-diff)
Power Consumption        2 mW
Number of Channels       2
DNL                      +0.54LSB/-0.38LSB
INL                      +0.78LSB/-0.83LSB
SNDR @ fin=602MHz        20.4 dB (23.1 dB peak)
SFDR @ fin=602MHz        30.1 dB (33.2 dB peak)
Input Capacitance        50 fF
Active Area              105µm×110µm
the ADC is summarized in Table 4.1. As shown in Table 4.2, this ADC provides
the highest energy efficiency among ADCs in this frequency range that do not
rely on foreground calibration. Moreover, the active area of the two-core ADC
is 0.01 mm², which is remarkably small compared to conventional ADCs with or
without calibration.
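The reported figure of merit can be cross-checked from the other numbers in this section using the first form of (4.27). The sketch below uses the quoted power (2 mW), sampling rate (1.2 GS/s), and ENOB (3.1), plus the standard conversion ENOB = (SNDR − 1.76)/6.02 for the Nyquist SNDR of 20.4 dB. Small differences from the quoted 196 fJ/step are expected from rounding of the ENOB.

```python
def enob_from_sndr_db(sndr_db):
    """Effective number of bits from SNDR in dB (standard conversion)."""
    return (sndr_db - 1.76) / 6.02


def fom_per_step(power_w, enob, f_s):
    """Energy per conversion step, P / (2**ENOB * fs), as in (4.27)."""
    return power_w / (2.0 ** enob * f_s)


POWER = 2e-3   # W, two-core ADC excluding buffers
F_S = 1.2e9    # samples/s
ENOB = 3.1     # at the highest sampling rate, as reported

print(f"ENOB from 20.4 dB SNDR: {enob_from_sndr_db(20.4):.2f}")
print(f"FOM: {fom_per_step(POWER, ENOB, F_S) * 1e15:.0f} fJ/step")
```

With these inputs the result lands within about one percent of the quoted 196 fJ/step.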
4.7 Summary
A delay-line-based data converter is introduced for high-speed and low-power ap-
plications. The sampled signal is transformed into time-domain and subsequently
Table 4.2: Performance Comparison of State-of-the-Art ADCs
Ref.       Architecture    Technology  bits  fs (GS/s)  Power (mW)  Calibration  Area (mm2)  FOM (pJ/Step)
[83]       Flash           180nm       4     4          78          Foreground   0.88        4.3
[85]       Folding-Flash   65nm        5     1.75       2.2         Foreground   0.02        0.050
[88]       SAR             65nm        6     1          6.7         Foreground   0.11        0.210
[82]       Folding-Flash   90nm        6     2.7        50          Background   0.36        0.470
[91]       SAR             130nm       6     1.25       32          Background   0.09        0.785
[89]       SAR             65nm        5     0.25       1.2         Redundancy   5           0.240
[77]       Flash           90nm        4     2          10          None         0.55        0.361
[84]       Flash           130nm       6     1.6        180         None         0.42        2.6
[87]       Two-step Flash  130nm       6     1          49          None         0.16        1.24
This Work  Delay-Line      65nm        4     1.2        2           None         0.01        0.196
quantized using a differential delay-line. We illustrate how the proposed structure
is advantageous in deep sub-micron technology compared to voltage-based data
converters. To verify the concept, a prototype chip is fabricated and measured in
65nm CMOS. Compared to other calibration-free ADCs in the same frequency and
resolution range, the proposed ADC is more power efficient and compact.
CHAPTER 5
CONCLUSION AND FUTURE DIRECTION
Our work investigates the design of integrated circuits within the frontier of mod-
ern high-speed electronics. In such a realm we show how circuit design becomes a
multidisciplinary field, where for the most effective methods of circuit implemen-
tation one should look beyond functional abstractions, and into physical concepts
that can best represent a desired function in electronics.
For the issue of signal generation at terahertz frequencies, we address a basic
challenge that exists in conventional VCO designs. Based on the theory of nonlinear
dynamical systems, we present an oscillator that can overcome the output power
vs. tuning range trade-off in varactor based LC tunable oscillators. The delay-
coupled oscillator presents a scalable approach for power generation at frequencies
far above the maximum oscillation frequency of transistors.
The second chapter introduces this structure and derives the dynamic behavior
of the system. Based on the presented architecture and as a proof of concept,
we design and fabricate a terahertz source on a standard CMOS process. The
achieved power using the introduced delay-coupled oscillator is orders of magnitude
higher than the state-of-the-art CMOS VCOs. Such power is achieved without
compromising on the tunability of the oscillator. In fact, both the achieved power
and the tuning range are comparable with designs on technologies with significantly
higher cut-off frequencies.
This presented structure can be enhanced in at least two directions: In the first
direction the structure can be scaled to higher number of cores. In such a structure,
multiple cores can be coupled together in order to generate higher output power.
Furthermore, the output power can be radiated from each core in order to perform
spatial power combining at a target location.
The second aspect of the structure that can be enhanced is the achievable
output power and frequency for a given number of cores. By implementing this
structure with high mobility devices such as InP HBT transistors with cut-off
frequencies up to 1 THz, it is possible to achieve the same level of output
power around 2 THz. On the other hand, by employing large bandgap devices
such as GaN, significantly higher nonlinearity is generated at the same frequency,
resulting in much higher output powers.
On the signal processing side, we introduce the Electrical Interferometer and
the delay-line-based data converter. The Electrical Interferometer is a structure
inspired by optical interference and is suitable for operating at even higher
frequencies. As the frequency of the propagating waves increases, the 2-D mesh
becomes smaller and more efficient interference can be realized. Such high-speed
propagation is useful for ultra-high-speed analog processing applications in
the mm-wave and terahertz range.
In addition to high-speed quantization, wave propagation in 1-dimensional and
2-dimensional electrical media can be used for a variety of processing tasks. For
instance, by introducing nonlinearity in the lattice, which is the case when large
amplitudes are applied to the interferometer, intriguing processes such as pulse
shaping and nonlinear power combining become possible. One can think of a nonlinear
interference medium where power can be focused on particular points of the lattice
by controlling the properties of the medium. Such a process can be applied to
high-speed switching and modulation.
Finally, we introduce the delay-line-based data converter as a scalable method
for low-power GHz-range signal quantization on silicon. This data converter takes
advantage of the increasing time resolution of digital transitions in CMOS
transistors. The fabricated prototype consumes only 2 mW of power at a sampling rate
of 1.2 GS/s. We show that time-based signal processing is more favorable in deep
sub-micron technologies for low-power and high-speed data conversion.
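To put the quoted 2 mW at a 1.2 GHz sampling rate (1.2 GS/s) in perspective, the sketch below computes the energy per sample and a Walden-style figure of merit, P / (2^bits · fs). The function names are illustrative. The resolution is assumed to be the nominal 4 bits given in the title of [3]; the effective number of bits is not stated in this section, so the resulting figure of merit is an optimistic bound rather than a reported result.

```python
def energy_per_sample_j(power_w, sample_rate_hz):
    """Energy spent per output sample, in joules."""
    return power_w / sample_rate_hz


def walden_fom_j_per_step(power_w, sample_rate_hz, bits):
    """Walden-style figure of merit: energy per conversion step, in joules.

    'bits' should be the effective number of bits; here the nominal
    resolution is used as an assumption, which gives an optimistic bound.
    """
    return power_w / (sample_rate_hz * 2 ** bits)


if __name__ == "__main__":
    power_w = 2e-3          # 2 mW, as quoted for the prototype
    sample_rate_hz = 1.2e9  # 1.2 GS/s
    nominal_bits = 4        # assumption: nominal resolution from the title of [3]

    e_sample = energy_per_sample_j(power_w, sample_rate_hz)
    fom = walden_fom_j_per_step(power_w, sample_rate_hz, nominal_bits)
    print(f"energy per sample: {e_sample * 1e12:.2f} pJ")
    print(f"FoM (nominal bits): {fom * 1e15:.1f} fJ/conversion-step")
```

With these inputs the energy is about 1.67 pJ per sample and roughly 104 fJ per conversion step. A measured ENOB below 4 bits would raise the latter figure.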
In addition to signal quantization, time-based signal processing can be used
for a variety of operations such as time-based amplification and modulation. It
would be fruitful to investigate other operations that are also more efficient in
the time domain in highly scaled CMOS technologies.
BIBLIOGRAPHY
[1] Y. M. Tousi, O. Momeni, and E. Afshari, “A Novel CMOS High-Power
Terahertz VCO Based on Coupled Oscillators: Theory and Implementation,” IEEE
J. Solid-State Circuits, vol. 47, no. 12, Dec. 2012.
[2] Y. M. Tousi, and E. Afshari, “2-D Electrical Interferometer: A Novel
High-Speed Quantizer,” IEEE Trans. Microwave Theory and Tech., vol. 58, no. 10,
pp. 2549-2561, Oct. 2010.
[3] Y. M. Tousi, and E. Afshari, “A Miniature 2 mW 4 bit 1.2 GS/s
Delay-Line-Based ADC in 65 nm CMOS,” IEEE J. Solid-State Circuits, vol. 46, no. 10,
pp. 2312-2325, Oct. 2011.
[4] C. Marcu, D. Chowdhury, C. Thakkar, J. Park, L. Kong, M. Tabesh, Y. Wang,
B. Afshar, A. Gupta, A. Arbabian, S. Gambini, R. Zamani, E. Alon, and A.
M. Niknejad, “A 90nm CMOS Low-Power 60 GHz Transceiver With Integrated
Baseband Circuitry,” IEEE J. Solid-State Circuits, vol. 44, no. 12, pp. 3434-
3447, Dec. 2009.
[5] J. Lee, Y. Chen, and Y. Huang, “A Low-Power Low-Cost Fully-Integrated
60-GHz Transceiver System With OOK Modulation and On-Board Antenna
Assembly,” IEEE J. Solid-State Circuits, vol. 45, no. 2, pp. 264-275, Feb. 2010.
[6] T. Mitomo, N. Ono, H. Hoshino, Y. Yoshihara, O. Watanabe, and I. Seto, “A
77 GHz 90 nm CMOS Transceiver for FMCW Radar Applications,” IEEE J.
Solid-State Circuits, vol. 45, no. 4, pp. 928-937, Apr. 2010.
[7] A. Arbabian, S. Callender, S. Kang, B. Afshar, J. Chien, and A. M. Niknejad,
“A 90 GHz Hybrid Switching Pulse-Transmitter for Medical Imaging,” IEEE
J. Solid-State Circuits, vol. 45, no. 12, pp. 2667-2681, Dec. 2010.
[8] I. Sarkas, S. T. Nicolson, A. Tomkins, E. Laskin, P. Chevalier, B. Sautreuil,
and S. P. Voinigescu, “An 18-Gb/s, Direct QPSK Modulation SiGe BiCMOS
Transceiver for Last Mile Links in the 70-80 GHz Band,” IEEE J. Solid-State
Circuits, vol. 45, no. 10, pp. 1968-1980, Oct. 2010.
[9] A. Natarajan, S. K. Reynolds, M. Tsai, S. T. Nicolson, J. C. Zhan, D. G.
Kam, D. Liu, Y. O. Huang, A. Valdes-Garcia, and B. A. Floyd, “A
Fully-Integrated 16-Element Phased-Array Receiver in SiGe BiCMOS for 60-GHz
Communications,” IEEE J. Solid-State Circuits, vol. 46, no. 5, pp. 1059-1075,
May 2011.
[10] A. Tomkins, P. Garcia, and S. P. Voinigescu, “A Passive W-Band Imaging
Receiver in 65-nm Bulk CMOS,” IEEE J. Solid-State Circuits, vol. 45, no. 10,
pp. 1981-1991, Oct. 2010.
[11] L. Gilreath, V. Jain, and P. Heydari, “Design and Analysis of a W-Band
SiGe Direct-Detection-Based Passive Imaging Receiver,” IEEE J. Solid-State
Circuits, vol. 46, no. 10, pp. 2240-2252, Oct. 2011.
[12] S. Koch, M. Guthoerl, I. Kallfass, A. Leuther, and S. Saito, “A 120-145 GHz
Heterodyne Receiver Chipset Utilizing the 140 GHz Atmospheric Window for
Passive Millimeter-Wave Imaging Applications,” IEEE J. Solid-State Circuits,
vol. 45, no. 10, pp. 1961-1967, Oct. 2010.
[13] A. Tang, and M. F. Chang, “183GHz 13.5mW/pixel CMOS regenerative re-
ceiver for mm-wave imaging applications,” ISSCC Dig. Tech. Papers, pp. 296-
298, Feb. 2011.
[14] H. Sherry, J. Grzyb, Y. Zhao, R. Al Hadi, A. Cathelin, A. Kaiser, and U. Pfeif-
fer, “A 1kPixel CMOS Camera Chip for 25fps Real-Time Terahertz Imaging
Applications,” ISSCC Dig. Tech. Papers, pp. 252-254, Feb. 2012.
[15] R. Han, Y. Zhang, Y. Kim, D. Kim, H. Shichijo, E. Afshari, and K. O,
“280GHz and 860GHz Image Sensors Using Schottky-Barrier Diodes in 0.13µm
Digital CMOS,” ISSCC Dig. Tech. Papers, pp. 254-256, Feb. 2012.
[16] O. Momeni, and E. Afshari, “High Power Terahertz and Millimeter-Wave
Oscillator Design: A Systematic Approach,” IEEE J. Solid-State Circuits, vol.
46, no. 3, pp. 583-597, Mar. 2011.
[17] K. Sengupta, and A. Hajimiri, “Distributed Active Radiation for THz Signal
Generation,” ISSCC Dig. Tech. Papers, pp. 288-289, Feb. 2011.
[18] D. Huang, T. R. LaRocca, M. F. Chang, L. Samoska, A. Fung, R. L.
Campbell, and M. Andrews, “Terahertz CMOS Frequency Generator Using Linear
Superposition Technique,” IEEE J. Solid-State Circuits, vol. 43, no. 12, pp.
2730-2738, Dec. 2008.
[19] E. Seok, D. Shim, C. Mao, R. Han, S. Sankaran, C. Cao, W. Knap, and K. K.
O, “Progress and Challenges Towards Terahertz CMOS Integrated Circuits,”
IEEE J. Solid-State Circuits, vol. 45, no. 8, pp. 1554-1564, Aug. 2010.
[20] B. Razavi, “A 300-GHz Fundamental Oscillator in 65-nm CMOS Technology,”
IEEE J. Solid-State Circuits, vol. 46, no. 4, pp. 894-903, Apr. 2011.
[21] M. Seo, M. Urteaga, J. Hacker, A. Young, Z. Griffith, V. Jain, R. Pierson, P.
Rowell, A. Skalare, A. Peralta, R. Lin, D. Pukala, and M. Rodwell, “InP HBT
IC Technology for Terahertz Frequencies: Fundamental Oscillators Up to 0.57
THz,” IEEE J. Solid-State Circuits, vol. 46, no. 10, pp. 2203-2214, Oct. 2011.
[22] B. Cetinoneri, Y. A. Atesal, A. Fung, and G. M. Rebeiz, “W-Band
Amplifiers with 6 dB Noise Figure and mW-Level 170-200 GHz Doublers in 45-nm
CMOS,” IEEE Trans. Microwave Theory and Tech., vol. 60, no. 3, pp. 692-701,
Mar. 2012.
[23] O. Momeni, and E. Afshari, “A 220-to-275GHz traveling-wave frequency dou-
bler with -6.6dBm Power at 244GHz in 65nm CMOS,” ISSCC Dig. Tech. Pa-
pers, pp. 286-288, Feb. 2011.
[24] E. Ojefors, B. Heinemann, and U. R. Pfeiffer, “Active 220- and 325-GHz
Frequency Multiplier Chains in an SiGe HBT Technology,” IEEE Trans. Microwave
Theory and Tech., vol. 59, no. 5, pp. 1311-1318, May 2011.
[25] G. Cusmai, M. Repossi, G. Albasini, A. Mazzanti, and F. Svelto, “A Magnet-
ically Tuned Quadrature Oscillator,” IEEE J. Solid-State Circuits, vol. 42, no.
12, pp. 2870-2877, Dec. 2007.
[26] K. Kwok, and J. R. Long, “A 23-to-29 GHz Transconductor-Tuned VCO
MMIC in 0.13µm CMOS,” IEEE J. Solid-State Circuits, vol. 42, no. 12, pp.
2878-2886, Dec. 2007.
[27] G. Li, L. Liu, Y. Tang, and E. Afshari, “A Low-Phase-Noise Wide-Tuning-
Range Oscillator Based on Resonant Mode Switching,” IEEE J. Solid-State
Circuits, vol. 47, no. 6, pp. 1295-1308, June 2012.
[28] S. Rong, and H. C. Luong, “Design and Analysis of Varactor-Less
Interpolative-Phase-Tuning Millimeter-Wave LC Oscillators with Multiphase
Outputs,” IEEE J. Solid-State Circuits, vol. 46, no. 8, pp. 1810-1819, Aug.
2011.
[29] Y. M. Tousi, O. Momeni, and E. Afshari, “A 283-to-296GHz VCO with
0.76mW Peak Output Power in 65nm CMOS,” ISSCC Dig. Tech. Papers, pp.
258-256, Feb. 2012.
[30] R. Adler, “A Study of Locking Phenomena in Oscillators,” Proc. IRE, vol. 34,
pp. 351-357, June 1946.
[31] B. Razavi, “A Study of Injection Locking and Pulling in Oscillators,” IEEE
J. Solid-State Circuits, vol. 39, no. 9, pp. 1415-1424, Sep. 2004.
[32] B. Lin, and S. Liu, “Analysis and Design of D-Band Injection-Locked
Frequency Dividers,” IEEE J. Solid-State Circuits, vol. 46, no. 6, pp. 1250-1264,
June 2011.
[33] J. F. Buckwalter, A. Babakhani, A. Komijani, and A. Hajimiri, “An
Integrated Subharmonic Coupled-Oscillator Scheme for a 60-GHz Phased-Array
Transmitter,” IEEE Trans. Microwave Theory and Tech., vol. 54, no. 12, pp.
4271-4280, Dec. 2006.
[34] P. Liao, and R. A. York, “A New Phase-Shifterless Beam-Scanning Technique
Using Arrays of Coupled Oscillators,” IEEE Trans. Microwave Theory and Tech.,
vol. 41, no. 10, pp. 1810-1815, Oct. 1993.
[35] J. J. Lynch, and R. A. York, “Synchronization of oscillators coupled through
narrow-band networks,” IEEE Trans. Microwave Theory and Tech., vol. 49, no.
2, pp. 237-249, Feb. 2001.
[36] R. A. York, “Nonlinear analysis of phase relationships in quasi-optical
oscillator arrays,” IEEE Trans. Microwave Theory and Tech., vol. 41, no. 10, pp.
1799-1809, Oct. 1993.
[37] H. Chang, X. Cao, M. J. Vaughan, U. K. Mishra, and R. A. York, “Phase
noise in externally injection-locked oscillator arrays,” IEEE Trans. Microwave
Theory and Tech., vol. 45, no. 11, pp. 2035-2042, Nov. 1997.
[38] R. A. York, and R. C. Compton, “Quasi-Optical Power Combining Using
Mutually Synchronized Oscillator Arrays,” IEEE Trans. Microwave Theory and
Tech., vol. 39, no. 6, pp. 1000-1009, Jun. 1991.
[39] S. H. Strogatz, “Nonlinear Dynamics and Chaos With Applications to Physics,
Biology, Chemistry, and Engineering,” Massachusetts, USA: Addison-Wesley,
1998.
[40] J. A. Rogge, and D. Aeyels, “Stability of phase locking in a ring of
unidirectionally coupled oscillators,” Journal of Physics A: Mathematical and General,
vol. 37, no. 46, 2004.
[41] Y. M. Tousi, V. Pourahmad, and E. Afshari, “Delay Coupled Oscillators for
Frequency Tuning of Solid-State Terahertz Sources,” Physical Review Letters,
vol. 108, no. 23, Jun. 2012.
[42] S. Li, I. Kipnis, and M. Ismail, “A 10-GHz CMOS Quadrature LC-VCO for
Multirate Optical Applications,” IEEE J. Solid-State Circuits, vol. 38, no. 10,
pp. 1626-1634, Oct. 2003.
[43] D. M. Pozar, “Microwave Engineering,” Third Edition, New Jersey, USA:
John Wiley and Sons, 2005.
[44] M. L. Psiaki, S. P. Powell, H. Jung, and P. M. Kintner, “Design and practical
implementation of multifrequency RF front ends using direct RF sampling,”
IEEE Trans. Microwave Theory and Tech., vol. 53, no. 10, pp. 3082-3089, Oct.
2005.
[45] D. S. K. Pok, C. H. Chen, J. J. Schamus, C. T. Montgomery, and J. B. Y.
Tsui, “Chip design for monobit receiver,” IEEE Trans. Microwave Theory and
Tech., vol. 45, no. 12, pp. 2283-2295, Dec. 1997.
[46] P. Schvan, J. Bach, C. Falt, P. Flemke, R. Gibbins, Y. Greshishchev, N.
Ben-Hamida, D. Pollex, J. Sitch, S. Wang, and J. Wolczanski, “A 24GS/s 6b ADC
in 90nm CMOS,” ISSCC Dig. of technical papers, pp. 544-545, Feb. 2008.
[47] K. Poulton, R. Neff, B. Setterberg, B. Wuppermann, T. Kopley, R. Jewett, J.
Pernillo, C. Tan, and A. Montijo, “A 20GS/s 8b ADC with 1MB Memory in
0.18µm CMOS,” ISSCC Dig. of technical papers, pp. 318-319, Feb. 2003.
[48] L. Y. Nathawad, R. Urata, B. A. Wooley and D. A. B. Miller, “A 20GHz
Bandwidth, 4b Photoconductive-Sampling Time-Interleaved CMOS ADC,” ISSCC
Dig. of technical papers, pp. 320-496, Feb. 2003.
[49] S. Shahramian, S. P. Voinigescu, and A. C. Carusone, “A 35-GS/s, 4-bit
flash ADC with active data and clock distribution trees,” IEEE J. Solid-State
Circuits, vol. 44, no. 6, pp. 1709-1720, June 2009.
[50] R. A. Kertis, J. S. Humble, M. A. Daun-Lindberg, R. A. Philpott, K. E. Fritz,
D. J. Schwab, J. F. Prairie, B. K. Gilbert, and E. S. Daniel, “A 20GS/s 5-bit
BiCMOS dual-nyquist flash ADC with sampling capability up to 35GS/s fea-
turing offset corrected exclusive-or comparators,” IEEE J. Solid-State Circuits,
vol. 44, no. 9, pp. 1709-1720, Sept. 2009.
[51] J. Lee, P. Roux, T. Koc, U. Link, T. Link, Y. Baeyens, and Y. Chen, “A 5-b
10G-sample/s A/D converter for 10-Gb/s optical receivers,” IEEE J.
Solid-State Circuits, vol. 39, no. 10, pp. 1671-1679, Oct. 2004.
[52] S. Krishnan, D. Scott, Z. Griffith, M. Urteaga, Y. Wei, N. Parthasarathy, and
M. Rodwell, “An 8-GHz continuous-time Σ-Δ analog-digital converter in an
InP-based HBT technology,” IEEE Trans. Microwave Theory and Tech., vol.
51, no. 12, pp. 2555-2561, Dec. 2003.
[53] F. Coppinger, A. S. Bhushan, and B. Jalali, “Photonic time stretch and its
application to analog-to-digital conversion,” IEEE Trans. Microwave Theory
and Tech., vol. 47, no. 7, pp. 1309-1314, July 1999.
[54] P. W. Juodawlkis, J. C. Twichell, G. E. Betts, J. J. Hargreaves, R. D. Younger,
J. L. Wasserman, F. J. O’Donnell, K. G. Ray, and R. C. Williamson, “Opti-
cally sampled analog-to-digital converters,” IEEE Trans. Microwave Theory
and Tech., vol. 49, pp. 1840-1852, Oct. 2001.
[55] M. Jarrahi, R. Fabian, D. A. B. Miller, and T. Lee, “Optical Spatial Quan-
tization for Higher Performance Analog-to-Digital Conversion,” IEEE Trans.
Microwave Theory and Tech., vol. 56, pp. 2143-2150, Sept. 2008.
[56] Y. M. Tousi, G. Lee, A. Hassibi, and E. Afshari, “A 1mW 4b 1GS/s delay-line
based analog-to-digital converter,” ISCAS Dig. Tech. papers, pp. 1121-1124,
May 2009.
[57] G. Li, Y. M. Tousi, A. Hassibi, and E. Afshari, “Delay-line based
analog-to-digital converters,” IEEE Trans. Circuits Syst. II, vol. 56, no. 6, pp. 464-468,
June 2009.
[58] P. B. Johns, “The Solution of Inhomogeneous Waveguide Problems Using a
Transmission-Line Matrix,” IEEE Trans. Microwave Theory and Tech., vol.
22, pp. 209-215, Mar. 1974.
[59] W. J. R. Hoefer, “The Transmission-Line Matrix Method Theory and Appli-
cations,” IEEE Trans. Microwave Theory and Tech., vol. 33, pp. 882-893, Oct.
1985.
[60] H. S. Bhat and E. Afshari, “Nonlinear constructive interference in electrical
lattices,” Phys. Rev. E, Stat. Phys. Plasmas Fluids Relat. Interdiscip. Top.,
vol. 77, no. 6, 2008, Art. ID 066602.
[61] E. Afshari and A. Hajimiri, “Nonlinear transmission lines for pulse shaping
in silicon,” IEEE J. Solid-State Circuits, vol. 40, pp. 744-752, March 2005.
[62] E. Afshari, H. Bhat, X. Li, and A. Hajimiri, “Electrical funnel: A broadband
signal combining method,” ISSCC Dig. of technical papers, pp. 751-760, Feb.
2006.
[63] E. Afshari, H. S. Bhat, A. Hajimiri, and J. E. Marsden, “Extremely wideband
signal shaping using one- and two-dimensional nonuniform nonlinear
transmission lines,” J. Appl. Phys., vol. 99, no. 5, p. 054901, 2006.
[64] D. Sievenpiper, J. Schaffner, J. J. Lee, and S. Livingston, “A steerable leaky-
wave antenna using a tunable impedance ground plane,” IEEE Antennas and
Wireless Propag. Lett., vol. 1, no. 1, pp. 179-182, 2002.
[65] G. V. Eleftheriades, A. K. Iyer, and P. C. Kremer, “Planar negative refractive
index media using periodically L-C loaded transmission lines,” IEEE Trans.
Microwave Theory and Tech., vol. 50, no. 12, pp. 2702-2712, Dec. 2002.
[66] G. V. Eleftheriades and O. F. Siddiqui, “Negative refraction and focusing in
hyperbolic transmission-line periodic grids,” IEEE Trans. Microwave Theory
and Tech., vol. 53, no. 1, pp. 396-403, Jan. 2005.
[67] A. A. Houck, J. B. Brock, and I. L. Chuang, “Experimental Observations of a
Left-Handed Material That Obeys Snell’s Law,” Phys. Rev. Lett., vol. 90, no.
13, Apr. 2003.
[68] A. Sanada, C. Caloz, and T. Itoh, “Planar Distributed Structures With Nega-
tive Refractive Index,” IEEE Trans. Microwave Theory and Tech., vol. 52, no.
4, pp. 1252-1263, Apr. 2004.
[69] E. Afshari, H. S. Bhat, and A. Hajimiri, “Ultrafast analog Fourier transform
using 2-D LC lattice,” IEEE Trans. Circuits and Systems, vol. 55, pp. 2332-
2343, Sept. 2008.
[70] O. Momeni and E. Afshari, “Electrical Prism: A high quality factor filter for
millimeter-wave and terahertz frequencies,” IEEE Trans. Microwave Theory
and Tech., vol. 57, pp. 2790-2799, Nov. 2009.
[71] Y. Taur and T. H. Ning, “Fundamentals of Modern VLSI Devices,” First
Edition, Cambridge, UK: Cambridge University Press, 1998.
[72] Cadence Design Manual, http://www.cadence.com/us/pages/default.aspx.
[73] T. Lee, “The Design of CMOS Radio-Frequency Integrated Circuits,” Second
Edition, Cambridge, UK: Cambridge University Press, 1998.
[74] J. G. Proakis, “Digital Communications,” Fourth Edition, New York, NY,
USA: McGraw-Hill.
[75] O. Tyschenko, A. Sheikholeslami, H. Tamura, M. Kibune, H. Yamaguchi, and
J. Ogawa, “A 5-Gb/s ADC-Based Feed-Forward CDR in 65 nm CMOS,” IEEE
J. Solid-State Circuits, vol. 45, no. 6, pp. 1091-1098, June 2010.
[76] J. Cao, B. Zhang, U. Singh, D. Cui, A. Vasani, A. Garg, W. Zhang, N.
Kocaman, D. Pi, B. Raghavan, H. Pan, I. Fujimori, and A. Momtaz, “A 500
mW ADC-Based CMOS AFE With Digital Calibration for 10 Gb/s Serial Links
Over KR-Backplane and Multimode Fiber,” IEEE J. Solid-State Circuits, vol.
45, no. 6, pp. 1172-1185, June 2010.
[77] D. A. Sobel, and R. W. Brodersen, “A 1Gb/s Mixed-Signal Baseband Analog
Front-End for a 60GHz Wireless Receiver,” IEEE J. Solid-State Circuits, vol.
44, no. 4, pp. 1281-1289, Apr. 2009.
[78] A. Varzaghani, and C. K. Yang, “A 4.8GS/s 5-bit ADC-Based Receiver With
Embedded DFE for Signal Equalization,” IEEE J. Solid-State Circuits, vol.
44, no. 3, pp. 901-915, Mar. 2009.
[79] H. Higashi, S. Masaki, M. Kibune, S. Matsubara, T. Chiba, Y. Doi, H. Ya-
maguchi, H. Takauchi, H. Ishida, K. Gotoh, and H. Tamura, “A 5-6.4Gb/s
12-Channel Transceiver With Pre-Emphasis and Equalization,” IEEE J. Solid-
State Circuits, vol. 40, no. 4, pp. 978-985, Apr. 2005.
[80] A. Agrawal, A. Liu, P. K. Hanumolu, and G. Wei, “An 8×5 Gb/s Parallel
Receiver With Collaborative Timing Recovery,” IEEE J. Solid-State Circuits,
vol. 44, no. 11, pp. 3120-3130, Nov. 2009.
[81] R. Reutemann, M. Ruegg, F. Keyser, J. Bergkvist, D. Dreps, T. Toifl, and
M. Schmatz, “A 4.5 mW/Gb/s 6.4 Gb/s 22+1-Lane Source Synchronous Re-
ceiver Core With Optional Cleanup PLL in 65 nm CMOS,” IEEE J. Solid-State
Circuits, vol. 45, no. 12, pp. 2850-2860, Dec. 2010.
[82] Y. Nakajima, A. Sakaguchi, T. Ohkido, N. Kato, T. Matsumoto, and M.
Yotsuyanagi, “A Background Self-Calibrated Folding-Interpolating Architecture,”
IEEE J. Solid-State Circuits, vol. 45, no. 4, pp. 707-718, Apr. 2010.
[83] S. Park, Y. Palaskas, and M. P. Flynn, “A 4-GS/s 4-bit Flash ADC in 0.18-µm
CMOS,” IEEE J. Solid-State Circuits, vol. 42, no. 9, pp. 1865-1872, Sep. 2007.
[84] A. Ismail, and M. Elmasry, “A 6-Bit 1.6-GS/s Low-Power Wideband Flash
ADC Converter in 0.13-µm CMOS Technology,” IEEE J. Solid-State Circuits,
vol. 43, no. 9, pp. 1982-1990, Sep. 2008.
[85] B. Verbruggen, J. Craninckx, M. Kuijk, P. Wambacq, and G. V. Plas, “A 2.2
mW 1.75GS/s 5 Bit Folding Flash ADC in 90 nm Digital CMOS,” IEEE J.
Solid-State Circuits, vol. 44, no. 3, pp. 874-882, Mar. 2009.
[86] C. Chen, M. Q. Le, and K. Y. Kim, “A Low Power 6-bit Flash ADC With Ref-
erence Voltage and Common-Mode Calibration,” IEEE J. Solid-State Circuits,
vol. 44, no. 4, pp. 1041-1046, Apr. 2009.
[87] H. Chen, I. Chen, H. Tseng, and H. Chen, “A 1-GS/s 6-Bit Two-Channel
Two-Step ADC in 0.13-µm CMOS,” IEEE J. Solid-State Circuits, vol. 44, no.
11, pp. 3051-3059, Nov. 2009.
[88] J. Yang, T. L. Naing, and R. W. Brodersen, “A 1GS/s 6 Bit 6.7 mW Successive
Approximation ADC Using Asynchronous Processing,” IEEE J. Solid-State
Circuits, vol. 45, no. 8, pp. 1469-1478, Aug. 2010.
[89] B. P. Ginsburg, and A. P. Chandrakasan, “Highly Interleaved 5-bit, 250-
MSample/s, 1.2-mW ADC With Redundant Channels in 65-nm CMOS,” IEEE
J. Solid-State Circuits, vol. 43, no. 12, pp. 2641-2650, Dec. 2008.
[90] W. Liu, Y. Chang, S. Hsien, B. Chen, Y. Lee, W. Chen, T. Yang, G. Ma,
and Y. Chiu, “A 600MS/s 30mW 0.13µm CMOS ADC Array Achieving Over
60dB SFDR with Adaptive Equalization,” ISSCC Dig. of technical papers, pp.
82-83, Feb. 2009.
[91] Z. Cao, S. Yan, and Y. Li, “A 32 mW 1.25 GS/s 6b 2b/Step SAR ADC in
0.13µm CMOS,” IEEE J. Solid-State Circuits, vol. 44, no. 3, pp. 862-873, Mar.
2009.
[92] B. Verbruggen, J. Craninckx, M. Kuijk, P. Wambacq, and G. V. Plas, “A 2.6
mW 6 bit 2.2 GS/s Fully Dynamic Pipeline ADC in 40 nm Digital CMOS,”
IEEE J. Solid-State Circuits, vol. 45, no. 10, pp. 2080-2090, Oct. 2010.
[93] S. Lee, Y. Seo, H. Park, and J. Sim, “A 1GHz ADPLL With a 1.25ps
Minimum-Resolution Sub-Exponent TDC in 0.18µm CMOS,” IEEE J. Solid-
State Circuits, vol. 45, no. 12, pp. 2874-2881, Dec. 2010.
[94] M. Lee and A. A. Abidi, “A 9 b, 1.25 ps Resolution Coarse-Fine
Time-to-Digital Converter in 90 nm CMOS that Amplifies a Time Residue,” IEEE J.
Solid-State Circuits, vol. 43, no. 4, pp. 769-777, Apr. 2008.
[95] S. Henzler, S. Koeppe, W. Kamp, H. Mulatz, and D. Schmitt-Landsiedel, “90nm
4.7ps-Resolution 0.7-LSB Single-Shot Precision and 19pJ-per-Shot Local
Passive Interpolation Time-to-Digital Converter with On-Chip Characterization,”
ISSCC Dig. of technical papers, pp. 548-549, Feb. 2008.
[96] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda, and M. Fukaishi, “A 2.1-
to-2.8GHz All-Digital Frequency Synthesizer with a Time-Windowed TDC,”
ISSCC Dig. of technical papers, pp. 471-471, Feb. 2010.
[97] P. Dudek, S. Szczepanski, and J. Hatfield, “A High-Resolution CMOS
Time-to-Digital Converter Utilizing a Vernier Delay Line,” IEEE J. Solid-State Circuits,
vol. 35, no. 2, pp. 240-247, Feb. 2000.
[98] L. Vercesi, A. Liscidini, and R. Castello, “Two-Dimensions Vernier Time-to-
Digital Converter,” IEEE J. Solid-State Circuits, vol. 45, no. 8, pp. 1504-1512,
Aug. 2010.
[99] S. Henzler, S. Koeppe, D. Lorenz, W. Kamp, R. Kuenemund, and D. Schmitt-
Landsiedel, “A Local Passive Time Interpolation Concept for Variation-
Tolerant High-Resolution Time-to-Digital Conversion,” IEEE J. Solid-State
Circuits, vol. 43, no. 7, pp. 1666-1676, Jul. 2008.
[100] T. Watanabe and T. Terasawa, “An All-Digital A/D Converter TAD with
4-Shift-Clock Construction for Sensor Interface in 0.65-µm CMOS,” Proc. of
the IEEE European Solid-State Circuits Conf., pp. 178-181, Sept. 2010.
[101] T. Watanabe, T. Mizuno, and Y. Makino, “An All-Digital Analog-To-Digital
Converter With 12-µV/LSB Using Moving-Average Filtering,” IEEE J. Solid-
State Circuits, vol. 38, no. 1, pp. 120-125, Jan. 2003.
[102] T. Watanabe, S. Yamauchi, and T. Terasawa, “A 0.0027-mm² 9.5-bit
50-MS/s All-Digital A/D Converter TAD in 65nm Digital CMOS,” Proc. of 16th
IEEE ICECS, pp. 271-274, Dec. 2009.
[103] S. Naraghi, M. Courcy, and M. P. Flynn, “A 9-bit, 14 µW and 0.06 mm²
Pulse Position Modulation ADC in 90 nm Digital CMOS,” IEEE J. Solid-State
Circuits, vol. 45, no. 9, pp. 1870-1880, Sep. 2010.
[104] M. Park and M. H. Perrott, “A 78 dB SNDR 87 mW 20 MHz Bandwidth
Continuous-Time ∆Σ ADC With VCO-Based Integrator and Quantizer
Implemented in 0.13 µm CMOS,” IEEE J. Solid-State Circuits, vol. 44, no. 12, pp.
3344-3358, Dec. 2009.
[105] F. Opteynde, “A Maximally-Digital Radio Receiver Front-End,” ISSCC Dig.
of technical papers, pp. 450-451, Feb. 2010.
[106] G. Li, Y. M. Tousi, A. Hassibi, and E. Afshari, “Delay-line based
analog-to-digital converters,” IEEE Trans. Circuits Syst. II, vol. 56, no. 6, pp. 464-468,
June 2009.
[107] F. Akopyan, R. Manohar, and A. B. Apsel, “A Level-Crossing Flash Asyn-
chronous Analog-to-Digital Converter,” Proc. 12th IEEE Int. Symp. Asyn-
chronous Circuits and Systems, pp. 12-22, Mar. 2006.
[108] M. Maymandi-Nejad and M. Sachdev, “A Monotonic Digitally Controlled
Delay Element,” IEEE J. Solid-State Circuits, vol. 40, no. 11, pp. 2212-2219,
Nov. 2005.
[109] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, “Digital Integrated
Circuits: A Design Perspective,” Second Edition, New Jersey, USA: Prentice Hall,
2003.
[110] A. Papoulis and S. U. Pillai, “Probability, Random Variables and Stochastic
Processes,” Fourth Edition, New York, USA: McGraw-Hill, 2002.
