Integrated Circuits and Systems for Millimeter-Wave Frequencies by Mohammadnezhad, Seyed Mohammad Hossein
UC Irvine
UC Irvine Electronic Theses and Dissertations
Title
Integrated Circuits and Systems for Millimeter-Wave Frequencies
Permalink
https://escholarship.org/uc/item/3sx3g52g
Author
Mohammadnezhad, Seyed Mohammad Hossein
Publication Date
2019
 
Peer reviewed|Thesis/dissertation
UNIVERSITY OF CALIFORNIA,
IRVINE
Integrated Circuits and Systems for Millimeter-Wave Frequencies
DISSERTATION
submitted in partial satisfaction of the requirements
for the degree of
DOCTOR OF PHILOSOPHY
in Electrical Engineering
by
Seyed Mohammad Hossein Mohammadnezhad
Dissertation Committee:
Professor Payam Heydari, Chair
Professor Nader Bagherzadeh
Professor Tony Givargis
2019
Chapter 1 to 2 © 2019 IEEE
© 2019 Seyed Mohammad Hossein Mohammadnezhad
DEDICATION
To My Family
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGMENTS
CURRICULUM VITAE
ABSTRACT OF THE DISSERTATION
1 A Millimeter-Wave Partially-Overlapped Beamforming-MIMO Receiver: Theory, Design, and Implementation
  1.1 Introduction to mm-Wave Wireless Communication Networks
  1.2 All-analog, All-digital, and Hybrid Architectures
  1.3 Conventional Hybrid and the Proposed Partially-Overlapped Hybrid Architectures
  1.4 Phase Control and Mainlobe Steering in Partially-Overlapped Hybrid Architecture
  1.5 Amplitude Control and Null-Steering in Partially-Overlapped Hybrid Architecture
    1.5.1 Null-steering with Phase Shifters
    1.5.2 Null-steering with VGAs
  1.6 Effects of Amplitude and Phase Errors on Interference Suppression
  1.7 The Phase-Amplitude Controlled Partially-Overlapped Hybrid Circuit Architecture
    1.7.1 Low Noise Amplifier (LNA)
    1.7.2 Phase Shifter (PS)
    1.7.3 Variable Gain Attenuator (VGA)
    1.7.4 Measurement Results of RF and RF-to-BB Channels
    1.7.5 Coupling between RF-Channels
  1.8 Measured Null-steering and Spatial Multiplexing
  1.9 Conclusion
2 A Millimeter-Wave Energy-Efficient Direct-Demodulation Receiver: Theory, Design, and Implementation
  2.1 Introduction to High-Speed mm-Wave Receivers
  2.2 High-speed ADC Design Challenges
  2.3 High-Order Direct-Demodulation
    2.3.1 Current 8PSK Demodulation Techniques
    2.3.2 Proposed 8PSK Direct-Demodulation Technique
    2.3.3 BER of the proposed 8PSK direct-demodulation Method
  2.4 Proposed 8PSK Direct-Demodulation Receiver
    2.4.1 LNA circuit design
    2.4.2 1-to-4 balun-based splitter design
    2.4.3 Mixer-baseband circuit design
    2.4.4 Comparator
    2.4.5 LO Generation and Distribution Network
  2.5 Measurement Results
  2.6 Conclusion
Bibliography
LIST OF FIGURES
1.2 Typical hybrid system architectures: (a) sub-array, (b) full-array.
1.3 Conceptual representation of the hybrid system with partially-overlapped clusters.
1.4 (a) $M$ and $N_{RF}$ as function of $D$ for $N = 8$ elements, (b) $(M, N_{RF}, D)$ combinations for $N = 4$ and $N = 8$ elements.
1.5 Outage capacity for full-array, sub-array, and partially-overlapped array with $N = 4$ and $N = 8$ elements.
1.6 (a) Array factor nulls based on phase shifter control, (b) Array factor nulls based on VGA control.
1.7 Effect of amplitude and phase error on weight vector.
1.8 Interference rejection as a function of RMS amplitude error and RMS phase error.
1.9 The 4-element realization of the beamforming-MIMO RX.
1.10 4-stage LNA.
1.11 (a) Layout of the on-chip balun, (b) simulated S-parameters and NF of the 4-stage LNA.
1.12 Quadrature Gilbert-based phase shifter.
1.13 Layout of the quadrature all-pass filter (QAF).
1.14 (a) Measured phase shifter's phase response, (b) measured phase shifter's RMS phase error across phase states.
1.15 Measured amplitude error across phase shifter bits.
1.16 5-bit passive pi-stage VGA.
1.17 Layout of one stage of VGA.
1.18 (a) Measured VGA steps, (b) measured VGA's RMS gain error across attenuation steps.
1.19 Measured phase error across VGA bits.
1.20 (a) Measured S-parameter of RF-channels (simulated in dashed), (b) simulated IP1dB.
1.21 (a) Measured conversion gain and NF of RF-to-BB channel, (b) measured I/Q amplitude and phase errors.
1.22 Coupling path between RF-Channels 1 and 2 of cluster 1 and RF-Channels 2 and 3 of cluster 1.
1.23 (a) coupling between RF-channels 2 and 3 of cluster 1, (b) coupling between RF-channels 1 and 2 of cluster 1 for 4 different phase shifter settings.
1.24 (a) measured RMS gain error, and (b) measured RMS phase error due to coupling of RF-channels 2 and 3 of cluster 1, (c) measured RMS gain error, and (d) measured RMS phase error due to coupling of RF-channels 1 and 2 of cluster 1.
1.25 Measured phase scanning of each 3-element cluster.
1.26 Measured spatially multiplexed array factors of two clusters steered toward 60° and 90°.
1.27 Measured array factor of each cluster for 625 VGA settings.
1.28 Measured signal-to-interference ratio (SIR) for different undesired incident angles.
1.29 Die micrograph.
1.30 (a) RF characterization and RF-to-baseband measurement setup (b) schematic, (c) photo.
2.2 (a) 2D IQ signal space of 8PSK symbols partitioned by 8 LO phases, and (b) block diagram of multi-phase RF-correlator and sign-check comparators.
2.3 (a) Equivalent baseband PAM-4 eye-diagram and 4-level normal Gaussian distribution, and (b) 8PSK bit and symbol error probabilities of the proposed direct-demodulation scheme and theory.
2.4 Proposed direct-demodulation RF-to-bits 8PSK receiver architecture including front-end, 4-phase LO, mixer-baseband, and demodulator.
2.5 (a) 6-Stage common-emitter-based LNA circuit schematic, (b) simultaneous noise and power matching in the first LNA stage ($Z_{opt}$ and $Z_{in}^*$ curves).
2.6 (a) Second single-ended LNA stage layout with MIM bypass capacitors, (b) second differential LNA stage layout with CPW-based matching networks, and (c) LNA S-parameter and NF simulation results.
2.7 (a) 4-Way splitter layout, (b) splitter S-parameter simulation results, and (c) splitter amplitude and phase error simulation results.
2.8 (a) Voltage-mode double-balanced passive mixer followed by 3-stage amplification (last amplification stage is CTLE), (b) simplified circuit of a double-balanced passive mixer, and (c) mixer input matching from 110 GHz to 140 GHz.
2.9 (a) Conversion gain of mixer-baseband for four different CTLE code settings, and (b) mixer-baseband NF.
2.10 (a) Comparator-DFF with offset calibration, and (b) comparator BER due to random noise (PSS + PNOISE simulations).
2.11 Timing diagram of the comparator-DFF operation.
2.12 (a) Tripler circuit schematic, and (b) tripler output power with and without input buffer.
2.13 (a) Low-pass phase shifter, (b) high-pass phase shifter, (c) 90° phase-shift tuning range, and (d) S-parameter simulation results.
2.14 (a) Doubler circuit schematic, and (b) doubler output power.
2.15 (a) Lower-frequency-tuned all-pass phase shifter, and (b) higher-frequency-tuned all-pass phase shifter, (c) 45° phase-shift tuning range, and (d) S-parameter simulation results.
2.16 (a) LO network output saturated power for different LO center frequencies, and (b) LO network output power at different harmonics (6th, 4th, 3rd, 2nd, 1st) of the bondwired 20.83 GHz LO input.
2.17 Die micrograph of the 8PSK receiver.
2.18 (a) Wireless measurement setup schematic, and (b) photo.
2.19 Measured receiver conversion gain at the 8PSK test output.
2.20 (a) Measured DSB NF and (b) IP1dB at the 8PSK test output.
2.21 36 Gbps power spectrum at the 8PSK test output.
2.22 Wirelessly measured (a) 30 Gbps and (b) 36 Gbps eye-diagrams at the 8PSK test output.
2.23 Wirelessly measured (a) 30 Gbps and (b) 36 Gbps constellations at the 8PSK test output.
2.24 Wirelessly measured eye-diagrams of demodulated 8PSK 3-bit streams for (a) 30 Gbps and (b) 36 Gbps overall data-rates.
2.25 Variation of BER with received input power.
LIST OF TABLES
1.1 Performance Summary of the 4-Element Beamforming-MIMO RX
1.2 Table of Comparison
2.1 Normalized correlation values of 8PSK symbols with eight LO phases.
2.2 Logic table of three demodulated bits per symbol.
2.3 Comparison table of state-of-the-art direct-demodulation receivers
ACKNOWLEDGMENTS
I would like to express my gratitude to my adviser for his continuous guidance and motivation. Special thanks to Huan Wang for his contribution. I would like to acknowledge STMicroelectronics and GLOBALFOUNDRIES for facilitating the chip fabrication. Finally, I would like to thank Keysight Technologies, in particular Dave Huh, for providing great assistance with test equipment.
CURRICULUM VITAE
Seyed Mohammad Hossein Mohammadnezhad
EDUCATION
Doctor of Philosophy in Electrical Engineering 2019
University of California, Irvine Irvine, California
Master of Science in Electrical Engineering 2018
University of California, Irvine Irvine, California
Bachelor of Science in Electrical Engineering 2013
Sharif University of Technology Tehran, Iran
REFEREED JOURNAL PUBLICATIONS
A 115-135 GHz 8PSK Receiver Using Multi-Phase RF-Correlation-Based Direct-Demodulation Method
2019
IEEE Journal of Solid State Circuits
Analysis and Design of High-Order QAM Direct Modulation Transmitter for High-Speed Point-to-Point mm-Wave Wireless Links
2019
IEEE Journal of Solid State Circuits
A Millimeter-Wave Partially Overlapped Beamforming-MIMO Receiver: Theory, Design, and Implementation
2019
IEEE Transactions on Microwave Theory and Techniques
A silicon-based low-power broadband transimpedance amplifier
2017
IEEE Transactions on Circuits and Systems I: Regular Papers
Analysis and design of a wideband, balun-based, differential power splitter at mm-Wave
2017
IEEE Transactions on Circuits and Systems II: Express Briefs
REFEREED CONFERENCE PUBLICATIONS
A Single-Channel RF-to-Bits 36Gbps 8PSK RX with Direct Demodulation in RF Domain
Apr 2019
Custom Integrated Circuits Conference (CICC), 2019 IEEE
A 100-120GHz 20Gbps Bits-to-RF 16QAM Transmitter Using 1-bit Digital-to-Analog Interface
Apr 2019
Custom Integrated Circuits Conference (CICC), 2019 IEEE
A 64–67GHz partially-overlapped phase-amplitude-controlled 4-element beamforming-MIMO receiver
May 2018
Custom Integrated Circuits Conference (CICC), 2018 IEEE
A low-power BiCMOS 50 Gbps Gm-boosted dual-feedback transimpedance amplifier
Oct 2015
2015 IEEE Bipolar/BiCMOS Circuits and Technology Meeting-BCTM
A broadband nonlinear lumped model for silicon IMPATT diodes
Oct 2015
2015 IEEE Bipolar/BiCMOS Circuits and Technology Meeting-BCTM
PATENTS
Ultra-broadband Transimpedance Amplifiers (TIA) for Optical Fiber Communications
2018
U.S. Patent No. 20180102749A1
RF to Bits Receiver without Analog to Digital Converter (ADC) for 8-PSK Constellation
2019
U.S. Application No. 62/806,456
Quadrature Phase Shift Keying Quadrature Amplitude Modulation Transmitter
2019
U.S. Application No. 62/870,189
Transmitter Architecture for Generating $4^N$-QAM Constellation with no Digital-to-Analog Converters (DAC) in Signal Path Requirement
2019
U.S. Application No. 62/712,062
ABSTRACT OF THE DISSERTATION
Integrated Circuits and Systems for Millimeter-Wave Frequencies
By
Seyed Mohammad Hossein Mohammadnezhad
Doctor of Philosophy in Electrical Engineering
University of California, Irvine, 2019
Professor Payam Heydari, Chair
There is an ever-increasing demand for higher data rates in wireless communication networks. Data traffic in cellular networks increases by roughly 40% annually and by a factor of 7 every 6 years. Furthermore, the number of connected devices in these networks increases by an order of magnitude every decade [1]. To accommodate this insatiable demand, multi-antenna architectures are adopted to take advantage of the vast spectrum available at mm-wave frequencies without suffering from the higher wireless link loss at those frequencies. This thesis presents novel architectures for ultra-high-speed wireless transceivers for short- to medium-range indoor and outdoor applications in future generations of wireless communication networks.
In the first section of this thesis, mm-wave circuit- and system-level solutions for adding multi-user service to conventional multi-antenna phased-array architectures are introduced. The proposed architecture improves link capacity and co-channel user service and reduces hardware cost compared to conventional solutions. The theory and design of the circuits and system are detailed, and comprehensive measurement results are presented that verify the system-level functionality. The first section is titled A Millimeter-Wave Partially-Overlapped Beamforming-MIMO Receiver: Theory, Design, and Implementation. More specifically, this section presents the analysis and design of a partially-overlapped beamforming-MIMO architecture capable of achieving higher beamforming and spatial multiplexing gains with fewer elements than conventional architectures. As a proof of concept, a 4-element beamforming-MIMO receiver (RX) covering the 64-67 GHz frequency band¹ and enabling 2-stream concurrent reception is designed and measured. By partitioning the RX elements into two clusters and partially overlapping these clusters to create two 3-element beamformers, both phased-array (coherent beamforming) and MIMO (spatial multiplexing) features are acquired simultaneously. 6-bit phase shifters with 360° phase control and 5-bit VGAs with 11 dB range are designed to enable steering of the two RX clusters toward two arbitrary angular locations corresponding to two users. Fabricated in a 130-nm SiGe BiCMOS process, the RX achieves a 30.15 dB maximum direct conversion gain and a 9.8 dB minimum noise figure (NF) across a 548 MHz IF bandwidth. S-parameter-based array factor measurements verify spatial filtering of the interference and spatial multiplexing in this RX chip.
In the second section of this thesis, energy-efficient ultra-high-speed transceiver architectures are presented. Current high-speed transceivers rely on high-sampling-rate, high-resolution analog-to-digital converters or digital-to-analog converters at the interface of the analog and digital circuitry. However, these back-end data converters are extremely power-hungry at very high speeds in a fully-integrated end-to-end scenario (i.e., RF-to-Bits, Bits-to-RF). Novel system-level architectures are presented that obviate the need for such costly data converters and significantly relax the complexity of digital signal processing. The proposed architecture results in orders-of-magnitude energy savings at ultra-high speeds. Theory, design, and measurement results of the highest-speed, highly energy-efficient, fully-integrated end-to-end transceiver are discussed in this section. The second section is titled A Millimeter-Wave Energy-Efficient Direct-Demodulation Receiver: Theory, Design, and Implementation. More precisely, this section presents the theory, design, and implementation of an 8PSK direct-demodulation receiver based on a novel multi-phase RF-correlation concept. The output of this RF-to-bits receiver architecture is demodulated bits, obviating the need for power-hungry high-speed, high-resolution data converters. A single-channel 115-135-GHz receiver prototype was fabricated in a 55-nm SiGe BiCMOS process. A maximum conversion gain of 32 dB and a minimum noise figure (NF) of 10.3 dB were measured. A data rate of 36 Gbps was wirelessly measured at a distance of 30 cm, with the received 8PSK signal being directly demodulated on-chip at a bit-error-rate (BER) of 1e-6. The measured receiver sensitivity at this BER is -41.28 dBm. The prototype occupies 2.5 × 3.5 mm² of die area including pads and test circuits (2.5 mm² active area) and consumes a total DC power of 200.25 mW.
¹ The FCC's newly allocated 64-71 GHz frequency band for high-speed wireless links between small cells.
Chapter 1
A Millimeter-Wave
Partially-Overlapped
Beamforming-MIMO Receiver:
Theory, Design, and Implementation
1.1 Introduction to mm-Wave Wireless Communication Networks
There is an insatiable demand for higher data rates and a larger number of data consumers in mobile communication networks. Mobile data traffic is expected to increase sevenfold from 2016, reaching 48.3 exabytes per month by 2021, with global IP video traffic making up 82% of internet traffic by 2021 compared with 73% in 2016 [1].
To accommodate this ever-increasing demand for high data rates, three directions are pursued in the deployment of the 5th generation wireless systems (5G), which result in an orders-of-magnitude increase in wireless capacity compared to current wireless networks: (1) mm-wave wireless communication with technologies currently available at 26, 28, 38, and 60 GHz, leveraging the available wide bandwidths to achieve multi-gigabit-per-second data rates (e.g., 5 Gbps by 2020 and 50 Gbps by 2024 [2]) compared to the data rates achieved in 4G LTE-Advanced (1 Gbps) or 3G networks (384 kbps) [3], (2) frequency reuse leading to the creation of small cells (i.e., pico- or femtocells, as shown in Fig. 1.1) with 10–200 m of coverage range, where intercell interference is insignificant due to the high path loss experienced by mm-wave communication, and (3) distributed base stations (BS) with a massive number of antennas (>100) providing one-tier or multi-tier high-speed wireless access to multiple co-channel users [4, 5].
Multi-antenna architectures are currently being adopted for both base stations and small cells to combat high path loss at mm-wave frequencies and achieve higher capacity and co-channel user service [6–9]. Increasing the number of antennas results in channel hardening and a reduction of small-scale fading (less multi-path and Doppler spread), which in turn simplifies baseband signal processing algorithms. Various configurations of multi-antenna architectures provide: (1) multiplexing gain to enhance link capacity through concurrent transmission of parallel data/user streams, (2) diversity gain to improve the reliability of wireless links, especially in non-line-of-sight (NLOS) scenarios, through transmission of copies of the same data stream, and (3) antenna gain to combat path loss, integrated wide-band noise, and co-channel interference through beamforming in LOS or directed NLOS scenarios.
[Figure 1.1 labels: macrocell (MBS), picocell (PBS), femtocell (FBS), User1 to User7.]
Figure 1.1: Generic 5G network architecture (high-level topological view).
In low-SNR mm-wave channels, increasing the capacity is limited by both the transmitter
output power and receiver integrated noise. Adopting beamforming multi-antenna architec-
tures to transmit sharp beams with highly directional antenna gains enhances the capacity
by improving the SNR, thus making it possible to employ high-order modulation schemes
to achieve higher spectral efficiencies. On the other hand, in high-SNR mm-wave channels
with high diversity or rank order, exploiting multiplexing gain via propagation of indepen-
dent signal streams through multiple distinct paths in spatial and polarization domains can
further enhance the channel capacity or multi-user service.
1.2 All-analog, All-digital, and Hybrid Architectures
An all-analog multi-antenna architecture, where beamforming weights are applied in the analog domain, significantly reduces the complexity and cost of baseband digital signal processing and provides wider coverage for wireless links by generating sharp transmit/receive beams. However, the mm-wave front-end of an all-analog system for applications mandating high beamforming resolution, adaptability, and multi-beam communication must satisfy stringent performance/power specifications.
On the other hand, an all-digital beamforming system enables multi-beam communication with the highest adaptability and data rate, but demands complex, power-hungry, high-speed baseband DSPs for large antenna arrays operating at mm-wave frequencies. Therefore, a hybrid architecture with both analog beamforming and digital MIMO coding is desired to reduce the complexity of the digital baseband in communication systems with a massive number of antennas.
[Figure 1.2 block-diagram labels: $N$ antennas, complex weights $W_{1,1}$ to $W_{M,N_{RF}}$ in (a) and $W_{1,1}$ to $W_{N,N_{RF}}$ in (b), $N_{RF}$ RF chains, ADCs, digital coding, $K$ parallel data streams.]
Figure 1.2: Typical hybrid system architectures: (a) sub-array, (b) full-array.
1.3 Conventional Hybrid and the Proposed Partially-Overlapped Hybrid Architectures
Hybrid architectures can be designed to receive (or transmit) only one dedicated data stream per subset of antennas (sub-array in Fig. 1.2a), or to receive (or transmit) all data streams from all antennas (full-array in Fig. 1.2b). The number of required RF chains $N_{RF}$ in a hybrid architecture is bounded below by the number of parallel data streams $K$, while the beamforming gain is determined by the number of antenna elements per cluster $M$, where each cluster is composed of dedicated RF-channels (i.e., the complex weighting coefficients in Figs. 1.2a, 1.2b, and 1.3) and antennas per RF chain. A full-array, in fact, realizes the function of an all-digital architecture. The number of signal processing paths (from the digital baseband to the antenna front-end) is equal to $N$ for the sub-array in Fig. 1.2a and to $N_{RF} \times N$ for the full-array in Fig. 1.2b, where $N$ is the total number of antennas [10]. On the other hand, the beamforming gain of the sub-array is $1/N_{RF}$ of that of the full-array. Therefore, a trade-off exists between signal processing complexity and beamforming gain in hybrid architectures. A recent circuit implementation of a hybrid architecture was presented in [11]. It uses a Cartesian combining concept to enable 2-stream reception. However, this hybrid RX requires 8 splitters, 20 combiners, and 12 mixers for 2-stream reception. This large number of signal paths introduces electromagnetic crosstalk due to the many crossovers between the paths. Therefore, this architecture does not scale to a larger number of streams and is not suitable for implementation at higher frequencies.
[Figure 1.3 block-diagram labels: $N$ antennas, overlapping depth $D$, cluster size $M$, complex weights $W_{i,k}$ (e.g., $W_{1,1}$, $W_{M-D+1,1}$, $W_{M,1}$, $W_{1,2}$, $W_{D,2}$), $N_{RF}$ RF chains, ADCs, digital coding, $K$ parallel data streams.]
Figure 1.3: Conceptual representation of the hybrid system with partially-overlapped clusters.
To address the above issues, a partially-overlapped beamforming-MIMO architecture is introduced [12]. As shown in Fig. 1.3, the $N$ antennas are decomposed into $K$ partially-overlapped clusters of $M$ antennas with an overlapping depth of $D$, supporting $K$ parallel data streams. $M$ is bounded by:
$$\frac{N}{N_{RF}} \le M = \left[ \frac{N + D(N_{RF} - 1)}{N_{RF}} \right] \le N \tag{1.1}$$
Overlapping the clusters reduces the number of RF-to-baseband signal processing paths to $N_{RF} \times M$ from $N_{RF} \times N$ in the case of the full-array. From (1.1), varying $D$ from 0 to $N$ causes the number of signal processing paths to vary from $N$ to $N_{RF} \times N$ and the beamforming gain improvement factor (referenced to a sub-array architecture) to vary from 1 to $N_{RF}$. Therefore, introducing an overlapping depth into the clusters of antennas and RF-channels creates a new degree of freedom that helps reach a better compromise between the beamforming gain and the complexity of signal processing in hybrid architectures. Fig. 1.4a shows the variation of $M$ and $N_{RF}$ with respect to $D$ in a partially-overlapped hybrid for $N = 8$ elements, and Fig. 1.4b shows the possible combinations of $(M, N_{RF}, D)$ for $N = 4$ and $N = 8$. These figures demonstrate how the multiplexing and coherent processing gains vary with the overlapping depth.
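As a rough numerical illustration of (1.1) and of the path-count trade-off, the sketch below tabulates the cluster size and the number of signal processing paths as the overlapping depth varies. The function names are hypothetical, and rounding the bracketed quantity in (1.1) up to the next integer is an assumption; the rounding rule is not spelled out in the text and is irrelevant whenever the ratio is already an integer.

```python
def cluster_size(n_antennas: int, n_rf: int, depth: int) -> int:
    """Antennas per cluster M from (1.1).

    Assumption: the bracket in (1.1) is read as rounding up to an integer.
    """
    numerator = n_antennas + depth * (n_rf - 1)
    m = -(-numerator // n_rf)            # integer ceiling division
    lower = -(-n_antennas // n_rf)       # N / N_RF, rounded up
    # (1.1) also bounds M between N / N_RF and N.
    return max(lower, min(m, n_antennas))


def signal_paths(n_antennas: int, n_rf: int, depth: int) -> int:
    """Number of RF-to-baseband signal processing paths, N_RF x M."""
    return n_rf * cluster_size(n_antennas, n_rf, depth)


if __name__ == "__main__":
    n, n_rf = 8, 2
    print("D  M  paths")
    for d in range(0, n + 1, 2):
        print(d, cluster_size(n, n_rf, d), signal_paths(n, n_rf, d))
```

With these definitions, D = 0 reproduces the sub-array count of N paths and D = N reproduces the full-array count of N_RF × N paths, matching the two limits described above.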
In Fig. 1.3, the $i$-th complex weighting coefficient (RF-channel) in cluster $k$ is generally defined as $W_{i,k} = A_{ik} e^{j\phi_{ik}}$ for $i \in \{1, ..., M\}$ and $k \in \{1, ..., N_{RF}\}$. $\phi_{ik}$ and $A_{ik}$ are realized by RF phase shifters and by variable gain attenuators and/or amplifiers (VGAs), respectively. As will be illustrated in Sections 1.4 and 1.5, the RF phase shifters are used for mainlobe steering of each cluster, whereas the RF VGAs enable spatial filtering of interference from other clusters by placing the nulls of each beamforming cluster in the directions of the interference incident angles. Also, throughout the forthcoming analysis, we assume $N_{RF} = K$ without loss of generality, as this work is primarily concerned with the RF portion of the hybrid architecture. It is noteworthy that the use of both linear amplitude and phase controls across the frequency range of interest enables this partially-overlapped hybrid architecture to achieve the same performance as a linear digital beamforming system [13,14]¹.
[Figure 1.4 panel annotations: coherent processing gain ↑, multiplexing gain ↑.]
Figure 1.4: (a) $M$ and $N_{RF}$ as function of $D$ for $N = 8$ elements, (b) $(M, N_{RF}, D)$ combinations for $N = 4$ and $N = 8$ elements.
[Figure 1.5 plot: outage capacity (bps/Hz) versus SNR (dB) from 0 to 30 dB. Curves: full-array M=8, K=8; sub-array M=4, K=2; overlapped array M=4, K=3, D=2; full-array M=4, K=4; sub-array M=2, K=2; overlapped array M=3, K=2, D=2.]
Figure 1.5: Outage capacity for full-array, sub-array, and partially-overlapped array with $N = 4$ and $N = 8$ elements.
The upper bound on channel capacity is determined by the Shannon-Hartley theorem and is a function of $\mathrm{SNR} = P_{sig}/P_N$, where $P_{sig}$ is the signal power and $P_N$ is the noise power. To compare the performance of these hybrid architectures, the outage capacity with respect to SNR is calculated. An outage occurs when the Shannon capacity falls below a certain threshold, $C_{out}$. The outage probability, $P_{out}$, considering equal power distribution across all clusters, is given by [16]:
$$P_{out} = \frac{\left(2^{C_{out}} - 1\right)^K}{K!\,\mathrm{SNR}_H^K} \tag{1.2}$$
where $\mathrm{SNR}_H$ is the improved SNR, equal to:
$$\mathrm{SNR}_H = \frac{M}{K}\,\mathrm{SNR} \tag{1.3}$$
¹ To achieve the same performance in a hybrid with a phase-only constraint, the number of parallel data streams is required to be less than half the number of RF chains [15].
With the beamforming gain defined as the number of antennas per cluster, overlapping the clusters in an $N$-element hybrid allows a larger number of antennas to be allocated per cluster, thereby resulting in a higher beamforming gain and $\mathrm{SNR}_H$ compared to the corresponding $N$-element sub-array. As an example, Fig. 1.5 compares the outage capacity of full-array, sub-array, and partially-overlapped antenna arrays with $N = 4$ and $N = 8$ for an outage probability of 1%. The partially-overlapped hybrid architecture is observed to achieve better outage capacity than its conventional sub-array counterpart.
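The comparison can be reproduced approximately by solving (1.2) for the outage capacity at a fixed outage probability, using (1.3) for the improved SNR. The sketch below is only an illustration of that algebra: the function name is hypothetical, and whether Fig. 1.5 was generated in exactly this way (for example, per stream or summed over streams) is not stated in the text.

```python
import math


def outage_capacity(snr_db: float, m: int, k: int, p_out: float = 0.01) -> float:
    """Outage capacity C_out (bps/Hz) obtained by solving (1.2) for C_out.

    snr_db : SNR in dB before the M/K improvement of (1.3)
    m      : antennas per cluster (M)
    k      : number of parallel streams (K = N_RF, as assumed in the text)
    p_out  : target outage probability (1% in the example of Fig. 1.5)
    """
    snr = 10.0 ** (snr_db / 10.0)
    snr_h = (m / k) * snr  # (1.3)
    # (1.2): P_out = (2^C - 1)^K / (K! * SNR_H^K)
    #   =>   C = log2(1 + SNR_H * (K! * P_out)^(1/K))
    return math.log2(1.0 + snr_h * (math.factorial(k) * p_out) ** (1.0 / k))


if __name__ == "__main__":
    print("SNR(dB)  sub-array(M=2,K=2)  overlapped(M=3,K=2)")
    for snr_db in (0, 10, 20, 30):
        sub = outage_capacity(snr_db, m=2, k=2)         # N = 4 sub-array
        overlapped = outage_capacity(snr_db, m=3, k=2)  # N = 4, D = 2
        print(snr_db, round(sub, 2), round(overlapped, 2))
```

Because the capacity grows monotonically with the M/K factor, the overlapped configuration always sits above the sub-array with the same K in this model.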
1.4 Phase Control and Mainlobe Steering in Partially-Overlapped Hybrid Architecture
Phase shifters in the RF-channels of each cluster enable independent phase excitation for each RF-channel within that cluster. The array factor AF of cluster $k$ is expressed as:
$$AF_{M,k} = \sum_{i=1}^{M} e^{j(i-1)\psi_k} \tag{1.4}$$
where $\psi_k = \pi \cos\theta_k + \phi_k$. $\phi_k$ is the phase progression from one RF-channel to the next in cluster $k$, and $\theta_k$ is the incident angle measured from the axis of the antenna array. By adjusting the phase shifters in each cluster so as to achieve a relative phase difference of $\phi_k$, the array factor of cluster $k$ can be maximized toward the incident angle $\theta_k$; in other words, cluster $k$'s mainlobe can be steered toward the incident angle $\theta_k$. As a result, the mainlobes of all clusters in this partially-overlapped hybrid architecture can be simultaneously and independently steered toward different arbitrary angles.
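The steering condition can be checked numerically: the sum in (1.4) reaches its maximum magnitude M when ψ_k = 0, which corresponds to a phase progression of −π cos θ for the desired angle θ. The sketch below evaluates (1.4) directly; the function names and the example angles are illustrative choices, not part of the original design flow.

```python
import cmath
import math


def array_factor(m: int, theta: float, phi: float) -> complex:
    """Array factor of an M-element cluster per (1.4), half-wavelength spacing.

    theta : incident angle from the array axis (radians)
    phi   : phase progression between adjacent RF-channels (radians)
    """
    psi = math.pi * math.cos(theta) + phi
    return sum(cmath.exp(1j * i * psi) for i in range(m))


def steering_phase(theta0: float) -> float:
    """Phase progression that makes psi = 0, i.e. a mainlobe at theta0."""
    return -math.pi * math.cos(theta0)


if __name__ == "__main__":
    m = 3  # 3-element clusters, as in the prototype described in the abstract
    for target_deg in (60.0, 90.0):
        phi = steering_phase(math.radians(target_deg))
        row = [round(abs(array_factor(m, math.radians(a), phi)), 2)
               for a in (30, 60, 90, 120, 150)]
        print(f"mainlobe at {target_deg:.0f} deg ->", row)
```

Each cluster uses its own phase progression, so two clusters sharing antennas can still point in different directions, which is the independence property described above.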
1.5 Amplitude Control and Null-Steering in Partially-Overlapped Hybrid Architecture
By utilizing both the VGAs' amplitude control and the phase shifters' phase control in this architecture, the null locations of the array factor can be arbitrarily steered. Null-steering based solely on the VGA settings is of particular interest because it enables interference suppression independently of mainlobe steering (achieved by the phase shifter settings). Nevertheless, for the sake of completeness, null-steering using two mechanisms, namely, (1) phase shifter settings and (2) VGA settings with the phase shifters already preset, will be explained.
1.5.1 Null-steering with Phase Shifters
Considering an $M$-antenna cluster, the array factor of cluster $k$ with $\lambda/2$ spacing between
antennas is written as:
$$AF_k^M = \sum_{i=1}^{M} e^{j(i-1)\psi_k} \overset{z = e^{j\psi_k}}{=} \sum_{i=1}^{M} z^{i-1} \tag{1.5}$$
where $\psi_k = \pi\cos\theta_k + \phi_k$ and $\phi_k$ is the phase progression (set by the phase shifters) from
one RF-channel to the next. The power series in (1.5) is readily calculated, resulting in
$AF_k = (1 - z^M)/(1 - z)$ [17]. As shown in Fig. 1.6a, this uniform distribution of VGA
amplitudes and progressive phase shifter settings will result in $M - 1$ zeros (nulls) around
the unit circle of the cluster's array factor. Assuming $\psi_k$ to be the $k$-th zero on the unit
circle of Fig. 1.6a, after presetting $\phi_k$ to steer the cluster's mainlobe toward the desired angle,
the null location of cluster $k$ is calculated to be at $\theta_{null,k} = \cos^{-1}\big((\psi_k - \phi_k)/\pi\big)$.
1.5.2 Null-steering with VGAs
Relying solely on phase shifters to achieve both mainlobe and null-steering does not provide
sufficient flexibility for null control. Therefore, it is desired to delegate null-steering to VGAs
and design phase shifters only for mainlobe steering purpose. The array factor of cluster k
with arbitrary amplitude excitation per RF-channel is given by:
$$AF_k^M = \sum_{i=1}^{M} A_{ik}\, e^{j(i-1)\psi_k} \overset{z = e^{j\psi_k}}{=} \sum_{i=1}^{M} A_{ik}\, z^{i-1} \tag{1.6}$$
By allowing only conjugate-pair zeros in the AF, the need for a phase shifter to realize nulls
is alleviated (i.e., $(z - z_i)(z - z_i^*)$ has real coefficients and thus carries no phase information) at the cost of reducing
the number of possible nulls to half (Fig. 1.6b). The AF of cluster $k$ with conjugate-pair
zeros is [17]:
$$AF_k^M = \begin{cases} (z + 1)\prod_{i=2}^{n}(z - z_{i-1})(z - z_{i-1}^*); & M = 2n \\ \prod_{i=1}^{n}(z - z_i)(z - z_i^*); & M = 2n + 1 \end{cases} \tag{1.7}$$
therefore, $AF_{M=2n,k} = (z + 1)\,AF_{M=2n-1,k}$. Moreover, the VGA amplitude settings within a
cluster will be symmetric around the center element of each cluster.
Figure 1.6: (a) Array factor nulls based on phase shifter control, (b) Array factor nulls based
on VGA control.
In the special case of two 3-element clusters, null location can be steered arbitrarily to
suppress the interference from the other cluster. The AF of cluster 1 is:
$$\begin{aligned} AF_1^3 &= (z - z_1)(z - z_1^*) \\ &= (z - e^{j\psi_1})(z - e^{-j\psi_1}) \\ &= z^2 - 2z\cos\psi_1 + 1 \end{aligned} \tag{1.8}$$
where $A_{21} = -2\cos\psi_1 = -2\cos(\pi\cos\theta_{null,1} + \phi_1)$. Here $\theta_{null,1}$ is the null location of cluster 1 and
$\phi_1$ is the required phase progression set by this cluster's phase shifters to steer the mainlobe
of cluster 1 toward the desired direction of $\theta_{main,1}$. Similarly, the AF of cluster 2 is calculated
to be:
$$AF_2^3 = z^2 - 2z\cos\psi_2 + 1 \tag{1.9}$$
where $A_{22} = -2\cos\psi_2 = -2\cos(\pi\cos\theta_{null,2} + \phi_2)$. Here $\theta_{null,2}$ represents the null location of cluster
2 and $\phi_2$ is set by this cluster's phase shifters to steer the mainlobe of cluster 2 toward
the desired direction of $\theta_{main,2}$. Therefore, proper adjustment of these two clusters' phase
shifters and VGAs yields $\theta_{null,1} = \theta_{main,2}$ and $\theta_{null,2} = \theta_{main,1}$. This also facilitates simultaneous
operation of these two clusters for spatial multiplexing in addition to beamforming within
each cluster.
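The cross-placement of nulls described by (1.8) and (1.9) can be sketched numerically. The example below is a minimal illustration, not the measured configuration: the angles (90° and 45°), the function names, and the normalization of the outer weights to 1 are assumptions. It sets each cluster's phase progression for its own mainlobe, derives the center weight from the other cluster's mainlobe angle, and evaluates (1.6) at both angles. It does not check whether the resulting center weight is realizable by an attenuator, nor whether the response at the mainlobe angle is the global maximum.

```python
import cmath
import math


def psi(theta_deg, phi):
    """psi = pi*cos(theta) + phi for half-wavelength element spacing."""
    return math.pi * math.cos(math.radians(theta_deg)) + phi


def cluster_settings(theta_main_deg, theta_null_deg):
    """Phase progression and (A1, A2, A3) weights of a 3-element cluster."""
    phi = -math.pi * math.cos(math.radians(theta_main_deg))  # mainlobe steering
    a2 = -2.0 * math.cos(psi(theta_null_deg, phi))           # center weight, (1.8)/(1.9)
    return phi, (1.0, a2, 1.0)


def af_mag(theta_deg, phi, amps):
    """|AF| per (1.6) for the given amplitude weights."""
    z = cmath.exp(1j * psi(theta_deg, phi))
    return abs(sum(a * z ** i for i, a in enumerate(amps)))


# Illustrative angles (assumed): cluster 1 looks at 90 deg, cluster 2 at 45 deg.
phi1, amps1 = cluster_settings(theta_main_deg=90.0, theta_null_deg=45.0)
phi2, amps2 = cluster_settings(theta_main_deg=45.0, theta_null_deg=90.0)

for name, phi, amps, main, null in (("cluster 1", phi1, amps1, 90.0, 45.0),
                                    ("cluster 2", phi2, amps2, 45.0, 90.0)):
    print(name, "A2 =", round(amps[1], 4),
          "|AF(main)| =", round(af_mag(main, phi, amps), 4),
          "|AF(null)| =", round(af_mag(null, phi, amps), 12))
```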
1.6 Effects of Amplitude and Phase Errors on Interference Suppression
Non-idealities in RF phase shifters and VGAs (e.g., resolution-induced quantization error,
timing jitter, and device mismatch) result in amplitude and phase errors, which degrade the
effectiveness of interference suppression in the overlapped hybrid architecture. Suppose that
optimum phase and amplitude excitations for each RF-channel within cluster k were derived
to achieve maximum array factor in the desired direction in the presence of interference
induced by simultaneous operation with other clusters. The complex weight vector containing
these optimum excitations is expressed as:
$$\vec{W}_{opt,k} = \left[A_{1k}e^{j\phi_{1k}}, \ldots, A_{Mk}e^{j\phi_{Mk}}\right]; \quad k \in \{1, \ldots, K\} \tag{1.10}$$
where Aik and φik are the optimum amplitude and phase settings of the VGA and phase
shifter of the RF-channel i in cluster k. Accounting for amplitude and phase errors in the
VGA and phase shifter, the actual weight vector will be:
$$\vec{W}_{err,k} = \left[A_{1k}(1 + a_{1k})e^{j(\phi_{1k} + \delta_{1k})}, \ldots, A_{Mk}(1 + a_{Mk})e^{j(\phi_{Mk} + \delta_{Mk})}\right]; \quad k \in \{1, \ldots, K\} \tag{1.11}$$
where $a_{ik}$ and $\delta_{ik}$ denote the amplitude and phase errors of the VGA and phase shifter of
the $i$-th RF-channel in cluster $k$. Fig. 1.7 shows the beamforming weight vectors $\vec{W}_{opt,k}$
and $\vec{W}_{err,k}$ with an angular error of $\theta_{err,k}$ between them. $\vec{W}_{opt,k}$ is orthogonal to the interference
plane, resulting in maximum beamforming gain and interference suppression. Because of
the amplitude and phase errors, $\vec{W}_{err,k}$ is projected onto both the signal and interference planes
by factors $\cos(\theta_{err,k})$ and $\sin(\theta_{err,k})$ [18]. As a result, for small angular errors, interference
leakage (a function of $\sin(\theta_{err,k}) \approx \theta_{err,k}$) is more sensitive to phase/amplitude errors than
the mainlobe gain (a function of $\cos(\theta_{err,k}) \approx 1$).
Figure 1.7: Effect of amplitude and phase error on weight vector.
The standard deviations $\sigma_{a,k}$ and $\sigma_{\delta,k}$ of $a_{ik}$ and $\delta_{ik}$
for $1 \le i \le M$ are defined through $E[a_{ik}^2] = \sigma_{a,k}^2$ and $E[\delta_{ik}^2] = \sigma_{\delta,k}^2$ ($E[\cdot]$ is the expected value).
Assuming normalized weight vectors (i.e., $\sum_{i=1}^{M} A_{ik}^2 = 1$), the variance of error in the weight
vector of cluster $k$ is calculated to be [19]:
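$$\begin{aligned} \sigma_{\theta,k}^2 &= E\left[(\vec{W}_{opt,k} - \vec{W}_{err,k})^2\right] \\ &= E\left[\sum_{i=1}^{M} A_{ik}^2\,(a_{ik}^2 + \delta_{ik}^2)\right] \\ &= \sigma_{a,k}^2 + \sigma_{\delta,k}^2 \end{aligned} \tag{1.12}$$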
therefore, the standard deviation of angular error in the weight vector is independent of
the number of antennas per cluster. However, increasing the number of antennas facilitates
suppressing a larger number of interferers, thus supporting a larger number of parallel data
streams, and improves the mainlobe gain by increasing the beamforming gain per cluster.
Fig. 1.8 shows interference rejection (i.e., $-10\log(\sigma_{\theta,k}^2)$ for normalized weight vectors
assuming small angular errors $\theta_{err,k}$) as a function of RMS amplitude and phase errors.
Figure 1.8: Interference rejection as a function of RMS amplitude error and RMS phase
error (x-axis: RMS amplitude error, 0 to 2 dB; y-axis: RMS phase error, 1 to 11 degrees;
contour labels from 11 to 25).
To combat interference suppression degradation due to amplitude and phase errors,
high-resolution VGAs and phase shifters need to be employed. With a $B_{amp}$-bit VGA and a $B_\phi$-bit phase
shifter per RF-channel, $2^{B_{amp} + B_\phi}$ weight vectors can be generated. With scalar quantization
of the weight vectors, the expected value of the peak-to-null ratio, $E[\mathrm{PNR}]$, varies proportionally
with:
$$E[\mathrm{PNR}] \propto M \times 2^{2(B_{amp} + B_\phi)} \tag{1.13}$$
therefore, $E[\mathrm{PNR}]$ is linearly proportional to the number of elements per cluster. However, it
varies exponentially with the resolution of the VGA and phase shifter, $B_{amp}$ and $B_\phi$, showing
the importance of the VGA and phase shifter resolutions in achieving high interference
suppression and high-resolution beam control.
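The result in (1.12) can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: it assumes equal-amplitude normalized weights, zero-mean Gaussian errors, and hypothetical RMS errors of 0.05 (fractional amplitude) and 3° (phase), and the function names are invented for the example. It compares the simulated error variance for two cluster sizes against $\sigma_{a,k}^2 + \sigma_{\delta,k}^2$ and converts the closed form into an interference rejection figure.

```python
import cmath
import math
import random


def interference_rejection_db(sigma_a, sigma_delta_rad):
    """-10*log10(sigma_theta^2) with sigma_theta^2 taken from (1.12)."""
    return -10.0 * math.log10(sigma_a ** 2 + sigma_delta_rad ** 2)


def simulated_error_variance(m, sigma_a, sigma_delta_rad, trials=20000, seed=1):
    """Monte Carlo estimate of E[|W_opt - W_err|^2] for equal weights."""
    rng = random.Random(seed)
    amp = 1.0 / math.sqrt(m)          # normalization: sum of A_ik^2 equals 1
    total = 0.0
    for _ in range(trials):
        for _ in range(m):
            a = rng.gauss(0.0, sigma_a)            # fractional amplitude error
            d = rng.gauss(0.0, sigma_delta_rad)    # phase error in radians
            total += abs(amp * ((1.0 + a) * cmath.exp(1j * d) - 1.0)) ** 2
    return total / trials


sigma_a, sigma_d = 0.05, math.radians(3.0)   # assumed RMS errors
for m in (3, 8):
    print(m, simulated_error_variance(m, sigma_a, sigma_d))
print("closed form:", sigma_a ** 2 + sigma_d ** 2)
print("rejection (dB):", interference_rejection_db(sigma_a, sigma_d))
```

The two simulated values should sit close to the closed form regardless of the cluster size, which is the independence from $M$ noted above.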
1.7 The Phase-Amplitude Controlled Partially-Overlapped Hybrid Circuit Architecture
Based on the idea of the partially-overlapped hybrid scheme with phase/amplitude control, a
4-element beamforming-MIMO RX for $D = 2$, $M = 3$, $N = 4$, and $K = 2$ is designed
(Fig. 1.9) [?, 12, 20]. Each of the two clusters is composed of three RF-channels (i.e., three
LNA-PS-VGA paths) with LNA2 and LNA3 shared between the two clusters. The phase
states of the three RF phase shifters and the amplitude states of the three VGAs within each cluster
are adjusted based on the mainlobe steering and the null-steering methods explained in
Sections 1.4 and 1.5, respectively. The null location for each cluster is steered with VGA
steps, enabling each cluster to suppress interferences at undesired incident angles located
in its null space. Inter-element spacing within and between clusters in this architecture is
assumed equal to $\lambda/2$ so as to avoid spatial aliasing in diversity beampatterns. Beamforming
gain acquired from each cluster combined with spatial multiplexing gain from these 2 clusters
improves reliability/diversity of high data-rate multi-stream links in both low and high SNR
regimes. Two LNAs (LNA2-3), shared between U1 and U2, are followed by splitters to
allow independent amplitude/phase control for these 2 clusters. The signals $U_{1k}e^{j\theta_{1k}}$ and
$U_{2k}e^{j\theta_{2k}}$, $1 \le k \le 4$, from distinct incident angles $\theta_1$ and $\theta_2$ appearing on 6 RF-channels are
fed to 6-bit phase shifters ($\phi_{ij}$, $1 \le i \le 2$, $1 \le j \le 3$) to be steered independently toward a
desired angle. The phase-shifted signals are then fed to 5-bit VGAs ($A_{ij}$, $1 \le i \le 2$, $1 \le j \le 3$)
to suppress interference due to concurrent data reception of the other cluster by the
null-steering technique explained in Section 1.5. Furthermore, any static amplitude or phase
mismatch between the RX's RF-channels is compensated with the high dynamic range of the VGAs
and the phase reference control of the phase shifters. The VGAs' outputs within each cluster travel
through carefully laid-out equi-length paths to a 3-to-1 combiner, where they are coherently
power-combined at the desired incident angle ($U_{1(2)}e^{j\theta_{1(2)}}$ for cluster 1(2)) and suppressed at
the undesired one ($U_{2(1)}e^{j\theta_{2(1)}}$ for cluster 1(2)). IQ mixers down-convert the RF signals at each
combining node using a shared integrated LO network to be filtered and amplified in the
baseband. To characterize RF-channels, the RF signal at each cluster's combining node is
monitored at the output of a tap coupler.
Figure 1.9: The 4-element realization of the beamforming-MIMO RX.
1.7.1 Low Noise Amplifier (LNA)
Shown in Fig. 1.10 is the schematic of the 4-stage LNA employed in each RF-channel. The
first stage is an inductively degenerated common-emitter amplifier. A common-emitter stage shows
a smaller NF than a cascode as the frequency of operation approaches $f_T$. The current
density of the first stage is set to $J_{DC} \approx 0.52$ mA/µm to achieve close-to-minimum NF
without compromising gain too much. The inductor $L_{deg}$ was then optimized to achieve
simultaneous noise and power match. The LNA's second stage incorporates a cascode at
the same current density to achieve a higher gain. An on-chip tuned balun was designed to
convert the cascode's single-ended output (matched to 50 Ω) to differential (matched to 100
Ω) with minimum phase error and loss (Fig. 1.11a) [21]. Finally, two neutralized differential
stages were designed for $G_{max}$ to further amplify the incoming signal. The S-parameter and
NF simulations of the 4-stage LNA are depicted in Fig. 1.11b. The 4-stage LNA consumes a
total DC power of 38.6 mW.
Figure 1.10: 4-stage LNA.
1.7.2 Phase Shifter (PS)
As shown in [22], the beam-steering resolution with a Bφ-bit phase shifter is:
$$\theta_{main,res} = \sin^{-1}\left(\frac{1}{2^{B_\phi - 1}}\right) \tag{1.14}$$
An array utilizing 6-bit and 5-bit phase shifters results in an SNR improvement very close to
the ideal SNR improvement across incident angles [22]. A 6-bit (5-bit) phase shifter results
in a beam-steering resolution of 1.79° (3.58°).
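The quoted resolutions follow directly from (1.14). A minimal sketch that evaluates the formula for the two bit widths mentioned (the function name is an assumption for this example):

```python
import math


def beam_steering_resolution_deg(bits):
    """Beam-steering resolution of (1.14) for a phase shifter with `bits` bits."""
    return math.degrees(math.asin(1.0 / 2 ** (bits - 1)))


for bits in (5, 6):
    print(bits, "bits:", round(beam_steering_resolution_deg(bits), 2), "deg")
```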
Figure 1.11: (a) Layout of the on-chip balun (108 µm × 125 µm), (b) simulated S-parameters and NF of the
4-stage LNA versus RF frequency (62 to 69 GHz).
Figure 1.12: Quadrature Gilbert-based phase shifter (annotated values: L = 60 pH, C = 100 fF, Rs = 24.5 Ω, 2R = 49 Ω).
Figure 1.13: Layout of the quadrature all-pass filter (QAF) (58 µm × 106 µm).
A 6-bit phase shifter (1 calibration bit to reduce RMS phase error), capable of generating 0°
to 360° with 11.25° phase steps, was designed to adjust the phase of each RF-channel (Fig.
1.12). It employs an active Gilbert-based topology with a quadrature all-pass filter (QAF)
whose 3D layout view is shown in Fig. 1.13 [23]. The QAF is composed of low-pass and
high-pass filters to generate the differential quadrature signals, IM-IP and QM-QP, at the
resonance frequency. The QAF inductor and capacitor are $L = R/\omega_0$ and $C = L/R^2$, where
$R$ is the QAF characteristic impedance. The input parasitic capacitance of the Gilbert stage
contributes to amplitude and phase errors of the QAF. The series resistor, $R_s$, lowers the
network's Q-factor as well as these amplitude and phase errors. The optimum value of $R_s$
for theoretically zero amplitude and phase errors is calculated to be $R_s = R$ [24, 25]. The
tail current of each quadrature Gilbert cell is controlled by a 4-bit DAC. Moreover, two
additional bits control the current steering of switch pairs M3-M4 and M5-M6 to generate
positive and negative phases at the output of the phase shifter. As indicated in Fig. 1.14a,
the measured RF-channel phases exhibit constant group delay for different settings of the phase
shifter. An RMS phase error less than 2.8° was measured across the RF BW (Fig. 1.14b).
The RMS amplitude error across phase shifter bits is less than 0.85 dB (Fig. 1.15). The core
phase shifter consumes a total DC power of 27.2 mW.
Figure 1.14: (a) Measured phase shifter's phase response, (b) measured phase shifter's RMS
phase error across phase states.
Figure 1.15: Measured amplitude error across phase shifter bits (annotated values: -1.58 dB and 1.42 dB).
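As a quick sanity check of the QAF relations $L = R/\omega_0$ and $C = L/R^2$, the sketch below computes the component values for one operating point. The inputs are assumptions chosen for illustration: $R = 24.5$ Ω is taken from the $R_s$ value annotated in Fig. 1.12 together with $R_s = R$, and $f_0 = 65.5$ GHz is the center frequency reported in Section 1.7.4. The function name is hypothetical.

```python
import math


def qaf_components(r_ohm, f0_hz):
    """QAF inductor and capacitor from L = R/w0 and C = L/R^2."""
    w0 = 2.0 * math.pi * f0_hz
    l_h = r_ohm / w0
    c_f = l_h / r_ohm ** 2
    return l_h, c_f


# Assumed inputs: R = 24.5 ohm (the Rs value annotated in Fig. 1.12, with
# Rs = R) and f0 = 65.5 GHz (the center frequency reported in Section 1.7.4).
l_h, c_f = qaf_components(24.5, 65.5e9)
print(f"L = {l_h * 1e12:.1f} pH, C = {c_f * 1e15:.1f} fF")
```

Under these assumptions the results land close to the L = 60 pH and C = 100 fF annotated in Fig. 1.12.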
1.7.3 Variable Gain Attenuator (VGA)
To adjust the signal amplitude of each RF-channel, a 5-bit passive VGA was designed (Figs.
1.16 and 1.17). The VGA attenuation varies from 0.5 dB to 11.5 dB in 1-dB steps. Using
Eqs. (1.8) and (1.9), an amplitude step of $A$ ($A \le 1$) will result in a null-steering resolution of:
$$\theta_{null,res} = \cos^{-1}\left(\frac{\cos^{-1}(-A/2)}{\pi}\right) - \cos^{-1}\left(\frac{2}{3}\right) \tag{1.15}$$
for 3 elements per cluster. The null-steering resolution, $\theta_{null,res}$, is 1.78° for a 1-dB amplitude
step and 3.97° for a 2-dB amplitude step. Therefore, the use of a 1-dB-step VGA and an
11.25°-step phase shifter enables comparable resolution for null and mainlobe steering
(calculated in Section 1.7.2).
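The null-steering resolution of (1.15) can be evaluated numerically. The sketch below is a hedged illustration: it assumes the amplitude step is applied as a relative increase of the center weight with respect to the outer two, i.e. $A = 10^{\text{step}/20}$, an interpretation chosen here because it gives magnitudes close to the quoted 1.78° and 3.97°; applying the step as an attenuation of the center weight gives a different value. With this convention the shift is negative (the null moves to a smaller angle). Function names are invented for the example.

```python
import math


def null_angle_deg(center_weight):
    """Null angle of z^2 + A*z + 1 with phi = 0, from A = -2*cos(pi*cos(theta))."""
    return math.degrees(math.acos(math.acos(-center_weight / 2.0) / math.pi))


def null_steering_resolution_deg(step_db):
    """Null shift relative to the uniform-weight null at acos(2/3), per (1.15)."""
    a = 10.0 ** (step_db / 20.0)   # assumption: center weight raised by step_db
    return null_angle_deg(a) - null_angle_deg(1.0)


for step_db in (1.0, 2.0):
    shift = null_steering_resolution_deg(step_db)
    print(step_db, "dB step -> null shift of", round(shift, 2), "deg")
```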
Figure 1.16: 5-bit passive pi-stage VGA.
Figure 1.17: Layout of one stage of VGA (126 µm × 73 µm).
Figure 1.18: (a) Measured VGA steps, (b) measured VGA's RMS gain error across attenuation steps.
Figure 1.19: Measured phase error across VGA bits (annotations: All Stages OFF, All Stages ON, 4°, -1.8°).
The attenuation stages employ a pi-stage topology whose shunt and series resistors, $R_P$ and
$R_S$, for different levels of attenuation $A_{dB}$ ($A_L = \sqrt{1/10^{A_{dB}/10}}$) are derived as:
$$R_P = \frac{Z_0(1 - A_L^2)}{1 + A_L^2 - 2A_L} = Z_0 \cdot \frac{1 + A_L}{1 - A_L} \tag{1.16}$$
$$R_S = \frac{(1 - A_L)R_P Z_0}{R_P - Z_0} \tag{1.17}$$
where $Z_0$ is the 50-Ω characteristic impedance of the VGA (100-Ω differential). A large shunt
resistor, $R_{P,off}$, is added in series with $R_P$ and in parallel with deep N-well switches. In the
VGA ON-mode, the switches turn on, $R_{P,off}$ is shorted and the resistive network $R_S$–$R_P$
adjusts the attenuation level. In the VGA OFF-mode, $R_{P,off}$ is placed in series with $R_P$,
resulting in no attenuation from the VGA stage. The error in $A_L$ due to variations in $R_P$
and $R_S$ ($\Delta R_P$ and $\Delta R_S$) is approximately equal to:
$$A_{err} = \frac{Z_0 R_P\left[(Z_0 R_S)\Delta R_P - (R_P R_S + Z_0 R_S)\Delta R_S\right]}{(R_S R_P + R_S Z_0 + R_P Z_0)^2} \tag{1.18}$$
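The resistor expressions in (1.16) and (1.17) can be exercised with a short calculation. The sketch below is illustrative: the stage values of 1, 2, and 4 dB, the single-ended 50-Ω reference, and the function names are assumptions for this example. It computes $R_P$ and $R_S$ for each stage and then checks, by simple series/parallel reduction, that a stage terminated in $Z_0$ presents $Z_0$ at its input and attenuates the voltage by $A_L$.

```python
def pi_attenuator(att_db, z0=50.0):
    """Return (A_L, R_P, R_S) of a matched pi stage per (1.16) and (1.17)."""
    a_l = 10.0 ** (-att_db / 20.0)
    r_p = z0 * (1.0 + a_l) / (1.0 - a_l)
    r_s = (1.0 - a_l) * r_p * z0 / (r_p - z0)
    return a_l, r_p, r_s


def parallel(x, y):
    """Parallel combination of two resistances."""
    return x * y / (x + y)


def check_stage(att_db, z0=50.0):
    """Input impedance and voltage ratio of the stage terminated in z0."""
    _, r_p, r_s = pi_attenuator(att_db, z0)
    load = parallel(r_p, z0)                 # output shunt arm with the load
    z_in = parallel(r_p, r_s + load)         # seen from the input port
    return z_in, load / (r_s + load)


for att_db in (1.0, 2.0, 4.0):               # illustrative stage values
    a_l, r_p, r_s = pi_attenuator(att_db)
    z_in, ratio = check_stage(att_db)
    print(att_db, round(r_p, 1), round(r_s, 2), round(z_in, 3), round(ratio, 4))
```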
To reduce the effect of parasitic capacitances of the switches, the equivalent capacitance, $C$,
at the VGA input and output ports is resonated out by inductors $L$ in series with $R_S$. This
capacitance appears in parallel with the input and output VGA ports and is derived to be:
$$C = C_{SW}\,\frac{(1 + Q_P^2)\,Q_S^2}{(1 + Q_S^2)\,Q_P^2} \tag{1.19}$$
where $C_{SW}$ is the equivalent parasitic capacitance appearing in parallel with the switch,
and $Q_P$ and $Q_S$ are derived to be:
$$Q_P = R_{P,off}\,C_{SW}\,\omega_0 \tag{1.20}$$
$$Q_S = \frac{Q_P R_{P,off}}{R_P(1 + Q_P^2) + R_{P,off}} \tag{1.21}$$
A gain stage is introduced after the -2 dB stage to reduce the NF contribution of the two -4 dB
stages. Assuming a loss of $L_B$ before the gain stage, a loss of $L_A$ after the gain stage, and a
gain of $G_{GS}$ from the gain stage, the $IP_{1dB}$ and the noise factor referred to the VGA input
are:
$$IP_{1dB,VGA} = 1 \Big/ \left(\frac{L_B}{IP_{1dB,GS}} + \frac{G_{VGA}}{IP_{1dB,AV}}\right) \tag{1.22}$$
$$F_{VGA} = L_B F_{GS} + \frac{L_B}{G_{GS}}(L_A - 1) + \frac{F_{AV} - 1}{G_{VGA}} \tag{1.23}$$
where $IP_{1dB,AV}$ and $F_{AV}$ are the input compression point and the noise factor of the RF building
blocks after the VGA (including the power combiner and mixer), respectively, and $G_{VGA} =
G_{GS}/(L_B L_A)$. The effect of $IP_{1dB,GS}$ is reduced by $L_B$, so $IP_{1dB,AV}$ is the dominant
term setting the overall linearity $IP_{1dB,VGA}$ in all-OFF mode, independent of the gain
stage location. However, $F_{VGA}$ degrades by the absolute loss of the stages prior to the gain
stage in both ON and OFF modes. Placing the gain stage after all VGA stages, compared
to placing it before them, will degrade $NF_{VGA}$ by 9.72 dB/15.7 dB and will improve $IP_{1dB,VGA}$ by
1.87 dB/10.52 dB in all-OFF/all-ON modes. Placing the gain stage after the -2 dB stage,
compared to placing it before all VGA stages, will degrade $NF_{VGA}$ by only 2.96 dB/2.58 dB
and will improve $IP_{1dB,VGA}$ by 0.79 dB/5.18 dB in all-OFF/all-ON modes.
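The noise-factor expression (1.23) can be cross-checked against a stage-by-stage Friis cascade. The sketch below is a consistency check only: all numerical values are hypothetical, the function names are invented, and it assumes that each passive attenuator section has a noise factor equal to its loss. It does not attempt to reproduce the dB figures quoted above, which depend on circuit data not given in the text.

```python
def vga_noise_factor(l_b, f_gs, g_gs, l_a, f_av):
    """Input-referred noise factor per (1.23); all quantities are linear."""
    g_vga = g_gs / (l_b * l_a)
    return l_b * f_gs + (l_b / g_gs) * (l_a - 1.0) + (f_av - 1.0) / g_vga


def friis(stages):
    """Friis cascade for a list of (noise_factor, gain) pairs, linear units."""
    total, gain = 1.0, 1.0
    for f, g in stages:
        total += (f - 1.0) / gain
        gain *= g
    return total


def db(x):
    """Convert a dB value to a linear power ratio."""
    return 10.0 ** (x / 10.0)


# Hypothetical values, chosen only to exercise the formula.
l_b, l_a = db(2.0), db(8.0)       # loss before / after the gain stage
f_gs, g_gs = db(6.0), db(10.0)    # gain-stage noise factor and gain
f_av = db(12.0)                   # noise factor of the blocks after the VGA

direct = vga_noise_factor(l_b, f_gs, g_gs, l_a, f_av)
cascade = friis([(l_b, 1.0 / l_b), (f_gs, g_gs), (l_a, 1.0 / l_a), (f_av, 1.0)])
print(direct, cascade)
```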
The measured RF-channel gain for different VGA settings is shown in Fig. 1.18a. A measured
RMS gain error less than 0.32 dB across the RF BW was obtained (Fig. 1.18b). The
measured phase error of the VGA for all 16 states, including the cases when all stages are
OFF and ON, is shown in Fig. 1.19. The RMS phase error across VGA bits is less than
1.8°. The VGA block consumes a total DC power of 9 mW.
1.7.4 Measurement Results of RF and RF-to-BB Channels
The RX S-parameters and linearity were measured from the LNA inputs down to RF test
PAD1 and 2 after VGA compensation for any gain mismatch between RF-channels. The
measured RX S-parameters plot in Fig. 1.20a shows a maximum gain per RF-channel of
12.3 dB at a center frequency of 65.5 GHz with 3 GHz RF BW. The worst-case IP1dB of
the RF-channels vs. RF frequency is shown in Fig. 1.20b.
Fig. 1.21a shows the measured conversion gain and NF for an RF-to-BB channel comprising
the entire path from the LNA down to the BB port. Upon down-conversion using the
integrated LO network, a direct conversion gain of 30.15 dB was measured with a 3-dB
IF BW of 548 MHz. The measured double-sideband NF of a single RF-to-BB channel
for minimum and maximum VGA attenuations ranges over 9.8-11 dB and 11.5-12.7 dB,
respectively, across the IF BW (Fig. 1.21a). The IQ gain/phase mismatches were measured
to be less than 0.8 dB/3.6° (Fig. 1.21b).
Figure 1.20: (a) Measured S-parameter of RF-channels (simulated in dashed), (b) simulated
IP1dB (input compression point in dBm versus frequency, 62 to 69 GHz).
Figure 1.21: (a) Measured conversion gain and NF of RF-to-BB channel, (b) measured I/Q
amplitude and phase errors.
1.7.5 Coupling between RF-Channels
Intra- and inter-element couplings within and between clusters cause gain/phase errors at
the two combining nodes of this RX [26]. For example, sources of coupling between the three
RF-channels of cluster 1 are demonstrated in Fig. 1.22 (this coupling discussion is also valid
for cluster 2 due to the symmetry of the RX layout). The dominant source of coupling is from
the last single-ended stage of each RF-channel's LNA to the LNA input of an adjacent
RF-channel. To measure the effect of coupling on gain/phase errors of this RX, two coupling
scenarios are considered: coupling from desired RF-channel 3 of cluster 1 to RF-channel 2
of cluster 1, and coupling from desired RF-channel 1 of cluster 1 to RF-channel 2 of cluster 1.
In the first scenario, the signal at the last single-ended stage of LNA3 couples to the input
of LNA2 and, after being amplified by LNA2, passes through the undesired RF-channel 2
of cluster 1 and appears at the combining node of cluster 1 with a gain of $G_{LNA2}\,G_{\phi_{12}}\,A_{12}$.
Figure 1.22: Coupling path between RF-Channels 1 and 2 of cluster 1 and RF-Channels 2
and 3 of cluster 1.
Bypass rails between the RF-channels of each cluster, in addition to a ground plane for critical
passives, are designed to suppress this mechanism. To measure this coupling, phase shifter
$\phi_{13}$ of the desired RF-channel is set as a reference and the phase shifters of the undesired paths,
$\phi_{23}$ and $\phi_{12}$, are set to four specific values of 0°, 90°, 180°, and 270° (Fig. 1.23). The 0°
and 180° phase differences and the 90° and 270° phase differences capture maximum gain
and phase errors, respectively, at the combining node. Figs. 1.24(a)-(b) show the measured
RMS phase/gain errors at the combining node for these four different phase settings. The
RMS gain and phase errors due to coupling between RF-channels 2 and 3 of cluster 1 at the
combining node of cluster 1 are less than 0.56 dB and 1.4°, respectively. In the second scenario,
coupling from RF-channel 1 to RF-channel 2 of cluster 1 is considered. The dominant source
of coupling in this case is from the last single-ended stage of LNA1 to the input of LNA2,
where the signal experiences a gain of $G_{LNA2}\,G_{\phi_{12}}\,A_{12}$ and appears at the combining node
of cluster 1. Figs. 1.24(c)-(d) show the measured RMS phase/gain errors at the combining
node for four different phase settings. The RMS gain error and phase error due to coupling
between RF-channels 1 and 2 of cluster 1 at the combining node of cluster 1 are less than 0.54
dB and 2.27°, respectively. Therefore, amplitude and phase errors due to coupling between
RF-channels of each cluster are negligible in this RX.
Figure 1.23: (a) coupling between RF-channels 2 and 3 of cluster 1, (b) coupling between
RF-channels 1 and 2 of cluster 1 for 4 different phase shifter settings.
Figure 1.24: (a) measured RMS gain error, and (b) measured RMS phase error due to
coupling of RF-channels 2 and 3 of cluster 1, (c) measured RMS gain error, and (d) measured
RMS phase error due to coupling of RF-channels 1 and 2 of cluster 1.
Figure 1.25: Measured phase scanning of each 3-element cluster.
Figure 1.26: Measured spatially multiplexed array factors of two clusters steered toward 60°
and 90°.
1.8 Measured Null-steering and Spatial Multiplexing
Based on the phase and amplitude control techniques formulated by (1.8) and (1.9), the
mainlobe and null locations of each cluster's beam pattern can be directed toward arbitrary
angles. The array factor of each cluster is extracted from S-parameter measurements of each
RF-channel of this RX for all phase shifter and VGA settings in accordance with Eq. (1.6).
Fig. 1.25 shows the rotational change in the location of the array factor's mainlobe for different
phase shifter settings within a cluster. The half-power beamwidth increases at wide
null-steering intervals due to near triggering of grating lobes in the cluster's array factor. Fig.
1.26 shows the measured spatially multiplexed array factors of cluster 1 and cluster 2 steered
toward 60° and 90° with the null of each cluster adjusted such that it is placed on top of the
mainlobe of the other cluster.
Figure 1.27: Measured array factor of each cluster for 625 VGA settings (annotations: main lobe, steerable nulls).
Figure 1.28: Measured signal-to-interference ratio (SIR) for different undesired incident
angles.
Figure 1.29: Die micrograph.
Fig. 1.27 shows the array factor of each cluster for 625 combinations of VGA settings,
where the null location of the array factor varies in accordance with Eqs. (1.8) and (1.9) for
different VGA settings. For every null location, $\theta_{null}$, multiple combinations of VGA settings
$(A_{1k}, A_{2k}, A_{3k})$ for cluster $k$ can be found. Mainlobe gain degradation due to null-steering is
equal to:
$$20\log\left(\frac{3 \times \max(A_{1k}, A_{2k}, A_{3k})}{2 - 2\cos(\pi\cos(\theta_{null}))}\right) \tag{1.24}$$
for a 3-element array. For a uniformly illuminated 3-element array, $\theta_{null} = \cos^{-1}(2/3)$ and
there is no degradation.
By having control over both the mainlobe and null locations, spatial multiplexing of multiple
(in this work, two) clusters can be realized. For spatially multiplexed clusters, interference
is considered to be the interference of clusters on each other during simultaneous parallel
data stream reception. Assuming equal signal powers for cluster 1 and cluster 2
(signal-to-interference ratio (SIR) = PNR), Fig. 1.28 shows the measured signal-to-interference ratio
across different incident angles of interference. In this measurement, cluster 1's mainlobe
peak angle (i.e., the interference angle for cluster 2) is steered toward 90° and cluster 2's
mainlobe peak angle (i.e., the interference angle for cluster 1) is swept from 0° to 180° in
15° steps. To account for angle estimation errors, a half-LSB phase error is included at each
step. It is observed that an SIR better than 15 dB is achieved across a wide null-steering
interval (22.5°-74° and 106°-157.5°). These SIR measurement results are in accordance with the
interference rejection contours in Fig. 1.8 based on the measured RMS phase and amplitude
errors of the phase shifter (Section 1.7.2) and VGA (Section 1.7.3).
Figure 1.30: (a) RF characterization and RF-to-baseband measurement setup (b) schematic,
(c) photo.
Fig. 1.29 shows the 3.5ˆ3mm2 die micrograph of the 4-element/6-channel RX prototype
fabricated in a 130-nm SiGe BiCMOS process. A two-layer FR-4 PCB which embeds the
wire-bonded die was developed for measurements. The RF characterization setup for RF-
31
Table 1.1: Performance Summary of the 4-Element Beamforming-MIMO RX
Architecture RF Front-End Performance 
Process 130nm  BiCMOS Frequency (GHz) 64-67
fT/ fmax (GHz) 200/220 Single RF Channel Gain (dB) 12.3 
Integration RF, LO, Analog BB Input/Output Return Loss (dB) ≤ -13.6/-14.9 
Chip Area (mm2) 3.5x3 Measured IP1dB (dBm) @ 65.5GHz -16.3
Measured Power Consumption (mW) 528 Phase Shifter Resolution (°) 11.25 
Phased Array Phase Shifter Phase Control (°) 360 
Number of RF channels 6 RMS Phase Error (°) ≤ 2.8 
Phase Shifting RF Domain Gain Control (dB) 11 
Amplitude Control RF Domain RMS Gain Error (dB) ≤ 0.32 
MIMO 
SIR @ desired angle=90° (dB) 
≥20 for interference angles 32° 
to 65°&115°to148° Spatial Multiplexing 2 
Coupling between Channels ≥15 for interference angles 0 
to 32°& 65° to 74°& 106° to 
115° & 148° to 180 ° RF Channels 1 and 2 
Gain Variation (dB) @ center freq. ≤ 0.5 
RF Front-End to Baseband Performance 
RMS Gain Error (dB) ≤ 0.54 
RMS Phase Error (°) ≤ 2.27 IF Bandwidth (MHz) 548 
RF Channels 2 and 3 RF-to-BB Channel Conversion Gain (dB) 30.15 
Gain Variation (dB) @ center freq. ≤ 0.6 DSB NF of a single RF-to-BB Channel (dB) 9.8-11 
RMS Gain Error (dB) ≤ 0.56 I/Q Phase Mismatch (°) ≤ 3.6 
RMS Phase Error (°) ≤ 1.4 I/Q Amplitude  Mismatch (dB) ≤ 0.8 
TABLE II. TABLE OF COMPARISON 
ISSCC 2017 [5] TMTT 2017 [6] JSSC 2012 [7] TMTT 2013 [8] This Work 
Architecture 
TRX 
Phased Array       
TRX   
Phased Array
RX 
Phased Array        
RX    
Phased Array
RX     
Beam forming 
MIMO  
Phase Shifting Passive/Active RF LO Passive RF Passive RF RF 
Process 130nm BiCMOS 90nm BiCMOS 130nm BiCMOS 130nm BiCMOS 130nm BiCMOS 
fT/ fmax (GHz) NA 300/350 200/220 200/220 200/220 
Number of Phased Array Channels 
16 phased array 
channels per chip 
4 phased array 
channels 
4 phased array 
channels 
16 phased array 
channels 
4 elements / 6 phased 
array channels 
Chip Area (mm2) 15.8×10.5 3.4×2.1 2×2.7 5.5×5.8 3.5x3 
Integration RF, LO, BB RF, LO, BB RF RF, LO, BB RF, LO, BB 
Frequency (GHz) 27-29 71-86 76-84 76-84 64-67
Channel Conversion Gain (dB) 34 26.2 10.1-18.9 30-33 30.15 
Channel NF(dB) 6† 9-14 10-11 11.4-13 9.8-11†† 
Phase Shifter Resolution (°) 4.9 5 11 11 11.25 
Gain Control (dB) 8 - 9 11.2 11 
Input P1dB (dBm) -22.5 -30.6 -26.7 to -23 -26 -16.3
PDC  (mw) 3300 286* 130** 1000/1200 528*** 
MIMO 
Capability 
Multiplexing gain NA NA NA NA 2 
Number of stream 1 1 1 1 2 
† NF of LNA+switch off-mode PA    †† NF of RF-to-BB channel 
* In RX mode per channel (excluding BB and LO)  ** RF Blocks only  *** PDC of all channels for 2-streams (including Mixer-BB-LO)
Table 1.2: Table of Comparison
TABLE I.    PERFORMACNE SUMMERY OF THE 4-ELEMENT BEAMFORMING-MIMO RX 
Architecture RF Front-End Performance 
Process 130nm  BiCMOS Frequency (GHz) 64-67 
fT/ fmax (GHz) 200/220 Single RF Channel Gain (dB) 12.3 
Integration RF, LO, Analog BB Input/Output Return Loss (dB) ≤ -13.6/-14.9 
Chip Area (mm2) 3.5x3 IP1dB (dBm) -16.3 
Measured Pow r Consumption (mW) 528 Phase Sh fter Res ution (°) 11.25 
Phased Array Phase Shifter Phase Control (°) 36  
Number of RF channels 6 RMS Phase Error (°) ≤ 2.8 
Phase Shifting RF Domain Gain Control (dB) 11 
Amplitude Control RF Domain RMS Gain Error (dB) ≤ 0.32 
MIMO 
SIR @ desired angle=90° (dB) 
≥20 for interference angles 32° 
to 65°&115°to148° Spatial Multiplexing 2 
Coupling between Channels ≥15 for interference angles 0 to 
32°& 65° to 74°& 106° to 115° 
& 148° to 180 ° RF Channels 1 and 2 
Gain Variation (dB) @ center freq. ≤ 0.5 RF Front-End to Baseband Performance 
RMS Gain Error (dB) ≤ 0.54 
RMS Phase Error (°) ≤ 2.27 IF Bandwidth (MHz) 54  
RF Channels 2 and 3 RF-to-BB Channel Conversion Gain (dB) 30.15 
Gain Variation (dB) @ center freq. ≤ 0.6 DSB NF of a single RF-to-BB Channel (dB) 9.8-11 
RMS Gain Error (dB) ≤ 0.56 I/Q Phase Mismatch (°) ≤ 3.6 
RMS Phase Error (°) ≤ 1.4 I/Q Amplitude  Mismatch (dB) ≤ 0.8 
 
TABLE II. TABLE OF COMPARISON
Parameter | ISSCC 2017 [5] | TMTT 2017 [6] | JSSC 2012 [7] | TMTT 2013 [8] | This Work
Architecture | TRX Phased Array | TRX Phased Array | RX Phased Array | RX Phased Array | RX Beamforming MIMO
Phase Shifting | Passive/Active RF | LO | Passive RF | Passive RF | RF
Process | 30nm BiCMOS | 90nm BiCMOS | 130nm BiCMOS | 130nm BiCMOS | 130nm BiCMOS
Number of Phased Array Channels | 16 phased array channels per chip | 4 phased array channels | 4 phased array channels | 16 phased array channels | 4 elements / 6 phased array channels
Chip Area (mm2) | 15.8×10.5 | 3.4×2.1 | 2×2.7 | 5.5×5.8 | 3.5×3
Integration | RF, LO, BB | RF, LO, BB | RF | RF, LO, BB | RF, LO, BB
Frequency (GHz) | 27-29 | 71-86 | 76-84 | 76-84 | 64-67
Channel Conversion Gain (dB) | 34 | 26.2 | 10.1-18.9 | 30-33 | 30.15
Channel NF (dB) | 6† | 9-14 | 10-11 | 11.4-13 | 9.8-11††
Phase Shifter Resolution (°) | 4.9 | 5 | 11 | 11 | 11.25
Gain Control (dB) | 8 | - | 9 | 11.2 | 11
PDC (mW) | 3300 | 286* | 130** | 1000/1200 | 528
MIMO Capability: Multiplexing gain | NA | NA | NA | NA | 2
MIMO Capability: Number of streams | 1 | 1 | 1 | 1 | 2
† NF of LNA+switch, off-mode PA    †† NF of RF-to-BB channel    * In RX mode per channel    ** RF blocks only
 
channel measurements is shown in Fig. 1.30a. The RF-to-BB measurement setup is shown
in Figs. 1.30b and 1.30c. An E8361A PNA was used for S-parameter measurements, and an E8257D
PSG signal generator along with an E4448A PSA spectrum analyzer were utilized for RF-to-BB
measurements. Table 1.1 summarizes the detailed RF and RF-to-BB performance of
the proposed RX, and Table 1.2 provides the performance comparison with prior work. The
proposed RX supports a MIMO multiplexing gain of 2, in contrast to no multiplexing gain
in conventional phased array architectures.
1.9 Conclusion
A 4-element phase-amplitude controlled receiver architecture with simultaneous beamform-
ing and MIMO capabilities both implemented in the RF domain was presented. By par-
tially overlapping RX elements into 2 amplitude-phase controlled clusters, the proposed RX
achieves a higher MIMO multiplexing gain compared to conventional phased array or hybrid
architectures with the same number of elements. Analysis of amplitude control using VGAs
to enable null-steering independently of mainlobe steering was presented and the effect of
phase/amplitude error on the achievable interference suppression was detailed. Measure-
ment results were presented showing excellent agreement with simulation, while verifying
the system-level analysis for concurrent spatial multiplexing and interference rejection.
Chapter 2
A Millimeter-Wave Energy-Efficient
Direct-Demodulation Receiver:
Theory, Design, and Implementation
2.1 Introduction to High-Speed mm-Wave Receivers
There is a rapid increase in demand for high-speed point-to-point wireless links with data-
rates comparable to wireline links for both short-range indoor and long-range outdoor com-
munication. Enabling applications include: optical fiber replacement [27–30], high-capacity
backhauls, high-speed access networks, resource management in large-scale networks [31],
close-proximity wireless data transfer between mobile terminals and storage devices, wire-
less communication through relay nodes [32], and highly-secure modulation with long code
modulation [33,34].
The abundance of available bandwidth in the mm-wave/sub-THz frequency range makes it
possible to achieve the capacity of wireline links with the flexibility and low cost of wireless
links. However, due to the degradation of active device performance at frequencies close to
device fmax [35], the operating frequency cannot be arbitrarily high. As a rule of thumb,
operation at frequencies up to ~fmax/2 is considered acceptable before the active device
performance degrades sharply. Recent developments of advanced commercial
silicon processes suggest this upper limit to be somewhere in the F-band (90-140 GHz) [36–39].
This notion implies that high-spectral-efficiency modulation schemes combined with
wide bandwidth provide a more practical pathway toward tens-of-Gbps wireless transceivers.
Most importantly, although very high-speed wireless transceiver front-ends based on con-
ventional direct-conversion [40–42] or IF-conversion [43–45] architectures have been reported
recently, their inputs/outputs are still modulated baseband or IF signals. Ultra-high-speed
and high-resolution data converters are thus required to (de-)modulate the raw bit
information. Based on the Nyquist criterion, the sampling rates of these data converters need to be
at least 2 and 4 times the baud-rate of the modulated baseband and IF signals, respectively,
to avoid aliasing [46]. However, signal-to-noise-and-distortion ratio (SNDR) and spurious-free
dynamic range (SFDR) both quickly degrade with speed, leading to increasingly poor
resolution. Accordingly, ultra-high-speed transceivers in the prior-art utilize expensive and bulky
high-speed real-time oscilloscopes with speeds and resolutions as high as 200 GSa/s and
12 bits to demodulate their reported data-rate with an acceptable BER [47, 48]. The need
for watt-level data converters and digital back-ends makes these high data-rate transceivers
extremely energy-inefficient upon (de)modulation. One solution being pursued by prior
works is channel-bonding [40, 49]. Nonetheless, it demands several parallel data converters
at lower sampling rates, wideband LO and gain characteristics and high levels of calibra-
tion. Therefore, one can argue that using conventional architectures, designing an ultra-high
data-rate wireless transceiver that also incorporates energy-efficient mixed-signal and base-
band units would be quite challenging unless paradigm-shifting architecture-level solutions
are explored [50–53].
As will be discussed in Section II, it is of great interest to eliminate data converters and
significantly simplify the DSP to pave the way for energy-efficient ultra-high-speed wireless
links. In Section III, theory and system analysis of the proposed ADC-less 8PSK direct-
demodulation architecture based on a novel RF-correlation idea are detailed. The 8PSK
direct-demodulation receiver along with circuit analysis of main building blocks are described
in Section IV. The complete measurement results of the fabricated receiver prototype are
presented in Section V, and finally, Section VI provides concluding remarks.
2.2 High-speed ADC Design Challenges
A fundamental trade-off exists between power dissipation ($P_{diss}$), resolution (ENOB), and
speed ($f_s$) of data converters. This trade-off is captured in the widely used Walden
figure-of-merit, $\mathrm{FOM} = P_{diss}/(f_s \cdot 2^{\mathrm{ENOB}})$ [54]. Although finer technology nodes improve the energy
efficiency of data converters, dynamic range is severely limited due to simultaneous reduction
of supply voltage. This limited dynamic range makes high-resolution (i.e. > 10-bit) ADC
design at higher speeds extremely challenging, if not impossible. This has prompted
state-of-the-art high-speed ADCs to focus on improving the FOM by increasing energy efficiency
rather than increasing SNDR [55]. However, it is insightful to study the power dissipation
overhead of increasing resolution since demodulation BER is limited by the maximum
achievable SNDR. According to the thermal noise requirement of an n-bit ADC, a lower-bound on
its sampling capacitor size is calculated to be:

$$C_s = \frac{12\,kT\,2^{2n}}{V_{in,FS}^{2}} \qquad (2.1)$$
where $V_{in,FS}$ is the full-scale voltage at the ADC input. For high-resolution (> 10-bit) ADCs,
power dissipation is dominated by thermal noise and grows proportionally with $2^{2n}$. On
the other hand, for low-resolution (< 6-bit) ADCs, it is dominated by the component mismatch
requirement and minimum capacitor size and grows proportionally with $2^{n}$ [56]. With
simultaneous technology and supply scaling, the required sampling capacitor for a given resolution
increases due to reduced permissible noise levels [56], [57]. Therefore, the energy efficiency of
even low-resolution ADCs in finer technology nodes gets limited by more stringent thermal
noise requirements.

Figure 2.1: A generic time-interleaved ADC architecture.
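
As a rough numerical illustration of Eq. (2.1), the sketch below evaluates the lower bound on the sampling capacitor for a few resolutions. The temperature of 300 K and the 1 V full-scale voltage are assumptions chosen only for illustration and are not values taken from this work; the function name is likewise hypothetical.

```python
BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def min_sampling_cap(n_bits, v_fs, temp_k=300.0):
    """Lower bound on Cs from Eq. (2.1): Cs = 12*k*T*2^(2n) / V_FS^2.

    v_fs (full-scale voltage) and temp_k are illustrative assumptions.
    """
    return 12.0 * BOLTZMANN * temp_k * 2.0 ** (2 * n_bits) / v_fs ** 2


if __name__ == "__main__":
    # Each extra bit quadruples the required capacitance (the 2^(2n) term).
    for n in (6, 8, 10, 12):
        cs = min_sampling_cap(n, v_fs=1.0)
        print(f"{n:2d} bits: Cs >= {cs * 1e15:.2f} fF")
```

The quadrupling per bit is what makes the thermal-noise-limited regime so costly: the driver must charge a capacitor that grows as $2^{2n}$ within the same acquisition time.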
Design of the baseband circuit is determined by the required RF bandwidth, power budget,
and the intended application, ranging from a few Hz and microwatts for biomedical [58–61]
and power-cycling [62] applications to several GHz and watts for high-speed wireless
communications. Additionally, the ADC must maintain the required resolution across a very wide
bandwidth for high-speed signal demodulation. As the parasitic capacitances become comparable
with the sampling capacitor, the power-speed trade-off becomes non-linear with increasing
speed, resulting in continuous FOM degradation [63]. Consequently, ADCs operating above
fFOM,cliff [64] will experience a drastic degradation in their resolution and energy efficiency.
This fFOM,cliff, even for ADCs designed in nanoscale technologies, is on the order of only a
few hundred MHz [64]. To achieve multi-GHz sampling rates without severely sacrificing
FOM, state-of-the-art ADCs utilize time interleaving of lower-speed sub-ADCs operating in
their linear power-speed regime (see Fig. 2.1) [63]. However, a power-hungry front-end
driver with the same acquisition bandwidth as that of the overall ADC is still required to drive
the equivalent sampling capacitor (Cs,eq) at the ADC input. The trade-off between the kT/Cs,eq
noise and the bandwidth of the input driver, in addition to its linearity requirements, limits the
available input SNDR/SFDR [65].
Furthermore, even design of the core time-interleaved ADC (i.e. sub-ADCs) for high multi-
Gbps data-rate communication systems is challenging. More specifically, time-interleaved
ADCs are quite sensitive to inter-channel mismatches. Gain mismatch and timing skew
between channels need to be precisely calibrated to avoid aliasing and degradation of the
signal integrity. Gain mismatch between channels results in an undesired
signal-amplitude-dependent image in the signal spectrum. For example, to obtain an SNR of 45 dB (~8-bit
resolution), a gain matching better than 0.5% is required in a 4-channel time-interleaved
ADC [66]. Gain error can be detected and corrected digitally. However, the associated
power overhead depends strongly on the activity factor of the logic gates, which limits the maximum
allowable number of logic gates to only a few thousand in low-resolution ADCs [67].
Furthermore, timing mismatch between channels will result in an undesired
input-frequency-dependent image in the signal spectrum. To quantize a signal at a frequency of $f_{in}$ with a
resolution of ENOB, the variance of timing mismatch in an N-channel time-interleaved ADC
must be less than [68]:

$$\sigma_T^2 \le \frac{N}{N-1}\cdot\frac{2/3}{\left(2^{\mathrm{ENOB}}\cdot 2\pi f_{in}\right)^2} \qquad (2.2)$$

For example, for an input signal with 12-GHz bandwidth, a timing mismatch of ~50 fs limits
the ADC resolution to 8 bits in a 4-channel time-interleaved ADC. Achieving such a low
timing mismatch is quite challenging even in current technologies and its correction requires
precisely controllable analog delay lines or high-order digital filters [69]. Additionally, each
channel often occupies a large area to meet the mismatch requirements of its sub-ADC,
thereby mandating long multi-phase clock routings. These long routing interconnects cause
signal integrity issues and increase the power dissipation of the clock network especially at
higher speeds.
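
The timing-mismatch bound of Eq. (2.2) can be checked numerically. The short sketch below (function name chosen here for illustration) evaluates the RMS skew limit for the example quoted above, namely an 8-bit resolution target, a 12 GHz input, and 4 interleaved channels, and gives a value close to the ~50 fs stated in the text.

```python
import math


def max_timing_skew_rms(enob, f_in_hz, n_channels):
    """RMS timing-mismatch limit from Eq. (2.2), in seconds."""
    variance = (n_channels / (n_channels - 1)) * (2.0 / 3.0) / (
        2.0 ** enob * 2.0 * math.pi * f_in_hz
    ) ** 2
    return math.sqrt(variance)


if __name__ == "__main__":
    sigma = max_timing_skew_rms(enob=8, f_in_hz=12e9, n_channels=4)
    print(f"Maximum RMS skew: {sigma * 1e15:.1f} fs")
```

Because the bound scales as $1/(2^{\mathrm{ENOB}} f_{in})$, each additional bit of resolution or each doubling of the input frequency halves the tolerable skew.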
Time-interleaved ADCs with sampling rates as high as 64 GSa/s have been reported in
prior-art. However, due to the challenges discussed above, they suffer from low resolution
(e.g. 5.95 ENOB at Nyquist frequency) and high power dissipation (e.g. ~950 mW) [70].
Importantly, the reported state-of-the-art FOMs often leave out the power dissipation of
certain parts of a complete ADC, including the input S/H amplifier or driver, the clock
generation/distribution network, and reference generation/calibration (gray blocks in Fig. 2.1) [71].
Therefore, the ADC remains one of the most power-hungry and challenging blocks in a high
data-rate communication system.
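
To put the numbers above in perspective, the Walden figure-of-merit defined in Section 2.2 can be evaluated for the quoted 64 GSa/s example. The sketch below simply plugs the figures given in the text (about 950 mW, 5.95 ENOB) into $\mathrm{FOM} = P_{diss}/(f_s \cdot 2^{\mathrm{ENOB}})$; it is an illustration of the formula and makes no claim about which blocks are included in the reported power.

```python
def walden_fom(p_diss_w, f_s_hz, enob):
    """Walden figure-of-merit in joules per conversion step."""
    return p_diss_w / (f_s_hz * 2.0 ** enob)


if __name__ == "__main__":
    # Values quoted in the text for the 64 GSa/s time-interleaved ADC example.
    fom = walden_fom(p_diss_w=0.95, f_s_hz=64e9, enob=5.95)
    print(f"FOM = {fom * 1e15:.0f} fJ/conversion-step")
```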
2.3 High-Order Direct-Demodulation
High-speed receivers requiring no ADCs and achieving data-rates as high as 16 Gbps have
been reported in the prior-art [41, 72]. However, these direct-demodulation architectures
only support low-order modulation schemes (OOK, BPSK, and QPSK). To further increase
the data-rate, these low spectral efficiency architectures need to be designed with much
wider bandwidths at very high carrier frequencies (close to fmax), resulting in poor wire-
less link quality due to limited receiver sensitivity and transmitter output power at such
high frequencies. This motivates the design of direct-demodulation architectures for high-
order modulations to increase the spectral efficiency and achievable data-rate for a given
bandwidth. As a natural extension of the previously reported direct-demodulation
architectures up to QPSK, this work presents direct demodulation of the 8PSK constellation
based on a multi-phase RF-correlation technique that is amenable to mm-wave frequencies.
2.3.1 Current 8PSK Demodulation Techniques
The most common approach to 8PSK demodulation is the arctangent (ATAN) technique
in parallel with a digital PLL (DPLL) that calculates the distance between the complex
baseband envelope and the constellation points in the IQ signal space [73]. The ATAN function has a
complex digital implementation and requires large lookup tables (LUTs) and memory blocks
to distinguish and scale phases, all of which are extremely challenging to implement at the
ultra-high speeds targeted in this work. A demodulation method similar to ATAN is the Costas-loop
technique [74], where the received signal is fed to four matched filters controlled by the 180°,
225°, 270°, and 315° phase-shifted outputs of a reference Costas loop. The correct received
symbol is detected by finding the maximum amplitude at the output of the four matched filters
over one symbol period. This is a 5-level amplitude decision and requires a power-hungry
high-speed ADC at our target data-rate to detect the received symbol corresponding to the
maximum amplitude level.
Another digital 8PSK demodulation technique is based on the complex-number method [75],
where the IQ baseband signal is sampled by a processor with a high sampling rate and converted to
a complex number. Afterwards, the sampled received complex number is multiplied by
a delayed conjugate version of itself and the product is then fed to a complex 8-phase slicer
to detect the received symbol. Implementation of this 8PSK demodulation technique again
requires an ultra-high-speed processor and a high level of field programmable gate array (FPGA)
resources.
A cross-correlation technique is also proposed in [76], where the 8PSK-modulated signal is
cross-correlated with 180°, 225°, 270°, and 315° phase-shifted versions of itself to detect eight
separate angles. Realization of this technique is extremely challenging at ultra-high input
frequencies and data-rates, as the correlated signal at the output of each cross-correlator is
at twice the input frequency and is composed of positive (in-phase) and negative (180°
out-of-phase) terms. A sign-slicer is used to separate the positive and negative parts. The absolute
value of both parts is calculated and the peak absolute value among the cross-correlator outputs
identifies the correct symbol. The absolute values of cross-correlated angles need to be
updated, compared and recorded in LUTs every half cycle of the carrier, which requires
a significant amount of hardware resources.
2.3.2 Proposed 8PSK Direct-Demodulation Technique
To obviate the need for ultra-high-speed high-resolution ADCs and complicated digital
demodulation computations requiring a high level of FPGA resources, a simple 8PSK
direct-demodulation technique is presented in this work. Fig. 2.2 shows the proposed multi-phase
RF-correlation-based technique for direct-demodulation of 8PSK symbols. The 2D IQ RF
signal space is partitioned by four differential LO phases into eight angular subsections (Fig.
2.2a). Phase references of the RF carrier and LO are offset by 22.5° to maximize the Euclidean
distance between RF symbols and the LO boundaries, thus maximizing the error tolerance
in detecting the symbols. A multi-phase RF-correlator is utilized for 8PSK symbol
detection (Fig. 2.2b). By downconverting symbols with four differential 45° phase-shifted mixers
followed by low-pass filtering, only the sign of the RF-correlated signals (Cin,k, k = 1, 2, 3, 4) is needed
to determine the received symbols. Table 2.1 shows the normalized correlation values of
8PSK symbols with LO phases. Assuming Gray-coding, the 3 bits per symbol are readily
extracted from the retimed outputs of the RF correlators with simple XOR logic gates. These 3
demodulated bits are related to Cout,k (k = 1, 2, 3, 4) with simple Boolean expressions: B2 = Cout,1,
B1 = Cout,3, and B0 = Cout,2 ⊕ Cout,4 (Table 2.2).
Figure 2.2: (a) 2D IQ signal space of 8PSK symbols partitioned by 8 LO phases, and (b)
block diagram of multi-phase RF-correlator and sign-check comparators.
Table 2.1: Normalized correlation values of 8PSK symbols with eight LO phases.
Symbol | LO0°  | LO180° | LO45° | LO225° | LO90° | LO270° | LO135° | LO315°
S1     | +0.92 | -0.92  | +0.92 | -0.92  | +0.38 | -0.38  | -0.38  | +0.38
S2     | +0.38 | -0.38  | +0.92 | -0.92  | +0.92 | -0.92  | +0.38  | -0.38
S3     | -0.38 | +0.38  | +0.38 | -0.38  | +0.92 | -0.92  | +0.92  | -0.92
S4     | -0.92 | +0.92  | -0.38 | +0.38  | +0.38 | -0.38  | +0.92  | -0.92
S5     | -0.92 | +0.92  | -0.92 | +0.92  | -0.38 | +0.38  | +0.38  | -0.38
S6     | -0.38 | +0.38  | -0.92 | +0.92  | -0.92 | +0.92  | -0.38  | +0.38
S7     | +0.38 | -0.38  | -0.38 | +0.38  | -0.92 | +0.92  | -0.92  | +0.92
S8     | +0.92 | -0.92  | +0.38 | -0.38  | -0.38 | +0.38  | -0.92  | +0.92
Correlator output per differential pair: (LO0°, LO180°) gives Cin,1; (LO45°, LO225°) gives Cin,2; (LO90°, LO270°) gives Cin,3; (LO135°, LO315°) gives Cin,4.
Table 2.2: Logic table of three demodulated bits per symbol.
Symbol | B2 = Cout,1 | B1 = Cout,3 | B0 = Cout,2 ⊕ Cout,4
S1     | 1           | 1           | 1
S2     | 1           | 1           | 0
S3     | 0           | 1           | 0
S4     | 0           | 1           | 1
S5     | 0           | 0           | 1
S6     | 0           | 0           | 0
S7     | 1           | 0           | 0
S8     | 1           | 0           | 1
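
The sign-based detection summarized in Tables 2.1 and 2.2 can be reproduced with a small behavioral model. The sketch below is an idealized, noiseless illustration and not a description of the fabricated circuit: it assumes symbol S_i sits at an angle of 22.5° + (i − 1)·45°, correlates it with the positive phase of each differential LO pair, takes the sign, and applies the Boolean expressions B2 = Cout,1, B1 = Cout,3, B0 = Cout,2 ⊕ Cout,4. The function names are hypothetical.

```python
import math

# Positive phase of each differential LO pair, giving C_in,1 ... C_in,4.
LO_PHASES_DEG = (0.0, 45.0, 90.0, 135.0)


def correlate(symbol_index):
    """Normalized correlations C_in,1..4 of symbol S_i (i = 1..8), noiseless."""
    theta = 22.5 + 45.0 * (symbol_index - 1)  # assumed symbol angle in degrees
    return [math.cos(math.radians(theta - lo)) for lo in LO_PHASES_DEG]


def demodulate(c_in):
    """Sign-check comparators followed by the XOR logic of Table 2.2."""
    c_out = [1 if c > 0 else 0 for c in c_in]
    b2 = c_out[0]
    b1 = c_out[2]
    b0 = c_out[1] ^ c_out[3]
    return b2, b1, b0


if __name__ == "__main__":
    for i in range(1, 9):
        c_in = correlate(i)
        values = ", ".join(f"{c:+.2f}" for c in c_in)
        print(f"S{i}: C_in = [{values}] -> B2B1B0 = {demodulate(c_in)}")
```

Adjacent symbols differ in exactly one output bit, which is the Gray-coding property assumed in the text.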
2.3.3 BER of the Proposed 8PSK Direct-Demodulation Method
Received RF 8PSK symbols, $S_i(t)$, $i = 1, 2, \ldots, 8$, after downconversion by eight differential LO
phases $(LO_{0^\circ}, LO_{180^\circ})$, $(LO_{45^\circ}, LO_{225^\circ})$, $(LO_{90^\circ}, LO_{270^\circ})$, $(LO_{135^\circ}, LO_{315^\circ})$ and baseband
filtering, generate four parallel 4-level pulse-amplitude modulation (PAM-4) signals with
unequal spacing. The equivalent PAM-4 symbol ($S_{m,PAM}$, $m = 1, 2, 3, 4$) for each 8PSK symbol ($S_i$, $i = 1, 2, \ldots, 8$)
is derived in Eq. (2.3). In this equation, $\omega_B$ is the baseband angular frequency and $g(t)$
is a baseband pulse with unity amplitude in the symbol period ($0 \le t \le T_s$) and
zero everywhere else. As shown in Fig. 2.3a, for a baseband signal energy of $E_g$ these
four levels are: $-d_2\sqrt{E_g/2}$, $-d_1\sqrt{E_g/2}$, $+d_1\sqrt{E_g/2}$, and $+d_2\sqrt{E_g/2}$, where $d_1^2 + d_2^2 = 1$ and
$d_2/d_1 = \cos(22.5^\circ)/\cos(67.5^\circ) \approx 2.4$.

$$
\begin{array}{l|cccc}
\text{PAM-4 symbol} & (LO_{0^\circ}, LO_{180^\circ}) & (LO_{45^\circ}, LO_{225^\circ}) & (LO_{90^\circ}, LO_{270^\circ}) & (LO_{135^\circ}, LO_{315^\circ}) \\ \hline
S_{4,PAM}:\ -g(t)\,d_2\cos(\omega_B t) & S_{i=4,5} & S_{i=5,6} & S_{i=6,7} & S_{i=7,8} \\
S_{3,PAM}:\ -g(t)\,d_1\cos(\omega_B t) & S_{i=3,6} & S_{i=4,7} & S_{i=5,8} & S_{i=1,6} \\
S_{2,PAM}:\ +g(t)\,d_1\cos(\omega_B t) & S_{i=2,7} & S_{i=3,8} & S_{i=4,1} & S_{i=2,5} \\
S_{1,PAM}:\ +g(t)\,d_2\cos(\omega_B t) & S_{i=1,8} & S_{i=1,2} & S_{i=3,2} & S_{i=3,4}
\end{array}
\qquad (2.3)
$$

The modulated baseband signal has a Gaussian distribution at each level of the
PAM-4 eye-diagram. The probability of receiving an error in detecting B2 of 8PSK symbols
$S_i(t)$, $i = 1, 4, 5, 8$ (red regions in Fig. 2.3a) upon RF correlation with $(LO_{0^\circ}, LO_{180^\circ})$ is equal to:

$$
\Pr(E|S_{1,PAM}) = \Pr(E|S_{1\ \mathrm{or}\ 8}) = \int_{-\infty}^{0} P(r|S_{1\ \mathrm{or}\ 8})\,dr
= \int_{-\infty}^{0} \frac{1}{\sqrt{\pi N_0}}\, e^{-\frac{\left(r - d_2\sqrt{E_g/2}\right)^2}{N_0}}\,dr
= Q\!\left(\sqrt{\frac{d_2^2 E_g}{N_0}}\right) \qquad (2.4)
$$

$$
\Pr(E|S_{4,PAM}) = \Pr(E|S_{4\ \mathrm{or}\ 5}) = \int_{0}^{\infty} P(r|S_{4\ \mathrm{or}\ 5})\,dr
= \int_{0}^{\infty} \frac{1}{\sqrt{\pi N_0}}\, e^{-\frac{\left(r + d_2\sqrt{E_g/2}\right)^2}{N_0}}\,dr
= Q\!\left(\sqrt{\frac{d_2^2 E_g}{N_0}}\right) \qquad (2.5)
$$
where $N_0$ and $Q(x)$ are the noise power spectral density and the tail distribution function of
the standard normal distribution, respectively. Similarly, the probability of receiving an error
in detecting B2 of 8PSK symbols $S_i(t)$, $i = 2, 3, 6, 7$ (blue regions in Fig. 2.3a) upon RF correlation
with $(LO_{0^\circ}, LO_{180^\circ})$ is equal to:

$$
\Pr(E|S_{2,PAM}) = \Pr(E|S_{3,PAM}) = \Pr(E|S_{2\ \mathrm{or}\ 7}) = \Pr(E|S_{3\ \mathrm{or}\ 6})
= Q\!\left(\sqrt{\frac{d_1^2 E_g}{N_0}}\right) \qquad (2.6)
$$
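Based on Eqs. (2.4), (2.5) and (2.6), the bit error probability of B2 extracted from RF
correlation of equiprobable 8PSK symbols ($S_i$, $i = 1, 2, \ldots, 8$) with $(LO_{0^\circ}, LO_{180^\circ})$ is equal to:

$$
P_{B2}(E) = \frac{1}{8}\sum_{i=1}^{8}\Pr(E|S_i)
= \frac{4}{8}\Big(\Pr(E|S_{1\ \mathrm{or}\ 4\ \mathrm{or}\ 5\ \mathrm{or}\ 8}) + \Pr(E|S_{2\ \mathrm{or}\ 3\ \mathrm{or}\ 6\ \mathrm{or}\ 7})\Big)
= \frac{1}{2}\left(Q\!\left(\sqrt{\frac{d_2^2 E_g}{N_0}}\right) + Q\!\left(\sqrt{\frac{d_1^2 E_g}{N_0}}\right)\right) \qquad (2.7)
$$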
Similarly, the bit error probability of B1 upon RF correlation of 8PSK symbols with $(LO_{90^\circ}, LO_{270^\circ})$
is equal to the bit error probability of B2 in (2.7). For an average baseband signal energy
per bit of $E_{b,avg} = \left(2\left(d_1\sqrt{E_g/2}\right)^2 + 2\left(d_2\sqrt{E_g/2}\right)^2\right)/6 = E_g/6$, the bit error probabilities of B2
and B1 are equal to:

$$
P_B(E) = P_{B2}(E) = P_{B1}(E)
\approx \frac{1}{2}\,Q\!\left(\sqrt{\frac{6E_{b,avg}}{N_0}}\,\cos(22.5^\circ)\right)
+ \frac{1}{2}\,Q\!\left(\sqrt{\frac{6E_{b,avg}}{N_0}}\,\sin(22.5^\circ)\right) \qquad (2.8)
$$

Figure 2.3: (a) Equivalent baseband PAM-4 eye-diagram and 4-level normal Gaussian
distribution, and (b) 8PSK bit and symbol error probabilities of the proposed direct-demodulation
scheme and theory.
The bit error probability of B0 is calculated by XORing the bit error probabilities of 8PSK
symbols after RF correlations with $(LO_{45^\circ}, LO_{225^\circ})$ and $(LO_{135^\circ}, LO_{315^\circ})$, which is equal to
$P_{B0}(E) = 2P_B(E) - P_B(E)^2$. $P_{B0}(E)$, $P_{B1}(E)$, and $P_{B2}(E)$ are shown in Fig. 2.3b. The
symbol error rate of the proposed 8PSK direct-demodulation is the same as the theoretical
8PSK symbol error rate. Therefore, the proposed multi-phase RF-correlation method enables
direct-demodulation of 8PSK symbols without imposing any degradation in detection of
those symbols. Furthermore, another advantage of this direct-demodulation method is that
symbols are detected using only sign-check comparators (i.e., BPSK decision) and simple
Boolean expressions, thereby requiring no multi-level amplitude decision or peak detection.
This sign-check comparison (BPSK decision) reduces symbol detection sensitivity to LO
phase deviation. In theory, as long as each LO phase deviation from its ideal angle remains
less than 22.5°, all levels in the PAM-4 eye-diagram will be non-zero for correct sign-check
decision. Therefore, LO phase error in the four RF-correlators can be tolerated as long as the BER
of the sign-check decision on the lowest eye-diagram level does not get limited by baseband SNR
and comparator sensitivity.

Figure 2.4: Proposed direct-demodulation RF-to-bits 8PSK receiver architecture including
front-end, 4-phase LO, mixer-baseband, and demodulator.
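
For readers who want to evaluate these expressions numerically, the sketch below computes the bit error probability of B2 and B1 from Eq. (2.8) and that of B0 from the relation $P_{B0}(E) = 2P_B(E) - P_B(E)^2$ stated in the text, as a function of $E_{b,avg}/N_0$ in dB. It only evaluates the closed-form expressions as written in this section (including the $6E_{b,avg}$ scaling defined here) and is not a simulation of the receiver; the function names are hypothetical.

```python
import math


def q_func(x):
    """Tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bit_error_b2_b1(ebn0_db):
    """P_B(E) of Eq. (2.8) for a given Eb,avg/N0 in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    arg = math.sqrt(6.0 * ebn0)
    angle = math.radians(22.5)
    return 0.5 * q_func(arg * math.cos(angle)) + 0.5 * q_func(arg * math.sin(angle))


def bit_error_b0(ebn0_db):
    """P_B0(E) = 2*P_B(E) - P_B(E)^2, as stated in the text."""
    p = bit_error_b2_b1(ebn0_db)
    return 2.0 * p - p * p


if __name__ == "__main__":
    for ebn0_db in (6, 8, 10, 12, 14):
        print(f"Eb/N0 = {ebn0_db:2d} dB: "
              f"P_B2 = P_B1 = {bit_error_b2_b1(ebn0_db):.3e}, "
              f"P_B0 = {bit_error_b0(ebn0_db):.3e}")
```

The $\sin(22.5^\circ)$ term dominates at practical SNRs, which reflects that the inner PAM-4 levels ($\pm d_1$) set the error floor of the sign decision.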
2.4 Proposed 8PSK Direct-Demodulation Receiver
The proposed 8PSK direct-demodulation receiver architecture is shown in Fig. 2.4 [50,
51]. It comprises a wideband LNA followed by a balun-based 1-to-4 splitter. The
splitter outputs provide four phase-matched differential RF signals at the RF ports of
double-balanced passive mixers driven by four 45° phase-shifted differential LOs.
The 125 GHz LO distribution network generates close to 0 dBm at the center frequency
after frequency multiplication by 6 (a tripler followed by a doubler) of the bondwired 20.83
GHz input. Tunable varactor-based low-pass and high-pass phase shifters with 45° phase
difference at 62.5 GHz and tunable all-pass phase shifters with 45° phase difference at 125
GHz are designed to generate four 45° phase-shifted differential LO signals at the LO ports
of the double-balanced passive mixers. The downconverted signals are then fed to baseband
amplifiers/active filters and continuous-time linear equalizers (CTLEs). Due to the high
frequency of operation, and therefore sparse multi-path and small delay-spread [77], a simple
CTLE suffices to compensate for bandwidth limitations.

Figure 2.5: (a) 6-Stage common-emitter-based LNA circuit schematic, (b) simultaneous noise
and power matching in the first LNA stage ($Z_{opt}$ and $Z_{in}^{*}$ curves).
An 8PSK test output is taken at the output of each mixer-baseband to adjust the LO and
RF-carrier phase references to a 22.5° offset during measurement for optimum symbol detection.
The CTLE outputs are then fed to four sign-check comparators followed by three XOR gates
to extract the three bits per symbol based on the theory discussed in Section III.
Figure 2.6: (a) Second single-ended LNA stage layout with MIM bypass capacitors, (b)
second differential LNA stage layout with CPW-based matching networks, and (c) LNA
S-parameter and NF simulation results.
2.4.1 LNA circuit design
Before opting for the popular cascode LNA topology, the noise contribution of the cascode
device needs to be taken into account due to the high frequency of operation. The current
noise density of the cascode device appears as $i_{n,out}^2 > 2kTg_m(f/f_T)^2$ at its output. At the
center frequency of 125 GHz and a process fT of 325 GHz, the effect of the cascode device
on the LNA NF is thus no longer negligible. Therefore, a common-emitter-based design is
adopted for this high-frequency LNA.
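
A quick evaluation of the $(f/f_T)^2$ factor in the bound above shows why the cascode noise cannot be ignored here. The sketch below uses the 125 GHz center frequency and 325 GHz fT quoted in the text; the 10 GHz comparison point is an arbitrary assumption added only for contrast.

```python
def cascode_noise_fraction(f_hz, f_t_hz):
    """(f/fT)^2 factor that scales 2kT*gm in the cascode noise bound."""
    return (f_hz / f_t_hz) ** 2


if __name__ == "__main__":
    at_125g = cascode_noise_fraction(125e9, 325e9)
    at_10g = cascode_noise_fraction(10e9, 325e9)  # arbitrary low-frequency reference
    print(f"(f/fT)^2 at 125 GHz: {at_125g:.3f}")
    print(f"(f/fT)^2 at  10 GHz: {at_10g:.5f}")
```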
The 6-stage common-emitter-based LNA is shown in Fig. 2.5a. The LNA’s first stage is
designed at a current-density of JDC = 0.35 mA/µm as a compromise between NFmin and
Gmax. The first stage is degenerated with a CPW of length EL to provide simultaneous
noise and power matching. Variations of $Z_{opt}$ and $Z_{in}^{*}$ with EL are shown in Fig. 2.5b. For
EL = 47.36°, simultaneous noise and power match is achieved. The remaining stages are
optimized for Gmax at a current-density of JDC = 0.7 mA/µm. The frequency responses of
the LNA stages are stagger-tuned to maximize the operation bandwidth.
As an example, the layout of the second single-ended stage of the LNA is shown in Fig.
2.6a. MIM capacitors are used as bypass capacitors and are EM-simulated so that their
self-resonance frequency falls at the frequency of operation, where they present a small
impedance (~1-2 Ω). Likewise, the layout of the second differential stage of the LNA is shown
in Fig. 2.6b. Matching networks are composed of low-loss 50 Ω CPW-based T-sections and finger capacitors.
Cascaded bends and T-junctions are the main sources of electric and magnetic field
discontinuity in long CPW routings (e.g. matching networks). Bends and T-junctions are cut
at a 45° angle to improve their reflection coefficient and loss compared to a standard bend
and T-junction with a 90° corner (see footnote 1). Finger capacitors are realized on (M5-)M6-M7 for values
above 30 fF and on M8 for values below 15 fF, and their port-to-port EM-simulated phase
delay is taken into account during the design of conjugate matching networks. S-parameter
and NF simulation results of the LNA are shown in Fig. 2.6c. The LNA achieves a maximum
gain of 22.6 dB and a minimum NF of 9.2 dB across the RF bandwidth.

Footnote 1: By applying this technique in a 55-nm SiGe BiCMOS process, a single 50 Ω CPW bend with a width of
6.5 µm and lateral ground spacing of 9.4 µm can achieve 8 dB better reflection coefficient at 125 GHz.

Figure 2.7: (a) 4-Way splitter layout, (b) splitter S-parameter simulation results, and (c)
splitter amplitude and phase error simulation results.
2.4.2 1-to-4 balun-based splitter design
A 1-to-4 splitter is required at the LNA output to feed the downconversion mixers with four
in-phase differential RF signals. A balun-based splitter is designed to avoid cross-overs and
the high loss of long $\lambda/4$ CPWs in conventional differential Wilkinson splitters. The splitter
is composed of six baluns, as indicated in Fig. 2.7a [78]. Ground walls on metal stack M1-M8
for the two input baluns and on M1-M6 for the four output baluns are designed to reduce
coupling between adjacent baluns. Each balun is double-tuned with input and output finger
capacitors to match the primary to 50 Ω and the secondary to 100 Ω differential.
The outputs of the two input baluns are connected to the four output baluns with 100 Ω
differential CPWs. Assuming loaded quality factors of $Q_p$ and $Q_s$ for the primary and secondary
of each balun, respectively, the overall quality factor is $Q_{net} = 1/\sqrt{0.5\,(1/Q_p^2 + 1/Q_s^2)}$. The
frequency response of a double-tuned balun is maximally flat if the balun's coupling factor is
$K = 1/Q_{net}$ [78]. On the other hand, for $K < 1/Q_{net}$ the response is still flat over a narrower
bandwidth, and for $K > 1/Q_{net}$ the primary and secondary resonance frequencies will be further
apart, resulting in wider fractional bandwidth at the cost of higher in-band ripples. The ~45
pH primary and secondary inductors of each balun are double-tuned with 29 fF and 62 fF
single-ended finger capacitors, respectively. The coupling factor of each balun is designed for
$K \approx 0.48$, close to $1/Q_{net}$, to achieve a maximally flat response. To provide isolation between
output ports of the splitter, they are cross-connected with 100 Ω resistors. The S-parameter
simulation results of the 1-to-4 balun-based splitter are shown in Fig. 2.7b. The splitter
shows a low loss of 8.3 dB across a wide bandwidth of 40 GHz. The isolation between
output ports remains better than 15 dB across the bandwidth. The amplitude and phase errors
of the splitter are shown in Fig. 2.7c. The amplitude and phase errors between output ports
are less than 0.2 dB and 1.5°, respectively. The differential output ports phase error from
180° is less than 0.5°.

Figure 2.8: (a) Voltage-mode double-balanced passive mixer followed by 3-stage amplification
(last amplification stage is CTLE), (b) simplified circuit of a double-balanced passive mixer,
and (c) mixer input matching from 110 GHz to 140 GHz.
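
The coupling criterion for the double-tuned baluns can be illustrated with a small helper that computes $Q_{net}$ from the loaded quality factors and classifies a given coupling factor relative to $1/Q_{net}$. The loaded Q values used below (Qp = Qs = 2.1) are hypothetical, picked only so that $1/Q_{net}$ lands near the K ≈ 0.48 quoted in the text; the 5% tolerance band and the function names are likewise arbitrary illustration choices.

```python
import math


def q_net(q_p, q_s):
    """Overall quality factor: Qnet = 1 / sqrt(0.5 * (1/Qp^2 + 1/Qs^2))."""
    return 1.0 / math.sqrt(0.5 * (1.0 / q_p ** 2 + 1.0 / q_s ** 2))


def coupling_regime(k, q_p, q_s, tol=0.05):
    """Classify coupling factor k against the maximally flat value 1/Qnet."""
    k_flat = 1.0 / q_net(q_p, q_s)
    if abs(k - k_flat) <= tol * k_flat:
        return "maximally flat"
    if k > k_flat:
        return "overcoupled: wider bandwidth, more in-band ripple"
    return "undercoupled: flat response, narrower bandwidth"


if __name__ == "__main__":
    qp = qs = 2.1  # hypothetical loaded quality factors
    print(f"1/Qnet = {1.0 / q_net(qp, qs):.3f}")
    for k in (0.30, 0.48, 0.70):
        print(f"K = {k:.2f}: {coupling_regime(k, qp, qs)}")
```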
2.4.3 Mixer-baseband circuit design
Each mixer-baseband is composed of a voltage-mode double-balanced passive mixer followed
by three amplification stages with the last stage being a CTLE (Fig. 2.8a). Shunt peaking
and capacitive neutralization are used in the first two stages to increase bandwidth
without sacrificing gain or noise performance. The CTLE is an RC-degenerated differential stage
with tunable resistors and capacitors. RC-degeneration introduces a zero in the baseband
frequency response to compensate for bandwidth limitations imposed by the receiver
front-end. The CTLE incorporates 3-bit tuning for both degeneration resistors and capacitors. The proper
CTLE code is chosen during measurement to optimize the baseband frequency response
based on the existing trade-off between bandwidth enhancement and noise integration for
wider bandwidths. The long routing between the CTLE output and the next-stage sign-check
comparator serves as a series peaking inductor and helps further increase the baseband
bandwidth.
The simplified circuit of the voltage-mode double-balanced passive mixer is shown in Fig.
2.8b. To calculate RF-referred conversion gain, high-side and low-side RF injections are
defined as VRF “ ARF cospωHt ` φq and VRF “ ARF cospωLt ` φq, respectively, where ωH “
ωLO`ωB and ωL “ ωLO´ωB and φ is the phase offset between LO and RF-carrier (i.e. 22.5˝
based on Section III). Assuming square-wave 50% LO, the high-side and low-side conversion
gains of this voltage-mode double-balanced mixer are [79]:
Gconvpωq “
$’’’’’’&’’’’’’%
p2{piqZBBpωBq
ZspωHq`Rsw`ZBBpωBq , ω “ ωH
p2{piqZ˚BBpωBq
ZspωLq`Rsw`Z˚BBpωBq , ω “ ωL
(2.9)
Therefore, by designing $Z_s(\omega)$ to exhibit resonance at the mixer input (i.e., $Z_s(\omega)$
real), the high-side and low-side conversion gains can be equalized and a symmetrical frequency
response is achieved across the passband of the mixer’s input impedance. The mixer’s input
impedance from 110 GHz to 140 GHz is shown in Fig. 2.8c. The low-side and high-side conversion
gains (which are almost equal) of the mixer-baseband for four different CTLE code settings are
shown in Fig. 2.9a. The mixer-baseband achieves a maximum conversion gain of 24.7 dB (mixer
conversion gain = -5.7 dB).
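The symmetry argument can be checked numerically with a minimal sketch of Eq. (2.9). The impedance values below are hypothetical, chosen only to contrast a purely real source impedance with one that keeps the same residual reactance at both sidebands; the helper name `conversion_gain` is likewise an illustrative assumption.

```python
import math

def conversion_gain(z_s, r_sw, z_bb, high_side=True):
    """Eq. (2.9): the baseband impedance is conjugated for low-side injection."""
    z = z_bb if high_side else z_bb.conjugate()
    return (2.0 / math.pi) * z / (z_s + r_sw + z)

R_SW = 10.0                 # assumed switch on-resistance, ohms
Z_BB = complex(200, -100)   # assumed baseband load at omega_B, ohms

results = {}
for label, z_s in (("real Z_s", complex(50, 0)), ("reactive Z_s", complex(50, 30))):
    g_hi = abs(conversion_gain(z_s, R_SW, Z_BB, high_side=True))
    g_lo = abs(conversion_gain(z_s, R_SW, Z_BB, high_side=False))
    results[label] = (g_hi, g_lo)
    print(f"{label:13s} |G_high| = {20 * math.log10(g_hi):6.2f} dB, "
          f"|G_low| = {20 * math.log10(g_lo):6.2f} dB")
```

With a real source impedance the two gains are complex conjugates of each other, so their magnitudes coincide; with the reactive source in this example they differ.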
Due to the reciprocity of passive mixers and the 50% LO design, the interaction between
switches needs to be studied. Assuming a high-side RF injection, the RF voltage at the input
of the mixer driven by the differential 50% LO $(LO_{m \times 45^\circ}, LO_{180^\circ + m \times 45^\circ})$ is:
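$$
V_{RF,m}(\omega) = \frac{(4A_{RF}/\pi^2)\, Z_{BB}(\omega_B)}{Z_s(\omega_H) + R_{sw} + Z_{BB}(\omega_B)} \times \cos(\omega_H t + \phi)
+ \frac{(4A_{RF}/\pi^2)\, Z_{BB}^{*}(\omega_B)}{Z_s(\omega_L) + R_{sw} + Z_{BB}^{*}(\omega_B)} \times
\begin{cases}
+\cos(\omega_L t - \phi), & m = 0 \\
-\sin(\omega_L t - \phi), & m = 1 \\
-\cos(\omega_L t - \phi), & m = 2 \\
+\sin(\omega_L t - \phi), & m = 3
\end{cases}
\tag{2.10}
$$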
where the first term is produced by upconversion of the baseband signal to the main frequency
(high-side) and the second term is produced by upconversion of the baseband signal
to the image frequency. Similarly, the RF voltages at the input of the mixers for low-side
injection can be derived by replacing $\omega_B$ with $-\omega_B$ and $\omega_H$ with $\omega_L$, and
vice versa. Extra gain stages can be placed at the input of the four constituent double-balanced
mixers to reduce the cross-talk between them, but such gain stages would degrade the linearity
and bandwidth and increase the receiver power dissipation. Therefore, as mentioned in
Section IV.B, a passive approach is pursued. The splitter output ports are isolated to minimize
the interaction between the four mixers.
Figure 2.9: (a) Conversion gain of mixer-baseband for four different CTLE code settings
(codes 0, 2, 5, and 7), and (b) mixer-baseband NF (minimum NF = 7.95 dB).
Figure 2.10: (a) Comparator-DFF with offset calibration, and (b) comparator BER due to
random noise (PSS + PNOISE simulations).
Each double-balanced passive mixer is followed by differential amplification stages. Assuming
that each amplification stage has enough gain to suppress the noise of the subsequent stages,
only the first differential stage is considered in the NF calculation. For a real
$Z_s(\omega) = R_s$ (due to matching) at the mixer input, the input-referred NF is:
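$$
NF = \frac{\pi^2}{4} \left| \frac{R_s + R_{sw} + Z_{BB}(\omega_B)}{Z_{BB}(\omega_B)} \right|^2
\times \frac{2(R_{sw} + r_b) + (g_m/\beta)\, r_b^2 + \dfrac{g_m + 2/R_L}{g_m^2}}{2R_s}
\tag{2.11}
$$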
where $g_m$, $\beta$, $r_b$, and $R_L$ are the transconductance, current gain, base resistance,
and load resistance of the first differential stage. The simulated NF of the mixer-baseband is
shown in Fig. 2.9b. The integrated NF across the baseband bandwidth is 9.23 dB.
Figure 2.11: Timing diagram of the comparator-DFF operation.
2.4.4 Comparator
A CML-based comparator design is adopted to drive CML-based logic gates and to enable
high-speed pipelined operation. Each comparator and the following retiming DFF are combined
into one comparator-DFF block (Fig. 2.10a). The comparator is biased with a tail current
of 1.6 mA for a Gain×BW product of 105.13 GHz. To compensate for offset across PVT,
a calibration differential pair with a tunable reference voltage is added in parallel with the
comparator input differential pair. The reference voltage is fed from a 5-bit resistive ladder.
This design enables calibration of offset voltages from -17 mV to 17 mV with 2 mV accuracy.
After offset calibration, each master- and slave-latch can fully regenerate input signals as
small as 10 mVp at a 12 GHz clock rate.
To estimate the BER degradation due to random noise, the effect of noise in both the sampling
and regeneration periods is considered. The comparator is a linear periodically time-variant
(LPTV) system with cyclostationary noise. The SNR at the output of the comparator is thus
calculated from the differential output voltage and the integrated noise power spectral density
at each time instance within one clock period, i.e.,
$BER_{B,noise} = Q(\sqrt{SNR_{out}}) = Q(V_{out,comp}/V_{n,RMS})$,
where $V_{out,comp}$ and $V_{n,RMS}$ are the differential output voltage and RMS noise voltage at
the comparator output. The simulated steady-state output voltage, RMS noise voltage, and
$SNR_{out}$ are shown in Fig. 2.10b. Signal and noise (accumulated in the sampling period) are
amplified by the exponential gain of the cross-coupled pair in the regeneration period. However,
once the comparator output latches to a large-signal digital value, signal and noise no longer
affect the BER and the output noise is heavily compressed. Therefore, the BER is determined
by $SNR_{out}$ at the time instance when the gain of the master-latch is maximized but not yet
compressed. Based on Fig. 2.10b, $SNR_{out}$ at this time instance (i.e., $t_{BER}$) is approximately
18 dB, resulting in a BER less than 1e-15 for a 12 Gbps input signal.
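The mapping from output SNR to noise-limited BER can be sketched with the Gaussian Q-function, written here in terms of the complementary error function. The function names are illustrative; the only input taken from the text is the 18 dB SNR at the decision instant.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def noise_ber_from_snr_db(snr_db):
    """BER = Q(sqrt(SNR)), with SNR given as a power ratio in dB."""
    amplitude_ratio = 10.0 ** (snr_db / 20.0)   # equals V_out,comp / V_n,RMS
    return q_function(amplitude_ratio)

ber = noise_ber_from_snr_db(18.0)
print(f"Noise-limited BER at 18 dB SNR: {ber:.2e}")
```

Evaluated at 18 dB, the result falls slightly below 1e-15, in line with the bound quoted above.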
Another source of BER degradation is metastability due to very small input signals. A
timing diagram of the comparator-DFF operation is shown in Fig. 2.11. $V_{in,swing}$, $V_{in,sens}$,
and $V_{out,sens}$ are the input full swing, comparator sensitivity level, and slave-latch
sensitivity level, respectively. As long as the input signal is large enough for the master-latch
to regenerate it to the slave-latch sensitivity level, the slave-latch will be able to regenerate
it to a digital value and correctly detect the bit at half-period clock sampling. The comparator
and slave-latch outputs for this case are shown in black in Fig. 2.11 (for very small $V_{in,sens}$,
the gray traces are expected). The regeneration time constant of the comparator cross-coupled
pair terminated with $C_L$ at each output is $\tau_{reg} = R_L(C_\pi + 4C_\mu + C_L)/(g_m R_L - 1)$.
Assuming $G_{amp}$ is the unlatched gain of the comparator, the output amplitude at the end of
the regeneration period is:
$$
V_{out,comp}\!\left(t = \frac{T_s}{2}\right) = V_{out,comp}(0)\, e^{\frac{T_s/2}{\tau_{reg}}}
= \left( G_{amp} \cdot V_{in,comp}(0) \right) e^{\frac{T_s/2}{\tau_{reg}}}
\tag{2.12}
$$
where $V_{in,comp}$ and $V_{out,comp}$ are the comparator input and output voltages. The comparator’s
bit error probability is equal to the probability of $|V_{in,comp}(0)| < |V_{in,sens}|$, which is:
$$
BER_{B,meta} = P\left( |V_{in,comp}(0)| < |V_{in,sens}| \right)
= \frac{V_{out,sens}}{G_{amp} \cdot V_{in,swing}}\, e^{-\frac{T_s/2}{\tau_{reg}}},
\tag{2.13}
$$
therefore, a smaller $\tau_{reg} \propto (\text{Gain} \times \text{BW})^{-1}$ results in a smaller BER.
The designed $\tau_{reg}$ is 1.45 ps, resulting in a BER less than 1e-13 for a 12 Gbps input signal.
Therefore, in the proposed direct-demodulation architecture, the baseband imposes negligible
degradation on the BER of the demodulated output bits.
Figure 2.12: (a) Tripler circuit schematic, and (b) tripler output power with and without
input buffer.
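A short numerical sketch of Eq. (2.13) shows how strongly the regeneration time constant controls the metastability-limited BER. The clock rate and time constant are taken from the text; the slave-latch sensitivity, unlatched gain, and input swing are assumed placeholder values, since the text does not state them.

```python
import math

def metastability_ber(v_out_sens, g_amp, v_in_swing, t_regen, tau_reg):
    """Eq. (2.13): probability that the input is too small to regenerate in time."""
    return (v_out_sens / (g_amp * v_in_swing)) * math.exp(-t_regen / tau_reg)

F_CLK = 12e9            # clock rate from the text, Hz
TAU_REG = 1.45e-12      # regeneration time constant from the text, s
T_REGEN = 0.5 / F_CLK   # half a clock period is available for regeneration

# Exponential term of Eq. (2.13), independent of the assumed voltages below.
exp_factor = math.exp(-T_REGEN / TAU_REG)

# Assumed, illustrative values (not given in the text):
V_OUT_SENS = 10e-3      # slave-latch sensitivity level, V
G_AMP = 2.0             # unlatched comparator gain, V/V
V_IN_SWING = 100e-3     # input full swing, V

ber = metastability_ber(V_OUT_SENS, G_AMP, V_IN_SWING, T_REGEN, TAU_REG)
print(f"exponential factor: {exp_factor:.2e}, BER with assumed prefactor: {ber:.2e}")
```

Because the exponential term alone is a few times 1e-13, any prefactor $V_{out,sens}/(G_{amp} V_{in,swing})$ below roughly 0.3 keeps the result under the 1e-13 bound.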
2.4.5 LO Generation and Distribution Network
The LO network is designed to generate four 45° phase-shifted differential LO signals with close
to 0 dBm power at the LO ports of the four double-balanced mixers. The bondwired input signal
at 20.83 GHz is first buffered and then fed to a tripler; the buffer reduces the sensitivity of the
tripler’s output power to bondwire inductance variation. The transistor size and biasing point
of the tripler are set to maximize the third-harmonic current. A balun tuned at 62.5 GHz provides
the optimum load to the tripler at the third harmonic and converts the single-ended output to
differential for the rest of the LO chain. The schematic and simulation results of the tripler
are shown in Figs. 2.12a and 2.12b, respectively. The saturated output power of the tripler
is -1.2 dBm at 62.5 GHz.
Figure 2.13: (a) Low-pass phase shifter, (b) high-pass phase shifter, (c) 90° phase-shift tuning
range, and (d) S-parameter simulation results.
The 62.5 GHz tripler output is then split by a differential Wilkinson splitter into low-pass
and high-pass LO branches. Tunable varactor-based low-pass and high-pass phase
shifters with a 45° phase difference at 62.5 GHz are designed at each Wilkinson output port.
The layouts of the low-pass and high-pass phase shifters are shown in Figs. 2.13a and 2.13b,
respectively. The low-pass/high-pass phase tuning range is 29.9° (Fig. 2.13c) and the loss is
less than 1 dB at 62.5 GHz (Fig. 2.13d). These low-pass and high-pass LO branches are then
followed by 62.5 GHz capacitively-neutralized buffers and CPW-based doublers to generate
quadrature LOs at 125 GHz. The circuit schematic and simulation results of the doubler are
shown in Figs. 2.14a and 2.14b, respectively. The saturated output power of the doubler is
-1.8 dBm at 125 GHz. The outputs of the doublers from the two quadrature branches are then
fed to differential 1-to-2 balun-based splitters (similar to the 1-to-4 balun-based splitter in
Section IV.B) and followed by 125 GHz capacitively-neutralized buffers to create four LO
branches. Four tunable varactor-based all-pass phase shifters with a 45° phase difference at
125 GHz are designed at the output ports of these splitters. The layouts of the all-pass phase
shifters are shown in Figs. 2.15a and 2.15b. The all-pass phase tuning range is 14.1° (Fig. 2.15c)
and the loss remains less than 1.3 dB (Fig. 2.15d). These four 45° phase-shifted LOs are
then further amplified with 125 GHz capacitively-neutralized buffers and local LO amplifiers
at the LO ports of the four differential double-balanced mixers. The saturated output power
of the LO network at different output center frequencies and for multiple harmonics of the
bondwired 20.83 GHz LO input is shown in Figs. 2.16a and 2.16b, respectively.
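The frequency and phase plan of the LO chain can be summarized in a few lines. The sketch below assumes ideal multipliers (a doubler doubles the phase as well as the frequency) and assigns the 0° and 45° reference phases to specific branches arbitrarily; only relative phases matter, and the function name `lo_plan` is hypothetical.

```python
def lo_plan(f_in_ghz=20.83, branch_phases_deg=(0.0, 45.0), allpass_phases_deg=(0.0, 45.0)):
    """Ideal LO chain: tripler, 45-degree split, doubler, then 45-degree all-pass split.

    Returns the tripler output frequency (GHz), the final LO frequency (GHz),
    and the sorted list of LO phases (degrees) at the mixer LO ports.
    """
    f_tripled = 3.0 * f_in_ghz      # tripler output, nominally 62.5 GHz
    f_lo = 2.0 * f_tripled          # doubler output, nominally 125 GHz
    # An ideal doubler doubles the phase difference set at 62.5 GHz (45 deg -> 90 deg);
    # the all-pass shifters then add 0 or 45 degrees at 125 GHz.
    phases = sorted((2.0 * b + a) % 360.0
                    for b in branch_phases_deg
                    for a in allpass_phases_deg)
    return f_tripled, f_lo, phases

f_trip, f_lo, phases = lo_plan()
print(f"tripler output {f_trip:.2f} GHz, LO {f_lo:.2f} GHz, phases {phases} deg")
```

Under these assumptions the four LO phases come out 45° apart, which is the set $m \times 45^\circ$, $m = 0, \ldots, 3$, used in Eq. (2.10); the differential outputs supply the complementary $180^\circ + m \times 45^\circ$ phases.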
2.5 Measurement Results
The die micrograph of the chip fabricated in a 55-nm SiGe BiCMOS process is shown in
Fig. 2.17. It occupies 2.5×3.5 mm² of die area including pads and test circuits (2.5 mm²
active area). The direct-demodulation receiver consumes a total DC power of 200.25 mW
from 1.5-V/2.0-V supplies (LNA: 16 mW, mixer-basebands: 101 mW, LO network: 49.15
mW, and demodulator: 34.1 mW).
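The power breakdown can be cross-checked with a short script that sums the per-block figures quoted above and reports each block's share of the total. The dictionary keys are simply labels for the blocks named in the text.

```python
# Per-block DC power in mW, as quoted in the text.
POWER_MW = {
    "LNA": 16.0,
    "mixer-basebands": 101.0,
    "LO network": 49.15,
    "demodulator": 34.1,
}

total_mw = sum(POWER_MW.values())
shares = {name: 100.0 * p / total_mw for name, p in POWER_MW.items()}

print(f"total: {total_mw:.2f} mW")
for name, pct in shares.items():
    print(f"{name:16s} {POWER_MW[name]:7.2f} mW  ({pct:4.1f} %)")
```

The block figures add up to the stated total, and the mixer-basebands account for about half of the budget.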
Figure 2.14: (a) Doubler circuit schematic, and (b) doubler output power.
Figure 2.15: (a) Lower-frequency-tuned all-pass phase shifter, and (b) higher-frequency-tuned
all-pass phase shifter, (c) 45° phase-shift tuning range (Vtune,1 from -2.5 V to 2.5 V), and
(d) S-parameter simulation results.
Figure 2.16: (a) LO network output saturated power for different LO center frequencies
(maximum PLO = 0.8 dBm), and (b) LO network output power at different harmonics (6th, 4th,
3rd, 2nd, 1st) of the bondwired 20.83 GHz LO input.
Figure 2.17: Die micrograph of the 8PSK receiver.
To test the chip, a six-layer PCB (3 copper signal layers and 3 copper internal ground planes)
was developed. Rogers RO4350B dielectric with a relative permittivity of 3.66 and a
loss tangent of 0.0037 at 10 GHz was used to provide low-loss PCB routing for the bondwired
LO and clock input signals. The measurement setup and lab photo are shown in Figs.
2.18a and 2.18b, respectively. The CW or modulated output of a 65-GSa/s, 25-GHz BW AWG
(Keysight M8195A) was fed to a WR-8.0 upconversion mixer (SAGE SFB-08-E2) on the
transmit side. A WR-8.0 frequency extender (Keysight E8257DV08) provided the LO signal
for the upconversion mixer.
To measure the frequency response of the receiver, the RF output of the WR-8.0 upconversion
mixer was directly connected to the RF GSG probe with a WR-8.0 waveguide section, and
the 8PSK test output of the receiver was measured with a differential GSGSG probe. For the
conversion gain measurement, the RF input tone power and the CTLE control code were set
to -42 dBm and code 7, respectively, and the LO network was saturated. The conversion gain of
the receiver is shown in Fig. 2.19. A maximum conversion gain of 32 dB was measured across
the RF bandwidth. For the NF measurement, the RF input of the receiver was terminated with
50 Ω, the CTLE was set to code 7, and the LO network was again saturated. The receiver
achieves a minimum DSB NF of 10.3 dB, and the NF remains less than 14 dB at baseband
frequencies up to 12 GHz (Fig. 2.20a). The measured worst-case input-referred 1-dB compression
point (IP1dB) of the receiver is -31.4 dBm (Fig. 2.20b). The measured power spectrum of a
36 Gbps modulated signal at the 8PSK test output of the receiver is shown in Fig. 2.21.
Figure 2.18: (a) Wireless measurement setup schematic, and (b) photo.
Figure 2.19: Measured receiver conversion gain at the 8PSK test output.
Figure 2.20: (a) Measured DSB NF and (b) IP1dB at the 8PSK test output.
Figure 2.21: 36 Gbps power spectrum at the 8PSK test output.
For wireless measurements, WR-8.0 horn antennas with 25 dBi gain were used at the WR-8.0
upconversion mixer output and the receiver input to transmit and receive the 8PSK-modulated
wireless data. An 80-GSa/s, 33-GHz bandwidth real-time oscilloscope (Keysight
DSAV334A) was used to measure the 8PSK eye-diagram at the 8PSK test output after
downconversion and before direct-demodulation. The external references of the signal generators
for the receiver and transmitter LOs were both synchronized with the AWG external reference.
A known non-random 8PSK pattern (repetition of 22.5°, 67.5°, ..., 337.5°) was received.
The transient waveform at the 8PSK test output after direct-downconversion was observed on
the real-time oscilloscope, and the phase reference of the receiver LO was tuned externally until
a 4-level symmetrical eye-diagram with ratios of +0.92/+0.38/-0.38/-0.92 was achieved
(one-time calibration in a fixed point-to-point test setup). Figs. 2.22a and 2.22b show the
wirelessly measured 30 Gbps and 36 Gbps 8PSK eye-diagrams at a maximum distance of 0.3
m (limited by the measurement setup). The 8PSK constellation was reconstructed from two
8PSK baseband data streams downconverted by two 90° phase-shifted LOs. Figs. 2.23a and
2.23b show the wirelessly measured 30 Gbps and 36 Gbps 8PSK constellations.
Figure 2.22: Wirelessly measured (a) 30 Gbps and (b) 36 Gbps eye-diagrams at the 8PSK
test output.
Figure 2.23: Wirelessly measured (a) 30 Gbps and (b) 36 Gbps constellations at the 8PSK
test output.
Figure 2.24: Wirelessly measured eye-diagrams of demodulated 8PSK 3-bit streams for (a)
30 Gbps and (b) 36 Gbps overall data-rates.
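The 4-level eye ratios used as the calibration target follow from projecting the eight symbol phases onto a single LO axis. The sketch below assumes ideal unit-amplitude 8PSK symbols and an ideal correlator; the function name and the `lo_phase_deg` parameter are illustrative.

```python
import math

def projected_levels(lo_phase_deg=0.0):
    """Distinct baseband levels seen when ideal 8PSK symbols at 22.5 + 45k degrees
    are downconverted by an LO with the given phase offset (levels rounded to 0.01)."""
    symbol_phases = [22.5 + 45.0 * k for k in range(8)]
    levels = {round(math.cos(math.radians(p - lo_phase_deg)), 2) for p in symbol_phases}
    return sorted(levels, reverse=True)

print(projected_levels(0.0))    # aligned LO phase: four symmetric levels
print(projected_levels(22.5))   # misaligned LO phase: a different set of levels
```

With the LO phase aligned, the projection gives the symmetric levels +0.92, +0.38, -0.38, and -0.92 (the cosines of 22.5° and 67.5° and their negatives). A 22.5° misalignment produces five levels instead, which is why observing a symmetric 4-level eye is a usable criterion for the one-time LO phase calibration.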
For direct-demodulation, the reference clock of the AWG was externally synchronized with the
reference clock of the signal generator that feeds the symbol-rate clock to the receiver.
Furthermore, the AWG timing delay was tuned for the optimum sampling point to minimize BER.
The measured BPSK eye-diagrams of the 8PSK-demodulated 3-bit streams for 30 Gbps and 36
Gbps total data-rates are shown in Fig. 2.24. A BER of 1e-6 for a 36-Gbps PRBS-7 sequence
was wirelessly measured at a 0.3 m distance. The measured receiver sensitivity at this BER
is -41.28 dBm. The variation of BER with received input power is shown in
Fig. 2.25, which follows the 8PSK waterfall profile. The performance of the proposed 8PSK
receiver is compared with state-of-the-art direct-demodulation receivers in Table 2.3. This
work achieves the highest speed and lowest BER with excellent energy efficiency among all
previously reported high data-rate direct-demodulation receivers to date.
Figure 2.25: Variation of BER with received input power.
Table 2.3: Comparison table of state-of-the-art direct-demodulation receivers
                            This Work                  [40]                [41]                [41]                [72]
Modulation                  8PSK                       QPSK                QPSK                BPSK                OOK
Demodulator                 Multi-phase RF-Correlator  Quadrature Zero-IF  Quadrature Zero-IF  Quadrature Zero-IF  Envelope Detector
Frequency (GHz)             125                        60                  240                 240                 130
Data-Rate (Gbps)            36                         14.08               16                  9                   11.5
BER                         1e-6                       1e-3                1e-4                1e-5                1e-6
Gain (dB)                   32                         30                  25                  25                  NA
Wireless Distance (cm)      30 (I)                     90                  2                   2                   50
Power Dissipation (mW)      200.25                     220                 260                 260                 24 (II)
Energy Efficiency (pJ/bit)  5.56                       15.63               16.25               28.9                2.08 (II)
Technology                  55nm SiGe BiCMOS           65nm CMOS           65nm CMOS           65nm CMOS           55nm SiGe BiCMOS
(I) Limited by the measurement setup; measured sensitivity at the reported data-rate and BER is -41.28 dBm.
(II) Non-coherent reception excluding power-hungry blocks (synthesizer, LO distribution network, and quadrature mixer).
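The energy-efficiency row of Table 2.3 can be reproduced from the power and data-rate rows, since power in mW divided by data rate in Gb/s is numerically equal to energy per bit in pJ. The dictionary below just restates the table entries; the labels "[41] QPSK" and "[41] BPSK" are introduced here to distinguish the two operating modes of the same reference.

```python
# (power in mW, data rate in Gb/s), restated from Table 2.3.
RECEIVERS = {
    "This work": (200.25, 36.0),
    "[40]": (220.0, 14.08),
    "[41] QPSK": (260.0, 16.0),
    "[41] BPSK": (260.0, 9.0),
    "[72]": (24.0, 11.5),
}

def energy_per_bit_pj(power_mw, rate_gbps):
    """mW divided by Gb/s is numerically equal to pJ/bit."""
    return power_mw / rate_gbps

efficiency = {name: energy_per_bit_pj(p, r) for name, (p, r) in RECEIVERS.items()}
for name, value in efficiency.items():
    print(f"{name:10s} {value:6.2f} pJ/bit")
```

The computed values agree with the tabulated ones to within rounding; note that the entry for [72] excludes the blocks listed in footnote (II), so it is not a like-for-like comparison.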
2.6 Conclusion
In this chapter, a novel RF-correlation-based direct-demodulation 8PSK receiver was
presented. The proposed receiver is the highest-speed direct-demodulation receiver with the
highest modulation order to date. The proposed RF-to-bits receiver architecture obviates
the need for power-hungry high-speed, high-resolution ADCs and significantly improves the
energy efficiency of the receiver compared to other high-speed receivers in an RF-to-bits
scenario. System analysis in terms of the bit error probability of the multi-phase RF-correlation
idea was detailed, and high-frequency circuit design techniques and analysis of critical building
blocks were presented. Wireless measurements were conducted, and excellent sensitivity
and BER were achieved for the reported data-rate, verifying the effectiveness of the
direct-demodulation idea. This direct-demodulation idea can readily be extended to realize
higher-order 16QAM-Star demodulation by adding only one level of envelope detection.
Bibliography
[1] “Cisco Visual Networking Index.” [Online]. Available: https://www.cisco.com
[2] A. Ghosh, “5G Small Cell Technology,” Nokia Bell Labs, Tech. Rep., 2017.
[3] Y. Niu, Y. Li, D. Jin, L. Su, and A. Vasilakos, “A Survey of Millimeter Wave (mmWave)
Communications for 5G: Opportunities and Challenges,” Wireless Networks, vol. 18,
no. 2, pp. 1018–1044, Secondquarter 2015.
[4] S. Karimi-Bidhendi, J. Guo, and H. Jafarkhani, “Using quantization to deploy
heterogeneous nodes in two-tier wireless sensor networks,” CoRR, vol. abs/1901.06742,
2019. [Online]. Available: http://arxiv.org/abs/1901.06742
[5] J. Guo, S. Karimi-Bidhendi, and H. Jafarkhani, “Energy efficient node deployment
in wireless ad-hoc sensor networks,” CoRR, vol. abs/1904.06380, 2019. [Online].
Available: http://arxiv.org/abs/1904.06380
[6] B. Sadhu, Y. Tousi, J. Hallin, S. Sahl, S. Reynolds, O. Renstrom, K. Sjogren, O. Haa-
palahti, N. Mazor, B. Bokinge, G. Weibull, H. Bengtsson, A. Carlinger, E. Westesson,
J. E. Thillberg, L. Rexberg, M. Yeck, X. Gu, D. Friedman, and A. Valdes-Garcia, “A
28GHz 32-Element Phased-Array Transceiver IC with Concurrent Dual Polarized Beams
and 1.4 Degree Beam-steering Resolution for 5G Communication,” in 2017 IEEE Int.
Solid-State Circuits Conf. (ISSCC), Feb 2017, pp. 128–129.
[7] N. Ebrahimi, P. Y. Wu, M. Bagheri, and J. F. Buckwalter, “A 71-86-GHz Phased Array
Transceiver Using Wideband Injection-Locked Oscillator Phase Shifters,” IEEE Trans.
Microw. Theory Techn., vol. 65, no. 2, pp. 346–361, Feb 2017.
[8] S. Y. Kim and G. M. Rebeiz, “A Low-Power BiCMOS 4-Element Phased Array Receiver
for 76-84 GHz Radars and Communication Systems,” IEEE J. of Solid-State Circuits,
vol. 47, no. 2, pp. 359–367, Feb 2012.
[9] S. Y. Kim, O. Inac, C. Y. Kim, D. Shin, and G. M. Rebeiz, “A 76-84-GHz 16-Element
Phased-Array Receiver With a Chip-Level Built-In Self-Test System,” IEEE Trans.
Microw. Theory Techn., vol. 61, no. 8, pp. 3083–3098, Aug 2013.
[10] S. Kutty and D. Sen, “Beamforming for Millimeter Wave Communications: An Inclu-
sive Survey,” IEEE Communications Surveys Tutorials, vol. 18, no. 2, pp. 949–973,
Secondquarter 2016.
[11] S. Mondal, R. Singh, A. I. Hussein, and J. Paramesh, “A 25-30 GHz 8-Antenna 2-
Stream Hybrid Beamforming Receiver for MIMO Communication,” in 2017 IEEE Radio
Frequency Integrated Circuits Symp. (RFIC), June 2017, pp. 112–115.
[12] H. Mohammadnezhad, R. Abedi, A. Esmaili, and P. Heydari, “A 64-67GHz Partially-
Overlapped Phase-Amplitude-Controlled 4-Element Beamforming-MIMO Receiver,” in
2018 IEEE Custom Integrated Circuits Conf. (CICC), April 2018, pp. 1–4.
[13] X. Zhang, A. F. Molisch, and S.-Y. Kung, “Variable-phase-shift-based
RF-baseband codesign for MIMO antenna selection,” IEEE Transactions on Signal
Processing, vol. 53, no. 11, pp. 4091–4103, Nov 2005.
[14] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. L. H. Nguyen, L. Li, and K. Haneda,
“Hybrid beamforming for massive MIMO - A survey,” CoRR, vol. abs/1609.05078,
2016. [Online]. Available: http://arxiv.org/abs/1609.05078
[15] F. Sohrabi and W. Yu, “Hybrid Digital and Analog Beamforming Design for Large-Scale
Antenna Arrays,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 3,
pp. 501–513, April 2016.
[16] K. Satyanarayana, M. El-Hajjar, P. H. Kuo, A. Mourad, and L. Hanzo, “Dual-Function
Hybrid Beamforming and Transmit Diversity Aided Millimeter Wave Architecture,”
IEEE Trans. on Vehicular Technology, vol. 67, no. 3, pp. 2798–2803, March 2018.
[17] T. B. Vu, “Method of null steering without using phase shifters,” Microwaves, Optics
and Antennas, IEE Proceedings H, vol. 131, no. 4, pp. 242–245, 1984.
[18] O. Bakr, “A Scalable and Cost Effective Architecture for High Gain Beamforming
Antennas,” Ph.D. dissertation, EECS Department, University of California, Berkeley,
Dec 2010. [Online]. Available: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/
EECS-2010-178.html
[19] O. M. Bakr and M. Johnson, “Impact of phase and amplitude errors on array
performance,” EECS Department, University of California, Berkeley, Tech. Rep.
UCB/EECS-2009-1, Jan 2009. [Online]. Available: http://www2.eecs.berkeley.edu/
Pubs/TechRpts/2009/EECS-2009-1.html
[20] H. Mohammadnezhad, R. Abedi, and P. Heydari, “A Millimeter-Wave Partially Over-
lapped Beamforming-MIMO Receiver: Theory, Design, and Implementation,” IEEE
Transactions on Microwave Theory and Techniques, vol. 67, no. 5, pp. 1924–1936, May
2019.
[21] H. Mohammadnezhad, H. Wang, and P. Heydari, “Analysis and Design of a Wideband,
Balun-Based, Differential Power Splitter at mm-Wave,” IEEE Trans. on Circuits and
Systems II: Express Briefs, pp. 1–1, 2017.
[22] M. Tabesh, “Energy-Efficient mm-Wave Systems for Communication and Sens-
ing,” Ph.D. dissertation, EECS Department, University of California, Berkeley,
Dec 2016. [Online]. Available: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/
EECS-2016-171.html
[23] F. Golcuk, T. Kanar, and G. M. Rebeiz, “A 90-100 GHz 4x4 SiGe BiCMOS polarimetric
transmit-receive phased array with simultaneous receive-beams capabilities,” in 2013
IEEE International Symposium on Phased Array Systems and Technology, Oct 2013,
pp. 102–105.
[24] K. Koh, “Integrated microwave and millimeter-wave phased-array designs in silicon
technologies,” Ph.D. dissertation, EECS Department, University of California, San
Diego, 2008. [Online]. Available: https://escholarship.org/uc/item/75w3t8zt
[25] S. Y. Kim, D. Kang, K. Koh, and G. M. Rebeiz, “An Improved Wideband All-Pass I/Q
Network for Millimeter-Wave Phase Shifters,” IEEE Trans. Microw. Theory Techn.,
vol. 60, no. 11, pp. 3431–3439, Nov 2012.
[26] T. Yu and G. M. Rebeiz, “A 22-24 GHz 4-Element CMOS Phased Array With On-
Chip Coupling Characterization,” IEEE J. of Solid-State Circuits, vol. 43, no. 9, pp.
2134–2143, Sept 2008.
[27] H. Mohammadnezhad, A. K. Bidhendi, M. M. Green, and P. Heydari, “A low-power
BiCMOS 50 Gbps Gm-boosted dual-feedback transimpedance amplifier,” in 2015 IEEE
Bipolar/BiCMOS Circuits and Technology Meeting - BCTM, Oct 2015, pp. 161–164.
[28] A. Karimi-Bidhendi, H. Mohammadnezhad, M. M. Green, and P. Heydari, “A Silicon-
Based Low-Power Broadband Transimpedance Amplifier,” IEEE Transactions on Cir-
cuits and Systems I: Regular Papers, vol. 65, no. 2, pp. 498–509, Feb 2018.
[29] A. Karimi Bidhendi, “A Broadband Transimpedance Amplifier for Optical Receivers,”
2018.
[30] P. Heydari et al., “Ultra-Broadband Transimpedance Amplifiers (TIA) for Optical
Fiber Communications,” 2018, US20180102749A1.
[31] S. Karimi-Bidhendi, F. Munshi, and A. Munshi, “Scalable classification of univariate
and multivariate time series,” in 2018 IEEE International Conference on Big Data (Big
Data), Dec 2018, pp. 1598–1605.
[32] S. Ghasemi-Goojani, S. Karimi-Bidhendi, and H. Behroozi, “On the capacity region of
asymmetric Gaussian two-way line channel,” IEEE Transactions on Communications,
vol. 64, no. 9, pp. 3669–3682, Sep. 2016.
[33] A.-P. Telecommunity, “APT Report on Technology trends of Telecommunications above
100 GHz.”
[34] “Microwave backhaul evolution – reaching beyond 100GHz. [Online],” Avail-
able at: https://www.ericsson.com/en/ericsson-technology-review/archive/2017/
microwave-backhaul-evolution-reaching-beyond-100ghz.
[35] P. Nazari, H. Mohammadnezhad, E. Preisler, and P. Heydari, “A broadband nonlinear
lumped model for silicon IMPATT diodes,” in 2015 IEEE Bipolar/BiCMOS Circuits
and Technology Meeting - BCTM, Oct 2015, pp. 145–148.
[36] P. Chevalier et al., “Nanoscale SiGe BiCMOS technologies: From 55 nm reality to
14 nm opportunities and challenges,” in IEEE Bipolar/BiCMOS Circuits and Technol.
Meeting (BCTM), Oct 2015, pp. 80–87.
[37] “45RFSOI Advanced 45nm RF SOI Technology. [Online],” Available at: https://www.
globalfoundries.com/sites/default/files/product-briefs/pb-45rfsoi.pdf.
[38] B. t. Kazemi Esfeh, “28 nm FD SOI Technology Platform RF FoM,” in IEEE SOI-3D-
Subthreshold Microelectronics Technol. Unified Conf (S3S), 2014.
[39] R. Carter et al., “22nm FDSOI technology for emerging mobile, Internet-of-Things,
and RF applications,” in IEEE Int. Electron Devices Meeting (IEDM), Dec 2016, pp.
2.2.1–2.2.4.
[40] K. Okada et al., “20.3 A 64-QAM 60GHz CMOS transceiver with 4-channel bonding,”
in IEEE Int. Solid-State Circuits Conf. Digest of Technical Papers (ISSCC), Feb 2014,
pp. 346–347.
[41] S. V. Thyagarajan et al., “A 240 GHz Fully Integrated Wideband QPSK Receiver in 65
nm CMOS,” IEEE J. Solid-State Circuits, vol. 50, no. 10, pp. 2268–2280, Oct 2015.
[42] R. Wu et al., “13.6 A 42Gb/s 60GHz CMOS transceiver for IEEE 802.11ay,” in IEEE
Int. Solid-State Circuits Conf. (ISSCC), Feb 2016, pp. 248–249.
[43] K. K. Tokgoz et al., “A 120Gb/s 16QAM CMOS millimeter-wave wireless transceiver,”
in IEEE Int. Solid-State Circuits Conf. (ISSCC), Feb 2018, pp. 168–170.
[44] S. Hara et al., “A 32Gbit/s 16QAM CMOS receiver in 300GHz band,” in IEEE MTT-S
Int. Microw. Symp. (IMS), June 2017, pp. 1703–1706.
[45] K. K. Tokgoz et al., “13.3 A 56Gb/s W-band CMOS wireless transceiver,” in IEEE Int.
Solid-State Circuits Conf. (ISSCC), Feb 2016, pp. 242–243.
[46] A. Oppenheim et al., Digital signal processing. Pearson, 2015.
[47] “DPO77002SX Product Support. [Online],” Available at: https://www.tek.com/
oscilloscope/dpo77002sx.
[48] “Keysight Technologies Infiniium 90000L Series Oscilloscopes. [Online],” Available at:
http://literature.cdn.keysight.com/litweb/pdf/5990-7368EN.pdf.
[49] R. Wu et al., “64-QAM 60-GHz CMOS Transceivers for IEEE 802.11ad/ay,” IEEE J.
Solid-State Circuits, vol. 52, no. 11, pp. 2871–2891, Nov 2017.
[50] H. Mohammadnezhad et al., “A Single-Channel RF-to-Bits 36Gbps 8PSK RX with
Direct Demodulation in RF Domain,” in 2019 IEEE Custom Integr. Circuits Conf.
(CICC), April 2019, pp. 1–4.
[51] H. Mohammadnezhad, H. Wang, A. Cathelin, and P. Heydari, “A 115–135-GHz 8PSK
Receiver Using Multi-Phase RF-Correlation-Based Direct-Demodulation Method,”
IEEE Journal of Solid-State Circuits, vol. 54, no. 9, pp. 2435–2448, Sep. 2019.
[52] H. Wang et al., “A 100-120GHz 20Gbps Bits-to-RF 16QAM Transmitter Using 1-bit
Digital-to-Analog Interface,” in 2019 IEEE Custom Integr. Circuits Conf. (CICC), April
2019, pp. 1–4.
[53] H. Wang, H. Mohammadnezhad, and P. Heydari, “Analysis and Design of High-Order
QAM Direct-Modulation Transmitter for High-Speed Point-to-Point mm-Wave Wireless
Links,” IEEE Journal of Solid-State Circuits, pp. 1–19, 2019.
[54] R. H. Walden, “Analog-to-digital converter survey and analysis,” IEEE J. Sel. Areas
Commun., vol. 17, no. 4, pp. 539–550, April 1999.
[55] B. E. Jonsson, “A survey of A/D-Converter performance evolution,” in IEEE Int. Conf.
Elec., Circuits Syst., Dec 2010, pp. 766–769.
[56] T. Sundstrom et al., “Power Dissipation Bounds for High-Speed Nyquist Analog-to-
Digital Converters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 3, pp.
509–518, March 2009.
[57] T. Ito et al., “Capacitance Mismatch Evaluation for Low-power Pipeline ADC Design,”
IEICE Electronics Express, vol. 1, no. 3, pp. 63–68, 2004.
[58] A. Karimi-Bidhendi, O. Malekzadeh-Arasteh, M. Lee, C. M. McCrimmon, P. T. Wang,
A. Mahajan, C. Y. Liu, Z. Nenadic, A. H. Do, and P. Heydari, “CMOS Ultralow Power
Brain Signal Acquisition Front-Ends: Design and Human Testing,” IEEE Transactions
on Biomedical Circuits and Systems, vol. 11, no. 5, pp. 1111–1122, Oct 2017.
[59] A. Mahajan, A. K. Bidhendi, P. T. Wang, C. M. McCrimmon, C. Y. Liu, Z. Nenadic,
A. H. Do, and P. Heydari, “A 64-channel ultra-low power bioelectric signal acquisition
system for brain-computer interface,” in 2015 IEEE Biomedical Circuits and Systems
Conference (BioCAS), Oct 2015, pp. 1–4.
[60] M. Lee, A. Karimi-Bidhendi, O. Malekzadeh-Arasteh, P. T. Wang, Z. Nenadic,
A. H. Do, and P. Heydari, “A CMOS inductorless MedRadio OOK transceiver
with a 42 µW event-driven supply-modulated RX and a 14% efficiency TX for medical
implants,” in 2018 IEEE Custom Integrated Circuits Conference (CICC), April 2018,
pp. 1–4.
[61] M. Lee, A. Karimi-Bidhendi, O. Malekzadeh-Arasteh, P. T. Wang, A. H. Do,
Z. Nenadic, and P. Heydari, “A CMOS MedRadio Transceiver With Supply-Modulated
Power Saving Technique for an Implantable Brain–Machine Interface System,”
IEEE Journal of Solid-State Circuits, vol. 54, no. 6, pp. 1541–1552, June 2019.
[62] A. Karimi-Bidhendi, H. Pu, and P. Heydari, “Study and Design of a Fast Start-Up Crys-
tal Oscillator Using Precise Dithered Injection and Active Inductance,” IEEE Journal
of Solid-State Circuits, pp. 1–12, 2019.
[63] B. Razavi, “Design Considerations for Interleaved ADCs,” IEEE J. Solid-State Circuits,
vol. 48, no. 8, pp. 1806–1817, Aug 2013.
[64] L. L. Lewyn et al., “Analog Circuit Design in Nanoscale CMOS Technologies,” Proc.
IEEE, vol. 97, no. 10, pp. 1687–1714, Oct 2009.
[65] D. Cui et al., “3.2 A 320mW 32Gb/s 8b ADC-based PAM-4 analog front-end with
programmable gain control and analog peaking in 28nm CMOS,” in IEEE Int. Solid-
State Circuits Conf. (ISSCC), Jan 2016, pp. 58–59.
[66] N. Kurosawa et al., “Explicit analysis of channel mismatch effects in time-interleaved
ADC systems,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 48, no. 3, pp.
261–271, March 2001.
[67] B. Murmann et al., “Digitally enhanced analog circuits: System aspects,” in IEEE Int.
Symp. Circuits Syst., May 2008, pp. 560–563.
[68] M. El-Chammas et al., “General Analysis on the Impact of Phase-Skew in Time-
Interleaved ADCs,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 5, pp.
902–910, May 2009.
[69] B. Razavi, “Problem of timing mismatch in interleaved ADCs,” in Proc. IEEE Custom
Integr. Circuits Conf., Sep. 2012, pp. 1–8.
[70] J. Cao et al., “29.2 A transmitter and receiver for 100Gb/s coherent networks with
integrated 4×64GS/s 8b ADCs and DACs in 20nm CMOS,” in IEEE Int. Solid-State
Circuits Conf. (ISSCC), Feb 2017, pp. 484–485.
[71] K. Bult, “Embedded analog-to-digital converters,” in Proc. ESSCIRC, Sep. 2009, pp.
52–64.
[72] N. Dolatsha et al., “17.8 A compact 130GHz fully packaged point-to-point wireless
system with 3D-printed 26dBi lens antenna achieving 12.5Gb/s at 1.55pJ/b/m,” in
IEEE Int. Solid-State Circuits Conf. (ISSCC), Feb 2017, pp. 306–307.
[73] N. Deltimple, B. Le Gal, C. Rebai, A. Aulery, N. Delaunay, D. Dallet,
D. Belot, and E. Kerherve, “Cartesian Feedback with Digital Enhancement
for CMOS RF Transmitter,” in Linearization and Efficiency Enhancement
Techniques for Silicon Power Amplifiers, Nov. 2014. [Online]. Available:
https://hal.archives-ouvertes.fr/hal-01090527
[74] W. Osborne and B. Kopp, “Synchronization in M-PSK modems,” in Conference Record,
SUPERCOMM/ICC ’92: Discovering a New World of Communications, vol. 3, June 1992,
pp. 1436–1440.
[75] S. G. Farquhar et al., “Demodulation in digital communication systems,” 1997,
UK Patent GB2306085A. [Online]. Available:
https://patents.google.com/patent/GB2306085A
[76] A. Alaiwi et al., “Correlation-base Demodulation,” 2015, WIPO Patent
WO2015022532. [Online]. Available:
https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2015022532
[77] “5G Candidate Band Study: Study on the Suitability of Potential Candidate Frequency
Bands above 6GHz for Future 5G Mobile Broadband Systems.” [Online]. Available:
https://www.ofcom.org.uk/__data/assets/pdf_file/0014/31910/qa-report.pdf
[78] H. Mohammadnezhad et al., “Analysis and Design of a Wideband, Balun-Based,
Differential Power Splitter at mm-Wave,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 65,
no. 11, pp. 1629–1633, Nov 2018.
[79] H. Darabi, Radio Frequency Integrated Circuits and Systems. Cambridge University
Press, May 2015.
