Analysis and Design of CMOS Radio-Frequency Power Amplifiers by Qian, Haoyu
ANALYSIS AND DESIGN OF CMOS
RADIO-FREQUENCY POWER AMPLIFIERS
A Dissertation
by
HAOYU QIAN
Submitted to the Office of Graduate and Professional Studies of
Texas A&M University
in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
Chair of Committee, Jose Silva-Martinez
Committee Members, Aydin Karsilayan
Peng Li
Duncan M. Walker
Head of Department, Miroslav M. Begovic
May 2017
Major Subject: Electrical Engineering
Copyright 2017 Haoyu Qian
ABSTRACT
The continuous advancement of semiconductor technologies, especially CMOS technology, has enabled
exponential growth of the wireless communication industry. This explosive growth in turn has completely
changed people’s lives. The scaling down of CMOS feature size greatly benefits digital logic integration,
which results in more powerful, versatile, and economical digital signal processing. Further research
and development has pushed analog, mixed-signal, and even radio-frequency (RF) circuit blocks to be
implemented and integrated in CMOS.
Future generations of wireless communication call for an even higher level of integration, and as of now,
the only circuit block that is rarely integrated in CMOS along with other parts of the system is the power
amplifier (PA). Due to the fact that the PA in a wireless communication system is the most power-hungry
circuit block, the integration of RF PA in CMOS would potentially not only save the cost of the wireless
communication system real estate, but also reduce power consumption since die-to-die connection loss
can be eliminated.
RF PA design involves handling large voltages and currents at radio frequencies, which
in present wireless communication standards are in the gigahertz range. Therefore, a good understanding
of many aspects related to RF PA design is necessary. Theoretical analysis of the communication
system, the nonlinear effects of the PA, and the impedance matching network is systematically presented.
The analysis of the nonlinear effects proposes a formal mathematical description of the multitone
nonlinearity, and through its relationship with the two-tone test, the proposed PA design methodology would
greatly reduce the design time while improving the design accuracy.
A thorough analysis of the available architectures and design techniques for efficiency and linearity
enhancement of RF PAs shows that despite a tremendous amount of research and development on this
topic, the fundamental tradeoff between the two still limits RF PA implementation largely to
SiGe, GaAs, and InP technologies. An RF PA for the Wideband Code-Division Multiple Access (WCDMA)
standard is proposed, designed, and implemented in CMOS to demonstrate the proposed
segmentation technique, which resolves the main tradeoff between power efficiency and linearity. The
innovative architecture developed in this work is not limited to applications in the WCDMA communication
protocol or the CMOS technology, although CMOS implementation would take advantage of the readily
available digital resources.
To my wife, parents, and late grandparents.
ACKNOWLEDGMENTS
First, I would like to acknowledge my advisor, Professor Jose Silva-Martinez. It was Dr. Silva who first
introduced me to analog integrated circuit design, and throughout my entire PhD program, I have always
looked up to his passion, his approach to solving problems, and his dedication. I am very grateful to have
had a chance to work with him, and am thankful for his guidance that made this research possible.
I would like to thank Professor Aydin Ilker Karsilayan for his valuable suggestions and technical
insights. I also greatly appreciate Professor Peng Li and Professor Duncan M. Walker for their comments
and inputs, and for taking time out of their very busy schedules to serve on my graduate committee.
I feel very honored to be part of the AMSC group, and want to thank Heng Zhang, Yung-Chung
Lo, Shan Huang, Jiayi Jin, Cheng Li, Chengliang Qian, Jie Zou, Xi Chen, Zhuizhuan Yu, Yang Liu,
Jingjing Yu, Hongbo Chen, Jun Zhou, Yang Su, Chen Ma, Jun Yan, Miao Song, Hao Huang, Yang Gao,
Yanjie Sun, Nan Wang, Geng Tang, Congying Shi, Xiaosen Liu, Younghoon Song, John Mincey, Marvin
Onabajo, Chang-Joon Park, Sai Ganta, Jesse Coulon, Mohan Geddada, Alex Edward, Carlos Briseno,
Ehsan Tabasy, Nagar Rashidi, Hajir Hedayati, Efrain Gaxiola, Jorge Zarate, and Richard Turkson for their
friendship. In particular, I want to thank Jingjing Yu, Chen Ma, Jun Yan, and Jesse Coulon for many
nights spent together on coursework. I also want to thank Shan Huang for many invaluable technical
discussions. Younghoon Song and John Mincey also served as my TAs during my graduate study, and with
their generous help on the finest details of many aspects of analog IC design, I gained the most out of
the classes. I had the honor of working with Dr. Marvin Onabajo on his research project of RF mixer
built-in self testing. He is the most hard-working person I have ever worked with, and I learned quite a lot
from him. Also, I really want to thank Professor Kamran Entesari for taking time to share his expertise
in impedance matching techniques. In addition, I want to express my appreciation to Ella Gallagher for
facilitating events and paperwork.
I would like to express my gratitude to the Department of Electrical & Computer Engineering of
Texas A&M University for providing such a good environment for academic research, and for offering
such comprehensive coursework that prepared me for my career. In particular, I would like to thank
Professor Scott Miller for teaching me Digital Communication Theory, and Professor Aniruddha Datta
for teaching me Control Theory. The fundamental concepts in these classes greatly helped me with my
graduate research. I am very grateful to Tammy Carda for her assistance all the way through my graduate
study. Her administrative competence and warm personality greatly eased my life as a graduate student.
During my internship at Microtune, Jason Wardlaw was my mentor, and helped me quite a lot
to get up to speed. I also want to thank Yan Cui, Jan-Michael Stevenson, and Ron Spencer for technical
discussions, and Kirk Asby for his great leadership of the many good people at Microtune and for giving
me the opportunity to work with them.
Outside of the Department of Electrical & Computer Engineering, I would like to thank Professor
Joseph H. Ross, Jr. from the Department of Physics. Dr. Ross was my research advisor during my study at
the Department of Physics, and it was he who first taught me how research should be done at the world-class
level. I really appreciate his mentorship and unselfish encouragement to pursue the Electrical
Engineering degree.
Outside of Texas A&M University, I want to express my gratitude toward Eric Soenen of TSMC for
giving me the opportunity to tape out my project with their 40 nm CMOS technology. Also I want to
thank Sherif Embabi of Nvidia for allowing me to measure the power amplifier at their lab with their
instruments. Their help all directly contributed to my work.
Finally, I must reserve the most special appreciation for my friends and family. I thank my friends at
Texas A&M University and elsewhere for their friendship. I am deeply indebted to my parents, Guyuan Lu
and Cheng Qian, who gave me a loving family and have always emphasized the importance of education.
My late grandparents, Maozhen Zhang and Shaojie Lu, brought me up, and supported me every step
of the way. Last, my most tender and sincere thanks are reserved for my wife, Sulei Chen, who was
willing to marry a graduate student, and who continues giving me her unconditional love and
support.
CONTRIBUTORS AND FUNDING SOURCES
Contributors
This work was supported by a dissertation committee consisting of Professors Jose Silva-Martinez,
Aydin Karsilayan, and Peng Li of the Department of Electrical and Computer Engineering, and Professor
Duncan M. Walker of the Department of Computer Science and Engineering.
All work for the dissertation was completed by the student, under the advisement of Professor Jose
Silva-Martinez.
Funding Sources
Graduate study was supported by teaching assistantship from Texas A&M University.
TABLE OF CONTENTS
I INTRODUCTION
I.A Background
I.B Motivation and Challenges in CMOS Power Amplifier (PA)
I.C Organization of the Dissertation
II PA CLASSIFICATIONS
II.A Introduction
II.B Class A Power Amplifier
II.C Class B Power Amplifier
II.D Reduced Conduction Angle PAs
II.E Class D Power Amplifier
II.F Class E Power Amplifier
II.G Class F Power Amplifier
III DIGITAL COMMUNICATION SYSTEMS
III.A Introduction
III.B Digital Modulation
III.C Wireless Communication Standards
IV NONLINEAR EFFECTS OF POWER AMPLIFIER
IV.A Introduction
IV.B MOSFET Nonlinearity Mechanisms
IV.C Single-tone Test: Harmonic Distortion and Gain Compression
IV.D Two-tone Test: Intermodulations and Intercept Point
IV.E Multitone Nonlinear Effects
IV.F AM-to-PM Conversion
V IMPEDANCE MATCHING NETWORK DESIGN
V.A Introduction
V.B Impedance Matching Theory
V.C L-match Network
V.D Pi-match Network
V.E Multisection Matching Network
VI POWER AMPLIFIER ARCHITECTURES
VI.A Introduction
VI.B Efficiency Enhancement Techniques
VI.C Linearization Techniques
VII A 35DBM OUTPUT POWER AND 38DB LINEAR GAIN PA WITH 44.9% PEAK PAE AT 1.9GHZ IN 40 NM CMOS
VII.A Introduction
VII.B Efficiency Enhancement Techniques
VII.C PA Architecture
VII.D PA System Design
VII.E Measurement Results
VII.F Conclusion
VIII CONCLUSION
REFERENCES
LIST OF FIGURES
1 Global cellular subscription growth.
2 Distribution of global mobile subscribers by technology.
3 Global smartphone shipments forecast.
4 Schematic of a generic single-ended PA.
5 Class A drain current and voltage waveforms at maximum output level.
6 Class B current and voltage waveforms at maximum output level.
7 Efficiency of classes A and B as function of PBO.
8 Reduced conduction angle PA current and voltage waveforms at maximum output level.
9 Maximum efficiency and output power as a function of conduction angle.
10 Conceptual schematic of a class D PA.
11 Class D current and voltage waveforms.
12 Conceptual schematic of a class E PA.
13 Class E current and voltage waveforms.
14 Class F current and voltage waveform illustrations.
15 Block diagram of a digital RF transmitter.
16 Relationship between two representations of complex envelope.
17 Ideal BPSK constellation diagram.
18 Ideal QPSK constellation diagram.
19 Ideal 16QAM constellation diagram.
20 Illustrative I-V characteristic of a MOSFET.
21 Typical power transfer characteristics of a PA.
22 Third-order intermodulation.
23 Geometrical interpretation of IP3 calculation.
24 Typical RF transmitter.
25 Microphotograph of the chip.
26 Simulated and predicted multitone ACLR as a function of number of tones.
27 Multitone output spectrum.
28 PA output impedance matching block diagram.
29 Lumped-element L-match network.
30 Concept of loaded Q.
31 Lumped L-match driven by the intended source resistance.
32 Hybrid L-match network.
33 Π-match network in output matching applications.
34 General schematic for analysis of a Π-match network.
35 Π-match network with inductor’s parasitic resistance explicitly shown.
36 Thevenin equivalent circuit of the Π-match network near matching frequency.
37 Multisection matching network.
38 Schematic of a 2-stage matching network.
39 Conceptual schematic of outphasing modulation.
40 Demonstration of outphasing load configuration without compensation.
41 Outphasing load compensation network.
42 Outphasing technique implemented in the current domain.
43 Conceptual schematic of a Doherty PA.
44 Simplified Doherty PA for analysis.
45 Conceptual schematic of an EER architecture.
46 Conceptual schematic of a polar modulation architecture.
47 Conceptual schematic of an ET system.
48 Conceptual schematic of a power combining architecture.
49 Conceptual schematic of a polar-loop feedback system.
50 Conceptual schematic of a Cartesian feedback system.
51 Conceptual schematic of a feedforward architecture.
52 Conceptual schematic of a predistortion system.
53 Conceptual schematic of an ET system.
54 Conceptual schematic of power combining architecture with switchable PAs.
55 Correlation between control phases and baseband signal amplitude.
56 Simplified schematic of the proposed architecture.
57 Simplified model for timing mismatch analysis.
58 PA output waveforms (RF component is not shown for simplicity).
59 Timing mismatch effects on ACLR.
60 Schematic of the PA output stage; the core consists of 1536 replicas.
61 Schematic showing the CMFB circuit allocated at PA output.
62 Conceptual schematic of the driver stage.
63 Simulation results of common-mode voltage transient response.
64 Two-section impedance matching network.
65 Insertion loss simulation with process variations.
66 Transient simulation results.
67 Microphotograph of the chip.
68 Measured gain, output power, and PAE as a function of input at 1.9 GHz.
69 Simulation and measurement results of PA’s S22.
70 ACLR measured at maximum output power of 31 dBm.
71 SEM measured at maximum output power of 31 dBm.
72 ACLR as a function of maximum output power.
73 EVM as a function of maximum output power.
74 Phase error as a function of maximum output power.
LIST OF TABLES
I RF/Microwave-Related Publications in IEEE Database
II QPSK Format Summary
III GSM PA Specifications
IV IEEE 802.11g PA Specifications
V WCDMA PA Specifications
VI Intermodulation Adjacent Tone Position Analysis
VII Triple-beat Adjacent Tone Position Analysis
VIII Impedance Matching Network Component Values
IX Comparison With Recently Published Works
I. INTRODUCTION
I.A. Background
Radio frequency (RF) circuits and systems have been around for more than a century. Since the
theoretical prediction of electromagnetic waves by Maxwell and their later experimental verification by Hertz,
scientists and engineers have devoted endless effort to developing systems that are able to transmit
and receive information embedded in such waves. From the invention of the electrical telegraph to radio
broadcasting, then telephone and television broadcasting, the internet, cellular phones, and nowadays the global
positioning system (GPS), Bluetooth, and wireless local area networks (WLAN), the list that shows the
effort to enable and improve means of communication via RF and microwave technologies keeps growing.
The first-generation (1G) cellular system first appeared in 1969 [1]. Then, after a decade of patenting,
licensing, and trial services, the first full-service 1G cellular system was finally launched in late 1983 [2]. The
following generations were developed and brought to market at an exponentially faster pace. Development
in semiconductor technologies is a deciding factor: ever since the invention of the integrated circuit (IC),
and especially the development of the CMOS IC, digital signal processing (DSP) has advanced greatly, and thus
more functionalities can be integrated. Mobile devices, and hence services, became more affordable. As a
result, today’s mobile devices are no longer limited to making telephone calls, but can offer functionalities
such as Bluetooth, WLAN, GPS, and so on. In addition, the frequency band originally intended
only for handling voice information can now carry video calls, and at the same time photos, videos, and
files can be shared. Fig. 1 shows the growth of cellular subscriptions over the last decade. Not only has
the total number of cellular subscriptions increased by a factor of about seven during the past 14-year
span, but the number of subscriptions per 100 inhabitants is also approaching 100. Note that the cellular industry did not
show a significant slowdown even during the global economic recession in the late 2000s. The size of
the population does not pose a limit to the expansion of the market either. Fig. 1 shows that the average
subscription percentage is close to 100% as projected for the year 2014, and detailed statistics show that
in some countries and regions this number has already exceeded the 100 mark for several years.
With the development of the third-generation (3G) wireless communication standards, wireless
communication has become so reliable, versatile, and thus convenient, that it is no longer considered a luxury
or even optional, but has become an essential part of people’s everyday lives. With the introduction of
the fourth-generation (4G) technology that applies not only to mobile phones (especially smartphones)
but also to other devices such as tablets, laptops, televisions, and even motor vehicles, the concept of wireless
communication is extended to communication between devices. As predicted in Fig. 2, new generations of
wireless communication standards steadily take over the old ones. The fast pace of technology takeover
in wireless communication has made it one of the most active, dynamic, and exciting segments of the
semiconductor industry.
Fig. 1. Global cellular subscription growth. [3]
Fig. 2. Distribution of global mobile subscribers by technology. [4]
The promising functionalities in the 3G and 4G technologies indeed imply a promising potential in the
wireless market. Fig. 3 shows the global smartphone shipments forecast. Despite the mobile market in all
developed countries and most developing countries being already mature, the smartphone shipments still
see a steady increase in the forecast.
Fig. 3. Global smartphone shipments forecast. [5]
The growth in the market and the continuous evolution of wireless communication technologies in turn pushed
for more research and development in RF and microwave IC and system design. Table I shows the
number of publications returned by a simple search of the keywords “radio frequency” and “microwave” in the
IEEE Xplore database. In these fields, the number of IEEE publications in the 1990s is more than
1.5 times the total number before 1990, the number in the 2000s is almost triple that of the 1990s,
and the number of IEEE publications over the most recent period of less than four years has already surpassed
that of the entire 1990s.
TABLE I
RF/Microwave-Related Publications in IEEE Database
Year               No. of publications
Before 1990        2953
1991 to 2000       4498
2001 to 2010       12099
2011 to present    5095
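The ratios quoted for Table I can be checked with a few lines of arithmetic. The following is only a minimal sketch in Python; the dictionary keys and variable names are illustrative and are not part of the original work, while the counts are copied from Table I.

```python
# Publication counts copied from Table I (IEEE Xplore keyword search).
# The key names are illustrative labels, not identifiers from the source.
counts = {
    "before_1990": 2953,
    "1991_2000": 4498,
    "2001_2010": 12099,
    "2011_present": 5095,
}

# Ratio of the 1990s count to everything before 1990.
ratio_1990s_vs_earlier = counts["1991_2000"] / counts["before_1990"]

# Ratio of the 2000s count to the 1990s count.
ratio_2000s_vs_1990s = counts["2001_2010"] / counts["1991_2000"]

# Whether the most recent (partial) period already exceeds the 1990s.
recent_exceeds_1990s = counts["2011_present"] > counts["1991_2000"]

print(f"1990s vs. before 1990: {ratio_1990s_vs_earlier:.2f}x")
print(f"2000s vs. 1990s:       {ratio_2000s_vs_1990s:.2f}x")
print(f"2011-present exceeds 1990s: {recent_exceeds_1990s}")
```

The first ratio comes out slightly above 1.5 and the second somewhat below 3, consistent with the wording "more than 1.5 times" and "almost triple".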
One important factor that has been a constant driving force behind the aforementioned growth in both the
industry and the research and development effort is the advance in semiconductor technology. Because of
the scaling down of the feature size, more transistors can be placed per unit area, and the overall cost
is reduced. The major beneficiary of such progress is integrated digital circuits, because more gates
and thus functionalities can be integrated. As a result, DSP has become more powerful, versatile,
and economical. For analog and RF front-end designs, the scaling of semiconductor technology presents more of a
design trade-off than the one-sided benefit their digital counterparts experience. Therefore the front-end and
back-end used to be implemented on separate chips, each using a semiconductor technology optimized
for its respective performance metrics. Although such a degree of integration greatly reduced cost as
compared with discrete circuits, it cannot keep pace with the growing market demand for
even lower cost to maintain a sufficient profit margin. Consequently, the trend has been towards integration of
digital, analog, and even RF circuitry on the same die, termed system on chip (SoC). Analog and mixed-signal
circuits were first integrated with the digital back-end, then small-signal RF circuitry, such as the RF
receiver [6–10] and transmitter driver [11–15], followed suit. In addition to savings in power consumption
and area, RF circuits fit naturally with today’s high-speed digital circuits because they can share at least part of
the clocking circuits, and the increasingly common RF-to-digital interface and digital assistance to
the RF circuits such as calibration, predistortion, built-in self testing, and multimode controls would be
bulky and inefficient if the RF and digital sections were on separate dice [16].
Since most of the signal processing is performed in the digital domain, and CMOS technology remains
the best suited for digital ICs to date, the aforementioned trend translates into the integration of
the system on a digital CMOS die. Analog and mixed-signal building blocks such as amplifiers, active filters,
bandgap references, voltage regulators, and data converters have been successfully implemented in digital
CMOS with satisfactory performance and low power consumption. Because of its lossy and thus noisy substrate
and its relatively large and nonlinear parasitic components as compared with conventional RF IC technologies
such as SiGe, GaAs, and InP, CMOS was not adopted for RF ICs in industry at the early
stage of this process of integration. But lower cost as a result of integration and better battery life as a
consequence of the low supply voltage and bias current of CMOS technology fit the demand of the mobile
communication market so well that a great amount of research and development effort has resulted in
reliable and high-performance small-signal RF ICs implemented in CMOS.
I.B. Motivation and Challenges in CMOS Power Amplifier (PA)
RF PAs are the biggest power consumers in the RF transceiver chain and occupy large die area, but they
have large current and voltage swings at high frequency, and hence are difficult to implement in digital CMOS.
Although there is increasing interest and research effort in the academic community, commercial PAs
are still dominated by SiGe and GaAs technologies as of today. The drawbacks of CMOS technology
mentioned above (the lossy substrate and the large, nonlinear parasitics) are one factor; another reason that is holding off commercial
CMOS PAs, and thus full integration of the entire radio, is the reliability issue due to CMOS scaling.
As CMOS technology scales down to finer minimum dimensions, the maximum allowable voltage
is lowered, and various mechanisms have made CMOS PAs especially prone to device breakdown [17].
Consequently, because the output power scales as the square of the voltage swing divided by the load resistance,
realizing high-power CMOS PAs with conventional architectures would result in a very low
load impedance, which is difficult to realize, sensitive to process, voltage, and temperature (PVT) variations,
and can have high power loss. In addition, 3G and 4G standards use modulation schemes that have a high
peak-to-average power ratio (PAPR), thus highly linear PAs are needed. Multimode requirements in
future generations of mobile devices would add flexibility requirements to the PA design, making it more
challenging to implement in digital CMOS. Since the majority of the CMOS foundry’s customers
use the technology for digital or mixed-signal applications, the modeling and characterization from the
foundry are also based on those applications, which is not enough even for general RF IC applications,
not to mention high-power RF PAs. Therefore another challenge in CMOS PA design is the lack of
accurate modeling and thus simulation tools. For example, even in a CMOS technology that comes with
S-parameters, they are small-signal S-parameters, which can only serve as a reference for PA design.
GaAs and SiGe technologies suffer less from such a problem because they are mainly used for RF PA
applications and their modeling emphasizes large-signal RF behavior.
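The link between a lower allowable voltage and a very low load impedance can be illustrated with a rough calculation. The sketch below assumes an ideal linear PA whose sinusoidal output swing has a peak amplitude V across a resistive load R, so that the delivered power is V^2/(2R). The function names, the candidate peak voltages, and the choice of 35 dBm (borrowed from the title of Section VII) are illustrative assumptions, not the design procedure used in this work.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


def required_load_resistance(v_peak: float, p_out_dbm: float) -> float:
    """Load resistance (ohms) needed to deliver p_out_dbm with a sinusoidal
    swing of peak amplitude v_peak, assuming P = v_peak**2 / (2 * R)."""
    return v_peak ** 2 / (2.0 * dbm_to_watts(p_out_dbm))


# Hypothetical peak swings (volts); smaller values mimic scaled CMOS nodes.
target_dbm = 35.0
for v_peak in (1.0, 2.0, 3.3):
    r_load = required_load_resistance(v_peak, target_dbm)
    print(f"V_peak = {v_peak:.1f} V -> R_load = {r_load:.3f} ohm")
```

Under these assumptions the required load resistance falls with the square of the available swing, so at watt-level output power a swing of about one volt calls for a load well below one ohm, far from the usual 50 ohm antenna impedance.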
Despite the challenges discussed above, because of the potential cost reduction from CMOS integration,
a great deal of research effort has been put into the design of CMOS PAs. Along the way, various techniques have
been developed not only to alleviate the limitations posed by CMOS technology but also to improve PA
performance. Therefore, in addition to finding a low-cost solution for fully integrated wireless transceivers,
another motivation of this work is that the techniques developed for improving PA performance metrics
such as linearity and efficiency in CMOS technology, which is not optimized for RF PA applications,
may also find enduring value in other semiconductor technologies, future CMOS nodes, or future
semiconductor technologies, and even in other applications.
Some call CMOS PAs the “last frontier”, or the “missing piece” of full integration of the wireless
transceiver system. This dissertation hence presents the analysis of various aspects of RF PA design, and
introduces the research results of a segmented CMOS PA for wireless communication applications.
I.C. Organization of the Dissertation
The objective of this work is to exploit standard CMOS technology for development of RF PAs for
current and future wireless communication applications. RF PA design is not an easy task because the
designer is required to have a solid understanding of analog, digital, and RF IC design concepts. In
addition, due to the challenges in realizing CMOS RF PAs mentioned above, it is important to be aware
5
of the device physics and various failure mechanisms in modern CMOS technology. Moreover, proficiency
is needed in other areas such as microwave, communication, and signal processing, each of which is itself
a well-developed discipline. This dissertation does not cover all topics related to CMOS RF PA design in
great detail, but presents an analysis of some of the important ones before discussing the research
results of a highly linear and efficient CMOS PA designed for high-power wireless communication
applications.
Section II first introduces various operation modes of PAs, which serve as a starting point of the
discussion and analysis of RF PA design. Each operation mode has its own advantages and disadvantages,
and the discussion of these operation modes reveals that important specifications of RF PA design
usually trade off with each other.
Section III provides an introduction to digital communication systems. Modern communication systems
are mostly implemented using digital modulation schemes, and modulation theory is a well-developed
discipline that deserves a textbook-length discussion [18]. This section only lays out some
of the most critical modulation concepts and the most common modulation schemes used in wireless
communications, and how they relate to RF PA design specifications. One of the requirements of modern
communication standards revealed in this section is the stringent linearity specification; a discussion
of the nonlinear effects of PAs is therefore presented in Section IV.
The root cause of nonlinear effects lies in the use of nonlinear active devices. Therefore, the nonlinear
mechanisms of MOS transistors are discussed first. Conventional linearity tests, i.e. single-tone and
two-tone tests, are also briefly reviewed. As discussed in Section III, modern communications use
wideband, multicarrier modulation schemes, hence the conventional linearity tests can only provide an
indication of the PA linearity. On the other hand, a multitone test or a modulated envelope simulation requires
long simulation time and large computation resources. To resolve this issue, a detailed analysis of multitone
nonlinear effects is carried out in this section, and as the analysis shows, a simple two-tone test can be
used to predict multitone nonlinear behavior. Although currently not a major nonlinearity contributor to
CMOS PA, the amplitude modulation to phase modulation (AM-to-PM) conversion is briefly analyzed at
the end of this section.
Section V presents an analysis of PA’s output impedance matching. The output impedance matching
is especially important for PA because it is usually the last circuit network before T/R switch or the
antenna, and its insertion loss hurts the power efficiency the most. As mentioned previously, CMOS PA’s
optimal output impedance at the drain is usually very small due to low-voltage technology. Therefore a
high impedance transformation ratio results, and thus the quality factor Q of the impedance matching
network is high. High-Q networks have narrow bandwidth, and such a network may be sensitive to PVT
variations. To alleviate this issue, a multisection impedance matching network is proposed, and simulation
results show a wide bandwidth and insensitivity to PVT variations.
Because the sophisticated multifunctional wireless electronic devices demand high efficiency PA for
longer battery life, and today’s strict wireless communication standards and overly crowded communication
frequency bands require highly linear transceivers, conventional standalone PAs would not meet all the
requirements. Various PA architectures aiming at improving power efficiency and linearity have been
developed, and Section VI reviews these techniques and their advantages and disadvantages.
Based on the analysis and discussion of the important issues concerning CMOS RF PA design,
Section VII presents a research project that develops a solution that improves the PA’s power efficiency
while maintaining its linearity performance. A PA segmentation technique is proposed, whose control scheme is
correlated with the input signal power level. A fast switching scheme ensures that the PA segments can be
activated and deactivated quickly enough for modern wideband wireless communication standards, so the
average power efficiency can be improved either within a specific standard or in a multimode application.
The proposed solution is implemented in TSMC 40 nm CMOS technology and supported by good
silicon measurement results with a WCDMA signal.
Finally, Section VIII summarizes and concludes this dissertation.
II. PA CLASSIFICATIONS
II.A. Introduction
Fig. 4 shows the schematic of a generic single-ended PA. The active device can be implemented using
MESFET, HEMT, pHEMT, BJT, JFET, or MOSFET, but since the focus of this dissertation is on CMOS
PA design, an NMOS transistor symbol is shown in the figure. The choke inductor (RFC) and blocking
capacitor (CB) ensure isolation of DC and RF signal paths. The output filter usually has multiple functions:
a) it attenuates out-of-band components; b) it enables impedance transformation, so that the impedance
of the antenna ZL can be transformed to the optimal output impedance the transistor sees at the drain or
collector, ZT ; and c) it realizes infinite or zero impedance at multiples of the harmonics of the RF signal,
so that the output waveform can be shaped.
[Figure: schematic; labels recovered from the extraction: RFin, CB, RFC, VDD, Output Filter, ZT, ZL]
Fig. 4. Schematic of a generic single-ended PA.
Based on the method of operation, conventional RF PAs are categorized into classes A - F [2, 19, 20].
In classes A, AB, B, and C PAs, the transistor operates as a current source, whereas in classes D, E, and
F, the transistor is utilized as a switch, and therefore those PAs are generally termed switching mode PAs.
Current-source type PAs, especially classes A and B, are inherently linear, but their power efficiencies
degrade in the power back-off region. On the other hand, switching mode PAs can usually achieve power
efficiencies that are theoretically close to 100%, but do not preserve amplitude linearity.
The main point of comparison between various classes of PAs is their drain efficiency (DE), defined
as the ratio of the output power at the fundamental tone to the DC power:
\[ \eta = \frac{P_{out}}{P_{DC}} \tag{1} \]
The power-added efficiency (PAE) is also a metric of the efficiency of the PA, defined as the ratio of the
power difference at the fundamental frequency between the output and input of the PA to the DC power
consumption:
\[ \mathrm{PAE} = \frac{P_{out} - P_{in}}{P_{DC}} = \eta \left( 1 - \frac{1}{G_p} \right) \tag{2} \]
where Gp is the power gain of the PA. This definition of efficiency also includes the power gain
considerations, and is approximately equal to η if Gp is large. Theoretical analyses in this section will
compare the drain efficiencies of various classes of PA operation, because the power gain information,
which is usually difficult to obtain in idealized analysis, is not needed, whereas in the design example
discussed in a later section, PAE will be reported.
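As a quick numerical illustration of (2), the following minimal sketch evaluates PAE for a few power gains at a fixed drain efficiency. The function name `pae` and the sample values (η = 50%, gains of 3 to 30 dB) are illustrative assumptions and are not taken from the design reported in this work.

```python
def pae(drain_efficiency: float, power_gain_db: float) -> float:
    """Power-added efficiency from (2): PAE = eta * (1 - 1/Gp).

    drain_efficiency is eta as a fraction (0..1); power_gain_db is Gp in dB.
    """
    gain_linear = 10.0 ** (power_gain_db / 10.0)  # Gp as a power ratio
    return drain_efficiency * (1.0 - 1.0 / gain_linear)


if __name__ == "__main__":
    # Hypothetical operating points: ideal class A drain efficiency of 50%.
    for gain_db in (3, 10, 20, 30):
        print(f"Gp = {gain_db:2d} dB -> PAE = {100.0 * pae(0.5, gain_db):.1f}%")
```

With a gain of 10 dB the PAE is already within a tenth of η, which is consistent with the statement that PAE approaches η when Gp is large.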
In modern wireless communication applications, non-constant envelope signals are more likely to be
applied to the PAs. Therefore, in addition to analyzing the maximum efficiency a certain class of PA can
obtain, it is also important to investigate the power efficiency as a function of the output power back-off
(PBO), defined as the amount, in decibels, by which the output power is below the maximum output power:
\[ \mathrm{PBO} = 10 \log_{10} \frac{P_{out,max}}{P_{out}} \tag{3} \]
Since different modulation schemes would result in different peak-to-average power ratio (PAPR) of
the envelope, the average efficiency as a function of PAPR provides a hint to making design decisions
towards a specific modulation scheme. Mathematically, the PAPR can be expressed as
\[ \mathrm{PAPR} = 10 \log_{10} \frac{P_{out,max}}{P_{out,avg}} \tag{4} \]
Basic operations of various classes of PAs will be described in this section, while techniques to improve
back-off region power efficiencies for current source PAs and to enhance linearity for switching mode
PAs will be discussed in a later section.
II.B. Class A Power Amplifier
Class A PAs are biased such that the transistor is in the active region at all specified input levels.
Assuming the input voltage is sinusoidal, the ideal drain voltage and current at the maximum output level
are shown in Fig. 5, where θ = ωt for simplicity. Since the voltage across the choke inductor can be
both positive and negative, the drain voltage can swing between 0 and 2VDD. Accordingly, the drain
current varies between Imax and 0, with Imax depending on the bias and loading conditions as well as
the transistor size.
From Fig. 5, the maximum efficiency of an ideal class A PA is achieved at the maximum output power:
\[ \eta_{A,max} = \frac{\frac{1}{2} (I_{max}/2) V_{DD}}{(I_{max}/2) V_{DD}} = 50\% \tag{5} \]
[Figure: drain current iD (0 to Imax) and drain voltage vD (0 to 2VDD) versus θ (radian), 0 to 2π]
Fig. 5. Class A drain current and voltage waveforms at maximum output level.
The DC power consumption is fixed, so the efficiency as a function of PBO and the average efficiency
as a function of PAPR are
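\[ \eta_A = \frac{P_{out}}{P_{DC}} = \frac{P_{out}}{P_{out,max}} \cdot \frac{P_{out,max}}{P_{DC}} = \eta_{A,max} 10^{-\mathrm{PBO}/10} \tag{6} \]
\[ \eta_{A,avg} = \frac{P_{out,avg}}{P_{DC}} = \frac{P_{out,avg}}{P_{out,max}} \cdot \frac{P_{out,max}}{P_{DC}} = \eta_{A,max} 10^{-\mathrm{PAPR}/10} \tag{7} \]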
The efficiency of class A PAs therefore decays very quickly as the output power drops into the PBO
region. For example, the ideal efficiency of a class A PA at 6 dB power back-off from the maximum
output power drops from 50% to about 12.5%, only a quarter of the maximum efficiency. Also, class A
average efficiency for high PAPR modulation schemes is very low. For instance, multi-channel OFDM
modulations yield a PAPR of about 10 dB, so the ideal class A average efficiency would be only 5%, or
a tenth of its maximum value. Note that the above analysis is based on ideal class A operation, which
ignores the finite minimum VDS that is needed to keep the transistor in the active region. As CMOS
technology scales, VDS,min has become a fraction of VDD that is no longer negligible, so ηA,max is
less than 50%, resulting in even lower average efficiency in high PAPR modulation schemes.
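The back-off numbers quoted above can be reproduced with a short sketch of the ideal class A relation η = ηA,max · 10^(−x/10), where x is the PBO or the PAPR in dB. The helper `class_a_eta_max_with_knee` is a hypothetical first-order model added only for illustration: it assumes the drain swing is limited to VDD − VDS,min on each side of VDD, which is an assumption and not a result derived in this section.

```python
def class_a_efficiency(backoff_db: float, eta_max: float = 0.5) -> float:
    """Ideal class A efficiency at a given back-off (or PAPR) in dB."""
    return eta_max * 10.0 ** (-backoff_db / 10.0)


def class_a_eta_max_with_knee(vds_min_over_vdd: float) -> float:
    """Hypothetical first-order estimate of the class A peak efficiency.

    Assumption (not from the text): the voltage amplitude shrinks from VDD
    to VDD - VDS,min while the DC power stays the same, so the peak
    efficiency scales by (1 - VDS,min / VDD).
    """
    return 0.5 * (1.0 - vds_min_over_vdd)


if __name__ == "__main__":
    for db in (0, 3, 6, 10):
        print(f"{db:2d} dB -> {100.0 * class_a_efficiency(db):.2f}%")
    # Example: VDS,min equal to 20% of VDD (an illustrative value).
    eta_knee = class_a_eta_max_with_knee(0.2)
    print(f"with knee: peak {100.0 * eta_knee:.1f}%, "
          f"at 10 dB PAPR {100.0 * class_a_efficiency(10, eta_knee):.2f}%")
```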
II.C. Class B Power Amplifier
A PA whose transistor conducts for only half of the RF cycle is of class B. The ideal drain current and voltage
waveforms at maximum output level are shown in Fig. 6. Since the drain current is a half-wave, there will
be harmonics. In class B PA analysis, it is assumed that all harmonics are short-circuited by the output
filter. Therefore the drain voltage of a class B PA is still an ideal sinusoid.
[Figure: drain current iD (0 to Imax) and drain voltage vD (0 to 2VDD) versus θ (radian), 0 to 4π]
Fig. 6. Class B current and voltage waveforms at maximum output level.
The drain current is expressed as
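\[ i_{DS} = \begin{cases} I_{max} \cos\theta, & -\frac{\pi}{2} \le \theta \le \frac{\pi}{2} \\ 0, & -\pi \le \theta < -\frac{\pi}{2}, \ \frac{\pi}{2} < \theta \le \pi \end{cases} \tag{8} \]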
where for convenience of the analysis, the [−π, π] RF cycle is chosen. The DC and fundamental
components of the drain current are obtained by calculating the corresponding Fourier coefficients:
\[ I_{DC} = I_0 = \frac{1}{2\pi} \int_{-\pi/2}^{+\pi/2} I_{max} \cos\theta \, d\theta = \frac{I_{max}}{\pi} \tag{9a} \]
\[ i_{RF} = I_1 = \frac{2}{2\pi} \int_{-\pi/2}^{+\pi/2} I_{max} \cos^2\theta \, d\theta = \frac{I_{max}}{2} \tag{9b} \]
Therefore the theoretical maximum power efficiency of a class B PA is
\[ \eta_{B,max} = \frac{\frac{1}{2} (I_{max}/2) V_{DD}}{(I_{max}/\pi) V_{DD}} = \frac{\pi}{4} \approx 78.5\% \tag{10} \]
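The Fourier coefficients in (9a) and (9b), and the resulting efficiency in (10), can be cross-checked numerically. The sketch below integrates the half-wave current of (8) with a simple midpoint rule; the function names, the normalization Imax = 1, and the sample count are assumptions made for illustration only.

```python
import math


def class_b_current(theta: float, i_max: float = 1.0) -> float:
    """Half-wave drain current of (8) on the [-pi, pi] RF cycle."""
    return i_max * math.cos(theta) if abs(theta) <= math.pi / 2.0 else 0.0


def fourier_cosine_coefficient(waveform, n: int, samples: int = 20000) -> float:
    """n-th cosine Fourier coefficient of an even 2*pi-periodic waveform.

    Uses the midpoint rule; n = 0 returns the DC (average) value.
    """
    step = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        theta = -math.pi + (k + 0.5) * step
        total += waveform(theta) * math.cos(n * theta)
    scale = 1.0 / (2.0 * math.pi) if n == 0 else 1.0 / math.pi
    return scale * total * step


i_dc = fourier_cosine_coefficient(class_b_current, 0)  # expected Imax / pi
i_1 = fourier_cosine_coefficient(class_b_current, 1)   # expected Imax / 2
# VDD cancels: eta = (1/2) * I1 * VDD / (IDC * VDD)
eta_b_max = 0.5 * i_1 / i_dc

if __name__ == "__main__":
    print(f"I_DC = {i_dc:.6f}, I_1 = {i_1:.6f}, eta = {100.0 * eta_b_max:.2f}%")
```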
But the advantage of class B over class A is not only its higher maximum efficiency. From the derivation
of (9a), its DC current consumption is correlated with the output current. Suppose the peak output current
in the PBO is Ipk , then
\[ P_{DC} = \frac{I_{pk}}{\pi} V_{DD} = \frac{I_{pk}}{I_{max}} P_{DC,max} \tag{11} \]
[Figure: η (%) versus PBO (dB), 0 to 30 dB, with curves for class A and class B]
Fig. 7. Efficiency of classes A and B as function of PBO.
The RF current at the fundamental is the same as (9b) except that Imax is replaced by Ipk, whereas the voltage
amplitude is the product of the RF current amplitude and the optimal load impedance. Therefore
the output power can be expressed as
\[ P_{out} = \frac{1}{2} \left( \frac{I_{pk}}{2} \right)^2 \frac{V_{DD}}{I_{max}/2} = \left( \frac{I_{pk}}{I_{max}} \right)^2 P_{out,max} \tag{12} \]
Combination of (11) and (12) would lead to the power efficiency of class B PAs as a function of PBO:
\[ \eta_B = \eta_{B,max} \left( \frac{I_{pk}}{I_{max}} \right) = \eta_{B,max} \sqrt{\frac{P_{out}}{P_{out,max}}} = \eta_{B,max} 10^{-\mathrm{PBO}/20} \tag{13} \]
Fig. 7 shows theoretical power efficiency of classes A and B as a function of PBO, according to (6) and
(13). Class B has a larger maximum efficiency, and more importantly, it decays more slowly than class
A as the PA enters PBO region. The class B average efficiency as a function of PAPR is
\[ \eta_{B,avg} = \eta_{B,max} 10^{-\mathrm{PAPR}/20} \tag{14} \]
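The class B back-off behavior can be checked by evaluating the DC power and output power directly from a peak current Ipk and comparing the resulting efficiency with the closed form η = ηB,max · 10^(−PBO/20). The sketch below is a minimal illustration; the function names and the normalization Imax = VDD = 1 are assumptions.

```python
import math


def class_b_operating_point(i_pk: float, i_max: float = 1.0, v_dd: float = 1.0):
    """Return (efficiency, PBO in dB) of an ideal class B PA at peak current i_pk."""
    p_dc = (i_pk / math.pi) * v_dd                  # DC power: average current times VDD
    r_opt = v_dd / (i_max / 2.0)                    # optimal load resistance
    p_out = 0.5 * (i_pk / 2.0) ** 2 * r_opt         # fundamental output power
    p_out_max = 0.5 * (i_max / 2.0) ** 2 * r_opt    # output power at i_pk = i_max
    pbo_db = 10.0 * math.log10(p_out_max / p_out)   # power back-off in dB
    return p_out / p_dc, pbo_db


def class_b_efficiency(pbo_db: float) -> float:
    """Closed-form ideal class B efficiency versus back-off."""
    return (math.pi / 4.0) * 10.0 ** (-pbo_db / 20.0)


if __name__ == "__main__":
    for ratio in (1.0, 0.5, 0.25):
        eta, pbo = class_b_operating_point(ratio)
        print(f"Ipk/Imax = {ratio:4.2f}: PBO = {pbo:5.2f} dB, "
              f"eta = {100.0 * eta:.2f}% (closed form {100.0 * class_b_efficiency(pbo):.2f}%)")
```

Halving the peak current corresponds to about 6 dB of back-off and halves the class B efficiency, whereas the ideal class A efficiency at the same back-off falls to a quarter of its peak value.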
One of the shortcomings of the class B PA is the difficulty of realizing it reliably and robustly. As
CMOS technology scales, the transistor current shows more gradual variations around the threshold
voltage, thus simply biasing the transistor gate at the threshold voltage would result in a drain current
waveform that is far from the ideal case. Even if the drain current leakage below or close to the threshold
voltage is negligible, the threshold voltage itself varies by as much as 50% due to process, voltage, and
temperature variations, hence a robust realization could be a challenge. Finally, the ideal class B amplifier
is linear based on the assumption that harmonics are all short-circuited by the output filter. Such an
assumption requires the output filter to have a relatively high quality factor. In low voltage applications, the
optimal load resistance is low for high output power, thus the output filter would not have high Q, and
the linearity of the class B PA may not be guaranteed.
Another disadvantage of class B as compared with class A is that to deliver the same RF power at the
fundamental, a class B PA needs twice the input voltage amplitude required by class A. In other words,
the power gain of class B PAs is 6 dB less than that of class A.
[Figure: drain current iD (0 to Imax) and drain voltage vD (0 to 2VDD) versus θ (radian), 0 to 4π; conduction spans 2π − α/2 to 2π + α/2]
Fig. 8. Reduced conduction angle PA current and voltage waveforms at maximum output level.
II.D. Reduced Conduction Angle PAs
The aforementioned classes A and B operations can be viewed as special cases of a more general
concept. Define the conduction angle α to be the portion of the RF cycle, expressed as an angle, for which
conduction occurs [20]; then class A amplifiers are the ones with α = 2π, whereas for class B, α = π. In general,
for a sinusoidal input, the current waveform is a truncated sinusoid if α < 2π, as shown in Fig. 8. Not
surprisingly, class AB is defined by π < α < 2π, while PAs with 0 < α < π are of class
C. Since the transistor does not conduct all the time, the efficiency is expected to be better
than that of class A. For reduced conduction angle PA analyses, it is again assumed that all harmonics
are short-circuited by the output filter, therefore the voltage waveform in Fig. 8 is still an ideal sinusoid
between 0 and 2VDD.
Mathematically, the drain current can be modeled as
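\[ i_{DS} = \begin{cases} I_Q + I_m \cos\theta, & -\frac{\alpha}{2} \le \theta \le \frac{\alpha}{2} \\ 0, & -\pi \le \theta < -\frac{\alpha}{2}, \ \frac{\alpha}{2} < \theta \le \pi \end{cases} \tag{15} \]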
where for convenience of analysis, θ ranges from −π to π. Note that IQ only represents the
mathematical average of the current waveform if it were not truncated, and thus can be positive or
negative. From Fig. 8 and the definition of conduction angle, Imax = IQ + Im and IQ + Im cos(α/2) = 0,
thus (15) can be expressed in terms of Imax:
\[ i_{DS} = \begin{cases} \frac{I_{max}}{1 - \cos\frac{\alpha}{2}} \left( \cos\theta - \cos\frac{\alpha}{2} \right), & -\frac{\alpha}{2} \le \theta \le \frac{\alpha}{2} \\ 0, & -\pi \le \theta < -\frac{\alpha}{2}, \ \frac{\alpha}{2} < \theta \le \pi \end{cases} \tag{16} \]
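The DC and harmonic components of the drain current are obtained by calculating the corresponding
coefficients of the Fourier series:
\[ I_{DC} = I_0 = \frac{1}{2\pi} \int_{-\alpha/2}^{+\alpha/2} \frac{I_{max}}{1 - \cos\frac{\alpha}{2}} \left( \cos\theta - \cos\frac{\alpha}{2} \right) d\theta = \frac{I_{max}}{2\pi} \cdot \frac{2 \sin\frac{\alpha}{2} - \alpha \cos\frac{\alpha}{2}}{1 - \cos\frac{\alpha}{2}} \tag{17} \]
\[ I_n = \frac{2}{2\pi} \int_{-\alpha/2}^{+\alpha/2} \frac{I_{max}}{1 - \cos\frac{\alpha}{2}} \left( \cos\theta - \cos\frac{\alpha}{2} \right) \cos n\theta \, d\theta = \frac{I_{max}}{2\pi} \cdot \frac{2}{1 - \cos\frac{\alpha}{2}} \left[ \frac{\sin\frac{n-1}{2}\alpha}{n (n-1)} - \frac{\sin\frac{n+1}{2}\alpha}{n (n+1)} \right] \tag{18} \]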
where n ≥ 1. Maximum power efficiency as a function of conduction angle can be plotted using the
results from (17) and the n = 1 case (fundamental) from (18). Along with normalized maximum output
RF power, this is plotted in Fig. 9. Note that although deep class C operation would yield a maximum
efficiency of close to 100%, its output power is also low, which means for a certain overall power gain,
class C PAs would need more powerful driver amplifiers, thus the overall efficiency advantage over PAs
with larger conduction angles is less than illustrated in the plot. Moreover, class C PAs are nonlinear
even with harmonic traps at the output, making them less popular choices in modern communication
applications.
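The dependence of maximum efficiency and output power on the conduction angle can be evaluated directly from the closed-form DC component and from the fundamental component, which for n = 1 is taken as the limit I1 = (Imax/2π)(α − sin α)/(1 − cos(α/2)). The sketch below is illustrative only: the function name, the normalization Imax = 1, and the choice to normalize the output power to the class A value are assumptions, and the expression is not evaluated at α = 0, where it is singular.

```python
import math


def reduced_conduction_angle(alpha: float):
    """Return (max efficiency, output power in dB relative to class A).

    alpha is the conduction angle in radians, 0 < alpha <= 2*pi.
    Currents are normalized to Imax = 1 and the voltage amplitude is VDD.
    """
    half = alpha / 2.0
    denom = 1.0 - math.cos(half)
    i_dc = (2.0 * math.sin(half) - alpha * math.cos(half)) / (2.0 * math.pi * denom)
    i_1 = (alpha - math.sin(alpha)) / (2.0 * math.pi * denom)  # n = 1 limit
    eta_max = 0.5 * i_1 / i_dc               # (1/2) I1 VDD / (IDC VDD)
    p_norm_db = 10.0 * math.log10(i_1 / 0.5)  # class A has I1 = Imax / 2
    return eta_max, p_norm_db


if __name__ == "__main__":
    for label, alpha in (("class A", 2.0 * math.pi), ("class AB", 1.5 * math.pi),
                         ("class B", math.pi), ("class C", 0.5 * math.pi)):
        eta, p_db = reduced_conduction_angle(alpha)
        print(f"{label:8s} alpha = {alpha:5.3f} rad: "
              f"eta = {100.0 * eta:5.1f}%, Pout = {p_db:+.2f} dB")
```

The class A and class B end points reproduce the 50% and π/4 results obtained earlier, and the class C entry shows the efficiency rising while the available output power falls.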
[Figure: maximum efficiency ηmax (%, 50 to 100) and normalized maximum output power Pout,max (dB, −20 to 5) versus conduction angle α (radian), 0 to 2π]
Fig. 9. Maximum efficiency and output power as a function of conduction angle.
II.E. Class D Power Amplifier
As stated before, the active device in a PA can also be used as a switch, resulting in switching mode
PAs. Strictly speaking, such circuits should be termed power converters instead of amplifiers, since there
is not a strong correlation between the input and output power; they simply convert the DC power from
the power supply to RF power.
A straightforward implementation of such an idea is the class D PA, whose simplified schematic is
shown in Fig. 10 [20]. The output series RLC resonator is alternately connected to VDD and ground for
each half of the RF cycle, and if the resonator is tuned at the carrier frequency, the current through each
switch would be a half-wave sinusoid that complements each other, resulting in a total output current that
is a full sinusoid. The voltage and current waveforms are shown in Fig. 11.
[Figure: schematic; labels recovered from the extraction: Cp, Ls, Cs, RL, RFC, VDD]
Fig. 10. Conceptual schematic of a class D PA.
Simple calculations of the waveforms reveal that the DC and fundamental components of the square-wave
drain voltage are VDD/2 and VDD · 2/π, respectively. Similarly, the DC and fundamental components
of the drain current are Imax/π and Imax/2, respectively. The ideal power efficiency of class D is therefore
\[ \eta_D = \frac{\frac{1}{2} (I_{max}/2) (V_{DD} \cdot 2/\pi)}{(I_{max}/\pi) (V_{DD}/2)} = 100\% \tag{19} \]
[Figure: drain current iD (0 to Imax) and drain voltage vD (0 to 2VDD) versus θ (radian), 0 to 4π]
Fig. 11. Class D current and voltage waveforms.
The difficulty of implementing class D PAs for RF applications is the need for complementary switches.
The floating switch between VDD and the RLC resonator is usually implemented as a p-type device. Due
to the low mobility of holes as compared with electrons, the p-type device is usually two to three times
larger in size than the n-type device with a similar power capacity. Not only does the loss of the p-type device
itself reduce the power efficiency, but its large input capacitance also requires large driver stage power
consumption to accommodate sharp hard switching at RF. The use of transformers between the driver and
the output switches, with one in-phase and the other anti-phase, makes it possible to implement both
switches using n-type devices [21], but the need for large transformers as well as two power switches
still limits its applications to frequencies less than 100 MHz, and therefore an in-depth analysis of class
D amplifier is left out here.
II.F. Class E Power Amplifier
[Figure: schematic; labels recovered from the extraction: Cp, Ls, Cs, RL, RFC, VDD]
Fig. 12. Conceptual schematic of a class E PA.
The schematic of a class E PA is shown in Fig. 12. The parasitic capacitance at the drain of the
transistor can be absorbed into Cp, and the package inductance can be absorbed into Ls. Since the transistor
is used as a switch, its drain voltage vD and current iD waveforms satisfy 1) vD is negligible when iD
is nonzero and 2) iD is negligible when vD is nonzero. The uniqueness of class E PAs is the design of
the output filter such that, in addition to the two conditions above, ideal class E operation results
in vD and iD waveforms satisfying the following conditions [22]: 3) the rise of vD at transistor turn-off
should be delayed until after the transistor is off, 4) vD should be brought back to zero at the time of
the transistor turn-on, and 5) the slope of vD should be zero at the time of turn-on. Because of these
properties, especially the last one regarding the zero voltage slope, class E PAs are able to
achieve high efficiency even if the power transistor has a finite transition time. The ideal voltage and current
waveforms are shown in Fig. 13.
[Figure: drain current iD (0 to Imax) and drain voltage vD (0 to Vmax) versus θ (radian), 0 to 4π]
Fig. 13. Class E current and voltage waveforms.
As in the case of class D, class E achieves a maximum theoretical efficiency of 100%, but it has
several advantages over class D. First, the unavoidable drain capacitance of the transistor causes power loss
in class D operation [21], whereas in class E, according to condition 4) above, vD = 0 at the instant the switch
closes, hence there is no switching loss due to charging and discharging of the drain capacitance. This property
of class E, also referred to as “soft switching” or “zero-voltage switching”, makes it possible for a low-cost
switching mode transistor to be used in high power RF applications. Also, because hard switching and
square waveforms are not required, class E shows better tolerance of process variations.
The drawback of class E PAs is that the transistor is under large stress. Detailed analysis shows that
the maximum drain voltage can be approximately 3VDD [22, 23]. This imposes a serious design
constraint on CMOS PAs as the technology scales.
II.G. Class F Power Amplifier
Recall that in the analysis of reduced conduction angle PAs, it was assumed that the output filter would
attenuate all harmonics except the fundamental. Class F PAs, on the other hand, intentionally add harmonics at
the output to boost the power efficiency.
Starting with an ideal class B PA, its drain current iD is a half-wave sinusoid, and its tuned drain
voltage vD is sinusoidal, shown in Fig. 14(a). Its maximum efficiency is 78.5% as calculated before. This
efficiency can be improved if vD can be “flattened”, i.e. if it has a sharper transition than a sinusoid and
stays at a lower value for a longer time in the half RF cycle when iD is nonzero, such as the one shown in
Fig. 14(b). This is done by adding odd harmonic components to vD, which is realized by having infinite
impedance at odd harmonics in the output filter. If an infinite number of odd harmonics are added to vD,
it becomes a square wave, and the waveforms become those of class D, with an efficiency of 100%,
as shown in Fig. 14(c).
Even without adding all odd harmonics, there would be significant efficiency improvement. For example,
consider adding only the third harmonic; the drain voltage is then
\[ v_D = V_{DD} + V_1 \cos\theta + V_3 \cos 3\theta \tag{20} \]
where θ = ωt. Maximum flatness requires that the second derivative of vD at θ = π is zero [24]. Combining
this with the constraint that vD cannot exceed 2VDD, V1 and V3 in (20) are solved to be V1 = (9/8)VDD and
V3 = −(1/8)VDD. Since the current waveform is the same as that of class B, the efficiency in this case is
\[ \eta_F = \frac{\frac{1}{2} V_1 I_{max}/2}{V_{DD} I_{max}/\pi} = \frac{9\pi}{32} \approx 88.4\% \tag{21} \]
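The third-harmonic solution can be verified numerically. The sketch below checks the flatness condition at θ = π, confirms that the drain voltage stays between 0 and 2VDD, and evaluates the efficiency; voltages are normalized to VDD = 1 and currents to Imax = 1, and all function names are illustrative assumptions.

```python
import math

V1 = 9.0 / 8.0    # fundamental amplitude, normalized to VDD
V3 = -1.0 / 8.0   # third-harmonic amplitude, normalized to VDD


def drain_voltage(theta: float) -> float:
    """Drain voltage with fundamental and third harmonic, VDD = 1."""
    return 1.0 + V1 * math.cos(theta) + V3 * math.cos(3.0 * theta)


def drain_voltage_second_derivative(theta: float) -> float:
    """Second derivative of the drain voltage with respect to theta."""
    return -V1 * math.cos(theta) - 9.0 * V3 * math.cos(3.0 * theta)


# Sweep one RF cycle to find the voltage extremes.
samples = [drain_voltage(2.0 * math.pi * k / 3600.0) for k in range(3600)]
v_min, v_max = min(samples), max(samples)

# Class B current: I1 = Imax / 2 and IDC = Imax / pi, with Imax = 1.
eta_f = (0.5 * V1 * 0.5) / (1.0 / math.pi)

if __name__ == "__main__":
    print(f"v_D'' at pi = {drain_voltage_second_derivative(math.pi):.2e}")
    print(f"v_D range   = [{v_min:.4f}, {v_max:.4f}] x VDD")
    print(f"eta_F       = {100.0 * eta_f:.1f}%")
```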
One of the drawbacks of class F is the same as that of class B, namely the difficulty in designing an
accurate and robust bias scheme. Another disadvantage is that the output filter is more complex, which
leads to more power loss. Since the output filter is usually the last stage before the antenna, its power
loss is more detrimental to the overall efficiency.
[Figure: three panels (a), (b), and (c), each showing drain current iD (0 to Imax) and drain voltage vD (0 to 2VDD) versus θ (radian), 0 to 4π]
Fig. 14. Class F current and voltage waveform illustrations. (a) Ideal class B current and voltage, (b) class F, realized by adding
third and fifth harmonics to the class B drain voltage, and (c) class F with all odd harmonics added to the drain voltage becomes
identical to class D.
III. DIGITAL COMMUNICATION SYSTEMS
III.A. Introduction
Modern communication systems widely use digital modulations because of advances in digital signal
processing (DSP). In a generic digital RF transmitter, which is the focus of this work, information such
as voice and image is first digitized, then compressed, serialized, and pulse-shaped in the digital domain,
before being converted back to analog form and up-converted to the specific frequency band according
to the respective communication standard. The power amplifier (PA) then comes into play and sends out
the information through the antenna with a certain amount of RF power. The block diagram of such a
transmitter is shown in Fig. 15.
[Figure: block diagram with blocks Voice, Sensor/Amp, A/D, DSP, D/A, LO, and PA]
Fig. 15. Block diagram of a digital RF transmitter.
Although this work focuses on the design and implementation of the PA, it is important to have a working
understanding of the basic concepts of digital modulations as well as some common communication
standards, and this section serves as an overview of these topics.
III.B. Digital Modulation
Digital modulation transfers a digital bit stream over an analog bandpass channel [18]. Due to its digital
nature, the modulating signal usually takes one out of two possible values, although in some modulation
schemes there can be more than two states. Therefore digital modulation techniques are often termed
keying, derived from the Morse key used for telegraph.
Different modulation schemes are used in various communication standards, so a brief introduction to
them is necessary to reveal some properties of the standards.
III.B.1. Basic Concepts: Starting from the frequency domain, the modulation process is equivalent to
moving the complex baseband signal SBB to the carrier frequency fc:
\[ S_{RF}(f) = \frac{1}{2} S_{BB}(f - f_c) + \frac{1}{2} S_{BB}^{*}(-f - f_c) \tag{22} \]
where (∗) denotes the complex conjugate. In the time domain, (22) becomes
\[ s_{RF}(t) = \frac{1}{2} s_{BB}(t) e^{j\omega_c t} + \frac{1}{2} s_{BB}^{*}(t) e^{-j\omega_c t} = \Re\{ s_{BB}(t) e^{j\omega_c t} \} \tag{23} \]
where ωc = 2πfc. The complex signal sBB(t) is called the complex envelope of the real signal sRF(t).
It can be decomposed into either polar or cartesian form. In polar form, the complex envelope and the
real narrowband signal are
\[ s_{BB}(t) = A(t) e^{j\phi(t)} \tag{24a} \]
\[ s_{RF}(t) = \Re\{ A(t) e^{j\phi(t)} e^{j\omega_c t} \} = A(t) \cos[\omega_c t + \phi(t)] \tag{24b} \]
where A(t) and φ(t) are the amplitude and phase modulations, respectively. The same process applied to
the cartesian decomposition results in
\[ s_{BB}(t) = I(t) + jQ(t) \tag{25a} \]
\[ s_{RF}(t) = \Re\{ [I(t) + jQ(t)] e^{j\omega_c t} \} = I(t) \cos\omega_c t - Q(t) \sin\omega_c t \tag{25b} \]
where I(t) and Q(t) are called the in-phase and quadrature components, respectively. Fig. 16 shows
the relationship between the two sets of baseband representations. If the horizontal axis represents the
I-component while the vertical represents the Q-component, then
\[ A(t) = \sqrt{I^2(t) + Q^2(t)} \tag{26a} \]
\[ \phi(t) = \tan^{-1} \frac{Q(t)}{I(t)} \tag{26b} \]
In digital communications, the complex envelope results in discrete points on the IQ plane, and
such plots are termed constellation diagrams.
[Figure: I-Q plane showing a point with amplitude A(t), phase φ(t), and components I(t) and Q(t)]
Fig. 16. Relationship between two representations of complex envelope.
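The equivalence of the polar and cartesian representations can be illustrated with a short numerical sketch. It converts an (I, Q) pair to amplitude and phase and confirms that A cos(ωc t + φ) and I cos ωc t − Q sin ωc t give the same RF sample. The function names and sample values are assumptions; `math.atan2` is used in place of a plain arctangent of Q/I because it resolves the quadrant, which the ratio alone cannot.

```python
import math


def to_polar(i: float, q: float):
    """Amplitude and phase of the complex envelope I + jQ."""
    return math.hypot(i, q), math.atan2(q, i)


def rf_from_polar(a: float, phi: float, wc_t: float) -> float:
    """RF sample from the polar form: A cos(wc*t + phi)."""
    return a * math.cos(wc_t + phi)


def rf_from_iq(i: float, q: float, wc_t: float) -> float:
    """RF sample from the cartesian form: I cos(wc*t) - Q sin(wc*t)."""
    return i * math.cos(wc_t) - q * math.sin(wc_t)


if __name__ == "__main__":
    i, q = -1.0, 1.0              # a point in the second quadrant
    a, phi = to_polar(i, q)
    print(f"A = {a:.4f}, phi = {math.degrees(phi):.1f} deg")
    for wc_t in (0.0, 0.7, 2.0):  # arbitrary carrier phases in radians
        print(f"wc*t = {wc_t:.1f}: polar {rf_from_polar(a, phi, wc_t):+.6f}, "
              f"IQ {rf_from_iq(i, q, wc_t):+.6f}")
```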
The two formulations of the baseband signal are the basis of polar modulation and quadrature mod-
ulation, where either A(t) and φ(t) or I(t) and Q(t) are modulated, respectively. Polar modulation has
separate paths for amplitude and phase modulations. The envelope has constant amplitude in the phase
modulation path, therefore a switching mode PA can be used. Compared with linear PAs, switching mode
PAs have higher power efficiency, and are less sensitive to antenna impedance variations that are common
especially in hand-held electronic devices [25]. However, polar modulation also has its shortcomings. For
example, due to AM-PM conversion, the amplitude modulation would inevitably cause phase distortion. In
the phase modulation path, although the amplitude of the envelope is constant, since the phase variations
are abrupt in digital communication systems, the resultant waveform would show sharp transitions, which
in the frequency domain would translate into high-frequency spurs. The implementation of simultaneous
amplitude and phase modulation with robust delay mismatch control is difficult and complicated [26], and
simulations reveal that polar modulation would lead to worse linearity performance in terms of spectral
leakage and error vector magnitude (EVM) when compared with quadrature modulation [27]. On the other
hand, simultaneous modulation of amplitude and phase in quadrature modulation systems is not a big issue
because, as will be shown, when decomposed into the I and Q components, the phase modulation also
shows up as amplitude variations and since quadrature signals are orthogonal, they do not interfere with
each other. Also, only one local oscillator is needed, with an additional 90° phase shifter. Because the
symmetrical structure makes it easy to split the signal into, and recombine it from, two independent
components, quadrature modulation is more widely adopted in modern digital
communication systems [26].
III.B.2. Phase-shift Keying (PSK) Modulation: A general PSK modulated signal can be expressed as
\[ s_{RF}(t) = A \cos[\omega_c t + \phi(t)] = A \cos\phi(t) \cos\omega_c t - A \sin\phi(t) \sin\omega_c t \tag{27} \]
Thus the I and Q components are
\[ I(t) = A \cos\phi(t) \tag{28a} \]
\[ Q(t) = A \sin\phi(t) \tag{28b} \]
where the amplitude A is constant. As shown in (28), quadrature modulation converts the phase modulation
to amplitude modulations in the I and Q components. An M-bit digital system would lead to 2^M possible
φ values, and to reduce detection error, they are 2π/2^M radians apart.
The simplest case of PSK is binary phase-shift keying (BPSK), where a one-bit system generates two
phases, φ = 0 and φ = π. From (28), for BPSK (I, Q) = (±A, 0), and thus the constellation diagram
should ideally be that of Fig. 17.
[Figure: I-Q plane with two constellation points on the I axis at −A and A]
Fig. 17. Ideal BPSK constellation diagram.
Quadrature phase-shift keying (QPSK) is a more common modulation scheme, because at the same
symbol rate, it is able to transmit twice as much information as BPSK. Adding more
bits per symbol would further increase the bandwidth efficiency, but as the symbols get closer on the
constellation diagram, the system becomes more prone to errors.
A common choice of φ for QPSK and the corresponding (I,Q) pairs are summarized in Table II, and
the constellation diagram is shown in Fig. 18. Note that with such a choice of phases, the QPSK can be
viewed as a sum of an I-channel BPSK and a Q-channel BPSK.
TABLE II
QPSK Format Summary
Dibit    φ         I          Q
00       π/4       A/√2       A/√2
01       −π/4      A/√2       −A/√2
10       3π/4      −A/√2      A/√2
11       −3π/4     −A/√2      −A/√2
[Figure: I-Q plane with four constellation points at (±A/√2, ±A/√2)]
Fig. 18. Ideal QPSK constellation diagram.
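The PSK constellations discussed above follow directly from I = A cos φ and Q = A sin φ with 2^M phases spaced 2π/2^M apart. The sketch below generates the ideal points for a given number of bits per symbol; the function name and the `phase_offset` parameter are assumptions introduced for illustration (an offset of 0 gives the BPSK points, an offset of π/4 gives the QPSK phases of Table II). The order of the generated points follows increasing phase and does not represent the dibit mapping of the table.

```python
import math


def psk_constellation(bits_per_symbol: int, amplitude: float = 1.0,
                      phase_offset: float = 0.0):
    """Ideal (I, Q) points of a PSK constellation with 2**bits phases."""
    n_points = 2 ** bits_per_symbol
    step = 2.0 * math.pi / n_points  # phase spacing between adjacent symbols
    return [(amplitude * math.cos(phase_offset + k * step),
             amplitude * math.sin(phase_offset + k * step))
            for k in range(n_points)]


bpsk = psk_constellation(1)
qpsk = psk_constellation(2, phase_offset=math.pi / 4.0)

if __name__ == "__main__":
    for name, points in (("BPSK", bpsk), ("QPSK", qpsk)):
        formatted = ", ".join(f"({i:+.3f}, {q:+.3f})" for i, q in points)
        print(f"{name}: {formatted}")
```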
III.B.3. Quadrature Amplitude Modulation (QAM): As its name suggests, QAM is a modulation scheme
where amplitude modulation is applied on both in-phase and quadrature channels, i.e.
\[ s_{RF}(t) = I_m \cos\omega_c t - Q_m \sin\omega_c t \tag{29} \]
where Im and Qm usually both take on M values and thus there are altogether M^2 possible constellation
points. For example, a 4-bit symbol can be transmitted using 16QAM, because two bits would result in
four levels in Im, and the other two would generate four levels in Qm. The constellation diagram of
16QAM is shown in Fig. 19.
[Figure: I-Q plane with 16 constellation points]
Fig. 19. Ideal 16QAM constellation diagram.
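A square QAM constellation can be generated from the per-channel levels, which also gives a first look at why QAM signals are not constant-envelope. The sketch below assumes equally spaced levels (−3, −1, +1, +3) for 16QAM, which is a common convention but an assumption here, and computes the peak-to-average power ratio of the constellation points alone. This figure ignores pulse shaping, so it is only a lower-level indication and not the PAPR of a transmitted waveform.

```python
import math


def square_qam_constellation(levels_per_axis: int):
    """Ideal (I, Q) points of a square QAM constellation.

    Assumes equally spaced odd-integer levels, e.g. -3, -1, +1, +3
    for four levels per axis (16QAM).
    """
    levels = [2 * k - (levels_per_axis - 1) for k in range(levels_per_axis)]
    return [(i, q) for i in levels for q in levels]


def constellation_papr_db(points) -> float:
    """Peak-to-average power ratio of the constellation points, in dB."""
    powers = [i * i + q * q for i, q in points]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


qam16 = square_qam_constellation(4)
qam16_papr_db = constellation_papr_db(qam16)

if __name__ == "__main__":
    print(f"16QAM: {len(qam16)} points, constellation PAPR = {qam16_papr_db:.2f} dB")
```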
III.B.4. Orthogonal Frequency Division Multiplexing (OFDM): The modulation schemes described
so far all modulate the digital bit stream onto a single carrier. OFDM is a method of modulating the
digital information onto multiple carriers. In this scheme, the channel bandwidth is divided into multiple
subchannels, each of which is independently modulated. The symbol duration of each subchannel is set to be
the reciprocal of the subchannel frequency spacing ∆f, so all subchannels are orthogonal to each other
[18].
The major advantage of OFDM is that it is less sensitive to transmission media characteristics. This is
because as the subchannel bandwidth becomes sufficiently small, the medium frequency response can be
approximated to be flat, and according to the Nyquist theorem, the system is free of intersymbol interference
(ISI). Consequently, OFDM is a common choice of modulation format in communication systems where
the attenuation of the channel is severe or there is multipath propagation. The disadvantage of OFDM
modulation is that the resultant high PAPR signals require a linear transceiver, which usually leads
to inferior power efficiency.
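The orthogonality of OFDM subcarriers, and the origin of the high PAPR, can be illustrated with a small numerical sketch. It assumes a symbol duration T = 1/∆f sampled at a fixed number of points, so that subcarrier k is exp(j2πkn/N); the function names, the sample count, and the equal-amplitude worst case are assumptions made for illustration.

```python
import cmath
import math


def subcarrier_inner_product(k: int, m: int, samples: int = 64) -> complex:
    """Normalized inner product of subcarriers k and m over one symbol.

    The symbol spans T = 1 / delta_f, sampled at `samples` points, so the
    result is 1 for k == m and 0 for distinct subcarriers (|k - m| < samples).
    """
    total = sum(cmath.exp(2j * math.pi * (k - m) * n / samples)
                for n in range(samples))
    return total / samples


def worst_case_papr_db(num_subcarriers: int) -> float:
    """Upper bound on the complex-envelope PAPR of equal-amplitude subcarriers.

    When all subcarriers add in phase the peak power is N**2 times that of
    one subcarrier, while the average power is N times, giving a ratio of N.
    """
    return 10.0 * math.log10(num_subcarriers)


if __name__ == "__main__":
    print(f"<3,3> = {abs(subcarrier_inner_product(3, 3)):.3f}")
    print(f"<3,5> = {abs(subcarrier_inner_product(3, 5)):.3e}")
    for n in (4, 16, 64):
        print(f"N = {n:2d}: worst-case PAPR = {worst_case_papr_db(n):.1f} dB")
```

The worst case is rarely reached in practice, but it shows why the peak envelope grows with the number of subcarriers.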
III.C. Wireless Communication Standards
Some of the most common communication standards are introduced here. Among the numerous spec-
ifications laid out in each standard, only those that are most closely related to PA design are discussed
here.
III.C.1. Global System for Mobile Communications (GSM): As a replacement for first-generation (1G)
analog cellular networks, GSM is the first cellular phone standard that is based on digital modulation
[2]. GSM uses the Gaussian minimum-shift keying (GMSK) modulation scheme, which is a variation
of PSK with Gaussian pulse shape. To increase user capacity, it applies time-division multiple access
(TDMA) signalling and frequency-division duplexing (FDD). Each carrier is shared by up to eight users
in separate time slots, and the gross data rate per carrier is about 270 kb/s. In an FDD system, the transmission and reception
of the signals are at different frequencies, thus there is good isolation between the receiver and transmitter.
Since in mobile devices the transmitter and receiver are usually built in close vicinity, even on the same
chip, FDD is employed in many wireless communication systems [28]. Extension of GSM to facilitate
data communications led to the developments of general packet radio services (GPRS) and enhanced
data rates for GSM evolution (EDGE), both considered second-and-half generation (2.5G) standards.
The PA design for GSM applications does not require amplitude linearity because it uses constant-envelope
modulation. Therefore switching-mode PAs can be used, and the main design effort is targeted
at power efficiency optimization. The GSM standard is integrated by the third-generation partnership project
(3GPP) for backward compatibility, and the full set of specifications can be found in [29]. Table III
summarizes the key specifications of GSM that are related to PA design.
TABLE III
GSM PA Specifications
System      Uplink (MHz)     Pout,max (dBm)
GSM 850     824.2 - 849.2    33
GSM 900     880.0 - 915.0    33
GSM 1800    1710 - 1785      30
GSM 1900    1850 - 1910      30
III.C.2. Wireless Local Area Network (WLAN): Based on the IEEE 802.11 standards, WLAN uses
OFDM modulation to reduce sensitivity to multipath effects and has a high data rate. Unlike GSM, WLAN
applies time-division duplexing (TDD). The advantage of TDD is that the transmission and reception
of the signal use the same frequency, thus direct communications between two transceivers, which is
an important feature in WLAN, is made much easier to realize. The major drawback of TDD is that
unintended but strong transmission signals may appear at the same frequency at the receiver, and thus the
linearity requirements for both the transmitter and receiver are stringent.
Due to the use of OFDM modulation, a linear PA is required for WLAN transmitter design. Therefore
both efficiency and linearity are important design targets. Furthermore, since OFDM modulation
usually has a high PAPR (about 10 dB), the power efficiency in the back-off region determines the average
efficiency. The complete specifications of WLAN can be found in [30], and the PA-related specifications
of IEEE 802.11g are listed in Table IV.
TABLE IV
IEEE 802.11g PA Specifications
Specification               Value
Frequency (MHz)             2412 - 2484
No. of carriers             52
Channel bandwidth (MHz)     22
Max. Pout (dBm)             20
Max. EVM (dB)               -25
ACLR (dBc)                  -20 @ 10 MHz offset
                            -28 @ 20 MHz offset
                            -40 @ 30 MHz offset
III.C.3. Wideband Code Division Multiple Access (WCDMA): WCDMA is the primary cellular standard
for third-generation (3G) wireless communications and the most commonly used member of the
universal mobile telecommunications system (UMTS). It uses an FDD scheme and accommodates up to ten
carriers.
Similar to the case of WLAN, the WCDMA standard has stringent linearity requirements [31].
Also, because of multicarrier modulation, it is vital to ensure that the power back-off efficiency is high. The
specifications related to PA design are listed in Table V.
TABLE V
WCDMA PA Specifications
Specification               Value
Uplink (MHz)                1900 - 1980
No. of carriers             10
Channel bandwidth (MHz)     3.84
Max. Pout (dBm)             30
Max. EVM (dB)               -15
ACLR (dBc)                  -33 @ 5 MHz offset
                            -43 @ 10 MHz offset
IV. NONLINEAR EFFECTS OF POWER AMPLIFIER
IV.A. Introduction
Conventional modulations, such as frequency modulation (FM), frequency-shift keying (FSK), and
Gaussian minimum-shift keying (GMSK) have their information stored in the frequency variations, not
the envelope amplitude, and thus do not require linear amplification. If the envelope amplitude does contain
information and thus varies, then linear amplification is required [19]. Of course amplitude modulations
fall into this category, but phase modulation (PM), phase-shift keying (PSK), and multicarrier modulations
such as orthogonal frequency-division multiplexing (OFDM) also have envelope amplitude variations that
need to be preserved throughout the transceiver chain; therefore, linear power amplifiers (PAs) are required.
As the available frequency bands for wireless communications have become increasingly crowded,
bandwidth efficiency has become an important design consideration. As a result, most modern communication
systems use raised-cosine (RC) pulse shaping. The spectrum of an RC filter can be shaped arbitrarily
close to a rectangle, whose bandwidth efficiency is maximum, while maintaining minimal intersymbol
interference (ISI) [18]. Because the RC response is usually split evenly between the transmitter and receiver
filters, the required transmitter pulse shape is root-raised-cosine (RRC), which
results in an envelope with a peak-to-average power ratio (PAPR) of 3-6 dB, depending on the specific
modulation applied [19].
OFDM has become popular in wideband digital communications such as wireless local area
network (WLAN), digital television (DTV), and fourth generation long term evolution (4G LTE), due to
its multi-carrier nature that enables more reliable transmission and reception compared with single-carrier
schemes [18]. The resultant envelope’s PAPR falls in the range of 8-13 dB [19].
In any of the aforementioned modulation schemes with non-constant envelopes, the PA’s linearity
is a crucial design specification. Lack of linearity would distort the amplified signal,
which appears in the frequency domain as unwanted components at frequencies outside the designated
frequency band. This phenomenon is also called spectral regrowth, and its worst case usually
occurs in the adjacent channel. Therefore the adjacent channel leakage ratio (ACLR), defined as the
ratio of the adjacent channel power to the main channel power, is usually one of the toughest tests of a PA’s linearity.
This section will discuss the MOSFET nonlinearity mechanisms, as well as how they are manifested
in single-tone, two-tone, and multi-tone tests. Also, AM-PM conversion will be introduced.
IV.B. MOSFET Nonlinearity Mechanisms
Semiconductor transistors always exhibit nonlinearities to some extent. Fig. 20 illustrates a typical I-V
characteristic of a MOSFET [20]. The transistor conducts negligible current in the cutoff region, where
VGS < Vt, due to the absence of a conducting channel. As the channel is formed above Vt, the transistor
enters the active region, and the IDS − VGS relation would follow a quadratic equation if the long-channel model of
the MOSFET is used, i.e.
\[ I_{DS} = \frac{1}{2}\mu_n C_{ox}\frac{W}{L}\left(V_{GS}-V_t\right)^2 \tag{30} \]
for an NMOS transistor, where µn is the average electron mobility in the channel, Cox is the gate oxide
capacitance per unit area, W and L are the width and length of the channel, respectively, and Vt is the
threshold voltage.
Fig. 20. Illustrative I-V characteristic of a MOSFET. (Axes: IDS versus VGS, with Vt and Imax marked; curves: realistic and square-law.)
It can be seen in Fig. 20 that it does not take long for the I-V characteristics to deviate from the square
law. This is the result of several second-order effects [32]:
a) Channel-length modulation: In the active region, VDS > VGS − Vt, or VGD < Vt. In other words,
the gate-drain voltage is not large enough to sustain the conduction channel at the drain end, i.e. the channel is “pinched
off”. As a result, there exists a depletion region between the pinch-off end of the channel and the drain,
which cuts into the effective channel length. This effect can be modeled by adding a VDS-dependent
term to (30):
\[ I_{DS} = \frac{1}{2}\mu_n C_{ox}\frac{W}{L}\left(V_{GS}-V_t\right)^2\left(1+\lambda V_{DS}\right) \tag{31} \]
where λ is inversely proportional to the effective channel length, to a first-order approximation. According to
(31), this effect is approximately proportional to the drain current and inversely proportional to the channel
length. Therefore the deviation above Vt is apparent when deep sub-micron CMOS transistors in RF PA
applications are delivering a large amount of current. As VGS increases, VDS would decrease, and hence
the depletion region would shrink, resulting in a larger effective channel length, which in turn causes the
drain current to be less than that predicted by the square law.
b) Velocity saturation: At relatively low VGS above Vt, the VDS of the transistor is large. Combined
with a short channel length, the electric field between drain and source becomes very strong, and the
electron drift velocity is no longer proportional to the electric field as in the low-field case, but tends
to saturate. A first-order approximation reveals
\[ v_d = \frac{\mu_n E}{1+E/E_c} \tag{32} \]
where E is the electric field, and Ec is termed the critical field. As a result, the drain current in the
active region becomes
\[ I_{DS} = \frac{1}{2}\mu_n C_{ox}\frac{W}{L}V_{DS(act)}^2 = \frac{1}{2}\mu_n C_{ox}\frac{W}{L}\left[E_c L\left(\sqrt{1+\frac{2\left(V_{GS}-V_t\right)}{E_c L}}-1\right)\right]^2 \tag{33} \]
where the VDS(act) term can be approximated by
\[ V_{DS(act)} = E_c L\left(\sqrt{1+\frac{2\left(V_{GS}-V_t\right)}{E_c L}}-1\right) \approx \left(V_{GS}-V_t\right)\left(1-\frac{V_{GS}-V_t}{2E_c L}\right) \tag{34} \]
In the presence of a strong drain-to-source electric field, EcL is relatively small, and according to (33)
and (34), IDS would also be less than that predicted by the square law.
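As a rough numerical illustration of (30), (33), and (34), the following sketch compares the square-law current with the velocity-saturated current. The values used (a process factor µnCoxW/L of 1 A/V² and EcL = 1 V) are hypothetical and chosen only to make the comparison easy to read.

```python
import math


def ids_square_law(v_ov, k_n=1.0):
    """Square-law drain current (30); k_n stands for mu_n*Cox*W/L, v_ov for VGS - Vt."""
    return 0.5 * k_n * v_ov**2


def vds_act(v_ov, ec_l):
    """Exact VDS(act) expression on the left-hand side of (34)."""
    return ec_l * (math.sqrt(1 + 2 * v_ov / ec_l) - 1)


def vds_act_approx(v_ov, ec_l):
    """First-order approximation on the right-hand side of (34)."""
    return v_ov * (1 - v_ov / (2 * ec_l))


def ids_velocity_saturated(v_ov, ec_l, k_n=1.0):
    """Drain current with velocity saturation, following (33)."""
    return 0.5 * k_n * vds_act(v_ov, ec_l) ** 2


# Hypothetical operating point: overdrive of 0.3 V with EcL = 1 V.
i_square = ids_square_law(0.3)
i_saturated = ids_velocity_saturated(0.3, ec_l=1.0)
```

For any positive overdrive the velocity-saturated current stays below the square-law value, and the approximation in (34) tracks the exact expression closely when VGS − Vt is small compared with EcL.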
c) Mobility degradation: As VGS further increases, the vertical electric field originating from
the gate voltage pulls the carriers in the channel closer to the silicon surface, where surface
imperfections impede their movement. This effect can be modeled as a degradation of the carrier
mobility [33], where the effective mobility is given by
\[ \mu_{eff} = \frac{\mu_n}{1+\theta\left(V_{GS}-V_t\right)} \tag{35} \]
where the parameter θ is inversely proportional to the oxide thickness. As MOS technology scales down,
the oxide thickness shrinks, and thus this effect also causes IDS reductions.
In addition to the effects mentioned above, as VGS further increases, VDS would continue to decrease,
and the transistor would enter the triode region. The drain current would then be determined by the load of
the transistor, and flatten out to the saturated value Imax as the transistor enters the deep triode region.
Referring back to Fig. 20, if the transistor enters the cutoff or deep triode region, there will be strongly nonlinear
effects due to clipping. On the other hand, the PA is then driven to its maximum current capacity, which indicates
good efficiency. Switch-mode PAs operate in the strongly nonlinear regions, resulting in higher
efficiency. However, as mentioned before, modern communication schemes usually have very stringent
linearity standards, which require most PAs to operate between the two extremes in Fig. 20, in what is sometimes
referred to as the weakly nonlinear region; the analyses below therefore focus on this region.
In the weakly nonlinear region, the PA’s input and output can be related by a power series
\[ i_{out} = \sum_{i=1}^{\infty} G_i v_{in}^{i} \tag{36} \]
where Gi are complex coefficients, representing both amplitude and phase nonlinearities, and they include
the memory effects that devices under high frequency excitations usually exhibit. The Volterra series are
formulated in this way and serve as a rigorous nonlinearity analysis tool for RF applications [34, 35].
However, such a formulation becomes too complicated as the transistor models become more sophisticated,
and hence, gives less design insights compared to simpler series. Also, for modern communication systems
that require high linearity, a power series of up to third-order generally gives most of the required
information, assuming the PA operates up to the 1-dB compression point [36–38]. Analysis in this paper
utilizes a real-valued power series up to the third order component. Under this simplification, (36) becomes
\[ i_{out} = g_1 v_{in} + g_2 v_{in}^2 + g_3 v_{in}^3 \tag{37} \]
where g1 represents the linear transconductance gain of the PA, and g2 and g3 represent the second- and
third-order distortion terms, respectively.
IV.C. Single-tone Test: Harmonic Distortion and Gain Compression
IV.C.1. Harmonic Distortion: Consider a voltage signal with constant amplitude va and frequency f0,
also known as a continuous wave (CW), applied to the input of a nonlinear PA. In other words, vin = va cos(2πf0t)
in (37). Using the identities cos²θ = (1/2)(1 + cos 2θ) and cos³θ = (1/4)(3 cos θ + cos 3θ), (37) becomes
\[ i_{out} = g_1 v_a\cos\theta_0 + \frac{1}{2}g_2 v_a^2\left(1+\cos 2\theta_0\right) + \frac{1}{4}g_3 v_a^3\left(3\cos\theta_0+\cos 3\theta_0\right) \tag{38} \]
where, for the sake of conciseness, θ0 = 2πf0t = ω0t. One obvious consequence of a CW signal passing through
a nonlinear PA is that components at multiples of the carrier frequency are generated, and they are termed
harmonic tones. Generally speaking, even-order distortions would generate even-order harmonics up to
their orders, while odd-order distortions would generate odd-order harmonics up to their orders. For
instance, as seen in (38), second-order distortion generates a DC term and a second harmonic, while
third-order distortion generates a distortion term at the fundamental tone, in addition to a third harmonic.
Second harmonic distortion (HD2) is defined as the ratio of the amplitude of the output signal
component at the second harmonic to that of the fundamental. Therefore according to the nonlinear
model in (37),
\[ HD_2 = \frac{1}{2}\left|\frac{g_2}{g_1}\right| v_a \tag{39} \]
Third harmonic distortion (HD3) can be defined similarly, and in the current model,
\[ HD_3 = \frac{1}{4}\left|\frac{g_3}{g_1}\right| v_a^2 \tag{40} \]
Note in (40) that it is assumed the distortion at the fundamental contributed by g3 is negligible, which
is valid in the weakly nonlinear region. In the weakly nonlinear region, HD2 and HD3 are usually very
small and are more conveniently expressed in decibels:
\[ HD_i\big|_{dB} = 20\log HD_i \tag{41} \]
where i = 2, 3, · · · . For RF applications, harmonic distortions do not give very realistic indications of
the system linearity. This is because the harmonic tones are at multiples of the fundamental tone, which
are very far apart. In most of the narrow-band RF systems, the output filter would reduce the harmonic
tones, so the harmonic distortions may appear very small even if the system is substantially nonlinear.
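The closed-form results (39) and (40) can be cross-checked against a direct evaluation of the polynomial model (37). The sketch below is an illustration only: the coefficient values are hypothetical, and the single-bin DFT helper `tone_amplitude` is introduced here for convenience rather than taken from the text.

```python
import math


def tone_amplitude(samples, harmonic):
    """Amplitude of the component at `harmonic` cycles per record (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * harmonic * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * harmonic * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


def harmonic_distortion(g1, g2, g3, va, n=256):
    """Return (HD2, HD3) measured on one period of the output of model (37)."""
    vin = [va * math.cos(2 * math.pi * i / n) for i in range(n)]
    iout = [g1 * v + g2 * v**2 + g3 * v**3 for v in vin]
    fundamental = tone_amplitude(iout, 1)
    return tone_amplitude(iout, 2) / fundamental, tone_amplitude(iout, 3) / fundamental


# Hypothetical coefficients in the weakly nonlinear region.
g1, g2, g3, va = 1.0, 0.1, -0.05, 0.1
hd2_measured, hd3_measured = harmonic_distortion(g1, g2, g3, va)
hd2_formula = 0.5 * abs(g2 / g1) * va          # (39)
hd3_formula = 0.25 * abs(g3 / g1) * va**2      # (40)
```

The small residual difference between the measured and closed-form values comes from the g3 contribution at the fundamental, which (39) and (40) neglect.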
IV.C.2. Gain Compression: Collecting terms at the fundamental in (38) yields
\[ i_{out,fund} = \left(g_1+\frac{3}{4}g_3 v_a^2\right)v_a\cos\theta_0 = \left(g_1+\frac{3}{4}g_3 v_a^2\right)v_{in} \tag{42} \]
which indicates that the gain at the fundamental tone may be greater or less than the gain without distortion.
These phenomena are termed gain expansion and gain compression, respectively. For a MOSFET, g1 and g3 usually
have opposite signs, and the device exhibits gain compression. For a BJT, whose I-V relationship
is exponential, g1 and g3 have the same sign, which results in gain expansion [20]. However, the
gain expansion of a BJT only happens in the close vicinity of the operating point, because as the
input voltage amplitude increases, higher order distortion contributions also come into play, and more
importantly, as shown in Fig. 20, the output current for any practical system would saturate at high input
levels, preventing any possible gain expansion.
Fig. 21. Typical power transfer characteristics of a PA. (Axes: Pout versus normalized Pin, both in dB scale; GP, P1dB, PSAT, the 1 dB/dB slope, and the 1 dB gain drop are marked.)
Therefore the typical power transfer characteristic of a practical PA would look like the one shown in
Fig. 21. At low Pin levels, the PA is mostly linear, i.e. Pout = GPPin, where GP denotes the linear
power gain. On the dB-dB scale plot, the linear portion of the transfer curve would appear to be a straight
line with a slope of 1 dB/dB and y-intercept of GP . As Pin increases, the power gain starts to compress
due to nonlinear, especially odd-order, distortions, and hence Pout starts to deviate from the ideal linear
characteristic, shown by the dashed line. In RF applications, the gain compression is quantified by the
1-dB compression point, defined as the point at which the power gain has dropped by 1 dB from its
small-signal asymptotic value; it can be referred to either the input or the output power. For transmitter building
blocks, the output-referred 1-dB compression point is often used, whereas for receiver building blocks, the
input-referred 1-dB compression point is more common. As Pin further increases, most practical PAs
saturate, reaching the maximum RF power they can generate, PSAT . PSAT indicates the PA’s power
capability, but for linear PAs, P1dB is usually viewed as the upper limit of the PA’s dynamic range.
According to the definition of P1dB and (42), for a PA with third-order distortion, the input amplitude vic that
corresponds to P1dB satisfies
\[ 20\log\left|g_1+\frac{3}{4}g_3 v_{ic}^2\right| = 20\log\left|g_1\right| - 1 \tag{43} \]
which, for g1 and g3 of opposite signs, leads to
\[ v_{ic} = \sqrt{0.145\left|\frac{g_1}{g_3}\right|} \tag{44} \]
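The constant 0.145 in (44) follows from solving (43) for vic, which gives vic² = (4/3)(1 − 10^(−1/20))|g1/g3|. A minimal sketch of this arithmetic is given here; the coefficient values are hypothetical and the function names are introduced only for illustration.

```python
import math

# Factor multiplying |g1/g3| in (44): (4/3) * (1 - 10^(-1/20)), approximately 0.145.
COMPRESSION_FACTOR = (4 / 3) * (1 - 10 ** (-1 / 20))


def compression_amplitude(g1, g3):
    """Input amplitude v_ic at the 1-dB compression point, assuming g1 and g3 have opposite signs."""
    return math.sqrt(COMPRESSION_FACTOR * abs(g1 / g3))


def fundamental_gain_db(g1, g3, va):
    """Gain at the fundamental in dB, from the bracketed term in (42)."""
    return 20 * math.log10(abs(g1 + 0.75 * g3 * va**2))


# Hypothetical coefficients, chosen only for illustration.
g1, g3 = 1.0, -0.1
v_ic = compression_amplitude(g1, g3)
gain_drop_db = fundamental_gain_db(g1, g3, v_ic) - 20 * math.log10(abs(g1))
```

Evaluating the gain of (42) at the computed vic reproduces the 1 dB drop required by (43).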
In summary, under the single-tone condition, second-order distortion results in a signal-dependent DC
offset and a harmonic distortion term at twice the fundamental frequency, both of which can
be easily filtered out by the output filter of a PA. On the other hand, third-order distortion leads
to a nonlinear signal-dependent term at the fundamental frequency in addition to the third-order harmonic
distortion. The distortion term at the fundamental frequency is considered the cause of gain compression
[28]; hence, in terms of linearity considerations, third-order distortion is more critical when designing RF
PAs.
IV.D. Two-tone Test: Intermodulations and Intercept Point
IV.D.1. Intermodulations: Now consider an input signal that consists of two tones, that is,
\[ v_{in} = v_1\cos\omega_1 t + v_2\cos\omega_2 t \tag{45} \]
The nonlinear PA described so far would generate harmonic tones at 2ωi and 3ωi, i = 1, 2, and in addition,
due to the presence of two tones, there will be nonlinear tones at linear combinations of their frequencies.
This nonlinearity is termed intermodulation (IM).
a) Second-order Intermodulations: The second-order IM products are
\[ IM_2 = g_2 v_1 v_2\left[\cos\left(\theta_1+\theta_2\right)+\cos\left(\theta_1-\theta_2\right)\right] \tag{46} \]
where, again to reduce unnecessary complexity, θi = ωit, i = 1, 2. If the two input tones are close to each
other, then the two terms in (46) are at approximately twice the carrier frequency and close to DC, respectively.
In a typical RF PA application, the IM2 products are usually filtered out by the output filter and do not cause
severe nonlinearity problems in the system. Also due to the filtering effects, IM2 does not serve as a
practical linearity metric for RF PAs.
Fig. 22. Third-order intermodulation. (Input spectrum: tones at f1 and f2; output spectrum: additional tones at 2f1 − f2 and 2f2 − f1.)
b) Third-order Intermodulations: The third-order IM products are
\[ IM_3 = \frac{3}{4}g_3\left[v_1^2 v_2\cos\left(2\theta_1\pm\theta_2\right)+v_2^2 v_1\cos\left(2\theta_2\pm\theta_1\right)\right] \tag{47} \]
The (2ω1 + ω2) and (2ω2 + ω1) terms appear close to three times the carrier frequency, if the two
input tones are again assumed to be close. Therefore these terms are again of less importance. However,
(2ω1 − ω2) and (2ω2 − ω1) are very close to the input tones: they are just outside of ω1 and ω2 by the
frequency difference of the two main tones, as illustrated in Fig. 22. Defined as the ratio of the IM3
product at (2ω1 − ω2) to the fundamental at ω1, or of the IM3 product at (2ω2 − ω1) to the fundamental at ω2, the third-order
intermodulation distortion is calculated according to (47) and (38):
\[ IMD_3 = \frac{3}{4}\left|\frac{g_3}{g_1}\right| v_1 v_2 \tag{48} \]
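The two-tone result (48) can be checked in the same way as the single-tone case. In the sketch below, the tone positions (bins 20 and 21 of a 256-point record) and the coefficient values are hypothetical choices made for illustration; the IM3 product of interest then falls at bin 2·20 − 21 = 19.

```python
import math


def tone_amplitude(samples, bin_index):
    """Amplitude of the component at `bin_index` cycles per record (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * bin_index * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * bin_index * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


def imd3_two_tone(g1, g2, g3, v1, v2, k1=20, k2=21, n=256):
    """Ratio of the IM3 product at 2*f1 - f2 to the fundamental at f1 for model (37)."""
    vin = [v1 * math.cos(2 * math.pi * k1 * i / n) + v2 * math.cos(2 * math.pi * k2 * i / n)
           for i in range(n)]
    iout = [g1 * v + g2 * v**2 + g3 * v**3 for v in vin]
    return tone_amplitude(iout, 2 * k1 - k2) / tone_amplitude(iout, k1)


# Hypothetical coefficients and small, equal tone amplitudes.
g1, g2, g3, v1, v2 = 1.0, 0.1, -0.05, 0.1, 0.1
imd3_measured = imd3_two_tone(g1, g2, g3, v1, v2)
imd3_formula = 0.75 * abs(g3 / g1) * v1 * v2   # (48)
```

As in the single-tone case, the two values agree up to the small compression of the fundamental that (48) neglects.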
IV.D.2. Intercept Point: Consider the two input tones to have the same amplitude va, then according to
(47) the IM3 amplitude becomes (3/4)g3va³, whereas by comparison, the fundamental tones would have an
amplitude of g1va. Although the IM3 products are smaller than the fundamental tones when va is small,
they increase in proportion to va³, while the fundamentals increase in proportion to va. The third-order
intercept point (IP3) is defined to be where IM3 and the fundamental have the same amplitude. IP2 can be
similarly defined, but since IM2 is rarely used in PA applications, neither is IP2.
Note that although IP3 indicates a system’s linearity, and thus provides some design insight,
it usually cannot be measured directly, unlike P1dB . This is because as va
increases, the gain compresses and higher-order distortions become significant. IP3 is therefore usually obtained
by extrapolation of measured data. The theoretical IP3 calculation simply sets the IM3 product and the
fundamental to the same amplitude, assuming the two input tones have the same amplitude va; then,
from (38) and (47),
\[ v_{iip3} = \sqrt{\frac{4}{3}\left|\frac{g_1}{g_3}\right|} \tag{49} \]
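Fig. 23. Geometrical interpretation of IP3 calculation. (Axes: Pout versus Pin in dB scale; the extrapolated Pout,fund and IM3 lines meet at (IIP3, OIP3), and x marks the distance from Pin to IIP3.)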
Comparing IP3 with other linearity metrics that describe the third-order distortion, such as IM3 or IMD3
in (47) and (48), it is clear that the advantage of IP3 is that it is independent of the input signal level. This
makes it a popular quantity for comparing the linearity performance of different RF circuits. Comparing viip3 in (49) with
the P1dB input amplitude vic in (44),
\[ IP_3 - P_{1dB} = 10\log\left(\frac{4}{3}\left|\frac{g_1}{g_3}\right|\right) - 10\log\left(0.145\left|\frac{g_1}{g_3}\right|\right) \approx 9.64\ \text{dB} \tag{50} \]
Therefore as a rule of thumb, IP3 is higher than P1dB by about 10 dB. To get IP3 from simulations,
sweeping two large-signal tones and extrapolating would consume a large amount of computing resources.
One way to reduce simulation time is to perform only one two-tone simulation, in which the amplitudes
of the two input tones are identical, and they are set to be small enough that both the fundamental and
third-order distortion terms are in their asymptotic ranges, but large enough so they can be measured
accurately. Then the input-referred IP3 can be calculated as
\[ IIP_3 = P_{in} + \frac{1}{2}\left|IMD_3\right| \tag{51} \]
where all quantities are in dB scale, and Pin is the input power of one of the two input tones. The
derivation of (51) can be explained geometrically using Fig. 23. On logarithmic scales, the fundamental and
third-order power transfer characteristic lines have slopes of 1 dB/dB and 3 dB/dB, respectively.
Therefore from Fig. 23, OIP3 − Pout,fund = x and OIP3 − IM3 = 3x, with x being the power difference
between IIP3 and Pin. Recalling the definition IMD3 = IM3 − Pout,fund, it can be easily deduced that
x = |IMD3| /2, which gives the result in (51).
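Both (50) and (51) reduce to short calculations, sketched here under stated assumptions: the coefficients g1 and g3 and the tone amplitude are hypothetical, and input levels are expressed in dB relative to unit amplitude rather than in dBm.

```python
import math


def iip3_from_two_tone(pin_db, imd3_db):
    """Input-referred IP3 from one small-signal two-tone point, as in (51); values in dB."""
    return pin_db + abs(imd3_db) / 2


def ip3_minus_p1db_db():
    """Gap between IP3 and P1dB for the third-order model, as in (50)."""
    iip3_factor = 4 / 3                            # v_iip3^2 = (4/3)|g1/g3|, from (49)
    p1db_factor = (4 / 3) * (1 - 10 ** (-1 / 20))  # v_ic^2 ~ 0.145|g1/g3|, from (44)
    return 10 * math.log10(iip3_factor / p1db_factor)


# Hypothetical example: g1 = 1, g3 = -0.05, two equal tones of amplitude 0.01.
g1, g3, va = 1.0, -0.05, 0.01
pin_db = 20 * math.log10(va)                              # dB relative to unit amplitude
imd3_db = 20 * math.log10(0.75 * abs(g3 / g1) * va**2)    # (48) with v1 = v2 = va
iip3_db = iip3_from_two_tone(pin_db, imd3_db)
iip3_reference_db = 10 * math.log10((4 / 3) * abs(g1 / g3))  # (49) expressed in dB
```

The single-point estimate from (51) coincides with the closed-form intercept of (49), and the gap function reproduces the 9.64 dB figure of (50).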
Another way to simulate IP3 without consuming a large amount of simulation resources is to use only
one large tone in the two-tone test and sweep this tone only, while the other tone is kept as a small-signal
tone [39]. The caveat of this approach is that, in the IP3 calculation, each IM3 product must be paired with the
fundamental tone that is not adjacent to it: the amplitudes at (2f1 − f2) and f2, or at (2f2 − f1) and f1, are set equal
when extrapolating the transfer curves to obtain IP3. As is apparent from (47), doing so
cancels out the effect of the unequal tone amplitudes and leads to the same result as in (49).
Fig. 24. Typical RF transmitter. (Blocks: digital signal processing, D/A, and PA; the signal spectrum around fRF is sketched along the chain.)
IV.E. Multitone Nonlinear Effects
Linearity is one of the major concerns in current and near future wireless communication systems. Of
all the requirements related to linearity in wireless communication standards, the requirements on spectral
leakage, often in the form of adjacent-channel leakage ratio (ACLR), are usually the most demanding
specifications for radio frequency integrated circuits (RFIC) design.
A typical RF transmitter block diagram is conceptually shown in Fig. 24. Modern communication
standards usually require a certain form of filtering applied to the digital information in order to limit the
bandwidth. However, due to nonlinearity of the RF front-end, the shape of the input signal is not preserved,
which means the spectrum is not limited to the desired bandwidth. This effect is termed spectral regrowth
[28] and is quantified by ACLR, which is defined as the ratio of the integrated power in the adjacent
channel to the power in the transmitted channel [31].
To assess ACLR performance in the design phase, envelope simulations are needed [39]. However, such
a simulation is very expensive in terms of computing resources and simulation time, which is especially
true for RF PA designs because of the large number of active devices involved. Moreover, due to the
random nature of signals used in communications, it is not easy to analytically relate ACLR to PA design
parameters. Multitone signals are similar to the band-limited signals used in communication systems in
the frequency domain, but their simulation is still relatively expensive, and depending on the number of
tones at the input source, it may not be well supported by the simulator [40].
This subsection presents an analysis that relates a multitone ACLR to the two-tone third-order intermodulation
distortion (IMD3) of a nonlinear system, which enables a quick estimate of the PA spectral regrowth
using a fast two-tone simulation in the early design phase.
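Consider an input signal that consists of N identical tones evenly spaced in the frequency domain:
\[ v_{in} = v_a\sum_{n=1}^{N}\cos\left[\omega_0+(n-1)\Delta\omega\right]t \tag{52} \]
where ∆ω denotes the angular frequency spacing. Rearranging (52) reveals more clearly that the normalized
input resembles a modulated signal:
\[
\begin{aligned}
\frac{v_{in}}{v_a} &= \frac{1}{2}\left(\sum_{n=1}^{N} e^{j\left[\omega_0+(n-1)\Delta\omega\right]t} + \sum_{n=1}^{N} e^{j\left[-\omega_0-(n-1)\Delta\omega\right]t}\right) \\
&= \frac{1}{2}\left(e^{j\omega_0 t}\,\frac{1-e^{jN\Delta\omega t}}{1-e^{j\Delta\omega t}} + e^{-j\omega_0 t}\,\frac{1-e^{-jN\Delta\omega t}}{1-e^{-j\Delta\omega t}}\right) \\
&= \frac{1}{2}\left(e^{j\omega_0 t}\,\frac{e^{jN\frac{\Delta\omega}{2}t}}{e^{j\frac{\Delta\omega}{2}t}}\,\frac{e^{-jN\frac{\Delta\omega}{2}t}-e^{jN\frac{\Delta\omega}{2}t}}{e^{-j\frac{\Delta\omega}{2}t}-e^{j\frac{\Delta\omega}{2}t}} + e^{-j\omega_0 t}\,\frac{e^{-jN\frac{\Delta\omega}{2}t}}{e^{-j\frac{\Delta\omega}{2}t}}\,\frac{e^{jN\frac{\Delta\omega}{2}t}-e^{-jN\frac{\Delta\omega}{2}t}}{e^{j\frac{\Delta\omega}{2}t}-e^{-j\frac{\Delta\omega}{2}t}}\right) \\
&= \frac{\sin\left(N\frac{\Delta\omega}{2}t\right)}{\sin\left(\frac{\Delta\omega}{2}t\right)}\cos\left[\left(\omega_0+\frac{N-1}{2}\Delta\omega\right)t\right]
\end{aligned}
\tag{53}
\]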
Clearly $\omega_0+\frac{N-1}{2}\Delta\omega$ is the center frequency of the frequency band from ω0 to (ω0 + (N − 1)∆ω),
which can be viewed as the carrier frequency, and hence the N -tone signal would have an “envelope” of
\[ v_{env} = v_a\,\frac{\sin\left(N\frac{\Delta\omega}{2}t\right)}{\sin\left(\frac{\Delta\omega}{2}t\right)} \tag{54} \]
The envelope magnitude has maxima at t = 2πn/∆ω, n = 0, 1, 2, · · · . For instance, using L’Hôpital’s rule,
it can be shown that
\[ v_{env,max} = \lim_{t\to 0} v_a\,\frac{\sin\left(N\frac{\Delta\omega}{2}t\right)}{\sin\left(\frac{\Delta\omega}{2}t\right)} = N v_a \tag{55} \]
To calculate the root-mean-square (RMS) value of vin composed of N equal-magnitude tones,
Parseval’s identity is used, leading to
\[ v_{in,rms} = \sqrt{2v_a^2\sum_{n=1}^{N}\left(\frac{1}{2}\right)^2} = \sqrt{\frac{N}{2}}\,v_a \tag{56} \]
Combining the results from (55) and (56), the PAPR of an N -tone signal is computed as
\[ PAPR = \frac{P_{max}}{P_{avg}} = \frac{v_{env,max}^2/2}{v_{in,rms}^2} = N \tag{57} \]
where a normalized resistance of 1 Ω is assumed. In other words, (57) reveals that, for example, a
modulated signal with a PAPR of 10 dB can be emulated by a 10-tone signal. Although the frequency
spectrum of a multitone signal may have a similar profile to a modulated signal with similar PAPR, it
is still quite different from the continuous band-limited signal’s spectrum due to the discrete nature of
its spectrum. This is evident when the signals go through a nonlinear system, and spectral regrowth is
observed. As will be shown later, even when two signals show similar ACLR, the spectra can be different
in shape.
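The PAPR result in (57) can be verified numerically by synthesizing an N-tone signal over one period of the tone spacing. The record length and the position of the first tone in the sketch below are hypothetical choices; the peak envelope is read at t = 0, where all tones align as in (55).

```python
import math


def multitone_papr(num_tones, va=1.0, first_bin=100, n=2048):
    """PAPR of an N-tone signal as defined in (57), evaluated over one period of the tone spacing."""
    vin = [
        va * sum(math.cos(2 * math.pi * (first_bin + k) * i / n) for k in range(num_tones))
        for i in range(n)
    ]
    peak_envelope = max(abs(v) for v in vin)       # equals N*va at t = 0, as in (55)
    mean_square = sum(v * v for v in vin) / n      # equals N*va^2/2, as in (56)
    return (peak_envelope**2 / 2) / mean_square    # 1-ohm normalization, as in (57)


papr_10_tones = multitone_papr(10)
papr_10_tones_db = 10 * math.log10(papr_10_tones)
```

For ten tones the ratio evaluates to 10, i.e. 10 dB, consistent with the statement that a 10-tone signal can emulate a modulated signal with a PAPR of 10 dB.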
For wireless communications, the exact definition of adjacent channel, i.e., the frequency offset and
integration bandwidth, depend on the specific communication format, but for multitone analysis, one can
assume that the adjacent channel starts just outside of the main channel, and has the same bandwidth as that
of the main channel. The second-order distortion would not contribute to tones in the adjacent channels for
high-IF narrow-band signals; therefore, only third-order distortions are considered in analyzing multitone
adjacent channel powers (ACP). Substituting vin from (52), the third-order distortion output is
\[ i_{out3} = g_3 v_{in}^3 = g_3 v_a^3\left(\sum_{k=1}^{N}\cos\omega_k t\right)^3 \tag{58} \]
where ωk = ω0 + (k − 1)∆ω.
Next, the amplitude of each tone in the adjacent channel is calculated, and the ACP is the sum of their
output powers. To simplify the analysis, it is assumed the PA operates in the weakly nonlinear regime,
and thus the distortion terms that fall within the passband are assumed to be much smaller than
the amplified signal, and therefore are not accounted for. Also, due to symmetry, only the upper ACP is
calculated, namely the tones at frequencies $\omega_{N+1}$ up to $\omega_{2N-1}$. An extensive analysis was performed in
[37], but phase uncorrelation was assumed, i.e., it was assumed that each distortion tone outside of the
passband is contributed to by nonlinear terms with uncorrelated phases, and thus the distortion amplitude
was computed by vectorially adding all the nonlinear contributions. Although the frequency response of
the PA might result in different phases, the difference is small in most current communication standards.
Therefore, analytical results using the approach in [37] should be considered the best-case scenario. In
this work, the analysis considers the worst-case condition, in which all distortion terms are considered
in-phase. Therefore, for each tone in the adjacent channel, the output currents contributed by
all possible distortion mechanisms are added before the result is squared to calculate the power.
Observation of (58) leads to dividing the iout3 components into three categories: ωk = ωl = ωm,
ωk = ωl ≠ ωm, and ωk ≠ ωl ≠ ωm.
a) Harmonic terms (k = l = m): The output current tones due to third-order distortion are
\[ i_{HD3} = M_1 g_3 v_a^3\sum_{k=1}^{N}\cos^3\omega_k t \tag{59} \]
where M1 denotes the multinomial coefficient, which is unity in this case since all three summations
in (58) need to contribute the term with the same ωk. It can be shown that (59) results in small
tones that affect the magnitude and phase of the in-band tones and thereby degrade the
quality of the final constellation; those effects, however, are not the focus of this paper. These components
also result in third-order harmonic components, which by the narrowband assumption are out of the adjacent
channel range. Therefore these harmonic distortion terms are not considered in this analysis.
b) Intermodulation terms (k = l ≠ m): The output current tones as a result of intermodulation are
\[ i_{IM3} = M_2 g_3 v_a^3\sum_{l=1}^{N}\sum_{\substack{m=1\\ m\neq l}}^{N}\cos^2\omega_l t\,\cos\omega_m t \tag{60} \]
The multinomial coefficient in this case is M2 = C(3, 2) = 3 because, for each pair (l,m), the
coefficient M2 is the number of ways of choosing the 2 summations (that contribute ωl) out of the three summations in
(58). Each term in (60) expands as
\[ \cos^2\omega_l t\,\cos\omega_m t = \frac{1}{2}\left[\cos\omega_m t + \frac{1}{2}\cos\left(2\omega_l+\omega_m\right)t + \frac{1}{2}\cos\left(2\omega_l-\omega_m\right)t\right] \tag{61} \]
The first term in (61) results in tones that are in-band, and the tones resulting from the second term
are out of band. The only term that contributes to the upper adjacent channel spurs is the third term with
$\omega_l > \omega_m$ and $N/2 < l \le N$, and thus the tone positions are $\omega_{IM3} = 2\omega_l - \omega_m = \omega_l + n_{lm}\Delta\omega$,
where $n_{lm}$ is the index difference between the two tones ωl and ωm. Therefore, for ωIM3 to appear in the
adjacent channel, the index difference between ωl and ωm needs to be greater than that between
ωl and the upper end of the channel ωN , or $n_{lm} > n_{Nl}$. If l = N , all 1 ≤ m ≤ N − 1 would result in an
ωIM3 that is in the adjacent channel, at $\omega_{N+1}, \omega_{N+2}, \ldots, \omega_{2N-1}$. If l = N − 1, m = N − 2 would not
result in an ωIM3 that is in the adjacent channel, so mmax = N − 3, and the resulting adjacent channel
tones are at $\omega_{N+1}, \omega_{N+2}, \ldots, \omega_{2N-3}$. This analysis is summarized in Table VI.
TABLE VI
Intermodulation Adjacent Tone Position Analysis
l                        mmax          ωIM3,max index    No. of tones in adj. ch.
N                        N − 1         2N − 1            N − 1
N − 1                    N − 3         2N − 3            N − 3
...                      ...           ...               ...
N − i                    N − 2i − 1    2N − 2i − 1       N − 2i − 1
...                      ...           ...               ...
(N + 1)/2 + 1 (N odd)    2             N + 2             2
N/2 + 1 (N even)         1             N + 1             1
Counting occurrences at each tone in the upper adjacent channel from Table VI reveals that at ωN+p,
the occurrence of IM3 distortion term KIM3,N+p is
KIM3,N+p = ⌈N − p
2
⌉ (62)
where ⌈x⌉ denotes the ceiling function of x, which is defined as the least integer greater than or equal
to x [41], and p = 1, 2, · · · , N − 1. Therefore, the distortion due to IM3 at tones in the upper adjacent
channel would result in
$$ i_{IM3,adj} = \frac{3}{4} g_3 v_a^3 \sum_{p=1}^{N-1} \left\lceil \frac{N-p}{2} \right\rceil \cos\omega_{N+p} t \tag{63} $$
c) Triple-beat terms (k ≠ l ≠ m): The third-order distortion output in this category is
$$ i_{TB} = M_3 g_3 v_a^3 \sum_{\substack{k,l,m=1 \\ k \neq l \neq m}}^{N} \cos\omega_k t \,\cos\omega_l t \,\cos\omega_m t \tag{64} $$
These distortion terms are called triple beat [42], and since each tone in the triple beat is different, the
multinomial coefficient M3 = P (3, 3) = 3! = 6. Expansion of each term in (64) yields
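$$ \cos\theta_k \cos\theta_l \cos\theta_m = \frac{1}{4}\left[\cos\left(\theta_k + \theta_l + \theta_m\right) + \cos\left(\theta_k + \theta_l - \theta_m\right) + \cos\left(\theta_k - \theta_l + \theta_m\right) + \cos\left(\theta_k - \theta_l - \theta_m\right)\right] \tag{65} $$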
where θi = ωit, i = k, l, m. Without loss of generality, assume k > l > m, then with the narrowband assumption, in (65), only the second term has distortion terms that fall in the upper adjacent channel. The first term is approximately three times the passband frequency; the third term corresponds to frequencies at (ωk − nlm∆ω), but since k > l > m, these frequencies all fall in the passband, and the last term results in negative frequencies, but their absolute values are frequencies at (ωm − nkl∆ω), which are either in the passband or in the lower adjacent channel.

TABLE VII
Triple-beat Adjacent Tone Position Analysis
k                      No. terms at ω_N+1    No. terms at ω_N+2    ω_TB,max index
N                      N − 2                 N − 3                 2N − 2
N − 1                  N − 4                 N − 5                 2N − 4
...                    ...                   ...                   ...
N − i                  N − 2i − 2            N − 2i − 3            2N − 2i − 2
...                    ...                   ...                   ...
N/2 + 2 (N even)       2                     1                     N + 2
(N−1)/2 + 2 (N odd)    1                     0                     N + 1
Observation of the second term in (65) reveals that the frequencies of this distortion term are at (ωk +
nlm∆ω). If k = N , then 1 ≤ m < l ≤ N − 1, leading to (N − 1 − 1 = N − 2) pairs of (l,m)
such that nlm = 1, and consequently there would be a distortion term at ωN+1. Similarly, there are
(N − 1− 2 = N − 3) pairs of (l,m) that result in a distortion contribution at ωN+2, and so on. Finally,
the farthest index “distance” between ωl and ωm, in the case of k = N , is (N−2), so ωTB,max = ω2N−2.
When k = N − 1, 1 ≤ m < l ≤ N − 2, there would be (N − 2 − 2 = N − 4) pairs of (l,m) such that
nlm = 2 and thus a distortion contribution at ωTB = ωN+1, and ωTB,max = ω2N−4. Similar analysis
can be carried out for other k values, and the results are summarized in Table VII.
The occurrence of a triple-beat at ωN+p is then computed as
$$ K_{TB,N+p} = \left\lfloor \frac{N-p}{2} \right\rfloor \left\lceil \frac{N-p}{2} \right\rceil \tag{66} $$
where ⌊x⌋ is the floor function of x, which is defined as the greatest integer less than or equal to x [41],
and p = 1, 2, · · · , N−2. Therefore, the triple beat in the upper adjacent channel would result in an output
current of
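$$ i_{TB,adj} = \frac{3}{2} g_3 v_a^3 \sum_{p=1}^{N-2} \left\lfloor \frac{N-p}{2} \right\rfloor \left\lceil \frac{N-p}{2} \right\rceil \cos\omega_{N+p} t \tag{67} $$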
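As mentioned before, if the distortion contributions to each adjacent-channel tone are assumed to be all in phase, then the nonlinear output in the upper adjacent channel is the sum of the results from (63) and (67):
$$ i_{adj,u} = \frac{3}{4} g_3 v_a^3 \sum_{p=1}^{N-1} \left[ \left\lceil \frac{N-p}{2} \right\rceil \left( 1 + 2\left\lfloor \frac{N-p}{2} \right\rfloor \right) \right] \cos\omega_{N+p} t \tag{68} $$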
It can be shown that after some mathematical manipulations and a change of variable, (68) can be simplified to
$$ i_{adj,u} = \frac{3}{4} g_3 v_a^3 \sum_{p=1}^{N-1} \left[ \frac{p\,(p+1)}{2} \right] \cos\omega_{2N-p} t \tag{69} $$
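The counting behind (62), (66), and the simplification from (68) to (69) can be checked by brute force. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, tones are indexed 1 to N, and only the term k + l − m (with k > l > m) is counted for the triple beat, as argued in the text.

```python
from itertools import combinations


def count_im3(n_tones: int, p: int) -> int:
    """Count ordered pairs (l, m), l != m, whose IM3 product 2l - m lands on tone N + p."""
    target = n_tones + p
    tones = range(1, n_tones + 1)
    return sum(1 for l in tones for m in tones if m != l and 2 * l - m == target)


def count_triple_beat(n_tones: int, p: int) -> int:
    """Count triples k > l > m whose product k + l - m lands on tone N + p."""
    target = n_tones + p
    # combinations() yields ascending tuples, so each tuple is (m, l, k) with m < l < k.
    return sum(1 for m, l, k in combinations(range(1, n_tones + 1), 3)
               if k + l - m == target)


def closed_form_counts(n_tones: int, p: int) -> tuple[int, int]:
    """Closed forms (62) and (66), written with integer division."""
    q = n_tones - p
    ceil_half = (q + 1) // 2   # ceil((N - p) / 2)
    floor_half = q // 2        # floor((N - p) / 2)
    return ceil_half, floor_half * ceil_half
```

With q = N − p, the in-phase weight K_IM3 + 2·K_TB of (68) equals q(q + 1)/2, which is the bracketed coefficient of (69) after the change of variable.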
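The adjacent channel power then becomes the sum of the power in each tone, assuming that the tones themselves are all uncorrelated:
$$ \begin{aligned} P_{adj} &= \frac{1}{2}\left(\frac{3}{4} g_3 v_a^3\right)^2 \sum_{p=1}^{N-1} \frac{1}{4}\left(p^2 + p\right)^2 \\ &= \frac{1}{2}\left(\frac{3}{4} g_3 v_a^3\right)^2 \frac{1}{4}\left[\frac{1}{5}(N-1)^5 + (N-1)^4 + \frac{5}{3}(N-1)^3 + (N-1)^2 + \frac{2}{15}(N-1)\right] \\ &= \frac{1}{2}\left(\frac{3}{4} g_3 v_a^3\right)^2 F(N) \end{aligned} \tag{70} $$
where
$$ F(N) = \frac{1}{4}\left[\frac{1}{5}(N-1)^5 + (N-1)^4 + \frac{5}{3}(N-1)^3 + (N-1)^2 + \frac{2}{15}(N-1)\right] $$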
The passband output power is
$$ P_{ch} = \frac{N}{2}\left(g_1 v_a\right)^2 \tag{71} $$
If a two-tone signal with equal amplitude of va is applied to the PA, the IMD3, expressed in decibel, is
$$ \mathrm{IMD3}\big|_{dB} = 10\log\left(\frac{3}{4}\left|\frac{g_3}{g_1}\right| v_a^2\right)^2 \tag{72} $$
The ACLR of an N -tone signal that is applied to the PA can then be obtained from (70) and (71), and
more importantly, it can be related to a two-tone IMD3 described by (72):
$$ \mathrm{ACLR} = 10\log\frac{P_{adj}}{P_{ch}} = \mathrm{IMD3} - 10\log N + 10\log F(N) \tag{73} $$
Analysis in [37] assumes that the phases of all distortion components are random. Such an assumption provides the lower limit of ACLR:
$$ \mathrm{ACLR}_{min} = \mathrm{IMD3} - 10\log N + 10\log G(N) \tag{74} $$
where
$$ G(N) = \frac{1}{12}\left[4N\left(N^2 - 1\right) - 3\left(N^2 - N \bmod 2\right)\right] $$
Therefore, in the design phase, a simple two-tone simulation can be used to get IMD3, and from (73) and (74), one can obtain a basic idea of the multitone ACLR range. This range indicates the ACLR that can be expected from a modulated input signal that has a similar PAPR.
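The two bounds are simple enough to tabulate directly. The sketch below evaluates F(N) and G(N) exactly and returns the ACLR range of (73) and (74) for a given two-tone IMD3; the function names and the IMD3 value used in any call are illustrative assumptions, not data from this work.

```python
from fractions import Fraction
from math import log10


def f_in_phase(n_tones: int) -> Fraction:
    """F(N) from (70): in-phase (worst-case) summation of distortion terms."""
    n = n_tones - 1
    return Fraction(1, 4) * (Fraction(n**5, 5) + n**4 + Fraction(5 * n**3, 3)
                             + n**2 + Fraction(2 * n, 15))


def g_random_phase(n_tones: int) -> Fraction:
    """G(N) from (74): random-phase (power) summation of distortion terms."""
    n = n_tones
    return Fraction(4 * n * (n**2 - 1) - 3 * (n**2 - n % 2), 12)


def aclr_range_db(imd3_db: float, n_tones: int) -> tuple[float, float]:
    """Return (ACLR from (73), ACLR_min from (74)) in dB for a two-tone IMD3 in dB."""
    base = imd3_db - 10 * log10(n_tones)
    worst = base + 10 * log10(float(f_in_phase(n_tones)))
    best = base + 10 * log10(float(g_random_phase(n_tones)))
    return worst, best
```

For N = 2 both functions evaluate to 1, so the two bounds coincide; the gap between them, 10 log[F(N)/G(N)], widens as N grows.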
To verify the analysis of this work, an RF power amplifier was designed and fabricated in TSMC 40 nm
CMOS, and Fig. 25 shows the microphotograph of the chip. The linear PA operates at 1.9 GHz, with a
measured continuous-wave (CW) saturated output power PSAT ≈ 35 dBm and power gain of 38 dB. The
PA itself is designed with three stages, and the last two stages can be switched to improve power efficiency,
but for the purpose of verifying the concept in this paper, the switching functionality is not activated in
the following simulations and measurements. For more details of the design and implementation of the
power amplifier, the readers are referred to [43].
First, a set of simulations is carried out to verify the relationship between multitone ACLR and the
number of tones. The input source to the PA provides a multitone signal, and the Fourier analysis is
performed at the PA output to calculate the ACLR. The input signal’s total bandwidth is kept constant,
and the amplitude of the individual input tones, va, is kept the same as the number of tones N is varied
from 2 to 9. The value of va is chosen such that when the maximum number of tones is reached,
in this case 9, the peak input voltage would not exceed the PA’s saturation limit. Based on the two-tone
test and thus the corresponding IMD3, (73) can be computed and compared to simulation results. The
results are shown in Fig. 26. Note that as a reference, the ACLR predicted by (74) is also included in the
plot. As expected the simulation results are better than the results of (73) but worse than that predicted by
(74). As previously mentioned, the purpose of this analysis is to provide a quick estimate of the ACLR
at the early design phase; therefore, a prediction of the ACLR range should suffice. For large N, the
difference between the results predicted by (73) and (74) can be approximated by taking the dominant
terms in F (N) and G(N) only. Even when N = 10, for instance, the ACLR should be within a 10 dB
window, which is manageable in the early design phase, especially when considering that the time and
resources it takes to get the estimate is fairly little.
Fig. 25. Microphotograph of the chip.
[Plot: ACLR (dB) versus number of tones (2 to 10); curves: ACLRmax, ACLRmin, ACLRsim.]
Fig. 26. Simulated and predicted multitone ACLR as a function of number of tones.
[Plot: normalized output (dB) versus frequency (MHz), 1885 to 1915 MHz; curves: Predicted, Simulation, Measurement.]
Fig. 27. Multitone output spectrum.
As a second testbed, an OFDM signal with a PAPR of about 9 dB is applied to the fabricated PA and
the output power spectrum is measured. According to (57), a 9-tone signal has a similar PAPR; therefore,
the normalized output spectrum calculated according to (69) and the corresponding simulation results are
compared with the measured PA output spectrum, as shown in Fig. 27.
As expected, the theoretical calculations from (69) and (73) represent the worst case, and due to the
phase response of the PA and its output matching network, the simulation results give a better nonlinearity
performance. Although the multitone and OFDM signals have similar PAPR, their spectra differ
in shape, mainly due to the difference between their time-domain envelopes. Nevertheless, the ACLR turns
out to be similar, which again validates the proposed methodology. The simulated 9-tone ACLR is about
-30 dBc, whereas the measurement result is -33 dBc ACLR for the OFDM signal.
The results from simulations and measurements show the validity of the analysis. Therefore, as a
design guideline from a linearity perspective, a two-tone simulation can provide some indications of the
PA linearity in the presence of multitone or even modulated signals. For a specific target modulation
format with a known PAPR, the designers can then offer an educated estimation of the ACLR from a
quick two-tone simulation. If the PA has a strong memory effect, the IM3 products become a function of
the frequency spacing of the two tones and are asymmetric [20]. The cause of the memory effect and the
means to reduce it are outside the scope of this discussion, but to account for the memory effect,
a design margin of about 5 dB in ACLR should be allocated.
IV.F. AM-to-PM Conversion
Previously described nonlinearities can be summarized as amplitude modulation to amplitude modulation (AM-AM) conversion, i.e. amplitude modulation at the input of the PA (or any other nonlinear system) results in disproportional output amplitude modulation. In frequency domain, this phenomenon can be interpreted as the envelope of the modulated signal having additional frequency components at the output, as manifested by previously described intermodulation and triple beat.
Another phenomenon of nonlinearity is amplitude modulation to phase modulation (AM-PM) conversion, in which the PA’s phase response becomes a function of the input amplitude. Nonlinear capacitance of the PA transistor and high-Q matching networks are all possible causes of AM-PM conversion [20, 44, 45]. However, AM-PM conversion is only dominant beyond the 1-dB compression point [20].
To analyze AM-PM conversion, consider an input signal vin = cos θm cos θ, where for simplicity θ = ωt, θm = ωmt, and the input amplitude is normalized to unity. Here ωm represents the envelope frequency and ω is the carrier frequency, and the signal can be decomposed into two tones, one at (ω − ωm) and the other at (ω + ωm). As stated, AM-PM conversion is a function of the input amplitude, and therefore in this case, there would be an additional phase Ψ added to the output of the PA, and Ψ has a period that is half that of the envelope. In the simplest model, assume
$$ \Psi = \frac{\phi}{2}\left(1 + \cos 2\theta_m\right) \tag{75} $$
that is, the normalized phase error has a maximum of φ, an average of φ/2, and a frequency of 2ωm. Assume the phase error is small, i.e. cos Ψ ≈ 1, sin Ψ ≈ Ψ, cos(φ/2) ≈ 1, sin(φ/2) ≈ φ/2, and for simplicity, ignore the amplification of the amplitude, then the output is
$$ \begin{aligned} v_{out} &= \cos\theta_m \cos\left(\theta + \Psi\right) \\ &\approx \cos\theta_m \left(\cos\theta - \Psi\sin\theta\right) \\ &= \cos\theta_m \left(\cos\theta - \frac{\phi}{2}\sin\theta - \frac{\phi}{2}\sin\theta\cos 2\theta_m\right) \\ &\approx \cos\theta_m \left\{\cos\left(\theta + \frac{\phi}{2}\right) - \frac{\phi}{4}\left[\sin\left(\theta + 2\theta_m\right) + \sin\left(\theta - 2\theta_m\right)\right]\right\} \\ &= \cos\theta_m \cos\left(\theta + \frac{\phi}{2}\right) - \frac{\phi}{8}\left[\sin\left(\theta \pm \theta_m\right) + \sin\left(\theta \pm 3\theta_m\right)\right] \end{aligned} \tag{76} $$
Therefore, because of the amplitude dependence of phase distortion, the AM-PM conversion would generate tones at the two input tones at (ω ± ωm) as well as at the third-order intermodulation tones at (ω ± 3ωm), and they would add vectorially to the distortions caused by AM-AM conversion. Although derived using a rather simple model, the above AM-PM results would hold for more complex Ψ and modulation schemes.
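As a numerical sanity check of the small-angle result (76), the sketch below compares the exact output cos θm cos(θ + Ψ), with Ψ taken from (75), against the four-sideband approximation over a grid of carrier and envelope phases. The function names and the grid size are illustrative assumptions.

```python
from math import cos, sin, pi


def vout_exact(theta: float, theta_m: float, phi: float) -> float:
    """Exact output with the phase error model of (75)."""
    psi = 0.5 * phi * (1.0 + cos(2.0 * theta_m))
    return cos(theta_m) * cos(theta + psi)


def vout_approx(theta: float, theta_m: float, phi: float) -> float:
    """Last line of (76): shifted carrier term plus sidebands at +/-1 and +/-3 times theta_m."""
    carrier = cos(theta_m) * cos(theta + 0.5 * phi)
    sidebands = sum(sin(theta + s * theta_m) for s in (+1, -1, +3, -3))
    return carrier - phi / 8.0 * sidebands


def max_error(phi: float, steps: int = 200) -> float:
    """Largest absolute difference between exact and approximate outputs on a phase grid."""
    grid = [2.0 * pi * i / steps for i in range(steps)]
    return max(abs(vout_exact(t, tm, phi) - vout_approx(t, tm, phi))
               for t in grid for tm in grid)
```

The residual is second order in φ, which is what the small-angle assumptions predict.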
V. IMPEDANCE MATCHING NETWORK DESIGN
V.A. Introduction
At RF or microwave frequencies, the electromagnetic wavelength becomes comparable to the length of PCB traces or even on-chip traces. Wave properties, such as reflections, can be significant if there is impedance mismatch [46]. A direct connection between the PA’s output and the antenna is also impractical. If the input impedance of the antenna is 50Ω, a simple calculation shows that an output power of 30 dBm, or 1 W, requires a voltage swing of 10Vpk. Advanced CMOS technology is not able to handle such a voltage swing, hence an impedance matching network that transforms the load impedance at the antenna to a lower value at the output port of the PA is needed. Note that input impedance matching is also needed for measurement purposes, but if the PA is part of a transceiver SoC that has other parts that precede it, then input matching may not be necessary.
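The voltage-swing figure follows from P = Vpk²/(2R) for a sinusoid. A minimal sketch of the arithmetic is given below; the helper names are hypothetical, and the second function simply inverts the same relation to give the load resistance a chosen swing can support.

```python
from math import sqrt


def dbm_to_watt(p_dbm: float) -> float:
    """Convert power in dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


def peak_voltage(p_dbm: float, r_load: float) -> float:
    """Peak sinusoidal voltage needed to deliver p_dbm into r_load: V = sqrt(2 P R)."""
    return sqrt(2.0 * dbm_to_watt(p_dbm) * r_load)


def load_for_swing(p_dbm: float, v_peak: float) -> float:
    """Load resistance that receives p_dbm from a sinusoid of peak voltage v_peak."""
    return v_peak ** 2 / (2.0 * dbm_to_watt(p_dbm))
```

Calling `peak_voltage(30, 50)` reproduces the 10 Vpk quoted above, and `load_for_swing` shows how quickly the required termination resistance drops as the allowed swing is reduced.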
At microwave frequencies, transmission lines can be used to implement impedance matching, whereas at low RF, lumped LC networks are more common due to the otherwise large size of transmission lines. At intermediate or high RF, a hybrid approach can be used.
Basic concepts of impedance matching are reviewed in this section, followed by a review of several basic impedance matching networks. The discussion focuses on PA output impedance matching, although many of the principles apply to input matching as well.
V.B. Impedance Matching Theory
The PA output impedance matching can be conceptually illustrated in Fig. 28. Here RL represents the load impedance, which could be the input impedance of the antenna or measurement probe. Usually it is real and has a value of 50Ω for RF applications or 75Ω in TV systems. Through the impedance matching network, the PA’s output is instead terminated at RT, the termination impedance. Since the PA is one of the most expensive devices in the transceiver in terms of power consumption and area, RT is usually designed for optimal (maximal) output power, therefore its reactance is zero or negligible [20]. The load-pull technique is utilized to determine the RT value for a PA [47], while S-parameter techniques are used for determining RT to optimize other parameters, such as noise, gain, and stability, for small-signal amplifiers [46, 48].

Fig. 28. PA output impedance matching block diagram.

There are several considerations to be taken into account when designing a matching network. First, the relationship between RL and RT needs to be considered. In general, RT can be greater than or less than RL, which could result in different matching topologies. Even within the PA design applications, where RT is usually less than RL, the ratio between the two would determine how many stages of matching should be used. This is because, as will be shown, if a single-stage matching is used, then there is no degree of freedom in terms of the quality factor Q of the matching network. As a result, especially when RL is much larger than RT, a single-stage matching network would have a very high Q, which in turn would make the matching quite narrow-band. A matching network that has a narrow bandwidth may be sensitive to process and component variations. As will be shown later, the matching network has wider bandwidth with more stages.
In addition to matching the impedance at the fundamental frequency, harmonic impedances also need to be well terminated. Usually, because of the lowpass nature of the impedance matching network, the harmonics are filtered out. But for bands that have stringent harmonic requirements, especially bands that are involved in carrier aggregation [49, 50], certain harmonic traps, either in the form of a series harmonic open or a shunt harmonic short, may need to be embedded into the matching network.
The other consideration is the insertion loss of the impedance matching circuit. Generally speaking, the fewer components are used, the lower the loss. Therefore, the insertion loss trades off with the bandwidth of the matching circuit. Also due to this consideration, the impedance matching network mainly consists of low-loss components, such as inductors, capacitors, and transmission lines.
V.C. L-match Network
One of the most basic forms of the impedance matching network consists of a shunt element and a series element, forming an L-shaped impedance transformer. For PA design applications, where RL is usually greater than RT, one element is usually placed in shunt with the load impedance RL, and another in series. Two L-match networks implemented with an inductor and a capacitor are illustrated in Fig. 29.
In principle, to realize a certain impedance transformation, the component values for both versions of
the L-match networks shown in Fig. 29 are the same, and they should have identical in-band performance.
Fig. 29. Lumped-element L-match network, in (a) low-pass and (b) high-pass forms.
But usually the low-pass version is preferred. This is because rejection or attenuation is often needed
at harmonic frequencies, which are all higher than the fundamental tone. Another advantage of the low-
pass L-match, especially if a hybrid approach is to be implemented on PCB, is that the inductors can be
conveniently replaced by microstrip lines, formed simply by PCB traces over the ground plane, and thus
having the low-pass version would enable one to easily change the position of the capacitors along the
lines to “tweak” the network, making it a more flexible approach in practice.
To analyze the lumped version of the L-match impedance transformation network, refer back to the
low-pass version in Fig. 29(a). The parallel combination of RL and C results in an impedance of
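$$ Z_{RC} = \frac{R_L}{1 + j\omega R_L C} = \frac{R_L}{1 + \left(\frac{R_L}{X_C}\right)^2} - jX_C\,\frac{\left(\frac{R_L}{X_C}\right)^2}{1 + \left(\frac{R_L}{X_C}\right)^2} \tag{77} $$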
Therefore the inductance and capacitance are determined such that the real part of ZRC is the desirable
termination resistance, RT , and the imaginary part of ZRC is resonated out by the inductor. Define the
impedance transformation factor m = RL/RT , then according to (77),
$$ X_L = R_T\sqrt{m - 1} \tag{78a} $$
$$ X_C = \frac{R_L}{\sqrt{m - 1}} \tag{78b} $$
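The design equations (78a) and (78b) translate directly into component values. The following sketch assumes the low-pass form of Fig. 29(a) with ideal elements; the function names are hypothetical, and the second function re-evaluates the impedance seen by the PA from (77) plus the series inductor to confirm the transformation.

```python
from math import sqrt, pi


def design_l_match(r_load: float, r_term: float, freq_hz: float) -> dict:
    """Low-pass L-match from (78a)/(78b); requires r_load > r_term."""
    if r_load <= r_term:
        raise ValueError("this form of the L-match needs R_L > R_T")
    m = r_load / r_term                  # impedance transformation factor
    x_l = r_term * sqrt(m - 1.0)         # series inductive reactance, (78a)
    x_c = r_load / sqrt(m - 1.0)         # shunt capacitive reactance, (78b)
    omega = 2.0 * pi * freq_hz
    return {"X_L": x_l, "X_C": x_c, "L": x_l / omega, "C": 1.0 / (omega * x_c)}


def input_impedance(r_load: float, x_l: float, x_c: float) -> complex:
    """Impedance looking into the series inductor with R_L in parallel with C behind it."""
    z_rc = 1.0 / (1.0 / r_load + 1j / x_c)   # parallel R-C, as in (77)
    return 1j * x_l + z_rc
```

For a 50Ω to 5Ω transformation, `design_l_match(50, 5, 1.9e9)` gives XL = 15Ω and XC of about 16.7Ω, and `input_impedance` returns a purely real 5Ω.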
The quality factor Q of the impedance matching network is also an important quantity to take into consideration during design. For a resonant circuit, its Q is equal to the ratio of the resonant frequency to the 3-dB bandwidth:
$$ Q = \frac{f_0}{BW} \tag{79} $$
where f0 denotes the resonance frequency. In other words, Q is inversely proportional to the network bandwidth. Unlike resonators, the impedance matching network is driven by a source: either a real signal source for the input-match case, or the active device that can be modeled as a source for the output-match case. Therefore, in impedance matching applications, a concept of “loaded Q” is introduced when discussing the bandwidth of the network. The loaded Q, QL, is defined as the Q near the matching frequency of the impedance matching network driven by a source with the proper (intended) source impedance, as shown in Fig. 30.

Fig. 30. Concept of loaded Q.
For the lumped L-C L-match, the Q of the L-C network can be calculated as
$$ Q = \frac{R_L}{X_C} = \sqrt{m - 1} \tag{80} $$
So if the circuit is driven by a voltage source with a series resistance of RT as shown in Fig. 31, the equivalent resistance of the network is doubled at resonance, and thus
$$ Q_L = \frac{1}{2}Q = \frac{1}{2}\sqrt{m - 1} \tag{81} $$
Note that if a lumped L-C matching topology is chosen, then once the impedance ratio m is determined, so is the Q. Since RL is usually set at 50 Ω, and RT is chosen to achieve the best power or efficiency performance of the PA, the lumped L-C matching network lacks a degree of freedom to design for the Q.

Fig. 31. Lumped L-match driven by the intended source resistance.
The hybrid version of the L-match network has a transmission line in place of the inductor, shown in
Fig. 32. GT and GL represent the termination and load conductance, respectively. YX is the admittance
of the parallel R-C network, Y0 and d are the characteristic admittance and length of the transmission
line, respectively.
As hinted in Fig. 32, the analysis would be less involved if performed in terms of admittance, conductance, and susceptance. The impedance transformation strategy of this scheme is to use the lumped capacitor to change the admittance seen at the end of the transmission line, in order to let the reflection coefficient ΓX at the right end of the line have the same magnitude as ΓT, the reflection coefficient that corresponds to GT. Then the length of the transmission line is determined such that through the length of d, the phase of ΓX would change to that of ΓT.

Fig. 32. Hybrid L-match network.
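Define n = GT/Y0, then according to the strategy outlined above,
$$ |\Gamma_X| = |\Gamma_T|, \quad\text{or}\quad \left|\frac{Y_0 - G_L - jB_C}{Y_0 + G_L + jB_C}\right| = \left|\frac{Y_0 - G_T}{Y_0 + G_T}\right| = \frac{n-1}{n+1}, \quad\text{or} $$
$$ B_C = G_L\sqrt{\frac{Y_0}{G_L}\,\frac{n^2+1}{n} - \left(1 + \frac{Y_0^2}{G_L^2}\right)} \tag{82} $$
And the length of the transmission line is determined by
$$ \Gamma_X = \frac{Y_0 - Y_X}{Y_0 + Y_X} = \frac{Y_0 - G_L - jB_C}{Y_0 + G_L + jB_C} $$
$$ d = \frac{\lambda}{4\pi}\angle\Gamma_X = \frac{\lambda}{4\pi}\tan^{-1}\left(\frac{2B_C}{G_L\frac{n^2+1}{n} - 2Y_0}\right) \tag{83} $$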
where λ = ceff/f is the wavelength of the electromagnetic wave, and ceff is the effective propagation
speed of light. Similar to (80), the quality factor of this network can be calculated by
$$ Q = \frac{B_C}{G_L} = \sqrt{\frac{Y_0}{G_L}\,\frac{n^2+1}{n} - \left(1 + \frac{Y_0^2}{G_L^2}\right)} \tag{84} $$
By the same logic as in the lumped version, the loaded Q is half of the network Q. This expression is a little more complicated than that of the lumped-element case, but for the special case where GL = Y0, i.e. the transmission line is matched to the load impedance, (84) reduces to
$$ Q = \sqrt{n - 2 + \frac{1}{n}} \tag{85} $$
For large n, (80) and (85) converge. In other words, the bandwidth improvement is negligible for power amplifier output impedance matching applications if the characteristic impedance of the transmission line is the same as the load termination. Therefore lower characteristic impedance is usually implemented to achieve higher bandwidth. Coincidentally, transmission lines with low characteristic impedance are usually wide, which results in a lower parasitic resistance and thus less power loss and better current handling capacity.

Fig. 33. Π-match network in output matching applications.
In other words, the hybrid impedance matching network provides the missing degree of freedom to control the quality factor. For example, suppose an impedance matching network is needed to transform a 50Ω load impedance to 5Ω. If a lumped-element L-match is used, then according to (80), the Q is 3, and cannot be changed because it is determined only by the impedance transformation ratio. If the hybrid L-match is chosen, and the transmission line’s characteristic impedance is assumed to be 50Ω, then according to (85), the Q is calculated to be 2.8. Furthermore, if the characteristic impedance of the transmission line can be varied, then the Q can be changed accordingly. If the transmission line is implemented such that the characteristic impedance is 30Ω, then from (84), the Q is 2.55. Compared with the lumped implementation, the network bandwidth can be increased by about 15%.
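The numbers in this example can be reproduced from (82) and (84). The sketch below is illustrative only: the function name is hypothetical, the line is assumed lossless, and the returned magnitude of ΓX lets one confirm that it equals (n − 1)/(n + 1) as required by the matching strategy.

```python
from math import sqrt


def hybrid_l_match(r_load: float, r_term: float, z0: float) -> tuple[float, float, float]:
    """Return (Q, B_C, |Gamma_X|) of the hybrid L-match from (82) and (84)."""
    g_l, g_t, y0 = 1.0 / r_load, 1.0 / r_term, 1.0 / z0
    n = g_t / y0
    radicand = (y0 / g_l) * (n * n + 1.0) / n - (1.0 + (y0 / g_l) ** 2)
    if radicand < 0.0:
        raise ValueError("no real capacitor susceptance for this Z0")
    q = sqrt(radicand)                      # (84)
    b_c = g_l * q                           # (82)
    gamma_x = (y0 - g_l - 1j * b_c) / (y0 + g_l + 1j * b_c)
    return q, b_c, abs(gamma_x)


q_lumped = sqrt(50.0 / 5.0 - 1.0)               # (80): lumped L-match, 50 ohm to 5 ohm
q_line_50, _, _ = hybrid_l_match(50.0, 5.0, 50.0)
q_line_30, _, _ = hybrid_l_match(50.0, 5.0, 30.0)
```

Here `q_lumped`, `q_line_50`, and `q_line_30` correspond to the three cases discussed above (lumped, 50Ω line, and 30Ω line).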
V.D. Pi-match Network
The output transistor of power amplifiers usually exhibits a relatively large output parasitic capacitance.
In addition, the bond wires connecting the output node on chip to the package have inductance that cannot
be ignored at RF. Therefore a simple L-match design would result in inaccurate impedance transformation
due to lack of such considerations. An additional capacitor C1 is added at the end of the inductor, as shown
in Fig. 33, forming a Π-match network. The output capacitance of the PA transistor can be absorbed into
C1, whereas package and PCB trace capacitance can be absorbed into C2.
Compared with the L-match, the additional element in the Π-match network provides flexibility in choosing the loaded Q, and also results in a larger matchable range on the Smith chart [46]. Therefore the Π-match network is widely used in input-, output-, and interstage-matching applications. As will be shown later, the L-match can be treated as a special case of the Π-match with one of the capacitances being zero. Therefore the following analysis of the Π-match network starts with a more generalized form as shown in Fig. 34, in which the Π-match network is going to match R1 and R2 at a certain frequency, and is driven by a voltage source VS. In the case of input-matching applications, VS can be the RF source and R1 the source impedance, and R2 is the input impedance of the RF circuit; in the case of output-matching applications, VS can be viewed as the Thevenin equivalent of the PA, with R1 being the optimal output impedance of the PA and R2 being the load impedance.

Fig. 34. General schematic for analysis of a Π-match network.
Due to the additional degree of freedom, now the loaded Q of the network can be specified when
designing the Π-match, and so the following analysis assumes R1, R2, QL, and the frequency of interest
are all specified, and would try to derive design equations to determine L, C1, and C2.
First introduce
BC1 = ωC1, BC2 = ωC2, XL = ωL (86)
and define
Q1 = BC1R1, Q2 = BC2R2 (87)
As shown in Fig. 34, define ZA and ZB as the parallel equivalent impedance of {R1, C1} and {R2, C2},
respectively, therefore
ZA = RA − jXA, ZB = RB − jXB (88)
Basic circuit analysis can show that
$$ R_A = \frac{R_1}{1 + Q_1^2}, \quad X_A = R_A Q_1 \tag{89a} $$
$$ R_B = \frac{R_2}{1 + Q_2^2}, \quad X_B = R_B Q_2 \tag{89b} $$
At conjugate match, RA = RB, XL = XA + XB, and since the loaded Q at resonance can be expressed as QL = XL/(RA + RB), the following relationships can be derived from (89):
$$ Q_L = \frac{1}{2}\left(Q_1 + Q_2\right) \tag{90} $$
$$ \frac{R_1}{R_2} = \frac{1 + Q_1^2}{1 + Q_2^2} \tag{91} $$
$$ X_L = R_1\frac{2Q_L}{1 + Q_1^2} = R_2\frac{2Q_L}{1 + Q_2^2} \tag{92} $$
From (90) and (91), the condition on which the Π-match is designable can be derived. For example,
rearranging (91) for an expression of Q2 yields
$$ Q_2^2 = \frac{R_2}{R_1}\left(1 + Q_1^2\right) - 1 \tag{93} $$
For a positive Q2, it is thus required that if R1 > R2, Q1 ≥ √(R1/R2 − 1). Similarly, if R2 > R1, the requirement becomes Q2 ≥ √(R2/R1 − 1). Therefore if expressed in terms of QL from (90), the condition on which the Π-match is designable is that
$$ Q_L \geq \begin{cases} \dfrac{1}{2}\sqrt{\dfrac{R_1}{R_2} - 1} & \text{if } R_1 > R_2 \\[2ex] \dfrac{1}{2}\sqrt{\dfrac{R_2}{R_1} - 1} & \text{if } R_2 > R_1 \end{cases} \tag{94} $$
If the condition in (94) is met, then from (90) and (91), Q1 and Q2 can be solved:
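$$ Q_1 = \frac{2Q_L R_1 - \sqrt{4Q_L^2 R_1 R_2 - \left(R_1 - R_2\right)^2}}{R_1 - R_2} \tag{95} $$
$$ Q_2 = \frac{2Q_L R_2 - \sqrt{4Q_L^2 R_1 R_2 - \left(R_1 - R_2\right)^2}}{R_2 - R_1} \tag{96} $$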
Therefore the design procedure of the Π-match network is the following:
1) With R1, R2, and QL set, check to make sure the condition in (94) is met. If not, reassign their values (usually the QL value) until the condition is met.
2) Once it is verified that the condition in (94) is met, Q1 and Q2 can be calculated from (95) and (96), respectively.
3) Solve for BC1, BC2, and XL from (87) and (92).
4) At the frequency of interest, calculate C1, C2, and L from (86).
Note that by setting one of the capacitances to zero, all the Π-match equations above would become
identical to those of the L-match. This confirms that the L-match can be viewed as a special case of the
Π-match.
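The four-step procedure can be sketched as a short routine. This is an illustrative implementation under the ideal-component assumptions of the analysis; the function name and the dictionary keys are hypothetical, and the equal-resistance case, which (95) and (96) do not cover, is handled by setting Q1 = Q2 = QL as implied by (90) and (91).

```python
from math import sqrt, pi


def design_pi_match(r1: float, r2: float, q_l: float, freq_hz: float) -> dict:
    """Pi-match design following steps 1) to 4): returns Q1, Q2, X_L, C1, C2, L."""
    # Step 1: designability condition (94).
    ratio = max(r1, r2) / min(r1, r2)
    if q_l < 0.5 * sqrt(ratio - 1.0):
        raise ValueError("Q_L violates (94); increase Q_L")
    # Step 2: Q1 and Q2 from (95) and (96).
    if r1 == r2:
        q1 = q2 = q_l
    else:
        root = sqrt(max(0.0, 4.0 * q_l**2 * r1 * r2 - (r1 - r2) ** 2))
        q1 = (2.0 * q_l * r1 - root) / (r1 - r2)
        q2 = (2.0 * q_l * r2 - root) / (r2 - r1)
    # Step 3: susceptances from (87) and inductive reactance from (92).
    b_c1, b_c2 = q1 / r1, q2 / r2
    x_l = 2.0 * q_l * r1 / (1.0 + q1**2)
    # Step 4: component values from (86).
    omega = 2.0 * pi * freq_hz
    return {"Q1": q1, "Q2": q2, "X_L": x_l,
            "C1": b_c1 / omega, "C2": b_c2 / omega, "L": x_l / omega}
```

Choosing QL at the lower limit of (94), for example `design_pi_match(5, 50, 1.5, 1.9e9)`, drives Q1 (and hence C1) to zero and returns XL = 15Ω, which is the L-match result of (78a).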
As stated before, for RF power amplifier applications, R1 and R2 are usually determined by the power and efficiency optimization and load requirement, respectively. The following discussion thus focuses on design considerations of QL.
First, according to its definition, QL determines the bandwidth of the matching network. Therefore, if the bandwidth specification is provided, then
$$ Q_L \leq \frac{f_c}{BW} \tag{97} $$
where fc and BW represent the center frequency and the bandwidth of the band, respectively.
One of the direct tradeoffs in designing QL for the pass-band is the requirement of out-of-band harmonic rejection. A lower QL would provide wider bandwidth, and thus the matching network can be less sensitive to process, voltage, and temperature variations; but high levels of harmonic rejection usually call for higher QL values. Basic network analysis can show that the voltage transfer function of the Π-match network is
$$ H(s) = \frac{V_2}{V_S} = \frac{R_2}{R_1 + R_2}\cdot\frac{1}{1 + s\,\dfrac{L + \left(C_1 + C_2\right)R_1 R_2}{R_1 + R_2} + s^2\,\dfrac{L}{R_1 + R_2}\left(C_1 R_1 + C_2 R_2\right) + s^3 L C_1 C_2\,\dfrac{R_1 R_2}{R_1 + R_2}} \tag{98} $$
Define ωm as the angular frequency at the match condition, then after some mathematical manipulations
of (86), (87), (95), and (96), we have
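$$ \omega_m C_1 = \frac{2Q_L R_1 - \sqrt{4Q_L^2 R_1 R_2 - \left(R_1 - R_2\right)^2}}{R_1\left(R_1 - R_2\right)} \tag{99} $$
$$ \omega_m C_2 = \frac{2Q_L R_2 - \sqrt{4Q_L^2 R_1 R_2 - \left(R_1 - R_2\right)^2}}{R_2\left(R_2 - R_1\right)} \tag{100} $$
$$ \omega_m L = \frac{\left(R_1 - R_2\right)^2}{2Q_L\left(R_1 + R_2\right) - 2\sqrt{4Q_L^2 R_1 R_2 - \left(R_1 - R_2\right)^2}} \tag{101} $$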
Define k = R1/R2. Substituting (99), (100), and (101) into (98), the transfer function can be expressed in terms of ωm, QL, and k. In particular, when the harmonic rejection is of concern, the magnitude frequency response is
$$ \begin{aligned} |H(j\omega)| = {} & 2\left[(k+1)Q_L - \sqrt{4kQ_L^2 - (k-1)^2}\right] \Big/ \\ & \Bigg\{ \left[2(k+1)^2 Q_L - 2(k+1)\sqrt{4kQ_L^2 - (k-1)^2} - 2(k-1)^2 Q_L\left(\frac{\omega}{\omega_m}\right)^2\right]^2 \\ & + \Bigg( \left[3(k-1)^2 - 8kQ_L^2 + 2(k+1)Q_L\sqrt{4kQ_L^2 - (k-1)^2}\right]\left(\frac{\omega}{\omega_m}\right) \\ & + \left[8kQ_L^2 - 2(k+1)Q_L\sqrt{4kQ_L^2 - (k-1)^2} - (k-1)^2\right]\left(\frac{\omega}{\omega_m}\right)^3 \Bigg)^2 \Bigg\}^{1/2} \end{aligned} \tag{102} $$
Substituting ω = ωm in (102), the magnitude response at matching condition is
$$ |H(j\omega_m)| = \frac{1}{2\sqrt{k}} = \frac{1}{2}\sqrt{\frac{R_2}{R_1}} \tag{103} $$
as expected, since the Π-match network can be viewed as an impedance transformer. Define S(ω) = |H(jω)|/|H(jωm)|. Once the impedance transformation ratio k is determined, S(ω, QL, k) can be used to determine the lower limit of QL if a certain harmonic rejection is specified. For instance, the second- and third-order harmonic rejections are often specified, and setting ω = 2ωm and 3ωm in the definition of S(ω) and substituting in (102) and (103), we have
$$ \begin{aligned} S(2\omega_m, Q_L, k) = {} & 2\sqrt{k}\left[(k+1)Q_L - \sqrt{4kQ_L^2 - (k-1)^2}\right] \Big/ \\ & \Bigg\{ \left(\left[(k+1)^2 - 4(k-1)^2\right]Q_L - (k+1)\sqrt{4kQ_L^2 - (k-1)^2}\right)^2 \\ & + \left[24kQ_L^2 - 6(k+1)Q_L\sqrt{4kQ_L^2 - (k-1)^2} - (k-1)^2\right]^2 \Bigg\}^{1/2} \end{aligned} \tag{104} $$
$$ \begin{aligned} S(3\omega_m, Q_L, k) = {} & 2\sqrt{k}\left[(k+1)Q_L - \sqrt{4kQ_L^2 - (k-1)^2}\right] \Big/ \\ & \Bigg\{ \left(\left[(k+1)^2 - 9(k-1)^2\right]Q_L - (k+1)\sqrt{4kQ_L^2 - (k-1)^2}\right)^2 \\ & + \left[96kQ_L^2 - 24(k+1)Q_L\sqrt{4kQ_L^2 - (k-1)^2} - 9(k-1)^2\right]^2 \Bigg\}^{1/2} \end{aligned} \tag{105} $$
Therefore, once these harmonic rejection levels are specified, we have the second constraint on QL:
$$ Q_L \geq \max\{Q_{L,H2},\, Q_{L,H3}\} \tag{106} $$
where QL,H2 and QL,H3 are the QL values that correspond to the specified second- and third-order harmonic rejections, obtained graphically from (104) and (105), respectively.
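Instead of reading the rejection off a plot, the ratio S(ω) can be evaluated numerically from the transfer function (98) with the normalized component values of (99) to (101). The sketch below assumes ideal components and R1 ≠ R2; the function names are hypothetical, and frequency is expressed as the ratio ω/ωm.

```python
from math import sqrt


def pi_match_normalized(r1: float, r2: float, q_l: float) -> tuple[float, float, float]:
    """Return (wm*C1, wm*C2, wm*L) from (99)-(101); assumes r1 != r2 and (94) is met."""
    root = sqrt(max(0.0, 4.0 * q_l**2 * r1 * r2 - (r1 - r2) ** 2))
    b1 = (2.0 * q_l * r1 - root) / (r1 * (r1 - r2))
    b2 = (2.0 * q_l * r2 - root) / (r2 * (r2 - r1))
    x_l = (r1 - r2) ** 2 / (2.0 * q_l * (r1 + r2) - 2.0 * root)
    return b1, b2, x_l


def transfer_magnitude(r1: float, r2: float, q_l: float, w_norm: float) -> float:
    """|H(jw)| of (98) evaluated at w = w_norm * wm."""
    b1, b2, x_l = pi_match_normalized(r1, r2, q_l)
    s = 1j * w_norm                       # s normalized to wm
    r_par = r1 * r2 / (r1 + r2)
    den = (1.0
           + s * (x_l / (r1 + r2) + (b1 + b2) * r_par)
           + s**2 * x_l * (b1 * r1 + b2 * r2) / (r1 + r2)
           + s**3 * x_l * b1 * b2 * r_par)
    return abs(r2 / (r1 + r2) / den)


def harmonic_rejection(r1: float, r2: float, q_l: float, harmonic: int) -> float:
    """S(harmonic * wm) = |H| at the harmonic divided by |H| at the match frequency."""
    return transfer_magnitude(r1, r2, q_l, harmonic) / transfer_magnitude(r1, r2, q_l, 1.0)
```

Sweeping `q_l` in `harmonic_rejection` until the specified second- and third-harmonic levels are met gives QL,H2 and QL,H3 for use in (106).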
Another design consideration for specifying QL involves the parasitic effect. Assume capacitors are less lossy, and the majority of the loss comes from the parasitic resistance of the inductor, which is usually the case. The Π-match network is redrawn in Fig. 35, with this parasitic resistance r explicitly shown. The parasitic resistance of the inductor has two effects that would contribute to non-ideal power transfer: a) it causes direct power loss of the network, and b) indirectly, it causes mismatch of the impedance, thus a portion of the RF power is reflected back and does not reach the output. We denote the network power loss and the reflected power to be Ploss and Prefl, respectively. Also, define PAV as the available power from the source VS, Pin as the input power at the Π-match network, and Pout as the output power dissipated at R2. Therefore, due to conservation of energy, we have
$$ P_{AV} = P_{in} + P_{refl} \tag{107} $$
$$ P_{in} = P_{out} + P_{loss} \tag{108} $$

Fig. 35. Π-match network with inductor’s parasitic resistance explicitly shown.
To analyze the circuit near matching condition, consider the Thevenin equivalent circuit shown in Fig. 36, where VTh represents the Thevenin equivalent voltage, and RA, RB, XA, and XB are defined in (88).

Fig. 36. Thevenin equivalent circuit of the Π-match network near matching frequency.
Also define the “unloaded Q”, Qu = XL/r, and since at matching condition, by design, RA = RB, XL = XA + XB, and by definition QL = XL/(RA + RB), we have
$$ r = \frac{X_L}{Q_u} = 2R_A\frac{Q_L}{Q_u} \tag{109} $$
$$ I = \frac{V_{Th}}{R_A + R_B + r + j\left(X_L - X_A - X_B\right)} = \frac{V_{Th}}{2R_A}\cdot\frac{1}{1 + \frac{Q_L}{Q_u}} \tag{110} $$
By definition [46], the available power can be calculated as PAV = V²Th/(4RA). Therefore by using (109) and (110) along with Pout = |I|²RB and Ploss = |I|²r, we have
$$ \frac{P_{loss}}{P_{AV}} = \frac{2Q_L/Q_u}{\left(1 + Q_L/Q_u\right)^2} \tag{111} $$
Fig. 37. Multisection matching network.
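$$ \frac{P_{refl}}{P_{AV}} = \frac{\left(Q_L/Q_u\right)^2}{\left(1 + Q_L/Q_u\right)^2} \tag{112} $$
$$ \frac{P_{out}}{P_{in}} = \frac{1}{1 + 2Q_L/Q_u} \tag{113} $$
$$ \frac{P_{out}}{P_{AV}} = \frac{1}{\left(1 + Q_L/Q_u\right)^2} \tag{114} $$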
Obviously, if the inductor is lossless, i.e. Qu is infinite, then we have Ploss = Prefl = 0 and Pin =
Pout = PAV . If the inductor is implemented off-chip, Qu is usually in the range of 30 to 80.
For power amplifier matching purposes, especially for wireless communication applications, QL is usually
kept at or below 5 to satisfy the bandwidth requirement in (97). Therefore, it is a valid assumption that QL/Qu ≪ 1.
Comparing (111) and (112) under this assumption, Ploss is first order in QL/Qu whereas Prefl is second order,
so the direct power loss term, Ploss, is the dominant power loss mechanism due to parasitics, and (113) and (114) converge.
If the insertion loss of the matching network is defined as
\[ IL = 10\log\frac{P_{AV}}{P_{out}} \tag{115} \]
then when the available inductor quality factor is given (or its range is known), there is an additional
design constraint on QL if the minimum insertion loss is specified:
\[ Q_L \le Q_u\left(10^{IL_{min}/20} - 1\right) \tag{116} \]
In summary, the choice of QL can be narrowed down by (97), (106), and (116). Once its value is
decided, the design procedure outlined in this section can be used to design the component values of the
Π-match network.
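The loss relations (111) to (116) are easy to evaluate numerically. The following is a minimal sketch rather than part of the original design flow; the function names `matching_loss` and `max_loaded_q` and the example values QL = 5 and Qu = 50 are chosen here purely for illustration.

```python
import math


def matching_loss(q_loaded: float, q_unloaded: float) -> dict:
    """Evaluate (111)-(115) for a given loaded and unloaded Q."""
    x = q_loaded / q_unloaded                # the ratio QL/Qu
    p_loss = 2 * x / (1 + x) ** 2            # (111): Ploss / PAV
    p_refl = x ** 2 / (1 + x) ** 2           # (112): Prefl / PAV
    p_out = 1 / (1 + x) ** 2                 # (114): Pout / PAV
    return {
        "loss": p_loss,
        "reflected": p_refl,
        "out_over_in": 1 / (1 + 2 * x),      # (113): Pout / Pin
        "out": p_out,
        "il_db": 10 * math.log10(1 / p_out), # (115): insertion loss in dB
    }


def max_loaded_q(q_unloaded: float, il_db: float) -> float:
    """Upper bound on QL from (116) for a specified insertion loss in dB."""
    return q_unloaded * (10 ** (il_db / 20) - 1)


if __name__ == "__main__":
    result = matching_loss(q_loaded=5, q_unloaded=50)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
    print("QL bound:", max_loaded_q(50, result["il_db"]))
```

With QL/Qu = 0.1 the three fractions of PAV (delivered, lost, and reflected) add up to one, and the reflected part is an order of magnitude smaller than the direct loss, in line with the discussion above.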
V.E. Multisection Matching Network
As the name suggests, multisection matching consists of multiple sections of the basic matching
topologies mentioned above. Fig. 37 shows a multisection matching network that is the cascade of L-match
sections.
Fig. 38. Schematic of a 2-stage matching network.
Generally speaking, the more stages a matching network has, the wider the matching
bandwidth [46]. Therefore, for a broadband match it is usually desirable to implement a multisection match.
However, each additional component in a multisection matching network introduces more power
loss. As a compromise, a 2-stage matching network is usually used to achieve a wider bandwidth, as shown in
Fig. 38.
The 2-stage matching network converts the load impedance RL to an intermediate impedance, Rm,
before it is then converted to the termination impedance RT . Although in theory the choice of Rm can be
arbitrary, in practice a general rule of thumb is to set Rm such that the impedance transformation ratio
of the two stages stays the same:
\[ \frac{R_L}{R_m} = \frac{R_m}{R_T} \tag{117} \]
This way the voltage swing progresses evenly through the two stages, resulting in less stress to the
components [20].
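As a quick illustration of (117), the sketch below computes the intermediate impedance as the geometric mean of RL and RT and compares the per-stage quality factor with that of a single-stage match. It assumes the standard L-match relation Q = sqrt(Rhigh/Rlow − 1); the helper names and the 50 Ω to 2 Ω example are hypothetical and chosen only for illustration.

```python
import math


def l_match_q(r_a: float, r_b: float) -> float:
    """Q of one L-match section, assuming Q = sqrt(R_high / R_low - 1)."""
    r_high, r_low = max(r_a, r_b), min(r_a, r_b)
    return math.sqrt(r_high / r_low - 1)


def two_stage_plan(r_load: float, r_term: float) -> dict:
    """Pick Rm from (117), i.e. RL / Rm = Rm / RT, and report the stage Qs."""
    r_mid = math.sqrt(r_load * r_term)       # geometric mean satisfies (117)
    return {
        "r_mid": r_mid,
        "q_single_stage": l_match_q(r_load, r_term),
        "q_stage_1": l_match_q(r_load, r_mid),
        "q_stage_2": l_match_q(r_mid, r_term),
    }


if __name__ == "__main__":
    for name, value in two_stage_plan(r_load=50.0, r_term=2.0).items():
        print(f"{name}: {value:.3f}")
```

For this example the single-stage match needs a Q of about 4.9, whereas each of the two stages needs only 2, which is consistent with the wider bandwidth expected from the 2-stage network.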
VI. POWER AMPLIFIER ARCHITECTURES
VI.A. Introduction
As indicated in previous sections, the design of RF power amplifiers faces the tradeoff between linearity
and efficiency. Therefore, techniques and architectures to achieve linearity with high-efficiency switching
PAs and to improve the power efficiency of linear PAs have been active research and development topics.
To accommodate more functionalities in battery-powered electronic devices, it is required for the PAs to
have high average efficiency. On the other hand, to meet the increasingly stringent linearity requirements
in modern communication standards, linearization is often needed even for linear PA classes. In general,
architectures and techniques to achieve the above goals fall into two categories: efficiency enhancement
techniques and linearization architectures. This section serves as an overview of these previous solutions.
VI.B. Efficiency Enhancement Techniques
Power efficiency would not be a major design challenge if the modulation scheme did not involve
variations in the envelope amplitude, because in that case a switching-mode PA can be used, and the
theoretical efficiency can reach 100%. Therefore in frequency modulation (FM) and second-generation
(2G) wireless communication standards, where the transmission envelope has constant amplitude, much
research effort was put into switching PA implementations to achieve power efficiency as close
to the theoretical value as possible.
To achieve high average power efficiency in modern electronic devices, it is important to improve
efficiency in the power back-off (PBO) region because of the high peak-to-average power ratio (PAPR)
modulation schemes used. Therefore, the development and implementation of efficiency enhancement techniques
have been very active research areas. On one hand, techniques developed a long time ago are
implemented using modern technologies; on the other hand, new techniques are developed to improve
efficiency.
VI.B.1. Outphasing Modulation: The basic idea of outphasing modulation, as originally proposed by
Chireix [51], is to embed the envelope amplitude variations into the phases of two paths, so that a
switching-mode PA can be used in each path. The original amplitude modulation is recovered after a voltage
subtractor at the output of the two paths, as shown in Fig. 39.
Fig. 39. Conceptual schematic of outphasing modulation.
Suppose the output voltage amplitude of each PA is V , then we have
\[ v_1 = V(\cos\phi_m \cos\omega t - \sin\phi_m \sin\omega t) \tag{118a} \]
\[ v_2 = V(-\cos\phi_m \cos\omega t - \sin\phi_m \sin\omega t) \tag{118b} \]
Therefore the output voltage is
\[ v_{out} = 2V\cos\phi_m \cos\omega t = A(t)\cos\omega t \tag{119} \]
In other words, the amplitude modulation A(t) is embedded into the phase modulations applied to the
PA in each path,
\[ \phi_m = \cos^{-1}\left[\frac{A(t)}{2V}\right] \tag{120} \]
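The decomposition in (118) to (120) can be checked numerically. The sketch below is only an illustration under the ideal assumptions of the text (identical branch amplitudes, ideal subtraction); the function names and the sample values are hypothetical.

```python
import math


def outphasing_phase(envelope: float, v_branch: float) -> float:
    """Phase phi_m from (120); requires 0 <= envelope <= 2 * v_branch."""
    return math.acos(envelope / (2 * v_branch))


def branch_voltages(envelope: float, v_branch: float, wt: float):
    """Constant-amplitude branch signals v1 and v2 of (118) at phase wt."""
    phi = outphasing_phase(envelope, v_branch)
    v1 = v_branch * math.cos(wt + phi)       # (118a) written as one cosine
    v2 = -v_branch * math.cos(wt - phi)      # (118b) written as one cosine
    return v1, v2


if __name__ == "__main__":
    envelope, v_branch = 1.2, 1.0
    for wt in (0.0, 0.7, 2.5):
        v1, v2 = branch_voltages(envelope, v_branch, wt)
        # The subtractor output should equal A * cos(wt), as in (119).
        print(f"wt={wt}: v1 - v2 = {v1 - v2:.6f}, "
              f"A*cos(wt) = {envelope * math.cos(wt):.6f}")
```

Both branch signals keep the constant amplitude V regardless of the envelope, which is what allows switching-mode PAs to be used in each path.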
Although both PAs in an outphasing system can be implemented in switching mode, which makes each of them
highly efficient, the overall power efficiency still decreases in proportion to the power back-off.
Since the DC power is constant, the shape of the efficiency as a function of PBO is actually the same
as that of class A PAs, with the only difference being the maximum efficiency value. There is one further
difference between the two systems: in an outphasing system, since the PAs are switching-mode
and highly efficient themselves, they dissipate less heat, so an outphasing system may have an advantage
in reliability. In addition, since the power loss in the back-off region is not in the form of
heat dissipation, but simply a result of power addition or subtraction, it is possible that this power can be
recycled.
To improve power efficiency in the back-off region in outphasing systems, it is necessary to modify
the load network. If the PAs are modeled as voltage sources, the load without compensation is shown in
Fig. 40. Using the phasor notation, the output voltages of the PAs and the load current can be expressed
Fig. 40. Demonstration of outphasing load configuration without compensation.
as
\[ V_1 = V(\cos\phi_m + j\sin\phi_m) \tag{121a} \]
\[ V_2 = V(-\cos\phi_m + j\sin\phi_m) \tag{121b} \]
\[ I_L = \frac{V_1 - V_2}{R_L} = \frac{2V\cos\phi_m}{R_L} \tag{121c} \]
Therefore the effective output impedances of the PAs are
\[ Z_1 = \frac{V_1}{I_L} = \frac{R_L}{2}(1 + j\tan\phi_m) \tag{122a} \]
\[ Z_2 = \frac{V_2}{-I_L} = \frac{R_L}{2}(1 - j\tan\phi_m) \tag{122b} \]
Taking PA1 as an example, as a result of outphasing modulation, its output impedance is effectively
half the load resistance in series with an inductance whose value depends upon the envelope amplitude. To
compensate for this, a capacitor can be shunted at its output, with its value determined by the efficiency
improvement required at a given power back-off. A shunt inductor can be placed at the output of
PA2 accordingly. The resultant schematic is shown in Fig. 41, and the complete analysis is carried out
in [51, 52].
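The impedances in (122) follow directly from the phasors in (121), and a small numerical sketch makes the reactive part visible. The susceptance helper below is not taken from the cited analysis: it is simply the value that cancels the imaginary part of 1/Z1 at one chosen outphasing angle, obtained by inverting (122a). All function names are hypothetical.

```python
import math


def outphasing_loads(phi: float, r_load: float, v: float = 1.0):
    """Impedances seen by PA1 and PA2, computed from the phasors in (121)."""
    v1 = v * complex(math.cos(phi), math.sin(phi))     # (121a)
    v2 = v * complex(-math.cos(phi), math.sin(phi))    # (121b)
    i_load = (v1 - v2) / r_load                        # (121c)
    return v1 / i_load, v2 / (-i_load)                 # (122a), (122b)


def shunt_susceptance(phi_design: float, r_load: float) -> float:
    """Capacitive susceptance that cancels Im(1 / Z1) at phi_design.

    Assumption: follows from 1 / Z1 = (2 / RL) * cos(phi) * exp(-j * phi),
    whose imaginary part is -sin(2 * phi) / RL.
    """
    return math.sin(2 * phi_design) / r_load


if __name__ == "__main__":
    phi, r_load = math.pi / 4, 50.0
    z1, z2 = outphasing_loads(phi, r_load)
    print("Z1 =", z1, " Z2 =", z2)
    y1_compensated = 1 / z1 + 1j * shunt_susceptance(phi, r_load)
    print("compensated admittance at PA1:", y1_compensated)
```

At the design angle the compensated admittance is purely real; away from that angle a residual reactive part remains, which is why the compensation targets a particular back-off level.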
Fig. 41. Outphasing load compensation network.
The circuit shown in Fig. 41 is still not practical since the midpoint of RL is not a virtual ground. To
alleviate this problem, an alternative shown in Fig. 42 was developed by Raab [53], where quarterwave
transmission lines are placed at the outputs of the PAs, and the PA outputs are summed in the current
domain. The analysis is similar and is not repeated here.
The disadvantage of outphasing technique is that the required digital resource to embed the amplitude
Fig. 42. Outphasing technique implemented in the current domain.
modulation into phase may be large, even in the context of the current DSP capabilities. Also, mismatch
between the two paths would result in degradation in both the linearity and system efficiency. Furthermore,
the load compensation network component values, as shown in (122), are a function of the embedded
phase modulation φm, but as described in (120), φm is an inverse trigonometric function of the amplitude
modulation A(t), and a relatively large slope of inverse trigonometric functions indicates that the combiner
network design could be sensitive to component value variations.
VI.B.2. Doherty Amplifier: Originally proposed by Doherty [54], the Doherty amplifier is an architecture
that improves efficiency in the power back-off region while maintaining the system linearity. In the conceptual
schematic shown in Fig. 43, PA1 and PA2 are called the carrier and peaking amplifiers, respectively. The
purpose of the quarterwave line before the peaking amplifier path is to avoid delay mismatch, whereas
the one after the carrier amplifier serves as an impedance inverter. When the power level is low, only
the carrier amplifier is activated. Conventionally implemented in class B mode, the carrier PA’s output
power increases linearly with the input. At a transition point, the carrier PA saturates and the peaking
amplifier starts to conduct. As the input power continues to increase, the output currents of both PAs
increase, but because of the impedance inverter, the output impedance of the carrier PA decreases, and its
output power continues to increase while its output voltage stays the same. If the output voltage at the
transition point is half of that at the maximum output power, then in this medium power region, the carrier
PA remains at its maximum efficiency, whereas the efficiency of the peaking PA increases from half of
its maximum efficiency at the transition point to maximum at the maximum output power. Therefore the
system efficiency would reach maximum both at the maximum output power and at the transition point,
and stays relatively high in the medium power region, thus the efficiency at power back-off is improved.
To analyze the Doherty operation, it is important to understand how the quarterwave transmission line
can be used as an impedance inverter. Given an arbitrary transmission line with length d and characteristic
Fig. 43. Conceptual schematic of a Doherty PA.
impedance Z0, connected to a load impedance ZL, the input impedance is given by
\[ Z_{in}(d) = Z_0\,\frac{Z_L + jZ_0\tan\beta d}{Z_0 + jZ_L\tan\beta d} \tag{123} \]
where β = 2π/λ and λ is the effective electromagnetic wavelength in the transmission line [46]. If the
length of the line is a quarter of the wavelength, then simplification of (123) leads to
\[ Z_{in}(\lambda/4)\cdot Z_L = Z_0^2 \tag{124} \]
In other words, the impedances at the two ends of a quarterwave line are inversely proportional to each
other, hence the quarterwave line is viewed as an impedance inverter in this regard.
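The impedance-inverting property can be verified with a few lines of code. The sketch below evaluates (123) after multiplying numerator and denominator by cos βd, which is algebraically equivalent but avoids the singular tangent at a quarter wavelength; the function name and the 50 Ω and 25 Ω values are chosen only for illustration.

```python
import math


def line_input_impedance(z0: float, z_load: complex, beta_d: float) -> complex:
    """Input impedance of a lossless line, equivalent to (123).

    beta_d is the electrical length in radians (pi / 2 for a quarter wave).
    """
    c, s = math.cos(beta_d), math.sin(beta_d)
    return z0 * (z_load * c + 1j * z0 * s) / (z0 * c + 1j * z_load * s)


if __name__ == "__main__":
    z0, z_load = 50.0, 25.0
    z_in = line_input_impedance(z0, z_load, math.pi / 2)
    print("quarter-wave Zin:", z_in)            # close to Z0^2 / ZL
    print("Zin * ZL:", z_in * z_load, " Z0^2:", z0 ** 2)
    print("zero-length Zin:", line_input_impedance(z0, z_load, 0.0))
```

A load below Z0 is transformed to an input impedance above Z0 and vice versa, which is the behavior the Doherty analysis relies on.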
For simplicity, assume the transition point occurs when the output voltage is half of the maximum, i.e.
6 dB back-off from the maximum output power. The analysis in [55] considers a more general case. If
both PAs are modeled as current sources, then in the medium power region, the Doherty system can be
modeled as a circuit shown in Fig. 44.
Fig. 44. Simplified Doherty PA for analysis.
Assume that both PAs have the same maximum current Imax, so that the RF current amplitudes of the
PAs at maximum output power are I1 = I2 = Imax/2. In the medium power region, formulate the currents
from the PAs to be
\[ I_1 = \frac{I_{max}}{4}(1 + k) \tag{125a} \]
\[ I_2 = \frac{I_{max}}{2}\,k \tag{125b} \]
where k is a parameter that ranges between 0 and 1, with k = 0 corresponding to the transition point and
k = 1 the maximum output power. If the quarterwave line is lossless, then its input and output voltage
and current relationships are
\[ V_1 I_1 = V_L I_3 \tag{126a} \]
\[ \frac{V_1}{I_1}\cdot\frac{V_L}{I_3} = Z_0^2 \tag{126b} \]
Consequently the output current of PA1 after the quarterwave line is
\[ I_3 = \frac{V_1}{Z_0} \tag{127} \]
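The critical concept of the Doherty amplifier can be revealed by calculating the impedances Z2 and Z3:
\[ Z_2 = \frac{V_L}{I_2} = \frac{V_L}{I_L}\cdot\frac{I_L}{I_2} = R_L\left(1 + \frac{I_3}{I_2}\right) \tag{128a} \]
\[ Z_3 = \frac{V_L}{I_3} = R_L\left(1 + \frac{I_2}{I_3}\right) \tag{128b} \]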
Thus Z3 increases as the peaking PA starts to conduct. Due to the impedance inversion by the quarterwave
line, Z1 decreases, hence although V1 is kept at the saturation voltage level, the output power by the carrier
PA continues to increase. Combining the results from (124), (125), and (128b), and solving for V1 would
lead to
\[ V_1 = \frac{I_{max}}{4}\,Z_0\left[\frac{Z_0}{R_L} + k\left(\frac{Z_0}{R_L} - 2\right)\right] \tag{129} \]
If Z0 = 2RL then according to (129), the output voltage of the carrier PA would be independent of k,
and in this case Z0 = VDD/(Imax/2), which is the ideal optimal load impedance of class A or class B
PA.
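The medium-power behavior can be reproduced numerically from (125) to (128). The sketch below is an illustration under the same ideal assumptions as the analysis (current-source PAs, lossless quarterwave line); the function names and the component values are hypothetical.

```python
def doherty_medium_power(k: float, i_max: float, z0: float, r_load: float) -> dict:
    """Solve the simplified Doherty circuit for 0 <= k <= 1."""
    i1 = i_max / 4 * (1 + k)         # (125a): carrier PA current
    i2 = i_max / 2 * k               # (125b): peaking PA current
    v_load = z0 * i1                 # from (126a) and (127): VL = Z0 * I1
    i3 = v_load / r_load - i2        # load current IL = VL / RL = I2 + I3
    v1 = z0 * i3                     # (127) rearranged
    return {"v1": v1, "z1": v1 / i1, "z3": v_load / i3}


def v1_closed_form(k: float, i_max: float, z0: float, r_load: float) -> float:
    """Carrier PA output voltage from (129)."""
    ratio = z0 / r_load
    return i_max / 4 * z0 * (ratio + k * (ratio - 2))


if __name__ == "__main__":
    i_max, z0, r_load = 1.0, 50.0, 25.0      # Z0 = 2 * RL
    for k in (0.0, 0.5, 1.0):
        state = doherty_medium_power(k, i_max, z0, r_load)
        print(f"k={k}: V1={state['v1']:.2f}, Z1={state['z1']:.2f}, "
              f"Z3={state['z3']:.2f}, (129) gives "
              f"{v1_closed_form(k, i_max, z0, r_load):.2f}")
```

With Z0 = 2RL the carrier voltage stays constant over the whole range while Z1 falls from 2Z0 to Z0 and Z3 rises from RL to 2RL, matching the qualitative description above.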
Although the analysis above led to an elegant result both in terms of linearity and back-off power
efficiency, Doherty amplifier faces several implementation difficulties. First, the two paths need to have
accurate delay match. Superficially this requirement might get translated to matching of the quarterwave
lines and the two PAs in each path. However, even if the two PAs deliver the same maximum power and
thus have the same dimension, they should be biased differently, and hence would result in different
delay because of the bias-dependent parasitic capacitances. In modern wireless communications this
difference in delay may be a significant fraction of the RF cycle, and compensation through quarterwave
line delays would still make the design sensitive to process, voltage, and temperature variations. Also,
Fig. 45. Conceptual schematic of an EER architecture.
robust implementation of the bias circuit for the peaking amplifier is difficult. In addition, as revealed by
(129), the characteristic impedance of the quarterwave line should be equal to the optimal load impedance
of the carrier PA, which is usually on the order of 10 Ω or less in low-voltage technologies. Therefore the
transmission lines would be very wide and would consume a lot of area.
VI.B.3. Envelope Elimination and Restoration Technique: First proposed by Kahn [56], the envelope
elimination and restoration (EER) technique separates amplitude modulation from frequency and phase
modulations. As shown in the conceptual schematic in Fig. 45, the limiter in the PA path eliminates the envelope
amplitude information but preserves phase and frequency modulations. As a result, a switching-mode
PA can be used, yielding a high power efficiency. On the other hand, the envelope amplitude information
is extracted by the envelope detector, which in turn controls the switching regulator’s output accordingly.
This information is restored at the output of the PA because the output amplitude is proportional to the
PA power supply, which is the output of the switching regulator.
Modern implementation of the EER system is sometimes termed polar modulation, as shown in Fig. 46
[25]. Advances in digital signal processing (DSP) allow direct generation of amplitude and phase modulations.
The phase-locked loop (PLL) in the PA path further reduces phase noise, and the voltage
regulator can be implemented by a low-dropout regulator or a switching regulator. Such an architecture provides
a possibility of realizing linear transmitters that have high efficiency over a relatively large dynamic range.
Delay mismatch may cause linearity degradation, therefore the system linearity performance is sensitive
to process, voltage, and temperature variations unless there is a dynamic feedback correction. Also, large
variations on the PA power supply would cause AM-PM conversion and hence phase nonlinearity. In
addition, the high PAPR in modern modulation schemes may pose a design challenge to the voltage
regulator, and the trade-off between linearity and efficiency of the regulator may result in an overall efficiency
that is less than expected.
Fig. 46. Conceptual schematic of a polar modulation architecture.
VI.B.4. Envelope Tracking Technique: The conceptual schematic of an envelope tracking (ET) transmitter
is shown in Fig. 47. It is very similar to the EER system shown in Fig. 45, but the major distinction
is that in an ET system, the PA is usually linear, and thus the variable voltage regulator only needs to
provide sufficient voltage headroom for the PA to operate properly in the linear region. Therefore, the
amplitude linearity of the system is still provided through the PA, but the design of the regulator can be
relaxed since it does not have to accurately follow the envelope amplitude variations.
Fig. 47. Conceptual schematic of an ET system.
The drawback of the ET architecture is that in low-voltage applications, as the minimum output voltage
becomes a significant fraction of the power supply, the efficiency improvement in the power back-off region
diminishes. This is especially true in CMOS technologies, thus ET systems are more common in
technologies that have higher breakdown voltages [57–59].
VI.B.5. Power Combining Technique: The power combining technique divides the PA into several sections,
and deactivates one or more sections in the power back-off region. As shown in Fig. 48, the output
power from the individual PA sections is combined in either the voltage or the current domain [60]. In addition,
this technique is a good candidate for multimode applications, where PA sections can be activated or
deactivated according to the maximum output power of a certain standard. This technique thus has gained
a lot of interest in recent research [60–64].
Fig. 48. Conceptual schematic of a power combining architecture.
Switching speed is a design challenge in power combining architectures. Most previous works are able
to switch sections of the PA on and off, but not fast enough to follow the signal envelope. Therefore, although
they can be used for multimode applications and the average efficiency can be improved when part of the PA is deactivated
during the low-power mode, the power efficiency within each operation mode is the same as that of a
stand-alone PA. Another issue with this technique is that the use of on-chip transformers would result
in unreliable impedance matching, and as the distance between the metal layers and the silicon substrate
continues to shrink, the loss due to substrate coupling would cut into the power saved [65].
This work presents a new method that alleviates the issues above; the details are discussed
in Section VII.
VI.C. Linearization Techniques
Fig. 49. Conceptual schematic of a polar-loop feedback system.
Because of the overcrowded frequency bands, band-efficient modulation and filtering schemes are widely
used in modern communication systems. As a result, the PAPR is high and linearity requirement is
Fig. 50. Conceptual schematic of a Cartesian feedback system.
stringent. Consequently, using a linear PA alone may not meet the specific communication standard, due
to nonlinearity of the PA itself. Therefore some degree of linearization might be needed. In general
there are three linearization architectures: feedback, feedforward, and predistortion.
VI.C.1. Feedback Techniques: The system shown in Fig. 49, usually referred to as polar-loop feedback,
is a feedback system that consists of two independent loops: one corrects amplitude nonlinearity
while the other corrects phase distortion. The main design challenge is that loop delay mismatch may affect
the system linearity performance. Also, the auxiliary circuits may degrade the system power efficiency.
Cartesian feedback, shown in Fig. 50, alleviates to some degree the delay mismatch problem of polar-loop
feedback. However, the complexity due to feedback demodulators and error amplifiers may raise
a concern about power efficiency. Another drawback of this system is that it is unable to actively correct
AM-PM conversion.
VI.C.2. Feedforward Techniques: The feedforward architecture is illustrated in Fig. 51. Due to the
absence of feedback, the bandwidth limitation is alleviated. However, the error amplifier needs to amplify the error
signal, which has a much higher PAPR than the RF signal the PA handles. Therefore the efficiency of
the error amplifier may pose a design challenge. Also, delay mismatch between the two paths might
degrade the effectiveness of this approach.
VI.C.3. Predistortion Techniques: Predistortion intentionally introduces nonlinearity to the input signal
to cancel distortion from the PA, as illustrated in Fig. 52. Due to advances in DSP, this technique has
gained some popularity. An additional loop can be added for adaptive predistortion, but storage and processing
Fig. 51. Conceptual schematic of a feedforward architecture.
overhead, as well as the look-up table updating and convergence time can be an issue [20].
Fig. 52. Conceptual schematic of a predistortion system.
VII. A 35 dBm OUTPUT POWER AND 38 dB LINEAR GAIN PA WITH 44.9% PEAK PAE AT
1.9 GHz IN 40 nm CMOS¹
VII.A. Introduction
The power amplifier (PA) is one of the major power consumers in the RF transceiver [66–68], and the
design and implementation of highly efficient CMOS PAs has been a very active research and development
area during the last few years [69–71]. The 3G to 5G communication standards use high-data-rate,
bandwidth-efficient modulations that result in a high peak-to-average power ratio (PAPR). Because of
the high PAPR of such modulations, such as those based on orthogonal frequency-division multiplexing (OFDM), the
probability density function (PDF) of the transmitted power peaks in the power back-off (PBO)
region. However, the power efficiency of linear PAs reaches its maximum at the peak output power, and
drops drastically in the PBO region.
Envelope tracking [57–59, 70, 72, 73] and PA segmentation [60–64, 69, 74–76] are two efficiency
enhancement techniques that have gained much interest recently. However, the envelope tracking system
is becoming less effective in advanced CMOS technologies as the power supply scales down. The
minimum drain-source voltage required by PA transistors and the limited drain-source voltage allowed
by the technology limit the benefits of this approach; the use of stacked transistors may help to tolerate
more signal swing. Additionally, wide-bandwidth standards require a switching regulator with a high switching
frequency, which involves a tradeoff between regulator power efficiency, output ripple, and tracking error
[72, 73].
The on-chip transformers used in segmented PAs usually exhibit tolerances, especially in the magnetic
coupling factor, that may result in unreliable impedance matching, and as the distance between the
metal layers and the silicon substrate continues to shrink, the loss due to substrate coupling would cut
into the power saved by such architectures. Therefore, these segmentations must be accompanied by a
tunable impedance matching network, which makes these solutions sensitive to process-voltage-temperature
variations [60, 74]. In this approach, some PA sections are deactivated in the low-power mode, such
that overall efficiency for low-power standards is improved. Such architectures do not provide means to
improve average efficiency within each mode of operation. On the other hand, the PA based on DAC
switching [77] used in polar PAs is an interesting approach that is further exploited in this design.
¹ Part of this section is reprinted with permission from “A 35 dBm Output Power and 38 dB Linear Gain PA with 44.9% Peak
PAE at 1.9 GHz in 40 nm CMOS”, H. Qian, Q. Liu, J. Silva-Martinez, and S. Hoyos, IEEE Journal of Solid-State Circuits, vol.
51, no. 3, March 2016.
In this section, the design of a 1.9 GHz linear segmented PA is presented. To improve efficiency in
the PBO region, a combination of PA segmentation and digital signal processing (DSP) is employed. The
PA sections are directly connected to the output impedance matching network equipped with class AB
common-mode feedback (CMFB) mechanism to reduce common-mode variations when (de)activating the
PA segments. The proposed PA’s efficiency in the back-off region is significantly improved since the
drivers and PA active sections are correlated with input signal power. The discrete power gain variations
were effectively compensated using a digital pre-warping technique employing noiseless, fast, precise,
and cheap digital amplification. The digital pre-warping scheme increases the power of weak signals
improving the signal-to-noise ratio of the solution under PBO conditions. Preliminary results of this work
were recently reported in [43].
This section is organized as follows. Subsection VII.B reviews three popular PA architectures aimed
at improving power efficiency in the PBO, namely the envelope tracking system, power combining, and
DAC-based technique. In Subsection VII.C, the proposed architecture is described in detail, and an in-depth
analysis of the impact of linearity due to timing mismatch is carried out. The design of the PA building
blocks is presented in Subsection VII.D, and the measurement results and discussions are presented in
Subsection VII.E. Finally, the conclusion is drawn in Subsection VII.F.
VII.B. Efficiency Enhancement Techniques
Current and future generations of communication systems use high PAPR modulation schemes due
to the need for bandwidth efficiency and accommodation of multimode and multistandard applications;
therefore, the goal is to improve power efficiency in the PBO region. A brief description of three such
techniques follows.
a) Envelope Tracking
One of the possible envelope tracking topologies is shown in Fig. 53. Baseband amplitude (the “envelope”)
is extracted in the DSP and converted to analog through the digital-to-analog converter (DAC).
The envelope signal is then fed into a switching regulator, usually combined with a linear regulator (not
shown in this figure) used to reduce VDRAIN ripple. The PA’s VDRAIN is dynamically varied, tracking
the baseband signal amplitude, so that the PA’s efficiency improves in the PBO region. One of the major flaws of this
architecture is that the timing misalignment of the PA supply voltage to the RF signal will introduce
nonlinearity, and most efforts to align the two paths are sensitive to process, voltage, and temperature
variations. As the CMOS technology scales toward lower breakdown voltages where the PA output voltage
Fig. 53. Conceptual schematic of an ET system.
swing is limited, the envelope tracking technique becomes less effective. On the other hand, the switching
regulator must be agile enough to track the fast variations in the input signal while keeping the ripple small. These
requirements demand high switching frequencies, even > 100 MHz for signal bandwidths of 20 MHz with
stringent slew-rate specifications. Unfortunately, the increased switching loss of the switching regulator
degrades the overall power efficiency when high-frequency clocks are employed. Power efficiency degrades further
due to the use of the auxiliary linear amplifier needed to reduce the output voltage ripple.
b) Power Combining: Segmented PA
The PA can be segmented and the control system deactivates one or more sections depending on
the power demanded by different standards, as shown in Fig. 54(a). This approach is well suited for
multimode multistandard applications where several sections of the PA can be deactivated when the
system is used in low-power mode operation [60–64]. This technique can also be used in switching-mode
PAs as demonstrated by [65].
c) Power DAC: Segmented PA
This approach was developed for polar amplifiers; see for instance [77]. It employs a DAC embedded
at the output of the RF PA as depicted in Fig. 54(b). The phase of the input signal modulates
the carrier and the modulated signal then feeds the linear preamplifiers and so the PA sections
$2^M (W/L)_0, 2^{M-1} (W/L)_0, \cdots, 2^0 (W/L)_0$. The PA is binary segmented; then, its output current is
correlated with the magnitude of the input signal (determined by $b_M, b_{M-1}, \cdots, b_0$) implementing an
embedded DAC. In theory, the PA’s current efficiency would be maintained close to the maximum attainable
in every segment due to the fact that the digital predistortion adjusts the signal power to fit within the
Fig. 54. Conceptual schematic of power combining architecture with switchable PAs. (a) PA for multistandard applications. (b)
DAC-based PA with optimized current efficiency.
maximum linear range. However, a number of practical limitations (for example, a dc current larger than the peak ac
current is needed in every PA segment for good linearity) degrade it. The PA driver must amplify the phase-modulated
waveform in a linear fashion to preserve the information, which demands the use of power-hungry
class A drivers. Under PBO conditions, the power consumption might therefore be dominated by
the PA drivers rather than the PA itself. The PA drivers can also be turned OFF when the corresponding branch is
OFF.
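The binary weighting of the segmented PA can be summarized in a few lines. This is a minimal sketch, assuming that every enabled section contributes output current in proportion to its width multiple; the function name and the bit ordering (MSB first) are illustrative choices, not taken from [77].

```python
def active_width_units(bits):
    """Total enabled width, in units of (W/L)_0, for control bits b_M ... b_0.

    bits is ordered MSB first; each entry must be 0 or 1.
    """
    top = len(bits) - 1                  # index M of the largest section
    return sum(bit << (top - position) for position, bit in enumerate(bits))


if __name__ == "__main__":
    for code in ([0, 0, 1], [1, 0, 1], [1, 1, 1]):
        print(code, "->", active_width_units(code), "unit sections enabled")
```

The enabled width, and under the stated assumption the output current, therefore steps through the same values as the digital code, which is the embedded DAC behavior described above.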
VII.C. PA Architecture
Since most communication systems from 3G onward have a Gaussian-distributed transmission power PDF
as a function of output power in dBm, the architecture targeted at such communication systems
partitions the signal in a linear-in-dB manner to maximize its effectiveness. On the other hand, the best
power efficiency in current PAs is obtained for large signals, so the aim of the proposed approach is
Fig. 55. Correlation between control phases and baseband signal amplitude.
to maintain the PA input signal large; for this purpose, digital prewarping techniques are employed. The
incoming signal is segmented into four regions, with adjacent regions differing in maximum voltage
by 6 dB as shown in Fig. 55. More segments can always be used if appropriate for other designs. The
four regions are distinguished by the values of the control phases φ1 − φ3. These control bits correspond to
the two most significant bits (MSBs) of the baseband signal; thus, baseband signal power is identified in
the DSP. The control phases manage the segments of the PA, thus correlating the PA current consumption
and gain with the signal MSBs. The prewarped LSBs are then processed using linear amplifiers.
Fig. 56 shows the conceptual schematic of the proposed system. Ignoring the sign bit, the MSBs of the
digital representation of the baseband signal magnitude manage the segments, while the least significant bits
are converted into analog format and then up-converted by the mixer. The PA and its driver are divided into
four sections in a binary fashion; it is straightforward to realize this operation in the digital domain since
the two MSBs provide that information; for better control of the architecture, the MSBs are converted
into thermometric format. The control bits φ1 − φ3 drive the PA sections through the drivers. If the
signal strength falls in the region φ0, for instance, the control phases φ1 − φ3 are zero, and then only
the unswitchable section manages the signal Sin(t). To minimize the switches in the signal path, the
drivers are turned OFF by disconnecting the transistor drain from VDD; dc coupling is used to drive
the PA sections to avoid the use of large capacitors that introduce significant delay in signal path. The
architecture is designed such that when the drivers are turned OFF, the PA sections also shut OFF. As a
Fig. 56. Simplified schematic of the proposed architecture.
result, the drivers and PA sections are dynamically correlated with signal power providing further power
savings.
Due to the manipulation of the segments, the PA power gain changes in discrete steps that follow the segment
pattern, which is a desirable property for polar amplifiers, but makes the PA gain signal dependent for linear amplifiers. An elegant yet
efficient solution is to use digital gain equalization to overcome this shortcoming. The signal strength is
evaluated and amplified accordingly in the digital domain such that the digital gain and gain attenuation
due to PA switching compensate each other, leading to a constant power gain factor across all operating
conditions. The MSBs used to control the PA segments are also used to manipulate the least significant
bits, implementing digital gain factors of 2⁰, 2¹, 2², and 2³. The realization of these operations is trivial
since they correspond to left shifts of the data by 0, 1, 2, or 3 bit positions.
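The segmentation and digital gain equalization described above can be sketched as follows. This is an illustration only, not the implemented DSP: it assumes a magnitude normalized to a full scale of 1.0, region boundaries at 1/2, 1/4, and 1/8 of full scale (6 dB steps), and a hypothetical relative PA voltage gain that doubles with each additional active step. The function names are invented for this example.

```python
def prewarp(magnitude: float):
    """Return (region, shift, phases, prewarped magnitude) for one sample.

    Assumptions: 0 <= magnitude <= 1; region 0 is the weakest-signal region
    (only the unswitchable section active), region 3 the strongest.
    """
    if magnitude >= 0.5:
        region = 3
    elif magnitude >= 0.25:
        region = 2
    elif magnitude >= 0.125:
        region = 1
    else:
        region = 0
    shift = 3 - region                                   # left shift by 0..3 bits
    phases = [int(region >= n) for n in (1, 2, 3)]       # thermometer code for phi1..phi3
    return region, shift, phases, magnitude * (1 << shift)


def pa_relative_gain(region: int) -> float:
    """Hypothetical PA voltage gain relative to all sections active."""
    return 2 ** region / 8


if __name__ == "__main__":
    for magnitude in (0.05, 0.1, 0.2, 0.3, 0.9):
        region, shift, phases, warped = prewarp(magnitude)
        overall = warped * pa_relative_gain(region)
        print(f"in={magnitude}: region={region}, shift={shift}, "
              f"phases={phases}, prewarped={warped}, overall={overall}")
```

Under these assumptions the product of the digital gain and the PA gain is the same in every region, so the overall transfer is linear, while the prewarped signal stays within full scale in every region and within the top 6 dB of it in all but the weakest one.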
A unique property of this approach is that small signals are amplified noise-free in the digital domain,
making them more tolerant to thermal noise due to the mixer, PA drivers, and PA sections. The digital
amplification does not saturate the RF sections since the magnitude of the prewarped input signal is always
within the linear range of the active drivers and PA blocks. Digital gain by a power of 2 is a very
easy and cheap operation since it only requires a bit-shift to the left in the digital domain. If the digital
gain equalized signal reaching PA input is fully synchronized with the manipulation of the PA sections,
the PA output signal is smooth when transitioning across different segments. However, a common-mode
current step (when switching across segments) is an issue that requires further attention.
a) Timing Mismatch Analysis
One concern is the timing alignment of the RF signal path and the digital control phase path. A simplified
Fig. 57. Simplified model for timing mismatch analysis.
model of the system shown in Fig. 57 is used to capture the essence of the timing mismatch. Let us consider
the case of only one-bit control φ3. Suppose that there is a timing delay of τ seconds between the RF signal
path and the control phase, i.e., the control signal arrives at the switch before the corresponding RF signal
reaches the PA cells. Assume a modulated input signal $s_{in}(t) = s_{BB}(t)\,s_{RF}(t) = \cos(\omega_{BB}t)\cos(\omega_{RF}t)$,
where $\omega_{BB}$ and $\omega_{RF}$ represent the baseband and RF angular frequencies, respectively. For simplicity, the
amplitude of the input tone and gain of the mixer are chosen to be unity. If all PA sections are active,
the output signal is then described as $s_{out-N}(t) = 0.5A_{VPA}\,s_{in}(t)$. However, the baseband equalizer
recognizes that the signal power is small and amplifies it by 6 dB; $s_{in}(t)$ is then a pre-equalized version
of the original baseband input signal and can be expressed as follows for the case of a single tone:
\[
s_{in}(t) =
\begin{cases}
2\,s_{BB}(t)\,s_{RF}(t), & -0.5 \le s_{BB}(t) \le 0.5 \\
s_{BB}(t)\,s_{RF}(t), & s_{BB}(t) > 0.5 \ \text{or}\ s_{BB}(t) < -0.5
\end{cases}
\tag{130}
\]
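The pre-equalization rule in (130) and its interaction with the PA gain switching can be explored with a small baseband-only model. The sketch below is an illustration under stated assumptions: the RF carrier is omitted, the PA gain is normalized to 1 with all sections active and 0.5 otherwise, the control path is evaluated at angle `theta` while the signal is delayed by `theta_tau`, and all function names are hypothetical.

```python
import math

THRESHOLD = 0.5  # segmentation threshold from (130), full scale normalized to 1


def pre_equalize(s_bb):
    """Digital gain of 2 when |s_bb| <= THRESHOLD, unity otherwise, as in (130)."""
    return 2.0 * s_bb if abs(s_bb) <= THRESHOLD else s_bb


def pa_gain(s_bb):
    """Assumed normalized PA gain: halved (-6 dB) while the small-signal segment is selected."""
    return 0.5 if abs(s_bb) <= THRESHOLD else 1.0


def output_envelope(theta, theta_tau=0.0):
    """Baseband envelope at the PA output for a single tone cos(theta).

    The PA control follows the undelayed tone; the equalized signal arrives theta_tau later.
    """
    delayed = math.cos(theta - theta_tau)
    return pa_gain(math.cos(theta)) * pre_equalize(delayed)


def error_envelope(theta, theta_tau):
    """Difference between the misaligned output and the ideal (delayed) tone."""
    return output_envelope(theta, theta_tau) - math.cos(theta - theta_tau)


if __name__ == "__main__":
    for k in range(8):
        theta = k * math.pi / 4
        print(round(theta, 3), round(output_envelope(theta), 6), round(error_envelope(theta, 0.1), 6))
```

With `theta_tau = 0` the output equals the original tone at every angle; with a nonzero delay the error is nonzero only in short windows after each breakpoint, scaled by −1/2 or +1 times the delayed tone.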
In Fig. 58, ti, i = 1, 3, 5, 7, correspond to the breakpoints of the segmentation algorithm. If
the timing is perfectly aligned, while the magnitude of the baseband signal is smaller than the threshold
voltages, the PA gain reduces by a factor of 2. At the same time, the signal is digitally amplified by two
while in this region and, thus, the overall gain remains constant since the digital amplification and PA
attenuation are fully synchronized. On the other hand, if there is a timing mismatch of τ seconds between
the time we manipulate the PA segments and signal traveling through the up-converter and amplification
chain, then the operations are misaligned resulting in an error (glitch like) at the PA output. The delay
occurs when the signal travels through the DAC, the mixer, drivers, and PA sections. If the PA sections
Fig. 58. PA output waveforms (RF component is not shown for simplicity). (a) Prewarped signal with and without timing delay.
(b) Error waveform due to timing mismatch between φ3 and si(t − τ).
are turned OFF earlier, then PA gain drops by 6 dB and stays in this condition until the equalized signal
reaches the gate of the PA. This scenario is illustrated in Fig. 58(a) where the PA input signal becomes
\[
s_{in}(t) =
\begin{cases}
s_{BB}(t-\tau)\,s_{RF}(t), & t < t_2 \\
2\,s_{BB}(t-\tau)\,s_{RF}(t), & t_2 \le t < t_4 \\
s_{BB}(t-\tau)\,s_{RF}(t), & t_4 \le t < t_6 \\
2\,s_{BB}(t-\tau)\,s_{RF}(t), & t_6 \le t < t_8
\end{cases}
\tag{131}
\]
where $s_{BB}(t)$ is the incoming baseband signal. Defining the error signal at the output to be the difference
between the PA output current with timing errors and the ideal output current, then
\[
i_e(t) =
\begin{cases}
-\tfrac{1}{2}\,G_{m-VPA}\,s_{BB}(t)\,s_{RF}(t), & t_1 \le t < t_2 \\
G_{m-VPA}\,s_{BB}(t)\,s_{RF}(t), & t_3 \le t < t_4 \\
-\tfrac{1}{2}\,G_{m-VPA}\,s_{BB}(t)\,s_{RF}(t), & t_5 \le t < t_6 \\
G_{m-VPA}\,s_{BB}(t)\,s_{RF}(t), & t_7 \le t < t_8 \\
0, & \text{otherwise}
\end{cases}
\tag{132}
\]
with $G_{m-VPA}$ being the transconductance gain of the PA. The resulting error signal is plotted in Fig. 58(b);
the RF component is not shown to simplify the plot. In general, the error signal resulting from the timing
mismatch would be manifested as the convolution of the signal LSBs with a time delay of τ seconds, the
MSBs, and a time window of τ seconds. For the sake of simplicity, let us denote θ = ωBBt; then, the
third Fourier coefficient of the error signal can be calculated as follows:
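\[
a_3 = \left(\frac{G_{m-PA}}{\pi}\right)\left[-\frac{1}{2}\int_{\theta_{1,5}}^{\theta_{1,5}+\theta_\tau}\cos\theta\cos 3\theta\,d\theta
+ \int_{\theta_{3,7}}^{\theta_{3,7}+\theta_\tau}\cos\theta\cos 3\theta\,d\theta\right]
\tag{133}
\]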
where $\theta_i = \omega_{BB}t_i$, i = 1, 3, 5, 7, and $\theta_\tau = \omega_{BB}\tau$. Evaluating the integrals and then rearranging
the expanded terms, noting from Fig. 58 that $\theta_1 = \pi/3$, $\theta_3 = 2\pi/3$, $\theta_5 = 4\pi/3$, and $\theta_7 = 5\pi/3$, leads to
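\[
a_3 = \left(\frac{\sqrt{7}\,G_{m-PA}}{2\pi}\right)\left[\frac{1}{2}\sin 2\theta_\tau\,\sin(2\theta_\tau - \phi) - \sin\theta_\tau\,\sin(\theta_\tau + \phi)\right]
\tag{134}
\]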
where $\phi = \tan^{-1}\left(\frac{1}{3\sqrt{3}}\right) = 0.19$ rad. If we assume that $\theta_\tau \ll \phi$, then (134) reduces to the simpler yet
intuitive result
\[
|a_3| \approx \left(\frac{\tau}{T_{BB}}\right) G_{m-PA}
\tag{135}
\]
where $T_{BB}$ is the baseband signal period. Since $a_1 \approx G_{m-PA}\,s_{BB-pk}$ in this simplified analysis, the
third-order intermodulation distortion due to the timing mismatch, IMD3, is proportional to $3\tau/(4T_{BB})$. For
a baseband signal of 10 MHz ($T_{BB} = 10^{-7}$ s), the delay error τ must be under $1.3 \times 10^{-9}$ s to
maintain IM3 under -40 dB. Timing mismatches in the other three PA segments add similar effects and increase
the PA sensitivity to time delay mismatches. Moreover, in practice the computation is more complicated
since the spectral leakage is the result of the convolution of the MSBs used to control φ1 − φ3 and the
signal power of the least significant bits si(t) and a time window of τ seconds correlated with the MSBs;
notice in (132) and (133) that the magnitude and sign of the windowing are a function of the direction of
the transition of the MSBs: -1/2 when the MSBs transition from 1 to 0 and +1 when transitioning from 0
to 1.
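The relation between the closed form (134), the small-delay approximation (135), and the delay budget can be checked numerically. The following sketch is only illustrative: the function names are hypothetical, the transconductance is normalized to 1, and the delay bound treats the stated proportionality 3τ/(4TBB) as an equality with the IM3 target expressed as an amplitude ratio (20 log10), which is an assumption chosen because it reproduces the 1.3 ns figure quoted in the text.

```python
import math

# phi from the text: arctan(1 / (3*sqrt(3))), roughly 0.19 rad
PHI = math.atan(1.0 / (3.0 * math.sqrt(3.0)))


def a3_closed_form(tau, t_bb, gm=1.0):
    """Third Fourier coefficient of the error signal according to (134)."""
    theta_tau = 2.0 * math.pi * tau / t_bb
    bracket = (0.5 * math.sin(2.0 * theta_tau) * math.sin(2.0 * theta_tau - PHI)
               - math.sin(theta_tau) * math.sin(theta_tau + PHI))
    return math.sqrt(7.0) * gm / (2.0 * math.pi) * bracket


def a3_small_delay(tau, t_bb, gm=1.0):
    """Magnitude of a3 according to the approximation (135), valid for theta_tau << phi."""
    return gm * tau / t_bb


def max_delay_for_im3(im3_db, t_bb):
    """Largest delay tau for which 3*tau/(4*t_bb) stays below the IM3 target (assumed equality)."""
    return (4.0 * t_bb / 3.0) * 10.0 ** (im3_db / 20.0)


if __name__ == "__main__":
    t_bb = 1e-7  # 10 MHz baseband tone
    for tau in (1e-11, 1e-10, 1e-9):
        print(tau, abs(a3_closed_form(tau, t_bb)), a3_small_delay(tau, t_bb))
    print(max_delay_for_im3(-40.0, t_bb))
```

For a 10 MHz tone, `max_delay_for_im3(-40.0, 1e-7)` evaluates to about 1.33 ns, in line with the bound quoted above. The approximation (135) tracks (134) closely only while θτ remains much smaller than φ, so the two drift apart as the delay grows.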
Fig. 59. Timing mismatch effects on ACLR.
To reduce the nonlinearity caused by the timing mismatch, a delay cell is added to the system, as shown
in Fig. 57. The delay cell includes a replica of the preamp, but it acts as
a digital driver. After fine tuning the size of the delay cell using extensive post-layout simulations, the
on-chip delay mismatch was under 100 ps for all segments and under PVT variations. Timing mismatches
generate glitches (MSBs and least significant bits are not well aligned, as depicted in Fig. 58) that may
not significantly degrade the received constellation if properly sampled at the receiver. These glitches have
a stronger effect on ACLR since they are signal dependent. Extensive simulations in a WCDMA
system, where the channel bandwidth is 3.84 MHz, show that a timing mismatch of 500 ps results in PA
neighbor-channel leakage power under -40 dB, as illustrated in Fig. 59. The timing delay block was not
manipulated during characterizations.
Fig. 60. Schematic of the PA output stage; the core consists of 1536 replicas.
(Schematic labels: RFin+, RFin−, M1, M2, Bias, VDD, Z-match, balun, 50 Ω load.)
Device   Total W/L (µm/µm)   No. of fingers
M1       4608/0.04           1536
M2       6912/0.27           1536
VII.D. PA System Design
The critical design of the proposed system is the switching scheme, which is applied to both the PA
sections and their drivers. The PA design details are described in this subsection.
a) Output Stage Design
Fig. 60 shows the schematic of the PA stage. Cascode configuration is used to improve its reliability.
The common-source transistors are standard thin oxide transistors that have lower input capacitance and
higher transconductance; the common-gate transistors have thick oxide to withstand larger voltage swing.
At maximum RF output power, the voltage swing at the drain terminal of the cascode device and the
common-source device are 2.5 and 0.75 Vpk, respectively. The transistors are optimized for linearity,
and their sizes are also included in Fig. 60. The nominal gate overdrive voltages for the transistors are
VOV1 = 300 mV and VOV2 = 400 mV. At maximum RF output power, the simulated bias currents of
the output stage and driver stage are 980 and 320 mA, respectively. Maximum RF current is expected
at this stage; thus, extra care is needed in the design layout. Multiple pads for the output and ground
nodes are used, and the ground pads of this stage are not shared with the remaining parts of the chip.
The bondwires are explicitly drawn to indicate that those pads are for the output stage exclusively. The
transistors are organized in clusters employing common-centroid techniques to facilitate the connectivity
and to minimize transistor mismatches. The PA transistors are dc connected to the PA drivers; thus, no
Fig. 61. Schematic showing the CMFB circuit allocated at PA output.
additional switches are required to enable or disable these sections. When M1 transistors are switched
ON/OFF, there is a significant common-mode step in current that may produce significant common-mode
ringing and up to 1 V common-mode peak variation. To alleviate this issue, a fast class AB CMFB circuit
shown in Fig. 61 is allocated at PA output. A couple of single stage amplifiers compare the common-mode
output signal and VDRAIN and drive the class AB amplifiers composed of transistors MC1 and MC2.
These transistors are biased through RB and VB1,2 at the onset of subthreshold region to save power.
Class AB amplifiers MC1 and MC2 minimize the power consumption but are able to deliver/sink enough
instantaneous current reducing the common-mode glitches generated by the transistor’s switching.
b) PA Drivers
The schematic of the driver stage is shown in Fig. 62. It consists of a differential pair with resistive
load, a switch controlled by the control code, and a CMFB loop. Cp and CPA in Fig. 62 represent
the effective parasitic capacitance at the common-source node and the output nodes, respectively. Direct
coupling between the driver and the PA stages reduces the switching time. When the switch is opened, the
Fig. 62. Conceptual schematic of the driver stage.
(Schematic labels: vin+, vin−, vout+ (RFin+), vout− (RFin−), switch control φ, Vref, VDD.)
driver’s output common-mode voltage moves down very quickly, putting the differential pair transistors
in the triode region. The common-mode voltage drops, and then breaks the loop during this condition,
which helps to turn off the preamplifier outputs quickly. When the switch is closed again, the output
voltage of the driver moves toward VDD and is only limited by the time constant RLCPA. Since the load
resistors RL are small (in this case around 30 Ω for the unit-cell driver), the time constant is small, and
a fast low-to-high transition is achieved. As soon as the common-mode level exceeds the reference voltage,
the loop tries to reach its steady state; the settling time of the CMFB is then a function of the loop properties.
Therefore, the use of a fast CMFB is a must.
Open-loop gain, closed-loop bandwidth, and stability are all important parameters to be considered
when designing the CMFB loop. Simulation results of the common-mode voltage as the switching takes
action are illustrated in Fig. 63. The common-mode voltage moves very quickly until 400 mV is reached
because the loop is still broken due to the lack of current in M4. The knee during the rising transition is
due to the fact that the fast voltage variation at the drain of M3 puts them in the saturation region, allowing
the generation of instantaneous discharging current until the parasitic capacitor Cp can get charged. Then,
the drain current of M3 reduces again and the common-mode voltage rises very fast again until reaching
its steady-state condition. The 1% settling time under the worst-process corner is less than 8 ns, which
means that even if the baseband signal bandwidth is 10 MHz, the switching process would only take
8% of the signal time period in the worst case. The common-mode settling time issues arise during the
transition time of the incoming data, generating data-dependent glitches that may degrade the ACLR and
EVM figures.
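The settling figures above reduce to simple arithmetic, sketched below. The function names are hypothetical, the single-pole RC model is a simplifying assumption that ignores the CMFB loop dynamics, and the 1 pF load used in the demonstration is a placeholder, since the value of CPA is not stated in the text.

```python
import math


def settling_fraction(t_settle, f_bb):
    """Fraction of one baseband period (1 / f_bb) occupied by the settling time."""
    return t_settle * f_bb


def first_order_settling_time(r_load, c_load, tolerance=0.01):
    """Time for a single-pole RC response to settle within `tolerance` of its final value."""
    return r_load * c_load * math.log(1.0 / tolerance)


if __name__ == "__main__":
    # 8 ns worst-case settling against a 10 MHz baseband signal (values from the text)
    print(settling_fraction(8e-9, 10e6))
    # 30 ohm load resistor from the text; 1 pF is a placeholder capacitance
    print(first_order_settling_time(30.0, 1e-12))
```

The first call reproduces the 8% figure quoted for a 10 MHz baseband signal; the second only indicates the order of magnitude an RC-limited transition would have for the assumed capacitance.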
Fig. 63. Simulation results of common-mode voltage transient response.
(Plot axes: VCM (V) versus time (ns); curves: ss, tt, and ff corners.)
When all the PA and driver sections are active, the simulated power gain of the driver stage is about
18 dB, whereas the power gain of the output stage is around 20 dB.
c) Output Impedance Matching Design
For the output impedance matching circuit, a multisection network was implemented. Fig. 64 shows a
half representation of the matching network. CD and Lbnd stand for the drain capacitance and the bondwire
inductance, respectively. The parasitic capacitance at the package on the PCB is accounted for in C1. The
transmission line with a length d and characteristic impedance Z0 is formed by a microstrip line consisting
of the PCB trace and the ground plane underneath. RL represents the input impedance of a balun, which
is 25Ω for a half circuit. The optimal PA load impedance RT is determined by maximum linear output
power design specification. The loadpull simulations further allow us to optimize the choice of RT . The
RF choke inductor is off-chip and not shown in Fig. 64. Its value is chosen such that at RF, it is seen as
a high impedance to the PA, while at the switching frequency of the switching regulator, it is seen as a
low impedance to the switching regulator. The matching network component values can be determined by
hand calculations, the Smith chart, or existing software packages. The summary of the component values
is given in Table VIII.
Two important design specifications for the output matching network are bandwidth and insertion loss.
Fig. 64. Two-section impedance matching network.
(Schematic labels: CD, Lbnd, C1, transmission line Z0 with length d, C2, RL, RT.)
TABLE VIII
Impedance Matching Network Component Values
CD (pF)   C1 (pF)   C2 (pF)   Lbnd (pH)   Trans. line W (mil)   Trans. line d (mil)
10        15.9      3.27      400         150                   244
The matching circuit used in the proposed system is effectively a multisection design, and its bandwidth
is sufficient for WCDMA applications. To ensure robustness, the insertion loss of the output matching is
simulated under process variations, as shown in Fig. 65. The worst case of the simulated insertion loss is
around 1 dB when all component values are shrunk by 30%. However, this is the less likely case, since
nonidealities usually result in additional parasitic components, making the effective component values
larger. If all component values increase by 30%, the insertion loss is simulated to be only 0.2 dB.
Fig. 65. Insertion loss simulation with process variations.
(Plot axes: insertion loss (dB) versus frequency (GHz), 1.5 to 2.5 GHz; curves: nominal, 30% shrunk, 30% expanded.)
The Q of the bondwire inductance was assumed to be 50 in simulations. Both the change in Q and
the value of the bondwire inductance affect the output matching network. This effect manifests itself in
higher insertion loss at frequency of interest. The insertion loss at 1.9 GHz was simulated with various Q
and L values of the bondwire inductance across all four modes of operations. Under extreme conditions,
i.e., inductance increased by 25% and Q = 30, the insertion loss barely exceeded 1 dB.
Fig. 66. Transient simulation results. (a) Input signal before and after digital prewarping (top trace). (b) Output signal at drain
voltage (middle trace). (c) Output signal after impedance matching network.
(Plot axes: VBB (mV), VD (V), and VLOAD (V) versus time (µs); legend: baseband sine, predistorted sine.)
As the PA switches ON and OFF different transistor sections, the output impedance of the transistor
changes. However, the transistor is in either an active region or a cutoff region, and the drain capacitance
is mainly due to the depletion region capacitance between the drain and the substrate plus the gate-drain
overlap capacitance. To test if the impedance matching network works properly in each switching scenario,
a 2 MHz sinusoidal baseband signal modulated to the carrier frequency is applied to the system. The
transient waveforms are shown in Fig. 66. The top plot shows the original sinusoidal baseband signal, along
with a predistorted baseband that is to be input to the PA. As shown in Fig. 66, the impedance matching
network functions properly at all switching scenarios. The peak differential voltage amplitudes before and
after the impedance matching network are 3.1 and 10.7 V, respectively. The voltage transformation ratio of
3.45 thus implies an impedance transformation ratio of 11.9, close to the design target (ZL/ZT = 50/4.5 = 11.1). Notice
that if the mismatches in segmented PA are small, the ac current delivered to the matching network is
smooth for the entire power range. In practice, some glitches are present when switching between segments
Fig. 67. Microphotograph of the chip.
mainly due to the unavoidable parasitic capacitors and timing offsets.
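The step from the simulated voltage amplitudes to the impedance transformation ratio can be reproduced as follows. This is a small sketch with hypothetical function names; it assumes the matching network is lossless, so that equal power on both sides makes the impedance ratio the square of the voltage ratio.

```python
def impedance_ratio_from_voltages(v_before, v_after):
    """Impedance ratio implied by a voltage ratio, assuming equal power on both sides."""
    return (v_after / v_before) ** 2


def relative_deviation(value, target):
    """Signed deviation of `value` from `target`, as a fraction of `target`."""
    return (value - target) / target


if __name__ == "__main__":
    ratio = impedance_ratio_from_voltages(3.1, 10.7)  # amplitudes quoted in the text
    target = 50.0 / 4.5                               # ZL / ZT quoted in the text
    print(round(ratio, 2), round(target, 2), round(relative_deviation(ratio, target), 3))
```

With the quoted amplitudes, the implied ratio is about 11.9 against a target of about 11.1, a difference of several percent.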
VII.E. Measurement Results
Fig. 68. Measured gain, output power, and PAE as a function of input at 1.9 GHz.
(Plot axes: gain (dB) / POUT (dBm) and PAE (%) versus PIN (dBm); curves: gain, POUT, PAE.)
Fig. 69. Simulation and measurement results of PA’s S22 when the control bits are (a) 111; (b) 011; (c) 001; and (d) 000.
The PA was fabricated in a TSMC 40 nm CMOS process, and Fig. 67 shows the microphotograph of the
chip. The chip area is approximately 2.88 mm2. A single-tone continuous-wave (CW) signal of 1.9 GHz
was applied to characterize the PA in all four operation modes. Fig. 68 shows the measured gain, output
power, and power-added efficiency (PAE) as a function of the input power. The PCB and cable losses
are de-embedded in the performance. The PA’s output P1dB and PSAT are measured as 31 and 35 dBm,
respectively. The average power gain is 38 dB, and the PAEs at P1dB and PSAT are 28.8% and 44.9%,
respectively. As a comparison, the PAE of the PA without the proposed power efficiency improvement
techniques was measured, and displayed in Fig. 68 as the dashed curve. The PAE improvement in the
PBO region is apparent. For instance, at 20 dB back-off from PSAT , PAEs of the PA with and without
Fig. 70. ACLR measured at maximum output power of 31 dBm.
segmentation are 21.3% and 8.1%, respectively. If required, more segments can be added to improve PA
power efficiency at higher power levels.
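The efficiency figures above follow the usual definition PAE = (POUT − PIN)/PDC. The sketch below shows the conversion from dBm and the DC power ratio implied by two PAE values at the same output power and gain. The function names are hypothetical, and the 2 W supply power in the demonstration is a placeholder rather than a measured value.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


def pae_percent(p_out_dbm, gain_db, p_dc_w):
    """Power-added efficiency in percent, with the input power derived from the gain."""
    p_out = dbm_to_watts(p_out_dbm)
    p_in = dbm_to_watts(p_out_dbm - gain_db)
    return 100.0 * (p_out - p_in) / p_dc_w


def dc_power_ratio(pae_a_percent, pae_b_percent):
    """DC power of design B relative to design A at equal output power and gain."""
    return pae_a_percent / pae_b_percent


if __name__ == "__main__":
    print(pae_percent(30.0, 30.0, 2.0))  # placeholder operating point
    print(dc_power_ratio(21.3, 8.1))     # PAE values at 20 dB back-off from the text
```

Using the PAE values quoted at 20 dB back-off, the unsegmented PA would draw roughly 2.6 times the DC power of the segmented one for the same RF output.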
Fig. 71. SEM measured at maximum output power of 31 dBm.
Fig. 72. ACLR as a function of maximum output power.
(Plot axes: ACLR (dB) versus POUT (dBm); the limit ACLR = -33 dB is marked.)
The S22 of the PA in each mode of operation is simulated and measured as shown in Fig. 69. Although
there is some mismatch due to nonidealities, it is manageable, and can be optimized by tweaking the
output matching network component values. Note that both simulation and measurement show that S22
does not vary much across different modes of PA operation. Due to the cascode topology of the output
stage, PA’s output resistance is kept large as compared with the transformed RT ; therefore, the variation
in PA output resistance is absorbed by the output matching network.
A WCDMA baseband signal that is compliant with the 3GPP standard [31] was generated, preprocessed
in digital, and up-converted to 1.9 GHz with a bandwidth of 3.84 MHz. According to [31], the adjacent
channel leakage ratio (ACLR) at ±5 MHz should be kept below -33 dBc for cellular handsets to comply
with the standards. The measured output power spectrum is shown in Fig. 70, with the PA under test
transmitting a maximum linear power of 31 dBm. Data analysis from the spectrum analyzer shows that the
ACLR at a maximum power of 31 dBm is -35.8 dBc. The spectrum emission mask (SEM) measurement
was carried out, and the result is shown in Fig. 71. Under maximum output power condition, the PA
meets the 3GPP SEM specifications.
With a fixed set of switching thresholds for the PA and the switching regulator, the ACLR as a function
of maximum output power was measured, and the result is shown in Fig. 72. Contrary to classic PA
Fig. 73. EVM as a function of maximum output power.
(Plot axes: EVM (dB) versus POUT (dBm); the limit EVM = -15.4 dB (17%) is marked.)
cases where amplifier’s linearity improves at low power, the linearity of the proposed architecture was
compromised at the PBO region due to the digital amplification. The PA, however, still met the required
specifications. This is because the PA transistors work close to their maximum power capacity most of the
time. These results show that the proposed architecture has a good balance between power efficiency and
linearity. Another linearity figure of merit is the error vector magnitude (EVM). For 3G WCDMA, the
specification for EVM is less than -15 dB (17%). The EVM as a function of the maximum output power
of the PA is shown in Fig. 73. At a maximum output power of 31 dBm, the EVM is -21 dB (8.9%).
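The dB and percent values quoted for EVM, and the margins against the ACLR and EVM limits, are related by simple conversions, sketched here. The function names are hypothetical, and EVM is assumed to be an amplitude ratio, so the conversion uses 20 log10.

```python
import math


def evm_db_to_percent(evm_db):
    """Convert EVM from dB to percent, treating EVM as an amplitude ratio."""
    return 100.0 * 10.0 ** (evm_db / 20.0)


def evm_percent_to_db(evm_percent):
    """Convert EVM from percent to dB, treating EVM as an amplitude ratio."""
    return 20.0 * math.log10(evm_percent / 100.0)


def margin_db(limit_db, measured_db):
    """Margin to a limit; positive when the measured value is below the limit."""
    return limit_db - measured_db


if __name__ == "__main__":
    print(evm_db_to_percent(-21.0))   # measured EVM at 31 dBm
    print(evm_percent_to_db(17.0))    # 17% limit expressed in dB
    print(margin_db(-33.0, -35.8))    # ACLR margin at 31 dBm
```

These conversions give about 8.9% for -21 dB and about -15.4 dB for 17%, matching the values in the text and in Fig. 73, and an ACLR margin of 2.8 dB at 31 dBm.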
The phase error is measured as an indication of the PA’s AM/PM nonlinearity. The phase error as a
function of the output power is shown in Fig. 74. The phase error is under 2.5° up to 35 dBm output
power.
Recently reported linear PAs that use segmentation techniques to improve PAE are compared with the
proposed PA in Table IX. The proposed PA achieved a remarkable peak PAE as well as outstanding PSAT
and marks at PBO regions. Such an improvement was achieved by the combination of segmentation and
the proposed digital predistortion technique. Moreover, the proposed PA enables switching between different
modes within a very short time frame; to the author’s best knowledge, this is the first report of such a
feature.
Fig. 74. Phase error as a function of maximum output power.
(Plot axes: phase error (degree) versus POUT (dBm).)
TABLE IX
Comparison With Recently Published Works
Reference   Frequency   PSAT/PAE    VDD   CMOS   Size    Number     PAE Increase at PBO (%)
            (GHz)       (dBm/%)     (V)   (nm)   (mm2)   of Modes   7 dB   10 dB   15 dB
[62]        2.4         23.1/42     1.5   130    5.48    2          3.3    4.3     3.6
[63]        2.4         27/32       1.2   130    2       2          5.4    4.0     3.4
[64]        2.4         23.1/42     3.3   180    0.88    2          8.7    8.7     7.2
[75]        2           23/38       2.5   250    2.48    3          10     N/A     N/A
[76]        2.45        31.5/25     3.3   65     2.7     3          10     5       N/A
[78]        2.45        26.3/33     2     90     1.88    2          9      7       3
[79]        2.2         43          1.2   65     6.25    2          6.2    4       2.1
This work   1.9         35.3/44.9   2.5   40     4       4          13     7.36    9
VII.F. Conclusion
A 1.9 GHz segmented linear PA was designed and implemented in 40 nm CMOS technology. The
input signal is segmented and strategically amplified in the digital domain, while the PA is segmented and
its segments are properly manipulated to maintain its power gain invariant with voltage while achieving
significant power savings. The architecture emulates the operation of the conventional class-B amplifier,
thus achieving similar power efficiency. However, because the PA drivers are also made switchable,
this architecture may result in better power efficiency. The PA achieved a saturated/maximum linear output
power of 35/31 dBm with corresponding peak PAEs of 44.9% and 28.8%, respectively. A fast yet efficient
switching scheme that employs direct coupling between PA sections and drivers was demonstrated, which
enabled the PA to improve efficiency in the PBO region within a wideband communication standard. The
architecture can be combined with envelope tracking techniques to achieve better power efficiency figures.
The proposed techniques are general and can be used in other PA architectures as well.
VIII. CONCLUSION
This dissertation has examined various important aspects related to CMOS RF PA design, particularly
in the context of wireless communication applications. The proposals and theoretical analyses in the early
sections have led to the design and implementation of a high-efficiency, high-performance PA in 40
nm CMOS technology. The segmented PA is able to switch between different modes within a very short
time frame, and thus has achieved very high average power efficiency while maintaining industrial-grade
linearity. The fast switching feature is the first to be reported in the RF PA research and development
community. This work demonstrates that CMOS technology can be a serious candidate for implementing
high power RF PA, among other more expensive technologies such as SiGe BiCMOS and GaAs MESFET.
Section II first introduces various operation modes of PAs, which serve as a starting point of the
discussion and analysis of RF PA design. While it turned out the work in this dissertation is mainly
focused on linear PA design, the design and simulation techniques, especially the segmentation approach,
can be employed in switching mode PAs too. Also, the detailed analysis of the operation of PA in different
modes in this section serves as the basis on which the switching of segmented PA architecture forms.
The introduction of some of the relevant concepts in digital communication theories in Section III
points out the importance of linear PA in the current and future generations of wireless communications.
Therefore, it is indispensable to study the nonlinear mechanism in RF PA in Section IV. The root cause of
nonlinear effects lies in the use of nonlinear active device, and MOSFET nonlinear mechanisms, including
channel-length modulation, velocity saturation, and mobility degradation, are analyzed in detail. It is also
very important to have a good understanding of how the nonlinear effects can be efficiently characterized
in design. Conventional linearity tests, i.e., single-tone and two-tone tests, are also briefly reviewed. As
mentioned in Section III, modern communications use wideband, multicarrier modulation schemes; hence,
the conventional linearity tests can only provide an indication of the PA linearity. On the other hand,
a multitone test or modulated envelope simulation requires long simulation time and large computation
resources. To resolve this issue, a detailed analysis of multitone nonlinear effects is carried out in this
section, and as the analysis shows, a simple two-tone test can be used to predict multitone nonlinear
behavior. The result of this innovative analysis could save much resource in the design and implementation
of RF PA. Although currently not a major nonlinearity contributor to CMOS PA, the amplitude modulation
to phase modulation (AM-to-PM) conversion is briefly analyzed at the end of this section.
The analysis and design of impedance matching network is discussed in Section V. This topic is in
some way unique in RF design, and usually sets apart RF design from analog design. Yet at least for RF
PA applications, impedance matching is very important. A mismatched impedance at the output of the PA
would not only degrade the power efficiency significantly, but may even damage the device due to the high
power levels. Therefore, in this section the many design considerations of impedance matching circuits are
analyzed, including bandwidth, harmonic rejection, and insertion loss. Basic impedance matching building
blocks are also analyzed, with an emphasis on the Π-match due to its versatility and generality.
A theoretical study of nearly all PA architectures is presented in Section VI, which shows that, despite
the huge amount of research and development effort, PA architectures still usually improve either
power efficiency or linearity, but not both. This is because these two aspects form the fundamental tradeoff
in PA design.
Finally, based on all the theoretical studies and proposed innovations in the previous sections, a
segmented PA combined with a digital pre-warp architecture is proposed, designed, and implemented in
40 nm CMOS technology. This work reconciled the fundamental tradeoff between power efficiency and
linearity, and the fast-switching scheme implemented in this system has resulted in switching of PA
segments within the modulation, which is the first to be reported. The PA achieved 35 dBm output power
and 38 dB power gain in the 1.9 GHz WCDMA band. The peak PAE of 44.9% and good linearity meeting
the 3GPP standard show that this PA has a very balanced tradeoff between efficiency and linearity, and is
competitive with other more expensive, traditional RF IC technologies, such as SiGe, GaAs, and InP.
REFERENCES
[1] C. Paul, “Telephones aboard the Metroliner,” Bell Laboratories Record, March 1969.
[2] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits, 2nd ed. New York, NY,
USA: Cambridge University Press, 2004.
[3] International Telecommunication Union, International Telecommunication Union Database, 2015.
[Online]. Available: www.itu.int
[4] World Bank, World Bank Database, 2016. [Online]. Available: www.worldbank.org
[5] International Data Corporation, International Data Corporation Database, 2016. [Online]. Available:
www.idc.com
[6] J. Crols and M. Steyaert, “A single-chip 900 MHz CMOS receiver front-end with a high performance
low-IF topology,” Solid-State Circuits, IEEE Journal of, vol. 30, no. 12, pp. 1483–1492, Dec 1995.
[7] A. Rofougaran, J.-C. Chang, M. Rofougaran, and A. Abidi, “A 1 GHz CMOS RF front-end IC
for a direct-conversion wireless receiver,” Solid-State Circuits, IEEE Journal of, vol. 31, no. 7, pp.
880–889, Jul 1996.
[8] J. Rudell, J.-J. Ou, T. Cho, G. Chien, F. Brianti, J. Weldon, and P. Gray, “A 1.9-GHz wide-band IF
double conversion CMOS receiver for cordless telephone applications,” Solid-State Circuits, IEEE
Journal of, vol. 32, no. 12, pp. 2071–2088, Dec 1997.
[9] A. Shahani, D. Shaeffer, and T. Lee, “A 12-mW wide dynamic range CMOS front-end for a portable
GPS receiver,” Solid-State Circuits, IEEE Journal of, vol. 32, no. 12, pp. 2061–2070, Dec 1997.
[10] R. Kulkarni, J. Kim, H.-J. Jeon, J. Xiao, and J. Silva-Martinez, “UHF receiver front-end: Implementation
and analog baseband design considerations,” Very Large Scale Integration (VLSI) Systems,
IEEE Transactions on, vol. 20, no. 2, pp. 197–210, Feb 2012.
[11] O. Erdogan, R. Gupta, D. Yee, J. Rudell, J.-S. Ko, R. Brockenbrough, S.-O. Lee, E. Lei, J. L.
Tham, H. Wu, C. Conroy, and B. Kim, “A single-chip quad-band GSM/GPRS transceiver in 0.18
µm standard CMOS,” in Solid-State Circuits Conference, 2005. Digest of Technical Papers. ISSCC.
2005 IEEE International, Feb 2005, pp. 318–601 Vol. 1.
[12] D. Kaczman, M. Shah, N. Godambe, M. Alam, H. Guimaraes, L. Han, M. Rachedine, D. Cashen,
W. Getka, C. Dozier, W. Shepherd, and K. Couglar, “A single-chip tri-band (2100, 1900, 850/800
MHz) WCDMA/HSDPA cellular transceiver,” Solid-State Circuits, IEEE Journal of, vol. 41, no. 5,
pp. 1122–1132, May 2006.
[13] A. Hadjichristos, M. Cassia, H. Kim, C. H. Park, K. Wang, W. Zhuo, B. Ahrari, R. Brockenbrough,
J. Chen, C. Donovan, R. Jonnalagedda, J. Kim, J. Ko, H. Lee, S. Lee, E. Lei, T. Nguyen, T. Pan,
S. Sridhara, W. Su, H. Yan, J. Yang, C. Conroy, C. Persico, K. Sahota, and B. Kim, “Single-chip RF
CMOS UMTS/EGSM transceiver with integrated receive diversity and GPS,” in Solid-State Circuits
Conference - Digest of Technical Papers, 2009. ISSCC 2009. IEEE International, Feb 2009, pp.
118–119,119a.
[14] H. Moon, J. Han, S.-I. Choi, D. Keum, and B.-H. Park, “An area-efficient 0.13-µm CMOS multiband
WCDMA/HSDPA receiver,” Microwave Theory and Techniques, IEEE Transactions on, vol. 58, no. 5,
pp. 1447–1455, May 2010.
[15] H. Wang, C.-H. Peng, C. Lu, Y. Chang, R. Huang, A. Chang, G. Shih, R. Hsu, P. Liang,
S. Son, A. Niknejad, G. Chien, C. Tsai, and H. Hwang, “A highly-efficient multi-band multi-mode
digital quadrature transmitter with 2D pre-distortion,” in Circuits and Systems (ISCAS), 2013 IEEE
International Symposium on, May 2013, pp. 501–504.
[16] Anonymous, “The great debate: SOC vs. SIP,” EE Times, March 2005.
[17] I. Aoki, S. Kee, R. Magoon, R. Aparicio, F. Bohn, J. Zachan, G. Hatcher, D. McClymont, and
A. Hajimiri, “A fully-integrated quad-band GSM/GPRS CMOS power amplifier,” Solid-State Circuits,
IEEE Journal of, vol. 43, no. 12, pp. 2747–2758, Dec 2008.
[18] J. G. Proakis and M. Salehi, Digital Communications, 5th ed. New York, NY, USA: McGraw-Hill,
2007.
[19] F. Raab, P. Asbeck, S. Cripps, P. Kenington, Z. Popovic, N. Pothecary, J. Sevic, and N. Sokal,
“Power amplifiers and transmitters for RF and microwave,” Microwave Theory and Techniques,
IEEE Transactions on, vol. 50, no. 3, pp. 814–826, Mar 2002.
[20] S. C. Cripps, RF Power Amplifiers for Wireless Communications, 2nd ed. Norwood, MA, USA:
Artech House, Inc., 2006.
[21] S.-A. El-Hamamsy, “Design of high-efficiency RF class-D power amplifier,” Power Electronics, IEEE
Transactions on, vol. 9, no. 3, pp. 297–308, May 1994.
[22] N. Sokal and A. Sokal, “Class E-a new class of high-efficiency tuned single-ended switching power
amplifiers,” Solid-State Circuits, IEEE Journal of, vol. 10, no. 3, pp. 168–176, Jun 1975.
[23] F. Raab, “Idealized operation of the class E tuned power amplifier,” Circuits and Systems, IEEE
Transactions on, vol. 24, no. 12, pp. 725–735, Dec 1977.
[24] ——, “Class-F power amplifiers with maximally flat waveforms,” Microwave Theory and Techniques,
IEEE Transactions on, vol. 45, no. 11, pp. 2007–2012, Nov 1997.
[25] A. Hadjichristos, “Transmit architectures and power control schemes for low cost highly integrated
transceivers for GSM/EDGE applications,” in Circuits and Systems, 2003. ISCAS ’03. Proceedings
of the 2003 International Symposium on, vol. 3, May 2003, pp. III–610–III–613 vol.3.
[26] Hewlett-Packard, “Digital modulation in communications systems - an introduction,” Hewlett Packard
Application Note 1298, July 1997.
[27] H. Izumi, M. Kojima, Y. Umeda, and O. Takyu, “Comparison between quadrature- and polar-
modulation switching-mode transmitter with pulse-density modulation,” in Advanced Communication
Technology (ICACT), 2013 15th International Conference on, Jan 2013, pp. 1140–1145.
[28] B. Razavi, RF Microelectronics, 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2011.
[29] 3rd Generation Partnership Project, “3GPP TS 05.05 technical specification rev. 8.20.0,” 1999.
[Online]. Available: http://www.3gpp.org
[30] I. S. Committee, “IEEE Std. 802.11ac-2013,” IEEE Standard for Information Technology, pp. 1–425,
Dec 2013.
[31] 3rd Generation Partnership Project, “3GPP TS 25.101 technical specification rev. 12.3.0,” March 2014.
[Online]. Available: http://www.3gpp.org
[32] P. R. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated
Circuits, 5th ed. Hoboken, NJ, USA: John Wiley & Sons, 2009.
[33] Y. Tsividis, Operation and Modeling of the MOS Transistor, 2nd ed. New York, NY, USA: Oxford
University Press, 1999.
[34] S. Narayanan, “Transistor distortion analysis using Volterra series representation,” Bell System
Technical Journal, The, vol. 46, no. 5, pp. 991–1024, May 1967.
[35] S. Maas, “Volterra analysis of spectral regrowth,” Microwave and Guided Wave Letters, IEEE, vol. 7,
no. 7, pp. 192–193, Jul 1997.
[36] Q. Wu, H. Xiao, and F. Li, “Linear RF power amplifier design for CDMA signals: a spectrum analysis
approach,” Microwave Journal, vol. 41, no. 12, p. 22, 1998.
[37] J. Pedro and N. de Carvalho, “On the use of multitone techniques for assessing RF components’
intermodulation distortion,” Microwave Theory and Techniques, IEEE Transactions on, vol. 47,
no. 12, pp. 2393–2402, Dec 1999.
[38] S. A. Maas, Nonlinear Microwave and RF Circuits, 2nd ed. Norwood, MA, USA: Artech House,
Inc., 2002.
[39] K. Kundert, “Introduction to RF simulation and its application,” Solid-State Circuits, IEEE Journal
of, vol. 34, no. 9, pp. 1298–1319, Sep 1999.
[40] T. Quarles, D. Pederson, R. Newton, A. Sangiovanni-Vincentelli, and C. Wayne. SPICE User
Guide. EECS Department of the University of California at Berkeley. [Online]. Available:
http://bwrcs.eecs.berkeley.edu/Classes/IcBook/SPICE/
[41] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers, 6th ed. New York, NY,
USA: Oxford University Press, 2008.
[42] M. Leffel, “Intermodulation distortion in a multi-signal environment,” RF Design, June 1995.
[43] H. Qian and J. Silva-Martinez, “A 44.9% PAE digitally-assisted linear power amplifier in 40 nm
CMOS,” in Solid-State Circuits Conference (A-SSCC), 2014 IEEE Asian, Nov 2014, pp. 349–352.
[44] J. Aikio and T. Rahkonen, “A comprehensive analysis of AM-AM and AM-PM conversion in an
LDMOS RF power amplifier,” Microwave Theory and Techniques, IEEE Transactions on, vol. 57,
no. 2, pp. 262–270, Feb 2009.
[45] L. Cotimos Nunes, P. Cabral, and J. Pedro, “AM/AM and AM/PM distortion generation mechanisms
in Si LDMOS and GaN HEMT based RF power amplifiers,” Microwave Theory and Techniques,
IEEE Transactions on, vol. 62, no. 4, pp. 799–809, April 2014.
[46] G. Gonzalez, Microwave Transistor Amplifiers: Analysis and Design, 2nd ed. Upper Saddle River,
NJ, USA: Prentice-Hall, Inc., 1997.
[47] S. Cripps, “A theory for the prediction of GaAs FET load-pull power contours,” in Microwave
Symposium Digest, 1983 IEEE MTT-S International, May 1983, pp. 221–223.
[48] G. E. Bodway, “Two port power flow analysis using generalized scattering parameters,” Microwave
Journal, vol. 10, no. 6, May 1967, also available in HP application note 95.
[49] J. Wannstrom, Carrier Aggregation Explained, June 2013. [Online]. Available: www.3gpp.org
[50] Qualcomm Technology Inc., “LTE advanced - evolving and expanding in to new frontiers,” August
2014. [Online]. Available: www.qualcomm.com
[51] H. Chireix, “High power outphasing modulation,” Radio Engineers, Proceedings of the Institute of,
vol. 23, no. 11, pp. 1370–1392, Nov 1935.
[52] I. Hakala, D. Choi, L. Gharavi, N. Kajakine, J. Koskela, and R. Kaunisto, “A 2.14-GHz Chireix
outphasing transmitter,” Microwave Theory and Techniques, IEEE Transactions on, vol. 53, no. 6,
pp. 2129–2138, June 2005.
[53] F. Raab, “Efficiency of outphasing RF power-amplifier systems,” Communications, IEEE Transactions
on, vol. 33, no. 10, pp. 1094–1099, Oct 1985.
[54] W. Doherty, “A new high-efficiency power amplifier for modulated waves,” Bell System Technical
Journal, The, vol. 15, no. 3, pp. 469–475, July 1936.
[55] F. Raab, “Efficiency of Doherty RF power-amplifier systems,” Broadcasting, IEEE Transactions on,
vol. BC-33, no. 3, pp. 77–83, Sept 1987.
[56] L. Kahn, “Single-sideband transmission by envelope elimination and restoration,” Proceedings of the
IRE, vol. 40, no. 7, pp. 803–806, July 1952.
[57] J. Staudinger, B. Gilsdorf, D. Newman, G. Norris, G. Sadowniczak, R. Sherman, and T. Quach, “High
efficiency CDMA RF power amplifier using dynamic envelope tracking technique,” in Microwave
Symposium Digest. 2000 IEEE MTT-S International, vol. 2, June 2000, pp. 873–876 vol.2.
[58] G. Hanington, P.-F. Chen, P. Asbeck, and L. Larson, “High-efficiency power amplifier using
dynamic power-supply voltage for CDMA applications,” Microwave Theory and Techniques, IEEE
Transactions on, vol. 47, no. 8, pp. 1471–1476, Aug 1999.
[59] F. Wang, D. Kimball, D. Lie, P. Asbeck, and L. Larson, “A monolithic high-efficiency 2.4-GHz 20-
dBm SiGe BiCMOS envelope-tracking OFDM power amplifier,” Solid-State Circuits, IEEE Journal
of, vol. 42, no. 6, pp. 1271–1281, June 2007.
[60] J. Kim, Y. Yoon, H. Kim, K. H. An, W. Kim, H.-W. Kim, C.-H. Lee, and K. Kornegay, “A linear
multi-mode CMOS power amplifier with discrete resizing and concurrent power combining structure,”
Solid-State Circuits, IEEE Journal of, vol. 46, no. 5, pp. 1034–1048, May 2011.
[61] A. Shirvani, D. Su, and B. Wooley, “A CMOS RF power amplifier with parallel amplification for
efficient power control,” Solid-State Circuits, IEEE Journal of, vol. 37, no. 6, pp. 684–693, Jun 2002.
[62] P. Reynaert and M. S. Steyaert, “A 2.45-GHz 0.13-µm CMOS PA with parallel amplification,”
Solid-State Circuits, IEEE Journal of, vol. 42, no. 3, pp. 551–562, March 2007.
[63] G. Liu, P. Haldi, T.-J. K. Liu, and A. Niknejad, “Fully integrated CMOS power amplifier with
efficiency enhancement at power back-off,” Solid-State Circuits, IEEE Journal of, vol. 43, no. 3, pp.
600–609, March 2008.
[64] Y. Yoon, J. Kim, H. Kim, K. H. An, O. Lee, C.-H. Lee, and J. Kenney, “A dual-mode CMOS
RF power amplifier with integrated tunable matching network,” Microwave Theory and Techniques,
IEEE Transactions on, vol. 60, no. 1, pp. 77–88, Jan 2012.
[65] A. Niknejad, D. Chowdhury, and J. Chen, “Design of CMOS power amplifiers,” Microwave Theory
and Techniques, IEEE Transactions on, vol. 60, no. 6, pp. 1784–1796, June 2012.
[66] L. Larson, “RF and microwave hardware challenges for future radio spectrum access,” Proceedings
of the IEEE, vol. 102, no. 3, pp. 321–333, March 2014.
[67] K. Okada, R. Minami, Y. Tsukui, S. Kawai, Y. Seo, S. Sato, S. Kondo, T. Ueno, Y. Takeuchi,
T. Yamaguchi, A. Musa, R. Wu, M. Miyahara, and A. Matsuzawa, “A 64-QAM 60GHz CMOS
transceiver with 4-channel bonding,” in Solid-State Circuits Conference Digest of Technical Papers
(ISSCC), 2014 IEEE International, Feb 2014, pp. 346–347.
[68] M. Ebrahimi, M. Helaoui, and F. Ghannouchi, “Delta-sigma-based transmitters: Advantages and
disadvantages,” Microwave Magazine, IEEE, vol. 14, no. 1, pp. 68–78, Jan 2013.
[69] E. Kaymaksut and P. Reynaert, “A dual-mode transformer-based Doherty LTE power amplifier in
40nm CMOS,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE
International, Feb 2014, pp. 64–65.
[70] W.-Y. Kim, H. Son, J. Kim, J. Jang, I. Oh, and C. Park, “A CMOS envelope-tracking transmitter
with an on-chip common-gate voltage modulation linearizer,” Microwave and Wireless Components
Letters, IEEE, vol. PP, no. 99, pp. 1–1, 2014.
[71] K. Oishi, E. Yoshida, Y. Sakai, H. Takauchi, Y. Kawano, N. Shirai, H. Kano, M. Kudo, T. Murakami,
T. Tamura, S. Kawai, S. Yamaura, K. Suto, H. Yamazaki, and T. Mori, “A 1.95GHz fully integrated
envelope elimination and restoration CMOS power amplifier with envelope/phase generator and
timing aligner for WCDMA and LTE,” in Solid-State Circuits Conference Digest of Technical Papers
(ISSCC), 2014 IEEE International, Feb 2014, pp. 60–61.
[72] B. Sahu and G. Rincon-Mora, “A high-efficiency linear RF power amplifier with a power-tracking
dynamically adaptive buck-boost supply,” Microwave Theory and Techniques, IEEE Transactions on,
vol. 52, no. 1, pp. 112–120, Jan 2004.
[73] I. Rippke, J. Duster, and K. Kornegay, “A single-chip variable supply voltage power amplifier,” in
Radio Frequency Integrated Circuits (RFIC) Symposium, 2005. Digest of Papers. 2005 IEEE, June
2005, pp. 255–258.
[74] D. Chowdhury, C. Hull, O. Degani, Y. Wang, and A. Niknejad, “A fully integrated dual-mode highly
linear 2.4 GHz CMOS power amplifier for 4G WiMax applications,” Solid-State Circuits, IEEE
Journal of, vol. 44, no. 12, pp. 3393–3402, Dec 2009.
[75] H. Hedayati, M. Mobarak, G. Varin, P. Meunier, P. Gamand, E. Sanchez-Sinencio, and K. Entesari, “A
2-GHz highly linear efficient dual-mode BiCMOS power amplifier using a reconfigurable matching
network,” Solid-State Circuits, IEEE Journal of, vol. 47, no. 10, pp. 2385–2404, Oct 2012.
[76] A. Afsahi and L. Larson, “Monolithic power-combining techniques for watt-level 2.4-GHz CMOS
power amplifiers for WLAN applications,” Microwave Theory and Techniques, IEEE Transactions
on, vol. 61, no. 3, pp. 1247–1260, March 2013.
[77] P. T. M. van Zeijl and M. Collados, “A digital envelope modulator for a WLAN OFDM polar transmitter
in 90 nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 42, no. 10, pp. 2204–2211, Oct 2007.
[78] E. Kaymaksut and P. Reynaert, “Transformer-based uneven Doherty power amplifier in 90 nm CMOS
for WLAN applications,” IEEE Journal of Solid-State Circuits, vol. 47, no. 7, pp. 1659–1671, July
2012.
[79] L. Ye, J. Chen, L. Kong, P. Cathelin, E. Alon, and A. Niknejad, “A digitally modulated 2.4GHz
WLAN transmitter with integrated phase path and dynamic load modulation in 65nm CMOS,” in
Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International, Feb
2013, pp. 330–331.
