# DIGITAL CMOS RF POWER AMPLIFIERS 

by

Wen Yuan

A dissertation submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Department of Electrical and Computer Engineering
The University of Utah
August 2016



ProQuest 10162908
Published by ProQuest LLC (2016). Copyright of the Dissertation is held by the Author.


Copyright © Wen Yuan 2016
All Rights Reserved

## The University of Utah Graduate School

## STATEMENT OF THESIS APPROVAL

The thesis of

## Wen Yuan

has been approved by the following supervisory committee members:

| Committee member | Role | Date approved |
| :---: | :---: | :---: |
| Jeffrey Sean Walling | Chair | 03/17/2016 |
| Darrin J. Young | Member | 03/17/2016 |
| Kenneth S. Stevens | Member | 03/17/2016 |
| Rongrong Chen | Member | 03/17/2016 |
| Sneha K. Kasera | Member | 03/17/2016 |

and by Gianluca Lazzi, Chair/Dean of the Department/College/School of Electrical and Computer Engineering

and by David B. Kieda, Dean of The Graduate School.


#### Abstract

High-speed wireless communication systems (e.g., long-term evolution (LTE), Wi-Fi) operate with high bandwidth and large peak-to-average power ratios (PAPRs). This is largely due to the use of orthogonal frequency division multiplexing (OFDM) modulation, which is prevalent because it maximizes the spectral efficiency of the communication system. The power amplifier (PA) in the transmitter is the dominant energy consumer in the radio, largely because of the PAPR of the input signal. Reducing the energy consumption of the PA requires an amplifier that simultaneously achieves high efficiency and high linearity. Furthermore, to lower the cost for high-volume production, it is desirable to achieve complete system-on-chip (SoC) integration.

Linear amplifiers (e.g., Class-A, -B, -AB) are inefficient when amplifying signals with the large PAPR associated with modulation techniques such as LTE and OFDM. Switching amplifiers (e.g., Class-D, -E, -F) are very promising due to their high efficiency when compared to their linear counterparts. Linearization techniques for switching amplifiers have been intensively investigated because their output has limited sensitivity to the input amplitude of the signal. Deep-submicron CMOS technology is mostly utilized for logic circuitry, and the Moore's law scaling of CMOS optimizes transistors to operate as high-speed and low-loss switches rather than high-gain transistors. Hence, it is advantageous to use transistors in switching mode as switching amplifiers and to use high-speed digital logic circuitry to implement linearization systems and circuitry.


In this work, several linearization architectures are investigated and demonstrated. An envelope elimination and restoration (EER) transmitter that comprises a class-E power amplifier and a 10-bit digital-to-analog converter (DAC) controlled current modulator is investigated. A pipelined switched-capacitor DAC is designed to control an open-loop transconductor that operates as a current modulator, modulating the amplitude of the current supplied to a class-E PA. Such a topology allows for increased filtering of the quantization noise that is problematic in most digital PAs (DPAs). The proposed quadrature and multiphase architectures avoid the bandwidth expansion and delay mismatch associated with polar PAs. The multiphase switched capacitor power amplifier (SCPA) was proposed after the quadrature SCPA, and it significantly improves the power efficiency.

To my family.
To the world.

## TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
Chapters
1 INTRODUCTION
1.1 Background and Motivations
1.2 Contributions of This Research
1.3 Organization of the Dissertation
2 FUNDAMENTALS OF CMOS POWER AMPLIFIERS
2.1 Class-A, -AB, -B, and -C
2.2 Class-D
2.3 Class-E
2.4 Class-F
2.5 Class-F$^{-1}$
2.6 Class-D$^{-1}$
3 LINEARIZATION TECHNIQUES
3.1 Envelope Elimination and Restoration
3.2 Outphasing
3.3 Pulse-Width and Pulse-Position Modulation
3.4 Digital Polar PA (DPA)
3.5 Hybrid Architectures
3.6 Summary
4 A CURRENT-MODULATED CLASS-E PA
4.1 Introduction
4.2 Differential Class-E Power Amplifier
4.3 Current Modulation with Unary Current-Mode DAC
4.4 Current Modulation with Pipelined SC-DAC
4.5 Summary
5 QUADRATURE SWITCHED CAPACITOR PA
5.1 Introduction
5.2 Theory of Operation
5.3 Circuit Details
5.4 Experimental Results
5.5 Summary
6 A MULTIPHASE SWITCHED CAPACITOR PA
6.1 Motivations
6.2 Theory of Multiphase Modulation
6.3 Circuit Details
6.4 Measurement Results
6.5 Summary
7 CONCLUSIONS AND FUTURE WORK
7.1 Summary
7.2 Future Work
REFERENCES

## LIST OF FIGURES

## Figures

1.1. Plot of PDF for 802.11 and efficiency of PAs (class-B and class-E) as a function of envelope.
1.2. Block diagram of a transmitter with a linear PA.
1.3. Block diagram of a transmitter with a switching PA.
2.1. Schematics of a linear PA.
2.2. (a) Drain voltage; (b) drain current for ideal class-A, -AB, -B, and -C.
2.3. A typical class-D PA.
2.4. Voltage and current waveforms at the drain of PMOS/NMOS.
2.5. A class-E PA.
2.6. Voltage and current waveforms at the drain.
2.7. A class-F PA.
2.8. Voltage and current waveforms at the drain.
2.9. A class-F PA with on-chip terminations.
2.10. An inverse class-F PA.
2.11. Voltage and current waveforms at the drain.
2.12. An inverse class-D PA.
2.13. Voltage and current waveforms at the drain.
3.1. Schematic of an SCPA.
3.2. Block diagram of a typical EER PA.
3.3. Block diagram of an outphasing PA.
3.4. Outphasing operation in polar coordinates.
3.5. Block diagram of a PWPM PA.
3.6. Illustration of (a) modulating the amplitude with the pulse width; (b) modulating the phase with the pulse position [76], [79].
3.7. Block diagram of a DPA.
4.1. Typical class-E power amplifier.
4.2. Cascoded differential class-E amplifier.
4.3. Block diagram of the current DAC modulated EER transmitter.
4.4. Block diagram of the unary-weighted, current-mode DAC.
4.5. (a) Differential current cell; (b) single-ended cascade current.
4.6. (a) Simulated output power versus input code; (b) simulated PAE versus output power.
4.7. (a) DAC input code versus output amplitude; (b) DAC input code versus output phase.
4.8. Simulated PSD of the proposed EER transmitter.
4.9. Simulated 64-QAM OFDM signal constellation.
4.10. Block diagram of the switched capacitor DAC modulated EER transmitter.
4.11. (a) Pipelined switched capacitor DAC; (b) three-phase clock.
4.12. Interleaved differential pipelined switched capacitor DAC.
4.13. Simulated DNL of the SC-DAC with op-amp.
4.14. Simulated INL of the SC-DAC with op-amp.
4.15. (a) DAC input code versus output amplitude; (b) DAC input code versus output phase.
4.16. Diagram of signal flow with offline DPD.
4.17. Microphotograph of SC-DAC modulated EER transmitter in 130 nm CMOS.
4.18. (a) Measured output power versus input code; (b) measured PAE versus output power.
4.19. Measured peak output power and PAE versus frequency.
4.20. Measured 10 MHz 64-QAM LTE constellation with EVM of 2.35%-rms after DPD. The constellation before DPD is in gray.
4.21. Measured ACLR of an LTE signal (a) without DPD; (b) with DPD.
4.22. Measured ACLR of a WCDMA signal (a) without DPD; (b) with DPD.
5.1. Block diagram of an SCPA-based quadrature power amplifier.
5.1. Schematic of an SCPA.
5.2. (a) Schematic diagram of a Q-SCPA; (b) waveforms of I/Q vectors.
5.3. Schematics of capacitively combined quadrature SCPAs outputting (a) $-6+j1$, (b) $8+j8$, (c) $-2-j4$, and (d) $3-j6$.
5.4. Comparison of ideal drain efficiency, $\eta$, versus $P_{\text{out}}$ for a conventional SCPA and several Q-SCPAs.
5.5. Comparison of the total efficiency versus $Q_{\text{loaded}}$ for several code words in a Q-SCPA.
5.6. Block diagram of the proposed quadrature SCPA. Note that the actual implementation is differential and that the switches are cascoded class-G switches (see Figure 5.8). The unit capacitance size is 200 fF.
5.7. (a) Custom differential inductor, $L_{\text{ser}}$; (b) simulated inductance and resistance versus frequency.
5.8. Schematic of unit class-G driver with active supply of (a) VDD2; (b) VDD. All transistors are minimum length, with the following widths in µm: $P_{1}=P_{2}=87.84$, $N_{1}=28.8$, $N_{2}=38.88$.
5.9. Q-SCPA class-G logic decoder. Note that the unit size for an NMOS transistor is 550 nm × 60 nm, while a PMOS is 1320 nm × 60 nm.
5.10. Chip microphotograph of the 65 nm experimental prototype transformer combined SCPA.
5.11. Measured output power and PAE versus frequency.
5.12. (a) Measured output power versus codeword; (b) measured PAE versus output power.
5.13. Measured ACLR for a 10 MHz, 64-QAM LTE signal.
5.14. Measured signal constellation for a 10 MHz, 64-QAM LTE signal.
5.15. Measured OOB spectrum for a 10 MHz, 64-QAM LTE signal.
6.1. (a) IQ summation at output; (b) IQ waveforms.
6.2. (a) Quadrature clocks with 50% duty cycle; (b) quadrature clocks with 25% duty cycle [23].
6.3. (a) Eight-phase vectors in Cartesian coordinates; (b) four of eight-phase clocks in time domain.
6.4. Polar to multiphase conversion.
6.5. Cartesian to multiphase conversion.
6.6. Example multiphase operations with SCPA.
6.7. (a) Output power versus input code; (b) PAE versus input code.
6.8. Output power versus the number of switching capacitors with different M values.
6.9. One unit cell of a cascoded switch.
6.10. Block diagram of the 16-phase SCPA.
6.11. Chip microphotograph of the 130 nm multiphase SCPA.
6.12. Measured $P_{\text{out}}$ and PAE versus frequency.
6.13. Measured $P_{\text{out}}$ versus input code ($n_{1}$, $n_{2}$, and $m$ are mapped to IQ).
6.14. Measured output with all codes using (a) two phases; (b) 16 phases.
6.15. Example of data points of (a) predistorted input; (b) output after DPD.
6.16. 2D surface fit from the output of MP-SCPA.
6.17. Comparison between LUT and surface fit: (a) IDPD; (b) QDPD.
6.18. Measured output with DPD: (a) output power; (b) PAE.
6.19. Measured ACLR for a 10 MHz, 64-QAM LTE signal (a) no DPD; (b) with DPD.
6.20. Measured signal constellation for a 10 MHz, 64-QAM LTE signal (blue dots represent the signal after DPD; gray points represent the signal before DPD).
6.21. Measured OOB spectrum for a 10 MHz, 64-QAM LTE signal.

## LIST OF TABLES

## Tables

1.1. Several watt-level CMOS power amplifiers
2.1. Conduction angles and duty cycles of linear PAs
2.2. Ideal peak efficiencies of linear PAs
5.1. Comparison to prior art
6.1. Power loss of multiphase architecture and digital polar architecture
6.2. Comparison to prior art

## ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my advisor Prof. Jeffrey Walling for his invaluable guidance and support of my Ph.D. study. I appreciate his great assistance in my research, writing and presentation. He guided me to interesting and challenging projects from system design to hands-on experiments.

I would like to thank the rest of my thesis committee: Prof. Rongrong Chen, Prof. Sneha Kasera, Prof. Ken Stevens, and Prof. Darrin Young, for their insightful comments and suggestions.

I would also like to thank my fellow labmates in Power Efficient Radio Frequency IC (PERFIC) Lab for the stimulating discussions and mutual assistance.

I would also like to thank my parents for their understanding and encouragement of my research and study.

## CHAPTER 1

## INTRODUCTION

### 1.1 Background and Motivations

The pursuit of high data rates in wireless communications has led to the prevalence of nonconstant envelope (non-CE) modulation techniques with high peak-to-average power ratio (PAPR), such as those used in Wi-Fi and long-term evolution (LTE). In conventional transmitter architectures, a linear amplifier is needed to amplify these non-CE modulated signals. Linear amplifiers in CMOS are inefficient for such signals because their efficiency falls as the output power is backed off from its peak, as shown in Figure 1.1 [1]. Switching amplifiers are very attractive due to their high power efficiency [2]-[6]. Deep-submicron CMOS processes are fundamentally digital technologies where the scaling of CMOS focuses on optimization of transistors as high-speed and low-loss switches rather than high-gain amplifiers. Hence, it is advantageous to use switching amplifiers since they can benefit from CMOS scaling. In a common switching PA (e.g., class-D, -E, -F), transistors act as switches, toggling between on and off at the carrier frequency [1]. During the on-state the switches can be modeled as a small on-resistance, whereas they are modeled as a large off-resistance during the off-state. Since switching transistors operate in the triode region, the output amplitude has reduced sensitivity to the input signal amplitude. Therefore, switching PAs are less sensitive to variations in transistor characteristics due to process, voltage, and temperature.

### 1.1.1 CMOS PAs for WLAN

Due to their potential for low-cost integration, CMOS PAs are very competitive with III-V compound semiconductor PAs for connectivity applications such as Wireless Local Area Network (WLAN) [7]-[9] and Bluetooth [10]-[13]. In CMOS, the digital signal processing (DSP) backend can be fabricated on a single chip with the receiver and transmitter as a complete system-on-chip. This reduces the implementation size of a complete transmitter, saving area on printed circuit boards (PCBs), and hence lowers the cost of wireless transceivers. Most CMOS transmitters use linear PAs to implement the output stage, as shown in Figure 1.2. CMOS linear PAs suffer from AM-AM and AM-PM distortion, as well as memory effects. To meet the requirements of modulation standards, such as adjacent channel power ratio (ACPR) and error-vector magnitude (EVM), self-testing circuitry is implemented on-chip to adjust the input bias for different signal levels, known as analog predistortion (APD). An alternative technique is to modify the digital signal at baseband, known as digital predistortion (DPD) [14]-[16].

One problem with integration is that linear CMOS PAs, as shown in Figure 1.2, are not energy efficient when operating below peak output power. This is especially true for OFDM signals with high peak-to-average power ratio (PAPR). Therefore, low cost and a high level of integration are advantages of CMOS PAs, while III-V PAs win on output power and power efficiency.
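To put rough numbers on this back-off penalty, the following sketch uses the textbook ideal relations for linear stages: class-A efficiency scales with output power (peak 50%), and class-B efficiency scales with output amplitude (peak π/4). These idealized relations, the function name, and the example back-off levels are assumptions chosen for illustration; they are not measured data from this work.

```python
import math

def ideal_efficiency_at_backoff(backoff_db, pa_class):
    """Ideal drain efficiency of a linear PA operated backoff_db below peak power.

    Assumes textbook ideal behavior (not data from this work):
      class-A: efficiency proportional to output power, peak 50%
      class-B: efficiency proportional to output amplitude, peak pi/4
    """
    power_ratio = 10.0 ** (-backoff_db / 10.0)  # P_out / P_peak
    if pa_class == "A":
        return 0.5 * power_ratio
    if pa_class == "B":
        return (math.pi / 4.0) * math.sqrt(power_ratio)
    raise ValueError("pa_class must be 'A' or 'B'")

if __name__ == "__main__":
    for backoff_db in (0, 3, 6, 10):  # illustrative back-off levels in dB
        eta_a = ideal_efficiency_at_backoff(backoff_db, "A")
        eta_b = ideal_efficiency_at_backoff(backoff_db, "B")
        print(f"{backoff_db:>2} dB back-off: class-A {eta_a:6.1%}, class-B {eta_b:6.1%}")
```

Under these ideal assumptions, a 10 dB back-off reduces class-A efficiency to 5% and class-B efficiency to roughly 25%, which illustrates why high-PAPR signals are costly for linear PAs.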

In order to enhance the power efficiency of CMOS transmitters, a digitally modulated switching PA can be used to replace the linear PA and the RF modulator, as shown in Figure 1.3 [17], [18]. The efficiency of switching PAs is significantly higher than that of their linear counterparts, especially in fine-line CMOS, due to the low operating voltage and low intrinsic gain of the transistor. However, a digitally modulated PA may introduce quantization noise [19] and may require digital predistortion [20]-[22] to enhance its linearity. To reduce the distortion of the digital PA, on-chip self-testing and calibration circuitry can be implemented. This is not practical in III-V technology, where digital circuitry is costly and power hungry to implement because there are no complementary transistors. Therefore, digitally modulated switching PAs have attracted great attention recently [3], [17], [19], [23]-[27].

In Figure 1.2, the design flows of the digital backend and RF modulator are well established. Although digital PAs can eliminate the RF modulator, as shown in Figure 1.3 [17], [18], their digital backend requires a modified implementation. As a result, switching PAs have yet to become prevalent in the market, because both the digital backend and the PA output stage need to be designed in parallel to achieve good reliability.

### 1.1.2 CMOS PAs for Mobile Transceivers

Implementing mobile transceivers with CMOS PAs is very attractive due to the low cost in high-volume production and the enhanced functionality and integration of a system-on-chip (SoC). However, the output power of a typical CMOS PA is approximately a quarter of a watt. Design of a power-efficient CMOS PA that delivers more than 1 W is still challenging [28], [29]. CMOS PAs for second-generation (2G) RF transceivers have been commercially available [30]-[32], but it is still hard to meet third-generation (3G) or fourth-generation (4G) standards due to the large PAPR of these signals. Several watt-level CMOS PAs have been introduced in recent years [29], [33]-[35]. These CMOS PAs utilize on-chip transformer-based power combiners, and their power efficiency is significantly lower than that of their III-V PA counterparts, as shown in Table 1.1.

With CMOS scaling, the operating voltage and the gain of the transistors are reduced. Since linear CMOS PAs cannot compete with linear III-V PAs due to their low efficiency, the implementation of CMOS transmitters using switching PAs looks promising. In switching PAs, CMOS transistors operate as fast, low-loss switches and take full advantage of CMOS scaling.

The output amplitude of a switching PA is insensitive to the input amplitude, so additional linearization techniques are required to achieve amplitude modulation. This requires the design of accompanying digital logic circuitry. Fortunately, digital circuitry in CMOS is low cost and low power, but the mixed-signal design of the whole system adds complexity.

Load insensitivity is a challenging requirement in practical manufacturing since PA modules are required to tolerate load variations. In switching PAs, the transistors operate as either voltage sources or current sources; hence, they are insensitive to load variations and provide a promising solution.

Another promising prospect for switching PAs is the development of RF CMOS processes. In the past, CMOS scaling focused on faster switching and lower voltage. For PA applications, however, faster switching and higher voltage are preferred. As Moore's law approaches its limit, it would be beneficial for foundries to explore RF-friendly devices as the market for wireless communications grows rapidly.

The efficiency of switching PAs is competitive, but the peak output power is still limited by the low operating voltage, losses in the matching network due to low-Q on-chip inductors, and losses due to voltage/current division at the transistor drain node. The operating voltage can be increased by using cascoded switches [3], [36]. An alternative to on-chip inductors is to integrate surface-mount (SMD) inductors at the packaging level. This allows higher output power to be achieved with lower matching losses.

### 1.2 Contributions of This Research

Several linearization techniques are proposed and implemented.
At the architecture level, the concept of the multiphase architecture is first proposed and investigated. The multiphase architecture can eliminate the bandwidth expansion and delay mismatch issues associated with the polar architecture while alleviating the power loss issue associated with the quadrature architecture, at the expense of a more complex coordinate conversion.

At the system/transistor design level, the multiphase architecture is first realized with a switched capacitor power amplifier, together with phase selection logic and a multiphase decoder. In order to synthesize the digital circuitry, a library of standard cells was built and a design flow for 130 nm CMOS was independently developed.

In the device testing and linearization process, the methods of 2D lookup table (LUT) and 2D surface fit for DPD are developed and realized with LabVIEW programs. With the automatic device characterization and linearization using LabVIEW, several types of CMOS PAs are optimized to meet the stringent LTE/WCDMA requirements.

### 1.3 Organization of the Dissertation

The dissertation is organized as follows:
A review of different classes of power amplifiers is presented in Chapter 2.
In Chapter 3, various conventional linearization techniques are reviewed and their advantages and limitations are compared and discussed.

After that, several linearization architectures are proposed, investigated, and demonstrated that address primary concerns with prior art in switching PA linearization.
In Chapter 4, an envelope elimination and restoration (EER) transmitter that comprises a class-E power amplifier and a 10-bit DAC-controlled current modulator is investigated. A pipelined switched-capacitor DAC is designed to control an open-loop transconductor that operates as a current modulator, modulating the amplitude of the current supplied to a class-E PA. Such a topology allows for increased filtering of the quantization noise that is problematic in most digital PAs (DPAs). The EER PA operates in the polar domain and is subject to significant nonideality at the system level. Namely, transformation to the polar domain increases the bandwidth when compared to Cartesian signals. This causes group delay differences between the basis polar vectors that lead to delay matching issues when the polar signals are recombined at the PA. The system measurements yield a peak output power and power added efficiency (PAE) of 22.5 dBm and 23.6%, respectively. When applying a WCDMA signal, the measured EVM is 1.32%-rms and the adjacent channel power ratio (ACPR) is -37.9 dBc, while outputting 19.9 dBm at 14.3% PAE. For an LTE signal, the measured EVM is 3.72%-rms and the ACLR is -30.2 dBc, while outputting 18.1 dBm at 10.6% PAE.

In Chapter 5, we also investigate multiphase amplifiers that use switched capacitors to efficiently and linearly amplify basis phase vectors before charge summation on a capacitor array. The proposed quadrature and multiphase architectures avoid the bandwidth expansion and delay mismatch associated with polar PAs. The quadrature SCPA (Q-SCPA) amplifies the Cartesian basis vectors separately using a switched capacitor circuit and combines them on the capacitor array in the charge domain. A primary problem with Cartesian combination is power loss when the basis vectors are combined; this is due to the phase difference between the basis vectors. The Q-SCPA delivers a peak output power of 20.5 dBm with a peak PAE of 20%. It is measured with a 10 MHz, 64-QAM LTE signal and achieves an ACLR of < -30 dBc, with an EVM < 4%-rms.

In Chapter 6, the multiphase SCPA (MP-SCPA) is proposed as a solution to the power combining losses in the Q-SCPA. By reducing the phase differences between the basis vectors, it significantly improves the power efficiency while not suffering from bandwidth expansion or delay mismatch. The MP-SCPA is implemented with a 16-phase clock generator and a 7-bit SCPA array. The MP-SCPA delivers a peak output power and PAE of 26 dBm and 24.9%, respectively. When amplifying an LTE signal, the average output power and PAE are 20.9 dBm and 15.2%, respectively, while achieving less than -30 dBc ACLR and 3.5%-rms EVM.

The conclusions of this dissertation are summarized in Chapter 7.


Figure 1.1. Plot of PDF for 802.11 and efficiency of PAs (class-B and class-E) as a function of envelope.


Figure 1.2. Block diagram of a transmitter with a linear PA.


Figure 1.3. Block diagram of a transmitter with a switching PA.

Table 1.1. Several watt-level CMOS power amplifiers

| Ref. | D. Chowdhury et al., <br> ISSCC, 2009 [35] | A. Afsahi et al., <br> ISSCC, 2010 [33] | B. Francois et al., <br> MTT, 2012 [34] |
| :---: | :---: | :---: | :---: |
| Process technology | 90 nm | 65 nm | 90 nm |
| Carrier frequency | 2.4 GHz | 2.35 GHz | 930 MHz |
| Peak $P_{\text{out}}$ (dBm) | 30.1 | 31.5 | 29.4 |
| PAE at peak $P_{\text{out}}$ | 33% | 25% | 25.8% |
| Application | WiMAX | WLAN | 16-QAM |
| Average $P_{\text{out}}$ (dBm) | 22.5 | 25.5 | 26 |
| Average PAE | 12.4% | 16% | 17% |

## CHAPTER 2

## FUNDAMENTALS OF CMOS POWER AMPLIFIERS

### 2.1 Class-A, -AB, -B, and -C

The four types of linear PAs, class-A, -AB, -B, and -C, are distinguished primarily by bias conditions. Since the active devices in linear PAs act as high-output-impedance transconductors, they are also called transconductor-based PAs. They share the same generic form, which is shown in Figure 2.1. The RF choke acts as a current source and provides a DC path from the supply to bias the transistor. The matching network transforms the optimal load at the drain to $50 \Omega$.

The gate biases of the different types of linear PAs are controlled so that their conduction angles ($2 \Phi$) are different, as shown in Figure 2.2(b). According to the conduction angles, the duty cycles of the different types of linear PA are summarized in Table 2.1.

In a class-A PA, the transistor is biased to remain on during the entire cycle and the corresponding conduction angle is $2 \pi$. The class-A PA has high linearity but suffers from low efficiency due to the quiescent drain current. The peak efficiency of a class-A PA is limited to 50%, according to the maximum power transfer theorem.

In a class-B PA, the transistor is biased to be turned off for half of each cycle. The efficiency is improved by reducing the drain current to zero when the drain voltage is at its maximum, as shown in Figure 2.2(b). Here, the maximum power transfer theorem no longer applies because of the energy stored in the passive load. The class-B PA achieves higher efficiency at the expense of degraded linearity, and it can be used to implement RF PAs [37], [38].

In a class-C PA, the transistor is biased so that it conducts for less than 50% of each cycle. The linearity is even worse than that of class-B, but the overall efficiency is higher.

The detailed derivations of design equations for linear PAs are described below.
The drain current of a linear PA is given by

$$
\begin{equation*}
i_{D}=I_{D C}+i_{r f} \cos \omega_{0} t \tag{2.1}
\end{equation*}
$$

where $I_{D C}$ is the DC bias current and $i_{r f}$ is the amplitude of the sinusoidal signal of the drain current.

By setting $i_{D}$ to zero in (2.1) at $\omega_{0} t=\Phi$, the conduction angle $2 \Phi$ is given by

$$
\begin{equation*}
2 \Phi=2 \arccos \left(-\frac{I_{D C}}{i_{r f}}\right) \tag{2.2}
\end{equation*}
$$

The average drain current can be calculated from the integral of $i_{D}$ :

$$
\begin{equation*}
<i_{D}>=\frac{1}{2 \pi} \int_{0}^{2 \pi} i_{D}(\phi) d \phi=\frac{1}{2 \pi} \int_{-\Phi}^{\Phi}\left(I_{D C}+i_{r f} \cos \phi\right) d \phi \tag{2.3}
\end{equation*}
$$

By solving (2.2) and (2.3), we get the average drain current, which is also the DC current drawn from the supply:

$$
\begin{equation*}
<i_{D}>=\frac{i_{r f}}{\pi}(\sin \Phi-\Phi \cos \Phi) \tag{2.4}
\end{equation*}
$$
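The intermediate steps can be written out as follows. This is a sketch that assumes the drain current is clipped to zero outside the conduction window $-\Phi \leq \phi \leq \Phi$, as implied by the integration limits in (2.3). Setting $i_{D}=0$ at $\phi=\Phi$ in (2.1) gives $I_{D C}=-i_{r f} \cos \Phi$, and substituting this into (2.3) yields

$$
\begin{aligned}
<i_{D}> &= \frac{1}{2 \pi}\Big[I_{D C}\, \phi+i_{r f} \sin \phi\Big]_{-\Phi}^{\Phi}=\frac{1}{\pi}\left(\Phi I_{D C}+i_{r f} \sin \Phi\right) \\
&= \frac{i_{r f}}{\pi}(\sin \Phi-\Phi \cos \Phi)
\end{aligned}
$$

As a quick check, class-A operation ($\Phi=\pi$) gives an average current of $i_{r f}$, and class-B operation ($\Phi=\pi / 2$) gives $i_{r f} / \pi$.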

The DC power consumption is given by

$$
\begin{equation*}
P_{D C}=\frac{1}{\pi}(\sin \Phi-\Phi \cos \Phi) i_{r f} V_{D D} \tag{2.5}
\end{equation*}
$$

The LC tank or matching network will select the signal at the fundamental frequency $\omega_{0}$, which is the fundamental term in the Fourier expansion,

$$
\begin{equation*}
i_{\omega_{0}}=\frac{1}{\pi} \int_{0}^{2 \pi} i_{D}(\phi) \cos \phi d \phi=\frac{1}{2 \pi} i_{r f}(2 \Phi-\sin 2 \Phi) \tag{2.6}
\end{equation*}
$$
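The closed-form average and fundamental currents can be cross-checked numerically. The sketch below integrates the clipped waveform of (2.1) with a simple midpoint rule and compares the result with the closed forms in (2.4) and (2.6). The function names, the normalization $i_{rf}=1$, the step count, and the class-C example angle are assumptions made here for illustration.

```python
import math

def clipped_drain_current(phi, half_angle, i_rf=1.0):
    """Drain current of (2.1), clipped to zero outside the conduction window."""
    # Bias term chosen so that i_D = 0 at phi = +/- half_angle.
    i_dc = -i_rf * math.cos(half_angle)
    return max(0.0, i_dc + i_rf * math.cos(phi))

def fourier_terms(half_angle, i_rf=1.0, steps=100_000):
    """Midpoint-rule estimates of the average and fundamental drain current."""
    d_phi = 2.0 * math.pi / steps
    average = 0.0
    fundamental = 0.0
    for k in range(steps):
        phi = -math.pi + (k + 0.5) * d_phi
        i_d = clipped_drain_current(phi, half_angle, i_rf)
        average += i_d * d_phi / (2.0 * math.pi)
        fundamental += i_d * math.cos(phi) * d_phi / math.pi
    return average, fundamental

def closed_form_terms(half_angle, i_rf=1.0):
    """Closed-form average (2.4) and fundamental (2.6) drain current."""
    average = i_rf / math.pi * (math.sin(half_angle) - half_angle * math.cos(half_angle))
    fundamental = i_rf / (2.0 * math.pi) * (2.0 * half_angle - math.sin(2.0 * half_angle))
    return average, fundamental

if __name__ == "__main__":
    cases = (("class-A", math.pi), ("class-B", math.pi / 2), ("class-C example", math.pi / 3))
    for label, half_angle in cases:
        print(label, fourier_terms(half_angle), closed_form_terms(half_angle))
```

The numerical and closed-form values should agree to within the integration error for any half-angle $\Phi$ between 0 and $\pi$.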

The amplitude of the drain voltage swing is limited to $V_{\mathrm{DD}}$, which corresponds to a peak drain voltage of $2 V_{\mathrm{DD}}$, so the peak output power is

$$
\begin{equation*}
P_{o u t}=\frac{1}{2} i_{\omega_{0}} V_{D D}=\frac{1}{4 \pi}(2 \Phi-\sin 2 \Phi) i_{r f} V_{D D} \tag{2.7}
\end{equation*}
$$

Therefore, the peak efficiency of a linear PA is given by

$$
\begin{equation*}
\eta_{\text {peak }}=\frac{P_{\text {out }}}{P_{D C}}=\frac{2 \Phi-\sin 2 \Phi}{4(\sin \Phi-\Phi \cos \Phi)} \tag{2.8}
\end{equation*}
$$

The optimal load resistance $R_{L}$ is given by

$$
\begin{equation*}
R_{L}=\frac{\sqrt{2} V_{D D}}{\sqrt{2} i_{\omega_{0}}}=\frac{2 \pi V_{D D}}{i_{r f}(2 \Phi-\sin 2 \Phi)} \tag{2.9}
\end{equation*}
$$

From (2.5), (2.7), and (2.9), we can rewrite the DC power and peak output power as:

$$
\begin{gather*}
P_{D C}=\frac{2 V_{D D}^{2}}{R_{L}} \frac{\sin \Phi-\Phi \cos \Phi}{2 \Phi-\sin 2 \Phi}  \tag{2.10}\\
P_{\text {out }}=\frac{V_{D D}^{2}}{2 R_{L}} \tag{2.11}
\end{gather*}
$$

From (2.8) and (2.10), we can calculate the ideal DC power and peak efficiencies for the different types of linear PAs, as shown in Table 2.2. The power dissipation in the transistors of linear PAs is the product of the drain current and the drain voltage. Therefore, by reducing the conduction angle, the devices only draw current when the drain voltage is low, as shown in Figure 2.2(a) and (b).
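As an illustration of how such values follow from the design equations, the sketch below evaluates (2.8) and (2.10) for a few conduction angles. The function names are hypothetical, and the class-AB and class-C angles are arbitrary examples chosen here; they are not necessarily the entries used in Table 2.2.

```python
import math

def peak_efficiency(half_angle):
    """Ideal peak drain efficiency from (2.8); half_angle is Phi in radians."""
    num = 2.0 * half_angle - math.sin(2.0 * half_angle)
    den = 4.0 * (math.sin(half_angle) - half_angle * math.cos(half_angle))
    return num / den

def normalized_dc_power(half_angle):
    """DC power from (2.10), normalized to V_DD^2 / R_L."""
    num = math.sin(half_angle) - half_angle * math.cos(half_angle)
    den = 2.0 * half_angle - math.sin(2.0 * half_angle)
    return 2.0 * num / den

# Half conduction angles Phi; class-AB and class-C values are arbitrary examples.
EXAMPLES = {
    "class-A": math.pi,
    "class-AB (example)": 0.75 * math.pi,
    "class-B": math.pi / 2,
    "class-C (example)": math.pi / 3,
}

if __name__ == "__main__":
    for label, half_angle in EXAMPLES.items():
        eta = peak_efficiency(half_angle)
        p_dc = normalized_dc_power(half_angle)
        print(f"{label:20s} 2*Phi = {2 * half_angle:5.3f} rad, "
              f"eta = {eta:6.1%}, P_DC = {p_dc:5.3f} * V_DD^2/R_L")
```

For class-A and class-B the efficiencies evaluate to 50% and $\pi/4$ (about 78.5%), and the product of the efficiency and the normalized DC power is always 0.5, consistent with (2.11).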

Note that when the conduction angle is close to zero, the ideal drain efficiency is close to 100%. However, with smaller conduction angles, the output power is lower, as is the power gain. At first glance, (2.11) suggests that the different types of linear PAs have similar power capability. However, the effective load resistances $R_{L}$ for the same device at different bias conditions are quite different.

The drain current for a static gate voltage $V_{G S}$ is

$$
\begin{equation*}
I_{D C}=\frac{1}{2} \mu_{n} C_{o x}\left(\frac{W}{L}\right)\left(V_{G S}-V_{T}\right)^{2} \tag{2.12}
\end{equation*}
$$

For a large input signal at the gate, the input amplitude for class-A is almost twice the input amplitude for class-B, leading to four times the drain current. Based on this approximation, for the same device, the output power with class-B biasing is 6 dB lower than that with class-A biasing.

In summary, the power efficiency of linear PAs can only be improved at the expense of larger device, lower gain, lower output power, and worse linearity. To overcome these disadvantages, high power switching PAs are explored.

### 2.2 Class-D

In a switching PA, the transistor operates as a switch. Therefore, there is either zero voltage across it or zero current through it, resulting in a zero voltage-current product during the whole period.

The schematic of a typical class-D PA [39] is shown in Figure 2.3. A square wave is applied to the class-D PA and the LC tank selects the output signal at the fundamental frequency. The PMOS is driven on during the positive half cycles and the NMOS is on during the negative half cycles. The PMOS and NMOS are switched alternately to generate a square wave at the drain. An ideal filter provides a short at the fundamental frequency and an open at all harmonics, leading to 100% power efficiency. For a real LC matching network, the finite quality factor lets harmonic content pass through the network, which leads to power loss and distortion.

One advantage of the voltage-mode class-D PA is that it operates as a voltage source; therefore, its output voltage is insensitive to load variations.

The voltage and current waveforms for an ideal class-D PA are plotted in Figure 2.4. The amplitude of the output signal at the drain of the PMOS/NMOS is the fundamental component of a square wave that ranges from $V_{D D}$ to ground, and it is given by the first term in the Fourier expansion,

$$
\begin{equation*}
V_{o u t}=\frac{2}{\pi} V_{D D} \tag{2.13}
\end{equation*}
$$

The output power is

$$
\begin{equation*}
P_{o u t}=\frac{V_{o u t}^{2}}{2 R_{o p t}}=\frac{2}{\pi^{2}} \frac{V_{D D}^{2}}{R_{o p t}} \tag{2.14}
\end{equation*}
$$

where $R_{\text {opt }}$ is the optimal load resistance.
The amplitude of the output current is

$$
\begin{equation*}
I_{o u t}=\frac{V_{\text {out }}}{R_{\text {opt }}}=\frac{2}{\pi} \frac{V_{D D}}{R_{\text {opt }}} \tag{2.15}
\end{equation*}
$$

The DC current from the supply equals the average of the current passing through the PMOS,

$$
\begin{equation*}
\int_{0}^{T} I_{D C} d t=\int_{0}^{T / 2} I_{\text {out }} \sin \left(\omega_{0} t\right) d t \Rightarrow I_{D C}=\frac{I_{\text {out }}}{\pi} \tag{2.16}
\end{equation*}
$$

where $T$ is the period of the input pulse wave, assuming the duty cycle of the pulse wave is $50 \%$.

From (2.15) and (2.16), the DC power is given by

$$
\begin{equation*}
P_{D C}=I_{D C} V_{D D}=\frac{2}{\pi^{2}} \frac{V_{D D}^{2}}{R_{o p t}} \tag{2.17}
\end{equation*}
$$

Therefore, the ideal drain efficiency of a class-D PA is $100 \%$. In a practical implementation, the nonzero on-resistance of the switch is one of the dominant factors for power loss and efficiency degradation. It is therefore desirable to use a large transistor, typically with a width on the millimeter scale for submicron CMOS processes. However, the drain parasitic capacitance also increases, and the charging and discharging of the drain parasitic capacitor causes power loss [40]. As shown in Figure 2.4, the switch is turned on when the drain voltage is still high, so the energy stored on the drain parasitic capacitance is dissipated through the on-resistance of the switch as heat; this is known as hard-switching.
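The ideal class-D relations (2.13) through (2.17) can be checked numerically with a short sketch. The function name and the example supply and load values are assumptions chosen only for illustration; losses from on-resistance and drain capacitance are deliberately ignored.

```python
import math

def class_d_ideal(vdd, r_opt):
    """Ideal class-D quantities following (2.13)-(2.17); no losses modeled."""
    v_out = 2.0 / math.pi * vdd          # (2.13) fundamental voltage amplitude
    p_out = v_out ** 2 / (2.0 * r_opt)   # (2.14) output power
    i_out = v_out / r_opt                # (2.15) output current amplitude
    i_dc = i_out / math.pi               # (2.16) average supply current
    p_dc = i_dc * vdd                    # (2.17) DC power
    return {"v_out": v_out, "p_out": p_out, "i_out": i_out,
            "i_dc": i_dc, "p_dc": p_dc, "efficiency": p_out / p_dc}

# Hypothetical operating point: 1.2 V supply, 5 ohm optimal load.
result = class_d_ideal(vdd=1.2, r_opt=5.0)
print(f"P_out = {result['p_out'] * 1e3:.1f} mW, "
      f"efficiency = {100.0 * result['efficiency']:.1f} %")
```

Because (2.14) and (2.17) reduce to the same expression, the computed efficiency is unity for any supply and load, which is the ideal result stated above.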

Another disadvantage of hard-switching is the power loss during the voltage transition. At high frequencies, since the gate voltage changes gradually from $V_{\mathrm{DD}}$ to ground, there is a short overlap time during which both the NMOS and PMOS are on, which results in a direct current path from $V_{\mathrm{DD}}$ to ground. This is known as crowbar current or shoot-through current loss. Nonoverlapping clocks and separate driving circuits for the PMOS and NMOS can be implemented to reduce the power loss caused by the transition.

Moreover, hard-switching also causes the transistors to operate in the saturation region during the transition time, when the drain voltage and gate bias are both high. In the saturation region, the transistors are not ideal switches and the device characteristics must be considered. As the frequency increases, the transition time becomes a significant portion of each period, leading to more power loss and nonideal characteristics.

To avoid the power loss due to the drain parasitic capacitance and switch transition, the drain voltage needs to be shaped so that it is close to zero before the switch is turned on, which is known as soft-switching.

### 2.3 Class-E

The class-D PAs suffer from the presence of drain parasitic capacitance. To minimize the switching loss caused by the parasitic capacitance, a class-E PA [41]-[43] can be implemented, as shown in Figure 2.5, where the drain parasitic capacitance is absorbed by reactive components at the drain.

As shown in Figure 2.6, there is no overlap between the drain voltage and current waveforms. Moreover, the drain voltage decreases to zero before the switch turns on. When the switch is turned on, the drain voltage is already zero, indicating that the energy stored on the drain parasitic capacitance $C_{D}$ has been transferred to the reactive matching network before the switch is fully turned on. Therefore, there is ideally no power loss due to the charging/discharging of the drain parasitic capacitance; this is known as soft-switching. In a class-D PA, on the other hand, the switch is turned on when the drain voltage is still high, so the energy stored on the drain parasitic capacitance is dissipated through the available resistive path as heat.

With soft-switching, the transition loss in class-E is also reduced since the drain voltage gradually reduces to zero before the switch is turned on. As a result, there is no DC current path from the supply to ground. Moreover, soft-switching also ensures that the transistors always operate in the triode region, leading to a more accurate modeling of the device as a switch. Hence, the characteristics of a class-E PA are robust to circuit and process variations [44]. Losses in a class-E PA are mainly due to nonideal waveform shaping caused by nonzero on-resistance and nonideal passive components [45].

The detailed derivation of the design equations is presented in [41]. The final equations are summarized below.

The output power delivered to the load is

$$
\begin{equation*}
P_{\text {out }}=\frac{2}{1+\pi^{2} / 4} \frac{V_{D D}^{2}}{R_{L}} \approx \frac{0.577 V_{D D}^{2}}{R_{L}} \tag{2.18}
\end{equation*}
$$

Therefore, the optimal load resistance $R_{L}$ for a given output power is

$$
\begin{equation*}
R_{L}=\frac{0.577 V_{D D}^{2}}{P_{\text {out }}} \tag{2.19}
\end{equation*}
$$

The components in Figure 2.5 can be calculated by

$$
\begin{gather*}
C_{D}=\frac{2}{\pi\left(\pi^{2} / 4+1\right) \omega R_{L}} \approx \frac{0.1836}{\omega R_{L}}  \tag{2.20}\\
L=\frac{1.1525 R_{L}}{\omega}  \tag{2.21}\\
L_{0}=\frac{(Q-1.1525) R_{L}}{\omega}  \tag{2.22}\\
C_{0}=\frac{1}{\omega^{2} L_{0}} \tag{2.23}
\end{gather*}
$$

where $\omega$ is the (angular) center frequency and $Q$ is the quality factor of the matching network. For an ideal PA, the maximum $Q$ is limited by the desired bandwidth. However, because on-chip passive components are lossy, $Q$ needs to be smaller than the quality factors of the inductors and capacitors to reduce the loss in the matching network.
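The design equations (2.19) through (2.23) translate directly into a small calculator. The sketch below assumes an ideal switch; the function name and the example design point (supply, output power, frequency, and loaded $Q$) are hypothetical values chosen only to exercise the formulas.

```python
import math

def class_e_components(vdd, p_out, freq_hz, q_loaded):
    """Component values for Figure 2.5 from (2.19)-(2.23), ideal switch assumed."""
    if q_loaded <= 1.1525:
        raise ValueError("Q must exceed 1.1525 so that L0 is positive")
    omega = 2.0 * math.pi * freq_hz
    r_load = 0.577 * vdd ** 2 / p_out            # (2.19) optimal load
    c_d = 0.1836 / (omega * r_load)              # (2.20) shunt drain capacitance
    l_excess = 1.1525 * r_load / omega           # (2.21) excess inductance L
    l_0 = (q_loaded - 1.1525) * r_load / omega   # (2.22) series filter inductor
    c_0 = 1.0 / (omega ** 2 * l_0)               # (2.23) series filter capacitor
    return r_load, c_d, l_excess, l_0, c_0

# Hypothetical design point: 1.2 V supply, 250 mW, 2.4 GHz, Q = 3.
r_load, c_d, l_excess, l_0, c_0 = class_e_components(1.2, 0.25, 2.4e9, 3.0)
print(f"R_L = {r_load:.2f} ohm, C_D = {c_d * 1e12:.2f} pF, "
      f"L = {l_excess * 1e9:.3f} nH, L_0 = {l_0 * 1e9:.3f} nH, "
      f"C_0 = {c_0 * 1e12:.2f} pF")
```

The guard on $Q$ reflects (2.22): a loaded $Q$ at or below 1.1525 would require a nonpositive $L_{0}$.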

In a practical class-E PA, the on-resistance $R_{S W}$ of the transistor is not zero. If $R_{S W}$ is considered, (2.19) can be modified as follows,

$$
\begin{equation*}
R_{L} \approx \frac{0.577 V_{D D}^{2}}{P_{\text {out }}}-\frac{8}{3} R_{S W} \tag{2.24}
\end{equation*}
$$

Since the drain parasitic capacitance can be absorbed by the reactive components, a larger device size can be used in class-E PAs to reduce $R_{S W}$ and to increase the overall output power and efficiency. As a result, the lossy on-chip passive components become the dominant limiting factor for the power and efficiency of a class-E PA.
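To see how the switch resistance eats into the available load resistance, the following sketch evaluates (2.24) and the value of $R_{S W}$ at which the required $R_{L}$ would fall to zero. Both function names and the sample numbers are assumptions for illustration.

```python
def class_e_load_with_switch(vdd, p_out, r_sw):
    """Load resistance from (2.24) for a switch with on-resistance r_sw."""
    r_load = 0.577 * vdd ** 2 / p_out - (8.0 / 3.0) * r_sw
    if r_load <= 0.0:
        raise ValueError("switch resistance too large for the requested power")
    return r_load

def max_switch_resistance(vdd, p_out):
    """R_SW at which (2.24) gives R_L = 0; practical designs stay well below it."""
    return (3.0 / 8.0) * 0.577 * vdd ** 2 / p_out

# Hypothetical example: 1.2 V supply, 250 mW target, 0.5 ohm switch.
print(class_e_load_with_switch(1.2, 0.25, 0.0))
print(class_e_load_with_switch(1.2, 0.25, 0.5))
print(max_switch_resistance(1.2, 0.25))
```

The bound illustrates why a wide device is attractive in class-E: at low supply voltages the tolerable on-resistance for a given output power is only on the order of an ohm.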

One significant disadvantage of the class-E PA is the high drain voltage. As discussed in [41], the peak drain voltage is $\sim 3.6 V_{\mathrm{DD}}$ and the peak drain current is $\sim 1.7 V_{\mathrm{DD}} / R_{L}$. The high peak drain voltage places severe stress on the devices, especially in deep-submicron CMOS technologies with thin gate oxides and low operating voltages. With a large voltage at the drain, oxide breakdown can occur in the gate-drain overlap region. The high electric field induces defect accumulation, a process in which electrons are injected into the oxide and form defects. Conventional oxide breakdown is characterized with a static voltage. For a high voltage at RF frequencies, the defect accumulation process is much slower due to the fast alternating electric field across the gate oxide [46]-[48]. Therefore, for RF applications, the device breakdown voltage is considerably higher than (more than twice) the conventional DC breakdown limit [49].

As a result, the class-E PA is still a promising candidate for CMOS RF applications. The drain voltage can be reduced by replacing the RF choke with a finite inductor resonating with $C_{D}$, whereas the stress on the switches can be lowered by a cascoding topology.

### 2.4 Class-F

In the design of class-E PAs, the voltage and current waveforms are shaped to avoid the overlap between them. Another example is the class-F PA [50]-[52], shown in Figure 2.7, where the load network presents a short circuit at all even harmonics and an open circuit at all odd harmonics.

In Figure 2.7, the load impedance at the fundamental frequency is

$$
\begin{equation*}
Z_{L}=\frac{Z_{0}^{2}}{R_{L}} \tag{2.25}
\end{equation*}
$$

where $Z_{0}$ is the characteristic impedance of the transmission line.

The high-Q parallel LC resonator acts as an open circuit at the fundamental frequency and a short circuit at all harmonics. Therefore, at all odd harmonics, the transmission line presents a large impedance, close to infinity, since the effective $R_{L}$ is close to zero. At even harmonics, the transmission line is a multiple of a half wavelength and acts as a short circuit to ground. This results in a square voltage waveform and a half-rectified current waveform at the drain, as shown in Figure 2.8.
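The harmonic behavior described above follows from the standard input-impedance formula of a lossless transmission line. The sketch below is an illustration under assumed values: a 50 ohm line that is a quarter wavelength at the fundamental, a hypothetical 12.5 ohm load at the fundamental, and a 1 micro-ohm stand-in for the near-short that the parallel tank presents at the harmonics.

```python
import math

def line_input_impedance(z_load, z0, harmonic):
    """Input impedance of a lossless line that is a quarter wavelength long
    at the fundamental, evaluated at the given harmonic number."""
    beta_l = harmonic * math.pi / 2.0   # electrical length at this harmonic
    c, s = math.cos(beta_l), math.sin(beta_l)
    # Written with cos/sin instead of tan to avoid the tan(pi/2) singularity.
    return z0 * (z_load * c + 1j * z0 * s) / (z0 * c + 1j * z_load * s)

Z0 = 50.0        # assumed characteristic impedance
R_L = 12.5       # assumed load at the fundamental (tank is open)
R_SHORT = 1e-6   # stand-in for the tank's short circuit at harmonics

print(abs(line_input_impedance(R_L, Z0, 1)))       # fundamental: Z0^2 / R_L
print(abs(line_input_impedance(R_SHORT, Z0, 2)))   # even harmonic: near short
print(abs(line_input_impedance(R_SHORT, Z0, 3)))   # odd harmonic: near open
```

At the fundamental the line inverts the load as in (2.25); at even harmonics it reproduces the short, and at odd harmonics it inverts the short into a very large impedance.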

With an input square wave, the drain voltage is also a square wave with a peak-to-peak voltage of $2 V_{\mathrm{DD}}$. Therefore, the amplitude of the square wave at the drain is $V_{\mathrm{DD}}$. The output voltage is the fundamental term of the Fourier expansion of the drain voltage waveform, therefore, its amplitude is given by

$$
\begin{equation*}
V_{o u t}=\frac{4}{\pi} V_{D D} \tag{2.26}
\end{equation*}
$$

The output power is

$$
\begin{equation*}
P_{\text {out }}=\frac{8}{\pi^{2}} \frac{V_{D D}^{2}}{R_{L}} \tag{2.27}
\end{equation*}
$$

If the on-resistance $R_{S W}$ of the transistor is considered, then

$$
\begin{equation*}
R_{L}=\frac{8}{\pi^{2}} \frac{V_{D D}^{2}}{P_{\text {out }}}-2 R_{S W} \tag{2.28}
\end{equation*}
$$
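A minimal sketch of (2.26) through (2.28) is given here as a consistency check. The function names and example values are assumptions; the on-resistance term follows (2.28) as written.

```python
import math

def class_f_output_power(vdd, r_load):
    """Ideal class-F output power from (2.26) and (2.27)."""
    v_out = 4.0 / math.pi * vdd          # (2.26) fundamental amplitude
    return v_out ** 2 / (2.0 * r_load)   # equals (8 / pi^2) * vdd^2 / r_load

def class_f_load(vdd, p_out, r_sw=0.0):
    """Load resistance for a target output power, following (2.28)."""
    return 8.0 / math.pi ** 2 * vdd ** 2 / p_out - 2.0 * r_sw

# Hypothetical example: 1.2 V supply and a 250 mW target.
r_ideal = class_f_load(1.2, 0.25)
print(r_ideal, class_f_output_power(1.2, r_ideal))
print(class_f_load(1.2, 0.25, r_sw=0.5))
```

With $R_{S W}=0$ the two functions are inverses of each other, which confirms that (2.27) and (2.28) are consistent.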

An ideal class-F PA comprises an infinite number of odd-harmonic resonators to shape the voltage and current waveforms at the drain. In a practical implementation, only the first several harmonics are terminated. For example, we can use an on-chip $C_{3}, L_{3}$ tank tuned at the third harmonic to approximate the quarter-wave transmission line, as shown in Figure 2.9.

From Figure 2.8 we notice the hard-switching nature of class-F PAs. Therefore, the charging and discharging of the drain parasitic capacitance will reduce the overall efficiency. Moreover, the practical implementation of class-F PAs requires more passive elements than class-D or class-E; therefore, class-F PAs are not widely used in CMOS.

### 2.5 Class-F ${ }^{-1}$

In order to achieve soft-switching in the class-F family, the inverse class-F (Figure 2.10) can be implemented [52], [53], where the current waveform is square and the voltage waveform is half-sinusoidal, as shown in Figure 2.11.

Similar to a class-F PA, an ideal inverse class-F PA has an infinite number of even-harmonic resonators that produce a half-sinusoidal voltage waveform and a square current waveform at the drain.

The class-E, $-\mathrm{F}$, and $-\mathrm{F}^{-1}$ PAs have similar topologies with different load matching and termination, and can be referred to as the class-E/F family [52]. Compared with class-E, the class-F and $-\mathrm{F}^{-1}$ PAs have lower peak drain voltage, which reduces the stress on the device.

### 2.6 Class-D ${ }^{-1}$

The inverse class-D PA, also called current-mode class-D, is similar to the class-D PA (voltage-mode class-D), but with interchanged voltage and current waveforms at the drain [54]-[56]. This results in soft-switching, which eliminates the power loss due to the charging/discharging of the drain parasitic capacitance. The inverse class-D PA is often implemented differentially; one half of the inverse class-D PA is shown in Figure 2.12.

Figure 2.13 shows the voltage and current waveforms at the drain of the inverse class-D PA. The soft-switching characteristic of the inverse class-D can avoid the efficiency degradation due to drain parasitic capacitance. The transistors in inverse class-D PAs operate as current sources whereas the transistors in class-D PAs work as voltage sources.

The inverse class-D PA is quite similar to the class E/F family, with different load matching for waveform shaping.


Figure 2.1. Schematic of a linear PA.


Figure 2.2. (a) Drain voltage; (b) drain current for ideal class-A, -AB, -B, and -C.


Figure 2.3. A typical class-D PA.



Figure 2.4. Voltage and current waveforms at the drain of PMOS/NMOS.


Figure 2.5. A class-E PA.



Figure 2.6. Voltage and current waveforms at the drain.


Figure 2.7. A class-F PA.


Figure 2.8. Voltage and current waveforms at the drain.


Figure 2.9. A class-F PA with on-chip terminations.


Figure 2.10. An inverse class-F PA.



Figure 2.11. Voltage and current waveforms at the drain.


Figure 2.12. An inverse class-D PA.



Figure 2.13. Voltage and current waveforms at the drain.

Table 2.1. Conduction angles and duty cycles of linear PAs

| Linear PAs | Conduction Angle ($2 \Phi$) | Duty Cycle ($D$) |
| :---: | :---: | :---: |
| Class-A | $2 \pi$ | $100 \%$ |
| Class-AB | $\pi<2 \Phi<2 \pi$ | $50 \%<D<100 \%$ |
| Class-B | $\pi$ | $50 \%$ |
| Class-C | $0<2 \Phi<\pi$ | $0<D<50 \%$ |

Table 2.2. Ideal peak efficiencies of linear PAs

| Linear PAs | DC Power ($P_{DC}$) | Output Power ($P_{out}$) | Peak Efficiency ($\eta$) |
| :---: | :---: | :---: | :---: |
| Class-A | $\frac{V_{D D}^{2}}{R}$ | $\frac{V_{D D}^{2}}{2 R}$ | $50 \%$ |
| Class-AB | $\frac{2 V_{D D}^{2}}{\pi R}<P_{D C}<\frac{V_{D D}^{2}}{R}$ | $\frac{V_{D D}^{2}}{2 R}$ | $50 \%<\eta<78.5 \%$ |
| Class-B | $\frac{2 V_{D D}^{2}}{\pi R}$ | $\frac{V_{D D}^{2}}{2 R}$ | $78.5 \%(\pi / 4)$ |
| Class-C | $\frac{V_{D D}^{2}}{2 R}<P_{D C}<\frac{2 V_{D D}^{2}}{\pi R}$ | $\frac{V_{D D}^{2}}{2 R}$ | $78.5 \%<\eta<100 \%$ |

## CHAPTER 3

## LINEARIZATION TECHNIQUES

Switching amplifiers, as shown in Figure 3.1, are more efficient than their linear amplifier counterparts and take advantage of the benefits of CMOS scaling, which has made the device a low-loss, fast switch. However, switching amplifiers require linearization circuitry for normal operation on wideband linear communication signals (e.g., LTE, Wi-Fi, etc.) [57]-[59].

Several techniques have been investigated to linearize RF PAs implemented as switching PAs, including envelope elimination and restoration (EER) [60]-[62], polar modulation [63]-[65], digital PAs [3], [19], [24]-[27], [66], [67], Digital-Doherty [68]-[70], and outphasing [6], [71], [72].

All of the above techniques require a conversion from the Cartesian domain, which is typically preferred in communication systems, to the polar coordinate system. A description of the transform follows in the next sections, and various systems are described in detail, such as envelope elimination and restoration, outphasing/LINC, pulse-width and pulse-position modulation, and digital polar. Their advantages and disadvantages are also analyzed and discussed.

### 3.1 Envelope Elimination and Restoration

Conventional CMOS linear PAs have a low power efficiency for non-CE signals with large PAPR. To improve the power efficiency, envelope tracking (ET) can be used to modulate the supply voltage of a linear PA in accordance with the envelope of the RF signal. ET systems still suffer from reduced efficiency, owing to the limits of the linear PA, although the efficiency is improved when compared to a standalone PA [73]. To further improve the power efficiency, envelope elimination and restoration (EER) replaces the linear PA with a high-efficiency switching PA, as shown in Figure 3.2. EER was first proposed by Kahn in 1952 [60].

The $I / Q$ signal from the digital back end is converted to amplitude and phase by a coordinate rotation digital computer (CORDIC). The amplitude information is used to modulate the supply voltage of the switching PA while a phase-modulated pulse wave is applied at the input of the switching PA. Since there are two separate paths for the amplitude and phase information, there is a delay mismatch that must be corrected. The intrinsic reason for the delay mismatch is that the amplitude and phase signals propagate at different frequencies (near DC and at RF, respectively). Hence, they have different group delays. In addition, since the conversion from Cartesian ($I$ and $Q$) to polar ($A$ and $\phi$) is nonlinear, there is a significant bandwidth expansion, which limits the wideband operation required by modern wireless communication standards.

The Cartesian-to-polar transformation can be expressed as follows:

$$
\begin{align*}
A(t) & =\sqrt{I(t)^{2}+Q(t)^{2}}  \tag{3.1}\\
\phi(t) & =\tan ^{-1}(Q(t) / I(t)) \tag{3.2}
\end{align*}
$$

Owing to the strong nonlinearity associated with these conversions, the bandwidth required for the polar components $A(t)$ and $\phi(t)$, especially the phase component $\phi(t)$, is substantially larger than the bandwidth of the Cartesian components $I(t)$ and $Q(t)$ [59]. These issues make complete integration of EER PAs problematic and have limited their viability.
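The conversion in (3.1) and (3.2) can be sketched in a few lines. One assumption beyond the text: the four-quadrant `atan2` is used in place of a plain arctangent of $Q / I$ so that the phase is defined over the full circle. The sample trajectory is hypothetical and is chosen to show why the phase path is so demanding in bandwidth: a small step in $I$ near the origin produces a phase step of almost $\pi$.

```python
import math

def cartesian_to_polar(i, q):
    """Amplitude (3.1) and four-quadrant phase (3.2) of one I/Q sample."""
    return math.hypot(i, q), math.atan2(q, i)

def polar_to_cartesian(a, phi):
    """Inverse mapping, used here only as a round-trip check."""
    return a * math.cos(phi), a * math.sin(phi)

# Hypothetical trajectory passing close to the origin.
before = cartesian_to_polar(0.05, 0.01)
after = cartesian_to_polar(-0.05, 0.01)
print(f"I/Q step: {0.10:.2f}, phase step: {after[1] - before[1]:.3f} rad")
```

The amplitude is identical before and after the step while the phase jumps by nearly half a turn, so a band-limited $I / Q$ signal maps to a phase signal with much faster transitions.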

### 3.2 Outphasing

The outphasing architecture was first introduced by Chireix in 1935 [72]. It has also been called linear amplification using nonlinear components (LINC) [74], [75]. As shown in Figure 3.3, the input $I / Q$ signals are converted into two phase-modulated pulse waves with constant amplitude. The two pulse waves drive two identical switching PAs, and their outputs are summed using a passive power combiner. When the two pulse waves are in phase, the output is maximum; when the two pulse waves are 180 degrees out of phase, the output is zero (assuming the two switching PAs are identical).

As shown in Figure 3.4, the phases of the input signal can be determined by:

$$
\begin{gather*}
S_{1}(t)=A_{0} \cos (\omega t+\varphi(t)+\theta(t))  \tag{3.3}\\
S_{2}(t)=A_{0} \cos (\omega t+\varphi(t)-\theta(t))  \tag{3.4}\\
S(t)=A(t) \cos (\omega t+\varphi(t)) \tag{3.5}
\end{gather*}
$$

where $\theta(t)$ is determined by:

$$
\begin{equation*}
\theta(t)=\arccos \frac{A(t)}{A_{0}} \tag{3.6}
\end{equation*}
$$

The outphasing architecture requires two switching PAs that are precisely matched; otherwise the out-of-phase components cannot cancel each other. This limits the minimum output power of the outphasing PA, and therefore degrades the dynamic range of the system. Similar to EER PAs, bandwidth expansion also occurs in outphasing PAs, in the creation of the outphasing angle and phase modulation vectors ($\theta(t)$ and $\varphi(t)$, respectively). Moreover, since the two switching PAs are operated at their full power, the power efficiency of the outphasing PA depends on the outphasing angle. When the two branches are 180 degrees out of phase, the power efficiency is lowest.
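The decomposition in (3.3) through (3.6) can be illustrated with phasors. The sketch assumes that the combiner output is normalized as the average of the two branch signals, $S=\left(S_{1}+S_{2}\right) / 2$, so that (3.6) holds as written; this scaling is an assumption made here, not something stated in the text. The function name is illustrative.

```python
import cmath
import math

def outphasing_phasors(a, phi, a0):
    """Constant-envelope branch phasors for a target amplitude a and phase phi."""
    if not 0.0 <= a <= a0:
        raise ValueError("target amplitude must lie in [0, a0]")
    theta = math.acos(a / a0)                  # (3.6) outphasing angle
    s1 = a0 * cmath.exp(1j * (phi + theta))    # phasor of (3.3)
    s2 = a0 * cmath.exp(1j * (phi - theta))    # phasor of (3.4)
    return s1, s2

# Hypothetical sample: 30 % of full-scale amplitude at a phase of 0.7 rad.
s1, s2 = outphasing_phasors(0.3, 0.7, 1.0)
combined = (s1 + s2) / 2.0                     # assumed combiner normalization
print(abs(combined), cmath.phase(combined))
```

Both branches keep the full amplitude $A_{0}$ regardless of the target amplitude, which is why the efficiency degrades as the outphasing angle approaches 90 degrees per branch (180 degrees between branches).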

### 3.3 Pulse-Width and Pulse-Position Modulation

In the operation of a switching PA, information can be encoded in the digital pulse wave that drives the input to the PA. The input pulse-width is directly related to the output power of the switching PA; hence, by modulating the pulse-width, the output amplitude can be modulated. Similarly, changes in the pulse position correspond to changes in the output phase; this leads to pulse-width and pulse-position modulation (PWPM), as shown in Figure 3.5 and Figure 3.6 [58], [76]-[78].

The phase information is contained in the position of the edges of the phase-modulated pulse wave, similar to phase modulation in both the EER and outphasing PAs. The fundamental amplitude $a_{1}(t)$ is a function of the duty cycle $D(t)$ [76]:

$$
\begin{equation*}
a_{1}(t)=\frac{4}{\pi} \sin (\pi D(t)) \tag{3.7}
\end{equation*}
$$

Harmonic tones $a_{n}(t)$ are also generated [76],

$$
\begin{equation*}
a_{n}(t)=\frac{4}{n \pi} \sin (n \pi D(t)) \tag{3.8}
\end{equation*}
$$

Consider a PWM class-D PA and assume the drain voltage ranges from $V_{D D}$ to ground. By using the Fourier expansion, the amplitude of the output sinusoidal wave is

$$
\begin{equation*}
V_{\text {out }}=\frac{2}{\pi} V_{D D} \sin (\pi D) \tag{3.9}
\end{equation*}
$$

The output power is

$$
\begin{equation*}
P_{\text {out }}=\frac{2}{\pi^{2}} \frac{V_{D D}^{2}}{R_{L}} \sin ^{2}(\pi D) \tag{3.10}
\end{equation*}
$$

where $R_{L}$ is the optimal load resistance.

The output dynamic range of the PWPM PA is limited by the minimum width of the input pulse wave that the transistor switches can process without pulse-swallowing. In addition, bandwidth expansion is also present in the phase and amplitude waveform generation.
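The relationship between duty cycle and harmonic amplitude in (3.7) and (3.8) can be explored with the following sketch. The inverse mapping from a desired fundamental amplitude to a duty cycle is an illustrative helper, not something given in the text, and it returns the solution with $D \leq 0.5$.

```python
import math

def pulse_harmonic(n, duty):
    """Amplitude of the n-th harmonic for duty cycle `duty`, per (3.8)."""
    return 4.0 / (n * math.pi) * math.sin(n * math.pi * duty)

def duty_for_amplitude(a1):
    """Duty cycle in [0, 0.5] that yields fundamental amplitude a1, from (3.7)."""
    ratio = a1 * math.pi / 4.0
    if not 0.0 <= ratio <= 1.0 + 1e-12:
        raise ValueError("a1 must lie between 0 and 4/pi")
    return math.asin(min(ratio, 1.0)) / math.pi

# Fundamental and low-order harmonics at a few duty cycles.
for duty in (0.5, 0.25, 0.05):
    amps = [pulse_harmonic(n, duty) for n in (1, 2, 3)]
    print(duty, [round(a, 4) for a in amps])
```

Small duty cycles are needed for small output amplitudes, which is where the minimum processable pulse width sets the lower end of the dynamic range.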

### 3.4 Digital Polar PA (DPA)

Another technique that can be used to linearize a switching PA is digital polar modulation [19], [26], [27], [67]. Instead of using one switching PA to amplify the RF signal, an array of unary weighted switching PA cells is used to provide the amplitude modulation, as shown in Figure 3.7. The PA cells can operate in the voltage, current, or charge domain with an appropriate summation block to combine the individual cells.

Similar to the EER PA, the phase information is contained in the phase-modulated pulse wave. The output amplitude can be modulated by adjusting the number of active switching PA cells. The AM resolution is limited by the number of PA cells, and the resulting AM quantization error generates white noise in the spectrum [19]. The DPA architecture also suffers from bandwidth expansion and delay mismatch.
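The amplitude quantization in a unary DPA can be illustrated with a short sketch. It assumes an idealized array in which the output amplitude is exactly proportional to the number of active cells; real arrays deviate from this. The cell count of 127 and the function names are hypothetical.

```python
import math

def quantize_amplitude(a_norm, num_cells):
    """Number of active unary cells for a normalized amplitude in [0, 1]."""
    if not 0.0 <= a_norm <= 1.0:
        raise ValueError("normalized amplitude must lie in [0, 1]")
    return int(math.floor(a_norm * num_cells + 0.5))   # round half up

def quantization_error(a_norm, num_cells):
    """Difference between the realized and the requested amplitude."""
    return quantize_amplitude(a_norm, num_cells) / num_cells - a_norm

NUM_CELLS = 127   # hypothetical array size
worst = max(abs(quantization_error(k / 1000.0, NUM_CELLS)) for k in range(1001))
print(f"worst-case error: {worst:.5f}, bound 1/(2N): {0.5 / NUM_CELLS:.5f}")
```

The error never exceeds half of one cell step, so doubling the number of cells halves the worst-case amplitude error.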

### 3.5 Hybrid Architectures

The combination of different architectures can sometimes provide an alternative solution to overcome some drawbacks discussed above.

For example, in the EER architecture, the efficiency is mainly limited by the supply modulator. Although a DC-DC converter has a very high efficiency, its bandwidth is very small. Because the efficiency of the supply modulator decreases significantly as bandwidth increases, it is difficult to achieve high efficiency when amplifying a wide-bandwidth signal. If we combine envelope tracking (ET) with outphasing, PWM, or digital polar, the bandwidth requirement for the supply modulator can be greatly reduced. An extreme case is implementing a 1-bit supply modulator with digital polar, such as a class-G switched capacitor PA [80].

The outphasing architecture has a phase-dependent efficiency. Its inputs are outphased to combine at the output for the desired amplitude; therefore, the power efficiency is very low at large outphasing angles. To avoid large outphasing angles, the output amplitudes of the two PAs in the outphasing topology can be adjusted by techniques such as ET [81], PWM, and digital polar.

PWM suffers from small dynamic range due to the limited pulse-width of the input pulse wave, especially at high frequencies. To increase the dynamic range, we can adjust the supply for different signal levels (EER), split the PA into an array of PA cells (digital polar), or combine the output of two PWM PAs with outphased input pulse wave (outphasing).

In all the above hybrid architectures, a certain output can be generated by different inputs. For example, in the case of outphasing with ET, a small output can be generated either by applying a low supply voltage or using a large outphasing angle. Different inputs can be selected to maximize the power efficiency, at the expense of more complex digital predistortion dealing with redundant input codes.

### 3.6 Summary

All of the presented architectures suffer from system-level deficiencies that lead to nonlinearity. The bandwidth expansion involved with polar conversion limits the capability to operate in closed loops, which are preferred to account for process, voltage, and temperature (PVT) variations. Furthermore, the polar architectures suffer from delay mismatches owing to the different group delays that the amplitude and phase signals experience. Finally, the digitally modulated PAs suffer from noise due to finite quantization of the amplitude signal.

Three architectures that address the challenges associated with switching PA linearization are presented in the next three chapters. Among these architectures, the multiphase architecture avoids the low power efficiency at large outphasing angles in outphasing architectures and eliminates the wideband phase modulator that is required in all of the above architectures, at the expense of more complex, but low-cost and low-power, digital logic circuitry.


Figure 3.1. Schematic of an SCPA.


Figure 3.2. Block diagram of a typical EER PA.


Figure 3.3. Block diagram of an outphasing PA.


Figure 3.4. Outphasing operation in polar coordinates.


Figure 3.5. Block diagram of a PWPM PA.


Figure 3.6. Illustration of (a) modulating the amplitude with pulse-width; (b) modulating of the phase with the pulse position [76], [79].


Figure 3.7. Block diagram of a DPA.

## CHAPTER 4

## A CURRENT-MODULATED CLASS-E PA

### 4.1 Introduction

### 4.1.1 Background

High data rates in wireless communications require nonconstant envelope (non-CE) modulation techniques with high peak-to-average power ratios (PAPR) such as those used in Wi-Fi and LTE. Linear amplifiers in CMOS are inefficient when amplifying non-CE modulation because their efficiency is proportional to their output power. Nonlinear switching amplifiers can achieve high efficiency since the transistors operate when they are most efficient (e.g., in compression) [82]. Therefore, the switching amplifier is a promising candidate for improving efficiency in wireless radios for wideband, high-speed communications. Various linearization techniques for switching PAs have been investigated extensively, such as Chireix/Outphasing [6], [71], [72], [83], Doherty [84], and Envelope Elimination and Restoration (EER) [4], [60]-[65], [85]-[87]. Both Doherty and Chireix require two separate PAs as well as large passive output matching/combining networks. On the other hand, EER only requires one PA with a voltage supply modulator, hence, it has reduced power loss and smaller size in die area [57].

### 4.1.2 Motivation

Typically, the voltage supply modulator for EER transmitters is implemented using a low-dropout regulator (LDO) [4], [63]. An LDO consists of a PMOS pass transistor controlled by an op-amp using negative feedback. The LDO in a conventional supply modulator is power hungry and has an efficiency characteristic similar to that of a class-B PA [82], albeit with higher peak efficiency [63]. Moreover, the large size of the pass transistor introduces stability problems for the operational amplifier, especially when operated with wide bandwidth. In this chapter, we introduce a current-modulated class-E PA (see Figure 4.1) as an alternative to the conventional voltage-supply-modulated EER PA. To make the PA compatible with digital systems, the current modulator is controlled by a digital-to-analog converter (DAC), which allows for increased filtering of the quantization noise since the quantization is no longer on the RF signal path.

### 4.2 Differential Class-E Power Amplifier

A typical class-E PA topology is shown in Figure 4.1. It consists of an NMOS, $M$, switching at a frequency, $\omega_{0}$, a shunt capacitor $\left(C_{d}\right)$ and inductor $\left(L_{d}\right)$ forming a positive susceptance at the drain of the transistor, and a matching network ( $L_{m}, C_{m 1}$, and $C_{m 2}$ ) transforming the antenna impedance to an optimal impedance at the transistor drain and allowing high power to be delivered to the load at the fundamental frequency, $\omega_{0}$. The design equations for the parallel-circuit tuned class-E power amplifier are [88], [89]:

$$
\begin{gather*}
R_{L}=\frac{1.365 V_{D D}^{2}}{P_{\text {out }}}-\frac{8}{3} R_{s w}  \tag{4.1}\\
L_{d}=\frac{0.732 R_{L}}{\omega_{0}} \tag{4.2}
\end{gather*}
$$

$$
\begin{equation*}
C_{d}=\frac{0.685}{\omega_{0} R_{L}} \tag{4.3}
\end{equation*}
$$

The matching network ( $L_{m}, C_{m 1}$, and $C_{m 2}$ ) transforms $R_{L}$, the optimum termination resistance, to the load impedance $50 \Omega$. The drain capacitor $C_{d}$ consists of an added extrinsic capacitor in parallel with the parasitic capacitance at the drain of transistor $M$.
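Equations (4.1) through (4.3) can be evaluated with a small helper. In the sketch below the supply voltage, output power, and frequency echo numbers quoted in this chapter (2.5 V, 24 dBm, 2.4 GHz), but they are applied to a single-ended stage purely for illustration and are not the author's actual design values; the 1 ohm switch resistance and the function names are hypothetical.

```python
import math

def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)

def parallel_class_e(vdd, p_out, freq_hz, r_sw=0.0):
    """Load and drain network for the parallel-circuit class-E PA, (4.1)-(4.3)."""
    omega0 = 2.0 * math.pi * freq_hz
    r_load = 1.365 * vdd ** 2 / p_out - (8.0 / 3.0) * r_sw   # (4.1)
    if r_load <= 0.0:
        raise ValueError("switch resistance too large for the requested power")
    l_d = 0.732 * r_load / omega0                            # (4.2)
    c_d = 0.685 / (omega0 * r_load)                          # (4.3)
    return r_load, l_d, c_d

# Illustrative single-ended example with an assumed 1 ohm switch.
r_load, l_d, c_d = parallel_class_e(2.5, dbm_to_watt(24.0), 2.4e9, r_sw=1.0)
print(f"R_L = {r_load:.2f} ohm, L_d = {l_d * 1e9:.3f} nH, C_d = {c_d * 1e12:.3f} pF")
```

Part of the computed $C_{d}$ is supplied by the drain parasitic capacitance of the transistor, so only the remainder needs to be added as an explicit capacitor.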

Since class-E amplifiers offer high efficiency at the expense of device stress (due to large drain voltage), a cascode implementation is applied to distribute voltage stress and improve device reliability [76], [90]. The cascoded class-E PA is shown in Figure 4.2. With the use of thick-gate transistors, $M_{3}$ and $M_{4}$, the supply voltage $V_{D D}$ can be increased to 2.5 V in a 130 nm RF CMOS technology.

### 4.3 Current Modulation with Unary Current-Mode DAC

The low-dropout regulator (LDO) in a conventional supply modulator is power hungry and has an efficiency characteristic similar to that of a class-B PA [82], albeit with higher peak efficiency [63]. In this section, we introduce a unary-weighted, current-mode digital-to-analog converter (DAC) as the supply modulator of a class-E PA, as shown in Figure 4.3. This is similar to a current-mode DPA with a high-efficiency digital output stage, with the exception that quantization is not in the RF signal path [91], [92]. By moving the signal quantization out of the RF signal path, additional filtering can be used to shape the quantization noise. In a current-mode DPA, by contrast, quantization noise is suppressed only by the output matching network, which acts as a bandpass reconstruction filter. Size and losses in the RF matching components restrict the number of components and hence the order of the filter.

By separating the RF amplifier and DAC functions, the quantized envelope signal can be filtered at low frequency by the DAC reconstruction filter and again at high frequency by the output matching network. Additionally, high peak efficiency is possible because all of the current supplied by the DAC is provided to the highly efficient switching amplifier. This does not require an LDO, which usually consumes more power than the drivers required for the current-mode DAC and often requires a large capacitance for compensation. Moreover, when compared to the rail-to-rail operational amplifier required by an LDO, a current-mode DAC can easily be implemented to achieve a large output current swing (e.g., from zero up to the desired peak current) [93]-[95]. Hence, a current-mode DAC is a promising alternative both to the supply modulator of an EER transmitter and to conventional current-mode DPAs.

### 4.3.1 Unary-Weighted Current DAC

The block diagram of the unary-weighted current-mode DAC is shown in Figure 4.4. Binary input bits are thermometer decoded and buffered to switch an array of unary current cells. Compared to a binary-weighted DAC, a unary-weighted DAC reduces static errors due to current cell mismatch and dynamic errors due to switching transients exposed to the DAC output. The resolution of the array is designed to meet the EVM target required for demodulation of the signal. For sufficient margin, this target is chosen to be $<1 \%$; hence, a supply modulator with 7-bit accuracy is sufficient. The current amplitude is designed to achieve an output power of 24 dBm from the transmitter; hence, the DAC is designed with a peak output current of 250 mA. The clock frequency of the DAC is set by the bandwidth of the amplitude signal being processed. Simulation results have shown that the amplitude bandwidth of an IEEE 802.11g signal is $\sim 40 \mathrm{MHz}$; thus, the DAC is clocked at $4 \times$ oversampling, or 320 MHz, which is sufficient to reconstruct the IEEE 802.11g signal.
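The thermometer decoding step can be sketched as follows. The 7-bit width and the 250 mA peak current are taken from the text; the function names, the list-of-bits representation, and the assumption of perfectly matched unit cells are illustrative simplifications rather than the implemented logic.

```python
def thermometer_decode(code, bits=7):
    """Return the on/off state of each unary cell for a binary input code."""
    n_cells = 2 ** bits - 1
    if not 0 <= code <= n_cells:
        raise ValueError("code out of range")
    return [1] * code + [0] * (n_cells - code)

def dac_output_current(code, i_peak=0.250, bits=7):
    """Ideal output current in amperes, assuming identical unit cells."""
    n_cells = 2 ** bits - 1
    return sum(thermometer_decode(code, bits)) * i_peak / n_cells

# A one-LSB step at mid-scale toggles exactly one cell.
toggled = sum(a != b for a, b in zip(thermometer_decode(63), thermometer_decode(64)))
print(toggled, dac_output_current(64))
```

In a binary-weighted array the same mid-scale step would switch every cell, which is the source of the larger glitch and mismatch errors that the unary arrangement avoids.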

Fully differential current-steering DACs are widely used due to their reduced output impedance requirement. With the reduced output impedance requirement, the cascode transistor $M_{\text {cas }}$ in Figure 4.5(a) can be eliminated to alleviate headroom constraints. However, since the current DAC is used as the supply modulator in our design, a single-ended unit current cell is preferred to reduce the power consumption, as shown in Figure 4.5(b). Both $M_{1}$ and $M_{2}$ are switched by the decoder and driver to achieve the required output impedance while maintaining a voltage headroom close to that of the differential current cell.

### 4.3.2 Simulation Results

The postextracted current-mode DAC-based EER transmitter is simulated at 2.4 GHz . Figure 4.6 (a) and Figure 4.6 (b) show the output power versus the input code and the PAE versus output power, respectively. The peak output power is 24.5 dBm while the peak PAE is $43.3 \%$.

Current DAC EER transmitters provide high efficiency but moderate/poor linearity due to code-dependent impedance variation, which leads to signal distortion in both amplitude (AM-AM) and phase (AM-PM). The static AM-AM distortion can be observed in Figure 4.7 (a) while Figure 4.7 (b) shows the AM-PM distortion.

Both the output amplitude and phase can be considered a function of the input code of the current-mode DAC. To improve the linearity of the EER transmitter, digital predistortion (DPD) is applied. As shown in Figure 4.7, the PA output signals are detected and lookup tables for the amplitude $A$ and phase $\phi$ are constructed, respectively. Based on the lookup tables, $A$ and $\phi$ are mapped to predistorted signals $A^{\prime}$ and $\phi^{\prime}$, providing linearized signals at the PA output.

A 2.4 GHz OFDM-modulated signal similar to the IEEE 802.11 g standard is applied to verify the dynamic operation of the PA. Figure 4.8 shows the normalized power spectral density (PSD) of the output signals relative to the mandated spectral mask. The error vector magnitude (EVM) is 2.09 \%-rms based on the simulation results, well below the 5.6 \%-rms required for such signals. Figure 4.9 shows the simulated output signal constellation. The average output power is 13.6 dBm . The average PAE is 8.9 \%.

In this section, a current-mode DAC-based EER transmitter is presented. Such a topology allows for improved linearity while reducing the output quantization noise that is problematic in digital power amplifiers where the quantization is directly in the RF signal path. As with all current-mode DACs, the output characteristic is compressive with increasing digital code due to impedance changes in the current source. Using digital predistortion, the linearity of the EER transmitter can be significantly improved to compensate for this effect. The circuit is laid out and simulated in a 130 nm RF CMOS technology. The peak output power and PAE are 24.5 dBm and $43.3 \%$, respectively. By applying a test signal based on the IEEE 802.11 g standard, the transmitter linearity is validated by its performance compared to the required spectral mask. The demodulated signal achieves an EVM of 2.09 \%, while meeting the spectral mask with high margin.

### 4.4 Current Modulation with Pipelined SC-DAC

An alternative way to implement the current supply modulator is to control an open-loop transconductor using a 10-bit pipelined switched-capacitor (SC) digital-to-analog converter (DAC), as shown in Figure 4.10. Current-mode DPAs usually introduce quantization noise at the PA output. The proposed current-modulated PA architecture is beneficial because the quantization noise can be suppressed by the reconstruction filtering at the DAC and the output matching/filtering of the PA [91], [92].

In a conventional DPA/SCPA, the filtering at the bandpass matching network is quite limited because the matching network often needs to accommodate wideband operation. Moreover, the size and loss of the on-chip matching components restrict the number of components, and hence the order of the filter. By separating the RF amplifier and the DAC-controlled transconductor, the quantized envelope signal can be filtered at low frequency by the reconstruction filter and at high frequency by the output matching network. High power efficiency can be achieved because the pipelined SC-DAC and the open-loop transconductor consume less power than an LDO, which often requires large transistors for low output impedance and large capacitance for compensation. In the current-modulated EER PA, the open-loop transconductance amplifier controlled by an SC-DAC can easily be implemented to achieve a large output current dynamic range. The EER architecture requires digital predistortion (DPD) to improve the linearity. The 10-bit low-power DAC provides a convenient digital interface for DPD compared with the conventional analog LDO-based EER PA. Hence, an SC-DAC modulated EER transmitter is a promising alternative both to conventional EER PAs and to conventional current-mode DPAs.

### 4.4.1 Circuit Details

### 4.4.1.1 Pipelined Switched Capacitor DAC

Conventional high-speed current-mode DACs comprise scaled current-steering elements that can be controlled via digital switching. High-power PAs require high drain currents at peak output power; hence, they require large magnitude current-steering elements. But at low power the current is steered into a dummy load and the DAC consumes significant quiescent power, reducing the overall system energy efficiency. The filtering of quantization noise also consumes more power for a DAC with high output current. Additionally, significant glitching can occur due to charge feedthrough summing at the output of the current source.

A pipelined switched-capacitor DAC (SC-DAC) [96]-[99] is used to improve the overall system efficiency. This DAC uses a unary-weighted capacitor array controlled by binary-weighted digital inputs. The precise capacitor ratios achievable in CMOS improve the linearity when compared to a current-mode DAC, whereas the fast-switching transistors accommodate high-speed operation. As shown in Figure 4.11, a three-phase clock is applied to control the pipeline of operations. Each capacitor is charged to $V_{D D}$ or ground depending on its binary input bit. Half of the charge on the previous capacitor is transferred to the next capacitor in the pipeline, producing binary weighting between adjacent capacitors. Compared with the aforementioned current-steering DAC, the glitching of a pipelined SC-DAC is much lower, since the charges of all stages are combined in the last (MSB) capacitor before being delivered to the output load. As can be seen in Figure 4.11, each capacitor can be charged to either $V_{D D}$ or ground; hence, the output swing of the SC-DAC is large. However, the drive strength of the SC-DAC is low, since the output is generated by charge redistribution between capacitors. Therefore, we use the DAC to control an open-loop transconductor that supplies the source current to a high-power, high-efficiency class-E PA. The supplied current is proportional to the desired output amplitude, and hence controls the output power of the PA. Because the transconductor operates in open loop, the output amplitude is nonlinear with respect to the input code. Unlike that of closed-loop systems or current-steering DACs, this nonlinearity is relatively easy to correct using DPD running in the DSP and is relatively constant with respect to output frequency, as the load the PA presents is almost constant across its band of operation.
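The charge-sharing recursion described above can be illustrated with an idealized behavioral model. The sketch below is an assumption-laden simplification: it assumes identical capacitors, ideal switches, a discharged terminating capacitor ahead of the LSB stage, and a hypothetical supply of 1.2 V; none of these details are taken from the fabricated design.

```python
def code_to_bits(code, n_bits=10):
    """Split an integer code into bits, LSB first."""
    return [(code >> k) & 1 for k in range(n_bits)]


def pipelined_sc_dac(bits_lsb_first, vdd=1.2):
    """Ideal pipelined SC-DAC output voltage.

    Each stage capacitor is precharged to VDD or ground according to its
    bit, then shares charge with the equal-sized capacitor of the previous
    stage, so the running voltage is halved and the new bit is added with
    weight 1/2. The initial 0.0 models a discharged terminating capacitor
    (an assumption of this sketch).
    """
    v = 0.0
    for bit in bits_lsb_first:
        v_stage = vdd if bit else 0.0
        v = 0.5 * (v + v_stage)  # equal capacitors: average of the two
    return v
```

Under these assumptions the output equals $V_{DD} \cdot \text{code} / 2^{n}$, so binary weighting is obtained from identical capacitors; for example, `pipelined_sc_dac(code_to_bits(512))` returns half of `vdd`.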

Although the pipelined SC-DAC is a binary-weighted DAC, it utilizes identical capacitors to minimize cell mismatch. Compared with the more traditional current-steering DAC, the pipelined SC-DAC consumes less power and is glitch free. A 10-bit pipelined SC-DAC is implemented to provide sufficient margin for predistortion, which will be discussed later. The EER power amplifier is designed to achieve an output power of 24 dBm into a $50 \Omega$ load; hence, the open-loop transconductance amplifier is designed with a peak output current of 250 mA . The clock frequency of the DAC is chosen based on the bandwidth of the amplitude signal being processed. To achieve a wide modulation bandwidth at the output of the PA/DAC circuitry, the clock of the SC-DAC is derived from a divide-by-4 circuit whose input is the RF PA switching signal.

### 4.4.1.2 Differential and Interleaved DAC

In order to drive the transconductance amplifier in Figure 4.10, a differential pipelined SC-DAC is implemented as shown in Figure 4.12.

As shown in Figure 4.11, the output of the SC-DAC is only enabled when the corresponding clock phase is on; hence, the output is only enabled for one third of the clock period. To maintain the desired output continuously, an interleaved SC-DAC is implemented, and the outputs of the different SC-DACs are enabled during different clock phases.

### 4.4.1.3 Open-Loop Transconductor

To modulate the output of the PA, the SC-DAC drives an op-amp that implements the reconstruction filter of the SC-DAC, as shown in Figure 4.10. The reconstruction filter consists of an op-amp that is configured as an integrator with finite DC gain. This filter drives a transistor configured as an open-drain buffer. The transistor is sized to minimize the drop-out voltage placed across it at the maximum output current. It should be noted that the drop-out voltage is similar to that of an LDO, with the exception that a much smaller NMOS transistor can be used to achieve the same drop-out voltage that a PMOS can achieve.

### 4.4.2 Simulation Results

### 4.4.2.1 INL and DNL of SC-DAC

The postextracted SC-DAC with the op-amp based reconstruction filter is simulated to verify its linearity. The simulated DNL and INL of the output voltage are shown in Figure 4.13 and Figure 4.14, respectively.
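For reference, DNL and INL can be extracted from a swept static transfer curve as sketched below. This is a generic endpoint-fit calculation, not necessarily the procedure used to produce Figures 4.13 and 4.14; the function name and the endpoint-fit choice are assumptions.

```python
def dnl_inl(levels):
    """Compute endpoint-fit DNL and INL (in LSB) from DAC output levels.

    `levels[k]` is the static output for code k. The LSB size is taken
    from the line through the first and last levels.
    """
    n = len(levels)
    lsb = (levels[-1] - levels[0]) / (n - 1)
    # DNL: deviation of each code-to-code step from one ideal LSB.
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1.0 for k in range(n - 1)]
    # INL: deviation of each level from the endpoint-fit straight line.
    inl = [(levels[k] - levels[0]) / lsb - k for k in range(n)]
    return dnl, inl
```

With this definition the INL is zero at both endpoints, and the INL at code $k$ equals the running sum of the DNL values below it.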

### 4.4.2.2 Digital Predistortion

As shown in Figure 4.10, the SC-DAC with op-amp reconstruction filter drives an open-loop transconductance cell. The output current is nonlinear with respect to the driving voltage; hence, it is nonlinear with respect to the input digital code. Therefore, digital predistortion (DPD) is required. Because digital circuitry is low power and easy to integrate with the PA, DPD is widely used to linearize CMOS PAs. As was previously noted, the DPD is relatively static across frequency.

Current-based EER transmitters provide high efficiency but moderate/poor linearity due to code-dependent impedance variation, which leads to signal distortion in both amplitude (AM-AM) and phase (AM-PM). Figure 4.15 (a) and (b) show the output signal distortion of the EER transmitter system using the SC-DAC and transconductor as the supply modulator. Both the output amplitude and phase can be considered a function of the input code of the current-mode DAC. To improve the linearity of the EER transmitter, digital predistortion (DPD) is applied, as shown in Figure 4.16. The PA output signals are detected and lookup tables for the amplitude $A$ and phase $\phi$ are constructed, respectively. Based on the lookup tables, $A$ and $\phi$ are mapped to predistorted signals $A^{\prime}$ and $\phi^{\prime}$, providing linearized signals at the PA output.

A differential class-E amplifier similar to the one in Section 4.2 is implemented.

### 4.4.3 Measurement Results

The SC-DAC modulated, class-E RF PA is fabricated in a $130-\mathrm{nm}$ RF CMOS process and the total area of the chip is $2 \mathrm{~mm} \times 2 \mathrm{~mm}$ including all pads, as shown in Figure 4.17.

### 4.4.3.1 Static Measurement

The static performance of the PA was characterized using a high-speed digital IO pattern generator, an RF vector signal generator and a vector signal analyzer.

The output power versus input code, with and without DPD, is plotted in Figure 4.18(a). The PAE versus output power is plotted in Figure 4.18(b). The measured peak power and the corresponding power-added efficiency (PAE) versus frequency are plotted in Figure 4.19. At 2.4 GHz , the SC-DAC modulated EER transmitter achieves a peak power of 22.5 dBm with a corresponding PAE of $23.6 \%$.

### 4.4.3.2 Dynamic Measurement

The nonlinear behavior of the open-loop transconductance cell requires digital predistortion (DPD) to enhance the linearity of the EER transmitter. The PA output signals throughout all digital input codes are measured and saved. By comparing the input codes with the output amplitudes and phases, lookup tables (LUTs) for the amplitude, $A$, and phase, $\phi$, are constructed, respectively. Based on the lookup tables, the linearized outputs $A$ and $\phi$ are mapped to predistorted input signals $A^{\prime}$ and $\phi^{\prime}$, providing linearized signals at the PA output, as shown in Figure 4.16.
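The LUT construction described above can be sketched in a few lines. The code below is a hypothetical, memoryless illustration: the function names are invented, the AM-AM curve is assumed to be strictly increasing, and `synthetic_sweep` generates made-up compressive data purely for demonstration (it is not measured data from this work).

```python
import bisect
import math


def build_dpd_luts(meas_amp, meas_phase):
    """Build amplitude and phase LUTs from a static code sweep.

    meas_amp[c] and meas_phase[c] are the detected output amplitude and
    phase (radians) at input code c. meas_amp is assumed to be strictly
    increasing (monotonic, compressive AM-AM).
    """
    n_codes = len(meas_amp)
    peak = meas_amp[-1]
    amp_lut, phase_lut = [], []
    for code in range(n_codes):
        target = peak * code / (n_codes - 1)      # ideal linear amplitude
        j = bisect.bisect_left(meas_amp, target)  # first entry >= target
        if j == 0:
            best = 0
        elif j == n_codes:
            best = n_codes - 1
        elif meas_amp[j] - target < target - meas_amp[j - 1]:
            best = j
        else:
            best = j - 1
        amp_lut.append(best)                  # maps A to A'
        phase_lut.append(-meas_phase[best])   # offset that cancels AM-PM
    return amp_lut, phase_lut


def predistort(code, phase, amp_lut, phase_lut):
    """Map the desired (A, phi) to the predistorted (A', phi')."""
    return amp_lut[code], phase + phase_lut[code]


def synthetic_sweep(n_codes=1024):
    """Made-up compressive AM-AM and quadratic AM-PM curves (demo only)."""
    x = [c / (n_codes - 1) for c in range(n_codes)]
    amp = [math.tanh(1.5 * v) for v in x]
    phase = [0.3 * v * v for v in x]
    return amp, phase
```

In this sketch the phase correction is indexed by the predistorted code, since that is the code the modulator actually receives. The residual amplitude error is bounded by the largest step of the measured curve, which is one reason extra DAC resolution is helpful for predistortion.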

A $10 \mathrm{MHz}, 64$ QAM LTE signal is applied to test the linear operation of the current-modulated EER PA. The demodulated LTE constellation diagram is shown in Figure 4.20. The constellation is plotted before DPD in gray and after DPD in black. The EVM is improved from $>11$ \%-rms before DPD to 2.35 \%-rms after DPD.

The measured average output power and PAE are 18 dBm and $10.6 \%$, respectively. The measured adjacent channel leakage ratio (ACLR) is plotted in Figure 4.21(a) with no DPD and (b) with DPD. The DPD improves the ACLR from -18.4 dBc to -30.3 dBc .

Furthermore, a WCDMA-modulated signal is also tested with this current-modulated EER PA. The ACPR for the WCDMA signal is plotted in Figure 4.22. The average power is 19.9 dBm with a PAE of $14.3 \%$. With DPD, the ACPR is improved from -24 dBc to -37.8 dBc . The EVM of the WCDMA signal is 1.3 \%-rms.

### 4.5 Summary

A pipelined SC-DAC current-modulated EER transmitter is designed and tested. Such a topology allows for improved linearity while reducing the output quantization noise that is problematic in digital power amplifiers where the quantization is directly in the RF signal path. DPD is applied to compensate for the nonlinearity of the open-loop transconductance cell. The circuit is laid out, simulated, and measured in a 130 nm RF CMOS technology. At 2.25 GHz , the peak output power and PAE are 22.5 dBm and $23.6 \%$, respectively. This PA meets both the LTE and WCDMA requirements regarding EVM and ACLR. For the 10 MHz 64 QAM LTE signal, the PA achieves an EVM of 2.35 \%-rms with an ACLR less than -30 dBc at an average $\mathrm{P}_{\text {out }}$ of 18 dBm .

Limitations of the efficiency and linearity of the current modulated EER transmitters, along with the aforementioned systematic nonlinearity (e.g., bandwidth expansion, delay mismatch, etc.) motivate a different approach. The next two chapters introduce multiphase SCPAs, starting with a quadrature SCPA.


Figure 4.1. Typical class-E power amplifier.


Figure 4.2. Cascoded differential class-E amplifier.


Figure 4.3. Block diagram of the current DAC modulated EER transmitter.


Figure 4.4. Block diagram of the unary-weighted, current-mode DAC.


Figure 4.5. (a) Differential current cell; (b) single-ended cascode current cell.


Figure 4.6. (a) Simulated output power versus input code; (b) simulated PAE versus output power.


Figure 4.7. (a) DAC input code versus output amplitude; (b) DAC input code versus output phase.


Figure 4.8. Simulated PSD of the proposed EER transmitter.


Figure 4.9. Simulated 64-QAM OFDM signal constellation.


Figure 4.10. Block diagram of the switched capacitor DAC modulated EER transmitter.


Figure 4.11. (a) Pipelined switched capacitor DAC; (b) three-phase clock.


Figure 4.12. Interleaved differential pipelined switched capacitor DAC.


Figure 4.13. Simulated DNL of the SC-DAC with op-amp.


Figure 4.14. Simulated INL of the SC-DAC with op-amp.


Figure 4.15. (a) DAC input code versus output amplitude; (b) DAC input code versus output phase.


Figure 4.16. Diagram of signal flow with offline DPD.


Figure 4.17. Microphotograph of SC-DAC modulated EER transmitter in 130 nm CMOS.


Figure 4.18. (a) Measured output power versus input code and (b) measured PAE versus output power.


Figure 4.19. Measured peak output power and PAE versus frequency.


Figure 4.20. Measured 10 MHz 64 QAM LTE constellation with EVM of 2.35 \%-rms after DPD. The constellation before DPD is in gray.

Figure 4.21. Measured ACLR of an LTE signal (a) without DPD (b) with DPD.

Figure 4.22. Measured ACLR of a WCDMA signal (a) without DPD (b) with DPD.

## CHAPTER 5

## QUADRATURE SWITCHED CAPACITOR PA

### 5.1 Introduction

Consumer electronic devices for modern communication require transmitters with high-power and high-efficiency while simultaneously operating with high linearity and wide bandwidth.

The low breakdown voltage of CMOS limits the practically achievable output power of power amplifiers operating in either switching or linear regimes. Linear amplifiers further operate with a finite quiescent current that limits their efficiency. Hence, nonlinear switching amplifiers, such as class-D, -E, and -F PAs, are widely used to implement RF power amplifiers. Switching amplifiers are shown to be more efficient than their linear amplifier counterparts and take advantage of CMOS scaling, which has made the device a lower loss, faster switch. However, switching amplifiers require linearization circuitry for normal operation on modern wideband linear communication signals (e.g., LTE and Wi-Fi). Several linearization techniques have been investigated, such as envelope elimination and restoration (EER) [60]-[62], [85], [86], polar modulation [63]-[65], digital PAs [3], [19], [24]-[27], [66], [67], Digital-Doherty [68], outphasing [6], and pulse-width modulation [76], [77]. The techniques above require a conversion from the Cartesian to the polar coordinate system.

In the digitally intensive versions, the Cartesian-to-polar conversion requires a complex coordinate rotation digital computer (CORDIC) and the resulting bandwidth expansion limits wideband operation required by current wireless standards [57], [59]. The Cartesian-to-polar transformation can be expressed as follows:

$$
\begin{align*}
A(t) & =\sqrt{I^{2}(t)+Q^{2}(t)}  \tag{5.1}\\
\phi(t) & =\tan ^{-1}(Q(t) / I(t)) \tag{5.2}
\end{align*}
$$

Owing to the strong nonlinearity associated with these conversions, the bandwidth required for the polar components $A(t)$ and $\phi(t)$, especially the phase component $\phi(t)$, are substantially larger than the bandwidth of the Cartesian components $I(t)$ and $Q(t)$ [59].
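The bandwidth expansion can be made concrete with a small numerical sketch of (5.1) and (5.2). The example below uses a made-up straight-line $I/Q$ trajectory that passes close to the origin; the function names and the trajectory are illustrative assumptions. Note that `math.atan2` is used as the four-quadrant form of $\tan^{-1}(Q / I)$.

```python
import math


def cartesian_to_polar(i, q):
    """Return (A, phi) per (5.1) and (5.2), with a four-quadrant arctangent."""
    return math.hypot(i, q), math.atan2(q, i)


def max_phase_step(points):
    """Largest sample-to-sample phase change along an I/Q trajectory.

    No phase unwrapping is applied; the demo trajectory keeps Q > 0, so
    the phase stays within (0, pi) and never wraps.
    """
    phases = [cartesian_to_polar(i, q)[1] for i, q in points]
    return max(abs(b - a) for a, b in zip(phases, phases[1:]))


# I moves smoothly from -1 to +1 in steps of 0.02 while Q stays at 0.01.
trajectory = [(-1.0 + 0.02 * k, 0.01) for k in range(101)]
```

Although $I$ and $Q$ change by at most 0.02 per sample along this trajectory, the phase changes by more than 1 rad between adjacent samples near the origin, which is the kind of abrupt transition that widens the spectrum of $\phi(t)$.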

An alternative to techniques that require a polar conversion is to digitally modulate and sum the $I$ and $Q$ signals in the Cartesian domain, as shown in Figure 5.1. In the past, summation in the quadrature domain has been performed with transformers [5], [23]. Due to interactions between the $I$ and $Q$ currents in the windings and due to memory effects, this technique requires precise duty-cycle control and/or digital predistortion. Several efforts have been made to perform the combination in the charge domain, on a capacitor array [17], [18], [100]. These techniques are more amenable to CMOS technology, as they do not require custom transformers, and are more linear due to low-loss switching and precision capacitor matching [26]. Further, duty-cycle control is not necessary, as long as the voltage on the capacitor array due to one input settles before the next input is entered (e.g., $I(t)$ must settle before $Q(t)$) [18]. In the capacitively combined versions, a capacitor array is divided into $I$ and $Q$ subarrays, each with a quantized number of unary/binary capacitor cells. The $I / Q$ vectors can be represented simply by clocking the $Q$ cells with a quadrature clock delayed by 90 degrees from the $I$ clock; the $I / Q$ vectors can be weighted by controlling the number of cells that are switched in each subarray. Four-quadrant operation in the complex plane is achievable by appropriately inverting the $I$ and $Q$ clock signals. The output amplitude and phase are achieved by appropriate weighting of the $I / Q$ signals; hence, this eliminates the need for a CORDIC, as well as the wideband supply modulator and phase modulator required for other realizations of Polar/EER PAs. Another direct benefit is that the $I / Q$ vectors propagate at similar frequency with similar group delay. This obviates the need for the delay synchronization circuitry necessary in many Polar/EER transmitters; careful, symmetric layout ensures proper timing alignment.

Because summation of quadrature signals results in a $3-\mathrm{dB}$ loss when compared to summation of in-phase signals (e.g., polar modulation), amplifier efficiency is critical. Switching amplifiers are ideal for this operation, as their ideal efficiency is higher than that of linear amplifiers. A class-G technique can be adopted to enhance the efficiency at output power backoff [3], [80]. It is noted that the quadrature architecture consumes less overhead power than polar architectures owing to the lack of a wideband phase modulator and synchronization circuitry. As with other SCPAs, it is worth noting that the Q-SCPA achieves high linearity owing to the ability to precisely define capacitance ratios in CMOS. Similarly, the Q-SCPA eliminates the need for auxiliary, high-bandwidth analog/mixed-signal circuitry (e.g., supply modulators and phase modulators) and can be scaled to higher resolutions as required by the communication standard.

In this chapter we present a class-G Q-SCPA for nonconstant envelope amplification. This chapter is organized as follows. In Section 5.2, theoretical operation of the Q-SCPA is discussed. Design details of the presented Q-SCPA are provided in Section 5.3, followed by measurement results in Section 5.4. Finally, conclusions are presented in Section 5.5.

### 5.2 Theory of Operation

### 5.2.1 Operation of Conventional SCPA

Switched-capacitor circuits are ubiquitous in CMOS owing to their fast low-loss switches and precisely controlled capacitance ratios. The SCPA is a class-D PA with a precisely controlled capacitive divider at its output. The divider precisely controls the voltage level at the output of the SCPA using charge division on an array of capacitors; hence, it provides a direct linear summation of RF signals. Moreover, the load impedance of the SCPA is code-independent; that is, it remains constant as the input digital code causes the output amplitude to vary.

As shown in Figure 5.2, an SCPA consists of an array of capacitors whose top plates are shared and whose bottom plates are connected to an inverter that can be switched between $V_{\mathrm{DD}}$ and ground $\left(V_{\mathrm{GND}}\right)$. Though shown as a single inductor, in practice, $L$ is the excess reactive impedance of the impedance matching network and $R_{\text {opt }}$ represents the optimum termination resistance [3]. A decoder can selectively enable or disable any of the inverters.

When an inverter is enabled, switching is allowed to occur; when it is disabled, the bottom plate of the capacitor is held at ground. The output amplitude can thus be modulated by controlling the fraction of the total array capacitance that is switched each cycle. When all capacitors are switched, a peak voltage is output, while switching fewer capacitors proportionally reduces the output voltage. An inductor is connected in series with the top plate to filter the square switching waveforms at the SCPA's input. This inductor forms a series resonant circuit with the output resistor; hence, it acts as a bandpass filter for the fundamental operation frequency. The inductor and output resistor may be formed by passive components, or they can comprise a bandpass matching network that transforms the impedance of an antenna to an equivalent small resistance in series with a positive reactance. The output amplitude, $V_{\text {out }}$, is given by the following expression:

$$
\begin{equation*}
V_{\text {out }}=\frac{2}{\pi}\left(\frac{n}{N}\right) V_{D D} \tag{5.3}
\end{equation*}
$$

where $N$ is the total number of unit capacitors in the array and $n$ is the number of capacitors that are being switched. The output power, $P_{\text {out }}$, and input power, $P_{\mathrm{SC}}$, are given by the following expressions:

$$
\begin{gather*}
P_{\mathrm{out}}=\frac{2}{\pi^{2}}\left(\frac{n}{N}\right)^{2} \frac{V_{\mathrm{DD}}^{2}}{R_{\mathrm{opt}}}  \tag{5.4}\\
P_{S C}=C_{I N} V_{D D}^{2} f \tag{5.5}
\end{gather*}
$$

where $f$ is the carrier frequency and $C_{I N}$ is the input capacitance, which varies with the selected code and the total array capacitance $C$, as follows:

$$
\begin{equation*}
C_{\text {IN }}=\frac{n(N-n)}{N^{2}} C \tag{5.6}
\end{equation*}
$$

The efficiency of the SCPA can be found as the ratio of the output power to the total power:

$$
\begin{equation*}
\eta_{S C P A}=\frac{P_{\text {out }}}{P_{o u t}+P_{S C}}=\frac{4 n^{2}}{4 n^{2}+\frac{\pi n(N-n)}{Q_{n w}}} \tag{5.7}
\end{equation*}
$$

where $Q_{\mathrm{nw}}$ is the network quality factor for the series resonant circuit:

$$
\begin{equation*}
Q_{n w}=\frac{2 \pi f L}{R_{o p t}}=\frac{1}{2 \pi f C R_{o p t}} \tag{5.8}
\end{equation*}
$$

The design of an SCPA commences by choosing a desired $P_{\text {out }}$ and an acceptable value of $Q_{\mathrm{nw}}$. $Q_{\mathrm{nw}}$ is limited by the quality factor of the available passive components and is typically dominated by the inductor in CMOS processes. Practical values of the quality factor of on-chip inductors are $<20$. The matching network efficiency for a two-element, lowpass downward transformation can be approximated by:

$$
\begin{equation*}
\eta_{\text {Match }}=\frac{1}{1+\frac{Q_{n w}}{Q_{\text {inductor }}}} \tag{5.9}
\end{equation*}
$$

Thus, to maintain a network efficiency of greater than $75 \%, Q_{\mathrm{nw}}$ values must be less than 5 [101]. Further details of the SCPA design and its theory of operation can be found in Yoo, et al. [3], [26].
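Equations (5.3) through (5.9) can be collected into a short numerical sketch that also cross-checks the closed form (5.7) against the power ratio. The function names are hypothetical, and $C$ is treated here as the total array capacitance, an assumption that is consistent with (5.6) and (5.8).

```python
import math


def scpa_vout(n, n_total, vdd):
    """Fundamental output amplitude, (5.3)."""
    return (2.0 / math.pi) * (n / n_total) * vdd


def scpa_pout(n, n_total, vdd, r_opt):
    """Output power, (5.4)."""
    return (2.0 / math.pi ** 2) * (n / n_total) ** 2 * vdd ** 2 / r_opt


def scpa_psc(n, n_total, vdd, c_total, f):
    """Dynamic switching power, (5.5) with C_IN from (5.6)."""
    c_in = n * (n_total - n) / n_total ** 2 * c_total
    return c_in * vdd ** 2 * f


def q_network(f, c_total, r_opt):
    """Network quality factor of the series resonant circuit, (5.8)."""
    return 1.0 / (2.0 * math.pi * f * c_total * r_opt)


def scpa_efficiency(n, n_total, q_nw):
    """Ideal drain efficiency, closed form of (5.7); requires n >= 1."""
    return 4.0 * n ** 2 / (4.0 * n ** 2 + math.pi * n * (n_total - n) / q_nw)


def match_efficiency(q_nw, q_inductor):
    """Two-element matching network efficiency, (5.9)."""
    return 1.0 / (1.0 + q_nw / q_inductor)
```

As a usage example, `match_efficiency(5, 20)` evaluates to 0.8 and `match_efficiency(5, 15)` evaluates to 0.75, which illustrates how the bound on $Q_{\mathrm{nw}}$ follows from the inductor quality factor.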

### 5.2.2 Operation of the Q-SCPA

An example schematic of a Q-SCPA is shown in Figure 5.3. Design of a Q-SCPA commences from the same point as the polar SCPA to find the value of the total array capacitance. The array is subdivided such that half of the array capacitance is placed in the individual $I / Q$ paths. The subarrays are further divided into unit capacitors with individual driver chains that can be either switched between $V_{\mathrm{DD}}$ and $V_{\mathrm{GND}}$, or held at $V_{\mathrm{GND}}$. A quadrature clock is generated that switches the subarrays such that the $I$-array rising edge leads the $Q$-array rising edge by $90^{\circ}$.

In previous quadrature transmitter designs, the clock was operated with a $25 \%$ duty cycle to limit interactions between the individual $I / Q$ components [17], [23]. Here it is noted that as long as the $I$-array settles before the $Q$-array clock is turned on, there is no interaction between the $I / Q$ components; this does set a practical upper limit on the operating frequency of the Q-SCPA with a $50 \%$ duty cycle. Operation will now be described in detail.

As was previously mentioned, the input pulse waves for the $I$ and $Q$ capacitor subarrays are $90^{\circ}$ out of phase. The output signal $s(t)$ is the direct summation of $I(t)$ and $Q(t)$ waveforms:

$$
\begin{equation*}
s(t)=I(t) p(t)+Q(t) p\left(t+\frac{T}{4}\right) \tag{5.10}
\end{equation*}
$$

where $p(t)$ and $p(t+T / 4)$ represent the input $50 \%$ duty-cycle square waveforms, as shown in Figure 5.3(b). The $I(t)$ and $Q(t)$ signals are given by the following:

$$
\begin{align*}
& I(t)=A(t) \cos \phi(t)  \tag{5.11}\\
& Q(t)=A(t) \sin \phi(t) \tag{5.12}
\end{align*}
$$

where $A(t)$ and $\phi(t)$ are the amplitude and phase of the modulated signal, respectively.
By substituting (5.11) and (5.12) into (5.10), performing the Fourier expansion and keeping only the terms at the desired carrier frequency, the following expression results:

$$
\begin{align*}
s(t) & =A(t) \cos \phi(t) \frac{4}{\pi} \cos (\omega t)+A(t) \sin \phi(t) \frac{4}{\pi} \sin (-\omega t)  \tag{5.13}\\
& =A^{\prime}(t) \cos (\omega t+\phi(t))
\end{align*}
$$

where $A^{\prime}(t)=4 / \pi \times A(t)$. The factor of $4 / \pi$ is due to the fundamental component of the Fourier expansion of a square pulse train.
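The result in (5.13) can be checked numerically by extracting the fundamental Fourier component of (5.10) for static weights. The sketch below assumes a bipolar ($\pm 1$) square wave aligned with $\cos (\omega t)$, which is the convention under which the $4 / \pi$ factor arises; the function names and sample count are arbitrary choices.

```python
import cmath
import math


def square(theta):
    """Bipolar 50 % duty-cycle square wave aligned with cos(theta)."""
    return 1.0 if math.cos(theta) >= 0.0 else -1.0


def fundamental(i_weight, q_weight, samples=4096):
    """Amplitude and phase of the fundamental of I*p(t) + Q*p(t + T/4).

    One carrier period is sampled at interval midpoints, and the
    fundamental is extracted with a single-bin discrete Fourier sum.
    """
    acc = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        s = i_weight * square(theta) + q_weight * square(theta + math.pi / 2.0)
        acc += s * cmath.exp(-1j * theta)
    c = 2.0 * acc / samples
    return abs(c), cmath.phase(c)
```

For example, with $A=0.7$ and $\phi=2$ rad (quadrant II, so the $I$ weight is negative), `fundamental(0.7 * math.cos(2.0), 0.7 * math.sin(2.0))` returns an amplitude close to $0.7 \times 4 / \pi$ and a phase close to 2 rad, in agreement with (5.13).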

It can be seen that the amplitudes of $I(t)$ and $Q(t)$ are proportional to the number of capacitors switching in $I(\cos )$ and $Q(\sin )$ modes, respectively. Hence, weighting the $I / Q$ subarrays properly, the output amplitude and phase can be controlled precisely. An example of the operation of the proposed Q-SCPA is shown in Figure 5.4. In this figure both $I$ and $Q$ operate with 3 b of total capacitance (e.g., 8 unit capacitors). In Figure 5.4(a) an output in quadrant II of the complex plane is achieved with an inverted $I$ clock signal and a non-inverted $Q$ clock signal; precise phase and amplitude are controlled by selecting the number of capacitors that are on (e.g., switched), relative to the number held at ground. Examples for operation in quadrants I, III, and IV are shown in Figure 5.4(b), (c), and (d), respectively.

The output power of the Q-SCPA can be found by increasing the total capacitance, $N$, in (5.4) by a factor of 2, since the $I$ subarray is always off when the $Q$ subarray is switching (and vice versa) [102]:

$$
\begin{equation*}
P_{o u t, Q S C P A}=\frac{2}{\pi^{2}}\left(\frac{n}{\sqrt{2} N}\right)^{2} \frac{V_{D D}^{2}}{R_{o p t}} \tag{5.14}
\end{equation*}
$$

The input power for the individual $I$ and $Q$ array can be found by assuming that the capacitors being switched ( $n C / N$ ) are in series with the parallel combination of the capacitors not being switched $(C(N-n) / N)$ and the capacitance from the other array $(C)$. This gives the following input capacitance:

$$
\begin{equation*}
C_{I N}=\frac{n(2 N-n)}{2 N^{2}} C \tag{5.15}
\end{equation*}
$$

Making a similar substitution of (5.15) into (5.5):

$$
\begin{equation*}
P_{S C, Q S C P A}=\frac{n(2 N-n)}{4 N^{2}} C V_{D D}^{2} f \tag{5.16}
\end{equation*}
$$

The ideal drain efficiency of the Q-SCPA can be found as the ratio of the Q-SCPA output power to the total power:

$$
\begin{equation*}
\eta_{S C, Q S C P A}=\frac{P_{o u t, Q S C P A}}{P_{o u t, Q S C P A}+2 P_{S C, Q S C P A}}=\frac{4 n^{2}}{4 n^{2}+\frac{\pi n(2 N-n)}{Q_{n w}}} \tag{5.17}
\end{equation*}
$$

The total efficiency of the PA is the product of (5.9) and (5.17):

$$
\begin{equation*}
\eta_{\text {Total }}=\eta_{S C, Q S C P A} \cdot \eta_{\text {Match }} \tag{5.18}
\end{equation*}
$$

This is the total drain efficiency and does not account for input power due to the overhead (e.g., clock distribution, decoder logic, pad drivers, etc.) or losses due to finite switch resistance. This accounts for the discrepancy between the measured power added efficiency (PAE) and the total drain efficiency calculation. It should be noted that the efficiency profile of (5.17) is identical to that of (5.7); however, the Q-SCPA peak efficiency will always be lower since its peak output power is $3 \mathrm{~dB}$ lower than that of the original SCPA. A comparison of the ideal PAE of the conventional SCPA and the Q-SCPA for several different values of $Q_{\mathrm{nw}}$ is plotted in Figure 5.5. It is noted that $\eta_{Q S C P A}$ increases with $Q_{\mathrm{nw}}$, while $\eta_{\text {Match }}$ decreases with $Q_{\mathrm{nw}}$; this implies that an optimal $Q_{\mathrm{nw}}$ exists. $\eta_{\text {Total }}$ is plotted versus $Q_{\mathrm{nw}}$ in Figure 5.6 for several different values of the code word, $n$, in a 7-bit Q-SCPA (i.e., $N=128$). This plot assumes that the quality factor of the capacitor is significantly larger than that of the inductor, and that $Q_{\text {Inductor }}=10$. It can be seen that the optimal $Q_{\text {nw }}$ is between 2 ($n=128$, peak output power) and 4 ($n=32$, 6 dB backoff).
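The Q-SCPA expressions (5.14), (5.16), (5.17), and (5.9) can be evaluated directly, as in the sketch below. It only transcribes the equations as written; the function names are hypothetical, and the default $Q_{\text{Inductor}}=10$ mirrors the assumption stated for Figure 5.6. It does not model overhead power or switch resistance.

```python
import math


def qscpa_pout(n, n_total, vdd, r_opt):
    """Q-SCPA output power, (5.14)."""
    return (2.0 / math.pi ** 2) * (n / (math.sqrt(2.0) * n_total)) ** 2 * vdd ** 2 / r_opt


def qscpa_psc(n, n_total, vdd, c_total, f):
    """Dynamic switching power of one subarray, (5.16)."""
    return n * (2 * n_total - n) / (4.0 * n_total ** 2) * c_total * vdd ** 2 * f


def qscpa_efficiency(n, n_total, q_nw):
    """Ideal Q-SCPA drain efficiency, closed form of (5.17); requires n >= 1."""
    return 4.0 * n ** 2 / (4.0 * n ** 2 + math.pi * n * (2 * n_total - n) / q_nw)


def match_efficiency(q_nw, q_inductor=10.0):
    """Two-element matching network efficiency, (5.9)."""
    return 1.0 / (1.0 + q_nw / q_inductor)


def total_efficiency(n, n_total, q_nw, q_inductor=10.0):
    """Product of (5.17) and (5.9)."""
    return qscpa_efficiency(n, n_total, q_nw) * match_efficiency(q_nw, q_inductor)
```

Evaluating `total_efficiency` over a range of `q_nw` values for a fixed code shows the trade-off between the two factors: the first term rises with `q_nw` while the second falls.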

Additional losses in clocking and driving can be accounted for with estimates of the total gate capacitance being driven [26]. As has been noted, though there is a penalty for combining the signals after the PA, there is no requirement for precision synchronization or wideband phase-/amplitude-modulator circuitry, all of which require significant power from the supply.

### 5.3 Circuit Details

### 5.3.1 Top Level of the 8-Bit Q-SCPA

A single-ended block diagram of the proposed Q-SCPA is shown in Figure 5.7 [102]. Note that the fabricated circuit is differential. A Cartesian representation of a nonconstant envelope signal is separated into its constituent in-phase, $I$, and quadrature, $Q$, vectors. The digitized $I / Q$ vectors, $B_{I, Q}$, are represented as signed digital code words; these vectors are input to a digital pattern generator that separates the bit pattern and outputs the bits to their proper digital inputs. The on-chip decoder is a binary-to-thermometer decoder for the MSBs, while the LSBs are simply buffered to match the decoder delay. An RF clock at twice the desired output frequency is received on chip via an LVDS clock receiver and is then converted to a quadrature clock by a quadrature $\div 2$ circuit. The MSB from the decoder is the sign bit and is input to an XOR along with the output of the $\div 2$ circuit. Hence, a quadrature output that can be inverted depending on the value of the sign bit is realized.

The remaining LSBs represent the amplitude weighting of the constituent $I$ and $Q$ signals. Each capacitor subarray comprises a total of 6 bits, chosen primarily to reduce the amount of quantization noise at the output of the Q-SCPA, while meeting signal fidelity requirements (e.g., EVM, ACLR, etc.). This resolution also allows for additional bits should digital predistortion (DPD) be required. The capacitor subarrays are subdivided into a partial unary and binary array as a compromise between size/complexity and linearity. The four MSBs are unary-weighted ($C_{U}=200 \mathrm{fF}$) and controlled by a binary-to-thermometer decoder, whereas the two LSBs are binary-weighted ($C_{1}=100 \mathrm{fF}$ and $C_{0}=50 \mathrm{fF}$) for fine output resolution. An extra bit is achieved by operating as a class-G circuit, with two binary-weighted power supply voltages; hence, in the fabricated circuit, seven total bits of amplitude resolution are realized. The capacitor sizes were limited by the smallest dimensions achievable for MiM capacitors in the chosen technology.
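The segmented weighting described above can be sketched in a few lines. This is only an illustrative model of one 6-bit subarray, assuming the four MSBs map to 0 to 15 thermometer cells; the function and constant names are hypothetical and the class-G bit is not modeled.

```python
# Illustrative model of one 6-bit capacitor subarray (names are hypothetical).
C_UNARY_FF = 200.0  # unary unit capacitor C_U, from the text
C_1_FF = 100.0      # binary LSB capacitor C_1, from the text
C_0_FF = 50.0       # binary LSB capacitor C_0, from the text


def switched_capacitance_ff(code: int) -> float:
    """Capacitance in fF that is switched for a 6-bit amplitude code (0..63)."""
    if not 0 <= code <= 63:
        raise ValueError("code must fit in 6 bits")
    n_unary = code >> 2   # four MSBs: number of thermometer cells on (0..15)
    b1 = (code >> 1) & 1  # binary-weighted LSB driving C_1
    b0 = code & 1         # binary-weighted LSB driving C_0
    return n_unary * C_UNARY_FF + b1 * C_1_FF + b0 * C_0_FF


# Total capacitance of the subarray under these assumptions.
TOTAL_FF = 15 * C_UNARY_FF + C_1_FF + C_0_FF
```

Under these assumptions every code step adds one 50 fF unit, so the switched capacitance is a uniform ramp in the code.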

The capacitor array is designed using MiM capacitors with a common top plate, while the bottom plates are connected to class-G switches (more detail on the class-G switch is provided in Section 5.3.2). The top plates are connected in series with a low-pass matching network that transforms the antenna impedance of $50 \Omega$ to the optimum termination impedance. The matching circuit comprises a series inductor, $L_{ser}$, and a shunt capacitor, $C_{sh}$, forming a bandpass series-resonant circuit at the design frequency. Because the total capacitance remains unchanged from the perspective of the matching network, the network can be sized to be series resonant with the total capacitance in the array. The matching network also acts to filter the undesired harmonic content associated with switching waveforms at the input of the circuit.

The inductor, $L_{ser}=1.0 \mathrm{nH}$, is realized as a custom-wound, fully differential transformer, as seen in Figure 5.8(a). The routing of the inductor allows both sides of the differential Q-SCPA to be matched to the antenna impedance while providing ease of routing. The simulated inductance and resistance of the custom cell are plotted in Figure 5.8(b). The capacitor, $C_{sh}=4.8 \mathrm{pF}$, is a MiM capacitor, similar to those used in the Q-SCPA capacitor arrays. The impedance transformation circuit uses a loaded quality factor, $Q_{\mathrm{nw}} \approx 3$, leading to a circuit with approximately $600 \mathrm{MHz}$ of $3 \mathrm{~dB}$ bandwidth centered at 2 GHz. Higher quality factors can be used if off-chip impedance transformation is used, owing to the higher quality factors possible with off-chip components.
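As a rough cross-check of the numbers quoted above, the sketch below applies two textbook first-order relations: bandwidth of about $f_0/Q$ for a resonant network, and the series-resonance condition $C = 1/((2\pi f_0)^2 L)$. Both are idealizations that ignore component loss and layout parasitics, and the function names are hypothetical.

```python
import math


def bandwidth_hz(f0_hz: float, q_loaded: float) -> float:
    """First-order 3 dB bandwidth estimate of a resonant network: f0 / Q."""
    return f0_hz / q_loaded


def resonant_capacitance_f(f0_hz: float, inductance_h: float) -> float:
    """Capacitance that is series resonant with the given inductance at f0."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * inductance_h)


bw = bandwidth_hz(2e9, 3.0)                 # about 667 MHz
c_res = resonant_capacitance_f(2e9, 1e-9)   # about 6.3 pF
```

The first-order estimate lands near the approximately 600 MHz quoted in the text; the exact figure depends on the actual loaded Q of the fabricated network.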

### 5.3.2 Unit Class-G SCPA

The schematic of the dual-supply class-G driver [3], [103] is shown in Figure 5.9. Low supply voltage is a primary reason for poor efficiency in CMOS power amplifiers; this is because the output resistance is proportional to the square of the supply voltage; hence, a reduction in supply voltage by a factor of two reduces the optimum termination impedance by a factor of four. This leads to larger impedance transformations from the antenna, corresponding to higher losses in the matching network, as well as a voltage division at the output of the switching transistor. The nominal supply voltage for CMOS devices is $V_{\mathrm{DD}}=1.2 \mathrm{~V}$ in the chosen 65 nm process technology. In order to increase the output power and to reduce the losses from impedance transformation, it is desirable to operate with higher supply voltages. This is implemented by cascoding the transistors in a standard CMOS inverter that acts as the switch between the high supply voltage and ground in the Q-SCPA. Using this topology, the supply voltage of the cascoded driver is increased to twice $V_{\mathrm{DD}}$, which is labeled as $V_{\mathrm{DD} 2}$ in Figure 5.9. It has been shown that efficiency in power backoff can be improved by reducing the supply voltage for envelope signals that are small enough [4], [103], [104]. In a switched-capacitor circuit, switching supplies results in no glitch, as the transition can be controlled to occur only when the switch is already open (e.g., disconnected from the load) [3], [105]. Therefore, a second switching path is added with a supply voltage of $V_{\mathrm{DD}}$. It is critical to match the resistances of both the pull-up and the pull-down paths; this mitigates code-dependent nonlinearity [26]. The class-G topology increases the peak output power, improves the efficiency at power backoff, and adds an extra binary bit of resolution since $V_{\mathrm{DD} 2}=2 V_{\mathrm{DD}}$.
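The quadratic dependence of the optimum termination on supply voltage can be illustrated numerically. The sketch below assumes the ideal single-ended SCPA peak-power expression $P_{out}=\frac{2}{\pi^{2}} \frac{V_{DD}^{2}}{R_{opt}}$ that appears later as (6.24); the 100 mW target and the function name are illustrative choices, not design values from this work.

```python
import math


def r_opt_ohm(v_supply: float, p_out_w: float) -> float:
    """Optimum termination for an ideal SCPA: R = (2/pi^2) * V^2 / P."""
    return (2.0 / math.pi ** 2) * v_supply ** 2 / p_out_w


p_target_w = 0.1                      # 20 dBm, chosen only for illustration
r_low = r_opt_ohm(1.2, p_target_w)    # about 2.9 ohm at V_DD
r_high = r_opt_ohm(2.4, p_target_w)   # about 11.7 ohm at V_DD2 = 2 * V_DD
```

Doubling the supply raises the required termination by a factor of four, which shrinks the impedance transformation ratio from the $50 \Omega$ antenna.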

### 5.3.3 Logic and Switch Drivers

The schematic for the enabling logic and drivers that precede the switch is shown in Figure 5.10. The enabling logic for each switch path is located adjacent to the switch and takes its input from the decoder. Colocation of the logic and driving chains allows the parasitic routing capacitance to be minimized and eases timing synchronization of the switching signals. Four separate controls (A, B, C, and D) are required to control the class-G switch. The PMOS transistors operate between supply rails $V_{\mathrm{DD}}$ and $V_{\mathrm{DD} 2}$; hence, a level shifter is used to change the logic levels [106]. Inverters after the level shifters are placed in isolation wells to allow operation from these different supply rails. Care is taken to minimize the delay mismatch from input to output in all four paths, as this minimizes the potential for crowbar current to flow between the supply rails when the PMOS and NMOS paths are on simultaneously. Nonoverlapping clocks can be used to further minimize crowbar current, at the expense of slightly lower output power and reduced linearity. Effects of relative delay between different cells are mitigated using an input latch; all data bits are designed to arrive at the latch within its setup time.

### 5.4 Experimental Results

An experimental prototype of the capacitively combined, class-G Q-SCPA is fabricated in a 65 nm RF LP CMOS process with 9 layers of metallization, including an ultra-thick top metal for high quality passive elements. The prototype occupies an area of $1.8 \mathrm{~mm} \times 1.0 \mathrm{~mm}$ including all bonding and probe pads; the chip area is heavily pad dominated due to the required I/O. Figure 5.11 shows a chip microphotograph of the Q-SCPA. The circuit comprises a differential, quadrature 6-bit array of precision MiM capacitors, switches, drivers, selection logic, decoders, and a fully integrated output matching network. All circuits operate from 1.2 V, with the exception of the cascoded switches, which operate from 2.4 V.

### 5.4.1 Static Measurements

The PA operates at a center frequency of 2 GHz with a peak output power and efficiency of 20 dBm and $21 \%$, respectively, as shown in Figure 5.12. The -3 dB bandwidth of the PA is $\approx 400 \mathrm{MHz}$ as determined by the loaded quality factor of the band-pass matching network. Note that the performance below 2 GHz is dominated by the rolloff of the balun in the measurement setup.

Shown in Figure 5.13 (a) is the $P_{\text {out }}$ versus the quadrature code input for the vector $I=Q$. This corresponds to a transition from the maximum in quadrant III to the maximum in quadrant I of the complex plane. The output amplitude reduces linearly as the code is changed, with minor distortion due to bonding inductance. A sign bit allows the quadrature oscillator signals to be inverted so that all quadrants of the complex plane are accessible.

Asymmetry in the response is due to supply and ground bounce caused by excess bondwire inductance in the PCB layout. This distortion can be reduced with better decoupling of the supply circuitry on chip, or with low-inductance packaging (e.g., flip-chip) [3]. The efficiency is plotted as a function of output power for the $I=Q$ vector in Figure 5.13(b). Again, the asymmetry is due to supply and ground inductance and can be reduced similarly.

### 5.4.2 Dynamic Measurements

To verify the quadrature SCPA's ability to amplify complex, wideband modulated signals, a $10 \mathrm{MHz}$, 64-QAM LTE signal is applied to the power amplifier. The ACLR performance is plotted in Figure 5.14 and is below $-30 \mathrm{dBc}$ when outputting 14.5 dBm at 12.2 \% average efficiency. This result is obtained after a 2D digital predistortion procedure that is only necessary due to the aforementioned excessive supply and ground bondwire inductance, as was verified with simulations of the Q-SCPA with and without bondwire inductance. The signal constellation is plotted in Figure 5.15, showing that the measured EVM at this ACLR is 3.6 \%-rms.

Digital PAs such as the SCPA and Q-SCPA are quantized systems, and hence their out-of-band (OOB) noise is dominated by quantization. The OOB noise for the 7-bit Q-SCPA when transmitting a $10 \mathrm{MHz}$, 64-QAM LTE signal is plotted in Figure 5.16. The OOB noise at $+80$, $+85$, $+95$, and $+190 \mathrm{MHz}$, and in the ISM band, is $-115.4$, $-115.3$, $-115.8$, $-108.8$, and $-112.4 \mathrm{dBm} / \mathrm{Hz}$, respectively. Though these exceed the desired specification of $-125 \mathrm{dBm} / \mathrm{Hz}$, with two extra bits of resolution the specification would be met. As the presented design was pad limited, increasing resolution in a fully integrated transmitter would not be problematic. It should also be noted that the poor performance at 190 MHz was dictated by the sampling rate of the pattern generation instrument, which could be increased to move the spurs further OOB. The functionality of the Q-SCPA is validated through both the static and vector measurements. The advantages of the Q-SCPA are evident in that no phase modulator or timing synchronization circuitry was necessary.
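The claim about extra resolution can be illustrated with the usual rule of thumb that each added bit of a uniform quantizer lowers the quantization noise floor by about 6.02 dB. The sketch below assumes the measured floor is purely quantization limited, which is an idealization; the function name is hypothetical.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB per added bit (rule of thumb)


def noise_with_extra_bits(noise_dbm_hz: float, extra_bits: int) -> float:
    """Estimated noise floor after adding bits to a quantization-limited DAC."""
    return noise_dbm_hz - DB_PER_BIT * extra_bits


# Worked example using the value measured at the +80 MHz offset.
estimate = noise_with_extra_bits(-115.4, 2)  # about -127.4 dBm/Hz
```

Under this assumption the +80 MHz measurement would fall below the $-125 \mathrm{dBm} / \mathrm{Hz}$ target with two extra bits; spurs set by the pattern generator's sampling rate are not captured by this estimate.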

### 5.5 Summary

A quadrature SCPA that can output any phase and amplitude on the complex plane based on digitally coded quadrature inputs is demonstrated in 65 nm CMOS. As with all SCPAs, this PA leverages the CMOS strengths of low-loss switches and precision capacitor ratios to simultaneously achieve good efficiency and linearity. The Q-SCPA, however, leverages the advantages of digital PAs while not requiring the wideband modulator of typical DPAs. Furthermore, no complex synchronization circuitry is required, unlike in digital polar PAs. A prototype fabricated in a 65 nm CMOS process achieves a peak $P_{\text {out }}$ and $PAE$ of 20.5 dBm and $20 \%$, respectively. The performance of the Q-SCPA in a transmitter is validated using a $10 \mathrm{MHz}$, 64-QAM LTE signal. After a 2D DPD, the ACLR is below the required $-30 \mathrm{dBc}$ limit and the measured EVM is $<4 \%$-rms, while achieving an average $P_{\text {out }}$ and $PAE$ of 14.5 dBm and $12.2 \%$, respectively.

A comparison to similar digital transmitters is given in Table 5.1. The overall efficiency is lower in this design because the matching network is implemented on chip, which usually degrades the overall efficiency by $30 \%$ to $40 \%$. In addition, the circuits in [17] and [100] operate at 800 MHz. At lower frequency, the overall efficiency is higher since the transition time occupies a smaller fraction of each cycle.


Figure 5.1. Block Diagram of an SCPA-based quadrature power amplifier.


Figure 5.2. Schematic of an SCPA.


Figure 5.3. (a) Schematic diagram of a Q-SCPA; (b) waveforms of I/Q vectors.


Figure 5.4. Schematics of capacitively combined quadrature SCPAs outputting (a) $-6+\mathrm{j} 1$, (b) $8+\mathrm{j} 8$, (c) $-2-\mathrm{j} 4$, and (d) $3-\mathrm{j} 6$.


Figure 5.5. Comparison of ideal drain efficiency, $\eta$, versus $P_{\text {out }}$ for a conventional SCPA and several Q-SCPAs.


Figure 5.6. Comparison of the total efficiency versus $Q_{\mathrm{nw}}$ for several code words in a Q-SCPA.


Figure 5.7. Block diagram of the proposed quadrature SCPA. Note that the actual implementation is differential and that the switches are cascoded class-G switches (see Figure 5.9). The unit capacitance is 200 fF.


Figure 5.8. (a) Custom differential inductor, $L_{ser}$; (b) simulated inductance and resistance versus frequency.


Figure 5.9. Schematic of unit class-G driver with active supply of (a) VDD2 (b) VDD. All transistors are minimum length, with the following widths in $\mu \mathrm{m}: \mathrm{P}_{1}=\mathrm{P}_{2}=87.84, \mathrm{~N}_{1}=$ $28.8, \mathrm{~N}_{2}=38.88$.


Figure 5.10. Q-SCPA Class-G Logic Decoder. Note that the unit size for an NMOS transistor is $550 \mathrm{~nm} \times 60 \mathrm{~nm}$, while a PMOS is $1320 \mathrm{~nm} \times 60 \mathrm{~nm}$.


Figure 5.11. Chip microphotograph of the 65 nm experimental prototype transformer combined SCPA.


Figure 5.12. Measured output power and PAE versus frequency.


Figure 5.13. (a) Measured output power versus codeword (b) measured PAE versus output power.


Figure 5.14. Measured ACLR for a $10 \mathrm{MHz}, 64$ QAM LTE signal.


Figure 5.15. Measured signal constellation for a $10 \mathrm{MHz}, 64$ QAM LTE signal.


Figure 5.16. Measured OOB spectrum for a $10 \mathrm{MHz}, 64$ QAM LTE signal.

Table 5.1. Comparison to prior art

|  | This Work | [17] | [23] | [100] | [67] |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Process technology | 65 nm | 28 nm | 65 nm | 65 nm | 65 nm |
| Supply voltage (V) | 1.2/2.4 | 1.1 | 1.3 | 1.25 | 1.2 |
| Resolution (bit) | 8-IQ | 6-IQ | 13-IQ | 7-IQ | 9-Polar |
| Carrier Frequency | 2.0 GHz | 800 MHz | 2.4 GHz | 800 MHz | 2.2 GHz |
| Peak Pout (dBm) | 20.5 | 13.9 | 22.8 | 7.87 | 23.3 |
| PAE at peak $\mathrm{P}_{\text {out }}$ | 20 \% | 40.4 \% | 42 \% | 10.8 \% | 38 \% |
| Modulation signal | LTE 10 MHz, 64-QAM | LTE 10 MHz, 16-QAM | Single carrier 22 MHz, 64-QAM | LTE 5 MHz | 802.11g 20 MHz, 64-QAM |
| Average $\mathrm{P}_{\text {out }}(\mathrm{dBm})$ | 14.5 | 6.97 | 15 | 0.26 | 16.8 |
| Average PAE | 12.2 \% | 29.1 \% | NA | 6.7 \% | 21.8 \% |
| EVM ( \%) | 3.6 \%-rms | NA | 3.98 \%-rms | NA | 3.98 \%-rms |
| ACLR (dBc) | -30.7/-31.0 | -32.4/-32.7 | < -43 | -31.8/-31.5 | NA |
| Matching Network | On-Chip | No | Transformer | No | Transformer |

## CHAPTER 6

## A MULTIPHASE SWITCHED CAPACITOR PA

### 6.1 Motivations

### 6.1.1 Problems in Polar Architecture

In modern communication systems, $I$ and $Q$ signals are readily available at the digital backend. The operation of polar PAs requires a coordinate rotation digital computer (CORDIC) to convert the $I$ and $Q$ signals into amplitude and phase, whereas outphasing PAs require two different phase signals.

As we can see from equations (5.1)-(5.2), these conversions are nonlinear, resulting in a significant bandwidth expansion, especially for phase signals. Conventional phase modulators are implemented using a phase-locked loop (PLL) and have a narrow bandwidth that is not suitable for wideband modulation techniques such as WiMAX or LTE. Although open-loop phase modulation is an alternative, it suffers from phase quantization noise and digital-to-phase nonlinearity. Handling the phase quantization noise and digital-to-phase nonlinearity adds complexity to the entire system.

Furthermore, in an EER PA or a digital polar PA, the amplitude signal and phase signal propagate at different frequencies on separate paths, resulting in a significant delay mismatch. The delay mismatch can cause spectral mask violations for modulated signals such as WCDMA and LTE.

Outphasing PAs do not suffer from the delay mismatch; however, the load/device mismatch due to PVT variations significantly limits the dynamic range and causes AM-AM and AM-PM distortions.

### 6.1.2 Output Power Loss in Quadrature Architecture

Although the quadrature PAs can avoid the bandwidth expansion or delay mismatch, they suffer from a significant power drop during $I / Q$ combination. As shown in Figure 6.1 (a), when the magnitude of $I$ and $Q$ are equal, the output power drops by 3 dB when compared to vector summation of two signals that are in phase. Hence, the phase-dependent power drop is caused by the 90 -degree phase difference of the $I / Q$ signals.

It is straightforward to see that the power drop is 3 dB when the output phase is $\pi / 4$. For a modulated signal whose phase is evenly distributed from 0 to $2 \pi$, we need to consider the overall power drop due to the direct $I / Q$ summation. Assume the output phase is $\phi$ and the output amplitude is $A_{0}$. In an SCPA, the output amplitude is proportional to the number of capacitors switching; therefore, the ratio between the amplitude of the quadrature PA and that of the digital polar PA can be expressed by:

$$
\begin{equation*}
\frac{A(\phi)}{I(\phi)+Q(\phi)}=\frac{A_{0}}{\left|A_{0} \cos \phi\right|+\left|A_{0} \sin \phi\right|}=\frac{1}{|\cos \phi|+|\sin \phi|} \tag{6.1}
\end{equation*}
$$

The overall ratio from 0 to $2 \pi$ is

$$
\begin{align*}
& \frac{1}{2 \pi} \int_{0}^{2 \pi} \frac{1}{|\cos \phi|+|\sin \phi|} d \phi=4 \times \frac{1}{2 \pi} \int_{0}^{\pi / 2} \frac{1}{\cos \phi+\sin \phi} d \phi  \tag{6.2}\\
= & \frac{2}{\pi} \int_{0}^{\pi / 2} \frac{1}{\cos \phi+\sin \phi} d \phi=\frac{2}{\pi} \int_{0}^{\pi / 2} \frac{1}{\sqrt{2} \cos \left(\phi-\frac{\pi}{4}\right)} d \phi \\
= & \frac{\sqrt{2}}{\pi} \ln \left|\sec \left(\phi-\frac{\pi}{4}\right)+\tan \left(\phi-\frac{\pi}{4}\right)\right|_{0}^{\pi / 2}=\frac{\sqrt{2}}{\pi} \ln \left(\frac{\sqrt{2}+1}{\sqrt{2}-1}\right) \\
= & 0.7935
\end{align*}
$$

Similarly, we can also calculate the ratio of the output power:

$$
\begin{align*}
& \frac{1}{2 \pi} \int_{0}^{2 \pi} \frac{1}{(|\cos \phi|+|\sin \phi|)^{2}} d \phi=4 \times \frac{1}{2 \pi} \int_{0}^{\pi / 2} \frac{1}{(\cos \phi+\sin \phi)^{2}} d \phi  \tag{6.3}\\
& \quad=\frac{2}{\pi} \int_{0}^{\pi / 2} \frac{1}{(\cos \phi+\sin \phi)^{2}} d \phi=\frac{2}{\pi} \int_{0}^{\pi / 2} \frac{1}{2 \cos ^{2}\left(\phi-\frac{\pi}{4}\right)} d \phi \\
& \quad=\left.\frac{1}{\pi} \tan \left(\phi-\frac{\pi}{4}\right)\right|_{0} ^{\pi / 2}=\frac{2}{\pi} \\
& \quad=0.6366=-1.96 \mathrm{~dB}
\end{align*}
$$

Therefore, for a modulated signal, the output power of quadrature SCPA will be 2 dB lower than the polar SCPA if they are implemented by a similar SCPA array.
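The closed-form results in (6.2) and (6.3) can be checked numerically. The sketch below averages the two integrands over a uniformly distributed phase with a simple midpoint rule; the helper name and the step count are arbitrary choices made for this illustration.

```python
import math


def average_over_phase(f, steps: int = 100_000) -> float:
    """Midpoint-rule mean of f(phi) for phi uniformly distributed on [0, 2*pi)."""
    h = 2.0 * math.pi / steps
    return sum(f((k + 0.5) * h) for k in range(steps)) / steps


def iq_sum(phi: float) -> float:
    """Normalized number of cells needed by the quadrature PA: |cos| + |sin|."""
    return abs(math.cos(phi)) + abs(math.sin(phi))


amplitude_ratio = average_over_phase(lambda p: 1.0 / iq_sum(p))   # compare with (6.2)
power_ratio = average_over_phase(lambda p: 1.0 / iq_sum(p) ** 2)  # compare with (6.3)
power_ratio_db = 10.0 * math.log10(power_ratio)
```

The numerical averages agree with the analytical values of 0.7935 and $2/\pi$ to within the integration error.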

A number of papers use $25 \%$ duty cycle clocks instead of $50 \%$ duty cycle clocks [17], [23], arguing that the low drain efficiency is due to the overlap of the $I / Q$ clocks as shown in Figure 6.2 (a). However, a careful look at the operation of the DRFC during this overlap region shows that the region corresponds to the 45-degree vector in Figure 6.1(b).

### 6.2 Theory of Multiphase Modulation

From Figure 6.1 (a) and Figure 6.2 (a), we can conclude that the power loss is actually due to the nonoverlapping region of $I / Q$ clocks. However, this nonoverlapping region is critical for the generation of output signals with arbitrary phases.

Therefore, the multiphase all-digital transmitter architecture is proposed to significantly improve the power efficiency of the quadrature power amplifier.

An example of eight-phase clocks is illustrated in Cartesian coordinates as shown in Figure 6.3(a). By adding additional phases, the separation between adjacent phase vectors is reduced, resulting in increased constructive summation between adjacent phase vectors and hence reduced power loss.

The digital logic circuitry needed to generate multiphase signals would have been hard to implement in the era when Kahn introduced the first EER PA in 1954, but it is readily available in today's CMOS circuitry. Deep-submicron CMOS technology is fundamentally a digital technology. The multiphase architecture takes advantage of low-cost, low-power digital circuitry to reduce the power loss in high-power switching power amplifiers.

### 6.2.1 Polar to Multiphase Conversion

As shown in Figure 6.4, an arbitrary vector $(A, \theta)$ can be converted to the multiphase domain using the following equations:

$$
\begin{gather*}
n_{1}=A \cos (\pi / M) \frac{\sin (2 \pi / M)}{\sin (2 \pi / M-\phi)}  \tag{6.4}\\
n_{2}=A \cos (\pi / M) \frac{\sin (2 \pi / M)}{\sin \phi}  \tag{6.5}\\
\phi=\theta-\frac{2 \pi}{M} m \tag{6.6}
\end{gather*}
$$

where $n_{1}$ and $n_{2}$ represent the amplitudes of the two selected adjacent phase vectors and $m$ is an integer representing the index of the selected phase vector. Since adjacent phase vectors are separated by $2 \pi / M$ and $\phi$ in (6.6) must lie within one such sector, $m$ is determined by

$$
\begin{equation*}
\frac{m}{M} 2 \pi \leq \theta<\frac{m+1}{M} 2 \pi \tag{6.7}
\end{equation*}
$$

or

$$
\begin{equation*}
\frac{\theta}{2 \pi} M-1<m \leq \frac{\theta}{2 \pi} M \tag{6.8}
\end{equation*}
$$

To justify the efficiency improvement of the proposed design, assume the output amplitude of half of the DRFC cells is $V$. In Figure 6.1 (a), when the magnitudes of $I$ and $Q$ are identical, the output power is reduced by:

$$
\begin{equation*}
10 \log \left(\frac{4 V^{2} \cos ^{2}(\pi / 4)}{4 V^{2}}\right)=-3 \mathrm{~dB} \tag{6.9}
\end{equation*}
$$

However, if the eight-phase architecture is used instead, the output power is only reduced by:

$$
\begin{equation*}
10 \log \left(\frac{4 V^{2} \cos ^{2}(\pi / 8)}{4 V^{2}}\right)=-0.69 \mathrm{~dB} \tag{6.10}
\end{equation*}
$$

In general, the power loss at phase $\phi$ of the M-phase architecture can be calculated by

$$
\begin{equation*}
\frac{A(\phi)^{2}}{\left(n_{1}(\phi)+n_{2}(\phi)\right)^{2}}=\frac{\sin ^{2}(\pi-2 \pi / M)}{(|\sin (2 \pi / M-\phi)|+|\sin \phi|)^{2}} \tag{6.11}
\end{equation*}
$$

When $\phi=\pi / M$, we can get the maximum power loss:

$$
\begin{equation*}
\frac{A(\pi / M)^{2}}{\left(n_{1}(\pi / M)+n_{2}(\pi / M)\right)^{2}}=\cos ^{2}(\pi / M) \tag{6.12}
\end{equation*}
$$
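The worst-case loss in (6.12) is easy to tabulate for different phase counts. The sketch below simply evaluates $\cos^{2}(\pi / M)$ in decibels; the function name is hypothetical.

```python
import math


def worst_case_loss_db(num_phases: int) -> float:
    """Maximum power loss of an M-phase combiner, 10*log10(cos^2(pi/M)), per (6.12)."""
    return 10.0 * math.log10(math.cos(math.pi / num_phases) ** 2)


losses = {m: worst_case_loss_db(m) for m in (4, 8, 16)}
```

For $M=4$ and $M=8$ this reproduces the $-3 \mathrm{~dB}$ and $-0.69 \mathrm{~dB}$ values of (6.9) and (6.10); for $M=16$ the same expression gives roughly $-0.17 \mathrm{~dB}$.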

The overall ratio between the amplitude of the M-phase architecture and the amplitude of a polar architecture, for a modulated signal with evenly distributed phases, can be calculated by

$$
\begin{equation*}
\frac{A(\phi)}{P_{0}(\phi)+P_{1}(\phi)}=\frac{A_{0} \sin [(M-2) \pi / M]}{\left|A_{0} \sin (2 \pi / M-\phi)\right|+\left|A_{0} \sin \phi\right|}=\frac{\sin [(M-2) \pi / M]}{|\sin (2 \pi / M-\phi)|+|\sin \phi|} \tag{6.13}
\end{equation*}
$$

$$
\begin{equation*}
\frac{1}{2 \pi} \int_{0}^{2 \pi} \frac{\sin [(M-2) \pi / M]}{|\sin (2 \pi / M-\phi)|+|\sin \phi|} d \phi=M \times \frac{1}{2 \pi} \int_{0}^{2 \pi / M} \frac{\sin [(M-2) \pi / M]}{\sin (2 \pi / M-\phi)+\sin \phi} d \phi \tag{6.14}
\end{equation*}
$$

$$
=\frac{M \sin [(M-2) \pi / M]}{2 \pi} \int_{0}^{2 \pi / M} \frac{1}{\sin (2 \pi / M) \cos \phi+(1-\cos (2 \pi / M)) \sin \phi} d \phi
$$

$$
\begin{aligned}
& =\frac{M \sin [(M-2) \pi / M]}{2 \pi C} \int_{0}^{2 \pi / M} \frac{1}{\cos (\phi-\alpha)} d \phi \\
& =\frac{M \sin [(M-2) \pi / M]}{2 \pi C} \int_{0}^{2 \pi / M} \frac{1}{\cos (\phi-\pi / M)} d \phi \\
& =\frac{M \sin [(M-2) \pi / M]}{2 \pi C} \ln |\sec (\phi-\pi / M)+\tan (\phi-\pi / M)|_{0}^{2 \pi / M} \\
& =\frac{M \sin [(M-2) \pi / M]}{2 \pi C} \ln \left(\frac{\sec (\pi / M)+\tan (\pi / M)}{\sec (\pi / M)-\tan (\pi / M)}\right) \\
& =\frac{M \sin [(M-2) \pi / M]}{2 \pi \sqrt{2-2 \cos (2 \pi / M)}} \ln \left(\frac{\sec (\pi / M)+\tan (\pi / M)}{\sec (\pi / M)-\tan (\pi / M)}\right)
\end{aligned}
$$

where

$$
\alpha=\arctan \frac{1-\cos (2 \pi / M)}{\sin (2 \pi / M)}=\frac{\pi}{M} \text { and } C=\sqrt{2-2 \cos (2 \pi / M)} .
$$

Similarly, the ratio of the output power can be calculated by:

$$
\begin{aligned}
& \frac{A(\phi)^{2}}{\left(P_{0}(\phi)+P_{1}(\phi)\right)^{2}}=\frac{\sin ^{2}[(M-2) \pi / M]}{(|\sin (2 \pi / M-\phi)|+|\sin \phi|)^{2}} \\
& \frac{1}{2 \pi} \int_{0}^{2 \pi} \frac{\sin ^{2}[(M-2) \pi / M]}{(|\sin (2 \pi / M-\phi)|+|\sin \phi|)^{2}} d \phi=\frac{M}{2 \pi} \int_{0}^{2 \pi / M} \frac{\sin ^{2}[(M-2) \pi / M]}{(\sin (2 \pi / M-\phi)+\sin \phi)^{2}} d \phi \\
&= \frac{M \sin ^{2}[(M-2) \pi / M]}{2 \pi} \int_{0}^{2 \pi / M} \frac{1}{(\sin (2 \pi / M) \cos \phi+(1-\cos (2 \pi / M)) \sin \phi)^{2}} d \phi \\
&= \frac{M \sin ^{2}[(M-2) \pi / M]}{2 \pi C^{2}} \int_{0}^{2 \pi / M} \frac{1}{\cos ^{2}(\phi-\pi / M)} d \phi \\
&=\left.\frac{M \sin ^{2}[(M-2) \pi / M]}{2 \pi C^{2}} \tan (\phi-\pi / M)\right|_{0} ^{2 \pi / M} \\
&= \frac{M \sin ^{2}[(M-2) \pi / M]}{2 \pi[1-\cos (2 \pi / M)]} \tan (\pi / M)
\end{aligned}
$$

From (6.11) and (6.15), we can compare the power loss between the polar architecture and the multiphase architecture, as shown in Table 6.1.
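The two closed-form averages derived above can be evaluated directly for any number of phases. The sketch below codes the final expressions for the average amplitude ratio and the average power ratio as written; the function names are hypothetical, and the printed values are computed from these formulas rather than copied from Table 6.1.

```python
import math


def avg_amplitude_ratio(M: int) -> float:
    """Average amplitude ratio of an M-phase PA relative to polar (final line of 6.14)."""
    s = math.sin((M - 2) * math.pi / M)
    c = math.sqrt(2.0 - 2.0 * math.cos(2.0 * math.pi / M))  # the constant C
    sec = 1.0 / math.cos(math.pi / M)
    tan = math.tan(math.pi / M)
    return M * s / (2.0 * math.pi * c) * math.log((sec + tan) / (sec - tan))


def avg_power_ratio(M: int) -> float:
    """Average power ratio of an M-phase PA relative to polar (final power expression)."""
    s = math.sin((M - 2) * math.pi / M)
    return M * s * s * math.tan(math.pi / M) / (2.0 * math.pi * (1.0 - math.cos(2.0 * math.pi / M)))


for M in (4, 8, 16):
    p = avg_power_ratio(M)
    print(M, round(avg_amplitude_ratio(M), 4), round(p, 4), round(10.0 * math.log10(p), 2))
```

Setting $M=4$ recovers the quadrature results of (6.2) and (6.3), which is a useful consistency check on the general expressions.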

### 6.2.2 Cartesian to Multiphase Conversion

As shown in Figure 6.5, the $I$ and $Q$ of a multiphase vector can be expressed by:

$$
\begin{align*}
& I=I_{m}+I_{m+1}=n_{1} \cos \left(\frac{m}{M} 2 \pi\right)+n_{2} \cos \left(\frac{m+1}{M} 2 \pi\right)  \tag{6.17}\\
& Q=Q_{m}+Q_{m+1}=n_{1} \sin \left(\frac{m}{M} 2 \pi\right)+n_{2} \sin \left(\frac{m+1}{M} 2 \pi\right) \tag{6.18}
\end{align*}
$$

where $n_{1}$ and $n_{2}$ represent the amplitudes of the two selected adjacent phase vectors and $m$ represents the index of the selected phase vector that is determined by (6.7) or (6.8).

Solving (6.17) and (6.18) for $n_{1}$ and $n_{2}$ gives the Cartesian to multiphase conversion:

$$
\begin{gather*}
n_{1}=\frac{I \sin \left(\frac{m+1}{M} 2 \pi\right)-Q \cos \left(\frac{m+1}{M} 2 \pi\right)}{\sin \frac{2 \pi}{M}}  \tag{6.19}\\
n_{2}=\frac{Q \cos \left(\frac{m}{M} 2 \pi\right)-I \sin \left(\frac{m}{M} 2 \pi\right)}{\sin \frac{2 \pi}{M}} \tag{6.20}
\end{gather*}
$$

where the integer $m$ can be determined by (6.7).

The output power of the multiphase PA is proportional to $\left(I^{2}+Q^{2}\right)$,

$$
\begin{equation*}
I^{2}+Q^{2}=n_{1}^{2}+n_{2}^{2}+2 n_{1} n_{2} \cos \frac{2 \pi}{M} \tag{6.21}
\end{equation*}
$$

therefore,

$$
\begin{equation*}
\frac{P_{o u t, n_{1}, n_{2}}}{P_{\text {out }, \max }}=\frac{n_{1}^{2}+n_{2}^{2}+2 n_{1} n_{2} \cos \frac{2 \pi}{M}}{N^{2}} \tag{6.22}
\end{equation*}
$$

where $P_{\text {out,max }}$ is the maximum output power with one phase vector, and $N$ is the total number of the switching PA cells.
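The Cartesian to multiphase mapping can be sketched as a pair of functions. This is a minimal floating-point illustration, assuming the sector index is $m=\lfloor M \theta / 2 \pi\rfloor$ and that $n_{1}$ and $n_{2}$ are obtained by solving (6.17) and (6.18) directly; quantization of $n_{1}$ and $n_{2}$ to integer cell counts is not modeled, and the function names are hypothetical.

```python
import math


def cartesian_to_multiphase(i: float, q: float, M: int):
    """Return (m, n1, n2) such that (6.17) and (6.18) reproduce (i, q)."""
    step = 2.0 * math.pi / M
    theta = math.atan2(q, i) % (2.0 * math.pi)   # output phase in [0, 2*pi)
    m = int(theta // step) % M                   # sector containing theta
    a = m * step                                 # angle of phase vector m
    b = (m + 1) * step                           # angle of phase vector m + 1
    s = math.sin(step)
    n1 = (i * math.sin(b) - q * math.cos(b)) / s
    n2 = (q * math.cos(a) - i * math.sin(a)) / s
    return m, n1, n2


def multiphase_to_cartesian(m: int, n1: float, n2: float, M: int):
    """Evaluate (6.17) and (6.18) for the given multiphase code."""
    a = 2.0 * math.pi * m / M
    b = 2.0 * math.pi * (m + 1) / M
    return (n1 * math.cos(a) + n2 * math.cos(b),
            n1 * math.sin(a) + n2 * math.sin(b))


m, n1, n2 = cartesian_to_multiphase(3.0, -5.0, 8)
i_back, q_back = multiphase_to_cartesian(m, n1, n2, 8)
```

For the example point $(3,-5)$ with $M=8$, the vector falls in the sector between $270^{\circ}$ and $315^{\circ}$, both weights are nonnegative, and the round trip returns the original coordinates.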

### 6.2.3 Multiphase SCPA

As discussed in Section 5.2, the SCPA has good linearity and a constant load impedance; hence, it is a good candidate to implement the multiphase architecture. In an SCPA, the capacitor cells in the array have shared bottom plates, whereas their top plates are driven by a phase-modulated pulse wave and switched between $V_{\mathrm{DD}}$ and $V_{\mathrm{GND}}$, or held at $V_{\mathrm{GND}}$. To accommodate the multiphase operation, an RF clock signal is subdivided into $N$ equally spaced output phases $\left(\phi_{1}-\phi_{N}\right)$ by a polyphase filter or a delay-locked loop (DLL), and a digital phase selector selects the phase that drives each capacitor. An example of a multiphase SCPA (MP-SCPA) is shown in Figure 6.6. Each capacitor in the capacitor array can be switched by a pulse wave with any one of the eight phases.

As in (6.22), we can calculate the output power of the MP-SCPA as a function of the digital code words $n_{1}$ and $n_{2}$,

$$
\begin{equation*}
\frac{P_{\text {out }, n_{1}, n_{2}}}{P_{\text {out }, S C P A}}=\frac{n_{1}^{2}+n_{2}^{2}+2 n_{1} n_{2} \cos \frac{2 \pi}{M}}{N^{2}} \tag{6.23}
\end{equation*}
$$

where $P_{\text {out }, S C P A}$ is the peak output power of the SCPA:

$$
\begin{equation*}
P_{o u t, S C P A}=\frac{2}{\pi^{2}} \frac{V_{D D}^{2}}{R_{o p t}} \tag{6.24}
\end{equation*}
$$

Similar to a polar SCPA, the input power can be expressed by:

$$
\begin{equation*}
P_{i n, n_{1}, n_{2}}=\frac{\left(n_{1}+n_{2}\right)\left(N-n_{1}-n_{2}\right)}{N^{2}} C V_{D D}^{2} f \tag{6.25}
\end{equation*}
$$

From (6.23), (6.24) and (6.25), we can derive the PAE of an ideal multiphase PA,

$$
\begin{equation*}
P A E_{n_{1}, n_{2}}=\frac{P_{\text {out }, n_{1}, n_{2}}}{P_{\text {out }, n_{1}, n_{2}}+P_{i n, n_{1}, n_{2}}} \tag{6.26}
\end{equation*}
$$

$$
=\frac{n_{1}^{2}+n_{2}^{2}+2 n_{1} n_{2} \cos \frac{2 \pi}{M}}{n_{1}^{2}+n_{2}^{2}+2 n_{1} n_{2} \cos \frac{2 \pi}{M}+\left(n_{1}+n_{2}\right)\left(N-n_{1}-n_{2}\right) \frac{\pi}{4} \frac{1}{Q_{\text {Load }}}}
$$

In Figure 6.7, we convert the multiphase to Cartesian coordinates and plot the output power and PAE as a function of the input code.

In (6.23), assuming $n_{1}=n_{2}$ and $Q_{\text {Load }}=3$, we can plot the ideal output power versus the number of switching capacitors and the number of phase vectors $M$, as shown in Figure 6.8.

From Figure 6.8, we can observe a significant power increase from $M=4$ to $M=8$, whereas the improvement is minor when the number of phases $M$ is larger than 16 .
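The trend described above can be reproduced from (6.23) and (6.26). The sketch below evaluates the normalized output power and the ideal PAE for given code words; the function names are hypothetical, and the zero-code guard is an addition of this sketch to avoid dividing by zero.

```python
import math


def normalized_pout(n1: int, n2: int, N: int, M: int) -> float:
    """Output power relative to the peak SCPA power, following (6.23)."""
    return (n1 * n1 + n2 * n2 + 2.0 * n1 * n2 * math.cos(2.0 * math.pi / M)) / N ** 2


def ideal_pae(n1: int, n2: int, N: int, M: int, q_load: float) -> float:
    """Ideal PAE of the multiphase SCPA, following the expanded form of (6.26)."""
    out = n1 * n1 + n2 * n2 + 2.0 * n1 * n2 * math.cos(2.0 * math.pi / M)
    loss = (n1 + n2) * (N - n1 - n2) * math.pi / (4.0 * q_load)
    if out + loss == 0.0:
        return 0.0  # no cells switching: define PAE as zero
    return out / (out + loss)


N = 128
# Equal split n1 = n2 = N/2, the worst case for vector summation.
full_split = {M: normalized_pout(N // 2, N // 2, N, M) for M in (4, 8, 16, 32)}
backoff_pae = ideal_pae(32, 32, N, 8, q_load=3.0)
```

With all cells switching and an equal split, the normalized output power is 0.5 for $M=4$ and about 0.85 for $M=8$, while going from $M=16$ to $M=32$ changes it by only about 0.03, in line with the diminishing returns noted above.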

### 6.3 Circuit Details

### 6.3.1 Phase Selector and Amplitude Decoder

The phase selector comprises a digital logic circuit and a MUX tree. The clocks with sixteen phases will pass to the input of the MUX. The digital logic circuit controls the MUX and selects two clocks with adjacent phases ( $\phi_{A}$ and $\phi_{B}$ ) as the output. Then the selected two clocks go to each cell of the MP-SCPA, where the amplitude decoder allows the unit capacitor to be driven by either $\phi_{A}$ or $\phi_{B}$, or held at ground.

Since the phase selector and the amplitude decoder both operate at the same sampling rate with the same group delay, the delay of the two paths can be easily adjusted to minimize the delay mismatch. In an EER PA, however, the phase signal is at the RF frequency whereas the amplitude signal is at the sampling rate, which makes the delay mismatch difficult to correct.

### 6.3.2 Cascoded Switch for Unit Capacitor Cell

The switch is a cascoded inverter that allows for operation with a doubled supply voltage to increase the output power and reduce losses in the output matching network, as shown in Figure 6.9. For each cell, a level shifter and separate buffer chains are implemented to drive the high-side switch $M_{4}$ and the low-side switch $M_{1}$. To avoid conduction loss, the delay between $V_{r f p}$ and $V_{r f n}$ is minimized.

### 6.3.3 Block Diagram of the Entire Design

Figure 6.10 shows the block diagram of the 16-phase all-digital switched-capacitor power amplifier. A 7-bit unary switched-capacitor array is implemented in this design, which will be adequate to meet the error vector magnitude (EVM) and out-of-band (OOB) noise specifications for wireless communication standards such as LTE and 802.11ac (e.g., Wi-Fi) [5].

A single-ended version is shown, although the fabricated circuit is fully differential. In the proposed architecture, an off-chip phase generator creates 16 evenly distributed phase vectors $\left(\phi_{0}-\phi_{15}\right)$ that are input to a clock selection MUX. Four bits from the digital pattern generator are used to select the two phases $\left(\phi_{A}, \phi_{B}\right)$ adjacent to the phase of the desired output signal. The phases $\phi_{A}$ and $\phi_{B}$ are distributed to every cell of a 7-bit capacitor array, where a 14-bit multiphase logic decoder selects whether to switch each capacitor cell on $\phi_{A}$, on $\phi_{B}$, or to ground the cell to $V_{\mathrm{GND}}$. As was previously mentioned, the array can be steered entirely to either phase $\phi_{A}$ or $\phi_{B}$, allowing for larger output amplitude and efficiency.
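The selection behavior described above can be modeled behaviorally. The sketch below is a hypothetical software model, not the on-chip logic: it assumes 128 unit cells for the 7-bit unary array, wraps the phase index modulo 16, and assigns cells to $\phi_{A}$, $\phi_{B}$, or ground in a simple contiguous order that the actual decoder need not follow.

```python
N_PHASES = 16   # number of clock phases, from the text
N_CELLS = 128   # assumed unit-cell count for the 7-bit unary array


def select_phases(m: int):
    """Indices of the two adjacent clock phases picked by the 4-bit word m."""
    if not 0 <= m < N_PHASES:
        raise ValueError("m must be a 4-bit phase index")
    return m, (m + 1) % N_PHASES


def assign_cells(n1: int, n2: int):
    """Per-cell drive state: 'A' (switch on phi_A), 'B' (phi_B), or 'GND'."""
    if n1 < 0 or n2 < 0 or n1 + n2 > N_CELLS:
        raise ValueError("n1 + n2 must not exceed the number of cells")
    return ["A"] * n1 + ["B"] * n2 + ["GND"] * (N_CELLS - n1 - n2)


phase_a, phase_b = select_phases(15)   # wraps around to phase 0
cells = assign_cells(40, 24)
```

Setting `n2 = 0` (or `n1 = 0`) steers every active cell to a single phase, which corresponds to the full-amplitude case mentioned at the end of the paragraph above.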

### 6.4 Measurement Results

The capacitively combined MP-SCPA is fabricated in a 130 nm RF CMOS process with an ultra-thick top metal for high quality passive elements. Figure 6.11 shows the microphotograph of the MP-SCPA. It occupies a total area of $2.1 \mathrm{~mm} \times 1.8 \mathrm{~mm}$. A 4-bit phase logic word ($m$) selects two out of sixteen clocks with adjacent phases for the desired output, whereas two 7-bit decoders ($n_{1}$, $n_{2}$) independently control the number of capacitors switching at the two chosen clocks. The die area used by the digital input pads can be saved when the power amplifier is integrated together with a digital signal processing (DSP) backend. Most circuitry operates from a supply voltage of 1.5 V, with the exception of the cascoded PMOS switches that operate from 1.5 V to 3 V.

### 6.4.1 Static Measurement

Figure 6.12 shows the static output power and PAE measured at different frequencies. The PA delivers a maximum output power of 26 dBm with a PAE of $24.9 \%$ at 1.82 GHz. The measured -3 dB bandwidth of the PA is around 750 MHz, which is consistent with the loaded quality factor of the band-pass matching network. Note that the performance above 2 GHz is dominated by the rolloff of the balun in the measurement setup.

The output power is plotted as a function of the digital input code in Figure 6.13. The digital input codes $n_{1}$, $n_{2}$, and $m$ are mapped to the IQ plane using (6.19) and (6.20).

The nonlinearity is caused by the bondwire inductance and by differences between the adjacent clocks. Excess bondwire inductance can cause supply and ground bounce, which affects the dynamic response at the output in the form of memory effects. Differences in the clocks' duty cycles and rise/fall times can result in mismatch in the output amplitudes along different phase vectors.

In Figure 6.13, we can observe a relatively larger distortion at high power, which can be explained by the interaction between supply/ground bounce and clock mismatch. When the output power is high, the clock differences can enlarge the supply and ground bounce, resulting in a larger difference between the output amplitudes switching at different clocks.

### 6.4.2 Digital Predistortion

Nonlinearity in power amplifiers can cause spectral mask violations, increase the bit error rate (BER) through in-band distortion, and cause adjacent channel interference through spectral regrowth. Owing to the low cost and low power of CMOS digital circuitry, it is cost effective and advantageous to linearize the PA using digital predistortion (DPD) at baseband.

### 6.4.2.1 2D Look-Up Table

Similar to quadrature PAs, the multiphase PA requires a 2D-DPD. The digital input codes $\mathrm{n}_{1}$, $\mathrm{n}_{2}$, and m can be mapped to the measured output amplitude and phase. To visualize the mapping relationship, both the digital input code and the output amplitude and phase are converted to the IQ plane. The measured output for all codes using two adjacent phases is plotted in Figure 6.14(a), whereas the output for all codes with all 16 phase combinations is plotted in Figure 6.14(b). It can be observed that, as the number of phases increases, the look-up table (LUT) for DPD can become very large.

From the measured nonlinear output shown in Figure 6.14, a LUT can be built and the digital input code can be predistorted to achieve a linear output. The predistorted input codes are converted to the IQ plane, and several examples are plotted in Figure 6.15(a) and (b).
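The LUT-based predistortion described above can be sketched as a nearest-neighbor inversion of the measured input-to-output map. This is a minimal illustration and not the measurement flow used for the chip: the PA model `toy_pa`, the grid resolution, and all function names are hypothetical stand-ins for the measured data of Figure 6.14.

```python
import numpy as np


def lut_predistort(desired, lut_inputs, lut_outputs):
    """For every desired output, pick the LUT input whose measured output is closest."""
    desired = np.atleast_1d(np.asarray(desired, dtype=complex))
    # Distance from every desired point to every measured output in the table.
    dist = np.abs(desired[:, None] - lut_outputs[None, :])
    return lut_inputs[np.argmin(dist, axis=1)]


def toy_pa(x):
    """Hypothetical memoryless PA model (compression plus AM-PM), for illustration only."""
    mag2 = np.abs(x) ** 2
    return x * (1.0 - 0.2 * mag2) * np.exp(1j * 0.3 * mag2)


# Stand-in for a measured table: sweep input codes on an IQ grid and record the output.
axis = np.linspace(-1.0, 1.0, 65)
lut_inputs = (axis[:, None] + 1j * axis[None, :]).ravel()
lut_outputs = toy_pa(lut_inputs)

desired = np.array([0.4 + 0.3j, -0.2 + 0.5j])
predistorted = lut_predistort(desired, lut_inputs, lut_outputs)
linearized = toy_pa(predistorted)  # close to `desired`, limited by the grid spacing
```

The table size grows with the number of swept codes, which is the memory cost that motivates a more compact representation.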

### 6.4.2.2 2D Surface Fit

A large LUT requires a large amount of memory and therefore a large on-chip die area. To overcome this disadvantage, a novel 2D surface fit is proposed and applied to the multiphase PA.

Assuming that $I_{DPD}$ and $Q_{DPD}$ are independent of each other, each can be expressed as a polynomial function of $I$ and $Q$:

$$
\begin{align*}
& I_{DPD}=\sum_{j=0}^{J} \sum_{k=0}^{j} a_{k, j-k} I^{k} Q^{j-k}  \tag{6.27}\\
& Q_{DPD}=\sum_{j=0}^{J} \sum_{k=0}^{j} b_{k, j-k} I^{k} Q^{j-k} \tag{6.28}
\end{align*}
$$

Functions (6.27) and (6.28) represent two different continuous surfaces in 2D coordinates, as shown in Figure 6.16. By fitting each surface, we can find the desired coefficients $a_{k, j-k}$ and $b_{k, j-k}$. A third-order polynomial expression is determined and plotted as the red surface in Figure 6.16. The predistorted codes from the 2D surface fit and the 2D LUT are very close, since the surface fit is generated on the basis of the LUT, as shown in Figure 6.17(a) and (b).
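A least-squares fit of the polynomial surfaces in (6.27) and (6.28) can be sketched as follows. The use of NumPy's `numpy.linalg.lstsq` and the helper names are assumptions for illustration; the text does not state which fitting routine was used. The same routine is called once with the $I_{DPD}$ samples and once with the $Q_{DPD}$ samples as the target.

```python
import numpy as np


def poly2d_terms(i, q, order):
    """Design matrix with one column per term I**k * Q**(j - k), j = 0..order, k = 0..j."""
    i = np.asarray(i, dtype=float)
    q = np.asarray(q, dtype=float)
    cols = [i**k * q**(j - k) for j in range(order + 1) for k in range(j + 1)]
    return np.stack(cols, axis=-1)


def fit_surface(i, q, target, order=3):
    """Least-squares coefficients of the 2D polynomial that maps (I, Q) to target."""
    design = poly2d_terms(i, q, order)
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(target, dtype=float), rcond=None)
    return coeffs


def eval_surface(coeffs, i, q, order=3):
    """Evaluate the fitted polynomial surface at the points (I, Q)."""
    return poly2d_terms(i, q, order) @ coeffs
```

A third-order fit stores only 10 coefficients per surface, compared with one table entry per code for the LUT.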

The 2D surface fit can save the die area occupied by the large LUT. Its polynomial coefficients can also be updated to accommodate temperature changes or memory effects. After DPD, the output amplitude shows a linear response across all possible codewords and phases, as shown in Figure 6.18(a). The corresponding PAE is plotted in Figure 6.18(b); its nonlinear shape is due to the code-dependent PAE, which can be calculated using (6.26).

### 6.4.3 Dynamic Measurement

The MP-SCPA is tested with a $10 \mathrm{MHz}$, 64-QAM LTE signal. Without DPD, the ACLR is around -20 dBc, as shown in Figure 6.19(a). With DPD applied to linearize the PA, the measured ACLR is less than -30 dBc, as shown in Figure 6.19(b). The average output power is 20.9 dBm with a PAE of 15.2 \%. The measured EVM is 3.5 \%-rms after DPD, whereas it is more than 10 \%-rms without DPD. The signal constellation is plotted in Figure 6.20.
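For reference, an rms EVM figure of the kind quoted above can be computed from demodulated symbols as sketched below. Normalizing the error to the rms power of the reference constellation is one common convention and is assumed here; the exact procedure of the test equipment used for the measurement is not described in the text.

```python
import numpy as np


def evm_rms_percent(measured, reference):
    """RMS error vector magnitude in percent, normalized to the rms reference power."""
    measured = np.asarray(measured, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    error_power = np.mean(np.abs(measured - reference) ** 2)
    reference_power = np.mean(np.abs(reference) ** 2)
    return 100.0 * np.sqrt(error_power / reference_power)
```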

The close-in out-of-band (OOB) noise of the MP-SCPA is dominated by quantization. The far-out OOB noise for the 7-bit QSCPA when transmitting a $10 \mathrm{MHz}$, 64-QAM LTE signal is plotted in Figure 6.21. The alias at 2.02 GHz is due to the 200 MHz sampling rate of the input LTE signal; it can be further suppressed with a higher input sampling rate or a higher-resolution MP-SCPA.
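The location of the alias follows from the sampled nature of the input: images appear at offsets of integer multiples of the sampling rate from the carrier. The short sketch below assumes a carrier of 1.82 GHz, which is inferred from the peak-power frequency and the 2.02 GHz alias rather than stated explicitly for this measurement; the function name is hypothetical.

```python
def image_frequencies_hz(carrier_hz, sample_rate_hz, max_order=2):
    """Frequencies at carrier +/- k * sample_rate for k = 1..max_order, in Hz."""
    images = []
    for k in range(1, max_order + 1):
        images.append(carrier_hz - k * sample_rate_hz)
        images.append(carrier_hz + k * sample_rate_hz)
    return sorted(images)


# Assumed carrier of 1.82 GHz with the 200 MHz input sampling rate.
images = image_frequencies_hz(1_820_000_000, 200_000_000)
```

Doubling the sampling rate would move the first upper image from 2.02 GHz to 2.22 GHz, further into the roll-off of the matching network.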

Although the PA is implemented with a differential topology, device mismatch and input clock mismatch can contribute to the second harmonic. The quality factor of the output matching network is relatively small $(\sim 3)$, but the second and third harmonics can be further suppressed by the output antenna, whose quality factor can be on the order of tens.

### 6.5 Summary

We have introduced the concept of multiphase modulation and implemented a prototype of the digital multiphase switched-capacitor power amplifier in 130 nm CMOS. This PA leverages the advantages of digital PAs without requiring the wideband phase modulator of polar DPAs or incurring the high combining loss of a quadrature DPA. This PA delivers a peak output power of 26 dBm at 1.82 GHz with 24.9 \% PAE. The performance of the PA is validated with static measurements and with modulated measurements using a 10 MHz, 64-QAM LTE signal. A novel DPD method using a 2D surface fit is proposed to save the die area occupied by the large LUT and to accommodate temperature variation. With DPD, the ACLR is below the -30 dBc required by the LTE standard and the measured EVM is $3.5 \%$-rms. A comparison to prior art is provided in Table 6.2. Although this circuit was implemented in 130 nm technology, it achieves higher average output power than the other designs in the comparison. The PAE of this circuit is lower than that of [17] because of the implemented on-chip matching network, which usually degrades the overall efficiency by $30 \%$ to $40 \%$. In addition, the circuits in [17] and [100] operate at 800 MHz. At lower frequency, the overall efficiency is higher because the transition time occupies a smaller fraction of each cycle. Output power can be traded off for efficiency by reducing output voltage division and matching network losses. It is also noted that SCPAs benefit significantly from process scaling.


Figure 6.1. (a) IQ summation at output; (b) IQ waveforms.


Figure 6.2. (a) Quadrature clocks with 50 \% duty cycle; (b) quadrature clocks with 25 \% duty cycle [23].


Figure 6.3. (a) Eight-phase vectors in Cartesian coordinates; (b) four of eight-phase clocks in time domain.


Figure 6.4. Polar to multiphase conversion.


Figure 6.5. Cartesian to multiphase conversion.


Figure 6.6. Example multiphase operations with SCPA.


Figure 6.7. (a) Output power versus input code; (b) PAE versus input code.


Figure 6.8. Output power versus the number of switching capacitors with different M values.


Figure 6.9. One unit cell of a cascoded switch.


Figure 6.10. Block diagram of the 16-phase SCPA.


Figure 6.11. Chip microphotograph of the 130 nm multiphase SCPA.


Figure 6.12. Measured $\mathrm{P}_{\text {out }}$ and PAE versus frequency.


Figure 6.13. Measured $\mathrm{P}_{\text {out }}$ versus input code ( $\mathrm{n}_{1}, \mathrm{n}_{2}$ and m are mapped to IQ).


Figure 6.14. Measured output with all codes using (a) two phases; (b) 16 phases.


Figure 6.15. Example of data points of (a) predistorted input; (b) output after DPD.


Figure 6.16. 2D surface fit from the output of MP-SCPA.


Figure 6.17. Comparison between LUT and surface fit: (a) $I_{DPD}$; (b) $Q_{DPD}$.


Figure 6.18. Measured output with DPD: (a) output power; (b) PAE.


Figure 6.19. Measured ACLR for a 10 MHz 64 QAM LTE signal (a) no DPD; (b) with DPD.


Figure 6.20. Measured signal constellation for a 10 MHz, 64-QAM LTE signal (blue dots represent the signal after DPD; gray dots represent the signal before DPD).


Figure 6.21. Measured OOB spectrum for a $10 \mathrm{MHz}$, 64-QAM LTE signal.

Table 6.1. Power loss of multiphase architecture and digital polar architecture

| All-digital Architecture | Maximum Power Loss | Average Power Loss |
| :---: | :---: | :---: |
| Quadrature | -3 dB | -1.96 dB |
| 8-phase | -0.67 dB | -0.46 dB |
| 16-phase | -0.17 dB | -0.11 dB |
| 32-phase | -0.04 dB | -0.03 dB |
| Digital Polar (Ref.) | 0 dB | 0 dB |
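
The entries of Table 6.1 can be approximated with an idealized model, sketched below under stated assumptions. At full scale all cells switch, split between two adjacent phases separated by $\Delta = 2\pi/M$; the reachable amplitude at an angle $t$ inside the sector, relative to digital polar, is then $\cos(\Delta/2)/\cos(t-\Delta/2)$. The worst case occurs midway between the two phases, and averaging the power over a uniformly distributed angle gives $\sin(\Delta)/\Delta$. That the table was generated this way is an assumption, and the function names are hypothetical.

```python
import math


def max_power_loss_db(num_phases):
    """Worst-case full-scale power loss relative to polar, midway between two phases."""
    delta = 2.0 * math.pi / num_phases
    return 20.0 * math.log10(math.cos(delta / 2.0))


def average_power_loss_db(num_phases):
    """Full-scale power loss averaged over a uniformly distributed output angle."""
    delta = 2.0 * math.pi / num_phases
    return 10.0 * math.log10(math.sin(delta) / delta)


# Quadrature corresponds to 4 phases.
losses = {m: (max_power_loss_db(m), average_power_loss_db(m)) for m in (4, 8, 16, 32)}
```

Under these assumptions the model agrees with the tabulated values to within a few hundredths of a dB.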

Table 6.2. Comparison to prior art

|  | This Work [107] | [17] | [100] | [23] | [108] |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Process technology | 130 nm | 28 nm | 65 nm | 65 nm | 65 nm |
| Supply voltage (V) | 3 | 1.1 | 1.25 | 1.3 | 1.2/2.4 |
| Resolution (bit) | 7-MP | 6-IQ | 7-IQ | 13-IQ | 8-IQ |
| Carrier Frequency | 1.8 GHz | 800 MHz | 800 MHz | 2.4 GHz | 2.0 GHz |
| Peak Pout (dBm) | 26 | 13.9 | 7.87 | 22.8 | 20.5 |
| PAE at peak $\mathrm{P}_{\text {out }}$ | 24.9 \% | 40.4 \% | 10.8 \% | 42 \% | 20 \% |
| Modulation signal | LTE, 10 MHz, 64-QAM | LTE, 10 MHz, 16-QAM | LTE, 5 MHz | Single carrier, 22 MHz, 64-QAM | LTE, 10 MHz, 64-QAM |
| Average $\mathrm{P}_{\text {out }}(\mathrm{dBm})$ | 20.9 | 6.97 | 0.26 | 15 | 14.5 |
| Average PAE | 15.2 \% | 29.1 \% | 6.7 \% | NA | 12.2 \% |
| EVM | 3.5 \%-rms | NA | NA | 3.98 \%-rms | 3.6 \%-rms |
| ACLR (dBc) | -30.3/-31.7 | -32.4/-32.7 | -31.8/-31.5 | <-43 | -30.7/-31.0 |
| Matching Network | LC Matching | No | No | Transformer | LC Matching |

## CHAPTER 7

## CONCLUSIONS AND FUTURE WORK

### 7.1 Summary

High-speed wireless communications require power amplifiers with high linearity and high efficiency. Switching amplifiers are very promising because they benefit from CMOS scaling. Since switching PAs are insensitive to input amplitude, linearization systems are needed to meet wireless communication standards and to prevent spectral mask violations.

Polar architectures, such as EER and digital polar, can cause bandwidth expansion and delay mismatch. Moreover, they require wideband phase modulators, which are difficult to implement with PLLs because of their narrow bandwidth. Open-loop phase modulators can be implemented to broaden the bandwidth, but they introduce phase quantization noise, digital-to-phase nonlinearity, and spectral images.

The quadrature architecture avoids the bandwidth expansion and delay mismatch, and it eliminates both the supply modulator and the phase modulator. It also provides a good interface to the digital backend. However, the power loss due to the outphasing summation significantly limits its peak/average power and efficiency. Therefore, the separation between adjacent phases needs to be reduced to lower the power loss.

The multiphase architecture is proposed to enhance the output power of the quadrature architecture by reducing the separation between adjacent phase vectors. Since digital circuitry is low power in CMOS, it is cost effective to use logic circuitry for multiphase conversion and linearization. A DPD method using a 2D surface fit is proposed to save the die area occupied by the large 2D LUT, and the resulting polynomial expression provides the flexibility to include PVT variations and memory effects.

All three of the aforementioned architectures are designed and tested: a current-modulated EER PA, a quadrature SCPA, and a multiphase SCPA. With thorough static and dynamic characterization and DPD-based linearization, all of the implemented PAs meet the stringent LTE standards and reveal a promising direction for CMOS switching PAs, which have yet to become available in volume production.

### 7.2 Future Work

In order to suppress the quantization noise in the Q-SCPA and MP-SCPA, more bits can be added to improve the resolution of the SCPAs. Although there is a minimum size for the unit capacitor in a given CMOS process, the C-2C topology and the split-array topology can be implemented to increase the resolution.

To further improve the PAE at power backoff, the class-G topology can be implemented with the MP-SCPA. In a class-G topology, the higher supply voltage is used for high output power, whereas the lower supply voltage handles low output power. The class-G topology is beneficial because switching PAs ideally have $100 \%$ drain efficiency at both supply voltages.

The low quality factor of passive elements in CMOS is one important reason for both the power loss and the efficiency degradation. The matching network for CMOS PAs can be implemented using off-chip high-Q inductors/capacitors or transformers to further improve the overall output power and efficiency.

## REFERENCES

[1] F. H. Raab, P. Asbeck, S. Cripps, P. B. Kenington, Z. B. Popovic, N. Pothecary, J. F. Sevic, and N. O. Sokal, "Power amplifiers and transmitters for RF and microwave," IEEE Trans. Microw. Theory Tech., vol. 50, no. 3, pp. 814-826, Mar. 2002.
[2] N. O. Sokal, "RF power amplifiers, classes A through S-how they operate, and when to use each," in Electronics Industries Forum of New England, 1997. Professional Program Proceedings, 1997, pp. 179-252.
[3] S.-M. Yoo, J. S. Walling, O. Degani, B. Jann, R. Sadhwani, J. C. Rudell, and D. J. Allstot, "A class-G switched-capacitor RF power amplifier," IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1212-1224, May 2013.
[4] J. S. Walling, S. S. Taylor, and D. J. Allstot, "A class-G supply modulator and class-E PA in 130 nm CMOS," IEEE J. Solid-State Circuits, vol. 44, no. 9, pp. 2339-2347, Sep. 2009.
[5] C. Lu, H. Wang, C. H. Peng, A. Goel, S. Son, P. Liang, A. Niknejad, H. C. Hwang, and G. Chien, "A 24.7 dBm all-digital RF transmitter for multimode broadband applications in 40 nm CMOS," in IEEE ISSCC Dig. Tech. Papers, 2013, pp. 332-333.
[6] H. Xu, Y. Palaskas, A. Ravi, M. Sajadieh, M. A. El-Tanani, and K. Soumyanath, "A flip-chip-packaged 25.3 dBm class-D outphasing power amplifier in 32 nm CMOS for WLAN Application," IEEE J. Solid-State Circuits, vol. 46, no. 7, pp. 1596-1605, Jul. 2011.
[7] S. Khorram, H. Darabi, Z. Zhou, Q. Li, B. Marholev, J. Chiu, J. Castaneda, H.-M. Chien, S. B. Anand, S. Wu, M.-A. Pan, R. Roofougaran, H. J. Kim, P. Lettieri, B. Ibrahim, J. J. Rael, L. H. Tran, E. Geronaga, H. Yeh, T. Frost, J. Trachewsky, and A. Rofougaran, "A fully integrated SOC for 802.11b in $0.18-\mu \mathrm{m}$ CMOS," IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2492-2501, Dec. 2005.
[8] S. S. Mehta, D. Weber, M. Terrovitis, K. Onodera, M. P. Mack, B. J. Kaczynski, H. Samavati, S. H. M. Jen, W. W. Si, M. Lee, K. Singh, S. Mendis, P. J. Husted, N. Zhang, B. McFarland, D. K. Su, T. H. Meng, and B. A. Wooley, "An 802.11 g WLAN SoC," IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2483-2491, Dec. 2005.
[9] A. A. Kidwai, A. Nazimov, Y. Eilat, and O. Degani, "Fully integrated 23dBm transmit chain with on-chip power amplifier and balun for 802.11a application in standard 45nm CMOS process," in IEEE RFIC Dig. Tech. Pap., 2009, pp. 273-276.
[10] H. Darabi, S. Khorram, H.-M. Chien, M.-A. Pan, S. Wu, S. Moloudi, J. C. Leete, J. J. Rael, M. Syed, R. Lee, B. Ibrahim, M. Rofougaran, and A. Rofougaran, "A $2.4-\mathrm{GHz}$ CMOS transceiver for Bluetooth," IEEE J. Solid-State Circuits, vol. 36, no. 12, pp. 2016-2024, Dec. 2001.
[11] P. van Zeijl, J. W. T. Eikenbroek, P. P. Vervoort, S. Setty, J. Tangenberg, G. Shipton, E. Kooistra, I. C. Keekstra, D. Belot, K. Visser, E. Bosma, and S. C. Blaakmeer, "A Bluetooth radio in $0.18-\mu \mathrm{m}$ CMOS," IEEE J. Solid-State Circuits, vol. 37, no. 12, pp. 1679-1687, Dec. 2002.
[12] H. Ishikuro, M. Hamada, K. I. Agawa, S. Kousai, H. Kobayashi, D. M. Nguyen, and F. Hatori, "A single-chip CMOS Bluetooth transceiver with 1.5 MHz IF and direct modulation transmitter," in IEEE ISSCC Dig. Tech. Papers, 2003, pp. 94-480 vol.1.
[13] V. R. Vathulya, T. Sowlati, and D. Leenaerts, "Class 1 bluetooth power amplifier with 24 dBm output power and $48 \%$ PAE at 2.4 GHz in $0.25 \mu \mathrm{~m}$ CMOS," in ESSCIRC 2001, pp. 57-60.
[14] F. Zavosh, M. Thomas, C. Thron, T. Hall, D. Artusi, D. Anderson, D. Ngo, and D. Runton, "Digital predistortion techniques for RF power amplifiers with CDMA applications," Microw. J., vol. 42, no. 10, pp. 22-23, Oct. 1999.
[15] J. de Mingo and A. Valdovinos, "Performance of a new digital baseband predistorter using calibration memory," IEEE Trans. Veh. Technol., vol. 50, no. 4, pp. 1169-1176, Jul. 2001.
[16] W.-J. Kim, S. P. Stapleton, J. H. Kim, and C. Edelman, "Digital predistortion linearizes wireless power amplifiers," IEEE Microw. Mag., vol. 6, no. 3, pp. 54-61, Sep. 2005.
[17] H. Jin, D. Kim, S. Jin, H. Lee, K. Moon, H. Kim, and B. Kim, "Efficient digital quadrature transmitter based on IQ cell sharing," IEEE ISSCC Dig. Tech. Pap., pp. 168-169, Feb. 2015.
[18] W. Yuan, V. Aparin, J. Dunworth, L. Seward, and J. S. Walling, "A quadrature switched capacitor power amplifier in 65nm CMOS," in IEEE RFIC Dig. Tech. Papers, 2015, pp. 135-138.
[19] C. D. Presti, F. Carrara, A. Scuderi, P. M. Asbeck, and G. Palmisano, "A 25 dBm digitally modulated CMOS power amplifier for WCDMA/EDGE/OFDM with adaptive digital predistortion and efficient power control," IEEE J. Solid-State Circuits, vol. 44, no. 7, pp. 1883-1896, Jul. 2009.
[20] Y. Nagata, "Linear amplification technique for digital mobile communications," in Vehicular Technology Conference, 1989, pp. 159-164 vol.1.
[21] A. N. D'Andrea, V. Lottici, and R. Reggiannini, "RF power amplifier linearization through amplitude and phase predistortion," IEEE Trans. Commun., vol. 44, no. 11, pp. 1477-1484, Nov. 1996.
[22] K. J. Muhonen, M. Kavehrad, and R. Krishnamoorthy, "Look-up table techniques for adaptive digital predistortion: a development and comparison," IEEE Trans. Veh. Technol., vol. 49, no. 5, pp. 1995-2002, Sep. 2000.
[23] M. S. Alavi, R. B. Staszewski, L. C. N. de Vreede, and J. R. Long, "A wideband 2x13bit all-digital I/Q RF DAC," IEEE Trans. Microw. Theory Tech., vol. 62, no. 4, pp. 732-752, Apr. 2014.
[24] P. Cruise, C.-M. Hung, R. B. Staszewski, O. Eliezer, S. Rezeq, K. Maggio, and D. Leipold, "A digital-to-RF-amplitude converter for GSM/GPRS/EDGE in $90-\mathrm{nm}$ digital CMOS," in IEEE RFIC Dig. Tech. Papers, 2005, pp. 21-24.
[25] R. B. Staszewski, T. Jung, T. Murphy, I. Bashir, O. Eliezer, K. Muhammad, and M. Entezari, "Software assisted digital RF processor for single-chip GSM radio in 90 nm CMOS," IEEE J. Solid-State Circuits, vol. 45, no. 2, pp. 276-288, Feb. 2010.
[26] S.-M. Yoo, J. S. Walling, E. C. Woo, B. Jann, and D. J. Allstot, "A switched-capacitor RF power amplifier," IEEE J. Solid-State Circuits, vol. 46, no. 12, pp. 2977-2987, Dec. 2011.
[27] A. Kavousian, D. K. Su, M. Hekmat, A. Shirvani, and B. A. Wooley, "A digitally modulated polar CMOS power amplifier with a $20-\mathrm{MHz}$ channel bandwidth," IEEE J. Solid-State Circuits, vol. 43, no. 10, pp. 2251-2258, Oct. 2008.
[28] P. J. Zampardi, "Will CMOS amplifiers ever Kick-GaAs?," in 2010 IEEE Custom Integrated Circuits Conference (CICC), 2010, pp. 1-4.
[29] T. Johansson and J. Fritzin, "A review of watt-level CMOS RF power amplifiers," IEEE Trans. Microw. Theory Tech., vol. 62, no. 1, pp. 111-124, Jan. 2014.
[30] I. Aoki, S. Kee, R. Magoon, R. Aparicio, F. Bohn, J. Zachan, G. Hatcher, D. McClymont, and A. Hajimiri, "A Fully-Integrated Quad-Band GSM/GPRS CMOS Power Amplifier," IEEE J. Solid-State Circuits, vol. 43, no. 12, pp. 2747-2758, Dec. 2008.
[31] C. H. Lee, J. J. Chang, K. S. Yang, K. H. An, I. Lee, K. Kim, J. Nam, Y. Kim, and H. Kim, "A highly efficient GSM/GPRS quad-band CMOS PA Module," in IEEE RFIC Dig. Tech. Pap., 2009, pp. 229-232.
[32] W. Kim, K. S. Yang, J. Han, J. Chang, and C. H. Lee, "An EDGE/GSM quad-band CMOS power amplifier," in IEEE ISSCC Dig. Tech. Pap., 2011, pp. 430-432.
[33] A. Afsahi, A. Behzad, and L. E. Larson, "A 65 nm CMOS 2.4 GHz 31.5 dBm power amplifier with a distributed LC power-combining network and improved linearization for WLAN applications," in IEEE ISSCC Dig. Tech. Papers, 2010, pp. 452-453.
[34] B. Francois and P. Reynaert, "A fully integrated watt-level linear 900-MHz CMOS RF power amplifier for LTE-applications," IEEE Trans. Microw. Theory Tech., vol. 60, no. 6, pp. 1878-1885, Jun. 2012.
[35] D. Chowdhury, C. D. Hull, O. B. Degani, P. Goyal, Y. Wang, and A. M. Niknejad, "A single-chip highly linear 2.4 GHz 30 dBm power amplifier in 90 nm CMOS," in IEEE ISSCC Dig. Tech. Papers, 2009, pp. 378-379,379a.
[36] A. J. Annema, G. J. G. M. Geelen, and P. C. de Jong, " $5.5-\mathrm{V}$ I/O in a $2.5-\mathrm{V} 0.25-\mu \mathrm{m}$ CMOS technology," IEEE J. Solid-State Circuits, vol. 36, no. 3, pp. 528-538, Mar. 2001.
[37] P. Asbeck and C. Fallesen, "A 29 dBm 1.9 GHz class B power amplifier in a digital CMOS process," in ICECS, 2000, vol. 1, pp. 474-477 vol.1.
[38] S. Moloudi and A. A. Abidi, "The Outphasing RF Power Amplifier: A Comprehensive Analysis and a Class-B CMOS Realization," IEEE J. Solid-State Circuits, vol. 48, no. 6, pp. 1357-1369, Jun. 2013.
[39] W. J. Chudobiak and D. F. Page, "Frequency and power limitations of Class-D transistor amplifiers," IEEE J. Solid-State Circuits, vol. 4, no. 1, pp. 25-37, Feb. 1969.
[40] S. A. El-Hamamsy, "Design of high-efficiency RF Class-D power amplifier," IEEE Trans. Power Electron., vol. 9, no. 3, pp. 297-308, May 1994.
[41] N. O. Sokal and A. D. Sokal, "Class E-A new class of high-efficiency tuned single-ended switching power amplifiers," IEEE J. Solid-State Circuits, vol. 10, no. 3, pp. 168-176, Jun. 1975.
[42] F. Raab, "Idealized operation of the class E tuned power amplifier," IEEE Trans. Circuits Syst., vol. 24, no. 12, pp. 725-735, Dec. 1977.
[43] M. K. Kazimierczuk, "Class E tuned power amplifier with nonsinusoidal output voltage," IEEE J. Solid-State Circuits, vol. 21, no. 4, pp. 575-581, Aug. 1986.
[44] F. H. Raab, "Effects of circuit variations on the class E tuned power amplifier," IEEE J. Solid-State Circuits, vol. 13, no. 2, pp. 239-247, Apr. 1978.
[45] F. H. Raab and N. O. Sokal, "Transistor power losses in the class E tuned power amplifier," IEEE J. Solid-State Circuits, vol. 13, no. 6, pp. 912-914, Dec. 1978.
[46] C. Hu, "Reliability phenomena under AC stress," Microelectron. Reliab., vol. 38, no. 1, pp. 1-5, Feb. 1998.
[47] R. Bellens, G. Groeseneken, P. Heremans, and H. E. Maes, "Hot-carrier degradation behavior of N - and P-channel MOSFET's under dynamic operation conditions," IEEE Trans. Electron Devices, vol. 41, no. 8, pp. 1421-1428, Aug. 1994.
[48] A. Bravaix, D. Vuillaume, D. Goguenheim, V. Lassarre, and M. Haoad, "Competing AC hot-carrier degradation mechanisms in surface-channel p-MOSFETs during pass transistor operation," in Electron Devices Meeting, 1996, pp. 873-876.
[49] L. Larcher, D. Sanzogni, R. Brama, A. Mazzanti, and F. Svelto, "Oxide breakdown after RF stress: experimental analysis and effects on power amplifier operation," in Reliability Physics Symposium Proceedings, 2006, pp. 283-288.
[50] F. H. Raab, "Class-F power amplifiers with maximally flat waveforms," IEEE Trans. Microw. Theory Tech., vol. 45, no. 11, pp. 2007-2012, Nov. 1997.
[51] F. H. Raab, "Maximum efficiency and output of class-F power amplifiers," IEEE Trans. Microw. Theory Tech., vol. 49, no. 6, pp. 1162-1166, Jun. 2001.
[52] S. D. Kee, I. Aoki, A. Hajimiri, and D. Rutledge, "The class-E/F family of ZVS switching amplifiers," IEEE Trans. Microw. Theory Tech., vol. 51, no. 6, pp. 1677-1690, Jun. 2003.
[53] C. J. Wei, P. DiCarlo, Y. A. Tkachenko, R. McMorrow, and D. Bartle, "Analysis and experimental waveform study on inverse class class-F mode of microwave power FETs," in Microwave Symposium Digest., 2000, vol. 1, pp. 525-528 vol.1.
[54] H. Kobayashi, J. M. Hinrichs, and P. M. Asbeck, "Current-mode class-D power amplifiers for high-efficiency RF applications," IEEE Trans. Microw. Theory Tech., vol. 49, no. 12, pp. 2480-2485, Dec. 2001.
[55] T.-P. Hung, A. G. Metzger, P. J. Zampardi, M. Iwamoto, and P. M. Asbeck, "Design of high-efficiency current-mode class-D amplifiers for wireless handsets," IEEE Trans. Microw. Theory Tech., vol. 53, no. 1, pp. 144-151, Jan. 2005.
[56] T. Nakatani, J. Rode, D. F. Kimball, L. E. Larson, and P. M. Asbeck, "Digitally-Controlled Polar Transmitter Using a Watt-Class Current-Mode Class-D CMOS Power Amplifier and Guanella Reverse Balun for Handset Applications," IEEE J. Solid-State Circuits, vol. 47, no. 5, pp. 1104-1112, May 2012.
[57] J. S. Walling and D. J. Allstot, "Linearizing CMOS switching power amplifiers using supply regulators," IEEE Trans. Circuits Syst. II Express Briefs, vol. 57, no. 7, pp. 497-501, Jul. 2010.
[58] J. S. Walling and D. J. Allstot, "Pulse-width modulated CMOS power amplifiers," IEEE Microw. Mag., vol. 12, no. 1, pp. 52-60, Feb. 2011.
[59] J. S. Walling and D. J. Allstot, "Design considerations for supply modulated EER power amplifiers," in IEEE WAMICON Dig. Tech. Papers, 2013, pp. 1-4.
[60] L. R. Kahn, "Single-sideband transmission by envelope elimination and restoration," Proc IRE, vol. 40, no. 7, pp. 803-806, Jul. 1952.
[61] F. H. Raab, B. E. Sigmon, R. G. Myers, and R. M. Jackson, "L-band transmitter using Kahn EER technique," IEEE Trans. Microw. Theory Tech., vol. 46, no. 12, pp. 2220-2225, Dec. 1998.
[62] D. K. Su and W. J. McFarland, "An IC for linearizing RF power amplifiers using envelope elimination and restoration," IEEE J. Solid-State Circuits, vol. 33, no. 12, pp. 2252-2258, Dec. 1998.
[63] P. Reynaert and M. S. J. Steyaert, "A 1.75-GHz polar modulated CMOS RF power amplifier for GSM-EDGE," IEEE J. Solid-State Circuits, vol. 40, no. 12, pp. 2598-2608, Dec. 2005.
[64] E. McCune and W. Sander, "EDGE transmitter alternative using nonlinear polar modulation," in Proc. IEEE ISCAS, 2003, vol. 3, pp. III-594-III-597 vol.3.
[65] J. N. Kitchen, I. Deligoz, S. Kiaei, and B. Bakkaloglu, "Polar SiGe class E and F amplifiers using switch-mode supply modulation," IEEE Trans. Microw. Theory Tech., vol. 55, no. 5, pp. 845-856, May 2007.
[66] D. Chowdhury, S. V. Thyagarajan, L. Ye, E. Alon, and A. M. Niknejad, "A fully-integrated efficient CMOS inverse class-D power amplifier for digital polar transmitters," IEEE J. Solid-State Circuits, vol. 47, no. 5, pp. 1113-1122, May 2012.
[67] L. Ye, J. Chen, L. Kong, E. Alon, and A. M. Niknejad, "Design considerations for a direct digitally modulated WLAN transmitter with integrated phase path and dynamic impedance modulation," IEEE J. Solid-State Circuits, vol. 48, no. 12, pp. 3160-3177, Dec. 2013.
[68] S. Hu, S. Kousai, J. S. Park, O. L. Chlieh, and H. Wang, "A +27.3 dBm transformer-based digital Doherty polar power amplifier fully integrated in bulk CMOS," in IEEE RFIC Dig. Tech. Papers, 2014, pp. 235-238.
[69] K. Onizuka, S. Saigusa, and S. Otaka, "A +30.5 dBm CMOS Doherty power amplifier with reliability enhancement technique," in 2012 Symposium on VLSI Circuits (VLSIC), 2012, pp. 78-79.
[70] N. Wongkomet, L. Tee, and P. R. Gray, "A 1.7GHz 1.5W CMOS RF Doherty Power Amplifier for Wireless Communications," in IEEE ISSCC Dig. Tech. Pap., 2006, pp. 1962-1971.
[71] I. Hakala, D. K. Choi, L. Gharavi, N. Kajakine, J. Koskela, and R. Kaunisto, "A 2.14GHz Chireix outphasing transmitter," IEEE Trans. Microw. Theory Tech., vol. 53, no. 6, pp. 2129-2138, Jun. 2005.
[72] H. Chireix, "High power outphasing modulation," in Proc. IRE, 1935, vol. 23, pp. 1370-1392.
[73] F. Wang, D. F. Kimball, J. D. Popp, A. H. Yang, D. Y. Lie, P. M. Asbeck, and L. E. Larson, "An improved power-added efficiency $19-\mathrm{dBm}$ hybrid envelope elimination and restoration power amplifier for 802.11 g WLAN applications," IEEE Trans Microw Theory Tech., vol. 54, no. 12, pp. 4086-4098, 2006.
[74] A. Birafane and A. B. Kouki, "On the Linearity and Efficiency of Outphasing Microwave Amplifiers," IEEE MTT Trans., vol. 52, no. 7, pp. 1702-1708, 2004.
[75] M. El-asmar, A. Birafane, and M. Helaoui, "Analytical Design Methodology of Outphasing Amplification Systems Using a New Simplified Chireix Combiner Model," MTT-S, vol. 60, no. 6, 2012.
[76] J. S. Walling, H. Lakdawala, Y. Palaskas, A. Ravi, O. Degani, K. Soumyanath, and D. J. Allstot, "A class-E PA with pulse-width and pulse-position modulation in 65 nm CMOS," IEEE J. Solid-State Circuits, vol. 44, no. 6, pp. 1668-1678, Jun. 2009.
[77] R. Hezar, L. Ding, A. Banerjee, J. Hur, and B. Haroun, "A PWM based fully integrated digital transmitter/PA for WLAN and LTE applications," IEEE J. Solid-State Circuits, vol. 50, no. 5, pp. 1117-1125, May 2015.
[78] T. P. Hung, J. Rode, L. E. Larson, and P. M. Asbeck, "Design of H-Bridge class-D power amplifiers for digital pulse modulation transmitters," IEEE Trans. Microw. Theory Tech., vol. 55, no. 12, pp. 2845-2855, Dec. 2007.
[79] J. S. Walling and D. J. Allstot, "Pulse-width modulated CMOS power amplifiers," IEEE Microw. Mag., vol. 12, pp. 52-60, Feb. 2011.
[80] S.-M. Yoo, B. Jann, O. Degani, J. C. Rudell, R. Sadhwani, J. S. Walling, and D. J. Allstot, "A class-G dual-supply switched-capacitor power amplifier in 65nm CMOS," in 2012 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), 2012, pp. 233-236.
[81] P. A. Godoy, S. Chung, T. W. Barton, D. J. Perreault, and J. L. Dawson, "A 2.4-GHz, $27-\mathrm{dBm}$ Asymmetric Multilevel Outphasing Power Amplifier in $65-\mathrm{nm}$ CMOS," IEEE J. Solid-State Circuits, vol. 47, no. 10, pp. 2372-2384, Oct. 2012.
[82] F. H. Raab, P. Asbeck, S. Cripps, P. B. Kenington, Z. B. Popovic, N. Pothecary, J. F. Sevic, and N. O. Sokal, Power amplifiers and transmitters for RF and microwave, vol. 50. IEEE, 2002.
[83] S. Chung, R. Ma, S. Shinjo, H. Nakamizo, K. Parsons, and K. H. Teo, "Concurrent Multiband Digital Outphasing Transmitter Architecture Using Multidimensional Power Coding," IEEE Trans. Microw. Theory Tech., vol. 63, no. 2, pp. 598-613, Feb. 2015.
[84] M. Elmala, J. Paramesh, and K. Soumyanath, "A 90-nm CMOS Doherty Power Amplifier With Minimum AM-PM Distortion," IEEE J. Solid-State Circuits, vol. 41, no. 6, pp. 1323-1332, Jun. 2006.
[85] K. Oishi, E. Yoshida, Y. Sakai, H. Takauchi, Y. Kawano, N. Shirai, H. Kano, M. Kudo, T. Murakami, T. Tamura, S. Kawai, K. Suto, H. Yamazaki, and T. Mori, "A 1.95 GHz Fully Integrated Envelope Elimination and Restoration CMOS Power Amplifier Using Timing Alignment Technique for WCDMA and LTE," IEEE J. Solid-State Circuits, vol. 49, no. 12, pp. 2915-2924, Dec. 2014.
[86] F. Wang, D. F. Kimball, J. D. Popp, A. H. Yang, D. Y. Lie, P. M. Asbeck, and L. E. Larson, "An improved power-added efficiency $19-\mathrm{dBm}$ hybrid envelope elimination and restoration power amplifier for 802.11 g WLAN applications," IEEE Trans. Microw. Theory Tech., vol. 54, no. 12, pp. 4086-4099, Dec. 2006.
[87] W. Yuan and J. S. Walling, "A switched-capacitor controlled digital-current modulated class-E EER transmitter," in New Circuits and Systems Conference (NEWCAS), 2015 IEEE 13th International, 2015, pp. 1-4.
[88] N. O. Sokal and A. D. Sokal, "Class E-A new class of high-efficiency tuned single-ended switching power amplifiers," IEEE J. Solid-State Circuits, vol. 10, no. 3, pp. 168-176, 1975.
[89] F. H. Raab, "Effects of circuit variations on the class E tuned power amplifier," IEEE J. Solid-State Circuits, vol. 13, no. 2, pp. 239-247, 1978.
[90] A. Mazzanti, L. Larcher, R. Brama, and F. Svelto, "Analysis of reliability and power efficiency in cascode class-E PAs," IEEE J. Solid-State Circuits, vol. 41, pp. 1222-1229, May 2006.
[91] A. Kavousian, D. K. Su, M. Hekmat, A. Shirvani, and B. A. Wooley, "A Digitally Modulated Polar CMOS Power Amplifier With a $20-\mathrm{MHz}$ Channel Bandwidth," IEEE J. Solid-State Circuits, vol. 43, no. 10, pp. 2251-2258, 2008.
[92] C. Presti, F. Carrara, A. Scuderi, P. M. Asbeck, and G. Palmisano, "A 25 dBm digitally modulated CMOS power amplifier for WCDMA/EDGE/OFDM with adaptive digital predistortion and efficient power control," IEEE J. Solid-State Circuits, vol. 44, no. 7, pp. 1883-1896, 2009.
[93] J. Deveugele and M. S. J. Steyaert, "A 10-bit 250-MS/s binary-weighted current-steering DAC," IEEE J. Solid-State Circuits, vol. 41, no. 2, pp. 320-329, Feb. 2006.
[94] K. O'Sullivan, C. Gorman, M. Hennessy, and V. Callaghan, "A 12-bit 320MSample/s current-steering CMOS D/A converter in $0.44~\mathrm{mm}^{2}$," IEEE J. Solid-State Circuits, vol. 39, no. 7, pp. 1064-1072, Jul. 2004.
[95] T. Chen and G. G. E. Gielen, "A 14-bit 200-MHz Current-Steering DAC With Switching-Sequence Post-Adjustment Calibration," IEEE J. Solid-State Circuits, vol. 42, no. 11, pp. 2386-2394, Nov. 2007.
[96] K. Khanoyan, F. Behbahani, and A. A. Abidi, "A 10 b, $400 \mathrm{MS} / \mathrm{s}$ glitch-free CMOS D/A converter," in VLSI Circuits, 1999. Digest of Technical Papers. 1999 Symposium on, 1999, pp. 73-76.
[97] C. Daigle, A. Dastgheib, and B. Murmann, "A 12-bit 800-MS/s switched-capacitor DAC with open-loop output driver and digital predistortion," in Solid State Circuits Conference (A-SSCC), 2010 IEEE Asian, 2010, pp. 1-4.
[98] P. F. Ferguson Jr, X. Haurie, and G. C. Temes, "A highly linear low-power 10 bit DAC for GSM," in Custom Integrated Circuits Conference, 2000, pp. 261-264.
[99] F. J. Wang, G. C. Temes, and S. Law, "A quasi-passive CMOS pipeline D/A converter," IEEE J. Solid-State Circuits, vol. 24, no. 6, pp. 1752-1755, Dec. 1989.
[100] D. Kim, H. Jin, S. Jin, and B. Kim, "Highly efficient and wideband digital quadrature transmitter," in IEEE IMS Dig. Tech. Papers, 2013, pp. 1-3.
[101] Y. Han and D. J. Perreault, "Analysis and design of high efficiency matching networks," IEEE Trans. Power Electron., vol. 21, no. 5, pp. 1484-1491, Sep. 2006.
[102] W. Yuan, V. Aparin, J. Dunworth, L. Seward, and J. S. Walling, "A quadrature switched capacitor power amplifier in 65 nm CMOS," in Proc. of the IEEE RFIC Symp., 2015, pp. 135-138.
[103] B. Serneels, T. Piessens, and W. Dehaene, "A high-voltage output driver in a standard 2.5 V $0.25 \mu \mathrm{~m}$ CMOS technology," in IEEE ISSCC Dig. Tech. Papers, 2004, vol. 1, pp. 146-518.
[104] F. H. Raab, "Average efficiency of class-G power amplifiers," IEEE Trans. Consum. Electron., vol. 32, no. 2, pp. 145-150, May 1986.
[105] S. Sehajpal, S. S. Taylor, D. J. Allstot, and J. S. Walling, "Impact of switching glitches in class-G power amplifiers," IEEE Microw. Wirel. Compon. Lett., vol. 22, no. 6, pp. 282-284, Jun. 2012.
[106] B. Serneels, M. Steyaert, and W. Dehaene, "A high speed, low voltage to high voltage level shifter in standard 1.2V $0.13 \mu \mathrm{~m}$ CMOS," in Proc. IEEE ECS, 2006, pp. 668-671.
[107] W. Yuan and J. S. Walling, "A multiphase switched capacitor power amplifier in 130nm CMOS," in IEEE RFIC Dig. Tech. Papers, 2016.
[108] W. Yuan, V. Aparin, J. Dunworth, L. Seward, and J. S. Walling, "A quadrature switched capacitor power amplifier," IEEE J. Solid-State Circuits, vol. PP, no. 99, pp. 1-22, 2016.

