ABSTRACT This paper presents a fully integrated C-band Doherty power amplifier (DPA) based on a 0.25-µm GaN-HEMT process for 5G massive MIMO applications. The performance degradation caused by the nonlinear output capacitance is analyzed, and a novel compensation technique is proposed. A low-Q output network is employed to broaden the bandwidth, and its insertion loss in the back-off region is shown to be determined mainly by the Q-factor of the drain bias inductor of the main PA. Hence, by adopting on-chip transmission lines with high Q-factors for drain biasing, full integration and low loss can be achieved simultaneously. Reversed uneven power splitting and back-off input matching are proposed for gain enhancement. The fabricated DPA demonstrates a small-signal gain of 8.6-11.6 dB, an output power of 40.4-41.2 dBm, a 6-dB back-off drain efficiency (DE) of 47%-50%, and a saturation DE of 55%-63% across a wide bandwidth from 4.5 to 5.2 GHz, with an ultra-compact size of 2.2 mm × 2.1 mm. Using a 40-MHz LTE signal with a 7.7-dB peak-to-average power ratio at the carrier frequency of 4.9 GHz, the measured average output power and efficiency are 33 dBm and 43%, respectively. The raw adjacent channel power ratio is −29 dBc and is improved to −46 dBc by applying digital predistortion.
I. INTRODUCTION
The demand for higher data rates is increasing rapidly with the development of many emerging technologies. However, the spectrum below 3 GHz is already very crowded. Therefore, many new frequency bands are allocated for 5G communication, including the mm-wave band and the spectrum below 6 GHz, especially the C-band [1]. Compared with the mm-wave band, the C-band shows much lower path loss and, given the strong commercial demand, will be deployed first in 5G systems. The massive multiple-input multiple-output (MIMO) technique will be used in 5G systems to improve spectrum efficiency, and the number of power amplifiers (PAs) is typically up to 64 or 128. As a result, the power requirement of a unit PA is reduced greatly to tens of watts. In addition, the unit PA should be miniaturized to ensure a reasonable system size. A GaN-based monolithic microwave integrated circuit (MMIC) is an excellent choice for such applications. Since most of the power in a typical base transceiver system is consumed by the PAs, the design of high-efficiency PAs is critical to reducing system power dissipation. The Doherty power amplifier (DPA) is the most popular architecture for average efficiency enhancement due to its high back-off efficiency [2]-[7]. Many GaN MMIC DPAs, including those for microwave backhaul applications [8]-[17] and those for small-cell or femto-cell applications [18]-[28], have been reported. However, there are relatively few C-band GaN MMIC DPAs for 5G applications [29]-[31]. Hybrid integration is adopted for the design of 3.5-GHz GaN MMIC DPAs in [29], and state-of-the-art performance is achieved. Fully integrated sub-6 GHz DPAs with high back-off efficiency are presented in [30] and [31]. Nevertheless, their saturated power is below 10 watts, and the high back-off efficiency is maintained only over a narrow bandwidth. This paper demonstrates an ultra-compact, fully integrated GaN DPA with a saturated power up to 41.2 dBm. Moreover, a 6-dB back-off efficiency better than 47% is achieved over a relatively wide band from 4.5 GHz to 5.2 GHz.
The associate editor coordinating the review of this manuscript and approving it for publication was Vittorio Camarchia.
To reduce the occupied size, the quarter-wavelength transformer (QWT) of the proposed DPA is implemented using a lumped π-type network, whose bandwidth is much narrower than that of a distributed QWT [32]. The low-Q output network proposed in [32] is adopted for bandwidth expansion. The back-off impedance transforming ratio (ITR) is reduced to 2 by setting the common load and the characteristic impedance of the QWT to R_opt and √2·R_opt, respectively. Furthermore, the gate width of the transistor is optimized for an R_opt of 50 Ω to eliminate the post matching network. The output capacitance of the transistor, which is commonly assumed to be constant in the conventional low-Q design [18], [20]-[24], is absorbed into the output network. In fact, the output capacitance of GaN-HEMT devices is nonlinear and changes with the output power, as demonstrated in [33] and [34]. The assumption of a constant C_out may deteriorate the load modulation of the DPA significantly. In this paper, the performance degradation caused by the nonlinear output capacitance is analyzed in depth, and a novel compensation technique is proposed. The insertion loss of the low-Q output network in the back-off region is evaluated, and it is found that the loss is determined mainly by the Q-factor of the drain bias inductor of the main PA. To achieve full integration with low loss, all drain bias inductors are realized by on-chip transmission lines (TLs), which exhibit a much higher Q-factor than on-chip inductors and ensure a high back-off efficiency. The other inductors in the output network are realized by on-chip spiral inductors for a compact size.
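The reduction of the back-off ITR can be illustrated with the ideal quarter-wave relation Z_in = Z_T²/Z_L. The sketch below is only an illustration under stated assumptions: the ITR is taken as the ratio of the main-PA load at back-off to the common load, and the conventional reference case (common load R_opt/2 with a QWT impedance of R_opt) is the textbook symmetric Doherty configuration rather than a value quoted in this paragraph.

```python
import math

def qwt_input_impedance(z_char, z_load):
    """Impedance seen through an ideal quarter-wavelength transformer."""
    return z_char ** 2 / z_load

def backoff_itr(z_char, common_load):
    """Assumed ITR definition: main-PA load at back-off divided by the common load."""
    return qwt_input_impedance(z_char, common_load) / common_load

R_OPT = 50.0  # ohms, value targeted in the paper

# Textbook symmetric Doherty (assumption): common load R_opt/2, Z_T = R_opt
itr_conventional = backoff_itr(R_OPT, R_OPT / 2)

# Low-Q network of the paper: common load R_opt, Z_T = sqrt(2) * R_opt
itr_low_q = backoff_itr(math.sqrt(2) * R_OPT, R_OPT)
main_load_backoff = qwt_input_impedance(math.sqrt(2) * R_OPT, R_OPT)

print(f"conventional back-off ITR: {itr_conventional:.1f}")
print(f"low-Q back-off ITR:        {itr_low_q:.1f}")
print(f"main-PA back-off load:     {main_load_backoff:.1f} ohm")
```

In both cases the main PA sees 2·R_opt at back-off; only the ratio that the transformer has to realize changes.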
A large transistor size is adopted to obtain an output power higher than 10 watts, and the power gain is reduced as a consequence. In our design, back-off input matching (BIM) and reversed uneven power splitting (RUPS) are proposed for gain enhancement. Unlike the conventional design, the input matching of the main PA is optimized at the back-off power level with a load impedance of 2R_opt. The back-off gain of the DPA is improved while the saturated gain remains the same. Uneven input power splitting is commonly used to improve the load modulation [35]. However, this method reduces the gain of the DPA [18]. To solve this problem, different stabilizing circuits are applied to the main PA and the auxiliary PA, and an RUPS scheme is used to deliver more power to the main PA. As a result, the overall gain of the DPA is enhanced greatly.
II. PROPOSED OUTPUT NETWORK
A. LOW-Q NETWORK WITH NONLINEAR C_OUT COMPENSATION
Fig. 1 shows the conventional output network of symmetrical DPAs [27]. A high-pass π-type network is used as the QWT with a characteristic impedance of R_opt. The output capacitance, C_out, of the transistor is neutralized by a shunt inductor L_p. A compact output network with only three inductors can be achieved after merging the shunt inductors. However, the requirement of an open circuit in the low power (LP) region and optimal power matching in the high power (HP) region cannot be satisfied at the same time with merely one shunt inductor, since the output capacitance of the auxiliary transistor in the LP region is different from that in the HP region. The large-signal model of GaN-HEMT devices is very complicated, and the output capacitance is modulated by both the drain-source and gate-source voltages. Assuming a dependence on the drain-source voltage only, the typical characteristic of the nonlinear C_out versus drain-source voltage is plotted in Fig. 2(a) [33], [34]. It is observed that the capacitance increases dramatically as the drain-source voltage becomes smaller, which means that the average C_out will increase with the output power.
A transistor with a gate width of 10 × 200 µm is used to demonstrate the nonlinear output capacitance. The output capacitance under different output powers can be extracted from load-pull simulation results. As shown in Fig. 2(b), the average C_out increases rapidly when the output power is close to saturation. Assuming an operating frequency of 4.9 GHz, the output capacitances in the back-off and saturation regions, the corresponding resonant inductances, and the saturated power for each inductance are summarized in Table 1. The shunt inductances for an open circuit in the back-off region and for optimal power matching in the saturation region are 1.5 nH and 1.2 nH, respectively. In the conventional output network, the saturation output power of the auxiliary PA is reduced by 0.6 dB if the open circuit in the back-off region is guaranteed, as shown in Table 1, and consequently the load modulation of the DPA is deteriorated.
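The two shunt inductances can be related back to the capacitance they neutralize through the parallel-resonance condition ω²LC = 1. The sketch below computes the capacitances implied by the quoted 1.5 nH and 1.2 nH at 4.9 GHz; these are derived values for illustration, not entries copied from Table 1.

```python
import math

F0 = 4.9e9  # operating frequency in Hz, from the text

def resonant_capacitance(l_henry, f_hz=F0):
    """Capacitance that resonates with a shunt inductance at f_hz (w^2 * L * C = 1)."""
    w = 2 * math.pi * f_hz
    return 1.0 / (w ** 2 * l_henry)

c_backoff = resonant_capacitance(1.5e-9)  # implied C_out in the back-off region
c_sat = resonant_capacitance(1.2e-9)      # implied C_out in the saturation region

print(f"implied back-off C_out:   {c_backoff * 1e12:.3f} pF")
print(f"implied saturation C_out: {c_sat * 1e12:.3f} pF")
print(f"relative increase:        {(c_sat / c_backoff - 1) * 100:.0f} %")
```

Under this simple resonance picture, the average output capacitance grows by about a quarter between back-off and saturation, which is why a single shunt inductor cannot serve both regions.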
To overcome the output power degradation caused by the nonlinear C_out, a small series inductor L_s is inserted behind the auxiliary PA, as shown in Fig. 3. The shunt inductor L_p of the auxiliary PA is used to resonate with the output capacitance in the back-off region. Owing to the high supply voltage of GaN devices, R_opt is relatively large. Thus, we assume that the impedance of L_s is much smaller than R_opt, as defined in (1), where ω is the angular frequency. In the back-off region, the open circuit is maintained in view of the small impedance of L_s. In the saturation region, the output matching network of the auxiliary PA is shown in Fig. 4(a). In Fig. 4(b), the series inductor is converted to a shunt inductor using the following equations:
Q is very small according to (1); thus, (2) and (3) can be simplified to (5) and (6), respectively.
L_s_c is a large inductor, and merging L_s_c with L_p results in an inductor smaller than L_p, denoted by L_p_c in Fig. 4(b). Consequently, although the output capacitance increases rapidly in the saturation region, the shunt inductor is still able to resonate with C_out around the operating frequency, and the performance deterioration is overcome. The example in Table 1 shows a power reduction of 0.6 dB when using a 1.5-nH shunt inductor. After a 0.3-nH series inductor is inserted, the saturation power is restored to 39.2 dBm, which is only 0.1 dB lower than the maximum power from the simulation results. Since the inductance of L_s is very small, the introduced insertion loss is negligible and the power improvement is maintained.
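The effect of the series inductor can be checked numerically with the standard narrowband series-to-parallel transformation, Q = ωL_s/R, R_p = R(1 + Q²), L_s_c = L_s(1 + 1/Q²). The sketch below assumes that this textbook transformation is the one intended by the conversion equations, and that L_s is in series with a load of R_opt = 50 Ω in the saturation region; the function names are hypothetical.

```python
import math

F0 = 4.9e9    # Hz, operating frequency from the text
R_OPT = 50.0  # ohms

def series_to_parallel(l_s, r_load, f_hz=F0):
    """Textbook series R-L to parallel R-L conversion at a single frequency."""
    w = 2 * math.pi * f_hz
    q = w * l_s / r_load              # series Q; small when w*L_s << R_opt
    r_p = r_load * (1 + q ** 2)       # equivalent shunt resistance
    l_sc = l_s * (1 + 1 / q ** 2)     # equivalent shunt inductance (L_s_c)
    return q, r_p, l_sc

def parallel(a, b):
    """Parallel combination of two inductances."""
    return a * b / (a + b)

q, r_p, l_sc = series_to_parallel(0.3e-9, R_OPT)
l_pc = parallel(1.5e-9, l_sc)  # merged shunt inductance L_p_c

print(f"Q     = {q:.3f}")
print(f"R_p   = {r_p:.1f} ohm")
print(f"L_s_c = {l_sc * 1e9:.2f} nH")
print(f"L_p_c = {l_pc * 1e9:.2f} nH")
```

With the quoted 0.3-nH series inductor, the effective shunt inductance at saturation drops from 1.5 nH to roughly 1.3 nH, moving toward the 1.2 nH needed for optimal power matching while the shunt resistance stays close to R_opt.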
Besides the performance deterioration caused by the nonlinear output capacitance, the conventional network in Fig. 1 also exhibits limited bandwidth because of the high ITR in the back-off region. To expand the bandwidth of the DPA, a low-Q QWT is adopted [32]. Fig. 5 presents the schematic of the low-Q output network. The common load and the characteristic impedance of the transformer are set to R_opt and √2·R_opt, respectively. Therefore, the ITR in the back-off region is reduced to 2, and a larger bandwidth can be achieved. A lower ITR also means a lower insertion loss, so the back-off efficiency is improved as well. The ITR in the saturation region has the same value; hence, the saturation performance is not a limiting factor of the overall bandwidth. The post matching network is an additional bandwidth-limiting factor. It also occupies a large area and introduces additional insertion loss. Hence, the gate width of the transistor is optimized to realize an R_opt of 50 Ω, which results in a transistor size of 10 × 200 µm.
VOLUME 7, 2019
A band-pass network with zero phase shift is used to realize the impedance transformation in the saturation region and, at the same time, maintain the open circuit of the output impedance in the back-off region. The network is composed of L_1, C_1, L_2, and C_2, and its structure is presented in Fig. 6, where Z_1 = jωL_1, Z_2 = 1/(jωC_2), Y_1 = jωC_1, and Y_2 = 1/(jωL_2). To determine the element values, the S-parameter matrix of the network is derived. First, the cascade matrix is calculated by multiplying those of the individual elements, denoted by
where A_Z1, A_Y1, A_Z2, and A_Y2 are the cascade matrices of the individual elements. The cascade matrix is then converted to the S-parameter matrix. Considering the function of the network, the following equations should be satisfied:
After some simplification, we can get
The solution is not unique, as the number of equations is less than the number of variables. Different solutions result in different bandwidths, occupied areas, and insertion losses. The final solution is chosen so that the values of L_1 and L_2 are as small as possible while a reasonable bandwidth is guaranteed.
B. FULL INTEGRATION WITH LOW LOSS
Fig. 7 shows the impedance transformation in the back-off region, where L_M is the merged inductance of L_pM and L_T, R_M and R_T represent the losses of L_M and L_T, respectively, and the loss of the capacitor is not considered. To achieve a high back-off efficiency, the insertion loss of the QWT should be minimized, which relies on the Q-factors of L_M and L_T. The resistor network in Fig. 7 divides the current and dissipates RF power. Assuming R_M and R_T are much larger than R_opt, the insertion loss can be expressed as
Denoting the Q-factors of L_M and L_T by Q_M and Q_T, respectively, R_M and R_T can be calculated using
where L_T and L_M are decided by
Using (14)-(17), (13) can be simplified to
where
Since 2/β is much larger than 1/√2, it can be concluded that the insertion loss is more sensitive to Q_M.
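To give a feel for the quantities in this derivation, the sketch below evaluates the equivalent parallel loss resistance of each inductor using the common model R = Q·ω·L, with the component values and Q-factors quoted in this section for the 10 × 200 µm device at 4.9 GHz. Both the parallel-loss model and the relation L_T = √2·R_opt/ω (a lumped π-type QWT whose reactance equals its characteristic impedance) are assumptions made for illustration; the closed-form loss expression itself is not reproduced here.

```python
import math

F0 = 4.9e9    # Hz
R_OPT = 50.0  # ohms
W = 2 * math.pi * F0

def parallel_loss_resistance(l_henry, q_factor, w=W):
    """Equivalent shunt resistance of a lossy inductor, assuming R = Q * w * L."""
    return q_factor * w * l_henry

# Assumed relation for the lumped QWT inductor: w * L_T = sqrt(2) * R_opt
l_t_ideal = math.sqrt(2) * R_OPT / W

r_m_tl = parallel_loss_resistance(0.79e-9, 46)      # L_M as on-chip TL (Q = 46)
r_m_spiral = parallel_loss_resistance(0.79e-9, 17)  # L_M as spiral inductor (Q = 17)
r_t = parallel_loss_resistance(2.3e-9, 20)          # L_T with an assumed Q of 20

print(f"ideal L_T:        {l_t_ideal * 1e9:.2f} nH")
print(f"R_M (on-chip TL): {r_m_tl:.0f} ohm")
print(f"R_M (spiral):     {r_m_spiral:.0f} ohm")
print(f"R_T:              {r_t:.0f} ohm")
```

The TL-based L_M yields a loss resistance more than twenty times R_opt, whereas the spiral version is only about eight times R_opt, which is in line with the stated sensitivity of the loss to Q_M.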
In [20]-[25], [27], and [28], L_M is realized using off-chip inductors or bonding wires with high Q-factors. However, off-chip components reduce the consistency and reliability of PA modules. Therefore, on-chip components are adopted for both L_M and L_T to achieve full integration. Due to the limited dc-current capacity of on-chip inductors, L_M is realized by an on-chip TL. Compared with on-chip inductors, on-chip TLs exhibit a much higher Q-factor, which guarantees a low loss of the QWT according to (18). For a 10 × 200 µm transistor, L_M and L_T are 0.79 nH and 2.3 nH, respectively, and R_opt is 50 Ω. Fig. 8 illustrates the insertion loss as a function of Q_M and Q_T at 4.9 GHz. It is observed that the loss varies very little with Q_T and depends mainly on Q_M. Fig. 9 presents the Q-factors of the on-chip TL and the spiral inductor with the same inductance of 0.79 nH. At 4.9 GHz, the Q-factor of the spiral inductor is only 17, while that of the on-chip TL is up to 46. Assuming a Q-factor of 20 for L_T, the network with the on-chip TL exhibits an insertion loss of only 0.55 dB according to Fig. 8. The final version of the proposed output network is shown in Fig. 10. The drain bias inductor of the auxiliary PA is also realized by an on-chip TL. A small loaded capacitor C_L is employed to avoid a lengthy TL at the cost of a little
III. GAIN ENHANCEMENT
Fig. 11(a) shows the stabilizing circuit, where a small series resistor R_s is used to improve the in-band stability and a relatively large parallel resistor R_p is applied to remove possible low-frequency oscillations. With R_s set to 3 Ω, the maximum gain of the 10 × 200 µm transistor is shown in Fig. 11(b). The device is conditionally stable in the operating frequency band, but unconditional stability is achieved once the losses of all passive networks are considered, according to the final layout simulation results.
The maximum gain of the transistor is about 16 dB at 4.9 GHz; thus, the maximum gain of the DPA is 13 dB assuming an even input splitter. The input network is realized by lumped components to achieve a compact size, and the gain of the DPA is further reduced by the high loss of the on-chip inductors. In our design, RUPS and BIM are proposed to enhance the gain.

A. REVERSED UNEVEN POWER SPLITTING
Conventionally, main PA and auxiliary PA employ the same stabilizing circuit, and the gain of the auxiliary PA is much lower than that of the main PA because of the class-C biasing. To prevent an inappropriate load modulation, an uneven power splitter is commonly adopted to deliver more power to the auxiliary branch, and auxiliary PA is biased at a deeper class-C to prevent the early turn-on [35] . However, this method will reduce the gain in the back-off region due to a shortage of the input power towards the main PA and, also, the gain in the HP region due to a deep class-C operation [18] .
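The gain penalty of input splitting can be quantified by the fraction of the input power that reaches the main branch. The sketch below assumes an ideal lossless two-way splitter and evaluates three cases: an even split, a split delivering 1 dB more power to the auxiliary PA, and a split delivering 4 dB more power to the main PA (the two uneven ratios are those used in the comparison of this section). The function name is hypothetical.

```python
import math

def branch_losses_db(main_minus_aux_db):
    """Splitting loss (dB) toward the main and auxiliary branches of a
    lossless splitter, given the main-to-auxiliary power ratio in dB."""
    ratio = 10 ** (main_minus_aux_db / 10)
    main_frac = ratio / (1 + ratio)
    aux_frac = 1 / (1 + ratio)
    return -10 * math.log10(main_frac), -10 * math.log10(aux_frac)

even = branch_losses_db(0.0)           # even splitter
conventional = branch_losses_db(-1.0)  # 1 dB more power to the auxiliary PA
reversed_split = branch_losses_db(4.0) # 4 dB more power to the main PA

for name, (main_db, aux_db) in [("even", even),
                                ("conventional uneven", conventional),
                                ("reversed uneven", reversed_split)]:
    print(f"{name:20s} main: -{main_db:.2f} dB  aux: -{aux_db:.2f} dB")

print(f"main-branch drive gained by reversing the split: "
      f"{conventional[0] - reversed_split[0]:.2f} dB")
```

Under these idealized assumptions, reversing the split raises the main-branch drive by about 2 dB relative to the conventional uneven splitter, which is of the same order as the back-off gain improvement attributed to RUPS.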
In fact, the small-signal stability of the DPA is dominated by the main PA, since the auxiliary PA provides no gain in the back-off region [36]. Hence, the stabilizing circuit of the auxiliary PA can be relaxed, and asymmetrical stabilizing networks are applied in the DPA. In our design, R_s is set to 3 Ω for the main PA and reduced to 1 Ω for the auxiliary PA, which increases the gain of the auxiliary PA. Contrary to the conventional DPA, we adopt a reversed uneven power splitter to provide more power to the main PA, so that the auxiliary PA can be biased at a shallower class-C. As a result, the overall gain of the DPA is improved greatly. Fig. 12 shows the performance of the conventional DPA and the DPA with RUPS. For the conventional DPA, the series resistors of the main PA and the auxiliary PA are both 3 Ω, and 1 dB more power is delivered to the auxiliary PA, which is biased at a gate voltage of −4.6 V. For the proposed DPA, the series resistors of the main PA and the auxiliary PA are 3 Ω and 1 Ω, respectively, 4 dB more power is delivered to the main PA, and the auxiliary PA is biased at −3.7 V. It is observed that the proposed RUPS technique enhances the gain by about 2 dB in the back-off region and 1.5 dB in the saturation region, with similar efficiency.
B. BACK-OFF INPUT MATCHING
Fig. 13(a) presents a simplified equivalent circuit of the GaN-HEMT device [37]. According to Miller's theorem, the capacitor C_gd can be split into two grounded elements in parallel with C_gs and C_ds, respectively, resulting in the equivalent capacitors C_in and C_out, as shown in Fig. 13(b). Assuming C_out is neutralized by the susceptance jB_L, the input capacitance can be approximated as
where R_L is the load impedance. Since g_m, C_gs, and C_gd are all nonlinear, C_in is a function of the power level. Moreover, it should be noted that C_in also depends on the load impedance R_L. Fig. 14 depicts the extracted input capacitance of the main PA versus output power for load impedances of R_opt and 2R_opt. Conventionally, the input matching of the main PA is designed in the small-signal region with a load impedance of R_opt. A mismatch is then produced at the back-off power level, and the gain is degraded as a consequence. To solve this problem, BIM is adopted in our design, which means that the input matching is designed at the back-off power level with a load impedance of 2R_opt. The gain of the main PA with BIM is compared with that of the conventional main PA in Fig. 15. At the back-off power level (Pout = 36 dBm, R_L = 2R_opt), BIM improves the gain by about 0.6 dB. At the saturation power level (Pout = 39 dBm, R_L = R_opt), BIM shows no impact on the gain. Fig. 16 presents the performance comparison between the DPA with BIM and the conventional DPA. The back-off gain of the DPA is enhanced by about 0.5 dB while almost the same efficiency is achieved.
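The dependence of the input capacitance on the load can be illustrated with the textbook Miller approximation, C_in ≈ C_gs + C_gd(1 + g_m·R_L), which holds when the drain load is purely resistive. This form is assumed here as a stand-in for the approximation referred to above, and every device value in the sketch (g_m, C_gs, C_gd) is a hypothetical placeholder rather than an extracted parameter of the actual transistor.

```python
def miller_input_capacitance(c_gs, c_gd, g_m, r_load):
    """Textbook Miller approximation for a resistive drain load:
    C_in = C_gs + C_gd * (1 + g_m * R_L)."""
    return c_gs + c_gd * (1 + g_m * r_load)

# Hypothetical small-signal values, chosen only to show the trend
C_GS = 2.0e-12   # F
C_GD = 0.15e-12  # F
G_M = 0.5        # S
R_OPT = 50.0     # ohms

c_in_ropt = miller_input_capacitance(C_GS, C_GD, G_M, R_OPT)
c_in_2ropt = miller_input_capacitance(C_GS, C_GD, G_M, 2 * R_OPT)

print(f"C_in at R_L = R_opt:   {c_in_ropt * 1e12:.2f} pF")
print(f"C_in at R_L = 2 R_opt: {c_in_2ropt * 1e12:.2f} pF")
```

Even with fixed device parameters, doubling the load raises the Miller-multiplied term, so an input match designed for R_opt is detuned at back-off, which is the mismatch that BIM targets.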
IV. CIRCUIT DESIGN
A. OUTPUT NETWORK
As mentioned before, transistors with a gate width of 10 × 200 µm are adopted to realize an R_opt of 50 Ω. Load-pull
Using (10)-(12), the output matching network of the auxiliary PA can be synthesized, and a group of solutions is shown in Table 2. The frequency response of the network in Fig. 6 for each solution is depicted in Fig. 17. It is found that different solutions result in different matching bandwidths. Solution 1 shows the largest bandwidth with the largest inductances, while Solution 4 exhibits the smallest bandwidth with the smallest inductances. In a practical design, smaller inductances are preferred for chip miniaturization. With both bandwidth and inductance taken into consideration, Solution 2 is chosen for circuit implementation.
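The frequency response of a candidate solution can be evaluated by cascading the element matrices and converting the result to S-parameters, as described in Section II. The sketch below is a generic illustration: the element order (series L_1, shunt C_1, series C_2, shunt L_2) is an assumption about the topology of Fig. 6, the element values are hypothetical placeholders rather than entries of Table 2, and equal real reference impedances are assumed at both ports.

```python
import cmath
import math

def series(z):
    """Cascade (ABCD) matrix of a series impedance."""
    return ((1, z), (0, 1))

def shunt(y):
    """Cascade (ABCD) matrix of a shunt admittance."""
    return ((1, 0), (y, 1))

def matmul(a, b):
    """2x2 matrix product."""
    return ((a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
            (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]))

def abcd_to_s(m, z0):
    """S11 and S21 of a two-port with the same real reference impedance at both ports."""
    (a, b), (c, d) = m
    den = a + b / z0 + c * z0 + d
    return (a + b / z0 - c * z0 - d) / den, 2 / den

def network_s(f_hz, l1, c1, l2, c2, z0=50.0):
    """Assumed order: series L1, shunt C1, series C2, shunt L2."""
    w = 2 * math.pi * f_hz
    m = series(1j * w * l1)
    for block in (shunt(1j * w * c1), series(1 / (1j * w * c2)), shunt(1 / (1j * w * l2))):
        m = matmul(m, block)
    return abcd_to_s(m, z0)

# Hypothetical element values, for illustration only
s11, s21 = network_s(4.9e9, 1.0e-9, 0.5e-12, 1.0e-9, 0.5e-12)
print(f"|S21| = {abs(s21):.3f}, phase(S21) = {math.degrees(cmath.phase(s21)):.1f} deg")
```

Sweeping `f_hz` for each candidate set of element values gives the kind of bandwidth comparison used to choose between solutions; the zero-phase-shift and matching conditions then correspond to constraints on the phase of S21 and the magnitude of S11.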
To decide the linewidth of TL M and TL A , dc supply current should be assessed first. Since the maximum output power of the employed transistor is 39.5 dBm, the maximum dc current is calculated to be 530 mA assuming a drain efficiency of 60%. The dc-current density of TLs is 29.6 mA/µm, and thus the allowable minimum linewidth is calculated to be 18 µm. A larger linewidth of 50 µm is used for both TL M and TL A to realize a low loss at the cost of a little larger occupied area. Table 3 summarizes the component values of the proposed output network in Fig. 10 . Electromagnetic (EM) simulation is launched to evaluate the performance of the proposed output network, the insertion loss of which under different output powers is presented in Fig. 18 . In the back-off region, the insertion loss is only about 0.85 dB, ensuring a high back-off efficiency. 
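The linewidth estimate above can be reproduced step by step. The sketch assumes a 28-V drain supply (the value given for this process in Section IV-C) and otherwise uses only the numbers quoted in the paragraph.

```python
def dbm_to_watt(p_dbm):
    """Convert power in dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)

P_SAT_DBM = 39.5         # maximum output power of one transistor
DRAIN_EFFICIENCY = 0.60  # assumed drain efficiency from the text
V_DD = 28.0              # drain supply in volts (assumed from Section IV-C)
J_MAX_MA_PER_UM = 29.6   # dc-current density of the TL in mA/um

p_out_w = dbm_to_watt(P_SAT_DBM)
p_dc_w = p_out_w / DRAIN_EFFICIENCY
i_dc_ma = p_dc_w / V_DD * 1e3
w_min_um = i_dc_ma / J_MAX_MA_PER_UM

print(f"output power:      {p_out_w:.2f} W")
print(f"dc power:          {p_dc_w:.2f} W")
print(f"dc current:        {i_dc_ma:.0f} mA")
print(f"minimum linewidth: {w_min_um:.1f} um")
```

The chosen 50-µm linewidth therefore leaves a margin of nearly three times the current-handling limit, with the extra width spent on lowering the loss.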
B. INPUT NETWORK
The schematic of the input network is shown in Fig. 19, where all component values are given. Due to the nonlinear input capacitance, the second harmonic of the input voltage is out of phase with the fundamental component, and the conduction angle is increased as a result, which noticeably degrades the efficiency [21]. Therefore, a second-harmonic short-circuit network is inserted at the gate of the main PA, which improves both the back-off and saturated efficiencies.
At the fundamental frequency, the harmonic network is equivalent to a shunt capacitor and acts as a part of the input matching network. The QWT and the different biasings result in a phase difference of 100° between the two branches. The phase compensation is integrated with the input matching by adopting a band-pass network for the main PA and a high-pass network for the auxiliary PA. The high-pass and band-pass networks exhibit phase shifts of 80° and −20°, respectively, so that the phase difference of 100° is compensated. A Wilkinson divider realized by lumped inductors and capacitors is used for power splitting. The port impedance is 31 Ω for the main PA and 81 Ω for the auxiliary PA. The divider is designed at 4.9 GHz and provides a fractional bandwidth of 20%, which is sufficient for our design. The shunt inductors at the same node can be merged for a reduced size.
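The two port impedances can be cross-checked against the intended split using the textbook relations for an unequal Wilkinson divider, in which the output ports are terminated in Z_0·K and Z_0/K for a power ratio of K², with the lower-impedance port carrying the larger share. Applying these relations to the lumped implementation is an assumption of this sketch.

```python
import math

R_MAIN = 31.0  # ohms, port impedance toward the main PA
R_AUX = 81.0   # ohms, port impedance toward the auxiliary PA

# Textbook unequal Wilkinson (assumption): R_aux = Z0 * K, R_main = Z0 / K
k_squared = R_AUX / R_MAIN               # implied main-to-auxiliary power ratio
split_db = 10 * math.log10(k_squared)
z0_implied = math.sqrt(R_MAIN * R_AUX)   # implied system impedance

print(f"implied power ratio (main/aux): {k_squared:.2f} ({split_db:.2f} dB)")
print(f"implied system impedance:       {z0_implied:.1f} ohm")
```

The implied ratio of roughly 4.2 dB in favor of the main PA and the implied system impedance of about 50 Ω are both consistent with the 4-dB reversed uneven split described for the proposed DPA.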
C. IMPLEMENTATION
The overall schematic of the proposed DPA is shown in Fig. 20 . A 0.25-µm GaN-HEMT process from WIN Semiconductors is adopted for implementation. The power density is about 4 W/mm with 28-V drain supply, and the cutoff frequency is 24.5 GHz. All passive circuits are simulated using the Momentum in Keysight's Advanced Design System (ADS). The dimensions of passive devices have been fine tuned in view of many non-ideal factors, such as coupling effect, connecting line and bending effect. Fig. 21 shows the EM simulation results of the proposed DPA from 4.7 to 5.4 GHz. The saturation power is over 40.8 dBm and the 6-dB back-off DE is better than 49% across the bandwidth. 
V. MEASUREMENT RESULTS
The photo of the fabricated DPA is shown in Fig. 22, and the chip size is only 2.2 mm × 2.1 mm.
Large-signal performance also exhibits a frequency shift, and the actual operating frequency band is 4.5-5.2 GHz. Fig. 24 shows the measured DE and gain characteristics for a continuous-wave signal across the bandwidth in 0.1-GHz steps. At the center frequency of 4.9 GHz, the fabricated DPA demonstrates a power gain of 11 dB, a saturated output power of 41 dBm, a 6-dB back-off DE of up to 50%, and a peak DE of 58%. As shown in Fig. 25, a saturated power of 40.4-41.2 dBm, a 6-dB back-off DE (PAE) of 47%-50% (40%-45%), and a saturated DE (PAE) of 55%-63% (45%-51%) are obtained from 4.5 GHz to 5.2 GHz. Compared with the simulated performance in Fig. 21, the measured results show a saturation power degradation of about 0.5 dB and a DE decrease of about 3% at both saturation and 6-dB back-off, which could be caused by process variation, model inaccuracy, or errors of the test system. In addition, the measured gain exhibits a larger compression, possibly because the power device under class-C biasing is not well modeled.
To measure the performance of the DPA under modulated signal excitation, a 40-MHz LTE signal with a peak-to-average power ratio (PAPR) of 7.7 dB is employed at the carrier frequency of 4.9 GHz. The measured efficiency is 43% at the average output power of 33 dBm. Digital predistortion (DPD) based on the generalized memory polynomial (GMP) model is performed to linearize the DPA. Fig. 26 presents the measured power spectral density (PSD) before and after DPD at the average output power of 33 dBm. The ACPR of the proposed DPA is −29 dBc and is improved to −46 dBc after DPD linearization.
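Two of the relations behind these figures are easy to check numerically: the standard link between DE and PAE, PAE = DE·(1 − 1/G), and the peak envelope power implied by the average power and the PAPR. In the sketch below, the 10-dB gain is an illustrative value chosen to show how a 50% DE maps to a 45% PAE, not a reported operating point.

```python
def pae_percent(de_percent, gain_db):
    """Power-added efficiency from drain efficiency and power gain."""
    gain_lin = 10 ** (gain_db / 10)
    return de_percent * (1 - 1 / gain_lin)

def peak_power_dbm(avg_dbm, papr_db):
    """Peak envelope power of a modulated signal."""
    return avg_dbm + papr_db

pae_example = pae_percent(50.0, 10.0)   # illustrative gain of 10 dB
peak_dbm = peak_power_dbm(33.0, 7.7)    # LTE test signal from the text
acpr_improvement_db = -29.0 - (-46.0)   # raw ACPR minus ACPR after DPD

print(f"PAE for DE = 50 % and 10-dB gain: {pae_example:.1f} %")
print(f"peak envelope power:              {peak_dbm:.1f} dBm")
print(f"ACPR improvement from DPD:        {acpr_improvement_db:.0f} dB")
```

The 40.7-dBm signal peaks sit close to the measured saturated power of 41 dBm at 4.9 GHz, so the amplifier is driven up to the edge of saturation during the modulated test.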
The performances are summarized and compared with other fully integrated C-band GaN MMIC DPAs in Table 4 , where the power density is defined as the ratio of the saturated power to chip size. The proposed DPA demonstrates the largest saturated power with a high efficiency. Moreover, the power density is also the largest because of the ultracompact size. Therefore, the proposed design is a very good candidate for 5G MIMO application. 
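The power-density figure of merit defined above can be computed directly from the reported peak saturated power and chip dimensions. The result below is derived from those two numbers and is not copied from Table 4.

```python
def dbm_to_watt(p_dbm):
    """Convert power in dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)

P_SAT_DBM = 41.2             # peak saturated power reported for the DPA
CHIP_W_MM, CHIP_H_MM = 2.2, 2.1

p_sat_w = dbm_to_watt(P_SAT_DBM)
chip_area_mm2 = CHIP_W_MM * CHIP_H_MM
power_density = p_sat_w / chip_area_mm2  # W per mm^2 of chip area

print(f"saturated power: {p_sat_w:.2f} W")
print(f"chip area:       {chip_area_mm2:.2f} mm^2")
print(f"power density:   {power_density:.2f} W/mm^2")
```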
VI. CONCLUSION
A fully integrated GaN MMIC DPA with high efficiency and compact size has been designed for 5G MIMO applications. The performance degradation caused by the nonlinear output capacitance is analyzed in depth and is overcome by inserting a small series inductor. A low-Q output network is employed to broaden the bandwidth. Its insertion loss in the back-off region is shown to be determined mainly by the Q-factor of the drain bias inductor of the main PA. To achieve full integration and low loss simultaneously, all drain bias inductors are realized by on-chip TLs with high Q-factors. Using the proposed RUPS and BIM, the gain of the DPA is enhanced greatly. The fabricated DPA demonstrates a measured saturation power of 40.4-41.2 dBm and a 6-dB back-off DE of 47%-50% from 4.5 to 5.2 GHz, with a chip size of 2.2 mm × 2.1 mm. Under excitation by a 40-MHz LTE signal with 7.7-dB PAPR, the ACPR at the average output power of 33 dBm is improved to −46 dBc after applying DPD.
Her current research interests include the behavioral modeling and digital predistortion for RF power amplifiers.
He has held several invited positions at several academic and research institutions in Europe, North America, and Japan. He has provided consulting services to a number of microwave and wireless communications companies. He is currently a Professor, an iCORE/Canada Research Chair, and the Director of the iRadio Laboratory with the Department of Electrical and Computer Engineering, University of Calgary, Alberta. His research has led to more than 500 refereed publications, seven U.S. patents, and seven patent applications. His research interests include RF and wireless communications, nonlinear modeling of microwave devices and communications systems, design of power- and spectrum-efficient microwave amplification systems, and design of SDR systems for wireless and satellite communications applications. He is an IET Fellow and an IEEE MTT-S Distinguished Microwave Lecturer.
ZHENGHE FENG (SM'92-F'11) received the B.S. degree in radio and electronics from Tsinghua University, Beijing, China, in 1970.
Since 1970, he has been with Tsinghua University, as an Assistant, Lecturer, an Associate Professor, and a Full Professor. His current research interests include numerical techniques and computational electromagnetics, RF and microwave circuits and antenna, wireless communications, smart antenna, and spatial temporal signal processing.
