The next-generation wireless network, 5G, is expected to provide ubiquitous connections to billions of devices as well as to unlock many new services through multigigabit-per-second data transmission. To meet the ever-increasing demands for higher data rates and larger capacities, new modulation schemes have been developed, and wider frequency bands, such as those at millimeter wave (mm-wave), have been designated for 5G [1], [2]. Massive multiple input/multiple output (MIMO), which uses a large number of antennas at the transmitter and receiver, has been considered one of the key 5G technologies to improve data throughput and spectrum efficiency [3]. These new application scenarios impose stringent requirements on wireless transceiver front ends and call for special considerations at the circuit and system design levels. In the transmitter, power amplifiers (PAs) should accommodate complex modulated signals, featuring a high peak-to-average-power ratio (PAPR) and a wide modulation bandwidth. Moreover, in massive MIMO arrays, PAs should maintain a high average efficiency to mitigate thermal issues.
Several PA architectures have been adopted to efficiently amplify signals with large PAPRs. Popular ones include envelope tracking, outphasing, and the Doherty PA (DPA). Since its introduction in 1936 [4], the DPA has been extensively explored and has become one of the most widely used PA architectures in existing cellular base stations. It consists of two amplifiers whose output power is combined through a load-modulation network. It can maintain high efficiency across a large power range and features a simple circuit implementation when compared to other architectures. Recent research also shows that it is capable of operating at mm-wave frequencies [5]. Unfortunately, the classical DPA suffers from intrinsic bandwidth limitations, mainly due to the narrowband quarter-wavelength ($\lambda/4$) transmission lines used for the impedance transformation.
Bandwidth extension is an important consideration in modern DPA designs, and it has received increasing attention in recent research, especially for wideband 5G applications. Many papers reviewing DPAs have been published during the past several years [6]-[11]. However, there is no complete review of broadband design techniques for the DPA, an essential subject for 5G wireless transmitters. In this article, we present a comprehensive review and critical discussion of bandwidth enhancement techniques for DPAs proposed in the literature to provide a thorough understanding of broadband-DPA design for high-efficiency 5G wireless transmitters.
DPA Bandwidth Limitation
The basic DPA architecture is shown in Figure 1, where the carrier amplifier is biased in the class-B mode, while the peaking amplifier is biased in the class-C mode. At high input-power levels, both amplifiers are active and deliver power to the load impedance. The characteristic impedances of the $\lambda/4$ transmission lines, TL1 and TL2, are chosen so that both amplifiers "see" their optimum load resistance to provide maximum power and efficiency. When the input power level is low, only the carrier amplifier is active in providing output power. The load resistance presented to the carrier amplifier is increased by transmission line TL1, operating as an impedance inverter and serving to improve the DPA efficiency at lower output power levels. The $\lambda/4$ transmission line at the peaking amplifier input provides a 90° phase shift to ensure the proper combination of output power from the two amplifiers at the common output node. Several factors contributing to the bandwidth limitation of the DPA are discussed in the following sections. We provide new insights through a theoretical derivation of the impedance presented to the carrier amplifier at the peak power and back-off.
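The efficiency benefit of this load modulation can be illustrated with the textbook idealization of a symmetric DPA. The sketch below is not taken from this article: it assumes ideal class-B current sources, a lossless output network at the center frequency, and a peaking amplifier that turns on exactly at half of the peak input voltage. The function names are illustrative.

```python
import math

def ideal_class_b_efficiency(v):
    """Drain efficiency of an ideal class-B PA at normalized drive v (0 < v <= 1)."""
    return (math.pi / 4.0) * v

def ideal_doherty_efficiency(v):
    """Drain efficiency of an idealized symmetric Doherty PA.

    v is the input voltage normalized to its peak value. Assumptions (not from
    the article): ideal class-B current sources, a lossless output network, and
    a peaking amplifier that turns on at v = 0.5 (6-dB back-off).
    """
    if not 0.0 < v <= 1.0:
        raise ValueError("v must be in (0, 1]")
    if v <= 0.5:
        # Carrier alone, loaded with 2*Ropt: efficiency rises twice as fast
        # as in class B and reaches pi/4 at 6-dB back-off.
        return (math.pi / 2.0) * v
    # Carrier voltage saturated; the peaking amplifier modulates the load.
    return (math.pi / 2.0) * v * v / (3.0 * v - 1.0)

for backoff_db in (0, 3, 6, 9, 12):
    v = 10.0 ** (-backoff_db / 20.0)
    print(f"{backoff_db:2d} dB back-off: "
          f"Doherty {100 * ideal_doherty_efficiency(v):5.1f}%, "
          f"class B {100 * ideal_class_b_efficiency(v):5.1f}%")
```

Under these assumptions, the efficiency at 6-dB back-off equals the peak-power value of π/4, which is twice what a single class-B stage would give at the same back-off.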
Output Network
The main bandwidth-limiting element of the DPA is usually impedance inverter TL1. This can be shown by exploring the impedances presented to the carrier amplifier at the peak output power and back-off. The characteristic impedances of the $\lambda/4$ transmission lines, assuming a symmetric transistor configuration (that is, the transistors in the carrier and peaking amplifiers are identical), are chosen as
$$Z_1 = R_{opt}, \qquad Z_2 = \sqrt{\frac{R_L R_{opt}}{2}},$$
where $R_{opt}$ is the optimum load impedance of the carrier and peaking transistors [6]. Transmission line TL2 transforms load impedance $R_L$ into $Z_2^2/R_L = R_{opt}/2$ at the common output node. At peak power, the two transistors deliver identical output currents. The characteristic impedance of TL1 is chosen to match the optimum load resistance at peak power. Thus, impedance $R_{opt}$ is presented to both amplifiers. At 6-dB back-off, the peaking amplifier is turned off, and the impedance presented to the carrier amplifier, assuming that the peaking amplifier presents an open circuit, is
$$Z_c = \frac{Z_1^2}{R_{opt}/2} = 2R_{opt}.$$
It is evident that the impedance-transformation ratio of impedance inverter TL1 is one at the peak power and four at the 6-dB back-off. This limits the bandwidth of the DPA at back-off. To illustrate this bandwidth limitation, we compare the impedance presented to the carrier amplifier at peak power and 6-dB back-off. Assuming that the load presented to the output of impedance inverter TL1 is $kR_{opt}$ ($k = 1$ at peak power, and $k = 0.5$ at the back-off), it can be shown that
$$\frac{Z_c(f)}{R_{opt}} = \frac{k + j\tan\theta}{1 + jk\tan\theta}, \qquad \theta = \frac{\pi}{2}\,\frac{f}{f_0},$$
where $f_0$ is the center frequency, at which TL1 is a quarter-wavelength long.
The real part of $Z_c(f)$, normalized to $R_{opt}$, at the peak power and 6-dB back-off is depicted in Figure 2. A narrower bandwidth is observed at the back-off. The
where
The real part of the impedance at the peak power and back-off is shown in Figure 3. For $m < 1$, the bandwidth degrades compared to the ideal case of $m = 1$, while for $m > 1$, a higher bandwidth is achieved but with additional peaks in the real part of the impedance at the peak power, leading to variations in output power and efficiency across the bandwidth. For $m = 0.5$, the fractional bandwidth for a 20% reduction in the real part of the impedance at back-off is 18%, while for $m = 1.5$ it is 62%. This shows the advantage of using transistors with a large $R_{opt}$ [for example, gallium nitride (GaN) devices] in the realization of broadband DPAs. Note that the characteristic impedance of TL2 is given by
$$Z_2 = mR_L,$$
which can become infeasible if $m$ is too large.
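The bandwidth figures above can be explored numerically with the lossless-line input-impedance formula. The sketch below rests on an assumption: since the definition of m is not legible in this copy, it takes m such that R_L = R_opt/(2m²) (so that Z2 = m·R_L), a reading that appears consistent with the transformation ratios quoted in the text. The peaking amplifier is modeled as an ideal open circuit at back-off, and all function names are illustrative.

```python
import numpy as np

def line_input_impedance(z0, z_load, theta):
    """Input impedance of a lossless line (characteristic impedance z0,
    electrical length theta in radians) terminated in z_load."""
    c, s = np.cos(theta), np.sin(theta)
    return z0 * (z_load * c + 1j * z0 * s) / (z0 * c + 1j * z_load * s)

def backoff_carrier_impedance(f_norm, m, r_opt=1.0):
    """Carrier impedance at 6-dB back-off versus normalized frequency f/f0.

    ASSUMPTION: m is defined by R_L = R_opt / (2 m^2), so that Z2 = m * R_L.
    The peaking amplifier is treated as an ideal open circuit.
    """
    theta = 0.5 * np.pi * np.asarray(f_norm, dtype=float)
    r_load = r_opt / (2.0 * m ** 2)
    z_node = line_input_impedance(m * r_load, r_load, theta)  # TL2
    return line_input_impedance(r_opt, z_node, theta)         # TL1, Z1 = R_opt

def backoff_fractional_bandwidth(m, drop=0.2, points=200001):
    """Fractional bandwidth over which Re{Zc} stays within `drop` of its
    center-frequency value (the response is symmetric about f0)."""
    f = np.linspace(1.0, 0.0, points)  # scan downward from f0
    re = backoff_carrier_impedance(f, m).real
    below = np.nonzero(re < (1.0 - drop) * re[0])[0]
    return 2.0 * (1.0 - f[below[0]])

for m in (0.5, 1.0, 1.5):
    print(f"m = {m}: {100 * backoff_fractional_bandwidth(m):.1f}% fractional bandwidth")
```

Under this assumed definition, the computed back-off bandwidths for m = 0.5 and m = 1.5 should land within about a percentage point of the 18% and 62% quoted above.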
The TL2 transmission line can be replaced by a higher-order matching network to improve its bandwidth. For example, a two-section impedance transformer composed of two $\lambda/4$ transmission lines with characteristic impedances of $Z_a = \sqrt{m}\,R_L$ and $Z_b = m\sqrt{m}\,R_L$ will lower the impedance transformation ratio ($m$ for each transmission line compared to $m^2$ in the previous case), thus extending the bandwidth, as depicted in Figure 4. The fractional bandwidth for a 20% reduction in the real part of the impedance at back-off is 38% for all values of $m$. Also, the real part of the impedance at peak power exhibits much smaller variations across the bandwidth.
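A minimal sketch of this comparison follows. It assumes that m is defined by R_L = R_opt/(2m²), that the section next to the load has characteristic impedance √m·R_L, and that the section next to the common node has m√m·R_L. These choices are assumptions of the sketch, so the computed percentages need not reproduce the figure quoted above exactly.

```python
import numpy as np

def line_input_impedance(z0, z_load, theta):
    """Input impedance of a lossless line of electrical length theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return z0 * (z_load * c + 1j * z0 * s) / (z0 * c + 1j * z_load * s)

def backoff_carrier_impedance(f_norm, m, two_section, r_opt=1.0):
    """Back-off carrier impedance with a one- or two-section output transformer.

    ASSUMPTIONS: R_L = R_opt / (2 m^2); the section ordering described in the
    lead-in; the peaking amplifier is an ideal open circuit.
    """
    theta = 0.5 * np.pi * np.asarray(f_norm, dtype=float)
    r_load = r_opt / (2.0 * m ** 2)
    if two_section:
        z_mid = line_input_impedance(np.sqrt(m) * r_load, r_load, theta)
        z_node = line_input_impedance(m * np.sqrt(m) * r_load, z_mid, theta)
    else:
        z_node = line_input_impedance(m * r_load, r_load, theta)
    return line_input_impedance(r_opt, z_node, theta)  # inverter TL1, Z1 = R_opt

def fractional_bandwidth(m, two_section, drop=0.2, points=200001):
    """Bandwidth over which Re{Zc} stays within `drop` of its value at f0."""
    f = np.linspace(1.0, 0.0, points)  # scan downward from f0
    re = backoff_carrier_impedance(f, m, two_section).real
    below = np.nonzero(re < (1.0 - drop) * re[0])[0]
    return 2.0 * (1.0 - f[below[0]])

for m in (0.5, 1.0, 1.5):
    one = 100 * fractional_bandwidth(m, two_section=False)
    two = 100 * fractional_bandwidth(m, two_section=True)
    print(f"m = {m}: single section {one:.1f}%, two sections {two:.1f}%")
```

For m = 1 both transformers are trivial, so the two results coincide. For m < 1 the two-section version should show a clearly wider back-off bandwidth.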
It should be noted that this technique is not effective for transmission line TL1, which should operate as an impedance inverter. To clarify this point, we note that a higher-order matching network is designed for a given set of source and load impedances. If the load resistance is doubled, for example, its frequency response degrades. An impedance inverter, however, is a special impedance transformer, in that its input impedance is proportional to the inverse of the load impedance. It is not a straightforward process to realize a broadband impedance inverter by replacing a $\lambda/4$ transmission line with a higher-order network.
Figure 2. The real part of the normalized impedance presented to the carrier amplifier at peak power and 6-dB back-off, plotted against normalized frequency. The DPA bandwidth is limited mainly by the back-off impedance.

A DPA with asymmetric (that is, of unequal size) transistors is used to amplify signals that have a PAPR greater than 6 dB. Assuming that the gate width of the peaking transistor is $N$ times that of the carrier transistor, the peak efficiency is achieved at a $20\log(1+N)$-dB back-off. The characteristic impedance of TL1 is the same as in the symmetric DPA (that is, $Z_1 = R_{opt}$), while for TL2 it is given by
$$Z_2 = \sqrt{\frac{R_L R_{opt}}{N+1}}.$$
The load impedance is transformed to $R_{opt}/(N+1)$ at the common node. The impedance presented to the carrier transistor is $R_{opt}$ at peak power and $(N+1)R_{opt}$ at back-off, while the impedance $R_{opt}/N$ is presented to the peaking amplifier at peak power (it is the optimum impedance of the peaking transistor, as its size is $N$ times that of the carrier transistor). We emphasize that the impedance transformation ratio of TL1 at back-off increases to
$$\frac{(N+1)R_{opt}}{R_{opt}/(N+1)} = (N+1)^2,$$
limiting the bandwidth of the asymmetric DPA. Moreover, the impedance-transformation ratio for TL2 is
$$\frac{(N+1)R_L}{R_{opt}},$$
which can be larger than that of the symmetric DPA, further limiting bandwidth.
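The asymmetric-DPA relations above can be collected into a small helper. This is a sketch only: it follows the center-frequency expressions as read from the text, and the dictionary keys and the 50-Ω example values are illustrative.

```python
import math

def asymmetric_dpa_design(n, r_opt, r_load):
    """Center-frequency design quantities of an asymmetric DPA whose peaking
    device is n times the size of the carrier device (n = 1 is symmetric)."""
    if n <= 0 or r_opt <= 0 or r_load <= 0:
        raise ValueError("n, r_opt and r_load must be positive")
    r_node = r_opt / (n + 1.0)  # resistance at the common output node
    return {
        "backoff_db": 20.0 * math.log10(1.0 + n),
        "z1": r_opt,                              # impedance inverter TL1
        "z2": math.sqrt(r_load * r_node),         # output transformer TL2
        "carrier_load_backoff": (n + 1.0) * r_opt,
        "peaking_load_peak": r_opt / n,
        "tl1_ratio_backoff": (n + 1.0) ** 2,
        "tl2_ratio": r_load / r_node,
    }

# Hypothetical example values: Ropt = RL = 50 ohms.
for n in (1, 2, 3):
    d = asymmetric_dpa_design(n, r_opt=50.0, r_load=50.0)
    print(f"N = {n}: back-off {d['backoff_db']:.2f} dB, "
          f"Z2 = {d['z2']:.1f} ohm, TL1 ratio {d['tl1_ratio_backoff']:.0f}, "
          f"TL2 ratio {d['tl2_ratio']:.0f}")
```

The printout shows how quickly the back-off transformation ratio of TL1 grows with N, which is the bandwidth penalty described above.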
Parasitic Capacitances
The transistors' parasitic capacitances can limit the DPA bandwidth. The drain-source capacitance, $C_{ds}$, can affect the impedance presented to the amplifiers, thus limiting the bandwidth of the load-modulation network. One solution is to absorb the transistors' parasitic capacitances into the impedance inverter network using the reduced-length or lumped-element equivalent circuit of a transmission line, as described in Figure 5. Unfortunately, both equivalent circuits are valid only at a single frequency, and their frequency response deviates from the original circuit. Moreover, there is a limit on the parasitic capacitance that can be absorbed by the impedance inverter circuit. In Figure 5(a), a large capacitance leads to a large characteristic impedance and an unrealistic transmission line, while, in Figure 5(b), $C_{ds}$ should be smaller than $C = 1/(\omega_0 Z_0)$. We investigate the impedance presented to the carrier amplifier in these two cases and compare the results with a transmission-line-based impedance inverter.
The real part of the impedance in the former case is plotted in Figure 6 for different parasitic capacitances. The circuit bandwidth degrades as the capacitance C increases. Moreover, this technique requires a transmission line with a higher characteristic impedance (that is, a narrower width), which may not be feasible due to process limitations. To effectively use this circuit in an asymmetric DPA (that is, with a stronger peaking transistor), an extra capacitance should be added to the carrier transistor's output to equalize the two parasitic capacitances, thus further limiting the bandwidth. For the second case of Figure 5 (b), the real part of the Zc impedance is depicted in Figure 7 , indicating that the bandwidth at peak power and back-off is reduced compared to the results in Figure 2 . Other circuit techniques, including offset lines, resonant circuits, and compensation networks, have been proposed to cancel the Cds effects, but they are usually narrowband [6] , [9] , [36] .
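The two absorption schemes can be checked at the center frequency with ABCD matrices. The sketch below uses the standard equivalences for a quarter-wave line: a shorter line of impedance Z0/sin θ between two shunt capacitors with ω0·C·Z0 = cos θ, and a C-L-C pi network with ω0·L = Z0 and ω0·C = 1/Z0. These textbook relations are assumed to correspond to the circuits of Figure 5, and the component values are hypothetical.

```python
import numpy as np

def abcd_line(z0, theta):
    """ABCD matrix of a lossless line of electrical length theta (radians)."""
    return np.array([[np.cos(theta), 1j * z0 * np.sin(theta)],
                     [1j * np.sin(theta) / z0, np.cos(theta)]], dtype=complex)

def abcd_shunt_c(c, omega):
    """ABCD matrix of a shunt capacitor."""
    return np.array([[1.0, 0.0], [1j * omega * c, 1.0]], dtype=complex)

def abcd_series_l(l, omega):
    """ABCD matrix of a series inductor."""
    return np.array([[1.0, 1j * omega * l], [0.0, 1.0]], dtype=complex)

def reduced_length_equivalent(z0, c_shunt, omega0):
    """Line impedance and length that, with c_shunt at both ends, mimic a
    quarter-wave line of impedance z0 at omega0."""
    x = omega0 * c_shunt * z0
    if not 0.0 < x < 1.0:
        raise ValueError("capacitance too large to absorb: need omega0*C*Z0 < 1")
    theta = np.arccos(x)
    return z0 / np.sin(theta), theta

def lumped_pi_equivalent(z0, omega0):
    """Series L and shunt C of the pi-network equivalent at omega0."""
    return z0 / omega0, 1.0 / (omega0 * z0)

# Hypothetical values: 50-ohm inverter at 2 GHz, 0.5 pF to be absorbed.
z0, omega0, c_ds = 50.0, 2.0 * np.pi * 2.0e9, 0.5e-12
target = abcd_line(z0, np.pi / 2.0)

z_line, theta = reduced_length_equivalent(z0, c_ds, omega0)
shunt = abcd_shunt_c(c_ds, omega0)
reduced = shunt @ abcd_line(z_line, theta) @ shunt

l_pi, c_pi = lumped_pi_equivalent(z0, omega0)
pi_net = (abcd_shunt_c(c_pi, omega0) @ abcd_series_l(l_pi, omega0)
          @ abcd_shunt_c(c_pi, omega0))

print(f"reduced-length line: {z_line:.1f} ohm, {np.degrees(theta):.1f} degrees")
print(f"pi network: L = {l_pi * 1e9:.2f} nH, C = {c_pi * 1e12:.2f} pF")
```

Both networks reproduce the quarter-wave ABCD matrix only at ω0. Evaluating the same products at other frequencies shows the deviation discussed above.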
The gate-source capacitance, $C_{gs}$, also limits the bandwidth of the input power splitter and phase-alignment network. The impedance matching network used to match the transistors' input impedance to the power splitter can limit the bandwidth for a large $C_{gs}$. The $C_{gs}$ nonlinearity, especially for the class-C biased peaking transistor, means that the input impedance (and, as a result, the input power division ratio) depends on the input power. This degrades DPA linearity [12]. The gate-drain capacitance, $C_{gd}$, presents a nonlinear impedance at the transistors' input through the Miller effect. The impedance nonlinearity is more severe for the carrier amplifier, whose load impedance varies by a factor of 2, that is, from $R_{opt}$ at peak power to $2R_{opt}$ at 6-dB back-off. Therefore, $C_{gd}$ can limit the bandwidth and degrade the linearity of the DPA [13].
Input Network
The input power splitter and phase-shift network can constitute another source of bandwidth limitation. In a separate implementation of the power splitter (for example, a Wilkinson power splitter) and phase-shift network (a $\lambda/4$ transmission line), the bandwidth is limited mainly by the narrowband phase response of the $\lambda/4$ transmission line. A wider bandwidth can be achieved by merging the two functions in a quadrature coupler, such as a Lange or branch-line coupler. Further bandwidth enhancement can be achieved using a multisection Wilkinson power divider or branch-line coupler. However, this implementation increases the circuit size and may become infeasible for integrated circuit (IC) implementations at low frequencies.
DPA Bandwidth-Extension Techniques
In this section, we review bandwidth extension techniques for the DPA. Table 1 presents a performance summary for broadband DPAs using various bandwidth extension techniques.
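The percentages given in parentheses after the frequency ranges in this section appear to be fractional bandwidths. The small helper below assumes the usual definition relative to the arithmetic center frequency and checks it against three of the ranges quoted later; the function name is illustrative.

```python
def fractional_bandwidth_percent(f_low, f_high):
    """Fractional bandwidth in percent, relative to the arithmetic center
    frequency (assumed definition). Both frequencies use the same unit."""
    if not 0.0 < f_low < f_high:
        raise ValueError("need 0 < f_low < f_high")
    center = 0.5 * (f_low + f_high)
    return 100.0 * (f_high - f_low) / center

# Band edges in GHz as quoted in the text for the designs of [14], [15], [17].
quoted = {"[14]": (1.7, 2.6), "[15]": (0.7, 1.0), "[17]": (0.55, 1.1)}
for ref, (lo, hi) in quoted.items():
    print(f"{ref}: {lo}-{hi} GHz -> {fractional_bandwidth_percent(lo, hi):.1f}%")
```

With this definition, the three ranges give about 42%, 35%, and 67%, which agrees with the values reported for those designs.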
DPA With Modified Load-Modulation Network

Modified Impedance of Transmission Lines
As stated previously, in the conventional DPA architecture, assuming symmetrical transistors, the impedance inverter's impedance-transformation ratio is one at peak power and four at 6-dB back-off, thus limiting the DPA bandwidth at back-off. A number of DPA architectures with a modified transmission-line characteristic impedance were proposed to mitigate this issue.
In [14], the common load impedance was increased from $R_{opt}/2$ to a higher value:
This reduces the impedance-transformation ratio at 6-dB back-off to $2\sqrt{2} \approx 2.8$.
Therefore, the drain-efficiency bandwidth extends compared to that in the conventional DPA. The characteristic impedances of the transmission lines are chosen as Z R 2
The load impedance presented to the peaking amplifier at saturation is , R 2 opt which is higher than its optimum value. This leads to the degradation of the output power and efficiency at saturation. A GaN DPA designed by using this technique achieves 41-55% drain efficiency at 6-dB back-off at 1.7-2.6 GHz (42%).
A modified DPA architecture was proposed in [15] and [16], where the characteristic impedance of the impedance inverter was chosen to be identical to the load resistance, $R_L$, as shown in Figure 8. A frequency-independent impedance is thus presented to the carrier amplifier at back-off, improving the DPA bandwidth. Impedances $Z_c$ and $Z_p$ are manipulated to achieve broadband performance at peak power and back-off by properly controlling the relative phase and amplitude of the carrier and peaking-amplifier currents, $I_c$ and $I_p$. Asymmetric drain supply voltages are used for the carrier and peaking amplifiers, and it is shown that the power back-off level can be reconfigured by the ratio of the two supply voltages. Using this technique, a GaN DPA was presented in [15] with a 52-68% efficiency at 6-dB back-off at 0.7-1 GHz (35%). In [16], a broadband GaN DPA with two RF inputs was reported, achieving a 49-62% efficiency at 6-dB back-off at 1.5-2.4 GHz (47%). The need for unequal drain bias voltages is a drawback of this architecture, as one of the transistors should operate at a supply voltage that is lower than the maximum level enabled by the process. However, the design has the advantage of being able to reconfigure the back-off level to accommodate modulated signals with different PAPRs.

Figure 7. The real part of the impedance presented to the carrier amplifier using a lumped-element transmission line. The DPA bandwidth is now limited by the peak power and back-off impedances.

Another broadband DPA design approach was presented in [17], where the characteristic impedances of the two transmission lines were determined for an impedance transformation with a maximally flat frequency response. The resulting impedances are derived as
where a is the back-off level (0.5 for the 6-dB back-off).
The phase and magnitude of the current profiles are set based on the input power and frequency. This architecture requires asymmetric supply voltages for the carrier and peaking amplifier, related as
This DPA's theoretical efficiency is compared with the conventional DPA in Figure 9 , where it is evident that the DPA with a maximally flat frequency response provides a notable efficiency enhancement at back-off. A GaN DPA with two RF inputs is implemented based on this architecture. It exhibits a drain efficiency of 40-52% at 6-dB back-off at 0.55-1.1 GHz (67%). The main advantage is that the broadband operation is achieved using the optimum carrier and peaking impedances, while the need for asymmetric supply voltages limits the carrier transistor's power capability.
Two-Section Peaking Network
Yet another modified DPA architecture with two $\lambda/4$ transmission lines in the peaking network appears in Figure 10. The output impedance transformation network is eliminated, while the transmission lines' characteristic impedances are established to achieve a broadband response. This architecture has been extensively investigated, and different design criteria have been derived for its optimal operation [18]-[23].
In [18], the characteristic impedances of the transmission lines, assuming $R_L$ and $R_{opt}$ of 50 Ω, are chosen as $Z_1 = Z_2 = 50\sqrt{2} \approx 70.7\ \Omega$.
In the proposed architecture, impedance inverter TL1 performs a 100-to-50 Ω transformation at peak power and a 50-to-100 Ω transformation at 6-dB back-off. Therefore, the impedance-transformation ratio is two in both cases, leading to an extended bandwidth at back-off but a reduced bandwidth at peak power. A broadband DPA with a 40-45% drain efficiency at 6.5-dB back-off and a 2.1-2.7 GHz (25%) bandwidth was designed using this architecture, based on GaN class-E amplifiers.

Figure 8. The modified DPA architecture with the improved back-off bandwidth proposed in [15] and [16].

Figure 9. The theoretical drain efficiency of (a) a conventional DPA compared to (b) the DPA with a maximally flat frequency response [17].

Figure 10. The DPA architecture with the two-section peaking network for bandwidth extension at back-off.
In [23] , a generalized design approach was presented for this DPA architecture. Assuming transistors with the same optimum load resistance, it is shown that, to achieve broadband efficiency at back-off and peak power, the characteristic impedances of the transmission lines should be chosen as
, Z m
The impedance transformation ratio for TL1 is derived as $m^2$ at peak power and $4m^2$ at 6-dB back-off, while for TL2 and TL3, it is given by $m$ at peak power. It is evident that, for $m = 1$, this architecture provides the same results as the conventional DPA, while, for $m = 0.5$, the largest bandwidth is obtained at back-off. The DPAs reported in [19]-[21] are special cases of this architecture, with an $m$ of 0.6, 0.6, and 0.4, respectively. A GaN DPA presented in [23] was based on a class-E PA and an output combiner with $m = 0.5$, achieving a drain efficiency of 39-67% at 6-dB back-off at 2.1-2.66 GHz (24%).
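The trade-off controlled by m can be tabulated directly. The sketch below reads the TL1 ratios in the preceding paragraph as m² at peak power and 4m² at 6-dB back-off, which is an interpretation of the typeset expressions. It reports each ratio as a step of at least one, since a 1:2 and a 2:1 transformation are equally demanding, and the helper name is illustrative.

```python
import math

def tl1_transformation_steps(m):
    """Impedance steps of inverter TL1 in the two-section peaking network.

    ASSUMPTION: the ratios are m^2 at peak power and 4*m^2 at 6-dB back-off.
    Each ratio r is returned as max(r, 1/r), so 1.0 means no transformation.
    """
    if m <= 0:
        raise ValueError("m must be positive")

    def step(r):
        return max(r, 1.0 / r)

    return step(m ** 2), step(4.0 * m ** 2)

for m in (1.0, 1.0 / math.sqrt(2.0), 0.6, 0.5, 0.4):
    peak, backoff = tl1_transformation_steps(m)
    print(f"m = {m:.3f}: peak-power step {peak:.2f}, back-off step {backoff:.2f}")
```

With this reading, m = 1 recovers the conventional steps of one and four, m = 0.5 removes the transformation at back-off at the cost of a step of four at peak power, and m = 1/√2 gives a step of two in both states, which matches the 100-to-50 Ω and 50-to-100 Ω behavior described for [18].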
In [21] and [22], a GaN DPA was designed using this architecture, with drain efficiency of 35-58% at 6-dB back-off at 1-2.6 GHz (83%). A broad bandwidth is achieved but with large drain-efficiency variations, which indicates that the load-modulation network is not capable of providing a constant optimum-load resistance for the transistors across the bandwidth. This architecture was used in [24] to realize a laterally diffused metal-oxide semiconductor (LDMOS) broadband push-pull DPA. The DPA provides 700 W of output power and a more than 40% average drain efficiency for a signal with a 7.5-dB PAPR in the bandwidth of 522-762 MHz (37%). This architecture has a straightforward design procedure and can provide a wide bandwidth at back-off. Its drawbacks are the large drain-efficiency variations within the bandwidth and the extra $\lambda/4$ transmission line. More complicated load-modulation networks based on branch-line couplers were proposed in [25] and [26]. However, as is evident in Table 1, the reported bandwidths are lower than those achieved using the two-section peaking network with a simpler structure.
Peaking Network With Shunt Short-Circuited Stub
In [27], a $\lambda/4$ shunt short-circuited stub was added at the peaking amplifier's output to improve the load-modulation network's bandwidth, as shown in Figure 11. This stub modifies the impedance presented to the impedance-inverting transmission line to compensate for the impedance drop experienced by the carrier amplifier. The characteristic impedances of the impedance inverter and short-circuited stub are chosen as $Z_1 = 50\sqrt{2}\ \Omega$ and 120 Ω, respectively.
Simulation results for the carrier impedance, $Z_c$, at back-off and peak power for the proposed and conventional DPA architectures are compared in Figure 12. It is evident that the real part of the carrier impedance exhibits significant bandwidth improvement in the back-off. The larger impedance at the band edges can degrade the DPA efficiency at peak power. A GaN DPA designed using this technique provides a 42-53% drain efficiency at 6-dB back-off at 1.5-2.5 GHz (52%).
Another GaN DPA designed based on this technique, with two peaking amplifiers and one carrier amplifier, achieves a 40-47% drain efficiency at 8-dB back-off at 2-2.6 GHz (26%) [28]. The main advantage of this architecture is that broadband operation can be achieved using only an extra transmission line. However, the transmission line's characteristic impedance may become too large and thus impractical. A similar DPA architecture was proposed in [29], where a parallel inductor-capacitor (LC) network was used instead of the short-circuit stub. The parallel network resonates at the center of the frequency band and improves the carrier impedance's bandwidth at back-off. The DPA exhibits a 48-56% drain efficiency at 6-dB back-off at 700-950 MHz (30%).

Figure 11. The DPA bandwidth extension using a short-circuited stub [27].

Figure 12. The impedance presented to the carrier amplifier of the DPA with a short-circuited stub at peak power and 6-dB back-off (solid lines) compared to the conventional DPA (dashed lines).
Frequency-Response Optimization
A broadband-design approach for the DPA is based on the optimization of the overall frequency response [30]-[32]. In [30], a "real frequency" technique is used for the synthesis of matching networks. The DPA output network is represented by three two-port networks, shown in Figure 13. Their scattering parameters are determined from conditions that should be satisfied by the impedances at saturation and back-off. If a high back-off efficiency is the most important design target, the optimum load impedance, $Z_{c,opt,BO}$, should be presented to the transistors at back-off, while the impedances at saturation can reach only a suboptimum value. The load impedance, $Z_L$, should be optimized to provide a broadband response to the carrier amplifier at back-off. The output impedance of the peaking amplifier network, $Z_{p,out}$, should be close to an open circuit to avoid power leakage from the carrier amplifier at back-off. Using this technique, a broadband GaN DPA was designed in which drain efficiency of 40-47% at 5-6 dB back-off is achieved at 2.2-2.96 GHz (30%).
In [31] , the frequency-dependent back-off efficiency degradation is minimized by properly designing the carrier amplifier output matching network (OMN) to compensate for the frequency-sensitive impedance of the inverter. The impedance presented to the carrier amplifier at back-off exhibits a broadband real part. A GaN DPA designed based on this technique provides 53-65% drain efficiency at 6-dB back-off at 1.7-2.25 GHz (28%). In [32] , a multiobjective Bayesian optimization is used to design a broadband GaN DPA. The authors demonstrate that this optimization strategy outperforms a commercial electronic design tool's built-in optimizer. The DPA achieves 20-W output power and 45-54% average drain efficiency at 7-dB back-off for a 20-MHz single-carrier LTE signal in the bandwidth between 1.5 and 2.4 GHz (47%).
Parasitic Compensation
The bandwidth of the DPA can be limited by the transistors' parasitic capacitances. Furthermore, output parasitic components alter the impedances presented to the transistors' intrinsic drain nodes, leading to output power and efficiency variations across the bandwidth. Several parasitic-compensation techniques have been presented for the DPA [33]-[37]. A basic solution, discussed previously, is to absorb the output parasitic capacitances into the impedance inverter network using a transmission line's reduced-length or lumped-element equivalent circuit. Another technique is to use parallel inductors to resonate with the parasitic capacitances (see Figure 14). However, the cancellation is achieved at only a single frequency, and the loss of the inductors can degrade the DPA efficiency.
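The single-frequency nature of the inductive cancellation is easy to quantify. The sketch below sizes an ideal, lossless shunt inductor to resonate with a given parasitic capacitance at f0 and evaluates the susceptance that remains elsewhere in the band. The 1-pF and 2-GHz values are hypothetical, and the function names are illustrative.

```python
import math

def resonating_inductance(c_par, f0):
    """Shunt inductance (H) that resonates with c_par (F) at f0 (Hz)."""
    return 1.0 / ((2.0 * math.pi * f0) ** 2 * c_par)

def residual_susceptance(c_par, l_par, f):
    """Net shunt susceptance (S) of the parallel L-C pair at frequency f."""
    omega = 2.0 * math.pi * f
    return omega * c_par - 1.0 / (omega * l_par)

# Hypothetical example: 1 pF of output capacitance, 2-GHz center frequency.
c_par, f0 = 1.0e-12, 2.0e9
l_par = resonating_inductance(c_par, f0)
print(f"L = {l_par * 1e9:.2f} nH")
for scale in (0.8, 0.9, 1.0, 1.1, 1.2):
    b = residual_susceptance(c_par, l_par, scale * f0)
    print(f"f = {scale:.1f} f0: residual susceptance {b * 1e3:+.2f} mS")
```

The residual susceptance is zero only at f0 and changes sign across it, so the transistor sees an inductive load below the center frequency and a capacitive one above it.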
In [34], wideband reactive networks are cascaded with the output of the carrier and peaking amplifiers to compensate for their output parasitic components, as given in Figure 15. The compensation network is designed so that the overall two-port network (that is, the cascaded networks) provides the scattering parameters of $S_{11} = S_{22} = 0$ and $S_{21} = S_{12} = \pm 1$. Nevertheless, achieving these conditions across a wide bandwidth, especially the phase response, is not a trivial matter. The implemented GaN DPA achieves drain efficiency of 38-56% at 6-dB back-off at 3-3.6 GHz (18%). Another GaN DPA with parasitic compensation is presented in [35] and achieves 33-55% drain efficiency at 6-dB back-off at 1.5-3.8 GHz (87%). It is evident that the efficiency exhibits large variations in the reported bandwidth.
Figure 13. The DPA model based on two-port networks for the real-frequency design technique [30].

Figure 14. The parasitic capacitance cancellation using parallel inductors.

A design technique for a broadband impedance inverter in the presence of large parasitic capacitances was proposed in [36]. Shown in Figure 16, the DPA architecture is based on the two-section peaking network previously discussed. In the conventional parasitic-absorption technique, the parasitic capacitances of the carrier and peaking transistors are absorbed into transmission lines TL1 and TL2 by shortening their length and increasing their characteristic impedance. It is proposed in [36] to replace transmission line TL3 with an equivalent circuit consisting of two shunt inductors and one transmission line with a longer length. This topological transformation is performed so that the extra inductors can compensate for the extra capacitance introduced by shortening the length of transmission lines TL1 and TL2. It is evident that this cancellation can be obtained across a wide bandwidth. A DPA designed using this technique and implemented using silicon LDMOS transistors features average drain efficiency of 37-47% and average output power of 49 dBm at 650-950 MHz (37.5%) for a 20-MHz LTE signal with a 7.5-dB PAPR.
Post-Matching DPA
In the post-matching DPA architecture proposed in [38], the impedance matching networks of the carrier and peaking amplifiers are realized by simple low-pass networks to extend the bandwidth. Furthermore, a broadband impedance matching network is used at the output of the DPA to transform the load resistance into the optimum resistance for broadband operation (Figure 17). This is different from the conventional impedance matching network at the DPA output that transforms the 50-Ω load resistance to a fixed resistance (for example, $R_{opt}/2$). This post-matching network provides an appropriate frequency-dependent impedance for the low-order impedance inverters. The implemented GaN DPA achieves drain efficiency of 47-57% at 6-dB back-off at 1.7-2.6 GHz (43%). Modulation measurements using a 20-MHz LTE signal with a 10.5-dB PAPR indicate a higher than 40% average drain efficiency. In [39], second-harmonic short-circuit networks are included in the post-matching DPA to improve efficiency. The implemented GaN DPA provides drain efficiency of 47-54% at 6-dB back-off at 1.8-2.7 GHz (40%).
A modified version of this architecture was proposed in [40], where a multiple-resonance circuit was used for the peaking amplifier. The resonant network is designed to provide an optimum load resistance to the peaking transistor at the band center while providing an optimum susceptance, $B_{opt}(\omega)$, at its output to broaden the bandwidth of the lumped-element impedance inverter at the carrier amplifier. A broadband post-matching network transforms the load resistance into the optimum load impedance required at the common node. A DPA designed using this architecture achieved 42-58% drain efficiency at 6-dB back-off at 0.9-1.8 GHz (70.7%).

Figure 15. The DPA architecture with parasitic compensation and second-harmonic control [34].

Figure 16. The parasitic-compensated load-modulation network proposed in [36].

Figure 17. The post-matching DPA proposed in [38].
The continuous-mode theory of PAs was proposed in [41] to provide an extended design space for the fundamental and harmonic load impedances. The fundamental load impedance is extended from a resistance, $R_{opt}$, to a complex impedance, while the harmonic impedances are modified appropriately (for example, the short-circuit second-harmonic impedance is replaced with a reactive impedance). This extended design space enables the realization of broadband harmonic-tuned PAs. The idea was extended to a DPA in [42]. Its architecture is similar to the post-matching DPA, where a postharmonic tuning network has replaced the output-matching one. This network is designed to provide the optimum load impedance at fundamental and harmonic frequencies. The authors showed that, by using an optimally designed postharmonic tuning network in the DPA, the average output power and efficiency for a 20-MHz, 7.5-dB PAPR LTE signal can be improved by 2 dB and 10%, respectively. A GaN DPA based on this architecture features a 52-66% drain efficiency at 6-dB back-off at 1.65-2.75 GHz (50%). Another DPA using a similar architecture achieves 200-W output power and 40-52% drain efficiency at 6-dB back-off at 1.7-2.7 GHz (47%) [43]. In [44], the output parasitic impedance of the peaking amplifier is employed to provide the optimum load-impedance conditions for continuous-mode operation of the carrier transistor. The DPA achieves 46.5-63.5% drain efficiency at 6-dB back-off at 1.6-2.7 GHz (53%).
Distributed DPA
A distributed amplifier can provide broad bandwidth by absorbing the transistors' input and output parasitic capacitances into the transmission lines connected to the gate and drain [45]. A broadband distributed DPA architecture was proposed in [46], in which two DPAs and their driver amplifiers are used in the single-ended, dual-fed distributed structure, as diagrammed in Figure 18, without the need for a two-way power divider and combiner. This architecture inherits some features from the distributed amplifier, including the absorption of the peaking amplifiers' output capacitance into the output transmission line. However, the output parasitic capacitance of the carrier amplifiers and impedance inverters still limits the bandwidth. A GaN DPA reported in [46] provides an average power-added efficiency (PAE) of 15-25% at 2.06-2.22 GHz for a single-carrier wideband code-division multiple-access signal that has a 10-dB PAPR.
Dual-Input DPA
In the conventional DPA, the two transistors exhibit different drain-current profiles, gains, and input impedances. Several analog techniques have been proposed to mitigate this issue, including uneven input-power splitting, an asymmetric DPA architecture, and adaptive gate biasing [10]. A generic approach is to consider the DPA as a dual-input amplifier where the magnitude and phase of each input signal can be controlled separately to achieve optimal operation. The DPA bandwidth can be extended by a frequency-dependent input-signal distribution. The efficiency and gain of the DPA can be improved by adaptive input-signal splitting where most of the input power is delivered to the carrier transistor at back-off, while a larger portion is directed into the peaking transistor at peak power [10], [47]-[50]. A dual-input DPA can be integrated into a transmitter system to enable digital control of the two input signals with input power and frequency, as shown in Figure 19. This architecture also facilitates the implementation of digital predistortion algorithms to mitigate the DPA nonlinearity. However, the extra signal-processing overhead can limit its application in 5G wireless transmitters that have large modulation bandwidths and operate at mm-wave bands.
In [49], a 1-3-GHz, dual-RF input PA based on the Doherty-outphasing continuum was proposed, where the relative amplitude and phase of the two input signals are optimally controlled. The DPA circuit, designed using GaN high-electron-mobility transistors (HEMTs), is diagrammed in Figure 20. The broad bandwidth is achieved by absorbing the parasitic capacitances of the transistors into the matching circuits and, further, by employing stepped-impedance transformers. The DPA achieves 45-68% efficiency at peak power and 48-65% efficiency at 6-dB back-off at 1-3 GHz (100%). Nearly the same efficiency is obtained at peak power and back-off as a result of the optimum relative amplitude/phase of the two input signals.
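The percentage quoted in parentheses after a frequency range, such as "1-3 GHz (100%)", reads as a fractional bandwidth. The article does not state its definition, so the following minimal sketch assumes the common convention of bandwidth divided by the arithmetic center frequency; the function name is hypothetical, and some quoted figures may rest on a different center-frequency convention.

```python
def fractional_bandwidth_percent(f_low_ghz: float, f_high_ghz: float) -> float:
    """Bandwidth as a percentage of the arithmetic center frequency.

    Assumption: the center is (f_low + f_high) / 2. The article does not
    say which convention its percentages use.
    """
    if not 0 < f_low_ghz < f_high_ghz:
        raise ValueError("expected 0 < f_low_ghz < f_high_ghz")
    center_ghz = (f_low_ghz + f_high_ghz) / 2.0
    return 100.0 * (f_high_ghz - f_low_ghz) / center_ghz


if __name__ == "__main__":
    # Frequency ranges taken from the text of this section.
    for f_low, f_high in [(1.0, 3.0), (2.0, 2.45), (1.6, 2.0)]:
        fbw = fractional_bandwidth_percent(f_low, f_high)
        print(f"{f_low}-{f_high} GHz: {fbw:.1f}%")
```

Under this assumption, the 1-3-GHz range evaluates to 100%, which agrees with the figure quoted for the design in [49].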
Transformer-Based Power-Combining PA
A transformer-based voltage combiner was proposed in [51] to combine the RF power generated from several low-voltage CMOS amplifiers. The output-combiner circuit, given in Figure 21, operates as a series voltage adder, where the required output power can be controlled by turning the unit amplifiers on and off. This architecture can modulate the load impedance that each unit amplifier experiences. The bandwidth is limited by the output parasitic capacitances of the unit amplifiers, switches, and transformer. A 2.4-GHz PA implemented in a 0.13-μm CMOS process achieves 27-dBm peak output power with 32% drain efficiency at saturation. At 2.5-dB output-power back-off (when one of the four unit amplifiers is turned off), drain efficiency is 31.5%, which is very close to the drain efficiency at peak power. This PA architecture can provide broad bandwidth by combining small wideband PA cells. Moreover, it can be used to realize reconfigurable transmitters whose output power level can be controlled according to the operation scenario.
Figure 19. The transmitter architecture with a dual-input DPA [10]. DSP: digital signal processor; DAC: digital-to-analog converter; BPF: bandpass filter.
Figure 20. The broadband dual-input DPA circuit using stepped-impedance transformers reported in [49].
Figure 21. The conceptual transformer-based power-combining PA [51].
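The back-off level and load modulation of the series voltage combiner described above can be illustrated with a simple idealized model. This is a sketch under stated assumptions, not the analysis of [51]: each active unit amplifier is treated as an ideal voltage source of fixed amplitude, the transformers are lossless with a 1:1 turns ratio, and a switched-off unit presents a short circuit. The function name and the 50-Ω load are hypothetical.

```python
import math


def series_combiner_state(n_on: int, n_total: int, r_load_ohm: float):
    """Return (output back-off in dB, load seen by each active unit in ohms).

    Idealized assumptions: fixed-amplitude voltage sources, lossless 1:1
    transformers in series, and switched-off units acting as short circuits.
    """
    if not 0 < n_on <= n_total:
        raise ValueError("expected 0 < n_on <= n_total")
    # Output voltage scales with n_on into a fixed load, so power scales
    # with n_on squared.
    backoff_db = 20.0 * math.log10(n_total / n_on)
    # The same load current flows through every secondary, so each active
    # unit sees the load resistance divided by the number of active units.
    unit_load_ohm = r_load_ohm / n_on
    return backoff_db, unit_load_ohm


if __name__ == "__main__":
    for n_on in (4, 3, 2, 1):
        backoff_db, unit_load = series_combiner_state(n_on, 4, 50.0)
        print(f"{n_on}/4 on: back-off {backoff_db:.2f} dB, "
              f"unit load {unit_load:.2f} ohm")
```

With three of four units active, this model gives a back-off of about 2.5 dB, in line with the value quoted in the text, and the load seen by each remaining unit rises as units are switched off.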
Transformer-Less Load-Modulation PA
A transformer-less load-modulation architecture introduced in [52]-[54] does not require bandwidth-limiting transmission-line impedance transformers and offset lines. The load modulation is performed by broadband OMNs that transform two load impedances into the optimum values at peak power and back-off [52]. A PA architecture based on this technique is shown in Figure 22. The carrier amplifier's matching network is designed to present an optimum impedance to the transistor for maximum efficiency at back-off. It also provides a close-to-optimum impedance at peak power. The peaking amplifier's matching network offers an optimum impedance to the transistor at peak power. It should also exhibit high output impedance at back-off to prevent power leakage from the carrier amplifier. Finally, the output currents of the two amplifiers should be in phase at peak power for proper power combining. This condition can be satisfied using phase-alignment networks at the amplifiers' inputs. A GaN PA designed with this architecture achieves 40-45% drain efficiency at 6-dB back-off at 2-2.45 GHz (20%). In [54], a GaN PA with a series-connected load was designed using a similar technique. It achieves a 20-50% drain efficiency at 6-dB back-off at 1.6-2 GHz (22%).
IC DPA Design
From the standpoint of 5G user equipment and small-cell base stations, an IC implementation of the DPA is desired for size and volume reasons. Moreover, with the allocation of mm-wave frequency bands to 5G, IC realization is essential to achieve robust performance in the presence of high parasitic components and losses. However, most of the DPAs presented previously operate at relatively low frequencies (below 4 GHz in Table 1) and are implemented as discrete-component circuits where losses and the size of the passive components are not the most important concerns. The design of broadband IC DPAs is more challenging than that of the conventional broadband high-efficiency PAs [55], [56], since extra conditions must be satisfied at the peak and back-off output-power levels. Several issues should be considered in the IC design of broadband DPAs to address 5G requirements:
1) Parasitic capacitances can degrade the gain and limit the bandwidth.
2) Losses in transistors and passive components, mainly transmission lines and inductors, degrade efficiency.
3) The size of quarter-wavelength (λ/4) transmission lines and inductors in the impedance inverter and input-power splitter can become excessively large for IC implementations.
4) The transmission lines' current-density limitation imposes constraints on the lines' minimum width and hence limits the maximum realizable characteristic impedance.
5) Specific IC design rules, such as the minimum spacing and density of metal layers, can degrade DPA performance by limiting the transformers' coupling coefficient and the inductors' quality factor.
6) The low gain of the class-C biased peaking transistor at high frequencies degrades DPA gain and efficiency.
7) The mismatch between the gain and phase responses of the carrier and peaking amplifiers, which arises from their different bias conditions, makes it challenging to achieve high efficiency at a broad bandwidth.
8) In the presence of process variations and mismatches between the two amplifier paths, it is difficult to maintain amplitude and phase linearity across a wide power range, which is an essential requirement for 5G complex-modulated signals.
9) 5G signals' large PAPR (for example, 9.6 dB for a 64-quadrature amplitude modulation with orthogonal frequency-division multiplexing (OFDM)) requires an asymmetric DPA structure, which, as discussed in the "Output Network" section, has a larger impedance-transformation ratio at back-off and hence a narrower bandwidth. Moreover, the peaking transistor would have larger parasitic capacitances that further limit the bandwidth.
Figure 22. The broadband transformer-less load-modulation PA architecture [52]. IMN: input matching network.
These issues render the techniques discussed so far less effective, if not impractical, for integrated DPAs and indicate the need for further research. In the next section, we review integrated DPAs in the RF and mm-wave frequency bands.
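Issue 9 in the list above can be made concrete with the textbook relations for an ideal DPA, in which the transistors are modeled as ideal current sources and the carrier load is modulated between its peak-power and back-off values. The sketch below is an illustration under that assumption rather than a formula taken from this article; the function name is hypothetical.

```python
def ideal_dpa_ratios(backoff_db: float):
    """Return (carrier load-modulation ratio, peaking-to-carrier current ratio).

    Assumes the textbook ideal DPA: the efficiency peak at back-off sits
    20*log10(1 + alpha) dB below peak power, where alpha is the ratio of
    the peaking to the carrier maximum current, and the carrier load at
    back-off is (1 + alpha) times its value at peak power.
    """
    if backoff_db <= 0:
        raise ValueError("back-off must be positive")
    modulation_ratio = 10.0 ** (backoff_db / 20.0)
    alpha = modulation_ratio - 1.0
    return modulation_ratio, alpha


if __name__ == "__main__":
    # 6 dB is the symmetric case; 9.6 dB is the PAPR example from the text.
    for backoff_db in (6.0, 9.6):
        ratio, alpha = ideal_dpa_ratios(backoff_db)
        print(f"{backoff_db} dB back-off: load ratio {ratio:.2f}, "
              f"peaking/carrier current ratio {alpha:.2f}")
```

In this ideal model, moving the back-off point from 6 dB to 9.6 dB raises the required load-modulation ratio from about 2 to about 3 and calls for a peaking device roughly twice the size of the carrier device, which is consistent with the narrower bandwidth and larger peaking-transistor parasitics noted in the list.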
RF Bands
The sub-6-GHz frequency band, including unlicensed and various licensed spectra, will be deployed in 5G, especially for delivering fixed wireless services at long distances. The available design techniques for IC DPAs are investigated here to provide insights for further 5G developments.
A broadband DPA implemented in a gallium arsenide (GaAs) heterojunction bipolar-transistor (HBT) process [57] is presented in Figure 23. The transistors' output parasitic capacitances are absorbed into the lumped-element output network. The inductors are implemented off chip to lower their losses. For a 10-MHz LTE signal with a 7.5-dB PAPR, the DPA delivers an average output power of 27.5 dBm and a PAE of 36% at 1.85 GHz. An average PAE of more than 30% is obtained at 1.6-2.1 GHz (27%). A similar approach was used in [58] to design a broadband DPA in a 0.25-μm GaN-HEMT monolithic microwave IC (MMIC) process. The output network is designed to provide the same impedance-transformation ratio for the carrier and peaking amplifiers. The DPA achieves an average drain efficiency of 46-53% and average output power of 33.1 dBm at 2.1-2.7 GHz (25%) for a 10-MHz LTE signal with a 7.2-dB PAPR.
Figure 23. The broadband DPA circuit implemented in a GaAs HBT process [57].
Figure 24. The (a) impedance-inverter network and (b) 0.25-μm GaN-HEMT MMIC DPA reported in [59].
In [59], a compact impedance-inverter network was proposed using a T-structure composed of transmission lines and the transistors' output parasitic capacitances (Figure 24). A DPA fabricated using a 0.25-μm GaN-HEMT MMIC process achieves a peak output power of 35 dBm and a PAE of 38-50% at peak power and 24-37% at 9-dB back-off at 6.8-8.5 GHz (22%). A broadband 0.25-μm GaN-HEMT MMIC DPA based on the modified load-modulation network is presented in [60]. The DPA circuit is illustrated in Figure 25, where a power-combining network absorbs drain-source parasitic capacitances and provides asymmetric drain biases for the transistors. A Lange coupler is used as the input-power splitter with a broadband quadrature-phase response. The DPA provides 36 dBm of peak power and a 31-39% PAE at 9-dB back-off at 5.8-8.8 GHz (41%).
mm-Wave Bands
With the extensive deployment of mm-wave frequencies envisioned for 5G, new PA design approaches have been developed, such as [61] and [62], and more research is expected in mm-wave DPAs. There are only a few reports of DPAs at mm-wave frequencies to date [63]-[69]. One of the key issues for mm-wave DPAs concerns the high losses in the λ/4 transmission lines, which degrade the gain and PAE. In [63], an active phase-shift DPA architecture is proposed, where the λ/4 transmission line at the input of the peaking transistor is replaced with a preamplifier. The DPA implemented in a 45-nm silicon-on-insulator CMOS process provides 18 dBm of output power, a 23% peak PAE, a 17% PAE at 6-dB back-off, and a 7-dB gain at 42 GHz. This architecture's bandwidth can be extended using a broadband load for the preamplifier.
Figure 26. The (a) transformer-based DPA implemented in (b) a 40-nm CMOS process [64].
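The link between low gain and low PAE at mm-wave frequencies follows from the standard definition PAE = (P_out - P_in) / P_dc, which can be written as the drain efficiency multiplied by (1 - 1/G) for a linear power gain G. The short sketch below illustrates this relation; the function name and the 30% drain efficiency are hypothetical values chosen for illustration and are not measurements from the cited works.

```python
def pae_percent(drain_efficiency_percent: float, gain_db: float) -> float:
    """PAE from drain efficiency and power gain.

    Uses PAE = (P_out - P_in) / P_dc = eta_drain * (1 - 1/G), where G is
    the linear power gain at the operating point.
    """
    gain_linear = 10.0 ** (gain_db / 10.0)
    return drain_efficiency_percent * (1.0 - 1.0 / gain_linear)


if __name__ == "__main__":
    # Hypothetical 30% drain efficiency evaluated at several gain levels.
    for gain_db in (20.0, 10.0, 7.0, 3.0):
        print(f"gain {gain_db:4.1f} dB -> PAE {pae_percent(30.0, gain_db):.1f}%")
```

At 20 dB of gain the PAE is nearly equal to the drain efficiency, while at single-digit gains, which are typical at mm-wave frequencies, a noticeable share of the efficiency is lost to the input drive power.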
A DPA based on a mm-wave transformer was proposed in [64] . The circuit is shown in Figure 26 , where an asymmetric series power combiner with an LC tuning circuit at the output of the auxiliary amplifier is used to achieve broadband performance. Moreover, each amplifier is implemented as two parallel branches with smaller devices to further extend the bandwidth. A 40-nm CMOS DPA designed using this technique achieves a 3-dB bandwidth of 60-81 GHz (30%). The output power of 20.1 dBm and peak PAE of 11.4% are maintained across 58-77 GHz. The 6-dB back-off PAE is 7% at 72 GHz.
In [65] , a two-section peaking network architecture was adopted to design a broadband DPA that covers multiple mm-wave 5G frequency bands. The transmission lines are replaced by lumped-element circuits, as demonstrated in Figure 27 , in which two pairs of coupled inductors, L1-L2 and L3-L2, are realized as transformers. The DPA circuit is composed of differential output and driver amplifier stages, an input quadrature hybrid, and varactor-loaded transmission lines to adjust the relative phase shift of the carrier and peaking paths. A power-aware, adaptive, uneven-feeding scheme is used to gradually deliver more power to the peaking amplifier as the input power increases. Different varactor settings are used for the 28-, 37-, and 39-GHz bands. The DPA, implemented in a 130-nm silicon-germanium (SiGe) bipolar-CMOS (BiCMOS) process, achieves a 3-dB small-signal-gain bandwidth of 23.3-39.7 GHz (52%) and 1-dB saturated output-power bandwidth of 28-42 GHz (40%), collectively, in the two settings. The DPA further demonstrates 17-dBm peak output power, 20-23% peak PAE, and 13-17% PAE at a roughly 6-dB back-off.
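Replacing a λ/4 line with a lumped-element circuit can be illustrated with the textbook C-L-C π-network, which reproduces the ABCD matrix of a quarter-wavelength line at a single design frequency. This is a generic sketch and not the network of [65], which relies on coupled inductors realized as transformers; the 50-Ω impedance, the 28-GHz design frequency, and all function names are assumptions made for illustration.

```python
import math


def pi_inverter_values(z0_ohm: float, f0_hz: float):
    """Series L (henries) and shunt C (farads) of a C-L-C pi-network that
    acts as a quarter-wave impedance inverter of impedance z0 at f0."""
    w0 = 2.0 * math.pi * f0_hz
    return z0_ohm / w0, 1.0 / (w0 * z0_ohm)


def pi_abcd(l_h: float, c_f: float, f_hz: float):
    """ABCD parameters of a shunt-C, series-L, shunt-C network."""
    w = 2.0 * math.pi * f_hz
    z = 1j * w * l_h   # series impedance
    y = 1j * w * c_f   # each shunt admittance
    return 1 + z * y, z, y * (2 + z * y), 1 + z * y


if __name__ == "__main__":
    z0, f0 = 50.0, 28e9  # assumed values, for illustration only
    l_h, c_f = pi_inverter_values(z0, f0)
    a, b, c, d = pi_abcd(l_h, c_f, f0)
    print(f"L = {l_h * 1e9:.3f} nH, C = {c_f * 1e15:.1f} fF")
    # An ideal quarter-wave line has A = D = 0, B = j*Z0, C = j/Z0.
    print(f"|A| = {abs(a):.2e}, B = {b:.3f}, C = {c:.5f}")
```

For these assumed values, the elements come out at roughly 0.28 nH and 114 fF. The equivalence holds exactly only at the design frequency, so the bandwidth of such a lumped inverter still has to be checked for each design.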
A promising technique for simultaneous frequency and back-off reconfigurability in an mm-wave PA was proposed in [70]. The authors demonstrate that, by using an asymmetric power combiner that exploits PA-cell interactions, optimal impedances can be synthesized across the frequency and back-off reconfiguration. As seen in Figure 28, the output-power level can be controlled by switching the PA cells, while the impedance presented to the PA cells depends on the phase of the signals in all paths. Therefore, the input-signal phase can be adjusted to achieve broadband operation at a given output-power level. This architecture can be considered a generalized combination of the dual-input DPA and transformer-less load-modulated PA discussed in the "DPA Bandwidth Extension Techniques" section. A PA was designed using this technique with two combined paths, each composed of eight PA cells, and an input phase-shift network based on a variable delay line that has a varactor bank. The PA, implemented in a 130-nm SiGe BiCMOS process, provides peak output power of 23.7 dBm, peak efficiency of 34.5%, and 6-dB back-off efficiency of 16-22% across a broad mm-wave band of 30-55 GHz (62%).
Figure 27. The (a) mm-wave multiband DPA implemented in (b) a 130-nm SiGe BiCMOS process [65].
In summary, only a few of the bandwidth-extension techniques developed for the DPA have been adopted in IC implementations. The bandwidth and efficiency of the developed IC DPAs are also much lower than those of their discrete counterparts. While the worse performance mainly originates from the limitations of IC processes, new approaches should be developed to effectively use the available bandwidth-extension techniques in IC DPAs. For mm-wave bands, where the loss effects from the parasitic components are more critical, further research is expected for designing high-performance DPAs.
Conclusions
The DPA is a promising architecture for 5G wireless transmitters that enables the efficient amplification of complex-modulated signals with a large PAPR. To accommodate the unprecedented data-rate and frequency-band increases envisioned for 5G, the bandwidth of the DPA should be extended. In this article, we presented a comprehensive review of the DPA bandwidth-enhancement techniques and broadband-design methodologies published in the literature. Many techniques have been developed for low-frequency discrete DPA circuits, most of which cannot be directly employed in IC implementations. This indicates the need for further research to develop broadband design techniques that address the IC processes' challenges and limitations.
From the techniques investigated in the "DPA Bandwidth Extension Techniques" section, the DPA with modified transmission-line impedances is a feasible architecture for IC implementation. The two-section peaking network needs three λ/4 transmission lines that normally take an excessive on-chip area. However, as shown in the "IC DPA Design" section, it is possible to derive equivalent lumped-element circuits with a compact IC realization. The DPA with a short-circuited stub can be implemented on chip, with the parasitic capacitance of the peaking transistor absorbed into the stub circuit. The dual-input DPA architecture is useful in transmitter systems where signal modulation, predistortion, and conditioning can be performed in the digital domain and applied to the DPA to improve its performance. The transformer-based power-combining PA, originally developed for IC PAs, can be effectively used at mm-wave frequencies. The transformer-less load modulation of PAs can also be adopted in IC DPAs, as the large impedance inverters can be replaced by lumped-element circuits. It is expected that more research activities will focus on the design of integrated DPA circuits in the future, especially at mm-wave frequencies, to address the requirements of 5G wireless transmitters.
Figure 28. The (a) PA architecture with a simultaneous reconfigurable frequency and back-off implemented in (b) a 130-nm SiGe BiCMOS process [70].
