Abstract: This letter presents a high-efficiency, fully integrated 12-24 GHz CMOS power amplifier (PA) with small group delay variation for quasi-millimeter-wave applications. The power added efficiency (PAE) is maximized and the group delay variation is minimized over a wide frequency range by optimizing the on-chip input, output, and inter-stage matching circuits. In addition, stagger tuning is employed to realize excellent gain flatness. A two-stage CMOS PA based on the proposed methodology was designed, fabricated in 0.18 µm CMOS technology, and tested. A measured power gain ($|S_{21}|$) of 10.5 ± 0.7 dB and a measured group delay variation of ±20 ps are achieved over the frequency range of interest. The PA shows a maximum measured PAE of 26% with a DC power consumption of 50 mW.
Introduction
Due to the rapid growth of modern wireless communication technologies, the demand for multiband or wideband RF front-ends operating in higher frequency bands is growing [1]. Currently, CMOS is considered the most attractive process for low-cost, high-performance, fully integrated transceivers [2]. However, some inherent characteristics of CMOS, such as a low unity-power-gain frequency, the low quality factor of on-chip inductors, and the low oxide breakdown voltage of deep sub-micron technologies, make the power amplifier one of the most critical blocks in CMOS transceiver design [2, 3]. Improving the power added efficiency (PAE), the output power, and the group delay are challenging tasks in CMOS power amplifier design [3]. Different techniques, such as adaptive biasing [4] and reversed body bias [5], were reported to increase the PAE of narrowband 24 GHz power amplifiers. In this work, by contrast, the PAE is maximized over a wide band by optimizing the matching circuits around the band of interest. In addition, in wideband amplifier design, improving the group delay is another critical issue in order to mitigate signal distortion, especially for impulse-based systems [6]. Kyoung et al. [7] introduced an off-chip negative group delay (NGD) circuit in the feedback loop to compensate the group delay variations of an InGaP/GaAs heterojunction bipolar transistor (HBT) monolithic microwave integrated circuit (MMIC) amplifier. J. Chen et al. [8] introduced a 6 to 26.5 GHz stacked PA with a high saturated output power of 26.1 dBm using 45-nm CMOS SOI technology. However, both solutions rely on advanced, costly technologies, which in turn increase the cost of the wireless system. The objective of this work was to develop a wideband PA with high PAE, small group delay variation, and flat gain using a 0.18 µm CMOS process for quasi-millimeter-wave applications.
Technology challenges
The proposed PA was fabricated using a deep N-well 0.18 µm CMOS technology, which provides CMOS transistors with a cut-off frequency ($f_t$) of 50 GHz and a maximum oscillation frequency ($f_{max}$) of 55 GHz. Few amplifiers using this technology have been reported in the literature beyond 15 GHz, as it is difficult to achieve high output power and high gain simultaneously over a wide band because of the low oxide breakdown voltage of deep sub-micron technologies [9]. Therefore, small cascode pairs with a high drain-to-source breakdown voltage are used in this work to reduce parasitics and achieve a higher gain. In addition, the spiral inductors provided by the foundry have a self-resonance frequency around the quasi-millimeter-wave band if the inductor is larger than 3 nH, as shown in Fig. 1. Furthermore, the inductor values deviate from their nominal values beyond 15 GHz. For example, the 1.1 nH foundry inductor, as illustrated in Fig. 1, has a value of 1.3 nH at 15 GHz and 1.9 nH at 27 GHz, which corresponds to an error of more than 70%. This error increases exponentially for larger inductors. To overcome these two problems, all of the inductors used in the proposed PA were designed and extracted using the ADS Momentum electromagnetic simulator. In addition, in order to test the extracted model of the designed inductors, a custom-designed 1.1 nH inductor was fabricated separately and tested. The measurement results shown in Fig. 1 indicate good agreement between the extracted model and the measured inductor, with an error within about 6% up to 30 GHz. In contrast, the simulated results of the 1.1 nH foundry inductor deviate from the measured results starting from 15 GHz.
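The deviation of the foundry inductor can be checked with a few lines of arithmetic. The sketch below uses only the rounded values quoted in the text (1.1 nH nominal, 1.3 nH at 15 GHz, 1.9 nH at 27 GHz), which give errors of about 18% and 73%. The second function is an assumption that does not come from the letter: it treats the inductor as a lossless parallel LC resonator, for which the effective inductance is L0 / (1 − (f / f_SRF)²), and back-computes the self-resonance frequency implied by each point. Under that simplified model both points suggest a self-resonance at roughly 40 GHz, which would explain why the effective value rises so quickly above 15 GHz.

```python
import math

NOMINAL_NH = 1.1                        # nominal foundry inductor value (from the text)
EFFECTIVE_NH = {15.0: 1.3, 27.0: 1.9}   # frequency in GHz -> effective inductance in nH (from the text)


def relative_error_percent(effective_nh, nominal_nh=NOMINAL_NH):
    """Deviation of the effective inductance from the nominal value, in percent."""
    return 100.0 * (effective_nh - nominal_nh) / nominal_nh


def implied_srf_ghz(f_ghz, effective_nh, nominal_nh=NOMINAL_NH):
    """Self-resonance frequency implied by one (frequency, inductance) point.

    ASSUMPTION (not from the letter): lossless parallel-LC model,
    L_eff = L0 / (1 - (f / f_srf)**2), solved for f_srf.
    """
    return f_ghz / math.sqrt(1.0 - nominal_nh / effective_nh)


errors = {f: relative_error_percent(l) for f, l in EFFECTIVE_NH.items()}
srf = {f: implied_srf_ghz(f, l) for f, l in EFFECTIVE_NH.items()}

for f in EFFECTIVE_NH:
    print(f"{f:.0f} GHz: error {errors[f]:.1f} %, implied SRF {srf[f]:.1f} GHz")
```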
Circuit design
The schematic of the proposed two-stage PA is shown in Fig. 2. The stagger-tuning concept [6] is employed, where gain flatness is accomplished by using two different center frequencies, 20.5 GHz and 15.5 GHz, for the first and the second stage, respectively. The tuning center frequencies (peak gain frequencies) of the two stages are adjusted by optimizing the widths of transistors $M_1$, $M_2$ and $M_3$ to be 80, 96, and 264 µm, respectively, which control the current-gain cutoff frequencies
$$\omega_T \approx \frac{g_m}{C_{gs}}$$
where $\omega_T$ is the current-gain cutoff frequency, and $C_{gs}$ and $g_m$ are the gate-to-source capacitance and the transconductance, respectively. The first stage (driver stage) consists of a cascode configuration, as it achieves a high unilateral gain. The input impedance matching network formed by the gate inductor $L_{g1}$ and the capacitor $C_{g1}$ is employed to realize wideband input impedance matching. The output of the first stage is passed to the second stage (power stage) through an interstage matching circuit formed by MIM capacitors $C_{int}$ and $C_2$ and inductor $L_{int}$, which is designed and optimized to achieve maximum PAE and output power, as well as minimum group delay variation, across the 12 GHz to 24 GHz frequency band. The second stage is a common-source amplifier with a small source degeneration inductor $L_{s2}$ to improve the linearity and enhance the gain flatness. The output matching network, using MIM capacitor $C_{out}$ and shunt inductor $L_{shunt}$, is designed and optimized using load-pull simulation for maximum PAE and output power [10].
Power added efficiency and group delay analysis
Fig. 3 shows the small-signal equivalent circuit model used for input, output, and interstage impedance matching, as well as for the group delay analysis. The PAE is defined by equation (1):
$$\mathrm{PAE} = \frac{RFP_{out} - RFP_{in}}{DCP_{in}} \qquad (1)$$
where $RFP_{in}$, $RFP_{out}$, and $DCP_{in}$ are the RF input power to the device, the RF power delivered to the load, and the dissipated DC power, respectively. To improve the PAE of the proposed amplifier, first, wideband input impedance matching is necessary to increase the power available from the source to the device. Therefore, using equation (2) for the input impedance $Z_{in}$ of the proposed PA, an input matching circuit formed by the gate inductor $L_{g1}$ and capacitor $C_{g1}$ is employed to achieve wideband input matching. In addition, the source degeneration inductor $L_{s1}$ is used to enhance gain flatness and improve the matching and the linearity. Fig. 4(a) shows the frequency behavior of the normalized input impedance $Z_{in}$ on the Smith chart after adding the input matching circuit, where $Z_{in}$ stays close to the 50 Ω source impedance across the 12 GHz to 24 GHz operating band.
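As a quick numerical illustration of the PAE definition, the sketch below converts RF powers given in dBm to milliwatts and evaluates the ratio of added RF power to DC power. The operating point (3 dBm in, 12 dBm out) is hypothetical and chosen only for illustration; the 50 mW DC power is the consumption reported for the fabricated PA. The drain efficiency is computed alongside to show how much the input power term lowers the PAE.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def pae_percent(rf_in_dbm, rf_out_dbm, dc_mw):
    """PAE = (RFP_out - RFP_in) / DCP_in, returned in percent."""
    return 100.0 * (dbm_to_mw(rf_out_dbm) - dbm_to_mw(rf_in_dbm)) / dc_mw


def drain_efficiency_percent(rf_out_dbm, dc_mw):
    """Output RF power over DC power, ignoring the RF input power."""
    return 100.0 * dbm_to_mw(rf_out_dbm) / dc_mw


# HYPOTHETICAL operating point; only the 50 mW DC power comes from the letter.
RF_IN_DBM, RF_OUT_DBM, DC_MW = 3.0, 12.0, 50.0

pae = pae_percent(RF_IN_DBM, RF_OUT_DBM, DC_MW)
eta = drain_efficiency_percent(RF_OUT_DBM, DC_MW)
print(f"PAE = {pae:.1f} %, drain efficiency = {eta:.1f} %")
```

Because the input power is subtracted, the PAE is always below the drain efficiency, and the gap widens as the gain drops.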
Second, in the two-stage amplifier design, the output impedance of the first stage $Z_{out1}$ represents the source impedance $Z_{s2}$ of the second stage (power stage). Thus, the interstage matching circuit is necessary to force the output impedance $Z_{out1}$ to follow the optimum Smith chart locus of the second-stage source impedance $Z_{s2}$, which leads to maximum PAE and maximum power delivered to the load across the target frequency band. Using equation (3) for the output impedance of the first stage after adding the interstage matching circuit, the inductors $L_{int}$ and $L_{D1}$ and the capacitors $C_2$ and $C_{int}$ are designed and optimized for maximum PAE and maximum power delivered to the load. Fig. 4(b) shows the normalized output impedance of the first stage $Z_{out1}$ after adding the interstage matching circuit, together with the normalized optimum source impedance $Z_{s2}$, obtained from source-pull simulation, that achieves maximum PAE across the band of interest. As can be seen in Fig. 4(b), the interstage matching circuit achieves good agreement between the output impedance of the first stage and the optimum source impedance of the second stage starting from 16 GHz.
Third, the output matching circuit is optimized using the shunt inductor $L_{shunt}$ so that the output impedance follows the Smith chart locus of the impedance needed for maximum power delivered to the load and maximum PAE, as indicated by the load-pull contours of the second stage. Calculating the optimum values of the inductors $L_{g1}$, $L_{int}$, $L_{shunt}$ and the capacitors $C_{g1}$, $C_2$ analytically would require solving at least five equations while also taking into consideration other design requirements such as gain, matching, and group delay. Therefore, the values of $L_{g1}$, $L_{int}$, $L_{shunt}$ and $C_2$ were optimized using ADS™ simulation to maximize the PAE along with the other design parameters. To optimize the PAE across the band, the input power is swept from −20 dBm to 10 dBm at each frequency point, and the maximum PAE is recorded and plotted in Fig. 5. Fig. 5 thus illustrates how the values of the capacitor $C_2$ and the inductors $L_{g1}$, $L_{int}$ and $L_{shunt}$ are optimized for maximum PAE across the 12 GHz to 24 GHz bandwidth.
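The sweep procedure described above (step the input power from −20 dBm to 10 dBm at each frequency and keep the largest PAE) can be sketched as follows. This is not the ADS setup used in the letter. The amplifier is replaced by a hypothetical soft-limiting transfer curve, the per-frequency gains are invented for illustration, and the DC power is assumed to stay constant at the reported 50 mW over the whole sweep. Only the sweep range, the 50 mW figure, and the 14 dBm saturated power are taken from the text.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def output_power_mw(p_in_mw, gain_db, p_sat_dbm, smoothness=2.0):
    """HYPOTHETICAL soft-limiting transfer curve: linear gain that saturates at p_sat."""
    linear_out = p_in_mw * 10.0 ** (gain_db / 10.0)
    p_sat = dbm_to_mw(p_sat_dbm)
    return linear_out / (1.0 + (linear_out / p_sat) ** smoothness) ** (1.0 / smoothness)


def peak_pae(gain_db, p_sat_dbm=14.0, dc_mw=50.0,
             start_dbm=-20.0, stop_dbm=10.0, step_db=0.5):
    """Sweep the input power and return (maximum PAE in percent, input power in dBm)."""
    steps = int(round((stop_dbm - start_dbm) / step_db))
    best_pae, best_pin = float("-inf"), start_dbm
    for i in range(steps + 1):
        p_in_dbm = start_dbm + i * step_db
        p_in_mw = dbm_to_mw(p_in_dbm)
        p_out_mw = output_power_mw(p_in_mw, gain_db, p_sat_dbm)
        pae = 100.0 * (p_out_mw - p_in_mw) / dc_mw  # constant DC power assumed
        if pae > best_pae:
            best_pae, best_pin = pae, p_in_dbm
    return best_pae, best_pin


# HYPOTHETICAL small-signal gains (dB) at three frequency points (GHz).
HYPOTHETICAL_GAIN_DB = {14.0: 9.8, 18.0: 10.5, 22.0: 11.2}

peak_by_freq = {f: peak_pae(g) for f, g in HYPOTHETICAL_GAIN_DB.items()}
for f_ghz, (pae, p_in_dbm) in peak_by_freq.items():
    print(f"{f_ghz:.0f} GHz: peak PAE {pae:.1f} % at P_in = {p_in_dbm:.1f} dBm")
```

In an optimization loop, the same peak value would be recomputed for each candidate set of matching component values, and the set giving the highest and flattest peak PAE across the band would be retained.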
The group delay is an important criterion for pulse-based applications such as radar, as it is used as a measure of phase nonlinearity. The group delay is defined as the negative of the rate of change of the transfer function phase with respect to angular frequency [6]. By ignoring the coupling capacitances $C_{in}$, $C_{int}$, and $C_{out}$ and using the small-signal equivalent circuit shown in Fig. 3, the overall transfer function $H(s)$ can be simplified to equations (4), (5), and (6), from which approximate formulas for the phase of the transfer function $\theta(\omega)$ and the group delay $G_d(\omega)$ are given by equations (7) and (8):
Where:
For the proposed PA, based on the formulas derived for the transfer function and the group delay in equations (4) and (8), together with initial circuit simulations, it was determined that the inductors $L_{g1}$, $L_{int}$ and $L_{shunt}$ and the sizes of transistors $M_1$ and $M_3$ strongly affect the group delay variation and the gain flatness. Moreover, the gate widths of transistors $M_1$ and $M_3$ ($W_1$ and $W_3$) control the transconductance ($g_m$) and the gate-source capacitance ($C_{gs}$), and are selected to adjust the resonance frequencies of the two stages to accomplish high and flat gain, as described in the circuit design section. Equation (8) contains many variables, so it is difficult to solve directly for each of them; in addition, gain flatness is important and needs to be optimized together with the group delay and the PAE. Therefore, a graphical solution for optimizing the group delay and the power gain is shown in Fig. 6. After approximate values of these parameters are obtained, they are refined iteratively to agree with the values that give the maximum PAE shown in Fig. 5. Finally, using Figs. 5 and 6, the values of the capacitor $C_2$ and the inductors $L_{g1}$, $L_{int}$ and $L_{shunt}$ after EM simulation were selected to be 70 fF, 0.52 nH, 0.43 nH, and 0.43 nH, respectively.
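To make the link between stagger tuning, gain flatness, and group delay concrete, the sketch below cascades two generic single-tuned band-pass stages and evaluates the gain and the group delay, computed as the negative derivative of the phase with respect to angular frequency, across 12 GHz to 24 GHz. It is a simplified stand-in and not the small-signal model of Fig. 3: each stage is an ideal resonator with response 1 / (1 + jQ(f/f0 − f0/f)), and the loaded Q of 1.5 is a hypothetical value. Only the two center frequencies (20.5 GHz and 15.5 GHz) come from the text. A synchronously tuned pair at 18 GHz is included as a reference case.

```python
import cmath
import math

STAGGERED_GHZ = (20.5, 15.5)    # stage center frequencies from the text
SYNCHRONOUS_GHZ = (18.0, 18.0)  # reference case: both stages tuned to mid-band
Q_LOADED = 1.5                  # HYPOTHETICAL loaded quality factor of each stage


def stage(f_ghz, f0_ghz, q=Q_LOADED):
    """Single-tuned band-pass stage, normalized to unity gain at f0 (assumed model)."""
    detuning = q * (f_ghz / f0_ghz - f0_ghz / f_ghz)
    return 1.0 / complex(1.0, detuning)


def gain_db(f_ghz, centers):
    """Gain of the cascade in dB (stage responses multiply)."""
    h = complex(1.0, 0.0)
    for f0 in centers:
        h *= stage(f_ghz, f0)
    return 20.0 * math.log10(abs(h))


def phase_rad(f_ghz, centers):
    """Phase of the cascade; each stage stays within (-pi/2, pi/2), so no unwrapping is needed."""
    return sum(cmath.phase(stage(f_ghz, f0)) for f0 in centers)


def group_delay_ps(f_ghz, centers, df_ghz=1e-3):
    """G_d = -d(theta)/d(omega), by central difference, in picoseconds."""
    dtheta = phase_rad(f_ghz + df_ghz, centers) - phase_rad(f_ghz - df_ghz, centers)
    domega = 2.0 * math.pi * (2.0 * df_ghz) * 1e9  # rad/s
    return -dtheta / domega * 1e12


def spread(values):
    """Peak-to-peak variation of a list of values."""
    return max(values) - min(values)


band_ghz = [12.0 + 0.1 * i for i in range(121)]  # 12 GHz to 24 GHz

gain_ripple_staggered = spread([gain_db(f, STAGGERED_GHZ) for f in band_ghz])
gain_ripple_synchronous = spread([gain_db(f, SYNCHRONOUS_GHZ) for f in band_ghz])
delays_staggered = [group_delay_ps(f, STAGGERED_GHZ) for f in band_ghz]

print(f"gain ripple, staggered:   {gain_ripple_staggered:.2f} dB")
print(f"gain ripple, synchronous: {gain_ripple_synchronous:.2f} dB")
print(f"group delay spread, staggered: {spread(delays_staggered):.1f} ps")
```

With these idealized stages the staggered pair shows less in-band gain ripple than the synchronous pair, which is the qualitative effect the design relies on. The absolute numbers depend entirely on the assumed Q and do not reproduce the measured ±0.7 dB or ±20 ps.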
Measured results and discussion
The proposed PA has been designed and fabricated in 0.18 µm CMOS technology. All transmission lines, as well as the pads, were simulated and modeled using the ADS Momentum electromagnetic simulator before fabrication. Fig. 7 shows a micrograph of the proposed PA, which has a chip size of 0.653 × 0.59 mm² including the measurement pads. Fig. 8 compares the simulated and measured S-parameters. As shown in Fig. 8, the proposed PA attains a measured flat power gain of 10.5 ± 0.7 dB over the 12 GHz to 24 GHz frequency range. This gain flatness limits the group delay ripple. The proposed PA realizes a measured input return loss ($|S_{11}|$) of less than −8 dB and a measured output return loss ($|S_{22}|$) of less than −8.5 dB across the entire band from 12 to 24 GHz, as shown in Fig. 8. Such wideband input and output impedance matching increases the delivered output power and considerably enhances the PAE. The proposed PA achieves a small group delay variation of ±20 ps across the targeted frequency band from 12 GHz to 24 GHz, as can be seen in Fig. 9. The measurement setup used for the large-signal measurements consists of an Agilent signal generator (PSG 8257D) and an Agilent spectrum analyzer (PXA N9030A), in addition to a Cascade Summit probe station, Cascade Infinity 100 µm pitch GSG probes, and two cables. The input port is connected to the Agilent signal generator, while the Agilent spectrum analyzer is used to measure the output RF signal. As displayed in Fig. 10, the proposed PA delivers a saturated output power of 14 dBm across the band. In addition, measured output 1-dB compression points (OP1dB) of 10.5, 10.5, 12, and 11.5 dBm and PAE values of 16, 20, 26, and 24% are achieved at 14, 18, 20, and 22 GHz, respectively.
Fig. 6. Effect of $L_{g1}$, $L_{int}$ and $L_{shunt}$ on the power gain and on the group delay variation.
This high efficiency is obtained by properly designing and optimizing the input, inter-stage, and output matching networks to maximize the available power from the source. Finally, the proposed PA is stable and consumes 50 mW of DC power from a 1.8 V supply. Table I summarizes the performance of the fabricated PA in comparison with other published PAs.
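The measured figures quoted in this section are easier to compare once converted to linear units. The short sketch below does that conversion for the worst-case reflection magnitudes (−8 dB and −8.5 dB), the 14 dBm saturated output power, and the 50 mW consumption at 1.8 V. The function names are illustrative only. The accepted-power fraction uses the standard relation 1 − |Γ|², which assumes the reflection magnitude is the only loss at the port.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def accepted_fraction(reflection_db):
    """Fraction of incident power not reflected, 1 - |Gamma|^2, for |S11| or |S22| in dB."""
    return 1.0 - 10.0 ** (reflection_db / 10.0)


# Values taken from the measured results in the text.
S11_WORST_DB = -8.0
S22_WORST_DB = -8.5
P_SAT_DBM = 14.0
DC_POWER_MW = 50.0
SUPPLY_V = 1.8

input_accepted = accepted_fraction(S11_WORST_DB)
output_accepted = accepted_fraction(S22_WORST_DB)
p_sat_mw = dbm_to_mw(P_SAT_DBM)
supply_current_ma = DC_POWER_MW / SUPPLY_V

print(f"power accepted at input (worst case):  {100.0 * input_accepted:.1f} %")
print(f"power accepted at output (worst case): {100.0 * output_accepted:.1f} %")
print(f"saturated output power: {p_sat_mw:.1f} mW")
print(f"supply current: {supply_current_ma:.1f} mA")
```

In other words, even at the band edges where the matching is worst, roughly 84% of the incident power is accepted at the input port, and the saturated output of about 25 mW is obtained from a supply current of about 28 mA.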
Conclusions
In this letter, a new design method for realizing a wideband PA with high efficiency, minimum group delay variation, and flat gain was presented for quasi-millimeter-wave applications. A 12 to 24 GHz two-stage PA was designed and fabricated in 0.18 µm CMOS technology based on the proposed technique with a stagger-tuning approach. The proposed PA has a measured power gain of 10.5 ± 0.7 dB and an average saturated output power of 14 dBm across the bandwidth. In addition, it achieves a small group delay variation of ±20 ps and a maximum PAE of 26%, which is the highest among recently published 0.18 µm CMOS PAs, while consuming 50 mW of DC power.
