Abstract
This paper describes the design of a power amplifier (PA) for 802.11n WLAN fabricated in 65 nm CMOS technology. The PA utilizes 3.3 V thick gate oxide (5.2 nm) transistors and a two-stage differential configuration with integrated transformers for input and interstage matching. A methodology used to extract the layout parasitics from electromagnetic (EM) simulations is described. For a 72.2 Mbit/s, 64-QAM, 802.11n OFDM signal at an average and peak output power of 11.6 and 19.6 dBm, respectively, the measured EVM is 3.8%. The PA meets the spectral mask up to an average output power of 17 dBm.
Introduction
The power amplifier (PA) is a key building block in all RF transmitters. To lower the costs and allow full integration of a complete radio system-on-chip, it is highly desirable to integrate the entire transceiver and the PA in a single CMOS chip. However, integration of RF power amplifiers in low-cost CMOS technologies proves to be a challenging task [1].
While digital circuits benefit from technology scaling, it is becoming significantly harder to meet the stringent requirements on linearity, output power, and power efficiency of PAs at lower supply voltages and in the presence of large on-chip parasitics [2]. This has recently triggered extensive studies to investigate the impact of different circuit techniques, design methodologies, and design tradeoffs on the functionality of PAs in deep-submicron CMOS technologies [3]. In particular, the demand for higher data rates in wireless communication has led to an increased interest in modulation schemes utilizing both phase and envelope modulation, necessitating a special focus on design issues for linear CMOS PAs that amplify signals with a high Peak-to-Average-Power Ratio (PAPR), as in 802.11n WLAN. Due to the large PAPR values, there is an inherent conflict between power-added efficiency and linearity, as the PA has to back off significantly from the maximum output power in order not to cause significant distortion. Assuming a signal with a PAPR of 10 dB, the efficiency of an ideal Class-A amplifier cannot be higher than 5% [4]. Nonetheless, it is important to minimize power consumption to increase battery operation time and also minimize heat dissipation.
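The 5% figure follows from two textbook facts: an ideal Class-A stage reaches at most 50% efficiency at peak output power, and its DC power does not drop when the output is backed off. The short sketch below works through that arithmetic; the function name and the default peak efficiency are illustrative choices, not taken from [4].

```python
def class_a_backoff_efficiency(backoff_db, peak_efficiency=0.5):
    """Efficiency of an ideal Class-A stage operated backoff_db below peak power.

    The DC power of a Class-A stage is constant, so the efficiency falls in
    proportion to the (linear) output power.
    """
    return peak_efficiency * 10 ** (-backoff_db / 10)


# A signal with 10 dB PAPR forces a 10 dB average back-off from the peak power.
eta_avg = class_a_backoff_efficiency(10.0)
print(f"average efficiency: {eta_avg:.1%}")  # average efficiency: 5.0%
```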
Several high-performance PAs for WLAN have been fabricated in 180 nm [5, 6] and 90 nm [7] CMOS technologies. In this paper, we present the design and evaluation of a linear 2.4 GHz WLAN PA [8] in 65 nm CMOS supporting the IEEE 802.11n draft standard. The PA utilizes 3.3 V thick gate oxide CMOS transistors and integrated transformers for input and interstage matching. The output matching network is located off-chip, on an FR4 PCB, and is realized with lumped components. This paper discusses the design and implementation of the PA, including the circuit architecture, modeling and design of the transformers, the extraction of the layout parasitics, and the output matching network design, followed by the experimental results.
Design and implementation of the power amplifier
The PA utilizes 3.3 V thick gate oxide CMOS transistors with a gate length of 0.6 µm and integrated transformers (T 1 and T 2 ) for input and interstage matching. Figure 1 shows the differential design and the two amplifier stages, as well as the integrated transformers and tuning capacitors (C 1 and C 2 ). Transformers have not been commonly used in integrated CMOS PAs until recently, and it has been shown that they can provide sufficient performance for impedance matching purposes [9, 10]. Since the primary and secondary windings of the implemented transformers are galvanically isolated, we can use the center taps for either biasing of the input device in the cascode stage, as in the first transformer, or power supply of the cascode stage, as in the second transformer.
To ensure reliable operation and protect the transistors from hot electrons and breakdown due to high voltage peaks, each amplifying stage uses a pair of transistors in a cascode configuration, which also increases the output resistance and reduces unwanted capacitive feedback [11]. To provide the highest protection for the transistors, the gates of the cascode transistors should be biased at VDD, but a lower bias level can provide better performance [10]. As the 65 nm process was still under development, information on reliability issues due to hot carriers in the thin oxide devices was limited, and a conservative approach, using a cascode structure with two thick oxide devices with long channel length, was taken. The minimum channel length of the thick gate oxide transistor was 0.6 µm, which also leads to a low gain of the transistors. Therefore, to achieve a sufficiently high gain, large transistor widths were used: 0.8 and 6 mm for the first (M 1 and M 2 ) and second (M 3 and M 4 ) amplification stages, respectively. In [12] it is shown that a combination of a thin oxide and thick oxide device in the cascode structure can provide sufficient reliability for WLAN products in nanometer technologies. Using a combination of a thin oxide and thick oxide device could increase the gain of the PA, due to the shorter gate length, and lower the driving requirements of the first stage as the input impedance of the required thin oxide device can be higher.
Transformer model and losses
In [9] the relationship between the voltages and currents in the primary and secondary windings of the ideal transformer in Fig. 2 is described. Suppose that an impedance, Z s , is connected to the secondary side. The primary side then sees an impedance, Z p , which is related to Z s through the turns ratio, n, as defined in Eq. 1.
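As a rough numerical illustration of this impedance transformation, the sketch below uses the common ideal-transformer relation Z p = Z s / n², under the assumption that n is the secondary-to-primary turns ratio. Whether the paper's definition of n matches this convention is not confirmed here; with the opposite convention the factor becomes n² instead of 1/n². The function name and the 50 Ω load are hypothetical.

```python
def impedance_seen_at_primary(z_secondary, n):
    """Impedance seen at the primary of an ideal (lossless, k = 1) transformer.

    Assumption: n = N_secondary / N_primary, so voltage scales by n and
    current by 1/n from primary to secondary, giving Z_p = Z_s / n**2.
    """
    return z_secondary / n ** 2


# Hypothetical example: a 50 ohm load behind a 2:3 (primary:secondary) winding.
z_p = impedance_seen_at_primary(50.0, 3 / 2)
print(f"Z_p = {z_p:.1f} ohm")  # roughly 22.2 ohm
```

Complex loads can be passed in the same way, since the ideal transformer scales the real and imaginary parts by the same factor.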
Consequently, different winding schemes result in different impedance transformation ratios, which means that the transformer can be used for impedance matching and transformation purposes. A more detailed analytical model of the transformer is the T-model, described in [13], where the efficiency was derived to be as follows (for an optimum choice of L p and with tuning capacitors taken into account [14]):
In Eq. 2, Q P and Q S are the quality factors of the primary and secondary sides, respectively. From Eq. 2, we can see that the efficiency can be maximized by using a coupling factor, k, as close as possible to unity and by making the Q of the primary and secondary windings as large as possible. However, the number of turns and the inductances are limited by the substrate and interwinding parasitic capacitances and by the operating frequency, making it challenging to find an optimum transformer design. Moreover, in this PA design the inductances of the transformers are limited by the large capacitances of the transistors, since the transformers are used for input and interstage matching.
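The trend described here can be explored numerically. The sketch below evaluates a closed form that is commonly quoted for the maximum efficiency of a tuned on-chip transformer as a function of k, Q P , and Q S . It is an assumption that this form matches Eq. 2 term for term; the Q values in the example are hypothetical, and only the coupling factor of 0.7 comes from the text.

```python
import math


def transformer_efficiency(k, q_p, q_s):
    """Maximum efficiency of a tuned transformer (commonly quoted closed form).

    eta = 1 / (1 + 2*sqrt((1 + x)*x) + 2*x), with x = 1 / (Q_P * Q_S * k**2).
    Assumed, not confirmed, to correspond to the paper's Eq. 2.
    """
    x = 1.0 / (q_p * q_s * k ** 2)
    return 1.0 / (1.0 + 2.0 * math.sqrt((1.0 + x) * x) + 2.0 * x)


# Hypothetical winding quality factors; k = 0.7 is the value given in the text.
for q in (5, 10, 20):
    eta = transformer_efficiency(0.7, q, q)
    print(f"Q_P = Q_S = {q:2d}: eta = {eta:.2f} ({10 * math.log10(eta):.2f} dB)")
```

The loop shows the efficiency rising toward unity as the product Q P Q S k² grows, which is the behavior the paragraph describes.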
The designs of the planar square transformers are based on a model described in [15] and are implemented as coupled inductors. The model in Fig. 3 includes coupling to the substrate, inductances, and the coupling between the windings. To reduce the resistive losses at the primary and secondary sides, the two upper layers in the seven-metal stack are connected to form one conductor. The thicknesses of the aluminum and copper layers are 1.3 and 0.6 µm, respectively. The winding ratios of the transformers in Fig. 1 are 2:3 (T 1 ) and 3:2 (T 2 ), with a coupling factor of approximately 0.7 for both transformers. The power losses in a transformer can be estimated from the maximum available gain, G ma , which is calculated from the S-parameters according to Eqs. 3 and 4 and is a measure of the gain of a system when the source and load reflection coefficients are conjugately matched to S 11 and S 22 [16].
where k s is the stability factor defined as:
The simulated maximum available gain, G ma , for the input and interstage transformers is approximately -2.2 dB for both transformers at the target operating frequency of 2.45 GHz. G ma is plotted in Fig. 4 up to the approximate self-resonant frequency, which is close to 5 GHz. Besides biasing possibilities through the center taps of the transformer, the galvanic isolation also features an ESD protective function (for stand-alone devices) at the input of the PA. LNA implementations have shown that protection up to 5 kV is feasible [17] by using transformers.
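For readers who want to reproduce this kind of estimate from two-port S-parameters, the sketch below uses the standard textbook expressions for the stability factor and the maximum available gain. It is assumed, not confirmed, that these are the forms used in Eqs. 3 and 4. The S-parameter values in the example are hypothetical (a perfectly matched, reciprocal two-port with 2.2 dB insertion loss), chosen so that the result can be checked by hand.

```python
import math


def stability_factor(s11, s12, s21, s22):
    """Rollett stability factor of a two-port (standard textbook definition)."""
    delta = s11 * s22 - s12 * s21
    return (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))


def max_available_gain(s11, s12, s21, s22):
    """G_ma = |S21/S12| * (k - sqrt(k^2 - 1)), defined only for k >= 1."""
    k = stability_factor(s11, s12, s21, s22)
    if k < 1:
        raise ValueError("G_ma is undefined for k < 1 (potentially unstable two-port)")
    return abs(s21) / abs(s12) * (k - math.sqrt(k * k - 1))


# Hypothetical matched, reciprocal two-port with |S21| = |S12| = -2.2 dB.
a = 10 ** (-2.2 / 20)
g_ma = max_available_gain(0.0, a, a, 0.0)
g_ma_db = 10 * math.log10(g_ma)
lost_fraction = 1 - g_ma  # share of the input power dissipated in the two-port
print(f"G_ma = {g_ma_db:.1f} dB, power lost = {lost_fraction:.0%}")
```

Read this way, a G ma of -2.2 dB means that roughly 40% of the power entering each transformer is dissipated even under ideal conjugate matching.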
Parasitics extraction
The extracted interconnect parasitics and their associated losses are used to predict the frequency behavior and gain of the PA. In our simulation model, the signal traces are approximated by a series inductance and a series resistance, which are extracted through electromagnetic (EM) simulations using Agilent Advanced Design System (ADS).
To meet the current density limitations, and to reduce the losses in the drain and source connections at the output transistors, several metal layers were stacked on top of each other in the structure, as shown in Fig. 5. For such a structure, the capacitive coupling between gate, source, and drain is increased. Since not all of the metal layers are included in the existing transistor model, we need to add the parasitic capacitances, together with the associated dielectric losses [18], to our extended model. The values were extracted through EM simulations and added to the simulation model as series connections of capacitance and resistance between the gate, drain, and source, as seen in Fig. 6. Additionally, there is an interconnect resistance between the drain (M input ) and source (M casc ). However, by making the transistors wide with multiple fingers and using several metal layers as shown in Fig. 5, the resistive drop across this resistance was reduced to a few mV, and this resistance was therefore omitted in Fig. 6. The large transistors were split into gate widths of 20 µm (connected on both sides), resulting in 40 and 300 fingers for the first and second stages, respectively.
The S-parameters were converted to Z-parameters [19], the reciprocal network [18] was represented as a T-type connection, and the parasitic component values were approximated at the operating frequency by applying Eqs. 5 and 6. For differential signals, Eq. 5 was applied to calculate the differential impedance (Z dd ). For single-ended excitation, Eq. 6 was applied to calculate the input impedance at port 1 (Z se ) [20].
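The extraction flow for a signal trace can be sketched as follows, using the standard two-port S-to-Z conversion. Two assumptions are made here: the differential impedance is taken as Z11 + Z22 - Z12 - Z21, and the single-ended impedance is taken as the input impedance at port 1 with port 2 grounded. The paper's own Eqs. 5 and 6 may differ in form. The T-network values in the example are hypothetical and serve only as a round-trip check.

```python
import math


def s_to_z(s11, s12, s21, s22, z0=50.0):
    """Two-port S- to Z-parameter conversion for a real reference impedance z0."""
    d = (1 - s11) * (1 - s22) - s12 * s21
    z11 = z0 * ((1 + s11) * (1 - s22) + s12 * s21) / d
    z22 = z0 * ((1 - s11) * (1 + s22) + s12 * s21) / d
    return z11, z0 * 2 * s12 / d, z0 * 2 * s21 / d, z22


def z_to_s(z11, z12, z21, z22, z0=50.0):
    """Inverse conversion, used here only to build test data."""
    d = (z11 + z0) * (z22 + z0) - z12 * z21
    s11 = ((z11 - z0) * (z22 + z0) - z12 * z21) / d
    s22 = ((z11 + z0) * (z22 - z0) - z12 * z21) / d
    return s11, 2 * z12 * z0 / d, 2 * z21 * z0 / d, s22


def differential_impedance(z11, z12, z21, z22):
    """Impedance between ports 1 and 2 for a floating (differential) source."""
    return z11 + z22 - z12 - z21


def single_ended_impedance(z11, z12, z21, z22):
    """Input impedance at port 1 with port 2 grounded (assumed definition)."""
    return z11 - z12 * z21 / z22


def series_r_l(z, freq_hz):
    """Series resistance (ohm) and inductance (H) equivalent of z at freq_hz."""
    return z.real, z.imag / (2 * math.pi * freq_hz)


# Hypothetical trace as a T-network: two series arms and a shunt capacitance.
f0 = 2.45e9
w0 = 2 * math.pi * f0
z_arm = 0.5 + 1j * w0 * 0.2e-9      # 0.5 ohm and 0.2 nH per arm
z_shunt = 1 / (1j * w0 * 50e-15)    # 50 fF to ground
s = z_to_s(z_arm + z_shunt, z_shunt, z_shunt, z_arm + z_shunt)

z = s_to_z(*s)
r, l = series_r_l(differential_impedance(*z), f0)
print(f"R = {r:.2f} ohm, L = {l * 1e9:.2f} nH")  # recovers the two arms in series
```

For a T-network the shunt element cancels in the differential impedance, so the series resistance and inductance of the two arms are recovered directly.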
To accurately model the gate resistance at high frequencies, we added a vertical gate resistance [21] as in Fig. 7, which due to dopant segregation during silicidation can be relatively high, and thus will influence the gain of the PA and f max [22]. In Fig. 6, this resistance is denoted as R vgr , inserted in series with the lateral gate resistance, R lateral , which is calculated from the gate sheet resistivity and layout geometry. The sum of both resistance values for the first and second amplifier stages is approximately 1.6 Ω and 250 mΩ, respectively, and the vertical gate resistance represents approximately 20% of the total gate resistance in our design. The estimated impact of these resistances is a gain reduction of 1.6 dB.
For evaluation of the PA design, the schematic from Cadence was simulated in the ADS WLAN 802.11g testbench together with the layout of the fabricated testboard using RFIC Dynamic Link. In such a setup the influence of the parasitics can be evaluated.
Off-chip matching network
The PA utilizes an off-chip lumped-element balun [23] (Fig. 8) for differential-to-single-ended conversion and load impedance transformation. A pre-matching capacitor (C P ) was used before the balun to compensate for the bond-wire inductance and the interconnection lines from the PA to the balun. For simplicity we assume that the bond-wire inductance [24] and the interconnections from the PA to the balun can be represented by L P , and that the balun makes an impedance transformation from a resistive value R 2 to a resistive value R L . As the pre-matching capacitor is added in front of the balun, two impedance transformations are performed: one at the immediate output of the PA, from a complex load Z 1 to R 2 , and one from R 2 to R L . By adding matching components we may lower the individual impedance transformation ratios from Z 1 to R 2 and from R 2 to R L (Re{Z 1 } < R 2 < R L ), such that the overall matching network becomes less sensitive to component variations and a wider bandwidth can be achieved [25]. Based on these assumptions, for a matching network with a complex load Z 1 , the component values of C P , L Bx , and C Bx can be calculated according to Eqs. 7-9.
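As a rough illustration of the second transformation step (R 2 to R L ), the sketch below sizes a lattice-type LC balun with the widely used design rule that the inductive and capacitive reactances both equal sqrt(R 2 · R L ) at the design frequency. It is an assumption that the balun of [23] is of this type and that the paper's equations reduce to this rule. The value R 2 = 20 Ω is hypothetical, and the pre-matching capacitor C P is not covered.

```python
import math


def lattice_balun_values(r_balanced, r_unbalanced, freq_hz):
    """Inductor and capacitor values for a lattice-type LC balun.

    Design rule (assumed to apply here): X_L = X_C = sqrt(R_bal * R_unbal)
    at the design frequency.
    """
    w = 2 * math.pi * freq_hz
    z_c = math.sqrt(r_balanced * r_unbalanced)
    return z_c / w, 1.0 / (w * z_c)


# Hypothetical R_2 = 20 ohm (differential) transformed to a 50 ohm load at 2.45 GHz.
l_b, c_b = lattice_balun_values(20.0, 50.0, 2.45e9)
print(f"L_B = {l_b * 1e9:.2f} nH, C_B = {c_b * 1e12:.2f} pF")
```

With these example inputs both components come out at roughly 2 nH and 2 pF, which are practical values for lumped components on an FR4 board.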
Besides doubling the available voltage swing, the differential output stage allows each amplifier half to use a lower power enhancement ratio [3, 13], so that the efficiency can be higher than for a single-ended PA with a single L-match.
Experimental results
Figure 9 shows the photograph of the PA, with a size of 1 × 1 mm². The chip was directly bonded on the testboard, which was a two-layer 0.5 mm thick FR4 PCB with ε r of 4.2 and tan δ of 0.035, as shown in Fig. 10. The input power to the testboard was applied differentially with an external 50-100 Ohm balun connected to the signal source.
The target frequency of the PA was set to 2.45 GHz in the design work. After tuning of the output matching network, the best performance in terms of power gain was found at 2.48 GHz. For a frequency offset of ±30 MHz around 2.48 GHz, the drop in gain was measured to be 0.4 dB (2.45 GHz) and 0.2 dB (2.51 GHz), respectively.
The input and output 1 dB compression points (P1dB) at 2.48 GHz were found to be 2.6 and 19.6 dBm, with a power-added efficiency (PAE) of 5.8%. Plots of the measured average output power, EVM, and gain are provided in Figs. 11 and 12 for a 72.2 Mbit/s, 64-QAM, 802.11n OFDM signal, while using a supply voltage of 3.3 V. Compared to 802.11g, the data rate of 802.11n is increased from 54 to 72.2 Mbit/s by the use of more subcarriers, a shorter guard interval, and a 5/6 coding rate [26].
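The two data rates can be reproduced from the OFDM parameters of the respective modes: 48 data subcarriers, rate-3/4 coding, and a 4.0 µs symbol for the 54 Mbit/s 802.11g mode, and 52 data subcarriers, rate-5/6 coding, and a 3.6 µs symbol (short guard interval) for the 20 MHz single-stream 802.11n mode. Both use 64-QAM, which carries 6 bits per subcarrier. The helper name below is illustrative.

```python
def ofdm_rate_mbps(n_data_subcarriers, bits_per_subcarrier, coding_rate, symbol_time_us):
    """Data rate in Mbit/s: information bits per OFDM symbol / symbol time in µs."""
    return n_data_subcarriers * bits_per_subcarrier * coding_rate / symbol_time_us


rate_g = ofdm_rate_mbps(48, 6, 3 / 4, 4.0)   # 802.11g, 64-QAM, long guard interval
rate_n = ofdm_rate_mbps(52, 6, 5 / 6, 3.6)   # 802.11n, 64-QAM, short guard interval
print(f"802.11g: {rate_g:.1f} Mbit/s, 802.11n: {rate_n:.1f} Mbit/s")
```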
Analog Integr Circ Sig Process
For an input signal with a PAPR of 9.1 dB, the PA gave an EVM of 3.8% at an average output power of 11.6 dBm, and 5.4% at 13.4 dBm average output power with a PAE of 1.4%. Due to the high linearity requirements of OFDM modulation, the PA was biased in Class A, which leads to large bias currents and low PAE [6]. Additionally, simulations indicate that there is design space for load impedance improvement to get a higher output power.
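The link between Class-A biasing and low PAE can be made concrete with the compression-point figures quoted earlier (2.6 dBm in, 19.6 dBm out, 5.8% PAE). The sketch below inverts the usual definition PAE = (P out - P in ) / P DC to estimate the DC power. The resulting numbers are derived estimates, not reported measurements, and the current assumes that the whole PA draws from the single 3.3 V supply.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def dc_power_mw(p_out_dbm, p_in_dbm, pae):
    """DC power implied by PAE = (P_out - P_in) / P_DC, in milliwatts."""
    return (dbm_to_mw(p_out_dbm) - dbm_to_mw(p_in_dbm)) / pae


p_dc = dc_power_mw(19.6, 2.6, 0.058)   # operating point at the 1 dB compression point
i_dc = p_dc / 3.3                      # mA, assuming a single 3.3 V supply
print(f"P_DC = {p_dc:.0f} mW, I_DC = {i_dc:.0f} mA")  # roughly 1.5 W
```

Because a Class-A stage draws nearly the same DC power at every output level, an estimate of this size also accounts for the much lower PAE at the backed-off average power levels.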
The measured output spectrum of a WLAN 72.2 Mbit/s, 64-QAM, 802.11n OFDM signal for a 20 MHz channel is plotted in Fig. 13. As seen in the figure, the PA meets the spectral requirements of the 802.11n draft 2.05 [26]. In this measurement, the PA delivered an average output power of 17 dBm with an EVM of 13.1%.
Table 1 lists the performance of the implemented PA and of some recently presented WLAN PAs. The transformer-based PA in this work has an average output power similar to that of recently presented WLAN PAs. However, our PA is capable of running at a lower supply voltage than [6], maintains a low EVM for an 802.11n OFDM signal with more subcarriers than 802.11g [5, 6], and is implemented in a more advanced technology with increased resistances in the interconnects [2].
Summary
The paper has presented a transformer-based CMOS PA for WLAN 802.11n, fabricated in 65 nm CMOS. A methodology to extract the layout parasitics from electromagnetic (EM) simulations was described. The PA meets the EVM and spectral requirements for a 72.2 Mbit/s, 64-QAM, 802.11n OFDM signal, at an average output power of 11.6 dBm, with an EVM of 3.8%. 
