I. INTRODUCTION
WITH THE scaling of CMOS transistors, transistor speed has increased while supply voltages have decreased, and it has become more challenging to meet the requirements on output power (P_out), linearity, and efficiency in power amplifiers (PAs). With the improved speed of CMOS transistors, highly efficient switched PAs, such as class D/E, have gained increased interest in polar modulation [1], [2] and outphasing [3]-[10]. In the outphasing amplifier, an input signal s(t) containing both amplitude and phase modulation is divided into two constant-envelope phase-modulated signals s_1(t) and s_2(t), as in Fig. 1. The signals are amplified by efficient switched amplifiers A_1 and A_2 and are connected to a combiner with strict requirements on gain/phase matching, whose output y(t) is an amplified replica of s(t). With an isolating combiner, the linearity is high, as the amplifiers do not interact and the load impedance seen by each amplifier is fixed. With a nonisolating combiner, the load impedance seen by each amplifier varies with the outphasing angle. This makes class-E PAs less suitable and requires predistortion [11], as the switching characteristic and constant-envelope operation depend on the load impedance. A class-D PA mitigates this problem, as it can be considered an ideal voltage source, independent of the load [12], and thus maintains linearity even when nonisolating combiners, e.g., transformers, are used. The output power of class-D RF PAs has, until recently [8]-[10], been lower than +30 dBm [1]-[7]. This can be explained as follows: For a given supply voltage and load resistance, the P_out of a class-D PA is 3.9 dB lower than that of class-A/-B PAs (1).
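The 3.9-dB difference can be checked with textbook expressions. The following is a sketch under stated assumptions, not a reproduction of (1): the class-D stage is assumed to produce an ideal square wave between GND and V_DD whose fundamental (amplitude 2V_DD/π) is delivered to the load R_L, and the class-A/-B stage is assumed to reach a peak output swing of V_DD.

```latex
\begin{align}
P_{\mathrm{out,D}} &= \frac{1}{2R_L}\left(\frac{2V_{DD}}{\pi}\right)^{2}
                    = \frac{2V_{DD}^{2}}{\pi^{2}R_L}, \\
P_{\mathrm{out,A/B}} &= \frac{V_{DD}^{2}}{2R_L}, \\
10\log_{10}\frac{P_{\mathrm{out,D}}}{P_{\mathrm{out,A/B}}}
                   &= 10\log_{10}\frac{4}{\pi^{2}} \approx -3.9~\mathrm{dB}.
\end{align}
```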
For a higher P_out, either a high supply voltage or a small load impedance, i.e., a high impedance transformation ratio, is needed. A high impedance transformation ratio results in reduced bandwidth and low efficiency, particularly for on-chip matching networks [13]. A high supply voltage (and swing) can be used in class-D PAs by utilizing cascoding techniques to operate at two or three times the transistors' nominal supply voltage [1], [2], [4]-[8], [14]. In [1], [2], and [4]-[8], the voltage stress on the devices is limited to the nominal supply voltage by using two supplies in the output stage and in the drivers, i.e., 2 × V_DD and V_DD, as shown in Fig. 2(a). However, during RF operation, the time-dependent dielectric breakdown (TDDB) is proportional to the rms electric field (E) across the gate oxide [15], [16] and not to the peak E. Thus, the capacity of the transistors may not be fully exploited.
This brief presents the design and analysis of a 5.5-V class-D stage used in two fully integrated watt-level outphasing RF PAs with on-chip transformers in standard 130- and 65-nm CMOS technologies. The PAs, optimized for output power [9] and bandwidth [10], delivered +32.0 and +29.7 dBm, respectively. The class-D stage utilized a cascode configuration, driven by an ac-coupled low-voltage driver operating at 1.3 V (V_DD1), to allow a 5.5-V (V_DD2) supply without excessive device voltage stress. Compared to earlier works [9], [10], expressions for the rms E across the gate oxides, the reduction in on-resistance r_on, and the power dissipation of the parasitic drain capacitance of the common-source transistors in the proposed cascode stage are derived. The properties are compared with a conventional cascode (inverter) stage. The presented technique is also useful in wideband high-voltage drivers for base stations [17] and to enable a direct connection to the battery in more deeply scaled nanometer technologies like 45 nm.
The outline of this brief is as follows. Section II describes the operation of the proposed class-D stage, and in Section III, its properties are compared with a conventional cascode (inverter) stage. Section IV presents the design of the outphasing RF PAs, and in Section V, the measured RF performance is compared with other work. Section VI provides the conclusions.
Fig. 2. (a) Class-D stage used in [1], [2], [4]-[8]. In [1] and [4], nonoverlapping driver signals were used. (b) Proposed class-D stage. C_1-C_4 are MIM capacitors. T_4 is biased for improved reliability and to be able to affect the on-resistance r_on. The driver is a tapered buffer with tapering factor λ = 2.5 [9] (λ = 2 in [10]).
1549-7747/$31.00 © 2012 IEEE
II. OPERATION/RELIABILITY OF THE CLASS-D STAGE

A. Design and Operation of the Class-D Stage
The proposed class-D stage, denoted by PA in Figs 
The simulated gate voltages V_g,i and drain voltages V_1 and V_2 are shown in Fig. 4(a). When the driver signal V_x in Fig. 2(b) is high, V_g,3 is raised above the bias level and becomes V_DD2/2, reducing the r_on of T_3 (Section III-C). When V_x is low, V_g,3 is lowered below the bias level and becomes V_DD2/2 − V_DD1. This lowers V_gd,4 and V_ds,4 to ≈ V_g,3 − V_th,T3 if subthreshold conduction is neglected, but it also increases V_gd,3 and V_ds,3. With the reduced voltage swing at V_d,4, the power consumption due to switching C_d (3) is reduced by ≈50%, similar to that in [18] (Section III-B). T_1 and T_2 operate in the same way, but they are in their ON state (OFF state) when T_3 and T_4 are in their OFF state (ON state). Thus, by choosing suitable bias points and driving all transistors, the voltage stress can be distributed over the whole RF cycle, enabling a high supply voltage. As the PA stage is not limited to the sum of the transistors' nominal supply voltages, as the stage in Fig. 6(b) is, a 3.4-dB larger P_out can be achieved (2).
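The 3.4-dB figure is consistent with a simple supply-ratio estimate. The sketch below is one plausible reading of (2), not a reproduction of it: it assumes that P_out scales with the square of the supply voltage for a fixed load and compares the 5.5-V supply with the sum of the 1.2-V and 2.5-V nominal supplies of the two device types. The function name is hypothetical.

```python
import math


def supply_gain_db(v_high: float, v_low: float) -> float:
    """Output-power ratio in dB between two supply voltages for the same load.

    Assumption: P_out is proportional to the supply voltage squared.
    """
    return 20.0 * math.log10(v_high / v_low)


# Values taken from the text: nominal supplies of the thin- and thick-oxide
# devices (1.2 V and 2.5 V) and the 5.5-V supply of the proposed stage.
V_NOM_SUM = 1.2 + 2.5
V_DD2 = 5.5

gain_db = supply_gain_db(V_DD2, V_NOM_SUM)
print(f"Estimated P_out advantage: {gain_db:.1f} dB")
```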
B. Reliability Considerations
The reliability of CMOS transistors with respect to oxide degradation is particularly important to consider in circuits with large voltage swings, like PAs, but RF stress is not as damaging as dc stress [19], [20]. Two major degradation mechanisms are Fowler-Nordheim tunneling, due to a high E across the gate oxide, and hot carriers (HCs), i.e., accelerated carriers in the channel [21]. During RF operation, the TDDB is proportional to the rms E applied to the gate oxide [15], [16]. In [15], the devices had a similar time to failure when rms RF and dc stress were compared. Fig. 5 shows the simulated V_gd and V_ds for the class-D stage in [9] (similar in [10]). The rms fields E_gd, E_gs, and E_gb are ≈0.6 V/nm across the gate oxide, which is similar to the voltage stress in digital circuits and is expected to result in a lifetime of more than ten years [21].
HC stress typically occurs when V_ds is larger than the maximum rated V_ds while V_gs is at least V_ds/2 [16]. One sign of HC stress is an increased V_th. In the PAs presented in this brief, V_ds is high (≈ 1.5 × V_DD,nom) when the transistors are in their OFF state and V_gs is close to 0 V, minimizing the HC stress [14], [16]. Also, the simulated V_ds of the proposed class-D stage is smaller than that in class-AB PAs, where the cascode device is typically not driven by the driver and V_ds approaches 2 × V_DD,nom [16]. The drain/well breakdown voltage of the processes used is 10 V, far from the drain voltage, which switches between V_DD2 and GND. This was also seen in simulations for a wide range of impedances, including open and short loads.
III. ANALYSIS AND COMPARISON OF CLASS-D STAGES
A. Computed RMS Electric Fields
Assuming that the drain voltage is a square wave and is pulled to either V_DD2 or GND, the V_gd and V_gs voltages for the proposed [Fig. 6(a)] and conventional [Fig. 6(b)] class-D stages can be expressed as in Table I. In Fig. 6(a), the gate of the cascode device is assumed to operate between V_bias ± V_DD1/2. In Fig. 6(b), the gate voltage is held constant at its bias level V_bias. The rms fields E (4) in the proposed class-D stage in Fig. 6(a), i.e., the gate-drain field E_gd,T3,prop, the gate-source field E_gs,T3,prop, and the gate-drain field E_gd,T4,prop, can be expressed as in (5)-(7) as functions of V_bias. The fields are plotted in Fig. 7(a) for [9], assuming that V_DD2 = 5.5 V, V_DD1 = 1.3 V, and V_th,T3 = 0, since V_gs,3 = 0 in Fig. 4(a). The corresponding fields for the conventional cascode (inverter) stage in Fig. 6(b) can be expressed as in (8)-(10). These fields are plotted in Fig. 7(b), where the supply V_DD2 is assumed to be the same as in the proposed class-D stage.
The optimal bias points, where the lifetime of the devices is maximized (assuming that thin and thick gate oxide devices have the same characteristics regarding voltage stress), are marked with a circle. The value of V_bias,opt is 2.11 V for [9] (1.88 V for [10]) and corresponds well to the ideal bias point of 2.1 V (V_DD2/2 − V_DD1/2). The figure shows that, in the proposed class-D stage, a significantly higher bias level of the cascode device can be used while still having a comparable rms E between the gate-drain and gate-source, enabling the use of a high supply voltage. Thus, if only T_1 and T_4 are driven by the driver as in Fig. 6(b), either a lower V_DD2 or lower bias levels must be used to reduce the oxide stress. Reducing the supply voltage or adjusting the bias voltages, i.e., reducing the voltage swing and increasing r_on, lowers the P_out. Increasing the transistor widths would reduce r_on but introduce more capacitive losses. Fig. 8(a) shows the simulated rms E_gd over V_DD2 for the PA in [9] and the proposed class-D stage in Fig. 6(a) (0.6 V/nm is indicated with a dashed line). Fig. 8(b) shows the corresponding simulated E when only T_1 and T_4 are driven by the driver as in Fig. 6(b). Fig. 8(b) shows that the rms E values are about 25% higher, reducing the predicted oxide lifetime by more than a factor of ten [15]. For a V_DD2 larger than 4 V, the difference in P_out is ≈0.5 dB. When the bias level is lowered to 1.3 V, the reduction in P_out is ≈1 dB. The results demonstrate the benefits of the proposed class-D stage in terms of reduced device voltage stress and higher P_out compared to a conventional cascode stage.
Fig. 8. Simulated rms E_gd over supply voltage V_DD2 for the PA in [9] when (a) all transistors, i.e., T_1-T_4 as in Fig. 6(a), and (b) only the thin-oxide devices, T_1 and T_4 as in Fig. 6(b), are driven by the driver.
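The roughly 25% higher rms field in the conventional stage can be made plausible with an idealized two-level model of the gate-drain voltage of the common-source device T_4. The sketch below is not the content of Table I or of (5)-(10); it assumes a 50% duty cycle, a T_4 gate that swings between GND and V_DD1, a T_4 drain at GND in the ON state, and an OFF-state drain voltage equal to the cascode gate voltage minus V_th, with V_th = 0. All function names are hypothetical.

```python
import math


def rms_two_level(v_a: float, v_b: float) -> float:
    """RMS value of a 50%-duty square wave alternating between v_a and v_b."""
    return math.sqrt((v_a ** 2 + v_b ** 2) / 2.0)


def vgd_rms_t4(v_cascode_gate_off: float, v_dd1: float, v_th: float = 0.0) -> float:
    """RMS gate-drain voltage of T4 in the idealized model described above.

    ON state:  gate at v_dd1, drain at GND.
    OFF state: gate at GND, drain at (cascode gate voltage - v_th).
    """
    v_gd_on = v_dd1 - 0.0
    v_gd_off = 0.0 - (v_cascode_gate_off - v_th)
    return rms_two_level(v_gd_on, v_gd_off)


V_DD1, V_DD2 = 1.3, 5.5
v_bias = V_DD2 / 2 - V_DD1 / 2  # ideal bias point quoted in the text (2.1 V)

# Proposed stage: the cascode gate drops to v_bias - V_DD1/2 in the OFF state.
proposed = vgd_rms_t4(v_bias - V_DD1 / 2, V_DD1)
# Conventional stage: the cascode gate stays at v_bias.
conventional = vgd_rms_t4(v_bias, V_DD1)

increase = conventional / proposed - 1.0
print(f"V_bias = {v_bias:.2f} V")
print(f"rms V_gd(T4): proposed {proposed:.2f} V, conventional {conventional:.2f} V")
print(f"Relative increase in the conventional stage: {100 * increase:.0f}%")
```

Because both cases use the same oxide thickness, the ratio of rms voltages equals the ratio of rms fields in this model.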
B. Power Consumption Reduction of C d in the Common-Source Stage
As shown in Fig. 6(a) and (b), V_d,4 is reduced in the proposed class-D stage. With the reduced swing, the power consumption due to charging and discharging the parasitic drain capacitance C_d of T_4 is reduced, as seen in (3). The ratio of the power consumption due to switching C_d in the proposed class-D stage to that in the conventional cascode stage is expressed in (11) as a function of the bias level V_bias and is plotted in Fig. 9(a), using Table I and assuming that V_th = 0. For bias levels above V_DD1/2, the power consumption due to the parasitic drain capacitance is reduced. At the optimal bias point V_bias,opt, the ratio is ≈0.48 [9] (0.43 [10]); thus, the power consumption is reduced by 52% [9] (57% [10]) compared to the conventional cascode stage. Taking into account that the drain capacitance of the thick-oxide devices is charged to 5.5 V each RF cycle, the overall power reduction is 5%-10%. Notice that the same bias level would not be possible in the conventional cascode stage due to the higher level of voltage stress if the expected lifetime should remain the same.
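The quoted ratios of ≈0.48 and ≈0.43 can be reproduced with a simple swing-squared model. The sketch below is an assumption-based illustration rather than a transcription of (11): it takes the switching power as proportional to C_d times the squared drain swing of T_4, with the swing set by the cascode gate voltage in the OFF state (V_bias − V_DD1/2 in the proposed stage, V_bias in the conventional stage) and V_th = 0. It also assumes V_DD1 = 1.3 V for both [9] and [10]. The function name is hypothetical.

```python
def cd_power_ratio(v_bias: float, v_dd1: float, v_th: float = 0.0) -> float:
    """Ratio of C_d switching power, proposed stage over conventional stage.

    Assumption: P is proportional to C_d * V_swing**2 with the same
    frequency and C_d in both stages, so only the drain swing of T4 differs.
    """
    swing_proposed = v_bias - v_dd1 / 2.0 - v_th
    swing_conventional = v_bias - v_th
    if swing_proposed < 0 or swing_conventional <= 0:
        raise ValueError("bias level too low for this simplified model")
    return (swing_proposed / swing_conventional) ** 2


V_DD1 = 1.3  # driver supply from the text; assumed identical for [9] and [10]
ratio_9 = cd_power_ratio(2.11, V_DD1)   # V_bias,opt quoted for [9]
ratio_10 = cd_power_ratio(1.88, V_DD1)  # V_bias,opt quoted for [10]
print(f"[9]:  ratio {ratio_9:.2f}, reduction {100 * (1 - ratio_9):.0f}%")
print(f"[10]: ratio {ratio_10:.2f}, reduction {100 * (1 - ratio_10):.0f}%")
```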
Although the power consumption due to charging/discharging the drain capacitance of T_1 and T_4 is reduced, this reduction does not include the C_d of T_2 and T_3. The technique presented in [22] leads to the suppression of the third harmonic and higher drain efficiency, no short-circuit power dissipation, and reduced impact of the drain capacitance, but the fundamental tone is reduced; therefore, that approach was not used. The technique presented in this brief is applicable not only when using a 5.5-V supply but also in deeply scaled CMOS technologies like 45 nm [23]. Assuming thin and thick gate oxide devices with t_ox values of 1.5 and 3.0 nm and a 1.0-V driver, a 3.0-V supply (V_DD2) can be used to achieve an rms E of ≈0.6-0.7 V/nm if V_bias = V_DD2/2. A voltage of ≈3.0 V is a reasonable battery voltage [23], enabling direct connection of the PA to the battery instead of using voltage converters. However, if a supply voltage larger than 3.0 V is desirable, e.g., 5.5 V, a dc-dc converter is needed.
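The 45-nm estimate of ≈0.6-0.7 V/nm can be checked with the same kind of idealized two-level model. The sketch below uses the oxide thicknesses and voltages given in the text, but the waveform model is an assumption: 50% duty cycle, a cascode gate that swings V_bias ± V_DD1/2, output drain at GND (ON) or V_DD2 (OFF), a common-source gate that swings between GND and V_DD1, and V_th = 0. The variable and function names are hypothetical.

```python
import math


def rms_two_level(v_a: float, v_b: float) -> float:
    """RMS value of a 50%-duty square wave alternating between v_a and v_b."""
    return math.sqrt((v_a ** 2 + v_b ** 2) / 2.0)


# Values quoted in the text for the 45-nm example.
V_DD1, V_DD2 = 1.0, 3.0            # driver and output-stage supplies (V)
T_OX_THIN, T_OX_THICK = 1.5, 3.0   # gate-oxide thicknesses (nm)
v_bias = V_DD2 / 2

gate_high = v_bias + V_DD1 / 2     # cascode gate level in the ON state
gate_low = v_bias - V_DD1 / 2      # cascode gate level in the OFF state

# Thick-oxide cascode device: drain at GND (ON) or V_DD2 (OFF).
e_gd_cascode = rms_two_level(gate_high - 0.0, gate_low - V_DD2) / T_OX_THICK

# Thin-oxide common-source device: gate at V_DD1 and drain at GND (ON);
# gate at GND and drain at gate_low (OFF), with V_th assumed to be zero.
e_gd_cs = rms_two_level(V_DD1 - 0.0, 0.0 - gate_low) / T_OX_THIN

print(f"rms E_gd, cascode device:       {e_gd_cascode:.2f} V/nm")
print(f"rms E_gd, common-source device: {e_gd_cs:.2f} V/nm")
```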
C. On-Resistance Reduction
With the proposed class-D stage, r_on (12) is reduced compared to the conventional cascode stage, as V_g,3 is raised above its bias level. The ratio of the equivalent r_on, i.e., the sum of the r_on of T_3 and T_4, for the proposed and conventional cascode stages is computed in (13) and plotted in Fig. 9(b) using Table II. The ratio of the r_on of T_3 is also plotted with a dashed line according to (14). The ratios are smaller for low bias levels, but at V_bias,opt, the two ratios for [9] are 0.81 and 0.76, respectively, i.e., a reduction of the equivalent r_on by 19% and of the r_on of the cascode device by 24%. The corresponding reductions in r_on for [10] are 23% and 26%. Thus, for the same r_on, the proposed class-D stage requires smaller devices, reducing the associated losses and area.
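The reduction in the r_on of the cascode device can be approximated with a first-order triode model. The sketch below is not a transcription of (12)-(14): it assumes r_on is inversely proportional to the gate overdrive V_gs − V_th, that the source of T_3 sits at GND in the ON state, that V_th = 0, and that V_DD1 = 1.3 V for both [9] and [10]. The equivalent-r_on ratio in (13) depends on Table II and is not modeled here. The function name is hypothetical.

```python
def ron_ratio_t3(v_bias: float, v_dd1: float, v_th: float = 0.0) -> float:
    """r_on of T3 in the proposed stage divided by that in the conventional stage.

    Assumption: r_on is proportional to 1 / (V_gs - V_th).
    Conventional: ON-state gate at v_bias. Proposed: gate at v_bias + v_dd1/2.
    """
    overdrive_conventional = v_bias - v_th
    overdrive_proposed = v_bias + v_dd1 / 2.0 - v_th
    return overdrive_conventional / overdrive_proposed


V_DD1 = 1.3  # driver supply from the text; assumed identical for [9] and [10]
ratio_9 = ron_ratio_t3(2.11, V_DD1)   # V_bias,opt quoted for [9]
ratio_10 = ron_ratio_t3(1.88, V_DD1)  # V_bias,opt quoted for [10]
print(f"[9]:  r_on ratio {ratio_9:.2f}, reduction {100 * (1 - ratio_9):.0f}%")
print(f"[10]: r_on ratio {ratio_10:.2f}, reduction {100 * (1 - ratio_10):.0f}%")
```

In this model, the ratio grows toward 1 as V_bias increases, which agrees with the statement that the ratios are smaller for low bias levels.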
IV. DESIGN OF THE OUTPHASING RF PAS
For a high voltage swing and a high P_out in the PAs in Fig. 3(a), the class-D stages are combined using 1:1 on-chip transformers TR. The PAs, optimized for P_out [9] and bandwidth [10], used four and two transformers, respectively. Under TR, floating metal shields were placed in M1 and M2 to reduce the losses [24], but the optimal effect is obtained at maximum P_out, i.e., when the class-D outputs operate on complementary signals. Simulations of [10] showed a 1.2-dB higher P_out and an ≈30% (relative) higher efficiency with the floating shields than without. Tuning capacitors were placed at the primary windings in [9] to reduce the losses. In [10], the inductance of the transformer and the transistor sizes (and associated capacitances) were optimized at 1.95 GHz to omit the MIM tuning capacitors, which have a maximum allowed voltage of 5.5 V (10 V in [9]) and could therefore cause reliability issues with the 5.5-V supply (V_DD2). The top/bottom plates of the capacitors were connected to V_DD2/GND [9] (V_DD2/V_DD1 [10]).
V. MEASUREMENT RESULTS
The measurement setup is shown in Fig. 10. The 5.5-V supply voltage is provided by an off-chip power supply and was chosen by using the derived equations and by sweeping the bias level to achieve an E of ≈0.6-0.7 V/nm across the gate oxide and provide a reasonable lifetime. P_out was +32.0 dBm [9] with a DE and PAE of 20.1% and 15.3%, respectively, including all drivers, at 1.85 GHz. P_out was +29.7 dBm [10] at 5.5 V (+30.5 dBm at 6.0 V) with a DE and PAE of 30.2% and 26.6%, respectively, at 1.95 GHz. The PAs had 3-dB bandwidths of 0.9 GHz (1.2-2.1 GHz) [9] and 1.6 GHz (1.2-2.8 GHz) [10], respectively. The measured P_out and efficiency are 10% lower than the simulated performance at 65 °C, including the S-parameters of electromagnetic-simulated transformers and layout parasitics. The PA performance over temperature has not been characterized in measurements, but in simulations, the relative changes in P_out and efficiencies were < 15% between 25 °C and 125 °C.
Table III lists published state-of-the-art fully integrated CMOS class-D PAs. The PAs present state-of-the-art P_out [9] and bandwidth [10], comparable with [8] but larger than other PAs with state-of-the-art bandwidths (1.3 GHz) [25]. Modulation/spectral requirements were met for uplink WCDMA/LTE [9], [10].
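To put the measured output powers in linear units, the sketch below converts the reported dBm values to watts and estimates the corresponding dc power. It assumes the common definition DE = P_out / P_DC; the text does not state which dc contributions are included in each efficiency figure, so the dc estimates are indicative only. The function name is hypothetical.

```python
def dbm_to_watt(p_dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


# Measured values quoted in the text: (P_out in dBm, drain efficiency).
measurements = {"[9]": (32.0, 0.201), "[10]": (29.7, 0.302)}

results = {}
for ref, (p_dbm, drain_eff) in measurements.items():
    p_out_w = dbm_to_watt(p_dbm)
    p_dc_w = p_out_w / drain_eff  # assumes DE = P_out / P_DC
    results[ref] = (p_out_w, p_dc_w)
    print(f"{ref}: P_out = {p_out_w:.2f} W, estimated P_DC = {p_dc_w:.2f} W")
```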
In an initial reliability assessment [10], two devices were continuously operated for 168 h without performance degradation in P_out or efficiency. Thus, the 5.5-V supply does not seem to have any direct impact on device reliability. The required lifetime has to be put in relation to the employed standard (e.g., 2G GSM has a 12.5% duty cycle) and an expected use case. Assuming 4 h of talk time a day for 1.5 years (the estimated lifetime of a handset) [23], this corresponds to ≈275 h (365 · 1.5 · 4 · 0.125) of continuous PA operation. WLAN products may experience similar effective operating times, where a test time of 168 h at an elevated supply voltage is considered to cover more than five years of product reliability [16].
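The handset estimate above is a direct product of the quoted usage figures. A minimal sketch of the arithmetic follows; the variable and function names are hypothetical, and a 365-day year is assumed as in the text.

```python
def continuous_operation_hours(years: float, talk_hours_per_day: float,
                               duty_cycle: float, days_per_year: int = 365) -> float:
    """Equivalent hours of continuous PA operation over the product lifetime."""
    return days_per_year * years * talk_hours_per_day * duty_cycle


# Figures quoted in the text: 1.5 years, 4 h of talk time per day,
# and the 12.5% duty cycle of 2G GSM.
pa_hours = continuous_operation_hours(years=1.5, talk_hours_per_day=4.0,
                                      duty_cycle=0.125)
print(f"Equivalent continuous PA operation: {pa_hours:.2f} h")
```

The result rounds to the ≈275 h stated in the text.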
VI. CONCLUSION
This brief has presented the design and analysis of a 5.5-V class-D stage used in two fully integrated watt-level outphasing RF PAs in standard 1.2-/2.5-V 130- and 65-nm CMOS technologies. The class-D stage utilizes a cascode configuration, driven by an ac-coupled low-voltage driver, to allow a 5.5-V supply without excessive device voltage stress. The properties are compared with a conventional cascode (inverter) stage. To the authors' best knowledge, the class-D PAs presented are among the first fully integrated CMOS outphasing PAs reaching +30 dBm of output power, and they demonstrate state-of-the-art output power and bandwidth.
