Abstract: Switching processes such as pulse-width and pulse-density modulation have been used for many years in power electronics applications. Due to rapid scaling of semiconductor technology, similar approaches may be successfully applied to blocks in the radio architecture. This work outlines the implementation of a class-D power amplifier for RF applications in low-GHz frequency bands. We describe the motivation for using pulse-density modulation (PDM) to achieve linear amplitude modulation of a nominally nonlinear switching power amplifier. The amplifier achieves linearity suitable for wideband wireless standards, with peak efficiency of 43.5% at 1.95GHz and up to 20dBm output power. The system generates amplitude-modulated waveforms with up to 20MHz envelope bandwidth, demonstrating the validity of this approach for modern communication standards.
I. INTRODUCTION
After decades of successful process and technology scaling, active semiconductor devices now operate with current and power gain-bandwidths ($f_t$ and $f_{max}$) in excess of 100GHz [1] [2]. This enables efficient operation of deep-submicron CMOS technology at radio frequencies with conventional analog and digital circuitry. In modern digital CMOS technologies, core libraries of standard cells can operate at low-GHz carrier frequencies. Combined with extraction of layout parasitics, a standard-cell design flow allows direct and rapid synthesis of computational blocks at RF frequencies. This enables complex digital processing and control of RF switching waveforms.
Trends in processing speed and circuit design methodology will have a direct impact on many blocks in wireless systems. As this work demonstrates, techniques previously used for audio amplifiers, motor control, and power conversion may now be relevant for RF applications. Here we present an RF power amplifier (RF PA) implemented as a power digital-to-analog converter (DAC). The class-D PA operates with complementary NMOS and PMOS devices. With PMOS $f_t$ on the order of 40GHz in the 90nm process, switching power loss is substantially reduced compared to previous generations of CMOS technology. We implement digital control of the carrier amplitude using pulse-density modulation (PDM). Two stages of digital circuitry shape quantization noise away from the signal band. A passive filter with high quality factor (Q) attenuates out-of-band noise and reduces power loss from harmonics. Matching networks and filtering are implemented at the board level to achieve a higher quality factor. With appropriate passive components, we achieve unloaded Q in the range of 20-30 in the output network [3].

Pulse-density amplitude modulation provides an alternative to both conventional linear RF modulators and recent generations of polar and envelope tracking (ET) systems [4]-[7]. Polar systems improve the efficiency of traditional power amplifiers through dynamic regulation of the RF PA supply voltage. However, polar systems are subject to AM-AM and AM-PM distortion, and are difficult to design due to wideband spectral content in the amplitude and phase signals [4].
With pulse-density modulation, switching losses scale with carrier amplitude, improving efficiency at low power levels. This improves average efficiency for standards with high peak-to-average power ratio (PAPR). Linearity is also improved because carrier amplitude is a function of pulse density only, which can be accurately controlled by the digital system. By eliminating the need for dynamic supply regulation, the PDM system avoids the AM-AM distortion, AM-PM distortion, and power-supply noise problems inherent in supply-modulated systems [8].
II. ARCHITECTURE

The architecture, shown in Fig. 1, includes two stages of noise shaping, digital upconversion, and a class-D PA. The PA drives a pulse waveform with 50% duty cycle into the matching network and bandpass filter. The high-Q output filter attenuates the harmonics and selects only frequency content at the RF carrier fundamental. The PDM process operates as an RF-DAC, converting the high-dynamic-range baseband signal to a 1-bit representation operating at the carrier frequency. The first-stage ΔΣ modulator is a noise-shaping process that controls the second-stage pulse-density modulator. The second-stage modulator uses programmed binary codes to generate the pulse waveform at the RF carrier frequency. The upconversion block mixes the output with the carrier to generate the correct amplitude for the carrier.

The PDM process controls the amplitude of the carrier fundamental by changing the density of pulses at the carrier frequency. The pulse-density process creates a spectrum similar to amplitude modulation. As shown in Fig. 2, the output includes harmonics symmetric around the carrier that can contaminate the output spectrum. Harmonics can also be a source of power loss in the system. As discussed in [9], the power distribution in pulse-density modulated waveforms is a function of modulation depth (pulse density) and the quality factor of the output filter. For the example in Fig. 2, with loaded quality factor of Q~12 and pulse density of 50% (6dB power backoff), approximately 95% of the harmonic power is concentrated in the carrier [9]. This number ignores loss in active and passive components and falls off with decreasing pulse density. The filter is designed to be high impedance out of band to block harmonic power from reaching the antenna. For the voltage-mode class-D PA, a series-resonant L-matching network is used as a first-stage filter.
Parallel-resonant filters are also possible, but they increase power loss by presenting a low impedance at the output harmonics.
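The 6dB backoff quoted for 50% pulse density follows if the carrier fundamental is assumed to scale linearly with pulse density, so that output power scales with its square. The following Python sketch illustrates that idealized relationship only; the function name `backoff_db` is hypothetical, and losses in the switches and output network are ignored.

```python
import math


def backoff_db(pulse_density: float) -> float:
    """Power backoff in dB relative to full pulse density.

    Assumption: carrier amplitude is proportional to pulse density
    (idealized, lossless), so power goes as density squared.
    """
    if not 0.0 < pulse_density <= 1.0:
        raise ValueError("pulse density must be in (0, 1]")
    return -20.0 * math.log10(pulse_density)


# Tabulate a few densities; 0.5 corresponds to the 50% case in the text.
for density in (1.0, 0.5, 0.25):
    print(f"density {density:4.2f}: {backoff_db(density):5.2f} dB backoff")
```

Under this assumption, halving the pulse density costs about 6dB of output power, which matches the figure used in the text.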
A suitable noise-shaping process can be implemented digitally from many ΔΣ processes described in the literature [10]-[12]. As shown in Fig. 3, we used a digital error-feedback topology to extract 10 quantization levels from the 12-bit baseband representation. The ΔΣ process operates at 100MHz, generating peaks in the noise power spectrum 50MHz from the carrier. A decoder maps the 10 quantization levels to a 4-bit representation. The PDM block in Fig. 4 uses programmed codes, nine pulses in length. Polarity information is used to represent both positive and negative amplitudes by inverting the phase of the RF clock. The polarity bit improves linearity as the signal amplitude approaches zero and can be used to reduce the bandwidth of polar amplitude and phase signals [13].
For example, at half amplitude, the sequence generates alternating 'zeros' and 'ones' to put the closest harmonic at $\omega_0/2$, where $\omega_0$ is the carrier frequency. This concept is highlighted in Fig. 2. Pre-shaped codes are used to generate amplitudes increasing in steps of 1/9 of the full-scale amplitude. This places the worst-case tone at $\omega_0/9$, or 216MHz away from the 1.95GHz carrier. The spectral components of the pre-shaped codes are diversified such that the output tones are substantially reduced by the baseband ΔΣ process.

A third-order loop filter shapes the output noise with two complex zeros and a zero at DC. The spectrum of the baseband ΔΣ modulator is shown in Fig. 5 for a single tone at 5MHz. To flatten the in-band noise spectrum, a notch in the loop filter is placed approximately 12MHz from the carrier. Other filters are possible, including first and second order. A lower-order filter may have lower peak out-of-band noise, but a higher probability of tones in the output spectrum [10]. We chose third order to demonstrate the effective use of notches in the output spectrum.
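The two-stage modulation described above can be sketched in Python under several simplifying assumptions, all of which differ from the actual design. The sketch uses a first-order error-feedback loop rather than the third-order filter, and a hypothetical evenly spread code table in place of the pre-shaped codes, which are not listed in the text. It also emits one nine-pulse code per baseband sample, ignoring the ratio between the 100MHz modulator rate and the carrier. All function and constant names are illustrative.

```python
F_CARRIER_HZ = 1.95e9            # carrier frequency from the text
CODE_LEN = 9                     # pulses per code word
FULL_SCALE = 4095                # 12-bit magnitude (polarity handled separately)
STEP = FULL_SCALE // CODE_LEN    # 455 input counts per quantization level


def pdm_code(level: int) -> list[int]:
    """Return a 9-slot pulse pattern containing `level` evenly spread pulses.

    Hypothetical code table, not the pre-shaped codes used in the design.
    """
    if not 0 <= level <= CODE_LEN:
        raise ValueError("level must be in 0..9")
    return [((i + 1) * level) // CODE_LEN - (i * level) // CODE_LEN
            for i in range(CODE_LEN)]


def delta_sigma(samples: list[int]) -> list[int]:
    """First-order error-feedback quantizer mapping 0..4095 to levels 0..9.

    The previous quantization error is subtracted from the next input,
    which pushes quantization noise away from DC.
    """
    err = 0
    levels = []
    for x in samples:
        v = x - err                                        # feed back last error
        level = min(CODE_LEN, max(0, (v + STEP // 2) // STEP))
        err = level * STEP - v                             # new quantization error
        levels.append(level)
    return levels


# Constant mid-scale input: the average pulse density tracks the input value.
levels = delta_sigma([2048] * 900)
stream = [p for lv in levels for p in pdm_code(lv)]
density = sum(stream) / len(stream)
tone_offset_mhz = F_CARRIER_HZ / CODE_LEN / 1e6   # offset of a code-rate tone
print(f"average pulse density: {density:.4f}")
print(f"code repetition tone offset: {tone_offset_mhz:.1f} MHz")
```

Because the loop only ever outputs whole levels, an input between two levels is represented by dithering between adjacent codes, which is what spreads the energy of the fixed code-rate tones.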
III. CLASS-D POWER AMPLIFIER
Switching power amplifiers traditionally use n-channel power devices because of higher mobility. This leads to smaller devices and lower switching losses. Class-E and class-F soft-switching techniques improve efficiency under normal circumstances by reducing power lost to reactive parasitic components [14] . However, these amplifiers rely on the output network to shape the voltage waveform across the active device. This can result in high voltage stress. Modern CMOS processes have fast active devices that are constrained by low breakdown voltages. In this scenario, the complementary class-D output stage has advantages over more traditional topologies. By using a complementary output stage with active PMOS pull-up transistors, the drain voltage is a square wave. Class-E and class-F topologies are sensitive to variation in the load impedance. Even if the output network is ideal, class-F topologies can only approximate a square wave by blocking various combinations of switching harmonics. In a realistic environment, these topologies are limited by oxide stress in the active devices as the drain voltage can swing above the supply voltage. 
The topologies can be compared using two waveform figures of merit, $F_V = V_{pk}/V_{DC}$ and $F_I = I_{rms}/I_{DC}$. Here, $V_{pk}$ and $V_{DC}$ are the peak and DC voltages at the drain of the active device, and $I_{rms}$ and $I_{DC}$ are the RMS and DC currents. The $F_V$ figure of merit is indicative of oxide stress in the active device. Higher $F_V$ implies higher drain voltage swing relative to the supply. $F_I$ is related to the efficiency of the amplifier topology. Higher RMS current levels from high peak currents result in increased resistive losses in the switches. Table I compares the class-D topology to other switching-class amplifiers. Oxide stress as a function of supply voltage is lower for class-D, especially compared to class-E. This allows the devices to deliver higher average power to the load for a given peak oxide stress. Since each switch has half-wave sinusoidal drain current, $F_I$ is comparable to the other amplifier topologies. An important advantage of the class-D amplifier is that the output stage is always low impedance. Variation in the impedance of the output network caused by load pull from the antenna does not affect the $F_I$ and $F_V$ ratios. This makes the class-D output stage robust against impedance variation compared to other switching amplifier topologies.
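As a worked illustration of the two figures of merit, the Python sketch below evaluates them numerically for idealized class-D switch waveforms. It assumes the ratios are defined as peak-to-DC drain voltage and RMS-to-DC drain current, consistent with the symbol definitions in the text. It also assumes a square-wave drain voltage and a half-wave sinusoidal drain current. The helper name `figures_of_merit` is hypothetical.

```python
import math

N = 100_000  # samples per carrier period


def figures_of_merit(v: list[float], i: list[float]) -> tuple[float, float]:
    """Return (F_V, F_I), assuming F_V = V_pk / V_DC and F_I = I_rms / I_DC."""
    v_dc = sum(v) / len(v)
    i_dc = sum(i) / len(i)
    i_rms = math.sqrt(sum(x * x for x in i) / len(i))
    return max(v) / v_dc, i_rms / i_dc


# Midpoint sampling of one period; amplitudes are normalized to 1.
theta = [2.0 * math.pi * (k + 0.5) / N for k in range(N)]
# The switch conducts a half-sine current during the first half cycle while
# its drain is held low; in the second half cycle it is off and the drain
# sits at the supply.
i_sw = [math.sin(t) if t < math.pi else 0.0 for t in theta]
v_sw = [0.0 if t < math.pi else 1.0 for t in theta]

f_v, f_i = figures_of_merit(v_sw, i_sw)
print(f"F_V = {f_v:.3f}, F_I = {f_i:.3f}")
```

For these ideal waveforms the ratios come out to 2 and π/2. The drain never exceeds the supply, which is the property the text relies on when comparing oxide stress across topologies.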
The traditional disadvantage of the complementary class-D topology is the use of p-channel devices with lower $f_t$ than the n-channel devices. This increases capacitance, reducing efficiency compared to NMOS-only topologies. However, in deep-submicron processes, p-channel devices may have $f_t$ in excess of 40GHz [2]. Device scaling further improves operation frequencies and reduces switching losses [1]. In this design the advantage of using a low-impedance complementary output stage outweighs higher losses in the p-channel device.

Figure 7. Level shift and deadtime control.
Class-D is also more linear with pulse-density modulation than other amplifier classes. For the pulse-density modulation scheme, the class-D output stage has the advantage that the drain voltage waveform is not affected by pulse-skipping. For class-E or class-F, the startup time to achieve nominal steady-state operation can lead to distortion. This second-order effect is caused by reliance on the output network to shape the drain voltage waveform. Voltage-mode class-D amplifiers force a nearly ideal drain voltage because the switching node is always low impedance. Class-D amplifiers are less likely to need predistortion to compensate AM-AM and AM-PM distortion than class-E or class-F amplifiers due to inherent open-loop linearity in PDM operation.

Fig. 6 shows the schematic of the CMOS class-D PA. To achieve higher output power, the output stage uses a cascode device, allowing higher voltage operation. The output devices are thin-oxide transistors with high $f_t$, which can achieve higher efficiency than thick-oxide devices for the same supply voltage [15]. The PA operates with a high voltage rail, $V_{HV}$ = 2.0V, and a center voltage rail, $V_{half}$ = 1.0V. The $V_{half}$ node is shared by the high-side and low-side drivers such that current from driving the PMOS output device is reused to drive the NMOS device. Each cascode device was built to share the diffusion region of the switching device in order to reduce parasitic junction capacitance at the cascode node. The finger width for both cascode and switch was 5µm to minimize gate resistance.
IV. IMPLEMENTATION
The level shift block, shown in Fig. 7, is used to interface the nominal 1.0V digital processing block to the PA output devices. The architecture is similar to that proposed in [15]. Latching structures were used to restore signal swing on intermediate nodes. Coupling capacitors extend the frequency response of the digital signals. Deadtime control is implemented with a fixed delay of 60ps in the high-side and low-side signal paths. In the deadtime circuit, capacitors are placed between the high-side and low-side signals to ensure proper time alignment. The output stage was sized for maximum efficiency at 100mW output power (20dBm) with balanced switching and conduction loss for 2.0GHz operation. The output stage is hard switched to maintain accurate control of the PDM waveform. Hard switching is performed by inverter drivers, which are scaled with a fanout of three.

V. TESTING AND RESULTS

The circuit was implemented in a 90nm digital CMOS test chip. The die photo is shown in Fig. 8. The class-D PA die is 1.0mm × 1.0mm with active area for the PA of 0.15mm². The PDM circuit requires 0.2mm² active area on a separate test chip that is 1.2mm × 1.2mm. In the future the PA, PDM, and ΔΣ blocks will be integrated. The baseband ΔΣ modulator was implemented on an FPGA clocked at 100MHz. The 1.95GHz RF clock signal is injected directly into the PDM chip with an off-chip RF source. A LabVIEW PXI system drives the RF clock. The off-chip bandpass filter was implemented with a series-resonant L-matching network to transform the 50Ω load impedance to approximately 15Ω [16]. The 5.0nH inductor was implemented on the board and achieved an unloaded quality factor of Q~25 at 1.95GHz.

The top waveform in Fig. 9 is the PA clock; the bottom waveform is the output of the PA before the matching network. The PA skips pulses to modulate the amplitude of the RF carrier. The time-varying pulse pattern highlights active noise shaping as generated by the baseband and RF pulse-density modulator. Fig. 10 shows the downconverted time-domain output of the system. Here, the amplitude signal is a full-scale linear ramp. The amplitude is bipolar as controlled by the polarity bit. The polarity shift is controlled in the amplitude modulator by inverting the phase of the RF clock.
The inherent linearity of the system is demonstrated qualitatively by the linearity of the ramp signal. The ramp sweeps from full scale to zero amplitude in approximately 20µs. Fig. 11 shows the efficiency and output power of the class-D PA for supply voltages from 1.4V to 2.4V. The upper end of the voltage range corresponds to the peak oxide stress for the devices, which is 1.2V. The peak output power of 20dBm was achieved for $V_{HV}$ = 2.4V. Peak efficiency is 43.5% at an output power of 18.5dBm. Peak efficiency occurs at a supply voltage of $V_{HV}$ = 2.0V for the 1.95GHz carrier frequency. Efficiency measurements include power of the PA driver and losses in the L-matching network.

Efficiency declines as the carrier amplitude is reduced. The PA achieves 20% efficiency for pulse streams with only 2 out of 9 pulses delivered (13dB power backoff). High efficiency in power backoff improves the average efficiency of the system for wireless standards with high peak-to-average power ratios (PAPR). High linearity is achieved with the PDM scheme as measured in terms of carrier amplitude versus pulse density.

Fig. 13 shows the output spectrum of the PA driving a suppressed-carrier amplitude modulation signal. The AM signal has 20mV amplitude with zero DC component to suppress the carrier tone. This plot highlights the low-amplitude noise-shaping performance of the modulator. Near-band noise is suppressed substantially by the ΔΣ modulator. Peak noise occurs 50MHz on either side of the carrier frequency.

Fig. 14 shows the spectrum at the PA output for a WCDMA channel waveform. The signal is analogous to a single I or Q signal from the Cartesian representation. It is bandlimited to 3.84MHz with a PAPR of approximately 3.0dB. Peak noise occurs approximately 50MHz from the carrier. The bandwidth of the system is limited by the baseband sampling rate of 100MHz. With this sample rate the system can accurately generate waveforms with over 20MHz channel bandwidth.
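The output-power and efficiency figures reported above can be related with a short unit-conversion sketch in Python. It assumes efficiency is the ratio of RF output power to total DC power (the text states only that driver power and matching losses are included); the function names are hypothetical and the implied DC power is a derived estimate, not a measured value.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert power in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def implied_dc_power_mw(p_out_dbm: float, efficiency: float) -> float:
    """DC power implied by an output power and an efficiency.

    Assumption: efficiency = RF output power / total DC power.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return dbm_to_mw(p_out_dbm) / efficiency


peak_out_mw = dbm_to_mw(20.0)                        # peak output power
dc_at_peak_eff_mw = implied_dc_power_mw(18.5, 0.435)  # at peak efficiency
print(f"20 dBm = {peak_out_mw:.0f} mW")
print(f"implied DC power at 18.5 dBm, 43.5%: {dc_at_peak_eff_mw:.1f} mW")
```

The 20dBm peak corresponds to the 100mW sizing target mentioned for the output stage, and the peak-efficiency point implies a total DC draw somewhat above 160mW under the stated assumption.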
The limiting factor for the pulse-density modulation system is the out-of-band quantization noise. While this system demonstrates effective reconstruction of the RF channel waveforms, the noise level for the WCDMA waveform is not sufficient to meet the spectral-mask requirements for the standard. Additional research is needed to explore ways to further reduce out-of-band noise. Noise can be reduced by operating the baseband noise shaping at higher frequencies. In a fully integrated solution, it would be practical to operate the ΔΣ process at 250MHz or more, which would reduce the peak noise level by up to 10dB, depending on the attenuation of the bandpass filter. Another approach is to power-combine multiple PA outputs to add quantization levels for the digital modulator. Additional bits directly improve the signal-to-noise ratio and reduce peak out-of-band noise. Higher-order filters are another opportunity. Discrete bandpass filter components, including MEMS and acoustic devices, may provide substantial attenuation of out-of-band noise to make the PDM system practical for a range of wireless standards. Other wireless standards, such as Bluetooth, may provide an opportunity, since the spectral requirements of low-power systems are more flexible.
VI. CONCLUSION

Overall, this work demonstrates a new approach for the transmitter architecture that combines techniques in power management, data conversion, and RF circuit design. The 90nm CMOS test chip performs linear modulation of a nonlinear amplifier using pulse-density modulation. The digital architecture achieves high efficiency for a wide range of output power, high linearity, and is able to generate wideband RF amplitude waveforms. The class-D PA operates with 43.5% efficiency at 1.95GHz, including PA driver power and losses in the off-chip matching network. The architecture benefits from technology scaling by operating a deterministic pulse-density modulation process at the RF carrier frequency. This work shows that techniques successfully developed for power electronics applications are relevant for blocks in the radio architecture.
