Abstract: Present-day smartphones and tablets demand high audio fidelity (e.g., total harmonic distortion plus noise, THD + N, below 0.01%) and high noise immunity (e.g., power supply rejection ratio, PSRR, above 80 dB) to allow high integration in an SoC. The design of conventional closed-loop pulse width modulation (PWM) Class-D amplifiers (CDAs) typically involves undesirable trade-offs between fidelity (qualified by THD + N), PSRR and switching frequency. In this paper, we propose a fully integrated CMOS CDA that embodies a novel input-modulated carrier generator and a novel phase-error-free PWM modulator, collectively allowing the employment of high loop-gain to achieve high PSRR, yet without compromising linearity/dynamic range or resorting to a high switching frequency. The prototype CDA, realized in 65 nm CMOS, achieves a THD + N of 0.0027% and a power efficiency of 94% when delivering 500 mW to an 8 Ω load from a 3.6 V supply. The PSRR of the prototype CDA is very high, 101 dB at 217 Hz and 90 dB at 1 kHz, arguably the highest to date. Furthermore, the switching frequency of the prototype CDA varies from 320 to 420 kHz, potentially reducing the EMI due to spread-spectrum. In addition, the prototype CDA is versatile with a large operating-voltage range, with the supply voltage ranging from that of a single 1.2 V rechargeable battery to the standard 3.6 V smart-device supply.
the same order. Further, due to the inevitable noise coupling between different modules (including said digital-to-analog converter) in the audio CODEC System-on-Chip, said audio amplifier therein would need to feature very high tolerance to noise in the supply rail (qualified by a high power supply rejection ratio (PSRR), for example PSRR above 80 dB, and low power-supply-induced intermodulation distortion (PS-IMD) [4], [5], for example a PS-IMD of 90 dB), and low electromagnetic interference (EMI). Yet further, in view of the limited power resources in these mobile smart devices, it is highly desirable that said amplifier features high power efficiency, for example above 90%. In view of the high power-efficiency requirement, it is not surprising that virtually all present-day smart mobile devices embody a Class D amplifier (CDA) [6]-[8] as the driver to the primary ("speakerphone") loudspeaker. Nevertheless, at this juncture, CDAs are largely deficient in fidelity and in immunity to power supply noise; for example, only very few CDAs feature THD + N below 0.01% and PSRR above 90 dB, and none features PSRR above 100 dB; see Table I later in the paper.
Reported methods [5] , [9] , [10] to improve these imperative fidelity and noise immunity parameters include employing a high switching frequency (e.g., 500 kHz), and/or complex multiple feedback loops. However, these reported methods could incur compromises and penalties. Specifically, the former not only increases the power dissipation (hence compromising the power efficiency [11] ) but also potentially increases the EMI [12] , [13] . The latter, on the other hand, penalizes both the hardware complexity (hence higher IC area and cost) and the quiescent power dissipation (hence reducing the power efficiency), and may render the CDA non-fully-integrated if external components are required.
To circumvent said undesirable compromises and ensuing penalties, while obtaining high power efficiency and without resorting to a high switching frequency, we propose a novel CDA design that embodies an input-modulated carrier generator and a phase-error-free PWM modulator; we earlier showed that the primary mechanisms of THD are duty-cycle and phase errors [14]. Arguably, this is the first-ever CDA to feature a design with zero phase error. The prototype CDA, fabricated using a commercial 65 nm CMOS process, features a low THD + N of 0.0027% (when delivering 500 mW to an 8 Ω load from a 3.6 V supply) and the highest PSRR to date, 101 dB at 217 Hz, yet with a relatively low switching frequency, 320 kHz at nominal operating conditions. Furthermore, the prototype CDA is highly versatile in terms of operating supply voltages, ranging from 1.2 V to 4 V, thereby allowing operation from a rechargeable single cell (1.2 V) to standard smartphone operation (3.7 V). In addition, as the switching frequency is input-modulated, hence varying, the ensuing EMI is reduced compared to conventional CDAs whose switching frequency is fixed. For completeness, the 0.6 mm² (active area) prototype CDA is (for cost reasons) integrated with other (unrelated) designs on a large 3 × 3 mm die and packaged in a large 6 × 6 mm lead-frame package.

This paper is organized as follows. Section II discusses the system-level design. In Section III, circuit realizations are delineated in detail. Section IV presents the measurement results and the benchmarking of the prototype CDA against state-of-the-art CDAs. Finally, conclusions are given in Section V.

0018-9200 © 2014 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
II. SYSTEM-LEVEL DESIGN
Pulse width modulation (PWM) is arguably the most prevalent CDA modulation scheme at this juncture. Open-loop PWM CDAs were initially used largely for their hardware simplicity, but are now largely discarded due to their poor linearity and low PSRR [15]. Closed-loop CDAs are now ubiquitous because of their improved performance, obtained largely by means of negative feedback [4], [5], [9], [14]. Fig. 1 depicts a prevalent closed-loop design, a single-feedback second-order integrator PWM CDA. The amplifier outputs are fed back to the integrator, which integrates these signals with the input signals, forming a closed loop.

We have previously investigated the distortions of closed-loop Class D amplifiers [14] and have identified three distinctive mechanisms: (i) open-loop Class D amplifier distortions, (ii) duty-cycle error, and (iii) phase error. While distortions due to (i) mainly arise from the open-loop Class D amplifier (embodied in the closed-loop topology, including the carrier generator, PWM modulator and output stage), distortions due to (ii) and (iii) are introduced by the feedback: the intermodulation between the residual switching components at the PWM input (or integrator output) and the carrier signal results in intrinsic system distortions. In addition, we have also analyzed the PSRR of closed-loop PWM Class D amplifiers [4] and found that a higher loop-gain typically results in a higher PSRR.
On the basis of said investigations, the loop-gain of the closed loop is a fundamental design parameter because it directly and markedly affects two imperative specifications of CDAs, THD and PSRR [4], [14]:
i) A high loop-gain results in a large in-band (at audio frequencies) gain. This in turn leads to high PSRR (and reduced, i.e., improved, PS-IMD [4], [16]) as the large gain suppresses the (audio frequency) supply noise introduced at the output stage of the CDA.
ii) Unlike in linear amplifiers, a high loop-gain may conversely and inadvertently exacerbate [9], [14] the distortions in the CDA. This is because a high loop-gain inevitably results in reduced out-of-band attenuation (at high frequencies, particularly at the switching frequency) of the switching signal component. The residual switching signal component intermodulates with the carrier during the PWM modulation process, thereby introducing distortions [9], [14].
iii) Following (ii), said distortions arise from the aforementioned duty-cycle error and phase error. In particular, the phase-error distortion arises because the time delay between the center of the PWM signal and the center of the carrier is signal-dependent, and this distortion cannot be suppressed by the in-band loop-gain [14]. In other words, the phase-error distortion may undesirably increase as the loop-gain increases.

In short, there is an inevitable trade-off between PSRR and linearity when designing the loop-gain: a large loop-gain improves the PSRR while potentially exacerbating distortions. Furthermore, a large loop-gain may limit the maximum non-saturated signal swing of the CDA, i.e., the maximum signal swing without any clipping at the integrator output. This is because a large loop-gain results in a large (less attenuated) switching signal component at the integrator output (see Fig. 2), which takes away part of the headroom for the audio signal swing.
The limited (unclipped) signal swing at the integrator output in turn limits the low-distortion maximum output signal swing of the CDA; the clipped integrator output signal would lead to drastic increase in distortions and this can be seen from the sudden increase in THD + N in the high output power range (see Fig. 11 later) . Although a very high switching frequency could theoretically resolve said issues, it is nevertheless not desirable for the reasons delineated earlier in Section I.
To circumvent said undesirable trade-offs, we propose a novel PWM CDA architecture that, as depicted in Fig. 2 , embodies an input-modulated carrier generator and a phase-error-free PWM modulator (see Fig. 5 (a) later). We will now discuss the mechanisms and efficacies of these two blocks in relation to the performance of the CDA.
The input-modulated carrier generator (depicted in the dashed box at the bottom of Fig. 2) takes the differential input signals of the CDA as the modulating signal to vary the frequency of the carrier. Specifically, when the magnitude of the input signal increases, the carrier switching frequency increases accordingly, and vice versa. In this manner, the switching frequency at nominal operating conditions remains low, which reduces switching power dissipation and generates less ground-bounce noise. Of specific interest, the switching frequency increases to its maximum when the magnitude of the input signal is at its maximum. The modulation of the carrier frequency by the input signal is desirable for three reasons. First, it leads to increased attenuation of the switching component when the output power is large, as the switching is now at a higher frequency. Consequently, the potential signal clipping is mitigated and distortions are reduced. Second, the large attenuation in turn allows the employment of a high loop-gain filter design. Third, the switching frequency now spreads over a wide frequency range, and this in turn potentially reduces the radiated EMI of the Class D amplifier [13]. The detailed operation of this proposed carrier generator will be delineated in the next section. We will show that, compared to conventional carrier generators, the hardware and the quiescent power overheads of the proposed generator are negligible.
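As a rough illustration of the first benefit, the sketch below maps the input magnitude to a switching frequency and estimates the extra attenuation of the switching component. It is a minimal sketch under stated assumptions, not the authors' model: the 320 kHz and 420 kHz end points are the figures quoted in the abstract, whereas the linear mapping, the normalised input and the idealised roll-off of 20 dB/decade per integrator order are assumptions made here for illustration.

```python
import math

F_MIN_HZ = 320e3  # switching frequency at zero input (quoted in the paper)
F_MAX_HZ = 420e3  # switching frequency at maximum input (quoted in the paper)


def switching_frequency(v_in_norm: float) -> float:
    """Carrier frequency for a normalised input in [-1, 1].

    ASSUMPTION: the frequency rises linearly with |v_in|; the measured
    relationship may have a different shape.
    """
    magnitude = min(abs(v_in_norm), 1.0)
    return F_MIN_HZ + (F_MAX_HZ - F_MIN_HZ) * magnitude


def extra_attenuation_db(f_hz: float, order: int = 2) -> float:
    """Extra attenuation at f_hz relative to F_MIN_HZ.

    ASSUMPTION: an idealised integrator chain rolling off at
    20 * order dB per decade around the switching frequency.
    """
    return 20.0 * order * math.log10(f_hz / F_MIN_HZ)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0):
        f = switching_frequency(x)
        print(f"|vin| = {x:.1f}: f_sw = {f / 1e3:.0f} kHz, "
              f"extra attenuation = {extra_attenuation_db(f):.2f} dB")
```

Under these assumptions, moving from the minimum to the maximum switching frequency buys a few decibels of additional attenuation for a second-order roll-off, which is the headroom the text refers to.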
The phase-error-free PWM modulator is also depicted in Fig. 2, and its detailed schematic is depicted in Fig. 5. This novel modulator is intrinsically phase-error-free. As mentioned above, the phase error in conventional PWM arises because the time delay between the center of the PWM signal and the center of the carrier is signal-dependent, and this distortion cannot be suppressed by the in-band loop-gain. To eliminate the phase error, our proposed PWM modulator is designed such that the center of the PWM signal is aligned to the center of the carrier signal. With this modulator, a large loop-gain can be employed without increasing the distortions due to phase errors. The detailed schematic and operation of the proposed modulator will be delineated in the next section.
The proposed input-modulated carrier generator and the phase-error-free PWM modulator, individually and collectively, allow the employment of a very large loop-gain, without compromising the dynamic range and/or the linearity of the CDA; see Fig. 12 later for the THD + N comparison between the proposed design and a benchmark (conventional) design. The high loop-gain is realized by the double-feedback second-integrator loop-filter depicted in Fig. 2 . The design of the loop-filter and its potential stability issues will now be delineated in detail.
III. CIRCUIT-LEVEL DESIGN
In this section, we will discuss the circuit design of the critical functional blocks embodied in the proposed CDA.
A. Loop-Filter Design
The proposed CDA depicted in Fig. 2 adopts a double-feedback topology with a second-order integrator for each feedback loop. The equivalent block diagram model of the CDA is depicted in Fig. 3, which indicates the feedback factors of the first (inner) feedback loop and of the second (outer) feedback loop. The inner loop, depicted within the shaded box, is essentially a closed-loop single-feedback CDA, and is encompassed within the forward path of the outer loop. The transfer functions (or gains) of the first and second integrators are also depicted.
The imperativeness of high loop-gain was delineated in the last section. From Fig. 3 , the effective loop-gain, , is defined as the gain of the CDA that suppresses the supply noise (and the output stage distortions) and it can be derived as follows:
(1) It is apparent from (1) that a high loop-gain can easily be achieved by increasing the integrator(s) gain (terms within parentheses). These gains, however, cannot be arbitrarily increased, in part because the double-feedback CDA is not unconditionally stable; and it is stable only if both feedback loops are stable. Amongst the two loops, (with a second-order integrator) is unconditionally stable. On the other hand, is only conditionally stable, and its stability largely depends on three parameters: the poles/zeros of and the single-feedback CDA.
It can be seen that the frequency response, , of is (2) where is the forward path gain of . It comprises the closed-loop gain of the single-feedback CDA and Integrator2, and it can be expressed as:
The root locus of this loop is depicted in Fig. 4(a). To ensure stability, we design the CDA loop filter in the following fashion. The two closed-loop poles of the single-feedback CDA are placed on the left-hand side of the two zeros introduced by the two integrators. In this manner, all the poles of the loop will reside in the left-half plane, thereby achieving a stable loop. The exact pole positions (that result in the loop-filter response in Fig. 4(b)) are also indicated (by the diamond marker) in Fig. 4(a).
The loop-gain of the ensuing loop-filter design (embodying four poles and two zeros) is plotted in Fig. 4(b). It can be seen that it simultaneously achieves large in-band gain (180 dB and 126 dB at 217 Hz and 1 kHz, respectively) and high out-of-band attenuation (16 dB at 320 kHz).
B. Proposed Phase-Error-Free PWM Modulator
The schematic of the proposed phase-error-free PWM modulator and its waveforms are depicted in Fig. 5(a) and (b), respectively. For completeness, the proposed modulator is embodied in both differential branches as depicted in Fig. 2. For brevity, we will only delineate the operation of the modulator in the upper branch.

The proposed modulator operates as follows. The first comparator, Cmp1, compares the integrator output signal against the triangular carrier signal to generate a conventional PWM signal. An AND gate generates the 'half-pulse' by combining the pulse signal of the carrier with this conventional PWM signal. This 'half-pulse' is essentially one half of the complete PWM pulse that is the input to the output stage, and is phase-error-free. Subsequently, the 'half-pulse' closes the switch of the upper current source, which supplies twice the current of the lower-side current source, and the capacitor commences to charge from its original value. The voltage at the positive input of Cmp2 commences to ramp up and triggers comparator Cmp2 to change state. This in turn closes the lower switch, and the lower-side current source commences to sink its current. Hence, for the 'half-pulse' period, the capacitor is effectively charged by a net current equal to that of the lower-side source. At the end of the 'half-pulse' period, the upper switch opens while the lower switch remains closed, and the lower-side current source continues to discharge the capacitor. Subsequently, the capacitor voltage ramps down until it reaches its initial value, whereupon Cmp2 opens the lower switch. In this manner, the width of the final PWM pulse is twice that of the 'half-pulse'.

The pertinent waveforms are depicted in Fig. 5(b). It can be seen that the center of the final PWM signal is synchronized with the center of the carrier signal, and is independent of the input signal. Hence the final PWM signal has zero phase error, i.e., it is phase-error-free. Further, as its rising edge coincides with that of the conventional natural-sampling PWM signal, the sampling is not synchronized by any clock signal, and hence the sampling is natural sampling [17]. The final PWM signal is therefore the desired natural-sampling phase-error-free PWM signal. For completeness, the center of the conventional PWM waveform is not synchronized to the carrier; it is instead dependent on the input signal magnitude. This dependence results in the phase error [14].
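The pulse-doubling idea can be checked numerically with a small timing model. The following is a minimal sketch under stated assumptions rather than a model of the actual circuit: the carrier is an ideal triangle normalised to ±1, the comparator input is approximated by a straight line `v0 + slope * t` within one carrier period (standing in for a signal with residual ripple), and the function names are hypothetical.

```python
def pwm_edges(v0: float, slope: float, period: float = 1.0):
    """Return (t_rise, t_fall_conventional, t_fall_proposed) in one period.

    ASSUMPTIONS: the carrier falls linearly from +1 at t = 0 to -1 at
    period / 2 (its center) and rises back to +1 at t = period; the
    comparator input is v(t) = v0 + slope * t with |slope| < 4 / period.
    """
    k = 4.0 / period                         # magnitude of the carrier slope
    t_rise = (1.0 - v0) / (k + slope)        # v(t) meets the falling carrier
    t_fall_conv = (3.0 + v0) / (k - slope)   # v(t) meets the rising carrier
    half_pulse = period / 2.0 - t_rise       # rising edge to carrier center
    t_fall_prop = t_rise + 2.0 * half_pulse  # half-pulse width is doubled
    return t_rise, t_fall_conv, t_fall_prop


def pulse_center(t_rise: float, t_fall: float) -> float:
    """Midpoint of a pulse given its two edges."""
    return 0.5 * (t_rise + t_fall)


if __name__ == "__main__":
    for slope in (0.0, 0.2, 0.4):
        rise, fall_conv, fall_prop = pwm_edges(v0=0.2, slope=slope)
        print(f"slope = {slope:.1f}: "
              f"conventional center = {pulse_center(rise, fall_conv):.4f}, "
              f"proposed center = {pulse_center(rise, fall_prop):.4f}")
```

In this model the center of the conventional pulse drifts away from the carrier center as soon as the comparator input changes within the period, whereas the doubled half-pulse stays centered by construction, which mirrors the argument made in the text.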
C. Proposed Input-Modulated Carrier Generator
As delineated in Section II, the switching frequency of the proposed carrier generator is not fixed as in conventional designs. It is instead input-modulated and hence varying. The schematic of the proposed input-modulated carrier generator was depicted earlier in the dashed box at the lower half of Fig. 2.
Its operation is as follows. The proposed carrier generator generates a triangular carrier whose switching frequency varies with the signal swing. This is obtained by employing a varying current (vis-à-vis a constant current in conventional designs), whose magnitude is input-related, to charge and discharge a capacitor to generate the carrier. The varying current is generated as follows. The differential input signals are first buffered by op-amp Amp, whose outputs are subsequently compared by comparator Comp. When the positive input is above the input common-mode voltage (and the negative input is below it), one switch is closed and the rectified node is connected to the buffered positive input. Conversely, when the negative input is the higher of the two, the other switch is closed and the rectified node is connected to the buffered negative input. In this fashion, the voltage at the rectified node is always positive (with respect to the common-mode), as depicted in the waveforms in Fig. 2. A resistor consequently converts this voltage to the varying current.

The hardware overhead (shaded area in Fig. 2) and power overhead of the proposed carrier generator over the conventional carrier generator are, in the context of the entire CDA, largely negligible. This is because the specifications of op-amp Amp and comparator Comp are relaxed, and the output stage dominates the power dissipation and IC area. Specifically, for the former, the gain-bandwidth of Amp is about 1 MHz (with a resistive load of 100 kΩ), and the delay and resolution of Comp are 0.05 µs and 10 mV, respectively. The other added hardware overheads are negligible: two switches and a few resistors, all integrated.

The input-modulated carrier signal features a minimum switching frequency of 320 kHz, and with increased input signal magnitude, the (instantaneous) switching frequency ramps up to 420 kHz. Fig. 6 shows the relationship between the input signal magnitude and the instantaneous switching frequency. The varying switching frequency leads to a spread spectrum [13], which could potentially reduce the EMI of the CDA.

D. Operational Amplifiers

Fig. 7 depicts the schematic of the operational amplifiers (op-amps) shown in Fig. 2. The op-amps are employed in the integrators of the loop-filter, and to ensure the integrators function as designed, the op-amp therein must feature a high DC gain (100 dB) and a relatively wide bandwidth (15 MHz). This is because the desired high in-band gain of the integrator(s) is directly determined by the gain of the op-amp therein, and because the desired frequency response of the integrators (which is largely determined by RC values) can only be realized if the bandwidth of the op-amp is higher than the cut-off frequency of the integrator. Of specific interest, as the input-referred noise of the op-amp in the first integrator directly contributes to the overall input-referred noise of the CDA, hence affecting the noise floor and the SNR of the CDA, its input transistors are designed to be large to reduce flicker noise.
E. Output Stage Design
The output stage of the CDA adopts a bridge-tied-load topology to drive the load differentially (see Fig. 8). Due to the relatively low switching frequency of the proposed CDA, a relatively large dead-time, 4 ns, can be accommodated, largely without affecting the linearity of the CDA. The driver circuit is designed with a relatively large tapering ratio (the ratio of the size of the driving inverter to that of the inverter being driven), thereby resulting in relatively slow switching of the output transistors. This in turn reduces the ground-bounce generated by the output stage, and hence improves the overall performance of the CDA.
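To put the 4 ns dead-time in perspective, the short calculation below expresses it as a fraction of the switching period at the two ends of the 320 to 420 kHz range. This is an illustrative sketch only: it assumes two dead-time intervals per switching period (one per output transition of a half-bridge), which is an assumption made here and not a figure from the paper.

```python
DEAD_TIME_S = 4e-9  # dead-time quoted in the paper


def dead_time_fraction(f_sw_hz: float,
                       dead_time_s: float = DEAD_TIME_S,
                       edges_per_period: int = 2) -> float:
    """Fraction of each switching period spent in dead-time.

    ASSUMPTION: one dead-time interval per switching edge, i.e. two per
    period for each half-bridge.
    """
    return edges_per_period * dead_time_s * f_sw_hz


if __name__ == "__main__":
    for f_sw in (320e3, 420e3):
        print(f"f_sw = {f_sw / 1e3:.0f} kHz: "
              f"dead-time fraction = {100 * dead_time_fraction(f_sw):.3f} %")
```

Because the fraction scales linearly with the switching frequency, a lower switching frequency leaves proportionally more room for dead-time, which is consistent with the reasoning in the text.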
IV. MEASUREMENT RESULTS
The prototype CDA IC is fabricated using a commercial 65 nm CMOS process and, for cost reasons, integrated with other (unrelated) designs on a 3 × 3 mm die (the active area is 0.6 mm²). Fig. 9 is the microphotograph of the CDA IC and the functional blocks are (A) the proposed carrier generator and PWM modulator, (B) the integrators, (C) the output stage, (D) biasing circuits, and (E) testing circuits.
A single-rail 3.6 V supply and an 8 Ω load, unless specified otherwise, are used. Measurements are obtained by means of the Rohde & Schwarz UPV Audio analyzer. The measured bandwidth of the prototype CDA is from 20 Hz to 20 kHz. The measurement setup complies with the CDA testing settings described in a well-established application note [18]. To ascertain the PSRR and PS-IMD parameters, a power supply that is able to superimpose a sinusoidal wave on a DC voltage is used to inject the noise in the supply voltage.

Fig. 9. Microphotograph of the CDA IC prototype.
Fig. 10. Spectrum of the output signal, 2 V at 1 kHz.

Fig. 10 depicts the spectrum of the output signal, 2 V at 1 kHz. The dominant third-order harmonic is 94 dB lower than the 1 kHz fundamental component, and the measured THD + N and SNR are 0.0027% and 97 dB, respectively. Fig. 11 depicts the THD + N (%) versus the output power under different supply voltages. At nominal 3.6 V, the minimum THD + N is a low 0.0027% at 500 mW. When the output power increases to 700 mW, the THD + N remains low, 0.01%.
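The figures in this paragraph can be cross-checked with a few unit conversions. The helper functions below are a minimal sketch; they assume that the 2 V output level is an rms value and that percentages and decibels both refer to amplitude ratios relative to the fundamental, neither of which is stated explicitly in the text.

```python
import math


def output_power_w(v_rms: float, load_ohm: float) -> float:
    """Average power delivered to a resistive load by an rms voltage."""
    return v_rms ** 2 / load_ohm


def percent_to_db(percent: float) -> float:
    """Amplitude ratio given in percent, expressed in dB."""
    return 20.0 * math.log10(percent / 100.0)


def db_to_percent(db: float) -> float:
    """Amplitude ratio given in dB, expressed in percent."""
    return 100.0 * 10.0 ** (db / 20.0)


if __name__ == "__main__":
    # ASSUMPTION: the quoted 2 V output is an rms value across the 8 ohm load.
    print(f"Output power: {output_power_w(2.0, 8.0):.3f} W")
    print(f"THD + N of 0.0027 % in dB: {percent_to_db(0.0027):.2f} dB")
    print(f"Harmonic at -94 dB in percent: {db_to_percent(-94.0):.5f} %")
```

Under the rms assumption, 2 V across 8 Ω corresponds to the 500 mW operating point quoted for the THD + N measurement, and a third harmonic 94 dB below the fundamental is of the same order as the reported 0.0027% THD + N.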
As designed, the CDA can operate over a large range of supply voltages, ranging from 1.2 V to 4 V. This versatility is important in one of our intended applications, namely devices powered by a rechargeable 1.2 V single cell, and for meeting stringent power requirements where there is no voltage regulation (also see PSRR later).

Fig. 12 depicts the THD + N (%) versus the input signal frequency when delivering 500 mW to an 8 Ω load from 3.6 V. The prototype CDA features excellent THD + N performance over a wide range of input frequencies, including THD + N of 0.0022% at 100 Hz and 0.0027% at 1 kHz. Furthermore, to demonstrate the efficacy of the proposed methods to improve THD + N, Fig. 12 also depicts the measured THD + N of a benchmark CDA prototype. The benchmark CDA prototype was fabricated using the same 65 nm CMOS process. It embodies the same output stage and the same double-feedback loop-filter design as the proposed design; its carrier generator and PWM modulator are instead based on conventional designs. It can be seen that the proposed design features better linearity (i.e., lower THD + N) compared to the benchmark design, particularly at high input signal frequencies.
For PSRR measurements, a noise signal, 200 mV, at different frequencies is superimposed on the 3.6 V supply, and the input of the CDA is either grounded or floating. Fig. 13 depicts the PSRR of the prototype CDA against the frequency of the supply noise. Of specific interest, the PSRR is very high at 217 Hz and 1 kHz, 101 dB and 90 dB respectively, and the PSRR is largely independent of whether the input is grounded or floating. The PSRR at these two frequencies is particularly pertinent for mobile applications, as the radio frequency power amplifier in mobile devices may induce large-magnitude supply noise at 217 Hz and 1 kHz when transmitting GSM and LTE signals, respectively.

In addition to PSRR, PS-IMD is another important parameter [4], [9], [19] to qualify the supply noise rejection attributes of CDAs. Fig. 14 depicts the spectrum of the output signal, 1 V at 1 kHz, with a 3.6 V supply carrying 200 mV peak-to-peak (-23 dBV) of noise at 217 Hz. The PS-IMDs with respect to the output signal and the noise are very low, 106.5 dB and 83.5 dB, respectively. To ascertain the PSRR for a practical case, the PSRR is measured for 1 V, and the PSRR remains a high 96.8 dB.

Fig. 15 depicts the power efficiency of the prototype CDA when driving different loads from different supply voltages. The power efficiency is a high 94% when delivering 0.8 W output power to an 8 Ω load. With a 4 Ω load, the efficiency remains a high 85% when delivering 1.6 W output power. In Fig. 15, the maximum output power is defined as the power at which THD + N reaches 10%. For completeness, the total power consumption (used to calculate power efficiency) includes all the circuits delineated in Fig. 2 (and biasing circuits not shown therein).

Fig. 16 depicts the spectrum of the output signal, 1 mV at 1 kHz. It can be seen that the noise floor of the prototype CDA is low; the A-weighted integrated noise from 20 Hz to 20 kHz is 35 µV.
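To give a feel for what these PSRR values mean at the output, the sketch below converts a supply ripple amplitude and a PSRR figure into the residual ripple seen at the amplifier output. It is a minimal sketch that assumes the common definition PSRR = 20·log10(supply ripple / output ripple), with both ripples expressed in the same unit; the paper does not spell out its exact definition, and the function name is hypothetical.

```python
def output_ripple_v(supply_ripple_v: float, psrr_db: float) -> float:
    """Residual supply ripple at the output for a given PSRR.

    ASSUMPTION: PSRR (dB) = 20 * log10(supply ripple / output ripple),
    with both ripples measured in the same way (e.g. both peak-to-peak).
    """
    return supply_ripple_v * 10.0 ** (-psrr_db / 20.0)


if __name__ == "__main__":
    ripple = 0.2  # 200 mV supply noise, as used in the measurements
    for freq_hz, psrr_db in ((217, 101.0), (1000, 90.0)):
        residual = output_ripple_v(ripple, psrr_db)
        print(f"{freq_hz} Hz, PSRR {psrr_db:.0f} dB: "
              f"residual ripple = {residual * 1e6:.2f} uV")
```

Under this definition, the 200 mV supply noise is reduced to the order of a few microvolts at the output at both frequencies, well below the 35 µV A-weighted integrated noise quoted for the amplifier itself.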
The measurements of the prototype CDA are consolidated in Table I, and are benchmarked against several state-of-the-art designs; for completeness, other imperative parameters are also included therein. The state-of-the-art designs are grouped according to their packaging types: lead-frame packages (QFN, QFP) and non-lead-frame packages (wafer-level CSP). The CDAs in the two right-most columns are commercial CDAs.
From Table I, it can be seen that the prototype CDA features the highest PSRR of all benchmarked CDAs. Its THD + N is a low 0.0027% (when delivering 500 mW output power) and is much lower than that of all but two of the CDAs benchmarked. The power efficiency is a high 94% and is the highest of all designs benchmarked. Of specific interest, the prototype CDA is the only CDA that can operate with a supply voltage as low as 1.2 V and hence is able to operate from a single rechargeable cell. In conclusion, the overall performance of the prototype CDA compares favourably against reported state-of-the-art designs.
V. CONCLUSION
A high fidelity and high noise-immunity PWM CDA with low (and varying) switching frequency has been proposed. The proposed CDA featured said attributes without resorting to high switching frequency and/or complex multiple feedback loops by means of a novel input-modulated carrier generator and a novel phase-error-free modulator. Collectively, these novelties permit the employment of high loop-gain, yet without compromising linearity/dynamic range.
The prototype CDA, realized in 65 nm CMOS, achieved a THD + N of 0.0027% and a power efficiency of 94% when delivering 500 mW to an 8 Ω load from 3.6 V. The PSRR of the prototype CDA was 101 dB at 217 Hz and 90 dB at 1 kHz, and the switching frequency was input-modulated with a relatively low nominal value of 320 kHz. The prototype CDA also featured a versatile supply voltage operating range, from a single 1.2 V rechargeable battery to the standard 3.6 V smart-device supply. Overall, on the basis of benchmarking against state-of-the-art CDAs, the prototype CDA featured the highest PSRR, the highest power efficiency, a very low THD + N, and a wide operating voltage range.
