receiver to obtain 100 MHz of RF BW with minimum analog baseband complexity [1]. In this wireless application, a continuous-time sigma-delta modulator (CT ΣΔM) is the ADC architecture of choice to meet the stringent specifications of high resolution, wide BW, and low power consumption. In addition, it possesses inherent alias rejection and tolerance for out-of-band blockers, which are unique features beneficial for this application.
As an alternative to the wide BW single-loop CT ΣΔM architecture [2]-[14], the multistage noise-shaping (MASH) CT ΣΔM architecture [1], [15]-[22] has recently gained popularity due to its wide BW capability [21], low power potential [22], and capacity for integration in an LTE-A basestation transceiver [1]. Nevertheless, the single-loop CT ΣΔM architecture is usually preferred over the MASH CT ΣΔM architecture due to the problems of quantization noise leakage and nonideal interstage interfacing.
The problem of quantization noise leakage is more severe in a MASH CT ΣΔM than in its discrete time counterpart, requiring calibration [15]-[17], [21]. Digital correction of the modulator output [15], [21] can be too power-hungry to implement at a high sampling frequency ( f s ). Since analog RC time constant calibration is already a requirement for modulator stability over process corners, it can be implemented with sufficient accuracy to also satisfy the quantization noise leakage specification [16], [17]. However, the digital calibration algorithm in [15], [16], and [21] is complex, whereas that in [17] consumes a large amount of power.
Accurate analog RC time constant calibration can be avoided for the MASH 0-X [19] and the sturdy MASH (SMASH) [20] , [23] architectures. However, these architectures suffer from systematic first-stage quantization noise leakage [19] , [23] . In addition, the SMASH architecture is more prone to overload compared with the MASH architecture, as it is essentially a single-loop architecture in disguise employing a MASH 0-X quantizer.
In addition to the problem of quantization noise leakage, interstage connection in a MASH CT ΣΔM is not as straightforward as that in its discrete time counterpart. As shown in Fig. 1, delay in the interstage digital-to-analog converter (DAC) causes out-of-band peaking for both the input signal and the first-stage quantization noise, which are processed in the second stage. This situation is exacerbated if the second stage has out-of-band peaking in its signal transfer function (STF). In most cases, using a multibit quantizer in the first stage is necessary, as the second stage is already prone to overload from the first-stage quantization noise alone.

0018-9200 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Due to this nonideal interstage connection, early MASH CT ΣΔM designs [15]-[17] require a complex and power-hungry noise cancellation filter (NCF) for equalization. An external coaxial cable [19], a low-pass filter [20], and a lattice filter [21] have previously been proposed as analog delay elements in the interstage connection of MASH CT ΣΔMs. The effectiveness of these solutions in suppressing out-of-band STF peaking still needs improvement.
The MASH 2-2 CT ΣΔM proposed in this paper solves all the aforementioned problems. Quantization noise leakage is minimized by on-chip RC time constant calibration, which is possible due to the low oversampling ratio (OSR), as well as by employing high-gain multistage operational amplifiers (OAs) in the loop filters. In [24], it is shown that it is theoretically possible to synthesize a MASH CT ΣΔM from its discrete time prototype using feedforward interstage paths. The synthesis method proposed in this paper reduces the number of feedforward interstage paths necessary while accounting for excess loop delay (ELD), presents minimal loading to the second stage, and reduces its signal swing without out-of-band peaking. The NCF integrated in this design is also simple, low power, and capable of high-speed operation. To increase the suitability of this design for wireless applications, the modulator adopts a feedback topology to provide STFs free from out-of-band peaking at the digital and integrator outputs.

This paper is organized as follows. Section II discusses the architecture of the proposed MASH 2-2 CT ΣΔM. Section III describes critical circuits for the modulator operation. Section IV reports the measurement results. Section V provides the conclusion and comparison with the state of the art.
A. Proposed Single-Loop CT ΣΔM Stage Architecture
Both stages implement a second-order noise transfer function (NTF) with dc zeros as demonstrated by the impulse invariant analysis shown in Table I . Feedback topology is adopted to avoid out-of-band peaking of the input signal for the first stage and amplification of the first-stage quantization noise for the second stage. The loop filters are realized using active-RC topology with digitally tunable capacitors. High-gain multistage OAs are implemented to satisfy both the loop filter linearity and the quantization noise leakage specifications.
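As an illustration of the impulse invariant analysis, the sketch below works through the textbook case of a second-order feedback loop with a nonreturn-to-zero DAC and zero ELD. It is not the content of Table I, which includes the delayed DAC paths of this design. The function names and the normalization T = 1 are assumptions made for the example. The feedback coefficients are chosen so that the sampled loop response equals that of L(z) = 1/NTF(z) − 1 with NTF(z) = (1 − z^-1)^2.

```python
import math


def nrz_integrator_sample(order, n):
    """Output at t = n (with T = 1) of 1/s**order driven by a unit NRZ pulse on [0, 1)."""
    def step_response(t):
        # The response of 1/s**order to a unit step is t**order / order!.
        return t ** order / math.factorial(order) if t > 0 else 0.0
    # An NRZ pulse is a step at t = 0 minus a step at t = 1.
    return step_response(n) - step_response(n - 1)


def target_loop_response(n):
    """Impulse response of L(z) = 1/NTF(z) - 1 for NTF(z) = (1 - z^-1)^2."""
    return float(n + 1) if n >= 1 else 0.0


def solve_feedback_coefficients():
    """Match k1/s + k2/s^2 to the target at n = 1 and n = 2 (2x2 linear system)."""
    a11, a12 = nrz_integrator_sample(1, 1), nrz_integrator_sample(2, 1)
    a21, a22 = nrz_integrator_sample(1, 2), nrz_integrator_sample(2, 2)
    b1, b2 = target_loop_response(1), target_loop_response(2)
    det = a11 * a22 - a12 * a21
    k1 = (b1 * a22 - a12 * b2) / det
    k2 = (a11 * b2 - b1 * a21) / det
    return k1, k2


def ntf_taps(k1, k2, length=6):
    """First taps of NTF(z) = 1 / (1 + L(z)) computed from the sampled loop response."""
    loop = [0.0] + [
        k1 * nrz_integrator_sample(1, n) + k2 * nrz_integrator_sample(2, n)
        for n in range(1, length)
    ]
    taps = [1.0]
    for n in range(1, length):
        taps.append(-sum(loop[k] * taps[n - k] for k in range(1, n + 1)))
    return taps


k1, k2 = solve_feedback_coefficients()
print("k1 =", k1, "k2 =", k2)
print("NTF taps:", ntf_taps(k1, k2))
```

Under these assumptions the matching yields the familiar coefficients 1.5 and 1, and the recovered NTF taps are 1, −2, 1 followed by zeros, that is, two zeros at dc. Adding DAC delay changes the sampled pulse responses, which is why additional compensation paths are required in the actual design.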
DAC 4 and DAC 3 provide the half and one clock cycle delay feedback paths in the first stage, respectively. This ELD compensation scheme provides better power efficiency compared with using the zeroth-order feedback path [25] [26] [27] and lower quantizer complexity compared with using digital ELD compensation [28] . It resembles the ELD compensation scheme based on differentiator DAC [29] , in which one of the differentiator DAC (DAC 3 ) coefficients is reduced by four times to save the area and power consumption thanks to the feedback topology. Furthermore, the delay of DAC 1 is extended to two clock cycles to accommodate data weighted averaging (DWA) [30] and thermometer-to-binary encoder based on Wallace-tree adder to minimize the quantizer metastability. The change in the loop filter impulse response caused by this additional delay is compensated by adding DAC 2 and reducing the first integrator coefficient by 40%.
DAC 9 and DAC 8 fulfill in the second stage the same roles as DAC 4 and DAC 3, respectively. As second-order noise shaping of the error in DAC 6 is sufficient, DWA is not necessary and a one clock cycle delay is used for DAC 6. All the DACs use nonreturn-to-zero pulse shaping to reduce jitter sensitivity. DAC bias currents track the value of replica loop filter resistors. Two 4-b flash quantizers provide an amplification factor of 2.5 V/V to reduce the signal swing and BW of the second and fourth integrators by the same amount. This is achieved by reducing the full scale of the quantizers with respect to that of the modulator [31].
B. Proposed MASH CT ΣΔM Architecture
In addition to the main interstage path through R 3, five additional feedforward interstage paths are added through R 13, R 24, C 24, DAC 5, and DAC 7. Four, five, or six feedforward interstage paths are necessary for a MASH 2-2 CT ΣΔM with zero, one, or two clock cycle ELDs, respectively. Without all these additional paths, the second stage needs to process the input signal without any attenuation and the NCF needed to cancel the first-stage quantization noise is complex [17]. Although the in-band input signal processed by the second stage can be canceled using DAC 5 [15], [16], this leads to out-of-band peaking of the input signal and the first-stage quantization noise at the second-stage output. This out-of-band peaking can be solved by minimizing the DAC 5 delay or implementing an analog delay element in the main interstage path [19]-[21].
Removing the constraint on the second-stage STF imposed in [24] allows a search for the optimized combination of feedforward interstage paths and reduces their number by one. The design options considered were: 1) four loop filter feedforward interstage paths, one of which is the unused resistive connection from the first-integrator output to the fourth-integrator input and 2) two interstage DACs from the first-stage output to the third- and fourth-integrator inputs with one, one and a half, or two clock cycle delays. All the topologies were then compared based on the second-stage STF and the value of the coefficients to minimize the second-stage loading.
Using the impulse invariant transform analysis as exemplified in Table II and the simplified MASH 2-2 CT ΣΔM model in Fig. 3, the feedforward interstage coefficients are chosen to satisfy
where NTF(z) and NTF 12 (z) are the transfer functions from the first-stage quantization noise to the NCF and the second-stage outputs, respectively. As implied by (2), the NCF in this design is simple compared with those used in the state-of-the-art CMOS MASH CT ΣΔMs [15]-[17], [21].
C. Noise Transfer Function
The theoretical quantization noise floors of the second- and fourth-order NTFs with dc zeros, 4-b quantizer, and OSR of 10 are −63.0 and −85.7 dBFS, respectively. The simulated maximum stable amplitude (MSA) is −0.6 dBFS. second-order feedforward interstage path through C 24 and cancellation of the third-order feedforward interstage paths through R 13 and R 24. The in-band input signal swings at the first through fourth integrator outputs are 0, −7.96, −5.74, and −14.0 dB, respectively. This design is free from internal out-of-band peaking. The in-band second-stage STF is −6.02 dB. Note that it is not possible to synthesize a MASH CT ΣΔM in which the later stage does not process any input signal, both in-band and out-of-band. Such a modulator would have the STF of its first stage alone, so the later stage would be unable to cancel the alias generated by the first stage. This violates the quantization noise cancellation condition, as alias is virtually indistinguishable from quantization noise.
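The theoretical noise floors quoted at the start of this subsection can be approximated with a short numerical check. The sketch below assumes an ideal NTF of the form (1 − z^-1)^L, white quantization noise of power Δ²/12, a quantizer step of Δ = 2/2^N for a full scale of ±1, and a full-scale sinusoid as the 0-dBFS reference. These modeling choices and the function name are assumptions made for illustration, not necessarily the procedure used for the values above.

```python
import math


def inband_noise_dbfs(order, bits, osr, steps=20000):
    """In-band quantization noise in dBFS for NTF(z) = (1 - z^-1)**order."""
    # Assumed quantizer model: 2**bits uniform steps across a full scale of +/-1.
    delta = 2.0 / (2 ** bits)
    noise_total = delta ** 2 / 12.0  # white quantization noise power
    # Midpoint-rule integral of |NTF|^2 = (2 sin(w/2))**(2*order) over [0, pi/osr].
    band_edge = math.pi / osr
    dw = band_edge / steps
    integral = dw * sum(
        (2.0 * math.sin((i + 0.5) * dw / 2.0)) ** (2 * order) for i in range(steps)
    )
    noise_inband = noise_total * integral / math.pi
    full_scale_power = 0.5  # sinusoid with amplitude 1
    return 10.0 * math.log10(noise_inband / full_scale_power)


second_order_floor = inband_noise_dbfs(order=2, bits=4, osr=10)
fourth_order_floor = inband_noise_dbfs(order=4, bits=4, osr=10)
print(round(second_order_floor, 1), round(fourth_order_floor, 1))
```

With a 4-b quantizer and an OSR of 10, this model lands within about 0.1 dB of the −63.0 and −85.7 dBFS figures. At such a low OSR the exact integral of |NTF|² is used rather than the usual small-angle approximation, which would be pessimistic by roughly 0.1 dB for the fourth-order case.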
D. Signal Transfer Function
To control the signal swing at the second-stage output, feedforward input paths [11], [28] are recommended instead of additional feedforward interstage paths [24]. This directly controls the overall STF, which can be derived more intuitively [32]. The second-stage STF is given as
which implies that the second-stage STF is indirectly controlled by the overall STF if the first-stage STF and the NCF transfer functions are fixed.
As an example, a feedforward resistive path from the modulator input to the third-integrator input can be used to cancel the second-order feedforward path through C 24 .
In this case, the in-band input signal swings at the third-integrator output and the second-stage output are −20.0 and −21.6 dB, respectively. Thus, it allows room for interstage gain to further reduce the quantization noise contribution of the second stage. Without any integrator rescaling, this gain is limited to 2 V/V as the out-of-band input signal swing at the third-integrator output is similar to that in the current design. As the current design allocates some quantization noise leakage budget as a conservative measure, this modification is deemed not necessary. It can be attractive for future designs with more accurate RC time constant calibration.
E. Process Variation
The impact of process variation was determined by analyzing the global and local sensitivities of the modulator quantization noise floor to each component or parameter. Fig. 6 shows the theoretical and simulated quantization noise floor of the MASH 2-2 CT ΣΔM versus RC time constant variation with the ideal OA model. In this example, up to ±3% RC time constant variation can be tolerated for less than −83.5 dBFS of quantization noise budget. Calibration is necessary since the worst case variation on the values of R and C is ±20% based on technology specifications.
Analyses were done for both global and local variations of R, C, DAC coefficient, DAC delay, quantizer sampling instance, and quantizer gain. Besides RC time constant variation, DAC coefficient variation is also a contributor to quantization noise leakage and is minimized by proper biasing. The quantization noise floor of the design is not sensitive to DAC delay, quantizer sampling instance, quantizer gain, and feedforward interstage path coefficient variations.
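To make the gap between the ±3% tolerance and the ±20% worst case variation concrete, the following sketch estimates a lower bound on the number of capacitor tuning codes that an ideal calibration would need. It assumes that the ±20% figure applies to the RC product as a whole, that the tuning steps are ideally placed, and that the time constant measurement is error free. The function name is hypothetical, and the result is only a bound, not the resolution of the digitally tunable capacitor used in this design.

```python
import math


def min_tuning_codes(worst_case_spread, tolerance):
    """Lower bound on the number of codes needed to bring any RC error within
    +/-worst_case_spread back to within +/-tolerance of the nominal value."""
    # Each code can absorb RC values spanning a ratio of (1 + tol) / (1 - tol).
    ratio_per_code = (1.0 + tolerance) / (1.0 - tolerance)
    # The untrimmed RC product spans a ratio of (1 + spread) / (1 - spread).
    total_ratio = (1.0 + worst_case_spread) / (1.0 - worst_case_spread)
    return math.ceil(math.log(total_ratio) / math.log(ratio_per_code))


codes = min_tuning_codes(worst_case_spread=0.20, tolerance=0.03)
bits = math.ceil(math.log2(codes))
print(codes, "codes,", bits, "bits at minimum")
```

A practical tuning range and resolution would be chosen with margin above this bound to absorb measurement error and temperature drift.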
F. Circuit Thermal Noise
The full scale of the modulator corresponds to a differential sinusoid input signal with an amplitude of 687.5 mV (peak-to-peak of 1.375 V) or 0 dBFS. The noise contributions from R 1, OA 1, and DAC 1 are calculated to be −84.7, −86.3, and −86.2 dBFS, respectively, which bring the noise floor up to −80.9 dBFS for the first integrator. The suppressions for 
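The first-integrator noise floor quoted in this subsection follows from adding the three contributions on a power basis, assuming they are uncorrelated. A minimal sketch of that arithmetic, with a hypothetical helper name, is:

```python
import math


def sum_dbfs(levels_dbfs):
    """Combine uncorrelated noise contributions given in dBFS into one dBFS level."""
    total_power = sum(10.0 ** (level / 10.0) for level in levels_dbfs)
    return 10.0 * math.log10(total_power)


# Contributions of R1, OA1, and DAC1 as stated in the text.
first_integrator_noise = sum_dbfs([-84.7, -86.3, -86.2])
print(round(first_integrator_noise, 1))
```

The three contributions are of similar magnitude, so the combined floor sits roughly 10·log10(3) ≈ 4.8 dB above a single typical contribution, which is consistent with the −80.9 dBFS figure.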
G. Clock Jitter
The prototype chip relies on the performance of an external clock source to achieve the required clock jitter budget. Using the analysis in [33], the theoretical noise floors due to white clock jitter mixing with quantization noise and a worst case sinusoid input signal with −1 dBFS of amplitude at band edge are found to be −83.0 and −81.0 dBFS, respectively, for 1-ps rms of white jitter in DAC 1. If the clock jitter is dominated by its near-carrier component with a very narrow BW compared with that of the modulator, the worst case theoretical noise floor becomes −71.0 dBFS for 1-ps rms of low-frequency sinusoid jitter approximation in DAC 1.

Fig. 7(a) shows the OA architecture and its design parameters for OA 1. It has four gain stages compensated using the no-capacitor feedforward (NCFF) scheme [34], making high-gain OAs practical in deep submicrometer CMOS. The transconductors G m2 and G m12, as well as G m3 and G m13, are designed to be identical in order to balance the gain. Additional nMOS capacitors are added at the output terminals of the first and second stages to lower their BW. Their nonlinearity is not a concern due to the small signal swings they experience. The third stage directly drives the parasitic input capacitances of the output stage. The estimated load capacitance of the output stage annotated in Fig. 7(a) includes the parasitic capacitances of the digitally tunable capacitor, OA input capacitance, and DAC output capacitance, which effectively load the OA.

Fig. 7(b) shows the schematic of G m1. It is a telescopic cascode amplifier with noncascoded load transistors, a self-biased common-mode feedback (CMFB), and a current source common-mode level shifter. Fig. 7(c) shows the schematic of G m4 and G m14 as the output stage. It is a current-reuse amplifier with an ac coupling for G m14 and a two-stage NCFF CMFB. Pseudo-differential topology is chosen to accommodate the limited transistor headroom of 206 mV.
As the signal swings at the second through fourth integrator outputs are reduced, the pseudo-differential output stage in OA 1 is replaced by its fully differential version for the rest of the OAs. Fig. 8 shows the postlayout simulated OA 1 Bode plot. The loop gain is obtained by Cadence stb analysis when OA 1 is used in closed loop as the first integrator, whereas the OA gain is obtained when OA 1 is used in open loop including the integrator feedback network as its load. The feedback factor, indirectly measured from the ratio of the loop and OA unity gain frequencies (UGFs), is 0.823 due to the capacitive division between the integrating capacitor and the parasitic capacitances at the OA virtual ground including that of DAC 1. OA 1 achieves 84.3 dB of dc gain, 61.5 dB of gain at a frequency of 50 MHz, 1.19 GHz of loop UGF, 61.3° of loop phase margin, and 3.4 nV/√Hz of input-referred voltage noise density. All the OAs achieve greater than 60 dB of gain at a frequency of 50 MHz, which is equivalent to greater than 50 GHz of gain-BW (GBW) product for an OA with single-pole roll-off. The loop UGFs of OA 2, OA 3, and OA 4 are 1.20 GHz, 1.81 GHz, and 964 MHz, respectively. The simulated nominal power consumptions of OA 1, OA 2, OA 3, and OA 4 are 10.8, 6.5, 3.6, and 3.5 mW, respectively. Due to the limited design time, OA 3 was overdesigned and its power consumption and loop UGF can potentially be reduced further.
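The equivalence between the gain at 50 MHz and the GBW of a single-pole OA is a one-line calculation, sketched below with hypothetical helper names. The single-pole assumption means the gain falls at 20 dB per decade, so the gain magnitude multiplied by the frequency is constant above the dominant pole. The second helper inverts the feedback factor definition given above to obtain the OA UGF implied by the quoted loop UGF.

```python
def equivalent_gbw_hz(gain_db, frequency_hz):
    """GBW of a single-pole OA that would provide gain_db at frequency_hz."""
    return 10.0 ** (gain_db / 20.0) * frequency_hz


def opamp_ugf_hz(loop_ugf_hz, feedback_factor):
    """OA UGF implied by the loop UGF when feedback factor = loop UGF / OA UGF."""
    return loop_ugf_hz / feedback_factor


gbw_required = equivalent_gbw_hz(60.0, 50e6)   # 60 dB at 50 MHz
gbw_oa1 = equivalent_gbw_hz(61.5, 50e6)        # OA1 gain at 50 MHz from the text
oa1_ugf = opamp_ugf_hz(1.19e9, 0.823)          # values quoted for OA1
print(gbw_required / 1e9, gbw_oa1 / 1e9, oa1_ugf / 1e9)
```

The comparison shows why the multistage OA is attractive: a single-pole OA would need a GBW of about 50 GHz to match the in-band gain, while the implied UGF of OA 1 is below 1.5 GHz.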
III. CIRCUIT IMPLEMENTATION

A. Operational Amplifier
The advantage of the proposed OA is apparent when evaluating the required OA specifications as shown in Fig. 9. Single-pole OAs with dc gain and GBW larger than 40 dB and 5 GHz, respectively, are needed to minimize the quantization noise leakage. This rules out the possibility of using low-gain OAs [13]. Transient simulation of the modulator using the proposed OAs shows that the ideal quantization noise floor remains intact. On the other hand, a conventional OA with single-pole roll-off and the same UGF of 1.19 GHz as the proposed OA degrades the quantization noise floor by 13.2 dB. No GBW tuning or loop filter compensation is needed as the proposed OA performance meets the specifications for all the process corners.

quantization noise leakage budget of ±3% RC time constant variation tolerates up to two LSBs of error in the digital code.
B. Digitally Tunable Capacitor
C. Bias and RC Time Constant Calibration Circuits
The bias circuit in each stage provides bias currents inversely proportional to the value of the replica loop filter resistor to the DACs, OAs, and RC time constant calibration circuit. Error in DAC coefficient of less than 1% is achieved to minimize quantization noise leakage. During startup, the RC time constant calibration circuit in each stage measures the rise time of a ramp waveform generated by a replica integrator and DAC. The digitally tunable capacitor is tuned until the desired rise time of the ramp is achieved. The RC time constant variation over temperature is +0.15% and +0.64% at 125°C and −40°C, respectively, compared with the nominal RC time constant at 27°C. Thanks to the low temperature coefficients of the nonsilicide p-type polysilicon resistor and the finger metal capacitor, startup calibration is sufficient.

Fig. 11 shows the comparator schematic used in the 4-b flash quantizer. During startup, the sense amplifier input terminals are connected to the resistive ladder via a switched-capacitor common-mode level shifter. The finite-state machine calibrates the sense amplifier offset to the differential reference voltage of the resistive ladder. During normal operation, the resistive ladder can be turned OFF.

Fig. 12 shows the sense amplifier schematic used in the comparator. The classic topology in [35] is modified by adding a digitally controlled dynamic differential pair formed by transistors M 6−10. It generates a dynamic offset current that also contributes to the latch transconductance, offsetting the effect of the additional parasitic capacitance and minimizing the degradation of the regeneration time constant. Fig. 13(a) shows the postlayout simulated sense amplifier offset versus digital code for all the corners. The effect of nonlinearity on this curve is minimized by sweeping both the positive and negative digital codes during the calibration.
The maximum sense amplifier offset is greater than the desired maximum differential reference voltage for all the corners to give some margin for random transistor mismatch. After calibration, the simulated integral and differential nonlinearities of the quantizer are less than a quarter of an LSB. By modeling the comparator offset to be random with a standard deviation of a quarter of an LSB, 1000-run Monte Carlo simulations as shown in Fig. 13(b) predict that the quantization noise floor increases to −81.3 dBFS for the case where the quantization noise power is higher than its mean value by its 3-σ value.
D. Quantizer
The quantizer also includes a thermometer-to-binary encoder based on Wallace-tree adder and DWA. The simulated nominal digital power consumption of the quantizer is 2.5 mW.

Fig. 14(a) and (b) shows the pMOS and nMOS DAC cell schematics, respectively. Both consist of the current source transistor M 1, cascode transistor M 2, and current switch transistors M 3-4. The bias current carried in each DAC cell is given by G m V f s /16, where G m is the DAC transconductance annotated in Fig. 4. External 10-μF ceramic capacitors are used to decouple the DAC bias voltages V b1 to reduce the DAC noise floor from biasing by 7.5 dB under large-signal conditions.
E. DAC
The pMOS DAC cell uses a 2.5 V power supply and thick oxide current source transistor M 1 , whose drain is biased at 1.1 V to provide low noise and good matching. It is used in DAC 1 , DAC 4 , DAC 6 , and DAC 9 . The rest of the DACs use the nMOS DAC cell. The common-mode DAC currents help to raise the OA input common-mode voltages and provide bias currents to the OA output stages and the circuit driving the modulator input terminals.
The standard deviation of DAC 1 cell current mismatch is slightly less than 0.05%. From 1000-run Monte Carlo simulations, this corresponds to averaged second- and third-order harmonic distortions of 84.8 and 88.6 dB, respectively. DWA is expected to improve harmonic distortion by 10.0 dB at band edge. Fig. 14(c) shows the DAC latch cell schematic, which has low crossing switching. It is used to drive the pMOS DAC cell, whereas inverters are added to its outputs to drive the nMOS DAC cell for high crossing switching. These switching schemes, combined with ensuring fast transition times, minimize the effects of DAC glitch and intersymbol interference (ISI).
The simulated nominal analog power consumptions of the DACs in the first and second stages are 7.5 and 1.1 mW, respectively. The simulated nominal digital power consumptions of the DACs in the first and second stages are 3.2 and 1.7 mW, respectively. 
F. Floorplan
The modulator is implemented in the TSMC 40-nm low-power CMOS process. Fig. 15 shows the chip microphotograph. The area occupied by the modulator is 0.265 mm². Passives of each stage are placed close to each other. Separate power supplies are used for the analog and digital blocks. Guard rings are used extensively to protect sensitive analog circuits. Unused areas are filled with power supply decoupling capacitors.

Fig. 16 shows the measured fast Fourier transform (FFT) spectra for a single-tone sinusoid input signal at a frequency of 10 MHz. The peak signal-to-noise-and-distortion ratio (SNDR), peak signal-to-noise ratio (SNR), and spurious-free dynamic range (SFDR) are 74.4, 75.8, and 84.0 dB, respectively, at an input amplitude of −0.8 dBFS. Noise and distortion cancellation of 20.0 dB was observed. The harmonic components visible in the first-stage output, which are canceled by the second stage, indicate that the first stage is very close to overload as the MSA is −0.7 dBFS. The BW of the modulator is 50.3 MHz to include the fifth-order harmonic component.

Fig. 17 shows the measured second- and third-order harmonic distortion versus single-tone sinusoid input signal frequency with an amplitude of −0.8 dBFS. The harmonic distortion is limited by intrinsic DAC matching and ISI as DWA was found to reduce the SFDR due to interaction between parasitic DAC capacitances and parasitic DAC routing resistances to the OA input terminals. All the measurements shown correspond to the case when DWA is turned OFF.

Fig. 18 shows the measured FFT spectrum for two-tone sinusoid input signals at frequencies of 38 and 42 MHz with −7.5 dBFS of amplitude each, which corresponds to the peak SNDR condition. The two-tone MSA is −6.5 dBFS. These input signals located near the edge of the modulator BW represent the worst case two-tone linearity test due to reduced gain at high frequencies. The second- and third-order intermodulations are 85.9 and 80.6 dB, respectively.
Residual noise from the signal generators, which was filtered by an external bandpass filter with 4 MHz of BW, was observed from the 38 to the 42-MHz band.

Fig. 19 shows the measured SNDR and SNR versus single-tone sinusoid input signal amplitude at a frequency of 10 MHz. The dynamic range (DR), defined as the ratio between the maximum and minimum input signal amplitudes where SNDR > 0 dB, is 76.8 dB. The modulator always recovered from overload and startup conditions without the need for reset mechanisms.

Fig. 20 shows the measured noise floor versus digital code controlling the first-stage digitally tunable capacitors. The digital codes obtained by startup RC time constant calibration were found to be close to optimum for both the first and second stages. Compared with the nominal digital codes, the digital codes after calibration differ by +2 and −1 LSBs for the first and second stages, respectively. Meanwhile, the measured value of input resistors R 1 in this test chip is 475 Ω, which is about 5% less than its nominal value.

Fig. 21 shows the measured STF. The STF in-band is flat with less than 0.1 dB of variation. The STF peaking is 4.1 dB at a frequency of 320 MHz. The alias suppression is 52.4 dB at a frequency of 950 MHz. The STF peaking, degraded alias suppression, and shallow STF notch are attributed to poor matching at high frequency due to finite OA gain and BW, finite switch ON-resistance, and component mismatch. These degradations were also observed by transient simulations of the modulator at the critical frequencies. Nevertheless, the increased DR required by the STF peaking is safely accommodated by the NCF and the only peaking worth considering is the 2.1 dB of the first-stage STF peaking at a frequency of 170 MHz.
The reduction of input signal swing at the second-stage output is degraded to 3.4 dB compared with the theoretical value of 6.0 dB due to quantizer gain error attributed to the switched-capacitor common-mode level shifter during quantizer calibration. is competitive compared with the state-of-the-art CT ΣΔMs shown in Table III.
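For readers who prefer to compare converters in bits, the measured peak SNDR can be converted to an effective number of bits with the standard sinusoid-based relation. The sketch below also defines the commonly used Schreier figure of merit. Both helper names are hypothetical, and the figure of merit takes the power consumption as an input, so it is left to be evaluated with the total modulator power.

```python
import math


def enob_bits(sndr_db):
    """Effective number of bits from SNDR, using SNDR = 6.02 * ENOB + 1.76 dB."""
    return (sndr_db - 1.76) / 6.02


def schreier_fom_db(dr_db, bandwidth_hz, power_w):
    """Schreier figure of merit: DR + 10 * log10(BW / P), in dB."""
    return dr_db + 10.0 * math.log10(bandwidth_hz / power_w)


peak_enob = enob_bits(74.4)  # measured peak SNDR of 74.4 dB
print(round(peak_enob, 2))
```

With the measured peak SNDR of 74.4 dB, the relation gives an effective resolution of about 12.1 b over the 50-MHz band.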
IV. MEASUREMENT RESULTS
