An analog queue-based architecture and an adaptive digital-calibration algorithm calibrate an 8-bit two-stage pipelined algorithmic analog-to-digital converter (ADC). To minimize power dissipation and noise, the queue consists of only one sample-and-hold amplifier. At a sampling rate of 20 Msamples/s, the peak signal-to-noise-and-distortion ratio (SNDR) is 45 dB, and the spurious-free dynamic range (SFDR) is 62 dB. The total power dissipation is 25.4 mW from 3.0 V. The active analog area is 0.11 mm².
I. Introduction
Traditional designs of high-performance ADCs have required high-performance operational amplifiers (op amps). In many cases, single-stage op amps have been chosen to provide both high gain and high speed. As CMOS process geometries continue to shrink, however, the open-loop gain of single-stage op amps is decreasing. To overcome this problem, digital background calibration can be used to adjust for the effects of reduced open-loop gain. Unlike foreground calibration done at power-up, background calibration is transparent to the ADC user and can track variations caused by changes in temperature. Also, the cost of digital calibration is decreasing due to the scaling of CMOS technologies. As a result, digital-background calibration can potentially allow the cost-effective design of high-performance analog-to-digital converters with low-performance op amps.
In this paper, the application of queue-based digital-background calibration to an 8-bit algorithmic ADC is described [1]. To reduce power dissipation, the length of the calibration queue is reduced from two sample-and-hold amplifiers (SHAs) in [2] to one here. A resolution of 8 bits is selected to demonstrate that calibration is potentially useful at a level where it has not traditionally been required or used.

Fig. 1(a) shows a block diagram of an ADC with a calibration queue. The input to the queue, x(t), is sampled at a rate f S, which is slightly less than the ADC conversion rate, f C. The difference between these two rates allows the ADC to occasionally process a calibration signal, while the queue stores one or more samples of the input signal. When the queue consists of two SHAs [2], the time allowed for the ADC to complete a calibration must be less than two input sampling periods, or 2/f S. An advantage of this method is that this calibration interval is long enough to do a highly accurate calibration. A disadvantage is that, although one SHA is necessary to achieve excellent dynamic performance, the second SHA is not needed for that purpose and adds noise to the input signal. To overcome this problem, the calibration queue in this paper uses only one SHA.

"Full" is high when the queue holds an input sample. Once a sample is completely transferred to the ADC, Full is reset. Because f C > f S, the interval during which samples are held in the queue decreases over time at first. When the ADC is available to begin a conversion while the queue is empty, "Calibrate" rises, and the ADC instead converts a calibration signal, which is introduced directly at the ADC input, bypassing the SHA. The information obtained by these calibration operations is used to overcome limitations that would otherwise occur because of finite op-amp gain and capacitor ratio errors. The use of a queue with two SHAs allows nearly identical conversion and sample rates. However, when the queue consists of only one SHA, the ratio of these rates must be increased.
II. Queue-Based Background Calibration
The queue has three distinct modes of operation: acquisition, hold, and transfer to the ADC.
During the acquisition time t A , a new input sample is acquired and a valid SHA output is not available. Any conversion begun by the ADC during t A cannot be used to quantize an input sample. During the transfer time t T , the SHA output must be held constant and another input sampling operation is not allowed to begin. Fig. 1c shows a detailed worst-case timing diagram of Sample and Convert in the synchronous case, where T S is n + 1 periods and T C is n periods of one master clock. In general, n is an integer, and n = 3 in this example. When Sample rises, the SHA starts to acquire the input signal. When Sample falls, the acquisition of the SHA input is complete. When Convert rises, the ADC starts to acquire its input, which comes from either the SHA output or the calibration input. When Convert falls, the acquisition of the ADC input is complete. The first rising edges of Sample and Convert are assumed to occur at the same time. Also, T CAL = T C is assumed, where T CAL is the time required by the ADC to convert a calibration signal and T C is the conversion time for input samples stored by the queue.
During the first T C period, a calibration is done because the acquisition of the input is not complete when the ADC is ready to begin a new conversion. During the calibration, the input acquisition is completed, and the resulting sample must be transferred to the ADC before another acquisition can begin. Therefore, the first T S interval in Fig. 1c must be at least T CAL + t T. With T CAL = T C,

$$T_S \ge T_C + t_T. \qquad (1)$$
During the next n T C intervals, input signals stored by the SHA are quantized by the ADC, and the delay between the end of the acquisition time and the beginning of the conversion time decreases because T C < T S by design. In the last T C period before the process repeats, the delay between the rising edges of the sample and convert clocks is at its minimum. To guarantee that a conversion begins in this interval instead of a calibration, this delay must be at least as long as t A so that the acquisition is complete when the ADC is ready to start conversion. After this last T C period, the rising edges of the sample and convert clocks again occur at the same time.
Therefore, the last T S interval in Fig. 1c must be at least as long as T C + t A. In other words,

$$T_S \ge T_C + t_A. \qquad (2)$$
As a result, both conditions (1) and (2) must be satisfied in the synchronous case.
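The synchronous timing described in this section can be sketched as a small Python model. This is only an illustration, not the prototype's control logic: the function name `schedule`, its parameters, and the choice of master-clock periods as the time unit are assumptions made for the example. The model places the rising edges of Sample every n + 1 master periods and of Convert every n master periods, and labels each conversion as an input conversion or a calibration depending on whether the SHA has finished acquiring when Convert rises.

```python
def schedule(n, t_a, t_t, num_conversions):
    """Label each conversion 'in' (input sample) or 'cal' (calibration).

    Times are in master-clock periods (an assumption of this sketch).
    T_S = n + 1 and T_C = n, with the first Sample and Convert edges aligned.
    t_a is the SHA acquisition time and t_t the SHA-to-ADC transfer time.
    """
    t_s, t_c = n + 1, n
    next_sample = 0          # index of the next input sample awaiting conversion
    labels = []
    for j in range(num_conversions):
        t = j * t_c                          # rising edge of Convert
        ready_at = next_sample * t_s + t_a   # acquisition of that sample ends
        if ready_at <= t:
            # The transfer must finish before the single SHA starts
            # acquiring the following sample.
            if t + t_t > (next_sample + 1) * t_s:
                raise ValueError("transfer overlaps the next acquisition")
            labels.append("in")
            next_sample += 1
        else:
            # Queue is empty when the ADC is ready: convert a calibration signal.
            labels.append("cal")
    return labels


# Example with n = 5, t_A = 0.588 and t_T = 0.5 master periods (hypothetical
# values chosen to resemble 4.9 ns and half a period of a 120-MHz clock).
print(schedule(n=5, t_a=0.588, t_t=0.5, num_conversions=12))
```

In this model, one conversion out of every n + 1 is a calibration, and the schedule repeats every n(n + 1) master periods.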
III. Pipelined Algorithmic Architecture with Calibration
In the prototype algorithmic ADC, the residue gain g is less than 2 so that charge injection, comparator offset, and op-amp offset do not cause stage outputs to exceed the full-scale input range of the next stage [3] . The digital calibration adaptively determines g by measuring the major-carry jump and uses this information to appropriately weight the output bits, thus eliminating the effect of the reduced gain g on ADC linearity [2] .
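The effect of weighting the output bits by the measured residue gain can be illustrated with a minimal Python sketch. It is a simplified model, not the prototype's calibration algorithm: it assumes a single residue gain g, an ideal comparator with decisions of +1 or -1, a residue update r ← g·r − d·V ref, and that g is already known rather than measured from the major-carry jump. The names `convert`, `reconstruct`, and `max_error` are hypothetical.

```python
def convert(x, g, v_ref=1.0, n_decisions=10):
    """Return the +/-1 comparator decisions for input x (|x| <= v_ref)."""
    decisions = []
    r = x
    for _ in range(n_decisions):
        d = 1 if r >= 0 else -1
        decisions.append(d)
        r = g * r - d * v_ref      # residue amplified by g < 2
    return decisions


def reconstruct(decisions, g_est, v_ref=1.0):
    """Weight decision k by g_est**-(k+1), i.e. use radix g_est instead of 2."""
    return v_ref * sum(d * g_est ** -(k + 1) for k, d in enumerate(decisions))


def max_error(g_true, g_est):
    """Largest reconstruction error over a sweep of inputs in (-1, 1)."""
    xs = [i / 100 for i in range(-99, 100)]
    return max(abs(x - reconstruct(convert(x, g_true), g_est)) for x in xs)


print("weights from true gain 1.86:   ", max_error(1.86, 1.86))
print("weights from nominal gain 1.95:", max_error(1.86, 1.95))
```

With weights that match the actual gain, the error in this model is bounded by the last residue divided by g raised to the number of decisions. With mismatched weights, the error is much larger, which is the linearity loss the calibration is meant to remove.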
In a traditional algorithmic ADC, one comparator determines a single decision each time the residue goes around the loop. In practice, two SHAs are used in the loop because each SHA by itself does not sample its input while producing an output. Since two SHAs are used, the maximum sample rate of an algorithmic ADC can be doubled by using one comparator at the output of each SHA instead of only one comparator for both SHAs [4] . With two comparators, however, the loop has two residue gains. These gains are nominally set equal, but they may mismatch in practice. Although these two gains can be measured separately in principle, the calibration here calculates a single gain, which is a weighted average of the two possibly different gains.
This approach is effective in overcoming linearity errors to the extent that the two gains match each other. Since about 8-bit gain matching is expected in practice, and since the prototype is expected to have 8-bit resolution, this approach should be satisfactory here.
Another consequence of the use of only one SHA is that when the ADC enters the calibration mode, the calibration must be complete in less than one sample period instead of two sample periods with two SHAs in the queue. Therefore, the result of each individual calibration interval is not as accurate as in [2] . However, if the calibration is intended to track variations in the residue gain that only occur slowly (from changes in temperature for example), each individual calibration measurement need not be highly accurate by itself. Instead, the individual calibration measurements must only converge to an accurate value after averaging the results from many calibration cycles. To allow this averaging to occur without a limitation arising from quantization noise, dither is added to the calibration signal V cal [5] .
For a white Gaussian dither signal, at least 0.5 LSB rms is required [6] . This dither causes the major-carry jump to take on a variety of values that when averaged determine the residue gain to a greater accuracy than would otherwise be possible. However, adding noise to the calibration input adds noise to the value of g that is determined by the calibration. To overcome this problem, the step size in the adaptive calibration loop can be reduced so that the noise power at the ADC output is not significantly increased by the presence of the dither. The disadvantages of this approach are that it increases the number of cycles required for the calibration to converge, and it reduces the rate at which the calibration can track changes in the residue gain.
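The role of dither and of the step size can be illustrated with a short Python sketch. It is not the calibration loop of the prototype: it assumes a first-order update est ← est + step·(measured − est), a rounding quantizer with a step of 1 LSB, and an arbitrary true value of 3.3 LSB standing in for the quantity being measured. The function names and default arguments are hypothetical.

```python
import math
import random


def quantize(v):
    """Round to the nearest integer LSB."""
    return math.floor(v + 0.5)


def track(true_value, dither_rms, step, n_cycles, seed=0, start=3.0):
    """Average quantized, dithered measurements with a first-order loop."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    est = start
    for _ in range(n_cycles):
        measured = quantize(true_value + rng.gauss(0.0, dither_rms))
        est += step * (measured - est)   # smaller step: less noise, slower tracking
    return est


# Without dither, every measurement is the same code and averaging cannot
# resolve anything finer than 1 LSB. With 0.5 LSB rms Gaussian dither, the
# average approaches the true value.
print("no dither:  ", track(3.3, dither_rms=0.0, step=0.001, n_cycles=20000))
print("with dither:", track(3.3, dither_rms=0.5, step=0.001, n_cycles=20000))
```

Reducing `step` lowers the noise on the estimate but lengthens the time constant of the loop, which is roughly 1/step cycles in this model.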
IV. Prototype Implementation
The prototype uses three op amps (one for the queue and two for the ADC) and two comparators. Control logic for the ADC and queue, as well as output latches and buffers, is implemented on the prototype. The digital calibration is performed off chip.

Fig. 2a shows a simplified schematic of a residue amplifier. A differential circuit with two inputs (V ip and V in ) and two outputs (V op and V on ) is used in practice. In the differential circuit, the left side of C s is connected to the left side of its differential counterpart and otherwise floated during φ 2 , providing some common-mode rejection [7] . Here, d represents the binary decision of the corresponding comparator. A standard two-phase nonoverlapping clock is used. Clock phase φ 1 ′ (an early version of φ 1 ) falls slightly before φ 1 to reduce signal-dependent charge-injection errors [8] . These clocks run at the master clock frequency of 120 MHz. With an ideal op amp, the stage gain would be approximately 97.5/50 = 1.95. The SHA used in the queue is the same as this residue amplifier with the exceptions that C r and its switches are removed, C s and C f are equal, and no connection to a comparator is made.

the feedback factor and the speed for a given power dissipation. Since a tail current source is not used, the common-mode gain of the differential pair is not suppressed, and feedback to the V i+ and V i− inputs controls not only the differential output voltage but also the common-mode output voltage. This op amp was chosen for its simplicity and speed in keeping with the goal of relaxing the required gain through digital calibration. With an op-amp gain of a and a parasitic capacitance from each op-amp input to ground of C p , the differential residue amplifier output
where V id = V ip − V in is the differential input and V rd = V rp − V rn is the differential reference.
Calculation shows that C p is about 170 fF, and simulation shows that a is about 160. Therefore, the gain applied to the input in (3) is about 1.86. This gain is about 4.4% less than the gain with an ideal op amp and would limit the uncalibrated ADC linearity to less than 5 bits.
For testing, the sampling clock frequency is the master clock frequency divided by six, which is 20 Msamples/s. The conversion clock frequency is the master clock frequency divided by five or 24 Msamples/s, but one out of every six conversions operates on a calibration signal instead of the input. Each algorithmic stage produces one output bit each master clock cycle; therefore, five cycles are required to produce a raw output of 10 decisions. The digital output is truncated to 8 bits after the calibration calculations.
Two comparator decisions are produced on each cycle of the master clock, which is divided into two nonoverlapping phases of equal duration. As a result, each clock phase produces one comparator decision. The duration of each phase must be long enough to allow an op amp to settle. Since each ADC output consists of 10 comparator decisions, the op-amp settling time must be less than T C /10, where T C is the ADC conversion time. The transfer time t T in (1) is limited by the settling time of the op amp in the queue. Since all the settling times are assumed equal, one phase of the master clock is allocated for t T . Therefore, t T = T C /10.

For the prototype, t A = 4.9 ns and the maximum ADC conversion rate is 24 Msamples/s. Therefore, T C ≥ 41.7 ns and t T ≥ 4.17 ns. For f C = 24 Msamples/s, (1) gives f C /f S > 1.10, and (2) gives f C /f S > 1.12 for synchronous operation. As a result, the conditions described in the previous paragraph with f C /f S = 1.2 satisfy (1) and (2). In contrast, with a queue of two SHAs, f C /f S as low as 1.0006 was demonstrated [2]. The difference between these two cases is that the time required for acquisition and transfer can be accommodated within the second sampling interval when two SHAs are used in the queue but increases the required f C /f S ratio with only one SHA.

Finally, Fig. 4 shows the die photograph, and Table 1 summarizes the performance.
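The rate ratios quoted above can be checked with a few lines of Python. The sketch reads conditions (1) and (2) as T S ≥ T C + t T and T S ≥ T C + t A, following the wording of Section II, and divides both by T C to obtain a lower bound on f C /f S. The function name `min_ratios` and the variable names are assumptions of this example.

```python
def min_ratios(f_c, t_a, decisions_per_conversion=10):
    """Lower bounds on f_C/f_S implied by conditions (1) and (2)."""
    t_c = 1.0 / f_c                        # ADC conversion time
    t_t = t_c / decisions_per_conversion   # one clock phase allocated to transfer
    bound_1 = 1.0 + t_t / t_c              # from T_S >= T_C + t_T
    bound_2 = 1.0 + t_a / t_c              # from T_S >= T_C + t_A
    return bound_1, bound_2


F_MASTER = 120e6            # master clock frequency, Hz
f_s = F_MASTER / 6          # sampling rate, 20 Msamples/s
f_c = F_MASTER / 5          # conversion rate, 24 Msamples/s

r1, r2 = min_ratios(f_c, t_a=4.9e-9)
print(f"(1) requires f_C/f_S >= {r1:.2f}")
print(f"(2) requires f_C/f_S >= {r2:.2f}")
print(f"prototype uses f_C/f_S = {f_c / f_s:.2f}")
```

Under these assumptions, (2) is the tighter of the two bounds because t A exceeds t T, and the ratio of 1.2 used for testing leaves margin above both.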
V. Experimental Results
