A time-shifted correlated double sampling (CDS) technique is proposed in the design of a 10-bit 100-MS/s pipelined ADC. This technique significantly reduces the finite opamp gain error without compromising the conversion speed, allowing the active opamp blocks to be replaced by simple cascoded CMOS inverters. Both high-speed and low-power operation are achieved without compromising the accuracy requirement. An efficient common-mode voltage control is introduced for the pseudodifferential architecture, which further reduces power consumption. Fabricated in a 0.18-μm CMOS process, the prototype 10-bit pipelined ADC occupies 2.5 mm² of active die area. With a 1-MHz input signal, it achieves 65-dB SFDR and 54-dB SNDR at 100 MS/s. With a 99-MHz input signal, the SFDR and SNDR are 63 and 51 dB, respectively. The total power consumption is 67 mW at a 1.8-V supply, of which the analog portion consumes 45 mW without any opamp current scaling down the pipeline.
I. INTRODUCTION
THE pipelined ADC architecture has been adopted into many high-speed applications including high-performance digital communication systems and high-quality video systems [1], [2]. The rapid growth in these application areas is driving the design of ADCs toward higher operating speed, lower power consumption, and smaller die size. The continuing trend of submicron CMOS technology scaling, which is coupled with lower power supply voltages, makes it possible to keep up with the application development. However, this trend poses challenges to conventional pipelined ADC designs which rely on high-gain operational amplifiers to produce high-accuracy data converters. At low power supply voltage, large open-loop operational amplifier (opamp) gain is difficult to realize without sacrificing bandwidth and/or power consumption. As a result, the finite opamp gain is becoming a major hurdle in achieving both high speed and high accuracy.
One way to bypass this issue is to maintain a high opamp gain even at reduced power supply voltages. This is usually realized by a multistage opamp structure, a gain-boost technique, or long-channel devices biased at low current density. The price paid for this approach is complex design, large die area, high power consumption, and/or low bandwidth. The power and speed advantages of submicron CMOS technology are diminished or even completely lost. To avoid this undesirable overhead and to fully exploit the benefits of advanced submicron CMOS processes, it becomes necessary to use a low-gain single-stage opamp instead and to explore efficient circuit techniques which can tolerate the low opamp gain. Various digital self-calibration techniques [3]-[5] can correct the errors due to finite opamp gain as well as capacitor mismatches, but the normal conversion operation has to be stopped during the error measurement (i.e., foreground calibration). To avoid this interruption, background calibration may be used in pipelined ADCs [6]-[9]. However, the implementation tends to become very complex. Moreover, the power consumption and die area increase considerably. Another solution to this low opamp gain problem is the use of the correlated double sampling (CDS) technique [10], [12]. CDS techniques have been used successfully in integrator and amplifier designs. With CDS, the error resulting from the finite opamp gain becomes inversely proportional to the square of the opamp gain. This equivalently doubles the opamp gain in decibels (dB). Furthermore, the opamp offset is removed, and noise is also suppressed. However, the straightforward implementation of CDS in pipelined ADC design increases the load on the opamp and adds one extra clock phase (a detailed discussion is found in Section II). To solve this problem, we present in this paper a time-shifted CDS technique [14].
The proposed technique is highly effective for finite opamp gain compensation in the context of low-voltage and high-speed pipelined ADCs. Due to this effective gain compensation, the time-shifted CDS technique enabled a successful implementation of a low-power and high-speed pipelined ADC that uses simple cascoded CMOS inverters in place of traditional opamps.
The rest of this paper is organized as follows: Section II describes the time-shifted CDS technique; Section III describes the circuit implementation of the prototype 10-bit pipelined ADC; and Section IV presents the experimental results. Concluding remarks are given in Section V.
II. TIME-SHIFTED CDS TECHNIQUE
One of the simplest implementations of pipelined ADCs incorporating digital correction/redundancy is based on the 1.5-bit-per-stage architecture shown in Fig. 1. This architecture is widely used to maximize conversion speed [13]. Fig. 2 shows a typical multiplying digital-to-analog converter (MDAC) used in this type of pipelined ADC architecture. The output of this MDAC at the end of the amplification phase is
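$$V_{out} = 2V_{in} - D \cdot V_{ref} + \varepsilon \qquad (1)$$

where $V_{in}$ is the sampled input, $D$ is $\pm 1$ or $0$ depending on the result of the sub-ADC conversion of the sampled input, and the error resulting from the finite opamp gain $A$ is

$$\varepsilon \approx -\frac{2}{A}\left(2V_{in} - D \cdot V_{ref}\right) \qquad (2)$$

This error is inversely proportional to the opamp gain $A$, directly deteriorating the overall linearity of the ADC.

0018-9200/04$20.00 © 2004 IEEE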
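The effect of finite gain on a single stage can be checked numerically. The following is a minimal sketch, not the authors' model: it assumes the textbook 1.5-bit residue 2·V_in − D·V_ref, comparator thresholds at ±V_ref/4, a feedback factor of 1/2, and no parasitic capacitance. The function name `mdac_residue` and its parameters are hypothetical.

```python
def mdac_residue(v_in, v_ref=1.0, gain=None):
    """Residue of a 1.5-bit MDAC; gain=None models an ideal opamp."""
    # Sub-ADC decision with thresholds at +/- v_ref/4 (assumed standard choice).
    if v_in > v_ref / 4:
        d = 1
    elif v_in < -v_ref / 4:
        d = -1
    else:
        d = 0
    ideal = 2 * v_in - d * v_ref
    if gain is None:
        return ideal
    # Closed-loop transfer with feedback factor 1/2: ideal / (1 + 2/A).
    return ideal / (1 + 2 / gain)


gain_40db = 10 ** (40 / 20)          # 40 dB expressed as a linear gain of 100
v_in = 0.2                           # falls in the D = 0 region
error = mdac_residue(v_in, gain=gain_40db) - mdac_residue(v_in)
print(f"residue error at 40 dB gain: {error * 1e3:.2f} mV")
```

With a 40-dB opamp the residue error is roughly 2% of the ideal residue, far too large for a 10-bit front-end stage, which is the problem the CDS discussion below addresses.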
As mentioned earlier, one effective finite opamp gain compensation method is the CDS technique, which makes the error due to finite opamp gain inversely proportional to the square of the opamp gain. Fig. 3 illustrates one straightforward implementation of the conventional predictive CDS in the context of a 1.5-bit MDAC. Three clock phases and two sets of switches and capacitors are required in this scheme. One capacitor pair serves the predictive MDAC operation, and the other pair serves the real MDAC operation. First, during the sampling phase, all the capacitors sample the input voltage. Next, during the predictive amplifying phase, the real capacitor pair holds the sampled input signal while the predictive pair produces the predictive output signal. In the meantime, the nonzero error voltage due to finite opamp gain at the negative input of the opamp is stored on an error-storage capacitor. Finally, during the real amplifying phase, the real capacitor pair produces the real output signal. Because the error-storage capacitor is connected between the negative input of the opamp and the common node of the real capacitor pair (node G), a much more accurate virtual ground is created at node G. The output error due to finite opamp gain is mostly cancelled. Naturally, this cancellation will not be perfect. The residual output error after CDS, given by (3), depends on the current output in the main pipeline and on the output of the previous clock phase (predictive pipeline). Note that the error is inversely proportional to the square of the opamp gain. This equivalently doubles the opamp gain in decibels. In a practical design, this equivalent opamp gain boosting may be less significant due to parasitics and other second-order effects, but despite this practical limitation an improvement of at least 30 dB can be achieved easily. This would be enough for most low-voltage pipelined ADC designs where finite opamp gain is the limiting factor.
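The "gain doubling in decibels" claim can be illustrated with an idealized first-order model. This sketch is an assumption-laden simplification, not the paper's equation (3): it assumes a gain-of-2 stage with feedback factor 1/2, identical inputs in the predictive and real phases, and no parasitics or loading from the error-storage capacitor. All function names are hypothetical.

```python
import math


def relative_error_plain(gain):
    """Relative residue error of a gain-of-2 stage without CDS."""
    return 1 / (1 + 2 / gain) - 1


def relative_error_cds(gain):
    """Same stage with an idealized predictive CDS (assumed model)."""
    predicted = 1 / (1 + 2 / gain)  # normalized predictive-phase output
    # In the real phase, node G only sees (V_out - V_pred) / A, so
    # V_out * (1 + 2/A) = V_ideal + 2 * V_pred / A.
    actual = (1 + 2 * predicted / gain) / (1 + 2 / gain)
    return actual - 1


def equivalent_gain_db(rel_error):
    """Open-loop gain a plain stage would need to show the same error."""
    return 20 * math.log10(2 / abs(rel_error) - 2)


a = 100.0  # 40 dB
print(f"plain: {relative_error_plain(a):.2e}, "
      f"CDS: {relative_error_cds(a):.2e}, "
      f"equivalent gain: {equivalent_gain_db(relative_error_cds(a)):.1f} dB")
```

Under these assumptions a 40-dB amplifier behaves like one with roughly 74 dB of gain, that is, twice the gain in decibels minus about 6 dB. As the text notes, parasitics reduce this benefit in a real circuit.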
Although the CDS scheme described above can nearly eliminate the MDAC's residue error due to finite opamp gain, it comes with a price. Two drawbacks can be observed from the above illustration. First, one extra clock phase is required in addition to the conventional nonoverlapping two-phase clocking scheme in the switched capacitor (SC) circuit. This implies that either the opamp settling time has to be reduced for the same clock frequency, or the clock frequency has to be reduced to maintain the same opamp settling time. Faster opamp settling time requires larger power consumption. Therefore, either the power consumption or the conversion rate is compromised in this CDS scheme. Second, during the sampling phase , both predictive sampling capacitors and real sampling capacitors are connected to the driving stage output, resulting in double load to the driving stage opamp, assuming all capacitors are the same size. As a direct result of this increased load, the opamp bandwidth gets reduced. If the same bandwidth is to be maintained, more bias current is needed. Once again, either the power consumption or the conversion rate has to be compromised for this opamp gain compensation. These two drawbacks make the CDS technique less suitable for the high-speed pipelined ADC design where the power and speed requirements are stringent. Therefore, it would be highly desirable to find a new method to incorporate the CDS operation into the pipelined ADC so that the large conversion rate and/or power consumption overhead would be avoided.
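The cost of the extra clock phase can be put into rough numbers. This is a back-of-the-envelope sketch that assumes equal-length phases and the 400-ps nonoverlap time quoted later in the paper; `phase_budget_ns` is a hypothetical helper, not part of the design.

```python
def phase_budget_ns(sample_rate_hz, phases, nonoverlap_ns=0.4):
    """Time available per clock phase, assuming equal-length phases."""
    period_ns = 1e9 / sample_rate_hz
    return period_ns / phases - nonoverlap_ns


two_phase = phase_budget_ns(100e6, phases=2)    # conventional two-phase clocking
three_phase = phase_budget_ns(100e6, phases=3)  # predictive CDS with an extra phase
speedup_needed = two_phase / three_phase        # how much faster settling must be
print(f"{two_phase:.2f} ns vs {three_phase:.2f} ns per phase, "
      f"settling must be {speedup_needed:.2f}x faster")
```

At 100 MS/s the per-phase budget drops from about 4.6 ns to about 2.9 ns, so the amplifier would have to settle more than 1.5 times faster before the doubled sampling load is even considered.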
Such a solution indeed exists, and we refer to it as the time-shifted CDS, since it can be implemented by incorporating some timing adjustments into the conventional predictive CDS scheme [14]. The goals of the time-shifted CDS are to eliminate the extra clock phase and to perform the pre-sampling and the real sampling in different clock phases, which avoids the added capacitive loading. Both drawbacks of the conventional CDS are thereby avoided. This timing change is illustrated in Fig. 4. Fig. 4(a) shows the timing of two cascaded stages in the pipelined ADC using the conventional CDS. Note the necessary overlap between the amplifying phase of stage i and the sampling/pre-sampling phase of stage i+1 for correct signal processing. It is not difficult to see that the "preliminary" residue voltage of stage i is already available for sampling in the pre-amplifying phase. Stage i+1 can make use of this time slot to do the pre-sampling without any timing conflict. As a result, the double capacitive loading on the output of stage i is avoided by this separation of the pre-sampling and sampling phases. This new timing scheme is shown in Fig. 4(b). Note that the entire timing of stage i+1 is shifted one clock phase ahead, and the sampling phase and the pre-amplifying phase share the same time slot. Now it can be seen that the amplifying phase and the pre-sampling phase can be merged into one clock phase since they are totally independent of each other. The resulting time-shifted CDS clock scheme is shown in Fig. 4(c). Only two clock phases are required in this scheme. No extra load is added to the opamp during any clock phase. The speed and/or power consumption overhead of the conventional predictive CDS technique is removed completely. Fig. 5 shows the proposed 10-bit pipelined ADC architecture employing the time-shifted CDS technique. Conceptually, this architecture realizes two pipelined paths working in parallel for the first few stages.
One path represents the predictive path which only operates for the first four stages, and the other path represents the main signal path which operates for all nine stages necessary for the 10-bit conversion. The first four stages of the main signal path are very similar to their corresponding stages in the predictive path, and they share the same set of active stages (opamps/inverters and comparators), thus there is no duplication of active stages. Both signal paths (main and predictive) process the same input signal from the first sample-and-hold (S/H) stage, but the main signal path is delayed a half clock cycle (one phase) by an additional S/H (actual implementation shares the same opamp/inverter) following the first S/H. The input signal is first processed by the predictive pipeline and the finite opamp gain error is stored on a capacitor. The stored error is used to correct the corresponding stage in the main pipeline in the following clock phase (half clock cycle delay). As both signal paths (predictive and main) share the same opamp/inverter, this operation is easily achieved with added switches and capacitors.
The SC implementation of this MDAC operation merging both the predictive pipeline and the main pipeline is shown in Fig. 6. The circuit has one input/output pair for the main pipeline and a second input/output pair for the corresponding predictive pipeline. The capacitor values are chosen to satisfy a fixed sizing relation. In the proposed time-shifted CDS scheme, the sampling and amplifying operation is actually performed twice. The first operation is done by the predictive capacitor pair, and the nonzero error voltage due to finite opamp gain at the negative input of the opamp is stored on an error-storage capacitor. The second operation is done by the main capacitor pair, with the error-storage capacitor connected between the negative input of the opamp and the common node of the main capacitor pair (node G). An accurate virtual ground is created at node G. While the operation of this time-shifted CDS technique may appear to be similar to conventional CDS techniques [10], [11], it requires neither the additional capacitive load on the opamp nor the extra clock phase(s) in the ADC operation. Any possible speed penalty due to the CDS operation is completely avoided, which is critical in achieving low-power and high-speed ADC performance.
Some design considerations of the proposed architecture are described in the following. First, because the inputs of the MDAC are not the same for the predictive path and the main path (with the exception of the very first MDAC), the error correction will not be as effective as in the conventional CDS techniques. The output error at stage i in the main pipeline, approximately given by (4), depends on the current output in the main pipeline and on the output of the previous clock phase (predictive pipeline). Note that the error is inversely proportional to the square of the opamp gain. However, this error will increase from stage to stage down the pipeline. This is because the discrepancy between the outputs of the predictive path and the main path becomes larger from stage to stage down the pipeline. Fortunately, the accuracy requirement of the pipelined ADC is reduced as the residue signal propagates down the pipeline, and this decreasing accuracy of the CDS operation down the pipeline is comfortably tolerated. The second design issue is that the time-shifted CDS effectively adds extra offset to the sub-ADCs in the main pipeline. The reason is that the MDACs in the main pipeline need to use the digital code generated by the sub-ADCs in the predictive pipeline. This is equivalent to adding a signal-dependent offset to the sub-ADCs in the main pipeline. Fortunately, the digital redundancy of the pipelined ADC is able to correct for the offset, whether signal-dependent or not, as long as the amount of the offset is within the correctable range ($\pm V_{ref}/4$ for the 1.5-bit-per-stage MDAC).

Some behavioral simulations have been done to verify the effectiveness of the proposed architecture. In simulation, the opamp gain was chosen to be 40 dB, the capacitor mismatch was assumed to be less than 0.1%, and the sub-ADCs were given bounded random offsets. Fig. 7 shows the results of the architecture using the conventional CDS technique (recall the extra capacitive load and extra clock phase overhead). Fig. 8 shows the results of the proposed architecture using the time-shifted CDS technique in the first five stages. It can be seen that their performances are very close in terms of SNDR. The third-order harmonic in the time-shifted CDS is found to be a bit higher than in the conventional CDS. This is because the proposed time-shifted architecture does face a small and increasing degradation of error correction down the pipeline, as noted earlier. The larger third harmonic observed is due to insufficient gain error correction for the later stages. This degradation does not significantly affect the overall performance because the opamp gain requirement is also reduced down the pipeline as the MSBs are resolved. For comparison, the simulation results of the regular pipelined ADC without any gain error correction are shown in Fig. 9. Note that the SNDR is only about 43 dB, which is 16 dB lower than the SNDR of the architectures with gain error correction.
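A much simpler behavioral model than the one used by the authors is enough to see both effects discussed here: digital redundancy absorbing comparator offsets, and uncorrected finite gain breaking 10-bit accuracy. The sketch below is an illustration under stated assumptions, not a reproduction of the paper's simulations. It assumes eight 1.5-bit stages followed by a 2-bit flash (nine stages in total), V_ref normalized to 1, one shared offset per stage for both comparators, a feedback factor of 1/2, and no CDS, noise, or capacitor mismatch. The names `convert` and `max_error_lsb` are hypothetical.

```python
import random

LSB = 2 / 1024  # 10 bits over a [-1, 1) input range


def convert(v, gain=None, offsets=None, stages=8):
    """Reconstructed input for v in [-1, 1); gain=None is an ideal opamp."""
    offsets = offsets or [0.0] * stages
    estimate, weight = 0.0, 0.5
    for off in offsets:
        # 1.5-bit sub-ADC with thresholds at +/- 1/4, shifted by the offset.
        d = 1 if v > 0.25 + off else (-1 if v < -0.25 + off else 0)
        estimate += d * weight
        weight /= 2
        v = 2 * v - d                  # ideal residue
        if gain is not None:
            v /= 1 + 2 / gain          # finite-gain error, feedback factor 1/2
    # Final 2-bit flash: quantize the last residue to 4 levels.
    code = min(3, max(0, int((v + 1) * 2)))
    return estimate + ((code - 1.5) / 2) * (2 * weight)


def max_error_lsb(gain=None, offsets=None):
    """Worst-case reconstruction error over a ramp, in LSBs."""
    ramp = [i / 500 - 0.999 for i in range(1000)]
    return max(abs(convert(v, gain, offsets) - v) for v in ramp) / LSB


random.seed(0)
offs = [random.uniform(-0.2, 0.2) for _ in range(8)]  # inside +/- 1/4
print(f"ideal gain, with offsets: {max_error_lsb(offsets=offs):.2f} LSB")
print(f"40-dB gain, no correction: {max_error_lsb(gain=100.0):.2f} LSB")
```

With ideal gain the worst-case error stays at the half-LSB quantization limit even with comparator offsets of up to 0.2·V_ref, while a 40-dB gain without any correction produces errors of several LSBs.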
III. CIRCUIT DESIGN

A. CMOS Inverter as Opamp
The opamp is one of the most critical building blocks in pipelined ADCs. The opamp dc gain and bandwidth determine the achievable accuracy and conversion rate. For a 10-bit pipelined ADC, the open-loop opamp gain needs to be well over 60 dB. It is not uncommon to see 80-dB gain in practical design examples. Designing such a high-gain opamp at low supply voltage is quite challenging because traditional stacking of cascode transistors is not feasible. Use of compensated multistage opamps will lead to considerably increased power consumption and reduced speed. In this prototype IC implementation, we used simple cascoded (both the NMOS input and PMOS current source) CMOS inverters, as shown in Fig. 10. Replacing opamps with these inverters allowed large signal swing, large bandwidth, and low power consumption. The simulation results of the designed inverter in 0.18-μm CMOS technology are summarized in Table I. More than 2-GHz gain bandwidth and 1-V single-ended signal swing from a 1.8-V power supply are achieved with only 1-mA current dissipation, and the open-loop dc gain is 43 dB. This level of dc gain is insufficient for 10-bit accuracy, but we are able to tolerate the low dc gain due to the enhancements achieved from the time-shifted CDS technique described above.
B. Pseudodifferential MDAC
The use of inverters in place of opamps implies an inherently single-ended design. We have adopted a pseudodifferential configuration throughout the pipelined ADC design to suppress even-order harmonics and supply/substrate noise. In other words, two single-ended MDACs in parallel are used to build the pseudodifferential MDAC. While the pseudodifferential pipelined ADC can achieve lower power consumption than its fully differential counterpart, as demonstrated by Miyazaki [15], it still requires some sort of equivalent common-mode feedback (CMFB) operation. Without the equivalent CMFB function, any common-mode error in the pipeline would be amplified in just the same way that the differential input signal is amplified (residue amplification). This can cause single-ended opamps (inverters) to saturate down the pipeline. To mitigate this issue, a hybrid structure which includes both fully differential stages and pseudodifferential stages was employed in [15]. The price paid is increased power consumption and design complexity due to the use of the fully differential MDAC. In order to fully exploit the low-power advantage of the pseudodifferential architecture without implementing a traditional CMFB with its power consumption overhead, a new pseudodifferential MDAC that uses a differential float sampling scheme is proposed. This is shown in Fig. 11 (time-shifted CDS not shown for simplicity). The differential gain of this MDAC is still two, but the common-mode gain is just one, because one pair of input capacitors is differentially sampled without a specific common-mode reference (thus floating). This equivalent CMFB operation is achieved with no speed penalty. The complete pseudodifferential MDAC incorporating time-shifted CDS is shown in Fig. 12. For comparison, another pseudodifferential MDAC that does not suffer from common-mode error amplification is shown in Fig. 13 [16].
Note that this structure requires a dedicated sampling capacitor of doubled size, a dedicated feedback capacitor, and an added reference injection capacitor. During the amplifying phase, the bottom plates of the sampling capacitors are connected together to transfer only differential charge to the feedback capacitor. Therefore, this noncapacitor-flip-over pseudodifferential MDAC can decouple the input and output common-mode voltages. However, the opamp feedback factor drops from 1/2 to 1/4 in this structure, resulting in a large loop bandwidth reduction. Moreover, the kT/C noise and opamp noise are also doubled. All these drawbacks make it unsuitable for our high-speed and low-power prototype ADC design.
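Two of the trade-offs in this subsection lend themselves to quick estimates: how a common-mode error grows when each stage has a common-mode gain of two rather than one, and how the feedback factor affects settling. The sketch below is illustrative only. It assumes a 1-mV initial common-mode error over eight stages (hypothetical values), a single-pole amplifier whose closed-loop bandwidth is the feedback factor times the 2-GHz gain bandwidth quoted for the inverter, and settling to half an LSB at 10 bits. The function names are hypothetical.

```python
import math


def common_mode_error_mv(initial_mv, cm_gain, stages):
    """Common-mode error after a number of stages with a given CM gain."""
    return initial_mv * cm_gain ** stages


def settling_time_ns(gbw_hz, feedback_factor, bits=10):
    """Single-pole linear settling time to half an LSB at the given resolution."""
    closed_loop_bw_hz = feedback_factor * gbw_hz
    tau_s = 1 / (2 * math.pi * closed_loop_bw_hz)
    return tau_s * math.log(2 ** (bits + 1)) * 1e9


print(common_mode_error_mv(1.0, cm_gain=2, stages=8))  # CM amplified like the signal
print(common_mode_error_mv(1.0, cm_gain=1, stages=8))  # float-sampling MDAC
print(f"{settling_time_ns(2e9, 0.5):.2f} ns at feedback factor 1/2")
print(f"{settling_time_ns(2e9, 0.25):.2f} ns at feedback factor 1/4")
```

Under these assumptions a 1-mV common-mode error grows to 256 mV after eight stages with a common-mode gain of two but stays at 1 mV with unity common-mode gain, and halving the feedback factor doubles the settling time.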
C. Double-Sampling S/H Stage
A front-end S/H circuit is critical in the design of a high-performance pipelined ADC. It usually takes up a large die area and consumes much power. The S/H also puts limits on linearity and noise. The proposed architecture shown in Fig. 5 indicates that two front-end S/H blocks are required to apply the time-shifted CDS technique. Implementing an extra S/H will not only add power consumption and die area but will also add noise to the input signal. However, we can realize the equivalent function of these two S/H circuits by using just one double-sampling S/H. Thus, there will not be any added power consumption, die area, or noise. Fig. 14 shows the double-sampling S/H circuit that is implemented (single-ended illustrated for simplicity) in this prototype IC. There are two sets of sampling switches and capacitors for this time-interleaved operation, and they operate at half the speed of the overall ADC. This S/H circuit will provide a sampled output (hold operation) for two sampling phases of the first-stage pipeline employing the time-shifted CDS. This double-sampling S/H circuit is insensitive to timing skew due to the use of a series master sampling switch [17] . The capacitor mismatches are alleviated due to inherent voltage-mode operation (i.e., sampled input voltage is "flipped" to the output). The opamp offset and gain mismatches (memory) are manageable at the 10-bit level. In this S/H, a CMOS inverter is also employed in place of a conventional opamp to reduce power consumption. Note that we did not use CDS in this S/H circuit. The finite/low gain of the inverter used in the S/H circuit only causes linear gain error, which can be tolerated in most applications, without degrading the linearity of ADC. The linearity of the low-gain inverter was found sufficient for 10-bit accuracy at the specified signal swing. One major concern of the front-end sampling circuit is the nonlinearity caused by the input sampling switches. 
Bootstrapped sampling switches are commonly used in practical design to achieve superior linearity for very high-frequency input signal [18] . The price paid is the added complexity. Long-term reliability is also an issue for bootstrapped switches, particularly in deep-submicron CMOS processes. In this prototype design, CMOS transmission gate switches are employed. Simulation results have indicated that 10-bit linearity can be achieved even for a 100-MHz input signal after optimizing the sizes of the input sampling switch transistors.
Another major concern is the noise requirement. The sampling capacitor used in the S/H stage is 0.8 pF. The sampling capacitor in the first-stage MDAC is also 0.8 pF. In the remaining stages, the sampling capacitors are all 0.4 pF. We did not further scale the capacitors to simplify IC implementation. These values are comparable to (or even larger than) several recently published 10-bit ADC designs. Thus, the low power dissipation is achieved mainly by the use of proposed low-power techniques, rather than by aggressive capacitor scaling.
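The thermal noise implied by these capacitor values can be estimated from the standard kT/C expression. This is a minimal sketch assuming a temperature of 300 K and a single sampling event per capacitor; it ignores opamp noise and any differential or double-sampling factors. `ktc_noise_uv` is a hypothetical helper name.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_uv(cap_farads, temp_kelvin=300.0):
    """RMS sampled thermal noise sqrt(kT/C), in microvolts."""
    return math.sqrt(BOLTZMANN * temp_kelvin / cap_farads) * 1e6


for cap_pf in (0.8, 0.4):
    print(f"{cap_pf} pF -> {ktc_noise_uv(cap_pf * 1e-12):.1f} uV rms")
```

Under these assumptions the 0.8-pF capacitors contribute about 72 µV rms per sample and the 0.4-pF capacitors about 102 µV rms.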
D. Comparator
The commonly used capacitively coupled comparator shown in Fig. 15 is adopted in the sub-ADC design. The input capacitors are 0.1 pF each. No offset cancellation scheme is employed because large comparator offsets can be tolerated in 1.5-b/stage pipelined ADCs. The time-shifted CDS technique does make this tolerance smaller. However, it was verified in the modeled simulation environment that there was no obvious performance degradation even with sizable comparator offsets. One critical part of this comparator module is the latched comparator, which is shown in Fig. 16. It includes three stages: the input amplifier (M1 and M2), the NMOS and PMOS regeneration latches (M5-M8), and the output S-R latch (M13-M20). The input amplifier is a simple NMOS differential pair with 300-μA bias current, which not only amplifies the input signal but also suppresses the kickback noise from the regeneration latches. The NMOS switches (M3 and M4) turn off the input differential pair during the regeneration time to save power. This also helps reduce kickback noise from the regeneration latches. The combination of PMOS and NMOS regeneration latches speeds up the regeneration compared to PMOS-only latches. The regeneration latches are reset to a voltage close to the power supply by M11 and M12 during the sampling/resetting phase. One additional reset switch, M10, across the differential latching node reduces the offset due to the mismatch of M11 and M12. The NMOS switch M9 disables the NMOS regeneration latch during the resetting phase to avoid a large dc current to ground. The output S-R latch holds the comparison result during the whole clock period for the convenience of the following encoding logic. With about 0.3 mW at 1.8 V, this latched comparator achieves less than 250-ps regeneration time for a 2-mV differential input signal, which is short enough for a 100-MHz clock with 400-ps nonoverlap time.
E. Distributed Clock Generator
The distributed internal clock generator scheme shown in Fig. 17 is used in this design to reduce the load on the clock drivers due to parasitic capacitances of interconnect wires. It also helps to reduce delay skew due to interconnect wires. Note that two internal clock references are generated from the single input reference clock. One clock runs at a half rate for use in the local clock generator which generates clocks for the double-sampling S/H stage. The other one is at the full rate for use in all other local clock generators. The delay matching of these clock signals is critical. Much care is taken to ensure proper functionality as well as performance. Extensive design and layout optimization (such as inserting dummy load and matching lengths of clock lines) has been incorporated to minimize the delay skew. The simulated maximum clock skew in the final design is less than 30 ps across all process variations. Typical clock rising/falling time is 50 ps, and typical clock nonoverlap time is 400 ps.
IV. EXPERIMENTAL RESULTS
The prototype ADC was fabricated in a 0.18-μm CMOS process. The die photograph is shown in Fig. 18. The active die area is 1.2 mm × 2.1 mm. The total power consumption is 67 mW at a 1.8-V supply and 100-MHz sampling frequency. The analog portion consumes 45 mW. The measured DNL and INL are 0.8 LSB and 1.6 LSB, respectively, as shown in Fig. 19. With 1-MHz input and 100 MS/s, the measured SFDR, SNR, and SNDR are 65, 55, and 54 dB, respectively. Fig. 20 shows a typical measured frequency spectrum at 1-MHz input and 100 MS/s (the digital output of the ADC is decimated/downsampled by 4 on chip for testing purposes). Fig. 21 shows the dynamic performance versus the magnitude of the 1-MHz input at 100 MS/s. Fig. 22 shows the dynamic performance versus input frequency at 100 MS/s. The measured SFDR, SNR, and SNDR at 99-MHz input frequency are 63, 52, and 51 dB, respectively. Fig. 23 shows the dynamic performance versus conversion/clock rate with a 1-MHz input signal. Performance degrades past 100 MS/s. The results are summarized in Table II.
V. CONCLUSION
A time-shifted CDS technique which compensates for the finite amplifier gain of an inverter-based pipelined ADC is described. The proposed technique enables low-power and high-speed operation by allowing significantly reduced amplifier gain. Prototype IC measurements demonstrate 67-mW 10-bit 100-MS/s performance. While it is common practice to scale opamp bias currents and capacitor sizes down the pipeline to reduce power consumption in practical pipelined ADC designs [19], [20], no scaling was applied, for prototyping convenience. All opamps dissipate the same amount of current throughout the pipelined ADC. It is expected that a further 20%-30% reduction of analog power consumption can be achieved with proper opamp scaling.
The achieved results indicate that a design incorporating an effective CDS technique (e.g., time-shifted CDS) in combination with simple active stages (e.g., inverters) can achieve significant speed improvement while maintaining, or even lowering, the overall power consumption. This time-shifted CDS technique can also be used in a fully differential implementation of a pipelined ADC. The fully differential implementation may end up being simpler than the pseudodifferential ADC, since a CMFB circuit would be used, completely avoiding the common-mode voltage drift issue. However, a fully differential circuit is likely to have higher power consumption due to the added CMFB circuit. The fully differential opamp signal swing will also have to be reduced with the use of a tail current source, although it would provide better noise rejection. In short, the fully differential implementation would provide a better SNR at the cost of increased power consumption.
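As a reading aid for the measured SNDR values reported in this paper (54 dB at 1-MHz input and 51 dB at 99-MHz input, both at 100 MS/s), the standard conversion from SNDR to effective number of bits can be applied. This is a generic sketch using the usual sine-wave formula; the paper itself does not quote ENOB, and `enob` is a hypothetical helper name.

```python
def enob(sndr_db):
    """Effective number of bits for a full-scale sine input."""
    return (sndr_db - 1.76) / 6.02


for f_in_mhz, sndr_db in ((1, 54.0), (99, 51.0)):
    print(f"{f_in_mhz}-MHz input: SNDR {sndr_db} dB -> ENOB {enob(sndr_db):.2f} bits")
```

By this formula the prototype delivers roughly 8.7 effective bits at low input frequency and roughly 8.2 effective bits near Nyquist.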
