Abstract: A 10-bit 170MS/s two-step binary-search assisted time-interleaved SAR ADC architecture is proposed, in which the front-end is a 5b binary-search ADC shared by two time-interleaved 6b SAR ADCs in the 2nd-stage. The design avoids op-amps, which dissipate large power. In addition, a process-insensitive asynchronous logic is proposed to further reduce the delay of the SA loop. The ADC was fabricated in 65nm CMOS and achieves 54.6dB SNDR at 170MS/s with only 2.3mW of power consumption, leading to a FoM of 30.8fJ/conversion-step.
I. INTRODUCTION
Mobile video and high-definition television applications require medium-resolution, high-speed, and low-power ADCs, which can be realized with pipelined [1] [2] or pipelined-SAR [3] [4] architectures. Pipelined ADCs require multiple op-amps that consume large power and become difficult to design as technology scales and supply voltages drop [1] [2]. The pipelined-SAR ADC is a potential architecture for achieving both high speed and low power, but the implementation of the inter-stage gain still requires one op-amp that consumes static power [3] [4].
The SAR ADC architecture is well known to be power efficient for medium-resolution applications, but its speed is less than 100MS/s for resolutions >9b because of the feedback loop delay of N-bit cycling; the switching of the MSB capacitors in the DAC array is the most critical part. The comparator-based binary-search ADC [5] is another option that achieves both low power and high speed, but its resolution is limited by the comparators' offsets and by the exponential growth of hardware (comparators, switch matrices, decoders, and reference ladder).
This paper presents a two-step Binary-Search (BS) assisted Time-Interleaved (TI) SAR ADC architecture, which combines the speed, resolution, and power benefits of these two types of ADCs. The 1st-stage is a binary-search ADC that determines the coarse 5b. The 2nd-stage uses the TI-SAR scheme to improve the speed with low power dissipation. The architecture avoids op-amps to save power and overcomes the resolution limit of the BS ADC and the speed limit of the SAR ADC. In addition, a process-insensitive asynchronous SAR logic is proposed, which auto-detects the logic delay of each cycling bit instead of using a fixed delay cell with off-chip tunable control as in [6]. A 10-bit 170MS/s prototype is verified in 65nm CMOS with a FoM of 30.8fJ and 36.4fJ per conversion-step at DC and Nyquist input, respectively.
II. PROPOSED ADC ARCHITECTURE
The proposed ADC architecture is shown in Fig. 1. It is composed of a high-speed BS ADC shared by two interleaved low-speed SAR ADCs. The 1st-stage is a 5b BS ADC [5], composed of 5 comparators, a resistive reference ladder, and distributed track-and-holds (T/Hs). Compared to flash ADCs, BS ADCs activate the comparators bit-by-bit rather than all at once, thus avoiding large power dissipation. After the residue is generated at the top-plate of the DAC, the 2nd-stage SAR1 starts to convert the remaining 6b fine code. The two SAR ADCs sample and quantize alternately, which doubles the operating speed. In this way, the BS ADC exhibits an optimized trade-off between speed and power.
In addition, the switching of the MSB array is the bottleneck of DAC settling, and it costs power in both the references and the switch buffers. The 1st-stage ADC charges the MSB array of the DAC up or down before the 2nd-stage starts quantization, so fewer buffers and smaller switches can be used. Reducing the input range of the 2nd-stage also reduces the DAC settling delay. Therefore, both the power consumption and the DAC settling of the subsequent SAR ADCs benefit from the proposed ADC architecture.
III. PROCESS INSENSITIVE ASYNCHRONOUS LOGIC
The dominant delays of the SA loop are the digital logic delay and the DAC settling delay. Since the MSB capacitors are settled before the 2nd-stage conversion, the DAC settling delay is significantly reduced. The logic delay is minimized by the proposed process-insensitive asynchronous logic shown in Fig. 2, together with its timing diagram. Traditional asynchronous SAR ADCs use a fixed delay of the comparator ready signal to trigger the asynchronous logic [6]. However, the logic delay (t_LOGIC), which includes the delays of the SAR control logic and the switch buffers, suffers from process variation; a fixed delay cannot guarantee that it is covered, and incorrect quantization results. For this reason, a tunable delay with off-chip control was implemented in [6]. The proposed scheme uses the switching signals (S_P,1, S_N,1 … S_P,k, S_N,k) to trigger the asynchronous loop instead of a fixed delay of the comparator ready signal, so it does not require tunable delay cells to overcome the process variation problem.
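The difference between the two triggering schemes can be illustrated with a toy timing model. The sketch below is not the circuit of Fig. 2: the delay values, the corner scale factors, and all function names are hypothetical, DAC settling is ignored, and the fixed delay is assumed not to track the logic delay. It only shows why a fixed delay sized for one corner can clock the comparator before the logic and switch buffers have finished, whereas a loop triggered by the switching signals waits exactly as long as t_LOGIC.

```python
# Toy timing model of one SA loop iteration (all numbers hypothetical, in ps).
T_CMP = 150.0            # assumed comparator decision time
T_LOGIC_TYP = 120.0      # assumed typical SAR logic + switch buffer delay
CORNERS = {"fast": 0.7, "typical": 1.0, "slow": 1.4}  # assumed scale factors


def fixed_delay_cycle(t_logic, t_fixed):
    """Fixed-delay scheme: the next comparison is launched t_fixed after the
    comparator-ready signal, whether or not the logic has finished.
    Returns (cycle_time, is_correct)."""
    is_correct = t_fixed >= t_logic
    return T_CMP + t_fixed, is_correct


def self_timed_cycle(t_logic):
    """Switching-signal-triggered scheme: the comparator is clocked only after
    the switching signal appears, so the wait equals the actual logic delay.
    Returns (cycle_time, is_correct)."""
    return T_CMP + t_logic, True


def compare_schemes(t_fixed):
    """Evaluate both schemes at every assumed process corner."""
    rows = {}
    for name, scale in CORNERS.items():
        t_logic = T_LOGIC_TYP * scale
        rows[name] = {
            "fixed": fixed_delay_cycle(t_logic, t_fixed),
            "self_timed": self_timed_cycle(t_logic),
        }
    return rows


if __name__ == "__main__":
    for corner, row in compare_schemes(t_fixed=130.0).items():
        print(corner, row)
```

With these assumed numbers, the fixed delay is too short at the slow corner and needlessly long at the fast corner, while the self-timed loop is correct at every corner and shortens its cycle when the logic is fast.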
The operation of the proposed logic is separated into 5 steps. During the reset phase (Φ_R = 0), the dynamic node B_i is reset to V_DD so that the SA loop is disabled (1). Once the SAR starts quantization (2), the comparator is clocked, and the control logic provides a switching signal (3) that selects the appropriate reference voltage (V_REFP/V_REFN) to connect to the DAC. The switching signal (S_P,1/S_N,1 … S_P,k/S_N,k = 1) discharges the dynamic node B_i to clock the comparator (4). The pulse generator (PG) logic limits the pulse width of the digital output signal (P_1 … P_k), so the feedback signal (Φ_F = 0) can charge the dynamic node again to pull down the comparator clock (5). The asynchronous logic repeats (3) to (5) until the last bit is quantized, and the comparator is always clocked after the switching signal. In this way, the SA loop clocks itself at its optimized frequency and is insensitive to process variation.
IV. CIRCUIT IMPLEMENTATION
A. 1st-stage BS ADC with redundant references
Fig. 3 shows the 5b BS ADC for the 1st-stage, which is modified from [5]. There are 5 sub-stages, and each sub-stage has its own comparator and distributed T/Hs. In total, 9 T/Hs are used, and the unit capacitance of each T/H is 50fF. The comparators of the sub-stages are activated one-by-one asynchronously without a feedback loop to achieve both low power and high speed. Compared with [5], the reference voltages are shifted by 1/2 LSB by adding 1/64·V_REF to each residue, which provides redundancy for digital error correction. Code overflow occurs when the output code of the 1st-stage is "11111". In a general flash sub-ADC, one comparator would be removed so that the maximum output code is "11110". In the implemented BS ADC, however, removing the comparator of the last sub-stage would lose the LSB code. Therefore, when the first 4b code is "1111", the supply voltage rather than a reference voltage is selected as the residue for the last sub-stage. Since the input voltage is never larger than the supply voltage, the comparator of the last sub-stage outputs '0', and the maximum code is "11110".
B. 2nd-stage DAC array
The 2nd-stage DAC array uses a split DAC structure to reduce the total capacitance and area. The unit capacitor C, formed with a Metal-Oxide-Metal (MOM) fringing structure, has a value of 20fF. From the layout extraction, the top-plate parasitic capacitance is around 5%. The attenuation capacitance C_att is designed to be 1.07C (21.4fF) rather than the ideal value of 1.03C to compensate for the 5% top-plate parasitic capacitance. The total capacitance of the 2nd-stage DAC array is 0.8pF.
C. 2nd-stage Dynamic comparator
A high-resolution, low-noise comparator is required for the 2nd-stage since there is no amplification between the stages. A three-stage dynamic comparator [7] is utilized for this implementation.
V. DESIGN CONSIDERATIONS
The design considerations of the proposed architecture include inter-stage errors (errors between the 1st and 2nd stages) and time-interleaved mismatches (mismatches between the TI-SAR channels), as described below.
A. Inter-stage errors
Both the 1st and 2nd stages use bottom-plate sampling, and the 2nd-stage residue is generated at the DAC arrays from the digital code together with the input signal rather than from an analog signal only. Therefore, the inter-stage gain error caused by the parasitic capacitances at the top-plates of the several T/Hs and DAC arrays is avoided. The digital error correction relaxes the reference, offset, and timing error requirements of the 1st-stage from 11b to 6b. The resistive reference ladder, which can easily be designed at a matching level of >6b, is implemented for the BS ADC.
Since the input range of the ADC is ±0.8V P-P, the full-scale voltage (V_FS) of the 2nd-stage is ±25mV P-P. When the offset of the 1st-stage is larger than 25mV, the input of the 2nd-stage saturates. The offset requirement of the 1st-stage is therefore 25mV, and the offset is corrected by offset calibration through unbalanced capacitive loading [5].
The timing error between the 1st and 2nd stages that can be tolerated by the error correction is 32ps for the proposed architecture. This is readily met by balancing the wire length (75μm), width (0.1μm), and loading between the stages.
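How the redundancy absorbs 1st-stage offsets can be checked with a small behavioral model. The sketch below is an idealized illustration under stated assumptions, not the fabricated circuit: the input is normalized to [0, 1) of full scale, each of the 5 sub-stage comparators gets an assumed input-referred offset, the residue is shifted by 1/64 of full scale, and the codes are combined as 32·coarse + fine − 16, which is an assumed overlap-add form not given in the text. The "11110" overflow handling, noise, and DAC non-idealities are not modeled, and all function names are hypothetical.

```python
COARSE_BITS = 5                    # 1st-stage binary-search resolution
FINE_BITS = 6                      # 2nd-stage SAR resolution
TOTAL_BITS = 10                    # 5b + 6b with 1b of overlap
HALF_COARSE_LSB = 1.0 / 64         # residue shift, in units of full scale


def coarse_binary_search(x, offsets):
    """5b binary search with one comparator per sub-stage.
    offsets[i] is the assumed input-referred offset of sub-stage i."""
    code = 0
    for stage in range(COARSE_BITS):
        bit = COARSE_BITS - 1 - stage
        trial = code | (1 << bit)
        if x >= trial / 2**COARSE_BITS + offsets[stage]:
            code = trial
    return code


def fine_sar(residue):
    """6b successive approximation with a 10b-level LSB (range of 1/16 FS)."""
    code = 0
    for bit in range(FINE_BITS - 1, -1, -1):
        trial = code | (1 << bit)
        if residue >= trial / 2**TOTAL_BITS:
            code = trial
    return code


def convert(x, offsets):
    """Two-step conversion of a normalized input x in [0, 1)."""
    coarse = coarse_binary_search(x, offsets)
    residue = x - coarse / 2**COARSE_BITS + HALF_COARSE_LSB
    fine = fine_sar(residue)
    # Overlap-add: remove the half-coarse-LSB shift (16 fine LSBs) digitally.
    return coarse * 32 + fine - 16


def interleaved_convert(samples, offsets):
    """Alternate samples between two fine SAR channels (by index parity).
    Returns (channel, code) pairs; both channels are ideal in this model."""
    return [(n % 2, convert(x, offsets)) for n, x in enumerate(samples)]
```

In this model, any set of sub-stage offsets smaller than 1/64 of full scale in magnitude leaves the 10b output code unchanged, because the shifted residue stays inside the fine range.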
B. Time-interleaved mismatches
The reference matching requirement for the TI 2nd-stage SARs is 11b. It is determined by the capacitor matching of the DAC array, and an 11b matching level can be achieved using a common-centroid layout technique.
The offset mismatch between the channels of the 2nd-stage is required to have a 1σ of 0.7mV. In this design, the offsets of the 2nd-stage comparators are cancelled in the digital domain by subtracting the mean code between channels using off-chip calibration.
The time skew between the 2nd-stage SARs must be less than 2ps for the 10b 170MS/s requirement because they use different DAC arrays to sample the interleaved input signal, as shown in Fig. 1. The matched interleaved clocks for the two channels are generated from the master clock through a divider-by-2 low-skew clock generator, which achieves a maximum time skew of 1.2ps over 30 Monte Carlo simulation runs.
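A minimal sketch of such a mean-code offset correction is given below. It is one plausible reading of the off-chip calibration, not the authors' procedure: the function name is hypothetical, even-indexed samples are assumed to come from one channel and odd-indexed samples from the other, and the input is assumed to present the same average value to both channels.

```python
def calibrate_channel_offset(codes):
    """Remove the mean-code difference between two interleaved channels.

    codes: output codes in sample order; even indices are assumed to come
    from channel 0 and odd indices from channel 1.
    Returns (corrected_codes, measured_offset), where measured_offset is
    mean(channel 0) - mean(channel 1) before correction.
    """
    ch0 = codes[0::2]
    ch1 = codes[1::2]
    mean0 = sum(ch0) / len(ch0)
    mean1 = sum(ch1) / len(ch1)
    common = (mean0 + mean1) / 2          # keep the overall mean unchanged
    corrected = list(codes)
    corrected[0::2] = [c - (mean0 - common) for c in ch0]
    corrected[1::2] = [c - (mean1 - common) for c in ch1]
    return corrected, mean0 - mean1
```

Shifting both channels toward their common mean, rather than shifting only one channel, is a design choice of this sketch that leaves the overall DC level of the output untouched.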
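The order of magnitude of the skew requirement can be checked with a common rule of thumb. The sketch below assumes, as a criterion not stated in the text, that the sampling error of a full-scale sine at the Nyquist frequency must stay below 1/2 LSB at its steepest point; the function name is hypothetical.

```python
import math


def max_skew_half_lsb(n_bits, f_in_hz):
    """Largest timing skew (in seconds) that keeps the sampling error of a
    full-scale sine at f_in_hz below 1/2 LSB at the zero crossing."""
    # Full-scale sine v(t) = (FS/2) * sin(2*pi*f*t) has max slope pi * f * FS.
    # Half an LSB is FS / 2**(n_bits + 1), so FS cancels out of the ratio.
    return 1.0 / (2 ** (n_bits + 1) * math.pi * f_in_hz)


if __name__ == "__main__":
    skew = max_skew_half_lsb(n_bits=10, f_in_hz=85e6)  # Nyquist for 170MS/s
    print(f"{skew * 1e12:.2f} ps")
```

Under this assumed criterion the bound for 10b at an 85MHz input comes out slightly below 2ps, which is consistent with the requirement quoted above.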
VI. MEASUREMENT RESULTS
The prototype ADC was fabricated in 1P7M 65nm CMOS. Fig. 6 shows the die photograph with an active area of 0.104mm² (260μm × 400μm). The prototype ADC consumes 2.3mW of power operating at 170MS/s, including 0.77mW of analog power (T/Hs, DACs, and comparators) and 1.52mW of digital power (clock generator, SAR logic, and binary-search logic). Fig. 7 shows the measurement results among 20 chips. No offset or timing skew tones are observed in any of the chips. Taking the mean performance among the chips, the measured SNDR at 1.3MHz input is 54.6dB. The FoM, defined as Power/(2^ENOB · f_S), is 30.8fJ/conversion-step. At Nyquist input frequency, the SNDR is 53.2dB, with a resulting FoM of 36.4fJ/conversion-step. The SNDR and SFDR versus input frequency are shown in Fig. 8, and the ERBW is 116MHz. Fig. 9 shows the measured DNL and INL: the DNL is +0.42/-0.52 LSB and the INL is +0.93/-0.81 LSB. Fig. 10 shows the measured FFT spectra at DC and near-Nyquist input. The performance summary and comparison with state-of-the-art pipelined and pipelined-SAR ADCs are shown in Table I.
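The reported FoM values can be cross-checked from the quoted power, SNDR, and sampling rate. The sketch below assumes the usual conversion ENOB = (SNDR − 1.76)/6.02, which the text does not state explicitly; the function names are hypothetical.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02


def fom_fj_per_step(power_w, sndr_db, fs_hz):
    """FoM = Power / (2**ENOB * f_S), returned in fJ per conversion-step."""
    return power_w / (2 ** enob(sndr_db) * fs_hz) * 1e15


if __name__ == "__main__":
    for label, sndr in (("1.3MHz input", 54.6), ("Nyquist input", 53.2)):
        fom = fom_fj_per_step(power_w=2.3e-3, sndr_db=sndr, fs_hz=170e6)
        print(f"{label}: {fom:.1f} fJ/conversion-step")
```

With the rounded inputs quoted in the text, the low-frequency case reproduces the reported figure, and the Nyquist case lands within a few tenths of a femtojoule of it; the small difference presumably comes from rounding of the quoted power and SNDR.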
VII. CONCLUSIONS
A 10-bit 170MS/s two-step binary-search assisted time-interleaved SAR ADC architecture is proposed, in which the front-end is a 5b binary-search ADC shared by two time-interleaved 6b SAR ADCs in the 2nd-stage. The design avoids op-amps, which dissipate large power. In addition, a process-insensitive asynchronous logic is proposed to further reduce the delay of the SA loop. The ADC was fabricated in 65nm CMOS, occupying 0.104mm² of active area. It achieves 54.6dB SNDR at 170MS/s while consuming 2.3mW of power, with a FoM of 30.8fJ/conversion-step. At Nyquist input frequency, the SNDR is 53.2dB with a FoM of 36.4fJ/conversion-step.
