This paper presents the design and simulation of a 4-bit, 10 GS/s time-interleaved (TI) ADC in 0.25 µm CMOS technology. The designed TI-ADC has four channels, each containing a 4-bit flash ADC, and targets area and power efficiency. Basic standard-cell logic gates are therefore preferred, and the aspect ratios in the gate designs are kept as small as the speed requirements allow. Design details of timing control circuits have generally not been provided in the literature, whereas this study explains the proposed timing control process comprehensively and presents its design details. The main contributions of this study are the proposed circuits that produce consecutive pulses for timing control of the input S/H switches (i.e., the analog demultiplexer front-end circuitry) and the very fast digital multiplexer unit at the output. The simulation results include a DNL of +0.26/−0.22 LSB, an INL of +0.01/−0.44 LSB, a layout area of 0.27 mm², and a power consumption of 270 mW. The reported power consumption, DNL, and INL are observed at a 100 MHz input with a 10 GS/s sampling rate.
Introduction
New-generation wired and wireless communication systems, high-performance oscilloscopes, software-defined radio, optical communication systems, low-noise amplifiers (LNAs), and mixers require higher bandwidth, operating frequency, dynamic range, and sampling rates [1] [2] [3] [4] [5]. These developments increase the importance of high-resolution, high-sampling-rate ADCs.
One of the most popular ADC architectures that satisfies these requirements is the time-interleaved (TI) architecture. It mainly consists of N sample-and-hold (S/H) circuits, N flash ADC cores, a multiplexer, and a timing control logic block. Here, the frequency and phase of the sampling control pulses for the N flash ADCs are very critical. The overall sampling frequency of the TI ADC is calculated as F_s = N/T, where N is the number of interleaved channels and T is the sampling period of each flash core [6]. Hence, it becomes possible to achieve very high sampling rates with time-interleaved ADCs [7, 8]. The block diagram of the basic time-interleaved ADC architecture is depicted in Fig. 1. Figure 2 shows a typical timing diagram of the sampling control pulses for each flash ADC core.
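The relation between the per-channel period and the overall sampling rate can be illustrated with a short behavioural sketch. The per-channel period of 400 ps is an assumed value chosen so that four channels give 10 GS/s; the function names are hypothetical and do not correspond to any circuit block of this work.

```python
# Behavioural sketch of time interleaving (illustrative values only).
N_CHANNELS = 4         # interleaving factor N
T_CHANNEL = 400e-12    # ASSUMED per-channel sampling period T (2.5 GS/s per core)


def aggregate_rate(n_channels, t_channel):
    """Overall sampling rate F_s = N / T."""
    return n_channels / t_channel


def sampling_instants(n_channels, t_channel, n_samples):
    """Return (time, channel) pairs; consecutive samples are T/N apart
    and are assigned to the channels in round-robin order."""
    step = t_channel / n_channels
    return [(i * step, i % n_channels) for i in range(n_samples)]


fs = aggregate_rate(N_CHANNELS, T_CHANNEL)
instants = sampling_instants(N_CHANNELS, T_CHANNEL, 8)
```

Each channel is revisited only every T seconds, while the combined sample stream advances in steps of T/N.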
In this structure, the multiplexer block selects the digital outputs of the channels in the correct order. As a result, ADC operation at a higher sampling rate is obtained. To push the sampling rate beyond what the speed of each ADC channel allows, the number of ADC channels must be increased. In this case, however, the required chip area, power consumption, and equivalent analog input capacitance increase accordingly [9, 10]. In addition, each ADC core used in a time-interleaved ADC has its own errors, such as INL, DNL, and clock jitter. Moreover, as the number of channels increases, the interference, mismatch, and timing problems between the channels worsen [11] [12] [13]. Another important design issue is the adjustment of switching times for sampling rates above 10 GS/s. To mitigate this problem, pulse generation circuits have been proposed in the literature. There are CMOS design examples, such as [4], in which these switching pulses are generated to achieve a higher sampling rate, low noise, and high reliability. SiGe- or InP-based fabrication technologies are also preferred for that purpose [14] [15] [16]. Using different ADC architectures for each channel may be another possible solution. In addition, successive approximation register (SAR) [17] [18] [19] and pipeline or folding [16, 20, 21] ADC structures may be used to increase the resolution and to reduce the layout area and power consumption further.
This paper is organized as follows. The following section summarizes examples of state-of-the-art time-interleaved ADCs from the literature. The designed sample-and-hold circuit, 4-bit flash core, counter circuit, and multiplexer circuit are explained in Section 3. The post-layout simulation results are given and discussed in Section 4. Finally, Section 5 concludes the paper.
Preliminary works
Chu et al [5] realized a 40 GS/s TI ADC in SiGe technology in 2010. In that work, an additional front-end S/H circuit was proposed and the design difficulties of TI-ADCs were summarized. An on-chip LC oscillator was used to produce the clock pulses needed for proper operation. Duan and Allon [22] (2014) implemented a 32-way hierarchical 12.8 GS/s TI ADC in 65 nm CMOS technology. In that work, SAR ADC cores were used instead of flash ADC cores, and novel sampling and frequency divider circuits were proposed. However, a sampling rate above 20 GS/s could not be achieved due to CMOS technology limitations. For this reason, SiGe BiCMOS technology is preferred in state-of-the-art TI designs that target higher sampling rates, despite its higher fabrication cost. El-Chammas et al [23] (2014) realized a two-channel TI ADC using 12-bit pipeline ADC cores in 0.18 µm BiCMOS technology. The sampling rate was only 1.6 GS/s, as expected given the higher resolution. Lee and Chen [24] (2010) proposed a 50 GS/s 5-bit TI ADC in 0.18 µm SiGe BiCMOS technology, with a novel architecture in the digital encoder part of the ADC.
In many TI ADC design examples, fast track-and-hold amplifier cores have been proposed, since this block is the fastest part of a TI ADC. Other works have addressed fast flash or alternative ADC core architectures and fast timing control circuits for TI ADCs. For instance, Ma et al [25] (2014) proposed a 32.5 GS/s T/H amplifier circuit. Ritter et al [26] (2014) designed a 20 GS/s 6-bit flash ADC without a T/H amplifier and proposed a novel comparator circuit in BiCMOS technology. Gao et al [27] (2012) proposed a multi-phase clock design for high-speed TI ADCs. These state-of-the-art core designs may be employed in future TI ADC implementations.
Architecture of time interleaved ADC
The designed time-interleaved ADC contains an S/H building block with four S/H circuits, a flash ADC building block with four flash ADC cores, a 4 × 1 multiplexer, and a control block consisting of a ring counter and a pulse generator circuit, as shown in Fig. 3.
Sample-and-hold circuit
In this study, a traditional CMOS analog switch (also known as a transmission gate) is used as the S/H circuit, as shown in Fig. 4. The effective resistance of this switch varies with the applied input voltage. Since the resolution of each flash ADC core is relatively low, a simple CMOS switch is sufficient here, although a more accurate S/H circuit would improve the dynamic parameters of the system. To reduce charge injection errors, dummy elements are placed on both sides of the switch instead of using a single CMOS switch (i.e., parallel-connected NMOS and PMOS transistors) alone [28, 29]. In this structure, the resistance of the switch is very high when the clock signal "ϕ" applied to the gates is logic "0". Conversely, the switch shows a low resistance when the applied gate voltage is logic "1" [28].
Flash A/D converter
Flash ADCs are known as the fastest of all ADC architectures. They are frequently preferred for their conversion throughput, since the analog input is compared against all quantization levels within one clock period [30]. The block diagram of a typical flash ADC architecture is depicted in Fig. 5. The design steps of the flash ADC core are summarized in the following subsections.
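The parallel comparison can be sketched behaviourally as follows. The sketch assumes an ideal, equally spaced reference ladder over an input range of −1 V to 1 V and 2^4 − 1 = 15 comparators; these values and the function names are illustrative assumptions, not a description of the transistor-level design.

```python
def flash_thermometer(v_in, v_min=-1.0, v_max=1.0, bits=4):
    """Compare v_in against 2**bits - 1 equally spaced reference levels.

    Every comparison is independent, which is what allows a flash ADC
    to evaluate all of them in the same clock period.
    """
    levels = 2 ** bits
    lsb = (v_max - v_min) / levels
    refs = [v_min + (k + 1) * lsb for k in range(levels - 1)]
    return [1 if v_in > ref else 0 for ref in refs]


def thermometer_to_code(thermo):
    """For a bubble-free thermometer code, the output code is the number of ones."""
    return sum(thermo)
```

For an ideal ladder the comparator outputs form a run of ones followed by a run of zeros, which is why this pattern is called a thermometer code.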
The comparator circuit
The schematic of the comparator circuit used here is shown in Fig. 6 [31]. The aspect ratios of the MOSFETs and the critical bias voltage values are listed in Table 1. The power supply voltage of the comparator is ±1.5 V. This circuit is basically a differential-pair input comparator, which was also used in the authors' earlier work [32] with different design parameters adapted to 0.18 µm CMOS technology. In addition to the comparator input stage (M0, M1, M2, M3, M5, M6), the voltage gain is boosted by a common-source amplifier (M8) and a CMOS inverter stage (M11, M12) at the output.
The other transistors are used for biasing purposes. The DC analysis result and the transient analysis result with f_in = 10 MHz are given in Fig. 7 and Fig. 8, respectively. CMOS ICs suffer from variability: non-idealities during fabrication shift several transistor parameters, such as threshold voltage, oxide thickness, and device geometries. Variation of these parameters can ultimately result in functional failures. For the comparator block, which is the most critical circuit in an ADC, any comparison error may degrade the performance of the whole ADC. Therefore, a Quasi-Monte Carlo (QMC) analysis is conducted to demonstrate the effect of process variations on the comparator performance. QMC is preferred since it promises an asymptotically better convergence rate for a given sample size than conventional Monte Carlo analysis. The input offset voltage of the comparator is chosen as the performance metric, and the yield is calculated according to the change in the offset voltage. Both inter-die and intra-die variations in the threshold voltage, oxide thickness, and channel length and width are taken into account during the variability analysis.
The QMC analysis result for the comparator circuit is provided in Fig. 9, where the mean (µ) and the standard deviation (σ) of the input offset voltage are 2.61 mV and 0.127 mV, respectively. The yield values for offset voltage limits of 2.7 mV, 2.8 mV, 2.9 mV, and 3 mV are calculated as 88 %, 95 %, 98 %, and 100 %, respectively.
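The way a yield figure follows from a set of quasi-random samples can be sketched as below. This is only an illustration of the counting step: the offset voltage is drawn from an assumed Gaussian model with the reported µ and σ, whereas the reported yields come from circuit-level simulation of process parameters, so the two need not agree. The van der Corput sequence stands in for whichever low-discrepancy sequence the simulator uses, and all function names are hypothetical.

```python
from statistics import NormalDist


def van_der_corput(i, base=2):
    """i-th point (i >= 1) of the base-b van der Corput low-discrepancy sequence."""
    value, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        value += digit / denom
    return value


def offset_samples(n, mu_mv=2.61, sigma_mv=0.127):
    """Map low-discrepancy points to offsets through an ASSUMED Gaussian model."""
    dist = NormalDist(mu_mv, sigma_mv)
    return [dist.inv_cdf(van_der_corput(i)) for i in range(1, n + 1)]


def yield_at(samples, limit_mv):
    """Fraction of samples whose offset voltage does not exceed the limit."""
    return sum(s <= limit_mv for s in samples) / len(samples)


samples = offset_samples(1024)
yields = {limit: yield_at(samples, limit) for limit in (2.7, 2.8, 2.9, 3.0)}
```

Because the points of a low-discrepancy sequence fill the unit interval more evenly than pseudo-random draws, the estimated fractions settle with fewer samples.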
The dynamic latch circuit
The schematic of the dynamic latch circuit is provided in Fig. 10. This circuit is a traditional dynamic CMOS inverter circuit. Depending on the clock signal, either the logic value of the previous input is held at the output or the logic value of the current input is passed to the next stage of the ADC. This relaxes the timing of the digital part of the ADC, which would otherwise have to follow the continuously changing analog input signal [32, 33]. In other words, when the clock signal is logic "0", the analog input is being sampled; when the clock signal is logic "1", the digital conversion of that sample is carried out in the digital part of the converter.
Thermometer decoder block
The thermometer decoder circuit (1-of-N decoder block) is an array of compound gates, each implementing the ĀB function. The function of this block is to convert the thermometer code (namely, the output of the comparator array or latch array) to a 1-of-N code, so that the PLA-ROM stage at the output of the converter can convert it to a binary code. As seen in Fig. 11, only static CMOS logic gates are used to implement the function ĀB. There are better alternative solutions in the literature for overcoming the bubble error problem. However, considering both the chip area and the low resolution of the ADC (4 bits), these are not used in this study.
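A bit-level sketch of this conversion is given below. It assumes 15 comparator outputs ordered from the lowest to the highest reference, with a virtual "1" below the ladder and a virtual "0" above it so that 16 output lines are produced; the ordering, the sentinels, and the function name are illustrative assumptions.

```python
def one_of_n(thermo):
    """Convert a thermometer code to a 1-of-N code with c_k = (NOT A) AND B.

    For boundary k, B is the comparator output just below it and A is the
    one just above it. Sentinels (ASSUMED): '1' below and '0' above the ladder.
    """
    padded = [1] + list(thermo) + [0]
    return [(1 - padded[k + 1]) & padded[k] for k in range(len(padded) - 1)]


clean = one_of_n([1] * 8 + [0] * 7)                      # ideal thermometer code
bubbled = one_of_n([1, 1, 1, 0, 1] + [0] * 10)           # one bubble in the code
```

With an ideal thermometer code exactly one output line is active. A bubble in the code creates a second 1-to-0 boundary, so two lines become active at once, which is the bubble error mentioned above.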
The PLA-ROM block
The PLA-ROM, which is a binary encoder unit, is employed to convert the 1-of-N code to a binary code. This structure is designed by arranging NMOS transistors so that each row corresponds to its binary output code. The NMOS transistors operate in either the cut-off or the triode region depending on the logic signals applied to their gates, whereas the PMOS transistors always operate in the triode region (i.e., the PMOS transistors serve as load resistors) [34]. The main advantage of this circuit is its fully parallel nature, which yields both higher speed and smaller area. Its main disadvantage is that it is more prone to bubble errors, which result in non-monotonic converter behavior. Alternative solutions proposed in the literature can also be employed at the expense of power and area. Figure 12 shows part of the circuit schematic of the PLA-ROM. Here, the inputs c0, ..., c13, c14, and c15 are the outputs of the thermometer decoder block. The 4-bit binary output is taken from the output bits D1, D2, D3, and D4. The simulation results of the whole high-speed 4-bit flash ADC are provided in Fig. 13.
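The encoder behaviour, including its sensitivity to bubble errors, can be modelled logically as a wired-OR of row codes. This is a simplified logical view under the assumption that each active row k contributes the binary value k; it ignores the inverting nature of the NMOS pull-down array and the mapping to the output names D1 to D4. The function name is hypothetical.

```python
def rom_encode(one_hot):
    """Logical model of the ROM encoder: OR together the codes of all active rows."""
    code = 0
    for k, active in enumerate(one_hot):
        if active:
            code |= k
    return code


def row_vector(active_rows, n_rows=16):
    """Helper that builds a row-select vector from a list of active row indices."""
    return [1 if k in active_rows else 0 for k in range(n_rows)]


clean_code = rom_encode(row_vector([8]))        # single active row
bubble_code = rom_encode(row_vector([3, 5]))    # two rows active after a bubble error
```

With a single active row the output equals that row index. When rows 3 and 5 are active together, the OR of 0011 and 0101 gives 0111, a code that matches neither row, which illustrates how bubble errors can produce non-monotonic output.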
4 × 1 multiplexer block
Multiplexer circuits are among the key structures of high-speed transmission and communication systems [35, 36]. They can be used to switch multiple outputs of a very high-speed circuit into a specific input of another circuit. In this study, a multiplexer structure based on static CMOS logic gates is selected. The gate-level circuit schematic of the 4 × 1 multiplexer and the corresponding truth table are given in Fig. 14. The multiplexer block includes four 4 × 1 multiplexers, since there are four channels and each channel delivers 4-bit binary data. Obtaining the timing control pulses for the select inputs of these multiplexers is one of the most critical points of this study. The proposed circuit that produces non-overlapping pulses to control the input S/H switches (i.e., the analog demultiplexer circuit) is the main contribution of this study and is explained in the next section.
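The selection logic can be sketched at the gate level as a sum of four AND terms, one per select code. This is a generic textbook form and is only assumed to be logically equivalent to the circuit in Fig. 14; the function names and the example data are illustrative.

```python
def mux4(d0, d1, d2, d3, s1, s0):
    """Gate-level 4x1 multiplexer: one AND term per select code, then OR."""
    n1, n0 = 1 - s1, 1 - s0
    return (d0 & n1 & n0) | (d1 & n1 & s0) | (d2 & s1 & n0) | (d3 & s1 & s0)


def mux_word(channels, s1, s0):
    """Four 1-bit multiplexers side by side select one 4-bit channel word."""
    return [
        mux4(channels[0][b], channels[1][b], channels[2][b], channels[3][b], s1, s0)
        for b in range(4)
    ]


# Example 4-bit words from the four channels (arbitrary illustrative data).
channels = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
selected = mux_word(channels, 1, 0)   # select code 10 picks channel 2
```

Stepping the select inputs through 00, 01, 10, and 11 at the full sample rate places the four channel outputs on the output bus in their original sampling order.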
The counter and pulse generator circuit
This block is responsible for producing the select control bits of the multiplexer block and the clock control signals of the 4-bit flash ADC block. A 2-bit ring counter circuit is designed using two JK flip-flops, as shown in Fig. 15. Static CMOS gate structures are used at the transistor level.
The 2×4 decoder circuit shown in Fig. 16 is used to generate four consecutive control pulses for demultiplexing the analog input signal to the flash ADC inputs (that is, the clock phase control pulses of the analog input S/H circuits). Hence, the four flash ADC channels can be activated consecutively.
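The pulse sequence produced by the counter and decoder pair can be sketched as follows. The counter is modelled simply as a state that steps through 00, 01, 10, 11 on each clock, which is an assumption about the count order; flip-flop internals, propagation delays, and the non-overlap margins of the real pulses are not modelled, and the function names are hypothetical.

```python
def decoder_2to4(q1, q0):
    """2x4 decoder: exactly one output is high for each counter state."""
    n1, n0 = 1 - q1, 1 - q0
    return [n1 & n0, n1 & q0, q1 & n0, q1 & q0]


def phase_pulses(n_clocks):
    """Step a 2-bit counter and decode each state into four phase signals."""
    pulses = []
    state = 0
    for _ in range(n_clocks):
        q1, q0 = (state >> 1) & 1, state & 1
        pulses.append(decoder_2to4(q1, q0))
        state = (state + 1) % 4   # ASSUMED count order 0, 1, 2, 3, 0, ...
    return pulses


pulses = phase_pulses(5)
```

Each row of `pulses` holds the four phase signals for one clock cycle, so only one S/H switch is closed at a time and the channels take turns in a fixed order.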
Post-layout simulation results
The post-layout simulation of the 4-bit TI ADC is carried out using Tanner Tools Pro software with a generic 0.25 µm CMOS technology. A ramp input signal from −1 V to 1 V is applied to the ADC. The power supply voltage is ±1.5 V. Figure 17 shows the DC analysis results of the ADC and the corresponding INL and DNL plots. The simulated DNL lies between +0.05 LSB and −0.05 LSB, and the INL lies between +0.063 LSB and −0.014 LSB. Figure 18 shows the transient analysis results for a 100 MHz ramp input signal at a 10 GS/s sampling rate, and Fig. 18(b) shows the corresponding INL and DNL plots. There is an expected time delay that depends on the input signal frequency, the sampling clock frequency, and the comparator performance. The delay increases as the input frequency increases. In the transient case, the DNL varies between +0.26 LSB and −0.22 LSB, and the INL lies between +0.01 LSB and −0.44 LSB. Next, the ADC sampling clock frequency is kept at 10 GHz while the input signal frequency is swept. The resulting DNL and INL plots are provided in Fig. 19. As seen from this figure, for input signal frequencies up to 100 MHz the magnitudes of DNL and INL remain below 0.5 LSB, whereas both degrade beyond 100 MHz.

Table 2 compares the performance of the proposed TI ADC with similar studies from the literature. Although [6] and [21] are also TI ADCs, they use different core ADC architectures, so a comparison of this work with them may not be fair. The designs in [2] and [4] use flash ADC cores; however, their technology and resolution differ from those of this work. In this design, a 10 GS/s sampling rate is achieved in 0.25 µm CMOS technology, whereas the other two designs were realized in 65 nm CMOS technology. It can be seen from Table 2 that this work performs better in terms of INL and DNL. On the other hand, the resolution of this work is lower than that of the other works listed in Table 2. This work also consumes more power, which may be considered a disadvantage. Unlike this design, [2] and [4] include calibration circuitry.
In terms of layout area, a direct comparison is not fair because this design has a lower resolution. Nevertheless, the area of this design is much smaller than that of [6].
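For reference, the DNL and INL figures quoted in this section follow the usual definitions, which can be sketched as below. The sketch assumes that the code transition voltages have already been extracted from the ramp response and that the ideal step is 1 LSB = 2 V / 16 = 0.125 V for a −1 V to 1 V range. It uses the simple cumulative-sum form of INL without end-point or best-fit correction, which may differ from the procedure used to produce the plots. The function name and the example data are hypothetical.

```python
def dnl_inl(transitions, lsb):
    """DNL and INL (in LSB) from a list of code-transition voltages.

    DNL_k is the deviation of each code width from 1 LSB;
    INL_k is the running sum of the DNL values.
    """
    dnl = [(hi - lo) / lsb - 1.0 for lo, hi in zip(transitions, transitions[1:])]
    inl, total = [], 0.0
    for d in dnl:
        total += d
        inl.append(total)
    return dnl, inl


LSB = 0.125                                          # ASSUMED: 2 V range / 16 codes
ideal = [-0.875 + LSB * k for k in range(15)]        # ideal transition voltages
shifted = list(ideal)
shifted[5] += 0.25 * LSB                             # move one transition by 0.25 LSB

dnl_ideal, inl_ideal = dnl_inl(ideal, LSB)
dnl_shift, inl_shift = dnl_inl(shifted, LSB)
```

Shifting a single transition widens one code and narrows its neighbour by the same amount, so the DNL shows a positive and a negative excursion while the INL returns to zero after the affected codes.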
Conclusion
In conclusion, a 4-bit, 10 GS/s time-interleaved CMOS ADC was designed and simulated in a generic 0.25 µm CMOS technology. The average power consumption is 270 mW, and the active layout area is 0.27 mm². The observed linearity measures are in the acceptable range. Furthermore, a Quasi-Monte Carlo (QMC) analysis was performed for the comparator circuit and the yield values were calculated. According to the results, process variation would not cause any considerable degradation in the offset voltage of the comparator. The proposed architecture is easy to understand, so this work can serve as a starting guide for beginning researchers who are interested in designing TI ADCs. As future work, the analog input bandwidth of the proposed TI ADC structure can be increased further by using different analog switching circuits in the analog front end and comparators with higher bandwidth. Moreover, instead of CMOS-only processes, advanced technologies such as SiGe BiCMOS and GaAs BiCMOS would result in better speed performance, although the substantial increase in cost should be considered.
