Dynamic comparators are now essential in high-speed, power-efficient analog-to-digital converter applications because of their high latching speed and ultra-low power consumption. In this paper, a novel dynamic comparator is proposed to reduce latch delay and offset. The comparator uses add-on cross-coupled transistors in the latch structure and unbalanced clocks to enhance comparison speed and to lessen the input offset voltage caused by mismatch in the cross-coupled circuits of the latch stage. Derivations of the delay and input offset voltage are presented for the proposed dynamic comparator and checked against Monte Carlo simulations. The results are verified by simulations in CADENCE SPECTRE at a 1 V supply voltage in 90 nm CMOS technology. A comparative analysis between the proposed dynamic comparator and previously reported comparators is presented. The delay is reduced by up to 46 % and 6 % compared to the conventional and two-phase dynamic comparators, respectively. Moreover, the proposed design consumes only 53.36 µW. The Monte Carlo simulation shows that the standard deviation of the input offset voltage is 10.8 mV, which is 12 % and 77 % of that of the conventional and two-phase dynamic comparators, respectively.
Introduction
For the past few decades, regenerative latch circuits in comparators have played a vital role as the interface between analog and digital signals [1]. The comparator is a main building block in a variety of systems such as Analog-to-Digital Converters (ADCs) [2], memory devices [3] and [4], Variable Gain Amplifiers (VGAs) [5] and switched-capacitor circuits. Flash-type ADCs require comparators with high switching speed, low offset [6] and [7], high energy efficiency [8] and small die area. However, the trade-off between speed, offset and power makes it challenging to design high-speed, low-offset comparators [6]. In Ultra-Deep Submicron (UDSM) CMOS technology, high-speed comparators suffer from the low supply voltage because the threshold voltage does not scale at the same rate as the supply voltage [9], which limits the voltage headroom and the common-mode input voltage range. A further challenge for high-speed, low-power comparators is the increase in kickback noise [10] and in offset caused by mismatches in threshold voltage, capacitances and current factors. Designing high-performance comparators is therefore a major challenge in ADC design.
Comparators are classified as static or dynamic depending on the clock signal. Static comparators [10] suffer from static power dissipation and are not suitable for high-speed, low-power applications. Dynamic comparators, which have no static power dissipation, are best suited to high-speed operation [11]. However, this topology stacks transistors and fails in low-voltage applications because an acceptable delay requires sufficient voltage headroom [12]. Researchers have introduced many techniques to meet these requirements, such as the body-driven technique [13], [14] and [15], the charge-steering technique [16], the Zero-V_t MOS based technique [17], offset cancellation [15], [18], [19] and [20], the shared-charge method [21], and supply voltage bootstrapping and boosting [22] and [23]. In the body-driven technique [13], the threshold voltage requirement is removed because the MOSFET operates in depletion mode, but the transconductance is lower than with the gate-driven technique. Also, operating both PMOS and NMOS devices in a body-driven design requires a special fabrication process, such as n-well. The comparator based on Zero-V_t devices [17] provides a rail-to-rail input range and fast switching at low supply voltage. However, Zero-V_t devices are not available in many CMOS processes, in which case they cannot be fabricated. So, although effective, the above techniques are not reliable for low-voltage applications. To remove the stacking effect, extra circuitry is added to the conventional comparator in [9] and [12] to increase speed at low UDSM supply voltages. In this approach, the additional circuitry introduces component mismatch, which must be taken into account. To overcome these challenges, double-tail two-stage dynamic comparators [24], [25] and [26], comprising separate amplification and regenerative stages, have been proposed for better energy efficiency and lower delay.
By including some extra circuitry [25], power consumption is reduced at the expense of delay and area. To enhance the regeneration speed, a quasi-dynamic regenerative stage was proposed in [8], but static power is dissipated in its amplification stage.
A classical single-phase comparator, known as the "Lewis-Gray" comparator, was introduced in [27] and [28] to explain the compromise between offset, delay and power. It is widely used in ADC systems [28] and is therefore taken as the reference in this paper. It is a fully differential dynamic comparator and, like other single-phase comparators, consists of a pre-amplifier stage and a regenerative latch stage. Once the pre-amplifier stage develops a sufficient voltage difference at the inner nodes of the latch stage, the comparison starts and the circuit functions properly. The analysis of input offset voltage in [29] shows that it can be reduced at the cost of higher power consumption. In the regeneration phase, amplification of the input voltages and regeneration of the cross-coupled inverters occur concurrently. The amplification must therefore be quick and large enough to suppress the offset of the cross-coupled inverters, which leads to more power consumption. At the output nodes, load capacitance mismatch further affects the input offset, which calls for a stronger input stage. To break this stalemate between power and offset, a double-phase architecture [30] was introduced with significantly lower input offset and a smaller power penalty. Nevertheless, it incurs a penalty on delay.
In this paper, an improved unbalanced-clock based dynamic comparator is proposed in which extra circuitry, in the form of cross-coupled transistors, is included in the latch stage. The output nodes of the pre-amplifier stage are passed to intermediate transistors instead of being connected directly to the output nodes of the latch stage, which improves the performance of the proposed comparator. The delay is reduced significantly without a penalty on offset or power consumption, at the cost of some area for the extra circuitry. The remainder of this paper is structured as follows. In Sec. 2, the proposed comparator is explained along with the mathematical analysis of delay and input offset. In Sec. 3, design considerations are explained and some design issues are elaborated. Simulation results are discussed and compared with past designs in Sec. 4, and Sec. 5 concludes the paper.
Proposed Comparator
The proposed comparator, shown in Fig. 1 , is composed of two stages: 1) pre-amplification stage and 2) regenerative latch stage.
Preamplification stage is formed by transistors The two separate stages, i.e. the regenerative latch stage and the pre-amplification stage, operate on two separate clock pulses, CLK1 and CLK2. These clocks help the input transistors to reduce the mismatch effect in the latch stage. Thus, the input offset voltage of the comparator is reduced significantly. This circuit has less stacking, so it can operate at a low supply voltage.
Operation of Proposed Circuit Architecture
The proposed comparator operates in three phases: pre-charge, amplification and comparison, as illustrated in Fig. 2. During the first phase, when both clocks CLK1 and CLK2 are low, the transistors M_3-M_4 pre-charge the nodes F+ and F−, which turns M_K1-M_K2 off, and the transistors M_11-M_12 pull the output nodes V_out+ and V_out− to V_DD. In the second phase, CLK1 is high while CLK2 is still low. The nodes F+ and F− start to discharge, and a differential voltage ΔV_F+/F− that depends on the input and reference voltages develops because of the differential current between the input-branch currents I_N1 and I_N2. The intermediate transistors M_K1 and M_K2 pass ΔV_F+/F− to the cross-coupled inverters, which provides good shielding between input and output. Hence, kickback noise is reduced. A sufficient differential voltage, related to the differential input and reference voltages, is developed at the output nodes of the latch stage. In the third phase, CLK2 is set high and the latch circuit starts to operate. The regenerative loop of the back-to-back inverters boosts the differential voltage developed at the output nodes.
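The three-phase sequencing described above can be summarized as a small lookup on the two clock levels. This is only an illustrative sketch: the function name `comparator_phase` is hypothetical, and the case of CLK2 high with CLK1 low is not described in the text, so it is reported as undefined here.

```python
def comparator_phase(clk1: int, clk2: int) -> str:
    """Map the levels of CLK1 and CLK2 to the operating phase named in the text."""
    if not clk1 and not clk2:
        return "pre-charge"      # F+/F- pre-charged, outputs pulled to V_DD
    if clk1 and not clk2:
        return "amplification"   # F+/F- discharge and a differential voltage develops
    if clk1 and clk2:
        return "comparison"      # latch regenerates the differential voltage
    return "undefined"           # CLK2 high while CLK1 is low is not covered by the text


# One full cycle in the order given in the text.
cycle = [comparator_phase(c1, c2) for c1, c2 in [(0, 0), (1, 0), (1, 1)]]
print(cycle)
```

Because CLK2 is a delayed copy of CLK1, the amplification phase lasts exactly as long as the delay between the two rising edges.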
Delay Analysis
To validate the delay reduction mathematically, the delay equations of the proposed circuit are derived following the approach presented in [21] and [24]. The total delay consists of two parts: the duration of the amplification phase, t_amp, and the delay of the regenerative latch stage, t_latch.
The delay t_amp is the time in the amplification phase during which the latch-stage load capacitance C_L at the output nodes discharges until the first PMOS (M_9/M_10) turns on. Here, the first PMOS (M_9/M_10) will turn on when first preamplifier output node
Hence, t_amp is obtained as:
where I_B1 is the drain current of M_K1. Let the sum of the currents I_B1 and I_B2 equal the total supply current I; then, for a small differential input (ΔV_in), I_B1 can be approximated as half of the supply current, I/2.
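As a rough numerical illustration of this step, the sketch below treats the amplification phase as a constant-current discharge of C_L with I_B1 ≈ I/2. It is a generic estimate, not necessarily the exact expression of the derivation: the voltage drop needed to turn on the PMOS is assumed to be about |V_Thp|, and the values of `V_DROP` and `I_TOTAL` are hypothetical; only C_L = 5 fF is taken from the simulation setup.

```python
def amplification_time(c_load: float, v_drop: float, i_total: float) -> float:
    """Time for a constant branch current to discharge c_load by v_drop.

    The branch current is taken as i_total / 2, the small-input
    approximation I_B1 ~ I/2 stated in the text.
    """
    i_branch = i_total / 2.0
    return c_load * v_drop / i_branch


C_L = 5e-15       # 5 fF load capacitance (value used in the simulations)
V_DROP = 0.3      # hypothetical |V_Thp| in volts, not given in the text
I_TOTAL = 100e-6  # hypothetical total supply current in amperes

t_amp = amplification_time(C_L, V_DROP, I_TOTAL)
print(f"t_amp is about {t_amp * 1e12:.1f} ps")
```

The estimate scales linearly with C_L and inversely with the supply current, which is the dependence the text relies on.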
If ΔV_0 is the initial output voltage difference at the beginning of the comparison phase, the latch delay can be obtained from [31]:
where τ = C_L/g_m,eff, in which g_m,eff is the effective transconductance of the cross-coupled inverters. From Eq. (4), it is clear that the speed of the proposed comparator can be improved by increasing ΔV_0 and g_m,eff.
• Enhancement in ΔV_0: As discussed earlier, t_amp is the time after which the comparison phase starts and one of the latch outputs charges back to V_DD. According to Eq. (4), the differential output ΔV_0 at time t_amp has a significant impact on t_latch: a larger ΔV_0 shortens the latch time. Following [24], ΔV_0 of this comparator is calculated as:
where I_B1 and I_B2 are the drain currents of the left and right branches of the latch stage. Considering ΔI_B = |I_B1 − I_B2| = g_mK1,2 × ΔV_F+/F−, Eq. (5) is rewritten as:
where g_mK1,2 is the effective transconductance of the intermediate PMOS transistors M_K1 and M_K2 of the latch stage and ΔV_F+/F− is the differential voltage at the pre-amplifier output nodes F+ and F− at time t_amp. Both parameters, g_mK1,2 and ΔV_F+/F−, increase ΔV_0 and thereby reduce the latch delay.
The voltage difference ΔV_F+/F− at the nodes F+/F− at time t_amp can be determined as:
In this equation, I_N1 and I_N2 are the currents of the input transistors, whose difference depends on the input voltage difference, i.e. ΔI_N = |I_N1 − I_N2| = g_m1,2 × ΔV_in, where g_m1,2 is the transconductance of the input transistors M_1/M_2. By substituting Eq. (7) in Eq. (6), we have:
• Enhancement in effective transconductance: In the proposed comparator, the output nodes F+/F− of the input stage discharge in the decision-making phase, which turns on the intermediate-stage transistors and strengthens the positive feedback. The effective transconductance of the latch is thus increased to (g_m,eff + g_mK1,2). Hence,
, and:
Finally, including the effects of both parameters, the total delay of the proposed comparator is derived as:
(10)
From the expression derived in Eq. (10), it can be concluded that the total delay depends strongly on the input voltage difference, the supply current, the transconductances of the input and intermediate-stage transistors, and the ratio of C_L to C_L,F+(−). Improving these parameters reduces the latch delay logarithmically and raises the overall speed of the proposed comparator, as confirmed by the simulation results.
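The two speed-up mechanisms discussed in this section can be illustrated with the generic exponential-regeneration model of a latch, t = τ·ln(ΔV_out/ΔV_0) with τ = C_L/g_m. This is a sketch under stated assumptions, not the paper's Eq. (10): the decision level `DV_OUT` is assumed to be V_DD/2, and `GM_EFF` and `GM_K` are hypothetical transconductance values. The two ΔV_0 values are the ones reported in the simulation section at ΔV_in = 10 mV.

```python
import math


def latch_delay(c_load, gm_eff, dv0, dv_out, gm_extra=0.0):
    """Time for an initial difference dv0 to regenerate exponentially to dv_out.

    gm_extra models transconductance added to the latch
    (g_mK1,2 of the intermediate transistors in the text).
    """
    tau = c_load / (gm_eff + gm_extra)
    return tau * math.log(dv_out / dv0)


C_L = 5e-15    # 5 fF load capacitance (value used in the simulations)
GM_EFF = 1e-3  # hypothetical latch transconductance, 1 mS
GM_K = 0.2e-3  # hypothetical intermediate-transistor transconductance
DV_OUT = 0.5   # assumed decision level, V_DD / 2 at V_DD = 1 V

base = latch_delay(C_L, GM_EFF, 0.136, DV_OUT)        # dV_0 reported for the conventional circuit
larger_dv0 = latch_delay(C_L, GM_EFF, 0.353, DV_OUT)  # dV_0 reported for the proposed circuit
larger_dv0_and_gm = latch_delay(C_L, GM_EFF, 0.353, DV_OUT, gm_extra=GM_K)

print(base > larger_dv0 > larger_dv0_and_gm)
```

In this model a larger ΔV_0 enters only through the logarithm, while added transconductance shortens τ directly, so the two effects multiply.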
Mismatch Analysis
In the proposed comparator, two intermediate PMOS transistors (M_K1 and M_K2) are added to the two-phase dynamic comparator [30], so the mismatch in threshold voltage (ΔV_ThK1,2) and current factor (Δβ_K1,2) of the M_K1/M_K2 transistors is considered in the input offset analysis. The effect of this mismatch is insignificant in most cases, except at small differential input voltages (ΔV_in), where the output nodes F+ and F− of the input stage follow each other at similar discharge rates. In that case, the decision may be disturbed by the mismatch of the intermediate transistors. Therefore, the effects of threshold voltage mismatch and current factor mismatch on the input offset voltage are analyzed briefly below.
• Effect of Threshold Voltage Mismatch of M_K1 and M_K2 (ΔV_ThK1,2): The differential current caused by the M_K1/M_K2 threshold mismatch is obtained as:
Hence, the input offset voltage caused by the M_K1/M_K2 threshold mismatch is calculated as follows:
·∆V T hK1,2 . (12)
• Effect of Current Factor Mismatch of M_K1 and M_K2 (Δβ_K1,2): The current factor mismatch of M_K1/M_K2 can be modeled as a channel width mismatch ΔW_K1,2. To find the input offset voltage due to current factor mismatch, the differential current in terms of ΔW_K1,2 can be written as:
(13)
Hence, the input offset voltage caused by the M_K1/M_K2 current factor mismatch is calculated as follows:
Thus, the total input offset due to both mismatch factors of the intermediate transistors M_K1/M_K2 can be determined as:
The expressions derived in Eq. (12) and Eq. (14) show that a larger transconductance of the input transistors (g_m1,2) reduces the input offset. The input transistors are therefore usually made large to reduce the effect of intermediate-transistor mismatch, which results in a low input offset voltage.
Kickback Noise
In regenerative-latch based dynamic comparators, the voltage variations at the output nodes couple to the input-stage transistors and can disturb the input voltage because the circuit driving the input has nonzero output impedance. This effect, known as kickback noise, may degrade the comparator accuracy. As explained in [10], high-speed, low-power comparators create larger disturbances at the input nodes. Hence, kickback noise is unavoidable in fast latching circuits. Figure 3 depicts the undesired peak errors in the transient response of the input voltage at ΔV_in = 10 mV. To determine the kickback noise, the Thevenin equivalent of the input is modeled with a resistance of 8 kΩ. Figure 4 illustrates the peak error in the input voltage as a function of the input voltage difference for three different structures. The proposed comparator has higher kickback noise than the two-phase dynamic comparator [30] but lower than the conventional one [27]. The intermediate transistors of the proposed circuit are not as robust as the latch of the two-phase dynamic comparator. Thus, these transistors are sized so that the proposed circuit maintains high switching speed and low power dissipation with reduced kickback noise.
The disturbance at the reference voltages is negligible compared to that at the inputs because of the low impedance at the reference nodes. The main disturbance occurs during the amplification phase, when the reference voltage needs some settling time before the start of the regeneration phase. In applications where the kickback noise becomes significant, kickback noise reduction techniques, such as the neutralization in [10], can be applied. The proposed comparator is also simulated with the neutralization technique, as shown in Fig. 4.
Fig. 4: The plot of measured peak error in input voltage due to kickback noise versus input voltage difference variation (curves: Conventional [27], Two Phase Dynamic [30], Proposed, Proposed with neutralization).
Design Considerations
In the proposed structure, several design issues must be considered. The sizing of the cross-coupled PMOS transistors M_K1/M_K2, located between the cross-coupled inverters of the latch stage, is important for high-speed, low-voltage and low-offset operation. These transistors may create a voltage headroom problem that limits low-voltage applications. To overcome this problem, M_K1/M_K2 transistors with low resistance, i.e. of large size, are required. The input offset may also be affected by the threshold voltage and current factor mismatch between the M_K1/M_K2 transistors. To diminish this effect, M_K1/M_K2 transistors with large transconductance are required, which again calls for large transistors. However, large transistors increase the parasitic capacitances C_L,F+(−) of the F+/F− nodes, which becomes a delay bottleneck. Since the increased parasitic capacitances restrict the speed of the comparator, the size of the M_K1/M_K2 transistors is chosen as a compromise that maintains high-speed, low-voltage and low-offset operation.
In the proposed comparator, CLK1 and CLK2 are designed as unbalanced clocks. CLK2 is delayed by a time Δt with respect to CLK1, and the amplification delay (t_amp) depends on this delay time (Δt). The design of the clock generation circuit is therefore another important issue. As depicted in Fig. 5(a), the delay of CLK2 with respect to CLK1 is controlled by varying V_ctrl of the current inverters in the clock buffers. At small ΔV_in, the comparison is very difficult in the evaluation phase. Therefore, a sufficient amplification time (t_amp) is required in the amplification phase to develop the differential voltage at the internal nodes F+/F−. Thus, Δt is set equal to or greater than t_amp (Δt ≥ t_amp). If Δt < t_amp, errors occur in the comparison phase for small ΔV_in. At higher values of Δt, the input offset is reduced effectively, but the delay increases rapidly. Hence, to maintain high speed and low input offset, Δt is kept equal to or slightly greater than t_amp. For the proposed circuit, Δt = t_amp. The conceptual waveforms are shown in Fig. 5(b).
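The clock-skew rule stated above (Δt ≥ t_amp, with Δt kept as close to t_amp as practical) can be written as a simple design check. The helper name `classify_clock_skew`, the 20 % slack threshold and the numerical values below are hypothetical; the text itself only states the inequality and that the proposed design uses Δt = t_amp.

```python
def classify_clock_skew(delta_t: float, t_amp: float, slack: float = 0.2) -> str:
    """Classify the CLK1-to-CLK2 delay against the amplification time.

    slack is a hypothetical tolerance: delays up to (1 + slack) * t_amp are
    treated as "slightly greater" than t_amp.
    """
    if delta_t < t_amp:
        return "too short: comparison errors expected at small input difference"
    if delta_t <= (1.0 + slack) * t_amp:
        return "ok: high speed with low offset"
    return "too long: offset improves but delay grows"


T_AMP = 30e-12  # hypothetical amplification time in seconds

for delta_t in (20e-12, 30e-12, 60e-12):
    print(f"{delta_t * 1e12:.0f} ps -> {classify_clock_skew(delta_t, T_AMP)}")
```

In a real design, t_amp varies with process corner and supply voltage, so the control voltage of the clock buffers would need to track it.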
Simulation Results and Discussion
To compare the proposed comparator with the existing conventional [27] and two-phase dynamic [30] comparators, the circuit is designed in CADENCE and simulated in SPECTRE in 90 nm CMOS technology with V_DD = 1 V, V_CM = 0.9 V and ΔV_in = 5 mV. For a fair comparison, the circuits from [27] and [30] are simulated in the same simulation environment and framework as the proposed circuit. Figure 6 shows the layout of the proposed circuit, which occupies 64.08 µm² (9 µm × 7.12 µm). Appropriate care has been taken in the layout design to avoid degrading power, offset and delay. Figure 7 shows the dependence of delay on the supply voltage for the proposed comparator, compared with the other two configurations. The speed is significantly enhanced relative to the other circuits. However, the delay is higher at low supply voltages than at high ones: it varies from 364.3 ps to 221 ps as the supply increases from 0.7 V to 1.2 V. Figure 8 and Fig. 9
In Fig. 10, the analytical results from Eq. (10) are compared with simulated values of delay at different ΔV_in and V_CM = V_DD − 0.1 V. The delay calculated from the analytical derivations matches the simulated delay well. The small remaining difference is due to non-linear second-order effects, which are approximated or neglected during the analytical derivation to keep the expressions simple. Figure 11 and Fig. 12 depict the dependence of T_Delay and T_Latch on the input voltage difference, compared with the previous structures. Here, ΔV_in varies from 1 mV to 30 mV at V_DD = 1 V, V_CM = 0.9 V and a load capacitance C_L of 5 fF. At ΔV_in = 20 mV, T_Delay is 190.63 ps for the proposed circuit, against 298.6 ps and 197.67 ps for the conventional design and the two-phase dynamic circuit, respectively.
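The relative improvement at this operating point follows directly from the three delay values quoted above. The short calculation below is only a worked check of that arithmetic; the helper `reduction_percent` is a hypothetical name. The "up to 46 % and 6 %" figures quoted elsewhere in the paper refer to the largest reductions over the simulated range, so the values at ΔV_in = 20 mV are expected to be smaller.

```python
def reduction_percent(reference_ps: float, proposed_ps: float) -> float:
    """Relative delay reduction of the proposed circuit, in percent."""
    return 100.0 * (reference_ps - proposed_ps) / reference_ps


# Total delays at a 20 mV input difference, as quoted in the text (in ps).
PROPOSED_PS = 190.63
CONVENTIONAL_PS = 298.6
TWO_PHASE_PS = 197.67

vs_conventional = reduction_percent(CONVENTIONAL_PS, PROPOSED_PS)
vs_two_phase = reduction_percent(TWO_PHASE_PS, PROPOSED_PS)

print(f"vs conventional: {vs_conventional:.1f} %")
print(f"vs two-phase:    {vs_two_phase:.1f} %")
```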
These results confirm that the delay of the proposed comparator is lower than that of the earlier comparators, with a significant speed gain over the conventional circuit. The reason for the speed improvement is the boost in ΔV_0. Figure 13 shows the variation of ΔV_0 with ΔV_in. As ΔV_in increases from 1 mV to 30 mV, ΔV_0 grows quickly at small differential inputs and becomes approximately constant at higher values of ΔV_in, which confirms that the delay reduction is minimal at large ΔV_in. The figure also shows that, at a given ΔV_in, ΔV_0 is higher for the proposed configuration than for the others. For example, at ΔV_in = 10 mV, ΔV_0 is 353 mV for the proposed circuit compared with 136 mV for the conventional circuit. At C_L = 5 fF and V_DD = 1 V, ΔV_0 increases by 225 mV, from 190 mV to 415 mV, as ΔV_in varies from 1 mV to 30 mV.
Fig. 7: Total delay for different structures (Conventional [27], Two Phase Dynamic [30], Proposed) versus V_DD at ΔV_in = 5 mV, V_CM = V_DD − 0.1 V.
Figure 14 shows that the slew rate depends on ΔV_in. The slew rate increases with ΔV_in and is larger for the proposed circuit than for the other circuits. The slew rate is defined as the change in output voltage with respect to time (ΔV_0/Δt), so it is higher when the delay is smaller. The slew rate at ΔV_in = 5 mV is 4.03 V·ns⁻¹, which is much greater than the 2.14 V·ns⁻¹ of the conventional structure. Overall, the simulated results show that the delay is significantly reduced at comparable power dissipation, P_diss, as shown in Fig. 15. At ΔV_in = 10 mV, P_diss is 44.97 µW for the proposed circuit, comparable to 43.79 µW for the two-phase dynamic circuit. Moreover, P_diss is significantly lower than that of the conventional circuit at every value of ΔV_in. For example, at ΔV_in = 5 mV, P_diss = 53.36 µW for the proposed circuit against 86.07 µW for the conventional circuit.
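The quoted slew rates can be cross-checked against the delay figures. The sketch below assumes that the output swing equals V_DD = 1 V and that the time interval is the total delay; the 248.2 ps value is the total delay of the proposed circuit stated in the conclusion. Under those assumptions the numbers are mutually consistent, and the delay implied for the conventional circuit reproduces the roughly 46 % reduction claimed in the abstract.

```python
V_SWING = 1.0              # assumed output swing in volts (V_DD = 1 V)
DELAY_PROPOSED_NS = 0.2482  # total delay of the proposed circuit (248.2 ps)
SLEW_CONVENTIONAL = 2.14    # reported slew rate of the conventional circuit, V/ns

# Slew rate = output swing / delay.
slew_proposed = V_SWING / DELAY_PROPOSED_NS

# Invert the same relation to estimate the conventional circuit's delay.
delay_conventional_ns = V_SWING / SLEW_CONVENTIONAL
delay_reduction = 100.0 * (1.0 - DELAY_PROPOSED_NS / delay_conventional_ns)

print(f"proposed slew rate: {slew_proposed:.2f} V/ns")
print(f"implied conventional delay: {delay_conventional_ns * 1e3:.1f} ps")
print(f"implied delay reduction: {delay_reduction:.1f} %")
```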
The speed is thus considerably enhanced while consuming almost the same power. Hence, the Energy Per Conversion (EPC) [24] is reduced, which is defined as
where ENOB is the effective number of bits and f_s is the sampling frequency. The EPC of the proposed circuit is slightly lower than that of the two-phase dynamic circuit and much lower than that of the conventional circuit, as shown in Fig. 16. For a 1-bit conversion, the EPC drops from 13.25 fJ for the conventional structure to 3.4 fJ for the proposed circuit at ΔV_in = 5 mV, whereas the drop relative to the two-phase dynamic circuit is slight, from 2.15 fJ to 1.99 fJ at ΔV_in = 10 mV. The performance of the proposed structure is summarized in Tab. 1. Table 2 lists both the analytical values and the values from 200-run Monte Carlo simulations of the offset voltage. There is a small difference between the calculated and simulated values: the offset voltage calculated from the analytical derivations is lower than the result of the 1-σ Monte Carlo simulations. The small difference is due to the dynamic offset, which is not considered in the analytical derivations.
Fig. 11: Total delay for different structures (Conventional [27], Two Phase Dynamic [30], Proposed) versus ΔV_in at V_DD = 1 V, V_CM = 0.9 V.
Figure 17 compares the offset voltage of the proposed circuit with that of the previous configurations at three different supply voltages. The unbalanced clock scheme reduces the input offset remarkably with respect to the conventional circuit, and the addition of the intermediate transistors reduces it somewhat further, provided that these transistors are sized larger than the others. At V_DD = 1.2 V, the input offset voltage (V_os) is 63.85 mV, 11.67 mV and 8.32 mV for the conventional, two-phase dynamic and proposed circuits, respectively. At each point, the offset results are obtained from 1-σ Monte Carlo simulations with 200 samples. As shown in Fig. 18, the standard deviation of the input offset (σ_os) of the proposed circuit is found to be 10.8 mV at V_DD = 1 V using 1-σ Monte Carlo simulations.
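To give a feel for how precisely a 200-sample Monte Carlo run can determine σ_os, the sketch below draws 200 Gaussian offset samples and estimates their standard deviation. It is a statistical illustration only, not a circuit simulation: the offset is assumed to be zero-mean Gaussian, the reported 10.8 mV is used as a hypothetical "true" σ, and the standard error uses the usual large-sample approximation σ/√(2(N − 1)).

```python
import math
import random
import statistics

random.seed(0)        # fixed seed so the sketch is repeatable
N_RUNS = 200          # number of Monte Carlo samples, as in the text
SIGMA_TRUE_MV = 10.8  # reported sigma, used here as the assumed true value

# Draw synthetic offset samples (assumption: zero-mean Gaussian offset).
samples_mv = [random.gauss(0.0, SIGMA_TRUE_MV) for _ in range(N_RUNS)]

# Sample standard deviation, the quantity a Monte Carlo run reports.
sigma_hat_mv = statistics.stdev(samples_mv)

# Approximate standard error of that estimate for Gaussian data.
std_err_mv = SIGMA_TRUE_MV / math.sqrt(2 * (N_RUNS - 1))

print(f"estimated sigma: {sigma_hat_mv:.2f} mV")
print(f"approximate standard error: {std_err_mv:.2f} mV")
```

With 200 samples the estimate is uncertain by roughly half a millivolt, which is worth keeping in mind when comparing simulated and analytical offset values.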
Fig. 12: Latch delay for different structures (Conventional [27], Two Phase Dynamic [30], Proposed) versus ΔV_in at
Fig. 13: ΔV_0 (differential output voltage at t = t_amp) for different structures (Conventional [27], Two Phase Dynamic [30], Proposed) versus ΔV_in at V_DD = 1 V, V_CM = 0.9 V.
Table 3 presents the corner analysis of the proposed comparator at ΔV_in = 5 mV and V_DD = 1 V. The proposed circuit works properly at all corners, although the delay increases to some extent at the SS corner. For a fair comparison, the proposed structure and the two other structures from [27] and [30] are simulated in the same simulation environment in 90 nm CMOS technology, as shown in Tab. 4. The widths of the MOS transistors are set to give optimized values of delay and offset. Finally, Tab. 5 compares the performance parameters of the proposed structure with previous works.
Conclusion
In this paper, a novel unbalanced-clock based dynamic comparator has been presented to reduce latch regeneration delay and offset. The latch stage is modified by adding two intermediate transistors, which enhances the regeneration speed. The unbalanced clock signaling helps to cancel the mismatch effect of the internal devices. Analytical derivations of delay and offset are presented for the proposed comparator and agree with the results simulated with the CADENCE VIRTUOSO tool. The simulated results confirm the reduction in delay and offset for the proposed circuit compared to the previous structures. The maximum sampling frequency of the proposed comparator is 5.7 GHz at V_DD = 1 V, with a total delay of 248.2 ps and an input offset of 10.8 mV, at the cost of 53.36 µW power consumption and 64.08 µm² area. The delay is reduced by up to 46 % and 6 % compared to the conventional and two-phase dynamic comparators, respectively. The offset is also reduced by 88 % and 23 % compared to the conventional and two-phase dynamic comparators, respectively.
