Abstract-In this paper, a low-power, small-area cyclic time-to-digital converter (TDC) for an all-digital PLL (ADPLL) in DVB-S2 applications is presented. Because the TDC dominates the area of the ADPLL, the coarse and fine stages of the two-step TDC share a single core, which reduces the area and the current consumption while maintaining the resolution. The TDC is implemented in a 0.13 µm CMOS process with a die area of 0.12 mm². The power consumption is 2.4 mW at a 1.2 V supply voltage. The resolution and input frequency of the TDC are 5 ps and 25 MHz, respectively.
I. INTRODUCTION
DVB-S2 (Digital Video Broadcasting -S2) is the standard for the transmission of multimedia content to portable terminals. As the UHF spectrum used for DVB-S2 service must accommodate many broadcast signals emitted from multiple locations, the undesired channel signals received by a DVB-S2 tuner can be much larger than the desired channel signal. Thus, the phase noise of the frequency synthesizer should be less than -95 dBc/Hz at a 10 kHz offset when the output frequency is 2.25 GHz.
In order to reduce the current consumption and die area, the All-Digital PLL (ADPLL) is adopted in this design.
A time-to-digital converter (TDC) is used to compare the output frequency of the digitally controlled oscillator (DCO) with that of the reference clock. It must have a high resolution to improve the phase noise of the digital PLL [1]. The relationship between the resolution (Δt_inv) of the TDC and the in-band phase noise is given in Eq. (1).
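$$\mathcal{L} = \frac{(2\pi)^2}{12}\left(\frac{\Delta t_{inv}}{T_V}\right)^2 \frac{1}{f_R} \qquad (1)$$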
In Eq. (1), T_V and f_R are the period of the DCO clock and the frequency of the reference clock, respectively. To meet the phase noise requirement of the DVB-S2 application, the resolution of the TDC should be less than 10 ps based on Eq. (1). The simplest TDC topology uses an inverter delay chain, as shown in Fig. 1 [1].
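As a rough numerical check of the 10 ps requirement, the sketch below evaluates the TDC-limited in-band phase noise. It assumes the commonly cited quantization-noise expression from the ADPLL literature, L = (2π)²/12 · (Δt_inv/T_V)² / f_R. The function name and argument names are illustrative and are not taken from this design.

```python
import math


def inband_phase_noise_dbc_hz(dt_res_s, f_dco_hz, f_ref_hz):
    """TDC-limited in-band phase noise in dBc/Hz.

    Assumed model: L = (2*pi)^2 / 12 * (dt_res / T_V)^2 / f_R,
    where T_V is the DCO period and f_R the reference frequency.
    """
    t_v = 1.0 / f_dco_hz  # DCO period T_V
    noise = (2 * math.pi) ** 2 / 12 * (dt_res_s / t_v) ** 2 / f_ref_hz
    return 10 * math.log10(noise)


if __name__ == "__main__":
    # Values quoted in the text: 2.25 GHz output, 25 MHz reference.
    for dt_ps in (10, 5):
        level = inband_phase_noise_dbc_hz(dt_ps * 1e-12, 2.25e9, 25e6)
        print(f"{dt_ps} ps -> {level:.1f} dBc/Hz")
```

Under this assumed model, a 10 ps resolution evaluates to roughly -102 dBc/Hz, and each halving of Δt_inv lowers the floor by about 6 dB.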
A buffer delay chain is widely used in TDCs. In the delay chain shown in Fig. 1, the rising edge of the F_DCO [...] than a single inverter delay does [2].
To improve the resolution, a two-step TDC architecture with a time amplifier (TA) is adopted [3]. Fig. 2 shows the conventional two-step TDC architecture. A two-step TDC improves the resolution by amplifying the residue between the input and the closest coarse level and then quantizing the amplified residue again with the same coarse resolution. Both the Coarse TDC and the Fine TDC use a thermometer-to-binary (T2B) encoder and have the same resolution.
The Coarse TDC stage converts the time difference between the two inputs into digital bits. Moreover, the residual time difference is also amplified by using the time amplifier (TA) and it is transferred to the Fine TDC stage. As a final step, Fine TDC converts the amplified time difference into fine TDC codes.
However, in the conventional two-step TDC architecture, the Coarse and Fine TDC stages are too large to be integrated in the ADPLL. Because the two stages are never active at the same time during a conversion, their cores can be shared in the time domain. In this paper, a cyclic TDC that has only one TDC core is proposed to reduce both the area and the current consumption.
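The idea of reusing one core for both passes can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the 40 ps core step, the 8 levels, and the TA gain of 8 are hypothetical values chosen so that the effective step is 5 ps, and the TA is treated as an ideal linear gain.

```python
def tdc_core(dt_ps, lsb_ps, n_levels):
    """Quantize a non-negative time difference.

    Returns (code, residue_ps). The code saturates at n_levels - 1.
    """
    code = min(int(dt_ps // lsb_ps), n_levels - 1)
    return code, dt_ps - code * lsb_ps


def cyclic_tdc(dt_ps, lsb_ps=40.0, n_levels=8, ta_gain=8.0):
    """Two-pass conversion that calls the same core twice.

    All default values are hypothetical. The combined code below is
    only meaningful when ta_gain equals n_levels.
    """
    # First pass: coarse conversion of the raw time difference.
    coarse, residue = tdc_core(dt_ps, lsb_ps, n_levels)
    # Ideal time amplifier: stretch the residue before the second pass.
    amplified = residue * ta_gain
    # Second pass: the same core quantizes the amplified residue.
    fine, _ = tdc_core(amplified, lsb_ps, n_levels)
    return coarse, fine, coarse * n_levels + fine


if __name__ == "__main__":
    print(cyclic_tdc(137.0))
```

With these illustrative numbers, one combined LSB corresponds to 40 ps / 8 = 5 ps, so an input of 137 ps maps to combined code 27 (135 ps).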
II. CYCLIC TDC ARCHITECTURE
The block diagram of the proposed cyclic TDC architecture is shown in Fig. 3. It consists of the TDC Core and the time amplifier. During the coarse conversion stage, when Sel is LOW, the two inputs F_DCO and F_REF are connected to INA and INB of the TDC Core through the MUX. To obtain a resolution finer than one inverter delay, the delay cell is implemented with inverters and the phase interpolator is composed of resistor arrays.
The edge detector (ED) in the TDC Core determines which edge of PI(n) is closest to F_REF, and the multiplexer (MUX) then passes the residue to the fine TDC through the TA. During the final conversion stage, when Sel is HIGH, the TA outputs TA_O_A and TA_O_B are connected to INA and INB of the TDC Core through the MUX. Fig. 4 shows the block diagram of the TDC Core in Fig. 3. The TDC Core is composed of a delay cell chain, phase interpolators, flip-flops (F/F), an edge detector, and a MUX. Each stage in the chain consists of two cascaded inverters and phase interpolators, for which the delay is 5 ps. The flip-flops are used as comparators to determine the thermometer code of the TDC. By locating the transition from 1 to 0, the edge detector logic identifies the critical residue that is sent to the TA during fine conversion.
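The transition search performed by the edge detector can be sketched in software as follows. This is a minimal illustration, assuming the flip-flop outputs are available as a list of 0/1 samples ordered along the delay chain; the function name and the return convention are hypothetical.

```python
def find_transition(samples):
    """Locate the first 1 -> 0 transition in a thermometer code.

    Returns the index of the last 1 before the transition, or None
    when no transition exists (for example, an all-zero code).
    """
    for i in range(len(samples) - 1):
        if samples[i] == 1 and samples[i + 1] == 0:
            return i
    return None


if __name__ == "__main__":
    print(find_transition([1, 1, 1, 0, 0, 0]))  # transition found
    print(find_transition([0, 0, 0, 0, 0, 0]))  # no transition
```

Returning None for a code without a transition lets a caller treat that conversion as invalid instead of forwarding a meaningless residue.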
[Figure: signal labels D(15), PI(0) to PI(127), and TDC_O]

[...] zero and the edge detector cannot detect the transition. As a result, the fine TDC is not activated and the output of the TDC is also all zero in this case, which is ignored by the block following the TDC. The output of the TDC is valid only when INA is earlier than INB. In [3], the gain of the time amplifier is calculated by
where T_off is limited by the delay of the inverter. To overcome this problem, the gain of the time amplifier is improved by using the delay time difference
where g_m is the transconductance of a NAND gate and C is the capacitance at its output [3]. Both the gain and the linear range can be controlled by the time offset α. The TA uses low values of C and α to achieve high gain at a high operating frequency.

The TA exploits the variable delay of an SR latch subject to nearly coincident input edges. If rising edges are applied to S and R at almost the same time, the latch becomes metastable. After both inputs go high, the initial voltage developed at the output of the SR latch is proportional to the initial input time difference, and the positive feedback in the latch eventually forces the output to a binary level.

The conventional TA is implemented with two latches and delays on opposite inputs. It cannot achieve a high gain because its delay time cannot be made smaller than an inverter delay. The TA is improved by using a fractional delay time (α) smaller than that of an inverter; α is the difference between two delay cells with separate delay times (T_off and T_off + α).

The resolution of the TDC is determined by the gain of the TA. The gain of the TA is determined by a capacitor and the transconductance of a NAND gate, both of which are influenced by PVT variation. Since the delay time difference (α) in Fig. 6(a) is very small and more sensitive to PVT variation than T_off, the layout must be done very carefully and the delay must be calibrated manually to adjust the gain of the time amplifier. Additional area is also required to implement the small delay cell (Delay 1). Fig. 6(b) shows the schematic of the delay cell, Delay 1, whose delay is controlled by dcont<3:0>. The additional delay time (α) of the TA is manually controlled by dcont<3:0> to compensate for PVT variation. Although T_off changes with PVT variation, the gain of the TA is little affected because α, the difference between the two delay cells, can be adjusted manually.
Offset and delay mismatches in the time amplifier are minimized by careful layout and post-layout simulation. They are very sensitive to the parasitic capacitance and resistance of the interconnection wires, so we finely adjusted them during the post-layout simulation stage. The sizes of the transistors in the TA are also increased to improve the matching characteristics.
A passive voltage divider, as shown in Fig. 7, can be connected between D(0) and D(1) to generate the interpolated signals defined by (4). Fig. 8(a) shows the resistor tuning array, which is composed of a main resistor (R0) and sub resistors (R1 ~ R4). It is controlled by the R_TUNE(3:0) signal from the resistor automatic-tuning circuit shown in Fig. 8(b). In Fig. 8(b), V_TUNE is generated by I_REF and a replica of resistors R0 ~ R4 and is compared with the reference voltage, V_REF. The digital controller sets R_TUNE(3:0) based on the comparison result.
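The effect of a resistive divider between two adjacent delay-cell outputs can be sketched with an idealized model. The sketch assumes that the edge time at each tap is a linear blend of the two input edge times, weighted by the divider ratio; this holds only approximately in a real circuit. The 40 ps cell delay and the 8 equal resistors are hypothetical values, chosen so that the tap spacing comes out at 5 ps.

```python
def divider_weights(resistors):
    """Weight of the later signal at each internal tap of a series string."""
    total = sum(resistors)
    weights = []
    acc = 0.0
    for r in resistors[:-1]:  # the last resistor ends at the far node
        acc += r
        weights.append(acc / total)
    return weights


def interpolated_edges(t0_ps, t1_ps, resistors):
    """Idealized edge time at each tap between edges at t0_ps and t1_ps."""
    return [t0_ps + w * (t1_ps - t0_ps) for w in divider_weights(resistors)]


if __name__ == "__main__":
    # Hypothetical: D(0) edge at 0 ps, D(1) edge at 40 ps, 8 equal resistors.
    print(interpolated_edges(0.0, 40.0, [1.0] * 8))
```

In this model only the resistor ratios matter, which is why matching and tuning of the resistor string are the main concerns.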
To improve the resolution, the resistor values are designed to be very small, which makes them very sensitive to process variation. The MOSFETs in the buffers, however, are sized large so that they are less sensitive to mismatch, and careful layout is performed to ensure matching between the buffers. When the resistance changes because of process variation, it is restored by the negative feedback of the resistor automatic-tuning circuit.

Passive resistors and capacitors usually vary with the process. To account for this variation, including parasitic capacitance variation, the resistance is designed to be adjustable with switches S1 ~ S4, so the resistor tuning range is ±15 % around the nominal resistor value. The default resistor R0 is 85 % of the nominal resistor value, and the tuning resistors R1 ~ R4 are 15 %, 7.5 %, 3.75 %, and 1.87 % of the nominal resistor value, respectively. In this design, R0 is 20 kΩ and R1 ~ R4 are 3.53 kΩ, 1.76 kΩ, 880 Ω, and 440 Ω, respectively.

Fig. 9 shows the timing diagram of the resistor automatic-tuning method. If the resistance of the phase interpolator is increased by process variation, V_TUNE is higher than V_REF. In this case, the switch control bits R_TUNE(3:0) are decreased. Resistor tuning is completed when V_TUNE crosses V_REF.
The default value of R_TUNE(3:0) after reset is "0111", which corresponds to 100 % of the nominal resistor value using R0 and R1. When the resistor value is decreased by process variation, R_TUNE(3:0) is changed to "0110". Conversely, R_TUNE(3:0) is changed to "1000" when the resistor value is increased by process variation.
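The mapping from tuning code to array resistance can be sketched with the resistor values quoted above. The switch polarity is an assumption: here a bit value of 1 bypasses the corresponding sub resistor, with R_TUNE(3) controlling R1 and R_TUNE(0) controlling R4. This polarity was chosen because it reproduces the statement that "0111" selects R0 plus R1; the actual switch wiring is not specified in the text.

```python
R0_OHM = 20_000
SUB_OHM = (3_530, 1_760, 880, 440)  # R1..R4, paired with bits 3..0


def array_resistance(code):
    """Total resistance for a 4-bit code string, MSB first (e.g. "0111").

    Assumed polarity: a '1' bypasses the sub resistor, a '0' keeps it
    in series with R0.
    """
    if len(code) != 4 or set(code) - {"0", "1"}:
        raise ValueError("code must be a 4-character string of 0s and 1s")
    return R0_OHM + sum(r for bit, r in zip(code, SUB_OHM) if bit == "0")


if __name__ == "__main__":
    nominal = array_resistance("0111")
    for code in ("0110", "0111", "1000"):
        r = array_resistance(code)
        print(code, r, f"{100 * r / nominal:.1f} %")
```

Under this assumed polarity, "0110" raises the resistance slightly above nominal and "1000" lowers it slightly, which matches the direction of correction described for decreased and increased resistor values.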
III. EXPERIMENTAL RESULTS
The chip is fabricated in a 0.13 µm CMOS process. Fig. 10 shows the chip layout. The process provides a single poly layer, six metal layers, and high-sheet-resistance poly resistors. The total die area of the TDC with the phase interpolator and time amplifier is 0.12 mm². Fig. 11 shows the simulation result of the phase interpolator in the two-step TDC when resistor automatic tuning (RAT) is enabled. The phase error of the phase interpolator with RAT is within ±5 %, including buffer mismatches.
The relationship between the input and output of the TDC is linear within a limited input range. Despite PVT variation, the gain of the TA with the fixed delay time (α) does not change within its linear input range. The linear input range of the TA is 1 ns, where the simulated gain of the time amplifier is about 50 at the operating frequency of 25 MHz. Within this range, the gain of the TA is constant and can be controlled by adjusting the delay (α) of the delay cell with dcont<3:0>. Since the residue from the coarse TDC is within the linear input range, the gain is kept almost constant.

To measure the linearity, two inputs with a 3 kHz difference at a reference frequency of 25 MHz are applied to generate a ramp input. The differential nonlinearity (DNL) and integral nonlinearity (INL) of the cyclic TDC are calculated by sweeping the frequency difference and then analyzing the code density statistics. The DNL and INL are shown in Figs. 13 and 14, respectively. The maximum DNL is ±0.55 LSB and the maximum INL is ±2.0 LSB.

Table 1 summarizes the performance of the two-step TDC. When the input frequency is 25 MHz and the time resolution is 5 ps, the power consumption and die area are 2.4 mW and 0.12 mm², respectively. The power consumption and die area of this work are the smallest among references [3, 5], and [6], which are implemented in 90 nm and 0.13 µm processes. The resolution of this work is coarser than that of [3, 6], and [7]. There is a trade-off between the resolution of the TDC and its die area and power consumption. The design target of this work is to minimize the die area and power consumption of the TDC; the resolution can be further improved at the cost of larger die area and power consumption.
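The code-density analysis mentioned for the linearity measurement can be sketched as follows. This is a generic histogram method under stated assumptions, not the authors' measurement script: the input is assumed to sweep all codes uniformly, every listed code is included in the average, and the function names are illustrative. The second helper estimates how much the time difference between the two inputs advances per reference cycle for a given frequency offset.

```python
def dnl_inl(hist):
    """DNL and INL in LSB from per-code hit counts of a uniform ramp."""
    mean = sum(hist) / len(hist)  # ideal hits per code
    dnl = [h / mean - 1.0 for h in hist]
    inl = []
    acc = 0.0
    for d in dnl:  # INL is the running sum of DNL
        acc += d
        inl.append(acc)
    return dnl, inl


def ramp_step_ps(f_ref_hz=25e6, df_hz=3e3):
    """Per-cycle change in time difference between the two inputs, in ps."""
    return (1.0 / f_ref_hz - 1.0 / (f_ref_hz + df_hz)) * 1e12


if __name__ == "__main__":
    dnl, inl = dnl_inl([10, 12, 8, 10])  # made-up counts for illustration
    print(dnl, inl)
    print(f"{ramp_step_ps():.2f} ps per cycle")
```

For a 3 kHz offset at 25 MHz, the time difference advances by about 4.8 ps per cycle, so the ramp steps are comparable to the 5 ps resolution. In practice, the end codes are often excluded from the average because they collect out-of-range hits.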
V. CONCLUSIONS
This paper presents a cyclic TDC with a phase interpolator and a time amplifier. The coarse and fine TDC stages of the conventional two-step TDC architecture are shared to reduce the area. It is implemented in a 0.13 µm CMOS process with a die area of 0.12 mm².
