A wide-range all-digital duty-cycle corrector (ADDCC) with output clock phase alignment is presented in this paper. The proposed ADDCC corrects the duty-cycle of the input clock to 50%. The input frequency range is 250MHz to 1GHz, and the input duty-cycle range is 20% to 80%. The proposed ADDCC is implemented in a standard performance 65nm CMOS process. The power consumption is 0.79mW at 250MHz and 3.15mW at 1GHz.
I. INTRODUCTION
In high-speed applications, such as double data rate (DDR) SDRAM or double-sampling analog-to-digital converters (ADCs), both the positive edge and the negative edge of the clock are used to sample the input data. Thus, these systems require an input clock with an exact 50% duty-cycle. The clock is distributed over the chip using clock buffers. However, as the clock signal passes through a series of clock buffers, the duty-cycle of the output clock is distorted by the unbalanced rise and fall times of the buffers, which vary with process, voltage and temperature (PVT). Hence, a duty-cycle corrector (DCC) can be used to adjust the duty-cycle back to 50%.
The analog pulse-width control loop (PWCL) [1] has been proposed to correct the duty-cycle of the clock. The PWCL circuit is shown in Fig. 1. It requires a ring oscillator to produce a 50% duty-cycle reference. The duty-cycle of the input clock (CLK_IN) is then adjusted by the control stage using the feedback control voltage (Vctl). However, the operating range and the tolerable input duty-cycle error are very restricted in this architecture, so it can only be used in a narrow frequency range. More recently, the PWCL has been improved with a linear control stage and a digitally controlled charge pump (DCCP) [2], which further extends the operating range. Nevertheless, its lock-in time is still longer than that of the all-digital DCCs [3-6, 9, 10]. Moreover, the leakage current of the charge pump makes it unsuitable for a 65nm CMOS process.
In the all-digital DCC (ADDCC) of [4], a synchronous mirror delay (SMD) line and a half-cycle delay line (HCDL) are used to produce a 50% duty-cycle clock. However, this architecture has a very narrow frequency range and is not suitable for wide-range applications. In the ADDCC of [5], a time-to-digital converter (TDC) quantizes the period information of the input clock into digital codes. A clock with a half-cycle delay is then generated using the delay line of the TDC to produce a 50% duty-cycle clock. Nevertheless, the length of the delay line limits the operating frequency range of this architecture, and the output duty-cycle error is restricted by the resolution of the TDC. In the ADDCC of [6], a TDC is also used to quantize the period information of the input clock into digital codes. Two clocks with complementary duty-cycles are then generated, and an interpolator is applied to produce the 50% duty-cycle clock output. However, in this architecture, the output duty-cycle error is again restricted by the resolution of the TDC.
A wide-range all-digital duty-cycle corrector with output clock phase alignment is presented in this paper. The proposed high-resolution duty-cycle detector with an all-digital duty-cycle correction delay line can overcome the TDC resolution limitations of prior studies. Thus, it is well suited to clock duty-cycle correction in the system-on-a-chip (SoC) era. This paper is organized as follows. Section II describes the architecture of the proposed ADDCC. The implementation of the proposed design is discussed in Section III. Section IV shows the experimental simulation results of the proposed design. Finally, Section V concludes with a summary.
II. OVERALL CIRCUIT DESCRIPTION
The block diagram of the proposed ADDCC is shown in Fig. 2. The timing diagram of the delay-locked loop (DLL) is shown in Fig. 3. The period of the input clock (CLK_IN) is T. If the duty-cycle of the input clock (CLK_IN) is A/T and the duty-cycle of the DLL's output clock (DLL_CLK) is B/T, then the period T is equal to (A+B). After the DLL is locked, the positive edges of CLK_IN and DLL_CLK are phase aligned, and the two clocks have complementary duty-cycles. The signal selector detects the pulse widths of these two clocks and outputs the clock with the wider pulse as WIDE_SIGNAL and the clock with the narrower pulse as NARROW_SIGNAL. For example, in Fig. 3, CLK_IN is output as NARROW_SIGNAL and DLL_CLK is output as WIDE_SIGNAL.
The timing diagram of the DCC is shown in Fig. 4. After the DLL is locked, the proposed ADDCC starts to compensate the duty-cycle error of the output clock (CLK_OUT). The NARROW_SIGNAL passes through the coarse-tuning duty-cycle correction delay line (Coarse DDCC) and the fine-tuning duty-cycle correction delay line (Fine DDCC), which widen its pulse, and is output as DDCC_CLK. The duty-cycle detector (DCD) detects the phase error between the negative edges of DDCC_CLK and WIDE_SIGNAL and outputs the DCD_UP and DCD_DOWN control signals to the DCC_CTRL. The DCC_CTRL then adjusts the duty-cycle correction delay line control code (DDCC_CODE) to increase the output pulse width of DDCC_CLK. When the phase error between the negative edges of DDCC_CLK and WIDE_SIGNAL is eliminated, the DCC is locked.
At lock, the pulse width of the NARROW_SIGNAL has been increased by ΔE, where ΔE is equal to (B-A). Since the period of the input clock (CLK_IN) is T = A+B, (A + ΔE/2) is equal to T/2 (= A + (B-A)/2 = (A+B)/2). Therefore, we use the half duty-cycle correction delay line (Half Coarse DDCC and Half Fine DDCC) to increase the pulse width of the input clock (CLK_IN) by ΔE/2.
After the DCC is locked, the duty-cycle of the output clock (CLK_OUT) is 50%.
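The pulse-width arithmetic above can be checked numerically. The following is a minimal sketch of an ideal, already-locked corrector with no quantization error; the function and field names are hypothetical. It also assumes that when CLK_IN has the wider pulse, the output is derived by stretching the narrower, phase-aligned DLL_CLK, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class DccResult:
    narrow_source: str   # clock routed to NARROW_SIGNAL by the signal selector
    delta_e_ps: float    # ΔE = B - A, stretch applied by the full DDCC line
    out_high_ps: float   # pulse width after the half-length line (A + ΔE/2)
    out_duty: float      # resulting duty-cycle of CLK_OUT


def correct_duty(period_ps: float, duty_in: float) -> DccResult:
    """Ideal model of the locked DCC (hypothetical helper, not the circuit)."""
    clk_in_high = duty_in * period_ps      # pulse width of CLK_IN
    dll_high = period_ps - clk_in_high     # DLL_CLK has the complementary width
    # Signal selector: the shorter pulse is NARROW (width A), the longer is WIDE (width B).
    if clk_in_high <= dll_high:
        narrow_source, a, b = "CLK_IN", clk_in_high, dll_high
    else:
        # Assumption: the narrow DLL_CLK is the one that gets stretched.
        narrow_source, a, b = "DLL_CLK", dll_high, clk_in_high
    delta_e = b - a                        # full correction found by the DCD loop
    out_high = a + delta_e / 2.0           # half-length line adds ΔE/2
    return DccResult(narrow_source, delta_e, out_high, out_high / period_ps)


if __name__ == "__main__":
    # 1GHz clock (T = 1000ps) with a 20% input duty-cycle: A = 200ps, B = 800ps.
    print(correct_duty(1000.0, 0.2))
```

For the 1GHz, 20% case this gives ΔE = 600ps and an output pulse width of 200ps + 300ps = 500ps, which is T/2.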
III. CIRCUIT IMPLEMENTATION
The proposed DCD is composed of a sample-based bang-bang phase detector (PD) and a tiny dead zone PD. The detailed circuit of the sample-based bang-bang PD is shown in Fig. 5(a). The dead zone of this PD is limited by the metastability window of the D flip-flops, so it cannot detect a very small phase error. The detailed circuit of the tiny dead zone PD [7] is shown in Fig. 5(b). The tiny dead zone PD can detect a phase error larger than 1ps in a 65nm CMOS process. For this reason, the proposed DCD can detect a tiny phase error between the negative edges of DDCC_CLK and WIDE_SIGNAL, which in turn reduces the duty-cycle error of the output clock. At the beginning, the DCC_CTRL adjusts the DDCC_CODE according to the sample-based bang-bang PD's outputs, DCD_UP_1 and DCD_DOWN_1. After the sample-based bang-bang PD is locked, the DCC_CTRL continues to adjust the DDCC_CODE using the tiny dead zone PD's outputs, DCD_UP_2 and DCD_DOWN_2.
The detailed circuits of the Coarse DDCC and the Fine DDCC are shown in Fig. 6(a) and Fig. 6(b), respectively. The Coarse DDCC is composed of a chain of OR gates that enlarges the pulse width of the NARROW_SIGNAL. The DDCC Encoder converts the binary control code (DDCC_CODE[11:0]) into the thermometer codes (ddcc[128:0] and fine_ddcc[15:0]). The Fine DDCC is added to further improve the resolution of the proposed ADDCC. In the Fine DDCC, digitally controlled varactors (DCVs) [8] are used to build a fine-tuning delay cell.
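The two-stage locking sequence described above can be illustrated with a small behavioral model. This is only a sketch under several assumptions: the controller moves DDCC_CODE by one step per update, the delay line is modeled as a single uniform step size rather than separate coarse and fine cells, and the 20ps dead zone for the first detector is an arbitrary illustrative value. Only the 1ps figure for the tiny dead zone PD and the 12-bit code width come from the text. All names are hypothetical.

```python
def lock_ddcc_code(target_ps, step_ps=1.0, coarse_dead_zone_ps=20.0,
                   fine_dead_zone_ps=1.0, max_code=4095, max_updates=10000):
    """Two-stage bang-bang search for the DDCC control code.

    target_ps is the stretch (ΔE) that aligns the falling edges of DDCC_CLK
    and WIDE_SIGNAL. Returns (code after stage 1, final code, residual error).
    """
    code = 0
    updates = 0
    codes_after_stage = []
    # Stage 1 uses the bang-bang PD (wide dead zone); stage 2 uses the
    # tiny dead zone PD to remove the error the first detector cannot see.
    for dead_zone_ps in (coarse_dead_zone_ps, fine_dead_zone_ps):
        while updates < max_updates:
            error_ps = target_ps - code * step_ps   # > 0 means "stretch more"
            if abs(error_ps) <= dead_zone_ps:
                break                               # this detector reports lock
            next_code = code + 1 if error_ps > 0 else code - 1   # UP / DOWN
            if next_code < 0 or next_code > max_code:
                break                               # code range exhausted
            code = next_code
            updates += 1
        codes_after_stage.append(code)
    return codes_after_stage[0], codes_after_stage[1], target_ps - code * step_ps


if __name__ == "__main__":
    stage1_code, final_code, residual_ps = lock_ddcc_code(600.4)
    print(stage1_code, final_code, residual_ps)
```

With a 600.4ps target, the first stage stops at code 581 (19.4ps of error left inside its dead zone) and the second stage continues to code 600, leaving a residual below 1ps. The `max_updates` guard is there because a step larger than twice the dead zone would make a bang-bang loop dither indefinitely.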
The Half Coarse DDCC and the Half Fine DDCC mirror the circuits of the duty-cycle correction delay line (Coarse DDCC and Fine DDCC). Since the half duty-cycle correction delay line needs a smaller correction range than the full delay line, the number of OR gates and AND gates can be reduced to save area and power.
IV. EXPERIMENTAL RESULTS
The proposed ADDCC is implemented in a standard performance (SP) 65nm CMOS process with a 1.0V power supply. Fig. 7 shows the layout of the proposed ADDCC; the core area is 0.01 mm², including the test circuit. Fig. 8 shows the simulation results for input clock frequencies from 250MHz to 1GHz and input duty-cycles from 20% to 80%. The simulation results show that the output clock is corrected to a 50% duty-cycle across the different input frequencies and duty-cycles. The simulation results of the proposed ADDCC are summarized in Fig. 9.
In the test chip, a digitally controlled oscillator (DCO) and a test mode controller are included for testing. The DCO can output frequencies from 250MHz to 1GHz, and the duty-cycle of the DCO output clock can be adjusted to verify the effectiveness of the proposed ADDCC. The test mode controller receives the control signals and generates the desired test clock for the ADDCC.
The performance comparisons are shown in Table I. In [3, 6, 10], the TDC-based ADDCC architecture must have a high-resolution TDC to minimize the duty-cycle error. However, it is not easy to design a wide-range, high-resolution TDC, so this architecture is not suitable for wide frequency range operation. In addition, the interpolator circuit used in [6] is sensitive to PVT variations. In [9], the output clock is not phase aligned with the input clock. Compared to prior studies, the proposed ADDCC has both a wider frequency range and a wider input duty-cycle range. In addition, the power consumption of the proposed ADDCC is 3.15mW at 1GHz and 0.79mW at 250MHz.
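The two reported power figures (3.15mW at 1GHz and 0.79mW at 250MHz) can be normalized to energy per clock cycle to show how the power scales with frequency. The helper below is a hypothetical convenience function, not part of the original work. It uses only the reported numbers and the relation E = P / f.

```python
def energy_per_cycle_pj(power_mw: float, freq_mhz: float) -> float:
    """Energy per clock cycle in pJ, from power in mW and frequency in MHz.

    mW / MHz gives nJ per cycle; multiplying by 1000 converts to pJ.
    """
    return power_mw / freq_mhz * 1000.0


if __name__ == "__main__":
    for power_mw, freq_mhz in ((0.79, 250.0), (3.15, 1000.0)):
        print(f"{freq_mhz:.0f}MHz: {energy_per_cycle_pj(power_mw, freq_mhz):.2f} pJ/cycle")
```

Both operating points come out at roughly 3.15 to 3.16 pJ per cycle, so the reported power scales almost linearly with clock frequency, which is consistent with a design dominated by dynamic switching power.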

V. CONCLUSION
In this paper, a wide-range all-digital duty-cycle corrector with output clock phase alignment is presented. The proposed DCC architecture achieves wide-range operation, with input frequencies from 250MHz to 1GHz and input duty-cycles from 20% to 80%. In addition, the proposed high-resolution duty-cycle correction delay line can correct the duty-cycle of the output clock to 50%. The proposed ADDCC overcomes the TDC resolution limitations of prior studies. Thus, it is well suited to clock duty-cycle correction in the system-on-a-chip (SoC) era.
