Abstract: A new digital delay-locked loop (DLL) for DDR3/DDR4 SDRAM is presented. The proposed digital DLL employs a new noise-tolerant triple (MSB-interval + binary + sequential) search algorithm for implementing a harmonic-free, fast-locking capability while retaining low jitter, low power consumption, and a wide operating frequency range. The proposed DLL with duty-cycle correction is designed using a 38-nm CMOS process and occupies an active area of just 0.02 mm². The DLL operates over a frequency range of 0.3-2.0 GHz, achieves a peak-to-peak jitter of 7.78 ps, and dissipates 3.48 mW from a 1.1 V supply at 1 GHz.
Introduction
Synchronous dynamic random access memory (SDRAM) has served as an important low-cost main memory solution for the personal computer (PC) and other cost-sensitive consumer electronics markets. Traditional SDRAM provides a maximum memory bus data rate of only 166 Mbps/pin at a maximum clock rate of 166 MHz. As processor speeds continue to increase, the PC becomes more reliant on low-cost, higher-bandwidth memory solutions. Double data rate (DDR) SDRAM was introduced to the market to meet these requirements. DDR SDRAM, also called DDR1 SDRAM, has continued to evolve and has been superseded by successive DDR-x SDRAMs (DDR2, DDR3, and DDR4) that provide better performance and consume less power. DDR3 was introduced in 2007 and DDR4 in 2013 [1]. Currently, the market is moving swiftly from DDR3 to DDR4 because of the lower power consumption and higher speed of DDR4.
In order to achieve a memory bus data rate of over 400 Mbps/pin, DDR-x SDRAMs must incorporate an on-chip delay-locked loop (DLL) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] that eliminates skew problems and provides a larger timing margin at high frequencies. To support both the DDR3 and DDR4 specifications at the same time [13, 14], the DLL should lock within 512 clock cycles and operate over a frequency range from 300 MHz to 1.6 GHz using an internal supply voltage of less than 1.2 V. The DLL must also be capable of correcting the duty cycle of a distorted input clock so that the data-valid window (tDV) can be widened [2].
Currently, most DDR3/DDR4 SDRAMs use a digital DLL [1, 2, 3, 4, 10, 11]. One reason for using a digital architecture is that DDR3/DDR4 SDRAMs require fast recovery times for various power mode transitions. To achieve a fast locking time, a successive approximation register (SAR)-based binary search algorithm has been adopted in DLL designs [5, 6, 7, 8, 9]. However, this introduces a harmonic lock problem [6, 7, 12]. Harmonic locking may occur when the most significant bit (MSB) of the SAR code is changed at the beginning of the SAR operation, since the delay of the DLL is then increased to 50% of the total delay line. To eliminate the harmonic locking problem in a SAR-based digital DLL, a variable SAR (VSAR) algorithm was introduced [7]. However, the scheme in [7] requires a very complex and timing-sensitive fail-to-lock detection circuit, which could easily cause logic failures in the presence of noise on the power supply or ground.
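The harmonic lock mechanism described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit of any cited design: the phase detector is modeled as a bang-bang detector that sees only the phase within one clock cycle, and the 10-bit code width, 4 ps step, and fixed delays are hypothetical values chosen for illustration. All times are integers in picoseconds.

```python
def pd_wants_more_delay(delay_ps, t_ck_ps):
    """Bang-bang phase detector model (assumption): it only sees the phase
    within one cycle, so it cannot tell which multiple of t_CK is nearest."""
    return (delay_ps % t_ck_ps) * 2 >= t_ck_ps


def plain_sar_lock(t_ck_ps, t_fixed_ps, step_ps, bits):
    """Conventional MSB-first SAR search over the whole delay line.

    Returns (final_code, n), where n is the number of clock periods
    spanned by the locked loop delay. n > 1 is a harmonic lock.
    """
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)          # the first trial jumps to 50% of the line
        if pd_wants_more_delay(t_fixed_ps + trial * step_ps, t_ck_ps):
            code = trial                   # keep the bit
    delay_ps = t_fixed_ps + code * step_ps
    n = (delay_ps + t_ck_ps // 2) // t_ck_ps   # nearest multiple of t_CK
    return code, n


# 2 GHz clock (500 ps), hypothetical 10-bit line with a 4 ps step
print(plain_sar_lock(500, 300, 4, 10))  # (924, 8): locked eight periods out
print(plain_sar_lock(500, 200, 4, 10))  # (74, 1): the desired lock
```

In this model the outcome depends only on where the first MSB trial happens to land within a clock cycle, which is why the failure occurs for some initial delays and not for others.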
In this paper, a digital DLL for DDR3/DDR4 SDRAMs is proposed which relies on a new noise-tolerant triple (MSB-interval + binary + sequential) search algorithm for achieving a harmonic-free, fast-locking capability while maintaining low jitter, low power consumption, and a wide operating frequency range. The architecture has three operating modes (MSB, SAR, and Counter mode), one for each stage of the triple search, as shown in Fig. 1(b).
Circuit design
The MSB mode relies on eight evenly spaced delay intervals (1st to 8th interval) to control the 5-bit CDL, which provides high noise tolerance in the DCDL controller design. Fig. 2 shows the proposed CDL, which is made up of 32 cascaded digital delay elements (DEs). Each DE consists of four NAND gates [7], and the delay step of a single DE is tD.
The propagation delay of the DLL can be represented as follows:
$$t_{\mathrm{fixed}} + t_{\mathrm{variable}} = N \cdot t_{\mathrm{CK}} \tag{1}$$
where t_variable is the tunable delay of the DCDL, t_fixed is the initial fixed delay of the DLL when t_variable equals zero, t_CK is the cycle time of the input clock, and N is a positive integer. For ideal phase locking without the harmonic lock problem, N needs to be one. Therefore, t_variable can be represented as follows:
$$t_{\mathrm{variable}} = t_{\mathrm{CK}} - t_{\mathrm{fixed}} \tag{2}$$
Since the CDL consists of 32 DEs and the tunable delay range of the FDL is equal to one tD, the range of t_variable becomes
$$0 \le t_{\mathrm{variable}} \le (32 + 1)\, t_D = 33\, t_D \tag{3}$$
In order to achieve harmonic-free operation at the minimum frequency of 0.3 GHz (t_CK = 3.33 ns) with any values of t_fixed and t3, the maximum variable delay of the DCDL (t_variable,max) should be at least 3.33 ns. Therefore, tD should be larger than approximately 100 ps (= t_variable,max / 33 = 3.33 ns / 33) even in the fastest process corner. Consequently, by choosing a typical tD of 135 ps, the proposed DLL can achieve harmonic-free wide-range operation from 300 MHz to 2.0 GHz.
Fig. 3(a) illustrates the triple search locking process along with the DCDL control bits and the three operating modes. The MSB mode is used to set the DLL output clock near the locking point within eight CLK_CTRL cycles, where CLK_CTRL is the output of the divide-by-N divider with N = 4. The delay range of the DCDL is thus divided into eight intervals, which limits the maximum delay step change in the MSB mode to only 4 × tD. Referring to Fig. 1(b), when the Comp signal changes from logic high to low, the MSB mode is completed, the DLL enters the SAR mode, and the binary search algorithm is applied to the MMR. Fig. 3(b) shows the 10-bit MMR. In order to increase the DCDL delay rapidly without incurring any harmonic locking problem, only the 3 MSBs, M[4:2], are controlled in the MSB mode. The remaining 7 LSBs are then controlled in the SAR mode for the binary search. The SAR-based binary search requires at most 7 CLK_REG cycles, where CLK_REG is the output of the divide-by-M divider with M = 8.
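The three search stages can be sketched as a behavioral model. This is a minimal illustration under several assumptions that are not taken from the text: the 10-bit code is treated as 3 interval bits followed by 7 binary-search bits with an LSB of tD/32, the Comp signal is modeled as an idealized comparator that stays high while the loop delay is shorter than one clock period, and the function names and the 200 ps fixed delay are hypothetical. Only tD = 135 ps, the eight intervals of 4 × tD, the 7-bit binary search, and the divider ratios 4 and 8 come from the description above.

```python
T_D_PS = 135.0          # typical delay-element step (from the text)
LSB_PS = T_D_PS / 32    # assumed fine step: 5 fine bits spanning one tD
DIV_N, DIV_M = 4, 8     # divider ratios for CLK_CTRL and CLK_REG


def comp_high(code, t_fixed_ps, t_ck_ps):
    """Idealized Comp (assumption): high while loop delay < one period."""
    return t_fixed_ps + code * LSB_PS < t_ck_ps


def triple_search(t_ck_ps, t_fixed_ps):
    """Return (code, input_clock_cycles) after the MSB and SAR modes."""
    # MSB mode: walk the eight intervals upward, 4 tD (128 LSBs) at a
    # time, until Comp falls from high to low.
    interval, msb_evals = 7, 0
    for k in range(8):
        msb_evals += 1
        if not comp_high(k << 7, t_fixed_ps, t_ck_ps):
            interval = max(k - 1, 0)   # lock point lies in the previous interval
            break
    code = interval << 7

    # SAR mode: binary search restricted to the 7 LSBs of that interval.
    for bit in reversed(range(7)):
        trial = code | (1 << bit)
        if comp_high(trial, t_fixed_ps, t_ck_ps):
            code = trial

    cycles = msb_evals * DIV_N + 7 * DIV_M
    return code, cycles


def counter_step(code, t_fixed_ps, t_ck_ps):
    """Counter mode: sequential search, one LSB up or down per update."""
    step = 1 if comp_high(code, t_fixed_ps, t_ck_ps) else -1
    return min(max(code + step, 0), 1023)


for f_mhz in (300, 1000, 2000):
    t_ck = 1e6 / f_mhz                       # clock period in ps
    code, cycles = triple_search(t_ck, 200.0)
    residual = t_ck - (200.0 + code * LSB_PS)
    print(f_mhz, code, cycles, round(residual, 2))
```

Because the binary search never leaves the interval selected in the MSB mode, the largest single delay jump in this model is 4 × tD, in contrast to the half-line jump of a plain SAR search.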
After the binary search is completed in the SAR mode, the 10-bit MMR is transformed into a 10-bit counter and the DLL starts the sequential search in the Counter mode, maintaining a closed loop to track process, voltage, and temperature (PVT) variations. As a consequence, the use of the triple search algorithm results in a relatively fast locking time with no harmonic locking problems. The worst-case locking time of the proposed DLL is 88 clock cycles.
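The worst-case figure of 88 is consistent with a simple cycle count, assuming that CLK_CTRL and CLK_REG are obtained from the input clock through the divide-by-4 and divide-by-8 dividers and that the MSB and SAR modes use their maximum of eight and seven divided-clock cycles, respectively. The symbol N_lock,max is introduced here only for illustration.

```latex
N_{\mathrm{lock,max}}
  = \underbrace{8 \times 4}_{\text{MSB mode}}
  + \underbrace{7 \times 8}_{\text{SAR mode}}
  = 32 + 56
  = 88 \ \text{input clock cycles} \; < \; 512
```

Under these assumptions, the result sits well inside the 512-cycle locking requirement cited in the Introduction.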
Experiment results
The proposed digital DLL was designed in a 38-nm Powerchip CMOS process. In Counter mode, the input clock (CLK_IN) and output clock (CLK_OUT) of the DLL are precisely aligned with each other, with no clock skew. The clock buffer and the replica path are assumed to be ideal and are therefore omitted from this simulation.
Fig. 7 shows the simulated peak-to-peak (p-p) jitter of the output clock. The proposed DLL achieves a simulated p-p jitter of 12.03 ps and 9.71 ps at 300 MHz [...]
In DDR memory systems, the input clock of the DLL may have a duty-cycle distortion (DCD). Therefore, the DLL should be able to lock properly regardless of the input clock's duty-cycle variation. In this design, the PD compares only the rising edges of the input and feedback clocks, and the DCDL controller does not depend on the input clock's duty-cycle ratio. Therefore, the phase locking operation of the proposed DLL is immune to DCD. Since the output clock of the DLL can be distorted by device mismatches, the proposed DLL adopts a digital DCC [9] to improve performance at high frequencies. The digital DCC achieves a fast locking time of less than 24 input clock cycles and can be turned off during the power-down mode. The DCC achieves a duty-cycle correction range of 30-70% at 1.0 GHz. Fig. 9 shows the simulated input and output clocks of the proposed DCC-equipped DLL with distorted input clock duty cycles. As shown in Fig. 9(a), the DLL achieves a corrected output clock duty cycle of 50.38% from a 55% input duty cycle at 300 MHz. Fig. 9(b) shows an output duty cycle of 50.48% from a 60% input duty cycle at 2 GHz.
A performance comparison between the proposed digital DLL and other state-of-the-art DDR3/DDR4 digital DLLs is given in Table I.
Conclusion
This Letter presents a new digital DLL architecture which is capable of duty-cycle correction and harmonic-free, fast locking in DDR3 and DDR4 SDRAMs. The proposed DLL achieves these capabilities by adopting a new noise-tolerant triple (MSB-interval + binary + sequential) search algorithm. Implemented in a 38-nm Powerchip CMOS DRAM process, the proposed DLL operates over a frequency range of 0.3-2.0 GHz, achieves a peak-to-peak jitter of 7.78 ps, and dissipates 3.48 mW from a 1.1 V supply at 1 GHz. The proposed DCC-equipped DLL occupies an active area of just 0.02 mm².
