This paper presents an all-digital Delay-Locked Loop (DLL) for DDR SDRAM controller applications. The presented all-digital, cell-based, DLL-based five-phase multi-phase clock generator can generate the required fixed timing delay (tSD) for a DDR SDRAM controller to capture the output data (DQ) correctly. The proposed DLL-based multi-phase clock generator architecture can lock to a harmonic of the input clock period and still produce a correct multi-phase clock output. Hence the design challenge of building a high-resolution delay line with minimum intrinsic delay is reduced. Simulation results and chip measurement results show that the proposed DLL can generate the desired tSD delay with an error below 7.6%. The power consumption of the proposed DLL is 4.1 mW (at DDR-200) and 9.0 mW (at DDR-400).

INTRODUCTION

Delay-Locked Loops (DLLs) have been widely used in high-speed memory interface circuits and clock multipliers to perform clock de-skew [2, 4] and multi-phase clock generation [3, 5]. In those applications, the DLL offers better jitter performance than a Phase-Locked Loop (PLL) because the reference clock jitter and the noise induced by power supply noise or substrate noise disappear at the end of the delay line. In contrast, the ring oscillator of a PLL accumulates jitter and noise effects. Thus the DLL is a good alternative to the PLL in those applications and has good phase tracking ability.

In previous designs, after the DLL is locked, the total delay of the whole delay line is the same as the input clock period. However, if the required number of output multi-phase signals is increased or the required maximum operating frequency is increased, DLL architectures [3, 5] face difficult design challenges in building a high-resolution delay cell that meets the minimum intrinsic delay requirement. As a result, the operating range of previous DLLs is severely limited.

In this paper, a new DLL-based multi-phase clock generator architecture is proposed that lets the DLL lock to a harmonic of the input clock period and still produce a correct multi-phase clock output. The proposed architecture can reduce the design challenges to build a high-resolution delay cell.
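The constraint on a conventional DLL, whose delay line spans exactly one input clock period when locked, can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name is hypothetical and the stage counts other than five are assumptions chosen to show the trend, not configurations taken from the paper.

```python
def stage_delay_ps(period_ps: float, n_stages: int) -> float:
    """Delay each stage must provide when the whole delay line spans
    exactly one reference clock period (conventional DLL lock)."""
    if n_stages <= 0:
        raise ValueError("n_stages must be positive")
    return period_ps / n_stages


# Reference periods for 100 MHz and 200 MHz, in picoseconds.
for freq_mhz, period_ps in ((100, 10_000), (200, 5_000)):
    for n_stages in (5, 10):  # 10 stages is a hypothetical comparison point
        print(f"{freq_mhz} MHz, {n_stages} stages: "
              f"{stage_delay_ps(period_ps, n_stages):.0f} ps per stage")
```

The minimum achievable delay of one stage has to stay below the printed value in the worst case, so the budget tightens as either the clock frequency or the number of phases goes up.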
In Double Data Rate (DDR) SDRAM controller design, the output data strobe (DQS) signal must be delayed by a fixed timing delay (tSD) to capture the output data (DQ) correctly. Figure 1 shows this read operation timing budget. Ideally, DQS and DQ are edge aligned by the DDR SDRAM. However, due to pin-to-pin skew among all DQ and DQS and PCB board skew, the data valid window becomes smaller than expected. As a result, generating an optimal timing delay (tSD) that ensures both the setup and hold time budgets of the controller are met has become an important design issue for DDR SDRAM controller design.

FIGURE 1. READ OPERATION TIMING BUDGET

The proposed DLL is implemented with a 0.13 µm 1P8M CMOS process Structured ASIC cell library. The operating frequency range of the proposed DLL is 100 MHz to 200 MHz, which meets the DDR-200/266/333/400 specifications. The power consumption of the proposed DLL is 4.1 mW (at DDR-200) and 9.0 mW (at DDR-400).

In [1], the calculations for the timing budget show that the optimal value for tSD is approximately 20 percent of an input clock period.
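The 20 percent rule can be tabulated per speed grade. This is a minimal sketch, assuming the nominal clock periods listed in the dictionary (the 7.5 ns and 6 ns entries for DDR-266 and DDR-333 are nominal values supplied here, not figures quoted in the paper); the names `CLOCK_PERIOD_PS` and `t_sd_ps` are hypothetical.

```python
# Nominal clock periods in picoseconds (assumed values for illustration).
CLOCK_PERIOD_PS = {
    "DDR-200": 10_000,  # 100 MHz
    "DDR-266": 7_500,   # about 133 MHz
    "DDR-333": 6_000,   # about 166 MHz
    "DDR-400": 5_000,   # 200 MHz
}


def t_sd_ps(period_ps: int, percent: int = 20) -> int:
    """Target DQS delay as a percentage of the clock period (integer ps)."""
    return period_ps * percent // 100


T_SD_PS = {grade: t_sd_ps(period) for grade, period in CLOCK_PERIOD_PS.items()}

for grade, delay in T_SD_PS.items():
    print(f"{grade}: tSD = {delay} ps")
```

Integer picoseconds are used so the results are exact and easy to compare against simulated values.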
Since the input clock frequencies range from 100 MHz to 200 MHz (DDR-200/266/333/400), the tSD value varies from 2 ns (= 10 ns × 0.2) to 1 ns (= 5 ns × 0.2). In this paper, a five-phase, all-digital, cell-based, DLL-based multi-phase clock generator architecture is proposed to generate this desired tSD delay for the DQS signal. The proposed design can overcome process, voltage, and temperature (PVT) variations and still generate the desired tSD delay.

A DLL-based multi-phase clock generator must have anti-harmonic-lock capability; otherwise the multi-phase clock generation will fail. In previous DLL designs [3] [4], a lock detector is used to prevent locking to a harmonic of the input clock period. A similar concept is proposed in [5], which uses a Time-to-Digital Converter (TDC) to perform period measurement to avoid false lock.

PROPOSED DLL ARCHITECTURE

The proposed DLL architecture is shown in Figure 2. Like most DLL-based multi-phase clock generators [3, 5], the DLL has a multi-stage delay line driven by the same control code (dline control) to generate equally spaced multi-phase clock outputs. The TDC, which is used in the closed loop for reference clock (FREF) period measurement [5], provides the range selection control for the DLL controller.

In previous DLL designs [3] [4] [5], when the DLL is locked, the total delay of the whole delay line is equal to one period of the reference clock (TFREF). Hence each delay stage should have a delay of TFREF/5. As a result, the minimum delay of each delay stage must be smaller than TFREF/5 in the worst case.

... used to achieve very fine resolution and those DCVs are implemented with standard cells. In Figure 3, a change of the fine-tuning control code (FCON[31:0]) finely adjusts the capacitive loading on the F_OUT net. The worst-case resolution of this fine-tuning stage is 1.4 ps with good linearity when it is implemented with a 0.13 µm 1P8M CMOS process Structured ASIC cell library, and the total controllable delay range of this fine-tuning stage is 37 ps.

Figure 4 shows the modified circuit and signal waveforms for the proposed PD. In the proposed PD [6], the phase detector only detects the sign of the phase error (i.e. lead or lag). Both QU and QD are three-state PD outputs used to decide whether IN leads or lags FB. By using the digital pulse amplifier concept in [6] to extend the pulse width of QU and inserting a delay on the QD net, the generated OUTU pulse width can be further extended. This means the detected phase error is enlarged by the improved circuit. As a result, the minimum detectable phase error of this PD is improved. Similarly, the modified circuit for OUTD pulse generation is also shown in Figure 4. Thus the next-stage digital pulse amplifier can more easily extend the OUTU/OUTD pulse width to meet the output register's timing requirements. By carefully designing the digital pulse amplifier to extend the three-state PD output pulse width, the dead zone of this PD can be reduced to 5 ps when it is implemented with the 0.13 µm 1P8M CMOS process Structured ASIC cell library.

FIGURE 4. PHASE DETECTOR DEAD ZONE MINIMIZATION

Since the delay line output signal (P4) shown in Figure 2 has more capacitive loading than the other multi-phase clock signals, dummy cells must be added to the rest of the multi-phase clock signals.

... and DQSD should be 0. (i.e. tSD = 0.2 × TFREF). Figure 6 shows the transient response of the DLL after system reset.

After system reset (i.e. PDN=1), the TDC performs the reference clock (FREF) period measurement. The dline_in signal shown in Figure 6 becomes "low" for two reference clock periods. This pulse is sent to the TDC and is converted into the delay line range control code (range_control[2:0]), and then the delay line executes range selection.

After the delay line finishes range selection, the total delay of the whole delay line falls into the range 1.5 × TFREF < Tdelay-line < 2.0 × TFREF. Then the DLL controller continues fine-tuning the output phase accuracy with the PD's UP/DOWN signal. Since the delay range of the delay line is determined first, the false-lock problem will not occur in the proposed design.

When the phase error between the reference clock (FREF) and the delay line output (P4 or CKOUT[4]) is smaller than the PD's dead zone, the DLL is locked. Both multi-phase clock generation and tSD delay generation are then complete.

The DLL continues to update the delay line control code and keeps tracking the rising edge of the reference clock. Since only the rising-edge information of the reference clock is used in the proposed DLL, it also has good duty-cycle error immunity. Compared to the previous DLL [4], the extra duty-cycle correction (DCC) circuit is eliminated.

Post-layout Fast-SPICE simulations are performed under different operating conditions to verify the stability of the proposed DLL. Table 1 lists the simulated tSD value vs. the desired tSD value. Under different input conditions (DDR-200/266/333/400) and different operating environments, the generated delay error (ΔtSD) stays smaller than 7.6% of the desired tSD value, which meets the required delay error (< 13%) mentioned in [1].

... performance of the DLL output multi-phase clock signals is dependent on the reference clock jitter. But even with a noisy reference clock input, the proposed DLL can still achieve lock and generate the required tSD delay.

The power consumption of the proposed DLL is 4.1 mW (at DDR-200) and 9.0 mW (at DDR-400). The total gate count of the proposed DLL is 5250, and the area of the proposed DLL is 452 µm × 462 µm. Figure 5 shows ...

CONCLUSION

In this paper, an all-digital delay-locked loop for DDR SDRAM controller applications is presented. The proposed DLL architecture not only reduces the design challenge of building a high-resolution delay cell under a minimum delay constraint, but also makes it possible to implement the DLL with standard cells. Thus the proposed architecture can reduce both design time and circuit complexity, and it is very suitable for many high-speed interface circuits and digital communication applications.
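The locking sequence described for the proposed DLL (range selection into 1.5 × TFREF < Tdelay-line < 2.0 × TFREF, then sign-only phase detection until the error falls inside the PD dead zone) can be sketched as a small behavioral model. This is not the authors' controller. It assumes a single uniform tuning step equal to the 1.4 ps worst-case fine resolution, a 5 ps dead zone, a loop that settles at 2 × TFREF (inferred from the stated window), and a sign convention chosen for illustration. All function and parameter names are hypothetical.

```python
def wrapped_phase_error_ps(delay_ps: float, period_ps: float) -> float:
    """Timing error of the delayed edge relative to the nearest reference edge.

    Negative means the delayed edge arrives early, so the delay is too short.
    """
    err = delay_ps % period_ps
    if err > period_ps / 2:
        err -= period_ps
    return err


def lock(period_ps: float, start_ratio: float = 1.6, step_ps: float = 1.4,
         dead_zone_ps: float = 5.0, max_steps: int = 100_000):
    """Bang-bang fine tuning from a delay already placed by range selection."""
    delay_ps = start_ratio * period_ps
    if not (1.5 * period_ps < delay_ps < 2.0 * period_ps):
        raise ValueError("range selection must place the delay in (1.5T, 2.0T)")
    for steps in range(max_steps):
        err = wrapped_phase_error_ps(delay_ps, period_ps)
        if abs(err) < dead_zone_ps:
            return delay_ps, steps       # PD can no longer resolve the error
        # Sign-only decision: one step up if early, one step down if late.
        delay_ps += step_ps if err < 0 else -step_ps
    raise RuntimeError("did not lock within max_steps")


if __name__ == "__main__":
    for period_ps in (10_000.0, 5_000.0):   # 100 MHz and 200 MHz references
        delay_ps, steps = lock(period_ps)
        print(f"T = {period_ps:.0f} ps: locked at {delay_ps:.1f} ps "
              f"after {steps} steps")
```

Because the starting delay is already above 1.5 × TFREF, the nearest reference edge is always the one at 2 × TFREF, which is why the model cannot settle on a different multiple of the period.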
