Abstract-This paper describes a reset-free delay-locked loop (DLL) for a memory controller application that uses a hysteresis coarse lock detector. The coarse lock loop in the proposed DLL brings the delay between the input and output clocks within the pull-in range of the main-loop phase detector. In addition, it monitors the lock status of the main loop by dividing the input clock and counting its multiphase edges. Moreover, by using hysteresis, it controls the coarse lock range and thus reduces jitter. The proposed DLL neither suffers from harmonic lock and stuck problems nor needs an external reset or start-up signal. In a 0.13-μm CMOS process, post-layout simulation demonstrates that, even with switching supply noise, the peak-to-peak jitter is less than 30 ps over the operating range of 500-1200 MHz. The DLL occupies 0.04 mm² and dissipates 16.6 mW at 1.2 GHz.
I. INTRODUCTION
Simultaneous switching noise (SSN) is a voltage fluctuation between power and ground that occurs when multiple output drivers switch simultaneously. The amount of fluctuation is related to the inductance between device ground (power) and system ground (power) and can be expressed as the product of the inductance L and the rate of current change di/dt.
L comprises the inductance of the package bond wire, the package trace, and the board [1]. The other term, di/dt, is cumulative and proportional to the speed and the number of simultaneously switching I/Os. Therefore, as memory controllers scale down and their speed and I/O count increase to meet growing bandwidth requirements, the side effects of SSN become more apparent.
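As a compact restatement of the relation described above, the SSN voltage can be written as follows. The symbols V_SSN, L_eff, N, and i_k are notation introduced here for illustration and do not appear in the original text. Treating the total current slew as a plain sum over N simultaneously switching I/Os is a simplifying assumption.

```latex
V_{\mathrm{SSN}} = L_{\mathrm{eff}} \, \frac{di}{dt},
\qquad
\frac{di}{dt} \approx \sum_{k=1}^{N} \frac{di_k}{dt}
```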
Voltage fluctuation caused by SSN introduces delay and logic faults, so it may make a system unstable and degrade its performance. Until now, however, SSN has been treated only at the level of external circuitry, for example by improving package and board power wiring. Distributing the power currents and shortening the distance to ground are simple remedies. These measures have several limitations, though, while the SSN problem keeps getting worse. SSN must therefore also be considered at the circuit design level.
A delay-locked loop (DLL) is a block widely used in microprocessors, memory interfaces, and communication ICs for generating on-chip clocks and suppressing skew and jitter in the clocks [2]. In designing a DLL, the effect of SSN ripples must be considered along with harmonic lock and stuck problems.
This paper presents a DLL for a memory controller using a hysteresis coarse lock detector (HCLD). With the proposed HCLD, the DLL becomes immune to SSN, which makes it suitable for a memory controller. It is also free from harmonic lock and stuck problems without a reset signal, and it is faster than a DLL using a conventional coarse lock detector (CLD). This paper is organized as follows. Section II presents the system and architecture of this work. Section III describes the circuit implementation and design considerations. Sections IV and V present the simulation results and the conclusion.
II. SYSTEM AND ARCHITECTURE
DLLs are widely used for generating on-chip clocks with low jitter. Fig. 1 shows the architecture of a conventional DLL. Compared with a PLL, this type of DLL has the advantages that it does not accumulate jitter and that it locks quickly. However, it adjusts only phase, not frequency, and its operating frequency and voltage-controlled delay line (VCDL) range are severely limited by the harmonic lock problem [2, 5]. Fig. 2 shows the block diagram of the proposed dual-loop DLL. It consists of a VCDL, an HCLD, a dynamic phase detector (PD), and a current-mismatch-calibrated charge pump (CP). After the HCLD performs frequency acquisition between the input clock and the delayed clock using the VCDL multiphase outputs, one-cycle phase lock occurs in the PD. By using the HCLD, we can not only make the best use of the VCDL range but also make the DLL immune to SSN without harmonic lock and stuck problems.
Fig. 3 shows the implemented memory controller system. It illustrates how a DLL is used in a memory controller and why SSN must be considered in DLL design. The memory controller is composed of four channels, and each channel has one clock-signal (DQS) channel and eight data-signal (DQ) channels. General memory such as GDDR3 transfers edge-aligned DQs and DQS. To sample the DQs in the memory controller, it is therefore necessary to shift the phase of DQS by 180 degrees. To shift the phase, the reference DLL generates the control voltage that corresponds to a one-cycle clock delay and distributes it to the replica half delay line of each channel. The variable delay lines (VDLs) in the DQ and DQS paths change their delay according to their own control codes, so they can calibrate the timing mismatch between DQ and DQS that results from PCB trace differences and process variation.
In this system, the levels of the DQ and DQS input signals are converted from 3.3 V to 1.2 V. A level converter consumes considerable power at every transition. Moreover, about forty signals change simultaneously across the four channels. These are the main sources of SSN. SSN causes voltage fluctuation and degrades the reliability of the sampled DQ signals. Therefore, SSN must be considered in DLL design.
III. CIRCUIT DESCRIPTION
1. Voltage Controlled Delay Line (VCDL)
The VCDL consists of a single-to-differential converter and 15 differential delay stages. The single-to-differential converter is used to preserve the duty cycle of the input clock and to sample the DQs at half rate after DQS passes through the replica DLL. Although only 12 stages are needed to drive the PD, two additional stages are used for the HCLD, and a final stage is added as a dummy. The VCDL delays the input clock according to the control voltage and provides the proper clock phases to the HCLD and the PD. The VCDL range directly determines the DLL's operating range. To cover the whole frequency range from 500 MHz to 1.2 GHz, the per-stage delay of the VCDL, T_VCDL-1stage, must satisfy Eq. (1).
$$12\,T_{\text{VCDL-1stage,max}} > 2\ \text{ns} \;(= 1/500\ \text{MHz}), \qquad 12\,T_{\text{VCDL-1stage,min}} < 0.833\ \text{ns} \;(= 1/1.2\ \text{GHz}) \tag{1}$$
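The delay-range condition of Eq. (1) can be turned into per-stage numbers with a short calculation. The following is a minimal sketch; the function names `per_stage_delay_bounds` and `covers_range` are hypothetical helpers introduced here, and the example stage delays passed to `covers_range` are illustrative values, not figures from the design.

```python
def per_stage_delay_bounds(f_min_hz, f_max_hz, n_stages=12):
    """Return (needed_max_delay, needed_min_delay) per stage, in seconds.

    Following Eq. (1): the slowest per-stage delay must exceed
    (1 / f_min) / n_stages, and the fastest per-stage delay must be
    below (1 / f_max) / n_stages.
    """
    longest_period = 1.0 / f_min_hz
    shortest_period = 1.0 / f_max_hz
    return longest_period / n_stages, shortest_period / n_stages


def covers_range(stage_delay_min_s, stage_delay_max_s,
                 f_min_hz, f_max_hz, n_stages=12):
    """Check whether a stage with the given delay range satisfies Eq. (1)."""
    need_max, need_min = per_stage_delay_bounds(f_min_hz, f_max_hz, n_stages)
    return stage_delay_max_s > need_max and stage_delay_min_s < need_min


if __name__ == "__main__":
    need_max, need_min = per_stage_delay_bounds(500e6, 1.2e9)
    print(f"max delay must exceed {need_max * 1e12:.1f} ps per stage")
    print(f"min delay must be below {need_min * 1e12:.1f} ps per stage")
```

For the 500 MHz to 1.2 GHz range, this gives a required per-stage tuning range that spans roughly 69.4 ps to 166.7 ps.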
2. Hysteresis Coarse Lock Detector (HCLD)
The HCLD is composed of a modified CLD and hysteresis logic. The HCLD generates a clock whose frequency is half that of the input clock. Each cycle of this clock consists of an evaluation phase and a reset phase. In the reset phase, the HCLD counter, QA[1] to QA[5], is cleared to zero. In the evaluation phase, detection of every other edge is ignored until the rising edge of PH[5] is detected. As a result, the HCLD counts the exact number of odd-numbered phase edges in every evaluation phase. The number of edges determines whether UP or DOWN is asserted. Thus the proposed DLL avoids harmonic lock and stuck problems without requiring any external reset or arbitration logic.
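To illustrate why counting phase edges inside a fixed window can separate correct lock from harmonic lock and from a stuck delay line, the following behavioral sketch may help. It is not the circuit of the HCLD: the window of one input period, the use of odd phases 1 to 13, the count thresholds, and the names `count_odd_phase_edges` and `coarse_decision` are all assumptions made for this example.

```python
def count_odd_phase_edges(stage_delay, period, n_phases=14):
    """Count odd-numbered phases whose rising edge falls inside one period.

    Assumptions: phase k rises k * stage_delay after the reference edge,
    and the evaluation window spans [0, period).
    """
    return sum(1 for k in range(1, n_phases + 1, 2) if k * stage_delay < period)


def coarse_decision(stage_delay, period):
    """Classify the delay-line state from the edge count (hypothetical thresholds)."""
    count = count_odd_phase_edges(stage_delay, period)
    if count <= 5:
        # Phase 11 (or an earlier odd phase) is late: total delay is too long.
        # This also covers harmonic lock, where 12 stages span 2T, 3T, ...
        return "DELAY_TOO_LONG"
    if count >= 7:
        # Even phase 13 arrives within one period: total delay is too short,
        # as when the delay line is stuck near its minimum delay.
        return "DELAY_TOO_SHORT"
    # Exactly six edges: phases 1..11 inside the window, phase 13 outside.
    return "COARSE_LOCK"


if __name__ == "__main__":
    period_ps = 1200
    for stage_delay_ps in (50, 100, 200):
        print(stage_delay_ps, coarse_decision(stage_delay_ps, period_ps))
```

In this model, a per-stage delay of 200 ps with a 1200 ps period corresponds to the 12-stage delay spanning two periods, and the low edge count flags it instead of accepting it as lock.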
The conventional CLD [3] has shortcomings in speed and area. To overcome them, the divided clock is delayed by the same amount as the delay of the counting logic before it enters the flip-flop, which provides additional timing margin. This margin is represented by the shaded area in Fig. 4. Fig. 6 shows the hysteresis logic that controls the coarse lock range, together with its timing diagram. Initially, the HCLD locks in narrow mode. After coarse lock has lasted for 3 cycles, the coarse lock range changes from narrow mode to wide mode.
In the SSN environment of a memory controller, power supply voltage fluctuations directly disturb the control voltage, even in the lock state. If the HCLD range were fixed to narrow mode, as in conventional CLDs, SSN would break the lock state and the CLD would quickly recover coarse lock, and this sequence would repeat continually. This can be a source of jitter because the frequency tracking loop and the phase tracking loop may interfere with each other whenever control passes from the CLD to the PD and vice versa. If the HCLD range were fixed to wide mode, on the other hand, locking would proceed properly but more slowly. The proposed HCLD takes advantage of both modes. At first, narrow mode is selected for fast locking. Once the lock state has been entered and held for 3 cycles, the coarse lock range becomes wide. The PD therefore stays in control, and jitter is reduced.
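The narrow-to-wide switching described above can be sketched as a small state machine. This is a behavioral illustration only: the class name `HysteresisRange`, the boolean inputs `in_narrow` and `in_wide`, and in particular the rule that losing lock in wide mode returns the detector to narrow mode are assumptions, since the text specifies only the 3-cycle transition from narrow to wide.

```python
class HysteresisRange:
    """Narrow/wide coarse-lock range selector with a hold requirement."""

    def __init__(self, hold_cycles=3):
        self.hold_cycles = hold_cycles
        self.mode = "narrow"      # start narrow for fast locking
        self.locked_cycles = 0

    def update(self, in_narrow, in_wide):
        """Advance one evaluation cycle and return whether coarse lock holds.

        in_narrow / in_wide: whether the measured delay lies inside the
        narrow / wide coarse-lock window during this cycle.
        """
        locked = in_wide if self.mode == "wide" else in_narrow
        if locked:
            self.locked_cycles += 1
            if self.mode == "narrow" and self.locked_cycles >= self.hold_cycles:
                self.mode = "wide"
        else:
            # Assumption: losing lock returns the detector to narrow mode.
            self.mode = "narrow"
            self.locked_cycles = 0
        return locked


if __name__ == "__main__":
    detector = HysteresisRange()
    for _ in range(3):
        detector.update(in_narrow=True, in_wide=True)
    # A small disturbance that leaves the narrow window but not the wide one
    # no longer breaks coarse lock, so the PD stays in control.
    print(detector.mode, detector.update(in_narrow=False, in_wide=True))
```

The point of the sketch is the asymmetry: a disturbance that would break lock in narrow mode is tolerated once the detector has switched to wide mode.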
3. Dynamic Phase Detector and Charge Pump
The high-precision PD implemented here can operate with a smaller dead zone at high frequencies owing to its circuit symmetry, small logic depth, and small amount of pumped charge [2]. The widths of the UP and DOWN pulses are proportional to the phase difference of the inputs, as shown in Fig. 7. The PD adjusts the delay between the differential input clock and the 12th VCDL output to one cycle. After coarse lock is attained, only the PD controls the VCDL delay. Consequently, the PD determines the overall precision of the DLL. The timing difference between the two input clocks of the PD after phase locking is less than 20 ps.
We also use a current-mismatch-calibrated CP [4]. Most DLLs use a charge pump to implement an integrating loop filter [5], but conventional charge pumps suffer from current mismatch. A difference between the charging and discharging currents can cause a static phase offset as well as dynamic jitter. The implemented CP has not only low current mismatch but also a wide dynamic range. Its valid control voltage range covers the VCDL range required by Eq. (1), so the DLL is stable over a wide range. Fig. 8 shows the block diagram of the implemented CP and its control voltage-current curve. The current mismatch over the valid control voltage range is below 5 μA.
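The link between current mismatch and static phase offset can be made explicit with a simple charge-balance argument. This is a textbook-style sketch under stated assumptions, not a derivation taken from the paper: it assumes that in lock both UP and DOWN pulses share a common minimum width t_0, and that the loop settles where the net charge per cycle is zero. The symbols I_UP, I_DN, t_0, and Δt are introduced here for illustration.

```latex
I_{\mathrm{UP}}\,(t_0 + \Delta t) = I_{\mathrm{DN}}\,t_0
\quad\Longrightarrow\quad
\Delta t = t_0\,\frac{I_{\mathrm{DN}} - I_{\mathrm{UP}}}{I_{\mathrm{UP}}}
```

Under these assumptions the residual timing offset Δt scales with the relative mismatch, which is why reducing the mismatch reduces the static phase offset.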
The replica CP always delivers constant currents by calibrating the UP and DOWN currents to the same amount. Consequently, the multiphase outputs of the VCDL are aligned more uniformly in the lock state, and jitter is reduced.
IV. SIMULATION RESULTS
Fig. 9 shows the overall locking process of the proposed DLL at 1.2 GHz. To model SSN, two ripples are added to the supply voltage: a 1-MHz sinusoidal wave and a 1.2-GHz signal AM modulated at 6 GHz, each with a peak-to-peak amplitude of 2.5% of the nominal supply voltage. The first two waveforms show the modeled supply and an enlarged view of it. Whatever control voltage the DLL has in its initial state, it reaches the lock state without harmonic lock or stuck problems. Fig. 9 shows that the proposed DLL first performs coarse lock by using the HCLD and then enters the phase-locked steady state and remains stable. The lock time is shorter than 1 μs because of the HCLD's narrow coarse lock range. Fig. 10 compares the frequency and phase locking processes with a conventional CLD and with the proposed HCLD. Whereas the CLD repeatedly issues coarse UP and DOWN signals under the influence of SSN, the HCLD maintains the coarse lock state in the given SSN environment. The degree of control voltage fluctuation differs markedly between the two. Fig. 11 shows an eye diagram of the proposed DLL, which is
V. CONCLUSIONS
This paper presents a DLL that is immune to SSN by using a hysteresis coarse lock detector. The dynamic PD and the current-mismatch-calibrated CP provide better performance in jitter, operating range, and uniformity. Fabricated in a 0.13-μm CMOS process with an area of 0.04 mm², the proposed DLL operates over the frequency range from 500 MHz to 1.2 GHz and consumes 16.6 mW at 1.2 GHz. The post-layout simulated peak-to-peak jitter is less than 30 ps in an SSN environment, which is about a 30% improvement over the DLL using a conventional CLD.
