An all-digital phase-locked loop (ADPLL) for high-speed clock generation is presented in this paper. The proposed ADPLL architecture can be implemented with standard cells. Implemented in a 0.35µm 1P4M CMOS process, the ADPLL operates from 40MHz to 540MHz. The p-p jitter of the output clock is less than ±170ps, and the rms jitter of the output clock is less than 39ps. A systematic way to design the ADPLL with a specified standard cell library is also introduced. The proposed ADPLL can be ported to different processes in a short time, which reduces the design time and design complexity of the ADPLL and makes it well suited to System-on-Chip (SoC) applications.
ARCHITECTURE OVERVIEW
Phase-locked loops (PLLs) are commonly used in data communication and microprocessors, where they serve in clock and data recovery (CDR) circuits and frequency synthesis applications. Conventional PLLs are usually designed with analog approaches. Because analog PLLs have to overcome noise coupling, signal shading, and power supply noise effects, they are difficult to integrate into a system. All-digital phase-locked loops (ADPLLs) [1-4] have been proposed to overcome these disadvantages of analog PLLs and to reduce the turnaround time for system integration.
Two major problems must be considered carefully when designing an ADPLL. One is how to design a digitally controlled oscillator (DCO) with a wide operating range and high resolution; the other is how to speed up frequency acquisition and phase acquisition and reduce the clock jitter that comes from the reference clock. In [1], a DCO with binary-weighted MOS widths is proposed to achieve high frequency resolution. Because the DCO in [1] occupies a large area, a 4-to-1 MUX is used in [2] to reduce the MOS width in each delay element. In [3-4], a standard-cell-based DCO is proposed.
There are two DCOs in the proposed ADPLL: one is used for tracking the reference clock, and the other is used for generating the output clock.
The dco-out-divM signal is the output of the inner DCO divided by M. The PFD detects the frequency difference and phase error between REFCLK and dco-out-divM. A P-UP or P-DOWN signal from the PFD indicates that the inner DCO should be sped up or slowed down, respectively. When the controller receives these signals, it changes the inner DCO's control code, coarse[5:0] and fine[5:0], to alter the inner DCO's output frequency. This control code is also sent to the loop filter. After the ADPLL has finished frequency acquisition, the loop filter detects the maximum and minimum inner DCO control codes over 512 reference clock cycles and outputs (DCO control code (max) + DCO control code (min))/2 as the average DCO control code for the output DCO.
Thus the output clock has low jitter even with a noisy reference clock.
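The averaging rule of the loop filter can be sketched in a few lines of Python. This is only an illustration of the (max + min)/2 rule over 512-cycle windows; the function name, the use of integer division, and the sample code values are assumptions and are not taken from the design.

```python
def loop_filter_average(codes, window=512):
    """Return (max + min) // 2 of the DCO control code for each full window.

    codes  : inner-DCO control codes, one per reference clock cycle
    window : number of reference cycles per averaging window (512 in the text)
    Integer division is an assumption; the rounding rule is not stated.
    """
    averages = []
    for start in range(0, len(codes) - window + 1, window):
        chunk = codes[start:start + window]
        averages.append((max(chunk) + min(chunk)) // 2)
    return averages


# Hypothetical locked-state trace: the code wanders between 2047 and 2051.
codes = [2047 + (i % 5) for i in range(1024)]
print(loop_filter_average(codes))
```

Because only the extremes of each window are used, the output DCO sees one steady code per window instead of the cycle-by-cycle corrections applied to the inner DCO.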
When the PFD output changes from P-UP to P-DOWN or vice versa, the search step is reduced to one half of the previous search step. After the search step becomes 1, the frequency acquisition is finished. Fig. 3 shows the state diagram of the polarity detector used in the controller; the polarity detector finds the transition of the PFD's output and sends a polarity-change-detect signal to the controller.
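The step-halving search described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the controller's HDL: the PFD is reduced to a comparison against a hypothetical target code, a larger code is assumed to mean a slower DCO, and the start code, start step, and cycle cap are chosen only for illustration.

```python
def acquire(target_code, start_code=2048, start_step=1024,
            max_code=4095, max_cycles=64):
    """Model frequency acquisition: halve the step on each PFD polarity change.

    Assumption: larger code = slower DCO, so "UP" (speed up) lowers the code.
    Returns the final code and the list of codes visited.
    """
    code, step = start_code, start_step
    prev = None
    history = [code]
    for _ in range(max_cycles):
        # Idealized PFD: compare the current code with the target code.
        direction = "UP" if code > target_code else "DOWN"
        if prev is not None and direction != prev:
            step //= 2  # polarity change detected: halve the search step
        prev = direction
        code += -step if direction == "UP" else step
        code = min(max(code, 0), max_code)  # keep the code in range
        history.append(code)
        if step == 1:  # acquisition finishes once the step reaches 1
            break
    return code, history


final_code, trace = acquire(target_code=1000)
print(final_code, len(trace) - 1)
```

In this model the final code lands within one code of the target, after which only small corrections remain.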
After frequency acquisition is finished, only the DCO fine-tuning control code is changed, and the loop filter eliminates the damping effect caused by the limited sensitivity of the PFD and the limited frequency resolution of the DCO. TpHL+TpLH of one coarse-tuning delay cell is about 300ps. Thus when the DCO coarse-tuning control code increases or decreases by one, the period of the output clock changes by ±300ps.
CIRCUIT DESCRIPTION
To increase the frequency resolution of the DCO, a fine-tuning delay cell is added after the coarse-tuning stage. The fine-tuning delay cell is shown in Fig. 4(b) [5]. It consists of an and-or-invert (AOI) cell and an or-and-invert (OAI) cell. Both the AOI cell and the OAI cell are shunted with two tri-state buffers. The shunted tri-state buffers increase the controllable range of the fine-tuning delay cell, which should cover one coarse-tuning step (i.e. 300ps). In the fine-tuning delay cell, six bits in total (EN1, A1, B1, EN2, A2, B2) can be controlled, so 64 (= 2^6) different delay steps can be used. According to HSPICE simulation, adding the fine-tuning delay cell improves the resolution of the DCO to 5ps.
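As a rough check on these numbers, the average fine-tuning step can be estimated by dividing one coarse step by the number of fine codes. The sketch below assumes uniformly spaced fine steps, which a real AOI/OAI cell with shunted tri-state buffers will not provide exactly; the function name is hypothetical.

```python
def average_fine_step_ps(coarse_step_ps, fine_bits):
    """Average delay per fine code if the fine range spans one coarse step.

    Assumes evenly spaced fine steps (an idealization).
    """
    return coarse_step_ps / (2 ** fine_bits)


fine_steps = 2 ** 6                      # six control bits
avg_step = average_fine_step_ps(300.0, 6)
print(fine_steps, avg_step)
```

The estimate of a little under 5ps per fine code is in line with the simulated resolution quoted above.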
By HSPICE simulation, the maximum output frequency of the DCO is 545MHz (1.833ns) and the minimum output frequency is 41MHz (24.261ns).
The digital pulse amplifier is shown in Fig. 7. It uses the asymmetric propagation delay between TpHL and TpLH in the AOI cells and OAI cells to increase the pulse width of OUTU and OUTD so that it meets the pulse width requirement of the D-Flip/Flop's reset pin. This timing requirement in the 0.35µm 1P4M CMOS cell library is 800ps. The digital pulse amplifier can stretch any pulse wider than 100ps to 800ps. Thus the sensitivity of the PFD is improved to ±100ps.
The ADPLL controller, frequency divider, and loop filter are described in a Hardware Description Language (HDL) and can be synthesized to gate-level circuits by a logic synthesizer.
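The DCO range quoted above can be cross-checked by converting the period limits back to frequencies. The sketch below reads the simulated periods as 1.833ns and 24.261ns; the helper name is hypothetical.

```python
def period_ns_to_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1e3 / period_ns


# Period limits as read from the text (HSPICE values), in ns.
fastest_mhz = period_ns_to_mhz(1.833)
slowest_mhz = period_ns_to_mhz(24.261)
print(f"{fastest_mhz:.1f} MHz, {slowest_mhz:.1f} MHz")
```

Both results agree with the quoted 545MHz and 41MHz to within about 1MHz.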
Therefore a systematic way is provided to design the ADPLL with a specified standard cell library. First, a transistor-level simulation of the fine-tuning delay cell is performed. Once the controllable range of the fine-tuning cell is determined, a suitable coarse-tuning cell, whose TpHL+TpLH is less than or equal to the controllable range of the fine-tuning delay cell, can be selected from the cell library. The output clock specifications determine the number of select paths in the coarse-tuning stage. The rest of the ADPLL is realized by the logic synthesizer. Thus the design time and the complexity of designing an ADPLL are reduced, and the proposed ADPLL architecture can be ported to a different process in a short time.
With a 5MHz reference clock and a division ratio M of 40, the output frequency is 200MHz (=5MHz*40). When RESET=0, the ADPLL starts to work. If the frequency of dco-out-divM is higher than the reference clock, the PFD generates a P-DOWN signal to indicate that the inner DCO should be slowed down; when the frequency of dco-out-divM is lower than the reference clock, a P-UP signal is generated to speed up the inner DCO.
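The cell-selection step of the design procedure above can be sketched as follows. The cell names and delay values are hypothetical, and picking the largest admissible delay (to cover the period range with fewer select paths) is an assumption; the text only requires TpHL+TpLH to be no larger than the fine-tuning range.

```python
def pick_coarse_cell(cells, fine_range_ps):
    """Choose a coarse-tuning cell whose TpHL+TpLH fits the fine-tuning range.

    cells         : dict mapping cell name -> TpHL+TpLH in ps (hypothetical data)
    fine_range_ps : controllable range of the fine-tuning cell, in ps
    Returns the name of the admissible cell with the largest delay.
    """
    admissible = {name: d for name, d in cells.items() if d <= fine_range_ps}
    if not admissible:
        raise ValueError("no cell delay fits within the fine-tuning range")
    return max(admissible, key=admissible.get)


# Hypothetical library excerpt; delays would come from transistor-level simulation.
library = {"buf_a": 220.0, "buf_b": 300.0, "buf_c": 410.0}
print(pick_coarse_cell(library, fine_range_ps=320.0))
```

When porting to another process, only the delay table and the simulated fine-tuning range need to be regenerated before repeating the selection.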
When a polarity transition is detected by the polarity detector, the search step for frequency acquisition is reduced to one half of the previous search step. When the polarity transition has happened twice, the upper bound and lower bound of the target frequency have been determined; to further speed up the frequency acquisition process, the step is then directly reduced to 64. Thus the DCO control code, code[11:0], which means {coarse[5:0], fine[5:0]}, converges to the fine-tuning region within 16 reference clock cycles. After the search step is reduced to 1, the frequency acquisition is completed.
After the ADPLL is locked, the loop filter takes the average of the DCO control code. Fig. 9 shows the simulation waveform for the loop filter. In Fig. 9, the reference clock is 7.8125MHz and M is 64, so the output frequency is 500MHz. When the ADPLL is locked, the DCO control code has converged to the fine-tuning region. The maximum and minimum DCO control codes during 512 reference clock cycles are almost the same, but there is a small damping in the DCO control code. This damping effect is caused by the sensitivity limitation of the PFD and the resolution limitation of the DCO. Since the PFD has a dead zone, the phase error accumulates when the PFD cannot distinguish two different frequencies. When the accumulated phase error is large enough to change the polarity of the PFD, the DCO control code shows a damping. Thus the loop filter takes the average of the DCO control code and outputs it as the avgcoarse.
Fig. 10 shows the jitter analysis and power spectral density analysis for the proposed ADPLL. Both are calculated from the post-layout simulation waveforms over a period of 8µs. The reference clock is 10MHz (100ns), and the division ratio M is 20. The reference clock has a peak-to-peak jitter of ±1800ps and an rms jitter of 800ps. Fig. 10 shows that the peak-to-peak jitter of the output clock is ±170ps and the rms jitter of the output clock is 39ps. The phase noise of the output clock is -40dBc @ 100kHz. The jitter that comes from the reference clock is thus reduced by the loop filter.
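Jitter figures of this kind can be computed from a list of clock edge times. The sketch below uses period jitter (deviation of each period from the mean period), which is an assumption; the text does not state which jitter definition the analysis uses, and the edge times are made-up sample data.

```python
import statistics


def period_jitter(edges_ps):
    """Return (mean period, (min dev, max dev), rms jitter) in ps.

    edges_ps : rising-edge timestamps in ps, in increasing order
    Peak-to-peak jitter is reported as the extreme deviations from the
    mean period; rms jitter is the population standard deviation.
    """
    periods = [b - a for a, b in zip(edges_ps, edges_ps[1:])]
    mean = statistics.fmean(periods)
    deviations = [p - mean for p in periods]
    rms = statistics.pstdev(periods)
    return mean, (min(deviations), max(deviations)), rms


# Made-up edges of a nominal 200MHz clock (5000ps period) with small errors.
edges = [0, 5000, 10010, 15000, 20000]
mean_ps, pk_pk_ps, rms_ps = period_jitter(edges)
print(mean_ps, pk_pk_ps, round(rms_ps, 3))
```

Running the same measurement on the reference clock and on the output clock gives the before-and-after comparison used to judge the loop filter.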
Fig. 11 shows the layout of the ADPLL, which is generated by an Auto Placement and Routing (APR) tool. The two DCOs and the PFD should be given region constraints to minimize the wire-loading effect during APR.
The gate count of the ADPLL is 4800. The core size of the ADPLL is 840µm x 840µm, and the chip size including I/O pads is 2010µm x 2010µm. The power consumption of the ADPLL is 100mW (@500MHz).
CONCLUSION
In this paper, an all-digital phase-locked loop is presented. The ADPLL can be implemented with standard cells and is portable across different processes. Implemented in a 0.35µm 1P4M CMOS standard cell library, the ADPLL operates from 40MHz to 540MHz. The p-p jitter of the output clock is less than ±170ps, and the rms jitter of the output clock is less than 39ps. A systematic way to design the ADPLL is also introduced. The proposed ADPLL can reduce design time and circuit complexity.
Therefore it is very suitable for SoC applications.
