### AN ABSTRACT OF THE DISSERTATION OF

Wenjing Yin for the degree of Doctor of Philosophy in Electrical and Computer Engineering presented on October 20, 2010.

Title: Design Techniques for High-Performance Digital PLLs and CDRs

Abstract approved: \_

Pavan Kumar Hanumolu

Phase-Locked Loops (PLLs) are essential building blocks in many communication systems. Designing high performance analog PLLs in the presence of technology imposed constraints such as leakage, poor analog transistor behavior, process variability, and low supply voltage is a challenging task. To overcome these drawbacks, digital PLLs (DPLLs) have recently emerged as an alternative to analog PLLs.

In this work, a digital PLL employing a linear proportional path and a double integral path is proposed to achieve low jitter, wide operating range, and low power. Bandwidth tracking and tuning range tracking are also achieved. The prototype DPLL, fabricated in a 90nm CMOS process, operates from 0.7 to 3.5GHz. At 2.5GHz, the proposed DPLL consumes only 1.6mW and achieves 1.6ps rms jitter.

Moreover, the design techniques for a novel digital clock and data recovery (CDR) circuit with linear loop dynamics are presented. The PLL-based digital CDR avoids the use of a TDC, is free of static phase offset (SPO-free), and achieves a well-controlled jitter transfer bandwidth. The prototype digital CDR, fabricated in a $0.13 \mu m$ CMOS process, achieves error-free operation (BER $< 10^{-12}$) for PRBS data sequences ranging from $2^7 - 1$ to $2^{31} - 1$ and a near-constant bandwidth of 4.5MHz.

<sup>©</sup>Copyright by Wenjing Yin October 20, 2010 All Rights Reserved

## Design Techniques for High-Performance Digital PLLs and CDRs

by

Wenjing Yin

## A DISSERTATION

submitted to

Oregon State University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Presented October 20, 2010

Commencement June 2011

Doctor of Philosophy dissertation of Wenjing Yin presented on October 20, 2010.

APPROVED:

Major Professor, representing Electrical and Computer Engineering

Director of the School of Electrical Engineering and Computer Science

Dean of the Graduate School

I understand that my dissertation will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my dissertation to any reader upon request.

Wenjing Yin, Author

## ACKNOWLEDGEMENTS

I would like to thank my advisor Dr. Pavan Kumar Hanumolu for his encouragement and guidance during the course of my research. I am grateful to him for having shared his knowledge and experience with me over these years. I remain inspired by his perpetual energy and enthusiasm in research. I would like to extend my sincerest gratitude to Prof. Un-Ku Moon, Prof. Kartikeya Mayaram and Prof. Gabor Temes for serving on my thesis committee. Their brilliant ideas, insightful comments, and constructive suggestions played an important role in my research. I would also like to thank Prof. Abdollah Farsoni for being the graduate representative in my Ph.D. program.

All my colleagues and friends at Oregon State University made it a pleasant place to work and live. In particular, I would like to thank Rajesh Inti, Amr ElShazly, Brian Young, Abhijith Arakali, Seok-Min Jung, Sachin Rao, Qadeer Khan, Savesh Bang and Bangda Yang for many interesting discussions and great help with my research. Many thanks for the friendship of Kangmin Hu, Ronghua Ni, Chen Cao, Jinzhou Cao, Charles Hu, Tao Jiang, Yan Wang, Wenhuan Yu, Weilun Shen, Jiangguang Wang, Yue Hu, Jiaming Lin, Ruiqing Ye, Jinjin He, Wen Bai and Chao Wang. All of you made my life at Oregon State University so enjoyable and memorable.

I would like to thank the Semiconductor Research Corporation (SRC) for funding this research. I want to acknowledge Dongbu HiTek for providing IC fabrication. I would also like to thank Kawasaki Microelectronics America for testing support.

All of the work was made possible by the love and encouragement of my family. I am appreciative of the love and support from them. Finally, I would like to thank my husband Yun Zhu for his love, support and tremendous consideration.

# TABLE OF CONTENTS

| Section | Title |
|---------|-------|
| 1 | Introduction |
| 1.1 | Overview of Phase-Locked Loops |
| 1.2 | Motivation and Contribution |
| 1.3 | Thesis Organization |
| 2 | Jitter Performance analysis for digital PLLs and CDRs |
| 2.1 | Definitions of Jitter |
| 2.2 | DPLL Clock Jitter |
| 2.2.1 | Deterministic Jitter |
| 2.2.3 | DPLL Noise Analysis |
| 2.3 | CDR Clock Jitter |
| 2.3.1 | Jitter Generation |
| 2.3.2 | Jitter Transfer |
| 2.3.3 | Jitter Tolerance |
| 2.4 | Summary |
| 3 | A low jitter, wide operating range digital phase-locked loop |
| 3.1 | Introduction |
| 3.1.1 | Limitation on TDC Quantization error |
| 3.1.2 | Limitation on DCO Phase Noise |
| 3.1.3 | Limitation on DCO Tuning Range |
| 3.2 | Proposed DPLL Architecture |
| 3.2.1 | Simplified Proposed Architecture |
| 3.2.2 | Proposed Double Integral DPLL Architecture |
| 3.2.4 | Quantization Error Tracking |
| 3.3 | Circuits Design |
| 3.3.1 | Digitally-Controlled Oscillator |
| 3.3.2 | Digital-to-Analog Converter |
| 3.3.3 | Tunable Switched RC Filter |
| 3.4 | Experimental Results |
| 3.5 | Summary |
| 4 | A TDC-less, linear loop dynamics digital clock and data recovery circuit |
| 4.1 | Introduction |
| 4.2 | Proposed CDR Architecture |
| 4.2.1 | Bang-bang Phase Detector |
| 4.2.2 | Linear Phase Detector |
| 4.3 | Circuits Design |
| 4.4 | Experimental Results |
| 4.5 | Summary |
| 5 | Conclusion |
| 5.1 | Contribution |
| 5.2 | Future work |
|   | Bibliography |

# LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| 1.1 | PLL Block Diagram. |
| 1.2 | CPPLL Block Diagram. |
| 1.3 | DPLL Block Diagram. |
| 2.1 | Timing jitter. |
| 2.2 | Tracking jitter. |
| 2.3 | Delay line based flash TDC. |
| 2.4 | Timing of flash TDC. |
| 2.5 | TDC transfer characteristic. |
| 2.6 | Accumulator output digit before and after dropping LSBs. |
| 2.7 | Phase noise of open-loop oscillator. |
| 2.8 | DPLL Linear Model. |
| 2.9 | Noise transfer functions from TDC, DLF, and DCO to output. |
| 2.10 | Jitter transfer function. |
| 2.11 | Jitter tolerance for a type-II PLL. |
| 3.1 | Conventional DPLL architecture. |
| 3.2 | TDC transfer characteristic. |
| 3.3 | Noise bandwidth tradeoff. |
| 3.4 | DCO resolution-tuning range tradeoff. (a) Small LSB. (b) Large LSB. |
| 3.5 | Block diagram of the simplified proposed DPLL. |
| 3.6 | The transfer characteristic for: (a) conventional !!PD. (b) proposed PFD+!!PD. |
| 3.7 | Block diagram of the proposed double integral DPLL. |
| 3.8 | Simulated random jitter vs. frequency with and without bandwidth tracking. |
| 3.9 | Step response behavior of the proposed DPLL and an equivalent analog PLL. |
| 3.10 | Closed-loop frequency response with different $K_P$. |
| 3.11 | Simulated deterministic jitter vs. frequency with and without quantization error tracking. |
| 3.12 | Digitally controlled oscillator. |
| 3.13 | Current controlled oscillator. |
| 3.14 | DAC for integral and double integral paths. |
| 3.15 | Delta-sigma modulator. |
| 3.16 | (a) Tunable switched RC low pass filter. (b) Clock circuit for CMOS switch. (c) Clock waveforms for different duty cycle D1 and D2. |
| 3.17 | Simulated deterministic jitter vs. frequency with and without switched RC filter. |
| 3.18 | Die photograph. |
| 3.19 | Jitter histogram @2.5GHz. |
| 3.20 | Phase noise @2.5GHz. |
| 3.21 | Measured Reference Spur @2.5GHz. |
| 3.22 | Jitter performance vs. frequency. |
| 3.23 | Power consumption vs. frequency. |
| 3.24 | Measured bandwidth/reference ratio with and without bandwidth tracking. |
| 4.1 | Conventional analog CDR architecture. |
| 4.2 | Proposed CDR architecture. |
| 4.3 | Alexander phase detector. |
| 4.4 | Alexander PD clock waveform. (a) Clock is early. (b) Clock is late. (c) Clock aligned or data absence. |
| 4.5 | Alexander PD output probability with clock jitter and voltage offset. |
| 4.6 | Hogge phase detector. |
| 4.7 | Hogge PD clock waveform. |
| 4.8 | The transfer characteristics for: (a) Alexander PD. (b) Hogge PD. |
| 4.9 | Sense-amplifier flip-flop schematic. |
| 4.10 | Decimator circuit. (a) Schematic. (b) Input and output digits. |
| 4.11 | Simplified digitally controlled oscillator schematic. |
| 4.12 | Die photograph. |
| 4.13 | Jitter histogram @2.5Gb/s. |
| 4.14 | Measured phase noise @2.5Gb/s of proposed CDR and conventional !!CDR for 2 integral path gains. |
| 4.15 | Measured recovered clock jitter as a function of jitter transfer bandwidth and input PRBS sequence length. |
| 4.16 | Jitter tolerance for PRBS $2^7 - 1$ and BER threshold of $10^{-12}$. |
| 4.17 | Measured jitter transfer bandwidth for proposed CDR and conventional !!CDR for different input jitter amplitudes. |
| 4.18 | Measured bathtub curves for proposed CDR and conventional !!CDR with $2^{31} - 1$ PRBS input. |

# LIST OF TABLES

| Table | Caption |
|-------|---------|
| 3.1 | DPLL Performance Summary |
| 4.1 | CDR Performance Summary |

#### Chapter 1 – Introduction

The rapid advance of integrated circuit (IC) technologies has revolutionized the world of electronics and led to an exponential expansion of information technology. A phase-locked loop (PLL) is an integral component of almost all digital, analog, and radio-frequency integrated circuits that require a clock as a timing reference. The charge-pump phase-locked loop (CPPLL) [5] is the most commonly used architecture in PLL designs. A PLL is essentially a negative feedback loop that locks the phase of the on-chip clock to the phase of an input signal. PLLs are widely used in digital systems for two main purposes: clock generation and timing recovery. For clock generation, since the off-chip reference frequency is limited by the maximum crystal frequency, a PLL multiplies the reference clock frequency up to the multi-gigahertz operating frequency and then drives all other parts of the chip. For timing recovery, the PLL aligns the phase of the recovered clock with the input data and thus helps to regenerate the data communication timing.

Although this analog architecture has met the high-performance requirements of applications over the last twenty years, continued technology scaling in deep-submicron CMOS processes imposes severe constraints such as current leakage, poor analog transistor behavior, low supply voltage, and process variability. Overcoming such technology limitations in PLL designs often incurs penalties in performance, area, power, time-to-design, and design flexibility. For instance, transistor leakage mandates the use of metal capacitors in place of high-density MOS capacitors, causing a more than threefold increase in the loop filter area. To overcome these drawbacks, digital PLLs (DPLLs) [27, 16, 25] have recently emerged as an alternative to analog PLLs. They eliminate the need for the charge pump and analog filter, and therefore relieve the problems of current leakage and transistor mismatch. Instead, a DPLL uses a digital loop filter in an all-digital implementation that avoids poor analog transistor behavior and intrinsic technology limitations.

The second advantage of a digitally-enhanced PLL is its ability to self-calibrate to mitigate performance degradation due to process, voltage, and temperature (PVT) variations. Compared to analog circuits, digital circuits are less sensitive to PVT variations, so a DPLL can achieve better performance because its loop dynamics remain constant under PVT variations. In addition, a DPLL is easier to port between processes: porting merely requires scaling a number of digital blocks, whereas an analog PLL requires a complete redesign of all analog blocks in the new process. This can greatly shorten the design cycle and improve the efficiency of releasing new products. Last, a digital implementation benefits from the inherent noise immunity of digital circuits, which decreases the noise contributions.

With the advantages discussed above, the digitally-enhanced PLL continues to benefit from the intrinsic characteristics of digital circuits, and the study of DPLL design is increasingly necessary and in great demand. The goal of this thesis is to explore design techniques for high-performance DPLLs from both system-level and circuit-level perspectives, to analyze and relieve the design bottlenecks of conventional DPLLs, and to extend the discussion to DPLL-based clock and data recovery (CDR) circuit designs.

## 1.1 Overview of Phase-Locked Loops

A basic block diagram of a PLL is shown in Fig. 1.1. It consists of a phase detector (PD), a loop filter (LF), a voltage-controlled oscillator (VCO), and a divider. The PD detects the phase difference between the reference clock $F_{REF}$ and the feedback clock $F_V$ and generates an output that is proportional to the phase difference. The LF filters the PD output to produce the control signal of the oscillator, which adjusts the VCO output frequency $F_{VCO}$. The divider in the feedback path divides $F_{VCO}$ down to the lower feedback clock frequency $F_V$. Once $F_V$ achieves phase lock with $F_{REF}$, $F_{VCO}$ is N times $F_{REF}$.



Figure 1.1: PLL Block Diagram.

The CPPLL is the most commonly used architecture in PLL designs. The block diagram of a charge-pump PLL is shown in Fig. 1.2. A phase and frequency detector (PFD) compares the frequency and phase of $F_{REF}$ and $F_{V}$ and then provides information about the phase or frequency difference through its UP and DN outputs. The charge pump (CP) injects charge into or out of the loop filter capacitor. If the CP receives an UP signal, current is driven into the LF. Conversely, if it receives a DN signal, current is drawn from the LF. The combination of CP and LF is an integrator that generates an average value proportional to the phase error. Based on the average control voltage, the VCO oscillates at a higher or lower frequency, which affects the phase and frequency of the feedback clock. The VCO stabilizes once $F_{REF}$ and $F_V$ have the same phase and frequency. The LF filters out jitter by removing glitches from the CP and preventing voltage overshoot.



Figure 1.2: CPPLL Block Diagram.

An alternative PLL design is the digital PLL. A simplified block diagram of a DPLL is shown in Fig. 1.3. The PD of the analog PLL is replaced with a time-to-digital converter (TDC) that converts the phase difference into a digital output. A bang-bang phase detector (BBPD), implemented with a simple D flip-flop (DFF), is a 1-bit TDC. The digital output of the TDC is filtered by the digital loop filter (DLF), whose output is the input control word of the digitally-controlled oscillator (DCO). A DCO can be implemented in different ways. For a ring-oscillator-based DCO, digital control can be realized by turning switches or tri-state inverters on and off. For an LC-based DCO, frequency tuning is done by selecting capacitors in the tank capacitor bank. Alternatively, placing a digital-to-analog converter (DAC) in front of a VCO to convert the digital input to a voltage or current is also a practical way to design a DCO.



Figure 1.3: DPLL Block Diagram.
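The signal flow of Fig. 1.3 can be illustrated with a small behavioral model. The sketch below is only an illustration under stated assumptions and is not the architecture proposed in this thesis: the TDC is reduced to a 1-bit BBPD, the DLF is assumed to be a proportional-plus-integral filter, the DCO is assumed perfectly linear, the phase error is kept unwrapped (no cycle slips), and every numeric value (`f_ref`, `n_div`, `f_free`, `k_dco`, `kp`, `ki`) is hypothetical.

```python
def simulate_bb_dpll(cycles=20000, f_ref=100e6, n_div=25,
                     f_free=2.4e9, k_dco=1e6, kp=4.0, ki=0.25):
    """Return the DCO frequency (Hz) at every reference cycle.

    All parameter values are hypothetical and chosen only for illustration.
    """
    phase_err = 0.0  # reference phase minus feedback phase, in reference cycles
    acc = 0.0        # integral-path accumulator of the DLF, in DCO LSBs
    freqs = []
    for _ in range(cycles):
        # BBPD (1-bit TDC): only the sign of the phase error is observed.
        e = 1.0 if phase_err >= 0.0 else -1.0
        # DLF: proportional path plus integral path (assumed structure).
        acc += ki * e
        ctrl = kp * e + acc
        # DCO: frequency is linear in the control word (idealized).
        f_dco = f_free + k_dco * ctrl
        freqs.append(f_dco)
        # Over one reference period the reference advances by one cycle and
        # the divided DCO clock advances by f_dco / (n_div * f_ref) cycles.
        phase_err += 1.0 - f_dco / (n_div * f_ref)
    return freqs


freqs = simulate_bb_dpll()
tail = freqs[-2000:]
mean_locked = sum(tail) / len(tail)
print(f"mean DCO frequency over the last 2000 cycles: {mean_locked / 1e9:.4f} GHz")
```

With these hypothetical settings the free-running DCO starts 100MHz below the target of `n_div * f_ref`. The integral path removes the frequency offset, while the proportional path damps the loop. Once settled, the DCO frequency keeps toggling around the target, which is the bang-bang dithering discussed later in the text.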

Several important PLL parameters are presented below. Locking time characterizes the settling behavior of a PLL. It is the time the PLL takes to switch from one frequency to another, for a given frequency change, to within a given frequency tolerance. While the PLL is switching frequencies, no data can be transmitted, so an excessively long locking time can reduce the data rate of the system. Different applications normally have different locking time requirements.

Phase noise is another essential parameter in PLL design, and it determines the root-mean-square (rms) jitter. The phase noise spectral density of a PLL system refers to the noise power of the PLL versus the offset frequency. Close to the carrier, within the loop bandwidth of the PLL, this noise is commonly dominated by the PD, and farther out, it is typically dominated by the VCO. The PLL loop bandwidth is often chosen to minimize the rms phase error.

Moreover, deterministic jitter (DJ) is another jitter source in PLL design, which mainly comes from the reference spur, loop latency, quantization error, etc. Reference spurs are spurious emissions that occur at an offset from the carrier frequency equal to the channel spacing. They are usually caused by leakage and mismatch in the charge pump of a CPPLL. Loop latency delays the PLL loop response and causes dithering jitter at the output. Quantization error usually exists in digitally-assisted PLLs and arises from the finite resolution of digital circuits.

#### 1.2 Motivation and Contribution

With continued technology scaling, PLL design becomes more and more challenging. In deep-submicron CMOS processes, transistor gate leakage current increases the current mismatch in the CP and prevents the use of MOS capacitors in the LF, so metal capacitors must be used, which results in a larger chip area. The I-V characteristics of MOS transistors become severely layout dependent, which makes device matching more difficult. Moreover, the increasingly low supply voltage constrains analog circuit performance because the threshold voltage does not scale by the same amount as the supply voltage. In contrast, using digital circuits instead of analog circuits could eliminate the problems mentioned above. Therefore, digital PLLs [27, 16, 25] have recently emerged as an alternative to analog PLLs. They eliminate the need for a charge pump and analog filter and demonstrate that a digital implementation can achieve the performance of analog PLLs.

In spite of these advantages, the DPLL presents several circuit design bottlenecks that have limited its usage in high performance applications. The key performance limiting factors of the DPLL are discussed next.

First, conventional TDCs quantize the phase error in steps of an inverter delay, so the resolution of the TDC is limited to the minimum achievable inverter delay in a given process. For example, even in a 90nm CMOS process, the TDC step size is limited to about 20ps. This rather poor resolution manifests itself as phase-quantization error which, if left unfiltered, appears as jitter at the DPLL output. Further, the TDC also leads to a bang-bang behavior of the loop, making it susceptible to dithering jitter due to excess loop delay.
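To give a feel for the size of this error, the sketch below estimates the rms quantization error of a TDC with a 20ps step. It relies on the textbook approximation that the error of an ideal rounding quantizer is uniformly distributed over one step, which gives an rms value of step/sqrt(12). That approximation, the rounding quantizer, and the function names are assumptions of this example rather than results from this thesis.

```python
import math


def tdc_quantization_rms(step_s: float) -> float:
    """RMS quantization error of a TDC with step `step_s`, assuming the error
    is uniformly distributed over one step (textbook approximation)."""
    return step_s / math.sqrt(12.0)


def quantize_time(t_s: float, step_s: float) -> float:
    """Round a time difference to the nearest TDC step (ideal quantizer)."""
    return step_s * math.floor(t_s / step_s + 0.5)


step = 20e-12  # approximate 90nm TDC step size quoted in the text
analytic_rms = tdc_quantization_rms(step)

# Numerical cross-check: sweep the input finely across ten TDC steps.
points = 10000
errors = [t - quantize_time(t, step)
          for t in (i * 10 * step / points for i in range(points))]
swept_rms = math.sqrt(sum(err * err for err in errors) / points)

print(f"analytic rms error: {analytic_rms * 1e12:.2f} ps")
print(f"swept rms error:    {swept_rms * 1e12:.2f} ps")
```

Under these assumptions a 20ps step contributes roughly 5.8ps of rms error before any loop filtering, which is already larger than a jitter target in the low picosecond range.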

The second challenge is the design of a high resolution DCO. One way to implement a DCO is to place a DAC in front of the VCO. As a result, the finite resolution of the DAC manifests itself as frequency quantization error. Furthermore, the DCO suffers from an inconvenient tradeoff between frequency resolution and tuning range. For instance, with a given DAC resolution of L+1 bits and a required frequency resolution of $\Delta F$, the DCO tuning range is limited to $\pm 2^{L} \times \Delta F$.
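The tradeoff can be made concrete with a few lines of arithmetic. The sketch below evaluates the $\pm 2^{L} \times \Delta F$ limit stated above and inverts it to find the DAC width needed for a target range. The function names and the example numbers (a 10-bit DAC, 1MHz and 100kHz steps, a $\pm 1.4$GHz range) are hypothetical and chosen only for illustration.

```python
import math


def dco_tuning_range(dac_bits: int, delta_f_hz: float) -> float:
    """One-sided tuning range 2**L * delta_f for a DAC with L + 1 bits."""
    if dac_bits < 1:
        raise ValueError("dac_bits must be at least 1")
    return 2 ** (dac_bits - 1) * delta_f_hz


def dac_bits_for_range(one_sided_range_hz: float, delta_f_hz: float) -> int:
    """Smallest DAC width (L + 1 bits) whose +/- 2**L * delta_f covers the range."""
    steps = one_sided_range_hz / delta_f_hz
    l_bits = max(0, math.ceil(math.log2(steps)))
    return l_bits + 1


# Hypothetical example: a 10-bit DAC with a 1 MHz frequency step.
print(dco_tuning_range(10, 1e6) / 1e6, "MHz on each side")

# Hypothetical example: bits needed to cover +/- 1.4 GHz.
print(dac_bits_for_range(1.4e9, 1e6), "bits at 1 MHz resolution")
print(dac_bits_for_range(1.4e9, 100e3), "bits at 100 kHz resolution")
```

Each tenfold improvement in frequency resolution costs three to four additional DAC bits for the same range, which is why resolution and tuning range are hard to obtain together from a single DAC.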

Finally, the DPLL also suffers from an inherent noise bandwidth tradeoff: suppression of the large phase noise of ring-based DCO requires a wide loop bandwidth, while it can be easily shown that a low loop bandwidth is needed to mitigate the quantization error of the TDC.

The conventional DPLLs cited above exhibit these bottlenecks. For example, in [27] an LC-based DCO with excellent phase noise is combined with a very low PLL bandwidth to suppress the TDC quantization error. Very high DCO resolution is achieved in this design by simply limiting the DCO tuning range. On the other hand, in [16] and [25] a reasonably wide tuning range is achieved at the expense of a large output clock jitter caused by the large frequency step size ($\Delta F$).

This thesis aims to relieve the design bottlenecks of conventional DPLLs, explore design techniques for high-performance DPLLs, and extend the discussion to DPLL-based CDR designs.

#### 1.3 Thesis Organization

This thesis is organized as follows:

Chapter 2 presents an analysis of design methods for high-performance digital PLLs and CDRs. The analysis discusses the challenges in conventional DPLLs and extends to PLL-based digital CDR design.

Chapter 3 discusses the design techniques of a novel DPLL that aims to overcome the drawbacks of conventional DPLLs. This architecture provides a wide tuning range and achieves low power and low jitter. To optimize the jitter performance over a wide operating range, a bandwidth-tracking method and a quantization-tracking method are introduced to minimize rms jitter and DJ.

Chapter 4 presents a PLL-based digital CDR circuit, in which analog blocks such as the CP and LF are replaced with digital counterparts. Linear loop dynamics are obtained in this digital CDR design. As a result, a constant bandwidth is achieved regardless of input jitter, which yields well-defined jitter generation (JGEN) and jitter transfer (JTRAN) characteristics.

Finally, conclusions and suggestions for future research directions are highlighted in Chapter 5.

#### Chapter 2 – Jitter Performance analysis for digital PLLs and CDRs

Compared to the analog PLL, the digital PLL is intrinsically better suited to process portability, PVT insensitivity, and low-supply design. Recent DPLL designs have demonstrated that digital implementations can match or even exceed the performance of analog PLLs. However, the DPLL still has design bottlenecks that limit its use in high-performance applications, such as the intrinsic quantization error and bang-bang behavior of the digital loop, the tradeoff between loop bandwidth and rms jitter, and the tradeoff between power and performance. Jitter is an undesired perturbation or uncertainty in the timing of events. Since timing jitter performance is essential for PLLs in both clock generation and clock recovery systems, it is important to analyze the effects of clock jitter on the DPLL. In this chapter we first discuss the jitter sources of the DPLL with a linearized phase-domain model and then present the detailed jitter effects on the implementation of high-performance digital PLLs and PLL-based digital CDRs.

#### 2.1 Definitions of Jitter

Jitter is the unwelcome companion of all electrical systems that use voltage transitions to represent timing information. Timing jitter $\sigma_{\Delta T}$ is an important metric for PLL output jitter. It is defined as the standard deviation of the time difference between the first cycle and the $m$-th cycle of the clock, as shown in Fig. 2.1. In the phase domain, timing jitter can also be expressed as phase jitter $\sigma_{\Delta\phi}$, and the two are related by

$$\sigma_{\Delta T} = \frac{T}{2\pi} \cdot \sigma_{\Delta \phi} = \frac{1}{\omega_0} \cdot \sigma_{\Delta \phi} \tag{2.1}$$

where clock period is  $T = \frac{2\pi}{\omega_0}$ . Timing jitter is called short-term jitter for small  $\Delta T$  and long-term jitter as  $\Delta T$  goes to infinity.



Figure 2.1: Timing jitter.

Tracking jitter $\sigma_{tr}$ is another commonly used metric for a PLL output clock. As indicated in Fig. 2.2, it is measured as the phase difference between an ideal clean reference $F_{REF,IDEAL}$ and the PLL output clock $F_{OUT}$. The tracking jitter is related to timing jitter by $\sigma_{tr} = \frac{\sigma_{\Delta T \to \infty}}{\sqrt{2}}$ at very large $\Delta T$ [19]. In this thesis, we will focus on the analysis of long-term jitter.
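As a worked example of these two relations, the sketch below converts an rms phase jitter into timing jitter using Eq. (2.1) and then into tracking jitter using the $1/\sqrt{2}$ relation. The function names and the input values (a 2.5GHz clock with 0.025 rad of long-term rms phase jitter) are hypothetical and are not measurements from this work.

```python
import math


def timing_jitter_from_phase_jitter(sigma_phi_rad: float, f0_hz: float) -> float:
    """Eq. (2.1): sigma_dT = sigma_dphi / omega_0, with omega_0 = 2 * pi * f0."""
    omega_0 = 2.0 * math.pi * f0_hz
    return sigma_phi_rad / omega_0


def tracking_jitter_from_long_term(sigma_long_term_s: float) -> float:
    """sigma_tr = sigma_dT / sqrt(2), valid for very large dT."""
    return sigma_long_term_s / math.sqrt(2.0)


f0 = 2.5e9         # hypothetical clock frequency in Hz
sigma_phi = 0.025  # hypothetical long-term rms phase jitter in radians

sigma_t = timing_jitter_from_phase_jitter(sigma_phi, f0)
sigma_tr = tracking_jitter_from_long_term(sigma_t)

print(f"long-term timing jitter: {sigma_t * 1e12:.2f} ps rms")
print(f"tracking jitter:         {sigma_tr * 1e12:.2f} ps rms")
```

For these inputs the long-term timing jitter is about 1.59ps rms and the tracking jitter about 1.13ps rms.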

Figure 2.2: Tracking jitter.

#### 2.2 DPLL Clock Jitter

The jitter sources of a PLL can be divided into two types: deterministic jitter (DJ) and random jitter (RJ). Deterministic jitter is a type of jitter with a known non-Gaussian probability distribution. The peak-to-peak value of this jitter is bounded, and the bounds can be observed, repeated and predicted.

In digital PLL design, deterministic jitter is mainly introduced by quantization error and loop delay. The quantization error dominates DJ performance. There are two quantization error sources in a DPLL: TDC quantization error and DCO quantization error. An error contribution from either source can greatly impact the output jitter and needs to be studied in detail.

## 2.2.1 Deterministic Jitter

#### 2.2.1.1 Quantization Error from TDC

The TDC quantizes the time difference between the reference clock  $F_{REF}$  and the feedback clock  $F_V$  and converts it to a digital format. The delay line based flash TDC [28] is a basic TDC architecture, as shown in Fig. 2.3. It is composed of a string of non-inverting delay elements (such as buffers), a number of registers (such as D flip-flops) and a thermometer-to-binary code converter.



Figure 2.3: Delay line based flash TDC.

The timing of the TDC is described in Fig. 2.4. The reference signal of the TDC is the feedback clock  $F_V$  in the DPLL.  $F_V$  propagates through the delay line and thus produces  $2^N - 1$  delayed reference signals that are distributed uniformly over one clock cycle  $T_{REF}$  with a fixed delay time  $T_D$ . The relationship between the DPLL clock period and the TDC resolution is  $T_D = T_{REF}/2^N$ . The input clock  $F_{REF}$  samples the reference signals along the delay line and generates a thermometer code output, which is interpreted by the thermometer-to-binary converter. The N-bit binary output  $D_{OUT}$  represents the time difference between the two signals with a resolution of  $T_D$ .
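The quantization performed by the flash TDC can be sketched behaviorally as follows. This is a simplified model, not the circuit of Fig. 2.3: it assumes ideal, perfectly matched delay taps at multiples of $T_D = T_{REF}/2^N$, and the function name `flash_tdc` is hypothetical.

```python
def flash_tdc(delta_t, t_ref, n_bits):
    """Behavioral flash TDC for a time difference 0 <= delta_t < t_ref.

    Returns (thermometer_code, binary_code). Assumes ideal taps spaced
    by T_D = t_ref / 2**n_bits (an idealization of the delay line).
    """
    levels = 2 ** n_bits
    t_d = t_ref / levels
    # Tap k (k = 1 .. 2^N - 1) sits at k * T_D; a tap reads 1 once the
    # measured edge lies at or beyond that tap.
    thermometer = [1 if delta_t >= k * t_d else 0 for k in range(1, levels)]
    # Thermometer-to-binary conversion: count the ones.
    binary = sum(thermometer)
    return thermometer, binary


if __name__ == "__main__":
    code, d_out = flash_tdc(0.30, 1.0, 3)
    print(code, d_out)
```

The residual `delta_t - d_out * T_D` always lies between zero and $T_D$, which is the quantization error discussed in this section.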

The TDC resolution determines the phase quantization error at the DCO output, which manifests as DJ. It is important to note that the TDC resolution is intrinsically limited by the inverter delay and thus introduces large DJ. Fig. 2.5 shows the transfer characteristic of a TDC. The x-axis indicates the input phase error  $\Delta \Phi$  and the y-axis is the TDC digital output word D<sub>OUT</sub>. As can be seen, the TDC quantization error is proportional to its resolution, while the resolution is limited by inverter



Figure 2.4: Timing of flash TDC.

delay  $T_{INV}$ .



Figure 2.5: TDC transfer characteristic.

Alternative TDC architectures are reported in recent publications [3, 14, 29] to achieve high resolution and small quantization error. [3] proposed a Vernier delay line based TDC to achieve a resolution better than a single delay element. The architecture utilizes two delay chains driven by  $F_{REF}$  and  $F_V$ , respectively, so the TDC resolution does not depend on the delays of the unit delay elements but rather on the delay difference between the two chains. However,

a calibration method has to be used to improve the TDC linearity because multiple delay lines must be matched. [14, 29] use alternative architectures to achieve a high resolution TDC that is not constrained by the inverter delay; however, they greatly increase the circuit complexity and power consumption. Also, those designs are not all-digital and are therefore sensitive to PVT variations. In conclusion, there exists a tradeoff between TDC resolution and power, area and other issues, so the quantization error of the TDC remains an important source of DPLL output jitter.

#### 2.2.1.2 Quantization Error from DCO

The DCO quantization error is another source of DJ, and reducing it requires a high resolution DCO. Since a DCO is normally built around a DAC, this requirement translates into implementing a high resolution DAC, which is a well-known challenging design task. The finite resolution of the DAC manifests itself as frequency quantization error with a PSD equal to  $S_{Q_F}$ , as shown in Eq (2.3).

Furthermore, as a wide tuning range is always necessary in a PLL design to ensure that the PLL can work with PVT variations or wide operating range, there exists a tradeoff between the DCO frequency resolution and tuning range. For instance, with a given DAC resolution of L+1 bits and a required frequency resolution of  $\Delta F$ , the DCO tuning range is limited to  $\pm 2^{L} \times \Delta F$ .
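The resolution versus tuning range relation stated above can be written as a one-line calculation. The helper below is only an illustration of the $\pm 2^{L} \times \Delta F$ expression; the function name and the numerical values in the example are assumptions, not design values from this work.

```python
def dco_tuning_range(dac_bits, delta_f_hz):
    """One-sided tuning range of a DCO driven by an (L+1)-bit DAC.

    dac_bits is L+1, delta_f_hz is the frequency step per LSB, and the
    returned value is 2**L * delta_f_hz (the range is +/- this value).
    """
    l_bits = dac_bits - 1
    return (2 ** l_bits) * delta_f_hz


if __name__ == "__main__":
    # Illustrative numbers only: a 10-bit DAC with 1 MHz and 0.5 MHz steps.
    print(dco_tuning_range(10, 1e6))
    print(dco_tuning_range(10, 0.5e6))
```

Halving the frequency step for a fixed number of DAC bits halves the tuning range, which is the tradeoff described in this section.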

PVT variations can also degrade the DCO resolution and tuning range by more than a factor of two. For example, in the fast process corner the DCO operates at a higher frequency for the same input control word, which reduces the resolution, while in the slow corner the tuning range is reduced. Therefore the DJ from DCO quantization error could be doubled.

#### 2.2.1.3 Loop Delay

Loop delay is mainly caused by the inherent discrete-time nature of digital PLLs. For instance, the PD and the digital circuits in the integral path, such as the accumulator, are clocked and therefore exhibit delays of one or more cycles. In other words, the delay may take the form of latency that is introduced to facilitate implementation at a high sampling rate, or it may form an integral part of the phase detector. In addition, circuit delays in a real implementation, such as inverter delays, also add to the loop latency. The loop delay is mainly determined by the DPLL architecture and practical implementation choices, but it invariably affects any design and cannot be avoided. The loop delay impacts the loop dynamics and restricts the stability region of the DPLL. More importantly, the loop latency introduces excess dither in the loop and leads to deterministic jitter.

In DPLL design, the dither jitter caused by loop latency can be eliminated by discarding the dithering LSBs of the accumulator in the integral path, as shown in Fig. 2.6. Because of the loop delay, the accumulator output dithers by more than 1 bit, which translates to control voltage ripple or excess switching noise in the DCO and thus leads to more jitter. After the dithering bits are dropped from the accumulator output, only 1 bit of dither remains and the quantization error stays below 1 LSB. The DJ due to excess loop latency is eliminated accordingly. It is important to note that dropping LSBs does not decrease the DCO resolution: extra LSBs are added to the accumulator, so a wider accumulator is required while the same DAC resolution is maintained. The accumulator power consumption rises to some extent, but the extra amount is negligible compared to the rest of the DPLL.
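The effect of dropping accumulator LSBs can be sketched with integer arithmetic. This is a minimal illustration under the assumption that truncation is implemented as an arithmetic right shift; the function names are hypothetical.

```python
def drop_lsbs(acc_value, n_drop):
    """Discard the n_drop least significant accumulator bits."""
    return acc_value >> n_drop


def output_codes(acc_samples, n_drop):
    """Distinct codes reaching the DAC for a set of accumulator samples."""
    return sorted({drop_lsbs(a, n_drop) for a in acc_samples})


if __name__ == "__main__":
    dithering = [1003, 1004, 1005, 1006, 1005, 1004]
    print(output_codes(dithering, 0))  # full-width accumulator output
    print(output_codes(dithering, 3))  # after dropping 3 LSBs
```

Any dither that spans fewer than $2^k$ accumulator counts covers at most two adjacent output codes once $k$ LSBs are dropped, so the code seen by the DAC changes by at most 1 LSB.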



Figure 2.6: Accumulator output digit before and after dropping LSBs.

## 2.2.2 Random Jitter

The other major class of jitter is non-deterministic, or random, jitter. It is unpredictable electronic timing noise and typically follows a Gaussian distribution. The noise sources of random jitter (RJ) include extrinsic and intrinsic noise sources. Extrinsic noise sources are deterministic disturbances caused by interference, such as supply noise, substrate noise and coupling from undesired signals. Intrinsic noise sources are the random noise generated inside the circuits, for instance thermal noise and flicker noise. In this analysis, we focus on intrinsic noise, since the goal of this chapter is to analyze the performance limitations and explore design techniques for the intrinsic circuits of the DPLL. As jitter and phase noise are closely related, the equivalent rms jitter can be obtained by integrating the phase noise power over the frequency range of interest.
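The integration of phase noise into rms jitter can be sketched numerically. The sketch assumes the commonly used relation $\sigma_{\phi}^2 = 2\int L(f)\,df$ for a single-sideband phase noise profile $L(f)$ in dBc/Hz, followed by Eq. (2.1); this relation and the function name are assumptions of the example, not results derived in this chapter.

```python
import math


def rms_jitter_from_phase_noise(l_dbc_hz, f_lo, f_hi, f0, points=2000):
    """Integrate an SSB phase-noise profile into rms phase and timing jitter.

    l_dbc_hz: callable returning L(f) in dBc/Hz at offset frequency f.
    f_lo, f_hi: integration limits in Hz; f0: carrier frequency in Hz.
    Returns (sigma_phi in rad, sigma_t in s).
    """
    # Log-spaced offset frequencies between the integration limits.
    freqs = [f_lo * (f_hi / f_lo) ** (i / (points - 1)) for i in range(points)]
    lin = [10.0 ** (l_dbc_hz(f) / 10.0) for f in freqs]
    # Trapezoidal integration of the linear-scale phase noise.
    area = 0.0
    for i in range(points - 1):
        area += 0.5 * (lin[i] + lin[i + 1]) * (freqs[i + 1] - freqs[i])
    sigma_phi = math.sqrt(2.0 * area)  # factor 2 accounts for both sidebands
    sigma_t = sigma_phi / (2.0 * math.pi * f0)
    return sigma_phi, sigma_t


if __name__ == "__main__":
    # Illustrative flat -100 dBc/Hz profile from 1 kHz to 1 MHz at 2.5 GHz.
    print(rms_jitter_from_phase_noise(lambda f: -100.0, 1e3, 1e6, 2.5e9))
```

A measured or simulated phase noise curve can be passed in place of the flat profile used in the example.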

Phase noise is a critical factor in radar and communications applications. It is formed by the random phase modulation mainly coming from the oscillator. The phase modulation is caused by both the noise figures of the oscillator and the MOS transistors used in the circuit. It can be measured in the frequency domain. The phase noise of an open-loop oscillator is revealed in Fig. 2.7. At the frequency



Figure 2.7: Phase noise of open-loop oscillator.

close-in to the carrier (usually 5-500 Hz), the phase noise is dominated by flicker noise, while at frequencies beyond that (typically 5 kHz) the phase noise is a function of thermal noise and is driven by the oscillator transistors.

#### 2.2.3 DPLL Noise Analysis

In order to quantify the DPLL phase noise and identify the noise contributions from each noise source, a frequency domain analysis of a closed-loop PLL is needed. The linear model of a typical DPLL is shown in Fig. 2.8. A Z-domain



Figure 2.8: DPLL Linear Model.

discrete-time transformation is used in DPLL analysis as the counterpart of the S-domain continuous-time transformation in analog PLLs. A relationship between the discrete-time operator  $z = e^{j\omega T}$  and the continuous-time operator  $s = j\omega$ , where  $\omega$  is the angular frequency of interest and T is the sampling period, can be derived using a first-order Taylor series expansion of z as shown below,

$$z = e^{j\omega T} \approx 1 + j\omega T = 1 + sT \Rightarrow s = \frac{1 - z^{-1}}{T \cdot z^{-1}} \tag{2.2}$$
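The accuracy of the first-order expansion in Eq. (2.2) can be checked numerically. The short sketch below compares $e^{j\omega T}$ with $1 + j\omega T$ at a given ratio of signal frequency to sampling frequency; the function names are illustrative only.

```python
import cmath
import math


def z_exact(f, f_s):
    """Exact z = exp(j*omega*T) with omega = 2*pi*f and T = 1/f_s."""
    return cmath.exp(1j * 2.0 * math.pi * f / f_s)


def z_first_order(f, f_s):
    """First-order Taylor approximation z ~ 1 + j*omega*T of Eq. (2.2)."""
    return 1.0 + 1j * 2.0 * math.pi * f / f_s


def approx_error(f, f_s):
    """Magnitude of the difference between exact and approximate z."""
    return abs(z_exact(f, f_s) - z_first_order(f, f_s))


if __name__ == "__main__":
    for ratio in (0.001, 0.01, 0.1):
        print(ratio, approx_error(ratio, 1.0))
```

The error grows roughly as $(\omega T)^2/2$, so it is small at one hundredth of the sampling frequency and no longer negligible at one tenth.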

The equation is valid only under the assumption that  $\omega \ll 1/T$ , which is true in practice since the bandwidth of most PLLs is typically much smaller than the reference frequency. The main jitter sources are also shown in Fig. 2.8. The TDC quantization error  $S_{Q_{TDC}}$ , the DCO quantization error  $S_{Q_F}$  and the loop delay  $z^{-M}$  are the main DJ sources. The loop delay  $z^{-M}$  is modeled as a loop latency of M cycles, where M depends on the practical circuit implementation.  $S_{Q_{TDC}}$  and  $S_{Q_F}$  are caused by the limited resolution of the circuits. Assuming a uniform distribution for the quantization error, it can be easily shown that

$$S_{Q_{TDC}} = \frac{\Delta \Phi^2}{12F_{REF}}, \quad S_{Q_F} = \frac{\Delta F^2}{12F_{REF}} \tag{2.3}$$

where  $\Delta \Phi$  and  $\Delta F$  are the resolution of the TDC and the DCO, respectively.  $F_{REF}$  is the DPLL reference frequency.
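Eq. (2.3) can be evaluated directly. In the sketch below the two PSD functions follow Eq. (2.3); the helper `tdc_phase_step`, which converts a TDC time step into a phase step referred to the reference period, is an assumption added for the example and is not defined in this chapter.

```python
import math


def tdc_quantization_psd(delta_phi, f_ref):
    """S_Q_TDC of Eq. (2.3) for a TDC phase resolution delta_phi."""
    return delta_phi ** 2 / (12.0 * f_ref)


def dco_quantization_psd(delta_f, f_ref):
    """S_Q_F of Eq. (2.3) for a DCO frequency resolution delta_f."""
    return delta_f ** 2 / (12.0 * f_ref)


def tdc_phase_step(t_d, f_ref):
    """Assumed conversion of a TDC time step to radians: 2*pi*T_D/T_REF."""
    return 2.0 * math.pi * t_d * f_ref


if __name__ == "__main__":
    f_ref = 100e6  # illustrative reference frequency, not a design value
    print(tdc_quantization_psd(tdc_phase_step(20e-12, f_ref), f_ref))
    print(dco_quantization_psd(1e6, f_ref))
```

Because both PSDs scale with the square of the resolution, halving the step size lowers the quantization noise power by a factor of four.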

The RJ sources are mainly the results of intrinsic noise sources such as thermal and flicker noise, including TDC noise  $S_{\Phi_{TDC}}$ , DLF noise  $S_{\Phi_{DLF}}$  and DCO noise  $S_{\Phi_{DCO}}$ . It's easy to prove that  $S_{\Phi_{DCO}}$  is the dominant noise source for RJ, so we only consider  $S_{\Phi_{DCO}}$  in this analysis.

While an ideal TDC has a nonlinear transfer characteristic with infinite gain, in this linear model the TDC is linearized in the presence of clock jitter [12]. The open loop gain (LG) of the linear model can therefore be derived as,

$$LG(z^{-1}) = K_{TDC} \cdot \left(K_{P} + \frac{K_{I} \cdot z^{-1}}{1 - z^{-1}}\right) \cdot \frac{K_{DCO}}{1 - z^{-1}} \cdot \frac{z^{-M}}{N} \tag{2.4}$$

Similar to the noise analysis of an analog PLL, the impact of each noise source on the output clock jitter can be evaluated using transfer function analysis. The total output phase noise of the DPLL is expressed as  $S_{\Phi_{OUT}}$ , so the TDC noise transfer function can be derived as,

$$H_{TDC}(z^{-1}) = \sqrt{\frac{S_{\Phi_{OUT}}}{S_{Q_{TDC}}}} = \frac{N}{K_{TDC}} \cdot \frac{LG(z^{-1})}{1 + LG(z^{-1})} \tag{2.5}$$

The TDC noise transfer function has a low-pass characteristic. It indicates that the noise contributed by the TDC is passed at low frequencies and suppressed at high frequencies. The DCO quantization error appears at the DLF output, which suggests that the DLF noise transfer function is,

$$H_{DLF}(z^{-1}) = \sqrt{\frac{S_{\Phi_{OUT}}}{S_{Q_F}}} = \frac{K_{DCO}}{1 - z^{-1}} \cdot \frac{1}{1 + LG(z^{-1})} \tag{2.6}$$

This indicates that the DLF noise transfer function is typically a band-pass function. It is worth noting that  $H_{DLF}(z^{-1})$  is proportional to  $K_{DCO}$ . Therefore the noise contributed by the DCO quantization error can be reduced by lowering the DCO gain. The DCO noise transfer function is expressed as,

$$H_{DCO}(z^{-1}) = \sqrt{\frac{S_{\Phi_{OUT}}}{S_{\Phi_{DCO}}}} = \frac{1}{1 + LG(z^{-1})} \tag{2.7}$$

Here,  $H_{DCO}(z^{-1})$  shows a high-pass transfer characteristic. Intuitively, the DCO noise at low frequency can be corrected by the relatively fast feedback loop, while the loop is not fast enough to correct the error at high frequency, thus high frequency noise is passed to the DPLL output.

As a result, the total output phase noise  $S_{\Phi_{\rm OUT}}$  is given by,

$$S_{\Phi_{OUT}} = H_{TDC}^{2} \left( z^{-1} \right) \cdot S_{Q_{TDC}} + H_{DLF}^{2} \left( z^{-1} \right) \cdot S_{Q_{F}} + H_{DCO}^{2} \left( z^{-1} \right) \cdot S_{\Phi_{DCO}} \tag{2.8}$$

It is important to notice that Eq (2.8) does not account for dither jitter induced by TDC nonlinearity and loop latency  $z^{-M}$ . In the DPLL design, the dither jitter caused by loop latency could be decreased by throwing away useless digital output information. Therefore the dithering in the loop doesn't induce significant output jitter.

Fig. 2.9 shows the noise transfer functions from TDC, DLF and DCO. As shown in the figure,  $H_{TDC}(z^{-1})$  has a low-pass characteristic,  $H_{DLF}(z^{-1})$  has a band-pass characteristic and  $H_{DCO}(z^{-1})$  has a high-pass characteristic.
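The low-pass, band-pass and high-pass shapes can be reproduced by evaluating Eqs. (2.4) to (2.7) on the unit circle. The sketch below does this with normalized, purely illustrative loop parameters ($K_{TDC} = K_{DCO} = N = 1$, $K_P = 0.1$, $K_I = 0.001$, $M = 1$); these values are assumptions chosen for the example and are not the design values of this work.

```python
import cmath
import math


def loop_gain(zinv, k_tdc, k_p, k_i, k_dco, n_div, m_delay):
    """Open loop gain LG(z^-1) of Eq. (2.4)."""
    dlf = k_p + k_i * zinv / (1.0 - zinv)
    dco = k_dco / (1.0 - zinv)
    return k_tdc * dlf * dco * zinv ** m_delay / n_div


def noise_transfer(f_norm, k_tdc, k_p, k_i, k_dco, n_div, m_delay):
    """Return (|H_TDC|, |H_DLF|, |H_DCO|) at f_norm = f / F_REF (0 < f_norm <= 0.5)."""
    zinv = cmath.exp(-2j * math.pi * f_norm)
    lg = loop_gain(zinv, k_tdc, k_p, k_i, k_dco, n_div, m_delay)
    h_tdc = (n_div / k_tdc) * lg / (1.0 + lg)      # Eq. (2.5)
    h_dlf = k_dco / (1.0 - zinv) / (1.0 + lg)      # Eq. (2.6)
    h_dco = 1.0 / (1.0 + lg)                       # Eq. (2.7)
    return abs(h_tdc), abs(h_dlf), abs(h_dco)


# Illustrative normalized parameters (assumptions, not design values).
PARAMS = dict(k_tdc=1.0, k_p=0.1, k_i=0.001, k_dco=1.0, n_div=1.0, m_delay=1)

if __name__ == "__main__":
    for f_norm in (1e-4, 1.6e-2, 0.5):
        print(f_norm, noise_transfer(f_norm, **PARAMS))
```

With these parameters the TDC path magnitude is close to $N/K_{TDC}$ at low frequency and small at the Nyquist frequency, the DCO path behaves in the opposite way, and the DLF path is largest near the loop bandwidth.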



Figure 2.9: Noise transfer functions from TDC, DLF, and DCO to output.

# 2.3 CDR Clock Jitter

Clock and data recovery (CDR) circuits are essential building blocks in high-speed transceivers. They generate a clock synchronized to the incoming serial data and re-time the data to remove the jitter accumulated during transmission. Consequently, it is necessary to analyze the jitter characteristics, and the methods of estimating them, that define the performance of a CDR. There are three main metrics that evaluate CDR performance and the ability of a CDR to deal with input and output jitter: jitter generation (JGEN), jitter transfer (JTRAN) and jitter tolerance (JTOL). The three parameters are discussed in detail below.

#### 2.3.1 Jitter Generation

Jitter generation is the jitter produced by the CDR itself when no jitter exists in the input data. JGEN is similar to the output jitter in a PLL design, and its main sources are the following: (1) deterministic jitter from quantization error; (2) random jitter from VCO phase noise; (3) ripple on the control voltage; (4) coupling of data transition noise to the VCO through the PD; (5) supply and substrate noise.

JGEN is a measure of the intrinsic jitter produced by the PLL. It is usually measured by applying a reference signal with no jitter to the input of the PLL and measuring the jitter at its output. Jitter generation is usually specified as a peak-to-peak jitter value.

## 2.3.2 Jitter Transfer

The jitter transfer, or jitter attenuation, refers to the magnitude of jitter at the output of a CDR for a given amount of jitter at the input. It is the transfer function from input jitter to output jitter as the jitter frequency is varied. The JTRAN exhibits a low-pass characteristic, which is the same as the closed-loop transfer function in PLLs. Since intrinsic jitter is always present, jitter attenuation will appear to be lower for low frequency input jitter signals than for high frequency ones. Specifically, the input jitter is applied at various amplitudes and frequencies. If the input jitter varies slowly, the output follows the input to maintain phase lock. In contrast, if the input jitter varies quickly, the output response is limited by the CDR loop bandwidth and the jitter is filtered.

The jitter transfer function is shown in Fig. 2.10. As discussed above it exhibits a low-pass characteristic. It also displays jitter peaking, which results from the 2nd or higher order nature of the system. The jitter peaking may worsen the jitter transfer performance and cause instability problems or timing failures. Since the CDR tracks jitter that is within its loop bandwidth, the transfer function should have a gain of 0 dB at low frequencies. If the CDR's inherent jitter is lower than that of the input signal, the jitter transfer function should break downward at a rate of 20 dB/decade at approximately the CDR bandwidth. For a Type-2 PLL, the breakpoint will actually be somewhere below the CDR bandwidth because of jitter peaking.



Figure 2.10: Jitter transfer function.

Different applications impose different requirements on the JTRAN specifications. For instance, the optical standards have two specifications for the jitter transfer characteristic. First, the CDR bandwidth should be very small (approximately 120kHz in OC-192), so a low-noise oscillator is needed because the loop provides little suppression of oscillator phase noise. Second, the amount of jitter peaking should be < 0.1dB, which calls for a careful design of the frequency response of the CDR system.
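The low-pass shape and the jitter peaking discussed in this section can be illustrated with a generic continuous-time second-order loop. The transfer function $H(s) = \frac{2\zeta\omega_n s + \omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$ used below is a textbook Type-2 model assumed for the example; it is not the transfer function of the CDR proposed in this work, and the function names are hypothetical.

```python
import math


def jitter_transfer_mag(f, f_n, zeta):
    """|H(j*2*pi*f)| of the assumed second-order Type-2 loop."""
    s = 1j * 2.0 * math.pi * f
    w_n = 2.0 * math.pi * f_n
    h = (2.0 * zeta * w_n * s + w_n ** 2) / (s ** 2 + 2.0 * zeta * w_n * s + w_n ** 2)
    return abs(h)


def jitter_peaking_db(f_n, zeta, points=4001):
    """Peak of the jitter transfer in dB, found by a log-spaced sweep."""
    peak = 0.0
    for i in range(points):
        # Sweep from 0.01*f_n to 100*f_n.
        f = f_n * 10.0 ** (-2.0 + 4.0 * i / (points - 1))
        peak = max(peak, jitter_transfer_mag(f, f_n, zeta))
    return 20.0 * math.log10(peak)


if __name__ == "__main__":
    for zeta in (1.0, 2.0, 5.0):
        print(zeta, jitter_peaking_db(1e6, zeta))
```

In this model the peaking never disappears entirely because of the loop zero, but it shrinks as the damping factor increases: a damping factor of 1 gives roughly 1.25 dB, while a damping factor of 5 stays below the 0.1 dB limit mentioned above.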

## 2.3.3 Jitter Tolerance

Jitter tolerance is an inherent property of a CDR that quantifies how much input jitter the CDR loop can tolerate without degrading the bit-error rate (BER).

In other words, jitter tolerance is a measure of the ability of a PLL to operate properly, i.e., remain in lock in the presence of jitter of various magnitudes at different frequencies, when jitter is applied to its reference. The specification is typically described using an input jitter mask as a function of jitter frequency.

The specific JTOL requirement depends on the application. For instance, the SONET mask imposed by optical standards demands that the CDR withstand a peak-to-peak jitter of 15UI if the jitter varies at a rate below 100Hz. A minimum of 0.15UI is also demanded for frequencies beyond the loop bandwidth, as shown in Fig. 2.11.



Figure 2.11: Jitter tolerance for a type-II PLL.

It is instructive to quantify the jitter tolerance of a typical CDR loop and compare it with the mask. The JTOL of a CDR is measured by applying a random sequence that is phase-modulated at different rates for different parts of the mask. At a given frequency, the amplitude of the input excess phase  $\Phi_{IN}$  is increased until the BER begins to rise. This happens when the phase error  $\Phi_{\rm IN} - \Phi_{\rm OUT}$  approaches 0.5UI, that is, when the clock sampling edge comes close to the zero-crossing points of the data. With this approach, the JTOL measurement result can be compared with the mask. Fig. 2.11 also shows the JTOL curve for a typical type-II PLL, which indicates that the designed CDR satisfies the minimum requirement of the SONET mask.
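The 0.5UI criterion above leads to a simple estimate of jitter tolerance: the phase error is $\Phi_{IN} \cdot (1 - H)$, so the tolerable input amplitude is approximately $0.5\,\mathrm{UI} / |1 - H(j\omega)|$. The sketch below evaluates this for a generic second-order Type-2 loop with $1 - H(s) = \frac{s^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$; the loop model, the function name and the parameter values are assumptions for illustration and do not describe the measured CDR.

```python
import math


def jitter_tolerance_ui(f, f_n, zeta, max_error_ui=0.5):
    """Estimated tolerable input jitter amplitude in UI at frequency f > 0.

    Uses the assumed error transfer 1 - H(s) of a second-order Type-2 loop
    with natural frequency f_n and damping factor zeta.
    """
    s = 1j * 2.0 * math.pi * f
    w_n = 2.0 * math.pi * f_n
    err_tf = s ** 2 / (s ** 2 + 2.0 * zeta * w_n * s + w_n ** 2)
    return max_error_ui / abs(err_tf)


if __name__ == "__main__":
    f_n = 1e6  # illustrative natural frequency
    for f in (1e4, 1e5, 1e6, 1e7, 1e8):
        print(f, jitter_tolerance_ui(f, f_n, 1.0))
```

Well above the loop bandwidth the estimate flattens at 0.5UI, while below the bandwidth it rises at 40 dB/decade, which is the general shape of the type-II curve in Fig. 2.11.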

#### 2.4 Summary

This chapter presented an analysis of the specifications and limitations of high performance digital PLLs and CDRs, with a focus on jitter specifications. The linear model of the DPLL indicates that jitter generation in DPLLs is mainly determined by the deterministic jitter from TDC and DCO quantization error as well as the random jitter from DCO phase noise. The analysis was further extended to the PLL-based digital CDR, whose jitter characteristics are described by three specifications. The discussion also indicates that digitally-enhanced clock circuits face several design tradeoffs, which call for alternative design methods. The following chapters address some of these issues in typical digital PLLs and CDRs with two circuit design examples.

#### Chapter 3 – A low jitter, wide operating range digital phase-locked loop

# 3.1 Introduction

As introduced in the last chapter, analog PLL suffers from many disadvantages. To overcome these drawbacks, digitally-enhanced PLLs (DPLLs) [10, 11, 16, 25, 27, 30] have recently emerged as an alternative to analog PLLs.

Digital PLLs offer several advantages. First, a DPLL eliminates the need for a charge pump and an analog filter, which relieves the problems of current leakage and transistor mismatch. Instead, a digital loop filter with an all-digital implementation is used, which avoids poor analog transistor behavior and intrinsic technology limitations. The second advantage of a DPLL is the ability to self-calibrate to mitigate performance degradation due to PVT variations. Compared to analog circuits, digital circuits are less sensitive to process, voltage, and temperature, so a DPLL can obtain better performance thanks to loop dynamics that stay constant across PVT. In addition, a DPLL is easier to port between processes, since only a number of digital blocks need to be scaled, while an analog PLL requires a complete redesign of all analog blocks in a new process. This can greatly shorten the design cycle and improve the efficiency of releasing new products. Last, a digital implementation is immune to the inherent noise of

digital circuits, therefore decreases the noise contributions. The discussion above indicates that DPLL continues to offer benefits from the intrinsic digital circuits characteristics, therefore the study in DPLL design becomes increasingly necessary and in great demand.

The conventional DPLL architecture [6] is shown in Fig. 3.1. In a DPLL, the time-to-digital converter (TDC), digital loop filter (DLF), and the digitally controlled oscillator (DCO) replace the PFD, analog loop filter, and the VCO of the analog PLL, respectively. The TDC detects the phase difference between the reference clock  $F_{REF}$  and the feedback clock  $F_V$ , and generates a digital word that is proportional to the input phase difference. The proportional and integral gains  $K_P$  and  $K_I$  in the digital loop filter are related to the resistor and capacitor in the analog loop filter. A digital-to-analog converter (DAC) interfaces the DLF to the voltage-controlled oscillator (VCO).



Figure 3.1: Conventional DPLL architecture.

In spite of many advantages, the conventional DPLL presents several circuit design bottlenecks that have limited its usage in high performance applications.

For example, [27] has to choose a very low PLL bandwidth to suppress the TDC quantization error. In [16], a reasonably wide tuning range is achieved at the expense of a large output clock jitter due to DCO quantization error. [11, 25] use high resolution TDCs to reduce quantization error, but the power consumption is much higher, mainly due to the TDCs. A more detailed discussion of the disadvantages of the conventional DPLL is presented in the following.

#### 3.1.1 Limitation on TDC Quantization error

The conventional DPLL suffers from TDC quantization error as the main source of deterministic jitter (DJ) at the DPLL output. The transfer characteristic of the delay line based flash TDC is shown in Fig. 3.2. The x-axis indicates the input phase error  $\Delta \Phi$  and the y-axis is the TDC digital output word D<sub>OUT</sub>. The TDC quantizes the phase error in steps of an inverter delay. As can be seen, the TDC quantization error is proportional to its resolution, while the resolution is limited to the minimum achievable inverter delay T<sub>INV</sub> in a given process. For example, even in a 90nm CMOS process, the TDC step size is only about 20ps. This rather poor resolution manifests itself as phase-quantization error which, if left unfiltered, appears as deterministic jitter at the DPLL output.

Furthermore, the TDC also leads to bang-bang behavior of the loop, which introduces a one-cycle delay at each sampling clock. The sampling delay accumulates and appears as loop latency in the DPLL, thus making the loop susceptible to dithering jitter due to excess loop delay.



Figure 3.2: TDC transfer characteristic.

## 3.1.2 Limitation on DCO Phase Noise

The conventional DPLL also suffers from an inherent noise bandwidth tradeoff that deteriorates the random jitter at DPLL output. As shown in Fig. 3.3, the DCO phase noise has a high pass transfer characteristic, therefore a high loop bandwidth is required to suppress the large phase noise of ring-based DCO. On the other hand, the TDC quantization noise exhibits a low pass transfer characteristic, therefore a low loop bandwidth is needed to mitigate the quantization error of the TDC. This raises a contradiction in conventional DPLL design. As DCO phase noise is the dominant noise source for random jitter, it's necessary to set the loop bandwidth relatively high to obtain a good random jitter performance. However, the TDC quantization noise determines the deterministic jitter, therefore a low bandwidth is also necessary.

Normally, a relatively low bandwidth is chosen to mitigate the TDC quantization error for the sake of deterministic jitter in a DPLL design. However this



Figure 3.3: Noise bandwidth tradeoff.

poses a challenge for the DPLL phase noise performance.

# 3.1.3 Limitation on DCO Tuning Range

The tradeoff between DCO resolution and tuning range is another bottleneck in DPLL circuits. As shown in Fig. 3.1, the control voltage of the oscillator is generated by the digital-to-analog converter (DAC). As a result, the finite resolution of the DAC manifests itself as DCO frequency quantization error, which is another important source of deterministic jitter. A smaller LSB is necessary to reduce output jitter, while a larger LSB gives a wider tuning range. Fig. 3.4 illustrates this tradeoff with two examples of different DAC resolution. For a given DAC resolution of L+1 bits, the LSB of  $\Delta F/2$  in Fig. 3.4(a) yields only half the DCO quantization error of the LSB of  $\Delta F$  in Fig. 3.4(b). However, the tuning range in Fig. 3.4(a) is only half of that in Fig. 3.4(b).

An alternative approach to mitigate this tradeoff is to adopt a DAC with a larger number of bits; however, this greatly increases the circuit complexity and power consumption and degrades the figure of merit of the DPLL design.

Figure 3.4: DCO resolution-tuning range tradeoff. (a) Small LSB. (b) Large LSB.

#### 3.2 Proposed DPLL Architecture

# 3.2.1 Simplified Proposed Architecture

In order to overcome the limitations of the conventional DPLL design described in the last section, we propose a novel digital PLL architecture that eliminates the phase quantization error in the proportional path and allows a high loop bandwidth to suppress DCO phase noise, as shown in Fig. 3.5.

To synthesize the proposed DPLL, it is instructive to first evaluate the purpose of the proportional path in a PLL. The proportional path stabilizes a type-2 PLL by introducing a zero in the loop transfer function that otherwise contains two poles at DC. In an analog PLL, the proportional path pulse-width modulates



Figure 3.5: Block diagram of the simplified proposed DPLL.

the oscillator with a magnitude equal to the proportional path gain and a duty cycle equal to the input phase error. In a conventional DPLL, on the other hand, the duration of the pulse is always at least one reference clock period. Furthermore, due to the bang-bang behavior of the TDC at zero phase error, the steady state of the DPLL is a bounded limit cycle wherein the oscillator is dithered between two discrete frequencies at nearly the reference frequency. As a consequence, jitter accumulates at a rate that depends on the DCO frequency resolution, the reference clock period and the desired DPLL tracking bandwidth, resulting in poor jitter performance. In view of this DPLL behavior, it is beneficial to eliminate the TDC quantization error and the associated limit cycle behavior.

Following this line of thought, the proposed DPLL pulse-width modulates the VCO through a simple 3-level proportional DAC (PDAC), much like in an analog PLL. The 3 levels of the PDAC correspond to the 3 states of a conventional phase frequency detector (PFD). Because there is no quantization error, the proportional

path exhibits a linear response [32, 34]. The integral path is implemented by first quantizing the PFD output with a bang-bang phase detector (!!PD) and then accumulating the resulting phase error. The !!PD detects the sign of the phase difference between the PFD outputs UP and DN, which is equivalent to detecting the sign of the phase difference between the reference clock  $F_{REF}$  and the feedback clock  $F_V$ . The dithering jitter in the integral path due to excess loop delay is minimized by reducing the integral path gain  $K_I$ .
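The interaction between a linear proportional path and a bang-bang integral path can be illustrated with a small discrete-time behavioral model. This is a deliberately simplified sketch: it updates once per reference cycle, replaces the pulse-width modulation of the PDAC by its per-cycle average, and uses normalized gains chosen only for illustration. The function name and all numerical values are assumptions, not the implemented circuit.

```python
def simulate_hybrid_loop(freq_offset, k_p, k_i, cycles):
    """Toy model of a loop with linear proportional and bang-bang integral paths.

    freq_offset: phase error accumulated per cycle with zero control.
    k_p: proportional gain acting on the phase error itself (linear path).
    k_i: integral gain acting on the accumulated sign of the phase error.
    Returns (phase error history, final accumulator value).
    """
    phase_err = 0.0
    acc = 0
    history = []
    for _ in range(cycles):
        # Integral path: the bang-bang detector keeps only the sign.
        acc += 1 if phase_err >= 0.0 else -1
        # Proportional path: linear in the phase error, no quantization.
        control = k_p * phase_err + k_i * acc
        # The control word corrects the frequency offset.
        phase_err += freq_offset - control
        history.append(phase_err)
    return history, acc


if __name__ == "__main__":
    hist, acc = simulate_hybrid_loop(0.05, 0.2, 0.001, 2000)
    print(max(abs(e) for e in hist[-500:]), 0.001 * acc)
```

In this model the accumulator settles where the integral path cancels the frequency offset, and the residual phase dither scales with the integral gain, which is consistent with keeping $K_I$ small.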

The characteristics of the linear proportional path are described in more detail in Fig. 3.6. The conventional !!PD in Fig. 3.6(a) exhibits a large quantization error in the proportional path. In contrast, the proposed linear proportional path employs a PFD and obtains a linear transfer characteristic. In other words, the proportional path has infinite resolution and completely eliminates the quantization error, as shown in Fig. 3.6(b). The !!PD in the integral path still contains the same amount of quantization error, but most of it is filtered by the very low integral path bandwidth, so the quantization error contributed by the integral path is negligible. It is also important to note that the proposed DPLL has nearly unlimited pull-in range (only limited by the VCO tuning range) as indicated in Fig. 3.6(b), in contrast to [32] where the pull-in range is severely limited by the !!PD.

Since the proposed architecture removes the effect of TDC quantization error on output jitter by utilizing a linear proportional path, it is also easy to show that it decouples the noise bandwidth tradeoff between TDC quantization error and DCO phase noise. In the conventional DPLL design,



Figure 3.6: The transfer characteristic for: (a) conventional !!PD. (b) proposed PFD+!!PD.

a high loop bandwidth passes more TDC quantization error and therefore increases deterministic jitter, while a low loop bandwidth provides less suppression of DCO phase noise and thus degrades random jitter. In contrast, the proposed DPLL uses a PFD to eliminate TDC quantization error, so a high loop bandwidth can be chosen with no effect on the quantization error in the loop. In this design, a high loop bandwidth is realized simply by using a large proportional path gain  $K_P$ .

# 3.2.2 Proposed Double Integral DPLL Architecture

In order to address the limited DCO tuning range of a conventional DPLL design, we employ a double integral control path in addition to the traditional proportional and integral paths to decouple the tradeoff between DCO resolution and tuning range. The complete digital PLL architecture [35] is presented in Fig. 3.7. In addition to the advantages of the simplified architecture, the double integral architecture has the following features: (1) double integral control to widen the DCO tuning range, (2) bandwidth tracking to minimize random jitter over the entire operating range, (3) quantization error tracking to minimize deterministic jitter over a wide operating range, and (4) a novel delta-sigma DAC that maintains very high frequency resolution over a wide DCO tuning range.



Figure 3.7: Block diagram of the proposed double integral DPLL.

The double integral path operates at a slower speed than the integral path by decimating its clock by a factor of 32. The decimation allows the digital accumulator to operate at a lower frequency. A comparator in the double integral path compares the integral path accumulator output with a reference value and produces the input to the double integral path. The double integral path gain  $K_C$  is set very high such that  $K_C$  dominates the DCO output frequency, while the integral path gain  $K_I$  is chosen to be less than 10% of  $K_C$  (approximately 25% of

the reference frequency). As a consequence, the double integral path can drive the oscillator to frequency lock by accumulating the frequency error based on the sign of the integral path accumulator output. In other words, the frequency contribution in integral path is completely compensated by double integral path and in the steady state the integral path accumulator output is driven to the vicinity of zero. In this way, the DCO tuning range is not correlated with integral path, but only controlled by double integral path.

As the proposed architecture satisfies  $K_C \gg K_I$ , the frequency quantization error of the double integral path is much larger than that of the integral path. However, the frequency error of the double integral path is compensated by the integral path, so only the integral path quantization error affects the DPLL output jitter. As a result, the tradeoff between DCO resolution and tuning range is greatly relieved by allocating the DCO tuning range to the double integral path and the DCO quantization error to the integral path.

To prevent limit cycles in the double integral path, a small dead zone, denoted by $\pm K$ in Fig. 3.7, is introduced in the comparator. The quantization error in the double integral path should be smaller than 2K times the integral path quantization error, so that the integral path stays within the proper range to generate the correct comparator output for adjusting the double integral path. In addition, the power dissipation in the double integral path is minimized by operating it at a very low rate of $F_{REF}/128$.
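The dead-zone comparator and the decimated accumulator described above can be illustrated with a small behavioral sketch. This is an open-loop illustration only, not the fabricated logic: the function names, the zero reference value, and the example numbers are assumptions, and the decimation factor is left as a parameter.

```python
def deadzone_sign(value, k):
    """Three-level comparator: +1 above +k, -1 below -k, 0 inside the dead zone.

    The reference value is assumed to be zero in this sketch.
    """
    if value > k:
        return 1
    if value < -k:
        return -1
    return 0


def run_double_integral(integral_acc_samples, k, decimation=32):
    """Accumulate the comparator output once every `decimation` reference cycles.

    `integral_acc_samples` is a hypothetical sequence of integral-path
    accumulator outputs, one per reference cycle.
    """
    acc_c = 0
    history = []
    for n, acc_i in enumerate(integral_acc_samples):
        if n % decimation == 0:
            acc_c += deadzone_sign(acc_i, k)
        history.append(acc_c)
    return history


# Integral path far above zero, then inside the dead zone, then far below zero.
samples = [10] * 64 + [1] * 64 + [-10] * 64
trace = run_double_integral(samples, k=4, decimation=32)
print(trace[63], trace[127], trace[-1])
```

While the integral-path output sits inside $\pm K$, the double integral accumulator holds its value, which is the mechanism that suppresses limit cycles.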

# 3.2.3 Bandwidth Tracking

To further optimize the DPLL jitter performance over a wide operating frequency range, the proposed architecture employs bandwidth tracking and quantization error tracking to optimize random jitter and deterministic jitter, respectively. Recent publications [8, 17] mainly discuss bandwidth tracking in analog PLLs; bandwidth tracking and, more importantly, quantization error tracking in digital PLLs are rarely mentioned.

Without the proposed bandwidth tracking method, the DPLL bandwidth is limited to roughly one-tenth of the lowest update rate. For a PLL design with a wide operating frequency range, the bandwidth is therefore set by the lowest operating frequency, which is sub-optimal and results in excessive random jitter at the highest update rate. In contrast, keeping the ratio of bandwidth to reference frequency constant over the entire operating range is beneficial for optimizing random jitter, especially at higher frequencies. The simulated random jitter with and without bandwidth tracking is illustrated in Fig. 3.8. The results are consistent with the analysis: bandwidth tracking achieves optimal random jitter over the entire frequency range and is particularly beneficial at high frequencies.

In the proposed architecture, the proportional path gain $K_P$ is set very large to achieve a high loop bandwidth. The integral path gain $K_I$ is very small for low DCO quantization error. Due to its extremely low speed, the double integral path exhibits a very low-frequency pole that has negligible effect on the loop dynamics. Accordingly, only $K_P$ and $K_I$ determine the DPLL loop dynamics.



Figure 3.8: Simulated random jitter vs. frequency with and without bandwidth tracking.

It is important to note that a 2<sup>nd</sup> order DPLL with $K_P \gg K_I$ is heavily over-damped. To show this, a frequency-domain analysis is presented below. As mentioned above, the very low-frequency pole of the double integral path has negligible impact on the loop dynamics, so the double integral path is ignored here for simplicity. The loop gain of the DPLL is expressed as,

$$LG(z^{-1}) = K_{TDC} \cdot \left(K_{P} + \frac{K_{I} \cdot z^{-1}}{1 - z^{-1}}\right) \cdot \frac{K_{DCO} \cdot z^{-1}}{1 - z^{-1}} \cdot \frac{1}{N} \tag{3.1}$$

The closed-loop response of the 2<sup>nd</sup> order system exhibits the following loop dynamics,

$$\omega_{\rm BW} = \frac{K_{\rm P} \cdot K_{\rm DCO}}{2\pi N}, \quad \zeta \gg 1 \tag{3.2}$$

This indicates that the system is heavily over-damped. Fig. 3.9 shows the step response of the proposed DPLL and of an equivalent analog PLL with identical loop parameters. A step change in the input clock phase is applied, as indicated in the figure. The phase outputs of the DPLL and the analog PLL are almost identical, which further confirms that the proposed DPLL is an over-damped system; the same behavior can be obtained in an analog PLL by choosing a very large capacitor in the loop filter.



Figure 3.9: Step response behavior of the proposed DPLL and an equivalent analog PLL.

Since the DPLL bandwidth $\omega_{BW}$ is proportional to the proportional path gain K<sub>P</sub>, bandwidth tracking can be achieved simply by making K<sub>P</sub> track the reference frequency F<sub>REF</sub>,

$$K_P \propto F_{REF} \tag{3.3}$$

Since the DPLL is an over-damped system, the impact of $K_P$ variation on loop stability is negligible. That is, the damping factor $\zeta \gg 1$ remains valid over the entire operating frequency range. Fig. 3.10 shows the closed-loop frequency response of the proposed DPLL for different K<sub>P</sub>. Varying K<sub>P</sub> adjusts the loop bandwidth, but the phase margin remains close to 90°, so the loop stays stable and no additional stabilization is needed. This bandwidth tracking approach is therefore simpler than the traditional method used in analog PLLs [17].
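The stability argument can be checked numerically from Eq. (3.1). The sketch below is only an illustration under assumed values: `g_p` and `g_i` are hypothetical normalized gains that lump $K_{TDC} \cdot K_{DCO}/N$ together with $K_P$ and $K_I$, and the double integral path is ignored as in the analysis above.

```python
import cmath
import math


def loop_gain(theta, g_p, g_i):
    """Normalized form of Eq. (3.1) evaluated at z = exp(j*theta).

    g_p and g_i are assumed, dimensionless per-sample gains.
    """
    z_inv = cmath.exp(-1j * theta)
    integ = z_inv / (1 - z_inv)  # z^-1 / (1 - z^-1)
    return (g_p + g_i * integ) * integ


def phase_margin_deg(g_p, g_i):
    """Locate the unity-gain frequency by bisection; return (theta_u, PM in degrees)."""
    lo, hi = 1e-6, math.pi  # |LG| > 1 at lo and |LG| < 1 at hi for the gains used here
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if abs(loop_gain(mid, g_p, g_i)) > 1.0:
            lo = mid
        else:
            hi = mid
    theta_u = 0.5 * (lo + hi)
    pm = 180.0 + math.degrees(cmath.phase(loop_gain(theta_u, g_p, g_i)))
    return theta_u, pm


for g_p in (0.05, 0.1, 0.2):
    theta_u, pm = phase_margin_deg(g_p, g_i=1e-5)
    print(f"g_p={g_p}: unity-gain at {theta_u:.3f} rad/sample, PM={pm:.1f} deg")
```

With the proportional gain much larger than the integral gain, the unity-gain frequency scales with `g_p` while the computed phase margin stays within a few degrees of 90°, which matches the behavior described for Fig. 3.10.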



Figure 3.10: Closed-loop frequency response with different $K_P$.

#### 3.2.4 Quantization Error Tracking

Besides random jitter, deterministic jitter is the other main contributor to DPLL output jitter. Optimizing deterministic jitter over the entire frequency range is equally important for minimizing total output jitter. Quantization error tracking aims to minimize the DCO quantization error, which dominates deterministic jitter in the proposed architecture.

Without the proposed quantization error tracking method, the integral path tuning range needs to be about $\pm 25\%$ of the double integral path tuning range, which is set by the highest operating frequency. For a PLL design with a wide operating frequency range, this unnecessarily large tuning range degrades the deterministic jitter at the lowest operating frequency. In contrast, keeping the ratio of DCO quantization error to reference frequency constant over a wide range improves deterministic jitter, especially at low frequencies.

The simulated deterministic jitter with and without quantization error tracking is illustrated in Fig. 3.11. The results are consistent with the analysis: quantization error tracking achieves optimal deterministic jitter over a wide frequency range and is particularly beneficial at low frequencies.

Figure 3.11: Simulated deterministic jitter vs. frequency with and without quantization error tracking.

The proposed double integral path provides a simple means of scaling the DCO quantization error with the reference frequency. Because the integral path determines the DCO quantization error but has no impact on the DPLL tuning range, quantization error tracking can be implemented simply by making the integral path frequency resolution track the reference frequency. Meanwhile, the DPLL operating range can be extended without degrading the quantization-error-induced deterministic jitter. The deterministic jitter is given by,

$$DJ = N \cdot \left( \frac{1}{N \cdot F_{REF} - \frac{\Delta F}{2}} - \frac{1}{N \cdot F_{REF} + \frac{\Delta F}{2}} \right) \tag{3.4}$$

where $\Delta F$ is the frequency resolution of the integral path and N is the divider ratio. Since $\Delta F \ll F_{\text{REF}}$, Eq. (3.4) can be approximated as,

$$DJ \approx \frac{\Delta F}{N \cdot F_{REF}^2} \tag{3.5}$$

Eq. (3.5) indicates that if $\Delta F$ tracks the reference frequency, the deterministic jitter at low frequency is greatly reduced. Since the DPLL integral path gain $K_I$ is proportional to $\Delta F$, quantization error tracking can be achieved by making $K_I$ track the reference frequency $F_{REF}$,

$$K_I \propto F_{REF} \tag{3.6}$$
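The benefit of letting $\Delta F$ track the reference frequency can be seen by evaluating Eq. (3.4) directly. The following sketch is illustrative only: the frequency resolution `DELTA_F_MAX` is a hypothetical value, and the divider ratio of 4 and the 0.7 to 3.5GHz output range are taken from the measurement setup reported for the prototype.

```python
def deterministic_jitter(f_ref, delta_f, n):
    """Eq. (3.4): period difference between the two DCO levels, summed over N cycles."""
    f_out = n * f_ref
    return n * (1.0 / (f_out - delta_f / 2) - 1.0 / (f_out + delta_f / 2))


N = 4                  # divider ratio
F_REF_MAX = 875e6      # 3.5 GHz / 4
DELTA_F_MAX = 100e3    # hypothetical integral-path resolution at F_REF_MAX, in Hz

for f_ref in (175e6, 350e6, 625e6, 875e6):
    fixed = deterministic_jitter(f_ref, DELTA_F_MAX, N)
    tracked = deterministic_jitter(f_ref, DELTA_F_MAX * f_ref / F_REF_MAX, N)
    print(f"F_REF={f_ref / 1e6:6.1f} MHz  "
          f"fixed={fixed * 1e15:8.2f} fs  tracked={tracked * 1e15:8.2f} fs")
```

Under these assumptions the two cases coincide at the highest reference frequency, while at the lowest frequency the tracked resolution reduces the deterministic jitter by the same factor by which $\Delta F$ was scaled down.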

## 3.3 Circuits Design

In this section, we will present the circuit implementation details of the building blocks in the proposed double integral DPLL (refer to Fig. 3.7).

The PFD employs the well-known three-state state-machine architecture and is implemented using a pass-transistor structure [18]. The PFD achieves fast switching and minimizes the dead zone problem. A sense-amplifier flip-flop [20] is adopted in the BBPD to minimize the setup-time-induced phase error in the integral path. 14-bit and 18-bit accumulators are used in the integral path and double integral path, respectively. The DCO and DACs are described in more detail below.

### 3.3.1 Digitally-Controlled Oscillator

A simplified circuit schematic of the DCO is shown in Fig. 3.12. The output frequency is controlled by varying the supply current in a current-controlled oscillator (CCO). The proportional, integral, and double integral control signals are summed in current domain at the supply node of the CCO, thus obviating the need for a digital adder. That is, the CCO current is contributed from three parts: the PDAC current  $I_{PDAC}$  in the proportional path, the IDAC current  $I_{IDAC}$  in the integral path, and the CDAC current  $I_{CDAC}$  in the double integral path.



Figure 3.12: Digitally controlled oscillator.

The DAC currents control the gain of each path. The proportional path gain $K_P$ needs to be high to obtain a high loop bandwidth; however, in steady state its average gain $\overline{K_P}$ over a full cycle is negligible. As the double integral path determines the DPLL tuning range, its gain $K_C$ is set more than 10 times higher than the integral path gain $K_I$. As a consequence, $I_{CDAC}$ dominates the CCO current and accordingly the oscillation frequency. The nominal CCO current is set through current source transistor M1, which is driven by the CDAC, while the CDAC output is controlled by the double integral path accumulator output $D_C$.

The DPLL bandwidth tracking method requires the PDAC gain $K_P$ to be proportional to the reference frequency. In Fig. 3.12, because the CDAC output current $I_{CDAC}$ is proportional to the oscillator frequency (and therefore to the reference frequency for the present fixed divider case), bandwidth tracking is achieved by biasing the PDAC with the CDAC output current, which gives $I_{PDAC} \propto I_{CDAC}$. Similarly, by biasing the IDAC with the CDAC output current, that is $I_{IDAC} \propto I_{CDAC}$, the integral path gain $K_I$ tracks the reference frequency and DCO quantization error tracking is realized.

The CCO is implemented using three current-starved pseudo-differential stages connected in a ring oscillator topology, as shown in Fig. 3.13. The delay cell consists of two inverters whose outputs are coupled in a feed-forward manner through an NMOS/PMOS pass transistor pair. This coupling ensures differential operation of the oscillator. An AC-coupled inverter biased by a replica inverter is used as the CCO buffer. The buffer output is rail-to-rail with small duty cycle distortion, and its power consumption is very low.



Figure 3.13: Current controlled oscillator.

#### 3.3.2 Digital-to-Analog Converter

The PDAC is a 3-level current DAC that generates an output of 1, -1, or 0. As illustrated in Fig. 3.12, the PDAC is driven by the pulse-width modulated PFD output signals, DN and $\overline{\text{UP}}$, and is implemented with a 3-level current source. The three PFD states, UP, RESET, and DOWN, are mapped to currents of 2I<sub>B</sub>, I<sub>B</sub>, and 0, respectively, where I<sub>B</sub> is the bias current. As mentioned earlier, bandwidth tracking is achieved by biasing the PDAC with the CDAC output current. To achieve accurate mirroring of the CDAC current in the PDAC, a replica of the DN/$\overline{\text{UP}}$ switch is used to degenerate the M1 current source.

The integral control is implemented by steering current away from the oscillator. To ensure that the integral path tuning range is about $\pm 25\%$ of the reference frequency, the IDAC is also biased with the CDAC output current, which achieves quantization error tracking. The CDAC is implemented with the same architecture as the IDAC. The block diagram of the digital-to-analog converter (DAC) employed in the integral and double integral paths (IDAC and CDAC in Fig. 3.7) is shown in Fig. 3.14. The requirement for a 13-bit DAC is relaxed by using a 13-bit second-order delta-sigma modulator (DSM). The 4-bit binary output resulting from truncation of the 13-bit digital input by the DSM drives a 15-level thermometer-coded current-mode DAC. Each LSB of the 4-bit current DAC consumes only $5\mu$A. The transistor M1 is used to minimize glitches due to clock feed-through and thereby reduce deterministic jitter. A low-pass filter (LPF) suppresses the high-frequency DSM-induced quantization error and generates the analog output voltage V<sub>OUT</sub>.



Figure 3.14: DAC for integral and double integral paths.

The block diagram of the delta-sigma modulator is shown in Fig. 3.15. A second-order error-feedback architecture is used to feed the quantization error back to the input through a simple loop filter implemented by two delay elements. In this implementation, the noise transfer function $(1 - z^{-1})^2$, with two zeros at DC, is achieved by coefficients that are multiples of 2, thereby obviating the need for a multiplier. The input to the DSM is a 13-bit word, and the DSM generates a 4-bit output. The clock for the DSM is one fourth of the DPLL reference frequency.
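A behavioral sketch of a second-order error-feedback modulator with the noise transfer function $(1 - z^{-1})^2$ is given below. It is an assumption-laden illustration rather than the implemented circuit: the function name, the truncating quantizer, the clipping of the output code to 0..15, and the test input are all hypothetical, and the exact code range of the 15-level DAC is not modeled.

```python
def dsm2_error_feedback(samples, in_bits=13, out_bits=4):
    """Second-order error-feedback DSM with NTF = (1 - z^-1)^2.

    Each input is an unsigned `in_bits` integer; each output is an
    `out_bits` code obtained by truncating the low-order bits.
    """
    shift = in_bits - out_bits
    step = 1 << shift                 # weight of one output LSB
    max_code = (1 << out_bits) - 1
    e1 = e2 = 0                       # truncation errors delayed by 1 and 2 samples
    codes = []
    for x in samples:
        # Error feedback 2*z^-1 - z^-2 needs only a shift and a subtraction.
        v = x + 2 * e1 - e2
        code = min(max(v >> shift, 0), max_code)
        err = v - code * step         # bits discarded by the truncation
        codes.append(code)
        e2, e1 = e1, err
    return codes


# A constant mid-range input: the output codes toggle, but their average
# reproduces the 13-bit value.
codes = dsm2_error_feedback([3000] * 4096)
average = sum(c * 512 for c in codes) / len(codes)
print(sorted(set(codes)), average)
```

The feedback coefficients are 2 and 1, so the loop filter needs no multiplier, which is the property the text relies on. Inputs close to either end of the 13-bit range would clip in this sketch and are outside what it is meant to show.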

Figure 3.15: Delta-sigma modulator.

The delta-sigma DAC architecture eases the hardware requirements; however, it also degrades the jitter performance of the DPLL in two ways. First, the low-pass post filter (LPF) increases loop latency and causes excessive dithering in the integral and double integral paths. If not addressed, this dithering causes prohibitively large deterministic jitter. In the proposed DPLL, this dithering jitter is mitigated by reducing the gain in the integral path and by using the dead-zone comparator in the double integral path. Second, the DAC resolution degrades at lower operating frequencies and causes large jitter. This dependence of DAC resolution on the operating frequency can be explained as follows. The in-band quantization error at the output of a DSM depends strongly on the over-sampling ratio (OSR) [24]. Defining OSR as the ratio of the clock frequency, $F_{CK}$, to the LPF bandwidth, $F_{LPF}$, it can be easily seen that for a fixed LPF bandwidth OSR scales linearly with $F_{CK}$. Because the DSM clock frequency is a fixed fraction of the DPLL reference clock frequency, OSR decreases at lower operating frequencies, severely increasing the residual quantization error at the output. In view of this, we further propose a tunable switched RC filter whose bandwidth scales linearly with $F_{CK}$, thus maintaining a constant OSR over the entire DPLL operating range.

## 3.3.3 Tunable Switched RC Filter

Figure 3.16: (a) Tunable switched RC low pass filter. (b) Clock circuit for CMOS switch. (c) Clock waveforms for different duty cycle D1 and D2.

The tunable LPF is implemented using a passive switched RC filter, as shown in Fig. 3.16(a). With a CMOS switch connected in series with the resistor $R_{LPF}$, the LPF is active only when the switch is on. In other words, the equivalent average resistance $R_{eq}$ of the LPF depends on the duty cycle D of the switch clock,

$$R_{eq} = \frac{R_{LPF}}{D} \tag{3.7}$$

Following this line of thought, the duty cycle can be adjusted to vary the equivalent resistance and thereby make the LPF bandwidth tunable. The circuit that generates the switch clock $F_{SW}$ is shown in Fig. 3.16(b). $F_{CK}$ is the DSM clock. $F_{SW}$ is generated by gating the clock $F_{CK}$ with its delayed version $\overline{F}_{CKD}$. Thus $F_{SW}$ is a clock with a pulse width of $T_D$. As the delay $T_D$ is fixed, the duty cycle of $F_{SW}$ scales with the clock frequency $F_{CK}$, as illustrated by the timing diagrams in Fig. 3.16(c). The duty cycle D is given by,

$$D = T_D \cdot F_{CK} \tag{3.8}$$

Therefore the equivalent resistance  $\mathrm{R}_{\mathrm{eq}}$  becomes,

$$R_{eq} = \frac{R_{LPF}}{T_D \cdot F_{CK}} \tag{3.9}$$

So the -3dB bandwidth of the LPF can be written as,

$$F_{\rm LPF} = \frac{1}{2\pi R_{\rm eq} C_{\rm LPF}} = \frac{T_{\rm D} \cdot F_{\rm CK}}{2\pi R_{\rm LPF} C_{\rm LPF}} \tag{3.10}$$

The OSR of the DSM is,

$$OSR = \frac{F_{CK}}{F_{LPF}} = \frac{2\pi R_{LPF} C_{LPF}}{T_D} \tag{3.11}$$

Eq. (3.11) indicates that the OSR remains constant over the entire frequency range, so the same DAC resolution is achieved regardless of the reference frequency. Fig. 3.17 shows the simulated deterministic jitter with the tunable switched RC LPF and with a fixed-bandwidth LPF. Clearly, more deterministic jitter is introduced with the fixed-bandwidth LPF, especially at low reference frequencies, due to the decreased OSR.
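Eqs. (3.8) to (3.11) can be chained numerically to confirm that the OSR no longer depends on the clock frequency. The component values below (`T_D`, `R_LPF`, `C_LPF`) are hypothetical and chosen only to keep the duty cycle below one; the clock frequencies assume a DSM clock of one fourth of a 175 to 875MHz reference.

```python
import math


def switched_rc_osr(f_ck, t_d, r_lpf, c_lpf):
    """Return (duty cycle, LPF bandwidth, OSR) for the switched RC filter."""
    duty = t_d * f_ck                               # Eq. (3.8)
    r_eq = r_lpf / duty                             # Eq. (3.9)
    f_lpf = 1.0 / (2.0 * math.pi * r_eq * c_lpf)    # Eq. (3.10)
    return duty, f_lpf, f_ck / f_lpf                # Eq. (3.11)


T_D = 200e-12    # hypothetical fixed delay, in seconds
R_LPF = 10e3     # hypothetical resistor, in ohms
C_LPF = 1e-12    # hypothetical capacitor, in farads

for f_ck in (43.75e6, 100e6, 218.75e6):
    duty, f_lpf, osr = switched_rc_osr(f_ck, T_D, R_LPF, C_LPF)
    print(f"F_CK={f_ck / 1e6:7.2f} MHz  D={duty:.4f}  "
          f"F_LPF={f_lpf / 1e3:8.1f} kHz  OSR={osr:.1f}")
```

The LPF bandwidth moves in proportion to the clock while the printed OSR is identical for every frequency, equal to $2\pi R_{LPF} C_{LPF}/T_D$.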



Figure 3.17: Simulated deterministic jitter vs. frequency with and without switched RC filter.

# 3.4 Experimental Results

A prototype of the proposed double integral DPLL was implemented in a 90nm CMOS process. Fig. 3.18 shows the die photograph of the prototype, which occupies an active area of $0.36 \text{mm}^2$ ($600 \mu \text{m} \times 600 \mu \text{m}$). Since the entire layout, including the digital blocks, was completed manually, the die area could reasonably be reduced by half if the digital layout were generated by automatic tools. The die was packaged in a standard 32-pin TQFP plastic package. The packaged chip is attached to the four-layer test board through a clamp screw that mechanically presses down on the package and forces its leads to contact the solder pads on the printed-circuit board (PCB).

Figure 3.18: Die photograph.

Fig. 3.19 shows the measured jitter histogram at 2.5GHz. The proposed DPLL achieves a long-term root-mean-square (r.m.s) jitter of 1.6ps and a peak-to-peak jitter of 11.6ps. Fig. 3.20 shows the measured phase noise at 2.5GHz. The r.m.s jitter obtained by integrating the measured phase noise from 10kHz to 100MHz is only 0.9ps, indicating that the remainder of the jitter in the histogram is contributed by the oscilloscope trigger jitter in the time-domain measurement. The reference spur is -50.1dBc, as illustrated by the measured spectrum in Fig. 3.21, which implies that the deterministic jitter resulting from this reference spur is only 0.83ps.

Fig. 3.22 shows the measured r.m.s jitter and pk-pk jitter from 0.7GHz to 3.5GHz. Fig. 3.23 shows the power consumption versus frequency. The power consumption is only 0.6-2.8mW over the 0.7-3.5GHz frequency range. The plots indicate that the proposed DPLL achieves excellent jitter performance and low power consumption over a wide operating range.



Figure 3.19: Jitter histogram @2.5GHz.



Figure 3.20: Phase noise @2.5GHz.

Figure 3.21: Measured Reference Spur @2.5GHz.

Figure 3.22: Jitter performance vs. frequency.

Fig. 3.24 shows the measured ratio of DPLL bandwidth to reference frequency. When bandwidth tracking is off, the DPLL bandwidth stays constant as the reference frequency varies, resulting in an increasing bandwidth-reference frequency ratio. As a consequence, the phase noise at high frequency is severely limited by the low bandwidth. When bandwidth tracking is on, the measured bandwidth-reference frequency ratio is almost constant over the wide frequency range, which indicates that the DPLL bandwidth tracks the reference frequency and maintains a low bandwidth-reference frequency ratio to suppress more DCO phase noise and obtain low random jitter.



Figure 3.23: Power consumption vs. frequency.



Figure 3.24: Measured bandwidth/reference ratio with and without bandwidth tracking.

The performance of the prototype is summarized in Table 3.1. The measured operating range of the proposed DPLL is 0.7 to 3.5GHz, which shows that a large operating range is achieved with the proposed double integral path.

The power consumption of the DPLL operating at 2.5 GHz is only 1.6mW from a 1V supply, of which merely 0.1mW is consumed by the digital circuits in the integral and double integral paths. This confirms the low-power advantage of a digital PLL topology. The bandwidth of the proposed DPLL is set to 1/40 of the reference frequency. With a divider ratio of 4, the DPLL loop bandwidth is 16MHz at the typical 2.5GHz output frequency. Table 3.2 compares the performance of this circuit with other state-of-the-art PLL designs [10, 11, 22, 30, 31]. The proposed DPLL compares favorably in output jitter and power consumption.
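The quoted loop bandwidth follows directly from the divider ratio and the 1/40 bandwidth-to-reference ratio. The short calculation below reproduces it; the function name is hypothetical.

```python
def loop_bandwidth_hz(f_out_hz, divider_ratio, bw_fraction=1 / 40):
    """Loop bandwidth as a fixed fraction of the reference frequency."""
    f_ref_hz = f_out_hz / divider_ratio
    return f_ref_hz * bw_fraction


bw = loop_bandwidth_hz(2.5e9, divider_ratio=4)
print(f"F_REF = {2.5e9 / 4 / 1e6:.0f} MHz, bandwidth = {bw / 1e6:.3f} MHz")
```

A 2.5GHz output with a divider ratio of 4 corresponds to a 625MHz reference, and one fortieth of that is 15.625MHz, which rounds to the 16MHz listed in the performance summary.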

| Parameter                             | Value                            |
|---------------------------------------|----------------------------------|
| Technology                            | 90nm CMOS                        |
| Supply Voltage                        | 1V                               |
| Operating Frequency                   | 0.7GHz-3.5GHz                    |
| Random Jitter @2.5GHz                 | 1.6ps r.m.s                      |
| Jitter from Integrated Phase Noise    | $0.9 \mathrm{ps} \mathrm{r.m.s}$ |
| Peak-to-peak Jitter @2.5GHz           | 11.6ps                           |
| Reference Spur                        | $-50.1 \mathrm{dBc}$             |
| Deterministic Jitter from Ref Spur    | $0.83 \mathrm{ps}$               |
| Loop Bandwidth @2.5GHz                | 16MHz                            |
| Power @2.5GHz                         | $1.6\mathrm{mW}$                 |
| Die Area                              | $0.36 \mathrm{mm}^2$             |

 Table 3.1: DPLL Performance Summary

#### 3.5 Summary

A double integral path DPLL architecture that achieves excellent jitter performance while operating with low power is presented. The linear proportional path both eliminates TDC quantization error in the proportional path and decouples the noise-bandwidth tradeoff. As a consequence, a high loop bandwidth can be chosen to suppress the phase noise of the DCO in a power-efficient manner. Employing a double integral path relaxes the integral path resolution requirements and provides a means for the bandwidth and tuning range tracking needed to achieve low random jitter and deterministic jitter over a very wide operating range. The high resolution of the delta-sigma DAC is maintained over a wide range of sampling clock frequencies by employing a tunable switched RC filter that forces the post-filter bandwidth to track the clock frequency and maintain a constant OSR. The measured results validate the proposed design techniques. Specifically, the prototype PLL chip operating at 2.5 GHz achieves 0.9ps r.m.s jitter from integrated phase noise while consuming only 1.6mW from a 1 V power supply. The proposed DPLL also achieves bandwidth tracking and quantization error tracking to optimize output jitter over the entire operating range.

|                     | JSSC05 [31]          | ISSCC05 [22]         | JSSC08 [30]         | ASSCC09 [10]        | VLSI06 [11]        | This work           |
|---------------------|----------------------|----------------------|---------------------|---------------------|--------------------|---------------------|
| Technology          | 90nm                 | 0.13um               | 65nm                | 0.13um              | 0.13um             | 90nm                |
| Supply Voltage      | 0.9V                 | 1V & 2.5V            | 0.5V                | 1.2V                | 1.2V               | 1V                  |
| Operating Frequency | 0.4-3.4GHz           | 0.62-5GHz            | 0.09-1.25GHz        | 0.3-1.4GHz          | 0.7-1.7GHz         | 0.7-3.5GHz          |
| RMS Jitter          | 1.4ps @2.5GHz        | 1.3ps @3.125GHz      | 3ps @1GHz           | 4.17ps @1.35GHz     | 6.9ps @1.2GHz      | 1.6ps @2.5GHz       |
| Pk-pk Jitter        | 15.8ps @2.5GHz       | 11.1ps @3.125GHz     | N/A                 | 32ps @1.35GHz       | 56ps @1.2GHz       | 11.6ps @2.5GHz      |
| Power Consumption   | 19.8mW @2.5GHz       | 15mW @3.125GHz       | 1.65mW @1GHz        | 13.7mW @1.35GHz     | 17mW @1.2GHz       | 1.6mW @2.5GHz       |
| Die Area            | 0.016mm<sup>2</sup>  | 0.064mm<sup>2</sup>  | 0.03mm<sup>2</sup>  | 0.19mm<sup>2</sup>  | 0.6mm<sup>2</sup>  | 0.36mm<sup>2</sup>  |
| Architecture        | Analog PLL           | Analog PLL           | Digital PLL         | Digital PLL         | Digital PLL        | Digital PLL         |

Table 3.2: Performance Comparison

# Chapter 4 – A TDC-less, linear loop dynamics digital clock and data recovery circuit

#### 4.1 Introduction

Clock and data recovery (CDR) circuits are essential building blocks in high-speed transceivers. Such transceivers serve in many applications, including optical communications, backplane routing, and high-speed serial links. The data received in these systems are both asynchronous and noisy, requiring that a clock be extracted to perform synchronous operations. Moreover, the data must be re-timed to remove the jitter accumulated during transmission. The CDR generates a clock synchronized to the incoming serial data and must satisfy the specifications defined by communication standards.

A classical CDR is realized using a type-2 PLL architecture, as shown in Fig. 4.1. The phase detector (PD) multiplies the edge-detected data by the VCO output, generating a beat that drives the VCO frequency toward the input bit rate. If the initial difference between the VCO frequency and the data rate is sufficiently small, the loop locks and establishes a well-defined phase relationship between $D_{IN}$ and $F_{OUT}$. The charge pump (CP) and loop filter (LF) suppress ripple on the oscillator control line. To re-time the data, a retimer realized by a DFF samples the input data $D_{IN}$ with the recovered clock $F_{OUT}$. In other words, the recovered clock $F_{OUT}$ drives both the data input of the phase detector and the clock input of the retimer.

Figure 4.1: Conventional analog CDR architecture.

The conventional CDR circuit suffers from a number of drawbacks. First, the PD can generate full digital outputs for run lengths greater than one, introducing significant ripple on the VCO control voltage and hence output jitter. Second, since the PD samples the clock with the data whereas the decision circuit samples the data with the clock, data retiming exhibits significant phase offset at high speeds. Furthermore, a passive lead-lag analog loop filter is used to set the loop response. In analog CDRs, the large capacitors needed to achieve a low jitter transfer bandwidth and a highly over-damped response (to reduce jitter peaking) prohibit monolithic integration of the loop filter. Therefore, analog CDRs often have to be realized using external components [2, 9].

As introduced in Chapter 2, digital CDRs [4, 21, 23] that employ a digital loop filter (DLF) have recently emerged as an alternative way to implement fully integrated CDRs, owing to their low power consumption, easy portability to different process technologies, and robustness to process and temperature variations. However, digital CDRs still have significant disadvantages, as discussed below.

The PDs for random data must provide two essential functions: data transition detection and phase difference detection. The bang-bang phase detector (BBPD) is normally used in digital CDRs. It provides a simple interface to the DLF and does not have any inherent static phase offset (SPO) [2, 21]. However, its bang-bang behavior introduces a large phase quantization error that appears as deterministic jitter. It also limits control of the jitter transfer bandwidth because the BBPD gain is uncertain. The non-linear behavior of the BBPD makes it difficult to control the loop dynamics. Therefore, a constant jitter transfer (JTRAN) characteristic is difficult to achieve as jitter generation (JGEN) varies, which leads to uncertainty in the CDR loop dynamics.

A time-to-digital converter (TDC) realized using a linear phase detector followed by an ADC can provide a fixed gain that enables well-controlled loop dynamics and achieves the desired jitter transfer bandwidth [4, 23]. However, it is susceptible to mismatches that appear as a large SPO, and the need for an ADC complicates the design and incurs a large power penalty.

The focus of this chapter is to address the disadvantages of traditional digital CDRs. Key techniques for implementing a digital CDR with SPO-free data recovery and linear loop dynamics are presented. The prototype digital CDR combines the advantages of linear and bang-bang phase detectors to achieve error-free operation at 2.5Gbps and a fixed jitter transfer bandwidth independent of input jitter amplitude. The proposed CDR decouples JGEN from JTRAN and allows a wide loop bandwidth to suppress oscillator phase noise efficiently.

## 4.2 Proposed CDR Architecture

The proposed CDR architecture is illustrated in Fig. 4.2. It consists of a digital frequency-locking loop (FLL) and a type-2 PLL. At startup, the FLL brings the CDR loop to frequency lock using a 1MHz reference clock input. It drives the DCO frequency to within $\pm 0.1\%$ of any incoming data rate in the 0.5 to 3.2Gbps range. The PLL, consisting of a linear proportional path and a bang-bang digital integral path, performs phase detection, corrects the residual frequency error, and acquires phase lock.



Figure 4.2: Proposed CDR architecture.

In the linear proportional path, a linear PD [7, 13] detects the phase difference between the input data and the feedback clock and produces an output pulse whose width scales linearly with that difference. The linear PD output directly drives the DCO frequency, so the DCO adjusts the output clock to correct the phase difference. Since this analog behavior in the proportional path provides effectively infinite resolution, it eliminates the nonlinearity and quantization error inherent in a BBPD. Hence, no high-resolution TDC [4, 23] with large power consumption is required in this design.

In the digital integral path, a bang-bang Alexander PD recovers the data without any systematic SPO and generates the early/late (E/L) information that drives the accumulator. A decimator is used to operate the digital circuitry at a lower clock frequency and thereby reduce the power consumption of the digital blocks. Ignoring the lower LSBs of the accumulator output reduces the jitter degradation caused by the decimator latency.

By using a linear proportional path instead of a bang-bang proportional path, the proposed architecture eliminates the large quantization error of a finite-resolution TDC in the proportional path, and it contains no SPO as long as the DFFs used in the PDs of both paths are well matched in setup time. A higher loop bandwidth can also be used to suppress oscillator phase noise efficiently. Another important advantage of the proposed architecture is the constant gain of the proportional path, which yields well-controlled loop dynamics and a fixed JTRAN bandwidth. More details of the proposed CDR are discussed below.

#### 4.2.1 Bang-bang Phase Detector

In this CDR architecture, an Alexander phase detector [1] is used as the bang-bang phase detector. The Alexander phase detector, as well as TDCs based on it, is commonly used in CDRs. It is also known as the early-late detection method and extends the DFF-based BBPD to operate with random data. Whereas a DFF uses only the positive or the negative edge, the Alexander PD operates on both the positive and negative edges of the input random data and produces an early or late output that indicates whether the phase difference is positive or negative, or no output when transitions are missing in the input data.

Fig. 4.3 illustrates the Alexander PD block diagram. The PD determines whether the clock leads or lags the data by using three data samples taken at three consecutive clock edges. If no data transition occurs, all three samples are equal and no action results. If the clock is early relative to the data, the first sample S1 is unequal to the last two, S2 and S3. In contrast, if the clock is late, the first two samples S1 and S2 are equal but unequal to the last sample S3.



Figure 4.3: Alexander phase detector.

The clock waveforms for the Alexander PD are shown in Fig. 4.4. Thus, $S1 \oplus S2$ and $S2 \oplus S3$ provide the early-late information:

- If  $S1 \oplus S2$  is high and  $S2 \oplus S3$  is low, the clock is late.
- If  $S1 \oplus S2$  is low and  $S2 \oplus S3$  is high, the clock is early.
- If  $S1 \oplus S2 = S2 \oplus S3$ , no data transition is present.
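The three rules above map directly onto a small decision function. The following Python sketch is only a behavioral illustration: the function name `alexander_pd` and its string return values are hypothetical, and the samples are assumed to be single bits numbered exactly as in the rules listed above.

```python
def alexander_pd(s1: int, s2: int, s3: int) -> str:
    """Classify three consecutive one-bit samples with the XOR rules above."""
    x12 = s1 ^ s2  # S1 xor S2
    x23 = s2 ^ s3  # S2 xor S3
    if x12 and not x23:
        return "late"
    if x23 and not x12:
        return "early"
    # S1 xor S2 equals S2 xor S3: treated as "no data transition"
    return "none"


if __name__ == "__main__":
    for samples in [(0, 1, 1), (0, 0, 1), (1, 1, 1), (1, 0, 1)]:
        print(samples, alexander_pd(*samples))
```

Returning a neutral value when both XOR outputs agree is what lets the detector leave the oscillator control untouched during long runs without transitions.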



Figure 4.4: Alexander PD clock waveform. (a) Clock is early. (b) Clock is late. (c) Clock aligned or data absence.

The Alexander PD is a bang-bang system, exhibiting a very high gain in the vicinity of $\Delta \phi = 0$. As a consequence, a conventional digital CDR based on this type of PD locks when S2 coincides with the data zero crossings. The Alexander PD has two advantages compared to a DFF-based PD. First, it automatically retimes the data and produces a valid data output without an additional retimer. Second, it generates a zero dc output in the absence of data transitions and leaves the oscillator control undisturbed. As a result, for long data runs, the VCO frequency drifts only due to device electronic noise rather than due to a high or low level on the control line.

Although the Alexander PD exhibits a bang-bang characteristic, its nonlinearity is linearized in the presence of input clock jitter. It is reasonable to assume a Gaussian distribution for the input clock jitter [6, 26]. According to [6], the Alexander PD output probability with clock jitter and voltage offset ($V_{OS}$) is depicted in Fig. 4.5, where $T_{OS}$ is the time offset that is related to $V_{OS}$ and the clock rise time $t_r$. The average output for a given input phase difference $\Delta T$ can be derived as

$$\overline{V_{PD} (\Delta T)} = V_{O} \cdot \operatorname{erf} \left( \frac{\Delta T + T_{OS}}{\sigma_{j}} \right) + V_{O} \cdot \operatorname{erf} \left( \frac{\Delta T - T_{OS}}{\sigma_{j}} \right)$$
(4.1)

where  $V_O$  is output voltage amplitude and  $\sigma_j$  is the root mean square value of the input jitter.



Figure 4.5: Alexander PD output probability with clock jitter and voltage offset.

According to Eq. (4.1), the voltage offset leads to a phase dead zone in the Alexander PD transfer characteristic, as shown in Fig. 4.8. The width of the dead zone is correlated with the value of $V_{OS}$, a result that has been verified in [26]. The dead zone causes an important problem in a CDR system: it reduces the jitter tolerance (JTOL) because the phase wanders unpredictably within the voltage-offset-induced dead zone. Furthermore, it makes the Alexander PD gain uncertain and more sensitive, since the gain is linear over a range of about twice the standard deviation of the input jitter but equals zero in the dead zone.
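The flattening of the transfer characteristic can be checked numerically by evaluating Eq. (4.1) as printed. The sketch below is only illustrative: the function names and the normalized values ($\sigma_j = 1$, $T_{OS} = 3$) are hypothetical choices, not measured parameters of this design.

```python
import math


def avg_pd_output(dt, t_os, sigma_j, v_o=1.0):
    """Average Alexander PD output, evaluated directly from Eq. (4.1)."""
    return (v_o * math.erf((dt + t_os) / sigma_j)
            + v_o * math.erf((dt - t_os) / sigma_j))


def slope_at_zero(t_os, sigma_j, v_o=1.0, h=1e-4):
    """Central-difference estimate of the PD gain at dt = 0."""
    hi = avg_pd_output(h, t_os, sigma_j, v_o)
    lo = avg_pd_output(-h, t_os, sigma_j, v_o)
    return (hi - lo) / (2.0 * h)


if __name__ == "__main__":
    # Hypothetical, normalized values: time is expressed in units of sigma_j.
    print("gain without offset:", slope_at_zero(t_os=0.0, sigma_j=1.0))
    print("gain with offset   :", slope_at_zero(t_os=3.0, sigma_j=1.0))
```

With an offset several times larger than the jitter, the slope around $\Delta T = 0$ collapses by orders of magnitude, which is the dead zone described above.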

#### 4.2.2 Linear Phase Detector

In this design, we use a full-rate Hogge phase detector [7] as the linear phase detector. The Hogge PD is a linear PD in which the width of the output pulses varies linearly with the input phase difference, in contrast to the behavior of a !!PD. Fig. 4.6 illustrates the Hogge PD block diagram. The signal ERR is a pulse whose width is proportional to the phase difference between the incoming clock and data. The signal REF is a reference pulse that appears only on data edges and has a constant width equal to half a clock period. The reference pulse eliminates the pattern dependency, that is, the ambiguity caused by the different transition densities of different data sequences. The buffer following DATA(t) cancels the clock-to-Q delay of the DFF and thereby matches the delays used for the phase comparison.
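The role of the reference pulse can be illustrated with a simple pulse-width bookkeeping model. This Python sketch assumes the textbook Hogge behavior in which each data transition produces an ERR pulse of width $T/2 + \Delta t$ and a REF pulse of width $T/2$; the function name and the bit patterns are hypothetical.

```python
def hogge_net_width(bits, dt, period=1.0):
    """Sum of (ERR width - REF width) over all data transitions.

    Assumption: ERR = period/2 + dt and REF = period/2 per transition.
    """
    transitions = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    err_total = transitions * (period / 2.0 + dt)
    ref_total = transitions * (period / 2.0)
    return err_total - ref_total


if __name__ == "__main__":
    dense = [0, 1] * 4             # a transition on every bit
    sparse = [0] * 10 + [1] * 10   # a single transition after a long run
    for name, pattern in (("dense", dense), ("sparse", sparse)):
        print(name, hogge_net_width(pattern, dt=0.0), hogge_net_width(pattern, dt=0.1))
```

In this model the net output is zero at $\Delta t = 0$ for any pattern, so the lock point does not move with transition density; only the magnitude of the correction scales with the number of transitions.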



Figure 4.6: Hogge phase detector.

Fig. 4.7 shows the clock waveform for the full-rate Hogge PD used in this CDR. The sampling edge of the recovered clock CK(t) detects the phase difference with respect to the input data DATA(t) and produces the error output ERR(t). The pulse width of the reference output REF(t) equals half a clock period, so that the average output DE-DR remains the same in the steady state regardless of the number of consecutive identical digits (CIDs) in the input data.

Figure 4.7: Hogge PD clock waveform.

The main advantages of using a linear proportional path and a bang-bang digital integral path are described in detail below.

The first advantage of the proposed architecture is that a linear PD in the proportional path eliminates the quantization error caused by the large step size of a !!PD, which is one of the main deterministic jitter sources in a bang-bang CDR. Although the linear PD output still produces control ripple in the DCO, which comes from the pulse widths of ERR(t) and REF(t), most of the ripple is filtered by the low CDR bandwidth and has little effect on the output jitter.

The second advantage of the proposed architecture is a static-phase-offset-free (SPO-free) CDR loop in steady state. Fig. 4.8 shows the transfer characteristics of the Alexander PD and the Hogge PD in the ideal and real cases, respectively.

Figure 4.8: The transfer characteristics for: (a) Alexander PD. (b) Hogge PD.

In a conventional linear CDR that utilizes a Hogge PD, a charge-pump (CP) is required to convert the linearly scaled pulse width into the control voltage of the oscillator. As a CP always exhibits current mismatch between the UP and DN pulses that is hard to eliminate, it causes static phase offset in the CDR loop. In contrast, the proposed CDR eliminates the CP. Instead, the output of the linear PD directly controls the oscillator frequency. As a consequence, this design exhibits an SPO-free characteristic.

Furthermore, the proposed CDR improves the output jitter by using the Hogge PD to compensate for the dead zone of the Alexander PD. As shown in Fig. 4.8, the Alexander PD exhibits a phase dead zone that is related to the DFF offset voltage $V_{OS}$, while the Hogge PD presents a linear characteristic around the steady state ($\Delta T = 0$). In a conventional digital CDR that utilizes a !!PD, the phase dead zone causes an uncertain phase error due to phase wandering, and thus deteriorates the JTOL performance because the uncertain phase region weakens the ability of the CDR loop to lock [33, 15]. In the proposed CDR, however, separating the proportional and integral paths with a Hogge PD and an Alexander PD eliminates the phase error caused by the dead zone of the Alexander PD. When the time difference $\Delta T$ falls in the dead zone region, the integral path receives zero and does not respond. The Hogge PD, however, still exhibits a constant gain and continues to drive the loop toward the steady state. Therefore, no phase error results, because the Hogge PD has a linear transfer characteristic.

Another important advantage is that the proposed CDR obtains a constant jitter transfer bandwidth regardless of the input clock jitter. The CDR is designed for a heavily over-damped response by making the digital integral gain very small compared to the proportional gain. Consequently, jitter peaking is almost completely eliminated and the jitter transfer bandwidth is set solely by the well-controlled proportional path gain. According to the average output of the Alexander PD expressed in Eq. (4.1), the gain of the Alexander PD can be derived as

$$K_{PD_{A}} = \frac{d\overline{V_{PD}(\Delta T)}}{d\Delta T} = \frac{2V_{O}}{\sqrt{2\pi}\,\sigma_{j}} e^{-\left(\frac{\Delta T}{\sqrt{2}\,\sigma_{j}}\right)^{2}}$$
(4.2)

where $V_O$ is the amplitude of the PD output, equal to $V_H - V_L$. Eq. (4.2) reveals that the PD gain depends strongly on the input clock jitter $\sigma_j$ in the conventional CDR. As the CDR bandwidth is related to the PD gain, the uncertainty of $K_{PD_A}$ due to the unknown $\sigma_j$ produces a large variation in the CDR bandwidth and thereby leads to an uncertain JTRAN characteristic.
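The jitter dependence can be made concrete by evaluating the gain for a few jitter values. The sketch below reads Eq. (4.2) as a Gaussian slope, $2V_O/(\sqrt{2\pi}\,\sigma_j)\cdot e^{-\Delta T^2/(2\sigma_j^2)}$, which is our interpretation of the formula; the function name and the jitter values are hypothetical.

```python
import math


def alexander_gain(dt, sigma_j, v_o=1.0):
    """Linearized Alexander PD gain, Gaussian reading of Eq. (4.2)."""
    peak = 2.0 * v_o / (math.sqrt(2.0 * math.pi) * sigma_j)
    return peak * math.exp(-(dt / (math.sqrt(2.0) * sigma_j)) ** 2)


if __name__ == "__main__":
    # Hypothetical r.m.s. jitter values in picoseconds; gain is in V_O per ps.
    for sigma_ps in (1.0, 2.0, 4.0, 8.0):
        print(f"sigma_j = {sigma_ps} ps -> gain at lock = {alexander_gain(0.0, sigma_ps):.4f}")
```

At the lock point the gain is inversely proportional to $\sigma_j$, so every doubling of the input jitter halves the PD gain and shifts any loop bandwidth that depends on it.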

In contrast, utilizing a linear PD in the proportional path and a !!PD in the integral path solves this problem in this design while still retaining the advantages of a digital CDR, as the loop dynamics are determined by the gain of the Hogge PD instead of the !!PD. From Fig. 4.8, the Hogge PD gain is obtained as

$$K_{PD_{H}} = \frac{dV_{PD}(\Delta T)}{d\Delta T} = \frac{V_{O}}{\pi}$$
(4.3)

Eq. (4.3) indicates that the gain of the Hogge PD is determined by the output amplitude and is not affected by the input jitter. That is, an unchanged PD gain and stable loop dynamics are obtained. As a result, compared to conventional bang-bang digital CDRs, where the bandwidth is highly sensitive to the input jitter amplitude [21], the fixed gain of the linear PD provides the major advantage of a constant jitter transfer bandwidth. Furthermore, the proposed architecture decouples jitter generation from the jitter transfer performance of the CDR. Since the !!PD is employed only in the digital integral path, the bandwidth can be increased without increasing dither jitter generation. In backplane applications, the bandwidth can also be maximized to reduce the jitter generated by the oscillator phase noise.

Last, the linear proportional path in the proposed architecture greatly reduces the sensitivity to the DCO quantization error and to CIDs. In contrast, jitter generation in conventional digital CDRs arises from the frequency quantization error and the phase noise of the DCO. The finite DCO resolution manifests as deterministic jitter due to jitter accumulation, and its magnitude is determined both by the number of consecutive identical digits (CIDs) and by the amount of frequency quantization error. The finite slew rate of a bang-bang CDR compounds this effect further.
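The dependence on both CID length and frequency quantization error can be seen from a first-order estimate: with no phase update during a run of $N$ identical bits, a frequency error $\Delta f$ accumulates a phase drift of roughly $N \cdot \Delta f / f$ unit intervals. The Python sketch below uses this estimate; the function name and the 1MHz frequency error are hypothetical, while the run lengths of 7 and 31 are the longest runs found in $2^7 - 1$ and $2^{31} - 1$ PRBS patterns.

```python
def cid_phase_drift_ui(n_cid, delta_f_hz, f_clk_hz):
    """First-order phase drift (in UI) accumulated over n_cid bit periods.

    Assumes the loop applies no correction during the run and that the
    clock frequency is off by delta_f_hz for the whole run.
    """
    return n_cid * delta_f_hz / f_clk_hz


if __name__ == "__main__":
    f_clk = 2.5e9    # full-rate clock for 2.5Gb/s data
    delta_f = 1.0e6  # hypothetical DCO frequency quantization error
    for run in (7, 31):
        print(f"run of {run} bits: {cid_phase_drift_ui(run, delta_f, f_clk):.4f} UI")
```

The estimate is linear in both quantities, which is why a finer DCO step or a path that keeps correcting phase independently of the quantized frequency word reduces this jitter component.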

### 4.3 Circuit Design

In this section, we present the circuit implementation details of the building blocks in the proposed CDR (refer to Fig. 4.2). The PDs in both the proportional and integral paths use the same DFFs for better matching. The sense-amplifier flip-flop (SAFF) [20] is employed to minimize the setup-time-induced phase error. The schematic of the SAFF is shown in Fig. 4.9.



Figure 4.9: Sense-amplifier flip-flop schematic.

The SAFF consists of the sense amplifier (SA) in the first stage and the slave set-reset (SR) latch in the second stage. The SA stage provides a negative pulse on one of the inputs to the SR latch, $\overline{S}$ or $\overline{R}$, depending on whether the output is to be set or reset. It senses the true and complementary differential inputs and produces a monotonic transition from the logic-one to the logic-zero level on one of the outputs, following the leading clock edge. Any subsequent change of the data during the active clock interval does not affect the output of the SA. The SR latch in the second stage captures the transition and holds the state until the next leading edge of the clock arrives. After the clock returns to the inactive state, both outputs of the SA stage assume a logic-one value. Therefore, the whole structure acts as a flip-flop.

The sense amplifier operates in two phases. In the reset phase, when the clock signal CK is low, nodes $\overline{S}$ and $\overline{R}$ are reset to VDD. In the sample phase, when CK is high, the tail transistor M0 turns on and pulls down the common node $V_{CM}$. The input differential pair M1 and M2 then turns on and discharges the output nodes $\overline{S}$ and $\overline{R}$ at a rate determined by the input differential voltage.
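The two-phase operation can be summarized with a purely logical model. The Python sketch below is a behavioral abstraction only: the class name and the `step` interface are hypothetical, the differential input is reduced to a single bit, and analog effects such as offset and regeneration time are ignored.

```python
class SenseAmpFlipFlop:
    """Logic-level model: SA stage with active-low outputs plus an SR latch."""

    def __init__(self):
        self.q = 0
        self._prev_ck = 0
        self._decision = 0

    def step(self, ck, d):
        """Apply one (ck, d) sample and return (s_bar, r_bar, q)."""
        if ck and not self._prev_ck:
            self._decision = d  # SA evaluates once, on the leading clock edge
        self._prev_ck = ck
        if ck:
            # Sample phase: exactly one SA output is pulled low.
            s_bar, r_bar = (0, 1) if self._decision else (1, 0)
        else:
            # Reset phase: both SA outputs return to logic one.
            s_bar, r_bar = 1, 1
        if s_bar == 0:
            self.q = 1
        elif r_bar == 0:
            self.q = 0
        # Otherwise the SR latch holds its previous state.
        return s_bar, r_bar, self.q


if __name__ == "__main__":
    ff = SenseAmpFlipFlop()
    for ck, d in [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]:
        print((ck, d), ff.step(ck, d))
```

The third input pair changes the data while the clock is still high, and the model ignores it, mirroring the statement that later data changes during the active clock interval do not affect the SA output.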

As the input data has a small amplitude and runs at high speed, additional samplers that connect directly to the input data are placed in front of the SAFF to regenerate a rail-to-rail signal. The data evaluation time and sensitivity are important performance metrics of the data/edge samplers, and the series-connected sense amplifiers increase the available data evaluation time. The sensitivity is also improved by minimizing the offset of the first SA through reasonably large input devices and a careful layout that reduces the mismatch of both the transistors and the parasitic capacitances. In this way, the PDs relax the sensitivity requirement on the incoming data and increase the maximum operating rate of the CDR.

A decimator circuit is used in the digital integral path to operate the digital circuitry at a lower clock frequency, which eases the design requirements of the digital circuits and reduces power consumption. The decimator schematic is shown in Fig. 4.10(a). The clock CK in the decimator circuit runs at half the rate of the input data $D_{IN}$, so the data rate of the output $D_{OUT}$ is also halved. $D_P$ is sampled by the positive edge of CK, while $D_N$ is sampled by $\overline{CK}$ and therefore holds the data at the negative edge of CK. $D_P$ and $D_N$ thereby contain all the information of $D_{IN}$ and ensure that no data is missed. $D_P$ and $D_N$ are then summed by a 2-bit adder. The table of input and output data is shown in Fig. 4.10(b). The output data $D_{OUT}$ is the sign of the sum, which reduces the result to 1 bit again. Although the decimator output retains all the useful information, the decimator still increases the loop latency because of the reduced clock speed. The effect of this latency can be compensated easily by ignoring the unneeded LSBs of the accumulator to avoid dithering jitter.

Figure 4.10: Decimator circuit. (a) Schematic. (b) Input and output digits.
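The sum-and-sign operation of the decimator can be sketched in a few lines. This Python example assumes that early, late, and no-transition decisions are encoded as +1, -1, and 0; that encoding, the function name, and the handling of a zero sum are assumptions, since the table of Fig. 4.10(b) is not reproduced here.

```python
def decimate_by_two(samples):
    """Pair up full-rate early/late samples, add each pair, keep the sign."""
    out = []
    for i in range(0, len(samples) - 1, 2):
        total = samples[i] + samples[i + 1]    # D_P + D_N
        out.append((total > 0) - (total < 0))  # sign: +1, 0, or -1
    return out


if __name__ == "__main__":
    full_rate = [1, 1, -1, -1, 1, -1, 0, 1]
    print(decimate_by_two(full_rate))
```

Every input sample contributes to exactly one output, so no early/late decision is discarded even though the output runs at half the input rate.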

The DCO schematic is illustrated in Fig. 4.11. It is a current-controlled ring oscillator whose frequency is separately controlled by an FLL, a digital integral path, and a proportional path through two 14-bit DACs and a 3-level DAC. Using separate DACs for the frequency control and the integral control relaxes the stringent quantization error requirements otherwise present in a shared DAC architecture [21]. To further ease the precision requirements, the high-resolution DACs are implemented using a second-order error-feedback delta-sigma modulator that truncates the 14-bit input to 15 levels, a thermometer-coded current-mode DAC, and a low-pass post filter. The proportional control is realized by scaling the current of a 3-level DAC with the pulse-width-modulated PD outputs DE and DR, and its bias current is set to achieve a jitter transfer bandwidth of about 4MHz. The current in the integral control is much smaller than that in the proportional control, so that it achieves a fine frequency resolution and thus reduces the deterministic jitter induced by the integral path. Further, an ac-coupled output buffer follows the oscillator to boost the output clock amplitude from the CCO supply to rail-to-rail with little distortion and a well-preserved duty cycle.

Figure 4.11: Simplified digitally controlled oscillator schematic.
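The truncation from a 14-bit word to 15 levels can be illustrated with a generic second-order error-feedback modulator. The Python sketch below uses the textbook structure in which the quantization error is fed back through $2z^{-1} - z^{-2}$; the function name, the scaling of the input code to output levels, and the rounding quantizer are assumptions and are not taken from the fabricated design.

```python
import math


def error_feedback_dsm(code, n_samples, in_bits=14, levels=15):
    """Second-order error-feedback delta-sigma modulator (generic sketch)."""
    full_scale = (1 << in_bits) - 1
    x = code * (levels - 1) / full_scale  # input in units of output levels
    e1 = e2 = 0.0                         # previous two quantization errors
    out = []
    for _ in range(n_samples):
        v = x + 2.0 * e1 - e2             # add back the shaped past errors
        y = min(levels - 1, max(0, math.floor(v + 0.5)))  # 15-level quantizer
        e2, e1 = e1, v - y                # error discarded by the quantizer
        out.append(y)
    return out


if __name__ == "__main__":
    stream = error_feedback_dsm(code=8192, n_samples=4096)
    print("target level :", 8192 * 14 / 16383)
    print("stream mean  :", sum(stream) / len(stream))
```

The average of the coarse output stream converges to the fine input value, which is what allows a 15-level thermometer-coded DAC followed by a low-pass filter to deliver an effective resolution close to that of the 14-bit control word.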

### 4.4 Experimental Results

The prototype digital CDR is designed and fabricated in a 130nm CMOS technology. It achieves error-free operation (BER  $< 10^{-12}$ ) for all PRBS data sequences ranging from  $2^7 - 1$  to  $2^{31} - 1$  sequence lengths over 0.5Gb/s to 3.2Gb/s data rates. Fig. 4.12 shows the micrograph of the chip. The active die area is 0.2mm<sup>2</sup> (400 $\mu$ m × 500 $\mu$ m).



Figure 4.12: Die photograph.

Fig. 4.13 shows the measured jitter histogram at 2.5Gb/s with $2^7 - 1$ PRBS data. The recovered clock achieves a long-term root-mean-square (r.m.s.) jitter of 5.7ps and a peak-to-peak jitter of 68ps. Fig. 4.14 depicts the measured phase noise of the 2.5GHz recovered clock. The r.m.s. jitter, obtained by integrating the phase noise from 10kHz to 1GHz, is 6.1ps and 8.3ps for the proposed and conventional CDRs, respectively. In the conventional CDR, increasing the integral path gain, KI, increases the DCO frequency quantization error and degrades the jitter from 8.3ps to 20.5ps (Fig. 4.14). In contrast, the jitter remains unchanged for the proposed digital CDR, thus minimizing the sensitivity to PVT-induced DCO resolution variation. This indicates that the proposed CDR exhibits a much better JGEN result than the conventional !!CDR, because the Hogge PD in the proportional path achieves linear loop dynamics with less quantization error and permits a larger loop bandwidth that suppresses more DCO phase noise.
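The conversion from a phase noise plot to r.m.s. jitter follows the standard relation $\sigma_t = \sqrt{2\int \mathcal{L}(f)\,df}\,/\,(2\pi f_0)$. The Python sketch below applies it with a trapezoidal integration; the function name is hypothetical, and the flat -100dBc/Hz profile is a made-up input used only to exercise the code, not measured data from Fig. 4.14.

```python
import math


def rms_jitter_from_phase_noise(freqs_hz, l_dbc_hz, f_carrier_hz):
    """Integrate single-sideband phase noise and return r.m.s. jitter in seconds."""
    area = 0.0
    for i in range(len(freqs_hz) - 1):
        p0 = 10.0 ** (l_dbc_hz[i] / 10.0)      # dBc/Hz -> linear ratio per Hz
        p1 = 10.0 ** (l_dbc_hz[i + 1] / 10.0)
        area += 0.5 * (p0 + p1) * (freqs_hz[i + 1] - freqs_hz[i])
    phase_rms_rad = math.sqrt(2.0 * area)      # factor 2 counts both sidebands
    return phase_rms_rad / (2.0 * math.pi * f_carrier_hz)


if __name__ == "__main__":
    # Hypothetical flat profile over the 10kHz to 1GHz integration range.
    jitter_s = rms_jitter_from_phase_noise([1e4, 1e9], [-100.0, -100.0], 2.5e9)
    print(f"r.m.s. jitter = {jitter_s * 1e12:.2f} ps")
```

For measured data the offset frequencies are usually log-spaced, so a dense grid is needed for the linear trapezoidal rule to remain accurate.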



Figure 4.13: Jitter histogram @2.5Gb/s.



Figure 4.14: Measured phase noise @2.5Gb/s of proposed CDR and conventional !!CDR for 2 integral path gains.

Figure 4.15: Measured recovered clock jitter as a function of jitter transfer bandwidth and input PRBS sequence length.

Fig. 4.15(a) depicts the measured recovered clock jitter as a function of input PRBS sequence length. When the length is increased from $2^7 - 1$ to $2^{15} - 1$, the jitter degrades from 6ps to 10.7ps and from 9ps to 23ps for the proposed and conventional cases, respectively, thus validating the proposed CDR's improved immunity to CIDs. In the conventional CDR, increasing the jitter transfer bandwidth degrades the recovered clock jitter due to increased dither jitter in the proportional path, as shown in Fig. 4.15(b). In the proposed CDR, however, the bandwidth can be increased without introducing dither jitter, thereby reducing the recovered clock jitter induced by DCO phase noise. The jitter tolerance (JTOL) result is depicted in Fig. 4.16. The measured JTOL is greater than 0.41UI at all jitter frequencies.

Figure 4.16: Jitter tolerance for PRBS $2^7 - 1$ and BER threshold of $10^{-12}$.

The measured jitter transfer (JTRAN) bandwidth plots for both the proposed CDR and the conventional !!CDR are shown in Fig. 4.17. The bandwidths are measured at four input jitter amplitudes ranging from -10dB to -40dB. The plots indicate that the proposed CDR achieves a near-constant bandwidth of 4.5MHz, while the bandwidth varies by more than 100% (from 3.8MHz to 9.5MHz) when the CDR is configured in the conventional bang-bang mode. The measured results are therefore consistent with the aforementioned discussion: the proposed architecture achieves a constant bandwidth and a stable jitter transfer characteristic regardless of the input noise, while the conventional !!CDR shows a large variation in jitter transfer bandwidth when the input jitter is uncertain.
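The size of the bandwidth variation in the conventional mode follows directly from the two measured extremes. The short Python sketch below only reproduces that arithmetic; the function name is hypothetical.

```python
def percent_variation(low, high):
    """Relative change from low to high, expressed in percent of low."""
    return (high - low) / low * 100.0


if __name__ == "__main__":
    # Measured bang-bang mode bandwidth extremes quoted above, in MHz.
    print(f"{percent_variation(3.8, 9.5):.0f}%")
```

Relative to the lower extreme, the bandwidth in bang-bang mode changes by about 150%, which is the basis of the "more than 100%" statement.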

Figure 4.17: Measured jitter transfer bandwidth for proposed CDR and conventional !!CDR for different input jitter amplitudes.

Fig. 4.18 depicts the measured recovered data bathtub curves for $2^{31} - 1$ PRBS inputs in the proposed CDR and the conventional !!CDR, respectively. At BER < $10^{-12}$, the jitter generation of the proposed CDR is better than that of the !!CDR by more than 0.15UI, due to the reduced quantization error in the linear proportional path and the static-phase-offset-free characteristic of the proposed CDR.

The performance of the prototype is summarized in Table 4.1. The operating data rate ranges from 0.5Gb/s to 3.2Gb/s. The prototype achieves error-free operation (BER $< 10^{-12}$) for all PRBS data sequences with lengths ranging from $2^7 - 1$ to $2^{31} - 1$ over the entire data rate range. At 2.5Gb/s, the power consumption of the CDR is 7mW from a single 1.2V supply. The bandwidth of the CDR is 4MHz at the typical 2.5Gb/s data rate. Table 4.2 compares the performance of this circuit with other state-of-the-art CDR designs [2, 4, 9, 21, 23]. The proposed CDR compares favorably in jitter performance and power consumption.



Figure 4.18: Measured bathtub curves for proposed CDR and conventional !!CDR with $2^{31}-1$ PRBS input.

|                    | JSSC'05 [2]         | JSSC'06 [9]               | ISSCC'07 [21]            | JSSC'06 [23]             | ISSCC'09[4]         | This work              |
|--------------------|---------------------|---------------------------|--------------------------|--------------------------|---------------------|------------------------|
| Technology         | 0.35µm              | 0.13µm                    | 0.13µm                   | 0.25µm                   | 65nm                | 0.13µm                 |
| Supply voltage     | 3.3V                | 1.8V/3.3V                 | 1.2V                     | 2.5V/3.3V                | 1.2V                | 1.2V                   |
| Data rate          | 0.0125-2.7<br>Gbps  | 9.95-11.1<br>Gbps         | 2.8Gbps                  | 0.15/0.6/1.2/2.5<br>Gbps | 10Gbps              | 0.5-3.2<br>Gbps        |
| Filter             | Analog              | Analog                    | Digital                  | Hybrid                   | Digital             | Digital                |
| Phase Detector     | N/A                 | Binary                    | Binary                   | Linear                   | Linear              | Linear                 |
| JTRAN<br>Bandwidth | 0.5MHz              | 3.6MHz                    | N/A                      | 1.4MHz                   | 20MHz               | 4MHz                   |
| Jitter [rms/pp]    | N/A                 | 4.2mUI/35mUI<br>@11.1Gbps | 7.2ps/47.2ps<br>@2.5Gbps | 1.2ps/N/A<br>@2.5Gbps    | 14ps/N/A<br>@10Gbps | 5.7ps/68ps<br>@2.5Gbps |
| Power              | 775.5mW<br>@2.5Gbps | 793mW<br>@11.1Gbps        | 13.2mW<br>@2.5Gbps       | 425mW<br>@2.5Gbps        | 65mW<br>@10Gbps     | 7mW<br>@2.5Gbps        |
| Area               | 9mm <sup>2</sup>    | 8mm <sup>2</sup>          | 0.13mm <sup>2</sup>      | 25mm <sup>2</sup>        | 0.11mm <sup>2</sup> | 0.2mm <sup>2</sup>     |

Table 4.2: Performance Comparison
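One way to compare the power figures across different data rates is to normalize them to energy per bit, since 1mW per Gb/s equals 1pJ/bit. The Python sketch below applies this to the power and data rate entries of Table 4.2; the dictionary layout and function name are hypothetical, and the normalization itself is our addition rather than a metric reported in the table.

```python
# (power in mW, data rate in Gb/s) taken from the Power row of Table 4.2
designs = {
    "JSSC'05 [2]": (775.5, 2.5),
    "JSSC'06 [9]": (793.0, 11.1),
    "ISSCC'07 [21]": (13.2, 2.5),
    "JSSC'06 [23]": (425.0, 2.5),
    "ISSCC'09 [4]": (65.0, 10.0),
    "This work": (7.0, 2.5),
}


def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ: mW divided by Gb/s."""
    return power_mw / rate_gbps


if __name__ == "__main__":
    for name, (power_mw, rate_gbps) in designs.items():
        print(f"{name:14s} {energy_per_bit_pj(power_mw, rate_gbps):8.2f} pJ/bit")
```

This normalization ignores differences in technology, supply voltage, and included functionality, so it is only a rough indicator.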

| Parameter       | Value               |
|-----------------|---------------------|
| Technology      | 0.13µm CMOS         |
| Supply voltage  | 1.2V                |
| Data rate       | 0.7Gb/s-3.5Gb/s     |
| BER             | $< 10^{-12}$        |
| Jitter          | 5.7ps rms           |
| JTRAN bandwidth | 4.5MHz              |
| Power           | 7mW                 |
| Area            | 0.2mm<sup>2</sup>   |

Table 4.1: CDR Performance Summary

### 4.5 Summary

In this chapter, we proposed a novel TDC-less digital CDR architecture that achieves linear loop dynamics and a constant jitter transfer bandwidth. With the use of a Hogge PD, the proportional path eliminates the TDC quantization error and obtains a stable PD gain that is unaffected by the input jitter amplitude. In this way we arrive at a large and constant jitter transfer bandwidth, such that the CDR can suppress more phase noise and achieve a fixed JTRAN characteristic. In addition, the linear proportional path is free of static phase offset, in contrast to a conventional linear CDR, as no charge-pump is needed in this digital CDR. Meanwhile, the Hogge PD compensates for the dead zone caused by the voltage offset of the Alexander PD and achieves better JTOL performance than a conventional !!CDR. A high-performance SAFF is employed in both PDs to minimize the setup time, relax the sensitivity requirement on the incoming data, and increase the maximum operating rate of the CDR. The prototype CDR chip achieves BER < $10^{-12}$ for all PRBS data sequences over the data rate range of 0.5-3.2Gb/s. At the typical 2.5Gb/s, the power consumption of the CDR is only 7mW from a 1.2V supply. The measurement results also show that the proposed CDR obtains an almost constant JTRAN bandwidth that is immune to input jitter.

## Chapter 5 – Conclusion

High-performance digital systems are a widely discussed design topic in present IC technologies. Continued technology scaling in deep-submicron CMOS processes imposes severe constraints on analog circuits. Digital-enhanced circuits that assist analog design have become increasingly necessary and offer many performance enhancements, yet significant bottlenecks still obstruct their application. Therefore, it is necessary to address these problems and seek alternative solutions to achieve better performance.

### 5.1 Contribution

This thesis explores the design approaches and circuit techniques for high-performance digital-enhanced PLL and CDR in deep sub-micron processes.

In Chapter 2, an analysis of the specifications and limitations of high-performance digital PLLs and CDRs is presented. The chapter starts with a detailed jitter analysis of DPLLs, then extends the discussion to the design challenges and limitations of conventional DPLLs, and further to PLL-based digital CDR design. This analysis shows that the constraints in digital-enhanced clocking raise trade-offs, and that alternative design methods are needed to ease and resolve them.

Chapter 3 presents a novel double integral path DPLL that achieves a wide operating frequency range as well as low jitter and low power. The linear proportional path in the proposed DPLL eliminates the TDC quantization error and decouples the noise bandwidth tradeoff, benefiting both deterministic jitter and random jitter. The proposed double integral path obtains both a wide tuning range and a small integral path step size. Bandwidth and tuning range tracking are further developed in this DPLL to optimize the output jitter performance over a wide operating range. In this way a high-performance digital PLL is achieved.

In Chapter 4, a PLL-based digital CDR is proposed that avoids the use of a TDC and exhibits linear loop dynamics. As the charge-pump and loop filter are replaced by their digital counterparts, the Hogge PD in the proposed CDR introduces no static phase offset, in contrast to a conventional linear CDR. The Hogge PD also compensates for the dead zone that the DFF voltage offset causes in the Alexander PD of the digital integral path, which would otherwise deteriorate the JTOL performance, and hence improves the output jitter. Furthermore, the linear proportional path mitigates the TDC quantization error and provides a stable PD gain that is not affected by the input jitter amplitude. Thus a large and constant jitter transfer bandwidth is obtained, such that the CDR can suppress more phase noise and achieve a fixed JTRAN characteristic. In this way a high-performance digital CDR is obtained. These results suggest that digital circuits that assist analog functions offer significant merits and are worth exploring further in deep-submicron CMOS technologies.

### 5.2 Future work

With the continued development of deep-submicron processes, digital-enhanced circuits that assist analog design become more and more important. Further development on this topic can be explored in the following ways:

(1) Advanced implementations that reduce the effect of supply and substrate noise in digital PLLs and CDRs are essential, as the impact of supply and substrate noise on the oscillator is severe at the low supply voltages of deep-submicron processes. It is therefore necessary to investigate both supply noise suppression and cancellation techniques to improve the power supply rejection ratio (PSRR). Substrate noise should be addressed with the same goal.

(2) PVT variation needs more consideration in future designs. Although digital circuits are immune to PVT variation, the digitally controlled oscillator and the remaining analog circuits are not. As the gain of the oscillator can vary by more than a factor of two across corners, approaches that realize adaptive loop dynamics, or a calibration scheme that achieves a fixed DCO frequency step size $\Delta F$ in the presence of PVT variations, would greatly benefit the design and strengthen the advantages of digital-enhanced PLLs and CDRs.

### Bibliography

- [1] J. Alexander. Clock recovery from random binary signals. *Electronics Letters*, 11:541–542, Oct. 1975.
- [2] D. Dalton, K. Chai, E. Evans, M. Ferriss, D. Hitchcox, P. Murray, S. Selvanayagam, P. Shepherd, and L. DeVito. A 12.5-mb/s to 2.7-gb/s continuous-rate cdr with automatic frequency acquisition and data-rate readback. *IEEE Journal of Solid-State Circuits*, 40:2713–2725, Dec. 2005.
- [3] P. Dudek, S. Szczepanski, and J. Hatfield. A high-resolution cmos time-to-digital converter utilizing a vernier delay line. *IEEE Journal of Solid-State Circuits*, 35:240–247, Feb. 2000.
- [4] K. Fukuda, H. Yamashita, F. Yuki, G. Ono, R. Nemoto, E. Suzuki, T. Takemoto, and T. Saito. A 1.2mw 0.02mm2 2ghz current-controlled pll based on a self-biased voltage-to-current converter. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 186–187, Feb. 2009.
- [5] F. Gardner. Charge-pump phase-lock loops. *IEEE Trans. Communications*, 40:1849–1858, Nov. 1980.
- [6] P. Hanumolu, G. Wei, U. Moon, and K. Mayaram. Digitally-enhanced phase-locking circuits. In *Custom Integrated Circuits Conference*, pages 361–368, Sep. 2007.
- [7] C. Hogge. A self-correcting clock recovery circuit. *IEEE Journal of Lightwave Technology*, 3:1312–1314, Dec. 1985.
- [8] W. Jung, H. Choi, C. Jeong, K. Kim, W. Kim, H. Jeon, G. Koo, J. Kim, J. Seo, M. Ko, and J. Kim. A 1.2mw 0.02mm2 2ghz current-controlled pll based on a self-biased voltage-to-current converter. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 310–311, Feb. 2007.
- [9] J. Kenney, D. Dalton, E. Evans, M. Eskiyerli, B. Hilton, D. Hitchcox, T. Kwok, D. Mulcahy, C. McQuilkin, V. Reddy, S. Selvanayagam, P. Shepherd, W. Titus, and L. DeVito. A 9.95 to 11.1gb/s xfp transceiver in 0.13µm cmos. *IEEE Journal of Solid-State Circuits*, 41:2901–2909, Dec. 2006.

- [10] D. Kim, H. Song, T. Kim, S. Kim, and D. Jeong. A 1.35ghz all-digital fractional-n pll with adaptive loop gain controller and fractional divider. In *IEEE Asian Solid-State Circuits Conference*, pages 161–164, Nov. 2009.
- [11] V. Kratyuk, P. Hanumolu, K. Ok, K. Mayaram, and U. Moon. A digital pll with a stochastic time-to-digital converter. In *IEEE Symposium on VLSI Circuits Digest of Technical Papers*, pages 52–53, Jun. 2006.
- [12] J. Lee, K. Kundert, and B. Razavi. Analysis and modeling of bang-bang clock and data recovery circuits. *IEEE Journal of Solid-State Circuits*, 39:1571–1580, Sep. 2004.
- [13] J. Lee and K. Wu. A 20gb/s full-rate linear cdr circuit with automatic frequency acquisition. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 336–337, Feb. 2009.
- [14] Minjae Lee and Asad Abidi. A 9 b, 1.25 ps resolution coarse-fine time-to-digital converter in 90 nm cmos that amplifies a time residue. *IEEE Journal of Solid-State Circuits*, 43:769–777, Apr. 2008.
- [15] S. Lee, H. Ha Y. Kim, Y. Seo, H. Park, and J. Sim. A 650mb/s-to-8gb/s referenceless cdr circuit with automatic acquisition of data rate. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 184–185, Feb. 2009.
- [16] Jerry Lin, Baher Haroun, Tim Foo, Jin-Sheng Wang, Bob Helmick, Scott Randall, Terry Mayhugh, Chris Barr, and Jeff Kirkpatrick. A pvt tolerant 0.18mhz to 600mhz self-calibrated digital pll in 90nm cmos process. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 488–489, Feb. 2004.
- [17] J. Maneatis, J. Kim, I. McClatchie, J. Maxey, and M. Shankaradas. Self-biased high-bandwidth low-jitter 1-to-4096 multiplier clock generator pll. *IEEE Journal of Solid-State Circuits*, 38:1795–1803, Nov. 2003.
- [18] M. Mansuri, D. Liu, and C. Yang. Fast frequency acquisition phase-frequency detectors for gsamples/s phase-locked loops. *IEEE Journal of Solid-State Circuits*, 37:1331–1344, Oct. 2002.
- [19] J. McNeill. Jitter in ring oscillators. *IEEE Journal of Solid-State Circuits*, 32:870–879, Jun. 1997.

- [20] B. Nikolic, V. Oklobdzija, V. Stojanovic, W. Jia, J. Chiu, and M. Leung. Improved sense-amplifier-based flip-flop: Design and measurements. *IEEE Journal of Solid-State Circuits*, 35:876–884, Jun. 2000.
- [21] D. Oh, D. Kim, S. Kim, D. Jeong, and W. Kim. A 2.8gb/s all-digital cdr with a 10b monotonic dco. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 222–223, Feb. 2007.
- [22] J. Parker, D. Weinlader, and J. Sonntag. A 15mw 3.125ghz pll for serial backplane transceivers in 0.13um cmos. In *International Solid-State Circuits Conference Digest of Technical Papers*, pages 412–413, Feb. 2005.
- [23] M. Perrott, Y. Huang, R. Baird, B. Garlepp, D. Pastorello, E. King, Q. Yu, D. Kasha, P. Steiner, L. Zhang, J. Hein, and B. Signore. A 2.5-gb/s multi-rate 0.25-um cmos clock and data recovery circuit utilizing a hybrid analog/digital loop filter and all-digital referenceless frequency acquisition. *IEEE Journal of Solid-State Circuits*, 41:2930–2944, Dec. 2006.
- [24] R. Schreier, S. Norsworthy, and G. Temes. *Delta-Sigma Data Converters*. IEEE Press Wiley, 2nd edition, 1997.
- [25] Minyoung Song, Young-Ho Kwak, Sunghoon Ahn, Wooseok Kim, ByeongHa Park, and Chulwoo Kim. A 10mhz to 315mhz cascaded hybrid pll with piecewise linear calibrated tdc. In *Custom Integrated Circuits Conference*, pages 243–246, Sep. 2009.
- [26] J. Sonntag and J. Stonick. A digital clock and data recovery architecture for multi-gigabit/s binary links. *IEEE Journal of Solid-State Circuits*, 41:1867–1875, Dec. 2006.
- [27] R. Staszewski, J. Wallberg, S. Rezeq, C. Hung, O. Eliezer, S. Vemulapalli, C. Fernando, K. Maggio, N. Barton, M. Lee, P. Cruise, M. Entezari, K. Muhammad, and D. Leipold. All-digital pll and transmitter for mobile phones. *IEEE Journal of Solid-State Circuits*, 40:2469–2482, Dec. 2005.
- [28] Robert Staszewski, Sudheer Vemulapalli, Prasant Vallur, John Wallberg, and Poras Balsara. 1.3 v 20 ps time-to-digital converter for frequency synthesis in 90-nm cmos. *IEEE Transactions on Circuits and Systems II: Express Briefs*, 53:220–224, Mar. 2006.

- [29] Matthew Straayer and Michael Perrott. A multi-path gated ring oscillator tdc with first-order noise shaping. *IEEE Journal of Solid-State Circuits*, 44:1089–1098, Apr. 2009.
- [30] J. Tierno, A. Rylyakov, and D. Friedman. A wide power supply range, wide tuning range, all static cmos all digital pll in 65nm soi. *IEEE Journal of Solid-State Circuits*, 43:42–51, Jan. 2008.
- [31] T. Toifl, C. Menolfi, P. Buchmann, M. Kossel, T. Morf, R. Reutemann, M. Ruegg, M. Schmatz, and J. Weiss. A 0.94-ps-rms-jitter 0.016-mm2 2.5ghz multiphase generator pll with 360° digitally programmable phase shift for 10-gb/s serial links. *IEEE Journal of Solid-State Circuits*, 40:2700–2712, Dec. 2005.
- [32] P. Wang, J. Zhan, H. Chang, and H. Chang. A digital intensive fractional-n pll and all-digital self-calibration schemes. *IEEE Journal of Solid-State Circuits*, 44:2182–2192, Aug. 2009.
- [33] R. Yang, K. Chao, S. Hwu, C. Liang, and S. Liu. A 155.52 mbps–3.125 gbps continuous-rate clock and data recovery circuit. *IEEE Journal of Solid-State Circuits*, 41:1380–1390, Jun. 2006.
- [34] W. Yin and P. Hanumolu. Digital phase-locking loop circuits. SRC report, pages 361–368, Sep. 2007.
- [35] W. Yin, R. Inti, and P. Hanumolu. A 1.6mw 1.6ps-rms-jitter 2.5ghz digital pll with 0.7-to-3.5ghz frequency range in 90nm cmos. In *Custom Integrated Circuits Conference*, pages 243–246, Sep. 2010.