Analysis and Design of Robust Multi-Gb/s Clock and Data Recovery Circuits by Rennie, David J.




presented to the University of Waterloo
in fulfillment of the
thesis requirement for the degree of
Doctor of Philosophy
in
Electrical and Computer Engineering
Waterloo, Ontario, Canada, 2007
© David J. Rennie 2007
I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including
any required final revisions, as accepted by my examiners.




Abstract
The bandwidth demands of modern computing systems have been continually increasing, and
the recent focus on parallel processing will only increase the demands placed on data
communication circuits. As data rates enter the multi-Gb/s range, serial data communication
architectures become attractive compared to parallel architectures. Serial architectures have long
been used in fibre optic systems for long-haul applications; in the past decade, however, there has
been a trend towards multi-Gb/s backplane interconnects. The integration of clock and data
recovery (CDR) circuits into monolithic integrated circuits is attractive as it improves performance
and reduces system cost; however, it also introduces new challenges, one of which is robustness.
In serial data communication systems the CDR circuit is responsible for recovering the data
from an incoming data stream. In recent years there has been a great deal of research into
integrating CDR circuits into monolithic integrated circuits. Most research has focused on
increasing the bandwidth of the circuits; however, in order to integrate multi-Gb/s CDR circuits,
robustness must be considered as well as performance.
In this thesis CDR circuits are analyzed with respect to their robustness. The phase detector
is a critical block in a CDR circuit and its robustness will play a significant role in determining the
overall performance in the presence of process non-idealities. Several phase detector architectures
are analyzed to determine the effects of process non-idealities. Static phase offsets are introduced
as a figure of merit for phase detectors and a mathematical framework is described to characterize
the negative effects of static phase offsets on CDR circuits. Two approaches are taken to improve
the robustness of CDR circuits. First, calibration circuits are introduced which correct for static
phase offsets in CDR circuits. Secondly, phase detector circuits are introduced which have been
designed to optimize both performance and robustness. Several prototype chips which implement
these schemes will be described and measured results will be presented. These results show
that while CDR circuits are vulnerable to the effects of process non-idealities, there are circuit
techniques which can mitigate many of these concerns.
Acknowledgements
A PhD is in many ways a solitary endeavor; however, it is never completed alone. I am indebted
to a great many people, and I thank all those who have helped me complete this journey.
I thank my supervisor Manoj Sachdev for many years of patience, advice and encouragement.
This thesis would not have been possible without his guidance and support, and I am deeply
grateful.
I have been blessed with many friends who have made the long road a little shorter, providing
both much needed help and much needed distractions at just the right times. I can never repay
them for all they have contributed to my life, but I can, and do, say a heartfelt thank you.
I have also been greatly blessed with a wonderful family, and I thank them. They have given
me unconditional love and unfailing support, put up with my erratic schedule, and only rarely
asked when I was going to be finished!
Finally, I give thanks to God, who set my feet upon this path, and was faithful in leading me
through; to Him be the glory.
The race is not to the swift
or the battle to the strong,
nor does food come to the wise
or wealth to the brilliant
or favor to the learned;





Contents
1 Introduction
1.1 CMOS Integration of CDR Circuits
1.2 Motivation
1.3 Thesis Overview
2 Clock and Data Recovery
2.1 Serial Digital Data Communication
2.1.1 Data Modulation
2.1.2 Data Encoding
2.2 Wireline Data Communication
2.2.1 Optical Data Communication Systems
2.2.2 Copper Data Communication Systems
2.3 CDR Circuit Architecture
2.3.1 Phase Detector
2.3.2 Charge Pump
2.3.3 Loop Filter
2.3.4 Voltage Controlled Oscillator
2.4 CML in Multi-Gb/s CDR Circuits
2.4.1 Architecture
2.4.2 CDR Circuits Using CML
2.5 Figures of Merit
2.5.1 CDR System Analysis
2.5.2 Jitter
2.5.3 Frequency Domain FOM
2.5.4 Time-Domain FOM
2.6 Modelling Binary Phase Detector Based CDR Circuits
2.6.1 Basic Binary Control: Example
2.6.2 Basic Binary Control: Analysis
2.6.3 Second Order CDR Systems
2.7 Summary
3 Robustness Considerations in CDR Circuits
3.1 Robustness
3.1.1 Definition of Robustness
3.1.2 Robustness in This Thesis
3.2 Mathematical Analysis of Static Phase Offset
3.2.1 Static Phase Offsets in a Phase Detector
3.2.2 Simulation Results
3.3 DFF Analysis
3.3.1 Metastability in a DFF
3.3.2 Hogge Phase Detector Gain
3.3.3 Alexander Phase Detector Gain
3.3.4 DFF Phase Detector Gain
3.4 Effect of Non-Idealities on Phase Detectors
3.4.1 Analysis Setup
3.4.2 Analysis Results
3.5 Summary
4 Calibration Techniques for Robust CDR Circuits
4.1 Calibration in CDR Circuits
4.1.1 Correction of Static Phase Offsets Using Calibration
4.1.2 Linear Phase Detectors in CDR Circuits
4.2 Offline Calibration Architecture
4.2.1 Calibration Algorithm
4.2.2 Implementation
4.2.3 Measured Results
4.3 Online Analog Calibration Architecture
4.3.1 Calibration Architecture
4.3.2 Implementation
4.3.3 Measured Results
4.4 Summary
5 Phase Detector Design for Robust CDR Circuits
5.1 Tri-State DFF Phase Detector
5.1.1 Architecture of Tri-State DFF Phase Detector
5.1.2 Robustness of the Tri-State DFF Phase Detector
5.1.3 Implementation
5.1.4 Measured Results
5.2 Pulsed DFF Binary Phase Detector
5.2.1 Monolithic Second Order Loop Issues
5.2.2 Proportional Path Optimization
5.2.3 Simulation Results
5.2.4 Implementation
5.2.5 Measured Results
5.3 Bandwidth Enhanced Linear Phase Detector
5.3.1 Robustness in the Hogge Phase Detector
5.3.2 Modified Hogge Phase Detector
5.3.3 Charge Pump Currents
5.4 Summary
6 Conclusions
6.1 Major Contributions
6.2 Future Work
List of Tables
2.1 Data rates for SONET/SDH
3.1 Simulation data for process normalization
4.1 Measured calibrated and uncalibrated BER for various data patterns
List of Figures
2.1 Example of several line codes which could be used for data communication
2.2 Architecture of an optical data communication system
2.3 Architecture of a repeater
2.4 Data communication over a backplane
2.5 Architecture of a backplane data communication system
2.6 Frequency attenuation over an FR4 backplane [1]
2.7 Architecture of a PLL based CDR circuit
2.8 Gain of a binary and linear phase detector
2.9 Architecture of the Hogge phase detector
2.10 Operation of the Hogge phase detector
2.11 A CDR circuit with a DFF as the phase detector
2.12 The sampling behavior of an Alexander phase detector
2.13 Architecture of the Alexander phase detector
2.14 Waveforms of full-rate and half-rate CDR circuits
2.15 Basic architecture of a charge pump
2.16 Current steering charge pump with differential inputs
2.17 Differential implementation of a charge pump
2.18 Second order low pass filter
2.19 Architecture of a CDR circuit with a dual loop structure and an external reference
2.20 Architecture of a four stage ring oscillator
2.21 (a) Ideal LC-tank (b) LC-tank with parasitics
2.22 LC-tank oscillator with NMOS cross-coupled pair
2.23 CML implementation of a buffer
2.24 CML implementation of four different logic gates
2.25 Frequency domain mathematical model of a CDR circuit
2.26 Output waves from a CDR circuit with a capacitor for a loop filter
2.27 Frequency response of a CDR circuit which has a capacitor as the loop filter
2.28 Frequency response of a CDR circuit which has a first order loop filter
2.29 An illustration of pattern dependent jitter
2.30 An illustration of pulse width distortion
2.31 The difference between RMS and peak-to-peak jitter measurements [2]
2.32 Jitter transfer mask for OC-192
2.33 Jitter amplification in a second order loop
2.34 Jitter tolerance mask for OC-192
2.35 Jitter tolerance using a first order approximation
2.36 Relationship between phase noise and jitter
2.37 Frequency domain mathematical model of a CDR circuit with noise
2.38 Frequency response of a CDR circuit to noise
2.39 A CDR architecture where the binary phase detector directly controls the VCO
2.40 Waveforms illustrating the ideal response of the first order loop
2.41 Waveforms used to determine the jitter transfer response of the first order loop
2.42 Waveforms used to determine the jitter tolerance response of the first order loop
2.43 The jitter transfer and jitter tolerance response of the first order loop
2.44 Three architectures which increase the order of the first order loop
2.45 A second order loop with a capacitor as the loop filter
2.46 Waveforms showing the integral response provided by the capacitor
2.47 A second order loop with a first order RC loop filter
2.48 Relationship between jitter tolerance, jitter generation and ∆fbb
2.49 Response of two-state and tri-state binary phase detectors
2.50 V/I circuit used by Lee et al.
2.51 Example of a circuit with tri-state frequency control
3.1 Some sources and effects of process variations [3] [4]
3.2 Phase detector performance and robustness goals
3.3 Eye diagram of a CDR circuit when Φspo = 0
3.4 Eye diagram of a CDR circuit when Φspo ≠ 0
3.5 Maximum allowable RMS jitter in the presence of static phase offsets
3.6 Output BER with respect to Jrms and Φspo
3.7 The effect of a 10ps static phase offset on the BER of a 10Gb/s CDR circuit
3.8 Simulated maximum input jitter vs Φspo for a 5Gb/s CDR circuit
3.9 Schematic of a CML based DFF with waveforms illustrating its functionality
3.10 Value of the C-Q delay with respect to the input phase error
3.11 The operation of a Hogge phase detector given non-ideal DFFs
3.12 Waveforms for an Alexander phase detector including C-Q delay
3.13 Process variation simulation results for the three phase detectors
3.14 Summary showing the overall effects of process on Φspo
4.1 Hogge phase detector operation when Φspo = 0 and when Φspo ≠ 0
4.2 The effect of UP and DOWN charge pump currents on Φspo
4.3 Block diagram of the proposed offline calibration algorithm
4.4 Simulated output waveforms for standard and symmetric XOR gate
4.5 Schematic of the symmetric XOR gate
4.6 Schematic of modified charge pump
4.7 Delay through the delay line as the input current is varied
4.8 Architecture of the DAC circuit used in this design
4.9 The output current and error of the DAC as the codes are stepped
4.10 Micrograph of the fabricated CDR circuit
4.11 Spectrum of the VCO locked to a 5Gb/s PRBS
4.12 Measured jitter of the locked oscillator
4.13 Output clock and data waveforms
4.14 Block diagram of the online calibration architecture
4.15 Simulated waveforms showing the calibration loop locking
4.16 Eye diagram showing the clock and data signals before and after calibration
4.17 The schematic of a dual edge triggered CML DFF
4.18 Simple charge pump for the calibration circuit
4.19 Micrograph of the fabricated CDR circuit
4.20 5Gb/s clock and data waveforms before and after calibration
5.1 Architecture of the tri-state binary phase detector
5.2 CDR circuit waveforms given a tri-state binary phase detector
5.3 Detailed response of the Alexander phase detector over corners
5.4 Detailed response of the tri-state DFF phase detector over corners
5.5 Comparison of Tri-state DFF vs Alexander Pulse Widths
5.6 Architecture of the modified charge pump
5.7 Physical structure of an AMOS varactor
5.8 Micrographs of the CDR circuits
5.9 Frequency spectrum of the recovered clock signal
5.10 Jitter histogram of the recovered clock for two data patterns
5.11 A CDR circuit with a parasitic capacitor creating a third order response
5.12 Architecture of a CDR circuit with separate proportional and integral paths
5.13 Architecture of the proposed phase detector
5.14 Ideal waveforms if a capacitor is used as a loop filter
5.15 A 2nd order loop filter and the waveforms resulting from a current pulse
5.16 Relationship between t1 and tpulse given a 2nd order filter
5.17 Matlab plot showing the simulated ideal current pulse characteristic
5.18 Jitter transfer for DFF and pulsed DFF phase detectors
5.19 Jitter transfer for various values of Jin for both phase detectors
5.20 Simulation results showing jitter tolerance for both phase detectors
5.21 Simulated gain of the proposed phase detector
5.22 Schematic of circuit used to generate the current pulse
5.23 Schematics of both the digital and analog current pulse circuits
5.24 The range of simulated current pulses for the RC scheme
5.25 Simulated eye diagram for both a regular and pulsed DFF phase detector
5.26 Micrograph of the proposed pulsed DFF phase detector
5.27 Simulated and measured VCO frequency
5.28 Simulated capacitance of AMOS varactors
5.29 Spectrum of the 15GHz output clock signal
5.30 Jitter histogram for the output clock for both phase detectors
5.31 Output data signal
5.32 Simulated response of the original Hogge phase detector over corners
5.33 Architecture of a modified Hogge phase detector
5.34 Ideal transfer characteristic for the standard and modified Hogge
5.35 Simulated response of the modified Hogge phase detector over corners




Chapter 1
Introduction
Serial data communication systems have been used for decades to transmit large quantities of
information over a single link. However, over the past ten years there has been a significant shift
in the high-speed serial data communication market from optical networks to backplane systems.
A decade ago optical systems dominated, focusing on long-haul applications where data is sent
vast distances over optical fibre. These systems were very expensive, using optical components
and discrete integrated circuits (ICs) such as trans-impedance amplifiers (TIAs), clock and data
recovery (CDR) circuits and serializer/deserializer (SERDES) circuits. These circuits were
implemented in separate ICs and usually fabricated in non-CMOS processes such as GaAs and SiGe.
At that time, data communication over backplanes was implemented using wide parallel buses such
as IDE, PCI and AGP. The data rates of these backplane buses were relatively low, usually less
than 100Mb/s. High-speed interconnects were limited to high-end workstations; for example, Cray
developed the High Performance Parallel Interface (HIPPI) bus.
While optical networks have grown over the past ten years, the market for backplane serial
interconnects has grown at a much higher rate, such that they now dominate the market for
serial data communication. Companies have had 40Gb/s optical devices available for a few years;
however, an overaggressive previous investment in capacity, together with advances in technology
such as wavelength-division multiplexing (WDM), has resulted in most systems not having data
rates any greater than 10Gb/s. Companies have focused on reducing the cost of optical systems
through integration and through a cautious migration to CMOS processes. CMOS processes seem an
attractive option in terms of cost and integration; however, the quality required for long-haul
systems is still high enough that many companies have stayed with GaAs and SiGe processes.
There has been some research into integrating fibre optics into backplanes in order
to enable optical links over a backplane; however, the cost of such systems is orders of magnitude
greater than that of systems which operate exclusively in the electrical domain. As such, this
approach is unlikely to be implemented in the near future.
The growth in backplane serial interconnects has come about due to the increasing core
frequency of microprocessors along with the growth of multi-processor computer systems. As the
core frequency of microprocessors increases they require access to more data. In order to increase
bandwidth the width of a parallel bus can be increased; however, at a certain point this becomes
prohibitive and serial I/Os become attractive [5]. Several backplane serial data communication
standards have come to the forefront over the past decade, namely RapidIO, PCI-Express and
Fibre Channel. Some of these are aimed more at chip-to-chip communication (e.g. RapidIO and
PCI-Express) while others are aimed more at networking (e.g. Fibre Channel). Generally speaking,
wireline standards for the backplane have less stringent performance requirements than
optical communication standards such as SONET/SDH.
1.1 CMOS Integration of CDR Circuits
As of 2007, most leading-edge CDR circuits are not implemented using standard CMOS technology,
but rather in less mainstream processes. At present, leading-edge commercial CDR
circuits have data rates in the range of 10Gb/s to 40Gb/s and are usually fabricated in SiGe,
GaAs or InP processes. While these processes have attractive features, they are more expensive
than CMOS and lack the ease of integration inherent to CMOS processes. Integration is becoming
increasingly important as IC designers place as many circuits as possible on the same die in
order to reduce the system cost. Systems which use a CDR circuit implemented in a non-CMOS
technology must have a wide parallel bus connecting the CDR IC to a data processing IC.
Integration of a serial transmitter and receiver into a CMOS chip which contains data processing
has the potential to reduce latency, system cost and power consumption while providing the
processing cores with the necessary data.
Integration has been the focus of a great deal of research in the past decade. The scaling of
CMOS has had a significant effect on CDR circuits, in that it has enabled the integration of CDR
circuits operating at multi-Gb/s data rates into monolithic ICs [6]. However, the integration of
CDR circuits into a CMOS environment is not a trivial process. While CMOS is an excellent
technology for digital circuits, it is less conducive to the implementation of high-speed
mixed-signal circuits. One reason for this is that CMOS transistors have less gain and a lower
operating frequency than their bipolar counterparts. A second problem is that the lossy nature of
substrates in CMOS processes allows noise to couple into sensitive circuits. A third problem is
that transistors in CMOS processes are often not properly characterized for operation at
multi-Gb/s data rates. A fourth problem is that integrating a CDR circuit into a monolithic IC
requires the CDR circuit to be robust. The robustness of CDR circuits is a topic which has received
very little attention; however, it is a problem which goes hand-in-hand with integration.
1.2 Motivation
In any high-speed serial data communication system, CDR circuits play a key role. In this thesis
the importance of robust CDR circuit design is discussed. This topic is becoming significant both
due to the drive to integrate entire systems onto a single die and due to the aggressive scaling of
CMOS technology. Integration makes it increasingly important for circuits to be robust, while
the increased process variation associated with scaling makes robust circuits increasingly difficult
to realize. In this thesis the effects of process non-idealities on CDR circuits are examined, and
circuits designed to alleviate those effects are introduced.
1.3 Thesis Overview
This thesis first examines CDR circuits from a traditional point of view, then analyzes the
robustness of CDR circuits, and finally proposes methods to reduce the sensitivity
of CDR circuits to process non-idealities. Chapter 2 provides background information on CDR
circuits, examining them at both the architectural and circuit level. Figures of merit used to
measure the performance of CDR circuits are discussed, and mathematical models of CDR
circuits are then derived. Robustness is an important concept in electrical engineering; however,
it can mean a number of different things depending on the context. Chapter 3 first defines
robustness and then examines how it affects CDR circuits, at both the mathematical level and the
circuit level. Calibration is used in many different areas of circuit design in order to optimize
a system. In Chapter 4 a method to calibrate CDR circuits is introduced and two schemes are
proposed. The first is an offline calibration scheme and the second is an online calibration
scheme. Robustness is certainly an important metric; however, the performance of CDR circuits
cannot be sacrificed. As such, Chapter 5 proposes three phase detector designs which optimize both
the performance and robustness of CDR circuits. Finally, Chapter 6 summarizes the thesis, states
the major contributions and points to potential future work.
Chapter 2
Clock and Data Recovery
In this chapter a background on wireline data communication is given. First, the different ways of
representing data in the electrical domain are described. Next, optical and backplane networks are
compared. The CDR circuit is a key part of serial interconnects and is described in detail.
Figures of merit used to characterize the performance of CDR circuits
are then introduced. A mathematical model for the CDR circuit is described and then used
to derive equations for various figures of merit. This is done first for a CDR circuit based on a
linear phase detector; CDR circuits which use binary phase detectors are then analyzed.
2.1 Serial Digital Data Communication
In modern computing systems data is moved to and from storage devices, memory, processing
units and I/O devices, and through networks. There are numerous ways for data communication
systems to be implemented. Based on the physical medium, data communication can be either
wireline or wireless. Data communication can be implemented using a bus or over a single
channel. There are also numerous methods of putting digital data into the electrical domain,
which is known as modulation. Data can also be encoded, usually in order to make it easier to
receive, or in order to enable error correction.
Figure 2.1: Example of several line codes which could be used for data communication
In this thesis only wireline data communication over a single channel is discussed. The following
sections describe modulation and encoding, and also explain what kind of data signal is assumed
in this thesis.
2.1.1 Data Modulation
While modulation is more commonly associated with wireless data communication, it is merely a
representation of data in the electrical domain. Modulation in the context of wireline data
communication is also known as line coding. Examples of a few different line codes are shown in
Figure 2.1.
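As a rough illustration of the three line codes described in this section (NRZ, RZ and 4-PAM), the following Python sketch encodes a bit sequence into sampled levels and reports the repetition frequency of the resulting waveform. It is a toy model under stated assumptions, not part of the original work: the function names, the two-samples-per-bit resolution, the normalized levels (0/1 for NRZ, ±1 for RZ) and the natural-binary 4-PAM mapping are all chosen here purely for illustration.

```python
def nrz(bits, samples_per_bit=2):
    """NRZ: hold level 1.0 or 0.0 for the entire bit period."""
    out = []
    for b in bits:
        out.extend([float(b)] * samples_per_bit)
    return out


def rz(bits, samples_per_bit=2):
    """Bipolar RZ: +1 ('1') or -1 ('0') for the first half period, then 0."""
    if samples_per_bit % 2:
        raise ValueError("samples_per_bit must be even for RZ")
    half = samples_per_bit // 2
    out = []
    for b in bits:
        level = 1.0 if b else -1.0
        out.extend([level] * half + [0.0] * half)
    return out


def pam4(bits, samples_per_bit=2):
    """4-PAM: each bit pair selects one of four levels, held for two bit periods.

    The natural-binary mapping (00->0, 01->1, 10->2, 11->3) is an assumption.
    """
    if len(bits) % 2:
        raise ValueError("4-PAM needs an even number of bits")
    out = []
    for i in range(0, len(bits), 2):
        level = float(2 * bits[i] + bits[i + 1])
        out.extend([level] * (2 * samples_per_bit))
    return out


def fundamental_hz(samples, sample_rate_hz):
    """Repetition frequency of a periodic sample list; 0.0 if the signal is DC."""
    n = len(samples)
    if all(s == samples[0] for s in samples):
        return 0.0
    # Smallest period (in samples) that divides n and reproduces the sequence.
    period = next(
        p for p in range(1, n + 1)
        if n % p == 0 and all(samples[i] == samples[i % p] for i in range(n))
    )
    return sample_rate_hz / period


BIT_RATE = 10e9                       # 10Gb/s, as in the NRZ example
SAMPLES_PER_BIT = 2                   # illustrative resolution
SAMPLE_RATE = BIT_RATE * SAMPLES_PER_BIT

alternating = [1, 0] * 4              # 10101010
all_ones = [1] * 8                    # long run of '1'

print(fundamental_hz(nrz(alternating), SAMPLE_RATE))         # 5e9: half the data rate
print(fundamental_hz(nrz(all_ones), SAMPLE_RATE))            # 0.0: NRZ run is DC
print(fundamental_hz(rz(all_ones), SAMPLE_RATE))             # 1e10: RZ toggles at the data rate
print(fundamental_hz(pam4([0, 0, 1, 1] * 2), SAMPLE_RATE))   # 2.5e9: one fourth the data rate
```

In this sketch the alternating NRZ pattern repeats at 5GHz for a 10Gb/s stream, a run of ones is DC under NRZ but still toggles at the data rate under RZ, and the fastest-toggling 4-PAM pattern repeats at one fourth of the data rate.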
1. Non-Return-to-Zero
Non-return-to-zero (NRZ) is the simplest line code. With NRZ the only possible values of
the output signal are ‘0’ or ‘1’, which correspond to the value of the input data for the
entire period. While this type of encoding is simple it has two main disadvantages. The
first is that the output data signal has no frequency content at the data rate. This can
be understood by thinking of a 10Gb/s data stream. The highest frequency content will
happen when the data is equal to 10101010, however this is equivalent to a clock signal
of 5GHz. In some ways this is advantageous as the bandwidth requirements are halved,
however it makes data synchronization more difficult. The second disadvantage of NRZ
encoding is that for long runs of ‘1’ or ‘0’ the output data will be DC. This also makes
synchronization difficult as DC signals are blocked by high-pass filtering.
2. Return-to-Zero
Return-to-zero (RZ) is known as a bi-polar encoding scheme, and the output signal has
three possible levels. The output signal corresponds to the value of the data in that ‘0’
corresponds to a value of -V and ‘1’ corresponds to +V, however, this is only true for half
the period. In the second half of the period the data signal is equal to zero. This makes
data synchronization much easier as the signal has spectral power at the frequency equal
to the data rate. Another benefit of the RZ code is that there are always transitions, which
means that the data signal will never be blocked by high-pass filtering with long runs of ‘1’
or ‘0’ (as is the case with NRZ). The significant downside of this line code is that it requires
twice the bandwidth of NRZ.
3. Pulse-Amplitude Modulation
Finally, pulse-amplitude modulation (PAM) is an example of a multi-level line code. In
PAM the output data signal has several possible levels, each of which represents more than
one bit. In Figure 2.1 a 4-PAM encoding scheme is shown, with the corresponding four
possible output levels. For every two bit periods the output signal holds a single value which
represents two data bits. This reduces the required bandwidth to one fourth the data rate,
however, the complexity of the transmitter and receiver is greatly increased. Also, the
division of the output signal into multiple levels increases the required SNR, which in turn
negates some of the benefits of the reduced bandwidth.
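The three line codes above can be made concrete with a short sketch. The following Python example is illustrative only: the function names, the use of two samples per bit period, the ±1 levels for RZ and the mapping of bit pairs onto 4-PAM levels are assumptions made for this example and are not taken from Figure 2.1.

```python
def nrz(bits):
    """NRZ: hold the bit value for the whole bit period (two samples per bit)."""
    out = []
    for b in bits:
        out += [b, b]
    return out


def rz(bits, v=1):
    """Bipolar RZ: +v or -v for the first half of the period, zero for the second half."""
    out = []
    for b in bits:
        out += [v if b else -v, 0]
    return out


def pam4(bits):
    """4-PAM: each pair of bits selects one of four levels, held for two bit periods."""
    if len(bits) % 2:
        raise ValueError("4-PAM needs an even number of bits")
    out = []
    for i in range(0, len(bits), 2):
        level = 2 * bits[i] + bits[i + 1]  # assumed natural binary mapping: 00->0 ... 11->3
        out += [level] * 4                 # two bit periods at two samples per bit
    return out


data = [1, 0, 1, 1, 0, 0]
print(nrz(data))   # [1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
print(rz(data))    # [1, 0, -1, 0, 1, 0, 1, 0, -1, 0, -1, 0]
print(pam4(data))  # [2, 2, 2, 2, 3, 3, 3, 3, 0, 0, 0, 0]
```

All three outputs have the same length, which makes it easy to see that the 4-PAM signal can change level at most once every two bit periods, whereas the RZ signal changes level twice in every bit period.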
In spite of its downsides, the simplicity of the NRZ line code and its reduced bandwidth relative to RZ
make it the standard in multi-Gb/s data communication [7]. Recently there has been a lot
of research into multi-level line codes [8, 9, 10, 11], however thus far the increased complexity
required seems to outweigh the benefits of reduced bandwidth. In this thesis all data signals are
assumed to be transmitted using NRZ.
2.1.2 Data Encoding
In wireline systems the data which is being transmitted is often encoded. Encoding is a way of
mapping one set of data onto another set of data. The purpose is that the mapped data will
have advantages over the original data scheme. One benefit of common encoding schemes is
that long strings of ‘1’s or ‘0’s are eliminated. Another benefit is that the encoded data signal
can maintain a DC balance, similar to the RZ line code. Encoding the data allows the new
data signal to maintain DC balance by guaranteeing that there are an equal number of ‘1’s and
‘0’s. The most common coding scheme is known as 8B/10B. In this scheme every eight bits of
data are mapped onto ten bits. These extra two bits are known as the encoding overhead. The
8B/10B code provides the benefits of guaranteed transitions and DC balance described above.
Another common scheme is known as 64B/66B. This encoding scheme provides the same benefits
as 8B/10B, however with only about 3% overhead, as opposed to 25% overhead for 8B/10B.
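The overhead figures quoted above follow directly from the block sizes of the two codes. The short Python sketch below reproduces the arithmetic; the function names are hypothetical, and the overhead is expressed relative to the payload, which is the convention that gives 25% for 8B/10B.

```python
def coding_overhead(payload_bits, coded_bits):
    """Overhead expressed relative to the payload (2 extra bits per 8 gives 0.25)."""
    return (coded_bits - payload_bits) / payload_bits


def line_rate(payload_rate, payload_bits, coded_bits):
    """Serial line rate needed to carry a given payload rate after encoding."""
    return payload_rate * coded_bits / payload_bits


print(coding_overhead(8, 10))    # 0.25
print(coding_overhead(64, 66))   # 0.03125
print(line_rate(10.0, 64, 66))   # 10.3125 (Gb/s on the line for 10 Gb/s of payload)
```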
In this thesis it is assumed that there is some level of coding (e.g. 8B/10B) in order to
maintain DC balance and to provide a minimum number of transitions. Strictly speaking if the
phase detector in the CDR circuit is a tri-state phase detector, reasonably long strings of ‘0’s or
‘1’s can be tolerated, however virtually all data communication systems implement some form of
encoding.
Figure 2.2: Architecture of an optical data communication system
2.2 Wireline Data Communication
There are two main categories of wireline data communication systems, which are distinguished
by their transmission medium. Optical data communication systems transmit data over fibre
optic cables, whereas copper systems transmit the data in the electrical domain using a copper
medium. CDR circuits are only one component in wireline serial data communication systems,
however they play a crucial role in both optical and backplane systems. The work in this thesis
is not exclusively applicable to one system or the other, however the robustness considerations
are generally more relevant to backplane systems.
2.2.1 Optical Data Communication Systems
Optical data communication systems are designed to carry a large amount of data long distances
over optical links. The architecture of an optical system is shown in Figure 2.2. In this figure
there are two domains, the electrical domain and the optical domain. The electrical domain can
be further divided into a transmitter and a receiver. The transmitter is composed of a PLL and
Figure 2.3: Architecture of a repeater
a serializer (SER). The PLL provides the serializer with a reference clock and the serializer uses
it to convert the incoming parallel data into a serial data stream. The receiver is composed of
a CDR circuit and a de-serializer. The CDR circuit recovers the data stream by generating a
phase aligned clock signal and then using it to retime the data. The de-serializer then uses the
recovered clock signal to convert the serial data into a set of parallel data signals.
Optical fibre is a low-loss medium which can transmit data over long distances. However, only
light can travel over fibre optic cables, hence electrical signals must first be converted into optical
signals. An optical diode placed after the transmitter is used to convert the electrical signal into
an optical signal. This optical signal travels over an optical fibre until it reaches a receiver. On
the receive side, a photo-diode converts the light into an electrical signal. This signal is usually
very weak and must be amplified by a transimpedance amplifier (TIA). The amplified signal is then sent to the receiver, which recovers
the data. While the optical fibre is a low-loss medium, it is not lossless [12]. In order to maintain
a minimum signal to noise ratio (SNR), repeaters were traditionally used. In long haul networks
where data is sent over hundreds of kilometers, repeaters could be placed every ten kilometers.
The architecture of a repeater is shown in Figure 2.3. The repeater converts the optical signal to
an electrical signal where the data is recovered and re-timed using a CDR circuit. The data is
then converted back into an optical signal and sent out again over the optical fibre. The use of
repeaters in optical networks led to some specific CDR circuit requirements which do not exist
for backplane systems, specifically limits on jitter amplification. Optical amplifiers may be used instead of
repeaters to compensate for loss over the fibre optic cable. An optical amplifier only amplifies the
incoming signal, it does not regenerate the data as a repeater does. However, optical amplifiers
Table 2.1: Data rates for SONET/SDH









are significantly cheaper and erbium doped fibre amplifiers (EDFAs) have largely made repeaters
obsolete.
Most optical data communication systems are designed to conform to the SONET/SDH phys-
ical layer standards. SONET and SDH are identical, however SONET is the name for the North
American standard whereas SDH is the name for the international standard. Table 2.1 lists the
comparable names for SONET and SDH, and gives the associated data rates. The SONET OC-192
standard, which has a data rate of approximately 10Gb/s, is used throughout this thesis as a
reference standard.
2.2.2 Copper Data Communication Systems
While optical systems are vital to long distance data communication, the cost associated with
optical components makes them unattractive for data communication over a short distance. For
these situations, the data is kept in the electrical domain and is transferred over a copper medium.
One of the most common examples of this is in chip-to-chip communications where data is trans-
ferred over a backplane. Figure 2.4 shows the scenario where the data is transferred from one
line card to another over a backplane.
Figure 2.4: Data communication over a backplane
Figure 2.5: Architecture of a backplane data communication system
Modern server systems may be comprised of many line cards interconnected by way of a
backplane. In these situations the cost per line card becomes very important, as does the amount
of power each line card consumes. The architecture of a generic copper data communication
system is shown in Figure 2.5. This block diagram is very similar to that of the optical system,
the main differences being the channel and the transmitter / receiver. In the optical system,
devices which can convert between the electrical and optical domains are needed, whereas in the
copper based system pre-emphasis and equalizer circuits are often used in order to compensate
for the lossy channel. As signals cross the copper channel they experience a frequency-dependent
Figure 2.6: Frequency attenuation over an FR4 backplane [1]
attenuation. In order to flatten the overall frequency response, equalizer circuits are designed to
perform the inverse transfer function of the channel [13].
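The idea of an equalizer as the inverse of the channel can be illustrated numerically. The sketch below assumes a single-pole low-pass model for the channel, which is a deliberate simplification and not a model of a real FR4 trace; the function names and the corner frequency are hypothetical.

```python
def channel(f, f_c):
    """Assumed single-pole low-pass model of a lossy channel (illustrative only)."""
    return 1 / (1 + 1j * f / f_c)


def equalizer(f, f_c):
    """Ideal equalizer: the inverse of the assumed channel transfer function."""
    return 1 + 1j * f / f_c


f_c = 2e9  # hypothetical channel corner frequency in Hz
for f in (0.5e9, 2e9, 5e9):
    h = channel(f, f_c)
    total = h * equalizer(f, f_c)
    print(f"{f / 1e9:.1f} GHz: channel {abs(h):.3f}, equalized {abs(total):.3f}")
```

The channel magnitude falls as the frequency increases, while the magnitude of the cascaded response stays at one. A practical equalizer can only approximate this inverse over a limited bandwidth.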
At multi-Gb/s data rates, maintaining signal integrity on the backplane becomes difficult. Signal
attenuation due to the dielectric loss of most backplane materials becomes very large. Figure
2.6 illustrates the attenuation over an FR4 backplane for various trace lengths. As can be seen,
the attenuation as data rates increase into the multi-Gb/s region is significant. Also, reflections
caused by impedance mismatches on the signal path, crosstalk and inter-symbol interference all
add noise and reduce the performance. Board designers can mitigate some of these effects with
careful layout and better board materials, and circuit designers can mitigate some of these effects
using equalization. In recent years there has been a great deal of research into equalization for
multi-Gb/s CDR circuits due to the need to compensate for the attenuation of the backplane at
these data rates [14, 15].
Figure 2.7: Architecture of a PLL based CDR circuit
2.3 CDR Circuit Architecture
The function of a CDR circuit is to receive a serial data stream, synchronize an internal clock
to the data signal, and then use that clock to retime the data. The output of a CDR circuit
is logically identical to the input signal, however the signal-to-noise ratio (SNR) of the output
signal is increased. It is possible to build CDR circuits with either open-loop or closed-loop
architectures, however closed-loop architectures dominate in monolithic implementations. One
significant problem with open-loop CDR circuits is that they generally require the use of high-
Q filters, which cannot be integrated into a CMOS environment [5]. The closed-loop CDR
architecture is often referred to as a phase-locking CDR circuit, and its architecture is similar
to that of the phase-locked loop (PLL) circuit [16]. The closed-loop CDR circuit is much easier
to integrate than the open-loop CDR circuit, and as such in this thesis any reference to a CDR
circuit implies the phase-locking CDR circuit topology. The architecture of a PLL based CDR
circuit is given in Figure 2.7. The phase detector detects phase errors between the incoming
data signal and the internal clock signal and supplies correction information to the charge pump.
The charge pump takes the correction information and adds charge to or subtracts charge from
the loop filter. The loop filter plays an important role in defining the frequency response of the
system. The voltage on the loop filter controls the VCO, and the output of the VCO is sent to
Figure 2.8: Gain of a binary and linear phase detector
both the phase detector and also to the retiming circuit.
2.3.1 Phase Detector
There are two basic types of phase detectors used in CDR circuits, linear phase detectors and
binary phase detectors [17, 18]. These circuits are named based on how they respond to phase
errors. A linear phase detector generates correction information which is proportional to the
size of the phase error. On the other hand, a binary phase detector applies correction of the
same magnitude regardless of how large or small the phase error is. The ideal phase detector
gain of linear and binary phase detectors is illustrated in Figure 2.8. The x-axis represents the
phase error between the clock and data signals, where ∆Φ = 0 represents the situation where the
signals are perfectly aligned. The y-axis represents the output of the phase detectors, which will
be discussed next.
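The two ideal characteristics of Figure 2.8 can be written as two short functions. This is a minimal sketch under stated assumptions: the phase error and the output are in arbitrary units, the function names are hypothetical, and the binary detector is assumed to give no correction when the phase error is exactly zero.

```python
def linear_pd(phase_error, gain=1.0):
    """Ideal linear phase detector: output proportional to the phase error."""
    return gain * phase_error


def binary_pd(phase_error, magnitude=1.0):
    """Ideal binary phase detector: fixed-magnitude output, only the sign matters."""
    if phase_error > 0:
        return magnitude
    if phase_error < 0:
        return -magnitude
    return 0.0  # assumed: no correction when perfectly aligned


for dphi in (-0.4, -0.01, 0.01, 0.4):
    print(dphi, linear_pd(dphi), binary_pd(dphi))
```

For a small and a large phase error of the same sign the binary detector returns the same value, whereas the output of the linear detector scales with the error.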
Linear Phase Detector
A linear phase detector is also known as a proportional phase detector, as it corrects for phase
errors in proportion to their magnitudes. The proportional nature of the phase detector gain
leads to low activity on the VCO control voltage when the CDR circuit is in the locked condition,
Figure 2.9: Architecture of the Hogge phase detector
which in turn leads to good jitter performance [16]. The linear response of this circuit allows for
simple formulation of loop equations, which is very helpful for system analysis (this will become
clearer later in this chapter). Virtually all linear phase detectors operate in a similar manner
to a Hogge phase detector, hence it will be used as the reference linear phase detector [19]. The
architecture of the Hogge phase detector is shown in Figure 2.9. The Hogge phase detector
generates UP and DOWN pulses which control the charge pump. Figure 2.10 illustrates the
logical operation of the Hogge phase detector in two situations: when the CDR circuit is in the
ideal locked state and when there is a phase error. When the CDR circuit is perfectly locked the
clock and data are synchronized and the UP and DOWN pulses are exactly equal, as can be seen
in Figure 2.10a. With UP and DOWN pulses of equal width an equal amount of charge is added
to and subtracted from the loop filter. As such the voltage on the loop filter has no net change.
Figure 2.10b illustrates the situation where the clock is leading the data. With this phase error,
the width of the UP pulse is reduced such that there will be a net loss of charge from the loop
filter. This correction will adjust the phase of the VCO so as to correct the phase error.
In the Hogge phase detector the DOWN pulse is generated by performing the logical XOR
Figure 2.10: Operation of the Hogge phase detector
over the outputs of the two DFFs, therefore it has a constant width of half a period regardless
of the phase error between the clock and data signals. The UP pulse is generated by performing
the logical XOR over the input data signal and the output of the first DFF. While the output
of the DFF is phase aligned with the clock, the input data signal is not. Therefore it is the UP
pulse which contains the information as to whether the clock is leading or lagging the data and
by what amount. The plot in Figure 2.8a shows the ideal relationship between the input phase
error and the output correction. The total correction is the difference between the width of the
UP and DOWN pulses and hence as the magnitude of the phase error changes, the magnitude of
the correction applied changes proportionally.
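The proportional behaviour described above can be captured in an idealized behavioural model. The sketch below is an assumption-laden illustration rather than a circuit-level description: the function names, the sign convention for the phase error and the charge pump current are hypothetical, and the UP pulse is assumed to be exactly half a period wide when the loop is locked.

```python
def hogge_pulse_widths(bit_period, clock_lead):
    """Idealized UP and DOWN pulse widths (seconds) for one data transition.

    clock_lead > 0 means the clock leads the data (assumed sign convention).
    """
    down = bit_period / 2              # constant reference pulse
    up = bit_period / 2 - clock_lead   # narrows when the clock leads the data
    return up, down


def net_charge(bit_period, clock_lead, i_cp):
    """Net charge delivered to the loop filter for one data transition."""
    up, down = hogge_pulse_widths(bit_period, clock_lead)
    return i_cp * (up - down)


T = 100e-12    # bit period of a 10Gb/s data stream
I_CP = 100e-6  # hypothetical charge pump current
print(net_charge(T, 0.0, I_CP))      # locked: no net charge
print(net_charge(T, 10e-12, I_CP))   # clock leads: net loss of charge
print(net_charge(T, -10e-12, I_CP))  # clock lags: net gain of charge
```

Doubling the phase error doubles the net charge, which is the proportional response expected of a linear phase detector.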
While linear phase detectors have many attractive features, they tend to be more difficult
to successfully implement in CMOS processes. One reason for this is that creating small UP
and DOWN pulses is difficult at high data rates. Another reason is that linear phase detectors
are more sensitive to process non-idealities than binary phase detectors. For these reasons, the
majority of commercial designs use binary phase detectors [20].
Binary Phase Detector
The name of the linear phase detector alludes to its proportional nature and similarly the name
of the binary phase detector describes the nature of its correction. In a binary phase detector
there are only two states, which correspond to whether the clock is leading or lagging the data.
With a binary phase detector no information is generated as to the magnitude of the phase error.
There are numerous binary phase detector architectures, however in this section only two will be
considered: the simplest and the most common. A very simple binary phase detector is a DFF
where the data signal is used to sample the input clock signal. A CDR circuit which uses a DFF
as a phase detector is illustrated in Figure 2.11 [16, 21]. The advantages of this phase detector
lie primarily in its simplicity, as a single DFF comprises the entire phase detection circuitry.
This simplicity means that the circuit is robust and has little sensitivity to process non-idealities,
however, the DFF phase detector also has several disadvantages.
One problem with the DFF phase detector is the lack of integrated retiming, requiring the
second DFF seen in Figure 2.11. The Hogge phase detector and the Alexander phase detector
(which will be discussed next) both have integrated retiming. This is important, as it guarantees
Figure 2.11: A CDR circuit with a DFF as the phase detector
that there is no skew between the retiming clock signal and the phase aligned clock signal, as they
are one and the same. A DFF phase detector based CDR circuit needs a separate retiming circuit,
and care must be taken to ensure that clock skew is not a problem. A second problem with the
DFF phase detector is the fact that it supplies correction information to the charge pump even
when there are no data transitions. The phase detector can only determine the phase relationship
between the clock and data when there is a data transition. This means that when there are
long strings of ‘1’s or ‘0’s, the CDR circuit continues to implement the last known correction.
This information may not be correct and this could cause the CDR circuit to lose lock. Many
binary phase detectors have a third state during which no information is sent to the charge pump.
These phase detectors are known as tri-state or ternary phase detectors. How critical the lack of
this third state is depends on the system implementation. In many data communication systems
the data is encoded such that there is a guaranteed minimum number of transitions, in which
case there would be a limited number of repeated ‘1’s or ‘0’s and the absence of a tri-state phase
detector would not limit the performance.
Figure 2.12: The sampling behavior of an Alexander phase detector
Figure 2.13: Architecture of the Alexander phase detector
The Alexander phase detector is another binary architecture, and it is the one most commonly
implemented in multi-Gb/s CDR circuits [22]. This circuit samples the data signal at three points
and uses logic to determine whether the data is leading or lagging the clock. Figure 2.12 illustrates
the result of sampling in two cases: when the clock leads the data and when the clock lags the
data. DFFs are used to acquire the sample points S0, S1, S2, and XOR gates are used to determine
the phase error. The architecture of the Alexander phase detector is shown in Figure 2.13. One
important difference between the DFF phase detector and the Alexander phase detector is the
tri-state nature of the Alexander phase detector. This means that the Alexander phase detector
actually has three logical states, as opposed to two. The three states are: data leading clock,
data lagging clock and no transition. The ‘no transition’ state is important, as it means that
when there are no transitions in the data, the phase detector will not provide any correction
information to the charge pump. This allows the CDR circuit to stay locked, even when there are
long strings of ‘0’s or ‘1’s. As mentioned before, another benefit of the Alexander phase detector
is the integrated retiming. This can be seen in Figure 2.13 where the output of DFF2 is taken
as the retimed data signal. The primary downside of the Alexander phase detector as compared
to the DFF phase detector is its complexity. The Alexander phase detector requires four DFFs
and two XOR gates, and this extra logic consumes power and area. The increased complexity
also leads to a greater sensitivity to process non-idealities.
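The three-state decision of the Alexander phase detector can be sketched as a small truth-table function. The convention used here is an assumption: S0 and S2 are taken to be samples in the centre of two consecutive bits and S1 a sample near the boundary between them. The function name and the returned labels are hypothetical, and the mapping of the labels onto UP and DOWN signals is left open.

```python
def alexander_pd(s0, s1, s2):
    """Decide early/late/hold from three consecutive samples.

    Assumed convention: s0 and s2 are taken in the centre of two consecutive
    bits and s1 is taken near the boundary between them.
    """
    a = s0 ^ s1  # transition seen between the first data sample and the edge sample
    b = s1 ^ s2  # transition seen between the edge sample and the second data sample
    if a == b:
        return "hold"   # no transition (an isolated a == b == 1 glitch is also ignored)
    if a:
        return "late"   # edge sample already shows the new bit: clock lags the data
    return "early"      # edge sample still shows the old bit: clock leads the data


print(alexander_pd(0, 0, 0))  # hold
print(alexander_pd(0, 1, 1))  # late
print(alexander_pd(0, 0, 1))  # early
```

When the data contains no transitions the function returns "hold", which corresponds to the third state in which no correction is sent to the charge pump.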
Multi-Rate Phase Detector
A derivative phase detector architecture which has been prominently researched over the past
several years is the multi-rate phase detector [23, 24, 25, 26]. Both linear and binary multi-rate
architectures have been published, and these circuits are usually closely related to their full-rate
counterparts. The idea behind these phase detectors is the use of multiple phases of a clock
running at a frequency less than the data rate. The linear and binary phase detectors which have
been discussed up to this point have a clock signal which is at the same frequency as the data rate.
For example a 10Gb/s full-rate CDR circuit will have a VCO operating at 10GHz. A half-rate
CDR circuit at the same data rate will have a VCO operating at 5GHz and a quarter-rate CDR
circuit will have a VCO operating at 2.5GHz.
In Figure 2.14 the differences between a full-rate and half-rate CDR circuit are illustrated by
examining the clock and data waveforms. A half-rate CDR circuit will use at least two phases of
the clock. In [23] Savoj and Razavi proposed both a half-rate linear phase detector and a half-rate
Figure 2.14: Waveforms of full-rate and half-rate CDR circuits
binary phase detector. The half-rate linear phase detector uses the rising and falling edges of the
in-phase clock to generate the proper correction information whereas the half-rate binary phase
detector uses both the in-phase and the quadrature phases of the clock signal to generate the
correction information. In both these phase detectors the data is retimed by two separate DFFs,
one operating on the rising edge of the in-phase clock and the other operating on the falling edge
of the in-phase clock. As such, in these systems the data is intrinsically demultiplexed into two
separate signals. The numbered bubbles in Figure 2.14 illustrate which data bit corresponds to
which output data signal.
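The intrinsic demultiplexing of a half-rate architecture amounts to splitting the serial stream into alternating bits. The following sketch assumes that bits with an even index are captured on the rising edge of the in-phase clock and bits with an odd index on the falling edge; the function names are hypothetical.

```python
def half_rate_retime(bits):
    """Split a serial bit stream into the two streams produced by a half-rate CDR.

    Assumption: bits with even index are captured on the rising clock edge and
    bits with odd index on the falling clock edge.
    """
    rising_edge_stream = bits[0::2]
    falling_edge_stream = bits[1::2]
    return rising_edge_stream, falling_edge_stream


def vco_frequency(data_rate, rate_divisor):
    """VCO frequency for a full-rate (1), half-rate (2) or quarter-rate (4) CDR."""
    return data_rate / rate_divisor


serial = [1, 0, 0, 1, 1, 1, 0, 0]
print(half_rate_retime(serial))   # ([1, 0, 1, 0], [0, 1, 1, 0])
print(vco_frequency(10e9, 2))     # 5000000000.0
```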
The obvious benefit of a multi-rate architecture is the lower frequency of operation, however
there are several downsides which must be considered. First, as these architectures use both the
in-phase and quadrature clock signals, the clock phases must be very precise. Any mismatch
between the in-phase and quadrature clock signals will degrade overall system performance. This
could entail the use of an I/Q offset compensation circuit, as is commonly implemented in wireless
communication systems [27]. A second problem with multi-rate CDR circuits is the increased
complexity of the logic which is required to determine phase error, which can destroy any power
savings gained from the lower operating frequency. Also, the greater circuit complexity virtually
Figure 2.15: Basic architecture of a charge pump
guarantees greater noise. Finally, a multi-rate topology places greater requirements on the VCO.
This may not seem obvious as the clock frequency is reduced, which would indicate a relaxed
design. However, while the frequency of the VCO is lower, the phase noise performance of the
VCO must be better, with every halving of the clock frequency requiring a 6dB improvement in
the phase noise. This can be understood by thinking of the reverse case. In a full-rate system,
in order to demultiplex the data, the clock signal is divided by two. With an ideal divider, the
phase noise of the clock signal will improve by 20·log10(2), or approximately 6dB [28]. Hence, in
order to match the performance of an oscillator running at twice the frequency, the phase noise
performance of the half-rate VCO must be better by 6dB. As an example, in order
to meet the SONET OC-192 specification it is generally accepted that a VCO operating at 10GHz
should have a phase noise no worse than -90dBc/Hz at a 1MHz offset [29]. This means that for
a half-rate architecture the 5GHz VCO must have a phase noise no worse than -96dBc/Hz at a
1MHz offset.
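The phase noise arithmetic above can be checked with a few lines of Python. The function name is hypothetical, and the quarter-rate value is simply the same ideal-divider rule applied for a division by four; it is an extrapolation for illustration and not a figure taken from the references.

```python
import math


def required_phase_noise(full_rate_spec_dbc_hz, rate_divisor):
    """Phase noise target for a VCO running at (full-rate frequency / rate_divisor).

    An ideal frequency divider improves phase noise by 20*log10(divisor) dB, so
    the lower-frequency VCO must be better by the same amount.
    """
    return full_rate_spec_dbc_hz - 20 * math.log10(rate_divisor)


print(round(required_phase_noise(-90.0, 1), 2))  # full-rate: -90.0
print(round(required_phase_noise(-90.0, 2), 2))  # half-rate: -96.02
print(round(required_phase_noise(-90.0, 4), 2))  # quarter-rate: -102.04
```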
2.3.2 Charge Pump
In a CDR circuit the charge pump is responsible for changing the voltage on the low-pass filter by
adding or subtracting charge. The charge pump is controlled by the correction information sent
Figure 2.16: Current steering charge pump with differential inputs
to it by the phase detector. The basic architecture of any charge pump is given in Figure 2.15.
While the UP signal is active, charge is deposited onto the filter, causing the control voltage to
increase. While the DOWN signal is active, charge is removed from the filter, causing the control
voltage to decrease.
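The effect of the UP and DOWN signals on the control voltage can be sketched with an ideal behavioural model. The model assumes that the charge pump drives a single capacitor and that the UP and DOWN currents are perfectly matched; the function name and all component values are hypothetical.

```python
def charge_pump_step(v_ctrl, up, down, i_cp, c_filter, dt):
    """Advance the control voltage by one time step dt.

    Assumes an ideal charge pump driving a single capacitor, with matched
    UP and DOWN currents of magnitude i_cp.
    """
    current = i_cp * (int(up) - int(down))  # net current into the filter
    return v_ctrl + current * dt / c_filter


I_CP = 100e-6  # hypothetical charge pump current (A)
C = 10e-12     # hypothetical filter capacitance (F)
DT = 50e-12    # pulse width (s)

v0 = 0.5
print(charge_pump_step(v0, True, False, I_CP, C, DT))   # UP active: voltage rises
print(charge_pump_step(v0, False, True, I_CP, C, DT))   # DOWN active: voltage falls
print(charge_pump_step(v0, True, True, I_CP, C, DT))    # both active: no net change
```

With these values a single 50ps pulse changes the control voltage by 0.5mV, since the change in voltage is the pulse charge divided by the capacitance.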
There are many different charge pump architectures, however most charge pumps in multi-Gb/s
CDR circuits are implemented using current steering logic [30]. A current steering
implementation allows current to be switched more quickly and accurately than standard CMOS logic
implementations. The circuit for a current steering charge pump is shown in Figure 2.16. While
a current steering charge pump can operate with non-differential inputs, virtually all multi-Gb/s
CDR circuits use differential logic and as such only the implementation with differential inputs
is shown.
The charge pump itself can be differential [31]. This is especially useful when ring oscillators
are used, as these often have differential tuning [32]. This is different from most LC-tank oscillator
architectures which typically have single-ended tuning. A differential filter has several advantages
Figure 2.17: Differential implementation of a charge pump
as compared to a single-ended filter. First, the output voltage range is effectively doubled
as compared to a single-ended filter. Secondly, any mismatches between the NMOS and PMOS
transistors do not substantially affect the performance of the filter, due to the symmetry between
the ‘charge up’ and ‘charge down’ paths. Finally the differential nature of the filter voltage for
this architecture offers a significant improvement in noise immunity, especially to any common
mode noise on the supply line. There are also some disadvantages to a differential filter. First,
any differential charge pump will require a common-mode feedback (CMFB) circuit in order to
compensate for offsets caused by charge-pump non-linearities and to ensure the common mode
filter voltage remains at the appropriate level. Also, a charge pump with differential outputs
may require the loop filter to be duplicated. If the filter is implemented monolithically it will
consume more area, and if an external loop filter is used (which is the most common situation)
an extra pin and more off-chip components will be required. Hence, while a differential topology
is attractive as it leads to lower phase noise [33], most filters are single-ended. Figure 2.17 shows
a charge pump with differential outputs.
Figure 2.18: Second order low pass filter
2.3.3 Loop Filter
The loop filter has a great deal of influence on the overall system performance. In controlling the
loop filter the designer can alter the location of the poles and zeros of the system, changing the
performance [34, 18, 35]. In a CDR circuit which uses a linear phase detector the most common
filter used is the second order low pass filter, which is shown in Figure 2.18. The second order
filter topology dominates as it is the simplest practical filter implementation [7]. For binary phase
detectors there is less unanimity in the filter design, however usually either a simple capacitor or
a first order RC filter is used. The effects of the loop filter on the performance of the CDR circuit
will be discussed later in this chapter.
2.3.4 Voltage Controlled Oscillator
The output of a voltage controlled oscillator (VCO) is a signal which oscillates at a particular
frequency based on a control voltage [36]. The VCO has possibly been researched more than
any other analog circuit [37, 32, 38, 39, 40]. There are countless configurations of VCOs and a
multitude of applications, from wireless circuits to microprocessors to CDR circuits. A great deal
of research has gone into designing VCOs for low-power, large tuning range and low phase noise.
The phase noise of a VCO is very important, as phase noise in the frequency domain translates
Figure 2.19: Architecture of a CDR circuit with a dual loop structure and an external reference
to jitter in the time domain. The concepts and importance of phase noise and jitter are further
discussed later in this chapter.
In a CDR circuit, the tuning range of the VCO determines the data rates which can be locked.
Most multi-Gb/s CDR circuits are aimed at a particular data rate (or a small range of data rates)
and thus the tuning range of the VCO is primarily used to compensate for process variations. For
example, a VCO with a tuning range from 9.953GHz to 10.709GHz can be used in CDR circuits
for OC-192 (SONET - 9.953Gb/s), 10GE (Gigabit Ethernet - 10.3125Gb/s) and those standards
with various forward error correction data rates (10.664Gb/s and 10.709Gb/s). A VCO with
a large tuning range is referred to as a high-gain VCO. While this allows for greater flexibility
in the input data rate and greater robustness this can have negative consequences, as high-gain
VCOs amplify noise on the VCO control line, which leads to poor phase noise performance [41].
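The link between tuning range, VCO gain and sensitivity to control line noise can be illustrated with a first-order calculation. The tuning range is the one quoted in the text; the 1V control voltage range, the 1mV noise amplitude, the assumption of linear tuning and the function names are hypothetical.

```python
def vco_gain(f_min, f_max, v_min, v_max):
    """Average VCO gain in Hz/V over the tuning range (assumes linear tuning)."""
    return (f_max - f_min) / (v_max - v_min)


def frequency_error(k_vco, v_noise):
    """Frequency deviation caused by a noise voltage on the control line."""
    return k_vco * v_noise


# Tuning range from the text; the 1V control range and 1mV noise are hypothetical.
k_wide = vco_gain(9.953e9, 10.709e9, 0.0, 1.0)
k_narrow = k_wide / 10  # a hypothetical low-gain VCO for comparison
print(frequency_error(k_wide, 1e-3))
print(frequency_error(k_narrow, 1e-3))
```

Under these assumptions the same noise voltage produces ten times the frequency deviation in the high-gain VCO, which is the mechanism by which a large tuning range degrades phase noise.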
In order to optimize the performance, CDR circuits may be implemented in a dual loop
structure. In one loop the VCO is operated in a high-gain mode until the VCO is frequency
aligned with the incoming data signal or some external reference signal. This is often referred to
as coarse tuning. At that point a second loop takes over, with the VCO operating in a low-gain
mode. This is commonly known as fine tuning. The sole purpose of this loop is to phase align the
Figure 2.20: Architecture of a four stage ring oscillator
VCO to the incoming data signal; frequency synchronization is assumed. The low-gain loop will
improve performance as noise on the control signal will have less effect, however the tracking range of
the low-gain loop is limited. This means that if the incoming data signal deviates far enough
from the initial frequency, the CDR circuit may lose lock and have to switch back to the
high-gain loop [17]. Figure 2.19 shows the architecture of a CDR circuit which implements a dual
loop structure. In this figure the coarse tuning of the VCO is controlled by the loop with the
phase frequency detector which synchronizes the frequency of the VCO to the external reference,
fref.
The two most common oscillator topologies used in CDR circuits are ring oscillators and
LC-tank oscillators.
1. Ring Oscillator
A ring oscillator consists of a series of controllable delay cells where the output is connected
to the input, creating an unstable circuit which oscillates [32]. In order to achieve oscillation
the total phase shift around the loop must equal 360°. The architecture of a
four-stage differential ring oscillator is shown in Figure 2.20. In this circuit each delay
cell must provide a phase shift of only 45°, as the inversion in the feedback path provides
the additional 180° phase shift. By varying the control voltage, the delay of each cell can
be changed, which changes the frequency of oscillation. The frequency of oscillation of a
ring oscillator is given by $(2 \cdot N \cdot T_{delay})^{-1}$, where N is the
number of delay elements and $T_{delay}$ is the delay through each delay element. A tutorial
describing the design of a ring oscillator can be found in [32].
Figure 2.21: (a) Ideal LC-tank (b) LC-tank with parasitics
2. LC-tank Oscillator
An LC-tank oscillator uses the resonance of an LC-tank to achieve oscillation. A capacitor
and inductor make up an ideal LC-tank, as shown in Figure 2.21a. Assuming a lossless
inductor and capacitor, once energy is introduced to the system it continually cycles without
loss at a particular frequency. The transfer function of an LC-tank is given in Equation 2.1.
$$H(\omega) = \frac{LC}{1 - \omega^2 \cdot LC} \tag{2.1}$$
A quick analysis of the transfer function reveals that at the frequency $\omega = 1/\sqrt{LC}$, the transfer
function goes to infinity. This can be understood to mean that for an ideal LC-tank at
that particular frequency there is an output, even without an input. Ideal inductors and
capacitors do not exist, and the parasitics which are intrinsic to any device in a monolithic
system act to damp out oscillations in the tank. The primary sources of these parasitics are
resistances in the inductor and capacitor, which are usually modelled as series resistances.
These series resistances can be converted to parallel resistances using the relationship
$R_p = Q^2 \cdot R_s$ [42], where Q represents the device’s quality factor. Figure 2.21b shows an LC-tank
with Rp representing the lumped parasitic resistances. In Figure 2.21b there is another
Figure 2.22: LC-tank oscillator with NMOS cross-couple pair
component, labelled $-1/g_m$. This component represents a negative transconductance which
must be added to the circuit to compensate for the parasitic resistance. In order for the
LC-tank to sustain oscillations, this transconductance must be large enough to overcome the tank’s
parasitic resistance, such that $g_m > 1/R_p$. In the monolithic CMOS implementation of an
LC-tank oscillator the transconductance is generated by a cross-coupled differential pair.
One common topology for an LC-tank oscillator is shown in Figure 2.22. In this topology
the negative transconductance is generated by the cross-coupled NMOS differential pair.
Frequency tuning is usually achieved using MOS transistors configured as voltage-controlled
capacitors [43].
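As a rough numerical sketch of the two oscillator topologies described above, the following Python snippet evaluates the standard ring oscillator expression $(2 \cdot N \cdot T_{delay})^{-1}$, the resonance of an ideal LC-tank at $\omega = 1/\sqrt{LC}$, and the start-up condition $g_m > 1/R_p$ with $R_p = Q^2 \cdot R_s$. The function names and all component values are illustrative assumptions and are not taken from the designs in this thesis.

```python
import math


def ring_oscillator_frequency(n_stages, t_delay):
    """Oscillation frequency in Hz of an N-stage ring: 1 / (2 * N * T_delay)."""
    return 1.0 / (2.0 * n_stages * t_delay)


def lc_tank_resonance(inductance, capacitance):
    """Resonant frequency in Hz of an ideal LC-tank: 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


def parallel_resistance(q_factor, r_series):
    """Series-to-parallel conversion used in the text: Rp = Q^2 * Rs."""
    return q_factor ** 2 * r_series


def sustains_oscillation(gm, r_parallel):
    """Start-up condition from the text: gm must exceed 1 / Rp."""
    return gm > 1.0 / r_parallel


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    f_ring = ring_oscillator_frequency(n_stages=4, t_delay=12.5e-12)
    f_tank = lc_tank_resonance(inductance=1e-9, capacitance=250e-15)
    r_p = parallel_resistance(q_factor=10.0, r_series=1.0)
    print(f"ring oscillator: {f_ring / 1e9:.2f} GHz")
    print(f"LC-tank:         {f_tank / 1e9:.2f} GHz")
    print(f"Rp = {r_p:.0f} ohm, gm = 20 mS sustains oscillation: "
          f"{sustains_oscillation(20e-3, r_p)}")
```

The sketch ignores the tuning range of either oscillator; it only shows how the delay per stage or the tank components set the nominal frequency.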
2.4 CML in Multi-Gb/s CDR Circuits
CML is a high-speed current steering logic family which is the CMOS equivalent to the emitter-
coupled logic (ECL) family used in bipolar technologies. This logic family is also known as
Figure 2.23: CML implementation of a buffer
source-coupled logic (SCL) and current steering logic [44, 45, 46]. Multi-Gb/s CDR circuits
implemented in CMOS technology have almost exclusively used CML as opposed to static
CMOS logic. In this section the CML logic family is briefly described and its use in CDR circuits
is discussed.
2.4.1 Architecture
The architecture of a CML buffer is shown in Figure 2.23. The input voltage (Vin+−Vin−) steers
the bias current between transistors Q1 and Q2. The voltage drop across the resistors creates the
output voltage swing, which means that the voltage swing in CML circuits is not full swing, but
rather is from Vdd to Vdd − (Ibias ×R) [47]. The basic idea of current steering can be expanded
upon to create more complicated gates. This is accomplished by way of stacking current steering
differential pairs. The number of levels of current steering in the circuit determines the complexity
of logic. For example, a two input XOR gate requires two levels of current steering. The CML
Figure 2.24: CML implementation of four different logic gates
implementation of four different logic gates is given in Figure 2.24. There are many excellent
papers available which detail the design of CML gates [48].
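The reduced output swing of a CML buffer can be illustrated with a short calculation. The sketch below assumes the simple resistively loaded buffer described above, in which the output swings between Vdd and Vdd − (Ibias × R); the function name and the operating point are hypothetical.

```python
def cml_output_levels(vdd, i_bias, r_load):
    """Return the (high, low) single-ended output levels of a CML buffer in volts."""
    swing = i_bias * r_load  # drop across the load carrying the full bias current
    return vdd, vdd - swing


if __name__ == "__main__":
    # Hypothetical operating point, chosen only for illustration.
    high, low = cml_output_levels(vdd=1.8, i_bias=400e-6, r_load=1000.0)
    print(f"output swings between {high:.2f} V and {low:.2f} V")
```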
2.4.2 CDR Circuits Using CML
The CDR circuits in this thesis are all implemented using CML, however the basic concepts
are applicable to any logic family. Scaling has enhanced the performance of standard CMOS
processes to the point where it is possible to implement multi-Gb/s CDR circuits in logic families
other than CML, such as static CMOS. While modern CMOS processes may provide sufficient
bandwidth there are still difficulties to implementing multi-Gb/s CDR circuits using logic styles
other than CML, and there are several distinct benefits to using CML.
The first and most significant benefit of using CML is that it enables higher performance than
any other logic family. This is obviously important in the design of high-speed interconnects in
order to achieve the highest data rates [49] [50]. A second important benefit is that the differential
property of CML provides excellent common mode noise immunity [20]. The differential signals
are always taken with respect to each other, and thus noise affecting both signals is cancelled
out. Single-ended signals are far more susceptible to noise on the supply and ground rails [51]. Thirdly,
single-ended logic families like static and dynamic CMOS inject a lot of noise into the substrate, which
is quite undesirable near sensitive analog circuits, particularly the VCO [16]. As CML has a
constant current draw there is very little noise injection into the substrate. Fourthly, as will be seen
later in the chapter, duty-cycle distortion (or pulse-width distortion) is a source of jitter in CDR
circuits. With CML the crossing points of the differential signals are more important than the
risetime or falltime, however for static CMOS circuits unequal rise and falltimes will lead to duty
cycle distortion. For static CMOS circuits operating at very high frequencies it can be difficult
to achieve the necessary precision in matching the rise and falltimes and the resulting duty-cycle
distortion will degrade the performance. Finally, the differential nature of CML also means that
the inverse of a signal is always available and that signal is phase aligned with the original signal.
In a single ended design an inverter would have to be used to generate the inverse, and the inverse
signal will have a phase offset with respect to the original signal. One drawback to CML circuits is
their constant current draw, and a CDR circuit implemented using static or dynamic logic would
Figure 2.25: Frequency domain mathematical model of a CDR circuit
consume significantly less power than a CML implementation. A second drawback to using CML
is that the resulting design will consume more area than a non-CML design.
2.5 Figures of Merit
In order to characterize the performance of a system, figures of merit (FOM) are defined. In data
communication systems two classes of FOM are used: time domain and frequency domain. Time
domain FOM are the most important characteristics of backplane systems, whereas frequency
domain FOM are more important in optical systems. Time domain FOM include peak-to-peak
jitter, root mean squared (RMS) jitter, jitter generation and bit error rate (BER). Frequency
domain FOM include jitter tolerance, jitter transfer and VCO phase noise.
2.5.1 CDR System Analysis
In order to understand the different FOM, a CDR circuit is mathematically modelled. The
analysis is performed on a CDR circuit which uses a linear phase detector, as opposed to a binary
phase detector. The reason for this is that a linear phase detector allows for a straightforward
system analysis. While there have been several attempts at analyzing CDR circuits based on
binary phase detectors the non-linearity of the binary phase detector makes it difficult [18]. Some
of the research into binary phase detector based CDR circuits will be detailed later in this chapter
however an analysis of linear phase detector based CDR circuits will allow the concepts behind
Figure 2.26: Output waves from a CDR circuit with a capacitor for a loop filter
the various FOM to emerge successfully. The architecture of a CDR circuit is shown in Figure
2.25, including the transfer functions for each block. This CDR circuit is very similar to a PLL
and the corresponding analysis is also very similar to that of a PLL [35, 52]. The input of the
system Φin represents the input phase and Φout represents the phase of the VCO signal. The
output of the phase detector is linearly proportional to the input phase error Φerror multiplied
by a factor of $K_{df}/2\pi$. Here, $K_{df}/2\pi$ represents the average data transition density, and it is a term
which is used to compensate for the fact that there is not a data transition every period. For an
alternating data pattern (10101010...) Kdf = 1 and for a random bit sequence Kdf = 0.5 [53].
The charge pump gain is simply denoted Icp, which is the value of the charge pump current.
The VCO is a phase integrator, hence it has a $1/s$ term. The Kvco term describes the gain of the
VCO, which represents how much the VCO’s output phase changes for a given change in
input voltage. The low-pass filter is normally implemented as a second order filter (as described
in Section 2.3.3), however to simplify the analysis it is initially assumed that the filter is simply a
capacitor, C1. The response of a CDR circuit with a capacitor for a loop filter is shown in Figure
2.26. If the output filter voltage is approximated as a continuous wave with the slope shown in
the figure, the relationship between the filter voltage and a phase error, Φerror can be defined, as
is shown in Equation 2.2.









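As the relationship between the filter voltage and the output phase is simply defined as $K_{vco}/s$, the
open loop transfer function of the CDR circuit, which describes the relationship between the input
phase error and the output phase, is given in Equation 2.3.
$$\frac{\Phi_{out}(s)}{\Phi_{error}(s)} = \frac{K_{df} \cdot I_{cp} \cdot K_{vco}}{2\pi \cdot s^2 C_1} \tag{2.3}$$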
























Figure 2.28: Frequency response of a CDR circuit which has a first order loop filter
A return to the open-loop transfer function given in Equation 2.3 reveals a significant problem
which is due to using simply a capacitor as a loop filter. The open loop transfer function has two
poles at the origin, which will give the frequency response shown in Figure 2.27. As can be seen,
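the loop will be unstable as there is no phase margin. In order to compensate for this a resistor, R,
is added in series with the capacitor, which gives the open loop transfer function shown in Equation 2.5.
$$\frac{\Phi_{out}(s)}{\Phi_{error}(s)} = \frac{K_{df} \cdot I_{cp} \cdot K_{vco}}{2\pi} \cdot \frac{sRC_1 + 1}{s^2 C_1} \tag{2.5}$$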
The transfer function shows that the resistor has added a zero to the system. The frequency
response of the open loop gain of the CDR circuit with the first order filter is shown in Figure
2.28. As can be seen the zero stabilizes the system as the phase margin is now equal to 90◦.
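As before the closed loop response of the CDR circuit is calculated and the result is shown in
Equation 2.6.
$$G_{cl2}(s) = \frac{s \cdot \frac{K_{df} \cdot I_{cp} \cdot K_{vco} \cdot R}{2\pi} + \frac{K_{df} \cdot I_{cp} \cdot K_{vco}}{2\pi C_1}}{s^2 + s \cdot \frac{K_{df} \cdot I_{cp} \cdot K_{vco} \cdot R}{2\pi} + \frac{K_{df} \cdot I_{cp} \cdot K_{vco}}{2\pi C_1}} \tag{2.6}$$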
As this transfer function is a standard second order loop response it can be written in terms of
the natural frequency (ωn) and the damping ratio (ζ).
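$$G_{cl2}(s) = \frac{s \, 2\zeta\omega_n + \omega_n^2}{s^2 + s \, 2\zeta\omega_n + \omega_n^2} \tag{2.7}$$
$$\omega_n = \sqrt{\frac{K_{df} I_{cp} K_{vco}}{2\pi C_1}}, \qquad \zeta = \frac{R}{2} \sqrt{\frac{K_{df} I_{cp} C_1 K_{vco}}{2\pi}}$$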
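To make the mapping from circuit parameters to loop parameters concrete, the following sketch evaluates $\omega_n = \sqrt{K_{df} I_{cp} K_{vco} / (2\pi C_1)}$ and $\zeta = (R/2)\sqrt{K_{df} I_{cp} C_1 K_{vco} / 2\pi}$ for one set of component values. The values are hypothetical and were chosen only to give round numbers; Kvco is assumed to be expressed in rad/s per volt.

```python
import math


def loop_parameters(k_df, i_cp, k_vco, r, c1):
    """Return (omega_n, zeta) of the second order loop.

    k_vco is in rad/s per volt, i_cp in amperes, r in ohms, c1 in farads.
    """
    k = k_df * i_cp * k_vco / (2.0 * math.pi)  # lumped gain term Kdf*Icp*Kvco/(2*pi)
    omega_n = math.sqrt(k / c1)
    zeta = 0.5 * r * math.sqrt(k * c1)
    return omega_n, zeta


if __name__ == "__main__":
    # Hypothetical component values, chosen only for illustration.
    omega_n, zeta = loop_parameters(
        k_df=0.5, i_cp=100e-6, k_vco=2.0 * math.pi * 500e6, r=1000.0, c1=1e-9
    )
    print(f"omega_n = {omega_n:.3e} rad/s, zeta = {zeta:.2f}")
    print(f"2*zeta*omega_n = {2.0 * zeta * omega_n:.3e} rad/s")
```

Note that the product $2\zeta\omega_n$ reduces to $K_{df} I_{cp} K_{vco} R / 2\pi$, so it does not depend on C1.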
As mentioned previously these results assume a first order filter whereas a second order filter
is normally used. A first order filter proves insufficient because of glitches on the filter voltage, which
result from the voltage drop across the resistor when the charge pump current turns on and off.
These glitches will cause a significant amount of noise and negatively affect the performance of
the CDR circuit. As such, a second capacitor, C2, is added in order to smooth out these glitches.
The closed loop transfer function with the second capacitor included is given in Equation 2.8.
$$\frac{\Phi_{out}(s)}{\Phi_{in}(s)} = \frac{sRC_1 + 1}{\frac{2\pi}{K_{df} \cdot I_{cp} \cdot K_{vco}} \cdot [s^3 R C_1 C_2 + s^2 (C_1 + C_2)] + sRC_1 + 1} \tag{2.8}$$
This equation is not as simple as the second order system, however it does more accurately
describe the system. The addition of the second capacitor suppresses the glitches, however the
corresponding addition of a third pole can cause stability issues as the phase margin will be
degraded. However, if capacitor C2 is at least an order of magnitude smaller than capacitor C1
the closed loop frequency response of the loop will be approximately the same as that of the
second order system [7]. Because of this, the second order loop approximation will be used later
in the chapter to help describe tradeoffs involved in frequency domain FOM.
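The effect of the ratio C1/C2 on the closed loop response can be checked numerically. The sketch below compares the second order response (series R and C1 only) with the third order response (C2 added in parallel) at a single frequency near the edge of the loop bandwidth. The lumped gain k stands for $K_{df} I_{cp} K_{vco} / 2\pi$, and the component values are hypothetical ones that give ζ = 1.

```python
import math


def closed_loop_second_order(omega, k, r, c1):
    """Closed loop response with a series R-C1 loop filter (second order loop)."""
    s = complex(0.0, omega)
    num = s * k * r + k / c1
    return num / (s * s + num)


def closed_loop_third_order(omega, k, r, c1, c2):
    """Closed loop response with C2 placed in parallel with the R-C1 branch."""
    s = complex(0.0, omega)
    num = s * r * c1 + 1.0
    den = (s ** 3 * r * c1 * c2 + s ** 2 * (c1 + c2)) / k + num
    return num / den


def deviation_db(omega, k, r, c1, c2):
    """Magnitude difference in dB between the two responses at omega."""
    mag2 = abs(closed_loop_second_order(omega, k, r, c1))
    mag3 = abs(closed_loop_third_order(omega, k, r, c1, c2))
    return abs(20.0 * math.log10(mag3 / mag2))


if __name__ == "__main__":
    # Hypothetical loop: k = Kdf * Icp * Kvco / (2 * pi), values give zeta = 1.
    k, r, c1 = 25000.0, 400.0, 1e-9
    for ratio in (10.0, 100.0):
        d = deviation_db(1e7, k, r, c1, c1 / ratio)
        print(f"C1/C2 = {ratio:5.0f}: deviation at 1e7 rad/s = {d:.2f} dB")
```

For these assumed values the third order response follows the second order response more closely as C2 is made smaller, which is the behaviour the approximation relies on. How small C2 must be in practice depends on the damping ratio and on how much deviation can be accepted.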
It is important to emphasize that this analysis is only valid for CDR circuits which implement
linear phase detectors. Binary phase detector circuits are intrinsically non-linear, and hence
the system analysis is different. The modelling of binary phase detector based CDR circuits is
presented in Section 2.6.
2.5.2 Jitter
Jitter is defined as the difference in time between an ideal event time and the actual event time.
In a CDR circuit the events of interest are the zero-crossing points of the differential clock and
data signals. While the concept of jitter is relatively simple, it can be defined in many different
ways. Instantaneous jitter is determined by measuring the difference in time for a singular event,
as described in Equation 2.9.
$$j[n] = t_E[n]_{ideal} - t_E[n]_{actual} \tag{2.9}$$
In this equation j[n] is the jitter at the nth transition and $t_E[n]$ is the time of the nth event. While
jitter is technically the non-ideality of a singular event, it is most often treated in a statistical
manner. The difference in the zero-crossing points of the clock and data signals of a CDR circuit
are measured over a period of time, and calculations are made with respect to both the magnitude
and the spread. While the exact nature of jitter can at times be difficult to determine, jitter is
broadly separated into two categories: random jitter and deterministic jitter.
Random Jitter
Random jitter describes timing variations which do not have deterministic causes. Random
jitter is characterized using Gaussian distribution statistics. As random jitter has a Gaussian
distribution, there is theoretically no limit to its magnitude. Thermal noise is the primary cause
of random jitter, however other causes include flicker noise and process variations. Random jitter
is always present, and a circuit designer has little ability to control thermal noise sources. The
exact sources of random jitter in a circuit are difficult to pinpoint. In CDR circuits there are both
Figure 2.29: An illustration of pattern dependent jitter
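intentional and parasitic resistances, all of which generate thermal noise. The mean-square noise voltage of a
resistor is defined as $\overline{v_{noise}^2} = 4kTR\Delta f$ [36], where k is Boltzmann's constant, T is the absolute
temperature and $\Delta f$ is the bandwidth. Flicker noise is another source of random jitter in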
circuits. Devices fabricated in CMOS processes suffer from flicker noise much more than devices
fabricated in bipolar processes. As random jitter is statistical in nature, there is no discrete
value which can completely encapsulate it. Random jitter is typically reported as a root-mean-
squared (RMS) value. An RMS measurement must be used, as theoretically random jitter has
no maximum value. As random jitter is a statistical calculation, the accuracy depends on the
number of measurements which are taken.
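The magnitude of resistor thermal noise is easy to evaluate numerically. The following sketch computes the RMS noise voltage $\sqrt{4kTR\Delta f}$; the function name, the default temperature of 300 K and the example resistance are assumptions made for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K


def thermal_noise_vrms(resistance, bandwidth, temperature=300.0):
    """RMS thermal noise voltage of a resistor: sqrt(4 * k * T * R * delta_f)."""
    return math.sqrt(4.0 * BOLTZMANN * temperature * resistance * bandwidth)


if __name__ == "__main__":
    # Hypothetical 1 kOhm resistor observed in a 1 Hz bandwidth at 300 K.
    v_n = thermal_noise_vrms(resistance=1000.0, bandwidth=1.0)
    print(f"noise density: {v_n * 1e9:.2f} nV/sqrt(Hz)")
```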
Deterministic Jitter
Deterministic jitter describes timing variations which do not have a Gaussian distribution. Unlike
random jitter, deterministic jitter has specific and identifiable causes. Also unlike random jitter,
the magnitude of deterministic jitter is bounded. The sources of deterministic jitter include
crosstalk, electromagnetic interference, and simultaneous switching outputs. Deterministic jitter
cannot be analyzed using Gaussian statistics and as its amplitude is finite it is commonly reported
as a measurement of the maximum jitter which occurs. This measurement is referred to as peak-to-peak
jitter. Deterministic jitter can be separated into three kinds: Pattern Dependent Jitter,
Pulse Width Distortion and Bounded Uncorrelated Jitter.
Figure 2.30: An illustration of pulse width distortion
1. Pattern Dependent Jitter
Pattern dependent jitter is jitter which is caused by limitations in component and system
bandwidth. Pattern dependent jitter is also known as data dependent jitter or inter-symbol
interference. While the system bandwidth is ideally perfectly flat, this is never the case.
The incoming data stream contains many different frequency components and the system
response to the different frequency components will be different. This is most commonly
observed as an attenuation of the high frequency components. For example, when there is an
incoming bit sequence with a long sequence of ones or zeros followed by several transitions,
the magnitude of the data signal during those transitions will be attenuated, which can
cause timing errors. This situation is illustrated in Figure 2.29.
2. Pulse Width Distortion
Pulse width distortion results from differences in the rise time and fall time of the data
signal. Pulse width distortion is also called duty cycle distortion. The unequal rise and fall
times lead to a difference in the width of a pulse representing logic ‘1’ and the width of
a pulse representing logic ‘0’ with respect to Vdd/2. Figure 2.30 illustrates the problem of
pulse width distortion. If the signals in Figure 2.30 were taken single-endedly (as would be
the case in a static CMOS implementation) pulse width distortion would cause problems
Figure 2.31: The difference between RMS and peak-to-peak jitter measurements [2]
as there would no longer be an edge of the clock at the centre of the data eye. The use of
differential logic (i.e. CML) largely solves the problem of pulse width distortion, as with
differential logic the two differential signals are referenced with respect to each other. The
primary concern in differential circuits is the zero crossing points of the signals, not their
shape.
3. Bounded Uncorrelated Jitter
Bounded uncorrelated jitter refers to jitter which is bounded in amplitude, yet uncorrelated
to the data pattern. It is often sinusoidal in nature and caused by interference from signal
sources either internal or external to the system [54]. The interference can be caused by
way of capacitive coupling, inductive coupling or electromagnetic interference.
Eye Diagram Jitter Representation
The different effects of random and deterministic jitter can be identified using an eye diagram.
An eye diagram is generated by repeatedly sampling a data signal at a regular interval and
superimposing the results. As jitter adds variation to the zero-crossing point of the data signal,
the eye opening will shrink. The different effects are illustrated in Figure 2.31. $J^{RJ}_{rms}$ is the RMS
jitter on the signal due to random noise, however, the overall RMS jitter for the signal is given
simply as $J_{rms}$ and is dependent on both random and deterministic jitter. $J^{DJ}_{pp}$ is a measure of
the amount of peak-to-peak jitter on the signal due only to deterministic noise. $J_{pp}$ is the value
of peak-to-peak jitter with both deterministic and random jitter taken into account. Figure 2.31
clearly shows that jitter shrinks the opening of the data eye. The shrinking of the data eye means
that it is more likely that the incoming data signal will be incorrectly sampled, which reduces
the performance of the system. From Figure 2.31 it is clear that both the RMS and peak-to-peak
measures of jitter are important for system characterization. On their own each can give valuable
insight to the circuit behavior, however they must be viewed together in order to get a proper
understanding of system performance.
2.5.3 Frequency Domain FOM
Frequency domain FOM describe the frequency response of the CDR circuit. From the analysis in
Section 2.5.1 it is clear that the system response is different for different input noise frequencies.
Jitter tolerance, jitter transfer and VCO phase noise are three frequency domain FOM which are
used to characterize a CDR circuit. OC-192 is a SONET standard for 10Gb/s data communication
which is often used in optical data communication systems [55]. The different FOM will be
described using the specifications for OC-192 CDR circuits as examples.
Jitter Transfer
Jitter transfer is the attenuation of jitter from the input to the output. Given this description it
can be seen that the transfer function in Equation 2.7 is the same as the jitter transfer relationship
of a linear phase detector based CDR circuit. Jitter transfer is also used to identify jitter amplification,
which is a concern in optical networks. In an optical data communication system a signal may
Figure 2.32: Jitter transfer mask for OC-192
travel for great distances and go through many repeaters. If jitter at a particular frequency is
amplified at each repeater, the magnitude of the jitter will eventually cause the system to fail.
For an OC-192 system the maximum jitter amplification which is acceptable is 0.1dB, and the
jitter transfer function has a -3dB frequency of 8MHz. The OC-192 jitter transfer mask is shown
in Figure 2.32.
Using Equation 2.7 the bandwidth of the jitter transfer for a CDR circuit can be calculated.















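Equation 2.10 can be solved in terms of ω−3dB to get the jitter transfer bandwidth.
$$-\omega_{-3dB}^4 + 2\omega_n^2 (1 + 2\zeta^2) \, \omega_{-3dB}^2 + \omega_n^4 = 0 \tag{2.12}$$
$$\omega_{-3dB}^2 = \omega_n^2 \left[ (1 + 2\zeta^2) + \sqrt{(1 + 2\zeta^2)^2 + 1} \right] \tag{2.13}$$
$$\zeta \gg 1 \implies \omega_{-3dB} = 2\zeta\omega_n \tag{2.14}$$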
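The quality of the large-ζ approximation can be checked numerically. The sketch below assumes the second order jitter transfer function $(s \, 2\zeta\omega_n + \omega_n^2)/(s^2 + s \, 2\zeta\omega_n + \omega_n^2)$, solves the half-power condition $|G_{cl2}(j\omega)|^2 = 1/2$ in closed form, and compares the result with $2\zeta\omega_n$. The loop values used are hypothetical.

```python
import math


def jitter_transfer_mag(omega, omega_n, zeta):
    """Magnitude of the second order jitter transfer function at omega (rad/s)."""
    s = complex(0.0, omega)
    num = 2.0 * zeta * omega_n * s + omega_n ** 2
    return abs(num / (s * s + num))


def bandwidth_exact(omega_n, zeta):
    """-3 dB frequency from the half-power condition, without approximation."""
    a = 1.0 + 2.0 * zeta ** 2
    return omega_n * math.sqrt(a + math.sqrt(a * a + 1.0))


def bandwidth_large_zeta(omega_n, zeta):
    """Approximate -3 dB frequency, valid when zeta is large."""
    return 2.0 * zeta * omega_n


if __name__ == "__main__":
    # Hypothetical loop parameters, chosen only for illustration.
    omega_n, zeta = 5e6, 2.5
    exact = bandwidth_exact(omega_n, zeta)
    approx = bandwidth_large_zeta(omega_n, zeta)
    print(f"exact:  {exact:.4e} rad/s")
    print(f"approx: {approx:.4e} rad/s")
    print(f"|G| at exact bandwidth: {jitter_transfer_mag(exact, omega_n, zeta):.4f}")
```

The two results converge as ζ increases, which is why the simpler expression is adequate for heavily damped loops.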
Substituting for ζ and ωn allows the jitter transfer bandwidth to be written in terms of the
original parameters.
$$\omega_{-3dB} = \frac{K_{df} I_{cp} K_{vco} R}{2\pi} \tag{2.15}$$
Equation 2.15 indicates that the jitter transfer bandwidth is independent of the capacitor C1.
This is based upon the assumption that ζ is large. A large value of ζ is important for stability,
and moreover implies a large value of C1. For wireline data communication the jitter transfer
bandwidth is specified to be quite small. Reducing the value of the resistor by a factor of N will
reduce the bandwidth by the corresponding amount, however in order to maintain the value of
ζ the value of the capacitor must then be increased by a factor of $N^2$. If the assumption that
ζ is large is maintained, Equation 2.7 can be simplified to the form shown in Equation 2.16.
This result shows that the jitter transfer can be approximated by a first order response.
$$G_{cl2}(s) \approx \frac{2\zeta\omega_n}{s + 2\zeta\omega_n} \tag{2.16}$$
The results from the jitter transfer analysis can be used to examine jitter amplification. The
poles and zeros of the second order loop can be derived from Equation 2.7.






Figure 2.33: Jitter amplification in a second order loop
These pole and zero locations illustrate that there is no way to avoid jitter amplification, as both
of the poles are greater than the zero. A plot of the magnitude of the frequency response showing
the frequency range in the neighborhood of ω−3dB is shown in Figure 2.33.
The maximum value by which the jitter transfer gain can exceed 0dB is specified by standards.
As mentioned previously the OC-192 standard specifies that the amount of jitter amplification
must not exceed 0.1dB. Using this value and the straight line Bode plot approximation an expression
for jitter amplification can be derived. While the straight-line approximation is not strictly
accurate when poles and zeros are close to one another, using it can provide a first order equation







Equation 2.17 can be used to relate the value of jitter amplification to the original circuit param-
eters, as shown in Equation 2.18




R2 Icp C1 Kvco Kdf
(2.18)
Figure 2.34: Jitter tolerance mask for OC-192
Here a similar situation as with the jitter transfer bandwidth emerges. Reducing the value of
the resistor by a factor of N will reduce the bandwidth, however in order to keep the jitter
amplification at the required level the value of the capacitor must then be increased by a factor
of $N^2$. Optimal loop filter design in order to limit jitter amplification does not simply
involve increasing the damping ratio. An analysis must be performed in order to choose correct
values for the capacitance ratio and the damping ratio, in order to satisfy the jitter amplification
specifications of the standard [34].
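Jitter amplification in the second order loop can also be examined numerically. The sketch below sweeps the magnitude of the second order jitter transfer function to find its peak, and compares the result with the straight-line estimate $1 + 1/(4\zeta^2)$. This estimate is a commonly used large-ζ approximation for the ratio of the lower pole to the zero; it is assumed here for comparison and is not taken from the equations of this section. The loop values are hypothetical.

```python
import math


def jitter_transfer_mag(omega, omega_n, zeta):
    """Magnitude of the second order jitter transfer function at omega (rad/s)."""
    s = complex(0.0, omega)
    num = 2.0 * zeta * omega_n * s + omega_n ** 2
    return abs(num / (s * s + num))


def peaking_db(omega_n, zeta, points=20000):
    """Peak of the jitter transfer magnitude in dB, found by a logarithmic sweep."""
    best = 0.0
    for i in range(points + 1):
        # Sweep from omega_n / 1000 to 1000 * omega_n.
        omega = omega_n * 10.0 ** (-3.0 + 6.0 * i / points)
        best = max(best, jitter_transfer_mag(omega, omega_n, zeta))
    return 20.0 * math.log10(best)


def peaking_estimate_db(zeta):
    """Straight-line estimate of the peaking in dB: 1 + 1 / (4 * zeta^2)."""
    return 20.0 * math.log10(1.0 + 1.0 / (4.0 * zeta ** 2))


if __name__ == "__main__":
    # Hypothetical loop parameters, chosen only for illustration.
    for zeta in (2.5, 5.0, 10.0):
        print(f"zeta = {zeta:4.1f}: swept peak = {peaking_db(5e6, zeta):.3f} dB, "
              f"estimate = {peaking_estimate_db(zeta):.3f} dB")
```

The sweep shows that the peaking never reaches exactly 0dB for finite ζ, consistent with the observation that jitter amplification cannot be avoided, only reduced.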
Jitter Tolerance
Jitter tolerance is the maximum magnitude of jitter which can be present in the input signal with
the output signal still meeting the design specification. Jitter tolerance is a frequency dependent
definition and a mask test is used to determine whether or not a system passes. The jitter
tolerance mask for SONET OC-192 is shown in Figure 2.34. This figure shows that the CDR
circuit must be able to track jitter frequencies under 2.4kHz with a magnitude up to 15UI. UI
stands for unit interval, which is the period of the data. The data rate for OC-192 is 10Gb/s,
therefore the unit interval is 100ps. This means an OC-192 compliant CDR circuit must be able
to track a data stream with up to 1.5ns of peak-to-peak jitter at 2.4kHz. After 2.4kHz the mask
drops by two orders of magnitude in two discrete steps. At jitter frequencies greater than 4MHz
the CDR circuit must track only so long as the jitter magnitude is less than 0.15UI, or 15ps
peak-to-peak.
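The conversion between unit intervals and absolute time used in these figures is a simple scaling by the bit period. A minimal sketch, assuming a nominal 10Gb/s data rate as in the text (the function names are hypothetical):

```python
def unit_interval_seconds(data_rate_bps):
    """Duration of one unit interval (one bit period) in seconds."""
    return 1.0 / data_rate_bps


def jitter_ui_to_seconds(jitter_ui, data_rate_bps):
    """Convert a jitter magnitude expressed in UI to seconds."""
    return jitter_ui * unit_interval_seconds(data_rate_bps)


if __name__ == "__main__":
    rate = 10e9  # nominal 10Gb/s data rate
    for ui in (15.0, 0.15):
        t = jitter_ui_to_seconds(ui, rate)
        print(f"{ui:5.2f} UI = {t * 1e12:.1f} ps peak-to-peak")
```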
The maximum phase error which can be tolerated in any situation is equal to half the period,
or $\frac{1}{2}$UI. This is expressed mathematically in Equation 2.19.
$$\Phi_{in} - \Phi_{out} < \frac{1}{2}UI \tag{2.19}$$
The previous section derived the equation for jitter transfer, which is the relationship between














Equation 2.22 describes a relationship where the input jitter Φin must be less than the expression
on the right hand side. The maximum magnitude of input jitter which a system can accept is





Figure 2.35: Jitter tolerance using a first order approximation
Using the simplified first order jitter transfer equation given in Equation 2.16 allows for the first








· s + 2ζωn
s
(2.25)
Figure 2.35 illustrates the simplified first order jitter tolerance response described by Equation
2.25. As can be seen the jitter tolerance corner frequency is equal to that of the jitter transfer.
For input jitter frequencies greater than ω = 2ζωn the jitter tolerance is constant, however as the
input jitter frequency drops to less than ω = 2ζωn the CDR circuit can track increasingly large
input jitter magnitudes.
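The shape of the first order jitter tolerance response can be reproduced with a few lines of code. The sketch below assumes that the tolerable input jitter is the maximum phase error of 1/2 UI multiplied by $|(s + 2\zeta\omega_n)/s|$, which follows from combining the 1/2 UI limit with the first order jitter transfer approximation. The function name and the loop values are hypothetical.

```python
def jitter_tolerance_ui(omega, omega_n, zeta, max_error_ui=0.5):
    """Tolerable input jitter in UI at angular frequency omega (first order model)."""
    corner = 2.0 * zeta * omega_n  # jitter tolerance corner frequency
    s = complex(0.0, omega)
    return max_error_ui * abs((s + corner) / s)


if __name__ == "__main__":
    # Hypothetical loop parameters, chosen only for illustration.
    omega_n, zeta = 5e6, 2.5
    corner = 2.0 * zeta * omega_n
    for scale in (0.01, 0.1, 1.0, 10.0, 100.0):
        jt = jitter_tolerance_ui(scale * corner, omega_n, zeta)
        print(f"omega = {scale:6.2f} * corner: tolerance = {jt:8.3f} UI")
```

Below the corner the tolerance rises by a factor of ten per decade, while above it the tolerance flattens out at the 1/2 UI limit.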
As with the jitter transfer derivation, simplifications have been made in order to illustrate the
basic system response. Obviously these simplifications trade accuracy for illustrative simplicity.
With a second order loop filter, the actual system response will be third order and more complex
than what has been presented here. While these closed loop equations will be complex, it is still
far easier to analyze the system response in the frequency domain, as opposed to running lengthy
time-domain simulations.
Jitter Tolerance and Jitter Transfer create some obvious design constraints for a CDR circuit.
An OC-192 system is again used to illustrate this point. The OC-192 jitter tolerance mask shows
that the system must track jitter of reasonably large magnitudes at frequencies less than 4MHz. As a PLL tracks
jitter at frequencies less than its loop bandwidth, the loop bandwidth of an OC-192 CDR circuit
must be greater than 4MHz. The jitter transfer mask shows that input jitter at frequencies
greater than 8MHz must be significantly attenuated. As a PLL attenuates jitter at frequencies
greater than its loop bandwidth, the bandwidth must be less than 8MHz. For this reason most
CDR circuits for OC-192 application have a system bandwidth of approximately 6MHz.
Phase Noise
Jitter has already been discussed, and while jitter is a time domain characterization, phase
noise is the frequency domain analog of jitter [38]. Figure 2.36 illustrates the frequency domain
representation of phase noise and the effect on the zero-crossing points in the time domain. In
Figure 2.36a there is only a single frequency component, and this correlates to a time domain
signal with no uncertainty in the zero-crossing point. However, the frequency spectrum in Figure
2.36b has energy at frequency components other than the centre frequency, which is referred to
as phase noise. The effect of these undesired frequency components in the time domain is to
cause non-idealities in the zero-crossing points, as shown in Figure 2.36b. Phase noise is typically
measured in dBc/Hz at a particular offset from the carrier. This measure refers to the amount of
spectral power in a 1-Hz bandwidth (measured in dB) at a particular frequency offset from the
carrier, relative to the total power of the carrier. The larger the phase noise, the more spectral
power is at undesired frequencies and hence the larger the jitter.
Figure 2.36: Relationship between phase noise and jitter
CDR Circuit Noise Analysis
In this section equations based on the ones previously derived are used to analyze the performance
of a CDR circuit in the presence of noise. While there are many potential sources of noise in a
monolithic environment, the dominant sources are assumed to be noise in the input signal and the
phase noise of the VCO. The study of phase noise in voltage controlled oscillators has been the
subject of many papers and theses [32, 56, 57]. Generally speaking the phase noise requirements
of VCOs in CDR circuits are less stringent than those of wireless circuits, however a high quality VCO is
always desirable. The importance of the VCO phase noise also depends on the application; long
haul optical systems will usually require lower phase noise than backplane systems.
Figure 2.37 shows the CDR system with noise sources at the input (Φjitter) and at the output
of the VCO (ΦPN). Using Figure 2.37 transfer functions can be derived for the two noise sources,
Figure 2.37: Frequency domain mathematical model of a CDR circuit with noise
Φjitter and ΦPN . For the noise due to the input jitter, Φjitter, the transfer function will simply
be identical to the jitter transfer function previously derived. Equation 2.26 gives the transfer












Equation 2.27 gives the transfer function of the output noise with respect to the VCO noise. In
order to simplify the relationship the single ‘s’ term in the denominator is assumed to be small
in comparison to the ‘s 2ζωn’ term and the simplified relationship between the phase noise and
the output phase is given in Equation 2.28. Substituting the component parameters in for ζ and























These equations can be used to gain a valuable insight into the system behavior of a CDR
circuit. It has already been shown that the jitter transfer, and hence $\Phi_{out}(s)/\Phi_{jitter}(s)$, has the characteristics
of a low-pass filter, and from Equation 2.29 it can be seen that the transfer function of the
VCO’s phase noise to the output phase is a high-pass filter. In Figure 2.38a, the phase noise of the
Figure 2.38: Frequency response of a CDR circuit to noise
VCO is plotted along with the input noise, which is assumed to be white. In Figure 2.38b the two
transfer functions given in Equations 2.26 and 2.29 are plotted. Figure 2.38c combines these and
shows the system response as the solid line. Here, the ideal situation has been assumed, which
means that the system bandwidth is at the point where the magnitude of the VCO phase noise
is equal to the magnitude of the input noise. At frequencies less than ωbw, the magnitude of the
input noise is less than the noise of the VCO. Figure 2.38c shows that for ω < ωbw the input data
will pass through, however the VCO phase noise will be attenuated. At frequencies greater than
ωbw the noise from the VCO is less than the input noise. Again referring to Figure 2.38c it can
be seen that for ω > ωbw the input noise is attenuated, while the VCO noise passes through.
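The complementary low-pass and high-pass behaviour can be illustrated numerically. The sketch below assumes the second order loop and assumes that the VCO phase noise is injected at the VCO output, so that its transfer function to the output is one divided by one plus the loop gain, without the simplification applied in the text. The function names and loop values are hypothetical.

```python
def input_jitter_transfer(omega, omega_n, zeta):
    """Low-pass transfer from input jitter to output phase (second order loop)."""
    s = complex(0.0, omega)
    num = 2.0 * zeta * omega_n * s + omega_n ** 2
    return num / (s * s + num)


def vco_noise_transfer(omega, omega_n, zeta):
    """High-pass transfer from VCO phase noise to output phase (unsimplified)."""
    s = complex(0.0, omega)
    return (s * s) / (s * s + 2.0 * zeta * omega_n * s + omega_n ** 2)


if __name__ == "__main__":
    # Hypothetical loop parameters, chosen only for illustration.
    omega_n, zeta = 5e6, 2.5
    for omega in (1e4, 1e6, 2.5e7, 1e9):
        h_in = abs(input_jitter_transfer(omega, omega_n, zeta))
        h_vco = abs(vco_noise_transfer(omega, omega_n, zeta))
        print(f"omega = {omega:8.1e} rad/s: |input| = {h_in:.4f}, |VCO| = {h_vco:.4f}")
```

Under this assumption the two transfer functions sum to one at every frequency, so any choice of loop bandwidth trades rejection of input noise against rejection of VCO noise.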
2.5.4 Time-Domain FOM
Jitter measurements are the primary criterion used to characterize the performance of a CDR
circuit in the time domain. Four FOM which are used to characterize the amount of jitter in
a CDR circuit are RMS jitter, peak-to-peak jitter, jitter generation and bit error rate (BER).
While RMS jitter, peak-to-peak jitter and jitter generation are distinct measurements of jitter,
BER is used to provide a clear measure of system performance.
Root Mean Squared Jitter
The root mean squared (RMS) jitter provides a measure of the average amount of jitter in a
signal. RMS jitter can be calculated using the equations given in













In these equations µj represents the mean jitter value, N is the number of jitter samples used
and σj is the RMS value of the jitter.
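As a minimal sketch of this calculation, the mean and RMS jitter can be computed from a list of jitter samples j[n]. The sketch assumes the RMS value is taken about the mean and normalized by N (rather than N − 1). The function names are illustrative.

```python
import math

def mean_jitter(samples):
    """Mean jitter value (mu_j) over N samples."""
    return sum(samples) / len(samples)

def rms_jitter(samples):
    """RMS jitter (sigma_j) about the mean, normalized by N (an assumption)."""
    mu = mean_jitter(samples)
    return math.sqrt(sum((j - mu) ** 2 for j in samples) / len(samples))

if __name__ == "__main__":
    jitter_ps = [1.0, -1.0, 1.0, -1.0]   # hypothetical jitter samples in ps
    print(mean_jitter(jitter_ps), rms_jitter(jitter_ps))
```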
Peak-to-Peak Jitter
While RMS jitter provides a measure of the average jitter on a signal, the peak-to-peak measure-
ment provides the worst case value for the jitter which is seen on a signal over a given sample
set. Peak-to-peak jitter is calculated using Equation 2.32.
$$J_{p-p} = \max(j[n]) - \min(j[n]) \tag{2.32}$$
The size of the sample set is important, as it must be large enough to give an accurate peak-to-
peak jitter measurement.
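The dependence on sample set size can be illustrated with a short sketch. It applies Equation 2.32 to growing prefixes of one sample set. The Gaussian test data and the helper names are assumptions used only for illustration.

```python
import random

def peak_to_peak_jitter(samples):
    """Equation 2.32: largest minus smallest jitter sample."""
    return max(samples) - min(samples)

def gaussian_jitter(n, sigma, seed=0):
    """Hypothetical test data: n Gaussian jitter samples with standard deviation sigma."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]

def peak_to_peak_vs_sample_count(samples, counts):
    """Peak-to-peak jitter measured over the first n samples, for each n in counts."""
    return [peak_to_peak_jitter(samples[:n]) for n in counts]

if __name__ == "__main__":
    counts = (100, 10_000, 100_000)
    samples = gaussian_jitter(max(counts), sigma=1.0)   # sigma in ps, for example
    for n, pp in zip(counts, peak_to_peak_vs_sample_count(samples, counts)):
        print(n, round(pp, 2))
```

Because each larger prefix contains the smaller ones, the measured peak-to-peak value can only stay the same or grow as more samples are included. For unbounded (random) jitter the measurement therefore depends on how long the signal is observed.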
Jitter Generation
Jitter generation is an important FOM in any CDR circuit. While it is often reported bundled
together with jitter transfer and jitter tolerance, it is in fact a time domain FOM. Jitter generation
is the magnitude of jitter at the output of the CDR given an ideal input. For an OC-192 system
the maximum amount of jitter generation is specified to be 10 ps peak-to-peak and 1 ps RMS. In
order to meet jitter generation specifications it is very important that the VCO control voltage
does not wander in the absence of data transitions [58]. For this reason, if a phase detector
without a tri-state output is used it may be difficult to meet the desired jitter generation performance.
Bit Error Rate
The function of a CDR circuit is to recover a stream of data, therefore the most important FOM
is how well it does that. The probability of making an error by incorrectly identifying a data bit is
known as the bit error rate (BER). Every serial data communication standard has specifications
regarding acceptable BER. For example, if a standard specifies a BER < 10^−9, this means that
the circuit must make fewer than one error for every billion bits transmitted. As the data is retimed
at the centre of the data eye, the BER of a CDR circuit is fundamentally linked to the jitter
(as observed in the analysis of Figure 2.31). However it is difficult to get an accurate analytic
relationship between the two. In [59] an equation is presented which attempts to quantify the
relationship between jitter and BER. Equation 2.33 relates the RMS and peak-to-peak jitter using
α. Equation 2.34 then relates α to the bit error rate using the complementary error function.




While these equations provide a glimpse at an analytic relationship between jitter and BER,
there is no equation which can precisely predict BER. For this reason it is best for BER to be
measured, as opposed to calculated. Machines which measure the BER of CDR circuits are known
as Bit Error Rate Testers (BERTs). BER can also be estimated by some modern oscilloscopes,
which have software packages to analyze the jitter in an incoming data stream. The jitter is
divided into its various components (i.e. random and deterministic) and the BER is then
estimated.
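The kind of relationship described above can be sketched numerically. The sketch assumes purely Gaussian jitter and the commonly quoted form Jp−p = α · JRMS with BER = ½ · erfc(α/(2√2)). It is an assumption that this matches the form of Equations 2.33 and 2.34 in [59]. The function names are illustrative.

```python
import math

def ber_from_alpha(alpha):
    """BER for a peak-to-peak to RMS jitter ratio alpha, assuming Gaussian jitter."""
    return 0.5 * math.erfc(alpha / (2.0 * math.sqrt(2.0)))

def peak_to_peak_from_rms(j_rms, alpha):
    """Peak-to-peak jitter implied by an RMS value at the BER associated with alpha."""
    return alpha * j_rms

if __name__ == "__main__":
    for alpha in (11.996, 12.723, 14.069):
        print(alpha, ber_from_alpha(alpha))
```

Under this assumption a ratio of α ≈ 14.07 corresponds to a BER near 10^−12. Deterministic jitter is not captured by this form, which is one reason why measurement remains preferable to calculation.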
2.6 Modelling Binary Phase Detector Based CDR Circuits
The CDR analysis presented so far in this chapter has assumed a linear phase detector is used.
The use of a linear phase detector allows an analysis using classical loop theory in the derivation
of FOM such as jitter tolerance and jitter transfer. This section discusses some of the challenges
involved in analyzing a CDR circuit which uses a binary phase detector. The fundamental issue
which prevents a proper closed form analytic expression is the non-linear nature of the binary
phase detector. While the output of a linear phase detector is proportional to the magnitude of
the input phase error, the output of a binary phase detector is either UP or DOWN, regardless of
the input phase error. Traditional control theory is based on an assumption that every component
in the system either is linear, or can be linearized. This issue cannot be fully circumvented and as
such the analysis of binary phase detector based CDR circuits will never be as graceful as linear
phase detector based CDR circuits. However, in recent years several interesting papers have been
written which provide good models to better understand and optimize the performance of binary
phase detector based CDR circuits.
This section does not aim to provide a comprehensive mathematical analysis of a binary
phase detector based CDR circuit, rather the goal is to provide an intuitive understanding of the
functionality of the system and then summarize some of the efforts which have been made toward
analyzing these systems. The math presented mostly follows the approach taken by Lee et al. in
[60], however other works are also used.
2.6.1 Basic Binary Control: Example
In order to illustrate the behavior of a loop with binary control an example is presented. Imagine
two cars travelling around a circular track. The first car is travelling at a non-constant velocity
somewhere between 90km/h and 110km/h. The velocity of the first car is always changing,
however it is changing in a continuous manner. The goal of the second car is to maintain a
position as close to the first car as possible. In order to do this the second car must match
both the speed of the first car and its position on the track. This example is analogous to a
CDR circuit. The first car represents an incoming data stream and the second car represents the
output of the VCO. The speed of the first car represents the data rate of the input signal and
the position of the first car represents the phase.
If the second car is controlled in a linear manner its velocity will be increased or decreased
depending on how far away it is from the first car. As the cars get closer and closer the rate of
acceleration or deceleration decreases until the cars are perfectly synchronized. If the velocity
of the first car changes this will result in the positions of the cars no longer being synchronized.
The second car will accelerate or decelerate in a proportional manner in order to re-synchronize
the position.
If the second car is controlled in a binary manner the functionality will be quite different,
however the end result will be quite similar in that the positions of the two cars can still be
synchronized. For the case of binary control the second car can only travel at one of two velocities,
80km/h or 120km/h. The velocity of the second car depends only on whether the position of the
first car is ahead of it or behind it on the track. The velocity of the second car will never equal
that of the first car and as such the position of the second car will never perfectly synchronize
with that of the first car. The second car will continuously be speeding past and then falling
behind the first car. However, if the velocity of the second car can switch fast enough, its position
can accurately match that of the first car.
Figure 2.39: A CDR architecture where the binary phase detector directly controls the VCO
Figure 2.39 illustrates a CDR circuit with a very basic bang-bang control. The output of the
phase detector will either be high or low, which will correspond to one of two VCO frequencies.
The VCO will switch between these two frequencies in order to match the output phase with the
input phase.
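The behavior of such a loop can be sketched in a few lines. This is a discrete-time illustration only, assuming one phase detector decision per update and a fixed phase step per decision. The names bang_bang_track and phase_step are hypothetical.

```python
def bang_bang_track(phi_in, phase_step, phi_start=0.0):
    """Track an input phase sequence using only the sign of the phase error."""
    phi_out = phi_start
    trace = []
    for phi in phi_in:
        # Binary decision: only the sign of the error is used, not its size.
        if phi >= phi_out:
            phi_out += phase_step   # UP: VCO runs at the higher frequency
        else:
            phi_out -= phase_step   # DOWN: VCO runs at the lower frequency
        trace.append(phi_out)
    return trace

if __name__ == "__main__":
    # Constant (jitter-free) input phase: the output never settles,
    # it dithers around the input once it has caught up.
    print(bang_bang_track([0.35] * 10, phase_step=0.1))
```

Once the output has caught up with the input, it moves back and forth across the input phase with an error that stays within one phase step, as in the car analogy.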
2.6.2 Basic Binary Control: Analysis
Several authors have presented analyses of CDR circuits which implement binary phase detectors,
notably Walker and Lee et al. [18, 60], however there have also been valuable contributions by
Ramezani and Salama, Wang et al. and Greshishchev [58, 61, 62]. In their analyses all authors
other than Lee et al. assume that the binary phase detector is ideal. In [60] Lee et al. discuss the
non-ideal response of the phase detector due to metastability in the DFF, however the
frequency response is almost solely dependent on the proportional response of the phase detector.
As such in the formulation of expressions for jitter tolerance and jitter transfer the phase detector
can be assumed to be ideal.
Figure 2.40 illustrates the theoretical operation of this circuit. In this figure the input signal
is ideal, with no jitter. Even so, the difference between Φin and Φout never goes to zero as
Φout will always move back and forth across the zero phase error point. There are only two
possible frequencies and as the VCO is a phase integrator, the output phase changes in a linear
manner. In [60] Lee et al. formulate the equations for jitter tolerance and jitter transfer for a
binary phase detector based CDR circuit which has a second order loop, however the framework
is very similar for the case of the first order loop and the following equations are based largely on
that work. Jitter transfer describes the response of the CDR circuit to input jitter of the form
Φin = Φin,p · cos(ωΦt). For low input jitter magnitude and frequency the loop can track the input, however
as the magnitude and/or frequency increase the loop eventually becomes unable to track.
Figure 2.40: Waveforms illustrating the ideal response of the first order loop
Figure 2.41 illustrates the response of the first order loop when the input jitter magnitude is
high. In this case the loop is not able to perfectly track the input jitter, but instead begins to
attenuate it. As before there are only two VCO frequencies and as such the output phase changes
in a linear manner. The maximum magnitude of the output phase occurs after one quarter of the
















Figure 2.41: Waveforms used to determine the jitter transfer response of the first order loop
Equation 2.36 illustrates the first order response of the loop in terms of the input jitter frequency,
ωΦ. The corner frequency occurs when the loop begins to track the input jitter properly, which
will result in |Φout,p/Φin,p| ≈ 1. Using this identity yields an expression for the jitter transfer corner





An expression for jitter tolerance describes the maximum magnitude of input jitter a CDR
circuit can tolerate before the data begins to be sampled incorrectly. This will occur when the
clock and data are 180 degrees (half a unit interval) out of phase, which can be written as Φin − Φout = π. Figure
2.42 illustrates the extreme case of jitter tolerance. In this case the input jitter is defined as
Φin = Φin,p · cos(ωΦt + δ). The input jitter is offset by angle δ so that the maximum output
phase occurs at time t = 0. The expression at time t = 0 is given in Equation 2.38.
$$\Phi_{out} = \Delta f_{bb} \cdot \frac{T}{4} = \Phi_{in,p} \cdot \cos(\delta) \tag{2.38}$$
Equation 2.38 can be reworked as in Equations 2.39 to 2.41.
















































The maximum phase error (∆Φmax) is difficult to calculate, however it is very close to the phase
error at t = T/4. As such, an assumption is made that ∆Φmax ≈ ∆Φ|t=T/4 [60]. The jitter tolerance is
defined where ∆Φmax = π, and as such Equation 2.44 provides an expression when the maximum








$$\ldots = \left|\Phi_{in,p} \sin(\delta)\right| = \pi \tag{2.44}$$
Rewriting Equation 2.39 and using that in Equation 2.44 allows an expression for ∆Φmax which
is independent of δ. Deriving an expression for the maximum input jitter magnitude which can
be tolerated will give the expression for jitter tolerance. With some manipulation an expression






















Equation 2.47 again illustrates the first order response of the loop. For very high input jitter
frequencies the maximum magnitude of input jitter is equal to π, however as the input jitter
frequency decreases the loop is able to handle larger magnitudes, increasing at a rate
of 20dB/dec. The corner frequency will occur when Equation 2.47 is equal to √2·π. This in








$$\implies \omega_{bw} = \frac{\Delta f_{bb}}{2} \tag{2.48}$$
Figure 2.43: The jitter transfer and jitter tolerance response of the first order loop
Figure 2.43 illustrates the frequency response of the simple first order binary phase detector. By
changing the value of ∆fbb a designer can alter the system response in order to achieve the desired
performance.
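The dependence of both responses on ∆fbb can be explored with a short sketch. It is built only from the relationships stated in Equations 2.38 and 2.44, with the added assumptions that the jitter period is T = 2π/ωΦ and that ∆fbb is expressed in rad/s so that ∆fbb · T/4 is a phase in radians. The resulting expressions may differ in form from Equations 2.36 and 2.47, and the function names are illustrative.

```python
import math

def jitter_transfer_first_order(w_jitter, phi_in_peak, delta_f_bb):
    """|Phi_out,p / Phi_in,p| for sinusoidal input jitter of peak phi_in_peak."""
    period = 2.0 * math.pi / w_jitter
    # Largest output phase the loop can reach in a quarter period (Equation 2.38 form).
    phi_out_slew_limited = delta_f_bb * period / 4.0
    # The loop tracks fully (ratio 1) until it becomes slew limited.
    return min(1.0, phi_out_slew_limited / phi_in_peak)

def jitter_tolerance_first_order(w_jitter, delta_f_bb):
    """Largest tolerable Phi_in,p in radians.

    Combines Phi_in,p * cos(delta) = delta_f_bb * T / 4  (Equation 2.38)
    with     Phi_in,p * sin(delta) = pi                  (Equation 2.44)
    by squaring and adding, which removes delta.
    """
    period = 2.0 * math.pi / w_jitter
    tracked = delta_f_bb * period / 4.0
    return math.hypot(tracked, math.pi)

if __name__ == "__main__":
    delta_f_bb = 2.0e7   # hypothetical value, rad/s
    for w in (1e5, 1e6, delta_f_bb / 2.0, 1e8, 1e9):
        print(w, jitter_tolerance_first_order(w, delta_f_bb))
```

At ωΦ = ∆fbb/2 the tolerance evaluates to √2·π, in agreement with Equation 2.48, and at high jitter frequencies it approaches π.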
2.6.3 Second Order CDR Systems
There are several ways to improve the performance of binary phase detector based CDR circuits
beyond that of the first order system, all of which involve increasing the order of the loop. The
first order binary phase detector based CDR circuit is a very simple circuit, and it is remarkably
effective, however one problem it has is that the VCO frequencies (fhigh and flow) are fixed. This
can be a problem if process variations shift the frequency of the VCO off the designed centre point.
∆fbb can be made very large in order to compensate for process variations, however this
has implications for the performance of the CDR circuit. Making the loop second order
allows the loop to track the frequency as well as the phase.
Three methods to increase the order of the loop are shown in Figure 2.44. The first method
involves adding a charge pump and capacitor between the phase detector and the VCO, as shown
in Figure 2.44a. This method changes the proportional response of the control to an integral
response. A second method involves the addition of a charge pump and an RC filter, as shown
in Figure 2.44b. This method results in both an integral response and a proportional response.
A third method involves the addition of a charge pump and a conventional second order filter, as
shown in Figure 2.44c. This method results in a third order system, which can cause problems.
Third order systems will not be discussed here, however an analysis is presented in [62].
Figure 2.44: Three architectures which increase the order of the first order loop
Figure 2.45: A second order loop with a capacitor as the loop filter
Loop Filter: Capacitor
The simplest method of increasing the order of a first order binary phase detector based CDR
circuit is to add a charge pump and a capacitor. This changes the proportional response of the
CDR circuit to an integral response. The result is that the frequency of the VCO is not switched
between discrete values, but rather increases and decreases in a continuous manner as charge is
added to and subtracted from the capacitor. This system and the associated waveforms are given
in Figure 2.45. The operation of the loop sounds similar to that of a system which includes a linear
phase detector, however the phase detector is still binary, and hence no matter the magnitude
of the phase error the frequency will change at the same rate of Kvco · Icp/C. This response leads
to the output phase of the VCO changing in a parabolic manner, which in turn leads to jitter
transfer and jitter tolerance responses which are second order. Figure 2.46 illustrates the
effect of an integral response on the output phase.
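The parabolic phase response can be reproduced with a simple time-stepped sketch. It assumes the phase detector holds the UP state for the whole interval, that Kvco is expressed in rad/s per volt, and that the frequency and phase offsets start from zero. The function and parameter names are illustrative.

```python
def integrate_up_pulse(k_vco, i_cp, c, duration, dt):
    """Frequency and phase offsets after the phase detector holds UP for `duration`."""
    freq_offset = 0.0    # rad/s
    phase_offset = 0.0   # rad
    for _ in range(round(duration / dt)):
        # The capacitor integrates the charge pump current, so the VCO
        # frequency ramps at the constant rate Kvco * Icp / C.
        freq_offset += k_vco * i_cp / c * dt
        # The VCO integrates frequency, so the phase grows quadratically.
        phase_offset += freq_offset * dt
    return freq_offset, phase_offset

if __name__ == "__main__":
    # Hypothetical component values.
    f, p = integrate_up_pulse(k_vco=1e9, i_cp=1e-4, c=1e-9, duration=1e-6, dt=1e-9)
    print(f, p)
```

The phase offset follows ½ · (Kvco · Icp/C) · t² to within the error of the time step, regardless of how large the phase error that triggered the UP state was.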
Figure 2.46: Waveforms showing the integral response provided by the capacitor
The analysis of this system is very similar to that of the first order system presented above,
the only difference being that instead of the phase changing in a linear manner it changes in a
parabolic manner. This requires a slightly more complicated derivation of the phase, however the
steps are essentially the same. Lee et al. provide the framework for this analysis, and extending
from this work one can easily calculate the equations for jitter tolerance and jitter transfer, and
from those the associated bandwidths [60]. The equation for jitter transfer is given in Equation
2.49 and solving for |Φout,p/Φin,p| = 1 gives the associated bandwidth, given in Equation 2.50. As in
Section 2.6.2 solving for Φin − Φout = π gives the equation for jitter tolerance, which is given in
Equation 2.51. The corner frequency of jitter tolerance can be found by solving for |Gjtol| = √2·π,















2 · C · Φin,p
(2.50)
|Gjtol| = 1.26 · KvcoIcpπ
2






2 · C (2.52)
Figure 2.47: A second order loop with a first order RC loop filter
Loop Filter: RC Filter
The most common way of increasing the order of the simple binary phase detector based CDR
circuit is to add a charge pump and a first order RC filter. This creates a loop which has both
a proportional and an integral response. The proportional response is due to the voltage drop
across the resistor as the charge pump switches between charging and discharging, whereas
the integral response is due to the charging and discharging of the capacitor. In order to achieve
good stability the capacitor in the first order filter is generally quite large, which means that
except for the cases of very low frequency jitter the proportional response will determine the
response of the circuit. As such the proportional response will be largely responsible for the jitter
tolerance and jitter transfer characteristic of the CDR circuit, which means that those FOM
will have a first order response [18]. The integral response is still important, as it is responsible
for frequency tracking and low frequency jitter tolerance [60], however it will be ignored for the
derivation of jitter transfer and jitter tolerance.
Section 2.6.2 describes the analysis of a first order loop, which is controlled by the
parameter ∆fbb. Given the assumption that the frequency response is solely dependent on the
proportional response of the loop the first order analysis is immediately transferable, the only
difference being that now ∆fbb is explicitly defined as ∆fbb = Kvco · Icp · R. The resulting equation
for jitter transfer is given in Equation 2.53 and the associated bandwidth is given in Equation
2.54. The equation for jitter tolerance is given in Equation 2.55 and the associated bandwidth is



























In a CDR circuit jitter can be caused by numerous sources, including power supply noise, VCO
phase noise and substrate noise. As such, while jitter generation is an important figure of merit
it is difficult to formulate an equation which precisely describes it. In [18] Walker formulates an
equation for jitter generation by describing the phase detector as a noise source and calculating
the output noise. The analysis is quite detailed and deals with several regions of operation which
depend on the magnitude of the input jitter and the loop stability. The resulting equation for
jitter generation is given in Equation 2.57, where Φbb is equal to the magnitude of the loop phase
step and thus proportional to ∆fbb.
$$J_{gen} \approx 0.79 \cdot \sqrt{\Phi_{bb} \cdot J_{in}} \tag{2.57}$$
Walker’s analysis in [18] is the most detailed, however there are other equations which are used
for jitter generation. In [58] Greshishchev gives an equation which describes the jitter generation
as proportional to the loop delay (tloopdelay) and the proportional response of the loop (∆fbb), as
shown in Equation 2.58.
$$J_{gen} \approx t_{loopdelay} \cdot \Delta f_{bb} \tag{2.58}$$
Figure 2.48: Relationship between jitter tolerance, jitter generation and ∆fbb
Regardless of the exact relationship between ∆fbb and Jgen the relationship between jitter gener-
ation and jitter tolerance is easily understood. Looking back to the car analogy in Section 2.6.1
it is clear that if the velocity of the first car exceeds the maximum velocity of the second car, the
second car cannot properly track it. This is analogous to the relationship between input jitter
frequency (ωΦ) and ∆fbb. The value of ∆fbb sets an upper bound on the magnitude of input jitter
which can be tolerated. However, the proportional response cannot simply be increased to allow
tracking of arbitrarily high input jitter magnitudes. As described in Equation 2.57, increasing
the value of ∆fbb will result in a higher output jitter.
Figure 2.48 shows a graphical representation of these equations and the associated tradeoff
between jitter generation and jitter tolerance. This figure illustrates that the value of ∆fbb must
be set high enough to meet the jitter tolerance specification, however it must not be set
too high as that will result in the failure of the jitter generation specification.
Figure 2.49: Response of two-state and tri-state binary phase detectors
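The tradeoff between jitter tolerance and jitter generation can be expressed as a design window for ∆fbb. The sketch below uses ∆fbb = Kvco · Icp · R and Equation 2.58 for the upper limit. For the lower limit it assumes a first order jitter tolerance of the form π · sqrt(1 + (∆fbb/(2ωΦ))²), obtained by combining Equations 2.38 and 2.44 with T = 2π/ωΦ. This form, the units (∆fbb in rad/s, jitter in radians) and all function names are assumptions made for illustration.

```python
import math

def delta_f_bb_from_components(k_vco, i_cp, r):
    """Proportional frequency step of the RC loop: Kvco * Icp * R."""
    return k_vco * i_cp * r

def min_delta_f_bb_for_tolerance(w_jitter, phi_tol_spec):
    """Smallest delta_f_bb that tolerates phi_tol_spec radians of jitter at w_jitter.

    Assumes |G_jtol| = pi * sqrt(1 + (delta_f_bb / (2 * w_jitter))**2).
    """
    if phi_tol_spec <= math.pi:
        return 0.0   # any loop tolerates up to pi under this model
    return 2.0 * w_jitter * math.sqrt((phi_tol_spec / math.pi) ** 2 - 1.0)

def max_delta_f_bb_for_generation(j_gen_spec, t_loop_delay):
    """Largest delta_f_bb allowed by J_gen ~ t_loopdelay * delta_f_bb (Equation 2.58)."""
    return j_gen_spec / t_loop_delay

def design_window(w_jitter, phi_tol_spec, j_gen_spec, t_loop_delay):
    """Return (low, high) limits on delta_f_bb, or None if the specs conflict."""
    low = min_delta_f_bb_for_tolerance(w_jitter, phi_tol_spec)
    high = max_delta_f_bb_for_generation(j_gen_spec, t_loop_delay)
    return (low, high) if low <= high else None
```

A return value of None corresponds to the case where no value of ∆fbb can satisfy both specifications, in which case another parameter such as the loop delay has to be improved.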
Tri-state Phase Detectors
So far in this section it has been assumed that the binary phase detector used is a pure binary
phase detector, such as a DFF binary phase detector. However, many standards require the
system to operate even with input data sequences which contain long intervals with no data
transitions. As such, most systems do not implement a pure binary phase detector, but rather
use a tri-stated circuit, most commonly the Alexander phase detector. The difference in the
resulting output waveforms for these two phase detectors given a CDR architecture with a charge
pump and first order filter is shown in Figure 2.49. The difference is handled differently by
different authors. Walker suggests increasing ∆fbb in order to compensate for the degradation
in the jitter tolerance caused by the lower data density [18]. In a similar manner to Walker,
Greshishchev uses a gain term DF (referred to in this thesis as Kdf), which describes the data
density. This variable allows the designer to account for the fact that the correction is not
being continuously applied, as a reduced data density will require other design parameters to be
modified in order to compensate for the reduced ∆fbb [58]. Ramezani and Salama take a different
tack and calculate the phase change due to the proportional and integral branches separately,
however they primarily discuss locking behavior as opposed to the frequency
response [61].
Figure 2.50: V/I circuit used by Lee et al.
Lee et al. do not specifically address this issue, but rather sidestep it with the use of a
low-speed V/I converter as opposed to a charge pump [60]. According to Lee et al. the V/I
converter is preferable to a charge pump as it senses the average output of the phase detector,
as opposed to being driven by high-speed pulses. The V/I converter has the effect of bringing
the system response back to that of a pure binary phase detector. The schematic of the V/I
converter used by Lee et al. is shown in Figure 2.50 [63]. (While this is not the same paper or even
the same author, this circuit is from the same research group and is used in several other papers
and as such it is virtually certain that the same V/I circuit is used by Lee et al.) While this
circuit allows the designer to avoid the effects of the phase detector pulses on the filter voltage,
other difficulties are created. First, with this circuit the designer has no control over the charge
pump current. The charge pump current allows the designer to tune the characteristics of the
CDR circuit during testing, and as such this scheme loses that flexibility. Also, as the current
is dependent on the absolute voltage levels at the output of the XOR gates, any common-mode
offset between these signals will result in a systematic offset, which will impact performance.
For an architecture which has a proportional response, the use of a tri-state phase detector
requires a small modification to the control of the VCO. Using a tri-state phase detector means
that there are three states and as such the proportional path must also have three states. The
third state, which occurs in the absence of any data transitions, must not result in either the UP
or DOWN state (which correspond to the frequencies Fhigh and Flow) but rather in the midpoint
[53]. As such the proportional path must be able to output a third frequency, located at the
midpoint of Fhigh and Flow. For a ring oscillator this can be a simple circuit, one example of
which is shown in Figure 2.51. In this circuit the third state occurs when the UP voltage
equals the DOWN voltage, which results in the VCO’s proportional current being equal to half of
Ibias bb.
Figure 2.51: Example of a circuit with tri-state frequency control
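The three-state behavior of the proportional path can be summarized in a short behavioral sketch. It models the UP and DOWN signals as booleans and treats any case where they are equal as the third state. This is an illustrative abstraction and not a model of the circuit in Figure 2.51.

```python
def proportional_frequency(up, down, f_high, f_low):
    """VCO frequency selected by a tri-state proportional path."""
    if up and not down:
        return f_high                   # UP state
    if down and not up:
        return f_low                    # DOWN state
    # Third state (UP equals DOWN, e.g. no data transition):
    # hold the midpoint so the phase is not pushed either way.
    return 0.5 * (f_high + f_low)

if __name__ == "__main__":
    f_high, f_low = 10.05e9, 9.95e9     # hypothetical frequencies in Hz
    for up, down in ((True, False), (False, True), (False, False)):
        print(up, down, proportional_frequency(up, down, f_high, f_low))
```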
Notes
It should be noted that at this time there is no generally accepted comprehensive analysis of binary
phase detector based CDR circuits. While numerous authors have
formulated useful and innovative models, the models differ from each other and this creates some
confusion. Some authors focus more on the time domain response and derive relationships based
on that, whereas others focus on the frequency domain response. For example, the analysis in this
thesis has ignored the concept of stability, which was derived by Walker [18] and used by both
Wang et al. and Ramezani and Salama [62, 61]. The stability of the loop relates the proportional
and integral responses and is defined as ξ = ∆Φproportional/∆Φintegral. Stability plays an important role in
Walker’s analysis, however it is not used at all in the analysis performed by Lee et al. This thesis
can only provide an overview of some of the research in this area and someone studying CDR
circuit analysis would be wise to go back to the original papers to get a complete understanding
of the work which has been done.
2.7 Summary
This chapter presented a background on wireline data communication circuits. Wireline data
communication systems can be implemented over optical fibre or over a backplane, however for
both systems the CDR circuit is very important. The individual blocks which compose a CDR
circuit were detailed and a mathematical model for the CDR circuit was introduced. Various
FOM are used to characterize the performance of CDR circuits, and both frequency domain
and time domain FOM were described. It was shown that for linear phase detector based CDR
circuits classic loop theory can be used to formulate equations for many of the FOM. For binary
phase detector based CDR circuits it is more difficult to formulate similar equations, however
one method of formulating these expressions was presented. Also, the effects of noise on CDR
circuits were examined and the relationship with the system bandwidth was illustrated.
Chapter 3
Robustness Considerations in CDR Circuits
The word robustness means different things in different fields of study, so it is important to first
define the term. Even when an IC has been properly designed there will be discrepancies between
the simulated results and the measured results. The ability of a circuit to operate properly, in
spite of the silicon not matching the idealities of simulation, is the fundamental idea of robustness
in the context of circuit design. While robust circuit design has been heavily researched in the
areas of memory and logic circuits, the robustness of CDR circuits has not been studied. In this
chapter the specific effects of process non-idealities on CDR circuits are analyzed at both the
circuit level and the system level.
3.1 Robustness
The scaling of CMOS is a critical factor in the advance of computing systems. CMOS processes
scale to smaller geometries approximately every one-and-a-half to two years [64]. The number of
transistors in ICs has correspondingly increased, as has their frequency of operation. Highly scaled
geometries enable the integration of multi-Gb/s CDR circuits in monolithic circuits, however
they also create challenges. One of the most significant challenges is that the aggressive scaling of
CMOS processes to maximize performance has increased the process variability [65] [66] [67]. The
effects of process variations have thus far primarily been studied in the context of digital circuits
and memories, however these variations also negatively affect the performance of CDR circuits.
Figure 3.1: Some sources and effects of process variations [3] [4]
The graphs on the left and right in Figure 3.1 are from [3] and [4] respectively. The graph on
the left in Figure 3.1 illustrates the fact that in recent years the scaling of CMOS has caused the
variability to increase. Taking Leff as an example it can be seen that the percentage variation was
over 30% in 2000 when the paper was written and the variability was expected to continuously
increase. The graph on the right in Figure 3.1 illustrates the effect of this variability on a circuit.
A current mirror with a desired 10:1 ratio is simulated in 130nm, 90nm and 65nm CMOS
processes and the results show that for the scaled transistors the variability in the actual
current ratio is significantly higher.
Process variations are an important part of the reason why robust design is needed, however
there are also other sources of non-idealities including temperature variations, threshold voltage
shifts, random transistor mismatches and the inadequacy of current transistor models. Transistor
model limitations become more problematic with circuits operating at multi-GHz frequencies, as
devices are often inadequately characterized at these frequencies. All of these issues have the
effect of creating gaps between the simulation environment and silicon results.
Scaled CMOS processes have enabled the integration of more and more processing power
onto a single die. This large amount of processing power requires an equally large amount of
data, making high bandwidth data communication circuits desirable. As such, scaling gives the
ability to create multi-Gb/s interconnects, integration supplies the impetus to do so and process
non-idealities make robust design of those interconnects increasingly difficult.
3.1.1 Definition of Robustness
There is no single definition of robustness, nor is there a single scale by which it can be judged.
In the most basic sense robustness refers to strength and endurance. In the context of circuit
design this is used to identify those circuits which continue to operate under conditions where
other circuits fail. In circuit design any comparison of robustness must use a reference. Hence
circuit X cannot technically be defined as ‘robust’, rather circuit X can only be defined as ‘robust
compared to circuit Y’, or ‘robust with respect to some predefined standard’. Robustness is
essentially the same as manufacturability. The design must work reliably in different environments
and it must work over a long period of time.
3.1.2 Robustness in This Thesis
As there is no single metric by which robustness can be measured, enhancements in robustness
can be difficult to quantify analytically, especially in a design environment. Most research into
CDR circuits is aimed at enhancing performance. This allows for easily quantifiable results,
as there are detailed FOM which can be used. Section 2.5 describes various FOM used to
quantify the performance of CDR circuits. Robustness is not an easy FOM to measure
in a university environment. In a university there is limited access to the kind of proprietary
information semiconductor foundries use to calculate yield. When chips are fabricated in the
university environment, only a few chips are available to be tested, which does not allow for any
analysis of yield. While this does pose some problems for a thesis based around ‘robust design’,
the approach taken here is to look at circuits from an architectural point of view, and analyze
their sensitivities with respect to reference circuits.
Figure 3.2: Phase detector performance and robustness goals
In this thesis the overarching goal is to optimize CDR circuits for both performance and
robustness. In order to achieve a more robust interconnect reducing the data rate is always an
option which will work. However, the result of reducing the data rate is that performance is not
optimized and the design is not taking full advantage of the process. In such a situation multiple
links may be required to get the desired bandwidth, which would increase area, power, design
complexity and system cost.
Much of this thesis focuses on the phase detector as the circuit where robustness can be
affected. It will be shown later that with respect to the two basic types of phase detectors, linear
and binary, there are very different robustness and performance characteristics. Linear phase
detectors ideally give better performance, but are more sensitive to process non-idealities.
Conversely, binary phase detectors have theoretically lower performance, but are less sensitive
to process non-idealities. As such, there are two approaches to robustness taken in this thesis.
First, calibration techniques will be introduced to improve the robustness of linear phase detectors
while maintaining their performance. Second, circuits will be introduced which improve the
performance of binary phase detectors, while maintaining their robustness. Figure 3.2 illustrates
these goals, using jitter as a simple performance metric. The grey area describes the region of
possible output jitter magnitudes, with the black bar representing the mean. Figure 3.2a shows
that the mean jitter for a linear phase detector based CDR circuit is lower than that of a binary
phase detector based CDR circuit, however the variability of the linear circuit is much larger.
Figure 3.2b illustrates that the goal of this thesis is not to sacrifice performance for robustness,
but rather to improve the robustness of linear phase detectors by reducing the variability while
maintaining the performance and to maintain the robustness of binary phase detectors while
improving their performance.
Calibration
As will be seen later in this chapter, the response of a phase detector can experience significant
shifts due to process non-idealities. In Chapter 4 calibration circuits will be introduced in order
to compensate for these variations and return the performance of the phase detector to its desired
location. While the use of calibration is common in many analog circuits, it has found limited use
in wireline circuits, in part as it is difficult to use calibration with circuits operating at multi-Gb/s
frequencies.
Novel Phase Detector Circuits
The calibration circuits which will be described in Chapter 4 introduce additional circuitry
surrounding the phase detector, designed to compensate for variations. In Chapter 5 phase detector
circuits will be introduced which are designed to be robust and improve performance over the
standard design. Both linear and binary phase detector circuits are proposed and analyzed with
respect to both performance and robustness.
3.2 Mathematical Analysis of Static Phase Offset
In an ideal CDR circuit the clock and data signals will be perfectly aligned when the system is in
the locked condition. However, in reality the clock and data signals will often lock in a non-ideal
state where there is a phase difference between them. This is known as a static phase offset. A
static phase offset is a significant problem in a CDR circuit because it results in the incoming
data no longer being sampled at the centre of the data eye [17]. In this section the effect of a
static phase offset on the performance of a CDR circuit is analyzed mathematically. A static
phase offset will affect any phase detector architecture in a similar manner and the mathematical
results presented in this section will hold regardless of the phase detector architecture used.
3.2.1 Static Phase Offsets in a Phase Detector
As described in Section 2.5.4 the probability of a CDR circuit making an error by incorrectly
sampling the data is known as BER. Every CDR circuit will have a BER which is caused by jitter.
As described in Section 2.5.2, jitter can be divided into two categories: random and deterministic.
While random jitter can be approximated by a Gaussian distribution [68], deterministic jitter has
many different potential sources. While there are many sources of deterministic jitter, their effects
on a data communication system are similar, hence they are usually considered collectively. Unlike
random jitter, deterministic jitter is bounded, and its effect is to shrink the data eye by a finite
amount. As deterministic jitter is bounded, it is less significant as compared to random jitter and
will only further reduce the ability of the CDR circuit to tolerate static phase offsets. For that
reason, and in order to simplify the mathematical analysis, deterministic jitter is ignored. Figure
3.3 shows an eye diagram illustrating the random jitter in the system. The dot in the centre of
the data eye represents the ideal sampling point. The multiple traces in the transition of the clock
signal illustrate the uncertainty due to noise. The bottom curve represents the probability density
function (PDF) of the clock transition point. The probability of a sampling error is illustrated
by the blackened area under the PDF curve which is beyond the next data transition.
Figure 3.3: Eye diagram of a CDR circuit when Φspo = 0
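In this analysis the transition point of the clock has a Gaussian distribution, hence the
probability of a sampling error (Pe) can be written mathematically as the probability that the clock
transition occurs either before the leading data transition, or after the trailing data transition:
\[ P_e = P\left(t_{trans} < -\frac{T}{2}\right) + P\left(t_{trans} > \frac{T}{2}\right) \tag{3.1} \]
The PDF of a Gaussian distribution is given by:
\[ f(t) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(t-m)^{2}}{2\sigma^{2}}\right) \tag{3.2} \]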
In Equation 3.2 ‘σ’ represents the standard deviation and ‘m’ represents the mean. In a CDR
circuit the standard deviation corresponds to the random jitter (Jrms) and the mean corresponds
to the deviation of the clock transition from the ideal sampling point. Given an ideal system where
the sampling point is exactly at the centre of the data eye then the probability of ttrans > T/2 is
equal to the probability that ttrans < −T/2. This can be written as:
\[ P\left(t_{trans} > \frac{T}{2}\right) = P\left(t_{trans} < -\frac{T}{2}\right) \tag{3.3} \]
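Integrating the area under the PDF which lies beyond ±T/2 will give the probability of an error,
\[ P_e = \frac{2}{J_{rms}\sqrt{2\pi}} \int_{T/2}^{\infty} \exp\left(-\frac{t^{2}}{2J_{rms}^{2}}\right) dt \tag{3.4} \]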
The integral in Equation 3.4 has no closed-form solution, however the Q-function can be used to
simplify the analysis. It is also important to note that the probability of an error is equal to the
BER. Using the Q-function and Equation 3.4, Equation 3.1 can be re-written as:
\[ BER = 2\,Q\left(\frac{T}{2J_{rms}}\right) \tag{3.5} \]
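The step from Equation 3.4 to the Q-function form can be sketched as follows. This is an illustrative derivation, assuming the standard definition of the Q-function as the tail probability of a zero-mean, unit-variance Gaussian and a zero-mean clock transition time with standard deviation Jrms; the substitution variable u is introduced only for this sketch.

```latex
\begin{align*}
Q(x) &= \frac{1}{\sqrt{2\pi}} \int_{x}^{\infty} e^{-u^{2}/2}\, du \\
P\left(t_{trans} > \tfrac{T}{2}\right)
  &= \frac{1}{J_{rms}\sqrt{2\pi}} \int_{T/2}^{\infty} e^{-t^{2}/(2J_{rms}^{2})}\, dt
   = Q\left(\frac{T}{2J_{rms}}\right), \qquad u = \frac{t}{J_{rms}} \\
BER &= P_e = 2\,P\left(t_{trans} > \tfrac{T}{2}\right) = 2\,Q\left(\frac{T}{2J_{rms}}\right)
\end{align*}
```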
A static phase offset in a CDR circuit moves the sampling point away from the centre of the
data eye. In a mathematical analysis this has the effect of moving the mean of the Gaussian
distribution by the amount of the static phase offset. Therefore the BER of a CDR circuit with
a static phase offset equal to Φspo can be expressed as:
\[ BER = Q\left(\frac{T/2 - \Phi_{spo}}{J_{rms}}\right) + Q\left(\frac{T/2 + \Phi_{spo}}{J_{rms}}\right) \tag{3.6} \]
Figure 3.5: Maximum allowable RMS jitter in the presence of static phase offsets
The BER will, for any significant Φspo, be dominated by either the first or second term in
Equation 3.6. The eye diagram for a CDR circuit with a static phase offset is illustrated in Figure
3.4. Here the zero crossing point of the sampling clock is offset by Φspo. Again the probability of
an error is illustrated by the blackened area under the Gaussian curve. The blackened area, and
therefore the error probability, is greater for the case where Φspo ≠ 0.
Using Equation 3.6 CDR circuits operating at data rates of 2.5Gb/s, 5Gb/s and 10Gb/s are
analyzed. Matlab is used in order to calculate the maximum magnitude of RMS jitter that a
CDR circuit with a given Φspo can tolerate. The CDR circuit is considered to fail if its BER
exceeds 10⁻¹². The results of this analysis are given in Figure 3.5. It is important to point out
that these results represent theoretical maximums, and should not be considered as indicative
of actual circuit performance. They are, however, useful for observing trends and tradeoffs. In
Figure 3.5 the area under the curves represents the region where the CDR circuit has a BER less
than 10⁻¹². The slopes of these lines are identical, indicating that Equation 3.6 can be re-written
in a form which is independent of the data rate. Dividing the numerator and denominator of each
term in Equation 3.6 by the period, T, results in an equation for BER which is independent of the
data rate. The result is expressed in Equation 3.7, where Φspo and Jrms are expressed in terms of
the unit interval (UI), which is one bit period.
\[ BER = Q\left(\frac{\frac{1}{2} - \Phi_{spo}}{J_{rms}}\right) + Q\left(\frac{\frac{1}{2} + \Phi_{spo}}{J_{rms}}\right) \tag{3.7} \]
As an example, the case where the static phase offset increases from zero to a quarter of the
period is analyzed using Equation 3.7. In this situation the first term in Equation 3.7 dominates
and its numerator will be halved, decreasing from 1/2 to 1/4. The effect of this is that in order to
keep the same BER the Jrms budget must also be halved. This can be seen in Figure 3.5, with
the maximum allowable Jrms for the 10Gb/s CDR circuit decreasing from 10ps to 5ps as the Φspo
increases from 0ps to 25ps.
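The jitter budget calculation described above can be sketched in Python. This is a minimal sketch and not the Matlab script used for Figure 3.5: it assumes a two-term BER expression in UI with the standard Q-function, Q(x) = ½ erfc(x/√2), and the function names (`q_function`, `ber`, `max_rms_jitter_ui`), the search bracket and the bisection search are illustrative choices. Absolute jitter values depend on the Q-function convention and may not match the values read from the figure, so only the halving trend is examined.

```python
import math


def q_function(x):
    """Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber(phi_spo_ui, j_rms_ui):
    """Two-term BER with offset and RMS jitter both expressed in UI."""
    return (q_function((0.5 - phi_spo_ui) / j_rms_ui)
            + q_function((0.5 + phi_spo_ui) / j_rms_ui))


def max_rms_jitter_ui(phi_spo_ui, target_ber=1e-12):
    """Largest RMS jitter (in UI) that keeps the BER at or below target_ber.

    BER grows monotonically with jitter, so a bisection search is enough.
    """
    low, high = 1e-4, 1.0  # search bracket in UI (illustrative bounds)
    for _ in range(200):
        mid = 0.5 * (low + high)
        if ber(phi_spo_ui, mid) > target_ber:
            high = mid
        else:
            low = mid
    return low


# Compare the budget with no offset against a quarter-UI offset.
budget_no_offset = max_rms_jitter_ui(0.0)
budget_quarter_ui = max_rms_jitter_ui(0.25)
print("budget ratio:", budget_quarter_ui / budget_no_offset)
```

Under these assumptions the ratio of the two budgets comes out close to one half, which is consistent with the halving argument in the example.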
In Figure 3.6 Equation 3.7 is plotted to demonstrate the effect of static phase offsets on the
BER. The five different curves represent the BER values for various static phase
offsets, from zero to 0.25UI. This figure demonstrates that even relatively small static phase
offsets can significantly increase the BER of a CDR circuit. As an example, Figure 3.7 shows the
resulting BER for a 10Gb/s CDR circuit when a static phase offset of 10ps is introduced. The
lower curve in the figure represents the BER for the ideal case where there is no static phase offset
and the upper curve represents the BER for the case where there is a 10ps static phase offset. If
the CDR circuit had an RMS jitter of 10ps and the desired BER was 10⁻¹⁰ it can be seen that
the ideal system meets the design specification with a BER of approximately 10⁻¹². However,
the introduction of a 10ps static phase offset increases the BER by four orders of magnitude, to
approximately 10⁻⁸, resulting in the desired performance not being met. In this example a 10Gb/s
CDR circuit is used, and for such a system an RMS jitter of 10ps is unrealistically high. However,
again it is noted that these results represent theoretical maximums, and the negative effects of
static phase offsets will become significant with much less RMS jitter in real CDR circuits.
Figure 3.6: Output BER with respect to Jrms and Φspo
Figure 3.7: The effect of a 10ps static phase offset on the BER of a 10Gb/s CDR circuit
Figure 3.8: Simulated maximum input jitter vs Φspo for a 5Gb/s CDR circuit
3.2.2 Simulation Results
In order to verify the analysis a 5Gb/s CDR circuit was modelled using Matlab. The model had
a variable static phase offset and the amount of jitter could also be varied. Extensive simulations
were performed to determine the maximum amount of input jitter which could be tolerated
for a given static phase offset. These simulations are challenging, as a statistically accurate
measure of BER requires a very long simulation time. For a 5Gb/s CDR circuit it is simply
not practical to run a time-domain simulation for more than a few tens of microseconds, given
the small timestep required. As such, the BER values which were targeted were 10⁻³ and 10⁻⁴,
which correspond to one error every 0.2µs and one error every 2µs. In order to get results which
were statistically meaningful the simulation times were an order of magnitude longer than the
mean time between errors at the associated BER. Also, the simulations were repeated numerous
times and the results averaged.
Figure 3.8 shows both simulation results and also the theoretical values calculated from Equation
3.6. It can be seen that as the static phase offset increases the maximum jitter which can be
tolerated steadily decreases. The simulation results are not identical to the calculated values,
and this is not surprising, as the BERs calculated from Equation 3.6 are theoretical maximums.
While the Matlab model is ideal, there will inevitably be errors due to the finite accuracy of any
simulation. The important aspect of Figure 3.8 is that the trends of the simulated BER curves
accurately track the theoretical values. This supports the accuracy of the mathematical analysis
and reinforces the negative effects of static phase offsets.
Figure 3.9: Schematic of a CML based DFF with waveforms illustrating its functionality
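The comparison between simulated and calculated BER can be illustrated with a much simpler experiment than the time-domain CDR model described above. The following is a minimal Monte Carlo sketch, not the Matlab model used in this section: it assumes each clock transition is an independent Gaussian sample with mean Φspo and standard deviation Jrms, counts a bit error whenever the sample falls outside ±T/2, and compares the result with a two-term Q-function expression using the standard definition Q(x) = ½ erfc(x/√2). All function names and the numerical values in the usage lines are hypothetical.

```python
import math
import random


def q_function(x):
    """Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def theoretical_ber(period, phi_spo, j_rms):
    """Two-term BER for a Gaussian clock transition offset by phi_spo."""
    return (q_function((period / 2.0 - phi_spo) / j_rms)
            + q_function((period / 2.0 + phi_spo) / j_rms))


def simulated_ber(period, phi_spo, j_rms, n_bits, seed=0):
    """Fraction of bits whose sampling instant falls outside the data eye."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_bits):
        t_trans = rng.gauss(phi_spo, j_rms)  # sampling instant relative to eye centre
        if abs(t_trans) > period / 2.0:
            errors += 1
    return errors / n_bits


def mean_time_between_errors(ber, bit_rate):
    """Average time between bit errors, in seconds, for a given BER and bit rate."""
    return 1.0 / (ber * bit_rate)


# Illustrative values only: 5Gb/s (T = 200ps) with exaggerated jitter so that
# errors are frequent enough to count in a short run.
print(theoretical_ber(200.0, 20.0, 30.0))
print(simulated_ber(200.0, 20.0, 30.0, 200000, seed=1))
print(mean_time_between_errors(1e-3, 5e9))
```

The last helper reproduces the arithmetic quoted above: at 5Gb/s a BER of 10⁻³ corresponds to one error every 0.2µs on average.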
3.3 DFF Analysis
The DFF is the fundamental building block of virtually every phase detector. The DFF is also
used to re-time the data signal. A CML DFF is composed of two latches, as shown in Figure
3.9. The associated waveforms are also shown, along with the modes of the two latches. As can
be seen, each latch is either in sample mode or hold mode. Transistors M3 and M4 comprise
the sampling circuit and transistors M5 and M6 comprise the hold circuit. When latch A is in
sampling mode, latch B is in hold mode, and vice versa. The clock signal controls which mode
a latch is in by steering the bias current via transistors M1 and M2. When the clock is high
transistors M3a and M4a in latch A sample the incoming data signal onto VAQ. In latch B the
output signal VBQ is not dependent on the changing VAQ as the bias current is going through
transistors M5b and M6b which form a regenerative cross-coupled inverter pair. When the clock
signal transitions to low latch A holds the signal and latch B samples the signal VAQ onto the
output signal VBQ.
Ideally as soon as the clock switches and a latch enters its ‘sampling’ mode, the output will
instantaneously switch. In reality this is not the case, but rather there is a latency which is
known as the C-Q delay of the DFF. The C-Q delay is defined as the time it takes from the clock
transition until the output signal is defined. The C-Q delay itself is not constant, but is a function of
the input phase error [69]. Figure 3.9 shows the clock and data signals in their ideal condition,
however as the phase error between the clock and data signal becomes large the situation arises
where the data signal is transitioning at the same time as the clock signal. This creates a condition
known as metastability.
3.3.1 Metastability in a DFF
Metastability in a DFF occurs when the clock and data signal transition at approximately the
same time. In Figure 3.9 it can be seen that in order to sample properly all of the bias current
should be flowing through transistor M1 and the input data signal should be large. A large
input data signal combined with the gain of transistors M3 and M4 allows the output signal to
be quickly resolved. However, due to finite rise and fall times when the clock signal is near its
transition point current will not be completely switched. This will reduce the gain of the sampling
differential pair M3 and M4 and also begin to activate the hold circuit of M5 and M6. If the data
signal is transitioning at the same time its reduced swing will make it even more difficult for the
sampling pair M3 and M4 to properly sample it. The metastability of a DFF affects the C-Q
delay of the circuit in that the output signal will take more time to fully resolve [70].
The C-Q delay of the DFF is not static, but rather depends on the phase relationship between
the incoming clock and data signals. In order to examine this effect a CML DFF is simulated
in a 180nm CMOS process. The DFF is designed to operate in a 5Gb/s CDR circuit, and as
such the clock frequency is 5GHz and the data has a bit period of 200ps. The input phase error
is varied over the complete range of possible phase errors. The C-Q delay of the DFF is also affected
by process corners. As such, in order to illustrate the effects of process variations the DFF is
simulated over both process and resistance corners. Figure 3.10 illustrates the effects of phase
error and process on the C-Q delay of the DFF. As can be seen the C-Q delay of the DFF varies
significantly. Region 1 illustrates the situation where the data changes soon after the clock has
transitioned. As the clock has transitioned the change in the data signal should have no effect
on the output of the DFF, however due to the metastability the data can be passed even when
it is not supposed to be. This is why in Region 1 the C-Q delay for some of the curves drops
to negative values. Region 2 illustrates the desired situation where there is plenty of setup time
and hold time between the clock signal and the data signal. In this region the C-Q delay of
the flip-flop is relatively constant. Region 3 illustrates the situation where the data changes just
before the clock changes. In this case the output signal should ideally transition the same way
as in Region 2. However, the metastability of the DFF results in an increase in the C-Q delay, to
the point where the output data signal does not resolve until beyond the next transition of the
clock. This is why in Region 3 the C-Q delay for some of the curves rises rapidly beyond the top
edge of the figure.
Figure 3.10: Value of the C-Q delay with respect to the input phase error
Figure 3.11: The operation of a Hogge phase detector given non-ideal DFFs
3.3.2 Hogge Phase Detector Gain
Using the timing diagram in Figure 3.11 the transition times of the signals in the Hogge phase detector
can be mathematically described. The signals in Figure 3.11 differ from the ideal waveforms shown
in Figure 2.10 in that the C-Q delays of the DFFs are included. Simple expressions can be written
to express the timing of the waveforms, and these are given below.
\[ \begin{aligned}
data &= \Phi_{error} \\
Q_1 &= \frac{T}{2} + t_{cq1} \\
Q_2 &= T + t_{cq2} \\
UP_{pulsewidth} &= Q_1 - data - t_{delay} \\
 &= \frac{T}{2} + t_{cq1} - t_{delay} - \Phi_{error} \\
DN_{pulsewidth} &= Q_2 - Q_1 \\
 &= \frac{T}{2} + t_{cq2} - t_{cq1}
\end{aligned} \]
These equations illustrate that if the delay element matches the C-Q delay of DFF1 the pulse
width of UP will simply equal T/2 − Φerror, which is ideal. However, the delay of the delay element
is constant, whereas the C-Q delay of a DFF is a function of the setup time, and hence the C-Q
delay of DFF1 is a function of the input phase error. The DOWN relationship illustrates that
the DOWN pulse width will equal T/2 when the C-Q delays of the two DFFs match, which
only happens for ideal DFFs where tcq = 0. Serious problems arise when the C-Q delay of DFF1
increases to the point where the setup time of DFF2 is violated. At this point DFF2 enters
metastability and its output is no longer valid.
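The pulse width expressions above can be turned into a small numerical sketch showing how mismatched delays translate into a static phase offset. This is an illustration only, under two assumptions that are not taken from the text: the C-Q delays are treated as constants rather than as functions of the phase error, and the loop is assumed to lock where the UP and DOWN pulse widths are equal. The function names and the delay values used below are hypothetical.

```python
def hogge_pulse_widths(period, phi_error, t_cq1, t_cq2, t_delay):
    """UP and DOWN pulse widths of a Hogge phase detector with C-Q delays.

    All arguments share one time unit (for example ps).
    """
    up = period / 2.0 + t_cq1 - t_delay - phi_error
    dn = period / 2.0 + t_cq2 - t_cq1
    return up, dn


def hogge_static_phase_offset(t_cq1, t_cq2, t_delay):
    """Phase error at which UP equals DOWN (assumed lock condition).

    Setting the two pulse widths equal and solving for phi_error gives
    phi_error = 2*t_cq1 - t_cq2 - t_delay.
    """
    return 2.0 * t_cq1 - t_cq2 - t_delay


# Matched delays: both pulses are T/2 wide at zero phase error.
print(hogge_pulse_widths(200.0, 0.0, 40.0, 40.0, 40.0))
# A 10ps increase in t_cq1 shifts the assumed lock point.
print(hogge_static_phase_offset(50.0, 40.0, 40.0))
```

With all three delays matched the offset is zero; any increase in the C-Q delay of DFF1 that is not tracked by the delay element and DFF2 moves the lock point away from the centre of the data eye.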
3.3.3 Alexander Phase Detector Gain
As with the Hogge phase detector, the waveforms for the Alexander phase detector are analyzed,
including the effects of the C-Q delay. The waveforms will be different depending on whether
the clock is leading or lagging the data, and both cases are shown in Figure 3.12. One weakness
of the Alexander phase detector can be seen by examining the timing of its output waveforms.
The following expressions describe the pulse widths of the UP and DOWN signal in the situation
where the clock lags the data.
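Figure 3.12: Waveforms for an Alexander phase detector including C-Q delay
Clock lagging data:
\[ \begin{aligned}
data &= -\Phi_{error} \\
Q_1 &= \frac{T}{2} + t_{cq1} \\
Q_2 &= \frac{3T}{2} + t_{cq2} \\
Q_3 &= t_{cq3} \\
Q_4 &= \frac{T}{2} + t_{cq4} \\
UP_{pulsewidth} &= Q_4 - Q_1 = \frac{T}{2} + t_{cq4} - \frac{T}{2} - t_{cq1} \\
 &= t_{cq4} - t_{cq1} \\
DN_{pulsewidth} &= \left|Q_4 - Q_2\right| = \left|\frac{T}{2} + t_{cq4} - \frac{3T}{2} - t_{cq2}\right| \\
 &= T + t_{cq2} - t_{cq4}
\end{aligned} \]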
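Clock leading data:
\[ \begin{aligned}
data &= +\Phi_{error} \\
Q_1 &= \frac{T}{2} + t_{cq1} \\
Q_2 &= \frac{3T}{2} + t_{cq2} \\
Q_3 &= T + t_{cq3} \\
Q_4 &= \frac{3T}{2} + t_{cq4} \\
UP_{pulsewidth} &= Q_4 - Q_1 = \frac{3T}{2} + t_{cq4} - \frac{T}{2} - t_{cq1} \\
 &= T + t_{cq4} - t_{cq1} \\
DN_{pulsewidth} &= Q_4 - Q_2 = \frac{3T}{2} + t_{cq4} - \frac{3T}{2} - t_{cq2} \\
 &= t_{cq4} - t_{cq2}
\end{aligned} \]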
The relationships above illustrate that in order to get an ideal response, the C-Q delays of all the
DFFs must be identical. This is a problem as the C-Q delay of a DFF is a function of the setup
time. As long as the CDR circuit is near the locked condition the setup time of DFF1 and DFF2
will be reasonably large and hence their C-Q delays will not vary by a large amount. They can
be defined as operating in region 2, as labelled on Figure 3.10. The problem is that when the
input phase error is very small, DFF3 is sampling the data at the same time that the data is
changing. In this situation it is very difficult for DFF3 to accurately sample the data signal, as
it has virtually no setup time and as such its C-Q delay will vary significantly. This is represented in
Figure 3.10 as regions 1 and 3. The varying C-Q delay can cause the output of DFF3 to infringe
on the setup time of DFF4. Both the UP and DOWN signals depend on DFF4 and hence an
incorrect output of DFF4 results in the phase detector giving an incorrect response.
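The dependence of the Alexander outputs on matched C-Q delays can be checked numerically. The sketch below is illustrative only: it takes the edge times of Q1, Q2 and Q4 as they appear in the expansions above (Q4 at T/2 + tcq4 when the clock lags and at 3T/2 + tcq4 when it leads), treats each pulse width as the magnitude of the difference between two edge times, and treats the C-Q delays as constants. The function name and the numerical values are hypothetical.

```python
def alexander_pulse_widths(period, t_cq1, t_cq2, t_cq4, clock_leads):
    """UP and DOWN pulse widths of an Alexander phase detector with C-Q delays.

    Pulse widths are taken as the time between the two edges that bound
    each pulse, so they are always non-negative.
    """
    q1 = period / 2.0 + t_cq1
    q2 = 3.0 * period / 2.0 + t_cq2
    q4_nominal = 3.0 * period / 2.0 if clock_leads else period / 2.0
    q4 = q4_nominal + t_cq4
    up = abs(q4 - q1)
    dn = abs(q4 - q2)
    return up, dn


# Matched C-Q delays give the ideal binary response: one pulse of width T,
# the other of zero width.
print(alexander_pulse_widths(200.0, 40.0, 40.0, 40.0, clock_leads=False))
print(alexander_pulse_widths(200.0, 40.0, 40.0, 40.0, clock_leads=True))
# A 15ps longer C-Q delay in DFF4 leaks a sliver into the pulse that should be empty.
print(alexander_pulse_widths(200.0, 40.0, 40.0, 55.0, clock_leads=False))
```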
3.3.4 DFF Phase Detector Gain
A DFF phase detector is operated exclusively in the metastable region, as the clock signal should
ideally be switching at the exact time as the data signal. This would appear to pose a problem for
the DFF phase detector, however the C-Q delay of a DFF phase detector is of little importance.
The output of the DFF is not fed into any other timing block, hence the latency associated with
the C-Q delay will not significantly affect the performance. The process variations will cause
some problems relating to the ability of the DFF phase detector to resolve phase errors, and
this will be examined for the DFF phase detector and the other two phase detectors, in the next
section.
Table 3.1: Simulation data for process normalization
CMOS process   Frequency (GHz)   Rise Time (ps)   Fall Time (ps)   Frequency ratio   Data Rate (Gb/s)
180nm          0.769             47.88            47.96            1.00              5
130nm          1.21              30.20            30.19            1.57              8
90nm           1.92              18.88            18.77            2.49              12.5
3.4 Effect of Non-Idealities on Phase Detectors
The simulations in the previous section illustrated the effects of process variations on a CML DFF.
In this section the robustness of CML based phase detectors
is analyzed with respect to the scaling of CMOS processes. Three phase detectors are analyzed
over corners in three standard CMOS processes: 180nm, 130nm and 90nm. The purpose of this
section is to highlight trends in the robustness of CDR circuits as technology scales.
3.4.1 Analysis Setup
It is difficult to accurately compare different circuits over different process technologies. In order
to properly compare the results a reference circuit is used to normalize the simulation results from
the different processes. The reference circuit chosen is a 19-stage balanced CMOS ring oscillator
with minimum width NMOS transistors. Ring oscillators are commonly used to provide a simple
metric of process performance [71]. The reference circuit is simulated in three standard CMOS
processes, 180nm, 130nm and 90nm, and the results are summarized in Table 3.1. These results
are used to determine appropriate data rates for the CDR circuits in each process.
For the 180nm process the data rate chosen was 5Gb/s. While CDR circuits operating at
higher data rates in 180nm processes have been reported [20, 72, 73], they either used half-
rate phase detectors or binary phase detectors. All phase detectors in this work are full-rate
phase detectors. This especially stresses the Hogge architecture, which must generate small,
accurate pulses and hence requires a large bandwidth. In this work phase detectors are analyzed
at relatively aggressive data rates in order to expose their weaknesses and point out where
problems are likely to surface. 5Gb/s is an aggressive data rate for this 180nm process and based
on this data rate and the information in Table 3.1 the data rates for the other processes are
determined. The data rates used are 8Gb/s for the 130nm process and 12.5Gb/s for the 90nm
process.
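The normalization step can be reproduced with a few lines of arithmetic. This is a minimal sketch using the ring oscillator frequencies from Table 3.1 and the 5Gb/s reference rate chosen for the 180nm process; the names in the code are illustrative, and the rounding of the scaled rates to 8Gb/s and 12.5Gb/s is taken from the text rather than computed.

```python
# Ring oscillator frequencies from Table 3.1, in GHz.
RING_OSC_FREQ_GHZ = {"180nm": 0.769, "130nm": 1.21, "90nm": 1.92}

BASE_PROCESS = "180nm"
BASE_DATA_RATE_GBPS = 5.0  # data rate chosen for the 180nm process


def frequency_ratio(process):
    """Ring oscillator frequency relative to the 180nm reference."""
    return RING_OSC_FREQ_GHZ[process] / RING_OSC_FREQ_GHZ[BASE_PROCESS]


def scaled_data_rate_gbps(process):
    """Data rate obtained by scaling the 180nm rate by the frequency ratio."""
    return BASE_DATA_RATE_GBPS * frequency_ratio(process)


for name in RING_OSC_FREQ_GHZ:
    print(name, frequency_ratio(name), scaled_data_rate_gbps(name))
```

The scaled values land just below 8Gb/s and 12.5Gb/s, which matches the data rates used for the 130nm and 90nm processes.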
In order to determine the robustness of these phase detectors, their transfer characteristics
are analyzed. The transfer characteristic is the output of the phase detector for a specific phase
offset between the input clock and data signals. The ideal transfer characteristics for linear and
binary phase detectors were previously shown in Figure 2.8. While it is impossible for any system
to realize the ideal response, the deviations from the ideal take specific forms. One such form is
a static phase offset, the detrimental effects of which were previously derived in Section 3.2. As
was shown, with a static phase offset the clock and data signals align incorrectly, which results in
the data being sampled at a non-ideal point and hence in an increased BER. Therefore, static phase
offsets are the FOM used to compare the different phase detectors in this section.
In order to simulate the transfer characteristic of the phase detectors clock and data signals
with controlled phase errors are generated. Ideal clock and data signals are used, however the rise
and fall time of the signals correspond to realistic rise and fall times for the particular technology.
All bias voltages are ideal, in order to isolate the analysis to only the phase detectors. For all
simulations an identical data pattern was used, in order to make the results consistent. Verilog-A
models were used to measure the widths of the UP and DOWN pulses and log the data, which
was then gathered and analyzed.
3.4.2 Analysis Results
In each process the phase detectors were subjected to three basic process corners: slow-slow (SS),
typical-typical (TT) and fast-fast (FF). All of the circuits are implemented using CML, which
means that there are only NMOS transistors present, and hence the slow-fast (SF) and fast-slow
(FS) corners are redundant. The implementation of circuits using CML was previously described
in Chapter 2. The resistive pull-ups in CML circuits can be implemented using polysilicon or MOS
resistors. Using MOS resistors results in large capacitive loading compared to polysilicon resistors
(especially if symmetric loads are used [74]) and this notably reduces the overall bandwidth
of the system [7]. As such, in this work polysilicon resistors are used. However, the use of
polysilicon resistors makes the circuit sensitive to polysilicon variations and in order to stress
this characteristic of CML circuits each phase detector is also simulated with ±20% resistor
variations at each process corner. It must be emphasized that these results were obtained for
specific processes. As such, different flavors of processes from different foundries will give different
results, even for the same process node. This analysis does not claim to precisely match results
from another process, but rather the results are meant to illustrate the trends encountered when
scaling CML phase detectors. It is also important to realize that this work only analyzes the
response of the phase detectors. Process variations will negatively affect the other blocks of the
CDR circuit, further degrading the overall system response. The effect of process variations on
these blocks is not discussed in this section.
Figure 3.13 shows the response of the DFF, Alexander and Hogge phase detector at all process
corners and for all three technologies. For all simulations the results have been normalized to
clarify the analysis. For the simulations involving the Alexander and Hogge phase detectors the
pulse widths of the output UP and DOWN signals were divided by their expected widths. For
example, the ideal UP or DOWN pulse width for a 5Gb/s Alexander phase detector is 200ps and
therefore the simulated width is divided by 200ps to normalize the data. For the Alexander and
Hogge phase detectors the output is presented as ‘UP width - DOWN width’, which represents
the total effect of the correction information on the CDR circuit.
Figure 3.13: Process variation simulation results for the three phase detectors
Figure 3.14: Summary showing the overall effects of process on Φspo
In order to extract the trends from the simulation data the variation in the static phase
offset is determined. The variation in the static phase offsets for the different process corners can
easily be seen in Figure 3.13. The total variation in static phase offset for each phase detector
in each technology is summarized in Figure 3.14. For all three phase detectors the variation in
the static phase offset, expressed in degrees, increases as the process scales. It is interesting to
note that the absolute magnitude of the variation in static phase offset, expressed in time, reduces.
For example, in the case of the Hogge
phase detector in the 90nm process there are fifty degrees of variation at a data rate of 12.5Gb/s,
thus there is a total variation of 11ps. For the Hogge phase detector in the 180nm process there
are thirty-six degrees of variation at a data rate of 5Gb/s, hence there is a total static phase offset
variation of 20ps. However, the data rate increases faster than the variations decrease, leading
to a larger relative variation. The reduction in the magnitude of the variations is primarily due
to the increased gain of the transistors and the faster rise and fall time of the signals, both of
which cause the differential pairs in the CML circuits to switch faster which in turn improves the
performance of the DFFs.
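The conversion between a variation in degrees and a variation in time follows directly from the bit period. The short sketch below reproduces the two Hogge examples above, assuming that 360 degrees corresponds to one bit period; the function name is illustrative.

```python
def phase_to_time_ps(phase_degrees, data_rate_gbps):
    """Convert a phase variation in degrees to picoseconds at a given data rate."""
    bit_period_ps = 1000.0 / data_rate_gbps  # one bit period (360 degrees)
    return phase_degrees / 360.0 * bit_period_ps


print(phase_to_time_ps(50.0, 12.5))  # Hogge, 90nm, 12.5Gb/s
print(phase_to_time_ps(36.0, 5.0))   # Hogge, 180nm, 5Gb/s
```

The variation in time shrinks from about 20ps to about 11ps while the bit period shrinks from 200ps to 80ps, which is why the variation relative to the bit period grows.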
The weaknesses of the Alexander and Hogge phase detectors are largely based on having
back-to-back DFFs clocked on opposite edges of the clock. The variations in the C-Q delay of the
first DFF as the input phase error changes can make it difficult for the second DFF to accurately
sample the data. While the response of the linear Hogge phase detector is attractive, the Hogge
phase detector’s sensitivity to process variations poses a significant problem. The Alexander
phase detector is regarded as an easier circuit to integrate [20], however while it is less sensitive
than the Hogge phase detector it too shows sensitivity to process variations. The simplicity of the
DFF binary phase detector allows it to be far more robust over corners. The static phase offset
variation of this phase detector grows by
approximately 5° as the process scales, from 17° in 180nm to 23° in 90nm. In the 90nm process
the DFF binary phase detector has 55% less variation than the Hogge and 40% less variation than
the Alexander. While these simulations show the DFF binary phase detector to be more robust,
this must be balanced against the weaknesses in the architecture, most significantly that it is
not tri-stated and that it requires a separate re-timing circuit. The results of these simulations
demonstrate that as CMOS processes scale it is becoming more difficult to design robust CML
phase detectors. This is significant as high-speed serial links are becoming prevalent. It is
precisely in scaled geometries that one would want to integrate multi-Gb/s CDR circuits, and
CML is the logic family which enables the highest data rates. In order to accommodate process
variations, either less aggressive data rates must be used or novel circuit approaches must be taken.
3.5 Summary
In this chapter the concept of robustness has been examined as it applies to CDR circuits.
Whenever a circuit is fabricated there are deviations from the ideal behavior due to non-idealities
which include process variations, temperature variations, voltage fluctuations and inaccurate
transistor models. The DFF is the building block for all phase detectors and it was analyzed with
respect to process variations. The analysis showed a variation in the C-Q delay, which in turn
was shown to affect the timing of the different phase detectors. One effect of this is a static phase
offset in the phase detector. Static phase offsets were shown to have a considerable negative effect
on the BER performance of a CDR circuit and a mathematical model was created to quantify
those effects. Finally, an analysis of the robustness of CML based phase detectors with respect to the
scaling of CMOS processes was presented. Three phase detectors were analyzed over corners in
three standard CMOS technologies: 180nm, 130nm and 90nm. Simulation results show that the
total variation of static phase offsets increases with scaling for each of the phase detectors. The
DFF binary phase detector has a definite advantage over the Alexander and Hogge architectures
in terms of robustness, however it has some performance limitations. Both the Alexander and
Hogge phase detectors experience significant and increasing variations in the static phase offset
as CMOS processes scale.
Chapter 4
Calibration Techniques for Robust
CDR Circuits
Calibration is often used to improve the performance of analog circuits. While there are not
extensive publications relating to the use of calibration in data communication systems, there
are some examples of previous research. In [27] calibration is used to compensate for any shifts
in the desired centre frequency of a ring oscillator caused by processing variations. The authors
used a digital calibration circuit to control the high gain loop of the VCO, allowing the main
loop of the PLL to have a low gain to improve noise performance. In recent years there have
been several papers illustrating the use of calibration in equalizers [13, 75]. For example, in [13]
a least-mean-square (LMS) algorithm is implemented in digital CMOS and is used to configure
the equalizer. Anytime a designer uses calibration the goal is to automatically configure the system
in order to optimize the performance.
4.1 Calibration in CDR Circuits
In this chapter two types of calibration algorithms are presented, namely offline and online
calibration. Offline calibration (also called foreground calibration) operates in a secondary mode,
separate from the normal operation of the circuit. One example of this would be a circuit which
runs a calibration algorithm at startup. Once the calibration is complete normal operation
begins, and the calibration circuit is inactive. Online calibration (also called background calibration)
is a second type of calibration which operates continually, rather than at discrete intervals.
Integrated circuits can deviate from their design specifications not just once due to manufacturing
non-idealities, but continuously due to environmental conditions like temperature, and also due to
changes that occur over time. Online calibration operates continuously and therefore these
occurrences can be compensated for as they happen, whereas an offline calibration circuit would need
to be periodically enabled in order to compensate for such effects. However, offline calibration
algorithms have two significant advantages. First, offline calibration circuitry is turned off during
normal operation of the CDR circuit, and hence has no effect on the performance during normal
operation; online calibration circuitry will inevitably add some degree of noise to the circuit. The
second advantage of offline calibration, which will be seen later in the chapter, is that there are
situations where it is able to get a circuit to function even when the uncalibrated circuit is not
functional.
4.1.1 Correction of Static Phase Offsets Using Calibration
This section describes how a calibration circuit can be used to correct for static phase offsets in
phase detectors. The Hogge phase detector circuit is specifically used in the analysis, however
the calibration algorithm is valid for any phase detector. There are many factors which may
cause the Hogge phase detector to deviate from the ideal behavior described in Section 2.3.1.
Figure 4.1a illustrates the ideal behavior of the Hogge phase detector, along with a plot showing
the corresponding phase detector gain.
Figure 4.1: Hogge phase detector operation when Φspo = 0 and when Φspo ≠ 0
The timing analysis of the phase detectors in Section 3.3.2 gave a framework for this. For example,
the delay block illustrated in Figure 2.10 must
accurately match the C-Q delay of DFF1 and the C-Q delays of DFF1 and DFF2 must match.
Any inequality will cause the widths of the UP and DOWN pulses to become imbalanced, which in
turn will cause a static phase offset. This situation is shown in Figure 4.1b, with the corresponding
phase detector response illustrating the resulting static phase error, Φspo. The fact that linear
phase detectors operate by balancing UP and DOWN pulse widths makes them more sensitive to
process variations and other non-idealities. The analysis in Section 3.4 illustrated the sensitivity
to process variations and the equations in Section 3.2 described the performance degradation
which occurs due to static phase offsets.
One way to correct for static phase offsets is by changing the value of the UP and DOWN
currents in the charge pump. A linear phase detector affected by a static phase offset can be
described in the following manner: when the clock and data inputs are perfectly synchronized the
DOWN pulse is assumed to have an ideal width of T/2 but the UP pulse has a width of T/2 + Φspo.
Under these conditions the amount of charge added by the charge pump should be equal to the
amount of charge subtracted by the charge pump. The amount of charge the charge pump adds
or subtracts is equal to:
$$\Delta q = I \times \Delta t \tag{4.1}$$
In the given situation the charge added by the UP pulse must be equal to the charge subtracted
by the DOWN pulse, therefore:
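$$\Delta q_{up} = \Delta q_{down} \tag{4.2}$$
$$I_{up} \cdot \left( \frac{T}{2} + \Phi_{spo} \right) = I_{down} \cdot \frac{T}{2} \tag{4.3}$$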
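Rearranging Equation 4.3 leads to the relationship between Iup and Idown shown in Equation 4.4.
$$I_{up} = I_{down} \cdot \frac{T/2}{T/2 + \Phi_{spo}} = I_{down} \cdot \frac{T}{T + 2\Phi_{spo}} \tag{4.4}$$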
Figure 4.2 illustrates the normalized UP and DOWN charge pump currents resulting from Equation 4.4
which will compensate for a given range of static phase offsets.
Figure 4.2: The effect of UP and DOWN charge pump currents on Φspo
As an example, using this figure it can be seen that for a 5Gb/s CDR circuit with a -20ps
static phase offset (0.1UI), the UP current will need to be 25% larger than the DOWN current.
For the same circuit with a +20ps static phase offset the UP current will need to be 17% smaller
than the DOWN current. This example and the figure both illustrate that when the UP pulse is
smaller than the DOWN pulse the magnitude of the change in the charge pump current is larger
than when the UP pulse is larger than the DOWN pulse.
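The two figures quoted in this example can be checked numerically. The following is a minimal Python sketch, assuming an ideal charge pump and the charge balance Iup · (T/2 + Φspo) = Idown · T/2; the function name `up_current_ratio` and its arguments are illustrative names introduced here, not part of the original design.

```python
def up_current_ratio(data_rate_bps: float, spo_seconds: float) -> float:
    """Return Iup/Idown that balances UP and DOWN charge for a given offset.

    Assumes an ideal charge pump: Iup * (T/2 + spo) = Idown * T/2.
    """
    bit_period = 1.0 / data_rate_bps      # T, one unit interval
    down_width = bit_period / 2.0         # ideal DOWN pulse width, T/2
    up_width = down_width + spo_seconds   # UP pulse width including the offset
    if up_width <= 0.0:
        raise ValueError("offset removes the UP pulse entirely")
    return down_width / up_width


# The 5Gb/s example from the text, with offsets of -20ps and +20ps.
for spo_ps in (-20.0, 20.0):
    ratio = up_current_ratio(5e9, spo_ps * 1e-12)
    print(f"spo = {spo_ps:+.0f} ps -> Iup/Idown = {ratio:.3f}")
```

For -20ps the ratio is 100/80 = 1.25, and for +20ps it is 100/120, or roughly 0.83, which matches the 25% and 17% figures above and shows the asymmetry between narrow and wide UP pulses.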
While this section has described a method to correct for static phase offsets specifically in
a linear phase detector, the algorithm can be used for any phase detector. As was shown in
Section 3.4.2, while the Hogge phase detector is most severely impacted by process variations,
the Alexander phase detector also suffers significant variations in its static phase offset. In this
section it has been assumed that the charge pump is ideal. In reality any mismatch between the
charge pump’s UP and DOWN current paths is a problem, as such a mismatch will cause a static
phase offset. With this calibration scheme, as long as the static phase offset can be detected any
mismatches in the charge pump current will also be compensated for.
4.1.2 Linear Phase Detectors in CDR Circuits
In recent years the majority of papers on the subject of CDR circuits have used binary phase
detectors, due in part to their superior robustness. Numerous papers refer to the ‘ease of integration’
of binary phase detectors compared to linear phase detectors, and this ‘ease of integration’
relates to the robustness. However, in this chapter the specific goals of the calibration circuits
are to correct for static phase offsets in linear phase detectors. These techniques could be applied
also to binary phase detectors, however as described in Section 3.4 linear phase detectors are
more prone to having static phase offsets. A justifiable question arises, “Why bother with linear
phase detectors at all? If binary phase detectors are so much easier to integrate, why not simply
use them exclusively?” In fact, there are several significant advantages to using a linear phase
detector. First, a CDR circuit which uses a linear phase detector will have lower in-lock jitter.
For a binary phase detector the magnitude of phase error correction applied is constant regardless
of the magnitude of the input phase error. However, the fact that with a linear phase detector
the magnitude of correction is proportional to the phase error means that in the locked state
the VCO control voltage will have little activity, which translates to low in-lock jitter [72]. This
makes it easier to accurately predict the frequency domain performance which is important for
systems aimed at standards which require a specific frequency domain performance. Secondly, a
CDR circuit using a linear phase detector has a jitter-transfer bandwidth which is independent of
the amplitude of the input jitter [76]. Thirdly, a linear phase detector is intrinsically tri-stated,
which is important to meeting jitter generation specifications. Finally, as was seen in Sections
2.5.1 and 2.5.3, a CDR circuit which uses a linear phase detector can be analyzed using well
understood classic loop theory. This is especially important in the design of SONET systems,
where the frequency response must be well characterized [33]. Section 2.6 described a method to
find a closed form analysis of CDR circuits using binary phase detectors, however this solution is
not nearly as graceful as the solution for linear phase detector based CDR circuits and different
authors get different results depending on their approach and the assumptions they make.
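The difference in in-lock behaviour can be illustrated with idealized models of the two detector characteristics. The sketch below is a behavioural assumption rather than a circuit from this work: `linear_pd` and `binary_pd` are hypothetical names, and the gain, step and phase error values are arbitrary.

```python
def linear_pd(phase_error: float, gain: float = 1.0) -> float:
    """Idealized linear detector: correction is proportional to the phase error."""
    return gain * phase_error


def binary_pd(phase_error: float, step: float = 1.0) -> float:
    """Idealized binary detector: full-size correction, only the sign is used.

    The output is never zero, reflecting that this detector is not tri-stated.
    """
    return step if phase_error >= 0.0 else -step


# As the loop approaches lock the phase error shrinks; compare the corrections.
for err in (0.2, 0.02, 0.002):
    print(f"error {err:>5}: linear {linear_pd(err):.3f}, binary {binary_pd(err):+.1f}")
```

In this model the linear correction shrinks with the error while the binary correction stays at full magnitude, which is the behaviour the text associates with higher control voltage activity in lock.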
4.2 Offline Calibration Architecture
In this section a digital calibration technique is used to tune a 5Gb/s CDR circuit in order to
compensate for non-idealities. As seen in Section 3.4.2, process non-idealities can cause large
shifts in the static phase offset of the Hogge phase detector. It was also mathematically shown in
Section 3.2 that static phase offsets in a phase detector will degrade the BER of a CDR circuit. As
such, the calibration circuit in this section is designed to sense and compensate for static phase
offsets in a CDR circuit.
4.2.1 Calibration Algorithm
The calibration circuit proposed in this section is designed to sense static phase offsets in a CDR
circuit and tune the charge pump currents such that when the calibration locks, the clock and
data are properly aligned. The architecture of the proposed CDR circuit with the calibration
circuitry is shown in Figure 4.3. The shaded areas, labelled Mode 1 and Mode 2, represent the
two phases of the calibration algorithm. The purpose of Mode 1 is to generate a data signal that
is phase aligned to the clock signal. The purpose of Mode 2 is to set the charge pump currents
so as to eliminate any static phase offset.
It should be emphasized that this calibration algorithm is non-continuous, meaning that it
does not operate on live data. Varying parameters such as temperature and voltage fluctuations
could change the circuit and require updating the calibration after a certain period of time. As
this calibration algorithm is controlled internally, calibration could be programmed to run at
certain time intervals, or when there is no incoming data.
Mode 1
Mode 1 is the first phase of the calibration algorithm. At the end of this phase an internal data
signal will have been generated which is phase aligned with the clock signal. Mode 1 begins with
the calibration control circuit disconnecting the external data signal via MUX M2.
Figure 4.3: Block diagram of the proposed offline calibration algorithm
During Mode 1 the voltage on the low-pass filter is set to a DC voltage, so as to stabilize the VCO. The DC
voltage used is the mid-point of the charge-pump output range. A data signal is generated by
dividing the clock signal by two, creating a simple alternating data pattern. While an alternating
data pattern does not represent a realistic data pattern, this technique could be extended to
incorporate a pseudo-random bit sequence (PRBS) generator. This would create a more realistic
data pattern at the expense of complexity, area and power. The creation of the data signal from
the clock signal guarantees that it will be frequency aligned with the clock, however there will
be a finite phase offset between them due to the delay in the divide-by-two circuit. In order to
compensate for this phase offset, the clock signal from the VCO is sent through a programmable
delay line, which is controlled by the calibration logic (L1). The delay is varied until the binary
phase detector PD1 determines that the delayed clock signal and created data signal are phase
aligned. It is important to recognize that PD1 is not directly calibrating the linear phase detector,
rather it is used to compensate for the fixed delay in the divide-by-two circuit. As a binary phase
detector is much less sensitive to non-idealities than a linear phase detector, PD1 can accurately
determine the point where the clock and generated data signals are synchronized. Finding this
point is made easier as the delay in the programmable delay line is changed in discrete steps. This
is accomplished by the control logic changing the control current of the delay line via the 5-bit
digital to analog converter (DAC), D1. In order to accurately match the path of the generated
data signal with the input data signal, MUX M1 is used to emulate the delay through MUX M2.
Once the clock and data signals are phase aligned, the first phase of the calibration algorithm is
over and the calibration logic activates the second phase of calibration, Mode 2.
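The Mode 1 search can be sketched behaviourally as a sweep of the delay line code that stops when the binary phase detector output changes. This is a minimal Python sketch under stated assumptions: the linear delay model, the 60ps divider delay and all function names are hypothetical and chosen only for illustration, and the real delay line need not be linear in its control current.

```python
from typing import Callable, Optional


def calibrate_delay_line(pd1_clock_early: Callable[[int], bool],
                         n_bits: int = 5) -> Optional[int]:
    """Step the DAC code upward and return the first code where PD1 flips.

    Returns None if PD1 never changes over the whole code range.
    """
    previous = pd1_clock_early(0)
    for code in range(1, 1 << n_bits):
        current = pd1_clock_early(code)
        if current != previous:
            return code          # clock and generated data are aligned here
        previous = current
    return None


# Hypothetical models used only to exercise the search.
DIVIDER_DELAY_PS = 60.0          # assumed fixed delay of the divide-by-two


def delay_line_ps(code: int) -> float:
    return 30.0 + 2.0 * code     # assumed linear delay, 30ps to 92ps


def pd1(code: int) -> bool:
    # True while the delayed clock still arrives before the generated data.
    return delay_line_ps(code) < DIVIDER_DELAY_PS


aligned_code = calibrate_delay_line(pd1)
print(aligned_code, delay_line_ps(aligned_code))
```

Because the delay changes in discrete steps, the residual misalignment after the search is bounded by one delay step, which is 2ps in this assumed model.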
Mode 2
In Mode 2 the external data signal remains disconnected, the voltage on the low-pass filter is no
longer set to a DC value and the phase aligned clock and data signals which were generated in
Mode 1 are sent to the linear Hogge phase detector, PD2. In the case where PD2 has no static
phase offset and has phase aligned data and clock signals, an equal amount of charge should be
added to and subtracted from the low-pass filter. In Mode 2 the charge pump currents are varied
so as to ensure that this condition is met. First, the DOWN current in the charge pump is set
to a reference value via a control signal from calibration logic (L2) while the UP current is set to
the lowest value. This results in a significant net subtraction in charge from the low-pass filter
after every data transition. As such, the voltage on the low-pass filter will be reduced until it is
at the most negative end of the charge-pump output voltage range. Next, the calibration logic
gradually increases the UP current in the charge pump using the 5-bit DAC, D2. The comparator
observes the voltage on the low-pass filter to determine the point when an equal amount of charge
is added and subtracted. The UP current is varied in finite steps and the comparator is biased
at the centre of the charge pump output range. Once the comparator switches, the UP current
is set and the calibration algorithm is complete.
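The Mode 2 search can be sketched in the same way. The Python model below assumes that each data transition adds a charge of Iup · (T/2 + Φspo) and removes Idown · T/2, that the filter voltage is clamped to the charge pump output range, and that each DAC code is held long enough for the filter voltage to move noticeably. Every numerical value (currents, capacitance, voltage range, dwell time) and the name `calibrate_up_current` are assumptions made for illustration.

```python
from typing import Optional, Tuple


def calibrate_up_current(
    spo_ps: float,
    half_period_ps: float = 100.0,        # T/2 at 5Gb/s
    i_down_ua: float = 100.0,             # assumed DOWN reference current
    i_up_min_ua: float = 70.0,            # assumed UP current at DAC code 0
    i_lsb_ua: float = 2.0,                # assumed DAC step size
    n_bits: int = 5,
    transitions_per_code: int = 100_000,  # assumed dwell time per code
    filter_cap_pf: float = 10.0,          # assumed loop filter capacitance
    v_min: float = 0.6,                   # assumed charge pump output range
    v_max: float = 1.2,
) -> Optional[Tuple[int, float]]:
    """Raise the UP current until the comparator at mid-range switches."""
    v_ref = 0.5 * (v_min + v_max)         # comparator reference, centre of range
    v_filter = v_min                      # lowest UP current has pulled it low
    for code in range(1 << n_bits):
        i_up_ua = i_up_min_ua + code * i_lsb_ua
        # Net charge per data transition in attocoulombs (uA * ps = 1e-18 C).
        net_ac = i_up_ua * (half_period_ps + spo_ps) - i_down_ua * half_period_ps
        delta_v = net_ac * 1e-18 / (filter_cap_pf * 1e-12) * transitions_per_code
        v_filter = min(v_max, max(v_min, v_filter + delta_v))
        if v_filter >= v_ref:
            return code, i_up_ua          # comparator switched: hold this code
    return None


print(calibrate_up_current(spo_ps=-20.0))
```

With a -20ps offset this model stops at code 28, an UP current of 126µA, which is one DAC step above the ideal 125µA given by the charge balance. The residual error is therefore set by the DAC resolution.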
Calibrated Operation
Once Mode 2 is complete, the calibration control circuit connects the external data to the CDR
circuit via MUX M2 and normal operation is resumed. The values of the UP and DOWN currents
are set as determined in Mode 2 of the calibration algorithm. In normal operation, the calibration
circuitry is not active and the CDR circuit operates without any interaction with the calibration
circuit.
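The sequencing of the two modes and normal operation can be summarized as a small state machine. The sketch below is a hypothetical Python rendering of the behaviour described in this section; the state names, signal names and the `outputs` dictionary are illustrative and are not taken from the implemented control logic.

```python
from enum import Enum, auto


class CalState(Enum):
    MODE1 = auto()    # align the generated data with the delayed clock
    MODE2 = auto()    # sweep the UP current until the comparator switches
    NORMAL = auto()   # calibration finished, external data connected


def next_state(state: CalState, clock_data_aligned: bool,
               comparator_switched: bool) -> CalState:
    """Advance the sequencer by one step."""
    if state is CalState.MODE1 and clock_data_aligned:
        return CalState.MODE2
    if state is CalState.MODE2 and comparator_switched:
        return CalState.NORMAL
    return state


def outputs(state: CalState) -> dict:
    """Control signals implied by each state."""
    return {
        "external_data_connected": state is CalState.NORMAL,   # MUX M2
        "filter_forced_to_dc": state is CalState.MODE1,
    }


state = CalState.MODE1
state = next_state(state, clock_data_aligned=True, comparator_switched=False)
state = next_state(state, clock_data_aligned=True, comparator_switched=True)
print(state, outputs(state))
```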
4.2.2 Implementation
In order to test the proposed architecture a complete CDR circuit using the offline calibration
circuit was designed. The architecture of the CDR circuit including the calibration circuitry was
previously shown in Figure 4.3. All the high-speed blocks in the CDR circuit are implemented
using CML which, as described in Section 2.4, gives them excellent noise immunity and their
current steering nature allows for greater performance than any other logic family. While the
CDR circuit is implemented using CML, much of the calibration circuitry operates at a much lower
frequency and hence it is implemented using static CMOS. The only circuits in the calibration
algorithm implemented in CML are the delay line, the divide by two circuit, and the binary
phase detector. In this section the various blocks which make up the proposed CDR circuit are
described.
1. Phase Detector
The phase detector used in the main CDR circuit is a Hogge phase detector, the architecture
of which was previously described in Section 2.3.1.
Figure 4.4: Simulated output waveforms for standard and symmetric XOR gate
The most critical gates in the Hogge phase detector are the XOR gates. These gates must
accurately generate UP and DOWN
pulses in order for the CDR circuit to operate correctly. To ensure the best performance a
symmetric XOR circuit should be used [16]. In a traditional CML XOR gate the two inputs
are located at different bias points. This leads to an unequal switching threshold for the zero
and one states, as can be seen in Figure 4.4a. Placing two traditional CML XOR gates in
parallel with their inputs switched and the outputs shorted results in an XOR gate which is
symmetric with respect to the switching threshold and which also has a higher bandwidth.
The schematic for the symmetric XOR gate is shown in Figure 4.5. Figure 4.5 also shows
the logical configuration of the circuit with the XOR gates representing traditional CML
XOR gates. The resulting output waveform of the symmetric XOR gate is shown in Figure
4.4b. The penalty for the symmetric XOR gate is larger area and higher power, however as
the XOR gate is the most critical gate in the phase detector this is an acceptable tradeoff.
2. Charge Pump
The charge pump implements the correction information supplied to it by the phase detector.
Figure 4.5: Schematic of the symmetric XOR gate
Figure 4.6: Schematic of modified charge pump
The Hogge phase detector provides differential UP and DOWN pulses which the charge
pump uses to either add charge to or subtract charge from the loop filter. The schematic of
the charge pump is shown in Figure 4.6. In order to implement the calibration algorithm
the charge pump circuit requires separate biases for the current mirrors which control the
UP and DOWN currents. These currents are controlled by the calibration circuit. The
DOWN bias current is set to a reference level, while the control circuit changes the UP bias
current using a DAC.
3. VCO
The 5GHz VCO is implemented as a four stage CML ring oscillator. A ring oscillator
was chosen as this architecture is most easily integrated into a complex CMOS IC. While
the phase noise of a ring oscillator is poor in comparison to resonant tank oscillators, the
purpose of this work is to compare the performance of the calibrated CDR circuit with the
performance of the uncalibrated CDR circuit, hence using the ring oscillator is acceptable.
The delay cells are implemented as self-biased Maneatis style circuits [32]. The frequency
of the oscillator is 5GHz, which is close to the operational limit of ring oscillators in this
180nm process [37], hence great care was taken in the design and layout to ensure the
best performance possible. The VCO was designed to have a coarse tuning range of 1GHz
centered around 5GHz and a fine tuning range of approximately 200MHz.
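For orientation, the per-stage delay implied by the VCO specification can be estimated with the usual first-order ring oscillator relation. This relation is a textbook approximation and is not taken from this work; N is the number of stages and t_d is the delay of one stage.

```latex
f_{osc} \approx \frac{1}{2 N t_d}
\quad\Longrightarrow\quad
t_d \approx \frac{1}{2 \cdot 4 \cdot 5\,\mathrm{GHz}} = 25\,\mathrm{ps}
```

Under that assumption each of the four delay cells would need a delay of roughly 25ps, which gives a sense of why the design is close to the limit of the process.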
Calibration Circuits
1. Delay Line
The delay line is implemented using a series of CML buffers. The calibration logic sets the
delay by changing the bias current of the buffers. A replica bias sets the PMOS load bias
voltage to ensure that the voltage swing remains the same [74]. The current is set using
a five bit current DAC, which gives thirty-two finite delay settings. The delay line was
designed so that the delayed clock and generated data signal would be synchronized at the
centre of the delay line's range. Figure 4.7 shows the simulated delay through the delay line
for a given current.
2. DAC
A digital to analog converter is used at two different points during the calibration algorithm.
Figure 4.7: Delay through the delay line as the input current is varied
Figure 4.8: Architecture of the DAC circuit used in this design
In Mode 1 a DAC is used to set the delay in the programmable delay line and in Mode 2 a
DAC is used to set the UP bias current in the charge pump. The DACs are identical five bit
binary-weighted DACs whose architecture is illustrated in Figure 4.8 [52]. The switches are
controlled by the calibration logic to change the output current. A separate bias voltage
can be used to add an offset (Iadd) to the current set by the DAC.
Figure 4.9: The output current and error of the DAC as the codes are stepped
The output current of the DAC is mirrored and becomes the bias current for either the programmable
delay line or the charge pump UP current. The simulated performance of the DAC circuits
is shown in Figure 4.9. This figure shows that the output of the DAC does not precisely
match the ideal response, however for this system the accuracy of this circuit is sufficient.
3. Phase Detector
Mode 1 of the calibration algorithm utilizes a binary phase detector in order to determine
when the clock and generated data signals are aligned. In this situation the phase error
being detected is a fixed phase error, which is caused by the delay through the divide-
by-two circuit. This is different from the normal CDR operation, where the data signal
has continually varying phase with respect to the VCO. As the phase error is constant,
a binary phase detector and a programmable delay line are adequate to synchronize the
clock and the generated data signals. A DFF configured with the data and clock inputs
interchanged functions as a simple binary phase detector. However, as the generated data
signal has an alternating pattern there is a guaranteed transition every period, therefore
a latch configured such that the data signal latches the clock signal is sufficient, further
simplifying the design.
4. Comparator
A comparator is used in Mode 2 to determine the correct UP current. The comparator
is implemented as a simple open loop opamp. The resolution of a comparator with this
architecture is limited by the input offset voltage of the opamp [52]. In this implementation
the changes in the UP current are finite and are integrated over a significant period of time.
The result is a change in the voltage on the loop filter which is large enough to negate
the impact of any offset voltage errors. A more complex implementation of this calibration
algorithm would require a more complex comparator, however for this implementation the
chosen comparator can accurately determine the point where the voltage on the loop filter
passes the centre of the charge pump output range. The centre of the charge pump output
range is determined via a bias circuit which finds the maximum and minimum charge pump
output and divides them.
5. Logic
The calibration algorithm is controlled by three state machines. The first state machine
activates the various phases of calibration, the second state machine controls the delay line
and the final state machine sets the UP bias current. These state machines are implemented
using static CMOS. The calibration algorithm operates at a much lower frequency than the
data rate, hence these circuits do not have stringent performance requirements and
using static CMOS minimizes power consumption and area.
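The five bit binary-weighted DAC with an added offset current, as described in the DAC item of the list above, can be modelled ideally as follows. This is a minimal Python sketch: the LSB current, the offset value and the name `dac_output_ua` are assumptions for illustration, and the model ignores the non-ideal response shown in Figure 4.9.

```python
def dac_output_ua(code: int, i_lsb_ua: float = 2.0,
                  i_add_ua: float = 0.0, n_bits: int = 5) -> float:
    """Ideal binary-weighted current DAC: Iadd plus the sum of enabled branches."""
    if not 0 <= code < (1 << n_bits):
        raise ValueError("code out of range for this DAC")
    total = i_add_ua
    for bit in range(n_bits):
        if (code >> bit) & 1:                  # switch for this branch is closed
            total += i_lsb_ua * (1 << bit)     # branch current is 2^bit * I_lsb
    return total


# Thirty-two distinct settings, shifted by the assumed offset current.
settings = [dac_output_ua(code, i_add_ua=70.0) for code in range(32)]
print(settings[0], settings[-1], len(set(settings)))
```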
4.2.3 Measured Results
The calibrated CDR circuit was implemented in a 180nm standard CMOS process with six metal
layers. The total die area is 1mm² with the CDR circuit and calibration circuit taking up
approximately 0.36mm². The total area of the calibration circuitry is approximately 0.12mm².
Figure 4.10: Micrograph of the fabricated CDR circuit
A micrograph of the die is shown in Figure 4.10. To test the CDR circuit the die was wirebonded
directly to a PCB substrate. The complete system including input and output buffers consumes
230mW from a 1.8V supply at room temperature. The input and output buffers consume ap-
proximately 100mW. Once the calibration operation is complete, the CMOS based calibration
circuitry is not active, and hence consumes no power.
Figure 4.11: Spectrum of the VCO locked to a 5Gb/s PRBS
Figure 4.12: Measured jitter of the locked oscillator
Before the CDR circuit was tested, the VCO was measured. The coarse tuning range of the
VCO is 4.4GHz - 5.3GHz and the fine tuning provides a 200MHz range.
Table 4.1: Measured calibrated and uncalibrated BER for various data patterns
Data pattern     Calibrated        Uncalibrated
101010...        < 10^-13          1.25 × 10^-5
PRBS 2^7-1       < 10^-13          4.6 × 10^-2
PRBS 2^9-1       1.50 × 10^-10     N/A
PRBS 2^11-1      2.50 × 10^-10     N/A
PRBS 2^15-1      1.65 × 10^-8      N/A
PRBS 2^23-1      2.70 × 10^-8      N/A
PRBS 2^31-1      4.0 × 10^-7       N/A
The coarse tuning of the VCO is controlled off-chip. At 5GHz the measured phase noise of the VCO is -75.8dBc/Hz
at a 1MHz offset. The spectrum of the locked oscillator is shown in Figure 4.11. The VCO
consumes 12mA from a 1.8V supply, excluding buffers. The CDR circuit was tested at 5Gb/s
using a BERT. The circuit is able to recover data in both calibrated and uncalibrated modes,
however uncalibrated functionality is limited to simple data patterns and the BER is poor. After
calibration, the performance of the CDR circuit improves significantly. Table 4.1 shows the
performance figures for data patterns of increasing complexity, from alternating data (10101) to a
PRBS of 2^31 − 1. With a PRBS of 2^7 − 1 at 5Gb/s, the uncalibrated CDR circuit had a BER
of 4.6 × 10^-2. Once the CDR circuit was calibrated the BER improved to less than 10^-13. The
uncalibrated CDR circuit was not able to lock to any data pattern more complex than 2^7 − 1,
however once calibrated it was able to lock to a PRBS of up to 2^31 − 1. With a PRBS of 2^31 − 1
the CDR circuit had a BER of 4.0 × 10^-7. The measured RMS jitter of the recovered clock with
a PRBS of 2^31 − 1 is 6.04ps. The jitter histogram of the recovered clock signal is shown in Figure
4.12. The corresponding plot of the output clock and data signals is shown in Figure 4.13.
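For readers unfamiliar with the PRBS patterns used in these measurements, the shortest one can be generated with a seven bit linear feedback shift register. The Python sketch below uses feedback from stages 7 and 6, the taps conventionally written as x^7 + x^6 + 1; the text does not state which generator polynomial the test equipment used, so the polynomial, the seed and the name `prbs7` are assumptions.

```python
from itertools import islice
from typing import Iterator


def prbs7(seed: int = 0x7F) -> Iterator[int]:
    """Yield bits of a 2^7 - 1 PRBS from a 7-bit Fibonacci LFSR.

    Feedback is the XOR of stages 7 and 6 (assumed polynomial).
    """
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, the all-zero state never changes")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # stage 7 XOR stage 6
        state = ((state << 1) | new_bit) & 0x7F       # shift and insert feedback
        yield new_bit


bits = list(islice(prbs7(), 254))
print("repeats after 127 bits:", bits[:127] == bits[127:])
print("ones per period:", sum(bits[:127]))
```

Longer patterns such as 2^31 − 1 contain longer runs without transitions, which is one reason they are harder for a CDR circuit to track than the alternating pattern.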
Figure 4.13: Output clock and data waveforms
This CDR circuit was designed using transistor models characterized for digital designs. Later
access to RF transistor models indicated that the digital models overestimated the performance
at multi-GHz frequencies. The resulting significant decrease in bandwidth helps explain the
poor performance of the uncalibrated CDR circuit, however it also highlights the benefits of this
calibration algorithm. In any design there can be non-idealities which result in performance
degradation, however this work has demonstrated a calibration circuit which can correct for
serious errors.
4.3 Online Analog Calibration Architecture
The previous section described an offline calibration architecture. In this section an online cal-
ibration architecture is introduced which is also designed to correct for static phase offsets by
controlling the charge pump currents. As before a linear phase detector is used to demonstrate
the algorithm, however the calibration circuit itself can be used with any phase detector circuit.
An online calibration algorithm has advantages over an offline calibration algorithm in that it
can dynamically compensate for effects such as temperature variations and voltage fluctuations,
keeping the CDR circuit continually optimized.
4.3.1 Calibration Architecture
The online calibration architecture proposed in this section rests on two premises. The first
premise of this architecture is that a simple DFF binary phase detector is much more robust
than a linear phase detector. The analysis presented in Section 3.4 demonstrates that this is a
justifiable premise which is important in this circuit as a DFF binary phase detector is used to
determine when the incoming clock and data signal are synchronized. The second premise of this
architecture is that the uncalibrated CDR circuit is able to lock to the incoming data signal. It
was shown that the offline calibration architecture was able to recover functionality in cases
where the uncalibrated CDR circuit was non-functional; the online architecture cannot do so, as it
only begins to operate once the CDR circuit has locked.
The architecture of the proposed CDR circuit is shown in Figure 4.14. The proposed online
calibration circuit begins to function once the CDR circuit has locked to the incoming data
signal. It is assumed that the linear phase detector causes the CDR circuit to lock with some
undefined static phase offset. As such, at this point the CDR circuit is locked and operating
properly, however the clock and data signals are not optimally aligned. The DFF binary phase
detector sees this non-ideal alignment and applies continual correction to remove it. While the
calibration circuit is active, the magnitude of correction supplied by the calibration charge pump
is low and the operation is slow in comparison with the phase correction loop.
Figure 4.14: Block diagram of the online calibration architecture
The output of the calibration charge pump is used to change the UP bias current in the main charge pump.
Once the DFF binary phase detector sees the clock and data signals as synchronized, they are
considered synchronized, and at this point the secondary loop will slowly move back and forth
across the zero phase error point of the DFF binary phase detector. It is important that the
main loop and the calibration loop interact as little as possible. If the loops end up fighting with
one another it could create an unstable system. By making the calibration loop very slow in
comparison to the phase detection loop it appears ‘DC’ to the phase tracking loop.
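The interaction of the two loops can be sketched with a simple discrete-time model. The Python sketch below assumes the main loop settles completely between calibration updates (the calibration loop appears 'DC' to it), that the main loop locks where Iup · (T/2 + Φspo + φ) = Idown · T/2, and that the calibration loop nudges the UP current by a small fixed step according to the sign seen by the binary phase detector. The step size, the normalized currents and the function names are hypothetical.

```python
from typing import Tuple


def lock_phase_ps(i_up: float, i_down: float, spo_ps: float,
                  half_period_ps: float = 100.0) -> float:
    """Clock-to-data phase error at which the main loop settles.

    Solves i_up * (T/2 + spo + phi) = i_down * T/2 for phi.
    """
    return (i_down / i_up) * half_period_ps - half_period_ps - spo_ps


def run_online_calibration(spo_ps: float, steps: int = 1000,
                           mu: float = 0.001,
                           i_down: float = 1.0) -> Tuple[float, float]:
    """Slow bang-bang loop adjusting the UP current from the binary PD sign."""
    i_up = i_down                                  # start with matched currents
    for _ in range(steps):
        phi = lock_phase_ps(i_up, i_down, spo_ps)  # main loop has settled
        i_up += mu if phi > 0.0 else -mu           # binary PD: sign only
    return i_up, lock_phase_ps(i_up, i_down, spo_ps)


i_up_final, residual_ps = run_online_calibration(spo_ps=11.8)
print(f"Iup/Idown = {i_up_final:.4f}, residual phase error = {residual_ps:+.3f} ps")
```

In this model the UP current converges towards Idown · (T/2)/(T/2 + Φspo) and then moves back and forth across that value by one step, mirroring the slow dithering about the zero phase error point described above.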
This operation of the calibration circuit can be seen in Figure 4.15. In these simulations the
phase detector and calibration circuit use transistor models, however the VCO is implemented
using a Verilog-A model. This is helpful as a 20µs simulation is extremely long in relation to
the 5Gb/s data rate, and using Verilog-A models helps to speed up the simulations.
Figure 4.15: Simulated waveforms showing the calibration loop locking
The upper frame in Figure 4.15 shows the voltage on the main loop filter and the lower frame shows the
calibration filter voltage and the associated reference voltage. The CDR circuit locks within a
few hundred nanoseconds, however it takes the calibration loop approximately 4µs to lock. From
this point on the calibration control voltage slowly moves back and forth across its stable value.
Figure 4.15 also shows that the magnitude of the noise on the VCO filter voltage decreases once
the calibration circuit has stabilized. This indicates a tighter lock, which corresponds to less
jitter. Figure 4.16 shows a zoomed region of the corresponding clock signal eye diagrams for both
the uncalibrated and calibrated condition. The uncalibrated eye has an 11.8ps static phase offset
and a peak-to-peak jitter of 2.82ps. The calibration circuit has essentially eliminated the static
phase offset, reducing it to less than 1ps. The peak-to-peak jitter of the calibrated CDR circuit was
also reduced, dropping to 1.2ps. These peak-to-peak jitter values are very small, however this is
a simulation environment and also the VCO is modelled using Verilog-A hence the noise is only
due to the phase detector and charge pump circuits. These factors mean that the output jitter
seen in the simulated eye diagrams is much smaller that what would be seen in reality.
Figure 4.16: Eye diagram showing the clock and data signals before and after calibration
4.3.2 Implementation
In order to test the proposed architecture a complete CDR circuit using the online calibration
circuit was designed. The block diagram of the CDR circuit was previously shown in Figure 4.14.
This architecture is far less complicated than the offline calibration architecture,
which has benefits in terms of area and power. In this section the various circuits which make up
this design are detailed.
Phase Detectors
The main phase detector is a linear Hogge phase detector. This is the same circuit that
was used in the previous design with offline calibration and as such it will not be further
discussed here. The phase detector used in the calibration circuit is a CML DFF, implemented
as a dual-edge triggered DFF. It is important that the DFF is a dual-edge triggered
Figure 4.17: The schematic of a dual edge triggered CML DFF
DFF because, if a regular DFF is used, correction information will only be generated on one edge of
the data. Using a dual edge triggered DFF doubles the number of times the phase error
between clock and data signal is determined. The schematic of a dual edge triggered CML DFF
is shown in Figure 4.17. Latch A observes the clock while data is high and holds the sample on
the falling edge of data. Latch B observes the clock while data is low and holds the sample on
the rising edge of data. As such, while the data signal is high the multiplexor outputs the sample
from latch B and while the data is low the multiplexor outputs the sample from latch A. The cost
of a dual edge triggered DFF as compared to the standard DFF is 50% greater area and power.
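The latch and multiplexor behaviour described above can be sketched in a few lines of Python. This is only a behavioural illustration, not the CML circuit of Figure 4.17: the function name `dual_edge_dff`, the sampled-waveform representation and the zero initial latch states are assumptions made for the example.

```python
def dual_edge_dff(data, clk):
    """Sample `clk` on every transition of `data` using two latches and a mux.

    `data` and `clk` are equal-length lists of 0/1 samples (hypothetical format).
    """
    latch_a = 0  # transparent while data is high, holds after the falling edge
    latch_b = 0  # transparent while data is low, holds after the rising edge
    out = []
    for d, c in zip(data, clk):
        if d:
            latch_a = c
        else:
            latch_b = c
        # The multiplexor always selects the latch that is currently holding.
        out.append(latch_b if d else latch_a)
    return out


clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
data = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(dual_edge_dff(data, clk))
```

In this example the rising data edge captures a high clock and the falling data edge captures a low clock, so the output changes on both edges. A single-edge DFF would only have produced the first of these two samples.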
Charge Pump
Figure 4.17 illustrates that the output of the dual edge triggered CML DFF is a single UP/DOWN
signal. This means that the charge pump used in the previous design cannot be used as it required
differential UP and DOWN signals. As such a simple CML charge pump is used, the schematic
of which is shown in Figure 4.18. The charge pump in the main phase detection loop is identical
to that implemented in the CDR which had offline calibration. This circuit (shown in Figure 4.6)
is a CML charge pump with differential inputs and only differs from the standard circuit (shown
Figure 4.18: Simple charge pump for the calibration circuit
in Figure 2.16) in that it has separate UP and DOWN bias currents.
Loop Filter and VCO
A large off-chip capacitor is used as the loop filter for the calibration. A large capacitor ensures
that the calibration loop is very slow, and as such will not interfere with normal operation of
the CDR circuit. As can be seen in Figure 4.14 the voltage on the capacitor is used to set the
UP current in the phase tracking loop’s charge pump. A low-gain differential pair compares the
calibration voltage with a reference voltage and steers the correct amount of UP current
into the main charge pump.
The voltage controlled oscillator in this design was implemented as a three stage ring oscillator.
As with the offline calibration circuit, the delay cells are implemented as self-biased circuits [74].
The VCO was designed to have a coarse tuning range of 5GHz to 6GHz and a fine tuning range of
approximately 200MHz. This VCO is a different design from the one used in the chip with
offline calibration, however the 5Gb/s data rate is again aggressive for this process and care was
taken in the design and layout to ensure that the circuit would be able to operate at the desired
frequency.
Figure 4.19: Micrograph of the fabricated CDR circuit
4.3.3 Measured Results
The CDR circuit with online calibration was implemented in a 180nm standard CMOS process
with six metal layers. The total die area is 0.8mm × 0.8mm, with the CDR circuit and calibration
circuit taking up approximately 0.4mm². The total area of the calibration circuitry is
approximately 0.01mm², which shows that this calibration architecture has a negligibly small
Figure 4.20: 5Gb/s clock and data waveforms before and after calibration
area penalty. A micrograph of the die is shown in Figure 4.19. To test the CDR circuit the die
was wirebonded directly to a PCB substrate. The complete system including input and output
buffers consumed 305mW from a 1.8V supply at room temperature. Without the input and output
buffers the CDR circuit consumes approximately 200mW. The calibration circuit consumes
a constant amount of power, which is approximately 15mW.
Before the CDR circuit was tested, the VCO was measured. The coarse tuning range of the
VCO was measured to be 4.2GHz - 6.1GHz and the fine tuning provides a 200MHz range. As
with the offline calibration circuit the coarse tuning of the VCO is controlled off-chip. At 5GHz
the measured phase noise of the oscillator is -71.2dBc/Hz at a 1MHz offset. The VCO consumes
20mA from a 1.8V supply, excluding buffers. A BERT was used to test the CDR circuit at 5Gb/s.
The CDR circuit was only able to lock to a PRBS of 2⁷ − 1. For any data pattern more complex
than this the CDR circuit is unable to lock. This illustrates a weakness in online calibration in
that the CDR circuit must lock initially in order for the calibration circuit to be useful. With a
PRBS of 2⁷ − 1 at 5Gb/s the BER in both uncalibrated and calibrated modes was identical, at
< 10⁻¹³. However, the calibration circuit was able to improve the measured jitter for an input
PRBS of 2⁷ − 1. The measured jitter on the uncalibrated clock for an input PRBS of 2⁷ − 1
at 5Gb/s was 12.9ps RMS and 78ps peak-to-peak, whereas the calibration circuit reduced the
measured jitter to 4.92ps RMS and 28ps peak-to-peak.
The inability of the CDR circuit to lock to any data pattern more complex than 2⁷ − 1 was
disappointing and highlights a weakness of online calibration. It is possible that had the CDR
circuit been able to lock to more complex data patterns the calibration circuit could have improved
the BER, as was the case for the offline calibration design. However, for a PRBS of 2⁷ − 1 at
5Gb/s this calibration circuit was able to reduce the jitter by over 60%, which demonstrates its
effectiveness.
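The quoted jitter reduction can be checked directly from the measured values given above. The short sketch below is only a worked calculation; the helper name `percent_reduction` is introduced here for illustration.

```python
def percent_reduction(before, after):
    """Relative reduction, in percent, going from `before` to `after`."""
    return 100.0 * (before - after) / before


rms = percent_reduction(12.9, 4.92)  # RMS jitter in ps, uncalibrated vs calibrated
pp = percent_reduction(78.0, 28.0)   # peak-to-peak jitter in ps
print(f"RMS: {rms:.1f}%  peak-to-peak: {pp:.1f}%")
```

Both the RMS and the peak-to-peak figures come out above 60%, consistent with the statement in the text.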
4.4 Summary
This chapter described techniques to improve the robustness of CDR circuits using calibration.
Section 3.4 showed that process variations have a significant effect on the static phase offset of
a CDR circuit, and in this chapter a method was presented which corrects for static phase offsets
by modulating the UP and DOWN currents in the charge pump. Two calibration circuits were
proposed, one offline and the other online. These were implemented in CDR circuits which were
fabricated in a 180nm standard CMOS process. For the CDR circuit with offline calibration, with
a PRBS of 2⁷ − 1 at 5Gb/s the calibration circuit improved the measured BER of the CDR circuit
from 4.6 × 10⁻² to less than 10⁻¹³. For the CDR circuit with online calibration, with a PRBS of
2⁷ − 1 at 5Gb/s the calibration circuit improved the measured RMS jitter from 12.9ps to 4.92ps.
While online and offline calibration have been treated separately in this chapter, it would be
possible to design a circuit which implements both an offline calibration circuit and an online
calibration circuit. This would take advantage of the benefits of both schemes, at the expense of
design complexity, area and power.
Chapter 5
Phase Detector Design for Robust
CDR Circuits
In this chapter new phase detector circuits are introduced with the intent to optimize both
performance and robustness. As described in Section 3.4, the DFF binary phase detector is a
robust circuit, however it suffers from limitations as a phase detector. The Alexander phase
detector is a very popular phase detector circuit, however it was shown in Section 3.4 that it is
more vulnerable to process non-idealities than the DFF phase detector. In this chapter
a phase detector circuit based around the DFF phase detector is introduced which has a characteristic
identical to that of the Alexander phase detector, but with the robustness of the DFF phase detector.
Also, a second phase detector circuit based around the DFF phase detector is introduced which
changes the charge pump current in order to improve the performance over the regular DFF
phase detector. Finally, a simple modification to the linear Hogge phase detector is described
which significantly improves the circuit’s robustness.
Figure 5.1: Architecture of the tri-state binary phase detector
5.1 Tri-State DFF Phase Detector
The Alexander phase detector is the most common phase detector architecture for CDR circuits.
As seen in Section 3.4 the robustness of the Alexander phase detector is superior to that of
the Hogge phase detector, however it is still sensitive to process variations, and this sensitivity
increases as CMOS processes scale. The DFF phase detector was shown to be the most robust,
however one weakness of this circuit is that it is not tri-stated. In this section a new phase
detector is proposed based on the DFF phase detector which has a phase error response
similar to that of the Alexander phase detector. An analysis of the Alexander phase detector and
the proposed phase detector is performed in order to compare their respective robustness.
5.1.1 Architecture of Tri-State DFF Phase Detector
The architecture of the proposed phase detector is based around the DFF phase detector circuit.
Additional circuits are added in order to gain an overall response equal to that of the Alexander
phase detector. The architecture of the proposed circuit is given in Figure 5.1. The premise of
a tri-state phase detector is that correction information is only sent to the charge pump after
Figure 5.2: CDR circuit waveforms given a tri-state binary phase detector
there is a data transition. The phase detection in this circuit is performed by DFF1, which acts
as a binary phase detector. The output signal of DFF1 is either high or low, which corresponds
to whether the clock is leading or lagging the data. As the output of DFF1 is always either
high or low, DFF1 on its own is not a tri-stated phase detector. A separate circuit composed of
two back-to-back DFFs (DFF2 and DFF3) and an XOR gate performs the dual function of data
retiming and creation of a reference pulse. In the proposed circuit the generated reference pulse
is used to enable the operation of the charge pump, making this a tri-stated phase detector. The
reference pulse is created by performing the logical XOR across the outputs of the two back-to-back
DFFs. These DFFs clock the data on the same edge of the clock and thus the reference pulse will
have a constant width of one period of the clock. The proposed phase detector essentially acts
as two distinct circuits, a phase detection circuit and a retiming circuit. The operation of the
tri-state binary phase detector is described using waveforms in Figure 5.2. As can be seen, the voltage on
the low-pass filter only changes after a data transition.
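The gating of the charge pump by the reference pulse can be illustrated with a clock-period-level sketch. This is a simplified behavioural model under stated assumptions, not the circuit of Figure 5.1: the DFF1 decision is reduced to a fixed flag `clock_is_late`, the polarity (a late clock producing an upward correction) is assumed, and the name `tri_state_pd` and the unit `step` are hypothetical.

```python
def tri_state_pd(bits, clock_is_late, step=1.0):
    """Return the loop filter voltage after each clock period.

    bits          : received data bits, one per clock period
    clock_is_late : assumed constant DFF1 decision for this illustration
    step          : voltage change per enabled correction (arbitrary units)
    """
    dff2 = dff3 = 0
    v = 0.0
    trace = []
    for bit in bits:
        # Two back-to-back DFFs clocked on the same clock edge.
        dff3, dff2 = dff2, bit
        # XOR of their outputs: high for one clock period after a data transition.
        ref = dff2 ^ dff3
        # DFF1 is binary: its output is always either UP or DOWN, never idle.
        up_dn = 1 if clock_is_late else -1
        # The charge pump is enabled only while the reference pulse is high.
        if ref:
            v += up_dn * step
        trace.append(v)
    return trace


print(tri_state_pd([0, 1, 1, 1, 0, 0, 1], clock_is_late=True))
```

During the runs of identical bits the trace stays flat even though the DFF1 decision is still asserted, which is the tri-state behaviour the reference pulse provides.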
The key benefit of the modified phase detector is that the phase detection circuit has only
one stage, thereby avoiding any interaction between blocks and eliminating the need for precise
timing. The simplicity of this phase detector brings benefits in robustness. The proposed phase
detector also has power and area benefits as compared to the Alexander phase detector, as it
requires one fewer DFF and one fewer XOR gate.
5.1.2 Robustness of the Tri-State DFF Phase Detector
In order to evaluate the robustness of the proposed phase detector, an analysis similar to that of
Section 3.4 was performed. An Alexander phase detector and a tri-state DFF phase detector were designed in
a 180nm standard CMOS process targeting a data rate of 5Gb/s. Input clock and data signals
were generated with specific phase offsets and the phase detector response was analyzed. The
test clock and data signals used representative rise and fall times in order to accurately mimic
realistic input signals. In order to test the robustness, corner simulations were performed using
both standard process corners (SS, TT and FF) and resistor variations (±20%) at each corner.
In order to measure the widths of the UP and DOWN pulses Verilog-A models were generated
which capture and log the data as the input phase offset is varied. Simulations were performed
in order to examine the phase detector response over the complete input phase offset range. The
results of the robustness analysis on the Alexander phase detector are given in Figure 5.3 and
the results of the robustness analysis on the tri-state DFF phase detector are given in Figure 5.4.
The total variation in the zero-crossing point of the phase detector gain for the tri-state binary
phase detector is approximately 7.1ps, which is approximately 50% of the simulated variation of
the Alexander phase detector, which was 13.9ps.
Another characteristic of the proposed phase detector which is examined relates to the ref-
erence pulse. Up until this point the reference pulse was assumed to have a constant width,
regardless of the input phase error. While this is mostly true, large phase errors will put the
first re-timing flip-flop into the region of meta-stability and the pulse width will no longer be
constant. However, this situation also affects the Alexander phase detector. In Figure 5.5 the
Figure 5.3: Detailed response of the Alexander phase detector over corners
Figure 5.4: Detailed response of the tri-state DFF phase detector over corners
response of the Alexander phase detector is compared to the overall response of the DFF tri-state
phase detector, with the pulse width taken into account. The width of the reference pulse is com-
bined with the output of the phase detector to create the overall response. As can be seen, the
responses of the two phase detectors are very similar. The error is plotted at the bottom of Figure
5.5 describing the deviation from the ideal binary phase detector response. The average error for
Figure 5.5: Comparison of Tri-state DFF vs Alexander Pulse Widths
the DFF tri-state phase detector is 5.06%, which is slightly less than the average error for the
Alexander phase detector, which is 6.4%. This shows that the tri-state DFF phase detector can
match the response of the Alexander phase detector, however with reduced sensitivity to process
variations, making it more easily integrated into a monolithic IC.
5.1.3 Implementation
In order to test the proposed phase detector circuit a complete CDR circuit was designed in
a 180nm standard CMOS process. Before taping out the chip, the functionality of the CDR
circuit was verified using back-annotated simulations. As this circuit operates at 5GHz, circuit
layout becomes very important. Parasitic resistors and capacitors can reduce the bandwidth
of the system, and improper layout can provide paths for noise to infiltrate sensitive circuits.
Appropriate layout techniques were employed so as to reduce the high-frequency effects. This
section details the individual blocks which make up the implemented CDR circuit.
Reference Pulse
All blocks in the proposed CDR circuits were implemented using current mode logic. In current
mode logic, multiple levels of current steering are required for the implementation of the XOR
gate. This can be problematic, especially if the output of the XOR gate is a short pulse, as
bandwidth limitations come into effect. In virtually all phase detectors the XOR gate plays a
critical role. For example, in Alexander phase detectors and Hogge phase detectors, XOR gates
are used to generate UP and DOWN pulses. The Hogge phase detector has a higher dependence
on the XOR gate due to the need to properly balance the UP and DOWN pulses, however
XOR gates are also important in the Alexander phase detector. In the proposed phase detector,
the XOR gate does not play such a critical role. The purpose of the XOR gate is to produce the
constant-width reference pulse. If the bandwidth of the XOR gate is less than expected or the
gate is affected by process variations, the width of the reference pulse may vary. However, the
CDR operation will only be marginally affected since the amount of correction applied by the
charge pump will be equal, regardless of whether it is a charge up or charge down correction. Furthermore,
the magnitude of the charge pump current may be changed in order to compensate for the altered
width of the reference pulse.
Charge Pump
As the charge pump is activated only if there is a data transition, the proposed phase detector
has a tri-state. In order to make use of the reference signal the charge pump circuit must be
modified. The schematic of the modified charge pump is given in Figure 5.6. As can be seen, the
reference pulse steers the charge pump current in such a way as to only enable the bias current in
either the ‘add charge’ path or the ‘subtract charge’ path. The reference pulse will operate at a
higher frequency than the UP/DN signal and hence it is kept close to the output. Correction only
occurs when the reference pulse goes high, as that is the only time when the charge pump current is
connected to the loop filter.
Figure 5.6: Architecture of the modified charge pump
LC-tank VCO
The VCO was implemented as an LC-tank oscillator. The LC-tank oscillator was described
in Section 2.3.4 and the exact topology used is the same as that shown in Figure 2.22. In
this topology cross-coupled NMOS transistors provide the negative gm and varactors provide
the variable capacitance needed to tune the frequency. In this implementation accumulation-
mode MOS (AMOS) varactors were used, as they provide a monotonic tuning characteristic, a
reasonably large tuning range and a high quality factor [43]. The structure of an AMOS varactor,
shown in Figure 5.7, is essentially an NMOS transistor created inside an N-well [77]. The source
and drain are shorted together and connected to the control voltage which varies the capacitance,
and the gate is connected to the node of the VCO where the oscillations are occurring. The
control voltage can vary the size of the depletion region under the gate, which in turn varies the
capacitance.
Figure 5.7: Physical structure of an AMOS varactor
5.1.4 Measured Results
The CDR circuit was implemented in a 180nm six metal layer standard CMOS process. Figure
5.8 shows the micrograph of the fabricated CDR circuit. The total area of the CDR circuit is
0.8mm × 0.4mm. The binary phase detector and charge pump required an area of only 450µm ×
300µm. The CDR circuit consumes 150mA from a 1.8V supply, including the input and output
buffers. Without the buffers the circuit consumes less than 100mA. Based on simulations, a CDR
circuit which implemented an Alexander phase detector would have consumed approximately 40%
more power than this design.
A layout error in the data output buffer resulted in an inability to measure the output data
stream, making BER and jitter tolerance measurements impossible. However, measurements
on the clock signal could still be performed. The coarse tuning range of the VCO was
measured to be from 4.85GHz to 6.3GHz, approximately 50% greater than what was designed, and
the fine tuning range was almost 400MHz. The phase noise of the unlocked VCO was measured
to be -96dBc/Hz at a 1MHz offset. The CDR circuit was able to successfully lock to a PRBS of
2³¹ − 1 at data rates from 5-6.25Gb/s. Figure 5.9 shows the frequency spectrum of the recovered
Figure 5.8: Micrographs of the CDR circuits
clock signal given a PRBS of 2³¹ − 1 at 6.25Gb/s. The phase noise of the recovered clock signal
was measured to be -85dBc/Hz at a 5kHz offset. The measured jitter on the recovered clock for
this PRBS input at 6.25Gb/s was 1.7ps RMS and 11ps peak-to-peak, as shown in Figure 5.10a.
If the data pattern is changed to an alternating data pattern (101010...) the measured RMS
and peak-to-peak jitter reduce to 230fs and 1.7ps respectively, as shown in Figure 5.10b. The data
rate of 6.25Gb/s is 25% higher than what the circuit was designed for. The simplicity of the
proposed phase detector has been shown to have benefits in terms of robustness, however the
same simplicity also provides a high degree of scalability.
Figure 5.9: Frequency spectrum of the recovered clock signal
Figure 5.10: Jitter histogram of the recovered clock for two data patterns
Figure 5.11: A CDR circuit with a parasitic capacitor creating a third order response
5.2 Pulsed DFF Binary Phase Detector
In this section the basic operation of a binary phase detector is analyzed and a new circuit is
proposed. As with the tri-stated DFF phase detector, the DFF is used as a phase detector.
The phase detector analysis in Section 3.4 illustrated the robustness of this circuit and in this
section a modification is proposed which improves the performance. In order to characterize
the performance benefits, the proposed circuit is compared with a standard DFF phase detector
using various figures of merit (FOM) such as jitter transfer, jitter generation and jitter tolerance. Finally, a silicon
implementation of a CDR circuit using the proposed phase detector is described and measured
results are given.
5.2.1 Monolithic Second Order Loop Issues
As shown in Section 2.6, a binary phase detector can be effectively used in a CDR circuit. The
use of a first order RC filter is attractive as it allows the designer to have control
over both the proportional and integral branches of the loop. However, a first order RC filter
is difficult to realize in an integrated environment. The parasitic
capacitance of the chip, the pad and the PCB adds a secondary capacitance in parallel with
the RC filter. This turns the first order filter into a second order filter, which in turn turns the
loop into a third order loop. This situation is illustrated in Figure 5.11.
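The change in loop order can be seen by counting the poles of the filter impedance. The sketch below assumes the filter is a series R-C1 branch shunted by a parasitic capacitor Cp, for which Z(s) = (1 + sRC1) / (s(C1 + Cp)(1 + sRC1Cp/(C1 + Cp))). The function names and the component values are hypothetical and chosen only for illustration.

```python
import math


def filter_poles_hz(r, c1, cp=0.0):
    """Pole frequencies in Hz of Z(s) for a series R-C1 branch shunted by Cp."""
    poles = [0.0]  # pole at DC from the loop capacitance
    if cp > 0.0:
        # The parasitic capacitor adds a second pole set by R and C1 in series with Cp.
        c_series = c1 * cp / (c1 + cp)
        poles.append(1.0 / (2.0 * math.pi * r * c_series))
    return poles


def loop_order(r, c1, cp=0.0):
    # The VCO integrates frequency into phase, adding one more pole to the loop.
    return len(filter_poles_hz(r, c1, cp)) + 1


# Hypothetical component values, not taken from the design.
R, C1, CP = 1e3, 1e-9, 1e-12
print(loop_order(R, C1))       # ideal first order RC filter
print(loop_order(R, C1, CP))   # same filter with a parasitic capacitor
print(filter_poles_hz(R, C1, CP)[1] / 1e6, "MHz")
```

Even with Cp three orders of magnitude smaller than C1, the extra pole is present; only its frequency moves, which is why the loop order increases regardless of how small the parasitic is.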
In [62] Wang et al. analyze the negative repercussions of a second order loop turning into a
third order loop. Even when the parasitic capacitor is several orders of magnitude lower than the
Figure 5.12: Architecture of a CDR circuit with separate proportional and integral paths
loop capacitor, the frequency response of the CDR circuit deteriorates. As such, it is desirable to
find a way to control the proportional and integral branches of the response without increasing
the order of the loop. Wang et al. proposed one such architecture in [62] where the output
of the bang-bang phase detector directly modulates a varactor in an LC-tank oscillator. A ring
oscillator could also be used, whereby the output of the phase detector directly modulates a bias
current of the VCO. Separating the paths will not eliminate the parasitic capacitances, however
the integral path filter will consist of simply one loop capacitor and any parasitic capacitances
will be absorbed into it. As the parasitic capacitances will be orders of magnitude less than the
loop capacitor, the overall effect will be negligible. Figure 5.12 illustrates the basic architecture
whereby both proportional and integral paths exist separately.
5.2.2 Proportional Path Optimization
The previous section described some of the difficulties involved in implementing a binary phase
detector based CDR circuit which has only a second order loop. The importance of separate
proportional and integral paths was explained and a simple method of achieving this was
described. In this section the goal is to expand upon this idea in order to improve the performance
of the CDR circuit.
One problem with the use of binary phase detectors is the excessive jitter they generate. The
non-linear behavior means that the frequency of the VCO never stabilizes. The output frequency
Figure 5.13: Architecture of the proposed phase detector
of the VCO is constantly switching in discrete steps. This discrete frequency switching results in
phase noise, which correlates to higher jitter. In this section a circuit is introduced which softens
the hard non-linearity of the binary response. The concept is that instead of instantly switching,
the shape of the proportional switching is controlled via a current pulse. When the output of the
binary phase detector transitions, the proposed phase detector injects a current pulse. Figure
5.13 shows the architecture of a CDR circuit where the charge pump current is not fixed, but
rather is a function of the phase detector output. This architecture implements a second order
filter, however the loop response remains second order.
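The hard switching behaviour that the proposed circuit is intended to soften can be illustrated with a minimal first order bang-bang model. This sketch is not the proposed circuit: it assumes a data transition in every bit period, uses only the proportional path, and expresses the frequency step as a phase change per bit. The name `bang_bang_loop` and the numerical values are hypothetical.

```python
def bang_bang_loop(phase_error, freq_step, n_bits):
    """Phase error after each bit when the VCO can only step by +/- freq_step.

    phase_error : initial phase error (in UI)
    freq_step   : phase moved per bit by one frequency step (in UI per bit)
    """
    history = []
    for _ in range(n_bits):
        # Binary decision: only the sign of the phase error is used.
        decision = 1 if phase_error >= 0 else -1
        # The full correction is applied regardless of how small the error is.
        phase_error -= decision * freq_step
        history.append(phase_error)
    return history


print(bang_bang_loop(0.625, 0.25, 6))
```

After the initial pull-in the error alternates in sign on every bit and never settles, which is the discrete frequency switching described above.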
In order to describe the operation of the circuit the current pulse is assumed to be an ideal
square pulse. In order to obtain a response similar to the simple second order loop which has a
first order RC loop filter, the second order loop filter in the proposed circuit must be properly
designed. If the proposed CDR circuit used a capacitor for the loop filter, the changes in the
output voltage would simply be a function of the q = C×V relationship. This results in Equation





Figure 5.14: Ideal waveforms if a capacitor is used as a loop filter
Figure 5.15: A 2nd order loop filter and the waveforms resulting from a current pulse
Given a simple capacitor the step in the filter voltage due to a current pulse is simply proportional
to the difference in the current Ia vs Ib. The resulting waveforms for this type of architecture
are shown in Figure 5.14. As can be seen, the change in the capacitor voltage is quite small and
therefore the proportional response of the loop is also small.
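The q = C×V relationship can be turned into a quick estimate of how small this step is. The sketch below is an illustration only: it assumes Ia is the current during the pulse and Ib is the nominal current, and the function name `extra_voltage_step` and all numerical values are hypothetical rather than taken from the design.

```python
def extra_voltage_step(i_a, i_b, t_pulse, c):
    """Extra capacitor voltage change caused by pulsing the current from i_b to i_a.

    From q = C * V, a current I flowing for a time t changes the voltage by I * t / C,
    so the additional step due to the pulse is (i_a - i_b) * t_pulse / c.
    """
    return (i_a - i_b) * t_pulse / c


# Hypothetical values: 200uA pulse, 100uA nominal, 100ps pulse, 100pF capacitor.
dv = extra_voltage_step(200e-6, 100e-6, 100e-12, 100e-12)
print(f"{dv * 1e6:.1f} uV")
```

With values of this order the step is on the scale of microvolts, which illustrates why a single large capacitor gives only a weak proportional response.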
The use of a second order filter along with the current pulse provides the desired separate
integral and proportional paths, however the filter design is more complicated. Figure 5.15
shows the schematic of a second order filter and the ideal related currents in the presence of a
current pulse. As can be seen there is both an integral and a proportional response, however
the proportional response is not discontinuous, rather it has a slope. The waveforms in Figure
5.15 illustrate the ideal waveforms, however in order for the response to approximate this, the
filter must be properly designed. The width of the current pulse should be equal to t1, given
in Equation 5.2. This will result in the proportional and integral frequency steps as given in
Equation 5.3.











\[ \ldots \cdot K_{vco} \qquad \Delta f_{bb,\,integral} = \frac{I_2}{C_1 + C_2} \cdot K_{vco} \qquad (5.3) \]
If this time constant t1 is less than the duration of the pulse there will be a droop on the filter
voltage. If t1 is exactly equal to tpulse there will be no droop and the voltage increase after the
pulse will be almost linear. If t1 is greater than tpulse there will not be a droop, however the step
in the output voltage (and hence the ∆fbb) will be larger than desired, and the filter response
will be more exponential than linear. These three situations are illustrated in Figure 5.16. While
these equations seem to present some rather strict constraints on the filter, ultimately as long as
the filter design is relatively close to the ideal it will provide the desired response.
A CDR circuit implementing the proposed architecture was simulated in Matlab and was
shown to have improved performance as compared to hard switching. It was also found that the
proportional response of the phase detector can be further optimized by changing the shape of
the current pulse. So far the current pulse has been an ideal square wave, however there is no
reason why it must be so. To obtain the best performance extensive Matlab simulations were
run in order to determine the optimal shape of the current pulse. The results of the simulation
are shown in Figure 5.17. In Figure 5.17 the x-axis represents the number of clock periods the
pulse is active for and the y-axis represents the ratio of the pulse current to the normal value
of the current. The figure shows that the optimal shape of the current pulse as determined by
Matlab simulations was a large initial current which decayed over several periods until it reached
the final value. These results illustrate that the shape of the current pulse does have an effect on
Figure 5.16: Relationship between t1 and tpulse given a 2nd order filter
the performance of the CDR circuit. Optimizing the shape of the pulse does not affect the filter
design, even though the equations were derived assuming a square current pulse. If t1 is used as
the pulse width, I1 as the maximum current, and I2 as the minimum current, Equations 5.2 and
5.3 still hold. This architecture is referred to as the pulsed DFF phase detector.
5.2.3 Simulation Results
In order to illustrate the benefits of the proposed phase detector, simulations were performed to
compare it against other standard architectures. Verilog-A models of all blocks were
created and the proposed phase detector was compared against a standard DFF with an RC
filter providing the proportional and integral paths. In these simulations the filter was ideal and
the effects of the parasitic capacitance were ignored.
Jitter transfer is a common FOM used to examine the tracking performance of CDR circuits.
Figure 5.17: Matlab plot showing the simulated ideal current pulse characteristic
Figure 5.18: Jitter transfer for DFF and pulsed DFF phase detectors
Figure 5.18 shows the jitter transfer for both a regular DFF phase detector and the proposed pulsed
DFF phase detector. For these simulations the magnitude of the input jitter was 25ps and the
data rate was 10Gb/s. As can be seen, the pulsed DFF phase detector tracks the low frequency
jitter better than the standard DFF based phase detector, without affecting the jitter transfer
Figure 5.19: Jitter transfer for various values of Jin for both phase detectors
bandwidth. The derivations in Section 2.6 showed that the jitter transfer bandwidth for binary
phase detectors is dependent on the magnitude of the input jitter. Figure 5.19 illustrates the
effect on the jitter transfer responses of the two phase detectors as the input jitter magnitude is
varied from 5ps to 35ps. As can be seen, for all input jitter magnitudes the novel phase detector
circuit keeps a tighter lock, resulting in a lower jitter transfer gain at low frequencies, however,
the jitter transfer bandwidths of the two circuits are approximately the same.
Next, the jitter tolerance of the pulsed phase detector was examined. Cadence simulations
were performed on both the standard DFF and the pulsed DFF phase detector and the resulting
jitter tolerance of both circuits is shown in Figure 5.20. For input jitter frequencies which are
low, the jitter tolerance response depends more on the integral response of the phase detector
and hence the pulsed DFF phase detector is of little benefit as the current pulse only affects the
proportional response. As such, the jitter tolerance of the pulsed DFF phase detector matches
that of the standard DFF phase detector at lower input jitter frequencies. However, as the input
jitter frequency increases, the response of the proportional path dominates and the optimization
of the pulsed DFF phase detector’s proportional response becomes significant.
Figure 5.20: Simulation results showing jitter tolerance for both phase detectors
5.2.4 Implementation
In order to verify the performance benefits of the proposed phase detector, a complete CDR
circuit was designed in a 130nm standard CMOS process. The proposed CDR circuit was designed
to operate at approximately 10Gb/s in order to compare the performance against standards such
as OC-192 and 10Gb Ethernet. The CDR circuit was designed so that the pulsed DFF and
standard DFF responses could be compared against one another.
Phase Detector and Charge Pump
The phase detection in the pulsed DFF phase detector is performed by a CML DFF. The DFF
was implemented as a dual-edge triggered CML DFF, which was previously described in Section
4.3.2. The simulated gain of the CML DFF is given in Figure 5.21. The input to the charge
pump is a single differential UP/DOWN signal and as such the charge pump is implemented
as previously shown in Figure 4.18. The only difference from the common charge pump is that
in this design the charge pump current is not constant, but rather is controlled by the pulse
Figure 5.21: Simulated gain of the proposed phase detector
Figure 5.22: Schematic of circuit used to generate the current pulse
generation circuit.
Pulse Generator
Two methods were developed to generate the current pulse. The first method is similar to a
current DAC whereas the second method is an analog method which uses an RC circuit to
generate the current pulse. Each design is able to generate a current pulse, however each has
advantages and drawbacks. Both of these schemes rely at least partially on a circuit which
generates a sequence of pulses. The block diagram of this circuit and the waveforms it generates
are shown in Figure 5.22.
The circuit contained in the shaded area labelled ‘1’ generates the pulse VA and it is always
active. The circuit contained in the shaded area labelled ‘2’ generates pulses VB through VF and
this is only activated in the digital current DAC scheme. The widths of pulses VA through VE are
equal to the delay through the variable delay line ∆T. The delay in the delay line is created via
a sequence of current starved inverters, and the magnitude of delay is controlled externally. The
digital current DAC scheme uses all of the pulses which are generated. The pulses VA through
VE are used to turn on switches which connect the charge pump bias current to five different
bias currents which are controlled externally. The pulse VF is used to enable a switch to set the
normal current, after the current pulse has completed. By setting these bias currents the shape of
the current pulse can be arbitrarily defined. One could extend this technique to add more pulses
and create a more detailed current pulse, at the expense of design complexity, power, area and
noise. However, the Matlab simulations showed no benefit in controlling the current pulse beyond
a certain point. The schematic of the digital current DAC scheme is shown in Figure 5.23a.
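As a rough illustration of the digital current DAC scheme described above, the following Python sketch models the charge pump current as a staircase: one externally set level per pulse window of width ∆T (pulses VA through VE), followed by the normal current once VF is active. The function name, the units (ps and µA) and all numeric values are assumptions made for this example and are not taken from the design.

```python
def dac_current_pulse(t_ps, delta_t_ps, step_currents_ua, normal_current_ua):
    """Idealized charge pump current (uA) at time t_ps after a data transition.

    step_currents_ua holds one level per pulse window (V_A..V_E in the text).
    After the last window the V_F switch selects normal_current_ua.
    Rise and fall times of the switches are ignored in this sketch.
    """
    if t_ps < 0:
        return normal_current_ua
    window = int(t_ps // delta_t_ps)  # index of the active pulse window
    if window < len(step_currents_ua):
        return step_currents_ua[window]
    return normal_current_ua


# Hypothetical settings: five windows of 20 ps each, levels in uA.
DELTA_T_PS = 20
STEPS_UA = [400, 320, 240, 180, 140]
NORMAL_UA = 100

if __name__ == "__main__":
    for t in range(0, 121, 10):
        print(t, dac_current_pulse(t, DELTA_T_PS, STEPS_UA, NORMAL_UA))
```

Adding more entries to the list of levels corresponds to adding more pulses to the delay line, which is the extension the text mentions along with its cost in complexity, power, area and noise.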
The analog RC scheme uses only the first pulse, VA, and while this scheme is enabled
the circuits which generate the pulses VB through VF are disconnected. The pulse VA turns on a
PMOS transistor which sets the charge pump current to IB, which is the maximum value of the
current pulse. Once VA returns to zero the charge pump’s bias voltage returns to VB at a rate
determined by the RC circuit created by the capacitor and the resistance through the NMOS
transistor. The voltage Vbias is used to set the resistance of the NMOS transistor. The schematic
of the analog RC scheme is shown in Figure 5.23b. The range of simulated pulses generated by
the analog RC scheme is shown in Figure 5.24.
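The shape produced by the analog RC scheme can be approximated with a first-order model. The sketch below holds the current at its peak while VA is high and then lets it relax exponentially toward the normal value with a time constant RC. This is a simplifying assumption: in the circuit it is the bias voltage that relaxes, and the current follows it through the transistor characteristic. The function name, units and numeric values are hypothetical.

```python
import math


def rc_current_pulse(t_ps, width_ps, i_peak_ua, i_normal_ua, tau_ps):
    """First-order approximation of the RC-shaped charge pump current (uA).

    width_ps : duration of pulse V_A, during which the current is held at i_peak_ua
    tau_ps   : assumed RC time constant set by the capacitor and the NMOS resistance
    """
    if t_ps < 0:
        return i_normal_ua
    if t_ps < width_ps:
        return i_peak_ua
    decay = math.exp(-(t_ps - width_ps) / tau_ps)
    return i_normal_ua + (i_peak_ua - i_normal_ua) * decay


if __name__ == "__main__":
    # Hypothetical values: 20 ps hold, 400 uA peak, 100 uA normal, three time constants.
    for tau in (10.0, 20.0, 40.0):
        trace = [round(rc_current_pulse(t, 20.0, 400.0, 100.0, tau), 1) for t in range(0, 101, 20)]
        print(tau, trace)
```

Sweeping the time constant in this model plays the role of sweeping Vbias in the circuit, since Vbias sets the NMOS resistance and therefore RC.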
Figure 5.23: Schematics of both the digital and analog current pulse circuits
Figure 5.24: The range of simulated current pulses for the RC scheme
LC-tank VCO
The VCO was implemented using an LC-tank oscillator. The architecture of the LC-tank os-
cillator used in this design has been previously described in Section 2.3.4. The varactors were
implemented using AMOS varactors, which were previously discussed in Section 5.1.3.
Figure 5.25: Simulated eye diagram for both a regular and pulsed DFF phase detector
While accumulation mode varactors are not usually a part of standard CMOS technology kits, this process
included detailed models for the devices. The VCO was designed to have a centre frequency of
approximately 10.5GHz, a coarse tuning range of approximately 2.5GHz and a fine tuning range
of 400MHz.
Back-Annotated Simulations
Once the design and layout of the CDR circuit was completed, parasitics were back annotated
and simulations were performed with a random input data sequence. Both the regular DFF
phase detector and the proposed pulsed DFF phase detector were simulated and the resulting
eye diagrams are shown in Figure 5.25. As can be seen the simulated peak to peak jitter of the
standard DFF phase detector is 8.55ps, whereas the simulated peak to peak jitter for the pulsed
DFF phase detector is only 3.11ps.
Figure 5.26: Micrograph of the proposed pulsed DFF phase detector
5.2.5 Measured Results
A CDR circuit using the pulsed-DFF phase detector was designed and fabricated in a 130nm
standard CMOS process. The total die area was 0.8mm × 0.625mm, including pads. The
micrograph of the fabricated chip is given in Figure 5.26. Excluding the inductor, the active area
of the entire CDR circuit is only 0.3mm × 0.2mm.
While the CDR circuit was designed to operate at 10Gb/s, silicon results of the oscillator
differed significantly from the back-annotated simulations. Figure 5.27 shows the measured and
simulated frequency results from the LC-tank VCO. As can be seen the VCO is oscillating at
a significantly higher frequency than expected. The midpoint of the measured coarse frequency
Figure 5.27: Simulated and measured VCO frequency
tuning range is approximately 4GHz, or 40%, higher than the designed centre frequency. As
described in Section 2.3.4, the frequency of an LC-tank oscillator is defined as $\omega_o = 1/\sqrt{LC}$. This
means that the realized LC product must be roughly half of the desired value (equivalently, the
desired LC product is almost 90% larger than the realized one).
The inductor was chosen using measured data from a foundry datasheet and is largely defined
by the geometry. As such, it is likely the varactors which caused the increase in frequency.
AC simulations performed before tapeout indicated that there was a drop in capacitance as the
frequency increased. The results of these simulations are shown in Figure 5.28. While the
simulations indicate a slight drop in the capacitance at 10GHz, it is certainly not of the magnitude
indicated from the measured results. It is also interesting to note that the measured fine frequency
tuning closely matches the simulated fine frequency range, however the measured coarse tuning
range is significantly less. This is curious as the coarse tuning varactors are exactly the same as
the fine tuning varactors, they are simply a larger array of the same device. As such, it is not clear
Figure 5.28: Simulated capacitance of AMOS varactors
why the fine tuning varactors seem to respond as expected, while the coarse tuning varactors do
not. It is still likely that the varactors are the cause of the increased frequency, however the issue
is as yet unresolved.
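The size of the LC deviation implied by the measured frequency shift can be checked with a short calculation based on the LC-tank relation from Section 2.3.4. The sketch below uses the designed centre frequency of approximately 10.5GHz and the reported shift of approximately 4GHz. The function name is an assumption made for this example.

```python
def lc_product_ratio(f_measured_hz, f_designed_hz):
    """Ratio (LC realized) / (LC designed) implied by omega_o = 1 / sqrt(LC).

    Since frequency scales as 1 / sqrt(LC), the LC product scales as 1 / f^2.
    """
    return (f_designed_hz / f_measured_hz) ** 2


F_DESIGNED_HZ = 10.5e9                 # designed centre frequency
F_MEASURED_HZ = F_DESIGNED_HZ + 4.0e9  # approximately 4 GHz higher, as measured

RATIO = lc_product_ratio(F_MEASURED_HZ, F_DESIGNED_HZ)

if __name__ == "__main__":
    print("realized LC / designed LC:", round(RATIO, 3))
    print("designed LC / realized LC:", round(1.0 / RATIO, 3))
```

With these approximate figures the realized LC product comes out at a little over half of the designed one. If the inductance is taken as fixed by geometry, the same ratio would apply to the total tank capacitance.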
Despite the fact the CDR circuit was operating at a data rate which was 50% greater than
what it was designed for, it was able to lock to a 15Gb/s PRBS. Figure 5.29 shows the spectrum
of the clock when the CDR circuit was locked to a PRBS of $2^{31} - 1$. The CDR circuit was able
to lock to data rates from 14Gb/s to 15.5Gb/s, with an input PRBS of $2^{31} - 1$. By turning off
the pulse generating circuit, the phase detector could operate as a regular DFF phase detector.
This allowed a comparison between the standard DFF phase detector and the pulsed DFF phase
detector. Figure 5.30 shows the jitter on the clock for both the regular DFF phase detector
and the pulsed DFF phase detector for an input 15Gb/s PRBS of $2^{31} - 1$. As can be seen, the
measured jitter for the regular DFF phase detector is Jpp = 12.2ps and Jrms = 1.72ps, while the
pulsed DFF phase detector improved the jitter to Jpp = 3.3ps and Jrms = 0.42ps. For the pulsed
DFF measurements the analog RC method of generating the pulse was used. The digital DAC
Figure 5.29: Spectrum of the 15GHz output clock signal
Figure 5.30: Jitter histogram for the output clock for both phase detectors
scheme relies on the propagation of multiple high-speed signals, and these ended up coupling into
the CDR circuit, resulting in significant performance degradation.
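For reference, the two jitter figures quoted above can be computed from a set of clock edge timing deviations as sketched below. This is a generic definition, assuming the deviations are available as a list of values in picoseconds. It is not a description of how the measurement instrument derives them, and the function name is hypothetical.

```python
import statistics


def jitter_metrics(edge_deviations_ps):
    """Return (peak-to-peak jitter, RMS jitter) for a list of edge deviations in ps.

    RMS jitter is taken here as the population standard deviation of the
    deviations, so a constant timing offset does not count as jitter.
    """
    j_pp = max(edge_deviations_ps) - min(edge_deviations_ps)
    j_rms = statistics.pstdev(edge_deviations_ps)
    return j_pp, j_rms


# Improvement factors implied by the measured values reported in the text.
JPP_IMPROVEMENT = 12.2 / 3.3
JRMS_IMPROVEMENT = 1.72 / 0.42

if __name__ == "__main__":
    print(jitter_metrics([-1.0, 0.0, 1.0]))
    print(round(JPP_IMPROVEMENT, 2), round(JRMS_IMPROVEMENT, 2))
```

Applied to the reported numbers, both the peak-to-peak and the RMS jitter improve by roughly a factor of four when the pulse generator is enabled.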
While the CDR circuit was able to lock to the incoming data sequence, unfortunately the
Figure 5.31: Output data signal
data retiming circuit was unable to properly retime the data and the data output was corrupted.
This is not all that surprising given that the data rate is 50% higher than what the circuit was
designed for. As such it was not possible to measure BER and other figures of merit. The output
data waveform when the CDR circuit is locked to a PRBS of $2^{31} - 1$ is given in Figure 5.31.
5.3 Bandwidth Enhanced Linear Phase Detector
Most recent implementations of multi-Gb/s CDR circuits have used binary phase detectors, how-
ever Section 4.1.2 provided a justification for the use of linear phase detectors. The Hogge phase
detector is the most common linear phase detector architecture however it is difficult to success-
fully implement in standard CMOS processes. As was shown in Section 3.4, the process variations
associated with the aggressive scaling of CMOS processes can cause significant problems for this
circuit. In Section 3.4.2 it was shown that the Hogge phase detector demonstrates significant vari-
ations in its static phase offset over process corners. In this section a simple circuit modification
is proposed to improve the robustness of the Hogge phase detector.
5.3.1 Robustness in the Hogge Phase Detector
The robustness analysis of the standard Hogge phase detector has already been performed, and
the detailed simulation results for the 8Gb/s CDR circuit implemented in a 130nm CMOS
process are shown in Figure 5.32a. As can be seen, the response is linear for small input phase
errors, however for significant errors the phase detector cannot operate properly and the response
deviates significantly from the ideal. In order to get the overall system response the UP pulse
width is subtracted from the DOWN pulse width, which is shown in Figure 5.32b. Figure 5.32c
shows the expanded response in the region of the zero crossing point, and this shows the significant
variation in the static phase offset of the phase detector over corners. The relationships derived
in Section 3.2 illustrate that static phase offsets degrade the performance of a CDR circuit, and
hence while the ideal behavior of the linear phase detector is attractive it is desirable to use a
circuit with less sensitivity to process variations.
One interesting result which emerges from Figure 5.32 is that the variations in the width of the
DOWN pulse are more significant than the variations in the UP pulse width. It is primarily the
variations in the DOWN pulse width which have led to the variations in the zero crossing point
of the response of the phase detector. The reason for this is that in the Hogge phase detector
Figure 5.32: Simulated response of the original Hogge phase detector over corners
the back-to-back DFFs are clocked on opposite edges of the clock. This means that DFF2 has
at most $T/2$ to sample the output of DFF1. The C-Q delay of DFF1 reduces this, and variations
in the C-Q delay of DFF1 can cause the setup time of DFF2 to be violated.
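The timing argument above can be made concrete with a simple slack calculation. The sketch below assumes a setup-time model in which the time available to DFF2 is a fraction of the bit period (one half for opposite-edge clocking) minus the C-Q delay of DFF1 and the setup time of DFF2. The function name and all delay values are hypothetical and are not taken from the corner simulations.

```python
def dff2_setup_slack_ps(bit_period_ps, window_fraction, t_cq_ps, t_setup_ps):
    """Setup slack of DFF2 in ps. A negative value indicates a setup violation.

    window_fraction : fraction of the bit period between the clock edge that
                      launches DFF1 and the edge that captures into DFF2
                      (0.5 when the two DFFs use opposite clock edges)
    """
    return window_fraction * bit_period_ps - t_cq_ps - t_setup_ps


BIT_PERIOD_PS = 125.0  # 8 Gb/s

if __name__ == "__main__":
    # Hypothetical C-Q delays standing in for fast, typical and slow corners.
    for t_cq in (30.0, 45.0, 60.0):
        print(t_cq, dff2_setup_slack_ps(BIT_PERIOD_PS, 0.5, t_cq, 15.0))
```

With these illustrative numbers the slack shrinks from a comfortable margin to a violation as the C-Q delay grows, which is the mechanism the text identifies for the DOWN pulse variation.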
Figure 5.33: Architecture of a modified Hogge phase detector
5.3.2 Modified Hogge Phase Detector
In this section a simple modification to the Hogge phase detector is proposed in order to improve
the response of the DOWN pulse and alleviate this problem. There have been previous efforts to
improve the Hogge phase detector. In [72] a half-rate Hogge phase detector is introduced and [78]
introduces a tri-wave Hogge phase detector. These circuits improve the performance, however
all have the same weakness of back-to-back DFFs clocked on opposite edges of the clock. The
proposed phase detector helps to alleviate this problem by stretching the DOWN pulse. The
architecture and functional waveforms of the original and the proposed phase detector are shown
in Figure 5.33. The only difference from the standard Hogge phase detector is that DFF2 is
clocked on the same edge of the clock as DFF1. By simply clocking DFF2 half a period later the
setup time of DFF2 is lengthened by half a period. This can be seen on the waveforms, where for
Figure 5.34: Ideal transfer characteristic for the standard and modified Hogge
the proposed circuit Q2 is delayed by $T/2$, which in turn widens DOWN by $T/2$. Therefore the DOWN
reference pulse for the proposed circuit is double the width compared to the original Hogge phase
detector. The resulting ideal transfer characteristics for the original and modified Hogge phase
detectors are shown in Figure 5.34. The total amount of charge added to or subtracted from the
loop filter is simply a function of the bias current and the pulse width. The DOWN pulse for the
proposed phase detector is twice as large, and hence the current should be halved in order that
the overall response be equal to that of the original Hogge phase detector.
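The charge balance argument can be checked numerically. The sketch below assumes the usual idealization of the Hogge phase detector, in which the UP pulse width is $T/2$ plus the timing error and the DOWN pulse is a fixed reference of width $T/2$ (original) or $T$ (modified). The sign convention, function name, units (ps and µA) and numeric values are assumptions made for this example.

```python
def net_charge(phase_error_ps, bit_period_ps, i_up_ua, i_down_ua, down_width_fraction):
    """Net charge per data transition, in uA*ps, for an idealized Hogge detector.

    down_width_fraction : DOWN reference width as a fraction of the bit period
                          (0.5 for the original circuit, 1.0 for the modified one)
    """
    up_width_ps = bit_period_ps / 2 + phase_error_ps
    down_width_ps = down_width_fraction * bit_period_ps
    return i_up_ua * up_width_ps - i_down_ua * down_width_ps


T_PS = 125.0  # 8 Gb/s

if __name__ == "__main__":
    for err in (-20.0, 0.0, 20.0):
        original = net_charge(err, T_PS, 100.0, 100.0, 0.5)
        modified = net_charge(err, T_PS, 100.0, 50.0, 1.0)  # DOWN current halved
        print(err, original, modified)
```

With the DOWN current halved, the two detectors deliver the same net charge for every timing error in this ideal model, and both deliver zero net charge at zero error.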
In order to compare the modified phase detector to the original circuit, simulations are per-
formed as in Section 3.4 so as to determine the effect of process corners on the transfer char-
acteristic of the phase detector. The same corners were used and the results of the simulations
are shown in Figure 5.35. These results closely match the ideal response in Figure 5.34 for small
input phase errors, however as with the original Hogge phase detector the response breaks down
for large input phase errors. By subtracting the UP and DOWN pulse widths the overall phase
detector response can be determined, and this is shown in Figure 5.35. There isn’t a large dif-
ference in the linear range between the two phase detectors, however the total variation in the
Figure 5.35: Simulated response of the modified Hogge phase detector over corners
static phase offset of the modified Hogge phase detector is only 5°, which is a significant reduction
compared to the original Hogge circuit which had a total variation of 45°. Comparing Figure 5.35
to Figure 5.32 it can be seen that for the modified Hogge phase detector the DOWN pulse widths
are much more uniform compared to the original circuit. This is due to the extra time that DFF2
has to sample the output of DFF1. While this is a simple modification it has a profound effect
on the robustness of this circuit.
5.3.3 Charge Pump Currents
In order to use the proposed phase detector, a charge pump which has separate controls for the
UP and DOWN paths must be used. The DOWN path performs in a near-ideal manner, however
Figure 5.36: Simulated eye diagram before and after calibration
the UP path must go through a current mirror which can attenuate the signal due to bandwidth
limitations. This means that a larger UP pulse than in the ideal case is required to compensate
for the attenuation in the UP path, which manifests itself as a static phase offset. This problem
is present in the original Hogge phase detector however the proposed modification exacerbates
the problem as the width of the UP pulse is now half the width of the DOWN pulse.
In order to observe the effect of this the CDR circuit is simulated using Verilog-A models
for all blocks other than the phase detector and charge pump. Figure 5.36a gives the resulting
eye diagram, which shows a 21.97ps static phase offset. This is a large error which will limit the
performance of the CDR circuit. The DOWN bias current in the charge pump should theoretically
be half that of the UP bias current. However, simulations show that the required UP bias current
is closer to three times that of the DOWN bias current, and this ratio varies over process corners.
In order to compensate for this, the calibration circuit described in Section 4.3 was implemented.
As can be seen in Figure 5.36b, the calibration circuit is capable of properly adjusting the charge
pump currents in order to eliminate the static phase offset.
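The link between UP path attenuation and static phase offset can be illustrated with a charge balance model. The sketch below makes a strong simplifying assumption: the bandwidth limitation of the current mirror is represented by a single scale factor on the delivered UP charge, which is not how the attenuation arises in the actual circuit. The function name, units and numeric values are hypothetical.

```python
def static_offset_ps(bit_period_ps, i_up_ua, i_down_ua, up_attenuation):
    """Static timing offset (ps) at lock for the modified Hogge detector.

    Lock is assumed where the delivered UP charge equals the DOWN charge:
        up_attenuation * i_up * up_width = i_down * bit_period
    The offset is the difference between that UP width and the ideal T/2.
    """
    down_charge = i_down_ua * bit_period_ps
    up_width_at_lock_ps = down_charge / (up_attenuation * i_up_ua)
    return up_width_at_lock_ps - bit_period_ps / 2


if __name__ == "__main__":
    T_PS = 125.0  # 8 Gb/s, used here only as an example
    print(static_offset_ps(T_PS, 100.0, 50.0, 1.0))        # ideal 2:1 ratio, no attenuation
    print(static_offset_ps(T_PS, 100.0, 50.0, 2.0 / 3.0))  # same ratio, attenuated UP path
    print(static_offset_ps(T_PS, 150.0, 50.0, 2.0 / 3.0))  # ratio raised to 3:1
```

In this simplified model, a required UP to DOWN current ratio near three would correspond to an effective attenuation of about two thirds. The model is not intended to reproduce the simulated 21.97ps offset, and since the required ratio varies over process corners, a fixed ratio cannot replace the calibration circuit.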
5.4 Summary
This chapter has proposed phase detectors which were designed with the goal of optimizing both
performance and robustness. Two binary phase detectors were presented which used the simple,
robust DFF phase detector. The goal of these phase detectors was to address the performance
limitations of CDR circuits which use the DFF phase detector while maintaining the inherent
robustness. A novel tri-state DFF phase detector was proposed and was shown to have an identical
response to the Alexander phase detector. However, at a data rate of 5Gb/s the variation in the
static phase offset over corners for the proposed circuit was only 7.1ps, as compared with 13.9ps
for the Alexander phase detector. A CDR circuit implementing this phase detector was designed
and fabricated in a 180nm standard CMOS process. Given a $2^{31} - 1$ PRBS at 6.25Gb/s the
measured RMS and peak-to-peak jitter of the CDR circuit were 1.7ps and 11ps. The second
binary phase detector improved the performance of the simple DFF phase detector by shaping
the charge pump current. A CDR circuit implementing this phase detector was designed and
fabricated in a 130nm standard CMOS process. Given an input 15Gb/s PRBS of $2^{31} - 1$ the
measured peak-to-peak jitter for the pulsed DFF phase detector (compared with the standard DFF phase
detector) improved from 12.2ps to 3.3ps. Linear phase detectors were also considered, with the
goal of improving robustness rather than performance. A simple change to the Hogge phase
detector was proposed which drastically reduces the overall variation in the static phase offset.
Corner simulations of the original and modified phase detectors in a 130nm process at a data rate
of 8Gb/s show that the modified Hogge phase detector reduces the variation of the static phase
offset from 45° to 5°.
Chapter 6
Conclusions
The bandwidth of interconnects is becoming an increasingly important factor in modern com-
puting systems and has led to a trend towards the use of serial interconnects. The integration
of these interconnects into monolithic ICs provides benefits, especially in terms of system cost.
The scaling of CMOS processes has enabled this integration, however the increasing variability
associated with scaled processes means that robust design of integrated CDR circuits is becoming
increasingly difficult.
In Chapter 2 a background on wireline data communication was provided. Both optical
and backplane systems were described, as were corresponding differentiators. The purpose and
operation of the CDR circuit was introduced and the individual circuits which comprise a CDR
circuit were detailed. Various figures of merit used to characterize the performance of CDR circuits
in both the frequency domain and the time domain were described. Mathematical models of a
CDR circuit were formulated for both the case of a linear phase detector and the case of a binary
phase detector. Using these models, mathematical expressions were derived for several of the
figures of merit.
In Chapter 3 the concept of robustness as it applies to CDR circuits was defined. Various
causes of process non-idealities were identified and the static phase offset was introduced as a
measure of robustness for CDR circuits. A mathematical expression was derived which quanti-
fied the relationship between static phase offsets and BER. A CML DFF circuit was analyzed
to determine its response over process corners and subsequently three standard phase detector
circuits were analyzed over process corners in order to compare their robustness. The analysis
showed that the robustness of all phase detectors degraded as the process scaled, however certain
circuits showed advantages in terms of robustness.
Chapter 3 defined the concept of circuit robustness, and Chapter 4 introduced the idea of
enhancing the robustness of CDR circuits via calibration. A calibration scheme was proposed which
involved using the charge pump currents to compensate for static phase offsets. Two calibration
circuits, one offline and one online, were designed and implemented in a 180nm standard CMOS
process. The measured results from these designs showed that by using calibration circuits to
correct for static phase offsets, the performance of a CDR circuit can be improved.
The use of calibration circuits to compensate for the effects of process variation is valuable,
however it would be preferable to use circuits which did not suffer those effects. Chapter 5
examined ways to optimize CDR circuits for performance and robustness by creating novel phase
detector circuits. Three circuits were proposed, two binary and one linear. The two binary
circuits were implemented in CDR circuits which were fabricated in 180nm and 130nm standard
CMOS processes. Simulated results from two designs demonstrated an increase in performance
without sacrificing robustness. Finally a modification to the linear Hogge phase detector was
proposed which significantly improved the robustness while maintaining the desired performance.
6.1 Major Contributions
In this thesis the robust design of multi-Gb/s CDR circuits has been examined. In particular, the
effects of process non-idealities on CDR circuits have been analyzed and new robust phase detector
circuits have been proposed with the goal of optimizing both performance and robustness. The
major contributions of this thesis are as follows:
1. Focus on Manufacturability
Virtually all published research into CDR circuits is aimed at maximizing performance. Due
to scaling, integrated high-speed serial interconnects are becoming increasingly necessary,
and as such their robustness is a significant concern. Several common phase detectors
were analyzed to examine the effect of process non-idealities on CDR circuits. Simulations
demonstrated that the robustness of all phase detectors deteriorates with scaling. The
results show that the Hogge has the highest sensitivity to process variations and that the
DFF phase detector has the least. In addition, a new mathematical model was developed
to quantify the effects of static phase offsets on the BER performance of CDR circuits.
2. CDR Circuit Calibration
The effects of process variability can negatively affect the performance of a CDR circuit
in a variety of ways, including the creation of static phase offsets. Static phase offsets
have a significant negative effect on CDR circuits and in order to compensate for them an
approach was proposed which changes the charge pump current in order to properly align
the clock and data signals. Two calibration circuits were implemented, one offline and the
other online, both of which sense and compensate for static phase offsets. With a PRBS of
$2^{7} - 1$ the offline calibration circuit improved the measured BER of a CDR circuit from
$4.6 \times 10^{-2}$ to less than $10^{-13}$.
3. Robust Phase Detectors
There have been a multitude of phase detector circuits presented in literature, however none
of them have been designed with robustness as an explicit criterion. In this thesis three phase
detector circuits were proposed where robustness was a significant design consideration.
Both linear and binary phase detector circuits were proposed, and all demonstrated an
improvement in either robustness or performance without degrading the other. As compared
to a standard DFF phase detector, the pulsed DFF phase detector improved the measured
peak-to-peak jitter for a 15Gb/s PRBS of $2^{31} - 1$ from 12.2ps to 3.3ps.
6.2 Future Work
The integration of CDR circuits into monolithic CMOS ICs is relatively recent and it presents
many opportunities for future work. This thesis has shown that in spite of the challenges associ-
ated with scaling there are ways to improve current CDR performance, especially in the context of
robustness. It is the hope of the author that the work presented here will be useful in addressing
some of the future challenges. There are several areas where the research presented in this thesis
could be continued and extended.
1. Design of Low-Power CMOS CDR Circuits
Aggressive process scaling allows multi-Gb/s CDR circuits to be implemented in logic families
other than CML. While the use of static CMOS is attractive in terms of power and
area considerations, more research is needed to determine if issues such as noise can be
compensated for in the context of integration into monolithic ICs. It is also possible that a
hybrid CML - static CMOS approach would be optimal.
2. Use of Calibration in CDR Circuits
Calibration provides a way to maintain performance while improving the robustness. More
research into calibrated CDR circuits could improve other aspects of the design. Also,
several of the proposed techniques could be implemented in parallel. For example, offline
and online calibration circuits could both be included in a single CDR circuit.
3. Multiple Interconnect Integration
The integration of many CDR circuits onto a single die brings new challenges. With multiple
CDR circuits on a single die, issues such as VCO pulling, area limitation and distributed
architectures should all be considered. Finally, design for test will be very important as
individually testing multiple interconnects could be cost prohibitive.
Bibliography
[1] V. Stojanovic and M. Horowitz, “Modeling and Analysis of High-Speed Links,” IEEE Custom
Integrated Circuits Conference, pp. 589 – 594, September 2003.
[2] Agilent Technologies, “Jitter Analysis Techniques for High Data Rates,”
cp.literature.agilent.com/litweb/pdf/5988-8425EN.pdf, 2003.
[3] S. Nassif, “Delay Variability: Sources, Impacts and Trends,” IEEE International Solid-State
Circuits Conference, pp. 368 – 369, 2000.
[4] K. Bernstein, D. J. Frank, A. E. Gattiker, W. Haensch, B. L. Ji, S. R. Nassif, E. J. Nowak,
D. J. Pearson, and N. J. Rohrer, “High-Performance CMOS Variability in the 65-nm Regime
and Beyond,” IBM Journal of Research and Development, vol. 50, pp. 433 – 449, 2006.
[5] B. Razavi, ed., Monolithic Phase-Locked Loops and Clock Recovery Circuits. Piscataway, NJ:
IEEE Press, 1996.
[6] R. Walker, “Clock and Data Recovery for Serial Digital Communication,” IEEE Interna-
tional Solid-State Circuits Conference Short Course, February 2002.
[7] B. Razavi, High-Speed CMOS Circuits for Optical Receivers. McGraw-Hill, 2002.
[8] R. Farjad-Rad, A. Nguyen, J. Tran, T. Greer, J. Poulton, W. Dally, J. Edmondson, R. Sen-
thinathan, R. Rathi, M.-J. Lee, and H.-T. Ng, “A 33-mW 8-Gb/s CMOS Clock Multiplier
and CDR for Highly Integrated I/Os,” IEEE J. Solid-State Circuits, vol. 39, pp. 1553 – 1561,
September 2004.
[9] T. Toifl, C. M. M. Ruegg, R. Reutemann, P. Buchmann, M. Kossel, T. Morf, J. Weiss, and
M. L. Schmatz, “A 22-Gb/s PAM-4 Receiver in 90-nm CMOS SOI Technology,” IEEE J.
Solid-State Circuits, vol. 41, pp. 954 – 965, April 2006.
[10] J. Zerbe, C. Werner, V. Stojanovic, F. Chen, J. Wei, G. Tsang, D. Kim, W. Stonecypher,
A. Ho, T. Thrush, R. Kollipara, G.-J. Yeh, M. Horowitz, and K. Donnelly, “Equalization
and Clock Recovery for a 2.5-10Gb/s 2-PAM/4-PAM Backplane Transceiver Cell,” IEEE
International Solid-State Circuits Conference, pp. 80 – 479, 2003.
[11] M. Hsieh and G.E. Sobelman, “SiGe BiCMOS PAM-4 Clock and Data Recovery Circuit
for High-Speed Serial Communications,” IEEE Systems-on-Chip Conference, pp. 305 – 308,
2003.
[12] A. Li, J. Faucher, and D. Plant, “Burst-Mode Clock and Data Recovery in Optical Multiac-
cess Networks Using Broad-Band PLLs,” IEEE Photonics Technology Letters, vol. 18, pp. 73
– 75, January 2006.
[13] J. Kim, J. Yang, S. Byun, H. Jun, J. Park, C. Conroy, and B. Kim, “A Quad-Channel
3.125Gb/s/ch Serial-Link Transceiver With Mixed-Mode Adaptive Equalizer,” IEEE J.
Solid-State Circuits, vol. 40, pp. 462 – 471, February 2005.
[14] Y. Tomita, M. Kibune, J. Ogawa, W. W. Walker, H. Tamura, and T. Kuroda, “A 10-Gb/s
Receiver With Series Equalizer and On-Chip ISI Monitor in 0.11µm CMOS,” IEEE J. Solid-
State Circuits, vol. 40, pp. 986 – 993, April 2005.
[15] R. Palmer, J. Poulton, W. J. Dally, J. Eyles, A. Fuller, T. Greer, M. Horowitz, M. Kellam,
F. Quan, and F. Zarkeshvari, “A 14mW 6.25Gb/s Transceiver in 90nm CMOS for Serial Chip-to-Chip
Communications,” IEEE International Solid-State Circuits Conference, pp. 440 –
441, February 2007.
[16] J. Savoj and B. Razavi, High-Speed CMOS Circuits for Optical Receivers. New York: Kluwer,
2001.
[17] B. Razavi, “Challenges in the Design of High-Speed Clock and Data Recovery Circuits,”
IEEE Communications Magazine, vol. 40, pp. 94 – 101, August 2002.
[18] R. Walker, “Designing Bang-bang PLLs for Clock and Data Recovery in Serial Data Transmission
Systems,” in Phase-Locking in High-Performance Systems - From Devices to Architectures,
ed. B. Razavi, pp. 34 – 45, IEEE Press, 2003.
[19] C. Hogge, “A Self Correcting Clock Recovery Circuit,” Journal of Lightwave Technology,
vol. 3, pp. 1312 – 1314, December 1985.
[20] J. Rogers and J. Long, “A 10Gb/s CDR/DEMUX with LC Delay Line VCO in 0.18µm
CMOS,” IEEE J. of Solid-State Circuits, vol. 37, pp. 1781 – 1789, December 2002.
[21] A. Pottbacker, U. Langmann, and H.-U. Schreiber, “A Si Bipolar Phase and Frequency
Detector IC for Clock Extraction up to 8 Gb/s,” IEEE J. of Solid-State Circuits, vol. 27,
pp. 1747 – 1751, December 1992.
[22] J. D. H. Alexander, “Clock Recovery from Random Binary Data,” Electronics Letters, vol. 11,
pp. 541 – 542, October 1975.
[23] J. Savoj and B. Razavi, “Design of Half-Rate Clock and Data Recovery Circuits for Optical
Communication Systems,” in Design Automation Conference, pp. 121 – 126, 2001.
[24] S.-J. Song, S. Park, and H.-J. Yoo, “A 4-Gb/s CMOS Clock and Data Recovery Circuit
Using 1/8-Rate Clock Technique,” IEEE J. Solid-State Circuits, vol. 38, pp. 1213 – 1219,
July 2003.
[25] M. Ramezani and C. Salama, “A 10Gb/s CDR With a Half-Rate Bang-Bang Phase Detec-
tor,” in Circuits and Systems, pp. II–181 – II–184, 2003.
[26] J. Zhuang and T. Kwasniewski, “A 4-Gb/s Half-rate Clock and Data Recovery Circuit with
a Novel PFD and a 3 stage VCO,” IASTED International Conference on Circuits, Signals
and Systems, October 2005.
[27] C. Park, O. Kim, and B. Kim, “A 1.8-GHz Self-Calibrated Phase-Locked Loop With Precise
I/Q Matching,” IEEE J. of Solid-State Circuits, vol. 36, pp. 777 – 783, May 2001.
[28] W. Egan, “Modeling Phase Noise in Frequency Dividers,” IEEE Transactions on Ultrasonics,
Ferroelectrics and Frequency Control, vol. 37, pp. 307 – 315, 1990.
[29] D. Mukherjee, J. Bhattacharjee, and J. Laskar, “A Differentially-Tuned CMOS LC VCO
for Low-Voltage Full-Rate 10Gb/s CDR circuit,” IEEE International Microwave Symposium
Digest, vol. 36, pp. 707 – 710, June 2002.
[30] W. Rhee, “Design of High-Performance CMOS Charge Pumps in Phase-Locked Loops,”
IEEE Symposium on Circuits and Systems, pp. II545 – II548, 1999.
[31] H. Djahanshahi and C. Salama, “Differential CMOS Circuits for 622-MHz/933-MHz Clock
and Data Recovery Applications,” IEEE J. of Solid-State Circuits, vol. 35, pp. 847 – 855,
June 2000.
[32] T. Weigandt, Low-Phase-Noise, Low-Timing-Jitter Design Techniques for Delay Cell Based
VCOs and Frequency Synthesizers. PhD thesis, University of California Berkeley, 1998.
[33] Y. Greshishchev and P. Schvan, “SiGe Clock and Data Recovery IC With Linear-Type PLL
for 10Gb/s SONET Application,” IEEE J. Solid-State Circuits, vol. 35, pp. 1353 – 1359,
September 2000.
[34] H. Jiang, C. He, D. Chen, and R. Geiger, “Optimal Loop Parameter Design of Charge Pump
PLLs for Jitter Transfer Characteristic Optimization,” Midwest Symposium on Circuits and
Systems, vol. 1, pp. 344 – 347, 2002.
[35] F. Gardner, “Charge Pump Phase-Lock Loops,” IEEE Transactions on Communications,
vol. 28, pp. 1849 – 1858, November 1980.
[36] T. Lee, The Design of CMOS Radio Frequency Integrated Circuits. Cambridge University
Press, 1998.
[37] S. Docking and M. Sachdev, “An Analytical Equation for the Oscillation Frequency of High-
Frequency Ring Oscillators,” IEEE J. of Solid-State Circuits, vol. 39, pp. 533 – 537, March
2004.
[38] T. H. Lee and A. Hajimiri, “Oscillator Phase Noise: A Tutorial,” IEEE J. Solid-State Circuits,
vol. 35, pp. 326 – 336, March 2000.
[39] D. Ham and A. Hajimiri, “Concepts and Methods in Optimization of Integrated LC VCOs,”
IEEE J. Solid-State Circuits, vol. 36, pp. 896 – 909, June 2001.
[40] M. Tiebout, “Low-Power Low-Phase-Noise Differentially Tuned Quadrature VCO Design in
Standard CMOS,” IEEE J. Solid-State Circuits, vol. 36, pp. 1018 – 1024, 2001.
[41] S. Wovcienhowsky and R. Nottenburg, “A 10 GHz LC-tuned VCO with Coarse and Fine
Frequency Control,” Electronics Letters, vol. 33, pp. 917 – 918, May 1997.
[42] F. Herzel, M. Pierschel, P. Weger, and M. Tiebout, “Phase Noise in a Differential CMOS
Voltage-Controlled Oscillator for RF Applications,” IEEE Transactions on Circuits and Sys-
tems II, vol. 47, pp. 11 – 15, January 2000.
[43] E. Pedersen, RF CMOS Varactors for Wireless Applications. PhD thesis, Aalborg University
RISC Group, 2001.
[44] D. Rennie, “Design and Optimization of Source-Coupled Logic in Multi-Gbit/s Clock and
Data Recovery Circuits,” Master’s thesis, University of Waterloo, 2003.
[45] M. Alioto and G. Palumbo, “Design Strategies for Source Coupled Logic Gates,” IEEE
Transactions on Circuits and Systems I, vol. 50, pp. 640 – 654, May 2003.
[46] S. Kiaei, S.-H. Chee, and D. Allstot, “CMOS Source-Coupled Logic for Mixed-Mode VLSI,”
IEEE Custom Integrated Circuits Conference, pp. 1608 – 1611, May 1990.
[47] P. Gray, P. Hurst, S. Lewis, and R. Meyer, Analysis and Design of Analog Integrated Circuits.
New York: John Wiley & Sons, 2001.
[48] M. Alioto and G. Palumbo, “Power-Aware Design Techniques for Nanometer MOS Current-
Mode Logic Gates: a Design Framework,” IEEE Circuits and Systems Magazine, vol. 6,
pp. 40 – 59, 2006.
[49] P. Heydari and R. Mohanavelu, “Design of Ultrahigh-Speed Low-Voltage CMOS CML
Buffers and Latches,” IEEE Transactions on VLSI Systems, pp. 1081 – 1093, October 2004.
[50] M. M. Green and U. Singh, “Design of CMOS CML Circuits for High-Speed Broadband
Communications,” IEEE International Symposium on Circuits and Systems, vol. II, pp. 204
– 207, 2003.
[51] P. Heydari, “Design Issues in Low-Voltage High-Speed Current-Mode Logic Buffers,” ACM
Great Lakes symposium on VLSI, pp. 21 – 26, 2003.
[52] D. Johns and K. Martin, Analog Integrated Circuit Design. New York: John Wiley & Sons,
1997.
[53] Y. Greshishchev, P. Schvan, J. Showell, M. Xu, J. Ojha, and J. Rogers, “A Fully Integrated
SiGe Receiver IC for 10-Gb/s Data Rate,” IEEE J. Solid-State Circuits, vol. 35, pp. 1949 –
1957, December 2000.
[54] J. Buckwalter, B. Analui, and A. Hajimiri, “Data-Dependent Jitter and Crosstalk-Induced
Bounded Uncorrelated Jitter in Copper Interconnects,” IEEE International Microwave Sym-
posium, June 2004.
[55] SONET OC-192 Transport System Generic Criteria, GR-1377-CORE, Issue 5, December
1998.
[56] B. Razavi, “A Study of Phase Noise in CMOS Oscillators,” IEEE J. Solid-State Circuits,
vol. 31, pp. 331 – 343, March 1996.
[57] A. Hajimiri, S. Limotyrakis, and T. H. Lee, “Jitter and Phase Noise in Ring Oscillators,”
IEEE J. Solid-State Circuits, vol. 34, pp. 790 – 804, June 1999.
[58] Y. M. Greshishchev, “Clock and Data Recovery ICs for SONET Application,” IEEE VLSI
Circuits Symposium Short Course, June 2000.
[59] Maxim Integrated Products, “Converting Between RMS and Peak-to-Peak Jitter at a Spec-
ified BER,” pdfserv.maxim-ic.com/arpdf/AppNotes/3hfan402.pdf, December 2000.
[60] J. Lee and B. Razavi, “Analysis and Modeling of Bang-Bang Clock and Data Recovery
Circuits,” IEEE J. Solid-State Circuits, vol. 39, pp. 1571 – 1580, September 2004.
[61] M. Ramezani and C. Salama, “Analysis of a half-rate bang-bang phase-locked-loop,” Midwest
Symposium on Circuits and Systems, vol. 49, pp. 505 – 509, 2002.
[62] S. Wang, H. Mei, M. Baig, W. Bereza, T. Kwasniewski, and R. Patel, “Design Considerations
for 2nd-Order and 3rd-Order Bang-Bang CDR Loops,” IEEE Custom Integrated Circuits
Conference, pp. 317 – 320, September 2005.
[63] S. Gondi and B. Razavi, “Equalization and Clock and Data Recovery Techniques for 10-
Gb/s CMOS Serial-Link Receivers,” IEEE J. Solid-State Circuits, vol. 42, pp. 1999 – 2011,
September 2007.
[64] R. Chau, M. D. B. Doyle, S. Datta, G. Dewey, J. Kavalieros, B. Jin, M. Metz, A. Majumdar,
and M. Radosavljevic, “Advanced CMOS Transistors in the Nanotechnology Era for High-
Performance, Low-Power Logic Applications,” International Conference on Solid-State and
Integrated Circuits Technology, vol. 1, pp. 26 – 30, October 18-21 2004.
[65] S. Borkar, “Extending and Expanding Moore’s Law: Challenges and Opportunities,” The
Second Workshop on System Effects of Logic Soft Errors, April 12, 2006.
[66] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, “Parameter Vari-
ations and Impact on Circuits and Microarchitecture,” in Design Automation Conference,
pp. 338 – 342, 2003.
[67] S. Nassif, “Modeling and Forecasting of Manufacturing Variations,” International Workshop
on Statistical Metrology, pp. 2 – 10, September 2000.
[68] Maxim Integrated Products, “Optical Receiver Performance Evaluation,”
www.maxim-ic.com.cn/pdfserv/en/an/3hfan302.pdf, March 2003.
[69] V. Stojanovic and V. Oklobdzija, “Comparative Analysis of Master-Slave Latches and Flip-
Flops for High-Performance and Low-Power Systems,” IEEE J. Solid-State Circuits, vol. 34,
pp. 536 – 548, April 1999.
[70] J. Lee, K. S. Kundert, and B. Razavi, “Modeling of Jitter in Bang-Bang Clock and Data
Recovery Circuits,” IEEE Custom Integrated Circuits Conference, pp. 711 – 714, September
2003.
[71] A. Bassi, A. Veggetti, L. Croce, and A. Bogliolo, “Measuring the Effects of Process Variations
on Circuit Performance by Means of Digitally-Controllable Ring Oscillators,” International
Conference on Microelectronic Test Structures, pp. 214 – 217, March 17-20 2003.
[72] J. Savoj and B. Razavi, “A 10-Gb/s CMOS Clock and Data Recovery Circuit With a Half-
Rate Linear Phase Detector,” IEEE J. Solid-State Circuits, vol. 36, pp. 761 – 767, May
2001.
[73] J. Cao, A. Momtaz, K. Vakilian, M. Green, D. Chung, J. Keh-Chee, M. Caresosa, X. W.
W.-G. Tan, Y. Cai, I. Fujimori, and A. Hairapetian, “OC-192 Transmitter and Receiver
in Standard 0.18µm CMOS,” IEEE J. of Solid-State Circuits, vol. 37, pp. 1768 – 1780,
December 2002.
[74] J. Maneatis, “Low-jitter Process-Independent DLL and PLL Based on Self-Biased Tech-
niques,” IEEE J. of Solid-State Circuits, vol. 31, pp. 1723 – 1732, November 1996.
[75] B. Analui, A. Rylyakov, S. Rylov, M. Meghelli, and A. Hajimiri, “A 10 Gb/s Eye-Opening
Monitor in 0.13µm CMOS,” IEEE J. Solid-State Circuits, vol. 40, pp. 2689 – 2699, December
2005.
[76] Y. Ohtomo, T. Kawamura, K. Nishimura, M. Nogawa, H. Koizumi, and M. Togashi, “A
12.5Gb/s CMOS BER Test Using a Jitter-Tolerant Parallel CDR,” IEEE International Solid-
State Circuits Conference, pp. 174 – 175, February 2004.
[77] P. Andreani and S. Mattisson, “On the Use of MOS Varactors in RF VCO’s,” IEEE J.
Solid-State Circuits, vol. 35, pp. 905 – 910, June 2000.
[78] T. Lee and J. Bulzacchelli, “A 155 MHz Clock Recovery Delay and Phase-Locked Loop,”
IEEE J. Solid-State Circuits, vol. 27, pp. 1736 – 1746, December 1992.
