System-level design, simulation and measurement for high-speed data links by Yang, Jerry
© 2015 Jerry Yang
SYSTEM-LEVEL DESIGN, SIMULATION AND MEASUREMENT FOR
HIGH-SPEED DATA LINKS
BY
JERRY YANG
THESIS
Submitted in partial fulfillment of the requirements
for the degree of Master of Science in Electrical and Computer Engineering
in the Graduate College of the
University of Illinois at Urbana-Champaign, 2015
Urbana, Illinois
Adviser:
Professor José Schutt-Ainé
ABSTRACT
The era of the internet-of-things (IoT) is expanding the use of mobile
and cloud computing to a global scale. The resulting volume of data transport
places a heavy burden on the design of low-cost, low-power, low-error-rate
high-speed data links. This thesis provides a system-level overview of the design,
simulation, and measurement of high-speed digital applications in the con-
text of signal integrity. Examples are provided to demonstrate the design
approach and trade-offs made to arrive at the results. Modeling and simu-
lation methodologies for high-speed interconnect are discussed and studied,
using both conformal mapping and the variational method in closed-form
solutions, with examples provided to study the frequency-dependent channel
effects in high-speed digital systems. Detailed processes along with examples
are presented at the end to illustrate some real-world issues many engineers
will face when characterizing and measuring high-speed data links.
To my family and friends, for their love and support.
ACKNOWLEDGMENTS
First and foremost, I would like to express my sincere gratitude to my adviser,
Professor José E. Schutt-Ainé, for providing me with the opportunity to work
on this interesting topic and research in the field of high-speed link modeling,
design and signal integrity analysis. I thank him for all his attention, encour-
agement, guidance and support during every stage of my research. Working
under his supervision has been one of the most enriching experiences of my
life.
I would like to thank my fellow graduate students in Professor
Schutt-Ainé’s research group, Rishi Ratan, Da Wei, Jin Lei and Yubo Liu, for their
immense help and support during the early stage of this project. I would like
to acknowledge the roles of Xinying Wang and Kedi Zhang, who partnered
with me during the final project of the course Advanced Signal Integrity, as
well as my undergraduate mentees Haodong Guo and Zexian Li, for their
excellent work. Many thanks to my research group colleagues and friends,
Thomas Comberiate, Xu Chen, Maryam Hajimir, Thong Nguyen, Drew Han-
dler, Colin Madigan, Drew Newell, Sabareeshkumar Ravikumar, Ankit Jain,
Rushabh Mehta, Ishita Bisht, Yi Ren, Si Win and Karan Bhagat, for their
swift assistance and many valuable discussions during the course of my re-
search.
TABLE OF CONTENTS
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Outline
CHAPTER 2 DESIGN OF FUNDAMENTAL BUILDING BLOCKS IN HIGH-SPEED SERDES
2.1 Serializer and Deserializer
2.2 Transmitter and Receiver
2.3 Channel Loss and Equalization
2.4 PLL-based Clock and Data Recovery
2.5 Link Verification
CHAPTER 3 MODELING AND SIMULATION OF HIGH-SPEED INTERCONNECT
3.1 Overview
3.2 Interconnect Modeling Using Numerical Methods with Field Solvers
3.3 Interconnect Modeling Using Conformal Mapping and Variational Method in Closed-Form Analytical Method
3.4 Example: High-Speed Double Data Rate (DDR) Memory Interconnect
CHAPTER 4 EYE DIAGRAM, JITTER AND NOISE MEASUREMENT OF HIGH-SPEED DATA LINKS
4.1 Measurement Overview
4.2 Probe Calibration
4.3 Measurement Setting and Bandwidth Considerations
4.4 Jitter and Noise Measurement
4.5 Eye Diagram Measurement and Mask Test
4.6 Embedding and De-embedding
CHAPTER 5 SUMMARY AND FUTURE WORK
5.1 Conclusion
5.2 Future Work
REFERENCES
LIST OF FIGURES
1.1 Estimated total volume of data generated by year
1.2 Parallel communication
1.3 Serial communication
1.4 Per-pin data-rate vs. year for a variety of common I/O standards
1.5 Generic function of a SerDes
1.6 Overview of the SerDes core
1.7 A typical controller and memory interface, forming a channel; consider the controller as transmitter and memory as receiver
2.1 High-level overview of the serial link
2.2 A 2:1 serializer circuit
2.3 A 1:2 deserializer circuit
2.4 A positive MUX-based D-latch using transmission gates
2.5 A true single-phase clocked latch
2.6 A true single-phase clocked latch with split output
2.7 A transmission-gate based multiplexer
2.8 A current-mode logic multiplexer
2.9 Simulation result of transmission gate multiplexer
2.10 Simulation result of CML multiplexer
2.11 Binary-tree design serializer
2.12 Simulation result of 8 to 1 serializer with transmission gate mux
2.13 Simulation result of 8 to 1 Serializer with CML mux
2.14 Binary-tree design deserializer
2.15 Test bench of serializer and deserializer (without channel, equalizer and clock-recovery unit)
2.16 Simulation result of 8 to 1 serializer with CML MUX
2.17 Parallel termination typically used for high impedance current-mode driver
2.18 Current-mode driver for differential signaling
2.19 Series termination typically used for low impedance voltage-mode driver
2.20 Voltage-mode driver for single-ended signaling
2.21 Comparison of signaling techniques
2.22 Output driver summary
2.23 Test bench for output impedance sweep of the voltage-mode driver
2.24 DC Analysis for output impedance sweep
2.25 Current vs. width for NMOS
2.26 Current vs. width for PMOS
2.27 High swing voltage-mode driver
2.28 Test bench to simulate eye diagram at driver output before channel
2.29 Eye diagram at driver output before channel
2.30 Eye diagram showing time and voltage offset and resolution
2.31 Illustration of receiver circuit
2.32 Single-ended S11, S12, S13 and S14 of a 4-port backplane channel
2.33 Frequency response of an ideal channel (solid blue), physical channel (dashed red) and equalizer (dot-dashed purple)
2.34 Block diagram of transmitter side FIR equalizer
2.35 Implementation of transmitter side FIR equalizer
2.36 Circuit-level realization of a 3-tap Tx FIR equalizer
2.37 Single-bit response of the channel at 5 Gbps, with maximum swing of 0.6 V per symbol (or 1 Vpp differentially)
2.38 Differential eye diagram at channel output driven by CML output driver, with equalization
2.39 Differential eye diagram at channel output driven by CML output driver, without equalization
2.40 Block diagram of receiver-side FIR equalizer
2.41 Illustration of noise enhancement by receiver-side equalization
2.42 Block diagram of the CDR
2.43 Implementation of the charge-pump PLL-based CDR
2.44 Circuit schematic of SR latch
2.45 Top: Circuit schematic of Hogge PD. Bottom: Operation data stream and clock when the data and clock get aligned.
2.46 Hogge PD up (second bottom) and down (bottom) output when data (top first) and clock (top second) are aligned
2.47 Schematic of a current steering charge pump
2.48 Simulation result of the current steering charge pump
2.49 Schematic of a loop filter
2.50 Schematic of the CDR testbench
2.51 Simulation result of the CDR for 10 µs simulation
2.52 Illustration of channel design
2.53 Configuration of the bonding wires in Ansys HFSS
2.54 HFSS simulated S-parameters of the bonding wires
2.55 Configuration of the bonding wires in Ansys HFSS
2.56 HFSS simulated S-parameters of the package vias
2.57 Configuration of the stripline in Ansys Q3D
2.58 The ADS schematic circuit for analyzing the total loss of the signal trace
2.59 Simulated total loss of the signal trace in Agilent ADS
2.60 Virtuoso testbench for link verification
2.61 Eye diagram measured at Tx driver with channel effect
2.62 Eye diagram measured after Receiver
2.63 Eye diagram measured after CDR
2.64 Transient simulation of recovered data at receiver end
2.65 Eye diagram of the HSSL with PRBS-31 sequence
2.66 Total jitter (TJ) histogram of the HSSL with PRBS-31 sequence
2.67 Data-dependent jitter (DDJ) histogram of the HSSL with PRBS-31 sequence
2.68 Random jitter (RJ) histogram of the HSSL with PRBS-31 sequence
2.69 DDJ vs. Bit Number of the HSSL with PRBS-31 sequence
2.70 Bathtub curve and bit-error rate (BER)
3.1 Interconnect modeling
3.2 HFSS modeling of a microstrip line
3.3 Simulated S-parameters of a microstrip line
3.4 HFSS modeling of a microstrip line with discontinuities
3.5 Simulated S-parameters of a microstrip line with discontinuities
3.6 Cross-sectional view of the multi-conductor stripline models with 1 conductor at two different thicknesses
3.7 Cross-sectional view of the multi-conductor stripline models with 2 conductors at two different thicknesses
3.8 Cross-sectional view of the multi-conductor stripline models with 3 conductors at two different thicknesses
3.9 Cross-sectional view of the multi-conductor stripline models with 4 conductors at two different thicknesses
3.10 N-layer dielectric with side walls and a point source
3.11 Three boundary conditions. Dashed lines represent magnetic walls, thick solid lines represent electric wall, and thin solid lines represent the dielectric interfaces
3.12 Configuration of the interconnect with a bottom ground plane aperture
3.13 Comparison between the unified approach using variational method in closed-form analytical solution and EM field solver using numerical methods
3.14 Single line: characteristic impedance vs. dielectric height, using conformal mapping and variational method
3.15 Single line: characteristic impedance vs. conductor width, using conformal mapping and variational method
3.16 Coupled line: characteristic impedance vs. dielectric height, using conformal mapping and variational method
3.17 Coupled line: characteristic impedance vs. conductor widths, using conformal mapping and variational method
3.18 Coupled line: characteristic impedance vs. conductor spacing, using conformal mapping and variational method
3.19 DIMM
3.20 Clamshell double-side assembly
3.21 ADS schematic of the equivalent model
3.22 Comparison of simulated S-parameter between analytical methods and ADS
3.23 A motherboard with 4 slots for DIMM
3.24 ADS schematic of the equivalent model with 4 slots
3.25 Comparison of simulated S-parameter from Tx to Rx at slot 1 between analytical methods and ADS
3.26 Comparison of simulated S-parameter from Tx to Rx at slot 2 between analytical methods and ADS
3.27 Comparison of simulated S-parameter from Tx to Rx at slot 3 between analytical methods and ADS
3.28 Comparison of simulated S-parameter from Tx to Rx at slot 4 between analytical methods and ADS
3.29 Single-bit pulse
3.30 Single-bit response at slot 1
3.31 Single-bit response at slot 2
3.32 Single-bit response at slot 3
3.33 Single-bit response at slot 4
3.34 Clock pattern
3.35 Clock pattern response at slot 1
3.36 Clock pattern response at slot 2
3.37 Clock pattern response at slot 3
3.38 Clock pattern response at slot 4
4.1 N5443A performance verification and deskew fixture
4.2 Probe head tip leads: “+” to signal trace and “-” to ground
4.3 Channel configuration
4.4 Channel configuration
4.5 Channel configuration
4.6 Probe calibration
4.7 Sanity check of successful calibration - 1
4.8 Sanity check of successful calibration - 2
4.9 Proper position of resistors
4.10 Add measurement window
4.11 Measurement thresholds
4.12 Waveform without BW limitation
4.13 Waveform with 20 GHz BW limitation
4.14 Waveform with 12 GHz BW limitation
4.15 Waveform with 7 GHz BW limitation
4.16 Waveform with 3 GHz BW limitation
4.17 Jitter and noise measurement setup
4.18 Scope vertical scale setup
4.19 RJ/RN separation method settings
4.20 Measurement source type
4.21 Test pattern and BER level setup
4.22 Clock recovery setup
4.23 Voltage threshold setup
4.24 Acquisition setup
4.25 Random noise calibration
4.26 Serial data analysis window
4.27 Vertical scale and clock recovery method
4.28 PLL configuration
4.29 Receiver switching threshold setting
4.30 Real-time eye diagram
4.31 Real-time eye diagram
4.32 Eye diagram with eye mask test
4.33 Channel configuration window
4.34 InfiniiSim setup window
4.35 Application preset for InfiniiSim model setup
4.36 InfiniiSim setup window with block diagram of the serial link
4.37 InfiniiSim block setup
4.38 InfiniiSim sub-circuit block setup for S-parameter file
4.39 InfiniiSim sub-circuit block setup for probe load
4.40 Source and load impedances setting
4.41 Quick view for individual block setting
4.42 Process indicator showing progress
4.43 Example of an error message
4.44 Message showing successful computation of transfer function
4.45 Frequency response of the transfer function for embedding
4.46 Step response of the transfer function for embedding
4.47 Impulse response of the transfer function for embedding
4.48 Eye diagrams comparison after applying the embedding filter
4.49 Frequency response of the transfer function for deembedding
4.50 Eye diagrams comparison after applying the deembedding filter
CHAPTER 1
INTRODUCTION
1.1 Motivation
The information age is here. Big data, the internet-of-things (IoT) and digital
convergence all aim to provide smart solutions that solve real-world problems
more efficiently, but they also require transporting massive amounts of data.
As shown in Figure 1.1 [1], an estimated 5 zettabytes (5 ZB = 5 × 10^21 bytes
= 5 billion terabytes) of data were generated in 2014 alone, and trends
indicate that the volume of data will grow significantly every year, reaching
nearly 45 ZB by 2020.
Figure 1.1: Estimated total volume of data generated by year
Data transmission (also known as digital transmission or digital
communications) is the physical transfer of data (a digital bit stream) over a
point-to-point or point-to-multipoint communication channel. Figures 1.2 and
1.3 show the two basic methods of transmitting data between two chips, whether
on the same circuit board or on different boards: parallel data transfer and
serial data transfer.
Figure 1.2: Parallel communication
Figure 1.3: Serial communication
One way to make data transfer faster and more efficient is to increase the
data rate. As technology continues to advance, data rates are increasing
rapidly into the multi-gigabit-per-second range, as shown in Figure 1.4 [2].
Input/output (I/O) performance has become the bottleneck of overall system
performance, especially for modern high-speed applications.
Figure 1.4: Per-pin data-rate vs. year for a variety of common I/O
standards
Traditional parallel communication such as PCI and PCI-X, however, cannot
meet the requirements of high-speed data transmission between integrated
circuits (ICs). In parallel communication, the difference in arrival time of
simultaneously transmitted data is commonly referred to as skew. Because the
operating frequency of high-speed data links keeps increasing, the bit period
shrinks and the tolerance for skew between parallel signals is approaching its
practical limit; excessive skew causes critical problems such as phase
differences between signals that should be sampled together. In addition,
crosstalk, the interference between adjacent parallel data links, causes more
problems as data rates increase. Moreover, the number of circuits that can be
manufactured on a chip increases year by year, as predicted by Moore’s law,
so the extra pins required by parallel links lead to higher packaging costs.
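The skew argument above can be made concrete with a little arithmetic. The sketch below expresses a fixed lane-to-lane skew as a fraction of the unit interval (UI, one bit period) at several per-pin data rates. The 50 ps skew, the data rates and the function names are assumptions chosen for illustration; they are not taken from Figure 1.4 or from any I/O standard.

```python
# Skew budget as a fraction of the unit interval (UI).
# All numeric values here are hypothetical, illustrative figures.

ASSUMED_SKEW_PS = 50.0  # assumed lane-to-lane skew in picoseconds


def unit_interval_ps(data_rate_gbps: float) -> float:
    """Return one bit period (UI) in picoseconds for a data rate in Gb/s."""
    return 1000.0 / data_rate_gbps


def skew_fraction(skew_ps: float, data_rate_gbps: float) -> float:
    """Return the skew expressed as a fraction of one UI."""
    return skew_ps / unit_interval_ps(data_rate_gbps)


if __name__ == "__main__":
    for rate in (0.1, 1.0, 5.0, 10.0):  # assumed per-pin data rates in Gb/s
        ui = unit_interval_ps(rate)
        frac = skew_fraction(ASSUMED_SKEW_PS, rate)
        print(f"{rate:5.1f} Gb/s: UI = {ui:8.1f} ps, skew = {frac:.1%} of UI")
```

Under these assumed numbers, the same 50 ps of skew is negligible at 0.1 Gb/s (0.5% of a UI) but consumes half of the bit period at 10 Gb/s.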
Point-to-point serial data communication is one way to circumvent the
performance limitations of traditional parallel communication. Serial data
transfer requires fewer lines, which reduces board area. Serial I/O offers
higher speed, less interference between adjacent links, and lower pin counts
and thus lower packaging costs. A serializer/deserializer (SerDes) is a device
that takes parallel input data and condenses it into fewer lines of serial
stream, which are then deserialized and output as the recovered parallel data,
as shown in Figure 1.5. A SerDes is beneficial because it avoids many of the
problems of traditional parallel data links and reduces the number of I/O pins
and the cost of connectors and cables. Designing a robust, low-power SerDes
that functions properly at high speed is very challenging and requires
knowledge from several different areas.
Figure 1.5: Generic function of a SerDes
This thesis provides guidelines for high-speed serial link (HSSL) design,
modeling and characterization in the context of signal integrity. Fundamental
concepts and major components of a SerDes are covered, as well as the design
approach for the transmitter, channel, receiver and clock-data recovery (CDR)
using a charge-pump phase-locked loop (CPLL). Figure 1.6 [3] shows an overview
of the SerDes core, where the PLL slice ensures that the clock signals for the
transmitter and receiver slices have low jitter. The transmitter (TX) slice
performs parallel-to-serial conversion through a serializer circuit. The
serialized data is then fed to a feed-forward equalizer (FFE) to ensure that
the receiver input is a clean waveform. The receiver (RX) slice also requires
equalization after the serialized data has passed through the channel; a
decision-feedback equalizer (DFE) can improve the bit-error rate (BER). After
the signal is equalized, the serial stream is driven through the deserializer
to perform the serial-to-parallel conversion.
Figure 1.6: Overview of the SerDes core
The thesis explores signal integrity issues experienced by high-speed link
designers at both the system-architecture level and the circuit level. All
building blocks, including the transmitter, receiver and timing recovery
circuits, are implemented at the transistor level using Cadence Virtuoso. The
channel, which connects the chip through the package to the board, is shown in
Figure 1.7 [4]. The complete channel, which includes wire bonding, package
trace, package via, solder bump, PCB via and PCB trace, is designed in HFSS.
Scattering parameters are generated and analyzed in ADS.
Figure 1.7: A typical controller and memory interface, forming a channel;
consider the controller as transmitter and memory as receiver
1.2 Outline
• Chapter 1 provides the background and introduction to the research
problem, as well as the motivation for this thesis.
• Chapter 2 describes the design approach for the fundamental building
blocks used in high-speed SerDes, including the serializer, deserializer,
transmitter, receiver, channel, and PLL-based clock and data recovery.
• Chapter 3 discusses the techniques used in modeling and simulation
of high-speed interconnects, including closed-form analytical solutions
based on conformal mapping and the variational method.
• Chapter 4 provides detailed procedures for jitter, noise and eye diagram
measurement, as well as mask testing and embedding and de-embedding
techniques for high-speed data links.
• Chapter 5 concludes the thesis with a few possible future research di-
rections and opportunities.
CHAPTER 2
DESIGN OF FUNDAMENTAL BUILDING
BLOCKS IN HIGH-SPEED SERDES
Figure 2.1 shows a high-level overview of the serial link discussed in this
chapter.
Figure 2.1: High-level overview of the serial link
2.1 Serializer and Deserializer
The serializer performs the parallel-to-serial conversion while the deserializer
performs the serial-to-parallel conversion. Simplified schematics of a basic 2:1
serializer and 1:2 deserializer are presented in Figures 2.2 and 2.3.
Figure 2.2: A 2:1 serializer circuit
Figure 2.3: A 1:2 deserializer circuit
For the 2:1 serializer example, assume that the two bits of parallel data,
Deven and Dodd, are time-aligned into the serializer and are synchronized
to the half-rate C2 clock signal. The parallel Deven and Dodd signals are
captured by the first two D-latches, which create the De and Do outputs
on the rising edge of the C2 clock signal. The Do′ signal is generated by
resampling the Do signal on the falling edge of the C2 clock signal. The select
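input of the 2:1 multiplexer (MUX) is controlled by the C2 clock signal, so
that when the clock is low, the De input signal is selected, and when the clock
is high, Do′ is selected. Resampling Do on the falling edge ensures that Do′
changes only while the MUX is passing De, so neither MUX input changes while
it is selected.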
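The timing described above can be checked with a small behavioral model. The following is a minimal sketch, assuming ideal zero-delay latches and one serial bit per half period of C2; the function name `serialize_2to1` and the start-up value `init` are hypothetical and are not part of the circuit in Figure 2.2.

```python
def serialize_2to1(d_even, d_odd, init=0):
    """Idealized, zero-delay model of the 2:1 serializer.

    Emits one serial bit per half period of the half-rate clock C2.
    """
    de = do = do_prime = init
    serial = []
    for even_bit, odd_bit in zip(d_even, d_odd):
        # Rising edge of C2: the first two latches capture Deven and Dodd.
        de, do = even_bit, odd_bit
        # C2 high: the MUX selects Do', which still holds the previous odd bit.
        serial.append(do_prime)
        # Falling edge of C2: Do is resampled into Do'.
        do_prime = do
        # C2 low: the MUX selects De.
        serial.append(de)
    # One extra high phase flushes the last odd bit out of Do'.
    serial.append(do_prime)
    return serial


if __name__ == "__main__":
    d_even = [1, 0, 1]
    d_odd = [1, 1, 0]
    stream = serialize_2to1(d_even, d_odd)
    print(stream)  # the first bit is the start-up value of Do'
```

After the start-up bit, the stream alternates De, Do, De, Do, so each even bit is followed by the odd bit that was captured on the same rising edge.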
2.1.1 D-latch Design
A latch is an important component in the construction of several major blocks
in a high-speed SerDes, including the serializer, feed-forward equalizer, phase
detector and deserializer. A positive latch is a level-sensitive circuit that
passes the D input to the Q output when the clock signal is high; it is then
said to be in transparent mode. When the clock signal is low, the input data
sampled on the falling edge of the clock is held stable at the output for the
entire phase, and the latch is said to be in hold mode. Similarly, a negative
latch passes the D input to the Q output when the clock signal is low. A
register, in contrast to the level-sensitive latch, is an edge-triggered
component, and latches are the essential components from which an
edge-triggered register is built. A flip-flop generally refers to any bistable
component formed by the cross-coupling of gates, although some textbooks also
refer to an edge-triggered register as a flip-flop [5].
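The difference between a level-sensitive latch and an edge-triggered register can be illustrated with a short discrete-time model. This is a sketch under stated assumptions: the class names are hypothetical, delays are ignored, and the register is built as a master-slave pair (a negative latch followed by a positive latch), which is one common construction and is not necessarily the one used in this design.

```python
class Latch:
    """Level-sensitive latch: transparent when clk equals transparent_level."""

    def __init__(self, transparent_level: int, q: int = 0):
        self.transparent_level = transparent_level
        self.q = q

    def step(self, d: int, clk: int) -> int:
        if clk == self.transparent_level:
            self.q = d  # transparent mode: Q follows D
        return self.q   # hold mode: Q keeps its last value


class RisingEdgeRegister:
    """Master-slave register: negative latch feeding a positive latch."""

    def __init__(self):
        self.master = Latch(transparent_level=0)
        self.slave = Latch(transparent_level=1)

    def step(self, d: int, clk: int) -> int:
        return self.slave.step(self.master.step(d, clk), clk)


if __name__ == "__main__":
    clk = [0, 1, 1, 0, 1, 1]
    d = [1, 0, 1, 0, 1, 0]
    latch, reg = Latch(transparent_level=1), RisingEdgeRegister()
    print("latch   :", [latch.step(x, c) for x, c in zip(d, clk)])
    print("register:", [reg.step(x, c) for x, c in zip(d, clk)])
```

In this run the positive latch output follows D for as long as the clock is high, while the register output changes only at the two rising edges and takes the value D had just before each edge.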
Shown in Figure 2.4 is the transistor-level implementation of a positive
MUX-based D-latch built by using transmission gates.
Figure 2.4: A positive MUX-based D-latch using transmission gates
When the CLK signal is high, the bottom transmission gate is on, the latch is
transparent, and the input signal D is copied to Q. During this time, the top
transmission gate is off. When the CLK signal is low, the bottom transmission
gate is off while the top one is on, and the feedback path ensures that the
output is held for as long as the CLK signal is low.
The problem with this transmission-gate MUX-based D-latch is that it requires
both the CLK and CLK BAR signals; skew between the two can cause clock overlap
and eventually a race condition. A true single-phase clocked (TSPC) latch
overcomes the problem caused by clock overlap. Figure 2.5 shows the
transistor-level implementation of a TSPC latch.
Figure 2.5: A true single-phase clocked latch
For the positive TSPC latch shown in Figure 2.5, when the CLK is high, the
latch is in the transparent mode, and corresponds to two cascaded inverters.
When the CLK is low, on the other hand, both inverters are off and the latch
is in hold mode.
A slightly different configuration, a TSPC latch with split output, is used
in the final design of the serializer, as shown in Figure 2.6. Its advantage
is that fewer transistors are needed, and thus less power is consumed by the
overall system. Its smaller propagation delay also results in higher speed for
the entire circuit. Again, no inverted clock signal is needed in this design,
so the circuit is free of skew between the clock and its complement.
Figure 2.6: A true single-phase clocked latch with split output
2.1.2 Multiplexer Design
The transistor-level schematic of a transmission-gate multiplexer is shown in
Figure 2.7. The idea behind this circuit is to use two transmission gates as
simple switches that propagate either input A or input B directly to the
output. An extra inverter is needed to generate the inverted select signal
S bar. Because of the wiring of their control inputs, the upper transmission
gate is activated by S while the lower transmission gate is activated by
S bar. When S is low, only the lower transmission gate conducts (because
S bar is connected to its n-channel and S to its p-channel transistor gate
input), while the upper transmission gate is non-conducting. As a result, the
value of B is passed through to the output of the multiplexer. When S is high,
the upper transmission gate is activated while the lower transmission gate is
non-conducting, so the value of A is passed through to the multiplexer
output. This operation is equivalent to the following Boolean function:
$$Q = A \cdot S + B \cdot \overline{S} \qquad (2.1)$$
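Equation (2.1) can be checked against a switch-level view of the circuit. The sketch below treats each transmission gate as an ideal switch that conducts when its n-channel gate is high and its p-channel gate is low, and compares the result with the Boolean function for all input combinations. The function names are hypothetical, and electrical effects such as on-resistance and charge sharing are ignored.

```python
from itertools import product


def transmission_gate(x, n_gate, p_gate):
    """Ideal switch: pass x when on (n_gate=1, p_gate=0), else None (high-Z)."""
    return x if (n_gate == 1 and p_gate == 0) else None


def tg_mux(a, b, s):
    """Two transmission gates sharing one output node, as described above."""
    s_bar = 1 - s  # output of the extra inverter
    upper = transmission_gate(a, n_gate=s, p_gate=s_bar)
    lower = transmission_gate(b, n_gate=s_bar, p_gate=s)
    # Exactly one gate conducts for any value of S.
    return upper if upper is not None else lower


def boolean_mux(a, b, s):
    """Equation (2.1): Q = A.S + B.S_bar."""
    return (a & s) | (b & (1 - s))


if __name__ == "__main__":
    for a, b, s in product((0, 1), repeat=3):
        print(f"A={a} B={b} S={s} -> Q={tg_mux(a, b, s)}")
```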
Figure 2.7: A transmission-gate based multiplexer
However, traditional transistor sizing methods and logical effort cannot
be applied to this transmission-gate multiplexer design, so it is hard to
determine theoretically the optimal transistor size for maximum speed. Also,
in order to lower the equivalent resistance Req, the transmission gate must
be made wide. Widening the gate, however, also increases its capacitance, so
the time constant of the transmission-gate multiplexer is not reduced. As a
result, a different design, the current-mode logic (CML) multiplexer shown in
Figure 2.8, is adopted. CML circuits are widely used in GHz-range high-speed
bipolar driver or multiplexer implementations.
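The sizing argument can be illustrated with a first-order RC estimate. In the sketch below the equivalent resistance is assumed to scale as 1/W and the gate's own capacitance as W, with W the width relative to a unit device, so their product does not depend on W. The unit values `R_UNIT` and `C_UNIT` are hypothetical and do not correspond to any particular process.

```python
import math

R_UNIT = 10e3   # assumed on-resistance of a unit-width gate, in ohms
C_UNIT = 1e-15  # assumed capacitance of a unit-width gate, in farads


def time_constant(width: float) -> float:
    """First-order time constant of a transmission gate of relative width W."""
    r_eq = R_UNIT / width    # a wider gate has a lower equivalent resistance
    c_self = C_UNIT * width  # a wider gate has a higher capacitance
    return r_eq * c_self


if __name__ == "__main__":
    for w in (1, 2, 4, 8):
        print(f"W = {w}: tau = {time_constant(w) * 1e12:.2f} ps")
    same = all(math.isclose(time_constant(w), R_UNIT * C_UNIT) for w in (1, 2, 4, 8))
    print("time constant independent of W:", same)
```

Within this simplified model every width gives the same time constant, which is the reason the text gives for widening not improving the speed.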
The differential select signals, S and S bar, select which of the two data
inputs, A or B, is connected to the output. When the select signal S is high
(and S bar is low), A directly affects the output while B is disconnected.
When the select signal S goes low (and S bar goes high), B is connected to
the output. Thus both levels of the clock are used to multiplex the data.
The advantage of this CML circuit is that it has a higher operating speed,
with constant power consumption independent of operating frequency.
Figure 2.8: A current-mode logic multiplexer
2.1.3 Simulation
Simulation results of the transmission gate multiplexer and CML multiplexer
are given in Figure 2.9 and Figure 2.10.
Figure 2.9: Simulation result of transmission gate multiplexer
Figure 2.10: Simulation result of CML multiplexer
The input to the serializer is an n-bit datapath, which is then serialized to
a one-bit serial data signal for application to the feed-forward equalizer and
driver stage. The value of n is generally a multiple of 8 or 10, and may be
programmable on some implementations. Values of n which are multiples of
8 are useful for sending unencoded and/or scrambled data bytes; values of n
which are multiples of 10 are useful for protocols which use 8B/10B coding.
The 8B/10B encoder is usually implemented by logic outside the SerDes core
[3]. The proposed 8-to-1 binary-tree design serializer is shown in Figure 2.11,
with the functionality test simulation shown in Figure 2.12 and Figure 2.13.
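The binary-tree idea can be sketched behaviorally in Python. This is a simplified model under stated assumptions: the select signals are derived from an ideal counter (the names s1, s2 and s3 are hypothetical), the pairing of inputs at each stage is one possible wiring, and the retiming latches a real serializer needs are omitted.

```python
def mux2(a: int, b: int, s: int) -> int:
    """2:1 multiplexer: returns a when s is 1, b when s is 0."""
    return a if s else b


def serialize8(word):
    """Serialize an 8-bit word with three stages of 2:1 multiplexers."""
    assert len(word) == 8
    out = []
    for t in range(8):            # one iteration per output bit period
        s1 = (t >> 2) & 1         # slowest select, first stage
        s2 = (t >> 1) & 1         # second-stage select
        s3 = t & 1                # fastest select, final stage
        # Stage 1: four muxes, each pairing bit i with bit i + 4.
        st1 = [mux2(word[i + 4], word[i], s1) for i in range(4)]
        # Stage 2: two muxes, each pairing stream i with stream i + 2.
        st2 = [mux2(st1[i + 2], st1[i], s2) for i in range(2)]
        # Stage 3: final mux alternates between the two remaining streams.
        out.append(mux2(st2[1], st2[0], s3))
    return out


if __name__ == "__main__":
    print(serialize8([1, 0, 1, 1, 0, 0, 1, 0]))
```

With this wiring the bits leave the final multiplexer in the same order as they appear in the input word, which is the property a functional test of the serializer would check.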
Figure 2.11: Binary-tree design serializer
Figure 2.12: Simulation result of 8 to 1 serializer with transmission gate
mux
Figure 2.13: Simulation result of 8 to 1 Serializer with CML mux
The actual datapath fed into the equalizer may be more than one bit wide,
which results in more complex circuitry. In general, the n-bit data input
is serialized into a k-bit datapath, where n > k > 0. The k-bit data is fed
into the equalizer and further serialized at the driver stage if needed.
The deserializer block at the receiver slice, as shown in Figure 2.14, per-
forms the inverse function of the serializer at the transmitter slice. The serial
data, after the clock-data recovery and decision feedback equalization block,
is then deserialized back to an n-bit databus.
Figure 2.14: Binary-tree design deserializer
The complete serializer and deserializer functional test simulation, without
the channel, equalization and CDR circuits, has been performed and is shown
in Figure 2.15 and Figure 2.16.
Figure 2.15: Test bench of serializer and deserializer (without channel,
equalizer and clock-recovery unit)
Figure 2.16: Simulation result of 8 to 1 serializer with CML MUX
2.2 Transmitter and Receiver
In high-speed serial link designs, transmitters are used to pass a data
stream through transmission lines. If the impedance of the driver does not
match the characteristic impedance of the channel, the driver is unable to
deliver maximum power to the channel because of reflection at the
transmitter side. If the characteristic impedance of the channel does not
match the impedance of the terminal, the channel is unable to deliver
maximum power to the terminal because of reflection at the receiver side.
If there is mismatch on both sides, some of the energy reflected from the
terminal will be reflected again at the transmitter side. This energy takes
some time (∆T) to complete the round trip and suffers some loss along the
way. When it returns, it adds to the signal that is sent ∆T later.
Therefore, the output impedance of the transmitter should match the
characteristic impedance of the transmission line in order to minimize
reflections and preserve signal integrity, especially in high data rate
applications. Two typical types of transmitter driver (Tx driver) are
discussed in the following two subsections: current-mode (CM) drivers and
voltage-mode (VM) drivers.
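The effect of mismatch at both ends can be illustrated with the standard reflection coefficient Γ = (ZL − Z0)/(ZL + Z0). The sketch below is only an illustration: the impedance values are hypothetical, the line is assumed lossless, and the function names are assumptions introduced here.

```python
import math


def reflection_coefficient(z_load: float, z0: float) -> float:
    """Voltage reflection coefficient of a load z_load on a line of impedance z0."""
    return (z_load - z0) / (z_load + z0)


def reflection_db(gamma: float) -> float:
    """Magnitude of a reflection coefficient in dB (-inf for a perfect match)."""
    if gamma == 0:
        return -math.inf
    return 20.0 * math.log10(abs(gamma))


if __name__ == "__main__":
    # Hypothetical values, chosen only to show a mismatch at both ends.
    z0 = 50.0          # characteristic impedance of the channel (ohm)
    z_driver = 40.0    # driver output impedance (ohm)
    z_terminal = 60.0  # receiver termination (ohm)

    gamma_rx = reflection_coefficient(z_terminal, z0)
    gamma_tx = reflection_coefficient(z_driver, z0)
    # Fraction of the incident wave that returns to the receiver after one
    # round trip, ignoring channel loss.
    round_trip = gamma_rx * gamma_tx

    print(f"Gamma at receiver:    {gamma_rx:+.4f} ({reflection_db(gamma_rx):.1f} dB)")
    print(f"Gamma at transmitter: {gamma_tx:+.4f} ({reflection_db(gamma_tx):.1f} dB)")
    print(f"Round-trip echo:      {round_trip:+.5f}")
```

If either end is matched, the corresponding coefficient is zero and the round-trip echo vanishes, which is why matching the driver to the line suppresses the delayed interference described above.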
2.2.1 Current-Mode Drivers
Current-mode drivers use Norton-equivalent parallel termination, as shown
in Figure 2.17, and they are widely used in high performance serial links.
The advantage of current-mode drivers is that it is easier to control output
impedance because of the Norton configuration. The termination resistor
RTT should be equal to the characteristic impedance, Zo, of the channel.
The disadvantage, as will be shown later, is that the power consumption
can be as much as four times that of voltage-mode drivers. Figure 2.18
shows one possible design of a current-mode driver.
Figure 2.17: Parallel termination typically used for high impedance
current-mode driver
Figure 2.18: Current-mode driver for differential signaling
2.2.2 Voltage-Mode Drivers
In contrast, voltage-mode drivers use Thévenin-equivalent series termination,
as shown in Figure 2.19. They typically have very low output impedance and
hence are implemented using large transistors operated in the triode region.
Fundamentally, a voltage-mode driver acts as a switch connecting a signal
line to one of two voltages with very low impedance, as illustrated in Figure
2.20(a). In practice, the switch should have an output impedance matched
to the characteristic impedance of the transmission line. For low voltage
swing applications, both transistors can be NMOS, with the upper transistor
connected to a dedicated voltage reference (supply) for V1 as shown in Figure
2.20(c). For high voltage swing applications, as shown in Figure 2.20(b), the
transistors, especially the PMOS, must be made very large so that they have
an on-resistance of about Z0. The termination voltage VT can be any
convenient voltage; however, it is typically chosen to be midway between
V1 and V0.
Figure 2.19: Series termination typically used for low impedance
voltage-mode driver
Figure 2.20: Voltage-mode driver for single-ended signaling
2.2.3 Single-Ended vs. Differential Signaling
Differential signaling uses two complementary signals sent on two paired
transmission lines to transmit data. This signaling method requires twice as
many wires as single-ended signaling, but fewer return pins, and potentially
it has better noise immunity. However, uneven length in trace or difference
in signal speed may cause timing skew, which would greatly affect signal
integrity. Single-ended signaling is the simplest and most commonly used
method of transmitting electrical signals over wires. This signaling method
uses only one wire to carry the signal, while the other wire is connected to
a reference voltage, usually ground. Pure single-ended signaling has noise
problems such as ground offset, but is cost-effective because fewer wires are
needed to transmit multiple signals. In this design project, single-ended
signaling is employed as it does not suffer from the timing skew problem.
Figure 2.21 summarizes the above-mentioned signaling techniques.
Figure 2.21: Comparison of signaling techniques
2.2.4 Current-Mode Driver vs. Voltage-Mode Driver
The two different output drivers [6] discussed earlier are compared in Figure
2.22.
Figure 2.22: Output driver summary
For this design example, the voltage-mode driver is utilized since the driver
delay is not a very big concern at the target speed 2 Gbps. It also has the
advantage of lower power consumption, as well as easy configuration. The
biggest challenge of the voltage-mode driver, as mentioned above, is the
output impedance control. For the high swing voltage-mode driver design,
the output impedance is controlled by adjusting the transistor sizing, as
shown in the test bench in Figure 2.23.
Figure 2.23: Test bench for output impedance sweep of the voltage-mode
driver
A DC analysis was performed to sweep the variable W, the width of the
transistor, as shown in Figure 2.24. The resulting plots of current versus
width are shown in Figures 2.25 and 2.26. In this example, the target output
impedance is 75 Ω, and therefore the current flowing through the transistor
should be
$$I_D = \frac{V_{DD}}{2 Z_o} = 12\ \mathrm{mA} \qquad (2.2)$$
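One way to read Equation 2.2, assuming that the on-resistance of the driver and the termination each equal Z_o so that the supply sees 2Z_o in series, is as a simple voltage-divider current. The supply voltage is not stated in this passage; the value below is only inferred from the quoted 12 mA and 75 Ω and should be treated as an assumption.

```latex
I_D = \frac{V_{DD}}{Z_o + Z_o} = \frac{V_{DD}}{2 Z_o}
\quad\Longrightarrow\quad
V_{DD} = 2 Z_o I_D = 2 \times 75\,\Omega \times 12\,\mathrm{mA} = 1.8\,\mathrm{V}
```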
Figure 2.24: DC Analysis for output impedance sweep
Figure 2.25: Current vs. width for NMOS
Figure 2.26: Current vs. width for PMOS
As shown in Figure 2.25 and Figure 2.26, the transistor widths of Wn =
22.47 µm and Wp = 50 µm correspond to a transistor current of 12 mA,
which gives the 75 Ω output impedance discussed above.
The final design of the voltage-mode driver is shown in Figure 2.27, with
the transistor sized such that output impedance is matched to the charac-
teristic impedance 75 Ω.
Figure 2.27: High swing voltage-mode driver
To check the performance of the transmitter output driver, the eye diagram
is simulated right before the channel, as shown in Figure 2.28. The
resulting eye diagram is shown in Figure 2.29.
Figure 2.28: Test bench to simulate eye diagram at driver output before
channel
Figure 2.29: Eye diagram at driver output before channel
2.2.5 Receiver Circuit
A receiver detects an electrical quantity, current or voltage, to recover a
symbol from a transmission medium. In order to recover the data stream
transmitted by transmitters, receiver circuits are needed to properly match
with the channel and ensure signal integrity. The receiver performance is
measured in both time and voltage domains. The sensitivity indicates the
minimum voltage the receiver can measure. The receiver voltage offset is
caused by the device mismatch and circuit structure. The aperture time,
which limits the maximum data rate of the link system, is defined as the
shortest pulse width the receiver can detect. The timing offset becomes the
timing skew and jitter between the receiver and CDR. These four parameters
are illustrated in Figure 2.30 [7].
Figure 2.30: Eye diagram showing time and voltage offset and resolution
For single-ended designs, CMOS inverters are usually used for receiver pre-
amplifier structures. The termination resistor, RTT , should be placed near
the inverter trip-point, as is illustrated in Figure 2.31.
Figure 2.31: Illustration of receiver circuit
2.3 Channel Loss and Equalization
Figure 2.32: Single-ended S11, S12, S13 and S14 of a 4-port backplane
channel
Shown in Figure 2.32 are the S-parameters of a 4-port backplane channel
across the frequency range 0-40 GHz. The channel consists of the
daughtercard, connector, backplane, and then connector to the other
daughtercard. The S11 plot represents how much reflection exists in the
channel, measured in a system with a given reference impedance (usually
50 Ω). It is also known as the return loss, which characterizes the loss of
power in the signal reflected by a discontinuity in a transmission line.
When the system is perfectly matched (i.e., Γ = 0), dB(S11) is ideally -∞.
The closer dB(S11) is to -∞, the better the matching (less reflection). The
S11 of this 4-port backplane channel ranges between -15 dB and -55 dB, which
signifies a moderate amount of reflection, especially at around 5 GHz.
The S12 plot represents how much attenuation of the transmitted signal
exists in the channel, and is known as the insertion loss due to the fact
that it characterizes the loss of signal power resulting from the insertion of
a device in a transmission line. At low frequencies, the transmission line
looks like a short wire with little (if any) loss. As frequency increases,
the signal power transmitted through the channel decreases due to the skin
effect: at higher frequencies the skin depth becomes smaller, the effective
resistance of the conductor increases, and the power loss therefore grows.
The skin effect is caused by opposing eddy currents induced by the changing
magnetic field that results from the alternating current. The S12 of this
4-port backplane channel starts at 0 dB at DC (as expected from the
discussion above) and then falls drastically, dropping below -100 dB around
39 GHz.
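The frequency dependence of the skin effect can be made concrete with the standard skin-depth formula δ = sqrt(ρ / (π f μ)). The sketch below is illustrative only: it assumes a copper conductor with a textbook resistivity of 1.68e-8 Ω·m, which is not a value taken from the channel described here.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)


def skin_depth(freq_hz: float, resistivity: float = 1.68e-8, mu_r: float = 1.0) -> float:
    """Skin depth in meters; the default resistivity is an assumed value for copper."""
    return math.sqrt(resistivity / (math.pi * freq_hz * mu_r * MU0))


if __name__ == "__main__":
    for f in (1e8, 1e9, 1e10, 4e10):
        print(f"f = {f:8.1e} Hz  skin depth = {skin_depth(f) * 1e6:.3f} um")
```

Because the skin depth shrinks as 1/sqrt(f), the current is confined to a thinner layer and the conductor resistance rises roughly as sqrt(f), which is consistent with the falling S12 described above.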
The S13 plot represents the near-end crosstalk (NEXT) while the S14
plot represents the far-end crosstalk (FEXT). A moderate amount of NEXT,
between -19 dB and -58 dB, is observed across the entire frequency range.
FEXT, on the other hand, varies much more widely, falling from -27.44 dB at
3.73 GHz to -129.3 dB at 39.59 GHz.
In the ideal situation, the signal sent from the transmitter should propa-
gate through the wire without any loss of the frequency component. However
in reality there are many factors, such as the physical dimensions and ma-
terial of the electrical transmitting medium, which could limit the signaling
bandwidth. A simple case is shown in Figure 2.33, where an ideal channel
(in solid blue) should have 0 dB loss, and a physical channel (in dashed red)
will act like a low-pass filter with some loss. As frequency increases, the
loss of the channel increases. An equalizer (in dot-dashed purple) should
have a frequency response that undoes the channel effect, compensates
for any unwanted channel loss, and extends the channel's maximum operating
bandwidth. The equalized channel then behaves like an ideal cable with 0 dB
loss (in solid blue).
Figure 2.33: Frequency response of an ideal channel (solid blue), physical
channel (dashed red) and equalizer (dot-dashed purple)
Equalization can be realized both at the transmitter side (before the chan-
nel) and at the receiver side (after the channel). Typically in many high-speed
serial links, transmitter-side equalization is the most common and favorable
technique. A feed-forward equalizer (FFE) on the transmitter side can be
achieved by a finite impulse response (FIR) filter that pre-distorts the trans-
mitted data over several bit periods in order to invert the channel loss and
distortion. An FFE is normally implemented as a low-frequency de-emphasis
process to reduce the low-frequency signal envelope in proportion to the at-
tenuation experienced by the high-frequency pattern in the channel. The
low-frequency components get de-emphasized in order to flatten the channel
response. This equalization process usually comes at the cost of attenuated
signal at the transmitter output driver, and as a result this type of equaliza-
tion is also known as de-emphasis or pre-emphasis. As shown in Figure 2.34,
the FIR equalizer can be implemented using unit delay elements and cur-
rent steering digital-to-analog converter (DAC) circuits. The input data Din
propagates through the delay elements with some delay value T (also known
as the tap spacing) which is equal to 1 bit period in this implementation. At
each stage the input is multiplied by the tap coefficient, Ci.
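The delay-and-multiply structure described above can be sketched as a short FIR filter in Python. The tap values below are hypothetical and chosen only for illustration; the symbols are mapped to ±1, and taps[0] is taken to be the pre-cursor tap applied to the newest symbol, which is an assumption about tap ordering.

```python
def ffe_output(bits, taps):
    """Apply a bit-spaced FIR filter to a bit sequence mapped to +/-1 symbols.

    taps[k] multiplies the symbol that entered the delay line k bit periods ago.
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:          # delay line is empty before the first symbol
                acc += c * symbols[n - k]
        out.append(acc)
    return out


# Hypothetical coefficients (pre-cursor, main cursor, post-cursor).
TAPS = [-0.1, 0.7, -0.2]

if __name__ == "__main__":
    bits = [0, 0, 0, 1, 1, 1]
    for b, v in zip(bits, ffe_output(bits, TAPS)):
        print(f"bit {b} -> driver level {v:+.2f}")
```

In this example the level right after the 0-to-1 transition is larger than the level reached during the run of ones, which is the de-emphasis of low-frequency content that the text describes.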
Figure 2.34: Block diagram of transmitter side FIR equalizer
The taps can be implemented by some trans-conductance elements, whose
Gm are set by the tap coefficients, and the unit delay elements can be just flip-
flops as shown in Figure 2.35. The advantage of this configuration is that
a high-speed DAC on the transmitter side is relatively easy to implement
compared to a high-speed analog-to-digital converter (ADC) on the receiver.
A Tx FFE can also cancel both pre-cursor and post-cursor inter-symbol
interference (ISI), and noise is not amplified because of the digital nature
of the Tx FFE. The disadvantage of the Tx FFE is that it flattens the channel
response by attenuating low-frequency content, a consequence of the peak
power constraint.
Figure 2.35: Implementation of transmitter side FIR equalizer
Shown in Figure 2.36 is a current-mode-logic (CML) based 3-tap FIR
equalizer. The termination resistors RTT are set to be the same as the char-
acteristic impedance Zo of the channel. Tap coefficients are optimized using
peak distortion analysis based on the pulse response (also known as the
single-bit response) of the channel, shown in Figure 2.37.
Figure 2.36: Circuit-level realization of a 3-tap Tx FIR equalizer
Figure 2.37: Single-bit response of the channel at 5 Gbps, with maximum
swing of 0.6 V per symbol (or 1 Vpp differentially)
The output common mode voltage can be expressed as the following:
$$V_{ocm} = V_{TT} - \frac{I_B R_{TT}}{2}\left(|C_{-1}| + |C_0| + |C_1|\right) \qquad (2.3)$$
where |C−1| + |C0| + |C1| needs to be equal to 1. This is also sometimes
referred to as the peak swing (or peak power) constraint:
$$\sum_i |C_i| = 1 \qquad (2.4)$$
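Substituting the constraint of Equation 2.4 into Equation 2.3 gives a short worked step. It is only a direct consequence of the two equations as stated and adds no further assumptions about the circuit.

```latex
V_{ocm} = V_{TT} - \frac{I_B R_{TT}}{2}\sum_i |C_i|
        = V_{TT} - \frac{I_B R_{TT}}{2}
```

Under the constraint, the output common-mode voltage therefore does not depend on how the swing is distributed among the taps.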
Shown in Figures 2.38 and 2.39 are the eye diagrams at the channel output
driven by the CML output driver with and without the FIR equalizer enabled.
The vertical and horizontal eye openings have increased by 5 mV and 6 ps
respectively at 5 Gbps.
Figure 2.38: Differential eye diagram at channel output driven by CML
output driver, with equalization
Figure 2.39: Differential eye diagram at channel output driven by CML
output driver, without equalization
Although equalization is often implemented on the transmitter side, it can
also be realized on the receiver side. The FIR equalizer can be built on the
receiver side with delay elements and a high-speed analog-to-digital
converter, with the addition of a sample-and-hold circuit at the front of
the equalizer, as shown in Figure 2.40. Instead of a digital binary data
pattern, the input to the receiver equalizer is an analog voltage waveform.
Therefore the delay elements on the receiver side need to be implemented in
an analog manner, which is the major circuit design challenge. Moreover,
since the received signal contains the channel response information, the tap
coefficients can be tuned adaptively to the channel; however, noise and
crosstalk can also be unintentionally amplified along with the incoming
signal by the Rx FIR equalizer, as illustrated in Figure 2.41.
Figure 2.40: Block diagram of receiver-side FIR equalizer
Figure 2.41: Illustration of noise enhancement by receiver-side equalization
2.4 PLL-based Clock and Data Recovery
In modern data transmission systems, binary data is the most common for-
mat for data transmission. The random data bits received at the receiver end
are most likely distorted and noisy. A clock and data recovery circuit block
is usually needed for regeneration of data signals and associated clock pulses
from an input data stream. A CDR block would usually have the following
characteristics:
1. Clock frequency exactly equals the data rate of the input data.
2. The clock has appropriate timing with respect to the data, allowing
optimal sampling. Preferably, the clock sampling edge is locked in
the middle of the data eye.
3. The clock exhibits small jitter since the jitter of the clock contributes
to the retimed data jitter.
Figure 2.42 illustrates the CDR circuit blocks. Typically, the CDR com-
prises four circuit components:
• Phase detector (PD)
• Charge pump (CP)
• Low-pass filter (LPF)
• Voltage-controlled oscillator (VCO).
Figure 2.42: Block diagram of the CDR
The PD compares the phase of the input data stream with that of the VCO
feedback clock signal, as shown in Figure 2.43. If the phase difference
exceeds the PD detection resolution, a voltage pulse is generated to drive
the next block, the CP, which charges or discharges the capacitors in the
following block. After the LPF removes the high-frequency noise components,
a constant control voltage locks the VCO so that it generates clean and
stable clock pulses whose phase and frequency align with the input data
stream. The generated clock is fed back to the PD. Finally, the recovered
clock is fed to the data recovery latch to recover the data bits.
Figure 2.43: Implementation of the charge-pump PLL-based CDR
The FF latch is a necessary component in most digital logic circuit blocks.
A fast low-delay FF latch would greatly improve the circuit block perfor-
mance including delay, noise, jitter, etc. In CDR, latches are used in PD
and in data recovery block. Therefore, a high-performance latch is crucial
to a CDR that can fulfill project specifications. A widely used scheme for
the latch is the master-slave latch combination consisting of two cascaded
latches. However, this topology encounters clock phase aligning issues. A
flip-flop latch usually consists of two blocks: a pulse generator and a slave
latch. This topology seemingly is similar to the master-slave topology; how-
ever, the pulse generator stage is a function of the clock and data signals,
which will automatically align with the slave stage. Figure 2.44 gives the
schematic of a sense-amplifier-based SR FF latch. The pulse-generating
stages are M3, M4 and M5, M6. The latch senses the complementary
differential inputs from M1 and M2. While the clock is on, any input change
will not affect the SR output. After the clock returns to zero, both SR
outputs remain at logic one.
Figure 2.44: Circuit schematic of SR latch
2.4.1 Phase Detector
With the latch selected, the next step is to design the phase detector. Two
popular types of phase detector are available: the Hogge phase detector
(PD) and the bang-bang phase detector. The Hogge phase detector generates
a pulse whose width is linearly proportional to the phase difference. It
has a wider frequency acquisition range than the bang-bang phase detector.
Figure 2.45 shows the circuit schematic for the Hogge PD. It consists of
two D-latches and two XOR gates. The D-latch uses the sense-amplifier latch
discussed above. The first latch samples the data at the clock rising edge,
while the second samples the output of the first latch at the clock falling
edge. When the clock phase and data phase are not aligned, Y generates a
pulse whose width depends linearly on how early or late the clock is
relative to the data. The pulse generated by X is always half a clock cycle
wide. The bottom part of Figure 2.45 presents the data stream and clock
when they are aligned. We can see that the pulse widths generated by X and
Y are exactly the same, and the clock falling edge is aligned with the data
rising edge. Figure 2.46 shows the Hogge PD up and down outputs when data
and clock are aligned. As seen in the figure, when the clock rising edge is
aligned with the middle of the data bit, the up and down pulse widths are
identical, which confirms that the Hogge PD is working correctly.
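The behavior described above can be reproduced with a small discrete-time model. This is a behavioral sketch under stated assumptions, not the transistor-level circuit: time is quantized into samples_per_bit steps per bit, the latches are ideal, Y is taken as the XOR of the data and the first latch output, and X as the XOR of the two latch outputs. The function and parameter names are hypothetical.

```python
def hogge_pd(bits, samples_per_bit, clk_offset):
    """Return the total high time (in samples) of the Y and X outputs.

    clk_offset is the sample index within each bit at which the clock rises;
    the clock falls half a bit period later. samples_per_bit must be even.
    """
    n = samples_per_bit
    half = n // 2
    q1 = q2 = 0                     # outputs of the first and second latch
    y_width = x_width = 0
    for t in range(len(bits) * n):
        d = bits[t // n]            # current data value
        phase = (t - clk_offset) % n
        if phase == 0:              # clock rising edge: first latch samples data
            q1 = d
        if phase == half:           # clock falling edge: second latch samples q1
            q2 = q1
        y_width += d ^ q1           # variable-width pulse
        x_width += q1 ^ q2          # fixed half-cycle reference pulse
    return y_width, x_width


if __name__ == "__main__":
    data = [0, 1] * 8               # alternating pattern, one transition per bit
    for offset, label in ((2, "clock early"), (4, "aligned"), (6, "clock late")):
        y, x = hogge_pd(data, 8, offset)
        print(f"{label:12s}: Y = {y:3d} samples, X = {x:3d} samples")
```

When the rising edge sits at mid-bit the two totals are equal, and they diverge in opposite directions for an early or late clock, which is the linear phase information the charge pump then integrates.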
Figure 2.45: Top: Circuit schematic of Hogge PD. Bottom: Operation data
stream and clock when the data and clock get aligned.
Figure 2.46: Hogge PD up (second bottom) and down (bottom) output
when data (top first) and clock (top second) are aligned
2.4.2 Charge Pump
The charge pump is another important circuit block that needs to be well
designed. A current-steering charge pump topology was used in this work.
The function of the charge pump is to convert the voltage pulses generated
by the PD into current. Figure 2.47 shows the circuit topology of the
current-steering charge pump. There are two inputs, up and down, which turn
a switch transistor on and off so that charging and discharging occur when
the clock and data phases are not aligned. Figure 2.48 presents the
simulation of the designed charge pump. As seen from the figure, when
periodic pulses are applied alternately to up and down, the voltage at the
output of the CP exhibits a clean zigzag waveform, which indicates that the
charge pump block is functioning correctly.
Figure 2.47: Schematic of a current steering charge pump
Figure 2.48: Simulation result of the current steering charge pump
2.4.3 Loop Filter
The loop filter is a low-pass filter that removes the high-frequency
component of the control voltage pulse, as shown in Figure 2.49. In the
locked state, the CP generates a zigzag voltage waveform at its output.
Such a fluctuating voltage is not desirable for controlling a VCO, so the
loop filter smooths the control voltage by filtering out the high-frequency
component. The loop filter consists of two branches in parallel with each
other. One branch comprises a capacitor in series with a resistor; the
other comprises a single capacitor. The values of the capacitors and the
resistor are calculated from the CDR working conditions, such as working
frequency, phase margin, and pole positions. The detailed calculations are
presented below. The unity-gain bandwidth ωugb is between 1 Mrad/s and
3 Mrad/s, and ωugb = 1 Mrad/s was chosen in this work. The phase margin φM
is another parameter that needs to be set. φM cannot be too large or too
small. If φM is too large, for instance 100°, the locking time will be
long. However, if φM is too small, such as 20°, the feedback system will be
unstable. φM = 65° was selected in this work. With φM and ωugb selected,
the ratio of capacitors C0 and C1, Kc, can be calculated.
$$K_c = \frac{C_0}{C_1} = 2\left(\tan^2\phi_M + \tan\phi_M\sqrt{\tan^2\phi_M + 1}\right) \qquad (2.5)$$
The position of the zero can be found from the following equation:
$$\omega_z = \frac{\omega_{ugb}}{\sqrt{K_c + 1}} \qquad (2.6)$$
For low noise, the resistance R0 was chosen to be 10 kΩ. The capacitors
C0 and C1 can be calculated using the following equations:
$$C_0 = \frac{1}{\omega_z R_0} \qquad (2.7)$$
$$C_1 = \frac{C_0}{K_c} \qquad (2.8)$$
C0 and C1 were calculated to be 100 pF and 5 pF, respectively. The bias
current Icp of the charge pump is then determined from:
$$\omega_{p3} = \frac{1}{R_0\,\dfrac{C_0 C_1}{C_0 + C_1}} \qquad (2.9)$$
$$I_{cp} = \frac{2\pi C_1}{K_{vco}}\,\omega_{ugb}^2\,\sqrt{\frac{\omega_{p3}^2 + \omega_{ugb}^2}{\omega_z^2 + \omega_{ugb}^2}} \qquad (2.10)$$
With Kvco = 1 GHz/V, Icp was calculated to be 11.03 µA.
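The first steps of this calculation (Equations 2.5 to 2.8) can be scripted as follows. This is a minimal sketch that evaluates the equations as written for a given phase margin, unity-gain bandwidth and resistance; the function and variable names are assumptions, with c_series denoting the capacitor in series with R0 and c_shunt the stand-alone capacitor. The printed numbers depend on the inputs and are not claimed to reproduce the rounded component values quoted in the text.

```python
import math


def loop_filter(phase_margin_deg: float, w_ugb: float, r0: float):
    """Evaluate Equations 2.5 to 2.8 for a given phase margin and bandwidth.

    Returns (kc, w_z, c_series, c_shunt) in SI units.
    """
    t = math.tan(math.radians(phase_margin_deg))
    kc = 2.0 * (t * t + t * math.sqrt(t * t + 1.0))  # capacitor ratio (2.5)
    w_z = w_ugb / math.sqrt(kc + 1.0)                # zero frequency (2.6)
    c_series = 1.0 / (w_z * r0)                      # capacitor in series with R0 (2.7)
    c_shunt = c_series / kc                          # stand-alone capacitor (2.8)
    return kc, w_z, c_series, c_shunt


if __name__ == "__main__":
    kc, w_z, c_series, c_shunt = loop_filter(65.0, 1e6, 10e3)
    print(f"Kc       = {kc:.2f}")
    print(f"omega_z  = {w_z:.3e} rad/s")
    print(f"C series = {c_series * 1e12:.1f} pF")
    print(f"C shunt  = {c_shunt * 1e12:.1f} pF")
```

Sweeping phase_margin_deg in such a script shows how quickly the capacitor ratio grows with phase margin, which is one reason the phase margin cannot be chosen arbitrarily large.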
Figure 2.49: Schematic of a loop filter
2.4.4 Simulation
Figure 2.50 shows the schematic of the overall CDR simulation. A
pseudo-random bit sequence (PRBS) generator was used to mimic the data
stream coming from the transmission line or equalizer. The bit rate is set
to 2 Gbps. The Hogge PD takes the pseudo-random bits as input and generates
the control voltage for the VCO through the charge pump and loop filter.
The recovered clock is fed to the data sampling latch in order to recover
the data. In the locked condition, the VCO stably generates a 2 GHz clock,
and the output data stream exactly follows the pattern of the input data
stream. Figure 2.51 shows the 10 µs simulation result. As seen from the
figure, the control voltage is still decreasing, which means that the CDR
is still on the way to being locked. Because the CDR simulation is lengthy,
an initial value close to the lock voltage was applied to the control
voltage line in order to significantly shorten the simulation time needed
to reach the locked condition. The initial value was set to 1.3 V, and the
simulation time was set to 10 µs. Figure 2.51 (top) and Figure 2.51
(bottom) present the 10 µs simulation result with the 1.3 V initial voltage
and its zoomed-in view, respectively. We can see that the CDR is locked at
around 1.3 V, and the data can be perfectly recovered in this condition.
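A PRBS source of the kind used in such a testbench can be sketched with a linear-feedback shift register. The example below uses the short PRBS-7 polynomial x^7 + x^6 + 1 purely for illustration; the polynomial, seed and function name are assumptions, and the generator actually used in the testbench is not specified here.

```python
def prbs7(n_bits: int, seed: int = 0x7F):
    """Generate n_bits of a PRBS-7 sequence (x^7 + x^6 + 1) from a 7-bit LFSR."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    out = []
    for _ in range(n_bits):
        # Feedback is the XOR of the two oldest bits in the register.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        out.append(new_bit)
    return out


if __name__ == "__main__":
    seq = prbs7(127)
    print("".join(str(b) for b in seq))
    print("ones in one period:", sum(seq))
```

The sequence repeats every 127 bits and is nearly balanced between ones and zeros, so it exercises the CDR with a wide range of run lengths while remaining deterministic for comparison against the recovered data.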
Figure 2.50: Schematic of the CDR testbench
Figure 2.51: Simulation result of the CDR for 10 µs simulation
2.5 Link Verification
2.5.1 Channel Design
In this section, Ansys HFSS and Ansys Q3D Extractor were used to sim-
ulate and extract the equivalent S-parameters of the signal channel. The
simulated S-parameters are imported to Agilent ADS to verify that the de-
sired specifications have been achieved. The process of how to design the
signal channel is illustrated in Figure 2.52. The reference impedance of all
the system components was set to be 75 Ω . The data rate is a key design
parameter because it determines the working frequency of the signal trace.
Considering a robust system design, all the subsystem components for the
signal trace are simulated with a frequency up to 10 GHz. The bit-error-rate
(BER, less than 10^-12) is another important design parameter. While it is
non-trivial, if not impossible, to find an explicit relationship that expresses
the BER in terms of the insertion loss of the signal trace, it is certainly
possible to decrease the BER by minimizing the loss of the signal channel over
the entire frequency band of interest.
Figure 2.52: Illustration of channel design
Bonding Wire
The structure of the bonding wire (or the controller package trace) is illus-
trated in Figure 2.53. There are many practical issues or, more precisely,
conventions involved in the design of the bonding wires; for example, they
must be connected to the traces via pads. With such conventions considered
in the model, the Ansys HFSS simulation results are depicted in Figure 2.54.
As can be observed from Figure 2.54, the loss introduced by bonding wires
is very small. After completing the HFSS simulation, the S-parameters were
exported as an s4p file for the future ADS simulation.
Figure 2.53: Configuration of the bonding wires in Ansys HFSS
Figure 2.54: HFSS simulated S-parameters of the bonding wires
Package Via
The geometrical configurations and HFSS simulation results of the package
vias are shown in Figures 2.55 and 2.56, respectively. The package vias are
complicated structures with many connections. It is not expected to have
49
the same low insertion loss as the bonding wire. Nevertheless, as is shown in
Figure 2.56, the insertion loss is still controlled within an acceptable range.
Even for the worst case when the working frequency approaches 10 GHz, the
loss is approximately -1.32 dB.
Figure 2.55: Configuration of the package vias in Ansys HFSS
Figure 2.56: HFSS simulated S-parameters of the package vias
Stripline
The design of a stripline is a rather mature topic. There are many well-
tested formulas to use. However, to combine all the subcomponents inside
Agilent ADS to perform a wide-band simulation, we chose Ansys Q3D to
extract the equivalent S-parameters. We use Q3D rather than HFSS to
extract the S-parameters because the former can analyze this problem more
efficiently by solving a two-dimensional problem instead of solving a complex
and time-consuming three-dimensional problem. Figure 2.57 demonstrates
the geometrical configurations of the stripline with a trace of 2.25 mil in
length and 0.325 mil in height, and a substrate of 15 mil in height. The
extracted S-parameters are exported as an s2p file for the subsequent ADS
simulations.
Figure 2.57: Configuration of the stripline in Ansys Q3D
ADS Simulation
With the S-parameters extracted by Ansys HFSS and Ansys Q3D, an ADS
model was built to investigate the total loss of the signal trace (see
Figures 2.58 and 2.59). As can be observed from Figure 2.59, the signal
trace has excellent S11 and S21 from DC to 2 GHz. From 2 GHz to 6 GHz,
although S21 does not deteriorate much, S11 increases significantly. Above
6 GHz, both S11 and S21 deteriorate considerably. The poor performance at
relatively high frequencies might be partly due to the sharp corners in
this design.
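The way the individual blocks combine into a total response can be illustrated with the standard formula for cascading two two-port S-parameter networks at a single frequency. This sketch is a simplification: the actual bonding-wire model is a four-port (s4p) network, the numerical values are hypothetical, and the function names are assumptions.

```python
import math


def cascade(a, b):
    """Cascade two 2-port S-matrices, given as [[S11, S12], [S21, S22]].

    Port 2 of network a is connected to port 1 of network b, and both are
    assumed to share the same reference impedance.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    d = 1 - a22 * b11               # multiple reflections between the two blocks
    s11 = a11 + a12 * b11 * a21 / d
    s12 = a12 * b12 / d
    s21 = a21 * b21 / d
    s22 = b22 + b21 * a22 * b12 / d
    return [[s11, s12], [s21, s22]]


def db(x) -> float:
    """Magnitude of a (possibly complex) S-parameter in dB."""
    return 20.0 * math.log10(abs(x))


if __name__ == "__main__":
    # Hypothetical single-frequency samples of two blocks in the signal path.
    block_a = [[0.05 + 0.02j, 0.95], [0.95, 0.04]]
    block_b = [[0.10, 0.90 - 0.05j], [0.90 - 0.05j, 0.08]]
    total = cascade(block_a, block_b)
    print(f"S11 = {db(total[0][0]):.2f} dB, S21 = {db(total[1][0]):.2f} dB")
```

For well-matched blocks the denominator is close to one and the insertion losses in dB simply add; as the individual return losses degrade, the interaction term grows and the total response develops ripple.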
Figure 2.58: The ADS schematic circuit for analyzing the total loss of the
signal trace
Figure 2.59: Simulated total loss of the signal trace in Agilent ADS
2.5.2 Combined Simulation
Serialized data coming from the serializer are transmitted to the channel
through the Tx driver. The scattering parameters of the channel designed in
HFSS are first packaged in ADS and then imported into Virtuoso, where they
are represented by the two-port network called n2port, as shown in Figure
2.60. To analyze the signal integrity performance of the link, eye diagrams
were simulated, as shown in Figures 2.61 to 2.63. Due to the slight
mismatch between the output impedance of the transmitter and the
characteristic impedance of the channel, as well as data-dependent jitter,
we can observe some ripples at the transition points between unit intervals
(UIs), as shown in Figure 2.61. Data-dependent jitter (DDJ) depends on the
input pattern, as well as on the impulse response of the system that
generates the pattern. Since the channel is bandwidth-limited and has a
step response with finite rise time, DDJ is to be expected. To verify the
performance of the receiver circuit, the eye diagram at the receiver output
is measured, as shown in Figure 2.62. The receiver output achieves almost
full rail-to-rail swing, and the aperture time is reasonably good at the
data rate of 2 Gbps. Finally, the eye diagram of the data output of the CDR
is simulated and shown in Figure 2.63, where we can see that the eye is
open. In the transient simulation, shown in Figure 2.64, we can see that
the CDR block is able to recover the clock at 2 GHz as well as fully
recover the transmitted data through the receiver.
Figure 2.60: Virtuoso testbench for link verification
Figure 2.61: Eye diagram measured at Tx driver with channel effect
Figure 2.62: Eye diagram measured after Receiver
Figure 2.63: Eye diagram measured after CDR
Figure 2.64: Transient simulation of recovered data at receiver end
With the PRBS-31 sequence, the eye diagram is simulated as shown in
Figure 2.65. From the eye diagram, it is possible to extract the individual
jitter components. Ideally, the total jitter could be separated into its
components by deconvolution. However, without prior knowledge of the
deterministic jitter, this is not possible. Instead, the tail-fit technique
can be used to extract the random jitter under the assumption that its
probability density function follows a Gaussian distribution. Making use of
the dual-Dirac model, the individual jitter components of the total jitter
measurement can then be extracted. Jitter can cause inter-symbol
interference (ISI), which occurs if the time required by the signal to
charge completely is longer than the bit interval. To ensure high signal
integrity, jitter should be minimized. As shown in Figures 2.65 to 2.70,
the total jitter (TJ) PDF is the convolution of the individual components,
in this case the deterministic jitter (DJ) and the random jitter (RJ):
TJ(x) = DJ(x) ∗ RJ(x)
The sample delay BERT scan curve is a direct measurement of the jitter
cumulative distribution function (CDF). In a BERT scan, the BER (CDF) is
measured as the sample time is swept between the two bit-time boundaries.
From the BERT scan results, one can estimate the jitter
in the signal. The BER is a function of the sample time and the probability
density function (PDF) width. This is commonly known as a BER bathtub
curve. The BER bathtub curve is a description of the shape of a BER or
CDF curve that has steep walls to a noise floor (a flat bottom) where the
probability of population is small. It is observed that for the majority of the
unit interval (UI), the bit error rate (BER) is less than 10^-12.
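The relationship between the dual-Dirac jitter components and the bathtub curve can be illustrated with a short numerical sketch. It is not the simulation flow used for Figures 2.65 to 2.70: the function names, the transition density of 0.5, and the example DJ and RJ values are assumptions chosen for illustration. It uses the common dual-Dirac estimate TJ(BER) = DJ + 2 Q(BER) RJ_rms, ignoring transition-density corrections.

```python
import math
from statistics import NormalDist


def q_func(x):
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bathtub_ber(t_ui, dj_ui, rj_rms_ui, transition_density=0.5):
    """BER at sampling instant t_ui (0..1 UI) under the dual-Dirac model.

    Each data edge is modeled as two Gaussians of width rj_rms_ui centered
    at +/- dj_ui / 2 around its nominal position (0 UI and 1 UI).
    """
    def edge(distance):
        # Each Dirac impulse carries half of the edge population.
        return 0.5 * (q_func((distance - dj_ui / 2.0) / rj_rms_ui)
                      + q_func((distance + dj_ui / 2.0) / rj_rms_ui))

    return transition_density * (edge(t_ui) + edge(1.0 - t_ui))


def total_jitter(dj_ui, rj_rms_ui, ber=1e-12):
    """Dual-Dirac total jitter estimate TJ = DJ + 2 * Q(BER) * RJ_rms."""
    q_ber = -NormalDist().inv_cdf(ber)  # about 7.03 at BER = 1e-12
    return dj_ui + 2.0 * q_ber * rj_rms_ui


if __name__ == "__main__":
    dj, rj = 0.10, 0.02  # hypothetical values, in UI
    for t in (0.0, 0.1, 0.2, 0.3, 0.5):
        print(f"t = {t:.1f} UI  BER = {bathtub_ber(t, dj, rj):.3e}")
    print(f"TJ at 1e-12 = {total_jitter(dj, rj):.3f} UI")
```

Sweeping `t_ui` from 0 to 1 traces the steep walls and flat floor of the bathtub; the eye opening at a target BER is then one UI minus the total jitter at that BER.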
Figure 2.65: Eye diagram of the HSSL with PRBS-31 sequence
Figure 2.66: Total jitter (TJ) histogram of the HSSL with PRBS-31
sequence
Figure 2.67: Data-dependent jitter (DDJ) histogram of the HSSL with
PRBS-31 sequence
Figure 2.68: Random jitter (RJ) histogram of the HSSL with PRBS-31
sequence
Figure 2.69: DDJ vs. Bit Number of the HSSL with PRBS-31 sequence
Figure 2.70: Bathtub curve and bit-error rate (BER)
CHAPTER 3
MODELING AND SIMULATION OF
HIGH-SPEED INTERCONNECT
3.1 Overview
The interconnect in high-speed links usually refers to the electrical path be-
tween the transmitter (Tx) and receiver (Rx). Also known as the “channel,”
the interconnect can include but is not limited to the following components:
IC packages like wire bonds, ball grid arrays (BGAs), through-silicon vias
(TSVs), printed circuit board (PCB) traces and vias, connectors, wires, flex-
ible circuits (FLEX), etc.
Figure 3.1: Interconnect modeling
As shown in Figure 3.1, for low-speed designs, where the wavelength is
much larger than the physical dimensions, an interconnect can be modeled as a
simple short. For applications within mid-range speed, the interconnect can
be approximated with lumped RLGC elements. As the speed goes higher
and higher, where the wavelength is comparable to the physical dimensions,
a distributed element model (also known as a transmission line model) is
required.
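The choice among the three models can be sketched as a comparison of the interconnect length with the signal wavelength. The thresholds below (one hundredth and one tenth of a wavelength) are assumed rules of thumb for illustration only, not values taken from this work, and the function name is hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def interconnect_model(length_m, freq_hz, eps_eff=1.0,
                       short_limit=0.01, lumped_limit=0.1):
    """Suggest a model from the electrical length of the interconnect.

    short_limit and lumped_limit are assumed thresholds on the ratio of
    physical length to wavelength; adjust them to the accuracy required.
    """
    wavelength = C0 / (freq_hz * math.sqrt(eps_eff))
    ratio = length_m / wavelength
    if ratio < short_limit:
        return "short"
    if ratio < lumped_limit:
        return "lumped RLGC"
    return "transmission line"


if __name__ == "__main__":
    # A 10 mm trace with an assumed effective permittivity of 4.
    for f in (1e6, 1e9, 10e9):
        print(f"{f:.0e} Hz -> {interconnect_model(10e-3, f, eps_eff=4.0)}")
```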
3.2 Interconnect Modeling Using Numerical Methods
with Field Solvers
Modeling and simulation for high-speed design, signal integrity and power
integrity (SIPI), and electromagnetics (EM) can be classified as
high-performance computing (HPC) tasks, which require extraordinarily high CPU,
RAM, and graphics performance. Sufficient power supply and cooling also
have to be ensured in order to handle heavy and time-consuming simulation
tasks.
For interconnect extraction, modeling and simulation, there are existing
commercial tools (e.g., CST Microwave Studio, Ansys HFSS, Q3D Extractor,
etc.) whose simulation approaches involve numerical methods such as
finite-difference time-domain (FDTD), finite element method (FEM), boundary
element method (BEM), etc., but they can be extremely computationally
demanding and take a very long time for a single simulation run.
Figures 3.2 to 3.5 show examples of HFSS models and S-parameter
results of a microstrip line and a microstrip line with discontinuities;
their simulation can take hours, depending on the computer specifications
discussed above.
Figure 3.2: HFSS modeling of a microstrip line
Figure 3.3: Simulated S-parameters of a microstrip line
Figure 3.4: HFSS modeling of a microstrip line with discontinuities
Figure 3.5: Simulated S-parameters of a microstrip line with discontinuities
Simple structures like stripline models shown in Figures 3.6 to 3.9 are used
in memory interconnects, and the time-consuming simulations can slow down
each iteration of feasibility studies for full system design.
Figure 3.6: Cross-sectional view of the multi-conductor stripline models
with 1 conductor at two different thicknesses
Figure 3.7: Cross-sectional view of the multi-conductor stripline models
with 2 conductors at two different thicknesses
Figure 3.8: Cross-sectional view of the multi-conductor stripline models
with 3 conductors at two different thicknesses
Figure 3.9: Cross-sectional view of the multi-conductor stripline models
with 4 conductors at two different thicknesses
3.3 Interconnect Modeling Using Conformal Mapping
and Variational Method in Closed-Form Analytical
Method
3.3.1 Basis of the Unified Approach
In this work, a unified approach is used to solve a microstrip-like trans-
mission line with multilayer dielectrics and bottom ground aperture. The
approach makes use of the Green’s function and the transverse transmission
line technique combined with the variational method. Closed-form analyti-
cal expressions for the line capacitance and characteristic impedance of the
microstrip-like interconnect with bottom ground aperture are presented.
Figure 3.10: N-layer dielectric with side walls and a point source
Consider a unit charge located at (x0, y0) as shown in Figure 3.10. The
Green’s function satisfies Poisson’s differential equation in the plane
(x, y), which is given by Equation (3.1):
\[ \nabla_t^2 G(x, y \mid x_0, y_0) = -\frac{1}{\varepsilon}\,\delta(x - x_0)\,\delta(y - y_0) \tag{3.1} \]
For an interconnect with a multilayered substrate, the boundary conditions
at the interface of the dielectrics are given by:
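\[ G(x, s_j - 0) = G(x, s_j + 0) \tag{3.2} \]
\[ \varepsilon_j \frac{\partial}{\partial y} G(x, s_j - 0) = \varepsilon_{j+1} \frac{\partial}{\partial y} G(x, s_j + 0) \tag{3.3} \]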
The Green’s function can be expressed as the sum of the product of
elementary functions with separated variables:
\[ G = \sum_n G_n(x)\,G_n(y) \tag{3.4} \]
In order to satisfy the boundary conditions on the vertical walls separated
by wall spacing c, the following expressions are found [8] for Gn(x) for the
three cases shown in Figure 3.11:
Case a. Electric walls at x = 0 and c (Dirichlet type, G = 0):
\[ G_n(x) = \sin\frac{n\pi x}{c}, \quad n = 1, 2, \ldots, \infty \tag{3.5} \]
Case b. Electric wall at x = 0 (Dirichlet type, G = 0) and magnetic
wall at x = c (Neumann type, ∂G/∂n = 0):
\[ G_n(x) = \sin\frac{(2n+1)\pi x}{2c}, \quad n = 0, 1, 2, \ldots, \infty \tag{3.6} \]
Case c. Magnetic walls at x = 0 and c (Neumann type, ∂G/∂n = 0):
\[ G_n(x) = \cos\frac{n\pi x}{c}, \quad n = 1, 2, \ldots, \infty \tag{3.7} \]
Note that in the equations, sin(nπx/c), sin((2n+1)πx/(2c)) and cos(nπx/c)
are orthogonal in the interval (0, c).
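The orthogonality of the three families of functions can be checked numerically. The following is a minimal sketch using a midpoint-rule integral; the helper names and the wall spacing c = 1 are assumptions made for the illustration.

```python
import math


def inner_product(f, g, c, steps=2000):
    """Midpoint-rule approximation of the integral of f(x) * g(x) over (0, c)."""
    dx = c / steps
    return sum(f((k + 0.5) * dx) * g((k + 0.5) * dx) for k in range(steps)) * dx


def basis_a(n, c):
    """Case a: electric walls at x = 0 and x = c."""
    return lambda x: math.sin(n * math.pi * x / c)


def basis_b(n, c):
    """Case b: electric wall at x = 0, magnetic wall at x = c."""
    return lambda x: math.sin((2 * n + 1) * math.pi * x / (2.0 * c))


def basis_c(n, c):
    """Case c: magnetic walls at x = 0 and x = c."""
    return lambda x: math.cos(n * math.pi * x / c)


if __name__ == "__main__":
    c = 1.0
    for name, basis in (("a", basis_a), ("b", basis_b), ("c", basis_c)):
        cross = inner_product(basis(1, c), basis(2, c), c)
        norm = inner_product(basis(2, c), basis(2, c), c)
        print(f"case {name}: <G1, G2> = {cross:.2e}, <G2, G2> = {norm:.4f}")
```

For each case the cross term is zero to within rounding, while the self term equals c/2.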
Figure 3.11: Three boundary conditions. Dashed lines represent magnetic
walls, thick solid lines represent electric wall, and thin solid lines represent
the dielectric interfaces
The transverse transmission line technique is used to compute Gn(y). Consider
a transmission line with a current source of intensity Is at the charge
plane y = y0. The voltage and current relations along the line are [9, 10, 11]:
\[ \frac{dV}{dy} = -\gamma Z_c I \tag{3.8} \]
\[ \frac{dI}{dy} = -\frac{\gamma}{Z_c} V + I_s\,\delta(y - y_0) \tag{3.9} \]
where Zc is the characteristic impedance of the line and γ is the propagation
constant.
Solving (3.8) and (3.9) simultaneously leads to the following differential
equation satisfied by the voltage:
\[ \frac{d^2V}{dy^2} - \gamma^2 V = -\gamma Z_c I_s\,\delta(y - y_0) \tag{3.10} \]
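The intermediate step can be written out as follows, assuming Zc and γ are constant within the section considered. Differentiating (3.8) with respect to y and substituting (3.9) for dI/dy gives

\[ \frac{d^2V}{dy^2} = -\gamma Z_c \frac{dI}{dy} = -\gamma Z_c \left[ -\frac{\gamma}{Z_c} V + I_s\,\delta(y - y_0) \right] = \gamma^2 V - \gamma Z_c I_s\,\delta(y - y_0) \]

and moving the γ²V term to the left-hand side yields (3.10).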
In the case of a change in the characteristic admittance of the line, the
continuity equations are
\[ V_j = V_{j+1} \tag{3.11} \]
and Ij = Ij+1, which combined with (3.8) give
\[ Y_{cj}\frac{\partial V_j}{\partial y} = Y_{c,j+1}\frac{\partial V_{j+1}}{\partial y} \tag{3.12} \]
Comparing Equations (3.10) to (3.12), the authors of [12] came up with
the following:
1. Green’s function Gn(y) can be determined by the voltage along the
line:
\[ V \equiv G_n(y) \tag{3.13} \]
2. The dielectric constant of the layer can be identified by the
characteristic admittance of the transmission line:
\[ Y_{cj} = \varepsilon_j \tag{3.14} \]
Thus, the boundary conditions satisfied by the Green’s function at the
various interfaces are equivalent to the boundary conditions satisfied by the
voltages at the interfaces between two dissimilar characteristic admittances.
The voltage on the transmission line at the charge plane y = y0 is given by
\[ V\big|_{y=y_0} = \frac{I_s}{Y} \tag{3.15} \]
where Y is the admittance at y = y0. The Green’s function for the three
cases can now be obtained as follows:
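Case a. Electric walls at x = 0 and c (Dirichlet type, G = 0):
\[ Z_c = \frac{1}{\varepsilon}, \quad \gamma = \frac{n\pi}{c}, \quad \text{and} \quad I_s = \frac{2}{n\pi}\sin\frac{n\pi x_0}{c} \]
Thus,
\[ G_n(y)\big|_{y=y_0} = \frac{2}{n\pi Y}\sin\frac{n\pi x_0}{c} \tag{3.16} \]
Substituting (3.16) and Gn(x) into G = ∑n Gn(x)Gn(y), the Green’s function
at the charge plane y = y0 becomes
\[ G(x, y \mid x_0, y_0)\big|_{y=y_0} = \sum_{n=1}^{\infty} \frac{2}{n\pi Y}\sin\frac{n\pi x}{c}\,\sin\frac{n\pi x_0}{c} \tag{3.17} \]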
Case b. Electric wall at x = 0 (Dirichlet type, G = 0) and magnetic
wall at x = c (Neumann type, ∂G/∂n = 0):
\[ Z_c = \frac{1}{\varepsilon}, \quad \gamma = \frac{(2n+1)\pi}{2c}, \quad \text{and} \quad I_s = \frac{4}{(2n+1)\pi}\sin\frac{(2n+1)\pi x_0}{2c} \]
Thus,
\[ G_n(y)\big|_{y=y_0} = \frac{4}{(2n+1)\pi Y}\sin\frac{(2n+1)\pi x_0}{2c} \tag{3.18} \]
Substituting (3.18) and Gn(x) into G = ∑n Gn(x)Gn(y), the Green’s function
at the charge plane y = y0 becomes
\[ G(x, y \mid x_0, y_0)\big|_{y=y_0} = \sum_{n=0}^{\infty} \frac{4}{(2n+1)\pi Y}\sin\frac{(2n+1)\pi x}{2c}\,\sin\frac{(2n+1)\pi x_0}{2c} \tag{3.19} \]
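Case c. Magnetic walls at x = 0 and c (Neumann type, ∂G/∂n = 0):
\[ Z_c = \frac{1}{\varepsilon}, \quad \gamma = \frac{n\pi}{c}, \quad \text{and} \quad I_s = \frac{2}{n\pi}\cos\frac{n\pi x_0}{c} \]
Thus,
\[ G_n(y)\big|_{y=y_0} = \frac{2}{n\pi Y}\cos\frac{n\pi x_0}{c} \tag{3.20} \]
Substituting (3.20) and Gn(x) into G = ∑n Gn(x)Gn(y), the Green’s function
at the charge plane y = y0 becomes
\[ G(x, y \mid x_0, y_0)\big|_{y=y_0} = \sum_{n=1}^{\infty} \frac{2}{n\pi Y}\cos\frac{n\pi x}{c}\,\cos\frac{n\pi x_0}{c} \tag{3.21} \]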
The variational method is a well-established mathematical technique gen-
erally used to seek a function that gives a maximum or minimum of a desired
quantity which depends upon that function, and is widely applied to elec-
tromagnetics problems, particularly microwave problems where the physical
system under consideration acts such that some function of its behavior at-
tains the least or the greatest value. For example, in an electrostatic system,
the well-known Thomson’s theorem states that the charges which reside on
conducting bodies and which give rise to the electric field E will distribute
themselves such that the energy function We is minimized. So, the varia-
tional method is fundamentally a maximization or minimization technique
[13].
In the unified approach, the variational method is used to compute the
capacitance per unit length [14]. To illustrate the basis of the variational
method, consider a system of perfect conductors S1, S2, . . . , SN , with Q1,
Q2, . . . , QN as the charges on the conductors held at potentials V1, V2, . . . ,
VN . The potential function ϕ in the space surrounding the conductors is the
solution of the Laplace equation ∇2ϕ = 0 subject to the boundary conditions
ϕ = Vi on Si with i = 1, 2, . . . , N. The electrostatic energy stored is given by
\[ W_e = \frac{\varepsilon}{2}\int_{\mathrm{vol}} \nabla\phi \cdot \nabla\phi \, dV \tag{3.22} \]
where the integration is carried over the entire volume containing the electric
field. If the charges on the conductors are moved slightly from their
equilibrium position while the potentials are kept constant, the potential
distribution in the surrounding space also changes. The change in the energy
function is given by
\[ \delta W_e = \frac{\varepsilon}{2}\int_{\mathrm{vol}} \nabla\delta\phi \cdot \nabla\delta\phi \, dV \tag{3.23} \]
where δϕ is an incremental change in ϕ. If a trial function for the potential
distribution which differs by a small quantity δϕ from the correct value is
inserted, the resulting value of We will change from its value by an amount
proportional to (δϕ)2. In other words, for a first-order change in ϕ, the
change in We is only of second-order. The energy function We is a positive
stationary function for the equilibrium conditions. Hence the true value of
We is a minimum since any change from the equilibrium increases the energy
function We [15].
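The reason the first-order term drops out can be sketched as follows, assuming a homogeneous permittivity ε and a perturbation δϕ that vanishes on every conductor surface (the potentials are held fixed). Inserting ϕ + δϕ into (3.22) gives

\[ W_e[\phi + \delta\phi] = W_e[\phi] + \varepsilon\int_{\mathrm{vol}} \nabla\phi \cdot \nabla\delta\phi \, dV + \frac{\varepsilon}{2}\int_{\mathrm{vol}} |\nabla\delta\phi|^2 \, dV \]

and, by Green's first identity,

\[ \int_{\mathrm{vol}} \nabla\phi \cdot \nabla\delta\phi \, dV = \oint_{S} \delta\phi\,\frac{\partial\phi}{\partial n}\, dS - \int_{\mathrm{vol}} \delta\phi\,\nabla^2\phi \, dV = 0 \]

because δϕ = 0 on the conductors and ∇²ϕ = 0 in the surrounding space. Only the second-order term of (3.23) remains, and it is non-negative, so the true potential minimizes the energy.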
The energy stored in the electrostatic field per unit length along the trans-
mission line is given by
\[ W_e = \frac{\varepsilon}{2}\iint_{xy\text{-plane}} |\nabla_t\phi|^2 \, dx\, dy = \frac{1}{2} C V_0^2 \tag{3.24} \]
where C is the capacitance per unit length of the line and V0 is the potential
difference between the two conductors. The upper bound on capacitance per
unit length of the line is given by
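\[ C = \frac{\varepsilon}{V_0^2}\iint_{xy\text{-plane}} |\nabla_t\phi|^2 \, dx\, dy = \frac{\varepsilon\iint_{xy\text{-plane}} |\nabla_t\phi|^2 \, dx\, dy}{\left(\int_{S_1}^{S_2} \nabla_t\phi \cdot d\mathbf{l}\right)^2} \tag{3.25} \]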
where V0 is the line integral of ∇tϕ from S1 to S2.
The authors of [15] have elaborated in detail on variational expressions for
upper and lower bounds on capacitance. The lower bound on capacitance is
given by:
\[ \frac{1}{C} = \frac{1}{Q^2}\int_{S_2} \phi(x, y)\,\rho(x, y)\, dl \tag{3.26} \]
where ρ(x, y) is the unknown charge distribution for which a suitable trial
function would be substituted later.
3.3.2 Analysis using Variational Method
The unified approach combines the variational method with the transverse
transmission line technique for determination of capacitance. A comprehensive
review of the unified approach can be found in [12].
Figure 3.12: Configuration of the interconnect with a bottom ground plane
aperture
Consider the structure shown in Figure 3.12, where the bottom ground
aperture with width ws is assumed to be symmetrical with respect to the
center of the signal line with width w. Three isotropic dielectric layers have
heights b1, b2 and b3 as well as relative permittivities ε1, ε2 and ε3 respectively.
The short circuits at either end of the structure correspond to ground planes
at y = 0 and y = b. Due to the inhomogeneous structure, and partial
opening in the ground plane, the substrate region (region 2) is divided into
three vertical profiles I, II and III.
Similar to the previous analysis, there are three possible boundary condi-
tions:
Case a. Electric walls at x = 0 and c (Dirichlet type, G = 0):
\[ \beta_n(x) = \frac{n\pi}{c} \tag{3.27} \]
Case b. Electric wall at x = 0 (Dirichlet type, G = 0) and magnetic wall
at x = (c − ws)/2 (Neumann type, ∂G/∂n = 0):
\[ \beta_n(x) = \frac{(2n+1)\pi}{c - w_s}, \quad n = 0, 1, 2, \ldots, \infty \tag{3.28} \]
Case c. Magnetic walls at x = (c − ws)/2 and x = (c + ws)/2 (Neumann
type, ∂G/∂n = 0):
\[ \beta_n(x) = \frac{n\pi}{w_s} \tag{3.29} \]
where n = 1, 3, 5, . . . , ∞.
Consider an infinitesimally thin strip conductor S as shown in Figure 3.12.
The charge distribution can be assumed as:
\[ \rho(x, y) = f(x)\,\delta(y - y_0) \tag{3.30} \]
where f(x) is the charge distribution in the x-direction.
Substituting the charge distribution function in Equation (3.26), the
variational expression for the capacitance per unit length of a multilayer
structure with side walls can be found as:
\[ \frac{1}{C} = \frac{\iint_S G(x, y \mid x_0, y_0)\, f(x)\, f(x_0)\, dx\, dx_0}{\left[\int_S f(x)\, dx\right]^2} \tag{3.31} \]
Thus, Green’s function for various boundary conditions at the side walls
derived in the previous section as Equations (3.17), (3.19) and (3.21) can be
substituted in the above equation. Expressions for capacitance for the three
cases of side wall conditions take the following form:
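Case a. Electric walls at x = 0 and c (Dirichlet type, G = 0):
\[ C = \frac{\left[\int_S f(x)\, dx\right]^2}{\sum_{n=1}^{\infty} \frac{2}{n\pi Y}\left[\int_S f(x)\sin\frac{n\pi x}{c}\, dx\right]^2} \tag{3.32} \]
Case b. Electric wall at x = 0 (Dirichlet type, G = 0) and magnetic wall
at x = c (Neumann type, ∂G/∂n = 0):
\[ C = \frac{\left[\int_S f(x)\, dx\right]^2}{\sum_{n=0}^{\infty} \frac{4}{(2n+1)\pi Y}\left[\int_S f(x)\sin\frac{(2n+1)\pi x}{2c}\, dx\right]^2} \tag{3.33} \]
Case c. Magnetic walls at x = 0 and c (Neumann type, ∂G/∂n = 0):
\[ C = \frac{\left[\int_S f(x)\, dx\right]^2}{\sum_{n=1}^{\infty} \frac{2}{n\pi Y}\left[\int_S f(x)\cos\frac{n\pi x}{c}\, dx\right]^2} \tag{3.34} \]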
where the admittance at the charge plane is Y = Y+ + Y−. The expression for
the admittance can be easily obtained by using the standard transmission
line formula for the input admittance Yin of a section of transmission line. If
lj is the length of the jth section, its input admittance Yin,j is given by
\[ Y_{in,j} = Y_{cj}\left[\frac{Y_{Lj} + Y_{cj}\tanh(\gamma_j l_j)}{Y_{cj} + Y_{Lj}\tanh(\gamma_j l_j)}\right] \tag{3.35} \]
where YLj is the load admittance of the jth section, which is the same as the
input admittance Yin,j+1 of the (j+1)th section. Ycj and γj are the
characteristic admittance and propagation constant of the jth section. Assuming
isotropic dielectric layers, Ycj = εj and γj = nπ/c for cases (a) and (c), while
for case (b) γj = (2n+1)π/(2c).
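The recursive use of the input admittance formula can be sketched in a few lines. This is an illustrative sketch, not the implementation used in this work: the function names are hypothetical, all sections are assumed to share one propagation constant, and a ground plane is represented as a short circuit (infinite load admittance).

```python
import math


def input_admittance(y_c, y_load, gamma, length):
    """Input admittance of one section, per the standard formula (3.35)."""
    th = math.tanh(gamma * length)
    if math.isinf(y_load):
        # Short-circuit load (ground plane): the formula reduces to Yc * coth.
        return y_c / th
    return y_c * (y_load + y_c * th) / (y_c + y_load * th)


def cascade_admittance(layers, gamma, y_termination=math.inf):
    """Admittance seen at the charge plane looking into a stack of layers.

    layers is a list of (y_c, length) pairs ordered from the charge plane
    toward the termination; the recursion starts at the termination.
    """
    y = y_termination
    for y_c, length in reversed(layers):
        y = input_admittance(y_c, y, gamma, length)
    return y


if __name__ == "__main__":
    gamma = math.pi / 0.078  # n = 1 with an assumed wall spacing of 78 mm
    # Two dielectric layers (relative permittivities 2.17 and 1) over ground.
    print(cascade_admittance([(2.17, 0.5e-3), (1.0, 0.3e-3)], gamma))
```

For a single layer terminated by a ground plane the result reduces to Yc coth(γl), which is the form that a short-circuited section takes.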
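It is found that a trial function given by the following gives very accurate
results for practical engineering purposes [13]:
\[ f(x) = \begin{cases} \dfrac{1}{w}\left[1 + A\left|\dfrac{2}{w}\left(x - \dfrac{c}{2}\right)\right|^3\right], & \dfrac{c - w}{2} \le x \le \dfrac{c + w}{2} \\ 0, & \text{otherwise} \end{cases} \tag{3.36} \]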
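Substituting Equation (3.36) into Equation (3.32) and simplifying gives the
capacitance expression:
\[ C = \frac{(1 + 0.25A)^2}{\sum_{n\,\mathrm{odd}} (L_n + A M_n)^2 P_n / Y} \tag{3.37} \]
where
\[ L_n = \sin(\beta_n w/2) \]
\[ M_n = \left(\frac{2}{\beta_n w}\right)^3 \left[ 3\left\{(\beta_n w/2)^2 - 2\right\}\cos(\beta_n w/2) + (\beta_n w/2)\left\{(\beta_n w/2)^2 - 6\right\}\sin(\beta_n w/2) + 6 \right] \]
\[ P_n = \frac{2}{n\pi}\left(\frac{2}{\beta_n w}\right)^2 \]
\[ \beta_n = \frac{n\pi}{c} \]
To determine the value of A, solve ∂C/∂A = 0 to obtain
\[ A = -\frac{\sum_{n\,\mathrm{odd}} (L_n - 4M_n) L_n P_n / Y}{\sum_{n\,\mathrm{odd}} (L_n - 4M_n) M_n P_n / Y} \]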
The only parameter left to be evaluated is the admittance Y at the charge
plane. The admittance parameter at the charge plane is obtained by applying
the transmission line formula (3.35). The admittance of the top layer is given
by
\[ Y_+ = Y_{in,3} = \varepsilon_0\varepsilon_3\coth(\gamma_3 b_3) \tag{3.38} \]
where γ3 = nπ/c is defined in case a above.
The admittance parameter below the charge plane is given by
\[ Y_- = Y_I + Y_{II} + Y_{III} \tag{3.39} \]
with
\[ Y_I = Y_{III} = \varepsilon_0\varepsilon_2\coth(\gamma_2 b_2) \tag{3.40} \]
\[ Y_{II} = \varepsilon_0\varepsilon_2\coth(\gamma_{2,II} b_2) \tag{3.41} \]
where γ2 = βn(x) = (2n+1)π/(c − ws) is defined in case b above and
γ2,II = nπ/ws is defined in case c above.
The total admittance is given by Y = Y+ + Y− , which is then substituted
back to (3.37) for computation of capacitance per unit length.
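As a hedged numerical sketch, Equation (3.37) can be evaluated for the simpler case of a solid ground plane, that is, without the aperture, so that the admittance below the charge plane reduces to a single term ε0 ε2 coth(γ b2) with γ = nπ/c. This simplification, the function names, the truncation of the series at n_max, and the assumption that the strip width in Table 3.1 is in millimetres are all choices made for this illustration and do not reproduce the aperture analysis or the results of Figure 3.13.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C0 = 299_792_458.0        # speed of light in vacuum, m/s


def capacitance_per_length(w, h, h_top, c, er_sub, er_top=1.0, n_max=20001):
    """Variational capacitance (F/m) from Eq. (3.37), electric side walls.

    Assumes a solid ground plane under a substrate of height h and a
    grounded cover at height h_top above the strip (no aperture).
    """
    s_ll = s_lm = s_mm = 0.0
    for n in range(1, n_max + 1, 2):          # odd n only
        beta = n * math.pi / c
        t = beta * w / 2.0
        # Admittance at the charge plane: Y = Y+ + Y-, each a shorted layer.
        y = EPS0 * (er_top / math.tanh(beta * h_top)
                    + er_sub / math.tanh(beta * h))
        l_n = math.sin(t)
        m_n = (3.0 * (t * t - 2.0) * math.cos(t)
               + t * (t * t - 6.0) * math.sin(t) + 6.0) / t ** 3
        p_n = (2.0 / (n * math.pi)) / (t * t)
        k = p_n / y
        s_ll += l_n * l_n * k
        s_lm += l_n * m_n * k
        s_mm += m_n * m_n * k
    # Stationary value of A from dC/dA = 0.
    a = -(s_ll - 4.0 * s_lm) / (s_lm - 4.0 * s_mm)
    denominator = s_ll + 2.0 * a * s_lm + a * a * s_mm
    return (1.0 + 0.25 * a) ** 2 / denominator


def line_parameters(w, h, h_top, c, er_sub):
    """Return (Z0 in ohms, effective permittivity) from C and the air-line C."""
    c_sub = capacitance_per_length(w, h, h_top, c, er_sub)
    c_air = capacitance_per_length(w, h, h_top, c, 1.0)
    return 1.0 / (C0 * math.sqrt(c_sub * c_air)), c_sub / c_air


if __name__ == "__main__":
    h = 0.78e-3
    z0, eps_eff = line_parameters(w=0.6e-3, h=h, h_top=50 * h, c=100 * h,
                                  er_sub=2.17)
    print(f"Z0 = {z0:.1f} ohm, eps_eff = {eps_eff:.3f}")
```

The characteristic impedance follows from the capacitance with and without the dielectric, Z0 = 1/(c0 sqrt(C C_air)), which is why the routine is evaluated twice.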
A comparison between the EM field solver using numerical methods and the
unified approach using the variational method in closed-form analytical
solution is shown in Figure 3.13; the physical parameters used are listed in
Table 3.1.
Table 3.1: Physical Parameters
Physical Parameter    Value
h (= b2)              0.78 mm
ho (= b3)             50 × b2
ε1                    1
ε2                    2.17
ε3                    1
w                     0.6
c                     100 × b2
L                     10 mm
Figure 3.13: Comparison between the unified approach using variational
method in closed-form analytical solution and EM field solver using
numerical methods
The modified boundary condition is based on the fact that vertical profiles
I and III correspond to the situation where an electric wall is on one side and
a magnetic wall is on the other. With the modified boundary condition and the
unified approach (variational method combined with the transverse transmission
line technique), the capacitance of the multilayer structure with side walls is
calculated using the Green’s function for the various boundary conditions at
the side walls, and the characteristic impedance of the whole structure follows
from it. Comparison with field solver simulation supports the accuracy of this
analysis.
3.3.3 Analysis using Conformal Mapping
The variational method in closed-form has been shown to enable fast and
accurate simulations. Conformal mapping, one of the analytical methods
that provide an exact solution, is an alternative for interconnect analysis.
Figures 3.14 and 3.15 show the characteristic impedance versus dielectric height
and conductor width of a single stripline model, using both conformal mapping
and variational methods. Figures 3.16 to 3.18 show the odd- and even-mode
impedances versus the structure dimensions for coupled striplines, and all the
results show good agreement between the conformal mapping and variational
methods.
Figure 3.14: Single line: characteristic impedance vs. dielectric height,
using conformal mapping and variational method
Figure 3.15: Single line: characteristic impedance vs. conductor width,
using conformal mapping and variational method
Figure 3.16: Coupled line: characteristic impedance vs. dielectric height,
using conformal mapping and variational method
Figure 3.17: Coupled line: characteristic impedance vs. conductor widths,
using conformal mapping and variational method
Figure 3.18: Coupled line: characteristic impedance vs. conductor spacing,
using conformal mapping and variational method
3.4 Example: High-Speed Double Data Rate (DDR)
Memory Interconnect
In this example, a high-speed double data rate (DDR) memory interconnect
with a branching network is simulated using the previously discussed analysis
methods.
Figure 3.19 shows a simplified picture of a dual-in-line memory module
(DIMM), with the transmitter (TX) on the memory controller sending the
signal through the motherboard trace to connector pins (also known as the
fingers), then to the module-to-package trace, and all the way down to the
receiver (RX) mounted on both sides of the DIMM. Looking inside from the
top view of the DIMM, the PCB module-to-package trace is split into two for
the so-called clamshell double-side assembly structure as shown in Figure 3.20
[16]. The module-to-package vias, package solder balls and bond wires can
be modeled using lumped elements, while all traces are modeled using causal
transmission line models by analytical methods using closed-form solutions.
The equivalent model in ADS is shown in Figure 3.21, and the S-parameters
simulated using the analytical method are compared with the ADS simulation
in Figure 3.22.
Figure 3.19: DIMM
Figure 3.20: Clamshell double-side assembly
Figure 3.21: ADS schematic of the equivalent model
Figure 3.22: Comparison of simulated S-parameter between analytical
methods and ADS
For commercial motherboards on the market, up to 4 slots of DDR DIMM
are common, as shown in Figure 3.23. The previous model was cascaded to
handle multiple branches and multiple slots of the DDR DIMM. The equivalent
model in ADS is shown in Figure 3.24, and the S-parameters simulated
using the analytical method are compared with the ADS simulation in Figures
3.25 to 3.28.
Figure 3.23: A motherboard with 4 slots for DIMM
Figure 3.24: ADS schematic of the equivalent model with 4 slots
Figure 3.25: Comparison of simulated S-parameter from Tx to Rx at slot 1
between analytical methods and ADS
Figure 3.26: Comparison of simulated S-parameter from Tx to Rx at slot 2
between analytical methods and ADS
Figure 3.27: Comparison of simulated S-parameter from Tx to Rx at slot 3
between analytical methods and ADS
Figure 3.28: Comparison of simulated S-parameter from Tx to Rx at slot 4
between analytical methods and ADS
To analyze the problem in the time domain, a single input bit pulse is launched
down the channel from Tx to Rx, as shown in Figure 3.29, and the single-bit
response (SBR) is displayed as a function of time, as shown in Figures 3.30
to 3.33.
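The idea of a single-bit response can be illustrated with a deliberately simplified channel. The sketch below replaces the branched DDR channel with a first-order low-pass (RC-like) response, which is an assumption made only to keep the example closed-form; the function names and the time constant are hypothetical, and the reflections visible in Figures 3.30 to 3.33 are not captured.

```python
import math


def single_bit_response(t, bit_time, tau):
    """Response of a first-order low-pass channel to a unit pulse of one bit."""
    if t < 0.0:
        return 0.0
    if t < bit_time:
        return 1.0 - math.exp(-t / tau)                 # charging during the bit
    peak = 1.0 - math.exp(-bit_time / tau)
    return peak * math.exp(-(t - bit_time) / tau)       # decaying tail (ISI)


def cursors(bit_time, tau, n_post=3):
    """Sample the SBR once per bit: main cursor followed by post-cursors."""
    return [single_bit_response((k + 1) * bit_time, bit_time, tau)
            for k in range(n_post + 1)]


if __name__ == "__main__":
    # Hypothetical numbers: 1 ns bit time, 0.5 ns channel time constant.
    main, *post = cursors(bit_time=1e-9, tau=0.5e-9)
    print(f"main cursor = {main:.4f}")
    print("post-cursors =", [f"{p:.4f}" for p in post])
```

The post-cursor samples are the residue that one bit leaves in the following bit intervals, which is the quantity of interest when the SBR is used to assess inter-symbol interference.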
Figure 3.29: Single-bit pulse
Figure 3.30: Single-bit response at slot 1
Figure 3.31: Single-bit response at slot 2
Figure 3.32: Single-bit response at slot 3
Figure 3.33: Single-bit response at slot 4
Similarly, a clock pattern input can be generated, as shown in Figure 3.34,
and sent through the channel from Tx to Rx. The clock pattern response is
important because, unlike SerDes applications, where an asynchronous embedded
clock generated by a clock recovery circuit is used, memory applications use a
forward-clock channel. Shown in Figures 3.35 to 3.38 are the clock pattern
responses (CPRs) at slot 1, slot 2, slot 3 and slot 4. More sophisticated and
random data patterns can be created and launched down the channel to see
the channel response for optimization purposes.
Figure 3.34: Clock pattern
Figure 3.35: Clock pattern response at slot 1
Figure 3.36: Clock pattern response at slot 2
Figure 3.37: Clock pattern response at slot 3
Figure 3.38: Clock pattern response at slot 4
CHAPTER 4
EYE DIAGRAM, JITTER AND NOISE
MEASUREMENT OF HIGH-SPEED DATA
LINKS
For high-speed serial data link analysis, a standard way to evaluate the signal
quality would be the eye diagram, also known as the eye pattern, which
is a time-folded representation of a signal that carries digital information.
Large eye openings ensure that the receiver (Rx) can reliably decide between
high and low logic states even when the decision threshold fluctuates or the
decision time instant varies.
Traditionally, eye diagram construction in real-time scopes is based
on hardware clock recovery and trigger circuitry. As the record length and
memory depth of oscilloscopes have increased over the years, an alternative
method called real-time eye rendering, which is based on long record
acquisition and software post-processing, has been adopted in modern high-speed
real-time oscilloscopes. The eye rendering consists of the following procedures
[17] performed internally inside the scope:
1. Capture the waveform record;
2. Determine the measured edge times;
3. Determine the edge labels (bit labels);
4. Determine the recovered edge times - clock recovery;
5. Slice the record into unit intervals, and overlay the segments;
6. Display the result as an eye diagram.
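The slicing and overlay steps can be sketched in a few lines. This is not the oscilloscope's internal algorithm: it assumes an ideal recovered clock with a constant, integer number of samples per unit interval, and the function names and the ideal NRZ test waveform are hypothetical.

```python
def nrz_waveform(bits, samples_per_ui):
    """Ideal NRZ record: each bit is held for samples_per_ui samples."""
    return [float(b) for b in bits for _ in range(samples_per_ui)]


def fold_into_eye(samples, samples_per_ui, ui_per_trace=2):
    """Slice a uniformly sampled record into traces that overlay as an eye.

    Consecutive traces start one UI apart and span ui_per_trace UIs, so
    every transition appears at the same horizontal position.
    """
    span = samples_per_ui * ui_per_trace
    traces = []
    for start in range(0, len(samples) - span + 1, samples_per_ui):
        traces.append(samples[start:start + span])
    return traces


if __name__ == "__main__":
    record = nrz_waveform([0, 1, 1, 0, 1], samples_per_ui=4)
    for trace in fold_into_eye(record, samples_per_ui=4):
        print(trace)
```

Plotting all traces on a common time axis gives the eye diagram; with a measured record, the per-UI boundaries would come from the recovered edge times rather than from a fixed sample count.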
As data rates are pushed higher, ensuring that the overall system achieves
a target bit error ratio (BER) and maintains high signal integrity has become
critical for system design. In high-speed digital systems, timing
uncertainties cause bit errors. The industry term for timing uncertainties in
digital transmission systems is jitter. Jitter analysis evaluates the
waveform in the horizontal direction, and is based on when the waveform
crosses a horizontal reference line. Similarly, along the vertical dimension,
the signaling uncertainties are called noise, which is measured based on a
vertical reference point, typically around 50%. Measuring both jitter and noise
enables a two-dimensional view of the system behavior.
4.1 Measurement Overview
To characterize the performance of a high-speed serial link in real time, a
high-bandwidth real-time scope is necessary. The probes and measurement system
need to be carefully designed for robust and accurate measurement. The
equipment used in this example is listed below:
• MSO-V334 Mixed Signal Oscilloscope 33 GHz 80 GSa/s
• N2803A 30GHz InfiniiMax III Series Probe Amplifier
• N2836A InfiniiMax III 26GHz Solder-in Probe Head
• N5443A Performance Verification and Deskew Fixture for InfiniiMax
III Probing System
The key measurement parameters include the rise/fall time, clock data
recovery rate, time interval error (TIE), de-emphasis, eye height, eye width,
random jitter, deterministic jitter (including periodic jitter, inter-symbol in-
terference and duty cycle distortion), total jitter, random noise, deterministic
interference (including periodic interference, bounded uncorrelated interfer-
ence) and total interference.
4.2 Probe Calibration
For any high-speed measurement, it is crucial to make sure the probes are
properly calibrated before they are soldered down on the DUT. It is rec-
ommended to re-calibrate the probe system whenever the probe headers are
newly connected to the probe amplifier. To properly calibrate the probe
system, follow the steps below:
1. With the 50 Ω SMA terminator attached, connect the SMA female
connector of the N5443A Deskew and Performance Verification Kit
to the Cal Out SMA male connector of the MSO-V334 Mixed Signal
Oscilloscope. Turn the nut on the Cal Out counter-clockwise to tighten.
For best connectivity, hold the fixture upright with one hand and use an 8
lb-in torque wrench to fully tighten the connector.
2. Attach the N2803A 30GHz InfiniiMax III Series Probe Amplifier to the
scope and screw it counterclockwise until it is securely connected, as
indicated by a click sound. The DC Cal LED light of the probe amplifier is
orange when it is newly connected to the scope and needs calibration.
Note that even though the LED light may be green after DC Cal has
been done once on that channel of the scope with the same type of
probe head, it is always recommended to re-calibrate the probe system
before the probe head is soldered down for new measurements.
3. Connect the N2836A InfiniiMax III 26GHz Solder-in Probe Head to the
probe amplifier. Insert the amplifier into the top of the fixture holder.
The amplifier can slide up and down in the holder to adjust the probe
head position, as shown in Figure 4.1.
Figure 4.1: N5443A performance verification and deskew fixture
4. On the deskew fixture, the center gold trace is signal and the large
plates on either side are both ground. Use the spring-loaded fingers to
clamp the probe head tip “+” lead to signal trace and “-” lead to the
ground, as shown in Figure 4.2. Press Autoscale on the front panel; a
stable step on screen should be observed if the probe head tip leads are
connected correctly.
Figure 4.2: Probe head tip leads: “+” to signal trace and “-” to ground
5. In the scope’s main software, Infiniium, click on Setup → Channel 1. The
Channel Configuration window should show up, as shown in Figure 4.3.
Click on the Probe. . . button.
Figure 4.3: Channel configuration
6. Inside the Probe Configuration window, a simple block diagram of the
probe system is shown. Immediately after the probe amp is plugged
in, the scope automatically recognizes the serial number of the probe
amp. However, the type of probe head used needs to be specified by
clicking the Select Head button, as shown in Figure 4.4.
Figure 4.4: Channel configuration
7. Choose the appropriate probe head model from the list and click OK,
as shown in Figure 4.5.
Figure 4.5: Channel configuration
8. Back in the Channel Configuration window (Figure 4.3), click on the
Probe Cal. . . button. Within the Probe Calibration window, there are
three subsections, as shown in Figure 4.6. Start by performing the DC
Attenuation/Offset Cal, carefully following the instructions, and then
perform the Skew Calibration as well as the AC Response Calibration. It is
necessary to allow 15 minutes for probe warmup before starting calibration.
Figure 4.6: Probe calibration
9. Once the calibration is successful, the DC Cal LED light of the probe
amplifier should turn green, indicating that the particular combination
of probe amplifier, probe head, and oscilloscope channel input has been
calibrated.
10. A quick sanity check before removing the calibration fixture should be
done by inspecting the waveform on that channel. With vertical scale
for the displayed channels set to 100 mV/div and horizontal scale to
1.00 ns/div, one should be able to see a waveform similar to that in
Figure 4.7, if calibration was performed successfully. Press Autoscale:
a repetitive square wave signal should be observed similar to that in
Figure 4.8.
Figure 4.7: Sanity check of successful calibration - 1
Figure 4.8: Sanity check of successful calibration - 2
4.3 Measurement Setting and Bandwidth
Considerations
After a successful calibration, the next step is to solder the probe head to
the designated location on the DUT. A good measurement cannot be made
without high-quality, precise soldering. When requesting board
rework from lab technicians, it is highly recommended to follow these few
guidelines:
1. Directly soldering down the probe head tip leads on the pad is preferred.
Avoid using any extra wires.
2. If the pad location is unreachable by the probe head tip lead, lab wire
with ∼ 8 mil diameter and less than 3 mm long is acceptable. Avoid
bending wires.
3. Avoid excessive solder, especially solder ball formations.
4. Keep the mini-axial lead resistors roughly parallel as shown in Figure
4.9, and use the tip wires on the mini-axial leads to get the desired
span.
Figure 4.9: Proper position of resistors
After the probes are calibrated and soldered down on the designated lo-
cations, connect the probe heads to the probe amps and complete the mea-
surement setup. Start configuring the scope setting by following the steps:
1. Press Default Setup on the front panel to set the scope to a known
state. Doing so will NOT erase the probe calibration data.
2. Power on the DUT, and press Auto Scale on the front panel for a
quick sanity check of the signal. Press the Run/Stop button on the
front panel and one should be able to see a waveform similar to that in
Figure 4.12.
3. Go to Setup → Bandwidth Limit, make sure the Global Bandwidth
Limit of the scope is set as Automatic. The upper bandwidth limit
should be bounded by the lowest hardware bandwidth, which is 26
GHz for the probe head.
4. To include measurement items, go to Measure→ Add Measurement; a
window should pop up as shown in Figure 4.10. In the setup section,
choose the measurement source, and - very importantly - change the
measurement thresholds by clicking the Thresholds button. For rise
and fall time measurements, choose the 20, 50, 80% of Top, Base defi-
nition, as shown in Figure 4.11. The measured waveform without BW
limitation, as well as all the selected measurement items are shown in
Figure 4.12.
Figure 4.10: Add measurement window
Figure 4.11: Measurement thresholds
Figure 4.12: Waveform without BW limitation
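The 20%/80% threshold definition chosen in step 4 can be illustrated with a short calculation. The following is a minimal sketch, not the scope's own algorithm: it assumes a single rising edge, known Base and Top levels, and linear interpolation between samples. The function names and the sample waveform are hypothetical.

```python
def crossing_time(times, volts, level):
    """Return the first time a rising edge crosses `level`.

    Linear interpolation between adjacent samples is assumed.
    """
    for i in range(1, len(volts)):
        v0, v1 = volts[i - 1], volts[i]
        if v0 < level <= v1:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def rise_time_20_80(times, volts, base, top):
    """20%-80% rise time with thresholds defined relative to Base and Top."""
    amplitude = top - base
    t20 = crossing_time(times, volts, base + 0.2 * amplitude)
    t80 = crossing_time(times, volts, base + 0.8 * amplitude)
    return t80 - t20


# Hypothetical edge: -0.4 V to +0.4 V in 50 ps, sampled every 12.5 ps (80 GSa/s).
times = [i * 12.5e-12 for i in range(9)]
volts = [-0.4, -0.4, -0.4, -0.2, 0.0, 0.2, 0.4, 0.4, 0.4]
tr = rise_time_20_80(times, volts, base=-0.4, top=0.4)
print(f"20-80% rise time: {tr * 1e12:.1f} ps")
```

For a perfectly linear edge, the 20%-80% rise time is 60% of the full transition time, which gives a quick sanity check on the threshold setting.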
Bandwidth Considerations
In general, to characterize a high-speed serial communication link with NRZ
coding, the system bandwidth of the scope and probes should satisfy certain
rules. A rule of thumb for the hardware bandwidth requirement, including
both the scope and probe system, is 3 times the bit rate (6 times the
fundamental frequency), which ensures that the 5th harmonic passes. Figures
4.13-4.16 show the effect of BW limitation on the waveform:
Figure 4.13: Waveform with 20 GHz BW limitation
Figure 4.14: Waveform with 12 GHz BW limitation
Figure 4.15: Waveform with 7 GHz BW limitation
Figure 4.16: Waveform with 3 GHz BW limitation
Clearly the system bandwidth limitation can adversely affect the key mea-
surement parameters (such as Rise Time, Fall Time, CDR rate, etc.) as well
as the waveform shape and consequently the eye diagram.
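The rule of thumb above can be checked numerically. The sketch below assumes the 5.4 Gb/s data rate used later in this chapter, treats an alternating 1010 NRZ pattern as an ideal square wave (odd harmonics only), and models the bandwidth limit as a brick-wall cutoff. These are simplifying assumptions for illustration; the function name is hypothetical.

```python
def nrz_bandwidth_check(bit_rate_bps, system_bw_hz):
    """Return (fundamental, recommended bandwidth, highest odd harmonic passed).

    Assumes an ideal brick-wall bandwidth limit and a 1010... NRZ pattern,
    whose fundamental frequency is half the bit rate.
    """
    fundamental = bit_rate_bps / 2.0
    recommended_bw = 3.0 * bit_rate_bps       # rule of thumb: 3x the bit rate
    n = int(system_bw_hz // fundamental)      # highest harmonic below the limit
    if n % 2 == 0:
        n -= 1                                # keep only odd harmonics
    return fundamental, recommended_bw, max(n, 0)


bit_rate = 5.4e9
results = {}
for bw in (20e9, 12e9, 7e9, 3e9):             # the limits used in Figures 4.13-4.16
    fund, rec, harmonic = nrz_bandwidth_check(bit_rate, bw)
    results[bw] = harmonic
    print(f"BW {bw / 1e9:4.0f} GHz: highest odd harmonic passed = {harmonic}")
print(f"fundamental = {fund / 1e9:.1f} GHz, recommended BW = {rec / 1e9:.1f} GHz")
```

Under these assumptions the recommended bandwidth is 16.2 GHz, which lies above the 5th harmonic at 13.5 GHz. Only the 20 GHz limit preserves the 5th harmonic, while the 7 GHz and 3 GHz limits pass little more than the fundamental.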
4.4 Jitter and Noise Measurement
To start jitter and noise measurements, follow the steps below:
1. Go to Analyze → Jitter/Noise (EZJIT Complete) to set up jitter and
noise measurements, as shown in Figure 4.17.
Figure 4.17: Jitter and noise measurement setup
2. Click on the Setup Wizard button and carefully read the instructions
before moving forward. Start by adjusting the vertical scale for all
active channels by clicking the Autoscale Vertical button as shown in
Figure 4.18. Typically a high-speed scope has only 256 (= 2^8) vertical
quantization levels; the process of quantization adds vertical noise
with a standard deviation of 1/√12 (about 0.29) of a quantization level
to the signal. To optimize the vertical dynamic range, it is important
to use the full range of the scope’s analog-to-digital converter (ADC).
Figure 4.18: Scope vertical scale setup
3. The next critical step for jitter and noise measurements is to choose
the appropriate RJ/RN separation method. When strong crosstalk
interference is present and bounded uncorrelated jitter (BUJ) needs to
be considered, use Spectral & Tail Fit. For this application, use
Spectral Only for both the RJ and RN methods as shown in Figure 4.19.
Figure 4.19: RJ/RN separation method settings
4. After the RJ/RN methods, it is important to choose the jitter
measurement method. Time interval error (TIE) is the preferred
method for jitter measurement; it is the difference in time
between an edge in the measured data and the corresponding edge in
the recovered clock. Choose Both for Edges and 50% for measurement
location as shown in Figure 4.20.
Figure 4.20: Measurement source type
5. The jitter analysis package EZJIT Complete can automatically detect
the presence and length of a cyclically repeating pattern. Choose
Periodic and Auto for pattern length, since the configured test
pattern in the PHY setting is PRBS7. Although the jitter/noise
decomposition and the bathtub curve creation do not require knowledge
of the target BER level, this parameter does determine the point at
which the eye opening and the total jitter are reported. Enter 1E-12 as
the target BER level as shown in Figure 4.21.
Figure 4.21: Test pattern and BER level setup
6. Clock recovery setup is fundamental to jitter measurement, since
serial communication signals carry their own timing information
without an external clock source. An incorrect clock recovery
configuration can cause failure to recover the clock signal, leading to
extremely high jitter results and a closed eye diagram. Most standard
serial communication links use a PLL-based clock data recovery (CDR)
circuit. In this example, use First Order PLL as the clock recovery
method, 5.40 Gb/s for the nominal data rate and 10 MHz for the loop
bandwidth, as shown in Figure 4.22.
Figure 4.22: Clock recovery setup
7. The voltage thresholds are the reference levels that define when
significant timing events occur. The threshold level can significantly
affect the measured jitter. For differential signaling, the voltage
threshold is usually defined to be 0 V. In the thresholds setting
window, choose Snap to 0 for the threshold level, and click on the Auto
set thresholds button as shown in Figure 4.23. By doing this, the scope
will lock the switching threshold at 0 V while adding hysteresis to
prevent false edges due to noise.
Figure 4.23: Voltage threshold setup
8. To minimize the effects of vertical noise on jitter measurements, it is
generally recommended to set the scope’s sample rate so that there are
about 3 to 5 samples per edge if possible. In this case, choose Set
maximum sample rate (80 GSa/s) and set the memory depth to be 4.61 Mpts,
as shown in Figure 4.24.
Figure 4.24: Acquisition setup
9. It is possible to manually remove the scope’s random jitter and random
noise by measuring the scope’s random noise at the current vertical
sensitivity, as shown in Figure 4.25; however, this step is optional.
Figure 4.25: Random noise calibration
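Two quantities from the steps above lend themselves to a quick numerical check: the quantization noise of an 8-bit digitizer and the time interval error (TIE). The sketch below is illustrative only. The 0.8 V full-scale range and the edge times are hypothetical, and the reference clock is an ideal constant-rate clock aligned to the first edge rather than the PLL-recovered clock that the scope uses.

```python
import math


def quantization_noise_rms(full_scale_volts, bits=8):
    """RMS quantization noise: one quantization level divided by sqrt(12)."""
    lsb = full_scale_volts / (2 ** bits)
    return lsb / math.sqrt(12.0)


def time_interval_error(edge_times, unit_interval):
    """TIE of each edge relative to an ideal clock with period `unit_interval`.

    Simplification: the ideal clock is phase-aligned to the first edge
    instead of being recovered with a PLL.
    """
    t0 = edge_times[0]
    tie = []
    for t in edge_times:
        n = round((t - t0) / unit_interval)   # index of the nearest clock edge
        tie.append(t - (t0 + n * unit_interval))
    return tie


noise = quantization_noise_rms(0.8)            # hypothetical 0.8 V full scale
ui = 1.0 / 5.4e9                               # unit interval at 5.4 Gb/s
edges = [0.0, 1 * ui + 2e-12, 3 * ui - 3e-12, 4 * ui + 1e-12]
tie = time_interval_error(edges, ui)
print(f"quantization noise: {noise * 1e3:.3f} mV rms")
print("TIE (ps):", [round(x * 1e12, 3) for x in tie])
```

The first result shows why using the full ADC range matters: for a fixed number of levels, a signal that fills only half of the range sees twice the quantization noise relative to its amplitude.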
4.5 Eye Diagram Measurement and Mask Test
In order to determine whether an eye diagram formally meets a specification
or measurement standard, it is important to perform eye mask
testing. Follow the steps below to start eye diagram analysis on the scope:
1. Go to Analyze → RTEye/Clock Recovery (SDA); a window called Se-
rial Data Analysis should pop up as shown in Figure 4.26.
Figure 4.26: Serial data analysis window
2. Click on the Setup Wizard button and carefully read the instructions
before moving forward. Start by adjusting the vertical scale for all
active channels by clicking the Autoscale Vertical button as shown in
Figure 4.27. This will optimize the vertical dynamic range to use the
full range of the scope’s digitizer. Select the clock recovery method by
choosing First Order PLL in the dropdown menu.
Figure 4.27: Vertical scale and clock recovery method
3. Configure the phase-locked loop (PLL) setting by entering the nominal
data rate of the signal as shown in Figure 4.28. In this application,
the nominal data rate is 5.4 Gb/s. To best mimic the receiver, the
loop bandwidth for the scope PLL clock emulation is set to 10 MHz,
which is the frequency below which the recovered clock is expected to
track the jitter on the data.
Figure 4.28: PLL configuration
4. The next important step is to set the Receiver Switching Threshold,
which is the level at which the clock switches. For differential signaling
applications, the threshold is usually defined to be 0 V. Choose Snap
to 0, and click Auto Set Thresholds, as shown in Figure 4.29. This
will lock the switching threshold at 0 V while adding hysteresis to
prevent false edges due to noise.
Figure 4.29: Receiver switching threshold setting
5. The next two steps in the setup wizard will let the user turn on the
Time Interval Error measurement relative to the recovered clock (Data
TIE) as well as the real-time eye display.
6. Finally, in the acquisition setting page, set the Memory Depth to
4.61 Mpts and the Sampling Rate to 80 GSa/s. Click Finish, and the
real-time eye diagram should look similar to that in Figure 4.30.
Figure 4.30: Real-time eye diagram
7. To load an eye mask from a file, go to Analyze→Mask Test, and select
Enable. Select the corresponding channel source, then click Load Mask
to select the mask file as shown in Figure 4.31.
Figure 4.31: Real-time eye diagram
8. In the bottom left corner of the mask test window, change Run Until to
Unit Intervals and enter 20000000 (20 MUI). It may take a few minutes for
the scope to run the mask test depending on the number of measurement
items and settings. Figure 4.32 shows the eye diagram after the eye mask
test is finished.
Figure 4.32: Eye diagram with eye mask test
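The acquisition settings in steps 6 and 8 can be related to each other with simple arithmetic. The sketch below estimates how many unit intervals one acquisition holds and how many acquisitions a 20 MUI mask test needs. It assumes that every unit interval in each acquisition counts toward the total, which may not match how the scope actually counts; the function name is hypothetical.

```python
import math


def acquisition_budget(sample_rate, memory_depth, bit_rate, target_ui):
    """Relate sample rate, memory depth and data rate to a mask-test length."""
    samples_per_ui = sample_rate / bit_rate
    capture_time = memory_depth / sample_rate         # seconds per acquisition
    ui_per_acquisition = capture_time * bit_rate
    acquisitions = math.ceil(target_ui / ui_per_acquisition)
    return samples_per_ui, capture_time, ui_per_acquisition, acquisitions


spu, t_cap, ui_acq, n_acq = acquisition_budget(
    sample_rate=80e9,      # 80 GSa/s
    memory_depth=4.61e6,   # 4.61 Mpts
    bit_rate=5.4e9,        # 5.4 Gb/s
    target_ui=20e6,        # 20 MUI
)
print(f"samples per UI:      {spu:.1f}")
print(f"capture time:        {t_cap * 1e6:.3f} us")
print(f"UIs per acquisition: {ui_acq:.0f}")
print(f"acquisitions needed: {n_acq}")
```

With these settings each unit interval is covered by roughly 15 samples, and one acquisition spans about 57.6 µs, which is consistent with the mask test taking many acquisitions to complete.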
4.6 Embedding and De-embedding
When performing measurements for high-speed digital designs, it is very com-
mon to encounter situations where the physical measurement fixture differs
from the desired configuration. It is critical for the engineer to be able to
observe what the waveform looks like at the specific locations, as well as to
apply “what if” scenarios where circuit or channel elements are changed from
those taken from the original acquisition. Fortunately, many modern high-
speed scopes with the latest software allow engineers to perform embedding
and de-embedding easily, making the following two tasks possible:
1. Relocating measurement point of the circuit or channel, due to me-
chanical or electrical considerations;
2. Removing and/or inserting channel elements for feasibility study.
To perform accurate channel embedding and de-embedding, the network
parameters of the channel components need to be measured with a properly
calibrated vector network analyzer (VNA), or simulated using appropriate
3D/2.5D full-wave EM solvers. Differential 2-port networks model the odd
mode of the signals while ignoring the even mode, and hence can only be
used to model highly balanced differential circuits where there is very
little coupling between the even and odd modes. When ultimate precision is
needed, it is better to use 4-port networks, which can capture the amount of
crosstalk between the coupled transmission lines, ensuring better accuracy
for differential channel modeling. While the scope may accept either 2-port
or 4-port network models for embedding/de-embedding, it is recommended
to use 4-port network models in most scenarios for differential signaling
applications.
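The limitation of the differential 2-port model can be seen from the standard mixed-mode conversion of a 4-port network. The sketch below assumes a port numbering in which ports 1 and 3 form the input pair and ports 2 and 4 form the output pair (through paths 1→2 and 3→4). This numbering and the S-parameter values are assumptions for illustration and are not taken from the measurements in this chapter.

```python
def mixed_mode_through(s21, s23, s41, s43):
    """Differential insertion term and mode-conversion term of a 4-port.

    Assumed port numbering: ports 1 and 3 are the input pair, ports 2 and 4
    the output pair. Works for real or complex values at one frequency.
    """
    sdd21 = 0.5 * (s21 - s23 - s41 + s43)   # differential in -> differential out
    scd21 = 0.5 * (s21 - s23 + s41 - s43)   # differential in -> common out
    return sdd21, scd21


# Balanced pair: both through paths identical, symmetric coupling.
sdd_bal, scd_bal = mixed_mode_through(s21=0.8, s23=0.05, s41=0.05, s43=0.8)

# Unbalanced pair: one through path is lossier than the other.
sdd_unb, scd_unb = mixed_mode_through(s21=0.8, s23=0.05, s41=0.05, s43=0.7)

print(f"balanced:   Sdd21 = {sdd_bal:.3f}, Scd21 = {scd_bal:.3f}")
print(f"unbalanced: Sdd21 = {sdd_unb:.3f}, Scd21 = {scd_unb:.3f}")
```

In the balanced case the mode-conversion term vanishes, so a differential 2-port model loses nothing. Once the pair is unbalanced, the conversion term is nonzero and only the 4-port model captures it.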
In practice, S-parameters are the preferred network parameter model for
high-frequency applications. It is worth mentioning that the frequency range
of the S-parameters plays an important role in the embedding/de-embedding
process. By default, the scope automatically limits the global measurement
bandwidth to the smallest of the bandwidths of the scope, the probe system,
and the S-parameter models. It is important to ensure the S-parameter models
extend to a sufficiently high frequency so that the embedding/de-embedding
will not be band-limited by the S-parameters. Also, be sure to
include the DC value when simulating the S-parameters. If low-frequency
data points are not available, extrapolation must be performed down to DC.
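One simple way to supply a missing DC point is sketched below. It linearly extrapolates the real part of S21 from the two lowest measured frequencies and forces the imaginary part to zero, since a real-valued impulse response requires a real DC value. This is only one possible scheme, chosen here for brevity; it is not the method used by the scope or by any particular solver, and the data values are hypothetical.

```python
def extrapolate_to_dc(freqs_hz, s21):
    """Prepend a DC point to an S21 frequency sweep if none is present.

    The real part is linearly extrapolated from the two lowest frequency
    points and clamped to 1.0 (a passive channel cannot have gain); the
    imaginary part is set to zero.
    """
    if freqs_hz[0] == 0.0:
        return list(freqs_hz), list(s21)
    f1, f2 = freqs_hz[0], freqs_hz[1]
    r1, r2 = s21[0].real, s21[1].real
    dc_real = r1 - (r2 - r1) * f1 / (f2 - f1)
    dc_real = min(dc_real, 1.0)
    return [0.0] + list(freqs_hz), [complex(dc_real, 0.0)] + list(s21)


# Hypothetical sweep that starts at 10 MHz instead of DC.
freqs = [10e6, 20e6, 30e6]
s21 = [0.98 - 0.01j, 0.96 - 0.02j, 0.94 - 0.03j]
freqs_dc, s21_dc = extrapolate_to_dc(freqs, s21)
print(freqs_dc[0], s21_dc[0])
```

The closer the lowest measured frequency is to DC, the less the result depends on the extrapolation scheme, which is one reason to start the sweep as low as the instrument allows.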
To perform on-scope real-time channel embedding, follow the steps below:
1. Choose the corresponding channel for embedding by going to Setup→
Channel: the channel configuration window should pop up as shown in
Figure 4.33.
Figure 4.33: Channel configuration window
2. In the lower right corner of the channel configuration window, within
the InfiniiSim section, choose 4 Port (Channel 3) in the dropdown
menu. Since each individual channel is probed differentially, there
is no need to refer one channel to another; i.e., do not choose 4 Port
(Channel 1&3) unless each channel is measured single-ended. Choose
Differential for port extraction, and click the Setup button.
3. If a transfer function has been created before, select the corresponding
.tf4 file by locating the file on the scope’s local drive or on a USB
flash drive, as shown in Figure 4.34. Within the InfiniiSim Setup window
there are options to configure the bandwidth limit and filter size.
To create a new transfer function .tf4 file, click on the button Create
Transfer Function from Model.
Figure 4.34: InfiniiSim setup window
4. To create the transfer function from model, it is necessary to choose the
proper configuration to represent the measurement and simulation cir-
cuit. InfiniiSim provides 13 different templates under the Application
Preset dropdown menu as shown in Figure 4.35.
Figure 4.35: Application preset for InfiniiSim model setup
5. In the Application Preset dropdown menu, choose the General Purpose
Probe configuration; the block diagram of the serial link is shown in
Figure 4.36.
Figure 4.36: InfiniiSim setup window with block diagram of the serial link
6. Click on the block to open up the InfiniiSim Block Setup window as
shown in Figure 4.37. The Measurement Circuit, as the name suggests,
represents the actual physical circuit that produced the measured wave-
form. The Simulation Circuit, on the other hand, models the hypothet-
ical electrical circuit that exhibits the desired electrical characteristics.
In other words, the measurement circuit is what the probe actually
measures while the simulation circuit is what one wishes the probe
would have been able to measure. One needs to specify the block type
for both the measurement circuit and the simulation circuit.
Figure 4.37: InfiniiSim block setup
If needed, it is possible to change the block name to something that
represents the details of the block. In the Simulation Block Port Type,
there are options to choose 2 Port, 2 Port Differential or 4 Port. In this
application, choose 4 Port for the block. For each block, one may define
it as Ideal Thru, Open, Probe Load, S-parameter File, or Combination
of Sub-circuits, depending on the need. In this example, Combination
of Sub-circuits was chosen and the relationship among the sub-circuits
is Cascade, while other options include Parallel and Series.
7. If a combination of sub-circuits is chosen, there will be 3 sub-circuit
blocks for the measurement circuit and the simulation circuit. Click
on each of the sub-circuit blocks to open up the InfiniiSim Sub-circuit
Block Setup window as shown in Figure 4.38. Similarly, the block type
can be configured as S-parameter File, Probe Load, Open, Ideal Thru
and Unused. If S-parameter File is used, choose the corresponding
Touchstone file from the scope’s local drive or USB flash drive. It is
critical to ensure the port assignment is in the correct order; otherwise
one may need to flip the model or renumber the four ports.
Figure 4.38: InfiniiSim sub-circuit block setup for S-parameter file
8. If Probe Load is chosen for the block, choose the corresponding
Touchstone file from the scope. The S-parameter file for the probe is a
2-port Touchstone file (.s2p) with ports 1 and 2 as thru. Even though the
symbol shown is a 4-port S-parameter block (this is a software bug), choose
the 2-port Touchstone file for the probe as shown in Figure 4.39.
Figure 4.39: InfiniiSim sub-circuit block setup for probe load
9. If there is a need to adjust the source and load impedances, click on
the resistor symbol at the transmitter side or the receiver end of the
serial link model: a window called Circuit Source & Load Impedances
will pop up as shown in Figure 4.40.
Figure 4.40: Source and load impedances setting
The default setup is 50 Ω for both the source and load impedances
for all ports, including measurement circuit and simulation circuit. To
adjust individual impedance, uncheck the Applies to all option and
type in the desired value for the specific resistor.
10. After all the blocks have been properly specified, it is always a good
idea to double check them before finishing. Move the mouse cursor
over the block and a quick summary of the measurement circuit and
the simulation circuit setup will appear as shown in Figure 4.41.
Figure 4.41: Quick view for individual block setting
11. If everything looks fine, choose the location where the .tf4 file will be
saved and click the Save Transfer Function button: a process indicator
pop-up window will appear as shown in Figure 4.42. It could take a
few minutes for the scope to compute the transfer function, depending
on the complexity of the channel as well as the bandwidth of the S-
parameter files.
Figure 4.42: Process indicator showing progress
12. If there is an error, a message will show where the error comes
from, and the transfer function cannot be saved until the error
is fixed. Figure 4.43 shows an example of an error where the S-parameter
file is missing.
Figure 4.43: Example of an error message
13. Once the transfer function has been successfully created, the process
indicator should change to a message reporting success, as shown
in Figure 4.44.
Figure 4.44: Message showing successful computation of transfer function
In the main window there should be a new section called InfiniiSim
showing the Frequency Response of the transfer function newly created.
As shown in Figure 4.45, the yellow curve represents the frequency
response of the embedding transfer function. As expected,
embedding (inserting) a passive lossy channel gives a frequency
response that is below 0 dB and decreases as frequency increases. The
green spectrum belongs to the live signal measured by the probe, while the
blue spectrum belongs to the simulated signal after applying the embedding
transfer function. Once again, it is expected that the simulated spectrum
has lower magnitude after embedding the passive lossy channel.
Figure 4.45: Frequency response of the transfer function for embedding
Besides the frequency response, it is important to check the time-
domain responses and ensure the filter size is large enough. In the
dropdown menu called Type, one may change Frequency Response to
Step Response and Impulse Response as shown in Figure 4.46 and Figure
4.47. If the impulse response and step response do not settle
within the time span of the filter, increase the Max Time Span
of the filter size in the InfiniiSim Setup window (see Figure 4.34).
Figure 4.46: Step response of the transfer function for embedding
Figure 4.47: Impulse response of the transfer function for embedding
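The Cascade relationship among sub-circuits selected in step 6 can be illustrated at a single frequency point. The sketch below uses the standard closed-form result for two cascaded 2-port networks that share the same reference impedance. It uses single-ended 2-port blocks for brevity, whereas the procedure above uses 4-port blocks, and it is not a description of how InfiniiSim computes its transfer function. All values are hypothetical.

```python
import math


def cascade_two_port(a, b):
    """Cascade two 2-port S-parameter sets (dicts with s11, s12, s21, s22).

    Port 2 of network `a` is connected to port 1 of network `b`.
    """
    d = 1.0 - a["s22"] * b["s11"]            # multiple-reflection term
    return {
        "s11": a["s11"] + a["s12"] * b["s11"] * a["s21"] / d,
        "s12": a["s12"] * b["s12"] / d,
        "s21": a["s21"] * b["s21"] / d,
        "s22": b["s22"] + b["s21"] * a["s22"] * b["s12"] / d,
    }


def to_db(s):
    """Magnitude of an S-parameter in dB."""
    return 20.0 * math.log10(abs(s))


# Two hypothetical matched, lossy blocks (e.g. a package trace and a PCB trace).
block_a = {"s11": 0.0, "s12": 0.5, "s21": 0.5, "s22": 0.0}
block_b = {"s11": 0.0, "s12": 0.7, "s21": 0.7, "s22": 0.0}
total = cascade_two_port(block_a, block_b)

db_a, db_b, db_total = to_db(block_a["s21"]), to_db(block_b["s21"]), to_db(total["s21"])
print(f"{db_a:.2f} dB + {db_b:.2f} dB = {db_total:.2f} dB")
```

For matched blocks the insertion losses simply add in dB, which is why the embedding response of a cascade of passive lossy elements stays below 0 dB. With mismatched blocks the multiple-reflection term adds ripple to the response.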
With embedding, the measurement point can be relocated from the middle
of the channel to other test points down the channel. As shown in Figure
4.48, the eye diagram measured at the middle of the channel can be used to
compute the eye diagram at the receiver by applying the embedding filter.
From the comparison, one can easily observe that the computed eye shape is
very close to the eye diagram measured at the receiver, even though
physically the eye diagram is measured at the middle of the channel. This is
useful when the target measurement point is physically hard to reach, while
there are other locations along the channel available for probing.
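Conceptually, applying the embedding filter amounts to convolving the acquired waveform with the impulse response of the transfer function. The sketch below shows this with a short hypothetical impulse response and adds a simple check of whether the response has settled within the filter span, in the spirit of the time-domain check described above. The settling criterion and all values are assumptions, not the scope's internal implementation.

```python
def apply_fir(waveform, impulse_response):
    """Convolve a sampled waveform with a finite impulse response."""
    out = []
    for n in range(len(waveform)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * waveform[n - k]
        out.append(acc)
    return out


def has_settled(impulse_response, tail_fraction=0.1, tolerance=0.01):
    """True if the last part of the response is below tolerance x peak."""
    peak = max(abs(h) for h in impulse_response)
    tail_len = max(1, int(len(impulse_response) * tail_fraction))
    return all(abs(h) <= tolerance * peak for h in impulse_response[-tail_len:])


# Hypothetical lossy-channel impulse response with unity DC gain.
h_embed = [0.6, 0.3, 0.1, 0.0]
step_in = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
step_out = apply_fir(step_in, h_embed)

print("embedded step:", [round(v, 3) for v in step_out])
print("full response settled:     ", has_settled(h_embed))
print("truncated response settled:", has_settled(h_embed[:2]))
```

The embedded step rises more slowly than the input, as expected for a lossy channel. The truncated response fails the settling check, which corresponds to the case where the filter time span needs to be increased.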
De-embedding follows a similar procedure. Effectively, de-embedding removes
channel components, and therefore the frequency response is the inverse of
that of the embedding filter, typically above 0 dB, as shown in Figure 4.49.
As with embedding, one may use de-embedding to relocate the measurement
point. Figure 4.50 shows that the eye diagram measured at the receiver can
be used to compute the eye diagram at the middle of the channel by applying
the de-embedding filter. De-embedding is also useful when some of the
channel components or measurement fixtures (such as connectors or cables)
need to be removed.
Figure 4.48: Eye diagrams comparison after applying the embedding filter
Figure 4.49: Frequency response of the transfer function for de-embedding
Figure 4.50: Eye diagrams comparison after applying the de-embedding filter
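The relationship between the embedding and de-embedding responses can be made concrete with a few frequency points. The sketch below takes the de-embedding response as the plain inverse of a hypothetical embedding response. A practical de-embedding filter would also be band-limited, for which the InfiniiSim Setup window offers a bandwidth limit option; that part is not modeled here.

```python
import math


def to_db(h):
    """Magnitude of a transfer-function value in dB."""
    return 20.0 * math.log10(abs(h))


def deembedding_response(embedding_response):
    """Ideal de-embedding response: the inverse of the embedding response."""
    return [1.0 / h for h in embedding_response]


# Hypothetical embedding response of a lossy channel at three frequencies.
h_embed = [0.9, 0.5, 0.2]
h_deembed = deembedding_response(h_embed)

embed_db = [to_db(h) for h in h_embed]
deembed_db = [to_db(h) for h in h_deembed]
for e, d in zip(embed_db, deembed_db):
    print(f"embed {e:7.2f} dB   de-embed {d:+7.2f} dB")
```

Each de-embedding value is the negative of the corresponding embedding value in dB, so the gain grows with frequency wherever the channel loss grows.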
CHAPTER 5
SUMMARY AND FUTURE WORK
5.1 Conclusion
In summary, the work presented in this thesis laid out a path toward the
knowledge needed to design and build the essential components of a simple
high-speed serial link (HSSL). Fully functional serializer and deserializer
circuits, as well as a transmitter output driver, equalizer, receiver, and
PLL-based CDR have been designed, created and simulated using Cadence
Virtuoso. A
complete channel that connects the transmitter and receiver through pack-
age to board, including the wire bonding, package trace, package via, solder
bump, PCB via, and PCB trace, has been successfully designed in HFSS and
simulated in Agilent ADS. The entire link system was integrated in Cadence
Virtuoso for performance analysis. In addition, techniques used for modeling
high-speed interconnects have been covered, with examples to help in
understanding the channel effects in modern high-speed digital systems. Finally,
the detailed process for eye diagram, jitter and noise measurement using a
high-speed real-time scope is provided to illustrate some of the real-world
issues many engineers will face when characterizing high-speed digital links.
With that, the author hopes all the examples presented in this thesis will
be helpful and educational for anyone who wishes to conduct research on or
pursue a career related to high-speed digital links.
5.2 Future Work
As for future work, there are many aspects of the entire SerDes system
that one may improve in order to meet industry standards, such as higher
speed, better signal integrity and lower power consumption. Fine-tuning of
components and optimization are needed to increase the robustness of the
entire high-speed serial link system and minimize the overall power
consumption. Furthermore, alternative designs or topologies for any sub-component
such as driver topologies (current-mode vs. voltage-mode), timing and signal-
ing techniques (single-ended vs. differential), and PLL-based CDR (classical
analog PLL vs. all digital PLL) could be candidates to explore. One may
conduct feasibility studies on how different topologies affect the overall sys-
tem performance, and replace the components of the existing design based on
the simulation result. Finally, more in-depth signal and power integrity anal-
ysis could be done in the future on the entire high-speed serial link system
in order to discover more advanced techniques to mitigate unwanted effects
such as crosstalk, inter-symbol interference (ISI) and jitter/phase noise.
REFERENCES
[1] C. Hagen, K. Khan, M. Ciobo, J. Miller, D. Wall,
H. Evans, and A. Yadav, “Big data and the cre-
ative destruction of today’s business models,” Jan. 2013.
[Online]. Available: https://www.atkearney.com/strategic-it/
ideas-insights/article/-/asset publisher/LCcgOeS4t85g/content/
big-data-and-the-creative-destruction-of-today-s-business-models/
10192
[2] D. Friedman, “International solid-state circuits conference 2014 tech
trends,” Feb. 2014. [Online]. Available: http://isscc.org/doc/2014/
2014 Trends.pdf
[3] D. Stauffer, J. Mechler, K. Dramstad, C. Ogilvie, A. Mohammad,
J. Rockrohr, and M. Sorna, High Speed Serdes Devices and Applications.
New York: Springer, 2008.
[4] W. Beyene and M. Aleksic, “A study of optimal data rates of high-speed
channels,” presented at DesignCon 2011. Santa Clara, CA, Jan. 2011.
[5] A. C. J. Rabaey and B. Nikolic, Digital Integrated Circuits: A Design
Perspective. Upper Saddle River, New Jersey: Prentice Hall, 2003.
[6] M. Horowitz, “Transmitter and receiver design,” 2000. [Online].
Available: http://www-classes.usc.edu/engr/ee-s/577bb/lect.15.pdf
[7] W. Dally and J. Poulton, Digital Systems Engineering. Cambridge
University Press, 1998.
[8] R. Crampagne, M. Ahmadpanah, and J.-L. Guira, “A simple method
for determining the green’s function for a large class of mic lines having
multilayered dielectric structures,” IEEE Transactions on Microwave
Theory and Techniques, vol. 26, no. 2, pp. 82–87, Feb. 1978.
[9] Y. Chang and I. Chang, “Simple method for the variational analysis of
a generalized n-dielectric layer transmission line,” Electronics Letters,
vol. 6, no. 3, pp. 49–50, Feb. 1970.
128
[10] R. E. Collin, Field Theory of Guided Waves. New York: McGraw-Hill,
1960.
[11] D. E. Vitkovitch, Field Analysis. Experimental and Computational
Methods. D. Van Nostrand, 1966.
[12] B. Bhat and S. K. Koul, “Unified approach to solve a class of strip and
microstrip-like transmission lines,” IEEE Transactions on Microwave
Theory and Techniques, vol. 30, no. 5, pp. 679–686, May 1982.
[13] C. Nguyen, Analysis Methods for RF, Microwave, and Millimeter-Wave
Planar Transmission Line Structures. New York: Wiley, 2003.
[14] R. Sharma and T. Chakravarty, Compact Models and Measurement
Techniques for High-Speed Interconnects. New York: Springer, 2012.
[15] B. Bhat and S. Koul, Stripline-like Transmission Lines for Microwave
Integrated Circuits. New York: Wiley, 1989.
[16] D. Wang, “A talk on memory buffers,” 2014. [Online]. Available:
http://www.cs.utah.edu/thememoryforum/wang.pdf
[17] D. Derickson and M. Mu¨ller, Digital Communications Test and Mea-
surement: High-Speed Physical Layer Characterization. Prentice Hall,
2007.
129
