Integrated Circuit Design for Hybrid Optoelectronic Interconnects by Zhu, Kehan
INTEGRATED CIRCUIT DESIGN FOR HYBRID
OPTOELECTRONIC INTERCONNECTS
by
Kehan Zhu
A dissertation
submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy in Electrical and Computer Engineering
Boise State University
December 2016
© 2016
Kehan Zhu
ALL RIGHTS RESERVED
BOISE STATE UNIVERSITY GRADUATE COLLEGE 
 
 
DEFENSE COMMITTEE AND FINAL READING APPROVALS 
 
 
of the dissertation submitted by 
 
 
Kehan Zhu 
 
 
Dissertation Title: Integrated Circuit Design for Hybrid Optoelectronic Interconnects 
 
Date of Final Oral Examination: 4 December 2016 
 
The following individuals read and discussed the dissertation submitted by student Kehan 
Zhu, and they evaluated his presentation and response to questions during the final oral 
examination.  They found that the student passed the final oral examination.  
 
John Chiasson, Ph.D.    Co-Chair, Supervisory Committee 
 
Vishal Saxena, Ph.D.    Co-Chair, Supervisory Committee 
 
Hao Chen, Ph.D.    Member, Supervisory Committee 
 
Wan Kuang, Ph.D.    Member, Supervisory Committee 
 
Subhanshu Gupta, Ph.D.   External Examiner 
  
The final reading approval of the dissertation was granted by John Chiasson, Ph.D., Co-
Chair of the Supervisory Committee, and Vishal Saxena, Ph.D., Co-Chair of the 
Supervisory Committee.  The dissertation was approved by the Graduate College. 
 
 
Dedicated to My Parents
ACKNOWLEDGMENTS
Looking back to when I arrived here in Boise, in 2012, I can remember those
years since with gratitude: gratitude to so many here, who helped me. I had graduated
with my Master’s Degree in China and had worked for four years before making the
decision to come to Boise State University to study for my PhD. I am so thankful
to have made it here. I wish I could name everyone who helped me, encouraged me,
warmly welcomed me, but there are too many of you wonderful people to name right
now. I must mention a few, however.
First, I would like to acknowledge my BSU advisor, Dr. Vishal Saxena, who
introduced me to a wonderful research topic. His teaching skills, multi-disciplinary
knowledge and unique insight inspired me throughout the process of my doctoral
research.
I’d also like to thank the ECE department at BSU for funding my PhD program
and the MOSIS educational program for providing the two electrical chip tapeouts.
Thanks to Sakkarapani Balagopal for assisting me during my first chip tape-out in
November of 2013. Thanks to Virginia Molina for helping with the grating coupler
alignment. And thanks to Dr. John Chiasson, Dr. Hao Chen and Dr. Wan Kuang,
on my supervisory committee, for their support and encouragement during my PhD
study.
I’m especially thankful to Hewlett-Packard Labs for providing the unique opportunity
to work as a research associate in their Palo Alto facilities from May 2015 through
May 2016. That was a great experience, contributing to a leading-edge project that
might contribute to HP’s future. Thanks, Cheng Li, for your mentoring and Marco
Fiorentino for your support there. Thanks also to all the members of the team; I
learned a great deal from you. Nan Qi and Kunzhi Yu, thank you for sharing your
knowledge of PCB design and chip testing.
I would also like to thank Ran Ding and Zhe Xuan. They were so helpful in
discussions at the early stage of the MZM Verilog-A model development.
Thanks also to Don Dutcher and Ann Dutcher, residents of Boise. They have
been so nice to me and willing to help me with anything during my stay in Boise. They
are just like my parents in America.
Last, and definitely not least, I gratefully thank my parents back in Xiangtan
City, Hunan Province, China, for their unconditional support.
ABSTRACT
This dissertation focuses on high-speed circuit design for the integration of hybrid
optoelectronic interconnects. It bridges the gap between electronic circuit design and
optical device design by incorporating compact Verilog-A models of optical
components into a SPICE-like simulation environment, such as the Cadence design
tools.
Optical components fabricated in the IME 130 nm SOI CMOS process are characterized,
and corresponding compact Verilog-A models for the Mach-Zehnder modulator
(MZM) device are developed. This approach makes electro-optical co-design and
hybrid simulation possible.
The developed optical models are used to analyze the system-level specifications
of an MZM-based optoelectronic transceiver link. Link power budgets for NRZ,
PAM-4 and PAM-8 signaling are simulated at the system level. The optimal
transmitter extinction ratio (ER) is derived from the minimum optical modulation
amplitude (OMA) required by the receiver.
A limiting receiver is fabricated in the IBM 130 nm CMOS process. With the receiver
wire-bonded side by side to a commercial high-speed InGaAs/InP PIN photodiode, we
demonstrate that the hybrid optoelectronic limiting receiver achieves a bit error
rate (BER) of 10^-12 with a sensitivity of -6.7 dBm at 4 Gb/s.
A full-rate, 4-channel parallel PRBS generator of length 2^9-1 is fabricated in the
IBM 130 nm SiGe BiCMOS process. Together with a 10 GHz phase-locked loop (PLL)
designed from the system architecture down to the transistor level, the PRBS is
demonstrated operating at more than 10 Gb/s. Lessons learned from high-speed PCB
design, in particular the signal integrity issues of PCB transmission lines, are
summarized.
TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Dissertation Organization
2 MZM Device Characterization and Behavioral Modeling
2.1 MZM Device Mechanism
2.2 Modeling for MZM
2.2.1 Grating Coupler
2.2.2 Silicon Waveguide
2.2.3 High-Speed Phase Modulator
2.2.4 Low-Speed Phase Modulator
2.2.4.1 Thermal Phase Modulator
2.2.4.2 PIN Phase Modulator
2.3 EO Co-Design Consideration
2.3.1 Current-Mode Drive
2.3.2 Voltage-Mode Drive
2.3.3 Velocity Mismatch
2.4 MZM Measurement and Behavioral Simulation
2.5 Summary
3 A Reconfigurable MZM Based Optical Link Budget Analysis
3.1 Derive OMA for Receiver
3.2 Determine ER for Transmitter
3.3 Correlation between Transmitter and Receiver
3.4 Reconfigurable MZM Transmitter Simulation
3.5 Summary
4 A Hybrid Optoelectronic Limiting Receiver
4.1 Receiver Architecture
4.1.1 Photodiode and Trans-impedance Amplifier
4.1.1.1 Gain and Bandwidth
4.1.1.2 Noise and Sensitivity
4.1.2 Limiting Amplifier
4.1.2.1 Gain Stages with Active Feedback
4.1.2.2 DC Offset Compensation
4.1.2.3 Large-Signal in Limiting Region
4.1.3 Output Buffer
4.2 Experimental Results
4.2.1 Eye Measurement
4.2.2 Bathtub and Sensitivity
4.3 Summary
5 A 10 GHz Phase Lock Loop Design
5.1 PLL Architecture
5.2 LC VCO Design
5.3 Loop Stability Analysis
5.4 Circuit Block Design
5.4.1 Phase Frequency Detector
5.4.2 Charge Pump and Passive Loop Filter
5.4.3 Frequency Divider
5.5 PLL Phase Noise Analysis
5.5.1 Noise Sources and Noise Transfer Function
5.5.2 Phase Noise Simulation
5.6 Experimental Results
5.7 Summary
6 A 10 Gb/s Full-Rate 4-Channel 2^9-1 Parallel PRBS
6.1 PRBS Principles
6.2 Transition Matrix Method and Correlation
6.3 Circuits Design and Simulation
6.3.1 Current Reference
6.3.2 CML DFF
6.3.3 Clock and Data Buffers
6.3.4 PRBS Startup
6.3.5 Creation of PAM Signaling
6.4 Experimental Results
6.4.1 Packaging and Socket
6.4.2 PCB Engineering
6.5 Summary
7 Conclusion
REFERENCES
APPENDIX A Verilog-A to Enable Optical Simulation
APPENDIX B Determine the PRBS Feedback Tap
APPENDIX C First Author Publications during 2013-2016
LIST OF TABLES
2.1 RLGC parameters at 10 GHz.
2.2 Parameter description used in Equation 2.9 and Equation 2.10; the corresponding values used for curve fitting in Figure 2.10 are listed.
4.1 TIA design parameters.
4.2 Comparison of the optoelectronic RX fabricated in 130 nm (SOI) CMOS process.
5.1 Loop filter parameters when fVCO = 11 GHz, N = 128, b = 25.57, c = 9, PM = 65°.
5.2 Simulated opppcres resistor and MIM capacitor characteristics at three corners in IBM8HP process.
5.3 Noise transfer functions between the PLL output and each noise source.
6.1 Comparison with recent PRBS generators published in JSSC.
6.2 QFN package electrical parasitics provided by the vendor.
6.3 Design parameters of single-ended and differential transmission lines made of RO4350B Rogers material.
6.4 Transient measurement for RF cable and three different transmission lines with 500 mVPP PRBS-7 pattern at 10 and 15 Gb/s, respectively.
LIST OF FIGURES
1.1 Simulated fT versus current density for four generations of IBM processes. The widths of the HBT and MOS devices are 5 µm and 15 µm, respectively. Minimum length is used.
2.1 Illustration of an MZM device and HSPM cross-section (not drawn to scale).
2.2 Optical power transmission characteristic of the MZM as a function of the phase difference with and without considering the insertion loss introduced by the optical components.
2.3 Layout of a PAM4 MZM device.
2.4 Measured transmission spectrum characteristic of the MZM device shown in Figure 2.3.
2.5 Fiber array alignment to grating couplers on the silicon photonic die fabricated as a part of this research.
2.6 Loss profiles of IME TE 1550 nm grating coupler tested with 22° polished angle fiber array.
2.7 Decompose HSPM model into electronic model and photonic model.
2.8 The object properties of a Verilog-A HSPM cell.
2.9 Concatenate 1000 HSPM cells in Cadence schematic.
2.10 Simulated and measured results comparison of (a) the change of relative phase shift and (b) pn junction capacitance as a function of the reverse-biased voltage on a 5 mm long HSPM.
2.11 Cross section of (a) thermal phase modulator and (b) p-i-n phase modulator (not drawn to scale).
2.12 DC characteristic of a 200 µm long thermal phase modulator.
2.13 Flowchart for the MZM-based transmitter design.
2.14 Schematic of the NRZ TX circuit to drive the MZM.
2.15 (a) CML dc transfer characteristic. (b) Vtail-input characteristic.
2.16 A segmented MZM device consists of fourteen lumped HSPM elements and PIN PM as dc phase device.
2.17 Illustration of the segmented MZM with lumped HSPM devices. Each segment has a dedicated push-pull driver.
2.18 Velocity mismatch simulation of NRZ and PAM-4 signaling at 32 G symbol rate (9.8 ps optical delay per segment).
2.19 Deterministic jitter of the ideal PAM-4 signal.
2.20 Eye pattern at 20 Gb/s with a 1 Vpp differential drive for a 5 mm MZM using a 1555 nm wavelength. (a) Compact model simulation result. (b) Measured result in [6].
2.21 Prototype of an MZM device wire-bonded on PCB. Fiber array is aligned on top of the chip.
2.22 Eye pattern at 12.5 Gb/s with a 1.8 Vpp differential drive for a 1.1 mm segment MZM in Figure 2.3. (a) Compact model simulation result. (b) Measured result.
3.1 A conceptual MZM based hybrid silicon photonic link block diagram.
3.2 MZM based optical link specification parameters.
3.3 Inverter based optical front-end receiver.
3.4 PAM-2/4/8 receiver sensitivity based on a 32 Gb/s TIA in 16 nm FinFET CMOS process (in,rms = 4 µA, Vth = 20 mV, ZTIA = 58 dBΩ, ρ = 0.8 A/W).
3.5 Latch-based level shifter with example of potential reliability issue at the start of the signal toggling. Voltage at node 3 may jump to 2*VDDH or −VDDL, which will stress the gates of INV1 and INV2.
3.6 A block diagram of the MZM segment driver.
3.7 Illustration of the reconfigurable MZM transmitter using segmented serpentine layout style for the proposed driver shown in Figure 3.6.
3.8 An example of the OE buffer from TSMC digital standard cell library.
3.9 Optical power transfer function of the MZM in a 130 nm SOI CMOS process with different lengths (an effective phase shift of 7.58°/mm is extracted when operating at 32 Gb/s).
3.10 MZM ER and IL versus length.
3.11 OMAPD versus ER with varying input laser power.
3.12 The overload current seen from the TIA versus ER.
3.13 Simulated NRZ modulation format eye diagrams with 5 segments and 9 segments, respectively, operating at 32 Gb/s.
3.14 Simulated PAM4 modulation format eye diagrams with 5+9, 4+7, 3+5 segment combinations operating at 64 Gb/s.
4.1 The architecture of the proposed hybrid optoelectronic receiver.
4.2 Limiting receiver design specification partition.
4.3 The schematic of the TIA with photodiode; triple-well NMOS is used for TIA core devices.
4.4 Post layout simulation results for TIA frequency response and group delay at nominal process corner, 40 °C.
4.5 Histogram plot from nominal process Monte Carlo simulation of a 1.2 V nfet biased at maximum fT, 40 °C, in IBM 130 nm CMOS process.
4.6 Post layout simulation eye diagram of TIA output at 4 Gb/s.
4.7 TIA input referred current noise spectrum from post layout simulation.
4.8 Bandwidth improvement factor and stage dc gain versus the number of stages for an achievable gain-bandwidth product of 20 GHz (20×1 GHz).
4.9 Topology of the limiting amplifier.
4.10 Illustration of dc offset compensation.
4.11 Schematic of the folded cascode gain-boosted opamp used in the dc offset compensation feedback loop.
4.12 A 2.5 GHz, 150 mVPP sinusoidal signal is amplified and becomes a more NRZ-like waveform with sharpened edges as it travels along the LA chain.
4.13 The schematic of the level shifted output buffer.
4.14 Post layout simulated eye diagram of the signal coming out from the output buffer at 4 Gb/s.
4.15 (a) Chip microphotograph. (b) Chip-on-board bonding to the PCB. (c) PCB setup. (d) Test setup in the Lab.
4.16 Block diagram of the test setup used for measuring the eye diagram and BER of the designed receiver.
4.17 Optical eye diagrams of the MZM output measured with PRBS-31 at 4 Gb/s with (a) 2 dBm and (c) 6 dBm laser power, at 5 Gb/s with (b) 2.8 dBm and (d) 6 dBm laser power; wavelength is set at 1550 nm.
4.18 (a)-(c) Electrical eye diagrams of a single-ended output of the limiting receiver measured with PRBS-31 from 4 Gb/s to 5 Gb/s. (d) Oscilloscope mode at 5 Gb/s with PRBS-7 pattern.
4.19 Power consumption breakdown of the limiting receiver.
4.20 BER bathtub measurement of the limiting receiver with PRBS-31 operating from 4 Gb/s to 5 Gb/s.
4.21 Sensitivity plot of the limiting receiver with PRBS-31 operating from 4 Gb/s to 5 Gb/s, BER versus MZM OMA and average power.
5.1 Schematic of the proposed type-II third-order PLL architecture.
5.2 Schematic of the LC VCO. Control bits C<2:0> control the capacitor bank for discrete frequency coarse tuning, C0 = 66 fF.
5.3 Two different VCO layout examples.
5.4 Layout extracted simulation results of VCO characteristics.
5.5 PLL model with possible noise sources.
5.6 Plot of unity loop bandwidth over zero versus capacitor ratio in the loop filter for 65° phase margin.
5.7 Schematic of the linear phase frequency detector.
5.8 State diagram of the linear phase frequency detector.
5.9 Schematic of charge pump.
5.10 Simulated MOS capacitor characteristics with varying gate voltages at three corners.
5.11 Schematic of tunable resistor.
5.12 Schematic of CML divider-by-2.
5.13 Schematic of TSPC divider-by-2.
5.14 Plots of PLL noise transfer function for each noise source.
5.15 Phase noise of each noise source introduced into the PLL.
5.16 PLL output noise due to individual noise sources.
5.17 A micrograph of the PLL.
5.18 Signal generator, signal analyzer and the prototype PCB.
5.19 VCO frequency versus capacitor control bits (66 fF incremental) when the PLL is in the lock state.
5.20 PSD of (a) the reference clock at 83 MHz and (b) a PLL/64 output signal at 166 MHz.
5.21 Phase noise of (a) the reference clock and (b) the PLL/64 signal.
5.22 Comparison of the measured and the simulated phase noises for PLL and free-running VCO.
6.1 System block diagram of the MZM based PAM-4 transmitter using the 4-channel parallel PRBS.
6.2 An n-stage PRBS generator with possible n and k combinations (adapted from [74]).
6.3 Single-ended version block diagram of the full-rate 4-channel 2^9 − 1 parallel PRBS.
6.4 Auto-correlation and cross-correlation of the 4-channel PRBS generator (signal amplitude rescaled to -1 and 1).
6.5 Output characteristic comparison of self biased wide swing NMOS cascode current mirror and BJT current mirror with beta helper and emitter degeneration.
6.6 Schematic of the BJT DFF employed in Figure 6.3.
6.7 Schematic of the XOR-merged DFF employed in Figure 6.3.
6.8 Schematics of (left) clock buffer employed in Figure 6.3 and (right) bias condition for DFF clock inputs.
6.9 Schematic of output buffer.
6.10 PRBS start-up process by enabling “set” signal.
6.11 Simulated eye diagrams of data pattern at 40 G Baud rate for NRZ, PAM-4/8/16.
6.12 Microphotograph of half of the fabricated die.
6.13 Picture of the prototype FR4 PCB.
6.14 Eye diagrams of PRBS output with the prototype FR4 PCB.
6.15 Block diagram of the FFE to process the signal.
6.16 Transmission line sample board made of RO4350B material. Total length of the transmission line including SMA footprints is 1.368 inch.
6.17 Cross-section of single-ended microstrip, GCPW and differential GCPW transmission lines.
6.18 Measured S-parameters of the sample transmission lines in Figure 6.16.
6.19 TDR-simulated characteristic impedance with 25 ps rise time, based on measured 10 MHz to 20 GHz GCPW S-parameters.
6.20 Picture of the prototype Rogers PCB.
6.21 Eye diagrams of PRBS output with prototype Rogers PCB.
6.22 System integration of electrical die and photonics die via side-by-side wire bonding (bond wires not drawn to scale).
LIST OF ABBREVIATIONS
– ER Extinction Ratio
– FEC Forward Error Correction
– LA Limiting Amplifier
– LFSR Linear Feedback Shift Register
– MRM Micro Ring Modulator
– MZM Mach-Zehnder Modulator
– NRZ Non-Return-to-Zero
– OMA Optical Modulation Amplitude
– PAM Pulse Amplitude Modulation
– PLL Phase-Locked Loop
– PPG Pulse Pattern Generator
– PRBS Pseudo-Random Bit Sequence
– PVT Process, Voltage and Temperature
– TIA Trans-Impedance Amplifier
CHAPTER 1
INTRODUCTION
High-speed, low-power, small form factor interconnects are increasingly in
demand in today’s large-scale computing and switching systems. For
example, contemporary data centers can use 4-lane 25 Gb/s optical transceivers
to achieve 100 Gb/s data transfer over different distances, depending on the cable
options. Ethernet and optical transport network (OTN) protocols have put 400G
physical layer technology development on the agenda in several emerging industry
standards, such as IEEE 802.3bs [1], ITU-T G.709 [2], MSA [3] and OIF [4], to
upgrade from the 100G standard. Advanced modulation schemes such as four-level
pulse amplitude modulation (PAM-4) are adopted in the newer standards to
achieve higher throughput in next-generation designs.
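As a rough illustration of the lane arithmetic above, the following Python sketch computes the raw aggregate rate of a multi-lane link as lanes × symbol rate × bits per symbol. The function name and the 25 GBd PAM-4 case are illustrative assumptions, not configurations taken from the cited standards, and all coding overhead is ignored.

```python
import math


def aggregate_rate_gbps(lanes: int, symbol_rate_gbaud: float, levels: int) -> float:
    """Raw aggregate bit rate in Gb/s, ignoring FEC and line-coding overhead.

    levels is the number of amplitude levels: 2 for NRZ, 4 for PAM-4.
    """
    bits_per_symbol = math.log2(levels)
    return lanes * symbol_rate_gbaud * bits_per_symbol


# 4 lanes of 25 Gb/s NRZ, as in the 100 Gb/s example in the text.
print(aggregate_rate_gbps(4, 25.0, 2))  # 100.0
# Hypothetical case: the same lanes and symbol rate with PAM-4 double the raw rate.
print(aggregate_rate_gbps(4, 25.0, 4))  # 200.0
```

The sketch makes the appeal of PAM-4 explicit: throughput doubles without raising the symbol rate, at the cost of reduced spacing between amplitude levels.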
Electrical I/Os have reached a bottleneck in efficiently raising the data rate
beyond 28 Gb/s per lane. This is true even for ultra-short-reach links, due to
the speed limitations of the switching devices and the complex circuits and
systems required to compensate for electrical channel loss and dispersion.
Taking the actual bias condition and parasitics into consideration, the maximum
achievable operating frequency of a reliable system is usually limited to
between 1/20 and 1/10 of the device’s cutoff frequency fT. Figure 1.1 compares
the simulated fT versus current density of single devices from four generations
of IBM processes. As the plot demonstrates, it is challenging to achieve
> 50 Gb/s/lane data transfer even using the IBM 130 nm SiGe BiCMOS process.
[Figure 1.1 plot: fT (GHz) versus current density (mA/µm^2, logarithmic axis from 10^-2 to 10^2). Curves: BiCMOS8HP 130 nm HBT, BiCMOS8HP 130 nm MOS, CMOS10LPE 65 nm MOS, CMRF8SF 130 nm MOS, CMRF7SF 180 nm MOS.]
Figure 1.1: Simulated fT versus current density for four generations of IBM
processes. The widths of the HBT and MOS devices are 5 µm and 15 µm, respectively.
Minimum length is used.
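The rule of thumb quoted above (a reliable system runs at roughly 1/20 to 1/10 of fT) can be turned into a quick estimate. The sketch below is only illustrative: the function name is hypothetical, and the 200 GHz peak fT is an assumed round number, not a value read from Figure 1.1.

```python
def usable_frequency_range_ghz(ft_ghz: float) -> tuple[float, float]:
    """Return (low, high) operating-frequency estimates in GHz.

    Applies the rule of thumb that a reliable system runs at roughly
    1/20 to 1/10 of the device cutoff frequency fT.
    """
    return ft_ghz / 20.0, ft_ghz / 10.0


# Assumed example value for a peak fT (hypothetical, for illustration only).
low, high = usable_frequency_range_ghz(200.0)
print(f"{low:.1f} GHz to {high:.1f} GHz")  # 10.0 GHz to 20.0 GHz
```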
Recently, silicon-based photonic integration together with advanced modulation
formats has emerged as a promising solution to meet these I/O design requirements.
It has opened wider opportunities for circuit designers to exploit photonic
devices for high-speed signal processing, and has also posed challenges in
electrical-optical co-design, packaging and testing. For a given lithographic
generation, e.g. a monolithic photonic integrated circuit (PIC) transceiver in a
130 nm SOI CMOS process, the achievable speed of the system is limited by the
CMOS circuitry to about 10 Gb/s/lane [5]. In contrast, a Mach-Zehnder modulator
(MZM) fabricated in a silicon photonic platform, also in 130 nm SOI CMOS, can
achieve up to 40 Gb/s when driven by external test equipment [6]. Comparable
high-speed electrical devices (MOS or HBT) are not available for integration on
the same die in commercial silicon photonic process platforms; designers
therefore have to rely on an advanced CMOS or BiCMOS process to design faster
electrical circuitry and then interface it with the photonic chip. Packaging is
also a challenge: the bonding options for the electrical die and the silicon
photonic die are critical to the overall signal integrity, and the optical
packaging must be suitable for high-volume production.
1.1 Motivation
Designing CMOS or BiCMOS circuits to interface with silicon photonic devices
requires a high level of optoelectronic integration. In the past, silicon photonic
devices were typically designed by engineers trained in the field of optics, using
specialized device simulators and field solvers that are not compatible with the
traditional IC design flow. There are commercially available tools for photonic
device and circuit simulation, such as the Lumerical computational solutions [7].
Recently, Lumerical demonstrated an optical system-level design tool called
Lumerical Interconnect [7]. However, such a tool is specific to photonic
integrated circuit (PIC) simulation and cannot be employed for hybrid
optoelectronic system simulations, since a SPICE-like solver is required for
transistor-level circuit simulations. Thus, IC designers who design circuits for
integration with silicon photonics need compact photonic device models that
encapsulate both electrical and optical properties for Electrical-Optical (EO)
hybrid circuit simulation. Verilog-A, a hardware description language developed
for behavioral modeling of analog circuits, is a good candidate for addressing
this essential need [8]. There has been a general lack of such compact models for
integrating silicon photonic devices with CMOS electronics, which hinders EO
co-design simulation and link budget analysis.
1.2 Contributions
This dissertation focuses on the design, analysis and hardware implementation of
high-speed integrated circuits for optical interconnects. Specifically it addresses:
• Bridging the gap between electrical circuit design and optical device design
by creating compact optical device models using Verilog-A.
• A systematic optical link power budget analysis, which can optimize the
system-level specification for the NRZ and PAM signaling formats. This is
used to further guide the circuit-level design and energy-efficient optical system
development.
• The design of high-speed circuit blocks for optical receivers and transmitters.
• A high-frequency PCB design for maintaining signal integrity.
1.3 Dissertation Organization
The rest of the dissertation covers MZM device design, characterization and
behavioral modeling, system-level optical link budget analysis, and two electrical
chips for the receiver and transmitter, respectively. Chapter 2 characterizes the optical
components fabricated in a 130 nm SOI CMOS process platform along with their
compact behavioral modeling. Chapter 3 presents an MZM-based link power budget
analysis, and proposes an NRZ/PAM-4 reconfigurable optical transmitter based on
voltage-mode drivers with a segmented MZM device. Chapter 4 showcases a hybrid
optoelectronic limiting receiver design using the IBM 130 nm CMOS process and
an InGaAs/InP PIN photodiode device. Chapter 5 and Chapter 6 detail a high-speed
type-II third-order charge-pump PLL design and a full-rate, 4-channel, 2^9 − 1 length
parallel PRBS design, respectively. The PRBS is clocked by the PLL designed on
the same chip in the IBM 130 nm SiGe BiCMOS process. Chapter 7 discusses future
directions for this research and concludes the dissertation.
CHAPTER 2
MZM DEVICE CHARACTERIZATION AND
BEHAVIORAL MODELING
Optical devices such as lasers, modulators and detectors are typically designed by
engineers with a background in optics, using specialized device simulators and field
solvers that are not compatible with the traditional IC design flow. There are
commercially available tools for optical device and circuit simulation, such as Lumerical
computational solutions [7], PhoeniX software [9] and COMSOL [10]. Recent progress
has been made with optical system-level design tools such as Lumerical's Interconnect.
However, such tools are specific to photonic integrated circuit (PIC) simulation and
cannot be employed for hybrid optoelectronic system simulation, where a SPICE-like
solver is required for transistor-level circuit simulations. The photonic process design kits
(PDKs) provided by contemporary commercially accessible silicon photonic
foundry services, such as IME in Singapore [11] and ePIXfab from Europractice [12],
also do not provide models for co-simulation with electronic process platforms such as
standard CMOS technologies. Thus, IC designers who design circuits for integration
with optical devices need compact device models that encapsulate both electrical
and optical properties for Electrical-Optical (EO) hybrid circuit simulation.
Verilog-A, a hardware description language developed for behavioral modeling of
analog circuits, is a good candidate for addressing this essential need [8].
This chapter focuses on modeling one type of electro-optic modulator, the
Mach-Zehnder modulator (MZM) [6]. The MZM is by far the most reliable
indirect optical modulator in the silicon photonic platform, although its footprint is large
compared to that of the micro-ring modulator (MRM) [13]. It therefore requires relatively more
power from the modulator drivers. The working mechanism of the MZM device is explained
first, followed by the development of a Verilog-A model for behavioral simulation. Driving
options are then discussed for lumped-element and traveling-wave
modulators, and the performance of the NRZ and PAM-4 signaling methods is analyzed.
2.1 MZM Device Mechanism
A Mach-Zehnder modulator device consists of optical components including a
high-speed phase modulator (HSPM), a thermal phase modulator (TPM) or p-i-n phase
modulator (PIN PM), and grating couplers. As illustrated in the cross-section in
Figure 2.1, the HSPM is a lateral p-n diode ridge waveguide, which is the key optical
component that determines the operating speed of the MZM device. The doped ridge
waveguide consists of a lightly doped p-n junction and heavily doped p++ and n++
implants for contact [6, 14]. Intermediate-density p+ and n+ regions can be added
in between to reduce the series resistance without inducing excessive optical loss [15].
Dynamic optical phase modulation is induced by applying a reverse bias voltage
to the p-n diode, which changes the refractive index in the depletion region. A DC
optical phase shift can be introduced by an asymmetric waveguide, a TPM or a PIN
PM. The TPM relies on resistive heating to induce a phase shift in a length of
waveguide, while the PIN PM uses carrier injection to change the refractive
index. The grating couplers serve as optical I/Os, and they need to be placed some
distance away from the core device to make sure that the fiber alignment does not
collide with the bondwires or the chip package.
The continuous-wave light from the source is split evenly into the two HSPM arms.
The electric field produced by the reverse bias voltage applied to each HSPM arm
changes the carrier density, which in turn induces a phase shift as the optical wave
propagates along the arm. Depending on the relative E-field polarity applied to the
HSPM arms, the light from the two paths interferes either constructively or
destructively when it is combined at the output. The phase modulation is thus
converted into intensity modulation at the combiner. Neglecting the insertion loss of
all optical components, the optical power transfer function (Topt) of the MZM can be
derived as shown in Equation 2.1 [14, 16]. Here Pin and Pout are the input laser power
and the MZM output power, respectively, and φ1 and φ2 are the absolute phases of the
two arms (HSPM + PIN PM). In reality, however, every optical component introduces
insertion loss. For accurate modeling, all of these non-ideal effects need to be
included in the model. A more accurate Topt expression is given in Equation 2.2 [17].
In this model, k is the mismatch factor between the two arms, which deviates from 0.5
in practice. The optical powers of the two branches before entering the combiner are
represented by P1 and P2, which are given in dB scale by Equations 2.3 and 2.4,
respectively. The two most significant insertion losses are introduced by the grating
coupler (ILGC) and the HSPM (ILHSPM). Losses introduced by other optical elements,
such as the Y-junctions (ILY−junc.), the silicon waveguides (ILWG) and the low-speed
phase modulator (ILLSPM), also need to be included in the MZM model. Moreover, the
insertion loss of the phase modulators can be partitioned into static and dynamic parts.
Topt versus phase difference, derived from the models with and without insertion loss,
is plotted in Figure 2.2. It can be observed that the two models yield significantly
different extinction ratios (ER) and average powers, which will impact the optical link
analysis. Further, the dc phase operating point φdc should be set at the quadrature
bias point (90°) to achieve symmetric modulation. This can be achieved by using
either an extra length (about 100 µm) of waveguide or a PIN phase modulator in one
of the arms. To save power, the asymmetric-length waveguide
and the PIN phase modulator can be used together.
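[Figure 2.1 drawing. Top-view labels: grating coupler, un-doped waveguide, thermal or p-i-n PM, high-speed PM (P and N electrodes), min. gap required, section line A-A`. Cross-section A-A` labels: n++, n, p, p++ regions with contacts, buried oxide, substrate.]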
Figure 2.1: Illustration of a MZM device, HSPM cross-section, not drawn
to scale.
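$$T_{opt} = \left|\frac{P_{out}}{P_{in}}\right|^{2} = \frac{1 + \cos(\phi_1 - \phi_2)}{2} \qquad (2.1)$$

$$T_{opt} = \frac{P_1 k + P_2 (1-k) + 2\sqrt{P_1 P_2 k (1-k)}\,\cos(\phi_1 - \phi_2)}{P_{in}\,10^{\frac{IL_{Y-junc.} + IL_{WG} + IL_{GC}}{10}}} \qquad (2.2)$$

$$P_1|_{dBm} = P_{in}|_{dBm} - IL_{GC} - IL_{WG} + 10\log_{10}k - IL_{HSPM} - IL_{LSPM} \qquad (2.3)$$

$$P_2|_{dBm} = P_{in}|_{dBm} - IL_{GC} - IL_{WG} + 10\log_{10}(1-k) - IL_{HSPM} - IL_{LSPM} \qquad (2.4)$$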
[Figure 2.2 plot. x-axis: φ1 − φ2 (°), 60 to 120; y-axis: Topt, 0 to 0.8. Curves: w/ insertion loss, w/o insertion loss.]
Figure 2.2: Optical power transmission characteristic of the MZM as a
function of the phase difference with and without considering the insertion
loss introduced by the optical components.
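As a rough numerical illustration of Equations 2.1 to 2.4, the Python sketch below evaluates the ideal and the loss-aware transfer functions. The function names are hypothetical and the code is not the Verilog-A model of the appendix; it only transcribes the equations as written, with powers handled in dBm and mW and losses in dB.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def t_opt_ideal(dphi):
    """Equation 2.1: lossless transfer function for a phase difference dphi (rad)."""
    return (1.0 + math.cos(dphi)) / 2.0


def branch_power_dbm(pin_dbm, split, il_gc, il_wg, il_hspm, il_lspm):
    """Equations 2.3/2.4: branch power in dBm; split is k or (1 - k), losses in dB."""
    return pin_dbm - il_gc - il_wg + 10.0 * math.log10(split) - il_hspm - il_lspm


def t_opt_lossy(dphi, pin_dbm, k, il_gc, il_wg, il_yjunc, il_hspm, il_lspm):
    """Equation 2.2: transfer function with split mismatch and insertion loss."""
    p1 = dbm_to_mw(branch_power_dbm(pin_dbm, k, il_gc, il_wg, il_hspm, il_lspm))
    p2 = dbm_to_mw(branch_power_dbm(pin_dbm, 1.0 - k, il_gc, il_wg, il_hspm, il_lspm))
    pin = dbm_to_mw(pin_dbm)
    numerator = (p1 * k + p2 * (1.0 - k)
                 + 2.0 * math.sqrt(p1 * p2 * k * (1.0 - k)) * math.cos(dphi))
    denominator = pin * 10.0 ** ((il_yjunc + il_wg + il_gc) / 10.0)
    return numerator / denominator
```

With all losses set to zero and k = 0.5, `t_opt_lossy` reduces to `t_opt_ideal`, which is a quick consistency check between Equations 2.1 and 2.2. A call such as `t_opt_lossy(math.pi / 2, 0.0, 0.48, 5.0, 0.5, 0.3, 3.15, 0.5)` uses made-up loss values; only the 5 dB grating coupler loss and the 0.63 dB/mm × 5 mm HSPM loss fall in the ranges quoted in this chapter.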
The MZM figure of merit, VπLπ, in units of V·cm, is defined as the product of the
drive voltage and the MZM length required to produce a phase shift of π. It is
expressed in Equation 2.5, in which ∆φMZM is the actual phase shift produced by a
drive voltage Vdrv applied to an MZM device of length LMZM.

$$V_{\pi}L_{\pi} = \frac{\pi}{\Delta\phi_{MZM}} V_{drv} L_{MZM} \qquad (2.5)$$
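Equation 2.5 can be evaluated directly from a single phase-shift measurement. The sketch below is only illustrative: the function name and the example numbers are hypothetical, and the magnitude of the phase shift is used so that a negative (reverse-bias) phase change still gives a positive figure of merit.

```python
import math


def vpi_lpi(delta_phi_rad, v_drv, length_cm):
    """Equation 2.5: VpiLpi in V*cm from a measured phase shift (rad)."""
    if delta_phi_rad == 0:
        raise ValueError("phase shift must be non-zero")
    return math.pi / abs(delta_phi_rad) * v_drv * length_cm


# Hypothetical measurement: 0.7 rad of phase shift for a 1 V drive on a 0.5 cm arm.
example_vpi_lpi = vpi_lpi(0.7, 1.0, 0.5)
```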
Figure 2.3 shows the layout of the PAM-4 traveling-wave MZM device. A 200
µm long thermal phase modulator is used in this case, which requires about 8-10 mA
of current to bias the phase at the quadrature point (90°). Ideally, the transmission
spectrum of a symmetric MZM without any added dc phase bias should be flat.
However, mismatches always exist between the two branches, so the measured
transmission spectrum of the symmetric MZM has characteristics similar to those of the
asymmetric MZMs in [18, 6, 19, 20], but with less voltage dependency. This is plotted
in Figure 2.4. The free spectral range (FSR) is about 6 nm.
[Figure 2.3 layout image. Labels: thermo heater, grating coupler; dimensions 1.9 mm and 1.1 mm.]
Figure 2.3: Layout of a PAM-4 MZM device.
Figure 2.4: Measured transmission spectrum characteristic of the MZM
device shown in Figure 2.3.
2.2 Modeling for MZM
The Verilog-A language is chosen for the model development because it supports
co-simulation with transistor-level circuitry in the Cadence design tool platform. However,
because Verilog-A does not support complex numbers, the optical power and phase
are processed separately. In order to display optical power (OptPower)
and optical phase (OptPhase) in units of watts and radians, respectively, their
natures must be added explicitly as a Verilog-A discipline [21]. The optical discipline
is given in Appendix A.1. With the optical power and phase discipline and natures defined,
an optical source block can be built as a general cell that converts a voltage to
optical power and optical phase quantities. The corresponding Verilog-A code is given
in Appendix A.2. The MZM shown in Figure 2.1 can be partitioned into four
sub-blocks in the silicon photonic process platform: the grating coupler, the un-doped
waveguide used for optical routing and Y-junctions, the HSPM, and the PIN PM. Each
optical component is presented in the following sections.
2.2.1 Grating Coupler
Grating couplers bring the optical signal into or out of the optical chip, typically
interfacing externally to a fiber array as shown in Figure 2.5. The other interface
of the grating coupler is in the plane of the wafer and connects through a 500 nm
wide rib (fully etched) waveguide to the rest of the optical devices. There are two
types of grating couplers: the single-polarization grating coupler (SPGC) and the
polarization-splitting grating coupler (PSGC). As a rule of thumb, for an IME TE SPGC
aligned with a fiber array with a 22° lid polish angle, the aligned fiber array is about 35 µm
above the die and the SPGC is about 250 µm behind the front edge of the fiber array.
The loss profile in decibel scale is expressed in Equation 2.6 with the parameters
peak loss (Losspeak), peak wavelength (λpeak) and -3 dB wavelength (λ−3dB). Figure
2.6 shows four sets of measured SPGC data along with a Verilog-A model simulation
result. The simulated result matches the measured results well. The typical peak
loss is measured to be about 4.5-6.5 dB per SPGC at different wavelengths, depending on
the fiber array tilt angle and the actual height above the die.
$$Loss = -Loss_{peak} - \left(\frac{\lambda - \lambda_{peak}}{\lambda_{-3dB}/\left(2\sqrt{3}\right)}\right)^{2} \qquad (2.6)$$
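A minimal Python transcription of Equation 2.6 is sketched below to show the shape of the loss profile. The function name and the parameter values (5 dB peak loss, 1550 nm peak wavelength, 40 nm for λ−3dB) are assumptions chosen for illustration, not fitted values from the measurements.

```python
import math


def grating_coupler_loss_db(lam_nm, loss_peak_db, lam_peak_nm, lam_3db_nm):
    """Equation 2.6: coupler transmission in dB (negative values mean loss)."""
    sigma = lam_3db_nm / (2.0 * math.sqrt(3.0))
    return -loss_peak_db - ((lam_nm - lam_peak_nm) / sigma) ** 2


# Assumed parameters: 5 dB peak loss at 1550 nm and a 40 nm value for lambda_-3dB.
for lam in (1530.0, 1550.0, 1570.0):
    print(lam, round(grating_coupler_loss_db(lam, 5.0, 1550.0, 40.0), 2))
```

By construction, the quadratic term equals 3 dB when the wavelength is offset from the peak by half of λ−3dB, so λ−3dB acts as the full width between the two -3 dB points of the profile.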
Figure 2.5: Fiber array alignment to grating couplers on the silicon
photonic die fabricated as a part of this research.
Figure 2.6: Loss profiles of IME TE 1550 nm grating coupler tested with
22◦ polished angle fiber array.
2.2.2 Silicon Waveguide
Si waveguides in SOI technology are made possible by the high refractive index contrast
between silicon and silicon dioxide. When used for optical signal routing, the waveguide
introduces negligible insertion loss, but it does introduce a phase shift and a propagation
delay that cannot be ignored in MZM design. A Verilog-A compact model can capture the
length- and temperature-dependent phase shift shown in Equation 2.7, as well as an
insertion loss customized for single-mode and multi-mode waveguides, bends of different
radii and tapers. In Equation 2.7, nSi is the effective refractive index of silicon, and the
floor function removes the integer multiples of 2π. The equation also includes the thermo-optic
coefficient of the refractive index (ndT), which is about 1.86 × 10^−4/°C [22]. The
propagation delay is the waveguide length divided by the group velocity (vg), as
expressed in Equation 2.8, in which c and ng are the velocity of light in vacuum
and the group index of silicon, respectively. For simplicity and without significant
loss of accuracy, the refractive index and group index can be approximated as constant
values for operation over a narrow wavelength range.
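$$\Delta\phi = 2\pi\left(\frac{(n_{Si} - n_{dT}\cdot(T - T_0))L}{\lambda} - \mathrm{floor}\left(\frac{(n_{Si} - n_{dT}\cdot(T - T_0))L}{\lambda}\right)\right) \qquad (2.7)$$

$$t_d = \frac{L}{v_g} = \frac{L n_g}{c} \qquad (2.8)$$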
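Equations 2.7 and 2.8 translate into a few lines of code. The sketch below is a Python stand-in for the behavior described above, not the Verilog-A cell itself; the function names, the reference temperature of 25 °C and any index values passed in are assumptions for illustration.

```python
import math

C_VACUUM = 3.0e8  # speed of light in vacuum (m/s), rounded as in the text


def waveguide_phase_rad(length_m, lam_m, temp_c, n_si, n_dt=1.86e-4, t0_c=25.0):
    """Equation 2.7: phase shift wrapped into [0, 2*pi)."""
    cycles = (n_si - n_dt * (temp_c - t0_c)) * length_m / lam_m
    return 2.0 * math.pi * (cycles - math.floor(cycles))


def waveguide_delay_s(length_m, n_g):
    """Equation 2.8: propagation delay from the group index."""
    return length_m * n_g / C_VACUUM
```

For example, `waveguide_delay_s(3e-3, 4.0)` gives 40 ps for a 3 mm waveguide if a group index of 4 is assumed, which corresponds to a group velocity of 7.5 × 10^7 m/s.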
2.2.3 High-Speed Phase Modulator
The HSPM model is decomposed into an electronic model and an optical model, as shown
in Figure 2.7. The electronic model can be treated either as the S-parameters of a short
section of transmission line (T-line) or as a distributed RLGC network in which the
electrical propagation delay is designed to match the optical propagation delay. The
latter is used here, in which Rtl is the frequency-dependent metal skin resistance. It
changes from 2.5 Ω/mm at very low frequencies up to 10 Ω/mm at 40 GHz. However, it is
kept constant (e.g. 5 Ω/mm at 10 GHz) in the model for simplicity, without significant
loss of accuracy. Moreover, the value shown in the component description
format (CDF) parameter can be overwritten for other frequencies by editing the
object properties of the symbol in Cadence, as shown for example in Figure 2.8. Ltl and
Ctl are the inductance and capacitance between the metal traces, respectively, which are
also frequency-dependent [15]. The p-n diode can be modeled as a parasitic resistance Rpn
and a depletion capacitance Cpn. Cpn varies with the applied reverse-bias
voltage, as shown in Figure 2.10 [6]. The RLGC parameters used for modeling
at 10 GHz are listed in Table 2.1. Rough estimates of the microwave velocity,
1/√(Ltl(Ctl + Cpn)), and the optical group velocity, (3 × 10^8 m/s)/ng, are
approximately 8.5 × 10^7 m/s and 7.5 × 10^7 m/s, respectively, so the two velocities are
roughly matched. The effect of velocity mismatch is analyzed in detail in a later section.
The distributed elements can be concatenated to form the MZM arm: the output phase of
each stage reflects the voltage-dependent phase change induced by the current stage,
and is added to the phase of the next stage. The segment length of each cell should be
less than (1/10) · (3 × 10^8 m/s)/(ng × 5 × fNyquist). For example, for operation at
10 Gb/s, the segment length of the cell should be less than 300 µm. Using a small cell
length results in a large number of cascaded segments for a long MZM arm. However,
there is a convenient way to make the series connection, as illustrated in Figure 2.9,
where a thousand 3 µm cells form a 3 mm long arm.
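The two velocity estimates and the segment-length bound can be checked with a short calculation. In the sketch below, Ltl and Ctl are the Table 2.1 values converted to per-metre units, while the junction capacitance of 185 fF/mm and the group index of 4 are assumptions picked so that the results land near the quoted 8.5 × 10^7 m/s and 7.5 × 10^7 m/s; the function names are hypothetical.

```python
import math

C_VACUUM = 3.0e8  # m/s


def microwave_velocity(l_per_m, c_tl_per_m, c_pn_per_m):
    """Phase velocity of the loaded line, 1/sqrt(Ltl * (Ctl + Cpn))."""
    return 1.0 / math.sqrt(l_per_m * (c_tl_per_m + c_pn_per_m))


def max_segment_length(bit_rate, n_g):
    """Upper bound on the cell length: (1/10) * c / (n_g * 5 * f_Nyquist)."""
    f_nyquist = bit_rate / 2.0  # NRZ Nyquist frequency
    return C_VACUUM / (n_g * 5.0 * f_nyquist) / 10.0


# Table 2.1 values per metre; the 185 fF/mm junction capacitance is an assumption.
v_rf = microwave_velocity(450e-9, 123e-12, 185e-12)
v_opt = C_VACUUM / 4.0  # assumes n_g = 4
seg_max = max_segment_length(10e9, 4.0)
```

Under these assumptions the bound evaluates to 300 µm at 10 Gb/s, so the 3 µm cells of Figure 2.9 satisfy it with a wide margin.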
[Figure 2.7 diagram. Electronic model: Ltl, Rtl, Ctl, node voltage V and delay td(L). Optical model: inputs Pin and Φin, blocks ΔΦ(V,L) and PLoss(V,L), delay td(L), outputs Pout and Φout.]
Figure 2.7: Decomposition of the HSPM model into an electronic model and an
optical model.
Figure 2.8: The object properties of a Verilog-A HSPM cell.
Figure 2.9: Concatenate 1000 HSPM cells in Cadence schematic.
Table 2.1: RLGC parameters at 10 GHz.
Parameter   Rtl       Ltl          Rpn        Ctl
Value       5 Ω/mm    450 pH/mm    15 Ω·mm    123 fF/mm
For the optical model, physical model equations for the voltage-dependent dynamic
phase shift and the depletion junction capacitance are given in Equation 2.9 and Equation
2.10, respectively [8, 23]. The parameters are described in Table 2.2.
The results simulated in Verilog-A with these physical models for the relative phase
shift and the depletion junction capacitance of a 5 mm long HSPM match
the measured results given in [6] well. They are plotted in Figure 2.10(a) and (b),
respectively. The HSPM has a static insertion loss and a dynamic insertion loss, due
to the doped waveguide and the applied modulation voltage, respectively. The static
insertion loss is about 0.63 dB/mm, which contributes a significant amount of loss when
the device is long. The dynamic insertion loss is about one tenth of the static
insertion loss and decreases as the applied voltage increases.
$$\Delta\phi = c_{V2Ph} \cdot L \cdot \frac{\pi}{180} \cdot (k \cdot V_R)^{m_{ph}} \qquad (2.9)$$

$$C_d = C_0\left(1 + \frac{V_R}{V_{bi}}\right)^{-m_{cap}} + C_p \qquad (2.10)$$
Table 2.2: Parameters used in Equation 2.9 and Equation 2.10, with the
corresponding values used for curve fitting in Figure 2.10.
Parameter   Value           Description
cV2Ph       -7 °/(V·mm)     Voltage to phase multiplication coeff.
L           5 mm            Length of the HSPM
k           1.2             Voltage to phase fitting coeff.
mph         0.8             Voltage to phase power coeff.
C0          0.22 fF/µm      Zero-bias junction capacitance
Vbi         1.5 V           Built-in voltage
mcap        0.33            Voltage to capacitance power coeff.
Cp          0               Parasitic capacitance
Figure 2.10: Simulated and measured results comparison of (a) the change
of relative phase shift and (b) pn junction capacitance as a function of the
reverse-biased voltage on a 5 mm long HSPM.
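Equations 2.9 and 2.10 with the Table 2.2 fitting values can be evaluated as in the Python sketch below. This is an illustrative transcription rather than the Verilog-A HSPM cell; the function names are hypothetical, and the reverse bias is taken as a non-negative magnitude so that the fractional power stays real.

```python
import math


def hspm_phase_shift_rad(v_r, length_mm, c_v2ph=-7.0, k=1.2, m_ph=0.8):
    """Equation 2.9: dynamic phase shift (rad) for a reverse bias v_r >= 0 (V)."""
    if v_r < 0:
        raise ValueError("v_r is the reverse-bias magnitude and must be >= 0")
    return c_v2ph * length_mm * math.pi / 180.0 * (k * v_r) ** m_ph


def hspm_capacitance_ff_per_um(v_r, c0=0.22, v_bi=1.5, m_cap=0.33, c_p=0.0):
    """Equation 2.10: depletion capacitance per unit length (fF/um)."""
    return c0 * (1.0 + v_r / v_bi) ** (-m_cap) + c_p


for v in (0.0, 1.0, 2.0, 4.0):
    print(v, hspm_phase_shift_rad(v, 5.0), hspm_capacitance_ff_per_um(v))
```

The loop sweeps a few bias points for the 5 mm device of Table 2.2; the capacitance falls monotonically from C0 as the reverse bias increases, while the magnitude of the phase shift grows sub-linearly because mph is below 1.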
2.2.4 Low-Speed Phase Modulator
The low-speed phase modulator (LSPM) is used to tune the
device's optical properties, such as the optical phase for the quadrature phase
bias of an MZM device or the resonance wavelength of an MRM. Two commonly used LSPMs
in the silicon photonic platform are thermal phase modulators (TPM) and p-i-n phase
modulators (PIN PM). Their cross-sections are illustrated in Figure 2.11. For low-power
design, it is desirable to induce a larger phase shift range with less current.
[Figure 2.11 drawings. Panel titles: (a) Cross-section of thermal phase modulator; (b) Cross-section of p-i-n phase modulator. Labels in both panels: Buried Oxide, Substrate, Si, p++, Contact; panel (a) also shows p regions and panel (b) STI regions. Annotated dimensions: 500 nm, 800 nm and 8 µm.]
Figure 2.11: Cross section of (a) thermal phase modulator and (b) p-i-n
phase modulator (not drawn to scale).
2.2.4.1 Thermal Phase Modulator
The thermal phase modulator (TPM) is a doped Si waveguide, as shown in Figure
2.11 (a). It relies on resistive heating to induce a phase shift in a length of Si waveguide.
The additional phase increases with the input current as expressed in Equation
2.11, in which η is the tuning efficiency in units of radians/mW, Iin is the input
current and R is the resistance of the TPM. The measured dc
characteristic of a 200 µm long p-doped TPM with the cross-section shown
in Figure 2.11 (a) is plotted in Figure 2.12. It can be observed that its resistance
does not vary monotonically with power. The corresponding η is
about 0.11 radians/mW. Using a TPM for the quadrature phase bias is not a low-power
solution, as it requires more current to achieve the phase shift than the
PIN PM.

$$phase = \eta I_{in}^{2} R \qquad (2.11)$$
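Equation 2.11 gives a quick estimate of the heater power and current needed for the quadrature bias. The sketch below assumes a constant heater resistance, which the measured characteristic shows is only an approximation; the 180 Ω value and the function names are hypothetical.

```python
import math


def tpm_power_mw(phase_rad, eta_rad_per_mw=0.11):
    """Heater power (mW) needed for a phase shift, from phase = eta * P."""
    return phase_rad / eta_rad_per_mw


def tpm_current_ma(phase_rad, r_ohm, eta_rad_per_mw=0.11):
    """Heater current (mA) from Equation 2.11, treating R as constant."""
    power_w = tpm_power_mw(phase_rad, eta_rad_per_mw) * 1e-3
    return math.sqrt(power_w / r_ohm) * 1e3


quadrature_power = tpm_power_mw(math.pi / 2)  # about 14.3 mW
quadrature_current = tpm_current_ma(math.pi / 2, 180.0)  # R = 180 ohm is hypothetical
```

With η of about 0.11 radians/mW, a 90° shift needs roughly 14.3 mW, and a heater resistance in the neighborhood of the assumed value would put the current inside the 8-10 mA range mentioned for the 200 µm device.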
Figure 2.12: DC characteristic of a 200 µm long thermal phase modulator.
2.2.4.2 PIN Phase Modulator
The p-i-n phase modulator, shown in Figure 2.11 (b), is operated under forward bias
in carrier injection mode to change the refractive index. It
is usually used to provide low-speed optical phase modulation, for instance to control
the quadrature dc phase bias point in an MZM device. Like the HSPM, it has both static
and dynamic insertion loss.
2.3 EO Co-Design Consideration
The trade-off between driver voltage swing and device length is key to determining the
driving scheme, so the MZM device and driver circuits need to be co-designed. Differential
drive or differential push-pull drive on both MZM arms can reduce the arm
length and is thus a preferred option for on-chip driver design, although mismatches in
the differential drive will introduce chirp. Signal chirping techniques
can, however, be useful in long-haul optical transmission [24], which is not discussed here. A
flowchart for the CMOS photonic design methodology is shown in Figure 3.2.
Figure 2.13: Flowchart for the MZM-based transmitter design.
2.3.1 Current-Mode Drive
As the operating frequency increases, the electrodes of a long MZM device
should be treated as a transmission line once the arm length becomes
comparable to 1/10 of the signal wavelength. The wavelength (λ) can be calculated
with Equation 2.12, in which f and εeff are the operating frequency and the effective
dielectric constant of the material, respectively.

$$\lambda = \frac{3 \times 10^{8}\,\mathrm{m/s}}{\sqrt{\varepsilon_{eff}} \times f} \qquad (2.12)$$
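The one-tenth-wavelength rule can be turned into a simple check, sketched below. The effective dielectric constant of 12.5 is an assumed value used only to produce representative numbers, and the function names are hypothetical.

```python
import math

C_VACUUM = 3.0e8  # m/s


def signal_wavelength_m(freq_hz, eps_eff):
    """Equation 2.12: guided wavelength on the electrode."""
    return C_VACUUM / (math.sqrt(eps_eff) * freq_hz)


def needs_transmission_line(arm_length_m, freq_hz, eps_eff):
    """True when the arm is at least one tenth of the signal wavelength."""
    return arm_length_m >= signal_wavelength_m(freq_hz, eps_eff) / 10.0


# eps_eff = 12.5 is an assumed value, not a number taken from the process data.
print(signal_wavelength_m(10e9, 12.5))
print(needs_transmission_line(3e-3, 10e9, 12.5))
```

Under this assumption the wavelength at 10 GHz is about 8.5 mm, so an arm of a few millimetres is well past the one-tenth threshold and has to be modeled as a transmission line.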
Since the MZM electrodes can be designed as an on-chip transmission line according to
the back-end-of-line metallization specification of the process, a current-mode driver is
a natural fit for driving the traveling-wave MZM device at high data rates. The driver
must also provide enough voltage swing [25][26]. As an example, the IBM 130
nm CMOS process, which features 1.2 V core devices and 2.5 V I/O devices whose
maximum operating voltages are 1.6 V and 2.7 V, respectively, is employed to implement the
CML driver with a 1.2 V single-ended swing, as shown in the schematic in Figure 2.14.
For speed, the differential pair (diff-pair) should use 1.2 V core devices.
For reliability, 2.5 V I/O devices (M3a, M3b) have to be cascoded on top of the 1.2 V
devices, as shown in Figure 2.14, at the expense of speed. Transistor sizing, the bias
scheme, and the parasitics introduced by the pads and bond wires are all critical design
considerations for high-speed circuit design.
In order to use the bias current efficiently to obtain the desired voltage swing,
the input pairs (ML1,2 and MR1,2 in Figure 2.14) of the CML driver operate
in the large-signal regime. The three operating regions of the input pair are illustrated
along with the dc transfer characteristic in Figure 2.15 (a). Both transistors of the input
pair are in saturation in region II. In regions I and III, one transistor of the
input pair enters sub-threshold, while the other either stays in saturation
or enters triode if the input amplitude is increased further. The rise time of the output
voltage is mainly determined by the RC time constant. This delay depends on the
charging of the load capacitor through the resistors connected to the supply rail, and
thus the total capacitance contributed by the current stage and the next stage should be
minimized.
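[Figure 2.14 schematic. A CML predriver (11.4 mA tail current, 70 Ω load) followed by the CML driver (24 mA tail current, Vdsat: 200 mV) with devices M1a/M1b, M2a/M2b, M3a/M3b and tail sources Ms1/Ms2; supplies VDD14 and VDD25 (1.3~2.5V); biases Vcas, Vbn and Vb; ac coupling Cb and Rb; resistors RL and Rt; bond wire model (2.5 mm) to the MZM nodes V1p/V1n; optical input and output through grating couplers (P1, P2). Other annotations: device sizes 48µ/120n, 100µ/120n, 220µ/240n; 50; 200 µm; radius = 12.5 µm.]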
Figure 2.14: Schematic of the NRZ TX circuit to drive the MZM.
The fall time of the output is set by the discharging of the load capacitor, during
which the transistor transitions from the sub-threshold region to the saturation region (it
enters the triode region once Vg > Vd + VTH, which occurs when the amplitude is large),
with the discharge current approaching the tail current. Here, the output slew-rate
limitation is alleviated by using a large tail current. The slope in region II can be increased
by reducing the overdrive voltage of the input pair [27]; this also helps to satisfy
Equation 2.13, which keeps the tail current source in saturation, as illustrated in
Figure 2.15 (b).

$$V_{pmin} = V_{cm} - V_{gs,M_{R,L}}\Big|_{I/2} > V_{dsat,M_s} \qquad (2.13)$$
[Figure 2.15 plots. (a) Voutp − Voutm (V), −1 to 1, versus Vinp − Vinm (V), −1 to 1. (b) Vtail (V), 0.2 to 0.4, versus Vinp − Vinm (V), −1 to 1.]
Figure 2.15: (a) CML dc transfer characteristic. (b) Vtail-input character-
istic.
An open-drain CML stage with a single 50 Ω termination at the far end of the
traveling-wave MZM (TWMZM) is chosen to save power, provided the far end can be well
matched. In practice, this method can only be used for frequencies up to a few gigahertz,
since the matching at the far end can hardly be made perfect [28]. Active back termination
can be used to save power [29, 30]. A 2.5 mm long bondwire is adopted to introduce enough
parasitic inductance for series peaking. However, series peaking only works for open-drain
CML [31]. A headroom of 250 mV is chosen to satisfy the Vdsat of Ms. Ms
has to be large because of the large bias current it carries. The non-ideal dc
characteristic of the current source, together with the displacement current spikes
induced by the parasitic capacitance during fast signal transitions, needs to be taken
into account. In order to utilize the tail current efficiently to achieve the desired 1.2 V
voltage swing across the MZM arms, a minimum single-ended amplitude of 0.4 V with a
common-mode voltage of 1 V is required for the diff-pair to be switched on and off and
steer the current into the resistive load. Sizing the diff-pair is a trade-off between its
overdrive voltage and the maximum allowable parasitic capacitance. Carrying the
desired 24 mA of current results in relatively large sizes for the input
pair and the cascode devices M3, which is detrimental to high-speed performance.
However, M3 cannot be made too small because of ESD considerations. Thus,
there is a trade-off between TX speed and ESD tolerance. An explicit capacitor is needed
at the node Vcas to minimize the signal feed-through due to the parasitic capacitance
Cdg,M3.
Since the MZM driver draws a large current (24 mA), the resulting diff-pair
is large and thus exhibits a large input capacitance. A predriver stage is therefore
necessary [28] to drive the output stage with the required swing and suitably fast
transitions. This requires the supply voltage of the predriver to be 1.4 V. Since the
gate capacitance of MR2 is about 33 fF, and supposing the Voutp1 node has 15 fF of
parasitic capacitance, including the drain capacitance of MR1, the load resistance
RL must be smaller than 95 Ω to keep the rise time below 0.125 UI (unit interval).
Consequently, a 70 Ω load resistance with an 11.4 mA tail current is chosen for the
predriver. The predriver diff-pair also needs a minimum amplitude of 0.4 V with a
common-mode voltage of 1 V to be switched on and off efficiently. The sizes of the
main n-channel MOSFETs and the resistor values are annotated in Figure 2.14.
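The bound on RL can be reproduced with a first-order estimate. The sketch below assumes a single-pole RC response with a 10%-90% rise time of 2.2·RL·C and a data rate of 12.5 Gb/s; neither assumption is stated in the paragraph above, but together they give a limit close to the quoted 95 Ω. The function name is hypothetical.

```python
def max_load_resistance(c_load_f, bit_rate, ui_fraction=0.125):
    """Largest RL (ohm) keeping the 10%-90% RC rise time (2.2*R*C) within the budget."""
    rise_time_budget = ui_fraction / bit_rate
    return rise_time_budget / (2.2 * c_load_f)


c_node = 33e-15 + 15e-15  # gate load plus the supposed node parasitics (F)
r_max = max_load_resistance(c_node, 12.5e9)  # 12.5 Gb/s is an assumed data rate
swing = 70.0 * 11.4e-3  # predriver single-ended swing (V)
```

The chosen 70 Ω load sits below this limit, and steering the full 11.4 mA tail current through it gives a swing of about 0.8 V, which corresponds to the 0.4 V amplitude required at the input of the output stage.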
It is worth noting that, in order to pass the lowest frequency component of a PRBS
pattern of a given length at a given data rate, the on-chip ac coupling
components Cb and Rb need to be chosen large enough [32]. DC coupling, or the use of a
discrete bias-T for testing, is recommended to save the silicon area occupied by the
passive RC components.
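A rough way to size the ac coupling network is to compare its high-pass corner with the lowest spectral line of the repeating test pattern, which for a PRBS of length 2^n − 1 lies at the bit rate divided by the pattern length. The sketch below uses hypothetical values for Rb and Cb and treats the coupling network as a first-order RC high-pass filter; the function names are illustrative.

```python
import math


def prbs_lowest_tone_hz(bit_rate, order):
    """Lowest spectral line of a repeating PRBS of length 2**order - 1."""
    return bit_rate / (2 ** order - 1)


def highpass_corner_hz(r_ohm, c_f):
    """-3 dB corner of the first-order RC ac-coupling network."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_f)


# Hypothetical values: Rb = 10 kohm, Cb = 1 pF, 12.5 Gb/s PRBS9.
tone = prbs_lowest_tone_hz(12.5e9, 9)
corner = highpass_corner_hz(10e3, 1e-12)
passes = corner < tone
```

For a short pattern the lowest tone is in the tens of megahertz and a small on-chip RC is sufficient, but a 2^31 − 1 pattern at the same rate pushes it down to a few hertz, which is why dc coupling or an external bias-T becomes attractive.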
2.3.2 Voltage-Mode Drive
A voltage-mode driver is suitable for driving a lumped-element load. Circuit design
techniques for voltage-mode drivers are required to provide enough voltage swing as well
as high data rates [33][34]. For a lumped-element MZM device, inverter-based
drivers are the better option because they dissipate no static current and
do not need a termination resistor for impedance matching. Shorter
MZM devices that feature high VπLπ are highly desirable for inverter-based drivers [35]. A
long MZM can consist of multiple HSPM segments arranged in a serpentine style,
as shown in Figure 2.16. Each segment is 500 µm long and can be treated as
a lumped element. The device can be configured for either NRZ or PAM-4 modulation by
controlling the drivers. For this application, flip-chip or Cu-pillar bonding options,
which feature extremely small parasitic inductance, are necessary for integrating the
inverter-based drivers. Another main challenge in the design of a multi-segment
lumped-element MZM transmitter is that precision delay cells are required between every
two consecutive segments for velocity matching [36].
2.3.3 Velocity Mismatch
Microwave propagation delay must be matched to the optical propagation delay.
Otherwise, the bandwidth of the optical link is degraded, especially when the
device operates at higher data rates. As mentioned before, precision delay
cells are required for a multi-segment lumped-element MZM device, while the
electrode design for velocity matching in a traveling-wave MZM is explained in Section
2.2.3. The effect is studied here with electrical-optical (EO) behavioral simulation
of a multi-segment lumped-element MZM device similar to that in Figure 2.16. A simplified
schematic is illustrated in Figure 2.17. The time delay of the delay cell is varied
by an offset time, tos, with respect to the optical propagation delay. In
this example, the optical propagation delay is set to 9.8 ps and the MZM device
is modulated with a 32 Gb/s electrical signal. As is evident in the NRZ and PAM-4 eye
diagrams plotted in Figure 2.18, where tos is varied from 0.5 ps to 2 ps, a large delay
mismatch considerably degrades the overall bandwidth. This is
especially critical for the PAM-4 signaling format, where the delay mismatch causes skew
between the top, middle and bottom eyes and adds jitter. In other words, the effect of
velocity mismatch on NRZ signaling performance is equivalent to that of
slow rise and fall times of the modulation signal.
[Figure 2.16 layout image. Annotations: pad locations for the LSB drivers; pad locations for the MSB drivers; large turn and small turn. The length of the waveguide can be tuned to meet the delay cell design spec. The difference between the large and small waveguide turns is compensated by the turns for routing the next segment.]
Figure 2.16: A segmented MZM device consisting of fourteen lumped HSPM
elements and a PIN PM as the dc phase device.
PAM-4 signaling has additional
intrinsic deterministic jitter compared to NRZ signaling, as can be observed in
Figure 2.19. With a 10 ps 10%-90% rise and fall time at a 10 Gb/s bit rate, the PAM-4 signal
has 5.66 ps of deterministic jitter due to the crossing points of certain level transitions. In
certain practical applications, forward error correction (FEC) with DSP techniques
is required for PAM-4 modulation because it has more jitter and a lower SNR than
NRZ modulation [37].
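To get a feel for why a per-cell offset of only 0.5 ps to 2 ps matters, the sketch below adds up the offset along the driver chain and compares the worst-case skew with the symbol period. It assumes that the delay cells are chained so that the offset accumulates linearly from segment to segment and that the device has fourteen segments as in Figure 2.16; both are assumptions about the simulated setup, and the function names are hypothetical.

```python
def accumulated_skew_ps(t_os_ps, n_segments):
    """Electrical-optical skew at each segment if the offset adds up along the chain."""
    return [i * t_os_ps for i in range(n_segments)]


def unit_interval_ps(symbol_rate):
    """Symbol period in picoseconds."""
    return 1e12 / symbol_rate


# Assumed: 14 segments (as in Figure 2.16) and a 32 G symbol rate.
for t_os in (0.5, 1.0, 2.0):
    worst = accumulated_skew_ps(t_os, 14)[-1]
    print(t_os, worst, worst / unit_interval_ps(32e9))
```

Under these assumptions a 2 ps offset per cell grows to 26 ps at the last segment, a large fraction of the 31.25 ps symbol period, which is consistent with the eye closure described above.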
[Figure 2.17 schematic. Inputs In and Inb feed a chain of push-pull driver pairs (drvr) separated by delay cells; segment n is driven by the outputs vouth<n>, voutl<n>, vouthb<n> and voutlb<n>, for n = 1, 2, 3, ..., *.]
Figure 2.17: Illustration of the segmented MZM with lumped HSPM
devices. Each segment has a dedicated push-pull driver.
Figure 2.18: Velocity mismatch simulation of NRZ and PAM-4 signaling
at 32 G symbol rate (9.8 ps optical delay per segment).
Figure 2.19: Deterministic jitter of the ideal PAM-4 signal.
2.4 MZM Measurement and Behavioral Simulation
The developed MZM model is intended to be used for hybrid circuit-level simula-
tions. The simulated 20 Gb/s optical eye pattern of a 5 mm MZM is shown in Figure
2.20 (a), and it closely predicts the optical power levels and extinction ratio (ER) in
the eye pattern shown in Figure 2.20 (b) obtained from RF wafer probing in [6].
Figure 2.20: Eye pattern at 20 Gb/s with a 1 Vpp differential drive for a
5 mm MZM using a 1555 nm wavelength. (a) Compact model simulation
result. (b) Measured result in [6].
The MZM device illustrated in Figure 2.3 was fabricated in the IME SOI CMOS
process and chip-on-board (COB) wire bonded on a PCB. The test platform for the
prototype is shown in Figure 2.21. The high-speed electrical modulation signal
was applied through 27 GHz bandwidth end launch SMA connectors with impedance
controlled CPW transmission lines. High frequency (up to 20 GHz), thin film
surface mount 50 Ω resistors with a 0402 footprint are used for the MZM far-end
termination. The PAM-4 MZM device has two pairs of HSPM arms, which can be
individually modulated to obtain NRZ signaling. Figure 2.22 compares the
Verilog-A model simulated result with the measured result when only the shorter
arms are modulated at 12.5 Gb/s. Jitter in the random pulse generator and
relative intensity noise in the laser source are also included in the simulation
stimulus to bring the simulated result closer to the measured result. The
Anritsu MP17763C available in the lab has a 2.5 ps rms jitter (14 ps peak-to-peak
jitter), which contributes significantly to the jitter of the MZM optical output.
Figure 2.21: Prototype of a MZM device wire-bonded on PCB. Fiber array
is aligned on top of the chip.
[Figure 2.22 panels: (a) simulated, (b) measured. Annotations: power levels 252.9 µW and 152.6 µW; jitter p-p 22 ps; average power 203 µW; bit rate 12.5 Gb/s; ER = 2.19 dB.]
Figure 2.22: Eye pattern at 12.5 Gb/s with a 1.8 Vpp differential drive
for a 1.1 mm segment MZM in Figure 2.3. (a) Compact model simulation
result. (b) Measured result.
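The power annotations in Figure 2.22 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the two annotated values (252.9 µW and 152.6 µW) are the optical one and zero levels of the eye; the function names are illustrative and are not part of the Verilog-A model.

```python
import math

def extinction_ratio_db(p_one_uw, p_zero_uw):
    """Extinction ratio in dB from the one and zero optical power levels."""
    return 10.0 * math.log10(p_one_uw / p_zero_uw)

def oma_uw(p_one_uw, p_zero_uw):
    """Optical modulation amplitude: difference between the two levels."""
    return p_one_uw - p_zero_uw

def average_power_uw(p_one_uw, p_zero_uw):
    """Average power, assuming equally likely ones and zeros."""
    return 0.5 * (p_one_uw + p_zero_uw)

# Levels read from the Figure 2.22 annotations (assumed to be one/zero levels).
P_ONE_UW, P_ZERO_UW = 252.9, 152.6

print(f"ER   = {extinction_ratio_db(P_ONE_UW, P_ZERO_UW):.2f} dB")
print(f"OMA  = {oma_uw(P_ONE_UW, P_ZERO_UW):.1f} uW")
print(f"Pavg = {average_power_uw(P_ONE_UW, P_ZERO_UW):.1f} uW")
```

Under this assumption the computed ER and average power agree with the 2.19 dB and 203 µW annotated in the figure.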
2.5 Summary
A traveling-wave MZM device is fabricated and characterized. A library of
behavioral models for optical components is created, which enables the
co-simulation of a silicon photonic MZM device and CMOS transistors in Cadence
Spectre. Current-mode and voltage-mode driver schemes for multi-segment
lumped-element and traveling-wave MZM devices are discussed. The velocity
mismatch effect is emphasized with compact model behavioral simulation. The power
consumption of the voltage-mode driving scheme may be comparable to or even
exceed that of the current-mode driving scheme as the data rate gets higher and
the number of lumped elements gets larger, due to the CV^2f relationship.
Behavioral simulations using the developed compact model are demonstrated for a
traveling-wave MZM device tested with wafer probing at 20 Gb/s and for a
traveling-wave MZM device wire-bonded on a PCB prototype and tested with RF
cables at 12.5 Gb/s. The simulation results match the measurement results well.
Optical component behavioral modeling is an indispensable part of optical process
design kit (PDK) development, along with the creation of layout standard cells.
With the help of this optical library, IC designers who have less optical device
design background can accurately simulate the optoelectronic system, so the
design risk and time to market can be significantly reduced.
CHAPTER 3
A RECONFIGURABLE MZM BASED OPTICAL LINK
BUDGET ANALYSIS
Hybrid integration of CMOS chips with silicon photonic devices has emerged as a
promising cost-effective solution to meet the ever increasing data transfer
bandwidth requirement in computing systems. The MZM device is by far the most
reliable indirect optical modulator in the silicon photonic platform, though its
footprint is large and it thus requires relatively more power for the drivers.
Escalating the amplitude modulation scheme from NRZ (PAM-2) to PAM-4, or even to
PAM-8, to increase the data rate requires an analytic model to estimate the
trade-off among the electrical-to-optical (EO) channel loss, the sacrificed
signal-to-noise ratio, the circuit design complexity, the chip area and the power
consumption. It is imperative to find a methodology to evaluate the link topology
at the system level and guide the transistor-level design of the transceiver
circuits.
This chapter proposes a methodology that helps to find the optimized system-level
performance specifications in terms of improved link energy efficiency. CMOS
drivers designed in the TSMC 16 nm FinFET CMOS process will be flip-chip bonded
with CuPillar technology to a segmented serpentine style MZM device in a 130 nm
SOI CMOS photonic process. With the help of the output enable function in the
CMOS driver circuit, the effective length of the MZM device can be reconfigured.
NRZ and PAM-4 modulation with different extinction ratios (ER) can be achieved
with this one design solution. Figure 3.1 illustrates a conceptual MZM-based
silicon photonic link block diagram. The germanium homo-junction photodetector
(PD) used at the receiver (RX) side features a small parasitic capacitance and
thus can realize more than 25 GHz bandwidth with less than 2 V bias voltage, with
a best case responsivity (ρ) of 0.9 A/W. The ER [37] of the MZM at the
transmitter (TX) side is determined by the MZM device insertion loss and
modulation efficiency. The TX has to meet the minimum OMA requirement derived at
the RX side, considering the signal attenuation due to the PD coupling loss.
However, the larger the received power at the PD, the more current is seen at the
input of the transimpedance amplifier (TIA). An excessive amount of current will
overload the TIA, thus degrading its sensitivity. Therefore, an optimized optical
link requires a suitable ER at the TX side to achieve the target BER performance
with the least power consumption. Figure 3.2 lists the most important design
parameters of interest for both the transmitter and the receiver. The electrical
and optical parameters, which are essential for the detailed link analysis, can
be simulated and extracted from the process design kits (PDK) provided by the
foundries.
3.1 Derive OMA for Receiver
[Figure 3.1 block labels: CW laser, SiPh MZM, MZM driver, single mode fiber, Ge PD, TIA, CDR, RF path; ER, OMA, power.]
Figure 3.1: A conceptual MZM based hybrid silicon photonic link block
diagram.
[Figure 3.2 labels: blocks Laser, MZM, Driver (TX Spec.) and PD, TIA (RX Spec.); parameters λ, RIN, power, Pave, φdc, IL, ER, VπLπ, OMAMZM, length, Vswing, speed, ρ, BW, OMAPD, Vth, in,rms, ZTIA, ioverload, SNR, BER; loss including all the GCs; constraint OMAMZM > min. OMAPD.]
Figure 3.2: MZM based optical link specification parameters.
As shown in Figure 3.3, a single-ended optical front-end receiver consists of a
transimpedance amplifier (TIA) stage, an isolating stage, a variable gain
amplifier (VGA) stage, and an automatic gain control (AGC) loop, which includes a
peak detector and a comparator [38, 39]. The gain of the TIA has to be the
largest among the stages in the signal path for the best noise performance. The
isolating stage decouples the dc operating points between the TIA and the VGA,
and provides a small gain. The VGA makes the overall gain tunable to cover input
signals of different amplitudes. This is achieved by using the AGC to control the
VGA stage.
The bit error rate (BER) performance is determined by the signal-to-noise ratio
(SNR). It can be expressed with the complementary error function as given in
Equation 3.1 [40], where the SNR is represented by the scaling factor α. The
target α, together with other electrical parameters of the TIA, sets the OMA
requirement seen by the PD as given in Equation 3.2, in which N is the number of
PAM levels, in,rms is the TIA's input referred rms current noise (4 µA is used
here as a design example), Vth is the decision threshold after the TIA, and ZTIA
is the TIA's transimpedance gain. By correlating Equation 3.1 and Equation 3.2,
the BER as a function of the PD's OMAmin is plotted in Figure 3.4. As shown in
Figure 3.4, in order to achieve a BER of 10^-12, the minimum OMAPD has to be
larger than approximately -10 dBm, -5 dBm and -1.5 dBm for PAM-2, PAM-4 and
PAM-8, respectively.
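\[
\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{\alpha}{2\sqrt{2}}\right) \tag{3.1}
\]
\[
\mathrm{OMA}_{PD} = \frac{(N-1)\,\alpha\, i_{n,rms} + V_{th}/Z_{TIA}}{\rho} \tag{3.2}
\]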
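As an illustration of how Equations 3.1 and 3.2 combine, the sketch below first solves Equation 3.1 numerically for the α that gives a target BER, and then evaluates Equation 3.2 as written. The parameter values are the ones quoted in the caption of Figure 3.4 (in,rms = 4 µA, Vth = 20 mV, ZTIA = 58 dBΩ, ρ = 0.8 A/W); the function names and the bisection solver are assumptions made for this example, and the printed values are not taken from the source.

```python
import math

def ber_from_alpha(alpha):
    """Equation 3.1: BER as a function of the scaling factor alpha."""
    return 0.5 * math.erfc(alpha / (2.0 * math.sqrt(2.0)))

def alpha_for_ber(target_ber, lo=0.0, hi=40.0, iterations=200):
    """Bisection on Equation 3.1; BER falls monotonically as alpha grows."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ber_from_alpha(mid) > target_ber:
            lo = mid   # BER still too high: need a larger alpha
        else:
            hi = mid
    return 0.5 * (lo + hi)

def oma_pd_watts(n_levels, alpha, i_n_rms, v_th, z_tia_ohm, rho):
    """Equation 3.2, evaluated exactly as written in the text."""
    return ((n_levels - 1) * alpha * i_n_rms + v_th / z_tia_ohm) / rho

def watts_to_dbm(p_watts):
    return 10.0 * math.log10(p_watts / 1e-3)

# Values quoted in the caption of Figure 3.4.
I_N_RMS = 4e-6                   # A
V_TH = 20e-3                     # V
Z_TIA = 10.0 ** (58.0 / 20.0)    # 58 dBOhm converted to ohms
RHO = 0.8                        # A/W

alpha = alpha_for_ber(1e-12)
for n in (2, 4, 8):
    oma = oma_pd_watts(n, alpha, I_N_RMS, V_TH, Z_TIA, RHO)
    print(f"PAM-{n}: alpha = {alpha:.2f}, OMA_PD = {watts_to_dbm(oma):.2f} dBm")
```

For PAM-2 this evaluation lands close to the -10 dBm quoted in the text; the higher-order PAM values depend on how the threshold term in Equation 3.2 scales with N, so values read from Figure 3.4 may differ from this literal evaluation.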
[Figure 3.3 schematic labels: TIA, isolator, VGA, AGC (peak detector, comparator); nodes TIA_IN, RX_OUT, VPD, VCM, VDD; components Cpar, Rf1, Rf2, N1, Ios, Ros; optical in.]
Figure 3.3: Inverter based optical front-end receiver.
3.2 Determine ER for Transmitter
The MZM device needs a certain voltage swing to achieve the modulation depth.
This inevitably requires circuit techniques to design a high voltage driver with
low voltage (faster) MOSFETs. As Figure 3.5 shows, a latch-based level shifter
couples the signal from the low voltage domain to a higher voltage domain.
Compared to the traditional passive AC coupling circuit, the biggest advantages
of this topology are that there is no dc wandering issue, no pattern length or
bit rate dependency, and much less capacitance is required. However, because the
voltage across the capacitor cannot change instantaneously, a voltage overshoot
or undershoot can happen at the start of the signal toggling, depending on the
initial conditions at the two plates of the capacitor. The voltage overshoot and
undershoot can exceed the gate oxide breakdown voltage and cause a long-term
reliability issue, which is not advisable for product development. Thus capacitor
charge reset circuits are developed to protect the reliability of the transistors
after the capacitors. A complete MZM segment push-pull driver schematic is shown
in Figure 3.6, where vip and vin are complementary signals in the same low
voltage domain, and vouth and voutl are complementary signals in the high voltage
domain and the low voltage domain, respectively. The initial reset and power-down
functions can share one circuit. The two nodes, before and after the capacitor,
are pulled down to the low voltage of their respective voltage domains when
either the rst or the pd signal is high. This is realized with an output enabled
(OE) buffer. When OE=1, the input passes to Z. When OE=0, Z is in a high
impedance state so that a small pull down NMOS can pull the node low. The pull
down NMOS is preferably sized small because of the high-speed operation. The
VDDL to VDDH voltage domain pull down needs a level shifter circuit and a thick
gate PMOS with an always-on weak pull down NMOS. When reset or power down is
high, b=0 and node 2 is pre-charged to VDDH, so that node 3 is pulled down to
VDDL. When reset or power down is low, b=VDDH and node 2 is pulled to VDDL, so
that node 3 is unaffected. The power down signal can cut off the signal path for
a specific MZM segment even if the signal along the chain is present.
[Figure 3.4 plot: BER (10^-15 to 10^-9) versus OMAPD (-11 to 0 dBm); curves for PAM-2, PAM-4 and PAM-8.]
Figure 3.4: PAM-2/4/8 receiver sensitivity based on a 32 Gb/s TIA in
16 nm FinFET CMOS process (in,rms = 4 µA, Vth = 20 mV, ZTIA = 58 dBΩ, ρ
= 0.8 A/W).
[Figure 3.5 labels: coupling capacitor C between nodes 1 and 3; latch inverters INV1 and INV2; example waveform levels 0, VDDL, -VDDL, VDDH and 2*VDDH; "Reliability issue" callout.]
Figure 3.5: Latch-based level shifter with an example of the potential
reliability issue at the start of the signal toggling. The voltage at node 3 may
jump to 2*VDDH or −VDDL, which will stress the gates of INV1 and INV2.
The driver is especially suited to drive the lumped high-speed phase modulator
(HSPM) laid out in serpentine style as shown in Figure 3.7. Flip-chip bonding is
required. A PAM-4 transmitter block diagram is illustrated in Figure 3.8. The
reset signal can be generated by the digital control circuit before the PLL
clocking for the serializers is ready. Each driver can be individually powered
down (disabling the HSPM segment modulation) by controlling the corresponding
pd<*> signal. Thus different ERs can be achieved by disabling or enabling the
drivers of certain segments of the MZM device. The rms current of a segment
driver operating at 32 Gb/s can be 14 mA and 4.6 mA for VDDH (1.8 V) and VDDL
(0.9 V), respectively. This added flexibility can save power by disabling drivers
that are not necessary when the ER requirement is not high. The transmitter can
also be configured for NRZ modulation if the patterns for serializer1 and
serializer2 are kept the same.
[Figure 3.6 labels: inputs vip and vin; outputs vouth and voutl; reset and power_down controls; level shifter with signals a and b (0 to VDDH); OE buffers (I, OE, Z); nodes 1, 2 and 3; supplies VDDL and VDDH; VDDL to VDDH voltage domain; HSPM load.]
Figure 3.6: A block diagram of the MZM segment driver.
[Figure 3.7 labels: pad location for LSB drivers; pad location for MSB drivers; p-i-n phase modulator; high-speed phase modulator; silicon waveguide.]
Figure 3.7: Illustration of the reconfigurable MZM transmitter using
segmented serpentine layout style for the proposed driver shown in Figure
3.6.
[Figure 3.8 labels: OE, I, Z.]
Figure 3.8: An example of the OE buffer from TSMC digital standard
cell library.
ER, which correlates the OMA with the average power, is an important
specification for optical modulators. The optical power transfer function (Topt)
of the MZM device is given in Equation 3.3 and is plotted in Figure 3.9 for
different lengths. In this plot, the MZM is driven by a push-pull driver with a
1.8 VPP swing on both MZM arms. The dc phase difference (φdc) introduced by the
PIN phase modulators in Figure 3.7 is set to 90°. Therefore, the MZM is ideally
biased at the quadrature operating point for the best symmetric linear
modulation. However, this assumption does not consider the mismatches between the
two arms due to process variation. φ1 and φ2 are the absolute phases of the MZM's
two arms. k, which deviates from 0.5, is the mismatch factor of the Y-junction
and between the two arms. The total static insertion loss is expressed in
Equation 3.4. The optical power in the two branches before entering the combiner
is given by Equation 3.5 and Equation 3.6, respectively. A dB scale is used in
Equations 3.4 to 3.6. The two most significant insertion losses, ILGC and
ILHSPM,static, are introduced by the grating coupler and the high-speed phase
modulators (HSPM), respectively. A nominal ILGC of 5 dB is used for this
analysis. The static and dynamic insertion losses of the HSPM are 0.57 dB/mm and
-0.08 dB/mm, respectively, when it is reverse-biased at 1.8 V. The longer the
HSPM, the more static insertion loss is introduced. The larger the reverse-bias
voltage, the lower the dynamic loss of the HSPM. The static and dynamic insertion
losses, ILPIN,static and ILPIN,dyn, of a 250 µm PIN PM are 0.22 dB and 1.7 dB/π,
respectively. Moreover, losses introduced by other optical elements, including
the Y-junctions used for splitting and combining the light, optical routing such
as the straight and turned silicon waveguides, and the PIN phase modulators, are
considered in the MZM model. Thus, the ER can be accurately derived from the
plotted Topt in Figure 3.9. Notably, less than 5% of the laser power is
transmitted. Figure 3.10 shows the ER for different MZM lengths and the
corresponding insertion loss profile. The achievable ER is reduced as the data
rate increases, due to the bandwidth limitation that the electrical signal
imposes on the effective phase shift.
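\[
T_{opt} = \frac{P_1 k + P_2 (1-k) + 2\sqrt{P_1 P_2 k (1-k)}\,\cos(\phi_1 - \phi_2)}{P_{laser}\,10^{\frac{IL_{Y\text{-}junc.} + IL_{WG} + IL_{GC}}{10}}} \tag{3.3}
\]
\[
IL_{static} = IL_{GC} + IL_{WG} + IL_{HSPM,static} + IL_{PIN,static} \tag{3.4}
\]
\[
P_1 = P_{laser} - IL_{static} + 10\log_{10} k - IL_{HSPM,dyn}(\phi_1) - IL_{PIN,dyn}\,\phi_{dc} \tag{3.5}
\]
\[
P_2 = P_{laser} - IL_{static} + 10\log_{10}(1-k) - IL_{HSPM,dyn}(\phi_2) \tag{3.6}
\]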
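The dB bookkeeping of Equation 3.4 can be sketched as follows, using the loss figures quoted in the text (5 dB per grating coupler, 0.57 dB/mm static HSPM loss, 0.22 dB static loss for the PIN PM). The waveguide loss ILWG is not given numerically in the text, so it is left as a parameter that defaults to zero purely as a placeholder; the function names are illustrative.

```python
def static_insertion_loss_db(hspm_length_mm,
                             il_gc_db=5.0,
                             il_wg_db=0.0,               # placeholder: not given in the text
                             hspm_static_db_per_mm=0.57,
                             il_pin_static_db=0.22):
    """Equation 3.4: sum of the static loss terms on a dB scale."""
    il_hspm_static_db = hspm_static_db_per_mm * hspm_length_mm
    return il_gc_db + il_wg_db + il_hspm_static_db + il_pin_static_db

def transmitted_fraction(loss_db):
    """Convert a loss in dB to the fraction of optical power that remains."""
    return 10.0 ** (-loss_db / 10.0)

for length_mm in (1.0, 3.0, 5.0, 7.0):
    il_db = static_insertion_loss_db(length_mm)
    print(f"L = {length_mm:.0f} mm: IL_static = {il_db:.2f} dB, "
          f"fraction = {transmitted_fraction(il_db):.3f}")
```

This covers only the static terms of Equation 3.4; the dynamic HSPM and PIN losses in Equations 3.5 and 3.6 and the Y-junction loss in Equation 3.3 are left out, so the numbers are not expected to reproduce Figure 3.10.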
[Figure 3.9 plot: Topt (0 to 0.05) versus φ1 - φ2 (30° to 150°); curves for L = 1 mm to 7 mm.]
Figure 3.9: Optical power transfer function of the MZM in a 130nm SOI
CMOS process with different lengths (an effective phase shift of 7.58◦/mm
is extracted when operating at 32 Gb/s).
[Figure 3.10 plot: ER (dB, 1 to 18) and IL (dB, 9.5 to 15) versus MZM length (0.5 to 7.5 mm); curves: ER @ dc, ER @ 32 Gb/s, optical IL.]
Figure 3.10: MZM ER and IL versus length.
3.3 Correlation between Transmitter and Receiver
The OMA after the coupling at the RX PD side needs to be guaranteed to meet
the BER requirement for PAM-N signaling. This is plotted in Figure 3.11 with the
laser power ranging from 10 to 19 dBm (the numbers in red are denoted at the side
of each curve). The dash-dotted horizontal lines are the minimum required OMA at
the PD for PAM-N signaling, derived from the parameters used in Figure 3.4. A
5 dB insertion loss is assumed per grating coupler. It can be observed that there
is an optimal ER for this specific MZM device, which is around 8 dB. In order to
meet the OMAmin for the PAM-4 and PAM-8 RX sensitivity requirements, the laser
power needs to be increased due to the degraded SNR. At least 11 dBm and 19 dBm
of laser power are required for PAM-4 and PAM-8 modulation, respectively, to meet
the OMA at the RX side for a BER of 10^-12. A longer MZM device can increase the
ER but doesn't necessarily help the OMA, because a longer MZM device introduces
excessive loss.
[Figure 3.11 plot: one curve per laser power from 10 to 19 dBm; horizontal lines mark the minimum OMA for PAM-2, PAM-4 and PAM-8.]
Figure 3.11: OMAPD versus ER with varying input laser power.
Besides the OMA, the overload current, which is set by the average power of the
MZM and the ρ of the PD, should also be watched. This is plotted in Figure 3.12.
The TIA needs to tolerate this amount of overload current by turning on sink
current nfets. The more overload current needs to be cancelled, the more sink
current nfets need to be turned on, and thus the more input current noise is
introduced, which in turn raises the minimum required OMA at the RX side.
Figure 3.12: The overload current seen from the TIA versus ER.
3.4 Reconfigurable MZM Transmitter Simulation
The NRZ modulation format can be realized by completely powering down either all
5 LSB segments or all 9 MSB segments. The simulated results are shown in
Figure 3.13. Since NRZ modulation requires a lower ER, segments within the LSB or
MSB group can be further powered down to save power. With 10 dBm (10 mW) laser
power and a fixed current setting for the p-i-n phase modulator, the obtained ERs
are about 2.94 dB and 5.55 dB with 5 and 9 active segments, respectively. The
corresponding driver power consumption is 290 mW and 522 mW.
For PAM-4 modulation, the two serializers generate two uncorrelated data
patterns. Simulated results with three different LSB and MSB segment combinations
are shown in Figure 3.14. In order to meet the receiver's sensitivity
requirement, PAM-4 signaling needs more laser power. As an example, 13 dBm
(19.95 mW) laser power and a fixed current setting for the p-i-n phase modulator
are used here. The driver power consumption for achieving 9.38 dB, 7.12 dB and
4.92 dB extinction ratio is 812 mW, 638 mW and 464 mW, respectively. Different ER
specifications for the PAM-4 modulation format can be realized to meet different
application scenarios. The middle eye height is smaller than the height of the
top and bottom eyes for the 3+5 segment combination. This can be improved by
tuning the dc phase point through the current setting of the p-i-n phase
modulator.
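The driver power figures quoted above scale linearly with the number of enabled segments, at 58 mW per segment (290 mW / 5). The sketch below reproduces the quoted totals from that per-segment value. It also cross-checks the value against the rms currents given earlier (14 mA from 1.8 V and 4.6 mA from 0.9 V), under the assumption, made only for this example, that each segment has two such drivers and that power is simply rms current times supply voltage.

```python
# Per-segment driver power inferred from the quoted totals (290 mW / 5 segments).
POWER_PER_SEGMENT_MW = 58

def driver_power_mw(enabled_segments):
    """Total driver power for a given number of enabled MZM segments."""
    return enabled_segments * POWER_PER_SEGMENT_MW

def per_segment_estimate_mw(i_vddh_a=14e-3, vddh=1.8,
                            i_vddl_a=4.6e-3, vddl=0.9,
                            drivers_per_segment=2):   # assumption for this sketch
    """Rough estimate: rms current times supply, summed over both domains."""
    per_driver_w = i_vddh_a * vddh + i_vddl_a * vddl
    return 1e3 * drivers_per_segment * per_driver_w

# (LSB segments, MSB segments) for the NRZ and PAM-4 configurations in the text.
configurations = {
    "NRZ, 5 segments": (5, 0),
    "NRZ, 9 segments": (0, 9),
    "PAM-4, 5+9": (5, 9),
    "PAM-4, 4+7": (4, 7),
    "PAM-4, 3+5": (3, 5),
}
for name, (lsb, msb) in configurations.items():
    print(f"{name}: {driver_power_mw(lsb + msb)} mW")
print(f"estimate from rms currents: {per_segment_estimate_mw():.2f} mW per segment")
```

The estimate from the rms currents comes out slightly above 58 mW, which is consistent with the per-segment value implied by the simulated totals.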
[Figure 3.13 panels: 5 segments (ER 2.94 dB); 9 segments (ER 5.55 dB).]
Figure 3.13: Simulated NRZ modulation format eye diagrams with 5
segments and 9 segments, respectively, operating at 32 Gb/s.
3.5 Summary
A driver has been designed for a reconfigurable MZM transmitter. Depending on the
application, it can be configured for either the NRZ or the PAM-4 modulation
format. Furthermore, it can be set to different ER specifications, which can
significantly save power in certain scenarios. A systematic optical link power
budget analysis has been presented based on a CMOS driver topology for a
lumped-element segmented MZM device. The minimum OMA for an error-free RX, the
optimal ER for the MZM device and the minimum laser power are derived. The
high-speed PAM-4 hybrid optoelectronic transceiver is under fabrication in the
TSMC 16 nm FinFET CMOS process and an SOI CMOS silicon photonic process. It will
be demonstrated by Hewlett-Packard Labs.
[Figure 3.14 panels: 5+9 segments (ER 9.38 dB); 4+7 segments (ER 7.12 dB); 3+5 segments (ER 4.92 dB).]
Figure 3.14: Simulated PAM4 modulation format eye diagrams with 5+9,
4+7, 3+5 segment combinations operating at 64 Gb/s.
CHAPTER 4
A HYBRID OPTOELECTRONIC LIMITING RECEIVER
Optical communication chipsets play a vital role in the contemporary datacom
industry. High-speed, low-power, small form factor interconnect modules are
increasingly sought for cloud computing and switching systems. An optical
receiver converts the small current signal detected by the photodiode (PD) to a
resolvable voltage signal for the subsequent stage, such as clock data recovery
(CDR) circuitry, for further processing. Although CMOS PDs can be integrated to
achieve monolithic optoelectronic receivers, they have extremely low responsivity
(ρ) and bandwidth; thus more transmitter optical power is required, and an
equalization technique for bandwidth compensation is a necessity for the receiver
[41, 42]. Avalanche PDs feature a good ρ but require a high voltage, which may
not be compatible with practical system applications [43]. Costly III-V material
PDs can provide high bandwidth and good ρ with a small voltage requirement, and
are dominantly deployed in high-speed optical receivers [41, 44]. Recent advances
in silicon photonic technologies have enabled a cost-effective SOI CMOS platform
for electro-optical integration with improved energy efficiency [11]. It can
provide Ge PDs for hybrid integration with a CMOS compatible power supply
requirement. However, low-cost mass producible packaging for silicon photonics is
still challenging [45]. Nowadays, hybrid integration has become the mainstream
solution due to cost considerations and the speed limitations of the transistors
in available silicon photonic processes [5, 46, 38, 47].
Design analysis of the fabricated limiting receiver is elaborated in section 4.1.
System-level analysis and discussion with measurement results are given in section
4.2. Finally, key design considerations learned from this case study are summarized.
4.1 Receiver Architecture
In this work, a limiting receiver was designed in the IBM 130 nm CMOS process.
Similar to prior work such as [46, 43], there is no CDR or DeMUX circuitry, and
the signal is taken out through an output buffer. As illustrated in Figure 4.1,
the signal path of the CMOS limiting receiver consists of a front-end
trans-impedance amplifier (TIA) followed by limiting amplifier (LA) stages, with
a high gain feedback opamp for dc offset compensation (DCOC). The final outputs
are level shifted by an output buffer (OB), which utilizes a 3.3 V power supply.
A pair of off-chip capacitors (Cex) is used to achieve the desired low-frequency
cut-off for this DCOC loop. The CMOS chip also includes a bandgap reference
(designed using 3.3 V transistors) and a bias generator. A top illuminated
InGaAs/InP PIN photodiode (PD) is side-by-side wire bonded to the TIA input on
the CMOS die.
[Figure 4.1 labels: SMF, InGaAs/InP photodiode, VPD, bondpad with bondwire, TIA, LA stages, OB, opamp, Cex, bandgap & bias generator, VDDL, VDDH, CMOS chip boundary; linear region and limiting region.]
Figure 4.1: The architecture of the proposed hybrid optoelectronic receiver.
The system specification of the limiting receiver is illustrated in Figure 4.2.
The overall bandwidth (BWtot) of cascaded gain stages in the linear region can be
estimated to first order by using
$\frac{1}{BW_{tot}^2} = \frac{1}{BW_1^2} + \frac{1}{BW_2^2} + \cdots$ [48].
However, for the latter stages operating in the limiting regime, the concept of
small-signal bandwidth is not effectively applicable and must be replaced by
modeling the slew-rate limited operation for large-signal excursions. As a rule
of thumb, the whole receiver's bandwidth needs to be about 2/3 of the data rate
(B) for an optimal inter-symbol interference (ISI), noise and power trade-off
[28]. Using the first order equation for the linear region as an example, it can
be derived that the bandwidths of the TIA and the LA can be set at about 0.9 × B
and 1 × B, or 0.7 × B and 2 × B, respectively. In this work, the performance of
the limiting receiver is studied with a TIA bandwidth of less than 0.5 × B and an
LA bandwidth of about 2 × B in the limiting region.
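The two bandwidth splits quoted above can be checked against the first order cascade estimate. The sketch below is a minimal illustration with an assumed helper name; bandwidths are expressed as fractions of the data rate B.

```python
import math

def cascaded_bandwidth(*stage_bandwidths):
    """First order estimate: 1/BW_tot^2 = sum over stages of 1/BW_i^2."""
    return 1.0 / math.sqrt(sum(1.0 / bw ** 2 for bw in stage_bandwidths))

# Bandwidths normalized to the data rate B, as (TIA, LA) pairs.
for tia_bw, la_bw in ((0.9, 1.0), (0.7, 2.0)):
    total = cascaded_bandwidth(tia_bw, la_bw)
    print(f"TIA = {tia_bw} B, LA = {la_bw} B -> BW_tot = {total:.3f} B")
```

Both splits give a total bandwidth close to 2/3 of B, which is the rule of thumb cited in the text. The estimate applies to the linear region only, as the text notes.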
4.1.1 Photodiode and Trans-impedance Amplifier
[Figure 4.2 labels: TX Spec. (ER, laser Pave, wavelength); couple loss; limiting RX Spec. (BER, OMA (sensitivity), Pave (overload current), gain, power); blocks PD (BW, responsivity ρ, dark current), TIA, LA and OB with gain, BW, noise and power; linear region and limiting region; signal integrity, ESD, electro-migration.]
Figure 4.2: Limiting receiver design specification partition.
According to the InGaAs/InP PIN PD device specification, it features a typical
responsivity of 0.8 A/W, a 5 nA dark current and a maximum capacitance of 100 fF
when it is reverse biased at 5 V [44]. As shown in Figure 4.3, the voltage
applied to the cathode of the PD should be adjusted to 5.6 V, as the anode sits
at about 0.57 V, which is set by the operating bias voltage of the TIA input. The
TIA consists of a shunt feedback common source stage and a second stage with a
biased n-channel transistor (nfet) that acts as an active inductor load for
bandwidth extension. The second stage also provides additional gain and adjusts
the dc operating point for the next block through the bias voltage (Vb); Vb must
be set below VDDL+Vth to keep M3 in saturation. The input current level rises
when the incoming optical intensity increases, which can result in an improper dc
operating point for the next stage, which is a CML type in the LA stages. This
undesired current is called the overload current, and it can be canceled by
turning on the overload control signal (ol_en). However, enabling the current
mirror introduces noise and parasitic capacitance at the input of the TIA. In
order to alleviate this problem, the current mirror is connected at the middle of
the shunt resistance [5], at the cost of a reduced overload current canceling
capability. Furthermore, no input ESD protection device is added here since this
pin is not exposed after the CMOS die and the PD die are integrated by wire
bonding. Machine model (MM) ESD stress during the wire bonding process can be
avoided with ESD control precautions [49]. Design trade-offs among TIA gain,
bandwidth and noise performance are discussed in the following.
[Figure 4.3 schematic labels: tia_out, Vb, RF1, RF2, RL, RB, M1, M2, M3, Cgs, VDDL, VPD, ol_en, Ls, Lg, Lp, input pad, PD pad, optical in.]
Figure 4.3: The schematic of the TIA with photodiode. Triple-well
NMOS is used for the TIA core devices.
Table 4.1: TIA design parameters
Device    Quantity                    Value
RF1,2     PDK displayed resistance    384 Ω
RL        PDK displayed resistance    768 Ω
RB        PDK displayed resistance    4.6 kΩ
M1,2      Total width of nfets        20.4 µm
M3        Total width of nfets        4.8 µm
4.1.1.1 Gain and Bandwidth
The first stage TIA gain frequency response is expressed in Equation 4.1 without
taking the bondwire effect into account. The low-frequency trans-impedance gain
is approximately RF as long as gmRF, gmRL ≫ 1. The dominant pole is due to the
input capacitance, which includes the PD junction capacitance and the PD pad, TIA
input device and pad capacitances. The corresponding input resistance is
expressed in Equation 4.2. The second pole results from the TIA's first stage
output resistance (RL ‖ 1/gmM1) and the capacitance at that node, which is much
smaller than the input capacitance. The input bond-wire can potentially create
complex conjugate poles, which cause peaking in the frequency domain and ringing
in the time domain.
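\[
R_T(f) = \frac{R_L\,(1 - g_m R_F)}{1 + g_m R_L}\cdot\frac{1}{\left(1 + \frac{f}{p_1}\right)\left(1 + \frac{f}{p_2}\right)} \tag{4.1}
\]
\[
R_{in} = Z_{in}(s)\big|_{s=0} = \frac{R_L + R_F}{1 + g_{mM1} R_L} \tag{4.2}
\]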
With the design parameters listed in Table 4.1, the post layout simulation
results for the gain frequency response of the 2-stage TIA are plotted in
Figure 4.4. The layout parasitics are critical: in this example, they reduce the
TIA's bandwidth by more than half when compared to the schematic simulation
result. Even though the mean fT,max of the nfet is 117.8 GHz in the schematic
Monte Carlo simulation plotted in Figure 4.5, the achievable gain-bandwidth
product using the nfet in a practical design will be much less than fT,max,
especially without on-chip inductors. This is due to the actual bias condition
and the parasitic effects introduced at the input and output nodes of the nfet.
[Figure 4.4 plot: TIA gain (dBΩ) and group delay (ps) versus frequency (GHz).]
Figure 4.4: Post layout simulation results for TIA frequency response and
group delay at nominal process corner, 40 °C.
An extracted layout transient simulation was performed at the nominal process
corner, including bondwire S-parameters extracted with ADS. A PRBS-15 pattern
(chosen for practical simulation time) with 1.2 ps rms jitter is used as the
stimulus. The stimulus has a 70 ps rise/fall time, with the zero level and one
level at 1.5 µA and 174 µA, respectively, which are referenced from the measured
MZM optical output after a 3.5 dB coupling loss into the PD with a ρ of 0.8 A/W.
The simulated TIA output at 4 Gb/s is plotted as an eye diagram in Figure 4.6. It
has a peak-to-peak jitter of about 13.5 ps. The group delay of the 2-stage TIA is
also plotted in Figure 4.4. As claimed in [48], the group delay variation is
required to be less than ±10% of the bit period (±0.1 UI) over the specified
bandwidth. However, the simulated group delay variation is 330 ps, which is
1.32 UI at 4 Gb/s. The effect of group delay distortion on the TIA output eye
diagram is not obvious because the resulting jitter is overwhelmed by the source
jitter. Moreover, the signal enters the limiting region in the following LA
stages, which makes the group delay specification less critical at this stage.
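The group delay comparison above is a unit conversion between picoseconds and unit intervals (UI). A minimal sketch, with illustrative function names:

```python
def unit_interval_ps(bit_rate_gbps):
    """Bit period in picoseconds for a bit rate given in Gb/s."""
    return 1000.0 / bit_rate_gbps

def delay_in_ui(delay_ps, bit_rate_gbps):
    """Express a delay in picoseconds as a fraction of the bit period."""
    return delay_ps / unit_interval_ps(bit_rate_gbps)

BIT_RATE_GBPS = 4.0
GROUP_DELAY_VARIATION_PS = 330.0   # simulated value quoted in the text

budget_ps = 0.1 * unit_interval_ps(BIT_RATE_GBPS)
print(f"UI at {BIT_RATE_GBPS} Gb/s = {unit_interval_ps(BIT_RATE_GBPS):.0f} ps")
print(f"+/-0.1 UI guideline    = +/-{budget_ps:.0f} ps")
print(f"simulated variation    = "
      f"{delay_in_ui(GROUP_DELAY_VARIATION_PS, BIT_RATE_GBPS):.2f} UI")
```

At 4 Gb/s the bit period is 250 ps, so the ±0.1 UI guideline corresponds to ±25 ps, and 330 ps corresponds to the 1.32 UI stated in the text.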
4.1.1.2 Noise and Sensitivity
[Figure 4.5 plot: histogram of the number of samples versus fT,max (90 to 150 GHz). Annotations: number = 1000; mean = 117.82 GHz; std dev = 8.88 GHz; biased at 2.78 mA/µm2.]
Figure 4.5: Histogram plot from nominal process Monte Carlo simulation
of a 1.2 V nfet biased at maximum fT, 40 °C, in IBM 130 nm CMOS
process.
The first stage in the front-end circuit is key to the noise performance. The
input-referred current noise density at mid-band frequencies is derived in
Equation 4.3, where γ is a process dependent thermal noise coefficient of the
nfet. The thermal noise of the feedback resistor (RF) is directly reflected at
the input. It is preferable to increase RF and the gm of M1 (equivalent to
increasing the gain) to reduce the noise floor. However, the bandwidth is reduced
by increasing the gain, and the nfet channel noise referred to the input starts
to increase from the TIA bandwidth onwards. This can be observed from the post
layout simulation result plotted in Figure 4.7. For the integrated noise, it is
necessary to look at the whole spectrum up to about twice the TIA bandwidth [48].
The estimated input referred rms current noise (in,rms) is less than 1.2 µA.
\[
I_{n,in}^{2} \approx \frac{4kT}{R_F} + \frac{4kT}{(1 - g_m R_F)^2}\left(\frac{1}{R_L} + \gamma g_m\right) \tag{4.3}
\]
[Figure 4.6 annotations: jitter p-p 13.5 ps; bit rate 4 Gb/s; eye amplitude 170.5 mV.]
Figure 4.6: Post layout simulation eye diagram of TIA output at 4 Gb/s.
[Figure 4.7 annotations: flicker noise dominated; nfet channel noise + feedback resistor noise; nfet channel noise dominated.]
Figure 4.7: TIA input referred current noise spectrum from post layout
simulation.
A wideband front-end circuit admits more noise, while a narrowband front-end
circuit has more ISI distortion at higher speed, which also reduces the
sensitivity [48]. The bit error rate (BER) is derived by integrating the
Gaussian-like noise distribution (thermal noise is assumed) from Q times the
standard deviation (σ) to infinity, which is expressed with the complementary
error function in Equation 4.4 [40]. In order to achieve a BER of 10^-12, Q = 7
is required. The sensitivity, which is defined as the optical modulation
amplitude (OMA) seen at the PD, can be expressed by Equation 4.5 for
non-return-to-zero (NRZ) signaling, in which the scaling factor α = 2Q is the
amplitude signal-to-noise ratio (SNR). inoise = in,rms + ishot + iRIN, where
ishot and iRIN are the shot noise and the relative intensity noise due to the PD
dark current and the laser source, respectively. η is the frequency dependent
loss factor, e.g. η = 0.7 at the -3 dB bandwidth. Vth is the decision threshold
of the circuit after the TIA [40]. ZTIA is the TIA's trans-impedance dc gain. The
factor η in Equation 4.5 shows that the sensitivity degrades when the bandwidth
is lowered beyond a certain extent.
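$$\mathrm{BER} = \int_{Q\sigma}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right) dx = \frac{1}{2}\,\mathrm{erfc}\left(\frac{Q}{\sqrt{2}}\right) \qquad (4.4)$$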
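The relation between Q and BER in Equation 4.4 is easy to evaluate directly. The
following is a minimal sketch using only the Python standard library; the bisection
helper `q_for_ber` is an illustrative addition, not part of the original analysis.

```python
import math


def ber_from_q(q):
    """BER for Gaussian noise, Equation 4.4: 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


def q_for_ber(target_ber, lo=0.0, hi=20.0):
    """Invert Equation 4.4 by bisection (BER falls monotonically with Q)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ber_from_q(mid) > target_ber:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print("BER at Q = 7:", ber_from_q(7.0))
print("Q for BER = 1e-12:", q_for_ber(1e-12))
```

Evaluating the function at Q = 7 gives a BER on the order of 10⁻¹², consistent with the
requirement quoted in the text.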
$$\mathrm{OMA}_{PD} = \frac{(\alpha\, i_{noise}/\eta) + (V_{th}/Z_{TIA})}{\rho} \qquad (4.5)$$
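As a rough illustration of Equation 4.5, the sketch below plugs in the 1.2 µA noise bound
quoted earlier (neglecting shot noise and RIN), Q = 7, η = 0.7 and the 0.8 A/W
responsivity mentioned in the measurement section. The threshold Vth and trans-impedance
ZTIA are hypothetical values chosen only to make the example runnable.

```python
import math


def oma_sensitivity_w(q, i_noise_a, eta, v_th, z_tia, rho):
    """OMA at the PD in watts, following Equation 4.5 with alpha = 2Q."""
    alpha = 2.0 * q
    return (alpha * i_noise_a / eta + v_th / z_tia) / rho


def w_to_dbm(power_w):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(power_w / 1e-3)


# v_th and z_tia are assumed values; the other inputs are quoted in the text.
oma_w = oma_sensitivity_w(q=7.0, i_noise_a=1.2e-6, eta=0.7,
                          v_th=5e-3, z_tia=1000.0, rho=0.8)
print("Estimated OMA:", oma_w * 1e6, "uW =", w_to_dbm(oma_w), "dBm")
```

This is a thermal-noise-only estimate under the stated assumptions; effects such as
bondwire parasitics, coupling loss and ISI are not modeled, so it should not be expected
to match a measured sensitivity.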
4.1.2 Limiting Amplifier
The LA needs to provide a high gain-bandwidth product and convert the incoming
single-ended signal from the TIA to a differential signal. For a broadband high gain
amplifier, low gain stages in cascade are used. The gain and bandwidth trade-off with a
number of identical stages has been studied systematically in [48, 50, 51, 52]. To
simplify the analysis, let’s assume that the LA consists of n identical cascaded gain
stages that do not load each other. As an example, consider a total gain-bandwidth
product of 20 GHz, where the gain (As) and bandwidth (BWs) of a single stage amplifier
are 20 (26 dB) and 1 GHz, respectively. With n identical first-order (Butterworth) stages
in cascade, the total gain can be kept the same when each stage has a gain of
$\sqrt[n]{A_s}$, while the total bandwidth is improved by a factor of
$A_s^{1-1/n}\sqrt{\sqrt[n]{2}-1}$ over the bandwidth of a single stage with a gain of As.
For this example, the bandwidth improvement ratio and stage dc gain versus the number of
stages are plotted in Figure 4.8. The bandwidth improvement peaks at about 4.25 when
n = 6. The corresponding stage gain is 1.65, which means the bandwidth of each stage
should be 12.14 GHz if the gain-bandwidth product remains constant. Using active
feedback, rather than simply cascading stages, to make a higher order system can extend
the overall bandwidth, but at the cost of a reduced dc stage gain and increased power
consumption [51, 52]. The gain, bandwidth and power consumption trade-offs need to be
evaluated in the transistor-level circuit design.
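The cascade example above can be reproduced with a few lines of Python. The sketch
evaluates the per-stage gain and the bandwidth improvement factor for the quoted case
(As = 20, single-stage bandwidth 1 GHz); the function names are illustrative.

```python
import math


def stage_gain(total_gain, n):
    """Gain of each of n identical stages that together give total_gain."""
    return total_gain ** (1.0 / n)


def bandwidth_improvement(total_gain, n):
    """Bandwidth of the n-stage cascade relative to one stage of gain total_gain."""
    return total_gain ** (1.0 - 1.0 / n) * math.sqrt(2.0 ** (1.0 / n) - 1.0)


A_S = 20.0     # total gain from the example
GBW_GHZ = 20.0  # gain-bandwidth product of one stage (20 x 1 GHz)

improvement = {n: bandwidth_improvement(A_S, n) for n in range(1, 10)}
best_n = max(improvement, key=improvement.get)

for n in sorted(improvement):
    print(n, round(stage_gain(A_S, n), 3), round(improvement[n], 3))
print("best n:", best_n, "stage bandwidth (GHz):", GBW_GHZ / stage_gain(A_S, best_n))
```

For these inputs the maximum falls at n = 6 with an improvement of about 4.25, a stage
gain of about 1.65 and a per-stage bandwidth of about 12.14 GHz, matching the numbers in
the text.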
4.1.2.1 Gain Stages with Active Feedback
This LA design uses twelve single stages in total with active feedback, as shown in
Figure 4.9. The inputs of the LA are connected to the TIA output and the feedback node,
respectively. Ideally, the feedback node provides a dc bias for comparison with the TIA
output. It should be noted that the amplitude and common-mode level of the differential
signals in the first several stages will not be equal due to the single-ended to
differential conversion. In order to get a genuine differential signal, additional buffer
stages are required. For easy hand calculation, let’s assume the feedback stage does not
introduce extra loading to the single stage cell Av(s). Then the transfer function of the
3rd-order stage with active feedback can be expressed as Equation 4.6. It has a dc gain
of $A_0^3/(1 + A_0^2 G_f R_L)$, one non-dominant real pole p1 expressed in Equation 4.7
and two dominant complex conjugate poles p2,3 expressed in Equation 4.8. From the s-plane
pole locations illustrated in Figure 4.9, it can be observed graphically that the active
feedback technique extends the bandwidth of the third-order system compared with the same
system without active feedback. The f−3dB increases with the feedback gain, at the
expense of more dc gain. The physical layout routing for the feedback cell Gf will
introduce some systematic mismatch. The interleaving active feedback proposed in [52]
would complicate the layout design and is thus avoided. Moreover, the frequency peaking
effect needs to be carefully examined with extracted layout simulation.

Figure 4.8: Bandwidth improvement factor and stage dc gain versus the number of stages
for an achievable gain-bandwidth product of 20 GHz (20×1 GHz).
$$H(s) = \frac{A_v^3(s)}{1 + A_v^2(s)\,\dfrac{G_f R_L}{1 + s/\omega_p}} = \frac{A_0^3}{\left(1 + \dfrac{s}{\omega_p}\right)^3 + A_0^2 G_f R_L} \qquad (4.6)$$
Figure 4.9: Topology of the limiting amplifier: cascaded 3rd-order stages, each built
from three cells Av(s) = A0/(1 + s/ωp) with load resistors RL and an active feedback
cell Gf; the inset shows the s-plane locations of the poles p1, p2 and p3 relative to
−ωp.
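$$p_1 = -\omega_p\left(1 + \sqrt[3]{A_0^2 G_f R_L}\right) \qquad (4.7)$$

$$p_{2,3} = -\omega_p\left(1 - \frac{\sqrt[3]{A_0^2 G_f R_L}}{2} \pm j\,\frac{\sqrt{3}}{2}\sqrt[3]{A_0^2 G_f R_L}\right) \qquad (4.8)$$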
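The pole expressions in Equations 4.7 and 4.8 and the bandwidth extension they imply can
be checked numerically. In the sketch below the frequency is normalized to ωp = 1 and the
loop term A0²GfRL = 1 is a hypothetical value; the grid scan for the −3 dB point is a
simple illustration rather than the method used in the design.

```python
import math


def active_feedback_poles(omega_p, loop_k):
    """Poles from Equations 4.7 and 4.8, with loop_k = A0^2 * Gf * RL."""
    k = loop_k ** (1.0 / 3.0)
    p1 = complex(-omega_p * (1.0 + k), 0.0)
    p2 = -omega_p * complex(1.0 - k / 2.0, math.sqrt(3.0) / 2.0 * k)
    return p1, p2, p2.conjugate()


def denominator(s, omega_p, loop_k):
    """Denominator of Equation 4.6 (the numerator A0^3 is a constant)."""
    return (1.0 + s / omega_p) ** 3 + loop_k


def bandwidth_3db(omega_p, loop_k, span=10.0, steps=100000):
    """First angular frequency where the gain is 3 dB below dc (grid scan)."""
    dc_gain = 1.0 / abs(denominator(0j, omega_p, loop_k))
    for i in range(1, steps + 1):
        omega = span * omega_p * i / steps
        gain = 1.0 / abs(denominator(1j * omega, omega_p, loop_k))
        if gain < dc_gain / math.sqrt(2.0):
            return omega
    return None


OMEGA_P = 1.0  # normalized pole frequency (assumption)
LOOP_K = 1.0   # hypothetical A0^2 * Gf * RL

poles = active_feedback_poles(OMEGA_P, LOOP_K)
bw_without = bandwidth_3db(OMEGA_P, 0.0)
bw_with = bandwidth_3db(OMEGA_P, LOOP_K)
print("poles:", poles)
print("bandwidth without / with feedback:", bw_without, bw_with)
```

Each returned pole makes the denominator of Equation 4.6 vanish, and for this assumed
loop term the −3 dB frequency with feedback is higher than for three plain cascaded
cells, in line with the argument in the text.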
4.1.2.2 DC Offset Compensation
Converting the single-ended signal to a differential signal requires a dc offset
compensation loop, which senses the common-mode voltages of the LA’s outputs and feeds
them back to the LA input in negative feedback. The purpose of the negative feedback is
to reduce the offset voltage of the system and to provide a stable dc bias close to the
average voltage of the TIA output across PVT variations. Figure 4.10 illustrates the
function. Equation 4.9 relates the gains and the offset voltages in the loop, and
Equation 4.10 gives the equivalent input referred offset voltage. It can be seen that the
offset voltages present at the inputs of A1 and A2 are reduced by the loop gain and the
forward gain of the system, respectively. Thus the offset of the feedback opamp A2 has to
be designed to be small. To ensure loop stability, the RC time constant needs to be large
enough to place the dominant pole low enough to meet the phase margin requirement. The
low cutoff frequency (P2) is set by RC and depends on the gains of the A1 and A2 stages
[48]. An on-chip capacitor results in a large area, while an off-chip capacitor has a
bondwire inductance in series with the terminating capacitor. The loop stability needs to
be carefully simulated across PVT variations with the bondwire effect taken into account.
The higher the loop gain, the smaller the static error between the feedback node voltage
and the average input voltage and, in turn, the more symmetrical the LA’s differential
outputs. Thus, a folded cascode gain-boosted opamp (Figure 4.11) with a dc gain of 65 dB
is used as the feedback opamp (A2) [53]. Amplifiers Gt and Gb with a compact CMFB scheme
are used for gain-boosting [54]. In this design, P2 is set below 10 MHz, so RC needs to
be large because of the large loop gain.
$$\left[(V_{os1} - V_{cm})A_1 + V_{os2}\right]A_2 = V_{cm} \qquad (4.9)$$

$$V_{offset} = V_{os1} - V_{cm} \approx \frac{V_{os1}}{A_1 A_2} + \frac{V_{os2}}{A_1} \qquad (4.10)$$
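The offset reduction and the low cutoff frequency can be illustrated with a short
calculation. The sketch solves Equation 4.9 for Vcm, compares the magnitude of the
residual offset with the sum of the two terms of Equation 4.10 taken as magnitudes (the
offsets are random, so only their magnitudes are compared), and evaluates the P2
expression annotated in Figure 4.10. A2 = 65 dB is from the text; A1, the offsets, R and
C are hypothetical values.

```python
import math


def residual_offset(vos1, vos2, a1, a2):
    """Solve Equation 4.9 for Vcm and return Vos1 - Vcm."""
    vcm = (vos1 * a1 * a2 + vos2 * a2) / (1.0 + a1 * a2)
    return vos1 - vcm


def offset_bound(vos1, vos2, a1, a2):
    """Magnitude bound built from the two terms of Equation 4.10."""
    return abs(vos1) / (a1 * a2) + abs(vos2) / a1


def low_cutoff_hz(a1, a2, r, c):
    """P2 = (A1*A2 + 1) / (2*pi*R*C), as annotated in Figure 4.10."""
    return (a1 * a2 + 1.0) / (2.0 * math.pi * r * c)


A2 = 10.0 ** (65.0 / 20.0)  # 65 dB feedback opamp gain from the text
A1 = 100.0                  # hypothetical forward gain of the LA
VOS1, VOS2 = 10e-3, 2e-3    # hypothetical offsets in volts
R_F, C_F = 1e6, 10e-9       # hypothetical filter values (off-chip capacitor)

print("residual offset (V):", residual_offset(VOS1, VOS2, A1, A2))
print("bound from Eq. 4.10 (V):", offset_bound(VOS1, VOS2, A1, A2))
print("P2 (Hz):", low_cutoff_hz(A1, A2, R_F, C_F))
```

The example makes the role of the two gains visible: Vos1 is divided by the full loop
gain, whereas Vos2 is only divided by A1, which is why the offset of A2 has to be kept
small.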
4.1.2.3 Large-Signal in Limiting Region
Depending on the magnitude of the input signal, as it propagates to a certain stage in
the limiting amplifier, the tail current in the CML stages is fully steered to one side
so that the voltage swing is limited to the tail current times the load resistance. The
rise time of the output voltage is mainly determined by the RC time constant. This delay
depends on the charging of the load capacitor through the resistors connected to the
supply rail, and thus the total capacitance contributed by the current stage and the next
stage should be minimized. The fall time of the output is set by discharging the load
capacitor while the transistor transitions from the sub-threshold region to the
saturation region (when the amplitude is large, it enters the triode region once the gate
voltage exceeds the drain voltage plus Vth), with the discharge current approaching the
tail current [16]. The bandwidth-limited signal in the linear region gets sharpened in
the limiting region. As shown in Figure 4.12, when a 2.5 GHz 150 mVPP sinusoidal signal
propagates through the LA stages, it is first amplified in the linear region and then
sharpened in the limiting region. Signals with slow rise/fall times are more susceptible
to additive noise. A signal with sharper transition edges is therefore desirable, even
though it gets sharpened by the later limiting stages in the LA [28].

Figure 4.10: Illustration of dc offset compensation. The annotations give the low cutoff
frequency P2 = (A1A2 + 1)/(2πRC) and note that P1 is approximately at the bandwidth of
A1.

Figure 4.11: Schematic of the folded cascode gain-boosted opamp used in the dc offset
compensation feedback loop.
4.1.3 Output Buffer
Signal integrity, electro-migration and ESD protection are the main design considerations
for the output buffer. The signals arriving at its inputs are already large-amplitude
limited signals. The tail current of the output buffer is set by the required output
signal amplitude and the termination resistance. As shown in Figure 4.13, 1.2 V nfets are
used as the high speed switching devices. To protect the output devices from ESD damage,
a 3.3 V thick gate oxide nfet is cascoded on top of each 1.2 V nfet. In addition, a
ballasting resistor (Rd) is added in series with the drain of each unit finger of the
3.3 V nfet as a current limiter, so that the ballasting resistors of the fingers appear
in parallel. The unit width and total width of the nfet should be properly sized for ESD
safety, and the device should be carefully laid out even with the ballasting resistors
present.

Figure 4.12: A 2.5 GHz 150 mVPP sinusoidal signal gets amplified and takes on a more
NRZ-like waveform with sharpened edges as it travels along the LA chain (panels: input,
2nd, 4th, 6th, 8th and 11th stage; magnitude in V versus time in ns).
On-chip resistors with a nominal value of 50 Ω are used to alleviate reflections from the
transmission line on the PCB. The line usually sees a 50 Ω termination to ground at the
equipment side (e.g. the oscilloscope). The required tail current can be estimated as the
desired output amplitude divided by 25 Ω (the on-chip 50 Ω in parallel with the 50 Ω
equipment termination), which also sets the dc operating point of the CML buffer. The dc
tail current is set to 10 mA in the current mirror by default. However, there will be
current mismatch because the Vds of the current mirror devices will not be the same. In
order to reduce the Vgs drop of the switching devices (M1,2) and keep M4 in saturation,
low VT nfets are adopted for M1,2. A minimum metal width is required to meet the
electro-migration reliability requirements.

Figure 4.13: The schematic of the level shifted output buffer (annotations: 50 Ω loads,
ballasting resistors Rd, switching devices M1 and M2 marked lvt, mirror devices M3 with
m = 1 and M4 with m = 100, bias vbn, control bits ctrl<0> and ctrl<1>, supply VDD33).
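The 25 Ω rule for the output buffer follows from the on-chip load and the equipment
termination appearing in parallel. A minimal sketch, with illustrative function names;
the 250 mV figure is simply what the quoted 10 mA default current implies under this
rule, not a value stated in the text:

```python
def parallel(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def tail_current_a(swing_v, r_onchip=50.0, r_equipment=50.0):
    """Tail current needed for a given single-ended output swing."""
    return swing_v / parallel(r_onchip, r_equipment)


def output_swing_v(i_tail_a, r_onchip=50.0, r_equipment=50.0):
    """Single-ended swing produced by a fully steered tail current."""
    return i_tail_a * parallel(r_onchip, r_equipment)


print("swing at 10 mA:", output_swing_v(10e-3), "V")
print("current for 250 mV:", tail_current_a(0.25), "A")
```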
Figure 4.14 shows the simulated eye diagram of the OB driving a 50 Ω load through a dc
blocker. In the lab measurement, this signal goes not only to the sampling oscilloscope
for eye measurement, but also to the BERT for bathtub and sensitivity testing. Multiple
jitter sources contribute to the total jitter (TJ) in the eye diagram of the OB. The
distribution of TJ is the convolution of the distributions of random jitter (RJ) and
deterministic jitter (DJ). Thus the probability density function (PDF) of TJ is not
Gaussian in practice, as illustrated in Figure 4.14. A tail fitting technique can be used
to construct a Gaussian distribution that accurately matches the tails of the actual PDF
[55]. The bathtub plot is obtained by building the cumulative distribution function (CDF)
of the jitter on the left and right data edges. It shows the probability of error versus
the sampling point, and is used to estimate the eye opening at very low BER levels.
Similarly, instead of the time domain, the BER can be characterized in the magnitude
domain. This is shown in the experimental results section.

Figure 4.14: Post layout simulated eye diagram of the signal coming out from the output
buffer at 4 Gb/s. Annotations: 1. Move sampling location to check BER (bathtub). 2. Tune
laser power to check BER (sensitivity). Noise statistics and tail fitting are indicated.
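The construction of a bathtub curve from edge jitter statistics can be sketched as
follows. This is not the tail fitting procedure of [55]; it assumes a simple dual-Dirac
model in which the DJ distribution is two equal impulses, so that the convolution with
Gaussian RJ has a closed form. The jitter values and the transition density are
hypothetical.

```python
import math


def gaussian_tail(x, sigma):
    """Probability that a zero-mean Gaussian with std sigma exceeds x."""
    return 0.5 * math.erfc(x / (sigma * math.sqrt(2.0)))


def edge_tail(x, rj_rms, dj_pp):
    """P(edge jitter > x) for Gaussian RJ convolved with a dual-Dirac DJ."""
    return 0.5 * (gaussian_tail(x - dj_pp / 2.0, rj_rms)
                  + gaussian_tail(x + dj_pp / 2.0, rj_rms))


def bathtub_ber(x_ui, rj_rms, dj_pp, transition_density=0.5):
    """BER versus sampling point x_ui (0 = left edge, 1 = right edge)."""
    left = edge_tail(x_ui, rj_rms, dj_pp)         # left edge arrives late
    right = edge_tail(1.0 - x_ui, rj_rms, dj_pp)  # right edge arrives early
    return transition_density * (left + right)


def eye_opening_ui(target_ber, rj_rms, dj_pp, steps=2000):
    """Width of the sampling window where the BER stays at or below target."""
    good = [i / steps for i in range(steps + 1)
            if bathtub_ber(i / steps, rj_rms, dj_pp) <= target_ber]
    return max(good) - min(good) if good else 0.0


RJ_RMS_UI, DJ_PP_UI = 0.03, 0.2  # hypothetical jitter values in UI
print("opening at 1e-8:", eye_opening_ui(1e-8, RJ_RMS_UI, DJ_PP_UI))
print("opening at 1e-12:", eye_opening_ui(1e-12, RJ_RMS_UI, DJ_PP_UI))
```

The eye opening shrinks as the target BER is lowered, which is the behavior the bathtub
plot is used to quantify.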
4.2 Experimental Results
To represent a complete optical link, a continuous-wave (CW) laser, a commercial LN
(LiNbO3) Mach-Zehnder modulator (MZM) and a discrete broadband driver are used for the
electro-optical testing. As shown in Figure 4.15 (a), the CMOS die fabricated in the IBM
130 nm CMOS process and the commercial PD die (500 µm × 500 µm) are wire bonded using the
chip-on-board (COB) method. The bondwire span between the TIA input pad and the PD anode
pad is about 820 µm. Figure 4.15 (b) shows the 4-layer PCB fabricated with Rogers
RO4350B laminate. An ENEPIG surface finish is required for COB bonding. Parallel
decoupling capacitors (100 pF, 1 µF, 10 µF) are used on the 1.2 V, 3.3 V and PD power
supplies (no LDO is used). A grounded coplanar waveguide (GCPW) T-line was designed to
carry the signal out to the equipment. As indicated in Figure 4.15 (c), a single fiber
probe is aligned on top of the PD. The differential outputs of the OB are connected to
the sampling oscilloscope (Keysight DCA 86100D) and the BERT (Anritsu MP1800A),
respectively, so that the waveform and eye diagram can be inspected while the sensitivity
is measured. A picture of the experimental test setup is shown in Figure 4.15 (d). The
yellow fiber cable at the bottom right connects to the tunable CW laser source at the
other side of the room. The laser wavelength is set to 1550 nm.
A detailed block diagram of the test setup is shown in Figure 4.16. The optical stimulus
for the PD is generated by a commercial LiNbO3 Mach-Zehnder modulator, which is driven by
the Anritsu MP1800A PPG through a high-voltage-swing broadband driver. A polarization
controller is required since non-polarization-maintaining single mode patch cables are
used.
4.2.1 Eye Measurement
The optical signal at the output of the MZM is recorded for later sensitivity analysis.
Figure 4.17 shows the optical eye diagrams measured at the output of the MZM with PRBS-31
at 4 Gb/s and 5 Gb/s, respectively, at different input laser powers. A total insertion
loss of about 12.1 dB is observed from the laser to the MZM output. It should be noted
that the zero level power shown in Figure 4.17 (a) and (b) goes down to negative values.
This is because an uncalibrated high bandwidth plug-in module (Keysight 86105D) was used
for this optical measurement; a high bandwidth module also integrates more noise. The
insertion loss reduces to about 12 dB if the optical signal is offset up by 4 µW. The
actual extinction ratio (ER) is then 17.8 dB and 16.5 dB when operating at 4 Gb/s and
5 Gb/s, respectively, with 6 dBm laser power.

Figure 4.15: (a) Chip microphotograph. (b) Chip-on-board bonding to the PCB. (c) PCB
setup. (d) Test setup in the lab. (Labels: CMOS die, PD, fiber alignment, differential
outputs, DCA 86100D, MP1800A, TIA input, PD power, 3.3 V, 1.2 V, output buffer outputs,
Cex.)
For stable detection in the photoconductive mode, the PD power supply needs to be set
above 5.6 V. Given a PD responsivity of 0.8 A/W, the MZM optical power level and the PD
power supply current indicate a coupling loss of about 3.5 dB due to the fiber alignment.
It follows that the input current eye height for obtaining the electrical eyes in Figure
4.18 is about 173 µA, 172 µA and 170 µA at 4 Gb/s, 4.5 Gb/s and 5 Gb/s, respectively. In
Figure 4.18 (a)-(c), the data rate, rise/fall time, rms jitter, peak-to-peak jitter, eye
height and SNR were recorded for one of the output buffer’s differential signals over a
large number of counts, along with the eye diagram, using a PRBS-31 pattern. The receiver
draws 39 mA from 1.2 V and 12 mA from 3.3 V, resulting in a total dc power consumption of
about 85 mW. The power consumption breakdown of the limiting receiver is plotted in
Figure 4.19.

Figure 4.16: Block diagram of the test setup used for measuring the eye diagram and BER
of the designed receiver (CW laser, polarization controller, LiNbO3 MZM with MZM driver
fed by the Anritsu MP1800A PPG, positioner and fiber alignment, CMOS chip and photodiode,
dc blockers, Keysight DCA 86100D for the eye diagram with trigger clock, MP1800A for BER;
connections are single mode fiber and RF cable).
4.2.2 Bathtub and Sensitivity
One of the output buffer’s differential outputs was connected to the Anritsu MP1800A
error detector. An automatic bathtub measurement was performed to record the BER versus
the internal CDR clock phase, as plotted in Figure 4.20. The bathtub plot is not smooth
because the fiber alignment gradually shifts during the long measurement. At 4 Gb/s, the
eye opening at BER = 10⁻¹² is around 0.1 UI, while at 5 Gb/s, the eye opening at
BER = 10⁻⁸ is around 0.2 UI.

Figure 4.17: Optical eye diagrams of the MZM output measured with PRBS-31 at 4 Gb/s with
(a) 2 dBm and (c) 6 dBm laser power, and at 5 Gb/s with (b) 2.8 dBm and (d) 6 dBm laser
power. The wavelength is set to 1550 nm.
Sensitivity is also characterized by manually sweeping the input laser power while
recording the BER. Figure 4.21 plots the BER versus the MZM average power (top x-axis)
and the optical modulation amplitude (OMA, bottom x-axis). The OMA is the optical power
amplitude, expressed here on a logarithmic scale in dBm. When the receiver operates at
4 Gb/s with a PRBS-31 pattern, a BER of 10⁻¹² is achieved at a sensitivity level of
-6.2 dBm average power and -3.2 dBm OMA at the MZM output (estimated OMA ≈ -6.7 dBm after
coupling to the PD).
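The chain from average power to OMA to photocurrent can be cross-checked with the
standard relation OMA = 2·Pavg·(ER − 1)/(ER + 1), where ER is the linear extinction
ratio. This relation is textbook material rather than something stated in this chapter,
and the function names below are illustrative. The inputs are the values quoted in the
text: -6.2 dBm average power, 17.8 dB ER, 3.5 dB coupling loss and 0.8 A/W responsivity.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def mw_to_dbm(p_mw):
    """Convert milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)


def oma_from_average_dbm(p_avg_dbm, er_db):
    """OMA in dBm from average power and extinction ratio."""
    er = 10.0 ** (er_db / 10.0)
    return mw_to_dbm(2.0 * dbm_to_mw(p_avg_dbm) * (er - 1.0) / (er + 1.0))


def photocurrent_swing_ua(oma_dbm, coupling_loss_db, responsivity_a_per_w):
    """Photocurrent swing in microamps after coupling loss."""
    oma_pd_w = dbm_to_mw(oma_dbm - coupling_loss_db) * 1e-3
    return oma_pd_w * responsivity_a_per_w * 1e6


oma_mzm = oma_from_average_dbm(-6.2, 17.8)
i_swing = photocurrent_swing_ua(-3.2, 3.5, 0.8)
print("OMA at MZM output (dBm):", oma_mzm)
print("photocurrent swing (uA):", i_swing)
```

With a finite ER the computed OMA is slightly below the -3.2 dBm quoted above, which
corresponds to the high-ER limit of about 3 dB above the average power. The photocurrent
swing at -6.7 dBm OMA comes out near 171 µA, close to the input current eye heights
reported for Figure 4.18.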
4.3 Summary
A limiting receiver was designed and fabricated in the IBM 130 nm CMOS process. It was
wire-bonded to a commercial discrete PD. A complete electro-optical link was demonstrated
with a discrete high-swing broadband driver and a discrete LN MZM as the transmitter. A
BER of 10⁻¹² at 4 Gb/s is achieved with a TIA bandwidth of 0.46×B and an LA bandwidth of
2×B in the receiver’s linear region. Table 4.2 compares some of the OE limiting receivers
fabricated in 130 nm CMOS processes. It can be observed that using an integrated PD or
flip-chip bonding to eliminate the input bondwire helps to improve the performance.
Several other lessons can be learned from this design: 1). Bondwire effects in the
side-by-side bonding option and the overload current effect need to be co-simulated with
the extracted layout. 2). In order to reduce the parasitic effects introduced by the
metal routing, the TIA block can be placed close to the input pad. There is no latchup
issue since the TIA contains only nfets and no large switching currents. ESD protection
devices at the TIA input can be waived for high-speed applications. 3). The later stages
of the LA, which work in the limiting region, also act as predriver stages for the output
buffer. 4). Other important issues such as signal integrity and power integrity,
including the T-line design on the PCB, the selection of SMA connectors and the power
supply noise in different frequency bands, are key to the overall system performance.
Table 4.2: Comparison of the optoelectronic RX fabricated in 130 nm (SOI) CMOS process.

| Parameter | [5] JSSC 2006 | [56] JSSC 2009 | [43] JQE 2012 | This work |
|---|---|---|---|---|
| RX topology | TIA+LA+CDR | TIA+EQ+LA+OB | TIA+EQ+LA+OB | TIA+LA+OB |
| TIA topology | Shunt feedback CS | Diff. shunt feedback | Diff. shunt feedback | Shunt feedback CS |
| Feedback resistance | 310 Ω | 5 kΩ (PMOS) | 4 kΩ | 768 Ω |
| On-chip inductor | Yes | No | No | No |
| PD type | PIN flip-chip | CMOS integrated | CMOS integrated APD | Hybrid InGaAs/InP PIN wire-bond |
| PD capacitance | 150 fF | 1 pF | NA | 100 fF |
| Optical wavelength | 1550 nm | 850 nm | 850 nm | 1550 nm |
| Max. data rate | 10 Gb/s | 4.5 Gb/s | 10 Gb/s | 4 Gb/s |
| Sensitivity | -19.5 dBm∗ | -3.8 dBm∗ | -4 dBm∗ | -6.7 dBm (-6.2 dBm∗) |
| Measurement method | Packaged | NA | Die with RF probing | COB with SMA and RF cable |

Note: ∗ Average input power is measured as sensitivity for BER = 10⁻¹². Otherwise, it is
characterized with OMA.
Figure 4.18: (a)-(c) Electrical eye diagrams of a single-ended output of the limiting
receiver measured with PRBS-31 from 4 Gb/s to 5 Gb/s. (d) Oscilloscope mode at 5 Gb/s
with PRBS-7 pattern.

Figure 4.19: Power consumption breakdown of the limiting receiver.

Figure 4.20: BER bathtub measurement of the limiting receiver with PRBS-31 operating from
4 Gb/s to 5 Gb/s (x-axis: normalized phase in UI; y-axis: BER, log scale).

Figure 4.21: Sensitivity plot of the limiting receiver with PRBS-31 operating from 4 Gb/s
to 5 Gb/s, BER versus MZM OMA and average power (x-axes: MZM average power and MZM OMA in
dBm; y-axis: BER, log scale).
CHAPTER 5
A 10 GHZ PHASE LOCK LOOP DESIGN
An on-chip phase-locked loop (PLL) that provides high precision clocking is indispensable
for data generation and synchronization in any transceiver. This is especially important
in high-frequency, high-speed applications because it is challenging to feed
high-frequency clocks from off-chip. Higher-speed data transfer is increasingly desired
due to the ever-growing appetite of bandwidth-hungry applications such as video
streaming, online gaming and cloud computing. In this chapter, a systematic design flow
for a 10 GHz type-II third-order charge pump PLL in the IBM 130 nm SiGe BiCMOS process is
presented along with experimental results. The PLL provides the system clock for the
> 10 Gb/s full-rate pseudo-random binary sequence (PRBS), which will be introduced in the
next chapter. As semiconductor technology advances, the channel length of the devices
shrinks for higher speed and density. On the other hand, the gate leakage of the MOS
capacitors used in the traditional analog loop filter increases as the gate oxide becomes
thinner, which can cause system instability in an analog PLL. Low voltage charge pump
design also poses challenges in < 100 nm technologies due to large device mismatches and
the stringent voltage headroom requirement. Digital PLLs adopt a digital loop filter,
which solves the leakage problem, and can replace analog PLLs in < 100 nm processes [57].
However, in this 130 nm SiGe BiCMOS process, thick gate oxide MOS devices and a 2.5 V
power supply are used, so the issues mentioned above do not arise in the analog PLL
design.

In order to perform the PLL loop stability analysis, the VCO characteristics for the
specific topology and process must be available beforehand. The VCO is therefore
characterized right after the PLL architecture is defined. This is followed by the system
loop stability analysis to choose appropriate loop filter parameters. Finally, a noise
budget analysis is performed to predict the noise contribution of each block.
5.1 PLL Architecture
Figure 5.1: Schematic of the proposed type-II third-order PLL architecture (blocks: PFD
with reset delay td driven by CKref, differential charge pump with up/upb and dn/dnb
inputs, loop filter with R1, C1, C2 and Cc producing Vc, LC VCO, VCO buffer with output
CKsyn, and a ÷2 divider chain using CML and TSPC stages with CML to CMOS conversion).
The proposed type-II 3rd-order PLL architecture is shown in Figure 5.1. An LC VCO is
chosen for its high resonance frequency and better phase noise performance. It also suits
narrow tuning range applications, which is the case here. A divide-by-128 divider chain
is used in the feedback loop, so the required reference clock frequency is the PLL output
clock frequency divided by 128. A resistive positive feedback inverter based input buffer
takes the reference clock from an external source. A fully differential charge pump with
an opamp that reduces the transient current mismatch is designed. The PLL is designed
with SiGe HBTs as well as MOS fets and uses a 2.5 V power supply.
5.2 LC VCO Design
The LC VCO schematic is shown in Figure 5.2. The resonance frequency, expressed in
Equation 5.1, depends on the inductance and the total capacitance. Usually L should be
kept small to leave more tuning flexibility to the capacitor banks and to obtain a
smaller KVCO, as indicated in Equation 5.2. A lower KVCO is desirable to reduce the VCO
phase noise due to spurs. The MOS varactor capacitance Cvar should be larger than or
comparable to the load capacitance CL. The MOS varactor is made of an N+ polysilicon gate
over n-well with 5.2 nm gate oxide, and operates between depletion and accumulation. It
has a tuning voltage range of 1 V to -0.5 V and a capacitance range of 2.8:1. The unit
area capacitance is about 6 fF/µm2 at 1.25 V. However, these passive devices are not
ideal in practice; they have series resistance, which needs to be canceled by a negative
resistance. The negative resistance is realized by an npn BJT and pfet double
cross-coupled pair. With the same bias current, in theory, the differential amplitude of
the double cross-coupled pair is twice that of its single cross-coupled counterpart, so
the double cross-coupled topology has a 6 dB phase noise improvement [58]. The tail
current is made tunable to avoid the oscillator working in the voltage limited mode. The
post-layout simulated single-ended peak-to-peak amplitude of the oscillator is about
600-890 mV across process and temperature variations. Since the oscillator works in the
large-signal regime, one side of the npn BJT and pfet double cross-coupled pair will
enter the reverse-active region and the linear region, respectively. This can incur a
severe phase noise penalty. Design techniques are proposed in [59, 60] to shift down the
base or gate voltage so that the oscillator operates in Class-C mode. However, this
requires an extra bias voltage. For layout simplicity, a Class-B oscillator topology is
used in this design. A pfet current mirror is used for the oscillator’s current bias
because it up-converts less 1/f noise than its nfet counterpart. A large resistor and a
pfet capacitor are added to filter out the noise from the current source and the bandgap
reference.

Figure 5.2: Schematic of the LC VCO. Control bits C<2:0> control the capacitor bank for
discrete frequency coarse tuning, C0 = 66 fF.
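$$\omega_0 = \frac{1}{\sqrt{\dfrac{L}{2}\left(C_{var} + \sum_{n=0}^{2} V_C[n]\,2^n C_0 + C_L\right)}} \qquad (5.1)$$

$$K_{VCO} = \left(\frac{\omega_0}{2\pi}\right)' = -2\pi^2 f^3 L \frac{dC_{var}}{dV} \quad (\mathrm{Hz/V}) \qquad (5.2)$$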
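The effect of the capacitor bank code on the resonance frequency in Equation 5.1 can be
sketched numerically. Only C0 = 66 fF and the 2.8:1 varactor ratio come from the text;
the inductance, load capacitance, varactor minimum capacitance and the tanh-shaped
varactor curve (including its direction versus control voltage) are hypothetical
stand-ins. The tuning gain is estimated by a finite difference of the frequency rather
than from a closed form.

```python
import math

C_UNIT = 66e-15  # C0 from the Figure 5.2 caption


def varactor_cap(v_ctrl, c_min=150e-15, ratio=2.8, v_mid=1.4, v_scale=0.3):
    """Hypothetical smooth varactor curve; only the 2.8:1 ratio is from the text."""
    c_max = ratio * c_min
    return c_min + (c_max - c_min) * 0.5 * (1.0 - math.tanh((v_ctrl - v_mid) / v_scale))


def resonance_hz(inductance, code, v_ctrl, c_load):
    """Resonance frequency from Equation 5.1 for a 3-bit capacitor bank code."""
    bank = sum(((code >> n) & 1) * (2 ** n) * C_UNIT for n in range(3))
    c_total = varactor_cap(v_ctrl) + bank + c_load
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance / 2.0 * c_total))


def kvco_hz_per_v(inductance, code, v_ctrl, c_load, dv=1e-3):
    """Tuning gain estimated by a central finite difference."""
    f_hi = resonance_hz(inductance, code, v_ctrl + dv, c_load)
    f_lo = resonance_hz(inductance, code, v_ctrl - dv, c_load)
    return (f_hi - f_lo) / (2.0 * dv)


L_TANK = 400e-12  # hypothetical differential inductance
C_LOAD = 200e-15  # hypothetical load capacitance

freqs = [resonance_hz(L_TANK, code, 1.4, C_LOAD) for code in range(8)]
gains = [kvco_hz_per_v(L_TANK, code, 1.4, C_LOAD) for code in range(8)]
for code in range(8):
    print(code, freqs[code] / 1e9, "GHz", gains[code] / 1e6, "MHz/V")
```

Under these assumptions, switching in more bank capacitance lowers both the frequency and
the tuning gain, which illustrates why KVCO varies across the coarse tuning range and
benefits from compensation.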
Figure 5.3: Two different VCO layout examples. (a) Bad VCO layout scheme: inductor and
varactor too far away from the cross-coupled devices (annotated dimension: ~230 µm).
(b) Better VCO layout scheme: inductor and varactor moved close to the cross-coupled
devices.
Parasitic resistance and capacitance are detrimental to the LC tank resonance frequency.
Even a small amount of resistance (<1 Ω) has a profound effect on high-Q circuits such as
resonators. A bad layout and a good layout are compared in Figure 5.3 (a) and (b),
respectively. The inductor should be placed as close as possible to the cross-coupled
transistors, and any capacitance introduced by the metal routing and the VCO buffer
loading should be considered during the VCO design and simulation. The layout extracted
VCO characteristics at two extreme corners and one nominal corner are plotted in Figure
5.4. With the range of KVCO known, the charge pump current can be made digitally
programmable to compensate for the KVCO variation.

Figure 5.4: Layout extracted simulation results of VCO characteristics: KVCO (MHz/V) and
fVCO (GHz) versus VC (V) at the +3σ, 0 °C; 0σ, 27 °C; and -3σ, 80 °C corners.
5.3 Loop Stability Analysis
The corresponding PLL model is illustrated in Figure 5.5 with individual noise
sources added for later noise analysis. The loop transfer function of type-II third-order
PLL can be expressed in Equation 5.3.
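$$L(s) = \frac{I_{CP}}{2\pi}\left[\left(R_1 + \frac{1}{sC_1}\right) \parallel \frac{1}{sC_2}\right]\frac{2\pi K_{VCO}}{s}\,\frac{1}{N} = \frac{I_{CP}K_{VCO}}{(C_1 + C_2)N}\cdot\frac{1 + s/\omega_Z}{s^2\left(1 + s/\omega_P\right)} \qquad (5.3)$$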
where ωZ = 1/(R1C1) and ωP = (C1 + C2)/(R1C1C2). The unit of KVCO is Hz/V. Let
b = C1/C2, then ωP = (1 + b)ωZ. In the 3rd-order system, two poles are located at the
origin, so the zero should be placed below the unity-gain loop bandwidth (ωu,loop). Thus
we have ωZ < ωu,loop < ωP < ωref, where ωref is the reference angular frequency. By
letting c = ωu,loop/ωZ and using the trigonometric identity
$\tan^{-1}A - \tan^{-1}B = \tan^{-1}\left(\frac{A-B}{1+AB}\right)$, the phase margin (ϕ)
of L(s) is given in Equation 5.4.

Figure 5.5: PLL model with possible noise sources (PFD + charge pump: ICP/2π; 2nd-order
loop filter; VCO: 2πKVCO/s; divider: 1/N; noise inputs: reference + divider noise,
PFD + CP noise, LF noise and VCO noise; from reference clock to PLL output).
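$$\varphi = \tan^{-1}\left(\frac{\omega_{u,loop}}{\omega_Z}\right) - \tan^{-1}\left(\frac{\omega_{u,loop}}{\omega_P}\right) = \tan^{-1}\left(\frac{bc}{1 + b + c^2}\right) \qquad (5.4)$$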
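Equation 5.4 can be evaluated directly, and for a given capacitor ratio b it can be
inverted for c by solving the quadratic t·c² − b·c + t·(1 + b) = 0 with t = tan(ϕ). The
sketch below does both; the function names are illustrative, and b = 25.57, c = 9 are the
values used later in Table 5.1.

```python
import math


def phase_margin_deg(b, c):
    """Phase margin from Equation 5.4 for b = C1/C2 and c = w_u/w_Z."""
    return math.degrees(math.atan(b * c / (1.0 + b + c * c)))


def c_for_phase_margin(b, pm_deg):
    """Both values of c that give the requested phase margin, or None."""
    t = math.tan(math.radians(pm_deg))
    disc = b * b - 4.0 * t * t * (1.0 + b)
    if disc < 0.0:
        return None  # this capacitor ratio cannot reach the requested margin
    root = math.sqrt(disc)
    return (b - root) / (2.0 * t), (b + root) / (2.0 * t)


print("PM for b = 25.57, c = 9:", phase_margin_deg(25.57, 9.0))
print("c values for 65 degrees:", c_for_phase_margin(25.57, 65.0))
```

For b = 25.57 the larger root is close to c = 9 and the resulting phase margin is close
to 65°, consistent with the design point chosen in the text.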
The ratio of the unity-gain loop bandwidth to the zero frequency (c) is plotted against
the loop filter capacitor ratio (b) in Figure 5.6 for a phase margin (ϕ) of 65°. In this
case, it can be observed that C1 has to be more than 18 times larger than C2. Typically,
c is set in the range of 6 to 10 [61].

Figure 5.6: Plot of unity loop bandwidth over zero (c = ωu,loop/ωZ) versus capacitor
ratio in the loop filter (b = C1/C2) for 65° phase margin.
Next, let’s define ωref = aωu,loop. The ratio a is set to more than 10 for two main
reasons: 1) to be able to approximate the VCO tuning voltage (VC) as continuous in time
and linearize the PLL model; 2) to filter out the periodic disturbance caused by
reference clock feed-through. However, if the reference clock frequency is high, C2 may
become so small that it is comparable to the parasitic capacitance, and the noise
performance is also sensitive to the loop bandwidth. As a rule of thumb, for this
application it is better to set fu,loop below 1 MHz from a noise perspective. In this
design, fu,loop = 500 kHz is chosen to give enough margin. C2 can be solved by
substituting ωu,loop into the loop magnitude function, as in Equation 5.5. Once C2 is
known, C1 and R1 can be easily derived from the b coefficient and ωZ.
$$C_2 = \frac{I_{CP}K_{VCO}}{N}\cdot\frac{a^2\sqrt{1 + c^2}}{\omega_{ref}^2\sqrt{(1 + b)^2 + c^2}} \qquad (5.5)$$
Thus far, ICP and KV CO seem to be the most important design variables to
determine the loop filter parameters. As long as ICPKV CO is kept as a constant, C2
will be fixed. It’s evident that the larger the KV CO or ICP , the larger the capacitance
and the smaller the resistance will be required in the loop filter.
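The loop filter calculation can be written out as a short script. It uses Equation 5.5
with a²/ω²ref replaced by 1/ω²u,loop (since ωref = a·ωu,loop), then C1 = b·C2 and
R1 = 1/(ωZ·C1) with ωZ = ωu,loop/c. The function name is illustrative; the inputs are the
design values quoted in the text (N = 128, fu,loop = 500 kHz, b = 25.57, c = 9) with KVCO
in Hz/V.

```python
import math


def loop_filter(i_cp, k_vco_hz_per_v, n_div, f_u_hz, b, c):
    """Return (C1, C2, R1) for the 2nd-order passive loop filter."""
    w_u = 2.0 * math.pi * f_u_hz
    c2 = (i_cp * k_vco_hz_per_v / n_div) * math.sqrt(1.0 + c * c) / (
        w_u ** 2 * math.sqrt((1.0 + b) ** 2 + c * c))
    c1 = b * c2
    w_z = w_u / c
    r1 = 1.0 / (w_z * c1)
    return c1, c2, r1


for i_cp, k_vco in ((360e-6, 100e6), (100e-6, 400e6), (60e-6, 700e6)):
    c1, c2, r1 = loop_filter(i_cp, k_vco, n_div=128, f_u_hz=500e3, b=25.57, c=9.0)
    print(i_cp * 1e6, "uA", k_vco / 1e6, "MHz/V:",
          c1 * 1e12, "pF", c2 * 1e12, "pF", r1 / 1e3, "kOhm")
```

For ICP = 360 µA and KVCO = 100 MHz/V this gives approximately C1 = 235 pF, C2 = 9.2 pF
and R1 = 12.2 kΩ, and it shows the trend stated above: a larger ICP·KVCO product calls
for larger capacitors and a smaller resistor.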
5.4 Circuit Block Design
This section describes the circuit block design, including the phase frequency detector
(PFD), charge pump, loop filter and feedback dividers.
5.4.1 Phase Frequency Detector
A common linear PFD with resettable DFFs is sufficient to meet the design requirement
since the operating frequency is less than 100 MHz. Its schematic and state diagram are
illustrated in Figure 5.7 and Figure 5.8, respectively. The PFD’s characteristic is
ideally linear over the entire range of input phase difference from −2π to 2π. Latches
are added at the differential outputs to improve the rise and fall times so that the
crossing point can be adjusted to half of VDD. In order to avoid a PFD+CP+LF dead zone, a
minimum delay must be ensured in the reset path so that the outputs have enough pulse
width to turn on the switching fets in the charge pump. Too large a delay, however,
reduces the detectable phase range and the PFD operating frequency.
Figure 5.7: Schematic of the linear phase frequency detector (inputs CKref and CKfb,
outputs up/upb and dn/dnb; annotations: TG delay matches INV delay; latch used to
optimize the cross point; minimum delay to ensure enough pulse width).

Figure 5.8: State diagram of the linear phase frequency detector (states: up = 0, dn = 0;
up = 1, dn = 0; up = 0, dn = 1; transitions on CKref and CKfb edges).
5.4.2 Charge Pump and Passive Loop Filter
On one hand, the smaller the charge pump current, the more noise it contributes to
the PLL output phase noise. On the other hand, the larger the current, the larger the
capacitance required in the loop filter. Figure 5.9 shows the schematic of the charge
pump. Because differential switches (M1-M4) steer the current, the up and down
current sources are always kept in the saturation region. The complementary node vcm
is held at the vctrl level by a buffer to reduce the transient current mismatch
(charge sharing) [62]. An opamp with a wide input and output common-mode range is
desired to cover the valid vctrl range. Replica bias and large device dimensions are
used in the current mirrors to reduce systematic and random process mismatch,
respectively.
Figure 5.9: Schematic of charge pump. (Supply VDD25; switches M1-M4 driven by up/upb
and dn/dnb; output vctrl with complementary node vcm; replica bias branch; current
sources of 60 µA, 20 µA, 40 µA, 80 µA, and 160 µA.)
Charge pump currents are made controllable to compensate for KVCO variation and
stabilize the system. With the simulated KVCO values given in Figure 5.4 and Equation
5.5, possible loop filter parameters can be calculated; they are listed in Table 5.1.
On-chip capacitors and resistors have large process variations, so the loop filter
needs to be designed with some tunability. Table 5.2 and Figure 5.10 characterize
the resistor and capacitors at three corner conditions. The variations of the
resistor, MIM capacitor, and 2.5 V nfet MOS capacitor can be as much as 37%, 34%, and
6.3%, respectively. MIM capacitors are preferred in the LC VCO for their better
quality factor and voltage independence, while the MOS capacitor, with less process
variation, is better suited to the loop filter. The resistor can be made digitally
programmable to cover the large process and temperature variations. Its schematic
is shown in Figure 5.11. The transmission gates need to be large enough that their
on-resistance is negligible compared to the poly resistance in parallel.
Table 5.1: Loop filter parameters when fVCO = 11 GHz, N = 128, b = 25.57, c = 9,
PM = 65°.
ICP       KVCO         C1          C2         R1
360 µA    100 MHz/V    235.21 pF   9.2 pF     12.18 kΩ
100 µA    400 MHz/V    261.34 pF   10.22 pF   10.96 kΩ
60 µA     700 MHz/V    274.44 pF   10.73 pF   10.44 kΩ
Table 5.2: Simulated opppcres resistor and MIM capacitor characteristics at three
corners in the IBM8HP process.
Device      W/L              +3σ @ 0°C    nominal @ 27°C   −3σ @ 80°C
opppcres    0.8 µm/2.1 µm    1183.5 Ω     1000.9 Ω         818.2 Ω
MIM cap     10 µm/10 µm      120.5 fF     102.5 fF         86.04 fF
Figure 5.10: Simulated MOS capacitor characteristics with varying gate voltage at
three corners. (Capacitance (fF) versus VG (V) for an IBM8HP dgnfet used as a MOS
capacitor, W = 5 µm, L = 5 µm; curves: +3σ @ 0°C, nominal @ 27°C, −3σ @ 80°C.)
Figure 5.11: Schematic of tunable resistor. (Resistor segments of 1 kΩ, 2 kΩ, 4 kΩ,
and 8 kΩ connected to vctrl, with control bits rctrl<2:0>.)
5.4.3 Frequency Divider
As illustrated in Figure 5.1, CML DFFs and true single-phase-clock (TSPC) dynamic
flip-flops [63] are used for the high-frequency and lower-frequency dividers,
respectively. A high-frequency divide-by-2 circuit requires a minimum input amplitude
at a given operating frequency, which can be characterized with a divider sensitivity
curve [64]. The schematic of the CML divide-by-2 circuit is shown in Figure 5.12. AC
coupling is used for the clock inputs so that the dc operating points are independent
of the input signals. HBT devices are used for the amplification pair (B1-B2) and the
regeneration pair (B3-B4), while NMOS FETs (M1-M2) are used for the clocking switches
because they are easier to bias under a relatively large tail current. The
emitter-follower stage (B5-B6), acting as a level shifter, provides better dc
operating points that keep the BJTs in the forward-active region, at the cost of
additional current consumption.
Figure 5.12: Schematic of the CML divide-by-2 circuit. (Two cascaded latches with
inputs D/Db and clocks CK/CKb, outputs QH/QHb and Q/Qb; input fin/finb, output
fdiv2/fdiv2b; latch devices M1-M2 and B1-B6, bias vbias, supply VDD25.)
TSPC dividers feature low power consumption but require a full-swing input to
function. A resistive-feedback inverter with AC coupling can be used to convert the
CML signal to a CMOS signal. The TSPC divide-by-2 circuit shown in Figure 5.13 is
used for clock frequencies below 1 GHz in this application.
Figure 5.13: Schematic of the TSPC divide-by-2 circuit. (Input fin, output fdiv2;
schematic note: use a wider MP1 for lower speed.)
5.5 PLL Phase Noise Analysis
Every block in the PLL adds noise to the system. In addition, the external reference
clock and the supply voltage can contribute significant noise. To find the dominant
noise source at the PLL output in each offset-frequency band, the noise of each block
in the system is examined. The noise transfer function from each individual noise
source to the PLL output is studied in closed-loop form. The PLL output phase noise
contributed by each source is the source noise filtered by its corresponding noise
transfer function. Phase noise due to the device thermal noise and flicker noise in
each block is estimated with the Spectre simulator using periodic steady state (pss)
analysis and periodic noise (pnoise) analysis. The simulated noise data are collected
and then post-processed with Matlab.
5.5.1 Noise Sources and Noise Transfer Function
The noise transfer functions from each noise source shown in Figure 5.5 to the PLL
output are listed in Table 5.3. The reference, PFD, and divider noise share the same
transfer function as the system closed-loop transfer function. A higher reference
frequency can reduce the contribution of the PFD noise [65]. The charge pump noise
transfer function has the same shape as that of the reference, but is scaled by
2π/ICP, which means a lower charge pump current contributes more noise. Continuing
down to the loop filter, its noise transfer function is further scaled by the loop
filter impedance Z(s), which results in a band-pass-like transfer function. VCO noise
accumulates because of the integrating characteristic of the VCO transfer function.
The transfer function for noise introduced after the VCO has a high-pass shape.
Table 5.3: Noise transfer functions from each noise source to the PLL output.
Noise source      Ref+Div              PFD+CP (A^-1)                 LF                        VCO
NTF to PLL O/P    N·L(s)/(1 + L(s))    2π·N·L(s)/(ICP·(1 + L(s)))    2π·KVCO/(s·(1 + L(s)))    1/(1 + L(s))
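The entries of Table 5.3 can be evaluated numerically. The sketch below is
illustrative only: the loop-gain expression and the passive filter impedance Z(s)
(C2 in parallel with the series branch R1 + C1) are assumptions based on a standard
charge pump PLL model, since they are not restated in this section. The component
values are those of the 100 µA row of Table 5.1, with KVCO expressed in Hz/V.

```python
import numpy as np

# Values from the 100 uA row of Table 5.1; N from the table caption.
ICP, KVCO, N = 100e-6, 400e6, 128           # A, Hz/V, feedback divide ratio
C1, C2, R1 = 261.34e-12, 10.22e-12, 10.96e3


def loop_gain(f):
    """Open-loop gain L(s) of the assumed PLL model at frequency f (Hz)."""
    s = 2j * np.pi * np.asarray(f, dtype=float)
    # Assumed passive filter: C2 in parallel with the series branch R1 + C1.
    z = (1 + s * R1 * C1) / (s * (C1 + C2) * (1 + s * R1 * C1 * C2 / (C1 + C2)))
    return (ICP / (2 * np.pi)) * z * (2 * np.pi * KVCO / s) / N


def ntfs(f):
    """Noise transfer functions of Table 5.3 evaluated at frequency f (Hz)."""
    s = 2j * np.pi * np.asarray(f, dtype=float)
    L = loop_gain(f)
    return {
        "ref_div": N * L / (1 + L),
        "pfd_cp": 2 * np.pi * N * L / (ICP * (1 + L)),
        "lf": 2 * np.pi * KVCO / (s * (1 + L)),
        "vco": 1 / (1 + L),
    }


f = np.logspace(2, 8, 6001)
L = loop_gain(f)
idx = int(np.argmax(np.abs(L) < 1.0))       # first grid point where |L| drops below 1
f_c = f[idx]                                # open-loop unity-gain frequency
pm_deg = 180.0 + np.degrees(np.angle(L[idx]))
```

Under these assumptions the reference path approaches a gain of N well inside the
loop bandwidth and the VCO path approaches unity well outside it, which matches the
low-pass and high-pass shapes described above.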
The corresponding magnitude responses are plotted in Figure 5.14. It can be observed
that all of the added noise in the PLL system can be attributed to two primary noise
sources: detector noise and VCO noise [61]. Detector noise is considered to be the
sum of white and spurious noise, and is composed of noise due to the reference, PFD
and divider jitter, charge pump noise, and spurious noise from the reference clock.
VCO noise is assumed to roll off at −20 dB/dec and is primarily caused by thermal
noise in the VCO structure. In practice, VCO noise rolls off at a higher rate than
−20 dB/dec at low frequencies due to the influence of flicker noise. Detector noise
can be reduced by setting the bandwidth as low as possible. When the bandwidth is
reduced, however, the VCO noise magnitude curve not only shifts toward lower
frequencies but also rises in magnitude. So in order to minimize VCO noise, the
bandwidth of the loop needs to be raised. The peak of the band-pass response (refer
to Figure 5.14) marks the bandwidth of the loop. Optimizing the loop bandwidth is
basically a balance between the noise due to the reference and the noise due to the
VCO.
Figure 5.14: Plots of PLL noise transfer function for each noise source. (|NTF| (dB)
versus frequency (Hz), 10^2 to 10^7 Hz; curves: NTF ref,div; NTF vco; NTF R; NTF CP.)
5.5.2 Phase Noise Simulation
The phase noise of each building block, including the VCO, charge pump, feedback
divider chain, and loop filter resistor, is obtained from Spectre simulation (pss and
pnoise analysis). The reference clock is generated by a Keysight E8257D analog signal
generator, and its phase noise is measured with an Agilent PXA N9030A signal
analyzer. The phase noise introduced by each noise source over an offset frequency
range from 100 Hz to 10 MHz is plotted in Figure 5.15. Given the noise transfer
functions listed in Table 5.3 and plotted in Figure 5.14, the PLL output phase noise
due to each block is plotted in Figure 5.16. The solid black curve in Figure 5.16
denotes the total PLL output phase noise, which is the sum of all the other phase
noise components. It can be observed that the reference phase noise seen at the
output dominates the total output phase noise at lower offset frequencies, while the
VCO phase noise becomes dominant from 2.5 MHz onwards.
Figure 5.15: Phase noise of each noise source introduced into the PLL. (SSB phase
noise (dBc/Hz) versus offset frequency (Hz), 10^2 to 10^7 Hz; curves: reference,
divider, charge pump current noise, loop filter resistor voltage noise, and VCO.)
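Equation 5.6 can be used to calculate the rms jitter in the time domain from the
phase noise PSD S(f), integrated over the offset frequency range from fstart to fstop.
\[ \sigma_{rms} = \frac{1}{2\pi f_{VCO}} \sqrt{\int_{f_{start}}^{f_{stop}} S(f)\,df} \tag{5.6} \]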
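Equation 5.6 can be evaluated numerically from tabulated phase noise data. The sketch
below is a minimal illustration, assuming S(f) is supplied in dBc/Hz and integrated
exactly as written in Equation 5.6 (no extra sideband factor is applied). The
function name rms_jitter and the flat −120 dBc/Hz profile are hypothetical and are
not the simulated or measured data of this work.

```python
import numpy as np


def rms_jitter(offset_hz, pn_dbc_hz, f_carrier):
    """RMS jitter (s) from Equation 5.6 using trapezoidal integration of S(f)."""
    f = np.asarray(offset_hz, dtype=float)
    s_lin = 10.0 ** (np.asarray(pn_dbc_hz, dtype=float) / 10.0)   # dBc/Hz -> linear
    area = np.sum(0.5 * (s_lin[1:] + s_lin[:-1]) * np.diff(f))    # integral of S(f) df
    return np.sqrt(area) / (2.0 * np.pi * f_carrier)


# Hypothetical flat profile from 100 Hz to 10 MHz on an 11 GHz carrier.
offsets = np.logspace(2, 7, 501)
profile = np.full_like(offsets, -120.0)
sigma = rms_jitter(offsets, profile, 11e9)
```

For a flat profile the integral reduces to S·(fstop − fstart), which gives a quick
sanity check of the integration routine.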
Figure 5.16: PLL output noise due to individual noise sources. (SSB output phase
noise (dBc/Hz) versus offset frequency (Hz), 10^2 to 10^7 Hz; curves: total output
phase noise and the contributions from the reference, divider, charge pump, loop
filter resistor, and VCO.)
The rms jitter calculated from the phase noise profiles of the reference clock and
the VCO is about 0.94 ps and 1.08 ns, respectively. The rms jitter calculated from
the PLL output phase noise profile is 0.76 ps. This indicates that most of the VCO
noise is filtered out, but the reference clock still contributes significantly.
5.6 Experimental Results
The PLL was fabricated in the IBM 130 nm BiCMOS process. A micrograph is shown in
Figure 5.17. To avoid handling a high-frequency signal off-chip, the VCO output
divided by 64 (PLL/64) is brought out of the die for measurement. The die is packaged
in a QFN-64 package and tested with a high-frequency QFN socket mounted on a
prototype PCB. The test setup is shown in Figure 5.18. A Keysight E8257D PSG analog
signal generator, which can generate a clock of quality comparable to that of a
crystal oscillator, provides the reference clock for the PLL. The PLL/64 signal is
fed into the Keysight PXA N9030A signal analyzer. In the first step of testing, no
reference clock is applied and the signal analyzer is set to spectrum mode to measure
the free-running frequency of the PLL/64 signal. The signal generator is then turned
on and set to a frequency close to the previously measured free-running frequency,
the reference clock frequency is tuned in fine steps within a certain range, and the
PLL/64 signal on the signal analyzer is checked to see whether it tracks the
reference clock frequency. Once the PLL is locked, phase noise can be measured by
switching the signal analyzer to phase noise mode.
Figure 5.17: Micrograph of the PLL.
The VCO control voltage is not brought out of the chip, so the VCO frequency versus
control voltage cannot be measured here. Instead, the PLL is measured while the VCO
capacitor bank setting is changed with the PLL in lock. This is performed with a
fixed charge pump current and loop filter setting. The derived VCO frequency is
plotted in Figure 5.19 along with the capacitor bank setting. Please refer to Figure
5.2 for the VCO capacitor control bits (C<2:0>). At this specific setting, the PLL
can be locked and operated from 10.81 GHz to 11.66 GHz.
Figure 5.18: Signal generator, signal analyzer and the prototype PCB. (Labeled items:
reference clock, PLL/64, E8257D PSG analog signal generator, PXA N9030A signal
analyzer, prototype PCB, attenuator.)
Figure 5.20 and Figure 5.21 show the spectrum and the phase noise, respectively, of
the reference clock compared with the output signal when the PLL locks to an 83 MHz
reference clock. It can be seen that the harmonics of the PLL/64 output are due to
the harmonics of the reference clock. The phase noise of the reference clock and of
the PLL/64 output signal is -113 dBc/Hz and -109 dBc/Hz at 100 Hz offset frequency,
respectively.
Figure 5.19: VCO frequency versus capacitor control bits (66 fF incremental) when the
PLL is in the lock state. (Axes: VCO free-running frequency (GHz) and capacitance
(fF) versus VCO cap control bits (in binary), 0 to 7; curves: VCO frequency and cap
bank.)
Figure 5.20: PSD of (a) the reference clock at 83 MHz and (b) a PLL/64 output signal
at 166 MHz.
Figure 5.21: Phase noise of (a) the 83 MHz reference clock and (b) the PLL/64 signal.
The phase noise of a clock decreases by 6 dB per divide-by-2, so the PLL output phase
noise can be derived by adding 36 dB to the phase noise of the PLL/64 output signal
(Figure 5.21 (b)). Figure 5.22 compares the measured and simulated phase noise of the
PLL output and of the free-running VCO, along with the measured reference clock phase
noise. The measured free-running VCO phase noise is lower than the simulated result
at lower offset frequencies but higher at higher offset frequencies. The designed and
measured PLL loop bandwidths are around 500 kHz and 1 MHz, respectively. Because of
the low-pass characteristic from the reference clock to the PLL output, the reference
noise dominates in the low-frequency band. At lower frequencies, the measured PLL
output phase noise matches its simulated result well, while at higher frequencies it
deviates from the simulated result because of the high-pass characteristic from the
VCO noise to the PLL output.
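The 36 dB correction follows from the 20·log10(N) scaling of phase noise with
frequency division. A minimal sketch of the arithmetic is given below; the function
name is hypothetical, and the −109 dBc/Hz value is the PLL/64 reading at 100 Hz
offset quoted earlier in this section.

```python
import math


def divider_phase_noise_shift_db(divide_ratio):
    """Phase noise change (dB) between a clock and its divided-by-N version."""
    return 20.0 * math.log10(divide_ratio)


per_stage_db = divider_phase_noise_shift_db(2)     # one divide-by-2 stage
total_db = divider_phase_noise_shift_db(64)        # six cascaded divide-by-2 stages

# Referring the measured PLL/64 phase noise at 100 Hz offset back to the PLL output.
pll64_dbc_hz = -109.0
pll_out_dbc_hz = pll64_dbc_hz + total_db
```

The exact values are slightly above 6 dB and 36 dB, so rounding to 36 dB introduces
an error of only about 0.1 dB.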
Figure 5.22: Comparison of the measured and the simulated phase noises for PLL and
free-running VCO. (Phase noise (dBc/Hz) versus offset frequency (Hz), 10^2 to 10^7
Hz; curves: PLL/64 measured, PLL measured, PLL simulated, VCO measured, VCO
simulated, reference clock.)
5.7 Summary
A type-II third-order charge pump PLL was designed and analyzed with a systematic
approach. The PLL was fabricated in the IBM 130 nm SiGe BiCMOS process and tested
with a prototype PCB. No LDO regulators were used in testing the PLL, so the effect
of power supply noise can be explored further to improve the system noise
performance. The PLL is intended to provide the clock for a full-rate high-speed PRBS
generator, which is covered in the next chapter.
CHAPTER 6
A 10 GB/S FULL-RATE 4-CHANNEL 2^9 − 1 PARALLEL
PRBS
Pseudo random bit sequence (PRBS) generators have been proposed and studied for over
50 years for testing transceivers [66][67]. More recently, the testing of N-level
pulse amplitude modulation (PAM-N) transceivers requires multiple uncorrelated data
streams [68]. Test equipment vendors like Anritsu offer pulse pattern generator (PPG)
models which can provide 2-channel (MP1800A) and 4-channel (MP1775A) PRBS streams up
to 32 Gb/s and 12.5 Gb/s, respectively, but their cost is prohibitive [69].
Alternatively, a multi-channel PRBS generator can be designed and fabricated on-chip,
which not only saves cost but also avoids the signal integrity challenges of feeding
high-speed input signals externally. PRBS generators above 20 Gb/s published in the
IEEE Journal of Solid-State Circuits (JSSC) over the past 10 years are compared in
Table 6.1. Reference [70] adopted an emitter-coupled logic (ECL) design in a SiGe
bipolar process with an fT of 200 GHz. References [71] and [72] used a 150 GHz fT
SiGe BiCMOS process, and [71] also made use of on-chip inductors.
In this chapter, a system transition matrix design method is explained and applied
to design a full-rate 4-channel 2^9 − 1 parallel PRBS generator. It can be used to
test transmitters up to the PAM-16 modulation format. Designed in the IBM 130 nm SiGe
BiCMOS process, which provides hetero-junction bipolar transistors (HBT) with a
maximum fT of 210 GHz, the PRBS generator can operate at more than 40 Gb/s if a 40 GHz
clock is provided. However, the final implementation of this PRBS generator is
clocked by the on-chip PLL covered in the previous chapter. A system block diagram of
the PRBS application in a PAM-4 optical transmitter is given in Figure 6.1.
Circuit-level design and simulation are presented. Finally, the design is
demonstrated on a prototype PCB, with an extensive discussion of PCB transmission
lines.
Table 6.1: Comparison with recent PRBS generators published in JSSC.
Reference      Single lane length   Rate           Bit-rate   S.E. Vpp   Technology                                           Power    Area             Test method
Knapp [70]     2^7 − 1              Half-rate      100 Gb/s   100 mV     SiGe bipolar, fT = 200 GHz                           1.5 W    0.63 µm2         RF probing
               2^11 − 1             Full-rate      54 Gb/s    300 mV                                                          1.9 W
Dickson [71]   2^31 − 1             Quarter-rate   80 Gb/s    430 mV     130 nm SiGe BiCMOS, fT = 150 GHz ([71] used inductor)  9.8 W    3.5 × 3.5 mm2
Laskin [72]    2^7 − 1              Half-rate      23 Gb/s    125 mV     130 nm SiGe BiCMOS, fT = 150 GHz                     243 mW   393 × 178 µm2
This work      2^9 − 1              Full-rate      11 Gb/s    450 mV     130 nm SiGe BiCMOS, fT = 210 GHz                     1.3 W    1500 × 78 µm2    QFN socket on PCB
Figure 6.1: System block diagram of the MZM based PAM-4 transmitter
using the 4-channel parallel PRBS.
6.1 PRBS Principles
A linear-feedback shift register (LFSR) is maximal-length if and only if the
corresponding feedback polynomial is primitive [73]. To save hardware and reduce
capacitive loading in a practical implementation, a primitive trinomial is preferred,
which means the feedback polynomial has the form p(x) = x^n + x^k + 1, where n, the
degree of p(x), is the number of registers in the loop, and 1 ≤ k < n is a necessary
but not sufficient condition. In the hardware implementation illustrated in Figure
6.2, n and k denote the nth and kth registers, called taps, which are connected to
the XOR gate. However, for a polynomial p(x) of degree n with coefficients in the
Galois field GF(2) to be primitive, k cannot be chosen arbitrarily. It must satisfy
the condition that, in GF(2), x^(2^n − 1) + 1 mod p(x) is zero. For example, for
n = 8, no value of k meets the condition; for n = 9, k ≠ 6 and 7. Some of the
possibilities for n and k are given in the table in Figure 6.2. Python code for
determining k with respect to n is given in Appendix B.1.
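As an independent illustration (not the Appendix B.1 code), valid tap positions can
also be found by brute force: simulate the two-tap LFSR and keep the values of k for
which the state sequence has the maximal period 2^n − 1. The function names below
are hypothetical.

```python
def lfsr_period(n, k):
    """Cycle length of an n-stage LFSR whose XOR taps are stages k and n."""
    mask = (1 << n) - 1
    start = 1                      # any non-zero seed; bit i-1 holds stage i
    state = start
    period = 0
    while True:
        fb = ((state >> (k - 1)) ^ (state >> (n - 1))) & 1   # stage k XOR stage n
        state = ((state << 1) | fb) & mask                    # shift, feed stage 1
        period += 1
        if state == start:
            return period


def maximal_taps(n):
    """All k in 1..n-1 that give a maximal-length (2^n - 1) sequence."""
    return [k for k in range(1, n) if lfsr_period(n, k) == (1 << n) - 1]


taps_8 = maximal_taps(8)
taps_9 = maximal_taps(9)
taps_10 = maximal_taps(10)
```

This search is exhaustive rather than algebraic, so it is practical only for small n,
but it directly checks the maximal-length property that the primitive polynomial
condition guarantees.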
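Figure 6.2: An n-stage PRBS generator with possible n and k combinations (adapted
from [74]). (Shift register stages 1, 2, 3, ..., k, ..., n−1, n driven by a common
clock. Tabulated n: k values as extracted: 6: 1, 5; 7: 3, 9; 8: none; 9: 4, 5;
10: 3, 7; 11: 2, 9.)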
If u_i(j) is defined as the state of the ith DFF at the jth clock cycle, it can be
observed from Figure 6.2 that only the input of the first DFF is newly generated,
with a period of 2^n − 1 clock cycles. The input of each remaining DFF is simply the
state of the preceding DFF, delayed by one clock cycle. So, in this case, the other
states cannot simply be used to form a multi-channel PRBS since they are highly
correlated with each other with short delays. Another important issue for a PRBS
generator is that the all-zero state must be avoided. For a full-rate PRBS generator
made of CML DFFs, all the states may start from the common-mode voltage, so there is
a chance that all the DFFs enter the same logical “0” state. A start-up circuit is
therefore essential to prevent the PRBS generator from being trapped in the all-zero
state.
6.2 Transition Matrix Method and Correlation
The transition matrix method can be used to analyze and understand the implementation
of multiple parallel random sequence generation [66]. To generate four channels of
maximum uncorrelated length 2^9 − 1, nine DFFs and four XOR gates are needed when
operating at the full clock rate. In this design, n = 9 and k = 5 were chosen, and
the corresponding transition matrix T is given by Equation 6.1. If the 9th DFF is
initialized to 1 at the start and the others to 0, the initial state of the nine DFFs
is s(0) = [0 0 0 0 0 0 0 0 1]^T. After l clock cycles, the state of the DFFs is
s(l) = T^l s(0).
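\[ T = \begin{bmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix} \tag{6.1} \]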
The 9th DFF output s_9(j) of the above PRBS is either 1 or 0. In computing the
correlation of sequences, it is standard practice to rescale the outputs 0 and 1 to
−1 and 1, respectively, which is done by letting t(j) ≜ 2s(j) − 1. Implementing this
transition matrix results in the sequence {t_9(j)} = {2s_9(j) − 1} being
uncorrelated with itself over a period of length 2^9 − 1 [74]. This implies that for
i = 0, 1, 2, ..., 2^9 − 1, the auto-correlation is given by Equation 6.2.
\[ \phi(i) = \frac{1}{2^9 - 1} \sum_{j=0}^{2^9 - 1} t_9(j)\, t_9(j - i) =
\begin{cases} 1, & i = 0 \\ \dfrac{-1}{2^9 - 1}, & i \neq 0 \end{cases} \tag{6.2} \]
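The two-valued auto-correlation of Equation 6.2 can be checked with a short
simulation. The sketch below steps the n = 9, k = 5 register defined by T from the
seed s(0) and evaluates a cyclic auto-correlation. One assumption differs from the
summation limits printed above: the sum here runs over exactly one sequence period of
2^9 − 1 samples. Function names are hypothetical.

```python
def prbs9_bits(length):
    """Output of the 9th DFF for the register described by T, seeded with s9 = 1."""
    s = [0] * 8 + [1]            # s[0] is DFF 1, s[8] is DFF 9
    out = []
    for _ in range(length):
        out.append(s[8])
        fb = s[4] ^ s[8]         # first row of T: s1(next) = s5 XOR s9
        s = [fb] + s[:-1]        # remaining rows of T: a one-stage shift
    return out


N = 2**9 - 1
t = [2 * b - 1 for b in prbs9_bits(N)]      # rescale {0, 1} to {-1, +1}


def autocorr(i):
    """Cyclic auto-correlation of t over one period, normalized by the period."""
    return sum(t[j] * t[(j - i) % N] for j in range(N)) / N


phi_0 = autocorr(0)
phi_1 = autocorr(1)
```

The off-peak value of −1/(2^9 − 1) is the same for every non-zero shift within one
period, which is the defining property of a maximal-length sequence.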
As shown in [66], implementing T^4 (rather than T) results in two sets of four
parallel IQ outputs, s1(l), s2(l), s3(l), s4(l) and s5(l), s6(l), s7(l), s8(l). The
set s5(l), s6(l), s7(l), s8(l) is simply a one-clock-cycle delayed version of s1(l),
s2(l), s3(l), s4(l). These outputs are found using s(l) = (T^4)^l s(0). The
corresponding hardware implementation block diagram for Equation 6.3 is shown in
Figure 6.3.
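\[ T^4 = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{6.3} \]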
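The relation between Equation 6.1 and Equation 6.3 can be reproduced numerically by
raising T to the fourth power with arithmetic modulo 2. The sketch below is an
illustrative check using NumPy; the helper state_period is hypothetical and simply
counts clock cycles until the register returns to the seed s(0).

```python
import numpy as np

n, k = 9, 5
T = np.zeros((n, n), dtype=int)
T[0, k - 1] = 1                    # first row: s1(next) = s5 XOR s9
T[0, n - 1] = 1
for row in range(1, n):
    T[row, row - 1] = 1            # remaining rows: shift by one stage

T4 = np.linalg.matrix_power(T, 4) % 2      # four clock steps folded into one


def state_period(M):
    """Number of steps before the state returns to s(0) = [0 ... 0 1]^T."""
    s0 = np.zeros(n, dtype=int)
    s0[-1] = 1
    s = s0.copy()
    count = 0
    while True:
        s = (M @ s) % 2
        count += 1
        if np.array_equal(s, s0):
            return count


period_T = state_period(T)
period_T4 = state_period(T4)
```

Because 4 and 2^9 − 1 share no common factor, stepping by T^4 still visits every
non-zero state, so the parallel implementation keeps the full sequence length.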
Figure 6.3: Single-ended version block diagram of the full-rate 4-channel 2^9 − 1
parallel PRBS. (Nine DFFs numbered 1 to 9 sharing a common clock; DFFs 1 to 4 and 9
have XOR inputs A and B, DFFs 5 to 8 have a single input D; each DFF provides Q and
QH outputs; channel outputs S1 to S4; a “set” input is provided for start-up.)
The function of the PRBS generator was simulated, and its outputs were processed in
Matlab to compute the auto-correlation and cross-correlation, which are plotted in
Figure 6.4 over several sequence periods. It is evident that the auto-correlation of
a single channel and the cross-correlation between any two of the four channels
repeat with the sequence period (marked 2^9 in Figure 6.4) and that the peaks are
spaced 2^7 clock cycles apart. This means the four channels behave like differential
IQ signals. Having two PRBS patterns with a 180° phase shift is key to using a MUX to
form a higher data rate PRBS pattern that preserves the source pattern. The four
outputs S1, . . . , S4 in Figure 6.3 correspond to s5(l), s6(l), s7(l), s8(l). The
first four rows of Equation 6.3 show that the implementation requires four XOR gates.
The fifth XOR gate is used for the “set” signal to ensure the PRBS generator can
start up from an all-zero state. In normal operation the “set” signal is at logic low.
Figure 6.4: Auto-correlation and cross-correlation of the 4-channel PRBS generator
(signal amplitude rescaled to -1 and 1). (Correlation Φpq(i) versus number of clock
cycles i, −800 to 800; curves: autocorr (S1 or S2 or S3 or S4); xcorr (S1&S2 or S2&S3
or S3&S4); xcorr (S1&S3 or S2&S4); xcorr (S1&S4); annotated spacings 2^7 and 2^9.)
6.3 Circuits Design and Simulation
Both CMOS devices and hetero-junction bipolar transistors are provided in the IBM 130
nm SiGe BiCMOS process. The graded-base SiGe HBTs feature an accelerating drift field
that reduces the base transit time, which is key to improving device speed compared
to all-silicon homo-junction bipolar transistors [75]. HBTs are chosen not only for
their higher fT, but also for their higher current density and higher operating
voltage ratings. Current mode logic (CML) topology is adopted for the high-speed
circuit design. Transistor biasing and sizing, which are important design aspects for
CML operating at high data rates, are introduced in this section.
6.3.1 Current Reference
Under ideal conditions, the current-mirror output current is independent of the
voltage between the output and common terminals. In practice, real transistor-level
current mirrors deviate from this ideal behavior in many ways. One of the most
important deviations is the variation of the output current with the voltage at the
output terminal, which is caused by channel-length modulation in MOS devices and by
the Early effect in bipolar devices. Figure 6.5 compares the dc I-V characteristic of
a self-biased wide-swing NMOS cascode current mirror with that of a simple bipolar
current mirror with beta helper and emitter degeneration [27]. When both are set to
unity gain to provide 1 mA, the CMOS current mirror has a better dc I-V
characteristic than its bipolar counterpart, while the latter consumes 90× less
silicon area for the mirror branch alone. With transistor B2 reducing the gain error
and degeneration resistor RE boosting the output resistance, the performance of the
bipolar current mirror is acceptable for this application. Equation 6.4 shows that
the systematic gain error due to the finite forward current gain βF is reduced by a
factor of [βF + 1], which is the current gain of emitter follower B2 [27]. The
small-signal output resistance seen at the collector of transistor B1 is expressed in
Equation 6.5. In addition, Iref can be made digitally controllable.
\[ I_{C,B1} \simeq I_{ref} \times \left( 1 - \frac{2}{\beta_F (\beta_F + 1)} \right) \tag{6.4} \]
\[ R_o \simeq r_{o,B1} \times (1 + g_m R_E) \tag{6.5} \]
Figure 6.5: Output characteristic comparison of self-biased wide-swing NMOS cascode
current mirror and BJT current mirror with beta helper and emitter degeneration.
(Both mirrors are driven with Iref = 1 mA; labeled devices: M1, B1, B2, RE; annotated
values: 5 µm/120 nm, 9 Ω, 100 µm/300 nm, 80 µm/300 nm, 450 Ω.)
6.3.2 CML DFF
In bipolar designs, the device is biased at its peak-fT current density (JpfT) by
sizing the emitter length (lE) [71]. In the IBM 130 nm SiGe BiCMOS process, JpfT is
about 11.9 mA/µm², and the emitter width (wE) is fixed at 120 nm. The tail current of
the CML stage can be derived from Equation 6.6 with the emitter length (lE) as the
design variable, where c is a constant chosen as 1.12 for CML design, since the
differential pair may not switch fully, and as 1 for emitter followers.
\[ I_{tail} = c \times J_{pfT} \times w_E \times l_E \tag{6.6} \]
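As a worked example of Equation 6.6, the sketch below evaluates the tail current for
the two emitter lengths that appear in the DFF schematics (5 µm and 3 µm), using the
JpfT and wE values stated above. The function name is hypothetical, and pairing
c = 1 with the 3 µm device is an assumption made only for illustration.

```python
J_PFT_MA_PER_UM2 = 11.9     # peak-fT current density, mA/um^2 (from the text)
W_E_UM = 0.12               # fixed emitter width, um (120 nm)


def tail_current_ma(c, l_e_um):
    """Tail current in mA from Equation 6.6 for emitter length l_e_um (um)."""
    return c * J_PFT_MA_PER_UM2 * W_E_UM * l_e_um


i_cml_5um = tail_current_ma(1.12, 5.0)   # CML differential pair, lE = 5 um
i_fol_3um = tail_current_ma(1.0, 3.0)    # assumed follower-type stage, lE = 3 um
```

The 5 µm result is close to the roughly 8 mA tail current annotated in the
schematics, which suggests the annotated bias currents follow this sizing rule.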
There are two main circuit blocks in the PRBS core previously shown in Figure 6.3:
the DFF (Figure 6.6) [76] and the XOR-merged DFF (Figure 6.7) [77]. Note that each
block has two differential output pairs. The QH and QHb output pair is intended to
drive the top devices (with A and Ab as the input pair) in the XOR. However, the BJTs
connected to nodes A, Ab and B, Bb are still only barely biased in the forward-active
region. This issue can be addressed by slightly reducing the load resistor of the
XOR-merged master latch and slightly increasing that of the regular slave latch. The
VBE of the upper BJT is VT ln 2 smaller than the VBE of the lower BJT because its
collector current is halved. A 3.3 V power supply is used because four BJTs are
stacked in the XOR-merged DFFs. AC coupling is used for the clock signals of the
DFFs, as described next.
Figure 6.6: Schematic of the BJT DFF employed in Figure 6.3. (Master latch and slave
latch between VDD33 = 3.3 V and VSS; inputs D/Db and CK/CKb, outputs Q/Qb and QH/QHb;
annotations include 54 Ω load resistors, emitter lengths lE = 5 µm and 3 µm, tail
currents of about 8 mA and 4 mA, and simulated dc node voltages.)
Figure 6.7: Schematic of the XOR-merged DFF employed in Figure 6.3. (Master XOR latch
and slave D latch between VDD33 = 3.3 V and VSS; inputs A/Ab, B/Bb and CK/CKb,
outputs Q/Qb and QH/QHb; annotations include 54 Ω load resistors, emitter lengths
lE = 5 µm and 3 µm, tail currents of about 8 mA and 4 mA, and simulated dc node
voltages.)
6.3.3 Clock and Data Buffers
It is impractical to drive nine DFFs with one clock buffer because of the heavy
loading and because the DFFs are at different locations on-chip. The PRBS clock
therefore needs to be distributed to the DFFs through multiple clock buffers. As
shown in Figure 6.3, three clock buffers each drive two DFFs and a fourth clock
buffer drives the other three DFFs. An emitter follower is used as the clock buffer,
as shown in Figure 6.8. AC coupling is used for each clock buffer so that its dc
operating point is independent of the previous stage. The resistors RA and RB that
set the base operating point VB cannot be too large, especially when the base current
cannot be neglected. As shown in Equation 6.7, the second term needs to be kept much
smaller than the first term so that VB does not drop too much. Figure 6.8 shows two
bias condition examples, for an emitter follower and for a differential pair. Larger
bias resistors can be used for an emitter follower with a current source load because
of the larger input resistance looking into the base.
\[ V_B = \frac{V_{DD} \cdot R_B}{R_A + R_B} - I_B \cdot (R_A \parallel R_B) \tag{6.7} \]
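Equation 6.7 can be illustrated with a short calculation. The sketch below uses the
RA = 2.1 kΩ, RB = 1.05 kΩ divider annotated in Figure 6.8. Pairing it with the 3.3 V
supply and the 50 µA base current are assumptions chosen only to show the size of the
second term; they are not values stated in the text.

```python
def base_bias_voltage(vdd, r_a, r_b, i_b):
    """Base operating point VB from Equation 6.7 (volts, ohms, amperes)."""
    divider = vdd * r_b / (r_a + r_b)        # unloaded resistive divider
    r_parallel = r_a * r_b / (r_a + r_b)     # RA || RB seen by the base current
    return divider - i_b * r_parallel


vb_unloaded = base_bias_voltage(3.3, 2100.0, 1050.0, 0.0)
vb_loaded = base_bias_voltage(3.3, 2100.0, 1050.0, 50e-6)   # hypothetical IB
drop_mv = (vb_unloaded - vb_loaded) * 1e3
```

With these numbers the base current lowers VB by a few tens of millivolts; scaling
both resistors up by the same factor would scale that drop by the same factor, which
is why the divider resistance cannot be made arbitrarily large.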
The schematic of the output data buffer is shown in Figure 6.9. It uses 3.3 V and
2.5 V power supplies. The CML stage is added to filter out the ringing caused by the
emitter follower stage and to provide extra gain. The final emitter follower stage
can directly drive the 50 Ω termination of an oscilloscope.
Figure 6.8: Schematics of (left) clock buffer employed in Figure 6.3 and (right) bias
condition for DFF clock inputs. (Annotations: 330 fF capacitors; bias dividers
RA = 2.1 kΩ, RB = 1.05 kΩ and RA = 5 kΩ, RB = 20 kΩ; a 145 Ω resistor; bias currents
of about 8 mA and 4 mA; lE = 5 µm; supplies VDD33 and VDD25.)
Figure 6.9: Schematic of output buffer. (Differential inputs vin/vip and outputs
von/vop; supplies VDD33 and VDD25; annotated resistors of 140 Ω and 43 Ω.)
6.3.4 PRBS Startup
A PRBS generator that is initially stuck in the all-zero state can occur both in
simulation and in real hardware. A start-up pin was added on the PCB, which can be
manually set to VDD or ground. Figure 6.10 shows a simulated PRBS start-up process.
Figure 6.10: PRBS start-up process by enabling “set” signal. (Top panel: one channel
of the differential output, voltage (V) versus time (ns); bottom panel: differential
set signal, voltage (V) versus time (ns), 0 to 16 ns; annotated intervals: all-zero,
start-up enable, start-up disable.)
6.3.5 Creation of PAM Signaling
By combining the PRBS outputs, PAM signaling up to PAM-16 can be formed with four
channels. Figure 6.11 shows the simulated results at 40 GBaud for NRZ and PAM-4/8/16
signaling. However, the prototype developed for testing only has a clock of around
11 GHz and one data buffer, so only NRZ signaling can be measured. The rms power
consumption of the PRBS generator, including the clock distribution and bandgap
reference circuits, is about 1.2 W.
Figure 6.11: Simulated eye diagrams of data pattern at 40 G Baud rate
for NRZ, PAM-4/8/16.
6.4 Experimental Results
The micrograph of the fabricated die is shown in Figure 6.12, with the locations of
the PLL, full-rate PRBS generator, and data buffer marked. High-speed PCB design
plays an important role in a successful prototype demonstration. In this section,
potential high-speed limitations due to the choice of package, socket, and SMA
connectors and to the PCB transmission line design are investigated. The main
equipment required for testing this prototype is a high-quality signal generator for
the PLL reference clock, a spectrum analyzer for checking PLL lock, and a sampling
oscilloscope for measuring the eye diagram of the PRBS output.
Figure 6.12: Micro photograph of half of the fabricated die. (Labeled blocks: PLL,
VCO, full-rate PRBS (1500 µm × 78 µm), data buffer, modulator drivers; labeled
signals: PRBS set, reference clock, sync. clock, PLL clock.)
6.4.1 Packaging and Socket
Since the complete die measures 4 mm by 4 mm with a total of 58 pads, a QFN 9X9 64A
package is chosen. The electrical parasitic parameters of several QFN package sizes
are listed in Table 6.2. A first-order RC estimate with a 50 Ω load indicates that
the 64-pin QFN package can only support a bandwidth of around 8 GHz. An Ironwood
Electronics QFN 64 socket (part number: SG MLF 7008), which features a 30 GHz
bandwidth, was chosen [78]. The QFN 64 package itself is therefore the main bandwidth
limitation.
Table 6.2: QFN package electrical parasitics provided by the vendor.
Package    Inductance (nH)    Capacitance (pF)    Resistance (mΩ)
4 × 4*     0.691              0.251               32
6 × 6*     1.156              0.355               51
8 × 8*     1.470              0.395/0.496         63
9 × 9**    1.221/1.895        0.395/0.496         242.3/315.6
Note: * Simulated results at 100 MHz; ** Simulated results at 2 GHz.
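The first-order RC estimate quoted above can be reproduced from the 9 × 9 package
capacitances in Table 6.2. The sketch below assumes a single-pole model,
f = 1/(2πRC), with the 50 Ω load as the only resistance; lead inductance and series
resistance are ignored, and the function name is hypothetical.

```python
import math


def rc_bandwidth_hz(r_ohm, c_farad):
    """-3 dB bandwidth of a single-pole RC low-pass, f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


bw_low_c = rc_bandwidth_hz(50.0, 0.395e-12)    # smaller listed capacitance
bw_high_c = rc_bandwidth_hz(50.0, 0.496e-12)   # larger listed capacitance
```

The smaller capacitance gives roughly 8 GHz, in line with the estimate in the text,
while the larger capacitance gives a somewhat lower figure.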
Figure 6.13: Picture of the prototype FR4 PCB. (Labeled items: Keysight DCA 86100D,
Keysight E8257D PSG, full-rate PRBS prototype PCB, reference clock, sync. clock, PRBS
output.)
6.4.2 PCB Engineering
Two versions of a 4-layer PCB were made for testing the chip. The first and second
versions were made of FR-4 and Rogers material, as shown in Figure 6.13 and Figure
6.21, respectively. GCPW transmission lines were designed for the high-speed signals.
Regular SMA and Southwest SMA [79] connectors, with bandwidth specifications of
12 GHz and 27 GHz, respectively, are used for the two PCB versions. The footprint of
the regular SMA connector is smaller than that of the Southwest connector, so the PCB
can be more compact with regular SMA connectors.
The PRBS measurement result with the first version PCB is shown in the left eye
diagram in Figure 6.14. The eye is almost closed. However, with the help of
the embedded math function of the sampling oscilloscope, the signal can be copied
and summed with different delays and weightings. A 2-tap feed-forward equalizer
(FFE) built this way is illustrated in Figure 6.15. As one example, the eye can be
opened slightly, as shown on the right in Figure 6.14 with the FFE delay and
weightings denoted. This FFE setting provides about 5.1 dB of peaking at the
Nyquist frequency of the sampling clock relative to dc, but has 10.46 dB of
attenuation at dc.
Figure 6.14: Eye diagrams of PRBS output with the prototype FR4 PCB.
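Figure 6.15: Block diagram of the FFE to process the signal. (The PRBS o/p is
split into two paths with delay and weight (td1, w1) and (td2, w2), which are
summed to form the FFE o/p; the setting shown is 0.42 - 0.12*z^-1.)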
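The peaking and dc attenuation quoted for this FFE setting can be checked by
evaluating the transfer function 0.42 - 0.12*z^-1 on the unit circle. The sketch
below assumes that the spacing between the two taps is one sample period, so that
"Nyquist" means half of the tap rate. The function name ffe_gain_db is
illustrative.

```python
import cmath
import math

W1, W2 = 0.42, -0.12  # tap weights read from Figure 6.15

def ffe_gain_db(f_norm):
    """Magnitude in dB of H = W1 + W2*z^-1 at f_norm = f / f_tap,
    where f_tap = 1 / (tap spacing)."""
    h = W1 + W2 * cmath.exp(-2j * math.pi * f_norm)
    return 20.0 * math.log10(abs(h))

dc_db = ffe_gain_db(0.0)    # gain at dc: |0.42 - 0.12| = 0.30
nyq_db = ffe_gain_db(0.5)   # gain at Nyquist: |0.42 + 0.12| = 0.54
print(f"dc gain      : {dc_db:.2f} dB")
print(f"Nyquist gain : {nyq_db:.2f} dB")
print(f"peaking      : {nyq_db - dc_db:.2f} dB")
```

With these weights the dc gain is about -10.46 dB and the Nyquist gain is about
5.1 dB higher, which agrees with the numbers stated in the text.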
The poor signal performance of the first version PCB is due to the lack of vias
under the side arms of the SMA footprints. The regular SMA connectors also have a
relatively thick and long launch pin, which can further limit high-frequency
operation. A test board with a set of single-ended transmission lines was
fabricated along with the second version PCB using RO4350B material with a
dielectric constant (Er1 in Table 6.3) of 3.36. This board is shown in Figure
6.16; from top to bottom, it has a microstrip, a GCPW, a GCPW with coating and a
GCPW for a regular SMA connector. The total length of each transmission line is
1.368 inch. Cross-sections of the microstrip and GCPW transmission lines are
illustrated in Figure 6.17. The corresponding design parameters, targeting
single-ended 50 Ω and differential 100 Ω impedance control, are listed in
Table 6.3. With the same W1,2 for the microstrip and GCPW transmission lines, the
capacitance of the GCPW is larger because of the additional side capacitance to
the side ground traces. The dominant capacitance is still the one to the bottom
ground plane, since D1 > H1. From $Z_0 = \sqrt{L/C}$, it follows that the designed
GCPW has a lower characteristic impedance than the microstrip. Transmission
line design tools such as Polar Si9000 [80] and the Ansys 2D Extractor
transmission line toolkit [81] can be used for impedance control estimation.
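The effect of the extra side capacitance on the characteristic impedance can be
illustrated numerically. The per-unit-length values below are hypothetical,
chosen only so that the starting impedance is 50 Ω; they are not extracted from
the fabricated board. Holding the inductance constant while the capacitance grows
is also a simplification made for this sketch.

```python
import math

def z0_ohm(l_per_m, c_per_m):
    """Lossless-line characteristic impedance Z0 = sqrt(L/C)."""
    return math.sqrt(l_per_m / c_per_m)

# Hypothetical per-unit-length values (NOT measured on the test board).
L_LINE = 300e-9   # H/m
C_LINE = 120e-12  # F/m

z_microstrip = z0_ohm(L_LINE, C_LINE)
# Assume the side ground traces add 10 % capacitance and leave L unchanged.
z_gcpw = z0_ohm(L_LINE, 1.10 * C_LINE)
print(f"microstrip: {z_microstrip:.1f} ohm, GCPW: {z_gcpw:.1f} ohm")
```

Because the impedance scales with the inverse square root of the capacitance, a
10 % increase in capacitance lowers the impedance by a little under 5 %.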
Figure 6.16: Transmission line sample board made of RO4350B material.
Total length of the transmission line including SMA footprints is 1.368
inch.
Figure 6.17: Cross-section of single-ended microstrip, GCPW and differential
GCPW transmission lines. (Labeled dimensions: W1, W2, T1, H1 and Er1 for all
three types; D1, G1 and G2 for the grounded CPW types; S1 for the differential
grounded CPW.)
Table 6.3: Design parameters of single-ended and differential transmission
lines made of RO4350B Rogers material.
Type    H1          Er1    W1        W2        S1        G1,2     D1        T1
SE      3.937 mil   3.36   7.5 mil   8 mil     NA        20 mil   9 mil     1.4 mil
Diff.   3.937 mil   3.36   5.5 mil   5.8 mil   5.2 mil   20 mil   7.3 mil   1.4 mil
The S-parameters (S11 and S21) of the four transmission lines on the test board
(Figure 6.16) were measured with an Agilent N5225A PNA network analyzer (10 MHz-50
GHz) and are compared in Figure 6.18. The measurement shows that the lack of vias
under the side ground shielding traces of the regular SMA connector footprint
causes a large reflection coefficient (S11) at all frequencies and deep S21
suck-outs at lower frequencies. A sufficient number of vias is critical to ensure
good connections between the top and bottom planes and to minimize any accidental
resonances. It can be observed that at 5 GHz the GCPW has more insertion loss but
a better S11, so it delivers less amplitude but better jitter performance.
Transient measurements were also performed with a 500 mVPP PRBS-7 pattern at 10
and 15 Gb/s. Peak-to-peak jitter and minimum eye height are recorded in Table 6.4
and compared with those of an RF cable alone. Overall, all three types of
transmission lines with the 27 GHz bandwidth Southwest SMA connectors are
acceptable for high-speed signal transmission, and their signal integrity
performance is very comparable.
Figure 6.18: Measured S-parameters of the sample transmission lines in
Figure 6.16.
Table 6.4: Transient measurement for RF cable and three different
transmission lines with 500 mVPP PRBS-7 pattern at 10 and 15 Gb/s,
respectively.
DR (Gb/s)    T-line type      Jitter (P-P)    Eye height
10/15        Just RF cable    7.6/7.7 ps      312/268 mV
10/15        Microstrip       9.3/10.7 ps     248/183 mV
10/15        GCPW             8.8/11.2 ps     242/180 mV
10/15        Coated GCPW      9.9/11.5 ps     250/189 mV
Time-domain reflectometry (TDR) measurements can be applied to determine the
characteristic impedance along the signal path. A transient TDR simulation was
performed in Ansys Circuits by importing the measured S-parameters in s2p data
format. With an input stimulus of 25 ps rise time, the impedance of the GCPW
transmission line is plotted in Figure 6.19. It can be seen that the
characteristic impedance is less than 50 Ω. W1,2 can be reduced slightly to bring
the characteristic impedance of the GCPW closer to 50 Ω.
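A TDR trace is commonly converted to impedance through the reflection coefficient
ρ seen against the reference impedance, Z = Z_ref (1 + ρ)/(1 - ρ). The sketch
below shows this standard relation only; it is not a description of how Ansys
Circuits computes the plot, and the reflection value used is hypothetical.

```python
def impedance_from_rho(rho, z_ref=50.0):
    """Impedance corresponding to reflection coefficient rho against z_ref."""
    return z_ref * (1.0 + rho) / (1.0 - rho)

def rho_from_impedance(z, z_ref=50.0):
    """Reflection coefficient of impedance z against z_ref."""
    return (z - z_ref) / (z + z_ref)

# Hypothetical reading: a 5 % negative reflection in a 50-ohm system.
z_line = impedance_from_rho(-0.05)
print(f"rho = -0.05 -> Z = {z_line:.1f} ohm")
```

A negative reflection corresponds to an impedance below the reference, which is
the situation described for the GCPW line.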
The shorter the rise time, the more impedance discontinuities can be seen in the
channel from the TDR plot. This is because a shorter rise time is more sensitive
to impedance changes than a longer one; a slow edge may pass right over a small
impedance discontinuity without resolving it. This can be attributed to the
frequency content of the signal. Shorter rise times therefore come at a price:
signal content over a wider bandwidth must be captured. The knee frequency
(fknee = 0.5/trise) is usually used to calculate the required bandwidth. The
bandwidth in the frequency domain determines how clean the time-domain
representation will be. For example, a signal with a 100 ps rise time needs only
5 GHz of bandwidth, whereas a 10 ps rise time needs 50 GHz.
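The knee-frequency rule can be evaluated directly for the rise times mentioned in
this section. The function name knee_frequency_hz is illustrative.

```python
def knee_frequency_hz(t_rise_s):
    """Knee frequency f_knee = 0.5 / t_rise."""
    return 0.5 / t_rise_s

# Rise times discussed in the text: 100 ps, 25 ps (TDR stimulus) and 10 ps.
for t_ps in (100, 25, 10):
    f_ghz = knee_frequency_hz(t_ps * 1e-12) / 1e9
    print(f"t_rise = {t_ps:3d} ps -> f_knee = {f_ghz:.0f} GHz")
```

For the 25 ps stimulus used in the TDR simulation the rule gives 20 GHz, which
coincides with the upper frequency of the S-parameter data used for Figure 6.19.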
Figure 6.19: TDR simulated characteristic impedance with 25 ps rise time for the
GCPW, using S-parameters measured from 10 MHz to 20 GHz. (Axes: impedance (Ω)
versus time (ns) and length (inches); inset: a 50 Ω TDR source driving the s2p
model of connector, transmission line and connector into a 50 Ω termination.)
As shown in Figure 6.20, GCPW transmission lines with 27 GHz bandwidth
Southwest connectors are used in the final prototype PCB design. Testing was
assisted by an FPGA for digital control. As shown in Figure 6.21, the quality of
the measured PRBS output eye diagram is improved significantly. Operating at
10.9 Gb/s and driving the 50 Ω oscilloscope input through a dc blocker, the
output has a maximum peak-to-peak jitter of 23.4 ps and an eye amplitude of
453 mV.
Figure 6.20: Picture of the prototype Rogers PCB. (Labeled items: FPGA, DCA
86100D, E8257D PSG, PXA N9030A, and the Sync. clock, Ref. clock and PRBS o/p
connections.)
Figure 6.21: Eye diagrams of PRBS output with prototype Rogers PCB.
6.5 Summary
A full-rate 4-channel parallel PRBS generator with a sequence length of 511 is
demonstrated operating at more than 10 Gb/s with an on-chip PLL. BJT design for
high-speed circuits is discussed, and high-speed PCB design is also discussed
extensively. The designed BiCMOS die is intended to be hybrid integrated with a
PAM-4 traveling-wave MZM device on the silicon photonic die, as shown in Figure
6.22. However, this step was not reached because of issues in testing the
modulator drivers.
Figure 6.22: System integration of electrical die and photonics die via
side-by-side wire bonding (Bond wires drawn not to scale).
CHAPTER 7
CONCLUSION
This dissertation has covered, from the circuit designer’s perspective, optical
device characterization, behavioral modeling and electrical circuit design. The
core contents come from three chips designed during my PhD study: one optical
chip in the IME 130 nm SOI CMOS process, one limiting receiver chip in the IBM
130 nm CMOS process, and one PRBS transmitter chip in the IBM 130 nm SiGe BiCMOS
process. Chapter 2 focused on behavioral modeling of the silicon photonic MZM
device. The developed model was verified with Cadence Spectre simulation, and the
simulated results matched the measurement results well. Chapter 3 utilized the
developed model and data from optical device characterization to analyze the link
budget at the system level, based on a segmented MZM device with voltage drivers.
The optimal transmitter extinction ratio was derived according to the receiver’s
sensitivity requirement. The reliability issue of the latch-based ac-coupling
level shifter was discussed, and an NRZ/PAM-4 reconfigurable driver scheme was
proposed. Chapter 4 showcased a hybrid optoelectronic limiting receiver design
using the IBM 130 nm CMOS process and an InGaAs/InP PIN photodiode device. The
prototype achieved a BER of $10^{-12}$ at a sensitivity level of -3.2 dBm MZM OMA
at 4 Gb/s. Conclusions derived from this case study can provide insights to guide
the design and testing of other optoelectronic limiting receivers at higher
speeds. Chapter 5 systematically presented a type-II third-order charge pump PLL
from system-level architecture to transistor-level design. Noise simulation
results matched the measurement results well. When operating at 10.624 GHz, the
phase noise of the PLL output is -73 dBc/Hz at 100 Hz offset frequency. Chapter 6
demonstrated a full-rate 4-channel parallel PRBS-9 generator clocked by the
on-chip PLL presented in Chapter 5. The chip was fabricated in the IBM 130 nm
SiGe BiCMOS process. High-speed PCB design was also covered with experimental
results.
Building on this work, future work can include designing the complete
transceiver in one process platform, such as the more advanced but cost-effective
65 nm or 28 nm CMOS process nodes, or SiGe BiCMOS. The full-rate multi-channel
parallel PRBS design concept can be used to implement lower-speed multi-channel
parallel PRBS generators with serializers to create the high-speed patterns
required by PAM-4 modulation. In this way, it can better meet the production
requirement of integrating DSP circuits with the high-speed transmitter circuit.
Topics such as the design trade-off between the optical modulator device and its
driver, and the PAM-4 CDR circuit, are important to the creation of the larger
system.
There is still a debate regarding whether silicon photonic technology can
completely replace traditional optical technologies such as vertical-cavity
surface-emitting lasers (VCSELs). Regardless, research scientists are developing
silicon photonic quantum dot lasers in the laboratory [82]. Silicon CMOS photonic
link products, mainly MZM based, were first shipped to the market in 2012 by
Luxtera [83]. Luxtera has recently announced a 100G-PAM4 silicon photonics
chipset [84]. Other companies such as Macom have demonstrated the first CWDM4
laser photonic integrated circuits for 100G datacenter applications [85]. Acacia
Communications announced the industry’s first 400G coherent transceiver module
[86]. In terms of choosing the electrical process for high-speed hybrid
optoelectronic integration, advanced CMOS processes are preferred for DSP-based
systems and by the academic community. However, SiGe BiCMOS and InP processes are
still more widely used by companies such as Infinera, Inphi and Macom. Successful
development of optoelectronic interconnect products requires both circuit design
expertise and optical design expertise. Integrated circuit design for high-speed
optical communication keeps evolving because the growth of social media and the
adoption of Internet of Things (IoT) technologies in all kinds of fields generate
an exploding amount of data, which needs to be transported faster than ever
before and in a more energy-efficient way. Technological breakthroughs beyond
400G are expected in the near future, as many companies, research institutions
and governments are investing substantial funding and manpower in the field.
To this end, the work in this dissertation has looked at several aspects of advanced
modulation based optical interconnect including optical as well as electronic chips,
their co-design and simulation, prototyping and test. This work will form the basis for
further integration of PAM-N transceivers in finer CMOS processes and low parasitic
packaging techniques.
REFERENCES
[1] http://www.ieee802.org/3/bs/index.html.
[2] http://www.itu.int/en/ITU-T/studygroups/2013-2016/15/Pages/default.aspx.
[3] http://www.cdfp-msa.com/.
[4] http://www.oiforum.com/public/currentprojects.html.
[5] B. Analui, D. Guckenberger, D. Kucharski, and A. Narasimha, “A Fully Inte-
grated 20-Gb/s Optoelectronic Transceiver Implemented in a Standard 0.13-µm
CMOS SOI Technology,” IEEE J. Solid-State Circuits, vol. 41, no. 12, pp. 2945
–2955, Dec. 2006.
[6] T. Baehr-Jones, R. Ding, Y. Liu, A. Ayazi, T. Pinguet, N. C. Harris, M. Streshin-
sky, P. Lee, Y. Zhang, A. E.-J. Lim, T.-Y. Liow, S. H.-G. Teo, G.-Q. Lo, and
M. Hochberg, “Ultralow drive voltage silicon traveling-wave modulator,” Opt.
Exp., vol. 20, no. 11, pp. 12 014–12 020, May 2012.
[7] https://www.lumerical.com/.
[8] S. Fathpour and B. Jalali, Silicon photonics for telecommunications and
biomedicine. CRC Press, 2011.
[9] www.phoenixbv.com/optodesigner.
[10] https://www.comsol.com/.
[11] M. Hochberg, N. C. Harris, R. Ding, Y. Zhang, A. Novack, Z. Xuan, and
T. Baehr-Jones, “Silicon photonics: the next fabless semiconductor industry,”
IEEE Solid-State Circuits Magazine, vol. 5, no. 1, pp. 48–58, 2013.
[12] P. Dumon, “ePIXfab, a European platform for silicon photonics R&D,” in 35th
European Conference on Optical Communication. IEEE, 2009, pp. 1–2.
[13] C. Li, R. Bai, A. Shafik, E. Tabasy, G. Tang, C. Ma, C.-H. Chen, Z. Peng,
M. Fiorentino, P. Chiang, and S. Palermo, “A ring-resonator-based silicon
photonics transceiver with bias-based wavelength stabilization and adaptive-
power-sensitivity receiver,” in Proc. IEEE ISSCC Dig. Tech. Papers, Feb 2013,
pp. 124–125.
[14] K. Zhu, V. Saxena, and W. Kuang, “Compact Verilog-A modeling of silicon
traveling-wave modulator for hybrid CMOS photonic circuit design,” in Proc.
IEEE MWSCAS, Aug 2014, pp. 615–618.
[15] R. Ding, Y. Liu, Q. Li, Y. Yang, Y. Ma, K. Padmaraju, A. E.-J. Lim, G.-Q. Lo,
K. Bergman, T. Baehr-Jones, and M. Hochberg, “Design and characterization
of a 30-GHz bandwidth low-power silicon traveling-wave modulator,” Optics
Communications, vol. 321, pp. 124–133, 2014.
[16] K. Zhu, V. Saxena, X. Wu, and W. Kuang, “Design Considerations for Traveling-
Wave Modulator Based CMOS Photonic Transmitters,” IEEE Trans. Circuits
Syst. II, Exp. Briefs, vol. 62, no. 4, pp. 412–416, April 2015.
[17] K. Zhu, C. Li, N. Qi, M. Fiorentino, R. Beausoleil, and V. Saxena, “Modeling
of MZM-based photonic link power budget,” in 2016 IEEE Optical Interconnects
Conference (OI), 2016.
[18] H. Xu, X. Xiao, X. Li, Z. Li, T. Chu, Y. Yu, and J. Yu, “44Gbit/s silicon Mach-
Zehnder modulator based on interleaved PN junctions,” in The 9th International
Conference on Group IV Photonics (GFP), 2012.
[19] G. Rasigade, M. Ziebell, D. Marris-Morini, J.-M. Fédéli, F. Milesi, P. Grosse,
D. Bouville, E. Cassan, and L. Vivien, “High extinction ratio 10 Gbit/s silicon
optical modulator,” Opt. Express, vol. 19, no. 7, pp. 5827–5832, Mar 2011.
[20] L. Chrostowski and M. Hochberg, Silicon Photonics Design: From Devices to
Systems. Cambridge University Press, 2015.
[21] K. Kundert and O. Zinke, The designer’s guide to Verilog-AMS. Springer, 2004.
[22] P. Dong, W. Qian, H. Liang, R. Shafiiha, N.-N. Feng, D. Feng,
X. Zheng, A. V. Krishnamoorthy, and M. Asghari, “Low power and compact
reconfigurable multiplexing devices based on silicon microring resonators,”
Opt. Express, vol. 18, no. 10, pp. 9852–9858, May 2010. [Online]. Available:
http://www.opticsexpress.org/abstract.cfm?URI=oe-18-10-9852
[23] S. Dimitrijev, Principles of semiconductor devices. Oxford university press,
2012.
[24] P. Bravetti, G. Ghislotti, and S. Balsamo, “Chirp-inducing mechanisms in
Mach-Zehnder modulators and their effect on 10 Gb/s NRZ transmission studied
using tunable-chirp single drive devices,” Journal of lightwave technology, vol. 22,
no. 2, p. 605, 2004.
[25] B. Zhang, K. Khanoyan, H. Hatamkhani, H. Tong, K. Hu, S. Fallahi, K. Vakilian,
and A. Brewster, “A 28Gb/s multi-standard serial-link transceiver for backplane
applications in 28nm CMOS,” in Proc. IEEE ISSCC Dig. Tech. Papers, Feb
2015, pp. 1–3.
[26] N. Qi, X. Li, H. Li, X. Xiao, L. Wang, Z. Li, Z. Gao, Y. Yu, M. Moyal, and P. Y.
Chiang, “A 25Gb/s, 520mW, 6.4Vpp Silicon-Photonic Mach-Zehnder Modulator
with Distributed Driver in CMOS,” in Optical Fiber Communication Conference.
Optical Society of America, 2015, p. W1B.3.
[27] P. R. Gray, P. J. Hurst, R. G. Meyer, and S. H. Lewis, Analysis and design of
analog integrated circuits. John Wiley & Sons, 2008.
[28] B. Razavi, Design of integrated circuits for optical communications. John Wiley
& Sons, 2012.
[29] H. Ransijn, G. Salvador, D. D. Daugherty, and K. D. Gaynor, “A 10-Gb/s
laser/modulator driver IC with a dual-mode actively matched output buffer,”
IEEE J. Solid-State Circuits, vol. 36, no. 9, pp. 1314–1320, 2001.
[30] C.-M. Tsai and M.-C. Chiu, “A 10Gb/s laser-diode driver with active back-
termination in 0.18µm CMOS,” in Proc. IEEE ISSCC Dig. Tech. Papers. IEEE,
2008, pp. 222–608.
[31] K. Zhu, V. Saxena, and X. Wu, “Modeling and Optimization of the Bondwire
Interface in a Hybrid CMOS Photonic Traveling-wave MZM Transmitter,” in
Proc. IEEE SOCC, Sept 2016.
[32] “D.C. Blocks, a Trap for the Unwary When Using Long Patterns.”
[33] C. Li, R. Bai, A. Shafik, and et al., “Silicon Photonic Transceiver Circuits With
Microring Resonator Bias-Based Wavelength Stabilization in 65 nm CMOS,”
IEEE J. Solid-State Circuits, vol. 49, no. 6, pp. 1419–1436, June 2014.
[34] H. Li, Z. Xuan, A. Titriku, C. Li, K. Yu, B. Wang, A. Shafik, N. Qi,
Y. Liu, R. Ding, T. Baehr-Jones, M. Fiorentino, M. Hochberg, S. Palermo,
and P. Chiang, “A 25Gb/s, 4.4V-swing, AC-coupled Ring Modulator-Based
WDM Transmitter with Wavelength Stabilization in 65 nm CMOS,” IEEE J.
Solid-State Circuits, 2015.
[35] X. Wu, B. Dama, P. Gothoskar, P. Metz, K. Shastri, S. Sunder, J. Van der
Spiegel, Y. Wang, M. Webster, and W. Wilson, “A 20Gb/s NRZ/PAM-4 1V
transmitter in 40nm CMOS driving a Si-photonic modulator in 0.13 µm CMOS,”
in Proc. IEEE ISSCC Dig. Tech. Papers, Feb 2013, pp. 128–129.
[36] M. Cignoli, G. Minoia, M. Repossi, D. Baldi, A. Ghilioni, E. Temporiti, and
F. Svelto, “22.9 A 1310nm 3D-integrated silicon photonics Mach-Zehnder-based
transmitter with 275mW multistage CMOS driver achieving 6dB extinction ratio
at 25Gb/s,” in Proc. IEEE ISSCC Dig. Tech. Papers, Feb 2015, pp. 1–3.
[37] A. Ghiasi, Z. Wang, V. Telang, and B. Welch, “Investigation of PAM-4/6/8
signaling and FEC for 100 Gb/s serial transmission,” IEEE P802. 3bm, vol. 40,
2012.
[38] K. Yu, H. Li, C. Li, A. Titriku, A. Shafik, B. Wang, Z. Wang, R. Bai, C.-H. Chen,
M. Fiorentino et al., “22.4 A 24Gb/s 0.71 pJ/b Si-photonic source-synchronous
receiver with adaptive equalization and microring wavelength stabilization,” in
Proc. IEEE ISSCC Dig. Tech. Papers. IEEE, 2015, pp. 1–3.
[39] C.-F. Liao and S.-I. Liu, “40 Gb/s transimpedance-AGC amplifier and CDR
circuit for broadband data receivers in 90 nm CMOS,” IEEE J. Solid-State
Circuits, vol. 43, no. 3, pp. 642–655, 2008.
[40] M. Integrated, “Application note: Accurately estimating optical receiver sensi-
tivity,” Tech. Rep., 2008.
[41] P. Quan, Y. Wang, Y. Lu, and C. P. Yue, “An 18-Gb/s Fully Integrated Optical
Receiver with Adaptive Cascaded Equalizer,” IEEE Journal of Selected Topics
in Quantum Electronics, 2016.
[42] D. Lee, J. Han, G. Han, and S. M. Park, “An 8.5-Gb/s fully integrated CMOS
optoelectronic receiver using slope-detection adaptive equalizer,” IEEE J. Solid-
State Circuits, vol. 45, no. 12, pp. 2861–2873, 2010.
[43] J.-S. Youn, M.-J. Lee, K.-Y. Park, and W.-Y. Choi, “10-Gb/s 850-nm CMOS
OEIC receiver with a silicon avalanche photodetector,” Quantum Electronics,
IEEE Journal of, vol. 48, no. 2, pp. 229–236, 2012.
[44] http://www.albisopto.com/albis product/pdcs20t/.
[45] D. Thomson, A. Zilkie, J. E. Bowers, and et al., “Roadmap on silicon photonics,”
Journal of Optics, vol. 18, no. 7, p. 073003, 2016.
[46] D. Li, G. Minoia, M. Repossi, D. Baldi, E. Temporiti, A. Mazzanti, and F. Svelto,
“A low-noise design technique for high-speed CMOS optical receivers,” IEEE J.
Solid-State Circuits, vol. 49, no. 6, pp. 1437–1447, 2014.
[47] Q. Yang, N. Qi, J. Wang, Z. Wang, Z. Hong, and P. Chiang, “A hybrid
integrated TIA and PD for 20-Gb/s optical receivers,” in Opto-Electronics and
Communications Conference (OECC), 2015. IEEE, 2015, pp. 1–3.
[48] E. Sackinger, Broadband Circuits for Optical Fiber Communication, 1st ed.
Wiley, 2005.
[49] E. Amerasekera and C. Duvvury, ESD in silicon integrated circuits 2nd edition.
John Wiley & Son Ltd, 2002.
[50] R. P. Jindal, “Gigahertz-band high-gain low-noise AGC amplifiers in fine-line
NMOS,” IEEE J. Solid-State Circuits, vol. 22, no. 4, pp. 512–521, 1987.
[51] S. Galal and B. Razavi, “10-Gb/s limiting amplifier and laser/modulator driver
in 0.18-µm CMOS technology,” IEEE J. Solid-State Circuits, vol. 38, no. 12, pp.
2138 – 2146, Dec. 2003.
[52] H.-Y. Huang, J.-C. Chien, and L.-H. Lu, “A 10-Gb/s Inductorless CMOS
Limiting Amplifier With Third-Order Interleaving Active Feedback,” IEEE J.
Solid-State Circuits, vol. 42, no. 5, pp. 1111 –1120, May 2007.
[53] K. Zhu, S. Balagopal, and V. Saxena, “Systematic design of 10-bit 50MS/s
pipelined ADC,” in Microelectronics and Electron Devices (WMED), 2013 IEEE
Workshop on, April 2013, pp. 17–20.
[54] V. Singh, N. Krishnapura, S. Pavan, B. Vigraham, D. Behera, and N. Nigania,
“A 16 MHz BW 75 dB DR CT ADC Compensated for More Than One Cycle
Excess Loop Delay,” IEEE J. Solid-State Circuits, vol. 47, no. 8, pp. 1884 –1895,
aug. 2012.
[55] M. P. Li, J. Wilstrup, R. Jessen, and D. Petrich, “A new method for jitter
decomposition through its distribution tail fitting,” in ITC, 1999, pp. 788–794.
[56] F. Tavernier and M. S. Steyaert, “High-speed optical receivers with integrated
photodiode in 130 nm CMOS,” IEEE J. Solid-State Circuits, vol. 44, no. 10, pp.
2856–2867, 2009.
[57] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda, and M. Fukaishi, “A
2.1-to-2.8-GHz Low-Phase-Noise All-Digital Frequency Synthesizer With a Time-
Windowed Time-to-Digital Converter,” IEEE J. Solid-State Circuits, vol. 45,
no. 12, pp. 2582–2590, Dec 2010.
[58] P. Andreani and A. Fard, “More on the phase noise performance of CMOS
differential-pair LC-tank oscillators,” IEEE J. Solid-State Circuits, vol. 41,
no. 12, pp. 2703–2712, 2006.
[59] J. Verdier, J. C. N. Perez, and C. Gontrand, “Design and optimization of 20
GHz LC-VCOs in SiGe: C BiCMOS technology,” in ICCSC. IEEE, 2008, pp.
648–652.
[60] L. Fanori and P. Andreani, “Low-phase-noise 3.4–4.5 GHz dynamic-bias class-C
CMOS VCOs with a FoM of 191 dBc/Hz,” in ESSCIRC. IEEE, 2012, pp.
406–409.
[61] M. H. Perrott, “PLL design using the PLL Design Assistant program,” Microsys-
tems Technology Laboratories, Massachusetts Institute of Technology, Apr. 2,
2005, vol. 33, 2002.
[62] I. A. Young, J. K. Greason, and K. L. Wong, “A PLL clock generator with 5
to 110 MHz of lock range for microprocessors,” IEEE J. Solid-State Circuits,
vol. 27, no. 11, pp. 1599–1607, 1992.
[63] N. H. E. Weste et al., CMOS VLSI Design: A Circuits and Systems Perspective,
3/E. Pearson Education India, 2006.
[64] J. Lee and B. Razavi, “A 40-GHz frequency divider in 0.18-µm CMOS technology,”
IEEE J. Solid-State Circuits, vol. 39, no. 4, pp. 594–601, 2004.
[65] D. Banerjee, PLL performance, simulation and design. Dog Ear Publishing,
2006.
[66] J. J. O’Reilly, “Series-parallel generation of m-sequences,” Radio and Electronic
Engineer, vol. 45, no. 4, pp. 171–176, April 1975.
[67] K. Metzer Jr and R. Bouwens, “An ordered table of primitive polynomials over
GF(2) of degrees 2 through 19 for use with linear maximal sequence generators.”
DTIC Document, Tech. Rep., 1972.
[68] S. Bommalingaiahnapallya, K.-J. Sham, M. Ahmadi, and R. Harjani, “High-
Speed Circuits for a Multi-Lane 12 Gbps CMOS PRBS Generator,” in Proc.
IEEE ISCAS Symp. Dig., May 2007, pp. 3896–3899.
[69] http://www.anritsu.com/.
[70] H. Knapp, M. Wurzer, W. Perndl, K. Aufinger, J. Bock, and T. Meister,
“100-Gb/s $2^{7}-1$ and 54-Gb/s $2^{11}-1$ PRBS generators in SiGe bipolar
technology,” IEEE J. Solid-State Circuits, vol. 40, no. 10, pp. 2118–2125, Oct 2005.
[71] T. Dickson, E. Laskin, I. Khalid, R. Beerkens, J. Xie, B. Karajica, and
S. Voinigescu, “An 80-Gb/s $2^{11}-1$ pseudorandom binary sequence generator in
SiGe BiCMOS technology,” IEEE J. Solid-State Circuits, vol. 40, no. 12, pp.
2735–2745, Dec 2005.
[72] E. Laskin and S. Voinigescu, “A 60 mW per Lane, 4 × 23-Gb/s $2^{7}-1$ PRBS
Generator,” IEEE J. Solid-State Circuits, vol. 41, no. 10, pp. 2198–2208, Oct
2006.
[73] http://en.wikipedia.org/wiki/Linear_feedback_shift_register.
[74] H. M. Power and R. J. Simpson, Introduction to dynamics and control. McGraw-
Hill UK, 1978.
[75] IBM, Design Kit and Technology Training, BiCMOS8HP, V1210.
[76] T. Masuda, K. Ohhata, N. Shiramizu, E. Ohue, K. Oda, R. Hayami, H. Shi-
mamoto, M. Kondo, T. Harada, and K. Washio, “SiGe-HBT-based 54-Gb/s 4:1
multiplexer IC with full-rate clock for serial communication systems,” IEEE J.
Solid-State Circuits, vol. 40, no. 3, pp. 791–795, March 2005.
[77] W. McFarland, K. Springer, and C.-S. Yen, “1-Gword/s pseudorandom word
generator,” IEEE J. Solid-State Circuits, vol. 24, no. 3, pp. 747–751, Jun 1989.
[78] http://www.ironwoodelectronics.com/.
[79] http://mpd.southwestmicrowave.com/products/endLaunch.php.
[80] http://www.polarinstruments.com/.
[81] http://www.ansys.com/Products/Electronics/Option-ANSYS-SI.
[82] G. Kurczveil, D. Liang, M. Fiorentino, and R. G. Beausoleil, “Robust hybrid
quantum dot laser for integrated silicon photonics,” Opt. Express, vol. 24, no. 14,
pp. 16 167–16 174, Jul 2016.
[83] http://12.162.78.22:8080/luxtera/201221LuxteraOneMillionthCMOS.pdf.
[84] http://12.162.78.22:8080/luxtera/LuxteraOFC2015PressRelease2232015.pdf.
[85] https://www.macom.com/products/photonic-solutions/laser-photonic-
integrated-circui.
[86] http://acacia-inc.com/products/ac400-flex-product-family/.
APPENDIX A
VERILOG-A TO ENABLE OPTICAL SIMULATION
The built-in Verilog-AMS definitions for natures, disciplines and constants are
located in the MMSIM installation hierarchy under tools/spectre/etc/ahdl. Optical
power and optical phase can be custom defined in a similar fashion in a separate
discipline file, with units of “W” and “rads” defined in the corresponding natures.
Listing A.1: Custom optical discipline file
/*
  Verilog-A definition of Silicon Photonics related
  Natures and Disciplines
  $RCSfile: opticalDisciplines.vams,v
  $Revision: 1.0 Date: Feb 15 2013$
  $Kehan Zhu, Vishal Saxena @ Boise State University
*/
`ifdef DISCIPLINES_OPTICAL
`else
`define DISCIPLINES_OPTICAL 1

// Optical Power in Watts
nature OPower
    units = "W";
    access = OptPower;
    abstol = 1e-9;
endnature

// Optical Phase in radians
nature OPhase
    units = "rads";
    access = OptPhase;
    ddt_nature = Angular_Velocity;
    abstol = 1e-9;
endnature

// Signal flow disciplines
discipline opticalPower
    potential OPower;
    flow Current;
enddiscipline

discipline opticalPhase
    potential OPhase;
    flow Angular_Force;
enddiscipline

// Close the include guard opened by `ifdef above
`endif
Listing A.2: Optical source converts voltage to optical power and optical
phase
// VerilogA for Modulators, OptSource, veriloga
`include "constants.vams"
`include "disciplines.vams"
`include "../../../opticalDisciplines.vams"

module OptSource(VoptPower, VoptPhase, outOptPower, outOptPhase);
    // voltage sources setting the power and phase
    input VoptPower, VoptPhase;
    output outOptPower, outOptPhase;
    electrical VoptPower, VoptPhase;
    opticalPower outOptPower;
    opticalPhase outOptPhase;

    analog begin
        if (V(VoptPower) > 0)
            OptPower(outOptPower) <+ V(VoptPower);
        else
            OptPower(outOptPower) <+ 0;
        OptPhase(outOptPhase) <+ V(VoptPhase);
    end
endmodule
APPENDIX B
DETERMINE THE PRBS FEEDBACK TAP
A Python script is used to determine the PRBS feedback tap. Python and SymPy
need to be installed to run the code. The script divides $x^{2^n-1} + 1$ by the
candidate feedback polynomial; for a valid tap, the coefficients of the remainder
should all be even numbers. Even and odd numbers are equivalent to 0 and 1,
respectively, in the modulo-2 operation.
Listing B.1: Python code to determine the feedback tap for primitive
trinomial
from sympy import symbols, div, poly

x = symbols('x')
n = 9
k = 4  # in GF(2), k should be chosen such that f mod p is 0
f = x**(2**n - 1) + 1   # dividend
p = x**n + x**k + 1     # feedback polynomial as divisor
q, r = div(f, p, x)     # quotient and remainder (integer coefficients)
remainder = poly(r, x)
coef = remainder.coeffs()
# The remainder vanishes in GF(2) only if every coefficient is even.
ans = 0
for elem in coef:
    if elem % 2 != 0:
        ans = 1
        break
print(ans)  # The answer should be 0
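The divisibility test in Listing B.1 is a necessary condition for a
maximal-length sequence. As a complementary check that is not part of the
original listing, the shift register itself can be stepped until its state
repeats; a maximal-length PRBS of order n must return to its seed after exactly
2^n - 1 steps. The sketch below is written in plain Python under the assumption of
a Fibonacci-style register implementing a[i] = a[i-n] XOR a[i-k]. Depending on
the tap-numbering convention this corresponds to x^n + x^k + 1 or to its
reciprocal x^n + x^(n-k) + 1, which have the same period. The function name
lfsr_period is illustrative.

```python
def lfsr_period(n, k, seed=1):
    """Steps until the register a[i] = a[i-n] XOR a[i-k] returns to `seed`.

    Bit 0 of `state` holds the oldest sample a[i-n]; bit n-k holds a[i-k].
    """
    state = seed
    period = 0
    while True:
        new_bit = (state ^ (state >> (n - k))) & 1    # XOR of the two taps
        state = (state >> 1) | (new_bit << (n - 1))   # shift in the new bit
        period += 1
        if state == seed:
            return period

# The tap used in Listing B.1 (n = 9, k = 4) should give 2**9 - 1 = 511.
print(lfsr_period(9, 4))
```

A tap choice that does not give a maximal-length sequence shows up as a shorter
period; for example, n = 4 with k = 2 repeats after only 6 steps instead of 15.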
APPENDIX C
FIRST AUTHOR PUBLICATIONS DURING 2013-2016
1. Kehan Zhu, Vishal Saxena, Xinyu Wu and Wan Kuang, “Design considerations for
traveling-wave modulator based CMOS photonic transmitters”, IEEE Trans. Circuits
Syst. II, vol. 62, No. 4, April 2015.
2. Kehan Zhu, Vishal Saxena, Xinyu Wu, “Modeling and Optimization of the
Bondwire Interface in a Hybrid CMOS Photonic Traveling-wave MZM Transmitter”,
in Proc. IEEE SOCC, 2016.
3. Kehan Zhu, Cheng Li, Nan Qi, et al., “Modeling of MZM-Based Photonic Link
Power Budget”, Optical Interconnects Conference, 2016.
4. Kehan Zhu, Vishal Saxena, Xinyu Wu, “A Comprehensive Design Approach for a
MZM Based PAM-4 Silicon Photonic Transmitter”, in Proc. IEEE MWSCAS, Aug
2015.
5. Kehan Zhu, Vishal Saxena, Xinyu Wu and Sakkarapani Balagopal, “Design
analysis of a 12.5 GHz PLL in 130 nm SiGe BiCMOS process”, in IEEE Workshop on
Microelectronics and Electron Devices (WMED), March 2015.
6. Kehan Zhu, Vishal Saxena and Wan Kuang, “Compact Verilog-A modeling of
silicon traveling-wave modulator for hybrid CMOS photonic circuit design”, in Proc.
IEEE MWSCAS, Aug 2014, pp.615-618.
7. Kehan Zhu, Sakkarapani Balagopal, Vishal Saxena and Wan Kuang, “Design of a
10-Gb/s Integrated Limiting Receiver for Silicon Photonics Interconnects”, in Proc.
IEEE MWSCAS, Aug 2013, pp.713-716.
8. Kehan Zhu, Sakkarapani Balagopal, Vishal Saxena, “Systematic Design of 10-bit
50MS/s Pipelined ADC”, in IEEE Workshop on Microelectronics and Electron Devices
(WMED), April 2013 (Best Paper Award).
