A Comprehensive Design Approach for a MZM Based PAM-4 Silicon Photonic Transmitter by Zhu, Kehan et al.
Boise State University
ScholarWorks
Electrical and Computer Engineering Faculty
Publications and Presentations
Department of Electrical and Computer
Engineering
1-1-2015
A Comprehensive Design Approach for a MZM
Based PAM-4 Silicon Photonic Transmitter
Kehan Zhu
Boise State University
Vishal Saxena
Boise State University
Xinyu Wu
Boise State University
© 2015, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,
including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to
servers or lists, or reuse of any copyrighted component of this work in other works. doi: 10.1109/MWSCAS.2015.7282203
Kehan Zhu, Vishal Saxena, Xinyu Wu
Department of Electrical and Computer Engineering, Boise State University, Boise, ID 83725.
Email: kehanzhu@u.boisestate.edu
Abstract—A 4-level pulse amplitude modulation (PAM-4) silicon
photonic transmitter targeting operation at 25 Gb/s is
designed using an electrical-photonic co-design methodology. The
prototype consists of an electrical circuit and a photonics circuit,
designed in a 130 nm IBM SiGe BiCMOS process and a 130 nm IME
SOI CMOS process, respectively. The two dies will be interfaced
via side-by-side wire bonding. The electrical die mainly includes
a 12.5 GHz PLL, a full-rate 4-channel uncorrelated $2^7 - 1$
pseudo-random binary sequence (PRBS) generator and CML drivers.
The photonics die is a 2-segment Mach-Zehnder modulator (MZM)
silicon photonics device with a thermal tuning feature for PAM-4.
A Verilog-A model of the MZM enables system simulation of the
optical devices together with the electrical circuitry using
custom IC design tools. A full-rate 4-channel uncorrelated PRBS
design using the transition matrix method is detailed, in which
any two of the four channels can provide the random binary
sequences that drive the two segments of the MZM to generate the
PAM-4 signal.
I. INTRODUCTION
Data bandwidth requirements in the information and communication
technology industry keep increasing exponentially as
we transition into the era of 'Big Data'. This growth directly
impacts the cost and the energy consumption of the network
infrastructure. Silicon photonics technology, which can be
used to fabricate optical devices using CMOS-compatible
technology, not only brings integrated photonic chips into mass
production but can also save a significant amount of power by
moving 'big data' in the form of photons over 1000× faster than
in the form of electrons. Due to the advantages of enormous
bandwidth, lower power consumption and interference immunity,
silicon photonics is seen as the enabler for exponential
data throughput growth at all levels of the data communication
hierarchy. This entails bringing low-cost photonic ICs into
the racks of future data centers, between chips on the
motherboard and backplanes, and even into future many-core
processors.
The photonics community has successfully brought the silicon
photonics platform into commercially accessible foundry services
(e.g., IME in Singapore and ePIXfab from Europractice), but
no process design kit (PDK) is available for simulation-based
validation before chip fabrication. Some commercial photonic
device design tools are on the market. However, their lack
of interoperability with custom integrated circuit design tools
makes electric-photonic co-design challenging. Also, new analog
circuits are needed to leverage photonics to realize novel
and optimized integrated solutions. We developed a mixed-signal
electric-photonic co-design flow, which can simulate
electrical circuits together with photonic devices as a system.
As a continuation of our previous work in [1] and [2], this paper
introduces an integrated MZM based silicon photonics pulse amplitude
modulation (PAM-4) transmitter (TX) that targets operation at
25 Gb/s and consists of an electrical die and a photonics die
connected via side-by-side wire bonding. The PAM-4 MZM driver was
systematically designed in [2]. However, it poses a practical
testing problem: it needs two synchronized, uncorrelated PRBS
streams. Although test equipment vendors (e.g., Anritsu, Tektronix)
offer pulse pattern generator (PPG) models that provide
multi-channel PRBS streams, their cost is prohibitive, and it is
challenging to feed the input signals externally at such high
speeds. Alternatively, a multi-channel PRBS generator can be
designed and fabricated on-chip. In this work, a system transition
matrix design method is applied to design a full-rate 4-channel
uncorrelated true $2^7 - 1$ PRBS generator, along with a complete
MZM based PAM-4 TX architecture. MZM device modeling and bond wire
parasitic modeling are addressed to meet the mixed-signal
simulation and design challenges. The rest of the paper is
organized as follows. Section II illustrates the complete
electrical chip architecture used to drive the MZM devices.
Section III provides the PRBS design methodology and post-layout
simulation results. Section IV discusses the system-level
simulation and integration, and Section V concludes the work with
simulation results.
II. ARCHITECTURE OF THE ELECTRICAL CIRCUIT
The electrical driver IC reported in the latest MZM based
silicon photonics PAM-4 TX literature [3] requires multi-channel
10 Gb/s PRBS streams to be fed into the chip externally, which
poses testing challenges. The photonics device community
typically uses RF probes with a bias-T and a PPG to apply the
electrical drive when testing photonic devices. For example, [4]
realized PAM-4 modulation by manipulating the magnitude
and delay of two PRBS signals and then combining them
into an electrical PAM-4 signal before driving an MZM.
However, this approach does not leverage the benefits of
processing high-speed signals in the optical phase domain.
Fig. 1 shows the proposed system block diagram of the
MZM driver architecture for the PAM-4 TX. The main blocks
include a 12.5 GHz PLL, a 4-channel uncorrelated $2^7 - 1$ PRBS
generator and the PAM-4 driver. The design has digital control
tunability to compensate for PVT variations. Explicit ESD
protection devices are added to the non-high-speed pads for
reliability. With this proposed architecture, external high-speed
signals and clocks are avoided, which makes the system practical
and the experimental characterization easy.
The PLL design and the PAM-4 driver design were covered in [5] and
[2], respectively, and are not repeated here. The PLL
not only provides a clean 12.5 GHz clock for the PRBS generator,
but also generates a 12.5/64 GHz synchronization clock for the
sampling oscilloscope (Agilent 86100B) when measuring the
eye diagram. The design of the 4-channel uncorrelated PRBS
generator is elaborated in the next section.
Fig. 1. System block diagram of the electrical driver for the MZM based
PAM-4 TX.
This is an author-produced, peer-reviewed version of this article. The final, definitive version of this document can be found online at Proceedings 2015 
International Joint Conference on Neural Networks, published by IEEE. Copyright restrictions may apply. doi:  10.1109/IJCNN.2015.7280819
III. 4-CHANNEL UNCORRELATED $2^7 - 1$ PRBS DESIGN
A half-rate 4-channel 23 Gb/s PRBS generator implemented with 8
DFFs was proposed in [6]. Only 7 of the DFFs were used
in the feedback loops to generate 8 channels, which were
then multiplexed to obtain the 4-lane outputs. However, when
using only 7 DFFs it is impossible to obtain four uncorrelated
output streams with a period of $2^7$ bits. Furthermore, there is a
redundancy in the feedback logic (x7 ⊕ x6 ⊕ x6 ⊕ x5 = x7 ⊕ x5)
for the phase 315° DFF of Fig. 1(b) in [6]. This is apparently
because the authors of [6] did not apply modulo-2 addition
when computing $T^8$, where $T$ is the 7-by-7 transition matrix.
An n-stage linear feedback shift register (LFSR) PRBS
generator is illustrated in Fig. 2. This topology can generate a
PRBS of maximal length $2^n - 1$ provided k is appropriately
chosen. Some of the possible combinations of n and k are given
in the table in Fig. 2 [7].
Fig. 2. An n-stage PRBS generator with possible n and k combinations
(adapted from [7]).
In this design, n = 9 and k = 5 were chosen, and the
corresponding transition matrix T is given by (1). If the 9th
DFF is initialized to 1 at the start and the others to 0, the
initial state of the nine DFFs is represented as
$s(0) = [\,0\ 0\ 0\ 0\ 0\ 0\ 0\ 0\ 1\,]^T$. After l clock cycles, the
state of the DFFs is $s(l) = T^l s(0)$.

$$
T =
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
\tag{1}
$$
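The recursion $s(l) = T^l s(0)$ can be checked numerically. The following Python sketch is illustrative only and is not the authors' implementation; the helper names (`transition_matrix`, `mat_vec_mod2`, `sequence_period`) are assumptions introduced here. It builds T from (1) for n = 9 and k = 5, applies it with modulo-2 arithmetic starting from s(0), and counts the clock cycles until the state repeats.

```python
N_STAGES = 9   # n in the text
TAP = 5        # k in the text


def transition_matrix(n=N_STAGES, k=TAP):
    """Build the n-by-n transition matrix of Eq. (1) as a list of rows over GF(2)."""
    T = [[0] * n for _ in range(n)]
    T[0][k - 1] = 1            # first row: s_1(l+1) = s_k(l) XOR s_n(l)
    T[0][n - 1] = 1
    for row in range(1, n):
        T[row][row - 1] = 1    # remaining rows: s_r(l+1) = s_{r-1}(l)
    return T


def mat_vec_mod2(M, v):
    """Matrix-vector product with modulo-2 addition."""
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]


def sequence_period(T, s0):
    """Count the clock cycles until the state returns to s0."""
    state = mat_vec_mod2(T, s0)
    steps = 1
    while state != s0:
        state = mat_vec_mod2(T, state)
        steps += 1
    return steps


T = transition_matrix()
s0 = [0] * (N_STAGES - 1) + [1]   # only the 9th DFF is initialized to 1
period = sequence_period(T, s0)
print("state sequence period:", period)
```

For this choice of n and k the loop returns $2^9 - 1 = 511$, the maximal length for a 9-stage register.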
The output $s_9(j)$ of the above PRBS generator is either 1 or 0. In
computing the correlation of sequences, it is standard practice
to rescale the outputs to 1 and -1, respectively, which is done
by letting $t(j) \triangleq 2s(j) - 1$. Implementing this transition
matrix results in the sequence $\{t_9(j)\} = \{2s_9(j) - 1\}$ being
uncorrelated with itself for a period of length $2^9$ [7]. This
implies that for $i = 0, 1, 2, \ldots, 2^9 - 1$, the auto-correlation is
given by (2).

$$
\phi(i) = \frac{1}{2^9 - 1} \sum_{j=0}^{2^9 - 1} t_9(j)\, t_9(j - i) =
\begin{cases}
1, & i = 0 \\[4pt]
\dfrac{-1}{2^9 - 1}, & i \neq 0
\end{cases}
\tag{2}
$$
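The two-valued auto-correlation in (2) can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the Spectre/Matlab flow used by the authors: the function names are hypothetical, and the sum is taken circularly over exactly one period of $2^9 - 1$ samples, which is the assumption under which the normalized values come out as 1 and $-1/(2^9 - 1)$.

```python
def lfsr_output(n_bits, n=9, k=5):
    """Return n_bits samples of s_n(j) for the LFSR described by Eq. (1)."""
    state = [0] * (n - 1) + [1]            # s(0): only the last DFF is set
    out = []
    for _ in range(n_bits):
        out.append(state[-1])              # observe s_9(j)
        feedback = state[k - 1] ^ state[n - 1]
        state = [feedback] + state[:-1]    # shift every stage by one position
    return out


def circular_corr_sum(x, y, lag):
    """Unnormalized sum over one period of x(j) * y(j - lag), indices modulo len(x)."""
    n = len(x)
    return sum(x[j] * y[(j - lag) % n] for j in range(n))


PERIOD = 2**9 - 1                                   # assumed summation window
t9 = [2 * bit - 1 for bit in lfsr_output(PERIOD)]   # rescale {0, 1} to {-1, +1}

phi = [circular_corr_sum(t9, t9, i) / PERIOD for i in range(PERIOD)]
print("phi(0) =", phi[0])
print("phi(1) =", phi[1], " reference -1/(2^9 - 1) =", -1 / PERIOD)
```

Working with the unnormalized integer sums (511 at zero lag, -1 elsewhere) avoids floating-point comparisons when checking the result.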
As shown in [8], implementing $T^4$ (rather than $T$) results
in the four outputs $s_5(l)$, $s_6(l)$, $s_7(l)$, $s_8(l)$ being
uncorrelated with each other. These outputs are found using
$s(l) = (T^4)^l s(0)$.

$$
T^4 =
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0
\end{bmatrix}
\tag{3}
$$
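Matrix (3) follows from (1) by repeated squaring with modulo-2 addition. The sketch below is an illustration with hypothetical helper names (`transition_matrix`, `mat_mul_mod2`): it rebuilds T, forms $T^2$ and $T^4$ over GF(2), and compares the result with the entries printed in (3).

```python
def transition_matrix(n=9, k=5):
    """Transition matrix of Eq. (1): feedback in row 1, shift register below."""
    T = [[0] * n for _ in range(n)]
    T[0][k - 1] = 1
    T[0][n - 1] = 1
    for row in range(1, n):
        T[row][row - 1] = 1
    return T


def mat_mul_mod2(A, B):
    """Matrix product with every entry reduced modulo 2."""
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) % 2 for c in range(n)]
            for r in range(n)]


T = transition_matrix()
T2 = mat_mul_mod2(T, T)
T4 = mat_mul_mod2(T2, T2)

# Entries of Eq. (3), copied from the text for comparison.
T4_EQ3 = [
    [0, 1, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
]

for row in T4:
    print(" ".join(str(bit) for bit in row))
print("matches Eq. (3):", T4 == T4_EQ3)
```

Each row of $T^4$ with two nonzero entries corresponds to one XOR in hardware, so the row pattern can be read directly as the feedback wiring.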
It then follows that $s_5(l)$, $s_6(l)$, $s_7(l)$, $s_8(l)$ are essentially
uncorrelated for a period of $2^7$. That is, with $t_p(j) = 2s_p(j) - 1$
and $p = 5, 6, 7, 8$, for $i = 0, 1, 2, \ldots, 2^7 - 1$ the
cross-correlation is given by (4).

$$
\phi_{pq}(i) \triangleq \frac{1}{2^7 - 1} \sum_{j=0}^{2^7 - 1} t_p(j)\, t_q(j - i) =
\begin{cases}
1, & i = 0,\ q = p \\[4pt]
\dfrac{-1}{2^7 - 1}, & i = 0,\ q \neq p \\[4pt]
\dfrac{-1}{2^7 - 1}, & i \neq 0
\end{cases}
\tag{4}
$$
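The behaviour described by (4) can be explored with a short behavioural simulation. The Python sketch below is an assumption-laden illustration rather than the authors' Spectre/Matlab flow: `step_t4`, `simulate_channels` and `corr_sum` are hypothetical names, the update rule is read row by row from (3), and the correlation is an unnormalized circular sum over one full state period of $2^9 - 1$ clocks instead of the window and normalization printed in (4). It checks the cross-correlation between every ordered pair of channels for lags $|i| < 2^7 - 1$ and lists the lags at which each pair becomes fully correlated.

```python
from itertools import permutations

PERIOD = 2**9 - 1          # length of the state sequence of the 9-DFF generator
CHANNELS = (5, 6, 7, 8)    # DFF indices used as outputs s_5 ... s_8


def step_t4(s):
    """Advance the state by one clock of the T^4 system, row by row from Eq. (3)."""
    return [s[1] ^ s[5], s[2] ^ s[6], s[3] ^ s[7], s[4] ^ s[8],
            s[0], s[1], s[2], s[3], s[4]]


def simulate_channels(n_steps=PERIOD):
    """Return {p: [t_p(0), t_p(1), ...]} with t_p = 2*s_p - 1."""
    state = [0] * 8 + [1]                      # s(0) from the text
    t = {p: [] for p in CHANNELS}
    for _ in range(n_steps):
        for p in CHANNELS:
            t[p].append(2 * state[p - 1] - 1)  # rescale {0, 1} to {-1, +1}
        state = step_t4(state)
    return t


def corr_sum(x, y, lag):
    """Unnormalized circular sum of x(j) * y(j - lag) over one period."""
    n = len(x)
    return sum(x[j] * y[(j - lag) % n] for j in range(n))


t = simulate_channels()
LAGS = range(-(2**7 - 2), 2**7 - 1)            # all lags with |i| < 2^7 - 1

worst = max(abs(corr_sum(t[p], t[q], i))
            for p, q in permutations(CHANNELS, 2) for i in LAGS)
print("largest |cross-correlation sum| for |i| < 2^7 - 1:", worst)

# Lags (0 ... PERIOD-1) at which a channel pair is fully correlated.
peak_lags = {(p, q): [i for i in range(PERIOD)
                      if corr_sum(t[p], t[q], i) == PERIOD]
             for p, q in permutations(CHANNELS, 2)}
for pair, lags in sorted(peak_lags.items()):
    print(pair, lags)
```

Dividing the printed sums by the window length gives the normalized correlation; the peak lags indicate how far one channel must be shifted before it coincides with another.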
The PRBS generator was simulated with Spectre and the
outputs were processed in Matlab to compute $\phi_{pq}(i)$, which is
plotted in Fig. 4 over several periods of length $2^7$, illustrating
(4). The auto-correlation period of $2^9$ for each output channel
means that a single PRBS channel has a maximal length of
$2^9 - 1$. The uncorrelated length is reduced to $2^7 - 1$ among the
four selected channels. The block diagram corresponding to
(3) and a schematic of the XOR-merged DFF are shown in Fig.
3. The four outputs S1, ..., S4 in Fig. 3 correspond to $s_5(l)$,
$s_6(l)$, $s_7(l)$, $s_8(l)$. The first four rows of (3) show that the
implementation requires 4 XORs. The fifth XOR in Fig. 3 is
used as a “set” signal to ensure the PRBS generator can start up
from an all-zero state.
Fig. 3. Block diagram of the 4-channel $2^7 - 1$ uncorrelated PRBS generator (single-ended version, output buffers not shown); XOR-merged DFF used in the PRBS
generator (critical dc node voltages are annotated).
Fig. 4. Auto-correlation and cross-correlation of the 4-channel PRBS generator
(signal amplitude rescaled to -1 and 1). [Plot; x-axis: number of clock
cycles (i), -800 to 800; y-axis: correlation $\phi_{pq}(i)$; traces: autocorr
(S1 or S2 or S3 or S4), xcorr (S1&S2 or S2&S3 or S3&S4), xcorr (S1&S3 or
S2&S4), xcorr (S1&S4).]
The DFF, the XOR logic and the buffer are the three main circuit
blocks used in the PRBS generator, all of which are current mode
logic (CML) topologies operating at very high speeds. A complete
XOR-merged DFF [9] implemented using HBT devices
is shown in Fig. 3 along with the PRBS block diagram. Note
that the DFF in Fig. 3 has two output pairs
(QH and QHb, Q and Qb). The QH and QHb output pair
has a higher common-mode voltage and is intended to drive
the top devices (A and Ab input pair) in the XOR. The clock
tree buffers and output buffers are critical and use a CML
topology with emitter followers. The output buffers provide
a signal swing of more than 800 mV to drive the CML driver
designed in [2]. Fig. 5 plots the eye diagrams of the 4-channel
differential outputs from the post-layout simulation. Each output
has a single-ended Vpp of more than 800 mV and a maximum
peak-to-peak jitter of less than 0.48 ps when operating at
12.5 Gb/s. The power consumption is less than 1.4 W for the PRBS
generator, including the clock tree and buffer circuits.
IV. SYSTEM-LEVEL SIMULATION AND INTEGRATION
Recent progress has been made with optical system-level
design tools such as Lumerical’s Interconnect [10], which
is specific to photonic integrated circuit (PIC) simulation and
cannot be employed for hybrid optoelectronic system simulation,
where a SPICE-like solver is required for transistor-level
circuit simulations. Photonic device Verilog-A modeling
and system-level integration issues are addressed in this
section.
Fig. 5. Post-layout simulated eye diagrams of the 4-channel differential
output data streams at 12.5 Gb/s with a 12.5 GHz, 400 mV swing
ideal sinusoidal input clock. [Four panels; x-axis: time (ps), 0 to 160;
y-axis: voltage (mV), -1200 to 1200.]
A. MZM Device and Modeling
The imbalanced PAM-4 MZM device is made of two phase shifter
segments, which are driven by two CML drivers, as shown in the
layout top view in Fig. 6. A coplanar waveguide
(CPW) electrode is used for its low dispersion and more stable
line impedance [11]. A thermal heater section is added for fine
tuning of the dc phase. The PAM-4 MZM acts like a 2-bit DAC that
processes the signal in the optical phase domain. An MZM model
including the voltage-dependent effective refractive index change,
propagation delay, optical loss, thermo-optical coefficient and
RLGC parameters can be described in Verilog-A [1]. Its core
element is the phase shifter illustrated in Fig. 7, which shows
the interaction between the electrical domain and the
photonic phase domain. For detailed modeling information,
readers are referred to our previous work in [1].
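The "2-bit DAC in the optical phase domain" idea can be illustrated with an idealized textbook model. The Python sketch below is not the Verilog-A model of [1]: it assumes a lossless MZM with the standard transfer function $P_{out} = P_{in} \cos^2(\Delta\phi / 2)$, binary-weighted segments (the MSB segment contributes twice the phase of the LSB segment), and a static bias phase standing in for the thermal heater. The values of `P_IN_W`, `PHI_LSB_RAD` and `PHI_BIAS_RAD` are hypothetical and chosen only so that the four codes sit symmetrically around the quadrature point.

```python
import math


def mzm_output_power(p_in_w, delta_phi_rad):
    """Ideal lossless MZM transfer function: P_out = P_in * cos^2(delta_phi / 2)."""
    return p_in_w * math.cos(delta_phi_rad / 2.0) ** 2


def pam4_levels(p_in_w, phi_lsb_rad, phi_bias_rad):
    """Output power for each (msb, lsb) code of a two-segment MZM.

    Assumption: the phase shifts of the segments add, with the MSB segment
    contributing twice the phase shift of the LSB segment.
    """
    levels = {}
    for msb in (0, 1):
        for lsb in (0, 1):
            delta_phi = phi_bias_rad + (2 * msb + lsb) * phi_lsb_rad
            levels[(msb, lsb)] = mzm_output_power(p_in_w, delta_phi)
    return levels


P_IN_W = 1e-3                                     # hypothetical 1 mW optical input
PHI_LSB_RAD = 0.2                                 # hypothetical LSB phase step
PHI_BIAS_RAD = math.pi / 2 - 1.5 * PHI_LSB_RAD    # centres the codes on quadrature

levels = pam4_levels(P_IN_W, PHI_LSB_RAD, PHI_BIAS_RAD)
for code in sorted(levels):
    print(code, f"{levels[code] * 1e6:.1f} uW")
```

Because the transfer function is sinusoidal, equal phase steps do not give exactly equal power steps; keeping the phase excursion small around quadrature, or adjusting the bias phase, is one way to trade eye height against level spacing in such a model.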
Fig. 6. System integration of electrical die (left) and photonics die (right) via side-by-side wire bonding (Bond wires drawn not to scale).
Fig. 7. Phase shifter model (device dimensions are drawn not to scale).
B. System Integration
A die-to-die wire bonding solution is illustrated in Fig. 6.
Spectre simulations using S-parameters extracted from ADS showed
that signal bond wires with a length of 2 mm (used for series
inductive peaking) and a pitch of 200 μm (to avoid cross
coupling) are necessary. Off-chip
termination is planned at the far end of the MZM. Grating
couplers will be aligned with a fiber array to provide the optical
input and output. With an on-chip PLL and PRBS generator, no
high-frequency signals need to be routed via the PCB.
Fig. 8. Post-layout simulated optical output eye diagram of the complete TX
circuits (excluding the PLL) with the PAM-4 MZM Verilog-A model at 25 Gb/s.
The PRBS generator is driven by a 12.5 GHz sinusoidal clock with a 0.5 V
magnitude and 7 ps peak-to-peak jitter. [x-axis: time (ps), 0 to 160;
y-axis: optical power (μW), 200 to 1200.]
V. CONCLUSION
An MZM based PAM-4 silicon photonics TX prototype
was proposed for easy testing. The design of a 4-channel
uncorrelated $2^7 - 1$ PRBS generator was systematically presented.
The simulated optical eye plotted in Fig. 8, obtained with
the hybrid simulation, shows that the TX can operate at 25 Gb/s.
ACKNOWLEDGMENT
The authors thank Dr. John Chiasson for discussion about
the PRBS design, and MOSIS educational program (MEP) for
their support with chip fabrication. This work is supported in
part through NSF CAREER Award EECS-1454411.
REFERENCES
[1] K. Zhu, V. Saxena, and W. Kuang, “Compact Verilog-A modeling of
silicon traveling-wave modulator for hybrid CMOS photonic circuit
design,” in Proc. IEEE MWSCAS, Aug 2014, pp. 615–618.
[2] K. Zhu, V. Saxena, X. Wu, and W. Kuang, “Design Considerations for
Traveling-Wave Modulator Based CMOS Photonic Transmitters,” IEEE
Trans. Circuits Syst. II, Exp. Briefs, vol. 62, no. 4, pp. 412–416, April
2015.
[3] X. Wu, B. Dama, P. Gothoskar, P. Metz, K. Shastri, S. Sunder, J. Van der
Spiegel, Y. Wang, M. Webster, and W. Wilson, “A 20Gb/s NRZ/PAM-4
1V transmitter in 40nm CMOS driving a Si-photonic modulator in 0.13
μm CMOS,” in Proc. IEEE ISSCC Dig. Tech. Papers, Feb 2013, pp.
128–129.
[4] A. Samani, D. Patel, S. Ghosh, V. Veerasubramanian, Q. Zhong, W. Shi,
and D. Plant, “OOK and PAM optical modulation using a single drive
push pull silicon Mach-Zehnder modulator,” in Proceedings of IEEE
11th International Group IV Photonics, Aug 2014, pp. 45–46.
[5] K. Zhu, V. Saxena, X. Wu, and S. Balagopal, “Design Analysis of a
12.5 GHz PLL in 130 nm SiGe BiCMOS Process,” in Microelectronics
and Electron Devices (WMED), 2015 IEEE Workshop on, March 2015,
pp. 17–20.
[6] E. Laskin and S. Voinigescu, “A 60 mW per Lane, 4 x 23-Gb/s
$2^7 - 1$ PRBS Generator,” IEEE J. Solid-State Circuits, vol. 41, no. 10, pp.
2198–2208, Oct 2006.
[7] H. M. Power and R. J. Simpson, Introduction to dynamics and control.
McGraw-Hill UK, 1978.
[8] J. J. O’Reilly, “Series-parallel generation of m-sequences,” Radio and
Electronic Engineer, vol. 45, no. 4, pp. 171–176, April 1975.
[9] N. Mazumder, “Merging of logic function circuits to ECL latch or
flip-flop circuit,” Patent, Dec. 9, 1986, US Patent 4,628,216.
[10] Lumerical Solutions, Inc. [Online].
Available: https://www.lumerical.com/
[11] N. I. Dib and L. P. Katehi, “Theoretical characterization of coplanar
waveguide transmission lines and discontinuities.” 1992.
