A 2.56 Gbps Serial Wireline Transceiver that Supports An Auxiliary Channel and A Hybrid Line Driver to Compensate Large Channel Loss by Wang, Xiaoran
Southern Methodist University 
SMU Scholar 
Electrical Engineering Theses and Dissertations Electrical Engineering 
Summer 8-4-2020 




Recommended Citation 
Wang, Xiaoran, "A 2.56 Gbps Serial Wireline Transceiver that Supports An Auxiliary Channel and A Hybrid 
Line Driver to Compensate Large Channel Loss" (2020). Electrical Engineering Theses and Dissertations. 
37. 
https://scholar.smu.edu/engineering_electrical_etds/37 
This Dissertation is brought to you for free and open access by the Electrical Engineering at SMU Scholar. It has 
been accepted for inclusion in Electrical Engineering Theses and Dissertations by an authorized administrator of 
SMU Scholar. For more information, please visit http://digitalrepository.smu.edu. 
A 2.56 GBPS SERIAL WIRELINE TRANSCEIVER THAT                                               
SUPPORTS AN AUXILIARY CHANNEL                                                                                                                                           
AND                                                                                                                                                  





Approved by:  
______________________________________ 
Ping Gui 








Associate Professor of Electrical and Computer Engineering 
 
______________________________________ 
Choon Lee  




Clinical Professor of Computer Science 
 
A 2.56 GBPS SERIAL WIRELINE TRANSCEIVER THAT                                                   
SUPPORTS AN AUXILIARY CHANNEL                                                                                 
AND                                                                                                                                                    
A HYBRID LINE DRIVER TO COMPENSATE LARGE CHANNEL LOSS  
A Dissertation Presented to the Graduate Faculty of  
Lyle School of Engineering 
Southern Methodist University 
in 
Partial Fulfillment of the Requirements 
for the degree of  
Doctor of Philosophy 
with a  
Major in Electrical Engineering 
by 
Xiaoran Wang  
B.S., South China University of Technology, Guangzhou, China 
M.S., Southern Methodist University, Dallas, USA 
 
August 4, 2020 
Copyright (2020) 
Xiaoran Wang 








It has been five years since I started to pursue my PhD degree. During the past five years, 
my life has changed a lot. At the same time, the research work has opened a new world for me, 
one full of unknowns, challenges, excitement and hope. When I look back, there have always been 
people who helped me a great deal.   
First of all, I would like to express my deep appreciation to my advisor Dr. Ping Gui for 
her guidance, support and confidence in me throughout the development of this research work. 
Her insights in discovering and grasping the important research details keep me on the correct 
track. I learned a lot from her expertise and attitude in research both professionally and technically.  
I would also like to thank Dr. Mitch Thornton for providing the ideas about communication 
security, and leading me to combine the circuit design and data communication security.  
Many thanks to Tianwei Liu, Shita Guo, Tao Zhang, Kexu Sun, Guanhua Wang, Chang 
Yang, and Liang Fang for their time, help and suggestions. Without their invaluable support, this 
work would never have been completed. 
Finally, I would like to thank all my family members, my parents and my wife for all their 
love and encouragement. Their love and support carried me through this journey. Without their 
love, I would not have made this achievement today. My lovely daughter Natalie gets much workload to 





STATEMENT BY AUTHOR 
 
This dissertation has been submitted in partial fulfillment of the requirements for an 
advanced degree at Southern Methodist University and is deposited in the university library to be 
made available for borrowers under the rules of the library. 
Brief quotations from this dissertation are allowable without special permission, provided 
that an accurate acknowledgement of the source is made. Requests for permission for extended 









Wang, Xiaoran      B.S., South China University of Technology, 2013 
          M.S., Southern Methodist University, 2015 
 
A 2.56 Gbps Serial Wireline Transceiver that Supports An Auxiliary Channel  
and  
A Hybrid Line Driver to Compensate Large Channel Loss 
 
     Advisor: Professor Ping Gui 
Doctor of Philosophy conferred August 4th, 2020  
Dissertation completed August 4th, 2020 
 
  
Serial transceiver links are widely used for high-speed point-to-point communications. This 
dissertation describes two transceiver link designs for two different applications.   
  
In serial wireline communications, security is an increasingly important factor of concern. 
Securing an information processing system at the application and system software layers is 
regarded as a necessary but incomplete defense against the cyber security threats. Many security 
measures at the hardware level require an additional data channel with extra bandwidth to transmit 
authentication information or to just increase redundancy.  A competing issue is that most existing 
designs cannot support the additional channel required to implement these measures without an 
expensive redesign effort.   
In this dissertation, an asynchronous serial transceiver that is capable of transmitting and 
receiving an auxiliary data stream concurrently with the primary data stream is described.  The 
transceiver instantiates the auxiliary data stream by modulating the phase of the primary data 
without affecting the primary channel transmission and recovery mechanisms. Standard receiver 




the proposed transceiver and considerations of the system parameters are included and can be used 
to determine how such an auxiliary channel is implemented.  The proposed transceiver with the 
auxiliary channel can be widely used in many data communication applications such as for 
transmitting signatures for authentication or other control information, steganography, or 
additional data in an existing serial link.  A prototype transceiver, implemented in a 65 nm CMOS 
process, demonstrates the proposed concept with an 80 Mbps auxiliary channel in a 2.56 Gbps 
asynchronous serial link.  
The Deep Underground Neutrino Experiment (DUNE) requires that the front-end transmitters 
operate at cryogenic temperature and drive 25-35 meter long twin-axial (twinax) cables. To 
compensate for the frequency-dependent channel loss over the long cables and alleviate the 
de-emphasis of the low-frequency signal magnitude, a hybrid of a current-mode (CM) transmitter 
equalization (TXEQ) and a voltage-mode (VM) pre-emphasis is proposed. The TXEQ employs a 
finite-impulse response (FIR) filter to boost the high-frequency components while de-emphasizing 
the low-frequency signal magnitude, thereby flattening the overall channel frequency response and 
reducing the Intersymbol Interference (ISI). The VM pre-emphasis is proposed to further mitigate 
ISI by boosting the high-frequency portion without degrading the signal magnitude, allowing for 
high signal swing. The main driver utilizes VM source-series-terminated (SST) output stages, 
which offer higher signal swing and better power efficiency than conventional current-mode 
logic (CML) drivers.  To ensure the lifetime and reliability at cryogenic temperature, the 
transmitter is implemented in a 65-nm CMOS process operating at 1.1 V of supply voltage and 
employing transistors with larger than minimum lengths. Silicon measurement results have 




TABLE OF CONTENTS 
 
 
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1     INTRODUCTION
1.1 Motivation
1.2 Research Contribution
1.3 Dissertation Organization
CHAPTER 2     SERIAL WIRELINE COMMUNICATION SYSTEM
2.1 SerDes
2.2 Clock and Data Recovery
2.3 Jitter
2.4 Line Driver
CHAPTER 3     A SERIAL WIRELINE TRANSCEIVER THAT SUPPORTS AN AUXILIARY CHANNEL
3.1 Introduction




3.3 Transceiver Implementation
3.4 System Design Parameters
3.5 Measurement Results
CHAPTER 4     A HYBRID LINE DRIVER WITH VOLTAGE-MODE SST PRE-EMPHASIS AND CURRENT-MODE EQUALIZATION
4.1 Introduction
4.2 The Proposed Hybrid Line Driver
4.3 DUNE Transmitter Design
4.4 Design Considerations for Lifetime Reliability
4.5 Measurement Results
CHAPTER 5     CONCLUSION
5.1 Summary
5.2 Future Work





LIST OF FIGURES 
 
Figure 1. Block diagram of serial data transmission system.
Figure 2. Block diagram of serializer.
Figure 3. PLL-based CDR structure.
Figure 4. Comparison of the pulse signals with and without jitter.
Figure 5. Jitter transfer function.
Figure 6. System jitter tolerance mask.
Figure 7. Schematic of (a) CML driver and (b) SST VM driver.
Figure 8. Proposed asynchronous serial link with auxiliary channel.
Figure 9. Block diagram of the transmitter.
Figure 10. Timing diagrams of the transmitter modulation scheme. (a) Auxiliary data changing from bit ‘0’ to ‘1’. (b) Auxiliary data changing from bit ‘1’ to ‘0’.
Figure 11. The case of potential glitches and incorrect data transition.
Figure 12. Current-starved delay cell.
Figure 13. Block diagram of the receiver.
Figure 14. The architecture and timing diagram of the BBPD.
Figure 15. Simulated eye diagram of the modulated primary data.
Figure 16. CDR loop linear model.
Figure 17. BBPD output varies with the input data phase difference. Blue curve represents the 




Figure 18. Simulated jitter tolerance with SONET OC-192 mask.
Figure 19. Die photo of the transceiver.
Figure 20. The measurement environment.
Figure 21. The recovered clock signal.
Figure 22. The eye diagram of the recovered primary data.
Figure 23. The bathtub of the recovered primary data.
Figure 24. Eye diagram of the recovered auxiliary data.
Figure 25. Basic architecture of (a) transmitter FIR equalization and (b) pre-emphasis.
Figure 26. Time-domain waveform of TXEQ and pre-emphasis.
Figure 27. Current-mode driver with FIR equalization.
Figure 28. Voltage-mode driver with FIR equalization.
Figure 29. The proposed hybrid line driver with voltage-mode pre-emphasis and current-mode transmitter equalization.
Figure 30. “PreEmp_Pulse” generation.
Figure 31. The enable-logic and SST output stage in main driver cell and PreEmp cell.
Figure 32. The current flow in VM driver using SST output stages.
Figure 33. Hybrid line driver without pre-emphasis.
Figure 34. CML driver with only CM TXEQ.
Figure 35. The block diagram of transmitter.
Figure 36. The block diagram of the serializer.
Figure 37. The block diagram of the 4/5 divider.
Figure 38. The block diagram of the 4/5-to-1 MUX.




Figure 40. Test setup and environment for line driver measurement.
Figure 41. Insertion loss of (a) 35 m and (b) 25 m twinax cable at room and cryogenic temperature.
Figure 42. The measured transmitted eye diagrams and bathtub curves of the serializer output for input data of (a) PRBS-7 and (b) PRBS-15.
Figure 43. The measured transmitter output waveforms and eye diagram at the end of 25 m twinax cable driven by CML driver without TXEQ and pre-emphasis.
Figure 44. Measured eye diagrams after (a) 25 m twinax cable and (b) 35 m twinax cable for PRBS-7 data pattern.
Figure 45. Measured bathtub curve after (a) 25 m twinax cable and (b) 35 m twinax cable for 







LIST OF TABLES 
 
Table 1. Simulated comparator noise and power consumption.














CHAPTER 1     INTRODUCTION 
 
1.1 Motivation 
Serial wireline links are widely used in computer networks as well as in chip-to-chip 
communications. In serial wireline communications, in addition to data rate and 
data transfer quality (bit error rate), security is an increasingly important concern. 
Securing an information processing system at the application and system software layers is 
regarded as a necessary but incomplete defense against cyber security threats. Encryption is a 
commonly employed method with the goal of preventing unauthorized access to sensitive 
information [1]-[2].  However, modifying or redesigning an existing system to include 
encryption at the hardware layer can add significant expense and result in compatibility issues with 
other systems and specifications as well as interoperability issues with other contemporary 
versions of similar systems.  Moreover, sometimes the mere presence of an encrypted channel 
provides an adversary with undesired information and encourages increased attacks [3]-[4].  
More emphasis is being placed on hardware security due to the emergence of exploits 
at these lower layers of data transmission and processing [5]-[9].  
Many security measures at the hardware level require an additional data channel with extra 
bandwidth to transmit authentication information or to just increase redundancy.  A competing 
issue is that most existing designs cannot support the additional channel required to implement 




accomplished, the resulting more secure products may not be backward compatible with earlier 
or standard generations.   
In modern information processing circuitry, it is becoming very common for information 
exchanges to be accomplished via asynchronous serial links [10]-[14]. Asynchronous serial 
transceivers are ubiquitous in today’s devices and are used to interface between blocks within and 
between integrated circuits (ICs), and between packaged systems, typically using industry 
standards such as USB, MIPI, and PCI-e [15].   
The motivation of the first work of this dissertation is to create a new approach and transceiver 
architecture that addresses the above challenges, enhances the security of serial 
wireline communications, and ensures authenticated data transfer at the circuit level. The 
proposed serial transmitter embeds a signature in the serial data stream by modulating 
the last-stage high-speed clock in the serializer, and the proposed receiver recovers not only the 
transmitted data and clock but also the embedded signature. The modulation of the clock signal 
results in higher deterministic jitter on the transmitter side, which imposes challenges on both the 
transmitter and the receiver design. The transmitter needs to keep the clock phase modulation 
small, because the modulation increases the deterministic jitter in the transceiver. The receiver 
needs to recover the transmitted data in the presence of higher jitter than in the case 
without clock phase modulation, and at the same time detect the clock phase modulation as the 
signature.       
The output driver of a data link transmitter is a critical part that determines the overall 
performance of the entire high-speed serial link.  Many applications require the transmitter driver 




transmitter driver to provide equalization to compensate for the frequency-dependent channel loss 
inherent in the transmission channels, thus mitigating intersymbol interference (ISI). Meanwhile, 
a large signal swing from the transmitter is also desirable so that the receiver can recover the data 
stream with good bit-error rate (BER) performance. 
The Deep Underground Neutrino Experiment (DUNE) is a dual-site experiment consisting of 
two sets of detectors: one in Illinois (near detectors) and the other in South Dakota (far 
detectors). The far detectors will be in a single-phase time projection chamber, with 10K tons of 
liquid argon operating at cryogenic temperature (89 Kelvin). The interaction of the neutrino 
particles in liquid argon is detected and converted to digital data streams inside the chamber. At 
the transmitter, the digital streams are combined into a 1.28 Gbps serial data stream and sent to the 
warm interface of the chamber cryostat.  Due to the large dimensions of the chamber cryostat, the 
line drivers in the readout ICs need to drive 25-35 meter long twin-axial (twinax) cables to 
transmit the high-speed data streams to the Warm Interface Boards (WIB).  
The motivation of the second work of this dissertation is to design a line driver for DUNE. 
The transmitters used in DUNE are required to compensate for large channel loss while 
maintaining a large signal swing (thus ensuring good eye diagrams). In addition, while the major 
portion of the cables will be immersed in liquid argon at cryogenic temperature (89 Kelvin), a 
small portion of the cables may be on the warm side, at an uncertain temperature, where they 
connect to the WIB. 
Given the high and uncertain insertion loss resulting from a large range of the cable lengths and a 






1.2 Research Contribution  
This dissertation proposes a new asynchronous serial transceiver that is capable of transmitting 
and receiving an auxiliary data stream along with the primary data stream on a single asynchronous 
serial link.  The proposed transceiver embeds the auxiliary data stream by modulating the phase of 
the primary data in accordance with the auxiliary data.  The receiver recovers both the primary and 
auxiliary data simultaneously.  In a standard receiver, which is not equipped with the phase 
demodulation capability, the auxiliary data appears as jitter on the primary data. The jitter caused 
by the auxiliary data still falls within the jitter budget of the transceiver, so it does not 
adversely affect the recovery of the primary data.  The proposed system can be widely 
used in many data communication applications, such as transmitting a hidden signature for data 
authentication, or carrying control and/or additional data in an existing serial link.  A prototype 
transceiver, implemented in a 65 nm CMOS process, demonstrates the proposed concept with 
2.56 Gbps primary data and 80 Mbps auxiliary data channels. 
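The relationship between the two data rates can be illustrated with a short behavioral sketch. It is not the circuit described in this dissertation: the mapping of an auxiliary '1' to a fixed phase shift, the value `shift_ui`, and the averaging detector are hypothetical choices made only for illustration. What follows from the stated rates is that each auxiliary bit spans 2.56 Gbps / 80 Mbps = 32 primary unit intervals.

```python
PRIMARY_RATE_BPS = 2.56e9   # primary data rate stated in the text
AUX_RATE_BPS = 80e6         # auxiliary data rate stated in the text
BITS_PER_AUX = round(PRIMARY_RATE_BPS / AUX_RATE_BPS)  # primary UIs per auxiliary bit


def edge_phase_offsets(aux_bits, shift_ui=0.1):
    """Return one phase offset (in UI) per primary bit.

    Hypothetical mapping: auxiliary '1' shifts the primary data edges by
    shift_ui, auxiliary '0' leaves them unshifted.
    """
    offsets = []
    for aux in aux_bits:
        offsets.extend([shift_ui if aux else 0.0] * BITS_PER_AUX)
    return offsets


def recover_aux(offsets, shift_ui=0.1):
    """Recover auxiliary bits by averaging the phase over each 32-UI window."""
    bits = []
    for i in range(0, len(offsets), BITS_PER_AUX):
        window = offsets[i:i + BITS_PER_AUX]
        mean = sum(window) / len(window)
        bits.append(1 if mean > shift_ui / 2 else 0)
    return bits
```

A receiver without the demodulation step would simply see these offsets as additional deterministic jitter on the primary data edges.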
This dissertation also proposes a hybrid line driver to drive at least 25-meter long twin-axial 
(twinax) cables in the Deep Underground Neutrino Experiment. To compensate for the high-frequency 
loss over the long cables and alleviate the de-emphasis of the low-frequency signal magnitude, a 
hybrid of current-mode transmitter equalization (TXEQ) and voltage-mode pre-emphasis is 
proposed. The TXEQ employs a finite-impulse response (FIR) filter to boost the high-frequency 
components while de-emphasizing the low-frequency signal magnitude, thereby flattening the 
overall channel frequency response and reducing intersymbol interference (ISI). Voltage-mode 
pre-emphasis is proposed to further boost the high-frequency portion without degrading the signal 
magnitude. The main driver utilizes voltage-mode source-series-terminated (SST) output stages, 




logic (CML) drivers. Designed in 65-nm CMOS, the proposed hybrid line driver operates at 1.28 
Gbps with 4.1 mW power consumption when driving a 25-meter long twinax cable at 
cryogenic temperature.   
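The effect of FIR transmitter equalization on the signal magnitude can be seen in a minimal two-tap sketch. The tap weight `post_tap` and the normalization (main tap plus post tap equal to 1) are assumptions chosen for illustration, not the coefficients of the proposed driver. Bits that follow a transition keep the full amplitude, while repeated bits (the low-frequency content) are reduced, which is the de-emphasis of the low-frequency magnitude mentioned above.

```python
def fir_deemphasis(bits, post_tap=0.25):
    """Two-tap FIR: y[n] = (1 - post_tap) * x[n] - post_tap * x[n-1].

    bits are mapped to symbols +1 / -1. post_tap is a hypothetical weight.
    The first symbol is treated as if it were preceded by an identical one.
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    if not symbols:
        return []
    main_tap = 1.0 - post_tap
    out = []
    prev = symbols[0]
    for s in symbols:
        out.append(main_tap * s - post_tap * prev)
        prev = s
    return out


# Transition bits reach +/-1.0, repeated bits settle at +/-0.5 for post_tap = 0.25.
levels = fir_deemphasis([1, 1, 1, 0, 0, 1])
```

With `post_tap = 0.25` the steady-state swing is halved, which illustrates why an additional pre-emphasis path that boosts transitions without shrinking the steady-state level is attractive when a large signal swing is required.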
 
1.3 Dissertation Organization 
This dissertation is organized as follows. Chapter 1 is the introduction, including the 
motivation, research contribution and organization of the dissertation. Chapter 2 presents a review 
of the serial wireline communication system and introduces its key parts, such as the 
serializer/deserializer (SerDes), clock and data recovery (CDR) and the line driver. Chapter 3 
presents the 2.56 Gbps serial wireline transceiver that supports an auxiliary 
channel, together with its measurement results. Chapter 4 depicts the hybrid line 










Figure 1. Block diagram of serial data transmission system. 
Figure 1 shows a generic block diagram of a high-speed wireline data transmission system. 
The left side is the transmitter and the right side is the receiver; together they constitute the 
serial wireline communication system. At the transmitter (TX) side, parallel digital signals are 
converted to a binary sequence by a data serializer for single-channel serial data 
transmission. The data to be sent is bundled into a high-speed stream in the transmitter and then 
transmitted over the channel by the driver circuit. At the receiver (RX) side, the CDR recovers the 




SerDes is used in the transmitter and receiver of the high-speed serial wireline data 
communication system. The primary use of the SerDes is to provide data transmission over a single 
channel in order to minimize the number of I/O pins and interconnects.  
The input of the serializer is an n-bit data path that is serialized to a one-bit serial data signal. 
Generally, the value of n is a multiple of 8 or 10, and may be programmable in some 
implementations [18]. Values of n that are multiples of 8 are useful for sending unencoded data 
bytes; values of n that are multiples of 10 are useful for protocols using 8B/10B coding. In the 
serializer, shift registers receive the parallel input data and clock, and then shift the data out at a 
higher serial clock frequency. In the example of Figure 2, the parallel input data is assumed to be 
4 bits wide and the input reference clock rate is 500 MHz. To convert the parallel data into serial 
data, the first set of two shift registers uses a 1 GHz clock and converts the data to 2-bit parallel 
data, and the second set of shift registers uses a 2 GHz clock and converts the data to 1-bit serial data. 
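The two-stage 4-to-1 serialization in this example can be sketched behaviorally as two levels of 2-to-1 interleaving. The sketch below is only an illustration of the data flow: the bit ordering (bit 0 of each word sent first) and the pairing of lanes in the first stage are assumptions, and the function names are hypothetical.

```python
def mux_2to1(lane_a, lane_b):
    """Interleave two equal-rate bit lanes into one lane at twice the rate."""
    out = []
    for a, b in zip(lane_a, lane_b):
        out.extend((a, b))
    return out


def serialize_4to1(words):
    """Serialize 4-bit words (b0, b1, b2, b3), assuming b0 is sent first.

    Stage 1 corresponds to the 1 GHz stage in the text (four 500 Mb/s lanes
    become two 1 Gb/s lanes); stage 2 corresponds to the 2 GHz stage
    (two lanes become one 2 Gb/s serial stream).
    """
    lanes = [[w[i] for w in words] for i in range(4)]
    # Stage 1: pair lanes so that stage 2 restores the order b0, b1, b2, b3.
    even = mux_2to1(lanes[0], lanes[2])   # b0, b2, b0, b2, ...
    odd = mux_2to1(lanes[1], lanes[3])    # b1, b3, b1, b3, ...
    # Stage 2: final 2-to-1 interleave produces the serial bit stream.
    return mux_2to1(even, odd)


serial = serialize_4to1([(1, 0, 1, 1), (0, 0, 1, 0)])
```

Each stage doubles the per-lane bit rate and halves the number of lanes, which matches the 500 MHz, 1 GHz and 2 GHz clock rates of the example.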
The deserializer first extracts the clock from the incoming serial data stream using a CDR 
and uses this clock to deserialize the data into parallel bits at a lower frequency. The CDR 







Figure 2. Block diagram of serializer 
2.2 Clock and Data Recovery 
In high-speed wireline communication systems, such as wireline long-haul networks 
and chip-to-chip or backplane communications, the received data are distorted [19] [20] [21]. 
The distortion has many causes, such as device noise, limited channel bandwidth, 
signal reflection, duty-cycle distortion, crosstalk, and power supply noise. A clock and 
data recovery (CDR) circuit is a critical block in the data link to extract the timing information 





Figure 3. PLL-based CDR structure 
A phase-locked loop (PLL) based CDR is a topology that uses feedback phase tracking. Figure 
3 shows a PLL-based CDR, which is a single negative feedback loop that needs no extra PLL to 
generate the clock signal. The PLL-based CDR is the most common CDR scheme and is based on a 
second-order PLL circuit. The tracking loop employs a phase detector (PD), a charge pump (CP) 
and a low-pass filter (LPF) to drive the voltage-controlled oscillator (VCO) frequency towards the 
input data rate. When the phase error falls into the capture range of the phase tracking loop, the 
loop aligns the phase of the clock with the data. In a PLL-based CDR, the output of the bang-bang 
phase detector makes the charge pump produce an ‘up’ or ‘down’ signal, and those ‘up’ and ‘down’ 
signals tune the VCO frequency and adjust the feedback clock signal [22], [23], [24].  
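The early/late decision of a bang-bang phase detector can be illustrated with a simplified behavioral sketch. It is a first-order phase-stepping model, not the second-order PLL-based loop described here: the fixed step `step_ui` stands in for the combined charge pump, loop filter and VCO response, and all names and values are hypothetical.

```python
def bang_bang_track(data_phase_ui, n_updates, step_ui=0.01, clk_phase_ui=0.0):
    """Track a constant data phase with a bang-bang (early/late) decision.

    Each update only uses the sign of the phase error: the clock phase is
    advanced by step_ui when the data leads ('up') and retarded by step_ui
    otherwise ('down'). Phases are in unit intervals (UI).
    """
    history = []
    for _ in range(n_updates):
        up = data_phase_ui > clk_phase_ui
        clk_phase_ui += step_ui if up else -step_ui
        history.append(clk_phase_ui)
    return history


# After acquisition the clock phase does not settle; it dithers around the target.
trace = bang_bang_track(data_phase_ui=0.2, n_updates=200)
```

Because the detector output carries only the sign of the error, the recovered clock phase keeps toggling by one step around the data phase once it has been acquired.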
A PLL-based CDR consumes a large layout area. Compared to other types of CDR, it is 
suitable for single-channel CDR circuits and consumes less power.  
The jitter performance of a PLL-based CDR is worse than that of other CDR architectures. 
In a PLL-based CDR, the stabilizing zero in the forward path introduces jitter peaking, and the corner 
frequencies of the jitter transfer and jitter tolerance functions are at the same frequency, 
which is the second pole of the PLL. While the bandwidth requirements for good jitter transfer and 




jitter tolerance function needs wide bandwidth, so optimizing one will degrade the other one. 
 
2.3 Jitter 
  Jitter is the timing deviation caused by system noise or other system variations; it is the 
difference between a signal’s ideal or expected arrival time and its actual arrival time. As can be 
seen in Figure 4, jitter in a clock signal represents the deviation of the zero crossings from their 
ideal positions in time.  
 
Figure 4. Comparison of the pulse signals with and without jitter 
There are two types of jitter: random jitter and deterministic jitter. 
Random jitter (RJ) follows a Gaussian distribution and is measured by its RMS value. Deterministic 
jitter (DJ) is bounded and can be quantified by its peak-to-peak value. The 
combination of DJ and RJ is the total jitter (TJ). TJ is also quantified by its peak-to-peak value, and is related 




The relationship among TJ, RJ and DJ can be expressed by the following equation:
$$TJ_{pp} = DJ_{pp} + Q_{BER} \cdot RJ_{rms} \tag{1.1}$$
where $TJ_{pp}$ is the peak-to-peak value of the total jitter, $DJ_{pp}$ is the peak-to-peak value of the 
deterministic jitter, and $RJ_{rms}$ is the RMS value of the random jitter. $Q_{BER}$ is the multiplier that 
accounts for the eye closure due to random jitter at a target bit error rate (BER). A common BER in 
wireline communication standards is $10^{-12}$, for which the corresponding $Q_{BER}$ is approximately 14.  
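As a worked example of the total-jitter relationship, the sketch below computes the multiplier for a given BER and the resulting peak-to-peak total jitter. The Gaussian-tail convention used for the multiplier (twice the one-sided Gaussian quantile at the target BER) is an assumption commonly used with this formula and is not stated in the text; the DJ and RJ values in the example are hypothetical.

```python
from statistics import NormalDist


def q_ber(ber):
    """Multiplier for Gaussian random jitter at a target BER.

    Assumed convention: twice the one-sided standard-normal quantile,
    which gives about 14.07 for BER = 1e-12.
    """
    return -2.0 * NormalDist().inv_cdf(ber)


def total_jitter_pp(dj_pp, rj_rms, ber=1e-12):
    """Total jitter TJpp = DJpp + Q_BER * RJrms, in the units of the inputs."""
    return dj_pp + q_ber(ber) * rj_rms


# Hypothetical example: 0.10 UI of DJ (peak-to-peak) and 0.01 UI of RJ (RMS).
tj = total_jitter_pp(dj_pp=0.10, rj_rms=0.01)
```

For these example values the random jitter contributes about 0.14 UI at a BER of 1e-12, so it dominates the total even though its RMS value is ten times smaller than the DJ.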
The CDR must recover data that contains jitter from the system, and the CDR itself 
also generates some jitter, such as feedback clock jitter. Both the system and the CDR jitter 
performance influence whether the target BER specification is met. CDR jitter performance can be 
categorized into jitter transfer (JTRAN), jitter generation (JGEN) and jitter tolerance (JTOL). 
 
2.3.1  JTRAN 
JTRAN is the jitter transfer function of the CDR, which is the ratio of output jitter to input jitter 
as a function of frequency. The JTRAN of the CDR has a low-pass characteristic, so only the slow 
jitter in the data passes through the CDR loop. On the other hand, high frequency jitter is filtered 






Figure 5. Jitter transfer function 
The jitter transfer function of a CDR is shown in Figure 5. As can be seen, the CDR loop can 
track slow changes in the input signal. At low jitter frequencies, therefore, the phase of the input 
and the phase of the feedback are the same, and the ratio of the output phase to the input phase is 1. 
As the jitter frequency increases beyond the CDR loop bandwidth, the CDR loop cannot 
respond fast enough to the input, and the ratio of the feedback phase to the input phase 
becomes small. Therefore, the JTRAN starts to decrease. For the CDR design, the JTRAN is a key 
parameter that defines the jitter bandwidth. 
 
2.3.2  JGEN 
Jitter arises throughout the communication system, including in the CDR circuit itself. 
JGEN is defined as the jitter generated by the CDR itself. Like jitter from other sources, CDR jitter is 
composed of random jitter (RJ) and deterministic jitter (DJ). When a bang-bang phase 
detector is used as the phase detector (PD) in the CDR loop, its limit-cycle oscillation can produce 
deterministic jitter. The random jitter comes from the charge pump, the thermal noise of the 





2.3.3  JTOL 
JTOL indicates the jitter tolerance performance. JTOL is measured by adding 
sinusoidal jitter of various magnitudes and frequencies to the data and finding, at each frequency, 
the largest jitter amplitude for which the target BER requirement is still met. The jitter tolerance 
mask is the standard against which the system JTOL performance is specified.  
If the CDR jitter transfer function is available, the JTOL can be derived as follows. To 
make sure there is no sampling error, the phase difference between the input and feedback 
signals should be less than 0.5 UI (unit interval). 
$$\Phi_{in} - \Phi_{out} \le 0.5\,\mathrm{UI} \tag{1.2}$$
If the jitter transfer function A(f) is already known, Φout can be replaced by Φin·A(f). 
$$\Phi_{in}\,\bigl(1 - A(f)\bigr) \le 0.5\,\mathrm{UI} \tag{1.3}$$
So, the input phase requirement can be derived as below. 
$$\Phi_{in} \le \frac{0.5\,\mathrm{UI}}{1 - A(f)} \tag{1.4}$$
The limit on Φin is the CDR JTOL. For the jitter transfer function discussed 
above, A(f)=1 at low jitter frequencies, so Φin can in principle be arbitrarily large. When the jitter 
frequency exceeds the bandwidth of this jitter transfer function, the value of A(f) starts to decrease, 
and the tolerable Φin also starts to decrease. 
The jitter mask is the specification of the system jitter. Figure 6 shows a system jitter tolerance 
mask. The JTOL value at any frequency needs to be larger than the jitter mask value. 
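The bound in (1.4) can be evaluated numerically once a jitter transfer function is chosen. The sketch below assumes a one-pole low-pass A(f) = 1/(1 + j f/f_bw) and uses the magnitude |1 − A(f)| as the tracking error. Both the one-pole model and the function name are assumptions made for illustration and are not the transfer function of the implemented CDR.

```python
import math


def jtol_ui(f_hz: float, f_bw_hz: float) -> float:
    """Tolerable input jitter amplitude (UI) from (1.4) for a one-pole A(f)."""
    if f_hz <= 0:
        return math.inf  # A(f) = 1 at DC, so the bound is unbounded
    # For A(f) = 1 / (1 + j f/f_bw): |1 - A(f)| = x / sqrt(1 + x^2), x = f/f_bw
    x = f_hz / f_bw_hz
    tracking_error = x / math.sqrt(1.0 + x * x)
    return 0.5 / tracking_error


if __name__ == "__main__":
    f_bw = 1.54e6  # example loop bandwidth in Hz
    for f in (1e4, 1e5, 1.54e6, 1e7, 1e8):
        print(f"{f:>12.0f} Hz -> {jtol_ui(f, f_bw):8.3f} UI")
```

With this model the tolerance falls as the jitter frequency rises toward the loop bandwidth and flattens near 0.5 UI well above it, which is the shape described in the text.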







                             Figure 6. System jitter tolerance mask 
 
2.4 Line Driver 
The output line driver of a data link transmitter is a critical part that determines the overall 
performance of the entire high-speed serial link. Not only do the drivers need to provide FIR 
equalization to compensate for the frequency-dependent loss inherent in the transmission 
channels, thus mitigating ISI, but many applications also require the transmitter drivers to provide 
a large signal swing so that the receiver can recover the data streams with a good bit-error rate (BER) 






Figure 7. Schematic of (a) CML driver and (b) SST VM driver. 
     The current-mode (CM) and voltage-mode (VM) drivers are two common design styles 
for transmitter output drivers [34]-[44]. Conventional high-speed driver designs typically use 
current-mode logic (CML) to implement both the main driver and the TXEQ.  A simple CML 
driver schematic is shown in Fig. 7 (a), where a current source provides a constant static current 
IC for the switching transistors. The resistor R acts as the output impedance of the driver, and its 
value is constant and not affected by the value of the static current that determines the tap 
coefficients in the TXEQ. Z0 is the characteristic impedance of each channel, which is typically equal 
to the driver’s output impedance R.  The off-chip differential termination resistor is 2R. The output 
swing of the driver is determined by the static current and the value of R, and the common-mode 
voltage is additionally determined by the supply voltage. The DC current flow $I_1$ of the CML driver is 
labelled in Fig. 7 (a), corresponding to the case when the switching transistor on the left branch is 

































across the off-chip termination is $I_1 R/2$. The differential signal swing VD1 is twice VS1, thus 
$I_1 = V_{D1}/R$. 
     On the other hand, VM drivers use the power supply as a voltage source without a 
constant tail current source, which improves power efficiency since no static current is 
consumed [36]-[40]. An example of a differential VM source-series-terminated (SST) driver is 
shown in Fig. 7 (b).  Each branch of the SST output stage comprises a pull-up PMOS and a 
pull-down NMOS switching transistor. Both the pull-up and pull-down paths are connected through a series 
termination resistor R. The output impedance of the VM driver is the overall resistance of the 
parallel-connected resistors R of all driving cells. The switching transistors in the SST output stages 
should be sized large enough to make their on-resistance negligible compared to the series 
resistor R.  When INP is logic ‘1’ and INN is logic ‘0’, the DC current flow $I_2$ is as labelled in 
Fig. 7 (b). The single-ended signal swing VS2 across the off-chip termination resistor is $2 I_2 R$, thus 
$I_2 = V_{D2}/(4R)$, where VD2 is the differential signal swing of the SST VM driver. As can be seen from 
the above equations, the VM SST driver can ideally be four times as power efficient as the CML driver 
when achieving the same signal swing (VD1=VD2).  In addition, the CMOS-oriented SST driver is 
flexible to support different termination voltages and able to offer an output swing up to rail-to-
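The current expressions for the two driver styles, I1 = VD1/R for the CML driver and I2 = VD2/(4R) for the SST VM driver, can be compared numerically. The following sketch evaluates both for the same differential swing. The 800 mV swing and the 50 Ω resistance are hypothetical values chosen for illustration, and the function names are not from the original design.

```python
def cml_current(vd: float, r: float) -> float:
    """Static tail current of the CML driver for differential swing vd: I1 = VD1 / R."""
    return vd / r


def sst_current(vd: float, r: float) -> float:
    """Supply current of the SST VM driver for differential swing vd: I2 = VD2 / (4R)."""
    return vd / (4.0 * r)


if __name__ == "__main__":
    vd, r = 0.8, 50.0  # hypothetical: 800 mV differential swing, R = 50 ohm
    i1, i2 = cml_current(vd, r), sst_current(vd, r)
    print(f"CML: {i1 * 1e3:.1f} mA, SST: {i2 * 1e3:.1f} mA, ratio: {i1 / i2:.1f}")
```

For equal swing and equal R, the ratio of the two currents is four regardless of the values chosen, which is the ideal power advantage stated above.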








To meet the demand of using the channel bandwidth to create an auxiliary channel, we 
devised and implemented a wireline transceiver for an asynchronous serial channel that provides 
additional data bandwidth through inclusion of an auxiliary data channel and is interoperable with 
non-equipped transceivers in earlier generation systems.  An auxiliary channel at a lower layer 
generally provides increased bandwidth to support requirements of security and other system 
modifications.  This technique is also a means for steganography since it allows for 
communications to be hidden whether they are actually encrypted or not [8]-[9]. 
Serial communication channels must have a bandwidth in excess of the signal bandwidth that 
they transmit to allow for reliable communications in accordance with Shannon’s capacity 
theorem.  Because signals are always transmitted in the presence of noise, practical channels are 
designed with a bandwidth margin to account for reliable detection of the transmitted bitstream. It 
is desirable to efficiently utilize the available bandwidth in asynchronous serial channels. The 
common trend in modern wireline transceiver design is toward increasingly large data 
transmission bandwidth on the serial channel.  However, besides increasing the transmitted data 
rate on a single channel, the available bandwidth can also be used to support an 
auxiliary channel that is capable of transmitting and receiving an auxiliary data stream in parallel 







Figure 8. Proposed asynchronous serial link with auxiliary channel. 
As shown in Fig. 8, the proposed transceiver system transmits the primary and auxiliary data 
stream through a single asynchronous serial channel and simultaneously recovers both of them at 
the receiver. 
The proposed transmission scheme offers several benefits. 
For example, it can be applied in different ways to enhance hardware security, such as 
by using the auxiliary channel to carry authentication information for the simultaneously received 
primary data [16]. Other enhancements may exploit the fact that the auxiliary data transmission 
can also be considered as a form of steganography since the proposed method provides backward-
compatibility with standard transceivers for the primary data channel, and the auxiliary data 
appears as jitter to a non-equipped transceiver [8]-[9].  Alternatively, the auxiliary channel may be 




















and capacity without an increase in bandwidth [17]. Moreover, data throughput can also be 
increased through the use of the additional auxiliary channel.  
There are many applications for an auxiliary channel over a serial wireline link.  As an 
example, various schemes involving data authentication at the physical level are desirable without 
causing significant cost increases.  Data authentication can be accomplished via the receiver 
verifying the data source authenticity through signatures or other data being transmitted over the 
auxiliary channel. One example is the increasingly common use of reconfigurable FPGAs wherein 
configuration bitstreams are dynamically loaded from on-board serial memory.  Such bitstreams 
could be accompanied with authenticating signatures that provide some level of security to the 
device being configured.  If the auxiliary data stream comprises sensitive information, it can be 
further secured through the use of encryption.  
Auxiliary channel data transmissions may also be used for purposes other than authentication; 
for example, link quality indices (LQI) or error detection and correction checksums could be 
transmitted to accompany the primary data [4].  
An overall advantage of embedding a physical-layer auxiliary channel in an asynchronous 
serial link is that additional data can be transmitted in an existing design without the 
need for another primary channel.  Since transistors are relatively plentiful as compared 
to on- and off-chip communication channels, it is advantageous to use existing communications 
channels rather than incorporating costly additional links.  As an example, high-speed data on the 
primary channel may be accompanied with lower bandwidth control or synchronizing data on the 




primary goal, rather the decrease in cost and more efficient usage of available bandwidth could be 
the motivating factor. 
 The primary contributions of this work include the design, simulation, analysis, and prototype 
measurement of a new architecture for an asynchronous serial transceiver that supports an auxiliary 
channel. The potential applications to support hardware-level security are described in the 
manuscript.  We also emphasize that this additional channel is provided in a way that offers 
backward compatibility, interoperability with non-equipped designs, and minimal redesign of 
existing systems.  The architectures of the transmitter and receiver are described in detail. 
Guidelines for selecting the system parameters of the proposed scheme are provided. An analysis of 
the limits and impact of the additional auxiliary data channel is given. The measurement results 
demonstrate the function and performance of the proposed transceiver. 
 
3.2 Channel Bandwidth Margin 
The Shannon-Hartley channel capacity theorem, C=B×log2(1+SNR), relates 
the capacity of a channel to its bandwidth and to the signal-to-noise ratio (SNR) of the 
transmitted signal, where C represents the channel capacity in bit/sec and B represents the bandwidth 
in Hz.  Because all practical channels carry some amount of noise, it is necessary 
to implement communication systems such that the theoretical channel bandwidth exceeds the 
bandwidth of the transmitted signal by some amount and thus has some bandwidth margin.  It is 
this excess bandwidth margin that is exploited in the physical layer auxiliary channel described in 
this dissertation. However, the achievable auxiliary channel bandwidth is not only limited by the 




data rate of the auxiliary channel.  In the proposed prototype transceiver, we consider 
implementing the auxiliary channel at relatively low bandwidth in a conventional serial wireline 
transceiver at a very small additional cost, which will be described in detail below.  
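To make the notion of bandwidth margin concrete, the Shannon-Hartley expression can be evaluated for an example channel. In the sketch below, the channel bandwidth and SNR are hypothetical numbers chosen for illustration and are not measured values of the prototype. Only the 2.56 Gbps primary rate and the 80 Mbps auxiliary rate come from this work.

```python
import math


def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley capacity C = B * log2(1 + SNR), with SNR given in dB."""
    snr_linear = 10.0 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    # Hypothetical channel: 1.28 GHz of bandwidth at 20 dB SNR.
    capacity = shannon_capacity_bps(1.28e9, 20.0)
    primary, auxiliary = 2.56e9, 80e6  # data rates used in this work
    margin = capacity - primary - auxiliary
    print(f"capacity = {capacity / 1e9:.2f} Gbps, margin = {margin / 1e9:.2f} Gbps")
```

A positive margin in such a calculation only indicates that the theoretical capacity is not the limiting factor. The practical limit on the auxiliary data rate is set by the transceiver architecture.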
 
3.3 Transceiver Implementation 
The transceiver implementation described here is applicable to wireline baseband modulation 
systems.  The implemented transceiver embeds the auxiliary data into the primary data using 
modulation that can be considered as phase modulation (PM) since the phase of the primary data 
stream is conditionally delayed depending upon the auxiliary data values.  The transceiver 
simultaneously recovers both the primary and the auxiliary data at the receiving end [25]. 
 
3.3.1  Transmitter Architecture 
 
































A serial transmitter transmitting data at N bits per second typically employs a D flip-flop 
(DFF) triggered by a clock of frequency N Hz at the very end to synchronize every data bit before 
sending it out as serial data.  The duration of every data bit is the same as that of one period of the 
clock.  In the proposed transmitter, the auxiliary data is used to modulate the clock to the DFF.  
This is implemented using a 2:1 multiplexer (MUX) whose two data inputs are clock CLK0 and 
its delayed version CLK1.  The MUX control input is driven by the auxiliary data as shown in 
Fig. 9.  Thus, either CLK0 or CLK1 is selected for the DFF clock input by each bit of the auxiliary 















Figure 10. Timing diagrams of the transmitter modulation scheme. (a) Auxiliary data changing 
from bit ‘0’ to ‘1’. (b) Auxiliary data changing from bit ‘1’ to ‘0’. 
Fig. 10 depicts the waveforms of the clocks and data in the transmitter. CLK1 is delayed from 
CLK0 by a phase of Δφ. The instantaneous modulated clock is produced according to the auxiliary 
data bit.  For example, when the auxiliary data bit is ‘0’, CLK0 is used as the clock for the DFF 
and the modulated data stream is synchronized with the positive edge of CLK0.  When the 
auxiliary data bit is ‘1’, CLK1 is used for the DFF and the modulated data stream is synchronized 
with the positive edge of CLK1.  Essentially, the ‘0’ and ‘1’ bits of the auxiliary data are translated 
to phase lead and phase lag in the modulated data stream by the MUX serving as a binary switch 























 Fig. 10 (a) and Fig. 10 (b) show the transmitter timing diagrams with the auxiliary data 
changing from bit ‘0’ to ‘1’ and from ‘1’ to ‘0’, respectively.  In the ‘0’ to ‘1’ case, the 
auxiliary data modulates the primary data from phase lead to phase lag.  In the ‘1’ to 
‘0’ case, the auxiliary data modulates the primary data from phase lag to phase lead. 
It is necessary not to miss any clock sampling edges in the modulated clock to ensure correct 
data transmission.  Higher data rate leads to shorter timing margin for the MUX selection process; 
thus the timing of the MUX block needs to be carefully verified. 
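The modulation scheme described above can be sketched behaviourally by computing the launch time of each primary bit. The model below is idealized and ignores MUX and flip-flop delays. The default values (390.625 ps unit interval at 2.56 Gbps, 32 primary bits per auxiliary bit at 80 Mbps, and a 0.38 UI phase difference) follow numbers quoted elsewhere in this chapter, while the function name and structure are illustrative assumptions.

```python
def modulated_edge_times(aux_bits, bits_per_aux=32, ui_ps=390.625, dphi_ui=0.38):
    """Launch time (ps) of every primary bit after the CLK0/CLK1 MUX.

    Auxiliary '0' selects CLK0 (phase lead); auxiliary '1' selects CLK1,
    which lags CLK0 by dphi_ui unit intervals (phase lag).
    """
    times = []
    for i, aux in enumerate(aux_bits):
        delay = dphi_ui * ui_ps if aux else 0.0
        for k in range(bits_per_aux):
            n = i * bits_per_aux + k  # index of the primary bit
            times.append(n * ui_ps + delay)
    return times


if __name__ == "__main__":
    # Two primary bits per auxiliary bit keeps the printed list short.
    t = modulated_edge_times([0, 1, 0], bits_per_aux=2)
    print([round(x, 2) for x in t])
```

The spacing between consecutive launch times stretches by Δφ at a ‘0’ to ‘1’ auxiliary transition and shrinks by Δφ at a ‘1’ to ‘0’ transition, which is why the MUX timing margin tightens as the data rate increases.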
 

























If not properly designed, glitches in the modulated clock may occur when CLK0 and CLK1 
are at different voltage levels at the moment the modulated clock switches from CLK0 to CLK1 
or vice versa, as shown in Fig. 11.  The potential glitches are avoided by ensuring that the MUX 
select signal, the “auxiliary data”, always transitions when both CLK0 and CLK1 are at the same 
level.  This is accomplished by making sure the “auxiliary data” is internally synchronized to the 
positive edge of CLK1 (the transmitter is positive-edge triggered by CLK0 and CLK1).  Since 
CLK1 is a version of CLK0 delayed by less than half a clock period and the “auxiliary data” is 
triggered by the positive edge of CLK1, the auxiliary data only transitions when both CLK0 and 
CLK1 are ‘1’, so the glitch problem mentioned above never occurs.  
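The glitch-free condition can be checked with a small timing model. The sketch below assumes ideal 50% duty-cycle clocks with a period of one UI and a hypothetical clock-to-output delay for the flip-flop that launches the select signal. Neither assumption is taken from the implemented circuit.

```python
def clock_level(t_ui: float, delay_ui: float = 0.0) -> int:
    """Level (0 or 1) at time t_ui of a 50%-duty clock with a 1-UI period."""
    return 1 if ((t_ui - delay_ui) % 1.0) < 0.5 else 0


def select_switch_is_glitch_free(dphi_ui: float, clk_to_q_ui: float = 0.05) -> bool:
    """True if CLK0 and CLK1 are both high when the MUX select toggles.

    The select signal is launched by the rising edge of CLK1 (at t = dphi_ui)
    and changes clk_to_q_ui later; clk_to_q_ui is a hypothetical delay.
    """
    t_switch = dphi_ui + clk_to_q_ui
    return clock_level(t_switch) == 1 and clock_level(t_switch, dphi_ui) == 1


if __name__ == "__main__":
    for dphi in (0.34, 0.38, 0.42, 0.60):
        print(dphi, select_switch_is_glitch_free(dphi))
```

In this model the select signal switches safely as long as the sum of the phase difference and the launch delay stays below half a clock period, which holds for the phase differences used in this design under the assumed delay.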
 
 
Figure 12. Current-starved delay cell. 
Fig. 12 shows the current-starved delay cell that consists of four cascaded current-starved 









transistors, thus controlling the degree of the phase shift between CLK0 and CLK1. The 10% PVT 





3.3.2  Receiver Architecture 
 


















































Figure 14. The architecture and timing diagram of the BBPD. 
Typically, the asynchronous serial protocols require a clock and data recovery (CDR) circuit 
at the receiver, as the timing is embedded within the transmitted data.  Fig. 13 depicts the block 
diagram of the receiver architecture. It contains a main CDR circuit to recover the primary data 
and the embedded clock, and an auxiliary data recovery path to extract and demodulate the 
































a bang-bang phase detector (BBPD), a charge pump followed by a low-pass filter (LPF), and an 
LC tank voltage-controlled oscillator (VCO) [26]-[28].  
In particular, in the proposed receiver the BBPD is shared by both the main CDR loop and 
the auxiliary data recovery path.  Fig. 14 shows the architecture and the timing diagram of the 
modified BBPD. A conventional Alexander BBPD uses three sampled points and XOR gates to 
produce early or late signals that may not be continuous during the leading or lagging phase 
[29]. The uniqueness of the modified BBPD is that it detects the phase difference between the 
input data and the feedback clock and produces either ‘1’ to indicate a phase lag or ‘0’ to indicate 
a phase lead for the entire phase lag/lead period. This is an important feature for extracting the 
auxiliary data. 
 The error signal produced by the BBPD is sent to the charge pump in the main CDR loop 
and is also used in the auxiliary data path for demodulation of the auxiliary data. In the main CDR 
loop, the error signal through the charge pump is used to control the VCO that adjusts the phase 
of the recovered clock signal so that the primary data stream can be recovered. In the auxiliary 
data path, as the bits ‘0’ and ‘1’ of the auxiliary data are embedded as phase lead and lag in the 
primary data stream produced by the transmitter, the BBPD error signal contains the demodulated 
phase information (lead/lag) that is used to recover the auxiliary data.  The timing diagram in 
Fig. 14 shows the signals in the BBPD for the cases where the auxiliary channel is modulated 
as either phase lead or phase lag in the input data.  Phase lead represents bit ‘0’ and phase lag 
represents bit ‘1’ in the auxiliary channel. The BBPD consists of four DFFs, DFF1 through DFF4, 
followed by a MUX.  The input data is connected to the ‘D’ input of DFF1 and DFF2 whereas 




noted DFF1 is positive-edge triggered whereas DFF2 is negative-edge triggered as shown in the 
figure.  
During the locking process of the modulated input data stream, the BBPD error signal 
generated through the negative feedback loop adjusts the VCO feedback clock so that it is locked 
to the average phase of the input data. Once lock is achieved, the VCO clock is locked 
to the mid-point of the two phases (phase lead and phase lag) of the input data. Signal $Q_1$, which 
is always triggered at the positive edge of the VCO clock, is the recovered primary data. Signal 
$Q_2$, triggered at the negative edge of the VCO clock, is either 180º leading or lagging with respect 
to $Q_1$, and this phase relation represents the auxiliary data bits.  $Q_3$ is produced by sampling $Q_2$ 
using the positive edge of $Q_1$, thus it produces ‘1’ for phase leading and ‘0’ for phase lagging. 
$\overline{Q_3}$ is the opposite of $Q_3$. On the other hand, $Q_4$ is produced by sampling $Q_2$ using the negative 
edge of $Q_1$, thus it produces ‘0’ for phase leading and ‘1’ for phase lagging. $\overline{Q_3}$ and $Q_4$ are almost 
identical except for a certain delay between them. The final error signal is produced by selecting 
either $\overline{Q_3}$ or $Q_4$ using $Q_1$ as the select signal to ensure bits ‘0’ and ‘1’ are extracted in a timely 
manner for the recovered auxiliary data. Although this is properly referred to as an error signal 
with respect to the phase of the primary channel, it is also the signal that carries the auxiliary 
channel bitstream and is further processed to extract the auxiliary channel bitstream. 
The CDR relies upon transitions in the received data stream to produce the error signal that 
controls the VCO.  For this reason, it is necessary that a sufficient number of transitions or pulse 
edges be present in the received asynchronous data stream.  In order to provide a sufficient 
number of signal transitions in the received primary data stream, most asynchronous serial data 




or 8B10B encoding.  These encoding schemes insert extra timing bits that guarantee at least one 
signal transition occurs among some number of subsequent bits.  While the timing bits are 
redundant in terms of information content, their presence allows for transmission speeds to be 
increased due to the transitions occurring often enough to ensure that the receiver maintains 
synchronization thus enhancing overall throughput. The PRBS7 scheme is used in both the primary 
data stream and the auxiliary data stream in the proposed transceiver and data channels. 
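For reference, a PRBS7 pattern can be generated with a 7-bit linear feedback shift register. The sketch below assumes the commonly used polynomial x^7 + x^6 + 1. The source does not state which polynomial the prototype uses, so the taps, the seed, and the helper names are assumptions.

```python
def prbs7(n_bits: int, seed: int = 0x7F) -> list:
    """Generate n_bits of a PRBS7 sequence with a 7-bit Fibonacci LFSR.

    Assumed polynomial: x^7 + x^6 + 1 (feedback from stages 7 and 6).
    """
    if not 0 < seed < 0x80:
        raise ValueError("seed must be a non-zero 7-bit value")
    state, out = seed, []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F      # shift left, feed back
        out.append(new_bit)
    return out


def longest_run(bits) -> int:
    """Length of the longest run of identical consecutive bits."""
    best = run = 1
    for a, b in zip(bits, bits[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best


if __name__ == "__main__":
    seq = prbs7(254)
    print("repeats every 127 bits:", seq[:127] == seq[127:])
    print("longest run:", longest_run(seq))
```

The sequence repeats every 127 bits and its runs of identical bits are bounded, so transitions occur often enough for the CDR to stay locked.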
Some high-frequency pulses are generated at the BBPD output as a result of the 
noise coupled from the input serial stream and the VCO feedback clock. The BBPD compares each 
transition edge of the input serial stream with that of the VCO feedback clock to detect the phase 
lead or lag and produces the error signal.   When the instantaneous random jitter of the input serial 
stream and the VCO feedback clock is larger than the modulated phase difference Δφ, the modulated 
phase lead/lag information is buried in the random jitter. At such moments, the phase 
error detected by the BBPD is not the modulated phase information, so high-frequency 
pulses appear in the error signal.  To fully recover the auxiliary data from the error signal, 
a second-order low-pass filter (2nd LPF) is employed in the auxiliary data recovery path to first 
filter out the high-frequency pulses in the BBPD output.  By choosing an appropriate bandwidth 
(which will be explained in the next sub-section), the 2nd LPF can filter out the high-frequency 
pulses of the error signal while preserving the recovered auxiliary data.   
The auxiliary data rate is lower than the primary data rate in the proposed transceiver, but 
they are synchronized with each other.  The clock for sampling the auxiliary data is thus a divided-
down version of the faster clock in the transmitter/receiver (fast clock being the VCO output for 




A standard receiver that is not equipped to demodulate the auxiliary channel will function 
normally and recover the primary bit stream without noticeable error due to the presence of the 
auxiliary channel, because the modulated phase difference at the transmitter does not exceed 
the allowable tolerance of the CDR at the standard receiver. In this case, 
the phase difference modulated by the auxiliary data appears as phase noise/jitter on the primary 
data channel.  
 
3.4 System Design Parameters  
In order to demodulate and recover the auxiliary data and to minimize the jitter in the 
recovered primary data, several parameters of the transceiver need to be chosen appropriately 
including the modulated phase difference at the transmitter, the parameters of the CDR loop, the 
data rate of the auxiliary channel, and the bandwidth of the 2nd LPF.  
 To ensure the main CDR loop operates reliably, the total jitter from the transmitter and 
receiver must be less than one Unit Interval (UI), otherwise the feedback clock of the CDR would 
not sample the input data stream correctly, thus leading to the degradation of the BER performance 
of the recovered primary data.  To fully recover and demodulate the auxiliary data, the total jitter 
from the transmitter, the receiver, and the serial channel together needs to be less than the bounded 
deterministic jitter caused by the modulated phase difference Δφ at the transmitter, otherwise the 
embedded auxiliary data information would be buried in the system’s jitter.  In our design, a jitter 
budget of <0.3 UI is allocated to the total jitter of the transceiver and the serial channel, and a jitter 
budget of 0.3-0.5 UI is allocated to the phase difference Δφ.  Δφ is set to be about 135° in our 




temperature) variation of the phase difference Δφ (0.34 UI – 0.42 UI) is still within the jitter 
budget.  
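The phase-to-time bookkeeping behind this budget can be reproduced with a few lines. The sketch below assumes a full-rate clock, so that 360° of clock phase equals one UI at 2.56 Gbps. The helper names are illustrative.

```python
def phase_to_ui(phase_deg: float) -> float:
    """Convert a clock phase shift in degrees to unit intervals (360 deg = 1 UI)."""
    return phase_deg / 360.0


def ui_to_ps(ui: float, data_rate_bps: float = 2.56e9) -> float:
    """Convert unit intervals to picoseconds at the given data rate."""
    return ui / data_rate_bps * 1e12


def within_budget(dphi_ui: float, lo: float = 0.3, hi: float = 0.5) -> bool:
    """True if the modulated phase difference sits inside its 0.3-0.5 UI budget."""
    return lo <= dphi_ui <= hi


if __name__ == "__main__":
    nominal = phase_to_ui(135.0)
    print(f"135 deg = {nominal} UI = {ui_to_ps(nominal):.1f} ps")
    print("corners within budget:", all(within_budget(x) for x in (0.34, nominal, 0.42)))
```

The nominal 135° setting and both ends of the 0.34 UI to 0.42 UI variation range fall inside the 0.3 UI to 0.5 UI allocation.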
 
Figure 15. Simulated eye diagram of the modulated primary data. 
Fig. 15 shows the simulated eye diagram of the modulated primary data with two phases. 
Unlike a typical transceiver eye diagram where the eye width is one UI, the proposed transmitter 
has its output data phase modulated by the auxiliary data, and thus its eye diagram contains a phase 
difference between bits with phase lead and bits with phase lag. The blue arrowed line marks the 
transmitted eye with leading phase and the black one marks the transmitted eye with lagging phase, 
both of which are as wide as 390 ps.  Note that there is a 0.38 UI (148 ps) phase difference, 
indicated by the green arrows, between these two eyes. This phase difference corresponds to the 




The data rate of the proposed transceiver has the potential to be increased. In that case, the jitter 
budget and the modulated phase difference must scale with the increase of the data 
rate. For example, if the data rate is to be ten times faster, it is necessary to scale 
the jitter budget and the modulated phase difference, expressed in absolute time, to one tenth of 
the values used above. This is still achievable but requires more careful design using low-jitter 
techniques and precise phase control methods. Transmitter and receiver equalizers would also be 
needed if the channel loss is high at the higher data rate.   
In a non-equipped standard receiver, the auxiliary data appears as bounded deterministic jitter 
of the primary data. The jitter caused by the auxiliary data still falls within the jitter budget of the 
transceiver, and having this much jitter would not adversely affect the functionality of the primary 
data recovery. Thus, the proposed auxiliary data is backward compatible with the non-equipped 
standard receiver.  
 
 














In order to select the other CDR parameters, the main CDR with its negative feedback loop is 
modelled with a linear model as shown in Fig. 16. The small-signal model of the CDR can be 
derived from this block diagram.  
The LPF in the CDR loop contributes a pole at ωP and a zero at ωZ. The magnitude 





$$\cdots = K_{PD} \cdot I_{CP} \cdot \left(1 + \frac{s}{\omega_Z}\right) \cdots \tag{2.1}$$














, KPD is the gain of the BBPD, ICP is the charge 
pump current, and KVCO is the gain of the VCO.  
Typically, $C_1 \gg C_2$, so (2.1) can be simplified to (2.2). 







where $\omega_u$ is the unity-gain bandwidth BW of the CDR open-loop transfer function, which can 
be derived as shown in (2.3). 
$$BW = \omega_u \approx \frac{K_{PD} \cdot I_{CP} \cdot K_{VCO} \cdot R}{2\pi} \tag{2.3}$$







Figure 17. BBPD output versus the input data phase difference. The blue curve represents the 
ideal gain of the BBPD whereas the red curve represents the realistic gain of the BBPD. 
The binary values ‘0’ and ‘1’ of the phase error signal produced by the BBPD provide sign 
information that indicates whether the phase of the data is leading or lagging the VCO feedback clock. 
The BBPD is followed by the charge pump, which turns on either the charging or the discharging 
current.  The value ‘0’ of the BBPD output turns on the charging current of the charge pump, whereas 
the value ‘1’ of the BBPD output turns on the discharging current. Thus ‘-1’ and ‘1’ are used to 
represent the BBPD output average amplitude.  As shown in Fig. 17, $K_{PD}$ is the slope of the waveform 
where the BBPD average output amplitude changes from ‘-1’ to ‘1’.  The blue waveform shows that 
the ideal PD has an infinitely large gain in a noiseless environment. However, in a real situation, the 
slope of the BBPD output is influenced by the random jitter of the input data, as shown by the red 
waveform in Fig. 17.  The root-mean-square (RMS) value of the random jitter $J_{RMS}$ in the input 
data stream is denoted as $\sigma$. The practical $K_{PD}$ can be calculated from the slope of the red waveform 










where $\alpha$ is the transition density of a specific type of data input and its value depends on the 
average number of transition edges present during a unit time.  Typically, $\alpha = 0.5$ for the PRBS 
type input [30]. As mentioned above, a jitter budget of <0.3 UI is allocated to the total jitter of the 
transceiver and the serial channel. Therefore, the maximum peak random jitter $J_P$ cannot exceed 











where $Q$ is the scaling factor when converting between peak jitter and RMS jitter with 














The bandwidth of the CDR is set at 1.54 MHz (2.56 GHz/1667), $K_{VCO}$ is set to $100 \cdot 2\pi$ MHz·rad/V, 
and $I_{CP}$ is designed to be 1 µA.  With these parameters, resistor R is derived from (2.3) as   
$$R = \frac{BW \cdot 2\pi}{K_{PD} \cdot I_{CP} \cdot K_{VCO}} \approx 2.1\ \mathrm{k\Omega} \tag{2.7}$$
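The value of capacitor $C_1$ is set to 148 pF, and $C_2$ is set to 10 pF, leading to a zero located at 
512 kHz and a pole located at 7.6 MHz, as calculated below in (2.8) and (2.9). 
$$f_Z = \frac{\omega_Z}{2\pi} = \frac{1}{2\pi \cdot R \cdot C_1} \approx 512\ \mathrm{kHz} \tag{2.8}$$
$$f_P = \frac{\omega_P}{2\pi} \approx \frac{1}{2\pi \cdot R \cdot C_2} \approx 7.6\ \mathrm{MHz} \tag{2.9}$$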
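The zero and pole locations quoted above can be checked from the component values. The sketch below assumes the common loop-filter topology of R in series with C1, with C2 in shunt. The topology is inferred from the statement that C1 is much larger than C2 and is not confirmed by a schematic in this section. The exact pole expression is included only to show how close the C1 ≫ C2 approximation is.

```python
import math


def zero_pole_hz(r_ohm: float, c1_f: float, c2_f: float):
    """Zero and pole frequencies (Hz) of a series-R-C1 / shunt-C2 loop filter.

    Returns (f_zero, f_pole_exact, f_pole_approx); the approximate pole
    1/(2*pi*R*C2) is the limit of the exact one when C1 >> C2.
    """
    f_zero = 1.0 / (2.0 * math.pi * r_ohm * c1_f)
    c_series = c1_f * c2_f / (c1_f + c2_f)  # C1 in series with C2
    f_pole_exact = 1.0 / (2.0 * math.pi * r_ohm * c_series)
    f_pole_approx = 1.0 / (2.0 * math.pi * r_ohm * c2_f)
    return f_zero, f_pole_exact, f_pole_approx


if __name__ == "__main__":
    fz, fp_exact, fp_approx = zero_pole_hz(2.1e3, 148e-12, 10e-12)
    print(f"zero: {fz / 1e3:.0f} kHz")
    print(f"pole: {fp_approx / 1e6:.2f} MHz (approx), {fp_exact / 1e6:.2f} MHz (exact)")
```

With R = 2.1 kΩ, C1 = 148 pF and C2 = 10 pF, the approximate expressions reproduce the 512 kHz zero and the 7.6 MHz pole.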
 
3.4.1  The Auxiliary Data Rate  
The Shannon-Hartley theorem reveals the theoretical bandwidth margin within the total channel 
capacity that can be exploited for an auxiliary channel. In the serial link design, the transceiver 
architecture sets the actual limitations on the auxiliary data rate, and the performance of the 
recovered primary and auxiliary data is related to the auxiliary data rate.  
The main CDR closed-loop transfer function H(s), which is also the jitter transfer function 













The observed jitter transfer function (OJTF) shown in (2.11) has a high-pass frequency 
response, so the jitter above the cut-off frequency can be observed at the output of phase detector. 










Due to the low-pass characteristics of the CDR loop, the VCO feedback clock can track the 
low-frequency (in-band) phase change of the input data stream, and accordingly, the in-band phase 




signal of the BBPD output.  In order to keep the auxiliary data information intact in the error signal, 
the bandwidth of the main CDR loop needs to be smaller than the lowest frequency components 
of the auxiliary data frequency spectrum.  Otherwise, the feedback clock would track the phase 
change of the input data stream, which can result in the loss of embedded phase information 
(auxiliary data). 
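The requirement that the CDR bandwidth stay below the auxiliary data spectrum can be illustrated with a simplified model. The sketch below approximates the closed-loop response H by a single pole at the 1.54 MHz loop bandwidth and takes the observed jitter transfer function as 1 − H. Both are simplifying assumptions and are not the second-order expressions of the actual loop.

```python
import math


def jitter_transfer_mag(f_hz: float, bw_hz: float) -> float:
    """|H| for a one-pole approximation of the closed-loop CDR response."""
    return 1.0 / math.sqrt(1.0 + (f_hz / bw_hz) ** 2)


def observed_jitter_transfer_mag(f_hz: float, bw_hz: float) -> float:
    """|1 - H|: fraction of input phase modulation visible at the PD output."""
    x = f_hz / bw_hz
    return x / math.sqrt(1.0 + x * x)


if __name__ == "__main__":
    bw = 1.54e6  # CDR loop bandwidth in Hz
    for f in (1e5, 1.54e6, 10e6, 40e6):
        print(f"{f / 1e6:6.2f} MHz: tracked {jitter_transfer_mag(f, bw):.3f}, "
              f"observed {observed_jitter_transfer_mag(f, bw):.3f}")
```

Phase modulation well above the loop bandwidth appears almost unattenuated at the phase detector output, whereas modulation inside the loop bandwidth is tracked by the VCO and largely disappears from the error signal.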
Random data, including encoded data such as PRBS or 8B10B, has a wide frequency 
spectrum, because there are runs of consecutive ‘0’s and ‘1’s in the data stream.  PRBS7 is used 
for the primary and auxiliary data in the proposed transceiver. Its longest run is seven identical 
bits, which corresponds to half a period of the lowest frequency component, so the lowest frequency 
component of PRBS7 is one fourteenth of the data rate.  In our implementation reported here, the 
lower limit of the data rate of the auxiliary channel is therefore set at 14 times the CDR bandwidth.   
The 1.54 MHz bandwidth of the CDR leads to a lower boundary of the auxiliary data rate of 21.56 Mbps.  
A ratio between the primary and auxiliary data rate is also considered to set the upper 
boundary for the auxiliary data rate in the proposed transceiver.  As the BBPD compares the edges 
between the input data and the VCO feedback clock at each transition edge of the input data, the 
frequency of the unwanted high-frequency pulses at the error signal highly depends on the primary 
data rate. Therefore, the frequency range of the pulses is from the lowest to the highest frequency 
components of the primary data rate.  In order to filter out the unwanted high-frequency pulses 
coupled from the noise of the primary data and the VCO feedback clock, the bandwidth of the 2nd 
LPF needs to be lower than the lowest frequency component of the primary data rate. Therefore, 
the upper limit of the auxiliary data rate that needs to be within the bandwidth of the 2nd LPF is 
considered to be the lowest frequency component of the primary data rate.  In the proposed 
transceiver, the primary data is coded to PRBS7, leading to the upper boundary of the auxiliary 




Taking these analyses into consideration, the auxiliary data rate in the proposed transceiver 
is set to 80 Mbps, within the auxiliary data rate range as analyzed above.  The bandwidth of the 
2nd LPF is set to 40 MHz to filter out the high-frequency pulses while keeping the recovered 
auxiliary data. 
Generally, there is a trade-off between the performance of the recovered data (BER or 
jitter performance) and the auxiliary data rate.  A higher auxiliary data rate, which uses more of 
the channel bandwidth margin, may degrade the BER performance of the recovered auxiliary data. 
Designers can set the auxiliary data rate based on their channel SNR and the BER requirements 
for the received data. 
The auxiliary data has some limitations and affects the jitter tolerance of the primary data link 
and the jitter of the recovered data. The jitter tolerance analysis in section III.E and the 
jitter measurement results in section IV illustrate these limitations and impacts. 
 
3.4.2  The Impact of the Auxiliary Data on the Primary Data Channel – The Jitter Tolerance 
Analysis  
To understand the impact of the auxiliary data on the performance of the primary data channel, 
we evaluate the jitter tolerance (JTOL) of the primary data. JTOL can be evaluated using equation 












JTOL is specified in units of peak-to-peak jitter amplitude (UIPP). The timing margin for the CDR 
input signal is one unit interval (1 UI).  $\phi_{out}(s)/\phi_{in}(s)$ is the CDR closed-loop 
transfer function H(s).  Since H(s) has a low-pass response, the CDR can tolerate low-frequency 
jitter with an amplitude of up to several times the timing margin.  
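The low-pass argument can be made concrete with a small numerical sketch. It assumes the common linear approximation JTOL(f) = TM / |1 - H(j2πf)|, where TM is the timing margin, and models H(s) as a first-order low-pass response with the 1.54 MHz CDR bandwidth quoted earlier. Both the first-order model and the function names are illustrative assumptions, not the loop actually designed here.

```python
def closed_loop_h(f_hz, bandwidth_hz):
    """First-order low-pass model of the CDR closed-loop response (an assumption)."""
    return 1.0 / complex(1.0, f_hz / bandwidth_hz)


def jtol_uipp(f_hz, bandwidth_hz=1.54e6, timing_margin_ui=1.0):
    """Tolerable peak-to-peak sinusoidal jitter in UI: timing margin / |1 - H|."""
    error_tf = 1.0 - closed_loop_h(f_hz, bandwidth_hz)  # jitter the loop does not track
    return timing_margin_ui / abs(error_tf)


for f in (1.54e4, 1.54e5, 1.54e6, 1.54e7, 1.54e8):
    print(f"{f:10.3g} Hz -> {jtol_uipp(f):8.2f} UIpp")
```

In this model the tolerance grows roughly as 1/f below the loop bandwidth and settles at the timing margin above it. Passing a smaller timing_margin_ui models margin that is already consumed by other jitter sources.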
 
Figure. 18. Simulated jitter tolerance with SONET OC-192 mask. 
 
The CDR jitter tolerance (JTOL) with and without the auxiliary data is simulated by varying 
the frequency and amplitude of a sinusoidal jitter and finding the maximum sinusoidal jitter 
amplitude at which the receiver still guarantees the targeted BER of 10^-12. As can be seen from the 
JTOL simulation results in Fig. 18, at low jitter frequency, the JTOL performance with the 
auxiliary data is almost the same as the performance without the auxiliary data, whereas at high 




because the main CDR loop has a high-pass OJTF, and the low-frequency jitter (in band) can be 
tracked by CDR feedback clock. Therefore, as long as the bounded deterministic jitter introduced 
by the auxiliary data is within the CDR timing margin, auxiliary data does not affect the low-
frequency JTOL performance. However, high-frequency sinusoidal jitter (out of band) would not 
be tracked by the CDR feedback clock, thus it occupies part of the CDR timing margin in the input 
bitstream. As mentioned above in section III. B, the auxiliary data also appears as high-frequency 
deterministic jitter in the CDR input bitstream and occupies some CDR timing margin. Therefore, 
the performance degradation of the high-frequency JTOL between the cases with and without 
auxiliary data is very close to the modulated phase difference Δφ introduced by the auxiliary data, 
which is 0.38UI.  
The SONET OC-192 mask is also plotted in Fig. 18 for comparison. The JTOL curves of both 
cases (with and without the auxiliary data) exceed the SONET OC-192 mask with a margin of 
more than 0.2 UI. This means the degraded JTOL performance is still acceptable for obtaining a 
BER of 10^-12 for the recovered primary data.  The proposed auxiliary channel compromises some 
performance metrics of the serial transceiver, such as JTOL, but it still meets the communication standard 





3.5 Measurement Results  
 
   
The prototype IC for the proposed asynchronous serial transceiver was fabricated in a 
65 nm CMOS process.  The die photo is shown in Fig. 19.  The core circuit occupies about 






















Fig. 20 presents the measurement environment and connections. On the transmitter side, two 
synchronized signal generators (Agilent N5181A and PatternPro SDG 12072) produce three input 
signals: the TX clock input, the TX PRBS primary data input, and the TX PRBS auxiliary data input. 
On the receiver side, the three recovered signals are connected to a Tektronix 73304D oscilloscope 
to measure the eye diagrams and jitter performance. All external connections use short-reach (less 
than 0.5 m) coaxial cables. The control codes were generated by an I2C master board. 
The modulated data is transmitted differentially between the transmitter and the receiver over 
a pair of coaxial cables. Given the 2.56 Gbps primary data rate 
and the jitter budget provided in section III. C, the 0.38UI (148 ps) of the modulated phase 


























Fig. 20 labels: ① TX Clock Input; ② TX Primary Data Input; ③ TX Auxiliary Data Input; ④ RX Recovered Clock Output; ⑤ RX Recovered Primary Data Output; ⑥ RX Recovered Auxiliary Data Output; Oscilloscope.




delay mismatch caused by a pair of short-reach coaxial cables (~0.5 m in our measurement) would 
not exceed the margin of the phase modulation employed in the proposed design (30 ps), and thus 
would not adversely affect the function or performance of the phase modulation and demodulation.   
Figure. 22. The Eye diagram of the recovered primary data. 
 





























The transceiver operates at a supply voltage of 1.2 V. The power consumption of the core 



















Figure. 23. The bathtub of the recovered primary data (eye opening at 1E-12 BER: 0.88 UI with the auxiliary data off, 0.82 UI with the auxiliary data on). 
















primary data rate from 2.2 Gbps to 3.6 Gbps (54.7%).  Fig. 21 shows the recovered 2.56 GHz clock 
signal with 31 ps total jitter, when the auxiliary data is in the serial link.  Fig. 22 shows the eye 
diagram of the recovered 2.56 Gbps primary data at PRBS-7, with 48 ps total jitter when the 
auxiliary data is off, and with 70 ps total jitter when the auxiliary data is on.  The bathtub curves 
of the recovered primary data are shown in Fig. 23. The eye opening width is 0.88 UI when the 
auxiliary data is off, and 0.82 UI when the auxiliary data is on. Fig. 24 shows the eye diagram of the 
recovered 80 Mbps auxiliary data with 103 ps total jitter.  The total jitter is measured at the 
targeted BER of 10^-12.  
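The jitter figures quoted in picoseconds can be related to unit intervals with a short calculation. The sketch below is plain arithmetic on the numbers reported above; the helper names are chosen for this example only, and treating the eye opening as one UI minus the total jitter is a simplifying assumption.

```python
def ui_ps(data_rate_bps):
    """Unit interval in picoseconds for a given data rate."""
    return 1e12 / data_rate_bps


def ps_to_ui(t_ps, data_rate_bps):
    """Express a time in picoseconds as a fraction of the unit interval."""
    return t_ps / ui_ps(data_rate_bps)


PRIMARY_RATE = 2.56e9  # primary data rate in bit/s

print(f"UI at 2.56 Gbps: {ui_ps(PRIMARY_RATE):.3f} ps")
for label, tj_ps in (("auxiliary data off", 48.0), ("auxiliary data on", 70.0)):
    tj_ui = ps_to_ui(tj_ps, PRIMARY_RATE)
    # Assumed relation: eye opening = 1 UI - total jitter
    print(f"{label}: TJ = {tj_ui:.3f} UI, eye opening = {1.0 - tj_ui:.2f} UI")
```

Under that assumption the 48 ps and 70 ps total jitter values correspond to eye openings of about 0.88 UI and 0.82 UI, consistent with the bathtub results.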
While the auxiliary data degrades the jitter performance of the recovered primary data, as can 
be seen from the measurement results in Fig. 22 and Fig. 23, the additional jitter (22 ps) caused by 
the auxiliary data is relatively small, and it would not adversely affect the functionality of the 





CHAPTER 4  A HYBRID LINE DRIVER WITH VOLTAGE-MODE SST PRE-EMPHASIS 




Figure. 25. Basic architecture of (a) transmitter FIR equalization and (b) pre-emphasis. 
    An ideal signal for a receiver completes the data transition within a symbol interval. 
However, a signal travelling through a long cable is subject to high-frequency losses, including 
dielectric losses and skin effect. These losses attenuate the amplitude of high-frequency 


























is referred to as intersymbol interference (ISI), which degrades signal transmission and 
makes it difficult to recover the signal at the receiver. 
 
 
Figure. 26. Time-domain waveform of TXEQ and pre-emphasis. 
In the transmitter, two techniques can be used to mitigate the ISI caused by the channel loss: 
transmitter equalization (TXEQ) using a finite-impulse response (FIR) filter, and pre-emphasis. The 
FIR filter in TXEQ consists of several taps, and its function is to emphasize (boost) the 
high-frequency response and de-emphasize the low-frequency response to compensate for the channel 
loss that causes ISI.  It advances and delays the signal bit and adds the advanced and delayed bits to 
the original input signal bit with proper strength. As shown in Fig. 25 (a), Dmain is the original 
input signal bit (main-cursor tap) whereas Dpre is the advanced bit (pre-cursor tap) and Dpst1 and 




















input data streams switch on/off their corresponding currents IPre, IMain, IPst1 and IPst2, which are 
programmable and set the coefficient of each tap. For comparison, Fig. 25 (b) shows the 
basic architecture of the pre-emphasis. The PreEmp_Pulse is generated by an XOR function of 
Data_In and its delayed signal Data_Delay. Data_In and PreEmp_Pulse turn on/off the currents IMain 
and IPreEmp, respectively. Right after a transition edge of Data_In, the current IPreEmp is added to the 
nominal output current IMain, which can speed up the data transition to reach the desired amplitude 
and decrease the rise and fall time. The duration of IPreEmp is the pulse width of the 
PreEmp_Pulse. Pre-emphasis only boosts the high-frequency components without reducing the 
low-frequency components, but it has limited ability to fully compensate for the channel loss. The 
time-domain waveforms in Fig. 26 show the effects of the first post-tap in FIR equalization and of 
the pre-emphasis. The first post-tap in the FIR equalization emphasizes the first bit period after a 
data transition and de-emphasizes the remaining bits. In the time domain, the FIR equalization 
reshapes the pulse to limit the spreading of the data pulses into their adjacent transmitted bits, thus 
reducing ISI. However, this comes at the cost of attenuating the transmitted signal. In contrast, 
pre-emphasis only emphasizes the first bit period without de-emphasizing the remaining bits, 






Figure. 27. Current-mode driver with FIR equalization. 
     A simplified schematic of a transmitter FIR equalization using CML topology is shown in 
Fig. 27.  The pre-cursor tap, main-cursor tap and post-cursor tap are shown with the coefficient 
settings of C1, C0 and C-1, respectively. Each tap coefficient controls the current of its tail 
current source to set the strength of that tap. Because it has an inherently low 
susceptibility to power supply noise, the CML structure offers low jitter and low design complexity 


















Figure. 28. Voltage-mode driver with FIR equalization. 
     Fig. 28 shows an SST VM driver with FIR equalization. The main-cursor signal DP/N, 
post-cursor signal DpstN/P and pre-cursor signal DpreN/P are the inputs for a segment selection 
logic.  In order to implement the TXEQ in VM drivers and maintain proper impedance at the same 
time, the pre-driver needs additional segment selection logic to distribute a large number of 
segments and scale the current weight for different taps by generating coefficients C-1, C0 and C1 
for different groups of segments [44]. This not only increases the design complexity of the pre-driver, 
but also adds substantial capacitive load, thus degrading the power advantage of the VM drivers by 



















































FIR equalization is that the data-dependent current would be injected into the supply rails, affecting 
the data-dependent jitter performance of the transmitter. 
     Reference [44] presented a hybrid VM main driver with CM FIR equalization. The VM 
main driver offers low power dissipation, and the CM FIR equalization avoids pre-driver 
segmentation for low design complexity. However, the amplitude of the output signal is 
significantly reduced by the FIR equalization, especially when very large channel loss needs to be 
compensated, which would be the case for the data transmission over long cables in DUNE. 
     This dissertation proposes a new hybrid transmitter with a line driver that combines a VM 
main driver with the VM pre-emphasis using SST driving cells and the CM TXEQ. The 
combination of the VM pre-emphasis with the CM TXEQ allows for using relatively small 
coefficients (strength) for the FIR equalization (thus alleviating de-emphasis of the low-frequency 
components) while maintaining a good eye height.  Additionally, the SST-based main driver is 







4.2 The Proposed Hybrid Line Driver  
The proposed hybrid line driver is shown in Fig. 29, which consists of a differential-mode 
VM main driver with VM pre-emphasis and CM TXEQ as outlined in the figure. The differential 
input data streams, denoted as DataN and DataP, drive the VM cells, including both the main 
driver cells and PreEmp (pre-emphasis) cells. A 4-tap FIR equalization is implemented in the 
driver. The VM main driver acts as the main tap, and three CM taps (pre-cursor tap, post-cursor 
tap1 and post-cursor tap2) are connected in parallel with the VM main driver to implement the CM 
TXEQ. 
     The SST topology used in the VM main driver cells and VM PreEmp cells is also shown 
in Fig. 29. An Enable Logic generates Up and Down to switch on and off the VM driving cells. 
Since SST is a CMOS-oriented design, the SST cells can be turned off when the signal Up is set 
to ‘1’, and Down is set to ‘0’. Signal Enable serves as the control signal for the Enable Logic in 
the main driver cells; both signals Enable and PreEmp_Pulse serve as the control signals for the 
Enable Logic in the PreEmp cells. For example, the PreEmp cell is turned on when both its 

































































































     In the proposed hybrid line driver in Fig. 29, DpreN/P are the differential inputs of the 
pre-cursor tap, and they lead DataN/P by one bit time. Dpst1N/P and Dpst2N/P are the inputs of 
post-cursor tap1 and post-cursor tap2, which lag DataN/P by one bit and two bits, respectively. The 
tap coefficients are implemented by tail current sources with 4-bit binary control. Higher current 
resolution can be achieved by increasing the number of bits.  However, TXEQ deals with channel 
loss and ISI by sacrificing the signal amplitude, which results in a trade-off between the signal 
amplitude and SNR/jitter performance of the transmitted signals.       
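The amplitude cost of TXEQ can be seen in a small behavioral sketch of a 4-tap FIR (one pre-cursor, the main cursor and two post-cursors). The coefficient values, their signs and the function name tx_fir are assumptions chosen for illustration; they are not the tap settings used in the chip.

```python
def tx_fir(bits, c_pre=-0.05, c_main=0.7, c_post1=-0.2, c_post2=-0.05):
    """Behavioral 4-tap transmit FIR on NRZ symbols (+1/-1).

    The pre-cursor tap sees the next bit; the post-cursor taps see the bits
    one and two periods earlier. Edge symbols are repeated at the boundaries.
    """
    sym = [1.0 if b else -1.0 for b in bits]
    n = len(sym)
    out = []
    for i in range(n):
        nxt = sym[i + 1] if i + 1 < n else sym[i]
        prev1 = sym[i - 1] if i >= 1 else sym[i]
        prev2 = sym[i - 2] if i >= 2 else sym[i]
        out.append(c_pre * nxt + c_main * sym[i] + c_post1 * prev1 + c_post2 * prev2)
    return out


bits = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
levels = tx_fir(bits)
print([round(v, 2) for v in levels])
```

With these assumed coefficients the first bit after a transition is sent at 0.9 of full scale, while a long run of identical bits settles at 0.4. This illustrates how stronger taps flatten the channel response at the expense of low-frequency amplitude.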
     In DUNE, a large signal swing is required for the receiver to recover the transmitted data 
with a good BER performance. In order to reduce the signal attenuation caused by FIR equalization 
and obtain a large eye height, VM SST pre-emphasis is proposed in the hybrid line driver. It peaks 
the voltage of the first bit after each transition edge to decrease the rise and fall time of the 
transmitted data, which enhances the transmission bandwidth and boosts the high-frequency 
response. VM pre-emphasis allows the FIR equalization to use smaller tap coefficients than would 
be needed without pre-emphasis, thus alleviating the de-emphasis to obtain a larger signal swing. The 
combination of CM FIR equalization and VM pre-emphasis allows the line driver to take 






Figure. 30. “PreEmp_Pulse” generation. 
Fig. 30 shows a simplified timing diagram that depicts the generation of the PreEmp_Pulse. 
An XOR with the two inputs Data_In and its delayed stream Data_Delay is used to generate the 
PreEmp_Pulse. The one-bit delay cell delays the input data by one bit period, which 
determines the width of the PreEmp_Pulse.  Logic ‘1’ of PreEmp_Pulse turns on the 
corresponding VM PreEmp cell and provides extra current during the pre-emphasis period, thus 
generating a voltage peak in the transmitted data pulses. As can be seen from the timing waveforms 
in Fig. 30, pre-emphasis is enabled by PreEmp_Pulse, and in Data_Emphasized, the voltage peak 
is applied to the bits after the transition edges of the transmitted data. 
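The pulse generation just described can be mimicked at the bit level. This is a behavioral sketch only: the list-based one-bit delay, the boost value and the function names are assumptions for illustration, and the real circuit operates on continuous-time waveforms.

```python
def preemp_pulse(data_in):
    """XOR of the data with its one-bit-delayed copy; 1 marks the bit after a transition."""
    # Assume the line held the value of the first bit before the sequence starts.
    data_delay = [data_in[0]] + data_in[:-1]
    return [d ^ dd for d, dd in zip(data_in, data_delay)]


def emphasized_levels(data_in, boost=0.3):
    """Normalized output level per bit: +/-1, enlarged by `boost` while the pulse is high."""
    pulse = preemp_pulse(data_in)
    return [(1.0 if d else -1.0) * (1.0 + boost * p) for d, p in zip(data_in, pulse)]


data = [0, 0, 1, 1, 1, 0, 1, 0, 0]
pulse = preemp_pulse(data)
levels = emphasized_levels(data)
print(pulse)
print(levels)
```

Only the first bit after each transition receives the extra drive, and repeated bits keep the nominal level. This is the difference from FIR equalization, where the repeated bits are de-emphasized.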













Figure. 31. The enable-logic and SST output stage in main driver cell and PreEmp cell. 
     The schematic of the VM main driver cell and VM PreEmp cell is shown in Fig. 31, which 
includes the Enable Logic and the SST output stage. The Enable Logic consists of 
combinational logic and two stages of inverter-type buffers for the differential path. The function of 
the Enable Logic is to switch on and off the corresponding SST output stage under the control of the 
Enable and PreEmp_Pulse signals. The VM driving cells are turned on when both Enable and 
PreEmp_Pulse are high. When either Enable or PreEmp_Pulse is low, OnN and InP are set 
low, and OnP and InN go high, which shuts off the current flow and turns off the 

































controlled by Enable signal, the PreEmp_Pulse inputs are always set high for the main driver cells. 
For good jitter performance, it is important to maintain fast rise and fall time for all dynamic nodes 
inside the Enable Logic and the SST output stages. 
 
Figure. 32. The current flow in VM driver using SST output stages. 
     Fig. 32 depicts the current flow in the differential VM driving cells, which determines the 
differential output signal swing of the VM drivers. The blue current arrows represent the case in 
which the input data bit is “0” for the N cells and “1” for the P cells, when the PMOS switching 
transistors in the N cells and the NMOS switching transistors in the P cells are on.  The current 
flows from the VDD of the N cells through two series resistors R, the twinax cables and the 
100 Ohm external termination resistor. Similarly, the case of 





































single-ended voltage swing at the receiving end is actually the voltage across the 100 Ohm external 
termination resistor.  The output resistance of the VM driver is set to 50 Ohm; therefore, the 
differential output signal swing of the VM driver is $2 \cdot \frac{100}{2R + 100} \cdot V_{DD}$, which is 1.1 V in the design.  
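The swing follows from the resistive divider formed by the two 50 Ohm driver resistances and the 100 Ohm termination. The sketch below works through that arithmetic. The function name, the assumption of ideal switches (no voltage drop across the transistors) and the 1.1 V supply value used in the call are simplifications made for this example.

```python
def vm_diff_swing(vdd, r_series=50.0, r_term=100.0):
    """Peak-to-peak differential swing across the termination of an ideal VM driver."""
    # Current path: VDD -> R -> cable -> termination -> cable -> R -> ground.
    v_term = vdd * r_term / (r_term + 2.0 * r_series)  # termination voltage for one bit value
    return 2.0 * v_term  # the polarity reverses for the other bit value


swing = vm_diff_swing(1.1)
print(f"{swing:.2f} V")
```

With matched 50 Ohm series resistances the termination sees half the supply in each polarity, so the peak-to-peak differential swing equals the supply voltage.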
 





Figure. 34. CML driver with only CM TXEQ. 
 
     The proposed hybrid line driver is implemented with programmability. Fig. 33 shows the 
line driver programmed as a hybrid driver without pre-emphasis, which is comprised of the VM 
main driver and CM TXEQ. Fig. 34 shows the line driver switched to a CML driver with only CM TXEQ 
(without pre-emphasis). Measurement results shown in section IV provide a comparison on the 









4.3 DUNE Transmitter Design 
 
Figure. 35. The block diagram of transmitter. 
     Fig. 35 shows the DUNE transmitter, which is a data concentrator that captures the 
incoming digital data streams from several ADCs and sends the data from the cryogenic environment 
(liquid argon) over long twinax cables to the warm interface. The transmitter consists of two 
channels of 8b10b encoder, an 8/10:1 serializer and a line driver. The phase locked loop (PLL) 
generates a high-speed clock to synchronize the transmitter [46]-[47]. The transmitter implements 
standard 8b10b encoding, and encoding can also be switched off and replaced with PRBS7/15 to 











































Figure. 36. The block diagram of the serializer. 
 

















Figure. 38. The block diagram of the 4/5-to-1 MUX. 
     A block diagram of the serializer and the schematics of the divide-by-4/5 divider and the 4/5:1 
MUX are shown in Fig. 36, Fig. 37 and Fig. 38, respectively.  In the normal 10-bit mode, incoming 
parallel data at 128 MHz is input to two 5:1 multiplexers, whose outputs are then input to a 2:1 
multiplexer, registered by a D flip-flop, and output to the line driver.  The 640 MHz and 128 MHz 
clocks required by the multiplexers are produced in each serializer stage. The 128 MHz clock is 
also fed to the FIFO, which provides synchronized data to the serializer.  An 8-bit mode is included 
for debugging purposes. In 8-bit mode, parallel data is input at 160 MHz and the first multiplexer 



























4.4  Design Considerations for Lifetime Reliability 
It is required that the electronics used in DUNE, including the integrated circuits (ICs), have a 
lifetime of at least 20 years for the duration of the experiment. The read-out circuits, including the 
proposed transmitter, will be immersed in liquid argon at cryogenic temperature (89K). The 
challenge is to design the transmitter to operate at such a low temperature with the required lifetime.  
Some aspects of transistor performance, such as thermal noise, speed and driving current, 
improve at cryogenic temperature. However, at lower temperature, hot carriers in the CMOS 
devices acquire larger energies and occur more frequently. Therefore, hot carriers degrade 
the transistor lifetime at cryogenic temperature, which is a primary concern for DUNE electronics 
design.  
Our previous research carried out lifetime studies on commercial 130 nm and 65 nm CMOS 
processes [48]-[52]. It was reported that the 65 nm process is more resistant to cryogenic hot carrier 
degradation than the 130 nm process. The transistors in the 130 nm process need their 
nominal voltage reduced from 1.5 V to 1.49 V in order to reach a 20-year lifetime, whereas for the 
65 nm process, any voltage below 1.3 V meets the 20-year lifetime requirement. Given that the 
lifetime is inversely proportional to the supply voltage of the transistors, a 1.1 V supply voltage is 
chosen for the 65 nm CMOS process in the DUNE cold electronics, including the proposed DUNE 
transmitter, to push the lifetime further up to 30 years. 
In addition, the studies also showed that lifetime increases with increasing transistor lengths, 
but remains unaffected by the transistor width. The width dependence is negligible for short 




such as speed, 90 nm instead of 65nm minimum transistor length is chosen to be used in the DUNE 
cold electronics IC design. 
 
4.5 Measurement Results  
 
Figure. 39. Chip die photograph and corresponding layout view. 
The proposed hybrid line driver is designed and fabricated in a 65 nm technology. The 
front-end transmitter is implemented as part of the entire ASIC. The die photo of a part of the chip 
and the corresponding layout of the transmitter are shown in Fig. 39. The transmitter has two data 
transmission channels which share one phase-locked loop (PLL). Therefore, the transmitter layout 
contains one PLL, two serializer channels (SER1 and SER2), and two channels of the proposed 
line driver (DRV1 and DRV2). The core area of the proposed hybrid line driver takes about 180 









which is close to that of liquid argon (89K). The proposed hybrid line driver is tested by immersing 
the entire chip in liquid nitrogen. Its ability to drive 25 m and 35 m cables is verified for both 
warm cable (300K) and cold cable (77K).  Measurement results indicate that all three driver 
configurations, namely the proposed hybrid line driver, the hybrid driver without pre-emphasis, and 
the CML driver with only CM TXEQ, are able to drive the long cables. PRBS7 and PRBS15 data 
patterns are generated and transmitted; for all three driver configurations, both data patterns and 
their pattern lengths are received and recognized by the Tektronix DSA 72004C oscilloscope. 
                                                                




                                                          
(b) 
Figure. 40. Test set up and environment for line driver measurement. 
 





   
(b) 
Figure. 41. Insertion loss of (a) 35m and (b) 25m twinax cable at room and cryogenic temperature. 
 
Fig. 40 shows the test setup and environment for the line driver measurement. As can be seen 
from the left picture, the Samtec twinax cable (blue) is coiled at the bottom of a cable tube with 
a diameter of about 6 inches. When testing with the cold cable, the cable tube is put into a large 
liquid nitrogen dewar, and the long cable is completely immersed in liquid nitrogen to be cooled 
down to cryogenic temperature, as shown in Fig. 40 (b). The output of the line driver is connected 
to one end of the long cable, and the other end of the long cable is connected to the Tektronix DSA 
72004C oscilloscope without any equalizer at the receiving end. Fig. 41 (a) shows that the insertion 
loss of the 35-meter cable is 18 dB at 77K and 24.9 dB at 300K, and Fig. 41 (b) shows the insertion 





Figure. 42. The measured transmitted eye diagrams and bathtub curves of the serializer 
output for input data of (a) PRBS-7 and (b) PRBS-15. 
Fig. 42 shows the measured transmitted eye diagrams and bathtub curves of the serializer 
output signals, which are obtained by programming the driver to a conventional CML driver that 
drives a differential pair of short coaxial cables. The PRBS-7 input data pattern yields a 361 mV eye 
height and a 693 ps eye width (0.89 UI). The PRBS-15 input data pattern yields a 351 mV eye height 
and a 664 ps eye width (0.85 UI). The total jitter (TJ), random jitter (RJ) and deterministic jitter (DJ) 
for PRBS-7 are 90.7 ps, 1.1 ps and 74.8 ps, respectively. For the PRBS-15 data pattern, they are 117.5 ps, 








                      
 
Figure. 43. The measured transmitter output waveforms and eye diagram at the end of 25m 
twinax cable driven by CML driver without TXEQ and pre-emphasis. 
 
Fig. 43 shows the differential waveforms of the transmitted data stream after the 25 m 
twinax cable and its completely closed eye diagram. These results are also obtained by programming 
the driver as a conventional CML driver and turning off the TXEQ and VM cells. The different 
amplitudes of the high-frequency and low-frequency patterns result from the large channel loss. The 
long rise and fall times on the data edges are caused by the large capacitive load and low channel bandwidth.  
 







Figure. 44. Measured eye-diagrams after (a) 25 m twinax cable and (b) 35 m twinax cable 
for PRBS7 data pattern. Each part shows the proposed hybrid line driver, the hybrid driver without 
pre-emphasis, and the CML driver with CM EQ. 













                         
Figure. 45. Measured bathtub curve after (a) 25 m twinax cable and (b) 35 m twinax cable for 
PRBS7 data pattern. Curves: hybrid EQ with pre-emphasis, hybrid EQ without pre-emphasis, and 
current-mode EQ. Annotated eye openings: 0.67 UI, 0.86 UI, 0.69 UI and 0.60 UI @10E-12. 





     Fig. 44 (a) and (b) show the measured eye diagrams after the 25 m twinax cable and the 35 m 
twinax cable for the PRBS7 data pattern, driven by the proposed hybrid line driver with CM 
TXEQ and VM pre-emphasis. Fig. 45 (a) and (b) show the measured bathtub curves corresponding 
to the eye diagrams in Fig. 44. The CML driver with only CM TXEQ consumes the most power 
among the three configurations but obtains the smallest eye height. Compared to the CML driver, 
the hybrid drivers reduce power consumption significantly and increase the eye height by using 
the VM SST main driver. The only advantage that the CML driver has over the hybrid ones is 
better jitter performance (larger eye width), since there is no data-dependent current injected into 
the supply rails. Although the CML driver obtains a slightly better eye width, all three configurations 












Table 1 Measured eye height, eye width and power consumption of the three driver configurations 

PRBS7                 25m Warm Cable                                  25m Cold Cable
                      Eye Height (mV)  Eye Width (ps)  Power (mW)     Eye Height (mV)  Eye Width (ps)  Power (mW)
Proposed Hybrid       161              537             7.1            278              613             4.1
Hybrid w/o Pre-Emp    112              523             6.5            233              607             4
CML with CM EQ        87               670             16.6           200              690             12.9

PRBS15                25m Warm Cable                                  25m Cold Cable
                      Eye Height (mV)  Eye Width (ps)  Power (mW)     Eye Height (mV)  Eye Width (ps)  Power (mW)
Proposed Hybrid       134              360             7.1            243              459             4.1
Hybrid w/o Pre-Emp    76               394             6.5            204              493             4
CML with CM EQ        73               492             16.6           180              628             12.9

PRBS7                 35m Warm Cable                                  35m Cold Cable
                      Eye Height (mV)  Eye Width (ps)  Power (mW)     Eye Height (mV)  Eye Width (ps)  Power (mW)
Proposed Hybrid       81               468             7.8            247              566             5
Hybrid w/o Pre-Emp    52               488             7.1            213              615             4.6
CML with CM EQ        50               582             18.4           165              673             14.2

PRBS15                35m Warm Cable                                  35m Cold Cable
                      Eye Height (mV)  Eye Width (ps)  Power (mW)     Eye Height (mV)  Eye Width (ps)  Power (mW)
Proposed Hybrid       51               245             7.8            212              452             5
Hybrid w/o Pre-Emp    32               165             7.1            181              505             4.6
CML with CM EQ        45               391             18.4           153              615             14.2





   Table 1 extends the measurement results to both PRBS7 and PRBS15 data patterns with 
cable lengths of 25 m and 35 m. Compared to the CML driver with only CM TXEQ, the proposed 
hybrid line driver consumes 57%-68% less power while obtaining 13%-85% more eye height. 
Compared to the hybrid driver without pre-emphasis, the proposed hybrid line driver obtains 
16%-76% more eye height and only takes 2.5%-9% more power. The BER performance of the 
proposed hybrid driver is evaluated using the Xilinx KC705 evaluation kit with the 
PRBS15 data pattern and the warm 25 m cable. The data stream ran for three weeks and a 
total of 2.38×10^15 bits were received with no error. Assuming Poisson statistics, this corresponds 
to a BER of better than 10^-15. These results show the effectiveness of the proposed hybrid 
line driver. Given its large eye height and low power consumption, the presented hybrid driver is 
considered a more suitable architecture for the DUNE application. 
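The error-free measurement can be turned into a statistical statement with the usual zero-error Poisson bound. The sketch below is a generic calculation rather than the procedure used for the measurement, and the function names are assumptions made for this example.

```python
import math


def confidence_ber_below(n_bits, ber_limit):
    """Confidence that the true BER is below ber_limit after n_bits with zero errors."""
    # With zero errors observed, P(0 errors | BER) = exp(-n_bits * BER).
    return 1.0 - math.exp(-n_bits * ber_limit)


def ber_upper_bound(n_bits, confidence):
    """BER upper bound at the given confidence level after n_bits with zero errors."""
    return -math.log(1.0 - confidence) / n_bits


N_BITS = 2.38e15  # error-free bits reported in the measurement

conf_1e15 = confidence_ber_below(N_BITS, 1e-15)
bound_95 = ber_upper_bound(N_BITS, 0.95)
print(f"confidence that BER < 1e-15: {conf_1e15:.3f}")
print(f"95% upper bound on BER:      {bound_95:.3e}")
```

Under these assumptions the measurement supports a BER below 10^-15 at roughly 90% confidence; a 95% confidence statement corresponds to a slightly looser bound of about 1.3×10^-15.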
Table 2 compares this work with other state-of-the-art designs with similar data rates. 
The proposed hybrid line driver compensates the largest channel loss, and achieves comparable 








Table 2 Performance comparison with state-of-the-art line drivers 

                                [3]     [5]     [7]     [11]     [13]     This work
Process (nm)                    65      90      90      45 SOI   90
Supply (V)                      1.05    1.15    1.2     1.08     1.2
Driver Topology                 CM      CML     VM      VM       VM SST   Hybrid
Data Rate (Gbps)                15      4       5       7.4      6
TX Swing (mVpp)                 160     500     400     800      300
Channel Loss at Nyquist (dB)    10      8-12    6       2.6      3.7      25 / 13
Power (mW)                      34      17      4.9     32       4.1      7.8 / 4.1
Energy Efficiency (pJ/bit)      2.3     4.25    0.98    4.3      1.26     6.1 / 3.1











CHAPTER 5  CONCLUSION 
 
5.1 Summary  
A new asynchronous serial transceiver that supports an auxiliary channel yielding additional 
data transmission capability is demonstrated in a 0.13 mm^2 65 nm CMOS IC. The prototype 
transceiver allows both its primary and auxiliary data streams, at 2.56 Gbps and 80 Mbps 
respectively, to be recovered simultaneously with good jitter and BER performance.  An analysis 
of the auxiliary data rate that can be achieved from the available channel bandwidth margin is given 
in this dissertation. The contribution supports secure methods by offering a way to use an 
additional auxiliary channel and extra data bandwidth for potential security measures such 
as the inclusion of authentication data, additional support for encryption or other methods requiring 
another channel or more bandwidth, steganography, etc.  An additional novelty is that this 
auxiliary channel in the serial transceiver is provided in a way that offers backward compatibility, 
interoperability with non-equipped designs, and minimal redesign of existing systems. 
A hybrid transmitter whose proposed line driver combines VM pre-emphasis and CM TXEQ has 
been designed, implemented and measured at cryogenic temperature to compensate for the large 
frequency-dependent channel loss introduced by 25-35 meter twinax cables. A large eye height is 
obtained by employing the SST-based VM main driver and VM pre-emphasis together with the CM 
TXEQ architecture. The combination of the VM SST cells and CM TXEQ also offers the benefits 





5.2 Future Work 
Further development of the dissertation topics should aim to boost the transmitted data rate. The 
goal of the first work in this dissertation is to propose a new modulation and demodulation scheme 
based on the conventional transceiver architecture to provide an auxiliary channel and extra 
bandwidth.  This work is our first prototype, and its purpose is to verify the validity of the new 
scheme. Thus, this prototype does not operate at a very high speed or over a high-loss channel, and 
no equalization technique is used in the first work. In the future, boosting the data rate 
and maintaining low power consumption in the transceiver should be given priority attention in 






[1] Scott-Hayward, S., Natarajan, S. and Sezer, S., “A Survey of Security in Software Defined 
Networks,” IEEE Communications Surveys & Tutorials, 18(1), pp.623-654, 2016. 
[2] Guo, X., Dutta, R. and Jin, Y., “Eliminating the Hardware-Software Boundary: A Proof-
Carrying Approach for Trust Evaluation on Computer Systems,” IEEE Transactions on 
Information Forensics and Security, 12(2), pp.405-417, 2017. 
[3] Stytz, M. and Whittaker, J., “Software protection: security's last stand?” IEEE Security & 
Privacy Magazine, 1(1), pp.95-98, 2003. 
[4] Zhang, Z., Njilla, L., Kamhoua, C. and Yu, Q., “Thwarting Security Threats from Malicious 
FPGA Tools with Novel FPGA-Oriented Moving Target Defense,” IEEE Transactions on 
Very Large Scale Integration (VLSI) Systems, pp.1-14, 2018. 
[5] Ramirez, R. and Choucri, N., “Improving Interdisciplinary Communication with Standardized 
Cyber Security Terminology: A Literature Review,” IEEE Access, 4, pp.2216-2243, 2016. 
[6] Wolf, M. and Serpanos, D., “Safety and Security in Cyber-Physical Systems and Internet-of-
Things Systems,” Proceedings of the IEEE, 106(1), pp.9-20, 2018. 
[7] Gogniat, G., Wolf, T., Burleson, W., Diguet, J., Bossuet, L. and Vaslin, R., “Reconfigurable 




Perspective,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 16(2), 
pp.144-155, 2008. 
[8] S. Zerafshan Goher, et al., “Covert channel detection: A survey based analysis,” International 
Conference on High Capacity Optical Networks and Enabling Technologies (HONET), Dec. 
2012. 
[9] S. Zander, G. Armitage, and P. Branch, “A Survey of Covert Channels and Countermeasures 
in Computer Network Protocols,” IEEE Communications Surveys & Tutorials, vol. 9, no. 3, 
October 2007, pp. 44-57. 
[10] Khoury, J. and Lakshmikumar, K., “High speed serial transceivers for data 
communication systems,” IEEE Communications Magazine, 39(7), pp.160-165, 2001. 
[11] Lee, K. and Sim, J., “Half-Rate Clock-Embedded Source Synchronous Transceivers in 
130-nm CMOS,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 
22(10), pp.2093-2102, 2014. 
[12] Navid, R., Chen, E., Hossain, M., Leibowitz, B., Ren, J., Chou, C., Daly, B., Aleksic, M., 
Su, B., Li, S., Shirasgaonkar, M., Heaton, F., Zerbe, J. and Eble, J., “A 40 Gb/s Serial Link 
Transceiver in 28 nm CMOS Technology,” IEEE Journal of Solid-State Circuits, 50(4), 
pp.814-827, 2015. 
[13] Chen, M., Shih, Y., Lin, C., Hung, H. and Lee, J., “A Fully-Integrated 40-Gb/s Transceiver 
in 65-nm CMOS Technology,” IEEE Journal of Solid-State Circuits, 47(3), pp.627-640, 2012. 
[14] Savoj, J., Hsieh, K., An, F., Gong, J., Im, J., Jiang, X., Jose, A., Kireev, V., Lim, S., 




Wireline Transceiver Embedded in Low-Cost 28 nm FPGAs,” IEEE Journal of Solid-State 
Circuits, 48(11), pp.2582-2594, 2013. 
[15] B. Casper and F. O’Mahony, “Clocking Analysis, Implementation and Measurement 
Techniques for High-Speed Data Links – A Tutorial,” IEEE Transactions on Circuits and 
Systems I: Regular Papers (TCAS-I), vol. 56, no. 1, pp. 17-39, Feb. 2009.  
[16] Wu, X., Yang, Z., Ling, C. and Xia, X., “Artificial-Noise-Aided Message Authentication 
Codes with Information-Theoretic Security,” IEEE Transactions on Information Forensics 
and Security, 11(6), pp.1278-1290, 2016.  
[17] Reviriego, P., Liu, S., Xiao, L. and Maestro, J., “An Efficient Single and Double-Adjacent 
Error Correcting Parallel Decoder for the (24,12) Extended Golay Code,” IEEE Transactions 
on Very Large Scale Integration (VLSI) Systems, 24(4), pp.1603-1606, 2016. 
[18] Deping Huang, “Design techniques for timing circuits in wireline and wireless 
communication systems,” Ph.D. dissertation, University of Arizona, Tucson, AZ, 2014. 
[19] Behzad Razavi, “Challenges in the design of high-speed clock and data recovery circuits,” 
IEEE Communications Magazine, vol 40, no. 8, pp. 94-101, 2002. 
[20] R. Staszewski, J. Wallberg, S. Rezeq, C. Hung, S. Eliezer, S. Vemulapalli, C.         Fernando, 
K. Maggio, R. Staszewski, N. Barton, M. Lee, P. Cruise, M. Entezari, K. Muhammad and D. 
Leipold, "All-digital pll and transmitter for mobile phones," IEEE Journal of Solid-State 
Circuits, vol. 40, no. 12, pp. 2469-2482, Dec. 2005  
[21] E. Temporiti, C. Weltin-Wu, D. Baldi, R. Tonietto and F. Svelto, "A 3 GHz Fractional all-




of Solid-State Circuits, vol. 44, no. 3, pp. 824-834, March 2009. 
[22] R. Walker, C. Stout and C. Yen, "A 2.488 Gb/s Si-bipolar clock and data recovery IC with 
robust loss of signal detection," in IEEE Int. Solid State Circuit Conf. Digest of Technical 
Papers, Feb. 1997. 
[23] J. Cao, M. Green, A. Momtaz, K. Vakilian, D. Chung, K. Jen, M. Caresosa, X. Wang, T. 
Wee, Y. Cai, I. Fujimori and A. Hairapetian, "OC-192 transmitter and receiver in standard 
0.18-μm CMOS," IEEE Journal of Solid-State Circuits, vol. 37, no. 12, pp. 1768-1780, Dec. 
2002. 
[24] H. Song, D. Kim, D. Oh, S. Kim and D. Jeong, "A 1.0–4.0-Gb/s All-Digital CDR with 1.0-
ps period resolution DCO and adaptive proportional gain control," IEEE Journal of Solid-State 
Circuits, vol. 46, no. 2, pp. 424-434, Feb. 2011.  
[25] X. Wang, T. Liu, S. Guo, M. A. Thornton, and P. Gui, “A 2.56 Gbps Asynchronous Serial 
Transceiver with Embedded 80 Mbps Auxiliary Data Transmission Capability in 65nm 
CMOS”, IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 360-363, 2018. 
[26] K. Fukuda, et al., “A 12.3mW 12.5Gb/s complete transceiver in 65nm CMOS,” IEEE 
International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), Mar. 2010. 
[27] S. Guo, et al., “A Low-Voltage Low-Power 25 Gb/s Clock and Data Recovery with 
Equalizer in 65 nm CMOS,” IEEE Radio Frequency Integrated Circuits Symposium (RFIC), 




[28] T. Liu, et al., “A Temperature Compensated Triple-Path PLL with KVCO Non-Linearity 
Desensitization Capable of Operating at 77 K,” IEEE Transactions on Circuits and Systems I: 
Regular Papers (TCAS-I), vol. 64, no. 11, pp. 2835-2843, Nov. 2017 
[29] C. Sanchez-Azqueta, et al., “Bang-bang phase detector model revisited,” IEEE International 
Symposium on Circuits and Systems (ISCAS), May 2013. 
[30] A. Ghiasi, “Impact of Transition Density on CDR,” IEEE 802.3bs Logic Adhoc Meeting, 
Feb. 2017. 
[31] S. Palermo, “CMOS Nanoelectronics Analog and RF VLSI Circuits. Chaper 9: High-Speed 
Serial I/O Design for Channel-Limited and Power-Constrained Systems,” McGraw-Hill, 2011. 
[32] Deep Underground Neutrino Experiment Technical Proposal, 
https://indico.fnal.gov/event/16429/contribution/0/material/slides/1.pdf  
[33] S. Miryala et al., "CDP1—A Data Concentrator Prototype for the Deep Underground 
Neutrino Experiment," in IEEE Transactions on Nuclear Science, vol. 66, no. 11, pp. 2338-
2345, Nov. 2019. 
[34] G. Balamurugan, J. Kennedy, G. Banerjee, J. E. Jaussi, M. Mansuri, F. O’Mahony, B. 
Casper, and R. Mooney, “A scalable 5–15 Gbps, 14–75 mW low power I/O transceiver in 65 
nm CMOS,” IEEE J. Solid- State Circuits, vol. 43, no. 4, pp. 1010–1019, Apr. 2008.  
[35] P. Chiang, H. Hung, H. Chu, G. Chen and J. Lee, "60Gb/s NRZ and PAM4 transmitters for 
400GbE in 65nm CMOS," 2014 IEEE International Solid-State Circuits Conference Digest of 





[36] R. Sredojevic and V. Stojanovic, "Fully Digital Transmit Equalizer With Dynamic 
Impedance Modulation," in IEEE Journal of Solid-State Circuits, vol. 46, no. 8, pp. 1857-1869, 
Aug. 2011, doi: 10.1109/JSSC.2011.2151530. 
[37] K. L. Chan et al., "A 32.75-Gb/s Voltage-Mode Transmitter with Three-Tap FFE in 16-nm 
CMOS," in IEEE Journal of Solid-State Circuits, vol. 52, no. 10, pp. 2663-2678, Oct. 2017. 
[38] K. Kwak, S. Hong and O. Kwon, "5 Gbit/s 2-tap low-swing voltage-mode transmitter with 
least segmented voltage-mode equalization," in Electronics Letters, vol. 50, no. 19, pp. 1371-
1373, 11 September 2014, doi: 10.1049/el.2014.2285. 
[39] N. Kocaman et al., “A 3.8 mW/Gbps quad-channel 8.5–13 Gbps serial link with a 5 tap 
DFE and a 4 tap transmit FFE in 28 nm CMOS,” IEEE J. Solid-State Circuits, vol. 51, no. 4, 
pp. 881–892, Apr. 2016.  
[40] Y. Lu, K. Jung, Y. Hidaka, and E. Alon, “Design and analysis of energy- efficient 
reconfigurable pre-emphasis voltage-mode transmitter,” IEEE J. Solid-State Circuits, vol. 48, 
no. 8, pp. 1898–1909, Aug. 2013.  
[41] M. Kossel et al., "A T-Coil-Enhanced 8.5 Gb/s High-Swing SST Transmitter in 65 nm 
Bulk CMOS with ≪ -16 dB Return Loss Over 10 GHz Bandwidth," in IEEE Journal of Solid-
State Circuits, vol. 43, no. 12, pp. 2905-2920, Dec. 2008. 
[42] W. D. Dettloff et al., “A 32 mW 7.4 Gb/s protocol-agile source-series- terminated 
transmitter in 45 nm CMOS SOI,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2010, pp. 370–371.  
[43] C. Fan, W. Yu, P. Mak and R. P. Martins, "A 40-Gb/s PAM-4 Transmitter Using a 0.16-
pJ/bit SST-CML-Hybrid (SCH) Output Driver and a Hybrid-Path 3-Tap FFE Scheme in 28-
nm CMOS," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 12, 




[44] Y. Song and S. Palermo, "A 6-Gbit/s Hybrid Voltage-Mode Transmitter with Current-
Mode Equalization in 90-nm CMOS," in IEEE Transactions on Circuits and Systems II: 
Express Briefs, vol. 59, no. 8, pp. 491-495, Aug. 2012.  
[45] X. Wang and Ping Gui, "A Hybrid Line Driver with Voltage-Mode SST Pre-Emphasis and 
Current-Mode Equalization," 2020 IEEE 63rd International Midwest Symposium on Circuits 
and Systems (MWSCAS). 
[46] T. Liu et.al., “A Temperature Compensated Triple-Path PLL with KVCO Non-Linearity 
Desensitization Capable of Operating at 77K,” IEEE Transactions on circuits and systems I, 
Vol. 64, no. 11, Nov. 2017.  
[47] X. Wang et.al, “A 2.56 Gbps Serial Wireline Transceiver that Supports an Auxiliary 
Channel in 65 nm CMOS,” IEEE Transactions on Very Large Scale Integration (VLSI) 
Systems, in press, 2019.  
[48] Gianluigi De Geronimo et.al., “Front-end ASIC for a liquid argon TPC,”  IEEE Nuclear 
Science Symposium and Medical Imaging Conference, 2010.  
[49] J. R. Hoff et al., "Lifetime Studies of 130 nm nMOS Transistors Intended for Long-
Duration, Cryogenic High-Energy Physics Experiments," in IEEE Transactions on Nuclear 
Science, vol. 59, no. 4, pp. 1757-1766, Aug. 2012, doi: 10.1109/TNS.2012.2203828. 
[50] Shaorui Li., Jie Ma, Gianluigi De Geronimo, Hucheng Chen and Veliko Radeka, “LAr 
TPC electronics CMOS lifetime at 300K and 77K and reliability under thermal cycling,”  IEEE 





[51] J. R. Hoff, G. W. Deptuch, G. Wu and P. Gui, “Cryogenic Lifetime Studies of 130nm and 
65nm Technologies for High-Energy Physics Experiments,”  IEEE Transactions on Nuclear 
Science, 2015.  
[52] G. Wu, G. W. Deptuch, J. R. Hoff and P. Gui, "Degradations of Threshold Voltage, 
Mobility, and Drain Current and the Dependence on Transistor Geometry For Stressing at 77 
K and 300 K," in IEEE Transactions on Device and Materials Reliability, vol. 14, no. 1, pp. 
477-483, March 2014, doi: 10.1109/TDMR.2013.2279175. 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
