A High-speed and Low Power Electrical Link Transceiver by Jia, Xiangdong








Electrical and Computer Engineering 
 
Presented in Partial Fulfillment of the Requirements for the 
Degree of Master of Applied Science (Electrical and Computer Engineering)  
At 
 
Concordia University  
Montréal, Québec, Canada 
March 2017 
© XIANGDONG JIA, 2017  
CONCORDIA UNIVERSITY 
SCHOOL OF GRADUATE STUDIES 
 
 
This is to certify that the thesis prepared 
 
By:   Xiangdong Jia 
  
Entitled: A High-Speed and Low Power Electrical Link Transceiver 
 
and submitted in partial fulfillment of the requirements for the degree of  
 
 Master of Applied Science (Electrical and Computer Engineering) 
 
complies with the regulations of this University and meets the accepted standards with respect to 
originality and quality. 
 
 
Signed by the final examining committee: 
 
 
 ________________________________________________ Chair 
  Dr. C. Wang 
 
 ________________________________________________ External Examiner 
  Dr. J. Sadri 
 
 ________________________________________________ Internal Examiner 
  Dr. R. Raut 
 
 ________________________________________________ Supervisor 





Approved by:  ___________________________________ 
 Dr. W.E. Lynch, Chair 




_______________ 20___ __________________________________ 
       Dr. Amir Asif, Dean, 




A HIGH-SPEED AND LOW POWER ELECTRICAL LINK TRANSCEIVER 
Xiangdong Jia 
 
On-chip wires will present increasing latency and energy problems as VLSI technologies 
continue to scale. Interconnects have an RC-limited bandwidth approximately proportional to the 
area of the metal cross section and inversely proportional to the squared length.  To overcome 
RC-limited channels, an energy-efficient on-chip transceiver is presented that contains a hybrid 
transmitter, a current-sense receiver, and self-testing blocks. The main goal of this research is
to obtain a relatively low-power transceiver that can be used as an on-chip communication
system.
By adding a pre-emphasis circuit in the transmitter, pre-cursor inter-symbol interference 
can be canceled.  A hybrid transmitter which combines voltage-mode pre-emphasis with a 
current-mode main driver is used. This structure can save pre-emphasis current, and leads to 
reduced power dissipation especially in the static situation. A current-sense amplifier is 
implemented with a cross-coupled stage and an active inductor equalizer at the receiver, in order 
to boost the data rate while maintaining good energy efficiency. An offset cancelation circuit is 
incorporated to make a robust comparator for the receiver. According to simulation results, the
transceiver achieves low power consumption in a 1.2 V, 130 nm CMOS technology and
operates at 8 Gb/s over a 5 mm differential channel with 19 dB of loss.
The overall dynamic power consumption is 2.05 mW, without the PRBS generator/checker.





First and foremost, I would like to express my gratitude to my supervisor, Dr. Glenn 
Cowan, for his guidance and support throughout my Master’s study. During my research period, 
he was a great mentor, manager, and supporter. His keen and vigorous academic observation
has enlightened me not only in this thesis but also for my future life.
I want to thank Mr. Ted Obuchowicz for his help with CAD tools, as well as his support 
for chip fabrication.  
I would like to thank our group members Chris Williams, Weihao Ni, Abdullah Ibn
Abbas, Diaa Eldin Mahmoud, Marc Alexandre Chan, Sanjoy Basak, Marjan Mdani, and
short-term member Shamita Tabassum Nur. Working with them was an enjoyable and educational
journey.
I also want to thank my good friend Vernon Elmo Paul for hanging out with me and
encouraging me.
Finally, I cannot thank my parents and my girlfriend, Yan Gao, enough. Their





Chapter 1 Introduction
1.1 Motivation
1.2 Thesis Objective
1.3 Thesis Contribution
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Transmitter
2.1.1 VM Driver
2.1.2 CM Driver
2.1.3 Channel
2.1.4 Single-ended and differential comparison
2.1.5 Advanced transmitters
2.2 Receiver
2.3 Pseudo-random bit sequence (PRBS)
Chapter 3 Proposed Scheme
3.1 Hybrid Transmitter
3.1.1 Basic Considerations
3.1.2 CM Main Driver with VM Pre-Emphasis Architecture
3.2 On-chip Channel
3.3 Termination Comparisons
3.4 Current Sense Amplifier
3.5 Slicer Circuit with Offset Compensation Circuit
3.6 High-speed DFF
3.7 Clock Alignment Circuit
3.8 Shift Register
Chapter 4 Simulation and Layout
4.1 Simulation of Blocks
4.1.1 Hybrid Transmitter
4.1.2 RC Channel Response
4.1.3 Current Sense Amplifier
4.1.4 Slicer Circuit Simulation
4.1.5 High-speed DFF Simulation
4.1.6 Clock Alignment Circuit Simulation
4.1.7 On-chip transceiver simulation
4.2 Layout of Blocks
4.2.1 Hybrid Transmitter
4.2.2 Channel
4.2.3 Current Sense Amplifier
4.2.4 Slicer Circuit
4.2.5 High-speed DFF
4.2.6 Clock Alignment Circuit
4.2.7 Test Chip Layout
4.2.8 Test Chip Post Layout Simulation
Chapter 5 Measurement Plan and Comparisons with State-of-the-art
5.1 Measurement Plan
5.2 DC Measurements
5.3 Comparisons with State-of-the-art
5.3.1 Comparing with [1]
5.3.2 Comparing with [2]
5.3.3 Comparing with [6]
5.3.4 Comparing with [12], [13]





List of Figures 
Figure 1-1: On-chip transceiver circuit [1]
Figure 2-1: Basic on-chip transceiver system
Figure 2-2: Inverter-based VM driver
Figure 2-3: Thevenin-equivalent series termination
Figure 2-4: Supply voltage sensitivity analysis model for VM driver
Figure 2-5: Common source type CM driver
Figure 2-6: Norton-equivalent parallel termination
Figure 2-7: Supply voltage sensitivity analysis model for CM driver
Figure 2-8: 3D stacking [4]
Figure 2-9: 2.5D silicon interposer [4]
Figure 2-10: Transmitter of [2]
Figure 2-11: Transmitter of [12]
Figure 2-12: Transmitter of [13]
Figure 2-13: A constant Gm transmitter [14]
Figure 2-14: Inductive peaking approach
Figure 2-15: Active inductor
Figure 2-16: PRBS generator [3]
Figure 2-17: PRBS checker [3]
Figure 3-1: A CM driver with parallel resistors
Figure 3-2: Pre-emphasis equalization
Figure 3-3: Open-drain CM pre-emphasis
Figure 3-4: Proposed hybrid transmitter
Figure 3-5: Current bias of CM driver
Figure 3-6: Current peaking at input channel current
Figure 3-7: Equivalent circuit of VM driver
Figure 3-8: Cross-section of proposed channels model
Figure 3-9: Multilayer capacitance model
Figure 3-10: Three basic lumped models for on-chip channels
Figure 3-11: 5-mm RC channel model for proposed transceiver
Figure 3-12: Termination schemes
Figure 3-13: Current sense amplifier
Figure 3-14: Current sense amplifier equivalent circuit
Figure 3-15: Simulation of parallel current sources termination and parallel resistors termination
Figure 3-16: Comparator with offset compensation circuit
Figure 3-17: High-speed DFF
Figure 3-18: Clock alignment circuit
Figure 3-19: Shift register
Figure 4-1: Equivalent circuit of VM driver
Figure 4-2: Channel output current comparison
Figure 4-3: Input and output current of the 5-mm channel
Figure 4-4: Pre-emphasis current range
Figure 4-5: Pre-emphasis current with different RB
Figure 4-6: Output 5-mm channel current with different RB
Figure 4-7: Output 2.5-mm channel current with different RB
Figure 4-8: Channel output current at 1 Gb/s
Figure 4-9: Channel output current at 4.2 Gb/s
Figure 4-10: 5-mm on-chip channel loss
Figure 4-11: Input channel, output channel, and output current sense amplifier voltages
Figure 4-12: Slicer input mismatch simulation
Figure 4-13: Resolution simulation
Figure 4-14: Noise simulation
Figure 4-15: High-speed DFF simulation with 8 GHz clock
Figure 4-16: High-speed DFF simulation with 10 GHz clock
Figure 4-17: Clock alignment circuit simulation
Figure 4-18: PRBS checker output, input and output data of transceiver
Figure 4-19: Eye-diagram of transceiver voltage output
Figure 4-20: Power breakdown
Figure 4-21: Layout of hybrid transmitter
Figure 4-22: Layout of 5-mm on-chip channels
Figure 4-23: Layout of current sense amplifier
Figure 4-24: Layout of slicer circuit
Figure 4-25: Layout of high speed DFF
Figure 4-26: Layout of PRBS generator
Figure 4-27: Layout of PRBS checker
Figure 4-28: Layout of clock alignment circuit
Figure 4-29: Layout of 5-mm on-chip transceiver
Figure 4-30: Layout of transceiver blocks with power grid
Figure 4-31: Slicer circuit output voltage comparison
Figure 4-32: Input and output channel currents at 6 Gb/s
Figure 4-33: Input and output channel currents at 8 Gb/s
Figure 5-1: PCB testing board
Figure 5-2: Bit error output signal
Figure 5-3: Quiescent current performance of analog voltage supply
Figure 5-4: Quiescent current performance of digital voltage supply
Figure 5-5: VM pre-emphasis current simulation




List of Tables 




List of Acronyms 
 
BER   Bit Error Rate 
CM   Current Mode 
CMOS  Complementary Metal-Oxide Semiconductor 
DFE   Decision Feedback Equalization 
DFF   D-Flip Flop 
I/O   Input / Output 
ISI   Inter-Symbol Interference 
NMOS  N-type Metal-Oxide Semiconductor 
PCB   Printed Circuit Board 
PMOS  P-type Metal-Oxide Semiconductor 
PRBS   Pseudo-Random Bit Sequence 
RMS   Root Mean Square 
SNR   Signal-to-Noise Ratio 
TSV   Through-Silicon Via 
UI   Unit Interval 









Chapter 1 Introduction 
1.1 Motivation 
Although technology scaling has consistently improved transistor performance in
terms of gate switching delay, it has the opposite effect on on-chip channel latency.
The importance of an energy-efficient on-chip communication system is therefore
becoming increasingly clear. Such a system consists of three main parts: transmitter,
channel, and receiver. On-chip channels will present increasing latency and energy
problems as CMOS process technologies continue to scale. Interconnects have an
RC-limited bandwidth approximately proportional to the area of the metal cross section
and inversely proportional to the squared length. This has become a critical limitation
for on-chip transceivers.
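The scaling rule stated above can be checked with a short numerical sketch. It treats the wire as a single lumped RC, which is a simplification of the distributed channel, and the resistivity, capacitance per unit length, and wire dimensions are assumed illustrative values, not figures from this thesis.

```python
import math


def rc_bandwidth_hz(length_m, width_m, thickness_m,
                    resistivity_ohm_m=1.7e-8, cap_per_m=2e-10):
    """First-order bandwidth estimate of a wire modelled as one lumped RC.

    Assumptions (not from the thesis): copper-like resistivity and a
    capacitance per unit length that does not change with the wire width.
    """
    resistance = resistivity_ohm_m * length_m / (width_m * thickness_m)
    capacitance = cap_per_m * length_m
    return 1.0 / (2.0 * math.pi * resistance * capacitance)


# Hypothetical 5 mm wire, 1 um wide and 0.5 um thick.
base = rc_bandwidth_hz(5e-3, 1e-6, 0.5e-6)
double_length = rc_bandwidth_hz(10e-3, 1e-6, 0.5e-6)
double_area = rc_bandwidth_hz(5e-3, 2e-6, 0.5e-6)

print(base / double_length)  # bandwidth penalty for doubling the length
print(double_area / base)    # bandwidth gain for doubling the cross section
```

Under these assumptions, doubling the length costs a factor of four in bandwidth, while doubling the cross-sectional area gains only a factor of two.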
To compensate for the channel loss, two kinds of equalization can be used:
transmitter pre-emphasis and receiver equalization. Both approaches achieve this
by either boosting the high-frequency gain or reducing the low-frequency gain.
Recently, two transmitter pre-emphasis equalization techniques have been proposed
for energy-efficient data communication:
current-mode (CM) pre-emphasis and voltage-mode (VM) pre-emphasis.
As shown in Figure 1-1, CM pre-emphasis is implemented in [1]. The equalizer 
can provide a one unit-interval (UI) compensation current and achieve increased 
bandwidth. However, it suffers from static power dissipation. When there is no data 
transition, the output currents of the drivers subtract, generating a smaller output current. 
This subtraction is an inefficient way to generate the smaller transmit current. 
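The subtraction described above can be illustrated with an idealized two-tap model. This is a minimal sketch and not the circuit of [1]: the tap currents are hypothetical, and it assumes that both current sources conduct in every unit interval.

```python
def cm_pre_emphasis_current(bits, i_main, i_pre):
    """Output current per UI of an idealized two-tap current-summing driver.

    Each bit maps to +1 or -1. The main tap follows the current bit and the
    pre-emphasis tap opposes the previous bit, so the taps add on a data
    transition and subtract when the data repeats.
    """
    symbols = [1 if b else -1 for b in bits]
    previous = symbols[0]  # assume the line idled at the first symbol
    currents = []
    for symbol in symbols:
        currents.append(i_main * symbol - i_pre * previous)
        previous = symbol
    return currents


I_MAIN_UA = 100  # hypothetical main-driver current in microamperes
I_PRE_UA = 40    # hypothetical pre-emphasis current in microamperes

currents_ua = cm_pre_emphasis_current([0, 1, 1, 1, 0, 0], I_MAIN_UA, I_PRE_UA)
print(currents_ua)  # [-60, 140, 60, 60, -140, -60]
```

In this model the driver delivers only 60 µA to the channel when the data repeats, yet both sources still draw their full 140 µA combined, which is the static inefficiency the text refers to.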
 
Figure 1-1: On-chip transceiver circuit [1] 
 
VM pre-emphasis can save pre-emphasis current compared with CM pre-emphasis,
but this approach to equalization raises other concerns. To obtain a
low-swing output voltage, it requires an extra regulator. Hence, a large storage capacitor
is needed for voltage regulation, which occupies about 6000 µm² in [2]. Area
consumption thus becomes a major concern in on-chip transceiver design.
In order to maximize the data rate across the channel while maintaining adequate bit
error rate (BER) performance, a receiver should also have equalization circuits. Receiver
equalizers can be either passive or active. In this thesis, an active equalizer is
implemented on-chip. The active inductor circuit is preferred because it does not require
high voltage headroom. The circuit can increase the signal gain over a wide
bandwidth.
In order to simplify testing, a self-test circuit is added on the chip, which is able to
generate a pseudo-random bit sequence (PRBS) and check for correct transmission. A
PRBS generator and checker [3] are implemented at the transmitter and receiver side,
respectively. Because of the speed limitation of the technology, a custom D flip-flop
(DFF) is designed to achieve 8 Gb/s performance.
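The self-test idea can be sketched in software as follows. The sequence length used on the chip is not stated in this section, so the PRBS-7 polynomial x^7 + x^6 + 1 below is an assumption chosen for illustration, and the function names are hypothetical.

```python
from itertools import islice


def prbs7(seed=0x7F):
    """Yield the bits of a PRBS-7 sequence (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def prbs7_check(received):
    """Count bits that disagree with the value predicted from earlier bits.

    The checker is self-synchronising: each bit must equal the XOR of the
    bits received seven and six positions earlier.
    """
    errors = 0
    for n in range(7, len(received)):
        expected = received[n - 7] ^ received[n - 6]
        errors += received[n] != expected
    return errors


bits = list(islice(prbs7(), 254))   # two full periods of 127 bits
corrupted = bits[:]
corrupted[50] ^= 1                  # inject a single bit error

print(prbs7_check(bits))       # clean link
print(prbs7_check(corrupted))  # link with one flipped bit
```

Because the checker predicts each bit from the received stream itself, it needs no seed alignment with the generator, at the cost of one channel error being flagged more than once.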
 
 1.2 Thesis Objective 
The transmitter plays a vital role in the on-chip transceiver design. The main 
objective of this thesis is to explain the design and implementation of a low-power high-
speed on-chip transceiver with a hybrid transmitter. The specific objectives of this thesis 
are the following: 
• Maximum energy use per bit: 1 pJ/b.
• Lower transmitter power dissipation compared with the state of the art.
• Modeling the 5 mm on-chip channel.
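As a quick check of the first objective, the energy per bit follows from dividing power by data rate. The sketch below uses the simulated figures quoted in the abstract (2.05 mW of dynamic power at 8 Gb/s, excluding the PRBS generator and checker); the function name is hypothetical.

```python
def energy_per_bit_pj(power_w, data_rate_bps):
    """Energy per transmitted bit in picojoules."""
    return power_w / data_rate_bps * 1e12


TARGET_PJ_PER_BIT = 1.0  # objective stated in the list above

# Simulated values from the abstract, not measurements.
energy = energy_per_bit_pj(power_w=2.05e-3, data_rate_bps=8e9)

print(round(energy, 3))            # about 0.256 pJ/b
print(energy < TARGET_PJ_PER_BIT)  # True
```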
 
1.3 Thesis Contribution 
This thesis presents the design and implementation of a low-power high-speed on-
chip transceiver with its own self-test circuit. The specific contributions of this thesis are 
described below: 
• Proposing a scheme to achieve a low-power, high-speed link.
• Customizing a high-speed DFF for the PRBS blocks in IBM 130 nm technology.
• Implementing the transceiver in an integrated circuit using IBM 130 nm technology.
• This design has been accepted by the IEEE International Symposium on Circuits & Systems
(ISCAS) 2017 conference [19].
1.4 Thesis Organization 
The thesis has a total of six chapters. In Chapter 2, background theory and 
fundamentals are presented. Chapter 3 gives the overall description of the design of the 
on-chip transceiver. The layout and the simulation results are shown in Chapter 4. 
Chapter 5 gives a comparison with other approaches. Finally, Chapter 6 gives the
conclusion, along with a discussion of potential future work.
Chapter 2 Literature Review 
In this chapter, the main parts of a basic high-speed I/O configuration are
introduced. Figure 2-1 shows the basic electrical on-chip transceiver. A transmitter sends
an electrical signal to a receiver through an electrical link. Unfortunately, the
electrical link is not ideal when operated at high frequency. It acts as a lossy
transmission line, which is usually modeled as an RLC network.
Both the transmitter and the receiver provide equalization while
remaining low power. For testing purposes, PRBS blocks can be placed on the chip.
 
 
Figure 2-1: Basic on-chip transceiver system 
 
2.1 Transmitter 



















generate either an accurate voltage or current swing into the channel. VM and CM
drivers are the two main kinds of transmitter. To overcome the lossy channel, both
can include an equalization block that counteracts the low-pass response of the
on-chip channel.
2.1.1 VM Driver 
VM drivers are widely used in low-power designs because they consume less
current than current-mode logic. A typical VM driver, shown in Figure 2-2, uses an
inverter-based scheme, which adds a series resistor to match the channel impedance. The
circuit can be replaced by a Thevenin-equivalent series termination, shown in Figure 2-3.




                                                       (2-1) 
where RC is the resistance of the channel. In order to compare power efficiency, RC is
assumed to be equal to RTX, where RTX is the output resistance of the inverter, and the











Figure 2-3: Thevenin-equivalent series termination 
 
Unfortunately, the output voltage can be easily affected by variations in supply
voltage. As shown in Figure 2-4, assuming that Din is low and the supply voltage has
an offset ΔVDD, the supply voltage sensitivity can be analyzed by an equivalent
circuit of a PMOS-over-NMOS inverter. The PMOS is in the saturation region, and the







Figure 2-4: Supply voltage sensitivity analysis model for VM driver 
 
Analyzing the equivalent circuit gives:
$$\Delta V_{out} = \Delta V_{DD}\, g_{m1}\,(r_{o1} \parallel r_{o2})$$                                                   (2-2)
where ΔVout is the variation at the output of the inverter, ΔVDD is the offset of the
supply voltage, and ro1 and ro2 are the output resistances of M1 and M2. The above equation
indicates that a supply voltage offset appears at the output multiplied by a voltage
gain of $g_{m1}(r_{o1} \parallel r_{o2})$.
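To give a feel for the size of this effect, equation (2-2) can be evaluated numerically. The device parameters below are hypothetical round numbers chosen for illustration, not values extracted from the design.

```python
def parallel(r_a, r_b):
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def vm_supply_gain(gm1, ro1, ro2):
    """Gain from supply offset to output, following equation (2-2)."""
    return gm1 * parallel(ro1, ro2)


# Hypothetical values: gm1 = 1 mS, ro1 = ro2 = 10 kOhm, 10 mV supply offset.
gain = vm_supply_gain(gm1=1e-3, ro1=10e3, ro2=10e3)
delta_vdd = 10e-3
delta_vout = delta_vdd * gain

print(gain)        # supply-to-output voltage gain
print(delta_vout)  # output variation in volts
```

With these assumed values the gain is about 5, so a 10 mV supply offset would disturb the output by roughly 50 mV.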
 
2.1.2 CM Driver 
Compared with the VM approach, a CM driver makes it easier to control the output
impedance, and it is less affected by a supply voltage reduction in order to have low power










with a parallel resistor for impedance matching. The CM driver can be replaced by a
Norton-equivalent parallel termination, shown in Figure 2-6. To obtain a certain voltage swing VC







                                                       (2-3) 
Comparing the current requirement with equation (2-1) under the same assumptions, this
CM driver scheme needs twice the current for a given input channel voltage swing.
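The factor of two can be reproduced with idealized termination models. This sketch assumes a purely resistive channel RC matched to RTX; the function names and numerical values are hypothetical, and the expressions are a simplified reading of the Thevenin and Norton equivalents rather than a copy of equations (2-1) and (2-3).

```python
def vm_driver_current(v_swing, r_channel):
    """Series (Thevenin) termination: all driver current flows into the channel."""
    return v_swing / r_channel


def cm_driver_current(v_swing, r_channel, r_tx):
    """Parallel (Norton) termination: the current splits between R_TX and the channel."""
    r_parallel = r_tx * r_channel / (r_tx + r_channel)
    return v_swing / r_parallel


# Hypothetical values: 100 mV swing at the channel input, R_C = R_TX = 50 Ohm.
V_SWING = 0.1
R_C = 50.0
R_TX = 50.0

i_vm = vm_driver_current(V_SWING, R_C)
i_cm = cm_driver_current(V_SWING, R_C, R_TX)

print(i_vm, i_cm)
print(i_cm / i_vm)  # ratio of CM to VM current for the same swing
```

In the parallel-terminated case half of the source current is shunted through RTX, which is why the CM driver needs twice the current when RC equals RTX.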
  
 





Figure 2-6: Norton-equivalent parallel termination 
 
In terms of the supply voltage effect in Figure 2-7, the swing of the input channel 











channel cannot change the current conducted by the current source. In order to
quantify the effect, the relation between output current variation and offset of supply




                                                   (2-4) 
where Iout is the output current variation. The above equation indicates that the offset of 
the supply voltage has less gain to the output current swing because of large ro2. 
 
 
Figure 2-7: Supply voltage sensitivity analysis model for CM driver 
 
The intrinsic differences between VM and CM drivers mainly concern power
efficiency, supply voltage sensitivity, and termination approaches. As mentioned before,
a typical VM driver potentially consumes less current to obtain a given voltage swing at the
input of the channel compared with a typical CM driver scheme. In regard to supply








hand, the supply voltage offset in the VM scheme is directly amplified with a large voltage
gain. With regard to termination approaches, it is much easier for the CM driver to
control the impedance, and parallel termination is the method most commonly used for
CM drivers. However, the VM output impedance is largely determined by




2.1.3 Channel
The reliance of modern high-performance computing systems on high-speed
interconnects has become very noticeable. Multi-core processors have many performance
requirements such as high speed, bandwidth and low energy use. In order to have short
interconnects, 3D stacking is becoming more common [4]. This may lead to a stack of
active chips connected by through-silicon vias (TSVs) that pass through each chip
down to a package substrate, as shown in Figure 2-8. However, complete 3D stacking is
still struggling with many issues such as supply chain, bonding, alignment, and energy
dissipation. As shown in Figure 2-9, 2.5D silicon interposer channel is another valued









Figure 2-9: 2.5D silicon interposer [4] 
 
In any case, a high-performance on-chip transceiver has to overcome the limited
bandwidth of the distributed RC channel. Usually, the distributed inductance can
be neglected because the distributed resistance is large.
 
2.1.4 Single-ended and differential comparison 
Since noise immunity and area consumption are becoming
increasingly important in advanced technologies, the choice of
electrical signaling method is becoming more challenging. A single-ended signal is
defined as one that is measured relative to a fixed potential, usually the ground. A
differential signal is characterized as one that is measured between two nodes that have
equal and opposite signal excursions around a fixed potential [5].
Single-ended signaling is used to maximize the density of the I/Os by using fewer
bumps and wires than fully-differential signaling [6]. The downside of single-ended
signaling is that it is much more vulnerable to environmental noise at higher data rates.
Therefore, it needs a relatively high voltage swing to ensure an adequate
signal-to-noise ratio (SNR).
Because of its symmetry and high common-mode rejection, differential signaling
is robust to supply noise. As long as a disturbance affects both differential inputs
equally, the differential signal is not affected. Another benefit of differential
signaling is that it can operate with a low voltage swing. For this reason it is
commonly used for sensitive signals and in on-chip transceivers, which must convey
small currents or voltages.
In consideration of the noise immunity and high-speed requirements, a differential 
signaling methodology is a preferable alternative for on-chip transceiver design. 
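The common-mode rejection argument can be illustrated with a small numerical model. It is a minimal sketch assuming ideal wires and a disturbance that couples identically onto both of them; the swing and noise amplitudes are hypothetical.

```python
import random


def receive_single_ended(signal, noise):
    """Single-ended link: the receiver sees the signal plus the disturbance."""
    return [s + n for s, n in zip(signal, noise)]


def receive_differential(signal, noise):
    """Differential link: the same disturbance couples onto both wires and
    cancels when the receiver takes the difference."""
    positive = [+s / 2 + n for s, n in zip(signal, noise)]
    negative = [-s / 2 + n for s, n in zip(signal, noise)]
    return [p - q for p, q in zip(positive, negative)]


random.seed(0)
data = [random.choice([-1, 1]) for _ in range(1000)]
signal = [0.05 * d for d in data]                        # hypothetical 50 mV swing
noise = [random.uniform(-0.2, 0.2) for _ in range(1000)] # hypothetical common-mode noise

single_ended = receive_single_ended(signal, noise)
differential = receive_differential(signal, noise)

worst_se = max(abs(r - s) for r, s in zip(single_ended, signal))
worst_diff = max(abs(r - s) for r, s in zip(differential, signal))
print(worst_se, worst_diff)
```

In this idealized model the differential receiver recovers the small swing almost exactly, while the single-ended receiver sees the full disturbance. Real links reject common-mode noise only to the extent that the two wires and the receiver inputs are matched.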
 
2.1.5 Advanced transmitters  
The transceiver in [1], proposed by Seung-Hun Lee et al., was the starting point of
this project. As shown in Figure 1-1, the transmitter combines a CM main driver and
a CM pre-emphasis driver. Both drivers are open-drain circuits, which consume
less power than a parallel-resistor termination scheme. The transmitter operates at 3 Gb/s
while driving 10 mm long differential channels in 65 nm CMOS technology. The power
consumption of the transmitter is 196.2 µW.
A hybrid transmitter [2] is shown in Figure 2-10. It has a VM pre-emphasis
driver and a CM main driver. The transmitter is controlled by two half-rate data streams
(DE and DO) and half-rate clocks (CKE and CKO). A 2:1 serializer aligns these
half-rate data sequences. The VM pre-emphasis driver is a low-swing NMOS-over-NMOS
inverter, which has a switch-type regulator for tuning the output voltage swing.
As discussed in Chapter 1, the regulator needs a large storage capacitor. [2] used
a 15 pF capacitor with an area of 6000 µm² in a 65 nm technology, which is large in




                                                   (2-5) 
where VREG is the regulator output voltage, which is much smaller than the supply voltage.
IEQ2 can be controlled by tuning VREG, which ranges from 50 mV to 600 mV. The 300 fF





Figure 2-10: Transmitter of [2] 
 
Both [12] and [13] are capacitively driven transmitters, as shown in Figure 2-11
and Figure 2-12. Because these designs rely on decision feedback equalization
(DFE) at the receiver, they place relatively low requirements on transmitter performance.
The capacitors CS and CP deliver a low-swing voltage with pre-emphasized transitions
into the channel, which helps both transmitters achieve low power consumption. In [12],
an extra voltage-controlled current source is used to cancel the ill-defined DC potential
on the channel. Therefore, VOUT can be defined by the value of Vin.
 




Figure 2-12: Transmitter of [13] 
 
Figure 2-13 shows a constant-Gm transmitter [14]. Like [2], the driver needs a
large capacitor to keep the regulated voltage stable, and the constant-Gm circuit needs an
extra 1.8 V supply voltage. The design achieves impedance matching at the inverter-based
transmitter by using the inverter output resistance 1/gm, which is also affected by input
data transitions. The PMOS-over-NMOS inverter used in the paper has a less varying
impedance than the NMOS-over-NMOS scheme. The approach needs extra
voltage-to-current converters in order to implement inverter-based transimpedance
amplifiers at the transmitter. This adds 23% to the transmitter power dissipation,
which is 1.9 mW in [14].
 




As the name implies, a receiver circuit is used to capture the signal from the channel.
Impedance matching, signal amplification, data recovery, and equalization are key
features of an on-chip link receiver. When the receiver’s input impedance matches the
channel impedance, reflections at the receiver side of the channel are minimized.
Receiver equalization is used to compensate for the high-frequency loss of the on-chip
channel. The receiver is typically composed of a sense amplifier and a flip-flop. A
current or voltage sense amplifier detects and amplifies the small-swing signal coming
from the channel, while a simple flip-flop samples the data at the correct time. The main
challenges of receiver design include low noise, a large sense-amplifier bandwidth,
data recovery, and equalization.
To broaden the bandwidth of a receiver, the inductive peaking technique can be
used in chip design [16]. Figure 2-14 (a) is a common-source amplifier, and Figure






                                                   (2-6) 
Figure 2-14 (c) and (d) indicate the inductive peaking implementation and the 








                                                   (2-7) 
The poles can be complex, and the zero is determined only by the L/R time constant. Overall,
equation (2-7) is characterized by the ratio of the L/R and RC time constants. This ratio is
denoted ‘m’. Therefore, the inductance can be written as:
$L = mR^{2}C$                                                   (2-8)
According to [4], the circuit achieves its maximum bandwidth when m = 0.71.
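
The effect of m on bandwidth can be checked numerically. The sketch below assumes the textbook shunt-peaked load impedance Z(s) = (R + sL) / (1 + sRC + s²LC) with L = mR²C, which matches the description of one zero set by L/R and two possibly complex poles. The function name is hypothetical.

```python
import math

def bandwidth_extension(m):
    """-3 dB bandwidth of a shunt-peaked load relative to the plain RC load.

    Assumed model: Z(s) = (R + sL) / (1 + sRC + s^2 LC) with L = m * R^2 * C.
    With x = w*R*C, |Z/R|^2 = (1 + m^2 x^2) / ((1 - m x^2)^2 + x^2).
    Setting this to 1/2 gives m^2 y^2 + (1 - 2m - 2m^2) y - 1 = 0 with y = x^2.
    """
    if m <= 0:
        return 1.0  # no inductor: single RC pole at x = 1
    b = 1.0 - 2.0 * m - 2.0 * m * m
    y = (-b + math.sqrt(b * b + 4.0 * m * m)) / (2.0 * m * m)
    return math.sqrt(y)

for m in (0.0, 0.4, 0.71, 1.0):
    print(f"m = {m:4.2f}  bandwidth ratio = {bandwidth_extension(m):.3f}")
```

Under this assumed model, m = 0.71 extends the -3 dB bandwidth by a factor of about 1.85 compared with the unpeaked RC load, at the cost of some peaking in the frequency response.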
               






















However, a monolithic inductor suffers from parasitic capacitance. It also
requires a large layout area and considerable design time. An active inductor is an
alternative approach to broadening the bandwidth [7], as shown in Figure 2-15. Figure 2-15








, the output impedance has a proportional relationship with frequency. 


















                         (2-10) 
According to (2-10), if RE is much larger than 1/gm, we obtain Figure 2-15 (c),





 .  Therefore, Figure 
2-15 (a) can be employed as an active load for broadening the bandwidth. In our
proposed design, it can also increase the high-frequency gain of the current sense amplifier, because





(a)                             (b)                                                (c) 
Figure 2-15: Active inductor  
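
The frequency dependence of the active inductor can be sketched numerically. The impedance form Z(s) = (sRE·Cgs + 1) / (sCgs + gm) is taken from the active-inductor term that appears in equation (3-14) later in the thesis. The component values and the function name are hypothetical and chosen only to make the trend visible.

```python
import math

def active_inductor_impedance(freq_hz, gm, r_e, c_gs):
    """Z(s) = (s*RE*Cgs + 1) / (s*Cgs + gm), evaluated at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * freq_hz
    return (s * r_e * c_gs + 1.0) / (s * c_gs + gm)

GM = 1e-3      # hypothetical transconductance, 1 mS
R_E = 10e3     # hypothetical gate resistor, 10 kOhm
C_GS = 20e-15  # hypothetical gate-source capacitance, 20 fF

for f in (1e6, 1e9, 1e10, 1e11, 1e13):
    z = active_inductor_impedance(f, GM, R_E, C_GS)
    print(f"f = {f:8.1e} Hz  |Z| = {abs(z):9.1f} Ohm")
```

With RE much larger than 1/gm, the magnitude rises from about 1/gm at low frequency toward RE at high frequency, which is the inductive behavior used to extend the bandwidth.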
 
2.3 Pseudo-random bit sequence (PRBS) 
The main performance metric for any I/O link is the bit-error rate (BER). A self-test
technique allows a chip to exercise and test its own operation.
Although a self-testing circuit increases the chip area, it is worthwhile because it reduces
test time and test equipment cost.
One method of self-test is to use cyclic redundancy checking, which involves a
PRBS generator and checker. As shown in Figure 2-16, a 7-bit PRBS (sequence length
2^7 − 1) is constructed from a linear feedback shift register [15], which in turn is made
of 7 DFFs connected in a





























Chapter 3 Proposed Scheme 
This chapter discusses the design of the overall transceiver, which includes the
hybrid transmitter, the receiver, the PRBS circuits, and the channel model. Different
termination scenarios are also analyzed. The goal of the transceiver is to achieve an 8 Gb/s
data rate with an energy efficiency better than 1 pJ/b. The intended application is
on-chip multi-core communication. The design was simulated in IBM 130 nm
technology.
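
The energy-efficiency target translates directly into a power budget. The short sketch below performs that arithmetic, and also applies it to the transmitter of [1] using the figures quoted in Chapter 2 (196.2 µW at 3 Gb/s, transmitter only). The function and variable names are hypothetical.

```python
def energy_per_bit_j(power_w, data_rate_bps):
    """Energy per bit in joules: power divided by data rate."""
    return power_w / data_rate_bps

DATA_RATE_BPS = 8e9       # target data rate, 8 Gb/s
TARGET_J_PER_BIT = 1e-12  # target efficiency, 1 pJ/b

# Maximum total transceiver power that still meets the 1 pJ/b target.
power_budget_w = DATA_RATE_BPS * TARGET_J_PER_BIT
print(f"power budget: {power_budget_w * 1e3:.1f} mW")

# Transmitter of [1]: 196.2 uW at 3 Gb/s (transmitter power only).
ref_energy = energy_per_bit_j(196.2e-6, 3e9)
print(f"transmitter of [1]: {ref_energy * 1e15:.1f} fJ/b")
```

At 8 Gb/s the 1 pJ/b target corresponds to a total power budget of 8 mW for the whole link.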
 
3.1 Hybrid Transmitter 
3.1.1 Basic Considerations 
A source-matched CM driver usually uses parallel resistors to match the characteristic
impedance of the line, as shown in Figure 3-1. With RTX matched to the channel
impedance, the parallel resistors can double the power consumption for a target signal
swing. An alternative for driving a lossy channel is an open-drain CM
driver, which dissipates half the power of a fully terminated transmitter but can lead to
reflections. For an off-chip channel, impedance matching is preferable. In our
design, the on-chip lossy RC channel can be driven by an open-drain main driver,




Figure 3-1: A CM driver with parallel resistors 
 
Due to the channel attenuation, the transmitter needs equalization circuits to
compensate for the frequency-dependent loss. Pre-emphasis is a well-known
equalization approach, which can be used to increase the data rate and reduce
inter-symbol interference (ISI). As shown in Figure 3-2, the channel output signal can
be recovered by adding one or more extra taps, which are controlled by delayed versions
of the data signal. The pre-emphasis current, represented by the red waveform,
counteracts the channel loss and is therefore canceled out after the channel. The blue
curve is the channel output current with the pre-emphasis circuit. The green, dashed curve
shows the channel output current without the pre-emphasis circuit. Clearly, the green one
has more ISI, which will




Figure 3-2: Pre-emphasis equalization 
 
Adding one extra open-drain driver is the most common pre-emphasis method for
a CM driver transmitter, as shown in Figure 3-3. The extra tap is controlled by the data
signal delayed by one symbol period. However, it increases the static power dissipation.
During a data transition (0 to 1 or 1 to 0), the output currents of the two
drivers add up, leading to a larger output current. When there is no data transition, the
output currents of the drivers subtract, generating a smaller output current. This
subtraction is an inefficient way to generate the smaller transmit current. The circuit
also needs a DFF to delay the signal between the main driver and the pre-emphasis tap.
Based on our simulation, the DFF consumes about 20% of the transmitter power in
this scheme. [17] has proposed an approach to eliminate the power overhead of current











Figure 3-3: Open-drain CM pre-emphasis 
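
The add-and-subtract behavior of the two-tap scheme can be written as a short behavioral model. This is a sketch under stated assumptions, not the circuit itself: data symbols are taken as +1 and −1, the extra tap is driven by the previous symbol with opposite polarity, and the current values and names are hypothetical.

```python
def pre_emphasis(bits, i_main, i_pre):
    """Two-tap pre-emphasis: main tap on the current bit, inverted tap on the previous bit.

    On a transition the two taps add (i_main + i_pre); on a repeated bit
    they subtract (i_main - i_pre).
    """
    symbols = [1 if b else -1 for b in bits]
    out = []
    prev = symbols[0]  # assume the line idled at the first symbol's level
    for cur in symbols:
        out.append(i_main * cur - i_pre * prev)
        prev = cur
    return out

# Hypothetical currents in microamperes.
levels = pre_emphasis([0, 0, 1, 1, 1, 0], i_main=100, i_pre=30)
print(levels)
```

The larger magnitude appears only in the first unit interval after a transition. On repeated bits both drivers still conduct while partly canceling each other, which is the inefficiency described above.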
 
3.1.2 CM Main Driver with VM Pre-Emphasis Architecture 
In this work, a fully differential CM signaling scheme, which is combined with 




Figure 3-4: Proposed hybrid transmitter 
 








































equalizer. The open-drain main driver connects directly to the channel input. The
pre-emphasis equalizer consists of a low-swing VM driver and two AC-coupling
capacitors.
The open-drain CM driver is discussed first. The purpose of the CM main driver is
to convey the main current to the channel. No parallel resistors are used for
impedance matching because the channel resistance is on the order of hundreds of
ohms. Without impedance matching at the transmitter, the main driver is able to
deliver all of its current into the channel. The reflected signal is
suppressed by the very large channel loss. As shown in Figure 3-5, the main driver has a
bias circuit, which has an NMOS capacitor at the gate of MN5 for reducing noise.
 
Figure 3-5: Current bias of CM driver 
 
Due to the passive termination at the receiver end, the common-mode output
voltage of the CM driver is low and MN3/4/5 work in the triode region. The bias circuit
sets a constant voltage at the gate of MN5. Our simulation results show that the






MN5 is 17 mV, and the value is constant when the data has no transitions. The voltage
varies slightly when the data changes between ‘1’ and ‘0’. These variations affect
the transmitted current at the data transitions, but they do not affect the received
current because the peaking is small and is filtered by the lossy channel. As shown in
Figure 3-6, peaking-1 and peaking-2 in the channel input current do not affect the channel
output current, which is the red curve in the figure. The purple curve shows the VDS
variations. In fact, because it increases the pre-emphasis current, the small peaking-2 is
beneficial for isolated bits.
 
Figure 3-6: Current peaking at input channel current 
 








By controlling the VGS of MN5, the main driver provides a 50 μA main-driver
current. Due to the VDS variation, a peaking current occurs at each data transition.
This data-dependent main-driver current is averaged in the channel
and acts as a portion of the bias current for the receiver, as done in [1].
In this design, the low-swing VM driver is implemented without an extra regulator
circuit. The VM driver described below is used for equalization alongside the CM driver in
the transmitter. Instead of using an active switching regulator [2] to tune the inverter
output swing, the resistor RB is implemented as an extra current path, as in [8] and [18].
Both of those works used it in a low-voltage-swing VM transmitter and treated it as part
of the impedance-matching resistance. To tune the pre-emphasis current in the
proposed design, RB is made variable, with a larger tuning range than in
[8] and [18]. The circuit can control the driver’s output voltage swing without
consuming a large chip area. Because of the very low voltage swing at the channel input,




                                                              (3-2) 
where Ceq is the AC-coupling capacitance, RA is the inverter output resistance, and
Veq(a/b) are the steady-state voltages at the outputs of the VM driver inverters. Their
values depend on the input signal. If D+ and D- are the digital inputs “0” and “1”
respectively, Veq_a and Veq_b are given by (3-3) and (3-4):
$V_{eq\_a} = V_{DD} \cdot \frac{R_B + R_A}{2R_A + R_B}$                                           (3-3)
$V_{eq\_b} = V_{DD} \cdot \frac{R_A}{2R_A + R_B}$                                          (3-4)
If the input values are “1” and “0” for D+ and D-, Veq_a is $V_{DD} \cdot \frac{R_A}{2R_A + R_B}$
and Veq_b is $V_{DD} \cdot \frac{R_B + R_A}{2R_A + R_B}$. The PMOS and NMOS transistors are assumed to have matched
output resistance. The voltages at the channel input, VH_IN and VL_IN, are much smaller
than Veq_a and Veq_b. Therefore, the channel input voltage can be ignored in (3-2). Veq_a
and Veq_b can be tuned by changing RA, but this requires wider transistors, which
increases the equalizer output capacitance. The voltage drop across RB is
$V_{DD} \cdot \frac{R_B}{2R_A + R_B}$






Figure 3-7: Equivalent circuit of VM driver  
 
For the quantitative analysis, the proposed differential-input equalizer can be modeled
by an equivalent circuit with a single input voltage, as shown in Figure 3-7. RC is the
sum of the channel resistance and the receiver passive termination resistance. From a
voltage analysis of the equivalent circuit, the voltage drop across RB/2 is:
$V_{eq} = \pm V_{DD} \cdot \frac{R_B}{4R_A + 2R_B}$                                          (3-5)
The difference between Veq_a and Veq_b corresponds to the two values of Veq, which are
equal and opposite. At the input signal switching time, the peak value of Ieq is approximately:








                                         (3-6) 
 
where ΔVeq is the change in Veq during a data transition. In the proposed driver,
RA is much larger than RB, and the output impedance of the CM driver is much larger than







                                                        (3-7) 
Increasing RB increases Ieq. To compensate for different on-chip channel lengths, the
equalizer current is adjustable from 100 µA to 450 µA as RB is varied from 100 Ω to
500 Ω. The simulation results are presented in Chapter 4.
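
The dependence of the steady-state VM driver voltages on RB can be evaluated directly from equations (3-3) to (3-5). The sketch below does this for a sweep of RB. The supply voltage and the value of RA are assumptions for illustration (the text only states that RA is much larger than RB), and the function name is hypothetical.

```python
def vm_driver_levels(vdd, r_a, r_b):
    """Steady-state VM driver voltages from equations (3-3), (3-4) and (3-5)."""
    total = 2.0 * r_a + r_b
    veq_high = vdd * (r_a + r_b) / total       # (3-3)
    veq_low = vdd * r_a / total                # (3-4)
    veq = vdd * r_b / (4.0 * r_a + 2.0 * r_b)  # (3-5), magnitude across RB/2
    return veq_high, veq_low, veq

VDD = 1.2    # assumed supply voltage in volts
R_A = 2000.0 # assumed inverter output resistance in ohms

sweep = []
for r_b in (100.0, 200.0, 300.0, 400.0, 500.0):
    hi, lo, veq = vm_driver_levels(VDD, R_A, r_b)
    sweep.append(veq)
    print(f"RB = {r_b:5.0f} Ohm  Veq_a - Veq_b = {(hi - lo) * 1e3:6.1f} mV  Veq = {veq * 1e3:6.1f} mV")
```

The swing grows monotonically with RB, which is consistent with the statement that increasing RB increases Ieq. The absolute values depend on the assumed RA and VDD.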
 
3.2 On-chip Channel  
With the rapid development of CMOS technology, interconnect delay and power
consumption severely limit integrated circuit performance in current and future nodes
because of smaller wire cross-sections, tighter wire pitch, and longer lines.
To build an on-chip channel, the metal width, length, and layer need
to be controlled. Figure 3-8 shows a cross-section of the proposed channels. The M5
layer is used for the channel. The M6 and M4 layers are connected to
ground as shielding layers. Two differential channels (CH) are used for the on-chip
transceiver, and the other two differential channels (CH_D) are built as dummy channels
for testing. These dummy channels can be connected to test equipment to characterize the
real on-chip channel performance in this technology. The test results can be compared
with schematic simulations to verify the accuracy of the equivalent channel circuit. Ground
wires are added between the signal channels to shield them from outside noise.
 
Figure 3-8: Cross-section of proposed channels model 
 








                                                  (3-8) 
H is the thickness of the metal layer, which is outside the designer’s control. Trading
off length (Llength) against width (W) is the only way to obtain a small channel
resistance.
A multilayer capacitance model is shown in Figure 3-9. The capacitance
of a channel is influenced by the dielectric constants of the technology. However, the
dielectrics differ between the adjacent capacitance (Cadj) and the vertical capacitance
(Ctop/bot). In this technology, the dielectric used between adjacent wires has the lowest
possible dielectric constant to minimize capacitance. The dielectric between vertical
layers must provide greater mechanical stability and may have a larger dielectric constant.
The fringe capacitance, which is the flux from the sides of the wires to the layers below
and above, should also be included. The fringe capacitances between horizontal surfaces
are too small to count in the total capacitance because of the ground shielding. Therefore,
the total capacitance can be estimated as:






) + 2 (𝜀𝑠𝑖𝑑𝑒
𝑊
𝐻
)] + 𝐶𝑓𝑟𝑖𝑛𝑔𝑒            (3-9) 
According to [9], the fringe capacitance can be combined with the top and bottom
capacitance. If εup = εdown = εver, the total capacitance is:
𝐶𝑡𝑜𝑡𝑎𝑙 = 𝜀𝑜𝐿𝑙𝑒𝑛𝑔𝑡ℎ [(𝜀𝑣𝑒𝑟
𝑊
𝐻


















Figure 3-9: Multilayer capacitance model 
 
Extracting inductance for on-chip channel is extremely time-consuming for 
complex geometries, because it is very dependent on the entire circuit’s current loop. 










)                                                  (3-11) 
where µ0 is the magnetic permeability of free space, 4π × 10^−7 H/m. Therefore, the
inductance is around 2 nH for a 5-mm on-chip channel. It is negligible for
schematic-level simulation because the L/R time constant is much smaller than the RC
time constant.
The IBM 130 nm technology has 8 metal layers. The M1 to M3 layers are used for routing
within circuit blocks. M4 and M6 are implemented as ground layers for shielding the
channels, and the M5 layer is used for the channels. M7 and M8 are employed as power
grids for the chip. To achieve the best density, the minimum spacing, which is 0.4 µm,
is used for the on-chip channels. To reduce the channel resistance and improve
bandwidth, the width of the channels is about 1.5 µm. The transceiver is designed to drive a
channel of up to 5 mm in length. To reduce the chip area required for this
experiment, the test channels are laid out in a back-and-forth fashion, giving a 5-mm total
length.
The on-chip channel can be considered RC-limited rather than RLC-limited.
Inductance can be neglected as long as the RC time constant is much larger than the L/R
time constant. Therefore, a distributed circuit of resistances and capacitances can be
used to build the channel model. There are three basic lumped approximations, shown
in Figure 3-10. Compared with the L-model and the T-model, the π-model is the most accurate
approach for modeling on-chip channel performance, achieving results
accurate to 3% [10]. It is common practice to model on-chip channels with a 3- to 5-segment
π-model for simulation. Because the channel is differential, the adjacent
capacitance is added to the π-model as a capacitance between the two channels, as shown
in Figure 3-11. According to equations (a) and (b), the total resistance of the 5-mm
on-chip channel is 195 Ω, and the total capacitance is about 2.11 pF. In the proposed design
simulation, a 3-segment π-model is used, which has R = 65 Ω, Cadj/2 = 41.75 fF, and




Figure 3-10: Three basic lumped models for on-chip channels. 
 
 
Figure 3-11: 5-mm RC channel model for proposed transceiver 
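
The lumped model and the RC-versus-L/R comparison can be reproduced with a few lines of arithmetic. The sketch below uses the totals quoted in the text (195 Ω, 2.11 pF, about 2 nH). As a simplification it lumps all capacitance to ground instead of separating the adjacent and vertical components, and the function names are hypothetical.

```python
R_TOTAL = 195.0     # ohms, 5-mm channel (from the text)
C_TOTAL = 2.11e-12  # farads, 5-mm channel (from the text)
L_TOTAL = 2e-9      # henries, approximate value from the text
N_SEG = 3           # number of pi segments used in the simulation

def pi_ladder(r_total, c_total, n):
    """Series resistance per segment and the capacitance at each of the n+1 nodes."""
    r_seg = r_total / n
    c_half = c_total / (2 * n)
    caps = [c_half] + [2 * c_half] * (n - 1) + [c_half]
    return r_seg, caps

def elmore_delay(r_seg, caps):
    """Elmore delay to the far end with an ideal source; node k sits behind k resistors."""
    return sum(k * r_seg * c for k, c in enumerate(caps))

r_seg, caps = pi_ladder(R_TOTAL, C_TOTAL, N_SEG)
rc_tau = R_TOTAL * C_TOTAL
lr_tau = L_TOTAL / R_TOTAL
print(f"R per segment = {r_seg:.1f} Ohm")
print(f"RC = {rc_tau * 1e12:.1f} ps, L/R = {lr_tau * 1e12:.1f} ps")
print(f"Elmore delay = {elmore_delay(r_seg, caps) * 1e12:.1f} ps")
```

The segment resistance matches the 65 Ω used in the simulation, and the L/R time constant is only a few percent of the RC time constant, which supports neglecting the inductance.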
 
3.3 Termination Comparisons   
50 Ω matching is not necessary for an on-chip transceiver because the channel
impedance is not 50 Ω. However, the transmitter can still be co-designed with the
channel for impedance matching. To save the power that would be dissipated in transmitter
impedance-matching circuits, the proposed design implements no impedance matching at the
transmitter side.
An on-chip channel has different signaling scenarios compared with a
conventional off-chip channel. At each end of the channel, a series termination or a
parallel termination can be considered, alongside the option of having no termination.
Figure 3-12 shows six basic scenarios built from these three options.
To begin the comparison, consider Figure 3-12 (a), also known as a
series termination. The series termination is usually implemented with a VM driver. The
series resistor RT is employed to match the channel impedance. The main
advantage of this scheme is zero static power consumption. However, there is no
impedance matching at the receiver to eliminate reflections.
Figure 3-12 (b) shows a receiver terminated with a parallel resistor RT.
In this scenario, both sides are matched to the channel impedance, ZTX = ZRX = ZChannel.
Compared with scheme (a), this one is able to eliminate reflections, which is why
it is commonly used to achieve the best signal quality at the receiver. The primary
disadvantage of this scheme is that the parallel termination draws current from the
transmitter, which increases the receiver power consumption.
 
           




            
(c)                                                          (d) 
 
 
         
(e)                                                          (f) 
 
 
Figure 3-12: Termination schemes  
 
A parallel termination is used at the transmitter side in Figure 3-12 (c).
In this case, the transmitter is usually a CM driver, which has a very large output
impedance. The main concern for this scheme is the same as for scheme (b): the parallel
resistor RT is not energy-efficient. The only difference between Figure 3-12 (c) and (d) is
that scheme (d) also implements a parallel termination at the receiver. Scheme (d)
achieves the best signal quality with the worst power efficiency. Figure 3-12 (e) is similar to
scenarios (c) and (d). Scheme (e) uses only a parallel termination at the
receiver; it has the same power consumption as (b) and (c) and the same signal quality as
(a) and (c). Figure 3-12 (f) does not employ any passive components for impedance
matching at either the transmitter or the receiver. In terms of energy efficiency,
this is the best scenario among the terminations. However, its signal quality is poor for
a high-speed on-chip transceiver design.
Because the channel is very lossy, the open-drain CM driver is able to
efficiently convey signals to the receiver, which means that no resistor
termination is needed at the transmitter. For our design, as long as the far-end impedance
is well matched to the channel impedance, impedance mismatch at the transmitter is not
an issue. At the beginning of the design, scenario (f) was the plan for the on-chip
transceiver implementation. Instead of using a parallel or series termination
at the receiver, an active termination was intended to provide a relatively low input
impedance, as shown in Figure 1-1. Unfortunately, this scheme cannot achieve an
input impedance as low as the channel impedance: the common-gate amplifier
should use a small device to minimize the parasitic capacitance, while the
input node of the current amplifier should also be matched to the channel impedance to
minimize reflections. With this technology, these two requirements cannot both be met for
the current amplifier. Therefore, a parallel termination is employed for the final chip
tape-out.
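
The link between termination and reflection can be illustrated with the standard reflection coefficient Γ = (ZL − Z0) / (ZL + Z0). The sketch below is only an idealized illustration: it treats the channel as a uniform distributed RC line (inductance and leakage neglected), for which Z0 = sqrt(r / (jωc)), and the termination values are hypothetical rather than the ones used on the chip.

```python
import cmath
import math

def rc_line_z0(r_total, c_total, freq_hz):
    """Characteristic impedance of a uniform RC line: sqrt(r / (j*w*c)).

    The per-unit-length ratio r/c equals the ratio of the totals.
    """
    w = 2.0 * math.pi * freq_hz
    return cmath.sqrt(r_total / (1j * w * c_total))

def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient at a load z_load on a line of impedance z0."""
    return (z_load - z0) / (z_load + z0)

z0 = rc_line_z0(195.0, 2.11e-12, 4e9)  # channel totals from the text, at 4 GHz
print(f"|Z0| = {abs(z0):.1f} Ohm, phase = {math.degrees(cmath.phase(z0)):.1f} deg")

for r_term in (60.0, 200.0, 1e6):  # hypothetical termination resistors
    gamma = reflection_coefficient(r_term, z0)
    print(f"RT = {r_term:9.0f} Ohm  |Gamma| = {abs(gamma):.2f}")
```

Because the impedance of an RC line is complex and frequency dependent, a single resistor cannot match it at all frequencies. This is one reason the heavy channel loss, rather than perfect matching, is relied on to suppress reflections.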
3.4 Current Sense Amplifier 
Figure 3-13 shows the circuit of the current sense amplifier [1]. The DC input 
resistance of the current sense amplifier is: 






+ 𝑅𝐿)]                               (3-12) 
Due to the cross-coupled NMOS structure, part of the impedance looking into the
source of MN6/7 is negative. As mentioned in Chapter 2, the impedance of MP3/4
varies with frequency. Figure 3-14 shows the current sense amplifier equivalent




]                               (3-13) 
where Cin and Cout represent the input and output circuit capacitances, and ZL is the
total impedance of the active inductor and RL, given as:
$Z_L = -\left( \frac{sR_E C_{gs} + 1}{sC_{gs} + g_{mP4}} + R_L \right)$                                                       (3-14)
Therefore, the negative impedance from the cross-coupled NMOS pair can reduce the input
impedance. As shown in (3-13), the input impedance is frequency dependent. The circuit is
designed for low power consumption, which requires a relatively small average current
through the channel compared with [1] and [2]. A small bias current means that
the transconductance of MN7 and MN6 cannot be very large. In our design, the width of these
two transistors is 4 µm, with gm = 1.76 mS. The passive termination RT helps to match the
receiver to the channel impedance and eliminate signal reflections. These resistors carry
about 175 μA of DC current.
 
Figure 3-13: Current sense amplifier  
 
 
Figure 3-14: Current sense amplifier equivalent circuit  
 
RE makes MP3/4 work as an active inductor, as described in Chapter 2. It
provides inductive behavior and can increase the circuit bandwidth. The active inductor
has a frequency-dependent impedance, ranging from a small impedance at low frequencies,
$1/g_{mP(4/3)}$,














The transimpedance gain of the current sense amplifier is:
$A = -g_{mN(8/9)} R_D \left[ R_P R_T \left( \frac{g_{mN(6/7)} r_{oN(6/7)} + 1}{g_{mN(6/7)} r_{oN(6/7)} R_T + R_T + R_P + r_{oN(6/7)}} \right) \right]$   (3-13)




+ 𝑅𝐿)                                              (3-14) 
Assuming roN(6/7) is much larger than RT and RP, equation (3-13) can be
simplified to:
$A = -g_{mN(8/9)} R_D \left[ R_P R_T g_{mN(6/7)} \right]$                                 (3-15)
As mentioned before, the width of MN6/7 is small, 4 µm, to keep the power
consumption low. The RT termination is therefore added to lower the input impedance of
the current sense amplifier, and it is programmable to match different channel lengths.
The conductance of MP3/4 alone cannot provide the large voltage swing required by the
slicer circuit. To achieve a larger transimpedance gain, RL is added. Without the extra
RL, RP would be about 550 Ω and the overall gain A would be significantly reduced. To boost
the common-mode signal value, two common-source amplifiers are added at the output of the
current amplifier. The small-signal gain of the common-source amplifiers is around 2.
The overall transimpedance gain A of this design is 1.5 kΩ.
Instead of using parallel resistors as terminations, parallel current sources can
be used to boost the current through the amplifier, thereby lowering its input resistance
and providing an improved match with the channel. According to equation (3-12), a low
impedance can be achieved by biasing these devices at a large current. By connecting




                                                           (3-16) 
Figure 3-15 shows the receiver input current at an 8 Gb/s data rate with these two
approaches. The parallel resistor termination has less jitter, but its eye opening is
smaller. The chip for this design was fabricated with the parallel resistor
method. In future work, however, the current source termination approach can be
employed to improve the controllability of the current sense amplifier.
 
Figure 3-15: Simulation of parallel current sources termination and parallel resistors 
termination  
 
3.5 Slicer Circuit with Offset Compensation Circuit 






with the current sense amplifier output. The function of the slicer circuit is to restore the
received signal to a rail-to-rail voltage swing. As shown in Figure 3-16, the strong-arm
comparator circuit has a capacitor-based offset correction circuit [11]. These capacitors
slow down the charging and discharging of the ID2/3 currents. Therefore, the input
offset voltage can be compensated by tuning the discharge time.
 
Figure 3-16: Comparator with offset compensation circuit  
 
 
3.6 High-speed DFF 
The series-architecture PRBS generator and checker [3] is used in the
circuit. It can generate a pseudo-random differential bit sequence of length 2^7 − 1 and check the








feedback. The checker has 15 DFFs, 2 XOR gates, and 1 AND gate.
The slicer circuit described above can work as a DFF by applying a reference
voltage to one of the differential inputs, as shown in Figure 3-17. The circuit can work
with a clock frequency up to 10 GHz.
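
A behavioral sketch of a 2^7 − 1 PRBS generator and a matching checker is given below. The feedback polynomial x^7 + x^6 + 1 is an assumption (it is a commonly used PRBS7 polynomial, and the text does not state which taps the chip uses). The checker is a simple self-synchronizing software model and does not reproduce the gate-level structure described above.

```python
def prbs7(nbits, seed=0x7F):
    """Generate nbits of a PRBS7 sequence with an assumed x^7 + x^6 + 1 LFSR."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    out = []
    for _ in range(nbits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of taps 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        out.append(new_bit)
    return out

def check_prbs7(rx):
    """Self-synchronizing check: flag 1 wherever rx[n] != rx[n-6] XOR rx[n-7]."""
    return [rx[n] ^ rx[n - 6] ^ rx[n - 7] for n in range(7, len(rx))]

tx = prbs7(254)          # two full periods
rx = list(tx)
rx[50] ^= 1              # inject a single bit error
print("period repeats:", tx[:127] == tx[127:])
print("flags without error:", sum(check_prbs7(tx)))
print("flags with one error:", sum(check_prbs7(rx)))
```

In this model a single flipped bit raises three flags, because the bad bit takes part in three consecutive parity checks. A hardware checker would typically account for this when counting errors.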
 
 
Figure 3-17: High-speed DFF  
 
 
3.7 Clock Alignment Circuit 
 
The transmitter is a short distance from the clock pad, and therefore only
needs some buffers to deliver the clock to the transmitter circuit. The receiver is far
from the clock pad, and therefore needs a clock alignment circuit to tune the clock
phase, as shown in Figure 3-18. The circuit is composed of 7 programmable inverter
chains and 1 starved inverter. The programmable inverter chains are implemented as
propagation delay blocks. To obtain equal rising and falling transition times, the starved





Figure 3-18: Clock alignment circuit  
 
3.8 Shift Register 
 
To tune parameters such as the transmitter pre-emphasis current, the
offset compensation circuits, and the clock alignment circuit, the chip needs a 30-bit shift
register. As shown in Figure 3-19, the shift register includes two rows of DFFs. The
lower row latches the data held in the top row. The digital signal “1” or “0” is
not passed to the internal circuits until the BITS_EN signal transitions from low to high.
The higher row can prevent the internal circuits from seeing frequent changes at BITS_IN. The



Figure 3-19: Shift register  
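
The two-row behavior can be modeled in a few lines. The sketch below is a behavioral assumption based on the description above: one row shifts the serial data in, and a second row copies it to the internal circuits only on a rising edge of BITS_EN. The class and method names are hypothetical.

```python
class ConfigShiftRegister:
    """Behavioral model of a serial configuration register with a latch row."""

    def __init__(self, width=30):
        self.shift_row = [0] * width  # row that shifts BITS_IN serially
        self.latch_row = [0] * width  # row seen by the internal circuits
        self._prev_en = 0

    def clock(self, bits_in):
        """Shift one bit in; the latch row is not affected."""
        self.shift_row = [bits_in & 1] + self.shift_row[:-1]

    def set_enable(self, bits_en):
        """Copy the shift row to the latch row on a low-to-high BITS_EN edge."""
        if bits_en and not self._prev_en:
            self.latch_row = list(self.shift_row)
        self._prev_en = 1 if bits_en else 0

reg = ConfigShiftRegister(width=30)
word = [1, 0] * 15
for bit in word:
    reg.clock(bit)
before = list(reg.latch_row)  # still all zeros while the word is shifting
reg.set_enable(1)
after = list(reg.latch_row)   # now holds the shifted-in word
print(sum(before), sum(after))
```

Separating the shifting row from the latched row means the tuning bits applied to the analog blocks change only once per configuration update, not on every shift clock.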
  
Chapter 4 Simulation and Layout 
Schematic-level simulation results are presented for the individual blocks in the
order in which they were introduced in the previous chapter.
4.1 Simulation of Blocks  
4.1.1 Hybrid Transmitter 
The first simulated block is the proposed transmitter. To show the effect of the VM
pre-emphasis, Figure 4-1 gives the differential single-bit pulse response
at the output of the 5-mm channel model. The data rate is 8 Gb/s. There is
less ISI with the VM pre-emphasis circuit than without it. For more clarity, Figure
4-2 shows the difference in channel output current with and without the VM
pre-emphasis circuit. For a more realistic simulation, a pseudo-random data sequence of
length 2^7 − 1 is used as the input signal. The effectiveness of the VM pre-emphasis
is evident.
 




Figure 4-2: Channel output current comparison 
 
 According to the data in Figure 4-3, the input channel current eye opening is 
approximately 100 μA. The pre-emphasis current lasts more than one UI. The peak 
equalization current can reach up to 500 μA. At the channel output port, the current signal 









Figure 4-3: Input and output current of the 5-mm channel 
 
 
 As shown in Figure 4-4, the pre-emphasis current can be tuned by making RB
programmable. To compensate for different on-chip channel lengths, the equalizer
current is adjustable from 100 µA to 450 µA as RB is varied from 100 Ω to 500 Ω.
 




Sweeping RB from 100 Ω to 1 kΩ, Figure 4-5, Figure 4-6, and Figure 4-7 also
indicate that the pre-emphasis current has a significant influence on the channel output
current for both the 5-mm and 2.5-mm on-chip channel models. If RB is too low, the
pre-emphasis current is too small to compensate for the channel loss, as shown in Figure 4-6. If
RB is too large, the pre-emphasis current can cause current peaking at the channel output,
as shown in Figure 4-7. To compensate for the channel loss, 200 Ω is suitable for the
2.5-mm channel and 500 Ω for the 5-mm channel.
 
Figure 4-5: Pre-emphasis current with different RB 
 
 





Figure 4-7: Output 2.5-mm channel current with different RB 
 
To show the benefit of pre-emphasis equalization, Figure 4-8 and Figure 4-9
show channel output current eye diagrams when the transmitter has only the main driver.
At 1 Gb/s the eye closure is 9%, and at 4.2 Gb/s it is 100%. It is evident that
the VM pre-emphasis circuit can increase the data rate up to 8 Gb/s and reduce ISI.
 




Figure 4-9: Channel output current at 4.2 Gb/s 
 
4.1.2 RC Channel Response 
According to Figure 3-9 in Chapter 3, the length of the on-chip channel is chosen to be
5 mm, which corresponds to roughly 19 dB of loss at the Nyquist frequency of 4 GHz. The
-3 dB bandwidth of the channel is 1.6 GHz.
 




4.1.3 Current Sense Amplifier 
As shown in Figure 4-11 and Figure 4-3, the signal swing at the input of the current 
sense amplifier degrades to 2 mV in voltage and 60 μA in current. The output of the 
current sense amplifier has a 46 mV voltage swing, so its transimpedance gain is about 
770 Ω. 
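
The gain figure can be checked directly from the quoted swings, since the transimpedance gain is the output voltage swing divided by the input current swing. The snippet below is a minimal sketch of that arithmetic; the function name is illustrative.

```python
def transimpedance_gain_ohm(v_out_swing_v, i_in_swing_a):
    """Transimpedance gain as output voltage swing over input current swing."""
    return v_out_swing_v / i_in_swing_a


# Swings quoted in the text: 46 mV at the output, 60 uA at the input.
gain_ohm = transimpedance_gain_ohm(46e-3, 60e-6)
print(f"Transimpedance gain: about {gain_ohm:.0f} ohm")
```

The result is within half a percent of the rounded value quoted above.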
 
Figure 4-11: Input channel, output channel, and output current sense amplifier voltages 
 
4.1.4 Slicer Circuit Simulation 
Figure 4-12 shows a mismatch simulation for the input of the slicer circuit. The 
input-referred offset is about 9.9 mV, taken as three times the standard deviation. 
The offset cancellation has a resolution of 2.4 mV and can cancel about 12 mV of 
offset at the comparator input, as shown in Figure 4-13. It can also absorb the 
2 mV offset caused by the current sense amplifier. Referring it back to the input of the 





Figure 4-12: Slicer input mismatch simulation 
 
 
Figure 4-13: Resolution simulation 
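
The offset numbers in this subsection can be tied together with a small budget calculation. The sketch below assumes that the slicer offset and the current sense amplifier offset add linearly in the worst case and that an ideal calibration leaves at most half a step of residual offset; both are simplifying assumptions, and the variable names are illustrative.

```python
import math

SLICER_OFFSET_3SIGMA_MV = 9.9  # 3-sigma slicer input offset from the mismatch simulation
CSA_OFFSET_MV = 2.0            # offset contributed by the current sense amplifier
STEP_MV = 2.4                  # offset cancellation resolution
CANCEL_RANGE_MV = 12.0         # total offset the cancellation circuit can remove

sigma_mv = SLICER_OFFSET_3SIGMA_MV / 3.0
worst_case_mv = SLICER_OFFSET_3SIGMA_MV + CSA_OFFSET_MV  # assumption: linear worst-case sum
steps_needed = math.ceil(worst_case_mv / STEP_MV)
residual_mv = STEP_MV / 2.0                              # assumption: ideal nearest-step calibration

print(f"1-sigma offset: {sigma_mv:.1f} mV")
print(f"Worst-case offset to cancel: {worst_case_mv:.1f} mV")
print(f"Steps needed: {steps_needed} (covers {steps_needed * STEP_MV:.1f} mV)")
print(f"Residual offset after calibration: up to {residual_mv:.1f} mV")
```

Under these assumptions, five steps of 2.4 mV cover the combined offset and stay within the 12 mV cancellation range.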
 
The RMS noise voltage at the input of the slicer due to the current sense amplifier 
is 1.5 mV, as shown in Figure 4-14. This value is the RMS noise integrated up to 
100 GHz. With an input signal of 46 mV peak-to-peak, this noise level does not 
degrade the circuit performance: the resulting SNR of approximately 23.7 dB indicates 
that the transceiver is not noise limited. 
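
The quoted SNR can be reproduced from the two voltages given above. The sketch below assumes that the SNR is defined as the signal amplitude (half of the peak-to-peak swing) over the RMS noise voltage; this definition is an assumption, chosen because it matches the quoted figure.

```python
import math


def snr_db(v_peak_to_peak, v_noise_rms):
    """SNR in dB, using half the peak-to-peak swing as the signal amplitude."""
    amplitude = v_peak_to_peak / 2.0
    return 20.0 * math.log10(amplitude / v_noise_rms)


# 46 mV peak-to-peak signal and 1.5 mV RMS noise at the slicer input.
snr = snr_db(46e-3, 1.5e-3)
print(f"SNR: {snr:.1f} dB")
```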
 
Figure 4-14: Noise simulation 
 
4.1.5 High-speed DFF Simulation 
The DFF is the core of the PRBS generator and checker. Figure 4-15 shows transient 
simulation results for the high-speed DFF, which works well with an 8 GHz clock. The 
DFF operates correctly with clock frequencies up to 10 GHz, as shown in Figure 4-16. 
 









Figure 4-16: High-speed DFF simulation with 10 GHz clock 
 
4.1.6 Clock Alignment Circuit Simulation 
The clock alignment circuit can vary the delay of the 8 GHz clock from 0 ps to 
125 ps with a tuning resolution of 18 ps. Figure 4-17 shows a transient analysis of 
the clock alignment circuit for seven different clock phases. 
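
The delay range and resolution can be related to the clock period with a short calculation. The sketch below assumes that the seven phases are uniformly spaced at the 18 ps resolution, starting from zero delay; the actual delay settings of the inverter chains may differ.

```python
CLOCK_HZ = 8e9    # clock frequency
STEP_PS = 18.0    # tuning resolution
NUM_PHASES = 7    # number of selectable clock phases

period_ps = 1e12 / CLOCK_HZ              # one clock period, equal to one UI at 8 Gb/s
steps_per_period = period_ps / STEP_PS
# Assumption: uniformly spaced settings starting at zero delay.
phase_delays_ps = [i * STEP_PS for i in range(NUM_PHASES)]
worst_case_misalignment_ps = STEP_PS / 2.0

print(f"Clock period: {period_ps:.0f} ps")
print(f"Resolution steps per period: {steps_per_period:.2f}")
print(f"Assumed phase delays (ps): {phase_delays_ps}")
print(f"Worst-case misalignment: {worst_case_misalignment_ps:.0f} ps")
```

The 125 ps range corresponds to one full period of the 8 GHz clock, so seven steps of about 18 ps are enough to cover every sampling position within a UI.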
 
Figure 4-17: Clock alignment circuit simulation 
4.1.7 On-chip Transceiver Simulation 
Figure 4-18 shows the bit error output of the PRBS checker, where only one error 
appears, at 0.5 ns. This error is caused by the data transmission delay. The blue and 
red waveforms are the input and output data of the transceiver. As shown in 
Figure 4-19, the output data eye opening is rail-to-rail with a 1.2 V swing. 
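
The role of the PRBS generator and checker can be illustrated with a behavioral model. The sketch below assumes a PRBS-7 sequence built from a 7-bit shift register and a self-synchronizing checker; the sequence length, the feedback taps and the checker structure are assumptions for illustration and are not taken from the circuit in this thesis.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS-7 sequence from a 7-bit LFSR (seven DFFs)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    out = []
    for _ in range(n_bits):
        # Feedback: XOR of the bits delayed by 7 and 6 clock cycles.
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return out


def check_prbs7(rx_bits):
    """Self-synchronizing checker: flag bits that break the PRBS-7 recurrence."""
    flags = []
    for n in range(7, len(rx_bits)):
        predicted = rx_bits[n - 7] ^ rx_bits[n - 6]
        flags.append(predicted ^ rx_bits[n])
    return flags


tx = prbs7(254)                      # two full periods of 127 bits
clean_flags = check_prbs7(tx)        # error-free link: no flags expected

rx = list(tx)
rx[50] ^= 1                          # inject a single bit error
error_flags = check_prbs7(rx)

print("Flags on clean data:", sum(clean_flags))
print("Flags after one injected error:", sum(error_flags))
```

In this style of checker a single received bit error raises three flags, once when the bad bit is compared and twice when it is used to predict later bits, which is worth keeping in mind when counting errors on the bit error output.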
 
Figure 4-18: PRBS checker output, input and output data of transceiver 
 
Figure 4-19: Eye-diagram of transceiver voltage output 
 
The overall dynamic power consumption is 2.05 mW, without the PRBS generator 





Figure 4-20: Power breakdown 
 
 
4.2 Layout of Blocks 
The layout for the proposed design was completed and submitted for fabrication in 
IBM 130nm technology. Each block is shown below. 
4.2.1 Hybrid Transmitter 
The hybrid transmitter is laid out so that its current performance matches the 
schematic. The layouts of the main driver and bias circuits use dummy fingers. 
A dummy finger is an extra transistor placed at the end of a row of fingers, 
which improves symmetry and matching by ensuring that all of the 




























The differential channels are implemented in the M5 metal layer. As shown in 
Figure 4-22, the channel layout has a zigzag shape. To shield the channels, 
ground planes are laid out on the layer below (M4) and the layer above (M6). The 
channel width and spacing are 1.5 µm and 0.4 µm, respectively. The spacing is the 
minimum value. Two 






characteristics in IBM 130nm technology. 
 
 
Figure 4-22: Layout of 5-mm on-chip channels  
 
4.2.3 Current Sense Amplifier 
Figure 4-23 shows the layout of the current sense amplifier. The cross-coupled 
NMOS transistors and diode-connected PMOS transistors are laid out with multiple 











Figure 4-23: Layout of current sense amplifier 
 
 
4.2.4 Slicer Circuit 
Figure 4-24 shows the layout of the slicer circuit. The cyan box is the layout of the 
comparator circuit, whose differential halves are well matched. The block at the top 
of the layout is the offset cancellation circuit. Each differential signal branch has 
5 capacitors used for offset compensation. The red box is the S/R latch. 
Because we did not have access to a standard-cell library in IBM 130nm technology, 




Figure 4-24: Layout of slicer circuit. 
 
 
4.2.5 High-speed DFF 
Figure 4-25 shows the layout of the high-speed DFF, which has approximate 
dimensions of 32 µm by 10 µm. The multiple fingers are used in order for these 




Figure 4-25: Layout of high speed DFF. 
 
4.2.1 PRBS Generator and Checker 
The PRBS generator and checker are shown in Figure 4-26 and Figure 4-27, 
respectively. The PRBS generator measures 130 µm by 22 µm. The PRBS checker is 
approximately 130 µm by 60 µm. The chip spends this area on the PRBS generator 
and checker in exchange for a shorter testing time. 
 




Figure 4-27: Layout of PRBS checker. 
 
4.2.6 Clock Alignment Circuit  




Figure 4-28: Layout of clock alignment circuit. 
 
 
4.2.7 Test Chip Layout 
The figures in this section show the layout of the 5-mm on-chip transceiver chip 
submitted for fabrication. It includes the power grid, de-coupling capacitors, pads, and 
electrostatic discharge protection blocks. The chip occupies an area of 1 mm². As 
shown in Figure 4-29, the pads on two sides of the chip are wire bonded to a 
package. The data output signal, clock signals and dummy channel outputs are intended 
to be probed directly, because the package would cause a large signal loss. 
 
 
Figure 4-29: Layout of 5-mm on-chip transceiver. 
 
Figure 4-30 is a close-up view of the transceiver blocks. The upper and lower green 
boxes are the receiver and the transmitter layouts with the PRBS generator and checker 
circuits. The two cyan boxes are the on-chip de-coupling capacitors. The blue box is 
the 30-bit shift register that controls the programmable circuits. The very 








































Figure 4-30: Layout of transceiver blocks with power grid 
 
4.2.8 Test Chip Post Layout Simulation 
Figure 4-31, Figure 4-32 and Figure 4-33 show simulation results using the 
layout-extracted parasitic values of the transmitter and receiver circuits. In this 
post-layout simulation, the channel is an RC channel model. The result in Figure 4-31 
shows that the laid-out transmitter and receiver work at 6 Gb/s, with an output voltage 
eye opening of about 1.6 V. In the 8 Gb/s schematic simulation, the output eye 
opening is 2.4 V. The achievable data rate therefore drops because of the extra 
parasitic capacitance and resistance. The most significant contribution comes from 
the comparator parasitic capacitance, which slows down the comparator charging and 
discharging phases. 
In order to improve the post-layout results, both the schematic and the layout can 
be improved. In the schematic, parasitic capacitances with approximate values 
should be included in the simulation at an early stage. This gives an early 
prediction of the performance achievable once the circuit is laid out. In the layout, 
the width and length of every wire should be chosen carefully, and each circuit block 
should have a guard ring. 
 
(a) 6 Gb/s post-layout simulation 
 
(b) 8 Gb/s schematic simulation 




Figure 4-32: Input and output channel currents at 6 Gb/s  
 










Chapter 5 Measurement Plan and Comparisons with State-of-the-art 
5.1 Measurement Plan 
The prototype chip has been fabricated in IBM 130 nm technology and will be 
tested soon. The PCB testing board needs ten I/O pins, including three voltage supplies, 
three digital inputs, a bit error output, a current bias input and two ground pins. Dummy 
channel pads, the data output pad and two clock signal pads are tested by probes due to 
their high-speed requirements.  
As shown in Figure 5-1, an Arduino UNO will serve as the microcontroller that 
sends bits through the PCB board to the on-chip shift register. This shift register 
holds control parameters for various aspects of the transceiver, such as offset 
cancellation and the tuning of RB. The 3.3 V supply from the Arduino UNO board is 
split into three different voltages for powering the digital circuits, setting the 
reference voltage for the PRBS blocks, and tuning the clock alignment circuit. The PCB 
board has a bit error output. Instead of using a piece of equipment as a current 
source, a backup current source chip, which can be tuned with a potentiometer, is 
implemented on the PCB board. As shown in Figure 5-2, the PRBS checker is able to 
drive a signal off the testing board to a 50 Ω scope. 
 
Figure 5-1: PCB testing board  
 
 
Figure 5-2: Bit error output signal  
 
Testing of the on-chip transceiver will start with DC performance checks. Only 
the voltage supplies for the analog and digital circuits will be connected. In order 
to monitor the current into the transceiver circuit, the analog supply voltage will be 
swept from zero to VDD.  
[Figure 5-2 annotations: bit sequence generated by PRBS checker; bit sequence with an error (one bit error); bit error at PRBS checker output; bit error output with a package model and 50 ohm load; 1 UI time markers] 
After the DC measurements are compared with simulation results, the shift register 
will be loaded through the Arduino digital inputs. The initial shift register bits set 
RB to 100 Ω, RT to 200 Ω, the offset cancellation circuit to its off state and the 
clock alignment circuit to the first inverter chain option. The reference voltage for 
the PRBS blocks will be tuned to 680 mV. The probes for the output data and clock 
signals will then be landed. Transient measurements will sweep the data rate from 
1 Gb/s to 8 Gb/s. The data output and bit error output will be captured by 
oscilloscopes, and the power consumption will be measured. 
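
The shift register loading step can be sketched as follows. The field names, field widths, bit ordering, code values and the two-wire data/clock protocol below are all hypothetical, since the bit map of the 30-bit shift register is not listed here; only the total length of 30 bits comes from the design. The sketch packs the control fields into a bit list and expands it into the sequence of (data, clock) pin states that a microcontroller would drive.

```python
# Hypothetical field layout: names, widths and ordering are assumptions.
FIELDS = [
    ("rb_code", 4),       # programmable RB setting
    ("rt_code", 4),       # programmable RT setting
    ("offset_code", 10),  # offset cancellation capacitors
    ("clk_phase", 3),     # clock alignment option
    ("reserved", 9),      # unused bits
]
TOTAL_BITS = sum(width for _, width in FIELDS)


def pack_control_word(values):
    """Pack field values into a list of bits, MSB first within each field."""
    bits = []
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        bits.extend((value >> i) & 1 for i in reversed(range(width)))
    return bits


def to_pin_sequence(bits):
    """Expand bits into (data, clock) pin states with one rising edge per bit."""
    sequence = []
    for bit in bits:
        sequence.append((bit, 0))  # set data while the clock is low
        sequence.append((bit, 1))  # rising clock edge shifts the bit in
    sequence.append((0, 0))        # return both pins to idle
    return sequence


# Hypothetical codes for the initial state described in the text.
initial = {"rb_code": 0, "rt_code": 1, "offset_code": 0, "clk_phase": 0}
word = pack_control_word(initial)
pins = to_pin_sequence(word)
print(f"{len(word)} bits, {len(pins)} pin updates")
```

On the Arduino, each (data, clock) pair would map to two digital pin writes; keeping the packing logic separate from the pin driving makes it easy to check the bit pattern before it is sent to the chip.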
The dummy channel can be characterized by S-parameter and pulse response 
measurements. A network analyzer and a pulse generator will be used for these tests. 
 
5.2 DC Measurements 
Figure 5-3 shows the measured quiescent current of the analog voltage supply. 
During the measurement, only the analog voltage supply powered the chip. 
The measurements show reasonable agreement with the simulation result. However, the 
measured quiescent currents are slightly smaller than the simulated ones, because the 
parasitic series resistance reduces the current. 
 
Figure 5-3: Quiescent current performance of analog voltage supply 
 
 
Figure 5-4 shows the measured quiescent current of the digital voltage supply 
while only the digital voltage supply powered the chip. The measured quiescent 
currents are larger than the simulation result. The extra current is consumed by 
components on the PCB, such as voltage regulators and operational amplifiers. 
 
Figure 5-4: Quiescent current performance of digital voltage supply 
 
5.3 Comparisons with State-of-the-art 
Table 1 shows comparisons between this work and several state-of-the-art designs 







Table 1. Performance Comparison Table 
 [1] [2] [6] [12] [13] This work 
Technology (nm) 65 65 28 90 28 130 






Data Rate (Gb/s) 3 9 20 2 8 8 
Supply Voltage (V) 0.9 1 1 1.2 0.9 1.2 
Channel loss 23 dB 9 dB 10.7 dB -- 32 dB 19 dB 
Channel type On-chip Off-chip On-chip On-chip On-chip On-chip 
Output channel swing 
5 mV, 
100 μA 





0.362 0.59 0.3 0.28 0.15 0.256 
Power consumption (mW) 1.086 5.31 6.1 0.56 1.2 2.05 
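
The energy efficiency of each design follows from the power consumption and data rate rows of Table 1, since power in mW divided by data rate in Gb/s gives energy per bit in pJ/b. The sketch below recomputes it for every column; the dictionary and function names are illustrative.

```python
# (power in mW, data rate in Gb/s) taken from Table 1.
DESIGNS = {
    "[1]": (1.086, 3),
    "[2]": (5.31, 9),
    "[6]": (6.1, 20),
    "[12]": (0.56, 2),
    "[13]": (1.2, 8),
    "This work": (2.05, 8),
}


def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ/b: mW divided by Gb/s."""
    return power_mw / data_rate_gbps


efficiency = {
    name: energy_per_bit_pj(power, rate) for name, (power, rate) in DESIGNS.items()
}

for name, value in efficiency.items():
    print(f"{name:>9}: {value:.3f} pJ/b")
```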
 
5.3.1 Comparing with [1] 
As shown in Table 1, the design in [1] consumes less power (1.086 mW versus 
2.05 mW), but its energy efficiency is worse (0.362 pJ/b versus 0.256 pJ/b). In the 
proposed design, the VM pre-emphasis depends on the data transition, as shown in 
Figure 5-5. After 2 UI, the pre-emphasis current is less than 10 μA, which is much 
smaller than the constant pre-emphasis current in [1].  
 
Figure 5-5: VM pre-emphasis current simulation 
 
5.3.2 Comparing with [2] 
The main differences between [2] and the proposed design are the control 
method for the VM pre-emphasis current and the receiver implementation. The 
proposed design can tune the output of the VM driver without a large on-chip storage 
capacitor. [2] targets an off-chip channel, which has lower channel loss. The proposed 
design has better energy efficiency, whereas [2] reaches a higher data rate of 9 Gb/s. 
Overall, smaller area and better energy efficiency are the main advantages of the 
proposed design. 
5.3.3 Comparing with [6] 
[6] introduced a very high-speed and efficient off-chip transceiver with a 
single-ended signaling approach, as shown in Figure 5-6. Like the proposed design, it 
places an impedance matching resistor at the receiver input and has no impedance 
matching circuit at the transmitter. Owing to its advanced technology, relatively low 
channel loss and good sampling performance at the receiver, [6] can operate at 
20 Gb/s. To reduce ripple on the sampler voltage, a large storage capacitance CCM is 
used at the receiver. To achieve its high data rate, the inverter-based receiver 
depends on transistor speed and on the large storage capacitor. Its performance is 
therefore difficult to reproduce in the 130 nm technology. 
 
Figure 5-6: Transceiver of [6] 
 
 
5.3.4 Comparing with [12], [13] 
Both [12] and [13] use capacitively driven transmitters, in which an AC-coupling 
capacitor conducts low-swing currents with pre-emphasis transitions into the channel. 
The data rate of [12] is lower than that of the other two designs because of its longer 
channel. [13] has the same data rate as our design and the lowest power consumption 
among the three. However, with a more advanced technology, our on-chip transceiver 
scheme could reach a higher data rate and lower power dissipation. For example, the 
size of MP3/4, shown in Figure 3-13, could be reduced in a newer technology while 
reaching the same transimpedance gain in the current sense amplifier circuit. In 
130 nm, a larger overdrive voltage is needed to obtain the required conductance, 
whereas a smaller overdrive voltage gives the same conductance in a more advanced 
technology. Reducing the supply voltage for the same driving current is another way 
to reduce power dissipation. 
  
Chapter 6 Conclusion and Future Work 
The thesis introduced an energy-efficient on-chip transceiver scheme in a 1.2 V, 
130 nm CMOS technology. The presented hybrid transmitter offers low power 
consumption and a simple control method, which allows the on-chip transceiver to 
operate at a high data rate with low power consumption. The on-chip channel model is 
designed to give a better understanding of how an on-chip channel performs in 
practice. At the receiver, the current sense amplifier is impedance matched with 
parallel resistors to achieve a higher data rate. An offset compensation scheme based 
on capacitive offset cancellation is also applied to the comparator. A PRBS generator 
and checker are implemented; they add chip area for testability, but they save testing 
time and cost. A custom high-speed DFF is used in the PRBS generator and checker to 
meet the high-speed requirement. In practice, the on-chip clock signal is not ideal, so 
a programmable clock delay block is designed to deliver the clock to the transmitter 
and receiver. The layout of the transceiver in IBM 130nm technology is also presented 
in this thesis. The chip will be tested when test equipment is available. Simulation 
results show that the transceiver operates at 8 Gb/s over a 5 mm differential channel. 
The overall dynamic power consumption is 2.05 mW, without the PRBS generator and 
checker. 
Future work for this project can be done in the following areas: 
• The termination approach at the receiver can be reconsidered when the 
main goal is to reduce the power consumption of the current sense amplifier 
while keeping the same data rate. 
• The hybrid transmitter can be reinvestigated to control the pre-emphasis 
amplitude and time constant separately. 
• The proposed design can be reinvestigated with a source-synchronous 
system in order to convey a clock signal properly aligned with the data 
sequence.  
• The on-chip transceiver can be designed for longer channels by adding 
extra repeater blocks. 
• In terms of a compact transceiver design, the proposed transmitter can be 






[1] S.H. Lee, S.K. Lee, B. Kim, H.J. Park, and J.Y. Sim, “Current-mode transceiver for 
silicon interposer channel,” IEEE J. Solid-State Circuits, vol. 49, no. 9, pp. 2044–
2053, Sep. 2014. 
[2] Il-Min Yi, et al. “A 40 mV-Differential-Channel-Swing Transceiver Using a RX 
Current-Integrating TIA and a TX Pre-Emphasis Equalizer With a CML Driver at 9 
Gb/s,” IEEE Trans. Circuits Syst. I, vol. 63, no. 1, pp. 122-133, Dec. 2015. 
[3] Ekaterina Laskin, “On-chip Self-test Circuit Blocks for High-speed Applications,” 
Master's thesis, University of Toronto, 2006. 
[4] Aida Todri-Sanial and Chuan Seng Tan, Physical Design for 3D Integrated Circuits, 
CRC Press, 2016. 
[5] B. Razavi, Design of Analog CMOS Integrated Circuits, International Edition. 
McGraw-Hill, 2001. 
[6] B. Dehlaghi and A. C. Carusone, “A 20 Gb/s 0.3 pJ/b single-ended die-to-die 
transceiver in 28 nm-SOI CMOS,” in Custom Integrated Circuits Conference (CICC) 
2015, San Jose, 2015. 
[7] B. Razavi, Design of Integrated Circuits for Optical Communications, Second 
Edition. New Jersey. Wiley, 2012. 
[8] H. Song, S. Kim, D.K. Jeong, “A Reduced-Swing Voltage-Mode Driver for Low-
Power Multi-Gb/s Transmitters,” Journal of Semiconductor Technology and Science, 
vol. 9, no. 2, June 2009.  
[9] C.J. Chao, S.C. Wong, M.J. Chen, B.K. Liew, “An Extraction Method to Determine 
Interconnect Parasitic Parameters” IEEE Transactions on Semiconductor 
Manufacturing, Vol. 11, No. 4. Nov. 1998. 
[10] Neil H. E. Weste and David Money Harris, Integrated Circuit Design, Fourth 
Edition. Global Edition. Pearson, 2011. 
[11] Sam Palermo, “Design of High-speed Optical Interconnect Transceivers,” Ph.D. 
thesis, Stanford University, 2007. 
[12] E. Mensink, D. Schinkel, E. A. M. Klumperink, E. van Tuijl, and B. Nauta, “Power 
efficient gigabit communication over capacitively driven RC-limited on-chip 
interconnects,” IEEE J. Solid-State Circuits, vol. 45, no. 2, pp. 447–457, Feb. 2010.  
[13] M. H. Nazari and A. Emami-Neyestanak, "A 20Gb/s 136fJ/b 12.5Gb/s/ μm on-chip 
link in 28nm CMOS," 2013 IEEE Radio Frequency Integrated Circuits Symposium 
(RFIC), Seattle, WA, pp.257-260, 2013. 
[14] G.S. Jeong, et al., “A 20 Gb/s 0.4 pJ/b energy-efficient transmitter driver utilizing 
constant Gm bias,” IEEE J. Solid-State Circuits, vol. 51, no. 10, Oct. 2016. 
[15] S. W. Golomb, Shift Register Sequences. San Francisco, CA: Holden-Day, Inc., 
1967. 
[16] S. S. Mohan, et al., “Bandwidth Extension in CMOS with Optimized On-Chip 
Inductors,” IEEE J. Solid-State Circuits, vol. 35, no. 3, Mar. 2000. 
[17] B. Kim et al., “An energy-efficient equalized transceiver for RC-dominant channels,” 
IEEE J. Solid-State Circuits, vol. 45, no. 6, pp. 1186–1197, 2010. 
[18] H. Lu, H.W. Wang, C. Su, and C.N. J. Liu, “Design of an all-digital LVDS driver,” 
IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 8, pp. 1635–1644, Aug. 2009. 
[19] X. Jia and G. E. R. Cowan, “A 8-Gb/s 0.256-pJ/b Transceiver for 5-mm on-Chip 
Interconnects in 130-nm CMOS,” IEEE ISCAS 2017, May 2017. 
