Modeling, Optimization and Power Efficiency Comparison of High-speed Inter-chip Electrical and Optical Interconnect Architectures in Nanometer CMOS Technologies by Palaniappan, Arun
 MODELING, OPTIMIZATION AND POWER EFFICIENCY COMPARISON OF 
HIGH-SPEED INTER-CHIP ELECTRICAL AND OPTICAL INTERCONNECT 
ARCHITECTURES IN NANOMETER CMOS TECHNOLOGIES 
 
 
A Thesis 
by 
ARUN PALANIAPPAN  
 
 
Submitted to the Office of Graduate Studies of 
Texas A&M University 
in partial fulfillment of the requirements for the degree of  
MASTER OF SCIENCE 
 
 
 
December 2010 
 
 
Major Subject: Electrical Engineering
Approved by: 
Chair of Committee,  Samuel Palermo 
Committee Members, Jose Silva-Martinez 
 Jiang Hu 
 Duncan Walker 
Head of Department, Costas Georghiades 
 
ABSTRACT 
 
Modeling, Optimization and Power Efficiency Comparison of High-speed Inter-chip 
Electrical and Optical Interconnect Architectures in Nanometer CMOS Technologies. 
(December 2010) 
Arun Palaniappan, B.E., Anna University 
Chair of Advisory Committee: Dr. Samuel Palermo 
 
Inter-chip input-output (I/O) communication bandwidth demand has scaled rapidly
with integrated circuit technology, and electrical links have relied on equalization
techniques to operate reliably on band-limited channels at the cost of additional power
and area. High-bandwidth inter-chip optical interconnect architectures have the potential
to address this increasing I/O bandwidth demand. For future tera-scale systems, power dissipation of
the high-speed I/O link becomes a significant concern. This work presents a design flow 
for the power optimization and comparison of high-speed electrical and optical links at a 
given data rate and channel type in 90 nm and 45 nm CMOS technologies.  
The electrical I/O design framework combines statistical link analysis 
techniques, which are used to determine the link margins at a given bit-error rate (BER), 
with circuit power estimates based on normalized transistor parameters extracted with a 
constant current density methodology to predict the power-optimum equalization 
architecture, circuit style, and transmit swing at a given data rate and process node for 
three different channels. The transmitter output swing is scaled to operate the link at 
optimal power efficiency. Under consideration for optical links are a near-term 
architecture consisting of discrete vertical-cavity surface-emitting lasers (VCSEL) with 
p-i-n photodetectors (PD) and three long-term integrated photonic architectures that use 
waveguide metal-semiconductor-metal (MSM) photodetectors and either electro-
absorption modulator (EAM), ring resonator modulator (RRM), or Mach-Zehnder 
modulator (MZM) sources. The normalized transistor parameters are applied to jointly 
optimize the transmitter and receiver circuitry to minimize total optical link power 
dissipation for a specified data rate and process technology at a given BER. 
Analysis results show that low-loss channel characteristics and minimal circuit
complexity, together with scaling of transmitter output swing, allow electrical links to
achieve excellent power efficiency at high data rates. While the high-loss channel is 
primarily limited by severe frequency dependent losses to 12 Gb/s, the critical timing 
path of the first tap of the decision feedback equalizer (DFE) limits the operation of low-
loss channels above 20 Gb/s. Among the optical links, the VCSEL-based link is limited 
by its bandwidth and maximum power levels to a data rate of 24 Gb/s whereas EAM and 
RRM are both attractive integrated photonic technologies capable of scaling data rates 
past 30 Gb/s achieving excellent power efficiency in the 45 nm node and are primarily 
limited by coupling and device insertion losses. While MZM offers robust operation due
to its wide optical bandwidth, significant improvements in its power efficiency must be
achieved for it to become suitable for high-density applications.
DEDICATION 
 
To my parents and brother 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
ACKNOWLEDGEMENTS 
 
 I would first like to thank my advisor, Dr. Samuel Palermo, for his guidance, 
encouragement, motivation and support from the initial stages of research to the 
completion of the thesis. His technical expertise, insightful thinking and attention have 
been of great value and helped me to shape the thesis through the various discussions. I 
would like to thank all my committee members, Dr. Jose Silva-Martinez, Dr. Jiang Hu, 
Dr. Duncan Walker and also Dr. Henry Pfister, who served as a substitute committee 
member for my defense, for their time, support and valuable suggestions. I thank 
Younghoon Song for his help with Stateye simulations.  I would also like to thank all my 
past and present roommates and friends, especially Karthik, Somasundaram, Naresh, 
Shriram, Rajesh, Seenu, Rakesh, Vivekananth, Ramanathan, Vaidyanathan, Senthil, 
Sivasankar, Elango and Thanigaivel, for all the discussions and support that made my 
stay memorable. Lastly, I thank my parents and brother for their love, sacrifice, 
encouragement and constant support, which made this graduate school journey possible.  
TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER
 I INTRODUCTION
   1.1 Organization of Thesis

 II BACKGROUND
   2.1 Electrical Interconnect Architecture
       2.1.1 High-Speed Electrical Links
       2.1.2 Equalization Systems
   2.2 Optical Interconnect Architecture
       2.2.1 Vertical Cavity Surface Emitting Laser (VCSEL)
       2.2.2 Electroabsorption Modulator (EAM)
       2.2.3 Ring Resonator (RR) Modulator
       2.2.4 Mach-Zehnder (MZ) Modulator
       2.2.5 Optical Photodetector and Receiver
   2.3 Summary

 III OPTIMIZATION METHODOLOGY AND PERFORMANCE COMPARISON OF
     ELECTRICAL AND OPTICAL LINKS
   3.1 Previous Work
   3.2 Optimization Methodology
   3.3 Comparison of Electrical and Optical Link Performance
       3.3.1 Electrical Link Performance
       3.3.2 Optical Link Performance
       3.3.3 Comparison of Electrical and Optical Links
   3.4 Summary

 IV CONCLUSION AND FUTURE WORK
   4.1 Future Work

REFERENCES
VITA
 
 
 
 
 
 
 
 
 
 
 
 
 
 
LIST OF FIGURES

FIGURE
 1 I/O scaling projections
 2 Block diagram of high-speed electrical link
 3 Channel frequency response
 4 Transmitter with 4-tap feed-forward equalization
 5 Continuous time linear equalizer
 6 Receiver with 5-tap decision feedback equalization
 7 Block diagram of optical link
 8 Block diagram of VCSEL
 9 VCSEL optical power vs. current
 10 Modulation response of tunnel junction VCSEL
 11 Schematic of VCSEL current mode driver
 12 Quasi-waveguide angled-facet electroabsorption modulator
 13 Schematic of electroabsorption modulator driver circuit
 14 Block diagram of ring resonator
 15 Block diagram of Mach-Zehnder modulator
 16 Schematic of Mach-Zehnder modulator driver circuit
 17 Transimpedance amplifier based receiver architecture
 18 Modeling vs. SPICE results of the power dissipation of CTLE as a function of 3-dB bandwidth
 19 Modeling vs. SPICE results of the power efficiency of CTLE as a function of 3-dB bandwidth
 20 Stateye results visualized using statistical eye diagram
 21 Transmitter power efficiency vs. data rate as a function of number of taps using CML circuits
 22 Transmitter power efficiency vs. data rate as a function of number of taps using CMOS circuits
 23 DFE power efficiency vs. data rate as a function of number of taps
 24 CTLE power efficiency vs. data rate as a function of CTLE peak gain
 25 Critical timing path of the decision feedback equalizer
 26 Flowchart of electrical link optimization methodology
 27 Transistor fT versus input Vgs voltage in 90 nm CMOS process
 28 Transistor fT versus input Vgs voltage in 45 nm CMOS process
 29 Flowchart of optical link optimization methodology
 30 Receiver sensitivity vs. data rate for 45 nm integrated photodetector
 31 Total power efficiency of electrical link vs. data rate using CML circuits with TX output swing optimized
 32 Total power efficiency of electrical link vs. data rate using CML circuits with and without TX output swing optimization in 90 nm process
 33 Total power efficiency of electrical link vs. data rate using CML vs. CMOS circuits in 90 nm process
 34 Optimal solution with minimum power efficiency vs. data rate
 35 Optimized equalization architecture vs. data rate
 36 Total power efficiency of VCSEL based link vs. data rate
 37 Total power efficiency of QWAFEM based link vs. data rate
 38 Total power efficiency of waveguide based EAM link vs. data rate
 39 Total power efficiency of ring resonator based link vs. data rate
 40 Total power efficiency of Mach-Zehnder modulator based link vs. data rate
 41 Total power efficiency of electrical and optical link vs. data rate in 90 nm CMOS process
 42 Total power efficiency of electrical and optical link vs. data rate in 45 nm CMOS process
 43 Power efficiency of the CTLE with 12-dB peak gain vs. data rate for different supply voltages
 44 Power efficiency of the CTLE with 6-dB peak gain vs. data rate for different supply voltages
 45 Power efficiency vs. data rate as a function of BER for the T20 channel in 90 nm
 46 Optimal equalization architecture vs. data rate as a function of BER for the T20 channel in 90 nm
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
LIST OF TABLES

TABLE
 I Electrical link I/O specifications
 II Optical device parameters (parameters in parentheses refer to 45 nm CMOS technology)
 III Discrete optical link budget
 IV Integrated optical link budget
CHAPTER I 
INTRODUCTION 
Integrated circuit technology scaling has dramatically increased on-chip
processing and computational capabilities, driving the transition from single-core
processors to multi-core and many-core processors. These architectures will require an
on-chip communication bandwidth extending from hundreds of GB/s into the TB/s range [1],
thereby necessitating a commensurate increase in the inter-chip input-output (I/O)
communication bandwidth. As shown in Fig. 1 [2], this trend is expected to continue,
requiring a large increase in I/O bandwidth, which, however, has not scaled at the same
pace as processor performance.
The electrical channel bandwidth has not been able to keep up with this rapidly
increasing inter-chip communication bandwidth requirement due to severe high-frequency
channel losses and non-idealities such as reflections due to impedance
discontinuities and signal crosstalk. Though technology limitations have been
resolved through circuit scaling, channel limitations have created a bottleneck for
high-speed I/O link design. To overcome the channel constraints and meet the
bandwidth demands, sophisticated equalization circuits [3], [4] are implemented, which
increases power and area. This creates the need for low-power architectural techniques
that can significantly improve power efficiency to stay within the I/O power budgets.
____________ 
This thesis follows the style of IEEE Transactions on Circuits and Systems. 
 
 
Fig. 1 I/O scaling projections [2] 
 
Increasing inter-chip communication bandwidth demand has motivated
investigation into using optical interconnect architectures in place of their channel-limited
electrical counterparts. Optical interconnects, with negligible frequency-dependent loss
and high bandwidth [5], provide a viable alternative to achieve dramatic power
efficiency improvements at per-channel data rates exceeding 10 Gb/s. Optical
interconnects also mitigate the effects of non-idealities of electrical interconnects such as 
skew, attenuation, impedance discontinuities, reflections and crosstalk [5] without the 
need for additional equalization complexity. This has motivated extensive research into 
optical interconnect technologies suitable for high density integration with CMOS chips. 
The power efficiency of the optical interconnect architectures should be superior to the 
electrical links in order for them to be a viable alternative to the high-speed electrical 
links. Design techniques that optimize and minimize the power consumption of the 
optical links are essential to demonstrate the superior performance of the optical links 
over electrical links. 
The objective of this thesis is to develop a design flow that jointly optimizes
the transmitter and receiver circuits of high-speed links to find the most power-efficient
solution for a given data rate, channel type and circuit technology node. It specifically
focuses on comparing the power efficiency of electrical and optical interconnect
architectures in nanometer CMOS technologies. The key electrical and optical link
parameters that affect the power dissipation of the link are identified, and the
improvements that could be made to obtain better power efficiency are addressed. The
impact of technology scaling on power efficiency is considered by comparing the link
performance in 90 nm and 45 nm CMOS processes.
1.1 Organization of Thesis 
Chapter II presents an overview of electrical and optical interconnect 
architectures considered in this thesis. It describes the bandwidth limitations of the 
electrical channels and the additional equalization techniques necessary to overcome the 
channel limitations. It further discusses the advantages of optical links over their
electrical counterparts and the transmitter and receiver circuits necessary to interface and
build high-speed optical links.  
In Chapter III, a design flow for the optimization methodology that computes the 
power efficient solution is discussed. Using the optimization methodology, a power 
efficiency comparison between electrical and optical interconnect architectures is 
presented. The merits and demerits of each architecture are discussed along with
the impact of key link parameters and technology scaling on power dissipation.
Finally, Chapter IV concludes the thesis with a performance summary of the
comparison between electrical and optical interconnect architectures.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER II 
BACKGROUND* 
This chapter presents an overview of high-speed electrical and optical
interconnect architectures used in inter-chip communication applications. It begins with
a discussion of electrical interconnect architectures, describing the non-idealities of the
electrical channel and the challenges associated with increasing data rates, followed by
a presentation of the equalization techniques required to achieve reliable
communication at higher data rates. The second half of the chapter presents
optical interconnect architectures and the different optical sources and detectors along with
their trade-offs, and concludes with a discussion of the associated electrical circuits for
the optical devices.
2.1 Electrical Interconnect Architecture 
With ever-increasing data rates, high-speed electrical links have become
common in applications such as communication systems and multi-processor computer
systems. To achieve reliable performance over band-limited channels, high-speed
electrical link systems employ sophisticated equalization. This
section outlines the major components of high-speed electrical links, followed by a
brief discussion of the electrical channel non-idealities. It also discusses the equalization
schemes used to overcome the bottleneck created by channel limitations.
____________ 
*Part of this chapter is reprinted with permission from “Power efficiency comparisons of interchip optical 
interconnect architectures,” by A. Palaniappan and S. Palermo, May 2010, IEEE Transactions on Circuits 
and Systems II, vol. 57, no. 5, pp. 343-347. Copyright 2010 IEEE 
2.1.1 High-Speed Electrical Links 
 
 
Fig. 2 Block diagram of high-speed electrical link 
 
The block diagram of a typical high-speed electrical link is shown in Fig. 2. The
parallel data at the transmitter (TX) is serialized due to the limited availability of I/O
pins on a chip and sent through the electrical channel. At the receiver (RX), the received
signal is amplified, sampled and restored prior to being deserialized for processing. The
transmitter and receiver data are synchronized using high-speed clocks generated by
the phase-locked loop (PLL) in the transmitter and the timing recovery system in the
receiver, which can be either a source-synchronous forwarded-clock architecture [6] or an
embedded-clock architecture [7].
The backplane channel responses [8] used in this work are shown in Fig. 3. All of
the channels have linecard traces between 5" and 6" and varying backplane trace lengths: a
short 1" channel (B1) with bottom traces, a 20" channel (C4) with a bottom stripline layer,
and a 20" channel (T20) with top traces.  The electrical channel has a typical low-pass
   
channel characteristic limited at high frequencies due to severe channel losses. The
physical dimensions and quality of the channel, together with components
such as vias and stubs, largely determine the channel characteristic and how its magnitude
response varies with frequency. With increasing data rates, the high-frequency loss of
the channel attenuates and spreads the transmitted signal in time, affecting adjacent bits
and resulting in inter-symbol interference (ISI).
 
 
Fig. 3 Channel frequency response 
 
Inter-symbol interference is a deterministic noise source that degrades the signal due to
the physical properties of the channel and limits the maximum achievable data rate. At
high frequencies, the losses of the channel can primarily be attributed to the skin effect
and dielectric losses. Skin effect, which refers to the electric current crowding near the
surface of the conductor, causes the effective conductor resistance, and hence the loss, to
increase with the square root of frequency [9], while the dielectric loss, attributed to
energy dissipated in the dielectric, has a linear dependency on frequency [9]. Signal
interference also results from reflections
caused by impedance discontinuities and crosstalk between neighboring signal lines.
Impedance discontinuities due to termination mismatches, via stubs and parasitic
capacitances result in multiple reflections and attenuation of the signal, thereby
producing inter-symbol interference. Crosstalk, which arises due to capacitive and
inductive coupling between different signal lines, also causes significant signal
interference. Crosstalk can be either far-end (FEXT) or near-end (NEXT), and both
can limit the scaling of links to higher data rates.
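
The frequency dependence described above can be illustrated with a short numerical sketch. It assumes a simple per-length loss model, in dB, with a skin-effect term proportional to the square root of frequency and a dielectric term proportional to frequency. The function name and the coefficients K_SKIN and K_DIEL are hypothetical values chosen only for illustration; they are not fitted to the B1, C4 or T20 channels used in this work.

```python
import math

# Hypothetical loss coefficients, chosen for illustration only (not measured data).
K_SKIN = 2.0e-6   # skin-effect coefficient, dB per inch per sqrt(Hz)
K_DIEL = 5.0e-11  # dielectric coefficient, dB per inch per Hz


def channel_loss_db(freq_hz, length_in, k_skin=K_SKIN, k_diel=K_DIEL):
    """Approximate trace loss in dB: skin term ~ sqrt(f), dielectric term ~ f."""
    skin = k_skin * math.sqrt(freq_hz)
    dielectric = k_diel * freq_hz
    return length_in * (skin + dielectric)


if __name__ == "__main__":
    for f_ghz in (1.0, 2.5, 5.0, 10.0):
        loss = channel_loss_db(f_ghz * 1e9, length_in=20.0)
        print(f"{f_ghz:5.1f} GHz: {loss:6.2f} dB over 20 inches")
```

In this model the dielectric term grows linearly while the skin-effect term grows only as the square root of frequency, so the dielectric term eventually dominates as the frequency increases.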
2.1.2 Equalization Systems 
With channel limitations creating bottlenecks for high-speed electrical links
and the need to scale to higher data rates, sophisticated equalization circuits are
implemented. Equalization techniques mitigate the effects of inter-symbol
interference and compensate for the frequency-dependent loss of the channel to achieve
reliable performance at higher data rates. Equalization can be implemented in the
transmitter, in the receiver, or in both, depending on the
channel response and I/O specifications for a given data rate.  This section presents an
overview of equalization circuits typically implemented in high-speed links.
Transmitter (TX) side equalization, typically implemented as linear feed-forward
equalization (FFE), is one of the most common equalization techniques used in high-speed
links [10] due to its simple architecture and ease of implementation.  Transmitter
feed-forward equalization is realized as a finite impulse response (FIR) filter that
attenuates the low-frequency content in order to flatten the frequency response of the
combined channel and TX FIR filter up to the Nyquist frequency. It pre-distorts
or pre-shapes the transmitted signal in order to equalize the distortion caused by the
channel response. Transmit-side equalization allows the cancellation of pre-cursor
inter-symbol interference and, since equalization is done at the transmitter before the
channel, it does not amplify noise or crosstalk. However, since the transmitter is
peak-power limited, low-frequency signal content is attenuated down to the high-frequency
level. A half-rate transmitter feed-forward equalization architecture [3] with a
maximum complexity of 4 taps (main tap, 1 pre-cursor tap and 2 post-cursor taps), as
shown in Fig. 4, is utilized in this work.
 
 
Fig. 4 Transmitter with 4-tap feed-forward equalization [3] 
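
The pre-shaping performed by the TX FIR filter can be sketched in a few lines. The example below assumes NRZ symbols of +1 and -1 and a hypothetical 4-tap coefficient set (one pre-cursor, the main tap, two post-cursors) whose absolute values sum to 1, which models the peak-power constraint. The tap values and function name are illustrative assumptions and are not the optimized coefficients from this work.

```python
# Hypothetical tap weights: [pre-cursor, main, post-cursor 1, post-cursor 2].
# Their absolute values sum to 1.0 to model a peak-power-limited transmitter.
TAPS = [-0.05, 0.70, -0.20, -0.05]


def ffe_output(bits, taps=TAPS, num_pre=1):
    """Apply a transmit FIR filter to a bit sequence mapped to +/-1 symbols."""
    symbols = [1.0 if b else -1.0 for b in bits]
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, w in enumerate(taps):
            idx = n + num_pre - k  # k = 0 weights the next (pre-cursor) symbol
            if 0 <= idx < len(symbols):
                acc += w * symbols[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    low_freq = ffe_output([1] * 8)                    # long run of ones
    nyquist = ffe_output([1, 0, 1, 0, 1, 0, 1, 0])    # alternating pattern
    print("level in a long run:", round(low_freq[3], 3))
    print("level when alternating:", round(nyquist[4], 3))
```

With these assumed taps, a long run of identical bits settles to the sum of the tap weights, while an alternating pattern keeps a larger amplitude. This is the low-frequency attenuation relative to the high-frequency content described above.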
Due to signal integrity considerations, the transmitter driver circuit requires a 50 Ω
output impedance. To produce the required transmit output swing, either a current-mode
driver or a voltage-mode driver is a suitable candidate for the output stage of the
transmitter. Current-mode drivers [11], which use Norton-equivalent parallel termination,
require large currents to produce the required output voltage swing, resulting in high
power dissipation. Voltage-mode drivers [12], which use Thevenin-equivalent
series termination, potentially require one fourth [13] of the current required by
current-mode drivers for the same output swing, but they need additional feedback control
to set the series termination. This work uses a current-mode
logic driver circuit with a high output common-mode level for the output stage.
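
The one-fourth current ratio can be checked with an idealized calculation. The sketch below assumes differential signaling in a 50 Ω environment with ideal termination at both ends, and it ignores pre-driver and control-loop power. Under these assumptions a current-mode driver steers its tail current into 25 Ω per side (the 50 Ω transmit termination in parallel with the 50 Ω line), while a voltage-mode driver drives a 200 Ω loop (two 50 Ω series terminations plus the 100 Ω differential receiver termination). The function names are hypothetical.

```python
Z0 = 50.0  # assumed characteristic impedance in ohms


def current_mode_current(vdiff_pp, z0=Z0):
    """Tail current for a differential peak-to-peak swing.

    Each side sees z0/2, so the swing is 2 * I * (z0 / 2) = I * z0.
    """
    return vdiff_pp / z0


def voltage_mode_current(vdiff_pp, z0=Z0):
    """Loop current for the same swing with series termination.

    The loop resistance is 4*z0 and the receiver sees 2*z0, so the
    differential peak-to-peak swing is 2 * I * (2 * z0) = 4 * I * z0.
    """
    return vdiff_pp / (4.0 * z0)


if __name__ == "__main__":
    swing = 1.0  # differential peak-to-peak swing in volts (example value)
    i_cm = current_mode_current(swing)
    i_vm = voltage_mode_current(swing)
    print(f"current-mode: {i_cm * 1e3:.1f} mA, voltage-mode: {i_vm * 1e3:.1f} mA")
```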
 This work investigates the use of both current-mode logic (CML) circuits and
CMOS logic circuits for the different building blocks in the transmitter, such as latches,
multiplexers and XOR gates. As data rates scale higher, faster current-mode logic circuits are
considered for the transmitter circuits. Current-mode logic circuits are popular in
high-speed designs because of their reduced voltage swing, better supply noise immunity
and robust performance [14]. However, the static power dissipation of current-mode
logic circuits increases the overall power consumption of the circuit.
 Receiver (RX) side equalization is typically implemented as a linear FIR
filter, a continuous-time linear equalizer (CTLE) or a non-linear decision feedback equalizer
(DFE). Though the RX FIR filter can implement the same functions as the TX FIR filter and
can adaptively tune the filter tap coefficients without any back channel, it amplifies
high-frequency noise and crosstalk along with the input signal. Implementing the analog
delay elements is also a challenge when using RX FIR filters. The CTLE
is realized as a differential amplifier with programmable RC
degeneration that controls the amount of peaking used to compensate for the low-pass
channel response. The continuous-time linear equalizer, as shown in Fig. 5, is a simple
structure that provides gain and equalization with low power and area complexity.
However, similar to the RX FIR filter, it amplifies noise and crosstalk along with the
input signal and is limited to first-order compensation. This work analyzes the use of
a continuous-time linear equalizer for receiver-side equalization. The transfer function of the
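CTLE is given by [15]

$$H(s) = \frac{g_m}{C_P}\,\frac{s + \dfrac{1}{R_s C_s}}{\left(s + \dfrac{1 + g_m R_s/2}{R_s C_s}\right)\left(s + \dfrac{1}{R_D C_P}\right)}$$

$$\omega_z = \frac{1}{R_s C_s}, \qquad \omega_{p1} = \frac{1 + g_m R_s/2}{R_s C_s}, \qquad \omega_{p2} = \frac{1}{R_D C_P}$$
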
where gm is the transconductance of the input transistor, Rs is the degeneration resistance,
Cs is the degeneration capacitance, RD is the output resistance, CP is the load capacitance,
ωz is the zero location, ωp1 is the first pole location and ωp2 is the second pole location.
The ideal peaking is a function of the ratio between the locations of the first pole and the zero.
 
 
Fig. 5 Continuous time linear equalizer 
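
The relationship between the zero, the two poles and the peaking can be explored numerically. The sketch below assumes the commonly used source-degenerated differential-pair model, in which Rs is the full resistor connected between the two source nodes (hence the factor gm·Rs/2), with ωz = 1/(Rs·Cs), ωp1 = (1 + gm·Rs/2)/(Rs·Cs) and ωp2 = 1/(RD·CP). This form and the component values are assumptions made for illustration and are not taken from the circuits designed in this work.

```python
import math

# Hypothetical component values, for illustration only.
GM = 10e-3    # input transconductance, siemens
RS = 400.0    # degeneration resistance, ohms
CS = 100e-15  # degeneration capacitance, farads
RD = 200.0    # output (load) resistance, ohms
CP = 50e-15   # load capacitance, farads


def ctle_pole_zero(gm, rs, cs, rd, cp):
    """Return (wz, wp1, wp2) in rad/s for the assumed degenerated-pair model."""
    wz = 1.0 / (rs * cs)
    wp1 = (1.0 + gm * rs / 2.0) / (rs * cs)
    wp2 = 1.0 / (rd * cp)
    return wz, wp1, wp2


def ctle_gain(freq_hz, gm, rs, cs, rd, cp):
    """Magnitude of H(s) = (gm/cp) * (s + wz) / ((s + wp1) * (s + wp2))."""
    wz, wp1, wp2 = ctle_pole_zero(gm, rs, cs, rd, cp)
    s = 1j * 2.0 * math.pi * freq_hz
    return abs((gm / cp) * (s + wz) / ((s + wp1) * (s + wp2)))


if __name__ == "__main__":
    wz, wp1, wp2 = ctle_pole_zero(GM, RS, CS, RD, CP)
    print(f"ideal peaking = wp1/wz = {wp1 / wz:.2f} "
          f"({20 * math.log10(wp1 / wz):.1f} dB)")
    for f_ghz in (0.0, 2.0, 8.0, 16.0):
        g = ctle_gain(f_ghz * 1e9, GM, RS, CS, RD, CP)
        print(f"{f_ghz:5.1f} GHz: gain = {20 * math.log10(g):6.2f} dB")
```

In this model the low-frequency gain is gm·RD/(1 + gm·Rs/2), so increasing the degeneration resistance raises the peaking by lowering the DC gain rather than by raising the peak gain.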
 
Another receiver-side equalization technique commonly implemented in high-speed links is
decision feedback equalization (DFE).  The decision feedback equalizer is a non-linear
equalizer that cancels the effect of inter-symbol interference on the current symbol using
the previously detected symbols. Unlike linear receiver equalization, decision feedback
equalization does not amplify noise and crosstalk, because the feedback is derived from the
quantized input signal; it thereby boosts only the high-frequency signal content. The filter tap
coefficients can be adaptively tuned without the need for any back channel. However,
decision feedback equalization cannot cancel pre-cursor inter-symbol interference and
also has the potential for error propagation if a previously resolved decision is
wrong. A half-rate direct-feedback decision feedback equalizer implementation [4]
with up to 5 taps, as shown in Fig. 6, is considered in this work. The main
challenge in designing the decision feedback equalizer is the critical timing path of the
first feedback tap, which must settle within one unit interval (1 UI).
 
 
Fig. 6 Receiver with 5-tap decision feedback equalization [4] 
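
The feedback operation can be illustrated with a small behavioral sketch. It assumes ideal symbol-rate sampling, a noise-free channel with two hypothetical post-cursor ISI terms, and DFE tap weights set exactly equal to those post-cursors. The function names and numeric values are assumptions for illustration and do not model the half-rate circuit of Fig. 6.

```python
POST_CURSORS = [0.7, 0.4]  # hypothetical post-cursor ISI relative to a main cursor of 1.0


def apply_channel(symbols, post_cursors):
    """Add post-cursor ISI: x[n] = s[n] + sum_k h[k] * s[n-k]."""
    received = []
    for n, s in enumerate(symbols):
        x = s
        for k, h in enumerate(post_cursors, start=1):
            if n - k >= 0:
                x += h * symbols[n - k]
        received.append(x)
    return received


def slicer(samples):
    """Plain threshold detector with no equalization."""
    return [1.0 if x >= 0 else -1.0 for x in samples]


def dfe_detect(samples, taps):
    """Subtract ISI estimated from previous decisions, then slice."""
    decisions = []
    for x in samples:
        # Tap 1 pairs with the most recent decision, tap 2 with the one before it.
        isi = sum(w * d for w, d in zip(taps, reversed(decisions)))
        decisions.append(1.0 if x - isi >= 0 else -1.0)
    return decisions


def unit_interval_ps(data_rate_gbps):
    """Bit period in picoseconds, the settling budget of the first tap."""
    return 1000.0 / data_rate_gbps


if __name__ == "__main__":
    sent = [1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
    rx = apply_channel(sent, POST_CURSORS)
    print("slicer only correct:", slicer(rx) == sent)
    print("with DFE correct:   ", dfe_detect(rx, POST_CURSORS) == sent)
    print("1 UI at 20 Gb/s:", unit_interval_ps(20.0), "ps")
```

Because each decision feeds the next one, the subtraction for the first tap has to complete within one bit period, which is the timing constraint noted above.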
 
The critical block in the receiver circuit is the sampler/comparator block, which
quantizes the input signal and resolves it to CMOS logic levels. The strong-arm latch [16], also
called a sense amplifier, is a clocked regenerative amplifier that amplifies the input
signal using positive-feedback latches [17].  Unlike linear amplifiers, sense amplifiers do not
have any static power dissipation and are very
power efficient. The feedback multiplexer [4] used in the decision feedback equalization
circuit is split into two current-mode logic (CML) multiplexers,
which improves the signal integrity of the fed-back signal by isolating the decisions from the
sense amplifier.
2.2 Optical Interconnect Architecture 
The many problems of electrical channels have motivated interest in the use 
of optical interconnects as a viable alternative to electrical interconnects. Optical links 
typically operate in the wavelength range of 850-1550 nm, corresponding to a carrier 
frequency range of roughly 200-350 THz, and therefore offer the potential for 
terahertz-range bandwidth on a single optical link. Optical interconnects have negligible 
frequency-dependent loss at short distances, around 0.2 dB/km at a wavelength of 
1550 nm [18], and can therefore operate at high data rates without the need for any 
equalization circuits. The short wavelengths allow light to be focused onto small areas, 
enabling high-density interconnections without significant crosstalk [19]. Impedance 
matching in optical links is achieved using a simple resonant impedance transformer, 
namely an anti-reflection coating [5]. Optical links also support wavelength division 
multiplexing techniques, which could result in a huge increase in the capacity of 
interconnect systems.   
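
The wavelength-to-frequency correspondence and the fiber loss figure quoted above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the 0.2 dB/km default is the 1550 nm figure from the text.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def optical_frequency_thz(wavelength_nm):
    """Carrier frequency in THz for a given free-space wavelength in nm."""
    return C_LIGHT / (wavelength_nm * 1e-9) / 1e12


def fiber_loss_db(length_m, db_per_km=0.2):
    """Attenuation in dB over a fiber of the given length."""
    return db_per_km * length_m / 1000.0
```

For example, `optical_frequency_thz(1550)` is about 193 THz and `optical_frequency_thz(850)` is about 353 THz, while a 1 m fiber at 0.2 dB/km loses only 0.0002 dB.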
The optical channels typically used for inter-chip communication are free space 
and optical fiber. While free-space optical links using collimating lenses have the 
advantage of dense optical interconnection, they are quite sensitive to alignment 
tolerances and environmental vibrations [20]. The optical fiber is a cylindrical structure 
consisting of a higher-index core surrounded by a lower-index cladding, which transmits 
light through total internal reflection. It can be either a multi-mode fiber or a 
single-mode fiber, depending on the number of propagating modes it supports. Optical 
fibers have negligible losses at short distances, which makes them suitable for 
high-speed inter-chip communication. 
 
Fig. 7 Block diagram of optical link 
 
This work analyzes four different optical links: a near-term architecture 
consisting of a discrete laser device, the vertical-cavity surface-emitting laser, with a 
p-i-n photodetector, and three long-term integrated photonic architectures that use an 
optical modulator (an electroabsorption modulator, a ring resonator modulator or a 
Mach-Zehnder modulator) with an integrated waveguide-based metal-semiconductor-metal 
photodetector. The block diagram of the optical link architecture is shown in Fig. 7. The 
serialized transmitter data is driven into the optical channel by the appropriate driver 
circuit of the aforementioned optical device. The optical data is received by the 
discrete or integrated photodetector, converted to a voltage and amplified to sufficient 
levels by the transimpedance amplifier and limiting amplifier, and then passed to the 
decision circuit, after which the data is deserialized.  
This section discusses the optical devices and their associated circuitry. An 
overview of the four optical sources mentioned above is presented along with the driver 
circuits and their power consumption. This is followed by a discussion on photodetectors 
and receiver amplifiers, finally concluding with their bandwidth and noise performances. 
2.2.1 Vertical Cavity Surface Emitting Laser (VCSEL) 
Vertical-cavity surface-emitting lasers (VCSELs), which belong to the category of 
directly modulated lasers, are attractive candidates for high-bandwidth optical sources 
due to their ability to directly emit light with low threshold currents and reasonable 
power efficiencies. In addition, the surface-normal emission of light, the small size of 
the devices, operation over a wide temperature range and on-wafer testing allow for 
high-volume and high-density production of two-dimensional arrays at lower cost [21]. 
Though the VCSEL has significant cost advantages, there exists a trade-off between 
speed and reliability in these devices. 
 
 
Fig. 8 Block diagram of VCSEL [20] 
 
Fig. 8 shows the block diagram of the VCSEL [20], which is a semiconductor 
laser device with light emission perpendicular to the chip surface. Commonly used 
VCSELs are based on GaAs, GaInNAs or InGaAs and usually operate at 850 nm [21, 
22], 1100 nm [23], 1310 nm [24] or 1550 nm [25]. For the VCSEL device to 
produce output optical power, it has to be biased above a certain threshold 
current Ith. Once the VCSEL is biased above the threshold current, it produces optical 
power linearly proportional to the input current flowing through the device as shown in 
Fig. 9. The proportionality constant, the slope efficiency η, determines how efficiently the 
input current is converted to output optical power. The speed of the VCSEL is limited by 
a combination of electrical parasitics and carrier-photon interaction, described by 
an electrical circuit model and optical rate equations, respectively. The electrical circuit 
time constant depends on the junction RC values, with a capacitance in the range of 0.15 pF 
for 25 Gb/s class VCSELs [23] and a series resistance of the order of 50-150 Ω [26]. 
The relaxation oscillation frequency of the VCSEL ωR, which is derived from the optical 
rate equations that describe the electron-photon interaction and is related to the effective 
bandwidth, depends on the average current Iavg flowing through the device [26].  
$$\omega_R \propto \sqrt{I_{avg} - I_{th}}$$
 
 
Fig. 9 VCSEL optical power vs. current 
 
The above equation states that the average current must be increased 
quadratically to achieve higher bandwidth. However, output power saturation 
and relaxation oscillation frequency saturation due to self-heating effects [27] limit how 
far the average current can be increased to achieve higher bandwidth. Higher current also 
causes serious reliability problems, as the mean time to failure (MTTF) is inversely 
proportional to the square of the device current density [28]. Thus, a sharp trade-off 
occurs between the VCSEL speed and reliability [26].   
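
The speed versus reliability trade-off can be made concrete with a small numerical sketch. It assumes the threshold-linear light-current characteristic described above, a relaxation oscillation frequency that scales as the square root of the current above threshold (proportionality constant omitted), and a fixed device area so that current density scales with current. The function names and example currents are hypothetical.

```python
import math


def vcsel_optical_power(i, i_th, slope_eff):
    """Output optical power in W: zero below threshold, linear above it."""
    return max(0.0, slope_eff * (i - i_th))


def relative_bandwidth(i_avg, i_th):
    """Relaxation oscillation frequency up to an unknown constant: sqrt(I_avg - I_th)."""
    return math.sqrt(max(0.0, i_avg - i_th))


def relative_mttf(i_avg, i_ref):
    """MTTF relative to operation at i_ref, assuming MTTF ~ 1/J^2 and fixed device area."""
    return (i_ref / i_avg) ** 2
```

With a 0.7 mA threshold, raising the average current from 1.7 mA to 4.7 mA quadruples the current above threshold and doubles the relative bandwidth, while the relative MTTF under these assumptions falls to about 13% of its original value.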
      
 
   
 
 
 
Fig. 10 Modulation response of tunnel junction VCSEL [31] 
 
The two commonly used fabrication techniques for the production of VCSELs are 
the implantation and oxidation methods. The implantation method uses proton bombardment 
to make certain sections electrically non-conductive and thereby confine the current, while 
the oxide method uses an oxide of aluminum to confine the current [22]. Oxide-confined 
devices have more linear light versus current curves and achieve high speed of 
operation and good reliability because of their high differential gain [29]. Though an 
oxide-confined VCSEL [30] achieved a good modulation bandwidth of 20 GHz, such devices 
suffer from relaxation oscillation frequency saturation due to self-heating effects [29]. In 
order to circumvent the self-heating problems and increase the bandwidth of operation, 
buried type-II tunnel junction VCSELs [31] were developed. The chip resistance was 
reduced by half compared to oxide-confined structures because the structure enables the 
use of lower-resistance layers. Furthermore, the tunnel junction structure increased the 
differential gain due to improved current injection uniformity and reduced the parasitic 
capacitances, which resulted in a bandwidth increase to 24 GHz [29, 31]. Fig. 10 shows 
the modulation response of the tunnel junction VCSEL [31] used in this work, which 
achieves a bandwidth of 24 GHz.   
 
 
Fig. 11 Schematic of VCSEL current mode driver 
 
Because the VCSEL has a linear optical power-current relationship beyond the threshold 
current Ith, a differential current-mode driver is used to modulate it. A differential 
current-mode driver with the VCSEL in one arm and a dummy load in the other arm is shown 
in Fig. 11. In order to avoid any speed limitations, the VCSEL is always biased just 
above the threshold current using a separate path. The diode knee voltage of the VCSEL 
is larger than the nominal CMOS supply voltages, and hence a separate higher supply 
voltage LVdd is required for the VCSEL driver stage. The VCSEL driver is driven by a 
predriver stage consisting of a set of cascaded inverters with a fan-out F. The power 
dissipation is given as the sum of the predriver inverter stage power and the differential 
current-mode driver power [32]. The total predriver stage power is  
                      
  
 
 
 
                            
 
   
 
where Vdd is the nominal CMOS supply voltage, f is the data rate, Ctotal is the total 
capacitance of the predriver, Cload is the output stage gate capacitance, and Cin and Cout 
are the input and output capacitances of each inverter stage. The total current of the 
differential driver includes the average modulation current Im and the dc current Idc 
required to bias the VCSEL above threshold. The supply voltage LVdd is given as the sum 
of the diode knee voltage of the VCSEL Vdiode, the voltage across the series resistance Rs 
due to the modulation current Im flowing through it, and the drain-source voltage of the 
transistor Vds. 
$$LV_{dd} = V_{diode} + I_m R_s + V_{ds}$$
The higher value of driver supply voltage LVdd compared to nominal CMOS supply 
voltages results in higher power dissipation in the driver circuit. 
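
A rough numerical sketch of the VCSEL transmitter power budget follows. It assumes that the output stage dissipates the driver supply voltage times the total driver current, and that the predriver dissipates the usual dynamic power C·V²·f with a switching activity of one; the original expressions may include additional factors. All function names and example values are hypothetical.

```python
def driver_supply(v_diode, i_mod, r_s, v_ds):
    """Driver supply LVdd: knee voltage + drop across series resistance + transistor headroom."""
    return v_diode + i_mod * r_s + v_ds


def driver_power(lvdd, i_mod, i_dc):
    """Output stage power, assuming the supply delivers the modulation plus bias current."""
    return lvdd * (i_mod + i_dc)


def predriver_power(c_total, vdd, f):
    """Dynamic inverter-chain power, assuming C * V^2 * f with a switching activity of one."""
    return c_total * vdd ** 2 * f


def vcsel_tx_power(v_diode, i_mod, i_dc, r_s, v_ds, c_total, vdd, f):
    """Total transmitter power as the sum of predriver and driver contributions."""
    lvdd = driver_supply(v_diode, i_mod, r_s, v_ds)
    return predriver_power(c_total, vdd, f) + driver_power(lvdd, i_mod, i_dc)
```

For instance, a 1.5 V knee voltage, 5 mA of modulation current through 100 Ω and 0.3 V of headroom give a 2.3 V driver supply, well above a 1 V core supply, which illustrates why the driver stage dominates in this sketch.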
2.2.2 Electroabsorption Modulator (EAM) 
Electroabsorption modulators (EAMs) belong to the class of integrated optical 
modulators that absorb light originating from a continuous-wave laser source. The 
absorption depends on the electric field strength applied across the device, which 
produces a modulated optical output signal and achieves acceptable contrast ratios at low 
drive voltages over tens of nanometers of optical bandwidth. Unlike the VCSEL, the 
electroabsorption effect in modulators does not display any carrier speed limitations and 
is an ultra-fast process [33], [34] working on sub-picosecond time scales, appropriate for 
high-speed operation. The absorption mechanism occurs either due to the Franz-Keldysh 
effect, observed in EA modulators formed by placing a bulk semiconductor material [35] 
in the intrinsic region of a p-i-n diode, or due to the quantum-confined Stark effect 
(QCSE), wherein the electric field is applied perpendicular to the multiple quantum 
wells [36] that form the intrinsic region of the p-i-n diode EA modulator.  
The EA modulators are typically implemented as surface-normal devices [19], 
[37], [38] or waveguide structures [39]-[41]. In order to generate a high contrast ratio in 
the surface-normal devices, a large multiple-quantum well region is required, which 
necessitates large drive voltages to produce the requisite electric field. If the drive 
voltage has to be reduced, the absorbing multiple-quantum well region must be made 
thin, which results in a low contrast ratio [42]. To overcome this problem, the 
multiple-quantum well is placed in an optical resonator, as in the asymmetric Fabry-Perot 
modulator [43], [44], to enhance the absorption without using large multiple-quantum 
wells and large drive voltages, but this reduces the wavelength band of operation [45] 
and becomes a problem when used in wavelength division multiplexing (WDM) 
systems. In contrast to the surface-normal devices, the waveguide structures are able to 
achieve a high contrast ratio at small voltage swings and over a wide wavelength range, 
but they suffer from poor misalignment tolerance, because light must be coupled into 
small waveguides, and from packaging difficulties [38].  
 
 
Fig. 12 Quasi-waveguide angled-facet electroabsorption modulator 
 
The quasi-waveguide angled-facet electroabsorption modulator (QWAFEM) [38], 
which combines the best features of surface-normal and waveguide structures, is one of 
the EA modulators used in this thesis. Fig. 12 shows the schematic of the QWAFEM device, 
where the input beam is reflected a couple of times before exiting the 
substrate. The large incident angle enables a high contrast ratio and a wide 
wavelength range of operation at small voltage swings, and the device is also inherently 
misalignment tolerant due to the three-bounce geometry. Another EA modulator 
investigated in this work is the waveguide structure based on an enhanced Franz-Keldysh 
effect in tensile-strained, epitaxial germanium-on-silicon (Ge-on-Si) [41] with 
high electroabsorption contrast. This work assumes some improvements to the existing 
device structures which include increasing the reference waveguide EA modulator 
length for compatibility with nanometer CMOS voltage swings and using a smaller 
QWAFEM for lower capacitance. 
 
 
Fig. 13 Schematic of electroabsorption modulator driver circuit 
 
The EAM devices, modeled as a reverse-biased diode with a lumped capacitor, 
are driven by a voltage-mode driver consisting of cascaded inverters as shown in Fig. 13. 
The power dissipation consists of static absorbed photocurrent and dynamic switching 
components [32], [46], in addition to the external laser power. The static component is 
due to the light power absorbed in the modulator. It is calculated in terms of contrast 
ratio (CR) and insertion loss (IL) by multiplying the current in each binary state by the 
corresponding voltage and taking the average. 
                                             
                                              
    
  
        
                   
  
 
 
                       
 
   
     
  
 
 
 
where Rs,mod is the responsivity of the modulator, Plaser is the input laser power, Vbias is 
the pre-bias voltage, Vswing is the swing voltage, IL is the insertion loss, CR is the 
contrast ratio of the modulator, Vdd is the nominal CMOS supply voltage, f is the data rate, 
Ctotal is the total capacitance of the driver, Cmod is the capacitance associated with the 
modulator, Cin and Cout are the input and output capacitances of each inverter stage and 
Plaser,electricpower is the external laser power dissipation. The swing voltage Vswing is 
assumed to be the nominal CMOS supply voltage Vdd for the given technology and the 
pre-bias voltage Vbias is twice the supply voltage. 
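
The averaging procedure described for the static component can be sketched as follows. This is an illustration under stated assumptions rather than the expression used in this work: insertion loss and contrast ratio are taken in dB, all light missing from the output is assumed to be absorbed and converted to photocurrent, and the voltage across the modulator in each state is supplied by the caller. The function and parameter names are hypothetical.

```python
def eam_levels(p_laser, il_db, cr_db):
    """Optical output power (W) in the transmitting (on) and absorbing (off) states."""
    p_on = p_laser * 10 ** (-il_db / 10.0)   # insertion loss applies to the on state
    p_off = p_on * 10 ** (-cr_db / 10.0)     # contrast ratio separates on from off
    return p_on, p_off


def eam_static_power(p_laser, il_db, cr_db, responsivity, v_on, v_off):
    """Average of (photocurrent x voltage) over the two binary states."""
    p_on, p_off = eam_levels(p_laser, il_db, cr_db)
    i_on = responsivity * (p_laser - p_on)    # assumes all missing light is absorbed
    i_off = responsivity * (p_laser - p_off)
    return 0.5 * (i_on * v_on + i_off * v_off)
```

With 1 mW of laser power, 3 dB of insertion loss and a 10 dB contrast ratio, the on and off output levels are about 0.50 mW and 0.05 mW respectively.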
2.2.3 Ring Resonator (RR) Modulator 
 
 
Fig. 14 Block diagram of ring resonator 
 
Ring resonator (RR) modulators are refractive devices that display very high 
resonant quality factors and can achieve high contrast ratios with small dimensions. A 
block diagram of the ring resonator is shown in Fig. 14, in which the high-confinement 
ring structure is coupled to a single waveguide. The ring resonator modulator uses the 
high-confinement resonant ring structure to increase the length of the optical path by 
circulating the light within the ring at the resonant wavelength, while light at all 
other wavelengths interferes destructively within the ring [47]. In order to modulate a 
transmitted signal using the ring resonator, the refractive index of the highly confined 
ring structure is modified by electrical signaling, which sharply shifts the resonance 
wavelength and leads to strong modulation depth changes at or near resonance [48]. The 
high resonance quality factor together with the strong modulation depth allows the use of 
small-footprint ring resonator devices with small capacitances to achieve high contrast 
ratios.  
The ring resonator structure is typically implemented as carrier-injection based 
[48]-[50] or depletion-width based [51], [52] p-i-n diode structures, which allow for 
direct integration with CMOS circuits. Ring resonator structures have also been 
implemented with polymeric materials, which do not have the carrier bandwidth 
limitations of p-i-n diode based structures. Polymeric materials with low refractive index 
contrast between the waveguide core and cladding and a large cross section provide the 
advantages of low scattering and insertion loss with better coupling efficiency [53]. This 
work analyzes the electro-optic (EO) polymer cladding ring resonator [54], which allows 
for a photonics-on-top-of-CMOS approach in which a monolithic optical layer is created on 
top of an existing CMOS chip. This saves die area and allows optics to be added to any 
process node. Due to small dielectric constants and high-confinement resonant ring 
structures, the polymer-based RR yields compact high-speed devices that are modeled as 
simple lumped-element capacitors. Polymers doped with optic chromophores provide large 
electro-optic (EO) coefficients and result in very fast shuttling of the electrons within the 
molecular orbital of the chromophores, which allows a very high frequency of operation. 
Electro-optic polymers have shown a steady improvement in the electro-optic coefficient 
(r33), consistent with Moore's law over time [55], [56]. As polymer RR modulators are 
still relatively immature, this work assumes some improvements in electro-optic 
coefficients to yield an 8 dB CR [1].  
The ring resonator is pseudo-differentially driven by two pairs of cascaded 
inverter stages in order to double the output swing and increase the contrast ratio. The 
power dissipation is similar to that of the EAM driver, with slight modifications to 
account for the higher pseudo-differential output swing and the omission of the absorbed 
photocurrent term, which is not present in the RR modulator. Due to the resonant nature of 
the ring structure, the optical bandwidth of the ring resonator is limited, typically to 
less than 1 nm, which complicates its use in multi-wavelength systems [57]. The ring 
resonator is also highly sensitive to process and temperature variations. Small changes in 
temperature or bias conditions create a significant shift in the resonance frequency of the 
resonating structure and cause errors in the operation of the ring resonator. Additional 
feedback tuning circuits, whose power is assumed to be 1 mW [58], are necessary to 
compensate for the variations and enhance the feasibility of the ring resonator for 
high-density applications. 
2.2.4 Mach-Zehnder (MZ) Modulator 
Mach-Zehnder modulators (MZM) are devices that use the free-carrier 
plasma-dispersion effect to achieve optical modulation. The modulator utilizes a 
Mach-Zehnder interferometer (MZI) as shown in Fig. 15, which works by splitting the light 
into two arms and shifting their phases by varying the carrier density of each arm in 
accordance with 
the electric field applied. The carrier density variation changes the refractive index of the 
material and hence shifts the phase [59]. The light in the two arms is then 
recombined either in phase or out of phase, depending on the applied electric field, to 
produce the modulated output. Due to the weak electro-optic effects in silicon, the 
Mach-Zehnder modulator requires a long device to produce the change in carrier density 
and attain the required phase shift. Similar to the other integrated optical modulators, 
the MZ modulator requires an external continuous wave (CW) laser source whose light is 
coupled in through an integrated holographic lens (HL).  
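
The in-phase and out-of-phase recombination can be illustrated with the textbook transfer function of an ideal interferometer. The sketch below assumes a lossless device with a perfectly balanced 50/50 split, for which the normalized output intensity is cos²(Δφ/2); real devices have finite loss and imbalance. The function names are illustrative.

```python
import math


def mzi_transmission(delta_phi):
    """Normalized output intensity of an ideal balanced MZI for a phase difference in rad."""
    return math.cos(delta_phi / 2.0) ** 2


def contrast_ratio_db(phi_on, phi_off):
    """Contrast ratio in dB between two phase-difference settings."""
    return 10.0 * math.log10(mzi_transmission(phi_on) / mzi_transmission(phi_off))
```

A phase difference of zero transmits all of the light and a difference of π extinguishes it. A partial swing from 0 to π/2 gives only about 3 dB of contrast, which is consistent with the need for long devices and large drive swings noted above.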
 
 
Fig. 15 Block diagram of Mach-Zehnder modulator 
 
The Mach-Zehnder modulator devices are typically implemented as forward-biased 
p-i-n diodes [60], [61], MOS capacitors [62] or reverse-biased pn diodes [63], [64]. 
The reverse-biased pn diodes use electric-field-induced carrier depletion effects to sweep 
the majority carriers in and out of the optical mode [65], which is much faster than the 
diffusion/recombination of the minority carriers in a forward-biased p-i-n diode, although 
the latter offers high modulation efficiency and compact size [66]. The device capacitance of the 
reverse biased pn diodes is also much reduced when compared to the MOS capacitor 
based devices [59]. This work uses a reverse biased pn diode based Mach-Zehnder 
modulator [66], whose speed is entirely limited by RLC parasitics.  
 
 
Fig. 16 Schematic of Mach-Zehnder modulator driver circuit 
 
Fig. 16 shows the transmitter schematic of the Mach-Zehnder modulator based link 
[65]. Unlike smaller modulators, which are treated as lumped capacitive loads, the 
Mach-Zehnder modulator is long (~1 mm), so the differential electrical signal is 
distributed using a pair of transmission lines terminated with a low impedance. In order 
to achieve the required phase shift and a reasonable contrast ratio, long devices and large 
differential swings are required, necessitating a separate voltage supply MVdd. 
Thick-oxide cascode transistors are used to avoid stressing the driver transistors with the 
high supply voltage and large differential output swing. The thick-oxide cascode 
transistors shield the driver transistors, as they can withstand much higher gate and 
drain-source voltages, and keep the high swing off the drains of the driver 
transistors. The driver circuit is driven by a set of cascaded inverter stages. The power 
dissipation of the circuit is the sum of the predriver, driver and external laser power dissipation. 
                                               
                           
 
   
     
  
 
 
                                     
where Idriver is the driver current and Plaser,electricpower is the external laser power 
dissipation. 
2.2.5 Optical Photodetector and Receiver 
The optical receiver uses a photodetector to convert the received optical signal to 
an electrical signal. The photodetector is typically characterized by its responsivity and 
its noise performance [67], both of which determine the performance of the receiver. 
The responsivity of the photodetector determines how much electrical current is 
produced for a given input optical power, specifically, how incident photons with 
energy hc/λ create carriers with charge q, resulting in a current flow, and is 
given by,  
$$\rho = \frac{\eta \lambda q}{h c}$$
where ρ is the responsivity, η is the quantum efficiency of the photodetector, λ is the 
wavelength of operation, h is Planck's constant and c is the velocity of light. The 
noise performance of the photodetector is an important factor in determining the 
sensitivity of the receiver. The photodetector noise is dominated by the dark noise 
current Idark, which is the current produced by the device even in the absence of any 
optical input and is dependent on junction area and temperature [67]. Photodetectors 
are typically implemented as p-i-n photodetectors, avalanche photodetectors or 
integrated metal-semiconductor-metal photodetectors. This work uses a long-wavelength 
discrete p-i-n photodetector [68] with a capacitance of 100 fF and a dark current of 5 nA 
in conjunction with the discrete optical link based on the VCSEL device. For the 
integrated CMOS photonic links based on modulators, an integrated waveguide 
metal-semiconductor-metal (MSM) photodetector [1], [69] with an ultra-low capacitance 
of < 1 fF and a dark current of 100 µA is used.  
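
The responsivity relation can be evaluated numerically as a sanity check. The sketch below uses the standard definition, in which each absorbed photon of energy hc/λ contributes one carrier of charge q with probability η; the function names and the example quantum efficiency are assumptions made for illustration.

```python
Q_ELECTRON = 1.602176634e-19  # elementary charge, C
H_PLANCK = 6.62607015e-34     # Planck constant, J*s
C_LIGHT = 299_792_458.0       # speed of light in vacuum, m/s


def responsivity(eta, wavelength_m):
    """Photodetector responsivity in A/W for quantum efficiency eta."""
    return eta * wavelength_m * Q_ELECTRON / (H_PLANCK * C_LIGHT)


def photocurrent(p_opt, eta, wavelength_m, i_dark=0.0):
    """Total detector current in A for an optical input power p_opt in W."""
    return responsivity(eta, wavelength_m) * p_opt + i_dark
```

An ideal detector (η = 1) at 1550 nm has a responsivity of about 1.25 A/W, and about 0.69 A/W at 850 nm, so longer wavelengths yield more current per watt for the same quantum efficiency.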
 
 
Fig. 17 Transimpedance amplifier based receiver architecture 
 
The electrical current produced by the photodetector must be converted to a 
sufficient voltage level with low noise and adequate bandwidth. A simple resistor 
presents a trade-off among high gain, low noise and large bandwidth. The 
transimpedance amplifier (TIA) based architecture is chosen because of its ability to 
achieve high gain and large bandwidth with low-noise operation [70], [71]. The 
photodetector current is converted to voltage by the transimpedance amplifier (TIA) 
which is further amplified to a sufficient level by a set of 1-5 limiting amplifier (LA) 
stages and finally given to a decision circuit element which digitizes the data and 
deserializes it for further processing as shown in Fig. 17.  
The transimpedance amplifier consists of an inverting voltage amplifier with a 
feedback resistor between input and output. The inverting voltage amplifier is 
implemented as a differential amplifier. The significant advantages of using a 
differential amplifier are the immunity to supply noise and the increased voltage swing. 
One input of the differential amplifier is connected to the photo-detector while the other 
input is loaded with a capacitance matching the photo-detector capacitance to make the 
TIA balanced. The limiting amplifier stages, limited to a maximum of five, are also 
implemented as differential amplifiers. Sense amplifiers are typically used to implement 
the decision circuit, which is assumed to have a 20 mVpp input threshold ambiguity [6]. 
Finally, deserialization circuits deserialize the incoming data for further 
processing.  
The two critical design parameters in the receiver are its bandwidth and 
sensitivity. Both of these parameters are highly dependent on the transimpedance 
amplifier performance as it is the first circuit block in the receiver. The closed loop 
transimpedance [67] of the transimpedance amplifier is given by, 
$$Z_T(s) = -R_T \cdot \frac{1}{1 + \dfrac{s}{\omega_0 Q} + \dfrac{s^2}{\omega_0^2}}$$
where, 
$$R_T = \frac{A_f}{A_f + 1} R_F$$
$$\omega_0 = \sqrt{\frac{A_f + 1}{R_F C_T T_A}}$$
$$Q = \frac{\sqrt{(A_f + 1) R_F C_T T_A}}{R_F C_T + T_A}$$
Here RT is the DC transimpedance of the TIA, ω0 is the pole (angular) frequency, and Q 
is the pole quality factor, related to the damping factor ζ by Q = 1/(2ζ), which controls 
the peaking and ringing. Af is the voltage gain of the inverting voltage amplifier, RF is 
the feedback resistance, TA is the time constant of the voltage amplifier, related to its 
3-dB bandwidth fA by TA = 1/(2π fA), and CT is the total capacitance at the input of the 
TIA, which includes the photo-detector capacitance CD, the input capacitance of the 
voltage amplifier CI and the parasitic capacitance CF between input and output accounted 
for using the Miller effect. 
$$C_T = C_D + C_I + (1 + A_f) C_F$$
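
The closed-loop quantities defined above can be evaluated numerically. The sketch below assumes the standard second-order shunt-feedback TIA expressions with a single-pole voltage amplifier, as found in receiver design texts such as [67], and a Miller multiplication of (1 + Af) for the feedback capacitance. The function names and component values are hypothetical.

```python
import math


def total_input_capacitance(c_d, c_i, c_f, a_f):
    """Detector + amplifier input + Miller-multiplied feedback capacitance."""
    return c_d + c_i + (1.0 + a_f) * c_f


def tia_closed_loop(a_f, r_f, c_t, t_a):
    """Return (R_T, w0, Q) for a shunt-feedback TIA with a single-pole amplifier."""
    r_t = a_f / (a_f + 1.0) * r_f
    w0 = math.sqrt((a_f + 1.0) / (r_f * c_t * t_a))
    q = math.sqrt((a_f + 1.0) * r_f * c_t * t_a) / (r_f * c_t + t_a)
    return r_t, w0, q


def butterworth_time_constant(a_f, r_f, c_t):
    """Amplifier time constant giving Q = 1/sqrt(2) (fast-amplifier solution, a_f > 1)."""
    return r_f * c_t * (a_f - math.sqrt(a_f ** 2 - 1.0))
```

For Af = 10, RF = 1 kΩ and CT = 100 fF, the time constant returned by `butterworth_time_constant` gives Q = 1/√2, a DC transimpedance of about 909 Ω, and a pole frequency within 1% of the large-gain approximation sqrt(2·Af·(Af + 1))/(RF·CT).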
                    
The voltage amplifier output is directly fed back to the input through the 
feedback resistor without any buffer in between. Therefore, the feedback resistor loads 
the voltage amplifier and affects its gain and bandwidth which are given by,  
     
    
 
  
      
 
 
  
 
 
  
       
      
 
  
 
 
  
 
 
    
 
  
 
 
  
 
 
  
      
   
where gm is the input transconductance of the voltage amplifier and RL is its load 
resistance. CL is the load capacitance, which includes the limiting amplifier input 
capacitance in addition to its self-capacitance. In order to guarantee a flat frequency 
response and limit the overshoot and ringing to less than 4.3% [67], the Q of the TIA 
should not exceed 1/√2, which requires the bandwidth of the inverting voltage amplifier 
to be larger than, 
$$f_A \ge \frac{2 A_f}{2 \pi R_F C_T}$$
Using Q = 1/√2 (ζ = √2/2), the 3-dB bandwidth of the transimpedance amplifier is 
directly calculated from ω0 [72] by substituting for TA, which gives 
$$\omega_{-3dB} = \frac{\sqrt{2 A_f (A_f + 1)}}{R_F C_T}$$
 
The sensitivity of the receiver determines the minimum input signal level that 
can be detected by the receiver. It measures the smallest optical input power necessary 
for reliable operation at a given bit-error rate (BER). The sensitivity of the receiver Psens 
is dependent on the total input referred noise of the receiver and the power penalty due 
to decision circuit threshold ambiguity PPDC and is calculated in terms of dBm for a 
given bit-error rate (BER). 
      
    
   
 
       
where in,rms is the input-referred rms noise current of the receiver, which includes the 
noise from both the transimpedance amplifier and limiting amplifier stages along with 
the photodetector dark noise current Idark, ρ is the responsivity of the photodetector and 
Q is a term dependent on BER. The input-referred noise current of the TIA is the 
dominant component of the noise at the input of the receiver. The input-referred noise 
current spectrum of the TIA consists of two major components, one from the feedback 
resistor In,res and the other from the amplifier front end In,amp. 
      
            
            
     
      
       
   
  
                     
 
     
   
      
 
   
     
where gdsp is the conductance of the PMOS load transistor in the linear region. The 
input-referred rms noise current for an input-referred noise spectrum written in the form 
$$I_n^2(f) = \alpha_0 + \alpha_2 f^2$$
is 
$$i_n^{rms} = \sqrt{\alpha_0 BW_n + \frac{\alpha_2}{3} BW_{n2}^3}$$
where BWn and BWn2 are the noise bandwidths, which for a second-order low-pass 
Butterworth response are 1.11 and 1.49 times the 3-dB bandwidth, respectively [67].  
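
The noise integration and the resulting sensitivity can be sketched numerically. The following assumes a noise spectrum with a white term α0 and an f² term α2, the Butterworth noise-bandwidth factors quoted above, Gaussian noise with BER = ½·erfc(Q/√2), and a sensitivity equal to Q times the rms noise divided by the responsivity, expressed in dBm and increased by the decision-circuit power penalty. These are common textbook relations and may differ in detail from the expressions used in this work; all names are illustrative.

```python
import math


def input_referred_rms_noise(alpha0, alpha2, bw_3db, k_n=1.11, k_n2=1.49):
    """RMS noise current for I_n^2(f) = alpha0 + alpha2 * f^2 (Butterworth factors by default)."""
    bw_n = k_n * bw_3db
    bw_n2 = k_n2 * bw_3db
    return math.sqrt(alpha0 * bw_n + alpha2 / 3.0 * bw_n2 ** 3)


def q_from_ber(ber):
    """Solve 0.5 * erfc(Q / sqrt(2)) = ber for Q by bisection (Gaussian noise assumed)."""
    lo, hi = 0.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > ber:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sensitivity_dbm(i_n_rms, responsivity, ber, pp_dc_db=0.0):
    """Receiver sensitivity in dBm, including an optional power penalty in dB."""
    p_watts = q_from_ber(ber) * i_n_rms / responsivity
    return 10.0 * math.log10(p_watts / 1e-3) + pp_dc_db
```

For a BER of 10^-12 the required Q is about 7.03, so 1 µA of rms input-referred noise with a responsivity of 1 A/W corresponds to a sensitivity of roughly -21.5 dBm before any power penalty.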
The limiting amplifiers are implemented as differential amplifiers with linear-region 
PMOS transistors as loads. A set of 1-5 limiting amplifier stages can be used in 
series to obtain the required gain. Cascading multiple limiting amplifiers increases the 
gain but also reduces the overall bandwidth. In order to maintain the overall bandwidth 
at the required value, the bandwidth of each limiting amplifier stage should be 
increased [72] to, 
$$\omega_0 = \frac{\omega_{-3dB}}{\sqrt{2^{1/N} - 1}}$$
where ω-3dB is the overall 3-dB bandwidth, ω0 is the bandwidth of each stage and N is the 
number of limiting amplifier stages utilized. Care should be taken to increase the 
bandwidth of each stage such that the required overall bandwidth is achieved. The noise of 
the limiting amplifiers is divided by the transimpedance gain when referred to the 
input of the receiver to calculate the total input-referred noise. 
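
The bandwidth shrinkage of a cascade can be checked with a short sketch. It assumes N identical, non-interacting first-order stages, for which the per-stage bandwidth must exceed the overall bandwidth by a factor of 1/sqrt(2^(1/N) - 1). The function names are illustrative.

```python
import math


def per_stage_bandwidth(overall_bw, n_stages):
    """Bandwidth each of N identical first-order stages needs for a given overall bandwidth."""
    return overall_bw / math.sqrt(2.0 ** (1.0 / n_stages) - 1.0)


def cascade_magnitude(w, w0, n_stages):
    """Normalized gain magnitude of N cascaded first-order stages at frequency w."""
    return (1.0 / math.sqrt(1.0 + (w / w0) ** 2)) ** n_stages
```

Two stages each need about 1.55 times the overall bandwidth and five stages about 2.6 times, which shows why adding limiting amplifier stages raises the power spent per stage.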
2.3 Summary 
This chapter discussed the electrical and optical interconnect architectures in 
detail. It began with the problems faced by electrical channels in scaling towards higher 
data rates and presented the equalization systems used to compensate for the channel 
losses. It then discussed the advantages of using optical links over their electrical 
counterparts and gave an overview of four different optical sources along with their 
driver circuits. The optical photodetector and its associated transimpedance amplifier 
based receiver circuit were also discussed.  
 
CHAPTER III 
OPTIMIZATION METHODOLOGY AND PERFORMANCE COMPARISON OF 
ELECTRICAL AND OPTICAL LINKS* 
The high-speed electrical and optical link architectures presented in the previous 
chapter should be designed for minimum power consumption. A design flow that 
minimizes the power dissipation of both electrical and optical links with 
their associated transmitter and receiver circuitry is presented in this chapter. Prior to 
this, a brief introduction is given to previous modeling and optimization work for 
both electrical and optical links. The optimization methodology is followed by a 
performance comparison of the high-speed electrical and optical links across different 
process technologies, and the chapter concludes with a discussion of the results. 
3.1 Previous Work 
Techniques for minimizing the power consumption of high-speed serial links 
have been investigated in the past. Several works have been published on high-speed 
electrical and optical links, covering their modeling, their power consumption and 
performance comparisons between the two.  
Casper et al. [73] used worst case eye diagram representations calculated from 
peak distortion analysis and peak sampling boundary to characterize the timing and
  
____________ 
*Part of this chapter is reprinted with permission from “Power efficiency comparisons of interchip optical 
interconnect architectures,” by A. Palaniappan and S. Palermo, May 2010, IEEE Transactions on Circuits 
and Systems II, vol. 57, no. 5, pp. 343-347. Copyright 2010 IEEE 
voltage margins of high-speed chip-to-chip signaling systems. Stojanovic et al. [74] 
analyzed the deterministic and random noise sources of a high-speed electrical link 
system, along with the timing circuitry jitter, and their probability distributions and 
correlations. The impact of the noise sources and their probability distributions on 
equalization complexity was also examined. Hatamkhani et al. [75] presented a 
methodology that uses custom logic optimization to design the transmitter with minimum 
power dissipation while satisfying I/O specifications. Hatamkhani et al. [76] determined 
the optimal power versus data rate relationship based on the channel's frequency response 
and showed that an upper bound exists for the data rate depending on the channel. 
Statistical link analysis methods [77]-[79] are used to simulate high-speed links and 
estimate their performance more quickly.  
Van Blerkom et al. [80] developed an optimization method for the optical 
transimpedance receiver design. The transimpedance amplifier and voltage 
amplifier parameters, such as bit rate, gain, input-referred noise and circuit sizing, were 
evaluated and optimized for minimum power dissipation. Kibar et al. [32] built on 
this receiver design and optimized the total optical link power for 
vertical cavity surface emitting laser (VCSEL) and multiple quantum-well (MQW) 
modulator based links, including both the transmitter and receiver circuits, with 
fixed optical device parameters. Kapur et al. [81] included the modulator in the 
optimization with the circuits through parameters such as insertion loss (IL) and contrast 
ratio (CR). Cho et al. [46] extended the optimization to the number of quantum wells 
(QWs), the swing and the pre-bias voltage of the modulator while taking into account the 
circuit properties. 
Feldman et al. [82] performed a comparison between electrical and optical 
interconnects based on power and speed considerations by modeling the electrical and 
optical interconnects. Miller [83] summarized the various advantages and challenges of 
using optical links over electrical links. These references did not consider
complex electrical links and their equalization overhead. Cho et al. [84] compared the
power dissipation of a modulator based optical interconnect with that of an electrical
interconnect using equalization schemes as a function of interconnect length. The critical
length beyond which the optical interconnect becomes more power efficient was identified
from the comparison and characterized as a function of bit error rate (BER), data rate and
mismatches.
This research leverages previous works and improves the analysis to compare 
sophisticated high-speed electrical links and state-of-the-art optical links. It uses 
statistical link analysis tools with detailed equalization and serialization models for high-
speed link analysis. It also compares the performances of multiple optical links formed 
using different optical devices. It utilizes the constant current density technique for the 
optimization methodology to find the minimum power dissipation of the entire link. 
3.2 Optimization Methodology 
The objective of this optimization methodology is to minimize the total link
power dissipation. For the electrical link, this includes all the serialization,
deserialization and equalization circuits: transmitter feed-forward equalization (FFE),
receiver continuous time linear equalization and decision feedback equalization (DFE).
For the optical link, it includes the circuits associated with the receiver transimpedance
amplifier (TIA) and limiting amplifier (LA) stages, the transmitter predriver and driver
circuits, the ring resonator tuning circuit power, the static electroabsorption
modulator power component and the external laser power. Note that while
local clock buffering power is modeled, clock generation, distribution, and recovery
power, which can vary with application and is common among all of the architectures, is
excluded from the reported power dissipation to more clearly display the link performance
impact. Power efficiency, defined as the power consumption divided by the data rate and
expressed in units of mW/Gb/s, is used as the metric to compare link performance at
various data rates in the 90 nm and 45 nm technology nodes.
Obtaining accurate circuit modeling results requires transistor parameters that
model the circuits with sufficient accuracy.
These parameters are obtained from transistor-level SPICE simulations of the circuit
topologies using a constant current density technique. The drain current of the transistor
increases linearly with transistor finger number under fixed biasing conditions and finger
sizing, yielding a constant current density. Normalized transistor parameters
(transconductance, output conductance, capacitances, etc.) are extracted using the constant
current density technique. The accuracy of the modeling results is tested using a
continuous time linear equalizer (CTLE) circuit by determining the power required to
operate at different 3-dB bandwidths and comparing the results with SPICE. The 3-dB
bandwidth of the CTLE is given by
\[ f_{3dB} = \frac{1}{2\pi R_{out} C_{out}} \]
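where Rout and Cout are the total output resistance and capacitance of the CTLE. A
fan-out parameter FO is defined as the ratio of the output load capacitance CP to the
input gate capacitance Cin, which is the normalized input gate capacitance Cg/W scaled
by the transistor width W.
\[ FO = \frac{C_P}{C_{in}} = \frac{C_P}{(C_g/W)\,W} \]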
 
Using the above equation, the 3-dB bandwidth of the CTLE can be rewritten in
terms of normalized transistor parameters.
\[ f_{3dB} = \frac{1}{2\pi \, (R_{out}/W) \left[ C_d/W + FO \cdot (C_g/W) \right]} \]
where Cd/W is the normalized drain capacitance and Rout/W is the normalized output
resistance. This equation is rewritten to determine the fan-out factor FO as a function of
the normalized transistor parameters and the 3-dB bandwidth of the CTLE.
\[ FO = \frac{\dfrac{1}{2\pi f_{3dB} \, (R_{out}/W)} - C_d/W}{C_g/W} \]
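Utilizing the above fan-out factor FO value and its definition, the required input
capacitance Cin and transistor finger size can be determined for a given 3-dB bandwidth.
\[ C_{in} = \frac{C_P}{FO} \]
\[ W_{fingersize} = \frac{C_{in}}{C_g/W} \]
\[ I_{total} = 2 \cdot \frac{I_d}{W} \cdot W_{fingersize} \]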
 
Fig. 18 Modeling vs. SPICE results of the power dissipation of CTLE as a function of 3-
dB bandwidth 
 
 
 
Fig. 19 Modeling vs. SPICE results of the power efficiency of CTLE as a function of 3-
dB bandwidth 
 
The total current consumption Itotal is computed from the normalized current Id/W
and the input transistor finger size Wfingersize. A factor of 2 is included to account for
the two arms of the CTLE. Using this equation-based model, the CTLE power
dissipation, computed from the total current and the supply voltage Vdd, is compared with
SPICE results for different bandwidths. Fig. 18 compares the modeled and SPICE
power dissipation as a function of bandwidth; the modeling results
closely follow the SPICE results and show a slight deviation at high bandwidth. Because
plotting high and low power values on the same axis obscures the low-power region,
power efficiency is used as the performance parameter for comparison. As shown in Fig. 19,
power efficiency, defined as power dissipation divided by the bandwidth, is a
better parameter for comparing CTLE performance at both low and high
bandwidths.
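
The sizing steps described above (bandwidth target, fan-out, input capacitance, finger size, current, power) can be sketched in Python. This is a minimal sketch under stated assumptions, not the model used in this work: it assumes the fan-out is the ratio of load capacitance to input gate capacitance, that the normalized output resistance is the product of Rout and W, and that all parameter names and any numeric values passed in are hypothetical placeholders rather than extracted 90 nm or 45 nm data.

```python
import math

def ctle_power(f3db_hz, c_load_f, rout_norm, cd_norm, cg_norm, id_norm, vdd):
    """Estimate CTLE sizing and power for a target 3-dB bandwidth.

    Assumed (hypothetical) per-micron normalizations:
      rout_norm: Rout * W  [ohm*um]
      cd_norm:   Cd / W    [F/um]
      cg_norm:   Cg / W    [F/um]
      id_norm:   Id / W    [A/um]
    """
    # Normalized output capacitance that the bandwidth target allows.
    c_budget = 1.0 / (2.0 * math.pi * f3db_hz * rout_norm)
    # Fan-out left over once the drain self-loading is subtracted.
    fan_out = (c_budget - cd_norm) / cg_norm
    if fan_out <= 0:
        raise ValueError("bandwidth not reachable: self-loading exceeds the budget")
    c_in = c_load_f / fan_out          # required input capacitance
    width_um = c_in / cg_norm          # input transistor size
    i_total = 2.0 * id_norm * width_um # factor of 2 for the two CTLE arms
    return {"fan_out": fan_out, "width_um": width_um, "power_w": i_total * vdd}
```

Under these assumptions a higher bandwidth target leaves less fan-out, which forces a wider input device and therefore more current, matching the trend of power rising with bandwidth described in the text.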
 
 
Fig. 20 Stateye results visualized using statistical eye diagram  
 
Table I Electrical link I/O specifications
Parameter                       Value
Bit-error rate (BER)            $10^{-12}$
TX deterministic jitter (dj)    0.01 UI
TX random jitter (rj)           0.01 UI
Min. eye opening compliance     20 mVpp
RX jitter compliance (dj)       0.45 UI
RX jitter compliance (tj)       0.675 UI
Vdd                             1.2 V (90 nm), 1.1 V (45 nm)
Max. TX swing                   1.2 Vppd (90 nm), 1.1 Vppd (45 nm)
  
Due to the complex trade-offs involved in the design of high-speed serial links
and the need for reliable performance at higher data rates, statistical link analysis tools
that co-optimize circuit and channel characteristics have been developed [77]-[79]. An
open source statistical link analysis tool, Stateye [78], is used in this work. Statistical
link analysis uses statistical methods to combine the deterministic and random noise
sources with the channel response and estimate link margins at a given bit-error rate
(BER). Stateye uses the S-parameter channel response (Touchstone format) together with
I/O specifications such as the BER, eye compliance and jitter compliance parameters given
in Table I to determine the voltage and timing margins and equalization coefficients for
a channel response at a given data rate. Stateye results, normalized to the transmit
amplitude [78], are visualized using eye diagrams as shown in Fig. 20, which show the
voltage and timing margins. The link voltage and timing margins that satisfy the I/O
specifications at a particular data rate are obtained for various combinations of
equalization architectures, which include a maximum of 4-tap TX FFE and 5-tap RX
DFE along with the CTLE, and are used to generate a database of the channels’
equalization requirements.
The minimum eye opening required for voltage margin compliance with the
combination of transmitter and receiver equalization is 20 mVpp (Table I). However,
Stateye produces link margin results normalized to the transmit amplitude [78], which
must be scaled to the required transmitter output swing. The link voltage margin increases
as the equalization complexity increases for a given data rate. A link voltage margin much
larger than the minimum eye requirement is unnecessary and inefficient in terms of
power dissipation, as it implies a large transmitter output swing. The transmitter output
swing is therefore optimized such that the link voltage margin just meets the minimum eye
opening compliance requirement, resulting in considerable power savings.
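
The swing scaling described above can be illustrated with a short Python sketch. It assumes that the eye opening scales linearly with the transmit amplitude, so the normalized margin can be treated as eye opening per unit of transmit swing, and that the eye requirement and the swing are expressed in the same voltage convention. The function name and arguments are hypothetical and are not part of Stateye.

```python
def optimize_tx_swing(normalized_margin, min_eye_v=0.020, max_swing_v=1.2):
    """Return the smallest TX swing that meets the eye requirement, or None.

    normalized_margin: eye opening per unit of transmit swing (assumed linear).
    min_eye_v:         minimum eye opening, 20 mV as in Table I.
    max_swing_v:       maximum TX swing, 1.2 V at 90 nm or 1.1 V at 45 nm.
    """
    if normalized_margin <= 0:
        return None  # eye is closed; no swing can satisfy the specification
    required = min_eye_v / normalized_margin
    # An architecture is infeasible if it needs more than the maximum swing.
    return required if required <= max_swing_v else None
```

A candidate equalization architecture that returns None would be dropped from the search space, while one with a large normalized margin is rewarded with a small required swing.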
Transmitter and receiver circuits are modeled by iterating the design variables
over circuit design criteria to satisfy a given data rate specification. The transmitter is
designed such that the serialization and pre-driver circuits have transition times of one
third of the bit period to avoid excessive inter-symbol interference and preserve the pulse
shape of the output signal. A half-rate implementation of the TX FFE using a
current summer driver at the output with a maximum of 4 taps is considered in this
work. Both CML and CMOS logic based circuit designs are analyzed over the data rates
of interest. As shown in Figs. 21 and 22, the static power dissipation of a CML-based
design results in degraded power efficiency at lower data rates relative to a CMOS-based
design, whose dynamic power dissipation scales down at lower frequencies. However, a
CMOS-based design supports lower fan-outs at higher data rates due to a higher
percentage of self-loading capacitance, necessitating large transistor sizes and increased
power to satisfy the transition time specification. Thus, a valuable aspect of this design
methodology is predicting when it is optimum from a power perspective to transition
from a CMOS to a CML-based design.

Fig. 21 Transmitter power efficiency vs. data rate as a function of number of taps using
CML circuits

Fig. 22 Transmitter power efficiency vs. data rate as a function of number of taps using
CMOS circuits

Examples of receiver equalization circuitry power efficiency modeling versus
data rate are shown in Fig. 23 and Fig. 24. A half-baud-rate direct feedback DFE
implementation with a maximum of 5 taps is considered in this work. The maximum
data rate at which the DFE can reliably operate is determined by the 1 unit interval (UI)
first-tap critical timing path of the direct feedback architecture shown in Fig. 25, which
includes the comparator Clock-Q delay, the feedback tap propagation delay, and the time for
amplifier A2 to achieve 95% settling. As shown in Fig. 23, increasing the DFE tap number
adds loading on the critical tap-current summation node, resulting in a reduced
maximum operational data rate. CTLE power efficiency is a strong function of the peak
gain requirement. As shown in Fig. 24, the 90 nm technology realizes 12 dB peak gain up
to slightly above 14 Gb/s, but can achieve 6 dB peak gain past 20 Gb/s. Scaling
to a higher-fT 45 nm process allows realization of 12 dB peak gain out to
18 Gb/s.
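
The first-tap timing constraint described above can be sketched as follows. This is a hedged illustration only: it assumes the summing amplifier settles as a single-pole response, so that 95% settling takes about three time constants, and it models tap loading as a fixed capacitance plus a per-tap increment. All function names, parameters and values are hypothetical.

```python
import math

def summer_tau(r_load_ohm, c_fixed_f, c_per_tap_f, n_taps):
    """Time constant of the summation node; each DFE tap adds capacitance."""
    return r_load_ohm * (c_fixed_f + n_taps * c_per_tap_f)

def dfe_max_data_rate(t_clk_q_s, t_tap_prop_s, tau_s, settle_fraction=0.95):
    """Highest bit rate for which the first-tap loop closes within 1 UI.

    Assumes single-pole settling: t_settle = tau * ln(1 / (1 - fraction)),
    which is about 3 * tau for 95% settling.
    """
    t_settle = tau_s * math.log(1.0 / (1.0 - settle_fraction))
    return 1.0 / (t_clk_q_s + t_tap_prop_s + t_settle)
```

With this model, adding taps increases the summation node time constant and lowers the maximum data rate, which is the trend the text attributes to Fig. 23.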
 
Fig. 23 DFE power efficiency vs. data rate as a function of number of taps 
 
 
 
Fig. 24 CTLE power efficiency vs. data rate as a function of CTLE peak gain 
   
Fig. 25 Critical timing path of the decision feedback equalizer  
 
The link margin and equalization coefficient results from Stateye, along with the
scaled transmitter output swing, are coupled with the normalized transistor parameters and
circuit design constraints of the different transmitter and receiver blocks to design and
compute the power dissipation of each equalization architecture. The power computation
of the various equalization architectures satisfying the I/O specifications yields a wide
design search space, from which the architecture with the minimum
power is selected for a given data rate. Thus, optimizing the transmitter output swing along
with selecting the optimal equalization architecture and circuit style gives optimized power
efficiency for a given data rate, channel type and process technology node. A flowchart
describing this optimization methodology for electrical links is given in Fig. 26.
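
The selection step of this methodology can be sketched as a search over candidate architectures. The sketch below is illustrative only: the `Candidate` record, its fields and the example values used to exercise it are hypothetical, and the feasibility flag stands in for the Stateye margin check described above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass(frozen=True)
class Candidate:
    ffe_taps: int      # 1 to 4 TX FFE taps
    dfe_taps: int      # 0 to 5 RX DFE taps
    use_ctle: bool
    style: str         # "CML" or "CMOS"
    meets_spec: bool   # margins satisfy the I/O specifications
    power_mw: float    # modeled total link power for this architecture

def select_min_power(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Return the feasible candidate with the lowest power, or None."""
    feasible = [c for c in candidates if c.meets_spec]
    return min(feasible, key=lambda c: c.power_mw) if feasible else None

def power_efficiency(power_mw: float, data_rate_gbps: float) -> float:
    """Power efficiency in mW/Gb/s (lower is better)."""
    return power_mw / data_rate_gbps
```

Repeating this selection at each data rate, channel and process node would produce curves of the kind the chapter reports, with the chosen architecture changing as the feasible set changes.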
 
 
Fig. 26 Flowchart of electrical link optimization methodology
  
 
 
 
Table II Optical device parameters (values in parentheses refer to the 45 nm CMOS technology)
Optical device            Parameters
VCSEL [31]                CR = 5 dB, Ith = 0.5 mA, η = 0.28, Rs = 172 Ω, Cj = 130 fF
QWAFEM EAM [38]           CR = 5 dB, IL = 7.2 dB, Vbias = 2.4 V (2.2 V), Vswing = 1.2 V (1.1 V), Cmod = 200 fF
Waveguide EAM [41]        CR = 10 dB, IL = 5 dB, Vbias = 2.4 V (2.2 V), Vswing = 1.2 V (1.1 V), Cmod = 25 fF
Polymer based RR [54]     CR = 8 dB, IL = 4 dB, Tuning power = 1 mW, Vswing = 2.4 V (2.2 V), Cmod = 10 fF
MZ modulator [51]         Vπ = 8 V, IL = 7 dB, Rload = 20 Ω
Long wavelength PD [68]   PD cap = 100 fF, Idark = 5 nA, Responsivity = 0.8 A/W
Integrated PD [1]         PD cap < 1 fF, Idark = 100 µA, Responsivity = 0.9 A/W
 
The parameters of the optical devices discussed in the previous chapter are given
in Table II. The link budgets of the discrete and integrated optical links are summarized
in Tables III and IV. The maximum VCSEL average power is limited to 3 dBm, while
the electrical power of the integrated modulator’s external laser source is limited to a
maximum of 10 mW with 30 % wall plug efficiency and is included in the total link power.
The contrast ratio power penalty and insertion loss are included based on each device's
parameters. For the receiver of the discrete VCSEL link, the lumped input capacitance
consists of an assumed 100 fF bond pad capacitance along with the 100 fF discrete p-i-n
photodetector. The lumped input capacitance of the integrated link receiver,
dominated by the interconnect capacitance, is assumed to be 10 fF. The receiver circuit
input capacitance is added to these values to calculate the total input capacitance.
Table III Discrete optical link budget
Parameter                          Value
Max. avg. VCSEL TX power           3 dBm
VCSEL to MMF coupling              -1.1 dB
MMF to photodetector coupling      -1.1 dB
Contrast ratio (5 dB) penalty      -2.844 dB
Margin                             -3 dB
Link budget                        -8.044 dB
Required RX sensitivity            -5.044 dBm
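
The arithmetic behind Table III can be checked with a few lines of Python. The sketch assumes the contrast ratio penalty follows the standard extinction ratio power penalty, 10·log10((r + 1)/(r − 1)) with r the linear contrast ratio; this formula is not stated in the text, but it reproduces the 2.844 dB entry for a 5 dB contrast ratio. Function names are hypothetical.

```python
import math

def contrast_ratio_penalty_db(cr_db):
    """Power penalty for a finite contrast ratio (assumed standard formula)."""
    r = 10.0 ** (cr_db / 10.0)  # contrast ratio as a linear power ratio
    return 10.0 * math.log10((r + 1.0) / (r - 1.0))

def required_sensitivity_dbm(tx_power_dbm, losses_db):
    """Receiver sensitivity needed after subtracting all budget items."""
    return tx_power_dbm - sum(losses_db)

# Table III items: two coupling losses, contrast ratio penalty, margin.
penalty = contrast_ratio_penalty_db(5.0)
sensitivity = required_sensitivity_dbm(3.0, [1.1, 1.1, penalty, 3.0])
```

Here `penalty` evaluates to about 2.844 dB and `sensitivity` to about -5.044 dBm, in agreement with the table.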
 
 
Table IV Integrated optical link budget
Parameter                          Value
Max. source CW laser power         4.8 dBm
Source laser to SMF coupling       -2 dB
Modulator to SMF coupling          -2 dB
SMF to photodetector coupling      -3 dB
Margin                             -3 dB
Contrast ratio penalty (CR_P)      depends on the device
Insertion loss (IL)                depends on the device
Link budget                        -(10 + CR_P + IL) dB
Required RX sensitivity            -(5.2 + CR_P + IL) dBm
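
The integrated link budget of Table IV can be expressed as a small function. The sketch assumes that the 4.8 dBm source power corresponds to the 10 mW electrical laser limit at 30 % wall plug efficiency (3 mW of optical power, about 4.77 dBm), which is consistent with the numbers in the text but is an interpretation. The function names and defaults are hypothetical.

```python
import math

def mw_to_dbm(power_mw):
    """Convert a power in milliwatts to dBm."""
    return 10.0 * math.log10(power_mw)

def integrated_rx_sensitivity_dbm(cr_penalty_db, il_db,
                                  laser_electrical_mw=10.0,
                                  wall_plug_efficiency=0.30,
                                  fixed_losses_db=(2.0, 2.0, 3.0, 3.0)):
    """Required RX sensitivity for a modulator based link.

    fixed_losses_db holds the three coupling losses and the margin from
    Table IV; cr_penalty_db and il_db are the device dependent terms.
    """
    laser_optical_dbm = mw_to_dbm(laser_electrical_mw * wall_plug_efficiency)
    return laser_optical_dbm - sum(fixed_losses_db) - cr_penalty_db - il_db
```

A device with a higher insertion loss or a poorer contrast ratio therefore demands a more sensitive receiver for the same laser power.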
 
For optical links, the normalized transistor parameters obtained for different
biasing conditions, corresponding to different transistor transition frequencies fT, are
used to jointly optimize the transmitter and receiver design variables over circuit and link
constraints to satisfy a particular data rate at a bit-error rate (BER) specification of
$10^{-12}$, the same as that of the electrical links. The analysis is enhanced by the use of
multiple biasing conditions, which allows a wide design search space and provides a design
solution that achieves minimum power for a given data rate at an optimum fT. The 90
nm process in this analysis achieves a peak fT of 110 GHz at 0.4 mA/µm current density
as shown in Fig. 27, while scaling to the 45 nm process allows 225 GHz peak fT at 0.4
mA/µm as illustrated in Fig. 28. Multiple solutions that satisfy the design constraints are
obtained, corresponding to differing circuit configurations (e.g., TIAs with varying
gain and number of LA stages, and optical transmitters with differing levels of optical
output power), from which the minimum power solution is selected at each data rate. A
flowchart describing this optimization methodology for optical links is given in Fig. 29.
 
 
Fig. 27 Transistor fT versus input Vgs voltage in 90 nm CMOS process 
 
 
Fig. 28 Transistor fT versus input Vgs voltage in 45 nm CMOS process 
 
Fig. 29 Flowchart of optical link optimization methodology 
 
Applying the above procedure to the integrated photodetector (PD) based optical
receiver, the receiver sensitivity required to achieve a $10^{-12}$ BER is calculated in
terms of average power in dBm from the input-referred noise and the power penalty due to
decision circuit threshold ambiguity. The improvement in receiver sensitivity due to the
addition of LA stages in the 45 nm process is shown in Fig. 30, along with the required
receiver sensitivity of the QWAFEM-based link. Adding one LA stage enhances sensitivity
out to high data rates, while additional LA stages yield only marginal improvements
because of the relatively low decision circuit threshold ambiguity. Note that the
sensitivity of the multiple-LA-stage designs degrades at higher data rates because the
bandwidth compression of the cascaded amplifiers forces an increased receiver input
capacitance. For the QWAFEM-based link, the TIA-only solution is selected up to 20 Gb/s,
beyond which additional LA stages are required.
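
A first-order version of this sensitivity calculation can be sketched in Python. The formulation below is an assumption rather than the exact model of this work: it takes Gaussian noise, so that BER = 0.5·erfc(Q/√2), assumes an ideal contrast ratio, and refers the decision threshold ambiguity to the input by dividing it by the gain in front of the decision circuit. All names and values are hypothetical.

```python
import math

def q_for_ber(ber):
    """Solve BER = 0.5 * erfc(Q / sqrt(2)) for Q by bisection."""
    lo, hi = 0.0, 40.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 0.5 * math.erfc(mid / math.sqrt(2.0)) > ber:
            lo = mid  # error rate still too high, need a larger Q
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rx_sensitivity_dbm(i_noise_rms_a, responsivity_a_per_w, ber=1e-12,
                       v_ambiguity_v=0.0, gain_to_decision_ohm=1.0):
    """Average optical power (dBm) needed at the photodetector."""
    q = q_for_ber(ber)
    # Peak-to-peak input current: noise term plus input-referred ambiguity.
    i_pp = 2.0 * q * i_noise_rms_a + v_ambiguity_v / gain_to_decision_ohm
    p_avg_w = i_pp / (2.0 * responsivity_a_per_w)
    return 10.0 * math.log10(p_avg_w / 1e-3)
```

In this model, raising the gain ahead of the decision circuit (for example by adding an LA stage) shrinks the ambiguity term, and once that term is small compared with the noise term further gain yields little benefit, which is consistent with the behavior described for Fig. 30.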
 
 
Fig. 30 Receiver sensitivity vs. data rate for 45 nm integrated photodetector 
 
3.3 Comparison of Electrical and Optical Link Performance 
Utilizing the optimization methodology discussed in the previous section, the
power efficiencies of the electrical and optical links are calculated and compared for
different data rates and process technologies in this section. It begins with an electrical
link power efficiency comparison over the three backplane channels: a short 1” channel
(B1) with bottom traces, a 20” channel (C4) with a bottom stripline layer and a 20” channel
(T20) with top traces. The power efficiencies of the four optical links formed
using the different optical devices discussed in the previous chapter are compared in the
following subsection. The section ends with the power efficiency comparison of electrical
and optical links.
3.3.1 Electrical Link Performance 
 
 
Fig. 31 Total power efficiency of electrical link vs. data rate using CML circuits with TX 
output swing optimized 
 
Utilizing the optimization methodology discussed above, the optimized power
efficiency of the electrical link operating on the 1” backplane channel B1 and the 20”
backplane channels C4 and T20 is evaluated at different data rates. Fig. 31 compares
the power efficiency of electrical links using CML circuits with optimized transmitter
swing on all three backplane channels in both 90 nm and 45 nm CMOS technologies.
The link power efficiency of the three channels at low data rates is almost the same
because the optimization methodology selects the minimal equalization architecture of a
1-tap transmitter with CTLE and minimum-sized transistors. As the data rate
increases, the more difficult T20 backplane channel, with its higher frequency dependent
losses, becomes channel limited, requiring an equalization complexity of 2-tap transmitter
FFE, 5-tap DFE and CTLE at 12 Gb/s with a power efficiency of 3.75 mW/Gb/s at 90
nm and 2.14 mW/Gb/s at 45 nm. Even with maximum equalization complexity, the T20
backplane channel is not able to scale beyond 12 Gb/s due to the severe frequency
dependent losses. The C4 channel, with improved channel loss characteristics, achieves
better power efficiency and is able to operate up to higher data rates than the
T20 channel. The critical timing path of the first tap in the direct feedback DFE becomes
a major constraint in achieving reliable performance and limits operation at higher
data rates. As shown in Fig. 31, the C4 link operates up to 16 Gb/s in 90 nm with a power
efficiency of 4.7 mW/Gb/s, while scaling to 45 nm allows operation up to 18 Gb/s at
3.72 mW/Gb/s power efficiency. The low-loss B1 channel is able to
scale up to 20 Gb/s in both the 90 nm and 45 nm processes as shown in Fig. 31. For the B1
channel, the link achieves a power efficiency of 1.07 mW/Gb/s and 0.63 mW/Gb/s at 10
Gb/s in the 90 nm and 45 nm processes, respectively. These power efficiency values are
comparable to [85], which achieved a power efficiency of around 1 mW/Gb/s, including
only transmitter and receiver circuit power and excluding all clock element power, at 10
Gb/s in a 65 nm process at a slightly higher link voltage margin.
 
 
Fig. 32 Total power efficiency of electrical link vs. data rate using CML circuits with 
and without TX output swing optimization in 90 nm process 
 
The impact of optimizing the transmitter output swing and the circuit style on the link
power efficiency for the three channels is illustrated in Figs. 32 and 33, respectively.
Optimizing the transmit output swing can dramatically reduce power. As shown in Fig. 32,
at 12 Gb/s the power is roughly cut in half on the high-loss T20 channel, from 7.01
mW/Gb/s without any transmit swing optimization to 3.75 mW/Gb/s with transmit
swing optimization. On the low-loss B1 channel, the power efficiency with optimized
transmit output swing is reduced to 20% of the non-optimized value at 12
Gb/s. Thus, transmit output swing optimization saves significant power for
high-speed electrical links. The choice of CML vs. CMOS circuit style is a function of
data rate and technology node. As shown in the 90 nm modeling results of Fig. 33, at low
data rates the CMOS-based link has better power efficiency than the CML-based link,
which has significant static power dissipation. However, beyond 14 Gb/s the CMOS-based
link requires large power due to reduced fan-out, and the CML-based link becomes more
power efficient. For example, at 16 Gb/s the CMOS-based link achieves 5.95 mW/Gb/s
operating on the low-loss B1 channel, while the CML-based link power efficiency is
only 1.62 mW/Gb/s.
 
  
Fig. 33 Total power efficiency of electrical link vs. data rate using CML vs. CMOS 
circuits in 90 nm process 
 
The impact of electrical channel and process node is evident in the modeling
results of Fig. 34, which combines the CMOS and CML-based results to select the
optimum design at a given data rate, and Fig. 35, which shows the optimum equalization
architecture. The high-loss T20 channel is strongly channel-limited, as there is no
difference in the optimum equalization architecture or CMOS circuit style between the
90 nm and 45 nm processes. A 3-tap FFE transmitter and 4-tap DFE receiver are required at
the maximum data rate of 12 Gb/s, resulting in a power efficiency of 3.0 mW/Gb/s in the
90 nm process and 1.8 mW/Gb/s in the 45 nm process.

Fig. 34 Optimal solution with minimum power efficiency vs. data rate

Fig. 35 Optimized equalization architecture vs. data rate

The C4 channel has improved loss characteristics due to signaling on the bottom
backplane layer, avoiding the detrimental impact of the T20 long via stubs. For this
channel, the process node has an impact on the optimum equalization architecture and
circuit style. In the 90 nm technology, a CMOS design is more power efficient up to 14
Gb/s, while above this data rate a CML design is chosen. In contrast, a CMOS design is
chosen for all data rates in the 45 nm technology. Also, the 90 nm design cannot
efficiently leverage CTLE equalization above 12 Gb/s, while the 45 nm design utilizes a
CTLE up to 16 Gb/s. The 90 nm design is limited to 16 Gb/s due to the inability to
implement a high-speed direct feedback DFE, while in the 45 nm process the DFE
can operate fast enough to allow 18 Gb/s.
The low-loss B1 channel does not require significant equalization complexity
until about 18 Gb/s. Interestingly, the optimal equalization architecture selected is a
1-tap TX FFE with CTLE up to 16 Gb/s in 90 nm and 18 Gb/s in 45 nm. Including the CTLE
actually consumes less power than using only a 1-tap TX FFE, i.e. no equalization, as the
CTLE peak gain allows the transmit output swing to be scaled down significantly. The
90 nm design switches to a 3-tap TX FFE at 18 Gb/s due to the inefficiency of the CTLE at
this high data rate, while the 45 nm design can still leverage a high peak gain CTLE at
this data rate and does not require the 3-tap TX FFE until 20 Gb/s. Excellent power
efficiency is achieved with this low-loss channel, as sub-mW/Gb/s operation is possible
for the transmitter and receiver circuitry, again neglecting clock generation,
distribution, and recovery, in the 45 nm technology up to 18 Gb/s. Above 20 Gb/s, the
channel could potentially achieve higher data rates with DFE. However, even the 45 nm
technology cannot efficiently implement the direct feedback architecture modeled in this
work at these data rates. Thus, this link is technology limited and could potentially
benefit from scaling to a more advanced process node.
3.3.2 Optical Link Performance 
Using the optimization methodology discussed in the previous section for optical
links, the optimized power efficiency of the optical links is obtained at various data
rates. Fig. 36 compares the power efficiency of the tunnel junction (TJ) VCSEL based link
in the 90 nm and 45 nm CMOS processes. This link attains a power efficiency of
2.15 mW/Gb/s and 1.2 mW/Gb/s at a data rate of 20 Gb/s in the 90 nm and 45 nm
technologies, respectively. The VCSEL bandwidth constraint and maximum power levels
limit the link performance at data rates higher than 24 Gb/s in both the 90 nm and 45 nm
nodes. Further improvements in VCSEL bandwidth through increased device
differential gain, reduced series resistance, and improved slope efficiency would result in
improved power efficiency at higher data rates.
 
Fig. 36 Total power efficiency of VCSEL based link vs. data rate 
 
 
 
Fig. 37 Total power efficiency of QWAFEM based link vs. data rate 
 
Fig. 38 Total power efficiency of waveguide based EAM link vs. data rate 
 
The QWAFEM EAM structure has a reasonable contrast ratio at low voltage
swings, but its relatively high insertion loss and large modulator capacitance
increase the power dissipation, as shown in Fig. 37. The other EAM structure, the
waveguide EAM, yields an optical link with excellent power efficiency (Fig. 38) due to
its good contrast ratio, low swing and bias voltages, and small modulator capacitance. The
waveguide EAM link achieves a power efficiency of 1.498 mW/Gb/s and 0.763 mW/Gb/s at
20 Gb/s in 90 nm and 45 nm, respectively, in comparison to the QWAFEM structure, which
achieves 2.09 mW/Gb/s and 1.096 mW/Gb/s at the same data rate in the same
technologies. Technology scaling yields higher transistor fT per current density, which
improves the power efficiency of the integrated optical links and supports
higher data rates. The maximum data rate extends from 24.75 Gb/s to 34 Gb/s for the
waveguide EAM and from 20 Gb/s to 28 Gb/s for the QWAFEM EAM when scaling from 90
nm to 45 nm. When compared to the VCSEL link at 20 Gb/s, the waveguide EAM link
attains 35 % better power efficiency in the 45 nm node, while the QWAFEM link
performance is similar to that of the VCSEL link. Improved alignment tolerance and lower
coupling loss would further enhance the waveguide device’s suitability for high volume
applications, while QWAFEM structures with lower insertion loss and smaller
capacitance would result in improved power efficiency.
 
 
Fig. 39 Total power efficiency of ring resonator based link vs. data rate 
 
The refractive ring resonator modulator based link achieves excellent power
efficiency (Fig. 39) due to its good contrast ratio and ultra-low modulator capacitance.
At 20 Gb/s, it attains a power efficiency of 1.47 mW/Gb/s and 0.77 mW/Gb/s in the 90 nm
and 45 nm processes, respectively. Scaling from 90 nm to 45 nm allows the data rate of the
ring resonator based link to increase from 10 Gb/s to 28 Gb/s at a power efficiency
of 1.2 mW/Gb/s, and extends the maximum operating data rate from 25 Gb/s to 34 Gb/s.
At 20 Gb/s in the 45 nm node, the RR link achieves a 35 % improvement in
power efficiency relative to the VCSEL and QWAFEM EAM links and performance similar
to that of the waveguide EAM link. Although the ring resonator based link has very good
power efficiency, ring resonators have a very narrow optical bandwidth (~1 nm) [57] and
are very sensitive to process and temperature variations. Efficient feedback tuning
circuits and/or improvements in device structure that reduce sensitivity to variations
would enhance the feasibility of these devices in high density applications.
 
 
Fig. 40 Total power efficiency of Mach-Zehnder modulator based link vs. data rate 
 
In contrast to the ring resonator based link, the refractive MZM based link has a
very wide optical bandwidth of around 100 nm [57] and improved tolerance to process
and temperature fluctuations. Although voltage swing and contrast ratio were included in
the power optimization, the resulting MZM link power efficiency is roughly one order
of magnitude higher (that is, worse) than that of the other optical links (Fig. 40). The
power efficiency achieved is 4.5 mW/Gb/s and 3.657 mW/Gb/s at a 20 Gb/s data rate in the
90 nm and 45 nm processes, respectively. Technology scaling does not have a major impact,
as the overall power is dominated by the voltage swing required for the low impedance
modulator. Methods to enhance the refractive index change are required in order to reduce
device footprint and perhaps allow higher impedance termination.
3.3.3 Comparison of Electrical and Optical Links 
 
 
Fig. 41 Total power efficiency of electrical and optical link vs. data rate in 90 nm CMOS 
process 
 
The comparison of high-speed electrical links operating on the three different
channels with the optical links based on VCSEL, waveguide EAM and ring resonator devices
on the basis of power efficiency is shown in Fig. 41 and Fig. 42 for the 90 nm and 45 nm
processes, respectively. While the high-loss T20 channel is strongly channel-limited and
the C4 channel, with its improved loss characteristic, requires additional equalization
complexity and more power per bit above 12 Gb/s, the low-loss B1 channel achieves
excellent power efficiencies up to data rates of 14 Gb/s in the 90 nm and 18 Gb/s in the
45 nm process. Scaling and optimization of the transmitter output swing, coupled with the
selection of minimum circuit complexity due to the low-loss channel characteristics,
allows this excellent power efficiency to be achieved. The additional optical link overhead
components, such as the external laser source power dissipation of the integrated links,
the higher driver supply voltage of the VCSEL, the static power component of the
electroabsorption modulator and the tuning circuit power of the ring resonator modulator,
result in degraded power efficiency at lower data rates. Beyond 14 Gb/s in 90 nm and
18 Gb/s in 45 nm, the optical links become the more power efficient solution, as the
additional equalization complexity that electrical links need to compensate for the
channel losses increases their power consumption. At 18 Gb/s, the electrical link requires
3.48 mW/Gb/s in the 90 nm process operating on the low-loss B1 channel, while the VCSEL
based link achieves 2.06 mW/Gb/s and the waveguide EAM and ring resonator
modulator based links achieve 1.4 mW/Gb/s at the same data rate in
the same technology.
Unlike the electrical links, which are limited to a data rate of 20 Gb/s and require
additional equalization complexity and more power per bit to scale further, the
optical links are able to scale up to 25 Gb/s in the 90 nm process and 34 Gb/s in the 45
nm process at relatively low power per bit. As the CMOS process is scaled from 90 nm
to 45 nm, the power efficiency of the links improves and higher data rates are supported
due to the higher transistor fT per current density obtained as a result of technology
scaling. A critical feature of the design methodology used in this work is predicting how
the optimal equalization architecture and circuit configuration vary with scaling of CMOS
technology nodes. For instance, the electrical link operating on the C4 channel changes
its optimal equalization complexity and circuit style above 12 Gb/s on scaling
from 90 nm to 45 nm, while the maximum achievable data rate scales from 25 Gb/s to 34
Gb/s for the waveguide EAM based link and the ring resonator based link due to
technology scaling.
 
 
Fig. 42 Total power efficiency of electrical and optical link vs. data rate in 45 nm CMOS 
process 
 
The electrical links with scaled transmitter output swing achieve excellent power 
efficiency at lower data rates, until the channel limitations require additional circuit 
complexity. At higher data rates the optical links, with their relatively low mW/Gb/s, 
become the more power-efficient solution. The cross-over point, defined as the data rate 
beyond which the optical links are more power efficient than the electrical links, is 
around 9 Gb/s for the VCSEL based link and 8.5 Gb/s for the integrated links with 
respect to the T20 channel, and remains almost the same on scaling from 90 nm to 
45 nm. Similarly, the cross-over point of the C4 channel, which changes little with 
technology scaling, is around 12.6 Gb/s for the VCSEL based link and 11.9 Gb/s for 
the integrated links. In contrast to these two channels, the B1 channel cross-over point 
increases from 16.2 Gb/s to 19.3 Gb/s for the VCSEL based link and from 14.8 Gb/s to 
18.3 Gb/s for the integrated links on scaling from 90 nm to 45 nm. The combined effect 
of technology scaling, which yields higher transistor fT per current density, and the 
low-loss channel characteristics allows the electrical links to operate up to higher data 
rates and to tolerate more loss at a lower mW/Gb/s. 
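The cross-over point defined above can be located numerically once the two power 
efficiency curves are available as sampled data. The sketch below is a minimal 
illustration that assumes both curves are sampled at the same data rates and uses linear 
interpolation between samples; the function name and the sample values are hypothetical 
and are not results from this work. 

```python
def crossover_rate(rates, electrical, optical):
    """Return the first data rate at which the optical curve drops below the
    electrical curve (both in mW/Gb/s, lower is better), found by linear
    interpolation between samples. Returns None if the curves never cross."""
    for i in range(1, len(rates)):
        d0 = electrical[i - 1] - optical[i - 1]  # <= 0 while electrical is better
        d1 = electrical[i] - optical[i]          # > 0 once optical is better
        if d0 <= 0 < d1:
            t = -d0 / (d1 - d0)                  # fraction of the interval at the crossing
            return rates[i - 1] + t * (rates[i] - rates[i - 1])
    return None


# Hypothetical sample curves, for illustration only.
rates_gbps = [8, 10, 12, 14, 16]
electrical_eff = [1.0, 1.2, 1.5, 2.1, 3.0]
optical_eff = [1.8, 1.7, 1.6, 1.6, 1.7]

print(crossover_rate(rates_gbps, electrical_eff, optical_eff))
```

With these made-up curves the crossing falls between the 12 Gb/s and 14 Gb/s samples. 
Applying the same procedure per channel and per process node would reproduce the kind 
of comparison reported in this section. 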
The electrical link requires additional receiver decision feedback equalization (DFE) 
complexity to operate at higher data rates, which further increases its power 
consumption. Moreover, the critical feedback timing path of the direct feedback DFE 
limits operation at higher data rates. Loop-unrolled DFE architectures [86] can operate 
at higher data rates, but the additional circuit complexity would still result in a higher 
mW/Gb/s figure at high data rates for the electrical link. Better channels with low-loss 
characteristics or low-power equalization circuits are needed to improve the power 
efficiency of electrical links. In comparison, the inherent bandwidth of the VCSEL based 
link must be improved to achieve good power efficiency and scale to higher data rates. 
Although the waveguide EAM achieves good power efficiency at higher data rates, 
improvements in alignment tolerance and coupling losses are needed to make these 
devices feasible in high-density applications. Ring resonators offer good power 
efficiency but have very narrow optical bandwidth (~1 nm) and are very sensitive to 
process and temperature variations, so they require additional tuning circuits to be 
suitable for high-volume applications. 
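To make the distinction between the direct feedback and loop-unrolled DFE mentioned 
above concrete, the following behavioral sketch models a one-tap DFE both ways. It is 
a simplified illustration under stated assumptions (a single post-cursor tap alpha, ideal 
noiseless samples, binary +1/-1 symbols); the function names are hypothetical and the 
model says nothing about the circuit-level timing or power of the implementations in 
this work. It only shows that the two structures make the same decisions, while the 
loop-unrolled version precomputes both candidate decisions so that only a selection 
remains in the feedback path. 

```python
def sign(x):
    """Binary slicer: +1 for non-negative input, -1 otherwise."""
    return 1 if x >= 0 else -1


def isi_channel(bits, alpha, d_init=1):
    """Toy channel with one post-cursor tap: y[n] = d[n] + alpha * d[n-1]."""
    samples, prev = [], d_init
    for d in bits:
        samples.append(d + alpha * prev)
        prev = d
    return samples


def direct_feedback_dfe(samples, alpha, d_init=1):
    """The previous decision is fed back and subtracted before slicing."""
    decisions, prev = [], d_init
    for y in samples:
        prev = sign(y - alpha * prev)
        decisions.append(prev)
    return decisions


def loop_unrolled_dfe(samples, alpha, d_init=1):
    """Both possible corrections are sliced speculatively; the previous
    decision only selects between the two precomputed results."""
    decisions, prev = [], d_init
    for y in samples:
        if_prev_plus = sign(y - alpha)   # assumes previous bit was +1
        if_prev_minus = sign(y + alpha)  # assumes previous bit was -1
        prev = if_prev_plus if prev == 1 else if_prev_minus
        decisions.append(prev)
    return decisions


bits = [1, -1, -1, 1, 1, -1, 1]
rx = isi_channel(bits, alpha=0.6)
print(direct_feedback_dfe(rx, 0.6) == loop_unrolled_dfe(rx, 0.6) == bits)
```

The speculative slicers double the comparator count for one tap, which is consistent 
with the observation above that loop unrolling relaxes timing at the cost of additional 
circuit complexity. 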
3.4 Summary 
In summary, this chapter described the optimization methodology used in this 
work and presented a performance comparison of electrical and optical links. It 
began with an overview of previously used optimization methodologies, followed by 
a discussion of the optimization methodology and flowcharts used in this work. The 
electrical link performance over the three channels was compared for a given data rate, 
channel type, and process technology, and the performance of four different optical 
links was compared for a given data rate and process node. The chapter concluded 
with a performance comparison between electrical and optical interconnect architectures 
that identified the cross-over data rate, discussed the impact of technology scaling, and 
suggested improvements for further scaling of data rates. 
 
 
 
 
 
CHAPTER IV 
CONCLUSION AND FUTURE WORK 
In conclusion, this work presented a design flow for the modeling and 
optimization of power dissipation of high-speed equalized-electrical and optical I/O 
links. It compared the power efficiency performance of the equalized-electrical links 
over three channels with four different optical links and identified the key parameters 
critical to further power efficiency improvements. The impact of technology scaling on 
performance was also considered by comparing the performance in 90 nm and 45 nm 
CMOS process technologies. 
I/O communication bandwidth, which has increased rapidly with nanometer CMOS 
technology scaling, is limited by the electrical I/O channel bandwidth, necessitating 
equalization schemes that compensate for channel distortion and loss. This thesis 
developed a design methodology that couples statistical link analysis techniques and 
circuit power estimates with transmitter output swing optimization to compute the power 
efficiency of high-speed equalized-electrical I/O links over three different channels. The 
design methodology predicts the optimum equalization architecture, circuit style (CMOS 
vs. CML), and transmit output swing for minimum I/O power. The high-loss T20 channel 
is primarily channel-limited to a maximum of 12 Gb/s, achieving a power efficiency of 
3.0 mW/Gb/s and 1.8 mW/Gb/s in the 90 nm and 45 nm processes, respectively. The C4 
channel, with improved loss characteristics, operated up to 16 Gb/s in 90 nm and 18 Gb/s 
in 45 nm, where it was limited by the critical timing path of the direct feedback DFE. The 
low-loss B1 channel achieved excellent power efficiency in the sub-mW/Gb/s range up to 
20 Gb/s in the 45 nm node by leveraging low equalization complexity and scaling of the 
transmitter output swing.  
Future processors and systems require an on-chip bandwidth in the range of 
hundreds of Gb/s to Tb/s, and optical links with low mW/Gb/s offer a viable 
alternative to electrical links for these dramatically increasing data rates. This 
thesis considered four different optical links and presented a design framework that 
jointly optimizes the driver and receiver circuits and predicts the optimal circuit 
configuration and biasing condition to achieve minimum mW/Gb/s for a given data rate 
and process technology. VCSEL bandwidth and maximum power levels limit link 
performance to 24 Gb/s in both the 90 nm and 45 nm processes. Among the integrated 
photonic links, the waveguide EAM and ring resonator provide good power efficiency, 
attaining 1.65 mW/Gb/s at 30 Gb/s in the 45 nm process, while the 
misalignment-tolerant QWAFEM structure achieves 1.096 mW/Gb/s at 20 Gb/s. Although 
the MZM achieves wide optical bandwidth, significant improvements are needed to 
obtain a lower Vπ and better power efficiency. 
Thus, electrical links with low-loss channel characteristics and optimized 
transmitter output swing, and optical links with negligible frequency-dependent loss, 
offer promising solutions to the dramatically increasing I/O bandwidth demand at 
minimum power. For the electrical links to scale up to high data rates with minimum 
power, the frequency-dependent losses of the channel should be reduced, and the 
channel should be used with scaled transmitter output swing and minimum equalization 
complexity. The efficient integration of optical links with integrated circuits offers 
significant performance benefits for future many-core and multi-core processors. To be 
used in high-density integration with CMOS chips, the optical devices should be 
improved to provide good misalignment tolerance and wide optical bandwidth.   
4.1 Future Work 
 
 
Fig. 43 Power efficiency of the CTLE with 12-dB peak gain vs. data rate for different 
supply voltages 
 
This thesis considered the optimization and minimization of the power 
dissipation of high-speed electrical and optical links. This section presents further 
extensions that could be pursued to minimize the power dissipation of the entire link. 
Dynamic voltage frequency scaling (DVFS), in which the supply voltage is scaled with 
the data rate, is another important factor that enables a non-linear reduction of power 
with data rate. To illustrate the non-linear scaling of power efficiency with data rate, 
the power efficiency of the CTLE with 12-dB and 6-dB peak gain is shown for different 
supply voltages in Figs. 43 and 44, respectively. A higher supply voltage allows 
operation up to higher data rates, but at lower data rates the power efficiency is worse 
with a high supply voltage. The optimization methodology could be extended to choose 
the optimal supply voltage at a given data rate to obtain minimum power. 
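The supply selection step suggested above could be sketched as a simple lookup over 
characterized operating points. The example below is only an illustration: the table of 
supply voltages, maximum data rates, and power values is entirely hypothetical (it is 
not data from Figs. 43 and 44), and the function name is an assumption. It picks, for a 
target data rate, the feasible supply with the lowest power and reports the resulting 
mW/Gb/s. 

```python
# Hypothetical CTLE characterization: supply (V) -> (max data rate in Gb/s, power in mW).
# These numbers are placeholders for illustration, not measured or simulated results.
CTLE_TABLE = {
    0.8: (8.0, 2.0),
    1.0: (14.0, 3.5),
    1.2: (20.0, 5.4),
}


def best_supply(data_rate_gbps, table=CTLE_TABLE):
    """Return (vdd, mW/Gb/s) for the lowest-power supply that still supports
    the requested data rate, or None if no supply in the table is fast enough."""
    feasible = [
        (power_mw, vdd)
        for vdd, (max_rate, power_mw) in table.items()
        if max_rate >= data_rate_gbps
    ]
    if not feasible:
        return None
    power_mw, vdd = min(feasible)
    return vdd, power_mw / data_rate_gbps


for rate in (6.0, 10.0, 18.0, 25.0):
    print(rate, best_supply(rate))
```

In a fuller treatment the table would be replaced by the circuit power model evaluated 
at each candidate supply, so that the supply becomes one more optimization variable 
alongside the circuit style and output swing. 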
 
 
Fig. 44 Power efficiency of the CTLE with 6-dB peak gain vs. data rate for different 
supply voltages 
 
This thesis considered a BER of 10^-12 for both the electrical and optical 
links. Fig. 45 shows the power efficiency of the electrical link operating on the 
high-loss T20 channel in the 90 nm process as a function of BER, and the corresponding 
equalization architecture is shown in Fig. 46. Relaxing the BER target to 10^-3 
reduces the equalization complexity at higher data rates, which improves the power 
efficiency and also increases the operational data rate until it is limited by the DFE 
critical feedback timing path in 90 nm. This performance improvement motivates a study 
of the potential impact of coding and modulation techniques on the power efficiency of 
electrical links. The effect of different modulation techniques, such as PAM-4 and 
duobinary signaling, combined with scaling of the transmitter output swing, on power 
efficiency could be investigated.  
 
Fig. 45 Power efficiency vs. data rate as a function of BER for the T20 channel in 90 nm 
 
Fig. 46 Optimal equalization architecture vs. data rate as a function of BER for the T20 
channel in 90 nm 
 
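One way to see why the relaxed BER target discussed above eases the equalization 
requirement is the standard Gaussian-noise relation between BER and the Q-factor for 
binary signaling, BER = 0.5 * erfc(Q / sqrt(2)). The sketch below inverts this relation 
numerically. It is a simplified illustration that assumes additive Gaussian noise and 
ignores the ISI and jitter statistics handled by the statistical link analysis in this 
work; the function names are hypothetical. 

```python
import math


def ber_from_q(q):
    """BER of binary signaling in Gaussian noise for a given Q-factor."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


def q_from_ber(ber, lo=0.0, hi=10.0, iterations=100):
    """Invert ber_from_q by bisection (BER decreases monotonically with Q)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ber_from_q(mid) > ber:
            lo = mid  # Q too small, BER still above target
        else:
            hi = mid
    return 0.5 * (lo + hi)


q_strict = q_from_ber(1e-12)
q_relaxed = q_from_ber(1e-3)
margin_relief_db = 20.0 * math.log10(q_strict / q_relaxed)

print(f"Q for BER 1e-12: {q_strict:.2f}")
print(f"Q for BER 1e-3:  {q_relaxed:.2f}")
print(f"Required eye amplitude relief: {margin_relief_db:.1f} dB")
```

Under this Gaussian assumption the required Q drops from about 7.0 to about 3.1, 
roughly 7 dB less signal-to-noise margin, which is the kind of headroom that coding 
could trade for lower equalization complexity. 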
Clocking power, including clock synthesis, distribution, and recovery, is common 
to all of the architectures and was not included in this work. Depending upon the 
application, both the transmitter-side and receiver-side clocking circuits could be 
modeled, and the optimal clocking structure could be selected together with the I/O 
circuits. The optimization methodology could be enhanced further by using convex 
optimization techniques. For the optical links, the cost of using optical rather than 
electrical interconnects could be weighed against their better power efficiency, and 
the prospective impact could be explored. 
 
 
 
 
 
 
 
REFERENCES 
 
[1]  I. Young, E. Mohammed, J. T. S. Liao, A.M. Kern, S. Palermo, B. A. Block, M. 
R. Reshotko, and P. L. D. Chang, "Optical I/O technology for tera-scale 
computing," IEEE Journal of Solid-State Circuits, vol. 45, no. 1, pp. 235-248, 
Jan. 2010. 
[2]  International Technology Roadmap for Semiconductors 2006 Update. 
Semiconductor Industry Association (SIA), 2006. 
[3]  J. F. Bulzacchelli, M. Meghelli, S. V. Rylov, W. Rhee, A. V. Rylyakov, H. A. 
Ainspan, B. D. Parker, M. P. Beakes, A. Chung, T. J. Beukema, P. K. 
Pepeljugoski, L. Shan, Y. H. Kwark, S. Gowda, and D. J. Friedman, “A 10-Gb/s 
5-tap DFE/4-tap FFE transceiver in 90-nm CMOS technology,” IEEE Journal of 
Solid-State Circuits, vol. 41, no.12, pp. 2885–2900, Dec. 2006. 
[4]  R. Payne, P. Landman, B. Bhakta, S. Ramaswamy, S. Wu, J. D. Powers, M. Ulvi 
Erdogan, A. Yee, R. Gu, L. Wu, Y. Xie, B. Parthasarathy, K. Brouse, W. 
Mohammed, K. Heragu, V. Gupta, L. Dyson, and W. Lee, “A 6.25-Gb/s binary 
transceiver in 0.13-µm CMOS for serial data transmission across high loss legacy 
backplane channels,” IEEE Journal of Solid-State Circuits, vol. 40, no. 12, pp. 
2646–2657, Dec. 2005. 
[5]  D. A. B. Miller, “Physical reasons for optical interconnection,” International 
Journal of Optoelectronics, vol. 11, no. 3, pp. 155–168, 1997. 
[6]  B. Casper, J. Jaussi, F. O'Mahony, M. Mansuri, K. Canagasaby, J. Kennedy, E. 
Yeung, and R. Mooney, “A 20 Gb/s forwarded clock transceiver in 90-nm 
CMOS,” in Proc. of IEEE International Solid-State Circuits Conference Digest 
of Technical Papers, Feb. 2006, pp. 263-272. 
[7]  J. Jaussi, B. Casper, M. Mansuri, F. O’Mahony, K. Canagasaby, J. Kennedy, and 
R. Mooney, “A 20 Gb/s embedded clock transceiver in 90 nm CMOS,” in Proc. 
of IEEE International Solid-State Circuits Conference, Feb 2006, pp. 340–341. 
[8]  IEEE P802.3ap task force channel model material,  
http://grouper.ieee.org/groups/802/3/ap/public/channel_model/index.html 
[9]  H. Johnson and M. Graham, High-speed digital design: A handbook of black 
magic, Upper Saddle River, NJ: Prentice Hall, 1993.   
[10]  W. Dally and J. Poulton, “Transmitter equalization for 4-Gbps signaling,” IEEE 
Micro, vol. 17, no. 1, pp. 48-56, Jan./Feb. 1997.   
[11]  K. Lee, S. Kim, G. Ahn, and D. Jeong, “A CMOS serial link for fully duplexed 
data communication,” IEEE Journal of Solid-State Circuits, vol. 30, no. 4, pp. 
353-364, Apr. 1995.   
[12]  K.-L. J. Wong, H. Hatamkhani, M. Mansuri, and C.-K. K. Yang, “A 27-mW 3.6-
Gb/s I/O transceiver,” IEEE Journal of Solid-State Circuits, vol. 39, no. 4, pp. 
602-612, Apr. 2004.   
[13]  C. Menolfi, T. Toifl, P. Buchmann, M. Kossel, T. Morf, J. Weiss, and M. 
Schmatz, “A 16 Gb/s source-series terminated transmitter in 65nm CMOS SOI,” 
in Proc. of IEEE International Solid-State Circuits Conference Digest of 
Technical Papers, Feb. 2007, pp. 446.   
[14]  P. Heydari and R. Mohanavelu, “Design of ultra high-speed CMOS CML buffers 
and latches,” in Proc. of the International Symposium on Circuits and Systems, 
May 2003, vol. 2, pp. 208-211. 
[15]  S. Gondi and B. Razavi,   "Equalization and clock and data recovery techniques 
for 10-Gb/s CMOS serial-link receivers," IEEE Journal of Solid-State 
Circuits, vol. 42, no. 9, pp. 1999-2011, Sep. 2007. 
[16]  B. Nikolic, V. G. Oklobzija, V. Stojanovic, W. Jia, J. K. Chiu, and M. M. Leung, 
“Improved sense-amplifier based flip-flop: Design and measurements,” IEEE 
Journal of Solid-State Circuits, vol. 35, no. 6, pp. 876-883, June 2000. 
[17]  J. Montanaro, R. Witek, K. Anne, A. Black, E. Cooper, D. Dobberpuhl, P. 
Donahue, J. Eno, W. Hoeppner, D. Kruckemyer, T. Lee, P. Lin, L. Madden, D. 
Murray, M. Pearce, S. Santhanam, K. Snyder, R. Stephany, and S. Thierauf, “A 
160-MHz, 32-b, 0.5-W CMOS RISC microprocessor,” IEEE Journal of Solid-
State Circuits, vol. 31, no. 11, pp. 1703-1714, Nov. 1996. 
[18]  N. C. Helman, “Optoelectronic modulators for optical interconnects,” Ph.D. 
dissertation, Stanford University, Stanford, CA, May 2005. 
[19]  G. A. Keeler, “Optical interconnects to silicon CMOS: Integrated optoelectronic 
modulators and short pulse systems,” Ph.D. dissertation, Stanford University, 
Stanford, CA, Dec. 2002. 
[20]  S. Palermo, “Design of high-speed optical interconnect transceivers," Ph.D. 
dissertation, Stanford University, Stanford, CA, Sep. 2007. 
[21]  D. Vez, S. Eitel, S. Hunziker, G. Knight, M. Moser, R. Hoevel, H.-P. Gauggel, 
M. Brunner, A. Hold, and K. Gulden, “10 Gbit/s VCSELs for datacom: Devices 
and applications,” in Proc. of SPIE, Apr. 2003, vol. 4942, pp. 29-43. 
[22]  D. J. Bossert, D. Collins, I. Aeby, J. B. Clevenger, C. J. Helms, W. Luo, C. X. 
Wang, and H. Q. Hou, “Production of high-speed oxide confined VCSEL arrays 
for datacom applications,” in Proc. of SPIE, June 2002, vol. 4649, pp. 142-151. 
[23]  N. Suzuki, H. Hatakeyama, K. Tokutome, K. Fukatsu, M. Yamada, T. Anan, and 
M. Tsuji, “1.1-μm-range InGaAs VCSELs for high-speed optical 
interconnections,” IEEE Photonics Technology Letters, vol. 18, no. 12, pp. 1368-
1370, June 2006. 
[24]  J. Jewell, L. Graham, M. Crom, K. Maranowski, J. Smith, and T. Fanning, “1310 
nm VCSELs in 1-10 Gb/s commercial applications,” in Proc. of SPIE, Feb. 2006, 
vol. 6132, pp. 1-9. 
[25]  M. A. Wistey, S. R. Bank, H. P. Bae, H. B Yuen, E. R. Pickett, L. L. Goddard, 
and J. S. Harris, “GaInNAsSb/GaAs vertical cavity surface emitting lasers at 
1534 nm,” Electronics Letters, vol. 42, no. 5, pp. 282-283, Mar. 2006. 
[26]  S. Palermo, A. Emami-Neyestanak, and M. Horowitz, “A 90 nm CMOS 16 
Gb/s transceiver for optical interconnects," IEEE Journal of Solid-State Circuits, 
vol. 43, no. 5, pp. 1235–1245, May 2008. 
[27]  Y. Liu, W.-C. Ng, K. D. Choquette, and K. Hess, “Numerical investigation of 
self-heating effects of oxide-confined vertical-cavity surface-emitting lasers,” 
IEEE Journal of Quantum Electronics, vol. 41, no. 1, pp. 15-25, Jan. 2005. 
[28]  M. Teitelbaum and K. W. Goossen, “Reliability of direct mesa flip-chip bonded 
VCSEL’s,” in Proc. of IEEE Lasers and Electro-Optics Society 17th Annual 
Meeting, Nov. 2004, vol. 1, pp. 326-327. 
[29]  T. Anan, N. Suzuki, K. Yashiki, K. Fukatsu, H. Hatakeyama, T. Akagawa, and 
M. Tsuji, “High-speed 1.1-µm-range InGaAs VCSELs,” in Proc. of Optical 
Fiber Communication Conference, Mar. 2008, pp. 1–3. 
[30]   N. Suzuki, H. Hatakeyama, K. Fukatsu, T. Anan, K. Yashiki, and M. Tsuji, “25-
Gbps operation of 1.1-μm-range InGaAs VCSELs for high-speed optical 
interconnections,” in Proc. of Optical Fiber Communication Conference, March 
2006, pp. OFA4. 
[31]  K. Yashiki, N. Suzuki, K. Fukatsu, T. Anan, H. Hatakeyama, and M. Tsuji, “1.1-
μm-range high-speed tunnel junction vertical cavity surface emitting lasers," 
IEEE Photonics Technology Letters, vol. 19, no. 23, pp. 1883–1885, Dec. 2007. 
[32]  O. Kibar, D. A. Van Blerkom, C. Fan, and S. C. Esener, “Power minimization 
and technology comparison for digital free-space optoelectronics interconnects,” 
Journal of Lightwave Technology, vol. 17, no. 4, pp. 546–555, Apr. 1999. 
[33]  D. A. B. Miller, “Quantum well optoelectronic switching devices,” International 
Journal of High Speed Electronics, vol. 1, no. 1, pp. 19-46, Jan. 1990. 
[34]  J. F. Lampin, L. Desplanque, and F. Mollot, “Detection of picosecond electrical 
pulses using the intrinsic Franz-Keldysh effect,” Applied Physics Letters, vol. 78, 
no. 26, pp. 4103-4105, June 2001. 
[35]  R. B. Welstand, S. A. Pappert, C. K. Sun, J. T. Zhu, Y. Z. Liu, and P. K. L. Yu, 
“Dual-function electroabsorption waveguide modulator/detector for 
optoelectronic transceiver applications,” IEEE Photonics Technology Letters, vol. 
8, pp. 1540–1542, Nov. 1996. 
[36]  L. Lembo, F. Alvares, D. Lo, C. Tu, P. Wisseman, C. Zmudzinski, and J. Brock, 
“Optical electroabsorption modulators for wideband, linear, low-insertion loss 
photonic links,” in Proc. of SPIE, 1995, vol. 2481, pp. 185–196. 
[37]  M. B. Yairi, C. W. Coldren, D. A. B. Miller, and J. S. Harris, Jr., "High-speed, 
optically-controlled surface-normal modulator based on diffusive conduction," 
Applied Physics Letters, vol. 75, no. 5, pp. 597-599, Aug. 1999. 
[38]  N. C. Helman, J. E. Roth, D. P. Bour, H. Altug, and D. A. B. Miller, 
“Misalignment-tolerant surface-normal low-voltage modulator for optical 
interconnects," IEEE Journal of Selected Topics in Quantum Electronics, vol. 11, 
no. 2, pp. 338–342, March/April 2005. 
[39]  D. Campi, C. Cacciatore, H. C. Neitzert, C. Coriasso, C. Rigo, and A. Stano, 
"Polarisation-independent InGaAsP/InGaAsP MQW waveguide 
electroabsorption modulator," Electronics Letters, vol. 30, pp. 356-358, Feb. 
1994. 
[40]  H. Fukano, T. Yamanaka, M. Tamura, and Y. Kondo, “Very-low-driving-voltage 
electroabsorption modulators operating at 40 Gb/s,” Journal of Lightwave 
Technology, vol. 24, no. 5, pp. 2219-2224, May 2006.   
[41]  J. F. Liu, M. Beals, A. Pomerene, S. Bernardis, R. Sun, J. Cheng, L. C. 
Kimerling, and J. Michel, "Ultralow energy, integrated GeSi electroabsorption 
modulators on SOI," in Proc. of IEEE International Conference on Group IV 
Photonics, Sep. 2008, pp. 10-12. 
[42]  J. E. Roth, S. Palermo, N. C. Helman, D. P. Bour, D. A. B. Miller, and M. 
Horowitz, “An optical interconnect transceiver at 1550 nm using low-voltage 
electroabsorption modulators directly integrated to CMOS,” Journal of 
Lightwave Technology, vol. 25, no. 12, pp. 3739 – 3747, Dec. 2007. 
[43]  S. J. B. Yoo, M. A. Koza, R. Bhat, and C. Caneau, “1.5 μm asymmetric Fabry–
Perot modulators with two distinct modulation and chirp characteristics,” Applied 
Physics Letters, vol. 72, no. 25, pp. 3246–3248, June 1998. 
[44]  C. C. Barron, C. J. Mahon, B. J. Thibeault, and L. A. Coldren, “Design, 
fabrication, and characterization of high-speed asymmetric Fabry–Perot 
modulators for optical interconnect applications,” Journal of Optical and 
Quantum Electronics, vol. 25, no. 12, pp. 5885–5895, Dec. 1993. 
[45]  R. H. Yan, R. J. Simes, and L. A. Coldren, "Extremely low-voltage Fabry-Perot 
reflection modulators," IEEE Photonics Technology Letters, vol. 2, no. 2, pp. 
118-119, Feb. 1990. 
[46]  H. Cho, P. Kapur, and K. C. Saraswat, “A modulator design methodology 
minimizing power dissipation in a quantum well modulator-based optical 
interconnect,” Journal of Lightwave Technology, vol. 25, no. 6, pp. 1621-1628, 
June 2007. 
[47]  M. Lipson, “Compact electro-optic modulators on a silicon chip,” IEEE Journal 
of Selected Topics in Quantum Electronics, vol. 12, no. 6, pp. 1520-1526, Nov. 
2006.   
[48]  Q. Xu, B. Schmidt, S. Pradhan, and M. Lipson, “Micrometre-scale silicon 
electro-optic modulator,” Nature, vol. 435, no. 7040, pp. 325-327, May 2005. 
[49]  B. Schmidt, Q. Xu, J. Shakya, S. Manipatruni, and M. Lipson, "Compact electro-
optic modulator on silicon-on-insulator substrates using cavities with ultra-small 
modal volumes," Optics Express, vol. 15, no. 6, pp. 3140-3148, March 2007. 
[50]  L. Zhou and A. W. Poon, "Silicon electro-optic modulators using p-i-n diodes 
embedded 10-micron-diameter microdisk resonators," Optics Express, vol. 14, 
no. 15, pp. 6851-6857, July 2006. 
[51]  A. Liu, L. Liao, D. Rubin, H. Nguyen, B. Ciftcioglu, Y. Chetrit, N. Izhaky, and 
M. Paniccia, "High-speed optical modulation based on carrier depletion in a 
silicon waveguide," Optics Express, vol. 15, no. 2, pp. 660-668, Jan. 2007 
[52]  T. Sadagopan, S. J. Choi, S. J. Choi, P. D. Dapkus, and A. E. Bond, "Optical 
modulators based on depletion width translation in semiconductor microdisk 
resonators," IEEE Photonics Technology Letters, vol. 17, no. 3, pp. 567-569, 
March 2005. 
[53]  H. Sun, A. Chen, B. C. Olbricht, J. A. Davies, P. A. Sullivan, Y. Liao, and L. R. 
Dalton, “Direct electron beam writing of electro-optic polymer microring 
resonators,” Optics Express, vol. 16, no. 9, pp. 6592-6599, April 2008. 
[54]  B. A. Block, T. R. Younkin, P. S. Davids, M. R. Reshotko, P. Chang, B. M. 
Polishak, S. Huang, J. Luo, and A. K. Y. Jen, "Electro-optic polymer cladding 
ring resonator modulators," Optics Express, vol. 16, no. 22, pp. 18326-18333, 
Oct. 2008. 
[55]  L. R. Dalton, P. A. Sullivan, B. C. Olbricht, D. H. Bale, J. Takayesu, S. 
Hammond, H. Rommel, and B. H. Robinson, “Organic photonic materials,” in 
Tutorials in Complex Photonic Media, Bellingham, WA: SPIE, 2007. 
[56]  L. Dalton, Oct. 2007, “Photonic integration improves on current technologies: 
Newsroom: SPIE.org,” http://spie.org/x16998.xml?highlight=x2414. 
[57]  H.-W. Chen, Y.-H. Kuo, and J. E. Bowers, “High speed silicon modulators,” in 
Proc. of the 14th OptoElectronics and Communications Conference, Jul. 2009, 
pp. 1–2. 
[58]  C. Batten, A. Joshi, J. Orcutt, A. Khilo, B. Moss, C. Holzwarth, M. Popovic, H. 
Li, H. Smith, J. Hoyt, F. Kartner, R. Ram, V. Stojanovic, and K. Asanovic, 
“Building manycore processor-to-DRAM networks with monolithic silicon 
photonics,” in Proc. of the 16th IEEE Symposium on High Performance 
Interconnects, pp. 21-30, Aug. 2008. 
[59]  L. Liao, A. Liu, D. Rubin, J. Basak, Y. Chetrit, H. Nguyen, R. Cohen, N. Izhaky, 
and M. Paniccia, “40 Gbit/s silicon optical modulator for high-speed 
applications,” Electronics Letters, vol. 43, no. 22, Oct. 2007. 
[60]  R. A. Soref and B. R. Bennett, "Electrooptical effects in silicon," IEEE Journal 
of Quantum Electronics, vol. 23, no. 1, pp. 123-129, Jan. 1987. 
[61]  W. M. J. Green, M. J. Rooks, L. Sekaric, and Y. A. Vlasov, "Ultra-compact, low 
RF power, 10 Gb/s silicon Mach-Zehnder modulator," Optics Express, vol. 15, 
no. 25, pp. 17106-17113, 2007. 
[62]  A. Liu, R. Jones, L. Liao, D. Samara-Rubio, D. Rubin, O. Cohen, R. Nicolaescu, 
and M. Paniccia, “A high-speed silicon optical modulator based on a metal-
oxide-semiconductor capacitor,” Nature, vol. 427, pp. 615-618, Feb. 2004. 
[63]  A. Huang, C. Gunn, G. Li, Y. Liang, S. Mirsaidi, A. Narasimha, and T. Pinguet,   
"A 10 Gb/s photonic modulator and WDM MUX/DEMUX integrated with 
electronics in 0.13 μm SOI CMOS," in Proc. of IEEE International Solid-State 
Circuits Conference Digest of Technical Papers, Feb. 2006, pp. 922-929.  
[64]  D. Marris-Morini, X. Le Roux, L. Vivien, E. Cassan, D. Pascal, M. Halbwax, S. 
Maine, S. Laval, J-M. Fédéli, and J-F. Damlencourt, “Optical modulation by 
carrier depletion in a silicon PIN diode,” Optics Express, vol. 14, no. 22, pp. 
10838-10843, Oct. 2006 
[65]  B. Analui, D. Guckenberger, D. Kucharski, and A. Narasimha, “A fully 
integrated 20-Gb/s optoelectronic transceiver implemented in a standard 0.13-μm 
CMOS SOI technology,” IEEE Journal of Solid-State Circuits, vol. 41, no. 12, 
pp. 2945-2955, Dec. 2006. 
[66]  A. S. Liu, L. Liao, D. Rubin, H. Nguyen, B. Ciftcioglu, Y. Chetrit, N. Izhaky, 
and M. Paniccia, “High-speed optical modulation based on carrier depletion in a 
silicon waveguide,” Optics Express, vol. 15, no. 2, pp. 660-668, Jan. 2007. 
[67]  E. Sackinger, Broadband circuits for optical fiber communication, Hoboken, NJ: 
John Wiley & Sons, 2005. 
[68]  PDCS20T, “Long wavelength 20 Gb/s photodiode chip,” Albis OptoElectronic 
AG. 
[69]  M. R. Reshotko, B. A. Block, B. Jin, and P. Chang, "Waveguide coupled 
Ge-on-oxide photodetectors for integrated links," in Proc. of the 5th IEEE 
International Conference on Group IV Photonics, Sep. 2008, pp. 182-184. 
[70]  A. A. Abidi, “Gigahertz transresistance amplifiers in fine line NMOS,” IEEE 
Journal of Solid-State Circuits, vol. 19, no. 6, pp. 986–994, Dec. 1984. 
[71]  M. Ingels, G. Van der Plas, J. Crols, and M. Steyaert, “A CMOS 18 THzΩ 240 
Mb/s transimpedance amplifier and 155 Mb/s LED driver for low cost optical 
fiber links,” IEEE Journal of Solid-State Circuits, vol. 29, no. 12, pp. 1552–1559, 
Dec. 1994. 
[72]  B. Razavi, Design of integrated circuits for optical communication, New York: 
McGraw-Hill, 2003. 
[73]  B. Casper, M. Haycock, and R. Mooney, "An accurate and efficient analysis 
method for multi-Gb/s chip-to-chip signaling schemes," in Proc. of IEEE Very 
Large Scale Integrated (VLSI) Circuits Symposium Digest of  Technical 
Papers, Aug. 2002, pp. 54–57. 
[74]  V. Stojanovic and M. Horowitz, "Modeling and analysis of high speed links," in 
Proc. of IEEE Custom Integrated Circuits Conference Digest of Technical 
Papers, Sep. 2003, pp. 589–594. 
[75]  H. Hatamkhani, F. Lambrecht , V. Stojanovic, and C.-K. Ken Yang, "Power-
centric design of high-speed I/Os," in Proc. of 43rd ACM/IEEE Design 
Automation Conference, July 2006, pp. 867–872. 
[76]  H. Hatamkhani and C.-K. Ken Yang, “A study of the optimal data rate for minimum 
power of I/Os,” IEEE Transactions on Circuits and Systems II, vol. 53, no. 11, 
pp. 1230–1234, Nov. 2006. 
[77]  B. Casper, G. Balamurugan, J. E. Jaussi, J. Kennedy, and M. Mansuri, “Future 
microprocessor interfaces: Analysis, design and optimization,” in Proc. of IEEE 
Custom Integrated Circuits Conference, Sep. 2007, pp. 479–486. 
[78]  A. Sanders, M. Resso, and J. D’Ambrosia, “Channel compliance testing utilizing 
novel statistical eye methodology,” DesignCon, Feb. 2004. 
[79]  G. Balamurugan, B. Casper, J. E. Jaussi, M. Mansuri, F. O'Mahony, and J. 
Kennedy, “Modeling and analysis of high-speed I/O links,” IEEE Transactions 
on Advanced Packaging, vol. 32, no. 2, pp. 237-247, May 2009. 
[80]  D. A. Van Blerkom, C. Fan, M. Blume, and S. C. Esener, “Transimpedance 
receiver design optimization for smart pixel arrays,” Journal of Lightwave 
Technology, vol. 16, no. 1, pp. 119–126, Jan. 1998. 
[81]  P. Kapur, R. D. Kekatpure, and K. C. Saraswat, “Minimizing power dissipation 
in optical interconnects at low voltage using optimal modulator design,” IEEE 
Transactions on Electron Devices, vol. 52, no. 8, pp. 1713–1721, Aug. 2005. 
[82]  M. R. Feldman, S. C. Esener, C. C. Guest, and S. H. Lee, "Comparisons between 
optical and electrical interconnects based on power and speed considerations," 
Applied Optics, vol. 27, no. 9, pp. 1742-1751, May 1988. 
[83]  D. A. B. Miller, "Rationale and challenges for optical interconnects to electronic 
chips," Proceedings of the IEEE, vol. 88, no. 6, pp. 728-749, June 2000. 
[84]  H. Cho, P. Kapur, and K. C. Saraswat, "Power comparison between high-speed 
electrical and optical interconnects for interchip communication," Journal of 
Lightwave Technology, vol. 22, no. 9, pp. 2021 - 2033, Sep. 2004. 
[85]  G. Balamurugan, J. Kennedy, G. Banerjee, J. E. Jaussi, M. Mansuri, F. 
O’Mahony, B. Casper, and R. Mooney, “A scalable 5-15 Gbps, 14-75 mW low-
power I/O transceiver in 65 nm CMOS,” IEEE Journal of Solid-State Circuits, 
vol. 43, no. 4, pp. 1010-1019, April 2008. 
[86]  S. Kasturia and J. H. Winters, “Techniques for high-speed implementation of 
nonlinear cancellation,” IEEE Journal on Selected Areas in Communications, vol. 
9, no. 6, pp. 711–717, June 1991. 
 
 
VITA 
 
Name: Arun Palaniappan 
 
Address: C/O Dr. Samuel Palermo  
 Department of Electrical and Computer Engineering 
 Texas A&M University 
 College Station, TX 77843 
 
Email Address: arunp_ece@neo.tamu.edu 
 
Education: Master of Science, Electrical Engineering,  
 Texas A&M University, 2010 
 Bachelor of Engineering, Electronics & Communication Engineering,  
 College of Engineering, Guindy, Anna University, 2007 
 
