Design and Integration of All-Silicon Fiber-Optic Receivers for Multi-Gigabit Chip-to-Chip Links by Muller, Paul et al.
Design and Integration of All-Silicon Fiber-Optic 
Receivers for Multi-Gigabit Chip-to-Chip Links 
 
P. Muller and Y. Leblebici 
Microelectronic Systems Laboratory 
Ecole Polytechnique Fédérale de Lausanne (EPFL) 
Lausanne, Switzerland 
paul.muller@epfl.ch, yusuf.leblebici@epfl.ch 
M. K. Emsley and M. S. Ünlü 
Department of Electrical and Computer Engineering 
Boston University 
Boston, Massachusetts, USA 
selim@bu.edu 
A. Tajalli and M. Atarodi 
Dept. of Electrical Engineering 
Sharif University of Technology 
Teheran, Iran 
tajalli@ee.sharif.edu, atarodi@sharif.edu 
  
Abstract—This paper presents a top-down approach to the 
design of all-silicon CMOS-based fully integrated optical 
receivers. From the system-level requirements, we determine 
the optimum block-level specifications, based on which the 
individual building blocks are designed. Measurement results 
of the manufactured design show operation at data rates 
exceeding 2.5-Gbps/channel for the detector, the amplification 
and the clock and data recovery circuits. This proof of concept 
is the first step towards design-optimized, fully 
integrated, multi-channel optical receivers for high-bandwidth 
short-distance chip-to-chip interconnects. 
I. INTRODUCTION 
While clock frequencies and throughput of digital circuits 
improve with each new technology generation, the lack of 
I/O bandwidth in microprocessors increasingly limits 
the overall system performance of computers. Short-distance 
communication interfaces such as computer buses and 
LAN systems must support higher data rates to keep pace 
with the evolution of processor speed. To meet this target, 
future generation microprocessors will likely use chip-to-
chip fiber-optic communication links as a bus extension to 
communicate with close-by processors and peripherals. The 
optical interfaces will connect to a large number of fiber 
ribbons or optical waveguides integrated in already 
commercially available electro-optical backplanes. In turn, 
this requires a large number of optical transmitters and 
receivers to be monolithically integrated with the processor 
cores (Figure 1). 
Although implementations of such parallel fiber-optic 
short-distance communication systems have until now only 
been developed for the high-end server market [1], they are 
expected to enter the low-end markets once the 
cost/bit-rate ratio falls sufficiently. The 
use of standard manufacturing processes, monolithic 
integration and the availability of low-cost VCSEL sources 
in the 850 nm optical window are key to achieving such 
cost reductions. As silicon-on-insulator (SOI) is becoming 
the mainstream substrate for integration of state-of-the-art 
microprocessors, this material composition appears to be the 
ideal candidate for building fully integrated optical receivers. 
 
Figure 1.  Conceptual block diagram of an integrated multi-channel photo-
receiver array for data communication 
The following sections present silicon-based high-speed 
photodetection, followed by a top-down approach to the 
design of high-speed optical receivers, the design of 
transimpedance and limiting amplifiers and clock and data 
recovery circuits and corresponding measurement results.  
II. SILICON-BASED PHOTODETECTION 
The design and fabrication of arrays of SOI-based 
resonant-cavity enhanced (RCE) photodetectors with high 
quantum efficiency, capable of operating at data rates up to 
10 Gbps, have been previously demonstrated by the authors 
[2]. The weak absorption of silicon (Si) at 850 nm 
is enhanced by the use of a Fabry-Perot resonator, delimited 
by the Si-air interface and a two-period Si-SiO2 distributed 
Bragg reflector with 90% reflectivity. These wafers can be 
commercially manufactured using a standard SOI wafer 
fabrication process [3] (Figure 2). 
This work has been partially supported by the Swiss National Science 
Foundation Grant 200021-100625.   
1-4244-0303-4/06/$20.00 ©2006 IEEE.
Authorized licensed use limited to: EPFL LAUSANNE. Downloaded on January 5, 2010 at 11:20 from IEEE Xplore.  Restrictions apply. 
 
 
Figure 2.  Cross-section  of the double-SOI silicon photodetector 
Figure 3 provides the eye diagram of a 30µm diameter 
detector measured at 3.0-Gbps when coupled with the 
HXR2312 3.3-Gbps transimpedance amplifier (TIA) by 
Helix AG. The measured performance of this novel detector 
shows that innovative silicon photodetectors can fully 
compete with stand-alone compound semiconductor devices. 
 
 
Figure 3.  Double-SOI Si detector eye diagram when operating at 3.0Gbps 
III. RECEIVER CHAIN SPECIFICATION 
Unlike today's multi-chip receiver solutions, where 
detector, amplifiers and clock recovery occupy separate 
substrates, the integrated realization of the complete receiver 
chain allows for an optimized design procedure, which 
includes the specifications of each building block (Figure 4).  
 
Figure 4.  Receiver block diagram with major design parameters 
The receiver chain building blocks are specified by the 
following parameters: the detector responsivity ρPD and 
capacitance CPD; the transimpedance amplifier (TIA) and 
limiting amplifier (LA) gain, bandwidth and integrated noise 
(respectively AvTIA, AvLA, BWTIA, BWLA, InTIA, VnLA); the 
limiting amplifier input capacitance CLA; and the clock and 
data recovery (CDR) input sensitivity VminCDR, capacitance 
CCDR and jitter tolerance JTOL. 
The presented design approach is based on a detailed 
analysis of how the receiver's input sensitivity depends on the 
various design parameters. The overall system bit 
error rate (BER) depends on the horizontal and vertical eye 
closure, both of which depend on the system's device 
noise and the bandwidth limitations in the receiver and the 
channel. Equation 1 shows the optical input sensitivity 
OMAmin as a function of some design parameters, as well as 
of bit rate fB, noise factor QBER (= 7.1 for a BER < 10^-12) and 
deterministic jitter DJPP. 
(1)  OMA_{min} = \sqrt{A^2 + B^2} \cdot \frac{Q_{BER}}{\rho_{PD}} \cdot I_{ntot}
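The quoted noise factor can be checked against the target error rate. The sketch below assumes the usual Gaussian-noise relation BER = 0.5·erfc(Q/√2) for a binary decision with an optimum threshold; this relation and the helper names `ber_from_q` and `q_from_ber` are illustrative assumptions and are not taken from the paper.

```python
import math


def ber_from_q(q: float) -> float:
    """BER of a binary decision with Gaussian noise (assumed model)."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


def q_from_ber(target_ber: float, lo: float = 0.0, hi: float = 20.0) -> float:
    """Invert ber_from_q by bisection; BER falls monotonically with Q."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ber_from_q(mid) > target_ber:
            lo = mid  # BER still too high, so Q must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(f"BER at Q = 7.1:    {ber_from_q(7.1):.2e}")
    print(f"Q for BER = 1e-12: {q_from_ber(1e-12):.2f}")
```

Under this assumption, Q = 7.1 yields a BER slightly below 10^-12, consistent with the inequality stated in the text.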
 
While Intot summarizes the noise components in the 
amplification chain (Equation 2), A and B represent the 
vertical and horizontal eye closure terms respectively. 
(2)  I_{ntot} = \sqrt{I_{nTIA}^2 + \frac{V_{nLA}^2}{R_{TIA}^2}}
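As a numerical illustration, the sketch below refers the LA voltage noise to the TIA input through the transimpedance RTIA and adds it in quadrature to the TIA current noise, then scales the result by sqrt(A² + B²)·QBER/ρPD to obtain an optical sensitivity. Both formulas are written out here as assumptions (uncorrelated noise sources, quadrature combination of the two eye-closure terms). The noise and responsivity values come from Table I, while `R_TIA`, `A_TERM` and `B_TERM` are purely hypothetical placeholders, not values reported by the authors.

```python
import math


def total_input_noise(i_n_tia: float, v_n_la: float, r_tia: float) -> float:
    """Input-referred RMS noise current, assuming uncorrelated sources."""
    return math.hypot(i_n_tia, v_n_la / r_tia)


def oma_min(a: float, b: float, q_ber: float, rho_pd: float, i_n_tot: float) -> float:
    """Optical sensitivity, assuming A and B combine in quadrature."""
    return math.hypot(a, b) * q_ber / rho_pd * i_n_tot


I_N_TIA = 390e-9   # A RMS, simulated TIA noise (Table I)
V_N_LA = 440e-6    # V RMS, measured LA noise (Table I)
RHO_PD = 0.4       # A/W, detector responsivity (Table I)
Q_BER = 7.1        # noise factor quoted for BER < 1e-12
R_TIA = 1.0e3      # ohm, HYPOTHETICAL transimpedance
A_TERM = 2.0       # HYPOTHETICAL vertical eye-closure term
B_TERM = 2.0       # HYPOTHETICAL horizontal eye-closure term

if __name__ == "__main__":
    i_n_tot = total_input_noise(I_N_TIA, V_N_LA, R_TIA)
    print(f"I_ntot  = {i_n_tot * 1e9:.1f} nA RMS")
    print(f"OMA_min = {oma_min(A_TERM, B_TERM, Q_BER, RHO_PD, i_n_tot) * 1e6:.1f} uW")
```

Because the LA contribution scales with 1/RTIA, a larger transimpedance lets the TIA noise dominate the total.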
 
As illustrated in Equations 3 and 4, these terms do not 
depend on the overall system noise, but only on the receiver 
bandwidth, jitter tolerance and two channel characteristics: 
deterministic jitter DJchan and contributed relative 
intersymbol interference ζchan. In these expressions, T = 1/fB and 
τTIA = 1/(2π·BWTIA). 
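(3)  A = [closed-form expression not recoverable from the source text; a function of BWTIA, fB and ζchan containing a tanh term]
(4)  B = [closed-form expression not recoverable from the source text; a function of JTOL, DJchan, T, τTIA and BWTIA containing the constant 0.55 and a logarithmic term in exp(-T/τTIA)]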
 
Reducing the signal bandwidth lowers the total 
amplifier noise, but increases both A and B. Indeed, at much 
lower bandwidth, the appearance of intersymbol interference 
(ISI) leads to both deterministic jitter and vertical eye 
closure, limiting the benefits of reduced noise bandwidth. 
While long-haul designs based on low-noise bipolar 
transistors in compound technologies apply the well-known 
bandwidth value of 0.75fB, which guarantees the absence of 
ISI, the higher MOS device noise shifts the 
optimum bandwidth for best input sensitivity (Figure 5). 
 
Figure 5.  Input sensitivity as a function of signal bandwidth 
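The existence of such an optimum can be illustrated with a deliberately simplified model. It is a stand-in and does not use Equations 3 and 4. The receiver is assumed to behave as a single-pole low-pass filter, whose worst-case relative eye opening at the end of the bit period is 1 - 2·exp(-T/τ), and the integrated noise is assumed to grow as a power of the bandwidth. The function names and the `noise_exponent` parameter are hypothetical.

```python
import math


def vertical_eye_opening(bw_over_fb: float) -> float:
    """Worst-case relative eye opening of a single-pole low-pass (assumed model)."""
    t_over_tau = 2.0 * math.pi * bw_over_fb  # T = 1/fB, tau = 1/(2*pi*BW)
    return 1.0 - 2.0 * math.exp(-t_over_tau)


def relative_sensitivity(bw_over_fb: float, noise_exponent: float = 0.5) -> float:
    """Required signal in arbitrary units: noise divided by eye opening."""
    opening = vertical_eye_opening(bw_over_fb)
    if opening <= 0.0:
        return math.inf  # eye completely closed
    return bw_over_fb ** noise_exponent / opening


def optimum_bandwidth(noise_exponent: float = 0.5, start: float = 0.15,
                      stop: float = 1.5, step: float = 0.001) -> float:
    """Grid search for the BW/fB ratio that minimises the required signal."""
    count = int(round((stop - start) / step)) + 1
    grid = [start + i * step for i in range(count)]
    return min(grid, key=lambda x: relative_sensitivity(x, noise_exponent))


if __name__ == "__main__":
    for exponent in (0.5, 1.5):
        best = optimum_bandwidth(exponent)
        print(f"noise ~ BW^{exponent}: optimum BW/fB = {best:.2f}")
```

In this toy model the optimum lies well below 0.75fB, and it moves further down when the noise grows more steeply with bandwidth. The actual position in Figure 5 depends on the full expressions for A, B and Intot.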
The complete receiver design flow propagating the top-
level specifications to the block parameters is shown in 
Figure 6. Starting with the detector parameters, the LA input 
capacitance and the maximum TIA GBW achievable in the 
given technology, we can calculate the maximum feedback 
resistor and the required LA voltage gain according to the 
included equations. 
 
 
Figure 6.  Receiver specification flow 
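The equations of Figure 6 are not reproduced in the text. As a rough sketch of this kind of calculation, the example below uses the textbook transimpedance limit of a shunt-feedback TIA, R ≤ GBW / (2π·C·BW²), and sizes the LA gain so that the minimum optical signal still reaches the CDR input sensitivity. These relations are assumptions and may differ from the ones the authors used. The detector capacitance, responsivity and sensitivity are taken from Table I, and every value marked HYPOTHETICAL is a placeholder.

```python
import math


def max_transimpedance(gbw_hz: float, c_total_f: float, bw_hz: float) -> float:
    """Textbook transimpedance limit of a shunt-feedback TIA (assumed model)."""
    return gbw_hz / (2.0 * math.pi * c_total_f * bw_hz ** 2)


def required_la_gain(v_min_cdr: float, oma_w: float, rho_pd: float, r_tia: float) -> float:
    """LA voltage gain needed to lift the smallest TIA output swing to V_minCDR."""
    v_tia_out = oma_w * rho_pd * r_tia  # peak-to-peak TIA output at sensitivity
    return v_min_cdr / v_tia_out


C_PD = 70e-15        # F, detector capacitance (Table I)
RHO_PD = 0.4         # A/W, detector responsivity (Table I)
OMA_MIN = 26.5e-6    # W, total input sensitivity (Table I)
C_IN_TIA = 50e-15    # F, HYPOTHETICAL TIA input capacitance
GBW_TIA = 20e9       # Hz, HYPOTHETICAL gain-bandwidth product
BW_TIA = 1.75e9      # Hz, HYPOTHETICAL target bandwidth
V_MIN_CDR = 50e-3    # V, HYPOTHETICAL CDR input sensitivity

if __name__ == "__main__":
    r_max = max_transimpedance(GBW_TIA, C_PD + C_IN_TIA, BW_TIA)
    gain = required_la_gain(V_MIN_CDR, OMA_MIN, RHO_PD, r_max)
    print(f"maximum feedback resistor: {r_max / 1e3:.2f} kOhm")
    print(f"required LA voltage gain:  {gain:.2f}")
```

The quadratic dependence on bandwidth in the assumed limit shows why the achievable TIA gain-bandwidth product enters the flow as a starting parameter.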
Following these calculations, we can apply the 
previously discussed input sensitivity analysis to obtain the 
noise parameters for both amplifiers. Through this flow, all 
block-level specifications are determined in accordance with 
the top-level receiver specifications. As such, this design 
methodology leads to transistor-level design results which 
are consistent with the overall system requirements. 
Based on the propagation of all system-level 
specifications to the block level, the detailed design of the TIA 
and LA is addressed in Section IV, while the design of 
the clock and data recovery unit is presented in Section V. 
IV. SIGNAL AMPLIFICATION 
The use of compound materials not only improves the 
detector performance, but also provides commercial gigabit-
range transceivers with larger transconductance and voltage 
head-room, as available e.g. in SiGe BiCMOS processes. 
Building competitive high-performance transimpedance and 
limiting amplifiers in a digital CMOS process remains a 
major challenge and requires smart circuit topologies as 
presented in [4] and [5]. Minimum inter-channel crosstalk 
and sensitivity to supply noise from the digital core are 
achieved through the use of fully differential topologies and 
careful supply decoupling. 
Based on the previously obtained block-level 
specifications, the transimpedance amplifier is designed. A 
two-stage differential pair topology is used to achieve 
sufficient gain using the faster NMOS devices only and 
providing good regulation of the output common-mode 
voltage, a critical issue to correctly drive the limiting 
amplifier (Figure 7). The output stage was dimensioned 
based on the non-dominant pole specification, followed by 
the design of the input stage. As all pole locations depend on 
transistor ratios only, good control of the amplifier stability 
is achieved without adding Miller capacitors. Finally, the 
amplifier noise is simulated to verify the noise specifications. 
 
Figure 7.  TIA schematic 
Two limiting amplifier topologies, with and without 
inductive peaking, were designed and characterized. 
Although more advanced topologies have already been 
published (e.g. in [5]), it was decided to proceed with a 
cascade of resistively loaded gain stages. One reason for 
choosing a simple amplifier topology is its portability to new 
process technologies. The second reason is the goal to 
analyze potential magnetic coupling between neighboring 
channels using inductive peaking amplifiers. An array of 
four TIA and LA channels, as well as a wafer-probed two-
channel limiting amplifier design, has been manufactured in 
a 0.18µm digital CMOS process (Figure 8). 
 
Figure 8.  Chip microphotograph of the 4-channel TIA-LA array (labeled blocks: 4 TIAs, 4 LAs, 4 output buffers) 
A thorough comparison of inductive peaking and 
inductor-less topologies, followed by a systematic design 
approach to optimize the limiting amplifier gain-bandwidth 
trade-off, has been performed. Figure 9 shows a 2.5-Gbps eye 
diagram measured at the output of the inductive peaking LA. 
 
Figure 9.  Eye diagram at 2.5-Gbps per channel at the LA output (scale markers: 221 mV, 400 ps) 
V. CLOCK AND DATA RECOVERY CIRCUIT 
While each receiver channel requires a dedicated 
amplification path, an area and power efficient clock and 
data recovery scheme with partial resource sharing has been 
implemented (Figure 10). In each channel, a clock in sync 
with the incoming data is obtained at the output of a gated 
current-controlled oscillator (GCCO). While the oscillation 
frequency of the GCCOs is under the control of tuning 
currents delivered by the shared phase-locked loop (PLL), 
the synchronicity with the data is guaranteed by a gating 
signal generated individually on each incoming data edge. 
 
Figure 10.  Multi-channel gated oscillator clock recovery topology 
In the absence of long-term memory in this system, jitter 
tolerance has to be accurately analyzed to guarantee the 
overall system performance. This has been done through the 
development of a jitter-estimation-based top-down design 
methodology, leading to BER estimations based on bathtub 
curves (Figure 11). 
 
Figure 11.  Simulated bathtub curve for DJ=0.2UIPP, RJ=0.021UIRMS and 
SJamp=0.2UIpp, SJfreq=0.05fB 
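A bathtub curve of this kind can be approximated with the widely used dual-Dirac jitter model. The sketch below is not the authors' estimation methodology. It assumes that deterministic jitter splits each data edge into two equally likely positions, that random jitter is Gaussian, and that the sinusoidal jitter amplitude can simply be lumped into the deterministic term. The function names and the transition density of 0.5 are illustrative assumptions.

```python
import math


def gaussian_tail(x: float) -> float:
    """Probability that a zero-mean, unit-sigma Gaussian exceeds x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bathtub_ber(t_ui: float, dj_pp_ui: float, rj_rms_ui: float,
                transition_density: float = 0.5) -> float:
    """Dual-Dirac BER estimate at sampling instant t_ui (0 and 1 are the edges)."""
    half_dj = 0.5 * dj_pp_ui
    # Each edge is modelled as two Gaussians offset by +/- DJ/2.
    left = 0.5 * (gaussian_tail((t_ui - half_dj) / rj_rms_ui)
                  + gaussian_tail((t_ui + half_dj) / rj_rms_ui))
    right = 0.5 * (gaussian_tail((1.0 - t_ui - half_dj) / rj_rms_ui)
                   + gaussian_tail((1.0 - t_ui + half_dj) / rj_rms_ui))
    return transition_density * (left + right)


def eye_opening_ui(target_ber: float, dj_pp_ui: float, rj_rms_ui: float,
                   steps: int = 2000) -> float:
    """Approximate width (in UI) over which the BER stays below target_ber."""
    open_points = sum(
        1 for i in range(steps + 1)
        if bathtub_ber(i / steps, dj_pp_ui, rj_rms_ui) <= target_ber
    )
    return max(open_points - 1, 0) / steps


DJ_PP = 0.2     # UI peak-to-peak, from the Figure 11 caption
SJ_PP = 0.2     # UI peak-to-peak, lumped into DJ (simplifying assumption)
RJ_RMS = 0.021  # UI RMS, from the Figure 11 caption

if __name__ == "__main__":
    total_dj = DJ_PP + SJ_PP
    print(f"BER at eye centre: {bathtub_ber(0.5, total_dj, RJ_RMS):.1e}")
    print(f"eye opening at 1e-12: {eye_opening_ui(1e-12, total_dj, RJ_RMS):.3f} UI")
```

The model ignores the frequency dependence of the sinusoidal jitter, so it gives only a first-order view of the timing margin.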
The implemented seven-channel receiver achieves 
accurate clock recovery at 2.5 Gbps/channel with a per-channel 
power consumption of 8.75 mW and silicon area 
occupation of only 0.045 mm² (Figure 12). Although the 
performance of the complete receiver chain has not yet been 
measured, measurements of the building blocks allow for an 
estimation of the overall receiver sensitivity (Table I). 
Except for the transimpedance amplifier, which has been 
over-designed for load capacitance, the presented design 
proves the feasibility of monolithically integrated silicon 
photonic receivers operating at multi-gigabit data rates.  
 
Figure 12.  Eye diagram at 2.5-Gbps per channel at the CDR output (scale markers: 450 mV, 400 ps) 
TABLE I.  MEASURED RECEIVER PERFORMANCE 
Parameter                                    Min      Max      Units
Technology                                   0.18µm CMOS       -
Supply voltage (except PD)                   1.6      2.0      V
Detector Responsivity                        0.4               A/W
Detector Capacitance (30µm diameter)                  70       fF
Detector Bandwidth                           7.0               GHz
Total Transimpedance Gain                    80                dBΩ
TIA / LA / CDR Data Rate                     2.5               Gbps
TIA Input Referred Current Noise (sim.)               390      nA RMS
LA Input Referred Voltage Noise (meas.)               440      µV RMS
CDR Jitter Tolerance (sim.)                  0.75              UI
Total Input Sensitivity @ BER=10^-12                  26.5     µW RMS
Total Power Consumption @ 25°C               94.25             mW
  TIA                                        74                mW
  LA                                         11.5              mW
  CDR                                        8.75              mW
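Two of the table entries can be cross-checked with simple arithmetic. The sketch below converts the total transimpedance gain from dBΩ to ohms and sums the block-level power figures. The helper name `dbohm_to_ohm` is hypothetical, and the conversion assumes the usual 20·log10 definition of dBΩ.

```python
def dbohm_to_ohm(gain_dbohm: float) -> float:
    """Convert a transimpedance in dB-ohm to ohm (20*log10 convention assumed)."""
    return 10.0 ** (gain_dbohm / 20.0)


# Block-level power consumption in mW, as listed in Table I.
BLOCK_POWER_MW = {"TIA": 74.0, "LA": 11.5, "CDR": 8.75}

if __name__ == "__main__":
    print(f"80 dBOhm corresponds to {dbohm_to_ohm(80.0) / 1e3:.0f} kOhm")
    print(f"sum of block powers: {sum(BLOCK_POWER_MW.values())} mW")
```

The three block powers add up to the listed total of 94.25 mW.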
VI. CONCLUSIONS 
We presented a systematic approach to the design and 
integration of all-silicon high-speed multi-channel fiber-optic 
receivers. The complete receiver has been designed based on 
the propagation of specifications from the system level down 
to the transistor level. The presented measurement results 
illustrate the proper operation of all building blocks. 
Achieving multi-gigabit data rates in mainstream silicon 
technologies, this work proves the validity of such a 
methodology in the context of high-speed circuit design, as 
well as the feasibility of fully integrated CMOS receivers 
acting as high-speed I/Os in future microprocessors. 
REFERENCES 
[1] C. Berger et al., “Design and implementation of an optical 
interconnect demonstrator with board-integrated waveguides and 
microlens coupling”, Dig. LEOS Summer Topical Meetings, pp. 19 – 
20, June 2004 
[2] M. S. Ünlü, M. K. Emsley, O. I. Dosunmu, P. Muller, and Y. 
Leblebici, “High-Speed Si Resonant Cavity Enhanced Photodetectors 
and Arrays”, J. Vac. Sci. Technol. A, vol. 22(3), pp. 781-787, 
May/June 2004 
[3] M. K. Emsley and M. S. Ünlü, “Epitaxy-Ready Reflecting Substrates 
for Resonant-Cavity-Enhanced Silicon Photodetectors”, Proc. IEEE 
LEOS  2000 Annual Meeting,  vol. 2, pp. 432-433, November 2000 
[4] M. Kossel, C. Menolfi, T. Morf, M. Schmatz and T. Toifl, “Wideband 
CMOS Transimpedance Amplifier”, IEE Electronics Letters, vol. 39, 
no. 7, pp. 587-588, 3rd April 2003 
[5] S. Galal and B. Razavi, “10-Gb/s Limiting Amplifier and 
Laser/Modulator Driver in 0.18-µm CMOS Technology”, IEEE J. 
Solid-State Circuits, vol. 38, no. 12, pp. 2138-2146, December 2003 
