A 1.6 Gb/s, 3 mW CMOS receiver for optical communication by Emami-Neyestanak, Azita et al.
7.2 A 1.6 Gb/s, 3 mW CMOS Receiver for Optical Communication
Azita Emami-Neyestanak, Dean Liu, Gordon Keeler, Noah Helman and Mark Horowitz 
Computer Systems Laboratory, Stanford University 
Stanford, CA 94305 
Abstract
A 1.6 Gb/s receiver for optical communication has been designed and
fabricated in a 0.25-µm CMOS process. This receiver has no
transimpedance amplifier; it uses the parasitic capacitor of the
flip-chip bonded photodetector as an integrating element and resolves
the data with a double-sampling technique. A simple feedback loop
adjusts a bias current to the average optical signal, which essentially
"AC couples" the input. The resulting receiver resolves an 11 µA input,
dissipates 3 mW of power, occupies 80 µm × 50 µm of area, and operates
at over 1.6 Gb/s.
Introduction
Using optics to interconnect integrated circuits has recently gained a
lot of interest [1]. A potential design platform uses hybrid
integration of arrays of optical multiple quantum well (MQW) modulators
and detectors with commercial electronic circuits [2]-[5]. However, a
dense array of optical detectors requires very low-power, sensitive,
and compact optical receivers [6]. Various designs for the input
receiver have been used in smart pixel test systems, including simple
FET inputs [7], diode-clamped receivers [8] and transimpedance
amplifiers [9]. These designs rely on an analog front-end amplifier
providing either voltage gain, current gain or current-to-voltage
conversion, but such amplifiers often dissipate a large amount of
quiescent power to achieve high bandwidth and low noise.
This paper describes the design and implementation of a novel CMOS
receiver, suitable for arrays of optoelectronic switching nodes
comprised of flip-chip-bonded MQW modulators and detectors on silicon,
that eliminates the need for linear amplification. Instead it
integrates the input current on the parasitic capacitance of the
detector and uses double sampling to create the voltage difference for
a clocked comparator to resolve.
Receiver Design
One significant parameter of a hybrid flip-chip bonded MQW detector is
its capacitance. The diode capacitance and the flip-chip bump-plus-pad
capacitance are the two primary components of the detector capacitor,
$C_d$. This capacitor can integrate the optically generated current of
the detector over time. If the input of the front-end receiver is also
capacitive, with a capacitance $C_{in}$, the voltage of the input node
at each signal time, $V_n$, is always the sum of the incoming signal and
the voltage of the input node just before that signal, $V_{n-1}$:
$V_n = V_{n-1} + (I_n \cdot T)/(C_d + C_{in})$,
where $I_n$ is the input current during bit n and T is the bit period.
Therefore, if we compare $V_n$ and $V_{n-1}$, we have enough information
about the input signal at time $t_n$ to determine whether it was a one
or a zero.
Implementing a receiver based on this idea requires solving four main
issues: sampling and storing an analog voltage, fast comparison,
subtracting the average input current, and generating the clock signals
for this system. Each of these issues is described in more detail in
the following sections of the paper.
Fig. 1 Receiver block diagram
Fig. 1 illustrates the block diagram of the designed receiver. The
input signal from the photodetector is a single-ended, positive
current. The injected charge is higher if the bit value is "1", but it
is not necessarily zero when the bit value is "0". Therefore, in order
to have a bipolar voltage change at the input of the receiver, we need
to subtract a constant charge from the input capacitor for every bit.
This is done by subtracting an adjustable current from the input. The
DC current is adjusted by a feedback loop that looks at the DC value of
the voltage of the input node. The feedback loop not only adjusts the
DC current but also sets the average voltage of the input node.
A bipolar voltage change at the input allows us to decide the input
value by comparing two adjacent samples of the input voltage. If the
new sample is higher, the input signal is "1"; otherwise, it is "0".
Fig. 2 illustrates how $V_{in}$ varies with time when $I_{DC}$ is set to
the correct value, assuming constant currents during each bit period.
Fig. 2 Voltage of input node when $I_{DC}$ is correct ($V_{in}$ versus time)
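The integrate-and-compare idea can be sketched in a few lines of Python. This is an illustrative model only, not the authors' circuit: the function name `double_sample_receive`, the ideal noise-free and offset-free comparator, and the example currents `I0` and `I1` are assumptions. The 420 fF capacitance and the 1.6 Gb/s rate are the values reported later in the paper. A constant current is subtracted every bit so that the voltage steps are bipolar.

```python
def double_sample_receive(bits, i0, i1, i_dc, t_bit, c_total, v_start=0.0):
    """Integrate the net input current on c_total and decide each bit by
    comparing the new sample of the input node with the previous one."""
    v_prev = v_start
    decided = []
    trace = [v_start]
    for b in bits:
        i_in = i1 if b else i0
        # V_n = V_{n-1} + (I_n - I_DC) * T / C
        v_new = v_prev + (i_in - i_dc) * t_bit / c_total
        decided.append(1 if v_new > v_prev else 0)
        trace.append(v_new)
        v_prev = v_new
    return decided, trace


T_BIT = 1 / 1.6e9        # bit period at 1.6 Gb/s
C_TOTAL = 420e-15        # total input capacitance reported in the paper
I0, I1 = 4e-6, 15e-6     # hypothetical "0" and "1" photocurrents (11 uA apart)
I_DC = (I0 + I1) / 2     # subtraction current that centers the steps

tx = [1, 0, 1, 1, 0, 1, 0, 0]
rx, v = double_sample_receive(tx, I0, I1, I_DC, T_BIT, C_TOTAL)
print("recovered correctly:", rx == tx)
print("first step (mV):", round((v[1] - v[0]) * 1e3, 2))
```

Because the pattern above has equal numbers of ones and zeros and `I_DC` sits midway between the two currents, the modeled input voltage returns to its starting value at the end of the pattern.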
A.  Sampler: 
The analog sampler is illustrated in Fig. 3. It uses two
non-overlapping phases, $\phi_1$ and $\phi_2$, derived from a single 50%
duty cycle clock. Sampling is therefore done at both the rising and
falling edges of the reference clock. Generation and shaping of these
phases are discussed later.
84 0-7803-7310-3/02/$17.00 ©2002 IEEE 2002 Symposium On VLSI Circuits Digest of Technical Papers
Fig. 3 Sampler
When $\phi_1$ is high, $V_1$ is the new sample and $V_2$ is the old
sample of $V_{in}$. After $\phi_1$ falls and before $\phi_2$ starts,
$V_1$ and $V_2$ are compared. Comparing the samples only when both
clocks are low balances the clock feedthrough noise, so any charge
injection through the switching transistors is common mode. During the
next phase, when $\phi_2$ is high, $V_1$ is held unchanged and $V_2$ is
updated. Now $V_2$ is the new sample and $V_1$ is the old one, and they
are compared as soon as $\phi_2$ goes low. The RC delay of the samplers
puts a lower limit on the duty cycle of $\phi_1$ and $\phi_2$. For 1 µm
wide sample devices driving 10 fF loads, we need about 200 ps, or 35%
of a bit time at 1.6 Gb/s. The two hold capacitors are much smaller
than the parasitic capacitor of the detector. Therefore charge sharing
does not attenuate the signal significantly.
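The charge-sharing claim can be checked with a small charge-conservation calculation. The sketch below is a rough estimate under stated assumptions: the hold capacitor is taken to be the 10 fF load mentioned above, the input node is taken to be the 420 fF total capacitance reported in the measurements, and the switch is ideal. The function names are hypothetical.

```python
def share_charge(v_node, v_hold, c_node, c_hold):
    """Voltage after an ideal switch connects c_node (at v_node) to
    c_hold (at v_hold); total charge is conserved."""
    return (c_node * v_node + c_hold * v_hold) / (c_node + c_hold)


def step_retained(c_node, c_hold):
    """Fraction of the difference (v_node - v_hold) left after sharing."""
    return c_node / (c_node + c_hold)


C_NODE = 420e-15   # total input capacitance reported in the paper
C_HOLD = 10e-15    # assumption: hold capacitor equals the 10 fF load

retained = step_retained(C_NODE, C_HOLD)
# Example: the node moved 8 mV above the value stored on the hold capacitor.
v_after = share_charge(1.008, 1.000, C_NODE, C_HOLD)
print(f"retained fraction: {retained:.3f}")
print(f"sampled step (mV): {(v_after - 1.000) * 1e3:.2f}")
```

Under these assumptions only a few percent of each voltage step is lost, which is consistent with the statement that the attenuation is not significant.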
B. Comparator: 
Comparison is done by two StrongArm [10] regenerative sense amplifiers.
Each is triggered immediately after one of the two phases, $\phi_1$ or
$\phi_2$, falls and before the other one rises. The sizing of the
transistors in the sense amps is critical, since these structures
dissipate most of the power in the interface. These circuits use offset
compensation to break the dependence of offset voltage on transistor
size, allowing 5 µm wide input devices. Offset compensation is done by
digitally adjusting the number of small capacitors added to the
internal nodes A and B [11] in Fig. 4. Any process mismatch between the
two branches of the sense amp, between the two branches of the sampler,
or between $\phi_1$ and $\phi_2$ can be compensated at this stage.
Simulation results show that the offset can be corrected in steps of
about 6 mV.
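To illustrate what a 6 mV correction step implies, the following sketch picks the digital trim setting that best cancels a given input-referred offset. It is a simplified model, not the calibration procedure used on the chip: the signed code convention (positive codes load one internal node, negative codes the other), the number of available codes, and the assumption of uniform steps are all hypothetical.

```python
STEP_V = 6e-3  # correction step from the simulation result quoted above


def best_trim_code(offset_v, max_code):
    """Return the signed trim code closest to cancelling offset_v and the
    residual offset left after applying it (uniform steps assumed)."""
    code = round(offset_v / STEP_V)
    code = max(-max_code, min(max_code, code))  # clamp to available range
    residual = offset_v - code * STEP_V
    return code, residual


# Hypothetical example: a 20 mV offset and 7 codes on each side.
code, residual = best_trim_code(20e-3, max_code=7)
print("code:", code, "residual (mV):", round(residual * 1e3, 2))

# An offset beyond the trim range saturates at the largest code.
code_sat, residual_sat = best_trim_code(100e-3, max_code=7)
print("saturated code:", code_sat)
```

As long as the offset lies inside the trim range, the residual in this model is at most half a step, about 3 mV.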
Fig. 4 Offset compensated Sense Amp. 
This design is inherently robust against kick-back and charge injection
from the sense amps to the high-impedance input nodes. The reason is
that there are two similar sense amps whose inputs are connected to the
same nodes of one sampler unit. Soon after one sense amp injects some
charge into these nodes during the evaluation phase, the other sense
amp is reset and injects the opposite charge into the same nodes. The
total injected charge is zero after one bit period, and the sample is
valid for the next comparison. The key point here is that the voltage
waveforms of the precharged nodes, A and B, are always very similar,
and therefore kick-back is not significantly data dependent.
C. Filter and Current Feedback: 
Assuming that the stream of incoming data is DC balanced, the DC
voltage of the input node remains constant if $I_{DC}$ in Fig. 1 is
equal to $(I_0 + I_1)/2$, where $I_0$ is the average optically generated
current during a "0" bit and $I_1$ is the average optically generated
current during a "1" bit ($I_0$ and $I_1$ can vary with the optical
input power and the characteristics of the photodetector). If $I_{DC}$
has any other value, the DC value of $V_{in}$ will increase or decrease
even after equal numbers of "0"s and "1"s. Therefore a feedback loop
can be used to adjust $I_{DC}$ by looking at $V_{in}$. The key is to use
a low-pass filtered version of $V_{in}$ to ensure the current does not
fluctuate in response to the high-frequency changes of $V_{in}$ due to
the incoming data. For instance, if we assume that the data is DC
balanced within 20 bits, $I_{DC}$ should stay fairly constant even if we
receive a run of 10 consecutive ones or 10 consecutive zeros. The
filter should also have a relatively high DC gain to handle a wide
range of $I_0$ and $I_1$ values while keeping $V_{in}$ relatively
constant, at the best point of operation for the sampler and the
comparators.
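The effect of a wrong subtraction current can be illustrated numerically. The sketch below computes the net change of the input voltage over a DC-balanced pattern for an ideal and for a slightly low subtraction current. It is a simplified open-loop model with no feedback: the function name `net_drift` and the example currents are assumptions, while the 420 fF capacitance and the 1.6 Gb/s rate come from the paper.

```python
def net_drift(bits, i0, i1, i_dc, t_bit, c_total):
    """Net change of the input-node voltage after the whole bit pattern,
    with a constant current i_dc subtracted during every bit."""
    charge = sum(((i1 if b else i0) - i_dc) * t_bit for b in bits)
    return charge / c_total


T_BIT = 1 / 1.6e9
C_TOTAL = 420e-15
I0, I1 = 4e-6, 15e-6               # hypothetical photocurrents
pattern = [1] * 10 + [0] * 10      # DC balanced within 20 bits

ideal = net_drift(pattern, I0, I1, (I0 + I1) / 2, T_BIT, C_TOTAL)
low = net_drift(pattern, I0, I1, (I0 + I1) / 2 - 1e-6, T_BIT, C_TOTAL)
print("drift with ideal I_DC (mV):", round(ideal * 1e3, 3))
print("drift with I_DC 1 uA low (mV):", round(low * 1e3, 2))
```

In this model an error of 1 µA in the subtraction current moves the input node by roughly 30 mV every 20 bits, which is why the loop needs to track the average current.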
The simplest way to build the needed low-pass filter is a single-pole
RC circuit, but because of the parasitic capacitor of the detector, the
open-loop transfer function of this simple system has two poles, which
causes a stability problem in a feedback loop (Fig. 5).
Fig. 5 Feedback loop for current adjustment
One way to make the loop stable and increase the phase 
margin is adding a zero to the loop transfer function. Capacitor 
C, in Fig. 5 is added to the circuit for this reason. The open 
loop transfer function is: 
Fig. 6 illustrates the transistor-level schematic of the buffer, filter
and current source. Resistor R is implemented by a switched capacitor
C, giving $R = 1/(f \cdot C)$, where f is the frequency of the
non-overlapping clocks, clk and clk_b.
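The switched-capacitor relation is easy to evaluate numerically. The example below is only a sketch with hypothetical component values: the paper does not state the switched capacitance, the filter capacitance, or the switching frequency used in the loop filter, so the 800 MHz clock, the 5 fF switched capacitor and the 10 pF filter capacitor are assumptions chosen for illustration.

```python
import math


def switched_cap_resistance(f_clk, c_sw):
    """Equivalent resistance of a capacitor c_sw switched at f_clk."""
    return 1.0 / (f_clk * c_sw)


def rc_corner_frequency(r, c):
    """Corner frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * r * c)


F_CLK = 800e6      # assumption: clocks run at half the 1.6 Gb/s bit rate
C_SW = 5e-15       # hypothetical switched capacitor
C_FILT = 10e-12    # hypothetical filter capacitor

r_eq = switched_cap_resistance(F_CLK, C_SW)
f_c = rc_corner_frequency(r_eq, C_FILT)
print(f"equivalent R: {r_eq / 1e3:.0f} kOhm")
print(f"corner frequency: {f_c / 1e3:.1f} kHz")
```

The point of the technique is visible in the numbers: a few femtofarads switched at the clock rate behaves like a resistor of hundreds of kilohms, which would otherwise take considerable area.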
Fig. 6 Loop filter schematic
The DC value of $V_{in}$ can be set externally by $V_{ref}$:
$V_{in} \approx V_{ref} + V_{GS(nmos)}$. The quiescent current of the
differential pair should be large enough to cover a wide range of
$I_1$, and it can be chosen by Bias1. Finally, the input signal is
buffered by a source follower.
D. Phase Generator: 
$\phi_1$ and $\phi_2$ are two non-overlapping phases with the same
frequency as the reference clock and are used for sampling. L1 and L2
are the control phases of Sense Amp1 and Sense Amp2. Fig. 7 illustrates
how $\phi_1$, $\phi_2$, L1 and L2 are generated from the reference
clock and the inverted reference clock. Clk_b is generated carefully,
with the same rising and falling rates as Clk and with low skew.
Fig. 7 Phase generator 
$\phi_1$ and $\phi_2$ are in fact chopped versions of clk and clk_b,
and their duty cycle can be adjusted with digitally controlled
capacitors, Cadj. The rising edge of L1 is delayed by one inverter, so
Sense Amp1 starts to evaluate its inputs right after sampling is done
by $\phi_1$. The evaluation should be done before the rising edge of
$\phi_2$. This condition is met because the duty cycle of $\phi_2$ is
less than 50%. As mentioned before, the minimum width of $\phi_1$ or
$\phi_2$ is set by the acquisition time of the sample switch. The
non-overlapping region in this design is about 100 ps for a 1.6 Gb/s
data rate. With L1 and L2 placed almost at the middle of this region,
the receiver can tolerate a skew of about 50 ps between Clk and Clk_b.
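The timing numbers in this section fit together in a simple budget. The sketch below assumes that one bit is sampled on each clock edge, that both sampling phases have the same width, and that the two non-overlap gaps are equal; these symmetry assumptions and the function name are not from the paper. The 100 ps gap and the 200 ps minimum sampling time are the values quoted in the text.

```python
def phase_timing(bit_rate, non_overlap):
    """Timing budget for two equal, non-overlapping sampling phases.

    Assumes one bit per clock edge, equal phase widths and equal gaps.
    Returns (bit time, phase width, duty cycle, tolerable skew).
    """
    t_bit = 1.0 / bit_rate
    clk_period = 2.0 * t_bit              # reference clock period
    phase_width = t_bit - non_overlap     # 2*width + 2*gap = clk_period
    duty = phase_width / clk_period
    skew_tolerance = non_overlap / 2.0    # latch edge centered in the gap
    return t_bit, phase_width, duty, skew_tolerance


t_bit, width, duty, skew = phase_timing(1.6e9, 100e-12)
print(f"bit time: {t_bit * 1e12:.0f} ps")
print(f"phase width: {width * 1e12:.0f} ps, duty cycle: {duty:.0%}")
print(f"tolerable skew: {skew * 1e12:.0f} ps")
print("meets 200 ps acquisition time:", width >= 200e-12)
```

Under these assumptions each phase is high for less than half of the clock period, consistent with the condition stated above, and still leaves margin over the sampler acquisition time.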
Support and Test Circuits 
To avoid hysteresis and to increase the sensitivity and speed of
comparison, a small latch follows each of the first-stage sense amps.
The outputs of the latches are negative-true pulses, which are
converted into levels using a dynamic SR latch.
For this test chip, the reference clock is generated by an integrated
dual-loop Delay Locked Loop [12]. The test chip does not contain clock
recovery circuitry, so the multiplexer and interpolator are digitally
controlled by programming the chip externally to correct the phase.
Experiment Results 
This design was fabricated in a 0.25-µm CMOS process and tested with a
2.5 V power supply. The arrays of GaAlAs p-i-n diodes were connected on
top of the silicon chip with the flip-chip bonding technique (Fig. 8).
The total input capacitance after adding the photo devices was measured
by sending a periodic pattern of long sequences of ones and zeros. A
small sampler at the input node gives the voltage values at the
beginning and end of each sequence. If the laser's driving current is
adjusted to give zero optical output for a zero bit, then by reading
$I_1$ we can calculate $C_{in}$. Our measurement gives a total
capacitance of 420 fF. In the next step we measured the sensitivity of
the receiver at the maximum possible bit rate. The bit rate is limited
by the non-overlapping margin needed between $\phi_1$ and $\phi_2$.
More relaxed timing and/or higher performance can be achieved by using
4 samplers and a 1-to-4 input demultiplexing scheme.
Fig. 8 Receiver micrograph after bonding arrays of photo diodes 
If the data is DC balanced over every N bits, increasing N causes a
small reduction in the sensitivity of the receiver. This is because of
the changes in $I_{DC}$ due to the limited cut-off frequency of the
low-pass filter and changes in the common-mode range of the input. The
worst-case pattern for measuring the sensitivity is N/2 zeros followed
by N/2 ones. For N=16 the receiver required an average current of
$\Delta I_{ave} = (I_1 - I_0)/2 = 5.5$ µA for a 1.6 Gb/s data rate,
which corresponds to about 8 mV voltage swing per bit. This
current increases to 9 µA when the sequence is extended to
N=32. For N=16 no errors were found at the minimum power 
level for more than 10' bits. Our optical test setup did not 
allow us to measure the BER for pseudo random data. 
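The quoted voltage swing follows directly from the integration relation used throughout the paper. The check below takes the 11 µA switch current from Table I (an average current of 5.5 µA), the measured 420 fF capacitance and the 1.6 Gb/s rate. The common-mode excursion estimate assumes the subtraction current stays at its ideal value during the run, which is an idealization since the text notes that it changes; the function names are hypothetical.

```python
def swing_per_bit(delta_i_ave, bit_rate, c_total):
    """Voltage step on the input node for one bit: dV = dI * T / C."""
    return delta_i_ave / bit_rate / c_total


def worst_case_excursion(delta_i_ave, bit_rate, c_total, n_bits):
    """Voltage excursion over a run of n_bits / 2 identical bits,
    assuming the subtraction current does not move during the run."""
    return (n_bits // 2) * swing_per_bit(delta_i_ave, bit_rate, c_total)


BIT_RATE = 1.6e9
C_TOTAL = 420e-15

swing_16 = swing_per_bit(5.5e-6, BIT_RATE, C_TOTAL)   # N = 16 case
swing_32 = swing_per_bit(9e-6, BIT_RATE, C_TOTAL)     # N = 32 case
excursion_16 = worst_case_excursion(5.5e-6, BIT_RATE, C_TOTAL, 16)
print(f"swing per bit, N=16: {swing_16 * 1e3:.2f} mV")
print(f"swing per bit, N=32: {swing_32 * 1e3:.2f} mV")
print(f"idealized excursion, N=16: {excursion_16 * 1e3:.1f} mV")
```

The N=16 result agrees with the roughly 8 mV per bit stated in the text.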
Fig. 9 Input Voltage and Recovered Data
The responsivity of the MQW detector is about 0.5 A/W; therefore the
system can detect an optical switching energy as low as 14 fJ. Total
power dissipation of the whole receiver circuitry is less than 3 mW at
1.6 Gb/s and is mostly due to the clocking and dynamic dissipation of
the sense amps. The area of the receiver is 80 µm × 50 µm in our
0.25-µm CMOS process. The performance of this receiver is summarized in
Table I.
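The energy and scaling figures can be reproduced from the reported numbers. The sketch below uses the 11 µA switch current, the 0.5 A/W responsivity, the 3 mW power and the 80 µm × 50 µm area from the paper; the function names are hypothetical, and the electrical energy per bit is a derived figure that the paper itself does not quote.

```python
def optical_switching_energy(i_switch, bit_rate, responsivity):
    """Optical energy per bit needed to produce i_switch for one bit time."""
    charge = i_switch / bit_rate          # coulombs collected in one bit
    return charge / responsivity          # joules of optical energy


def electrical_energy_per_bit(power, bit_rate):
    """Electrical energy the receiver spends per received bit."""
    return power / bit_rate


def array_budget(n, power_each, width, height):
    """Total power (W) and area (m^2) of n identical receivers."""
    return n * power_each, n * width * height


e_opt = optical_switching_energy(11e-6, 1.6e9, 0.5)
e_elec = electrical_energy_per_bit(3e-3, 1.6e9)
p_total, a_total = array_budget(1000, 3e-3, 80e-6, 50e-6)
print(f"optical switching energy: {e_opt * 1e15:.2f} fJ")
print(f"electrical energy per bit: {e_elec * 1e12:.3f} pJ")
print(f"1000 receivers: {p_total:.1f} W, {a_total * 1e6:.1f} mm^2")
```

The optical energy rounds to the 14 fJ quoted above, and the array totals match the 3 W and 4 mm² figures given for 1000 receivers in the conclusion.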
Conclusion 
We demonstrate that one does not need a transimpedance amplifier in a
high-speed optical link. One can get good sensitivity using a
double-sampled approach. The receiver, designed for a 0.25-µm CMOS
process and arrays of hybrid flip-chip bonded MQW detectors, provides
high sensitivity and bandwidth while requiring small amounts of power
and area. The sensitivity of this receiver is more than adequate for
short-haul optical communications (and should improve once the input
capacitance is reduced), and the required area and power will allow
1000 receivers to consume only 3 W and 4 mm².
Table I: Receiver Summary
Supply Voltage            2.5 V
Technology                National 0.25-µm CMOS
Capacitance               420 fF
Sensitivity @ 1.6 Gb/s    11 µA (switch current)
Input Data Rate           1.6 Gb/s
Power Dissipation         3 mW
Area                      80 µm × 50 µm
Acknowledgments 
The authors would like to thank David A. B. Miller, 
Diwakar Agarwal, Samuel Palermo, Timothy J. Drabik, Jaeseo 
Lee, Vladimir Stojanovic, Henrik Johansson for technical dis- 
cussions and National Semiconductor for fabricating the test 
chip. 
References 
[1] David A. B. Miller, "Physical reasons for optical interconnection",
International Journal of Optoelectronics, vol. 11, no. 3, pp. 155-168,
1997
[2] K. W. Gossen, J. A. Walker, L. A. D'Asaro, S. P. Hui, B. Tseng,
R. Leibenguth, D. Kossives, D. D. Bacon, D. Dahringer, L. M. F.
Chirovsky, L. A. Lentine, and D. A. B. Miller, "GaAs MQW modulators
integrated with silicon CMOS", IEEE Photon. Technol. Lett., vol. 7,
no. 4, pp. 360-362, Apr. 1995
[3] A. L. Lentine and D. A. B. Miller, "Evolution of the SEED
technology: Bistable logic gates to optoelectronic smart pixels", IEEE
J. Quantum Electronics, vol. 29, pp. 655-669, Feb. 1993
[4] Ashok V. Krishnamoorthy, David A. B. Miller, "Scaling
Optoelectronic-VLSI Circuits into 21st Century: A Technology Roadmap",
IEEE Journal of Selected Topics in Quantum Electronics, vol. 2, no. 1,
pp. 55-76, Apr. 1996
[5] A. L. Lentine, et al., "Arrays of Optoelectronic Switching Nodes
Comprised of Flip-Chip-Bonded MQW Modulators and Detectors on Silicon
CMOS Circuitry", IEEE Photonics Technology Letters, vol. 8, no. 2,
pp. 221-223, Feb. 1996
[6] Ted K. Woodward, Ashok V. Krishnamoorthy, A. L. Lentine, L. M. F.
Chirovsky, "Optical Receivers for Optoelectronic VLSI", IEEE Journal of
Selected Topics in Quantum Electronics, vol. 2, no. 1, pp. 106-115,
Apr. 1996
[7] D. A. B. Miller, M. D. Feuer, T. Y. Chang, S. C. Shunk, J. E.
Henry, D. J. Burrows, and D. S. Chemla, "Field-effect transistor
self-electrooptic effect device: Integrated photodiode, quantum well
modulator and transistor", IEEE Photonics Technology Letters, vol. 1,
no. 3, pp. 62-64, 1989
[8] A. L. Lentine, L. M. F. Chirovsky, M. W. Focht, M. D. Feuer, G. D.
Guth, R. Leibenguth, G. J. Przybylek, and L. E. Smith, "Diode-clamped
symmetric self-electro-optic effect devices with subpicojoule switching
energies", Appl. Phys. Lett., vol. 60, pp. 1809-1811, 1992
[9] A. V. Krishnamoorthy, L. A. Lentine, K. W. Gossen, J. A. Walker,
T. K. Woodward, J. E. Ford, G. F. Aplin, L. A. D'Asaro, S. P. Hui,
B. Tseng, R. Leibenguth, D. Kossives, D. Dahringer, L. M. F. Chirovsky,
D. A. B. Miller, "3-D integration of MQW modulators over active
submicron CMOS circuits: 375 Mb/s transimpedance receiver-transmitter
circuit", IEEE Photonics Technology Letters, vol. 7, no. 11,
pp. 1288-1290, 1995
[10] Montanaro, J. et al., "A 160-MHz, 32-b, 0.5-W CMOS RISC
microprocessor", IEEE Journal of Solid-State Circuits, vol. 31, no. 11,
pp. 1703-1714, Nov. 1996
[11] M. E. Lee, W. J. Dally, P. Chiang, "Low-Power Area-Efficient
High-Speed I/O Circuit Techniques", IEEE Journal of Solid-State
Circuits, vol. 35, no. 11, pp. 1591-1599, Nov. 2000
[12] S. Sidiropoulos, D. Liu, J. Kim, G. Wei, and M. Horowitz,
"Adaptive Bandwidth DLLs and PLLs using Regulated Supply CMOS Buffers",
2000 Symposium on VLSI Circuits Digest of Technical Papers,
pp. 124-127, Jun. 2000
