Abstract: Optical transimpedance receivers implemented in CMOS VLSI technologies are modeled and optimized for free-space optoelectronic interconnections. Sensitivity, bandwidth, power dissipation, and circuit area are analyzed for receivers using three different submicron CMOS processes. A comparison with the circuit-noise-limited optical power indicates that, for digital computing applications, the receiver sensitivity is limited by the gain-bandwidth product of the receiver amplifiers and the necessary noise margin of logic circuits.
I. INTRODUCTION
OPTOELECTRONIC computing systems use interconnects composed of light transmitters with their driver circuits, optical elements, and photodetectors with their associated receiver circuits. To a large extent, the performance of the interconnection depends on the receiver's gain, bandwidth, power consumption, and area requirements. These four parameters can be traded off against each other. By adjusting the number of amplifying stages, the transistor sizes, and the bias voltages, the receiver circuit can be designed to optimize the link performance.
We have developed a framework for modeling and optimizing transimpedance receivers fabricated in digital CMOS technology. The design goal is to minimize the total power dissipation in a link. We calculate the minimum optical power based on the voltage requirement at the receiver output and compare it with the thermal noise-limited optical power required for operation at a given bit error rate (BER). The results indicate that the minimum optical power required is determined by the gain-bandwidth product of the receiver amplifier and the noise margin required in the logic circuits that follow the receiver.
In this paper, we first describe the transimpedance receiver model used in the analysis. Equations for the bit rate, transimpedance gain, noise, power dissipation, and area of these receivers are introduced. Then the method used to optimize receiver performance is presented. Finally, the results of optimized receivers in three different submicron CMOS technologies are discussed. 
II. RECEIVER MODEL
Optical receivers can be classified as high-impedance, transimpedance, and low-impedance, depending on the preamplifier design. Transimpedance receivers are a popular choice due to their high bandwidth, low noise, and ease of biasing [2]-[5]. The operational model of a transimpedance receiver can be broken into three components, as shown in Fig. 1: the transimpedance amplifier, the voltage amplifier, and the decision circuit. The transimpedance amplifier converts the photocurrent from the detector to an analog voltage. This voltage is then amplified by the voltage amplifier to match the requirements of the decision circuit. The decision circuit provides a digital voltage output to the following computation logic circuits. In this paper, no coding of the signal is assumed and the receiver components are dc-coupled. Consequently, the decision threshold cannot be derived from the signal but is generated internally. The use of dc-coupling also implies that the dc-bias conditions must be the same for all the amplifying stages. In addition, because of the large size and performance-limiting parasitics of on-chip inductors, our analysis does not include designs with inductive peaking. A separate optimization of inductive peaking in optical receivers is described in [6].
The receiver designs presented here are based on CMOS current-source inverters. The analysis examines one- and three-stage transimpedance amplifiers connected to cascaded voltage amplifying stages. Other or more complicated receiver designs can be envisioned, but in general they will all follow the structure shown in Fig. 1. They can be fit into the framework of this analysis by providing a description of their gain and bandwidth, as presented in Appendix B for the CMOS current-source inverter.
A. Receiver Components

1) Transimpedance Amplifier:
The transimpedance amplifier (TIA) converts an input current to an output voltage. A feedback resistor determines the transimpedance, and thus the sensitivity of the amplifier. Larger feedback resistors increase the sensitivity of the amplifier, but simultaneously reduce the amplifier's bandwidth.
0733-8724/98$10.00 © 1998 IEEE
The circuit for the amplifying stage is shown in Fig. 2. The amplifying stage gain and bandwidth are found to depend on the width of M1 and the bias voltage. The loading of the feedback resistor lowers the gain and increases the bandwidth. A detailed analysis of the gain and the bandwidth can be found in Appendix B, and the device models upon which this analysis is based are developed in Appendix A. A common-gate FET operating in its linear region can serve as the feedback resistor connected around the amplifying stage. In this analysis, the feedback FET is treated as a resistor, which is a valid approximation for small signal levels. The large-signal nonlinearity introduced by using a feedback PFET extends the dynamic range of the receiver [2].
The circuit for a three-stage TIA is a simple extension of the one-stage case. Three identical amplifying stages are cascaded, and a feedback FET is connected around the three amplifiers. Similar circuits have been proposed in the literature [2], [3]. The TIA must have an odd number of stages, so that the feedback is negative.
2) Voltage Amplifier: The voltage amplifier consists of a cascaded series of amplifying stages. To ensure proper dc-biasing, the stages are all identical and use the same design as the stages in the transimpedance amplifier. The total gain provided by a P-stage cascaded voltage amplifier is thus the gain of a single stage, as defined in Appendix B, raised to the power P.
One important consideration in the voltage amplifier is the effect of parameter variations on the dc-biasing. Small variations in the transistor parameters can cause offsets that are amplified by subsequent stages in the amplifier, such that later stages may no longer be biased correctly. This problem is alleviated by the use of feedback in the TIA, but it must be taken into account in the voltage amplifier. Typical offsets between identical transistors in modern CMOS processes are in the 10 mV range. Since the gain of the amplifying stages is typically between three and five, the maximum number of stages in the voltage amplifier is limited to two to keep the offset at the output of the voltage amplifier below 250 mV. This ensures that all stages are correctly biased and that the input to the decision circuit swings about the threshold point. Although the offset improves slightly for smaller gate-length technologies, the voltage swing reduces as well, indicating that the two-stage limit is a reasonable choice for all three technologies considered [7].
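The two-stage limit can be checked with simple arithmetic. The following is a minimal sketch, assuming that a single input-referred offset is multiplied by the full gain of the cascade; the function name and this worst-case model are illustrative assumptions rather than part of the analysis.

```python
def output_offset(input_offset_v: float, stage_gain: float, stages: int) -> float:
    """Offset at the cascade output, assuming one input-referred offset
    is amplified by every stage (illustrative worst-case model)."""
    return input_offset_v * stage_gain ** stages


# 10 mV offset and a stage gain of 5, the upper values quoted in the text.
two_stage = output_offset(0.010, 5.0, 2)
three_stage = output_offset(0.010, 5.0, 3)
print(f"2 stages: {two_stage * 1e3:.0f} mV, 3 stages: {three_stage * 1e3:.0f} mV")
```

Under this assumption a third stage would push the output offset above 1 V, which is a large fraction of the supply voltage in submicron CMOS.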
3) Decision Circuit: The decision circuit is an amplifying stage without transistor M3, where the ratio of the PMOS to NMOS width is calculated to make the inverter threshold voltage the same as the bias voltage of the amplifying stages. This ratio is given by the parameter defined in Appendix B. A minimum voltage swing must be input to the decision circuit to ensure rail-to-rail output swing. This input voltage swing is the width of the voltage transition region of the decision circuit. The width of the transition region for the decision circuit is given approximately by the expression in [8].
B. Receiver Performance
The receiver performance is evaluated by its bit rate, transimpedance gain, input equivalent noise, power dissipation, and circuit size. In this section, equations are given for these five performance measures in terms of the design of the receiver.
High receiver sensitivity can be obtained by increasing the gain of the amplifying stages or by increasing the number of stages in the amplifiers. Both of these methods reduce the bandwidth of the receiver. By parameterizing the bit rate and transimpedance gain of the receiver in terms of the number of stages and the design of each stage, the receiver design that maximizes the transimpedance gain can be found for each bit rate. The bit rate of the receiver is determined from its rise time, which is a function of the rise time of the individual amplifiers. The transimpedance is determined from the stability requirements. The power dissipation and size of the receiver can then be calculated, once the design of each stage and the number of stages are known.
1) Bit Rate:
The location of the poles in the receiver transfer function determines the 10-90% rise time in response to a step input. The bit rate of the overall receiver can be determined from the rise times of each of the components according to (1), in which a rise-time fraction sets what percentage of the bit period makes up the rise time. In a synchronous NRZ receiver, this fraction can be taken to be about 60% without significant signal degradation [3]. The rise times for the receiver components are given in Table I [9]. The TIA is designed to have a response that closely approximates a maximally flat magnitude (MFM) response, i.e., the two poles closest to the origin are at 45°. This is achieved with an appropriately valued feedback resistor.
It can be seen from Table I that receivers with a three-stage TIA are significantly slower than ones with a one-stage TIA, when both are constructed from identical amplifying stages. In general, it can be shown that when the number of stages in a feedback loop increases, the bandwidth decreases [9] . However, in order to determine when three-stage TIA based receivers are competitive, the transimpedance gain must be examined as well.
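As an illustration of how component rise times limit the bit rate, the sketch below combines them as a root sum of squares and lets the result occupy a fixed fraction of the bit period. The root-sum-of-squares rule is a common engineering approximation and is an assumption here; the exact form of (1) may differ. The function name and the example rise times are hypothetical.

```python
import math


def bit_rate(rise_times_s, rise_fraction=0.6):
    """Bit rate (bit/s) at which the combined rise time occupies
    `rise_fraction` of the bit period.

    Assumption: component rise times add as a root sum of squares.
    """
    total_rise = math.sqrt(sum(t * t for t in rise_times_s))
    return rise_fraction / total_rise


# Hypothetical rise times: TIA, one voltage-amplifier stage, decision circuit.
without_postamp = bit_rate([0.5e-9, 0.2e-9])
with_postamp = bit_rate([0.5e-9, 0.3e-9, 0.2e-9])
print(f"{without_postamp / 1e6:.0f} Mb/s -> {with_postamp / 1e6:.0f} Mb/s")
```

Each added stage lengthens the combined rise time, so the attainable bit rate falls as stages are added.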
2) Transimpedance Gain: The transimpedance of the receiver determines its sensitivity. In order to ensure stability and eliminate resonance peaking in the receiver transfer function, the transimpedance of the TIA is adjusted to approximate a maximally flat magnitude response. The transimpedance can then be calculated based on the gain, bandwidth, input capacitance and transconductance of the amplifying stages, the total number of stages, and the photodiode capacitance.
For the one-stage TIA to have a maximally flat magnitude response, the input open-loop pole must be smaller than the second open-loop pole by the ratio given in (2). The input capacitance appearing in (2) is the photodiode capacitance plus any parasitic capacitance, and is taken as 100 fF in the analysis. This corresponds to a flip-chip bonded 400 μm² MQW detector. Since optical alignment and spot sizes are not expected to scale as the gate length of the technology, this value is constant for all three technologies considered in this paper.
The value of the feedback resistance obtained by solving (2) is used to determine the transimpedance of the one-stage TIA, which is given by (3) [10].
The three-stage TIA has four open-loop poles: one at the input pole frequency and three overlapping poles at the amplifying-stage pole frequency. In order for the three-stage TIA to approximate a maximally flat magnitude transfer function, the input open-loop pole must be related to the other three open-loop poles by (4). The transimpedance of the three-stage TIA is given by (3), with the single-stage gain replaced by its cube, since there are now three stages providing gain in the TIA.
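The pole-spacing condition for the one-stage case can be illustrated with the textbook two-pole feedback loop. The sketch below assumes a loop with dc loop gain A and two real open-loop poles, and solves for the pole ratio that gives a closed-loop quality factor of 1/sqrt(2), i.e., closed-loop poles at 45°. This is a generic result and may differ in detail from (2), which also accounts for the loading described in Appendix B; all names are hypothetical.

```python
import cmath
import math


def mfm_pole_ratio(loop_gain: float) -> float:
    """Ratio (second pole / input pole) giving closed-loop Q = 1/sqrt(2)
    for a two-pole loop with dc loop gain `loop_gain` (must be >= 1)."""
    # Q^2 = 1/2 reduces to r^2 - 2*A*r + 1 = 0; take the larger root.
    return loop_gain + math.sqrt(loop_gain ** 2 - 1.0)


def closed_loop_poles(loop_gain: float, ratio: float):
    """Closed-loop poles, normalized to the input pole frequency."""
    b = 1.0 + ratio                  # sum of the open-loop pole frequencies
    c = (1.0 + loop_gain) * ratio    # (1 + A) times their product
    disc = cmath.sqrt(b * b - 4.0 * c)
    return (-b + disc) / 2.0, (-b - disc) / 2.0


A = 4.0  # hypothetical single-stage gain
r = mfm_pole_ratio(A)
p1, _ = closed_loop_poles(A, r)
angle_deg = math.degrees(math.atan2(abs(p1.imag), abs(p1.real)))
print(f"pole ratio = {r:.2f}, pole angle = {angle_deg:.1f} deg")
```

For large gain the required ratio approaches 2A, which shows why a higher-gain stage forces a lower input pole and therefore a lower bandwidth.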
The overall transimpedance gain, TZ, is the receiver's output voltage divided by the input current, and is given by (5) as the voltage gain of the P-stage postamplifier times the transimpedance of the TIA.
For a receiver to operate correctly, a minimum average optical input power is necessary. This is the optical power that results in a voltage swing $\Delta V$, the width of the transition region, at the input to the decision circuit. Dividing by the responsivity of the detector, $\mathcal{R}$ (in amps/watt), and the transimpedance of the receiver, TZ, yields the required average optical power

$$P_{\mathrm{avg}} = \frac{\Delta V}{2\,\mathcal{R}\,\mathrm{TZ}} \qquad (6)$$

where the factor of two accounts for the assumption that half of the bits are on and half are off. The value of $\mathcal{R}$ is 0.5 A/W in this analysis, for the MQW detector. Even though three-stage TIA's are generally slower than one-stage TIA's, they can provide higher sensitivities, indicating that they may be competitive at low bit rates. The optimization described below determines the break-even point between one- and three-stage TIA based receivers.
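The relation between voltage swing, responsivity, transimpedance, and average optical power described above can be evaluated numerically. The following is a minimal sketch: the 0.5 A/W responsivity is the value used in the analysis, while the 0.5 V swing and 50 kohm overall transimpedance are hypothetical illustration values, as are the function names.

```python
import math


def required_optical_power_w(swing_v, responsivity_a_per_w, transimpedance_ohm):
    """Average optical power (W) that produces `swing_v` at the decision
    circuit input, with half of the bits on and half off."""
    return swing_v / (2.0 * responsivity_a_per_w * transimpedance_ohm)


def to_dbm(power_w):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(power_w / 1e-3)


# Hypothetical 0.5 V swing and 50 kohm transimpedance; 0.5 A/W from the text.
p = required_optical_power_w(0.5, 0.5, 50e3)
print(f"{p * 1e6:.1f} uW = {to_dbm(p):.1f} dBm")
```

Doubling the transimpedance halves the required optical power, a 3 dB improvement in sensitivity.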
3) Noise: The circuit noise introduced by the receiver and detector is referred to the receiver input for signal-to-noise ratio determination. The input equivalent noise is given by (7) [3], where the first term is the input equivalent noise due to the dark current of the detector and the Johnson thermal noise of the feedback resistor, and the second term is the input equivalent noise due to the FET channel noise in all of the stages. The values of the two integrals appearing in (7) depend on the transfer function of the receiver [11], and are given by (8) and (9).
Since the transfer functions are known, the values of the integrals in (8) and (9) can be determined for each receiver configuration. The calculated values are given in Table II. The receiver configuration is coded as (N + P), where N is the number of stages in the transimpedance amplifier, and P is the number of stages in the voltage amplifier.
4) Power:
The electrical power dissipation of a multistage receiver is determined from the bias current and the power supply voltage, and is given by (10).
There is additional power dissipation due to the switching of the node capacitances in the receiver, but this component is orders of magnitude less than the power dissipation due to the bias current.
5) Size:
The circuit size of the transimpedance amplifier is approximately determined by the sum of the widths of the transistors. The total circuit area of a multistage receiver can then be approximated by (11), which contains a proportionality constant that takes into account the source/drain diffusion lengths and the local interconnects. However, the physical circuit area may not be the limiting factor in determining the density of receivers. With the high power dissipation of these receivers, the thermal power density must be considered. In this case, with a maximum power density $D_{\max}$ dictated by the cooling method, the effective size of a receiver that dissipates an electrical power $P_{\mathrm{elec}}$ is

$$A_{\mathrm{eff}} = \frac{P_{\mathrm{elec}}}{D_{\max}} \qquad (12)$$

So, for example, a receiver that dissipates 1 mW of power on a chip that has a maximum power density of 10 W/cm² requires 10 000 μm², or in other words, a pitch separation of 100 μm (assuming a square pixel).
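The worked example above can be reproduced in a few lines. This is a minimal sketch of the power-density arithmetic only; the function name is hypothetical.

```python
import math


def effective_area_um2(power_w: float, max_density_w_per_cm2: float) -> float:
    """Area (um^2) needed so that `power_w` stays within the density limit."""
    area_cm2 = power_w / max_density_w_per_cm2
    return area_cm2 * 1e8  # 1 cm^2 = 1e8 um^2


area = effective_area_um2(1e-3, 10.0)  # 1 mW at 10 W/cm^2
pitch = math.sqrt(area)                # side of a square pixel, in um
print(f"area = {area:.0f} um^2, pitch = {pitch:.0f} um")
```

Because the effective area scales linearly with dissipated power, a 10 mW receiver under the same limit would need a pitch of roughly 316 μm.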
III. OPTIMIZATION AND COMPARISON
The receiver optimization takes place in two steps. First, for a given number of stages at a specified bit rate, it is possible to choose values for the width of M1 and the bias voltage that simultaneously satisfy the bit rate [from (1)] while minimizing the required average optical power [from (6)]. Second, picking the number of stages that requires the lowest optical power at each bit rate yields the distinct regimes in Figs. 3-5. These figures show the average optical power required and the electrical power dissipation of the optimum receivers. The solid lines are the results of optimizing for minimum required optical power, and the dashed lines are the results of optimizing for minimum total power, as described below. Solid circles are the results of SPICE simulations of the optimum receiver designs. The SPICE results closely match the predictions of the model.
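The two-step procedure can be sketched as a simple grid search. The code below is an illustration only: `toy_bit_rate` and `toy_optical_power` are made-up stand-ins for (1) and (6), chosen so that more stages and wider transistors give more gain but less speed, and every name and number in it is hypothetical.

```python
from itertools import product


def optimize_receiver(configs, widths, biases, bit_rate_fn, optical_power_fn,
                      target_bit_rate):
    """Two-step search: best (width, bias) per configuration, then the best
    configuration. Returns (power, config, width, bias), or None if no
    design is fast enough."""
    best_per_config = []
    for cfg in configs:
        # Step 1: among designs meeting the bit rate, keep the lowest power.
        feasible = [
            (optical_power_fn(cfg, w, v), cfg, w, v)
            for w, v in product(widths, biases)
            if bit_rate_fn(cfg, w, v) >= target_bit_rate
        ]
        if feasible:
            best_per_config.append(min(feasible, key=lambda d: d[0]))
    # Step 2: pick the configuration with the lowest required optical power.
    return min(best_per_config, key=lambda d: d[0]) if best_per_config else None


# Toy stand-in models (NOT the equations of this paper).
def toy_bit_rate(cfg, w, v):
    n, p = cfg
    return 4e9 * v / (w ** 0.5 * (n + p))


def toy_optical_power(cfg, w, v):
    n, p = cfg
    return 1e-3 / (1.0 + w ** 0.5) ** (n + p)


configs = [(1, 0), (1, 1), (1, 2), (3, 2)]  # (TIA stages, voltage-amp stages)
widths, biases = range(1, 21), [1.0, 1.5, 2.0]
for target in (1e9, 3e9, 5e9):
    best = optimize_receiver(configs, widths, biases,
                             toy_bit_rate, toy_optical_power, target)
    print(f"{target / 1e9:.0f} Gb/s -> config {best[1]}")
```

Even with these toy models the search reproduces the qualitative behavior described in the text: configurations with many stages win at low bit rates and drop out as the target bit rate rises.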
At lower bit rates, a three-stage transimpedance amplifier with a two-stage voltage amplifier (3 + 2) is optimum in all three technologies, with an average optical power requirement below dBm. In 0.8 μm technology, the (1 + 2) receiver is optimum next in a narrow region around 450 Mbit/s, followed by the (1 + 1) receiver up to a bit rate of 900 Mbit/s and an average optical power of 10 dBm. Beyond this bit rate, the (1 + 0) receiver is optimum. In 0.5 μm, the (1 + 1) receiver is optimum from 780 Mb/s to 1.1 Gb/s ( dBm), followed again by the (1 + 0) receiver. In 0.1 μm CMOS, the (1 + 2) receiver is optimum between 1.6 Gb/s and 2.2 Gb/s ( dBm), followed by the (1 + 1) receiver and the (1 + 0) receiver.
The (3 + 1) and (3 + 0) receivers do not appear as optimum receivers in any of the technologies when optimizing for minimum optical power, indicating that the speed penalty for adding voltage amplifying stages to the three-stage TIA is negligible compared to the benefit in increasing gain.
Shorter gate lengths provide a speed increase of about 30% from 0.8 to 0.5 μm, and about 100% from 0.5 to 0.1 μm. The power dissipation of the receivers is generally in the 1 to 10 mW range. The equivalent input noise power was also calculated for every optimized design using (7). The noise was always at least an order of magnitude smaller than the optical power requirement. This indicates that the bit-error rate due to receiver noise is insignificant and that this noise is not the dominant factor in determining the sensitivity of the receivers.
If a smart pixel circuit contains both receivers and transmitters, it may be more appropriate to minimize the total power dissipation in a link, which includes the power dissipation in the transmitter circuit as well. Let $K$ represent the electrical-to-optical power conversion efficiency of the transmitter, including the efficiency of the optical system, expressed as the electrical power required per unit of optical power delivered to the receiver. Multiplying $K$ by the required optical power $P_{\mathrm{avg}}$ gives an equivalent electrical power required at the transmitter. With $P_{\mathrm{elec}}$ denoting the electrical power dissipation of the receiver, the total power dissipated in a link is then given by

$$P_{\mathrm{total}} = P_{\mathrm{elec}} + K\,P_{\mathrm{avg}} \qquad (13)$$

If the characteristics of the transmitter are known in greater detail, it is possible to replace the constant $K$ with a function of the optical power that captures any nonlinear dependence of power efficiency on optical power. In this work, a constant value of $K$ is assumed.
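The effect of the transmitter conversion constant on the choice of receiver can be illustrated with a small numerical sketch. The two candidate designs and both values of the constant below are hypothetical; the function simply adds the receiver's electrical power to the equivalent transmitter power described above.

```python
def total_link_power_w(receiver_power_w: float, optical_power_w: float,
                       conversion_factor: float) -> float:
    """Receiver electrical power plus the transmitter electrical power
    needed to deliver `optical_power_w` at the receiver."""
    return receiver_power_w + conversion_factor * optical_power_w


# Hypothetical designs: (receiver electrical power, required optical power).
sensitive = (5e-3, 10e-6)   # many stages: high dissipation, little light
simple = (0.5e-3, 100e-6)   # few stages: low dissipation, more light

choice = {}
for k in (10.0, 1000.0):    # hypothetical conversion constants
    best = min((sensitive, simple),
               key=lambda d: total_link_power_w(d[0], d[1], k))
    choice[k] = "sensitive" if best is sensitive else "simple"
    print(f"K = {k:g}: {choice[k]} receiver minimizes total link power")
```

With an efficient transmitter the low-dissipation receiver wins, while a very inefficient transmitter makes the optical power term dominate, so the most sensitive receiver becomes the best choice.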
The dashed lines in Figs. 3-5 give the results of the total power optimization. Plotted are the average optical power required at the receiver and the electrical power dissipation of the receiver. The input equivalent noise power is again at least an order of magnitude less than the optical power. The transition between three-stage and one-stage TIA-based receivers occurs at roughly the same bit rate. However, the use of postamplifying stages is no longer optimum, indicating that the gain provided by the postamp stages does not compensate for the additional electrical power required. The total power dissipation in a link has been reduced to below 1 mW in the low bit rate region, whereas for the minimum optical power optimization (solid lines) the receiver power was greater than 1 mW. This suggests that the total power optimization leads to better link efficiency than that obtained when optimizing for lowest optical power. It should be noted that different values of the conversion constant give different optimization results. In particular, for very large values of this constant (i.e., very inefficient transmitters), the total power optimization reduces to the optical power optimization described above.
Fig. 6 plots the layout area and the effective area of receivers optimized for total power at 800 Mbit/s and 1.6 Gb/s. A maximum power density of 10 W/cm² is assumed. The layout proportionality constant in (11) is chosen to be 50, as this value matched our experimental receiver layout areas. The actual layout area is seen to be smaller than the area required to meet the maximum power density. As the gate length decreases, the ratio of the layout area to the effective area increases slightly. However, it is less than 25% overall.
Optimum receiver configuration is given as N + P, where N is the number of stages in the TIA, and P is the number of stages in the voltage amplifier.
IV. EXPERIMENTAL RESULTS
Two receivers were implemented in the MOSIS 0.5 μm ( μm drawn) CMOS process and electrically tested. Both receivers were designed to operate at 800 Mb/s. One receiver was optimized for minimum optical power and the other was optimized for minimum total power. The design parameters for the two receivers and the measured sensitivity and electrical power dissipation are listed in Table III. For electrical testing, a resistor with the same value as the receiver's feedback resistor was placed in series with the applied voltage signal, to simulate the high impedance of the photodiode. A pseudorandom bit sequence was applied for testing. The voltage swing required at the input for an open eye determined the simulated photocurrent, and thus determined the sensitivity of the receiver. Microwave probes were used for signal input and output, and the receiver output was buffered with a source follower to drive the 50 Ω probe. The eye diagram at 800 Mb/s for the receiver optimized for minimum optical power is shown in Fig. 7. The eye diagram for the other receiver was almost identical. It can be seen from the eye diagram that the receiver is not noise limited.
V. CONCLUSION
Receiver sensitivity, speed, power consumption, and area requirements are critical to the design of smart pixel IC's. In this paper, we have developed a receiver model and an optimization methodology that enable the design of optimal receivers. We have shown that the receiver sensitivity is limited by the gain-bandwidth product of the receiver amplifiers and the minimum noise margin required at the logic circuits. Receivers that are optimized for minimum optical power requirements are found to have electrical power dissipation ranging from 1 to 10 mW. The method for minimizing the total power, including the electrical power of the receiver and the transmitter, yields an increase in the required optical power at the receiver and a decrease in the total power dissipation. This is important, as the effective area of a receiver due to power density considerations will be larger than the actual layout area. Our approach is not limited to CMOS-based receivers. With a complete description of the amplifying stage gain, bandwidth, and capacitances, a similar optimization can be done for other technologies.
APPENDIX A DEVICE MODEL
High-frequency MOS circuits typically employ short transistor gate lengths and large gate-source voltages, as this combination produces the highest-speed transistors. To accurately model MOS transistors in this regime, short-channel effects must be considered. In saturation, the drain current per unit width of the device is given by (14) [12], in which a correction term contains the short-channel effects of both the gate (normal) electric field and the velocity saturation due to the source-drain (tangential) electric field, as well as the source resistance.
The transconductance per unit width can be approximated by (15).
The output conductance per unit width can be calculated from the process Early voltage and the coefficient of the threshold voltage shift due to drain-induced barrier lowering (DIBL) by (16) [13].
In short-channel devices, the dominant effect is DIBL. Thus, the output conductance is approximated by (17), in which a fitting parameter has been introduced to account for surface roughness and other effects that reduce the output conductance at high bias voltages.
Transistor capacitances are also critical in determining the frequency response of high-speed designs. These capacitances scale approximately linearly with the width of the transistor, and are given here in capacitance per minimum-width device. The gate-source capacitance in saturation consists of the gate-channel capacitance and the gate-source overlap capacitance, and is given by (18).
The gate-drain capacitance in saturation is determined solely by the gate-drain overlap, and the drain-bulk capacitance is given by the junction capacitances of the bottom and sidewalls of the drain diffusion. The length of the drain diffusion is assumed to be the minimum allowed by the process. Although the junction capacitance is voltage dependent, in this analysis constant worst-case values are used for simplicity.
The values used in the examples in this paper are for digital CMOS processes, and are listed in Table IV. The values were obtained by curve-fitting I-V and frequency response characteristics. It should be noted that while the 0.8 and 0.5 μm processes are commercially available, the 0.1 μm process is still experimental [14].
The receivers in this analysis use simple CMOS current-source inverters as amplifying stages. The circuit for a single stage is given by FET's M1, M2, and M3 in Fig. 2. FET's M1 and M2 form a current-source inverter which provides the amplification. M3 serves to shift the second-order pole to higher frequencies, by reducing the small-signal impedance on the inverter's output node; it also simultaneously reduces the gain of the inverter.
The gate-source voltage on M2 is chosen to be the same as its bias source-drain voltage, to ensure that it remains in saturation. The gain of the current-source inverter without M3 is then given by (19).
The gain depends only on the gate-source voltage of M1; the width of M1 does not appear in (19). When the effect of M3 is taken into account, the transconductance of M3 adds to the output conductances in the denominator of (19). This means the ratio of the width of M1 to the width of M3 becomes important. For simplicity, the widths of the transistors are given in terms of the minimum achievable width. Since only the ratio of the two widths is important, the width of M3 is chosen to be one, i.e., the minimum width, to minimize capacitance. With this choice, small values of the width of M1 correspond to low-capacitance, low-gain, high-speed amplifiers, whereas larger values correspond to higher-capacitance, higher-gain, lower-speed amplifiers. The gain with M3 at minimum width is given by (20), which contains the output resistance given by (21).
The feedback resistor lowers the gain of the amplifying stage it surrounds. Taking into account the loading of the feedback resistor, the loaded gain is given by (22).
The input and output capacitances of the amplifying stage can be written as (23) and (24), where the second term in (23) is due to the Miller effect. The width of M2 in terms of the width of M1 can be determined by equating the currents in the n- and p-channel devices, and is given by the width ratio in (25).
The pole at the output of the amplifying stage determines its 3 dB bandwidth. This pole can be written as (26).
The feedback resistor acts in parallel with the output resistance, moving the pole of the stage loaded with the feedback resistor to (27). The load capacitance in (26) and (27) is the input capacitance of the next amplifying stage, or the input capacitance of the decision circuit if this is the last stage in the receiver. Thus, given the process parameters, the gain and bandwidth of the CMOS current-source inverter can be written in terms of the stage design parameters.
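The shift of the output pole caused by the feedback resistor can be illustrated numerically. The sketch below assumes a single-pole RC model in which the pole frequency is set by the resistance and the total capacitance at the output node (stage output capacitance plus the load capacitance of the following stage); the component values and function names are hypothetical, and the exact expressions (26) and (27) may contain additional terms.

```python
import math


def parallel(r1: float, r2: float) -> float:
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def output_pole_hz(r_out_ohm, c_node_f, r_feedback_ohm=None):
    """3 dB frequency (Hz) of the stage output pole. A feedback resistor,
    if given, is taken in parallel with the output resistance."""
    r = r_out_ohm if r_feedback_ohm is None else parallel(r_out_ohm, r_feedback_ohm)
    return 1.0 / (2.0 * math.pi * r * c_node_f)


# Hypothetical values: 2 kohm output resistance, 50 fF node capacitance,
# 6 kohm feedback resistor.
unloaded = output_pole_hz(2e3, 50e-15)
loaded = output_pole_hz(2e3, 50e-15, 6e3)
print(f"{unloaded / 1e9:.2f} GHz -> {loaded / 1e9:.2f} GHz")
```

The same parallel combination that raises the pole frequency also lowers the stage gain, which is the gain-bandwidth trade described for the loaded stage.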
