Optical links are traditionally set to transmit maximum power for worst-case loss and consequently to dissipate more power than is required. We describe a technique to minimize power consumption based on the measured bit-error rate (BER) of the link. This technique uses a novel power-negotiation algorithm that optimizes the link power setting to achieve minimum power dissipation for a target BER. A 0.5 µm complementary metal-oxide semiconductor optical transceiver chip was fabricated, and a free-space optical interconnect system was built for validation. The results showed that the algorithm was able to find the optimum power settings for the VCSELs for a target BER and to account for dynamic changes such as variation in the optical loss in the system.
Introduction
Optical interconnects are potential candidates for high bandwidth, very short-range (VSR) communication. [1] [2] [3] [4] Large vertical-cavity surface-emitting laser (VCSEL) and photodetector arrays can now be reliably produced and integrated with electronics. Aggregate bandwidths of the order of terabits per second per square centimeter between chips have been reported. 5 The high density of very large-scale optoelectronic (OE) integration permits great flexibility in increasing aggregate bandwidth. The scaling of device arrays, decreasing physical dimensions, increasing speed, and tighter integration lead to higher power densities and a nonuniform thermal environment. Fluctuations in temperature and in the manufacturing process introduce variations in threshold current and slope efficiency of the VCSEL.
Current research suggests reducing the power dissipation by using zero-biased, low-ultralow threshold laser devices. 6 However, below-threshold biasing leads to a significant turn-on jitter, and it may not be power efficient for high-speed digital communications. 7 Kibar et al. propose a power-optimization method that takes the link design constraints as input and iterates over the design variables to find the optimum link design for a given bit rate. 8 Design constraints may include optical system efficiency, transceiver characteristics, and system fan-out, while design variables may be the number of stages in the transceiver, the feedback resistor, and the bias voltage. This technique emphasizes circuit-level power optimization in the design stage for a system with given parameters. Here we report on an alternative approach that utilizes an adaptive power-optimization methodology achieved by integrating power management firmware with the optical transceiver.
VSR optical links are usually designed to compensate for the worst-case power requirement for a given link configuration. Some advanced commercial VSR links use various monitoring techniques to detect the effects of device aging and adjust the laser driving strength accordingly. 9, 10 However, such a compensation scheme strives to maintain constant optical output power at the transmitter, without regard to optical link length, optical receiver performance, and optical channel losses. Consequently, this scheme usually results in unnecessarily high power consumption over short optical links. In contrast, wireless links use sophisticated algorithms and circuits in both transmitters and receivers to reduce power consumption. For example, in the Bluetooth wireless standard, the transceiver power is dynamically adjusted, depending on the link data rate and the signal attenuation associated with the communication channel. 11 We present protocols and circuits with which to achieve power-saving and optimization characteristics for VSR optical data links in much the same way as is done in wireless links. Our approach focuses on optimizing the power consumption of optical links based on the bit-error rate (BER) of each optical link by use of a built-in power-negotiation algorithm. Theoretical analysis of the optical communication channels reveals the dependence of the channel BER on the modulation and bias current settings of the VCSELs. Based on this dependence, we develop an algorithm with which to adjust the power settings of the optical transmitter according to the BER feedback from the receiver. The setup is shown in Fig. 1 . This algorithm converges to an optimal setting for each transmitter that results in the minimum power consumption for a given target BER.
This algorithm can be utilized not only when the system is first powered on but also during normal system operation if the link is idle or a change has been made. In addition, for a system that has built-in error-detection capability, such an algorithm can be run in the background even during real data transmission, and the accumulated BER can be used as the feedback to optimize link power consumption. Therefore it provides the system with a pseudo-real-time power-optimization capability. Here it is assumed that such capability is available or can be added to the system. For a system that lacks error-detection functions, the power-saving algorithm can be run when the system is taken off line, and the resultant optimized transmitter setting can be saved for use in actual data transmission. In such a system, some control mechanism, which can be a simple state machine, can be used to determine the system's operating mode and ensure a smooth transition among various operating modes. The triggering events could be the idle state of the channel or any detected change in system performance. Additionally, the power-optimization algorithm can be run periodically if needed. The frequency of rerunning the algorithm depends on the requirements of each system, such as the stability of the operating environment and the characteristics of live traffic. Such a specification usually needs to be made during the design stage and may be adjusted accordingly over time.
Many commercial optical receiver modules include an electrical output that monitors average optical power. This output is typically used to set up the initial link operation and make coarse adjustments. Achieving a specific BER on that link still requires performing a BER measurement. The power-optimization algorithm presented here is based on BER measurement. One of the major advantages of this technique is that the link BER accounts for all the factors that affect link performance, including OE device performance and electrical transceiver design as well as the attenuation associated with the noisy communication channel. Therefore a BER-based optimization approach provides the system with a simple control mechanism with which to achieve both low-power and high-performance characteristics.
Meanwhile, one of the challenges of implementing this algorithm is the long test time required for a high-confidence BER measurement for a multigigabit-per-second link with a low target BER ($<10^{-12}$) by use of the conventional test method. For example, it takes more than 28 h to perform a BER measurement that allows 256 errors to occur for a target BER of $10^{-12}$ on a link running at 2.5 Gbits/s. Fortunately, various accelerated BER extrapolative techniques to speed up the measurement process and reduce the test time to minutes have been proposed. 12 Recent research has demonstrated the possibility of integrating such accelerated BER measurement capability into optical transceivers. 13 Alternatively, in some applications, various coding and decoding schemes can be used to improve the final BER performance while they relax the actual BER required of the link to a much higher range, such as $10^{-9}$ and greater. 14 Incorporation of the accelerated BER measurement circuits and other control mechanisms inevitably increases design complexity and eats into the system's power budget, however. In addition, for a system that has a large number of channels, the total processing time for performing such a power-optimization algorithm can be significant. Those are the trade-offs that need to be well considered and carefully balanced during system design. Nevertheless, we believe that, for most VSR optical systems, the benefits of such a BER-based power-optimization technique can well offset these design overheads, and an overall power saving together with high performance can be achieved.
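The 28 h figure follows from simple arithmetic: accumulating N errors at a given error rate takes on average N/BER bits, and dividing by the bit rate gives the measurement time. A minimal Python sketch of that estimate follows; the function name `ber_test_time_hours` is ours and purely illustrative.

```python
def ber_test_time_hours(target_ber, errors, bit_rate_bps):
    """Average time, in hours, to accumulate `errors` bit errors on a link
    running at `bit_rate_bps` whose true error rate equals `target_ber`."""
    bits_needed = errors / target_ber        # expected number of bits sent
    seconds = bits_needed / bit_rate_bps     # time to send that many bits
    return seconds / 3600.0


# Values quoted in the text: 256 errors, BER of 1e-12, 2.5 Gbits/s.
hours = ber_test_time_hours(1e-12, 256, 2.5e9)
print(f"{hours:.1f} h")
```

With the numbers from the text the estimate is about 28.4 h, which is why accelerated BER measurement or a relaxed raw BER target matters for a practical implementation.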
An optical transceiver chip was designed and fabricated by use of a 0.5 µm complementary metal-oxide semiconductor (CMOS) silicon-on-sapphire process. The chip was integrated with either a VCSEL or photodetector chips by flip-chip bonding. The hybrid chips communicated with a field-programmable gate array (FPGA) in which the power-negotiation algorithm was implemented. A free-space optical interconnect system was built, and the power-negotiation algorithm was tested with a 1.8 Gbit/s optical link. Our test results showed that the algorithm was able to find the optimum modulation and bias settings for the VCSELs to achieve a target BER, despite changes such as variation in the optical loss in the system.
To our knowledge, this is the first demonstration of adaptive power optimization based on a link BER for digital optical communication. The proposed approach provides the capability of link optimization and thus facilitates efficient thermal management for densely integrated systems. The utilization of digitally tunable drivers and receivers offers individual control of OE devices, which provides an efficient way to compensate for parameter variations caused by device and thermal environment nonuniformity. The benefit of this algorithm can also be extended to OE packaging to relax optical alignment requirements and permit dynamic variation within the optical system. For example, the system can compensate for slight optical misalignment by adjusting the transmitted power while it maintains its performance. This paper is organized as follows: In Section 2 we analyze the effect of operating conditions on BER. This analysis is followed in Section 3 by details of our power-saving algorithm and its implementation. The custom transceiver design, chip layout, and integration of optical system are introduced in Section 4, and test results are presented in Section 5.
Theoretical Background
For a single-channel, on-off keyed, intensity-modulated link, error-causing noise may include thermal noise, shot noise, and laser relative intensity noise. The sum of all noise sources is assumed to have a Gaussian distribution. An estimate of error probability $P_e$ or the BER with decision threshold $I_D$ is given by 15

$$P_e = \frac{1}{4}\left[\operatorname{erfc}\left(\frac{I_1 - I_D}{\sigma_1\sqrt{2}}\right) + \operatorname{erfc}\left(\frac{I_D - I_0}{\sigma_0\sqrt{2}}\right)\right], \qquad (1)$$

where $I_1$ and $I_0$ are the induced photocurrents from the detector and $\sigma_1$ and $\sigma_0$ are the root-mean-square noise values when bit 1 and bit 0, respectively, are received. In Eq. (1), erfc is the complementary error function, defined as 16

$$\operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}}\int_x^{\infty}\exp(-y^2)\,dy. \qquad (2)$$

For most VSR optical systems of interest, the channel is nearly bisymmetric and the probability that bit 1 or bit 0 will be transmitted is assumed the same. The shot noise and the laser's relative intensity noise are small because of the highly sensitive receiver used and the small VCSEL driving current (a few milliamperes). Thus it is reasonable to assume that the thermal noise dominates and limits the receiver's performance, which is a good approximation for most systems that use p-i-n detectors. 15 Therefore the channel noises ($\sigma_1$ and $\sigma_0$) are independent of the data transmitted ($I_1$ and $I_0$), and we have $\sigma_1 = \sigma_0 = \sigma$. For the optimum decision threshold $I_D = (I_1 + I_0)/2$, which corresponds to setting the decision threshold midway between $I_1$ and $I_0$ to achieve the minimum BER, Eq. (1) can be simplified to

$$P_e = 0.5\,\operatorname{erfc}\left(Q/\sqrt{2}\right), \qquad (3)$$

where

$$Q = \frac{I_1 - I_0}{2\sigma}$$

is the Q factor at the decision circuit.
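The simplified relation between the Q factor and the BER is easy to evaluate numerically. The short Python sketch below does so with the standard-library `math.erfc`; the helper name `ber_from_q` is ours, and the sketch assumes Gaussian noise, equal noise on both levels, and a midway decision threshold, as in the text.

```python
import math


def ber_from_q(q):
    """BER for a given Q factor: P_e = 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


# Tabulate a few Q factors of practical interest.
for q in (6.0, 7.0, 8.0):
    print(f"Q = {q:.0f}: BER = {ber_from_q(q):.2e}")
```

A Q of 6 corresponds to a BER of roughly $10^{-9}$ and a Q of 7 to roughly $10^{-12}$, so a modest increase in signal swing buys several orders of magnitude in error rate.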
As the induced photocurrent ($I_1$ or $I_0$) is proportional to the optical power incident upon the detector, and the link's optical power is determined by the driving current for the VCSEL ($I_{on}$ for data 1 and $I_b$ for data 0) with direct intensity modulation, Eq. (3) can be written as
In Eq. (5), $\eta$, $\alpha$, and $R$ are the quantum efficiency of the laser, the link loss from the laser to the detector, and the responsivity of the photodetector, respectively.
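As a hedged sketch of this substitution (our reading of the definitions above, not necessarily the authors' exact form), assume that the laser output above threshold is linear in current, $P = \eta (I - I_{th})$, and take the usual Q factor $Q = (I_1 - I_0)/(2\sigma)$ for equal noise on both levels and a midway threshold. The threshold current then cancels in the difference between the two levels:

```latex
% Requires amsmath. Assumes linear L-I above threshold and equal noise sigma.
\begin{align*}
I_1 - I_0 &= R\,\alpha\,\eta\,(I_{on} - I_b), \\
Q &= \frac{I_1 - I_0}{2\sigma}
   = \frac{\eta\,\alpha\,R\,(I_{on} - I_b)}{2\sigma}, \\
P_e &= 0.5\,\operatorname{erfc}\!\left(
        \frac{\eta\,\alpha\,R\,(I_{on} - I_b)}{2\sqrt{2}\,\sigma}\right).
\end{align*}
```

Under these assumptions the above-threshold BER depends only on the current swing $I_{on} - I_b$, scaled by the product $\eta \alpha R$ and the noise. The scaling factor $\gamma$ used later in the text lumps these same quantities together; its exact normalization is not assumed here.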
Equation (4) is true only for a VCSEL biased above its threshold current. When a VCSEL is biased below its threshold current ($I_{th}$), a significant turn-on delay occurs at the rising edge of the optical pulse. 7, [17] [18] [19] This delay is a result of the time that it takes for the photon population to build up in the laser cavity after the carrier density has been decaying during its off period, i.e., when it is biased below threshold. Consequently the turn-on delay varies from pulse to pulse, depending on the data rate and on the total off period before the pulse, which results in pattern-dependent jitter. An approximate form for the worst-case error rate below threshold is 20
where the bit rate, $B$, is usually in the range of a few gigabits per second for practical optical links. $t_{d,max}$ is the maximum turn-on delay at the rising edge of the optical pulse and can be evaluated by 7, 21
where $\tau$ is the carrier recombination time and typically has a value of the order of 1 ns for a VCSEL. In Eq. (6), function $h(Bt)$ describes the pulse shape as a result of filtering and equalization at the receiver, which in many cases can be approximated by a raised cosine function:
Substituting Eqs. (7) and (8) into (6), we obtain
In this expression, F represents the maximum time delay between the injected current pulse and the output light pulse for below-threshold conditions. The cosine term represents the reduction in the detected signal that is due to this delay. This reduction is caused by not sampling at the optimal location within the received eye, where we assume that the optimal sampling time is determined for a received raised cosine pulse shape by use of the above-threshold conditions. In most systems of interest, delay F is typically small, and the error rate is dominated by the term $I_{on} - I_{th}$.
Whereas the expression for the error rate is approximate, it models the fact that, for below-threshold conditions, the error rate increases because of pattern-dependent jitter. Therefore, one can determine a complete representation of the BER as a function of transmitter power setting by combining Eq. (4) for biasing above-threshold current with Eq. (9) for biasing below-threshold current.
To further illustrate the magnitude of the parameters described above, we take a 1.5 Gbit/s free-space optical link as an example; a VCSEL and a photodetector are used, and we assume that $\eta = 0.5$ mW/mA, $I_{th} = 1$ mA, and $\tau = 1$ ns for the VCSEL; $\alpha = 0.1$ for link loss; and $R = 0.4$ mA/mW for the detector. We choose a Q of 6 ($P_e = 10^{-9}$) [...] from $I_{th}$ in either direction will result in degrading the BER. For a given $I_b$, however, the BER decreases with increased $I_{on}$. In other words, a BER valley exists that is parallel to the $I_{on}$ axis at $I_b = I_{th}$, where the lowest BER is located. Therefore this BER valley indicates that there is an optimal power setting that results in a minimum BER.
To further illustrate the dependence of BER on power setting, in Fig. 3 we have plotted the BER contours with BERs of $10^{-6}$ to $10^{-15}$. Each curve corresponds to a certain BER, and the X and Y coordinates of any point on the curve represent the power setting ($I_b$ and $I_{on}$) that results in the value of the BER. For a given BER the minimum value of $I_{on}$ occurs when the laser is biased at threshold. The average power consumption of the laser can be approximated by

$$P_{avg} \approx \frac{V_{on}\,(I_{on} + I_b)}{2},$$

where $V_{on}$ is the turn-on voltage of the laser and we assume that data 1 and 0 are transmitted with equal probability. Therefore, to minimize the power consumption for a certain BER value, we need only to find the power setting along the BER contour that results in the minimum sum of $I_b$ and $I_{on}$. In most cases, because $I_{on}$ increases on both sides of threshold biasing, it can be seen that the power setting at the BER valley floor leads to the minimum sum of $I_b$ and $I_{on}$, and thus to the minimum power dissipation.
The above analysis illustrates the BER as a function of $I_{on}$ and $I_b$, assuming that parameters $B\tau$, $\gamma$, and $I_{th}$ are fixed. In the following paragraphs we investigate how these three parameters individually affect the BER contours. First we look at the effect on BER contours of changing $B\tau$. Figure 4 shows a comparison between two plots with $B\tau = 1.5$ and $B\tau = 3$. The value of $B\tau$ has no effect on the curves for above-threshold biasing because they overlap exactly for both values of $B\tau$. It affects the slopes of curves only for below-threshold biasing. As $B\tau$ increases, the slope becomes greater. Therefore, for a reasonably large $B\tau$, threshold biasing remains the optimum power setting and does not vary significantly with changing $B\tau$.
However, at small $B\tau$, typically as $B\tau$ drops below 1.2, the minimum power setting will no longer be at the threshold biasing point. Instead, zero bias is the lowest-power solution. 20 Generally, for semiconductor lasers, the rising speed of a pulse is closely related to the relaxation oscillation frequency of the laser, which is approximately proportional to the square root of the current density above threshold. Therefore, for very high data rates (e.g., 10 Gbits/s), the VCSEL is usually biased well above the threshold current to boost the pulse edge rate 17; this operating regime is not considered here. For the remainder of this paper it is assumed that the value of $B\tau$ is greater than 1.2, unless an exception is explicitly stated. This assumption is justified because, for values of $\tau$ of the order of 1 ns, this corresponds to a signaling rate of approximately 1.2 Gbits/s. This rate is the approximate signaling rate of current gigabit-per-second physical layer standards such as Fibre Channel and Gigabit Ethernet that use 8B10B line coding. Thus the use of the BER valley will produce minimum power for existing high-speed interconnect standards.
The scaling factor, $\gamma$, takes into account the quantum efficiency of the laser, the optical loss, and the responsivity of the photodetector. As shown in Fig. 5, the BER contours for $\gamma = 3.7\ \mathrm{mA}^{-1}$ shift vertically to higher values of $I_{on}$ compared with the contours for $\gamma = 4\ \mathrm{mA}^{-1}$, indicating that, with less noise, the same error rate can be obtained with a lower current. There is no shifting horizontally along the $I_b$ axis, because $I_{th}$ is a constant and all slopes remain the same. For a specific value of BER, the optimum power setting ($I_{on}$ only) is also moved upward, while the optimum $I_b$ remains the same. Figure 6 shows the effects of different VCSEL threshold currents. All the BER contours shift both horizontally and vertically for an increased $I_{th}$. The new BER valley is aligned with the new $I_{th}$, and the contours for above-threshold biasing remain the same. Therefore, the optimum setting for both $I_b$ and $I_{on}$ needs to be changed if the threshold current changes. If more than one parameter changes, the BER contours are the superposition of the shifts caused by each individual parameter.
In summary, for any given target BER, the power setting in the BER valley floor results in minimum power consumption. If we can develop an algorithm to locate the valley floor and use the optimum setting for the transmitter, it will lead to the lowest power dissipation for this link. In addition, the algorithm needs to be able to recover the optimum power setting by reexamining the BER information when link performance is affected by changed parameters. The detailed algorithm is given in Section 3.
Fig. 4. Effect of $B\tau$ on BER contour: $B\tau$ affects only the slopes of contours for below-threshold biasing. Units are the same as for Fig. 3.
Fig. 5. Effect of $\gamma$ on BER contour: Changing $\gamma$ causes the contour to shift vertically, and a lower $I_{on}$ is required for the link with a higher $\gamma$ to yield the same error rate. Units are the same as for Fig. 3.
Power-Negotiation Algorithm and Implementation
The complete power-negotiation algorithm is composed of two parts: a BER tester (BERT) and a decision algorithm. The BERT provides the BER information on the optical link. The decision algorithm adjusts the power ($I_{on}$ and $I_b$) based on the channel BER information by following the analysis in Section 2.
The decision algorithm can be further divided into two subcategories: a power-on optimization algorithm, which can be used to set the initial optimum setting when the system is powered up, and a real-time correction algorithm, which can be used to fine-tune the setup when changes have been made that cause the BER to deviate from the target value.
A. Power-On Optimization Algorithm
The goal of the power-on optimization algorithm is to find the optimum power setting before transmission of the real data. Following the steps below will lead to the optimal setting, as illustrated in Fig. 7:
(1) Set bias current $I_b$ above the threshold current suggested by the device's manufacturer and set $I_{on}$ at the low end of the operational range.
(2) Keep $I_b$ fixed and keep increasing $I_{on}$ until the target error rate is achieved.
(3) Decrease $I_b$ and $I_{on}$ together until the BER starts to be greater than the target BER.
For $B\tau > 1.2$, the algorithm converges to the power setting corresponding to the BER valley floor. In the case of lower data rate systems for which $B\tau < 1.2$, it may converge to a power setting that has $I_b$ below threshold, resulting in an average power consumption that is slightly higher than the one biased at zero current.
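The three steps above can be sketched in Python against a toy link model. Everything named here is hypothetical: `toy_ber`, the 0.1 mA DAC step, the threshold at code 10, the scaling factor of 4 per mA applied directly inside erfc, and the linear below-threshold penalty are illustrative assumptions that stand in for a real BER measurement. They are not the authors' implementation or their below-threshold model.

```python
import math

STEP_MA = 0.1   # hypothetical DAC step size, mA per code
TH_CODE = 10    # hypothetical laser threshold, in DAC codes (1.0 mA)
GAMMA = 4.0     # assumed scaling factor, 1/mA


def toy_ber(code_b, code_on):
    """Toy BER model with a valley at threshold (illustrative assumption)."""
    if code_b >= TH_CODE:
        # Above threshold the BER depends only on the current swing.
        swing = (code_on - code_b) * STEP_MA
    else:
        # Below threshold the "0" level emits no light and turn-on delay
        # costs signal; modelled here as a simple linear penalty.
        penalty = max(0.0, 1.0 - 0.5 * (TH_CODE - code_b) * STEP_MA)
        swing = (code_on - TH_CODE) * STEP_MA * penalty
    if swing <= 0.0:
        return 0.5
    return 0.5 * math.erfc(GAMMA * swing)


def power_on_optimize(measure_ber, target_ber, start_b, max_code=63):
    """Return (bias code, on code) found by the three-step search."""
    # Step 1: bias above threshold, on-current at the low end of the range.
    code_b, code_on = start_b, start_b
    # Step 2: hold the bias and raise the on-current until the target is met.
    while measure_ber(code_b, code_on) > target_ber:
        if code_on >= max_code:
            raise RuntimeError("target BER not reachable")
        code_on += 1
    # Step 3: lower both together while the target BER is still met.
    while code_b > 0 and measure_ber(code_b - 1, code_on - 1) <= target_ber:
        code_b -= 1
        code_on -= 1
    return code_b, code_on


setting = power_on_optimize(toy_ber, 1e-9, start_b=13)
print(setting)
```

In this toy model the search stops with the bias exactly at the threshold code, because lowering both currents further shrinks the usable swing and pushes the BER above the target. Working in integer DAC codes keeps the comparison free of floating-point drift.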
B. Real-Time Correction Algorithm
The goal of the real-time correction algorithm is to correct the power setting to account for system parameter deviations between the data transmission periods. The power-on optimization algorithm could be rerun to yield the optimum power setting in this case. However, a fast real-time correction algorithm is needed to reduce the number of iterations and speed up the process of fine-tuning the power setting.
Various factors, such as temperature variation, attenuation changes in the optical path, and optical misalignment, can lead to operating-parameter changes. Each of these factors may cause one or more parameters to change. For example, increasing temperature will lead to an increased $I_{th}$ and a decreased $\gamma$. Changing attenuation in the optical path affects only $\gamma$. However, when the link BER has changed in a real system, the cause of the change may be unknown. Regardless, one thing remains the same: The optimum power setting is still located at the BER valley floor. Therefore the goal remains to find the new BER valley floor and then adjust $I_{on}$. The following procedure describes the steps of this algorithm as illustrated in Fig. 8:
(1) Fix the current $I_{on}$; slightly change $I_b$ toward the direction of decreasing BER until the BER begins to increase again. For changing attenuation (only $\gamma$ changes), this will result in the same $I_b$ because the BER valley floor has not been affected.
(2) Fix $I_b$ and change $I_{on}$ toward the target BER.
Figure 9 shows a flow chart of the negotiation algorithm's implementation. It starts with a transmitter sending a pseudorandom bit sequence (PRBS) over the optical link to a receiver. The receiver checks incoming data and records the number of errors. Once the data transmission is done, the error information is sent back to the transmitter and the decision algorithm makes the proper adjustments to the power setting for the VCSEL. Therefore each iteration involves one BER collection and one power-setting adjustment. This iteration repeats until the optimum setting has been found.
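The two-step correction procedure can be sketched in Python as a small hill climb on the bias followed by an adjustment of the on-current. As before, the link model is a toy: `make_toy_link`, the DAC codes, the 0.1 mA step, and the linear below-threshold penalty are hypothetical stand-ins for the measured BER feedback, chosen only so that the model has a BER valley at threshold.

```python
import math


def make_toy_link(th_code, gamma, step_ma=0.1):
    """Build a toy BER function with a valley at `th_code` (assumption)."""
    def measure_ber(code_b, code_on):
        if code_b >= th_code:
            swing = (code_on - code_b) * step_ma
        else:
            penalty = max(0.0, 1.0 - 0.5 * (th_code - code_b) * step_ma)
            swing = (code_on - th_code) * step_ma * penalty
        return 0.5 if swing <= 0.0 else 0.5 * math.erfc(gamma * swing)
    return measure_ber


def real_time_correct(measure_ber, code_b, code_on, target_ber, max_code=63):
    """Re-centre the bias on the BER valley, then retune the on-current."""
    # Step 1: with the on-current fixed, move the bias in whichever
    # direction lowers the BER, and stop when the BER starts to rise.
    best = measure_ber(code_b, code_on)
    for direction in (+1, -1):
        moved = False
        while 0 <= code_b + direction <= max_code:
            trial = measure_ber(code_b + direction, code_on)
            if trial >= best:
                break
            code_b += direction
            best = trial
            moved = True
        if moved:
            break
    # Step 2: with the bias fixed, move the on-current toward the target.
    while measure_ber(code_b, code_on) > target_ber and code_on < max_code:
        code_on += 1
    while code_on > 0 and measure_ber(code_b, code_on - 1) <= target_ber:
        code_on -= 1
    return code_b, code_on


# Threshold drifts upward (e.g., heating): the bias follows the valley.
print(real_time_correct(make_toy_link(12, 4.0), 10, 21, 1e-9))
# Only the scaling factor drops (e.g., more optical loss): the bias stays put.
print(real_time_correct(make_toy_link(10, 3.7), 10, 21, 1e-9))
```

The second call illustrates the remark in step (1): when only the scaling factor changes, neither neighbouring bias code improves the BER, so only the on-current is raised.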
Hardware Implementation
A CMOS chip with four optical transceivers was fabricated, and OE devices were flip-chip bonded on top of the CMOS chip. The hybrid chip was then wire-bonded to a carrier board that was eventually integrated onto a motherboard to form an FPGA-based demonstration system. Four unidirectional optical links were established between two chips by use of free-space optics, and the power-negotiation algorithm was implemented in the FPGA to provide maximum flexibility in its development and testing.
Fig. 7. Steps show the power-on optimization algorithm converging to the optimum power setting when the system is powered up.
Fig. 8. Steps show the real-time correction algorithm fine-tuning the power setting during data transmission. Dashed curves, original system setup; solid curves, the new BER contour after the system's operating parameters have been changed.
A. Complementary Metal-Oxide Semiconductor Integrated Circuit Design

Vertical-Cavity Surface-Emitting Laser Driver Design
Inasmuch as the light output of a VCSEL is linearly proportional to the VCSEL's driving current above threshold, the goal of the driver design is to provide accurate tunable modulation and bias current for the VCSELs. They are obtained by use of two 4-bit current-mode digital-to-analog converters (DACs), one for modulation and one for bias current control. The power settings are loaded and stored in register-based storage cells to provide independent controls for each VCSEL, which allows great flexibility in compensating for potential device nonuniformity and parameter variation across a large OE array. Figure 10 shows a schematic of a 4-bit current-mode DAC. It is based on a current mirror structure, and the transistors in each branch are sized proportionally and binary weighted. The reference current input, $I_{ref}$, is the step size for the current setting and can be adjusted externally to meet the requirements of a specific system. Figure 11 shows a comparison of the current output of the DAC for simulation and probed data while the digital settings are swept from 0000 to 1111 with a current step size of 0.3 mA. Good linearity of current stepping is observed, as well as a close correlation between the simulation and the measurements.
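The ideal transfer function of such a binary-weighted DAC is simple to write down: each set bit switches in a mirror branch whose current is a power-of-two multiple of the reference. The Python sketch below models only that ideal behaviour, using the 0.3 mA step quoted above; the function name `dac_current_ma` is ours, and no mismatch or nonlinearity is modelled.

```python
def dac_current_ma(code, i_ref_ma=0.3, bits=4):
    """Ideal output current (mA) of a binary-weighted current-mode DAC."""
    if not 0 <= code < (1 << bits):
        raise ValueError("code out of range for this DAC width")
    # Bit k enables a branch sized 2**k times the reference current.
    return sum(((code >> k) & 1) * (1 << k) * i_ref_ma for k in range(bits))


# Sweep a few settings between 0000 and 1111.
for code in (0b0000, 0b0001, 0b1000, 0b1111):
    print(f"{code:04b} -> {dac_current_ma(code):.1f} mA")
```

With a 0.3 mA step the full-scale setting 1111 corresponds to 4.5 mA, which bounds the modulation or bias current that one such DAC can deliver.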
Optical Receiver Design
The receiver is designed in such a way that it complements the operation of the transmitter. It has a large dynamic range with the ability to trade between gain and bandwidth during operation. It is composed of two DACs, two preamplifiers, a three-stage differential postamplifier, and a current-mode logic (CML) driver, as shown in Fig. 12. The purpose of using two preamplifiers is to achieve a better matching between the data and the threshold voltage channel. The use of DACs in both channels allows the dc performance of the preamplifiers to be tested without the need for integration with OE devices. In addition, during the receiver's normal operation the DAC in the dummy preamplifier is used to source a dc current to generate a threshold voltage for the decision circuit for data comparison. The analog test unit consists of a transmission gate with a 50 kΩ resistor connected in series and is designed to monitor the analog output of the preamplifiers. During the data transmission, the test bus senses the average voltage at the data channel and provides a reference voltage at which the optimum threshold voltage should be set in the dummy channel for the decision circuits. It also provides a simple and efficient way to gauge the optical power received by the detector and facilitates optical alignment during system integration. The decision circuits and postamplifiers consist of three stages of fully differential amplifiers for data recovery and waveform shaping. Finally, the CML driver is used to send the data off chip electrically. It has a 100 Ω differential output impedance and can deliver a 400 mV differential peak-to-peak output. Figure 13 shows schematics of the transimpedance section of the preamplifier. Three digital inputs (Gmin, Gmed, and Gmax) are for gain control, which, when they are enabled, give 1, 2.5, and 10 kΩ transimpedance gain, respectively. The preamplifier also has a power-control transistor, which can digitally power down the amplifier to reduce power consumption when the link is idle.
The use of the DACs and the analog test bus to enhance the receiver's testability exhibits negligible penalty on the overall noise, speed, and power consumption of the receiver. Each DAC adds a mere 6.7 fA/√Hz noise and less than 10 fF parasitic capacitance to the preamplifier input, which can be largely ignored in comparison with an input-referred current noise of 4.1 pA/√Hz of the preamplifier and the ~400 fF capacitance from the photodiode and packaging. The receiver has a sensitivity of 40 µA when gain setting Gmed is enabled, which leads to power consumption for the DAC that is of the order of 0.1 mW. By contrast, the complete receiver consumes an average power of 30 mW during normal operation. Each DAC takes an area of 60 µm × 18 µm in a 0.5 µm process.
Complementary Metal-Oxide Semiconductor Technology and Chip Layout
Our chip has been designed and fabricated by use of Peregrine Semiconductor Corporation's 0.5 µm, ultrathin, silicon-on-sapphire (UTSi/SOS) process. 22 UTSi is a newly developed silicon-on-insulator technology in which devices are made in a very thin silicon layer (100 nm) on top of an insulating sapphire substrate. Compared with bulk silicon technology, it offers a number of advantages for mixed-signal circuit designs, including low power consumption, reduced parasitic capacitance, and minimum cross talk. The sapphire substrate is also a good thermal conductor, and the thermal expansion coefficient of gallium arsenide (GaAs) matches that of sapphire more closely than that of bulk silicon. In addition, the optically transparent substrate facilitates the flip-chip attachment of the top-emitting OE devices, which makes it an ideal candidate for OE integrated circuit (IC) designs.
In an area of 4.4 mm × 1.5 mm, four single-ended optical transceivers are implemented with 64 perimeter pads and 16 pairs of flip-chip bonding pads in the center for OE array integration. Although the chip is designed as a full transceiver, because of physical constraints only two 1 × 4 OE arrays (either VCSEL or photodetector arrays) are attached to a given chip. As a result, the system demonstration uses two chips and the optical links between the two are unidirectional. Because the OE arrays have a pitch of 250 µm while our optical transceivers are spaced at 500 µm, only half of the OE devices in two 1 × 4 arrays are actually connected to the transceivers and are active during normal operation. A micrograph of the chip is shown in Fig. 14. The driver and receiver circuits are 650 µm × 75 µm and 740 µm × 77 µm, respectively.
The chip was manually laid out by use of a custom layout tool from Cadence Design Systems, and transistor-level simulation was performed with Spectre simulator. Special care was taken at the layout level to ensure signal integrity. Power and ground rails were separated for analog and digital circuits, and abundant decoupling capacitors were added to minimize the switching noise and the cross talk through the power-ground network. An interleaving scheme, a dual-gate structure, and dummy gates were used for better matching for the preamplifiers and the differential circuits. Also, on-chip electrostatic discharge circuitry was included for both the perimeter and the flip-chip bonding pads. 23 
Design Overhead for Integrated Power-Negotiation Algorithm
Because of the limited chip area available in our fabrication, the BERT and the decision algorithm are not integrated into our optical transceiver chip. In practice, it is possible to incorporate them into an optical transceiver chip. For example, parallel electrical transceivers that include on-chip BERTs are commercially available. 24 Typically, a BERT consists of a PRBS generator and a verifier. For a 15 bit sequence, it could take approximately 300 µm × 400 µm in a 0.5 µm process and consume 140 mW of power at 1.5 Gbits/s, based on our simulation. The cost of the decision algorithm is of the order of a few hundred gates. In addition, one power-negotiation algorithm implementation can be shared among many channels to further reduce the design overhead.
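To make the generator-plus-verifier structure concrete, the Python sketch below models a software BERT. It assumes that the 15 bit sequence is a maximal-length PRBS of period $2^{15} - 1$ built on the widely used polynomial $x^{15} + x^{14} + 1$, and that the verifier is already synchronized to the transmitter. Both are assumptions on our part, and the names `prbs15` and `count_errors` are illustrative only.

```python
from itertools import islice


def prbs15(seed=0x7FFF):
    """Yield bits of a PRBS-15 sequence from a 15-bit shift register."""
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    while True:
        # Feedback is the XOR of the two oldest bits (stages 15 and 14).
        new_bit = ((state >> 14) ^ (state >> 13)) & 1
        state = ((state << 1) | new_bit) & 0x7FFF
        yield new_bit


def count_errors(received, seed=0x7FFF):
    """Verifier: compare received bits with a locally regenerated reference."""
    return sum(rx != ref for rx, ref in zip(received, prbs15(seed)))


# Send 10 000 bits and flip three of them to emulate channel errors.
sent = list(islice(prbs15(), 10_000))
received = list(sent)
for position in (17, 4_242, 9_999):
    received[position] ^= 1

errors = count_errors(received)
print(errors, errors / len(received))
```

In hardware the same shift register and XOR form both the generator and the reference inside the verifier, so the error count that feeds the decision algorithm costs little beyond a counter.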
As described in Section 3, this power-negotiation algorithm requires a bidirectional path: a forward data path and an error feedback path. However, the error feedback does not require high-speed links, and one feedback link can be shared among multiple receivers. In our case, an electrical feedback path was used to pass the error information.
B. System Integration
Integration of a Complementary Metal-Oxide Semiconductor Chip with an Optoelectronic Device
The OE devices used are an Emcore high-performance 1 × 4 VCSEL and 1 × 4 GaAs p-i-n photodiode arrays on a 250 μm pitch. The VCSEL has a nominal wavelength of 850 nm, a typical slope efficiency of 0.45 mW/mA, and a differential resistance of 50 Ω. The p-i-n photodiodes have a typical responsivity of 0.5 A/W and a capacitance of 0.4 pF. They are both rated for 3.125 Gbit/s applications. 25 Because flip-chip bonding offers low parasitic capacitance and high integration density, it was chosen for the integration of the CMOS chip and the OE devices, which was done by the Peregrine Semiconductor Corporation. Figure 15 shows the CMOS chip with two photodetector arrays attached. As both VCSEL and photodetector arrays are top-emitting devices, the laser beams emit through the sapphire substrate of the CMOS chip during their operation.
Chip-on-Board Packaging
A 5 cm × 3.8 cm carrier printed-circuit board (PCB) was designed and fabricated. The hybrid CMOS chip with the OE arrays attached was placed in the center of the carrier PCB and wire-bonded to it. Directly underneath the hybrid CMOS chip, a 4.5 mm × 1.2 mm opening was made on the carrier PCB for optical access.
Motherboard
A 10 layer, 21.6 cm × 21.6 cm FR4 motherboard was designed and fabricated for final system integration. Two carrier boards were connected perpendicularly to the motherboard simultaneously by Mictor impedance-controlled connectors. Right-angle metal clamps were used to stabilize the attachment. The whole setup was assembled on top of an optical breadboard, which provided good vibration isolation and enhanced system stability. The complete demonstration system is shown in Fig. 16.
A high-performance FPGA (XC2VP7) from the Xilinx Virtex-II Pro family was used as a central control unit. It has eight full-duplex multigigabit serial transceiver cores (RocketIO, Xilinx, Inc.) that include an embedded serializer-deserializer, clock-data-recovery circuits, and the options of an 8B10B and cyclic redundancy check (CRC) encoder and decoder. The CML input-output connections are used for high-speed data transmission and also provide programmable impedance termination, a pre-emphasis level, and differential output swing. All these features provide great flexibility in system testing.
Free-Space Optical Interconnection
A free-space optical interconnection system was used to establish communication links between two chips. Two seven-element Universe Kogaku f-1.2 lenses were used to link a pair of carrier PCBs, separated by 89 mm. A CCD camera permitted direct observation of both the VCSEL and detector arrays to facilitate alignment. The voltage drop of the receiver preamplifier output through the analog test bus was monitored during active alignment, when the VCSELs were turned on. Owing to the large disparity between the resolution of the incident spots (full width at half-maximum, ≈50 μm) and the separation of the detectors (500 μm), there was no detectable optical cross talk between adjacent channels. In addition, with only four channels and little scattering, there was not a significant level of ambient light.
Experimental Results
A. Test Setup Overview and Initial System Testing
The goal of this test was to see how well the power-negotiation algorithm works in a practical serial data transmission system, in which serializer-deserializer and clock-data-recovery circuits are normally used. Here, we used the FPGA with integrated serial transceivers to fulfill such test needs. All experiments were carried out based on the custom-built test bed described above, and a block diagram of the system is shown in Fig. 17.
A free-space optical system was constructed to establish optical links between the transmitter and receiver chips. The FPGA sources and sinks data through the RocketIO serial transceivers and also configures the optical transceiver chips through low-speed lines. The BERT and the decision algorithm were implemented on the FPGA, and a custom interface was developed to permit communication between the RocketIO transceiver and the test logic. To reflect the actual performance of the data link, the 8B10B and CRC coding and decoding options of the RocketIO were bypassed and disabled during the testing.
Extensive testing was performed to verify the proper functioning of the BERT logic, the quality of data transmission between the RocketIO transceivers, the stability of the free-space optical interconnects, and the performance of the optical transceivers.
We verified the error-detection capability by introducing artificial errors to the BERT through VHDL coding and comparing them with the detected error bits. Our test results showed that the correct number of errors was detected and displayed on the LEDs on the motherboard. To verify the integrity of the electrical data transmission, we ran the data links between the RocketIO transceivers in the loop-back mode continuously for several hours, and no errors were detected. To show reliable, extended operation, we turned on four optical channels simultaneously with the I_on and I_b of the VCSELs set to 4.8 and 1.2 mA, respectively. The system was run continuously for 8 h, and no errors were observed.
To test how fast our optical transceiver could operate, we used a pattern generator to send a PRBS to the transmitter and monitored the output of the receiver with an oscilloscope. Figure 18 shows an eye diagram of the receiver output at 2 Gbits/s. Although the bandwidth of our transceiver design is >2 Gbits/s, all experiments were conducted at a rate of 1.8 Gbits/s, which was limited by the clock oscillator available at the time of experiment.
B. Power-Negotiation Algorithm Test Results
Both the power-on optimization algorithm and the real-time correction algorithm were implemented. A PRBS pattern of length 2^20 − 1 was chosen for the BER testing, and a BER of 3 × 10^−9 was selected as the target BER. With a system clock of 90 MHz, the optical links were running at a data rate of 1.8 Gbits/s, which corresponds to a B of 1.8.
During the testing of the power-on optimization algorithm, an optical filter with an optical density of 0.2 (equal to a 63% transmission rate) was inserted between the two compound lenses to attenuate the optical power incident upon the detector. To ensure accurate BER measurement, we ran each iteration long enough to allow more than 256 errors to be received. The current step size of the transmitter was 0.25 mA, and the minimum gain setting was selected on the receiver side. The results of each iteration of the algorithm were tracked and monitored by use of displays on the motherboard. The result of this test was that the algorithm converged to a power setting with an I_on of 2 mA and an I_b of 0.75 mA at a data rate of 1.8 Gbits/s. Figure 19 shows the BER and the power setting of each iteration that the algorithm went through.
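The 256-error criterion fixes how long each iteration must run. As a rough illustration, the sketch below estimates the dwell time for a link operating exactly at the target BER, and the statistical uncertainty of the resulting estimate under the assumption that bit errors are independent (Poisson counting). That independence assumption and the function names are ours, not part of the reported implementation.

```python
import math


def bits_required(target_ber, min_errors):
    """Expected number of bits to observe min_errors errors at target_ber."""
    return min_errors / target_ber


def measurement_time_s(target_ber, bit_rate_bps, min_errors):
    """Expected dwell time in seconds for one BER measurement."""
    return bits_required(target_ber, min_errors) / bit_rate_bps


def relative_uncertainty(n_errors):
    """1-sigma relative error of a BER estimate, assuming Poisson counting."""
    return 1.0 / math.sqrt(n_errors)


t = measurement_time_s(3e-9, 1.8e9, 256)
print(f"{t:.1f} s")               # 47.4 s
print(relative_uncertainty(256))  # 0.0625
```

At the target BER of 3 × 10^−9 and 1.8 Gbits/s, an iteration therefore lasts on the order of 47 s. Settings with a higher BER reach 256 errors sooner, and settings with a lower BER take proportionally longer.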
To show that this derived power setting is the desired optimum power setting for the VCSEL, we used different power settings near the derived setting to run the link and we measured the BER for each setting. The results showed that for any other power setting that led to equal or less power consumption of the optical link, a higher BER than the target BER was observed. Thus it is verified that our power-on algorithm is able to find the optimum power setting that results in the minimum power consumption for a given target BER. The algorithm converged to the same setting each time after being rerun several times, which illustrates the stability of the algorithm's implementation with a fixed operating environment. The algorithm was also run on other channels in multiple transceiver chips, and all the results showed its capability of locating the optimum power setting. The low modulation current needed can be explained by the low-loss (short) optical path and the high sensitivity of the receiver.
To verify that the derived optimum power setting corresponds to the BER valley floor where the threshold current is located, a verification of the threshold current of the VCSEL was carried out. An optical power meter was placed between the two lenses to measure the optical output of the VCSEL under test with other channels turned off. The digital power setting was stepped through, and the optical output power was recorded for each step. Figure 20 shows a plot of the optical output power versus the current driven into the VCSEL. We extrapolated a threshold current of 0.75 mA by fitting a straight line to the data and extending the line to zero output power (its intercept with the current axis), which shows the agreement between the derived optimum I_b and the actual threshold current of the device.
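The extrapolation can be sketched as a least-squares fit to the lasing portion of the light-current curve, followed by solving for the current at which the fitted line reaches zero optical power. The data below are synthetic, generated from the nominal 0.45 mW/mA slope efficiency and a 0.75 mA threshold purely for illustration; they are not the measurements of Fig. 20. The 0.05 mW cutoff used to exclude below-threshold points is likewise an assumed value.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_current(currents_ma, powers_mw, min_power_mw=0.05):
    """Return (threshold in mA, slope efficiency in mW/mA).

    Only points above min_power_mw (an assumed cutoff) are fitted, so that
    the spontaneous-emission region below threshold does not bias the line.
    """
    lasing = [(i, p) for i, p in zip(currents_ma, powers_mw) if p > min_power_mw]
    if len(lasing) < 2:
        raise ValueError("need at least two points above threshold")
    slope, intercept = fit_line([i for i, _ in lasing], [p for _, p in lasing])
    return -intercept / slope, slope  # current at which the line hits 0 mW


# Synthetic L-I data in 0.25 mA steps (illustrative only, not measured).
currents = [0.25 * k for k in range(13)]
powers = [max(0.0, 0.45 * (i - 0.75)) for i in currents]
i_th, eta = threshold_current(currents, powers)
print(f"I_th = {i_th:.2f} mA, slope = {eta:.2f} mW/mA")
```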
To demonstrate the ability of the real-time correction algorithm to find the optimum power setting when the system's operating environment was changed, we used an optical filter with an optical density of 0.5 (equal to a 32% transmission rate) instead. The real-time correction algorithm was executed, and the same I_b of 0.75 mA was found, together with an I_on of 2.75 mA for this setup, which illustrated the ability of the algorithm to locate the optimum power setting under dynamic conditions. The power-on algorithm was also run with this setup, and the same power setting was found, which shows agreement between the power-on and real-time correction algorithms.
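The transmission figures quoted for the two filters follow from the definition of optical density, T = 10^(−OD). A short check, with function names chosen here for illustration:

```python
def transmission(optical_density):
    """Fraction of optical power transmitted by a filter of the given OD."""
    return 10.0 ** (-optical_density)


def loss_db(optical_density):
    """Equivalent attenuation in decibels (10 dB per unit of OD)."""
    return 10.0 * optical_density


for od in (0.2, 0.5):
    print(f"OD {od}: T = {transmission(od):.1%}, loss = {loss_db(od):.1f} dB")
# OD 0.2: T = 63.1%, loss = 2.0 dB
# OD 0.5: T = 31.6%, loss = 5.0 dB
```

Replacing the OD 0.2 filter with the OD 0.5 filter therefore adds 3 dB of optical loss, roughly halving the power that reaches the detector.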
To further illustrate that threshold biasing yields a better link performance than below-threshold biasing, we show a comparison of the eye diagrams of the receiver output in Fig. 21. A significant pattern-dependent jitter resulting from the turn-on delay can be observed at the rising edge in Fig. 21(b) when the VCSEL is biased at below-threshold current. This confirms that threshold biasing minimizes data-dependent jitter and results in better performance for the optical link.
The experimental results given above verified the performance of the power-negotiation algorithm for a multigigabit-per-second optical link (B > 1.2). To determine how well the algorithm will work in a system at a lower data rate (B < 1.2) we carried out separate experiments on a 100 Mbit/s link with a selected target BER value of 10^−7. The algorithm resulted in 4.5 mW power consumption for the transmitter, while zero biasing produced a minimum of 4.0 mW to achieve the same BER. Thus, even for a low data rate (B < 1.2), the algorithm can produce a nearly optimal power setting that is close to the minimum setting achieved by zero biasing.
C. Discussion
Although our experiments were carried out with dc coupled receivers and a free-space optical interconnect system, the proposed power-negotiation algorithm can be extended to ac coupled receivers and fiber-optic communication systems. In our experimental setup there were no measurable levels of optical cross talk or ambient light caused by scattering. Therefore in this system a Gaussian approximation is applicable and agrees with the noise variations measured for data 1 and data 0 transmission. In other systems, which suffer from non-Gaussian noise sources such as cross talk, high aggregate levels of scattered light, and modal noise in fiber links, extending this technique may require additional theoretical analysis with non-Gaussian noise models, and the performance of the power-negotiation algorithm needs to be reexamined.
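Under the Gaussian approximation, the BER is commonly related to a Q factor by BER = 0.5 erfc(Q/√2), with Q = (μ1 − μ0)/(σ1 + σ0) formed from the means and standard deviations of the received data 1 and data 0 levels. The sketch below uses this textbook relation, which we assume rather than take from the analysis in this paper, to show what signal-to-noise margin a given target BER implies. The function names are illustrative.

```python
import math


def ber_from_q(q):
    """BER of a binary link with Gaussian noise and an optimal threshold."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


def q_factor(mean1, mean0, sigma1, sigma0):
    """Q factor from the statistics of the received 1 and 0 levels."""
    return (mean1 - mean0) / (sigma1 + sigma0)


def q_for_ber(target_ber, lo=0.0, hi=40.0, iterations=200):
    """Invert ber_from_q by bisection (the BER decreases monotonically with Q)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ber_from_q(mid) > target_ber:
            lo = mid  # BER still too high, so a larger Q is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(f"{ber_from_q(6.0):.2e}")  # 9.87e-10
print(f"Q for 3e-9: {q_for_ber(3e-9):.2f}")
print(f"Q for 1e-7: {q_for_ber(1e-7):.2f}")
```

If the noise is not Gaussian, for example when cross talk or modal noise dominates, this mapping no longer holds, which is why the algorithm would need to be reexamined for such systems.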
Conclusions
An adaptive power-optimization technique for optical links based on channel bit-error-rate performance has been proposed and demonstrated. Based on an analysis of optical links dominated by additive Gaussian noise, a power-negotiation algorithm has been developed with which to determine the optimum power setting for a given target BER. Details of the algorithm and its implementation have been presented. An optical transceiver demonstration system based on VCSELs and photodetectors has been built and described. The power-optimization algorithm was implemented, and the test results proved the concept. By using these techniques it is possible to build simpler, more power-efficient optical links that have the flexibility to account for dynamic changes.
