Abstract-The experimental operation of a terabit-per-second scale optoelectronic connection to a silicon very-large-scale-integrated circuit is described. A demonstrator system, in the form of an optoelectronic crossbar switch, has been constructed as a technology test bed. The assembly and testing of the components making up the system, including a flip-chipped InGaAs-GaAs optical interface chip, are reported. Using optical inputs to the 
I. INTRODUCTION
THERE has been a long-standing concern that, as the clock rates of silicon VLSI chips increase (along with the rapid rise in transistor counts), the supply of data to and from each chip at the required overall rates will form a serious performance bottleneck. Conventional electrical connections have fundamental physical limits, explored at length by a number of authors [1], [7]. The essential problem is that, for a constant connection length, the data bandwidth of an electrical link is determined by its area. As interchip communication rates increase and the area available per link reduces (as a consequence of higher connection densities) it is inevitable that the fundamental limit will become a significant problem. This scenario has been recognized by the Semiconductor Industry Association [2], which has been supporting research on advanced interconnect technologies for a number of years.
Whereas it is not the only possible solution, the use of optical and optoelectronic interconnect techniques to bypass the electrical communication bottleneck is an attractive option worthy of serious investigation. Unlike electrical links, optical communication (on the scale being considered here) does not suffer from intrinsic frequency-dependent losses. It also has a number of other potential advantages such as reducing power consumption (when compared with terminated lines), offering a high signal fan-out capability, and isolation.
The key challenge is to develop efficient optoelectronic interface techniques compatible with high density connections to conventional high-performance silicon-CMOS chips. The study reported in this paper was motivated by the need to explore such technologies, with the aim of providing large numbers (1-10 k) of optical connections operating at the on-chip clock rate (0.1-GHz). A critical aspect of this study was to investigate the extent to which these optoelectronic connections, including thousands of photo-current receivers (in this case, analog transimpedance amplifiers), could be embedded within a fully functional digital chip. It is emphasised that the system aspects of the study (in particular, the architecture of the data switch that we have constructed), while being of possible interest, were not the main driver behind this research. Rather, we have designed a shoe-box sized technology testbed, capable of carrying out a recognisable task (signal routing under header control), while simultaneously demanding the correct operation of a number of key optical and optoelectronic interconnect functions (electrical-optical conversion, optical fan-out, optical clock distribution). Thus the demonstrator system we have built cannot be regarded as a prototype of a real switch but instead acts as a means to show the feasibility of terabit-per-second interconnectivity to Si-CMOS while permitting exploration of key enabling technologies.
0018-9197/$20.00 © 2005 IEEE
Fig. 1. Schematic representation of demonstrator system. While the layout is illustrated here, the principal components and the demonstrator operation are described fully in the text. DOE 1 and DOE 2 are diffractive optical elements, PBS-A and PBS-B are polarizing beamsplitters. The element labeled "λ/4" is a quarter-wave plate.
At the heart of the system described further below is a hybrid component comprising a 0.6-μm CMOS integrated circuit with an InGaAs-GaAs optoelectronic interface chip, assembled directly on top, with 4000 solder-bump connections providing direct links between optical detectors/modulators and small-scale CMOS input/output (i/o) pads.
This "smart-pixel" approach, in which optical inputs and outputs are distributed periodically across the full area of the electronic chip, has been explored by a number of other research groups [14], [16]. Lentine et al. [15] demonstrated a smart-pixel chip with optical i/o operating at 650 Mb/s, on an overall scale compatible with terabit-per-second communication. That work built on an extensive program of research by the group at Bell Laboratories based on the use of GaAs modulators and detectors flip-chip assembled onto the silicon chip. One advantage of the InGaAs devices used in the work reported here is the operating wavelength (around 1 μm) which, being in the region of transparency for GaAs, avoids the necessity for substrate removal. This significantly simplifies the manufacturing process and is expected to increase reliability and device yields.
This study was one component of a more extensive research program [4] investigating a wide range of optical interconnect options as part of the European Commission's Long Term Research Programme. The conclusions of the Microelectronics Advanced Research Initiative OPTO program are included in [3].
II. SYSTEM DESIGN
The design of the optoelectronic system is described fully in an earlier paper [5]. Here, we summarize the main components and their overall configuration.
The demonstrator system took the form of a 62 × 64 optoelectronic crossbar switch. The layout of the system is shown schematically in Fig. 1. The (electrical) input to the system was in the form of 62 parallel data channels (each 250 Mb/s) plus a differential clock signal (250 MHz), which together drove an 8 × 8 array of vertical-cavity surface-emitting lasers (VCSELs). The relatively modest data rates corresponded to limits imposed by the silicon foundry available at the time the chips were designed. The VCSELs were 10-μm diameter devices, operating at 960-nm wavelength. The VCSEL emission was collected by two-stage collimating optics, described more fully below, and then routed through imaging optics, containing an 8 × 8 diffractive fan-out element, to the InGaAs-CMOS hybrid switching chip, shown on the right of Fig. 1.
The optical interface portion of the switching chip was constructed from an InGaAs-GaAs strain-balanced multiple-quantum-well p-i-n structure, fabricated into detectors and modulators. The detectors were laid out as 64 arrays of 8 × 8 elements, each array also including a pair of output modulators. This chip was solder-bump flip-chip bonded to a custom-designed silicon-CMOS chip.
The hybrid chip was configured overall as an 8 × 8 array of "superpixels," each with 64 input detectors, plus a differential output modulator pair. Each set of detectors, corresponding to one superpixel, received an optical replica of the 64 inputs. The signals were passed, via the flip-chip connections, to the silicon chip, which was responsible for header decoding and the routing of a recognized packet to the specific output controlled by that superpixel. The switching scheme included simple contention resolution but no signal buffering. The superpixel output signals left the chip in differential form, via the modulator pairs, once again in optical mode. Modulator read-out was accomplished by an array of beams deriving from a Nd:YLF laser. The reflected light, carrying the output data streams, was polarization-routed to a pair of fiber pig-tailed differential detectors, which were used to sample any chosen output channel. Two of the 64 inputs provided differential clock signals which, like the data, were distributed to each superpixel optically. This avoided the need to route any high frequency signals across the chip, even for full 1 Tb/s operation, with all such signals being confined within a superpixel with optical i/o only. The two clock inputs carried a 250-MHz differential clock signal and were handled by dedicated receivers.
III. COMPONENT TESTING AND SYSTEM CONSTRUCTION
In this section, we describe the results from tests carried out to assess the performance of the various components making up the demonstrator system.
A. Optical System
The three multielement lenses in the system implemented a telecentric 4-f relay from the input VCSEL array to the optoelectronic smart-pixel array and a second such relay from the smart-pixel array to an output detector plane [8]. In addition, the same optical system had to deliver the read-out laser beams to the output modulators and route the reflected signals to the output plane. The optical design took into account the dimensions of the detectors and modulators (35 μm), the field size at the smart-pixel chip (17.5 mm diagonal), and the requirement that Lens 2 (see Fig. 1) be achromatised over the wavelength range 960-1047 nm to accommodate the VCSEL input beams and the read-out beams from the Nd:YLF laser. (Slightly different wavelengths were intentionally specified to aid efficient signal routing through common optics.) Optical design, lens tolerancing, and system modeling were carried out using CodeV. The modeled input and output arms of the system are shown in Fig. 2, including the two 4-element and one 5-element lenses designed specifically for the system. The assembly tolerances for the lenses were relatively tight. In each, the interelement spacing was controlled to 10 μm and the element tilt to better than 1 arc minute. Centration tolerances varied between elements but the critical Lens 2 had centration requirements down to 10 μm. Aluminum barrels, brass mounting rings and spacers were machined to the required precision, and a custom-made jig with optical monitoring was used to ensure the correct assembly.
The performance of the overall optical system was assessed by measuring the spot sizes in the extreme corners of the optical field at the optoelectronic smart-pixel chip, the position of lowest resolution. The modeling of the input arm predicted spot sizes for the corner input beams to be 20 μm (90% enclosed energy). Measurements on the corner superpixels of the assembled system showed spots with 90% enclosed energy diameters in the range 21-24 μm. The modeled positional accuracy of the input beams was better than 2 μm across the entire smart-pixel chip and system tests bore this out. The detectors were fabricated with a diameter of 35 μm. Allowing for the measured spot sizes, the expected effects of spot position variations and VCSEL wavelength variations (which, combined with the diffractive optics, also lead to spot shifts), there remained a tolerance to further misalignment in excess of 3.5 μm. This was found to be sufficient.
An illustration of the optical system in operation is presented in Fig. 3. It shows an image of the complete hybrid chip with a test pattern from the VCSEL input array fanned out to each of the 64 superpixels. A single superpixel is also shown enlarged in the figure to reveal the accurate beam placement achieved by the optical system.
Measurements were made of the optical power losses in the input arm of the demonstrator system, i.e., between the VCSEL array and the hybrid chip plane. The optical power budget calculations (see [5]) predicted an input arm transmission of 45%. The measured transmission was 44%. This was consistent with the power necessary for the detectors/receivers to operate at the required data rates.
Other key elements of the optical design were the routing beam-splitters (see Fig. 1). To avoid specifying the polarization state of the VCSEL array emission, these were required to act as polarization-independent reflectors at the VCSEL wavelength (960 nm). At the same time they needed to act as polarizing beam-splitters at the read-out Nd:YLF laser wavelength (1047 nm). A further constraint was the wide angular acceptance (7.5°) needed as a result of the large optical field. The chosen PBS design was based on a dichroic air-spaced construction, rather than the usual cemented glass cube configuration, to exploit the inherent asymmetry and larger index difference. The materials used for the high/low refractive index coatings were TiO₂ and SiO₂. A total of 27 layers were evaporated onto a substrate of B270 resulting in an overall coating thickness of 4.7 μm. Experimental tests of the fabricated beam-splitters showed a contrast greater than 2:1 between s- and p-polarization reflection at 1047 nm and, importantly, more than 99% reflectivity of both s- and p-polarizations at 956 nm. These characteristics were maintained over a 16° angular range centred at 45°. The designed and measured spectral performances of these beam-splitters are shown in Fig. 4. It is worth noting that the field requirements and the 45° angle required the larger of the two beam-splitters to be 64 × 45 mm in size.
The use of bulk imaging optics in this manner is attractive in a number of ways. First, a single lens is capable of creating millions of free-space high-frequency optical connections simultaneously. Second, as shown above, combined with Fourier-plane diffractive optics a range of interconnect functions can be created. The main drawback of the approach is the overall size required of an optical system suitable for imaging 10 mm area chips with micrometer resolution. Miniaturization could be achieved by making more extensive use of short-focal length microlens arrays or hybrid combinations of micro- and mini-lenses as investigated by Kirk et al. [20]. However, this restricts the use of Fourier-plane routing techniques. Alternatively, the "planar optics" approach, explored by Jahns [18], [19], offers the option of preserving image relay optical connections in a more compact format. Ultimately, the benefits of a more complex interconnect functionality need to be assessed in the context of size constraints.
Fig. 5. The 70-kHz data through silicon chip (electrical test). Top trace is the output from the modulator driver of superpixel #54, address "110 110". Middle trace is input to the data receiver: a start bit "1," followed by a header matching the output address of a superpixel, followed by data. Lower trace is the clock required during the header decoding phase. (a) Looking at output of modulator driver at superpixel 54 when data input is correctly addressed "110 110." (b) Looking at output of superpixel 54 when data is wrongly addressed "101 011" (= 43).
B. Silicon Electronics
The silicon chip at the core of the demonstrator contained receivers for detectors, drivers for the modulators and digital logic for header-recognition and data routing. The mixed-signal VLSI design of the smart-pixel chips is described in our earlier publication [5] and the details of the transimpedance amplifier design are provided in [11]. The silicon chips were based on 0.6-μm CMOS and fabricated using the commercial foundry service Thesys (a subsidiary of Austrian Mikro Systems). The modest 0.6-μm technology was chosen for reasons of cost and availability.
Fabricated CMOS chips were tested prior to flip-chip assembly. For these tests, a chip was mounted in a carrier and electrically probed. The routing circuitry was provided with an electrical clock signal at 70 kHz; this low frequency clocking capability having been included for test purposes. A successful trial is illustrated by the results shown in Fig. 5(a). The middle trace records the electrical input signal simulating the photo-current from a detector. This was supplied, via a probe, to a receiver (flip-chip) contact on the silicon chip. From left to right: the header has a "1" to indicate the start of valid data, followed by the address "110 110" (i.e., 54) and then a data stream. The top trace shows the output signal at superpixel number 54 (binary address 110 110), observed by probing the modulator drive pad. The lowest trace shows the electrical clock that is required during the address-decoding phase. Fig. 5(b) shows an equivalent set of signals but with a different address on the data, "101 011" (i.e., 43), which (correctly) is not routed to the output modulator of superpixel 54. A series of tests of this type demonstrated, as far as they went, the correct operation of the silicon chip.
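The header-recognition behaviour exercised in these tests can be illustrated with a short behavioural sketch. It is not the circuit implemented on the chip; it only models the frame format described above (a start bit "1", a 6-bit address sent most-significant-bit first, then data). The function name `decode_packet`, the leading idle zeros, and the example payload are assumptions introduced for illustration.

```python
from typing import List, Optional

ADDRESS_BITS = 6  # 64 superpixels, so a 6-bit output address


def decode_packet(bits: List[int], my_address: int) -> Optional[List[int]]:
    """Return the payload if the packet is addressed to this superpixel.

    Assumed frame layout: optional idle zeros, a start bit "1", a 6-bit
    address (most-significant bit first), then the data stream.
    Returns None when the packet is not for this superpixel.
    """
    try:
        start = bits.index(1)  # first "1" marks the start of valid data
    except ValueError:
        return None  # no start bit seen
    header = bits[start + 1 : start + 1 + ADDRESS_BITS]
    if len(header) < ADDRESS_BITS:
        return None  # truncated header
    address = 0
    for b in header:
        address = (address << 1) | b  # shift in bits, MSB first
    if address != my_address:
        return None  # header does not match: data is not routed
    return bits[start + 1 + ADDRESS_BITS :]


# Hypothetical payload; addresses taken from the two test cases in the text.
payload = [1, 0, 1, 1, 0]
frame_54 = [0, 0, 1] + [1, 1, 0, 1, 1, 0] + payload  # address 110 110 = 54
frame_43 = [0, 0, 1] + [1, 0, 1, 0, 1, 1] + payload  # address 101 011 = 43

routed = decode_packet(frame_54, my_address=54)   # payload passes through
blocked = decode_packet(frame_43, my_address=54)  # None: wrong address
```

In this model every superpixel would run the same check against its own address, which mirrors the way each superpixel in the demonstrator receives a copy of all inputs and passes only the packet whose header matches.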
C. InGaAs-(Al)GaAs Optoelectronic Devices
The optical interface function for the switching chip was provided by an array of 35-μm diameter detectors and modulators fabricated in InGaAs-(Al)GaAs as p-i-n diodes and containing 95 quantum wells in the intrinsic region. The same device structure was exploited for both the detectors and the modulators (see [5] for further details). In total the InGaAs chip contained 4096 detector and 128 modulator mesas over its 150 mm² area. Fig. 6 shows the layout of one quarter of the InGaAs chip. The set of detectors corresponding to each superpixel were arranged as an 8 × 8 square array on a pitch of 150 μm. A differential modulator pair was positioned in the centre of each array. A common bias was provided to all detectors and, independently, to all of the modulators. The strain-balanced quantum wells were designed for operation at the read laser wavelength (1047 nm) and the structures were grown by MBE on a GaAs substrate (transparent at both input and read wavelengths) with a graded InAlGaAs layer to achieve the necessary lattice constant for the subsequent growth. When the devices were operated as detectors, an excellent responsivity of better than 0.6 A/W was measured in the wavelength range of interest (Fig. 7) [17]. When operated as quantum-confined Stark effect modulators, a contrast ratio of 2:1 (or a modulation depth of 30%) has been measured for individual devices. This modest contrast ratio is a key reason for operating differential pairs of modulators to improve the signal quality.
The integration of our fabricated InGaAs device arrays with the custom CMOS chips was carried out using standard flip-chip technology, available commercially. The design of the InGaAs devices incorporated the flip-chip bonding pads directly on top of each mesa, with the optical i/o being through the substrate. Before the flip-chip bonding was carried out, the GaAs substrate was polished and anti-reflection coated for the more critical input wavelength of 960 nm. The assembled hybrid chips were wire-bonded into pin-grid-array chip carriers and subsequently inserted into custom-made printed circuit boards.
The modulator devices described above and used in the demonstrator were operated with a 5 V bias plus 5 V signals from the CMOS driver circuits. The prospects for future generations of multiple quantum-well (MQW) modulator structures have also been investigated under the umbrella of this interconnect project. One aspect of modulator development that is of particular importance is the move to lower voltage devices to match the evolution of future CMOS operating voltages. We have investigated the application of the asymmetric Fabry-Pérot modulator (AFPM) approach [9] to improving the performance of our InGaAs devices. A thin film transfer matrix model was developed to model InGaAs-based MQW resonator structures, operating as free-space reflective modulators. The results are encouraging, insofar as they predict that reflectivity changes of greater than 50% with contrast ratios of 7.5:1 can be achieved for operating voltages down to 1 V. Table I summarizes these modeling results showing the reflectivity of the front and back mirrors required to achieve maximum modulation. The high finesse asymmetric Fabry-Pérot cavities that are found to be required can have acceptable angular and spectral tolerances because they are kept short, below 15 wavelengths, i.e., within the microcavity regime. A major problem, particularly in the InGaAs-(Al)GaAs system, is the production of hard metal mirrors. High reflectivities, greater than 99%, are required in order to keep the cavity short, and these can be difficult to achieve in Bragg-type reflectors. A parallel experimental and theoretical study was undertaken to look at the problems of depositing high reflectivity metal mirrors. It is found that high reflectivity leads to a reduction in electrical conductivity and vice versa.
Table I. This table shows the reflectivity of the front and back mirrors required to achieve maximum modulation at these operating voltages where modulation is defined as percentage change in reflectivity (ΔR).
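The relationship between contrast ratio and reflectivity change quoted for these modulators can be made concrete with a small calculation. The sketch below assumes that "modulation" means the absolute change in reflectivity, ΔR = R_high − R_low, and that contrast ratio means R_high / R_low; the function name `reflectivity_levels` is hypothetical. The AFPM figures are quoted in the text as lower bounds ("greater than 50%"), so the second result is only indicative.

```python
from typing import Tuple


def reflectivity_levels(contrast_ratio: float, delta_r: float) -> Tuple[float, float]:
    """Solve R_high / R_low = contrast_ratio and R_high - R_low = delta_r.

    Assumes delta_r is an absolute reflectivity change (0..1), which is an
    interpretation of the figures in the text, not a statement from it.
    """
    if contrast_ratio <= 1.0:
        raise ValueError("contrast ratio must exceed 1")
    r_low = delta_r / (contrast_ratio - 1.0)
    r_high = contrast_ratio * r_low
    return r_low, r_high


# Demonstrator devices: contrast 2:1 with 30% reflectivity change.
demo_low, demo_high = reflectivity_levels(2.0, 0.30)    # about 0.30 and 0.60

# Modelled AFPM devices: contrast 7.5:1 with 50% reflectivity change.
afpm_low, afpm_high = reflectivity_levels(7.5, 0.50)    # about 0.077 and 0.577
```

Under these assumptions the two figures quoted for the demonstrator devices are mutually consistent: an off-state reflectivity near 30% and an on-state reflectivity near 60%.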
D. Optomechanics
While making no pretence to be a production prototype, it was essential that the system be a stable laboratory demonstrator. The optomechanical assembly therefore needed to provide micrometer stability, the necessary fine adjustment, and thermal control. The main mounting frame was constructed from a single piece of aluminum, milled using computer-controlled tools. The solid-model drawing may be seen in the centre of Fig. 8. The radiating fins of the heat-sink for the diode-pumped 500-mW Nd:YLF (modulator read) laser may be seen mounted beneath the optical components. The optical mounting system was based on well-established slot-plate technology [12], which gives a well-defined optical axis for barrel-mounted components, convenient focussing adjustment by hand and excellent stability.
The VCSEL and switching chips were mounted on PCBs, themselves held in custom six-axis positioners. The thermal control of both these chips was a vital consideration since the switching chip produced several watts of heat and the VCSEL chip needed to work at a specified temperature to ensure wavelength stability. Custom thermal mounts, Peltier cooler assemblies and electronic feedback controllers were designed and implemented to carry out these tasks. The assembled demonstrator may be seen in the photograph in Fig. 9. The substantial heat sink fins on the VCSEL chip are seen in the foreground and the large imaging lens in front of the switching chip to the rear is also clearly visible. The overall dimension of the optomechanical system was approximately 30 × 20 × 10 cm.
Fig. 9. Photograph of the manufactured system optomechanics with bulk lenses, beam splitters, and most other components in position. The VCSEL chip heat sink (near side) is clearly visible as is the PCB for the switching chip and the large imaging lens in front of it. The output plane is at the focal point of the lens to the right of the photograph.
E. VCSELs
The VCSEL arrays described in our earlier publication [5] were developed to provide uniformity of wavelength and power across all 64 elements. The wavelength criticality arose from the use of diffractive fan-out and the consequent shift in focal spots arising from any change in wavelength. This led to a specification of no more than 1% (9.7 nm). Variation of power delivered to the InGaAs detector array was less than 10%, permitting the sensitive data receivers to operate in their optimum range. The VCSELs were designed as top-emitting devices operating at a wavelength of around 960 nm. They were grown by metalorganic vapor phase epitaxy (MOVPE) on an n-type GaAs substrate. The structure consists of a one-wavelength AlGaAs cavity with three central InGaAs quantum wells, sandwiched between two reflecting stacks of alternating GaAs and AlGaAs layers with linear hetero-interface grading. A total of 42 layers formed the top (p-doped) mirror and 61 layers the bottom mirror. The pitch between the individual devices was 250 μm and the whole VCSEL array occupied an area of 2.8 × 2.8 mm. Devices based on a 14-μm mesa diameter, with the p-contact opening 10 μm in diameter, had the following operating values: mean threshold current 2.65 ± 0.05 mA (±2% across the array); mean threshold voltage 1.88 ± 0.01 V (±0.5%); and output power of 1.25 ± 0.02 mW (±1.5%) at 8 mA. The average peak power conversion efficiency was 6.3%. The VCSELs emit at 956 nm with a maximum variation of 0.7 nm.
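The sensitivity of a diffractive fan-out to wavelength can be estimated with a first-order sketch. In the paraxial limit, the lateral position of a diffracted spot in the focal plane is proportional to wavelength, so a fractional wavelength change produces the same fractional change in the spot's offset from the undiffracted axis. The function name `spot_shift_um` and the 4-mm example offset are hypothetical; the fan-out geometry is not given in this section, and the small-angle approximation is an assumption.

```python
def spot_shift_um(offset_mm: float, wavelength_nm: float, delta_nm: float) -> float:
    """Paraxial lateral shift (in micrometres) of a diffracted spot.

    offset_mm     : distance of the spot from the undiffracted (zero-order) axis
    wavelength_nm : nominal wavelength
    delta_nm      : wavelength deviation from nominal
    Uses x proportional to wavelength, i.e. dx = x * (d_lambda / lambda).
    """
    return offset_mm * 1e3 * delta_nm / wavelength_nm


# Hypothetical 4-mm fan-out offset, with the 956-nm wavelength and 0.7-nm
# variation quoted in the text.
example_shift = spot_shift_um(offset_mm=4.0, wavelength_nm=956.0, delta_nm=0.7)
# roughly 2.9 micrometres
```

The estimate shows why the outermost fan-out orders set the wavelength tolerance: the shift grows linearly with the distance of the spot from the optical axis, while the undiffracted order does not move at all.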
These arrays were used in the demonstrator experiments described below. Issues of device lifetime and the promise of greater output power (and, hence, system data rate) have driven the introduction of oxide-confined VCSEL arrays for our further system development [6]. The oxide-confined devices offer a threshold current of 0.74 mA and an output power of 2 mW (at 8 mA) [Fig. 10(a)]. They have an average power conversion efficiency as high as 14.3%. The oxide-confined arrays have been shown to have lifetimes in excess of 3000 h and wavelength uniformity better than (967 ± 0.7) nm. The array uniformity is illustrated in the measurements presented in Fig. 10(b).
Arrays of VCSELs offer potential system benefits but one important issue must be overcome in optical systems such as the one described here. The divergence of these VCSEL devices, at an operating current of 8 mA, was (13.0 ± 0.7)°. We needed to reduce the divergence, without significant loss of optical power, in order to avoid specifying Lens 2 with an impracticably low f-number.
The partial collimation of the VCSEL output beams was achieved using an 8 × 8 array of refractive microlenses fabricated by reflow of resist and subsequent etching into fused silica. The lenses were 190 μm in diameter on the same 250 μm pitch as the VCSELs. They had a focal length of 909 μm. The assembly scheme that we developed is the subject of a patent application [10] and is described more fully in [6]. It is illustrated in Fig. 11. The vertical standoff of the microlens array from the VCSEL array surface was 145 μm, which was achieved using a plastic supporting ring. The lateral alignment was performed using Fresnel zone plates that had been incorporated in the top-level mask of the VCSELs. The alignment is described more fully in [6]. We estimate a tolerance of 2 μm. The lens array was attached to the VCSEL chip using UV-curing optical adhesive. Note also in Fig. 11 the micro-machined grooves in the silica lens substrate to accommodate the bond wires of the VCSEL chip. The performance of the lens array is indicated by the measurements shown in Fig. 12. This shows the far-field irradiance profile of a VCSEL before and after integration of the microlens array. The reduction in beam divergence for the whole array was measured to be from (13.0 ± 0.7)° to (FWHM) at an operating current of 8 mA.
While the majority of the system test results reported below were carried out using the VCSEL array and microlens combination described above as a data input source, further work has been carried out to integrate the improved oxide-confined VCSEL arrays into our system. These have higher output divergence (23° FWHM) and so the microlens array was adapted. Microlenses, 50 μm in diameter and f/1.4, were manufactured in fused silica and anti-reflection coated. The microlens array was mounted upside down in an adaptation of the original design shown in Fig. 11 so that the smaller, faster lenses were then only 80 μm from the VCSEL chip surface. The resulting higher aberrations did not significantly degrade performance. The output beam profile from this VCSEL/microlens combination is shown in Fig. 13. The lenses were measured to have a greater than 90% collection efficiency.
IV. SYSTEM TEST RESULTS
For an experimental study such as this, the performance of the entire system is the key result. We demonstrated that the optoelectronic switch described here could successfully decode packet headers and route data in the manner required at a variety of data rates. Thus it was shown that data could be: 1) converted from electrical to optical-array format by the VCSEL array; 2) correctly routed by the optical system to the hybrid switching chip; 3) registered by the InGaAs detectors; 4) transferred to the silicon digital circuitry, via the solder-bump connections and the analog photo-current receivers; 5) passed through, according to the header code, to the specified InGaAs modulator pair; and 6) emitted from the switching chip, once again in optical array format. This was carried out under optical clock control at data rates up to the target rate of 250 Mb/s/channel.
An example of the system in operation is shown in Fig. 14. Three traces are presented for two different tests. In each case the top trace shows the signal fed to the VCSELs supplying the optical clock. This is present during header recognition, the only phase in which a clock is required. The middle trace shows a signal supplied to one of the 62 data VCSELs. It comprised a 6-bit address segment (with an extra leading "1" to indicate the start of the header sequence) followed by the data stream. The lower trace shows the optical output signal from one selected superpixel, as observed with a conventional detector (plus amplifier), imaging the output plane of the optical system. In Fig. 14(a) the header corresponded to a request for output number 27 (binary 011 011, most-significant-bit first). The output that was selected for monitoring (lower trace) was number 27 and showed the data signal to have been correctly routed to it. In Fig. 14(b), on the other hand, the experimental arrangement remained the same but the header was changed to a request for output number 31 (binary 011 111). Correctly, output channel 27 no longer received that data input.
These tests were performed with full optical fan-out, which, by supplying a complete copy of all the inputs to each superpixel, established the full connectivity of the switch. Thus, the input packet requesting output 27 was optically replicated (by the diffractive element) and sent to all 64 superpixels. The data rate used in this test was 50 Mb/s/channel. On the basis that a total of 62 data inputs could be handled simultaneously in this manner (plus two clock signals) the total number of inputs to the switching chip corresponded to 64 × 64 = 4096. Thus the aggregate data-rate being supplied to the CMOS chip (and being correctly processed) corresponded to a capacity of about 0.2 Tb/s. With the VCSEL power limited to about 1 mW, the 1-to-64 fan-out and the 44% efficiency of the optical input arm, the receiver sensitivity proved to be a critical limiting parameter. In order to explore higher data rate operation, we chose to increase the available optical power per channel by temporarily removing the fan-out element. Although this removed the routing capability of the switch, it still permitted full tests of each superpixel, in turn, as the optical input was directed at it. The results of tests at data-rates of 125 and 250 Mb/s are shown in Fig. 15(a) and (b), respectively. In all experiments the pre-fan-out power of the Nd:YLF read-out laser was in the range of 15-20 mW. In Fig. 15(a) the upper trace is the clock signal and the lower trace the routed output signal. For Fig. 15(b) the upper and lower traces are the input signal while the central trace is the routed output signal. The quality of the displayed signals in Fig. 15 was significantly impaired by the performance of the detection system (external to the switch), which was working near its bandwidth/sensitivity limit. Nonetheless we were able to conclude that the switching chip, when requested by the appropriate header, was successfully routing this higher-frequency data to the output.
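The reason receiver sensitivity became the limiting parameter can be seen from a rough per-detector power estimate. The sketch below uses the figures quoted in the text (about 1 mW per VCSEL, 44% input-arm transmission, 1-to-64 fan-out, responsivity of 0.6 A/W). It assumes that the 44% transmission does not already include the 1/64 split, that the fan-out divides power equally, and that removing the fan-out element leaves the arm transmission unchanged; none of these is stated in the text. The function name `detector_signal` is hypothetical.

```python
from typing import Tuple


def detector_signal(vcsel_mw: float, arm_transmission: float,
                    fan_out: int, responsivity_a_per_w: float) -> Tuple[float, float]:
    """Return (optical power in microwatts, photocurrent in microamps) per detector.

    Assumes an ideal, uniform 1-to-N split on top of the arm transmission.
    """
    power_uw = vcsel_mw * 1e3 * arm_transmission / fan_out
    current_ua = power_uw * responsivity_a_per_w  # uW * A/W = uA
    return power_uw, current_ua


# With the 1-to-64 diffractive fan-out in place.
with_fanout = detector_signal(1.0, 0.44, 64, 0.6)     # about 6.9 uW, 4.1 uA

# With the fan-out element removed (all power directed at one superpixel).
without_fanout = detector_signal(1.0, 0.44, 1, 0.6)   # about 440 uW, 264 uA
```

Under these assumptions each receiver sees only a few microamps of photocurrent with full fan-out, which is consistent with the decision to remove the fan-out element when testing at 125 and 250 Mb/s.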
On the basis that every superpixel had the capability to operate at the target frequency of 250 Mb/s, we concluded that the hybrid chip had the capacity to handle 1 Tb/s. We note that, when using the full optical fan-out, the system was being fully tested, in the sense that data was being supplied simultaneously to all receivers across the chip and every superpixel was set up to assess its 62 signal inputs and route them, according to the header, to the output. However, the study concentrated on the feasibility of dense integration of optoelectronic, digital and analog functionality, rather than exploring whether every element across the array was performing optimally. For this reason, relatively few data channels per device were tested and only a small number of hybrid chips manufactured. Given this limitation we cannot present any data on yield and reliability. Acquisition of robust yield data would have required statistically significant numbers of MBE growth runs and device fabrication cycles, and was beyond the scope of this investigation.
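The aggregate figures follow from simple arithmetic on the numbers given in the text: 64 optical inputs (62 data plus 2 clock) replicated to each of 64 superpixels. The sketch below reproduces that arithmetic. Counting the two clock channels at the full channel rate is an assumption made here for the headline figure; a data-only count is also shown. The function name `aggregate_tbps` is hypothetical.

```python
INPUTS_PER_SUPERPIXEL = 64  # 62 data channels + 2 clock channels
DATA_INPUTS = 62
SUPERPIXELS = 64            # 8 x 8 array


def aggregate_tbps(rate_mbps: float,
                   channels_per_superpixel: int = INPUTS_PER_SUPERPIXEL) -> float:
    """Aggregate rate into the chip in Tb/s for a given per-channel rate."""
    total_channels = channels_per_superpixel * SUPERPIXELS
    return total_channels * rate_mbps * 1e6 / 1e12


total_receivers = INPUTS_PER_SUPERPIXEL * SUPERPIXELS   # 4096 detectors

fanout_test = aggregate_tbps(50)                 # full fan-out test at 50 Mb/s
target = aggregate_tbps(250)                     # target rate of 250 Mb/s
data_only = aggregate_tbps(250, DATA_INPUTS)     # excluding the clock channels
```

At 50 Mb/s per channel the total is about 0.2 Tb/s, and at the 250 Mb/s target it is about 1 Tb/s whether or not the clock channels are counted, which matches the scale claimed for the hybrid chip.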
On the assumptions that 1) any local defects across the arrays tested could be eliminated by suitable process development, and 2) the input signals are provided by the higher power VCSEL arrays (described in Section III-E), we deduce that the system had the potential to operate with the internal aggregate bandwidth of 1 Tb/s and switch connectivity.
V. CONCLUSION
In summary, a digital switch incorporating a smart-pixel optoelectronic connection has been constructed and demonstrated to work. This study has shown the feasibility of using optical interconnect techniques in the terabit-per-second domain. Importantly, the experimental test-bed that we constructed was significantly more than an isolated optical interconnect, but was instead an example of how a highly parallel connection can be provided to a conventional CMOS chip operating a real-world application. Showing the operation of four thousand photo-current receivers embedded within digital electronics was a key result of the study. The ability to provide data links wherever required across the area of a chip, working at the local digital clock-rate, is likely to prove highly valuable. This will be particularly the case as high performance integrated circuits are developed in which even the internal clock-rates for cross-chip connections cannot match the local data communication rates.
The other advantage of constructing this full system is that it has required the development or refinement of a number of optical and optoelectronic techniques that may prove particularly important in future exploitation of this interconnect approach. These include: 1) VCSEL arrays, which are already finding many applications; 2) image relay optics with a very high space-bandwidth product, in this case capable of resolving 200 000 spots across a 17.5 mm field; 3) diffractive optics, which provide a very effective means of achieving high levels of signal fan-out; 4) InGaAs-based detectors and modulators suitable for flip-chip assembly with CMOS chips; and 5) small-area electronic receivers with good sensitivity and bandwidth.
A study of receiver performance [13] showed that, as CMOS feature sizes decrease, the power consumption of the types of receivers used in this work also reduces significantly. In the system demonstrated here, based on 0.6-μm CMOS, the total power consumption of the receivers (14 W, or 7 mW per receiver) was close to being a limiting factor. Calculations show that using 0.1-μm CMOS, with the same bit-rate, the power per receiver falls to 200 μW or 0.3 W/Tb/s. Importantly, it is precisely in the circumstance of high-density chips operating at faster clock rates (made possible by shrinking feature sizes) that the need for optical interconnects will become more urgent. Thus this approach offers a solution to future interconnect challenges that is compatible with advancing technology.
The experimental system described here demonstrates how parallel optoelectronic interconnects can access the terabit-per-second domain. Further studies are required to develop higher data rate systems of this type and to engineer production prototypes for real applications. The challenges include miniaturization, achieving acceptable manufacturing and environmental tolerances, and cost reduction. Such projects, using a range of novel techniques for the implementation of free-space interconnections, are already underway [18].
M. G. Forbes received the B.Eng. degree in electronics and physics from the University of Edinburgh, Edinburgh, U.K., in 1995 and the Ph.D. degree from Heriot-Watt University, Edinburgh, U.K., where he received the MacFarlane medal for his dissertation on CMOS photoreceiver design in 1999.
He is presently a Principal Design Engineer in the analog-mixed-signal engineering services division of Cadence Design Systems, Livingston, U.K., where he has worked in technical lead, circuit design, and laboratory characterization roles on a broad range of analog circuits in the areas of RF (2.4 GHz fractional-N Bluetooth PLL in 0.18-μm CMOS; IF gm-C filters and amplifiers for bipolar GPS; bipolar power-amplifier controller; 187.5 MHz fundamental mode crystal oscillator); data converters (sigma-delta audio DAC and ADC; 10-bit BiCMOS 500 kS/s SAR; 8-bit 10 MS/s interpolating DAC); data communications (quad 3.125 Gb/s transimpedance and limiting amplifiers in SOI CMOS; 500 MHz clock generation PLL for SPI4P2); and low-power design (power management for a digital hearing aid). He has delivered several analog macro blocks for end use in a 0.13-μm CMOS digital design flow. His thirteen technical publications include an invited book chapter on photoreceiver design.
