The experimental optical interconnection module of the Free-Space Accelerator for Switching Terabit Networks (FAST-Net) project is described and characterized. Four two-dimensional (2-D) arrays of monolithically integrated vertical-cavity surface-emitting lasers (VCSEL's) and photodetectors (PD's) were designed, fabricated, and incorporated into a folded optical system that links a 10 cm × 10 cm multichip smart pixel plane to itself in a global point-to-point pattern. The optical system effects a fully connected network in which each chip is connected to all others with a multichannel bidirectional data path. VCSEL's and detectors are arranged in clusters on the chips with an interelement spacing of 140 μm. Calculations based on measurements of resolution and registration tolerances showed that the square 50-μm detector in a typical interchip link captures approximately 85% of incident light from its associated VCSEL. The measured optical transmission efficiency was 38%, with the losses primarily due to reflections at the surfaces of the multielement lenses, which were not antireflection coated for the VCSEL wavelength. The overall efficiency for this demonstration is therefore 32%. With the measured optical confinement, an optical system that is optimized for transmission at the VCSEL wavelength will achieve an overall efficiency of greater than 80%. These results suggest that, as high-density VCSEL-based smart pixel technology matures, the FAST-Net optical interconnection concept will provide a low-loss, compact, global interconnection approach for high bisection-bandwidth multiprocessor applications in switching, signal processing, and image processing.
Introduction and Background
Multiprocessor problems in switching, signal processing, and computing are becoming increasingly performance limited by the bottlenecks associated with planar interconnection technologies. Three-dimensional free-space optical interconnection (FSOI) approaches offer the potential to overcome these bottlenecks through the geometric advantages associated with high-bisection-bandwidth (BSBW) link patterns 1 in multichip interconnection topologies. 2, 3 These advantages are based on the rapidly emerging technology of smart pixels, which combines high-density optoelectronic input-output (I/O) with VLSI electronic circuitry. Smart pixels are projected to achieve throughputs that exceed 1 Tbit/s/cm². 4 Systems that use smart-pixel-based FSOI provide two general capabilities for dense interconnections: parallel data transfer and parallel data interchange. Optical imaging provides a high-throughput approach to linking smart pixel planes for data transfer. In this case the high I/O density of smart pixels may provide a power consumption and size advantage over electronics. 5, 6 For data interchange FSOI provides the additional ability to perform spatial data partitioning and interleaving, which is useful in space-variant link interconnection patterns such as the perfect shuffle 7 and other global permutation patterns. These link patterns are inherently difficult to implement in planar interconnection technologies, because they have high BSBW's.
In multiprocessor architecture design there is typically a direct trade-off between minimum BSBW and latency in a network. It is therefore generally desirable to implement networks with the largest minimum BSBW that can be practically achieved to solve a given problem. The ability of optical elements to interconnect large arrays in space-variant patterns, without cross talk in the medium, suggests that FSOI techniques are particularly promising for problems with high BSBW. For architectures that benefit from a network BSBW in the terabit/second regime, free-space optical interconnects have a marked advantage. 2, 3 Therefore globally interconnected multichip smart-pixel-based architectures have the potential to reap the full benefits of FSOI.
In this paper we describe an experimental demonstration of a smart-pixel-based optical interconnection module currently being developed under the Free-Space Accelerator for Switching Terabit Networks (FAST-Net) project. Our focus in this paper is on optomechanical and optical characterization of the initial prototype, which comprises the integration of four 2-D arrays of monolithically integrated vertical-cavity surface-emitting lasers (VCSEL's) and metal-semiconductor-metal (MSM) photodetectors (PD's) with a multichip global optical interconnection system. Section 2 provides an overview of the FAST-Net concept. The distinguishing features and smart pixel functionality are highlighted. As a motivation for the experiments the technical challenges to implementing the concept are also outlined in Section 2. Section 3 details the key elements of the experimental module. These elements include the integrated VCSEL-MSM detector arrays, VCSEL drivers, smart pixel emulation boards, the optomechanical alignment subsystem, and the optical interconnection subsystem. The experimental results are presented in Section 4. The experiments focused on the characterization of the optical interconnection system, including measurements of adjacent channel separation, optical transmission, and detector collection efficiency of the multielement refractive-reflective system. As described in Section 4, the FAST-Net prototype exhibits resolution and registration accuracies that are well matched to active element size and spacing of the VCSEL-PD arrays. Furthermore, the system's overall transmission efficiency is limited only by reflective losses of the refractive elements, which can be compensated with an antireflection (AR) coating in an optimized system. Section 5 is a conclusion that summarizes the effect of the experimental results and discusses plans for the next phase of the FAST-Net evaluation. Figure 1 is a schematic depiction of the FAST-Net optical interconnection approach.
A key distinctive feature of the concept is that all smart-pixel-array (SPA) devices are distributed across a single multichip plane, 8 such as a multichip module (MCM) or a printed circuit board (PCB). The SPA chip array is linked to itself through an optical system made up of an array of matched lenses and a mirror. The optical system effects a global point-to-point interconnection pattern in which images of spatially separated clusters of VCSEL's and detectors on any SPA chip are overlaid onto similar clusters at every other SPA chip on the substrate. The clusters are patterned such that the image of the VCSEL array in a cluster registers onto the associated PD array within the other cluster, and vice versa. Every chip is therefore connected to all other chips with a bidirectional parallel data path, whose throughput is determined by the size and the density of the clusters of VCSEL's and detectors (i.e., how many links per cluster) and the rate at which the signals are transmitted over individual links.
FAST-Net Concept Overview
Each SPA chip is a hybrid complementary metal-oxide semiconductor (CMOS)/GaAs device. The silicon CMOS electronic chip is area bump bonded to the GaAs array of emitters and detectors. The GaAs IC comprises a spatially interleaved array of VCSEL's and MSM PD's. The CMOS chip contains the drivers, receivers, and digital logic associated with the routing, electronic I/O, and computational elements of the architecture. The optoelectronic I/O elements are arrayed in a self-similar grid of clusters 9 of interleaved VCSEL's and detectors. Previous experiments, which used etched photomasks to emulate the multichip array of sources and detectors, demonstrated sufficient optical resolution and registration across the entire multichip array to accommodate element sizes of ~25 μm and element-to-element spacing within clusters as small as ~100 μm. 10 Each cluster may eventually contain many VCSEL's and detectors that operate at rates of ~1 Gbit/s, thus leading to a large aggregate bandwidth between each pair of chips in the array. The density of devices and the cluster area determine the number of VCSEL-detector pairs in each cluster. The cluster area is determined by the geometric constraints of the optics. 10 With proper optical design there is potential for a massive amount of internal BSBW in the system depicted in Fig. 1. For example, if each SPA chip achieves an aggregate I/O capability of 1 Tbit/s, then the total BSBW capacity of the system depicted in Fig. 1 is 8 Tbit/s for the 16-chip SPA arrangement.
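The 8-Tbit/s figure quoted above can be reproduced with a short calculation. The sketch below is illustrative only and rests on an assumption that is not stated in the text: each chip's aggregate I/O is divided equally among its clusters, with one cluster per destination chip (the chip itself included), so that half of every chip's I/O crosses a cut that splits the array into two equal halves. The function name is hypothetical.

```python
# Sketch of the bisection-bandwidth arithmetic for a fully connected chip array.
# Assumption (not stated in the text): each chip's aggregate I/O is shared
# equally among its clusters, one cluster per destination chip (itself included).

def bisection_bandwidth_tbps(num_chips: int, io_per_chip_tbps: float) -> float:
    """Sum the chip I/O that crosses a cut splitting the array into equal halves."""
    per_cluster = io_per_chip_tbps / num_chips    # I/O carried by one cluster
    clusters_crossing_per_chip = num_chips // 2   # clusters linked to the far half
    return num_chips * clusters_crossing_per_chip * per_cluster


if __name__ == "__main__":
    # 16 chips at 1 Tbit/s each, as in the example in the text.
    print(bisection_bandwidth_tbps(16, 1.0))
```

Under this equal-partition assumption the result scales as half the total chip I/O, so the bisection capacity grows linearly with the number of chips for a fixed per-chip I/O.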
It is envisioned that the basic FAST-Net module, as depicted in Fig. 1, will be coupled to external nodes through an optical or electrical interface around the periphery of the multichip substrate. This external I/O bandwidth, although large, will be some fraction of the internal bisection bandwidth of the module. The exact relationship between the internal and the external bandwidth will depend on how the internal bandwidth is partitioned to meet the switching requirement of the application.
There are several significant challenges in establishing the feasibility of the FSOI system depicted in Fig. 1. First, 2-D arrays of VCSEL's and MSM detectors must be used to emulate the eventual I/O of full hybrid SPA's. Second, multiple 2-D arrays must be positioned to emulate the high-precision placement expected of eventual MCM packaging. Third, the optical system must be precisely aligned to the arrays to effect the global interconnection pattern with high resolution and registration accuracy. Such an evaluation is complicated by the off-axis and wide-field-of-view nature of the multilens reflective global interconnection system and the inherent small feature size of the interconnected devices. The link design and I/O density of the eventual interconnection module will be determined from data gathered from the initial experiments described in this paper. The following sections describe the experimental module, the measurements carried out, and the interpretation of the results.
Experimental Module Description
The primary focus of the experiments described in this paper is on the incorporation of multiple arrays of monolithically integrated VCSEL-PD arrays into the FAST-Net optical module in a manner that emulates the operation of the eventual SPA devices. To this end the VCSEL-MSM arrays are packaged and operated on separate small PCB's that are positioned together in the optical system to emulate a single MCM or PCB substrate. In the experimental system, depicted in Fig. 2, as many as four PCB's, each containing a VCSEL-MSM detector array, are positioned in the active plane of the system. These small daughter boards perform two functions. First, driver, receiver, and control logic [in the form of a field-programmable gate array (FPGA)], which emulate the functions of smart pixel I/O signals, are contained on each chip's board. Second, each daughter board is mounted onto an individual multiaxis micropositioning stage, which allows for the chips to be positioned precisely in the smart pixel plane for alignment. The chip positioning tolerances in the system correspond to the placement accuracy achievable with state-of-the-art chip pick-and-place equipment (which achieves placement accuracy of the order of 10 μm). This accuracy is necessary because precise registration is needed across the entire multichip array of microlaser sources and detectors. Although the multichip plane is not completely populated in this experiment (this would require 16 SPA chips to be spaced more closely than the PCB's allow), the prototype still serves to fully characterize the alignment of the system. This is because the interchip links of the global interconnection pattern between the nearest (same chip) and farthest (corner-to-corner chips) may be evaluated simultaneously with this system.
Wide-angle, flat-field f/1.12 lenses are used for the interconnecting optics, and the corner-to-corner cluster imaging arrangement of the prototype almost fully exploits the wide range of angular ray deflections needed in the system.
The interleaved 2-D arrays of VCSEL's and PD's in the prototype are arranged in clusters. The clusters are arranged in a pattern that is self-similar to the 4 × 4 lens array pattern. 9 The 16 lenses are arrayed on a center-to-center spacing of 1.7 cm. Since there are 16 lenses, each VCSEL-MSM chip contains an array of 16 clusters. On the chip the cluster-to-cluster spacing is 800 μm. As shown in the magnified inset of Fig. 2, each cluster in the experimental system consists of a square array of 2 VCSEL's and 2 detectors, with a closest element spacing of 140 μm within the square four-element cluster. The VCSEL-MSM fabrication process does not limit the number of elements in a cluster in this system. In the present experiments the number of elements in each cluster is limited by the density of the wire-bond pads at the periphery of each chip. The cluster pattern for these experiments was designed to evaluate the small intracluster element spacing and relatively large interchip cluster imaging distances across several chips simultaneously. In the eventual system, area bump bonding will be used. Since there will be no peripheral interconnection density limitation, it is hoped that VCSEL-MSM cluster sizes will be large enough to accommodate ~1024 elements/chip. With such chips a 16-chip-16-cluster design may therefore have clusters that contain a total of 1024/16 = 64 VCSEL-MSM pairs.
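The layout described above can be sketched numerically. The example below generates nominal cluster centers and element positions for one chip and repeats the 1024/16 partition. It is an illustration only: placing the origin at the chip center, and putting the four elements at the corners of a 140-μm square about each cluster center, are assumptions made for the sketch, and all names are hypothetical.

```python
from itertools import product

CLUSTER_PITCH_UM = 800.0   # cluster-to-cluster spacing on the chip
ELEMENT_PITCH_UM = 140.0   # closest element spacing inside a cluster
GRID = 4                   # 4 x 4 clusters, matching the 4 x 4 lens array


def cluster_centers():
    """Nominal cluster centers (um), assuming the grid is centered on the chip."""
    offsets = [(i - (GRID - 1) / 2) * CLUSTER_PITCH_UM for i in range(GRID)]
    return list(product(offsets, offsets))


def element_positions():
    """Four elements per cluster at the corners of a 140-um square (assumed)."""
    half = ELEMENT_PITCH_UM / 2
    corners = [(-half, -half), (-half, half), (half, -half), (half, half)]
    return [(cx + dx, cy + dy) for cx, cy in cluster_centers() for dx, dy in corners]


def elements_per_cluster(elements_per_chip, clusters_per_chip):
    """Equal partition of a chip's optical I/O among its clusters."""
    return elements_per_chip // clusters_per_chip


if __name__ == "__main__":
    print(len(cluster_centers()), "clusters,", len(element_positions()), "elements")
    print(elements_per_cluster(1024, 16), "per cluster in the projected design")
```

The sketch covers only the repeating pattern of the cluster grid; any additional alignment detectors are outside its scope.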
The optical system is aligned such that the image of each chip's clustered pattern is divided into 16 subimages (one for each cluster), each cluster's image registered onto a similar cluster on a separate one of the 16 chips. Thus the images of the two VCSEL's of a cluster are made to fall on the two detectors of the associated cluster on one of the chips. Conversely, the two VCSEL's of the associated cluster are imaged onto the two detectors of the first cluster. All 16 chip locations are connected in this manner. Thus the FAST-Net optical interconnection system functions as a fully connected network, in which each chip location is connected to every other with a bidirectional data path, which uses a cluster of two VCSEL's and MSM's on each chip. The optical system may also be viewed as a bidirectional shuffle system that performs a 16-shuffle on the pattern of 16 × 16 clusters in the system. In this case each multielement cluster may be considered to be a node of the 16-shuffle.
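The connection pattern just described can be expressed as a small index mapping. The sketch below assumes a labeling in which cluster j on chip i is imaged onto cluster i on chip j; this labeling is a convenience chosen for the example (the physical image inversion of the optics may relabel clusters within a chip), and the function names are hypothetical. Under that labeling the mapping coincides with a perfect 16-shuffle of the 16 × 16 clusters, which for k² items is a transposition of the row-major index.

```python
N = 16  # number of chips, and of clusters per chip


def partner(chip, cluster):
    """(chip, cluster) onto which the given cluster is imaged (assumed labeling)."""
    return cluster, chip


def k_shuffle(index, k=N):
    """Perfect k-shuffle of k*k items: the item at row-major (r, c) moves to (c, r)."""
    r, c = divmod(index, k)
    return c * k + r


if __name__ == "__main__":
    # Bidirectional: following a link twice returns to the starting cluster.
    assert all(partner(*partner(i, j)) == (i, j) for i in range(N) for j in range(N))
    # Fully connected: the clusters of any chip reach every chip location.
    assert {partner(0, j)[0] for j in range(N)} == set(range(N))
    # Same permutation as the 16-shuffle on the flattened cluster index.
    assert all(k_shuffle(i * N + j) == j * N + i for i in range(N) for j in range(N))
    print("mapping is an involution, fully connected, and equals the 16-shuffle")
```

The cluster with i = j is linked to itself, which corresponds to the on-axis case of the folded system.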
As depicted in Fig. 2, the prototype system integrates three key technology areas. CMOS electronic emitter driver circuitry, receiver circuitry, and control logic are used as the electronic portions of the SPA emulator boards. GaAs VCSEL-PD arrays are fabricated and packaged to emulate the SPA I/O arrays. These electronic and photonic elements are incorporated into the optical interconnection module by use of optomechanical techniques to emulate the precise alignment of a single multichip substrate to the global optical interconnection system. Subsections 3.A-3.C detail these three key elements of the prototype.
A. Electronic Driver, Receiver, and Control Boards
The area-bump-bonded smart-pixel MCM module depicted in Fig. 1 is emulated in the experimental system by small PCB's with discretely packaged VCSEL-PD arrays (which are described in Subsection 3.B), CMOS VCSEL driver integrated circuits (IC's), CMOS receiver IC's, and FPGA's, as depicted in Fig. 2. [The eventual flip-chipped smart pixel processor will contain all the CMOS circuitry (including data processing) in addition to the VCSEL-PD arrays.] One PCB is used for each emulated SPA. The emulator boards are small enough to permit as many as four to be positioned in the system at the same time, as depicted in Fig. 2. Test patterns for the VCSEL arrays are loaded onto each daughter card from the central motherboard before each test. Additionally, daughter cards are programmed to transmit selected receiver outputs back to the motherboard for display. This arrangement permits all point-to-point optical links in the global multichip interconnection pattern to be evaluated individually or in arbitrary groups selected by the operator.
The logical elements of the FAST-Net demonstrator system consist of five FPGA's: one master controller and four slave FPGA's. With this configuration, the logical operations used to test the system are completely programmable. Each slave FPGA has dedicated connections to every receiver and driver being used on its own card and can therefore be programmed to send, receive, and manipulate the optical data. It can also receive data or instructions from the motherboard and report back to the motherboard electronically.
The slave FPGA can drive the VCSEL array through the VCSEL driver ASIC. The optical signals generated by the VCSEL's pass through the optical system and land on the PD's. Photocurrent signals generated by the PD's are converted into digital CMOS voltage signals by the receiver arrays on the daughter cards. These signals are then used to drive FPGA inputs. With the logic gate, RAM capability, and the ability to reprogram the FPGA, a variety of functionalities may be realized with this system. For the experiments reported here the FPGA's were programmed to facilitate optical alignment and testing of the optical channels by displaying the logical detector outputs of the proper cluster's detectors during the alignment process. Figure 3 shows the circuit layout of the VCSEL driver-test chip, as fabricated in a 0.5-μm CMOS process. The design contains ten asynchronous channel drivers, with CMOS-level inputs and current-mode outputs. Modulation and bias are adjustable over a wide range (0-20 mA for bias and ±20 mA for modulation). The chip may be operated in a self-test mode. In this mode a pseudorandom data stream is generated on chip and transmitted to the VCSEL array. An adjustable on-chip clock generator (150 MHz-1 GHz) is used to synchronize the operation (in self-test mode). Thus the chip can test high-speed operation of the VCSEL device array without requiring high-speed signal generation equipment. Although designed to operate at several hundred megahertz, these drivers were operated at much slower speeds for the experiments reported in this paper. The electrical parasitics of the board and wire-bond connections to the VCSEL's prohibited high-speed modulation in the present experiments. This was tolerable, since high-speed operation was not required for completing the optical characterization of the FAST-Net prototype.
Similar high-speed drivers will be incorporated into the next-generation FAST-Net module, which will use area solder bump bonds between the drivers and the VCSEL's to achieve high speed. A block diagram of the VCSEL driver-test chip's operation is provided in Fig. 4 . The self-test circuitry of this chip provided a convenient bit pattern for optical link analysis.
B. Monolithically Integrated VCSEL-MSM Arrays
A schematic cross section of the integrated device structures is shown in Fig. 5. The VCSEL structure is grown by metal-organic chemical-vapor deposition. 11 It consists of a bottom mirror stack (grown directly on the semi-insulating GaAs substrate), an n-type doped contact layer, multiple-quantum-well active layers, and a typical p-type top mirror stack. The MSM PD structure is fabricated on top of a 1.5-μm cap layer of undoped GaAs and uses WSiₓ for the Schottky metal of the interdigitated fingers. This cap layer is selectively etched by a wet chemical process to prepare the other I/O sites for VCSEL fabrication. The VCSEL active window is defined by a p-ohmic metal deposition and gain-guide implantation. A dry etch process is used to access the n-contact layer. Here an n-ohmic metal is deposited on the n-contact layer. Next, the VCSEL gain region is isolated, through ion implantation, and is passivated and insulated with polyimide. Finally, the interconnect metal is applied to the device. A typical VCSEL in the array has a threshold current of 3.5-4 mA and a threshold voltage of 1.7-1.8 V. The maximum output power for an element exceeds 8 mW. In the experiments the devices were typically operated within the range of 0.1-2 mW. The operating wavelength of the VCSEL's is approximately 850 nm. The small signal bandwidth of a typical VCSEL was measured to be >9 GHz. 11 Figure 6 contains a microphotograph of the 2-D VCSEL-MSM integrated array chip designed for the FAST-Net FSOI module. The 2-D pattern of VCSEL's and PD's was designed to match the prototype optical system's field of view as determined by the lens array spacing and the f/# of the lenses. The pattern is a self-similar grid designed to make optimum use of the optical interconnect system with a multichip SPA substrate. 9, 10 In this case the 4 × 4 array of clusters of the optical I/O pattern on the chip is self-similar to the 4 × 4 lens array pattern in the prototype.
With this pattern, parallel optical links between chips are clustered in the same physical region in the grid. There are a total of 32 VCSEL's and 36 MSM PD's in each array. The four extra detectors, which break the repeating pattern (located in the corners of each chip), are used for alignment and cross-talk analysis in the prototype. The side of each square cluster measures 140 μm; thus the center detector of the corner clusters is ~100 μm from the other elements of its cluster. These dimensions are typical for projected high-density VCSEL arrays. In turn, the cluster spacing is 800 μm. All of the elements have coplanar contacts that are routed to bonding pads at the periphery of the chip. The magnified inset in Fig. 6 is a SEM picture showing detail of a cluster of two VCSEL's and two detectors. The circular VCSEL elements have a confinement aperture diameter of approximately 15 μm (as defined by the proton implantation) and a full angle divergence of ~15 deg. The square PD's have an active region that is 50 μm wide. Figure 7 is a photograph of the optical interconnection module and the associated optomechanical elements used to align the emulator boards and the optical elements for the prototype demonstration. The module comprises 16 lenses fixed in a common plane and a mirror positioned above the lenses to fold the system back on itself and achieve the global interconnection system as described above. The 16 lenses in the array are commercially available seven-element f/1.12 microprojection lenses, with a focal length of 13 mm. These lenses were specified to have a focal length tolerance of +1/-2%. The narrow beams of the VCSEL's make it relatively simple to avoid vignetting in this wide-field-of-view, high-numerical-aperture system. This experimental prototype was configured to allow for as many as four SPA emulator boards to be incorporated simultaneously into the prototype.
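The remark about vignetting can be illustrated with a rough estimate that compares the VCSEL beam footprint at the lens with the lens aperture. The sketch below is a thin-lens approximation under assumptions that are not taken from the text: the source sits one focal length from the lens, the quoted 15-deg full angle bounds the beam, the aperture diameter equals the focal length divided by the f-number, and the lateral offset of off-axis clusters is ignored. The function names are hypothetical.

```python
import math


def beam_diameter_at_lens_mm(focal_length_mm, full_divergence_deg):
    """Footprint of a diverging beam after propagating one focal length."""
    half_angle = math.radians(full_divergence_deg / 2.0)
    return 2.0 * focal_length_mm * math.tan(half_angle)


def aperture_diameter_mm(focal_length_mm, f_number):
    """Aperture diameter implied by the f-number (thin-lens definition)."""
    return focal_length_mm / f_number


if __name__ == "__main__":
    beam = beam_diameter_at_lens_mm(13.0, 15.0)
    aperture = aperture_diameter_mm(13.0, 1.12)
    print(f"beam footprint ~{beam:.1f} mm, aperture ~{aperture:.1f} mm")
```

With the quoted values the footprint comes out at roughly 3.4 mm against an aperture of roughly 11.6 mm, which is consistent with the statement that the narrow beams leave considerable margin.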
The physical size of the boards prevented positioning of the VCSEL-MSM arrays closer than approximately 3.2 cm from center to center. This means that immediately adjacent SPA chip locations in the MCM plane (with a center-to-center spacing of ~1.7 cm) cannot be evaluated, but all other spacings between chips (i.e., two-chip or three-chip separation, along a side or diagonal) may be evaluated. For the measurements reported in this paper the emulator boards were positioned to populate the four-corner optoelectronic IC locations of the eventual MCM system, as depicted in Fig. 2. In these positions the optical interconnection system is required to simultaneously accommodate the on-axis case (in which a cluster on a chip is linked to itself) and the most off-axis case (in which clusters on the diagonal corner chips are linked). Such a configuration enables measurements over the extreme beam angles to provide a full evaluation of the system. The four daughter cards described in Subsection 3.A were used to mount the VCSEL-PD arrays as needed. Each daughter card, in turn, was mounted on a micropositioning assembly with 5 degrees of freedom (x, y, and three angles) to emulate the high pick-and-place positioning accuracy (~10 μm) of the flip-chipped hybrid IC's on the eventual MCM module. Ribbon cables (two are visible in Fig. 7) connect the four daughter boards to a motherboard, where the master controller resides (the motherboard is visible on the table to the right-hand side of the optomechanical system in Fig. 7).
Fig. 4. Block diagram of the VCSEL driver-test IC. Each VCSEL can be driven by one of three inputs: a pseudorandom sequence that repeats after 16 bits, a pseudorandom sequence that repeats after 1024 bits, or the corresponding input pin (typically connected to the CMOS receiver IC).
C. Optomechanical Interconnection System
The overall multichip optical I/O pattern is divided into 16 sections, one for each optoelectronic IC, with a lens above it. A lens is aligned above each VCSEL-PD array in a lens holder that affords a positioning flexibility sufficient to compensate for the specified machining tolerances of the lens barrels. As shown in Fig. 1, each IC has 16 sections corresponding to a cluster of optical I/O communicating with another optical IC of the array. As depicted in Fig. 2, the clusters used in the present experiment contain four parallel links between the two optical IC's, thereby permitting critical analysis of cross talk for the system in a dense (140-μm separation) optical interconnection module, as described in Section 4.
Experiments
The overall goal of the FAST-Net module evaluation described here is to verify and quantify the capability of the macro-optical interconnection module to effect the required global interchip link pattern on the interleaved arrays of VCSEL's and detectors across multiple chips in a single plane. To this end, three separate measurements are needed. First, the basic VCSEL-to-MSM optical link capability of the system must be characterized. Owing to the macro-optical reflective approach, the total path length of each link in the reflective shuffle system is approximately 30 cm in the prototype shown in Fig. 7. The ability of the system to transmit and focus the VCSEL beams after this propagation distance must be validated. This basic functional evaluation also serves to verify the operation of the analog driver and receiver circuitry and the interfacing digital logic link at the input and the output. Also, this basic channel link characterization permits the setting of appropriate bias and trigger voltages within the driver and receiver circuits. The basic channel link validation measurements and calibration were conducted with the actual prototype, with the mechanical degrees of freedom afforded in the prototype used for precise positioning of a VCSEL-detector pair on axis for one of the lenses in the reflective system.
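As a rough consistency check on the quoted path length of approximately 30 cm, the folded geometry can be estimated from the lens and cluster spacings. The sketch below is not the authors' design procedure. It assumes ideal thin lenses, a chip one focal length below its lens, a collimated beam between lens and mirror, and a cluster offset of one cluster pitch (800 μm) that must be displaced by one lens pitch (1.7 cm) after the round trip to the mirror. All names are hypothetical.

```python
def mirror_distance_mm(focal_length_mm, cluster_pitch_mm, lens_pitch_mm):
    """Lens-to-mirror distance d such that 2 * d * (x / f) equals the lens pitch."""
    return lens_pitch_mm * focal_length_mm / (2.0 * cluster_pitch_mm)


def folded_path_length_mm(focal_length_mm, cluster_pitch_mm, lens_pitch_mm):
    """Chip -> lens -> mirror -> lens -> chip path length in the thin-lens model."""
    d = mirror_distance_mm(focal_length_mm, cluster_pitch_mm, lens_pitch_mm)
    return 2.0 * (focal_length_mm + d)


if __name__ == "__main__":
    d = mirror_distance_mm(13.0, 0.8, 17.0)
    total = folded_path_length_mm(13.0, 0.8, 17.0)
    print(f"mirror distance ~{d:.0f} mm, total path ~{total:.0f} mm")
```

Under these assumptions the estimate lands close to 30 cm; a real seven-element lens has finite thickness and separated principal planes, so the figure should be read only as an order-of-magnitude check.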
The resolution and registration of the interconnection prototype play a key role in determining the achievable density of optoelectronic elements within a cluster, which is a key parameter for determining the BSBW and therefore for predicting the overall performance of the eventual system. The second set of measurements, therefore, was implemented to characterize the prototype in terms of separation and possible optical cross talk between adjacent channels in the same cluster. These measurements require the operation of adjacent receiver channels within a cluster while one of the associated VCSEL's is toggled on and off to determine whether there is enough spatial separation to reliably trigger only the corresponding receiver circuit. In the present prototype the corner clusters of the VCSEL-MSM arrays facilitate the evaluation of cross talk between detectors that are as close as 100 μm. This is because the extra detector in these clusters (visible in the corner clusters of the microphotograph shown in Fig. 6) is located in the center of the cluster. Since the elements of the clusters were fabricated on a square 140-μm grid, the additional detector in the corner clusters is located approximately 100 μm away (as measured from center to center) from either of the other two detectors in the cluster. These measurements provide an estimate of an achievable spacing between elements and thus help to predict the eventual achievable density of the elements within a cluster.
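The 100-μm figure follows from the geometry of the square cluster: the center of a square lies half a diagonal from each corner. A minimal check, with a hypothetical function name:

```python
import math


def center_to_corner_um(side_um):
    """Distance from the center of a square to any of its corners."""
    return side_um / math.sqrt(2.0)


if __name__ == "__main__":
    # 140-um cluster side, as in the prototype arrays.
    print(f"{center_to_corner_um(140.0):.1f} um")
```

For a 140-μm side this gives about 99 μm, which is the "approximately 100 μm" spacing referred to above.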
The third set of critical measurements of the FAST-Net module is needed to characterize the overall optical efficiency of the system. The goal of these measurements is to determine the link efficiency and to identify and quantify the sources of loss in the optical system. As described below, these measurements were facilitated by use of a calibrated broad-area detector positioned at various locations in the system. Measurements of the efficiency of off-axis link paths in the global optical system indicate how an optimized optical design will benefit the overall performance.
To accommodate the second and the third measurements, the system was configured to verify the operation of the corner-to-corner optical channels in the globally interconnected 4 × 4 smart pixel chip array that will exist in the final multichip plane system. These optical channels have the largest angles of incidence in the system; they therefore represent the most challenging optical alignment. The system allows for the alignment techniques developed previously with a passive array 10 (a photolithographic etched mask) to be refined in an active integrated emitter-receiver scheme similar to what will be encountered in the final multichip packaged system. Subsections 4.A-4.D detail the three measurements made with the FAST-Net prototype.
A. VCSEL-MSM Link Validation
Verifying the corner-to-corner links in the optical module required the precise placement of the optical arrays for mimicking the pick-and-place accuracy of the eventual MCM. We accomplished this by using the five-axis positioning mechanics beneath each of the daughter cards. A daughter card was aligned beneath its corresponding lens such that its alignment cluster was imaged onto itself through the folded optical system. The chip is positioned such that its alignment cluster is precisely centered about the optical axis of the lens. This is the location where the eventual MCM pick-and-place machine will place the chip on a common substrate with respect to the rest of the system. When two daughter cards were positioned in this way, the interchip (intercard) link was examined to verify that simultaneous links between clusters on distinct chips were aligned to their respective clusters on a single chip. No repositioning of the daughter cards was allowed during the verification of intercard interconnections, because the positioning equipment was used only to replace the pick-and-place accuracy of the eventual MCM. With this method links of all varieties could be verified, with the exception of links between neighboring chips, whose spacing was precluded by the size of the daughter cards. For these and all subsequent measurements the VCSEL's were operated without modulation to facilitate the alignment process and measurements that followed.
B. Intracluster Optical Separation Measurements
Once the basic intercard connections were established, the dc optical intracluster channel separation ability of the prototype was characterized. To facilitate this, the VCSEL's in the link were turned on and off while corresponding detectors were examined. In this configuration it was verified that, at operational conditions, the corresponding detector was triggered whereas the detector located at a position 140 μm away in the cluster was not. Four clusters on the chip were fabricated with a detector at the center of the square cluster (100 μm diagonally from each corner). This cluster was used to measure the cross talk at this point. This detector did not trigger when either corner link was established.
After verification of effective intracluster channel separation for the on-axis case, other intercluster (and hence interchip) link varieties were verified by use of the prototype system. This validation was carried out by simultaneously linking closely spaced VCSEL pairs (140 μm apart) on one chip with corresponding detector pairs on chips positioned across the simulated MCM. Corner VCSEL-detector pairs were chosen for this test, since they stressed the optical system, owing to the large beam angles involved. As with the on-axis measurements, these off-axis interchip links achieved sufficient optical resolution, registration, and efficiency to interconnect active interleaved arrays of VCSEL's and PD's in a system without measurable cross talk for a detector separation within a cluster of 140 μm. These measurements validated the ability of the optical system to reliably separate densely packed intracluster channels. They provide a good starting point for the design and optimization (in terms of VCSEL power and receiver sensitivity) of the eventual area-bump-bonded SPA arrays that will be operated at high rates and contain a larger number of channels per cluster. The absence of measurable dc optical cross talk between cluster elements that have only 140-μm separation suggests that the optical system achieves excellent resolution within the VCSEL cluster image across the entire field of view of the optical system. The measurements that confirmed and quantified this high resolution are detailed in Subsection 4.C.
C. Optical Link Efficiency Characterization
For the optical efficiency measurements, only one of the smart pixel emulator boards was inserted into the FAST-Net system and only a single VCSEL in one of the corner clusters of elements of the array was used. This configuration allowed for the insertion of a calibrated wide-area detector into various positions within the prototype to capture essentially all of the light from the single VCSEL at various positions in the system. Total optical power was measured in the planes 1, 2, and 3 shown on the cross section of the prototype system in Fig. 8. Two types of measurement were made in plane 3. In the first the wide-area detector was used as in planes 1 and 2. In the second measurement at plane 3 a 50-µm-diameter pinhole was used in front of the wide-area detector to emulate the collection aperture of the MSM detector. This pinhole was used to estimate the amount of light power captured by the detector in the prototype system. The same VCSEL was used throughout the measurements to provide uniformity of results. VCSEL output power was varied between 90 and 1.7 mW.
The measurement at plane 1 provides an absolute baseline for the power measurements. Relating this measurement to the measurement at plane 2 provides an estimate of transmission efficiency of the mirror in the system. Relating the measurement at plane 3 without the pinhole to the measurement at plane 2 provides an estimate of the transmission efficiency of the second lens. Since the lenses in the system are nearly identical, the transmission efficiency of the first lens in the optical path may be inferred to be equal to that of the second lens. The product of the transmission efficiencies of the two lenses and the mirror is the overall optical efficiency of the system. The measurements made at the three planes were repeatable and stable. Figure 9 is a graph of the measured optical efficiency of the lens and mirror elements as a function of the VCSEL output power. The figure shows that the overall efficiency is relatively insensitive to total optical power, as expected. The data show that each lens has a light power transmission efficiency of ~68%. This is consistent with the fact that the prototype uses seven-element lenses (two are cemented together) and that each lens surface has ~97% transmission, which is typical for a non-AR-coated glass-air interface. The average mirror reflection efficiency is ~82%, with the losses assumed to be due to absorption and transmission, since the mirror was not designed for the wavelength of interest. The overall optical efficiency for the FAST-Net prototype is therefore estimated from these measurements to be (0.68)² (0.82) = 0.38. Relating the measurements made at plane 3 with, and without, the pinhole present in front of the wide-area detector shows that approximately 86% of light that reaches plane 3 is collected within the circular 50-µm-diameter pinhole. The variations in this measurement with VCSEL power are shown in Fig. 9.
This result is an indirect proof that the relevant aberrations of the optical system are so small that their effect on the overall light budget is of a secondary nature for the element sizes used in the prototype.
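The transmission budget described above can be retraced with a few lines of arithmetic. The following is a minimal sketch, not part of the original measurements: the variable names are illustrative, and the count of 12 glass-air surfaces per lens is an assumption (seven elements, with one cemented pair removing two air interfaces).

```python
# Values as reported in the text (all approximate).
lens_transmission = 0.68      # measured, per multielement lens
mirror_reflection = 0.82      # measured, mirror not designed for this wavelength
surface_transmission = 0.97   # typical uncoated glass-air interface

# Assumption: seven elements with one cemented pair -> 2*7 - 2 = 12
# glass-air surfaces per lens.
glass_air_surfaces = 2 * 7 - 2
lens_estimate = surface_transmission ** glass_air_surfaces

# Two nearly identical lenses and one mirror in the folded path.
system_transmission = lens_transmission ** 2 * mirror_reflection

print(f"lens transmission estimated from surfaces: {lens_estimate:.2f}")
print(f"overall optical transmission:              {system_transmission:.2f}")
```

Under this surface-count assumption the per-lens estimate lands close to the measured ~68%, and the product of the three element efficiencies reproduces the 0.38 quoted in the text.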
Since the PD's are square, with a collection area of 50 µm × 50 µm, they will actually collect somewhat more of the incident light than the circular pinhole used for the measurements. If the incident VCSEL spot is assumed to have a Gaussian intensity profile that is centered on the detector, then calculations show that a square PD could collect ~91% of the light incident at its location. This result confirms the favorable channel separation results reported in Subsection 4.B. It also shows that cross talk should not present a problem in the system, owing to the high confinement of light energy in the PD plane.
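The step from the circular-pinhole measurement to the square-detector estimate can be sketched as follows. This is an illustrative calculation under the Gaussian-beam assumption stated in the text; the function names are hypothetical, and the intensity profile is taken as exp(-2r²/w²) with w the 1/e² beam radius.

```python
import math

def beam_radius_from_pinhole(fraction, pinhole_radius):
    """1/e^2 radius w of a centered Gaussian beam, from the power fraction
    passing a circular pinhole: fraction = 1 - exp(-2 r^2 / w^2)."""
    return pinhole_radius * math.sqrt(2.0 / math.log(1.0 / (1.0 - fraction)))

def square_capture(half_width, w):
    """Power fraction of a centered Gaussian beam on a square detector.
    The profile is separable, so the result is a squared 1-D integral."""
    return math.erf(math.sqrt(2.0) * half_width / w) ** 2

# 86% of the light passes the 50-um-diameter (25-um-radius) pinhole.
w = beam_radius_from_pinhole(0.86, 25.0)
square = square_capture(25.0, w)   # 50 um x 50 um detector
print(f"beam radius w = {w:.1f} um, square-detector capture = {square:.2f}")
```

With the measured 86% pinhole fraction, this gives a beam radius of about 25 µm and a square-detector capture of about 0.91, consistent with the figure quoted above.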
D. Optical Cross-Talk Analysis
To determine the worst-case cross talk expected in the FAST-Net prototype, a typical detector array was analyzed in terms of the cross talk. If a regular grid of 50-µm detectors is assumed, and if the array is illuminated with an identically spaced grid of laser beams with Gaussian profiles, then the expected cross talk can be calculated as a function of the detector spacing and the misregistration of the VCSEL-detector arrays. The previous pinhole measurements showed that the diameter of the incident light beam at the detector plane was approximately 50 µm. Assuming that all eight detectors adjacent to the central one were illuminated with their respective identical VCSEL's, while the central one was not illuminated, the overall light intensity that spilled over to the central detector was calculated. This corresponds to the worst-case scenario for cross talk. Dividing this integrated cross talk by the signal power (i.e., the power that the central detector collected with only its corresponding VCSEL on but with all other VCSEL's off) normalizes this cross talk. Figure 10(a) shows the dependence of the normalized cross talk on the spacing of the detector grid. In the extreme case of square detectors with a unity fill factor (i.e., spaced at their width) the normalized cross talk is ~0.1. As the detector spacing is increased, the cross-talk level decreases. Detectors spaced at 140 µm achieve a normalized cross talk of ~7 × 10⁻²⁰. In fact, even at a detector spacing of 100 µm, the normalized cross talk is negligible. These calculations are consistent with the measurements made on the two detector clusters of the experimental setup, which showed no measurable optical cross talk.
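The worst-case calculation described above can be approximated with a short script. This is a sketch under stated assumptions, not the authors' code: the 1/e² beam radius of 25.2 µm is inferred here from the 86% pinhole fraction, the beams are ideal Gaussians, and all names are illustrative.

```python
import math

W = 25.2      # assumed 1/e^2 beam radius in um (inferred from the pinhole data)
HALF = 25.0   # half-width of a 50-um square detector in um

def axis_fraction(lo, hi, w=W):
    """Fraction of a 1-D Gaussian centered at 0 that lies between lo and hi.
    erfc is used away from the center so that tiny tails stay accurate."""
    s = math.sqrt(2.0) / w
    if lo >= 0:
        return 0.5 * (math.erfc(s * lo) - math.erfc(s * hi))
    if hi <= 0:
        return 0.5 * (math.erfc(-s * hi) - math.erfc(-s * lo))
    return 0.5 * (math.erf(s * hi) - math.erf(s * lo))

def captured(dx, dy):
    """Power fraction reaching the central detector from a beam whose
    center is displaced by (dx, dy) from the detector center."""
    return (axis_fraction(-HALF - dx, HALF - dx)
            * axis_fraction(-HALF - dy, HALF - dy))

def normalized_crosstalk(pitch):
    """Spillover from the eight nearest neighbors divided by the signal."""
    spill = sum(captured(i * pitch, j * pitch)
                for i in (-1, 0, 1) for j in (-1, 0, 1) if (i, j) != (0, 0))
    return spill / captured(0.0, 0.0)

for pitch in (50.0, 100.0, 140.0):
    print(f"pitch {pitch:5.0f} um: {normalized_crosstalk(pitch):.1e}")
```

With these assumptions the unity-fill-factor case (50-µm pitch) comes out near 0.1, in line with the text. At 100 µm and 140 µm the result is many orders of magnitude below any measurable level; the exact value at 140 µm is extremely sensitive to the assumed beam radius, so only its order of magnitude is meaningful.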
Misregistration of the VCSEL array image will affect the cross talk. Figure 10(b) depicts the effect on normalized cross talk when the VCSEL array image is spatially offset. The results are shown for three different detector spacings. In this case the abscissa represents a simultaneous offset error in both the x and the y directions. For example, a registration error of 10 µm in the figure corresponds to ~14 µm of displacement along the diagonal. If the 50-µm detectors are directly adjacent (i.e., unity fill factor) the normalized cross talk is high but relatively insensitive to registration error. However, for detector spacings of 140 µm the normalized cross talk is small even for registration errors of the order of tens of micrometers. Similarly, the calculations for a detector spacing of 100 µm also show that cross talk remains negligible across this range. Previous studies have shown the registration of the VCSEL images to be accurate to within 10 µm with this optical interconnection system. 10 Even when the location of the image of the VCSEL is shifted 10 µm diagonally from the center of the detector, the Gaussian beam calculations show that the detector captures ~85% of the light. If a 10-µm offset is assumed to be the worst-case VCSEL-MSM alignment in the system, then the overall optical efficiency, inferred from these measurements, is estimated to be the product of the optical transmission efficiency and the worst-case detector collection efficiency, or (0.38)(0.85) ≈ 0.32. Given the responsivity of the MSM detector and the optical output power capability of the VCSEL's, this efficiency is more than sufficient to achieve adequate signal-to-noise ratios of the VCSEL-MSM links in the system.
However, given the transmission characteristics of the lenses and mirror used in the prototype, it is clear that the overall optical efficiency of this system may be improved simply by using lenses and a mirror that are AR coated for operation at the VCSEL wavelength. Using a custom optical design that minimizes the number of elements in each lens will also improve overall transmission efficiency.
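The worst-case collection and overall efficiency estimates above can be retraced numerically. The sketch below assumes an ideal Gaussian spot with a 1/e² radius of 25.2 µm (a value inferred here from the 86% pinhole fraction, not stated in the text) and uses illustrative names throughout.

```python
import math

W = 25.2      # assumed 1/e^2 beam radius in um
HALF = 25.0   # half-width of the 50-um square detector in um

def capture_with_offset(dx, dy, w=W, half=HALF):
    """Power fraction of a Gaussian beam, displaced by (dx, dy) from the
    detector center, that falls on the square detector."""
    s = math.sqrt(2.0) / w
    fx = 0.5 * (math.erf(s * (half - dx)) + math.erf(s * (half + dx)))
    fy = 0.5 * (math.erf(s * (half - dy)) + math.erf(s * (half + dy)))
    return fx * fy

# A 10-um shift along the diagonal is 10/sqrt(2) um in both x and y.
d = 10.0 / math.sqrt(2.0)
collection = capture_with_offset(d, d)

# 0.38 is the measured optical transmission of the two lenses and mirror.
overall = 0.38 * collection
print(f"worst-case collection = {collection:.2f}, overall = {overall:.2f}")
```

Under these assumptions the collection efficiency for a 10-µm diagonal shift comes out close to the ~85% quoted in the text, and the product with the measured transmission gives an overall efficiency of about 0.32.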
Conclusions
The successful integration of a multichip arrangement of monolithically integrated arrays of VCSEL's and MSM detectors with a global optical interconnection prototype has validated the critical optical system performance elements of the FAST-Net concept. The experiments have confirmed the macro-optical concept's ability to effect a dense global point-to-point shuffle link pattern across a large (~10-cm) multichip smart pixel plane. The reflective macro-optical interconnection system linked a single multichip plane to itself in a global space-variant pattern of high bisection width. High-accuracy alignment was achieved across the entire plane, owing to the use of a single smart pixel plane and an optomechanical alignment approach that removes the excess mechanical degrees of freedom that would be present between each pair of planes of a multiplane architecture. This approach facilitates an automated alignment procedure, which is critical for the eventual manufacture of such optoelectronic systems.
The resolution and registration accuracy of the single-plane macro-optical reflective approach were well matched to the element size and close spacing of the arrays. Hence the experiments showed that global multichip interconnections are possible between high-density smart-pixel arrays. Furthermore, the optical efficiency was shown to be limited primarily by reflective losses in the multielement lenses of the optical interconnection system. High optical efficiency will minimize the driver and receiver complexity and the power consumption in the eventual smart pixel devices. The combination of high resolution, high registration accuracy, and high overall optical efficiency suggests that the FAST-Net concept will be able to fully exploit the projected Tbit/s/cm² I/O capability of future smart pixel chips in systems with a BSBW of several terabits/second.
These encouraging results of the FAST-Net module characterization were attained through the use of the first generation of 2-D monolithically integrated arrays of VCSEL's and MSM PD's. The results will prove invaluable in the design and implementation of a fully integrated FSOI module. This module will comprise area-bump-bonded chips (to provide the requisite bandwidth density), a single MCM substrate (on which to place the smart-pixel arrays with high precision), and an enhanced optomechanical module (to interconnect them). Such a prototype is currently under development. 12
