Abstract. Large broadband asynchronous transfer mode (ATM) switching nodes require novel hardware solutions that could benefit from the inclusion of optical interconnect technology, since electronic solutions are limited by pin-out and by the capacitance/inductance of the interconnections. We propose, analyze and demonstrate a new three stage free space optical switch that utilizes vertical cavity surface emitting lasers (VCSELs) for the optical interconnections, a liquid crystal spatial light modulator (SLM) as a reconfigurable shutter and relatively simple optics for fan-out and fan-in. A custom complementary metal oxide semiconductor (CMOS) chip is required to introduce a time delay in the optical bit stream and to drive the VCSELs. Analysis shows that the switch should be scalable to 1024×1024, which would require 2048 VCSELs, each emitting ~2 mW.
Introduction
Asynchronous transfer mode (ATM) is a form of packet switching that transmits information by attaching a 40 bit header to a message packet of 384 bits. ATM services are presently based on all-electronic switching fabrics. 1 However, as the size of the switch becomes large, the throughput of the switch becomes limited by communication bottlenecks caused not by the electronic processing units themselves, but by the electronic interconnects. Thus, while the processing power of electronic chips is sufficiently great that a single chip can contain much of the processing capability needed for an entire system (or a large portion of a system), a much more difficult task involves supplying the chip with enough high speed reconfigurable input/output ports. Hence it is advantageous to consider a hybrid optoelectronic approach where free space optics is used to increase the connectivity and electronics is used to carry out the processing functions. [2] [3] [4] [5] [6] This paper presents a scalable technique of implementing a three stage switching architecture 7 using arrays of vertical cavity surface emitting lasers (VCSELs) and a liquid crystal over silicon [ferroelectric liquid crystal/very large scale integration (FLC/VLSI)] spatial light modulator (SLM). The VCSELs are used primarily as high speed optical interconnects and the liquid crystal SLM functions as a reconfigurable routing shutter. VCSELs can easily be fabricated into 2-D arrays of individually addressed lasers that emit a low-divergence column of light normal to the array surface. 8 These lasers are highly efficient and have high reliability. 9, 10 Thus, VCSELs are ideal light sources for large scale optical free space interconnects in a communication switch. However, as has been shown by others, [4] [5] [6] it is difficult to find a technique of reconfiguring the optical interconnections without requiring a large number of VCSELs, needing complex optics, or suffering excessive optical loss due to fan-out.
Liquid crystal SLMs are relatively slow, but can be used to project large, programmable patterns that can be used for routing. By merging the advantages of the two optical technologies with electronics, a scalable ATM switch can be formed.
Previously, researchers at NEC utilized VCSELs in communication packet switching. Kawai and Kurita 4 implemented a three stage switching architecture using multiwavelength VCSEL arrays and multimode fiber interconnects. This approach requires a large number of VCSELs and complex optics, which results in a relatively high optical power loss. Li et al. 5, 6 proposed an architecture with low optical loss by assigning an array of N VCSELs to each of the N input channels and using passive optics to steer the beams to the required output. In this way, there is no optical fan-out loss. However, the system requires N² VCSELs, and therefore the number of VCSELs increases rapidly as the switch increases to a practical size. In addition, the size of the microlenses and the macrolens also increases significantly as N increases. Even so, these researchers have succeeded in demonstrating a system with over 200 inputs by using fiber bundles to route the optical signals to and from a free space switch.
Walker 11 at Heriot-Watt University proposed using VCSELs to send packets of data between complementary metal-oxide semiconductor (CMOS) chips. The N input data packet streams are fanned out with a binary phase grating (BPG) to form N subarrays. Each packet has an address header that can be read by an array of smart pixels. One multiple quantum well (MQW) modulator is attached to each smart pixel subarray. The modulator then transmits one of the data streams from its subarray. The number of data streams is limited by the custom made multiple component lenses that are required for this system. Other MQW ATM switching systems have also been proposed and/or demonstrated. 3, [12] [13] [14] In addition, several VCSEL based, free space processing systems have been demonstrated. [15] [16] [17] [18]
The architecture presented here uses an array of only 2N VCSELs that transmits through fan-out optics onto a liquid crystal routing shutter. The array is divided into sectors, which were previously 2, 7 shown to reduce optical loss in proportion to the number of sectors, e.g., for a system with 1024 inputs and 32 sectors, the optical fan-out is reduced from 1024 to 32 and the required optical output power of the VCSELs is also reduced by a factor of 32. The total beam loss of this system, as shown later, is then only ~23 dB. This loss appears to be sufficiently low to provide a BER of ≤10⁻⁹ with an output power of ~2 mW per VCSEL. Thus, the proposed VCSEL/LC switch is scalable to N ≥ 1024.
Three Stage VCSEL/SLM Based ATM Switching Architecture
The three stage switch illustrated in Fig. 1 was previously described in detail. 7 It is composed of a central routing switch (sectorized optical crossbar) that performs cross point routing of the inputs to the outputs. The stage in front of the routing switch performs two tasks: (1) it organizes the inputs into sectors to reduce the number of cross points and (2) it selects locations on the input plane to prevent the collisions of cells destined for the same output group on input sectors. The stage following the routing switch (the output stage) places the arbitrated inputs into queues and performs a time slot interchange as required. This architecture is independent of the technology used to implement it and we describe here a scalable technique that utilizes VCSELs as high speed optical interconnects and an FLC SLM that performs the routing. Figure 2 illustrates the general features of the proposed VCSEL/SLM reconfigurable switch. The first stage is composed of the VCSEL array and the controlling electronics, the second stage is made up of the diffractive fan-out optics and an SLM (liquid crystal routing shutters), and the third stage contains the fan-in optics, the output photodetector array and electronic queues/controls. As discussed, the first stage electronically divides the N inputs into m sectors of size N/m to reduce the number of cross points. The outputs of these sectorized units drive the VCSELs. Then N/m copies of each of the optical bit streams emitted from the VCSELs are made by the diffractive fan-out optics and imaged onto the liquid crystal routing shutters. Only N of the N²/m routing shutters are open, i.e., one for each output channel. The beams reflected from the shutter are then collected by the fan-in optics and directed onto a photodetector array prior to queuing for the output port.
The liquid crystal routing shutter pattern remains fixed for one ATM cell period, e.g., 2.7 μs for STM-1 (155 megabits/s), during which time the bit stream of that set of input cells is routed to the proper output. Since the next set of input ATM cells will be routed to different outputs, the routing shutter array must be reconfigured into another pattern before these new cells arrive. However, there is very little time between the incoming cells (only the 0.3 μs from the 5 bytes of header) and since the liquid crystal shutters switch slowly, a delay between the cells must be introduced. This time delay can be gained by electronically demultiplexing the odd and even incoming ATM cells between two spatially separated VCSEL arrays, as illustrated in Fig. 3(a). To accomplish this, each input channel has a demultiplexing switch, two 424 bit shift registers (which provide a one cell delay between the cell streams) and a VCSEL driver circuit. One array of VCSELs is driven by the first, third, fifth, etc. (odd) cells and the other VCSEL array is driven by the second, fourth, sixth, etc. (even) cells, as shown in the timing diagram of Fig. 3(b). The VCSELs can operate 19 at >3 gigabits/s and therefore the operating frequency is limited by the CMOS driver circuit. However, CMOS VCSEL drivers operating up to 622 megabits/s have been reported, 20 and therefore the VCSEL drivers should not impose a frequency limitation.
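The timing argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' implementation: the function names are illustrative, and the line rate is taken as the nominal 155 megabits/s quoted in the text.

```python
# Sketch of the ATM cell timing and the odd/even demultiplexing described above.
# All names are illustrative; the 155 Mb/s rate is the nominal figure from the text.

CELL_BITS = 424     # one ATM cell: 40 bit header + 384 bit payload
HEADER_BITS = 40    # 5 byte header


def duration_us(bits, rate_mbps):
    """Time in microseconds needed to transmit `bits` at `rate_mbps` megabits/s."""
    return bits / rate_mbps


def demultiplex(cells):
    """Split a cell stream into (odd, even) streams: 1st, 3rd, ... and 2nd, 4th, ..."""
    return cells[0::2], cells[1::2]


cell_period_us = duration_us(CELL_BITS, 155.0)    # roughly 2.7 us
header_time_us = duration_us(HEADER_BITS, 155.0)  # roughly 0.26 us
odd_cells, even_cells = demultiplex(["c1", "c2", "c3", "c4", "c5"])

print(f"cell period: {cell_period_us:.2f} us, header: {header_time_us:.2f} us")
print("odd array:", odd_cells, "even array:", even_cells)
```

Under this reading, each shutter array has about one cell period to settle into its next pattern while the other array is routing, instead of only the header time.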
The method of connecting the VCSELs to the CMOS and the layout of the VCSEL array must allow for high speed operation, scalability and functionality of the optical system. Simple wire bonding techniques are suitable for small arrays (small number of inputs) operating at low speed. However, as the number of inputs and the operating speed increase, more elegant techniques are required, as discussed in Sec. 3.2. With regard to the layout of the VCSELs, it appears that interleaving the two VCSEL arrays is the optimum layout configuration since physically separating the arrays increases the length of the electrical leads on the CMOS chip, thus increasing the chip area and the lead capacitance. The liquid crystal routing shutters must also be interleaved into two arrays to match the VCSEL arrays. Two shutter arrays are used so that while one shutter array is actively routing the signals from one of the VCSEL arrays, the second shutter array is being reconfigured.
Scalability
Scalability is the most important consideration in evaluating the potential of an optoelectronic ATM switch since small switches can be implemented electronically and optoelectronic switches will be used only if they increase the aggregate throughput and/or connectivity. Scalability of the proposed switch is a function of a number of interrelated features of the VCSELs, the VCSEL control chip, the sensitivity of the receivers on the output stage and the fan-in/fan-out optics. Primarily these features relate to (1) the optical power obtainable from the VCSEL, (2) the size of the chips and their ability to sink the power dissipation, (3) the characteristics of the liquid crystal routing shutter and (4) the optics and optomechanics.
These are discussed in the following sections based on reasonable estimates of the present state of the art devices. The practical aspects of the optics and optomechanics are also considered. It is shown that the proposed switch is practical up to at least 1024×1024.
Required VCSEL Optical Output Power
The optical output power required by each of the VCSELs is determined by the optical loss of the system and the incident power on the photodetectors needed to achieve a low bit error rate (BER) at the specified bit rate. It is important to minimize the required optical output power since it relates to the electrical power dissipation and the usable VCSEL array size. The incident power for a specified BER must first be estimated. Li et al. 5 calculated the required receiver power for their VCSEL based optoelectronic crossbar switch by assuming Gaussian noise and taking into account the fact that high performance/low noise receivers cannot be fabricated in large arrays. They estimated that a receiver sensitivity of about −20 dBm (~10 μW/channel) is required for a frequency of ~1.2 GHz.
The same general approach can be used to analyze the switching architecture proposed here. However, the crosstalk introduced by the finite contrast ratio (CR) of the liquid crystal routing shutter and the VCSEL might also contribute to the noise. The CR is defined as the ratio of the intensity emerging from the liquid crystal cell in the on state to that in the off state. If there are N channels being routed then there is crosstalk due to light leakage through the N−1 closed shutters, since the closed shutters will reflect 1/CR of their incident light into the fan-in optics. Assuming that the data presented to the routing shutter are random, the probability of any pixel being on is 0.5. The multiple discrete levels at the photodetector therefore have a binomial probability density function, which for large N approaches a Gaussian distribution that is superimposed on the thermal noise of the photodetector/receiver. The BER for the system can then be approximated by
(1)
where i_ph is the photocurrent of the detector, B is the bandwidth, R is the input resistance, T is the absolute temperature and k is Boltzmann's constant. Plotting BER as a function of contrast ratio reveals that the crosstalk introduced by the routing shutter is minimal if CR ≥ 2N/m, but it increases rapidly for CR < 2N/m. For N = 1024 and m = 32, CR must be ≥64. Since fast liquid crystal shutters with CR > 1000 have been reported, 21 we assume that the shutter introduced crosstalk is minimal.
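The binomial statistics invoked in this argument can be written out explicitly. As a sketch only, assume that K closed shutters contribute leakage to a given detector (K and the bits b_j are symbols introduced here, not in the original analysis), that each carries an independent random bit which is on with probability 1/2, and that each leaks a fraction 1/CR of the on-state power P. The leaked power X then has the following moments; these describe only the crosstalk term and are not a reconstruction of Eq. (1).

```latex
\begin{align}
X &= \frac{P}{CR} \sum_{j=1}^{K} b_j, \qquad b_j \sim \mathrm{Bernoulli}\!\left(\tfrac{1}{2}\right) \\
\mathrm{E}[X] &= \frac{K P}{2\,CR}, \qquad
\mathrm{Var}[X] = \frac{K P^{2}}{4\,CR^{2}} \\
CR_{\min} &= \frac{2N}{m} = \frac{2 \times 1024}{32} = 64
\end{align}
```

Both the mean and the standard deviation of the leakage scale as 1/CR, which is consistent with the crosstalk becoming negligible once CR is well above the 2N/m threshold.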
Noise also arises due to polarization instabilities of the VCSELs. However, this noise can be reduced to a low level by the proper size and shape of the VCSEL aperture. 22 Thus, it appears reasonable to assume that the calculations of Li et al. 5 will hold for the present system and that a detector input power of ~10 μW will result in a BER ≤10⁻⁹. Based on a detector input of 10 μW, the required VCSEL optical power can be determined from the system optical loss. The optical losses of the proposed system can be estimated in a straightforward manner, as outlined in Table 1, which indicates that the proposed ATM switch with 1024 input channels can operate with a loss of ~23 dB, which requires a VCSEL output power of ~2 mW. This is well within the capability 23, 24 of present polarization controlled VCSELs. Polarization control is required to reduce the unusable light generated by the VCSEL, since the LC shutters require linearly polarized light in intensity modulation schemes. The optical power budget of Table 1 assumes that the polarization ratio of the VCSELs is sufficiently large that a passive polarizing element placed in front of the shutters produces only a small loss.
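The link between detector sensitivity, system loss and VCSEL output power is a simple decibel calculation. The sketch below uses illustrative function names and only the figures quoted in the text (−20 dBm at the detector, ~23 dB total loss). The ideal 1-to-n splitting loss is included only to show why sectorizing from 1024 to 32 lowers the fan-out penalty; it is not a reproduction of Table 1.

```python
import math

# Sketch of the optical power budget arithmetic; names are illustrative.


def dbm_to_uw(dbm):
    """Convert a power level in dBm to microwatts."""
    return 10 ** (dbm / 10.0) * 1000.0


def required_source_power_mw(detector_power_uw, loss_db):
    """Source power in mW needed to deliver `detector_power_uw` after `loss_db` of loss."""
    return detector_power_uw * 10 ** (loss_db / 10.0) / 1000.0


def ideal_fanout_loss_db(n):
    """Loss in dB of an ideal, lossless 1-to-n power split."""
    return 10.0 * math.log10(n)


detector_uw = dbm_to_uw(-20.0)                          # 10 uW per channel
vcsel_mw = required_source_power_mw(detector_uw, 23.0)  # about 2 mW
saving_db = ideal_fanout_loss_db(1024) - ideal_fanout_loss_db(32)  # about 15 dB

print(f"detector: {detector_uw:.1f} uW, VCSEL: {vcsel_mw:.2f} mW, "
      f"fan-out saving from sectorizing: {saving_db:.1f} dB")
```

The ~15 dB difference between a 1-to-1024 and a 1-to-32 split corresponds to the factor of 32 reduction in required VCSEL power mentioned in the Introduction.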
Size and Power Dissipation of the VCSEL and CMOS Smart Pixel Chips
The scalability of the VCSEL/CMOS smart pixel chip(s) is dependent on three primary factors: the power dissipation of the chips, the size of the chips and the number of electrical input/output pads. This section estimates these values for N = 1024. Power dissipation for a VCSEL array can be calculated assuming that only half of the VCSELs are on at one time, but that all of the VCSELs are biased to the threshold current, i.e., the VCSELs operate in the nonreturn to zero (NRZ) mode. As a result, N/2 VCSELs generate heat equal to V_op I_op, where V_op and I_op are the voltage and current at the VCSEL operating output power of 2 mW. The other N/2 VCSELs dissipate V_th I_th, which is the power dissipated at the lasing threshold condition. Therefore the VCSEL chip power dissipation is approximately:
$$P \sim \frac{N V_{th}}{2}\left(I_{th} + I_{op}\right), \quad \text{assuming } V_{th} \sim V_{op}. \tag{2}$$
Using the values reported by Kuksenkov et al. 24 for a 6×4 μm polarization controlled VCSEL with a maximum optical power out of ~3.2 mW (I_th = 1.29 mA, I_op = 6 mA, and V_th ~ 1.5 V), the power dissipated by an array of 1024 VCSELs is therefore ~5.6 W, which is below the maximum allowable power dissipation of a properly heat sinked 2 cm² chip. Continued improvements in VCSEL characteristics will lower the power dissipation even further.
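Equation (2) can be evaluated directly with the device values quoted above. This is a minimal sketch with an illustrative function name; it assumes, as the text does, that the threshold and operating voltages are approximately equal.

```python
# Sketch: evaluate Eq. (2) for the quoted VCSEL parameters.
# The function name is illustrative; V_th ~ V_op is assumed, as in the text.


def vcsel_array_dissipation_w(n, v_th, i_th_ma, i_op_ma):
    """Approximate array dissipation in watts: (n * v_th / 2) * (i_th + i_op)."""
    return (n * v_th / 2.0) * (i_th_ma + i_op_ma) * 1e-3


power_w = vcsel_array_dissipation_w(n=1024, v_th=1.5, i_th_ma=1.29, i_op_ma=6.0)
print(f"array dissipation: {power_w:.2f} W")
```

With N = 1024 this gives approximately 5.6 W, matching the figure in the text.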
For compactness, ease of optical alignment and packaging considerations, it is desirable to place all of the VCSEL control electronics and the VCSELs on a single chip. The CMOS/VCSEL control chip must be composed of N pixels, each of which contains an odd/even switch, two 424 bit shift registers and two VCSEL drivers (Fig. 3). There must also be N input channel bonding pads and bonding pads for each of the 2N VCSELs. The necessity of 3N bonding pads is immediately seen as a limitation to the scalability of the system. One solution to this problem is to use multiple chips. However, this will complicate the optics and optical alignment. A more promising solution is the flip chip bonding of the VCSELs directly to the pixels, 25, 26 which, as discussed later, will reduce the number of bonding pads to ~N.
To estimate the CMOS chip die size, 0.35 μm CMOS design rules are used, although smaller line widths will soon be commonplace. 27 A simple dynamic shift register (SR) can be used for the pixels since the ATM cells will be continuously clocked serially in and out, and the SR serves only as a delay line. One bit of the SR will require ~150 μm², thus a 424 bit SR will require ~6.6×10⁻⁴ cm². Using standard 90×90 μm wire bonding pads requires ~1.6×10⁻⁴ cm² per pad including the space between the pads. The size of the VCSEL driver depends on the maximum current that must be supplied and the required bit rate. By scaling the driver design of Banwell et al. 20 for both reduced design rules and lower maximum current, the required area of the driver can be estimated to be ~0.8×10⁻⁴ cm². The flip chip bonding pads for the VCSELs can be placed on top of the dielectric layers of the chip and therefore do not require any additional chip area. The total required chip area follows from these per-pixel estimates. As a result, a 1024×1024 switch will require a CMOS chip of about 2 cm².
Connecting VCSELs to smart pixels introduces fabrication problems since they cannot be readily grown on standard Si CMOS integrated circuits; thus, a hybrid technique is required. The three standard methods of making hybrid connections are wire bonding, bridge bonding or flip chip bonding the whole array onto the CMOS chip. 28 Unfortunately none of these are suitable for large high speed arrays since they introduce large interconnect capacitance and/or use large chip area. We developed a coplanar contact flip chip process that places individual VCSELs directly within the pixel. 25, 26 This technique requires further development but has been shown 29 to be scalable to over 4000 self-electrooptic devices (SEEDs) on Si CMOS. A flip chip process for LEDs was also reported 30 that utilizes shrinkable epoxy to reliably attach 2300 LEDs to a substrate with a 10 μm pitch. Other processes have also been reported. 31, 32 In addition, an 8×8 array of photodetectors has been attached to a Si CMOS circuit using epitaxial lift off. 33 Therefore it appears that a suitable bonding technique for the proposed CMOS/VCSEL ATM switch will be developed in the near future and should not pose a limitation on switch size up to at least 1024.
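The die-size estimate in this section can be tallied from the per-element areas quoted in the text. The pixel composition used below (two shift registers, two drivers and one wire bonding pad per input channel, with the flip chip VCSEL pads taking no extra area) is an assumption made for this sketch; the original expression for the total area is not reproduced here, and routing overhead is ignored.

```python
# Sketch: tally the CMOS die area from the quoted per-element areas.
# The per-pixel composition is an assumption for illustration only.

SR_AREA_CM2 = 6.6e-4      # one 424 bit shift register (quoted)
DRIVER_AREA_CM2 = 0.8e-4  # one VCSEL driver (quoted)
PAD_AREA_CM2 = 1.6e-4     # one wire bonding pad including spacing (quoted)


def pixel_area_cm2(shift_registers=2, drivers=2, pads=1):
    """Area of one pixel under the assumed composition."""
    return (shift_registers * SR_AREA_CM2
            + drivers * DRIVER_AREA_CM2
            + pads * PAD_AREA_CM2)


def chip_area_cm2(n):
    """Total area for n pixels, ignoring routing and other overhead."""
    return n * pixel_area_cm2()


total_cm2 = chip_area_cm2(1024)
print(f"estimated die area: {total_cm2:.2f} cm^2")
```

Under these assumptions the tally comes to roughly 1.7 cm², which is of the same order as the ~2 cm² stated in the text.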
Characteristics of the LC Routing Shutter Array
The proposed three stage 1024×1024 ATM switch with 32 sectors operating at 155 MHz will require 65,536 LC pixels with a CR ≥ 2N/m ≥ 64 and a switching time of 2.7 μs. Taken together, these are demanding requirements but not beyond the state of the art since the development of LC on Si CMOS backplane technology has advanced rapidly in recent years, 34 e.g., pixel arrays of 256×256 (65,536) and larger have been fabricated, 35 and LC switching times of 17.2 μs have been demonstrated on Si backplanes with typical CMOS voltages and should decrease with continued research. Increasing the operating voltage will yield a faster switching speed, but this is at the expense of pixel density. CRs greater than 64 have been demonstrated. 36
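The pixel count and contrast ratio requirement follow from N and m. In the sketch below the function names are illustrative, and the factor of two is this sketch's reading of the two interleaved shutter arrays described earlier (N²/m shutters per array); the text itself states only the 65,536 total.

```python
# Sketch: shutter pixel count and minimum contrast ratio for given N and m.
# The factor of two for the interleaved arrays is an assumption of this sketch.


def shutter_counts(n, m, arrays=2):
    """Return (fan-out per VCSEL, shutters per array, total shutters)."""
    fan_out = n // m
    per_array = n * fan_out  # N^2 / m
    return fan_out, per_array, arrays * per_array


def min_contrast_ratio(n, m):
    """Contrast ratio threshold CR >= 2N/m from the crosstalk analysis."""
    return 2 * n / m


fan_out, per_array, total = shutter_counts(1024, 32)
cr_min = min_contrast_ratio(1024, 32)
print(f"fan-out {fan_out}, {per_array} shutters per array, {total} total, CR >= {cr_min:.0f}")
```

The total equals a 256×256 pixel array, the backplane size cited in the text.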
Optics and Optomechanics
For a 1024×1024 cross point switch, the proposed VCSEL implementation reduces the number of resolvable points in the interconnect plane from >10⁷ using the previously suggested implementation 7 to 6.5×10⁴. This greatly simplifies the optics. However, the overall size of the optics must be scaled to the size of the shutter. Consider an input plane of approximately 15×15 mm. When this is replicated m = 32 times, the resultant images will only fit onto a VLSI based shutter if demagnification is used. This demagnification must be selected so that the reduced VCSEL beams are greater than (or equal to) the diffraction limited spot size at the operating wavelength. Since the emerging beams from a VCSEL are approximately 6 μm in diameter and the diffraction limited spot size is only 1.2 μm in diameter, a demagnification of 4 would be appropriate. The routing shutter will consequently fit on a Si VLSI chip. Using a chip of the order of 15×15 mm would require the use of lenses of the order of 40 mm diameter. If such lenses have F numbers between 2 and 5, the length of the optical system would be of the order of 0.5 m, which would yield an approximate volume of 200 cm³. While this simple analysis indicates that the switch can be assembled with reasonable optical components, a detailed optical simulation is required. A more practical design may require a tradeoff between the physical optics and the magnitude of the fan-out.
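The demagnification constraint above is a one-line check. The sketch below uses illustrative names and only the two diameters quoted in the text.

```python
# Sketch: check a chosen demagnification against the diffraction limit.
# Names are illustrative; the diameters are the values quoted in the text.

BEAM_DIAMETER_UM = 6.0     # VCSEL beam diameter
DIFFRACTION_SPOT_UM = 1.2  # diffraction limited spot diameter


def max_demagnification(beam_um, spot_um):
    """Largest demagnification that keeps the beam at or above the spot size."""
    return beam_um / spot_um


def demagnified_beam_um(beam_um, demag):
    """Beam diameter after demagnification by `demag`."""
    return beam_um / demag


limit = max_demagnification(BEAM_DIAMETER_UM, DIFFRACTION_SPOT_UM)
reduced_um = demagnified_beam_um(BEAM_DIAMETER_UM, 4.0)
print(f"maximum demagnification {limit:.1f}, beam at 4x: {reduced_um:.2f} um")
```

A demagnification of 4 leaves a 1.5 μm beam, which stays above the 1.2 μm diffraction limit with some margin below the limiting value of 5.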
Proof of Principle Demonstration
The optical layout of the demonstrator system was assembled on a 12×12 in. slotted plate, 37 as illustrated in the drawing of Fig. 4 and the photo of Fig. 5. The system components consisted of a 4×4 array of VCSELs (λ = 850 nm) mounted on a steel slug with the VCSELs electrically connected to driver circuits controlled by a PC. For this demonstrator, the VCSELs were not attached directly to a custom CMOS chip, but this arrangement will be used in subsequent demonstrators. A Display Tech SLM was used as the LC routing shutter and a CCD camera provided images of the output. The optics was composed of a microlens array placed in front of the VCSELs, four 50 mm focal length lenses and a 1 in. polarizing beamsplitter (PBS). A 1×5 binary phase grating provided a fan-out for the 16 VCSEL beams, but only four of the five fan-out patterns were required. A 1×5 fan-out was used since a low cost, off the shelf grating was available, thus avoiding an expensive custom grating. Two cylindrical lenses (F = 6 mm and F = 5 mm) were used to fan-in the beams reflecting off of the SLM. The two lenses provided a better focus of the output beams than was possible with only one lens.
Proper routing through the demonstrator was verified by turning on several of the VCSELs and then routing them to different output locations by changing the LC pattern. Figure 6 illustrates the results of routing input 4 to output 12, 5 to 13, 11 to 7 and 14 to 2. Figure 6(a) shows the VCSEL array with numbers 4, 5, 11 and 14 turned on, Fig. 6(b) shows the 4× linear fan-out of these VCSEL beams from the BPG (note the CCD camera was not large enough to observe four full patterns), Fig. 6(c) shows the beams reflecting off the LC shutter and Fig. 6(d) shows these beams collected by the cylindrical fan-in lenses. Figure 6(d) shows the four outputs at the specified 12, 13, 7, and 2 output locations. Changing the shutter patterns with the same input pattern resulted in corresponding changes in the output beam locations. Figure 7 shows the output with all 16 VCSELs turned on. Turning one of the VCSELs off caused the output spot at the corresponding location to turn off. These tests verify the general routing concepts of the proposed architecture. The tests were performed in a quasi static mode due to the limitations of the CCD camera and because the input data were not demultiplexed. However, parts of the system were tested up to 100 megabits/s. Note that the beams maintain their approximate Gaussian symmetry up to the fan-in optics, which cause the beams to elongate.
Summary
Large ATM network switches require massively parallel, reconfigurable interconnections that must operate at high speeds. This is a difficult task for large switches and may be best implemented with a combination of electronics and free space optics, where the optics provides the reconfigurable cross point routing. This paper proposes an optoelectronic switch using VCSEL arrays for interconnects and an LC routing shutter. This implementation reduces the number of optical beams and simplifies the optics as compared to previously reported systems. [4] [5] [6] [7] First order analyses of the required optical power, power dissipation and physical size of the arrays were presented, which indicate that a switch as large as 1024×1024, divided into 32 sectors, should be practical in the near future. Results from a proof of principle demonstrator verified the switching architecture.
This paper addresses only the problems associated with the reconfigurable optoelectronic cross point switch and not the extensive electronics that are required to perform such tasks as reading the header, sectorizing the input and output planes, arbitrating collisions and queuing the output. These functions are important and are similar for all high performance ATM switches, but a discussion of the design of these functions is beyond the scope of this paper. 
