, "A complexity analysis of smart pixel switching nodes for photonic extended generalized shuffle switching networks," Journal of Quantum Electronics, Vol. 29, No. 2, February 1993, pp. 619-634. Abstract-This paper studies the architectural tradeoffs found in the use of smart pixels for nodes within photonic switching interconnection networks. The particular networks of interest within the analysis are strictly nonblocking extended generalized shuffle (EGS) networks. Several performance metrics are defined for the analysis and the effect of node size on these metrics is studied. Optimum node sizes are defined for each of the performance metrics and system-level limitations are also identified.
A Complexity Analysis of Smart Pixel Switching
Nodes for Photonic Extended Generalized Shuffle Switching Networks
I. INTRODUCTION
Free-space digital optics is a new interconnection technology that may permit signals to be routed between digital integrated circuits as beams of light propagating orthogonal to the plane of the device substrates and routed via bulk or micro-optical components, such as lenses, beam splitters, and holograms (Fig. 1) [1]. This approach to device connectivity may offer several system-level benefits, including high bandwidth, high-density connectivity (parallelism), low signal skew, low channel crosstalk, and lower overall system power dissipation. These benefits can help solve many packaging problems in the design of future high-speed telecommunication switching networks.
Initial designs for photonic multistage switching networks were based on optoelectronic device technologies with relatively large optical switching energies (≈ 1 pJ) and/or simple functionality [2]-[8]. Because of these limitations, many researchers are attempting to integrate electronic circuits with the first-generation optical device technologies to create a more powerful set of second-generation optical devices. The resulting integrated optoelectronic circuits will typically contain three distinct, spatially separated subsections on the device substrate: the input signal detection subsection, the signal processing subsection, and the output signal generation subsection. Usually, the input signal detection subsection contains an optical-to-electronic converter (such as a photodiode), an electronic amplifier, and a thresholding decision circuit to determine the binary value of the incoming signal. The electronic amplifier in the signal detection subsection helps lower the required optical energies and permits higher speed operation even with low-power laser sources. The signal processing subsection may contain many different forms of digital logic to implement the required switching functions, and it usually permits relatively complicated functions to be implemented. The output signal generation subsection can be implemented as an active source, such as a laser or light-emitting diode, or it can be implemented as a modulator that absorbs or transmits an optical probe beam generated by an external light source. These more functional second-generation optical devices are often called "smart pixels" [9]-[13].
Manuscript received April 20, 1992; revised July 7, 1992. The authors are with AT&T Bell Laboratories, Naperville, IL 60566. IEEE Log Number 9205872.
Switching networks based on smart pixels can be quite different from those designed for pure electronics or those designed for first-generation optical logic devices, and system architects are beginning to determine the system-level tradeoffs and explore the overall benefits that can be derived from the use of smart pixels and free-space digital optics in switching applications [14]-[19]. However, these benefits can only be derived from prudent combinations of electronics and optics, so system architects must try to determine where the partitions between the electronics and the optics should be drawn to provide the biggest system-level gains. This paper will attempt to answer these questions for a particular class of multistage switching network topologies known as extended generalized shuffle (EGS) networks [20]-[22].
II. EXTENDED GENERALIZED SHUFFLE (EGS) NETWORKS
The EGS class of multistage networks displays many desirable features, including low hardware costs, low blocking probabilities with the potential for nonblocking operation (given sufficient hardware), high degrees of fault tolerance, the ability to transport point-to-point or broadcast traffic, and relatively simple, fast path hunt operations [23], [24].
In addition to these general switching features, EGS networks also display several specific characteristics that are helpful for photonic applications. For example, the interconnection patterns used between the node stages can be easily modified as photonic technologies develop and new interconnection patterns become possible. Additionally, EGS networks can use many different types of switching nodes, which permits them to evolve as photonic technologies (and the advent of smart pixels) permit more powerful switching node designs. To minimize overall hardware costs, the designer of a photonic EGS network can also vary the number of node stages and the number of switching nodes per node stage while maintaining the network's operational characteristics (blocking probability, fault tolerance, etc.).
In general, an EGS network is a multistage interconnection network (MIN) that provides interconnections between adjacent stages of switching nodes, where a single stage is a set of identical switching nodes [Fig. 2 . . . , $s - 1$, the outlets of the switching nodes of $S_i$ are connected by links only to inlets of the switching nodes of $S_{i+1}$. Since all stage $i$ outlets must be connected on a one-to-one basis with all stage $i + 1$ inlets, it is required that $r_i m_i = r_{i+1} n_{i+1}$ for $1 \le i \le s - 1$.
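The one-to-one link requirement between adjacent node stages can be checked mechanically. The following is a minimal Python sketch, assuming each stage is described by a hypothetical tuple (r, n, m) giving the number of nodes in the stage, the inlets per node, and the outlets per node; the function name and tuple layout are illustrative and do not come from the paper.

```python
def links_are_consistent(stages):
    """Check r_i * m_i == r_(i+1) * n_(i+1) for every adjacent stage pair.

    `stages` is a list of (r, n, m) tuples: r nodes in the stage, each with
    n inlets and m outlets. This tuple layout is an assumption of the sketch.
    """
    return all(
        r_a * m_a == r_b * n_b
        for (r_a, _n_a, m_a), (r_b, n_b, _m_b) in zip(stages, stages[1:])
    )


# Hypothetical four-stage example: every link stage carries 8 links.
example = [(4, 1, 2), (4, 2, 2), (4, 2, 2), (4, 2, 1)]
print(links_are_consistent(example))
```

A stage list that fails the check cannot be wired with a one-to-one link stage, so a check of this kind is a cheap first step before evaluating any other network parameter.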
By definition, the interconnection pattern that is used between consecutive node stages in an EGS network must be topologically equivalent to the q-shuffle interconnection pattern [25] . The general q-shuffle interconnection topology can provide connections between ri nodes with the input link k on node stage i + 1 has the physical address given by:
where $F_i$ is the link-stage mapping function, $\lfloor A \rfloor$ represents the largest integer less than or equal to A, and (B modulo C) represents the integer remainder of the quotient B/C. This interconnection pattern is illustrated in Fig. 2(b).
EGS networks do not place any restrictions (other than those described above) on the number of nodes within the node stages or on the number of node stages (s) within the MIN. Any type of switching node can be used in the node stages, and the node type can be changed from node stage to node stage. However, a single node type must be used within a single node stage. The node types within different node stages of an EGS network can be described using a simple triplet notation (n, m, c), where n represents the number of inputs to the node, m represents the number of outputs from the node, and c represents the capacity of the node (the number of inputs that can be simultaneously routed to outputs without risk of being blocked) [26], [27]. Examples of the logic required for three different types of nodes are shown in Fig. 3.
To simplify some of the mathematical expressions related to EGS networks, it is beneficial to define another node parameter known as the alpha (α) of the node. The alpha of a node is closely related to the capacity (c) of the node. For nodes where the capacity (c) is equal to n or m (such as nodes constructed from small crossbar switches), the alpha is defined to be 1. For nodes where the capacity (c) is 1, the alpha is defined to be 0. For single cross-point nodes with n inputs and n outputs (also known as n-modules), the alpha is defined to be -1. The EGS networks described in this paper will be limited to the small subset of the general EGS class of networks that use node types with the alpha set to 1 or 0 for all of the node stages. Given this constraint, three useful design parameters for EGS networks are the omega (ω) of the network, the T of the network, and the U of the network.
(Note: Physical descriptions of these parameters are provided in the references [28] .) The omega of the network is defined as:
The T of the network is defined as the largest value of $i$ such that $\prod_{j=1}^{i} n_j \le N$, and the U of the network is defined as the largest value of $i$ such that $\prod_{j=s-i+1}^{s} m_j \le M$.
Fig. 3. Typical node types: (a) (2, 2, 2) node with alpha = 1; (b) (2, 1, 1) node with alpha = 0; (c) 2-module with alpha = -1.
An important parameter required in the design of EGS networks is the probability of blocking of the network, P(B).
The EGS network variables N, M, s, $n_i$ ($1 \le i \le s$), $m_i$ ($1 \le i \le s$), α, ω, T, and U can be used to determine if P(B) = 0 (indicating that the resulting EGS network is a strictly nonblocking network). However, the designer must first verify that two important constraints are satisfied.
Constraint #1: For the maximum value of $i$ that satisfies the inequality $\prod_{j=1}^{i} n_j \le \omega$, the following must be true:
Constraint #I-For the maximum value of i that satisfies the inequality ' I ; = n!, I w , the following must be true:
P = J
An EGS network that satisfies the above constraints can be shown to be strictly nonblocking for point-to-point (nonbroadcasting) connections if [20] :
One particular type of EGS network that is very useful for photonic switching applications is called a "fanout-switch-fanin EGS network." This subset of the EGS network class requires the number of network inputs (N) to be equal to the number of network outputs (M), and it also logically subdivides the s node stages of the network into three distinct functional units: the fanout section, the switching section, and the fanin section [Fig. 2(c)]. An important pair of network parameters related to this logical arrangement are the network fanout and the network fanin, both of which are assumed to be equal to the value F. The fanout section is actually the first node stage of the EGS network, so it must accept the N inputs to the network. The fanout section is composed of N (1, F, 1) switching nodes, and the NF output links from this section are directed into the first node stage of the switching section. The fanout section could be implemented in the electronics of the input interface, and the NF output links from the fanout section would be injected into the switching section on a fiber bundle array containing NF unique fibers [Fig. 2(c)]. The switching section is actually the middle s - 2 node stages of the EGS network, where each node stage contains NF/n n-input, n-output switching nodes with parameter α. The NF output links from the last stage of the switching section are directed into the fanin section. The fanin section is actually the last node stage of the EGS network, and it is composed of N (F, 1, 1) switching nodes. Thus, it must produce the N outputs for the network, which are typically routed into an output fiber bundle array.
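The stage layout of a fanout-switch-fanin EGS network can be written out explicitly. The following Python sketch is illustrative only: the function name, the (node_count, inputs, outputs) tuple layout, and the example values N = 8, F = 4, n = 2, s = 5 are assumptions chosen for the illustration and are not taken from Table I.

```python
def fanout_switch_fanin(N, F, n, s):
    """List the node stages as (node_count, inputs, outputs) tuples.

    Stage 1 is the fanout section, the middle s - 2 stages form the
    switching section, and stage s is the fanin section.
    """
    if s < 3:
        raise ValueError("at least one switching stage is needed (s >= 3)")
    if (N * F) % n != 0:
        raise ValueError("N * F must be divisible by n")
    fanout = (N, 1, F)               # N nodes of type (1, F, 1)
    switching = (N * F // n, n, n)   # NF/n nodes with n inputs and n outputs
    fanin = (N, F, 1)                # N nodes of type (F, 1, 1)
    return [fanout] + [switching] * (s - 2) + [fanin]


stages = fanout_switch_fanin(N=8, F=4, n=2, s=5)
for index, stage in enumerate(stages, start=1):
    print(index, stage)
```

In this logical description every link stage carries NF links, which is the quantity that later determines the size of the input fiber bundle.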
In some EGS network designs, there are many paths between any input and any output in the network. In fact, in an EGS network with N inputs, N outputs, a fanout (fanin) of F, and s - 2 node stages in the switching section containing n-input, n-output nodes, it can be shown that there are $F n^{s-2}/N$ paths between any network input and any network output. Each of these paths is typically numbered with a value v ranging from 0 to $F n^{s-2}/N - 1$.
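As a quick numerical illustration of the path count, the Python sketch below evaluates F n^(s-2)/N and the corresponding range of path numbers v. The function name and the example values are hypothetical, and the sketch simply rejects parameter sets for which the expression is not an integer.

```python
def path_count(N, F, n, s):
    """Number of paths between any input and any output: F * n**(s - 2) / N."""
    total = F * n ** (s - 2)
    if total % N != 0:
        raise ValueError("F * n**(s - 2) is not divisible by N")
    return total // N


paths = path_count(N=8, F=4, n=2, s=5)
path_numbers = range(paths)  # the path index v runs from 0 to paths - 1
print(paths, list(path_numbers))
```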
To control the network, there needs to be a method of choosing one path through the network from any input x to any output y. A particular path v through the network from input x to output y will pass through $\mathrm{node}_i(x, v, y)$ of the ith node stage, where
(5)
Methods have been devised that rapidly calculate the $F n^{s-2}/N$ paths through the network, allowing hardware to quickly determine the availability of those paths [29].
In a later section, it will be shown that the fanout value F is directly related to the size of the optoelectronic device array. Unfortunately, large device arrays typically have lower yields, so system designers of photonic EGS networks have avoided the use of large device arrays. As a result, the fanout value F has typically been constrained in these early designs to be as small as possible. After determining the minimum value of F required for strictly nonblocking operation, photonic switch designers may also constrain F to produce a rectangular array of nodes with R rows of nodes and C columns of nodes, where R and C are both powers of two. If all of these constraints are factored into (4), then typical values of F and s can be calculated for strictly nonblocking EGS networks of varying size (N) and with various types of nodes. These calculations have been carried out for various EGS network sizes ranging from N = 64 inputs to N = 8192 inputs. They have also been carried out for twelve different node types. These node types include (n, 1, 1) nodes with α = 0 and (n, n, n) nodes with α = 1. The value of n has been constrained to be an element of the integer set {2, 4, 8, 16, 32, 64}. The results of these calculations are summarized in Table I for EGS networks with N = 256, N = 1024, and N = 4096.
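The power-of-two constraint on the node array can be illustrated with a short Python sketch. It assumes that N and n are themselves powers of two, in which case rounding the minimum fanout up to the next power of two is enough to make NF/n factor into R rows and C columns that are both powers of two. The helper names and the choice of the most nearly square array are assumptions of the sketch; the minimum fanout itself comes from (4), which is not evaluated here.

```python
def round_fanout_up(f_min):
    """Smallest power of two that is >= f_min (f_min is a positive integer)."""
    if f_min < 1:
        raise ValueError("f_min must be a positive integer")
    return 1 << (f_min - 1).bit_length()


def array_shape(N, F, n):
    """Split the N*F/n nodes of one node stage into R rows and C columns.

    Both R and C are powers of two. Picking the most nearly square split
    (with R <= C) is an assumption of this sketch, not a rule from the text.
    """
    nodes, remainder = divmod(N * F, n)
    if remainder or nodes < 1 or nodes & (nodes - 1):
        raise ValueError("N * F / n must be a power of two")
    exponent = nodes.bit_length() - 1
    rows = 1 << (exponent // 2)
    return rows, nodes // rows


F = round_fanout_up(5)           # a hypothetical minimum fanout of 5
print(F, array_shape(64, F, 2))  # node array for N = 64, n = 2
```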
Another example of the flexibility of EGS networks is obtained through the use of pipes to subdivide the switching section of the EGS network into narrow, disjoint subnetworks [22]. The use of pipes permits an EGS network with a fanout of F to be implemented as p parallel networks each with a fanout of F/p, so the resulting node-stage sizes are effectively decreased by a factor of p and the overall reliability of the network can often be increased. Unfortunately, the total number of device arrays is increased by a factor of p when p pipes are used, so the benefits obtained from having smaller device arrays must be weighed against the disadvantages of having more device arrays. An example of an EGS network containing two pipes is shown in Fig. 4.
III. SWITCHING NODES BASED ON SMART PIXELS WITHIN EGS NETWORKS
At a minimum, every smart pixel must provide the circuitry required for input signal detection, signal processing, and output signal generation. To simplify the analyses of this paper, all combinational logic circuits must use only 2-input NAND gates, 2-input NOR gates, or 1-input buffer gates. In addition, the gate-level fanout is limited to two. For purposes of comparison, it will be assumed that each of these logic gates will require similar amounts of substrate area on the device array, which will be defined as one "gate area" ($A_{gate}$).
A. The Input Signal Detection Subsection
For purposes of comparison, it will be assumed that the input signal detection subsection will require the optical-to-electronic conversion hardware shown in Fig. 5. This hardware contains an S-SEED detector [6] and a single-stage amplifier circuit. It will also be assumed that the circuit in Fig. 5 introduces the equivalent of one logic gate delay and occupies an area on the device substrate equal to one gate area. As a result, for an n-input, n-output switching node, the input signal detection subsection will occupy an area on the device substrate equal to n gate areas.
B. The Output Signal Generation Subsection
The output signal generation subsection must provide electronic-to-optical conversion hardware. In all of the designs within the paper, it will be assumed that the output signal generation subsection must also provide a latch function (using a master-slave flip-flop) so that all of the optical signals leaving a device array are properly synchronized. It will be assumed that each smart pixel switching node requires one optical-to-electronic conversion hardware unit for clock derivation. Thus, the output signal generation subsection will require the hardware shown in Fig. 6. This hardware contains one optical-to-electronic conversion unit per smart pixel switching node for clock derivation, the master-slave flip-flop for bit-level synchronization, a single-stage amplifier circuit, and an S-SEED modulator. It will be assumed that the circuit in Fig. 6 introduces the equivalent of eight logic gate delays into the smart pixel circuit, and it occupies an area on the device substrate equal to eleven gate areas (plus one gate area per node for the optical-to-electronic conversion unit). Since there is only one of these hardware units associated with each (n, 1, 1) node, the output signal generation subsection for each (n, 1, 1) node will occupy an area equal to 11 + 1 = 12 gate areas. Since there are n of these hardware units associated with each (n, n, n) node, the output signal generation subsection for each (n, n, n) node will occupy an area equal to 11n + 1 gate areas.
C. The Signal Processing Subsection
The signal processing subsection for a smart pixel switching node can typically be divided even further into two subfunctions: switching and control injection. Thus, each smart pixel switching node must provide the necessary combinational logic for these two basic tasks.
1) The switching subfunction: The implementation of the switching subfunction within a smart pixel switching node is relatively straightforward, because any node type can be designed using a set of multiplexer circuits. An (n, 1, 1) node requires the single n:1 multiplexer circuit shown in Fig. 7, which contains n 2-input NAND gates (for selection) and 2n - 3 NAND gates (for combination), requiring a total of 3n - 3 logic gates that occupy 3n - 3 gate areas on the device substrate. The 2n - 3 NAND gates used for combination are arranged in a $\log_2(n)$-stage tree structure where each stage but the last places two NAND gates in series on each signal path, so the n:1 multiplexer has a total delay of $2\log_2(n)$ gate delays.
The switching subfunction for an (n, n, n) node requires n sets of n:1 multiplexer circuits to be combined together following a 1:n fanout of each of the input signals (Fig. 8). Since the logic gates are limited to gate-level fanouts of two, each of the n 1:n fanout sections must be implemented using n - 1 buffer gates arranged in a $\log_2(n)$-stage tree structure. The schematic in Fig. 8 contains n sets of 1:n fanout circuits followed by n sets of n:1 multiplexer circuits, requiring a total of $n(n - 1) + n(3n - 3) = 4n^2 - 4n$ logic gates occupying $4n^2 - 4n$ gate areas on the device substrate. The total delay of the circuit in Fig. 8 is given by $3\log_2(n)$ gate delays. It is capable of routing any input to any output, and each of the n outputs can simultaneously receive data from any one of the n inputs.
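The gate counts and delays quoted for the switching subfunction can be tabulated with a few lines of Python. This is only a sketch of the arithmetic in the text, assuming n is a power of two; the function names are illustrative.

```python
def log2_exact(n):
    """Return log2(n) for a power of two n >= 2."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1


def mux_gates(n):
    """n:1 multiplexer: n selection NANDs plus 2n - 3 combining NANDs."""
    return n + (2 * n - 3)


def mux_delay(n):
    """Delay of the n:1 multiplexer in gate delays."""
    return 2 * log2_exact(n)


def crossbar_gates(n):
    """(n, n, n) switching logic: n fanout trees plus n multiplexers."""
    return n * (n - 1) + n * mux_gates(n)


def crossbar_delay(n):
    """Delay of the (n, n, n) switching logic in gate delays."""
    return 3 * log2_exact(n)


for size in (2, 4, 8, 16, 32, 64):
    print(size, mux_gates(size), mux_delay(size),
          crossbar_gates(size), crossbar_delay(size))
```

The (n, n, n) gate count grows quadratically with n, while the (n, 1, 1) count grows only linearly, which is the main driver of the substrate-area results discussed later.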
2) The control injection subfunction: The control injection subfunction for a particular node must provide a means for routing the control signals into the node and a means for latching the control signals within the node while data is passing through the node. Many different control injection techniques have been developed for photonic switching applications [24]. Only one of these approaches will be considered within this paper. This approach is known as centralized control injection based on packet headers, or the embedded control approach [30]. EGS network operation using this control injection technique requires that the incoming data be buffered and synchronized at the input of the network (Fig. 9). In Fig. 9, a call request is transmitted to a remote, centrally located, electronic path hunt processor, which has global information regarding the status of all of the nodes in the network. The path hunt processor calculates an idle path to satisfy the request, and the results of the path hunt (control information) are routed to the input of the EGS network. During the guard-band interval, the incoming data is buffered at the input, and the control information is routed through the network inputs and through the network node stages to the control memory latches in the smart pixel nodes. Once all of the nodes in the network have stored the appropriate X control bits, the guard-band interval is terminated, and the buffered data at the network input can then be routed based on the control signals that are stored in the control memory latches within the nodes.
The complexity of the hardware needed for the control memory latches depends on the complexity of the switching logic. If decoding logic is not used within the switching nodes, then an (n, 1, 1) switching node will require n latches (master-slave flip-flops) to store n control bits, and the latches must be arranged as an n-bit shift register. An (n, n, n) switching node will require $n^2$ latches (master-slave flip-flops) to store $n^2$ control bits, and the latches must be arranged as n sets of n-bit shift registers. The clock signal for each master-slave flip-flop within these shift register chains can be derived from an externally distributed optical clock source, so optical-to-electronic conversion hardware for the clock signal must also be provided for each smart pixel switching node. Thus, the control injection hardware for the (n, 1, 1) switching node shown in Fig. 10 will occupy an area on the device substrate equal to 10n + 1 gate areas (where one gate area per node is for the optical-to-electronic conversion unit), and the control injection hardware for the (n, n, n) switch-
The length of the guard-band interval is directly related to the number of node stages (s) and the complexity of the switching logic. If each switching node requires X control bits to uniquely define the routing state for the node, then the network must load sX control bits through the input of the network before data can be routed. As a result, the data rate through the switching fabric must be higher than the data rate on the transmission lines entering the switch.
The effective speedup of the data rate is given by $(D + sX)/D$. Since the (n, 1, 1) node requires X = n control bits to be latched, the effective speedup for an EGS network with embedded control and (n, 1, 1) nodes is given by $(D + sn)/D$. Since the (n, n, n) node requires $X = n^2$ control bits to be latched, the effective speedup for an EGS network with embedded control and (n, n, n) nodes is given by $(D + sn^2)/D$.
D. Total Hardware Complexity and Delay in Smart Pixel Switching Nodes
Table II summarizes the results of the previous sections. From the switching node hardware requirements described above, it can be shown that a single (n, 1, 1) switching node will introduce $2\log_2(n) + 9$ gate delays to the circuit and occupy an area on the device substrate equal to 14n + 10 gate areas. A single (n, n, n) switching node will introduce $3\log_2(n) + 9$ gate delays to the circuit and occupy an area on the device substrate equal to $14n^2 + 8n + 2$ gate areas.
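The per-node totals can be cross-checked by summing the subsection contributions quoted earlier. The Python sketch below does this under two stated assumptions: the control injection area for the (n, n, n) node is taken as 10n^2 + 1 gate areas, a value inferred here from the quoted total rather than read from the text, and D in the speedup expression is treated as a generic positive number, because its definition does not appear in this section. The node-type labels "n11" and "nnn" are hypothetical names used only in this sketch.

```python
def log2_exact(n):
    """Return log2(n) for a power of two n >= 2."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    return n.bit_length() - 1


def node_area(n, kind):
    """Gate areas per subsection for an (n, 1, 1) or (n, n, n) node."""
    if kind == "n11":
        return {"input": n, "output": 11 + 1,
                "switching": 3 * n - 3, "control": 10 * n + 1}
    if kind == "nnn":
        # The control entry is inferred from the quoted total (assumption).
        return {"input": n, "output": 11 * n + 1,
                "switching": 4 * n * n - 4 * n, "control": 10 * n * n + 1}
    raise ValueError("kind must be 'n11' or 'nnn'")


def node_delay(n, kind):
    """Gate delays: 1 (input detection) + switching + 8 (output generation)."""
    stages = {"n11": 2, "nnn": 3}[kind]
    return 1 + stages * log2_exact(n) + 8


def speedup(D, s, X):
    """Effective data-rate speedup (D + s*X) / D for s stages, X bits each."""
    return (D + s * X) / D


for size in (2, 4, 8):
    print(size, sum(node_area(size, "n11").values()),
          sum(node_area(size, "nnn").values()),
          node_delay(size, "n11"), node_delay(size, "nnn"))
```

Summing the four subsections reproduces 14n + 10 and 14n^2 + 8n + 2 for every node size considered, and the delays reduce to 2 log2(n) + 9 and 3 log2(n) + 9.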
The optical hardware that might be used to provide the connections from one node stage to another node stage is shown schematically in Fig. 11 [1]. Within this hardware, it is assumed that optical modulators and photodetectors are used to provide the electronic-to-optical and optical-to-electronic conversions on the smart pixel switching nodes. As a result, an external laser power supply is needed to probe the state of the optical modulators, and two other external lasers are also needed to provide the synchronizing clocks for the latches that store data and control signals within the switching nodes. The outputs from these external laser sources are routed through spot array generating binary phase gratings to produce an array of beams that interrogate the states of the optical modulators. The beams are passed through the polarizing beamsplitter and are imaged by the objective lens and microlenses onto the device substrate containing the smart pixels. The clock pulses are absorbed by the photodetectors to produce electronic clock signals within the smart pixels, while the probe pulses are reflected from the modulator windows on the device substrate. These reflected beams carry binary information toward the smart pixels in the next node stage. After passing through the link-stage interconnection optics and the magnification lenses, the beams are reflected by the second polarizing beamsplitter toward the dichroic beamsplitter. The beams are then reflected back down through the polarizing beamsplitter and 
IV. LINK-STAGE INTERCONNECTIONS FOR EGS NETWORKS WITH SMART PIXEL SWITCHING NODES
The links within the link stages provide connections between adjacent node stages, and in a photonic switching implementation, they are implemented using appropriately routed beams of light. For photonic EGS networks, all of the interconnection patterns must be topologically equivalent to a q-shuffle interconnection. Several different types of optical interconnection topologies that are isomorphic to the q-shuffle have been proposed within the literature.
The 2-D q-shuffle can be optically implemented by making q copies of the output image from the source device array, appropriately shifting and interleaving these multiple copies, masking out the superfluous beams (image plane spots), and magnifying the interleaved image by a factor of 4 to produce the final output image that is routed to the receiving device array [38]. The magnification step will produce a spot in the receiving device array whose area is q times larger than the area of the spot in the source device array, but it will be assumed that the use of microlenses can minify this spot image back to its standard (nonmagnified) size [44]. The masking operation for the 2-D q-shuffle will permit only (1/q)th of the resulting beams (image plane spots) to be routed to the receiving device array.
The creation of multiple copies and the shifting and interleaving of these copies within a 2-D q-shuffle can be accomplished using space invariant, computer generated, binary phase gratings [19], [45] , [46] . Optical q-shuffle 
The increase due to optical inefficiencies = $(1/\eta)$, where $\eta$ is the efficiency of the optics due to grating losses, Fresnel losses, and vignetting. Thus, the total optical power required to drive the source device array is given by $P_{tot} = 2NFnP_{det}/\eta$.
(n, 1, 1) nodes, 2n for (n, n, n) nodes n2 for (n, 1 , 1) nodes, n for (n, n , n) nodes
V. CRITICAL PERFORMANCE METRICS FOR EGS NETWORKS WITH SMART PIXEL SWITCHING NODES
When comparing system architectures to determine their relative feasibility, practicality, and desirability, system designers must carefully define the criteria by which the different architectures will be judged. However, for first-order system designs, only the most critical performance metrics need to be analyzed, and the relative desirability of a particular design can usually be specified by some form of cost function, which is a weighted sum of the calculated performance metrics.
For the first-order design of photonic EGS switching networks based on smart pixel switching nodes, there are several critical performance metrics that must satisfy desired system-level requirements. Each of these critical performance metrics is described below, along with its calculated value for each of the EGS network designs outlined in Table I (with p = 1 pipe and p = 4 pipes). To produce a fair comparison, the blocking probability is held constant (at zero) for all of the different network designs. Thus, all of the networks are strictly nonblocking networks that satisfy the constraints outlined in (4).
A. Total Number of Optical Components Within the System
There are approximately 15 optical components (lenses, beamsplitters, etc.) associated with a single smart pixel device array within the system of Fig. 11. Since there are (s - 1) device arrays per pipe and p pipes in the system, the total number of optical components within the system is given by 15(s - 1)p. This metric is plotted in Fig. 14. Since the overall system cost is typically related to the total number of components, it is usually beneficial to keep this value low. Thus, the plots in Fig. 14 indicate that larger node sizes would be more desirable, and designs with fewer pipes will typically require fewer components.
B. Total Number of Fibers in Input Fiber Bundle
The cost of the EGS input section may be dominated by the cost of the input fiber connectors, the input lasers, and the drivers. Thus, the cost of the input section will be directly related to the number of fibers in the input fiber bundle. There will be NF/p fibers required in each of the p input fiber bundles if the fanout is implemented in electronics, and this metric is plotted in Fig. 15. The plots in Fig. 15 indicate that smaller node sizes would probably be more desirable for (n, 1, 1) nodes, while larger node sizes would probably be more desirable for (n, n, n) nodes. The plots also indicate that designs with (n, n, n) nodes generally require fewer fibers per array than designs with (n, 1, 1) nodes, and designs with more pipes will also require fewer fibers per array.
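The first two metrics are simple closed-form expressions, and the Python sketch below evaluates them side by side. The function names and the example values of s, N, and F are hypothetical and are not taken from Table I.

```python
def optical_components(s, p, per_array=15):
    """Total optical components: about 15 per device array, (s - 1) * p arrays."""
    return per_array * (s - 1) * p


def fibers_per_bundle(N, F, p):
    """Fibers in each of the p input bundles when fanout is done in electronics."""
    if (N * F) % p != 0:
        raise ValueError("N * F must be divisible by p")
    return N * F // p


# Hypothetical design point: s = 9 node stages, N = 256 inputs, fanout F = 8.
for pipes in (1, 4):
    print(pipes, optical_components(9, pipes), fibers_per_bundle(256, 8, pipes))
```

The two metrics pull in opposite directions with respect to the number of pipes: more pipes multiply the component count but shrink each fiber bundle.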
C. Total Substrate Area Occupied by the Smart Pixel Switching Nodes in a Node Stage
Large device substrate areas are undesirable because they tend to have lower processing yields due to material and device defects. In addition, the objective lens that images spots on the device array must have a large field of view for large device substrates, so the cost of the objective lens is closely related to the total substrate area. The total substrate area required for a single device array is given by the product of the number of switching nodes per node stage per pipe (NF/(np)) and the area occupied by the logic within a single switching node. For the purposes of calculation, the gate area occupied by a single logic gate will be assumed to be a 20 µm by 20 µm area, so a gate area is $A_{gate}$ = 400 µm². For (n, 1, 1) nodes, the total substrate area required for a single device array is $(NF/(np))(14n + 10)A_{gate}$. For (n, n, n) nodes, the total substrate area required for a single device array is $(NF/(np))(14n^2 + 8n + 2)A_{gate}$. This metric is plotted as a function of node size n in Fig. 16, and it indicates that smaller node sizes would probably be more desirable. In addition, the use of more pipes also results in smaller substrate areas.
Fig. 15. Number of fibers in bundle array versus node size (n) for nonblocking EGS networks.
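The substrate-area expressions can be evaluated directly. The Python sketch below assumes a gate area of 400 square micrometers (a 20 by 20 micrometer gate) and uses hypothetical values of N and F for the example; the node-type labels "n11" and "nnn" are illustrative names, not notation from the paper.

```python
GATE_AREA_UM2 = 400  # assumed 20 um x 20 um logic gate


def gates_per_node(n, kind):
    """Gate areas per node: 14n + 10 for (n,1,1), 14n^2 + 8n + 2 for (n,n,n)."""
    if kind == "n11":
        return 14 * n + 10
    if kind == "nnn":
        return 14 * n * n + 8 * n + 2
    raise ValueError("kind must be 'n11' or 'nnn'")


def array_area_um2(N, F, n, p, kind):
    """Substrate area of one device array in square micrometers."""
    if (N * F) % (n * p) != 0:
        raise ValueError("N * F must be divisible by n * p")
    nodes = N * F // (n * p)
    return nodes * gates_per_node(n, kind) * GATE_AREA_UM2


# Hypothetical design point: N = 256, F = 8, n = 2, one pipe.
for node_kind in ("n11", "nnn"):
    area = array_area_um2(256, 8, 2, 1, node_kind)
    print(node_kind, area, "um^2 =", area / 1e6, "mm^2")
```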
D. Complexity of Spot Array Generating Binary Phase Grating
The output power from the lasers in Fig. 11 must be split into a set of equal intensity spots that interrogate the state of the modulators in the smart pixel device array. Binary phase gratings can be used for this function. The complexity of these spot array generating binary phase gratings can be related to the number of etch depth transitions per period, which can be related to the number of spots created along one dimension of the two dimensional output array [47] .
For (n, 1, 1) nodes, there is one modulator per switching node and there are NF/(np) switching nodes per smart pixel array, so the grating must create a total of NF/(np) output spots, and the complexity (number of transitions per grating period) of the grating is proportional to $\sqrt{NF/(np)}$. For (n, n, n) nodes, there are n modulators per switching node and there are NF/(np) switching nodes per smart pixel array, so the grating must create a total of NF/p output spots, and the complexity of the grating is proportional to $\sqrt{NF/p}$. The complexity metrics plotted in Fig. 17 indicate that larger node sizes and more pipes produce more desirable results.
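Because the grating complexity scales with the number of spots along one dimension of a two-dimensional array, it can be estimated as the square root of the total spot count. The Python sketch below follows that reading; the function names, the node-type labels, and the example values are assumptions, and the result is only a proportionality factor rather than an absolute transition count.

```python
import math


def spot_count(N, F, n, p, kind):
    """Output spots per array: one modulator per (n,1,1) node, n per (n,n,n) node."""
    nodes = N * F / (n * p)
    modulators_per_node = {"n11": 1, "nnn": n}[kind]
    return nodes * modulators_per_node


def grating_complexity(N, F, n, p, kind):
    """Relative complexity: spots along one dimension of the 2-D spot array."""
    return math.sqrt(spot_count(N, F, n, p, kind))


# Hypothetical design point: N = 256, F = 8, n = 2.
print(grating_complexity(256, 8, 2, 1, "n11"))
print(grating_complexity(256, 8, 2, 2, "nnn"))
```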
E. Probability of the System Being Operational
The lasers in the system have relatively high failure rates (when compared to the other components), so the probability that the system is operational is tightly coupled to the probability that a laser has failed. As shown in Fig. 11, there are three lasers associated with each of the smart pixel device arrays. Since there are (s - 1) device arrays within each pipe and p pipes within the system, the total number of lasers required within the system is given by 3(s - 1)p. For a system with p = 1 pipe, if the probability that a particular laser is operational is given by P(O), then the probability that the system is operational is the probability that all of the lasers in the pipe are operational, which is given by $P(O)^{3(s-1)}$. If the system has multiple pipes (p), then some criteria must be established to identify satisfactory system-level operation before one can determine the probability that the system is operational. For simplicity, assume that a p-pipe system is operational if at least one of its pipes is operational. The probability that the system is operational is then given by $1 - (1 - P(O)^{3(s-1)})^p$. In Fig. 18, this metric is plotted as a function of node size n assuming P(O) = 0.999.
The plots indicate that larger node sizes would probably be more desirable, and the use of multiple pipes greatly improves the probability of having an operational system.
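The reliability expressions in this subsection can be evaluated with a short Python sketch. It assumes that laser failures are statistically independent (implicit in the product form used in the text); the function names are hypothetical, and the number of node stages s must be supplied by the caller.

```python
def pipe_operational_probability(p_laser: float, s: int) -> float:
    """Probability that all 3(s - 1) lasers in a single pipe are operational."""
    return p_laser ** (3 * (s - 1))


def system_operational_probability(p_laser: float, s: int, pipes: int) -> float:
    """Probability that at least one of `pipes` pipes is operational.

    Assumes independent laser failures, as the product form in the text implies.
    """
    p_pipe = pipe_operational_probability(p_laser, s)
    return 1.0 - (1.0 - p_pipe) ** pipes
```

With one pipe the second function reduces to the first, and adding pipes can only raise the result, which matches the trend described for Fig. 18.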
F. Minimum Laser Power per Stage Required for Probe Laser
The probe laser will typically require much more output power than the two clock lasers in Fig. 11 , because its output must reflect off of the modulators in the source device array, pass through the link-stage interconnection optics, and drive the detectors in the receiving device array. Thus, the maximum power required by a single laser in the system will typically be determined by the probe laser requirements.
If the detectors in the second device array require power levels given by $P_{det}$ to switch at the desired data rate, then the laser power required for a single probe laser driving the nodes can be shown to be $2NFnP_{det}/(p\eta)$. In Fig. 19, this metric is plotted as a function of node size n assuming $P_{det}$ = 50 μW and $\eta$ = 0.1, indicating that smaller node sizes are more desirable. The plots also show that systems with (n, n, n) nodes will typically require much less laser power than systems with (n, 1, 1) nodes. The system with four pipes also has lower laser power requirements than the system with one pipe, but it also requires more of the low-powered lasers.
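The probe laser expression quoted above is simple to evaluate numerically. The sketch below is only an illustration of that single expression: the function name is hypothetical, eta stands for the η factor in the formula, and F must be taken from the network design rules given earlier in the paper.

```python
def probe_laser_power(N: int, F: int, n: int, p_det: float, pipes: int, eta: float) -> float:
    """Evaluate 2*N*F*n*P_det / (p*eta); the result has the same unit as p_det."""
    if pipes < 1 or eta <= 0:
        raise ValueError("pipes must be >= 1 and eta must be positive")
    return 2 * N * F * n * p_det / (pipes * eta)
```

Because the number of pipes appears only in the denominator, going from one pipe to four pipes divides the required power per probe laser by four, at the cost of four times as many lasers.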
G. Power Density on the Device Substrate
The choice of a thermal management technique within a system is primarily determined by the power density on the device substrate, which is measured in units of power per unit area. There are two sources of power dissipation on the device arrays within the system: electrical power and laser power. The power density for the device substrate is given by the total power dissipated on the device substrate divided by the area of the device substrate. The substrate area can typically be reduced if more pipes are used. However, for this analysis, it will be assumed that the substrate area is fixed at the size found for one pipe even if multiple pipes are used. This will help lower the power densities on the substrates in the multiple-pipe systems. However, the use of larger substrate areas can lead to lower device yields and the need for expensive lenses with large fields of view.
To develop a formula for dissipated power, some simplifying assumptions must be made. First, it is assumed that each of the equivalent gate areas on the device array contributes $P_e$ units of electrical power to the substrate. This power is typically dissipated in the FETs that make up the analog amplifiers and the digital logic gates. Second, it is assumed that all three of the lasers associated with a single device array (Fig. 11) are operated at the same power level, which is defined by the requirements on the detected laser power ($P_{det}$) for the probe laser (see Fig. 19). Third, some of this laser power is typically lost in the interconnection optics, and it will be assumed that only a fraction (d) of the power from these three lasers is actually dissipated in the modulators and detectors of the smart pixel device array.
For ( n , 1, 1) nodes, the power density is given by:
For (n, n, n ) nodes, the power density is given by:
In Fig. 20, the power density metric is plotted as a function of node size n assuming $P_e$ = 100 μW, $P_{det}$ = 50 μW, d = 0.5, $\eta$ = 0.1, and $A_{gate}$ = 400 μm². The plots indicate that smaller node sizes are more desirable, and systems with (n, n, n) nodes would have lower power densities than systems with (n, 1, 1) nodes. Because the device areas were not decreased as more pipes were added, the power density on multiple-pipe systems was also decreased.
H. Required Speedup of the Internal System Data Rate
The data rate through the switching fabric must be higher than the data rate on the signal lines entering the switch, because the control information must be injected into the network along with the raw data. If the raw data packets contain D bits, then the effective speedup of the data rate for (n, 1, 1) nodes was shown to be given by $(D + sn)/D$, while the effective speedup of the data rate for (n, n, n) nodes was shown to be given by $(D + sn^2)/D$.
If ATM traffic is assumed, then the length of a single data packet is given by D = 424 bits. This metric is plotted as a function of node size n in Fig. 21. The plots in Fig. 21 indicate that smaller node sizes would probably be more desirable. In addition, the plots show that (n, 1, 1) nodes offer a slight advantage over (n, n, n) nodes. This metric is not affected by the use of pipes.
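The two speedup expressions can be compared directly with a small Python sketch. The function names are hypothetical, and the number of node stages s is passed in rather than derived, since it depends on N and n through the network design rules.

```python
ATM_CELL_BITS = 424  # D for ATM traffic, as assumed in the text


def speedup_n11(s: int, n: int, d_bits: int = ATM_CELL_BITS) -> float:
    """Effective data-rate speedup (D + s*n) / D for (n, 1, 1) nodes."""
    return (d_bits + s * n) / d_bits


def speedup_nnn(s: int, n: int, d_bits: int = ATM_CELL_BITS) -> float:
    """Effective data-rate speedup (D + s*n^2) / D for (n, n, n) nodes."""
    return (d_bits + s * n * n) / d_bits
```

For the same s and n, the (n, n, n) speedup is never smaller than the (n, 1, 1) speedup, because each node carries n² rather than n control bits.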
I. Total Network Latency from Input to Output
The network latency (delay) from the EGS inputs to the EGS outputs is particularly important for applications that use bidirectional data paths for feedback data loops. These applications include computer data transfers and voice communication. In the absence of signal skew problems, the clocks driving the flip-flop chains could be operated at a frequency defined by the gate delays within a single stage of the shift register chain. Assuming that a signal leaving one flip-flop must propagate through all of the
A single (n, 1, 1) switching node will introduce $2\log_2(n) + 9$ gate delays, so the total network latency in passing through the s node stages is given by $s(2\log_2(n) + 9)$ gate delays. A single (n, n, n) switching node will introduce $3\log_2(n) + 9$ gate delays, so the total network latency in passing through the s node stages is given by $s(3\log_2(n) + 9)$ gate delays. This metric is plotted as a function of node size n in Fig. 22 assuming each gate contributes 1 ns of delay. The plots indicate that larger node sizes would probably be more desirable.
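A minimal Python sketch of the latency expressions follows. The function name and node-type labels are hypothetical; the 1 ns default gate delay is the value assumed for Fig. 22, and s must be supplied by the caller.

```python
import math


def network_latency_ns(s: int, n: int, node_type: str, gate_delay_ns: float = 1.0) -> float:
    """Total latency through s node stages, in nanoseconds.

    (n, 1, 1) nodes add 2*log2(n) + 9 gate delays per stage;
    (n, n, n) nodes add 3*log2(n) + 9 gate delays per stage.
    """
    factor = {"n11": 2, "nnn": 3}[node_type]  # KeyError on an unknown node type
    gate_delays_per_node = factor * math.log2(n) + 9
    return s * gate_delays_per_node * gate_delay_ns
```

Each node becomes slower as n grows, but the stage count s shrinks with n, which is presumably why the plots favor larger nodes.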
J. Path Hunt Algorithm Complexity
For each connection request, the path hunt processor must identify an idle path between the desired input and the desired output. In EGS networks, it was shown that there are $Fn^{s-2}/N$ paths between each input and output. In the worst case, the path hunt algorithm that identifies the idle path would have to hunt through all of these paths before finding an idle one. Thus, in the absence of parallel processing techniques, the worst-case amount of time required to implement this path hunt algorithm and the worst-case complexity of this algorithm is given by $Fn^{s-2}/N$. This simple metric is plotted as a function of node size n in Fig. 23. The optimum choice of a node size based on the complexity of the path hunt algorithm is not clear from these plots; however, it is clear that some choices are much worse than others.
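The worst-case behavior described above can be illustrated with a sequential search. This is a sketch under stated assumptions, not the path hunt algorithm used by the authors: the candidate paths and the idle test are hypothetical placeholders supplied by the caller.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Path = TypeVar("Path")


def paths_per_io_pair(F: int, n: int, s: int, N: int) -> float:
    """Number of paths between one input and one output: F * n^(s-2) / N."""
    return F * n ** (s - 2) / N


def hunt_idle_path(
    candidates: Iterable[Path], is_idle: Callable[[Path], bool]
) -> Tuple[Optional[Path], int]:
    """Sequentially test candidate paths; return (first idle path, paths examined).

    Returns (None, count) if every candidate is busy. In the worst case the
    loop examines every candidate, i.e. F * n^(s-2) / N paths.
    """
    examined = 0
    for path in candidates:
        examined += 1
        if is_idle(path):
            return path, examined
    return None, examined
```

For example, with four candidate paths of which only the last is idle, the search examines all four before succeeding, which is the worst case counted by the metric.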
K. Control Bandwidth Requirements Between Path Hunt Processor and Inputs
Once the path hunt processor has calculated an idle path for a connection to use, the resulting control information must be injected into the switching nodes within the EGS switching network. As described above, the embedded control injection technique is assumed, so the control information from the path hunt processor must be routed to the inputs of the EGS network before being launched along the optical data paths during a guard-band interval. The aggregate bandwidth required for transmission of these control bits between the path hunt processor and the network inputs is given by the total number of control bits stored in the smart pixel switching nodes in all of the network stages divided by the transfer time $T_{transfer}$. Within this analysis, it will be assumed that $T_{transfer}$ is 25% of a packet interval for 155 Mbps ATM data packets, so $T_{transfer}$ = 684 ns. For (n, 1, 1) nodes, each smart pixel must store n control bits. There are a total of s(NF/n) smart pixels in the network, so a total of sNF control bits are required. As a result, the aggregate bandwidth between the path hunt processor and the network inputs is given by $(sNF)/T_{transfer}$. For (n, n, n) nodes, each smart pixel must store $n^2$ control bits. There are a total of s(NF/n) smart pixels in the network, so a total of sNFn control bits are required. As a result, the aggregate bandwidth between the path hunt processor and the network inputs is given by $(sNFn)/T_{transfer}$. This metric is plotted as a function of node size n in Fig. 24. The plots indicate that the required control bandwidth is rather weakly dependent on the node size. However, (n, n, n) nodes seem to require slightly lower bandwidths than (n, 1, 1) nodes.
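The transfer time and the aggregate control bandwidth can be checked with the Python sketch below. It assumes that "155 Mbps" means a nominal 155 × 10⁶ bit/s and that a packet is one 424-bit ATM cell; the function names and node-type labels are hypothetical, and s and F must come from the network design rules.

```python
ATM_CELL_BITS = 424     # bits per ATM cell
LINE_RATE_BPS = 155e6   # assumed nominal line rate in bit/s


def transfer_time_s(fraction: float = 0.25) -> float:
    """Transfer time as a fraction of one packet interval, in seconds."""
    return fraction * ATM_CELL_BITS / LINE_RATE_BPS


def control_bandwidth_bps(s: int, N: int, F: int, n: int, node_type: str,
                          t_transfer_s: float) -> float:
    """Aggregate control bandwidth: total stored control bits / transfer time."""
    if node_type == "n11":
        bits_per_node = n        # (n, 1, 1) nodes store n control bits
    elif node_type == "nnn":
        bits_per_node = n * n    # (n, n, n) nodes store n^2 control bits
    else:
        raise ValueError("node_type must be 'n11' or 'nnn'")
    smart_pixels = s * N * F / n  # s(NF/n) smart pixels in the network
    return smart_pixels * bits_per_node / t_transfer_s
```

Under these assumptions, `transfer_time_s()` evaluates to roughly 684 ns, in agreement with the value quoted in the text.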
L. Probability a Selected Path is Blocked by Faulty Nodes
Tolerance to faulty nodes is a desirable attribute within any switching network. The presence of faulty nodes in an EGS network will obviously have an effect on the blocking probability, because fewer paths will be available for the routing of calls. An approximation for the blocking probability P(B) of an EGS network in the presence of faulty nodes is given by P(B) =
where s is the total number of node stages (including the two stages added by the fan-out and fan-in sections), and f is the fraction of nodes that are faulty within the switching section. (Note: It has been shown that the fault tolerance of EGS networks can be greatly improved by increasing the fan-out and fan-in values beyond the values required for strictly nonblocking operation and rerouting blocked calls [22]. This approach is not considered within the approximations above.)
This fault-tolerance metric is plotted as a function of node size n in Fig. 25. Within these plots, it is assumed that one percent of the nodes within the switching section are faulty (f = 0.01). The plots indicate that larger node sizes would probably be more desirable. The fault tolerance of the system is not dependent on the node type or the number of pipes.
Table III summarizes the formulas developed within the previous sections. A cost function can be defined for the various EGS networks if the designer assigns relative weights to each of the metrics within Table III. A particular cost function will not be defined within this paper, because the relative weights are too tightly coupled to the unique capabilities of the technologies and the requirements of the particular application. Since technological capabilities and application requirements are always changing over time, the cost functions must also be changed. Nevertheless, several system-level limitations are clear from the results outlined within the plots above, and design areas requiring further improvement can also be identified. As might be expected, the plots indicate that most of the system-level problems are exacerbated by larger network sizes (N).
VI. DISCUSSION
The most serious system-level limitation is illustrated by the results shown in Figs. 19 and 20. The amount of laser power required to drive a single device array is relatively high, so multiple sources per device array may be required. In addition, the power density on a single device array is also high, so specialized thermal management techniques may be required. To minimize these problems, system designers may want to use pipes within the EGS networks. If p pipes are used within an EGS network, then the required laser power per device array and the corresponding amount of power dissipated on a device array are effectively decreased by a factor of p (at the expense of the total number of optical components, which is increased by a factor of p). Another helpful solution to the power problem may be found through the use of microchannels [48], which may permit direct steering of output beams to desired destinations, eliminating the optical fanout problems found in the 2-D q-shuffle implementations. Because of the power problems identified in Figs. 19 and 20, system architects may be forced to work with small values of n in their system designs.
Several other system-level problems can also be identified within the plots above. For example, for large network sizes (N), a relatively large number of fibers are required for the input fiber bundle arrays (Fig. 15). As a result, piped EGS networks using multiple input bundles may be required, and modifications to the embedded control injection scheme may also be warranted. Another limitation can be seen in Fig. 16, where the required device substrate area becomes relatively large due to the complexity of the electronics within the nodes. Because of the yield problems that may result from these large device arrays and the cost of the required objective lens, system designers may again be driven to subdivide the EGS network into pipes.
Finally, the aggregate bandwidth required for routing of control signals between the path hunt processor and the network inputs is relatively high (Fig. 24) . Thus, many parallel paths will probably be required to provide this needed bandwidth.
VII. CONCLUSION
The evolution toward smart pixels within photonic switching applications is occurring at a rapid pace, because smart pixel technologies may solve the problems caused by large switching energies and limited functionality within first-generation optical logic devices. The application of smart pixels within photonic switching networks must be justified from an architectural point of view as well as from a technological point of view, and this paper has studied the architectural tradeoffs found in using smart pixels for nodes within switching networks.
The particular networks in the analysis were strictly nonblocking EGS networks with various numbers of inputs and outputs, ranging from N = 256 to N = 4096. Both (n, 1, 1) nodes and (n, n, n) nodes were used in the network analysis, and the value of n was varied from n = 2 to n = 64 in an attempt to identify the optimum node size. The link-stage interconnections were assumed to be 2-D q-shuffles provided by space-invariant binary phase gratings, and the control signals were injected through the network input ports via an embedded control technique.
Based on these assumptions, various performance metrics were defined and analyzed. It was shown that the optimum node size depends on the metric being studied. Critical parameters included the required laser power, the power density per device array, the number of fibers within the input fiber bundle array, the device substrate size, and the bandwidth requirements for control transmission between the path hunt processor and the network inputs. Because of these system-level problems, system architects may need to use several specialized design techniques to circumvent these effects. For example, the use of small values of n will greatly reduce the power problems. The use of piped EGS networks can also help reduce the power problems, and it will also ease the problems associated with the fiber bundle array and the device substrate size. In general, most of the system-level limitations identified by the analysis can be greatly reduced using specialized electronic and optical design techniques.
