Abstract-In order to support the continuous growth of transmission capacity demand, optical packet switching technology is emerging as a strong candidate, promising to allow fast dynamic allocation of wavelength-division multiplexing (WDM) channels, combined with a high degree of statistical resource sharing. This work addresses the design of optical switch architectures, based on previous proposals available in the technical literature that use an arrayed waveguide grating (AWG) device to route packets. Since the port number of currently available AWGs is a limiting factor, we propose two new modified structures which better exploit the switching capability of this component in the wavelength domain. Since limited hardware complexity is a key requirement for all-optical switches, due to the high cost of optical components, these different node configurations are compared in terms of complexity. Traffic performance of these new structures in a full optical packet switching scenario is also examined.
I. INTRODUCTION

IN RECENT years, telecommunication networks have demanded an unprecedented increase in capacity, driven mostly by the exponential growth of Internet users and by the introduction of new broadband services. The IP architecture is seen as the unifying paradigm for a variety of services and for realizing the broadband integrated services digital network (B-ISDN).
In order to face this challenge, considerable research is currently devoted to the design of Internet protocol (IP) fully optical backbone networks, which will provide the possibility of overcoming the capacity bottleneck of classical electronic-switched networks.
A single optical fiber offers a potentially huge transmission capacity: just in the third wavelength window, tens of Terahertz are there to be mined, if only we could be able to exploit such tremendous bandwidth with adequate technology. In the last ten years, optical dense wavelength-division multiplexing (DWDM) has been developed, bringing commercial systems which provide impressive transmission capacities: one Terabit per second per fiber, over distances on the order of 100 km, are feasible today with off-the-shelf components.
Moreover, DWDM has recently evolved to support some network functions, such as circuit routing and wavelength conversion and assignment. In WDM-routed networks, a wavelength is assigned to each connection in such a way that all traffic is handled in the optical domain, without any electrical processing during transmission.
Unfortunately, the optical devices used in commercial equipment are not yet mature enough to meet packet-by-packet operation requirements. An interesting solution, which tries to strike a balance between the low hardware complexity of circuit switching and the efficient bandwidth utilization of packet switching, is optical burst switching [1]-[3]. In an optical burst switching system, the basic units of data transmitted are bursts, made up of multiple packets, which are sent after control packets carrying routing information, whose task is to reserve electronically the necessary resources on the intermediate nodes of the transport network.
Such operation results in a lower average processing and synchronization overhead than optical packet switching, since packet-by-packet operation is not required. However packet switching has a higher degree of statistical resource sharing, which leads to a more efficient bandwidth utilization in a bursty, IP-like, traffic environment.
We address here the long-term view of a full packet switching network performing IP packet transport, in which operations are performed in the optical domain as far as the currently available optical device technology allows. Evidently, most of the operations related to packet header processing need to be done in the electronic domain. This paper deals with the architecture of an optical packet switching node first proposed in [4] and [5], which is equipped with a fiber delay line stage used as an input buffer for optical packets. Two new alternative structures of the switching core of the node, which exploit the routing in the wavelength domain inherently available in the switching components being used, are described.
The paper is organized as follows. Sections II and III describe the optical network architecture we envision and the proposed architecture of an optical packet switching node. Section IV provides a comparison of the different solutions in terms of node complexity and traffic performance.
II. NETWORK ARCHITECTURE
The architecture of the optical transport network we propose consists of optical packet-switching nodes, each denoted by an optical address made of bits, which are linked together in a mesh-like topology. A number of edge systems (ES) interfaces the optical transport network with IP legacy (electronic) networks (see Fig. 1 ).
The transport network operation is asynchronous; that is, packets can be received by nodes at any instant, with no time alignment. The internal operation of the optical nodes, on the other hand, is synchronous or slotted, since the behavior of packets in an unslotted node is less regulated and more unpredictable, resulting in a larger contention probability.
An ES receives packets from different electronic networks and performs optical packets generation. The optical packet is composed of a simple optical header, which comprises the -bit destination address, and of an optical payload made of a single IP packet or, alternatively, of an aggregate of IP packets. The optical packets are then buffered and routed through the optical transport network to reach their destination ES, which delivers the traffic it receives to its destination electronic networks. At each intermediate node in the transport network, packet headers are received and electronically processed, in order to provide routing information to the control electronics, which will properly configure the node resources to switch packet payloads directly in the optical domain.
Header and payload of a packet are transmitted serially, as shown in Fig. 2, where header duration is equal to $T_H$ and payload duration to $T_P$. At each switching node the optical header is read, dropped and regenerated at the node output; therefore, guard times $T_G$ are needed in order to avoid payload/header superposition, due to clock jitter in the transmission phase. Hence, the total overhead time is equal to $T_{OH} = T_H + 2T_G$. Both header and payload are assumed to be transmitted at a 10-Gb/s rate, which is compatible with current transmission technology.
Two critical parameters must be considered when dimensioning the header duration $T_H$ and the guard time $T_G$: the maximum header processing time and the switching time of the slowest switching element in the node (see Section III). During the header processing time, the node decodes the optical header and processes the carried information, performing packet routing and contention resolution (as explained later). In order to compensate for this processing time, a delay line should be inserted at each node input, delaying all incoming payloads by a fixed interval, so that payloads enter the node when all header processing has been performed. Finally, considering that the optical header is dropped at the node input, the silence between two consecutive payloads is equal to the overhead time $T_{OH}$, which must therefore be no shorter than the switching time of the slowest switching element. Given these considerations, an overhead time $T_{OH} = 8$ ns has been chosen, with header duration $T_H = 6$ ns and guard times $T_G = 1$ ns. Thus, an 8-ns interval is available to perform switching, and a 10-bit jitter at 10 Gb/s is tolerated in header regeneration. Moreover, this value of $T_H$ implies a 60-bit header. In [6], a 10-Gb/s optical packet receiver is demonstrated, using a 40-bit-long preamble. Therefore, the remaining 20 bits will carry packet information: 5 bits are reserved for packet length (expressed in time slots, for a maximum value of $2^5$ time slots), and the remaining 15 bits for the destination ES address (up to a maximum of $2^{15}$ edge systems). Supporting more information in the packet header, such as time stamping, optical priority labels, etc., is outside the scope of this paper.
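The timing budget above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original design: the constant and function names are illustrative, and it assumes that one guard interval sits on each side of the header, which is consistent with the 8-ns overhead, the 60-bit header, and the 10-bit jitter tolerance quoted in the text.

```python
BIT_RATE = 10e9  # b/s, line rate assumed in the text


def bits_to_ns(bits, bit_rate=BIT_RATE):
    """Duration in nanoseconds of `bits` bits sent at `bit_rate` b/s."""
    return bits * 1e9 / bit_rate


# Figures quoted in the text
HEADER_BITS = 60    # whole optical header
GUARD_BITS = 10     # jitter tolerated by each guard time
PREAMBLE_BITS = 40  # receiver preamble, as in [6]
LENGTH_BITS = 5     # packet length field, in time slots
ADDRESS_BITS = 15   # destination edge-system address

header_ns = bits_to_ns(HEADER_BITS)
guard_ns = bits_to_ns(GUARD_BITS)
# Assumption: one guard time before and one after the header
overhead_ns = header_ns + 2 * guard_ns

max_length_slots = 2 ** LENGTH_BITS
max_edge_systems = 2 ** ADDRESS_BITS

print(header_ns, guard_ns, overhead_ns)
print(max_length_slots, max_edge_systems)
```

The three header fields add up to the 60-bit header, and the two field widths bound the packet length and the number of addressable edge systems.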
We chose a slot duration equal to the time duration of an optical packet whose payload consists of the smallest transmission control protocol (TCP)/IP packet (i.e., 320 bits, the size of an IP packet carrying a TCP acknowledgment). The time-slot duration is, therefore, equal to $T = 40$ ns. Owing to our assumption of slotted operation, it takes $\lceil (T_{OH} + T_P)/T \rceil$ slots to switch (and transmit) an optical packet with overhead time $T_{OH}$ and payload time $T_P$. Fig. 3 shows the case of an IP packet engaging two slots. Note that, under these assumptions, a 1500-byte packet (i.e., the maximum Ethernet payload length) will fill a 31-time-slot-long optical packet. Since only a small fraction of the traffic in current IP networks consists of longer packets (see [7]), most of the electronic information units will not need to be segmented in order to travel through the optical network. A more accurate choice of the time-slot duration is left for further study, when detailed and accurate traffic models will be considered.
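The slot count for a given IP packet follows directly from the figures above. The sketch below is illustrative only (the function and constant names are not from the paper); it works in bits at the 10-Gb/s line rate, so that the 8-ns overhead corresponds to 80 bits and the 40-ns slot to 400 bits.

```python
OVERHEAD_BITS = 80            # 8-ns overhead at 10 Gb/s
MIN_PAYLOAD_BITS = 320        # smallest TCP/IP packet
SLOT_BITS = OVERHEAD_BITS + MIN_PAYLOAD_BITS  # 400 bits, i.e., 40 ns


def slots_needed(payload_bytes):
    """Number of time slots engaged by an optical packet carrying
    `payload_bytes` bytes of payload plus the fixed overhead."""
    total_bits = OVERHEAD_BITS + 8 * payload_bytes
    return -(-total_bits // SLOT_BITS)  # integer ceiling division


print(slots_needed(40))    # smallest TCP/IP packet
print(slots_needed(1500))  # maximum Ethernet payload
```

With these values the smallest packet occupies exactly one slot and a 1500-byte packet occupies 31 slots, which fits within the 5-bit length field of the header.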
III. NODE ARCHITECTURE
The general architecture of a network node is shown in Fig. 4 . It consists of three stages: a first stage of channel demultiplexing, a second stage of switching, and a third stage of channel multiplexing. The node is fed by incoming fibers each having wavelengths. In the first stage, the incoming fiber signals are demultiplexed and wavelengths from each input fiber are fed into each one of the second-stage switching planes, which constitute the switching fabric core. Once signals have been switched in one of the parallel planes, packets can reach every output port through multiplexing carried out in the third stage using any of the wavelengths that are directed to each output fiber. We note that the number of inlets of each third-stage multiplexer varies, depending on the specific structure of the switching planes. Wavelength conversion must be used for contention resolution, since at most packets can be concurrently transmitted by each second-stage plane on the same output link.
The detailed structure of one of the parallel switching planes is presented in Fig. 5 . It consists of three main blocks: an input synchronization unit, as the node is slotted and incoming packets need to be slot-aligned, a fiber delay lines unit, used to store packets for contention resolution, and a switching matrix unit, adopted to achieve the switching of signals.
These three blocks are all managed by an electronic control unit which carries out the following tasks:
• optical packet header recovery and processing;
• managing the synchronization unit in order to properly set the correct path through the synchronizer for each incoming packet;
• managing the tunable wavelength converters inside the delay unit and in the switching matrix, in order to properly delay and route incoming packets.
One electronic control unit is implemented in each switching plane and, since at each plane output packets are transmitted using one of the input wavelengths, the controllers' job is carried out in a completely parallel and independent way.
A. Synchronization Unit
This unit consists of a series of 2 × 2 optical switches interconnected by fiber delay lines of different lengths. These are arranged in a way that, depending on the particular path set through the switches, the packet can be delayed for a variable amount of time, ranging between and , with a resolution of , where is the time slot duration and the number of delay line stages.
The synchronization is achieved as follows: once the packet header has been recognized and packet delineation has been carried out, the packet start time is identified and the control electronics can calculate the necessary delay and configure the correct path of the packet through the synchronizer.
Due to the fast reconfiguration speed needed, fast 2 × 2 switching devices, such as 2 × 2 semiconductor optical amplifier (SOA) switches [8], which have a switching time in the nanosecond range, must be used.
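One possible way for the control electronics to map a required delay onto switch settings is sketched below. This is only an illustration under stated assumptions: the stage delays are assumed to be binary weighted (stage k either bypasses or adds T/2^k), which gives a resolution of T/2^n with n stages; the actual delay line lengths of the synchronizer are not specified here, and all names are hypothetical.

```python
def synchronizer_settings(required_delay_ns, slot_ns=40.0, stages=4):
    """Return (settings, actual_delay_ns) for a binary-weighted synchronizer.

    Assumption: stage k (k = 1..stages) adds slot_ns / 2**k when its
    switch selects the delay line (1) and nothing when bypassed (0).
    """
    step = slot_ns / 2 ** stages                 # delay resolution
    steps = round(required_delay_ns / step)      # nearest achievable delay
    steps = max(0, min(steps, 2 ** stages - 1))  # clamp to the available range
    # Most significant bit drives the first (longest) delay line
    settings = [(steps >> (stages - k)) & 1 for k in range(1, stages + 1)]
    return settings, steps * step


settings, delay = synchronizer_settings(25.0)
print(settings, delay)
```

With a 40-ns slot and four stages, a 25-ns delay is obtained by enabling the 20-ns and 5-ns lines and bypassing the 10-ns and 2.5-ns lines.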
B. Fiber Delay Lines Unit
After packet alignment has been carried out, the routing information carried by the packet header allows the control electronics to properly configure a set of tunable wavelength converters (TWC), in order to deliver each packet to the correct delay line to resolve contentions. An optical packet can be stored for a time slot, with a 40-ns duration, in about 8 m of fiber at 10 Gb/s. To achieve wavelength conversion several devices are available [9] - [11] .
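The fiber length quoted above follows from the propagation speed of light in fiber. The sketch below assumes a group index of about 1.5, a typical value for silica fiber that is not stated in the text; the function name is illustrative.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def fiber_length_m(delay_ns, group_index=1.5):
    """Length of fiber that delays a signal by `delay_ns` nanoseconds.

    Assumption: group index of about 1.5 (typical silica fiber).
    """
    return C_VACUUM / group_index * delay_ns * 1e-9


print(round(fiber_length_m(40.0), 2))  # one 40-ns time slot
```

A one-slot delay therefore needs roughly 8 m of fiber, and a delay of d slots needs about d times that length.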
The delay lines are used as an optical scheduler. This policy uses the delay lines in order to schedule the transmission of the maximum number of packets onto the correct output link. This implies that an optical packet , entering the node at time from the th WDM input channel, can be transmitted after an optical packet , entering the node on the same input channel at time , being . For example, suppose that packet , of duration , must be delayed for time slots, in order to be transmitted onto the correct output port. This packet will then leave the optical scheduler at time . So, if packet , of duration , has to be delayed for slots, it can be transmitted before if since no collision will occur at the scheduler output. Previous works considered the use of optical first-in-first-out (FIFO) buffering, but optical scheduling proved to be a better choice. The reader is referred to [4] for a deeper analysis of this topic.
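The overtaking condition described above can be illustrated with a small sketch. The names and the exact form of the check are assumptions made for illustration: times, delays, and durations are expressed in time slots, and two packets from the same input are taken to collide when their occupancy intervals at the scheduler output overlap.

```python
def departure_window(arrival, delay, length):
    """Half-open interval [start, end) of slots during which a packet
    occupies the scheduler output (all quantities in time slots)."""
    start = arrival + delay
    return start, start + length


def collide(window_a, window_b):
    """True if two half-open slot intervals overlap."""
    return window_a[0] < window_b[1] and window_b[0] < window_a[1]


# First packet: arrives at slot 0, lasts 2 slots, must wait 5 slots
first = departure_window(arrival=0, delay=5, length=2)

# Second packet arrives later on the same input, lasts 3 slots
early = departure_window(arrival=1, delay=1, length=3)  # leaves first
late = departure_window(arrival=1, delay=3, length=3)   # overlaps

print(first, early, collide(first, early))
print(first, late, collide(first, late))
```

In the first case the later packet has completely left before the earlier one starts, so it can overtake it; a FIFO buffer would instead force it to wait behind the earlier packet.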
Given the maximum achievable delay slot, for each switch input delay lines are needed, with delays growing from 0 to . Moreover, multiplexers and demultiplexers with input and output ports are needed to perform packet buffering.
C. Switching Matrix
Once packets have crossed the fiber delay lines unit, they enter the switching matrix stage in order to be routed to the desired output port. This is achieved using a set of tunable wavelength converters combined with an arrayed waveguide grating (AWG) wavelength router [12] .
The AWG is used as it gives better performance than a normal space switch interconnection network, as far as insertion losses are concerned. This is due to the high insertion losses of all the high-speed all-optical switching fabrics available at the moment, that could be used to build a space switch interconnection network. Commercially available 40 channel devices have a channel spacing of 100 GHz and show an insertion loss of less than 7.5 dB [13] .
Three different structures are proposed for the realization of this stage, referred to as structure (a), (b), and (c). In the following sections, we will consider single plane structures, that is , in which the switching matrix has inlets and outlets. The extension to multiplane nodes is easily achieved for . The third structure will require further considerations.
1) Structure (a):
The simplest switching matrix structure, first proposed in [4] , is shown in Fig. 6 . It consists of tunable wavelength converters and an AWG with size . Only one packet is routed to each AWG outlet and this packet must finally be converted to one of the wavelengths used in the WDM channel, paying attention to avoid contention with other packets of the same channel. This solution is, therefore, strict sense nonblocking.
2) Structure (b): In order to reduce the number of planes of the node and, thus, to better exploit the channel grouping effect (i.e., the sharing of different channels for transmitting a large number of packets, the load per channel being constant) more than one packet can be routed in each AWG inlet; apparently, the packets sharing the same input must be transmitted on different wavelengths. The structure of the AWG, in fact, is such that different wavelengths entering the same input port will emerge on different output ports, as shown in Fig. 7 in the case of four incoming and outgoing fiber channels, each supporting four wavelengths.
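The wavelength routing property of the AWG can be illustrated with a simple cyclic model. This is a sketch under an assumed indexing convention (outlet = (inlet + wavelength index) mod size); real devices follow the same cyclic principle, but the port and wavelength numbering may differ. The function names are illustrative.

```python
def awg_outlet(inlet, wavelength, size):
    """Outlet reached from `inlet` on `wavelength` (assumed cyclic model)."""
    return (inlet + wavelength) % size


def wavelength_for(inlet, outlet, size):
    """Wavelength index a converter must select to go from inlet to outlet."""
    return (outlet - inlet) % size


def routing_table(size):
    """table[inlet][wavelength] -> outlet."""
    return [[awg_outlet(i, w, size) for w in range(size)] for i in range(size)]


for row in routing_table(4):
    print(row)
```

Each row of the table is a permutation of the outlets, so packets sharing an inlet on different wavelengths never meet at the same outlet; each column is also a permutation, so packets on the same wavelength from different inlets are likewise separated.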
In the switching matrix structure illustrated in Fig. 8 , up to different packets are sent to the same AWG inlet using different wavelengths. A simple node design requires to be an integer that divides . From AWG input port , the output channel can be reached by different packets, since there are exactly AWG outlets connected to that channel. During each time slot, up to packets can be routed to the same AWG outlet using different wavelengths. Hence, demultiplexers are needed to split the different signals and to route them to the last stage of wavelength converters. If , no contention can happen in the multiplexing stage, so this structure behaves exactly as a structure (a) with size . On the other hand, when , events of packet blocking occur, considering the fact that paths are available to reach a tagged output for up to packets per inlet. So, when more than packet in the same AWG inlet are destined to the same output channel, a contention happens, even if the total number of packets addressed to that output is smaller than .
3) Structure (c): Node structure (b) can be simplified by selecting , so that each AWG input can receive up to packets using different wavelengths. Therefore, the number of AWG inlets is now exactly , as shown in Fig. 9. In this last structure the last TWC stage is no longer needed, provided the employed AWG works on the same wavelengths used in the outgoing fibers. In fact, if the electronic controller takes care of avoiding wavelength contention between AWG outlets connected to the same output channel, packets are ready to be transmitted as soon as they exit the AWG. Therefore, a packet entering the AWG inlet and destined to the output WDM channel cannot be transmitted using every color in the WDM channel, but only using a subset consisting of the wavelengths through which the packet can reach the desired output channel, thus reducing the benefits of channel grouping (when and are kept constant).
We would like to point out that the size of the AWG is not a limiting factor for this node architecture, since the current optical technology enables using optical fibers supporting a number of wavelengths much larger than the maximum size of an AWG. On the other hand if we are willing to fully exploit the external number of wavelengths in the internal node structure in case of a maximum size of the AWG, with , a multiplane structure must be adopted. In order to arrange a multiplane structure, the key point is the wavelength splitting among planes, so that the minimum spacing between adjacent wavelengths in a plane is compliant with the AWG requirements. Let us suppose to use wavelengths per channel with spacing and to adopt AWGs with channel spacing , where is an integer. Then, the WDM channels should be split into wavelengths combs with spacing, whereas in the overall comb wavelengths are spaced by . Fig. 10 represents this situation for . Furthermore, the used AWGs should be built in order to have -spaced central wavelengths.
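The wavelength splitting among planes can be sketched as a simple interleaving. The following is an illustration only, with hypothetical names: wavelengths are identified by their index in the overall comb, and index j is assigned to comb j mod k, so that adjacent wavelengths within a comb are k channel spacings apart.

```python
def split_into_combs(num_wavelengths, num_combs):
    """Interleave wavelength indices 0..num_wavelengths-1 into combs.

    Assumption: wavelength j goes to comb j mod num_combs, so the spacing
    inside each comb is num_combs times the overall channel spacing.
    """
    return [list(range(k, num_wavelengths, num_combs)) for k in range(num_combs)]


combs = split_into_combs(num_wavelengths=8, num_combs=2)
for k, comb in enumerate(combs):
    print(k, comb)
```

With eight wavelengths and two combs, the even-indexed wavelengths form one comb and the odd-indexed ones the other, each with twice the spacing of the overall comb.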
IV. COMPLEXITY AND PERFORMANCE COMPARISON
In order to evaluate the three different switching structures, we will first examine the packet loss probability of a simplified network scenario, where an analytical model can easily be obtained. We will later compare the complexity of the structures in terms of number of components and, finally, their traffic performance under a more realistic traffic assumption and a more complex configuration will be analyzed.
A. Switching Capability
In order to perform a basic comparison among the switching capabilities of the different structures, the following assumptions are made:
• no input buffering is performed;
• packet length is constant, equal to the time-slot duration.
In this simple case, a packet is offered by each single wavelength channel with probability in each time slot, where is the offered load per wavelength. We can easily derive an analytical model of structures (a) and (b) of the switching matrix.
1) Structure (a): Let us define
The probability that exactly packets are addressed to a tagged output channel is then If denotes the number of packets addressed to the tagged output, the carried load per wavelength can be expressed as (1) Since represents the offered load per wavelength, the loss probability is given by (2)
2) Structure (b): Let us define as the probability that packets in AWG inlets are addressed to the tagged output port, with or fewer packets per inlet
Having defined the probability that more than packet are addressed to the tagged output in a single AWG inlet as the probability that the number of packets that can be routed to the desired output is exactly , after contention resolution in the multiplex stage, is expressed as Therefore, as seen for the original structure in (1) and the loss probability is given again by (2).
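For structure (a), a model of the kind outlined above can be sketched numerically. The following is an illustration under stated assumptions rather than a transcription of (1) and (2): the symbols (number of channels, wavelengths per channel, load per wavelength) are named here for convenience, packets are assumed to be addressed uniformly to the output channels, and an output channel is assumed to carry at most one packet per wavelength in each slot.

```python
from math import comb


def loss_probability_a(num_channels, wavelengths, load):
    """Packet loss probability of an unbuffered, nonblocking switch.

    Assumptions: each of the num_channels * wavelengths inputs offers a
    packet with probability `load` per slot, destinations are uniform over
    the output channels, and each output channel carries up to
    `wavelengths` packets per slot.
    """
    inputs = num_channels * wavelengths
    q = load / num_channels  # an input sends a packet to the tagged output
    carried = 0.0
    for i in range(inputs + 1):
        # Binomial probability that exactly i packets target the tagged output
        p_i = comb(inputs, i) * q ** i * (1.0 - q) ** (inputs - i)
        carried += min(i, wavelengths) * p_i
    carried_per_wavelength = carried / wavelengths
    return 1.0 - carried_per_wavelength / load


print(loss_probability_a(2, 12, 0.8))
print(loss_probability_a(4, 12, 0.8))
```

The two calls mirror the single-plane cases considered in the comparison (2 and 4 WDM channels with 12 wavelengths); the loss grows with the offered load and shrinks as more wavelengths share each output channel.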
The packet loss probabilities of the proposed structures are now compared using the previous analytical models for structures (a) and (b), whereas computer simulation has been used for structure (c). Figs. 11 and 12 show the loss probability of the different switching matrices, when employing a single-plane structure to switch 2 and 4 WDM channels, with 12 different wavelengths. This value of has been chosen because it enables us to compare the three structures with different parameters. No results were plotted for and , since in these cases , and so the loss probability is the same as that of structure (a). As could easily be foreseen, the original structure outperforms the other two, but it uses a larger AWG.
The results in Fig. 13, where the different structures are compared with 8 × 8 AWGs, are far more interesting. Again, only values of such that have been considered. It can be seen that structure (c) behaves a little better than structure (a), whereas structure (b) leads to better performance for small values of . As is increased, the loss probability grows and reaches that of structure (a) when . Larger values of lead to worse performance. It is interesting to note that structure (b) performs exactly the same as structure (a) when . In the Appendix, it is shown that whenever , the performance of structure (b) is equal to that of structure (a) with .
B. Complexity
In Table I, the components needed to build a switching matrix with size are listed. Structure (b) does not seem to the number of TWCs will grow quickly. Structure (c) greatly reduces the costs but, as seen in Section IV-A, it is not competitive with the original structure when packet loss performance is evaluated. On the other hand, Table II reports a cost comparison of the different structures keeping the AWG size constant. In this case, structure (c) not only behaves better than the original structure (as was previously seen), but also reduces implementation costs.
C. Performance Results
We now show some traffic performance results for the different node architecture configurations, obtained through computer simulation. Packet arrivals have been modeled as a Poisson process, i.e., with negative exponential interarrival times. Based on measurements of real IP traffic [7], the following distribution of packet length has been assumed:
In this traffic model, the resulting average packet length is 393 bytes. Only structures (a) and (c) are examined here, since the implementation of structure (b) would increase the complexity of the electronic controller. On the other hand, in order to implement structure (c), it is only necessary to add a table to the electronic controller, where each entry contains the wavelengths that can be used to route a packet from inlet to output channel . Moreover, this structure reduces the complexity of the controller operations in each time slot, because output wavelengths per packet must be considered, rather than .
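The controller table mentioned above could be built along the following lines. This is a hypothetical sketch: it assumes a cyclic AWG model (outlet = (inlet + wavelength index) mod size) and a contiguous wiring of AWG outlets to output channels, neither of which is specified in the text, and all names are illustrative.

```python
def build_wavelength_table(awg_size, num_channels):
    """Map (inlet, output channel) -> wavelengths reaching that channel.

    Assumptions: cyclic routing, outlet = (inlet + wavelength) % awg_size,
    and outlet o wired to output channel o // (awg_size // num_channels).
    """
    if awg_size % num_channels:
        raise ValueError("num_channels must divide awg_size")
    outlets_per_channel = awg_size // num_channels
    table = {}
    for inlet in range(awg_size):
        for wavelength in range(awg_size):
            outlet = (inlet + wavelength) % awg_size
            channel = outlet // outlets_per_channel
            table.setdefault((inlet, channel), []).append(wavelength)
    return table


table = build_wavelength_table(awg_size=8, num_channels=2)
print(table[(0, 0)], table[(0, 1)])
print(table[(3, 0)], table[(3, 1)])
```

Under these assumptions each entry lists only a fraction of the wavelengths, so the controller searches a shorter list per packet than it would if every color of the output channel were usable.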
Given the results of Section IV-A, the two switching matrices are compared keeping the AWG size constant. Figs. 14 and 15 show a performance comparison for several values of the maximum buffer depth . Structure (c) outperforms the original structure (a), and the improvement grows as the maximum buffer depth is increased. The improvement is also larger at lower traffic loads.
The number of wavelengths per channel connected to every switching plane is a key parameter to improve node performance, due to the channel grouping phenomenon, as was pointed out in [4] and [5]. In Fig. 16, different configurations with 32 × 32 AWGs are compared. It is shown that structure 
V. CONCLUSION
In this paper, we have proposed and compared different architectures of the switching core for an IP-over-WDM switching fabric. Starting from previous proposals of AWG-based optical switching nodes, it has been shown how to arrange the switch core so as to perform the switching also in the wavelength domain, thus fully exploiting the AWG properties. Two different architectural solutions have been examined and compared in terms of complexity and traffic performance. The results are quite promising in that, under reasonable assumptions on the offered IP traffic, the simplest of the newly proposed structures outperforms the original one. Other issues will have to be addressed in the future, such as the behavior of this new structure when recirculation delay lines are added to obtain shared buffering, especially compared with the results given in [5] for the original structure. Finally, it would be useful to compare the obtained performance results with those given by available electronic routers; regrettably, due to the immaturity of current optical switching technology, a comparison between the two scenarios is not feasible yet. Consider, for example, that it is possible to equip electronic routers with gigabytes of random access memory at relatively low cost, while 8 m of fiber are needed to store just a single one-slot optical packet in the optical domain.
APPENDIX
Let us consider a switching matrix with structure (b) called with parameters , , , with packet loss probability , and a structure (a) called with parameters and , with packet loss probability . It can be easily shown that
In structure , after the multiplexing stage, up to packets per inlet are feasible to enter the AWG addressed to a tagged output. Being the number of AWG inlets, the total number of packets which will request the tagged output channel will be upper bounded by Therefore, no contention happens in the AWG stage. Let us now examine the multiplexing stage. At each multiplexer, up to packets can contend for the tagged output, and only packets will win contention. This is the same situation that happens in the structure , where up to packets contend for the output and only win, and
Hence, in contention happens only in the multiplexing stage and this contention has the same characteristics of contention at AWG stage in . Therefore
