In this paper, a cell-level study of an ATM switch based on a three-stage Clos interconnection network under bursty Interrupted Bernoulli Processes (IBP) is presented. Three traffic patterns are considered. In the first one, cells of a given burst are assumed to be randomly directed to the different output ports of the switch. In the second one, cells of a given burst are assumed to belong to the same Virtual Channel and at most one burst can be directed to a given output port of the switch. This traffic type is usually named SSSD (Single Source to Single Destination). In the last one, an SSSD high traffic embedded in a uniform traffic is considered. Cells of a given burst are assumed to be routed independently; consequently, a resequencing mechanism has to be implemented. Approximate analytical models of the switch are proposed and validated by discrete event simulations for the parameter values for which simulations can be run. The performance of the resequencer is estimated by simulation. It is shown that such interconnection networks lead to good performance results, even with small buffers, under these traffic patterns.
INTRODUCTION
The performance of ATM networks will depend on transmission and switching performance. Many ATM switch designs have been proposed [Tob 90]. There are mono-path networks, for which there is only one path between one input port and one output port (Banyan, Delta, Omega networks), and multipath networks, for which there are several paths (Clos, Benes networks). The former are easy to design; a self-routing algorithm can be used to route cells in the switch. For multipath networks, several more complex algorithms may be implemented (cell based or call based).
The main problem when dimensioning ATM networks is that traffic is not well characterized.
• Non-uniform input process: in [Kim88], one input port sends its cells to one output port and receives a heavier load.
In [Yam91] [Mor92] input processes are IBP type; they differ by the squared coefficient of variation.
In this paper, we consider bursty input traffic (IBP type).
Three traffic patterns are studied. The first one is the "classical" uniform traffic pattern case. IBP approximations of the interstage traffic are proposed.
In the second one, cells of a given burst will be directed to the same output port. In this traffic case, named "Single Source to Single Destination" (SSSD), at a given time t, an output port of the switch is chosen by, at most, one burst.
Consequently, "new" bursts are directed to "idle" ports of the switch (i.e. ports to which no burst is currently directed). This traffic case has been studied in a previous work [Bey 96] with a monopath interconnection network. It was shown that such networks lead to bad performance results: as soon as two bursts compete for a common link inside the switch, the corresponding queue grows and cells are lost.
In the last traffic pattern, a case of non-uniform destination distribution and non-uniform input process has been studied.
Only one hot-spot output port is considered; it corresponds to one high-load input port. It is an SSSD high traffic embedded in uniform low traffic.
In the present work, we consider a switch based on a multipath interconnection network. Cells of a given burst are assumed to be routed independently. Consequently, it is necessary to reorder the cells of a given burst. An algorithm is proposed and the resequencing cost is estimated by discrete event simulations. Approximate analytical models of the switch itself are proposed for the different traffic cases.
They are validated by discrete event simulations.
The paper is organized as follows. Section 2 will present the interconnection network and the resequencing algorithm. In order to obtain the performance of this switch, let us study the cell delay and the cell loss probability.
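In the first traffic case, let us assume that the arrival processes at each input port are IBP and that each input link is offered the same traffic load; cells of a given burst are randomly directed to the different output ports of the switch.
In the second traffic case, let us assume that the arrival processes at each input port are IBP and that each input link is offered the same traffic load; cells of a given burst are assumed to be directed to the same output port of the switch.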
It is assumed that only one burst can be directed to a given output port. This traffic pattern is usually named "Single Source to Single Destination".
Whereas Banyan networks are single-path, Clos networks are multipath. In the present work, a random policy is chosen: the path through the second stage is chosen uniformly at random. In the considered traffic case, an algorithm based on burst routing may lead to the well-known results on non-blocking Clos switches: as long as the number of paths in the switch, b, is greater than or equal to 2a − 1, the cell loss probability is 0 and the cell delay is equal to 3 time slots. This is too optimistic because in fact the input traffic will mostly be SSSD.
More generally, two routing policies may be implemented. In the first one, cells of a given call are routed independently. In this case, a resequencing mechanism has to be implemented to reorder the cells of a given call. This mechanism may be quite complicated; an output buffer shared by all the incoming cells directed to a given output port has to be added. The performance may also depend on the selection of the outgoing cell [Urv 95]. In this case, even if the traffic is unbalanced, it may be split among the different paths. In the second one, all the cells belonging to the same call use the same path. In this case, the performance mainly depends on the choice of the path.
In the present work, only the first case has been investigated.
Consequently, cells of a given burst have to be reordered. If the rank number of an arriving cell is larger than the expected one, the cell has to wait in the resequencing queue. If the buffer is empty, the timeout is armed. If the buffer is full, one cell has to be sent or lost. Let us propose that the cell at the head of the resequencing queue is sent. This means that the cells whose rank lies between the rank of the expected cell and the rank of the cell at the head of the resequencing queue are lost.
When the expected cell does not arrive before the timeout T_w expires, it is considered lost. In this case, the cell at the head of the resequencing queue is sent.
When a cell number is lower than the expected number, the cell was not lost in the switch but has already been considered lost by the resequencing algorithm, so it is lost in the resequencer. This event may occur because the timeout is too small or because the buffer size is too small. So several events may cause losses in the resequencer. Let us note that losses in the switch significantly increase the number of cells in the resequencer, since the resequencer does not know that cells are lost: it waits for them and keeps the following cells in order to resequence them afterwards.
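The rules above can be sketched for a single burst as follows. This is only an illustrative reading of the description, not the authors' implementation: the class and attribute names are hypothetical, C_w is assumed to be the capacity of the resequencing buffer, ranks are assumed to start at 0 without duplicates, `tick` is assumed to be called at least once per slot, and the timer is assumed to be re-armed whenever the expected rank changes while cells are still waiting.

```python
import heapq

class Resequencer:
    """Single-burst resequencer sketch (hypothetical names, assumed semantics)."""

    def __init__(self, capacity, timeout):
        self.capacity = capacity   # C_w: assumed resequencing buffer size
        self.timeout = timeout     # T_w: waiting time for the expected cell
        self.expected = 0          # rank of the next cell to send
        self.queue = []            # min-heap of waiting ranks
        self.deadline = None       # slot at which the armed timeout expires
        self.sent = []             # ranks sent on the output link, in order
        self.skipped = 0           # ranks declared lost by the resequencer
        self.late_drops = 0        # cells that arrived after being declared lost

    def _drain(self, now):
        # Send every waiting cell that is now in sequence, then re-arm the timer.
        while self.queue and self.queue[0] == self.expected:
            self.sent.append(heapq.heappop(self.queue))
            self.expected += 1
        self.deadline = now + self.timeout if self.queue else None

    def _release_head(self, now):
        # Send the head cell; the ranks between expected and head are given up.
        head = heapq.heappop(self.queue)
        self.skipped += head - self.expected
        self.sent.append(head)
        self.expected = head + 1
        self._drain(now)

    def tick(self, now):
        # Timeout expiry: the expected cell is considered lost.
        while self.deadline is not None and now >= self.deadline:
            self._release_head(now)

    def arrive(self, rank, now):
        self.tick(now)
        if rank > self.expected and len(self.queue) >= self.capacity:
            self._release_head(now)              # buffer full: send the head cell
        if rank < self.expected:
            self.late_drops += 1                 # already considered lost
        elif rank == self.expected:
            self.sent.append(rank)
            self.expected += 1
            self._drain(now)
        else:
            if not self.queue:
                self.deadline = now + self.timeout   # buffer was empty: arm timer
            heapq.heappush(self.queue, rank)
```

For example, feeding ranks 0, 2, 1, 3 in successive slots yields the output order 0, 1, 2, 3, whereas a cell that arrives after its rank has been given up is counted in `late_drops`.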
The present algorithm is not general purpose. In a real switch implementation, a common buffer should be implemented and several bursts should be managed.
In the last traffic case, an SSSD IBP traffic (high traffic) is embedded in a uniform IBP traffic (low traffic). Let us now describe the resequencing algorithm and study the resequencing cost for this traffic case. In our simulations, resequencing is performed only for the high traffic: the low load and the uniform distribution of the low traffic imply that this traffic generally does not need to be resequenced, so its resequencing cost is negligible. The previous algorithm has been adapted to this traffic case. The only difference concerns the low traffic cells directed to the hot-spot output port: if a low traffic cell arrives in the output queue designed for the high traffic, it is transmitted on the output link. For the high traffic, the previous algorithm is applied.
3-ANALYTICAL MODEL OF THE SWITCH

Traffic hypotheses and characterization
IBP processes are discrete time "ON/OFF" processes with two states. During the "ON" period, a packet is emitted in each slot according to a Bernoulli process (parameter α). "ON" and "OFF" periods are geometrically distributed. An IBP source is defined by three parameters p, q, α.
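A minimal sketch of such a source is given below. It assumes, since the text does not state it explicitly, that p is the probability of staying in the ON state and q the probability of staying in the OFF state from one slot to the next; the function names are hypothetical. Under this convention, the parameters quoted later for the SSSD results (p = 0.99, q = 0.96, α = 0.9) give a mean rate of 0.72 and a mean ON period of 100 slots.

```python
import random

def ibp_stats(p, q, alpha):
    """Return (mean_rate, mean_on_length, peak_rate) of an IBP source.

    Assumed convention: p = P(stay ON), q = P(stay OFF),
    alpha = P(a cell is emitted in an ON slot). Requires p < 1 and q < 1.
    """
    prob_on = (1.0 - q) / ((1.0 - p) + (1.0 - q))   # stationary P(ON)
    mean_on = 1.0 / (1.0 - p)                        # geometric ON period (slots)
    return prob_on * alpha, mean_on, alpha

def ibp_source(p, q, alpha, n_slots, seed=0):
    """Simulate n_slots of an IBP source; returns a list of 0/1 cell indicators."""
    rng = random.Random(seed)
    on = False                                       # start in the OFF state
    cells = []
    for _ in range(n_slots):
        stay = p if on else q
        if rng.random() >= stay:                     # leave the current state
            on = not on
        cells.append(1 if on and rng.random() < alpha else 0)
    return cells

rate, mean_on, peak = ibp_stats(0.99, 0.96, 0.9)
print(round(rate, 6), round(mean_on, 6), peak)
```

The closed-form values can be compared with the empirical rate `sum(cells) / len(cells)` of a long simulated trace as a sanity check.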
Model of the switch in the SSSD Traffic case
Many asymmetric and bursty traffic patterns should be considered.
In the present work, we will only consider that the input processes are i.i.d Interrupted Bernoulli Processes (IBP) with the same parameters. Cells belonging to a given burst are assumed to be directed to the same output port.
In the SSSD traffic case considered in the present work, a new burst can only be directed to an idle output port of the last stage (i.e. an output port that is not chosen by another burst).
Model of a first stage switching element
Cells within bursts are randomly directed over the output queues of the switching element. Consequently, the traffic going from one input link to one output queue of a first stage switching element is again an IBP process with parameters
An output queue of a first stage switching element can again be modelled by an n-IBP/D/1/M queue.
Model of an output queue of the second stage
The output traffic of an n-IBP/D/1/M queue is known to be a
However, the output traffic of a first stage switching element will be split, and the input traffic into an output queue of the second stage cannot be derived from the previous study (it would be necessary to know to which output port the cells are directed; the choice is not random anymore).
An approximate model of the second stage will consequently be derived from the study of an a-IBP/D/1/M queue with such parameters. Let us note that in this approximate model we did not take into account the fact that several bursts from a given switching element of the first stage may be directed to the same output port of the second stage.
Model of an output queue of the third stage
Models of this kind cannot be used for the last stage. When
The third stage is modelled by a finite-capacity queue with constant service time. Figure 3 shows the model used to study the performance of the third stage. The source is an IBP/Geo+D/b/b queue. It will be valid as long as the cell loss probability of the first two stages is negligible.
The study of this model is presented in Annex B. On the first stage, the input processes of an output queue are split IBPs, which are also IBPs. On the last stage, the split processes corresponding to the same burst (high traffic) go to the same output, so it would be very wrong to assume them to be independent. The solution that is proposed here is to consider that, at any one time, there is only one input link on which the burst arrives.
For the low traffic, the output is chosen at random by each cell of a burst. Consequently, the independence assumption is not bad for the low traffic. So, we approximate input processes on the different links by assuming that on one
Let us first note that while the sampling of an IBP process is an IBP process, the superposition of 2 IBP processes is a DBMAP (Discrete Batch Markov Arrival Process) [Blo92], which may have two cells in the same slot. When going through a constant service time queue, this process will be modified. It is clearly not always good to approximate it by an IBP process. For example, the superposition of two IBP processes, one with long bursts and one with short bursts, will have two types of bursts, and the burst length will not be geometric.
Nevertheless in the case that is considered here, it seems that an IBP approximation is not too bad. 
RESULTS
Several
Results -Uniform Traffic Case
The global memory size in the switch is 128x72. The input load is 0.8. 
Results -SSSD Traffic case
In this traffic case, the global memory size inside the switch is 128x36. The mean input rate is 0.72 and the mean burst length is 100. The peak rate is 0.9. This corresponds to IBP processes with parameters p = 0.99, q = 0.96, α = 0.9. In this case the cell loss probability within the switch is approximately 10^-6. The cell loss probability is well approximated by our analytical model. Confidence intervals are 10% when the cell loss probability is 10^-6. It is shown that when the cell loss probability within the switch is high, the mean resequencing time and the resequencing cell loss probability are quite high because the resequencing unit waits for lost cells.
The global memory size in the switch is 128x48. Let us note High(Low)-A = the analytical results for high(low) traffic;
High(Low)-S = the simulation results. Let index h (resp. l) refer to high (resp. low) traffic.
An output port will be heavily loaded, so it is necessary to choose parameters such that the load is not more than 1 on this output link. Cell delay depends on C_w for all values of T_w. But, for C_w = 6, the delay is constant. The best values for resequencing cell delay and cell loss probability are obtained for T_w = 2 and
The cell loss probability in the resequencer is around 10^-5 for C_w = 2 for all values of T_w; for the other values of C_w, it is 0.
Let us compare these results with the results derived from a case where L_b,h = L_b,l = 100 (see Figures 13 and 14). We only intend to validate the analytical method and to study the influence of the burstiness, so in this first case let us choose small buffer sizes.
Traffic rates are the same. This will of course imply high losses in the switch, and so will significantly increase congestion in the resequencer queue. A lower traffic case (with less congestion) is presented in Figures 15 and 16.
Cell delay in the switch
The memory size on the first stage has no influence on cell delay and cell loss probability for the two kinds of traffic.
Cell delay for high traffic is around 24 slots; cell delay for low traffic is around 3.56 slots. Loss probabilities are unacceptable: the loss probability is 6 × 10^-2 for high traffic and 1.1 × 10^-3 for low traffic. The best distribution of memory is 5-3-32. and for a low input traffic rate (the output traffic rate will be quite high on the hot-spot).
Let us note that an estimation of the number of desequenced cells shows that it is around 5% of the cells for the Bernoulli case, and around 7 or 8% for the case where the average burst length is 100 and the traffic rates are 0.54 and 0.36. In the last case, the desequencing rate is 2%. So resequencing does not cost much and, since it was shown that this switch performs well, this multipath switch appears to be a good choice.
Conclusion
The performance of a multipath ATM switch has been studied under non-uniform traffic patterns. These hypotheses for 
Let us present some more details to study the two previous examples.
A-1.1. Study of an n-IBP/D/1/M queue
In this case, let Φ(t) be the number of sources in state ON at time t. The performance criteria will be derived from the solution
Let M be the capacity of the buffer. 
The evolution of Φ(t) is governed by the following equations:
j, k = 0, ..., n:
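The equations themselves are missing from this copy of the text. As an illustration only, the sketch below builds the transition matrix of the number of ON sources for n i.i.d. IBP sources, under the assumption that p is the probability that an ON source stays ON and q the probability that an OFF source stays OFF. The function name is hypothetical and the construction is the standard binomial one, not necessarily the authors' exact formulation.

```python
from math import comb

def on_count_transition(n, p, q):
    """Return P with P[j][k] = P(Phi(t+1) = k | Phi(t) = j) for n IBP sources.

    Assumed convention: p = P(an ON source stays ON),
    q = P(an OFF source stays OFF).
    """
    P = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(n + 1):
        for stay in range(j + 1):                 # ON sources that remain ON
            w_stay = comb(j, stay) * p**stay * (1 - p)**(j - stay)
            for wake in range(n - j + 1):         # OFF sources that switch ON
                w_wake = comb(n - j, wake) * (1 - q)**wake * q**(n - j - wake)
                P[j][stay + wake] += w_stay * w_wake
    return P
```

Each row of the returned matrix sums to 1, and for n = 1 it reduces to the two-state chain of a single source.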
It is possible to derive loss probability and expected value of the response time.
In the following, let x = (k, φ) denote the state of the Markov chain.
Let Π_x be the steady state probability of x and π_k the probability of k cells in the queue.
The expected value of the number of cells in the queue L is :
Let R be the expected value of the response time of the queue, Loss the cell loss probability, Λ the output rate of the queue and λ the input rate.
The following relationships hold :
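The relationships themselves did not survive in this copy of the text. For reference, the standard relations for a finite-capacity single-server queue, written with the symbols defined above, are sketched below. They are assumed here rather than recovered from the original equations, and the indexing convention for π_k (whether the cell in service is counted among the k cells) is also an assumption.

```latex
L = \sum_{k=0}^{M} k \, \pi_k, \qquad
\Lambda = \lambda \, (1 - \mathrm{Loss}), \qquad
R = \frac{L}{\Lambda} \quad \text{(Little's law applied to accepted cells)}
```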
In this case the arrival process is the superposition of one IBP process of parameters p_1, q_1 and α_1 and of n IBP processes of parameters p_2, q_2 and α_2.
The performance criteria of the IBP_1 + n-IBP_2/D/1/M queue will be derived from the solution of the Markov chain
where Φ_i(t) is the number of active sources of class i at time t. In the following, index i will be added to refer to class i traffic.
Φ_1(t) is 0 or 1. Φ_2(t) lies between 0 and n.
Let A_i(t) be the number of cells arriving from source i at time t. The Markov chain evolution is derived from
It is possible to derive loss probability and expected value of the response time for each type of cells. In the following let
The performance criteria may be computed for an average cell (they are derived from the steady state probabilities of the Markov chain) and for a type 1 cell, then the performance criteria for a type 2 cell may be derived. 
Let us note :
The denominator is : Process, DMAP : Discrete Markov Arrival Process) [Blo92] .
Nevertheless in the case that is considered here, it seems that an IBP approximation is not too bad.
One way of characterising a process is by studying the moments of the inter-arrival time.
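As an illustration, the first two moments of the inter-arrival time of a single IBP can be obtained by conditioning on the state of the source. The sketch below is not taken from the paper: it assumes that p and q are the probabilities of staying ON and OFF respectively, and that the state is updated before the emission decision in each slot. The function name is hypothetical.

```python
def ibp_interarrival_moments(p, q, alpha):
    """Return (mean, squared coefficient of variation) of the IBP inter-arrival time.

    Assumed convention: p = P(stay ON), q = P(stay OFF), alpha = P(cell | ON).
    Requires q < 1 and alpha > 0.
    """
    # m_on / m_off: mean time to the next arrival given the previous slot state.
    #   m_on  = 1 + p (1 - alpha) m_on + (1 - p) m_off
    #   m_off = 1 + q m_off + (1 - q)(1 - alpha) m_on
    # The second moments satisfy a linear system with the same matrix.
    a11, a12 = 1.0 - p * (1.0 - alpha), -(1.0 - p)
    a21, a22 = -(1.0 - q) * (1.0 - alpha), 1.0 - q
    det = a11 * a22 - a12 * a21

    def solve(b1, b2):
        # Cramer's rule for the 2x2 system
        return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

    m_on, m_off = solve(1.0, 1.0)
    s_on, _ = solve(2.0 * m_on - 1.0, 2.0 * m_off - 1.0)
    # An inter-arrival period starts right after an arrival, i.e. in state ON.
    return m_on, s_on / m_on**2 - 1.0
```

When the source is always ON (p = 1), the result reduces to the geometric case, with mean 1/α and squared coefficient of variation 1 − α.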
The fourth condition implies that this model is only valid for small values of the buffer size of the last stage. In fact, since only one burst is directed to a given output port, the buffer size of this last stage may be quite low. The steady state probability of this Markov chain and the performance criteria may consequently be derived.
