Asynchronous transfer mode (ATM) switches based on shared buffering are known to have better performance and buffer utilization than input or output queued switches. Shared buffer switches do not suffer from head-of-line blocking, which is a problem in simple input buffering. Shared buffer switches have previously been studied under uniform and unbalanced traffic patterns. However, due to the complexity of the model, the performance of such a switch in the presence of a single hot spot has not been fully explored. In this article, we develop a model for a multistage ATM switch constructed of shared buffer switching elements and operating under a hot spot traffic pattern. The model is used to study the switch performance in terms of the throughput, cell delay, cell loss probability and the optimal buffer size.
Introduction
In recent years, Broadband ISDN (B-ISDN) has received increasing attention for its capability to provide a wide variety of services such as video communication, graphic applications, and high-speed data communications. One of the most promising approaches for B-ISDN is the asynchronous transfer mode (ATM). Among the proposed architectures for ATM switching fabrics, multistage switches have attracted a great deal of research interest due to the features they offer, such as modularity and decentralized routability, which make them ideal for VLSI implementation of packet driven structures such as large computer communication networks and ATM switches in the B-ISDN network.
A blocking type of multistage switch suffers from cells contending for the same outlet within the switch, which degrades the performance of the switch. Switches based on Delta, Omega, and Banyan networks are examples of blocking types of switches. One technique to enhance the performance of multistage switches is to use internal buffering. The cells losing contention at the switching elements (SEs) are stored in the buffers of the SEs. The location of the buffers in an SE is crucial to the throughput, delay, and cost of the switch. Input, output, and shared buffering are among the types of internal buffering whose performance has been widely studied by researchers in multiprocessor systems [1] [2] [3] and communications networks [4] [5] [6] [7] [8] [9] [10] [11] .
Turner [9] developed a model for a multistage switch with shared buffer SEs under a uniform output traffic distribution. His model assumes independence between buffer slots, and uses local flow control to avoid cell loss inside the switch. His model was extended by Monterosso [10] and Bianchi [12] to obtain more accurate models. A model for a switch using shared buffer SEs operating under a uniform traffic pattern and a global flow control policy has been reported in Ref. [11] . Gianatti and Pattavina [13] studied shared buffer switches with non-uniform traffic patterns. However, they divide the outputs so that a group of outputs is hot and the rest are cold. Their model is therefore not suitable for studying switches with a single hot output. A single hot output occurs when one of the switch outputs becomes more popular than the others. Earlier simulation studies have shown the detrimental effect of hot spots on the performance of shared buffer switches.
The objective of this article is to study the performance characteristics of shared buffer ATM switches with a single hot output. We develop an analytical model for a multistage ATM switch with local flow control and compare our results with simulation. The article is organized as follows. In Section 2, we describe the modeling assumptions and single hot spot model. In Section 4, we examine our model with some numerical examples, and compare the results with the simulation. Concluding remarks and further possible work are given in Section 5.
Shared buffer delta switch
A Delta-d switch with N inlets and N outlets consists of $k = \log_d N$ stages of d × d SEs.
In this article, we number the stages from 1 to k, where the stage at the input to the switch is referred to as stage 1. There exists only one path between each input and output of the switch, and each stage of the switch consists of N/d SEs. At stage i of a single hot spot Delta switch, there are i types of SEs, which carry different mixtures of hot and/or cold traffic [3] . A hot spot Delta-2 switch is illustrated in Fig. 1 .
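The structure described above can be sketched in a few lines of Python. The function names are hypothetical, and the digit order used for the destination tag (most significant base-d digit consumed at stage 1) is an assumption made for illustration; the actual order depends on the interstage wiring of the particular Delta network.

```python
def delta_dimensions(n_ports: int, d: int) -> tuple[int, int]:
    """Return (number of stages k, SEs per stage) for an N x N Delta-d switch."""
    k, size = 0, 1
    while size < n_ports:
        size *= d
        k += 1
    if size != n_ports:
        raise ValueError("N must be a power of d")
    return k, n_ports // d


def destination_tag(dest: int, d: int, k: int) -> list[int]:
    """Base-d digits of the destination address, most significant first.

    Assumption: digit i (1-based) selects the SE outlet taken at stage i.
    """
    digits = []
    for _ in range(k):
        digits.append(dest % d)
        dest //= d
    return digits[::-1]


k, per_stage = delta_dimensions(256, 2)
print(k, per_stage)               # 8 128
print(destination_tag(5, 2, k))   # [0, 0, 0, 0, 0, 1, 0, 1]
```

Because every destination address maps to exactly one digit sequence, each input reaches a given output along a single path, which is the unique-path property stated above.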
We apply the following assumptions regarding the switch and its operation:
• Each SE is of size d × d and contains B buffers which are shared by the d inlets and d outlets of the SE.
• The switch operates synchronously, i.e. cells are submitted to the switch at the beginning of the time slots.
• A destination tag is used to route each cell. A routing conflict inside the switch is resolved randomly, i.e. if two or more cells are destined to the same outlet, one is chosen at random.
• The probability of a cell arriving at a switch input and being destined to the hot output ($p_h$) or to any one of the $N - 1$ cold outputs ($p_c$) is given by:
where $r$ is the probability of a cell arriving at any input of the switch during a cycle, and $f_h$ is the fraction of the hot traffic.
• A local acknowledgment [9] flow control is assumed to prevent any cell loss inside the Delta switch.
• There is no blocking at an output of the switch, i.e. an output can always accept a cell.
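As a rough illustration of the hot spot traffic assumption, the sketch below uses a common hot spot formulation: with probability $f_h$ a cell is addressed to the hot output, and otherwise its destination is uniform over all N outputs. This formulation and the function name are assumptions made here, since the article's own expression is not reproduced in this text.

```python
def hot_spot_probabilities(r: float, f_h: float, n_outputs: int) -> tuple[float, float]:
    """Per-input, per-cycle probability of a cell for the hot output (p_h)
    and for one particular cold output (p_c).

    Assumed model: a fraction f_h of the traffic goes to the hot output;
    the remainder is spread uniformly over all N outputs, hot one included.
    """
    uniform_share = (1.0 - f_h) / n_outputs
    p_h = r * (f_h + uniform_share)
    p_c = r * uniform_share
    return p_h, p_c


p_h, p_c = hot_spot_probabilities(r=0.5, f_h=0.05, n_outputs=256)
# The probabilities over the hot output and the N - 1 cold outputs
# must add up to the offered load r.
assert abs(p_h + 255 * p_c - 0.5) < 1e-12
```

With $f_h = 0$ the two probabilities coincide and the traffic is uniform.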
We model each SE by a Markov chain representing the distribution of the hot and cold cells stored in the B buffers of the SE. The state of an SE is represented by a pair $(h, c)$, where $h$ is the number of cells destined to the hot outlet of the SE and $c$ is the number of cells destined to the other $d - 1$ cold outlets of the SE. We label the hot switch at any stage as a type 1 SE. An SE is of type $i$ if it is fed by a type $i - 1$ SE in the previous stage (Fig. 1) . It has been shown in [3] that stage $i$ will have $i$ different types of SEs and $i + 1$ different traffic rates at its outlets.
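The state space of one SE follows directly from this definition: every pair $(h, c)$ with $h + c \le B$ is a state. A minimal sketch, with a hypothetical helper name:

```python
def se_states(B: int) -> list[tuple[int, int]]:
    """All (h, c) pairs with h hot cells and c cold cells sharing B buffers."""
    return [(h, c) for h in range(B + 1) for c in range(B + 1 - h)]


states = se_states(2)
print(states)        # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)]
print(len(states))   # 6
```

The number of states is $(B + 1)(B + 2)/2$, so the chain for a single SE grows quadratically with the buffer size.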
The following notation is used in the model:
• $b_{i,r,x}$: Probability that the successor of a type $r$ SE at stage $i$ acknowledges a cell from the type $x$ outlet of the SE, given that a cell was submitted to the successor through outlet $x$ during the same cycle. $x$ is either a hot or a cold outlet.
• $Y_d(r, s)$: Probability that $s$ cells in an SE are destined to $r$ distinct outlets of that SE. $s$ is the sum of hot and cold cells in state $(h, c)$.
• $u_{i,r,j}$: Probability that a cell in a type $r$ SE at stage $i$ is destined to its $j$th outlet, where $1 \le j \le d$.
• $t_{i,r}(h_1, c_1, h_2, c_2)$: Probability that an SE is in state $(h_2, c_2)$ in the current cycle, given that it was in state $(h_1, c_1)$ in the previous cycle.
• $B$: Total buffer space in an SE.
Our objective is to calculate the steady state vector $P_{i,r}$ for every type $r$ SE at all of the stages where:
Our performance measures, viz. the throughput, cell loss, and delay, can then be derived from these vectors. In the steady state, the Markov chain model can be described as:
where $d$ is the number of inlets and outlets of an SE. As we assume that the cold traffic of an SE is distributed uniformly over all $d - 1$ cold outlets of the SE, we can use the $Y_d$ formula derived in Ref. [12] :
$Y_d$ is independent of the SE type and stage, and so it can be calculated once and reused for the rest of the calculations.
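To make the steady state computation concrete, the following sketch finds the stationary vector of a small row-stochastic matrix by repeated multiplication. The two-state matrix is hypothetical and is not taken from the article; in the model itself the chains of the different SEs are coupled through the acceptance probabilities and arrival rates, which this sketch does not reproduce.

```python
def steady_state(T: list[list[float]], tol: float = 1e-12,
                 max_iter: int = 100_000) -> list[float]:
    """Solve pi = pi * T for a row-stochastic matrix T by power iteration.

    Assumes the chain is irreducible and aperiodic, so the iteration converges.
    """
    n = len(T)
    pi = [1.0 / n] * n                       # start from the uniform vector
    for _ in range(max_iter):
        nxt = [sum(pi[i] * T[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical 2-state chain used only to exercise the function.
T = [[0.9, 0.1],
     [0.5, 0.5]]
pi = steady_state(T)
print([round(x, 4) for x in pi])   # [0.8333, 0.1667]
```

For an SE with B buffers, the rows and columns of T would be indexed by the $(h, c)$ states, and the entries would be the transition probabilities of that SE type.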
$b_{i,r,x}$, the probability that a cell sent through an outlet of type $x$ of SE$_{i,r}$ is accepted by its next stage, depends on the stage and type of the SE. The value of $b_{i,r,x}$ depends on whether it is being calculated at the last stage of the switch or at any of the other stages, as shown below. Note that the last stage differs from the other stages in that there is no blocking at its output.
1. $i = k$: As there is no blocking at the outputs of the switch, the probability $b_{i,r,x}$ of acceptance of an offered cell at stage $k$ is equal to 1.
2. $i < k$: A cell offered through a particular outlet of an SE is definitely accepted by its successor SE if the successor has at least $d$ free buffers, or if the total number of cells offered to the other $d - 1$ inlets of the successor SE is less than the number of available buffers in that SE. Otherwise, only a fraction of the offered cells are acknowledged.
$$\sum_{B-d+1 \le h_1 + c_1 \le B-1} p_{i+1,s}(h_1, c_1) \, \cdots$$
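The acceptance rule for $i < k$ can be illustrated for a single, known successor state. The sketch assumes that, when more cells are offered than there are free buffers, the accepted cells are drawn uniformly at random from the offered ones, and it ignores cells that leave the successor in the same cycle. Both points are simplifying assumptions, and the function name is hypothetical.

```python
def acceptance_probability(free_buffers: int, other_offered: int) -> float:
    """Chance that a tagged cell is accepted by the successor SE.

    free_buffers  : empty buffer slots in the successor SE
    other_offered : cells offered on the successor's other inlets this cycle
    Assumption: when demand exceeds space, winners are chosen uniformly at random.
    """
    offered = other_offered + 1          # include the tagged cell itself
    if offered <= free_buffers:
        return 1.0                       # enough room for every offered cell
    return free_buffers / offered        # only a fraction is acknowledged


d = 2
print(acceptance_probability(free_buffers=d, other_offered=d - 1))   # 1.0
print(acceptance_probability(free_buffers=1, other_offered=1))       # 0.5
print(acceptance_probability(free_buffers=0, other_offered=0))       # 0.0
```

In the model, a quantity of this kind would be averaged over the steady state distribution of the successor SE and over the number of competing cells to obtain $b_{i,r,x}$.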
$u_{i,r,j}$, the probability that a cell in a type $r$ SE at stage $i$ is destined to the $j$th outlet of the SE, is determined by:

For the last stage, $\mathrm{enum}_{k,r,j}$, the probability that a cell is referencing a hot or cold output, is simply calculated by:

$\mathrm{denom}_{i,r}$ is the probability that any output of the Delta switch accessible from SE$_{i,r}$ is referenced by a cell inside that SE.
Performance evaluation
In the steady state of the switch, the throughput and delay of the various SEs can be computed. The throughput of the hot outlet of an SE is equal to the sum of all possible $t_{i,r}$ transitions from an initial state $(h, c)$ to state $(h - 1, c)$, as we have assumed that there is only one outlet (hot) through which hot cells can leave the SE:
(17)
Eq. (17) applies to all the stages including the first stage where S becomes irrelevant. The delay of the hot and cold outlets of an SE may be calculated using Little's law. As we assume that no cell loss occurs inside the switch, the input and output rates of an SE are the same. Thus, we can use the throughput equations to calculate the delay. For the hot outlet we have:
(21)
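Little's law states that the mean delay equals the mean number of queued cells divided by the throughput. A minimal sketch for the hot outlet is given below; the steady state distribution and the throughput value are hypothetical numbers chosen for illustration, not results from the model.

```python
def mean_hot_delay(p: dict[tuple[int, int], float], hot_throughput: float) -> float:
    """Little's law: mean delay = mean number of hot cells / hot-outlet throughput.

    p maps each state (h, c) to its steady state probability.
    """
    mean_hot_cells = sum(h * prob for (h, _c), prob in p.items())
    return mean_hot_cells / hot_throughput


# Hypothetical steady state distribution for B = 2 (illustrative values only).
p = {(0, 0): 0.4, (1, 0): 0.2, (0, 1): 0.2,
     (1, 1): 0.1, (2, 0): 0.05, (0, 2): 0.05}
print(mean_hot_delay(p, hot_throughput=0.25))   # about 1.6 cycles
```

The same function applied to the cold cell count $c$ and the cold-outlet throughput would give the corresponding cold delay.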
Numerical results
In this section, we present results for the normalized throughput, average delay, and cell loss probability of a Delta switch with N = 256 and d = 2. The cell loss probability is the ratio of the number of cells dropped to the number of cells arriving at an input to the switch. It is obtained from the difference between the rate at which cells arrive at the input and the throughput of the switch.
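The definition of the cell loss probability translates into a one-line calculation. The sketch below assumes that the offered load and the throughput are both expressed per input and per cycle; the numbers in the example are illustrative and are not taken from the figures.

```python
def cell_loss_probability(offered_load: float, throughput: float) -> float:
    """Fraction of arriving cells that are dropped: (arrivals - departures) / arrivals."""
    if offered_load <= 0:
        raise ValueError("offered load must be positive")
    return (offered_load - throughput) / offered_load


print(cell_loss_probability(offered_load=0.5, throughput=0.25))   # 0.5
```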
The aforementioned performance parameters are illustrated in Figs. 2-4. The proposed model is accurate when the input load is small. When the input load is more than 0.4, the model is still accurate when the buffer size is small (B = 2) and the hot spot value is more than 0.05. The model is optimistic for larger buffer sizes (B = 4) or smaller hot spot values, as it assumes that the output addresses of the cells are independent of each other, whereas in reality a blocked cell that attempts a particular outlet of an SE in the current cycle will definitely attempt the same outlet in the subsequent cycles, too. The probability of a cell being blocked increases as the input load increases, and so does the discrepancy between the results from the model and the simulation.

The cell loss in a shared buffer MIN is very low when the B/d ratio is large and the traffic is uniform. However, when a hot spot is introduced into the switch, the cell loss increases sharply as the input load increases. Although the throughput of the hot output of a switch increases sharply when the hot spot value increases, the overall throughput of the switch decreases due to the buffer monopolization effect caused by the hot traffic [14] . The impact of the hot spot probability on the performance of a shared buffer switch is illustrated in Fig. 7 for two different input loads. For large hot spot probabilities, the throughput drops significantly, almost regardless of the buffer size, as there is no mechanism to prevent the hot traffic from monopolizing the buffers. Buffer monopolization in shared buffer switches can be minimized if a buffer management scheme is used to limit the maximum number of buffers used by any outlet to some specified value.
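The buffer management idea mentioned above, a cap on the number of buffers any single outlet may occupy, can be sketched as a simple admission check. The class and its parameter `cap` are hypothetical and are not part of the analytical model in this article.

```python
class CappedSharedBuffer:
    """Shared buffer of B cells in which no outlet may hold more than `cap` cells."""

    def __init__(self, B: int, d: int, cap: int) -> None:
        self.B, self.cap = B, cap
        self.queued = [0] * d            # cells currently stored per outlet

    def try_accept(self, outlet: int) -> bool:
        """Admit a cell for `outlet` if total space and the outlet's quota allow it."""
        if sum(self.queued) >= self.B or self.queued[outlet] >= self.cap:
            return False
        self.queued[outlet] += 1
        return True

    def depart(self, outlet: int) -> None:
        """Remove one cell that has been forwarded through `outlet`."""
        if self.queued[outlet] > 0:
            self.queued[outlet] -= 1


buf = CappedSharedBuffer(B=4, d=2, cap=3)
print([buf.try_accept(0) for _ in range(4)])   # [True, True, True, False]
print(buf.try_accept(1))                       # True
```

In this example the hot outlet (outlet 0) is refused a fourth buffer even though one slot is still free, so a cell for the other outlet can still be stored.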
Conclusion
We have developed an analytical model to study the performance of multistage ATM switches constructed using shared buffer switching elements. The model can be used to study the throughput, cell delay, cell loss probability and buffer requirements in such switches. It can be used by switch designers to study the effect of the different switch parameters on the performance, and to optimize the cost/performance ratio of the switch. We have also compared the results obtained from the model with computer simulations, and they have been found to be in close agreement.
The model does not account for the correlation of cells in successive cycles. This allows a cell that is blocked during a cycle to bypass the congested route, which gives rise to a higher throughput than in the simulation. In reality, a blocked cell in an SE always hunts for the same outlet of the SE during successive cycles. The proposed model can be easily modified to handle local flow control and other topologies of multistage switches.
