new bus structure, called the hierarchical bus structure. The exact analytical model of a 2-level hierarchical bus is developed in this paper. The results from both the analytical model and simulation are shown in the paper. These results show that a hierarchical bus would be a cost effective bus structure as compared to the conventional multiple bus and partial multiple bus structures.
I. INTRODUCTION
A single bus multiprocessor system is the simplest and least expensive of all the bus-based systems. However, a single bus provides very limited bandwidth, so only a few processors can be connected to it. Another disadvantage of a single bus system is that the entire system fails if the bus fails. Multiple bus interconnection is shown in Figure 1. In a partial multiple bus system all the processors are connected to all the buses, but the memory modules are divided into a number of groups, and two or more memory modules are connected to a bus if they belong to the same group. Figure 2 shows an example of a partial multiple bus system with two memory groups. The number of connections in a multiple bus system with N processors, M memory modules and B buses is B(N + M). On the other hand, the number of connections in a partial multiple bus system with g memory groups is B(N + M/g). Lang also proposed [2] a number of schemes to reduce the number of connections in a multiple bus system. Some improvements to the partial multiple bus have been shown in [4].
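The two connection counts quoted above can be compared directly. The following is a minimal sketch; the function names and the example sizes are illustrative assumptions, and the memory modules are assumed to divide evenly into the groups.

```python
def multiple_bus_connections(n_proc, n_mem, n_bus):
    """Connections in a full multiple bus system: B(N + M)."""
    return n_bus * (n_proc + n_mem)


def partial_multiple_bus_connections(n_proc, n_mem, n_bus, n_groups):
    """Connections in a partial multiple bus system: B(N + M/g).

    Assumes (for this sketch) that M is an exact multiple of g.
    """
    if n_mem % n_groups != 0:
        raise ValueError("n_mem must be divisible by n_groups")
    return n_bus * (n_proc + n_mem // n_groups)


if __name__ == "__main__":
    # Hypothetical system: 16 processors, 16 memory modules, 4 buses, 2 groups.
    full = multiple_bus_connections(16, 16, 4)
    partial = partial_multiple_bus_connections(16, 16, 4, 2)
    print("multiple bus:", full)
    print("partial multiple bus:", partial)
```

With g = 1 the partial count reduces to the full multiple bus count, which is a quick sanity check on the two formulas.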
This paper presents the analysis of a different bus structure called the hierarchical bus structure. The number of connections in a hierarchical bus system is proportional to (N + M) log(Bh), where Bh is the number of buses in the hierarchical bus system. A 3-level hierarchical bus is shown in Figure 3. A hierarchical bus structure could be visualized as 3 ). There are three independent buses in the multiple bus system shown in Figure 1, but there are seven independent buses in the system shown in Figure 3. Thus, the hierarchical bus system of Figure 3 would be able to handle more traffic than the multiple bus system of Figure 1, and more processors could be connected to it. However, the connection costs of the systems shown in Figures 1 and 3 are the same, which means the hierarchical bus would be more cost effective than the multiple bus.

This paper presents an exact analytical model of a 2-level synchronous hierarchical bus system, shown in Figure 4, under the following assumptions:

(a) The processors are synchronized.
(b) The cycle time of all the memory modules is the same and constant.
(c) The requests generated in a cycle are random and independent of one another.
(d) The requests generated in successive cycles are independent of the requests issued in the previous cycle.
(e) Requests which are not accepted are rejected.
(f) A level-1 bus Bi, $1 \le i \le 2$, is used to set up a communication between a processor and a memory module of cluster Ci.
(g) The level-2 bus B3 is used to set up a communication between a processor of one cluster and a memory module of another cluster.

Assumption (d) is not realistic, but it simplifies the analysis and does not introduce much difference in the actual result [3].
Let $p_0$ be the probability that a processor generates a request in every cycle. Let $p_i$ be the probability that a processor sends its request through a level-i bus. Let us assume that each cluster has N processors and M memory modules. Let $P_i$ and $M_k$ be a processor and a memory module in cluster C1. The probability of $P_i$ requesting $M_k$ is given by $p_0 p_1 / M$. Hence, the probability that $P_i$ does not request $M_k$ is $(1 - p_0 p_1 / M)$. The probability that no processor of cluster C1 requests $M_k$ can be expressed as $(1 - p_0 p_1 / M)^N$. Hence, the probability that at least one processor of C1 requests $M_k$ is given by $x = 1 - (1 - p_0 p_1 / M)^N$. Similarly, it can be shown that the probability that at least one processor of C2 requests $M_k$ is given by $y = 1 - (1 - p_0 p_2 / M)^N$. Since a level-1 bus is used to set up a communication between a processor and a memory module of the same cluster, and the level-2 bus is used to set up a communication between a processor and a memory module of different clusters, the maximum bandwidths $BW_{1,max}$ and $BW_{2,max}$ of the level-1 and level-2 buses can be expressed as:
Thus, the upper bound on the maximum bandwidth of a 2-level hierarchical bus system (shown in Figure 4) can be expressed as
The actual bandwidth will be less than the expression shown in (2) because a bus may sometimes be blocked by a memory conflict between its reference and the reference of another bus. In a 2-level hierarchical bus system, a level-1 bus will not be blocked if two or more memory modules are requested through that level-1 bus during a memory cycle, because a level-1 bus can conflict only with the level-2 bus B3. Since the level-2 bus B3 may conflict with both level-1 buses, B3 will not be blocked if three or more memory modules are requested through it during a memory cycle. Figure 5 shows one case (Case-1) where a conflict occurs between the buses B1 and B3. In this case, only one memory module (say $M_i$) of cluster C1 is requested through the buses B1 and B3.
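The request probabilities x and y defined earlier are used throughout the conflict analysis, so it can help to evaluate them numerically. The following is a minimal sketch of those two definitions only. The function name and the example parameter values are illustrative assumptions, and the sketch does not reproduce the bandwidth expressions.

```python
def prob_module_requested(p0, p_level, n_proc, n_mem):
    """Probability that at least one of n_proc processors requests a given
    memory module, i.e. 1 - (1 - p0 * p_level / M)^N.

    p0      : probability that a processor issues a request in a cycle
    p_level : probability that the request uses the bus level in question
              (p1 for x, p2 for y)
    n_proc  : N, processors per cluster
    n_mem   : M, memory modules per cluster
    """
    per_processor = p0 * p_level / n_mem
    return 1.0 - (1.0 - per_processor) ** n_proc


if __name__ == "__main__":
    # Hypothetical example: N = M = 8, every processor requests each cycle,
    # and requests are split evenly between the two bus levels.
    x = prob_module_requested(p0=1.0, p_level=0.5, n_proc=8, n_mem=8)
    y = prob_module_requested(p0=1.0, p_level=0.5, n_proc=8, n_mem=8)
    print("x =", x)
    print("y =", y)
```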
The probability that only one memory module of C1 is requested by the C1 processors and no memory module of C2 is requested by the C1 processors is given by $M x (1 - x)^{M-1} (1 - y)^{M}$. A conflict occurs between the B1 and B3 buses if the same memory module is also requested by the C2 processors. Given that there is a memory reference on the bus B1, the probability that the same memory module is also requested by the C2 processors is given by $y (1 - y)^{M-1}$. Hence, the probability that a Case-1 conflict occurs is given by the product of these two probabilities, $Pr_1 = M x (1 - x)^{M-1} (1 - y)^{M} \cdot y (1 - y)^{M-1}$.

Figure 6 shows another case (Case-2) where the conflict occurs among all three buses. Here, exactly one memory module of C1 (say $M_i$) and exactly one memory module of C2 (say $M_j$) are requested by both the C1 and C2 processors. The probability that a Case-2 conflict occurs is given by
Given that a Case-1 conflict has occurred, the probability that a level-1 bus will be blocked is 0.5, because only one of the two buses has to be blocked. When a Case-2 conflict occurs, only one bus will be blocked and the other two buses will be allowed to serve the memory requests. Hence, the probability that any particular bus will be blocked given that a Case-2 conflict has occurred is 0.333. Thus, the probability that a level-1 bus will be blocked due to memory conflict is given by $Pc_1 = Pr_1/2 + Pr_2/3$.
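The blocking probability of a level-1 bus can be sketched numerically as follows. This sketch assumes that the Case-1 conflict probability is the product of the two factors stated in the text, which is a reading of the derivation and not a confirmed formula. The Case-2 probability is taken as an input because its expression is not available here. All function names are illustrative.

```python
def case1_conflict_probability(x, y, n_mem):
    """Assumed Case-1 probability: product of the two factors in the text.

    first  = M * x * (1 - x)^(M-1) * (1 - y)^M   (only one C1 module requested
             by the C1 processors, no C2 module requested by them)
    second = y * (1 - y)^(M-1)                   (the same module is also
             requested by the C2 processors)
    """
    first = n_mem * x * (1.0 - x) ** (n_mem - 1) * (1.0 - y) ** n_mem
    second = y * (1.0 - y) ** (n_mem - 1)
    return first * second


def level1_blocking_probability(pr1, pr2):
    """Blocking probability of a level-1 bus: Pr1/2 + Pr2/3.

    pr1 : probability of a Case-1 conflict (one of two buses is blocked)
    pr2 : probability of a Case-2 conflict (one of three buses is blocked);
          supplied by the caller, since its formula is not given here
    """
    return pr1 / 2.0 + pr2 / 3.0


if __name__ == "__main__":
    # Hypothetical values of x, y and M chosen only to exercise the functions.
    pr1 = case1_conflict_probability(x=0.4, y=0.4, n_mem=4)
    print("Pr1 =", pr1)
    print("Pc1 with Pr2 = 0:", level1_blocking_probability(pr1, 0.0))
```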
Since a Case-1 conflict can occur between the B1 and B3 buses and between the B2 and B3 buses, the probability that the level-2 bus B3 will be blocked due to memory conflict is given by $Pc_2 = Pr_1 + Pr_2/3$.

Table 1 shows the bandwidth obtained from the analytical model and from the simulation. The simulation result was obtained by averaging the bandwidth available over a period of 500 memory cycles. In most cases the simulation result is within 10% of the analytical result. However, in a few cases the simulation result is almost 20% higher than the analytical result. This large difference may occur mainly because 500 memory cycles may not be long enough to obtain accurate results.

Table 2 compares the bandwidth of a 2-level hierarchical bus system with the bandwidth of a multiple bus system with 2 buses. The bandwidths of the hierarchical bus system are shown for two cases: UMA and BLE. In the UMA (uniform memory access) case a processor generates memory requests for all the memory modules with equal probability, i.e. $p_1 = 0.5$ and $p_2 = 0.5$. In the BLE case the buses are loaded equally, i.e. $p_1 = 0.667$ and $p_2 = 0.333$. Table 2 shows that the bandwidth of a 2-level hierarchical bus system is higher than the bandwidth of a multiple bus system with two buses, while the number of connections is the same in both systems. Hence, a hierarchical bus system would be more cost effective than a multiple bus system. The hierarchical bus system would be even better when the number of levels is increased, because the bandwidth of the system would be proportional to the number of buses whereas the number of connections would be proportional to the logarithm of the number of buses.
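The scaling argument can be illustrated with a short sketch. It assumes a binary hierarchy in which L levels give 2^L - 1 buses and every processor and memory module has one connection per level. This assumption is consistent with the 3-bus and 7-bus examples and with the equal connection costs noted for Figures 1 and 3, but it is not stated explicitly in the text. Here the processor and memory counts are system-wide totals, and the function names are illustrative.

```python
def hierarchical_bus_count(levels):
    """Number of buses in an assumed binary hierarchy with the given levels."""
    return 2 ** levels - 1


def hierarchical_connections(total_proc, total_mem, levels):
    """Assumed connection count: one connection per level for each
    processor and each memory module."""
    return levels * (total_proc + total_mem)


def multiple_bus_connections(total_proc, total_mem, n_bus):
    """Connection count of a full multiple bus system: B(N + M)."""
    return n_bus * (total_proc + total_mem)


if __name__ == "__main__":
    total_proc, total_mem = 16, 16  # hypothetical system size
    for levels in range(1, 6):
        buses = hierarchical_bus_count(levels)
        conns = hierarchical_connections(total_proc, total_mem, levels)
        # A multiple bus system with the same number of buses, for comparison.
        flat = multiple_bus_connections(total_proc, total_mem, buses)
        print(levels, "levels:", buses, "buses,", conns,
              "connections (multiple bus with same bus count:", flat, ")")
```

Under this assumption the number of buses roughly doubles with each added level, while the connection count grows by only N + M per level.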

IV. DISCUSSIONS ON HIERARCHICAL BUS SYSTEM
A hierarchical bus structure could be used most efficiently if all the buses could be loaded almost equally.
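The equal-loading condition can be checked with a small calculation for the 2-level system. The sketch below assumes that the load on a bus is the expected number of requests offered to it per cycle: a level-1 bus receives requests from the N processors of its own cluster with probability $p_0 p_1$ each, and the level-2 bus receives requests from the 2N processors of both clusters with probability $p_0 p_2$ each. This definition of load and the function name are assumptions made for illustration.

```python
def offered_bus_loads(p0, p1, p2, n_proc_per_cluster):
    """Expected requests per cycle offered to one level-1 bus and to the
    level-2 bus of a 2-level, 2-cluster hierarchical bus system."""
    level1_load = n_proc_per_cluster * p0 * p1       # own cluster only
    level2_load = 2 * n_proc_per_cluster * p0 * p2   # both clusters share B3
    return level1_load, level2_load


if __name__ == "__main__":
    n = 8  # hypothetical number of processors per cluster
    print("UMA:", offered_bus_loads(1.0, 0.5, 0.5, n))
    print("BLE:", offered_bus_loads(1.0, 2.0 / 3.0, 1.0 / 3.0, n))
```

Setting the two loads equal with $p_1 + p_2 = 1$ gives $p_1 = 2 p_2$, i.e. $p_1 = 2/3$ and $p_2 = 1/3$, which matches the BLE values of 0.667 and 0.333. In the UMA case the level-2 bus is offered twice the load of a level-1 bus.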
One disadvantage of the hierarchical bus structure is that as the number of levels is increased the higher level buses are heavily loaded. This loading effect can be reduced by using multiple buses at the higher levels, and connecting the processors of one cluster and the memory modules of another cluster to a bus. An example of the reduced loading for a 2-level hierarchical bus system is shown in Figure 9. This technique increases the number of buses without increasing the number of connections. Thus, the bandwidth would increase without any additional cost for connections.
V. CONCLUSION
This paper presents a new bus structure called the hierarchical bus structure. From both the analytical and simulation results it has been found that a hierarchical bus would be more cost effective than the corresponding multiple bus system. A hierarchical bus would be most cost effective if the memory traffic could be distributed almost equally among all the buses. To achieve such optimum performance, most of the code and data to be used by a group of processors have to be kept in the cluster where those processors lie. This paper presents the preliminary model and results of a 2-level hierarchical bus system. The results are very good and encourage the investigation of more complicated hierarchical bus systems.

Figure 9: A 2-level hierarchical bus system with reduced loading for the higher level buses.
