The purpose of this paper is to understand the relationship between utilization, fairness and access delay in high speed slotted bus networks. We illustrate this relationship by means of a protocol called FUFA (fully utilized and fair). We define full utilization, and fairness precisely, and show that both are achieved together in the FUFA protocol. In addition, the protocol provides bounded access delay that is linear in the round trip propagation delay, and at most a constant away from its minimum possible value for any bus protocol that is both fully utilized and fair. The main idea is that each station takes account of the idle slots propagated previously to interpret the information from downstream (i.e., estimated aggregate number of data segments in queue downstream and estimated number of active downstream stations). This allows the active downstream stations to be served in a round robin fashion according to the updated information.
Introduction
High speed slotted bus networks [6, 7, 9, 11] are attractive candidates for metropolitan area networks (MANs), backbone networks for LANs, and feeder networks for ATM. One such network is the well known distributed queue dual bus (DQDB) network with bandwidth balancing (the IEEE 802.6 standard for MANs) [5]. Although this has not been a great commercial success, it has many interesting features, and has generated a large literature of suggested modifications and improvements. Currently, there are attempts to build slotted bus networks at much higher speeds, in the multiple Gb/s range [1, 3, 11]. In all of these network proposals, there are tradeoffs between utilization, fairness, access delay, overhead, and complexity. Our objective here is to understand the relationship between utilization, fairness, and access delay. We illustrate this relationship by means of a protocol called FUFA (fully utilized and fair) that shows that full utilization, fairness, and bounded access delay are in fact compatible with each other. The protocol has somewhat greater overhead and complexity than most of the above proposals, and thus is not intended as a practical protocol for very high speed applications, but rather as a means to demonstrate the above compatibility and to understand the kind of inter-station communication required to achieve it.
We define full utilization for a slotted bus network as the property that a station with data segments to send never releases an idle slot unless that idle slot is used by some further downstream station. As an example of an algorithm with full utilization, consider the greedy algorithm, i.e., the algorithm in which each station with traffic to send fills each passing idle slot. The algorithm is not fair, since the station at the head-end can monopolize the bus. Similarly, there is no bound on access delay, since downstream stations can wait forever for an idle slot. Therefore, in order for an algorithm to be fully utilized and be fair or have bounded access delay, it must be non-greedy.
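The monopolization argument can be illustrated with a minimal sketch. It assumes a simplified model that is not part of the paper: propagation delay is ignored, every slot starts idle at the head-end, and the function name `simulate_greedy` and its parameters are hypothetical.

```python
def simulate_greedy(queues, num_slots):
    """Simulate a greedy slotted bus, ignoring propagation delay.

    queues[i] is the number of segments waiting at station i, where
    index 0 is the station at the head-end. Each slot starts idle and
    is filled by the first station, in bus order, that has a segment.
    Returns the number of slots used by each station.
    """
    remaining = list(queues)
    used = [0] * len(queues)
    for _ in range(num_slots):
        for i, backlog in enumerate(remaining):
            if backlog > 0:
                # Greedy rule: a station with traffic fills the idle slot.
                remaining[i] -= 1
                used[i] += 1
                break
    return used


# A heavily loaded head-end station starves both downstream stations.
print(simulate_greedy([1000, 1, 1], 100))  # prints [100, 0, 0]
```

No slot is wasted in this run, so the greedy rule is fully utilized, yet the two downstream stations receive nothing for as long as the head-end queue stays non-empty.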
We now show that for a non-greedy algorithm to be fully utilized, it must have a certain kind of feedback information. Suppose the bus is idle and that at a certain time, an upstream station receives an unlimited number of data segments to send and that at close to the same time, a downstream station receives a single segment to send. The downstream station must send some type of request upstream (otherwise the upstream node, in ignorance of the downstream segment, must fill every idle slot to achieve full utilization). When the upstream station receives the request, in order to make a decision on whether to propagate the next idle slot without wasting it, it needs to check how many idle slots it has already propagated during the last round trip propagation delay to the downstream station. These are the idle slots that the downstream station would receive before the next slot arrives there.
Although many existing protocols [3, 5-7, 9-11] (including the modifications of DQDB surveyed in [8]) can provide high utilization, none of them incorporates these recently transmitted idle slots in its decision, and thus none achieves full utilization as defined here. In the FUFA protocol, each station takes account of the idle slots propagated previously to interpret the information from downstream (i.e., estimated aggregate number of data segments in queue downstream and estimated number of active downstream stations). This allows the active downstream stations to be served in a round robin fashion according to the updated information. We will see later that the FUFA protocol actually achieves full utilization.
Besides the full utilization property, FUFA provides fairness in the following sense. If a subset of the stations are very active and the rest are idle, the FUFA protocol distributes idle slots fairly among active stations in a round robin fashion, and the cycle is established within one round trip propagation delay between the most upstream and most downstream active stations.
A protocol is defined to have the bounded access delay property if the access delay of the first data segment in queue at each station is bounded. We show that the FUFA protocol provides a bounded access delay that is linear in the round trip propagation delay, and only a constant away from its minimum value for any bus protocol that is both fully utilized and fair.
The remainder of this paper is organized as follows. In section 2 we describe the basic dual bus topology. In section 3 we define what we mean by full utilization, fairness, and bounded access delay. In section 4, we describe the FUFA protocol. We start with the basic concept, and then give a full description of the FUFA protocol, followed by some basic properties of the protocol. In section 5 we state the full utilization, fairness and bounded access delay properties of FUFA. Finally we conclude our results in section 6.
Basic dual bus network
The dual bus topology we consider here is identical to that used in DQDB (see Figure 1). Note that full utilization and bounded access delay, as defined here, are for general, arbitrarily varying traffic conditions.
Fully utilized and fair (FUFA) protocol
Basic Concept
Since all the idle slots on the data bus are generated from the head-end, the most upstream station, station 1, has first access to idle slots. The basic concept of the protocol is to give equal access to all the stations, according to the most updated information available through the reservation bus. In particular, according to the information available, each station estimates the number of active downstream stations and uses a counter to serve them in a round robin fashion. The novel feature of this protocol is that each station takes account of the idle slots propagated previously to interpret the information from downstream (i.e., estimated aggregate number of data segments in queue downstream and estimated number of active downstream stations).
Parameters
The following parameters are used in the protocol.
At time $t$, the information available at station $i \in \{1, 2, \ldots, K\}$ is as follows:
Distributed algorithm
The algorithm is described in discrete time with the assumption of zero processing delay. ([a] covers the case with non-discrete time and non-zero processing delay.) At time $t$, the information available at station $i$ includes $I_i(t)$ and $n_i(t)$.
Before station $i$ receives any information from downstream, it uses idle slots whenever it can, with its round robin counter $C_i(t)$ being 0. This is also the algorithm for the most downstream station $K$ at all $t$.
In general, for time $t$ and station $i \in \{1, 2, \ldots, K-1\}$, the algorithm runs as follows:
if $I_i(t) = 1$, the passing slot is idle, and then occupy it, and set $C_i(t)$ to $K - i$,
if $I_i(t) = 1$, the passing slot is idle, and then propagate it, and set $C_i(t)$ to $C_i(t) - 1$,
if $I_i(t) = 1$, and the passing slot is busy, then propagate it, and set $C_i(t)$ to $C_i(t)$,
then propagate the passing slot, and set $C_i(t)$
Obtain $M_i(t)$ and $m_i(t)$ as below, and send them to station $i - 1$,
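The round robin use of the counter can be sketched as follows. This is only an illustration under stated assumptions, not the protocol's exact rule: it assumes that a station with data occupies an idle slot when its counter is 0 and otherwise propagates the slot and decrements the counter, and that the counter is left unchanged when the station has no data or the slot is busy. The function `station_step` and the parameter `reset_value` (standing in for the station's current estimate of the number of active downstream stations) are hypothetical names.

```python
def station_step(has_data, slot_idle, counter, reset_value):
    """Process one passing slot at a station; return (action, new_counter).

    action is "occupy" or "propagate". reset_value is an assumed stand-in
    for the estimated number of active downstream stations.
    """
    if has_data and slot_idle:
        if counter == 0:
            # It is this station's turn: use the slot and restart the cycle.
            return "occupy", reset_value
        # A downstream station's turn: pass the idle slot on.
        return "propagate", counter - 1
    # Busy slot or nothing to send: pass the slot on, counter unchanged.
    return "propagate", counter


# A station with data and two active downstream stations, all slots idle.
counter = 0
actions = []
for _ in range(6):
    action, counter = station_step(True, True, counter, reset_value=2)
    actions.append(action)
print(actions)
```

Under these assumptions the station takes one idle slot in every three and releases the other two for the downstream stations.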
The core of the algorithm is the second step, where station $i$ uses the extra piece of information $n_i(t)$ to update $m_{i+1}(t - D_{i+1})$ and $M_{i+1}(t - D_{i+1})$. As an example, consider the case in which the information received from downstream reports 10 data segments queued at no more than 5 downstream stations (i.e., $m_{i+1}(t - D_{i+1}) = 10$ and $M_{i+1}(t - D_{i+1}) = 5$). On the other hand, station $i$ needs to take $n_i(t) = 3$ into consideration. In the absence of new arrivals, station $i$ knows that the next slot will see 7 data segments remaining at the queues of at most 5 downstream stations (i.e., $m_i^B(t) = 7$ and $M_i^B(t) = 5$). Consider the same example except that $n_i(t) = 7$. Again, in the absence of new arrivals, station $i$ knows that the next slot will see only 3 data segments remaining at the queues of at most 3 downstream stations (i.e., $m_i^B(t) = 3$ and $M_i^B(t) = 3$).
In order to guarantee the full utilization property, the decision made on idle slots should be based solely on the information received, not the probabilistic estimates of future arrivals. As a consequence, downstream stations still suffer from propagation delays. In order to compensate for this disadvantage, the protocol is designed with a bias towards downstream stations in the updating equation (1), where $M_i^B(t)$ takes its maximum possible value in the absence of new arrivals. This can be seen in the second example above. The 3 data segments remaining can be distributed at one station, or at most 3 stations, and $M_i^B(t) = 3$. On the other hand, the estimate $m_i^B(t)$, the aggregate number of data segments downstream, is the true value in the absence of new arrivals. This ensures the full utilization property.
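The two numerical examples can be reproduced with a short sketch. The updating equation (1) is not reproduced in this text, so the function below is only an assumed form that agrees with both examples: the aggregate estimate is reduced by the idle slots already propagated, and the station-count estimate is capped by the number of segments that remain. The names `update_estimates`, `m_received`, `M_received`, and `n_propagated` are hypothetical, and the received aggregate of 10 segments is the value implied by the examples (7 + 3 and 3 + 7).

```python
def update_estimates(m_received, M_received, n_propagated):
    """Adjust downstream estimates for idle slots already propagated.

    m_received:   reported aggregate number of queued downstream segments
    M_received:   reported number of active downstream stations
    n_propagated: idle slots already propagated and not yet reflected
    Returns (m_now, M_now). Assumed form, consistent with the examples.
    """
    # Each propagated idle slot serves one downstream segment.
    m_now = max(m_received - n_propagated, 0)
    # Remaining segments sit at no more stations than there are segments.
    M_now = min(M_received, m_now)
    return m_now, M_now


print(update_estimates(10, 5, 3))  # prints (7, 5)
print(update_estimates(10, 5, 7))  # prints (3, 3)
```

Taking the minimum keeps the station-count estimate at its largest feasible value, which matches the stated bias towards downstream stations.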
Properties of the FUFA protocol
In order to describe some basic properties, we first define the following parameters, for time t, s, and station 
Remark. Lemma 2 formalizes the intuitive fact that the estimated aggregate number of downstream data segments is positive if and only if the estimated number of active downstream stations is positive.
Full utilization, fairness and bounded access delay

Full utilization
Theorem 1: The protocol FUFA has full utilization according to Definition 1 in section 3.
Fairness
The fairness property of the FUFA protocol depends on Lemmas 3 and 4 below. See Figure 5 for an illustration of the timing. Remark. This means that the network converges to a fair state under the condition in Definition 2 in at most one round trip propagation delay between the most upstream and the most downstream active stations.
Bounded access delay
In order to prove the bounded access delay property, we first state Lemma 5 as follows:
Lemma 5: For any $i \in \{1, 2, \ldots, K\}$, any segment $P_i$, any $k$ satisfying $1 \le k < i$, and any $t$, denote by $t_i$ and $t_i'$ the times at which $P_i$ becomes the first segment in the queue and departs from the queue, respectively. Let $t_k$ be the time when this information propagates to station $k$, and let $T_k = \min\{t \ge t_k \mid C_k(t) = M_k^B(t)\}$ be the time that counter $C_k(t)$ is set to $M_k^B(t)$ by (2).

Here $K$ is the total number of stations in the network. For any general bus protocol that is both fully utilized and fair, let BF be an upper bound on the access delay for the first data segment at station $i$. Due to the full utilization property, idle slots are propagated by a nonempty station based only on the information that has been received. Therefore, the access delay of the first data segment at station $i$ can be as large as the round trip propagation delay between station $i$ and the most upstream station, station 1.
Besides the round trip propagation delay in (4), the round robin cycle under the condition in the definition of "fairness" can result in extra delay for a station to get access to an idle slot. Consider the following scenario. The most upstream station 1 is always active with a long queue. All the other stations $i > 1$ stay idle until $\tau_i(t)$, when many data segments arrive at the same time. Based on the full utilization property, station 1 will not propagate any idle slot until time $t$, when the information that downstream stations are active first arrives. Therefore, idle slots will not arrive at station $i > 1$ earlier than $T_i(t)$, a round trip propagation delay away from $\tau_i(t)$. Hence, as already seen in (4), the access delay at station $i$ is at least this round trip propagation delay. Notice that starting from $t$, all the stations are in the set of "very active" stations. According to the definition of fairness, a round robin cycle starts at $T_i(t)$ at each station $i \in \{1, 2, \ldots, K\}$. This extra delay varies between 0 and $K - 1$, depending on the position of the station in the cycle. Therefore, there must exist a station $i'$ which is at the end of the cycle.
Conclusion
In this paper, we have designed and analyzed a fully utilized and fair (FUFA) dual bus protocol to illustrate the relationship between utilization, fairness and access delay. The basic concept of the protocol is to give equal access to all the stations according to the most updated information available through the reservation bus. In particular, according to the information from downstream and the idle slots propagated previously, each station computes the latest estimate of the number of active downstream stations, and serves them in a round robin fashion. It was shown that FUFA achieves fairness with full utilization, where a fair round robin cycle for distributing idle slots among the set of "very active" stations (with other stations idle) is established within one round trip propagation delay between the most upstream and most downstream active stations. Additionally, the protocol provides a bounded access delay which is linear in the round trip propagation delay, and at most a constant $K - 1$ away from its minimum possible value for any bus protocol that is both fully utilized and fair. To the best of our knowledge, this is the first protocol that provides full utilization, fairness and bounded access delay. The following issues warrant further research. Modifications of the protocol to make it more practical should be investigated.
[10] Pillai, R. R.
Appendix: minimum system queueing delay
Let the queueing delay of a segment be the interval between its entrance into and its departure from the queue. Due to the distributed nature of the system, we use a time reference $T_k(t)$, which is the time when a downstream slot starting from station 1 at time $t$ passes station $k$ as it travels down the bus, for $k = 1, 2, \ldots, K$.
Since the first idle slot arrives at station 1 at time 0, no departure from station $k$ occurs before time $T_k(0)$.
Therefore, without loss of generality, we can assume that no arrival occurs at station $k$ before $T_k(0)$, for $k = 1, 2, \ldots, K$. Define $A_k(T_k(t))$ and $D_k(T_k(t))$ as the number of arrivals at and departures from station $k$ between $T_k(0)$ and $T_k(t)$. Define the system arrival and departure processes $\{A(t) : t > 0\}$ and $\{D(t) : t > 0\}$ as follows:
$$A(t) = \sum_{k=1}^{K} A_k(T_k(t)), \qquad D(t) = \sum_{k=1}^{K} D_k(T_k(t)).$$
The system is said to be "empty" at $t$ if $L(t) = A(t) - D(t) = 0$. That is, the system is "empty" if an idle slot that starts from station 1 at $t$ sees an empty queue at each station as it travels down the bus. We define the $i$th data segment to be the data segment that causes $A(t)$ to be incremented for the $i$th time. That is, if this $i$th increment occurs at time $t$ due to the arrival of a data segment at station $k$, then that data segment actually arrives at station $k$ at $T_k(t)$. Therefore, $W_i$ is the queueing delay of that $i$th data segment in the system. This system can be viewed as a non-FCFS system in the proof of Little's Law (see, for example, section 3.6 of [4]). Letting $t$ be any time when the system is empty, $w(t)$ is defined as the system queueing delay up to time $t$.

Theorem 4: Any protocol with full utilization provides the same system queueing delay $w(t)$ for any $t \ge 0$ when the system is empty, and it is the minimum system queueing delay provided by any protocol. This follows from the definition of full utilization which implies that for all $s \ge 0$,
Proof
with the same initialization D( -1) = 0 for all protocols with full utilization. 
The initialization is DA ( Therefore, D ( s ) = DA(s) in this case.
We have completed the proof. Note that with the assumption of statistical stationarity, the results can be extended to the time average and ensemble average.
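The link between the system queueing delay and the backlog $L(t) = A(t) - D(t)$ can be checked numerically. The sketch below assumes discrete time and assumes that the system queueing delay up to an empty time is the sum of the per-segment delays, which is the accounting used in the usual proof of Little's Law. The function names and the sample arrival and departure times are hypothetical.

```python
def total_delay_by_segments(arrivals, departures):
    """Sum of per-segment queueing delays (departure minus arrival)."""
    return sum(d - a for a, d in zip(arrivals, departures))


def total_delay_by_backlog(arrivals, departures, horizon):
    """Sum over slots t < horizon of the number of segments in the system."""
    total = 0
    for t in range(horizon):
        # A segment is in the system at t if it has arrived but not departed.
        total += sum(1 for a, d in zip(arrivals, departures) if a <= t < d)
    return total


# Service order is not FCFS: the second segment leaves after the third.
arrivals = [0, 0, 1, 4]
departures = [1, 3, 2, 5]
print(total_delay_by_segments(arrivals, departures))    # prints 6
print(total_delay_by_backlog(arrivals, departures, 5))  # prints 6
```

Both totals agree at a time when the system is empty, regardless of the order in which segments are served. This is why the total depends only on the arrival and departure counting processes.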
Introduction
The search for generality, flexibility and standardization has led to bulky implementations of end-systems. Examples are the TCP and the ISO TP4 based transport systems. The implementations of such systems often conform to the OSI layered architecture. However, the slowness of execution of the protocol implementations, which is essentially due to sequential processing of the complex protocol procedures for each data unit in various layers, is becoming a limiting factor in some emerging applications which require high bandwidth. Such a capability cannot be met by existing protocol implementation structures, which typically support transfer rates ranging from 750 kbps to 6 Mbps [2, 3]. This warrants high performance implementations of end-systems that can provide high transfer rates meeting the application needs, limited only by the network speeds.
Part of this work was performed when the author was at Kansas State
The various processing activities on an application-specific data unit (or packet), such as scheduling control, multiplexing and presentation level processing of data, are part of the end-system protocol. The protocol can also include lower level functions such as rate control and error recovery on data. Figure 1 illustrates the placement of these functions in an end-system node with respect to application entities and the backbone transport network attached to this node.
The ways in which various functions influence the overall performance of end-systems are often difficult to analyze in a systematic way and to generalize for broad usage. This difficulty can often obscure many performance engineering aspects that may be inherently possible. For instance, the communication level processing of video picture data can proceed in parallel with the compression/decompression of this data, provided the presentation and communication activities on the video data are carefully separated. This parallelism can be obscured if, for instance, a conventional layered implementation of various protocol
