Abstract
Introduction
There has been an enormous amount of work on contention resolution for Ethernet-like multiple-access channels (see, for example, [3, 4] and the references therein). Most of this work has focused on backoff protocols, including both constant backoff protocols (as in the Aloha network [1]) and increasing backoff protocols. Raghavan and Upfal [8] gave a contention-resolution protocol with $o(n)$ expected packet delay for a constant generation rate. Specifically, their protocol allows a generation rate up to about $1/10$, while achieving expected delay $O(\log n)$. A key idea that allows the low expected delay is that of a "Reset State", which is entered when bad situations are detected. Later, Paterson and Srinivasan [7] achieved constant expected delay for an infinite number of processors, assuming that the processors have synchronized clocks, for generation rates up to about $1/e$. Using their infinite-processor protocol, Paterson and Srinivasan designed a protocol that can be used by $n$ processors whose clocks differ by at most $B$ (for some value $B$) and, with high probability, achieves constant expected packet delay for generation rates up to about $1/e$. This protocol starts with a "Pre-processing Phase" which takes $\Omega(\max\{B, n\})$ time and succeeds with high probability. If the preprocessing phase fails, then the expected packet delay of the protocol can grow without bound. Thus, the overall expected packet delay of the protocol is unbounded.
In this paper we demonstrate the existence of a contention-resolution protocol with constant expected packet delay in the standard multiple-access channel model, i.e., we present the first protocol that achieves constant expected packet delay for generation rates up to a fixed constant $\lambda_0 < 1$, without the added requirement that processors be started with identical clocks. In fact, our protocol allows processors to start and stop repeatedly (with some constraints) without having shared clocks.
Our protocol uses the protocol of Paterson and Srinivasan as a subroutine. The analysis of this subroutine requires processors to have synchronized clocks, and we simulate this (for reasonably long periods of time) using a new technique for Robust Synchronization. In particular, our protocol incorporates repeated, robust "Synchronization Phases". Each "Synchronization Phase" is similar to the "Preprocessing Phase" of Paterson and Srinivasan, except that our synchronization phases are performed repeatedly and are robust in the sense that they do not require processors to have similar clocks, and they can handle any pathological situation (such as processors starting in the middle of a synchronization phase) and eventually return to a normal, synchronized state.
In some sense, the structure of our protocol (normal phases, occasionally interrupted by synchronization phases) is similar to the structure of Raghavan and Upfal's protocol (which has normal phases, occasionally interrupted by reset phases). However, there are major differences between them. One difference is that, because lack of synchronization cannot be reliably detected, synchronizing phases must be entered periodically even when no particular bad event is observed. Another difference is that processors in a reset state are only allowed to send packets with very low probability, and this helps other processors to access the channel. However, our synchronization phase is designed to accomplish the more-difficult task of synchronizing the processors (this is needed to obtain constant expected delay rather than logarithmic expected delay), and accomplishing this task requires many transmissions to the channel, which prevent access to the channel by the other processors. Thus, synchronization phases are costly in our protocol. A third difference is that in Raghavan and Upfal's protocol, a normal phase always tends towards low expected delay, and when bad situations arise, there is a good probability of them being caught, and thus causing a reset state to occur. In our protocol, a normal phase tends towards even lower (constant) expected delay if the processors are synchronized. However, if they are not synchronized, the normal phase does not necessarily tend towards low expected delay, and there is no sure way to detect that the processors are unsynchronized. Thus, the bad situation can only be remedied during the next time the processors start a synchronizing phase, which may be after quite a long time! Fortunately, the effects of this type of behavior can be bounded, so we achieve constant expected packet delay.
Model and Definitions
Following the papers that we mention in Section 1.2, we work in a time-slotted model in which time is partitioned into intervals of equal length, called steps. In our basic model, we have $n$ processors which can start and stop at arbitrary steps, with the constraint that each time a processor starts, it runs for at least a certain polynomial number of steps. Packets are generated at the processors according to a probability distribution. Under the $\{\lambda_i\}_{1 \le i \le n}$-independent arrivals distribution, each processor $i$ is associated with a positive probability $\lambda_i$ and it generates a message independently with probability $\lambda_i$ during each time step that it is running. The total arrival rate, $\lambda = \sum_i \lambda_i$, is at most $\lambda_0$ for a sufficiently small positive constant $\lambda_0$. For technical reasons, we also consider other arrival distributions. We say that an arrival distribution is $\{\lambda_i\}_{1 \le i \le n}$-dominated (for $\lambda_1, \dots, \lambda_n > 0$) if the total arrival rate is at most $\lambda_0$ and the following condition is satisfied: For every processor $i$, every step $t$ in which processor $i$ is running, and every event $E$ concerning the arrival of messages at steps other than $t$ or at processors other than $i$, the probability that processor $i$ generates a message at step $t$, conditioned on event $E$, is at most $\lambda_i$.
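As a rough illustration of the independent arrivals distribution, the sketch below samples one step of packet generation. The function names `check_rates` and `sample_arrivals`, and the use of a list of booleans to mark which processors are running, are assumptions made for this example and do not come from the paper.

```python
import random


def check_rates(rates, lambda_0):
    """Check the model constraints: every lambda_i > 0 and their sum is at most lambda_0."""
    return all(lam > 0 for lam in rates) and sum(rates) <= lambda_0


def sample_arrivals(rates, running, rng):
    """Return the set of processors that generate a packet in one step.

    rates[i] is lambda_i; running[i] says whether processor i is live.
    Each live processor generates a packet independently with probability lambda_i.
    (Hypothetical helper, not part of the paper's protocol.)
    """
    return {
        i
        for i, lam in enumerate(rates)
        if running[i] and rng.random() < lam
    }


if __name__ == "__main__":
    rng = random.Random(0)
    rates = [0.01, 0.01, 0.01, 0.01]
    running = [True, False, True, True]
    print(check_rates(rates, 0.05))
    print(sample_arrivals(rates, running, rng))
```

A dominated distribution only bounds each conditional probability by $\lambda_i$, so the independent sampler above is one particular instance of it rather than the general case.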
The packets which arrive at the processors must be transmitted to the multiple access channel which handles contention as follows: when multiple processors attempt to transmit to the channel at the same time, none succeed. If a single processor attempts to transmit to the channel, it receives an acknowledgment that the transmission was successful. Processors must queue all unsuccessful packets for retransmission, and they use a contention-resolution protocol to decide when to retransmit. During the time that a processor is trying to send one packet, it may generate more packets that it needs to transmit. These packets must also be queued. An important feature of a good contention-resolution protocol is that, even when packets are generated fairly frequently, the sizes of the queues do not grow unboundedly.
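The channel rule above can be sketched in a few lines. This is only an illustrative model: `resolve_slot` and `run_slot` are hypothetical names, and the sketch assumes that every processor listed as a transmitter has at least one queued packet.

```python
from collections import deque


def resolve_slot(transmitters):
    """Return the id of the successful processor, or None on silence or collision."""
    senders = list(transmitters)
    return senders[0] if len(senders) == 1 else None


def run_slot(queues, transmitters):
    """Apply one channel step.

    queues maps a processor id to a deque of its queued packets.
    transmitters is the set of processors attempting to send their head packet.
    Only a lone transmitter is acknowledged; its head packet leaves the queue.
    Everyone else keeps their packets queued for retransmission.
    """
    winner = resolve_slot(transmitters)
    if winner is None:
        return None
    return queues[winner].popleft()


if __name__ == "__main__":
    queues = {0: deque(["a", "b"]), 1: deque(["c"])}
    print(run_slot(queues, {0, 1}))  # collision: nothing is delivered
    print(run_slot(queues, {0}))     # lone sender: head packet is delivered
```

Which processors appear in `transmitters` at each step is exactly what a contention-resolution protocol has to decide.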
We 
Previous Work
There has been a tremendous amount of work on protocols for multiple-access channels. Here we will only discuss the work on dynamic protocols in the acknowledgment-based model that we use. We refer the reader to the papers cited here for discussion of previous work on protocols using different assumptions or models. (with an expected delay of $\omega(n)$) for any $\lambda < 1$ when the number of processors is finite, and that binary exponential backoff is unstable for $\lambda > 0.567$, when the number of processors is finite. Raghavan and Upfal [8] give a protocol which, for $\lambda < 1/10$, is stable for a finite number of processors, $n$, and achieves $O(\log n)$ expected packet delay. Their protocol has the added benefit of being a fairly simple, clean protocol. Paterson and Srinivasan [7] give a clever protocol which, for $\lambda < 1/e$, is stable for an infinite number of processors with synchronized clocks, or stable with high probability for a finite number of processors with almost synchronized clocks, and achieves constant expected packet delay, again with high probability. (With some small probability, the delay of their protocol is unbounded.)
Our Results
We present a protocol which, for a small constant $\lambda$, is stable for any finite number of processors and guarantees constant expected packet delay. In our model, we do not make any assumptions about a previously synchronized or almost synchronized clock. (Note that the $\lambda$ used in our protocol is very small, a few orders of magnitude smaller than the $\lambda$ used in previous results. We do not claim that our protocol is a practical protocol. The significance of our protocol is two-fold: (1) it is the first protocol with guaranteed constant expected delay, and (2) the ideas behind it, mainly the Robust Synchronization technique, could be useful in other practical protocols.)
The structure of our protocol is simple. Most of the time, the processors are running the "infinite processors" protocol of Paterson and Srinivasan. The analysis of that protocol assumes that processors have a synchronized clock. Therefore, in our protocol, the processors occasionally enter a synchronizing phase to make sure that the clocks are synchronized (or to resynchronize after a processor enters the system). Note that the synchronizing phase has some probability of (undetectably) failing, and thus it must be repeated periodically to guarantee constant expected packet delay.
The synchronizing phase of our protocol is somewhat complicated, because it must synchronize the processors even though interprocessor communication can only be performed through acknowledgments (or lack of acknowledgments) from the multiple-access channel. (This is the model used in most previous papers.) The analysis of our protocol is also complicated due to the very dynamic nature of the protocol, with possibilities of processors missing synchronizing phases, trying to start a synchronizing phase while one is already in progress, and so on. Our synchronizing phases are robust, in the sense that they can handle these types of events, and eventually the system will return to a normal synchronized state.
To give an idea of the problems that arise when designing a robust synchronization phase, consider the following scenario. Suppose that the set $L$ of all live processors enter a synchronization phase, and halfway through it, another set $L'$ of processors start up. Since the processors in $L'$ have missed a large part of the synchronization phase, they will not be able to synchronize with the other processors. (This seems to be inherent in any conceivable synchronization protocol.) There are two possible approaches for solving this problem. One is to try to design the protocol so that the processors in $L$ detect the newly started processors during the synchronization phase. Then they must somehow resynchronize with the newly joined processors. However, any synchronization protocol must perform various tasks (such as electing a leader) and it is difficult to detect the presence of the processors in $L'$ during some of these tasks. A second approach is to allow the processors in $L$ to ignore the processors in $L'$ and to finish their synchronization phase (either synchronized amongst themselves, or not). Then the set $L'$ of processors in the synchronization phase will very likely disrupt the normal operations of the processors in $L$, causing them to synchronize again. But now the processors in $L'$ will be about halfway through their synchronization, whereas the processors in $L$ are just starting synchronization! Our solution to this problem is a combination of the two approaches, and is described in the next section with the description of our protocol.
Outline
In Section 2 we describe our new protocol. In Section 3 we prove the key features of our protocol, namely, that a packet generated at a step in which no processors start or stop soon before or after will have constant expected delay, and that a packet generated at a step in which a processor starts soon before or after will have an expected delay of $O(n^{37})$ steps. In Section 4 we show that our protocol achieves constant expected packet delay for a fairly general multiple access channel model, with processors starting and stopping.
The Protocol
Let $S = \{0, \dots, n^{40}-1\} \setminus \{n^2-1, 2n^2-1, 3n^2-1, \dots, n^{40}-1\}$ be a set of steps, namely, the first $n^{40}$ steps, except for the last one of every $n^2$ steps. Then let $T$ be the tree defined as in Paterson and Srinivasan's protocol, except that (1) it is truncated to the first $n^{40} - n^{38}$ leaves, and (2) if a node in their tree $T$ has step $j$ in its trial set, then the corresponding node in our tree $T$ has the $j$th step of $S$ in its trial set. Let $W = 12n^4$.
Now we give an informal description of our protocol. In the normal state a processor maintains a buffer $B$ of size $n^7$ and an infinite queue $Q$. $B$ and $Q$ contain packets to be sent, and when a packet is generated it is put into $B$. For each packet $m \in B$ the processor maintains a variable trial($m$) which contains the next step on which the processor will attempt to send $m$. The step trial($m$) will be chosen using Paterson and Srinivasan's protocol (but modified as above). The steps not in $S$ (i.e., the last step of every $n^2$ steps) are used to send packets from $Q$. At each of these steps, with probability $1/(3n)$, the processor attempts to send the first packet in $Q$. Each processor also maintains a list $L$ which keeps track of the results (either "failure" or "success") of the most recent packet sending attempts from $Q$, up to $n^2$ of them.
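The mapping between steps of the Paterson-Srinivasan tree and steps of the set S can be sketched as follows. The helper names and the choice of 0-based indexing for "the j-th step of S" are assumptions of this example; the paper does not fix an indexing convention in this passage.

```python
def is_queue_step(t, n):
    """True if step t is the last step of its block of n^2 steps, i.e. t is not in S."""
    return t % (n * n) == n * n - 1


def jth_step_of_S(j, n):
    """Return the j-th step of S, counting from j = 0 (an assumed convention).

    Each block of n^2 consecutive steps contributes its first n^2 - 1 steps to S,
    so we locate the block with a division and the offset with the remainder.
    """
    steps_per_block = n * n - 1
    block, offset = divmod(j, steps_per_block)
    return block * n * n + offset


if __name__ == "__main__":
    n = 2
    # With n = 2 the blocks have 4 steps, and steps 3, 7, 11, ... are queue steps.
    print([jth_step_of_S(j, n) for j in range(7)])
```

Since the first $n^{40}$ steps form $n^{38}$ blocks and each block contributes $n^2 - 1$ steps, S has $n^{40} - n^{38}$ elements, which agrees with the number of leaves kept in the truncated tree.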
A processor goes into a synchronizing state if a packet has remained in the buffer for $n^7$ steps or if $L$ is full (contains $n^2$ results) and only contains failures. It also goes into a synchronizing state from time to time even when these events do not occur. (It synchronizes if it has been simulating Paterson and Srinivasan's protocol for at least $n^{40}$ steps, and it synchronizes with probability n-" on any given step.) If the processor does go into a synchronizing state, it transfers all packets from $B$ to the end of $Q$.
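The conditions for entering the synchronizing state can be collected into a single predicate, sketched below. The function name, the argument names, and the representation of the list L as booleans (True for "success") are assumptions of this example. The per-step random trigger probability is passed in as the parameter `p_random` because its value is not legible in this passage.

```python
import random


def should_synchronize(oldest_buffer_age, results, steps_simulated, n, p_random, rng):
    """Decide whether a processor in the normal state starts synchronizing.

    oldest_buffer_age: steps the oldest packet has spent in the buffer B
    results: the list L of recent queue-send outcomes (True means success)
    steps_simulated: steps spent simulating the Paterson-Srinivasan protocol
    p_random: per-step probability of a spontaneous synchronization (assumed parameter)
    """
    buffer_timeout = oldest_buffer_age >= n**7
    list_full_of_failures = len(results) == n * n and not any(results)
    periodic = steps_simulated >= n**40
    spontaneous = rng.random() < p_random
    return buffer_timeout or list_full_of_failures or periodic or spontaneous


if __name__ == "__main__":
    rng = random.Random(0)
    # n = 2: buffer timeout at 2^7 = 128 steps, L is full at 4 results.
    print(should_synchronize(128, [], 0, 2, 0.0, rng))
    print(should_synchronize(0, [False, False, True, False], 0, 2, 0.0, rng))
```

On a positive answer the processor would also move every packet from B to the end of Q, which is not modelled here.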
In the synchronizing state, a processor could be in one of many possible stages, and its actions depend on the stage that it is in. It will always put any generated packets into the queue. Also, it only sends dummy packets in the synchronizing state. (The dummy packets are used for synchronizing. Real packets that arrive during the synchronization phase must wait until the next normal phase to be sent.) The various synchronization stages are as follows (a processor goes through these stages in order).
JAMMING: The processor starting the synchronization jams the channel by sending packets at every step. In this way, it signals other processors to start synchronizing also.
FINDING-LEADER: Each processor sends to the channel with probability $1/n$ on each step. The first processor to succeed is the leader.
ESTABLISHING-LEADER: In this stage, a processor has decided it is the leader, and it jams the channel so no other processor will decide to be the leader.
SETTING-CLOCK: In this stage, a processor has established itself as the leader, and it jams the channel once every $4W$ steps, giving other processors a chance to synchronize with it.
COPYING-CLOCK: In this stage, a processor has decided it is not the leader, and it attempts to copy the leader's clock by polling the channel repeatedly to find the synchronization signal (namely, the jamming of the channel every $4W$ steps by the leader). Specifically, it sends to the channel with probability $1/(3n)$ on each step, and if it succeeds, it knows that the current step (mod $4W$) does not correspond to the leader's clock. After many attempts, it should only be left with one step (mod $4W$) that could correspond to the leader's clock. At the end of this stage, it synchronizes its clock to the leader's clock.
WAITING: This stage is used by a processor after COPYING-CLOCK in order to synchronize with the leader's clock. The processor idles during this stage.
POLLING: A processor in this stage is simply "biding its time" until it switches to a normal stage. While doing so, it attempts to send to the channel occasionally (with probability $1/(3n)$ on each step) in order to detect new processors which might be joining the system and re-starting a synchronization phase. If new processors are detected, the processor re-starts the synchronization phase. Otherwise, it begins the normal phase of the protocol.
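The elimination idea behind the COPYING-CLOCK stage can be illustrated with a small simulation. This is a simplified sketch under stated assumptions, not the paper's procedure: the channel carries only one leader in SETTING-CLOCK and $n-1$ followers in COPYING-CLOCK, the period $4W$ uses a small illustrative `W` rather than $12n^4$, and the function name `copy_clock` is hypothetical.

```python
import random


def copy_clock(n, W, leader_residue, steps, rng):
    """Simulate follower 0 narrowing down the leader's clock modulo 4W.

    The leader jams the channel whenever t mod 4W equals leader_residue.
    Each of the n - 1 followers sends with probability 1/(3n) on every step.
    Follower 0 succeeds only if it is the lone sender and the leader is silent;
    a success proves that the current residue is not the leader's.
    """
    period = 4 * W
    p = 1 / (3 * n)
    candidates = set(range(period))
    for t in range(steps):
        leader_jams = (t % period == leader_residue)
        senders = [i for i in range(n - 1) if rng.random() < p]
        if senders == [0] and not leader_jams:
            candidates.discard(t % period)
    return candidates


if __name__ == "__main__":
    remaining = copy_clock(n=4, W=3, leader_residue=5, steps=20000, rng=random.Random(1))
    print(remaining)
```

The leader's residue can never be eliminated in this model, because every send at that residue collides with the leader's jamming; every other residue survives only as long as follower 0 fails to get a lone transmission through on it.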
The length of each of these stages is very important, both in terms of achieving a high probability of synchronization and a high level of robustness. The high probability of synchronization is achieved by making the "preliminary" stages (i.e., JAMMING, FINDING-LEADER, and ESTABLISHING-LEADER) of length $O(W)$ (this is long enough to guarantee all processors in a normal state will detect a synchronization), and the "synchronizing" stages (i.e., SETTING-CLOCK, COPYING-CLOCK, and WAITING) of length $O(Wn^2)$ (this gives processors enough time to determine the leader's clock modulo $4W$, with high probability). The high level of robustness is achieved by the following properties:
1. having the lengths of the "preliminary" and "synchronizing" stages as above,
2. noticing that only the preliminary stages can cause the channel to be jammed,
3. noticing that the "synchronizing" stages cannot detect a new synchronization occurring,
4. having the POLLING stage be of length $O(Wn^3)$ (longer than all of the other stages combined), and
5. having the POLLING stage be able to detect new synchronizations.
The differing lengths of time for the "preliminary", "synchronizing" and POLLING stages, and the fact that only the POLLING stage could cause another synchronization to occur, guarantee that bad events as described at the end of Section 1.3 cannot occur, even when up to $n$ processors are starting at different times. Whenever a processor joins the multiple-access channel, it starts the protocol with state = SYNCHRONIZING, sync-stage = JAMMING, clock = 0, and $L$ empty. We now give the details of the protocol.
Protocol
At 
The Main Proof
Suppose that $n$ is sufficiently large and that $n$ processors run the protocol. Step 0 will be the step in which the first processor starts the protocol. Processors will start and stop (perhaps repeatedly) at certain predetermined times throughout the protocol. We say that the sequence of times at which processors start and stop is allowed if every processor runs for at least $n^{33}$ steps each time it starts. Just before any step $t$, we will refer to the processors that are running the protocol as live processors. We will say that the state of the system is normal if all of these processors are in state NORMAL. We will say that it is good if
1. it is normal, and
2. for some $C < n^{40} - n^7$, every processor has clock = $C$, and
3. every processor with $|L| \ge n^2/2$ has a success in the last $n^2/2$ elements of $L$, and
4. no packet in any processor's buffer has been in that buffer for more than $n^7/2$ steps.
We say that the state is a starting state if the state is good and every clock = 0. We say that it is synchronizing if
• every processor has state = NORMAL, or has state = SYNCHRONIZING with either sync-stage = JAMMING or sync-stage = POLLING, and
• some processor has state = SYNCHRONIZING with sync-stage = JAMMING and clock = 0.
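The state definitions above translate fairly directly into predicates over the live processors. The sketch below is illustrative only: the `Proc` record and its field names are assumptions of this example, the list L is stored as booleans (True for "success"), and $n$ is assumed to be even and at least 2 so that $n^2/2$ is a positive integer.

```python
from dataclasses import dataclass, field


@dataclass
class Proc:
    """Hypothetical snapshot of one live processor."""
    state: str = "NORMAL"
    sync_stage: str = ""
    clock: int = 0
    L: list = field(default_factory=list)            # recent queue-send outcomes
    buffer_ages: list = field(default_factory=list)  # steps each packet has spent in B


def is_normal(procs):
    return all(p.state == "NORMAL" for p in procs)


def is_good(procs, n):
    """Normal, common clock below n^40 - n^7, a recent success in L, no stale buffer packet."""
    if not is_normal(procs):
        return False
    clocks = {p.clock for p in procs}
    if len(clocks) > 1 or any(c >= n**40 - n**7 for c in clocks):
        return False
    half = n * n // 2
    for p in procs:
        if len(p.L) >= half and not any(p.L[-half:]):
            return False
        if any(2 * age > n**7 for age in p.buffer_ages):
            return False
    return True


def is_starting(procs, n):
    return is_good(procs, n) and all(p.clock == 0 for p in procs)


def is_synchronizing(procs):
    allowed = all(
        p.state == "NORMAL"
        or (p.state == "SYNCHRONIZING" and p.sync_stage in ("JAMMING", "POLLING"))
        for p in procs
    )
    initiator = any(
        p.state == "SYNCHRONIZING" and p.sync_stage == "JAMMING" and p.clock == 0
        for p in procs
    )
    return allowed and initiator
```

For example, two freshly constructed `Proc()` records form a starting state, while replacing one of them by a processor that has just begun JAMMING yields a synchronizing state that is no longer good.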
Definition: The interval starting at any step $t$ is defined to be the period $[t, \dots, t + n^{33} - 1]$.
Definition: An interval is said to be productive for a given processor if at least $n^{29}/2$ packets are sent from the processor's queue during the interval, or the queue is empty at some time during the interval.
Definition: An interval is said to be light for a given processor if at most $n^{17}$ packets are placed in the processor's queue during the interval.
Definition: T = n3'.
Procedure NormalStep simulates a slightly modified version of Paterson and Srinivasan's protocol from [7]. (We refer the reader to [7] for a full description and analysis of Paterson and Srinivasan's protocol.) We call our slightly modified version PS'. PS' is based on the variant of Paterson and Srinivasan's protocol with $k = 16$, $s = 2$ and $r = 8$ which was described in [7]. In PS', $a$ is defined to be k2' (Paterson and Srinivasan use a smaller $a$). Paterson and Srinivasan require $\lambda < 1/sa$. However, in PS', we require $4\lambda < 1/sa$. In our variant, $b$ is chosen such that $b > 2/(4\lambda)$. Given our choice of constants, PS' can only handle smaller values of $\lambda_j$ than those specified in [7]. The main difference between PS' and the protocol discussed in [7] is that after every $n^2 - 1$ steps, PS' waits one step, generating inputs, but not running the Paterson-Srinivasan protocol. (Any inputs that arrive during this step are deemed to have arrived at the beginning of the following step, when PS' continues simulating the Paterson-Srinivasan protocol.) Note that from any starting state until a synchronization, our system simulates PS'. This implies that our system stops simulating PS' when a processor starts up, since that processor will immediately start a synchronization. Then PS' is simulated again once a starting state is reached. Lemmas 1 and 2 describe the behavior of PS' and are based on a theorem of Paterson and Srinivasan.
Lemma 1 Suppose that PS' is run with a $\{\lambda_i\}_{1 \le i \le n}$-dominated arrival distribution for $\tau \le n^{40}$ steps. Then the expected delay of any packet that arrives is $O(1)$. Furthermore, the probability that any packet has delay more than $n^7/2$ is at most $n^{-60}$.
Proof: Omitted. □
Lemma 2 Suppose that PS' is run with a $\{\lambda_i\}_{1 \le i \le n}$-dominated arrival distribution for $\tau \le n^{40}$ steps and that a packet arrives at processor $p$ at step $t' \le \tau$. Then the expected delay of any packet that arrives is $O(1)$. Furthermore, the probability that any packet has delay more than $n^7/2$ is at most $n^{-60}$.
□
Lemma 3 gives the analysis of the normal operation of the synchronization phase of our protocol.
Lemma 3 Suppose that the protocol is run with a se-
Proof Sketch: We show that if the system is in a synchronizing state just before step $t$, then every live processor sets sync-stage to FINDING-LEADER just before some step in the range $\{t, \dots, t + W\}$. Then we show that, with high probability, exactly one processor sets sync-stage = SETTING-CLOCK just before some step $\tau$ in the range $\{t + 2W + 1, \dots, t + 4W\}$ and every other processor sets sync-stage = COPYING-CLOCK just before some step in the range $\{t + W, \dots, t + 2W\}$. Then we show that, with high probability, all processors set sync-stage = POLLING with clock = 0 just before step $\tau + 20Wn^2$. Finally, we show that, with high probability, all processors set state = NORMAL and clock = 0 just before step
Lemma 4 gives the analysis of the robustness of the synchronization phase. It requires the assumption that no processors start or stop. This is because it is used to show that the system returns to a normal state from any situation, even from a bad situation such as a processor just having started in the middle of a synchronization phase. If another processor starts before the system returns to normal, then we would again use this lemma to show that the system will return to normal within a reasonable amount of time after that processor started.
Lemma 4
Proof Sketch: We show that no matter what state the system is in (i.e., possibly normal, possibly in the middle of a synchronization), if some processor starts a synchronization (possibly because it just started) then, within $W/2$ steps, every processor will be in an early part of the synchronization phase (though not necessarily in the JAMMING stage). Then we show that, with high probability, within at most $5nW$ steps, all processors will be beyond the stages where they would jam the channel, and furthermore, there is a low probability of any going back to those stages (i.e., a low probability of any synchronization starting). Next, we show that soon all processors will be in the polling stage. At this point, they will either all proceed into the normal state, or if a synchronization is started, they will all detect it and with high probability proceed into a good state as in Lemma 3.
□
Lemmas 5 and 6 and Theorem 7 show that if the protocol is run with a $\{\lambda_i\}_{1 \le i \le n}$-dominated packet arrivals distribution then the system is usually in a good state (i.e., synchronized and running the PS' protocol), and thus the expected time that packets wait in the buffer is constant.
Proof Sketch: From Lemma 6, with probability $1 - 14n^{-22}$, the state will be good with a clock less than $T$ just before step $t + T$. In this case, we use Lemma 2 to show that the packet will be sent from the buffer in constant expected time. Otherwise, the packet spends at most $n^7$ steps in the buffer (so the
□
Lemmas 8-12 and Theorem 15 show that the probability of a packet entering a queue is low, the probability of a queue being very full is low, and the rate at which the packets are sent from the queue is high enough so that the expected time any given packet spends in the queue is low.
Proof Sketch: In intervals that are both light and productive (call them good), the queue size decreases by at least $n^{29}/4 - n^{17}$. In other intervals (call them bad), the queue size increases by at most $n^{33} + n^7$. From Lemma 9 and Lemma 10, most intervals before step $t$ are likely to be good. The result follows from an analysis of all possible sequences of good and bad intervals before step $t$ following the most recent interval in which the queue was empty. □
Proof Sketch: We use Lemma 11 to show that the expected queue size is small when the packet is born. Furthermore, the system is likely to reach a starting state (in which the packet has either been sent, or is put in the queue) quickly. By Lemma 8, the packet is unlikely to be put in the queue, but if it is put in the queue and the queue is small, then, by Lemma 12, the packet will be sent quickly.
Proof Sketch: We use Lemma 11 to show that, with high probability, the queue is likely to be small just before step $t$. Then we use Lemma 12 to show that the expected time to send the packet is small. □
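To get a feel for the queue-drift argument in the proof sketches above, the following sketch computes how many good intervals are needed to cancel the worst-case growth of a single bad interval, using only the two bounds quoted there (a decrease of at least $n^{29}/4 - n^{17}$ per good interval and an increase of at most $n^{33} + n^7$ per bad interval). The function name is hypothetical, and this arithmetic is an illustration rather than a step of the paper's proof.

```python
from fractions import Fraction
from math import ceil


def good_intervals_to_offset_one_bad(n):
    """Smallest number of good intervals whose guaranteed decrease covers one bad interval."""
    decrease_per_good = Fraction(n**29, 4) - n**17  # lower bound from the proof sketch
    increase_per_bad = n**33 + n**7                 # upper bound from the proof sketch
    return ceil(increase_per_bad / decrease_per_good)


if __name__ == "__main__":
    for n in (10, 100):
        print(n, good_intervals_to_offset_one_bad(n), 4 * n**4)
```

The ratio is slightly above $4n^4$, so the queue stays small on average as long as bad intervals occur with frequency well below roughly one in $4n^4$; exact rational arithmetic is used so that the large powers of $n$ do not lose precision.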
Results
In this section, we need the following definitions. 
