Abstract. The problem of electing a leader in a synchronous ring of n processors is considered. Both positive and negative results are obtained. On the one hand, if processor IDs are chosen from some countable set, then there is an algorithm that uses only O(n) messages in the worst case. On the other hand, any algorithm that is restricted to use only comparisons of IDs requires Ω(n log n) messages in the worst case. Alternatively, if the number of rounds is required to be bounded by some t in the worst case, and IDs are chosen from any set having at least f(n, t) elements, for a certain very fast-growing function f, then any algorithm requires Ω(n log n) messages in the worst case.
Introduction
Communication in a network can be performed in either a synchronous or an asynchronous mode. How does the choice of communication mode affect the computational resources required to solve a problem? We examine this question by considering the problem of electing a leader in a ring-shaped network. In this problem there are n processors, which are identical except that each has its own unique identifier. At various points in time, one or more of the processors independently initiate their participation in an election to decide on a leader. The relevant resources for such a distributed computation are the total number of messages used and the amount of time expended from the time that the first processor wakes up.
The problem of electing a leader efficiently has been studied by a number of researchers [1, 4-6, 8-11, 13]. The best previous deterministic algorithms have used O(n log n) messages for either bidirectional rings [4, 8, 9] or unidirectional rings [6, 13]. These algorithms work for both the synchronous and asynchronous models, and use comparisons of IDs only. In addition, Burns has established a lower bound of Ω(n log n) on the number of messages required if communication is asynchronous [4]. However, the proof in [4] does not extend to the case of synchronous communication. It is, therefore, quite natural to ask whether the Ω(n log n) lower bound can be achieved in the synchronous case as well as in the asynchronous, or whether there are algorithms that somehow make use of the synchrony to limit the number of messages transmitted.
We obtain both positive and negative answers to our question of whether synchrony helps. On the one hand, we show that if processor IDs are chosen from some countable set (such as the integers), then there is an algorithm that uses only O(n) messages in the worst case. The processors may initiate the algorithm at different rounds, and do not know the value of n. Our algorithm is thus an improvement on a probabilistic algorithm of [10] that uses O(n) messages on average and assumes that the processors do know the value n. Unlike the earlier algorithms, our algorithm uses not only comparisons on IDs, but also the numerical value of the IDs to count rounds. However, the number of synchronous rounds used by our algorithm can be very large in the worst case. An algorithm similar to ours has been developed independently by Vitanyi [15]. On the other hand, we show that both the departure from the comparison model and the possibility of using a large number of rounds are necessary in order to obtain an algorithm of linear message complexity. More specifically, if the algorithm is restricted to use only comparisons of IDs, then we obtain an Ω(n log n) lower bound for the number of messages required in the worst case. To achieve this bound we generate an assignment of IDs to processors that exhibits a large amount of "replication symmetry" around the ring. We give a relatively simple assignment of values if n is a power of 2, and a somewhat more involved assignment for general values of n. (More recently, a different assignment of IDs has been given in [2].) Alternatively, if the number of rounds is required to be bounded by some t in the worst case, then there is a (very fast-growing) function f(n, t) with the following property. If IDs are chosen from any set T having at least f(n, t) elements, then any t-bounded algorithm requires Ω(n log n) messages in the worst case.
In particular, if t is a function of n, say t(n), then any t(n)-bounded algorithm for a set T with at least f(n, t(n)) elements exhibits the given lower bound on messages. We achieve this result by giving a transformation from any algorithm in what we call free form, over such a set T, to a comparison-based algorithm. The ideas for this transformation are derived from earlier work of Snir [14]. Both of our lower bound results hold even in the case that the number of processors in the ring is known to each processor, and all the processors are known to start at the same round.
The Algorithm
In this section we present an algorithm for electing a leader in a synchronous ring.
The algorithm uses only O(n) messages but may require a very large number of rounds. The elected processor, and only this processor, eventually enters one of a set of distinguished "elected" states. The total number of messages used, including any messages that might be sent after the winner is elected, is O(n). The algorithm presented is for a unidirectional ring, with communication assumed to be counterclockwise. Of course, essentially the same algorithm will work on a bidirectional ring. We assume that the unique ID of each processor is an integer. This assumption is reasonable if communication is implemented by transmitting packets of bits. In the description of the algorithm, we shall refer to the processor with ID i as processor i. The algorithm is initiated by individual processors deciding independently to wake up. The processors need not wake up at the same time, but no processor is allowed to wake after it has received a message from an awakened processor. When it wakes up, a processor (henceforth called a participating processor) spawns a message process, which moves around the ring, carrying the ID of the originating processor. The message process is charged one message for each edge that it traverses.
Our algorithm uses two ideas. The first is that message processes that originate at different processors are transmitted at different rates: The message process carrying processor ID i travels at the rate of one message transmission every $2^i$ rounds. (Specifically, each processor delays for $2^i - 1$ rounds before transmitting message process i.) Any slower message process that is overtaken by a faster message process is killed. Also, a message process carrying ID i arriving at processor j is killed if j < i and processor j has also spawned a message process. A message process that returns to its originator causes the originator to become elected.
Suppose that all participating processors were to wake up at the same round. The above strategy would then guarantee that the total number of messages is O(n). To see this, consider the following. Let i be the smallest ID of any participating processor. Message process i traverses all edges, for a total cost of n. Consider any other message process, j. During message process i's circuit, either message process i overtakes message process j, or else message process j reaches processor i. In either case message process j is killed by the time i's circuit is completed. Because of the different rates of travel, message process j could travel at most distance $n/2^{j-i}$ during the time that message process i travels distance n. Summing over all message processes, the total number of messages expended would be less than 2n.
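The summation in the preceding paragraph can be written out as a geometric sum. The following is a sketch under the stated assumption that IDs are distinct integers, so each offset m = j - i ≥ 1 occurs at most once; the symbols m and M (the largest offset among participating processors) are introduced here only for illustration.

```latex
n \;+\; \sum_{j > i} \frac{n}{2^{\,j-i}}
  \;\le\; n + n \sum_{m=1}^{M} \frac{1}{2^{m}}
  \;=\; n + n\left(1 - 2^{-M}\right)
  \;<\; 2n .
```

The first term is the full circuit of message process i; the sum bounds the distance covered by every slower message process before it is killed.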
However, this variable rate of transmission scheme is by itself not enough to realize O(n) messages in the case in which not all participating processors wake up at the same time. The processors with smaller IDs could wake up correspondingly later and spawn message processes that would chase and ultimately overtake the slower message processes, but not before Ω(n) messages had been expended by each of Ω(n) message processes.
The second idea is to have a preliminary phase for each message process, before the variable rate phase begins. In this phase, all message processes travel at the same rate, one message transmission per round. When a processor decides it wants to participate, it spawns its message process and sends it off to its neighbor. The message process is transmitted around the ring until it encounters the next participating processor. At this point, the message process continues into the second phase, moving at its variable rate and acting as previously described.
LEMMA 1. There is an algorithm that elects a leader in a synchronous ring of n processors using fewer than 4n messages and $O(n 2^i)$ time, where i is the ID of the eventual winner.
Electing a Leader in a Synchronous Ring
PROOF. We divide the messages into three categories, and bound each category separately. The categories are (1) the first-phase messages, (2) the second-phase messages sent before the eventual winner enters its second phase, and (3) the second-phase messages sent after the eventual winner enters its second phase.
First consider (1). Since exactly one message from the first phase will be transmitted along each edge, the total number of first-phase messages is exactly n. Next, consider (2). Every message process that is activated will enter its second phase within n rounds of the time at which the first of the processors awakens. Thus, at most n rounds need to be considered. Furthermore, message process i sends no second-phase messages during the rounds under consideration. Since the smallest ID that a winner can have is 0, the smallest possible ID for a processor that is not an eventual winner is 1. Thus the maximum number of second-phase messages for message process j in these rounds is $n/2^j$, for j > 0. Summing, the total number of messages sent for all the message processes in these rounds is less than n.
Finally, consider (3). The argument is similar to the one used for the case in which all processors awaken at the same round. That is, message process i makes a circuit for a total cost of n. Any other message process j can send at most $n/2^{j-i}$ phase-two messages during the time i travels distance n. As before, the total number of messages used in (3) is less than 2n. Thus the total number of messages for all categories is less than 4n.
Because of the variable transmission rate, the number of rounds required is $O(n 2^i)$, where i is the ID of the eventual winner. □
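The two-phase scheme can be explored with a small round-based simulation. The following is a minimal sketch, not the authors' implementation: the function name `elect_leader`, the parameters `ids` and `wake`, and the ordering of events inside a round (wake-ups first, then transmissions, then overtaking) are assumptions made for illustration. IDs are assumed to be small non-negative integers, and a processor that receives a message before its intended wake round simply relays.

```python
def elect_leader(ids, wake, max_rounds=10**6):
    """Simulate the two-phase algorithm on a unidirectional ring.

    ids[p]  : integer ID of the processor at ring position p (assumed distinct).
    wake[p] : round at which p would like to wake, or None if it never does.
    Returns (winner_id, messages_sent, rounds_used).
    """
    n = len(ids)
    awake = [False] * n      # p has spawned a message process
    touched = [False] * n    # p has already received some message
    procs = []               # live message processes
    messages = 0
    for rnd in range(1, max_rounds + 1):
        # Wake-ups: only processors that have not yet received a message.
        for p in range(n):
            if wake[p] == rnd and not touched[p]:
                awake[p] = True
                procs.append({"id": ids[p], "origin": p, "pos": p,
                              "phase": 1, "wait": 0})
        survivors, winner = [], None
        for m in procs:
            if m["wait"] > 0:            # still delaying at the current processor
                m["wait"] -= 1
                survivors.append(m)
                continue
            m["pos"] = (m["pos"] + 1) % n  # one message over one edge
            messages += 1
            here = m["pos"]
            touched[here] = True
            if here == m["origin"]:      # returned to its originator: elected
                winner = m["id"]
                continue
            if awake[here]:              # met a participating processor
                m["phase"] = 2
                if ids[here] < m["id"]:  # killed by a smaller participating ID
                    continue
            if m["phase"] == 2:          # delay 2^i - 1 rounds before next hop
                m["wait"] = 2 ** m["id"] - 1
            survivors.append(m)
        if winner is not None:
            return winner, messages, rnd
        # Overtaking: at each processor only the fastest (smallest ID) survives.
        best = {}
        for m in survivors:
            cur = best.get(m["pos"])
            if cur is None or m["id"] < cur["id"]:
                best[m["pos"]] = m
        procs = list(best.values())
    raise RuntimeError("no leader elected within max_rounds")


if __name__ == "__main__":
    # Position 2 intends to wake at round 3 but is reached earlier and only relays.
    print(elect_leader([2, 5, 1, 4, 3, 0], [1, None, 3, None, 2, None]))
```

On small inputs such as the one above, the message count can be compared against the 4n bound of Lemma 1; the sketch is an experiment aid and does not replace the proof.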
The bound of 4n messages for the algorithm above is reasonably tight. Consider the following example, where f(n) = log n - log log n. Let processor 1 be at distance 1 from processor 0, and let processor k, k = 2, ..., f(n), be at distance $k + \lfloor (2^{k-1} - 2)n/(2^{k-1} - 1) \rfloor$ from processor 0. Let processors 1 and 2 awaken at round 1, and each processor k, k = 3, ..., f(n), awaken the round before it would be visited by the first-phase message from processor k - 1. Similarly, let processor 0 awaken the round before it would be visited, which would be round n - f(n). Message process k, k = 1, ..., f(n), will start its second phase at round $2 + \lceil (2^k - 2)n/(2^k - 1) \rceil$. It will be killed when it reaches processor 0, or when it is overtaken near processor 0, and thus will traverse at least $n/(2^k - 1) - f(n)$ links before it is killed. There will be n first-phase messages, at least $\sum_{k=1,\ldots,f(n)} \left( n/(2^k - 1) - f(n) \right)$ second-phase messages for message processes k = 1, ..., f(n), and n - 1 second-phase messages for message process 0. Of the second-phase messages for message process k, note that $[n/(2^k - 1) - f(n) - 2]/2^k$ of them will fall under category (2), and the remainder under (3). For large n, the total is slightly more than 3.6n messages in all.
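The constant 3.6 in this example can be checked numerically. The sketch below keeps only the leading terms (n first-phase messages, about n second-phase messages for message process 0, and the sum of n/(2^k - 1) over k ≥ 1) and drops the lower-order f(n) corrections; the function name `tightness_constant` is hypothetical.

```python
def tightness_constant(terms: int = 60) -> float:
    """Coefficient of n in the example's message count, lower-order terms dropped."""
    first_phase = 1.0            # n first-phase messages
    winner_second_phase = 1.0    # about n second-phase messages for message process 0
    # Message process k contributes roughly n / (2^k - 1) second-phase messages.
    others = sum(1.0 / (2 ** k - 1) for k in range(1, terms + 1))
    return first_phase + winner_second_phase + others


print(round(tightness_constant(), 4))  # 3.6067
```

The series converges quickly, so truncating it at 60 terms does not affect the displayed digits.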
It is possible to achieve a trade-off between the number of messages and the number of rounds by using powers of c for any constant c > 1, rather than powers of 2. As before, there will be exactly n messages in category (1). In category (2) there will be fewer than $\sum_{j=1,2,\ldots} n/c^j = n/(c - 1)$ messages, while in category (3) there will be fewer than $\sum_{j=0,1,\ldots} n/c^j = nc/(c - 1)$ messages. Thus, we obtain an algorithm that elects a leader in a synchronous ring of n processors using fewer than 2cn/(c - 1) messages and at most $O(n c^i)$ rounds, where i is the ID of the eventual winner. It is possible to retain the 2cn/(c - 1) message bound, while reducing the time to $O(n c^i)$, where i is the minimum ID of all processors in the ring. The basic idea is to allow each processor to awaken and begin its algorithm (spawning its message process) as soon as it receives any message from its neighbor, if it has not already awakened on its own. We thus obtain
THEOREM 2. Let c > 1. There is an algorithm that elects a leader in a synchronous ring of n processors using fewer than 2cn/(c - 1) messages and $O(n c^i)$ time, where i is the smallest ID of the processors in the ring.
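The message bound of Theorem 2 follows by adding the three category bounds; the short derivation below only spells out that arithmetic, using the same categories (1), (2), and (3) as the proof of Lemma 1.

```latex
\underbrace{n}_{(1)} \;+\; \underbrace{\frac{n}{c-1}}_{(2)} \;+\; \underbrace{\frac{nc}{c-1}}_{(3)}
  \;=\; \frac{n(c-1) + n + nc}{c-1}
  \;=\; \frac{2cn}{c-1},
\qquad c = 2 \;\Longrightarrow\; \frac{2cn}{c-1} = 4n .
```

Larger values of c push the message bound toward 2n, at the cost of the $c^i$ factor in the number of rounds.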
Note that the algorithm works correctly in the case in which communication is purely asynchronous. It is only its complexity that depends on the synchrony. In the general asynchronous case, the algorithm is essentially the same as that of [5], and so exhibits a worst-case message behavior, which is $O(n^2)$.
Formal Model and Problem Statement
In this section we describe the formal model we use for our lower bounds. The contents of this section are summarized from [7], and the reader is referred to this paper for further details.
3.1 ALGORITHMS. We use the following model for ring algorithms. Each processor is assumed to be identical to every other one, except for its own unique identifier, chosen from an ID space X, a totally ordered set. The processors all begin their identical election algorithms at the same time. Each processor behaves like an automaton as follows. Initially the state of the processor consists of its ID. At each round, the processor examines its state and decides whether to send a message to each of its neighbors, and what message to send. Then each processor receives any messages sent to it in that round. The processor uses its current state and these new messages to update its state. Certain of the states are designated as "elected" states.
It may be assumed, without loss of generality, that a ring algorithm is in a certain normal form. In this normal form, the state of each processor records exactly its initial ID and the history of messages received, and each message that is sent contains the entire state of the sending processor. We represent such history information by means of LISP S-expressions. The S-expressions that arise during computation are of a special type, which we call well formed. A well-formed S-expression over X is either (1) an element of X, or (2) an expression of the form $(s_1, s_2, s_3)$, where $s_2$ is a well-formed S-expression over X, and each of $s_1$ and $s_3$ is either a well-formed S-expression over X or the atom NIL. Let $\mathcal{S}(X)$ denote the set of well-formed S-expressions over X.
We refer to an algorithm in such a form as a free algorithm, and we restrict attention in this paper to algorithms that are free. An initial state of a processor will just be its ID. Each message will contain exactly the state of the sending processor. When a processor in state s receives messages $s_1$ and $s_2$ from its counterclockwise and clockwise neighbors, respectively, its new state will be the S-expression $(s_1, s, s_2)$. If no message is received from a neighbor, the atom NIL is used in place of $s_1$ or $s_2$ as a placeholder. To complete the algorithm specification, we define a function that determines when messages are to be sent in either direction, and a designation of which states indicate that the processor has been elected. Thus, an algorithm over X is a pair $(E, \mu)$, where $E \subseteq \mathcal{S}(X)$ is the set of elected states and $\mu$, a mapping from $\mathcal{S}(X) \times \{\text{clockwise}, \text{counterclockwise}\}$ to $\{\text{yes}, \text{no}\}$, is the message generation function. We assume that the set E of elected states is "closed," so that once a processor has been elected, it will remain elected.
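One synchronous round of a free algorithm can be sketched with nested tuples standing in for S-expressions. This is an illustration under assumptions: IDs are Python integers, NIL is modeled as `None`, processors are indexed clockwise, and the names `round_step`, `is_well_formed`, and `mu` are hypothetical. The sketch also assumes that the state is wrapped every round, even when no message arrives.

```python
NIL = None  # placeholder for "no message received from this neighbor"


def is_well_formed(s):
    """Check the two-case definition of a well-formed S-expression (IDs are ints)."""
    if isinstance(s, int):
        return True
    if isinstance(s, tuple) and len(s) == 3:
        s1, s2, s3 = s
        return (is_well_formed(s2)
                and (s1 is NIL or is_well_formed(s1))
                and (s3 is NIL or is_well_formed(s3)))
    return False


def round_step(config, mu):
    """Apply one round to a configuration (list of states, indexed clockwise).

    mu(state, direction) returns True when a processor in `state` sends its
    whole state in `direction` ("clockwise" or "counterclockwise").
    """
    n = len(config)
    new_config = []
    for p in range(n):
        ccw_neighbor = config[(p - 1) % n]
        cw_neighbor = config[(p + 1) % n]
        # A message sent clockwise by p - 1 arrives from p's counterclockwise side.
        s1 = ccw_neighbor if mu(ccw_neighbor, "clockwise") else NIL
        s2 = cw_neighbor if mu(cw_neighbor, "counterclockwise") else NIL
        new_config.append((s1, config[p], s2))
    return new_config


# Example: every processor sends clockwise only.
print(round_step([4, 1, 9], lambda state, direction: direction == "clockwise"))
```

Because each message carries the sender's entire state, the new state records the full history, which is what makes the later indistinguishability arguments possible.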
3.2 EXECUTIONS. To facilitate discussion, we index the processors in the ring clockwise, as 0, ..., n - 1. (For convenience we are switching from the naming convention which we used in Section 2. There, by "processor i" we meant "the processor with ID i," whereas for the rest of the paper we mean "the processor with index i.") We count indices modulo n. A ring of size n over ID space X is an n-tuple of elements of X, giving the initial IDs of the processors 0, ..., n - 1 in order. A configuration of size n is an n-tuple of S-expressions in $\mathcal{S}(X)$, representing the states for the n processors. A message vector of size n is an n-tuple of ordered pairs of elements of $\mathcal{S}(X) \cup \{\text{null}\}$. It represents the messages sent counterclockwise and clockwise by each of the n processors.
An execution of an algorithm for ring R of size n is an infinite sequence of triples $(C_1, M, C_2)$, where $C_1$ and $C_2$ are configurations and M is a message vector, all of size n. We require executions to satisfy several properties. First, the initial configuration must be R. Second, the second configuration in each triple must be the same as the first configuration in the next triple. Finally, each triple in an execution must describe correct message generation, as given by $\mu$, and correct state changes, as described earlier. An execution fragment is any finite prefix of an execution.
We now define our complexity measures. We measure the number of messages sent and the number of rounds taken only up to the point where a processor becomes elected. (This convention only serves to strengthen our lower bound.) For any execution e, let finishtime(e) denote the number of the first round after which a processor has entered a state in E. Let messages(e) denote the number of messages sent during e, up to and including round finishtime(e).
3.3 ELECTION OF A LEADER. Let X be an ID space with |X| ≥ n. A ring algorithm over X is said to elect a leader in rings of size n provided that in each execution e of the algorithm, for a ring R of size n over X, exactly one processor eventually enters a state in E.
3.4 COMPARISON ALGORITHMS. We next define algorithms whose only operation with respect to processor IDs is to compare them. We say that two S-expressions, s and s', over X are order-equivalent provided that they are structurally equivalent as S-expressions, and, if two atoms from s satisfy one of the order relations <, =, or >, then the corresponding atoms from s' satisfy the same relation. An algorithm is a comparison algorithm provided that, if s and s' are order-equivalent well-formed S-expressions over X, then processors with states s and s' transmit messages in the same direction or directions and have the same election status. That is, $\mu(s, \text{clockwise}) = \mu(s', \text{clockwise})$, $\mu(s, \text{counterclockwise}) = \mu(s', \text{counterclockwise})$, and s is in E exactly if s' is in E.
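The order-equivalence test on S-expressions can be sketched directly from the definition. The code below is illustrative only: it assumes S-expressions are nested 3-tuples with `None` for NIL and comparable atoms, and the helper names `atoms_and_shape` and `order_equivalent` are hypothetical.

```python
def atoms_and_shape(s):
    """Return the atoms of s in left-to-right order plus a value describing its structure."""
    if s is None:
        return [], "NIL"
    if not isinstance(s, tuple):
        return [s], "atom"
    atoms, shapes = [], []
    for part in s:
        part_atoms, part_shape = atoms_and_shape(part)
        atoms.extend(part_atoms)
        shapes.append(part_shape)
    return atoms, tuple(shapes)


def order_equivalent(s, t):
    """True if s and t have the same structure and corresponding atoms compare alike."""
    a, shape_a = atoms_and_shape(s)
    b, shape_b = atoms_and_shape(t)
    if shape_a != shape_b:
        return False

    def sign(x, y):
        return (x > y) - (x < y)  # -1, 0, or 1 for <, =, >

    return all(sign(a[i], a[j]) == sign(b[i], b[j])
               for i in range(len(a)) for j in range(i + 1, len(a)))


print(order_equivalent((1, 5, None), (2, 9, None)))   # True
print(order_equivalent((5, 1, None), (2, 9, None)))   # False
```

A comparison algorithm is then one whose message generation function and elected set give the same answer on any two states for which `order_equivalent` returns true.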
Chains
In this section we describe the general theory needed for our lower bound proof for comparison algorithms. We introduce the concept of a chain, which describes information flow during an execution of a ring algorithm. The notion of a chain used in this paper is a substantial generalization of the notion of a chain used for a similar purpose in [7]. For comparison algorithms, we show that nonexistence of certain chains implies that certain processors in a ring remain indistinguishable.
4.1 BASIC DEFINITIONS. A k-segment of a ring is a length k sequence of consecutive processors in the ring, in clockwise order. Let S and T be two k-segments in a ring, with first processors p and q, respectively, and last processors p' and q', respectively, and let e be an execution (or execution fragment) of an algorithm in the ring. Then a clockwise chain in e for (S, T) is a length k subsequence of the steps of e, $e_{i_1}, e_{i_2}, \ldots, e_{i_k}$, such that the following is true. In each step $e_{i_j}$, a message is sent either by processor p + j - 2 to processor p + j - 1 or by processor q + j - 2 to processor q + j - 1. Thus, a clockwise chain for a pair of segments describes combined information flow clockwise in the two segments from outside the two segments up to the last processors p' and q'. A counterclockwise chain in e for (S, T) is defined analogously for information flow counterclockwise: in each step $e_{i_j}$, a message is sent either by processor p' - j + 2 to processor p' - j + 1 or by processor q' - j + 2 to processor q' - j + 1.
Two length k vectors of X-elements are said to be order-equivalent provided that the elements in corresponding positions satisfy the same ordering relations in the two vectors. That is, if the two vectors are a and b, then $a_i$ and $a_j$ satisfy the same relation, <, =, or >, as $b_i$ and $b_j$. Two segments S and T are said to be order-equivalent in a particular ring R provided that the sequences of initial IDs of the processors in the two segments are order-equivalent.
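For vectors, order-equivalence can be decided by comparing a canonical form instead of checking every pair. The sketch below replaces each element by its rank among the distinct values of the vector; two vectors are order-equivalent exactly when these rank patterns coincide. The names `rank_pattern`, `order_equivalent_vectors`, and `segment` are hypothetical, and the ring is assumed to be a Python list of IDs indexed clockwise.

```python
def rank_pattern(vector):
    """Replace each element by its rank among the distinct values of the vector."""
    ranks = {value: r for r, value in enumerate(sorted(set(vector)))}
    return tuple(ranks[value] for value in vector)


def order_equivalent_vectors(a, b):
    """True if a and b have equal length and the same pairwise order relations."""
    return len(a) == len(b) and rank_pattern(a) == rank_pattern(b)


def segment(ring, start, k):
    """IDs of the k-segment that starts at index `start`, read clockwise."""
    n = len(ring)
    return [ring[(start + d) % n] for d in range(k)]


ring = [10, 30, 20, 11, 31, 21]
print(order_equivalent_vectors(segment(ring, 0, 3), segment(ring, 3, 3)))  # True
```

The rank pattern is a design convenience: it turns the quadratic pairwise check into a sort, and it gives a hashable key for grouping order-equivalent segments.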
Let e be an execution fragment. Then maxcw(e) is defined to be the maximum k for which there are order-equivalent length k segments S and T (possibly with S = T), such that e contains a clockwise chain for (S, T). The quantity maxccw(e) is defined analogously. Let sum(e) = maxcw(e) + maxccw(e).
4.2 LIMITATIONS ON CHAINS. From the definitions of maxccw, maxcw, and sum, it follows that a length 0 execution e has maxcw(e) = maxccw(e) = sum(e) = 0. We show that chains cannot grow unreasonably quickly. The length of a longest chain can grow by at most 1 in any time step, and only if a message is sent in the appropriate direction.
LEMMA 3. Let e and e' be execution fragments for a ring R, such that e' consists of all but the last step of e. Then (a) maxcw(e) ≤ maxcw(e') + 1, with maxcw(e) = maxcw(e') if no messages are sent clockwise at the last step of e, and (b) maxccw(e) ≤ maxccw(e') + 1, with maxccw(e) = maxccw(e') if no messages are sent counterclockwise at the last step of e.
PROOF. We argue part (a). Part (b) is analogous. The second half of the claim is obvious. We argue the inequality maxcw(e) ≤ maxcw(e') + 1. We may assume that maxcw(e) ≥ 1, since otherwise the result is obvious.
Let S and T be order-equivalent segments of length maxcw(e) for which there is a clockwise chain in e. Let S' and T' be the segments of length maxcw(e) - 1 consisting of all but the last processor in S and T, respectively. Then S' and T' are order-equivalent. Moreover, since only the last message in the chain could have been sent at the last step of e, it must be that e' contains a clockwise chain for (S', T'). Thus, maxcw(e') ≥ maxcw(e) - 1, as required. □
4.3 BISEGMENTS. We next introduce notation that allows us to describe at the same time a counterclockwise chain and a clockwise chain leading to the same processor. If $k_1$ and $k_2$ are positive integers, a $(k_1, k_2)$-bisegment is defined to be a pair of segments, the first of size $k_1$ and the second of size $k_2$, which overlap in a single processor. (The last processor of the first segment is the first of the second segment.) The processor that appears in both segments is called the center of the bisegment. The spanning segment of a bisegment is the segment obtained by concatenating the two segments in the bisegment and removing the duplicated center. Two bisegments are said to be order-equivalent in a particular ring provided that their spanning segments are order-equivalent. Two processors p and q are $(k_1, k_2)$-equivalent in a particular ring provided that their $(k_1, k_2)$-bisegments (i.e., the $(k_1, k_2)$-bisegments centered at p and q) are order-equivalent.
Let $S = (S_1, S_2)$ and $T = (T_1, T_2)$ be two $(k_1, k_2)$-bisegments, and let e be an execution or execution fragment. Then a clockwise chain in e for (S, T) is a clockwise chain in e for $(S_1, T_1)$, and a counterclockwise chain for (S, T) is a counterclockwise chain for $(S_2, T_2)$. A chain in e for (S, T) is either a clockwise chain or a counterclockwise chain for (S, T).

4.4 INDISTINGUISHABILITY. In this subsection we show that, for comparison algorithms, the absence of long enough chains implies that certain processors must remain "indistinguishable." The absence of these chains then also implies that a correspondingly large number of messages will be sent in the next round.
Our notion of "indistinguishability" is defined as follows. If S and T are two ID sequences, each of length k, and s and t are two S-expressions, then s is congruent to t with respect to (S, T) provided that s and t are structurally equivalent, and corresponding positions in s and t contain elements from corresponding positions of S and T, respectively. If S and T are two segments of a particular ring, then s and t are congruent with respect to (S, T) provided that s and t are congruent with respect to the corresponding sequences of IDs. Similarly, if S and T are two bisegments of a ring, we say that s and t are congruent with respect to S and T provided that they are congruent with respect to their spanning segments.
LEMMA 4. Let e be an execution fragment of a comparison algorithm for ring R. Let $k_1$ and $k_2$ be positive integers. Let p and q be any pair of $(k_1, k_2)$-equivalent processors in R, and let S and T be their respective $(k_1, k_2)$-bisegments. If there are no chains in e for (S, T), then at the end of e, the states of p and q are congruent with respect to (S, T).
PROOF. The proof is by induction on the length of e.
Base. |e| = 0. Neither p nor q has received any messages in e, so they will remain in states that are congruent with respect to (S, T).
Inductive Step. |e| > 0. Assume as the induction hypothesis that the result holds for any execution fragment of length shorter than |e| and any values of $k_1$ and $k_2$. Let e' denote e, except for its last step. Then by inductive hypothesis, p and q remain in states that are congruent with respect to (S, T) up to the end of e'. Consider what happens at the last step. Let p' and q' be the respective counterclockwise neighbors of p and q, and p" and q" the respective clockwise neighbors.
Case 1. Both of the following hold: (a) Either p' and q' are in states that are congruent with respect to (S, T) just after e', or else neither p' nor q' sends a message clockwise at the last step of e. (b) Either p" and q" are in states that are congruent with respect to (S, T) just after e', or else neither p" nor q" sends a message counterclockwise at the last step of e.
In this case, it is easy to see that p and q remain in states that are congruent with respect to (S, T), after e. For if p' and q' are in states that are congruent with respect to (S, T) just after e', then, since the algorithm is a comparison algorithm, they both make the same decision about whether or not to send a message clockwise at the last step of e. If they both send a message, then the messages they send are just their respective states, which are congruent with respect to (S, T). A similar argument applies to p" and q". It follows that p and q remain in states that are congruent with respect to (S, T) after the last step of e.
Case 2. Processors p' and q' are in states that are not congruent with respect to (S, T) just after e', and at least one of them sends a message clockwise at the last step of e.
If $k_1 = 1$ (i.e., if p and q are at the counterclockwise ends of their respective bisegments), then a clockwise chain for (S, T) is produced by the message sent at the last step, a contradiction. So assume that $k_1 > 1$. Since p and q are $(k_1, k_2)$-equivalent, it follows that p' and q' are $(k_1 - 1, k_2 + 1)$-equivalent. Let S' and T' denote their respective $(k_1 - 1, k_2 + 1)$-bisegments. S' and T' contain exactly the same processors as S and T, respectively, but are centered at p' and q' rather than at p and q. Since the states of p' and q' just after e' are not congruent with respect to (S, T), they are also not congruent with respect to (S', T'). By the inductive hypothesis, there must be a chain in e' for (S', T'). If there is a counterclockwise chain in e' for (S', T'), then it is also a counterclockwise chain for (S, T), so there is a counterclockwise chain in e for (S, T). On the other hand, if there is a clockwise chain in e' for (S', T'), then since at least one of p' and q' sends a message clockwise at the last step of e, we obtain a clockwise chain in e for (S, T). Either case is a contradiction.
Case 3. Processors p" and q" are in states that are not congruent with respect to (S, T) just after e', and at least one of them sends a message counterclockwise at the last step of e.
The argument is analogous to the one for Case 2. □
Thus, we have shown that absence of certain chains implies that certain processors must remain in congruent states. This lemma is actually stronger than we need for this paper, but this extra strength will probably be of use in handling other problems. In our subsequent analysis we use as an upper bound on maxcw(e) simply the number of distinct rounds in which messages are sent clockwise, and similarly for maxccw(e). Thus, instead of the existence of a chain for (S, T), we could have substituted the condition that either there are $k_1$ rounds in which messages are sent clockwise or there are $k_2$ rounds in which messages are sent counterclockwise. Reorganized in this way, our proof would be substantially the same as it is now (in fact, marginally simpler), but the revised lemma would give less information about the communication that must occur for congruence to be broken.
Two corollaries, which will be used in our lower bound proofs, follow from this lemma. The first one says that, when chains are short and there are lots of equivalent processors, any message that gets sent has many corresponding messages sent at the same time by other processors.
COROLLARY 5. Let k be a positive integer. Assume ring R is such that every k-segment has at least i order-equivalent k-segments. Let e be any execution fragment of a comparison algorithm in R, e' be another fragment consisting of all but the last step of e, and assume that sum(e') < k. If some processor p sends a message clockwise (or counterclockwise) at the last step of e, then there are at least i processors that do the same.
PROOF. Consider the case in which p sends a message clockwise. The other case is analogous. Let $k_1$ = maxcw(e') + 1 and $k_2$ = maxccw(e') + 1. The $(k_1, k_2)$-bisegment for p has at most k elements, so that p has at least i $(k_1, k_2)$-equivalent processors. Let q be any one of these processors, and let S and T be the $(k_1, k_2)$-bisegments centered at p and q, respectively. Then there cannot be a chain in e' for (S, T), by the definitions of maxcw and maxccw. But then Lemma 4 implies that p and q remain congruent with respect to (S, T) at the end of e'; since the algorithm is a comparison algorithm, q also sends a message clockwise at the last step of e. □
Lemma 4 also has the following consequence for comparison algorithms to elect a leader. This corollary says that long chains must be generated in order to elect a leader, if certain equivalent processors exist.
COROLLARY 6. Let k be a positive integer. Let R be a ring in which every k-segment S has another order-equivalent k-segment T. Let e be any execution fragment of a comparison algorithm that elects a leader in R, such that a leader gets elected in e. Then sum(e) ≥ k.
PROOF. Assume the opposite, that sum(e) = maxcw(e) + maxccw(e) < k. Let $k_1$ = maxcw(e) + 1 and $k_2$ = maxccw(e) + 1. The $(k_1, k_2)$-bisegment for the processor p that gets elected leader has at most k elements, so that p has a $(k_1, k_2)$-equivalent processor q ≠ p; let S and T be the $(k_1, k_2)$-bisegments centered at p and q, respectively. Then there cannot be a chain in e for (S, T), by the definition of maxcw and maxccw. But then Lemma 4 implies that p and q remain congruent with respect to (S, T); since the algorithm is a comparison algorithm, p and q cannot be distinguished as to leadership. This is a contradiction. □
Lower Bound for Comparison Algorithms When n Is a Power of 2
In this section we restrict attention to algorithms that use comparisons only, and to rings in which the number of processors is a power of 2. We present a lower bound of (n/2)(log n + 1) for the number of messages required for a comparison algorithm to elect a leader in this case. We handle the case of powers of 2 first because the assignment of IDs to processors that realizes the lower bound is simpler than for general values of n, and also because the constant of proportionality in the lower bound is larger than we have been able to achieve for general n.
5.1 REPLICATION SYMMETRY. We first generate a labeling of the processors in a ring that has a large amount of replication symmetry. Let $\langle n \rangle$ denote $\{0, \ldots, n - 1\}$. We assume that n is a power of 2, and let X* be the ID space consisting of the set $\langle n \rangle$, with the usual ordering.
For $j \in \langle n \rangle$, let reverse(j) denote the integer whose binary representation is the reverse of the binary representation of j. We assign processor IDs so that processor j has ID reverse(j), for $j \in \langle n \rangle$. We call this pattern of IDs $Q_n$. We note that if a segment of $Q_n$ is of length at most $2^i$, then all ordering information about the IDs of processors in the segment is determined solely by the i high-order bits.
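The bit-reversal labeling can be explored with a short Python sketch. It is only an illustration, not part of the paper: the helper names (`reverse_bits`, `build_q`, `order_pattern`, `count_order_equivalent`) are hypothetical, the binary representation is assumed to be padded to exactly log n bits, and two segments are taken to be order-equivalent when they have the same length and the same relative order of IDs.

```python
def reverse_bits(j, width):
    """Return the integer whose width-bit binary representation is that of j, reversed."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (j & 1)
        j >>= 1
    return result


def build_q(n):
    """IDs of processors 0..n-1 under the bit-reversal pattern (n a power of 2)."""
    width = n.bit_length() - 1
    assert n == 1 << width, "n must be a power of 2"
    return [reverse_bits(j, width) for j in range(n)]


def order_pattern(ids):
    """Rank of each ID inside the segment; equal patterns mean the same relative order."""
    ranks = {value: rank for rank, value in enumerate(sorted(ids))}
    return tuple(ranks[value] for value in ids)


def count_order_equivalent(q, start, length):
    """Count ring segments (including the given one) with the same order pattern."""
    n = len(q)

    def segment(s):
        return [q[(s + t) % n] for t in range(length)]

    target = order_pattern(segment(start))
    return sum(order_pattern(segment(s)) == target for s in range(n))
```

For a small ring such as n = 16, `count_order_equivalent(build_q(16), start, length)` can be compared against $n/2^i$ for every segment of length at most $2^i$.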
LEMMA 7. Let S be any segment of $Q_n$ of length at most $2^i$, where $i < \log n$. Then there are at least $n/2^i$ segments of $Q_n$ that are order-equivalent to S, including S itself.
PROOF. For each $i < \log n$, the processor IDs repeatedly cycle through the $2^i$ possible arrangements of i high-order bits. Thus in a segment of length at most $2^i$, each ID differs from any other in its i high-order bits. Any segment of the same length as S whose first processor is at a distance that is an integral multiple of $2^i$ from the first processor in S is order-equivalent to S. There are $n/2^i$ such segments, including S itself. □
5.2 LOWER BOUND. We can now prove the lower bound for comparison algorithms when n is a power of 2. We make use of the following observation about comparison algorithms. Suppose X and X' are arbitrary ID spaces, and n is any integer. If $\mathcal{A}$ is a comparison algorithm over X that elects a leader in a ring of size n and uses at most s messages, then there exists a comparison algorithm $\mathcal{A}'$ over X' that elects a leader in a ring of size n and uses at most s messages. Thus a lower bound result over ID space $X^*$ translates directly into a lower bound result for any arbitrary ID space X.
THEOREM 8. Assume n is a power of 2. Let $\mathcal{A}$ be a comparison algorithm over an arbitrary ID space X, which elects a leader in a synchronous ring of size n. Then there is an execution e of $\mathcal{A}$ for which $\mathrm{messages}(e) \ge (n/2)(\log n + 1)$.
PROOF. It suffices to consider $X = X^*$. Let e be the execution fragment on $Q_n$ that terminates just when the elected processor enters an "elected" state. By Lemma 7, every segment of length n/2 has at least one other order-equivalent segment in $Q_n$. Thus by Corollary 6, execution e must progress from having a sum of 0 to having a sum of at least n/2.
Consider any step of e at which the sum first stops being at most k, for any $k < 2^i$. By Lemma 3, the sum increases by at most 2 at this step. Moreover, if no messages are sent clockwise (respectively, counterclockwise) at this step, then the sum increases by at most 1.
Let e' be the prefix of e preceding this step. Then $\mathrm{sum}(e') < 2^i$. Lemma 7 implies that any segment of length $2^i$ has at least $n/2^i$ order-equivalent segments in $Q_n$. Thus by Corollary 5, if any messages are sent clockwise at this step, then at least $n/2^i$ messages are sent clockwise, and similarly for messages sent counterclockwise. Thus, if the sum increases by 1 at this step, at least $n/2^i$ messages are sent, whereas if the sum increases by 2 at this step, then at least twice that number of messages are sent. It follows that the cost of increasing the sum from 0 to at least n/2 can be apportioned as a cost of at least $n/2^i$ for each increase from k to k + 1, where $k < 2^i$.
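We now total up the number of messages sent in e. Grouping increases by powers of 2, we see that the number of messages sent must be at least
$$n + \sum_{i=1}^{\log(n/2)} \frac{n}{2^i}\,\bigl(2^i - 2^{i-1}\bigr) = n + \sum_{i=1}^{\log(n/2)} \frac{n}{2} = n + \frac{n}{2}(\log n - 1) = \frac{n}{2}(\log n + 1),$$
which is the bound claimed in Theorem 8.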
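The apportioning argument can be checked numerically. The following Python sketch is illustrative only: the function name `apportioned_message_bound` is hypothetical, and the sketch assumes the reading of the argument in which the increase from 0 to 1 costs at least n messages and each increase from k to k + 1 with $2^{i-1} \le k < 2^i$ costs at least $n/2^i$.

```python
def apportioned_message_bound(n):
    """Total the per-increase costs for raising the sum from 0 to n/2 (n a power of 2)."""
    log_n = n.bit_length() - 1
    assert n >= 2 and n == 1 << log_n, "n must be a power of 2, at least 2"
    total = n  # increase from 0 to 1: k = 0 < 2^0, cost n / 2^0
    for i in range(1, log_n):  # i = 1 .. log(n/2)
        increases = 2 ** i - 2 ** (i - 1)  # number of k with 2^(i-1) <= k < 2^i
        total += (n // 2 ** i) * increases
    return total
```

Under these assumptions the total agrees with $(n/2)(\log n + 1)$ for every power of 2 tried, which is the value stated in Theorem 8.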
Lower Bound for Comparison Algorithms for General n
In the last section we generated an assignment of IDs to processors in the case in which n was a power of 2. The assignment possessed a large amount of replication symmetry, which allowed us to achieve the $\Omega(n \log n)$ lower bound. It does not appear possible to take our pattern $Q_n$ and then try to extend it in some way to accommodate extra processors. Such a strategy would introduce special treatment for the extra processors, which might change the behavior of the algorithm entirely, perhaps allowing some processor to become elected easily. Instead, we generate a pattern $P_n$ for any general value of n, such that a ring assigned IDs from $P_n$ possesses a large amount of replication symmetry. We then show that this replication symmetry causes the ring to require a large number of messages for election of a leader.
6.1 HIERARCHICAL ORGANIZATION OF PROCESSORS. Fix a particular ring size $n \ge 1$. We generate a pattern $P_n$ of IDs, the elements of which are then assigned to processors 0 through n - 1, respectively. To achieve considerable replication symmetry, the construction of $P_n$ uses a hierarchical grouping of processors. The idea is that on any level of the hierarchy, two groups of processors should receive order-equivalent sequences of IDs. To have the construction work for general n, one type of group is not enough, so that at every level there will be two types of groups. We describe the grouping using a derivation tree of a context-free grammar. Later, we use the structure of the derivation tree to assign IDs to the n leaves of the tree and thereby produce pattern $P_n$.
Define the context-free grammar G as follows. The nonterminals, representing groups of processors, are $A_i$ and $B_i$, $1 \le i \le d$, plus $B_0$. There is just one terminal symbol p, representing a processor. The start symbol is $B_0$. The productions are
$B_i \to B_{i+1} A_{i+1} A_{i+1} B_{i+1} B_{i+1} A_{i+1} A_{i+1} B_{i+1} B_{i+1}$, for $0 \le i \le d - 1$,
$A_i \to A_{i+1} B_{i+1} B_{i+1} A_{i+1} A_{i+1} B_{i+1} B_{i+1} A_{i+1} A_{i+1}$, for $1 \le i \le d - 1$,
$B_d \to p^{b_d}$, and $A_d \to p^{a_d}$.
The depth d of the hierarchy is defined as $d = \lfloor (\log_9 n)/2 \rfloor$. Note that in the last two productions, $B_d$ generates a string consisting of $b_d$ p symbols, and analogously for $A_d$. The quantities $a_d$ and $b_d$ will be defined later, in such a way as to guarantee that the length of the unique sentence generated by G is n.
For each i, $0 \le i \le d$, define the level i sentential form of G to be the unique string over $\{A_i, B_i\}$ derivable in G. There are exactly $9^i$ nonterminal symbols in the level i sentential form. Moreover, for each i, the number of symbols $A_i$ is exactly one less than the number of symbols $B_i$. All $A_i$ nodes derive a terminal string of the same length; we call this length $a_i$. Similarly, all $B_i$ nodes derive a terminal string of the same length, which we call $b_i$. Let $c_i = \min(a_i, b_i)$, for all i, $1 \le i \le d$.
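The counting claims about the sentential forms can be checked mechanically. The Python sketch below is an illustration under the assumption that the two nine-symbol productions are read as B, A, A, B, B, A, A, B, B and A, B, B, A, A, B, B, A, A; the names `B_RULE`, `A_RULE`, and `sentential_form` are hypothetical, and level subscripts are dropped because every symbol in a given form sits at the same level.

```python
# Right-hand sides of the two recursive productions, with level subscripts omitted.
B_RULE = "BAABBAABB"  # B_i -> B A A B B A A B B (all symbols at level i + 1)
A_RULE = "ABBAABBAA"  # A_i -> A B B A A B B A A (all symbols at level i + 1)


def sentential_form(level):
    """Return the level-`level` sentential form as a string over {'A', 'B'}."""
    form = "B"  # the start symbol B_0
    for _ in range(level):
        form = "".join(B_RULE if symbol == "B" else A_RULE for symbol in form)
    return form
```

Each rewriting step preserves the difference between the number of B symbols and the number of A symbols, which is why the count of A symbols stays exactly one below the count of B symbols.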
We next describe how to select the values $a_d$ and $b_d$. They are chosen in such a way that the total length of the unique sentence derived in G is exactly n, and so that $|b_d - a_d|$ is small. We use the following: Let $m = \lfloor 9^d/2 \rfloor$. It is easy to see that m is $\Theta(n^{1/2})$ and, in particular, that $m \le n^{1/2}/2$. Using Lemma 10, choose $a_d$ and $b_d$ to be integers such that $n = a_d m + b_d (m + 1)$ and $|b_d - a_d| \le m$. We must show that $a_d$ and $b_d$ are nonnegative: if either of $a_d$ and $b_d$ is negative, then $\max(a_d, b_d) \le m - 1$, so $n = a_d m + b_d (m + 1) \le 2m^2 \le n/2$, a contradiction.
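One way to find suitable integers $a_d$ and $b_d$ is sketched below in Python. This is not necessarily the construction behind Lemma 10, whose statement is not reproduced here; it is a hypothetical approach that writes $s = a_d + b_d$, so that $n = s m + b_d$ and $b_d - a_d = 2n - s(2m + 1)$, and then takes s to be the integer nearest to $2n/(2m + 1)$. The function names are assumptions introduced for illustration.

```python
def hierarchy_depth(n):
    """d = floor(log_9(n) / 2), computed exactly as the largest d with 81**d <= n."""
    d = 0
    while 81 ** (d + 1) <= n:
        d += 1
    return d


def choose_leaf_sizes(n):
    """Return (d, m, a_d, b_d) with n = a_d*m + b_d*(m+1) and |b_d - a_d| <= m."""
    d = hierarchy_depth(n)
    m = 9 ** d // 2
    # s = a_d + b_d; rounding 2n / (2m + 1) to the nearest integer keeps
    # the remainder 2n - s*(2m + 1), which equals b_d - a_d, within [-m, m].
    s = (2 * n + m) // (2 * m + 1)
    b_d = n - s * m
    a_d = s - b_d
    return d, m, a_d, b_d
```

For example, n = 81 gives d = 1 and m = 4, and the sketch returns $a_d = b_d = 9$, so that $9 \cdot 4 + 9 \cdot 5 = 81$.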
LEMMA 11. The length of the unique sentence generated by G is n.
From the choice of $a_d$ and $b_d$, we have $a_d + b_d = (n - b_d)/m$. It follows that $a_{i+1} + b_{i+1} = 9^{d-(i+1)}(n - b_d)/m$. Substituting into the expression for $c_i - c_{i+1}$ gives the desired result. □
6.2 LABELING OF PROCESSORS. Let X be the ID space consisting of all strings of length d + 1 whose elements are nonnegative integers, with the strings ordered lexicographically. X is the ID space from which the pattern $P_n$ will be constructed.
We define $P_n$ by describing an assignment of IDs to n processors, corresponding to the leaves of the derivation tree of G. In order to do this, we associate labels with the nodes of the derivation tree. The label of the root of the tree is the null string. If a node with a corresponding nonterminal $A_i$ or $B_i$, $0 \le i \le d - 1$, is labeled by the string w, then the labels of its nine children are, respectively, w0, w1, w2, w3, w8, w7, w6, w5, w4. If a node with a corresponding nonterminal $A_d$ is labeled by the string w, then the labels of its $a_d$ children are, respectively, w0, w1, ..., w($a_d$ - 1). If a node with a corresponding nonterminal $B_d$ is labeled by the string w, then the labels of its $b_d$ children are, respectively, w0, w1, ..., w($b_d$ - 1).
Processor IDs are generated by interpreting the labels of the leaves as elements of X, that is, as length d + 1 strings of nonnegative integers, ordered lexicographically.
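The labeling can be made concrete with a small Python sketch that walks the derivation tree and collects the leaf labels in ring order. It is an illustration only: `build_pattern` is a hypothetical name, the productions are assumed to read B, A, A, B, B, A, A, B, B and A, B, B, A, A, B, B, A, A, and IDs are represented as Python tuples, which compare lexicographically.

```python
def build_pattern(d, a_d, b_d):
    """Return the leaf labels of the derivation tree, in ring order, as tuples of length d + 1."""
    child_digits = [0, 1, 2, 3, 8, 7, 6, 5, 4]  # digits appended for the nine children
    rules = {"B": "BAABBAABB", "A": "ABBAABBAA"}  # assumed reading of the productions
    ids = []

    def expand(symbol, level, label):
        if level == d:
            # A_d or B_d: children are terminals labeled w0, w1, ..., w(width - 1)
            width = a_d if symbol == "A" else b_d
            ids.extend(label + (t,) for t in range(width))
            return
        for child, digit in zip(rules[symbol], child_digits):
            expand(child, level + 1, label + (digit,))

    expand("B", 0, ())  # the root is B_0 with the null string as its label
    return ids
```

With d = 1 and $a_d = b_d = 9$, the sketch yields 81 distinct IDs, beginning (0, 0), (0, 1), and so on, with the second group starting at (1, 0).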
In the level i sentential form of G, define an ordered pair of nonterminal symbols to be "of type A > A" provided that it consists of the two symbols $A_i A_i$, and the label of the node of the first nonterminal is lexicographically greater than that of the second. We use analogous definitions for types A < A, A > B, A < B, B > A, B < A, B > B, and B < B. We now show that the level i sentential form has equal numbers of consecutive pairs of nonterminals of the eight possible types.
PROOF. It suffices to show that the numbers of occurrences of the eight types of pairs are equal, since the total number of pairs is exactly $9^i - 1 = 8\lfloor 9^i/8 \rfloor$. We proceed by induction on i. For the basis, i = 0, the result is vacuously true. Assume that the result is true for i, and consider the level i + 1 sentential form. There are two kinds of pairs of level i + 1 nonterminals: those in which both elements are derived from the same level i nonterminal node and those in which the two elements are derived from two different level i nonterminal nodes. Each level i nonterminal node generates a length 9 sequence of level i + 1 nonterminals in which each of the eight types of pairs has exactly one occurrence. Therefore, there are equal numbers of the eight possible types among the pairs that are derived from
