Abstract. We initiate a study of bounded clock synchronization under a more severe fault model than that proposed by Lamport and Melliar-Smith [1985] . Realistic aspects of the problem of synchronizing clocks in the presence of faults are considered. One aspect is that clock synchronization is an on-going task, thus the assumption that some of the processors never fail is too optimistic. To cope with this reality, we suggest self-stabilizing protocols that stabilize in any (long enough) period in which less than a third of the processors are faulty. Another aspect is that the clock value of each processor is bounded. A single transient fault may cause the clock to reach the upper bound. Therefore, we suggest a bounded clock that wraps around when appropriate.
Introduction
In a distributed system, it is often necessary to keep the logical clocks of the processors synchronized. In such a system, physical clocks may drift and messages could have varying delivery times. Moreover, processors may be faulty, and in many cases the type of failures is not predictable in advance. To handle this situation, the worst type of failures must be considered, namely Byzantine faults [Lamport et al. 1982] . In the presence of Byzantine faults a processor can exhibit arbitrary "malicious", "two faced", behavior.
The problem of keeping clocks synchronized in the presence of Byzantine faults has been extensively studied (e.g., Halpern et al. [1984], Lamport and Melliar-Smith [1985], Mahaney and Schneider [1985], Srikanth and Toueg [1987], Welch and Lynch [1988], and Ramanathan et al. [1990]). Lamport and Melliar-Smith [1985] were the first to present the problem and show that 3f + 1 processors are sufficient to tolerate f Byzantine faults. The necessity of 3f + 1 processors for tolerating f faults was proved later. A weaker fault model called authenticated Byzantine allows a protocol that can tolerate any number of faulty processors [Halpern et al. 1984]. In that failure model, reintegration of repaired processors is only possible if less than half the processors are faulty.
Many of the protocols for this problem assume that the clocks are initially synchronized and thus focus on keeping them synchronized in the presence of clock drift. The problem of how to ensure that the clocks are initially synchronized was addressed in, for example, Srikanth and Toueg [1987] and Welch and Lynch [1988] . In these protocols, some mechanism is assumed that allows all the nonfaulty processors to begin the protocol within a bounded time period of each other. The mechanism essentially is that the processes are assumed to wake up in a distinguished initial state, in which they can uniquely perform initializing actions, including communication with each other.
In this work, we weaken the assumptions made for the design of clock synchronization protocols in the presence of Byzantine faults. Our goal is protocols that cope with a more severe (and realistic) fault model than the traditional Byzantine fault model [Lamport et al. 1982] . The designers of extremely robust flight devices identified the Byzantine fault model as modeling very bad behavior that should be guarded against [Wensley et al. 1978] ; continuing this line of reasoning, we propose that critical tasks should be able to tolerate additional system misbehavior, namely to be self-stabilizing. In such a device the traditional assumptions could be violated: Is it reasonable to assume that during any period of the execution less than one third of the processors are faulty? What happens if for a short period more than a third are faulty (perhaps experiencing a weaker fault than a Byzantine fault)? What happens if messages sent by nonfaulty processors are lost in one instant of time?
In this article, we present self-stabilizing protocols that can overcome these problems. Such temporary violations of the assumptions can be viewed as leaving the system in an arbitrary initial state from which the protocol resumes. Self-stabilizing protocols work correctly when started in any initial system state. Thus, even if the system loses its consistency due to an unexpected temporary violation of the assumptions made (e.g., more than one-third faulty, unexpected message loss) the system synchronizes the clocks when subsequently the assumptions hold (e.g., less than a third experience Byzantine faults).
Originally, Dijkstra [1974] defined a protocol to be self-stabilizing if, when started in an arbitrary system state, the system converges to a consistent global state that realizes the task. Self-stabilizing protocols are resilient to transient faults, that is, faults that cause the state of a processor to change arbitrarily, after which the processor resumes operation from the new state according to its program. A permanent fault is a fault that causes a processor to permanently misbehave. A protocol tolerates hybrid faults if it is resilient to both transient and permanent faults (e.g., Dolev and Welch [1993] and Gopal and Perry [1993], which consider napping and omission faults, respectively). We are interested in clock synchronization protocols that can tolerate hybrid faults: they should work from an arbitrary initial configuration and they should tolerate less than a third of the processors exhibiting permanent Byzantine faults. Recently, Daliot et al. [2003] used our settings to achieve fast pulse synchronization, rather than clock value synchronization. The approach in Daliot et al. [2003] is biologically inspired.
Note that a self-stabilizing system is at least as good as a non-self-stabilizing system, since it can be started in a predefined state that ensures "controlled" execution. Moreover, in case an execution experiences an unexpected temporary violation of the designer's assumptions, the system is not ruined but will eventually converge to good behavior again.
A realistic assumption for a clock synchronization protocol is that a 64-bit clock is "unbounded" for most possible applications. However, in the context of self-stabilizing protocols, transient faults could cause the system to reach the upper bound of the clock at once. Thus, another aspect of the problem should be considered: the fact that the clocks are bounded. Surprisingly, some of the techniques used in the past for synchronizing clocks in the presence of Byzantine faults cannot be used for devising self-stabilizing protocols with bounded clocks (and memory). For instance, the authentication scheme used for identifying the signature of processors should be re-examined, since a faulty processor could "collect" all the possible authenticated messages signed by nonfaulty processors and send them at the most unfortunate time.
In this article, we present two randomized self-stabilizing clock synchronization protocols that work in the presence of Byzantine faults. Both protocols work for bounded clocks. The first assumes the existence of a common pulse while the second does not make this assumption. The expected stabilization time of both protocols is exponential in n. This is a drawback when the number of processors is large. However, in addition to being of theoretical interest, we believe that our protocols could be of practical interest, at least when the number of backup processors is small.
One of the contributions of this paper is an interesting usage of the Chinese remainder theorem for implementing a distributed counter. This counter is used to accelerate the first protocol.
The remainder of the article is organized as follows. In the next section we formalize the assumptions and requirements for the protocol. Section 3 presents a clock synchronization protocol under the assumption of a common pulse. In Section 4 we present a protocol that does not assume the existence of common pulses. Conclusions are in Section 5.
Definitions
A distributed system consists of a set of processors that communicate by sending messages to each other. Messages have a bounded delay. Each processor has a bounded physical clock that is constantly incremented, wrapping around when appropriate; the physical clocks at the different processors run at approximately the same rates. Each processor also has a bounded logical clock, which is computed as a function of the current state and physical clock value. The goal is for the logical clocks of the nonfaulty processors to become and subsequently remain close to each other, while continuing to progress at a reasonable rate (wrapping around when appropriate). We consider two types of timing behavior of the system, synchronous and semisynchronous. In both models, processors take steps either when they receive a message, or when their physical clocks reach some predetermined value. In addition, in the synchronous model, there is a common pulse that periodically occurs simultaneously at all processors, causing them to take a step. We now proceed more formally.
Each processor P i , 1 ≤ i ≤ n, is modeled as a state machine. Associated with the processor is its physical clock, which takes on integral values from 0 to M pc − 1 for some M pc . The state contains a distinguished timer variable that can take on the values 0 to M pc − 1 and nil; it indicates that the processor wants to take a step the next time its physical clock has the given value. A transition takes the current state of the processor, the current value of its physical clock, and a message received (if any) and produces a new state of the processor and a set of messages to be sent. The message system holds all messages sent but not yet received. A configuration of the system is a set of processor states, one per processor, a set of physical clock values, one per processor, and a state for the message system.
An execution is an alternating sequence of configurations and events C 0 , e 1 , C 1 , . . . . In a semisynchronous execution, events happen at real times, taking one configuration to the next. We require that the times of events in the execution be nondecreasing and, if the execution is infinite, to be unbounded. There are two types of events. One type is a tick of some processor's physical clock, causing it to increase by 1 mod M pc . Nothing else changes. We require that the real time elapsed between two successive ticks of the same processor's clock be between 1 − ρ and 1 + ρ for some fixed ρ.
The other type of event is a step of some processor. No processor can take more than one step at the same real time. In the step, the processor may or may not receive a message. The real time elapsed between the sending and receiving of any message must be in the range [d − ε, d + ε] for some fixed d and ε. There is a fixed set of faulty processors of size f, where n > 3f. Roughly speaking, a nonfaulty processor is a processor that follows its program according to its current state. More precisely, for each step, if the processor taking the step is nonfaulty, then the succeeding configuration must correctly reflect the processor's transition function acting on the message received and the state and physical clock in the preceding configuration. Thus, the only changes are to the processor's state and the message system (removing the message received and adding the messages sent). If the processor taking the step is faulty, it can change state arbitrarily and add arbitrary messages (from itself) to the message system.
In a synchronous execution, in addition to the above constraints, there exists a value π > 0 such that, for all j ≥ 0, every processor P i receives a special Pulse message (from a dummy processor) at time j · π . That is, all the processors take a step at each pulse and the pulses occur regularly with period π.
To define the self-stabilization property of a protocol, we first define a set of legal executions, LE, each of which achieves the task of the protocol. For example, LE may consist of all the executions in which the clocks of the processors are sufficiently close to each other. A configuration C is safe relative to a protocol and LE iff every execution of the protocol that starts in C belongs to LE. A protocol is self-stabilizing relative to LE if, starting in an arbitrary configuration, every execution reaches in a finite number of steps a safe configuration relative to LE.
We require that for every processor P i there exists a function clock i that, given a state of P i and a value for P i 's physical clock, returns an integral value in the range 0 to M lc − 1 for some fixed M lc . This is the logical clock of P i . Given a particular execution C 0 , e 1 , . . . , we denote by clock i (t) the value of the function clock i applied to P i 's state and physical clock value in C j , where j is the configuration in the execution whose real time of occurrence is the largest not exceeding t.
We rely on the following two important definitions concerning the logical clocks. The clock agreement condition holds for a configuration and essentially says that the logical clocks are close together. The definition is slightly involved because the clocks are bounded and can wrap around.
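Given $\gamma < M_{lc}/4$, a configuration satisfies clock agreement (with parameter $\gamma$) if, for all nonfaulty processors $P_i$ and $P_j$, either $(clock_i - clock_j) \bmod M_{lc} \le \gamma$ or $(clock_j - clock_i) \bmod M_{lc} \le \gamma$. In this case, we say that the clocks are within $\gamma$ of each other or are $\gamma$ apart.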
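The wrap-around aspect of clock agreement can be illustrated with a small check. The following is a minimal sketch, assuming that "within γ" is measured as modular distance in either direction (that is, (clock_i − clock_j) mod M_lc ≤ γ or the reverse); the function names `within_gamma` and `clock_agreement` are hypothetical and do not come from the paper.

```python
def within_gamma(ci: int, cj: int, gamma: int, m_lc: int) -> bool:
    """True if two bounded clocks are within gamma of each other modulo m_lc.

    Assumption: distance is measured around the wrap-around point in
    either direction.
    """
    return (ci - cj) % m_lc <= gamma or (cj - ci) % m_lc <= gamma


def clock_agreement(clocks, gamma: int, m_lc: int) -> bool:
    """Check agreement over the logical clocks of the nonfaulty processors."""
    if not gamma < m_lc / 4:
        raise ValueError("the definition requires gamma < M_lc / 4")
    return all(within_gamma(a, b, gamma, m_lc) for a in clocks for b in clocks)
```

With M_lc = 16 and γ = 2, the clock values 15, 0, and 1 satisfy the check even though 15 and 1 look far apart as plain integers, while 0 and 8 do not.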
The next property, clock validity, is a property of an execution and basically says that the amount of logical clock time that elapses (counting the wrap-arounds that occur) during a real-time interval is a linear function of .
Given an execution that starts at real time t s , the execution satisfies clock validity if there exists , 0 < ≤ M lc /4, and there exists a ≥ 0 such that for all real times t > t s and all nonfaulty processors
We define the set LE of legal executions for the clock synchronization task to be those executions that satisfy clock agreement (for some γ ) in every configuration and clock validity (for some and a).
Thus, a protocol that is self-stabilizing for LE can also be viewed as ensuring that, starting from an arbitrary configuration, there exists a real time t s (when a safe configuration is reached, so that the clocks of nonfaulty processors are close to each other) after which clock agreement and clock validity hold, that is, after t s , the clocks of nonfaulty processors stay close to each other and increase linearly with real time.
Synchronous Protocol
We first describe a protocol for the synchronous system, in which nonfaulty processors have access to a periodic common pulse. Each pulse triggers the processors to synchronize their clocks. The time between two successive pulses is an important parameter to the problem. In case two successive pulses are farther apart than the time required to run a Byzantine agreement protocol, then the following scheme solves the problem: Every pulse starts a new version of the Byzantine agreement protocol to agree on the common clock value. However, when the pulses are closer together than that, this scheme cannot work. We assume that the pulses are on the order of the message delay apart.

Recall that π is the time between two successive pulses. Nonfaulty processors send messages and update their logical clocks only when a pulse occurs. We assume that π is long enough such that when a pulse takes place, no message sent by a nonfaulty processor in the previous pulse is present in the system. Whenever a nonfaulty processor P is triggered by a pulse, P sends a message with its clock value to all its neighbors. Then P waits to receive all the clock values of the other processors. P waits for a period (1 + ρ)(d + ε), which is longer than the bound on the message delay and accounts for clock drift. If during that period P receives more than one message from some neighbor, say Q, then P uses the latest value that arrives from Q. Thus, at the end of such a period P has a set of at least n − f logical clock values, at most one value for each nonfaulty processor including P. P uses the set of the logical clocks received in order to choose its own clock value.
The formal description of the protocol appears in Figure 1 . We now describe the protocol informally. The protocol for a processor P works as follows: (1) If the value of P's clock appears less than n − f times in the set of the received logical clock values, then P assigns 0 to its clock. Otherwise, (2) in case that the value of P's clock appears at least n − f times, we further distinguish between the case (2.1) in which P's clock value is not equal to 0 and the case (2.2) in which it is equal to 0. In case (2.1), P increments its clock by 1 (modulo the number of clock values M lc ). Case (2.2) is further subdivided into two cases: (2.2.1) in which (according to the state of P) in the previous pulse P incremented its clock by 1 (and the result was 0) and the case (2.2.2), otherwise. In case (2.2.1), P increments its clock by 1 (to be 1). In case (2.2.2) P tosses a coin and assigns the result (0 or 1) to its clock.
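The case analysis above can be sketched in code. This is an illustrative reading of the informal description only, not a transcription of Figure 1. The names `State`, `pulse_step`, and `coin` are hypothetical, and the handling of the flag after a coin toss (true exactly when the toss yields 1) is an assumption based on the convention used in the correctness proof.

```python
import random
from dataclasses import dataclass


@dataclass
class State:
    clock: int
    last_increment: bool  # did the previous pulse increment the clock?


def pulse_step(state, received, n, f, m_lc, coin=lambda: random.randint(0, 1)):
    """Compute one processor's new state at a pulse.

    `received` holds the collected clock values, including the processor's own.
    """
    matches = sum(1 for v in received if v == state.clock)
    if matches < n - f:                       # case (1): too little support
        return State(0, False)
    if state.clock != 0:                      # case (2.1): increment mod M_lc
        return State((state.clock + 1) % m_lc, True)
    if state.last_increment:                  # case (2.2.1): 0 came from a wrap
        return State(1, True)
    result = coin()                           # case (2.2.2): toss a coin
    return State(result, result == 1)         # assumption: flag set iff toss is 1
```

Passing `coin` as a parameter makes the randomized branch easy to fix in experiments, for example `coin=lambda: 1` to force the outcome in which the clock becomes 1.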
The protocol guarantees with probability 1 that the system eventually reaches a global state in which all the nonfaulty processors have the clock value 1. Once such a global state is reached the clocks are synchronized: In every pulse, every nonfaulty processor P receives messages from at least n − f processors, including itself, containing a clock value that is identical to its own clock value. Moreover, a pulse in which all the nonfaulty processors set their clocks to 0 always follows a pulse in which every nonfaulty processor increments its clock value by 1 to set it to 0. Thus, case (2.2.2) is not applied.
The main idea of the protocol is to ensure that only when there are "enough" nonfaulty processors with the same clock value will this value be incremented. It is proved in the sequel that in any pulse at most one clock value of nonfaulty processors is incremented by 1 while the rest of the values are changed to be zero. This ensures that after the first pulse, the set of clock values of the nonfaulty processors contains at most two elements. Moreover, if two such elements indeed exist one of them is 0.
At first glance, this seems to be sufficient and no coin toss is needed; the value that is incremented will eventually wrap around to 0 and at that time the clocks of all the nonfaulty processors will be 0. However, we now describe an infinite execution, E, that does not use coin tosses in which the clocks never become synchronized. Consider a system with four processors, P 1 , P 2 , P 3 and P 4 , in which P 4 exhibits Byzantine behavior. Let 0, 0, 1 be the clock values of P 1 , P 2 , P 3 , respectively, in the first configuration of E. In the first pulse, P 4 sends clock value 1 to P 1 and P 3 and clock value 0 to P 2 . Thus, P 1 receives the clock values vector 0, 0, 1, 1, P 2 receives 0, 0, 1, 0 and P 3 receives 0, 0, 1, 1. P 2 is the only processor that finds n − f = 3 processors with the same clock value (namely, the clock value 0) and increments its clock value by one (to be 1). At the same time, P 1 and P 3 find two clock values with value 1 and two with value 0 and assign 0 to their clocks. Hence, a configuration with clock values 0, 1, 0 for P 1 , P 2 , P 3 , respectively, is obtained. P 4 continues by sending the clock values 1, 1, 0 to P 1 , P 2 , P 3 , respectively. P 1 receives the clock values vector 0, 1, 0, 1, P 2 receives 0, 1, 0, 1 and P 3 receives 0, 1, 0, 0. Similarly, P 3 is the only processor that finds n − f = 3 processors with the same clock value and assigns 1 to its clock while P 1 and P 2 assign 0. We reach a configuration with clock values 0, 0, 1 for P 1 , P 2 , P 3 which are identical to the clock values in the first configuration. Therefore, an infinite execution in which nonfaulty processors never agree on their clock values is possible.
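The cyclic execution E can be replayed mechanically. The sketch below assumes the coin-free rule used in the example (increment when the processor's own value appears at least n − f times among the received values, otherwise reset to 0); the function names and the choice M_lc = 4 are illustrative assumptions.

```python
def deterministic_step(own, received, n, f, m_lc):
    """Coin-free variant: increment on enough support, otherwise reset to 0."""
    if sum(1 for v in received if v == own) >= n - f:
        return (own + 1) % m_lc
    return 0


def run_cycle(m_lc=4):
    """Replay two pulses of execution E with n = 4 and f = 1."""
    n, f = 4, 1
    clocks = [0, 0, 1]                    # P1, P2, P3; P4 is Byzantine
    # Values that P4 sends to P1, P2, P3 in the first and second pulse.
    byzantine = [(1, 0, 1), (1, 1, 0)]
    history = [tuple(clocks)]
    for lies in byzantine:
        # Each processor sees the three nonfaulty values plus P4's value for it.
        clocks = [deterministic_step(clocks[i], clocks + [lies[i]], n, f, m_lc)
                  for i in range(3)]
        history.append(tuple(clocks))
    return history
```

Calling `run_cycle()` yields the configurations (0, 0, 1), (0, 1, 0), (0, 0, 1), so the starting configuration recurs after two pulses and the adversary can repeat the pattern indefinitely.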
To overcome the above problem, we use coin tosses. In a pulse in which a nonfaulty processor with clock value 0 receives n − f clock values with value 0, the processor tosses a coin and decides whether to assign 0 or 1 to its clock. This leads to a possible scenario (that has some positive probability of occurring) in which the coin toss results cause all the nonfaulty processors to simultaneously assign 1 to their clocks.
3.1. CORRECTNESS PROOF OF THE SYNCHRONOUS PROTOCOL. Throughout the proof, we say that a processor P i increments its clock by 1 in a certain pulse if P i assigns last_increment := true during this pulse. Otherwise, we say that P i assigns 0 to clock i .

LEMMA 3.1. If nonfaulty processors P i and P j increment their clocks by 1 during some pulse P, then immediately after P, clock i = clock j .
PROOF. Assume towards contradiction that clock i = (x + 1) mod M lc ≠ clock j = (y + 1) mod M lc following P. Hence, during P, P i finds at least n − f clock values that are equal to x. At least n − 2f of them belong to nonfaulty processors. Thus, P j also receives n − 2f clock values that are equal to x. Hence, P j receives at most n − (n − 2f) = 2f clock values that are equal to y. Since n > 3f, it holds that n − f > 2f, which contradicts the possibility of P j receiving at least n − f clock values that are equal to y.

PROOF. The variable last_increment is assigned during every pulse. Thus, since P follows the first pulse, P indeed increments during Q, the pulse before P. Thus by Lemma 3.1 all the nonfaulty processors have clock values 0 after Q and before P.
The next theorem uses the scheduler-luck game of Dolev et al. [1991, 1995] to analyze the randomized protocol. The scheduler-luck game has two competitors, scheduler (adversary) and luck. The goal of the scheduler is to prevent the protocol from reaching a safe configuration, while the goal of luck is to help the protocol reach a safe configuration. The scheduler chooses the order in which processors take steps in a way that will prevent the protocol from reaching a safe configuration, while luck chooses whether to intervene, whenever a processor invokes a random function, by fixing the random function result. The rough idea is to prove that for every configuration there is a scenario, which is a set of executions that start in the configuration, such that the expected number of fixed random function results and the expected length of the scenario until a safe configuration is reached are both bounded. Thus, in case the results of the random function do not reflect the strategy of luck, namely the desired scenario does not occur, then an arbitrary configuration is reached from which there is another chance for convergence. Note that in the case of a synchronous protocol, the scheduler is restricted to operate all the processors simultaneously.
For the synchronous protocol a configuration is safe if for all nonfaulty processors, the logical clocks are equal and last_increment is true. For our system the scheduler chooses the message delays and clock drifts during the execution (within the predefined limitations). Each time the processor, activated by the scheduler, tosses a coin, luck may intervene and determine the result of the coin toss. It is proved in Dolev et al. [1991, 1995] that if, starting with any possible configuration c, luck has a strategy to win the scheduler-luck game within i interventions and expected time t, then the system reaches a safe configuration within expected time t · 2^i. The main observation used for this proof is the fact that if a coin toss result differs from the desired result (according to luck's strategy) a configuration is reached from which a new game can begin.

THEOREM 3.5. In expected (M lc + 2) · 2^{2(n−f)} pulses, the system reaches a configuration in which the value of every nonfaulty processor's clock is 1.
PROOF. The proof is by the use of Lemma 1 of Dolev et al. [1991] (Theorem 5 of Dolev et al. [1995] ). We present a strategy for luck to win the scheduler-luck game with 2(n − f ) interventions and within (M lc + 2)π time. The strategy of luck is (1) wait for the first pulse to elapse. Thereafter, (2) luck waits until a pulse P occurs in which a nonfaulty processor with clock value 0 receives n − f clock values that are 0. This occurs within the next M lc pulses (if it does not occur by then, there is at least one nonfaulty processor that does not assign 0 to its clock during M lc successive pulses, which is impossible). In case (2.1) during this pulse all the nonfaulty processors are either tossing a coin or assigning 1 without tossing. Then, luck intervenes at most n − f times and fixes the coin toss results of all the nonfaulty processors to be 1. Otherwise, (2.2) if there is a nonfaulty processor P that is neither about to toss a coin nor about to assign 1 without tossing, then luck intervenes and fixes all the coin toss results (less than n − f ) to be 0. Note that before P, P's clock is not equal to 0. Thus, by Claim 3.4, no processor assigns 1 without tossing a coin. By Lemma 3.1 and the fact that some nonfaulty processor tosses a coin during P, it holds that following P the clock values of all the nonfaulty processors are 0. Therefore, in the next pulse, case (2.1) is reached and luck could intervene and fix at most n − f coin toss results to ensure that the desired global state is reached.
By Theorem 3.5, the system reaches a configuration in which the value of every nonfaulty processor's clock is 1, in expected (M lc + 2) · 2^{2(n−f)} pulses. It is easy to see that in any successive pulse, all the nonfaulty processors have the same clock value. Thus, the clock agreement requirement holds with γ = 0. Since the clocks of the nonfaulty processors are incremented by 1 in every pulse and the pulses are constant time apart, the clock validity requirement also holds.
Note that the clock value could be multiplied by π (if π is known), the time difference between two successive pulses, in order to yield a clock value that reflects real time. Otherwise, the value of a of the clock validity requirement will be a function of π that relates the clock's rate to the rate of real time.
3.2. ACCELERATING THE PROTOCOL. If M lc = 2^64, our protocol converges after expected 2^64 · 2^{2(n−f)} synchronization pulses. Certainly, because of this time complexity this protocol cannot be used in practice. However, if M lc, n, and f are all small,² then the expected number of pulses required is reasonably small. For instance, if M lc = 2, n = 4, and f = 1, then the expected number of pulses is 128. We use the above observation to accelerate our protocol. We achieve synchronization of clock values in the range of M lc = 2^64 values within an expected number of pulses that is less than 381 · 2^{2(n−f)}. (For M lc = 2^16, synchronization occurs within an expected number of pulses that is less than 58 · 2^{2(n−f)}.) We define the Chinese remainder counter by the use of the Chinese remainder theorem, which appears in Knuth [1981, p. 270].

² It is reasonable to think of n and f as being small when a single processor can efficiently compute a task and additional processors are added only to ensure reliability. Let the reliability be f/n, the ratio of the number of faulty processors to the total number of processors. To reach a reliability of 0.25, the number of processors needed (and thus, in general terms, the blowup in the hardware and cost) is four. To improve the reliability to 2/7 ≈ 0.28, the blowup would be 7. Asymptotically, we need an infinite blowup to reach reliability of 1/3. Thus, most devices would use a relatively small number of processors for which our protocol stabilizes in a relatively short time.

We use the theorem for the case a = 0 and m ≥ M lc. Let 2, 3, 5, . . . , p j be the series of prime numbers up to the jth prime such that 2 · 3 · 5 · · · p j−1 < M lc ≤ 2 · 3 · 5 · · · p j. We run j parallel versions of our protocol. The ith version runs the protocol with M lc = p i. Each message carries the value of j clocks, one clock value for each version.
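The choice of primes can be reproduced with a few lines of code. The sketch below selects the shortest prefix of the primes whose product reaches M_lc and sums it, which gives the coefficient that multiplies 2^{2(n−f)} in the accelerated bound. The function name `first_primes_covering` is a hypothetical helper, not part of the protocol.

```python
def first_primes_covering(m_lc):
    """Return the shortest prefix 2, 3, 5, ..., p_j of the primes whose product is >= m_lc."""
    primes, product, candidate = [], 1, 2
    while product < m_lc:
        # Trial division by the primes found so far is enough at this scale.
        if all(candidate % p for p in primes):
            primes.append(candidate)
            product *= candidate
        candidate += 1
    return primes


coefficient_64 = sum(first_primes_covering(2**64))  # primes 2 through 53
coefficient_16 = sum(first_primes_covering(2**16))  # primes 2 through 17
```

For M_lc = 2^64 the prefix ends at 53 and sums to 381; for M_lc = 2^16 it ends at 17 and sums to 58, matching the coefficients quoted in the text.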
The computation of the new clock value of some version i uses the values received for this particular version and is independent from the computation of all the other versions. Thus, the ith version converges within expected p i · 2^{2(n−f)} pulses. Therefore, the expected time for all the versions to be synchronized is less than (p 1 + p 2 + · · · + p j) · 2^{2(n−f)}. This is an upper bound on the expectation since it corresponds to a scenario in which version i starts to synchronize after every version k < i is already synchronized. Now we apply the Chinese remainder theorem to show that every combination of those values is mapped to one and only one number in the range 0 to 2 · 3 · 5 · · · p j − 1. A well-known technique could be used in order to convert such a representation to its mapping (e.g., by Garner's method, cf. p. 274 in Knuth [1981]).
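The mapping from a vector of per-version clock values to a single number can be sketched as follows. This is an incremental reconstruction in the spirit of Garner's method, offered as an illustration rather than as the conversion the authors used; `crt_value` is a hypothetical name, and the three-argument `pow` with exponent −1 requires Python 3.8 or later.

```python
def crt_value(residues, primes):
    """Return the unique x in [0, product of primes) with x % p_i == residues[i]."""
    x, modulus = 0, 1
    for r, p in zip(residues, primes):
        # Pick t so that x + modulus * t is congruent to r modulo p.
        t = ((r - x) * pow(modulus, -1, p)) % p
        x += modulus * t
        modulus *= p
    return x


def combined_after(pulses, primes):
    """Combined value after each version, starting at 0, increments mod p_i per pulse."""
    residues = [pulses % p for p in primes]
    return crt_value(residues, primes)
```

Because every version advances by one per pulse, the combined value also advances by one per pulse modulo the product of the primes, which is what lets the j small clocks stand in for one large clock.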
The Chinese remainder theorem could be used for other implementations of distributed counters based on the number representation method suggested in Szabo and Tanaka [1967]. One possible use is as a memory and communication efficient distributed counter. Let DC be a distributed counter that is maintained by a set of processors P 1 , P 2 , . . . , P j that are triggered by a common pulse. P i increments the counter mod p i in every trigger. P i does not need to store all the bits of the clock or to send messages to indicate the carry (when its counter wraps around). Thus, when the counter is incremented, no communication between processors is needed. Only when the value of the counter is to be scanned is communication required.
3.3. SMOOTHING THE STEP FUNCTION. The result of the synchronous protocol could be described as a step function: the clock values are changed by one (or by π) in every pulse. Sometimes a clock that increases with a finer granularity is preferred. For instance, in a timestamp scheme the messages should be ordered by their timestamp. When the step function is used two operations of a processor can have the same timestamp since they occur between two successive increments of the logical clock.
We use the physical clock to make sure that any two successive operations of a processor have different timestamps. We assume that the maximal value in the physical clock of every processor is larger than the time between two successive pulses. Whenever a pulse occurs, the current time in the physical clock is stored. The clock value is a pair of two values. The first value is obtained by the original protocol while the second value in the pair is the time elapsed since the previous pulse, measured by the physical clock.
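The pair construction can be sketched as follows. This is a minimal illustration, assuming that the elapsed time is computed modulo the physical clock bound M_pc so that a wrap of the physical clock between pulses is handled; the class and method names are hypothetical.

```python
class SmoothedClock:
    """Pairs the pulse-driven logical clock with the time elapsed since the last pulse."""

    def __init__(self, m_pc):
        self.m_pc = m_pc          # bound of the physical clock (assumed > pulse period)
        self.pulse_reading = 0    # physical clock value stored at the last pulse

    def on_pulse(self, physical_now):
        """Store the physical clock reading when a pulse occurs."""
        self.pulse_reading = physical_now

    def timestamp(self, logical_clock, physical_now):
        """Return (logical clock, elapsed physical ticks since the last pulse)."""
        elapsed = (physical_now - self.pulse_reading) % self.m_pc
        return (logical_clock, elapsed)
```

Between two pulses, timestamps taken at different physical clock readings compare in order as tuples, as long as the logical clock itself has not wrapped around.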
Semi-Synchronous Protocol
In this section, we drop the assumption of common pulses. We present a self-stabilizing randomized protocol for semi-synchronous systems.
Our protocol uses the fault-tolerant averaging function first introduced for clock synchronization in Welch and Lynch [1988]. Given a multiset of values, a processor applies the function by discarding the f highest and f lowest values and then taking the midpoint of the remaining values. It has been shown that this function, when used in the context of the protocols of Welch and Lynch [1988], approximately halves the range of values held by the nonfaulty processors. In our situation, with bounded clocks, the notions of "highest" and "lowest" must be appropriately modified. But the real difficulty in directly applying the previous result is that the analysis showing the range is cut in half depends on all nonfaulty processors working with approximately the same multisets at each "round". The multisets can differ arbitrarily in the values corresponding to the faulty processors, but the values corresponding to nonfaulty processors must be close to the same (allowing for error introduced by clock drift and uncertain message delays). This "round" structure can be achieved because the actions of the processors are roughly synchronized in time in the protocols of Welch and Lynch [1988], due to the assumption of initial synchronization or of distinguished initial states.
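The averaging function itself is short. The sketch below implements the unbounded version described above (discard the f highest and f lowest values, then take the midpoint of what remains); it deliberately ignores the wrap-around adjustment that bounded clocks require, and the function name is a hypothetical choice.

```python
def fault_tolerant_midpoint(values, f):
    """Discard the f lowest and f highest values; return the midpoint of the rest.

    Assumption: values are ordinary (unbounded) numbers, so no wrap-around
    handling of "highest" and "lowest" is attempted here.
    """
    if len(values) <= 2 * f:
        raise ValueError("need more than 2f values")
    kept = sorted(values)[f:len(values) - f]
    return (kept[0] + kept[-1]) / 2
```

With nonfaulty values 10, 11, and 12 and f = 1, a single faulty value such as 1000 or −500 is discarded or bracketed by nonfaulty values, so the result stays within the nonfaulty range [10, 12].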
Since our protocol is self-stabilizing, it cannot rely on either of those assumptions. Thus using the fault-tolerant averaging function in the obvious manner, with the processors starting with arbitrary information and collecting clock values at arbitrary times, would not ensure that the function is applied at the processors in rounds. For instance, P could apply the function to a multiset M, and subsequently Q could apply the function to a multiset M′ that reflects P's new value instead of P's old value.
To achieve some sort of approximate rounds for applying the fault-tolerant averaging function, we first use randomization to bring all the clock values of the nonfaulty processors close to each other. Once this is achieved, all the nonfaulty processors collect (approximately) the same multisets from all the nonfaulty processors. In this stage the midpoint averaging function can be shown (cf. Welch and Lynch [1988]) to approximately halve the range of the nonfaulty clock values, thus overcoming the ongoing effects of clock drift and uncertainty of message delay.
We now describe the code of the protocol in Figure 2 . A processor P i has two synchronization procedures. The first is called the averaging procedure (lines 03 and then 09-11) and the second is the hopping procedure (lines 04 and then 12-14). A processor executes either the averaging or the hopping procedure when T s time has elapsed since the previous execution of the hopping procedure (line 01); T s is a parameter the choice of which is discussed below. The processor chooses to execute the averaging procedure if at least T a time has elapsed since the previous execution of the averaging procedure (line 02), and the value of clock i is at most δ; T a and δ are parameters that are also discussed below. Otherwise, the processor executes the hopping procedure. P i measures T s and T a using its physical clock. Roughly speaking, the hopping procedure causes the clocks of the nonfaulty processors to be within a small range. Then, the averaging procedure keeps the clocks of the nonfaulty processors in a small range by approximately halving the range each time the clock values wrap around.
In more detail, we first define the symmetric value of clock value T , denoted sym(T ), to be (T + M lc /2 ) mod M lc . Given a set S of values in the range 0 to M lc − 1, an element T in S, and an integer f > 0, reduce(S, T, f ) returns the multiset that results from removing from S the f elements that are closest to sym(T ) on either side. For example, if S = {5, 6, 95, 98}, M lc = 100, f = 1, and T = 98, then sym(T ) = (98 + 50) mod 100 = 48 and the eliminated elements are 6 and 95.
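The two definitions above can be sketched as follows and checked against the worked example. The function names are illustrative, integer division for M_lc/2 is an assumption, and a value that coincides exactly with sym(T) is arbitrarily counted on the clockwise side.

```python
def sym(t, m_lc):
    """Symmetric value of clock value t on a ring of m_lc values (integer division assumed)."""
    return (t + m_lc // 2) % m_lc


def reduce_clock_values(values, t, f, m_lc):
    """Remove the f values closest to sym(t) on each side of it; return the rest, sorted."""
    if len(values) <= 2 * f:
        raise ValueError("need more than 2f values")
    s = sym(t, m_lc)
    # Order the values by clockwise distance from sym(t): the first f entries are
    # the closest on the clockwise side, the last f the closest on the other side.
    by_clockwise = sorted(values, key=lambda v: (v - s) % m_lc)
    return sorted(by_clockwise[f:len(by_clockwise) - f])
```

For S = {5, 6, 95, 98}, M_lc = 100, f = 1, and T = 98, the call `reduce_clock_values([5, 6, 95, 98], 98, 1, 100)` removes 95 and 6 and keeps 5 and 98, matching the example in the text.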
In the hopping procedure, P i chooses one of the clock values at random from the reduced clock values list (line 13), while in the averaging procedure, P i chooses the midpoint of the reduced clock values list (line 10). In both procedures, if less than n − f clock values are found within δ of clock i , P i chooses one of the clock values randomly (lines 13 and 16).
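A minimal sketch of this selection step is given below. Figure 2 is not reproduced here, so everything beyond the prose is an assumption: the function name `choose_clock`, its parameters, the inclusion of the processor's own value in `values`, and in particular the way the midpoint is taken on the wrapped range (by measuring positions clockwise from the symmetric value of clock_i).

```python
import random


def ring_distance(a, b, m_lc):
    """Shortest distance between two clock values on a ring of m_lc values."""
    d = (a - b) % m_lc
    return min(d, m_lc - d)


def choose_clock(values, clock_i, n, f, delta, m_lc, averaging, rng=random):
    """Pick a new clock value from the collected values (illustrative sketch only)."""
    close = [v for v in values if ring_distance(v, clock_i, m_lc) <= delta]
    if len(close) < n - f:
        # Fewer than n - f values near clock_i: the random choice is unrestricted.
        return rng.choice(values)
    cut = (clock_i + m_lc // 2) % m_lc          # symmetric value of clock_i
    by_clockwise = sorted(values, key=lambda v: (v - cut) % m_lc)
    reduced = by_clockwise[f:len(by_clockwise) - f]
    if not averaging:
        return rng.choice(reduced)              # hopping: random element of the reduced list
    # Averaging: midpoint of the reduced list, measured clockwise from the cut point
    # so that wrap-around at 0 does not distort the result.
    lo = (reduced[0] - cut) % m_lc
    hi = (reduced[-1] - cut) % m_lc
    return (cut + (lo + hi) // 2) % m_lc
```

For instance, with n = 4, f = 1, M_lc = 100, δ = 5, collected values [98, 0, 2, 50], and clock_i = 0, the reduced list is {98, 0}; the averaging branch returns 99, the midpoint of 98 and 0 across the wrap-around.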
Both the synchronization procedures of processor P i start with a request for clock values (lines 05-06).
During the execution of the averaging procedure, a processor measures 2(d + )(1 + ρ) time on its physical clock in order to make sure that all the requests for clock values arrive at their destinations and the responses return before it proceeds to decide on a new clock value. Thus, each execution of the averaging procedure takes some period of time.
4.1. CORRECTNESS PROOF OF THE SEMISYNCHRONOUS PROTOCOL. The proof starts in Lemma 4.1 where the fact that the averaging function is not executed too frequently is used to show (using the pigeonhole principle) that there is a period in which no processor is executing the averaging function and processors may randomly choose a new clock value from the set of clock values obtained (see Figure 3) . Then, in Lemma 4.2, we show that there is a scenario in such a period (in which no processor executes the averaging function but processors choose clock values) in which an ultra-safe configuration is reached. We use the term ultra-safe configuration for a system configuration in which the nonfaulty processors' clocks are within δ/8 of each other. Moreover, in case a processor is in the middle of collecting clock values, then all the clock values in transit sent by nonfaulty processors are within this range too.
Using the ideas of the scheduler-luck game, we can compute an upper bound for the expected time required to reach such an ultra-safe configuration. See Figure 4 for an illustration of our claim for the existence of a 5T_s(1 + ρ) time period in which processors do not execute the averaging procedure. Lemma 4.5 shows that starting in an ultra-safe configuration, the averaging procedure ensures that the clocks of nonfaulty processors are at most δ/2 apart. Theorem 4.6 proves an upper bound on the expected convergence period following which the clocks of nonfaulty processors are at most δ/2 apart.
FIG. 3. PROOF OF LEMMA 4.1. In every time period of length T_a, P_1, P_2, P_3, ..., P_n might each execute line 10 of the code once, in a period of time that is no larger than 2(d + )(1 + ρ)^2. Thus, the maximal period of time in which some nonfaulty processor executes line 10 of its code is no more than (n − f)2(d + )(1 + ρ)^2. Therefore, during at least T_a − (n − f)2(d + )(1 + ρ)^2, no nonfaulty processor executes line 10 of its code. This, in turn, implies there exists at least one (continuous) period of time of length 5T_s(1 + ρ) in which no nonfaulty processor executes line 10.
FIG. 4. To show that all anchor processors are at most 2T_r apart, we assume toward contradiction that there are two anchor processors that are more than 2T_r apart. Thus, there are two T_r length segments of clock values, the left and the right segments (where one anchor processor has a clock value that is the leftmost in the left segment and another anchor processor has a clock value that is the rightmost in the right segment). Since the anchor processors are defined to have at least n − f clock values within T_r of their clock values, we have a total of 2n − 4f clock values of nonfaulty processors, which exceeds the maximum possible number of such clock values, n − f.
Our protocol works under the following choice of parameters. The following eight assumptions were chosen in the course of the proof. It turned out that the relative values of n, f , d, , and ρ are important. We note that this is not the only choice of parameters that ensures convergence of the clocks.
Assumption 1. T_a = 2(n − f)(6T_s + d + )(1 + ρ)^2. The choice of T_a ensures (by the pigeonhole principle) that there is a time period of length 5T_s(1 + ρ) in which nonfaulty processors only hop, namely, the processors do not perform the averaging procedure. The period of time 5T_s(1 + ρ) allows us to prove that there is a scenario in which all nonfaulty processors hop and end up with their clocks close together. See Lemma 4.1.
Assumption 2. T s ≥ (d + )ρ. Lemma 4.1 uses this inequality.
Assumption 3. (1 + ρ)^2 < 6/5. This assumption ensures that the relative rates of the clocks are bounded, allowing us to bound the maximal number of hops a processor executes while another processor executes the minimal number of hops required for reaching an ultra-safe configuration. See the end of Lemma 4.2.
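As a worked check of how the inequality (1 + ρ)^2 < 6/5 is used at the end of Lemma 4.2, assume (as noted in the proof of Lemma 4.5) that a processor hops at most once in every T_s/(1 + ρ) of real time. Multiplying both sides by 5T_s/(1 + ρ) gives:

```latex
\[
(1+\rho)^2 < \tfrac{6}{5}
\;\Longleftrightarrow\;
5T_s(1+\rho) < \frac{6T_s}{1+\rho} .
\]
% If a processor hops k times within a window of length 5T_s(1+\rho), the first
% and last hop are at least (k-1)\,T_s/(1+\rho) apart, so
\[
(k-1)\,\frac{T_s}{1+\rho} \le 5T_s(1+\rho) < 6\,\frac{T_s}{1+\rho}
\;\Longrightarrow\; k \le 6 .
\]
```

This is the source of the factor six in the bound 6(n − f) on the number of random choices that must come out as required.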
Assumption 4. (n − f ) + 2T_s(1 + ρ)ρ < δ/8. The proof of Lemma 4.2 uses the period in which processors' clocks hop to show that the nonfaulty processors' clocks are within (n − f ) + 2T_s(1 + ρ)ρ, a value we choose to replace with a concrete value δ/8 in order to show that even with clock drift the clocks stay within δ/2 of each other.
Assumption 5. T_s/(1 + ρ)^2 − (n − f ) ≥ T_s/(2(1 + ρ)). We prove in Lemma 4.5 that the rate at which the minimal clock value is advanced is at least T_s/(1 + ρ)^2 − (n − f ) and we replace it by a rate of growth 0.5 using the inequality of Assumption 5. Thus, the minimal clock value must pass 0 frequently enough.
Lemma 4.5 uses assumption 6 to replace (n − f ) + 2T_sρ by a value that will yield a rate of growth of 0.5ρ for the clock value range.
Assumption 7. M lc (ρ + ρ 2 ) < 3δ/16. Assumption 7 replaces the range following the execution of the averaging function, which is M lc (1 + ρ)ρ + δ/4, by a more concrete value δ/2. See Lemma 4.5.
. Theorem 4.6 uses this inequality to bound the rate of the maximal clock value.
First, we show that the above assumptions are not contradictory and reflect realistic parameters. To do so, we choose values that correspond to some of the choices in Cristian [1989]. In particular, we choose ρ = 10
A period of time is a hopping period if no nonfaulty processor executes the averaging procedure during this period. We choose T_a to be 2(n − f)(6T_s + d + )(1 + ρ)^2 (assumption 1 above). The next lemma proves that the above choice yields the existence of a hopping period of length 5T_s(1 + ρ). See Figure 3 for an illustration of the assertions made.
LEMMA 4.1. In every time period of length T a , there is a hopping period that is at least 5T s (1 + ρ) long.
PROOF. A processor measures time by the use of its physical clock, whose drift rate from real time is at most ρ. Thus, if a processor measures a period of time T on its physical clock, then the real time elapsed during the measurement is at least T/(1 + ρ) and at most T(1 + ρ). By the way T_a is chosen, in every period of length T_a/(1 + ρ) of real time, every nonfaulty processor executes the averaging procedure at most once. A processor measures 2(d + )(1 + ρ) time in order to make sure that the requests for clock values arrive at their destinations and the responses arrive before it decides on a new clock value. Thus, the amount of real time required for the averaging procedure to execute, namely, the time that elapses between starting line 3 and finishing line 11, is at most 2(d + )(1 + ρ)^2. By the way T_a is chosen, a processor executes the averaging procedure at most once in every period of 2(n − f)(6T_s + d + )(1 + ρ) real time. Hence, during a period of length 2(n − f)(6T_s + d + )(1 + ρ), the total amount of real time during which at least one nonfaulty processor is executing the averaging procedure is at most (n − f)2(d + )(1 + ρ)^2. Therefore, the total amount of time during which no nonfaulty processor is executing the averaging procedure is at least
2(n − f)(6T_s + d + )(1 + ρ) − 2(n − f)(d + )(1 + ρ)^2 = 2(n − f)(1 + ρ)(6T_s − (d + )ρ) ≥ 10(n − f)T_s(1 + ρ),
where we use assumption 2 that T_s ≥ (d + )ρ. The at most n − f executions of the averaging procedure split this time into at most n − f + 1 hopping periods. By the pigeonhole principle, at least one hopping period is of length at least 5T_s(1 + ρ).
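The pigeonhole step can be illustrated with a small sketch that finds the longest stretch of a window not covered by any busy interval. The function name, the interval representation, and the numbers in the usage note are hypothetical and chosen only for illustration.

```python
def longest_idle_gap(busy, start, end):
    """Length of the longest part of [start, end] not covered by any busy interval."""
    longest = 0
    cursor = start                      # everything before cursor is already accounted for
    for lo, hi in sorted(busy):
        lo, hi = max(lo, start), min(hi, end)   # clip the interval to the window
        if hi <= lo:
            continue                    # interval lies outside the window
        if lo > cursor:
            longest = max(longest, lo - cursor)
        cursor = max(cursor, hi)
    return max(longest, end - cursor)
```

With k busy intervals of total length at most B inside a window of length W, the idle time is at least W − B and is split into at most k + 1 gaps, so the longest gap is at least (W − B)/(k + 1). For example, `longest_idle_gap([(10, 12), (30, 33), (11, 15)], 0, 60)` returns 27, which exceeds the pigeonhole bound (60 − 9)/4 = 12.75.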
An ultra-safe configuration is a system configuration in which the nonfaulty processors' clocks are within δ/8 of each other. Moreover, in case a processor is in the middle of collecting clock values, then all the clock values in transit sent by nonfaulty processors are within this range too. A safe configuration is a configuration that is reached from an ultra-safe configuration and in which the nonfaulty processors' clocks are at most δ/2 apart.
LEMMA 4.2. During any hopping period of length 5T_s(1 + ρ), with probability at least 1/n^{6(n − f)}, the system reaches an ultra-safe configuration.
PROOF. We prove the lemma by presenting a sequence of random choice results that forces the system to reach a configuration in which the clocks of all the nonfaulty processors are less than δ/8 apart. This sequence of random choice results has probability of at least 1/n^{6(n − f)} of occurring. Let c be the configuration at the beginning of the hopping period.
Without loss of generality, we assume that the number of faulty processors f is the maximal possible that does not violate the inequality n > 3f. Let c be the first configuration in a hopping period. For every nonfaulty processor P, luck counts the number of other nonfaulty processors that have clocks within T_r = δ + 4(T_s(ρ + ρ^2)) + 2 of P's clock in the configuration c. Each nonfaulty processor that has at least n − 2f − 1 such surrounding clock values is called an anchor.
PROOF. Assume towards contradiction that there are two nonfaulty anchor processors, P and Q, whose clock values are more than 2T_r apart. Thus, P is surrounded by n − 2f − 1 nonfaulty processors and Q is surrounded by n − 2f − 1 different nonfaulty processors. Therefore, counting P and Q themselves, the total number of nonfaulty processors is at least 2(n − 2f − 1) + 2 = 2n − 4f, which exceeds n − f because n > 3f, a contradiction. (end of Claim 4.3)
Note that it is possible that no anchor processor exists. In this case, luck chooses one nonfaulty processor to be an anchor processor.
Then, luck chooses a single anchor processor A out of the anchor processors. Until every nonfaulty processor executes the hopping procedure twice, luck uses the following strategy: every time a processor, P_j, chooses a clock value and the value of the clock of A is a possible choice (i.e., either P_j does not find n − f clock values within δ or A's clock value is in the reduced clock values list), this value is chosen; otherwise the value of clock_j is not changed. Let c_1 be the first configuration reached from c after each processor executes the hopping procedure at least twice with results according to the strategy of luck. Let E_1 be the execution that starts with c and ends with c_1. Since in a hopping period every nonfaulty processor chooses a clock value at least once in every period of length T_s(1 + ρ), c_1 occurs at most 2T_s(1 + ρ) time after c.
PROOF. We first show that any nonanchor processor, P, assigns the value of A's clock to its own clock either in the first execution of the hopping procedure or in the second one. Every processor collects the clock values during every execution of the hopping procedure. In particular, a nonanchor processor, P_j, receives the value of the clock of A before the second execution of the hopping procedure. Next, we show that in the second execution of the hopping procedure P_j can choose the value of A's clock.
The choice of P j is restricted to a subset of the clock values that P j read, only if P j finds n − f clock values within δ of clock j . Since P j is a nonanchor processor it holds in c that there are less than n − f processors within T r of clock j . If P j does not assign to its clock the value of A's clock during E 1 , then when P j executes the hopping procedure, P j either discovers n − f clock values within δ of its clock value (and A's clock is not among them) or P j does not assign a new value to its clock.
First, we show that during E 1 no processor assigns to its clock a value that is within δ + 4(T s (ρ + ρ 2 )) + of P j 's clock. Thus, P j cannot find n − f clock values within δ of its clock value. (Note that an assignment of a clock value that is more than δ + 4(T s (ρ + ρ 2 )) + apart from P j 's clock implies that P j cannot read this clock value during E 1 as being δ apart from its clock value.)
Every nonfaulty processor P k that does not change its clock value by the use of the hopping procedure has a rate of drift from the clock of P j of at most 2ρ. Thus, the difference between clock j and clock k can be shortened by at most 2T s (1 + ρ)2ρ = 4(T s (ρ + ρ 2 )). Every nonfaulty processor P k that changes its clock value by the use of the hopping procedure assigns to its clock a clock value which is at most from the clock of A. Thus, if P k 's clock was more than T r = δ + 4(T s (ρ + ρ 2 )) + 2 apart from P j 's clock in c then P j cannot consider P k to have a clock within δ of clock j during E 1 (unless P j assigns to clock j the value of the clock of A).
This proves that in c 1 all the nonanchor processors are within + 4T s ρ from A's clock. The anchor processors that do not assign the clock value of A to their clock during E 1 were at most 2T r apart in c, thus they are at most 2T r + 2T s (1 + ρ)2ρ apart in c 1 .
(end of Claim 4.4)
The fact that all the nonfaulty processors are within a small range of each other is used to define a new anchor processor A′. The new anchor processor is a processor that has in c_1 a clock value that is obtained by eliminating f clock values of nonfaulty processors from each side of the symmetric clock value of A.
Starting at c_1 and continuing until every processor executes the hopping procedure at least twice, luck behaves as follows: any processor P_i that is in the process of collecting clock values in c_1 does not change clock_i in the first execution of the hopping procedure. For any other execution of the hopping procedure, luck intervenes to fix the result to be the clock of A′ or the clock of a processor that has already set its clock to the value of A′'s clock since c_1. We have to prove that this behavior is a possible result of the hopping procedure. This is obvious when the processor does not find n − f processors within δ of its clock, since the choice is not restricted. It is also true for the first set of processors that execute the hopping procedure and use the clock values in c_1 as the basis for the decision on the new clock value, since by our choice of A′ the reduced list must include A′'s clock. Moreover, since luck intervenes and fixes all those results to be the value of the clock of A′, the reduced list of every processor that uses the new clock values includes either the clock of A′ or a clock of a processor that assigned to its clock the value of A′'s clock.
Hence, in the first configuration, c 2 , that follows the first two executions of the hopping procedure of all the processors following c 1 , all the nonfaulty processors are within (n − f ) + 2T s (1 + ρ)ρ of each other, which, by assumption 4, is less than δ/8.
Following c 2 any processor that is waiting for answers in the process of collecting clock values does not change its clock value. Thus, (d + )(1 + ρ) time after c 2 an ultra-safe configuration is reached.
The length of the execution is 2T_s(1 + ρ) until c_1 is reached, 2T_s(1 + ρ) from c_1 to c_2, and an additional (d + )(1 + ρ) until an ultra-safe configuration is reached. Thus, an ultra-safe configuration is reached within (4T_s + d + )(1 + ρ) < 5T_s(1 + ρ) time from c. By assumption 3, 5T_s(1 + ρ) < 6T_s/(1 + ρ). Thus, any processor could choose a new clock value at most six times in such a period, and the total number of interventions is at most 6(n − f). (end of Lemma 4.2)
LEMMA 4.5. In any configuration of any execution that starts with an ultra-safe configuration, the clock values of all the nonfaulty processors are within δ/2 of each other.
PROOF. In an ultra-safe configuration the clock values of the nonfaulty processors are within δ/8 of each other. The clock value range of the nonfaulty processors can grow by less than (n − f ) + 2T_sρ (by Assumption 6 this quantity is less than 0.5T_sρ/(1 + ρ)) in any T_s/(1 + ρ) real time. Thus, the rate of growth of the clock value range is at most 0.5ρ. We define the minimal clock value by using the symmetric value of an arbitrary clock value x of a nonfaulty processor. The minimal clock value is the clock value of a nonfaulty processor that is to the right (clockwise) of the symmetric clock value. The minimal clock value is advanced by at least T_s/(1 + ρ)^2 − (n − f ) (by Assumption 5 this quantity is at least T_s/(2(1 + ρ))) in each T_s/(1 + ρ) period of time (note that a processor may hop at most once in every T_s/(1 + ρ) period of time). Thus, the rate of growth of the minimal clock value is at least 0.5. This yields that in a period of 4M_lc(1 + ρ) the minimal value passes 0 twice. The second time that the minimal value passes 0, the range of the processors' clock values is at most 4M_lc(1 + ρ)0.5ρ + δ/8. By Assumption 7, this range is less than δ/2.
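The last step can be checked directly from Assumption 7, M_lc(ρ + ρ^2) < 3δ/16. The following worked computation uses only quantities stated in the proof:

```latex
\[
4M_{lc}(1+\rho)\cdot 0.5\rho + \frac{\delta}{8}
= 2M_{lc}(\rho+\rho^{2}) + \frac{\delta}{8}
< 2\cdot\frac{3\delta}{16} + \frac{\delta}{8}
= \frac{3\delta}{8} + \frac{\delta}{8}
= \frac{\delta}{2}.
\]
% The same assumption bounds the range between later executions of the averaging function:
\[
M_{lc}(1+\rho)\rho + \frac{\delta}{4}
< \frac{3\delta}{16} + \frac{4\delta}{16}
= \frac{7\delta}{16}
< \frac{\delta}{2}.
\]
```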
Hence, the second time that the processor with the minimal clock value sets its clock to a value in [0, ρ], the averaging procedure of the processors is executed correctly and the result is a reduction in the range of the clocks to at most half of the range before averaging (see Lundelius and Lynch [1984] and also Lemma 5 of ). Thus, the result is a range of δ/4. The next time that the averaging function is executed, the clock values' range is M_lc(1 + ρ)ρ + δ/4. By Assumption 7, this range is less than δ/2.
THEOREM 4.6. In expected O(T_a n^{6(n − f)}) time the system stabilizes.
PROOF. By Lemmas 4.1 and 4.2, in every T_a period of time there is a probability of at least 1/n^{6(n − f)} of reaching an ultra-safe configuration. Thus, in expected O(T_a n^{6(n − f)}) time an ultra-safe configuration is reached. Lemma 4.5 proves that once an ultra-safe configuration is reached, the clock agreement requirement holds with γ = δ/2. The minimal clock value is advanced by at least T_s/(1 + ρ)^2 − (n − f ) (by assumption 5 it is at least T_s/(2(1 + ρ))) in each T_s/(1 + ρ) period of time. The maximal clock value is advanced by at most T_s(1 + ρ)/(1 + ρ) + (n − f ) in each T_s/(1 + ρ) period of time. Thus, by Assumption 8, the maximal clock value is advanced by at most 2T_s/(1 + ρ) in every T_s/(1 + ρ) period of time. Therefore, the clock validity requirement holds with = T_s and a = 1.
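The expected-time bound in the first part of the proof follows from a standard geometric argument, sketched here. The symbols p and N are introduced only for this sketch: p is the success probability from Lemma 4.2, and N is the number of consecutive windows of length T_a until an ultra-safe configuration is reached.

```latex
\[
p = \frac{1}{n^{6(n-f)}}, \qquad
\Pr[N > k] \le (1-p)^{k} \quad (k = 0, 1, 2, \ldots),
\]
% since each window succeeds with probability at least p regardless of the
% configuration at which it starts. Summing the tail probabilities:
\[
\mathbb{E}[N] = \sum_{k \ge 0} \Pr[N > k] \le \sum_{k \ge 0} (1-p)^{k} = \frac{1}{p},
\qquad\text{so}\qquad
\mathbb{E}[\text{time}] \le T_a \, n^{6(n-f)} .
\]
```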
Concluding Remarks
Extensive research has been done to find efficient clock synchronization protocols in the presence of Byzantine faults. In this work, we considered a more severe (and realistic) model of faults, that is, one that takes into account transient faults as well as Byzantine faults. When arbitrary corruption of state is possible, as is often the case with transient faults, it is no longer reasonable to approximate unbounded clocks with bounded clocks, no matter how large. Consequently, clocks that can take on only a bounded number of values (and wrap around when appropriate) have been assumed in this article. We presented two randomized self-stabilizing protocols for synchronizing bounded clocks in the presence of f Byzantine processor failures, where n > 3 f .
We believe that our observations and definitions for the types of faults to be considered and the type of clocks (namely, bounded) reflect reality and open new directions for research. Protocols designed under our fault tolerance model are more robust than existing clock synchronization protocols. Therefore, such protocols might be preferred by the system designer over protocols that cope with only Byzantine faults.
