On the computational power of self-stabilizing systems  by Abello, James & Dolev, Shlomi
ELSEVIER Theoretical Computer Science 182 (1997) 159-170
On the computational power of self-stabilizing systems 1
James Abello a,2, Shlomi Dolev b,*
a Department of Computer Science, Texas A&M University, College Station, TX 77843, USA
b Department of Mathematics and Computer Science, Ben-Gurion University of the Negev,
Beer-Sheva, Israel
Received July 1994; revised June 1995 
Communicated by M.S. Paterson
Abstract 
The computational power of self-stabilizing distributed systems is examined. Assuming avail- 
ability of any number of processors, each with (small) constant size memory we show that any 
computable problem can be realized in a self-stabilizing fashion. 
The result is derived by presenting a distributed system which tolerates transient faults and 
simulates the execution of a Turing machine. The total amount of memory required by the 
distributed system is equal to the memory used by the Turing machine (up to a constant factor). 
1. Introduction 
Our motivation to explore the power of interconnected processors with constant size
memory was first triggered by the following question: What is the relation between
the computational power of a single powerful computer and a distributed system of
processors with limited power and memory that are subject to transient faults? The approach
is different from the one taken by the parallel algorithm community [13]. The concern
in this work is the fault tolerance of the algorithm rather than the time it takes to 
execute its task. We view a distributed system as a stand-alone system (as opposed to 
a single-site parallel machine that can be locally controlled) that runs on-going tasks 
and is able to overcome faults. 
In particular, we are interested in self-stabilizing systems. A self-stabilizing system is
a system that can be started in any possible global state. A transient fault is a fault that
causes the state of a processor to change arbitrarily. Self-stabilizing systems can tolerate
transient faults. When the period between two successive transient faults is
long enough the system stabilizes. Following its stabilization the system exhibits
its desired predefined behavior.
* Corresponding author. E-mail: dolev@cs.bgu.ac.il. Part of this work was done while this author was at the
Department of Computer Science, Texas A&M University, College Station, TX 77843. Supported in part by
TAMU Engineering Excellence funds and NSF Presidential Young Investigator Award CCR-9158478.
1 An extended abstract of this paper was presented at the 6th International Conference on Computing and
Information, Canada, May 1994.
2 E-mail: abello@cs.tamu.edu. Supported in part by NSF grant CCR-9304081.
0304-3975/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved
PII S0304-3975(96)00150-5
In this paper we consider a distributed system of processors each with a constant 
amount of memory. In order to understand the inherent behavior of the system we 
examine the extreme case where each processor is equipped with only a few bits of
memory. The reference powerful computer is modeled by a Turing machine. Theoretically
there is no upper bound on the amount of memory needed for storing the program
that a computer needs to execute. This is also true of Turing machines,
where the program corresponds to the transition table. In order to eliminate the table
size factor we consider only a specific deterministic universal Turing machine (denoted
in the sequel by TM) [10].
Input and output are given in a distributed fashion. Each processor may receive part 
of the input, and should output part of the output. Since the processors have constant 
memory size, the input to each processor consists of no more than a constant number of bits.
Note that the input length might be shorter than the number of processors in the system. To
simplify presentation, when the number of processors is $n$ and the input length is $l < n$,
we concatenate the word $\bot^{n-l}$ to the original input word to obtain an input of length
$n$. The output of the processors must (eventually) be correct with respect to the inputs.
Note that the input can be changed during the execution of the algorithm. In this paper, 
we only focus on long enough periods of time in which the input is fixed and require 
that the output will correspond to the input some time after each such period begins. 
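The padding convention can be illustrated with a short Python sketch. The function name `pad_input` and the use of the character "⊥" as the padding symbol are assumptions made for this illustration only; the paper does not prescribe an encoding.

```python
BOTTOM = "⊥"  # assumed stand-in for the padding symbol


def pad_input(word, n):
    """Return the n per-processor input symbols for a word of length l <= n.

    Processor i receives the i-th symbol of the word; processors beyond the
    end of the word receive the padding symbol.
    """
    if len(word) > n:
        raise ValueError("input word is longer than the chain")
    if any(ch not in "01" for ch in word):
        raise ValueError("input word must be over {0, 1}")
    return list(word) + [BOTTOM] * (n - len(word))
```

For instance, a three-symbol word on a chain of five processors leaves the last two input registers holding the padding symbol.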
Our distributed system is connected in a chain topology. The chain is only an abstrac- 
tion of a predefined marked chain over any general graph. In particular, a system with a 
predefined ring (as is the case for token ring protocols) and a predefined leader fits this 
model. Each processor in the chain could be a communication port processor with lim- 
ited resources of memory and computation. The number of processors in the chain is not 
a priori restricted. Obviously, any existing hardware is finite and the number of proces- 
sors in the system is also finite. However, in order to have a base for comparison with 
the infinite tape of a Turing machine we do not a priori restrict the number of proces- 
sors. For any given n we construct a self-stabilizing distributed system with n constant 
memory processors. A distributed system of $n$ processors accepts (rejects) the input iff
the corresponding Turing machine accepts (rejects, respectively) the same input using
no more than $n$ memory cells of its working tape. When the Turing machine uses more
than $n$ memory cells during the computation (or uses fewer memory cells but does not
halt) the processors of the distributed system output $\bot$ as a "don't know" symbol.
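The required output can be stated operationally. The sketch below is a plain sequential reference, not the distributed algorithm: it runs a deterministic machine on a tape of $n$ cells and reports 1, 0, or $\bot$. The parity machine `PARITY`, the state names, and the explicit step budget `max_steps` are hypothetical choices for the illustration.

```python
BOTTOM = "⊥"

# Hypothetical toy machine: accepts words with an even number of 1s.
# (state, symbol) -> (next state, symbol to write, head move)
PARITY = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("even", BOTTOM): ("accept", BOTTOM, "S"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("odd", BOTTOM): ("reject", BOTTOM, "S"),
}


def bounded_run(delta, start, word, n, max_steps):
    """Return "1"/"0" if the machine accepts/rejects within n cells, else ⊥."""
    tape = list(word) + [BOTTOM] * (n - len(word))
    state, head, steps = start, 0, 0
    while state not in ("accept", "reject"):
        if steps == max_steps:          # budget exhausted: "don't know"
            return BOTTOM
        state, tape[head], move = delta[(state, tape[head])]
        head += {"L": -1, "R": 1, "S": 0}[move]
        if not 0 <= head < n:           # more than n cells would be needed
            return BOTTOM
        steps += 1
    return "1" if state == "accept" else "0"
```

With three cells the word 11 is accepted, while with only two cells the same machine runs off the tape before it can see the end of the input, so the answer is $\bot$.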
Certainly, a TM can simulate the execution of any distributed system using the same 
amount of memory (up to a constant factor). Interestingly enough, the main result of 
this paper shows that processors with (small) constant amount of memory can tolerate 
transient faults and obtain the same result as a fault free execution of a TM. Namely, 
we show that a distributed system of interconnected constant-size memory processors 
can simulate the computation of a TM in the presence of transient faults. The total 
amount of memory required by the distributed system during the computation with the 
input word $w$ is equal to the memory used by the TM with the input word $w$ (up to
a constant factor).
The study of self-stabilizing algorithms started with the fundamental paper of
Dijkstra [4], where three self-stabilizing algorithms for the mutual exclusion problem
were presented. Recently, an extensive effort has been directed towards finding time 
and memory efficient self-stabilizing algorithms (cf. [1, 5, 6, 14]). Most recent works
assume the existence of distinct identifiers (cf. [1, 5]). The use of distinct identifiers
yields a lower bound of $\Omega(\log n)$ bits for the size of memory per processor. Thus,
those solutions do not apply to systems with constant memory size processors. Other
recent works use randomization in order to break symmetry (cf. [2, 8, 11, 12, 15]). In
this paper we neither assume distinct identifiers nor use randomization. We restrict the
system topology to be a directed chain with a leader processor at one endpoint and a
tail processor at the other. Our system simulates a Turing machine computation in a
self-stabilizing fashion. 
In [9], it is proposed to use simulation in proving the possibility to force or preserve 
the self-stabilization property for different systems. The simulation of a TM by another 
system is not considered. In [16], the question of whether a polynomial self-stabilizing
finite state program exists for decision problems is considered. Our goal is different: we
simulate a Turing machine by a self-stabilizing distributed system of constant-memory 
processors in order to examine the computation power of the fault-tolerant distributed 
system. The remainder of the paper is organized as follows. In the next section we 
formalize the assumptions and requirements. Section 3 contains the description of our 
algorithm. Concluding remarks are in Section 4. 
2. Distributed system 
We consider distributed systems that consist of processors $P_1, P_2, \ldots, P_n$ that are
connected in a chain. The processors are anonymous; the subscripts 1 to $n$ are used
only for convenience. No processor knows $n$, the number of processors. Processors have
a sense of direction, i.e. for $j > 1$, $P_{j-1}$ is the left neighbor of $P_j$ and for $j < n$,
$P_{j+1}$ is the right neighbor of $P_j$. $P_1$ is the leader processor, $P_n$ is the tail
processor and the rest of the processors are intermediate processors.
The reader may refer to [6, 8] for the full (standard) definitions we use for
configuration, atomic step, fair execution, and asynchronous round. Processors communicate
by the use of shared communication registers. 3 Two neighboring processors $P_i$ and $P_j$
communicate by two shared registers $r_{ij}$ and $r_{ji}$. $P_i$ ($P_j$) writes in $r_{ij}$
($r_{ji}$) and reads from $r_{ji}$ ($r_{ij}$). In addition to accessing its neighbors'
communication registers, each processor $P_i$ can repeatedly read one symbol of input from
$I_i$, its input register, and repeatedly write one symbol of output to its output register
$O_i$. The content of $I_i$ and $O_i$ is either 0, 1 or $\bot$. We view the concatenation of
the input symbols as a fixed word in $\{0,1\}^l \bot^{n-l}$ where $n \ge l$.
3 In the context of self-stabilizing algorithms the use of shared communication simplifies the presentation
with respect to the message passing model. However, our results can be applied to message passing systems
as well by using methods presented in [7].
The state of a processor fully describes its internal state and the values written in its
registers, including the output and input registers. A configuration is a vector of the states
of all processors. Processors execute atomic steps. An atomic step consists of some local
computation followed by either a read from a communication register and the input symbol
or a write in a communication register and the output symbol. An execution of the
system is a sequence of configurations $E = (c_1, c_2, \ldots)$ such that for
$i = 1, 2, \ldots$, $c_{i+1}$ is reached from $c_i$ by a single atomic step of some processor.
Given an execution $E$, the first round of $E$ is finished immediately after each processor
has executed at least one atomic step; the second round is finished after each processor has
executed at least one atomic step following the termination of the first round, and so on.
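The round definition can be made concrete with a small Python sketch that counts completed rounds in a given schedule of atomic steps. Representing an execution as a list of processor indices is an assumption of this illustration; the paper works with configurations rather than schedules.

```python
def count_rounds(schedule, n):
    """Count the rounds completed by a schedule over processors 0..n-1.

    A round ends as soon as every processor has taken at least one step
    since the end of the previous round.
    """
    rounds = 0
    pending = set(range(n))        # processors that still owe a step
    for p in schedule:
        pending.discard(p)
        if not pending:            # everyone has moved: the round is over
            rounds += 1
            pending = set(range(n))
    return rounds
```

In the schedule 0, 1, 2, 0, 0, 1, 2 over three processors, the first round ends after the third step and the second round ends only at the last step, because processor 2 is slow to take its second step.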
The requirements for self-stabilizing algorithms state the conditions under which the
system has to stabilize when started in an arbitrary configuration and specify the
required behavior of the system following the stabilization period. Next we define the
self-stabilization requirements for our distributed algorithm $\mathcal{A}$. Let $w$ be a
word in $\{0,1\}^l$. An algorithm $\mathcal{A}$ is self-stabilizing if for any finite $n$,
when $\mathcal{A}$ is executed by a system of $n$ processors and is started in any possible
configuration $c$ with input word $w\bot^{n-l}$ ($n \ge l$), then: (1) any fair execution
that starts with $c$ has a suffix in which the output of every processor $P_i$ is constant,
and (2) this constant output is 1 (0, respectively) if the TM accepts (rejects) $w$ using no
more than $n$ working tape cells; otherwise the output is $\bot$.
3. The reduction 
A self-stabilizing mutual exclusion algorithm serves as a building block in our al- 
gorithm. The self-stabilizing mutual exclusion algorithm guarantees that starting with 
any possible configuration, after a finite number of asynchronous rounds every con- 
figuration contains exactly one processor which is executing the critical section. The 
mutual exclusion algorithm of [6] and the coloring algorithm of [8] use only a constant
number of states per processor and ensure that in every fair execution following the
stabilization period the single token repeatedly “travels” from the leader to the tail and 
back. For simplicity we consider a processor that executes the critical section as hold- 
ing a token. We use the terms send token and receive token to indicate transfer of the 
privilege to execute the critical section from one processor to another. Note that before 
a processor P transfers the privilege to execute the critical section, P can write in its 
shared communication register a “content” for the token. Thus, we view the token as 
an entity with a value that is transferred from one processor to another. 
The (eventual) behavior of the token is used to ensure that the chain of processors
will repeatedly write only the correct output. The processors repeatedly simulate
the execution of the TM. Whenever a simulation of the TM computation is over, the
processors are initialized to start a new simulation. The initialization does not affect the
value of the output registers. The result of the simulation is written to the output
registers. Once the result of the simulation is correct any further write operation into 
the output registers (which is also a result of a correct simulation) does not change 
the value stored in these registers. A single step of the TM is simulated each time the 
token travels from the leader to the tail and back. Each processor contains the infor- 
mation of a single working tape cell. In every configuration of a correct simulation 
(one that follows correct initialization) there exists a single processor that is marked 
to hold the head of the TM. Whenever the token reaches the processor that holds the 
head of the TM the value of the working tape cell and the current state of the TM are 
used to calculate the transition of the TM. The transition includes modification of the 
contents of the working tape cell, change of the TM state and movement of the head 
mark to a neighboring processor. 
Due to the self-stabilizing setting the simulation might not terminate. The following 
observation is used to ensure detection of a non-terminating simulation: A TM that 
reaches the same configuration twice in a single computation does not ever halt. Since 
our system is finite we propose to count the number of the TM configurations during 
the computation. To do so with a constant amount of memory per processor we
suggest using a distributed binary counter. Each processor maintains only two bits of
the counter. The distributed counter is incremented by one in every step of the TM.
The tail processor, which holds the least significant bits, starts to increment the counter
whenever the token arrives at it. Indication of a carry is sent to the left neighbor when
appropriate. If a counter overflow occurs before the TM accepts or rejects the input 
then the system is initialized. 
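The counter can be pictured as a base-4 number whose digits are spread over the chain. The following Python sketch is a sequential stand-in for one tail-to-leader traversal of the token; the list representation and the function name `increment` are assumptions of this illustration.

```python
def increment(cnt_bits):
    """Add 1 to a counter stored as one 2-bit digit per processor.

    cnt_bits[0] belongs to the leader and cnt_bits[-1] to the tail, which
    holds the least significant digit. The list is updated in place and the
    function returns True when the leader observes an overflow.
    """
    carry = 1                                    # the tail injects the increment
    for i in reversed(range(len(cnt_bits))):     # token moves tail -> leader
        total = cnt_bits[i] + carry
        cnt_bits[i] = total % 4                  # two bits: values 0..3
        carry = total // 4                       # carry sent to the left neighbor
    return carry == 1
```

Starting from all zeros on a chain of $n$ processors, the overflow is reported on increment number $4^n = 2^{2n}$, which is the quantity used in the round bounds of Section 3.1.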
In more detail, each processor $P_i$ maintains two bits of the distributed counter in
CntBits_i. When a token arrives at $P_n$, $P_n$ computes the new value for CntBits_n and the
carry. Then $P_n$ writes the carry value in Tkn.Cr to its neighbor $P_{n-1}$. Whenever the
leader $P_1$ detects a counter overflow the leader resets the system. The following activities
occur during this reset: (1) every processor $P_i$ writes its input to WrkSym_i, which is
the $i$'th cell of the virtual TM working tape, (2) every processor $P_i$ sets a flag HdMrk_i
to false, the only exception being the leader ($P_1$), which sets HdMrk_1 to true, (3) the
binary counter bits of each processor are set to 00. Following the first reset the chain
implements a virtual TM. The computation of the TM is simulated during the traversal
of the token from the leader to the tail; when a processor $P_i$ with HdMrk_i = T receives
a token that traverses in this direction the (constant space) TM table is used to determine
the value for WrkSym_i, the movement of the head and the new TM state. Then
the system reaches the new TM configuration by changing the virtual working tape, the
head marker location and the TM state accordingly. Note that the token continues to
the right. Hence, in case the direction of the head movement is towards the leader, the
transition of the TM head is delayed until the token arrives from the direction of the tail.
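The delayed left move can be seen in a simplified sequential sketch of one token round trip. This is an illustration under stated assumptions, not the code of the figures: counters, resets and outputs are omitted, a head that leaves the chain raises an error where the paper triggers a reset, and the transition table `DELTA` is a hypothetical toy machine.

```python
def round_trip(wrk, hd, mov, state, delta):
    """Simulate one leader -> tail -> leader traversal; return the new TM state."""
    n = len(wrk)
    tkn_hd = False                              # head mark carried by the token
    for i in range(n):                          # token moves towards the tail
        received, tkn_hd = tkn_hd, False
        if hd[i]:                               # processor i holds the TM head
            state, wrk[i], mov[i] = delta[(state, wrk[i])]
            if mov[i] == "R":                   # head travels on with the token
                hd[i], tkn_hd = False, True
        if received:                            # head arrives from the left
            hd[i], mov[i] = True, "S"
    if tkn_hd:
        raise RuntimeError("head moved right of the tail")
    for i in reversed(range(n)):                # token moves back to the leader
        received, tkn_hd = tkn_hd, False
        if received:                            # head arrives from the right
            hd[i], mov[i] = True, "S"
        if hd[i] and mov[i] == "L":             # delayed left move happens now
            hd[i], tkn_hd = False, True
    if tkn_hd:
        raise RuntimeError("head moved left of the leader")
    return state


# Hypothetical toy table: (state, symbol) -> (state, symbol, move)
DELTA = {
    ("a", "0"): ("a", "1", "R"),
    ("a", "1"): ("b", "0", "L"),
    ("b", "0"): ("b", "0", "S"),
    ("b", "1"): ("b", "1", "S"),
}

wrk, hd, mov = ["0", "0", "1"], [True, False, False], ["S", "S", "S"]
state = "a"
for _ in range(3):                              # three round trips = three TM steps
    state = round_trip(wrk, hd, mov, state, DELTA)
```

A processor checks its own head mark before it accepts a mark carried by the token, so a head that has just arrived does not take a second transition in the same traversal. This keeps the simulation at one TM step per round trip.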
Next we briefly describe the conventions, variables, functions and statements that are
used in the code of the algorithm. Upon arrival of a token the code of a processor is
executed sequentially from its beginning to its end; the labels that appear in the code
(e.g. L1, L2) are used only for the sake of readability. The symbols '{' and '}' are
used to denote the beginning and the end, respectively, of the portion of the code that
is executed when the condition of the appropriate if statement is satisfied.
The token value: The token is a combination of three field values. Tkn.Cr: is used
to indicate carry for the binary increment. Its value is either 0 or 1. Tkn.Rst: indicates
a reset execution. Its value is either T or F. Tkn.TMSta: encodes the current state of
(the universal) TM. Every possible state (of the constant number of states of TM) can
be encoded in Tkn.TMSta. In addition $\bot$ is used to represent "no-state" during reset
executions. Besides these three fields, the code uses Tkn.HdMrk to indicate that the head
mark is being transferred with the token.
Local variables used by each processor: There are five local variables. RTkn:
Stores the value of the token received. CntBits: Contains two bits of the
distributed counter. HdMrk: Indicates the presence of the TM head. The value of
HdMrk is either T or F. HdMov: Indicates the computed movement of the head, either
Left, Right or Stay. WrkSym: The working tape symbol; every possible working symbol
(of the constant number of working symbols of TM) can be encoded in WrkSym.
The functions used in the code: TM(Initial) results in the initial state of TM.
Tkn.TMSta, WrkSym_i, HdMov_i := TM(Tkn.TMSta, WrkSym_i) uses the current Tkn.TMSta
and WrkSym_i and the TM transition table to compute the next Tkn.TMSta, WrkSym_i
and HdMov_i. Note that we define the result of the statement TM($\bot$, WrkSym) to be
the same as that of TM(TM(Initial), WrkSym).
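The convention for the "no-state" value can be sketched in Python as a thin wrapper around a transition table. The table `TABLE`, the state name `q0` and the function name `tm` are hypothetical; only the rule that $\bot$ behaves like the initial state is taken from the text.

```python
BOTTOM = "⊥"
INITIAL = "q0"   # hypothetical name for the value of TM(Initial)

# Hypothetical toy transition table:
# (state, symbol) -> (next state, symbol to write, head movement)
TABLE = {
    ("q0", "0"): ("q0", "0", "Right"),
    ("q0", "1"): ("accept", "1", "Stay"),
    ("q0", BOTTOM): ("reject", BOTTOM, "Stay"),
}


def tm(tm_state, wrk_sym):
    """Return (next TM state, new working symbol, head movement)."""
    if tm_state == BOTTOM:       # "no-state" is treated as the initial state
        tm_state = INITIAL
    return TABLE[(tm_state, wrk_sym)]
```

Because of this rule, a token that still carries $\bot$ after a reset starts the next simulation from the initial state without a separate initialization step for the state field.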
The program of the leader $P_1$: Upon arrival of a token (from $P_2$) statements L1
to L6 of Fig. 1 are executed sequentially. L1: The value of the received token is
stored in RTkn before it is modified. L2: This statement checks whether a counter
overflow occurs, then checks whether the computation is over and initiates a reset if
either happens. Following the examination for overflow, CntBits_1 is incremented by
1 (in case of overflow the result is 00). L3: This statement is executed when a head
mark of the TM is present in $P_1$. In such a case the transition function is computed using
the state of the TM as received with the token and the work tape symbol WrkSym_1.
When the next head movement is towards $P_2$, $P_1$ clears the indication of the presence
of the head and prepares Tkn.HdMrk to indicate the transition of the head to $P_2$. $P_1$
assigns Tkn.Rst := T when the next head movement causes the head to fall off
the working tape (head failure). L4: When an indication of head movement from $P_2$ to
$P_1$ is received, $P_1$ assigns HdMrk := T and clears the head transition indication in
Tkn.HdMrk. L5: A reset indication (due to L2, L3 or arrival of a token with Tkn.Rst = T)
triggers initialization of the work tape symbol, the head marker (at $P_1$), the counter bits
and the Turing machine state. If the last computation has terminated "normally" then
the output symbol is the result of the computation and the Turing machine state is the
initial state. Otherwise, the Turing machine state and the output symbol are set to $\bot$
to indicate "abnormal" termination. L6: $P_1$ sends the token to $P_2$.
Fig. 1. Algorithm for the leader $P_1$.
The program of the tail $P_n$: Upon arrival of a token (from $P_{n-1}$) statements T1 to
T7 of Fig. 2 are executed sequentially. T1: The value of the received token is stored
in RTkn before it is modified. T2: Similar to L3. The difference is in the case of head
failure: a right movement implies head failure. T3: Similar to L4. T4: When a token
arrives with Tkn.Rst = T the work tape symbol, head marker and counter bits are
initialized. In addition Tkn.Rst is assigned F to indicate the completion of the
reset. This last assignment is not executed when $P_n$ initiates the reset (in T2). In such
a case Tkn.Rst is sent to the leader, which in turn resets the entire system. T5: The
output is assigned according to Tkn.TMSta when the computation results in accept or
reject, or when a reset is initiated (i.e. Tkn.TMSta = $\bot$). T6: The counter is incremented
by 1. T7: The token is sent to $P_{n-1}$.
    ReceiveLead(Tkn)
    T1: RTkn := Tkn
    T2: if HdMrk_n = T then
          {Tkn.TMSta, WrkSym_n, HdMov_n := TM(Tkn.TMSta, WrkSym_n)}
        if HdMov_n = Left then
          {Tkn.HdMrk := T; HdMrk_n := F}
        if HdMov_n = Right then
          {Tkn.Rst := T; HdMrk_n := F}
    T3: if RTkn.HdMrk = T then
          {HdMrk_n := T; Tkn.HdMrk := F}
    T4: if Tkn.Rst = T then
          {WrkSym_n := I_n; HdMrk_n := F; CntBits_n := 00}
        if RTkn.Rst = T then
          {Tkn.Rst := F}
    T5: if Tkn.TMSta ∈ {accept, reject, ⊥} then
          {O_n := TM(Tkn.TMSta)}
    T6: if CntBits_n = 11 then
          {Tkn.Cr := 1}
        else {Tkn.Cr := 0}
        CntBits_n := CntBits_n + 1
    T7: SendLead(Tkn)
Fig. 2. Algorithm for the tail $P_n$.
The program of an intermediate processor $P_i$: Upon arrival of a token from $P_{i-1}$
statements I1 to I5 (Fig. 3) are executed sequentially. I1: The value of the received token
is stored in RTkn before it is modified. I2: Similar to L3 and T2, with no possibility of
head failure. I3: Similar to L4 and T3. I4: A token with Tkn.Rst = T causes initialization
of the working tape, head marker and counter bits. I5: The token is sent to $P_{i+1}$.
Upon arrival of a token from $P_{i+1}$, statements I6 to I11 (Fig. 4) are executed
sequentially. I6: The value of the received token is stored in RTkn before it is modified.
I7: The counter is incremented (when the token moves from the tail to the leader). I8:
The head of the Turing machine moves from $P_{i+1}$ to $P_i$. The assignment HdMov := Stay
makes sure that I9 is not executed. I9: The transition of the head of the TM to $P_{i-1}$
occurs when the token arrives from $P_{i+1}$. I10: Similar to T5. I11: The token is sent to
$P_{i-1}$.
    ReceiveLead(Tkn)
    I1: RTkn := Tkn
    I2: if HdMrk_i = T then
          {Tkn.TMSta, WrkSym_i, HdMov_i := TM(Tkn.TMSta, WrkSym_i)}
        if HdMov_i = Right then
          {Tkn.HdMrk := T; HdMrk_i := F}
    I3: if RTkn.HdMrk = T then
          {HdMrk_i := T; Tkn.HdMrk := F; HdMov_i := Stay}
    I4: if Tkn.Rst = T then
          {WrkSym_i := I_i; HdMrk_i := F; CntBits_i := 00}
    I5: SendTail(Tkn)
Fig. 3. Algorithm for intermediate $P_i$, token towards tail.
    ReceiveTail(Tkn)
    I6: RTkn := Tkn
    I7: if Tkn.Cr = 1 and CntBits_i = 11 then
          {Tkn.Cr := 1}
        else {Tkn.Cr := 0}
        CntBits_i := CntBits_i + Tkn.Cr
    I8: if RTkn.HdMrk = T then
          {HdMrk_i := T; Tkn.HdMrk := F; HdMov_i := Stay}
    I9: if HdMrk_i = T and HdMov_i = Left then
          {Tkn.HdMrk := T; HdMrk_i := F}
    I10: if Tkn.TMSta ∈ {accept, reject, ⊥} then
          {O_i := TM(Tkn.TMSta)}
    I11: SendLead(Tkn)
Fig. 4. Algorithm for intermediate $P_i$, token towards leader.
3.1. Correctness proof
The correctness hinges on the existence of a self-stabilizing mutual exclusion algorithm.
In particular, the coloring algorithm of [8] guarantees that in any fair execution,
after $O(n)$ rounds, a safe configuration for the mutual exclusion algorithm, $c_{me}$, is
reached such that, in any configuration that appears after $c_{me}$, there exists at most one
processor that executes the critical section. Moreover, following $c_{me}$ the processors
repeatedly execute the critical section in a fixed order, from the leader to the tail and
back; at least one transfer of the token from a processor to its neighbor is made in every
two successive rounds. Note that before the safe configuration $c_{me}$ is reached there
can be many tokens. In this period of time our algorithm does not operate correctly.
Thus, when the mutual-exclusion algorithm stabilizes, the other part of the algorithm
(that assumes the existence of a token that travels nicely from the leader to the tail
and back) is in an arbitrary state. For example, in such an arbitrary state more than one
processor can have HdMrk = T. We prove that this part of the algorithm stabilizes too.
Lemma 3.1. In every fair execution that starts with a safe configuration $c_{me}$ of the
mutual exclusion algorithm, the leader assigns Tkn.Rst := T at least once in every
$4n2^{2n}$ rounds.
Proof. Assume towards contradiction that the leader does not initiate a reset for $4n2^{2n}$
rounds. Let $c_1$ be the configuration at the beginning of those $4n2^{2n}$ rounds. $c_1$
follows $c_{me}$, thus during those rounds the single token repeatedly travels from the leader
to the tail and back; each such traversal takes no more than $4n$ rounds (i.e. two rounds for
each move). In each traversal from the tail to the leader the binary counter is incremented
by one. The value of the binary counter in $c_1$ is an arbitrary non-negative number
smaller than $2^{2n}$. Thus, if the leader does not initiate a reset during the $4n2^{2n}$
rounds following $c_1$, then an overflow of the counter occurs and triggers a reset
initialization. This contradiction proves the lemma. □
It is easy to see that following $c_{me}$, whenever the leader sends a token with
Tkn.Rst = T the token initializes the working tape bits, the counter bits, the place
of the TM's head, and the TM state.
Lemma 3.2. In any fair execution that starts with a safe configuration $c_{me}$ of the
mutual exclusion algorithm, after the leader assigns Tkn.Rst := T and sends the token,
the token travels to the tail and every processor that receives the token initializes its
variables.
Let $C$ be the number of states of a TM. Note that, by [10, p. 173] and [17], $C < 56$.
The next lemma is proved by a simple counting argument.
Lemma 3.3. The number of different TM configurations with a tape of size $n$ and
$\{0,1\}$ alphabet is at most $Cn^2 2^n$.
Proof. There are $n$ possibilities for the place of the first $\bot$ in the working tape and
at most $2^n$ possible working tape contents until the place of the first $\bot$. There are $n$
possibilities for the location of the Turing machine head and $C$ possibilities for the
current TM state. □
Define a reset initialization configuration, $c_{init}$, to be a configuration that follows
$c_{me}$ such that $c_{init}$ immediately follows an atomic step of the leader in which it
assigns Tkn.Rst := T. For every reset initialization configuration $c_{init}$ we define a reset
termination configuration, $c_{term}$, to be the first configuration after $c_{init}$ that
follows an atomic step of the leader in which the leader receives the token.
Lemma 3.4. In the second reset termination configuration after $c_{me}$, all the output
registers are identical. Moreover the outputs are 1 (0, respectively) iff the TM accepts
(rejects, respectively) $w$ with a working tape of size $n$. Otherwise, the output is $\bot$.
Proof. First note that if a TM is in a certain machine configuration more than once
before the TM accepts or rejects then the TM does not halt. This observation is
straightforward since the TM is deterministic and it repeats the same execution forever. Thus,
if the execution encodes more than $Cn^2 2^n$ configurations and the TM does not accept
or reject $w$ then the TM reached a certain configuration at least twice and it will never
halt. By Lemma 3.2 and by the algorithm, following the first reset termination configuration
that follows $c_{me}$ the distributed system simulates the TM computation and counts
the TM configurations. Thus, a reset is triggered when either (1) the TM reaches a
state in which it accepts or rejects $w$, or (2) the counter reaches $2^{2n} > Cn^2 2^n$ (for
$n > 14$), or (3) the head of the TM attempts to move to the left of the leader
or to the right of the tail. Hence, before the second reset termination configuration the
outputs of the processors are correctly set. □
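The inequality between the counter range and the configuration bound can be checked numerically. The Python sketch below uses $C = 56$, the upper bound on the number of states quoted in the text; the function name `counter_suffices` is an assumption of this illustration.

```python
C = 56  # upper bound on the number of TM states, as quoted in the text


def counter_suffices(n, c=C):
    """True when the counter range 2^(2n) exceeds the bound c * n^2 * 2^n."""
    return 2 ** (2 * n) > c * n * n * 2 ** n


# Smallest chain length for which the 2n-bit counter outlasts every
# possible TM configuration on n cells.
smallest = next(n for n in range(1, 100) if counter_suffices(n))
```

Under this bound the inequality first holds at $n = 14$ and continues to hold for larger $n$, since $2^n$ grows faster than $56n^2$. This is consistent with the condition on $n$ stated in the proof.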
Theorem 3.5. In every fair execution, after $O(n2^{2n})$ rounds, in every configuration
the output of each processor is accept (reject) if the TM accepts (rejects) $w$ with a
working tape of size $n$. Otherwise, the output is $\bot$.
Proof. By Lemma 2 of [8], $c_{me}$ is reached within $O(n)$ rounds. By Lemma 3.1, the
first $c_{term}$ appears during the first $4n2^{2n}$ rounds that follow $c_{me}$. By Lemma 3.4,
during an additional $4n2^{2n}$ rounds each processor receives a token either with Tkn.TMSta
that indicates acceptance or rejection or with Tkn.TMSta = $\bot$. Since any further computation
that begins with a later reset is identical, the output is not changed. □
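The three phases used in the proof add up as follows. This is only a restatement of the bounds already quoted from Lemma 2 of [8], Lemma 3.1 and Lemma 3.4, written out as one sum.

```latex
\[
\underbrace{O(n)}_{\text{reach } c_{me}}
\;+\; \underbrace{4n2^{2n}}_{\text{first } c_{term}\ \text{(Lemma 3.1)}}
\;+\; \underbrace{4n2^{2n}}_{\text{second } c_{term}\ \text{(Lemma 3.4)}}
\;=\; O\!\left(n2^{2n}\right)
\]
```

The linear stabilization time of the mutual exclusion layer is dominated by the two counter periods, so the counter range determines the overall bound.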
4. Concluding remarks 
In this paper we investigated the computational power of self-stabilizing systems 
with constant memory size processors. Interestingly, interconnected processors with a 
(small) constant amount of memory can tolerate transient faults and obtain the same 
result as a fault free execution of a TM. This implies that when there is an embedded 
ring with a leader in a system with constant memory size processors, the system copes 
with transient faults and still has the computational power of a TM with the same total 
amount of memory (up to a constant factor). 
The algorithm presented in Section 3 requires $O(n2^{2n})$ rounds to stabilize. The
algorithm can be accelerated by the use of an upper bound on the execution time of the
TM. When an upper bound $t(w)$ on the execution time of the TM (for an input word
$w$) is known, and can be given as a part of the input, then our algorithm can stabilize
within $O(n\,t(w))$ time. The use of a counter to reinitiate the system by counting the
steps in the execution and comparing with a given bound has been previously used
in, e.g., [3, 9, 16]. To accelerate the algorithm, we suggest implementing the counter
in a distributed fashion using constant memory per processor. The modified algorithm
will compare the input step bound with the step counter and give an indication to the
leader when the step counter exceeds the step bound.
Acknowledgements 
We would like to thank the editor Mike Paterson and the anonymous referees for 
their constructive comments. We also thank Alex Ribman for simulating the code. 
References 
[1] B. Awerbuch, S. Kutten, Y. Mansour, B. Patt-Shamir and G. Varghese, Time optimal self-stabilizing
synchronization, Proc. 25th Annual ACM Symp. on Theory of Computing (1993) 652-661.
[2] B. Awerbuch and R. Ostrovsky, Memory-efficient and self-stabilizing network reset, Proc. 13th Annual
ACM Symp. on Principles of Distributed Computing (1994) 254-263.
[3] B. Awerbuch and G. Varghese, Distributed program checking: a paradigm for building self-stabilizing
distributed protocols, Proc. 32nd IEEE Symp. on Foundations of Computer Science (1991) 258-267.
[4] E.W. Dijkstra, Self-stabilizing systems in spite of distributed control, Comm. ACM 17 (11) (1974)
643-644.
[5] S. Dolev, Optimal time self-stabilization in dynamic systems, Proc. 7th Internat. Workshop on
Distributed Algorithms (1993) 160-173.
[6] S. Dolev, A. Israeli and S. Moran, Self stabilization of dynamic systems assuming only read/write
atomicity, Proc. Ninth Annual ACM Symp. on Principles of Distributed Computation, Montreal
(1990) 103-117; Distrib. Comput. 7 (1993) 3-16.
[7] S. Dolev, A. Israeli and S. Moran, Resource bounds for self-stabilizing message driven protocols, Proc.
Tenth Annual ACM Symp. on Principles of Distributed Computation, Montreal (1991) 281-294;
a journal version to appear in SIAM J. Comput.
[8] S. Dolev, A. Israeli and S. Moran, Uniform dynamic self-stabilizing leader election, Proc. 5th Internat.
Workshop on Distributed Algorithms, Delphi (1991) 163-180; Part of the results appears in IEEE
Trans. Software Eng. 21 (5) (1995) 429-439 and TR 94-039, Dept. of Computer Science, Texas
A&M Univ.
[9] M.G. Gouda, R.R. Howell and L.E. Rosier, The instability of self-stabilization, Acta Inform. 27 (1990)
697-724.
[10] J.E. Hopcroft and J.D. Ullman, Introduction to Automata Theory, Languages and Computation
(Addison-Wesley, Reading, MA, 1979).
[11] A. Israeli and M. Jalfon, Token management schemes and random walks yield self stabilizing mutual
exclusion, Proc. Ninth Annual ACM Symp. on Principles of Distributed Computation, Montreal
(1990) 119-130.
[12] G. Itkis and L. Levin, Fast and lean self-stabilizing asynchronous protocol, Proc. 36th Annual IEEE
Symp. on Foundations of Computer Science (1994) 226-239.
[13] R. Karp and V. Ramachandran, A survey of parallel algorithms for shared memory machines, Technical
Report UCB/CSD 88/408, Computer Science Division, Univ. California, 1988; also in Handbook of
Theoretical Computer Science, J. van Leeuwen, ed. (Elsevier, Amsterdam, 1990) 869-941.
[14] L. Lamport, The mutual exclusion problem: Part II - Statement and solutions, J. ACM 33 (1986)
327-348.
[15] A. Mayer, Y. Ofek, R. Ostrovsky and M. Yung, Self-stabilizing symmetry breaking in constant-space,
Proc. 24th ACM Conf. on Theory of Computing (1992) 667-678.
[16] M. Schneider, Self-stabilizing real-time decision systems, Responsive Computer Systems: Steps Toward
Fault-Tolerant Real-Time Systems (Kluwer Academic, Dordrecht, 1995).
[17] C.E. Shannon, A universal Turing machine with two internal states, Automata Studies (Princeton Univ.
Press, Princeton, NJ, 1956) 157-167.
