Generalized Fair Reachability Analysis for Cyclic Protocols with 
Nondeterminism and Internal Transitions by Liu, Hong & Miller, Raymond E.
Department of Computer Science
University of Maryland at College Park
College Park, MD 20742

Abstract

In this paper, we extend the generalized fair reachability notion to cyclic protocols with nondeterminism and internal transitions. By properly incorporating internal transitions into the formulation of fair progress vectors, we prove that most of the results established for cyclic protocols without nondeterminism and internal transitions still hold even if nondeterminism and internal transitions are allowed. We identify indefiniteness as a new type of logical error resulting from reachable internal execution cycles and show that indefiniteness can also be detected for the class of cyclic protocols with finite fair reachable state spaces with finite extensions.

1 Introduction

It is well known that state explosion is one of the major obstacles to validating complex protocols modeled as communicating finite state machines. As a result, many techniques have been proposed to tackle this problem (please refer to [10] for a survey). It is observed that in most cases, significant state reduction can be achieved if one can eliminate as much redundancy as possible by limiting the amount of interleaving of equivalent execution sequences during state exploration. However, care must be taken to ensure that the reduced state space still maintains competitive, if not the same, logical error detecting capability as the original reachable state space.

Fair reachability analysis was originally proposed as one such improved state exploration technique for protocols with two machines [9, 6]. By forcing the two machines in a protocol to make progress at the same time, whenever possible, only fair progress states are generated during state exploration.
If the fair reachable state space of a protocol is finite, detection of deadlock and unspecified reception is decidable within the fair reachable state space [9], while unboundedness detection is decidable with finite extension on the fair reachable state space [6].

(Research reported in this paper was supported by NASA Grant No. NAG 5-2648.)

In [7, 8], we generalized the fair reachability notion to cyclic protocols with $n \geq 2$ machines, where each machine is deterministic but partially defined and does not have internal transitions. We showed that for a cyclic protocol $P$, its fair reachable state space $F$ is exactly the set of reachable states with equal channel length and $F$ is finite if and only if (iff for short) $P$ is not "simultaneously unbounded". Moreover, we proved that for $\mathcal{P}$, the class of cyclic protocols whose $F$'s are finite, deadlock detection is decidable within $F$ [7], while detection of other logical errors, such as unspecified reception, unboundedness, and nonexecutable transition, is decidable via finite extension of $F$ [8]. We also showed that for any $P \in \mathcal{P}$, $P$ is logically correct iff its $F$ does not contain any logical errors [8]. As a result, for class $\mathcal{P}$, our generalized fair reachability analysis technique not only achieves substantial state reduction, but also maintains very competitive fault coverage. Therefore, it is a very useful technique for the analysis of a wide variety of cyclic protocols.

In this paper, we extend our generalized fair reachability notion to the analysis of cyclic protocols where a process in a protocol can be nondeterministic but partially defined and can have internal transitions. By incorporating internal transitions into the formulation of fair progress vectors, we are able to show that all the aforementioned results for cyclic protocols without nondeterminism and internal transitions still hold for cyclic protocols with nondeterminism and internal transitions.
Moreover, we observe that the inclusion of internal transitions into the model results in a new type of logical error called "indefiniteness", meaning that a protocol could reach a state from which one of the processes could indefinitely execute internal transitions without communicating with other processes in the protocol. However, we will show that indefiniteness can also be detected for the class of cyclic protocols whose $F$'s are finite, although sometimes finite extension of $F$ is necessary. Therefore, our generalized fair reachability technique works equally well for the analysis of cyclic protocols with nondeterminism and internal transitions.

In [2, 3], Cacciari and Rafiq proposed a technique called reduced reachability analysis that can handle internal transitions for protocols with two machines. However, unlike fair reachability analysis, two machines can proceed at the same time only if the "parallelwise" condition is satisfied. They showed that detection of deadlock and unspecified reception is decidable for a protocol with a finite reduced reachable state space. Since their approach is closely related to ours, we will defer the comparison of these two methods to Section 5, after our method is presented.

The rest of the paper is organized as follows. In the following section, the communicating finite state machine model is introduced. In Section 3, we extend the generalized fair reachability notion to cyclic protocols with nondeterminism and internal transitions and study the basic properties of fair reachable state space. We study the logical error detection
capability of fair reachable state space in Section 4, where we show how finite extension can be performed on a finite fair reachable space so that logical errors other than deadlock, including indefiniteness, can be detected effectively and efficiently. We conclude the paper with future work in Section 6.

Due to space limitations, lemmas and theorems in this paper are stated without proofs. Please refer to the full paper for details.

2 The CFSM Model

Notation:
(1) We use $\cdot$ to denote concatenation. Given a set $M$, $M^*$ denotes its reflexive and transitive closure under concatenation, $|M|$ denotes its cardinality, and $2^M$ denotes the power set of $M$. For $Y \in M^*$, $|Y|$ denotes its length. $\epsilon$ denotes the empty string, $|\epsilon| = 0$.
(2) Given $n$, for any $1 \leq i \leq n$, $0 \leq j < n$: $i \oplus j = i + j$ if $i + j \leq n$, else $i \oplus j = (i + j) \bmod n$; $i \ominus j = i - j$ if $i > j$, else $i \ominus j = i - j + n$, where mod stands for the modulo operation.
(3) An interval $[i..j]$ is an ordered set of at most $n$ consecutive integers $i, i \oplus 1, \ldots, i \oplus k = j$, where $(1 \leq i \leq n) \wedge (0 \leq k < n)$. The corresponding (unordered) set is denoted as $\{i..j\}$. Let $[i'..j']$ and $[i..j]$ be two intervals; $[i'..j'] \subseteq [i..j]$ iff $\{i'..j'\} \subseteq \{i..j\}$. Unless specified as $[1..n]$, we assume $|[i..j]| < n$.
(4) We designate $n$ as the number of processes in a protocol. Unless otherwise specified, we assume $n \geq 2$ and let $i, j$ range over $[1..n]$.

In the communicating finite state machine (CFSM) model, a protocol is specified as a set of $n$ processes $P = (P_1, P_2, \ldots, P_n)$, where each process $P_i$ is a finite state machine that can communicate with other processes via FIFO channels. For each $P_i$, $S_i$ denotes the set of local states in $P_i$. The initial local state of $P_i$ is denoted as $s_i^0$. A channel from $P_i$ to $P_j$, $i \neq j$, is denoted as $C_{ij}$. The set of messages that $P_i$ can send to $P_j$ is denoted as $M_{ij}$. The content of $C_{ij}$, denoted as $c_{ij}$, is a sequence of messages sent from $P_i$ to $P_j$. When $C_{ij}$ is empty, $c_{ij} = \epsilon$.

Let $\tilde{M}_i = (\bigcup_{j \neq i} \{-m \mid m \in M_{ij}\}) \cup (\bigcup_{j \neq i} \{+m \mid m \in M_{ji}\})$.
$\delta$ denotes the partially defined transition function: $\bigcup_{i=1}^{n} (S_i \times \tilde{M}_i \to 2^{S_i})$. For each $P_i$, a transition defined at local state $s_i \in S_i$ is denoted as $\delta(s_i, \alpha)$, where $\alpha \in \tilde{M}_i$. It is a sending (receiving) transition if $\alpha = -m$ ($\alpha = +m$). As a convention, we use $\tau' = \delta(s_i, \alpha)$ to give a name $\tau'$ for this transition, and use $s_i' \in \delta(s_i, \alpha)$ to mean that $s_i'$ is a local state resulting from the execution of the transition. Similarly, $\lambda$ denotes the partially defined internal transition function: $\bigcup_{i=1}^{n} (S_i \to 2^{S_i})$. For each $P_i$, an internal transition defined at $s_i$ is denoted as $\lambda(s_i)$. We also use $s_i' \in \lambda(s_i)$ to mean that $s_i'$ is a local state resulting from the execution of $\lambda$ at $s_i$. By definition, each $P_i$ is nondeterministic but partially defined.

A transition cycle in $P_i$, denoted as $C_i$, is a cycle in the transition graph of $P_i$. It is an internal cycle if each transition in the cycle is an internal transition. It is a sending (receiving) cycle in $P_i$ if it is not an internal cycle and each transition in the cycle is either a sending (receiving) transition or an internal transition. $s_i$ is a sending (receiving) local
state iff all transitions defined in $s_i$ are sending (receiving) transitions.

A protocol $P = (P_1, P_2, \ldots, P_n)$ is cyclic iff each $P_i$ has exactly one input channel $C_{i \ominus 1, i}$ and exactly one output channel $C_{i, i \oplus 1}$. From now on, we are dealing with cyclic protocols. For results established later in this paper, it should be clear that they apply to cyclic protocols only. For ease of reference, we call a cyclic protocol $P$ a D-cyclic protocol if each $P_i$ in $P$ is deterministic and does not contain any internal transitions.

For a cyclic protocol $P = (P_1, P_2, \ldots, P_n)$, a global state (state for short) $S$ is represented as a $2n$-tuple $(s_1, s_2, \ldots, s_n, c_{n,1}, c_{1,2}, \ldots, c_{n-1,n})$, where $s_i$ is the local state of $P_i$, and $c_{i \ominus 1, i}$ is the content of channel $C_{i \ominus 1, i}$. In particular, the initial state $S^0$ is denoted as $(s_1^0, s_2^0, \ldots, s_n^0, \epsilon, \epsilon, \ldots, \epsilon)$. $S$ is of equal channel length iff all channel contents in $S$ are of the same length. For convenience, we use $s_i \in S$ to denote that $s_i$ is a local state of $S$, and $(m, s_i) \in S$ to denote that $s_i \in S$ and $m$ is at the head of channel $C_{i \ominus 1, i}$ in $S$. $S$ is in a sending cycle iff there is a local state $s_i \in S$ that is in a sending cycle of $P_i$. As a convention, we use capital letters $S, X$ to denote a state and small letter $s_i$ to denote a local state of $P_i$.

The reachability relation among states is formulated as follows. Given two states $S = (s_1, s_2, \ldots, s_n, c_{n,1}, c_{1,2}, \ldots, c_{n-1,n})$ and $S' = (s_1', s_2', \ldots, s_n', c'_{n,1}, c'_{1,2}, \ldots, c'_{n-1,n})$. $S'$ is directly reachable from $S$, denoted as $S \mapsto S'$, iff $\exists i \in [1..n]$ such that the elements of $S'$ can be derived from $S$ by executing one of the following transitions: (1) $s_i' \in \delta(s_i, -m)$ and $c'_{i, i \oplus 1} = c_{i, i \oplus 1} \cdot m$. (2) $s_i' \in \delta(s_i, +m)$ and $c_{i \ominus 1, i} = m \cdot c'_{i \ominus 1, i}$. (3) $s_i' \in \lambda(s_i)$. Except for the elements affected by the one transition applied, all other elements of $S'$ remain the same as those in $S$.

Denote $\mapsto^*$ as the reflexive, transitive closure of $\mapsto$. $S'$ is reachable from $S$ iff $S \mapsto^* S'$. When $S = S^0$, we say $S'$ is a reachable state.
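As an illustration of the direct reachability relation, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical encoding: processes are numbered from 0 rather than 1, channel `k` carries messages from process `k` to process `(k + 1) % n`, `delta` maps `(i, s, (sign, m))` to a set of target local states, and `lam` maps `(i, s)` to a set of target local states. The bound `max_len` is also an assumption, added only so that the exploration terminates.

```python
from collections import deque

def successors(state, n, delta, lam):
    """Yield every state directly reachable from `state` by one transition."""
    locs, chans = state
    for (i, s, (sign, m)), targets in delta.items():
        if locs[i] != s:
            continue
        # Channel k carries messages from process k to process (k + 1) % n.
        k = i if sign == '-' else (i - 1) % n
        if sign == '-':                  # case (1): append m to the output channel
            new_chan = chans[k] + (m,)
        elif chans[k][:1] == (m,):       # case (2): m is at the head of the input channel
            new_chan = chans[k][1:]
        else:
            continue                     # receive is not executable in this state
        for t in targets:                # nondeterminism: one successor per target
            yield (locs[:i] + (t,) + locs[i + 1:],
                   chans[:k] + (new_chan,) + chans[k + 1:])
    for (i, s), targets in lam.items():  # case (3): internal transition
        if locs[i] == s:
            for t in targets:
                yield (locs[:i] + (t,) + locs[i + 1:], chans)

def reachable(n, init_locals, delta, lam, max_len):
    """Breadth-first exploration, pruned at channel length `max_len`."""
    start = (tuple(init_locals), ((),) * n)
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in successors(queue.popleft(), n, delta, lam):
            if nxt not in seen and all(len(c) <= max_len for c in nxt[1]):
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical two-process protocol: process 0 sends x and then waits for y;
# process 1 receives x, takes one internal step, and then sends y.
DELTA = {
    (0, 'a', ('-', 'x')): {'b'},
    (0, 'b', ('+', 'y')): {'a'},
    (1, 'c', ('+', 'x')): {'d'},
    (1, 'e', ('-', 'y')): {'c'},
}
LAM = {(1, 'd'): {'e'}}
R = reachable(2, ('a', 'c'), DELTA, LAM, max_len=2)
print(len(R))  # prints 5
```

Representing states as nested tuples keeps them hashable, so the visited set doubles as the (bounded) reachable state space.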
The set of reachable states in $P$ is denoted as $R$, called the reachable state space of $P$. A local state $s_i'$ is reachable (from $S$) iff there is a state $S'$ such that $S^0 \mapsto^* S'$ ($S \mapsto^* S'$) and $s_i' \in S'$. $(m, s_i')$ is reachable (from $S$) iff there is a state $S'$ such that $S^0 \mapsto^* S'$ ($S \mapsto^* S'$) and $(m, s_i') \in S'$. A cycle $C_i$ in $P_i$ is reachable (from $S$) if one of the local states in $C_i$ is reachable (from $S$).

Suppose $S^0 \mapsto^* S'$ ($S \mapsto^* S'$). An execution sequence of $S'$ (from $S$ to $S'$), denoted as $e = \{e_1, e_2, \ldots, e_n\}$, is a sequence $X^0 \xrightarrow{\tau^1} X^1 \xrightarrow{\tau^2} \cdots \xrightarrow{\tau^k} X^k$, $k \geq 0$, such that $X^0 = S^0$ ($X^0 = S$), $X^k = S'$, and $\forall l : 1 \leq l \leq k$, $X^{l-1} \mapsto X^l$ via transition $\tau^l$, where each $e_i$ is the corresponding (possibly empty) transition sequence in $P_i$. $\{e_1, e_2, \ldots, e_n\}$ is called a local execution sequence set of $S'$ (from $S$ to $S'$). The length of $e$, denoted as $|e|$, is defined as the number of transitions in $e$, i.e., $|e| = k \geq 0$. An execution cycle of $S$ is a nonempty execution sequence from $S$ to $S$, denoted as $C = \{C_1, C_2, \ldots, C_n\}$, where each $C_i$ is the corresponding (possibly empty, not necessarily elementary) transition cycle in $P_i$. $\{C_1, C_2, \ldots, C_n\}$ is called a local execution cycle set of $S$. By definition, at least one $C_i$ is not empty. $C$ is an internal execution cycle of $S$ iff each nonempty $C_i$ is an internal cycle in $P_i$.

For protocol validation, we check $R$ against common errors such as deadlock, unspecified reception, nonexecutable transition, and unboundedness. (For definitions of these concepts,
please refer to [1].) The inclusion of internal transitions in the model introduces a new type of logical error resulting from internal execution cycles. By definition, $P$ has an internal execution cycle iff one of the processes in $P$ has a reachable internal cycle. A reachable state $S$ is an indefinite state iff there is an $s_i \in S$ that is in an internal cycle of $P_i$, meaning that from $S$ on, there is a process that could loop indefinitely through its internal cycle without communicating with its neighbors. A protocol $P$ is indefinite iff it has an indefinite state. It should be clear that indefiniteness is also a syntactic error that can be checked in the same way as the other aforementioned errors by inspecting each reachable state during state exploration. Deadlock, unspecified reception, nonexecutable transition, unboundedness, and indefiniteness are called logical errors. $P$ is logically correct iff $R$ is free of logical errors. It can be shown that none of the logical errors is decidable for cyclic protocols in general using the results established in [1].

3 Generalized Fair Reachability Analysis

Fair reachability was generalized to D-cyclic protocols with $n \geq 2$ machines in [7, 8]. In this section, we first show how the fair reachability notion for D-cyclic protocols can be extended to cope with nondeterminism and internal transitions for general cyclic protocols. Then we show that the fair reachable state space still maintains the equal channel length property and satisfies the same necessary and sufficient condition for being finite. For the sake of space, we expand on the modified part of the formulation and are brief on the part that is unchanged. Please refer to [7, 8] for a complete treatment. For conciseness, we use "fair reachability" for "generalized fair reachability" from now on.

Given a cyclic protocol $P = (P_1, P_2, \ldots, P_n)$, let $S = (s_1, s_2, \ldots, s_n, c_{n,1}, c_{1,2}, \ldots, c_{n-1,n})$ be a state of $P$. Denote $E_i^-$ as the set of sending transitions defined at $s_i$.
Define $E_i^+ = \{\delta(s_i, +m)\}$ if $(m, s_i) \in S$; $E_i^+ = \emptyset$ otherwise. Let $E_i = E_i^- \cup E_i^+$. Then $E_i$ is the set of executable transitions at $s_i$ in $S$. We also define $E_i^{++}$ as the set of enabled transitions at $s_i$ in $S$. ($\delta(s_i, \alpha)$ is enabled iff $\alpha = +m$, $c_{i \ominus 1, i} = \epsilon$, and $\delta(s_{i \ominus 1}, -m)$ is defined.)

Denote $\bot$ as the null transition, indicating no state change in a process. Let $TV = \{\vec{t} = (t_1, t_2, \ldots, t_n)\}$ such that (i) $\forall i \in [1..n] : t_i \in E_i \cup E_i^{++}$ if $E_i \cup E_i^{++} \neq \emptyset$; $t_i = \bot$ otherwise, and (ii) $\forall i \in [1..n]$, if $t_i = \delta(s_i, +m) \in E_i^{++}$, then $t_{i \ominus 1} = \delta(s_{i \ominus 1}, -m) \in E_{i \ominus 1}^-$. From $TV$, we can compute two sets $V_c$ and $V_s$ whose elements are in the form of $\vec{v} = (v_1, v_2, \ldots, v_n)$ and $\vec{v} \in TV$. For each $\vec{v} \in V_c$, either $\forall i \in [1..n] : v_i \in E_i^-$ or $\forall i \in [1..n] : v_i \in E_i^+$. In this case, $\vec{v}$ is called a concurrency vector in $S$. For each $\vec{v} \in V_s$, there is at least one send-receive pair $(v_i, v_{i \oplus 1})$ in which $v_i \in E_i^-$ and $v_{i \oplus 1} \in E_{i \oplus 1}^+ \cup E_{i \oplus 1}^{++}$. On the other hand, if $v_i$ is not in a send-receive pair, then $v_i = \bot$. In this case, $\vec{v}$ is called a synchronization vector in $S$. Loosely speaking, in a concurrency vector, either all processes are sending or all processes are receiving; in a synchronization vector, some processes are grouped into send-receive pairs while the rest do not progress at all. For details, please refer to [7, 8].
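To make the concurrency vector definition concrete, here is a small Python sketch under stated assumptions; it is not taken from [7, 8]. It covers only the sets of sending and executable receiving transitions and the all-send or all-receive vectors; synchronization vectors are deliberately left out. The encoding is hypothetical: processes are numbered from 0, channel `k` runs from process `k` to process `(k + 1) % n`, and a transition is named by its `delta` key `(i, s, (sign, m))`. The sketch also assumes that a receive counts as executable only when it is defined at the local state, which the text leaves implicit.

```python
from itertools import product

def sending(i, s, delta):
    """Names of the sending transitions defined at local state s of process i."""
    return [key for key in delta
            if key[0] == i and key[1] == s and key[2][0] == '-']

def receiving(i, s, chan, delta):
    """The receive of the head message of the input channel, if there is one."""
    key = (i, s, ('+', chan[0])) if chan else None
    # Assumption: the receive must be defined at s to count as executable.
    return [key] if key in delta else []

def concurrency_vectors(state, n, delta):
    """All vectors in which every process sends, or every process receives."""
    locs, chans = state
    sends = [sending(i, locs[i], delta) for i in range(n)]
    recvs = [receiving(i, locs[i], chans[(i - 1) % n], delta) for i in range(n)]
    # product() yields nothing as soon as one process has no candidate,
    # which matches the "all processes" requirement of the definition.
    return list(product(*sends)) + list(product(*recvs))

# Hypothetical example: process 0 may send x or z from 'a'; process 1 sends y from 'c'.
DELTA = {
    (0, 'a', ('-', 'x')): {'b'},
    (0, 'a', ('-', 'z')): {'b'},
    (1, 'c', ('-', 'y')): {'d'},
    (0, 'b', ('+', 'y')): {'a'},
    (1, 'd', ('+', 'x')): {'c'},
}
initial = (('a', 'c'), ((), ()))
after_send = (('b', 'd'), (('x',), ('y',)))
print(len(concurrency_vectors(initial, 2, DELTA)))     # prints 2
print(len(concurrency_vectors(after_send, 2, DELTA)))  # prints 1
```

Each element of a returned vector is a transition name; because the processes may be nondeterministic, the possible resulting local states are the members of `DELTA[name]`.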
To cope with internal transitions, we let $E_i^\lambda = \{\lambda(s_i)\}$ if $\lambda$ is defined at $s_i$; $E_i^\lambda = \emptyset$ otherwise. For each $E_i^\lambda \neq \emptyset$, we construct an internal vector $\vec{v}$ in $S$ as follows: set $v_i = \lambda(s_i)$ and set $v_j = \bot$ for each $j \neq i$. Denote $V_\lambda$ as the set of internal vectors in $S$. Let $V = V_c \cup V_s \cup V_\lambda$. Each $\vec{v} \in V$ is called a fair progress vector in $S$.

The fair reachability relation is defined as follows. Given two states $S = (s_1, s_2, \ldots, s_n, c_{n,1}, c_{1,2}, \ldots, c_{n-1,n})$ and $S' = (s_1', s_2', \ldots, s_n', c'_{n,1}, c'_{1,2}, \ldots, c'_{n-1,n})$, $S \mapsto_f S'$ iff $\exists \vec{v} \in V(S)$ that leads the system from $S$ to $S'$. There are four cases to consider:

(1) $\vec{v} \in V_s(S)$. For each send-receive pair $(v_i, v_{i \oplus 1})$, $i \in [1..n]$, there are two subcases to consider: (a) $c_{i, i \oplus 1} = \epsilon$. Let $v_i = \delta(s_i, -m)$ and $v_{i \oplus 1} = \delta(s_{i \oplus 1}, +m)$. Execution of $(v_i, v_{i \oplus 1})$ will cause transition $\delta(s_i, -m)$ to be taken, followed by transition $\delta(s_{i \oplus 1}, +m)$, where $s_i' \in \delta(s_i, -m)$ and $s'_{i \oplus 1} \in \delta(s_{i \oplus 1}, +m)$. (b) $c_{i, i \oplus 1} \neq \epsilon$. Let $v_i = \delta(s_i, -m)$, $v_{i \oplus 1} = \delta(s_{i \oplus 1}, +m')$, and $c_{i, i \oplus 1} = m' \cdot c''_{i, i \oplus 1}$. Execution of $(v_i, v_{i \oplus 1})$ will cause transitions $\delta(s_i, -m)$ and $\delta(s_{i \oplus 1}, +m')$ to be taken in arbitrary order, where $s_i' \in \delta(s_i, -m)$, $s'_{i \oplus 1} \in \delta(s_{i \oplus 1}, +m')$, and $c'_{i, i \oplus 1} = c''_{i, i \oplus 1} \cdot m$. Except for the elements affected by the transitions applied in each of the send-receive pairs, all other elements of $S'$ remain the same as those in $S$.

(2) $\vec{v} \in V_c(S) \wedge (\forall i \in [1..n] : v_i = \delta(s_i, -m_i) \in E_i^-)$. The result of applying $\vec{v}$ on $S$ is such that $\forall i \in [1..n] : s_i' \in \delta(s_i, -m_i)$ and $c'_{i, i \oplus 1} = c_{i, i \oplus 1} \cdot m_i$.

(3) $\vec{v} \in V_c(S) \wedge (\forall i \in [1..n] : v_i = \delta(s_i, +m_i) \in E_i^+)$. Assume that before applying $\vec{v}$, $\forall i \in [1..n] : c_{i \ominus 1, i} = m_i \cdot c''_{i \ominus 1, i}$. The result of applying $\vec{v}$ on $S$ is such that $\forall i \in [1..n] : s_i' \in \delta(s_i, +m_i)$ and $c'_{i \ominus 1, i} = c''_{i \ominus 1, i}$.

(4) $\vec{v} \in V_\lambda(S)$. Suppose $v_i = \lambda(s_i)$. The result of applying $\vec{v}$ on $S$ is such that $\forall j \in [1..n] : s_j' \in \lambda(s_j)$ if $i = j$; $s_j' = s_j$ otherwise.

Denote $\mapsto_f^*$ as the reflexive, transitive closure of $\mapsto_f$. $S'$ is fair reachable from $S$ iff $S \mapsto_f^* S'$. When $S = S^0$, $S'$ is fair reachable.
We can also define fair reachability (from $S$) for $s_i'$, $(m, s_i')$, and cycle $C_i$, respectively. The set of fair reachable states, denoted as $F$, is called the fair reachable state space of $P$.

Note that $\mapsto_f$ is defined in the same way as that for D-cyclic protocols [7, 8] except for two modifications: First, internal vectors are added into $V$ during fair progress state exploration. Second, due to nondeterminism, the resulting local state $s_i'$ of a transition $\delta(s_i, \alpha)$ ($\lambda(s_i)$) is written as $s_i' \in \delta(s_i, \alpha)$ ($s_i' \in \lambda(s_i)$) instead of $s_i' = \delta(s_i, \alpha)$ ($s_i' = \lambda(s_i)$).

Since $S^0 \in F$ is a state with equal channel length of zero and any fair progress vector preserves the equal channel length property in the resulting state, it is not difficult to show by induction that each fair reachable state is a reachable state with equal channel length. Conversely, suppose $S$ is a reachable state with equal channel length. Let $\{e_1, e_2, \ldots, e_n\}$ be a local execution sequence set from $S^0$ to $S$. We construct a partial fair execution sequence for $S$ with respect to (w.r.t. for short) $\{e_1, e_2, \ldots, e_n\}$, denoted as $pfs = X^0 \xrightarrow{\vec{v}^1} X^1 \xrightarrow{\vec{v}^2} \cdots \xrightarrow{\vec{v}^k} X^k$, such that $k \geq 0$, $X^0 = S^0$, $\forall\, 0 < l \leq k : X^{l-1} \mapsto_f X^l$ via fair progress vector $\vec{v}^l$, and no
fair progress vector can be derived from $\{e_1, e_2, \ldots, e_n\}$ in state $X^k$. $X^k$ is called the fair precursor of $S$ w.r.t. $\{e_1, e_2, \ldots, e_n\}$, denoted as $fp = (s_1^p, s_2^p, \ldots, s_n^p, c^p_{n,1}, c^p_{1,2}, \ldots, c^p_{n-1,n})$. It is not difficult to show that $fp$ is unique w.r.t. $\{e_1, e_2, \ldots, e_n\}$, although $pfs$ is not always unique. Note that $fp \in F$. If $fp = S$, then we are done. Suppose $fp \neq S$, then the following lemma holds for $fp$.

Lemma 3.1 Let $fp$ be the fair precursor for a reachable state $S$ w.r.t. $\{e_1, e_2, \ldots, e_n\}$. If $fp \neq S$, then the following statements are true in $fp$: (1) $\exists k \in [1..n] : |e_k| \neq 0$. (2) $\exists k \in [1..n] : |e_k| = 0$. (3) If $|e_k| \neq 0$, let $\tau_k^p$ be the transition from $e_k$ at $s_k^p$, then $\tau_k^p$ is executable and $\tau_k^p \neq \lambda(s_k^p)$. (4) $fp \mapsto^* S$ via the remaining transitions from $\{e_1, e_2, \ldots, e_n\}$ in $fp$.

Based on this lemma, we can show that each reachable state with equal channel length is fair reachable.

Theorem 3.1 $F$ is exactly the set of reachable states with equal channel length.

We are primarily interested in cyclic protocols whose fair reachable state spaces are finite. In [7], we show that for a D-cyclic protocol, $F$ is finite iff it is not "simultaneously unbounded". (A cyclic protocol $P$ is simultaneously unbounded iff $\forall K \geq 0\ \exists K' > K$ such that there is a state $S \in R$ where each channel has length no less than $K'$.) By similar arguments, we can also show that the same necessary and sufficient condition holds for cyclic protocols in general, although it is also undecidable.

Lemma 3.2 Given a cyclic protocol $P$ without reachable sending cycles. If $P$ is unbounded, then $P$ is simultaneously unbounded.

Lemma 3.3 If a cyclic protocol $P$ is simultaneously unbounded, then its $F$ is infinite.

Theorem 3.2 Given a cyclic protocol $P$ with a finite $F$. $P$ is unbounded iff it has a reachable sending cycle.

Theorem 3.3 Given a cyclic protocol $P$.
$F$ is finite iff $P$ is not simultaneously unbounded.

Theorem 3.4 It is undecidable whether a cyclic protocol $P$ has a finite $F$.

In [8], we use $\mathcal{P}$ to denote the class of D-cyclic protocols whose $F$'s are finite. In this paper, we use $\mathcal{Q}$ to denote the class of cyclic protocols whose $F$'s are finite. From the preceding discussion, we know that $\mathcal{Q}$ maintains the same equal channel length property and membership function as $\mathcal{P}$. From now on, we will restrict our study to class $\mathcal{Q}$. In the rest of the paper, unless otherwise stated explicitly, when we mention a cyclic protocol $P$, we mean $P \in \mathcal{Q}$; when we mention $F$, we mean that it is finite.
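The characterization in Theorem 3.1 can be illustrated with a small Python sketch. This is not the fair progress exploration itself: it enumerates the ordinary reachable states of a bounded protocol and then keeps those of equal channel length, which by Theorem 3.1 is the set $F$. The encoding is hypothetical (processes numbered from 0, channel `k` from process `k` to process `(k + 1) % n`, `delta` and `lam` as dictionaries), and the parameter `bound` is an assumption that must be at least the true channel bound of the protocol for the result to be complete.

```python
from collections import deque

def step(state, n, delta, lam):
    """Return all states directly reachable from `state`."""
    locs, chans = state
    out = []
    for (i, s, (sign, m)), targets in delta.items():
        if locs[i] != s:
            continue
        k = i if sign == '-' else (i - 1) % n   # output or input channel of process i
        if sign == '-':
            new_chan = chans[k] + (m,)
        elif chans[k][:1] == (m,):
            new_chan = chans[k][1:]
        else:
            continue
        for t in targets:
            out.append((locs[:i] + (t,) + locs[i + 1:],
                        chans[:k] + (new_chan,) + chans[k + 1:]))
    for (i, s), targets in lam.items():
        if locs[i] == s:
            out.extend((locs[:i] + (t,) + locs[i + 1:], chans) for t in targets)
    return out

def reachable(n, init_locals, delta, lam, bound):
    """Reachable states whose channels never exceed `bound` messages."""
    start = (tuple(init_locals), ((),) * n)
    seen, todo = {start}, deque([start])
    while todo:
        for nxt in step(todo.popleft(), n, delta, lam):
            if nxt not in seen and max(map(len, nxt[1])) <= bound:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def equal_channel_length(states):
    """Keep the states in which all channels hold the same number of messages."""
    return {s for s in states if len(set(map(len, s[1]))) == 1}

# Hypothetical protocol: both processes send first and then receive.
DELTA = {
    (0, 'a', ('-', 'x')): {'b'},
    (0, 'b', ('+', 'y')): {'a'},
    (1, 'c', ('-', 'y')): {'d'},
    (1, 'd', ('+', 'x')): {'c'},
}
R = reachable(2, ('a', 'c'), DELTA, {}, bound=2)
F = equal_channel_length(R)
print(len(R), len(F))  # prints 8 2
```

In this example the two equal-length states are the initial state and the state reached after both processes have sent, which is what one all-send concurrency vector followed by one all-receive concurrency vector would visit.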
4 Fault Coverage of $F$

As with states in $R$, we define logical errors for states in $F$ in a similar way. Given a state $S \in F$. $S$ is a fair deadlock state if it is a deadlock state. $S$ is a fair unspecified reception state iff there is a receiving local state $s_i \in S$ such that (i) $c_{i \ominus 1, i} = m \cdot c'_{i \ominus 1, i}$ and $\delta(s_i, +m)$ is not defined, or (ii) $c_{i \ominus 1, i} = \epsilon$, $\delta(s_{i \ominus 1}, -m)$ is defined, and $\delta(s_i, +m)$ is not defined. $S$ is a fair unbounded state iff it is in a sending cycle. $S$ is a fair indefinite state iff it is in an internal cycle. The set of fair deadlock (unspecified reception, unbounded, indefinite) states is denoted as $F_{dl}$ ($F_{ur}$, $F_{ub}$, $F_{id}$). By Theorem 3.2, a fair unbounded state is well-defined (recall that we assume $F$ is finite). Note that a fair unspecified reception state is not an unspecified reception state when $c_{i \ominus 1, i} = \epsilon$. Moreover, there might be "dead end" states in $F$ whose $V = \emptyset$. However, as for D-cyclic protocols, it can be shown that the notion of fair unspecified reception is sufficient for detecting unspecified receptions in $F$ and the occurrence of dead end states does not introduce new types of logical errors in $F$ [8].

Let's study the logical error detection capability of $F$. First, notice that all deadlock states are of equal channel length zero.
By Theorem 3.1, we have the following result on deadlock detection:

Theorem 4.1 Deadlock detection is decidable for $\mathcal{Q}$.

However, as for D-cyclic protocols, it is not difficult to see that for detection of logical errors other than deadlock, $F$ is not sufficient, and thus finite extension of $F$ is needed. Following the same formulation as [8], we reduce the detection of logical errors other than deadlock in $\mathcal{Q}$ to two local state reachability problems as follows:

P-I Given a local state $s_i$, decide whether $s_i$ is reachable.
P-II Given a local state $s_i$ and a message $m \in M_{i \ominus 1, i}$, decide whether $(m, s_i)$ is reachable.

It should be clear that for $\mathcal{Q}$, if we can solve P-I, then we can solve unboundedness and indefiniteness detection; if we can solve P-II, then we can solve unspecified reception detection; if we can solve both P-I and P-II, then we can solve detection of nonexecutable transitions. Although neither P-I nor P-II is decidable in general (using the results established in [1]), we will show that both of them are decidable for $\mathcal{Q}$ via finite extension of $F$. The line of reasoning is almost identical to that for showing them decidable for $\mathcal{P}$ in [8]. As a result, we will be quite informal in the arguments we make and only highlight the differences along the way. Interested readers should consult [8] for details.

As we have already seen, the need for finite extension in $F$ results from the fact that some of the reachable local states are not fair reachable. Therefore, the purpose of finite extension is to uncover those local states. Suppose $s_k$ is reachable but not fair reachable. Then none of the reachable states containing $s_k$ is in $F$. Let $S$ be any reachable state with $s_k \in S$, and $\{e_1, e_2, \ldots, e_n\}$ be a local execution sequence set for $S$. Let $pfs$ and $fp$ be a partial fair execution sequence and the fair precursor for $S$ w.r.t. $\{e_1, e_2, \ldots, e_n\}$, respectively.
By Lemma 3.1, we can find a maximal interval $[i..k]$ in $fp$ such that $\forall j \in [i..k] : |e_j| \neq 0$. Moreover, let $\tau_j^p$ be the transition from $e_j$ at $s_j^p$, then $\tau_j^p \neq \lambda(s_j^p)$ and is executable in $fp$.

Starting from $fp$, we construct the set of states fair reachable from $fp$ as follows: In each such state $S'$, each fair progress vector $\vec{v}$ is computed as usual except that $v_j$ must take on the transition from $e_j$ if $(j \in [i..k]) \wedge (|e_j| \neq 0)$. Without loss of generality, let's assume that none of the $e_j$'s becomes empty during the construction. Let $F^{min}_{[i..k]}$ be the set of states from the construction whose sum of the remaining transitions in $\{e_i, e_{i \oplus 1}, \ldots, e_k\}$ is minimum. Note that if $S' \in F^{min}_{[i..k]}$ and $S''$ is fair reachable from $S'$ by the construction, then $S'' \in F^{min}_{[i..k]}$. More importantly, $S''$ is fair reachable from $S'$ without progress in $[i..k]$. Let $\vec{u}_{[i..k]} = (u_i, u_{i \oplus 1}, \ldots, u_k)$ be the transition vector associated with a state $S' \in F^{min}_{[i..k]}$, then $\vec{u}_{[i..k]}$ is a proper incompatible transition vector (pitv) in $S$. ($\vec{u}_{[i..k]}$ is a pitv in $S$ iff it satisfies the following four conditions: (1) Each transition in the vector is executable in $S$ and is not an internal transition. (2) $S$ does not have a concurrency vector. (3) There is no send-receive pair in $\vec{u}_{[i..k]}$. (4) Neither $u_i$ nor $u_k$ appears in a send-receive pair in any synchronization vector in $S$.) In fact, $\vec{u}_{[i..k]}$ is the same for any $S'$ in $F^{min}_{[i..k]}$, and thus is a persistent proper incompatible transition vector (ppitv) in $S$. (A pitv $\vec{u}_{[i..k]}$ is persistent in $S$ iff it is also a pitv in any state fair reachable from $S$ without progress in $[i..k]$.) Denote $U_{[i..k]}$ ($W_{[i..k]}$) as the set of pitv's (ppitv's) in $S$. Notice that although the preceding discussion is based on reachability of $s_k$, it also applies to the reachability of $(m, s_k)$.
To sum up, we have the following lemma:

Lemma 4.1 A local state $s_k$ ($(m, s_k)$) is reachable but not fair reachable only if there is a state $S' \in F$ such that $W_{[i..j]} \neq \emptyset$ in $S'$, $k \in [i..j]$, and $s_k$ ($(m, s_k)$) is reachable from $S'$.

Therefore, the extension of $F$ should be based on the set of states in $F$ whose $W_{[i..j]} \neq \emptyset$ for some interval $[i..j]$. To reduce the cost of extension, we want to compute an extension set $F_T \subseteq F$ such that its membership can be easily decided and there is a state $S \in F$ whose $W_{[i..j]} \neq \emptyset$ only if $F_T \neq \emptyset$.

Let's see how $F_T$ can be computed. Given a state $S \in F$ whose $W_{[i..j]} \neq \emptyset$. We construct a graph $FRG_{[i..j]}$ where $S$ is the initial node and each node in the graph stands for a state fair reachable from $S$ without progress in $[i..j]$. Then we construct the quotient graph $QFRG_{[i..j]}$ such that each node is a strongly connected component (SCC) in $FRG_{[i..j]}$, denoted as $[S']$, where $S'$ is a state in that SCC. Then $QFRG_{[i..j]}$ is a directed acyclic graph (DAG). The initial node is denoted as $[S]$. Let $TN$ be the set of terminal nodes in $QFRG_{[i..j]}$. By definition of ppitv, it is clear that $W_{[i..j]}(S') = W_{[i..j]}(S'')$ if $[S'] = [S'']$. Denote $W_{[i..j]}([S'])$ as the set of ppitv's in any state in $[S']$. Then it is also obvious that $W_{[i..j]}(S) = \bigcap_{[S'] \in TN} W_{[i..j]}([S'])$. Since we assume $W_{[i..j]}(S) \neq \emptyset$, it follows that $\forall [S'] \in TN : W_{[i..j]}([S']) \neq \emptyset$. As a result, we only need to focus on those nodes in $TN$. Given a node $[S'] \in TN$, there are two cases to consider: (1) $[S']$ contains only one state $S'$ but no outgoing edges. (2) There is a fair execution cycle (i.e., a cycle in the corresponding
SCC in $FRG_{[i..j]}$) among states in $[S']$. Unlike D-cyclic protocols, a fair execution cycle might consist of internal transitions only. In any case, the following lemma shows that $[S']$ contains some error state. Note that $W_{[i..j]} \subseteq U_{[i..j]}$ for any $S$ and $[i..j]$.

Lemma 4.2 Given $S \in F$ and an interval $[i..j]$. If $U_{[i..j]} \neq \emptyset$ in $S$ and $S$ does not have any fair progress vector without progress in $[i..j]$, then $S$ is a fair unspecified reception state. If $S$ is in a fair execution cycle without progress in $[i..j]$, then $S$ is either a fair unbounded state or a fair indefinite state.

Let $F_T = F_{ur} \cup F_{ub} \cup F_{id}$. Clearly, $F_T$ can be easily computed during the construction of $F$. From Lemmas 4.1 and 4.2, $F_T$ is exactly the extension set we want. Thus to solve both P-I and P-II for $\mathcal{Q}$, we only need to finitely extend those states in $F_T$.

Given a state $S \in F_T$. An interval $[i..j]$ is an incompatible interval in $S$ if $U_{[i..j]}(S) \neq \emptyset$. Let $I_J$ be the set of incompatible intervals in $S$, then $(I_J, \subseteq)$ is a partially ordered set. Denote $I_J^m$ as the set of maximal elements in $(I_J, \subseteq)$. Our finite extension procedure is based on the finite extension of part of a state $S$ indexed by each $[i..j] \in I_J^m$ of $S$ for each $S \in F_T$. Similar to the approach used for D-cyclic protocols, we can show that such extension can be done in a finite way so that both P-I and P-II are solvable for $S$. The formulation of the reachability relation among partial states is the same as that in [8] except for two modifications, as were pointed out in the formulation of the fair reachability relation in Section 3. For details, please refer to [8].

Theorem 4.2 Both P-I and P-II are decidable for $\mathcal{Q}$. Therefore, detection of unspecified reception, unboundedness, nonexecutable transition, and indefiniteness is decidable for $\mathcal{Q}$.

During the process, we have also found a fault coverage characterization for $F$ similar to that for $F$ of a D-cyclic protocol [8].
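The quotient-graph step used above to compute the extension set (collapse the graph into strongly connected components, then inspect the terminal components) can be sketched in Python as follows. This is a generic illustration using Tarjan's algorithm, not the procedure of [8]. The input graph is assumed to be given as a dictionary from node to successor list; constructing it from the "fair reachable without progress in $[i..j]$" relation is not shown. All names are hypothetical.

```python
def sccs(graph):
    """Tarjan's algorithm: return the strongly connected components as frozensets."""
    index, low, on_stack, stack, comps = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:           # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for v in graph:
        if v not in index:
            visit(v)
    return comps

def terminal_nodes(graph):
    """Return (component, has_cycle) for every component with no outgoing edge."""
    result = []
    for comp in sccs(graph):
        succ = {w for v in comp for w in graph.get(v, ())}
        if succ <= comp:                 # no edge leaves the component
            # Inside a terminal component, any edge at all lies on a cycle;
            # no edge means a single state without successors (case 1).
            result.append((comp, bool(succ)))
    return result

# Hypothetical graph: S leads to a two-state cycle {A, B} and to a dead end D.
G = {'S': ['A', 'D'], 'A': ['B'], 'B': ['A'], 'D': []}
for comp, has_cycle in terminal_nodes(G):
    print(sorted(comp), has_cycle)
```

The recursive `visit` is fine for small graphs; for large state spaces an iterative formulation would be needed to stay within Python's recursion limit.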
The only difference is that we need to take fair indefinite states into account.

Theorem 4.3 Given a cyclic protocol $P \in \mathcal{Q}$. $P$ has a deadlock iff $F_{dl} \neq \emptyset$. $P$ has an unspecified reception but $F_{ur} = \emptyset$ only if $F_{ub} \cup F_{id} \neq \emptyset$. $P$ is unbounded but $F_{ub} = \emptyset$ only if $F_{ur} \cup F_{id} \neq \emptyset$. $P$ is indefinite but $F_{id} = \emptyset$ only if $F_{ur} \cup F_{ub} \neq \emptyset$. $P$ has a nonexecutable transition that is not detectable via $F$ only if $F_{ur} \cup F_{ub} \cup F_{id} \neq \emptyset$. $P$ is logically correct iff $F$ does not contain any logical errors.

As a result, $F$ not only offers substantial state reduction over $R$ but also is very competitive in fault coverage. Furthermore, the decision procedures can be optimized for efficiency in a similar way as for $\mathcal{P}$. We refer the interested readers to [8] for details.

A final remark on indefiniteness is in order here. Given a reachable internal execution cycle $C = \{C_1, C_2, \ldots, C_n\}$, there is a nonempty $C_i$ that is an internal cycle of $P_i$. A special
case is when each $C_i$ is an internal cycle of $P_i$. Suppose $C$ is such an execution cycle of $S$, then once $P$ enters state $S$, each process could execute internal transitions indefinitely without any communication occurring in $P$. In this case, we say $P$ is globally indefinite. If we define a function that marks each normal transition in $P_i$ as "progress" and marks each internal transition in $P_i$ as "nonprogress", then global indefiniteness becomes a special case of livelock defined in [4, 5, 7]. In [7], we generalized the results of [4, 5] from $n = 2$ to $n \geq 2$ and showed that livelock is decidable within $F$ for class $\mathcal{P}$. With the fair reachability notion formulated in Section 3 of this paper, it is not difficult to show that livelock is also decidable for $\mathcal{Q}$ within $F$. As a result, global indefiniteness is also decidable for $\mathcal{Q}$ without finite extension on $F$.

5 Discussion

Through this study, it is clear that the adaptation to nondeterminism in the model can be done without much modification in the original formulation. The inclusion of internal transitions, on the other hand, has several implications: First, internal transitions must be incorporated into the formulation of fair progress vectors. In order to ensure that each reachable state with equal channel length is also fair reachable, each internal transition in a state must be executed individually, which leads us to treat normal transitions and internal transitions in a state separately when computing the set of fair progress vectors in that state. Second, the clean separation between normal and internal transitions allows us to adapt our approach in the augmented model with little effort. However, allowing each process to execute only one (internal) transition at a time is inconsistent with the general philosophy of fair progress state exploration.
As a result, more redundancy is introduced in the state exploration process due to the interleaving of equivalent execution sequences. Third, the existence of a reachable internal cycle in a process results in a new type of logical error called indefiniteness. Since the set of fair indefinite states is a subset of the extension set, finite extension is in general more costly than the one in [8] for D-cyclic protocols.

A closer look at indefiniteness leads to the following generalization: A cyclic protocol $P$ is $k$-indefinite iff $P$ has a reachable internal execution cycle $C = \{C_1, C_2, \ldots, C_n\}$ in which there are at least $k \in [1..n]$ nonempty $C_i$'s in $C$. Clearly, indefiniteness and global indefiniteness represent the two ends of the $k$-indefiniteness spectrum and we know that both of them are decidable for $\mathcal{Q}$. We can also generalize the livelock notion in [4, 5, 7] to $k$-livelock for cyclic protocols in a similar way. We already know that $n$-livelock is decidable for $\mathcal{Q}$. Moreover, P-I being decidable for $\mathcal{Q}$ implies that 1-livelock is also decidable for $\mathcal{Q}$. Now the question is whether $k$-indefiniteness, or more generally $k$-livelock, can be decided effectively and efficiently based on $F$ for cyclic protocols in $\mathcal{Q}$ when $n > 2$ and $1 < k < n$. We are currently working on this issue.

While the notion of internal transitions has been used extensively in other models such
as the labeled transition systems (LTS for short) model, the study of this notion in the CFSM model has been limited. The reduced reachability analysis approach by Cacciari and Rafiq [2, 3] seems to be most closely related to the present work in that reduced progress is quite similar to fair progress and internal transitions are allowed in their model. However, there are several differences between their approach and ours. First, reduced reachability analysis was proposed for protocols with $n = 2$ machines. It remains to be shown whether this technique can be generalized to protocols with $n \ge 2$ machines. In addition, they assumed no internal cycles in the protocols. Although it seems that their technique can also handle internal cycles, it is not clear why such a restriction was made in their formulation. Second, the "parallelwise" condition imposed on reduced transitions implies that it is not always possible for the two machines to proceed simultaneously. Thus not every reduced reachable state is of equal channel length. This, we feel, makes it more difficult to find a (sufficient) condition for the class of protocols with finite reduced reachable state spaces. Third, since each channel is empty in the initial state and only one machine is allowed to proceed when both channels are empty in reduced state exploration, the reduced reachable state space properly includes the fair reachable state space if the protocol has more than one reachable state, as is the case for most protocols. For $n = 2$, although their approach can detect deadlocks and unspecified receptions without extending the reduced reachable state space, fair reachability analysis can accomplish the same task without finite extension and generates far fewer states for most protocols.
Even if their technique can be generalized to cyclic protocols with $n \ge 2$ machines and can still detect unspecified receptions within the reduced reachable state space, it is not clear whether the saving in finite extension outweighs the cost of generating more states. Besides, it remains to be seen whether detection of unboundedness and nonexecutable transitions can be done using their approach. To sum up, compared with reduced reachability analysis, our approach has the advantages that it can be applied to a much larger class of protocols, can detect more types of logical errors in a protocol, and is quite efficient in terms of both space and time.

6 Conclusion

In this paper, we extended the generalized fair reachability analysis technique to cyclic protocols with nondeterminism and internal transitions. We showed that most of the results established for cyclic protocols without nondeterminism and internal transitions in [7, 8] can be carried over to cyclic protocols with nondeterminism and internal transitions. We identified indefiniteness as a new type of logical error and showed that its detection is also decidable for Q via finite extension of the fair reachable state space. As a result, our technique works equally well for the class of cyclic protocols with finite fair reachable state spaces even if nondeterminism and internal transitions are allowed.

As for future work, we are going to address the following issues: (1) k-indefiniteness and
k-livelock; (2) more general and yet regular protocol communication topologies; and (3) other formal models such as the extended finite state machines model.

References

[1] D. Brand and P. Zafiropulo, "On Communicating Finite-State Machines," Journal of ACM, Vol. 30, No. 2, April 1983, pp. 323-342.
[2] L. Cacciari and O. Rafiq, "On Improving Reduced Reachability Analysis," FORTE'92, Perros-Guirec, France, October 13-16, 1992, M. Diaz and R. Groz (Ed.), 1992, pp. 137-152.
[3] L. Cacciari and O. Rafiq, "Decidability Issues in Reduced Reachability Analysis," ICNP'93, San Francisco, CA, October 19-22, 1993, pp. 158-165.
[4] M.G. Gouda, C.H. Chow, and S.S. Lam, "Livelock Detection in Networks of Communicating Finite State Machines," Technical Report, TR-84-10, Dept. of Computer Science, Univ. of Texas at Austin, April 1984.
[5] M.G. Gouda, C.H. Chow, and S.S. Lam, "On the Decidability of Livelock Detection in Networks of Communicating Finite State Machines," PSTV'85, Y. Yemini, R. Strom, and S. Yemini (Ed.), 1985, pp. 47-56.
[6] M.G. Gouda and J.Y. Han, "Protocol Validation by Fair Progress State Exploration," Computer Networks and ISDN Systems, Vol. 9, 1985, pp. 353-361.
[7] H. Liu and R.E. Miller, "Generalized Fair Reachability Analysis for Cyclic Protocols: Part 1," PSTV'94, S.T. Vuong (Ed.), Vancouver, B.C., Canada, June 1994, pp. 258-273.
[8] H. Liu and R.E. Miller, "Generalized Fair Reachability Analysis for Cyclic Protocols: Decidability for Logical Correctness Problems," ICNP'94, Boston, Massachusetts, October 25-28, pp. 100-107.
[9] J. Rubin and C.H. West, "An Improved Protocol Validation Technique," Computer Networks and ISDN Systems, Vol. 6, 1982, pp. 65-73.
[10] D. Sidhu, A. Chung, and T.P. Blumer, "Experience with Formal Methods in Protocol Development," ACM SIGCOMM, Computer Communication Review, Vol. 21, No. 2, April 1991, pp. 81-101.
Appendix: Proofs of Lemmas and Theorems

Lemma 3.1 Let fp be the fair precursor for a reachable state S w.r.t. $\{e_1, e_2, \ldots, e_n\}$. If $fp \ne S$, then the following statements are true in fp: (1) $\exists k \in [1..n] : |e_k| \ne 0$. (2) $\exists k \in [1..n] : |e_k| = 0$. (3) If $|e_k| \ne 0$, let $\tau^p_k$ be the transition from $e_k$ at $s^p_k$; then $\tau^p_k$ is executable and is not an internal transition. (4) $fp \mapsto S$ via the remaining transitions from $\{e_1, e_2, \ldots, e_n\}$ in fp.

Proof: $\{e_1, e_2, \ldots, e_n\}$ being a local execution sequence set for S implies that there is no deadlock or unspecified reception during the execution of the transitions in $\{e_1, e_2, \ldots, e_n\}$ from $S_0$ to S. Since $fp \ne S$, some transitions in $\{e_1, e_2, \ldots, e_n\}$ must remain to be executed in fp, i.e., $\exists k \in [1..n] : |e_k| \ne 0$. Thus, (1) holds. Suppose $|e_k| \ne 0$. Then $\tau^p_k$ cannot be an internal transition, since otherwise an internal vector could be found in fp. If $\tau^p_k$ is a sending transition, then it is executable. Hence, $\tau^p_k$ is not executable only if it is a receiving transition and channel $c^p_{k \ominus 1\, k}$ is empty; otherwise there would be an unspecified reception along $\{e_1, e_2, \ldots, e_n\}$. In this case, there must be at least one $\tau^p_i$, $|e_i| \ne 0$, that is a sending transition; otherwise the protocol cannot proceed beyond fp to reach S. As a result, a send-receive pair can be derived from the transitions in fp, which contradicts the assumption that no fair progress vector can be derived from fp based on the remaining transitions in $\{e_1, e_2, \ldots, e_n\}$. Therefore, if $|e_k| \ne 0$, then $\tau^p_k$ must be executable, i.e., (3) holds. Suppose now that $\forall i \in [1..n] : |e_i| \ne 0$. Then each $\tau^p_i$ is executable. As a result, either a concurrency vector or a synchronization vector can be derived from $\vec{t} = (\tau^p_1, \tau^p_2, \ldots, \tau^p_n)$, which also contradicts the assumption that no fair progress vector can be derived from fp based on the remaining transitions in $\{e_1, e_2, \ldots, e_n\}$. Thus, $\exists k \in [1..n] : |e_k| = 0$, i.e., (2) holds.
Finally, by induction on the number of remaining transitions in $\{e_1, e_2, \ldots, e_n\}$ from fp to S, it is obvious that S is reachable from fp via those remaining transitions, i.e., (4) holds.

Theorem 3.1 F is exactly the set of reachable states with equal channel length.

Proof: We need to show that S is fair reachable iff it is a reachable state with equal channel length.

(Only If:) Suppose S is fair reachable. Then S is reachable. Let fs be a fair execution sequence for S. Denote $fs = X_0 \xrightarrow{\vec{v}_1} X_1 \xrightarrow{\vec{v}_2} \cdots \xrightarrow{\vec{v}_k} X_k$, $k \ge 0$, where $X_0 = S_0$, $\forall j \in [1..k] : X_{j-1} \mapsto_f X_j$ via fair progress vector $\vec{v}_j$, and $X_k = S$. We claim that S is of equal channel length by induction on k.

Basis: $k = 0$. In this case, $S = S_0$. The claim holds trivially.

Induction: Suppose the claim holds for $k = k' \ge 0$. We want to show that it holds for $k = k' + 1$. Note that $X_{k-1}$ is fair reachable via a fair execution sequence of length $k'$. By the induction hypothesis, $X_{k-1}$ is of equal channel length. Now, $X_{k-1} \mapsto_f S$ via fair progress
vector $\vec{v}_k$. If $\vec{v}_k$ is a concurrency vector, then it will either increase each channel length by one or decrease each channel length by one when applied to $X_{k-1}$. If $\vec{v}_k$ is a synchronization vector or an internal vector, then it will not change the length of any channel when applied to $X_{k-1}$. Hence, S is also of equal channel length. The claim holds for $k = k' + 1$. Therefore, S is a reachable state with equal channel length.

(If:) Suppose S is a reachable state with equal channel length $K \ge 0$. We want to show that S is fair reachable. Let $\{e_1, e_2, \ldots, e_n\}$ be a local execution sequence set for S and fp be the fair precursor of S w.r.t. $\{e_1, e_2, \ldots, e_n\}$. Then fp is fair reachable. From the preceding argument, fp is of equal channel length. Let $K'$ be the channel length in fp. Suppose, for the sake of contradiction, that $fp \ne S$. Let [i..j] be an interval in fp such that $\forall k \in [i..j] : |e_k| \ne 0$ and $|e_{i \ominus 1}| = |e_{j \oplus 1}| = 0$. By Lemma 3.1, such an interval exists. Moreover, $\forall k \in [i..j]$ : $\tau^p_k$ is not an internal transition and is executable in fp. Note that in this case, either $\tau^p_i$ is a receiving transition or $\tau^p_j$ is a sending transition. Otherwise, a send-receive pair can be derived from $(\tau^p_i, \tau^p_{i \oplus 1}, \ldots, \tau^p_j)$, which contradicts the assumption that no fair progress vector can be derived from fp. There are three cases to consider:

(1) $K' < K$. Note that the length of channel $C_{i \ominus 1\, i}$ cannot be increased, because $P_{i \ominus 1}$ has no remaining transitions. By the time the protocol gets to S, the length of channel $C_{i \ominus 1\, i}$ will be less than K.
(2) $K' > K$. Note that the length of channel $C_{j\, j \oplus 1}$ cannot be decreased, because $P_{j \oplus 1}$ has no remaining transitions. By the time the protocol gets to S, the length of channel $C_{j\, j \oplus 1}$ will be greater than K.
(3) $K' = K$. There are two subcases to consider:
  (a) $\tau^p_i$ is a receiving transition. Then after the execution of $\tau^p_i$, the length of channel $C_{i \ominus 1\, i}$ will be $K - 1$. Note that the length of channel $C_{i \ominus 1\, i}$ cannot be increased. By the time the protocol gets to S, the length of channel $C_{i \ominus 1\, i}$ will be no greater than $K - 1$.
  (b) $\tau^p_j$ is a sending transition.
  Then after the execution of $\tau^p_j$, the length of channel $C_{j\, j \oplus 1}$ will be $K + 1$. Note that the length of channel $C_{j\, j \oplus 1}$ cannot be decreased. By the time the protocol gets to S, the length of channel $C_{j\, j \oplus 1}$ will be no less than $K + 1$.

In all cases, there will be a channel whose length is not K when the protocol gets to S, which contradicts the assumption that S is of equal channel length K. Hence $fp = S$, i.e., S is fair reachable.

Lemma 3.2 Given a cyclic protocol P without reachable sending cycles, if P is unbounded, then P is simultaneously unbounded.

Proof: Since P is unbounded, P has at least one unbounded channel. Without loss of generality, suppose channel $C_{12}$ is unbounded.

Since $C_{12}$ is unbounded, there must exist an infinite execution sequence $e = \{e_1, e_2, \ldots, e_n\}$ such that for any $K \ge 0$, there is a state reachable via a prefix of e such that $|c_{12}| > K$. Moreover, since each process $P_i$ has no reachable sending cycles, each $e_i$ is composed of
innitely many sends and receives, and there can only be at most jSij   1 consecutivereceives before a send in ei, where jSij is the number of states in Pi. As a result, theremust be at least one such execution sequence along which P can proceed indenitely, i.e.,no unspecied reception can occur along this sequence, otherwise C12 will be bounded. Fixe = fe1; e2;: : :; eng as such an execution sequence.Dene a function f : [0::n  1]! N , N being the set of natural numbers, as follows:f(i) = ( 1 if i = 01+ jSn	(i	1)j  f(i	 1) if 0 < i < nBased on the preceding argument, for any K  0, there is a state S = (s1; s2; : : : ; sn; cn1,c12; : : : ; cn 1n) reachable via a prex of e such that jc12j = f(n 	 1) K 0, where K 0 > K.If all other channels have more than K messages, we are done. Suppose not, starting fromS, in the order from P2 to Pn, each process Pi; i 2 [2::n], can receive jSij  f(n 	 i) K0messages from channel Ci	1i, and as a result, send at least f(n 	 j) messages to channelCii1. In the end, the protocol must arrive at a reachable state such that each channelshould have at least K0 messages. Therefore, there is a reachable global state in which eachchannel length is greater than K, i.e., P is simultaneously unbounded.Lemma 3.3 If a cyclic protocol P is simultaneously unbounded, then its F is innite.Proof: We rst show the following: if there is a reachable state S = (s1; s2; : : : ; sn; cn1,c12; : : : ; cn 1n) such that 8 i 2 [1::n] : jcii1j  K for some constant K  0, then there existsa fair reachable state S 0 = (s01; s02; : : :; s0n; c0n1; c012; : : :; c0n 1n) such that 8 i 2 [1::n] : jc0ii1j  K.First, if S 2 F, then let S 0 = S, we are done. Second, if K = 0, then let S 0 = S0,and we are done. Now suppose S 62 F and K > 0. Let e = fe1; e2;: : :; eng be anexecution sequence for S. Based on fe1; e2;: : :; eng, we construct the partial fairexecution sequence for S to get to fp, the fair precursor of S. 
    Clearly, $fp \in F$ and is of equal channel length by Theorem 3.1. Suppose fp is of channel length $K'$. If $K' \ge K$, then let $S' = fp$, and we are done. Suppose not. By Lemma 3.1, $\exists k \in [1..n] : |e_k| = 0$. Note that from state fp on, the length of channel $C_{k\, k \oplus 1}$ cannot be increased by the execution of the remaining transitions in e by other processes. Therefore, at the end of the execution of e, i.e., in state S, the length of channel $C_{k\, k \oplus 1}$ will be less than K, which contradicts the fact that every channel length in S is no less than K. Hence, fp must have channel length no less than K. In summary, we can find a fair reachable state whose channel length is no less than K.

Now since P is simultaneously unbounded, $\forall K \ge 0$, there exists a $K' > K$ such that there
is a reachable state S in which each channel length is no less than $K'$. As a result, there is a fair reachable state $S'$ whose channel length is no less than $K'$. Therefore, F is infinite.

Theorem 3.2 Given a cyclic protocol P with a finite F, P is unbounded iff it has a reachable sending cycle.

Proof: Obviously, if P has a reachable sending cycle, then P is unbounded. Suppose P is unbounded but does not have a reachable sending cycle. Then by Lemma 3.2, P is simultaneously unbounded. By Lemma 3.3, F is infinite. A contradiction.

Theorem 3.3 Given a cyclic protocol P, F is finite iff P is not simultaneously unbounded.

Proof: Suppose F is infinite. Then $F = \bigcup_{k=0}^{\infty} F_k$ is infinite. Thus, $\forall K \ge 0\ \exists K' > K : F_{K'} \ne \emptyset$. Since any state in F is of equal channel length, P is simultaneously unbounded. On the other hand, by Lemma 3.3, if P is simultaneously unbounded, then F is infinite.

Theorem 3.4 It is undecidable whether a cyclic protocol P has a finite F.

Proof: Since it is undecidable whether a D-cyclic protocol has a finite fair reachable state space [7], it follows that it is also undecidable whether a (general) cyclic protocol P has a finite F.

Lemma 4.2 Given $S \in F$ and an interval [i..j], if $U_{[i..j]} \ne \emptyset$ in S and S does not have any fair progress vector without progress in [i..j], then S is a fair unspecified reception state. If S is in a fair execution cycle without progress in [i..j], then S is either a fair unbounded state or a fair indeterminate state.

Proof: By definition, $U_{[i..j]} \ne \emptyset$ in S implies that $\forall k \in [i..j] : E_k \ne \emptyset$ in S. Thus S is not a deadlock state. Denote by $\overline{[i..j]}$ the complement interval of [i..j] w.r.t. [1..n]. Suppose S is not a fair unspecified reception state. Then $\forall k \in [i..j] : E_k \cup E^{++}_k \cup E_k \ne \emptyset$. As a result, a fair progress vector can be derived from each $\vec{t} \in TV$. Let $\vec{u}_{[i..j]}$ be a pitv in S. Let $\vec{t}$ be a pseudo transition vector in TV such that $\forall k \in [i..j] : u_k = t_k$. Then a fair progress vector $\vec{v}$ can be derived from $\vec{t}$, and $\vec{v}$ has no progress component at any $k \in [i..j]$.
Hence $\vec{v}$ is a fair progress vector in S without progress in [i..j]. A contradiction. Therefore, S is a fair unspecified reception state.

Now suppose S is in a fair execution cycle fc without progress in [i..j]. Let $\{C_1, C_2, \ldots, C_n\}$ be the corresponding local execution cycle set of S. Then $\forall k \in [i..j] : C_k$ is empty. Since fc is not empty, there must be a nonempty interval [h..l] in S such that $\{i..j\} \cap \{h..l\} = \emptyset$, $\forall k \in [h..l] : C_k$ is nonempty, and $C_{h \ominus 1}$ and $C_{l \oplus 1}$ are empty. If there is a $C_k$, $k \in [h..l]$, that
is an internal cycle of $P_k$, then S is an indeterminate state. Otherwise, we claim that $C_h$ is a sending cycle in $P_h$. Suppose not. Then there is at least one receiving transition in $C_h$. Assume S is of channel length K. Then going through fc once will decrease the length of channel $C_{h \ominus 1\, h}$ by one. On the other hand, $P_{h \ominus 1}$ is idle during the execution of fc. As a result, executing fc once will not lead the system back to S, contradicting the assumption that fc is a fair execution cycle. Therefore, $C_h$ must be a sending cycle in $P_h$, i.e., S must be a fair unbounded state.

Theorem 4.2 Both P-I and P-II are decidable for Q. Therefore, detection of unspecified reception, unboundedness, nonexecutable transition, and indeterminacy are all decidable for Q.

Proof: The proof requires the formulation of partial state reachability and the finite extension construction of partial states, both of which are omitted in this paper due to space limitations. Please refer to [8] for details.

Theorem 4.3 Given a cyclic protocol $P \in Q$. P has a deadlock iff $F_{dl} \ne \emptyset$. P has an unspecified reception but $F_{ur} = \emptyset$ only if $F_{ub} \cup F_{id} \ne \emptyset$. P is unbounded but $F_{ub} = \emptyset$ only if $F_{ur} \cup F_{id} \ne \emptyset$. P is indeterminate but $F_{id} = \emptyset$ only if $F_{ur} \cup F_{ub} \ne \emptyset$. P has a nonexecutable transition that is not detectable via F only if $F_{ur} \cup F_{ub} \cup F_{id} \ne \emptyset$. P is logically correct iff F does not contain any logical errors.

Proof: The deadlock case is obvious from Theorem 3.1. Suppose P has an unspecified reception but $F_{ur} = \emptyset$. Then there is a reachable state S such that $(m, s_k) \in S$, $s_k$ is a local receiving state, and no transition is defined for $(s_k, +m)$. Since $F_{ur} = \emptyset$, $(m, s_k)$ is reachable but not fair reachable. By Lemma 4.1 and Lemma 4.2, $F_T \ne \emptyset$. Since $F_T = F_{ur} \cup F_{ub} \cup F_{id}$, we must have $F_{ub} \cup F_{id} \ne \emptyset$. The proofs for unboundedness, indeterminacy, and nonexecutable transition can be carried out in a similar way.

Now suppose P is logically correct. Then there are no reachable error states in F. Conversely, if F is free of logical errors, then $F_T = \emptyset$.
P cannot have a deadlock since all deadlock states are included in F. Based on the discussion in the preceding paragraph, P cannot have any other logical errors either, since otherwise we would have $F_T \ne \emptyset$. Hence, P is logically correct.
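
The counting argument in the proof of Lemma 3.2 can be checked numerically. The following is a minimal sketch, not part of the original development: the function names `f_table` and `guaranteed_channel_lengths` are hypothetical, the circled-minus operator is read as ordinary subtraction (which is all the proof needs for the index ranges involved), and messages already sitting in channels other than $C_{12}$ are ignored, so the returned values are only lower bounds.

```python
from typing import List


def f_table(state_counts: List[int]) -> List[int]:
    """Return [f(0), ..., f(n-1)], where state_counts[i-1] = |S_i| for i = 1..n."""
    n = len(state_counts)
    f = [1]  # f(0) = 1
    for i in range(1, n):
        proc = n - (i - 1)  # process index n (-) (i (-) 1), which lies in 2..n
        f.append(1 + state_counts[proc - 1] * f[i - 1])
    return f


def guaranteed_channel_lengths(state_counts: List[int], k_prime: int) -> List[int]:
    """Lower bounds on the lengths of C_12, C_23, ..., C_n1 after P_2..P_n have moved.

    Start with |c_12| = f(n-1) * K'. Each P_i (i = 2..n) receives
    |S_i| * f(n-i) * K' messages and therefore sends at least f(n-i) * K',
    because at most |S_i| - 1 receives can occur between two sends.
    """
    n = len(state_counts)
    f = f_table(state_counts)
    available = f[n - 1] * k_prime  # messages waiting in C_12
    bounds = []
    for i in range(2, n + 1):
        received = state_counts[i - 1] * f[n - i] * k_prime
        assert received <= available, "the incoming channel must cover the receives"
        bounds.append(available - received)  # left behind in C_{i-1,i}
        available = f[n - i] * k_prime  # at least this many sent to C_{i,i+1}
    bounds.append(available)  # C_{n,1} holds what P_n sent
    return bounds


if __name__ == "__main__":
    print(f_table([2, 3, 4]))
    print(guaranteed_channel_lengths([2, 3, 4], 2))
```

For three processes with 2, 3, and 4 states and $K' = 2$, the table is f = [1, 5, 16] and every channel is left with at least 2 messages. Because $f(n - i + 1) - |S_i| \times f(n - i) = 1$, each drained channel keeps exactly $K'$ messages in this worst-case accounting, which is why the recurrence carries the leading "1 +".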
