Accuracy of Message Counting Abstraction in
Fault-Tolerant Distributed Algorithms*
Igor Konnov1, Josef Widder1, Francesco Spegni2, and Luca Spalazzi2
1 TU Wien (Vienna University of Technology), Austria
2 UnivPM, Ancona, Italy
Abstract. Fault-tolerant distributed algorithms are a vital part of mis-
sion-critical distributed systems. In principle, automatic verification can
be used to ensure the absence of bugs in such algorithms. In practice
however, model checking tools will only establish the correctness of dis-
tributed algorithms if message passing is encoded efficiently. In this pa-
per, we consider abstractions suitable for many fault-tolerant distributed
algorithms that count messages for comparison against thresholds, e.g.,
the size of a majority of processes. Our experience shows that storing
only the numbers of sent and received messages in the global state is
more efficient than explicitly modeling message buffers or sets of mes-
sages. Storing only the numbers is called message-counting abstraction.
Intuitively, this abstraction should maintain all necessary information.
In this paper, we confirm this intuition for asynchronous systems by
showing that the abstract system is bisimilar to the concrete system.
Surprisingly, if there are real-time constraints on message delivery (as
assumed in fault-tolerant clock synchronization algorithms), then there
exist neither timed bisimulation, nor time-abstracting bisimulation. Still,
we prove this abstraction useful for model checking: it preserves ATCTL
properties, as the abstract and the concrete models simulate each other.
1 Introduction
The following algorithmic idea is pervasive in fault-tolerant distributed com-
puting [36, 13, 30, 33, 21, 39]: each correct process counts messages received from
distinct peers. Then, given the total number of processes n and the maximum
number of faulty processes t, a process performs certain actions only if the mes-
sage counter reaches a threshold such as n− t (this number ensures that faulty
processes alone cannot prevent progress in the computation). A list of bench-
mark algorithms that use such thresholds can be found in [27]. On the left of
Figure 1, we give an example pseudo code [36]. This algorithm works in a timed
environment [35] (with a time bound τ+ on message delays) in the presence of
Byzantine faults (n > 3t) and provides safety and liveness guarantees such as:
* Supported by: the Austrian Science Fund (FWF) through the National Research Network
RiSE (S11403 and S11405), and project PRAVDA (P27722); and by the Vienna
Science and Technology Fund (WWTF) through project APALACHE (ICT15-103).
Left: original pseudocode (Lines 1-20)
 1  local myval_i ∈ {0, 1}
 2
 3
 4
 5
 6  do atomically
 7    -- messages are received implicitly
 8    if myval_i = 1
 9       and not sent ECHO before
10    then send ECHO to all
11
12    if received ECHO
13       from at least t + 1 distinct processes
14       and not sent ECHO before
15    then send ECHO to all
16
17    if received ECHO
18       from at least n − t distinct processes
19    then accept
20  od

Right: message-counting abstraction (Lines 21-40)
21  local myval_i ∈ {0, 1}
22  global nsntEcho ∈ N0 initially 0
23  local hasSent ∈ B initially F
24  local rcvdEcho ∈ N0 initially 0
25
26  do atomically
27    if (*) -- choose non-deterministically
28       and rcvdEcho < nsntEcho + f
29    then rcvdEcho++;
30
31    if myval_i = 1 and hasSent = F
32    then { nsntEcho++; hasSent = T; }
33
34
35    if rcvdEcho ≥ t + 1 and hasSent = F
36    then { nsntEcho++; hasSent = T; }
37
38    if rcvdEcho ≥ n − t
39    then accept
40  od
Fig. 1. Pseudocode of a broadcast primitive to simulate authenticated broad-
cast [36] (left), and pseudocode of its message-counting abstraction (right)
a) If a correct process accepts (that is, executes Line 19) at time T, then all
correct processes accept by time T + 2τ+.
b) If all correct processes start with myval_i = 0, then no correct process ever
accepts.
c) If all correct processes start with myval_i = 1, then at least one correct
process eventually accepts.
As is typical for the distributed algorithms literature, the pseudo code from
Figure 1 omits “unnecessary book-keeping” details of message passing. That is,
neither the local data structures that store the received messages nor the message
buffers are explicitly described. Hence, if we want to automatically verify such
an algorithm design, it is up to a verification expert to find adequate modeling
and proper abstractions of message passing.
The authors of [23] suggested modeling message passing with message counters
instead of keeping track of individual messages. This modeling was shown
experimentally to be efficient for fixed-size systems, and later a series of
parameterized model checking techniques was based upon it [22, 23, 25–27]. The
encoding on the right of Figure 1 is obtained by adding a global integer
variable nsntEcho. Incrementing this variable (Line 36) encodes that a correct
process executes Line 15 of the original pseudo code. The ith process keeps the
number of received messages in a local integer variable rcvdEcho_i that can be
increased, as long as the invariant rcvdEcho_i ≤ nsntEcho + f is preserved,
where f is the actual number of Byzantine faulty processes in the run. (This
models that correct processes can receive up to f messages sent by faulty
processes.) In fact, this modeling can be seen as a message-counting abstraction of
a distributed system that uses message buffers.
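The counting encoding on the right of Figure 1 can be exercised with a small random simulation. The following is a minimal sketch, not a model checker: the function name run_counting_model, the uniform random scheduler, and the coin flip that stands in for the non-deterministic choice (*) are assumptions made for illustration, and a single call explores only one run.

```python
import random

def run_counting_model(n, t, f, myvals, steps=10_000, seed=0):
    """Simulate one run of the message-counting pseudocode (Lines 21-40).

    myvals holds the initial values of the n - f correct processes.
    The scheduler and the coin flip for (*) are illustrative assumptions.
    """
    assert n > 3 * t and 0 <= f <= t and len(myvals) == n - f
    rng = random.Random(seed)
    nsnt_echo = 0                        # global: ECHOs sent by correct processes
    has_sent = [False] * len(myvals)     # local flag per correct process
    rcvd_echo = [0] * len(myvals)        # local counter per correct process
    accepted = [False] * len(myvals)
    for _ in range(steps):
        i = rng.randrange(len(myvals))   # interleaving: one process per step
        # Lines 27-29: optionally count one more message while keeping
        # the invariant rcvdEcho_i <= nsntEcho + f.
        if rng.random() < 0.5 and rcvd_echo[i] < nsnt_echo + f:
            rcvd_echo[i] += 1
        # Lines 31-32: send ECHO because myval_i = 1.
        if myvals[i] == 1 and not has_sent[i]:
            nsnt_echo += 1
            has_sent[i] = True
        # Lines 35-36: send ECHO after t + 1 received messages.
        if rcvd_echo[i] >= t + 1 and not has_sent[i]:
            nsnt_echo += 1
            has_sent[i] = True
        # Lines 38-39: accept after n - t received messages.
        if rcvd_echo[i] >= n - t:
            accepted[i] = True
        assert rcvd_echo[i] <= nsnt_echo + f
    return accepted, nsnt_echo, rcvd_echo
```

With n = 4, t = 1, f = 1 and all correct processes starting with value 0, no counter can exceed f = 1 < t + 1, so no process sends or accepts in any run, which matches property (b).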
[Figure 2: diagram relating four models. For asynchronous systems, the models "Message sets" and "Message counting" are related by bisimulation (Thm. 3.4). Adding clocks to each yields the timed systems "Message sets + time" and "Message counting + time", which are related by timed simulation equivalence (Cor. 5.10) but admit no timed or time-abstracting bisimulation (Thm. 5.1).]
Fig. 2. Relationship between different modeling choices.
The broadcast primitive in Figure 1 is also used in the seminal clock syn-
chronization algorithm from [35]. For clock synchronization, the precision of the
clocks depends on the timing behavior3 of the message system that the processes
use to re-synchronize; e.g., in [35] it is required that each message sent at an in-
stant T by a correct process must be delivered by a correct recipient process in
the time interval [T + τ−, T + τ+] for some bounds τ− and τ+ fixed in each run.
The standard theory of timed automata [7] does not account for message
passing directly. To incorporate messages, one specifies a message passing sys-
tem as a network of timed automata, i.e., a collection of timed automata that
are scheduled with respect to interleaving semantics and interact via rendezvous,
synchronous broadcast, or shared variables [12]. In this case, there are two typ-
ical ways to encode message passing: (i) for each pair of processes, introduce
a separate timed automaton that models a channel between the processes, or
(ii) introduce a single timed automaton that stores messages from timed au-
tomata (modeling the processes) and delivers the messages later by respecting
the timing constraints. The same applies to Timed I/O automata [24]. Both
solutions maintain much more details than required for automated verification
of distributed algorithms such as [35]: First, processes do not compare process
identifiers when making transitions, and thus are symmetric. Second, processes
do not compare identifiers in the received messages, but only count messages.
For automated verification purposes, it appears natural to model such
algorithms with timed automata that use a message-counting abstraction. However,
the central question for practical verification is: how precise is the
message-counting abstraction? In other words, given an algorithm A, what is the strongest
equivalence between the model MS(A) using message sets and the model MC(A)
using message counting? If the message-counting abstraction is too coarse, then
this may lead to spurious counterexamples, which may result in many refinement
steps [17], or may even make the verification procedure incomplete.
3 As we deal with distributed algorithms and timed automata, the notion of a clock
appears in two different contexts in this paper, which should not be confused: The
problem of clock synchronization is to compute adjustment for the hardware clocks
(oscillators). In the context of timed automata, clocks are special variables used to
model the timing behavior of a system.
Contributions. We introduce timed and untimed models suitable for the verifica-
tion of threshold-based distributed algorithms, and establish relations between
these models. An overview of the following contributions is depicted in Figure 2:
– We define a model of processes that count messages. We then compose them
into asynchronous systems (interleaving semantics). We give two variants:
message passing, where the messages are stored in sets, and message count-
ing, where only the number of sent messages is stored in shared variables.
– We then show that in the asynchronous case, the message passing and the
message counting variants are bisimilar. This proves the intuition that under-
lies the verification results from [23, 22, 27, 25]. It explains why no spurious
counterexamples due to message-counting abstraction were experienced in
the experimental evaluation of the verification techniques from [22].
– We obtain timed models by adding timing constraints on message delays
that restrict the message reception time depending on the sending times.
– We prove the surprising result that, in general, there is neither timed bisim-
ulation nor time-abstracting bisimulation between the message passing and
the message counting variants.
– Finally, we prove that there is timed simulation equivalence between the
message passing and the message counting variants. This paves the way for
abstraction-based model checking of timed distributed algorithms [35].
In the following section, we briefly recall the classic definitions of transition
systems, timed automata, and simulations [7, 16]. However, the timed automata
defined there do not provide standard means to express processes that commu-
nicate via asynchronous message passing, as required for distributed algorithms.
As we are interested in timed automata that capture this structure, we first de-
fine asynchronous message passing in Section 3 and then add timing constraints
in Section 4 via message sets and message counting.
2 Preliminaries
We recall the classic definitions to the extent necessary for our work, and add
two non-standard notions: First, our definition of a timed automaton assumes
partitioning of the set of clocks into two disjoint sets: the message clocks (used to
express the timing constraints of the message system underlying the distributed
algorithm) and the specification clocks (used to express the specifications). Sec-
ond, we assume that clocks are “not ticking” before they are started (more
precisely, they are initialized to −∞).
We will use the following sets: the set of Boolean values B = {F,T}, the set
of natural numbers N = {1, 2, . . . }, the set N0 = N∪{0}, the set of non-negative
reals R≥0, and the set of time instants T := R≥0 ∪ {−∞}.
Transition systems. Given a finite set AP of atomic propositions, a transition
system is a tuple TS = (S, S0, R, L) where S is a set of states, S0 ⊆ S are the
initial states, R ⊆ S × S is a transition relation, and L : S → 2^AP is a labeling
function.
Clocks. A clock is a variable that ranges over the set T. We call a clock that has
the value −∞ uninitialized. For a set X of clocks, a clock valuation is a function
ν : X → T. Given a clock valuation ν and a δ ∈ R≥0, we define ν + δ to be the
valuation ν′ such that ν′(c) = ν(c) + δ for c ∈ X (Note that −∞ + δ = −∞).
For a set Y ⊆ X and a clock valuation ν : X → T, we denote by ν[Y := 0] the
valuation ν′ such that ν′(c) = 0 for c ∈ Y ∩X and ν′(c) = ν(c) for c ∈ X \ Y .
Given a set of clocks Z, the set of clock constraints Ψ(Z) is defined to contain
all expressions generated by the following grammar:
ζ := c ≤ a | c ≥ a | c < a | c > a | ζ ∧ ζ for c ∈ Z, a ∈ N0
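The clock operations above can be sketched in a few lines of Python. This is an illustrative encoding only: clock valuations are represented as dictionaries, a constraint ζ is a list of atoms read as a conjunction, and the helper names advance, reset, and satisfies are hypothetical.

```python
import math

NEG_INF = -math.inf  # value of an uninitialized clock

def advance(nu, delta):
    """Return nu + delta; uninitialized clocks stay at -inf."""
    assert delta >= 0
    return {c: v + delta for c, v in nu.items()}

def reset(nu, clocks):
    """Return nu[Y := 0] for Y = clocks."""
    return {c: (0.0 if c in clocks else v) for c, v in nu.items()}

def satisfies(nu, constraint):
    """Check a conjunction of atoms (clock, op, bound) against nu."""
    ops = {
        "<=": lambda x, a: x <= a,
        ">=": lambda x, a: x >= a,
        "<": lambda x, a: x < a,
        ">": lambda x, a: x > a,
    }
    return all(ops[op](nu[c], a) for c, op, a in constraint)
```

For instance, starting from two uninitialized clocks x and y, resetting x and letting 2.5 time units pass gives x = 2.5 while y remains −∞, so a lower bound such as y ≥ 0 is not satisfied until y is started.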
Timed automata. Given a set of atomic propositions AP and a finite transition
system (S, S0, R, L) over AP, which models discrete control of a system, we
model the system’s real-time behavior with a timed automaton, i.e., a tuple
TA = (S, S0, R, L,X ∪ U, I, E) with the following properties:
– The set X ∪ U is the disjoint union of the sets of message clocks X and
specification clocks U .
– The function I : S → Ψ(X ∪ U) is a state invariant, which assigns to each
discrete state a clock constraint over X ∪ U , which must hold in that state.
We denote by µ, ν |= I(s) that the clock valuations µ and ν satisfy the
constraints of I(s).
– E : R → Ψ(X ∪ U) × 2^(X∪U) is a state switch relation that assigns to each
transition a guard on clock values and a (possibly empty) set of clocks that
must be reset to zero when the transition takes place.
We assume that AP is disjoint from Ψ(X ∪ U). Thus, the discrete behavior
does not interfere with propositions on time. The semantics of a timed automa-
ton TA = (S, S0, R, L,X ∪ U, I, E) is an infinite transition system TS (TA) =
(Q,Q0, ∆, λ) over propositions AP ∪ Ψ(U) with the following properties [6]:
1. The set Q of states consists of triples (s, µ, ν), where s ∈ S is the discrete
component of the state, whereas µ : X → T and ν : U → T are valuations
of the message and specification clocks respectively such that µ, ν |= I(s).
2. The set Q0 ⊆ Q of initial states comprises triples (s0, µ0, ν0) with s0 ∈ S0,
and clocks are set to −∞, i.e., ∀c ∈ X. µ0(c) = −∞ and ∀c ∈ U. ν0(c) = −∞.
3. The transition relation ∆ contains pairs ((s, µ, ν), (s′, µ′, ν′)) of two kinds of
transitions:
(a) A time step: s′ = s and µ′ = µ+ δ, ν′ = ν + δ, for δ > 0, provided that
for all δ′ : 0 ≤ δ′ ≤ δ the invariant is preserved, i.e., µ+δ′, ν+δ′ |= I(s).
(b) A discrete step: there is a transition (s, s′) ∈ R with (ϕ, Y ) = E((s, s′))
whose guard ϕ is enabled, i.e., µ, ν |= ϕ, and the clocks from Y are reset,
i.e., µ′ = µ[Y ∩X := 0], ν′ = ν[Y ∩U := 0], provided that µ′, ν′ |= I(s).
Given a transition (q, q′) ∈ ∆, we write q −δ→_∆ q′ for a time step with delay
δ ∈ R≥0, or q →_∆ q′ for a discrete step.
4. The labeling function λ : Q → 2^(AP ∪ Ψ(U)) is defined as follows. For any state
q = (s, µ, ν), the labeling is λ(q) = L(s) ∪ {ϕ ∈ Ψ(U) : µ, ν |= ϕ}.
Comparing system behaviors. For transition systems TS_i = (S_i, S_i^0, R_i, L_i)
for i ∈ {1, 2}, a relation H ⊆ S_1 × S_2 is a simulation, if (i) for each (s1, s2) ∈ H
the labels coincide, L_1(s1) = L_2(s2), and (ii) for each transition (s1, t1) ∈ R_1,
there is a transition (s2, t2) ∈ R_2 such that (t1, t2) ∈ H. If, in addition, the set
H^−1 = {(s2, s1) : (s1, s2) ∈ H} is also a simulation, then H is called a bisimulation.
Further, if TA_1 and TA_2 are timed automata with TS(TA_i) = (Q_i, Q_i^0, ∆_i, λ_i)
for i ∈ {1, 2}, then a simulation H ⊆ Q_1 × Q_2 is called timed simulation [29],
and a bisimulation B ⊆ Q_1 × Q_2 is called timed bisimulation [15].
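For finite transition systems, the definition of a simulation can be turned into a simple fixpoint computation. The sketch below is an illustration under assumptions, not part of the formal development: a transition system is passed as a triple (states, transitions, labels), the function name greatest_simulation is hypothetical, and the infinite-state systems TS(TA) induced by timed automata are out of its reach.

```python
def greatest_simulation(ts1, ts2):
    """Return the largest simulation H ⊆ S1 × S2 between two finite systems.

    Each ts is (states, transitions, labels) with transitions a set of
    pairs and labels a dict from states to their label.
    """
    S1, R1, L1 = ts1
    S2, R2, L2 = ts2
    # Condition (i): start from all pairs whose labels coincide.
    H = {(s1, s2) for s1 in S1 for s2 in S2 if L1[s1] == L2[s2]}
    changed = True
    while changed:
        changed = False
        for (s1, s2) in list(H):
            for (a, t1) in R1:
                if a != s1:
                    continue
                # Condition (ii): s2 must match the step s1 -> t1.
                if not any(b == s2 and (t1, t2) in H for (b, t2) in R2):
                    H.discard((s1, s2))
                    changed = True
                    break
    return H

def is_initial(H, init1, init2):
    """Every initial state of the first system is related to an initial state of the second."""
    return all(any((s, t) in H for t in init2) for s in init1)
```

Running the function in both directions and comparing the results gives a quick check for simulation equivalence on small examples; a bisimulation additionally requires one relation whose inverse is also a simulation.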
For transition systems TS_i = (S_i, S_i^0, R_i, L_i) for i ∈ {1, 2}, we say that a
simulation H ⊆ S_1 × S_2 is initial, if ∀s ∈ S_1^0, ∃t ∈ S_2^0. (s, t) ∈ H. A bisimulation
B ⊆ S_1 × S_2 is initial, if the simulations B and B^−1 are initial. The same applies
to timed (bi-)simulations. Then, for i ∈ {1, 2}, we recall the standard preorders
and equivalences on a pair of transition systems TS_i = (S_i, S_i^0, R_i, L_i), and on
a pair of timed automata TA_i, where TS(TA_i) = (Q_i, Q_i^0, ∆_i, λ_i):
1. TS_1 ≈ TS_2 (bisimilar), if there is an initial bisimulation B ⊆ S_1 × S_2.
2. TA_1 ⪯_t TA_2 (TA_2 time-simulates TA_1), if there is an initial timed simulation
H ⊆ Q_1 × Q_2.
3. TA_1 ≈_t TA_2 (time-bisimilar), if there is an initial timed bisimulation B ⊆
Q_1 × Q_2.
4. TA_1 ≃_t TA_2 (time-simulation equivalent), if TA_1 ⪯_t TA_2 and TA_2 ⪯_t TA_1.
Timed bisimulation forces time steps to advance clocks by the same amount
of time. A coarser relation, called time-abstracting bisimulation [37], allows
two transition systems to advance clocks at “different speeds”. Given two timed
automata TA_i, for i ∈ {1, 2}, and the respective transition systems TS(TA_i) =
(Q_i, Q_i^0, ∆_i, λ_i), a binary relation B ⊆ Q_1 × Q_2 is a time-abstracting
bisimulation [37], if the following holds for every pair (q1, q2) ∈ B:
1. The labels coincide: λ_1(q1) = λ_2(q2);
2. For all j and k such that {j, k} = {1, 2}, and each discrete step q_j →_∆j r_j,
there is a discrete step q_k →_∆k r_k such that (r_j, r_k) ∈ B;
3. For all j and k such that {j, k} = {1, 2}, a delay δ ∈ R≥0, and a time step
q_j −δ→_∆j r_j, there is a delay δ′ ∈ R≥0 and a time step q_k −δ′→_∆k r_k such that
(r_j, r_k) ∈ B.
By substituting δ′ with δ, one obtains the definition of timed bisimulation.
3 Asynchronous message passing systems
Timed automata as defined above neither capture processes nor communication
via messages, as would be required to model distributed algorithms. Hence we
now introduce these notions and then construct an asynchronous system using
processes and message passing (or message counting). We assume that at every
step a process receives and sends at most one message [19]. In Section 4, we add
time to this modeling in order to obtain a timed automaton.
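[Figure 3: control states V0, V1, SE, and AC, with edges V0 → SE (guard c1), V0 → AC (guard c2), V1 → SE (guard c3), and SE → AC (guard c2).]
Fig. 3. A graphical representation of a process discussed in Example 3.1
[Figure 4: control states ℓ0 and ℓ1, with edges labeled c4 and c5.]
Fig. 4. A simple two-state process (used later for Theorem 5.1)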
Single correct process. We assume a (possibly infinite) set of control states L
and a subset L0 ⊆ L of initial control states. We fix a finite set MT of message
types. We assume that the control states in L keep track of the messages sent
by a process. Thus, L comes with a predicate is_sent : L × MT → B, where
is_sent(ℓ, m) evaluates to true if and only if a message of type m has been sent
according to the control state ℓ. Finally, we introduce a set Π of parameters and
store the parameter values in a vector p ∈ N0^|Π|. As noted in [22], parameter
values are typically restricted with a resilience condition such as n > 3t (less
than a third of the processes are faulty), so we will assume that there is a set of
all admissible combinations of parameter values P_RC ⊆ N0^|Π|.
The behavior of a single process is defined as a process transition relation
T ⊆ L × N0^|Π| × N0^|MT| × L encoding transitions guarded by conditions on
message counters that range over N0^|MT|: when (ℓ, p, c, ℓ′) ∈ T, a process can
make a transition from the control state ℓ to the control state ℓ′, provided that,
for every m ∈ MT, the number of received messages of type m is greater than
or equal to c(m) in a configuration with parameter values p.
Example 3.1. The process shown in Figure 1 can be written in our definitions as
follows. The algorithm is using only one message type, and thus MT = {ECHO}.
We assume a set of control states L = {V0, V1, SE, AC}: V0 and V1 encode the
initial states where myval = 0 and myval = 1 respectively, pc = SE encodes the
status “ECHO sent before”, and pc = AC encodes the status “accept”. The initial
control states are: L0 = {V0, V1}. The transition relation contains four types
of transitions: t_1^p = (V0, p, c1, SE), t_2^p = (V0, p, c2, AC), t_3^p = (V1, p, c3, SE),
and t_4^p = (SE, p, c2, AC), for any p ∈ N0^|Π| and c1, c2, c3 satisfying the
following: c1(ECHO) ≥ p(t) + 1, c2(ECHO) ≥ p(n) − p(t), and c3(ECHO) ≥ 0.
Finally, is_sent(ℓ, ECHO) iff ℓ ∈ {SE, AC}. A concise graphical representation of
the transition relation is given in Figure 3. There, each edge represents multiple
transitions of the same type. Let us observe that while the action of sending
a message can be inferred by simply checking all the transitions going from a
state s to a state t such that ¬is_sent(s) and is_sent(t), the action of receiving an
individual message is not part of the process description at this level. However,
if a guarded transition is taken, this implies that a threshold has been reached,
e.g., in case of c1, at least t + 1 messages were received.
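The transition relation of Example 3.1 can be written down directly as guards over the ECHO counter. The following sketch is illustrative: the function names enabled_transitions and is_sent and the string encoding of control states are assumptions, and the parameter vector p is passed as the two integers n and t.

```python
def enabled_transitions(loc, n, t, rcvd_echo):
    """Control states reachable in one local step from loc.

    rcvd_echo is the number of ECHO messages received so far; each branch
    corresponds to one transition type of Example 3.1.
    """
    succ = []
    if loc == "V0" and rcvd_echo >= t + 1:
        succ.append("SE")   # t1: guard c1(ECHO) >= t + 1
    if loc == "V0" and rcvd_echo >= n - t:
        succ.append("AC")   # t2: guard c2(ECHO) >= n - t
    if loc == "V1":
        succ.append("SE")   # t3: guard c3(ECHO) >= 0, always enabled
    if loc == "SE" and rcvd_echo >= n - t:
        succ.append("AC")   # t4: guard c2(ECHO) >= n - t
    return succ

def is_sent(loc):
    """ECHO has been sent exactly in the control states SE and AC."""
    return loc in {"SE", "AC"}
```

For n = 4 and t = 1, a process in V0 with two received ECHO messages can only move to SE, while with three received messages it can move to SE or directly to AC.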
We make two assumptions typical for distributed algorithms [19, 35]:
A1 Processes do not forget that they have sent messages: If (ℓ, p, c, ℓ′) ∈ T, then
is_sent(ℓ, m) → is_sent(ℓ′, m) for every m ∈ MT.
A2 At each step a process sends at most one message to all: If (ℓ, p, c, ℓ′) ∈ T and
¬is_sent(ℓ, m) ∧ is_sent(ℓ′, m) ∧ ¬is_sent(ℓ, m′) ∧ is_sent(ℓ′, m′) then m = m′.
Then, we call (MT, L, L0, T) a process template.

Table 1. The message-passing and message-counting interpretations

Messages and collections of messages
  MP: Msg_MP ≜ MT × Proc
      MsgSets_MP ≜ 2^(MT×Proc)
  MC: Msg_MC ≜ MT × {C, F}
      MsgSets_MC ≜ {0, . . . , |Corr|}^|MT| × {0, . . . , |Byz|}^|MT|

Initial messages, init ∈ MsgSets
  MP: init_MP ≜ ∅
  MC: init_MC ≜ ((0, . . . , 0), (0, . . . , 0))

Count messages, card : MT × MsgSets → N0
  MP: card_MP(m, M) ≜ |{p ∈ Proc : ⟨m, p⟩ ∈ M}|
  MC: card_MC(m, (c_C, c_F)) ≜ c_C(m) + c_F(m)

Add a message, add : Msg × MsgSets → MsgSets
  MP: add_MP(⟨m, p⟩, M) ≜ M ∪ {⟨m, p⟩}
  MC: add_MC((m, tag), (c_C, c_F)) ≜ (c′_C, c′_F) such that
      c′_C(m) = c_C(m) + 1 and c′_F(m) = c_F(m), if tag = C,
      c′_F(m) = c_F(m) + 1 and c′_C(m) = c_C(m), if tag = F,
      and c′(m′) = c(m′) for m′ ∈ MT, m′ ≠ m

Is there a message to deliver? inTransit : Msg × MsgSets × MsgSets → B
  MP: inTransit_MP(⟨m, p⟩, M, M′) ≜
      (p ∈ Corr ∧ ⟨m, p⟩ ∈ M′ \ M) ∨ (p ∈ Byz ∧ ⟨m, p⟩ ∉ M)
  MC: inTransit_MC((m, tag), (c_C, c_F), (c′_C, c′_F)) ≜
      (tag = C ∧ c′_C > c_C) ∨ (tag = F ∧ c_F < |Byz|)
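The two interpretations of Table 1 can be written as two small families of functions. The code below is a sketch under assumptions: the process sets CORR and BYZ are hypothetical examples, a counting collection is encoded as a pair of dictionaries (c_C, c_F), and the comparisons in the counting version of inTransit are read per message type m.

```python
CORR = frozenset({1, 2, 3})   # hypothetical correct processes
BYZ = frozenset({4})          # hypothetical Byzantine processes
MT = ("ECHO",)

# Message passing: a message is (m, p); a collection is a frozenset.
init_mp = frozenset()

def card_mp(m, M):
    return len({p for (mm, p) in M if mm == m})

def add_mp(msg, M):
    return M | {msg}

def in_transit_mp(msg, M, M_sent):
    m, p = msg
    return (p in CORR and msg in M_sent - M) or (p in BYZ and msg not in M)

# Message counting: a message is (m, tag); a collection is (c_C, c_F).
init_mc = ({m: 0 for m in MT}, {m: 0 for m in MT})

def card_mc(m, c):
    return c[0][m] + c[1][m]

def add_mc(msg, c):
    m, tag = msg
    c_corr, c_faulty = dict(c[0]), dict(c[1])   # copy, do not mutate
    if tag == "C":
        c_corr[m] += 1
    else:
        c_faulty[m] += 1
    return (c_corr, c_faulty)

def in_transit_mc(msg, c, c_sent):
    m, tag = msg
    return ((tag == "C" and c_sent[0][m] > c[0][m])
            or (tag == "F" and c[1][m] < len(BYZ)))
```

In both interpretations, a message attributed to a Byzantine process is deliverable without ever being added to the shared collection; the counting version only bounds how many such messages a process may count by |Byz|.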
Asynchronous message passing and counting in presence of Byzantine
faults. In this section we introduce two ways of modeling message passing:
by storing messages in sets, and by counting messages. As in [23], we do not
explicitly model Byzantine processes [32], but capture their effect on the correct
processes in the form of spurious messages. Although we do not discuss other
kinds of faults (e.g., crashes, symmetric faults, omission faults), it is not hard to
model other faults by following the modeling in [23].
We fix a set of processes Proc, which is typically defined as {1, . . . , n} for
n ≥ 1. Further, assume that there are two disjoint sets: the set Corr ⊆ Proc
of correct processes, and Byz ⊆ Proc of Byzantine processes (possibly empty),
with Byz ∪ Corr = Proc. Given a process template (MT,L,L0, T ), we refer to
(MT,L,L0, T ,Corr,Byz) as a design. Note that a design does not capture how
processes interact with messages. To do so, in Table 1, we define message pass-
ing (MP) and message counting (MC) models as interpretations of the signature
(Msg ,MsgSets, init , card , add , inTransit), with the following informal meaning:
– Msg : the set of all messages that can be exchanged by the processes,
– MsgSets: collections of messages,
– init : the empty collection of messages,
– card : a function that counts messages of the given type,
– add : a function that adds a message to a collection of messages
– inTransit : a function that checks whether a message is in transit and thus
can be received.
Transition systems. Fix interpretations (Msg_I, MsgSets_I, init_I, card_I, add_I,
inTransit_I) for I ∈ {MP, MC}. Then, we define a transition system TS^I =
(S^I, S_0^I, R^I, L^I) of processes from Proc that communicate with respect to
interpretation I. We call message-passing system the transition system obtained
using the interpretation MP, and message-counting system the transition system
obtained using the interpretation MC.
The set S^I contains configurations, i.e., tuples (p, pc, rcvd, sent) having the
following properties: (a) p ∈ N0^|Π|, (b) pc : Corr → L, (c) rcvd : Corr →
MsgSets_I, and (d) sent ∈ MsgSets_I. In a configuration, for every process p ∈ Corr,
the values pc(p) and rcvd(p) comprise the local view of the process p, while the
components sent and p comprise the shared state of the distributed system.
A configuration σ ∈ S^I belongs to the set S_0^I of initial configurations, if for
each process p ∈ Corr, it holds that: (a) σ.pc(p) ∈ L0, (b) σ.rcvd(p) = init_I,
(c) σ.sent = init_I, and (d) σ.p ∈ P_RC.
Definition 3.2. The transition relation R^I contains a pair of configurations
(σ, σ′) ∈ S^I × S^I, if there is a correct process p ∈ Corr that satisfies:
1. There exists a local transition (ℓ, p, c, ℓ′) ∈ T satisfying σ.pc(p) = ℓ and
σ′.pc(p) = ℓ′ and for all m in MT, c(m) = card_I(m, σ′.rcvd(p)). Also, it is
required that σ.p = σ′.p = p.
2. Messages are received and sent according to the signature:
(a) Process p receives no message: σ′.rcvd(p) = σ.rcvd(p), or there is a
message in transit in σ that is received in σ′, i.e., there is a message
msg ∈ Msg_I satisfying:
inTransit_I(msg, σ.rcvd(p), σ.sent) ∧ σ′.rcvd(p) = add_I(msg, σ.rcvd(p)).
(b) The shared variable sent is changed iff process p sends a message, that
is, σ′.sent = add_I(msg, σ.sent), if and only if ¬is_sent(σ.pc(p), m) and
is_sent(σ′.pc(p), m), for every m ∈ MT and msg ∈ Msg_I of type m.
3. The processes different from p do not change their local states:
σ′.pc(q) = σ.pc(q) and σ′.rcvd(q) = σ.rcvd(q) for q ∈ Corr \ {p}.
The labeling function L^I : S^I → L^|Corr| × (N0^|MT|)^|Corr| labels each
configuration σ ∈ S^I with the vector of control states and message counters, i.e.,
L^I(σ) = ((ℓ_1, . . . , ℓ_|Corr|), (c_1, . . . , c_|Corr|)) such that ℓ_p = σ.pc(p) and c_p(m) =
card_I(m, σ.rcvd(p)) for p ∈ Corr, m ∈ MT. (For simplicity we use the convention
that Corr = {1, . . . , j}, for some j ∈ N.) Note that L^I labels a configuration with
the process control states and the number of messages received by each process.
The message-passing transition systems have the following features. The mes-
sages sent by correct processes are stored in the shared set sent. In this modeling,
the messages from Byzantine processes are not stored in sent explicitly, but can
be received at any step. Each correct process p ∈ Corr stores received messages
in its local set rcvd(p), whose elements originate from the messages stored in the
set sent or from Byzantine processes.
The message-counting transition systems have the following features.
Messages are not stored explicitly, but are only counted. We maintain two vectors
of counters: (i) representing the number of messages that originate from correct
processes (these messages have the tag C), and (ii) representing the number of
messages that originate from faulty processes (these messages have the tag F).
Each correct process p ∈ Corr keeps two such vectors of counters c_C and c_F in its
local variable rcvd(p). In the following, we refer to c_C and c_F using the notation
[rcvd(p)]_C and [rcvd(p)]_F. The number of sent messages is also stored as a pair of
vectors [sent]_C and [sent]_F. By the definition of the transition relation R^MC, the
vector [sent]_F is always equal to the zero vector, whereas the correct process p
can increment its counter [rcvd(p)]_F, if [rcvd(p)]_F(m) < |Byz|, for every m ∈ MT.
To prove bisimulation between a message-passing system and a message-counting
system built from the same design, we introduce the following relation on
the configurations of both systems:
Definition 3.3. Let H# ⊆ S^MP × S^MC such that (σ, σ#) ∈ H# if for all
processes p ∈ Corr and message types m ∈ MT:
1. σ#.pc(p) = σ.pc(p)
2. σ#.[rcvd(p)]_C(m) = |{q ∈ Corr : ⟨m, q⟩ ∈ σ.rcvd(p)}|
3. σ#.[rcvd(p)]_F(m) = |{q ∈ Byz : ⟨m, q⟩ ∈ σ.rcvd(p)}|
4. σ#.[sent]_C(m) = |{q ∈ Corr : ⟨m, q⟩ ∈ σ.sent}|
5. σ#.[sent]_F(m) = 0
6. {q ∈ Proc : ⟨m, q⟩ ∈ σ.sent} ⊆ Corr
7. σ.rcvd(p) ⊆ σ.sent ∪ {⟨m, q⟩ : m ∈ MT, q ∈ Byz}
8. is_sent(σ.pc(p), m) ↔ ⟨m, p⟩ ∈ σ.sent
Theorem 3.4. For a message-passing system TSMP and a message-counting
system TSMC defined over the same design, H# is a bisimulation.
The key argument to prove Theorem 3.4 is that given a message-counting
state σ#, if a step increases a counter rcvd(p), in the message-passing system this
transition can be mirrored by receiving an arbitrary message in transit. In fact,
in both systems, once a message is sent it can be received at any future step. We
will see that in the timed version this argument does not work anymore, due to
the restricted time interval in which a message must be received.
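Items 1-5 of Definition 3.3 determine the counting configuration σ# uniquely from a message-passing configuration σ, while items 6-8 constrain σ alone. The sketch below computes that image and checks the constraints; the function names abstract and well_formed and the dictionary encoding of configurations are assumptions made for illustration, and the parameter vector p is omitted.

```python
def abstract(pc, rcvd, sent, corr, byz, mt):
    """Counting configuration required by items 1-5 of Definition 3.3."""
    def count(msgs, m, senders):
        return sum(1 for (mm, q) in msgs if mm == m and q in senders)

    rcvd_c = {
        p: ({m: count(rcvd[p], m, corr) for m in mt},   # [rcvd(p)]_C
            {m: count(rcvd[p], m, byz) for m in mt})    # [rcvd(p)]_F
        for p in corr
    }
    sent_c = ({m: count(sent, m, corr) for m in mt},    # [sent]_C
              {m: 0 for m in mt})                       # [sent]_F = 0
    return dict(pc), rcvd_c, sent_c

def well_formed(pc, rcvd, sent, corr, byz, mt, is_sent):
    """Items 6-8 of Definition 3.3 on the message-passing side."""
    only_correct_senders = all(q in corr for (_, q) in sent)
    byz_msgs = {(m, q) for m in mt for q in byz}
    rcvd_ok = all(rcvd[p] <= sent | byz_msgs for p in corr)
    sent_ok = all(is_sent(pc[p], m) == ((m, p) in sent)
                  for p in corr for m in mt)
    return only_correct_senders and rcvd_ok and sent_ok
```

Two message-passing configurations that differ only in which Byzantine process a spurious message is attributed to are mapped to the same counting configuration, which illustrates the information that the abstraction forgets.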
4 Messages with time constraints
We now add time constraints to both message-passing systems and
message-counting systems. Following the definitions from distributed algorithms [35, 40],
we assume that every message is delivered within a predefined time bound, that
is, not earlier than τ− time units and not later than τ+ time units since the
instant it was sent, with 0 ≤ τ− ≤ τ+. We use naturals for τ− and τ+ for
consistency with the literature on timed automata.
As can be seen from Section 2, to define a timed automaton, one has to
provide an invariant and a switch relation. In the following, we fix the invariants and
switch relations with respect to the timing constraints τ− and τ+ on messages.
However, the specifications of distributed algorithms may refer to time, e.g., “If
a correct process accepts the message (round k) at time t, then every correct
process does so by time t + t_del” [35]. Therefore, we assume that a specification
invariant (or user invariant) I_U : 2^AP → Ψ(U) and a specification switch relation
(or user switch relation) E_U : 2^AP × 2^AP → Ψ(U) × 2^U are given as input. Then,
we will refer to the tuple (L, L0, T, Proc, I_U, E_U) as a timed design and we will
assume that a timed design is fixed in the following.
Using a timed design, we will use message-passing and message-counting
systems to derive two timed automata. For a message of type m sent by a
correct process p, the message-passing system uses a clock c 〈m, p〉 to store the
delay since the message 〈m, p〉 was sent. The message-counting system stores
the delay since the ith message of type m was sent, for all i and m. Both timed
automata specify an invariant to constrain the time required to deliver a message.
Definition 4.1 (Message-passing timed automaton). Given a message-passing
system TS^MP = (S^MP, S_0^MP, R^MP, L^MP) defined over a timed design
(L, L0, T, Proc, I_U, E_U), we say that a timed automaton TA^MP = (S^MP, S_0^MP,
R^MP, L^MP, U ∪ X^MP, I^MP, E^MP) is a message-passing timed automaton, if it has
the following properties:
1. There is one clock per message that can be sent by a correct process: X^MP =
{c⟨m, p⟩ : m ∈ MT, p ∈ Corr}.
2. For each discrete transition (σ, σ′) ∈ R^MP, the state switch relation E^MP(σ, σ′)
ensures the specification invariant and resets the given specification clocks
and the clocks corresponding to the message sent in transition (σ, σ′). That is,
if (ϕ_U, Y_U) = E_U(L^MP(σ), L^MP(σ′)) is the pair of the guard and the specification
clocks to reset, then E^MP(σ, σ′) = (ϕ_U, Y_U ∪ {c⟨m, p⟩ : ⟨m, p⟩ ∈ σ′.sent \ σ.sent}).
3. Each state σ ∈ S^MP has the invariant I^MP(σ) = I_U(L^MP(σ)) ∧ ϕ−_MP ∧ ϕ+_MP
composed of:
(a) the specification invariant I_U(L^MP(σ));
(b) the lower bound on the age of received messages:
ϕ−_MP = ⋀_{⟨m,p⟩∈M} c⟨m, p⟩ ≥ τ− for M = {⟨m, p⟩ ∈ MT × Corr : ∃q ∈
Corr. ⟨m, p⟩ ∈ σ.rcvd(q)}; and
(c) the upper bound on the age of messages that are in transit:
ϕ+_MP = ⋀_{⟨m,p⟩∈M} 0 ≤ c⟨m, p⟩ ≤ τ+ for M = {⟨m, p⟩ ∈ MT × Corr : ⟨m, p⟩ ∈
σ.sent \ ⋂_{q∈Corr} σ.rcvd(q)}.
Definition 4.2 (Message-counting timed automaton). Given a message-
counting system TSMC = (S
MC, SMC0 , R
MC, LMC) defined over a timed design (L,
L0, T ,Proc, IU , EU ), we say that a timed automaton TAMC = (SMC, SMC0 , RMC,
LMC, U ∪XMC, IMC, EMC) is a message-counting timed automaton, if it has the
following properties:
1. There is one clock per message type and number of messages sent. That is,
XMC = {c 〈m, i〉 : m ∈ MT, 1 ≤ i ≤ |Corr|}.
2. For each discrete transition (σ, σ′) ∈ RMC, the state switch relation EMC(σ, σ′)
ensures the specification invariant and resets the given specification clocks
and the clocks corresponding to message counters updated by (σ, σ′). That is,
if (ϕU , YU ) = EU (L
MC(σ), LMC(σ′)), then the switch relation EMC(σ, σ′) is
(ϕU , YU ∪ {c 〈m, k〉 : m ∈ MT, k = σ′.sent(m) = σ.sent(m) + 1}).
3. Each state $\sigma \in S^{MC}$ has the invariant
$I^{MC}(\sigma) = I_U(L^{MC}(\sigma)) \wedge \varphi^-_{MC} \wedge \varphi^+_{MC}$ composed of:
(a) the specification invariant $I_U(L^{MC}(\sigma))$;
(b) $\varphi^-_{MC} = \bigwedge_{m \in MT} \big(a(m) > 0 \rightarrow c\langle m, a(m)\rangle \ge \tau^-\big)$
for the numbers $a(m) = \max_{p \in Corr} [\sigma.rcvd(p)(m)]_C$. If a correct process
has received a(m) messages of type m from correct processes, then the
a(m)-th message of type m, for every m ∈ MT, was sent at least τ− time
units earlier.
(c) $\varphi^+_{MC} = \bigwedge_{m \in MT} \bigwedge_{b(m) < j \le \sigma.sent(m)} 0 \le c\langle m, j\rangle \le \tau^+$
for the numbers $b(m) = \min_{p \in Corr} [\sigma.rcvd(p)(m)]_C$. If there is a correct
process that has received b(m) messages of type m from correct processes,
then for every number of messages j > b(m), the respective clock is
bounded by τ+.
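The counting variant of the invariant can be sketched in the same spirit. The function name mc_invariant_holds and the dictionary encodings are hypothetical. The sketch also assumes that rcvd[p][m] already holds the number of messages of type m that p received from correct processes (our reading of the restriction written with the subscript C), and that sent has an entry for every message type.

```python
from typing import Dict, Tuple


def mc_invariant_holds(
    sent: Dict[str, int],                 # m -> sigma.sent(m)
    rcvd: Dict[int, Dict[str, int]],      # correct process p -> (m -> count)
    clock: Dict[Tuple[str, int], float],  # (m, i) -> value of clock c<m, i>
    tau_minus: float,
    tau_plus: float,
) -> bool:
    """Check the timing conjuncts (b) and (c) for one state; I_U is omitted."""
    for m, n_sent in sent.items():
        counts = [rcvd[p].get(m, 0) for p in rcvd]
        a = max(counts)  # largest number of type-m messages received by anyone
        b = min(counts)  # smallest number of type-m messages received by anyone

        # (b) the a-th message of type m was sent at least tau_minus ago
        if a > 0 and clock[(m, a)] < tau_minus:
            return False

        # (c) messages b+1, ..., sent(m) may still be in transit somewhere,
        #     so their clocks are bounded by tau_plus
        for j in range(b + 1, n_sent + 1):
            if not (0 <= clock[(m, j)] <= tau_plus):
                return False
    return True
```

Note that only the clocks with indices a(m) and b(m)+1, ..., σ.sent(m) are constrained; the identity of the sender plays no role.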
While the number of employed clocks is the same, the latter model is “more
abstract”: by forgetting the identity of the sender, several configurations
of the message-passing timed automaton are mapped to the same configuration
of the message-counting timed automaton.
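The many-to-one nature of this mapping can be illustrated on the discrete part of a configuration. In the sketch below, count_messages is a hypothetical helper that forgets the sender of every message; the three-process example data is made up for illustration.

```python
from collections import Counter
from typing import Dict, Set, Tuple

Message = Tuple[str, int]  # <m, p>: message type m sent by process p


def count_messages(
    sent: Set[Message], rcvd: Dict[int, Set[Message]]
) -> Tuple[Dict[str, int], Dict[int, Dict[str, int]]]:
    """Map message sets to per-type counters by dropping the sender."""
    sent_count = dict(Counter(m for m, _ in sent))
    rcvd_count = {p: dict(Counter(m for m, _ in msgs)) for p, msgs in rcvd.items()}
    return sent_count, rcvd_count


sent = {("M", 1), ("M", 2)}
# Two different message-passing configurations: process 3 has received
# either <M, 2> or <M, 1>, and nothing else has been delivered.
left = {1: set(), 2: set(), 3: {("M", 2)}}
right = {1: set(), 2: set(), 3: {("M", 1)}}

# Both are mapped to the same counters.
print(count_messages(sent, left) == count_messages(sent, right))  # prints True
```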
5 Precision of Message Counting with Time Constraints
While Theorem 3.4 establishes a strong equivalence, namely a bisimulation relation,
between message-passing and message-counting transition systems, we will show in
Theorem 5.1 that message-passing timed automata and message-counting timed
automata are not necessarily equivalent in the sense of timed bisimulation.
Remarkably, such automata are also not necessarily equivalent in the sense of
time-abstracting bisimulation. These results give an upper bound on the degree of
precision achievable by model checking of timed properties of FTDAs by counting
messages. Nevertheless, we show that such automata simulate each other, and thus
they satisfy the same ATCTL formulas (Corollary 5.10 and Corollary 6.2).
Theorem 5.1. There exists a timed design whose message-passing timed automaton
TAMP and message-counting timed automaton TAMC satisfy:
1. There is no initial timed bisimulation between TAMP and TAMC.
2. There is no initial time-abstracting bisimulation between TAMP and TAMC.
Proof (sketch). We give an example of a timed design proving Point 2. Since
timed bisimulation is a special case of time-abstracting bisimulation, this exam-
ple also proves Point 1.
We use the process template shown in Figure 4 on page 7. Formally, this
template is defined as follows: there is one parameter, i.e., Π = {n}, one message
type, i.e., MT = {M}, and two control states, i.e., L = {ℓ0, ℓ1}. There are two
types of transitions: $t^p_1 = (\ell_0, p, c_4, \ell_1)$ and $t^p_2 = (\ell_1, p, c_5, \ell_1)$. The
conditions c4 and c5 require that c4(M) = 0 and c5(M) ≥ 0 respectively. Every
process sends a message of type M when going from ℓ0 to ℓ1, i.e.,
is_sent(ℓ, M) = T iff ℓ = ℓ1. Then the processes self-loop in the control state ℓ1
(by doing so, they can receive messages from the other processes).
Fig. 5. Two runs of TAMP (above) and one run of TAMC (below) that violate
time-abstracting bisimulation when τ+ = 2τ−. Circles and edges illustrate states and
transitions. Edge labels are as follows: τ− or δi designate a time step with the respective
delay; i! 〈M, j〉 and i? 〈M, j〉 designate send and receive of a message 〈M, j〉 by process i
in the message-passing system; sent++ and rcvd(i)++ designate send and receive of a
message M by some process and process i respectively.
Consider the system of two correct processes and no Byzantine processes,
that is, Corr = {1, 2} and Byz = ∅. We fix the upper bound on message delays to
be τ+ = 2τ− > 0. For the sake of this proof, we set U = ∅, and thus IU and EU
are defined trivially. Together, these constraints define a timed design.
Figure 5 illustrates two runs of TAMP and a run of TAMC that should be
matched by a time-abstracting bisimulation, if one exists. We show by contradiction
that no such relation exists. Note that the message 〈M, 1〉 has been received
by all processes at the timed state q7 and has not been received by the first
process at the timed state q′′7. Thus the timed state q7 admits a time step, while the
timed state q′′7 does not. Indeed, on the one hand, the timed automaton TAMP can
advance the clocks by at most τ+ − τ− = τ− time units in q7 before the clock
attached to the message 〈M, 2〉 expires; on the other hand, in q′′7, the timed
automaton TAMP cannot advance the clocks, because the clock attached to the
message 〈M, 1〉, which is still in transit, would expire. However, both states must
be time-abstract related to the state r7 of TAMC, because in both of them the
processes have received the same number of messages of type M and thus their
labels coincide. This yields the required contradiction. Hence, there is no
time-abstracting bisimulation. □
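The two states used in this argument can be replayed numerically. The sketch below fixes τ− = 1 (hence τ+ = 2) and assumes the clock values read off Figure 5: 〈M, 1〉 is sent first, 〈M, 2〉 is sent τ− later, and all receptions happen after a further delay of τ−. The helper max_time_step is hypothetical and only evaluates the upper-bound conjunct of Definition 4.1.

```python
from typing import Dict, Set, Tuple

Message = Tuple[str, int]  # <m, p>: message type m sent by process p

TAU_MINUS = 1.0
TAU_PLUS = 2.0 * TAU_MINUS  # the proof fixes tau+ = 2 * tau-


def max_time_step(
    sent: Set[Message],
    rcvd: Dict[int, Set[Message]],
    clock: Dict[Message, float],
    tau_plus: float,
) -> float:
    """Largest delay that keeps every in-transit message at most tau_plus old."""
    received_by_all = set.intersection(*rcvd.values())
    in_transit = sent - received_by_all
    return min((tau_plus - clock[msg] for msg in in_transit), default=float("inf"))


sent = {("M", 1), ("M", 2)}
clock = {("M", 1): 2 * TAU_MINUS, ("M", 2): TAU_MINUS}  # assumed from Figure 5

rcvd_q7 = {1: {("M", 1)}, 2: {("M", 1), ("M", 2)}}    # process 1 received <M, 1>
rcvd_q7pp = {1: {("M", 2)}, 2: {("M", 1), ("M", 2)}}  # process 1 received <M, 2>

print(max_time_step(sent, rcvd_q7, clock, TAU_PLUS))    # 1.0, i.e. tau-
print(max_time_step(sent, rcvd_q7pp, clock, TAU_PLUS))  # 0.0, no time step

# Yet both states carry the same counters: process 1 received one message,
# process 2 received two.
print({p: len(r) for p, r in rcvd_q7.items()}
      == {p: len(r) for p, r in rcvd_q7pp.items()})     # True
```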
From Theorem 5.1, it follows that the message-counting abstraction is not precise
enough to preserve an equivalence relation as strong as bisimulation. However, for
abstraction-based model checking a coarser relation, namely, timed-simulation
equivalence, would be sufficient. In one direction, timed simulation is easy: a
discrete configuration of a message-passing timed automaton can be mapped
to the configuration of the message-counting timed automaton by just counting
the messages for each message type, while the clock assignments are kept the
same. The other direction is harder: a first approach would be to map a
configuration of a message-counting timed automaton to all the configurations of
the message-passing timed automaton in which the message counters are equal to
the cardinalities of the sets of received messages. This mapping is problematic
because of the interplay of message re-ordering and timing constraints, as
Example 5.2 shows.
Fig. 6. Receiving messages in order relaxes constraints of delay transitions
Example 5.2. Figure 6 exemplifies a problematic behavior that originates from
the interplay of message re-ordering and timing constraints on message delays.
In the figure we see the space-time diagram of two timed message passing runs,
where first process 1 sends 〈M, 1〉 at instant t1, and then process 2 sends 〈M, 2〉
at a later time t2 > t1. In the run on the left, process 3 receives 〈M, 2〉 at
instant t3 and has not received 〈M, 1〉 before. In the run on the right process 3
receives 〈M, 1〉 at instant t3. Hence, at t3 on the left 〈M, 1〉 is in transit, while
on the right 〈M, 2〉 is in transit, which has been sent after 〈M, 1〉. As indicated
by the τ+ intervals, due to the invariants from Definition 4.1[3c], the left run
is more restricted: on the left, within one time step the clocks can be advanced
by at most τ+ − (t3 − t1), while on the right the clocks can advance further, namely, by
τ+ − (t3 − t2) > τ+ − (t3 − t1). Message-counting timed automata abstract away
the origin of the messages and, intuitively, relate the sending of the ith message
to the reception of i messages, which corresponds to runs where messages are
received “in order”, like in the run on the right. We shall formalize this below.
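As a worked instance of this comparison, take the hypothetical values $t_1 = 0$, $t_2 = 1$, $t_3 = 2$ and $\tau^+ = 5$ (chosen only for illustration; they do not come from the figure). The admissible delays at instant $t_3$ are then:

```latex
\begin{align*}
\text{left run ($\langle M, 1\rangle$ in transit):}\quad
  & \tau^+ - (t_3 - t_1) = 5 - (2 - 0) = 3,\\
\text{right run ($\langle M, 2\rangle$ in transit):}\quad
  & \tau^+ - (t_3 - t_2) = 5 - (2 - 1) = 4.
\end{align*}
```

The difference between the two bounds is exactly $t_2 - t_1 = 1$, the time by which the sending of 〈M, 2〉 lags behind the sending of 〈M, 1〉.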
In the following, we exclude from the simulation relation those states where
an in-transit message has been sent before a received one, and only consider
so-called well-formed states where the messages are received in the chronological
order of the sending (according to the clocks of timed automata). Indeed, we use
the fact that the timing constraints of well-formed states in the message-passing
system match the timing constraints in the message-counting system.
Definition 5.3 (Well-formed state). For a message-passing timed automaton
TAMP with TS (TAMP) = (Q,Q0, ∆, λ), a state (s, µ, ν) ∈ Q is well-formed, if
for each message type m ∈ MT, each process p ∈ Corr that has received a message
〈m, p′〉 has also received all messages of type m sent earlier than 〈m, p′〉:
〈m, p′〉 ∈ s.rcvd(p) ∧ µ(c 〈m, p′′〉) > µ(c 〈m, p′〉)
→ 〈m, p′′〉 ∈ s.rcvd(p) for p′, p′′ ∈ Corr (1)
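Condition (1) can be read as an executable predicate. The sketch below is one possible encoding: the name is_well_formed and the data layout are assumptions, and the check only ranges over messages that have actually been sent, following the informal reading "sent earlier" of the definition.

```python
from typing import Dict, Set, Tuple

Message = Tuple[str, int]  # <m, p>: message type m sent by process p


def is_well_formed(
    sent: Set[Message],
    rcvd: Dict[int, Set[Message]],  # correct process p -> s.rcvd(p)
    clock: Dict[Message, float],    # value of clock c<m, p>
) -> bool:
    """Check condition (1): no received message overtakes an older one."""
    corr = set(rcvd)  # assumption: the keys of rcvd are exactly Corr
    for p in corr:
        for (m, p1) in rcvd[p]:
            if p1 not in corr:
                continue
            for (m2, p2) in sent:
                # A larger clock value means the message was sent earlier.
                sent_earlier = (
                    m2 == m and p2 in corr and clock[(m, p2)] > clock[(m, p1)]
                )
                if sent_earlier and (m, p2) not in rcvd[p]:
                    return False
    return True
```

With two messages 〈M, 1〉 and 〈M, 2〉 whose clocks hold 2 and 1, a process that has received only 〈M, 2〉 violates the condition, whereas a process that has received only 〈M, 1〉 satisfies it. If both clocks hold the same value, either choice is well-formed.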
[Figure 7 diagram: three groups of states (states of a message-passing automaton, well-formed states, states of a message-counting automaton), with labels Theorem 5.6, Corollary 5.8, and Corollary 5.10]
Fig. 7. Simulations constructed in Theorems 5.6–5.10. Small circles depict states of
the transition systems. An arrow from a state s to a state t illustrates that the pair
(s, t) belongs to a timed simulation.
Observe that because messages can be sent at precisely the same time, there
can be different well-formed states s and s′ with s.rcvd(p) ≠ s′.rcvd(p). Also,
considering only well-formed states does not imply that the messages are received
according to the sending order in a run (which would correspond to FIFO).
We will use a mapping WF to abstract arbitrary states of any message passing
timed automaton to sets of well-formed states in the same automaton.
Definition 5.4. Given a message-passing timed automaton TAMP with the transition
system TS (TAMP) = (Q,Q0, ∆, λ), we define a mapping WF : Q → 2^Q
that maps an automaton state (s, µ, ν) ∈ Q into a set of well-formed states with
each (s′, µ′, ν′) ∈WF((s, µ, ν)) having the following properties:
1. µ′ = µ, ν′ = ν, s′.sent = s.sent, and s.pc(p) = s′.pc(p) for p ∈ Corr, and
2. |{q : 〈m, q〉 ∈ s′.rcvd(p)}| = |{q : 〈m, q〉 ∈ s.rcvd(p)}| for m ∈ MT, p ∈ Corr.
One can show that every timed state q ∈ Q has at least one state in WF(q):
Proposition 5.5. Let TAMP be a message-passing timed automaton, and
TS (TAMP) = (Q,Q0, ∆, λ). For every state q ∈ Q, the set WF(q) is not empty.
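One way to see why WF(q) is never empty is to construct a member explicitly: for every process and message type, keep the number of received messages but replace them by the oldest sent messages of that type. The sketch below follows this idea. The name well_formed_variant is hypothetical, the sketch assumes that every received message is contained in sent, and it is an illustration rather than the proof of Proposition 5.5.

```python
from typing import Dict, Set, Tuple

Message = Tuple[str, int]  # <m, p>: message type m sent by process p


def well_formed_variant(
    sent: Set[Message],
    rcvd: Dict[int, Set[Message]],
    clock: Dict[Message, float],
) -> Dict[int, Set[Message]]:
    """Return received sets with the same per-type counts, oldest messages first."""
    result: Dict[int, Set[Message]] = {}
    for p, msgs in rcvd.items():
        chosen: Set[Message] = set()
        for m in {mtype for mtype, _ in msgs}:
            k = sum(1 for mtype, _ in msgs if mtype == m)
            # Larger clock value = sent earlier; ties are broken by sender id
            # only to make the result deterministic.
            oldest_first = sorted(
                (msg for msg in sent if msg[0] == m),
                key=lambda msg: (-clock[msg], msg[1]),
            )
            chosen.update(oldest_first[:k])
        result[p] = chosen
    return result
```

Clocks, sent messages, and program counters are left untouched, as required by the first condition of Definition 5.4; only the received sets change, and their cardinalities per message type are preserved.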
Using Proposition 5.5, one can show that the well-formed states simulate all
the timed states of a message-passing timed automaton:
Theorem 5.6. If TAMP is a message-passing timed automaton, and if
TS (TAMP) = (Q,Q0, ∆, λ), then {(q, r) : q ∈ Q, r ∈ WF(q)} is an initial timed
simulation.
Theorem 5.6 suggests that timed automata restricted to well-formed states
might help us in avoiding the negative result of Theorem 5.1. To this end, we
introduce a well-formed message-passing timed automaton. Before that, we note
that Equation (1) of Definition 5.3 can be transformed to a state invariant. We
denote such a state invariant as IWF.
Definition 5.7 (Well-formed MPTA). Given a message-passing timed au-
tomaton TAMP = (S, S0, R, L, U ∪X, I,E), its well-formed restriction TAMPWF is
the timed automaton (S, S0, R, L, U ∪X, I ∧ IWF, E).
Since the well-formed states are included in the set of timed states, and the
well-formed states simulate timed states (Theorem 5.6), we obtain the following:
Corollary 5.8. Let TAMP be a message-passing timed automaton and TAMPWF be
its well-formed restriction. These timed automata are timed-simulation equivalent:
TAMP ≃t TAMPWF.
As a consequence of Theorems 3.4, 5.6, and Corollary 5.8, one obtains that
there is a timed bisimulation equivalence between a well-formed message-passing
timed automaton and the corresponding message-counting timed automaton,
which is obtained by forgetting the sender of the messages and just counting the
sent and delivered messages.
Theorem 5.9. Let TAMP be a message-passing timed automaton and TAMC be
a message-counting timed automaton defined over the same timed system design.
Further, let TAMPWF be the well-formed restriction of TAMP. There exists an initial
timed bisimulation: TAMPWF ≈t TAMC.
Combining Theorem 5.9 and Corollary 5.8, we conclude that there is a
timed simulation equivalence between MPTA and MCTA:
Corollary 5.10. Let TAMP be a message-passing timed automaton and TAMC be
a message-counting timed automaton defined over the same timed system design.
TAMP and TAMC are timed-simulation equivalent: TAMP ≃t TAMC.
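The chain of relations behind this corollary can be written out as follows. The derivation relies only on two standard facts, which we state as assumptions of this sketch: a timed bisimulation and its inverse are both timed simulations, and timed simulations compose.

```latex
\begin{align*}
TA^{MP} &\simeq_t TA^{MP}_{WF}
  && \text{(Corollary 5.8)}\\
TA^{MP}_{WF} &\approx_t TA^{MC}
  && \text{(Theorem 5.9)}\\
\Rightarrow\quad TA^{MP}_{WF} &\simeq_t TA^{MC}
  && \text{(bisimulation implies simulation equivalence)}\\
\Rightarrow\quad TA^{MP} &\simeq_t TA^{MC}
  && \text{(transitivity of } \simeq_t \text{)}
\end{align*}
```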
Figure 7 uses arrows to depict the timed simulations presented in this work.
6 Conclusions
Asynchronous systems. For systems considered in Section 3, we conclude
from Theorem 3.4 that message-counting systems are detailed enough for model
checking of properties written in CTL*:
Corollary 6.1. For a CTL* formula ϕ, a message-passing system TSMP and a
message-counting system TSMC defined over the same design, TSMP |= ϕ if and
only if TSMC |= ϕ.
The corollary implies that the message counting abstraction does not intro-
duce spurious behavior. In contrast, data and counter abstractions introduced
in [22] may lead to spurious behavior as only simulation relations have been
shown for these abstractions.
Timed systems. For systems considered in Section 4, we consider specifications
in the temporal logic ATCTL [14], which restricts TCTL [6] as follows: first,
negations only appear next to propositions p ∈ AP ∪ Ψ(U), and second, the
temporal operators are restricted to AF∼c, AG∼c, and A U∼c.
To derive that message-counting timed automata are sufficiently precise for
model checking of ATCTL formulas (in the following corollary), we combine the
following results: (i) Simulation-equivalent systems satisfy the same formulas
of ACTL, e.g. see [11, Theorem 7.76]; (ii) Reduction of TCTL model checking
to CTL model checking by clock embedding [11, p. 706]; (iii) Corollary 5.10.
Corollary 6.2. For a message-passing timed automaton TAMP and a message-
counting timed automaton TAMC defined over the same timed design and an
ATCTL-formula ϕ, the following holds: TAMP |= ϕ if and only if TAMC |= ϕ.
Future work. Most of the timed specifications of interest for FTDAs (e.g.,
fault-tolerant clock synchronization algorithms [35, 39, 40]) are examples of time-
bounded specifications, thus belonging to the class of timed safety specifica-
tions. These algorithms can be encoded as message-passing timed automata
(Definition 4.1). In this paper, we have shown that model checking of these
algorithms can also be done at the level of message-counting timed automata
(Definition 4.2). Based on this it appears natural to apply the abstraction-based
parameterized model checking technique from [22]. However, we are still facing
the challenge of having a parameterized number of clocks in Definition 4.2. We
are currently working on another abstraction that addresses this issue. This will
eventually allow us to do parameterized model checking of timed fault-tolerant
distributed algorithms using UPPAAL [12] as back-end model checker.
Related work. As discussed in [23], while modeling message passing is natural
for fault-tolerant distributed algorithms (FTDAs), message counting scales bet-
ter for asynchronous systems, and also builds a basis for efficient parameterized
model checking techniques [22, 28]. We are interested in corresponding results for
timed systems, that is, our long-term research goal is to build a framework for
the automatic verification of timed properties of FTDAs. Such properties
are particularly relevant for the analysis of distributed clock synchronization
protocols [35, 39, 40]. This investigation combines two research areas: (i) verification
of FTDAs and (ii) parameterized model checking (PMC) of timed systems.
To the best of our knowledge, most of the existing literature on (i) can model
only the discrete behaviors of the algorithms themselves [38, 20, 22, 28, 18, 4, 5].
Consequently they can neither reason about nor verify their timed properties.
This motivated us to extend existing techniques for modeling and abstracting
FTDAs, such as message passing and message counting systems together with
the message counting abstraction, to timed systems.
Most of the results about PMC of timed systems [2, 31, 8, 10, 3, 1, 34, 9] are
restricted to systems whose interprocess communication primitives were designed
with systems other than FTDAs in mind. For instance, the local state space is fixed,
finite, and independent of the parameters, while message counting in FTDAs
requires that the local state space depends on the parameters. This motivated us
to introduce the notions of message passing timed automata and message count-
ing timed automata. Besides, the literature typically focuses on decidability, e.g.,
[3, 1, 34, 9] analyze decidability for different variants of the parameterized model
checking problem (e.g., integer vs. continuous time, safety vs. liveness, presence
vs. absence of controller). Our work focuses on establishing relations between dif-
ferent timed models, with the goal of using these relations for abstraction-based
model checking.
References
1. P.A. Abdulla, J. Deneux, and P. Mahata. Multi-clock timed networks. In LICS,
pages 345–354, 2004.
2. Parosh Aziz Abdulla, Frédéric Haziza, and Lukáš Holík. All for the price of few.
In VMCAI, pages 476–495, 2013.
3. Parosh Aziz Abdulla and Bengt Jonsson. Model checking of systems with many
identical timed processes. Theoretical Computer Science, 290(1):241–264, 2003.
4. Francesco Alberti, Silvio Ghilardi, Andrea Orsini, and Elena Pagani. Counter
abstractions in model checking of distributed broadcast algorithms: Some case
studies. In CILC, pages 102–117, 2016.
5. Francesco Alberti, Silvio Ghilardi, and Elena Pagani. Counting constraints in flat
array fragments. In IJCAR, pages 65–81, 2016.
6. Rajeev Alur, C. Courcoubetis, and D. Dill. Model-checking for real-time systems.
LICS, pages 414–425, 1990.
7. Rajeev Alur and David L. Dill. A theory of timed automata. Theoretical Computer
Science, 126(2):183–235, 1994.
8. Benjamin Aminof, Tomer Kotek, Sasha Rubin, Francesco Spegni, and Helmut
Veith. Parameterized Model Checking of Rendezvous Systems. CONCUR, 23(c):1–
16, 2014.
9. Benjamin Aminof, Sasha Rubin, Florian Zuleger, and Francesco Spegni. Liveness
of parameterized timed networks. In ATVA, pages 375–387, 2015.
10. Simon Außerlechner, Swen Jacobs, and Ayrat Khalimov. Tight cutoffs for guarded
protocols with fairness. In VMCAI, pages 476–494, 2016.
11. Christel Baier and Joost-Pieter Katoen. Principles of model checking. MIT Press,
2008.
12. Gerd Behrmann, Alexandre David, Kim Guldstrand Larsen, John Håkansson, Paul
Pettersson, Wang Yi, and Martijn Hendriks. UPPAAL 4.0. In QEST, pages 125–
126, 2006.
13. Gabriel Bracha and Sam Toueg. Asynchronous consensus and broadcast protocols.
J. ACM, 32(4):824–840, 1985.
14. Peter Bulychev, Thomas Chatain, Alexandre David, and Kim G Larsen. Efficient
on-the-fly algorithm for checking alternating timed simulation. In Formal Modeling
and Analysis of Timed Systems, pages 73–87. Springer, 2009.
15. Kārlis Čerāns. Decidability of bisimulation equivalences for parallel timer
processes. In CAV, volume 663 of LNCS, pages 302–315, 1993.
16. E. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, 1999.
17. Edmund Clarke, Orna Grumberg, Somesh Jha, Yuan Lu, and Helmut Veith.
Counterexample-guided abstraction refinement for symbolic model checking. J.
ACM, 50(5):752–794, 2003.
18. Cezara Drăgoi, Thomas A. Henzinger, Helmut Veith, Josef Widder, and Damien
Zufferey. A logic-based framework for verifying consensus algorithms, 2014.
19. Michael J. Fischer, Nancy A. Lynch, and M. S. Paterson. Impossibility of dis-
tributed consensus with one faulty process. J. ACM, 32(2):374–382, 1985.
20. Dana Fisman, Orna Kupferman, and Yoad Lustig. On verifying fault tolerance of
distributed protocols. In TACAS, volume 4963 of LNCS, pages 315–331. Springer,
2008.
21. Matthias Függer and Ulrich Schmid. Reconciling fault-tolerant distributed
computing and systems-on-chip. Distributed Computing, 24(6):323–355, 2012.
22. Annu John, Igor Konnov, Ulrich Schmid, Helmut Veith, and Josef Widder. Param-
eterized model checking of fault-tolerant distributed algorithms by abstraction. In
FMCAD, pages 201–209, 2013.
23. Annu John, Igor Konnov, Ulrich Schmid, Helmut Veith, and Josef Widder. To-
wards modeling and model checking fault-tolerant distributed algorithms. In SPIN,
volume 7976 of LNCS, pages 209–226, 2013.
24. Dilsun Kirli Kaynar, Nancy A. Lynch, Roberto Segala, and Frits W. Vaandrager.
The Theory of Timed I/O Automata. Synthesis Lectures on Computer Science.
Morgan & Claypool Publishers, 2006.
25. Igor Konnov, Marijana Lazić, Helmut Veith, and Josef Widder. A short
counterexample property for safety and liveness verification of fault-tolerant
distributed algorithms. In POPL, 2017. (to appear, preliminary version at
http://arxiv.org/abs/1608.05327).
26. Igor Konnov, Helmut Veith, and Josef Widder. On the completeness of bounded
model checking for threshold-based distributed algorithms: Reachability. In CON-
CUR, volume 8704 of LNCS, pages 125–140, 2014.
27. Igor Konnov, Helmut Veith, and Josef Widder. SMT and POR beat counter ab-
straction: Parameterized model checking of threshold-based distributed algorithms.
In CAV (Part I), volume 9206 of LNCS, pages 85–102, 2015.
28. Igor Konnov, Helmut Veith, and Josef Widder. What you always wanted to know
about model checking of fault-tolerant distributed algorithms. In PSI 2015, Revised
Selected Papers, volume 9609 of LNCS, pages 6–21. Springer, 2016.
29. Nancy A. Lynch and Frits W. Vaandrager. Forward and backward simulations for
timing-based systems. In Real-Time: Theory in Practice, REX Workshop, Mook,
The Netherlands, June 3-7, 1991, Proceedings, pages 397–446, 1991.
30. Achour Mostéfaoui, Eric Mourgaya, Philippe Raipin Parvédy, and Michel Raynal.
Evaluating the condition-based approach to solve consensus. In DSN, pages 541–
550, 2003.
31. Kedar S. Namjoshi and Richard J. Trefler. Uncovering symmetries in irregular
process networks. In VMCAI, pages 496–514, 2013.
32. Marshall Pease, Robert Shostak, and Leslie Lamport. Reaching agreement in the
presence of faults. J. ACM, 27(2):228–234, 1980.
33. Yee Jiun Song and Robbert van Renesse. Bosco: One-step Byzantine asynchronous
consensus. In DISC, volume 5218 of LNCS, pages 438–450, 2008.
34. Luca Spalazzi and Francesco Spegni. Parameterized Model-Checking of Timed
Systems with Conjunctive Guards. In Verified Software: Theories, Tools and Ex-
periments, pages 235–251. Springer, 2014.
35. T. K. Srikanth and Sam Toueg. Optimal clock synchronization. J. ACM, 34(3):626–
645, 1987.
36. T.K. Srikanth and Sam Toueg. Simulating authenticated broadcasts to derive
simple fault-tolerant algorithms. Distributed Computing, 2:80–94, 1987.
37. Stavros Tripakis and Sergio Yovine. Analysis of timed systems using time-
abstracting bisimulations. FMSD, 18:25–68, 2001.
38. Tatsuhiro Tsuchiya and André Schiper. Verification of consensus algorithms using
satisfiability solving. Distributed Computing, 23(5–6):341–358, 2011.
39. Josef Widder and Ulrich Schmid. Booting clock synchronization in partially syn-
chronous systems with hybrid process and link failures. Distributed Computing,
20(2):115–140, 2007.
40. Josef Widder and Ulrich Schmid. The Theta-Model: Achieving synchrony without
clocks. Distributed Computing, 22(1):29–47, April 2009.
