In this paper, we address the problem of applying SAT-based bounded model checking (BMC) and temporal k-induction to asynchronous concurrent systems. We investigate refinement checking in the process-algebraic setting of Communicating Sequential Processes (CSP), focusing on the CSP traces model which is sufficient for verifying safety properties. We adapt the BMC framework to the context of CSP and the existing refinement checker FDR yielding bounded refinement checking which also lays the foundation for tailoring the k-induction technique. As refinement checking reduces to checking for reverse containment of possible behaviours, we exploit the SAT-solver to decide bounded language inclusion as opposed to bounded reachability of error states, as in most existing model checkers. Due to the harder problem to decide and the presence of invisible silent actions in process algebras, the original syntactic translation of BMC to SAT cannot be applied directly and we adopt a semantic translation algorithm based on watchdog transformations. We propose a Boolean encoding of CSP processes resting on FDR's hybrid two-level approach for calculating the operational semantics using supercombinators. We have implemented a prototype tool, SymFDR, written in C++, which uses FDR as a shared library for manipulating CSP processes and the state-of-the-art incremental SAT-solver MiniSAT 2.0. Experiments with BMC indicate that in some cases, especially in complex combinatorial problems, SymFDR significantly outperforms FDR and even copes with problems that are beyond FDR's capabilities. SymFDR in k-induction mode works reasonably well for small test cases, but is inefficient for larger ones as the threshold becomes too large, due to concurrency.
Introduction
Model checking [CGP99; BK08; BHvMW09] is a powerful automatic formal verification technique for establishing correctness of systems. It requires a finite-state model of a system, capturing all its possible behaviours, and a specification property, usually modelled as a formula in some kind of temporal logic. The model checker performs analysis based on exhaustive exploration of the state space of the system to either confirm or refute that the system meets its specification. In the latter case, the model checker provides a counterexample trace for reproducing and fixing the bug. Model checking is complete and, therefore, reliable when pronouncing a system correct.
The main challenge in applying model checking in practice is the so-called state-space explosion problem which tends to be even more severe in the context of concurrent systems. The state-space of a concurrent system grows exponentially with the number of its concurrent components and the number and size of its data values. This puts restrictions on the size of systems that can be tractably analysed.
To alleviate the state-space explosion problem, a significant number of techniques have been proposed. Methods for decreasing the size of the generated state space and enhancing the model checking algorithm include CEGAR [CGJ+00], partial-order reductions [CGP99; Pel98], bounded model checking [BCCZ99], etc. Regarding state-space representation, the major dichotomy is between explicit and symbolic [BCM+92; BCCZ99] model checking. Explicit model checking is based on explicit enumeration and examination of individual states. Symbolic model checking relies on abstract representation of sets of states, generally as Boolean formulae, and properties are validated using techniques such as BDD manipulation or SAT-solving.
The recent advances of efficient SAT-solvers have significantly broadened the horizons of symbolic model checking. SAT-based bounded model checking [BCCZ99] has proven to be an extremely powerful technique, mainly suited, due to its incompleteness, to falsification of properties. Approaches for making BMC complete include calculating completeness thresholds [CKOS04; CKOS05] or augmenting BMC with k-induction [SSS00; ES03b] or Craig interpolation [McM03] techniques.
Both bounded and unbounded SAT-based model checking have been mainly investigated in the context of hardware and sequential software systems. In this paper, we address applying BMC and temporal k-induction [ES03b] to asynchronous concurrent systems.
The general problem we investigate is refinement checking in process-algebraic settings and, more specifically, in the context of CSP [Hoa85; Ros98; Ros10].
In process algebras, systems are modelled as interactions of a collection of processes, communicating with each other and with the outer world via message passing, as opposed to shared variables. Using a high-level language, processes are defined compositionally and compiled into a hierarchical structure, starting with atomic process constructs and combining those using operators such as choice, parallel and sequential composition, hiding, etc. This allows for a way of describing reactive systems that is usually very concise and much more economical in state space than shared-variable languages.
Unlike in conventional model checking, where specifications are generally defined as temporal-logic formulae, in process algebras specifications are defined as abstract designs of the systems, i.e. as processes, which allows for a step-wise development process. The refinement checking procedure decides whether the behaviours of the system are a subset of the behaviours of the specification, i.e. whether the system refines the specification. Hence, the verification problem reduces to checking for reverse containment of behaviours and, therefore, to reverse language inclusion.
Developed in the late 1970's by Hoare, CSP is one of the two original process algebras. It allows for the precise description and analysis of event-based concurrency. An advantage of the CSP framework is that it offers a well-developed syntax, algebraic and operational semantics, a hierarchy of congruent denotational semantic models, as well as a formal theory of refinement and compositional verification. In terms of syntax and semantics, among other differences with existing formalisms for modelling concurrent systems, CSP supports the usage of broadcast communication, recursion, as well as hiding and renaming of events, both of which are powerful mechanisms for abstraction.
FDR [Ros94; FSE05] is acknowledged as the primary tool for CSP refinement checking and has been widely used for analysing safety-critical systems. The core of FDR is refinement checking in each of the semantic models, which is carried out on the level of the operational representation of the CSP processes and is implemented using explicit state enumeration supplemented by hierarchical state-space compression techniques. When deciding whether an implementation process Impl refines a normalised specification process Spec, FDR follows algorithms exploring the Cartesian product of the state spaces of Spec and Impl in a way comparable to conventional model checking. Although until now FDR has followed the explicit model checking approach, there has been some work on the symbolic model checking of CSP resulting in the BDD-based refinement checker ARC [PY96] and the model checker PAT [SLDS08], both of which exploit a fully compositional encoding of CSP processes. PAT verifies systems defined in a version of CSP enhanced with shared variables and, within the BMC framework, it uses specifications defined as reachability properties on the values of the shared variables, which requires a different model checking algorithm based on reachability and not on language containment. This paper reports our attempts to integrate SAT-based BMC and temporal k-induction into FDR. The former technique is incomplete and as such is only suitable for detecting bugs. k-induction, however, is complete and can also be used for establishing the correctness of systems. Hence, to the best of our knowledge, this is the first attempt to apply unbounded SAT-based refinement checking to CSP. We propose an alternative Boolean encoding of CSP processes based on FDR's hybrid two-level approach for calculating the operational semantics using supercombinators [Gol04].
As we deal with a problem that reduces to language inclusion instead of to reachability, and due to the presence of invisible hidden actions in process algebras, the original syntactic translation of BMC to SAT cannot be applied directly and we adopt a semantic translation algorithm based on watchdog transformations [RGM+03]. Essentially, this involves reducing a refinement check to the analysis of a single process which is constructed by putting the implementation process in parallel with a transformed specification process. The latter plays the role of a watchdog that monitors and marks violating behaviours. Within the scope of this paper, we only consider the translation of trace refinement to SAT checking.
The result is a prototype tool SymFDR which, when combined with state-of-the-art SAT-solvers such as MiniSAT 2.0 [ES03a; EB05], sometimes outperforms FDR by a significant margin in finding counterexamples. We compare the performance of SymFDR with the performance of FDR, FDR used in a non-standard way, PAT [SLD08] and, in some cases, NuSMV [CCG+02], Alloy Analyzer [Jac06] and straight SAT encodings tailored to the specific problems under consideration.
The remainder of the paper is organised as follows. In Section 2, we set out the necessary background on CSP and FDR's two-level strategy for performing refinement checks. We briefly describe the ideas underlying SAT-based BMC and k-induction. In Section 3, we show how to adapt the watchdog approach [RGM+03] to BMC and, hence, to k-induction, while in Section 4, we summarise the methods we use to translate FDR's supercombinator representation of a state machine into input for a SAT-solver. Section 5 gives details of how SymFDR is built on top of this, and Section 6 offers experimental evaluation.
Preliminaries

CSP and FDR
In this section, we give a brief overview of CSP and FDR. The interested reader is referred to [Ros98], [Ros10] and [FSE05] for more details. We restrict our focus to the traces model exclusively, intentionally omitting information about other more expressive models of CSP.
CSP Syntax
In CSP, processes interact with each other and an external environment by communicating events. More than one process may have to cooperate in the performance of an event, i.e. handshake on it. It is standard to distinguish between visible events, which might need the cooperation of other processes or the environment, and invisible internal actions, which occur silently, are not observable or controllable outside a process and model an internal computation such as nondeterminism, unfolding of a recursion or abstraction of details.
Let $\Sigma$ be a finite alphabet of visible events with $\tau, \checkmark \notin \Sigma$. Here $\tau$ denotes the invisible silent action and $\checkmark$ denotes successful termination of a process, a special action which is visible but uncontrollable from outside and can only occur last. In what follows, we assume that $a \in \Sigma$, $A \subseteq \Sigma$ and $B \subseteq \Sigma^{\checkmark} = \Sigma \cup \{\checkmark\}$. $R \subseteq \Sigma \times \Sigma$ denotes a renaming relation on $\Sigma$. For a given process $P$, we denote by $\alpha P \subseteq \Sigma$ the set of all visible events that $P$ can perform. We recall the core syntax of CSP.
Definition 2.1. A CSP process is defined recursively via the following grammar:
$STOP$ represents a deadlocked process, i.e. a process that is not capable of communicating any visible or $\tau$ actions. The prefixed process $x : A \rightarrow P(x)$ initially offers the environment any event $a$ from $A$ and subsequently behaves like $P(a)$. $DIV$ denotes a livelock, i.e. a process that is engaged in performing an infinite loop of internal $\tau$ actions without ever communicating with the external environment. The process $SKIP$ denotes successful termination and is willing to perform $\checkmark$ at any time. $P_1 \mathbin{\Box} P_2$ and $P_1 \sqcap P_2$ denote, respectively, external and internal choice of $P_1$ and $P_2$. In the former case the choice is resolved by the environment, while in the latter it is resolved nondeterministically. The parallel composition $P_1 \parallel_B P_2$ can communicate an event from $B$ only if both $P_1$ and $P_2$ are ready to do so: $P_1$ and $P_2$ need to handshake (synchronise) on the events in $B$, but can perform all other events independently. In practice, it is common to synchronise $P_1$ and $P_2$ on their shared events, i.e. to use $B = \alpha P_1 \cap \alpha P_2$. The sequential composition $P_1 ; P_2$ behaves like $P_1$ until $P_1$ terminates successfully, at which point it silently evolves into $P_2$. $P \setminus A$ behaves like $P$ except that all events from $A$ are hidden, i.e. turned into internal $\tau$ actions. Hence, the $A$ events in $P$ become invisible and can no longer be controlled by other processes or the environment by means of synchronisation. $P[\![R]\!]$ behaves like $P$, except that, whenever $P$ can perform an event $a$, $P[\![R]\!]$ can perform any event $b$ such that $a \mathrel{R} b$. $\mu P \bullet F(P)$ denotes a recursive process.
FDR supports the language $CSP_M$ which extends core CSP with several further operators and an extensive functional language. Our prototype tool SymFDR supports the full $CSP_M$ syntax, except that it cannot at present handle scripts using the function chase.
Running Example: Milner's Scheduler
Given a fixed $N \in \mathbb{N}$, a scheduler must arrange two classes of events $a.i$ and $b.i$ for $i \in \{0, \ldots, N-1\}$. There are two requirements: the $a.i$'s should occur in strict rotation, i.e. $a.0, a.1, \ldots, a.(N-1), a.0, a.1, \ldots, a.(N-1), \ldots$, and there should be precisely one $b.i$ between each pair of $a.i$'s.
In CSP, Milner's scheduler can be modelled as a ring of cell processes synchronised using extra events $c.i$, as illustrated in Figure 1. An abstracted CSP script for Milner's scheduler establishing the rotation specification is presented in Figure 2. $i \oplus 1$ denotes $(i + 1) \bmod N$ and, likewise, $i \ominus 1$ denotes $(i - 1) \bmod N$. For each process $Cell(i)$, we extend its alphabet $\alpha Cell(i) = \{a.i, b.i\}$ with extra events $c.i$ and $c.i \oplus 1$ to use for synchronising with its neighbouring processes $Cell(i \ominus 1)$ and $Cell(i \oplus 1)$, respectively. $Cell(0)$ is defined in a slightly different way as the a-sequence should start with $a.0$. The scheduler is constructed by composing all the cells in parallel and hiding all c-events on top as they have been introduced solely for synchronisation purposes. Within $Scheduler$, the parallel operator corresponds to synchronising the two argument processes on the set of their common events and is fully associative. Hence, a cell can perform an event only if all other cells that have the same event in their alphabet are also offering to do so. {| c |} is a shorthand for {c.i | i ∈ N}.
We give a rough idea of how $Scheduler$ works and why it preserves the rotation specification (modelled as $Spec$). The only process that can initially perform an event is $Cell(0)$: for all $i > 0$, $Cell(i)$ is blocked as it needs $Cell(i - 1)$ to also offer $c.i$. After $Cell(0)$ communicates $a.0$, the only thing that can happen next is $Cell(0)$ and $Cell(1)$ synchronising on $c.1$, thus enabling $Cell(1)$ to perform $a.1$. Concerning the sequence of a's, the same process is repeated around the ring as, synchronising on $c.i \oplus 1$, $Cell(i)$ passes a token to $Cell(i \oplus 1)$ to signify that it is $Cell(i \oplus 1)$'s turn to contribute an $a.i \oplus 1$. The second requirement for the scheduler is captured as well, also in the most general way.
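The token-passing argument can be checked mechanically on a small ring. The following Python sketch is illustrative only: the cyclic event sequences in `cell` are an assumption (the actual script is the one in Figure 2, not reproduced here), and the function names are hypothetical. It composes the cells in alphabetised parallel, explores all executions up to a given depth and records the order in which the a-events occur.

```python
def cell(i, n):
    """Cyclic event sequence of Cell(i) in a ring of n >= 2 cells (assumed shape)."""
    nxt = (i + 1) % n
    if i == 0:
        # Cell(0) starts with a.0 so that the a-sequence begins at index 0.
        return ["a.0", f"c.{nxt}", "b.0", "c.0"]
    return [f"c.{i}", f"a.{i}", f"c.{nxt}", f"b.{i}"]


def successors(state, cells):
    """Yield (event, next_state) pairs of the alphabetised parallel composition."""
    offered = {cells[i][p] for i, p in enumerate(state)}
    for ev in sorted(offered):
        # Every cell whose alphabet contains ev must be ready to perform it.
        owners = [i for i, c in enumerate(cells) if ev in c]
        if all(cells[i][state[i]] == ev for i in owners):
            nxt = tuple((p + 1) % len(cells[i]) if i in owners else p
                        for i, p in enumerate(state))
            yield ev, nxt


def a_projections(n, depth):
    """Sequences of a-events seen along all executions of at most `depth` steps."""
    cells = [cell(i, n) for i in range(n)]
    frontier = {((0,) * n, ())}
    seen = {()}
    for _ in range(depth):
        new = set()
        for state, seq in frontier:
            for ev, nxt in successors(state, cells):
                new.add((nxt, seq + (ev,) if ev.startswith("a.") else seq))
        frontier = new
        seen |= {seq for _, seq in frontier}
    return seen
```

For example, with `n = 3` and `depth = 10`, every recorded sequence is a prefix of the rotation a.0, a.1, a.2, a.0, and so on, which is what the specification demands for executions of that length.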
Denotational Semantics
Traditionally, the primary means of understanding CSP descriptions has been to use denotational (behavioural) models, whereby a process is identified with the set of observable behaviours it can exhibit.
CSP supports a hierarchy of several such denotational semantic models. Different models describe different types of behaviours, providing more or less information about a process, with the natural trade-off between the amount of details recorded for a process and the complexity of working in the model. All denotational models are compositional in the sense that the denotational value (the set of possible behaviours) of each process can be computed in terms of the denotational values of its subcomponents. The value of a recursive process is obtained using standard fixed-point theory in the style of Scott and Strachey.
In the simplest of all models, the traces model, a process $P$ is identified with the set of its finite traces, denoted by $traces(P)$. Intuitively, a trace of a process is a sequence of visible actions that the process can perform. Naturally, the set of traces of a process is non-empty and prefix-closed. For example, going back to the scheduler described in Figure 2, a trace of $Cell(0)$ is any prefix of $\langle a.0, c.1, b.0, c.0 \rangle^{*}$. The traces model records information about what a process may do and is, therefore, sufficient for verifying safety properties.
There are two different approaches for obtaining the set traces(P ) -either by constructing it inductively from the traces of its subcomponents, or by extracting it from the operational representation. Refer to [Ros98] for the rules underlying the first approach. Just to give a flavour of it in terms of Milner's scheduler:
Since denotational values of processes are rather complex and often infinite, FDR calculates the behaviours of a process from its standard operational representation which is justified by standard congruence theorems, presented and proven in [Ros98; Ros88].
Operational Semantics
The operational semantics models CSP processes as labelled transition systems (LTS's), with nodes denoting processes and labels denoting visible events or $\tau$ actions. Since the LTS representation is not unique, in terms of the operational semantics, two processes are considered equivalent if they are strongly bisimilar [Ros98]. The operational semantics can generally be calculated by repeatedly applying a set of SOS-style inference rules, called firing rules. Firing rules provide recipes for constructing an LTS out of a CSP description of a process. The recipes define how processes can evolve by calculating the initial actions available at each node and the possible results after performing each action. The firing rules are presented below. We use an auxiliary process term $\Omega$ to denote any process that has already terminated successfully. If $F$ is a CSP term with a free process variable $X$ and $Q$ is a CSP process, $F[Q/X]$ represents the process obtained by substituting every free occurrence of $X$ in $F$ with $Q$. The last three rules reflect the fact that termination is distributed: $P_1 \parallel_A P_2$ terminates when both $P_1$ and $P_2$ do. The reader is referred to [Ros98] for more information.
Extracting Behaviours from Operational Semantics. We now present how behaviours, in our case traces, can be retrieved from the operational semantics of a process. Intuitively, a trace of a process is obtained by trimming the invisible $\tau$ actions from an execution of the LTS underlying the operational representation. Formally, a labelled transition system is a quadruple $M = \langle S, s_0, L, T \rangle$, where $S$ is a finite set of states, $s_0 \in S$ is the initial state, $L$ is a finite set of labels and $T \subseteq S \times L \times S$ is the transition relation. For convenience, we write $s \xrightarrow{l} s'$ instead of $(s, l, s') \in T$. Furthermore, we write s
Let $P$ be a finite-state process and $OS_P = \langle S_P, s_0^P, L_P = \Sigma^{\tau,\checkmark}, T_P \rangle$ be the LTS underlying the operational semantics of $P$. We write $\Sigma^{*\checkmark}$ to denote the set of finite words over $\Sigma$ which might end with $\checkmark$, and similarly, $(\Sigma^{\tau})^{*\checkmark}$. For $p, q \in S_P$, we use the following notation:
is the set of visible events that can be communicated from the state p.
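For $t \in \Sigma^{*\checkmark}$, we write $p \stackrel{t}{\Longrightarrow} q$ if there exists $t' \in (\Sigma^{\tau})^{*\checkmark}$ such that $p \xrightarrow{t'} q$ and $t = t' \upharpoonright \Sigma^{\checkmark}$, i.e. $t$ is $t'$ with all the $\tau$'s removed.
Then, we define $traces(P) = \{t \in \Sigma^{*\checkmark} \mid \exists q \in S_P .\ s_0^P \stackrel{t}{\Longrightarrow} q\}$.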
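As an illustration of this definition, the Python sketch below enumerates the traces of a small explicit LTS up to a bound on the number of visible events. The dictionary representation of the LTS, the label `"tau"` and the function name are assumptions made for the example; they are not taken from FDR.

```python
TAU = "tau"  # label standing for the invisible action


def bounded_traces(lts, start, max_len):
    """Traces with at most max_len visible events.

    lts maps a state to a list of (label, successor) pairs.
    """
    traces = set()
    seen = set()
    stack = [(start, ())]
    while stack:
        state, trace = stack.pop()
        if (state, trace) in seen:
            continue  # also cuts off tau-loops
        seen.add((state, trace))
        traces.add(trace)
        for label, nxt in lts.get(state, []):
            if label == TAU:
                # Invisible steps change the state but not the trace.
                stack.append((nxt, trace))
            elif len(trace) < max_len:
                stack.append((nxt, trace + (label,)))
    return traces


# A three-state process: a, then an internal step, then b back to the start.
example = {0: [("a", 1)], 1: [(TAU, 2)], 2: [("b", 0), (TAU, 1)]}
```

The search terminates because only finitely many (state, trace) pairs exist once the trace length is bounded, and the resulting set is prefix-closed by construction.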
Operational Representation in FDR -The Two-Level Approach
The SOS notation for operational semantics allows the creation of many operators that do not fit in the denotational world of CSP. Any CSP operator can be described using less general combinator operational rules instead and, conversely, any operator that can be given combinator operational semantics can be derived and given denotational semantics in CSP [Ros10]. Combinator-style semantics can be generalised to supercombinator operational semantics which is the one used in FDR. We give details about both combinator and supercombinator semantics below.
Combinator Rules. As with SOS, there are several combinator rules for each CSP operator and these allow us to infer the initial actions available at each process node out of its top-level operator and the initial actions available at its immediate process arguments. The crucial difference compared to SOS rules originates from the fact that process arguments can be viewed as switched on or off, depending on the context they are used in. Given a compound CSP process $P = \otimes(P_1, \ldots, P_n)$, a process argument $P_i$ is considered switched on if its initial actions are immediately relevant for the initial actions of $P$ and switched off if $\otimes$ does not need its initial actions to deduce the resulting initial action of $P$. For example:
- in $P_1 \mathbin{|||} P_2 = P_1 \parallel_{\emptyset} P_2$, both $P_1$ and $P_2$ are switched on;
- in $P_1 ; P_2$, $P_2$ is initially switched off until $P_1$ performs $\checkmark$, at which point $P_1$ becomes switched off and $P_2$ switched on;
- in $a \rightarrow P$, $P$ is initially switched off but gets switched on when $a$ is communicated;
- in $P_1 \sqcap P_2$, $P_1$ and $P_2$ are initially switched off as the nondeterministic choice is only resolved after a $\tau$ is performed, at which point precisely one of the two processes becomes activated.
Combinator rules keep track of which processes are switched on at every given moment and restrict SOS by allowing only two types of rules:
- rules enforcing that whenever a switched-on process argument performs a $\tau$, this is promoted to a $\tau$ of the compound process that does not change its structure;
- rules combining visible events of switched-on process arguments (if any) into a resulting action of the compound process. In those rules, a switched-on process can either participate with a visible event or not be involved at all, the latter of which we denote with the symbol $\cdot$.
Combinator rules also need to indicate the structure of the successor term. In many instances, the structure is the same as the initial one and so does not have to be mentioned explicitly in the rules. When the structure does change (i.e. processes become switched from on to off or conversely), this is indicated by a CSP term in which the various arguments of the operator may appear. In every case, the successor state contains the original argument if the latter has not participated in the action, or the state that the argument has moved to if it did. Now formally [Ros11], let $P$ be a compound process with a top-level operator $\otimes$, switched-on arguments $P_1, \ldots, P_n$ (for some $n \geq 0$) and switched-off arguments
Having any switched-on process argument $P_i$ that can go via a $\tau$ to a state $P_i'$, the $\tau$-promotion rule takes the form:
As this rule holds universally for any switched-on argument of any CSP operator, $\tau$-promotion rules do not need to be added explicitly to the combinator operational semantics, as they were in the SOS $\tau$-rules for external choice, sequential composition, hiding, renaming and parallel composition. Rules combining visible events take the general form $((x_1, \ldots, x_n), y, T)$, where
and $T$ is a piece of CSP syntax specifying the structure of the successor term. The idea is that whenever all $P_i$'s that have $x_i \neq \cdot$ (i.e. that take part in the action) can perform $x_i$ and go to states $P_i'$, they can synchronise to make the compound process $P$ perform $y$ and enter a state $T$. The successor state $T$ is either $\Omega$, if $y = \checkmark$, or is specified by an open CSP term in which the free variables are indices drawn from $\{1, \ldots, n\} \cup \{-\lambda \mid \lambda \in \Lambda\}$, which get substituted according to the following rules:
- For $i \in \{1, \ldots, n\}$, we distinguish different cases. If $x_i = \cdot$ or $x_i \in \Sigma$, $i$ is replaced by $P_i$ or $P_i'$, respectively. If $x_i = \checkmark$, $i$ does not appear in the successor term $T$ any more as $P_i$ becomes switched off.
- An index $-\lambda$ for $\lambda \in \Lambda$ indicates that the process $P_\lambda$ has become switched on and is replaced by $P_\lambda$.
We list the combinator rules below. In some of them, e.g. those for hiding and renaming, the structure of the successor term does not change, i.e. the resulting state is again of the form $\otimes(P_1, \ldots, P_n, Q)$, with each argument $P_i$ left unchanged if $x_i = \cdot$ and replaced by its successor $P_i'$ if $x_i \in \Sigma$. In those cases, we omit $T$ from the rules for simplicity. In rules SKIP ,
Supercombinator Operational Semantics. Combinator operational semantics captures precisely CSP-definable operators [Ros10; Ros11]. However, actions of compound processes need to be calculated recursively on-the-fly out of the actions of their subterms. Furthermore, successor states are presented as pieces of syntax, which is not efficient when analysing large systems. Supercombinator operational semantics is a less general but more efficient version of the combinator operational semantics [Ros10]. Supercombinator rules take the form of combinator ones, but are generalised to combine together actions of subprocesses nested under an arbitrary number of applications of CSP operators. As there is no combinator rule for recursion, the only constraint is that any process argument should be a closed CSP term, i.e. should have all the recursion unwound. Based on this assumption, all process arguments have combinator semantic rules which can be composed together to obtain rules for the outermost CSP operator. Hence, it transforms a combination of CSP operators into a single one. Furthermore, this can be implemented efficiently in a single run before the state-space exploration phase rather than on-the-fly when needed using recursive calls.
We illustrate the approach by example. Let $P = \otimes_1(\otimes_2(P_1, P_2), \otimes_3(P_3, P_4))$ and let us assume for simplicity that all of $\otimes_1$, $\otimes_2$ and $\otimes_3$ have their two arguments switched on and their application does not result in successor terms with a different structure (format). Considering the $\tau$-promotion rules, if $\otimes_2$ or $\otimes_3$ have a rule that generates a $\tau$, this $\tau$ gets promoted by $\otimes_1$. For instance, if $\otimes_2$ has the rule $((a, b), \tau)$, then we create a supercombinator rule $((a, b, \cdot, \cdot), \tau)$ for the compound process $\otimes_1$, where $\cdot$ marks a leaf that is not involved. The other type of supercombinator arises when we can match all input requirements of one of $\otimes_1$'s combinators using combinators of $\otimes_2$ and $\otimes_3$ that produce visible results. For example, if $\otimes_1$, $\otimes_2$ and $\otimes_3$ have the rules $((a, b), c)$, $((\cdot, a), a)$ and $((b, d), b)$, respectively, then the composition will have $((\cdot, a, b, d), c)$.
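The composition step in this example can be sketched in a few lines of Python. This is a simplified illustration under the same assumptions as the example (two levels of operators, all arguments switched on, a single format), not FDR's supercompiler; the rule representation, the string `"-"` for an uninvolved leaf and the function name are hypothetical.

```python
from itertools import product

IDLE = "-"   # marks a leaf that is not involved in a rule
TAU = "tau"


def supercombinators(outer, inner, arity):
    """Compose rules of a top-level operator with those of its arguments.

    outer:    rules (inputs, result) of the top-level operator
    inner[j]: rules of the operator forming the j-th argument
    arity[j]: number of leaves below the j-th argument
    """
    offsets = [sum(arity[:j]) for j in range(len(arity))]
    total = sum(arity)
    rules = set()
    # tau-promotion: an inner tau becomes a tau of the whole process.
    for j, inner_rules in enumerate(inner):
        for xs, y in inner_rules:
            if y == TAU:
                row = [IDLE] * total
                row[offsets[j]:offsets[j] + arity[j]] = xs
                rules.add((tuple(row), TAU))
    # Visible rules: match every input of an outer rule with an inner rule
    # producing exactly that event, or leave all its leaves idle.
    for xs, y in outer:
        options = []
        for j, x in enumerate(xs):
            if x == IDLE:
                options.append([(IDLE,) * arity[j]])
            else:
                options.append([ins for ins, out in inner[j] if out == x])
        for combo in product(*options):
            rules.add((sum(combo, ()), y))
    return rules


outer_rules = [(("a", "b"), "c")]
inner_rules = [[((IDLE, "a"), "a"), (("a", "b"), TAU)], [(("b", "d"), "b")]]
composed = supercombinators(outer_rules, inner_rules, [2, 2])
```

On the rules from the text, `composed` contains the visible rule with inputs (-, a, b, d) and result c, together with the promoted rule (a, b, -, -) yielding a tau.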
Supercompiling, the process of associating supercombinator-style operational semantics to a CSP process, follows a hybrid high-/low-level approach for calculation and representation [Ros10]. It identifies all true recursions and compiles them on a low level, generating explicit LTS's using the combinator rules. What remains for the high level are closed processes combined typically using parallel composition, hiding and renaming, although the dividing line is somewhat more complex and is drawn where sensible. For example, the choice operators and sequential composition can also be lifted to the high level as long as their arguments are all closed terms.
The result of supercompilation is a high-level structure which consists of two parts. The first one is a process tree whose leaves are low-level compiled LTS's and whose internal nodes are CSP operators compiled on the high level, usually hiding, renaming or parallel composition. Each node, even if internal, represents a process and can be interpreted as an LTS with its behaviours deducible on-the-fly from the behaviours of its children. The second part of the high-level structure is a set of supercombinators mapping actions of a number of leaf processes to an event-outcome of the composite root process [Ros98]. In what follows, we use the notions of supercombinators and rules interchangeably. We note that the list of leaf processes together with the set of supercombinators is a complete characterisation of the high-level process as the semantics of all CSP operators corresponding to the internal nodes in the process tree is captured by the supercombinator rules. The structured process tree can be used, though, for making the whole high-level process completely explicit.
The set of combinators is partitioned with respect to the existing formats -the different configurations of switched on and switched off leaf processes. In the worst case, the number of formats can be exponential in the number of leaves, but in practice this is rarely the case and quite often, there is just a single format, especially when composing processes in parallel on the top level.
Within a supercombinator, each process can participate with a visible event, a silent action $\tau$, or not be involved at all, the latter of which we again denote by $\cdot$. As with combinator rules, the supercompiler generates two types of rules [Ros98; Gol04; RRS+01]:
- a rule for each leaf process willing to perform a $\tau$, which promotes it to a $\tau$ action of the root process;
- rules using visible actions.
Note that the visible actions that the leaf processes perform need not be the same if hiding or renaming is involved in the combination being modelled. For example, if $P = a \rightarrow P$ and $Q = b \rightarrow Q$, then if $P$ performs $a$ and $Q$ performs $b$, $P \parallel_{\{a\}} Q[\![a/b]\!]$ can perform $a$, where $Q[\![a/b]\!]$ is the process $Q$ with the event $b$ being renamed to $a$. Hence, $((a, b), a)$ is a valid rule for the root process $P \parallel_{\{a\}} Q[\![a/b]\!]$ with leaves $P$ and $Q$.
Going back to our running example, after supercompiling Scheduler \ {| b |}, we obtain the process tree depicted on Figure 3 if Cell(i) and
Supercombinator operational representation can be considered an implicit LTS because it gives an initial state and sufficient information to calculate all the transitions of the system on-the-fly. Given a root high-level process, we refer to tuples of the current states of its leaf processes as configurations. When running the root process, FDR computes its initial actions by checking which supercombinators are enabled from the current configuration and the current format of the root. A supercombinator might be disabled if not all leaf processes are currently able to communicate the event that they are responsible for within the supercombinator. Hence, the operational semantics of the root process can be considered an implicit LTS, whose transitions can be switched on and off. The states are represented by a pair of a configuration and a format of the root. Transitions are modelled by supercombinators. For example, the supercombinator $((c.1, c.1, \cdot, \cdot, \cdot), \tau)$ (see Figure 3 (b)) would be enabled iff $Cell(0)$ is in its state $s_1$ and $Cell(1)$ is in its state $s_0$, independent of the current states of the other three cells. If this rule is enabled and the transition taken, $Cell(0)$ will go to state $s_2$, $Cell(1)$ will go to state $s_1$, the other three cells will not progress and $Scheduler \setminus \{| b |\}$ will perform a $\tau$.
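The enabledness check and the resulting step on a configuration can be pictured as follows. This Python sketch is only an illustration for a single format and deterministic leaves: the leaf LTS's built by `make_cell` are an assumed shape for the five cells (the real ones are those of Figure 3), and all names are hypothetical.

```python
IDLE = None  # marks a leaf that does not take part in a rule


def make_cell(i, n):
    """Leaf LTS of Cell(i) as {state: {event: next_state}} (assumed shape)."""
    nxt = (i + 1) % n
    seq = (["a.0", f"c.{nxt}", "b.0", "c.0"] if i == 0
           else [f"c.{i}", f"a.{i}", f"c.{nxt}", f"b.{i}"])
    return {s: {ev: (s + 1) % len(seq)} for s, ev in enumerate(seq)}


def enabled(rule, config, leaves):
    """A rule is enabled iff every involved leaf can perform its event."""
    inputs, _ = rule
    return all(x is IDLE or x in leaves[i][config[i]]
               for i, x in enumerate(inputs))


def fire(rule, config, leaves):
    """Return (root event, successor configuration) for an enabled rule."""
    inputs, result = rule
    successor = tuple(s if x is IDLE else leaves[i][s][x]
                      for i, (x, s) in enumerate(zip(inputs, config)))
    return result, successor


leaves = [make_cell(i, 5) for i in range(5)]
rule = (("c.1", "c.1", IDLE, IDLE, IDLE), "tau")
```

With these leaves, `rule` is disabled in the initial configuration `(0, 0, 0, 0, 0)` and enabled in `(1, 0, 0, 0, 0)`, from which firing it yields a tau of the root and the configuration `(2, 1, 0, 0, 0)`, matching the description above.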
To summarise, supercombinators can be viewed as implicit state-space representations. They are generated by mimicking the SOS or combinator rules, but yield more compact storage and more efficient algorithms. Therefore, FDR is most efficient when manipulating processes with relatively simple sequential leaves that are composed in parallel or subjected to hiding or renaming. Of course, high-level processes can be explicated, i.e. transformed into explicit LTS's, paying a potentially exponential price. This is quite logical as explication breaks down the hierarchical structure of a system composed of concurrent processes and makes it sequential.
Refinement Checking
Given two CSP processes $Spec$ and $Impl$, the refinement check $Spec \sqsubseteq Impl$ reduces to checking for reverse containment of possible behaviours. For the traces model, $Spec \sqsubseteq_T Impl$ iff $traces(Impl) \subseteq traces(Spec)$.
FDR carries out the refinement check on the level of the LTS representations
The algorithm is similar to the standard one for deciding language containment $L(A) \subseteq L(B)$ of nondeterministic automata $A$ and $B$, which reduces to checking whether $L(A) \cap \overline{L(B)} = \emptyset$ and requires that $B$ be determinised a priori. In a similar fashion, as a preprocessing step, FDR normalises $OS_{Spec}$, so that $OS_{Spec}$ reaches a unique state after any trace. The normalisation procedure requires as a precondition that $OS_{Spec}$ be explicated and therefore $Spec$ sequentialised. Essentially, the normalisation procedure transforms $OS_{Spec}$ into the unique equivalent $\tau$-free deterministic bisimulation-reduced LTS. We remark that any finite-state CSP process can be normalised, although potentially incurring an exponential blow-up.
FDR explores the state space breadth-first. Hence, if the property is violated, the counterexample generated is of minimal length.
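The containment check described above can be illustrated on small explicit LTSs. The following is a minimal sketch and not FDR's algorithm: it determinises the specification on the fly by subset construction instead of normalising it a priori, and the dictionary representation, the label `"tau"` and the function names are assumptions made for illustration only.

```python
from collections import deque

TAU = "tau"  # assumed label for the invisible action


def tau_closure(lts, states):
    """All states reachable from `states` using tau transitions only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for label, t in lts.get(s, []):
            if label == TAU and t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)


def find_trace_violation(spec, spec_init, impl, impl_init):
    """Return None if traces(impl) is contained in traces(spec),
    otherwise a trace of impl that spec cannot perform.
    An LTS is a dict: state -> list of (label, successor)."""
    start = (tau_closure(spec, {spec_init}), impl_init)
    queue, seen = deque([(start, [])]), {start}
    while queue:  # breadth-first exploration of the product
        (spec_set, i), trace = queue.popleft()
        for label, i2 in impl.get(i, []):
            if label == TAU:
                nxt, trace2 = (spec_set, i2), trace
            else:
                targets = {t for s in spec_set
                           for l, t in spec.get(s, []) if l == label}
                if not targets:  # spec cannot follow: refinement fails
                    return trace + [label]
                nxt, trace2 = (tau_closure(spec, targets), i2), trace + [label]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace2))
    return None


spec = {0: [("a", 1)], 1: [("b", 0)]}
impl_ok = {0: [("a", 1)], 1: [(TAU, 2)], 2: [("b", 0)]}
impl_bad = {0: [("a", 1)], 1: [("c", 2)]}
```

Here `find_trace_violation(spec, 0, impl_ok, 0)` reports no violation, whereas `impl_bad` yields the offending trace `["a", "c"]`.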
SAT-Based Model Checking Techniques
In this section, we give a brief summary of SAT-based bounded model checking [BCCZ99] and temporal k-induction [ES03b].
Bounded Model Checking
Bounded model checking is a sound but generally incomplete technique that focuses on searching for counterexamples of bounded length only. The underlying idea is to fix a bound k and unwind the implementation model for k steps, thus considering behaviours and counterexamples of length at most k. In practice, BMC is conducted iteratively by progressively increasing k until one of the following happens: (1) a counterexample is detected, (2) k reaches a precomputed threshold called completeness threshold [CKOS04; CKOS05], which indicates that the model satisfies the specification, or (3) the model checking instance becomes intractable.
Different notions of completeness threshold exist, mainly based on the properties of the underlying graph of the system, e.g. diameter (the longest shortest path between any two states), recurrence diameter (the longest simple path between any two states), forward and backward radius versions of both, the size of the state space, etc. [BCCZ99; CKOS04; CKOS05; BHvMW09]. A simple path is a path along which all states are different and, in general, the recurrence diameter of a graph can be arbitrarily longer than its diameter (the same holding for radii). For example, a clique of size n has diameter 1, while its recurrence diameter is n − 1. We remark that this problem is exacerbated when modelling concurrent systems due to the exponential blow-up of the state space.
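The gap between the two measures can be checked on the clique example. The sketch below is illustrative only: it computes both quantities by brute force on an explicit adjacency-list graph, which is feasible only for very small graphs, and the function names are assumptions.

```python
from collections import deque


def diameter(adj):
    """Longest shortest path between any two mutually reachable states."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


def recurrence_diameter(adj):
    """Length (in edges) of the longest simple path, by exhaustive search."""
    best = 0

    def extend(path, visited):
        nonlocal best
        best = max(best, len(path) - 1)
        for v in adj[path[-1]]:
            if v not in visited:
                visited.add(v)
                path.append(v)
                extend(path, visited)
                path.pop()
                visited.remove(v)

    for s in adj:
        extend([s], {s})
    return best


def clique(n):
    """Complete graph on n vertices as an adjacency list."""
    return {u: [v for v in range(n) if v != u] for u in range(n)}
```

For `clique(5)` the diameter is 1 and the recurrence diameter is 4, matching the n − 1 stated above.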
It is important to note that without knowing or reaching a completeness threshold, the BMC procedure is incomplete since we do not know at what step it is correct to stop iterating and declare that the system preserves the desired property. Therefore, BMC is mostly suitable for detecting bugs rather than for full verification, i.e. proving the absence of bugs.
The problem with completeness thresholds is two-fold. On one hand, calculating the exact completeness threshold can be as hard as the model checking problem itself [CKOS05] and, therefore, sound overapproximations of it are usually used in practice. On the other hand, in some cases those overapproximations can be too large to handle efficiently.
SAT-based BMC [BCCZ99] reduces the model checking problem to a propositional satisfiability problem. The idea is to construct at each step k a Boolean formula which is satisfiable if and only if there is a counterexample of length k. This formula is fed into a SAT-solver which decides the model checking problem in question and produces a counterexample, if any. Due to the DFS-nature of the SAT decision procedure, this technique allows for fast detection of counterexamples. Moreover, due to the iterative nature of the BMC framework, the counterexample generated is of minimal length.
In the original syntactic [BCCZ99] and the subsequent semantic [CKOS04; CKOS05] translation of BMC to SAT, the implementation is modelled by a Kripke structure $M$ and verified against a specification $f$ defined as an LTL formula. The BMC instance at each step $k$ is translated to a Boolean formula
$$\varphi_k = [\![M]\!]_k \wedge [\![\neg f]\!]_k .$$
Generally, having a Boolean encoding of the state space (e.g. a binary or a one-hot encoding [KB05]), the Kripke structure $M$ can be represented symbolically by a pair of Boolean functions $\langle I(s), T(s, s') \rangle$ defined as the characteristic functions of the set of initial states and the transition relation, respectively. We use $s$ and $s'$ as shorthand for the vectors of Boolean variables necessary for encoding states of $M$. We replicate a separate copy of state variables $s_i$ for each time step $i$. Then
$$[\![M]\!]_k = I(s_0) \wedge \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1})$$
(see Figure 4). We illustrate the structure of the entire formula $\varphi_k$ with a simple example in case $f = \mathbf{G}\,p$, where $p$ represents a state predicate with Boolean encoding $P$:
$$\varphi_k = I(s_0) \wedge \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1}) \wedge \bigvee_{i=0}^{k} \neg P(s_i).$$
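To make the iterative unrolling for an invariant concrete, the following sketch searches, for increasing k, for an assignment to the state vectors of a path of k steps that starts in an initial state, follows the transition relation and violates P somewhere. It is a toy stand-in under stated assumptions: exhaustive enumeration replaces the SAT-solver call, and the two-bit counter, `value`, and the predicate names are hypothetical.

```python
from itertools import product


def bmc(I, T, P, nbits, max_k):
    """Return (k, path) for the smallest k <= max_k admitting a path
    s_0 .. s_k with I(s_0), T(s_i, s_i+1) for all i, and not P(s_i)
    for some i; return None if no such path exists up to max_k."""
    states = list(product([False, True], repeat=nbits))
    for k in range(max_k + 1):
        # Enumerating all candidate paths stands in for one SAT query.
        for path in product(states, repeat=k + 1):
            if (I(path[0])
                    and all(T(path[i], path[i + 1]) for i in range(k))
                    and any(not P(s) for s in path)):
                return k, path
    return None


def value(s):
    """Interpret a pair of bits as a number between 0 and 3."""
    return 2 * s[0] + s[1]


# Hypothetical model: a counter modulo 4; the property claims 3 is never reached.
init = lambda s: value(s) == 0
step = lambda s, t: value(t) == (value(s) + 1) % 4
never_three = lambda s: value(s) != 3
```

Calling `bmc(init, step, never_three, 2, 5)` finds the violation at k = 3, along the path with values 0, 1, 2, 3.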
Temporal k-Induction
Temporal k-induction [SSS00; ES03b] is a complete SAT-based technique for verifying safety properties. As opposed to BMC, it can also be used for establishing correctness of systems. Given a model $\langle I(s), T(s, s') \rangle$ and a safety property $P(s)$, the method checks if all reachable states of the model preserve P. k-induction builds upon BMC and is also conducted iteratively, as presented in Algorithm 1. It provides two conditions for termination in case the property is not violated: k-inductiveness of P for some k ∈ N, or reaching the backward recurrence radius of the model with respect to P. The property P is k-inductive if it can be proven that, whenever P holds along all paths of the system of length k, it cannot be violated on a path of length k + 1. The backward recurrence radius is the length of the longest simple path from any state to a state violating P and is a valid completeness threshold.
For each step k, the temporal induction proof consists of two parts: a base case and an induction step. The base case $Base_k$ is similar to a BMC instance: we check if, starting from an initial state, there is a path of length k that violates P (see Figure 5). In the base case, we assume that we have already checked all base cases of shorter length and strengthen the BMC instance by stating that P holds along all initial paths of length up to k − 1. If the base case is satisfiable, we have found a counterexample. Otherwise, we proceed with the induction step $Step_k$, which is designed to prove that P is k-inductive. The induction step is strengthened and made complete by a constraint $Simple_k$, which requires the states along the path to be pairwise distinct.
We remark again that, in many cases, the backward radius of the model (the longest shortest path from any state to a state violating P) can be considerably smaller than its backward recurrence radius. However, the translation of shortest paths between two states to SAT involves many existential quantifiers and is better suited to a QBF engine than to a SAT-solver.
As we are dealing with safety properties, we can also carry out the k-induction algorithm backwards, starting from states that violate P and trying to prove that initial states are never reachable. This can be implemented by redefining $Base_k$ and $Step_k$ as depicted in Figure 6. This algorithm guarantees termination upon reaching the forward recurrence radius, i.e. the length of the longest simple path to any state starting from an initial state.
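The interplay of base case, induction step and simple-path constraint in the forward variant can be sketched as follows. This is an illustration only, not the "Zig-Zag" or "Dual" algorithm: exhaustive enumeration replaces the two SAT queries, and the three-bit model `SUCC`, in which the bad state 7 is reachable only from the unreachable chain 5, 6, is a hypothetical example.

```python
from itertools import product


def k_induction(I, T, P, nbits, max_k):
    """Forward temporal induction with a simple-path constraint.
    Returns ("counterexample", k, path), ("proved", k, None) or
    ("unknown", max_k, None)."""
    states = list(product([False, True], repeat=nbits))

    def is_path(p):
        return all(T(p[i], p[i + 1]) for i in range(len(p) - 1))

    def p_then_bad(p):
        # P holds on every state except the last, which violates it.
        return all(P(s) for s in p[:-1]) and not P(p[-1])

    for k in range(max_k + 1):
        # Base_k: an initial path of k steps whose last state violates P.
        for p in product(states, repeat=k + 1):
            if I(p[0]) and is_path(p) and p_then_bad(p):
                return ("counterexample", k, p)
        # Step_k: a simple path of k + 1 steps from arbitrary states,
        # with P on the first k + 1 states and violated on the last.
        step_sat = any(is_path(p) and len(set(p)) == len(p) and p_then_bad(p)
                       for p in product(states, repeat=k + 2))
        if not step_sat:
            return ("proved", k, None)
    return ("unknown", max_k, None)


def val3(s):
    """Interpret a triple of bits as a number between 0 and 7."""
    return 4 * s[0] + 2 * s[1] + s[2]


# Hypothetical model: reachable cycle 0 -> 1 -> 2 -> 0; state 7 is bad.
SUCC = {0: 1, 1: 2, 2: 0, 3: 3, 4: 4, 5: 6, 6: 7, 7: 7}
init3 = lambda s: val3(s) == 0
step3 = lambda s, t: val3(t) == SUCC[val3(s)]
not_seven = lambda s: val3(s) != 7
```

On this model the property is not 0- or 1-inductive, because of the unreachable paths 6, 7 and 5, 6, 7, but the induction step becomes unsatisfiable at k = 2, so `k_induction(init3, step3, not_seven, 3, 5)` reports the property as proved at step 2.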
Bounded Trace Refinement Framework
In this section, we present our iterative bounded refinement checking algorithm. Our approach for establishing trace refinement is based on watchdog transformations [RGM+03]. Our objective is the following. We are given two CSP processes Spec and Impl and an integer k. We aim at checking whether $Spec \sqsubseteq_T^k Impl$, i.e. whether all executions of the implementation of length at most k agree with the specification. Similarly to BMC and k-induction, we carry out the analysis on the level of the operational representation of Spec and Impl. We point out that executions of length k can correspond to traces of smaller length if τ actions are interleaved within them, as defined in Section 2.1.4.
Challenges
As the LTSs underlying the operational semantics of processes are event-based models, we also need to handle events in our encoding. Let $OS_{Spec} = \langle I_s(s), T_s(s, l, s') \rangle$ and $OS_{Impl} = \langle I_i(t), T_i(t, l, t') \rangle$ be the models of Spec and Impl, respectively. At first glance, the most natural approach for encoding bounded execution refinement would be to try to directly mirror the original translation of BMC to SAT. We would need to similarly construct the Boolean formula $\varphi_k$ as a conjunction of two formulae to model all executions of Impl of length $k$ that are not executions of Spec: $\varphi_k = [\![OS_{Impl}]\!]_k \wedge \neg [\![OS_{Spec}]\!]_k$. Hence, we would be looking for an instantiation of the vectors of Boolean variables $l_1, \ldots, l_k$ such that
$$[\![OS_{Impl}]\!]_k = I_i(t_0) \wedge \bigwedge_{j=0}^{k-1} T_i(t_j, l_{j+1}, t_{j+1})$$
is satisfiable and
$$[\![OS_{Spec}]\!]_k = I_s(s_0) \wedge \bigwedge_{j=0}^{k-1} T_s(s_j, l_{j+1}, s_{j+1})$$
is not. Due to the implicit universal quantification of $s_0, \ldots, s_k$ in the unsatisfiability check of $[\![OS_{Spec}]\!]_k$, this analysis is better suited to a QBF engine. Using a SAT-solver in this case would mean that we would need to enumerate all possible satisfying assignments of $l_1, \ldots, l_k$ in $[\![OS_{Impl}]\!]_k$ and to prove the unsatisfiability of $[\![OS_{Spec}]\!]_k$ over each one of them.
Furthermore, invisible τ actions can be arbitrarily interleaved in the executions of Impl and, therefore, syntactically different executions can produce semantically equivalent traces. This can lead to reporting spurious counterexamples. To illustrate this, consider the executions ⟨a, τ, τ, b, τ, c⟩ of Impl and ⟨a, b, c⟩ of Spec as depicted in Figure 7 (b). Even though they correspond to the same trace ⟨a, b, c⟩, they do not match pointwise and ⟨a, τ, τ, b, τ, c⟩ would be falsely reported as a violation of Spec. However, keeping track of the possible sequences of τ actions stuttered in between visible events does not seem to be trivial or computationally justifiable on the level of Boolean functions.
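The mismatch between pointwise comparison of executions and comparison of traces can be seen in a few lines. This is a trivial illustration; the string `"tau"` is an assumed representation of the invisible action.

```python
TAU = "tau"  # assumed representation of the invisible action


def trace_of(execution):
    """Project an execution onto its visible events."""
    return [event for event in execution if event != TAU]


impl_execution = ["a", TAU, TAU, "b", TAU, "c"]
spec_execution = ["a", "b", "c"]

# A step-by-step comparison reports a difference ...
pointwise_equal = impl_execution == spec_execution
# ... although both executions give rise to the same trace.
same_trace = trace_of(impl_execution) == trace_of(spec_execution)
```

Here `pointwise_equal` is false while `same_trace` is true, which is exactly the source of the spurious counterexamples described above.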
The Watchdog Approach
As explained in Section 2.1.6, FDR performs the refinement check by normalising the specification and looking for the existence of behaviours that the implementation allows and the specification does not.
As an alternative, the watchdog approach [RGM + 03; Ros10] reduces the refinement check to analysing a single process constructed by composing the implementation in parallel with a transformed specification process. The latter plays the role of a watchdog that monitors the implementation and flags any behaviours that are considered violating with respect to the specification.
In our setting, using watchdog transformations allows us to reduce bounded execution containment to bounded reachability, which is already amenable to SAT. The watchdog transformation phase is performed by means of FDR.
Preprocessing Phase Using FDR
Our implementation is intended as an alternative back-end for FDR, orthogonal to the standard explicit method of performing trace refinement. Currently, we use a shared library version of FDR for manipulating CSP processes and we mimic FDR up to the point of the final state-space exploration phase. Therefore, SymFDR reuses FDR's compiler and supercompiler and the data structures underlying the hybrid two-level operational representation of processes, consisting of a process tree and a set of supercombinators, as defined in Section 2.1.5.
At present, we use FDR to supercompile and normalise Spec and to retrieve OS Spec representing the operational semantics of Spec.
Without loss of generality, we assume that the implementation Impl comprises the interaction of c sequential processes $P_1, \ldots, P_c$ running in parallel, possibly using hiding, renaming or other CSP operators apart from recursion. We write $Impl = P_1 \parallel P_2 \parallel \ldots \parallel P_c$ to denote a high-level process Impl with leaf processes $P_1, \ldots, P_c$, as defined in Section 2.1.5. This form of representing concurrent systems after supercompilation is not a limitation and, as mentioned in Section 2.1.1, we can handle the entire $CSP_M$ syntax and functionality apart from the function chase. We use FDR to supercompile Impl and to retrieve both the set of supercombinators and the set $\{OS_{P_i} \mid i \in \{1, \ldots, c\}\}$.
Watchdog Bounded Refinement-Checking Algorithm
In a nutshell, the main steps of our algorithm are the following:
1. We transform Spec into a process Watchdog which allows the behaviours of both Spec and Impl and, in fact, many others, but marks those that do not conform to Spec. The transformation is carried out on the level of the LTS and not on the higher CSP description of Spec. It is most easily defined if the specification process is normalised so that it reaches a unique state after following any trace. The LTS of Watchdog is then obtained as an extension of $OS_{Spec}$: we add a fresh state sink and make $OS_{Spec}$ total with respect to the alphabet $\alpha_{Spec} \cup \alpha_{Impl}$ by directing all non-existing transitions to sink. Formally, writing $T_s$ for the transition relation of $OS_{Spec}$, the transition relation $T_w$ of Watchdog contains $T_s$ together with a transition $(s, a, sink)$ for every state $s$ of $OS_{Spec}$ and every event $a \in \alpha_{Spec} \cup \alpha_{Impl}$ for which $s$ has no outgoing $a$-transition in $T_s$. The resulting process Watchdog operationally passes through sink whenever executing a trace that is not allowed by Spec.
2. We construct a process $Refinement = Watchdog \parallel_{\alpha_{Impl} \cup \alpha_{Spec}} Impl = Watchdog \parallel_{\alpha_{Impl} \cup \alpha_{Spec}} (P_1 \parallel P_2 \parallel \ldots \parallel P_c)$. Refinement captures precisely the behaviours of Impl, but those behaviours that do not conform to Spec force Watchdog to bark, i.e. pass through its sink state. Hence, Refinement can be used as an indicator of whether Impl can behave in a way incompatible with Spec. Watchdog becomes just one of the sequential leaf processes of Refinement. It is evident then that:
   (a) $Spec \sqsubseteq_T Impl$ ⇐⇒ Watchdog never reaches its sink state in any execution of Refinement.
   (b) All executions of Refinement forcing Watchdog to pass through its sink state constitute valid counterexamples of the assertion $Spec \sqsubseteq_T Impl$.
3. We check whether Watchdog can reach its sink state within k steps of the execution of Refinement.
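The three steps above can be sketched on explicit LTSs as follows. This is a minimal illustration rather than the SymFDR implementation: step 3 is carried out by explicit forward exploration instead of a SAT query, the specification is assumed to be already normalised (deterministic and τ-free), letting sink absorb all events is an assumption of this sketch, and the data representation and function names are hypothetical.

```python
def make_watchdog(spec, alphabet, sink="sink"):
    """Step 1: make a normalised spec total by redirecting every
    missing transition to a fresh sink state.
    spec: dict state -> dict event -> successor state."""
    wd = {s: {a: spec[s].get(a, sink) for a in alphabet} for s in spec}
    wd[sink] = {a: sink for a in alphabet}  # assumption: sink is absorbing
    return wd


def sink_reachable_within(wd, wd_init, impl, impl_init, k,
                          sink="sink", tau="tau"):
    """Steps 2 and 3: run Watchdog in lockstep with Impl on all visible
    events and report whether sink is reached within k steps.
    impl: dict state -> list of (label, successor)."""
    frontier = {(wd_init, impl_init)}
    for _ in range(k):
        nxt = set()
        for w, i in frontier:
            for label, i2 in impl.get(i, []):
                # Watchdog ignores tau and synchronises on everything else.
                w2 = w if label == tau else wd[w][label]
                if w2 == sink:
                    return True
                nxt.add((w2, i2))
        frontier = nxt
    return False


spec = {0: {"a": 1}, 1: {"b": 0}}
watchdog = make_watchdog(spec, {"a", "b", "c"})
impl = {0: [("a", 1)], 1: [("tau", 2)], 2: [("c", 0)]}
```

In this example the violating execution a, τ, c has three steps, so the check fails for k = 2 and succeeds for k = 3.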
Boolean Encoding of CSP Processes
In this section we present our encoding of CSP processes into Boolean formulae. First, we show how to encode sequential or explicated processes, corresponding to leaf processes in the operational representation. Then, we show how to glue together sequential processes with supercombinators to obtain an encoding of a high-level process. In what follows, we call a high-level process a concurrent system.
To illustrate the Boolean encoding in this section, we use the following notation: $[\![X]\!](Vars)$ denotes the Boolean encoding of $X$ with respect to the vector(s) of Boolean variables $Vars$.
Generally, to encode a finite set of elements $S$, we use an injective mapping $enc_S : S \to \{0, 1\}^m$ to associate each element $s \in S$ with a unique bit vector $b = (b_1, \ldots, b_m)$ of certain size $m$. To represent elements of $S$ as Boolean functions, we introduce an ordered vector of $m$ distinct Boolean variables $x = (x_1, \ldots, x_m)$. Each variable $x_i$ uniquely identifies the corresponding bit $b_i$ and for each $s \in S$, $[\![s]\!](x)|_{x=b} = 1$ iff $enc_S(s) = b$. Typically, binary or one-hot [KB05] encodings of sets are used in practice. The basic idea behind binary encoding is to enumerate the elements of $S$ in binary notation and represent them as Boolean functions over $m = \lceil \log_2 |S| \rceil$ Boolean variables. In one-hot encoding, each $s \in S$ is represented by a bit vector of size $|S|$ in which precisely one bit is set to 1. To illustrate the two encodings, let us consider the set $S = \{s_0, s_1, s_2, s_3\}$. If we consider the binary encoding $s_0 = (00)$, $s_1 = (01)$, $s_2 = (10)$, $s_3 = (11)$, a vector of just two variables $x = (x_0, x_1)$ suffices and $[\![s_1]\!](x) = \neg x_0 \wedge x_1$, for example. For the one-hot encoding $s_0 = (1000)$, $s_1 = (0100)$, $s_2 = (0010)$, $s_3 = (0001)$, we need a vector $x = (x_0, x_1, x_2, x_3)$ of four variables and $[\![s_1]\!](x) = \neg x_0 \wedge x_1 \wedge \neg x_2 \wedge \neg x_3$. Alternatively, we can use $[\![s_1]\!](x) = x_1$, but in this case we need to add global constraints enforcing that precisely one bit is set to true at a given time instance. Those constraints can be expressed by a formula of size linear in $|S|$.
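The two encodings of the four-element example can be generated mechanically. The sketch below is illustrative; the function names are assumptions, and elements are numbered in list order.

```python
def binary_encoding(elements):
    """Map each element to its index written in binary,
    most significant bit first, over ceil(log2 |S|) bits."""
    m = max(1, (len(elements) - 1).bit_length())
    return {e: tuple((i >> (m - 1 - j)) & 1 for j in range(m))
            for i, e in enumerate(elements)}


def one_hot_encoding(elements):
    """Map the i-th element to a vector of |S| bits with only bit i set."""
    n = len(elements)
    return {e: tuple(1 if j == i else 0 for j in range(n))
            for i, e in enumerate(elements)}


def characteristic(code):
    """Boolean function that is true exactly on the bit vector `code`."""
    return lambda x: tuple(x) == tuple(code)


S = ["s0", "s1", "s2", "s3"]
binary = binary_encoding(S)      # s1 is encoded as (0, 1)
one_hot = one_hot_encoding(S)    # s1 is encoded as (0, 1, 0, 0)
is_s1 = characteristic(binary["s1"])
```

The function `is_s1` plays the role of the characteristic function of $s_1$ under the binary encoding: it holds exactly for the assignment $x_0 = 0$, $x_1 = 1$.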
Encoding a Sequential Process
As explained in Section 3.3, for each sequential leaf process $P$, we obtain the explicit operational representation of $P$ using FDR. Let $OS_P = \langle S, s_0, L = \Sigma^{\tau,\checkmark}, T \rangle$ be the LTS associated with the finite-state leaf process $P$ communicating over a finite alphabet of events $\Sigma$. Using either binary or one-hot encoding of sets, we introduce vectors of Boolean variables $x$ and $y$ for encoding the set of states $S$ and the set of labels $L$, respectively. We define $[\![I]\!](x) = [\![s_0]\!](x)$.
In order to represent the transition relation $T$, we employ a copy $x'$ of $x$: $x$ serves to represent the source states of transitions and $x'$ the destination states. Then, for $t = (s_{src}, l, s_{dest}) \in T$, $[\![t]\!](x, y, x') = [\![s_{src}]\!](x) \wedge [\![l]\!](y) \wedge [\![s_{dest}]\!](x')$. For any $s \in S$, we write $[\![s]\!](x')$ to denote $[\![s]\!](x)[x \leftarrow x']$, i.e. we represent $s$ with respect to the variables $x$ and then substitute the variables $x$ with $x'$. The encoding of the entire transition relation is the following: $[\![T]\!](x, y, x') = \bigvee_{t \in T} [\![t]\!](x, y, x')$.
We can now represent a sequential process $P$ implicitly by a pair of Boolean functions $\langle [\![T_P]\!](x, y, x'), [\![I_P]\!](x) \rangle$. For a given integer $k$, we define $Paths(P, k)$ to be the set of all executions $s_0 \xrightarrow{l_1} s_1 \xrightarrow{l_2} \ldots \xrightarrow{l_k} s_k$ of $OS_P$ of length $k$. In order to represent $Paths(P, k)$ symbolically, we replicate $(k + 1)$ vectors of Boolean variables $x_0, x_1, \ldots, x_k$ for encoding the states $s_0, s_1, \ldots, s_k$ and $k$ vectors of Boolean variables $y_1, y_2, \ldots, y_k$ for the corresponding transitions $l_1, \ldots, l_k$. Then
$$[\![Paths(P, k)]\!](x_0, \ldots, x_k, y_1, \ldots, y_k) = [\![I_P]\!](x_0) \wedge \bigwedge_{i=0}^{k-1} [\![T_P]\!](x_i, y_{i+1}, x_{i+1}).$$
Encoding a Concurrent System
In the setting of FDR, after supercompilation we can view a concurrent system as a high-level process identified by a process tree and a set of supercombinators. Since a high-level root process can be modelled as an LTS, we now show how to encode a concurrent system similarly to a low-level sequential process. In what follows, we denote by $Sys(c) = \langle P_1, \ldots, P_c, SC \rangle$ the high-level process characterised by a set of supercombinator rules $SC$ and $c$ explicitly compiled leaf processes $P_1, \ldots, P_c$ communicating over sets of events $\Sigma_1, \ldots, \Sigma_c$, respectively. We define $\Sigma = \bigcup_{i=1}^{c} \Sigma_i$.
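Encoding the Sequential Leaf Processes. For each $i \in \{1, \ldots, c\}$, we retrieve the explicit LTS representation $OS_i = \langle S_i, s_0^i, L_i = \Sigma_i^{\tau,\checkmark}, T_i \rangle$ of the leaf $P_i$ from FDR. Since $\Sigma_i \subseteq \Sigma$, we actually consider $L_i = \Sigma^{\tau,\checkmark}$. Following the ideas from the previous Section 4.1, we introduce vectors of Boolean variables $x_i$, $x_i'$ and $y_i$ to generate the symbolic representation $\langle [\![T_i]\!](x_i, y_i, x_i'), [\![I_i]\!](x_i) \rangle$.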
Hence, each process has its own set of variables for representing the alphabet $\Sigma^{\tau,\checkmark}$. We further introduce an additional vector of Boolean variables $y$ for encoding the resulting action of the entire system because, due to the presence of hiding and renaming, it might be different from the contributions of the leaf processes, as illustrated in Section 2.1.5. In case the system violates the specification, we also use the assignment of the variables from $y$ to generate a counterexample trace.
Encoding Configurations of the Concurrent System. Recall that, at every time instance, the state of the entire high-level system, also called a configuration, is identified by the current states of its sequential leaf components. Formally, the set of states of the system is a $c$-ary relation $S \subseteq S_1 \times \ldots \times S_c$, the initial state being $s_0 = (s_0^1, \ldots, s_0^c)$. A configuration $s = (s^1, \ldots, s^c)$ is encoded as $[\![s]\!](x_1, \ldots, x_c) = \bigwedge_{i=1}^{c} [\![s^i]\!](x_i)$. For clarity, we denote the set of states of the overall system by Configurations.
Supercombinators and Formats. As we mentioned in Section 2.1.5, supercombinators are rules for combining together actions of the individual sequential leaf processes into event-outcomes of the overall system [Ros98]. Within a supercombinator, each process can participate with a visible event, a silent action τ, or not be involved at all. Non-involvement is denoted by a dedicated symbol, distinct from τ and from all visible events, with which we extend any alphabet $\Sigma$ where needed. In addition, the set of supercombinators is partitioned into existing formats, i.e. different configurations of switched-on and switched-off processes among $P_1, \ldots, P_c$, which we denote by Formats.
Formally, the set of supercombinators can be represented as a $(c + 3)$-ary relation $SC$, where $(f_{src}, a_1, \ldots, a_c, a, f_{dest}) \in SC$ iff, from a certain configuration and a certain format $f_{src}$ of the overall system, $P_1$ performs $a_1$, ..., $P_c$ performs $a_c$ and the overall system performs $a$, switching to a format $f_{dest}$.
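The operational semantics of the concurrent system can be considered an implicit LTS, whose transitions can be switched on and off:
- set of states: Formats × Configurations
- set of labels: SC
- transition relation: T ⊆ (Formats × Configurations) × SC × (Formats × Configurations).
If the system is in a given configuration and in a given format, the transition relations of the individual processes determine if the labels are switched on or off. Formally, for $sc = (f_{src}, a_1, \ldots, a_c, a, f_{dest}) \in SC$, we have $((f_{src}, (s^1, \ldots, s^c)), sc, (f_{dest}, (s'^1, \ldots, s'^c))) \in T$ iff, for every $i \in \{1, \ldots, c\}$, either $P_i$ is not involved in $sc$ and $s'^i = s^i$, or $(s^i, a_i, s'^i) \in T_i$.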
Encoding Supercombinators. For a given rule $sc = (f_{src}, a_1, \ldots, a_c, a, f_{dest}) \in SC$, let $Passive(sc) = \{i \in \{1, \ldots, c\} \mid P_i \text{ is not involved in } sc\}$. Let $u = (u_1, \ldots, u_c)$ be a vector of (supercombinator-independent) Boolean variables. We denote:
$$[\![a_i]\!]_{sc}(y_i, u_i) = \begin{cases} u_i & \text{if } i \in Passive(sc) \\ \neg u_i \wedge [\![a_i]\!](y_i) & \text{otherwise.} \end{cases}$$
Note that a process might be switched on in a format and still be passive in a certain supercombinator in this format. Hence, we cannot use the format to conclude which processes are passive in a supercombinator.
Let $f$ and $f'$ be two vectors of Boolean variables for encoding the source and destination format of a rule. Let $sc = (f_{src}, a_1, \ldots, a_c, a, f_{dest}) \in SC$. Then
$$[\![sc]\!](y_1, \ldots, y_c, y, u, f, f') = [\![f_{src}]\!](f) \wedge \bigwedge_{i=1}^{c} [\![a_i]\!]_{sc}(y_i, u_i) \wedge [\![a]\!](y) \wedge [\![f_{dest}]\!](f').$$
Hence, in an encoding of a supercombinator, we indicate a passive process $P_i$ just by affirming a single Boolean variable $u_i$. We call $u_i$ a trigger. For non-passive processes, we also encode the event that the process performs. The encoding of all supercombinators in all formats now becomes the following: $[\![SC]\!](y_1, \ldots, y_c, y, u, f, f') = \bigvee_{sc \in SC} [\![sc]\!](y_1, \ldots, y_c, y, u, f, f')$.
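Encoding a Transition of the Concurrent System. Let, for $i \in \{1, \ldots, c\}$,
$$\psi_i = \begin{cases} x_i = x_i' & \text{if } u_i \\ [\![T_i]\!](x_i, y_i, x_i') & \text{otherwise,} \end{cases}$$
where $x_i = x_i'$ is short for $\bigwedge_{j} (x_{i,j} \leftrightarrow x_{i,j}')$, the conjunction ranging over the individual bits of $x_i$. The intuition behind $\psi_i$ is that, if $P_i$ does not participate in a transition of the entire system, i.e. $P_i$ is not involved in a supercombinator, $P_i$ remains in the same state within its own labelled transition system $OS_i$. Otherwise, $P_i$ progresses with respect to its transition relation $T_i$. Expressed as a Boolean formula,
$$\psi_i = (u_i \wedge (x_i = x_i')) \vee (\neg u_i \wedge [\![T_i]\!](x_i, y_i, x_i')).$$
We define a predicate $T_{Sys(c)}$ which is true exactly for the transitions of the overall system:
$$[\![T_{Sys(c)}]\!](x_1, \ldots, x_c, y_1, \ldots, y_c, y, u, f, x_1', \ldots, x_c', f') = [\![SC]\!](y_1, \ldots, y_c, y, u, f, f') \wedge \bigwedge_{i=1}^{c} \psi_i.$$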
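Encoding Fixed Length Executions of the Concurrent System. Within the BMC framework, let $k$ be the maximal bound for the length of the counterexamples we are looking for. Every time step receives its own copy of the variables: $x^i_0, \ldots, x^i_k$, $y^i_1, \ldots, y^i_k$ and $u^i_1, \ldots, u^i_k$ for each leaf $P_i$, $y_1, \ldots, y_k$ for the resulting events of the system, and $f_0, \ldots, f_k$ for the formats. Then
$$[\![Paths(Sys(c), k)]\!] = [\![Init]\!](x^1_0, \ldots, x^c_0, f_0) \wedge \bigwedge_{j=0}^{k-1} [\![T_{Sys(c)}]\!](x^1_j, \ldots, x^c_j, y^1_{j+1}, \ldots, y^c_{j+1}, y_{j+1}, u_{j+1}, f_j, x^1_{j+1}, \ldots, x^c_{j+1}, f_{j+1}),$$
where $[\![Init]\!]$ encodes the initial configuration and the initial format, and $u_{j+1} = (u^1_{j+1}, \ldots, u^c_{j+1})$.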
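The way supercombinators switch transitions of the overall system on and off can also be sketched explicitly. The following is a simplified illustration, not the Boolean encoding itself: it assumes a single format, represents non-involvement by the hypothetical marker `PASSIVE`, and uses a two-cell example with hypothetical state and event names in which the internal event c is hidden.

```python
PASSIVE = None  # hypothetical marker for "process not involved in the rule"


def successors(leaves, rules, config):
    """Enabled transitions of the overall system from `config`.
    leaves: one dict per leaf, state -> list of (event, next_state)
    rules:  list of (events, result), where events holds one entry per
            leaf (an event or PASSIVE) and result is the system's event
    Returns a list of (result_event, next_config)."""
    out = []
    for events, result in rules:
        options = [[]]
        for i, ev in enumerate(events):
            if ev is PASSIVE:
                choices = [config[i]]  # passive leaves keep their state
            else:
                choices = [t for e, t in leaves[i].get(config[i], []) if e == ev]
            # No choices for an active leaf disables the whole rule.
            options = [o + [c] for o in options for c in choices]
        out.extend((result, tuple(o)) for o in options)
    return out


cell0 = {"s0": [("a", "s1")], "s1": [("c", "s0")]}
cell1 = {"s0": [("c", "s1")], "s1": [("b", "s0")]}
rules = [
    (("a", PASSIVE), "a"),   # cell0 inputs
    (("c", "c"), "tau"),     # hidden hand-over between the cells
    ((PASSIVE, "b"), "b"),   # cell1 outputs
]
```

From the configuration `("s1", "s0")` only the hand-over rule is enabled, so `successors([cell0, cell1], rules, ("s1", "s0"))` contains the single transition labelled `"tau"` leading to `("s0", "s1")`.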
Implementation Details
Our prototype tool SymFDR is written in C++ and uses FDR as a shared library for manipulating CSP processes. The current implementation of SymFDR supports refinement checking of systems with a single format only. However, we do not anticipate any problems generalising the approach to a multi-format setting. Moreover, most practical cases are also single-format or can be easily rewritten in this form.
BMC
In our BMC framework, we have three modes of state space traversal: forward (starting from the initial state), backward (starting from an error state) and simultaneous forward/backward mode.
In the original version of BMC, the system is unwound step by step until the bound k is reached. Despite the recent advances in SAT-solvers' learning capabilities and incremental SAT-solving, we have observed that the bottleneck of the bounded refinement procedure is the SAT-solver. Therefore, we allow unfolding a configurable number i of steps of the process Refinement before running the SAT-solver. The SAT-solver is then used to check if Refinement can pass through the sink state in any of its last i unwindings. If so, we have found a counterexample, otherwise we continue iterating until reaching the configured bound k. We refer to the value of i as SAT-frequency. We believe that this multi-step approach works well because the SAT-solver typically finds it much easier to find a satisfying assignment, if there is any, than to prove unsatisfiability, given CNF formulae with comparable size and structure. Hence, we trade off the shortness of the reported counterexample for efficiency.
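The trade-off behind the SAT-frequency can be illustrated with a small control loop. In this sketch the oracle `violation_at` is a hypothetical stand-in for unwinding Refinement and querying the SAT-solver; only the windowing logic is shown.

```python
def multi_step_bmc(violation_at, k, sat_frequency):
    """Unwind `sat_frequency` steps between solver calls, up to bound k.
    violation_at(step) is a hypothetical oracle telling whether the sink
    can be reached in exactly `step` steps.
    Returns (window, solver_calls), where window is the range of steps
    in which a violation was detected, or None if none was found."""
    solver_calls = 0
    lo = 1
    while lo <= k:
        hi = min(lo + sat_frequency - 1, k)
        solver_calls += 1  # one query covers the unwindings lo .. hi
        if any(violation_at(step) for step in range(lo, hi + 1)):
            return (lo, hi), solver_calls
        lo = hi + 1
    return None, solver_calls


shortest_violation = lambda step: step == 7  # hypothetical system
```

With a bound of 20, a SAT-frequency of 5 finds the violation with two solver calls but only localises it to steps 6 to 10, whereas a SAT-frequency of 1 needs seven calls and pinpoints step 7. This mirrors the loss of minimality of the reported counterexample in exchange for fewer queries.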
k-induction
We have implemented the "Zig-Zag" and "Dual" temporal k-induction algorithms [ES03b] . The difference is that the "Dual" algorithm makes use of separate SAT-solvers for the base case and the induction step, aiming to optimise the incremental SAT interface. For k-induction, SymFDR supports both forward and backward traversal, yielding four algorithms in total: "Zig-Zag" forward, "Zig-Zag" backward, "Dual" forward and "Dual" backward.
SAT
SymFDR supports both binary and one-hot encoding of state spaces, though we have observed that for our test cases binary encoding scales much better. One-hot encoding usually yields CNF instances with a smaller number of clauses but a substantially higher number of Boolean variables, which seems to burden the SAT-solver. We construct the Boolean formulae directly in negation normal form and subsequently transform them into equisatisfiable formulae in CNF using the optimised one-sided Tseitin encoding [BKWW08; BHvMW09].
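The idea of a one-sided translation of negation normal form into CNF can be sketched generically. The code below is a textbook-style polarity-based variant under stated assumptions: it does not reproduce the optimisations of [BKWW08], the tuple representation of formulae is hypothetical, and satisfiability is checked by brute force purely to validate the translation on tiny inputs.

```python
from itertools import product


def one_sided_cnf(formula):
    """Translate an NNF formula into an equisatisfiable CNF.
    Formulae are nested tuples: ("var", name), ("not", ("var", name)),
    ("and", f1, f2, ...) or ("or", f1, f2, ...).
    Since every gate of an NNF formula occurs positively, only the
    implication gate -> definition is emitted for each gate.
    Returns (clauses, number_of_variables); literals are signed ints."""
    ids, clauses = {}, []

    def fresh(key):
        ids[key] = len(ids) + 1
        return ids[key]

    def lit(f):
        kind = f[0]
        if kind == "var":
            return ids.get(f[1]) or fresh(f[1])
        if kind == "not":  # in NNF, negation is applied to variables only
            return -lit(f[1])
        subs = [lit(g) for g in f[1:]]
        gate = fresh(("aux", len(ids)))
        if kind == "and":
            clauses.extend([[-gate, s] for s in subs])   # gate -> each conjunct
        else:
            clauses.append([-gate] + subs)               # gate -> some disjunct
        return gate

    clauses.append([lit(formula)])  # assert the root
    return clauses, len(ids)


def satisfiable(clauses, nvars):
    """Brute-force satisfiability check, for validation only."""
    return any(all(any((l > 0) == bits[abs(l) - 1] for l in clause)
                   for clause in clauses)
               for bits in product([False, True], repeat=nvars))


a, b = ("var", "a"), ("var", "b")
contradiction = ("and", ("or", a, b), ("not", a), ("not", b))
consistent = ("or", a, ("and", b, ("not", a)))
```

Compared with the two-sided Tseitin translation, each gate contributes only the clauses for one direction of its definition, which is sufficient for equisatisfiability because the formula is in negation normal form.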
Currently, SymFDR supports MiniSAT 2.0, PicoSAT 846 and ZChaff, all working in incremental mode. For our test cases, we have found MiniSAT to be the most efficient and all quoted results use MiniSAT. We exploit MiniSAT's incremental interface in a way similar to [ES03b]. For our larger test cases, we also observed that MiniSAT finds a counterexample faster if we configure it to keep a smaller number of learned clauses (learntsize_factor = 0.2, learntsize_inc = 1.02) and restart more frequently (restart_inc = 1.1). We also implemented adding learned unit clauses explicitly, as suggested in [ES03a], grouped into conjunctions of several such clauses. Using positive polarity in decision heuristics also produced much better results, as did freezing and then unfreezing state and format variables at each step to avoid variable elimination.
SymFDR also supports strategies for restricting the decision variables to the input ones [Sht00] , incorporating PicoSAT's restarting scheme and phase saving strategy [Bie08] in MiniSAT, etc.
Experimental Results
In this section, we investigate the performance of SymFDR on a small number of case studies. We compare it to the performance of FDR 2.83, FDR used in a non-standard way, PAT 3.2.2 [SLD08], and, in some cases, direct SAT encodings, NuSMV 2 [CCG+02] and Alloy Analyzer 4.1.10 [Jac06]. All SAT-based experiments use MiniSAT, although SymFDR and the direct SAT encoder build upon MiniSAT version 2.0, while Alloy and NuSMV exploit the earlier version 1.14. All tests were performed on a 2.6 GHz PC with 2 GB RAM running Linux, except the test marked with a *, which was performed on a 4 GB RAM PC running Linux.
Tools We Compare Against
FDR-Div. The main search strategy for FDR is BFS [Ros94] because this has the combined advantages of always finding a shortest counterexample and of enabling implementations that work comparatively well on virtual memory. However, the strategy for discovering divergences is based on DFS. In test cases where it is likely that there are a good number of counterexamples, but that all of them occur comparatively deep in the BFS, there is good reason to use a bounded DFS (BDFS) algorithm to search for them, so that only error states reachable in less than some fixed number N of steps are reached. BDFS will quickly get to the depth where counterexamples are expected without needing to enumerate all of the levels where they are not. Provided that the counterexamples have something like a uniform distribution through the order in which the DFS discovers them, we can expect one to be found after searching through approximately S/(C + 1) states, where S is the total number of states and C is the number of counterexamples.
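The estimate S/(C + 1) can be checked against an exact calculation under one idealising assumption, namely that the C counterexample states occupy uniformly random positions in the order in which the DFS visits the S states. The function name below is hypothetical.

```python
from fractions import Fraction
from math import comb


def expected_first_hit(S, C):
    """Exact expected position of the first of C counterexample states
    when they are placed uniformly at random among S visited states."""
    # The first counterexample is at position p iff one is at p and the
    # remaining C - 1 lie among the S - p later positions.
    return sum(Fraction(p * comb(S - p, C - 1), comb(S, C))
               for p in range(1, S - C + 2))
```

The exact value is (S + 1)/(C + 1), e.g. `expected_first_hit(100, 4)` equals 101/5, which agrees with the approximation S/(C + 1) for large S.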
FDR does not implement such a strategy directly. It was, however, observed a number of years ago by Roscoe and James Heather that it is possible to use a trick that achieves the same effect using the present version of the tool. That is, arrange (perhaps using a watchdog) a system P′ that performs only up to N events of the target implementation process P and then performs an infinite number of some indicator event when a trace specification is breached. Provided P is itself divergence-free, we then have that P′ \ Σ can diverge precisely when P violates the specification. FDR searches for this divergence by DFS.
This approach is particularly well suited to CSP codings of puzzles, since it is frequently known ab initio how long a counterexample will be, and the usual CSP coding uses the repeatable event done to indicate that the puzzle has been solved. The columns labelled FDR-Div in Table 1 and Table 2 report on the result of using this technique. In several ways this method is more similar to the approach of PAT and SymFDR than to the usual FDR approach. As is apparent from the experiments, there seems to be a large element of luck in how fast this approach is, possibly based on how close the path followed by the DFS is to a counterexample.
PAT. PAT [SLD08] is a model checker of a version of CSP enhanced with shared variables. Despite the BMC attempt [SLDS08] , PAT is at present a fully explicit checker. In addition to LTL model checking, PAT supports CSP refinement checking which it performs in a way similar to FDR although using DFS (instead of BFS), normalisation of the specification on-the-fly, partial-order reductions, counter abstraction, symmetry reduction, etc. In the test cases quoted here, the specification is given as a reachability property on the values of the shared variables, as modelled in the benchmarks available with the tool. The reachability algorithm is based on DFS and state hashing is applied for compact state-space representation.
NuSMV. NuSMV [CCG+02] is a symbolic model checker verifying SMV models against CTL properties using BDDs. The BMC framework of NuSMV, which we refer to as NuSMV-BMC, uses specifications written in LTL.
Alloy Analyzer. Alloy Analyzer [Jac06] is a fully automatic tool for finding models of software systems designed in the lightweight Alloy modelling language. Alloy Analyzer could be considered a BMC checker because it searches for a model only up to a certain scope and generates the model, if one exists, using SAT-solving techniques.
Direct SAT Encodings. We believe that experimenting with direct SAT encodings of problems will offer guidance for optimising the translation of CSP to logic. For example, the chess knight test case suggests that a shorter chain of inference for high-level actions might be beneficial.
Test Cases
Instances with Counterexamples
In this section, we consider test cases with counterexamples and therefore exploit the BMC framework. The results are summarised in Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6. The last column of each table lists the length of the counterexamples.
First, we consider the peg solitaire puzzle [Ros98], performing experiments on a chain of soluble boards with increasing level of difficulty. In the initial configuration, the board has all slots but one occupied by pegs. The only allowed move in the game is a peg hopping over another peg and landing on an empty slot. The hopped-over peg is then removed from the board. The objective of the game is to end up with a board with a single peg positioned on the slot which had been initially empty. The length of any solution of the puzzle is exactly the number N of pegs on the initial board: (N − 1) hop events, each removing one peg, followed by an event done signifying a valid solution of the puzzle. The results are summarised in Table 1.
Our second test case is the chess knight tour. A knight is placed at position (1, 1) on an empty chess board of size N × N. The objective is to cover all squares of the board by visiting each square exactly once. Similarly to peg solitaire, a solution is generated as a counterexample to a specification asserting that the event done is never communicated. The length of a possible solution is N² + 1. The results are presented in Table 2. For N = 5, FDR generates a counterexample faster, but, for N = 6, SymFDR found a solution in approximately 5 minutes, while FDR crashed after an hour and a half of state-space exploration. For this test case, we have observed that restricting the decision variables in MiniSAT to the input ones enhances the performance of SymFDR, especially for N = 7 where the reduction factor is over 20. Hence, for N = 7, the performance of the general tool SymFDR comes close to the performance of the problem-specific SAT encoder for the chess knight tour.
We have observed similar results with the test cases of finding a Hamiltonian path on an N × M grid (Table 3) and the lights off puzzle (Table 4). The lights off puzzle starts with an N × N board with all lights initially on. The aim is to reach a configuration with all lights switched off, bearing in mind that triggering any light switch also triggers the switches to the right, left, below and above. This test case illustrates the difference that search mode can make in SymFDR. We remark that for every N, if using inductive normal compression, FDR can actually obtain a state space consisting of a single state which it can verify in 0 seconds.
The fifth test case, the classical puzzle of the towers of Hanoi, aims primarily at comparing SymFDR with other SAT-based bounded checkers such as NuSMV and Alloy Analyzer. The results are summarised in Table 5. NuSMV-BMC and SymFDR seem to be competitive, both outperforming Alloy Analyzer. SymFDR working in simultaneous forward/backward mode outperforms NuSMV-BMC. However, all non-SAT tools (the explicit ones FDR and PAT and the BDD-based NuSMV) are clearly orders of magnitude more efficient than the SAT-based ones. We remark, though, that all solutions for the puzzle generated by PAT are longer than 1000 moves, even when N = 5, when the shortest solution is of length 32. When configuring PAT to report the shortest witness trace, we obtain the results quoted in the column labelled "PAT short". In this case, the performance of PAT worsens fast, falling behind SymFDR for N = 7 and running out of memory for N = 8.
Our final Table 6 summarises results obtained while running SymFDR on CSP scripts generated by Casper [Low98], a well-known tool for analysing and verifying the correctness of security protocols, underlying the discovery of an attack on the Needham Schroeder public key protocol in 1995 [Low95] and the verification of correctness of a fixed version of it in 1996 [Low96].
Casper benefits greatly from the partial-order-reduction function chase offered by FDR [Ros98], as is apparent from comparing the performance of FDR with and without it. For the Needham Schroeder public key protocol (NSPK3), SymFDR is better than FDR without chase but worse than FDR with chase. SymFDR finds those instances particularly hard because, on the one hand, the state-space blow-up is enormous and, on the other hand, probably owing to the great number of τ actions entangled in the state space, very few clauses are learned and those clauses contain far too many literals (often over 1000). For these test cases, we have used MiniSAT configured with negative polarity and no decision on auxiliary variables.
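Returning to the lights off puzzle described above, the following is a minimal explicit-state sketch of its rules, written in Python purely for illustration. It is not how SymFDR or FDR encode the puzzle: the function names and the bitmask representation are assumptions made here, and plain breadth-first search is used in place of a SAT solver. It assumes that pressing a switch flips that light and its neighbours to the right, left, below and above.

```python
from collections import deque

def toggle_mask(n, r, c):
    """Bitmask of the lights flipped by pressing switch (r, c) on an n x n board."""
    mask = 0
    for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc < n:
            mask |= 1 << (rr * n + cc)
    return mask

def solve_lights_off(n):
    """Breadth-first search for a shortest press sequence from all-on to all-off."""
    moves = {(r, c): toggle_mask(n, r, c) for r in range(n) for c in range(n)}
    start, goal = (1 << (n * n)) - 1, 0
    parent = {start: None}  # state -> (previous state, switch pressed)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            presses = []
            while parent[state] is not None:
                state, switch = parent[state]
                presses.append(switch)
            return presses[::-1]
        for switch, mask in moves.items():
            nxt = state ^ mask
            if nxt not in parent:
                parent[nxt] = (state, switch)
                queue.append(nxt)
    return None  # the all-off configuration is unreachable
```

For a 3 × 3 board, `solve_lights_off(3)` returns five presses: the four corners and the centre, in some order. The explicit search visits up to 2^(N·N) configurations, which is the kind of wide but shallow search space discussed in this section.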
Instances without Counterexamples
In this section we focus on the performance of SymFDR using k-induction (see Table 7). For each of the four algorithms ("Zig-Zag" forward, "Zig-Zag" backward, "Dual" forward and "Dual" backward), we record the time in seconds and the step at which the algorithm terminates.
We consider the readers/writers test case, Milner's scheduler with and without compression, the Bakery algorithm for mutual exclusion and the bully algorithm for leader election. Besides manually generated scripts, we have experimented with scripts translated by Casper and SVA (Shared Variables Analyser) [Ros10], a front-end for FDR based on modelling concurrency using shared variables. We have also considered the effect of applying the available FDR compression techniques to the CSP scripts and have analysed the impact of those techniques on the recurrence radii. In our experience, the backward algorithm, aiming to reach the forward recurrence radius, often scales better than the forward one. We note that, due to concurrency, the completeness threshold blows up in all cases. Hence, successful performance mainly depends on whether the property is inductive or not. For the Bakery algorithm and Milner's scheduler, applying hierarchical and leaf compression, respectively, has proven beneficial: it significantly decreases the termination step and, hence, improves performance. For all four algorithms we have observed that the induction step is checked much faster than the base one, contrary to what is reported in [ES03b; SSS00]. Hence, we also implemented versions of k-induction that start the iteration from a step greater than zero, as suggested in [SSS00]. Contrary to our expectations, though, for our test cases this approach scales worse than the standard one. For unsatisfiable instances, we have also observed that, if the length of the longest possible counterexample is known in advance, iterating the BMC algorithm up to this length often produces better results than k-induction, thanks to tuning the SAT frequency and jumping multiple time steps at once.
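To make the role of the termination step concrete, the following is a minimal sketch of temporal k-induction with the simple-path constraint on a small explicit transition system. It is an illustration only: it enumerates loop-free paths directly instead of calling a SAT solver, it is not one of the Zig-Zag or Dual algorithms measured in Table 7, and all names (`simple_paths`, `k_induction`, the toy system) are assumptions introduced here.

```python
def simple_paths(succ, starts, length):
    """Yield every loop-free path with exactly `length` transitions from `starts`."""
    stack = [(s,) for s in starts]
    while stack:
        path = stack.pop()
        if len(path) == length + 1:
            yield path
            continue
        for t in succ.get(path[-1], ()):
            if t not in path:  # enforce the simple-path constraint
                stack.append(path + (t,))

def k_induction(states, init, succ, bad, max_k):
    """Return ("counterexample", k, path), ("proved", k, None) or ("unknown", max_k, None)."""
    for k in range(max_k + 1):
        # Base case: does a loop-free path of exactly k transitions lead
        # from an initial state to a bad state?
        for path in simple_paths(succ, init, k):
            if path[-1] in bad:
                return ("counterexample", k, path)
        # Induction step: from an arbitrary state, can k + 1 consecutive
        # good states on a loop-free path be followed by a bad state?
        step_violated = any(
            path[-1] in bad and not any(s in bad for s in path[:-1])
            for path in simple_paths(succ, states, k + 1)
        )
        if not step_violated:
            return ("proved", k, None)
    return ("unknown", max_k, None)

# Toy system: a reachable cycle 0 -> 1 -> 2 -> 0 and an unreachable
# chain 3 -> 4 -> 5, where state 5 is the only bad state.
STATES = range(6)
SAFE = {0: [1], 1: [2], 2: [0], 3: [4], 4: [5]}
UNSAFE = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [5]}

print(k_induction(STATES, [0], SAFE, {5}, 10))    # ('proved', 2, None)
print(k_induction(STATES, [0], UNSAFE, {5}, 10))  # ('counterexample', 5, (0, 1, 2, 3, 4, 5))
```

In the safe system the property is not inductive at k = 0 or k = 1, because the unreachable chain 3 → 4 → 5 still satisfies the induction hypothesis; the proof only closes at k = 2. This mirrors, on a very small scale, how the termination step depends on parts of the state space that are never reached.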
Conclusion
We can conclude that SymFDR is likely to outperform FDR on large combinatorial problems for which a solution exists and the length of the longest solution is relatively short (growing at most polynomially) and predictable in advance. In those cases, we can fix the SAT-frequency close to a sizeable divisor of this length and thus avoid a large SAT overhead. The search space of those problems can be characterised as very wide (with respect to BFS) but relatively shallow, with counterexamples of length up to approximately 50-60. We suspect that problems with multiple solutions also induce good SAT performance. The experiments with the towers of Hanoi suggest that SAT-solving techniques offer advantages up to a certain threshold and weaken afterwards.
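The effect of fixing the SAT-frequency can be illustrated with a small arithmetic sketch. The helper names below are hypothetical, and the sketch assumes a simplified scheme in which the solver is invoked once every `frequency` unrolling steps and once more at the known solution length; the exact scheme used by SymFDR may differ.

```python
def sizeable_divisors(length):
    """Proper divisors of `length` greater than 1, in increasing order."""
    return [d for d in range(2, length) if length % d == 0]

def solver_call_bounds(length, frequency):
    """Unrolling depths at which the solver is invoked (assumed scheme)."""
    bounds = list(range(frequency, length + 1, frequency))
    if not bounds or bounds[-1] != length:
        bounds.append(length)  # always check at the known solution length
    return bounds

print(sizeable_divisors(32))      # [2, 4, 8, 16]
print(solver_call_bounds(32, 8))  # [8, 16, 24, 32]
print(solver_call_bounds(32, 1))  # one call per step: 32 calls in total
```

Under this assumption, a solution of length 32 is reached with 4 solver calls at frequency 8, instead of 32 calls at frequency 1, and because 8 divides 32 the last call lands exactly on the solution length.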
SymFDR in k-induction mode works reasonably well for small test cases, especially if the property is inductive. However, for larger test cases, SymFDR does not scale very well as the completeness threshold becomes too large due to concurrency. In all cases considered in Table 7, FDR is considerably faster.
Conclusions and Future Work
In this paper we have demonstrated the feasibility of integrating SAT-based BMC and k-induction in FDR and, more specifically, of replacing the expensive explicit state-space traversal phase in FDR with a SAT check in SymFDR. On some test cases, such as complex combinatorial problems, SymFDR's performance is very encouraging, coping with problems that are beyond FDR's capabilities. In general, though, FDR outperforms SymFDR, particularly when a counterexample does not exist. We plan to investigate further which classes of problems are tackled more successfully within the BMC framework.
We envision several directions for future work. We plan to extend the BMC framework in SymFDR to make it applicable to the stable failures and failures-divergences models as well. This will involve extending the encoding of CSP processes with information about maximal refusals and divergences.
We are currently implementing McMillan's algorithm combining SAT and interpolation techniques to yield complete unbounded refinement checking [McM03]. This method has proven to be more efficient for positive BMC instances (instances with no counterexamples) than other SAT approaches. The completeness threshold in this case is the backward radius of the state space, which is smaller than the backward recurrence radius that serves as the threshold for temporal induction. Moreover, experimental results have shown that, in practice, the algorithm often converges substantially faster, for bounds considerably smaller than the backward radius. In addition, the interpolation algorithm allows jumping multiple time frames at once and hence allows tuning the SAT-frequency. The BMC framework presented in this paper is the foundation we build upon.
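The gap between the two thresholds can be seen on a toy graph. The sketch below assumes the usual definitions (the radius is the largest shortest-path distance from the start set, the recurrence radius is the length of the longest loop-free path from it) and obtains the backward variants by reversing the transition relation and starting from the error states. The function names and the example graph are assumptions introduced for illustration.

```python
from collections import deque

def radius(succ, starts):
    """Largest shortest-path distance from `starts` to any reachable state."""
    dist = {s: 0 for s in starts}
    queue = deque(starts)
    while queue:
        s = queue.popleft()
        for t in succ.get(s, ()):
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return max(dist.values())

def recurrence_radius(succ, starts):
    """Number of transitions on the longest loop-free path from `starts`."""
    best = 0
    stack = [(s,) for s in starts]
    while stack:
        path = stack.pop()
        best = max(best, len(path) - 1)
        for t in succ.get(path[-1], ()):
            if t not in path:
                stack.append(path + (t,))
    return best

def reverse(succ):
    """Reverse every transition, giving the predecessor relation."""
    pred = {}
    for s, targets in succ.items():
        for t in targets:
            pred.setdefault(t, []).append(s)
    return pred

# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 3, with 3 as the only error state.
SUCC = {0: [1, 2], 1: [2], 2: [3]}
PRED = reverse(SUCC)

print(radius(PRED, [3]))             # backward radius: 2
print(recurrence_radius(PRED, [3]))  # backward recurrence radius: 3
```

Even on this four-state graph the backward radius (2) is smaller than the backward recurrence radius (3); the longest loop-free path can take detours that shortest paths avoid.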
Other avenues for further enhancing FDR's performance include partial-order reductions [Pel98] and CEGAR [COYC03; CCO+05].
Vijay D'Silva, Alastair Donaldson and Phillip Rümmer for interesting discussions on k-induction. The analysis using DFS refinement through divergence checking was inspired by a correspondence several years ago between A. W. Roscoe and James Heather. The work presented in this paper is supported by grants from EPSRC and US ONR.
