Combining the Box Structure Development Method and CSP for Software Development  by Hopcroft, Philippa J. & Broadfoot, Guy H.
Philippa J. Hopcroft1
Oxford University Computing Laboratory
United Kingdom
Guy H. Broadfoot2
Verum Consultants
The Netherlands
Abstract
In this paper, we combine the Box Structure Development Method (BSDM) [6,8] and CSP [3,10],
integrating them into industrial software development processes. BSDM was developed with prac-
tical software projects in mind and provides a framework for developing formal design speciﬁcations
that are fully traceable to the informal requirements. It integrates well into an industrial setting
and forms an ideal bridge between the actual system being developed and the abstract models used
for formal analysis. CSP complements BSDM by providing the mathematical framework for formal
veriﬁcation, together with its model checker FDR. In this paper, we present generic algorithms for
translating speciﬁcations from BSDM into CSP, illustrate how they can be formally veriﬁed using
FDR and summarise an industrial case-study.
Keywords: Software design veriﬁcation, CSP, FDR, Box Structure Development Method.
1 Introduction
In this paper, we combine two existing and complementary formal methods,
integrating them into industrial software development processes. The ﬁrst of
these is the Box Structure Development Method (BSDM), originally developed
by Mills [6], and later extended by others (for example, see [8]). The second
approach is the process algebra CSP [3,10], together with its model checker
FDR [9,2].
1 Email: philippa.hopcroft@comlab.ox.ac.uk
2 Email: guy.broadfoot@verum.com
Electronic Notes in Theoretical Computer Science 128 (2005) 127–144
1571-0661 © 2005 Elsevier B.V. Open access under CC BY-NC-ND license.
www.elsevier.com/locate/entcs
doi:10.1016/j.entcs.2005.04.008
The problem domain motivating this work is the development of practical
software systems whose complexity and concurrency make them increasingly
unreliable when validated with conventional testing methods alone. Many formal
methods have been developed over the years, for example [3,4,1]. However, in-
tegrating them into industrial software development environments has proved
to be diﬃcult. There are many reasons cited ranging from scalability to ed-
ucational issues, depending on the formal approach in question. A common
problem we have encountered in practice is that the formal speciﬁcations re-
quired for formal veriﬁcation are typically inaccessible to domain experts and
business analysts who have the necessary product knowledge. Training them
in the formal language at the start of every project is infeasible. This leads
to the very people who are essential to the validation process being excluded
from it.
BSDM was developed with practical software projects in mind and provides
an ideal vehicle for bridging the gap between the abstract formal models and
the software development process in practice. It uses a requirements analysis
technique that leads to a formal speciﬁcation traceable to the original informal
speciﬁcation. This speciﬁcation is then transformed into an implementation
via a number of refinement steps. In the second author's industrial experience,
this approach, combined with the sequence-based specification method [7],
integrates well into a practical software development process of CMM level 2
and above. CSP complements BSDM by providing the essential mathematical
framework for formal veriﬁcation, together with the model checker FDR which
provides the necessary automation.
The paper is organised as follows. In Section 2, we give an overview of the
two formal frameworks in question. Sections 3 and 4 describe how BSDM’s
functional views can be modelled in CSP, together with generic algorithms for
constructing them. The scope for automated analysis using the FDR model
checker is discussed in Section 5. In Section 6, we give a brief overview of how
our approach can be extended to model some of the abstraction techniques
used in BSDM. Finally, we present our conclusions, an industrial application
and future work in Section 7.
2 Background
2.1 Box structure development method
The box structure development method [5,6] deﬁnes three functional views of
a system, forming an abstraction hierarchy that allows for stepwise reﬁnement
and veriﬁcation: A black box is a state-free description of the external view of
the system; a state box is derived from the black box and introduces internal
state; and the clear box view is an implementation of the state box. Each view
must be complete, consistent and traceably correct.
In practice, we use this methodology for developing rigorous design speci-
ﬁcations of the functional behaviour of all the system components from their
requirements, and therefore currently limit ourselves to the black box and state
box views only. Extending this to include clear boxes, or other approaches for
generating code from the state box descriptions, is part of future work.
Black boxes A black box speciﬁcation is the most abstract view and deﬁnes
the external behaviour of a system or component in terms of input stimuli from
and output responses observable to its environment; S and R denote the set
of stimuli and responses respectively. A black box is characterised by a total
black box function BB : S ∗ → R that maps stimulus history to responses,
where S ∗ is the set of all ﬁnite sequences over S .
In practice, there may be a stimulus that does not invoke a response; for
completeness, a special response null is added to the set R to handle this case.
For example, in the initial state of a system prior to any stimulus input, no
response can (spontaneously) occur. Sequences of stimuli that are illegal must
be included and are mapped to the special value ω ∈ R. Illegal sequences are
extension-closed: all extensions of illegal sequences are also illegal.
The black box is developed from the informal requirements and is required
to be complete, consistent and traceably correct with respect to them. In
practice, this is diﬃcult to achieve, since requirements are often incomplete,
inconsistent and can run into the hundreds of pages. In [7], Prowell and Poore
present the sequence-based software speciﬁcation method for systematically
deﬁning a black box speciﬁcation that ensures completeness, consistency and
traceability with the original requirements. This method involves the ordered
enumeration of all possible stimuli sequences in S ∗ in order of length, starting
with the empty sequence, and the assignment of a response from R to each
one. For each length, the ordering must be complete. During this enumeration,
equivalences between sequences may arise: Two sequences u, v ∈ S ∗ are Mealy
equivalent, written u ≡ρMe v , precisely when all nonempty extensions of them
result in the same response.
When such an equivalence arises between a sequence u and a previously
enumerated one v , the response associated with u is noted, together with its
equivalence to v . If no such equivalences exist, then u is said to be irreducible.
The set S ∗ is thus partitioned into a ﬁnite set of equivalence classes as deﬁned
by ρMe where, for every sequence u ∈ S ∗, there is a unique irreducible normal
form v such that u ∈ [v ]ρMe ; these are known as canonical sequences.
Using the format in [8], the result of this sequence enumeration is a set
of enumeration tables, one for each canonical sequence c with the following
form:
Canonical sequence: c
Stimulus | Response | Equiv | Trace rule
Traceability to the informal requirements is achieved by insisting that ev-
ery design decision, regarding the assignment of responses and equivalences
found, is justiﬁed by explicit reference to the corresponding informal require-
ments (captured in the column Trace rule). During this process, it frequently
occurs that choices are required concerning issues about which the informal
speciﬁcation is silent, ambiguous or inconsistent. In these cases, new require-
ments must be formulated jointly with the domain experts. When applied
within a well organised software engineering process (for example, of CMM
level 2 and above), this approach is very eﬀective at eliminating ambiguities
and omissions in the requirements and retaining the involvement of domain
experts throughout the design stage.
The enumeration process is complete once all sequences have been assigned
to an equivalence class. Example 3.3 illustrates a complete enumeration in
the tabular form. Enumeration speciﬁcations that are strongly bisimilar are
regarded as equivalent (since they encode the same black box).
State boxes A state box speciﬁcation is the next step towards implementa-
tion. It is derived from a black box and they are expected to be behaviourally
equivalent. A state box introduces state variables to capture distinctive char-
acteristics of the stimulus history, such that its behaviour is equivalent to that
of the black box. It is characterised by a total function SB : (Q×S ) → (Q×R),
where Q , S and R are the sets of states, stimuli and responses respectively.
The states in Q represent characteristic predicates, one distinct state for
each equivalence class, over the partition of S ∗. Deﬁning state variables and
their unique instantiations for each equivalence class is achieved through a pro-
cess called the canonical sequence analysis and is described in [8]. Informally,
this process involves inventing variables that capture the conditions of every
stimuli sequence. A suitable range of values for each variable is deﬁned, such
that the combination of variable values is unique for every canonical sequence.
The introduction of state variables removes the need to refer to stimulus his-
tory and provides a more intuitive description for implementation purposes.
Once such variables have been deﬁned, the construction of the state boxes is
straightforward. However, in practice, deﬁning suitable variables (within the
context of the overall design) can be diﬃcult and lead to errors being injected
at this stage.
A black box and state box with functions BB : S ∗ → R and SB : (Q×S ) →
(Q × R) respectively are behaviourally equivalent precisely when, for every
sequence in S ∗, they invoke the same response. The state box speciﬁcation is
represented as a set of tables, one for each stimulus s ∈ S with the following
form:
Stimulus: s
Current state | Response | New state | Trace rule
Example 4.3 illustrates this notation. In a complete, consistent state box,
one can determine the unique response given for any stimulus s ∈ S in any
given state.
2.2 CSP and the FDR model checker
CSP [3] is a process algebra for describing concurrent processes that commu-
nicate with one another or their environment. It has a rich language and a
collection of semantic models for reasoning about process behaviour. We give
a brief overview here; see [10] for further details.
A process is deﬁned in terms of synchronous, atomic events that it can
perform with its environment: a → P can initially perform event a and then
act like process P . We write ?x : A → P(x ) (preﬁx choice construct) to
denote the process that is willing to perform any event in A and then behave
like process P(x ). Channels carry sets of events: c?x → Px inputs a value x
from channel c and then acts like Px .
CSP has the following choice operators: P □ Q denotes an external choice
between P and Q ; the initial events of both processes are offered to the environment.
P ⊓ Q denotes an internal choice between P and Q ; the process can
act like either P or Q , with the choice being made nondeterministically. Processes
can be placed in parallel, where they synchronise upon (all or specified)
common events or interleaved, where they run independently of one another.
For the purposes of this paper, it suﬃces to introduce the simplest semantic
model in CSP, known as the traces model. A trace is a sequence of events
(visible to the environment). For any process P , traces(P) is the set of all
finite traces of P ; furthermore, traces(P) is non-empty (it always contains the
empty trace) and prefix-closed. A process Q trace-refines a process P , written
P ⊑T Q , precisely when every trace of Q is also a trace of P . The traces model
allows us to verify safety properties only. Richer semantic models exist in CSP
to capture a broader scope of properties such as nondeterminism and other
liveness conditions. We refer to these models in Section 6 when extending our
approach to handle underspeciﬁed speciﬁcations.
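As an informal illustration of trace refinement, the following Python sketch computes the traces of a process up to a bounded length and compares them by set inclusion. The dictionary encoding of processes, the names SPEC and IMPL, and the length bound are assumptions made here for illustration; this is not how FDR performs the check.

```python
# A process is modelled as a dict: state -> {event: next_state}.
# This encoding is an assumption made purely for illustration.

def traces(process, start, max_len):
    """Return all traces of length <= max_len as a set of tuples."""
    result = {()}                      # the empty trace is always a trace
    frontier = {((), start)}
    for _ in range(max_len):
        next_frontier = set()
        for trace, state in frontier:
            for event, target in process[state].items():
                extended = trace + (event,)
                result.add(extended)
                next_frontier.add((extended, target))
        frontier = next_frontier
    return result


def trace_refines(spec, spec_start, impl, impl_start, max_len):
    """Bounded check that every trace of impl is also a trace of spec."""
    return traces(impl, impl_start, max_len) <= traces(spec, spec_start, max_len)


# Hypothetical example: SPEC may repeat a or b forever, IMPL only a.
SPEC = {"P": {"a": "P", "b": "P"}}
IMPL = {"Q": {"a": "Q"}}

print(trace_refines(SPEC, "P", IMPL, "Q", 4))  # IMPL refines SPEC
print(trace_refines(IMPL, "Q", SPEC, "P", 4))  # but not the other way round
```

Because the bound only inspects finitely many traces, a positive answer here is evidence rather than proof; a model checker explores the full state space instead.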
FDR [9,2] is a mature model checker for CSP that provides fully automated
veriﬁcation of reﬁnement (for all semantic models), determinism, deadlock
freedom and livelock freedom, together with extensive debugging facilities.
3 Modelling black boxes in CSP
In this section, we show how black boxes can be modelled in CSP and present
a generic algorithm for their construction. The sequence-based speciﬁcation
of the black box describes a ﬁnite-state automaton (in this case, a Mealy ma-
chine), where the states represent the equivalence classes and the arcs capture
the corresponding stimulus-response pairs between them. Such systems are
intuitively modelled in CSP as a set of mutually recursive processes, one rep-
resenting each equivalence class, and deﬁned with stimulus-response pairs of
events leading to the process that represents the corresponding equivalence
state.
We start by deﬁning what it means for a CSP process to model a black
box speciﬁcation. For a black box function BB : S ∗ → R, we will deﬁne a
CSP process P with stimulus and response events from S and R respectively;
thus the alphabet of P will be S ∪ R. For some trace tr , we write tr ↾ S to
denote the trace whose members are those of tr that are also in set S . For
example, the trace tr = tr ′ ⌢ 〈r〉, for some trace tr ′ and response r , represents
the stimulus sequence tr ′ ↾ S with response r .
Deﬁnition 3.1 A CSP process P is said to model a black box function BB :
S ∗ → R, precisely when the following conditions are satisﬁed:
(i) P must not introduce new behaviour (not in BB):
∀ tr , tr ′ ∈ traces(P); r ∈ R • tr = tr ′ ⌢ 〈r〉 ⇒ BB(tr ↾ S ) = r .
(ii) P must model the whole domain of BB and map each sequence accordingly:
∀ z ∈ S+; r ∈ R •
BB(z ) = r ⇒ (∃ tr , tr ′ ∈ traces(P) • tr = tr ′ ⌢ 〈r〉 ∧ tr ↾ S = z ).
where S+ = S ∗ − {〈〉}. The case z = 〈〉 is modelled by the empty trace
〈〉 which, by definition of the traces model, is a member of traces(P).
(iii) P reflects the stimulus-response pairing pattern. Since traces(P) is prefix-
closed, we specify this as:
∀ tr : traces(P) • #(tr ↾ R) ≤ #(tr ↾ S ) ≤ #(tr ↾ R) + 1.
where #(tr ↾ X ) returns the number of times an event in X occurs in
trace tr .
(iv) P is a deterministic process, reﬂecting the fact that BB is a function.
Condition (iii) is a characteristic of the black box specification that must be
observed by the CSP model. In terms of the corresponding ﬁnite state au-
tomaton, from any given state, actions only of the form of a stimulus followed
by a response can be performed. Without this property in our deﬁnition, one
could construct a CSP process P such that 〈s , r , r〉 ∈ traces(P). Given that
BB(〈s〉) = r , this trace would satisfy the ﬁrst two conditions above. However,
this does not accurately model the pairing assumption made by the black box
speciﬁcation of the system.
In the case where a single stimulus does invoke a sequence of consecutive
responses, one would still capture this as a single abstract response to model
the fact that they are treated atomically and are a result of a single stimulus
input.
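The pairing condition can be illustrated on individual traces with a short Python sketch. It counts stimuli and responses along every prefix of a trace, which mirrors the fact that the condition is stated over a prefix-closed set of traces. The function name and the single-element sets S and R are assumptions chosen for illustration.

```python
def satisfies_pairing(trace, stimuli, responses):
    """Check #responses <= #stimuli <= #responses + 1 on every prefix."""
    n_stimuli = 0
    n_responses = 0
    for event in trace:
        if event in stimuli:
            n_stimuli += 1
        elif event in responses:
            n_responses += 1
        # The condition must hold after each event, not only at the end.
        if not (n_responses <= n_stimuli <= n_responses + 1):
            return False
    return True


S = {"s"}  # hypothetical stimulus set
R = {"r"}  # hypothetical response set

print(satisfies_pairing(("s", "r", "s"), S, R))  # alternating: accepted
print(satisfies_pairing(("s", "r", "r"), S, R))  # two responses to one stimulus: rejected
```

The second call corresponds to the trace 〈s, r, r〉 discussed above, which the pairing condition is intended to exclude.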
Deﬁnition 3.2 A CSP process satisfying Deﬁnition 3.1 and a sequence-based
speciﬁcation are said to be congruent precisely when they encode the same
black box function.
For the purposes of clarity, we capture an enumeration table for some
canonical sequence ci as the set RowsBB(ci) comprising one element per row
of the table and taking the form (s , r , cj ), where s , r and cj represent the
stimulus, response and equivalence columns respectively. For the purposes
of the CSP model, we ignore the fourth column capturing the traceability to
informal requirements, as it is not relevant in the CSP analysis of the system.
Algorithm 1 For a given sequence-based speciﬁcation that encodes a black
box function BB : S ∗ → R, a corresponding CSP process PBB is deﬁned as
follows:
(i) For every canonical sequence ci in the enumeration, a CSP process Pci
is deﬁned as follows:
Pci = □ {Q(s , r , cj ) | (s , r , cj ) ∈ RowsBB(ci)}
where
Q(s , r , cj ) = s → r → Pcj
This produces a collection of mutually recursive CSP processes Pc0, . . . ,Pcn,
one process Pci for each canonical sequence ci in the enumeration.
(ii) The CSP process PBB is then deﬁned as PBB = Pc0, where c0 represents
the shortest canonical sequence, namely 〈〉.
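The shape of Algorithm 1 can be sketched as a small text generator. The Python below takes a mapping from canonical sequence names to their rows and emits one process definition per canonical sequence, using `[]` for external choice and `->` for prefixing in the style of machine-readable CSP. The function name, the `P_` naming scheme and the light-switch enumeration are hypothetical and not taken from the paper; channel declarations are not generated, and the output has not been checked with FDR.

```python
def black_box_to_cspm(rows_bb):
    """rows_bb maps a canonical-sequence name to a list of
    (stimulus, response, equivalent canonical sequence) rows."""
    lines = []
    for name, rows in rows_bb.items():
        # Step (i): one branch "s -> r -> P_cj" per row, joined by external choice.
        branches = [f"{s} -> {r} -> P_{cj}" for (s, r, cj) in rows]
        lines.append(f"P_{name} = " + "\n  [] ".join(branches))
    # Step (ii): the first entry is assumed to be the empty canonical sequence.
    first = next(iter(rows_bb))
    lines.append(f"P_BB = P_{first}")
    return "\n".join(lines)


# Hypothetical two-state light switch, used only to exercise the generator.
ROWS = {
    "c0": [("press", "on", "c1"), ("reset", "no_op", "c0")],
    "c1": [("press", "off", "c0"), ("reset", "off", "c0")],
}

print(black_box_to_cspm(ROWS))
```

Keeping the generator purely syntactic reflects the algorithm itself: each table row contributes exactly one stimulus-response branch, so no semantic analysis is needed at this stage.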
Due to space restrictions, we use a very simple example below to illustrate
this algorithm. However, we discuss our industrial experiences in Section 7.
Example 3.3 Consider a highly simpliﬁed vending machine with the follow-
ing sets of stimuli and responses:
S = {coin, tea, coffee, cancel, accept}
R = {return_coin, dispense_tea, dispense_coffee,
menu, pay_coin, confirm, null}
Assume there are four canonical sequences enumerated as follows:
Canonical sequence c0 : 〈〉
Stimulus | Response | Equiv
coin | menu | c1
tea | pay_coin | c0
coffee | pay_coin | c0
cancel | null | c0
accept | null | c0

Canonical sequence c1 : 〈coin〉
Stimulus | Response | Equiv
coin | return_coin | c1
tea | confirm | c2
coffee | confirm | c3
cancel | return_coin | c0
accept | null | c1

Canonical sequence c2 : 〈coin, tea〉
Stimulus | Response | Equiv
coin | return_coin | c2
tea | null | c2
coffee | null | c2
cancel | menu | c1
accept | dispense_tea | c0

Canonical sequence c3 : 〈coin, coffee〉
Stimulus | Response | Equiv
coin | return_coin | c3
tea | null | c3
coffee | null | c3
cancel | menu | c1
accept | dispense_coffee | c0
This enumeration encodes a complete black box speciﬁcation: for any sequence
z ∈ S ∗, one can derive the corresponding response directly from the tables.
The null response is used to represent the lack of response from the system.
Using Algorithm 1, the CSP process generated is P = Pc0 where:
Pc0 = coin → menu → Pc1
□ tea → pay_coin → Pc0
□ coffee → pay_coin → Pc0
□ cancel → null → Pc0
□ accept → null → Pc0
Processes Pc1, Pc2 and Pc3 are deﬁned accordingly from the tables. One can
step through the traces of P and see how each one directly corresponds to the
stimulus sequence-response mapping in the enumeration.
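To make the derivation of responses from the tables concrete, the following Python sketch encodes the four enumeration tables of this example, as read from the tables above, and evaluates the black box function on a stimulus history. The names TABLES, table and black_box are assumptions introduced here, and the response to the empty history is taken to be null.

```python
STIMULI = ["coin", "tea", "coffee", "cancel", "accept"]


def table(responses, equivs):
    """Build one enumeration table: stimulus -> (response, equivalent sequence)."""
    return {s: (r, e) for s, r, e in zip(STIMULI, responses, equivs)}


TABLES = {
    "c0": table(["menu", "pay_coin", "pay_coin", "null", "null"],
                ["c1", "c0", "c0", "c0", "c0"]),
    "c1": table(["return_coin", "confirm", "confirm", "return_coin", "null"],
                ["c1", "c2", "c3", "c0", "c1"]),
    "c2": table(["return_coin", "null", "null", "menu", "dispense_tea"],
                ["c2", "c2", "c2", "c1", "c0"]),
    "c3": table(["return_coin", "null", "null", "menu", "dispense_coffee"],
                ["c3", "c3", "c3", "c1", "c0"]),
}


def black_box(stimuli):
    """Return the response to the last stimulus of the given history."""
    state, response = "c0", "null"   # no response before any stimulus
    for s in stimuli:
        response, state = TABLES[state][s]
    return response


print(black_box(["coin", "tea", "accept"]))     # dispense_tea
print(black_box(["coin", "coffee", "cancel"]))  # menu
```

Each step of the loop corresponds to one stimulus-response pair in a trace of P, followed by a move to the process for the equivalent canonical sequence.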
Proposition 3.4 For a given sequence-based speciﬁcation that encodes a black
box function BB, if a CSP process P is constructed according to Algorithm 1,
then P models the black box function BB according to Deﬁnition 3.1 and they
are therefore congruent.
Proof. Straightforward; omitted due to lack of space.
The resulting CSP processes are deterministic, reﬂecting the fact that the
black box speciﬁcation is indeed a function. For specifying the design of a
component, such deterministic models suﬃce, since one does not choose to
implement nondeterministic software. The purpose of these formal design
speciﬁcations is to ensure that they are complete and unambiguous when
passed on for implementation.
However, the ability to abstract away design details is an important re-
quirement in practice. For example, when specifying interfaces of components
and verifying their interactions, it is essential to be able to hide internal ac-
tions. This naturally leads to nondeterminism. The CSP framework captures
nondeterminism very eﬀectively and the sequence-based speciﬁcation method
provides a practical method for under-specifying components for these purposes,
without conflicting with the functional foundations of the approach. In
Section 6, we discuss how our approach can be extended to handle nondeter-
minism.
4 Modelling state boxes in CSP
The next step towards implementation is deriving a behaviourally equivalent
state box from the black box. A state box deﬁnes a total function SB :
(Q ×S ) → (Q ×R), where Q is the set of states, S is the set of stimuli and R
is the set of responses. Each state q ∈ Q represents a characteristic predicate
χq for an equivalence class defined by ρMe over S ∗; we write [q ]ρMe to denote
the set of sequences in S ∗ for which χq holds true.
In this section, we illustrate how state boxes can be modelled in CSP and
present a generic algorithm for constructing them. We start by deﬁning the
traces properties to be satisﬁed by a CSP model for a state box function.
Deﬁnition 4.1 A CSP process P is said to model a state box function SB :
(Q × S ) → (Q × R), precisely when the following conditions are satisﬁed:
(i) P must not introduce new behaviour (not in SB):
∀ tr , tr ′ ∈ traces(P); s ∈ S ; r ∈ R • tr = tr ′ ⌢ 〈s , r〉 ⇒
∃ q , q ′ ∈ Q • SB(q , s) = (q ′, r) ∧ tr ′ ↾ S ∈ [q ]ρMe ∧ tr ↾ S ∈ [q ′]ρMe .
(ii) Every mapping in SB is modelled by P :
∀ q , q ′ ∈ Q ; s ∈ S ; r ∈ R • SB(q , s) = (q ′, r) ⇒
(∀ tr ∈ traces(P) • tr ↾ S ∈ [q ]ρMe ⇒
tr ⌢ 〈s , r〉 ∈ traces(P) ∧ (tr ↾ S ) ⌢ 〈s〉 ∈ [q ′]ρMe ).
By induction, this implies that all mappings in SB are modelled by P
(since traces(P) is always non-empty, by deﬁnition of the traces-model).
(iii) P satisﬁes the stimulus-response pairing pattern:
∀ tr : traces(P) • #(tr ↾ R) ≤ #(tr ↾ S ) ≤ #(tr ↾ R) + 1.
(iv) P is a deterministic process, reﬂecting the fact that SB is a function.
The traces of the CSP process are composed of stimuli and responses in
S and R respectively and clearly make no reference to the state data. This
allows for a direct comparison between the traces of the process modelling the
black box and that of the state box. Any errors introduced by design decisions
made from the sequence-based enumeration of the black box to the state box in
terms of the system’s behaviour are easily found in FDR by verifying whether
the corresponding CSP processes are traces-equivalent.
Deﬁnition 4.2 A CSP process satisfying Deﬁnition 4.1 and a state box are
said to be congruent precisely when they model the same state box function.
We present a generic algorithm below for translating state box speciﬁca-
tions into equivalent CSP models. We capture the information presented in a
state box for a stimulus s ∈ S as the set RowsSB(s) comprising one element
per row and taking the form (q , r , q ′), where q is the current state, r is the
response invoked by s in state q and q ′ is the updated state.
Algorithm 2 Given a state box speciﬁcation with function SB : (Q × S ) →
(Q × R), a corresponding CSP process PSB is deﬁned as follows:
(i) Let x1, . . . , xk denote the state variables as deﬁned by the state box.
(ii) The process PSB is deﬁned as follows:
PSB = P ′SB(init1, . . . , initk)
where init1, . . . , initk are the initial values of the state variables x1, . . . , xk ,
as defined in the state box.
(iii) The process P ′SB is defined as follows:
P ′SB(x1, . . . , xk) = □ {Q(s) | s ∈ S}
There is one Q(s) per state box for stimulus s.
(iv) For each stimulus s ∈ S, Q(s) is defined as follows:
Q(s) = □ {Q ′(q , s, r , q ′) | (q , r , q ′) ∈ RowsSB(s)}
where
Q ′(q , s, r , q ′) = q & s → r → P ′SB(q ′)
For a stimulus s, there is one process Q ′(q , s , r , q ′) for each row in its
state box. The value of q reflects the precondition for which s invokes
response r and state update q ′, and is modelled as a boolean guard in
CSP.
Example 4.3 Recall the simple vending machine presented in Example 3.3.
Consider the following state box specification derived from that sequence-based
specification, where the characteristic predicate for each equivalence
class (each denoted by a canonical sequence) is defined over two state variables
Coin and Drink as follows:
Equivalence class | Characteristic predicate
[〈〉]ρMe | Coin = false
[〈coin〉]ρMe | Coin = true ∧ Drink = none
[〈coin, tea〉]ρMe | Coin = true ∧ Drink = tea
[〈coin, coffee〉]ρMe | Coin = true ∧ Drink = coffee
where Coin and Drink are deﬁned with the range of values {true, false} and
{none, tea, coﬀee} respectively. The state box tables, one for each stimulus,
are derived from the enumeration tables in Example 3.3 and the characteristic
predicates deﬁned above. For example, (ignoring the 4th column Trace rule)
the state boxes for stimuli tea and cancel are deﬁned as:
Stimulus: tea
Current state | Response | New state
Coin = false | pay_coin | No update
Coin = true ∧ Drink = none | confirm | Drink = tea
Coin = true ∧ Drink ≠ none | null | No update

Stimulus: cancel
Current state | Response | New state
Coin = false | null | No update
Coin = true ∧ Drink = none | return_coin | Coin = false
Coin = true ∧ Drink ≠ none | menu | Drink = none
Assuming the initial values for Coin and Drink are false and none respec-
tively, the CSP process is deﬁned as PSB = P(false, none) where:
P(c, d) =
¬c & coin → menu → P(true, d)
□ c & coin → return_coin → P(c, d)
□ ¬c & ?x : {tea, coffee} → pay_coin → P(c, d)
□ c ∧ d = none & ?x : {tea, coffee} → confirm → P(c, x)
□ c ∧ d ≠ none & ?x : {tea, coffee} → null → P(c, d)
□ ¬c & cancel → null → P(c, d)
□ c ∧ d = none & cancel → return_coin → P(false, d)
□ c ∧ d ≠ none & cancel → menu → P(c, none)
□ ¬c ∨ d = none & accept → null → P(c, d)
□ c ∧ d ≠ none & accept → dispense . d → P(false, none)
Each guarded choice is directly derived from the state box tables. For exam-
ple, choices 3, 4 and 5 are derived from the tables for tea and coﬀee respectively.
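The guarded-choice structure produced by Algorithm 2 can be mimicked in Python for this example. In the sketch below, each state box row is a triple of guard, response and state update over the pair (Coin, Drink), and exactly one guard is expected to hold for any state and stimulus. The names STATE_BOX, drink_rows, step and run are assumptions introduced here, and the dispense response is split into one row per drink for simplicity.

```python
def drink_rows(x):
    """Rows of the state box for a drink stimulus x (tea or coffee)."""
    return [
        (lambda c, d: not c,             "pay_coin", lambda c, d: (c, d)),
        (lambda c, d: c and d == "none", "confirm",  lambda c, d: (c, x)),
        (lambda c, d: c and d != "none", "null",     lambda c, d: (c, d)),
    ]


STATE_BOX = {
    "coin": [
        (lambda c, d: not c, "menu",        lambda c, d: (True, d)),
        (lambda c, d: c,     "return_coin", lambda c, d: (c, d)),
    ],
    "tea": drink_rows("tea"),
    "coffee": drink_rows("coffee"),
    "cancel": [
        (lambda c, d: not c,             "null",        lambda c, d: (c, d)),
        (lambda c, d: c and d == "none", "return_coin", lambda c, d: (False, d)),
        (lambda c, d: c and d != "none", "menu",        lambda c, d: (c, "none")),
    ],
    "accept": [
        (lambda c, d: not c or d == "none", "null", lambda c, d: (c, d)),
        (lambda c, d: c and d == "tea", "dispense_tea",
         lambda c, d: (False, "none")),
        (lambda c, d: c and d == "coffee", "dispense_coffee",
         lambda c, d: (False, "none")),
    ],
}


def step(state, stimulus):
    """Apply the unique enabled row for this stimulus in this state."""
    enabled = [(r, u) for guard, r, u in STATE_BOX[stimulus] if guard(*state)]
    if len(enabled) != 1:
        raise ValueError(f"{len(enabled)} rows enabled for {stimulus} in {state}")
    response, update = enabled[0]
    return response, update(*state)


def run(stimuli, state=(False, "none")):
    """Return the list of responses for a stimulus history."""
    responses = []
    for s in stimuli:
        response, state = step(state, s)
        responses.append(response)
    return responses


print(run(["coin", "tea", "accept"]))  # ['menu', 'confirm', 'dispense_tea']
```

The stimulus history is never consulted: as in the state box, the pair of state variables carries all the information needed to select a row.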
Proposition 4.4 For a given state box that encodes a complete, consistent
state box function SB, if a CSP process P is constructed according to Algo-
rithm 2, then P models SB according to Deﬁnition 4.1 and they are therefore
congruent.
Proof. Straightforward; omitted due to lack of space.
5 Automated analysis using FDR
In this section, we give a brief summary of the analysis we do in FDR using
this framework in industry.
Verifying black boxes and state boxes At each stage of the box struc-
ture development of a design, once the speciﬁcation is modelled in CSP, prop-
erties such as completeness, determinism and control laws speciﬁed in the
system’s requirements can be formulated and veriﬁed using FDR.
Another requirement to be veriﬁed in the state boxes is that the charac-
teristic predicates deﬁned as state data are non-overlapping and complete. In
practice, errors during this phase are frequently introduced. Both properties
are easily checked in FDR: The ﬁrst of these is captured by checking for non-
determinism; if there are overlapping predicates, then there will be a state
where, for a given stimulus s , at least two guards evaluate to true, leading
to a nondeterministic choice as to which of these is performed. If the overlapping
rows yield the same response and new state, the overlap is not observable as
nondeterminism; to detect such cases as well, additional events can be added to
the model automatically to ensure that this error is still captured as
nondeterministic behaviour. The
second property can be formulated as a liveness condition and also veriﬁed
using FDR.
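The two properties can also be illustrated by brute-force enumeration over the state variables, which is feasible for a small example. The Python sketch below counts, for every valuation of Coin and Drink, how many guards of one stimulus hold; more than one indicates overlapping predicates and none indicates incompleteness. The names and the deliberately faulty guard list are assumptions for illustration, and this direct enumeration is a stand-in for, not a description of, the determinism and liveness checks performed in FDR.

```python
from itertools import product

COIN = [False, True]
DRINK = ["none", "tea", "coffee"]


def check_guards(guards):
    """Return (overlaps, gaps): states where several guards or no guard hold."""
    overlaps, gaps = [], []
    for state in product(COIN, DRINK):
        enabled = sum(1 for guard in guards if guard(*state))
        if enabled > 1:
            overlaps.append(state)
        elif enabled == 0:
            gaps.append(state)
    return overlaps, gaps


# Guards of the state box for stimulus tea.
TEA_GUARDS = [
    lambda c, d: not c,
    lambda c, d: c and d == "none",
    lambda c, d: c and d != "none",
]

# Deliberately faulty variant: the third guard has lost its negation.
FAULTY_GUARDS = [
    lambda c, d: not c,
    lambda c, d: c and d == "none",
    lambda c, d: c and d == "none",
]

print(check_guards(TEA_GUARDS))     # ([], [])
print(check_guards(FAULTY_GUARDS))  # one overlap and two gaps
```

Note that the enumeration ranges over all valuations, including combinations such as Coin = false with Drink = tea that are never reached by the vending machine.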
Verifying behavioural equivalence Every step in the box structure de-
velopment must be veriﬁed. Regarding black boxes and state boxes, one must
verify that they are behaviourally equivalent (as deﬁned in Section 2.1). Typ-
ically, this veriﬁcation is done by deriving the black box behaviour from the
state box and checking whether it is equivalent to the original black box spec-
iﬁcation. This is time consuming and infeasible in practice. The algorithms
presented here allow both speciﬁcations to be translated into corresponding
CSP models respectively and the necessary veriﬁcation checks to be done au-
tomatically in FDR.
Using Deﬁnitions 3.1 and 4.1, the behavioural equivalence between a sequence-
based speciﬁcation and a state box speciﬁcation as deﬁned above can be for-
mulated in terms of the traces of the corresponding CSP processes: A de-
terministic sequence-based speciﬁcation SpecBB and a deterministic state box
speciﬁcation SpecSB are behaviourally equivalent precisely when their respec-
tive CSP models PBB and PSB are traces-equivalent. This is automatically
checked using FDR. If the check fails, then FDR produces counter-examples;
due to the close correlation between the CSP traces and the tabular represen-
tation within BSDM, the counter-examples are simple to understand in either
framework.
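For deterministic specifications, the equivalence check amounts to confirming that both descriptions give the same response to every stimulus sequence. The Python sketch below explores pairs of states breadth-first and returns a shortest stimulus sequence on which the responses differ, playing the role of a counter-example. The light-switch black box, its state box and the faulty variant are hypothetical examples, and the function names are assumptions; FDR works on the CSP models rather than on step functions like these.

```python
from collections import deque


def find_difference(step_a, init_a, step_b, init_b, stimuli):
    """Return a shortest stimulus sequence with differing responses, or None."""
    seen = {(init_a, init_b)}
    queue = deque([((), init_a, init_b)])
    while queue:
        history, a, b = queue.popleft()
        for s in stimuli:
            resp_a, next_a = step_a(a, s)
            resp_b, next_b = step_b(b, s)
            if resp_a != resp_b:
                return history + (s,)
            if (next_a, next_b) not in seen:
                seen.add((next_a, next_b))
                queue.append((history + (s,), next_a, next_b))
    return None


# Hypothetical black box, given as enumeration tables for c0 and c1.
BB = {
    "c0": {"press": ("on", "c1"), "reset": ("no_op", "c0")},
    "c1": {"press": ("off", "c0"), "reset": ("off", "c0")},
}


def bb_step(state, s):
    return BB[state][s]


def sb_step(lit, s):
    """Hypothetical state box with a single boolean state variable."""
    if s == "press":
        return ("off", False) if lit else ("on", True)
    return ("off", False) if lit else ("no_op", False)


def faulty_step(lit, s):
    """Same state box with a wrong response to reset when the light is on."""
    if s == "reset" and lit:
        return ("no_op", False)
    return sb_step(lit, s)


STIMULI = ["press", "reset"]
print(find_difference(bb_step, "c0", sb_step, False, STIMULI))      # None
print(find_difference(bb_step, "c0", faulty_step, False, STIMULI))  # ('press', 'reset')
```

The returned sequence can be read directly against the enumeration tables or the state box tables, which is the property of counter-examples noted above.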
Verifying parallel composition The box structure method is component-
centric in the development of designs. We typically use it to develop design and
interface speciﬁcations for components, once the software architect has com-
pleted the overall structure and decomposed it into individual components.
By translating them into CSP, we are able to place them in parallel according
to the proposed architecture and formally verify the system as a whole. Properties
such as deadlock freedom, the absence of illegal behaviour invoked by one
component in another, and control laws are formulated in CSP and automatically verified in
FDR. The compositionality property of CSP provides the scalability for such
development in practice.
6 Handling abstractions
In [7], a number of eﬀective abstraction techniques are presented for avoiding
over-speciﬁcation in the sequence-based speciﬁcation method (used for con-
structing the black boxes). We brieﬂy summarise how our approach can be
extended to handle two of these we use in industry.
To avoid over-speciﬁcation within the sequence enumeration, details can
be abstracted away in numerous ways, for example, by introducing abstract
stimuli (rather than specifying the complete enumeration with actual concrete
stimuli of the system). One must then show that the two representations
are behaviourally equivalent. Such abstraction typically involves mapping
sequences from distinct equivalence classes into a single sequence and deﬁn-
ing predicates over them to capture their distinctive characteristics. These
predicates can then be deﬁned when that level of detail is required. For exam-
ple, consider a vending machine where dispensing a drink involves checking a
number of resources for availability. Enumerating this and identifying distinct
equivalence classes for every combination would be ineﬃcient.
An alternative approach is to abstract away this level of detail and intro-
duce a predicate p that determines whether or not there are suﬃcient resources
to enable a given drink d to be dispensed. Such predicates are typically deﬁned
in terms of preﬁx recursive functions over stimuli sequences. In this case, p
may be defined as Resource(h) ≠ ∅, where Resource(h) is defined over stimuli
sequences h and returns the set of available resources. A stimulus s whose
response depends on this predicate (for example, tea), is then modelled by the
special abstract stimuli s : p and s : ¬p. For example, an enumeration table
for a given canonical sequence c may be deﬁned using these abstract stimuli
as follows:
Stimulus | Response | Equiv | Trace rule
s : Resource(h) ≠ ∅ | dispense(s) | ci | . . .
s : Resource(h) = ∅ | unavailable(s) | cj | . . .
where Resource must be deﬁned for all stimuli sequences in the equivalence
class characterised by the canonical sequence c and h represents the stimuli
sequence history preceding all stimuli listed in the enumeration table.
Completeness and consistency are achieved by ensuring that the predicate
and its negation are defined over the same sequence. See [7] for further details.
Algorithm 1 can be extended to handle predicates that are deﬁned over
stimulus history, by introducing state variables to capture the essential infor-
mation (in the same way as is done for the state box speciﬁcations). Instead
of referring to the sequence history, the predicates are computed in terms of
the corresponding state data within the CSP models. Each stimulus s in an
enumeration table of the form s : p is modelled as a choice with the boolean
guard p. For example, for the table above, we would have:
P(res) = res ≠ ∅ & s → dispense.s → P_{c_i}(f(res))
         □
         res = ∅ & s → unavailable.s → P_{c_j}(g(res))
where res is the state variable and the functions f and g reﬂect state updates
as deﬁned by Resource.
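The move from a history-based predicate to state data can be illustrated with a small sketch. The following Python fragment is an illustration only, under assumptions that are not taken from the paper: the stimuli ("add", r) and "tea", the particular definition of resource, and the two state updates are all hypothetical. It defines Resource as a prefix-recursive function over the stimulus history and, alongside it, a step function that keeps the same information in a state variable res, mirroring the guarded choice in P(res).

```python
# Hypothetical vending-machine stimuli: ("add", r) stocks resource r,
# "tea" requests a drink. These names are assumptions for illustration.

def resource(history):
    """Prefix-recursive Resource(h): set of resources available after h."""
    if not history:
        return frozenset()
    *prefix, last = history
    if last == "tea":
        # Assumed rule: a tea request leaves no stock behind.
        return frozenset()
    _, item = last
    return resource(prefix) | {item}


def black_box_response(history, stimulus):
    """Response selected by the predicate p, i.e. Resource(h) is non-empty."""
    if stimulus != "tea":
        return None  # stocking produces no visible response in this sketch
    return "dispense.tea" if resource(history) else "unavailable.tea"


def step(res, stimulus):
    """State-based counterpart of P(res): returns (response, next res)."""
    if stimulus == "tea":
        if res:                                   # guard: res is non-empty
            return "dispense.tea", frozenset()    # plays the role of f(res)
        return "unavailable.tea", res             # plays the role of g(res)
    _, item = stimulus
    return None, res | {item}


def run(stimuli):
    """Replay stimuli on the state machine, checking it against the history view."""
    res, responses = frozenset(), []
    for i, s in enumerate(stimuli):
        history = stimuli[:i]
        assert res == resource(history)           # state data tracks Resource(h)
        response, res = step(res, s)
        assert response == black_box_response(history, s)
        responses.append(response)
    return responses


print(run([("add", "water"), "tea", "tea", ("add", "leaves"), "tea"]))
```

The assertions inside run express the obligation that the state variable carries exactly the information the predicate needs from the history; in this sketch the second tea request is answered with unavailable.tea because the first one emptied the stock.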
Under-specification The sequence-based specification method uses predicates
to enable the construction of under-specified black box specifications [7]
(without breaking the function theory upon which this approach is built). This
is an essential form of abstraction in practice, both for specifying component
interfaces, where the internal actions are hidden from its environment, and
for specifying requirements against which the design will be veriﬁed. Under-
speciﬁcation at the black box level of abstraction is achieved by introducing
undefined predicates to capture the fact that for a given stimulus s, there are a
number of possible responses and subsequent behaviour, the decision of which
is as yet undeﬁned.
In ongoing research, we are looking at weakening Deﬁnition 3.1 and ex-
tending Algorithm 1 to construct corresponding CSP models and capture the
potential nondeterminism. For example, for abstract stimuli of the form:
Stimulus   Response   Equiv   Trace rule
s : p      r1         c_i     . . .
s : ¬p     r2         c_j     . . .
where p is an undeﬁned predicate, the CSP process is deﬁned as follows:
P = (s → r1 → P_{c_i}) □ (s → r2 → P_{c_j})
thereby potentially leading to nondeterminism after s is performed (depending
on whether subsequent responses and behaviours are distinct).
In practice, we can specify component interfaces of a system using this
approach and verify whether a proposed design (translated into CSP from
BSDM) satisﬁes the intended interface automatically with FDR by using CSP
reﬁnement. Due to transitivity of reﬁnement and monotonicity of the CSP op-
erators with respect to reﬁnement, CSP supports compositional development,
enabling this approach to scale in practice. When the CSP models contain
nondeterminism, the failures model is used for all refinement checks.
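The kind of check described in the preceding paragraph can be illustrated on a toy scale. The sketch below is not FDR's algorithm and is not part of the translation; it is a minimal Python illustration, assuming finite labelled transition systems with no internal actions and no divergence, of what it means for a design to refine an under-specified interface in the failures model. The state names, and the assumption that both continuations return to P, are hypothetical and chosen only to keep the example finite.

```python
# A process is a dict: state -> list of (event, next_state) transitions.
# Assumption: finite transition systems without internal actions or divergence.

def initials(lts, state):
    """Events the process can perform in the given state."""
    return frozenset(event for event, _ in lts.get(state, ()))


def after(lts, states, event):
    """States reachable from any of `states` by performing `event`."""
    return frozenset(t for s in states for e, t in lts.get(s, ()) if e == event)


def failures_refines(spec, spec_init, impl, impl_init):
    """True if every failure of impl is also a failure of spec."""
    seen = set()
    todo = [(frozenset([spec_init]), impl_init)]
    while todo:
        spec_states, q = todo.pop()
        if (spec_states, q) in seen:
            continue
        seen.add((spec_states, q))
        offered = initials(impl, q)
        # Refusals: some spec state after the same trace must refuse
        # everything q refuses, i.e. offer no more than q offers.
        if not any(initials(spec, s) <= offered for s in spec_states):
            return False
        for event, q_next in impl.get(q, ()):
            spec_next = after(spec, spec_states, event)
            if not spec_next:          # trace not permitted by the spec
                return False
            todo.append((spec_next, q_next))
    return True


# Under-specified interface: after s, either r1 or r2 may follow.
interface = {"P": [("s", "A"), ("s", "B")], "A": [("r1", "P")], "B": [("r2", "P")]}
# Deterministic design that always answers s with r1.
design = {"D": [("s", "D1")], "D1": [("r1", "D")]}

print(failures_refines(interface, "P", design, "D"))   # design against interface
print(failures_refines(design, "D", interface, "P"))   # the converse check
```

In this sketch the first check succeeds, because the design resolves the interface's nondeterminism in one permitted way, while the converse check fails, because the interface may refuse r1 after s and the deterministic design never does.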
7 Conclusion
In this paper, we gave an overview of how BSDM and CSP can be combined
to reap the beneﬁts of both in practice. We presented generic algorithms
for converting black box and state box speciﬁcations into CSP and brieﬂy
discussed how this work could be extended to handle abstraction techniques
such as under-specification. We use this approach to enhance the designs of
software systems and verify them before the implementation phase starts; in
practice, design errors are frequently the most expensive ones to resolve (in
terms of resources and time) when only found during the testing phase.
There are a number of advantages of our proposed combined approach over
some of the traditional formal methods, including CSP used on its own. The
ﬁrst one is its scope for integration into an organised industrial environment,
achieved by BSDM and the sequence-based speciﬁcation method. It is practi-
cal in the sense that it provides traceability between the formal speciﬁcations
and the informal requirements, is accessible to software engineers and domain
experts without the need for extensive training, and, in our experience of
using this approach in industry over the last two years, it has been shown to
scale well (references to the experiences of others can be found in [8]). Secondly,
the speciﬁcations developed using BSDM can be translated into other formal
languages, in our case CSP, and it therefore acts as an eﬀective bridge for
introducing formal veriﬁcation methods. Finally, the model checker FDR has
provided the necessary automated tool support required in practice. This
also includes eﬀective and meaningful feedback regarding design errors found
during the analysis. Due to the similarity between the tabular representation
within BSDM and its corresponding description in terms of traces within CSP,
provided one selects sensible naming conventions for each component’s stimuli
and responses within the system, it is very straightforward to interpret the
counter-examples generated by FDR in either formalism.
Industrial experience We have used this combined approach for developing
complex, event-driven embedded control software for complex
manufacturing machines. It has proven to be successful and scalable for the
development of new components and the re-engineering of existing software
components. For example, one of our applications involved the design and
implementation of a process control Kernel for a machine based on 22 head-
less PC platforms running in parallel and communicating with each other via
an internal Ethernet. The Kernel was designed as two concurrent processes,
namely the Machine Controller and the State Controller processes, communi-
cating via queues. The Machine Controller implemented the basic operational
behaviour of the machine and controlled the other major subsystems. It was
responsible for machine consistency and enforcing the control laws governing
safe operation. It used interfaces provided by the subsystems it controlled
and it implemented an interface used solely by the State Controller. The
State Controller implemented the overall machine behaviour as seen by the
GUI component and thus the machine operator.
The black box function of the Machine Controller design had 2,835 map-
pings in 47 equivalence classes. The length of the longest canonical sequence
was 11 stimuli. The black box function of the State Controller design had 837
mappings and the length of the longest canonical sequence was 6 stimuli.
In addition to deﬁning the design of the Kernel components as black boxes,
the interfaces they implemented and those they used were deﬁned as under-
speciﬁed black boxes. Translating these speciﬁcations into CSP using the
algorithms presented here enabled the following veriﬁcations to be performed
automatically by FDR: (i) the designs of both Kernel processes were deter-
ministic and implemented their respective interfaces correctly; (ii) the parallel
composition of the State Controller and the process representing the interface
of the Machine Controller was deadlock free and behaved correctly accord-
ing to the Machine Controller’s interface; (iii) the parallel composition of the
Machine Controller and the processes representing the interfaces of the other
machine components was deadlock free and behaved correctly; (iv) the Ma-
chine Controller satisﬁed the control laws under all conditions.
The ﬁnished Kernel was approximately 20,000 executable lines of C++.
The specification, design and implementation effort was approximately 16
man-weeks. The final code was integrated on the target machine in one day. In
the ﬁrst 4 weeks of intensive use, 8 minor coding errors were found, none of
which were speciﬁcation or design errors; in more than 12 months since, no
other errors have been detected. Rework to date is below 1%.
Future work We are currently formalising the work summarised in Sec-
tion 6 and extending our approach with other practical abstraction techniques.
We are also investigating the efficient modelling of constructs
such as queues, and are seeking to classify common design patterns and process
relationships according to the modelling techniques appropriate to each.
Other avenues of interest are: enabling the automated veriﬁcation of diﬀerent
levels of abstraction in the enumeration speciﬁcations using FDR (for example,
the abstract versus concrete stimuli); extending our approach for the reﬁne-
ment step from the state box to clear box in BSDM; and modelling real-time
systems in this framework.
Acknowledgement
We thank Bill Roscoe and Michael Goldsmith for useful discussions and com-
ments on this work. This research is funded by Verum Consultants, The
Netherlands.
References
[1] J.-R. Abrial. The B-Book: Assigning Programs to Meanings. Cambridge University Press, 1996.
[2] Formal Systems (Europe) Ltd. Failures-Divergence Reﬁnement: FDR2 User Manual, 2003.
[3] C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
[4] G. J. Holzmann. The model checker SPIN. IEEE Transactions on Software Engineering,
23(5):279–295, 1997.
[5] H. D. Mills. Stepwise reﬁnement and veriﬁcation in box structured systems. Computer,
21(6):23–26, 1988.
[6] H. D. Mills, R. C. Linger, and A. R. Hevner. Principles of Information Systems Analysis and
Design. Academic Press, 1986.
[7] S. J. Prowell and J. H. Poore. Foundations of sequence-based software specification. IEEE
Transactions on Software Engineering, 29(5):417–429, 2003.
[8] S. J. Prowell, C. J. Trammell, R. C. Linger, and J. H. Poore. Cleanroom Software Engineering
- Technology and Process. Addison-Wesley, 1998.
[9] A. W. Roscoe. Model-checking CSP. In A Classical Mind: Essays in Honour of C.A.R. Hoare,
pages 353–378. Prentice-Hall, 1994.
[10] A. W. Roscoe. The Theory and Practice of Concurrency. Prentice Hall, 1998.
