A Case Study in Specifying and Testing Architectural Features by Krishnan, P.
A Case Study in Specifying and Testing Architectural
Features
Padmanabhan Krishnan
Department of Computer Science
University of Canterbury, Private Bag 4800
Christchurch, New Zealand
Email: paddy@cosc.canterbury.ac.nz
Abstract
This paper studies the specification and testing of two main architectural features.
We consider restricted forms of instruction pipelining and parallel memory models
present in the SPARC specification. The feasibility of using an automatic tool, the
Concurrency Work Bench, is demonstrated.
Keywords: Architecture, SPARC, specification, verification, CCS, modal logic,
Concurrency Work Bench
1 Introduction
Formal specification is the writing of the requirements of a system at a sufficiently
abstract level. To gain confidence that a given specification meets the needs of a user,
tests on it can be performed. If the outcomes agree with the expected behaviour, one
can assume that the given specification expresses what is necessary. A particular form of
testing is the construction of an implementation and verifying that the implementation
satisfies the specification. However, it is not always possible to construct an implementation
or verify that it satisfies the specification. In such cases one may construct tests using
other formalisms.
Formal specification and verification of computer hardware has been shown to be feasible.
[GB89] describes the specification and construction of an SECD machine using HOL
[Gor85], while [Coh88] describes the verification of the Viper architecture. Concurrency
aspects of hardware have also been verified in HOL [LD90], where a multiprocessor cache
protocol is considered.
In this paper we describe our experience in specifying certain aspects of an architecture
and formal testing of the specification. The architecture we consider is the SPARC
[Spa91] and the specification language we use is CCS [Mil89]. CCS belongs to a class
of formalisms called process algebras, which are used to describe the observational
behaviour of concurrent systems. They have been used to specify and verify many systems
including communication protocols [Par87, LM87]. A prototype implementation called
the Concurrency Work Bench (CWB), which helps the specifier test the specifications, exists
[CPS89, CPS93]. The main reason for choosing the CWB over the HOL system is that
the CWB performs all the analysis automatically, while the HOL system is a proof assistant.
This is not to conclude that HOL cannot be used, rather that as a first step in the
specification and testing process the CWB is easier to use than HOL. The CWB is only one
of the specification/validation environments. Implementations of LOTOS [BB89, vVD89],
which is based on CCS, exist and have been used to specify and verify systems [vS89]. We
chose the CWB mainly because it was available.
The SPARC architecture was chosen as it is a relatively new architecture, addresses
some of the issues in multiprocessor systems (the memory model) and supports pipelining
whose definition affects the programming model. The SPARC definition does not
recommend any implementation, rather it defines a class of implementations. Hence it is
crucial to design an implementation and verify that it satisfies the given specification.
In the next section we present a brief summary of CCS and the CWB, while in sections 3
and 4 we describe the features and simplifications of the architecture, the specification of
the architecture in CCS and the tests performed on it.
2 Overview of CCS and CWB
In this section we present a brief summary of the concepts and notation used in this
paper. The reader is referred to [Mil89] and [CPS89] for details. A set of atomic actions
($\Lambda$) with a bijection $\overline{\cdot}$ on it such that for all $a \in \Lambda$,
$\overline{\overline{a}} = a$ is assumed. A special action $\tau$
which indicates synchronisation is used. The syntactic structure of processes is given by
the following rules.
$$P ::= 0 \;\mid\; \alpha.P \;\mid\; (P \,|\, P) \;\mid\; P + P \;\mid\; P \setminus H \;\mid\; P[f] \;\mid\; X \;\mid\; \mathrm{rec}\, X.P$$
0 is a process which can exhibit no further action, $\alpha.P$ can exhibit $\alpha$ and then behave
as P. $(P \,|\, Q)$ is the parallel composition of P and Q, $(P + Q)$ represents non-deterministic
choice. $(P \setminus H)$ hides all actions specified in H, $P[f]$ relabels all actions in P by $f$, and
X and $(\mathrm{rec}\, X.P)$ are used to define recursive processes. A recursive process can also be
written as (X = P), which permits the specification of a system using a set of equations.
An operational semantics for the processes based on labelled transition systems is defined
in [Mil89].
The principal semantic relation is the notion of (strong) bisimulation [Par81]. Intuitively,
P is bisimilar to Q means that every behaviour of P (Q) can be simulated by Q (P).
Other semantic relations can be defined [Mil89, Hen88]: weak bisimulation, which is similar
to strong bisimulation except that the action $\tau$ is internalised; traces, which is the
automata theoretic characterisation; testing; etc. [vG90] presents a comparative study
of various semantic relations.
Some of these semantic relations can be described logically using the modal $\mu$-calculus
[HM85, Lar88, Sti89b, Sti89a]. We use the modal $\mu$-calculus to verify that the
specifications satisfy certain logical properties. The set of formulae includes action-indexed
modalities for possibility $\langle\alpha\rangle\phi$, universality $[\alpha]\phi$, negation $\neg\phi$ and recursion (minimal
fixed point $\mu X.\phi$ and maximal fixed point $\nu X.\phi$). Formulae can also be combined using
the propositional connectives of conjunction, disjunction etc.
Informally, a process can satisfy $\langle\alpha\rangle\phi$ if it can exhibit the action $\alpha$ and evolve to a
process which can satisfy $\phi$. $[\alpha]\phi$ is defined to be $\neg(\langle\alpha\rangle\neg\phi)$ and thus can be interpreted
to mean that any $\alpha$ move will necessarily lead to a process which can satisfy $\phi$. Modalities
which ignore $\tau$ moves are also defined. For example, the formula $\langle\langle\alpha\rangle\rangle\phi$ can be satisfied
by a process which after a finite number of $\tau$ moves can exhibit $\alpha$. The minimal fixed
point corresponds to infinite disjunction, while the maximal fixed point is the dual of the
minimal fixed point and corresponds to infinite conjunction. Intuitively, the minimal fixed
point can be interpreted as a liveness property. If, for example, the proposition
($P_0$ or $P_1$ or $\ldots$ or $P_n$ $\ldots$) were satisfied in the $n$th step, then we can assume that
$P_0$ to $P_{n-1}$ were not satisfied. Therefore, for the property to be satisfied, there should be
some $n$ such that $P_n$ is true. By a similar argument the maximal fixed point can be interpreted
to be a safety requirement, as for a proposition of the form ($P_0$ and $P_1$ and $\ldots$ and $P_n$ $\ldots$)
to be satisfied all of the $P_i$ must be satisfied.
The CWB is an automatic tool which helps in the analysis of concurrent systems
expressed in CCS. The CWB consists of three main components. The first component
handles the user interface where the user can define processes and formulae. The user can
also issue various other commands to study the behaviour of the specified system. The
command bi binds an identifier to a process (or an agent) and can be used to define
recursion. For example, bi X P represents the CCS process `rec X.P' or (X = P). The
command bsi binds an identifier to a set of actions and is useful when defining restrictions,
while the command bpi binds an identifier to a proposition. The CWB uses t as the ASCII
translation of $\tau$, while 'a is used instead of $\overline{a}$. The second layer performs certain semantic
transformations. While this layer performs a crucial task, the user is completely shielded
from it. This makes the tool easier to use than more complex systems. The third layer
provides the commands for analysing the specification. Having defined agents and modal
formulae, the CWB can perform automatic analysis to check if two agents are weakly
bisimilar (eq), strongly bisimilar (strongeq), trace equivalent (mayeq) or related by the
trace preorder (maypre). The CWB can also verify if an agent satisfies a logical
specification (cp). There are other features, including examining all behaviours and finding
deadlocks, which are useful when developing the specifications.

P  = insert.P1                B  = insert.remove.B
P1 = insert.P2 + remove.P     B2 = (B | B)
P2 = remove.P1
Figure 1: CCS Example
In the next section we present a small example to give a flavour of CCS and the CWB.
The expert reader can skip this section and proceed to section 3.
2.1 Example
Consider a buffer of size two which is initially empty. After performing two inserts, remove
is the only possible operation on the buffer. If the buffer is empty, only an insertion can be
performed. There are two ways to specify the system. The first is to explicitly enumerate
the reachable states. As we have not imposed any ordering on insertions and deletions,
one could also specify the system as a parallel composition of two one-element buffers.
The CCS specification is shown in figure 1.
The CWB specification of the two systems is presented in figure 2.
bi P
insert.P1
bi P1
insert.P2 + remove.P
bi P2
remove.P1
bi B
insert.remove.B
bi B2
(B | B)
Figure 2: Specification Using the CWB Syntax

One may now wish to verify that the two systems P and B2 are equivalent. This can
be achieved by the command eq B2 P. In this case the CWB returns true. As P and B
are not equivalent, the command eq P B returns false. However, any behaviour exhibited
by a one-element buffer can be exhibited by a two-element buffer. This can be checked on
the CWB using the command maypre B P, which yields true. In other words, every trace
exhibited by B can be exhibited by P.
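The equivalence and preorder checks on the buffer example can be mimicked outside the CWB. The following Python sketch is an illustration only and makes no claim about how the CWB implements eq or maypre: it encodes P, B and B2 as labelled transition systems, computes strong bisimilarity by naive refinement and trace inclusion by a subset construction. As these processes contain no τ actions, strong and weak bisimilarity coincide here. The function names and the state name B1 are assumptions introduced for the example.

```python
from itertools import product

# Labelled transition systems: state -> list of (action, next_state).
P = {"P": [("insert", "P1")],
     "P1": [("insert", "P2"), ("remove", "P")],
     "P2": [("remove", "P1")]}
# "B1" is a hypothetical name for the state of B after one insert.
B = {"B": [("insert", "B1")], "B1": [("remove", "B")]}


def interleave(left, right):
    """Parallel composition without synchronisation (pure interleaving)."""
    out = {}
    for p, q in product(left, right):
        moves = [(a, (p2, q)) for a, p2 in left[p]]
        moves += [(a, (p, q2)) for a, q2 in right[q]]
        out[(p, q)] = moves
    return out


def bisimilar(lts, s, t):
    """Strong bisimilarity: start from all pairs and discard failing ones."""
    rel = {(x, y) for x in lts for y in lts}
    changed = True
    while changed:
        changed = False
        for x, y in list(rel):
            forth = all(any(a == b and (x2, y2) in rel for b, y2 in lts[y])
                        for a, x2 in lts[x])
            back = all(any(a == b and (x2, y2) in rel for a, x2 in lts[x])
                       for b, y2 in lts[y])
            if not (forth and back):
                rel.discard((x, y))
                changed = True
    return (s, t) in rel


def trace_included(lts, s, t):
    """True if every finite trace of s is also a trace of t."""
    start = (s, frozenset([t]))
    seen, stack = {start}, [start]
    while stack:
        x, ys = stack.pop()
        for a, x2 in lts[x]:
            ys2 = frozenset(y2 for y in ys for b, y2 in lts[y] if b == a)
            if not ys2:          # t cannot follow this trace of s
                return False
            if (x2, ys2) not in seen:
                seen.add((x2, ys2))
                stack.append((x2, ys2))
    return True


B2 = interleave(B, B)
LTS = {**P, **B, **B2}           # disjoint union of the three systems

print(bisimilar(LTS, ("B", "B"), "P"))   # True, mirrors eq B2 P
print(bisimilar(LTS, "P", "B"))          # False, mirrors eq P B
print(trace_included(LTS, "B", "P"))     # True, mirrors maypre B P
```

The naive refinement is quadratic in the number of states per pass, which is adequate for systems of this size but not a substitute for the algorithms used by a real tool.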
We now present two small examples of modal formulae. The first property we consider
is that after two inserts a remove must be performed. This can be restated as: after two
inserts it is not possible to perform any action but remove. This is described in the modal
$\mu$-calculus as <insert><insert>[-remove]F, i.e., it is possible to perform two inserts
and then no action other than remove is possible.
The second property uses the maximal fixed point construct. If it is possible to insert
into a buffer, then it is always possible to perform the insertion followed by a remove. It
is intuitively obvious that the property is true. To translate the above property into a
modal $\mu$-calculus formula, we notice that the property has to hold at every state. Hence
we have to use the maximal fixed point. Given that, the formula can be written in the
CWB syntax as follows.
bpi Prop max(X. (CanInsert => CanIR) & [-]X)
bpi CanInsert <insert>T
bpi CanIR <insert><remove>T
The validity of the property can be verified by the commands cp P Prop, cp B Prop
and cp B2 Prop, all of which return true. This concludes our brief introduction to the
CWB. A more detailed example can be found in [CPS93, pages 58-66].
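To make the meaning of the two formulae concrete, the following Python sketch evaluates them over the transition systems of P and B by computing the set of states that satisfy each subformula. It is a minimal illustration under stated assumptions, not the CWB's model checker: the helper names (diamond, box_except, implies, always) and the state name B1 are hypothetical, and B2 is omitted for brevity.

```python
# States of P and of the one-element buffer B in a single transition system.
LTS = {
    "P": [("insert", "P1")],
    "P1": [("insert", "P2"), ("remove", "P")],
    "P2": [("remove", "P1")],
    "B": [("insert", "B1")],     # "B1": hypothetical name for B after one insert
    "B1": [("remove", "B")],
}
TRUE, FALSE = frozenset(LTS), frozenset()


def diamond(action, phi):
    """<action>phi: states with some action-successor satisfying phi."""
    return frozenset(s for s in LTS
                     if any(a == action and t in phi for a, t in LTS[s]))


def box_except(excluded, phi):
    """[-excluded]phi: every move by an action outside `excluded` lands in phi."""
    return frozenset(s for s in LTS
                     if all(t in phi for a, t in LTS[s] if a not in excluded))


def implies(phi, psi):
    return (TRUE - phi) | psi


def always(phi):
    """max(X. phi & [-]X), computed by iterating downwards from all states."""
    x = TRUE
    while True:
        nxt = phi & box_except(set(), x)
        if nxt == x:
            return x
        x = nxt


# <insert><insert>[-remove]F
two_inserts = diamond("insert", diamond("insert", box_except({"remove"}, FALSE)))

# max(X. (CanInsert => CanIR) & [-]X)
can_insert = diamond("insert", TRUE)
can_ir = diamond("insert", diamond("remove", TRUE))
prop = always(implies(can_insert, can_ir))

print("P" in two_inserts)            # True
print("P" in prop, "B" in prop)      # True True
```

Because the state space is finite, the maximal fixed point is reached after finitely many iterations; the loop stops as soon as the set of states no longer shrinks.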
In the next two sections we describe the specification and the testing performed. The
two main features of the SPARC architecture that involve parallelism are instruction
pipelining and a memory model that supports multiprocessor operations. In this paper
we consider both these aspects. For the sake of readability we use the CCS syntax for
all elements except the minimal and maximal fixed point. All specifications in the CWB
syntax are available from the author. Section 3 describes the modelling of instruction
pipelining and the delayed instruction, while section 4 describes the memory pipelining
model. While both the models specify pipelining, the effects are different, with instruction
pipelining being simpler than the memory model.
3 A Simplified Instruction Pipelining in SPARC
While instruction pipelining is not very new, the design of an architecture where the
instruction pipelining is visible at the program level is relatively modern. It has been
made popular mainly by the RISC architectures. We only provide a brief explanation of
this feature. The reader is referred to [Spa91] for more details.
In addition to the program counter (PC) the SPARC has an nPC which points to the
next instruction. It is usually PC+4 except in the case of branch instructions. The SPARC
provides two types of branch instructions, viz., normal branches and annulled branches.
After executing a normal branch instruction, the instruction pointed to by the nPC is
executed. In the case of annulled branches the instruction pointed to by the nPC is not
executed. A simplified view is explained using the following table.
PC     Instruction
8      Non-branch
12     Branch to 40 (execute delay)
16     Non-branch
...    ...
40     Instruction
The instruction sequence executed will be 8, 12, 16, 40, ... . If the instruction at
address 12 annulled the delayed instruction, the sequence will be 8, 12, 40, ... . The formal
specification and testing is defined in the next section.
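The two sequences in the table can be reproduced with a small simulation of the PC/nPC pair. The Python sketch below follows only the simplified view given above (a branch either executes or annuls its delay instruction); it does not attempt to capture the full SPARC branch semantics. The function name executed_addresses and the tuple encoding of instructions are assumptions made for the example.

```python
def executed_addresses(program, start, count):
    """Return the first `count` executed addresses under the simplified model.

    `program` maps an address to ("branch", target, annul); any address
    that is absent is treated as a non-branch instruction.
    """
    pc, npc = start, start + 4
    trace = []
    while len(trace) < count:
        trace.append(pc)
        instr = program.get(pc, ("op",))
        if instr[0] == "branch":
            _, target, annul = instr
            if annul:
                # The delay instruction at nPC is discarded.
                pc, npc = target, target + 4
            else:
                # The delay instruction at nPC runs before the target.
                pc, npc = npc, target
        else:
            pc, npc = npc, npc + 4
    return trace


delayed = {12: ("branch", 40, False)}
annulled = {12: ("branch", 40, True)}

print(executed_addresses(delayed, 8, 4))    # [8, 12, 16, 40]
print(executed_addresses(annulled, 8, 3))   # [8, 12, 40]
```

Keeping the pair (pc, npc) as explicit state mirrors the description of the nPC as the address of the next instruction, which is what makes the delay slot visible at the program level.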
3.1 Specification of Delayed Instructions
Towards modelling the instruction pipeline, we make the following simplifications. In
this work we do not consider the complete generality of the SPARC branch instructions.
We assume that a branch instruction is denoted by the action branch. As annulling of the
delayed instruction can depend on whether a transfer of control occurs, an internal choice
of $\tau$ or signal_annul is used. As modelling value passing results in an infinite (or very
large) state space, we also do not model different addresses and hence branching to
different locations.
We model the PC and the nPC as buffers of size 1. As PC and nPC represent a
pipeline, elements are inserted into nPC (insert) and removed from PC ('remove), with
getfromnpc used to transfer an instruction from nPC to PC. The processor (CPU) fetches
an instruction from the PC and indicates to the environment that it did so via the action
fetch, performs a decode and continues or treats the instruction as a branch. The
unit handling control transfer instructions either executes the next (and hence delay)
instruction or signals an annulment (signal_annul), removes the next instruction (does not
execute it) and continues. Note that we need the actions fetch and signal_annul to indicate
the behaviour of the system to the environment. Otherwise the system will collapse
to an infinite sequence of $\tau$ moves.

PC       = getfromnpc . remove . PC
NPC      = insert . getfromnpc . NPC
CPU      = remove . fetch . decode . (τ . branch . Continue + τ . CPU)
Branch   = branch . (τ . continue . Branch + τ . signal_annul . annul . Branch)
Continue = continue . CPU + annul . remove . fetch . CPU
Sys      = (PC | NPC | CPU | Branch) \ {annul, continue, remove, branch, getfromnpc}
Figure 3: SPARC Processor-1
The CCS specification used in the CWB is given in figure 3. Sometimes, it is useful
to construct a diagrammatic representation of a CCS specification. The finite state
representation of the processes CPU and Continue (the states are indicated in the diagram)
is given in figure 4. While such diagrams may help clarify the behaviour, we do not present
any others for the sake of brevity.
It is also possible to model the pipelining by a process (called Pipeline) which is a FIFO
buffer of size 2 [Mil89], as shown in figure 5. This is similar to the example considered
earlier.
One can check that the definitions involving the process Pipeline or the process PC in
conjunction with the process NPC are equivalent. That is, Sys and NewSys are weakly
bisimilar. They are not strongly bisimilar due to the synchronisation between PC and
NPC.
The intuitive property the system should satisfy is that after a signal_annul and a
fetch, the action decode cannot be exhibited. This is because the processor has to discard
the instruction just fetched, simulating annulling. This can be expressed in the modal
$\mu$-calculus as shown in figure 6.
[State-transition diagram of the processes CPU and Continue; transitions are labelled
remove, fetch, decode, τ, branch, continue and annul.]
Figure 4: State Diagram
Pipeline = insert . P_1
P_1      = insert . Full + remove . Pipeline
Full     = remove . P_1
NewSys   = (Pipeline | CPU | Branch) \ {annul, continue, remove, branch}
Figure 5: Pipeline as a Buffer
Delay    = max(X. (Poss => Required) & [-]X)
Poss     = <signal_annul><<fetch>>T
Required = <signal_annul><<fetch>>[decode]F
Figure 6: Modal Formula for Delayed Instruction
The intuitive explanation of the formula is as follows. If it is possible to exhibit
signal_annul followed by fetch (i.e., the formula Poss), the required behaviour must be
observed, i.e., the process cannot exhibit the action decode. The formula Required specifies
this by [decode]F, which requires that decode is impossible. Note that we use the maximal
fixed point operator as the specification Delay is a safety property, i.e., it has to be
satisfied by every execution. The CWB verifies that the process Sys satisfies the formula Delay.
In this work we do not consider different types of instructions and assume that there
is one action which represents an instruction. The difference between the two kinds of
control transfer instruction (annulling or executing the delay) is modelled as an internal
choice. This concludes our discussion of instruction pipelining. In the next section we
consider the two main memory models supported by the SPARC architecture.
4 A Simplified SPARC Memory Model
The definition of the SPARC memory model is applicable to both uniprocessors and shared
memory multiprocessors. The memory model relates the semantics of the memory
operations as issued by a processor and the semantics of the operations as executed by a
memory unit. In other words, the model specifies the semantics of data load and store
and the relation between the order in which a processor issues the instructions and the
order in which a central memory executes them. It also defines how instruction fetches
are synchronised with memory operations.
In this work we consider a simplified model of the total store ordering (TSO) and the
partial store ordering (PSO). Both these models only specify the behaviour observed by
the software and hence are good candidates to be modelled on the CWB. For the purposes
of the model, a processor consists of a unit which issues loads and stores to the processor's
memory port. This order is called the processor's issuing order. The memory executes the
instructions of all the processors in an order called the memory order. The TSO model
guarantees that the sequence of operations executed by the memory is identical to the one
issued by a processor. Hence as far as the processor is concerned, the memory is a FIFO
structure. In the PSO model the order in which the memory executes the operations
could be different from the order in which a processor issued them. Hence the buffer is not
guaranteed to be a FIFO structure. It is possible to maintain a relationship between the
issuing order and the execution order using the stbar instruction. The stbar instruction
ensures that any memory operations issued by a processor before a stbar are executed before
the operations issued after the stbar. Hence the stbar instruction partitions the processor's
issuing sequence into non-FIFO classes, but the partitions themselves are ordered. Consider
for example a single processor issuing the instructions i1, i2, i3. In the TSO model, the
memory will necessarily execute i1 followed by i2 followed by i3. However, in the PSO
model the memory could execute i2 followed by i3 followed by i1. If the sequence were
i1, i2, stbar, i3, the memory could execute i1 and i2 in any order but will execute i3 only
after i1 and i2. Hence a limited form of FIFO behaviour is exhibited. Clearly, i1, stbar,
i2, stbar, i3 will be executed in FIFO order. More details can be found in [Spa91, pages
59-68].
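The effect of stbar on the permitted memory orders can be enumerated directly. The Python sketch below follows the simplified description given above, in which any operations not separated by a stbar may be reordered under PSO; it is not a model of the full SPARC definition. The function name memory_orders and the string encoding of the instructions are assumptions made for the example.

```python
from itertools import permutations


def memory_orders(issued, model):
    """Return the set of memory orders allowed for one issuing sequence.

    `issued` is a list of operation names in which the string "stbar"
    marks a store barrier; `model` is "TSO" or "PSO".
    """
    if model == "TSO":
        # The memory behaves as a FIFO: only the issuing order is allowed.
        return {tuple(op for op in issued if op != "stbar")}

    # PSO: split the issuing sequence into the classes delimited by stbar.
    groups, current = [], []
    for op in issued:
        if op == "stbar":
            groups.append(current)
            current = []
        else:
            current.append(op)
    groups.append(current)

    # Any order within a class, but the classes themselves stay in order.
    orders = {()}
    for group in groups:
        orders = {done + perm for done in orders for perm in permutations(group)}
    return orders


print(len(memory_orders(["i1", "i2", "i3"], "PSO")))            # 6
print(sorted(memory_orders(["i1", "i2", "stbar", "i3"], "PSO")))
# [('i1', 'i2', 'i3'), ('i2', 'i1', 'i3')]
```

Every TSO order is also a PSO order in this sketch, which matches the intuition that the sequential buffer is a special case of the partially ordered one.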
The formal specification and testing is given below.
4.1 Memory Model
In order to automate the process of verification, we consider a few more simplifications.
The modelling of values and addresses results in a large state space (which
makes automatic verification extremely time consuming), due to which we do not consider
them. This restriction can be removed easily by representing addresses using
non-determinism. See [Mil89], where values are simulated by choice. Therefore, we also assume
that a load does not look at the buffer to see if an appropriate store has been issued
before. The two cases of a load returning without a memory operation and with a memory
operation can also be modelled as a non-deterministic choice of two processes. To simplify
this exposition further we do not consider the flush instruction. Therefore, in this paper
we consider only the store, load and stbar instructions.
If we were to consider a general specification of the memory model, an infinite state
space process would be necessary. In other words, we would have to assume an unbounded
memory system. As this is not practical we consider a fixed, finite buffer size. The system
we model consists of a store buffer of size 3.
It appears to be very difficult to specify the buffer succinctly. The main reason seems
to be the lack of a general sequencing operator as in ACP [BK88]. Furthermore, the
behaviour of the buffer requires it to be history sensitive, i.e., it has to `remember' the
items inserted into it before a stbar instruction was executed and to distinguish the various
instructions separated by stbars.
Our specification is by enumeration, i.e., each possible state that the buffer could be
in is explicitly listed. For example, P_sstbl indicates a state where a load followed by a
stbar, followed by a store was issued. Thus it represents the minimal state machine. The
behavioural specification implicitly removes the stb instruction when the last instruction
before the stb is removed. For example, P_sstbl evolves to P_s after the load instruction is
removed. We use the actions load_insert, store_insert and stb to indicate the interaction
between a processor and the buffer, while the actions 'load_remove and 'store_remove
indicate the removal of the items from the buffer by the single-ported memory. The
complete specification of the 3 element buffer in the PSO model is given by POBuf in
figures 7 and 8.
For the sequential (TSO) buffer we once again enumerate each state the buffer can be in,
and hence the specification is minimal. As in the PSO case the stb instruction is removed
implicitly. The specification of the sequential buffer of size 3 is presented in figures 9 and 10.
Now we describe the tests performed on the two specifications to gain confidence
POBuf  = load_insert . P_l + store_insert . P_s
P_l    = load_insert . P_ll + store_insert . P_ls + stb . P_stbl + load_remove . POBuf
P_s    = load_insert . P_ls + store_insert . P_ss + stb . P_stbs + store_remove . POBuf
P_ll   = load_insert . P_lll + store_insert . P_lls + stb . P_stbll + load_remove . P_l
P_ss   = load_insert . P_lss + store_insert . P_sss + stb . P_stbss + store_remove . P_s
P_ls   = load_insert . P_lls + store_insert . P_lss + stb . P_stbls
         + load_remove . P_s + store_remove . P_l
P_stbl = load_remove . POBuf + load_insert . P_lstbl + store_insert . P_sstbl
P_stbs = store_remove . POBuf + load_insert . P_lstbs + store_insert . P_sstbs
Figure 7: PSO-1
P_lls   = load_remove . P_ls + store_remove . P_ll
P_lss   = store_remove . P_ls + load_remove . P_ss
P_lll   = load_remove . P_ll
P_sss   = store_remove . P_ss
P_stbss = store_remove . P_stbs
P_stbll = load_remove . P_stbl
P_lstbs = store_remove . P_l
P_sstbs = store_remove . P_s
P_lstbl = load_remove . P_l
P_sstbl = load_remove . P_s
P_stbls = load_remove . P_stbs + store_remove . P_stbl
Figure 8: PSO-2
SeqBuff      = load_insert . SeqBuff_l + store_insert . SeqBuff_s
SeqBuff_l    = load_remove . SeqBuff + store_insert . SeqBuff_sl
               + load_insert . SeqBuff_ll + stb . SeqBuff_stbl
SeqBuff_s    = store_remove . SeqBuff + store_insert . SeqBuff_ss
               + load_insert . SeqBuff_ls + stb . SeqBuff_stbs
SeqBuff_sl   = load_remove . SeqBuff_s + store_insert . SeqBuff_ssl
               + stb . SeqBuff_stbsl + load_insert . SeqBuff_lsl
SeqBuff_ls   = store_remove . SeqBuff_l + store_insert . SeqBuff_sls
               + stb . SeqBuff_stbls + load_insert . SeqBuff_lls
SeqBuff_ll   = load_remove . SeqBuff_l + load_insert . SeqBuff_lll
               + store_insert . SeqBuff_sll + stb . SeqBuff_stbll
SeqBuff_ss   = store_remove . SeqBuff_s + load_insert . SeqBuff_lss
               + store_insert . SeqBuff_sss + stb . SeqBuff_stbss
SeqBuff_stbl = load_remove . SeqBuff + load_insert . SeqBuff_lstbl
               + store_insert . SeqBuff_sstbl
SeqBuff_stbs = store_remove . SeqBuff + load_insert . SeqBuff_lstbs
               + store_insert . SeqBuff_sstbs
Figure 9: TSO-1
SeqBuff_lll   = load_remove . SeqBuff_ll
SeqBuff_sss   = store_remove . SeqBuff_ss
SeqBuff_sll   = load_remove . SeqBuff_sl
SeqBuff_lls   = store_remove . SeqBuff_ll
SeqBuff_ssl   = load_remove . SeqBuff_ss
SeqBuff_lsl   = load_remove . SeqBuff_ls
SeqBuff_lss   = store_remove . SeqBuff_ls
SeqBuff_sls   = store_remove . SeqBuff_sl
SeqBuff_stbsl = load_remove . SeqBuff_stbs
SeqBuff_stbll = load_remove . SeqBuff_stbl
SeqBuff_stbls = store_remove . SeqBuff_stbl
SeqBuff_stbss = store_remove . SeqBuff_stbs
SeqBuff_lstbs = store_remove . SeqBuff_l
SeqBuff_lstbl = load_remove . SeqBuff_l
SeqBuff_sstbl = load_remove . SeqBuff_s
SeqBuff_sstbs = store_remove . SeqBuff_s
Figure 10: TSO-2
Producer1 = store . store_insert . P11 + load . load_insert . P11
P11       = stbar . stb . Producer1 + τ . Producer1
Producer2 = store . store_insert . P21 + load . load_insert . P21
P21       = stb . Producer2 + τ . Producer2
PSO       = (Producer1 | POBuf) \ {stb, load_insert, store_insert}
TSO       = (Producer1 | SeqBuff) \ {stb, load_insert, store_insert}
Figure 11: Specification for Testing the Architecture
that our definition satisfies the requirements imposed on the two models. Towards that
we define processes which generate a sequence of loads, stores and stbars. Again each
operation is split into two actions, one for the visible part and the other for the internal
synchronisation (e.g., load_insert). The two environments TSO and PSO are systems
constructed using the TSO buffer and the PSO buffer respectively. The specification of
the above is shown in figure 11. The main difference between Producer1 and Producer2
is that in Producer2 the issuing of the stbar instruction is not visible.
The CWB verifies that PSO and TSO are not weakly bisimilar or even trace equivalent.
This is to be expected, as in the PSO model the execution order of stores and loads can be
different from the issuing order. However, TSO is less than PSO in the trace preorder. Thus
every trace that can be exhibited by TSO can be exhibited by PSO. Therefore the
specification of the sequential buffer is not inconsistent with the partial order buffer.
To ensure that the difference between TSO and PSO is indeed due to the stbar
instruction, we verify that PSO satisfies the modal formula Cando in figure 12, which
TSO cannot satisfy. The formula Cando states that after a load and a store, the memory
is able to execute the store operation. The dual of Cando is the formula Ordering, which
requires that after a sequence of load and store actions it is not possible to execute a
store_remove.
Cando        = <<load>><<store>><<store_remove>>T
Ordering     = <<load>><<store>>[[store_remove]]F
STBF         = max(X. (LoadPossible | [-load]X) & [-]X)
LoadPossible = <<load>><<stbar>><<store>>[[store_remove]]F
Figure 12: Modal Formulae Distinguishing TSO and PSO
PSO also satisfies the requirement that stbar ensures issuing order, as it satisfies
the formula STBF in figure 12. The intuitive meaning of STBF is that if a stbar is
issued after a load and before a store, it is not possible to execute the store operation.
As this is a safety requirement, we use the maximal fixed point operator. To understand
this more formally, we consider two main possibilities; viz., it is possible to perform a
load followed by stbar and store or it is not. If it is not possible to perform the specified
sequence, all subsequent behaviours continue to satisfy STBF. Otherwise performing the
sequence of actions will result in arriving at a state where it is not possible to perform
store_remove. This is stated by the formula LoadPossible. In other words, STBF is of
the form max(X. (P | Q) & [-]X), where the formula corresponding to P states that if a load
and store are separated by stbar then store_remove cannot be observed, and the formula
corresponding to Q states that if load is not possible the formula STBF is satisfied in the
future.
The above tests distinguished POBuf and SeqBuff. We now identify conditions
under which the two systems are equivalent. The first condition we consider is a process
which separates every load/store with a stbar instruction. By the definition of the effect
of stbar on the PSO model, it is clear that the PSO model collapses to the TSO model.
We also show that executing the stbar instruction in the TSO model has no effect. The
specifications of these tests are presented in figure 13.
SeqProd ensures that after every load or store a stbar is issued. This process does not
use the non-ordered access of the PSO-store. Thus Sys1 and Sys2 are weakly bisimilar,
i.e., the underlying memory model is of no consequence for SeqProd.
Sys3 and Sys4 are trace equivalent, thus showing that stbar is a no-op in the TSO-store
model. Sys3 and Sys4 are not bisimilar due to the presence of stb in the buffers. Note
that Producer2 was essential, as otherwise the observational actions become different.

SeqProd = load . load_insert . stbar . stb . SeqProd
          + store . store_insert . stbar . stb . SeqProd
Sys1    = (SeqBuff | SeqProd) \ {stb, load_insert, store_insert}
Sys2    = (POBuf | SeqProd) \ {stb, load_insert, store_insert}
Sys3    = (Producer2 | SeqBuff) \ {stb, load_insert, store_insert}
SProd   = load . load_insert . SProd + store . store_insert . SProd
Sys4    = (SProd | SeqBuff) \ {stbar, stb, load_insert, store_insert}
Figure 13: Equivalence Testing
5 Lessons Learned
In this paper we have shown the feasibility of specifying and testing concurrent aspects
of an architecture. The type of analysis performed on the various specifications has been
inspired by both the informal description of the various features and the formal description
(using first order logic) given in the SPARC manual [Spa91]. The principal observational
properties have been verified here. In verifying the system we have generated formulae
which we believe were relevant.
The CWB has the capability of generating formulae which distinguish non-equivalent
processes. The command dfobs of processes P and Q generates a formula ignoring $\tau$ which
is satisfied by P and not by Q. Similarly the command dfstr generates a formula where
the $\tau$ actions are accounted for which is satisfied by P and not by Q, while the command
dfmay generates a trace exhibited by one but not the other.
The command dfstr TSO PSO generates the formula
<store><t><t><load><t>[load_remove]F
while the command dfobs PSO TSO generates the formula
<<load>>[[store]][[load]][[load]]<<store_remove>>
These formulae capture the non-FIFO behaviour of the PSO model while requiring the
FIFO behaviour of the TSO model. Similarly, the command dfmay PSO TSO generates the
string store; load; load_remove, which can be performed by PSO but not by TSO. We
have some confidence in our specifications as the CWB agrees with our observations. The
CWB does not generate formulae involving fixed points because a finite formula suffices to
distinguish two processes.
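The notion of a distinguishing trace can be illustrated on the small buffer example of section 2.1, where the state spaces are easy to inspect by hand. The Python sketch below searches breadth-first for a shortest trace of one process that the other cannot perform. It is an assumption-laden illustration of the idea and is not claimed to be the algorithm behind dfmay; the function name distinguishing_trace and the state name B1 are hypothetical.

```python
from collections import deque

# Two-element buffer P and one-element buffer B from section 2.1.
LTS = {
    "P": [("insert", "P1")],
    "P1": [("insert", "P2"), ("remove", "P")],
    "P2": [("remove", "P1")],
    "B": [("insert", "B1")],     # "B1": hypothetical name for B after one insert
    "B1": [("remove", "B")],
}


def distinguishing_trace(lts, s, t):
    """Return a shortest trace of s that t cannot perform, or None."""
    start = (s, frozenset([t]))
    seen = {start}
    queue = deque([(s, frozenset([t]), ())])
    while queue:
        x, ys, path = queue.popleft()
        for a, x2 in lts[x]:
            # States t may be in after following the same trace.
            ys2 = frozenset(y2 for y in ys for b, y2 in lts[y] if b == a)
            if not ys2:
                return path + (a,)
            if (x2, ys2) not in seen:
                seen.add((x2, ys2))
                queue.append((x2, ys2, path + (a,)))
    return None


print(distinguishing_trace(LTS, "P", "B"))   # ('insert', 'insert')
print(distinguishing_trace(LTS, "B", "P"))   # None
```

The second call returns None because every trace of the one-element buffer is also a trace of the two-element buffer, which is the trace preorder observed earlier with maypre B P.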
As the SPARC architecture is specified formally, it may be possible to prove some
completeness result. However, such a result is beyond the scope of this paper. We hope
that one will be able to prove that all properties specified by the first order logic
specification have been covered by the modal $\mu$-calculus specifications.
Using a completely automatic tool has its limitations. Features such as the TSO and
PSO buffers had to be enumerated and hence were not elegant specifications. Enumeration
also makes it difficult to generate a buffer of size n + 1 from a buffer of size n.
Consider, for example, figure 14 where a PSO buffer of size two is specified. It is easy
to check that Two is less than POBuf in the trace preorder. However, it is difficult to
see how Two can be expanded to obtain POBuf. As POBuf can be perceived to be an extension
of Two, one may assume that Two in parallel composition with another process (subject
to appropriate synchronisations) can be used to obtain POBuf. The CWB supports a
feature for equation solving which, initially, appears to be attractive. Given processes A,
B and a synchronisation set L, the system finds an X such that ((A | X) \ L) is bisimilar
to B. This feature turns out not to be useful, as the equation requiring ((Two | X) \ L)
to be bisimilar to POBuf cannot be solved easily. Clearly the set L cannot be empty, as
interaction between Two and the unknown X is essential. As the equation solving system
requires the user to specify L, and it is not clear how to choose L before X is known,
the above equation cannot be solved.

    T    = si.Ts + li.Tl
    Ts   = si.Tss + li.Tls + stb.Tsts
    Tl   = si.Tls + li.Tll + stb.Tstl
    Tss  = sr.Ts
    Tll  = lr.Tl
    Tstl = lr.T
    Tsts = sr.T
    Tls  = sr.Tl + lr.Ts
    Two  = T[store_insert/si, load_insert/li,
             load_remove/lr, store_remove/sr]

Figure 14: PSO Buffer
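
A trace-preorder check of the kind mentioned above can be sketched in Python for the
process T of Figure 14. The dictionary encoding, the function name distinguishing_trace
and the restricted variant R are assumptions made for illustration. R is not POBuf or any
process from this paper: it is T with the load-removal branch of Tls deleted. The search
assumes deterministic transition tables, which holds for Figure 14.

```python
from collections import deque

# Process T of Figure 14 as a table: state -> {action: next state}.
T = {
    "T":    {"si": "Ts", "li": "Tl"},
    "Ts":   {"si": "Tss", "li": "Tls", "stb": "Tsts"},
    "Tl":   {"si": "Tls", "li": "Tll", "stb": "Tstl"},
    "Tss":  {"sr": "Ts"},
    "Tll":  {"lr": "Tl"},
    "Tstl": {"lr": "T"},
    "Tsts": {"sr": "T"},
    "Tls":  {"sr": "Tl", "lr": "Ts"},
}

# Hypothetical restricted variant: in Tls only the store may be removed.
R = {state: dict(moves) for state, moves in T.items()}
R["Tls"] = {"sr": "Tl"}


def distinguishing_trace(p, q, start="T"):
    """Breadth-first search for a shortest trace of p that q cannot perform.

    Returns None when every trace of p is also a trace of q.
    Both tables must be deterministic and share the start state name.
    """
    queue = deque([(start, start, ())])
    seen = {(start, start)}
    while queue:
        sp, sq, trace = queue.popleft()
        for action, next_p in sorted(p[sp].items()):
            if action not in q[sq]:
                return trace + (action,)
            pair = (next_p, q[sq][action])
            if pair not in seen:
                seen.add(pair)
                queue.append((pair[0], pair[1], trace + (action,)))
    return None


r_below_t = distinguishing_trace(R, T)   # None: R is below T in the trace preorder
t_not_below_r = distinguishing_trace(T, R)
```

In this sketch every trace of R is a trace of T, while T has a trace (a load insert, a
store insert, then a load removal) that R cannot perform. This is the same kind of
evidence that dfmay reports.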
In this paper, we have modelled a small system. As most of the algorithms to check
bisimilarity, trace equivalence etc. are exponential [KS90], an automatic verifier cannot be
used for large systems. But this technique is useful in studying synchronisation patterns
in small systems and performing compositional verification semi-automatically.
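
One source of this growth can be seen before any equivalence is checked: the state space
of a parallel composition can be the product of the state spaces of its components. The
Python sketch below counts the reachable states of n independent copies of the eight-state
process T of Figure 14. Pure interleaving without synchronisation is an assumption of the
sketch, as are the encoding and the function name; it illustrates state explosion only and
is not the complexity analysis of [KS90].

```python
# Process T of Figure 14 as a table: state -> {action: next state}.
T = {
    "T":    {"si": "Ts", "li": "Tl"},
    "Ts":   {"si": "Tss", "li": "Tls", "stb": "Tsts"},
    "Tl":   {"si": "Tls", "li": "Tll", "stb": "Tstl"},
    "Tss":  {"sr": "Ts"},
    "Tll":  {"lr": "Tl"},
    "Tstl": {"lr": "T"},
    "Tsts": {"sr": "T"},
    "Tls":  {"sr": "Tl", "lr": "Ts"},
}


def reachable_states(component, start, copies):
    """Count reachable global states of `copies` interleaved copies of `component`."""
    initial = (start,) * copies
    seen = {initial}
    stack = [initial]
    while stack:
        state = stack.pop()
        # Exactly one component moves in each global step (no synchronisation).
        for i, local in enumerate(state):
            for target in component[local].values():
                successor = state[:i] + (target,) + state[i + 1:]
                if successor not in seen:
                    seen.add(successor)
                    stack.append(successor)
    return len(seen)


counts = [reachable_states(T, "T", n) for n in (1, 2, 3)]
```

The counts grow as 8, 64 and 512, that is, as 8 to the power n, so even a handful of such
buffers composed in parallel yields a state space that is costly to explore exhaustively.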
The SPARC manual [Spa91] provides a formal definition of the memory models. As
that specification is logical rather than behavioural, our specification can be considered
a behavioural representation of the model. However, one needs a system where the
behavioural representation can be verified against the logical specification. We believe
that a system like HOL would be very useful. The technique to specify CCS in HOL has
been described in [CIN91]. It remains to be seen if this technique can be adapted for our
system.
Other features such as the flush and ldstub instructions can be added to the basic
specification described here. Modelling the flush instruction requires the specification of
an instruction load and associated buffers which behave similarly to the PSO model. The
ldstub instruction blocks the processor and can be modelled by requiring a handshake
(synchronisation) between the memory and the processor. All these features can be modelled
individually; however, a combined specification appears to be too large to run on the CWB.
This indicates that a prototype implementation, while satisfactory for small examples, is
not sufficient for large examples. In conclusion, our work shows that the CWB is useful
in studying synchronisation in the initial phases of design.
Acknowledgements
The author acknowledges the many helpful comments from a referee. This research has
been partially supported by University of Canterbury Grant No 1787123.
References
[BB89] T. Bolognesi and E. Brinksma. Introduction to the ISO Specification Language
LOTOS. In P. H. J. van Eijk, C. A. Vissers, and M. Diaz, editors, The Formal
Description Technique LOTOS, pages 23-73. North Holland, 1989.
[BK88] J. A. Bergstra and J. W. Klop. Process Theory Based on Bisimulation Semantics.
In Linear Time, Branching Time and Partial Order in Logics and Models for
Concurrency, LNCS 354, pages 50-122. Springer Verlag, 1988.
[CIN91] A. Camilleri, P. Inverardi, and M. Nesi. Combining Interaction and Automation
in Process Algebra Verification. In S. Abramsky and T. S. E. Maibaum, editors,
TAPSOFT-91: LNCS 494, pages 283-296. Springer Verlag, 1991.
[Coh88] A. J. Cohn. A Proof of Correctness of the Viper Microprocessor: The First
Level. In G. Birtwistle and P. A. Subrahmanyam, editors, VLSI Specification,
Verification and Synthesis. Kluwer Academic Press, 1988.
[CPS89] R. Cleaveland, J. Parrow, and B. Steffen. The concurrency workbench. In
Proceedings of the Workshop in Automatic Verification Methods for Finite-State
Systems: LNCS 407, pages 24-37. Springer Verlag, 1989.
[CPS93] R. Cleaveland, J. Parrow, and B. Steffen. The Concurrency Work Bench: A
Semantics Based Tool for the Verification of Concurrent Systems. ACM Trans.
on Programming Languages and Systems, 15(1):36-72, January 1993.
[GB89] B. Graham and G. Birtwistle. Formalising the Design of an SECD Chip. In
M. Leeser and G. Brown, editors, Hardware Specification, Verification and
Synthesis: Mathematical Aspects: LNCS 408, pages 40-66. Springer Verlag, 1989.
[Gor85] M. Gordon. HOL: A Machine Oriented Formulation of Higher Order Logic. Tech-
nical Report 68, University of Cambridge, Computer Laboratory, 1985.
[Hen88] M. C. B. Hennessy. Algebraic Theory of Processes. MIT Press, 1988.
[HM85] M. Hennessy and R. Milner. Algebraic laws for nondeterminism and concurrency.
Journal of the Association for Computing Machinery, 32(1):137-161, January
1985.
[KS90] P. C. Kanellakis and S. A. Smolka. CCS expressions, finite state processes and
three problems of equivalence. Information and Computation, 86(1), May 1990.
[Lar88] K. G. Larsen. Proof systems for Hennessy-Milner logic with recursion. In 13th
Colloquium on Trees in Algebra and Programming. Springer Verlag, 1988.
[LD90] P. Loewenstein and D. L. Dill. Verification of a Multiprocessor Cache Protocol
Using Simulation Relations and Higher-Order Logic. In E. M. Clarke and
R. P. Kurshan, editors, Computer Aided Verification: LNCS 531, pages 302-311.
Springer Verlag, 1990.
[LM87] K. G. Larsen and R. Milner. Verifying a Protocol Using Relativized Bisimulation.
In ICALP -87, LNCS 267. Springer Verlag, 1987.
[Mil89] R. Milner. Communication and Concurrency. Prentice Hall International, 1989.
[Par81] D. Park. Concurrency and Automata on Infinite Sequences. In Proceedings of the
5th GI Conference, LNCS-104. Springer Verlag, 1981.
[Par87] J. Parrow. Verifying a CSMA/CD-Protocol with CCS. Technical Report ECS-
LFCS-87-18, Computer Science Department, University of Edinburgh, 1987.
[Spa91] Sparc International. The SPARC Architecture Manual: Version 8, 1991.
[Sti89a] C. Stirling. An Introduction to Modal and Temporal Logics for CCS. In Joint
UK/Japan Workshop on Concurrency: LNCS 491, pages 2-20, 1989.
[Sti89b] C. Stirling. Temporal Logics for CCS. In Linear Time, Branching Time and
Partial Order in Logics and Models for Concurrency, LNCS 354. Springer Verlag,
1989.
[vG90] R. van Glabbeek. The Linear Time-Branching Time Spectrum. In J. C. M. Baeten
and J. W. Klop, editors, CONCUR 90, LNCS-458. Springer Verlag, 1990.
[vS89] J. van de Lagemaat and G. Scollo. On the Use of LOTOS for the Formal De-
scription of a Transport Protocol. In K. J. Turner, editor, Formal Description
Techniques. North Holland, 1989.
[vVD89] P. H. J. van Eijk, C. A. Vissers, and M. Diaz. The Formal Description Technique
LOTOS. North Holland, 1989.
