Arbiters: an exercise in specifying and decomposing asynchronously communicating components  by Ebergen, Jo C.
Science of Computer Programming 18 (1992) 223-245 
Elsevier 
Arbiters: an exercise in specifying 
and decomposing asynchronously 
communicating components* 
Jo C. Ebergen 
Computer Science Department, University of Waterloo, Waterloo, Ontario, Canada N2L 3G1 
Communicated by M. Rem 
Received December 1989 
Revised March 1991 
Abstract 
Ebergen, J.C., Arbiters: an exercise in specifying and decomposing asynchronously communicating 
components, Science of Computer Programming 18 (1992) 223-245. 
A method is presented for the formal specification and decomposition of asynchronously com- 
municating components. The method is demonstrated by the design of some arbiters. An arbiter 
is a hardware primitive that realizes the mutually exclusive access of processes to their critical 
sections. It is shown how large arbiters can be decomposed into small ones, and how the 
communication behaviour of arbiters can be specified concisely and conveniently in a simple 
program notation. Furthermore, it is shown that the syntax of a program may guide the designer 
in the verification, and even derivation, of possible decompositions in a calculational style. 
1. Introduction 
As computations are distributed over more and more processes to achieve a high 
degree of parallelism, a growing need arises for primitives that realize the proper 
synchronization and communication between these processes. One of these primi- 
tives has to guarantee the mutually exclusive access of processes to their critical 
sections. It has to arbitrate among a number of concurrent requests of the processes 
for entering their critical section. Only one of these requests may be granted at a 
time. A hardware primitive that realizes such a function is called an arbiter. 
Correspondence to: J.C. Ebergen, Computer Science Department, University of Waterloo, Waterloo, 
Ontario, Canada N2L 3G1. 
* This work was supported by the Natural Sciences and Engineering Research Council of Canada 
under grant OGP0041920. 
0167-6423/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved 
Unfortunately, circuit realizations of arbiters exhibit the fundamental problem of 
metastable behaviour [1]. This means that there may be an indefinite delay before 
a decision is made which of the pending requests will be granted. As a consequence, 
arbiters cannot be used safely in purely synchronous circuits, where decisions must 
be reached within a fixed clock period. Therefore, most circuits in which arbiters 
are used are a special type of asynchronous circuits called speed-independent 
circuits [14]. 
A speed-independent circuit is a network of basic elements of which the correctness 
is independent of delays in the response times of the basic elements. In case the 
correctness of the network is independent of the response times of the connection 
wires as well, we say the circuit is a delay-insensitive circuit [13]. The usefulness, 
flexibility, and potential of these types of asynchronous circuits have been demonstrated 
by many authors [10, 13, 16] and, most recently, by Ivan Sutherland in his 
1988 Turing Award lecture [18]. It is believed that these circuits form a challenging 
new area of circuit design which differs considerably from the traditional design 
techniques for synchronous circuits. One of the most challenging tasks is to develop 
a formalism and notation for the design of these circuits. 
Several notations and formalisms are used to specify the communication behaviour 
of asynchronous circuits. Petri Nets and Signal Transition Graphs are widely used 
[3, 5, 12]. State graphs are also used, but they become unmanageable in case a high 
degree of parallelism is involved. The notation of CSP [7] has been used successfully 
by Martin and others [10, 20]. In this paper we use a simple notation similar to 
regular expressions and CSP. We show that communication behaviours can be 
specified succinctly in this notation, and that such specifications may even help in 
deriving decompositions in a calculational style. 
The design approach is based on trace theory, an event-based formalism without 
any time metric. It is developed for the specification and design of parallel computations 
and delay-insensitive circuits [2, 6, 15, 19, 21]. A variant of trace theory is 
used by Dill, who was the first to design an automatic verifier for speed-independent 
circuits [4]. In our formalism we reason about circuit elements as (abstract) com- 
ponents that communicate asynchronously. The communication behaviours are 
represented by sequences of occurrences of events, also called traces. The communication 
events and the relative ordering of their occurrences are the only topics of 
interest. We will not consider any gate or switch level implementation of our 
components nor give a quantitative analysis of the delays in a circuit. 
We illustrate the method by the design of some arbiters. We first formally specify 
the communication behaviour of the four-phase arbiter. We then present a decomposition 
of the general arbiter, which arbitrates among n ≥ 2 requests, into basic 
arbiters, which arbitrate between only two requests. The decomposition is based on 
the idea of a simple token ring. The token-ring idea has been applied by many 
authors [8-10], sometimes with similar circuits as a result, using different formalisms. 
The design of a general arbiter is interesting for several reasons. First of all, an 
arbiter is a circuit that exhibits both parallel and nondeterministic behaviour. 
Therefore, it is a nice example to illustrate how parallel and nondeterministic 
behaviour are dealt with in the formalism and notation presented. Secondly, the 
example illustrates nicely how one can derive a decomposition in a calculational 
style, in particular when parallelism is involved. Thirdly, several arbiter decomposi- 
tions have been given in the literature which turned out to have errors [4]. This 
shows that finding arbiter decompositions is indeed a nontrivial and challenging task. 
2. Sequential behaviours 
By way of introduction to the formalism and notation, we consider specifications 
of a number of basic components starting with the WIRE and IWIRE component. 
Along the way, we introduce more programming primitives together with their 
semantics. 
The WIRE component has two terminals or communication ports: an input ter- 
minal, say a, and an output terminal, say b. A communication action at a terminal 
is denoted by the name of that terminal. The communication behaviours that can 
take place between the WIRE component and its environment are alternations of a’s 
and b’s not starting with a b. These communication behaviours can be represented 
by the state graph of Fig. l(a), where inputs and outputs are postfixed by ? and ! 
respectively, and every state is a final state. 
Fig. 1. State graphs of (a) WIRE and (b) IWIRE. 
The IWIRE can be seen as an “initialized” WIRE. It also has one input terminal 
and one output terminal, and its communication behaviour is also an alternation 
of communication actions at either terminal, but now not starting with an input. 
The communication behaviour of the IWIRE can be represented by the state graph 
of Fig. l(b). 
Instead of using state graphs, we use regular-expression-like programs, called 
directed commands, for the specification of communication behaviours. Specifications 
for the WIRE and the IWIRE in terms of directed commands are given in Fig. 2. The 
first column lists the names of the components, the second column the directed 
commands, and the third column the schematics, where a schematic is a pictorial 
representation of a component. Here, “;” denotes concatenation, “*[ ]” denotes 
repetition of the enclosed, and pref denotes prefix-closure. 
WIRE    pref *[a?; b!]    (schematic: input a?, output b!) 
IWIRE   pref *[b!; a?]    (schematic: input a?, output b!) 
Fig. 2. Specifications of WIRE and IWIRE. 
The semantics of the notations is defined as follows. Communication behaviours 
are represented by sets of traces, i.e., sets of finite sequences of symbols. For example, 
for the WIRE we have the possible traces ε (the empty trace), a, ab, aba, abab, etc. 
A complete behavioural specification of a component is given by a directed trace 
structure, which is a triple (I, O, T). The set I is called the input alphabet and contains 
all input terminals; O is called the output alphabet and contains all output terminals; 
T is called the trace set and contains all possible communication behaviours. Every 
trace in T is constructed from symbols in I ∪ O. 
Instead of listing all traces of a directed trace structure, we represent a directed 
trace structure by means of a directed command similar to a regular expression. 
(Since we use directed commands and directed trace structures only, we drop the 
adjective “directed” from now on.) The characters ε, b?, and b! are atomic commands 
and represent the trace structures (∅, ∅, {ε}), ({b}, ∅, {b}), and (∅, {b}, {b}), respectively. 
For the moment, we do not attach any operational meaning to (atomic) 
commands, but consider them merely as mathematical objects. 
From the atomic commands we can construct other commands as follows. Let 
commands be denoted by capital E’s and let iE, oE, and tE denote the input 
alphabet, output alphabet, and trace set of the trace structure represented by E 
respectively. The alphabet of E is denoted by aE and given by aE = iE ∪ oE. The 
concatenation, union, repetition, and prefix-closure of trace structures are defined as 
follows: 
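\[
\begin{aligned}
E0 ; E1 &= (iE0 \cup iE1,\ oE0 \cup oE1,\ (tE0)(tE1)), \\
E0 \mid E1 &= (iE0 \cup iE1,\ oE0 \cup oE1,\ tE0 \cup tE1), \\
{*}[E] &= (iE,\ oE,\ (tE)^{*}), \\
\mathbf{pref}\, E &= (iE,\ oE,\ \{\, t_0 \mid (\exists\, t_1 :: t_0 t_1 \in tE) \,\}).
\end{aligned}
\]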
Concatenation of sets is denoted by juxtaposition, and (tE)* denotes the set of all 
finite-length concatenations of traces in tE. (For reasons of brevity, we use the same 
notation for commands and the trace structure represented by commands. Equality 
between commands denotes equality of the trace structures represented by the 
commands.) 
A trace structure E is called prefix-closed if pref E = E. Accordingly, the pref 
operation constructs prefix-closed trace structures. A trace structure E is called 
non-empty if tE ≠ ∅. Consequently, for a prefix-closed, non-empty trace structure 
represented by E, we always have ε ∈ tE. 
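The four operations on trace sets can be tried out on small examples. The following is a minimal Python sketch, not part of the formalism itself: it tracks only the trace set (the alphabets are left out), represents a trace as a tuple of symbols, and cuts repetition off at a length bound because (tE)* is infinite. All function names are assumptions made for this illustration.

```python
EPS = ()  # the empty trace (ε in the text); traces are tuples of symbols

def atom(symbol):
    """Trace set of an atomic command such as a? or b!: one trace of length 1."""
    return {(symbol,)}

def concat(t0, t1, bound):
    """(tE0)(tE1), keeping only traces of length <= bound."""
    return {u + v for u in t0 for v in t1 if len(u) + len(v) <= bound}

def union(t0, t1):
    """tE0 ∪ tE1."""
    return t0 | t1

def star(t, bound):
    """(tE)*: all finite concatenations of traces in t, cut off at `bound`."""
    result, frontier = {EPS}, {EPS}
    while frontier:
        frontier = concat(frontier, t, bound) - result
        result |= frontier
    return result

def pref(t):
    """Prefix-closure: every initial segment of every trace in t."""
    return {u[:k] for u in t for k in range(len(u) + 1)}

def upto(t, n):
    """Restrict a trace set to traces of length <= n."""
    return {u for u in t if len(u) <= n}

# WIRE = pref *[a?; b!] and IWIRE = pref *[b!; a?], exact up to length 4.
# The repetition is unrolled to length 6 first, so that no prefix of
# length <= 4 is lost to the cut-off.
wire = upto(pref(star(concat(atom("a"), atom("b"), 6), 6)), 4)
iwire = upto(pref(star(concat(atom("b"), atom("a"), 6), 6)), 4)

# pref *[(a | b); c], exact up to length 2, to exercise union as well.
choice = concat(union(atom("a"), atom("b")), atom("c"), 6)
merge2 = upto(pref(star(choice, 6)), 2)

print(sorted(wire, key=len))
```

Because the sets are bounded, equality of two such sets only shows agreement up to the chosen length; it is a sanity check, not a proof.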
3. Specification and interpretation 
The communication behaviour between a component and its environment is 
specified by a prefix-closed, non-empty trace structure with disjoint input and output 
alphabet. Such a specification not only prescribes how the component should behave, 
but also how the environment must interact with the component. We interpret a 
prefix-closed, non-empty trace structure in the following mechanistic way. Suppose 
the communication behaviour is specified by a command E, where iE ∩ oE = ∅ and 
tE is prefix-closed and non-empty. Let the communication actions that have taken 
place already correspond to the trace t ∈ tE. (Initially, t = ε.) We say that b is a 
possible next symbol (after trace t) if tb ∈ tE. If b ∈ iE, then the environment may 
produce the next communication action b; if b ∈ oE, then the component may 
produce the next communication action b. Because iE ∩ oE = ∅, a communication 
action is either an input or an output and, therefore, may be produced either by 
the component or by the environment. Furthermore, we have that if the environment 
produces the inputs as specified, every trace in tE may occur. 
Let us illustrate the mechanistic interpretation with the specifications of the MERGE 
and the TOGGLE given in Fig. 3. The environment of the MERGE initially may produce 
either an input a or an input b; the component may then produce an output c, after 
which the environment may produce an input again, and this behaviour repeats. 
MERGE   pref *[(a? | b?); c!] 
TOGGLE  pref *[a?; b!; a?; c!] 
Fig. 3. Specifications of MERGE and TOGGLE. 
The TOGGLE distinguishes the odd and even occurrences of the input a: after 
every odd occurrence of a it may produce output b, and after every even occurrence 
of a it may produce output c. Also here, there is a strict alternation of inputs and 
outputs, where the environment may start with producing an input. 
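The mechanistic interpretation can be replayed mechanically. The sketch below is an illustration only: it encodes the MERGE and TOGGLE commands by hand as small state graphs (the dictionary layout and the names `run` and `possible_next` are assumptions of this sketch) and reports, after a given trace, which inputs the environment may produce and which outputs the component may produce.

```python
# Hand-made state graphs for the two commands of Fig. 3; state 0 is initial.
MERGE = {
    "inputs": {"a", "b"},
    "outputs": {"c"},
    "delta": {0: {"a": 1, "b": 1}, 1: {"c": 0}},        # pref *[(a? | b?); c!]
}
TOGGLE = {
    "inputs": {"a"},
    "outputs": {"b", "c"},
    "delta": {0: {"a": 1}, 1: {"b": 2}, 2: {"a": 3}, 3: {"c": 0}},  # pref *[a?; b!; a?; c!]
}

def run(spec, trace):
    """Return the state reached after `trace`, or None if trace is not in tE."""
    state = 0
    for symbol in trace:
        state = spec["delta"][state].get(symbol)
        if state is None:
            return None
    return state

def possible_next(spec, trace):
    """Split the possible next symbols after `trace` into
    (inputs the environment may produce, outputs the component may produce)."""
    state = run(spec, trace)
    if state is None:
        raise ValueError("trace is not in the specified trace set")
    enabled = set(spec["delta"][state])
    return enabled & spec["inputs"], enabled & spec["outputs"]

# Traces are written as strings here because every symbol is one character.
print(possible_next(MERGE, ""))     # initially only the environment may act
print(possible_next(TOGGLE, "aba")) # after the second a, output c is possible
```

A trace such as `"aa"` for the MERGE is rejected by `run`, which corresponds to the environment not following its prescription.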
From the mechanistic interpretation it follows that a specification is a prescription 
for both component and environment. Because of the inclusion of the environment 
prescription, we can specify the conditions under which correct (component) 
behaviour must be guaranteed. In other words, we interpret a specification E as “if 
the environment produces the inputs as prescribed in E, then the component may 
produce the outputs as prescribed in E”. If the environment violates the prescrip- 
tions, nothing is guaranteed and erroneous behaviour may occur. An example of a 
violation of the environment prescription would be when the environment of the 
WIRE pref *[a?; b!] produces two inputs a in a row without the WIRE producing an 
output b. 
A violation of an environment prescription is also called computation interference 
[21]. Later, when we consider networks of components as decompositions of “larger” 
components, one of our proof obligations is to show that there is no computation 
interference, i.e., for each component in the network the environment prescription 
is not violated. 
Besides absence of computation interference, we have another condition when 
decomposing a component into a network of components: if the environment 
produces the inputs as specified, the network of components may produce any trace 
that is specified. Here, “may produce” should be interpreted as “every trace is 
possible to occur”. Although this requirement excludes decompositions where some 
traces cannot occur, it does not guarantee that every trace indeed will occur. The 
actual occurrence of a trace may also depend on the choices made by the nondeter- 
ministic components. We will return to a more formal treatment of these proof 
obligations later. 
With the mechanistic interpretation given above, we specify unambiguously what 
the component should do and what the environment should do in order to get the 
desired responses from the component. Thus, a specification clearly stipulates what 
an “implementer” of a component has to realize and under what conditions. On 
the other hand, a specification also stipulates how the component should be used 
by the “user”. In our approach, the prescriptions for both “user” and “implementer” 
are laid down in one notation. 
Although the abstract mechanistic interpretation allows for several physical 
implementations (like a mechanical, optical, or electrical one), we briefly indicate 
an electrical implementation only. With each symbol in the alphabet, we associate 
a terminal of a circuit. Each occurrence of a symbol in a trace corresponds to a 
voltage transition at that terminal. There is no distinction between high-going and 
low-going transitions: both transitions are denoted by the same symbol. This type 
of signaling is called transition signaling [18]. Outputs are transitions caused by the 
circuit and inputs are transitions caused by the environment. If we assume that 
initially the voltage levels at the terminals are low, then the interpretation of the 
WIRE component pref *[a?; b!] corresponds to a physical wire and the MERGE 
corresponds to an XOR gate. We shall not discuss any electrical implementations of 
other basic components in this paper, but concentrate on the abstract mechanistic 
interpretations. 
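The correspondence between the MERGE and an XOR gate under transition signaling can be checked on a sample trace. The sketch below is a small illustration under the stated assumption that all voltage levels are initially low; the function name and the sample trace are choices made here, not taken from the paper.

```python
def replay_merge(trace):
    """Replay a trace of the MERGE under transition signaling.

    Every occurrence of a symbol toggles the voltage level at that terminal.
    Returns the levels (a, b, c) observed right after each output transition c.
    """
    level = {"a": 0, "b": 0, "c": 0}   # assumption: all terminals start low
    snapshots = []
    for symbol in trace:
        level[symbol] ^= 1             # a transition flips the level
        if symbol == "c":
            snapshots.append((level["a"], level["b"], level["c"]))
    return snapshots

# A trace of pref *[(a? | b?); c!]: inputs and outputs alternate.
snapshots = replay_merge("acbcbcac")
for a, b, c in snapshots:
    print(a, b, c, c == a ^ b)
```

After each output transition the level at c equals the exclusive-or of the levels at a and b, which is the behaviour of an XOR gate once it has responded to an input change.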
4. Parallel behaviour 
A component which involves parallel behaviour is the (Muller) C-ELEMENT. A 
C-ELEMENT has two input terminals and one output terminal. As a logical circuit 
it is often specified as follows: if both inputs are 1 (0), then the output will become 
1 (0), otherwise the output remains the same. In our formalism, we prescribe a 
special communication behaviour for the C-ELEMENT. This communication 
behaviour is given in the state graph of Fig. 4 together with a schematic of the 
C-ELEMENT. 
In terms of a command, this C-ELEMENT can be specified by 
pref *[(a?; b? | b?; a?); c!], 
where concatenation has a higher priority than union. In case of n-input C- 
ELEMENTS, however, such commands, or state graphs, become prohibitively large, 
because of all the possible sequences of events. 
Fig. 4. State graph and schematic for C-ELEMENT. 
In order to give concise specifications when parallel behaviour is involved, we 
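introduce the operation weaving. Formally, the weave E0 ∥ E1 of two trace structures 
represented by the commands E0 and E1 is defined by 
\[
E0 \parallel E1 = (\, iE0 \cup iE1,\ oE0 \cup oE1,\ \{\, t \in (aE0 \cup aE1)^{*} \mid t{\downarrow}aE0 \in tE0 \,\wedge\, t{\downarrow}aE1 \in tE1 \,\}\,)
\]
where t↓B denotes the trace t projected on alphabet B, i.e., the trace t from which 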
all symbols not in B have been deleted. We stipulate that weaving has highest 
priority, then concatenation, and then union. 
As an example of weaving, consider the two commands E0 = pref *[a?; c!] and 
E1 = pref *[b?; c!]. According to the above definition of weaving, we have 
i(E0 ∥ E1) = {a, b}, 
o(E0 ∥ E1) = {c}, 
t(E0 ∥ E1) = {ε, a, b, ab, ba, abc, bac, ...}. 
Using the definition of weaving, we deduce that the following commands have the 
same trace structure and can all be used as a specification for the C-ELEMENT. 
pref *[(a?; b? | b?; a?); c!] 
= pref *[(a? ∥ b?); c!] 
= pref *[(a?; c!) ∥ (b?; c!)] 
= pref *[a?; c!] ∥ pref *[b?; c!]. 
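The equality between the first and the last command can be checked up to a length bound by applying the definition of weaving literally: enumerate all traces over the joint alphabet and keep those whose projections lie in both trace sets. The following Python sketch does this; the helper names and the hand-written membership tests are assumptions of the sketch, and a bounded comparison is of course no substitute for the calculational proof.

```python
from itertools import product

def project(trace, alphabet):
    """t projected on alphabet B: delete all symbols not in B."""
    return tuple(s for s in trace if s in alphabet)

def weave(alpha0, member0, alpha1, member1, bound):
    """Traces of length <= bound over alpha0 ∪ alpha1 whose projections on
    alpha0 and alpha1 satisfy the membership tests member0 and member1."""
    alphabet = sorted(alpha0 | alpha1)
    return {
        t
        for n in range(bound + 1)
        for t in product(alphabet, repeat=n)
        if member0(project(t, alpha0)) and member1(project(t, alpha1))
    }

def alternates(first, second):
    """Membership test for the trace set of pref *[first; second]."""
    def member(trace):
        return all(s == (first, second)[k % 2] for k, s in enumerate(trace))
    return member

def c_element(trace):
    """Membership test for the trace set of pref *[(a?; b? | b?; a?); c!]."""
    for k, s in enumerate(trace):
        phase = k % 3
        if phase == 0 and s not in ("a", "b"):
            return False
        if phase == 1 and (s not in ("a", "b") or s == trace[k - 1]):
            return False
        if phase == 2 and s != "c":
            return False
    return True

BOUND = 6
woven = weave({"a", "c"}, alternates("a", "c"), {"b", "c"}, alternates("b", "c"), BOUND)
direct = {
    t
    for n in range(BOUND + 1)
    for t in product("abc", repeat=n)
    if c_element(t)
}
print(woven == direct)
```

The traces of length at most three in `woven` are exactly the seven traces listed in the example of weaving above.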
Notice that, in a weave, common symbols must match. One could also say that 
weaving expresses “parallel behaviour with synchronization on common symbols”. 
There are two special cases of weaving E0 and E1: 
(1) if aE0 ∩ aE1 = ∅, weaving amounts to interleaving or shuffle; 
(2) if aE0 = aE1, weaving amounts to intersection. 
Weaving is commutative, associative, and has ε as identity, i.e., E ∥ ε = E. 
A weave of two trace structures can also be seen as the “conjunction” of two 
behaviours: every behaviour that is in accordance with E0 and E1 is contained in 
the weave and vice versa. We use this property for specifying communication 
behaviours that have to satisfy several requirements. For each of the requirements 
we then specify a communication behaviour and, subsequently, take the weave of 
these behaviours as the complete specification. For example, the C-ELEMENT can 
be considered as a conjunction of two behaviours: one behaviour that prescribes 
the alternation of a’s and c’s and one behaviour that prescribes the alternation of 
b’s and c’s. 
The reader may be tempted to interpret the weave as the (parallel) composition 
of components in the sense of “connecting the circuits specified by the weavands”. 
We emphasize, however, that weaving should be considered here solely as an 
operation to construct trace structures for expressing the communication behaviour 
of one component. The (de)composition of components is discussed later. 
5. The four-phase arbiter 
A basic component that exhibits both parallel behaviour and nondeterministic 
behaviour is the four-phase arbiter. The basic four-phase arbiter communicates with 
two processes, process 0 and process 1 say. Each process is connected to the 
four-phase arbiter by four terminals. For process 0, we denote these terminals by 
the symbols r0, g0, f0, and a0, according to the following interpretations that are 
associated with them. 
r0 request for grant; 
g0 grant; 
f0 release (or free) the grant; 
a0 acknowledgement of release. 
A similar interpretation holds for the symbols r1, g1, f1, and a1, but now related 
to process 1. The specification of the four-phase arbiter by means of a command is 
given in Fig. 5. 
The command for the four-phase arbiter can be explained as follows. First, we 
consider the communication with process 0 in isolation, i.e., we consider the symbols 
r0, g0, f0, and a0 only. With respect to these communication actions the behaviour 
is a repetition of request, grant, release, and acknowledgement of release. This 
behaviour is expressed in the first line of the command. A similar reasoning applies 
pref *[r0?; g0!; f0?; a0!] 
∥ pref *[r1?; g1!; f1?; a1!] 
∥ pref *[g0!; f0? | g1!; f1?] 
[Schematic: ARB4 with terminals r0?, g0!, f0?, a0!, r1?, g1!, f1?, a1!] 
Fig. 5. Specification of four-phase arbiter. 
to the communication behaviour with respect to the symbols r1, g1, f1, and a1, 
which gives rise to the second line of the command. Finally, we have to specify that 
the processes have mutually exclusive access to their critical sections, which is the 
only synchronization requirement between the two communication behaviours. Each 
process is in its critical section between the grant and the successive release of the 
grant. The mutual exclusion condition then amounts to requiring that the parts g0!; 
f0? and g1!; f1? do not overlap, i.e., either process 0 is in its critical section or 
process 1 is in its critical section. This leads to the third line of the command. Notice 
that only the third line introduces the nondeterminism in the behaviour. 
The complete specification of the four-phase arbiter is simply the conjunction of 
the three behaviours specified above: the communication behaviour between the 
arbiter and process 0, the communication behaviour between the arbiter and process 
1, and the behaviour to guarantee mutual exclusion. Since “conjunction” of 
behaviours is conveniently expressed by weaving, we obtain the command of Fig. 5. 
The communication behaviour of the four-phase arbiter may also be specified by 
a state graph as given in Fig. 6. Notice that this state graph can be interpreted as 
the Cartesian product of the state graphs of 
pref *[r0?; g0!; f0?; a0!] and pref *[r1?; g1!; f1?; a1!], 
Fig. 6. A state graph for the four-phase arbiter. 
with the exclusion of one state. This is exactly the state where both processes are 
in their critical sections. The exclusion of this state is required by the third line in 
the command. 
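The remark about the state graph can be reproduced by exploring the weave of the three lines of the command as a synchronized product of small state graphs. The sketch below is an illustration only: the three state graphs are written by hand from the command, and the names `cycle`, `MUTEX`, and `reachable` are assumptions of the sketch. State k of a cyclic command means that k of its four actions have occurred in the current round, so state 2 is "in the critical section".

```python
from collections import deque

def cycle(*symbols):
    """State graph of pref *[s0; s1; ...]: state k expects symbols[k]."""
    n = len(symbols)
    return {k: {symbols[k]: (k + 1) % n} for k in range(n)}

PROC0 = cycle("r0", "g0", "f0", "a0")     # pref *[r0?; g0!; f0?; a0!]
PROC1 = cycle("r1", "g1", "f1", "a1")     # pref *[r1?; g1!; f1?; a1!]
MUTEX = {0: {"g0": 1, "g1": 2},           # pref *[g0!; f0? | g1!; f1?]
         1: {"f0": 0},
         2: {"f1": 0}}

def alphabet(graph):
    return {s for moves in graph.values() for s in moves}

def reachable(components):
    """Breadth-first search over joint states of the weave: a symbol may occur
    only if every component that has it in its alphabet can take it."""
    alphas = [alphabet(c) for c in components]
    symbols = set().union(*alphas)
    start = tuple(0 for _ in components)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for symbol in symbols:
            nxt = []
            for graph, alpha, s in zip(components, alphas, state):
                if symbol not in alpha:
                    nxt.append(s)                  # component does not take part
                elif symbol in graph[s]:
                    nxt.append(graph[s][symbol])   # common symbols must match
                else:
                    break                          # some participant refuses
            else:
                nxt = tuple(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

states = reachable([PROC0, PROC1, MUTEX])
pairs = {(s0, s1) for s0, s1, _ in states}
print(len(pairs), (2, 2) in pairs)
```

Of the sixteen pairs in the Cartesian product of the two four-state cycles, only the pair where both processes are in state 2 is missing from `pairs`.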
It is also possible to specify a four-phase arbiter that starts in a state different 
from 00. For example, the four-phase arbiter that starts in state 02 is specified in 
Fig. 7. In contrast to the arbiter of Fig. 5, the arbiter of Fig. 7 initially “has already 
performed” the actions r1 and g1. Notice that 
pref *[f1?; a1!; r1?; g1!] 
= pref (f1?; a1!; *[r1?; g1!; f1?; a1!]). 
The schematic of this four-phase arbiter is also given in Fig. 7. 
pref *[r0?; g0!; f0?; a0!] 
∥ pref *[f1?; a1!; r1?; g1!] 
∥ pref (f1?; *[g0!; f0? | g1!; f1?]) 
[Schematic: ARB4 with terminals r0?, g0!, f0?, a0!, r1?, g1!, f1?, a1!] 
Fig. 7. Specification of four-phase arbiter starting in state 02. 
Remark. Circuit implementations of the four-phase arbiter can be made with a 
so-called ME element [16], MERGES, and TOGGLES. 
From the specification of the basic four-phase arbiter in Fig. 5 we can easily 
construct a specification for a four-phase arbiter that arbitrates among n processes, 
n > 0. For example, for n = 3 we obtain the specification 
pref *[r0?; g0!; f0?; a0!] 
∥ pref *[r1?; g1!; f1?; a1!] 
∥ pref *[r2?; g2!; f2?; a2!] 
∥ pref *[g0!; f0? | g1!; f1? | g2!; f2?]. 
Notice that these specifications are linear in n, while state graphs for such arbiters 
are exponential in n. 
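The linear growth of the command is easy to see when the text of the specification is generated for arbitrary n. The short Python sketch below does so; the function name is an assumption of this illustration, and `||` is used as an ASCII stand-in for the weave symbol.

```python
def arbiter_command(n):
    """Text of the command for a four-phase arbiter among n processes:
    one cyclic line per process plus one mutual-exclusion line."""
    lines = [f"pref *[r{i}?; g{i}!; f{i}?; a{i}!]" for i in range(n)]
    mutex = " | ".join(f"g{i}!; f{i}?" for i in range(n))
    lines.append(f"pref *[{mutex}]")
    return "\n|| ".join(lines)

print(arbiter_command(3))
```

Each additional process adds one line and one alternative to the last line, so for single-digit indices the length of the text grows by a constant amount per process.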
Another attractive property of this command is that the specification concerns 
with respect to parallel behaviour and nondeterministic behaviour are clearly 
separated. 
6. A tentative token-ring decomposition 
In order to become more familiar with specifying communication behaviours by 
means of commands, we discuss a tentative token-ring decomposition for a four-phase 
arbiter arbitrating among n processes, n ≥ 2. The decomposition is based on 
the following idea. We have a ring-wise connection of n components in which a 
token is traveling clockwise from component to component. Each component 
communicates with its neighbours in the ring and with one process. The components 
are called token-ring interfaces. A process requests the token from the token-ring 
interface in order to enter its critical section. If the token-ring interface receives the 
token and there is a pending request, it may grant the token to the process. Otherwise 
the token is sent on to the next token-ring interface. Upon exit from the critical 
section, the process releases the token to the token-ring interface. Since there is 
only one token in the ring, at most one process can be in its critical section. 
Accordingly, mutual exclusion is guaranteed. 
For n = 3 the tentative decomposition into token-ring interfaces is depicted in 
Fig. 8. Occurrences of the symbols n0 and n1 for the first token-ring interface are 
interpreted as the receiving and sending of the token respectively. The IWIRE, 
specified by pref *[n0!; n3?], represents the presence of the token in the initial state. 
Fig. 8. A tentative decomposition into token-ring interfaces. 
Using the same strategy as for the specification of the four-phase arbiter, we can 
give a specification for the first token-ring interface: 
pref *[r0?; g0!; f0?; a0!] 
∥ pref *[n0?; n1!] 
∥ pref (n0?; *[n1!; n0? | g0!; f0?]). 
The first line specifies the communication behaviour between the token-ring interface 
and process 0. The second line specifies the communication behaviour of the 
token-ring interface with its neighbours. The third line specifies the mutual exclusion 
condition and can be explained as follows. After the token-ring interface has received 
the token, it repeatedly arbitrates between either sending the token to the next 
token-ring interface (and waiting until the token is received again) or granting the 
token to process 0 (and waiting until the token is released). Process 0 is in its critical 
section in the part g0!; f0?, and in the part n1!; n0? one of the other processes 
may be in its critical section. 
The above token-ring interface may be specified by syntactically different, but 
semantically equivalent, commands. In order to illustrate this, we rewrite the 
command slightly. The last two lines may be rewritten as follows. 
pref *[n0?; n1!] 
∥ pref (n0?; *[g0!; f0? | n1!; n0?]) 
= {by definition of weaving} 
pref (n0?; *[g0!; f0? | n1!; n0?]) 
= {trace theory calculus} 
pref *[n0?; *[g0!; f0?]; n1!]. 
Consequently, the token-ring interface can also be specified by the command 
pref *[r0?; g0!; f0?; a0!] 
∥ pref *[n0?; *[g0!; f0?]; n1!]. 
A similar property holds for the other token-ring interfaces. We use this freedom 
in manipulating commands later for the verification of the decomposition. 
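The two rewriting steps above can be sanity-checked by comparing trace sets up to a length bound. In the Python sketch below, the three commands involved are encoded by hand as state graphs (an assumption of the sketch, as are all names), and all traces over the four symbols up to length 7 are tested for membership. Agreement up to a bound supports the calculation but does not replace it.

```python
from itertools import product

# Hand-made state graphs; state 0 is the initial state.
RING = {0: {"n0": 1}, 1: {"n1": 0}}                  # pref *[n0?; n1!]
CHOICE = {0: {"n0": 1},                              # pref (n0?; *[g0!; f0? | n1!; n0?])
          1: {"g0": 2, "n1": 3},
          2: {"f0": 1},
          3: {"n0": 1}}
NESTED = {0: {"n0": 1},                              # pref *[n0?; *[g0!; f0?]; n1!]
          1: {"g0": 2, "n1": 0},
          2: {"f0": 1}}

def alphabet(graph):
    return {s for moves in graph.values() for s in moves}

def accepts(graph, trace):
    """True if `trace` can be followed in the state graph from state 0."""
    state = 0
    for symbol in trace:
        if symbol not in graph[state]:
            return False
        state = graph[state][symbol]
    return True

def in_weave(graphs, trace):
    """Definition of weaving: the projection on each alphabet must be accepted."""
    return all(
        accepts(g, [s for s in trace if s in alphabet(g)]) for g in graphs
    )

def traces(member, symbols, bound):
    return {
        t
        for n in range(bound + 1)
        for t in product(symbols, repeat=n)
        if member(t)
    }

SYMBOLS = ["n0", "n1", "g0", "f0"]
BOUND = 7
lhs = traces(lambda t: in_weave([RING, CHOICE], t), SYMBOLS, BOUND)
mid = traces(lambda t: accepts(CHOICE, t), SYMBOLS, BOUND)
rhs = traces(lambda t: accepts(NESTED, t), SYMBOLS, BOUND)
print(lhs == mid == rhs)
```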
7. Correctness criteria 
Now that we have a tentative decomposition, we formulate the conditions we 
have to verify in order to conclude that the decomposition is correct. In this section 
we discuss four conditions, which are based on the abstract mechanistic interpreta- 
tion we gave in Section 3. Informally, a network of components forms a decomposi- 
tion of a component E, if this network may produce any trace in tE, provided the 
environment of this network produces the inputs as specified in E. Furthermore, in 
the network of components no computation interference may occur. 
We formulate the conditions for the tentative decomposition given in Fig. 8. Let 
E be the arbiter arbitrating among three processes as given in Section 5, E_i be the 
ith token-ring interface for 1 ≤ i ≤ 3, and E_4 denote the IWIRE. The network of 
components E_1, E_2, E_3, and E_4 is denoted by (E_1, E_2, E_3, E_4). The property that 
E can be decomposed into the network consisting of E_1, E_2, E_3, and E_4 is denoted 
by E → (E_1, E_2, E_3, E_4). 
First, we take into account the behaviour of the environment with respect to the 
network (E_1, E_2, E_3, E_4). The environment’s behaviour is specified in E. In order 
to consider the production of an input by this environment as the production of an 
output by a component, we consider the reflection of E, denoted by Ē and defined 
by Ē = (oE, iE, tE). (Consequently, iĒ = oE, oĒ = iE, and tĒ = tE.) By reflecting E, 
we interchange the roles of component and environment. Instead of considering E 
and network (E_1, E_2, E_3, E_4), we now consider the network (E_0, E_1, E_2, E_3, E_4), 
where E_0 = Ē. 
In order for E to be decomposable into the network (E_1, E_2, E_3, E_4), four 
conditions have to hold for the network (E_0, E_1, E_2, E_3, E_4). Two conditions concern 
the so-called structure of the network and two conditions concern the behaviour of 
the network. We first discuss the conditions on the structure of the network. 
In the network (E_0, E_1, E_2, E_3, E_4) there must be no dangling inputs and outputs, 
i.e., every input is connected to an output and every output is connected to an input. 
In formula: 
\[ (\bigcup i : 0 \le i < 5 : oE_i) = (\bigcup i : 0 \le i < 5 : iE_i). \tag{1} \]
If (1) holds, we say that the network (E_0, E_1, E_2, E_3, E_4) is closed. 
The second condition is that outputs of distinct components are not connected 
to each other. In formula: 
\[ oE_i \cap oE_j = \emptyset \quad \text{for } 0 \le i, j < 5 \,\wedge\, i \ne j. \tag{2} \]
If (2) holds we say that the network is free of output interference. (Notice, however, 
that inputs may be connected to each other.) Condition (2) guarantees that each 
symbol can be produced by at most one component. Conditions (1) and (2) together 
guarantee that each symbol can be produced by exactly one component and be 
received by at least one component. 
Conditions (1) and (2) are conditions on the structure of the network and are 
formulated in terms of the alphabets of the trace structures. Conditions (1) and (2) 
are satisfied by the network (E_0, E_1, E_2, E_3, E_4), as can be verified easily. The next 
two conditions are behavioural conditions; they are phrased in terms of the trace 
sets and the alphabets. 
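The two structural conditions only involve alphabets, so they can be checked with a few set operations. The Python sketch below does this for the token-ring network. The alphabets of E0 (the reflected arbiter), E1, and E4 follow from the commands given earlier; the alphabets of E2 and E3 are an assumption obtained by renumbering the first token-ring interface. The function names are also assumptions of the sketch.

```python
from itertools import combinations

# Input and output alphabets of the five components (E2, E3 by analogy with E1).
NETWORK = {
    "E0": {"in": {"g0", "a0", "g1", "a1", "g2", "a2"},
           "out": {"r0", "f0", "r1", "f1", "r2", "f2"}},   # reflection of E
    "E1": {"in": {"r0", "f0", "n0"}, "out": {"g0", "a0", "n1"}},
    "E2": {"in": {"r1", "f1", "n1"}, "out": {"g1", "a1", "n2"}},
    "E3": {"in": {"r2", "f2", "n2"}, "out": {"g2", "a2", "n3"}},
    "E4": {"in": {"n3"}, "out": {"n0"}},                    # IWIRE pref *[n0!; n3?]
}

def is_closed(network):
    """Condition (1): the union of all outputs equals the union of all inputs."""
    outputs = set().union(*(c["out"] for c in network.values()))
    inputs = set().union(*(c["in"] for c in network.values()))
    return outputs == inputs

def output_clashes(network):
    """Condition (2) holds iff this list is empty: pairs of distinct
    components that share an output symbol."""
    return [
        (p, q, sorted(network[p]["out"] & network[q]["out"]))
        for p, q in combinations(network, 2)
        if network[p]["out"] & network[q]["out"]
    ]

print(is_closed(NETWORK), output_clashes(NETWORK))

# A deliberately broken variant: the IWIRE also drives n1, clashing with E1.
BROKEN = dict(NETWORK, E4={"in": {"n3"}, "out": {"n0", "n1"}})
print(output_clashes(BROKEN))
```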
The third condition requires that the environment prescription for any component 
in the network may not be violated. We can simulate the network by generating 
traces of symbols, representing joint behaviours of the components in the network. 
Formally, we construct the trace set X of all joint behaviours as follows. Initially, 
X = {ε}. Choose a trace t, symbol z, and index i, 0 ≤ i < 5, such that 
\[ t \in X \,\wedge\, z \in oE_i \,\wedge\, tz{\downarrow}aE_i \in tE_i \]
holds. In other words, after joint behaviour t, component E_i can produce output z. 
If for all j, 0 ≤ j < 5, we have tz↓aE_j ∈ tE_j, then we add tz to X. In other words, if 
all other components can accept z, i.e., their environment prescription is not violated, 
then tz is a joint behaviour as well. If some component cannot accept z, we stop 
the simulation and say that the network has computation interference. Our third 
condition is: 
The network is free of computation interference. (3) 
When the network is free of computation interference, X represents the set of all 
traces that can be constructed with the above simulation. 
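The simulation can be sketched in Python for components whose trace sets are given by deterministic state graphs. Instead of storing every trace in X, the sketch explores the joint states reached by those traces, which is equivalent for deterministic state graphs; that simplification, the data layout, and the small two-wire example are all assumptions of this illustration and are not taken from the paper. The example decomposes a wire from a to c into two wires in series, with the reflected specification as environment.

```python
from collections import deque

def simulate(components):
    """Explore joint behaviours. Each component is a dict with its output
    symbols ("out") and a state graph ("delta", initial state 0).
    Returns (joint states reached, interference), where interference is None
    or a triple (producer index, symbol, index of the refusing component)."""
    alphas = [{s for moves in c["delta"].values() for s in moves}
              for c in components]
    start = (0,) * len(components)
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        for i, comp in enumerate(components):
            # outputs that component i can produce in its current state
            enabled = comp["out"] & set(comp["delta"][state[i]])
            for z in sorted(enabled):
                nxt = list(state)
                for j, other in enumerate(components):
                    if z not in alphas[j]:
                        continue                 # z is not a terminal of E_j
                    if z not in other["delta"][state[j]]:
                        return seen, (i, z, j)   # E_j cannot accept z
                    nxt[j] = other["delta"][state[j]][z]
                nxt = tuple(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen, None

ENV = {"out": {"a"}, "delta": {0: {"a": 1}, 1: {"c": 0}}}      # reflection of pref *[a?; c!]
WIRE_AB = {"out": {"b"}, "delta": {0: {"a": 1}, 1: {"b": 0}}}  # pref *[a?; b!]
WIRE_BC = {"out": {"c"}, "delta": {0: {"b": 1}, 1: {"c": 0}}}  # pref *[b?; c!]
# An environment that sends a twice in a row before waiting for c.
BAD_ENV = {"out": {"a"}, "delta": {0: {"a": 1}, 1: {"a": 2}, 2: {"c": 0}}}

good_states, good_result = simulate([ENV, WIRE_AB, WIRE_BC])
bad_states, bad_result = simulate([BAD_ENV, WIRE_AB, WIRE_BC])
print(good_result, bad_result)
```

In the second network the first wire is offered a second a before it has produced b, which is the violation of the WIRE's environment prescription described in Section 3.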
A less operational, and perhaps more formal, formulation of absence of computa- 
tion interference can be given as follows. In case there is no computation interference, 
the joint behaviour of the network is equivalent to the trace set of the weave of all 
components in the network. In formula, X = W, where 
\[ W = t(E_0 \parallel E_1 \parallel E_2 \parallel E_3 \parallel E_4). \]
This follows from the property that each trace in X is, by its construction, in 
accordance with the behaviour of each component in the network. Since the weave 
contains all traces that are in accordance with the behaviour of each component, 
we have X ⊆ W. Moreover, each trace t ∈ W can be generated in the above simulation 
as a joint behaviour, since, by condition (1), each symbol is an output symbol of 
some component in the network. Consequently, if there is no computation interference, 
W ⊆ X. For this reason, absence of computation interference implies the 
following property: 
For all traces t, symbols z, and indexes i we have 
\[ t \in W \,\wedge\, z \in oE_i \,\wedge\, tz{\downarrow}aE_i \in tE_i \;\Rightarrow\; tz \in W. \]
Furthermore, if there is computation interference for certain t, z, and i, then it 
follows that the above property does not hold for this t, z, and i as well. Therefore, 
absence of computation interference and the above property are equivalent. 
The fourth condition is that every trace of the component specified (here E) may 
also occur in the simulation. When no computation interference occurs, the joint 
behaviour of the network can be represented by W (or X). The fourth condition 
then becomes: 
$$W{\downarrow}\mathbf{a}E = \mathbf{t}E, \qquad (4)$$
i.e., the behaviour of the network with respect to the alphabet of E is exactly the 
trace set of E. 
Condition (4) excludes, for example, decompositions of the general arbiter where 
only process 0 would be granted and never process 1 or 2. This condition also 
excludes the decomposition of components into the so-called “accept-everything-do- 
nothing” module, i.e., a component that accepts every possible input but never 
produces any output. On the other hand, although condition (4) requires that each 
trace in tE may occur in the simulation, it does not require that some traces are 
guaranteed to occur. Consequently, neither fairness nor absence of deadlock or livelock
is guaranteed by condition (4) (as we shall see later). For this reason, other works
on asynchronous circuit design [2,4] have restricted themselves to conditions (1),
(2), and (3) only.
In this paper we consider the above four conditions as our correctness criteria 
for a decomposition. They can be generalized naturally to any network of com- 
ponents. 
We mention two properties of decomposition which can be readily verified. The 
first property states that any component can be decomposed into itself, i.e., for any 
component E we have the identity decomposition: 
$$E \rightarrow (E).$$
The second property is that in any decomposition we can introduce components $\varepsilon$
without invalidating the decomposition. For example, we have $E \rightarrow (E, \varepsilon)$.
Component $\varepsilon$ can be seen as the “identity” component: it has no communication terminals
and it does not do anything. Notice also that the ordering of components in a
decomposition is immaterial: if $E \rightarrow (E_1, E_2)$, then also $E \rightarrow (E_2, E_1)$.
As an example of a decomposition, we have 
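$$\begin{aligned}
\mathbf{pref}\,*[n0?; n3!] \rightarrow (\ & \mathbf{pref}\,*[n0?; n1!] \\
,\ & \mathbf{pref}\,*[n1?; n2!] \\
,\ & \mathbf{pref}\,*[n2?; n3!] \\
), &
\end{aligned}$$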
i.e., a WIRE can be decomposed into three other WIRES. Although this is a rather 
trivial decomposition, verifying the correctness of the token-ring decomposition 
essentially boils down to verifying a decomposition like this, as we shall see. 
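Condition (4) can be checked on bounded traces for this WIRE example. The sketch below is an illustration under assumptions, not a proof: it assumes there is no computation interference, so that the joint behaviour equals the weave, and it only compares traces up to a fixed length. The helper names `cycle`, `weave_traces`, and `project` are hypothetical.

```python
def cycle(symbols):
    """State graph for a command that repeats `symbols` in order forever."""
    step = {(k, z): (k + 1) % len(symbols) for k, z in enumerate(symbols)}
    return {"alphabet": set(symbols), "init": 0, "step": step}

def weave_traces(components, max_len):
    """All traces of the weave of `components` with at most max_len symbols."""
    alphabet = set().union(*(c["alphabet"] for c in components))
    start = tuple(c["init"] for c in components)
    frontier = {((), start)}
    result = {()}
    for _ in range(max_len):
        extended = set()
        for trace, state in frontier:
            for z in alphabet:
                nxt = []
                for c, s in zip(components, state):
                    if z not in c["alphabet"]:
                        nxt.append(s)  # c does not take part in z
                    elif (s, z) in c["step"]:
                        nxt.append(c["step"][(s, z)])
                    else:
                        break          # c forbids z after this trace
                else:
                    extended.add((trace + (z,), tuple(nxt)))
        result |= {trace for trace, _ in extended}
        frontier = extended
    return result

def project(trace_set, alphabet):
    """Project every trace on `alphabet` by deleting all other symbols."""
    return {tuple(z for z in t if z in alphabet) for t in trace_set}

spec = cycle(["n0", "n3"])
# The reflected specification has the same trace set as the specification.
network = [cycle(["n0", "n3"]), cycle(["n0", "n1"]),
           cycle(["n1", "n2"]), cycle(["n2", "n3"])]

W = weave_traces(network, 8)
print(project(W, {"n0", "n3"}) == weave_traces([spec], 4))  # True
```

The joint traces are the prefixes of the repetition of n0 n1 n2 n3, and their projections on {n0, n3} are exactly the prefixes of the repetition of n0 n3 within the bound.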
8. Two theorems on decomposition 
Verifying the four conditions of decomposition can be automated. David Dill [4] 
has developed an automatic verifier that checks the first three conditions. Such an 
algorithm basically constructs a finite state graph for the joint behaviour of the 
network from the state graphs of the components. Unfortunately, the state graph 
for the joint behaviour can be exponential in n, where n is the number of components 
in the network. Consequently, the time complexity of such a verification algorithm 
can be exponential in n as well. In the case of our arbiter decomposition, where 
there is a high degree of parallelism, a straightforward verification would indeed 
be exponential in n. 
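The exponential growth is easy to observe on a toy case. The sketch below assumes n hypothetical components that do not communicate with each other, each with a two-state graph (idle and busy); it counts the reachable joint states by explicit exploration. It is only meant to illustrate the growth, not to model the arbiter.

```python
def reachable_joint_states(n):
    """Count joint states of n independent two-state components."""
    start = (0,) * n
    seen = {start}
    stack = [start]
    while stack:
        state = stack.pop()
        for i in range(n):
            # Component i toggles between idle (0) and busy (1) on its own symbols.
            nxt = state[:i] + (1 - state[i],) + state[i + 1:]
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)

print([reachable_joint_states(n) for n in range(1, 6)])  # [2, 4, 8, 16, 32]
```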
Fortunately, we have two theorems on decomposition that enable us to verify 
decompositions more efficiently. One theorem can be characterized as “decomposi- 
tion by stepwise refinement” and one theorem can be characterized as “decomposi- 
tion by partwise refinement”. 
The first theorem, which enables us to decompose a component by stepwise 
refinement, is called the Substitution Theorem. The theorem expresses that in a 
decomposition in which a component, say $E_2$, is used, we may safely substitute
component $E_2$ by one of its decompositions, provided that the symbols introduced
in the decomposition of $E_2$ are fresh symbols.
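Theorem 1 (Substitution Theorem). Let components $E_0$, $E_1$, $E_2$, $E_3$, and $E_4$ satisfy
$$E_0 \rightarrow (E_1, E_2) \quad \text{and} \quad E_2 \rightarrow (E_3, E_4).$$
If, furthermore,
$$(\mathbf{a}E_0 \cup \mathbf{a}E_1) \cap (\mathbf{a}E_3 \cup \mathbf{a}E_4) = \mathbf{a}E_2,$$
then
$$E_0 \rightarrow (E_1, E_3, E_4).$$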
Notice that the condition on the alphabets is essentially a void condition, since 
it can always be satisfied by an appropriate renaming of the symbols introduced in 
the decomposition of E2. 
Theorem 1 applies to decompositions into two components only. The generaliz- 
ation of this theorem to decompositions into an arbitrary number of components 
is straightforward and is omitted here. (A proof of the Substitution Theorem can be
found in [6].) 
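The alphabet condition of Theorem 1 is a plain set equation and can be checked mechanically. In the sketch below the alphabets are those of the WIRE example of Section 7, with a hypothetical intermediate component pref *[n1?; n3!] in the role of the component that is substituted. The function names and the clashing symbol choice are assumptions made for illustration only.

```python
def substitution_side_condition(aE0, aE1, aE2, aE3, aE4):
    """True when (aE0 | aE1) & (aE3 | aE4) equals aE2."""
    return (aE0 | aE1) & (aE3 | aE4) == aE2

def rename(alphabet, mapping):
    """Rename symbols of an alphabet; unmapped symbols are kept."""
    return {mapping.get(z, z) for z in alphabet}

aE0 = {"n0", "n3"}  # pref *[n0?; n3!]
aE1 = {"n0", "n1"}  # pref *[n0?; n1!]
aE2 = {"n1", "n3"}  # pref *[n1?; n3!], the component to be substituted

# A decomposition of E2 whose internal symbol is unluckily called n0.
aE3_clash = {"n1", "n0"}
aE4_clash = {"n0", "n3"}
print(substitution_side_condition(aE0, aE1, aE2, aE3_clash, aE4_clash))  # False

# Renaming the internal symbol to the fresh name n2 satisfies the condition.
fresh = {"n0": "n2"}
aE3 = rename(aE3_clash, fresh)
aE4 = rename(aE4_clash, fresh)
print(substitution_side_condition(aE0, aE1, aE2, aE3, aE4))  # True
```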
The Separation Theorem allows us to find a decomposition by partwise refinement. 
The theorem expresses that if a component is specified by a weave $E \parallel F$, we can
first try to find decompositions for the parts $E$ and $F$ and then find a decomposition
for $E \parallel F$ by collecting all commands in the decompositions and weaving those
commands that have common outputs. 
We use the notation 
$$E \rightarrow (i : 1 \le i < n : E_i)$$
to stand for “component $E$ can be decomposed into the network of components
$E_i$, $1 \le i < n$”.
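Theorem 2 (Separation Theorem). Let components $E$, $F$, $E_i$, $F_i$, with $1 \le i < n$, satisfy
$$E \rightarrow (i : 1 \le i < n : E_i)$$
and
$$F \rightarrow (i : 1 \le i < n : F_i).$$
If the following conditions, with $E_0 = \overline{E}$ and $F_0 = \overline{F}$, are satisfied
$$\Bigl(\bigcup i : 1 \le i < n : \mathbf{a}E_i\Bigr) \cap \Bigl(\bigcup i : 1 \le i < n : \mathbf{a}F_i\Bigr) \subseteq \mathbf{a}E \cap \mathbf{a}F, \qquad (2.1)$$
$$(\mathbf{o}E_i \cup \mathbf{o}F_i) \cap (\mathbf{o}E_j \cup \mathbf{o}F_j) = \emptyset \quad \text{for } 0 \le i, j < n \text{ and } i \ne j, \qquad (2.2)$$
then
$$E \parallel F \rightarrow (i : 1 \le i < n : E_i \parallel F_i).$$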
Condition (2.1) states that the only symbols the two decompositions have in 
common are symbols from $\mathbf{a}E \cap \mathbf{a}F$. Again, this last condition is a void condition,
since it can be satisfied by an appropriate renaming of the symbols introduced in 
the decompositions. Condition (2.2) states that the outputs of the commands that 
are woven are disjoint. This condition may be satisfied by rearranging the commands 
in the decompositions and by introducing $\varepsilon$ commands at the appropriate places.
As an example of applying the Separation Theorem, suppose we would find
decompositions
$$E \rightarrow (E_1, E_2), \qquad F \rightarrow (F_1, F_2, F_3),$$
where (2.1) is satisfied, and the only commands that have outputs in common are
$E_1$ and $F_3$, and $E_2$ and $F_2$. In order to satisfy condition (2.2), we rearrange the
commands and introduce an $\varepsilon$ command:
$$E \rightarrow (\varepsilon, E_2, E_1),$$
$$F \rightarrow (F_1, F_2, F_3).$$
Now both conditions (2.1) and (2.2) are satisfied. Applying the Separation Theorem,
we find
$$E \parallel F \rightarrow (\varepsilon \parallel F_1,\ E_2 \parallel F_2,\ E_1 \parallel F_3).$$
Since $\varepsilon$ is the identity of weaving, this can be simplified to
$$E \parallel F \rightarrow (F_1,\ E_2 \parallel F_2,\ E_1 \parallel F_3).$$
The Separation Theorem can be generalized to decompositions of components 
expressed by weaves of more than two commands. The generalization is straight- 
forward. A proof of the Separation Theorem can be found in [6]. 
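Conditions (2.1) and (2.2) only involve alphabets and output sets, so they can be checked with a few set operations. The sketch below represents a component by a pair (alphabet, outputs); this representation and the names `reflect`, `separation_conditions`, and `EPS` are assumptions made for illustration. The example data are taken from two of the decompositions used in Section 10, with internal symbols x and y.

```python
def reflect(component):
    """Reflection swaps inputs and outputs; a component is (alphabet, outputs)."""
    alphabet, outputs = component
    return (alphabet, alphabet - outputs)

def separation_conditions(E, F, E_parts, F_parts):
    """Return (cond_2_1, cond_2_2) for equally long lists E_parts and F_parts."""
    union_aE = set().union(*(a for a, _ in E_parts))
    union_aF = set().union(*(a for a, _ in F_parts))
    cond_2_1 = (union_aE & union_aF) <= (E[0] & F[0])
    # Index 0 holds the reflected specifications, as in the theorem.
    pairs = zip([reflect(E)] + E_parts, [reflect(F)] + F_parts)
    outs = [o_e | o_f for (_, o_e), (_, o_f) in pairs]
    cond_2_2 = all(outs[i].isdisjoint(outs[j])
                   for i in range(len(outs))
                   for j in range(i + 1, len(outs)))
    return cond_2_1, cond_2_2

EPS = (set(), set())                                # the empty command
E = ({"r0", "g0", "f0", "a0"}, {"g0", "a0"})        # pref *[r0?; g0!; f0?; a0!]
F = ({"n0", "n1"}, {"n1"})                          # pref *[n0?; n1!]
F1 = ({"n0", "x", "y", "n1"}, {"x", "n1"})          # pref *[n0?; x!; y?; n1!]
F2 = ({"x", "y"}, {"y"})                            # pref *[x?; y!]

print(separation_conditions(E, F, [E, EPS], [F1, F2]))  # (True, True)

# Hypothetical clash: the internal symbol x is unluckily called g0.
F1_clash = ({"n0", "g0", "y", "n1"}, {"g0", "n1"})
F2_clash = ({"g0", "y"}, {"y"})
print(separation_conditions(E, F, [E, EPS], [F1_clash, F2_clash]))  # (False, True)
```

The second call shows why fresh names matter: the shared symbol g0 is not in the common alphabet of the two specifications, so condition (2.1) fails.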
9. Verification of token-ring decomposition 
The token-ring decomposition can be verified, and perhaps derived, in a calcula- 
tional style by application of the Separation Theorem. 
Since E is written as a weave of four commands, we obviously try to find 
decompositions for each of these parts and then apply the Separation Theorem. 
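$E$ is written as $E = F_0 \parallel F_1 \parallel F_2 \parallel F_3$, where
$$\begin{aligned}
F_0 &= \mathbf{pref}\,*[r0?; g0!; f0?; a0!], \\
F_1 &= \mathbf{pref}\,*[r1?; g1!; f1?; a1!], \\
F_2 &= \mathbf{pref}\,*[r2?; g2!; f2?; a2!], \\
F_3 &= \mathbf{pref}\,*[g0!; f0? \mid g1!; f1? \mid g2!; f2?].
\end{aligned}$$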
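For $F_0$, $F_1$, and $F_2$, we take the identity decompositions
$$F_0 \rightarrow (F_0), \qquad F_1 \rightarrow (F_1), \qquad F_2 \rightarrow (F_2).$$
For the decomposition of $F_3$, we first rewrite $F_3$ in such a way that it reflects the
idea of a token-ring decomposition. To this end, we expand command $F_3$ to a
command that represents the flow of the token. Formally, $EE$ is called an expansion
of $E$ if the projection of $EE$ on $\mathbf{a}E$ is $E$. In formula: $EE{\downarrow}\mathbf{a}E = E$, where
$$EE{\downarrow}A = \langle \mathbf{i}EE \cap A,\ \mathbf{o}EE \cap A,\ \{t{\downarrow}A \mid t \in \mathbf{t}EE\} \rangle.$$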
Expansion is in a sense the inverse of projection. Expansions are obtained by 
inserting so-called internal symbols in a command. An internal symbol x is denoted 
in a command by !x? and the semantics of !x? is given by the trace structure 
$\langle \{x\}, \{x\}, \{x\} \rangle$.
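Projection of a trace structure can be illustrated on finite trace sets. The sketch below is a bounded approximation under assumptions: a trace structure is represented as a triple (inputs, outputs, traces) with only the traces up to a given length, and the names `prefixes_of_cycle` and `project` are hypothetical. It expands a WIRE with two internal symbols x and y, which belong to both the input and the output alphabet, and checks that projecting the expansion on {n0, n1} gives the WIRE back.

```python
def prefixes_of_cycle(symbols, max_len):
    """Prefixes, up to max_len symbols, of the endless repetition of `symbols`."""
    return {tuple(symbols[k % len(symbols)] for k in range(n))
            for n in range(max_len + 1)}

def project(structure, A):
    """Projection of (inputs, outputs, traces) on alphabet A."""
    inputs, outputs, traces = structure
    return (inputs & A, outputs & A,
            {tuple(z for z in t if z in A) for t in traces})

# Expansion of pref *[n0?; n1!] with internal symbols x and y.
EE = ({"n0", "x", "y"}, {"x", "y", "n1"},
      prefixes_of_cycle(["n0", "x", "y", "n1"], 8))

# The command itself: pref *[n0?; n1!].
E = ({"n0"}, {"n1"}, prefixes_of_cycle(["n0", "n1"], 4))

print(project(EE, {"n0", "n1"}) == E)  # True
```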
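In order to incorporate the flow of the token, we rewrite $F_3$ as follows.
$$\begin{aligned}
& \mathbf{pref}\,*[g0!; f0? \mid g1!; f1? \mid g2!; f2?] \\
={} & \quad \{\text{trace theory calculus}\} \\
& \mathbf{pref}\,*[*[g0!; f0?]; *[g1!; f1?]; *[g2!; f2?]] \\
={} & \quad \{\text{inserting internal symbols } n0, n1, n2, \text{ and } n3\} \\
& (\mathbf{pref}\,*[!n0?; *[g0!; f0?]; !n1?; *[g1!; f1?]; !n2?; *[g2!; f2?]; !n3?]){\downarrow}\mathbf{a}F_3.
\end{aligned}$$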
The expansion in the last line can be considered as representing the flow of the token. 
The syntax of this expansion suggests the following decomposition for F3. 
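$$\begin{aligned}
F_3 \rightarrow (\ & \mathbf{pref}\,*[n0!; n3?] \\
,\ & \mathbf{pref}\,*[n0?; *[g0!; f0?]; n1!] \\
,\ & \mathbf{pref}\,*[n1?; *[g1!; f1?]; n2!] \\
,\ & \mathbf{pref}\,*[n2?; *[g2!; f2?]; n3!] \\
). &
\end{aligned}$$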
Subsequently, we apply the Separation Theorem. For this purpose, we observe 
that only in the decomposition of F3 internal symbols are introduced. Consequently, 
condition (2.1) is obviously satisfied. Condition (2.2) also can be satisfied if we 
introduce $\varepsilon$ commands at the appropriate places. Subsequently, we collect the
commands of the decompositions and weave those commands that have common 
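outputs. Notice that the only common outputs are $g0$, $g1$, and $g2$. We obtain
$$\begin{aligned}
E \rightarrow (\ & \mathbf{pref}\,*[n0!; n3?] \\
,\ & \mathbf{pref}\,*[n0?; *[g0!; f0?]; n1!] \parallel \mathbf{pref}\,*[r0?; g0!; f0?; a0!] \\
,\ & \mathbf{pref}\,*[n1?; *[g1!; f1?]; n2!] \parallel \mathbf{pref}\,*[r1?; g1!; f1?; a1!] \\
,\ & \mathbf{pref}\,*[n2?; *[g2!; f2?]; n3!] \parallel \mathbf{pref}\,*[r2?; g2!; f2?; a2!] \\
). &
\end{aligned}$$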
And this is exactly the tentative token-ring decomposition we gave in Section 6, 
where the token-ring interfaces are specified by commands as given at the end of 
that section. 
10. Decomposition continued 
Unfortunately, the token-ring interface is not a basic component. The four-phase 
arbiter arbitrating between two processes, however, is a basic component. For this 
reason, we proceed with the derivation of a decomposition for the token-ring 
interface into basic components. 
We try to find a decomposition for the token-ring interface in the same way as 
for the arbiter. For this purpose, we try to manipulate and expand a command for 
the token-ring interface, by inserting internal symbols at the appropriate places, in 
such a way that we can recognize the basic four-phase arbiter in this expansion. 
To this end, we consider command $E_0$ for the token-ring interface, given in
Section 6, where this time we take
$$\begin{aligned}
E_0 = {} & \mathbf{pref}\,*[r0?; g0!; f0?; a0!] \\
\parallel\ & \mathbf{pref}\,*[n0?; n1!] \\
\parallel\ & \mathbf{pref}\,(n0?; *[n1!; n0? \mid g0!; f0?]).
\end{aligned}$$
Examining the specification of the arbiter in Fig. 7 once more we propose to insert 
two internal symbols !x? and !y? in the second line of the command in such a way 
that we obtain the expansion EEo, where 
$$\begin{aligned}
EE_0 = {} & \mathbf{pref}\,*[r0?; g0!; f0?; a0!] \\
\parallel\ & \mathbf{pref}\,*[n0?; !x?; !y?; n1!] \\
\parallel\ & \mathbf{pref}\,(n0?; *[n1!; n0? \mid g0!; f0?]).
\end{aligned}$$
Notice that $EE_0{\downarrow}\mathbf{a}E_0 = E_0$ and that $G_0$, where $G_0$ is the command $EE_0$ with !x?
and !y? replaced by x! and y? respectively, is a specification for a four-phase arbiter
similar to the one in Fig. 7. For application of the Separation Theorem, we take 
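$$\begin{aligned}
\mathbf{pref}\,*[r0?; g0!; f0?; a0!] &\rightarrow (\mathbf{pref}\,*[r0?; g0!; f0?; a0!],\ \varepsilon), \\
\mathbf{pref}\,*[n0?; n1!] &\rightarrow (\mathbf{pref}\,*[n0?; x!; y?; n1!],\ \mathbf{pref}\,*[x?; y!]), \\
\mathbf{pref}\,(n0?; *[n1!; n0? \mid g0!; f0?]) &\rightarrow (\mathbf{pref}\,(n0?; *[n1!; n0? \mid g0!; f0?]),\ \varepsilon).
\end{aligned}$$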
Applying the Separation Theorem to these decompositions, we find the decompo- 
sition 
$$E_0 \rightarrow (G_0,\ \mathbf{pref}\,*[x?; y!]).$$
This decomposition is depicted in Fig. 9. 
Fig. 9. A decomposition of the token-ring interface (diagram omitted; the four-phase arbiter is labelled ARB4).
From the decomposition it follows that each time a token is received, the four- 
phase arbiter is released for an indefinite period to allow for a pending request to 
be granted. In case there is no pending request, or after the process has released 
the token, the request y? will eventually result in a grant nl!, i.e., the sending of 
the token to the next token-ring interface. 
By application of the Substitution Theorem, we may substitute the decomposition 
of Fig. 9 into the three token-ring interfaces of Fig. 8. Thus, by stepwise and partwise 
refinement, we obtain a decomposition of a four-phase arbiter, arbitrating among 
n processes, n > 0, into n basic four-phase arbiters (with a special initialization), n 
WIRES, and one IWIRE as depicted in Fig. 10. 
Fig. 10. Complete decomposition 
The surprisingly simple token-ring decomposition can easily be distributed over 
the processes: each token-ring interface is connected to a process by a few wires 
and the token-ring interfaces themselves are connected by just a single (possibly 
long) wire. The token is implemented by a transition propagating along the wires. 
Furthermore the correctness of this decomposition is not affected by any delays in 
the response times of the basic components or connection wires. 
Although the decomposition is simple, it has some disadvantages. One dis- 
advantage is that, formally, the token may travel around the ring continuously 
without any external action. This phenomenon is illustrated by the occurrence of 
the following behaviours: 
$$r0?;\ *[!n0?; !n1?; !n2?; !n3?].$$
After a request by process 0, the token may be sent on to the next token-ring interface 
repeatedly without ever being sent to process 0. This phenomenon, where internal 
actions do take place but no external actions are performed, is known as livelock 
[7]. Absence of livelock is not required by our definition of decomposition.
Livelock-free decompositions of arbiters have been given by Seitz [17] and Martin [11]. Both
initial solutions, however, contained errors. The corrected versions can be found 
in [4]. 
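The livelock behaviour can be replayed on state graphs of the token-ring components of Section 9. The sketch below is an illustration under assumptions: the environment is left out and is only assumed to issue the request r0, a trace is accepted when every component accepts its projection (membership in the weave), and the names `graph`, `token_part`, `process_part`, and `accepts` are hypothetical.

```python
def graph(alphabet, edges, init=0):
    """Deterministic state graph given by (state, symbol, next_state) edges."""
    return {"alphabet": set(alphabet), "init": init,
            "step": {(s, z): t for s, z, t in edges}}

def token_part(i):
    """pref *[n_i?; *[g_i!; f_i?]; n_{i+1}!] as a three-state graph."""
    n_in, n_out, g, f = f"n{i}", f"n{i + 1}", f"g{i}", f"f{i}"
    return graph([n_in, n_out, g, f],
                 [(0, n_in, 1), (1, g, 2), (2, f, 1), (1, n_out, 0)])

def process_part(i):
    """pref *[r_i?; g_i!; f_i?; a_i!] as a four-state cycle."""
    symbols = [f"r{i}", f"g{i}", f"f{i}", f"a{i}"]
    return graph(symbols, [(k, z, (k + 1) % 4) for k, z in enumerate(symbols)])

def accepts(components, trace):
    """True when every component accepts the projection of trace on its alphabet."""
    states = [c["init"] for c in components]
    for z in trace:
        for k, c in enumerate(components):
            if z in c["alphabet"]:
                if (states[k], z) not in c["step"]:
                    return False
                states[k] = c["step"][(states[k], z)]
    return True

iwire = graph(["n0", "n3"], [(0, "n0", 1), (1, "n3", 0)])  # pref *[n0!; n3?]
network = ([iwire] + [token_part(i) for i in range(3)]
           + [process_part(i) for i in range(3)])

round_trip = ["n0", "n1", "n2", "n3"]
print(accepts(network, ["r0"] + round_trip * 5))  # True
print(accepts(network, ["r0", "n0", "g0"]))       # True
print(accepts(network, ["g0"]))                   # False
```

The first check shows that the token may circulate any number of times after the request r0. The second shows that the grant g0 is possible once the token has arrived, although nothing forces it to happen.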
11. Concluding remarks 
We have illustrated a method for the specification and decomposition of asyn- 
chronously communicating components by the design of a component for a well- 
known difficult problem: the construction of large arbiters from small ones. The 
exercise yielded a surprisingly simple outcome and the decomposition could be 
verified, and perhaps derived, with relative ease. 
The program notation of commands allowed for a concise specification of com- 
munication behaviours of components. The parallel behaviours of arbiters could be 
specified conveniently by a conjunction of specific behaviours, i.e., by a weave of 
commands. Had we used state graphs instead, we would have had to analyze much 
larger specifications. 
The formalism also enabled us to design a component by stepwise refinement, 
thanks to the Substitution Theorem, and by partwise refinement, thanks to the 
Separation Theorem. The Separation Theorem demonstrates that the weaving 
operator is helpful not only for finding succinct specifications of components, but also
for finding decompositions of components. 
The syntax of commands may be of assistance in the derivation of a decomposition: 
we can try to manipulate and expand a command, by rewriting and inserting internal 
symbols at particular places, in such a way that the decomposition can be recognized 
in the expansion. There may be many places to insert the internal symbols,
and finding the appropriate places is still a bit of magic. By carefully studying a 
number of exercises, we hope to develop some heuristics for deriving decompositions. 
The manner in which we have derived a decomposition shows that the task of 
finding decompositions is very similar to the task of programming. Programs also
are derived by stepwise or partwise refinement, by rewriting, and by introducing 
local variables. Deriving circuit designs in a similar calculational style may help to 
master the ever-increasing complexity in VLSI design. 
Acknowledgement 
Acknowledgements are due to M. Rem, C.E. Molnar, A.J. Martin, A. Peeters, 
J.A. Brzozowski, G. Gopalakrishnan, and the Eindhoven VLSI Club, for correcting 
my mistakes and their comments on a previous version of this paper. 
References 
[1] T.J. Chaney and C.E. Molnar, Anomalous behaviour of synchronizer and arbiter circuits, IEEE
Trans. Comput. 22 (1973) 421-422.
[2] W. Chen, J.T. Udding and T. Verhoeff, Networks of communicating processes and their
(de-)composition, in: J.L.A. van de Snepscheut, ed., Mathematics of Program Construction, Lecture Notes
in Computer Science 375 (Springer, Berlin, 1989) 174-196.
[3] T.A. Chu, Synthesis of self-timed VLSI circuits from graph-theoretic specifications, Ph.D. Thesis,
MIT, Cambridge, MA (1987).
[4] D.L. Dill, Trace Theory for Automatic Hierarchical Verification of Speed-Independent Circuits (MIT
Press, Cambridge, MA, 1989).
[5] D.L. Dill, S.M. Nowick and R.F. Sproull, Automatic verification of speed-independent circuits with 
Petri net specifications, in: Proceedings 1989 IEEE International Conference on Computer Design: 
VLSI in Computers and Processors (IEEE Computer Society Press, Washington, DC, 1989) 212-216. 
[6] J.C. Ebergen, Translating programs into delay-insensitive circuits, CWI Tract 56, Centre for 
Mathematics and Computing Science, Amsterdam (1989). 
[7] C.A.R. Hoare, Communicating Sequential Processes (Prentice-Hall, Englewood Cliffs, NJ, 1985). 
[8] R.M. Keller, Towards a theory of universal speed-independent modules, IEEE Trans. Comput. 23 
(1) (1974) 21-33. 
[9] W.M. Littlefield, Interlocken, Tech. Memo. No. 26, Computer Systems Laboratory, Washington 
University, St. Louis, MO (1967). 
[10] A.J. Martin, Programming in VLSI: from communicating processes to delay-insensitive circuits, in:
C.A.R. Hoare, ed., Developments in Concurrency and Communication (Addison-Wesley, Reading,
MA, 1989) 1-64.
[11] A.J. Martin, The design of a self-timed circuit for distributed mutual exclusion, in: H. Fuchs, ed.,
Proceedings 1985 Chapel Hill Conference on VLSI (Computer Science Press, Rockville, MD, 1985)
247-260.
[12] T.H.-Y. Meng, R.W. Brodersen and D.G. Messerschmitt, Automatic synthesis of asynchronous
circuits from high-level specifications, IEEE Trans. Comput.-Aided Design 8 (11) (1989) 1185-1205.
[13] C.E. Molnar, T.P. Fang and F.U. Rosenberger, Synthesis of delay-insensitive modules, in: H. Fuchs,
ed., Proceedings 1985 Chapel Hill Conference on VLSI (Computer Science Press, Rockville, MD,
1985) 67-86.
[14] D.E. Muller and W.S. Bartky, A theory of asynchronous circuits, in: Proceedings of an International
Symposium on the Theory of Switching, Annals of the Computation Laboratory of Harvard University
29 (Harvard University Press, Cambridge, MA, 1959) 204-243.
[15] M. Rem, Trace theory and systolic computations, in: J.W. de Bakker, A.J. Nijman and P.C. Treleaven,
eds., Proceedings PARLE, Parallel Architectures and Languages Europe, Vol. 1 (Springer, Berlin,
1987) 14-34.
[16] C.L. Seitz, System timing, in: C. Mead and L. Conway, eds., Introduction to VLSI Systems
(Addison-Wesley, Reading, MA, 1980) 218-262.
[17] C.L. Seitz, Ideas about arbiters, Lambda 1 (1) (1980) 10-14.
[18] I.E. Sutherland, Micropipelines, Comm. ACM 32 (6) (1989) 720-738. 
[19] J.T. Udding, A formal model for defining and classifying delay-insensitive circuits and systems, 
Distributed Comput. 1 (1986) 197-204. 
[20] C. van Berkel, C. Niessen, M. Rem and R. Saeijs, VLSI programming and silicon compilation: a 
novel approach from Philips Research, in: Proceedings IEEE International Conference on Computer 
Design (IEEE, Computer Science Press, Washington, DC, 1988) 150-166. 
[21] J.L.A. van de Snepscheut, Trace Theory and VLSI Design, Lecture Notes in Computer Science 200 
(Springer, Berlin, 1985). 
