On formulating simultaneity for studying parallelism and synchronization  by Miller, Raymond E. & Yap, Chee K.
JOURNAL OF COMPUTER AND SYSTEM SCIENCES 20, 203-218 (1980) 
On Formulating Simultaneity for Studying 
Parallelism and Synchronization* 
RAYMOND E. MILLER 
Mathematical Sciences Department, IBM Thomas J. Watson Research Center, 
Yorktown Heights, New York 10598 
CHEE K. YAP+ 
Department of Computer Science, University of Southern California, 
Los Angeles, California 90007 
Received November 13, 1979 
1. INTRODUCTION 
When studying parallel computation and synchronization one is faced with the problem 
of modeling the simultaneous execution of processes. Although there has been a multitude 
of formal means for representing such problems [2, 6, 9, 10, 14-16], invariably, when 
all the other complexities of the models have been stripped away, the parallelism or 
synchronization is studied via sequences of events. Thus in Petri nets [10] we have “firing 
sequences,” in parallel program schemata [6] we have “J-computations,” in the system 
of processes model [9] we have “timings,” and in the analysis of the mutual exclusion 
problem [14] we use “computation sequences.” In all of these studies simultaneity of 
events is not studied directly. Rather, it is represented by the interleaving of the separate 
events into sequences, and by studying properties of the set of all such sequences. 
This basic paradigm for simultaneity may be modeled thus: Let $C$ be a set, usually 
called the instructions. $Z$ is a set of finite or infinite strings over $C$, called the computation 
sequences. Then two instructions $a, b \in C$ are simultaneous if $\alpha a b \beta$ and $\alpha b a \beta$ are sequences 
in $Z$, for some sequences $\alpha, \beta$. It is very convenient to represent simultaneity in terms of 
sequences within these formulations. Sequences are familiar objects of study and can be 
comfortably analyzed and manipulated. Also, this means of representing simultaneity is 
often sufficient for studying the properties of interest. Nevertheless, it should be clear 
that interleaving alone is a weaker notion than one that allows the possibility of simultaneity. 
We shall see this more clearly later. 
* An early version of this paper was presented at the Tenth Annual ACM Symposium on Theory 
of Computing, May 1, 1978, and appears in that Proceedings. 




Copyright © 1980 by Academic Press, Inc. 
All rights of reproduction in any form reserved. 
204 MILLER AND YAP 
FIGURE 1 
As an example of simultaneity, in Fig. 1, we have a Petri net whose firing sequences 
(represented as sequences over the transitions $\{a, b, c\}$) form the set $Z$ of initial 
segments of words of the form $\alpha_1 c \alpha_2 c \alpha_3 c \cdots$, where $\alpha_i \in \{ab, ba\}$. Clearly $a$ and $b$ are 
simultaneous. 
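The interleaving paradigm for the Fig. 1 example can be checked mechanically. The following is a minimal sketch, assuming that the set of computation sequences is truncated to a bounded number of rounds (a simplification of ours, since the set in the text is infinite); the function names `firing_sequences` and `simultaneous` are hypothetical and do not come from the paper.

```python
from itertools import product

def firing_sequences(rounds):
    """Initial segments of words a1 c a2 c ... with each ai in {ab, ba},
    truncated to `rounds` rounds (the truncation is an assumption)."""
    seqs = {""}
    for choice in product(("ab", "ba"), repeat=rounds):
        word = "".join(block + "c" for block in choice)
        for k in range(len(word) + 1):
            seqs.add(word[:k])
    return seqs

def simultaneous(x, y, seqs):
    """x and y are simultaneous if alpha+x+y+beta and alpha+y+x+beta
    both belong to seqs for some alpha, beta."""
    for s in seqs:
        for k in range(len(s) - 1):
            if s[k] == x and s[k + 1] == y:
                swapped = s[:k] + y + x + s[k + 2:]
                if swapped in seqs:
                    return True
    return False

Z = firing_sequences(2)
print(simultaneous("a", "b", Z))  # True
print(simultaneous("a", "c", Z))  # False
```

On this truncated set, `a` and `b` commute while `a` and `c` do not, in agreement with the remark that `a` and `b` are simultaneous.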
One could make the case that all the different approaches to simultaneity in the literature 
reduce to being alternative ways for finitely describing the (potentially) infinite set Z. 
This bold claim is an oversimplification, although as a caricature it helps us to see the 
basic connections between the alternative models. 
We have no objection to the basic paradigm outlined above. But we take exception to 
the usual interpretation of the set $C$ as “instructions,” where the instructions are assumed 
to be “unanalyzable wholes.” The typical justification for this attitude runs as follows. If 
simultaneous instructions from different processes are allowed, as, for example, where 
one instruction is assigning a new value to a common variable while the other is reading 
from that variable, then the result would be indeterminate. Thus, it is meaningless to 
discuss “true” simultaneity in general. Instead it is postulated that an instruction has to 
be executed to completion before another instruction from any other process can initiate. 
This is often called indivisibility of instructions. 
Indivisibility of instructions, besides the fact that it prevents the embarrassing situation 
of simultaneous instruction execution, may be justified under the following two conditions: 
(i) The instructions are sufficiently elementary that they take only one machine- 
cycle. We must assume here that interrupts are “inter-cycle,” not “intra-cycle.” 
(ii) There is only one independent access channel to common data. 
If either of these conditions breaks down, then indivisibility of instructions becomes 
an unrealistic assumption. The literature, however, frequently deals with nonelementary 
instructions. The P-V primitives of Dijkstra [3, 4] and their generalizations are prime 
examples of this. Even at the machine language level, some instructions are interruptible. 
For example, consider the Move Long instruction, MVCL R1, R2, in the IBM 370 
(pp. 133-134, IBM System/370, Principles of Operation). Also, in recent years, there has been 
an increasing trend toward modeling distributed computer systems in synchronization 
problems. With more than one access channel, even if instructions take one machine-cycle, 
we have no guarantee that instructions will not be simultaneous and cause 
indeterminacies. In the light of these arguments, it seems desirable to abandon a blind 
PARALLELISM AND SYNCHRONIZATION 205 
assumption of instruction indivisibility and look at the problem in more detail. As a rare 
exception to the quick dismissal of simultaneity, these issues are discussed in some detail 
in the paper of Gilbert and Chandler [5]. In that sense [5] is closely related to this paper, 
yet the approach and the results are quite different. 
2. A FORM OF SIMULTANEITY 
We shall illustrate how the Indivisibility of Instructions Assumption may be relaxed in a 
controlled and manageable way. Consider the system of two concurrent processes in 
Fig. 2. 
At the termination of the two processes, it is easy to see that $S$ has the value 0 or 1 
depending on whether the execution sequence was $(c_0^1, c_0^2)$ or $(c_0^2, c_0^1)$ (we omit the end 
instructions), respectively. On closer examination, it appears that $S = 2$ may also be a 
reasonable outcome. This is shown using a device introduced by Pratt and Rivest [14]: 
INITIALLY $S = 1$ 
(Process 1)                          (Process 2) 
$c_0^1$ : $S \leftarrow S + 1$       $c_0^2$ : $S \leftarrow 0$ 
$c_1^1$ : end                        $c_1^2$ : end 
FIGURE 2 
FIGURE 3 (diagram label: Indivisibility of Primitive Instruction Assumption) 




We say that $c_0^1$ should really be analyzed as two primitive instructions, $c_0^1 = c_0'; c_0''$, 
where $c_0'$: $U \leftarrow S$ and $c_0''$: $S \leftarrow U + 1$. Here $U$ is an “invisible” variable which is 
accessed by Process 1 only. Now if the execution sequence were $(c_0', c_0^2, c_0'')$ then indeed 
$S = 2$. 
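The three outcomes for the system of Fig. 2 can be reproduced by enumerating interleavings. This is only an illustrative sketch: the helper names (`interleavings`, `run`, `incr`, `zero`, `read_u`, `write_s`) are ours, and the state is modeled as a plain dictionary holding $S$ and the invisible variable $U$.

```python
def interleavings(p, q):
    """Yield every order-preserving merge of the tuples p and q."""
    if not p:
        yield tuple(q)
        return
    if not q:
        yield tuple(p)
        return
    for rest in interleavings(p[1:], q):
        yield (p[0],) + rest
    for rest in interleavings(p, q[1:]):
        yield (q[0],) + rest

def run(actions):
    """Execute the actions in order, starting from S = 1, and return S."""
    state = {"S": 1, "U": None}
    for act in actions:
        act(state)
    return state["S"]

def incr(s):     # Process 1 instruction S <- S + 1, taken as indivisible
    s["S"] = s["S"] + 1

def zero(s):     # Process 2 instruction S <- 0
    s["S"] = 0

def read_u(s):   # first primitive half: U <- S
    s["U"] = s["S"]

def write_s(s):  # second primitive half: S <- U + 1
    s["S"] = s["U"] + 1

indivisible = {run(seq) for seq in interleavings((incr,), (zero,))}
primitive = {run(seq) for seq in interleavings((read_u, write_s), (zero,))}
print(sorted(indivisible))  # [0, 1]
print(sorted(primitive))    # [0, 1, 2]
```

Splitting the increment into a read and a write is what admits the extra outcome $S = 2$; with indivisible instructions only 0 and 1 are possible.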
What we do in this paper is to generalize the approach exemplified above. It consists 
of breaking down instructions into sufficiently simple “atomic acts.” In this paper the 
atomic acts considered are at the level of reading or writing a single memory location. 
We call the actions at this level Primitive Actions. This seems to be a low enough level 
for our purpose, but even lower levels might be considered. We are not assuming bitwise 
actions in which, for example, in the previous example the final value of S could have 
been a bitwise mixture of the $c_0^1$ and $c_0^2$ actions. We call our assumption the Indivisibility 
of Primitive Instructions Assumption and the resulting formal model, the Simultaneity 
Model. In terms of the discussion in the introduction, we are interpreting C as the set of 
primitive instructions. In contrast, when the Indivisibility of Instructions Assumption 
is taken, the corresponding model is called the Commutativity Model. Figure 3 shows the 
conceptual relation between “True Simultaneity” (an intuitive and nebulous concept), 
the commutativity model, and our simultaneity model. The arrows indicate the path 
towards more accurate approximations of “True Simultaneity.” 
A natural question then arises: How much more general is the simultaneity model 
over the commutativity model ? Our main theorem answers this question by giving 
restrictions under which simultaneity and commutativity are equivalent. 
These restrictions deal with how simple the instructions in the processes must be. 
Therefore, if the instructions of the processes all satisfy these constraints, it is sufficient to 
study simultaneity via the more convenient commutative sequence methods. To conclude 
the paper we illustrate that when the constraints of the theorem are violated, then 
simultaneity leads to a wider class of behaviors than commutativity. Thus, the hypotheses 
of the theorem cannot be deleted, and in that sense the theorem is tight. 
We feel that these results help clarify the apparent paradox caused by sidestepping the 
issue of simultaneity in models for parallelism and synchronization. It illustrates, within 
the confines of the particular formalism, when the commutativity approach is feasible. 
It also raises the question of how one might construct an appropriate theoretical frame- 
work to provide a useful hierarchy of synchronization protection, so that in practice one 
would only need to be concerned about correct synchronization at a high level rather 
than at the primitive instruction level indicated by our results. 
In [11] we find a more extensive formulation for synchronization problems. Section 6 
of [11] contains an early version of the results of this paper. 
Notational Conventions 
If $A$ and $B$ are sets then $\alpha: A \to B$ represents a function $\alpha$ from $A$ to $B$, that is, with 
domain $A$ and range $B$. $\times^m A$ is the $m$-fold Cartesian product of $A$. $\times_{i=1}^{n} A_i$ is the 
Cartesian product of $A_1, \ldots, A_n$. We use $\bar{x} = (x_1, \ldots, x_n)$ to denote $n$-tuples, and 
$\Pi_i(\bar{x})$ to denote the projection function on the $i$th coordinate of the $n$-tuple; thus, 
$\Pi_i((x_1, \ldots, x_n)) = x_i$. If $\bar{x} = (x_1, \ldots, x_n)$ and $\bar{y} = (y_1, \ldots, y_m)$ are tuples, then $\bar{x}; \bar{y}$ 
denotes the $(n + m)$-tuple $(x_1, \ldots, x_n, y_1, \ldots, y_m)$ and $x_0; \bar{x}$ denotes the $(n + 1)$-tuple 
$(x_0, x_1, \ldots, x_n)$. The set $\{1, 2, \ldots, n\}$ is denoted by $[n]$. $|A|$ is the cardinality of the set $A$ 
and $N$ is the set of natural numbers $\{1, 2, \ldots\}$. A sequence is denoted by $\bar{\sigma} = (\sigma_1, \ldots, \sigma_n) = 
(\sigma_i)_{i=1}^{n}$ or $\bar{\sigma} = (\sigma_1, \sigma_2, \ldots)$. If the members of a sequence belong to a set $A$, then we say 
that it is a sequence in $A$. Note that we have used “overbars,” $\bar{\sigma}$ or $\bar{x}$, for sequences as 
well as tuples. This should not lead to any confusion. If $\bar{\sigma} = (\sigma_1, \ldots, \sigma_n)$ and $\bar{\tau} = (\tau_1, \ldots, \tau_m)$ 
then $\bar{\sigma}; \bar{\tau}$ is the sequence $(\sigma_1, \ldots, \sigma_n, \tau_1, \ldots, \tau_m)$. The empty set is denoted by $\emptyset$. 
3. SYSTEM OF PROCESSES 
Our model of a “system of processes” has affinities with the model introduced by 
Lipton [9] and also that of Gilbert and Chandler [5]. Each process is deterministic and 
sequentially executes instructions. An instruction of a process determines two things: 
it computes new data values and specifies the next instruction of the process to be executed. 
There is a common data set accessed by processes in the system, and it is through this 
common data set that interactions between processes occur. 
DEFINITION 1. $\mathcal{D} = \langle \bar{d}_0, D \rangle$ is called the data set where $D = \times_{i=1}^{m} D_i$, each $D_i$ is 
a set, $i = 1, 2, \ldots, m$, and $\bar{d}_0$ is an arbitrary element of $D$ called the initial data. 
DEFINITION 2. A process on $\mathcal{D}$ is a 3-tuple, $P = \langle C, \lambda, \mu \rangle$, where: 
(i) $C$ is a finite set called the (instruction) counter values with two distinguished 
elements $c_0$ and $c_h$. $c_0$ is the initial counter value and $c_h$ the halt counter value. 
(ii) $\lambda: (C - \{c_h\}) \times D \to C$ is the next instruction function. 
(iii) $\mu: (C - \{c_h\}) \times D \to D$ is the data transformation function. 
DEFINITION 3. A system of $n$ processes on $\mathcal{D}$ is a set $\Sigma = \{P^i\}_{i=1}^{n}$, where each $P^i = 
\langle C^i, \lambda^i, \mu^i \rangle$, $i = 1, 2, \ldots, n$, is a process on $\mathcal{D}$, and $C^i \cap C^j = \emptyset$ for $i \neq j$. 
We call $P^i$ the $i$th process. We will use superscripts to denote the process being referred 
to. For example, $c_0^i$ and $c_h^i$ are the initial and halt counter values of the $i$th process. 
A typical member of $D$ is $\bar{d} = (d_1, \ldots, d_m)$. We take $\mathcal{D}$ and $\Sigma$ to be fixed for the rest of 
this paper. Since $C^i \cap C^j = \emptyset$ for $i \neq j$, we may sometimes write $\lambda(c; \bar{d})$ or $\mu(c; \bar{d})$ 
instead of $\lambda^i(c; \bar{d})$ or $\mu^i(c; \bar{d})$ since $i$ can be understood from $c$. 
Each instruction of $P^i$ is represented by a counter value in $C^i$. We will often call $C^i$ 
the instructions of $P^i$. Each $P^i$ begins its computation at the initial instruction $c_0^i$. The 
only communication between processes occurs through $\mathcal{D}$. No process may modify or 
read another process’s counter value. Although $\mathcal{D}$ could include all the variables of interest 
in the computational, control, and interaction aspect of the processes, it is often convenient 
to consider $\mathcal{D}$ to be only the data used for process interaction. 
In [11], we included the notion of system failure in the model. Although the results go 
through directly including system failure, for simplicity we omit this aspect of the model 
here. In addition, the model here differs from [11, 12] by including a special halt instruction 
to allow more natural modeling of programs that terminate. 
DEFINITION 4. An instantaneous description (i.d.) of $\Sigma$ is an $(n + m)$-tuple, $I = 
(c^1, \ldots, c^n, d_1, \ldots, d_m)$, where $c^i \in C^i$, $i = 1, 2, \ldots, n$, and $(d_1, \ldots, d_m) \in D$. The initial i.d. 
of $\Sigma$ is $I_0 = \bar{c}_0; \bar{d}_0$, where $\bar{c}_0 = (c_0^1, c_0^2, \ldots, c_0^n)$, and $\bar{d}_0$ is the initial data. 
DEFINITION 5. Let $I = \bar{c}; \bar{d}$, $I' = \bar{c}'; \bar{d}'$ be i.d.’s of $\Sigma$, $i \in [n]$. The binary relation 
“$\to_{i,\Sigma}$” is said to hold between $I$ and $I'$, written $I \to_{i,\Sigma} I'$, iff 
(i) $\Pi_j(\bar{c}') = \Pi_j(\bar{c})$, for $j = 1, \ldots, n$, $j \neq i$, 
(ii) $\Pi_i(\bar{c}') = \lambda^i(\Pi_i(\bar{c}); \bar{d})$, 
(iii) $\bar{d}' = \mu^i(\Pi_i(\bar{c}); \bar{d})$. 
We then say that the instruction $\Pi_i(\bar{c})$ acted in the transition $I \to_{i,\Sigma} I'$. We also say 
that the transition $I \to_{i,\Sigma} I'$ is caused by the $i$th process. Note that if $\Pi_i(\bar{c}) = c_h^i$, i.e., 
the $i$th process has halted, then $\mu^i(\Pi_i(\bar{c}); \bar{d})$ is undefined so the $i$th process cannot cause 
any transition from $I$. We write $I \to_\Sigma I'$ iff $\exists i \in [n]$ such that $I \to_{i,\Sigma} I'$. The reference to 
$\Sigma$ is usually omitted in $\to_\Sigma$ and $\to_{i,\Sigma}$. As usual, $\to^*$ and $\to_i^*$ are the reflexive transitive 
closures of $\to$ and $\to_i$. If $I_0 \to^* I_1$, where $I_0$ is the initial i.d., we say that $I_1$ is reachable. 
DEFINITION 6. A sequence of i.d.’s $\bar{I} = (I_1, I_2, \ldots)$ is called a transition sequence iff 
$\forall i \geq 1$, $I_i \to I_{i+1}$. 
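Definitions 1 through 6 translate fairly directly into executable form. The sketch below is one possible encoding, not the paper's: a process is represented (our assumption) as a pair of its halt counter value and a table mapping each non-halt counter value to a pair of functions playing the roles of $\lambda$ and $\mu$; an i.d. is a pair (tuple of counter values, data tuple). The names `step` and `reachable` are hypothetical. The example system is the one of Fig. 2.

```python
from collections import deque

def step(i, procs, idesc):
    """Return the i.d. I' with I ->_i I', or None if process i has halted."""
    counters, data = idesc
    halt, table = procs[i]
    c = counters[i]
    if c == halt:
        return None  # lambda and mu are undefined at the halt counter value
    lam, mu = table[c]
    new_counters = counters[:i] + (lam(data),) + counters[i + 1:]
    return (new_counters, mu(data))

def reachable(procs, initial):
    """All i.d.'s I with initial ->* I, found by breadth-first search."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        cur = queue.popleft()
        for i in range(len(procs)):
            nxt = step(i, procs, cur)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Fig. 2: the data set has the single variable S, initially 1.
p1 = ("h1", {"c1": (lambda d: "h1", lambda d: (d[0] + 1,))})  # S <- S + 1
p2 = ("h2", {"c2": (lambda d: "h2", lambda d: (0,))})         # S <- 0
procs = [p1, p2]
I0 = (("c1", "c2"), (1,))

ids = reachable(procs, I0)
final = {d[0] for (cs, d) in ids if cs == ("h1", "h2")}
print(sorted(final))  # [0, 1]
```

With indivisible instructions the reachable halted i.d.'s give only $S = 0$ and $S = 1$, which is the behavior the commutativity model captures.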
4. REALIZATIONS 
If $\bar{\sigma} = (\sigma_1, \ldots, \sigma_n)$ and $\bar{\tau} = (\tau_1, \ldots, \tau_m)$ are sequences, then the shuffle product of $\bar{\sigma}$ and $\bar{\tau}$ 
is the set of all sequences of the type $\bar{\nu} = (\nu_1, \ldots, \nu_{n+m})$ such that $\bar{\sigma}$ and $\bar{\tau}$ appear as disjoint 
subsequences of $\bar{\nu}$. For example, if $\bar{\sigma} = (a, b, c)$ and $\bar{\tau} = (1, 2)$, then $(a, b, c, 1, 2)$, $(a, b, 1, c, 2)$, 
$(1, a, 2, b, c)$, $(1, 2, a, b, c)$, etc. are elements of the shuffle product of $\bar{\sigma}$ and $\bar{\tau}$. We will 
not use shuffle products of sequences $\bar{\sigma}$ and $\bar{\tau}$ which contain common occurrences of 
letters. It is easy to extend the definition to give the shuffle product of a finite set of 
sequences. 
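The shuffle product is easy to compute recursively. The following sketch uses hypothetical names (`shuffle_product`, `shuffle_all`) and assumes, as the text does, that the sequences share no letters; the extension to a finite set of sequences is done by folding the two-sequence product.

```python
def shuffle_product(s, t):
    """Set of all merges of s and t that keep each sequence's own order."""
    s, t = tuple(s), tuple(t)
    if not s:
        return {t}
    if not t:
        return {s}
    first_from_s = {(s[0],) + r for r in shuffle_product(s[1:], t)}
    first_from_t = {(t[0],) + r for r in shuffle_product(s, t[1:])}
    return first_from_s | first_from_t

def shuffle_all(seqs):
    """Shuffle product of a finite collection of sequences."""
    result = {()}
    for s in seqs:
        result = {r for partial in result
                  for r in shuffle_product(partial, s)}
    return result

sp = shuffle_product(("a", "b", "c"), (1, 2))
print(len(sp))                        # 10
print(("a", "b", 1, "c", 2) in sp)    # True
```

The count 10 is the binomial coefficient $\binom{5}{2}$: a shuffle is determined by which two of the five positions hold the elements of the second sequence.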
The first step in our goal to formalize a notion of simultaneity is to define a class of 
atomic actions which could act in combination to simulate any instruction. If $c$ is an 
instruction, we want to define a set of primitive instructions belonging to $c$, $A(c)$. If $\bar{\sigma} = 
(\sigma_1, \ldots, \sigma_k)$ is some suitable ordering of the primitive instructions of $A(c)$ such that when 
each $\sigma_i$ is executed in the indicated order, the result is the same as having executed $c$, we 
call $\bar{\sigma}$ a simple realization of $c$. If $\bar{\tau}$ is a simple realization of another instruction $c'$, then the 
simultaneous execution of $c$ and $c'$ may be reflected in the shuffle product of $\bar{\sigma}$ and $\bar{\tau}$. We 
illustrate this basic strategy by considering an informal example: 
$c_0$: do $x \leftarrow f(t, x)$ then goto $c_1$ od, 
where 
$$f(t, x) = \begin{cases} 0 & \text{if } t = 0, \\ x & \text{otherwise.} \end{cases} \qquad (*)$$ 
The variables $x$ and $t$ are assumed to range over the integers. The primitive instructions 
belonging to $c_0$ are of three sorts. First, there are the reading actions: 
$Rd_{x,c_0}$: $x' \leftarrow x$, 
$Rd_{t,c_0}$: $t' \leftarrow t$. 
Here we regard $x'$ and $t'$ as internal variables seen only by the process of $c_0$ (hence $x$, $t$ 
are also called external variables). In general for each external variable $z$ we have an 
internal copy $z'$. 
Second, we have the writing actions: 
$Wt_{x,c_0}$: $x \leftarrow f(x', t')$. 
The important thing to note is that $Wt_{x,c_0}$ computes using internal variables. 
Finally we have the atomic action corresponding to updating the program counter: 
$\lambda_{c_0}$: goto $c_1$. 
Hence the set of primitive instructions belonging to $c_0$ is $A_1(c_0) = \{Rd_{x,c_0}, Rd_{t,c_0}, 
Wt_{x,c_0}, \lambda_{c_0}\}$. It is also easy to see the sequences $(Rd_{x,c_0}, Rd_{t,c_0}, Wt_{x,c_0}, \lambda_{c_0})$ and 
$(Rd_{t,c_0}, Rd_{x,c_0}, Wt_{x,c_0}, \lambda_{c_0})$ are two simple realizations of $c_0$. Of course, still other 
simple realizations are possible. 
The alert reader may already note that we made two questionable tacit assumptions 
whose importance becomes apparent when one discusses concurrent processes. The 
first assumption is that the action $c_0$ need not “update” the variable $t$ as it did for $x$. The 
primitive instruction in question is: 
$Wt_{t,c_0}$: $t \leftarrow t'$. 
If this assumption is revoked, then the set of primitive actions belonging to $c_0$ ought to 
be $A_2(c_0) = A_1(c_0) \cup \{Wt_{t,c_0}\}$. To see that this alternative definition $A_2(c_0)$ leads to 
semantically inequivalent instructions, consider $\bar{\sigma} = (Rd_{x,c_0}, Rd_{t,c_0}, Wt_{x,c_0}, Wt_{t,c_0}, \lambda_{c_0})$ 
as a simple realization of $c_0$. Let $c_1$ be the instruction 
$c_1$: $t \leftarrow t + 1$. 
It is not hard to see that if $\bar{\tau}$ is any simple realization of $c_1$, we can find $\bar{\nu}$, a member of 
the shuffle product of $\bar{\tau}$ and $\bar{\sigma}$, such that the final effect of executing $\bar{\nu}$ leaves $t$ with its 
original value (simply let $Wt_{t,c_0}$ be the last writing action in $\bar{\nu}$). However, such a result 
cannot occur if we take the primitive actions belonging to $c_0$ to be $A_1(c_0)$. This illustration 
should warn us about potential difficulties in formalizing a suitable notion of simple 
realizations. 
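The difference between the two choices of primitive actions for $c_0$ can be observed by brute force. The sketch below is illustrative only: it omits the counter-update actions and the reading of $x$ by the second instruction (neither affects $t$), starts from the arbitrary values $x = 5$, $t = 3$, and uses hypothetical helper names throughout.

```python
def interleavings(p, q):
    """Yield every order-preserving merge of the tuples p and q."""
    if not p:
        yield tuple(q)
        return
    if not q:
        yield tuple(p)
        return
    for rest in interleavings(p[1:], q):
        yield (p[0],) + rest
    for rest in interleavings(p, q[1:]):
        yield (q[0],) + rest

def f(t, x):
    return 0 if t == 0 else x

def final_t(seq, x=5, t=3):
    """Run primitive actions on external data (x, t); return the final t."""
    ext = {"x": x, "t": t}
    loc = {0: {}, 1: {}}   # internal copies, one dictionary per process
    for act in seq:
        act(ext, loc)
    return ext["t"]

# Primitive actions of c0 (process 0).
def rd_x0(ext, loc): loc[0]["x"] = ext["x"]
def rd_t0(ext, loc): loc[0]["t"] = ext["t"]
def wt_x0(ext, loc): ext["x"] = f(loc[0]["t"], loc[0]["x"])
def wt_t0(ext, loc): ext["t"] = loc[0]["t"]      # the extra action t <- t'

# Primitive actions of c1: t <- t + 1 (process 1).
def rd_t1(ext, loc): loc[1]["t"] = ext["t"]
def wt_t1(ext, loc): ext["t"] = loc[1]["t"] + 1

c1 = (rd_t1, wt_t1)
a1 = (rd_x0, rd_t0, wt_x0)            # t is never rewritten by c0
a2 = (rd_x0, rd_t0, wt_x0, wt_t0)     # t is recopied by c0

t_a1 = {final_t(s) for s in interleavings(a1, c1)}
t_a2 = {final_t(s) for s in interleavings(a2, c1)}
print(sorted(t_a1))  # [4]
print(sorted(t_a2))  # [3, 4]
```

Under the first choice the increment always survives; under the second, an interleaving in which `wt_t0` comes last restores the original value of $t$.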
Another tacit assumption in defining $A_1(c_0)$ is related to the semantics of $f(x, t)$: In 
sequential programming the definition given by (*) is adequate, but for our purposes, it is 
unclear whether the variable $x$ should be updated or it should be “left alone” in case 
$t \neq 0$. In $Wt_{x,c_0}$ we assumed $x$ is always updated, but we could as well define: 
$Wt_{x,c_0,t=0}$: if $t' = 0$ then $x \leftarrow 0$ fi. 
The primitive instructions belonging to $c_0$ under this assumption are $A_3(c_0) = \{Rd_{x,c_0}, 
Rd_{t,c_0}, Wt_{x,c_0,t=0}, \lambda_{c_0}\}$. 
Remark. The above do not exhaust the possibilities. One can imagine even more 
perverse semantics such as: 
$Wt_{x,c_0,t=0 \wedge x \neq 0}$: if $t' = 0 \wedge x' \neq 0$ then $x \leftarrow 0$ fi. 
Both of the above ambiguities arise because in sequential programming, the functional 
specification, i.e., input-output description, of an instruction is usually sufficient. But in 
concurrent programming, functional specification is unable to distinguish between a 
variable that is to be left “untouched” and one that is to be “recopied.” Our solution is 
the introduction of $\psi$, a collection of predicates on $D$ such that for each variable $i \in [m]$, 
$\psi_i \in \psi$. The predicate $\psi_i$ is used to control the updating of variable $i$. Informally,¹ 
$Wt_{i,c,\psi}$: if $\psi_i(\bar{d}')$ then $d_i \leftarrow \Pi_i(\mu(c; \bar{d}'))$ fi, 
where $\bar{d}'$ is the internal copy of $\bar{d}$, $d_i = \Pi_i(\bar{d})$. The set of primitive instructions belonging 
to $c$ is now uniquely specified up to a choice of $\psi$, denoted by $A_\psi(c)$. For instance, in the 
case of $A_1(c_0)$ of the preceding example, $\psi_x$ is true and $\psi_t$ is false; in $A_2(c_0)$, both $\psi_x$ 
and $\psi_t$ are true; in $A_3(c_0)$, $\psi_x$ is “$t = 0$” and $\psi_t$ is false. 
We now formalize the preceding discussion. From now on, $\bar{C} = C^1 \times C^2 \times \cdots \times C^n$ 
and $\bar{D} = \times^{n+1} D$. An extended i.d. is $\tilde{I} = \bar{c}; (\bar{d}_1, \ldots, \bar{d}_{n+1}) \in \bar{C} \times \bar{D}$. We call $\bar{d}_i$ for $i \in [n]$ 
the internal data of the $i$th process, and $\bar{d}_{n+1}$ the external data (common to all processes). 
We regard a typical element of $\bar{D}$ to be $\tilde{d} = (\bar{d}_1, \ldots, \bar{d}_{n+1})$, an $(n + 1)$-vector whose 
components are $m$-vectors, rather than as an $(n + 1) \cdot m$-vector. For any vector $\bar{x}$, we 
let $\bar{x}[y/i]$ denote the vector identical to $\bar{x}$ except that the $i$th component is replaced by $y$. 
We introduce the extractor function $E: \bar{C} \times \bar{D} \to \bar{C} \times D$ given by $E(\bar{c}; (\bar{d}_1, \ldots, \bar{d}_{n+1})) = \bar{c}; 
\bar{d}_{n+1}$. Hence $E$ simply ignores the internal data and extracts an i.d. from an “extended 
i.d.” 
¹ The general notation $Wt_{i,c,\psi}$ we now adopt is not consistent with that of $Wt_{x,c_0}$, $Wt_{x,c_0,t=0}$, 
etc., of the introductory example. Some resemblance is clear. 
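DEFINITION 7. Let $i \in [m]$ and $c$ be an instruction of process $j$. The following are 
primitive instructions belonging to $c$: 
(i) $\lambda_c: \bar{C} \times \bar{D} \to \bar{C} \times \bar{D}$ such that $\lambda_c(\bar{c}; (\bar{d}_1, \ldots, \bar{d}_{n+1}))$ is undefined if $\Pi_j(\bar{c}) \neq c$, 
otherwise $\lambda_c(\bar{c}; (\bar{d}_1, \ldots, \bar{d}_{n+1})) = \bar{c}'; (\bar{d}_1', \ldots, \bar{d}_{n+1}')$, where $\bar{d}_l' = \bar{d}_l$ for $l \in [n + 1]$, and 
$\bar{c}' = \bar{c}[z/j]$, where $z = \lambda^j(c; \bar{d}_j)$. 
(ii) $Rd_{i,c}: \bar{C} \times \bar{D} \to \bar{C} \times \bar{D}$ such that $Rd_{i,c}(\bar{c}; (\bar{d}_1, \ldots, \bar{d}_{n+1}))$ is undefined if 
$\Pi_j(\bar{c}) \neq c$, otherwise $Rd_{i,c}(\bar{c}; (\bar{d}_1, \ldots, \bar{d}_{n+1})) = \bar{c}'; (\bar{d}_1', \ldots, \bar{d}_{n+1}')$, where $\bar{c}' = \bar{c}$, $\bar{d}_l' = \bar{d}_l$ 
for $l \neq j$, and $\bar{d}_j' = \bar{d}_j[z/i]$, where $z = \Pi_i(\bar{d}_{n+1})$. 
(iii) Let $\psi = \{\psi_i: i \in [m]\}$ be a set of predicates on $D$. $Wt_{i,c,\psi}: \bar{C} \times \bar{D} \to \bar{C} \times \bar{D}$ 
such that $Wt_{i,c,\psi}(\bar{c}; \tilde{d})$ is undefined if $\Pi_j(\bar{c}) \neq c$, otherwise $Wt_{i,c,\psi}(\bar{c}; \tilde{d}) = \bar{c}'; (\bar{d}_1', \ldots, \bar{d}_{n+1}')$, 
where $\bar{c}' = \bar{c}$, $\bar{d}_l' = \bar{d}_l$ for $l \in [n]$ and 
$$\bar{d}_{n+1}' = \begin{cases} \bar{d}_{n+1} & \text{if } \psi_i(\bar{d}_j) \text{ does not hold,} \\ \bar{d}_{n+1}[z/i], \text{ where } z = \Pi_i(\mu^j(c; \bar{d}_j)), & \text{if } \psi_i(\bar{d}_j) \text{ holds.} \end{cases}$$ 
We note that $Rd_{i,c}$ and $Rd_{i,c'}$ are identical instructions iff $c$ and $c'$ belong to the same 
process. The set $A_\psi(c)$ is $\{\lambda_c, Rd_{i,c}, Wt_{i,c,\psi}\}_{i \in [m]}$. 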
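If $\bar{\sigma}$ is a finite sequence of primitive instructions, we can easily define a partial function 
on extended i.d.’s by induction on its length. If $\bar{\sigma} = (\sigma_1)$ then $\bar{\sigma}(\tilde{I})$ is already given by 
Definition 7. If $\bar{\sigma} = \bar{\sigma}'; \sigma_k$, then $\bar{\sigma}(\tilde{I})$ is undefined if $\bar{\sigma}'(\tilde{I})$ is undefined. Otherwise $\bar{\sigma}(\tilde{I}) = 
\sigma_k(\bar{\sigma}'(\tilde{I}))$. 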
DEFINITION 8. A sequence of primitive instructions of the form $\bar{\sigma} = (Rd_{i_1,c}, \ldots, 
Rd_{i_m,c}, Wt_{j_1,c,\psi}, \ldots, Wt_{j_m,c,\psi}, \lambda_c)$, where $\{i_1, \ldots, i_m\} = \{j_1, \ldots, j_m\} = [m]$, 
is called a simple realization of $c$ relative to $\psi$ provided for all extended i.d.’s $\tilde{I} = \bar{c}; 
(\bar{d}_1, \ldots, \bar{d}_{n+1})$, where $c$ belongs to process $k$ and $\Pi_k(\bar{c}) = c$, if $\bar{\sigma}(\tilde{I}) = \bar{c}'; (\bar{d}_1', \ldots, \bar{d}_{n+1}')$ then 
$\mu(c; \bar{d}_{n+1}) = \bar{d}_{n+1}'$ and $\lambda(c; \bar{d}_{n+1}) = \Pi_k(\bar{c}')$. 
We want to define some notion of an instruction $c$ “depending on” and “changing” a 
variable. Recalling the introductory example, one would expect $c_0$ to depend on $t$ and to 
change $x$. Yet because there are different choices of the predicate set $\psi$, this has to be 
defined with care. 
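DEFINITION 9. Let $l \in [m]$, $\tilde{I}^{(k)} = \bar{c}^{(k)}; (\bar{d}_1^{(k)}, \ldots, \bar{d}_{n+1}^{(k)})$ for $k = 1, 2, \ldots$ and let $c$ belong 
to process $j$. 
(i) $\lambda_c$ depends on $l$ iff there exist $\bar{d}_j^{(1)}$ and $\bar{d}_j^{(2)}$ which are identical except on $l$, 
$\lambda_c(\tilde{I}^{(1)}) = \tilde{I}^{(3)}$, $\lambda_c(\tilde{I}^{(2)}) = \tilde{I}^{(4)}$ and $\Pi_j(\bar{c}^{(3)}) \neq \Pi_j(\bar{c}^{(4)})$. 
(ii) $\psi_i$ depends on $l$ iff there exist $\bar{d}$ and $\bar{d}'$ which are identical except on $l$, and 
$\psi_i(\bar{d})$ holds but $\psi_i(\bar{d}')$ does not. 
(iii) $Wt_{i,c,\psi}$ depends on $l$ iff either $\psi_i$ depends on $l$ or there exist $\bar{d}$ and $\bar{d}'$ which are 
identical except on $l$, $\psi_i(\bar{d})$ holds and $\Pi_i(\mu(c; \bar{d})) \neq \Pi_i(\mu(c; \bar{d}'))$. 
(iv) $Wt_{i,c,\psi}$ changes $l$ iff $l = i$ and for some $\bar{d}$, $\psi_i(\bar{d})$ holds. 
Let $\bar{\sigma} = (Rd_{i_1,c}, \ldots, Rd_{i_m,c}, Wt_{j_1,c,\psi}, \ldots, Wt_{j_m,c,\psi}, \lambda_c)$ be a simple realization of $c$ 
relative to $\psi$. 
(v) $\bar{\sigma}$ depends on $l$ iff $\lambda_c$ depends on $l$ or some $Wt_{j,c,\psi}$ depends on $l$, $j \in [m]$. 
(vi) $\bar{\sigma}$ changes $l$ iff some $Wt_{j,c,\psi}$ changes $l$. 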
There are some subtleties in Definition 9 (which, we hope, are inherent rather than an 
artifact of our formulation). Consider part (iii) of the definition applied to both the 
actions 
$Wt_{x,c_0}$: $x \leftarrow f(x', t')$ 
and 
$Wt_{x,c_0,t=0}$: if $t' = 0$ then $x \leftarrow 0$ fi 
of the introductory example. We see that $Wt_{x,c_0}$ is dependent on $x$ because the implicit 
predicate $\psi_x$ here is “true” and if we pick $\bar{d} = (x, t)$ and $\bar{d}_1 = (x', t')$ such that $\bar{d}$ and $\bar{d}_1$ 
are identical except on $x$, i.e., $t = t' \neq 0$, $x \neq x'$, then $\Pi_x(\mu(c_0, \bar{d})) \neq \Pi_x(\mu(c_0, \bar{d}_1))$. 
However, $Wt_{x,c_0,t=0}$ is not dependent on $x$ because the predicate $\psi_x$ is “$t = 0$” and if 
$\bar{d} = (x, t)$ and $\bar{d}' = (x', t')$ are identical except on $x$, either both $\psi_x(\bar{d})$ and $\psi_x(\bar{d}')$ hold 
or both do not hold. If they hold, then $\Pi_x(\mu(c_0, \bar{d})) = \Pi_x(\mu(c_0, \bar{d}')) = 0$ since 
$t = 0$ and $x$ is set to 0. Hence depending on the choice of $\psi$, the instruction $c_0$ may be 
regarded as dependent or independent of $x$. 
Additional Remark. We may say that instruction $c$ inherently depends on variable $i$ iff 
for all $\psi$, if $\bar{\sigma}$ is any simple realization of $c$ relative to $\psi$, then $\bar{\sigma}$ depends on $i$. The above 
discussion shows that $c_0$ does not inherently depend on $x$. But it is not hard to convince 
oneself that $c_0$ inherently depends on $t$. 
Let $\Sigma$ be the system of processes as before. Let $\Gamma$ be a function on $\bigcup_{j=1}^{n} C^j$ which 
assigns a simple realization $\Gamma(c)$ to each $c \in \bigcup_{j=1}^{n} C^j$. There is no restriction that the 
predicate sets $\psi$ for different instructions $c$ have to be related. We call the pair $\langle \Sigma, \Gamma \rangle$ an 
implemented system, which is fixed for the rest of this paper. We easily extend $\Gamma$ to sequences 
of instructions: If $\bar{c} = (c_1, \ldots, c_k)$ then $\Gamma(\bar{c}) = \Gamma(c_1); \Gamma(c_2); \ldots; \Gamma(c_k)$. If $\bar{c}$ is the 
null sequence $\Gamma(\bar{c})$ is also null. 
Let $\ell$ be a function from $[n]$ into the nonnegative integers, and let $C^*$ be a set of 
sequences of instructions $\{\bar{c}^1, \bar{c}^2, \ldots, \bar{c}^n\}$, where each $\bar{c}^j = (c_1^j, \ldots, c_{\ell(j)}^j)$. If $\ell(j) = 0$, then $\bar{c}^j$ is 
null. $C^*$ and $\ell$ will be fixed from now on. 
DEFINITION 10. A simultaneous realization of $C^*$ is an element of the shuffle product 
of $\{\Gamma(\bar{c}^1), \ldots, \Gamma(\bar{c}^n)\}$. A sequence of primitive instructions is said to be a simultaneous 
realization if there exists a set $C^*$ for which it is the simultaneous realization. 
DEFINITION 11. We define the binary relation on i.d.’s, $\to^{(S)}$, such that $I \to^{(S)} I'$ holds 
iff there exists $C^* = \{\bar{c}^j\}_{j=1}^{n}$ a set of sequences of instructions, $\bar{\sigma}$ a simultaneous realization 
of $C^*$ and $\tilde{I}$, $\tilde{I}'$ two extended i.d.’s such that 
(i) $\bar{\sigma}(\tilde{I})$ is defined and $\bar{\sigma}(\tilde{I}) = \tilde{I}'$. 
(ii) $E(\tilde{I}) = I$ and $E(\tilde{I}') = I'$. 
$I \to^{(S)} I'$ is to be read “$I$ simultaneously derives $I'$.” Extending our previous terminology 
we say that $C^*$ acted simultaneously in the transition $I \to^{(S)} I'$, and write $I \to^{(S)}_{C^*} I'$ or 
$I \to^{(S)}_{\bar{\sigma}} I'$. 
DEFINITION 12. A sequence of instructions $\bar{c} = (c_1, c_2, \ldots, c_l)$, where $l = \sum_{i=1}^{n} \ell(i)$, 
is a commutative realization of $C^*$ iff $\bar{c}$ is an element of the shuffle product of the sequences 
in $C^*$. If $I$ and $I'$ are i.d.’s such that $\bar{c}(I) = I'$, then we write $I \to^{(C)}_{C^*} I'$ (or, $I \to^{(C)} I'$) to be 
read “$I$ commutatively derives $I'$,” and $C^*$ acted commutatively in the transition $I \to^{(C)} I'$. 
It seems clear that $I \to^{(C)}_{C^*} I'$ implies $I \to^{(S)}_{C^*} I'$. But the converse is false in general. Our 
main result concerns discovering conditions under which the converse holds. That is, for 
all $C^*$, $I \to^{(S)}_{C^*} I'$ iff $I \to^{(C)}_{C^*} I'$. When this obtains, we describe the situation with the 
convenient slogan: 
“simultaneity = commutativity.” 
The significance of this condition is that modeling simultaneity by commutativity is then 
justified. 
5. THE MAIN RESULT 
It seems clear that the attainment of “simultaneity = commutativity” comes from 
restrictions on the power of instructions. For this we need some further definitions. 
DEFINITION 13. For each $c$, the range of $c$ is $\rho(c) = \{i \in [m]: \Gamma(c) \text{ changes } i\}$ and the 
domain of $c$ is $\delta(c) = \{i \in [m]: \Gamma(c) \text{ depends on } i\}$. 
DEFINITION 14. The variable $i$, $i \in [m]$, is said to be private to process $j$, $j \in [n]$, iff 
for all instructions $c$, $i \in \rho(c) \cup \delta(c)$ implies $c \in C^j$. If a variable is not private to any 
process, then it is public. 
DEFINITION 15. An instruction $c \in C^j$ is a load-type instruction iff $\delta(c)$ contains at 
least one public variable. It is a store-type instruction iff $\rho(c)$ contains at least one public 
variable. 
Private variables are those that are accessed (read or written) by at most one process. If 
the class of load-type instructions and the class of store-type instructions are disjoint, we 
say that instructions are dichotomized. Note that when instructions are dichotomized, 
then each instruction falls under exactly one of the descriptions of a load-type or a store-type 
or uses only private variables. 
We can now state our main result. 
THEOREM (simultaneity = commutativity). Under the conditions that 
(i) instructions are dichotomized and 
(ii) $\forall$ instructions $c$, at most one variable in $\delta(c) \cup \rho(c)$ is public, 
then “simultaneity = commutativity.” 
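The two hypotheses of the theorem can be checked mechanically once the domain and range of every instruction are known. The sketch below assumes these sets are supplied by hand as Python sets (rather than computed from the realizations via Definitions 9 and 13); the function names and the instruction labels `c01`, `c02`, `c0a`, `c0b` are hypothetical. The example data come from the two-process system of Section 2, before and after the increment is split into a read and a write.

```python
def public_variables(domain, rng, owner):
    """Variables depended on or changed by instructions of two or more processes."""
    users = {}
    for c, proc in owner.items():
        for v in domain[c] | rng[c]:
            users.setdefault(v, set()).add(proc)
    return {v for v, procs in users.items() if len(procs) > 1}

def check_hypotheses(domain, rng, owner):
    """Return (dichotomized, at_most_one_public) for the given system."""
    public = public_variables(domain, rng, owner)
    load = {c for c in owner if domain[c] & public}
    store = {c for c in owner if rng[c] & public}
    dichotomized = not (load & store)
    one_public = all(len((domain[c] | rng[c]) & public) <= 1 for c in owner)
    return dichotomized, one_public

# S <- S + 1 (process 1) and S <- 0 (process 2) as whole instructions.
owner = {"c01": 1, "c02": 2}
domain = {"c01": {"S"}, "c02": set()}
rng = {"c01": {"S"}, "c02": {"S"}}
print(check_hypotheses(domain, rng, owner))     # (False, True)

# The increment split into U <- S and S <- U + 1, with U private.
owner2 = {"c0a": 1, "c0b": 1, "c02": 2}
domain2 = {"c0a": {"S"}, "c0b": {"U"}, "c02": set()}
rng2 = {"c0a": {"U"}, "c0b": {"S"}, "c02": {"S"}}
print(check_hypotheses(domain2, rng2, owner2))  # (True, True)
```

The unsplit increment is both load-type and store-type on the public variable $S$, so instructions are not dichotomized there; this is the same system in which the outcome $S = 2$ arises only under simultaneity.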
The following two lemmas assume the conditions of the theorem. 
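LEMMA 1. Let $\tilde{I}^{(i)} = \bar{c}^{(i)}; (\bar{d}_1^{(i)}, \ldots, \bar{d}_{n+1}^{(i)})$ for $i = 1, 2, \ldots$. If $E(\tilde{I}^{(1)}) = E(\tilde{I}^{(2)})$, then for 
all simultaneous realizations $\bar{\sigma}$, $\bar{\sigma}(\tilde{I}^{(1)})$ is defined iff $\bar{\sigma}(\tilde{I}^{(2)})$ is defined. Moreover, $\bar{\sigma}(\tilde{I}^{(1)}) = \tilde{I}^{(3)}$ 
and $\bar{\sigma}(\tilde{I}^{(2)}) = \tilde{I}^{(4)}$ implies that $E(\tilde{I}^{(3)}) = E(\tilde{I}^{(4)})$. 
Proof. It is sufficient to show that for any $\bar{\sigma}'$, prefix of $\bar{\sigma}$, such that $\bar{\sigma}'(\tilde{I}^{(1)}) = \tilde{I}^{(5)}$ and 
$\bar{\sigma}'(\tilde{I}^{(2)}) = \tilde{I}^{(6)}$ (if defined), we have 
(*) $E(\tilde{I}^{(5)}) = E(\tilde{I}^{(6)})$, whenever one is defined. 
Let $\bar{\sigma}'$ be $\bar{\sigma}''; e$ for $\bar{\sigma}''$ a prefix of $\bar{\sigma}$ and $e$ a single primitive instruction. Let $\bar{\sigma}''(\tilde{I}^{(1)}) = \tilde{I}^{(7)}$, 
$\bar{\sigma}''(\tilde{I}^{(2)}) = \tilde{I}^{(8)}$ (if defined). By induction hypothesis $E(\tilde{I}^{(7)}) = E(\tilde{I}^{(8)})$ if either one is 
defined. If both $\tilde{I}^{(7)}$ and $\tilde{I}^{(8)}$ are undefined, then both $\tilde{I}^{(5)}$ and $\tilde{I}^{(6)}$ are undefined and we 
are done. So assume $\tilde{I}^{(7)}$ and $\tilde{I}^{(8)}$ are defined. It is straightforward to check for each of 
the three cases of $e = \lambda_c$ or $e = Rd_{i,c}$ or $e = Wt_{i,c,\psi}$ that $E(\tilde{I}^{(7)}) = E(\tilde{I}^{(8)})$ implies (*) 
for $\bar{\sigma}'$. Q.E.D. 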
Lemma 1 says that if $\tilde{I}^{(1)}$ and $\tilde{I}^{(2)}$ are two extended i.d.’s such that their counter values 
and external data are identical, then any simultaneous realization that transforms them 
(if defined) does it in a way that the counter values and external data of the transformed 
extended i.d.’s again agree. 
We say that $Wt_{i,c,\psi}$ accesses public data if $i$ is public. We say that $Rd_{i,c}$ accesses public 
data if $i$ is public and for some $j$, $Wt_{j,c,\psi}$ depends on $i$. No other instructions access public 
data. 
LEMMA 2. Let $\sigma_1$ and $\sigma_2$ be primitive instructions belonging to different processes such that 
at most one of them accesses public data. Let $\bar{\tau} = \bar{\tau}^{(1)}; \sigma_1; \sigma_2; \bar{\tau}^{(2)}$ and $\bar{\tau}' = \bar{\tau}^{(1)}; \sigma_2; \sigma_1; \bar{\tau}^{(2)}$ 
be simultaneous realizations. Then for all $\tilde{I}$, $\bar{\tau}(\tilde{I})$ is defined iff $\bar{\tau}'(\tilde{I})$ is defined. Moreover 
$E(\bar{\tau}(\tilde{I})) = E(\bar{\tau}'(\tilde{I}))$ whenever defined. 
Proof. By symmetry, let us assume $\bar{\tau}(\tilde{I})$ is defined. Then $(\bar{\tau}^{(1)}; \sigma_1; \sigma_2)(\tilde{I})$ is defined. 
To be specific, let $\sigma_1$ be in process 1, $\sigma_2$ in process 2. Since $\sigma_1$ cannot affect the counter 
value of process 2, and $\sigma_2$ cannot affect the counter value of process 1, it is easy to see that 
$(\bar{\tau}^{(1)}; \sigma_2; \sigma_1)(\tilde{I})$ is defined. We now claim for all prefixes $\bar{\tau}^{(3)}$ of $\bar{\tau}^{(2)}$, if $\bar{\tau}^{(4)} = \bar{\tau}^{(1)}; \sigma_1; \sigma_2; \bar{\tau}^{(3)}$ 
and $\bar{\tau}^{(5)} = \bar{\tau}^{(1)}; \sigma_2; \sigma_1; \bar{\tau}^{(3)}$ then 
(i) $\bar{\tau}^{(4)}(\tilde{I})$ and $\bar{\tau}^{(5)}(\tilde{I})$ are defined, 
(ii) $E(\bar{\tau}^{(4)}(\tilde{I})) = E(\bar{\tau}^{(5)}(\tilde{I}))$. 
This claim would establish our lemma. We use induction on the length of $\bar{\tau}^{(3)}$. If $\bar{\tau}^{(3)}$ is 
the null sequence, our remarks above show (i). To see (ii), it is easy to see that the counter 
values in $\bar{\tau}^{(4)}(\tilde{I})$ and $\bar{\tau}^{(5)}(\tilde{I})$ are identical when $\bar{\tau}^{(3)}$ is empty. If both $\sigma_1$ and $\sigma_2$ are not 
writing instructions, then the external data in $\bar{\tau}^{(4)}(\tilde{I})$ and $\bar{\tau}^{(5)}(\tilde{I})$ are identical, being both 
equal to the external data of $\bar{\tau}^{(1)}(\tilde{I})$. So assume $\sigma_2$ is $Wt_{i,c,\psi}$ for some $i$. Surely the internal data 
of process 2 in $(\bar{\tau}^{(1)}; \sigma_1)(\tilde{I})$ and $\bar{\tau}^{(1)}(\tilde{I})$ are identical. Hence the $i$th variable of the external 
data of $(\bar{\tau}^{(1)}; \sigma_1; \sigma_2)(\tilde{I})$ and $(\bar{\tau}^{(1)}; \sigma_2)(\tilde{I})$ are identical. If $\sigma_1$ is not a writing instruction, then 
this would imply the external data of $(\bar{\tau}^{(1)}; \sigma_2; \sigma_1)(\tilde{I})$ equals those of $(\bar{\tau}^{(1)}; \sigma_2)(\tilde{I})$ and 
$(\bar{\tau}^{(1)}; \sigma_1; \sigma_2)(\tilde{I})$ and we are done. So let $\sigma_1$ be a writing instruction of the form $Wt_{j,c',\psi'}$. 
We may surely assume both $\psi_i$ and $\psi_j'$ are not “false,” otherwise the writing action with 
the false predicate does nothing and we are back to the previous cases. But then $j \neq i$, 
otherwise $i = j$ would be public, contradicting our assumptions on $\sigma_1$, $\sigma_2$. Again we 
see the external data of $(\bar{\tau}^{(1)}; \sigma_1; \sigma_2)(\tilde{I})$ and $(\bar{\tau}^{(1)}; \sigma_2; \sigma_1)(\tilde{I})$ agree. This concludes the basis 
case. The rest of the inductive step is similar, but is tedious and unenlightening, so we 
omit it. Q.E.D. 
Lemma 2 may easily be strengthened in various ways if desired. It permits us to 
“commute” adjacent primitive instructions without affecting the final outcome. This 
lemma is used repeatedly in the main proof. 
Proof of Theorem. One direction of the theorem is easily derived without even using 
the restrictions (i) and (ii) on instructions. Let lo) = P); (JF),..., &$,) and I(i) = E(l”@)) 
for i = 1,2,... . If E is a commutative realization of C* and E(P)) = P), we want to 
show that there is a simultaneous realization of C*, 6, such that I(i) + :$a). Let z = 
(Cl ,**-, cc), 1 = zT=, f(z). If I = 1, then the result follows by definition of a simple 
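realization. If $l > 1$, assume the result inductively for $l - 1$. Note that $\bar{c}' = (c_1, \ldots, c_{l-1})$
is a commutative realization of some other set of sequences of instructions. Let $I^{(1)} \xrightarrow{\bar{c}} I^{(2)}$
and $I^{(3)} \xrightarrow{c_l} I^{(2)}$ for some $I^{(3)}$. By induction there exists $\bar{\sigma}'$, a simultaneous realization, such that
$I^{(1)} \xrightarrow{\bar{\sigma}'} I^{(3)}$. This means that $\bar{\sigma}'(\tilde{I}^{(1)}) = \tilde{I}^{(3)}$. From Lemma 1, $I^{(3)} \xrightarrow{c_l} I^{(2)}$ implies that
there exists a simple realization of $c_l$, $\bar{\sigma}''$, such that $\bar{\sigma}''(\tilde{I}^{(3)}) = \tilde{I}^{(2)}$. Hence $(\bar{\sigma}'; \bar{\sigma}'')(\tilde{I}^{(1)}) =
\tilde{I}^{(2)}$. Hence $\bar{\sigma} = (\bar{\sigma}'; \bar{\sigma}'')$ will be the required simultaneous realization of $C^*$ such that
$I^{(1)} \xrightarrow{\bar{\sigma}} I^{(2)}$.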
The other direction of the theorem makes essential use of the conditions (i) and (ii) in 
the theorem. The proof proceeds by induction on 2 = Cr=, e(z). If 2 = 1, the result is 
already given by the definition of a simple realization. So let $l > 1$ and assume the result
for $l - 1$. Let $\bar{\sigma} = (\sigma_1, \ldots, \sigma_k)$ be a simultaneous realization of $C^*$, and $h$ be the largest
index such that $\sigma_h$ accesses public data. Let $\sigma_h$ belong to the instruction $c$. We first claim
that $\bar{\sigma}$ may be transformed (by permutation) into $\bar{\sigma}^{(1)} = (\sigma_1^{(1)}, \ldots, \sigma_k^{(1)})$ such that for all $\tilde{I}$,
$\bar{\sigma}^{(1)}(\tilde{I})$ is defined iff $\bar{\sigma}(\tilde{I})$ is defined, $\bar{\sigma}^{(1)}(\tilde{I}) = \bar{\sigma}(\tilde{I})$ whenever defined, and if $h_1$ is the largest
index such that $\sigma_{h_1}^{(1)}$ accesses public data, then $\sigma_{h_1}^{(1)} = \sigma_h$ and for all $i > h_1$, $\sigma_i^{(1)}$ belongs
to $c$. This is done by applying Lemma 2 repeatedly: If for all $i > h$, $\sigma_i$ belongs to $c$, then
we are done. Otherwise, pick the smallest index, $i$, such that $i > h$ and $\sigma_i$ does not
belong to $c$. By choice of $h$, $\sigma_i$ does not access public data, so we may move $\sigma_i$ to the
position immediately to the left of $\sigma_h$. It is easy to see that this procedure must always
halt and the final sequence has the desired property. We now claim that $\bar{\sigma}^{(1)}$ may be
transformed into $\bar{\sigma}^{(2)} = (\sigma_1^{(2)}, \ldots, \sigma_k^{(2)})$ such that for some $g \in [k]$, $\bar{\sigma}^{(2)} = (\bar{\sigma}^{(3)}; \bar{\sigma}^{(4)})$, $\bar{\sigma}^{(3)} =
(\sigma_1^{(2)}, \ldots, \sigma_g^{(2)})$ and $\bar{\sigma}^{(4)} = (\sigma_{g+1}^{(2)}, \ldots, \sigma_k^{(2)})$, where $\bar{\sigma}^{(2)}(\tilde{I}) = \bar{\sigma}^{(1)}(\tilde{I})$ whenever one of $\bar{\sigma}^{(2)}(\tilde{I})$ or
$\bar{\sigma}^{(1)}(\tilde{I})$ is defined, $\bar{\sigma}^{(4)}$ is $r(c)$ and $\bar{\sigma}^{(3)}$ is the simultaneous realization of some set of instruction
sequences. Again this is done by repeated application of Lemma 2. We do this by
moving all the primitive instructions that belong to $c$ to the right of primitive instructions
that do not belong to $c$. Note that the primitive instructions that belong to $c$ which have
to be moved to the right cannot be $\sigma_{h_1}^{(1)}$, by construction of $\bar{\sigma}^{(1)}$. By the assumption that
instructions are dichotomized, such primitive instructions cannot access the same public
variable as $\sigma_{h_1}^{(1)}$, since one would have to be a reading, the other a writing primitive
instruction. Since there is only one public variable in $c$, such primitive instructions access
no public variables. Lemma 2 is then applicable.
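
The two rearrangement steps of this argument can be sketched as plain list manipulation. The representation below is an assumption made for illustration, not the paper's notation: each primitive instruction is a pair `(instruction, accesses_public)`, and `rearrange` is a hypothetical helper. The sketch only performs the permutation; the fact that each individual move preserves the outcome is what Lemma 2 supplies, and is not checked here.

```python
def rearrange(seq, c):
    """Move all primitives of instruction `c` to the end of `seq`.

    `seq` is a list of (instruction, accesses_public) pairs, and `c` is the
    instruction that owns the last public access in `seq`.
    """
    seq = list(seq)
    # h is the largest index whose primitive accesses public data.
    h = max(i for i, (_, public) in enumerate(seq) if public)
    assert seq[h][0] == c

    # Step 1: every primitive after position h that does not belong to c
    # is moved to the position immediately to the left of the primitive at h.
    i = h + 1
    while i < len(seq):
        if seq[i][0] != c:
            seq.insert(h, seq.pop(i))
            h += 1  # the last public access shifted one place to the right
        i += 1

    # Step 2: primitives of c that still sit before position h are moved to the
    # right of the primitives that do not belong to c (written here as a stable
    # partition instead of a sequence of adjacent swaps).
    head = [s for s in seq[:h] if s[0] != c]
    tail = [s for s in seq[:h] if s[0] == c] + seq[h:]
    return head + tail

# Hypothetical sequence: instructions "a", "b" and "c"; True marks a public access.
example = [("a", True), ("c", False), ("b", False),
           ("c", True), ("b", False), ("c", False)]
result = rearrange(example, "c")
```

For this input the primitives of `c` end up as a contiguous suffix, mirroring the decomposition of the transformed sequence into a prefix followed by a realization of `c`.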
Let?(i) = E(i); (Jf),..., &$,) and P = E@o)) for i = 1, 2,... . Given that I(l) -+T8) P, 
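we want to show that $I^{(1)} \xrightarrow{\bar{c}} I^{(2)}$ for some commutative realization of $C^*$, $\bar{c}$. From our
transformation of $\bar{\sigma}$ into $\bar{\sigma}^{(2)}$, we get that $I^{(1)} \xrightarrow{\bar{\sigma}^{(2)}} I^{(2)}$. This means that $I^{(1)} \xrightarrow{\bar{\sigma}^{(3)}} I^{(3)} \xrightarrow{\bar{\sigma}^{(4)}}
I^{(2)}$ for some $I^{(3)}$, since $\bar{\sigma}^{(2)} = (\bar{\sigma}^{(3)}; \bar{\sigma}^{(4)})$. By the induction hypothesis, there exists a commutative
realization $\bar{c}'$ such that $I^{(1)} \xrightarrow{\bar{c}'} I^{(3)}$. Therefore, choosing $\bar{c} = (\bar{c}'; c)$ we obtain $I^{(1)} \xrightarrow{\bar{c}} I^{(2)}$,
as required. Q.E.D.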
6. COUNTEREXAMPLES 
The examples of this section show that our result is tight in the sense that neither of
its assumptions can be omitted. First, we show that the particular implementation of
instructions (as specified by $r$) is crucial to our result. Consider the example shown in
Fig. 4. We may implement $c_0^1$ as $r(c_0^1) = (Rd_{X,c_0^1}, Wt_{X,c_0^1,\mathrm{false}})$, in which case $c_0^1$ does
not depend on the variable $X$. The implemented system then satisfies the conditions
of the main theorem. The final value of $X$ is always 1. On the other hand, if $c_0^1$ is
implemented as $r(c_0^1) = (Rd_{X,c_0^1}, Wt_{X,c_0^1,\mathrm{true}})$, then $c_0^1$ depends on $X$. This violates the
dichotomy of instructions since $c_0^1$ also changes $X$. The final value of $X$ in this case
could be 0.
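
The two implementations of $c_0^1$ can be checked by brute force. The following is a minimal sketch, assuming that process 1 executes the pair (read $X$, write $X$ guarded by a predicate) and that process 2's instruction $X \leftarrow 1$ is a single primitive write. The helper names `interleavings` and `final_values` are hypothetical.

```python
def interleavings(p, q):
    """Yield every merge of sequences p and q that keeps each one's internal order."""
    if not p or not q:
        yield list(p) + list(q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest

def final_values(psi):
    """Set of final values of X when the write in r(c_0^1) has predicate `psi`."""
    process_1 = ["read_X", "write_X"]   # r(c_0^1): read X, then write it back if psi
    process_2 = ["set_X_to_1"]          # X <- 1, modeled as one primitive write
    results = set()
    for order in interleavings(process_1, process_2):
        x, buffer = 0, None             # INITIALLY X = 0
        for op in order:
            if op == "read_X":
                buffer = x
            elif op == "write_X":
                if psi:
                    x = buffer
            else:
                x = 1
        results.add(x)
    return results

with_false_predicate = final_values(False)
with_true_predicate = final_values(True)
```

With the false predicate the write of process 1 never takes effect, so only the final value 1 occurs. With the true predicate the interleaving read, `set_X_to_1`, write restores the stale value 0.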
INITIALLY X = 0
(Process 1)                  (Process 2)
$c_0^1$ : $X \leftarrow X$            $c_0^2$ : $X \leftarrow 1$
$c_1^1$ : halt
FIGURE 4

INITIALLY (X, Y) = (0, 0)
$c_0^1$ : $(X, Y) \leftarrow (1, 0)$        $c_0^2$ : $(X, Y) \leftarrow (0, 1)$
$c_1^1$ : halt                     $c_1^2$ : halt
FIGURE 5
INITIALLY (X, Y) = (0, 0)
(Process 1)                  (Process 2)
$c_0^1$ : $U \leftarrow X + Y$        $c_0^2$ : $X \leftarrow 1$
$c_1^1$ : halt                 $c_1^2$ : $Y \leftarrow 2$
                             $c_2^2$ : halt
FIGURE 6
Returning to the example of Fig. 2, we see that the instruction $c_0^1$: $S \leftarrow S + 1$ violates
the dichotomy of instructions since $c_0^1$ is both a load-type and a store-type instruction.
As predicted by the theorem, commutativity $\neq$ simultaneity for this system. This
example is almost identical to that of Fig. 4, the only difference being that $c_0^1$ in Fig. 4
does not inherently depend on $X$, but $c_0^1$ in Fig. 2 inherently depends on $S$ (that is why
we need not show explicitly any simple realizations of $S \leftarrow S + 1$).
Consider the system shown in Fig. 5. We assume that $\psi_X$ and $\psi_Y$ are true. Commutativity
alone can get a final value of (1, 0) or (0, 1) for $(X, Y)$. But simultaneity may
result in (0, 0) and/or (1, 1) also, depending on which simple realizations of $c_0^1$ and $c_0^2$
we choose. The different results come from choices of whether to write $X$ or $Y$ first in
$c_0^1$ and $c_0^2$. Note that in this case, the instructions are dichotomized but the single public
variable constraint is violated by writing into two public variables simultaneously. The
reader may feel that assigning to more than one public variable simultaneously may be
unusual for many programming languages. The next example (Fig. 6) is more natural
in that sense. It violates the single public variable constraint in a different manner: by
reading from two public variables simultaneously instead of writing into two public
variables simultaneously. In this example, $U$ is a private variable. Assuming commutativity,
$U$ may finally be 0, 1, or 3. If we choose the simple realization $r(c_0^1) = (Rd_{X,c_0^1},
Rd_{Y,c_0^1}, Rd_{U,c_0^1}, Wt_{X,c_0^1,\psi_X}, Wt_{Y,c_0^1,\psi_Y}, Wt_{U,c_0^1,\psi_U})$, where $\psi_X$ and $\psi_Y$ are false and $\psi_U$ is
true, then it is possible for $U$ to attain a value of 2 in the simultaneity model.
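
Both counterexamples can be reproduced by enumerating interleavings. The sketch below is an illustration under stated assumptions, not the paper's formalism: instructions are modeled as Python closures, commutativity is modeled by wrapping an instruction's primitives in `atomic`, and the primitives with false predicates (and the read of $U$) are omitted because they have no effect on the outcome. All helper names are hypothetical.

```python
def interleavings(p, q):
    """Yield every merge of sequences p and q that keeps each one's internal order."""
    if not p or not q:
        yield list(p) + list(q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest

def rd(var):
    """Primitive read of public `var` into the private buffer."""
    def step(pub, buf):
        buf[var] = pub[var]
    return step

def wt(var, fn):
    """Primitive write of fn(buffer) into `var` (predicate assumed true)."""
    def step(pub, buf):
        pub[var] = fn(buf)
    return step

def atomic(*steps):
    """Bundle primitives into one indivisible step (the commutativity model)."""
    def step(pub, buf):
        for s in steps:
            s(pub, buf)
    return step

def outcomes(p1, p2, init, observe):
    """Collect observe(final state) over all interleavings of p1 and p2."""
    results = set()
    for order in interleavings(p1, p2):
        pub, buf = dict(init), {}
        for step in order:
            step(pub, buf)
        results.add(observe(pub))
    return results

# Fig. 6: U <- X + Y in process 1; X <- 1 then Y <- 2 in process 2.
c01 = [rd("X"), rd("Y"), wt("U", lambda b: b["X"] + b["Y"])]
proc2 = [wt("X", lambda b: 1), wt("Y", lambda b: 2)]
init6 = {"X": 0, "Y": 0, "U": None}
fig6_commutative = outcomes([atomic(*c01)], proc2, init6, lambda s: s["U"])
fig6_simultaneous = outcomes(c01, proc2, init6, lambda s: s["U"])

# Fig. 5: both processes assign to (X, Y); here each writes X first, then Y.
a = [wt("X", lambda b: 1), wt("Y", lambda b: 0)]
b = [wt("X", lambda b: 0), wt("Y", lambda b: 1)]
init5 = {"X": 0, "Y": 0}
pair = lambda s: (s["X"], s["Y"])
fig5_commutative = outcomes([atomic(*a)], [atomic(*b)], init5, pair)
fig5_simultaneous = outcomes(a, b, init5, pair)
```

In this sketch the atomic version of Fig. 6 yields the values 0, 1 and 3 for `U`, while the primitive-level interleaving also yields 2 (read `X`, both writes of process 2, then read `Y`). For Fig. 5 with both realizations writing `X` first, the primitive-level interleaving adds (0, 0) and (1, 1) to the two commutative outcomes; other write orders would give a different subset.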
7. CONCLUSIONS 
We have introduced a formal model of a synchronization system, and within the model, 
formulated a precise notion of simultaneity. The main results show sufficient and non- 
deletable conditions for the formula “commutativity = simultaneity” to obtain. We 
have ignored failure functions in the main results, but it is not difficult to incorporate 
failure functions into the main result, treating the failure as a special type of instruction. 
ACKNOWLEDGMENTS 
We are happy to acknowledge the help of C. C. Elgot, with whom we had many substantive 
discussions during our early work on this paper. We thank Leslie Lamport for comments which 
led to the strengthening of our theorem from that in [11, 12], and we thank G. L. Peterson for his
comments which led to a correction of our concept of “depends.” 
REFERENCES 
1. A. J. BERNSTEIN, Analysis of programs for parallel processing, IEEE Trans. Electronic Computers,
EC-15 (1966), 757-763. 
2. A. CREMERS AND T. N. HIBBARD, “An Algebraic Approach to Concurrent Programming Control
and Related Complexity Problems,” Report, USC Computer Science Program, November, 1975. 
3. E. W. DIJKSTRA, Solution of a problem in concurrent programming control, Comm. ACM 8 
(1965), 569. 
4. E. W. DIJKSTRA, Co-operating sequential processes, in “Programming Languages” (F. Genuys, 
Ed.), pp. 43-112, New York, Academic Press (1968). 
5. P. GILBERT AND W. J. CHANDLER, Interference between communicating parallel processes,
Comm. ACM 15, No. 6 (1972), 427-437. 
6. R. M. KARP AND R. E. MILLER, Parallel program schemata, J. Comput. System Sci. 3 (1969), 
147-195. 
7. L. LAMPORT, “On Concurrent Reading and Writing,” Report CA-7409-0511, Massachusetts
Computer Assoc., Inc., September 1974; revised March 1976. 
8. L. LAMPORT, “Time, Clocks and the Ordering of Events in a Distributed System,” Report 
CA-7603-2911, Massachusetts Computer Assoc., Inc., March 1976. 
9. R. J. LIPTON, “On Synchronization Primitive Systems,” Ph. D. Thesis, Carnegie-Mellon 
University, 1973, and Research Report No. 22, Yale University, Department of Computer 
Science, October 1973. 
10. R. E. MILLER, Relationships among models of parallelism and synchronization, in “Proceedings, 
Symposium on Petri Nets and Related Methods, July 1975,” to appear. 
11. R. E. MILLER AND C. K. YAP, “Formal Specification and Analysis of Loosely Connected 
Processes,” IBM Research Report RC-6716, September, 1977. 
12. R. E. MILLER AND C. K. YAP, On formulating simultaneity for studying parallelism and 
synchronization, in “Proceedings, Tenth Annual ACM Symposium on Theory of Computing, 
May 1, 1978,” pp. 105-113. 
13. G. L. PETERSON AND M. J. FISCHER, Economical solutions for the critical section problem in a
distributed system, extended abstract, in “Proceedings, Ninth Annual ACM Symposium on 
Theory of Computing, May 1977,” pp. 91-97. 
14. R. L. RIVEST AND V. R. PRATT, The mutual exclusion problem for unreliable processes:
Preliminary report, in “Proceedings, 17th Annual IEEE Symposium on Foundations of
Computer Science, October 1976,” pp. 1-8.
15. C. K. YAP, On abstract synchronization problems and synchronization systems, unpublished 
manuscript, 1976. 
16. P. ZAVE, On the formal definition of processes, in “Proceedings, International Conference on 
Parallel Processing, 1976.” 
17. P. ZAVE AND D. R. FITZWATER, “Specification of Asynchronous Interactions Using Primitive
Functions,” Technical Report, Dept. of Computer Science, University of Maryland, 1977. 
