Proving Sequential Consistency by Model Checking
Tim Braun†, Anne Condon§, Alan J. Hu§, Kai S. Juse†, Marius Laza§, Michael Leslie‡, and Rita Sharma§
†Dept. of Computer Science, Technical University of Darmstadt
‡Dept. of Electrical and Computer Engineering, Univ. of British Columbia
§Dept. of Computer Science, Univ. of British Columbia
(condon,ajh)@cs.ubc.ca, http://www.cs.ubc.ca/spider/ajh
Abstract
Sequential consistency is a multiprocessor memory model of both
practical and theoretical importance. Unfortunately, the general
problem of verifying that a finite-state protocol implements
sequential consistency is undecidable, and in practice, validating
that a real-world, finite-state protocol implements sequential
consistency is very time-consuming and costly. In this work, we show
that for memory protocols that occur in practice, a small amount of
manual effort can reduce the problem of verifying sequential
consistency to a verification task that can be discharged
automatically via model checking. Furthermore, we present
experimental results on a substantial, directory-based cache
coherence protocol, which demonstrate the practicality of our
approach.
1 Introduction
When multiple processors or other devices access a
shared memory, the correct behavior is often unclear. The
most intuitive model is that each memory operation hap-
pens instantaneously in real-time, so that each load returns
the value of the most recent store to the same memory lo-
cation. Such a model, however, produces contention, ar-
bitration, and serialization of memory accesses, which is
unneeded in many applications and limits performance. At
the other extreme, a system in which all processors could
freely read and write inconsistent or stale values would be
nearly impossible to program. A memory model is a specification
of the desired behavior of the memory system from
the programmer's point of view.
Sequential consistency is a multiprocessor memory model introduced
by Lamport [16]. A memory system is sequentially consistent iff there
always exists an interleaving of the program orders of all the
processors such that each load returns the value of the most recent
store to the same address. Intuitively, this means that processors
can get out of sync with each other, perhaps because of caching and
buffers, but that the same results could have been achieved had the
executions of the processors been interleaved in some other way.
Sequential consistency is important both as a practical memory model
that provides ease of programming while allowing efficient hardware
optimizations (e.g. [12]) and also as an extensively studied memory
model that can be used to understand other, more relaxed models
(e.g. [1]).

* This work was supported in part by research grants from the Natural
Sciences and Engineering Research Council of Canada.

Model checking [7] has emerged as the dominant paradigm for formally
verifying temporal properties of computer system designs. One of the
most successful application domains for model checking has been
multiprocessor cache coherence protocols (e.g., [17, 10, 6, 8, 15,
23, 24, 3, 20, 14] are some early works). The application domain is
commercially very important, since almost all high-end servers are
now cache-coherent multiprocessors; the protocols are tricky, highly
concurrent, and hence bug-prone; and the protocols can be modeled in
finite state, naturally supporting model checking.
An important veriﬁcation task is to check whether a
memory system implements a speciﬁed memory model.
Unfortunately, the general problem of determining whether
a ﬁnite-state protocol implements sequential consistency
is undecidable [2], so directly attacking this problem via
model checking is impossible. More subtle approaches are
needed.
In this paper, we present experimental results on a
methodology for proving sequential consistency using
model checking. In contrast, previous experimental results
on the use of model checking for reasoning about the correctness
of cache coherence protocols either prove much weaker properties,
or work only on a very restricted class of protocols that do not
incorporate the subtle optimizations that are prone to introduce
errors in real protocols. Recent
theoretical results [9, 21] show that model-checking can in
principle be applied to proving sequential consistency of
sophisticated, real-world protocols, but provide no experi-
mental results.
In our methodology, the protocol being veriﬁed is aug-
mented with additional (ﬁnite-state) bookkeeping informa-
tion. We call the augmented protocol the observer. A
ﬁnite-state checker examines runs of the observer and cer-
tiﬁes that a run is indeed sequentially consistent. Model
checking the entire ﬁnite-state system determines whether
the checker will certify all possible runs, proving sequen-
tial consistency of the protocol. In Section 2, we describe a
method for creating observers, along with the correspond-
ing checker. We then present experimental results in using
our methodology to prove sequential consistency of a substantial
directory-based cache coherence protocol. The presentation
here is necessarily brief; details can be found in
our technical report [4].
1.1 Related Work
There has been considerable work over the years on verifying
memory system protocols and memory models. For
brevity, we mention here only closely related work.
The use of an observer, or witness, to aid in reasoning
about the correctness of a protocol, is an old idea. Our
use of an observer was inspired by the work of Plakal et
al. [19], who introduce a veriﬁcation approach based on
logical clocks and apply it to a directory-based protocol.
In contrast to logical clocks, which are unbounded, our observer
is finite state, making it possible to use it as part of a
model-checking approach.
Henzinger et al. [11] propose a very similar approach to ours, using
a finite-state observer to reorder loads and stores to construct a
witness of sequential consistency. Because of the finite-state limit
on reordering, the method is too restrictive to handle the types of
optimizations typically found in real protocols, such as the
optimization due to Scheurich that we describe in Section 3. We note
that Henzinger et al. prove very strong results for protocols in
their restrictive class, namely that verifying a protocol with
arbitrarily large parameters (number of processors, number of blocks,
number of values per block) reduces to a fixed-parameter problem. In
contrast, our method applies to verification of only fixed-parameter
protocols.
Nalumasu et al. [18] propose the Test Model-Checking technique, in
which a protocol is checked against various predefined finite-state
automata that test certain memory model properties. These tests can
be considered to be finite-state observers. By combining these tests,
it is possible to verify memory models that are close to, but not
identical to, sequential consistency.
Two recent works describe a model-checking approach
for automatically proving sequential consistency of real-
world protocols [21, 9]. However, no experimental results
are presented. This paper describes experimental results
for a variation of one of these approaches [9]. In the cur-
rent paper, the observer is constructed manually, unlike the
automatic construction of [9], but is designed to have sig-
niﬁcantly fewer states in order to mitigate the state space
explosion problem of model checking. Our results show
that approaches of this type are indeed feasible for realistic
protocols.
2 The Veriﬁcation Method
In what follows, a protocol is a finite-state machine parameterized
by the number of processors p, memory blocks b, and possible values
per memory block v. Among the possible actions (transitions) of the
protocol are load (LD) and store (ST) actions, which indicate the
processor, the address (memory block number), and the value loaded or
stored. A protocol run is a sequence of protocol actions that lead
from state to state, starting with the initial state of the protocol.
A protocol trace is the subsequence of a protocol run that includes
exactly the ST and LD operations of the run. A serial trace is one in
which each load returns the value of the most recent (prior to the
load) store to the same block (or some initial value, if there were
no prior stores to that block). A protocol is sequentially consistent
if every one of its traces has a reordering that respects the
per-processor ordering of the trace, and is serial.
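
The definitions of serial trace and sequential consistency above can be made concrete with a small brute-force sketch. It is only an illustration for a single, very short trace, not the verification method of this paper: it enumerates every interleaving that preserves per-processor order, which is exponential. The tuple encoding of operations, the function names, and the initial value 0 are assumptions made for the example.

```python
# Illustrative sketch only. An operation is a tuple (op, proc, block, value)
# with op in {"LD", "ST"}; this encoding and the initial value are assumptions.

def is_serial(trace, initial=0):
    """True if every LD returns the value of the most recent prior ST
    to the same block (or the initial value if there is none)."""
    memory = {}
    for op, _proc, block, value in trace:
        if op == "ST":
            memory[block] = value
        elif memory.get(block, initial) != value:
            return False
    return True

def interleavings(per_proc):
    """Yield every merge of the per-processor sequences that keeps
    each processor's own operations in their original order."""
    if all(not seq for seq in per_proc):
        yield []
        return
    for i, seq in enumerate(per_proc):
        if seq:
            rest = per_proc[:i] + [seq[1:]] + per_proc[i + 1:]
            for tail in interleavings(rest):
                yield [seq[0]] + tail

def is_sequentially_consistent(trace, initial=0):
    """True if some reordering respecting per-processor order is serial."""
    procs = sorted({op[1] for op in trace})
    per_proc = [[op for op in trace if op[1] == p] for p in procs]
    return any(is_serial(t, initial) for t in interleavings(per_proc))

# P2 reads the stale initial value after P1's store in real time:
# not serial as observed, but serial once the LD is ordered first.
stale_read = [("ST", 1, 0, 1), ("LD", 2, 0, 0)]
# P2 sees the new value and then the old one: no legal reordering exists.
backwards = [("ST", 1, 0, 1), ("LD", 2, 0, 1), ("LD", 2, 0, 0)]
print(is_serial(stale_read), is_sequentially_consistent(stale_read))
print(is_sequentially_consistent(backwards))
```

The first trace shows why sequential consistency is weaker than real-time ordering; the second shows a violation that no reordering can repair.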
Our methodology for constructing observers is based on a bookkeeping
structure we call a window. To understand what a window is, we first
note that there are two notions of time associated with a protocol:
real time, and reordered (or logical) time, in which operations and
actions of the protocol are serialized so that every LD gets the
value of the most recent ST. Intuitively, a window summarizes the
overall status of the memory system in reordered time. The window
includes the active STs (i.e. those which may be read by future LDs
of the protocol), their ordering in logical time, and where the most
recent loads have occurred in logical time. A window observer
annotates the original protocol run with windows. A finite-state
checker (described in Section 2.1) can prove that a run is
sequentially consistent by using the windows. Let us now consider
these ideas in more detail.
Definition 1 A window is a sequence of nodes. Nodes can be one of
four different types: delete vectors (DV), logical pointers (LP),
stores (ST), and last load indicators (LL). A delete vector node
contains a vector of b bits, one per memory block. There are exactly
p logical pointers, denoted $LP_1, \ldots, LP_p$, one for each
processor. A store node contains a block number B and a value V,
denoted $ST(B, V)$. There are b last load indicators, denoted
$LL_1, \ldots, LL_b$.
Delete vectors summarize an unbounded sequence of no-longer-relevant
stores into a bounded-size node. This capability is crucial for
handling many real protocols. We will use $DV_{false}$ to denote a
delete vector with all entries set to false.
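
As a rough illustration of Definition 1, the four node types could be represented as immutable records and a window as a tuple of nodes. This is a sketch under assumptions: the class names, the 0-based block numbering, and the helpers `dv_false` and `dv_or` are invented for the example and are not taken from the Murphi model.

```python
# Hypothetical encoding of window nodes; all names here are assumptions.
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class DV:
    bits: Tuple[bool, ...]  # one bit per memory block

@dataclass(frozen=True)
class LP:
    proc: int               # logical pointer of this processor

@dataclass(frozen=True)
class ST:
    block: int
    value: int

@dataclass(frozen=True)
class LL:
    block: int              # last load indicator for this block

Node = Union[DV, LP, ST, LL]

def dv_false(b: int) -> DV:
    """Delete vector for b blocks with every entry false."""
    return DV((False,) * b)

def dv_or(x: DV, y: DV) -> DV:
    """Bitwise OR of two delete vectors of equal length."""
    return DV(tuple(p or q for p, q in zip(x.bits, y.bits)))

# Example window for p = 2 processors and b = 2 blocks: a store of
# value 7 to block 0 sits before both logical pointers.
window = (dv_false(2), ST(0, 7), dv_false(2), LP(1), dv_false(2), LP(2))
print(window)
```

Frozen dataclasses are hashable, so a window built this way can be stored directly in the visited-state set of an explicit-state search.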
Deﬁnition 2 A window observer for a protocol P is a pro-
tocol with actions which are either LD or ST operations,
windows, or a special NULL action. If O is a window ob-
server for protocol P, then the set of traces of O must equal
the set of traces of P.
Intuitively, for real-world protocols, a window observer
may be obtained for a protocol by augmenting the proto-
col to output a window after each protocol action, thereby
annotating the protocol runs with windows, and simplify-
ing the run alphabet of the observer so that actions other
than windows, LD and ST operations are replaced by the
NULL action. The NULL action abstracts away the de-
tailed behavior of the protocol, allowing the use of a uni-
versal checker for all observers.
2.1 Checkers
The checker is a ﬁnite-state machine parameterized by
p, b, and v, just as protocols are. The same (family of)
checker is used for all protocols. The checker examines the
annotated protocol run generated by the window observer.
It always saves a copy of the most recently seen window,
and it checks each subsequent action/window against the
most recently seen window:
Checker Rules
1. Windows must be properly structured. In particu-
lar, DV nodes occur only immediately preceding each
LP node and each ST node. This implies that there
is exactly one DV node between adjacent LP or ST
nodes.
2. Each LD must get its value from the most recent ST. If
$LD(P, B, V)$ (processor P loads value V from block B) is the
protocol action, the checker looks in the most recent window for the
closest ST node to block B preceding logical pointer $LP_P$. This ST
node must have stored value V, and there must be no DV node
indicating deleted ST nodes to block B between the ST node and
$LP_P$. If there is no ST node to block B prior to $LP_P$, then the
LD must return the initial value of block B.
3. A ST cannot be retroactive. Intuitively, we prohibit a ST
operation from occurring at a point in logical time if a LD operation
to the same block has already occurred later in logical time.
Formally, if $ST(P, B, V)$ (processor P stores value V to block B) is
the protocol action, the logical pointer $LP_P$ must be later in the
window than the last load indicator $LL_B$.
4. Consecutive windows are consistent. The checker compares the new
window against the most recent window. (If the new window is the
first window the checker sees, then consistency is checked against a
default initial window that consists of the LP nodes and nothing
else.) First, the checker makes sure that it sees at most one memory
operation (LD or ST) between the most recent window and the new
window. Depending on the intervening memory operation (if any), the
following are possible:
(a) The intervening memory operation was $ST(P, B, V)$, and the only
difference between the windows is that a new ST node $ST(B, V)$ and a
new DV node $DV_{false}$ are inserted immediately before logical
pointer $LP_P$. (The DV node formerly preceding $LP_P$ now precedes
the new ST node.)
(b) The intervening memory operation was $LD(P, B, V)$, and if $LL_B$
(the last load to block B) was before $LP_P$ in the old window, then
$LL_B$ is moved so that it immediately precedes the DV preceding
$LP_P$ in the new window. Otherwise, the window is unchanged.
(c) There were no intervening memory operations, and one logical
pointer has moved forward. Intuitively, a processor is updating its
state to a newer one. The details of this change are tedious, but
basically, the DV preceding the LP that is moving is bitwise ORed
into the closest subsequent DV, the LP is free to move to any
subsequent point immediately following a DV, and a new $DV_{false}$
node is added immediately after the LP's new location.
(d) There were no intervening memory operations, and some ST nodes
have been deleted. Again, the details of this change are tedious.
Basically, a sequence of ST nodes without any LP nodes separating
them can be deleted. Their corresponding DV nodes are bitwise ORed,
and any deleted ST nodes are also marked on the remaining DV node.
If every action and annotation the checker sees is legal, the
checker accepts the run.
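
Rules 2, 3, and 4(a) above can be sketched in a few lines, assuming a window is a list of tuples `("DV", bits)`, `("LP", proc)`, `("ST", block, value)`, and `("LL", block)` with 0-based block numbers. This encoding, the function names, the initial value 0, and the treatment of corner cases the rules leave open (noted in comments) are all assumptions of the sketch, not the checker used in the experiments.

```python
# Sketch of checker rules 2, 3 and 4(a); encoding and names are assumptions.
INITIAL = 0  # assumed initial value of every block

def check_load(window, proc, block, value, initial=INITIAL):
    """Rule 2: walk backwards in logical time from LP_proc."""
    lp = window.index(("LP", proc))
    for node in reversed(window[:lp]):
        if node[0] == "DV" and node[1][block]:
            # A deleted store to this block lies between the candidate
            # ST and LP_proc. (Assumption: also reject when no ST to the
            # block remains at all; the rule does not spell this case out.)
            return False
        if node[0] == "ST" and node[1] == block:
            return node[2] == value
    return value == initial  # no prior ST to this block

def check_store(window, proc, block):
    """Rule 3: LP_proc must be later in the window than LL_block.
    Assumption: a window with no LL node for the block is accepted."""
    lp = window.index(("LP", proc))
    ll = ("LL", block)
    return ll not in window or window.index(ll) < lp

def apply_store(window, proc, block, value, b):
    """Rule 4(a): insert ST(block, value) and a fresh all-false DV
    immediately before LP_proc; the old DV now precedes the new ST."""
    lp = window.index(("LP", proc))
    fresh = ("DV", (False,) * b)
    return window[:lp] + [("ST", block, value), fresh] + window[lp:]

dv = ("DV", (False,))
w0 = [dv, ("LP", 1), dv, ("LP", 2)]           # p = 2, b = 1
w1 = apply_store(w0, proc=1, block=0, value=5, b=1)
print(w1)
print(check_load(w1, proc=2, block=0, value=5))
```

In a full checker, rule 4(a) would be used in the other direction: given the old and new windows, confirm that the new one equals `apply_store` of the old one.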
Combining an observer with the checker allows us to prove sequential
consistency via model checking. If the run of an observer is accepted
by the checker, then that run is sequentially consistent. Since both
the observer and checker are finite-state, we can verify by model
checking that all runs of the observer are sequentially consistent,
which implies that the original protocol is sequentially consistent.
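
To make the last step concrete, the following is a minimal sketch of explicit-state invariant checking by breadth-first search, the general style of search an explicit-state tool performs. It is not Murphi and not the model used in our experiments; the function `model_check` and the toy counter system are hypothetical. In the setting of this paper, a state would combine the protocol state with the most recent window, and the invariant would be that the checker has not rejected.

```python
# Generic explicit-state invariant check; all names are hypothetical.
from collections import deque

def model_check(initial, successors, invariant):
    """Explore all reachable states breadth-first.
    Returns None if the invariant holds everywhere, otherwise a
    shortest path from the initial state to a violating state."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:   # states must be hashable
                parent[nxt] = state
                queue.append(nxt)
    return None

# Toy system: a counter modulo 4.
step = lambda s: [(s + 1) % 4]
print(model_check(0, step, lambda s: s < 4))   # invariant holds
print(model_check(0, step, lambda s: s != 3))  # counterexample path
```

The size of the `parent` table is the reachable state count, which is the quantity that grows when observer and checker state are added to a protocol model.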
3 Example Protocol
The true test of our methodology requires experimen-
tation. We have developed paper-and-pencil window ob-
servers for three different cache coherence protocols and
selected the most challenging, a directory-based protocol,
for the full model-checking experiment.
The protocol is a variant of one provided by the Mul-
tifacet group from the University of Wisconsin, to which
we have added an optimization due to Scheurich [22], that
allows a processor to continue to read a cache block after
acknowledging an invalidation of that block. With this op-
timization, the protocol should be sequentially consistent,
but not coherent (i.e., processors can continue to use stale
data). The protocol involves several interacting entities —
the processors, a directory, and a network interface — and
is comparable in complexity to commercial directory-based
protocols.
Roughly, processors may have three types of access to
a block, with three corresponding “stable” processor states
per block: M (modify), S (shared), or I (invalid). A processor
may do a ST only when in the M state and may do a
LD only in the S or M states. For each block, at most one
processor is in the M state at any given time. The directory
coordinates access to blocks of memory, and is the default
owner of a block when no processor has Modify access to
that block.
When a processor needs to upgrade from one stable state
to another in order to do a LD or ST operation, the pro-
cessor initiates a transaction and enters a transient state.
Several race conditions may arise, resulting in numerous
transient protocol states to track the possibilities. Here, we
present some illustrative situations. Further examples and
the full protocol description can be found in our technical
report [4].
1. If several processors share a block, and processor P
wants Modify access, then P sends a GETX (Get Exclusive)
message to the directory. The directory returns the value of
the block along with the number of current sharers. The
directory also sends a message to
each sharer asking them to invalidate their copy of the
block and to send an ACK to P once they have done
so. P waits in the transient state IM (Invalid to Mod-
ify) until it gets both the data and all the ACKs before
doing a LD or ST to the block.
2. If one processor Q is owner of a block, and processor
P wants Modify access, then P sends a GETX message
to the directory; the directory forwards this request to
Q and sets P as the new owner of the block. Processor
Q (which is in state M) receives a “forwarded GETX”
message from the directory, sends the data to P and
goes to the I state. Processor P waits in the IM state
until it gets the data from Q.
3. Scheurich's optimization allows a block to continue to be read
after ownership has been released. We add a new cache block state I*,
which indicates that the block has been invalidated, but we are in
optimization mode. The I* state is entered when a processor receives
an invalidate for a block that was in the shared state S, or a
forwarded GETX for a block that was in the exclusive state M. While
in optimization mode, the processor can continue to read the block,
even though the invalidation has been acknowledged. As soon as the
cache receives a request from any other entity, however, optimization
mode ends, and the cache block state changes from I* to I.
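
The stable states and the I* state from item 3 can be summarized as a tiny per-block state machine. This is a deliberately simplified sketch: it omits all transient states (such as IM), the directory, and the network, and the message names `"INV"` and `"FWD_GETX"` as well as the function names are assumptions made for the example.

```python
# Simplified per-block cache states with Scheurich's I* state.
# Transient states are omitted; message names are hypothetical.
from enum import Enum

class CacheState(Enum):
    M = "M"        # modify
    S = "S"        # shared
    I = "I"        # invalid
    I_STAR = "I*"  # invalidated, but still readable (optimization mode)

def can_load(state):
    """A LD is allowed in S or M, and in I* while optimization mode lasts."""
    return state in (CacheState.S, CacheState.M, CacheState.I_STAR)

def can_store(state):
    """A ST is allowed only in M."""
    return state is CacheState.M

def on_message(state, message):
    """Next stable state after the cache receives a message."""
    if state is CacheState.I_STAR:
        return CacheState.I          # any request ends optimization mode
    if state is CacheState.S and message == "INV":
        return CacheState.I_STAR     # invalidate acknowledged, keep reading
    if state is CacheState.M and message == "FWD_GETX":
        return CacheState.I_STAR     # data sent to the new owner
    return state

state = on_message(CacheState.S, "INV")
print(state, can_load(state), can_store(state))
print(on_message(state, "INV"))
```

The interesting case for sequential consistency is a LD in I*: the value read is stale in real time, so the observer must be able to place that LD earlier in logical time than the store that invalidated the block.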
The window observer for the directory protocol be-
haves just like the protocol itself, with the main differ-
ence being that the observer updates and outputs a win-
dow, while executing the protocol. Brieﬂy, a window can
be changed in three ways: addition of a ST node, moving a
logical pointer node, or deletion of a ST node:
• Initially, the observer outputs a window containing just the p LP
nodes, in any order.
• Each time a processor or the directory sends data to another
processor, if the sender's LP is later than the recipient's LP, then
the recipient's LP gets moved immediately after the sender's LP.
Intuitively, when the recipient receives the data, it must have moved
forward in time at least to pass the sender. We found it convenient
to introduce a LP node for the directory. This is purely an
implementation detail that makes it easy to determine where to
advance the processors' LP nodes in certain cases.
• Upon a ST operation by processor P, a new ST node is created in the
window and is placed just before $LP_P$. Upon a LD operation, the
observer makes no changes to the window.
• To keep the window size finite, the observer deletes those ST nodes
which will never be read in the future: for each pair of successive
LP nodes, for each block B, the observer deletes all but the latest
ST node to block B between the two LP nodes. Also, for each block B,
all but the last ST node to block B before the earliest LP node are
deleted.

4 Experimental Results
We chose the Murphi veriﬁcation system [10] for our
experiments, mainly for ease-of-use and out of familiarity,
and also because Murphi has proven successful for many
cache protocol veriﬁcation efforts. Modeling cache pro-
tocols in Murphi is routine [14], and many examples are
available as part of the standard Murphi distribution. The
main downside is that Murphi does not use symbolic model
checking [5], precluding one of the most powerful tech-
niques for combatting state explosion.
We started by verifying basic correctness properties of
the protocol itself. Proving sequential consistency should
wait until after the protocol is debugged. Not surprisingly,
we discovered several minor bugs and one subtle bug (with
an error trace requiring 10 network messages) in the initial
protocol. This ﬁrst phase of the project corresponds to a
typical cache protocol formal veriﬁcation effort.
After ﬁxing these bugs, we proceeded to add the
observer and checker to the model. Adding the ob-
server/checker consisted of adding a variable to store the
most recently seen window, and then weaving additional
actions to manipulate this variable into the rules that im-
plement the protocol. Whenever the window is updated
or a load or store is performed, the checker is invoked to
make sure the action was legal. No DV nodes were needed
for this protocol, so we omitted them. (DV nodes track
deleted ST nodes to prevent an LP node from jumping be-
tween a ST and the subsequent LP, where a LD may have
executed. In this protocol, LP nodes always jump to a po-
sition immediately following another LP node, so the prob-
lem does not arise.) Model checking uncovered several
bugs in the combined protocol/observer/checker, including
one serious protocol bug, involving staying in optimization
mode in a situation when it should have been canceled. This
bug had eluded our earlier model-checking without the ob-
server/checker. Eventually, we were able to debug the ob-
server/checker as well, proving the protocol sequentially
consistent.
The total effort was three students, as a class project,
working part-time, for approximately two months. In other
words, the total effort was comparable to that required to
model check only simple correctness properties, but the
result is much stronger. Adding the observer and checker was
neither easy, nor extremely difﬁcult. The complexity was
much like handling a somewhat more sophisticated proto-
col.
The other practical concern is state explosion. Table 1
shows run times and reachable state counts for the proto-
col with and without the observer/checker. As can be seen,
the observer/checker adds a substantial amount of state, but
the blow-up isn’t outrageous. Again, the results with ob-
server/checker are comparable to what one would expect
if verifying a somewhat more complex protocol without
observer/checker. Additional work is needed on reducing
state explosion, but the results show that our method is
clearly on the edge of feasibility for realistic protocols.
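
The state-count blow-up can be read off Table 1 directly. The short sketch below computes, for each model size where both runs completed, the ratio of reached states with the observer/checker to reached states without it. The numbers are copied from Table 1; the names `TABLE_1` and `blowup_factors` are introduced only for this illustration.

```python
# Reached-state counts from Table 1, keyed by (p, b, v):
# (without observer/checker, with observer/checker).
# Rows where a run ran out of space are left out.
TABLE_1 = {
    (1, 1, 1): (41, 82),
    (2, 1, 1): (2272, 8738),
    (2, 1, 2): (7628, 37317),
    (3, 1, 1): (98083, 742984),
    (2, 2, 1): (641157, 71242781),
    (3, 1, 2): (754577, 7287108),
}

def blowup_factors(table):
    """Ratio of reached states with vs. without the observer/checker."""
    return {size: with_oc / without
            for size, (without, with_oc) in table.items()}

factors = blowup_factors(TABLE_1)
for (p, b, v), factor in factors.items():
    print(f"p={p} b={b} v={v}: {factor:.1f}x more states")
```

On these figures the factor stays below 10 for every single-block configuration and exceeds 100 for the two-block configuration (p=2, b=2, v=1), which suggests that the number of memory blocks drives the window's contribution to the state space more than the number of processors or values does.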
5 Conclusion and Future Work
We have presented experimental results on a methodol-
ogy for proving sequential consistency of memory proto-
cols by using model checking. Our experiments indicate
that the method is indeed feasible in practice, although ad-
ditional research to reduce state explosion is needed.
The main directions to reduce state explosion are to try
symbolic model checking and related techniques, and to
search for domain-speciﬁc reductions. For example, the
state of the window is likely to be highly determined by
the state of the protocol, suggesting that techniques like
functionally dependent variables [13] may be very helpful.
Another possibility is to partition the checker into several
smaller sub-checkers, each using only part of the window, that
can be model-checked separately, thereby substantially reducing
the state space.
Acknowledgment
We would like to thank the Multifacet Group at the Uni-
versity of Wisconsin for providing us with preliminary ver-
sions of several memory system protocols as well as an-
swering our questions about them.
References
[1] S. V. Adve and K. Gharachorloo. Shared memory consistency mod-
els: A tutorial. IEEE Computer, pages 66–76, December 1996.
[2] R. Alur, K. McMillan, and D. Peled. Model-checking of correctness
conditions for concurrent objects. In Eleventh Symposium on Logic
in Computer Science, pages 219–228. IEEE, 1996.
[3] Ásgeir Th. Eiríksson and K. L. McMillan. Using formal
verification/analysis methods on the critical path in system design:
A case study. In P. Wolper, editor, Computer-Aided Verification:
Seventh International Conference, pages 367–380. Springer-Verlag,
July 1995. Lecture Notes in Computer Science Number 939.
[4] T. Braun, A. E. Condon, A. J. Hu, K. S. Juse, M. Laza, M. Leslie,
and R. Sharma. Proving sequential consistency by model check-
ing. Technical Report TR-2001-03, Department of Computer Sci-
ence, University of British Columbia, April 2001.
[5] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J.
Hwang. Symbolic model checking: $10^{20}$ states and beyond. In
Conference on Logic in Computer Science, pages 428–439, 1990. An
extended version of this paper appeared in Information and
Computation, Vol. 98, No. 2, June 1992.
[6] E. Clarke, O. Grumberg, H. Hiraishi, S. Jha, D. Long, K. McMil-
lan, and L. Ness. Veriﬁcation of the Futurebus+ cache coherence
protocol. Technical Report CMU-CS-92-206, Carnegie Mellon
University, October 1992.

Model Size    w/o Observer/Checker          w/ Observer/Checker
p  b  v       Reached States   Run Time     Reached States   Run Time
1  1  1                   41      19.5s                 82      19.5s
2  1  1                2,272      20.5s              8,738      22.8s
2  1  2                7,628      22.0s             37,317      34.1s
3  1  1               98,083      93.2s            742,984     568.0s
2  2  1              641,157     417.0s         71,242,781   47711.5s
3  1  2              754,577     636.2s          7,287,108    5741.0s
2  2  2            9,413,564    7091.6s          space out
2  3  2            space out

Table 1. Summary of protocol runs with and without the window/checker.
p is the number of processors, b is the number of memory blocks
(addresses), and v is the number of values. Experiments were run on a
300 MHz Sun Ultra-60 with 2 GB main memory. The state-space blow-up is
moderate, showing that our method is at the edge of what is currently
feasible for realistic protocols.

[7] E. M. Clarke and E. A. Emerson. Design and synthesis of
synchronization skeletons using branching time temporal logic. In
D. Kozen, editor, Workshop on Logics of Programs, pages 52–71, May
1981. Published 1982 as Lecture Notes in Computer Science Number 131.
[8] E. M. Clarke, O. Grumberg, H. Hiraishi, S. Jha, D. E. Long, K. L.
McMillan, and L. A. Ness. Veriﬁcation of the Futurebus+ cache
coherence protocol. In L. Claesen, editor, 11th International Sym-
posium on Computer Hardware Description Languages and their
Applications. North-Holland, 1993.
[9] A. E. Condon and A. J. Hu. Automatable veriﬁcation of sequential
consistency. In 13th Symposium on Parallel Algorithms and Archi-
tectures, pages 113–121. ACM, 2001.
[10] D. L. Dill, A. J. Drexler, A. J. Hu, and C. H. Yang. Protocol ver-
iﬁcation as a hardware design aid. In International Conference on
Computer Design. IEEE, October 1992.
[11] T. A. Henzinger, S. Qadeer, and S. K. Rajamani. Verifying se-
quential consistency on shared-memory multiprocessor systems. In
Computer-Aided Veriﬁcation: 11th International Conference, pages
301–315. Springer, 1999. Lecture Notes in Computer Science
Vol. 1633.
[12] M. D. Hill. Multiprocessors should support simple memory-
consistency models. IEEE Computer, pages 28–34, August 1998.
[13] A. J. Hu and D. L. Dill. Reducing BDD size by exploiting functional
dependencies. In 30th Design Automation Conference, pages 266–
271. ACM/IEEE, 1993.
[14] A. J. Hu, M. Fujita, and C. Wilson. Formal veriﬁcation of the HAL
S1 system cache coherence protocol. In International Conference on
Computer Design, pages 438–444. IEEE, 1997.
[15] C. N. Ip and D. L. Dill. Efﬁcient veriﬁcation of symmetric con-
current systems. In International Conference on Computer Design,
pages 230–234. IEEE, October 1993.
[16] L. Lamport. How to make a multiprocessor computer that correctly
executes multiprocess programs. IEEE Transactions on Computers,
C-28(9):690–691, September 1979.
[17] K.L. McMillan and J. Schwalbe. Formal veriﬁcation of the Gigamax
cache-consistency protocol. In International Symposium on Shared
Memory Multiprocessing, pages 242–251. Information Processing
Society of Japan, 1991.
[18] R. Nalumasu, R. Ghughal, A. Mokkedem, and G. Gopalakrishnan.
The ‘test model-checking’ approach to the veriﬁcation of formal
memory models of multiprocessors. In Computer-Aided Veriﬁca-
tion: 10th International Conference, pages 464–476. Springer, 1998.
Lecture Notes in Computer Science Vol. 1427.
[19] M. Plakal, D. Sorin, A. Condon, and M. Hill. Lamport Clocks: Ver-
ifying a directory cache coherence protocol. In Symposium on Par-
allel Algorithms and Architectures, pages 67–76, 1998.
[20] F. Pong, A. Nowatzyk, G. Aybay, and M. Dubois. Verifying dis-
tributed directory-based cache coherence protocols: S3.mp, a case
study. In International Conference on Parallel Processing, EuroPar
’95, August 1995.
[21] S. Qadeer. On the veriﬁcation of memory models of shared-memory
multiprocessors. In Workshop on Formal Speciﬁcation and Veriﬁ-
cation Methods for Shared Memory Systems. Unpublished Proceed-
ings, October 31, 2000. Workshop afﬁliated with FMCAD 2000,
Austin, TX.
[22] C. Scheurich. Access Ordering and Coherence in Shared Memory
Multiprocessors. PhD thesis, University of Southern California, May
1989. Published as USC Tech Report CENG 89-19.
[23] U. Stern and D. L. Dill. Automatic veriﬁcation of the SCI cache
coherence protocol. In Correct Hardware Design and Veriﬁcation
Methods, CHARME ’95, pages 21–34. IFIP WG 10.5 Advanced Re-
search Working Conference, 1995.
[24] L. Yang, D. Gao, J. Mostouﬁ, R. Joshi, and P. Loewenstein. System
design methodology of UltraSPARC-I. In 32nd Design Automation
Conference, pages 7–12. ACM/IEEE, 1995.