University of Pennsylvania

ScholarlyCommons
Departmental Papers (CIS)

Department of Computer & Information Science

2012

Protocol Design With Concolic Snippets
Rajeev Alur
University of Pennsylvania, alur@cis.upenn.edu

Jyotirmoy Deshmukh
University of Pennsylvania

Sela Mador-Haim
University of Pennsylvania

Milo Martin
University of Pennsylvania, milom@cis.upenn.edu

Arun Raghavan
University of Pennsylvania

See next page for additional authors

Follow this and additional works at: https://repository.upenn.edu/cis_papers
Part of the Computer Sciences Commons

Recommended Citation
Rajeev Alur, Jyotirmoy Deshmukh, Sela Mador-Haim, Milo Martin, Arun Raghavan, and Abhishek Udupa,
"Protocol Design With Concolic Snippets", . January 2012.

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/cis_papers/571
For more information, please contact repository@pobox.upenn.edu.

Protocol Design With Concolic Snippets
Abstract
With the maturing of computer-aided verification technology, there is an emerging opportunity to develop
design tools that can transform the way systems are designed. In this paper, we propose a new way to
specify protocols using concolic snippets, that is, sample execution fragments that contain both concrete
and symbolic values. While the purely symbolic extreme is simply an alternative representation of the
traditional communicating extended finite-state-machines, and the purely concrete extreme is an
instantiation of the "programming by examples" paradigm, our specification language allows the designer
to specify the desired protocol using a mixture of symbolic state machines and concrete scenarios. Our
synthesis engine generalizes the snippets into a transition function, which is then analyzed using a model
checker with respect to high-level temporal-logic correctness requirements. We describe a prototype
implementation for design of cache coherence protocols built using (1) a straightforward enumeration of
all expressions for transition functions, (2) a check for consistency with respect to concolic snippets
using the SMT solver CVC3, and (3) a check for correctness using the model checker Murø. We discuss
our experience in designing classical cache coherence protocols using the proposed methodology.

Disciplines
Computer Sciences

Author(s)
Rajeev Alur, Jyotirmoy Deshmukh, Sela Mador-Haim, Milo Martin, Arun Raghavan, and Abhishek Udupa

This working paper is available at ScholarlyCommons: https://repository.upenn.edu/cis_papers/571

Protocol Design with Concolic Snippets
Rajeev Alur, Jyotirmoy V. Deshmukh, Sela Mador-Haim,
Milo M. K. Martin, Arun Raghavan, and Abhishek Udupa
University of Pennsylvania, Philadelphia PA 19104, USA

Abstract. With the maturing of computer-aided verification technology, there is an emerging opportunity to develop design tools that can
transform the way systems are designed. In this paper, we propose a new
way to specify protocols using concolic snippets, that is, sample execution fragments that contain both concrete and symbolic values. While
the purely symbolic extreme is simply an alternative representation of
the traditional communicating extended finite-state-machines, and the
purely concrete extreme is an instantiation of the “programming by examples” paradigm, our specification language allows the designer to specify the desired protocol using a mixture of symbolic state machines and
concrete scenarios. Our synthesis engine generalizes the snippets into a
transition function, which is then analyzed using a model checker with respect to high-level temporal-logic correctness requirements. We describe
a prototype implementation for design of cache coherence protocols built
using (1) a straightforward enumeration of all expressions for transition
functions, (2) a check for consistency with respect to concolic snippets
using the SMT solver CVC3, and (3) a check for correctness using the
model checker Murφ. We discuss our experience in designing classical
cache coherence protocols using the proposed methodology.

1

Introduction

Over the last two decades, computer-aided verification technology has matured,
and has witnessed growing adoption in industry. However, verification tools have
been confined to discovering bugs in systems that have already been designed.
This raises the question: how can we leverage the advances in verification tools by
interacting with the designer and assisting the production of a “correct” design?
One approach in this direction is based on synthesis, where a synthesis tool maps
a declarative specification to an executable implementation [3, 17, 24]. Another
approach is based on programming by examples, where a synthesis tool maps a set
of example input/output traces to either an executable finite-state machine [9,
10], or a functional program [13] operating on strings [8]. In both approaches, the
designer is still required to express the complete logic of the system in the chosen
level of abstraction. Further, while the input formats such as logical formulas
and examples differ from traditional imperative programming languages, there
is little compelling evidence that these alternative formats reduce the effort for
getting the tricky details “right.”

A perhaps more promising approach is the recent work on sketching, where
a programmer writes a partial program with incomplete details. A synthesizer
then fills in missing details using user-specified assertions as correctness specifications [19–21]. A key methodological innovation in sketching is that it allows
the programmer to mix imperative and assertional styles. Therefore, high-level
invariants can be used to find intricate details necessary for fine-tuning code,
without needing the programmer to abandon familiar programming styles.
While sketching has been demonstrated for functional programs [19, 20],
we extend the view of the synthesis tool as an integrator of multiple formats
to develop a new approach for designing reactive systems, such as distributed
protocols, focusing on cache coherence protocols in this paper. Such protocols are
interesting candidates for synthesis because (a) they are challenging to design
correctly due to corner cases arising from asynchronous concurrency [1, 25],
with bugs being reported even in published protocols [12], (b) they possess a
well-defined structure with relatively few variables and small bounds for values,
allowing for well-scoped and tractable definition of the synthesis problem, and
(c) previous work has been successful on using model checkers to find bugs in
such protocols [4, 7], and was in fact instrumental for adoption of model checking
by the electronic design automation industry.
In our proposed method, a designer describes a distributed protocol using
high-level temporal requirements and concolic snippets, where a concolic snippet
is a fragment of desired execution that contains conditional updates to variables
using both concrete and symbolic values (the term “concolic” was coined in the
context of testing programs using both concrete and symbolic inputs [18]). The
synthesis tool then generates a complete protocol, in the form of a set of communicating extended finite-state machines (efsms), such that the synthesized
description is consistent with the snippets. The synthesized protocol is further
checked against temporal requirements; if any of these requirements are violated, the resulting counterexample traces are used by the designer to supply
additional snippets. Every efsm transition, usually described using a guard and
update action, can be viewed as a symbolic snippet; thus, a traditional protocol
description using efsms can be directly input in our format.
When all snippets use only concrete values, our methodology is an instantiation of the “programming by examples” approach, where the synthesis tool
assists the designer by generalizing from concrete examples to symbolic expressions for guarded updates of transitions, with the model-checker in the design
loop. However, it offers flexibility to mix the two styles. Use of symbolic expressions can reduce the number of examples required for suitable generalization,
and also ensures that we do not make the design process more cumbersome by
insisting on describing every detail with concrete values.
In our prototype implementation, the designer describes the protocol by listing the communication architecture, constituent processes with their control
states and variables, concolic snippets corresponding to efsm transitions, a typed
grammar for allowed expressions in guards and updates, and temporal logic requirements. For each efsm transition, the synthesis tool systematically enumer2

1

S

P0

Home
{P0, P1}

2

fw
Req d
S

3

-

P1
O

P2
I

ReqS

X
S

Dat

a

4

1
1
1
2
3

S

P0 2

S

P02

S

P0

Home
{P0, P1}
Home

{P0, P1}

Home
3
4

4

5

P1
Ofwd

Req

fw
Req d
S

I

{P0, P1, P2}Dat

5

5

{P0, P1, P2}

ck

Unblo

ck
Unblo

ck

-

5

P2
I

...

...

?

if (.)
...

?

?

O

Dat

a

O

O

Req. Data ACK

Req. Data ACK

O
bank 001
moogish
edit save _load

{P0, P1, P2}

Unblo

{P0, P1, P2}

P2
I

Dat
a Unblock

S
a

4

2

+

ReqS

+
S

qS
S Re
P

P1

S
Ofw
Req dReqS
S

{P0, P1}

3

P1
O

M

...

...

if (.)
...

S

?

if (.)
...

?

I

?

?

M

...

...

S

?

I

?

Fig. 1. Flowchart for Design Methodology using Concolic Snippets

ates possible expressions for guards and updates using the specified grammar,
with optimizations to reduce the number of expressions that need to examined.
Our straightforward enumeration mechanism was guided by preliminary investigations on the size of expressions in representative coherence protocols, including
published broadcast-based protocols that are known to have a large number of
states [14, 16]. Checking whether the completed description of a transition is
consistent with all the snippets is formulated as a validity query discharged using the SMT solver CVC3 [2]. When consistent completions for all transitions
are discovered, the resulting protocol is checked with respect to the temporal
requirements using the model checker Murφ [5]. If the check fails, the designer
debugs the reported counterexample with a visualizer, and adds more snippets to
rule out the erroneous behavior. Figure 1 illustrates the proposed methodology.
We evaluated the feasibility of our approach by first specifying protocols
previously defined in SLICC [15], with the help of concolic snippets. We found
that even the largest expression (size 8) was synthesized within tens of seconds
by systematic enumeration. Encouraged by our feasibility results, we performed
two case studies where designers with no experience with coherence protocols
used the prototype tool to design two canonical cache coherence protocols from
textbook illustrations. Both case studies resulted in generating protocols that
were successfully verified by the model checker. We found that the final, correctly
generated protocol required an average of less than 2 snippets per transition.
Further, actually synthesizing the protocols from snippets took less than an
hour, while the entire iterative design process required at most 15 hours of
manual effort for either case study.

2

Background: Protocols as Communicating EFSMs

In this section, we review the traditional design methodology for communication
protocols. The standard description of protocols uses the model of communicat3

M)/
Get
)
?(i, aMsg(d
q
e
R
!Dat
Rsp

I
M)
et
,G
r
i
!(D
eq

Rsp?(j, Ack) ∧
¬allAcks/
acks

IS

I

Busy

/R
St

SM

Rsp?(Dir, DataMsg(d))∧
allAcks/
Unb!(i, UnblockM)
acks := 0
data := d

S

S

II

Busy
Data

/
M)
ck
lo }
nb {j
,U =
(i s
b? er
Un har
S

IM

M

M

(a) The ith Cache Process

(b) Directory Process

Fig. 2. The MSI protocol specified as communicating EFSMs

ing sequential processes, where each process is described as an extended finite
state machine (efsm). For instance, Promela models that serve as inputs to the
Spin model checker [11], the inputs to the Murφ model checker [6] and domain
specific languages such as SLICC [23] have constructs that enable specification
of protocols as communicating efsms. Protocol correctness is verified against
temporal logic requirements by model checking.
2.1

Protocol Design

We explain the basic terminology for communicating efsms using the example
of an invalidation-based directory cache coherence protocol. A cache coherence
protocol ensures that data in private caches of processors in a shared memory
multiprocessor configuration remain consistent; the value read by a load operation must be the value written by the last preceding store operation to that
address. Most commodity server, desktop and even mobile processors today tend
to have multiple cores and commonly implement cache coherence in hardware.
Coherence protocols typically enforce the consistency constraint using permissions — each address in shared memory may have either multiple readers or
a single writer at any given time. Read and write permissions are associated with
every address and are determined by exchanging messages between processes. In
a simple directory-based protocol, a global directory process (D) tracks all processes that cache a memory block (denoted C1 , . . . , Cn ). Processes communicate
using point-to-point messages to maintain coherence.
Communication model. Coherence protocols typically use an asynchronous,

message-passing based model of communication, i.e., processes communicate by
sending messages on channels or networks. Each process description includes
a list of input networks to receive messages, and output networks on which
messages are sent. For example, the input network for the cache process Ci
is Rsp, and its output networks are Req and Unb, while for the directory, Req
4

and Unb are input networks and Rsp is an output network. Note that some
networks can be both input and output networks. A network can be thought of
as a multiset; to send a message process Ci inserts it into the multiset, and if
the destination field for the message is Cj , then Cj will eventually receive this
message. This essentially enables any process to communicate with any other
process.
EFSM description. Figure 2a and Figure 2b show how the cache and directory

processes respectively are specified as efsms. Due to space constraints, we only
specify the labels for the bold edges in the figure. An efsm is specified as a tuple
consisting of the following components:
Control states identify a logical state of the protocol. In Figure 2, the control
states for each cache Ci are M , S , I , I M , S M , I I , and I S , while those for the
directory are M , S , I , Busy, Busy Data. The M , S and I states in the cache are
steady states, and define the read-write permissions; in M state, a cache block
can be read or written, in S , it can be read but not written, while in I , a cache
block can be neither read nor written. It follows that when a cache process is
in M , all other cache processes must remain in I . States such as I S or I M are
transient states and are used when a process wishes to transition between two
steady states to acquire or relinquish permissions.
Process variables are variables local to a process, such as the Sharers variable
for the directory and Acks and Data variables for the cache. Each process variable
has a defined type, e.g., the Sharers variable has the type Set and the Acks
variable has type Int.
Input events either correspond to reception of messages or external events.
E.g., in Figure 2a the transition from I M to I M is triggered by the event
Rsp?(i, Ack), denotes that Ack was received on the Rsp network for Ci , while
the transition from I to I M is triggered by the external store (St) event.
Guards are Boolean predicates over fields of received messages and process
variables. For instance, the receipt of a Data message in the I M state, causes a
transition to M if the allAcks predicate is true and leaves the state unchanged
otherwise.
Output events correspond to sending messages. E.g., as a result of St event
in the I state, process Ci sends a GetM message on its output network Req.
Actions correspond to updates to the process variables. For instance, after
receiving an Ack message in state I M , the process variable Acks is appropriately
decremented. Similarly, upon receiving an UnblockM message in state Busy, the
Sharers variable of the directory process is set to the sender’s process id.
2.2

Protocol Verification

We check the safety property of coherence by assigning read/write permissions
to the control states and maintaining a copy of every stored data value in a
global oracle. We then check if in every readable state the value read is the same
as the value maintained by the oracle. Absence of deadlocks can be viewed as
a safety property: a control state in which none of the guards is enabled should
5

Transition(CurrentState, InputEvent){
[guard] => (NextState, OutMsg1 , ...){
antecedent1 ==> {
var11 = expr11
var12 = expr12
...
}
...
}
}

Fig. 3. Structure of a concolic snippet. CurrentState and NextState are the start and end
control states. The snippet specifies zero or
more outbound messages that may be sent conditionally. It also specifies a guard-action block
for each guard containing a set of conditional
updates. The expression antecedenti specifies
the condition (on process variables and the
fields of a received message) under which variables varij are updated with expressions exprij .

be unreachable. In addition to safety invariants, a well-behaved coherence protocol also provides forward progress guarantees such as deadlock and starvation
avoidance to ensure that every request eventually succeeds, or every process in
a transient state eventually returns to a steady state.

3

Protocol Design using Concolic Snippets

Rather than designing a coherence protocol by fully specifying communicating
efsms, we propose an alternative approach based on the observation that it is
often natural to describe a protocol in terms of its behavior in different scenarios.
Figure 1 illustrates the various steps in our approach. The designer specifies a
protocol using a set of snippets derived from scenarios describing the protocol
behavior. In addition to the snippets, the designer also provides contextual declarations of the terms used in the snippets, such as the number of processes and
an enumeration of possible control states.
Our synthesis tool (called Transit) decomposes the problem of synthesizing
a complete efsm description from snippets into smaller sub-problems of: (a)
synthesizing process-variable update expressions (i.e., RHS of assignments), and
(b) synthesizing guards for transitions. The central component of Transit is
an expression enumerator that generates candidate expressions for guards and
updates which are checked for consistency with the snippets by an SMT solver.
The completed protocol (a fully specified set of communicating efsms) synthesized by Transit is then translated to a formal model and verified for coherence and liveness properties by the Murφ model checker. An error in model
checking results in a violating trace, or counterexample. Upon inspecting the
violating trace, the designer can add a positive example that illustrates the correct behavior for the violating trace to the original set to of snippets. The newly
generated protocol completion would therefore no longer result in the earlier incorrect behavior. The entire process is repeated until the model checking phase
verifies the generated protocol to be correct.
3.1

Structure of Protocol Snippets

A protocol snippet describes the behavior of a single transition of a process
in the protocol. In particular, given a specific control state, input event (such
as receiving a message or an external event), and guard (Boolean condition on
6

variables and received message fields), the designer specifies the actions executed
by the protocol, such as updating the process variables, sending messages, and
finally, change in control state. Unlike an efsm, where a transition is specified
symbolically, i.e., the input event, the guard, and actions are fully specified,
in a snippet, we allow a combination of concrete, symbolic, and in some cases
unspecified terms. A concolic snippet is formally specified using the template
shown in Figure 3.
3.2

Specifying Snippets

In any snippet, CurrentState, InputEvent, NextState, and OutMsg are concretely
specified. In a concrete snippet, antecedents and update expressions are also
specified concretely. A concrete snippet therefore constitutes a single transition
in the actual execution of the protocol. Figure 4b shows a concrete protocol
snippet describing the expected actions of the directory in control state S , when
it receives an input request (GetM) from the cache process 1 (illustrated in Figure 4a). The designer concretely specifies the fields of the outgoing response
message (RspMsg), indicating the number of Ack messages the requester must
expect. Similarly, the designer specifies that invalidation messages must be sent
to cache processes 2 and 3.
The goal of the synthesis tool is to generate expressions for these fields in the
outgoing messages from the given snippets. Figure 4c shows the corresponding
expressions generated by Transit. The key difference between the generated
expressions and the expressions in the snippet is that the generated expressions
are symbolic. As the synthesis tool accepts symbolic input snippets as well, the
generated expression is itself a valid input snippet. In the limit, a fully specified
protocol can be expressed as a set of symbolic snippets, where each snippet fully
specifies a transition corresponding to an arc in Figure 2. This observation allows for incremental protocol design by iteratively adding snippets to previously
generated protocols, or the snippets that were used to generate them. We define
the term concolic snippet for snippets in which expressions can contain a mix of
concrete and symbolic expressions.
To illustrate the iterative refinement process, we continue with the example
in Figure 4. We find that the protocol completion using the generated expressions results in a deadlock when verified with Murφ. The counterexample from
the resulting execution indicates a scenario where the requesting cache process
itself was a member of the directory’s sharer set (called an upgrade miss). The
conventional efsm approach to fixing such a bug would require the designer to
reason about all the expressions that are affected by such a scenario and deduce
the changes required to fix them symbolically. Instead, with our approach, the
designer adds an example that indicates the desired behavior in this scenario.
Figure 5a and Figure 5b show the example case and the corresponding snippet
respectively. When Transit is invoked with this additional snippet, it generates
the new expressions shown in Figure 5c. The protocol completion obtained with
these newly generated expressions is successfully verified by Murφ.
7

1

2

3

Dir

I
Ge

IM

tM

S
{2, 3}

2

=
cks
,A
a
t
a

D

v
In

Busy

v
In {2, 3}

(a) scenario

Transition(S , ReqMsg){
[] => (Busy , RspMsg, InvMsg){
(ReqMsg.Sender=1 ∧
ReqMsg.Type=GetM ∧
Sharers = {2, 3}) ==>{
RspMsg.Acks=2;
...
InvMsg.Dest={2, 3};
}
}
}

(b) snippet

RspMsg.Acks := |Sharers|;

InvMsg.Dest := Sharers;

(c) generated expression

Fig. 4. Figure 4b shows the example snippet given by the designer for the Directory process
transition corresponding to the scenario shown in Figure 4a. Figure 4c shows the expressions
generated by the synthesis tool for two message fields.

1

2

3

Dir

S
IM

Ge

tM

1
ks=
Ac
,
a
t
Da
v
In

S
{1, 2}

Busy
{1, 2}

(a) scenario

Transition(S , ReqMsg){
[] => (Busy , RspMsg, InvMsg){
(ReqMsg.Sender=1 ∧
ReqMsg.Type=GetM ∧
Sharers = {1, 2}) ==>{
RspMsg.Acks=1;
...
InvMsg.Dest={2};
}
}
}

(b) snippet

RspMsg.Acks :=
|Sharers∪Req.Sender|-1;

InvMsg.Dest :=
Sharers \ Req.Sender;

(c) generated expression

Fig. 5. Figure 5a and Figure 5b show an additional snippet illustrating the expected behavior
for a counterexample corresponding to a deadlock resulting from the generated expression in
Figure 4c. The newly generated expression Figure 5c eliminates the deadlock.

Problem Definition. In order to generate a protocol completion, i.e., the symbolic snippets which are consistent with the concolic snippets provided by the
designer, we need to solve two sub-problems:
– Infer the symbolic expressions e0i for the RHS of each update expression.
– If a guard g is concolic or unspecified, then infer the guard g from the
antecedents provided by the designer.
The above problem statements imply that the generated expressions are consistent with the concolic examples for updates and their respective antecedents
specified in every input snippet. As input snippets can be either concrete or
symbolic in nature, we employ an SMT solver to check this implication.

4

Implementation of the Design Tool

To solve the problems described in § 3.2, our implementation of Transit is based
on generating candidate expressions and checking if they are compatible with
the given concolic snippets. The tool systematically enumerates expressions of
increasing sizes based on the types of variables and constants used in the given
8

snippets, and the operators on these types that are defined in our expression
grammar.
4.1

Expression Grammar

An expression grammar G is defined as the tuple (F, V, C) over a set of types
T , where F is a set of typed function symbols with specified arities, V is a
set of typed variable symbols and C is a fixed set of typed constant symbols.
Let e1 , . . . , ek be expressions, let fk ∈ F be a function symbol of arity k, let
v ∈ V be a variable, and c ∈ C be a constant symbol. Further, assume that
the types of the expressions e1 , . . . , ek are respectively the same as the types
of the arguments of fk . A well-formed expression e is then inductively defined
as: e ::= fk (e1 , . . . , ek ) | v | c. The size of an expression is the number of symbols
belonging to F, V or C appearing in the expression. For example, consider the
expression grammar G with T = {Int, Set}, F = {+, ∪, SetOf}, V = {x, y},
and C = {0}. G includes the expressions SetOf(+(+(x, x), y)) (of size 6) and
∪(SetOf(0), SetOf(+(x, x)) (of size 7). We use Gi to denote a restriction of the
expression grammar G that only contains expressions that have a size exactly
equal to i.
To synthesize expressions for cache-coherence protocols, we found that we
needed the following set of types:
T = {Int, Set, Bool, PID, Address, Value} ∪ EnumTypes
Int and Bool have the usual meaning of integer and Boolean types. Each efsm in
a protocol has a unique identifier, and the type PID specifies its type. Any set of
PID types belongs to the Set type. The Address type represents addresses in the
memory and the Value type represents values stored in the memory. EnumTypes
represents the set of user-defined enumerated types, primarily used to specify
message types.
4.2

Synthesizing Update Expressions

The goal of the synthesis tool is to compute expressions for the variables in
guard-action blocks as shown in Figure 3. We assume the standard parallel assignment model (static single assignment, or SSA form) so that expressions for
each variable, renamed as primed variables, can be synthesized independently.
For each output variable v, if type(v) ∈ T denotes the type of v, then each
concolic snippet specifies a (conditional) concolic assignment to the variable v
of the form: [aj =⇒ v = ej ], where aj is the antecedent condition, and ej is a
concolic expression. Assume that G is an expression grammar where F and C
are user-specified, and Vin is the set of all unprimed process-variables and fields
of the received message. The synthesis algorithm systematically enumerates expressions from each Gi (starting from i = 1), up to some user-specified bound.
For each enumerated expression g, we check if g is a valid generalization of the
concolic assignments. Concretely, we check the validity of the formula:
9

n
^

((aj ∧ (v = g)) =⇒ (v = ej ))

j=1

If the formula is valid, then the expression g is the smallest expression consistent with each of the concolic RHSs for v. This process is repeated for all the
variables v ∈ Vout and for every guard-action.
To illustrate the discussion presented above, consider the example shown in
Figure 4b. Suppose that we desire to check if the expression |Sharers|, fits the
examples for the update to RspMsg.Acks, then the proposition to be checked for
validity is



Sharers = {2, 3} ∧
 ReqMsg.Sender = 1 ∧  ∧ (RspMsg.Acks = |Sharers|) =⇒ RspMsg.Acks = 2
ReqMsg.Type = GetM

It can be verified that this proposition is indeed valid given that the value of
Sharers is constrained in the antecedent. Thus, this expression fits the examples
provided for the update to RspMsg.Acks.
Now, when the additional example in shown in Figure 5b is added, and
the expression |Sharers| is tried, the following proposition will be checked for
validity



Sharers = {2, 3} ∧
 ReqMsg.Sender = 1 ∧  ∧ (RspMsg.Acks = |Sharers|) =⇒ RspMsg.Acks = 2
ReqMsg.Type = GetM)
V


Sharers = {1, 2} ∧
 ReqMsg.Sender = 1 ∧  ∧ (RspMsg.Acks = |Sharers|) =⇒ RspMsg.Acks = 1
ReqMsg.Type = GetM)

This proposition is invalid, since Sharers = {1, 2} =⇒ |Sharers| = 2, which
violates the consequent in the second example. It can be verified that the expression |Sharers ∪ {ReqMsg.Sender}| − 1 is valid for these set of examples.
4.3

Synthesizing Guards

A guard can be simply viewed as a Boolean-valued expression. However, the
key difference between synthesizing guards and synthesizing update expressions
is that for a given control state and input event, guards cannot be inferred
independently of each other. To ensure deterministic execution, we require that
synthesized guards are mutually exclusive. For a set of guard-actions B1 , . . . , Bn
from a set of concolic snippets with the same starting control state and input
event type, the j th guard-action block Bj consists of a sequence of examples
conditioned by antecedents aj1 , . . . , ajk . The synthesis algorithm sequentially
synthesizes the guards for the blocks B1 to Bn , starting with B1 . Thus, before
synthesizing the j th guard, it has the guards ϕ1 , . . . , ϕj−1 corresponding to the
guard-action blocks B1 , . . . , Bj−1 available to it.
10

Similar to the synthesis procedure for update expressions, we systematically
enumerate Boolean-valued expressions compatible with the grammar Gi (beginning with i = 1), and check if the enumerated expression g is (1) compatible
with the disjunction of the antecedents in the current guard-action block, (2)
incompatible with the guards synthesized previously, and (3) incompatible with
the antecedents in the subsequent guard-action blocks. Any expression g that
satisfies these three conditions is guaranteed to be mutually exclusive with all
the other W
guards for the given control state and input event. Concretely, we dek
fine Ai = i=1 aji to be the disjunction of the antecedents in Bj , and check the
validity of the following formula:

!  n
j−1
^
^
(Aj =⇒ g) ∧
(ϕi =⇒ ¬g) ∧ 
(Ai =⇒ ¬g)
i=1

i=j+1

Again, to illustrate the discussion presented, consider the following set of
antecedents which occur in a real world protocol, where two guards need to be
synthesized. Note that the guard-action blocks have been labeled with Bi and
antecedents of the examples have been labeled with aij for future reference.
Transition(SM, RspNet InMsg) {
B1 : [] => (SM) {
a11 : (InMsg.MType=ACK
a12 : (InMsg.MType=ACK
a13 : (InMsg.MType=ACK
}
B2 : [] => (M) {
a21 : (InMsg.MType=ACK
a22 : (InMsg.MType=ACK
}
}

I) ∧ (InMsg.Acks=-1) ∧ (Pending=-2) ==> {· · · };
I) ∧ (InMsg.Acks= 3) ∧ (Pending= 0) ==> {· · · };
I) ∧ (InMsg.Acks=-1) ∧ (Pending= 0) ==> {· · · };

I) ∧ (InMsg.Acks=-1) ∧ (Pending=-1) ==> {· · · };
I) ∧ (InMsg.Acks= 3) ∧ (Pending= 3) ==> {· · · };

Suppose that we were trying to synthesize an expression for the guard of the
guard-action block B1 . In this case, we have A1 = a11 ∨ a12 ∨ a13 . No guards
have yet been synthesized, so none of the ϕi propositions are defined for this
example. Also, we have A2 = a21 ∨ a22 . Consider the case where the expression
InMsg.Acks 6= Pending is being evaluated as a candidate expression for the
guard of B1 . The proposition that will be tested for validity is
a11 ∨ a12 ∨ a13 =⇒
V (InMsg.Acks 6= Pending)
a21 ∨ a22 =⇒ ¬(InMsg.Acks 6= Pending)
It can be seen that InMsg.Acks 6= Pending is consistent with a11 , a12 and a13 ,
while the negation InMsg.Acks = Pending is consistent with a21 and a22 . Thus
the expression InMsg.Acks 6= Pending is a valid guard for B1 .
4.4

Optimizations for Expression Enumeration

As several generated expressions from our grammar can be equivalent, we eliminate common causes for equivalent expressions, to avoid redundant invocations
11

5

6

8

U Opt
no
pt

U Opt
no
pt

U Opt
no
pt

7

Expression Size

9

U Opt
no
pt

4

U Opt
no
pt

U Opt
no
pt

U Opt
no
pt

U Opt
no
pt

3

U Opt
no
pt

2

U Opt
no
pt

1

U Opt
no
pt

U Opt
no
pt

Number of Expressions

10G
1G
100M
10M
1M
100K
10K
1K
100
10
1

10

11

12

Fig. 6. Number of expressions per expression size, with and without optimizations enabled.

of the SMT solver. The protocol designer defines operators in the grammar to
be associative and commutative when appropriate, and the expression generator chooses a canonical representation for such compositions. Similarly, we allow
for annotation of operators whose repeated composition can lead to equivalent
expressions (such as Boolean negation and idempotent functions). We also eliminate Boolean expressions of size greater than one consisting solely of constants.
We found that these optimizations reduced the run-time of systematic enumeration by an order of magnitude, and allowed for the synthesis tool to consider
expressions with larger sizes, within a given time budget.

5

Evaluation of the Proposed Design Methodology

In this section, we evaluate the performance of our synthesis tool Transit and
discuss the experience that we have had in designing coherence protocols using
our approach.
5.1

Performance evaluation

We first measure the limits of systematic expression enumeration. Figure 6 shows
the number of Boolean-valued expressions enumerated for each size. The numbers for other types are similar. As expected, the number of generated expressions
increases exponentially with expression size. Our optimizations reduce the number of enumerated expressions, particularly for large expression sizes, allowing
for expressions of larger size — for example, the number of expressions of size
12 enumerated with optimizations is the same as the number of expressions of
size 11 without optimizations.
Figure 7a shows the performance of the expression synthesis algorithm when
synthesizing completions for the MSI protocol. Most of the synthesis time was
spent in evaluating queries with the SMT solver. Figure 7b shows the distribution
of the sizes of the expressions synthesized. Most of the expressions are of size
3 or lower, with fewer than 6% of the expressions having a greater size. As the
average synthesis time for smaller expressions is several orders of magnitude
lower, most of the time is dominated by the few larger expressions which needed
to be synthesized.
12

Table 1. Performance of snippet-based design
Protocol
Snippets
required
19
77

49
157

88
1.8K

0.1
3.8

17
45

100
10
1
0.1
0.01
0.001

Guards
Exprs.
explored

Time
(secs)

3.1K
44.5K

5
134

States
explored
by Murφ
140K
854K

60
% of Expressions

Avg. Synth. Time (s)

VI
MSI

Num.
synth.

Synthesis
Updates
Exprs.
Time Num.
explored (secs) synth.

40
20
0

1 2 3 4 5 6 7 8
Expression Size

1 2 3 4 5 6 7 8
Expression Size

(a) Average synthesis time per size.

(b) Distribution of expression size.

Fig. 7. Expressions synthesized when completing the MSI protocol from snippets.

To evaluate the feasibility of specifying cache coherence protocols with snippets, we manually expressed working versions (fully specified and verified efsm
descriptions) of two protocols — the Valid-Invalid (VI) protocol and the MSI
protocol as concolic snippets. In each case, the protocol had 4 cache processes
and one directory. The key results are summarized in Table 1 and Figure 7a.
As the number of expressions explored during guard synthesis was much higher
than that for updates, the synthesis time was dominated by guard synthesis.
5.2

Case Studies

Methodology. We investigated the effectiveness of designing protocols with con-

colic snippets by conducting two case studies, where we designed two representative coherence protocols presented in the synthesis lectures on computer architecture [22]. The lectures describe directory-based protocols with the help of
illustrations of common-case scenarios, accompanied by a textual description of
some key aspects. To simulate the experience of an inexpert user, we chose two of
the co-authors of this work (henceforth called designers) who were inexperienced
in cache coherence protocol design.
Case Study A: Non-blocking-Directory based MSI Protocol. In this experiment, we designed the MSI protocol described in the synthesis lectures from
concolic snippets with Transit. In contrast to the blocking-directory protocol
in Figure 2b, this protocol has a non-blocking directory, i.e., the directory responds to concurrent requests (for the same cache block). Consequently, there
are several additional race conditions that result from this change. Another key
difference is that this protocol includes non-silent clean line eviction, i.e., when
13

Table 2. Effectiveness Metrics for Protocol Design from Concolic Snippets

Case Study A
Snippets in the first/last version
Writing first set of snippets
Total manual effort
Number of iterations
Number of traces inspected
Synthesis time for last version
Number of updates/guards synthesized
States in verified protocol

19/86
2
6
13
5
52
175/80
7.7M

snippets
hours
hours
iterations
traces
mins.
expressions
states

Case Study B
96/108
6
13
8
6
15
260/74
1.5M

snippets
hours
hours
iterations
traces
mins.
expressions
states

it is in the S state, if a cache receives a replacement request, it does not silently
transition to the I state, but has to introduce additional states and messages.
In contrast, the protocol in Figure 2 implements silent evictions.
The designer observed that the scenarios described in the synthesis lectures
resulted in a sparse initial set of snippets, as most of the tricky corner cases were
either indirectly specified in the textual description, or were left unspecified.
Hence, the designer had to add 67 more snippets over 13 iterations to converge
to a correct protocol. A summary of other key metrics of the case study is
presented in Table 2.
Case Study B: Adding the E State to a Blocking MSI Protocol. The goal of
the second case-study was to augment the MSI protocol in Figure 2 with an “E ”
state to arrive at the MESI protocol. The E state (shorthand for exclusive) is an
optimization that allows the protocol to immediately grant write permissions for
read requests to memory locations to which any other cache process has neither
read nor write permissions. The synthesis lectures describe the MESI protocol in
terms of new scenarios, and modifications to scenarios for the MSI protocol. Our
approach was to add the corresponding snippets to the existing set of snippets
specifying our baseline MSI protocol. As the examples describe a MESI protocol
with non-silent clean line evictions, we had to modify our baseline MSI protocol
correspondingly.
The extended protocol contained 5 new states (4 for the cache, 1 for the
directory), and 7 new message types. In the first iteration, we added 19 snippets
to specify transitions involving the E state and non-silent clean line evictions.
As in Case Study A, the description from the text under-specified corner cases
and scenarios involving transient states, which manifested as errors reported by
the model checker. The designer was able to obtain a fully verified protocol by
adding 12 additional snippets over 8 iterations. (Please see Table 2 for more
metrics.)
5.3

Discussion

Ease of use. We note that the manual effort required in our case studies was

modest - 6 hours for an inexperienced designer to produce a fully verified MSI
protocol, and less than 13 for a fully verified MESI protocol. On an average, we
14

needed less than 2 snippets to specify a transition, and the designers converged to
a correct protocol in a reasonably small number of iterations. In both protocols,
the designers had to visually inspect only a handful of counterexample traces
to identify positive examples to refine the protocol specification. As Transit
searches for the smallest compatible expression, the designers found cases where
the obtained expressions differed (were smaller in size) from expected results.
In fine-tuning Transit, we found that identifying the right expression grammar involves a trade-off between synthesis time and the verbosity of expressions
generated. For instance, adding the constant symbol denoting an empty set can
improve readability of expressions, but can impact the number of expressions
enumerated (and hence synthesis time) adversely. The expression grammar that
we used is presented in the appendix.
Ease of debugging. After synthesizing all guards for a control state, Transit
generates an error guard that accounts for any condition has not yet been covered. Consequently, any unexpected input events are detected immediately upon
occurrence. Adding snippets to remove such errors was a straightforward, and
often mechanical process. With the help of a trace-visualization tool that we
developed, we were able to graphically explore error scenarios, and add positive
examples to refine the protocol specification. Thus, fixes did not involve changing portions of the code, but instead required adding more snippets. In one case,
each added snippet resulted in a number of disjoint antecedents. Such a scenario
presents a challenge for guard synthesis as the generalization is an unwieldy disjunction. However, we observed that each term in the disjunction corresponded
to a distinct message type, and by simply splitting a single guard-action block
into multiple blocks grouped by message type, we were able to synthesize the
guard. In fact, this suggests that for protocols, message type could be made a
language construct when specifying snippets.

6

Conclusions

In this paper, we have proposed a new approach for designing communication
protocols by adopting verification tools that interact with the protocol designer.
Our approach builds on the intuition that protocols can be naturally expressed
in terms of their behavior in example scenarios. We introduced the concept of
concolic snippets to allow a designer to specify these behaviors as a mix of concrete examples and symbolic partial transitions. To demonstrate the feasibility
of our approach, we developed a prototype tool based on expression generation,
which generates complete protocol specifications from concolic snippets, that
are then verified using a model checker. Our preliminary case studies using this
tool allowed inexperienced designers to correctly synthesize representative cache
coherence protocols of modest complexity with several hours of human effort.
Encouraged by the initial feasibility results, our next steps are to further extend designing with snippets by exploring alternate computation strategies for
expression synthesis, synthesizing efsm descriptions from distributed scenarios
such as message sequence charts, and techniques to automatically analyze counterexamples returned by the model checker.
15

References
[1] Abts, D., Lilja, D.J., Scott, S.: So Many States, So Little Time: Verifying Memory
Coherence in the Cray X1. In: IPDPS (Apr 2004)
[2] Barrett, C., Tinelli, C.: CVC3. In: CAV (Aug 2007)
[3] Church, A.: Logic, Arithmetics, and Automata. In: Proceedings of the International Congress of Mathematicians (1963)
[4] Clarke, E., Grumberg, O., Peled, D.: Model Checking. MIT Press (2000)
[5] Dill, D., Drexler, A., Hu, A., Yang, C.: Protocol Verification as a Hardware Design
Aid. In: ICCD (Oct 1992)
[6] Dill, D.L.: The Murphi Verification System. In: CAV (Aug 1996)
[7] E.Clarke, O.Grumberg, H.Hiraishi, S.Jha, D.Long, K.McMillan, L.Ness: Verification of the Futurebus+ Cache Coherence Protocol. Formal Methods in System
Design 6, 217–232 (1995)
[8] Gulwani, S.: Automating String Processing in Spreadsheets using Input-output
Examples. In: POPL (Jan 2011)
[9] Harel, D.: Can Programming be Liberated, Period? IEEE Computer (2008)
[10] Harel, D., Marelly, R.: Come, Let’s Play: Scenario-based Programming using LSCs
and the Play-engine. Springer (2003)
[11] Holzmann, G.J.: The Spin Model Checker: Primer and Reference Manual.
Addison-Wesley (2003)
[12] Komuravelli, R.: Verification and Performance of the DeNovo Cache Coherence
Protocol. Master’s thesis, University of Illinois (2010)
[13] Kuncak, V., Mayer, M., Piskac, R., Suter, P.: Functional Synthesis for Linear
Arithmetic and Sets. Software Tools for Technology Transfer (STTT) (2012)
[14] Martin, M.M.K., Hill, M.D., Wood, D.A.: Token Coherence: Decoupling Performance and Correctness. In: ISCA. pp. 182–193 (Jun 2003)
[15] Martin, M.M.K., Sorin, D.J., Beckmann, B.M., Marty, M.R., Xu, M., Alameldeen,
A.R., Moore, K.E., Hill, M.D., Wood, D.A.: Multifacet’s General Execution-driven
Multiprocessor Simulator (GEMS) Toolset. Computer Architecture News (2005)
[16] Raghavan, A., Blundell, C., Martin, M.M.K.: Token Tenure: PATCHing Token
Counting Using Directory-Based Cache Coherence. In: MICRO (Nov 2008)
[17] R.Bloem, S.Galler, B.Jobstmann, N.Piterman, A.Pnueli, M.Weiglhofer: Interactive Presentation: Automatic Hardware Synthesis from Specifications: A Case
Study. In: DATE (Mar 2007)
[18] Sen, K., Marinov, D., Agha, G.: CUTE: A Concolic Unit Testing Engine for C.
In: FSE (2005)
[19] Solar-Lezama, A., Arnold, G., Tancau, L., Bodik, R., Saraswat, V., Seshia, S.A.:
Sketching Stencils. In: PLDI (Jun 2007)
[20] Solar-Lezama, A., Jones, C.G., Bodik, R.: Sketching Concurrent Datastructures.
In: PLDI (Jun 2008)
[21] Solar-Lezama, A., Rabbah, R., Bodik, R., Ebcioglu, K.: Programming by Sketching for Bitstreaming Programs. In: PLDI (Jun 2005)
[22] Sorin, D.J., Hill, M.D., Wood, D.A.: A Primer on Memory Consistency and Cache
Coherence. Morgan Claypool (2011)
[23] Sorin, D.J., Plakal, M., Hill, M.D., Condon, A.E., Martin, M.M.K., Wood, D.A.:
Specifying and Verifying a Broadcast and a Multicast Snooping Cache Coherence
Protocol. IEEE TPDS 13(6), 556–578 (Jun 2002)
[24] Thomas, W.: On the Synthesis of Strategies in Infinite Games pp. 1–13 (1995)
[25] Vantrease, D., Lipasti, M., Binkert, N.: Atomic Coherence: Leveraging Nanophotonics to Build Race-Free Cache Coherence Protocols. In: HPCA (Feb 2011)

16

Appendix
For synthesizing protocol completions with Transit, we used the list of function symbols F shown in Table 3. The functions were chosen as a trade-off between expressiveness and the time required for expression synthesis. Functions with arity 0 represented
constants (e.g., the 0-ary function Zero).

Table 3. Expression Grammar used for Cache Coherence Protocol Synthesis

Function
Plus (Int, Int) → Int
Minus (Int, Int) → Int
Inc (Int) → Int
Dec (Int) → Int
SetAdd (Set, PID) → Set
SetSize (Set) → Int
SetUnion (Set, Set) → Set
SetInter (Set, Set) → Set
SetDiff (Set, Set) → Set
SetOf (PID) → Set
Or (Bool, Bool) → Bool
And (Bool, Bool) → Bool
Not (Bool) → Bool
SetContains (Set, PID) → Bool
IsZero (Int) → Bool
Equals (T, T) → Bool, ∀ T ∈ T
GE (Int, Int) → Bool

Description
Integer Addition
Integer Subtraction
Add one to an Integer
Subtract one from an Integer
Add an entry into a Set
Compute the cardinality of a Set
Set Union
Set Intersection
Set Difference
Create a singleton Set from a PID
Boolean Disjunction
Boolean Conjunction
Boolean Negation
Membership test on a Set
Test if an integer is Zero
Equality Test
Test if an Integer is
greater than or equal to another
Test if an Integer is
strictly greater than another
Get the smallest element from a Set
Conditional Expression
Constant denoting the
number of caches in the system
Constant value of 0.

GT (Int, Int) → Bool
GetLeast (Set, Int) → PID
IfThenElse (Bool, T, T) → T, ∀ T ∈ T
NumCaches (Void) → Int
Zero (Void) → Int

17

