A correctness criterion for asynchronous circuit validation and optimization by Gopalakrishnan, Ganesh & Brunvand, Erik
A CORRECTNESS CRITERION FOR
ASYNCHRONOUS CIRCUIT VALIDATION AND OPTIMIZATION
GANESH GOPALAKRISHNAN¹
ERIK BRUNVAND²
NICK MICHELL
Department of Computer Science
University of Utah
Salt Lake City, UT 84112, USA
ganesh@cs.utah.edu, elb@cs.utah.edu, michell@cs.utah.edu
STEVEN M. NOWICK³
Computer Systems Laboratory,
Department of Computer Science,
Room #222 Margaret Jacks Hall,
Building 460,
Stanford University,
Stanford, CA 94305
nowick@cs.stanford.edu
UUCS-92-004
Abstract
We propose a new relation $\sqsubseteq$ called strong conformance in the context of Dill's trace theory, and define
$B \sqsubseteq A$ to be true exactly when $B$ conforms to $A$ and the success set of $B$ contains the success set of $A$.
When $B \sqsubseteq A$, module $B$ operated in module $A$'s maximal environment $A^M$ (i.e. $B \parallel A^M$) exhibits all
the traces that $A \parallel A^M$ exhibits. In addition, if $A$ has a success trace $x$, $B$ can have additional success
traces of the form $x i \alpha^*$ where $i$ is an input and $\alpha$ is the alphabet of the trace structure. This means that
$B$ can have additional capabilities that $A$ does not. We show that strong conformance is more useful than
conformance (defined by Dill) in detecting certain errors in asynchronous circuits. Strong conformance also
helps justify circuit optimization rules that replace a component $A$ by another component $B$ that may have
extra capabilities (e.g. can accept more inputs). The structural operators compose, rename, and hide of
Dill's trace theory are monotonic with respect to strong conformance. Experiments using a modified version
of Dill's trace theory verifier are presented.
Submitted to the 1992 Computer-Aided Verification Workshop and IEEE Transactions on CAD
¹Supported in part by NSF Award MIP-8902558
²Supported in part by NSF Award MIP-9111793
³Supported in part by the Semiconductor Research Corporation, Contract no. 91-DJ-205, and by the
Stanford Center for Integrated Systems, Research Thrust in Synthesis and Verification of Multi-Module
Systems.
Asynchronous Systems Research Group
University of Utah, Department of Computer Science









Keywords: Asynchronous Circuits, Circuit Optimizations, Formal Verification of Hardware, Trace Theory
1 Introduction
Asynchronous circuits are enjoying a revival, as designers confront problems associated with
the scale of modern VLSI [2]. Despite the many advantages they offer, the number of temporal
concerns involved in verifying asynchronous circuits is, in general, very large. In practice, the task of
verifying asynchronous circuits is greatly simplified through environmental assumptions (e.g. single
input changes) or by relying on circuit properties (e.g. delay insensitivity, which means that the
circuit behavior is independent of delays on wires that lead to its terminals, or speed independence,
which means that the circuit behavior is independent of the delays of its constituent gates).
Dill [1] has developed a trace theory and a verifier based on it; the verifier has been applied to a
number of finite-state speed independent asynchronous circuits and has uncovered bugs in several
published circuits. Nowick [3] has integrated this verifier into the asynchronous circuit synthesis
framework used by a research division of Hewlett-Packard. Despite the impressive performance
of the verifier, the verification criteria it uses, namely conformance and conformation equivalence,
are inadequate to detect many errors that can be introduced during speed independent circuit design
or during circuit optimizations. In this paper, we propose a simple extension to conformance,
called strong conformance, and point out when this criterion is useful and interesting during speed
independent circuit verification. We first motivate the need for this notion through some examples.
Then, we present the theoretical aspects of strong conformance. Finally, we present experiments
that illustrate the strengths as well as the limitations of this notion.
There is also a more fundamental question: in what different ways can asynchronous circuits be
compared, and when are they useful/interesting? This question arises quite naturally, because many
comparison relations have been proposed in the area of process calculi such as CCS [4] and CSP
[5] (for example, see [6]). Although we do not offer an answer to the above question, our work (in
proposing strong conformance) can be seen as a step in this direction.
Our work was principally motivated by our inability to reason about the correctness of some of
the optimization rules used in Brunvand's asynchronous circuit compiler [7, 8] using ideas that we
were familiar with.
Section 2 presents the required background of Dill's trace theory, and defines conformance, which
is the comparison relation used by Dill. It also presents the notion of conformation equivalence.
Section 3 defines strong conformance as a small extension to conformance, and presents an algorithm
for verifying this new relation. The central idea underlying this paper can be illustrated through
a few examples, which are presented in section 4. In presenting examples, we shall use a simple
notation to describe FSMs. Section 5 examines the properties of strong conformance. Section 6
presents experiments with an implementation of strong conformance in Dill's trace theory verifier.
Section 7 discusses our results, related work, and provides concluding remarks. The Appendix
provides deferred details.
2 Background: Trace Theory
2.1 Definitions and Trace Structures
The following definitions and notations are taken from [1]. Trace theory is a formalism for
modeling, specifying, and verifying speed-independent circuits. It is based on the idea that the
behavior of a circuit can be described by a regular set of traces, or sequences of transitions. Each
trace corresponds to a partial history of signals that might be observed at the input and output
terminals of a circuit.
A simple prefix-closed trace structure, written SPCTS, is a three tuple $(I, O, S)$ where $I$ is the
input alphabet (the set of input terminal names), $O$ is the output alphabet (the set of output terminal
names), and $S$ is a prefix-closed regular set of strings over $\alpha = I \cup O$ called the success set. In the
following discussion, we assume that $S$ is a non-empty set.
We associate a SPCTS with a module that we wish to describe. Roughly speaking, the success
set of a module described through a SPCTS is the set of traces that can be observed when the circuit
is “properly used”.
With each module, we also associate a failure set, $F$, which is a regular set of strings over $\alpha$. The
failure set of a module is the set of traces that correspond to “improper uses” of the module. A
failure set of a module is completely determined by the success set: $F = (SI - S)\alpha^*$. Intuitively,
$(SI - S)$ describes all strings of the form $xa$, where $x$ is a success and $a$ is an “illegal” input signal.
Such strings are the minimal possible failures, called chokes. Once a choke occurs, failure cannot
be prevented by future events; therefore $F$ is suffix-closed.
As an example, consider the SPCTS associated with a unidirectional WIRE with input $a$ and output
$b$:
$(\{a\}, \{b\}, \{\varepsilon, a, ab, aba, \ldots\})$.
The success set is a record of all the partial histories (including the empty one, $\varepsilon$) of successful
executions of WIRE. An example of a choke for WIRE is the trace “aa”. Once input “a” has arrived,
a second change in “a” is illegal since it may cause unpredictable output behavior.
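To make the distinction between successes, chokes, and failures concrete, here is a minimal Python sketch. The dictionary encoding of WIRE and the helper name `classify` are illustrative assumptions, not part of [1] or of Dill's verifier; the sketch also assumes the success set is given by a deterministic state machine.

```python
# Hypothetical encoding of the WIRE trace structure as a deterministic
# state machine whose accepted strings are exactly the success set S.
WIRE = {
    "inputs": {"a"},
    "outputs": {"b"},
    "start": 0,
    "delta": {0: {"a": 1}, 1: {"b": 0}},
}

def classify(ts, trace):
    """Classify a trace as 'success', 'failure', or 'neither'.

    'failure' means some prefix is a choke (a success followed by an
    input the module cannot accept), so the trace is in F = (SI - S) alpha*.
    'neither' means the trace leaves S through an output the module
    never produces, so it is in neither S nor F.
    """
    state = ts["start"]
    for symbol in trace:
        nxt = ts["delta"][state].get(symbol)
        if nxt is None:
            return "failure" if symbol in ts["inputs"] else "neither"
        state = nxt
    return "success"

for t in ["", "ab", "aba", "aa", "aab", "b"]:
    print(repr(t), classify(WIRE, t))
```

The trace “aab” is reported as a failure even though it ends in an output, which reflects the suffix-closure of the failure set.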
There are two fundamental operations on trace structures: compose ($\parallel$) finds the concurrent
behavior of two circuits that have some of their terminals of opposite directions (the directions are
input and output) connected, and hide makes some terminals unobservable (suppressing irrelevant
details of the circuit's operation). A third operation, rename, allows the user to generate modules
from templates by renaming terminals.
We can denote the success set of a SPCTS by using state-transition specifications. The success
set of WIRE, described earlier, is captured by the following specification, where WIRE is regarded
as a process:
    WIRE = a? -> b! -> WIRE
In a process description, we use ‘|’ to denote choice, ‘->’ to denote sequencing, and a system of tail
recursive equations to capture repetitive behavior. We use symbols such as a? to denote incoming
transitions (rising or falling) and b! to denote outgoing transitions (rising or falling). (Extensions to
this syntax will be introduced as required.)
When we specify a SPCTS, we generally specify only its success set; its input and output alphabets
are usually clear from the context, and hence are left out.
2.2 Conformance: The Ability to Perform Safe Substitutions
A trace structure specification, $T_S$, can be compared with a trace structure description, $T_I$, of the
actual behavior of a circuit. When $T_I$ implements $T_S$, we say that $T_I$ conforms to $T_S$; that is, $T_I \preceq T_S$.
(The inputs and outputs of the two trace structures must be the same.) This relation is a preorder
and is called conformance. Conformance holds when $T_I$ can be safely substituted for $T_S$.
More precisely, $T_I \preceq T_S$ if for every $T'$, whenever $T_S \parallel T'$ has no failures, $T_I \parallel T'$ has no failures,
either. Intuitively, $T_I$:
(a) must be able to handle every input that $T_S$ can handle (otherwise, $T_I$ could fail in a context
where $T_S$ would have succeeded); and
(b) must not produce an output unless $T_S$ produces it (otherwise, $T_I$ could cause a failure in the
surrounding circuitry when $T_S$ would not).
We illustrate these two facets of conformance, first considering restrictions on input behavior
(case (a)). Consider a JOIN element:
    J = a? -> b? -> c! -> J
      | b? -> a? -> c! -> J
Now, consider a modified JOIN:
    J1 = a? -> b? -> c! -> J1
Notice that the success set of J1 leaves out the trace b a c. Clearly it is not safe to substitute J1
for J: J1 cannot accept a transition on b as its first input, whereas the environment is allowed to
generate a b as its first output transition, because this would have been acceptable for J. Formally,
we say $J1 \not\preceq J$, since the implementation cannot accept an input transition which the specification
can receive.
However, note that it is safe to substitute J for J1, since J can handle every input (and more)
which J1 can handle; so $J \preceq J1$. Trace theory allows an implementation to have “more general”
input behavior than its specification.
Next, consider the case of restrictions on output behavior (case (b) above). We begin with a
simple case:
    CONCURMOD = a? -> (b! || c!) -> CONCURMOD
    SEQNTLMOD = a? -> b! -> c! -> SEQNTLMOD
Note that the success set of SEQNTLMOD omits the trace a c. It is not safe to substitute
CONCURMOD for SEQNTLMOD: some environment of SEQNTLMOD may not accept a
transition on c after producing an a. Therefore, $\mathit{CONCURMOD} \not\preceq \mathit{SEQNTLMOD}$ (intuitively,
implementation CONCURMOD is “too concurrent”).
However, SEQNTLMOD can be safely substituted for CONCURMOD in any environment.
Any environment accepting outputs from CONCURMOD will also accept outputs generated by
SEQNTLMOD, so $\mathit{SEQNTLMOD} \preceq \mathit{CONCURMOD}$. Trace theory allows an implementation to
have “more constrained” output behavior than its specification.
This point can be illustrated more dramatically. We return to the earlier JOIN and a new
implementation:
    AlmostWood = a? -> b? -> c! -> AlmostWood
               | b? -> a? -> AlmostWood
The reason why J can be safely substituted by AlmostWood in any context is the following. So
long as the environment and the component keep generating the sequence abcabcabc..., both J
and AlmostWood behave alike. Suppose the environment generates the string ba and awaits a c. J
does generate a c after seeing ba, thereby allowing the environment to proceed; AlmostWood, on the
other hand, outputs nothing, and awaits a further a or a b at the same time as the environment is
awaiting a c; in this case, the result is a deadlock.
Going to the extreme, we find that
    BlockOfWood = a? -> BlockOfWood
                | b? -> BlockOfWood
conforms to J.
In summary, conformance allows an implementation to be a refinement of a specification: an
implementation may have “more general” input behavior or “more constrained” output behavior
than its specification. However, we want to show not only that an implementation does no harm,
but that it also does something useful! Unfortunately, prefix-closed trace theory cannot distinguish
“constrained” output behavior from deadlock. In spite of the usefulness of trace theory, this is its
greatest practical weakness.
2.3 On Establishing Conformance
A verifier has been developed by Dill to establish conformance. Relation $\preceq$ is established in this
verifier as follows (we use $T$, $T_S$, etc. to denote trace structures):
• The verifier constructs a trace structure, $T_S^M$, called the mirror of specification $T_S$ (see [1];
originally proposed in [9]). $T_S^M$ is the same as $T_S$, but with input and output sets reversed. The
mirror is the worst-case environment which will “break” any trace structure that is not a true
implementation of $T_S$.
• The verifier then generates the parallel composition of the implementation, $T_I$, and the mirror,
$T_S^M$: $T_I \parallel T_S^M$. It has been proven that $T_I \preceq T_S$ iff $T_I \parallel T_S^M$ is failure-free (see [1]).
• $T_I \preceq T_S$ is checked by testing that $T_I \parallel T_S^M$ is free of failures. This check can be performed by
“simulating” the parallel behavior of the two trace structures, presented in Figure 1.
As an example of the above simulation, consider the simulation of J1 against $J^M$, where $J^M$ is the
mirror of specification J:
    J^M = a! -> b! -> c? -> J^M
        | b! -> a! -> c? -> J^M
We can see that $J^M$ is the only module capable of performing the first output action: either a! or b!.
The production of b! will cause J1 to choke.
2.4 Conformation Equivalence
We have seen that while conformance captures the notion of “refinement”, it cannot capture the
notions of deadlock and livelock. There is another relation that can be considered: conformation
equivalence. Trace structures $A$ and $B$ are conformation equivalent ($A \stackrel{conf}{=} B$) if $A \preceq B$ and $B \preceq A$
(see [1]).
Unfortunately, just as conformance is “too weak” a relation for our purposes, conformation
equivalence is often “too strong”. Often, for a specification Spec and implementation Imp, where
$\mathit{Imp} \preceq \mathit{Spec}$, we cannot establish that $\mathit{Imp} \stackrel{conf}{=} \mathit{Spec}$. For example, Imp commonly is overbuilt in
the sense that it accepts more inputs than necessary.
Such an implementation gives rise to the following problems. In showing $\mathit{Imp} \preceq \mathit{Spec}$, no
problem arises, because Imp will accept all the inputs that Spec can. However, in trying to show
that $\mathit{Spec} \preceq \mathit{Imp}$, we “simulate” $\mathit{Imp}^M \parallel \mathit{Spec}$. Since Imp can accept more inputs than it needs to, $\mathit{Imp}^M$
ends up generating more outputs than it “needs to”; some of these outputs go beyond what Spec
can accept, and thus the test $\mathit{Spec} \preceq \mathit{Imp}$ fails.
How do we rescue the situation? The answer lies in not attempting $\mathit{Spec} \preceq \mathit{Imp}$, but merely checking
whether $S_{Spec} \subseteq S_{Imp}$, where ‘$S_M$’ denotes the success set of ‘$M$’. This is the strong conformance
relation that we have defined.
Notations:
• It is assumed that the network $T_I \parallel T_S^M$ is closed (each output of $T_S^M$ matches an input of $T_I$, and
vice-versa), and no two outputs are connected together.
• Define $T_0 = T_S^M$ and $T_1 = T_I$.
• Define $T_{01}$ = the set $\{T_0, T_1\}$.
• Define $\overline{T}$ = if ($T = T_0$) then $T_1$ else $T_0$.
• Define next(s, x) to be the next state attained from state s upon processing input/output x.
• Initialize a global set of state pairs, visited = $\emptyset$.
• Call conforms-to-p($T_{01}$, start-state-0, start-state-1).
• Report “success”.
conforms-to-p($T_{01}$, $st_0$, $st_1$) =
  if ($st_0$, $st_1$) $\in$ visited
  then return
  else
    visited := visited $\cup$ {($st_0$, $st_1$)};
    for each $T \in T_{01}$
      for each enabled output x of $T$
        if x is enabled in $\overline{T}$
        then conforms-to-p($T_{01}$, next($st_0$, x), next($st_1$, x))
        else ERROR (print failure trace and abort);
        end if
      end for
    end for
  end if
Figure 1: Algorithm for Checking for Conformance
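The simulation in Figure 1 can be sketched in a few lines of Python. This is an illustrative sketch, not Dill's verifier: the dictionary encoding of trace structures, the function names `mirror` and `conforms`, and the use of an explicit stack instead of recursion are all assumptions made here, and the sketch handles only deterministic state machines over a common alphabet.

```python
def mirror(ts):
    """Swap the input and output sets; the transitions stay the same."""
    return {**ts, "inputs": ts["outputs"], "outputs": ts["inputs"]}

def conforms(impl, spec):
    """Return (True, None), or (False, failure_trace) if a choke is found."""
    env = mirror(spec)
    modules = (env, impl)                       # T0 = mirror of spec, T1 = impl
    visited = set()
    stack = [((env["start"], impl["start"]), ())]
    while stack:
        states, trace = stack.pop()
        if states in visited:
            continue
        visited.add(states)
        for k in (0, 1):
            me, other = modules[k], modules[1 - k]
            for x in sorted(me["delta"].get(states[k], {})):
                if x not in me["outputs"]:
                    continue                    # only outputs drive the simulation
                if x not in other["delta"].get(states[1 - k], {}):
                    return False, trace + (x,)  # the other module chokes on x
                nxt = tuple(m["delta"][s][x] for m, s in zip(modules, states))
                stack.append((nxt, trace + (x,)))
    return True, None

# The JOIN element J and the modified JOIN J1 from Section 2.2.
J = {"inputs": {"a", "b"}, "outputs": {"c"}, "start": 0,
     "delta": {0: {"a": 1, "b": 2}, 1: {"b": 3}, 2: {"a": 3}, 3: {"c": 0}}}
J1 = {"inputs": {"a", "b"}, "outputs": {"c"}, "start": 0,
      "delta": {0: {"a": 1}, 1: {"b": 2}, 2: {"c": 0}}}

print(conforms(J1, J))   # J1 chokes when the mirror of J emits b first
print(conforms(J, J1))   # J accepts everything the mirror of J1 can emit
```

The failure trace returned for J1 against J corresponds to the choke described in Section 2.3: the mirror of J produces b as its first output, which J1 cannot accept.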
3 Strong Conformance
Definition: We define $T \sqsubseteq T'$, read $T$ conforms strongly to $T'$, if $T \preceq T'$ and $S_T \supseteq S_{T'}$. The
algorithm to check for strong conformance is presented in Figure 2.
The strong conformance relation is safe in that it guarantees conformance. However, it is not
guaranteed to catch all liveness failures; but for a number of examples, a verifier based on strong
conformance provides much better error detection capabilities.
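As a rough illustration of how the extra success-set requirement changes the check, the following Python sketch extends a conformance simulation with one additional test per visited state pair: every output that the specification offers must also be offered by the implementation. The encoding, the function name `strongly_conforms`, and the restriction to deterministic state machines are assumptions of this sketch, not details of the modified verifier.

```python
def strongly_conforms(impl, spec):
    """Return (True, None), or (False, failure_trace) on the first violation."""
    # Mirror of the specification: same transitions, inputs and outputs swapped.
    env = {**spec, "inputs": spec["outputs"], "outputs": spec["inputs"]}
    modules = (env, impl)
    visited = set()
    stack = [((env["start"], impl["start"]), ())]
    while stack:
        states, trace = stack.pop()
        if states in visited:
            continue
        visited.add(states)
        enabled = [m["delta"].get(s, {}) for m, s in zip(modules, states)]
        # Strong conformance test: an enabled input of the mirror is an output
        # the specification can produce here, so impl must enable it as well.
        for x in sorted(enabled[0]):
            if x in env["inputs"] and x not in enabled[1]:
                return False, trace + (x,)
        # Ordinary conformance test: each output must be accepted by the peer.
        for k in (0, 1):
            for x in sorted(enabled[k]):
                if x not in modules[k]["outputs"]:
                    continue
                if x not in enabled[1 - k]:
                    return False, trace + (x,)
                stack.append(((enabled[0][x], enabled[1][x]), trace + (x,)))
    return True, None

# JOIN and the do-nothing module BlockOfWood from Section 2.2.
J = {"inputs": {"a", "b"}, "outputs": {"c"}, "start": 0,
     "delta": {0: {"a": 1, "b": 2}, 1: {"b": 3}, 2: {"a": 3}, 3: {"c": 0}}}
BLOCK_OF_WOOD = {"inputs": {"a", "b"}, "outputs": {"c"}, "start": 0,
                 "delta": {0: {"a": 0, "b": 0}}}

ok, trace = strongly_conforms(BLOCK_OF_WOOD, J)
print(ok, trace)
```

Here BlockOfWood is rejected because, after both inputs have arrived, J can produce c while BlockOfWood cannot; plain conformance would accept the same pair.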
4 Examples Illustrating Strong Conformance
Example 1
Consider a specification for an asynchronous circuit to be built, given in a state-transition oriented 
notation:
    Spec = a? -> a'! -> Spec
         | b? -> b'! -> Spec
Notations: Same as in Figure 1.
• Initialize a global set of state pairs, visited = $\emptyset$.
• Call strong-conforms-to-p($T_{01}$, start-state-0, start-state-1).
• Report “success”.
strong-conforms-to-p($T_{01}$, $st_0$, $st_1$) =
  if ($st_0$, $st_1$) $\in$ visited
  then return
  else
    visited := visited $\cup$ {($st_0$, $st_1$)};
    for each enabled input x of $T_0$ (* Strong conformance checking loop *)
      if x is not enabled in $T_1$
      then ERROR (print failure trace and abort);
      end if
    end for
    for each $T \in T_{01}$
      for each enabled output x of $T$
        if x is enabled in $\overline{T}$
        then strong-conforms-to-p($T_{01}$, next($st_0$, x), next($st_1$, x))
        else ERROR (print failure trace and abort);
        end if
      end for
    end for
  end if
Figure 2: Algorithm for Checking for Strong Conformance
This specification describes a component having input terminals a and b, output terminals a' and
b', and the behavior of process Spec. Process Spec awaits signal transitions on both terminals a and
b. If the first transition occurs on input terminal a, it generates an output transition on terminal a',
and continues to behave as process Spec. If the first transition occurs on terminal b, it generates an
output transition on terminal b' and similarly continues to behave as process Spec.
The behavior of Spec can be realized in many ways. Strangely enough, it is cheaper (i.e. takes
fewer components, in fact none!) to overimplement Spec than to implement exactly the required
behavior. The simplest implementation of Spec consists of just two WIREs. These WIREs are
used to connect input a directly to output a', and input b directly to output b'. This is an
over-implementation, as the state-transition specification starting at process TwoWires shows:
    TwoWires = a? -> AFull
             | b? -> BFull
    AFull    = b? -> ABFull
             | a'! -> TwoWires
    ABFull   = a'! -> BFull
             | b'! -> AFull
    BFull    = a? -> ABFull
             | b'! -> TwoWires
The implementation TwoWires is an over-implementation, because it can accept more input
sequences than required; for example, one a followed by one b. However it is a correct
implementation, because it supports all the behaviors that Spec supports, and therefore can be safely
substituted for Spec in any context.
Notice that $\mathit{TwoWires} \stackrel{conf}{\neq} \mathit{Spec}$. However, $\mathit{TwoWires} \preceq \mathit{Spec}$.
We also have $\mathit{TwoWires} \sqsubseteq \mathit{Spec}$. Superficially, it may seem that $\preceq$ and $\sqsubseteq$ are the same, but the
examples to follow show that this is not the case.
Example 2
Consider the specification of the Universal do nothing module BlockOfWood, described earlier
[1]:
    BlockOfWood = a? -> BlockOfWood
                | b? -> BlockOfWood
Now consider the specification of a JOIN element:
    J = a? -> b? -> c! -> J
      | b? -> a? -> c! -> J
According to Dill's trace theory, BlockOfWood conforms to J; and therefore J may be substituted
by BlockOfWood. However, BlockOfWood deadlocks and is therefore an undesirable substitution.
The check $\mathit{BlockOfWood} \sqsubseteq J$ fails, and on this basis we can reject BlockOfWood as a replacement
for J. In this example, for our purposes, $\sqsubseteq$ is superior to $\preceq$.
Example 3
Consider the following circuit built using a generalized selector GS:
    GS = a? -> (b! -> GS | c! -> GS)
where | denotes choice (in this example, a non-deterministic choice). We now build three versions of
WIREs using GS (the examples were provided by Ebergen [10]) shown in Figure 3. These examples
illustrate some of the limitations of a trace theory of simple prefix-closed trace structures when it
comes to modeling liveness properties. Three examples clarify these remarks:
DL-WIRE can deadlock after some a's and b's, if GS “makes the wrong choice”. LL-WIRE, after
receiving an a, can livelock, engaging in an arbitrarily long (X1 X2) trace without emitting a ‘b’.
Figure 3: Three Different Wire Models (WIRE, DL-WIRE, LL-WIRE)
The verifier based on prefix-closed trace structures ignores these possibilities and constructs state 
transition diagrams for DL-WIRE and LL-WIRE that match the state transition diagram of WIRE 
realized using a buffer. Thus, both conformance and strong conformance fail to detect errors in 
DL-WIRE and LL-WIRE.
This example illustrates cases of deadlock and livelock which cannot be modeled by strong 
conformance. Though the theory of complete trace structures [1, Chapter 7] holds promise in being 
able to distinguish between WIRE and DL-WIRE/LL-WIRE, tool support does not currently exist 
for checking for conformance in the realm of complete trace structures. This will be a subject for 
future research.
Example 4
Consider the specification of an alternating selector [1]:
    AS = a? -> b! -> a? -> c! -> AS
$AS \preceq GS$ (but not vice-versa) showing that AS is a safe substitution for GS. However, neither
$AS \sqsubseteq GS$ (because $S_{AS}$ does not contain $S_{GS}$; in fact, $S_{GS}$ contains $S_{AS}$) nor $GS \sqsubseteq AS$ (because GS
does not conform to AS) holds.
It may be argued that in some situations, we can accept an AS as a legal substitution for GS (e.g. if
a two-way round-robin scheduler is acceptable in place of a two-way random scheduler). However,
because such a replacement will reduce the number of traces that can be observed from $AS \parallel GS^M$
in comparison with $GS \parallel GS^M$ (i.e. only strictly alternating a's and b's are allowed), we can justify
the failure of the check $AS \sqsubseteq GS$. Thus, $\sqsubseteq$ may occasionally “err”, discarding possibly useful
implementations, because of its strict definition of acceptable output behavior.
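The success-set comparison behind Example 4 can be spot-checked by enumerating traces up to a bounded length. The sketch below is only a sanity check under stated assumptions (a hand-written transition table for each process, a hypothetical helper `traces_up_to`, and a length bound of 4); it is not a substitute for the state-pair exploration of Figure 2.

```python
def traces_up_to(delta, start, n):
    """All success traces of length <= n of a deterministic transition table."""
    frontier = {("", start)}
    found = {""}
    for _ in range(n):
        frontier = {(t + x, nxt)
                    for (t, s) in frontier
                    for x, nxt in delta[s].items()}
        found |= {t for t, _ in frontier}
    return found

# GS = a? -> (b! -> GS | c! -> GS); AS = a? -> b! -> a? -> c! -> AS
GS = {0: {"a": 1}, 1: {"b": 0, "c": 0}}
AS = {0: {"a": 1}, 1: {"b": 2}, 2: {"a": 3}, 3: {"c": 0}}

s_gs = traces_up_to(GS, 0, 4)
s_as = traces_up_to(AS, 0, 4)

print(s_as <= s_gs)          # every bounded trace of AS is also a trace of GS
print(s_gs <= s_as)          # but GS has traces that AS lacks
print(sorted(s_gs - s_as))   # witnesses, among them the trace "ac"
```

Within this bound the success set of GS strictly contains that of AS, which is why the containment required for AS to conform strongly to GS cannot hold.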
5 Properties of the Strong Conformance Relation
It can be shown that strong conformance is transitive (because $\preceq$ and $\supseteq$ are transitive), and that the
structural operators compose, rename, and hide are monotonic with respect to strong conformance
(because they are monotonic with respect to $\preceq$ as shown in [1], and because
$(S_B \supseteq S_A \Rightarrow S_{hide(X)(B)} \supseteq S_{hide(X)(A)})$, $(S_B \supseteq S_A \Rightarrow S_{rename(r)(B)} \supseteq S_{rename(r)(A)})$, and
$(S_B \supseteq S_A \Rightarrow S_{B \parallel C} \supseteq S_{A \parallel C})$, as shown in the Appendix). Thus, replacing a component $A$ by
another component $B$ in a system $S[\,]$, where $B \sqsubseteq A$, ensures that $S[B] \sqsubseteq S[A]$; that is, the new
system is no worse.
We also have the following result.
Figure 4: Call-Merge Optimization
Proposition. If $B \sqsubseteq A$, then $S_{A \parallel A^M} = S_{B \parallel A^M}$. In other words, if $B \sqsubseteq A$, the “simulation” of $A$
with its maximal environment $A^M$ will exhibit the same success traces as the simulation of $B$ with
$A^M$. (Note: The simulation of $A$ with $A^M$ is guaranteed to be free of failures, by the definition of
mirror. Since $A$ and $B$ are canonical ([1]), their success and failure sets are disjoint. Therefore,
since $S_B \supseteq S_A$, the simulation of $(B \parallel A^M)$ will also be free of failures. Thus, the only thing we can
compare are the success sets.)
Viewed yet another way, $B$ can be substituted for $A$ in any environment, up to the maximal
environment $A^M$, and one will not observe any difference in the set of transactions that cross the boundary
between $A^M$ and $A$ or $A^M$ and $B$.
Proof Outline. The proof is by induction over the length of traces. As basis case, $\varepsilon$ is in the success
sets of $(A \parallel A^M)$ and $(B \parallel A^M)$. Now assume that a trace $x$ of length $N$ is in both success sets.
Consider a trace $x i_1$ in the success set of $(A \parallel A^M)$; in other words, $A^M$ sends an output $i_1$ to $A$ that
$A$ can accept; then, the same input $i_1$ must be accepted by $B$ also, because $S_B \supseteq S_A$, and so $x i_1$ is in
the success set of $(B \parallel A^M)$ also. By the same argument, if $x o_1$ is in $S_{A \parallel A^M}$, it will be in $S_A$, which
means it is in $S_B$, which means it is in $S_{B \parallel A^M}$ also. If there is an $i_2$ such that $x i_2$ is in $S_B$ but not in $S_A$,
then $x i_2$ is not in $(A \parallel A^M)$ or in $(B \parallel A^M)$, because $A^M$ will not generate the output $i_2$; the “excess
capability” of $B$ to accept $i_2$ remains unused (and, hence, $i_2$ can lead to any behavior in $B$, including
$\alpha^*$). Finally, if $x o_2$ is in $S_B$ but not in $S_A$, we have a contradiction, because $B \sqsubseteq A$ by definition
implies $B \preceq A$, and a module (like $B$) that conforms to another (like $A$) cannot generate an “excess
output” $o_2$ that $A$ cannot generate.
This proof exactly characterizes the notion of strong conformance: B conforms strongly to A if
it offers to accept excess inputs at certain states that A doesn’t offer to accept. This is harmless, 
because the maximal environment of A where B will be plugged in as a replacement will not draw 
upon these excess capabilities of B.
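The Proposition can also be restated as a one-line set calculation. The sketch below rests on two assumptions that are not spelled out in the text above: that composing two failure-free trace structures over the same alphabet intersects their success sets, and that the mirror $A^M$ has the same success set as $A$. Under those assumptions:

```latex
\begin{align*}
S_{A \parallel A^M} &= S_A \cap S_{A^M} = S_A \cap S_A = S_A, \\
S_{B \parallel A^M} &= S_B \cap S_{A^M} = S_B \cap S_A = S_A
  \qquad \text{(since } S_B \supseteq S_A \text{)}.
\end{align*}
```

Both compositions therefore exhibit exactly the success traces of $A$, which is the content of the Proposition; the induction in the proof outline establishes the same equality trace by trace.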
6 Experimental Results
6.1 Call-Merge Optimization
The initial circuits generated by either the occam [7] or the hopCP [11, 12] synthesis system have
a number of redundancies. These redundancies arise because the HDL constructs are compiled
without taking their contexts into account. While the circuit is being optimized, certain circuits get
so constrained in their usage that they can be replaced with other (cheaper) circuits, as shown in [7].
An example of this, from [7], is shown in Figure 4.
Suppose that a circuit contains the CALL element shown to the left. The behavior of CALL is
    CALL = a? -> c'! -> c? -> a'! -> CALL
         | b? -> c'! -> c? -> b'! -> CALL
Figure 5: Petri Net Specification of a Queue
Suppose that during the course of optimization, the c' output of CALL gets connected back to its c
input as shown in CALL1. It is assumed that CALL1 is being operated in a delay insensitive context,
as the original circuit was. The delay insensitized behavior of CALL1 is
    CALL1 = a? -> (c'! || a'!) -> CALL1
          | b? -> (c'! || b'!) -> CALL1
where the notation means: after performing a?, perform c'! and a'! in some order before repeating
the behavior of CALL1 (and similarly for the second branch of the choice). This circuit can be
replaced by MCALL1 which is smaller and faster than CALL1. Clearly MCALL1 is not equivalent
to CALL1, because the execution sequence
    a?; c'!; b?
is possible for MCALL1 but not for CALL1.
We have $\mathit{MCALL1} \preceq \mathit{CALL1}$ as well as $\mathit{MCALL1} \sqsubseteq \mathit{CALL1}$. The latter check assures us that
MCALL1 exhibits all the successful traces of CALL1.
6.2 Error Detection in Queue Cell
Consider a queue cell CONCUR-Q specified in Figure 5, where the capacity is set to 1. The queue
cell can be realized using the familiar micropipeline circuit QIMP1 shown in Figure 6.
Suppose the circuit is implemented by mistake as QIMP2. Though QIMP2 conforms to CONCUR-Q,
QIMP2 does not conform strongly to CONCUR-Q. This “implementation” does nothing wrong,
but deadlocks immediately. The strong conformance check fails, and generates the error message:
    ... failure trace (RIN AIN)
The trace indicates that the implementation cannot produce output AIN after receiving RIN, while
CONCUR-Q can.
This clearly shows that strong conformance is capable of pointing out certain forms of deadlocks
that can occur. More precisely, if after seeing trace x, the specification has a success extension
through output o while the implementation does not, strong conformance fails.
Figure 6: Two Different Queue Elements (QIMP1, QIMP2)
Figure 7: QR42 Converter Specification and Implementation
6.3 A 4-phase to 2-phase Converter with Quick Return
Consider the specification of a four-phase to two-phase converter with “quick return”:
    QR42-SPEC = r4? -> ((a4! -> r4?) || (r2! -> a2?)) -> a4! -> QR42-SPEC
where ((a -> b) || (c -> d)) represents all possible overlapped executions of (a -> b) and (c -> d).
We obtain that QR42-IMP conforms to QR42-SPEC. However QR42-IMP does not conform
strongly to QR42-SPEC, because QR42-SPEC allows the trace (R4 A4) while QR42-IMP does
not. QR42-SPEC is an abstract specification that allows R2 as well as A4 to occur concurrently
immediately after R4. This means that the environment of QR42-SPEC is free to accept R2 and A4
in any order. The circuit QR42-IMP, however, generates A4 only following R2, thus not invoking
certain capabilities that exist in the environment.
Notice that QR42-IMP does not obey the delay insensitive signaling protocol: it allows the
trace (R4 R2 A4) but not (R4 A4 R2); however, if a buffer is introduced on the r2 wire, the
sequence (R4 A4 R2) will also become possible. Thus, a delay insensitized version of QR42-IMP
conforms strongly to QR42-SPEC.
6.4 31-Location Queue in Place of a 32-Location Queue
Finally, we experimented with a 31-location queue in place of a 32-location queue. Conformance
passed the 31-location implementation, since the 31-location queue can be safely substituted for 
the 32-location queue. However, this implementation certainly has more limited output behavior 
than the specification. On the other hand, the strong conformance check detects this limited output 
behavior; it finds the following sequence leading to an error:
(STRONG-CONFORMS-TO-P *concur-Q31* *concur-Q32*)
Failure path: (RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN
RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN
AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN 
RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN 
AIN RIN AIN RIN AIN RIN AIN RIN AIN RIN AIN)
The strong conformance checker found the failure in 0.1 seconds and printed a sequence of 
actions leading to the error, even though the specification has 128 states. This suggests that the 
trace theory based verifier does not always suffer from combinatorial explosion, as is commonly 
feared. (However, in large regular array structures where each cell has many independent moves 
possible, the explosion could well manifest itself, as communicated to us by Birtwistle in the context of 
the Concurrency Workbench. See [13] for further discussion.)
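The kind of search that produces such a failure path can be sketched in Python on a deliberately simplified model. This is not Dill's verifier or our modified version of it: the sketch assumes a hypothetical queue that exposes only its input handshake (RIN, AIN), never dequeues, and is deterministic, and it checks only the success-set half of strong conformance (every success trace of the specification must also be a success trace of the implementation). The names `queue_moves` and `missing_success_trace` are invented for this example.

```python
from collections import deque


def queue_moves(capacity):
    """Transition function for the input side of a hypothetical n-place queue.

    A state is (items_stored, request_pending).
    """
    def moves(state):
        stored, pending = state
        if not pending:
            return {"RIN": (stored, True)}       # environment raises a request
        if stored < capacity:
            return {"AIN": (stored + 1, False)}  # queue accepts the item
        return {}                                # queue is full: no acknowledge
    return moves


def missing_success_trace(impl_moves, spec_moves, start=(0, False)):
    """Breadth-first search for a shortest trace the spec has and the impl lacks."""
    seen = {(start, start)}
    frontier = deque([(start, start, [])])
    while frontier:
        impl_state, spec_state, path = frontier.popleft()
        impl_next = impl_moves(impl_state)
        for symbol, spec_target in spec_moves(spec_state).items():
            if symbol not in impl_next:
                return path + [symbol]           # spec can move, impl cannot
            pair = (impl_next[symbol], spec_target)
            if pair not in seen:
                seen.add(pair)
                frontier.append((pair[0], pair[1], path + [symbol]))
    return None


failure_path = missing_success_trace(queue_moves(31), queue_moves(32))
same_size = missing_success_trace(queue_moves(32), queue_moves(32))
larger_impl = missing_success_trace(queue_moves(32), queue_moves(31))
```

In this toy model the 31-place queue fails against the 32-place specification on a path of 32 RIN AIN pairs, while a 32-place queue checked against a 31-place specification passes this half of the check, in line with the idea that an implementation may have extra capabilities.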
7 Discussion, Related Work, and Conclusions
A relation, strong conformance, between trace structures has been presented and its various uses 
have been pointed out. This notion is closely related to the definition of decomposition presented in 
[9]. Key differences are briefly noted. Ebergen's trace theory is designed with different objectives: 
to specify computations, and to synthesize circuits through calculations using trace-theoretic rules. 
His trace theory does not directly relate to circuit components; for instance, two trace structures 
containing the same output symbol can be weaved. Weave merely captures constraints on joint 
execution, and does not correspond to the act of connecting two circuit outputs (as Dill's || 
operator does, which is why Dill prohibits composing two trace structures containing the same output 
symbol).
In Ebergen's trace theory, the link between trace theoretic operators and circuit behavior is brought 
out through the following key notions and theorems: decomposition, DI decomposition, separation 
theorem, and substitution theorem. Together with a rich collection of equational laws on commands 
(where commands denote trace structures), Ebergen's trace theory seems capable of synthesizing 
correct circuits, without having to first "guess" a circuit and "check" it using a verifier (as has been
the approach suggested here). A tool to demonstrate the power of Ebergen’s trace theory is under 
development. These works address the two prevalent points of view: post-hoc verification after 
“intelligent human design” vs. “correct by construction” design.
The notion of strong conformance is latent in Ebergen’s definition of the decomposition relation 
[9, Definition 3.1.0.0, Page 42] -  as was discovered after the fact by us. A very similar idea called 
input liberalization has also been proposed by Ad Peeters [14] -  again discovered after the fact by 
us! Neither Ebergen nor Peeters suggests using their definitions for validating circuit optimizations, 
as we do here.
The process algebra developed by Udding and Josephs holds promise to contain state explosion 
[15, Remark on page 2], as circuits are derived through calculations in their process algebra, rather 
than verified post-hoc using a semantic model (state graphs) as with Dill’s verifier. However, so 
long as the two points of view -  post-hoc verification after “intelligent human design” vs. “correct 
by construction” design using intelligent calculations -  exist, both approaches have an important 
role to play.
Finally, work in verification of asynchronous circuits seems to be proceeding along (at least) 
two distinct lines: (1) a class of works that use various trace models; (2) a class of works based 
on process algebras. Many of the notions used in these areas seem to be conceptually very similar 
(e.g. compare autofailure manifestation, which converts possible failures to actual failures, with the 
may/must pre-orders used by [6]). However, there are also fundamental differences between these 
approaches (e.g. unidirectional wires carry information only one way, so a component cannot refuse an 
input, whereas a CCS/CSP rendezvous can be refused by not participating in it). One hopes to see 
efforts that unify these (as yet unrelated) approaches.
Acknowledgements. Thanks to Jo Ebergen for his insightful feedback on an earlier version of this 
paper.
References
1. David L. Dill. Trace Theory for Automatic Hierarchical Verification of Speed-independent 
Circuits. MIT Press, 1989. An ACM Distinguished Dissertation.
2. Ivan Sutherland. Micropipelines. Communications of the ACM, June 1989. The 1988 ACM 
Turing Award Lecture.
3. Steven M. Nowick. Personal communication, 1992.
4. Robin Milner. Communication and Concurrency. Prentice-Hall, Englewood Cliffs, New Jersey, 
1989.
5. C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, Englewood Cliffs, New 
Jersey, 1985.
6. Rocco DeNicola and Matthew Hennessy. Testing equivalences for processes. Theoretical 
Computer Science, 34:83-133, 1983.
7. Erik Brunvand. Translating Concurrent Communicating Programs into Asynchronous Circuits. 
PhD thesis, Carnegie Mellon University, 1991.
8. Erik Brunvand and Robert F. Sproull. Translating concurrent programs into delay-insensitive 
circuits. In International Conference on Computer Design (ICCAD), IEEE, pages 262-265, 
November 1989.
9. Jo C. Ebergen. Translating Programs into Delay Insensitive Circuits. Centre for Mathematics 
and Computer Science, Amsterdam, 1989. CWI Tract 56.
10. Jo C. Ebergen. Personal communication, 1988.
11. Venkatesh Akella. Action refinement based transformation of concurrent processes into 
asynchronous hardware. Ph.D. research in progress.
12. Venkatesh Akella and Ganesh Gopalakrishnan. Static analysis techniques for the synthesis of 
efficient asynchronous circuits. Technical Report UUCS-91-018, Dept. of Computer Science, 
University of Utah, Salt Lake City, UT 84112, 1991. To appear in TAU '92: 1992 Workshop 
on Timing Issues in the Specification and Synthesis of Digital Systems, Princeton, NJ, March 
18-20, 1992.
13. David L. Dill, Steven M. Nowick, and Robert F. Sproull. Specification and automatic verification 
of self-timed queues. Formal Methods in System Design, 1992. To appear. Also available as 
Stanford University Technical Report CSL-TR-89-387, Computer Systems Laboratory, Stanford 
University, August 1989.
14. Jo C. Ebergen and Ad M.G. Peeters. The modulo-N counter: Design and analysis of 
delay-insensitive circuits. Technical Report CS-91-25, Department of Computer Science, University 
of Waterloo, June 1991.
15. Mark B. Josephs and Jan Tijmen Udding. An algebra for delay-insensitive circuits. Technical 
Report WUCS-89-54, Department of Computer Science, Washington University, St. Louis, MO, 
1989.
8 Appendix
Proposition. compose, rename, and hide are monotonic with respect to strong conformance.
Proof Outline. These structural operators are monotonic with respect to conformance ($\leq$), as shown in [1, Page 
58]. We are now required to show the additional facts that $S_B \supseteq S_A$ implies
$$S_{hide(X)(B)} \supseteq S_{hide(X)(A)} \qquad (1)$$
$$S_{rename(r)(B)} \supseteq S_{rename(r)(A)} \qquad (2)$$
$$S_{B \| C} \supseteq S_{A \| C} \qquad (3)$$
Equation 1 follows from the fact that hide(X) is a function that simply removes members of X from 
every success trace in $S_A$ or $S_B$ (as the case may be). Equation 2 follows from the fact that rename(r) 
simply applies the renaming function r to every trace in $S_A$ or $S_B$ (as the case may be). Finally, equation 3 
follows from the fact that $S_{B \| C} = S_B \cap S_C$ and $S_{A \| C} = S_A \cap S_C$.
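The three containments can be checked on a small example with the following Python sketch. It is an illustration on hand-picked finite sets, not a proof. The success sets and symbol names are invented, and composition is modeled as plain intersection, which assumes that both trace structures already share the same alphabet.

```python
def hide(hidden, traces):
    """Delete every symbol in `hidden` from every trace."""
    return {tuple(s for s in t if s not in hidden) for t in traces}


def rename(mapping, traces):
    """Apply the renaming function (a dict here) to every symbol of every trace."""
    return {tuple(mapping.get(s, s) for s in t) for t in traces}


def compose(traces_1, traces_2):
    """Composition of success sets, assuming equal alphabets: intersection."""
    return traces_1 & traces_2


# Invented success sets with S_B containing S_A.
success_a = {(), ("r",), ("r", "x"), ("r", "x", "a")}
success_b = success_a | {("r", "x", "a", "r")}
success_c = {(), ("r",), ("r", "x", "a"), ("r", "x", "a", "r")}

checks = {
    "hide": hide({"x"}, success_b) >= hide({"x"}, success_a),
    "rename": rename({"r": "req"}, success_b) >= rename({"r": "req"}, success_a),
    "compose": compose(success_b, success_c) >= compose(success_a, success_c),
}
```

Each entry of `checks` holds for these sets because each operator is applied trace by trace (or, for composition, is an intersection), so it cannot remove a trace from the larger set without also removing it from the smaller one.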
