Abstract
Introduction.
Verification of the correctness of asynchronous circuits has been considered an important problem for a long time. However, the lack of any formal and efficient method of verification has prevented the creation of practical design aids for this purpose. Since all the known techniques of simulation and prototype testing are time-consuming and not very reliable, there is an acute need for such tools. Moreover, as we build larger and more complex circuits, the cost of a single design error is likely to become even higher. In this paper, we describe an automatic verification system for asynchronous circuits, in which the specifications are expressed in a propositional temporal logic. We illustrate the use of our system by verifying a version of the self-timed queue element given in [MC80].
Bochmann [Bo82] was probably the first to recognize the usefulness of temporal logic to describe circuits; he verified an implementation of a self-timed arbiter using linear temporal logic and what he called "reachability analysis". The work of Malachi and Owicki [MO82] identified additional temporal operators required to express interesting properties of a circuit and also gave specifications of a large class of modules used in self-timed systems.
Although these researchers have contributed significantly toward developing an adequate notation for expressing the correctness of asynchronous circuits, the problem of mechanically verifying a circuit using efficient algorithms still remains unsolved. In this paper we show how a simple and efficient algorithm, called a model checker, can be used to verify various temporal properties of an asynchronous circuit. Roughly speaking, our method works by first building a labelled state-transition graph for an asynchronous circuit. This graph can be viewed as a finite Kripke Structure. Then by using the model checker we determine the truth of various temporal formulae in this Kripke Structure. As a result, it is possible to avoid the complexity associated with proof construction.
Most complex circuits are built out of relatively less complex modules in a hierarchical manner. Hence it should be possible to verify these circuits in a hierarchical manner, i.e. to verify the correctness of a larger module, given the premises that the smaller modules are correct. A hierarchical approach to verification is important in practice, because it enables us to verify circuits incrementally, to localize faults to small submodules and most importantly, to handle large circuits without a large growth in complexity. We show how the hierarchical method can be incorporated in a mechanical approach to circuit verification.
The paper is organized as follows: Section 1 contains a brief description of the syntax and semantics of CTL, the temporal logic used in this paper, and also explains the algorithms used in the model checker. In Section 2, we give a simple step-by-step method used to verify circuits. In Section 3, we illustrate these methods by establishing some interesting properties of a Self-Timed Queue (FIFO) Element. In Section 4, we introduce a hierarchical method to be used in verifying large and complex circuits and study some of the model-theoretic properties of the operation of "restriction" on a Kripke Structure. The paper concludes by pointing out the shortcomings of our approach and with a discussion of some remaining open problems.
CTL and Model Checker.
The logic that we use to give the specifications of a circuit is a propositional temporal logic of branching time, called CTL (Computation Tree Logic). This logic is essentially the same as that described in [CES83], [EC80] and [BMP81].
The syntax for CTL is given below:
Let $\mathcal{P}$ be the set of all the atomic propositions in the language $\mathcal{L}$. Then
1. Every atomic proposition $P$ in $\mathcal{P}$ is a formula in CTL.
2. If $f_1$ and $f_2$ are CTL formulae, then so are $\neg f_1$, $f_1 \land f_2$, $\forall X f_1$, $\exists X f_1$, $\forall[f_1\ U\ f_2]$ and $\exists[f_1\ U\ f_2]$.
In this logic the propositional connectives $\neg$ and $\land$ have their usual meanings of negation and conjunction. The temporal operator $X$ is the nexttime operator. Hence the intuitive meaning of $\forall X f_1$ ($\exists X f_1$) is that $f_1$ holds in every (in some) immediate successor state of the current state. The temporal operator $U$ is the until operator. The intuitive meaning of $\forall[f_1\ U\ f_2]$ ($\exists[f_1\ U\ f_2]$) is that for every computation path (for some computation path), there exists an initial prefix of the path such that $f_2$ holds at the last state of the prefix and $f_1$ holds at all other states along the prefix.
We also use the following syntactic abbreviations:
$\forall F f_1 \equiv \forall[\mathit{true}\ U\ f_1]$, which means for every path, there exists a state on the path at which $f_1$ holds.
$\exists F f_1 \equiv \exists[\mathit{true}\ U\ f_1]$, which means for some path, there exists a state on the path at which $f_1$ holds.
$\forall G f_1 \equiv \neg \exists F\, \neg f_1$, which means for every path, at every node on the path $f_1$ holds.
$\exists G f_1 \equiv \neg \forall F\, \neg f_1$, which means for some path, at every node on the path $f_1$ holds.
$\forall[f_1\ W\ f_2]$, which means that for every computation path, and for every initial prefix of the path, if $f_2$ holds at all the states along the prefix then $f_1$ holds at all the states along the same prefix.
$\exists[f_1\ W\ f_2]$, which means that for some computation path, and for every initial prefix of the path, if $f_2$ holds at all the states along the prefix then $f_1$ holds at all the states along the same prefix.
In the last two formulae $W$ is the while operator. The formula $\forall[f_1\ W\ f_2]$ ($\exists[f_1\ W\ f_2]$) is read as "for every (some) path $f_1$ while $f_2$".
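The $F$ and $G$ abbreviations can be expanded mechanically into the core operators. The following is a minimal Python sketch of that expansion, assuming formulae are encoded as nested tuples such as `("AG", ("ap", "p"))`. The encoding and the name `desugar` are choices made here for illustration and do not come from the paper; the while operator is left out because its defining equivalence is not reproduced above.

```python
TRUE = ("true",)

def desugar(f):
    """Rewrite AF, EF, AG, EG into the core operators (not, and, AX, EX, AU, EU)."""
    op = f[0]
    if op in ("true", "ap"):
        return f
    if op == "AF":   # AF f = A[true U f]
        return ("AU", TRUE, desugar(f[1]))
    if op == "EF":   # EF f = E[true U f]
        return ("EU", TRUE, desugar(f[1]))
    if op == "AG":   # AG f = not EF not f
        return ("not", ("EU", TRUE, ("not", desugar(f[1]))))
    if op == "EG":   # EG f = not AF not f
        return ("not", ("AU", TRUE, ("not", desugar(f[1]))))
    # Core operator: keep it and expand its arguments.
    return (op,) + tuple(desugar(g) for g in f[1:])

example = desugar(("AG", ("ap", "p")))
```

Here `example` is the core-syntax form of "on every path, `p` always holds".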
The semantics of a CTL formula is defined with respect to a labelled state-transition graph. A CTL structure is a triple $M = (S, R, \Pi)$ where
1. $S$ is a finite set of states.
2. $R$ is a total binary relation on $S$ ($R \subseteq S \times S$) and denotes the possible transitions between states.
3. $\Pi$ is an assignment of atomic propositions to states, i.e. $\Pi : S \to 2^{\mathcal{P}}$.
A path is an infinite sequence of states $(s_0, s_1, s_2, \ldots)$ such that $\forall i\, [(s_i, s_{i+1}) \in R]$. For any structure $M = (S, R, \Pi)$ and state $s_0 \in S$, there is an infinite computation tree with root labelled $s_0$ such that $s \to t$ is an arc in the tree iff $(s, t) \in R$.
The truth in a structure is expressed by $M, s_0 \models f$, meaning that the temporal formula $f$ is satisfied in the structure $M$ at state $s_0$. The semantics of temporal formulae is defined inductively. From the inductive definition it is quite easy to see that the semantics of $U$, the until operator, can be given in terms of a least fixed-point characterization:
$\exists[f_1\ U\ f_2] = \mu y.\; f_2 \lor (f_1 \land \exists X\, y)$.
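The universal until operator has an analogous characterization, and each least fixed point can be reached by iterating from the empty set. The identities below are the standard ones for CTL over a total transition relation; they are given as a supplement rather than quoted from the original text, and the names $\tau$ and $Y_k$ are introduced only for this sketch.

$$
\forall[f_1\ U\ f_2] = \mu y.\; f_2 \lor (f_1 \land \forall X\, y)
$$

$$
\tau(Y) = \{ s : M, s \models f_2 \} \;\cup\; \bigl( \{ s : M, s \models f_1 \} \cap \{ s : \exists t\, [(s,t) \in R \land t \in Y] \} \bigr),
\qquad Y_0 = \emptyset, \quad Y_{k+1} = \tau(Y_k)
$$

Because $\tau$ is monotone and $S$ is finite, the chain $Y_0 \subseteq Y_1 \subseteq \cdots$ stabilizes after at most $|S|$ steps, and its limit is the set of states satisfying $\exists[f_1\ U\ f_2]$. Replacing $\exists t$ by $\forall t$ in $\tau$ gives the corresponding iteration for $\forall[f_1\ U\ f_2]$.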
The Model Checker for CTL can now be thought of as an algorithm that determines the satisfiability of a given temporal formula $f$ in a model $M$, by computing these fixed points. A full description of the algorithm is given in [CES83]. In order to determine if a CTL formula $f$ is true in a structure $M = (S, R, \Pi)$, the algorithm labels each state of $S$ so that when the algorithm terminates, the label of each state $s \in S$, $label(s)$, will be equal to $\{ f' \in sub(f) \mid M, s \models f' \}$, where each element of $sub(f)$ is either a subformula of $f$ or the negation of the subformula. Hence $M, s \models f$ iff $f \in label(s)$ at the termination of the algorithm.
The labelling algorithm works in several stages. In the $i$-th stage the algorithm labels the states by the subformulae of length $i$. The labels assigned in the earlier stages, corresponding to the subformulae of length less than $i$, are used to perform the labelling in this stage. It can be shown that the algorithm makes at most $n = |f|$ stages of computation and that the total amount of the work involved in each stage is $O(\|S\| + \|R\|)$. Hence the time complexity of the Model Checker is $O(|f| \cdot (\|S\| + \|R\|))$. The algorithm is also fairly simple, since it involves only a few straightforward graph-theoretic algorithms.
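The idea of labelling states bottom-up over the subformulae can be illustrated with a short Python sketch. It is not the algorithm of [CES83]: it computes the until operators by naive fixed-point iteration, so it does not achieve the linear bound stated above. The tuple encoding of formulae and the names `check`, `S`, `R` and `L` are assumptions made for this example.

```python
def check(f, S, R, L):
    """Return the set of states of the structure (S, R, L) that satisfy f.

    f is a nested tuple over: ("true",), ("ap", name), ("not", g),
    ("and", g, h), ("EX", g), ("AX", g), ("EU", g, h), ("AU", g, h).
    R is assumed to be total, as required of a CTL structure.
    """
    succ = {s: {t for (u, t) in R if u == s} for s in S}

    def sat(g):
        op = g[0]
        if op == "true":
            return set(S)
        if op == "ap":
            return {s for s in S if g[1] in L[s]}
        if op == "not":
            return set(S) - sat(g[1])
        if op == "and":
            return sat(g[1]) & sat(g[2])
        if op == "EX":
            x = sat(g[1])
            return {s for s in S if succ[s] & x}
        if op == "AX":
            x = sat(g[1])
            return {s for s in S if succ[s] <= x}
        if op in ("EU", "AU"):
            a, y = sat(g[1]), set(sat(g[2]))
            while True:  # least fixed point, grown from the f2-states
                if op == "EU":
                    new = {s for s in a if succ[s] & y}
                else:
                    new = {s for s in a if succ[s] <= y}
                if new <= y:
                    return y
                y |= new
        raise ValueError("unknown operator: %r" % (op,))

    return sat(f)

# A three-state example: s0 may loop forever, s1 must move to s2.
S = {"s0", "s1", "s2"}
R = {("s0", "s0"), ("s0", "s1"), ("s1", "s2"), ("s2", "s2")}
L = {"s0": set(), "s1": set(), "s2": {"p"}}
ef_p = check(("EU", ("true",), ("ap", "p")), S, R, L)
af_p = check(("AU", ("true",), ("ap", "p")), S, R, L)
```

In this example `p` is reachable from every state, but it is inevitable only from `s1` and `s2`, because `s0` can stay in its self-loop.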
Verification of Circuits.
Given a circuit to be verified, the steps involved in using the Model Checker to assert the correctness of the temporal specifications are as follows:
Step 1. Building the Model.
The structure associated with the circuit is essentially a finite state-transition graph, with its vertices corresponding to the distinct states and the edges corresponding to the (possibly nondeterministic) transitions between the states. The initial label associated with each state is the set of propositions true in that state. This labelled state-transition graph can be built using a simple algorithm.
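One way such a graph could be constructed is by exploring the states reachable from the initial state. The Python sketch below is an illustration under stated assumptions and is not the paper's own algorithm: every gate output is assumed to take its newly computed value after exactly one unit delay, primary inputs are assumed free to take any value at each step, and the function name, the netlist format and the three-node example circuit are all hypothetical.

```python
from itertools import product

def build_unit_delay_graph(gates, inputs, initial):
    """Explore the reachable states of a unit-delay gate network.

    gates:   dict mapping a gate output node to (function, fan-in node names)
    inputs:  list of primary input nodes, chosen nondeterministically
    initial: dict mapping every node to its initial value (0 or 1)
    Returns (states, edges, labels); a label is the set of nodes at logical 1.
    """
    nodes = sorted(initial)
    start = tuple(initial[n] for n in nodes)
    states, edges, todo = {start}, set(), [start]
    while todo:
        s = todo.pop()
        cur = dict(zip(nodes, s))
        # Each gate sees the current values and switches one unit later.
        fired = {g: fn(*(cur[i] for i in fanin))
                 for g, (fn, fanin) in gates.items()}
        for choice in product((0, 1), repeat=len(inputs)):
            nxt = {**cur, **fired, **dict(zip(inputs, choice))}
            t = tuple(nxt[n] for n in nodes)
            edges.add((s, t))
            if t not in states:
                states.add(t)
                todo.append(t)
    labels = {s: {n for n, v in zip(nodes, s) if v} for s in states}
    return states, edges, labels

# Hypothetical circuit: n1 = NOT a, out = a AND n1 (pulses when a rises).
gates = {
    "n1": (lambda a: 1 - a, ("a",)),
    "out": (lambda a, n1: a & n1, ("a", "n1")),
}
states, edges, labels = build_unit_delay_graph(
    gates, ["a"], {"a": 0, "n1": 1, "out": 0})
```

Every state has at least one successor, so the resulting relation is total, as a CTL structure requires. A real circuit model would also constrain the inputs according to the signalling protocol instead of letting them change freely.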
Step 2. Giving the Specifications of the Circuit in CTL.
This corresponds to the specifications of the temporal behaviour of the circuit. It usually involves structural properties (i.e. the specifications for different components of the circuit, specifications of the signalling scheme used for communication with various other modules, etc.), safeness properties and liveness properties. It should probably be pointed out that one need not give the complete specification of the circuit in order to verify some selected properties of the circuit using the model checker.
Step 3. Verifying the Circuit using the Model Checker.
This step involves the model checker, which checks the truth of the specification (a formula in CTL) in the structure constructed in Step 1. The working of the Model Checker is described in the previous section.
Extended Example.
We illustrate the ideas presented so far by verifying some interesting properties of an asynchronous circuit. The example chosen for this purpose is one element of a Self-timed (FIFO) Queue, which originally appeared in an article by C. Seitz on self-timed systems [MC80]. Figure 3.1 shows an implementation of a single FIFO queue element combined with some input and output logic. This circuit is of very practical importance; in pipeline processes in which operation times are variable, increased throughput can be achieved by interconnecting the processing elements through queues. The implementation uses simple asynchronous control and hence can be used to build very fast and area-efficient queues.
The inner cell is intended to be replicated as many times as the number of words the queue is to be able to store, and the same control will operate a queue of any word length. The input cell and the output cell can be thought of as logic circuits converting the two-cycle signalling scheme at the input link to a four-cycle signalling scheme at the internal link and vice versa. The inner cell can be thought of as a latch that stores the state of the cell (i.e. whether the cell is full or empty), together with logic to generate a load signal and a set of static registers to store the bits. However, the design shown is not speed-independent, and uses the 3/2-rule. That is, one may expect misoperation if particular sets of 3 gates have a smaller cumulative propagation delay time than other sets of 2 gates.
In the following subsections we specify and verify some interesting properties of the Queue element with a single inner cell.
b. Temporal Specifications for the Self-Timed Queue Element: We give examples of the ways in which various properties of a circuit can be given in CTL. In the case of the Queue Element, some of the structural properties that we might like to specify are that the two-cycle signalling used at the input links and the output links is safe and live. Recall that the structural properties are specifications for various components and signalling schemes and thus may be considered as premises that must be true in any CTL structure modelling the circuit. Hence the request signal must satisfy the following safeness and liveness conditions. (In the following CTL specifications we will use the symbols Req and Ack for the request and the acknowledgement signals respectively.)
Safeness Conditions for the Request Signal.
These two CTL formulae essentially express that if the Req and Ack signals are nonequipotential then the Req signal will remain in its stable logic value while Ack signal is in its stable value. In other words, Req will not be given unless acknowledgement to previous request signal has arrived.
Liveness Conditions for the Request Signal.
These two CTL formulae express the property that if the Req and Ack signals are equipotential then eventually the Req signal will change its logic value, thus indicating an arrival of a request.
In a similar manner, we can specify the properties of the response signal.
Safeness Conditions for the Response Signal.
Informally, they express the fact that Ack will not be given unless there has been a Req signal to cause it.
Liveness Conditions for the Response Signal.
That is, if there had been a Req signal then eventually there will be an Ack signal in response to the request.
We can also give the safeness and the liveness properties of the FIFO Queue element in CTL. The following is a representative list of some of the properties and is by no means exhaustive. In the CTL formulae given below, ReqIn stands for request at the input links, AckIn for acknowledgement at the input links, ReqOut for request at the output links, AckOut for acknowledgement at the output links and Full1 for the state of the queue element when it holds some data.
Some Safeness Properties of the Queue Element.
This formula states that if there have been a ReqIn and a ReqOut, then AckIn will not be given until AckOut has arrived.
Some Liveness Properties of the Queue Element.
1. $\forall G(\neg(\mathit{ReqIn} = \mathit{AckIn}) \land \neg\mathit{Full1} \rightarrow \forall F(A))$
This formula states that if there has been a ReqIn, and the memory element was empty, then eventually it will be loaded with the input data.
2. $\forall G(\mathit{Full1} \rightarrow \forall F(\neg(\mathit{ReqOut} = \mathit{AckOut})))$
That is, if the Queue Element is full then eventually a request at the output links will be generated in order to move the data to the next element in the queue.
3. $\forall G((\mathit{ReqOut} = \mathit{AckOut}) \rightarrow \forall F(\neg\mathit{Full1}))$
That is, if the acknowledgement arrives at the output links, thus indicating that the data stored in the current Queue Element has been moved to the next element, then eventually the Queue Element will mark its state as empty.
In the next subsection we show how these specifications can be verified automatically by using a Model Checker.
c. Verification of the Circuit: As a first step for the verification of the circuit, we build a labelled finite state-transition graph corresponding to the circuit given in figure 3.1, using the algorithm given in section 2. For this model, we assume that each gate of the circuit has one unit delay. This is done in order to take care of the speed-dependent properties of the circuit. This is equivalent to assuming that for any state in the graph, any of the successor states is arrived at after one unit gate-delay. The label associated with each state is the set of nodes in the circuit which assume the logical value 1 in that state. The nodes of the circuit are AckIn, ReqIn, D, A, Full0, Full1, C, B, E1, E2, E3, ReqOut and AckOut. The initial state corresponds to the situation when ReqIn and AckIn as well as ReqOut and AckOut are equipotential. Now, the model checker can take a description of the model and a temporal formula specifying some property of the circuit, and determine the truth of the formula in that model. However, the circuit shown does not obey the 3/2 rule as advertised, and the model checker determines that the safeness property of the queue element given in the previous subsection is not true.
Informally, the problem can be described as follows: When an AckOut is received in response to the ReqOut signal, the AckOut signal travels via two different electrical paths, one involving three inverters and the other involving four gates. This creates a race condition and produces a glitch of about one gate delay on the ReqOut bus. Though this glitch may not always be able to drive the bus to create a spurious ReqOut, it has the potential to do so. However, this problem can be easily rectified by making the inverters slow or by putting five inverters on that path instead of three. The labelled state-transition graph for the corrected circuit is shown in figure 3.2.
The state-transition graph shown in figure 3.2 is only one portion of the complete state-transition graph for the FIFO Queue Element and corresponds to the initial state where ReqIn and AckIn are both at logical zero and ReqOut and AckOut are both at logical zero. The state in which ReqIn and AckIn are both at logical zero and ReqOut and AckOut are both at logical one cannot be reached in this state-transition graph. In fact, the state graph with this situation as the initial condition is symmetric to the one shown, and the complete state-transition graph consists of both of these components.
Figure 3.3. A sample run using the Model Checker.
A sample run using the model checker is shown in figure 3.3. In the formulae shown, A stands for $\forall$, E for $\exists$, | for $\lor$, & for $\land$, ~ for $\neg$ and -> for $\rightarrow$. Similarly, G, F, U and W stand for $G$, $F$, $U$ and $W$, respectively. The first component of "time:" is the cumulative time in 60ths of a second; the second component is the portion of the cumulative time allocated to 'garbage collection'. The number to the right of each formula gives the time taken to determine the truth of the formula.
Hierarchical Verification of Circuits.
The scheme given so far can be practical only for very small circuits, because the state-transition graph may have a number of states exponential in the number of gates. However, this problem can be avoided if circuits are verified in a hierarchical manner. That is, first the small modules are verified, and then a bigger module is verified assuming that the smaller modules it is composed of are correct. Since at any hierarchical level the number of small modules that a big module is composed of is relatively small, this method is amenable to proving correctness of large circuits without a large growth of the time complexity. Moreover, hierarchical verification permits the localization of faults to small submodules, thus allowing the designer to rectify the fault by redesigning the appropriate submodule.
In a hierarchical approach, the state-transition graph for a circuit is built out of the descriptions of the constituent submodules. We obtain a short description of a module by using an operation called 'restriction'. If $\mathcal{L}$ is the language for the module with a set of atomic propositions $\mathcal{P}$, corresponding to the input, output and internal nodes, then the operation of restriction on $\mathcal{L}$ obtains a language $\mathcal{L}'$ with atomic propositions $\mathcal{P}'$ corresponding to the input and the output nodes only.
Roughly speaking, the effect of restriction is to make the internal nodes invisible, since in building the state-transition graph for the bigger module we only require the input-output behaviour of the constituent submodules. But when the internal nodes are made invisible, certain portions of the state graph will have the same labelling of the atomic (input and output) propositions. The restriction operation defines exactly when such states can be collapsed into a single state.
Unfortunately, when we restrict a CTL structure to obtain a smaller structure, some formulae that are true in the former structure may not be true in the restricted structure. However, by appropriately constraining CTL, we can show that the formulae in the constrained logic have the desirable property that their truth is preserved with respect to the restriction operation. All of the formulae used in section 3 have the desired syntax.
Let the CTL structure for $\mathcal{L}$ be $M = (S, R, \Pi)$.
Let $\mathcal{P}$ be the set of all atomic propositions in the language $\mathcal{L}$, consisting of $I$, the set of atomic propositions corresponding to the inputs; $O$, the set of atomic propositions corresponding to the outputs; and $\mathit{Int}$, the set of atomic propositions corresponding to the internal nodes of the circuit. That is, $\mathcal{P} = I \cup O \cup \mathit{Int}$. Let $\mathcal{L}'$ be the language with the atomic propositions $\mathcal{P}'$. For a set of sets $\{u_j\}$, $\max(\{u_j\})$ will denote the set of all distinct sets in $\{u_j\}$ maximal under inclusion. We define a mapping $\rho : S \to 2^{S}$
$\Pi'(u_i) = \mathcal{P}' \cap \bigcap_{s \in u_i} \Pi(s)$.
The model $M' = (S', R', \Pi')$ is called a restriction of $M = (S, R, \Pi)$ with respect to $\mathcal{P}' \subseteq \mathcal{P}$.
In the following theorem, we show that there are CTL formulas whose truth-properties are not preserved with respect to restriction. However, there exists a large subclass of CTL formulas with the desirable property that if a formula in this subclass is satisfiable in the unrestricted CTL structure, $M$, then it is satisfiable in the CTL structure, $M'$, obtained by restriction. We call this subclass CTL$^-$.
Given a set of atomic propositions $\mathcal{P}$:
1. Every atomic proposition $P \in \mathcal{P}$ is a propositional formula in CTL$^-$.
2. If $f_1$ and $f_2$ are propositional formulae in CTL$^-$, then so are $\neg f_1$, $f_1 \land f_2$.
3. If $f_1$ is a propositional formula and $f_2$ is a CTL$^-$ formula, then $\forall[f_1\ U\ f_2]$ and $\exists[f_1\ U\ f_2]$ are CTL$^-$ formulae.
Proof. See appendix for a proof of the theorem.
Hence we see that even if the operation of restriction does not preserve all the CTL formulas, the restricted model is equivalent to the original model in terms of its behaviour. We show how to build $M'$ from $M$ in the following three steps. $M'$ is essentially a restriction of $M$ with additional optimizations and labelling of the transitions of the state-transition graph.
step 2. Construct the blocks of $M$, by first determining the dominant states using a depth-first search over the underlying graph. Build $M'$ by replacing each block by a single state. The graph can be optimized further by collapsing the "indistinguishable nodes" (i.e. nodes with the same label and successor states) into a single node.
step 3. Label the edges of the graph by the set of input signals that causes the transition and the set of output signals associated with the transition.
The transition function is $M : \Sigma \to (S \to S)$ and the output function is $N : \Sigma \to (S \to O)$.
This construction is illustrated by taking the restriction of the state-transition graph for the FIFO Queue Element shown in figure 3.2. The states shown in groups are the blocks constructed in step 2. The resulting labelled state-transition graph is shown in figure 4.2.
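The final optimization of step 2, collapsing indistinguishable nodes, can be sketched in Python as follows. This covers only that optimization, applied after hiding the internal propositions; it does not implement the block construction based on dominant states, whose definition is not reproduced here. The function name, the argument `visible` and the four-state example are assumptions made for illustration.

```python
def collapse_indistinguishable(states, edges, labels, visible):
    """Hide non-visible propositions, then repeatedly merge nodes that
    carry the same label and have the same set of successor nodes."""
    states, edges = set(states), set(edges)
    labels = {s: frozenset(labels[s]) & frozenset(visible) for s in states}
    while True:
        succ = {s: frozenset(t for (u, t) in edges if u == s) for s in states}
        groups = {}
        for s in sorted(states):
            groups.setdefault((labels[s], succ[s]), []).append(s)
        if all(len(g) == 1 for g in groups.values()):
            return states, edges, labels
        # Keep the first member of every group as its representative.
        rep = {s: g[0] for g in groups.values() for s in g}
        states = set(rep.values())
        edges = {(rep[u], rep[t]) for (u, t) in edges}
        labels = {s: labels[s] for s in states}

# Hypothetical graph: states 2 and 3 differ only in internal nodes x and y.
S1 = {1, 2, 3, 4}
R1 = {(1, 2), (1, 3), (2, 4), (3, 4), (4, 1)}
L1 = {1: {"in"}, 2: {"x"}, 3: {"y"}, 4: {"out"}}
S2, R2, L2 = collapse_indistinguishable(S1, R1, L1, visible={"in", "out"})
```

After `x` and `y` are hidden, states 2 and 3 have the same (empty) label and the same successor, so they are merged and a three-state graph remains.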
It should be mentioned that since we combine successive states in the operation of step 2, the restricted model may not be a unit-delay model even if the original unrestricted model was so. This notion is essentially captured in Theorems 4.1. and 4.2.
However, this does not pose a problem, since good design methodology forces the designer not to make the modules at higher levels in the hierarchy speed-dependent. Moreover, since speed-dependent circuits must be small enough to fit in an equipotential region, and equipotential regions must be small enough that the potential on any wire in this area will equalize in a "short" time, for any large circuit the modules at higher levels have to be speed-independent [MC80].
As the first step for verifying the correctness of a circuit using a hierarchical approach, we construct a CTL structure for a module at some hierarchical level, using the CTL structures for the submodules at the immediately lower level. In order to avoid building large-sized CTL structures, we use the restriction operation on the CTL structures of the submodules and obtain smaller descriptions of these. Moreover, the transitions of the state-transition graph are additionally labelled with the associated set of input signals and set of output signals, as explained earlier in this section.
Given two submodules A and B which are used to build a module C at a higher level by connecting the inputs and outputs of A and B, we show how to build a CTL structure for the module C using an operation called "composition". It can be shown that the composition operation is commutative and associative and hence can be generalized easily to the case where a module consists of more than two submodules. The reader may note a close analogy between the operations we define and the operations defined in [MI80].
Let the restricted models of the submodules A and B be $M_A = (S_A, R_A, \Pi_A)$ and $M_B = (S_B, R_B, \Pi_B)$, respectively. We assume that the propositions associated with A and B are renamed so that the input and output nodes of A and B that are connected have the same proposition associated with them. Furthermore, we make the important assumption that these connections are made using "short" bilateral wires.
The transition relation $R_{A \circ B}$ ($R_{A \circ B} \subseteq S_{A \circ B} \times S_{A \circ B}$) is defined as follows. Assume that there is a transition $(s_{1A}, s_{2A}) \in R_A$ such that $(s_{1A}, s_{2A})$ has associated with it the input set $\alpha$ and the output set $\beta$. Similarly, assume that there is a transition $(s_{1B}, s_{2B}) \in R_B$ such that $(s_{1B}, s_{2B})$ has associated with it the input set $\gamma$ and the output set $\delta$. Furthermore, assume that $\alpha$ is partitioned into disjoint subsets $\alpha'$ and $\alpha''$ such that $\alpha'$ is associated with the inputs of C (i.e. the input transitions for $\alpha'$ are generated externally and the transitions for $\alpha''$ are generated internally). Similarly, assume that $\gamma$ is partitioned into disjoint subsets $\gamma'$ and $\gamma''$. Then in the CTL structure for C, there will be the following transitions:
(i) if $\alpha'' = \emptyset$, then there is a transition $((s_{1A}, s_{1B}), (s_{2A}, s_{1B})) \in R_{A \circ B}$, with associated input $\alpha$ and output $\beta$;
(ii) if $\gamma'' = \emptyset$, then there is a transition $((s_{1A}, s_{1B}), (s_{1A}, s_{2B})) \in R_{A \circ B}$, with associated input $\gamma$ and output $\delta$; and
(iii) if (a) both $\alpha'' = \emptyset$ and $\gamma'' = \emptyset$, or (b) $\alpha'' \neq \emptyset$ and $\alpha'' \subseteq \delta$, or (c) $\gamma'' \neq \emptyset$ and $\gamma'' \subseteq \beta$, then there is a transition $((s_{1A}, s_{1B}), (s_{2A}, s_{2B})) \in R_{A \circ B}$, with associated input $\alpha \cup \gamma$ and output $\beta \cup \delta$.
The step of constructing the successor states for $(s_{1A}, s_{1B})$ can be thought of as simulating C at $(s_{1A}, s_{1B})$ for all possible sets of inputs and can be easily incorporated into algorithm 2.1. Now various properties of C with respect to the model $M_C$ can be determined using the model checker algorithm, as explained in the earlier sections.
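Rules (i) to (iii) can be written out as a short Python sketch. It assumes that each transition is a tuple `(source, target, inputs, outputs)` with the signal sets given as frozensets, and that the signals carried on wires between A and B are passed in as the set `internal`, so that $\alpha'' = \alpha \cap$ `internal`. It also reads rule (ii) as conditioned on B's internally generated inputs. These representation choices, the function name and the example are hypothetical and are not part of the paper.

```python
def compose(trans_a, trans_b, states_a, states_b, internal):
    """Build the labelled transitions of the composite module from A and B."""
    result = set()
    # Rule (i): A moves alone when none of its inputs is generated internally.
    for (a1, a2, alpha, beta) in trans_a:
        if not (alpha & internal):
            for b in states_b:
                result.add(((a1, b), (a2, b), alpha, beta))
    # Rule (ii): B moves alone under the symmetric condition.
    for (b1, b2, gamma, delta) in trans_b:
        if not (gamma & internal):
            for a in states_a:
                result.add(((a, b1), (a, b2), gamma, delta))
    # Rule (iii): A and B move together.
    for (a1, a2, alpha, beta) in trans_a:
        for (b1, b2, gamma, delta) in trans_b:
            a_int, g_int = alpha & internal, gamma & internal
            if ((not a_int and not g_int)
                    or (a_int and a_int <= delta)
                    or (g_int and g_int <= beta)):
                result.add(((a1, b1), (a2, b2), alpha | gamma, beta | delta))
    return result

# Hypothetical modules: A turns external input x into w, B turns w into y.
fs = frozenset
trans_a = {("a0", "a1", fs({"x"}), fs({"w"}))}
trans_b = {("b0", "b1", fs({"w"}), fs({"y"}))}
composite = compose(trans_a, trans_b, {"a0", "a1"}, {"b0", "b1"}, fs({"w"}))
joint = (("a0", "b0"), ("a1", "b1"), fs({"x", "w"}), fs({"w", "y"}))
```

In this example B cannot move on its own, because its only input `w` is produced by A; it moves only in the joint transition permitted by case (c) of rule (iii).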
Conclusion.
We have shown that it is possible to verify asynchronous circuits automatically and efficiently. We have also indicated how this method can be extended to do hierarchical verification of large and complex circuits. We believe that this approach may eventually turn out to be quite practical.
However, there are many problems that need to be addressed before this approach is made feasible in practice. In this paper we have used a unit-delay model for the circuit. Similarly, it is quite easy to use a steady-state model, in which each state in the state-transition graph corresponds to a stable state and a state change occurs only in response to an input change. While the steady-state model is useful for speed-independent self-timed circuits, the unit-delay model is needed to model properties of a speed-dependent circuit. Unfortunately, even for speed-dependent circuits the assumption that each gate has one unit gate-delay is rather unrealistic, because two similar gates may have different delays depending on process variations, fan-outs of a gate etc. Moreover, because of various capacitive effects, the delay associated with a 0-to-1 transition is not equal to the one associated with a 1-to-0 transition. We feel that it is necessary to find models that capture these properties better. Also, we do not know how to handle the effect of large fan-out, charge sharing etc. In addition, we feel that CTL is rather weak for succinctly expressing many properties of circuits. A notation based on temporal intervals [HMM83] may be more suitable for this purpose.
An interesting area for future research is the usefulness of the restriction operation in the context of hierarchical verification. We have defined a "restriction" operation and shown that the truth-properties of the CTL$^-$ formulae are preserved with respect to the operation of restriction. It appears that any weaker version of "restriction" will not result in any substantial reduction of the size of the CTL structures and hence will make hierarchical verification rather expensive. On the other hand, it seems that any stronger version of "restriction" will severely limit the class of CTL formulae that will be preserved with respect to restriction.
