Abstract. In [TAKB96], we investigated techniques for checking whether one real-time system correctly implements another and developed theory for hierarchical proofs and assume-guarantee style reasoning. In this study, using the techniques of [TAKB96], we verify the correctness of the timing of the communication chip STARI.
1 Introduction
We describe the application of the techniques and tools described in [TAKB96] to the verification of the high-bandwidth communication chip STARI [G93].
STARI (by Greenstreet [G93, G96]) is a self-timed FIFO that interfaces a transmitter and a receiver that operate at the same clock frequency but may have some skew between their clock signals (Figure 1). STARI can compensate for large, time-varying skews and makes high-bandwidth synchronous operation possible by eliminating the need for handshakes between the transmitter and the receiver. However, because there are no handshakes, certain timing properties need to be verified to show that the interface functions correctly. In particular, it needs to be shown that no data is duplicated or dropped by the interface.
The FIFO in STARI consists of a cascade of identical stages, and thus the complexity of automatically verifying a monolithic model of an n-stage STARI circuit is roughly $O(k^n)$, where k is the size of the model for a single stage. If the circuit is modeled at the gate level, k is rather large, and this limits automatic verification methods to very small n. As long as the circuit is modeled at this level of detail, improvements to verification algorithms are not likely to have a significant effect, since adding a single stage multiplies the resource requirements by k. Hence, one needs to perform verification on a more abstract representation to be able to handle larger n.
The initial proof of correctness for STARI was performed using a theorem prover ([G93, LG95]). An automatic proof was also published ([HBAB93]). Neither of these studies verified the actual circuit. They operated on simplified, abstract models for STARI which were not proven to be correct and which ignored certain aspects of the circuit implementation. Such simplifications were necessary for these approaches; otherwise the techniques would have become inapplicable or unmanageably complicated.
Our approach provides a formal guarantee of correctness for the circuit itself, and models the implementation more faithfully. We proceed as follows: (i) We construct an abstract model for one stage of STARI, which we prove to be correct in the environment that it operates in. (ii) By composing n of these abstract models, we obtain an abstract model for STARI on which we prove that the timing properties are satisfied. Step (i) implies that the properties are also satisfied by the circuit itself. To achieve (i), we made use of the following, which we had developed in [TAKB96]:
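To make the growth concrete, the following Python sketch evaluates the rough estimate of k to the power n. The function names and the stage size k = 100 used in the usage note are purely hypothetical choices for illustration; they are not measurements of the STARI model.

```python
def monolithic_size(k: int, n: int) -> int:
    """Rough size estimate k**n for a monolithic model of an n-stage FIFO.

    k is the (hypothetical) size of the model of a single stage.
    """
    if k < 1 or n < 0:
        raise ValueError("k must be >= 1 and n must be >= 0")
    return k ** n


def growth_factor(k: int, n: int) -> int:
    """Factor by which the estimate grows when one more stage is added."""
    return monolithic_size(k, n + 1) // monolithic_size(k, n)
```

For instance, with the hypothetical value k = 100, `growth_factor(100, 3)` evaluates to 100: each additional stage multiplies the estimate by k, whatever the efficiency of the underlying algorithm.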
* Supported by SRC under contract DC-324-026.
- An algorithm for checking whether one real-time system is a correct abstraction of another: In [TAKB96] we provided a sufficient condition under which a given untimed mapping preserves timed behavior, and gave an algorithm for checking whether this condition is satisfied. This algorithm was implemented as part of the verification tool Cospan.
- Assume-guarantee style reasoning for real-time systems: While proving that the abstract model for the FIFO stage was correct, we needed to make assumptions about the environment that it operates in. To discharge these assumptions in a sound way, assume-guarantee reasoning needs to be employed.

The use of multiple levels of abstraction and an inductive argument together with assume-guarantee style reasoning for (i) makes this case study an interesting combination of model checking and theorem proving. The assume-guarantee argument, the abstract model, and the mappings that relate abstract models to the gate-level descriptions need to be constructed manually, whereas the abstraction check and the verification of the timing properties on the abstract model are performed automatically by Cospan. The automation afforded by Cospan eliminates the need for oversimplified abstract models.
In Section 2 we describe the STARI circuit. Section 3 presents the verification of the timing properties and contrasts our method with previous ones. Section 4 summarizes the experience from this study and suggests further research.
2 STARI
2.1 Operation of the Interface
Most digital electronic circuits are synchronous, i.e., they make use of a clock signal to define the time step. A high-frequency clock signal can be safely distributed over relatively large distances; however, it is hard to control the exact phase of the clock at different points in the distribution network. The difference between the phases of the clock signals at two such points is referred to as skew. For systems that are not built on a single chip, such as board-level designs, ATM networks, etc., skew can be large and time-varying, which makes it a limiting factor on the performance of purely synchronous systems. Self-timed systems avoid this problem by using handshake protocols. For self-timed systems, if no assumptions are made about the delays of circuits and wires, then for each data item that is communicated between two parts of a circuit, an acknowledgment needs to be sent back to the transmitter before another data item can be sent. This can limit the communication bandwidth severely, since only one data item can be in transit at a given time, i.e., two pieces of data need to be separated by the round-trip time between the transmitter and receiver in addition to the response time of the receiver.
STARI (Self-Timed At Receiver's Input) is a hybrid scheme: it is a self-timed first-in first-out queue (FIFO) that connects a transmitter and a receiver. The two communicate as though they were part of an ideal synchronous system: in each clock period, one value is inserted into the FIFO by the transmitter and one value is removed by the receiver. Because data is inserted and removed at the same rate, no control signals are required to prevent underflow and overflow. However, because of variations in clock skew, there can be short-term fluctuations in the clock rate at the receiver or transmitter, and it can appear that one of them is working faster than the other.
STARI responds to these fluctuations by building up more data in the FIFO when the transmitter is working faster and by supplying data from the FIFO when the receiver is working faster. For correct operation of the STARI interface, the following two properties need to be proven: (i) Each data value output by the transmitter must be inserted into the FIFO before the next one is output. (ii) A new value must be output by the FIFO before each acknowledgment from the receiver. Intuitively, the longer and faster the FIFO, the more skew it can tolerate. The correctness of the properties above depends on the length of the FIFO, the clock speed, the magnitude of the skew, and the speed of operation of the FIFO stages. [G93, LG95] verify that if a certain relationship holds between these parameters, then properties (i) and (ii) hold.
In the rest of Section 2 we present the implementation of the STARI circuit and the operation of the interface.
2.2 Dual-rail Coding
In the STARI circuit, each Boolean signal x is represented by the dual-rail code depicted in Fig. 2. The "empty" value is needed to distinguish between two consecutive data items of the same value and one data value asserted for a long time. In this study, we focus on timing-related properties only and do not consider other properties which need to be proved to ensure correct operation.
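As an illustration of why the empty value is needed, the following Python sketch encodes and decodes a stream of Booleans in dual-rail form. The encoding T = (t=1, f=0) and E = (t=0, f=0) follows the description of a FIFO stage later in this section; F = (t=0, f=1), the function names, and the treatment of (1, 1) as illegal are assumptions made for this sketch.

```python
from typing import List, Tuple

Rail = Tuple[int, int]  # (t, f) rails of one dual-rail signal

T: Rail = (1, 0)  # Boolean true
F: Rail = (0, 1)  # Boolean false (assumed symmetric to T)
E: Rail = (0, 0)  # "empty" separator


def encode_stream(bits: List[bool]) -> List[Rail]:
    """Encode Booleans, following each data item by an E separator."""
    out: List[Rail] = []
    for b in bits:
        out.append(T if b else F)
        out.append(E)
    return out


def decode_stream(rails: List[Rail]) -> List[bool]:
    """Recover the Booleans: a new item starts on each change from E to T or F."""
    bits: List[bool] = []
    prev: Rail = E
    for r in rails:
        if r == (1, 1):
            raise ValueError("illegal dual-rail value (1, 1)")
        if prev == E and r != E:
            bits.append(r == T)
        prev = r
    return bits
```

With this decoding, `[T, T, T, E]` is read as a single true value held for a long time, whereas `[T, E, T, E]` is read as two consecutive true values.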
2.3 A high-level view of STARI
According to the STARI scheme, the transmitter outputs a data stream, updating the value of data_in (Fig. 1) at each rising edge of its clock input. Each time the transmitter outputs a T or an F, it then outputs an E to separate the current data item from the next one. The receiver samples its input (data_out in Fig. 1) each time its clock signal goes high, and updates its acknowledge signal (ack_in in Fig. 1) after some delay.
The FIFO consists of n identical stages, each of which holds a single data value (see Fig. 3).
2.4 One stage of the FIFO
Each stage of the FIFO consists of two Muller C-elements that hold the values of the .t and .f components of a data item, and a NOR gate that computes the acknowledge output signal of the stage (Fig. 5). A Muller C-element works as follows: when the two inputs are the same, the output takes on this value; when the inputs are different, the output retains its previous value (Fig. 4). To understand how data flows down the FIFO, first note that stage k is said to have "acknowledged" the data it holds if its ack_out output is equal to the NOR of x(k).t and x(k).f. Thus, the copying of data value E is acknowledged by asserting ack_out = 1, and data values T and F are acknowledged by ack_out = 0. Let us consider a situation where stages k and k+1 hold the value E and stage k-1 has the value T. Stage k is enabled to copy the new data from stage k-1.
We have ack_out(k+1) = ack_in(k) = 1, x(k-1).t = 1, and x(k-1).f = 0. Therefore, both inputs to the C-element that computes x(k).t are 1, while the output of the C-element, x(k).t, is still 0. After some delay, x(k).t becomes 1 while x(k).f remains unchanged, and, in this way, the value T gets copied to stage k. Transitions from a T or F to E happen similarly. Given this circuit description for STARI, we now proceed to verify the two timing properties mentioned earlier.
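The copying step just described can be replayed with a small untimed Python sketch of one stage. It assumes zero gate delays and settles the stage in one step, so it illustrates only the logic of the C-elements and the NOR gate, not the timed model verified in this study; the function names are hypothetical.

```python
from typing import Tuple

Rail = Tuple[int, int]  # (t, f) rails held by a stage


def c_element(a: int, b: int, prev: int) -> int:
    """Muller C-element: follow the inputs when they agree, else hold prev."""
    return a if a == b else prev


def nor(a: int, b: int) -> int:
    """Two-input NOR gate."""
    return 0 if (a or b) else 1


def step_stage(x_prev: Rail, x_cur: Rail, ack_in: int) -> Tuple[Rail, int]:
    """One untimed update of stage k.

    x_prev: value held by stage k-1; x_cur: value held by stage k;
    ack_in: acknowledgment received from stage k+1.
    Returns the new value of stage k and its ack_out.
    """
    t = c_element(x_prev[0], ack_in, x_cur[0])
    f = c_element(x_prev[1], ack_in, x_cur[1])
    return (t, f), nor(t, f)
```

In the situation of the text, `step_stage((1, 0), (0, 0), 1)` yields the value (1, 0), i.e. T, with ack_out = 0. If stage k-1 has moved on to E but stage k+1 has not yet copied the T (ack_in = 1), the C-element inputs disagree and stage k holds its value.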
3 Verification of STARI
3.1 Background
The formalism and tools used in the verification of STARI are described in detail in [TAKB96]. In this section, we review the essential facts.
Modeling timed systems: Timed processes. For an example of a timed process, refer to Figure 6. A timed process has a set of locations $S$, where $S_0 \subseteq S$ is designated as the set of initial locations. The sets of input and output variables are $I$ and $O$. At each location a unique value is specified for each output variable. Clocks are real-valued variables that can be reset when edges are taken, and they all increase at the same rate. Edges are conditioned on clock and input predicates. A clock predicate is a positive Boolean combination of inequalities $x \sim k$, where $k \in \mathbb{N}$ and $\sim$ is $\le$ or $\ge$. An input predicate is a condition of the form $i' = i_{new}$, $i = i_{old}$, which denotes the fact that the input variable $i$ has switched to value $i_{new}$ from $i_{old}$. At each location there is a clock predicate, called an invariant, that needs to be satisfied while the process remains at that location.
The set of input/output waveforms that a process can exhibit constitute the language of the process.
The delay model. We model the C-elements and the NOR gates used in the STARI circuit as ideal delayless elements followed by inertial delay buffers, in a fashion similar to [MP95]. The output of an inertial delay buffer follows the input with a delay in the range $[d_{min}, d_{max}]$, i.e., input transitions are reflected at the output with a delay in the given range. The timed process modeling an inertial delay buffer is given in Figure 6. If an input pulse lasts less than $d_{min}$, it is not reflected at the output. If it lasts longer than $d_{max}$, then it is guaranteed to cause a pulse at the output. Pulses lasting between $d_{min}$ and $d_{max}$ may or may not result in an output pulse: suppose a transition occurs at the input, and before the corresponding output transition takes place (shaded locations in Fig. 6), the input returns to its original value. Then the output can either remain unchanged (by taking the edges marked $x \le d_{max}$ from the shaded locations) or reflect both input transitions (by taking the edges marked $d_{min} \le x \le d_{max}$).
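The three cases for an input pulse can be summarized in a few lines of Python. This is only a classification of pulse widths under the inertial delay model described above, not the timed process of Figure 6; the function name and the returned labels are hypothetical.

```python
def pulse_outcome(width: float, d_min: float, d_max: float) -> str:
    """Classify an input pulse of the given width at an inertial delay buffer.

    Returns "filtered" if the pulse cannot appear at the output,
    "propagated" if it is guaranteed to appear, and "either" if the
    model allows both behaviors.
    """
    if not 0 <= d_min <= d_max:
        raise ValueError("require 0 <= d_min <= d_max")
    if width < d_min:
        return "filtered"
    if width > d_max:
        return "propagated"
    return "either"
```

For example, with delay bounds 1 and 2, a pulse of width 0.5 is filtered, a pulse of width 3 is propagated, and a pulse of width 1.5 may or may not reach the output.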
Abstractions of timed systems. In [TAKB96], we examined three notions for a timed system being a correct abstraction (or implementation) of another: (i) timed language inclusion, (ii) timed simulation, and (iii) the existence of timed behavior preserving mappings. We write $A \preceq B$ to denote the fact that $A$ implements $B$. For timed processes $A$ and $B$, a mapping between the locations $h : S_A \to S_B$ is said to preserve timed behavior iff, for each run of $A$, the image of the run under $h$ is a run of $B$. The existence of such a map implies that $B$ is a correct abstraction of $A$. We gave an algorithm for checking this condition and implemented it as part of Cospan. We use this algorithm extensively at several stages of the verification of STARI.
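To convey the flavor of checking a mapping h, the following Python sketch tests an untimed analogue of the condition: every reachable edge of A must map under h either to an edge of B or to a single location of B. It ignores clocks, invariants, and input/output variables entirely, and the decision to allow stuttering steps is an assumption of this sketch. It is not the algorithm of [TAKB96] and does not use any Cospan interface.

```python
from typing import Dict, Hashable, List, Set, Tuple

Loc = Hashable
Edge = Tuple[Loc, Loc]


def preserves_untimed_behavior(
    init_a: Set[Loc],
    edges_a: Set[Edge],
    init_b: Set[Loc],
    edges_b: Set[Edge],
    h: Dict[Loc, Loc],
) -> bool:
    """Check that h maps every reachable run of A onto a run of B (untimed)."""
    # Initial locations of A must map to initial locations of B.
    if any(h[s] not in init_b for s in init_a):
        return False
    succ: Dict[Loc, Set[Loc]] = {}
    for s, t in edges_a:
        succ.setdefault(s, set()).add(t)
    seen: Set[Loc] = set(init_a)
    stack: List[Loc] = list(init_a)
    while stack:
        s = stack.pop()
        for t in succ.get(s, ()):
            # The image of an edge is either a stutter or an edge of B.
            if h[s] != h[t] and (h[s], h[t]) not in edges_b:
                return False
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return True
```

A detailed process with locations a0, a1, a2 on a cycle is, for instance, accepted against a two-location abstraction b0, b1 when h collapses a0 and a1 onto b0 and the abstraction has edges in both directions.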
Compositional and assume-guarantee style reasoning. An implementation relation $\preceq$ is compositional iff the following holds:

$$\text{for all } i,\ 1 \le i \le n,\ R_i \preceq A_i \quad \text{implies} \quad \left(\parallel_{1 \le i \le n} R_i\right) \preceq \left(\parallel_{1 \le i \le n} A_i\right).$$

With the stronger assume-guarantee style reasoning, one can prove $\left(\parallel_{1 \le i \le n} R_i\right) \preceq \left(\parallel_{1 \le i \le n} A_i\right)$ by proving, for all $i$, $1 \le i \le n$, that

$$A_1 \parallel \cdots \parallel A_{i-1} \parallel R_i \parallel A_{i+1} \parallel \cdots \parallel A_n \;\preceq\; A_1 \parallel \cdots \parallel A_{i-1} \parallel A_i \parallel A_{i+1} \parallel \cdots \parallel A_n.$$

This style of reasoning is often more useful since, while showing $R_i \preceq A_i$, one often needs to make assumptions about the environment that $R_i$ and $A_i$ operate in, and $A_1, \ldots, A_{i-1}$ and $A_{i+1}, \ldots, A_n$ encapsulate the strongest such set of assumptions.
In [TAKB96] it was shown that timed language inclusion, denoted by $\preceq_L$, is compositional and that assume-guarantee style reasoning can be used in conjunction with it correctly, provided that all timed processes are non-blocking. The other two implementation relations mentioned imply timed language inclusion, and thus one can apply assume-guarantee style reasoning using these relations as well.
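The shape of the assume-guarantee proof obligations can be spelled out mechanically. The Python sketch below only generates the two sides of each obligation as strings, writing `||` for parallel composition; the function name and the textual format are hypothetical, and no checking is performed.

```python
from typing import List, Tuple


def assume_guarantee_obligations(n: int) -> List[Tuple[str, str]]:
    """Return, for i = 1..n, the pair (left side, right side) to be related.

    The left side replaces A_i by the detailed model R_i and keeps the
    abstract models of all other components as the environment.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    rhs = " || ".join(f"A{j}" for j in range(1, n + 1))
    obligations = []
    for i in range(1, n + 1):
        lhs = " || ".join(
            f"R{j}" if j == i else f"A{j}" for j in range(1, n + 1)
        )
        obligations.append((lhs, rhs))
    return obligations
```

For n = 3, the second obligation relates `A1 || R2 || A3` to `A1 || A2 || A3`: the detailed model of component 2 is checked in an environment made up of the abstract models of its neighbors.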
3.2 Verification Steps
The verification of STARI consisted of the following two steps:
- Constructing an abstraction for a FIFO stage and verifying its correctness within the environment in which it operates.
- Verifying properties (i) and (ii) of Section 2 using the abstract model for the entire circuit.

The latter of these steps was performed using the existing timing verification capabilities of Cospan ([AK96]). The former step is the novel part of our approach and will be detailed below.
The abstract model for a stage. The abstract model for the FIFO stage describes its behavior at a high level, as in Section 2.3, and expresses bounds on certain response times. The abstract model $A$ (depicted in Figure 7) makes use of only one clock variable. Let us focus on stage k. At location stable, the FIFO stage has read and acknowledged its current input. If new data arrives at the inputs of stage k, $A$ moves to location wait_for_ack, waiting for stage k+1 to acknowledge having copied the current data. If stage k+1 sends an acknowledgment before the new data arrives at stage k, $A$ moves from location stable to location wait_for_data instead. After both new data and an acknowledgment for the old data from stage k+1 have been received, $A$ moves to location out_pend, from where, after some delay, it moves to ack_out_pend and copies the new data to its output. Again after some delay, an acknowledgment is output.

Note that it has been possible to capture the timing information about the stage using one clock only, since only one of the three circuit elements forming the stage can have a pending output change at any given time. Intuitively, this is guaranteed by the fact that the inputs to a stage will not change unless the stage has acknowledged the previous inputs. This assumption about the environment of a stage is crucial for the correctness of the abstraction, and is taken into account in our verification by the use of assume-guarantee reasoning.
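The location structure of the abstract stage can be sketched as a small untimed state machine in Python. The location names follow the text, written with underscores; the event names, the omission of the clock and its bounds, and in particular the final edge from ack_out_pend back to stable are assumptions of this sketch, since Figure 7 is not reproduced here.

```python
from typing import Dict, List, Tuple

# (location, event) -> next location
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("stable", "data"): "wait_for_ack",     # new data arrives first
    ("stable", "ack"): "wait_for_data",     # stage k+1 acknowledges first
    ("wait_for_ack", "ack"): "out_pend",
    ("wait_for_data", "data"): "out_pend",
    ("out_pend", "delay"): "ack_out_pend",  # new data copied to the output
    ("ack_out_pend", "delay"): "stable",    # assumed: acknowledgment output
}


def run_stage(events: List[str], start: str = "stable") -> List[str]:
    """Return the sequence of locations visited for a sequence of events."""
    loc = start
    trace = [loc]
    for ev in events:
        key = (loc, ev)
        if key not in TRANSITIONS:
            raise ValueError(f"event {ev!r} is not enabled at {loc!r}")
        loc = TRANSITIONS[key]
        trace.append(loc)
    return trace
```

Both orders of arrival lead through out_pend: `run_stage(["data", "ack", "delay", "delay"])` and `run_stage(["ack", "data", "delay", "delay"])` each return to stable. A second "data" event before the acknowledgment is rejected by the sketch, which mirrors the environment assumption that inputs do not change until the stage has acknowledged the previous ones.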
Let $F$ denote the timed process describing one stage of the FIFO at the gate level. $F$ is the composition of processes representing two Muller C-elements and a NOR gate, as described by Figure 5 and Section 3.1. Let $F_i$ and $A_i$ denote the detailed and abstract models for the $i$th FIFO stage; $F_i$ and $A_i$ have structures identical to those of $F$ and $A$. Also let $Tx$ and $Rx$ be the timed processes describing the transmitter and the receiver. Refer to [W3] for Cospan models of $Tx$ and $Rx$. We would like to prove that

$$Tx \parallel F_1 \parallel F_2 \parallel \cdots \parallel F_n \parallel Rx \;\preceq_L\; Tx \parallel A_1 \parallel A_2 \parallel \cdots \parallel A_n \parallel Rx \qquad (1)$$

which will enable us to prove properties using the abstract description for the circuit given by the right-hand side. We would like to achieve this by showing that, for all $i$, $A_i$ is a correct abstraction for $F_i$. As noted above, this is true only within the environment that $F_i$ and $A_i$ operate in. For an arbitrary environment, $F_i \preceq_L A_i$ does not hold: if the inputs to $F_i$ are unconstrained, then $F_i$ does not behave like a FIFO element. Therefore, we need to employ assume-guarantee reasoning to carry out the proof. We must prove, for all $i$, $1 \le i \le n$, that

$$Tx \parallel A_1 \parallel \cdots \parallel A_{i-1} \parallel F_i \parallel A_{i+1} \parallel \cdots \parallel A_n \parallel Rx \;\preceq_L\; Tx \parallel A_1 \parallel \cdots \parallel A_n \parallel Rx.$$

For both steps we specify mappings from the locations of the left-hand side process to those of the right-hand side process, and show that this mapping preserves timed behavior using the Cospan implementation mentioned in Section 3.1. Cospan code describing the modules and the mappings is provided at [W3]. The essential feature of $E_{right}$ is that it samples the data at its inputs periodically and, after a certain delay, acknowledges having read the data. We proved that $E_{right}$ is a correct abstraction in exactly the same manner as $E_{left}$. The Cospan code for $E_{right}$ and the untimed mappings can be found at [W3].
Proving that $A$ is a correct abstraction for $F$. Given $E_{left}$ and $E_{right}$, it was rather straightforward to prove that

$$E_{left} \parallel F \parallel E_{right} \;\preceq\; E_{left} \parallel A \parallel E_{right}.$$

Cospan code for the untimed mapping is given in [W3].

Time and memory consumption. We report the resource usage for the checks listed in Table 1. For all of these checks the following parameters were used: $d_{c,min} = 1$, $d_{c,max} = 2$, $d_{nor,min} = 1$, $d_{nor,max} = 2$, and a clock period of 12. The delays from the clock to the transmitter and the receiver (see Figure 1) were allowed to be time-varying and bounded by 1 time unit. To serve as a comparison, we tried to verify properties (i) and (ii) of Section 2 using the gate-level model for STARI, i.e., using $F$ as the model for a FIFO stage. At this level of detail, we ran out of space using 1 GB of memory for a three-stage FIFO. Improvements to our timing verification algorithm could enable us to verify larger FIFOs at the gate level; however, any method working on a monolithic description at the circuit level is bound to run out of resources for a FIFO with a large enough number of stages. Using abstractions as we demonstrated above, one is able to handle larger FIFOs, although not an unbounded number of stages, since the number of FIFO elements still enters into the computation (VI).

(Footnote 4) Some technicality is involved here. To prove the basis case, we do not allow the transmitter to change its data output if the first FIFO stage has not copied the previous value. Later on, while proving that the interface works correctly, we prove that it is never the case that the transmitter wants to modify the data output while the FIFO is not ready to receive new data.

Comparison with previous approaches. The most important benefit of our approach is the correctness guarantee that it provides for the actual circuit, whereas [LG95] and [HBAB93] based their proofs on oversimplified abstractions of the circuit.
Their tools do not provide a mechanism for checking the correctness of abstractions; therefore there is no formal guarantee that the properties they proved at a high level are satisfied by the circuit.
[LG95] and [HBAB93] neglect the time taken by the NOR gate (see Figure 5) to compute the acknowledgment output and, furthermore, use a delay model that is less realistic than the inertial delay model that we employ. Since verification of properties on the abstract model is performed automatically in our approach, we did not need to resort to such simplifications.
The proof of [LG95] is for all FIFO lengths n, and makes it easier to see the trade-off between circuit parameters, whereas our approach is still limited by n. However, the proof of [LG95] is rather involved, and one needs to have an in-depth understanding of why the properties are satisfied. The abstraction proofs and environment abstractions that we used were rather straightforward and intuitive.
4 Conclusion
We demonstrated the use of compositionality and hierarchy in the verification of the STARI communication circuit. By using the timed refinement checking algorithm that we had implemented as part of Cospan in a compositional framework, we were able to divide the verification problem into smaller pieces, which enabled us to automatically verify a larger circuit than was previously possible.
This case study demonstrated once more that abstractions are indispensable for verifying large systems, and that compositional and assume-guarantee style reasoning are not only useful techniques for verifying the correctness of abstractions, but are almost always necessary.
The size of our abstract model for STARI, $Tx \parallel A_1 \parallel \cdots \parallel A_n \parallel Rx$, still has an exponential dependency on n, although less severe than that of the gate-level model. One problem that remains is the construction of an abstract model for STARI that is parametrized with respect to n. One would then prove the correctness of this abstraction using induction. Parametrized real-time systems have been studied before ([AHV93]), and there is indication that this problem is rather complex.
