Abstract. Asynchronous circuit design is a discipline in which the theory of concurrency is applied to hardware design. This paper presents an overview of a design framework in which Petri nets are used as the main behavioral model for specification. Techniques for synthesis, analysis and formal verification of asynchronous circuits are reviewed and discussed.
Introduction
Finite state machines have been the most traditional model of computation for sequential circuits [25, 26]. This is a state-based model in which the system, being in a state, reads some inputs, writes some outputs and moves to another state. Time is discretized by the notion of cycle, which is the time the system takes to move from one state to another. This model is appropriate to derive circuit implementations with a periodic signal, the clock, that dictates the time instants at which the system changes state. The cycle is the finest degree of granularity at which operations are scheduled. Thus, two operations are concurrent if they are scheduled at the same cycle. The cycle delay is determined by the worst-case delay the circuit takes to perform the operations scheduled at any of the cycles. Synchronization among operations is implicit, i.e. an operation always starts at a clock edge and completes after a fixed number of cycles at another clock edge. Thus, clock edges indicate the initiation and completion of several actions simultaneously.
After a long period of hibernation, asynchronous circuits woke up fifteen years ago as a potential solution to some of the design problems posed by VLSI technologies [10]. In asynchronous circuits, the sequencing of operations is no longer dictated by a clock, but by events that indicate the initiation and completion of individual actions. The correctness of an asynchronous circuit depends not only on its structure, but also on the timing behavior of the individual gates and their interaction. Thus, a circuit can be modeled as a set of processes (gates) that communicate through channels (wires) and modify the state of the system represented by a set of Boolean variables (signals). A gate is enabled when the value of its output is different from the value calculated by its logic function. An enabled gate can produce a transition (event) on the output signal by changing its value.
The view of an asynchronous circuit as a concurrent system makes event-based models of computation more appropriate for analysis and synthesis. For this reason, process algebras, such as CSP [16], and Petri nets [34] have attracted the interest of researchers in this discipline.
This paper presents an overview of a design methodology for asynchronous circuits that uses Petri nets as the underlying model for specification, synthesis and verification. Section 2 explains how the behavior of an asynchronous circuit can be specified by using Petri nets. Section 3 presents a set of sufficient properties for a specification to be implementable as a speed-independent circuit. The techniques to derive an implementation with logic gates are described in Sect. 4. The retrieval of the actual circuit behavior as a Petri net is known as back-annotation and is presented in Sect. 5. Finally, different strategies to fight against the state explosion problem in analysis and formal verification are reviewed in Sect. 6.
Timing diagrams, Petri nets and Signal Transition Graphs
For most circuit designers, Petri nets resemble timing diagrams, a model to specify asynchronous interfaces as signal waveforms that explicitly indicate the causality and concurrency relations among signal transitions. Figure 1(a) depicts the block diagram of a VME bus controller. According to its functionality, the controller has three sets of "handshake" signals: those interacting with the bus (DSr, DSw and DTACK), those interacting with the device connected to the bus (LDS and LDTACK), and the one controlling the transceiver that connects the bus with the device (D).
The behavior of the controller is as follows: a request to read from or write into the device is received through one of the signals DSr or DSw, respectively. In a read cycle, a request to read is issued through signal LDS. When the device has the data ready (LDTACK), the controller must open the transceiver to transfer data to the bus (signal D). In a write cycle, data is first transferred to the device. Next, a request to write is issued (LDS). Once the device acknowledges the reception of the data (LDTACK), the transceiver must be closed to isolate the device from the bus. Each transaction must be completed by a return-to-zero of all interface signals, seeking maximum parallelism between the bus and the device operations.
[...] is determined by the system and is the one that must be implemented by the circuit. Typically, internal signals are incorporated during the synthesis of the circuit to solve some implementation problems (encoding, decomposition) and do not appear in the original specification of the system.
Figure 2 depicts the STG that describes the complete behavior of the controller. Unlike timing diagrams, STGs can specify choice and non-determinism. In this example, the initial marking models the situation in which the environment can non-deterministically choose to initiate a read cycle by firing DSr+, or a write cycle by firing DSw+. Usually, places with only one predecessor and one successor transition are not explicitly drawn in STGs.
This paper presents a methodology to synthesize asynchronous circuits from STGs. This methodology has been completely automated and implemented in a synthesis tool called petrify [35]. The specification of the READ cycle shown in Fig. 1(c) will be used as an example to illustrate this methodology throughout the paper. For the sake of simplicity, the WRITE cycle will be ignored.
Implementability properties
The goal of the synthesis methodology is to derive a speed-independent circuit that realizes the specified behavior. Speed independence is a property that guarantees a correct behavior under the assumption that all gates have an unbounded delay and all wires have a negligible delay [28].
A specification must fulfil certain properties to be implementable as a speed-independent circuit. These properties can be better described on the state graph of the specification.
State Graph
The state graph (SG) of a specification is the transition system obtained from the reachability analysis of an STG. Each state corresponds to a marking of the STG and each arc corresponds to the firing of a signal transition. Figure 3 shows the SG of the read cycle specified in Fig. 1(c). In the SG, each state is assigned a binary vector with the value of all signals at that state. For the sake of readability, the control signals corresponding to the left handshake, right handshake and data transceiver are separated by dots. Enabled signals are marked with an asterisk. For example, the state corresponding to the marking {p7 [...]
Properties for implementability
The properties required for the specification to be implementable as a speed-independent circuit are the following:
- Persistency of signal transitions, in such a way that no signal transition can be disabled by another signal transition, unless both signals are inputs. This property ensures that no short glitches, known as hazards, will appear at the disabled signals.
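As an illustration, the persistency property can be checked directly on an explicit SG. The following Python fragment is only a minimal sketch: the dictionary encoding of the SG, the function names and the toy examples (one input a and one output x, unrelated to the VME controller) are assumptions made for this illustration and are not taken from petrify.

```python
def signal_of(event):
    """Return the signal name of an event such as 'LDS+' or 'D-'."""
    return event[:-1]


def persistency_violations(sg, inputs):
    """List (state, disabled_event, disabling_event) triples.

    sg     -- hypothetical encoding of an SG: {state: {event: next_state}}
    inputs -- set of input signal names
    """
    violations = []
    for state, arcs in sg.items():
        for fired, target in arcs.items():
            for other in arcs:
                if other == fired:
                    continue
                if signal_of(fired) in inputs and signal_of(other) in inputs:
                    continue  # a choice between two inputs is allowed
                if other not in sg.get(target, {}):
                    # 'other' was enabled before 'fired' but not after it
                    violations.append((state, other, fired))
    return violations


# Toy SGs (assumed for illustration): input a, output x.
diamond = {"s0": {"a+": "s1", "x+": "s2"}, "s1": {"x+": "s3"},
           "s2": {"a+": "s3"}, "s3": {}}
broken = {"s0": {"a+": "s1", "x+": "s2"}, "s1": {},
          "s2": {"a+": "s3"}, "s3": {}}

print(persistency_violations(diamond, {"a"}))  # []
print(persistency_violations(broken, {"a"}))   # [('s0', 'x+', 'a+')]
```

In the second SG the output event x+ is disabled by the input event a+, which is the situation that may produce a hazard on x.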
The SG of Fig. 3 fulfils boundedness, consistency and persistency. However, it does not have completeness of state encoding. The states corresponding to the markings {p4} and {p2, p8} have the same code. Moreover, the behavior of the output signals in those states is different. In the state {p4}, the event D+ is enabled, whereas in the state {p2, p8}, the event LDS- is enabled. Intuitively, this means that the information provided by the value of the signals is not enough to determine the future behavior of the system. This will result in an ambiguity in the definition of the next-state logic functions.
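The kind of conflict just described can be detected by grouping states by their binary code and comparing the enabled non-input events. The sketch below assumes a hypothetical encoding of the SG as two dictionaries; the code string "10110" is a made-up placeholder and is not read from Fig. 3, and only the events named in the text are listed for the two states.

```python
from collections import defaultdict


def csc_conflicts(codes, enabled, inputs):
    """Return (state, state, code) triples that violate complete state coding.

    codes   -- {state: binary code as a string}
    enabled -- {state: set of enabled events such as 'D+' or 'LDS-'}
    inputs  -- set of input signal names
    """
    by_code = defaultdict(list)
    for state, code in codes.items():
        by_code[code].append(state)

    def non_input_events(state):
        return {e for e in enabled[state] if e[:-1] not in inputs}

    conflicts = []
    for code, states in by_code.items():
        for i, first in enumerate(states):
            for second in states[i + 1:]:
                # Same code, different behavior of the non-input signals.
                if non_input_events(first) != non_input_events(second):
                    conflicts.append((first, second, code))
    return conflicts


# Placeholder data: the code value is hypothetical.
codes = {"{p4}": "10110", "{p2,p8}": "10110"}
enabled = {"{p4}": {"D+"}, "{p2,p8}": {"LDS-"}}
print(csc_conflicts(codes, enabled, inputs={"DSr", "LDTACK"}))
# [('{p4}', '{p2,p8}', '10110')]
```

Two states that share a code but differ only in their enabled input events are not reported, since the environment, not the circuit, decides those events.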
Logic Synthesis
The goal of logic synthesis is to derive a gate netlist that implements the behavior defined by the specification. For simplicity, we will illustrate this step by synthesizing a speed-independent circuit for the read cycle of the VME bus (see Fig. 3).
The main steps in logic synthesis are the following:
- Encode the SG in such a way that the complete state coding property holds. This may require the addition of internal signals.
- Derive the next-state functions for each output and internal signal of the circuit.
- Map the functions onto a netlist of gates.
Complete State Coding
As mentioned in Sect. 3.2, the SG of Fig. 3 
Next-State Functions
When an SG fulfils all the implementability properties, a next-state function can be derived for each non-input signal.
Given a signal z, we can classify the states of the SG into four sets: positive and negative excitation regions (ER(z+) and ER(z-)) and quiescent regions (QR(z+) and QR(z-)).
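The classification and the resulting next-state function can be sketched in Python as follows. The fragment uses the usual definitions: ER(z+) and ER(z-) contain the states in which z+ and z- are enabled, QR(z+) the states in which z is stable at 1, and QR(z-) those in which it is stable at 0. The function is 1 on the codes of ER(z+) and QR(z+), 0 on the codes of ER(z-) and QR(z-), and a don't care elsewhere. The data encoding, the function names and the toy SG (an output x that follows an input a) are assumptions made for this illustration.

```python
def classify(codes, enabled, index, name):
    """Split the states into ER+/ER-/QR+/QR- for the signal `name`.

    codes   -- {state: binary code as a string}
    enabled -- {state: set of enabled events}
    index   -- position of the signal in the code
    """
    regions = {"ER+": set(), "ER-": set(), "QR+": set(), "QR-": set()}
    for state, code in codes.items():
        value = code[index]
        if value == "0" and name + "+" in enabled[state]:
            regions["ER+"].add(state)   # z = 0 and about to rise
        elif value == "1" and name + "-" in enabled[state]:
            regions["ER-"].add(state)   # z = 1 and about to fall
        elif value == "1":
            regions["QR+"].add(state)   # z stable at 1
        else:
            regions["QR-"].add(state)   # z stable at 0
    return regions


def next_state_function(codes, regions):
    """Return (on_set, off_set) of codes; all other codes are don't cares."""
    on_set = {codes[s] for s in regions["ER+"] | regions["QR+"]}
    off_set = {codes[s] for s in regions["ER-"] | regions["QR-"]}
    if on_set & off_set:
        raise ValueError("state coding conflict: function is ambiguous")
    return on_set, off_set


# Toy SG (assumed): codes are ordered as (a, x).
codes = {"s0": "00", "s1": "10", "s2": "11", "s3": "01"}
enabled = {"s0": {"a+"}, "s1": {"x+"}, "s2": {"a-"}, "s3": {"x-"}}
on_set, off_set = next_state_function(codes, classify(codes, enabled, 1, "x"))
print(sorted(on_set), sorted(off_set))  # ['10', '11'] ['00', '01']
```

In this toy example the on-set contains exactly the codes with a = 1, so the next-state function of x reduces to x = a.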
A state belongs to ER(z+) if z = 0 and z+ is enabled in that state. In this situation, the value of the signal is denoted by 0* in the SG [...]
Once the next-state function has been derived, Boolean minimization can be performed to obtain a logic equation that implements the behavior of the signal. In this step it is crucial to make efficient use of the don't care conditions derived from those binary codes not corresponding to any state of the SG. For the example of Fig. 4, the following equations can be obtained:
A well-known result in the theory of asynchronous circuits is that any circuit implementing the next-state function of each signal with only one atomic complex gate is speed independent. By atomic gate we mean a gate without internal hazardous behavior [18, 22]. Two possible hazard-free gate mappings for the next-state function of the READ cycle example are shown in Fig. 5. Decomposing the circuit into a set of simpler gates that can be implemented in a given technology is a problem that has been studied by several authors [2, 4, 7, 20].
Size of the state space
The derivation of Boolean equations from a specification requires calculating the encoding of all the states of the system. Unfortunately, the size of the state space of a concurrent system can be exponential in the size of the specification.
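The exponential growth can be observed on a very small family of nets. The sketch below enumerates the reachable markings of a safe Petri net by explicit breadth-first traversal and applies it to n independent transitions, which yield 2^n markings. The encoding of transitions as pairs of place sets and the helper parallel_net are assumptions made for this illustration, and the firing rule shown is valid only for safe (1-bounded) nets.

```python
from collections import deque


def reachable_markings(transitions, initial):
    """Explicit reachability analysis of a safe Petri net.

    transitions -- {name: (set of input places, set of output places)}
    initial     -- set of initially marked places
    """
    start = frozenset(initial)
    seen = {start}
    queue = deque([start])
    while queue:
        marking = queue.popleft()
        for pre, post in transitions.values():
            if pre <= marking:  # the transition is enabled
                successor = frozenset((marking - pre) | post)
                if successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
    return seen


def parallel_net(n):
    """Hypothetical net with n concurrent transitions t_i: p_i -> q_i."""
    transitions = {f"t{i}": (frozenset({f"p{i}"}), frozenset({f"q{i}"}))
                   for i in range(n)}
    initial = {f"p{i}" for i in range(n)}
    return transitions, initial


for n in (1, 2, 4, 8):
    print(n, len(reachable_markings(*parallel_net(n))))
# 1 2
# 2 4
# 4 16
# 8 256
```

The specification grows linearly with n (n transitions and 2n places), whereas the number of markings doubles with every added transition.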
The existing tools for synthesis of asynchronous circuits use different methods to fight against the state explosion problem. In burst-mode automata, concurrency is restricted in such a way that bursts of input and output events are serialized [31]. Thus, the behavior can be represented by a Mealy-like automaton with a manageable number of states in which the concurrency is annotated as input/output bursts on the arcs of the automaton.
When no constraints are imposed on the type of concurrency manifested by the system, the knowledge and manipulation of the state space usually becomes the dominant part of the complexity of the synthesis algorithms. This is the case when Petri nets are used as the specification formalism. Different strategies have been used to calculate the state space:
Back-annotation
As important as the structure of the circuit resulting from the synthesis of the specification is the actual behavior of the circuit. During synthesis, internal signals might have been added to encode the states and decompose complex gates. On the other hand, although it may seem paradoxical, reducing concurrency [8] is one of the proposed approaches to improve the efficiency of the final circuit. Reducing concurrency directly results in a reduction of the state space and, thus, in an increase of the don't care conditions for logic minimization. In general, concurrency reduction produces smaller circuits, but it may also produce faster circuits: the system manifests less concurrency but the events take less time to fire. In the synthesis flow, signal insertion and concurrency reduction are usually performed at the level of the SG. Providing a behavioral description of the synthesized circuit with the same formalism used for specification, e.g. Petri nets, allows the designer to easily interact with the synthesis framework and manually introduce those optimizations that automatic tools cannot find.
The problem of deriving a Petri net from a transition system was first tackled in [14] and studied by other authors [1, 12]. In these works, the theory of regions was developed to characterize the class of transition systems that correspond to elementary Petri nets. That work was extended in [9] to propose algorithms for the synthesis of safe Petri nets from any finite transition system. This work was crucial to provide the synthesis tool petrify [35] with the capability of deriving a succinct behavioral description of the synthesized circuits.
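The basic notion behind this theory can be illustrated with a short sketch. Under the usual definition, a set of states is a region if every event behaves uniformly with respect to it: all arcs labeled with the event enter the set, or all exit it, or none crosses its border. Regions play the role of places in the derived net. The list-of-arcs encoding and the toy transition system below (two concurrent events a and b) are assumptions made for this illustration and do not reproduce the algorithms of [9].

```python
def is_region(arcs, subset):
    """Check whether `subset` is a region of a transition system.

    arcs   -- list of (source_state, event, target_state)
    subset -- set of states
    """
    behavior = {}
    for source, event, target in arcs:
        if source not in subset and target in subset:
            kind = "enter"
        elif source in subset and target not in subset:
            kind = "exit"
        else:
            kind = "no-cross"
        # Every arc of the same event must behave in the same way.
        if behavior.setdefault(event, kind) != kind:
            return False
    return True


# Toy transition system (assumed): events a and b are concurrent.
arcs = [("s0", "a", "s1"), ("s0", "b", "s2"),
        ("s1", "b", "s3"), ("s2", "a", "s3")]

print(is_region(arcs, {"s0", "s2"}))  # True: every a-arc exits, no b-arc crosses
print(is_region(arcs, {"s0"}))        # False: one a-arc exits, the other does not cross
```

The region {s0, s2} corresponds to a place that is an input place of transition a and is not connected to b.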
An example of back-annotation is presented in Fig. 6. The circuit is an implementation of the READ cycle specified in Fig. 1(c). In this implementation, only combinational 2-input gates have been used. With respect to the original specification, two new signals have been incorporated: csc0 to uniquely encode the states, as shown in Fig. 4, and map0 to decompose a complex gate into smaller gates. By using the theory of regions, the Petri net of Fig. 6(b) is automatically synthesized and shown to the designer for analysis of the circuit's behavior.
Analysis and formal veri cation
Analysis and formal verification are used at different stages of the design flow of asynchronous circuits. In particular for:
- Property verification. After specifying the design, it is required to check implementability properties to answer the following question: "Can the specification be implemented with an asynchronous circuit?" [18, 19]. Other properties of the specification can be of interest as well, e.g., absence of deadlocks, fairness in serving requests, etc. General-purpose verification techniques can be employed for this analysis [23].
- Implementation verification. After the design has been done fully automatically or with some manual intervention, it is often desirable to check that the implementation conforms to the given specification [13, 37].
- Performance analysis and separation between events are required (a) for determining the latency and throughput of the circuit and (b) for logic optimization based on timing information [17, 30].
Techniques
As mentioned in Sect. 4.3, the state space of a concurrent specification is one of the major bottlenecks for the analysis of this type of system. Here we present some techniques that have been successfully applied in the area of formal verification of asynchronous circuits.
- Structural properties of Petri nets (e.g., place invariants) can provide a fast upper approximation of the reachability space [6, 11, 29] and can also be used for dense variable encoding of states in the reachability graph. Structural reductions are useful as a preprocessing step in order to simplify the structure of the net before traversal or analysis, keeping all important properties.
- Unfoldings [19, 24] are finite acyclic prefixes of the Petri net behavior, representing all reachable markings. They are often more compact than the reachability graph and well-suited for extracting ordering relations between places and transitions (concurrency, conflict and precedence). Different types of unfoldings are also used for performance analysis [17].
As an example, we next illustrate how structural properties of a Petri net and BDD-based representations can be combined for an efficient analysis of the state space. Figure 7 is the result of applying linear reductions to the STG from Fig. 2. Using more elaborate reductions (place and transition fusions) it is possible to reduce the whole Petri net from Fig. 1(c) to a single self-loop transition [29].
The BDD-based methods used for deriving the transition function and calculating the reachable markings of a Petri net are similar to those used for reachability analysis and equivalence checking of finite state machines [23]: starting from the initial marking and by iteratively applying the transition relation until a fixed point is reached, the characteristic function of the reachability set is [...] (e.g., by using BDDs), then the conjunction of these two functions gives, for this example, the characteristic function of the exact reachability set of markings.
In general, a conjunction of any set of invariants gives an upper approximation of the reachability set, which is useful for conservative verification. On the other hand, by using the previous invariants, a dense encoding for places with the vector of Boolean variables V = (v0, v1, v2, v3) can be proposed (see Table 1). [...] It can be easily observed that this logic proposition is a contradiction, thus proving that there is no reachable marking in which D and F are enabled.
These techniques have been successfully applied to verify concurrent systems specified with Petri nets [32].
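The two ideas discussed in this section, fixed-point traversal and approximation by invariants, can be sketched as follows. This is only an illustration under stated assumptions: plain Python sets of markings replace the BDD-based characteristic functions, the net is a hypothetical safe net made of two independent two-place cycles (it is not the net of Fig. 7), and invariants are restricted to unit weights, i.e. "the places in this set hold exactly k tokens".

```python
from itertools import product


def fixed_point_reachability(transitions, initial):
    """Apply the transition relation to the frontier until nothing new appears."""
    reached = {frozenset(initial)}
    frontier = set(reached)
    while frontier:
        image = set()
        for marking in frontier:
            for pre, post in transitions:
                if pre <= marking:
                    image.add(frozenset((marking - pre) | post))
        frontier = image - reached
        reached |= frontier
    return reached


def invariant_approximation(places, invariants):
    """All safe markings satisfying every (support, token_count) invariant."""
    result = set()
    for bits in product((0, 1), repeat=len(places)):
        marking = frozenset(p for p, b in zip(places, bits) if b)
        if all(len(marking & support) == total for support, total in invariants):
            result.add(marking)
    return result


# Hypothetical net: two independent cycles p0 <-> p1 and p2 <-> p3.
places = ["p0", "p1", "p2", "p3"]
transitions = [(frozenset({"p0"}), frozenset({"p1"})),
               (frozenset({"p1"}), frozenset({"p0"})),
               (frozenset({"p2"}), frozenset({"p3"})),
               (frozenset({"p3"}), frozenset({"p2"}))]
inv_a = (frozenset({"p0", "p1"}), 1)
inv_b = (frozenset({"p2", "p3"}), 1)

exact = fixed_point_reachability(transitions, {"p0", "p2"})
coarse = invariant_approximation(places, [inv_a])
fine = invariant_approximation(places, [inv_a, inv_b])

print(len(exact), len(coarse), len(fine))        # 4 8 4
print(exact <= coarse, exact == fine)            # True True
print(any({"p0", "p1"} <= m for m in coarse))    # False
```

The last line is an instance of conservative verification: since no marking of the upper approximation marks p0 and p1 simultaneously, no reachable marking does either, and the traversal is not needed to establish this property.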
Conclusions
Event-based models for concurrency are being used for the specification, synthesis, analysis and verification of asynchronous circuits. Among them, Petri nets seem to be the most appropriate model, offering the following features:
- A succinct representation of the behavior of concurrent systems with no restriction on the type of allowed concurrency among events.
Asynchronous circuit design is still in its prehistory. In the future, we foresee an increasing interest in this type of circuit and in the associated design methodologies. High-level synthesis tools will be constructed and used in a broader set of applications. For this reason, techniques to synthesize and verify highly complex concurrent systems will be required. This is an area where theoreticians on concurrency have a chance to apply their findings and collect new challenging problems to solve.
