Abstract
Introduction
All the potential benefits of asynchronous circuits may be useless because of the presence of hazards (transient errors due to stray delays). Most of the existing techniques rely on a high-level specification of the system, and derive logic-level or gate-level networks that are hazard-free under the assumption of some delay model.
Burst-mode design methodologies [8, 16] use transformations [6, 9] or technology mapping [13] to create a specific gate-level network. Speed-independent design methodologies from event models [3, 14] assume the existence of an extensive library of complex gates without internal hazards to avoid the technology mapping step. Varshavsky et al. [14] derived speed-independent circuits from distributive State Transition Diagrams using basic gates, but the resulting circuits were large and inefficient.
Beerel et al. [2] extended Varshavsky's methodology to a Sum-of-Products (SOP) architecture. This method allows input choices and applies some optimizations, but does not use a set of necessary conditions to guarantee a correct solution. Kondratyev et al. [5] have defined sufficient conditions to decide when a speed-independent SOP implementation can be derived. Both techniques impose two conditions that restrict the number of specifications that can be implemented and the optimizations that can be applied: 1) the use of SOP architectures that can only include AND-gates, OR-gates, and C-elements (where distributivity and signal persistency are necessary conditions); and 2) the unique entry condition, which does not allow the use of certain types of input choices. Finally, Siegel et al. [12] have developed a technique to decompose high fan-in AND-gates into a multi-level structure.
This work presents a set of sufficient conditions for the gate-level synthesis of speed-independent circuits when restricted to a given gate-library. The proposed technique takes advantage of complex gates, such as AOI-gates and OAI-gates, typically available in CMOS libraries [1]. The distributivity, signal persistency and unique entry conditions can be safely eliminated as restrictions, increasing the number of specifications that can be successfully implemented. Transformation techniques to reduce the number of non-persistent signal transitions, and the number of minimal (or maximal) states, are introduced. These transformations, together with the Unique State Coding property, control the type and complexity of the gates involved in the implementation. The proposed synthesis conditions reduce the delay, area and number of memory elements in speed-independent realizations. Several examples of circuit specifications and speed-independent realizations, including specifications that do not satisfy the previously required conditions, are presented.
Definitions
A State Transition Diagram (STD) [14, 15] is a quadruple $(S, E, A, \lambda)$, where $S$ is a set of states, $E \subseteq S \times S$ is a set of transitions, $A$ is a set of signals, and $\lambda : S \to B^{|A|}$ is a total labeling function that encodes each state with a binary vector of signal values. The set of signals $A = A_I \cup A_O$ is divided into input signals $A_I$ and non-input signals $A_O$. Rising (falling) transitions of signal $a_i$ are denoted $a_i+$ ($a_i-$), and generic transitions $a_i*$. An STD (Example 1) is described in Fig. 2(a), where $S$ and $E$ list its states and transitions, e.g. $(s_1, s_2), (s_2, s_3), (s_3, s_4) \in E$. [...] function that maps a set of states $X$ to its image $X'$, the set of states such that for any pair $s \in X$ and $s' \in X'$, $s E s'$.
No state can have two outgoing transitions labeled with the same signal but with different signs. Moreover, no state $s$ can have an outgoing transition labeled with a rising (falling) transition of a signal $a_i$ if the encoding satisfies $\lambda(s)_i = 1$ ($\lambda(s)_i = 0$). Such an STD is called consistently encoded.
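As a rough illustration of the consistency rule, the following Python sketch checks a small STD. The tuple layout for transitions, the function name `is_consistently_encoded`, and the two-signal toy diagram are assumptions made for this example; they are not notation from the paper.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical encoding of a labeled transition:
# (source state, signal index, sign, target state), with sign "+" or "-".
Transition = Tuple[str, int, str, str]


def is_consistently_encoded(codes: Dict[str, str],
                            transitions: Iterable[Transition]) -> bool:
    """Reject a rising (falling) transition that leaves a state where the signal is 1 (0)."""
    for src, sig, sign, _dst in transitions:
        value = codes[src][sig]
        if sign == "+" and value == "1":
            return False
        if sign == "-" and value == "0":
            return False
    return True


# Toy STD over two signals (a, b); state codes are written as "ab".
codes = {"s1": "00", "s2": "10", "s3": "11", "s4": "01"}
good = [("s1", 0, "+", "s2"), ("s2", 1, "+", "s3"),
        ("s3", 0, "-", "s4"), ("s4", 1, "-", "s1")]
bad = good + [("s2", 0, "+", "s3")]  # a+ leaves a state where a is already 1

print(is_consistently_encoded(codes, good))  # True
print(is_consistently_encoded(codes, bad))   # False
```

Under this rule a state can never have both a rising and a falling outgoing transition of the same signal, since the two would require different values of that signal in the same code.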
The behavior of a circuit can be represented by an STD. The reachable states depend on the delay model assumed for the gates and wires in the circuit. Two delay models specify the amount of "memory" of the delay (inertial and pure), while two others define the time required to propagate changes through the delay (unbounded and bounded). Speed-independent synthesis assumes a negligible delay in the wires, and the pure and unbounded delay models in the gates [14, 5]. An STD is said to be speed-independent under the pure and the unbounded delay models iff it is output semi-modular [15]. Henceforth we will assume that all the STD specifications are semi-modular and consistently encoded.
The rising (falling) excitation set of $a_i$ is the set of states in which some transition $a_i+$ ($a_i-$) is excited (denoted by $ES^+(a_i)$ and $ES^-(a_i)$). The one (zero) quiescent set of $a_i$ is the set of states in which $a_i$ has the value 1 (0) and no transition $a_i*$ is excited (denoted by $QS^1(a_i)$ and $QS^0(a_i)$). The excitation region $ER(a_i*)$ is the maximal connected set of states in which the transition $a_i*$ is excited. The quiescent region $QR(a_i*)$ is the maximal connected set of states that can be reached from $ER(a_i*)$, and the backward quiescent region $BR(a_i*)$ is the maximal connected set of states that can reach $ER(a_i*)$, in both cases without enabling any other transition $a_i*$.
A transition $a_i*$ is a predecessor of another transition $a_i*$ of the same signal if there exists an allowed sequence $\{a_i*, U, a_i*\}$ in which no other $a_i* \in U$. [...]
A generic transition cluster is denoted by $T_i*$. Similarly to transitions, we define the predecessor and successor transition clusters. The set of predecessor (successor) transition clusters of $T_i*$ is denoted ${}^{\bullet}T_i*$ ($T_i*^{\bullet}$). In Example 1 both $T_f^1+$ and $T_f^2+$ are predecessor transition clusters of $T_f-$, while $T_f-$ is the only predecessor transition cluster for $T_f^1+$ and $T_f^2+$.
The regions of a transition are also extended to transition clusters. The excitation region $ER(T_i*)$ is the union of the excitation regions for all the transitions in the cluster (similar extensions exist for $QR(T_i*)$ and $BR(T_i*)$). The excitation regions for the transition clusters of the output signal $f$ are depicted in Fig. 2. Region covers can be derived for the transition clusters in Example 1: $R(T_f^1+) = \{101110\}$, $R(T_f^2+) = \{011110\}$, and $R(T_f-) = \{----0-\}$.
A region cover $R(T_i*)$ may also cover successor states in $QR(T_i*)$, or predecessor states in $BR(T_i*)$. [...] and $QR(T_f^1+)$, and therefore cannot be covered by either of the region covers $R(T_f^1+)$ and $R(T_f^2+)$. If either of them covers $s_9$, the resulting signal network cannot be speed-independent. In general, any state can be covered by only one of the region covers for the rising (or falling) transition clusters of an output signal. States that are included in the quiescent regions (or the backward quiescent regions) of more than one transition cluster cannot be covered. Therefore, feasible region covers that do not include states shared by the quiescent regions (or backward quiescent regions) of other transition clusters are called one-hot encoded. One-hot encoded transition clusters guarantee that, at any time, only one of the regions evaluates to 1.
Definition 3 $R(T_i*)$ [...]
Definition 4 The restricted quiescent region $QR'(T_i*)$ is the subset of $QR(T_i*)$ such that:
$$QR'(T_i*) = QR(T_i*) - \bigcup_{a_i* \notin T_i*} QR(a_i*).$$
Definition 5 The restricted backward quiescent region $BR'(T_i*)$ is the subset of $BR(T_i*)$ such that:
$$BR'(T_i*) = BR(T_i*) - \bigcup_{a_i* \notin T_i*} BR(a_i*).$$
Figure 1: Signal network in an SOF architecture for signal $f$.
Implementation Architecture Overview
The implementation strategy assumed in this paper is a two-level Sum-of-Functions (SOF) architecture. Following this architecture, a signal network ($N_{a_i}$) is created for every output signal. The transitions of $a_i$ are grouped into transition clusters that only contain rising (falling) transitions, denoted $T_i+$ ($T_i-$).
$N_{a_i}$ consists of a first level of rising and falling region covers, $R(T_i+)$ ($R(T_i-)$). A region cover is a function that covers the states in the excitation regions and may cover some of the states in the quiescent regions and backward quiescent regions. The region covers are implemented by one AND-gate for a single-cube cover, or by complex gates for a multiple-cube (poly-term) cover. The rising region covers are combined with an OR-gate to create the set region network ($S_{a_i}$), and the falling region covers create the reset region network ($R_{a_i}$). Finally, both region networks implement $N_{a_i}$ using a Muller C-element.
A one-hot encoded $R(T_i*)$ is correct if any state in $BR(T_i*)$ covered by $R(T_i*)$ is also covered by the region cover of a predecessor transition cluster of $T_i*$. Any state in $BR(T_i*)$ that is also contained in the quiescent regions of several transition clusters in ${}^{\bullet}T_i*$ cannot be covered by any of their region covers, and therefore cannot be covered by $R(T_i*)$ either.
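The two-level structure can be mimicked with a small behavioral model. This is a minimal sketch, not the paper's implementation: cubes are written as strings over `0`, `1` and `-`, the function names are hypothetical, and the C-element is assumed to set its output when only the set network is 1, reset it when only the reset network is 1, and hold otherwise. The covers used in the example are the ones quoted in this paper for signal f of Example 1.

```python
from typing import List


def covers(cube: str, code: str) -> bool:
    """A cube covers a state code if every non-dash literal matches."""
    return all(lit == "-" or lit == bit for lit, bit in zip(cube, code))


def region_network(region_covers: List[List[str]], code: str) -> bool:
    """OR of all region covers; each region cover is itself an OR of cubes."""
    return any(covers(cube, code) for cover in region_covers for cube in cover)


def c_element(set_in: bool, reset_in: bool, previous: bool) -> bool:
    """Assumed C-element behavior: set, reset, or hold the previous value."""
    if set_in and not reset_in:
        return True
    if reset_in and not set_in:
        return False
    return previous


def signal_network(set_covers, reset_covers, code: str, previous: bool) -> bool:
    return c_element(region_network(set_covers, code),
                     region_network(reset_covers, code),
                     previous)


set_covers = [["101110"], ["011110"]]   # rising region covers of f
reset_covers = [["----0-"]]             # falling region cover of f

print(signal_network(set_covers, reset_covers, "101110", False))  # True: set
print(signal_network(set_covers, reset_covers, "101100", True))   # False: reset
print(signal_network(set_covers, reset_covers, "111110", True))   # True: hold
```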
$$MRT(T_i*) = BR'(T_i*) \cup ER(T_i*) \cup QR'(T_i*).$$
Definition 7 The one-hot encoded $R(T_i*)$ is correct if
$$\forall s \in R(T_i*) : s \in BR(T_i*) \Rightarrow \exists!\, T_i'* \in {}^{\bullet}T_i* : s \in R(T_i'*).$$
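Once the regions are available as sets of states, the restricted regions, $MRT$, and the correctness condition reduce to set operations. The following sketch assumes that representation; the function names and the toy sets are hypothetical and serve only to make the definitions concrete.

```python
from typing import Iterable, List, Set


def restricted(region: Set[str], other_regions: Iterable[Set[str]]) -> Set[str]:
    """Remove the states shared with the corresponding regions of other transitions."""
    shared = set().union(*other_regions)
    return region - shared


def mrt(br_restricted: Set[str], er: Set[str], qr_restricted: Set[str]) -> Set[str]:
    """Union of the restricted backward region, the excitation region, and the restricted quiescent region."""
    return br_restricted | er | qr_restricted


def is_correct(cover_states: Set[str], br: Set[str],
               predecessor_covers: List[Set[str]]) -> bool:
    """Each covered state of BR must lie in exactly one predecessor region cover."""
    for state in cover_states & br:
        if sum(state in cover for cover in predecessor_covers) != 1:
            return False
    return True


# Toy regions of one transition cluster (state names are arbitrary).
er, br, qr = {"s3"}, {"s1", "s2"}, {"s4", "s5"}
br_r = restricted(br, [{"s2"}])   # s2 is shared with another cluster
qr_r = restricted(qr, [{"s5"}])   # s5 is shared with another cluster

print(sorted(mrt(br_r, er, qr_r)))                        # ['s1', 's3', 's4']
print(is_correct({"s1", "s3"}, br, [{"s1"}, {"s9"}]))     # True
print(is_correct({"s1", "s3"}, br, [{"s1"}, {"s1"}]))     # False
```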
Note that the SOF architecture requires that no cube in $R(T_i*)$ contains the literal corresponding to $a_i$. Otherwise a feedback loop is introduced into the signal network.
[...] $R(T_i*)$ such that none of its cubes contains the literal corresponding to $a_i$ only if
$$\forall s \in ER(T_i*) : s\,E(a_i*)\,s' \wedge a_i* \in T_i* \wedge s' \in QR'(T_i*).$$
According to Prop. 11, no cube in the region covers for the output signal $f$ (see Fig. 2(a)) [...]
A signal network that only uses rising (or falling) transition clusters, called a complete signal network, can be derived if all the rising (or falling) region covers are complete. When all the rising region covers are complete, the set region network is implemented while the reset region network and the C-element are removed (and similarly for the falling region covers).
The number of clusters determines the number of gates used to implement the signal networks. Two clusters can be used, one containing the rising transitions and the other containing the falling transitions. Then the signal network is reduced to two complex gates for the set and reset region networks, and the C-element. If every cluster contains exactly one transition, then the signal network contains one gate for each transition, two OR-gates and the C-element. The state encoding strongly influences the structure of the clusters. Assume that $a_i+$ is excited at $s$, that another transition $a_i+$ is excited at $s'$, and that $\lambda(s) = \lambda(s')$. Then the excitation regions of the two transitions intersect, and both transitions must be in the same transition cluster to guarantee monotony.
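The merging rule for intersecting excitation regions can be sketched as follows. The example assumes that every excitation region is given as a set of binary state codes (so that two states with the same code count as an intersection); the transition names and the function `merge_clusters` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def merge_clusters(regions: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group transitions whose excitation regions share at least one state code."""
    clusters: List[Tuple[Set[str], Set[str]]] = []  # (transition names, covered codes)
    for name, codes in regions.items():
        names, covered = {name}, set(codes)
        untouched = []
        for other_names, other_codes in clusters:
            if covered & other_codes:
                # Intersecting regions must end up in the same cluster.
                names |= other_names
                covered |= other_codes
            else:
                untouched.append((other_names, other_codes))
        clusters = untouched + [(names, covered)]
    return [names for names, _codes in clusters]


# Toy excitation regions of three rising transitions of one signal.
regions = {
    "a+/1": {"1000", "1100"},
    "a+/2": {"1100", "1110"},  # shares code 1100 with a+/1
    "a+/3": {"0011"},
}

print(sorted(sorted(cluster) for cluster in merge_clusters(regions)))
# [['a+/1', 'a+/2'], ['a+/3']]
```

Starting from clusters of cardinality one and merging only when forced keeps the gates small; merging everything into one rising and one falling cluster is the other extreme described above.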
Region Cover Minimization
A region cover can be simplified by reducing the number of cubes in it, and the literals in each cube, increasing the probability of finding a match in a gate-library. In any case, after a minimization the region cover must remain monotonic.
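Reducing the literals of a cube can be illustrated with a simple greedy procedure. This is a sketch under stated assumptions, not the paper's algorithm: `forbidden_codes` stands for the state codes that the cube must not cover, the function names are hypothetical, and the monotony of the resulting cover would still have to be verified afterwards.

```python
from typing import Iterable


def covers(cube: str, code: str) -> bool:
    """A cube covers a state code if every non-dash literal matches."""
    return all(lit == "-" or lit == bit for lit, bit in zip(cube, code))


def drop_literals(cube: str, forbidden_codes: Iterable[str]) -> str:
    """Greedily replace literals by '-' as long as no forbidden code becomes covered."""
    forbidden = list(forbidden_codes)
    current = list(cube)
    for index, literal in enumerate(current):
        if literal == "-":
            continue
        trial = current[:index] + ["-"] + current[index + 1:]
        if not any(covers("".join(trial), code) for code in forbidden):
            current = trial  # the literal is not needed to exclude any forbidden code
    return "".join(current)


# Toy example over three signals: only the third literal can be removed.
print(drop_literals("110", ["010", "100"]))  # 11-
```

The result depends on the order in which literals are tried; a real minimizer would explore alternatives and prefer the cube that best matches a library gate.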
Monotony Verification
The monotony verification technique consists of three steps:
1. One-hot encoding verification, to guarantee that only those states available for minimization have been used.
2. Correctness verification, to guarantee that the region cover correctly overlaps with the predecessor region covers.
3. Monotonic transition verification, to guarantee that the region cover changes exactly twice.
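The first and third steps can be illustrated on a single cyclic trace of state codes. This is a simplified sketch: a real verification would consider all reachable traces of the STD, and the helper names and the two-signal cycle are assumptions of this example. One-hot encoding is checked here as "at most one region cover evaluates to 1 for any code", and monotony as "the cover value changes exactly twice along the cycle".

```python
from typing import List


def covers(cube: str, code: str) -> bool:
    """A cube covers a state code if every non-dash literal matches."""
    return all(lit == "-" or lit == bit for lit, bit in zip(cube, code))


def cover_value(cover: List[str], code: str) -> bool:
    """Value of a region cover (an OR of cubes) for one state code."""
    return any(covers(cube, code) for cube in cover)


def is_one_hot(region_covers: List[List[str]], codes: List[str]) -> bool:
    """At most one region cover may evaluate to 1 for any state code."""
    return all(sum(cover_value(cover, code) for cover in region_covers) <= 1
               for code in codes)


def changes_along_cycle(cover: List[str], cycle: List[str]) -> int:
    """Count value changes of the cover along a cyclic sequence of state codes."""
    values = [cover_value(cover, code) for code in cycle]
    # Index -1 wraps around, so the last-to-first step of the cycle is counted too.
    return sum(values[i] != values[i - 1] for i in range(len(values)))


cycle = ["00", "10", "11", "01"]

print(changes_along_cycle(["1-"], cycle))        # 2: one rise and one fall
print(changes_along_cycle(["10", "01"], cycle))  # 4: the cover glitches
print(is_one_hot([["10"], ["01"]], cycle))       # True
print(is_one_hot([["1-"], ["-1"]], cycle))       # False: both are 1 at code 11
```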
Four different techniques can be applied to simplify a region cover. After any simplification, the region covers can be checked to eliminate all the redundant cubes.
Forward region expansion: The cover is expanded towards the restricted quiescent region, increasing the number of states contained in it. The final objective is to derive a complete region cover.
Backward region expansion: The cover is expanded towards the restricted backward quiescent region. All the newly covered states must be covered by the region covers of the predecessor transition clusters.
DC region expansion: A cover is expanded to the dc-set of the specification when a literal is removed and no new states are covered.
Region merging: Transition clusters can be merged together when the size of the cubes (or their number) in the resulting region cover decreases.
Signal Network Synthesis and Minimization
The minimization techniques can be combined into the synthesis algorithm presented in Fig. 4, which is divided into seven main steps. The initial transition clusters of cardinality one are computed, and those clusters whose excitation regions intersect are merged to preserve the one-hot encoding requirement (1). An initial irredundant region cover is computed for each of the previously defined transition clusters (2). At this point the minimization techniques are applied in a predetermined order (3, 4, 5). After any transformation all the redundant cubes are eliminated, and transition clusters are checked for merging if improvements can be obtained. Finally, the region covers in the final signal network are created and mapped onto a given gate-library [4] (6, 7).
Easing the Synthesis Requirements
[...] For a signal $a_j$ that is:
1. ordered with $a_i*$, then $c_j(a_i*) = a_j$ if $a_j = 1$ in $ER(a_i*)$, or $c_j(a_i*) = \bar{a}_j$ if $a_j = 0$ in $ER(a_i*)$,
2. concurrent with $a_i*$, then $c_j(a_i*) = -$.
A cover cube $c(a_i*)$ is called correct if it covers all (but only) the states in $ER(a_i*)$.
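The cover cube can be computed from the state codes of an excitation region. The sketch below rests on an interpretation that is not spelled out in the text: a signal is treated as ordered with the transition when it keeps a constant value throughout the excitation region, and as concurrent when it takes both values there. The function name and the example codes are hypothetical.

```python
from typing import Iterable


def cover_cube(er_codes: Iterable[str]) -> str:
    """Smallest single cube containing all the given state codes."""
    codes = sorted(er_codes)
    cube = []
    for column in zip(*codes):  # one column per signal
        values = set(column)
        # Constant column: keep the literal (1 or 0). Varying column: don't care.
        cube.append(values.pop() if len(values) == 1 else "-")
    return "".join(cube)


# Three state codes over four signals: the first two signals vary, the last two stay at 0.
print(cover_cube({"1000", "1100", "0100"}))  # --00
```

The cube `--00` also contains the code `0000`, which is not among the three inputs, so a cover cube obtained this way is not necessarily correct in the sense defined above.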
Distributivity
Distributive STDs are a subclass of semi-modular STDs in which the local history for each state is unique. Moreover, distributivity is a necessary condition for both the existence of single-cube region covers and the unique entry condition.
Lemma 18 [14] In a semi-modular but non-distributive STD, there is at least one excitation region with several minimal states.
Lemma 19 [14] In a non-distributive STD, there is at least one excitation region that cannot be correctly covered by its corresponding cover cube.
Example 2 (depicted in Fig. 5(a)) presents a non-distributive STD. Assuming that $c$ is a non-input signal, then $T_c+ = \{c+\}$ and $T_c- = \{c-\}$. The cover cube $c(c+) = \{--00\}$ may be used to implement $R(T_c+)$. However, $c(c+)$ is not a correct cover cube: it covers the three states in $ER(c+)$ (including $s_7$), but it also covers a state in $BR(T_c+)$. The multi-cube approach requires two cubes for $R(T_c+) = \{1-00, -100\}$ and a single cube for $R(T_c-) = \{0011\}$. Both region covers can be minimized, obtaining complete signal networks (see Fig. 5(b) and Fig. 5(c)).
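The difference between the single cube and the two-cube cover of Example 2 can be checked by enumerating the codes that each one contains. The helper `expand` is a hypothetical name; the cubes are the ones quoted above. That the extra code belongs to the state in the backward quiescent region is an inference from the text, since the state codes of Fig. 5(a) are not reproduced here.

```python
from itertools import product
from typing import Set


def expand(cube: str) -> Set[str]:
    """All binary codes contained in a cube written over 0, 1 and -."""
    choices = [("0", "1") if literal == "-" else (literal,) for literal in cube]
    return {"".join(bits) for bits in product(*choices)}


single = expand("--00")                  # cover cube c(c+)
multi = expand("1-00") | expand("-100")  # two-cube region cover R(T_c+)

print(sorted(multi))           # ['0100', '1000', '1100']
print(sorted(single - multi))  # ['0000']
```

The two cubes together contain three codes, which matches the three states of the excitation region, while the single cube additionally contains the code 0000.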
Signal Persistency
The work by Kishinevsky et al. [5] proved that signal persistency is a necessary condition to guarantee that a transition cluster of cardinality one can be covered with exactly one cube.
Lemma 20 [5] Given $T_i* = \{a_i*\}$, the region cover $R(T_i*) = \{c(a_i*)\}$ correctly covers $ER(a_i*)$ iff $a_i*$ is signal persistent.
Non-persistent output signal transitions require the transformation of the STD specification by reducing its concurrency. Concurrency reductions may decrease the area of the implementations because they eliminate CSC violations, reducing the number of internal signals, and increase the number of vertices in the dc-set. However, they are not desirable because the imposed sequentiality greatly degrades the performance of the system. The use of complex gates makes it possible to derive monotonic region covers for non-persistent transitions.
Example 1 (Fig. 2(a)) depicts a non-persistent transition $e-$.
Single Minimal/Maximal State
The notion of unique entry condition was introduced in [5] as a necessary condition to derive monotonic cover cubes.
Example 1 in Fig. 2(a) depicts $ER(e-)$ containing two minimal states, $s_7$ and $s_{19}$. The cover cube $c(e-) = \{-----1\}$ is not monotonic (as we have seen previously). Increasing the concurrency of the specification by changing the triggers of $e-$ from $\{c-/1, d-/2\}$ to $\{f+/1, f+/2\}$, we obtain the STD depicted in Fig. 2(b). Now the STD contains two more states ($s_{22}$ and $s_{23}$). However, $ER(e-)$ maintains the same structure with two minimal states, $s_9$ and $s_{15}$. Now $c(e-) = \{-----1\}$ is monotonic. Therefore the unique entry condition is not necessary for the existence of monotonic cover cubes.
State Encoding
The monotony conditions, when restricted to single-cube region covers, impose severe constraints on the state encoding of the STD.
A correct cover cube $c(a_i*)$ contains all the states in the excitation region $ER(a_i*)$, but may also cover other states of the STD. In that case a monotony violation occurs (even if the specification satisfies the CSC and USC requirements), and internal signals must be inserted in the specification.
Example 3 (see Fig. 7(a)) presents an STD that satisfies both CSC and USC. The cover cube $c(c+) = \{1---0\}$ covers a state with binary code (10010) in $QR(c-)$, and therefore it is not monotonic. Given $T_c+ = \{c+\}$, the complex gate approach will use $R(T_c+) = \{11--0, 1--00\}$. After the minimization, $R(T_c+) = \{-1--0, 1--0-\}$ is a complete region cover that directly implements $N_c$ (see Fig. 7(c)).
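The claim of Example 3 can be checked mechanically on the codes quoted above: the single cover cube contains the conflicting code 10010, while the minimized two-cube region cover does not. The helper `covers` is a hypothetical name used only for this check.

```python
def covers(cube: str, code: str) -> bool:
    """A cube covers a state code if every non-dash literal matches."""
    return all(lit == "-" or lit == bit for lit, bit in zip(cube, code))


conflicting_code = "10010"           # code of the state in QR(c-)
single_cube = "1---0"                # cover cube c(c+)
minimized_cover = ["-1--0", "1--0-"]  # minimized region cover R(T_c+)

print(covers(single_cube, conflicting_code))                            # True
print(any(covers(cube, conflicting_code) for cube in minimized_cover))  # False
```

The first cube of the minimized cover excludes the code through its second literal, and the second cube excludes it through its fourth literal, so neither needs the extra state signal that a single-cube solution would require.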
Complete Region Covers
The computation of complete region covers is the most important minimization in the synthesis process because it eliminates the C-element. Any synthesis algorithm that only uses single-cube region covers will, in most cases, be unable to find complete region covers. Two conditions prevent a quiescent region from being completely covered:
1. $QR'(a_i*)$ does not coincide with $QR(a_i*)$ (even for a multi-cube approach).
2. Any of the successor transitions of $a_i*$ has more than one trigger transition.
This second restriction can be eliminated if multi-cube region covers are allowed. Example 1 (see Fig. 2) [...] This situation is quite close to the distributivity restriction presented in Section 6.1. Again, the only possible solution is the use of multi-cube region covers. In this particular example a two-cube region cover may be used: $R(T_f^1+) = \{-0--1-, -0-1-\}$.
Transformation Techniques
Transformations that enforce signal persistency or the unique entry condition, or that encode the STD, are useful to simplify the region covers, increasing the probability of finding a matching library gate.
Non-persistency is eliminated by introducing causality relations between an output transition $a_i*$ and its non-persistent concurrent transitions $a_j*$. Two transformations can be applied, enforcing:
1. $a_j*$ to be a successor of $a_i*$ (persistency constraint),
2. $a_j*$ to be a predecessor of $a_i*$ (concurrency constraint).
Neither of them requires the insertion of state signals; only the degree of concurrency is modified, simplifying the structure of the regions. Fig. 2(c) and (d) present the result of applying the persistency and concurrency constraints to $e-$ in Example 1. After the persistency constraint, transition $e-$ is no longer concurrent with $d-/1$ and $c-/2$, eliminating the state $s_{20}$. Two falling transitions $e-/1$ and $e-/2$ are created (see the signal network in Fig. 6(b)). The concurrency constraint completely eliminates the concurrency between $e-$ and $a-$, $b-$, $d-/1$ and $c-/2$. States $s_8$, $s_{10}$, $s_{17}$ and $s_{19}$ are eliminated, reducing $ER(e-)$ to $s_{20}$. The final implementation is $N_e = NOR(a, b, c, d)$.
Monotony violations can be interpreted as encoding conflicts, and can therefore be eliminated by using standard encoding techniques based on the insertion of state signals [7, 10]. In Example 3, the monotony violation can be interpreted as a CSC conflict between a (dummy) state $s = (10010)$ in $ER(c+)$ and the real state $s' = (10010)$ in $QR(c-)$. Fig. 7(b) presents a state encoding that solves the coding conflict. A new signal $x$ is created, and two transitions $\{x+, x-\}$ are inserted. The final signal networks for $c$ and $x$ are depicted in Fig. 7(d).
A multi-minimal (maximal) state excitation region can be split into several sub-regions by unfolding the specification, i.e. by duplicating some of its states. The transition will be unfolded into several new transitions. Each one of the unfolded states introduces a USC violation with any of its "equivalent" states. Such USC violations must be eliminated by using existing state encoding techniques. Fig. 2(e) depicts $ER(e-)$ in Example 1 split into two new sub-regions, $ER(e-/1)$ and $ER(e-/2)$. The state $s_{20}$ has been unfolded into the states $s_{20}$ and $s_{22}$.
Conclusions
A sufficient condition for the synthesis of speed-independent circuits from STDs has been presented. Monotony is defined in terms of the regions in the specification. Moreover, the structure of every signal network is directly reflected in the monotony condition. The complex gate paradigm is efficiently implemented by using multi-cube region covers and a final technology mapping step. This synthesis technique offers several advantages: distributivity, signal persistency, and the unique entry condition are no longer necessary; encoding requirements are eased, coming close to the CSC conditions; and better minimization techniques can be applied, obtaining smaller and faster implementations.
