Asynchronous circuits are widely used in many real-time applications such as digital communication and computer systems. The design of complex asynchronous circuits is a difficult and error-prone task. An adequate synthesis method will significantly simplify the design and reduce errors. In this paper, we present a general and efficient partitioning approach to the synthesis of asynchronous circuits from general Signal Transition Graph (STG) specifications. The method partitions a large signal transition graph into smaller, manageable subgraphs, which significantly reduces the complexity of asynchronous circuit synthesis. Experimental results of our partitioning approach with a large number of practical industrial asynchronous circuit benchmarks are presented. They show that, compared to the existing asynchronous circuit synthesis techniques, this partitioning approach achieves many orders of magnitude of performance improvement in terms of computing time, in addition to a reduced circuit implementation area. This lends itself well to practical asynchronous circuit synthesis from general STG specifications.
Introduction
Asynchronous circuits are crucial in many important applications such as digital communication and computer systems. In the past, computer system design was dominated by synchronous circuits because they are easier to design than asynchronous circuits. The ease in designing synchronous circuits is a direct result of timing restrictions placed on circuit signals and the global clock. Due to fundamental feature size limits in IC fabrication, these restrictions can no longer be satisfied without compromising the system performance [31, 34]. The problems of distributing the system clock on a complex submicron chip have revived interest in asynchronous circuit design. Researchers attempt to replace the global system clock in synchronous circuits by distributed local clock systems in asynchronous circuit design. The emergence of certain key applications such as low-power personal communication systems and the thrust towards economical high-performance computing systems have been a strong motivating force for asynchronous circuit research [35, 37].
Traditionally, asynchronous circuits are specified and synthesized using finite state machine (FSM) specifications. The FSM model, however, imposes unnecessary restrictions on input changes. It does not provide a succinct way of handling control-intensive asynchronous circuits that have rich concurrent behavior. Recently, event-based asynchronous specifications such as signal transition graphs (STG) have captured wide attention due to their simplicity and their ability to describe a variety of asynchronous behavior and to support performance analysis [15, 20, 22, 28, 38]. The STG methods can effectively utilize global optimization techniques to maximize control parallelism, reduce circuit area, and enhance circuit performance.
Asynchronous circuit synthesis based on STG specifications requires a distinguishable binary code for every circuit state. This requirement, known as complete state coding (CSC), is one of the most important and difficult problems in designing asynchronous circuits from STG specifications. Most methods satisfying the complete state coding requirement are limited by the type of signal transition graphs they can synthesize. Recently, Vanbekbergen et al. formulated a Boolean satisfiability (SAT) model to solve the complete state coding problem from general STG specifications [40]. Their formulation permits complex STG transformations, but it suffers from a very large search space for practical asynchronous circuit benchmarks.
In this paper, we give a general and efficient partitioning approach for the synthesis of asynchronous circuits from general STG specifications. For a given asynchronous circuit synthesis problem, its signal transition graph is first partitioned into a number of simpler, manageable subgraphs. Each subgraph is represented by a small set of SAT formulas which can be solved efficiently by a SAT algorithm. Finally, the solutions of these small subgraphs are integrated into the solution of the original problem. In contrast to Vanbekbergen et al.'s brute-force approach, partitioning the state graph into smaller subgraphs avoids the problem of solving very large SAT formulas, which lends itself well to practical asynchronous circuit synthesis from general STG specifications. Compared to Lavagno et al.'s [16] and Vanbekbergen et al.'s [40] algorithms, this partitioning approach achieves many orders of magnitude of performance improvement in terms of computing time, in addition to a reduced circuit implementation area.
The rest of this paper is organized as follows. In Section 2, we briefly review previous work in the area. In Section 3, we present some basic definitions and notations that simplify our discussion. Section 4 describes a Boolean satisfiability (SAT) model for the complete state coding requirement. In Section 5, we present in detail our general partitioning approach for the synthesis of asynchronous circuits from general STG specifications. Experimental results of our design method with a large number of practical industrial asynchronous circuit benchmarks and its performance comparisons with other existing design methods are given in Section 6. Section 7 concludes this paper.
Previous Work
The problem of synthesizing asynchronous circuits from STG specifications has been studied by many researchers [2, 15, 18, 39, 40, 42].
Chu developed the signal transition graph specifications and proposed the complete state coding and persistency constraints to synthesize an asynchronous circuit from a given STG. Chu considered the CSC requirement as a necessary and sufficient condition for deriving a logic circuit from a given state graph. Lin [13] proposed algorithms for hazard-free asynchronous circuit synthesis from state specifications derived from general STGs, but they did not tackle the problem of ensuring complete state coding in the given specifications.
Recently, Vanbekbergen et al. [40] proposed a general framework to solve the complete state coding problem from general STG specifications. It is not limited to marked graphs or safe free-choice Petri nets. This approach can handle any specification model that can be translated into state-based specifications, i.e., a state graph. In their framework, Vanbekbergen et al. formulated the CSC problem as a Boolean satisfiability (SAT) problem. They gave the necessary and sufficient conditions for CSC satisfaction. This ensures the CSC property while conserving the original STG behavior. Since this framework performs much more general STG transformations than Lavagno et al.'s approach, it requires a much larger search space. It is well known that many combinatorial optimization problems can be directly transformed into the SAT problem. Unfortunately, the instances of SAT formulas derived from practical STGs are too large to be solved efficiently. A moderately large signal transition graph [24] with 174 states, for example, would generate a Boolean formula with 35,386 clauses and 1,044 variables. In our experience [4, 5, 6, 7, 10, 26, 9], it usually takes a prohibitively long time to find a satisfying assignment for very large Boolean formulas. This is especially true since existing heuristic techniques to solve the SAT problem often take advantage of the structure of the problem. They are not suitable for this particular case.
In this paper, based on the Boolean satisfiability framework, we give a general and efficient partitioning approach for the synthesis of asynchronous circuits from general STG specifications. Our approach partitions a large STG into smaller, manageable subgraphs, which significantly reduces the number of Boolean satisfiability constraints and thus the circuit design complexity. The method achieves many orders of magnitude of performance improvement in terms of computing time, in addition to a reduced circuit implementation area.
Preliminaries
In this section, we give some basic definitions and notations that simplify our discussion.
Petri Nets and Signal Transition Graph (STG)
In asynchronous circuit research, signal transition graphs that use Petri nets [23] as an underlying formalism have captured wide attention [15, 20, 22, 29, 38]. A Petri net [23] is a bipartite directed graph, $\langle P, T, F, M_0 \rangle$, consisting of a finite set of transitions $T$ (represented as bars), a finite set of places $P$ (represented as circles), and a flow relation $F \subseteq (P \times T) \cup (T \times P)$ (represented as directed arcs) specifying a binary relation between transitions and places. The dynamic behavior of a Petri net is captured by the Petri net markings and the firing of net transitions, which transforms one marking into another. A marking $M$ is an integer assignment to places corresponding to the local conditions which hold at a particular moment. It is graphically represented as solid circles called tokens, residing in these places. The initial marking is denoted as $M_0$. A transition $t$ is said to be enabled in a marking $M$ when all its fanin places are marked with at least one token. An enabled transition must eventually fire, and its firing removes one token from each fanin place and deposits one token in each fanout place. The transformation of a marking $M$ into another marking ...
Transition x+ is enabled in the initial marking. The firing of transition x+ will remove one token from place $\{p_7\}$ and deposit one token in place $\{p_1\}$ and one token in place $\{p_4\}$.
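The firing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `is_enabled` and `fire` are hypothetical, and the initial marking is assumed to hold a single token in place p7, which is all the x+ example states.

```python
from collections import Counter

def is_enabled(marking, fanin):
    """A transition is enabled when every fanin place holds at least one token."""
    return all(marking[p] >= 1 for p in fanin)

def fire(marking, fanin, fanout):
    """Return the marking reached by firing an enabled transition."""
    if not is_enabled(marking, fanin):
        raise ValueError("transition is not enabled in this marking")
    new_marking = Counter(marking)
    for p in fanin:          # remove one token from each fanin place
        new_marking[p] -= 1
    for p in fanout:         # deposit one token in each fanout place
        new_marking[p] += 1
    return +new_marking      # unary plus drops places left with zero tokens

# Transition x+ from the example: fanin {p7}, fanout {p1, p4}.
# Assumption: p7 is the only marked place in the initial marking.
m0 = Counter({"p7": 1})
m1 = fire(m0, fanin=["p7"], fanout=["p1", "p4"])
```

After the call, `m1` marks p1 and p4 with one token each, and x+ is no longer enabled because p7 is empty.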
Signal transition graphs use Petri nets as the underlying formalism to specify the behavior of digital control circuits. In an STG, Petri net transitions are interpreted as rising and falling transitions in the asynchronous circuits. Transitions $s_i+$, $s_i-$, and $s_i*$ denote a rising, a falling, and a rising or falling transition on signal wire $s_i$, respectively. The set of input signals and the set of non-input (i.e., output and internal) signals are denoted by $S_I$ and $S_{NI}$. The set of all the signals in the STG, i.e., $S_I \cup S_{NI}$, is denoted by $S$. In an STG, every place with a single fanin and fanout transition is represented by an arc between these transitions (as illustrated in Figure 1(b)).
Example: In the initial marking of the STG shown in Figure 2(a), transition x+ is enabled. Thus signal x must have a value 0 in the initial state, since it must go to a value 1 after firing a positive transition of x. Similarly, in the initial state, STG signals y and z can also be evaluated to be 0 and 0, respectively. This assigns a binary code 000 to the initial state of the state graph shown in Figure 2(b).
To synthesize an asynchronous circuit, its STG specifications must assign a distinguishable binary code to every circuit state. This requirement, known as complete state coding, is one of the most important and difficult problems to be solved in designing asynchronous circuits from STG specifications.
Definition 2 A state graph satisfies the complete state coding constraint if and only if no two states have the same binary code assignment, or the transitions of non-input signals enabled in two states having the same binary code assignment are the same.
Thus, two states having the same binary code may differ only in their enabled input transitions, and it is assumed that the environment can distinguish between them. A state graph satisfying the CSC constraints has a well-defined logic function, and there is no conflict of implied values even if the binary code assignments of two states are the same. Since only input and output signal values are used for state encoding, the CSC constraints are necessary for logic implementation of the STG specifications.
A general framework to solve the CSC problem for general STG specifications by inserting state signals into the state graph was proposed by Vanbekbergen et al. [40]. In this framework, a CSC violation must be corrected by inserting state signals in the STG, so as to distinguish between the states violating CSC [42].
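The CSC condition can be checked mechanically by comparing every pair of states. The sketch below is illustrative only: the function name `csc_violations` and the dictionary layout are assumptions, states M1 and M2 mirror the two-state situation of Figure 3 (both coded 011, enabling different output transitions), and M3 is a hypothetical extra state.

```python
from itertools import combinations

def csc_violations(states):
    """states: name -> (binary code, set of enabled non-input transitions).

    Two states violate CSC when they share a binary code but enable
    different non-input transitions.
    """
    violations = []
    for (a, (code_a, out_a)), (b, (code_b, out_b)) in combinations(states.items(), 2):
        if code_a == code_b and out_a != out_b:
            violations.append((a, b))
    return violations

states = {
    "M1": ("011", {"ao-"}),  # input transition ai+ is deliberately not listed
    "M2": ("011", {"bo-"}),
    "M3": ("010", {"ao-"}),  # hypothetical state with a unique code
}
conflicts = csc_violations(states)
```

Only non-input transitions are compared, so two states that differ solely in enabled input transitions are not reported.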
Example: Figure 3 shows two states $M_1$ and $M_2$ having the same state encoding 011 in a state graph. State $M_1$ enables an input signal transition ai+ and an output signal transition ao-. On the other hand, state $M_2$ enables only an output signal transition bo-. Since both states $M_1$ and $M_2$ have the same binary encoding 011 and they enable different output (i.e., non-input) signals, they violate the complete state coding constraint. This violation can be corrected by inserting a state signal, $n$, in the state graph. This signal must have a value 1 in one state and 0 in the other state. In Figure 3 ...
Example: In the initial state 000 of the state graph shown in Figure 2(b), output y is not enabled. The implied value of y in state 000 will be the same as the present value of y, i.e., 0. The output transition y+ is enabled in state 100. Thus the implied value of y in state 100 will be the complement of the present value of y, i.e., 1. The implied values of output y in other states can be obtained similarly.
The logic function of an output can be obtained by constructing a Karnaugh map which contains the entries of implied output values. In the Karnaugh map of output y in Figure 2(c), the entry corresponding to xyz = 000 is 0, i.e., the implied value of y in state 000 is 0. The entry corresponding to xyz = 100 is 1, i.e., the implied value of y in state 100 is 1. A Karnaugh map produces the logic ...
Chu developed a systematic technique for direct synthesis of speed-independent circuits from formal STG specifications with the unbounded gate-delay model [2]. He proposed that the logic implementation obtained by imposing syntactic constraints such as complete state coding is free of any hazards. It has been proved that these syntactic constraints can only remove the undesirable behavior at the functional level [22]. It was also proved that the persistency constraint is redundant and only the CSC constraint is necessary and sufficient for ensuring a logic implementation [15, 22].
Signal transitions in a state graph can be characterized by semi-modularity. Figure 4(a) illustrates the semi-modularity of signal transition $t$. Similarly, the non-semi-modularity of $t$ is illustrated in Figure 4(b). As discussed for Figure 3, a CSC violation must be corrected by inserting state signals in the state graph, so as to distinguish between the states violating the CSC constraint [16, 40]. Since the insertion of state signals to satisfy the CSC constraint must not disable any signal transition specified by the STG (i.e., a signal transition $t$ that is enabled in a state must remain enabled), we must ensure semi-modularity of the state graph. Thus, every transition of the inserted state signals must satisfy the semi-modularity constraint. In addition, every transition that is semi-modular in the given state graph specifications must remain semi-modular after the state signal insertions.
Definition 4 Semi
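The implied-value rule in the example can be written as a one-line function. The sketch below is an illustration under stated assumptions: the name `implied_value` is hypothetical, the signal order (x, y, z) follows the xyz state code, and the enabled sets list only the transitions the text mentions for states 000 and 100.

```python
SIGNALS = ("x", "y", "z")  # bit order of the state code xyz

def implied_value(code, enabled_signals, signal):
    """Implied value of `signal` in the state with binary code `code`.

    If a transition of the signal is enabled, the implied value is the
    complement of its present value; otherwise it is the present value.
    """
    present = int(code[SIGNALS.index(signal)])
    return 1 - present if signal in enabled_signals else present

# Only the transitions named in the text are listed (an assumption about
# the rest of the state graph, which is not reproduced here).
enabled = {"000": {"x"}, "100": {"y"}}
karnaugh_y = {code: implied_value(code, sigs, "y") for code, sigs in enabled.items()}
```

The resulting dictionary holds the two Karnaugh-map entries discussed for output y; the remaining entries would be filled in the same way from the other states.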
In the following section, we give the SAT formulation for the CSC constraint satisfaction problem which is the core problem in asynchronous circuit design.
A SAT model for CSC satisfaction
The satisfiability (SAT) model for CSC satisfaction, SAT-CSC, has four components: A set of constraints including the CSC constraints, consistent state assignment constraints, and semi-modularity constraints.
The SAT-CSC model has $N$ states and has at least $N \cdot \lceil \log_2(\mathit{Max}_{csc}) \rceil$ state variables. The number of state variables, $m = \lceil \log_2(\mathit{Max}_{csc}) \rceil$, is the lower bound on the number of state signals required to satisfy the CSC constraints, where $\mathit{Max}_{csc}$ denotes the maximum number of states in the state graph that have the same state encoding. In practice, more state signals than the lower bound $\lceil \log_2(\mathit{Max}_{csc}) \rceil$ may be required in order to avoid critical races introduced by the insertion of new state signals.
The complexity of the extracted Boolean formula depends on the number of states $N$, the number of concurrent transitions, ... Here, $m$ is the number of state signals required to satisfy the CSC constraints, $N_{ct}$ is the number of concurrent transitions, $N_{usc}$ is the number of state pairs that have the same binary encoding, $N_{csc}$ is the number of CSC violations, and $c_1$, $c_2$, $c_3$, and $c_4$ are constants. The SAT formula represents the CSC, consistent state assignment, and semi-modularity constraints. If the formula is unsatisfiable, we add a new state signal to make it satisfiable. This generates a new SAT formula. The goal in solving the SAT-CSC problem is to find a truth assignment to the state variables so that all the constraints are satisfied.
In order to resolve the CSC violations, a state signal $n_j$ is inserted in the state graph that assigns one value of state variable $n_{i,j}$ to every state $M_i$ in the state graph (Figure 6). In a state graph with states $M_i$ and $M_k$ violating the CSC constraint, a state signal $n_j$ is inserted that assigns the corresponding state variable $n_{i,j}$ a value 0 and state variable $n_{k,j}$ a value 1 (see Section 3.2). In addition, while making a transition from the state variable assignment 0 to 1 or from 1 to 0, we must have a positive or negative transition of the corresponding state signal. This is represented by the state variable assignments Up and Down, respectively. The Up and Down state variable assignments are expanded by inserting a corresponding positive and negative state signal transition, respectively. This is illustrated in the state graph shown in Figure 6(a), where states $M_3$ and $M_5$ have the same state encoding, 01. Thus, we assign 0 and 1 values to the state variables corresponding to these states for the inserted state signal $n$. Since the state signal value changes from 0 to 1 during the transition sequence $M_3 \to M_4 \to M_5$, we must assign an Up value to the state variable corresponding to state $M_4$. As illustrated in Figure 6(b), this Up assignment can be included in the state graph by expanding it to a positive transition of the inserted state signal $n$.
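The requirement that a state signal never jumps directly between 0 and 1 can be checked edge by edge. The sketch below covers only that one rule, as stated in the text; it does not reproduce the full consistent state assignment constraints. The function name `direct_jumps` and the data layout are hypothetical.

```python
def direct_jumps(edges, value):
    """Return edges on which the state signal goes 0 -> 1 or 1 -> 0 directly,
    i.e. without an intermediate Up or Down assignment."""
    return [(a, b) for a, b in edges if {value[a], value[b]} == {"0", "1"}]

# Transition sequence M3 -> M4 -> M5 from the Figure 6 discussion.
edges = [("M3", "M4"), ("M4", "M5")]

with_up = {"M3": "0", "M4": "Up", "M5": "1"}      # assignment described in the text
without_up = {"M3": "0", "M4": "0", "M5": "1"}    # hypothetical faulty assignment

ok = direct_jumps(edges, with_up)
bad = direct_jumps(edges, without_up)
```

With the Up value on M4 no edge is flagged, whereas the faulty assignment is flagged on the edge from M4 to M5.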
In the following, we use a simple example, PaBlock (Figure 5(a)), to illustrate the SAT-CSC model.
Example: The state graph of the example STG is shown in Figure 5(b) ... Thus, there are 8 CSC constraints in the state graph, i.e., $\{M_{17}, M_{11}\}$, $\{M_{11}, M_5\}$, $\{M_{15}, M_9\}$, $\{M_{15}, M_2\}$, $\{M_9, M_2\}$, $\{M_{13}, M_7\}$, $\{M_7, M_1\}$, and $\{M_{12}, M_4\}$.
Since the maximum number of states with the same binary encoding $\langle 1000 \rangle$ is three, i.e., $\{M_{15}, M_9, M_2\}$, the lower bound on the number of the required state signals is $\lceil \log_2(3) \rceil = 2$. Thus, while keeping the state assignment consistent and the state graph semi-modular, the SAT-CSC model requires at least two state signals, $n_1$ and $n_2$, to satisfy the CSC constraints. Two state variables $n_{i,1}$ and $n_{i,2}$, corresponding to the state signals $n_1$ and $n_2$, are assigned to every state $M_i$ in the state graph. This is represented by $M_i\{n_{i,1}\}\{n_{i,2}\}$ in the final state graph in Figure 7.
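The lower bound used here is easy to compute from a list of state codes. The following sketch assumes a hypothetical helper `state_signal_lower_bound`; the code 1000 is repeated three times to mirror the three states named in the text, while the other codes are filler invented for the illustration.

```python
from collections import Counter

def state_signal_lower_bound(codes):
    """Return (Max_csc, ceil(log2(Max_csc))) for a list of state codes."""
    max_csc = max(Counter(codes).values())
    # For integers n >= 1, (n - 1).bit_length() equals ceil(log2(n)) exactly,
    # which avoids floating-point rounding.
    return max_csc, (max_csc - 1).bit_length()

# Code 1000 is shared by three states (M15, M9, M2 in the text);
# the remaining codes are hypothetical filler.
codes = ["1000", "1000", "1000", "0100", "0100", "0010"]
max_csc, m = state_signal_lower_bound(codes)

n_states = 18                        # PaBlock state count implied by the variable n_{18,2}
min_state_variables = n_states * m   # N * ceil(log2(Max_csc))
```

For the PaBlock numbers this gives two state signals and 36 state variables, consistent with two state variables per state.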
The constraints can be formulated as Boolean clauses on the binary variables $n_{1,1a}, n_{1,1b}, n_{1,2a}, n_{1,2b}, \ldots, n_{18,2b}$, corresponding to the state variables $n_{1,1}, n_{1,2}, \ldots, n_{18,1}, n_{18,2}$, respectively. This model requires 72 binary variables in the Boolean constraint formula, e.g., for state variables $n_{i,1}$ and $n_{i,2}$, four binary variables $n_{i,1a}, n_{i,1b}, n_{i,2a}, n_{i,2b}$ are required for each state $M_i$. In order to resolve the CSC conflict between state $M_9$ and state $M_2$, the state signal $n_1$ may be assigned a value 1 for state $M_9$ and a 0 for state $M_2$, or vice versa. This can be done by assigning 1 and 0 (or vice versa) to the corresponding state variables $n_{9,1}$ and $n_{2,1}$ for states $M_9$ and $M_2$, respectively. This assignment to resolve a CSC conflict is formulated as the constraint
$$n_{9,1}\,\overline{n_{2,1}} + \overline{n_{9,1}}\,n_{2,1}. \qquad (1)$$
Since $n_{9,1}$ and $n_{2,1}$ are multi-valued variables, they can be substituted by the corresponding Boolean variables, e.g., $n_{9,1} = 0$ is represented by $\{n_{9,1a} = 0, n_{9,1b} = 0\}$, $n_{9,1} = 1$ is represented by $\{n_{9,1a} = 0, n_{9,1b} = 1\}$, $n_{2,1} = 0$ is represented by $\{n_{2,1a} = 0, n_{2,1b} = 0\}$, and $n_{2,1} = 1$ is represented by $\{n_{2,1a} = 0, n_{2,1b} = 1\}$. Thus the Boolean clause (Eq. (1)) becomes
$$\overline{n_{9,1a}}\,n_{9,1b}\,\overline{n_{2,1a}}\,\overline{n_{2,1b}} + \overline{n_{9,1a}}\,\overline{n_{9,1b}}\,\overline{n_{2,1a}}\,n_{2,1b}. \qquad (2)$$
Other Boolean constraints in the SAT formula can be derived similarly. The SAT formula requires 1012 clauses: 608 clauses for CSC constraints, 224 clauses for consistent state assignment constraints, and 180 clauses for semi-modularity constraints.
The final assignment of the state variables resolves all the CSC conflicts in the state graph (Figure 7). For example, the CSC conflict between states $M_2$ and $M_9$ is resolved by the state variables $n_{2,1}$ and $n_{9,1}$. State variable $n_{2,1}$ assigns a value 0 to state $M_2$ and state variable $n_{9,1}$ assigns a value 1 to state $M_9$. This is denoted by $M_2\{0\}\{D\}$ and $M_9\{1\}\{0\}$ in Figure 7.
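As a sanity check on the two-bit encoding, the constraint between $M_9$ and $M_2$ can be enumerated by brute force. This is only a sketch: the function name `csc_constraint` is hypothetical, the term structure follows the encoding given in the text (0 as a=0, b=0 and 1 as a=0, b=1), and the bit patterns for Up and Down are not given in this section, so they are left out.

```python
from itertools import product

def csc_constraint(n9a, n9b, n2a, n2b):
    """True when M9 and M2 receive different 0/1 values of state signal n1."""
    term1 = (not n9a) and n9b and (not n2a) and (not n2b)    # n9,1 = 1 and n2,1 = 0
    term2 = (not n9a) and (not n9b) and (not n2a) and n2b    # n9,1 = 0 and n2,1 = 1
    return bool(term1 or term2)

# Enumerate all 16 assignments of (n9,1a, n9,1b, n2,1a, n2,1b).
solutions = [bits for bits in product((0, 1), repeat=4) if csc_constraint(*bits)]

# Bookkeeping from the text: 18 states, 2 state signals, 2 bits per state variable.
binary_variables = 18 * 2 * 2
total_clauses = 608 + 224 + 180
```

Exactly two of the sixteen bit patterns satisfy the constraint, one per ordering of the 0 and 1 values, and the counts reproduce the 72 variables and 1012 clauses quoted above.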
In the following section, we describe our partitioning approach to asynchronous circuit synthesis. In the rest of this paper, we use the STG example PaBlock to illustrate this approach.
A General Partitioning Approach for Constraint Satisfaction
Previous work indicates that most techniques proposed for the synthesis of asynchronous circuits from signal transition graphs are restricted in practical applications. They either handle a limited set of STG specifications or try to satisfy all the constraints in the state graph directly, which, in most cases, is intractable.
To synthesize a wide variety of asynchronous behavior, it is essential to allow general STG specifications. Vanbekbergen et al.'s SAT formulation of the STG constraints [40] is general in synthesizing STGs, but the SAT formulas directly generated by this method are usually too large to be handled. In practice, it is much easier to satisfy several smaller Boolean formulas than a single large one.
In this section, we give a general partitioning approach to handle this problem. Since our approach partitions a large signal transition graph into a number of modular subgraphs, we also refer to this method as a modular partitioning approach.
A Model for Modular Signal Transition Graph Partitioning
A constraint satisfaction model has three components: variables, values, and constraints. The goal is to find an assignment of values to variables such that all the constraints are satisfied [11].
In a modular constraint graph model, the complete graph is partitioned into smaller and simpler subgraphs [8, 11]. These smaller local graphs can be manipulated individually. An integration method is developed that is able to combine these local modular graphs into a global network. Modularity refers to the ability of a constraint satisfaction model to decompose a complex system into easily understood modular graphs. Another important aspect of modularity is constraints [11]. Modularity of constraints refers to the ability of a constraint satisfaction model to enable complex information to be represented in terms of modules of local constraints. Each individual module of relational constraints may be handled separately. However, constraint satisfaction models differ with respect to how a module of local constraints communicates with another. In our model, the local solution in each modular graph is communicated to the other local modular graphs and to the complete graph. It is this interaction which allows the complete solution to be built from the solutions of the individual local modular graphs.
In this modular partitioning approach, for a given problem, the state graph is first partitioned into a number of simpler modular state graphs. Each modular graph is solved individually. The results of these small graphs are then integrated into the solution of the given problem. A comparison of the direct synthesis approach and the modular partitioning approach is illustrated in Figure 8. The partitioning of a large state graph into smaller state graphs has several unique advantages:
It significantly reduces the number of constraints by several orders of magnitude. For STG benchmark mmu0, for example, the direct SAT formulation requires the solution of a very large SAT formula with 35,386 clauses. In comparison, our modular partitioning approach requires solutions of three very small formulas having 954 clauses, 954 clauses, and 85 clauses, respectively.
The SAT formulation of the complete state coding problem allows exploration of a large search space of design solutions. Thus a solution that yields minimum implementation area can be chosen. Since the number of design constraints is very large for practical STG specifications, it is difficult to obtain even one solution. The modular partitioning approach significantly reduces the number of design constraints, and it provides an efficient approach for representing the complete solution space using Binary Decision Diagrams [17, 30]. The enumeration of the shortest path in the BDD [30] yields an efficient solution with reduced circuit implementation area.
The modular partitioning approach synthesizes a specification by partitioning it into modules. This simplifies the circuit verification process, since the size of the specifications to be verified with respect to their logic is considerably smaller.
In the rest of the paper, "the complete graph" denotes the complete state graph of the given STG, and "the modular state graph of $o_i$" denotes the modular state graph corresponding to output $o_i$.
The modular partitioning approach is illustrated in Figure 8 . It takes the following several steps to synthesize the asynchronous circuits:
Determine the input signal set, $I_S(o_i)$, belonging to output $o_i$ by greedily removing signals from the complete graph to decrease the CSC conflicts. Similarly, greedily remove the state signals.
Derive a smaller modular state graph of $o_i$ from the complete graph, for the input set $I_S(o_i)$.
Find the new state signals and their assignments (0, 1, Up, Down) to the states of the modular state graph of $o_i$ by finding a truth assignment to the SAT formula representing the consistent assignment, semi-modularity, and CSC constraints.
Propagate the truth assignments to the new state signals from the modular state graph of $o_i$ to the complete graph.
Repeat the above steps for every output signal until all the CSC conflicts in the complete graph are resolved.
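The steps above can be read as a loop over the circuit outputs. The sketch below shows only that control flow: every helper is a caller-supplied function with a hypothetical name, so nothing here reproduces the actual algorithms of the paper.

```python
def modular_synthesis(outputs, count_conflicts, determine_input_set,
                      derive_modular_graph, solve_sat_csc, propagate):
    """Iterate the partitioning steps over outputs until no CSC conflict is left.

    All five callables are hypothetical placeholders for the steps in the text:
      count_conflicts()          -> number of CSC conflicts in the complete graph
      determine_input_set(o)     -> input signal set of output o
      derive_modular_graph(iset) -> modular state graph for that input set
      solve_sat_csc(graph)       -> state signal assignment for the modular graph
      propagate(assignment)      -> write the assignment back to the complete graph
    """
    for output in outputs:
        if count_conflicts() == 0:
            break
        input_set = determine_input_set(output)
        modular_graph = derive_modular_graph(input_set)
        assignment = solve_sat_csc(modular_graph)
        propagate(assignment)
    return count_conflicts()
```

The return value is the number of CSC conflicts still present after the last processed output, which is zero when the procedure succeeds.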
The modular partitioning approach outlined above iterates the design steps over circuit outputs. Thus the quality of the final solution may depend on the order in which the outputs are processed. In practice, it was observed that the area of the final circuit implementations has very little dependence on the order of output processing. The above procedure yields a network of several modular graphs as shown in Figure 8(b). It is equivalent to the complete state graph in Figure 8(a).
There are two criteria to measure the quality of the final solution of the synthesized circuits: a reduced implementation area and the minimum number of state signals to satisfy CSC constraints. Our goal is to obtain a reduced implementation area, so this modular partitioning approach may not guarantee the minimum number of state signals to satisfy the CSC constraints. The number of state signals required to synthesize the STG may increase due to the integration of local solutions from a number of modular state graphs. In practice, however, we have achieved the minimum number of state signals in all of the asynchronous benchmarks, except two. For STG benchmark mr0, for example, our solution requires more state signals but the two-level implementation area of the synthesized circuit is less than the area required to implement the state graph with minimum state signals.
In the following, we discuss in detail the major steps of this general partitioning approach for asynchronous circuit synthesis.
Determining the Input Signal Set
The input signal set belonging to output $o_i$ is defined as the minimum number of STG signals required to implement the logic circuit. The input signal set consists of a trigger signal set and ...
To reduce the computational complexity of satisfying CSC constraints, it is essential to minimize the number of CSC violations in the modular state graph. Thus, instead of satisfying all the CSC violations in the complete state graph, we resolve a significantly smaller number of CSC violations in the smaller modular state graphs. This is accomplished by determining the remaining signals in the input set by greedily removing STG signals from the complete state graph so as to minimize the CSC conflicts in the modular state graph. If signal $s_i$ is not output $o_i$ and is not in the trigger signal set of output $o_i$, it can be removed from the state graph if its removal does not increase the number of CSC conflicts and does not increase the number of new state signals required to resolve these conflicts. A signal is removed from the state graph by labeling all its transitions as silent transitions. The removal of the STG signal implies that it is not required to implement the logic circuit corresponding to output $o_i$.
The modular partitioning approach propagates the state signal assignments from the modular state graph to the complete state graph and iterates these design steps for every output signal. Thus some state signals will be present in the complete state graph after the iteration on the first output signal. We may remove some of these state signals if this does not increase the CSC conflicts in the modular state graph. The presence of these state signals imposes an additional constraint on the removal of STG signals from the complete state graph. An STG signal $s_i$ cannot be removed from the state graph if a state signal $n_k$ assigns an Up value to state $M_i$ and a Down value to state $M_j$ in the transition $M_i \xrightarrow{s_i} M_j$, or vice versa. The removal of STG signal $s_i$ in such a case would assign the conflicting state signal $n_k$ values Up and Down. Avoiding this situation ensures that the modular state graph has a well-defined assignment of state signal $n_k$. This is described in further detail in Section 5.3.
The above procedure to determine the input set is summarized in algorithm determine_input_set() in Figure 9.
Example: The example STG in Figure 5(a) has two primary outputs (po and s) and two primary inputs (pi and t). The state code of the state graph is $\langle pi, t, po, s \rangle$. In the signal transition graph, two signal transitions t- and pi- immediately precede the output signal transitions po+ and po-, denoted as t- $\to$ po+ and pi- $\to$ po-. Thus, the trigger signal set of output po consists of two signals, i.e., pi and t. To find the remaining STG signals in the input signal set, we start from the state graph in Figure 5(b) containing 8 CSC conflicts. The lower bound on the number of state signals is 2. The STG signal s, which is not output po and is not in the trigger signal set $\{pi, t\}$, is removed from the state graph (Figure 5(b)). This reduces the number of CSC conflicts to 3 and reduces the lower bound on the number of state signals to 1. No other STG signal that is not output po can be removed, since the remaining signals are in the trigger signal set $\{pi, t\}$ of output po. Since the state graph in Figure 5(b) does not include any state signal, the input signal set of output po consists of signals $\{pi, t, po\}$. Similarly, the input signal set of output s is $\{pi, t, po, s, n_1\}$, where $n_1$ is a state signal.
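The trigger signal set in this example can be derived from the STG arcs alone. The sketch below is illustrative: the helper names are hypothetical, the first two arcs are the ones quoted in the example, and the third arc is an invented one that does not involve po as a target.

```python
def signal_of(transition):
    """Strip the trailing + or - to obtain the signal name, e.g. 'po+' -> 'po'."""
    return transition.rstrip("+-")

def trigger_signal_set(arcs, output):
    """Signals whose transitions immediately precede a transition of `output`."""
    return {signal_of(src) for src, dst in arcs if signal_of(dst) == output}

arcs = [
    ("t-", "po+"),    # from the example
    ("pi-", "po-"),   # from the example
    ("po+", "s+"),    # hypothetical arc, included only to show it is ignored for po
]
trigger_po = trigger_signal_set(arcs, "po")
```

For output po this yields the signals pi and t, matching the trigger signal set stated in the example.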
Modular State Graph Generation and Constraint Satisfaction
The input signal set belonging to an output is used to derive a modular state graph. The modular state graph is generated by labeling all the transitions of non-input set signals as transitions in the complete state graph . The values of the non-input set signals are removed from the state codes in the complete state graph. Then, the states connected by transitions are merged together. Case 1 : If states M i and M j have the same assignment value for state signal n k , i.e., n i;k = n j;k , then the merged state M i M j in the modular state graph will also have the same assignment value for state signal n k . This is illustrated in Figures 10(a) , (b), (c), and (d).
Case 2 : If state M_i has the assignment value n_{i,k} = Down and state M_j has the assignment value n_{j,k} = Up, or vice versa (Figure 10(e)), then the collapsed state will not have a unique assignment. Thus, the states M_i and M_j cannot be collapsed.
Case 3 : If states M_i and M_j have the assignment values {n_{i,k} = Up, n_{j,k} = 1}, {n_{i,k} = Down, n_{j,k} = 0}, or {n_{i,k} = 1, n_{j,k} = Down}, then the merged state M_i M_j will have the assignment value Up, Down, or Down, respectively. This is illustrated in Figures 10(g), (h), and (i).
Case 4 : The rest of the state signal assignments (Figure 10(j)) are inconsistent with the consistent state assignment constraints. Thus, they cannot be assigned by the SAT algorithm.
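The case analysis can be written as a small lookup, sketched here in Python. The function name and the string encoding of the four values are assumptions. Only the combinations listed in the text are encoded, the Case 3 pairs are treated as ordered (value in M_i, value in M_j), and any pair not listed falls through to Case 4, which may be stricter than the complete table in Figure 10.

```python
from typing import Optional, Tuple

ZERO, ONE, UP, DOWN = "0", "1", "Up", "Down"

# Ordered pairs (value in M_i, value in M_j) listed under Case 3,
# mapped to the value of the merged state.
CASE3 = {(UP, ONE): UP, (DOWN, ZERO): DOWN, (ONE, DOWN): DOWN}


def merge_assignment(a: str, b: str) -> Tuple[int, Optional[str]]:
    """Return (case number, merged value); the value is None if no merge is possible."""
    if a == b:                    # Case 1: identical values carry over
        return 1, a
    if {a, b} == {UP, DOWN}:      # Case 2: no unique value, states stay separate
        return 2, None
    if (a, b) in CASE3:           # Case 3: the three listed combinations
        return 3, CASE3[(a, b)]
    return 4, None                # Case 4: treated as inconsistent in this sketch


print(merge_assignment(UP, ONE))   # (3, 'Up')
print(merge_assignment(DOWN, UP))  # (2, None)
```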
The remaining CSC constraints in the modular state graph are satisfied by deriving a Boolean satisfiability formula (Section 4). The solution of this SAT formula gives the state signal value assignment for every state in the modular state graph. The new assignments resolve all the CSC conflicts in the modular state graph. These assignments from the modular state graph are then communicated to the complete state graph.
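As a rough illustration of what the SAT step has to eliminate, the following Python sketch lists CSC conflicts under the usual definition: two states conflict when they carry the same binary code but enable different non-input transitions. The function name, the data layout, and the three-state fragment are assumptions made for illustration and are not taken from the figures; the conflict counts quoted in the examples may be computed differently.

```python
from itertools import combinations


def csc_conflicts(codes, enabled_outputs):
    """Pairs of states with equal codes but different enabled non-input transitions."""
    return [
        (a, b)
        for a, b in combinations(sorted(codes), 2)
        if codes[a] == codes[b] and enabled_outputs[a] != enabled_outputs[b]
    ]


# Hypothetical fragment with state code <pi, t, po>.
codes = {"A": (1, 0, 0), "B": (1, 0, 0), "C": (0, 0, 0)}
enabled_outputs = {"A": {"po+"}, "B": set(), "C": set()}

conflicts = csc_conflicts(codes, enabled_outputs)
print(conflicts)  # [('A', 'B')]
```

A state signal that takes different values in A and B would separate the two codes and remove this conflict.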
The above procedure to generate the modular state graph and to satisfy the STG constraints is summarized in the algorithm module_gen_and_constraint_satisfaction() in Figure 11.
Example : In the previous section, the input signal set of output po was derived as {pi, t, po}. Since the STG signal s does not belong to this input set, all the transitions of s are labeled as ε transitions in order to derive the modular state graph for output po. The value of signal s is removed from the state code <pi, t, po, s> of the complete state graph. Thus, the state code of the modular state graph is <pi, t, po>. The state sets {M_2, M_4}, {M_3, M_6}, {M_5, M_8, M_11}, {M_7, M_10, M_13}, and {M_9, M_12, M_15} of the complete state graph (see Figure 12(a)) are each merged into a single state of the modular state graph (Figure 12(b)). Since, at this stage, there are no state signal assignments associated with the complete state graph, the modular state graph does not require the propagation of the state signal assignments.
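The generation step (hide the non-input-set signals, then merge states joined by the hidden transitions) can be sketched with a union-find structure. The Python below is a minimal sketch under stated assumptions: the function name, the data layout, and the four-state fragment are hypothetical and do not reproduce the state graph of Figure 12.

```python
def modular_state_graph(codes, edges, order, input_set):
    """Project state codes onto `input_set` and merge states linked by hidden transitions.

    codes: state name -> tuple of signal values, in the signal order `order`
    edges: (source state, transition label, destination state)
    """
    keep = [i for i, sig in enumerate(order) if sig in input_set]
    parent = {m: m for m in codes}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for src, label, dst in edges:
        if label.rstrip("+-") not in input_set:  # hidden (epsilon) transition
            parent[find(dst)] = find(src)

    groups = {}
    for m in codes:
        groups.setdefault(find(m), []).append(m)

    merged = {}
    for members in groups.values():
        projected = {tuple(codes[m][i] for i in keep) for m in members}
        assert len(projected) == 1, "merged states must share one projected code"
        merged[tuple(sorted(members))] = projected.pop()

    visible = {(find(s), l, find(d)) for s, l, d in edges
               if l.rstrip("+-") in input_set}
    return merged, visible


# Hypothetical fragment with state code <pi, t, po, s>.
codes = {"M1": (1, 0, 0, 0), "M2": (1, 0, 0, 1),
         "M3": (0, 0, 0, 1), "M4": (0, 0, 0, 0)}
edges = [("M1", "s+", "M2"), ("M2", "pi-", "M3"), ("M3", "s-", "M4")]
merged, visible = modular_state_graph(codes, edges, ("pi", "t", "po", "s"),
                                      {"pi", "t", "po"})
print(merged)  # {('M1', 'M2'): (1, 0, 0), ('M3', 'M4'): (0, 0, 0)}
```

The sketch ignores existing state signal assignments; when they are present, merging two states is additionally subject to the assignment cases described for Figure 10.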
The modular state graph for output po has 3 CSC conflicts and requires one state signal to resolve them. The state assignments for state signal n_1 are derived using the SAT-CSC model. The modular state graph with the state signal n_1 assignments is shown in Figure 13. The assignment of state signal n_1 is then communicated to the complete state graph (see Figure 14).
The above procedure to propagate new state signal assignments from the modular state graph to the complete state graph is summarized in the algorithm propagate_new_signal_assignment() in Figure 15.
Example : For the PaBlock example, the state signal n_1 assignments are propagated from the modular state graph of output po (Figure 14(a)) to the complete state graph (Figure 14(b)). In the modular state graph generation process (Section 5.3), the state sets {M_1}, {M_2, M_4}, {M_3, M_6}, {M_5, M_8, M_11}, {M_7, M_10, M_13}, {M_9, M_12, M_15}, {M_14}, {M_16}, {M_17}, and {M_18} in the complete state graph (Figure 12(a)) were each merged into a single state of the modular state graph (Figure 12(b)). Thus, the assignment of the new state signal n_1 that corresponds to state M_1 (i.e., 0) in the modular state graph (Figure 14(a)) is added to state M_1 in the complete state graph (Figure 14(b)). The assignment corresponding to state M_2 (i.e., 0) in the modular state graph is added to states M_2 and M_4 in the complete state graph. The remaining state assignments are propagated in the same way, and the complete state graph with the new state signal n_1 assignments is shown in Figure 14(b). Figure 16 illustrates the concept of assignment propagation from the complete state graph to the modular state graph (i.e., downward state assignment) and the reverse procedure (i.e., upward state assignment).
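The upward propagation in this example amounts to copying each modular-state value to every complete-graph state that was merged into it. The Python sketch below uses the state sets listed in the example; the function name and dictionary layout are assumptions, each merged state is keyed by its first member, and only the two n_1 values stated in the text (for M_1 and M_2) are included, since the remaining values appear only in Figure 14.

```python
def propagate_up(groups, modular_values):
    """Copy each known modular-state value to all of its member states."""
    return {
        member: modular_values[rep]
        for rep, members in groups.items()
        if rep in modular_values
        for member in members
    }


# State sets from the example, keyed by the first member of each set.
groups = {
    "M1": ["M1"], "M2": ["M2", "M4"], "M3": ["M3", "M6"],
    "M5": ["M5", "M8", "M11"], "M7": ["M7", "M10", "M13"],
    "M9": ["M9", "M12", "M15"], "M14": ["M14"], "M16": ["M16"],
    "M17": ["M17"], "M18": ["M18"],
}

# Only the two n_1 values given in the text; the others are left out.
n1_modular = {"M1": "0", "M2": "0"}

n1_complete = propagate_up(groups, n1_modular)
print(n1_complete)  # {'M1': '0', 'M2': '0', 'M4': '0'}
```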
Logic Function Derivation
The partitioning process to generate modular state graphs from the complete state graph with new state signal assignments is repeated for every output in the STG until all the CSC conflicts in the complete state graph are resolved. In the worst case, the iteration process must be repeated for all the outputs. In practice, all the CSC conflicts are resolved much earlier than the iteration on the last output signal. We may continue the process after all the CSC conflicts have been resolved in order to obtain a modular state graph for the remaining outputs. In general, it is more efficient to stop the process once all the CSC constraints are satisfied. Then the state signal transitions are included in the complete state graph by expanding the state signal assignments using a simple procedure described in [40]. As an example, the expanded modular state graph for output po is shown in Figure 17(b) [40].
The logic function of an output, which is in the sum-of-products form, can then be obtained by finding the implied values of the STG outputs in every state of the expanded state graph (see Section 3.2). A prime-irredundant cover of the output logic function can be obtained by employing a standard logic minimizer, e.g., espresso. This cover may contain some static 1-hazards and dynamic hazards, which can be removed by using known hazard removal techniques [15, 21].
Example : For the PaBlock example, the complete state graph with the assignments of state signals n_1 and n_2 is shown in Figure 7. These two state signals resolve all the CSC conflicts in the given state graph and satisfy the consistent assignment constraints and the semi-modularity constraints. The expanded state graph, derived from the complete state graph in Figure 7, has 34 states. The logic functions of output signals po, s and state signals n_0, n_1 are derived from the prime-irredundant cover from espresso. They are given as follows: po = pi·t·n_1, s = n_0 + s·n_1, n_0 = s·n_0 + pi·t·s·n_1, n_1 = t + pi·n_1 + s·n_0·n_1. The above logic functions yield an implementation area of 18 literals.
The complete procedure for the general partitioning algorithm for asynchronous circuit synthesis from signal transition graphs is summarized in the algorithm asynchronous_ckt_synthesis() in Figure 18.
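The literal count quoted for the PaBlock example can be checked mechanically. The Python sketch below counts literals in each sum-of-products expression as printed in the text; the string encoding and the function name are assumptions. The count does not depend on the polarity of each literal, so it is unaffected if complement marks are not shown in the expressions.

```python
def literal_count(sop: str) -> int:
    """Number of literals in a sum-of-products string such as 'a.b + c'."""
    return sum(len(term.split(".")) for term in sop.split("+"))


# Expressions as printed in the example ('.' is AND, '+' is OR).
functions = {
    "po": "pi.t.n1",
    "s": "n0 + s.n1",
    "n0": "s.n0 + pi.t.s.n1",
    "n1": "t + pi.n1 + s.n0.n1",
}

per_function = {name: literal_count(expr) for name, expr in functions.items()}
total = sum(per_function.values())
print(per_function)  # {'po': 3, 's': 3, 'n0': 6, 'n1': 6}
print(total)         # 18
```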
Experimental Results
The general partitioning algorithm for asynchronous circuit synthesis was implemented in the C language. We employed an efficient implementation of a branch and bound algorithm. For the existing industrial asynchronous circuit (STG) benchmarks, the experimental results of our general partitioning approach in terms of computing time and circuit implementation area are given in Table 1. In Figure 19, for three asynchronous circuit design approaches, we give a quick comparison of the number of design constraints, the computing time, and the circuit implementation area for a number of slightly larger industrial asynchronous circuit design benchmarks.
The results indicate that our general partitioning approach outperforms both Lavagno et al.'s algorithm and Vanbekbergen's direct synthesis method. We also calculated the two-level area of the synthesized logic circuit by finding a prime-irredundant cover with the logic minimizer espresso. We ran espresso with the single-output exact minimization option, i.e., espresso -Dso -S1. The two-level area results for Lavagno et al.'s algorithm were also calculated from the prime-irredundant cover of the network logic function. We employed the astg_syn -r option in the U.C. Berkeley logic synthesis tool SIS [32] to find the prime-irredundant cover. On average, our general partitioning algorithm reduces the two-level implementation area by 12% relative to Vanbekbergen's direct synthesis method. Compared to Lavagno et al.'s algorithm, we obtained an average area improvement of 9%. The two-level implementation area results are summarized in Table 1.
The number of literals was calculated from the unfactored prime-irredundant cover obtained using the espresso -Dso -S1 options. The state splitting technique of [16] has not yet been implemented in the U.C. Berkeley logic synthesis tool SIS, which results in an internal state error in some cases [14].
Conclusion
A general partitioning approach for asynchronous circuit design is presented. The approach partitions an STG into smaller and manageable subgraphs, which significantly reduces the complexity of designing large and complex asynchronous circuits. Experimental results with numerous industrial asynchronous circuit benchmarks indicate that, compared to the major asynchronous circuit design approaches, this general partitioning method produces asynchronous circuits with reduced circuit implementation area while achieving many orders of magnitude of performance improvement in terms of computing time. This efficient method lends itself well to general, practical asynchronous circuit design.
