Abstract-State charts have been investigated recently as a powerful specification language for control structures. They extend classical finite state machines (FSM's) in several ways, catering mainly for hierarchy, concurrency, and communication. We present a novel, simple, single-block implementation scheme for state charts, which uses a single conventional combinational-logic block and a state register. The most attractive feature of the proposed scheme is the absence of communication. It eliminates the need for communicating FSM's owing to an older realization method, and does so without having to account for all state configurations implied by concurrency. We investigate the state encoding conditions for our implementation and suggest an appropriate optimization technique.
I. INTRODUCTION
Finite state machines (FSM's) have constituted one of the main formalisms underlying the prevailing approaches for the description and implementation of hardware control units. Their advantages are mainly their simple semantics as well as their simple and regular implementation scheme, based on a combinational-logic unit and a state register, illustrated in Fig. 1 .
Typically, FSM's are pictorially represented by state diagrams.
These diagrams, however, are inherently sequential and flat. ' Recently, an attempt at overcoming these limitations has been made with the advent of state charts, [7] , [9] , which extend state diagrams to cater for hierarchy, concurrency, and synchronization, while retaining their formality and visual nature. In a recent paper [7] , we investigated their use for hardware description and synthesis.
One of the major drawbacks of state charts is the absence of a simple, easy-to-implement implementation method that will be reasonably economical in terms of VLSI resources. In this paper we investigate a novel implementation scheme based on the classical model of Fig. 1 , namely, a single combinational-logic block and a state register. In Section 111, we investigate the sufficient conditions for this scheme to realize state charts correctly. In Section IV we investigate the state-assignment optimization problem for this implementation scheme.
The only existing synthesis method for state charts in the literature is that of [7] . There, the suggested synthesis technique is asymptotically efficient for very large state charts, where the state chart tree out-degree is assumed to be of constant size with respect to the number of states in the state chart. It is based on a tree of communicating FSM's which is isomorphic to the state chart tree.
Manuscript received February 9, 1989 ; revised June 27, 1990. This paper was recommended by Associate Editor R. K. Brayton.
A shorter version of this paper was presented at the SASHIMI-90 Workshop on Synthesis, Japan, Oct. 1990 .
The author was with the CAD Department, Semiconductor Group, Sony Corporation, 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa-ken 243, Japan. He is now with SSL, Sony, 61 1-B River Oaks Parkway, San Jose, CA 94087.
IEEE Log Number 9 100302. 'Throughout, we shall refer to state diagrams as FSM's. The exact meaning should be clear from the context. The major drawback of this method derives from its multiple-block implementation scheme, where several communicating FSM's, arranged in a tree structure, realize the state chart. The communication and synchronization messages within the FSM tree require considerable care to be implemented correctly. This is especially true for submicron technologies, where communication delays are dominant. Also, the old method is inefficient with respect to area and speed for small and medium sized state charts, where constant factors play an important role, and it is sensitive to other parameters (e.g., the out-degree of the state chart tree). In Section I1 we present a brief overview of the state chart formalism and the implementation method of [7] . The major contribution of this method is its handy implementation scheme. Having to deal only with a single combinational-logic block and a state register, we managed to make the actual low-level implementation of state charts identical to that of a conventional FSM, thus enabling the use of most CAD tools built for FSM implementation (e.g., PLA optimization techniques). This is done without having to enumerate all state configurations implied by concurrency and without any state blow-up caused by the power of high-level transitions implied by hierarchy.
Other related synthesis methods are FSM synthesis and Petri-net synthesis. FSM synthesis is related as follows. Given a state chart, one can unfold its concurrency and hierarchy and generate an "equivalent FSM," namely, a sequential state machine that will accept (and produce) the same formal language. This is done essentially by considering the state set defined by the Cartesian product of the state sets within the state chart [3] . This process can be automated so that one can consider the following alternative synthesis method for state charts: 1) convert the state chart into an equivalent FSM and 2) synthesize the FSM using existing CAD tools. However, existing results exhibit exponential lower bounds for the conversion step [8] . Consequently, this method becomes useless even when small degrees of concurrency exist in the state chart. For this reason we shall not review conventional FSM synthesis methods.
0278-0070/91$01.00 0 1991 IEEE Petri nets (PN's) have an extensive body of literature ( cf. [14] ). In their finite resource versions, PN's resemble nondeterministic FSM's, which allow many simultaneous computation paths. PN's, however, do not incorporate hierarchy as part of their syntax. Consequently, there is no known method for emulating state charts on finite-resource PN's without using extensive communication. As pointed out later in this paper, the major difficulty in synthesizing state charts is due to the combination of hierarchy and concurrency. For each feature alone, there exist relatively simple implementation methods. For example, one recent implementation method for PN's [ l ] uses a 1-hot state assignment (also suggested in [15] for nondeterministic FSM's), later optimized so that sets of "exclusive" places (i.e., places that are never reached simultaneously) use a logarithmic number of bits. This is essentially what our method does when the state chart incorporates concurrency without hierarchy (;.e, when it is a set of concurrent FSM's), with the exception that in our case these sets of "exclusive" states are given syntactically by the state chart whereas for PN's one must find them somehow. When the state chart has no concurrency, the equivalent FSM is not exponential and FSM synthesis tools are indeed relevant.
STATE CHARTS, LANGUAGE, DEFINITIONS, A N D OLD-
The state chart formalism was introduced in [9] as a visual formalism for specifying the behavior of complex reactive systems [7] , [IO] . We shall not give their formal syntax and semantics, of which several versions exist [3], [I 11- [13] , but rather review their behavior through the traffic-light controller example of [7] . Fig.  2(a) describes the behavior of a traffic-light controller whose I/O interface is described in Fig. 2(b) . There are two sets of lights: one is positioned over the main road and the other is over the secondary road. During the day (Day = = 1) the controller operates according to one of two possible programs, whereas during the night it operates according to a special night program. The controller can be operated manually as well (Auto = =O). In this mode whenever a policeman pushes the Police button, the lights altemate. A hidden camera can be operated by the controller only when it is in AU-TOMATIC mode. An ambulance signal can arrive (Amb = = I ) , notifying the controller that an ambulance is approaching the junction. Then the lights are set according to the ambulance's direction, and all other events are ignored. The controller can receive an error message (Errin = = l), which will cause yellow lights to flicker. Another possibility for an ERROR occurs when the controller operates manually for more than 15 min without a policeman pushing the police button. A reset signal resets the controller to the AU-TOMATIC state. Note that this state chart is not given in detail; such details are available in [7] , whereas a more elaborate description of the formalism is to be found in [9] , and formal syntax and semantics are available in [3] and [ 1 I].
In Fig. 2(a) , we have exclusive states, which can never be reached simultaneously (e.g., DAY and NIGHT), and orthogonal states, which can be reached simultaneously (e.g., AUTOMATIC and CAMERA). Note that these relations are symmetric and not transitive; e.g., DAY and AMBULANCE are exclusive and so are AMBULANCE and CAMERA, but DAY and CAMERA are not; they are orthogonal. We have basic states (e.g., AMBULANCE) and superstates (e.g. LIGHTS, CONTROL, NORMAL). We use default entrances (e.g. the entry to AUTOMATIC within NOR-MAL). We have high-level transitions, such as the Errin transition to ERROR, which is equivalent to drawing all transitions that lead from any configuration within OPERATE to the default state of ERROR. We also have flexible concurrency, where concurrency is described at any hierarchical level, without causing sequential descriptions to become an awkward exception. Hence, in Fig. 2 (a) CAMERA is orthogonal to AUTOMATIC, but LIGHTS is orthogonal to both of them, on a much higher level. Sometimes we think in terms of concurrent processes, where several states that are reached simultaneously are considered as concurrent processes. We call such an element a state configuration and denote it as a tuple of states. Note that because of the flexible-concurrency feature, the number of processes in a state configuration is not necessarily fixed.
We will see in the sequel that this is the reason for naive state assignments for state charts to be incorrect. Finally, intemal communication between concurrent superstates is possible; throughout this paper we shall assume that such communication is operation-
/ \ D (AND)
e ---\ J""Y< S12 S13 514 ally defined as a delayed feedback loop, from the appropriate output port to the corresponding input port of the black box. Throughout the rest of this paper we shall use the state chart of Fig. 3(a) , which describes the (complex) behavior of the black box illustrated in Fig. 3(b) , with four input channels, each of which receives U, (meaning U, = 1) or T U , (U, = 0), i = 1, 2, 3, 4, and three output channels, each of which produces U, or 1 U ( , i = 1, 2, 3. Note how the state chart svntax induces an AND/OR tree, where hierarchy and concurrency are replaced with OR and AND relationships, respectively, as illustrated in Fig. 3 (c). This state chart tree is frequently used for the definition of various important relations, one of which is the least common ancestor (LCA) relation, where the LCA of S, and S, is E, the LCA of S, and SI3 is D, and the LCA of S8 and SI5 is C. Now, two states are formally defined as exclusive if their LCA is an OR state, and as orthogonal otherwise. Also, the state chart tree serves as the basis for the implementation method of [7] , which will be reviewed in the sequel.
Intuitively, the semantics of state charts can be understood as follows. The state chart computation visits state configurations, which are elements of a Cartesian product. For example, starting at SI (as indicated by the default transition), when input ul is received, the next state configuration will be (S,, s,), and the output event u I is produced. Next, when the pair of inputs ( u~, v3) is received, the following state configuration is ( S3, S 5 ) , and the triple ( U , , U*, u 3 ) of output events is produced. The transitions S, -+ S3 and S, + S5 are said to be orthogonal. Hence, computation takes place in the form of global transitions between state configurations, global transitions which are composed of one or more orthogonal state chart transitions, each of which either encapsulates one or more transitions from basic states, owing to hierarchy, or is (recursively) a global transition.
Clearly, these examples illustrate, on the one hand, the tremendous flexibility within the language and, on the other hand, the difficulty of emulating this behavior in a simple way. Currently, more than one version of formal semantics exists for the formalism [3], [11]- [13] , differing mainly in the notion of time and communication. Hence, our implementation scheme will not be based on formal semantics but rather on the intuitive behavior of the main features of state charts.
The synthesis methodology of [7] maps the state chart tree onto an isomorphic tree of communicating FSM's, one FSM for each superstate. Such an implementation of the state chart of Fig. 3 is illustrated in Fig. 4 . Each FSM implements a single superstate and its immediate substates. Hence, FSM DI implements an FSM with two states, E and SII, and an extra idle state, whereas FSM E implements an FSM with four states, S8, S9, Slo, and idle (see [7] ).
Later the FSM tree is laid out using well-known layout techniques for trees. Hence, this synthesis methodology is efficient when d, the state chart tree out-degree, is constant with respect to n, the number of states within the state chart. This, of course, is true only if n is extremely large. Also, special communication signals are generated for transitions that cross state boundaries, such as the transition from S, to S5. Naturally, such communication overhead degrades performance significantly. The biggest drawback of this implementation scheme, however, is its complexity. (The FSM tree requires special attention for the correct implementation of the inter-FSM communication and synchronization, which is a nontrivial design problem.)
THE SUGGESTED SINGLE-BLOCK IMPLEMENTATION'
In order to reduce the variety of possible transitions within a state chart, we use the abstraction step illustrated in Fig. 5 ; namely, we replace high-level transitions with a primitive form of concurrency. Both state charts in Fig. 5 are equivalent; in the original state chart the a, 0 transitions take place from either A, or A,, whereas in the transformed one they take place concurrently with the activity within A, which has the same final effect. This is true no matter how high the states are within the hierarchy; we simply add an orthogonal state to the superstate A (the original source of the high-level transition), which consists of precisely one basic substate A', and change the corresponding high-level transition so as to depart from A'. Also, transitions will always be used with basic states as their target states (e.g., instead of D -+ D we will use D --t (S,, SI3)). These two transformations are done for every transition in the state chart. Fig. 6 is the basic structure of the transformed state chart equivalent to that of Fig. 3 . Hereafter, we shall refer only to the transformed state chart, and we will omit the prime.
Consequently, there is now only one general type of state cham transition, namely, a quadruple (X, Y, a, @), where X and Yare tuples of basic states, representing the transition's source and target, respectively, a is the transition's label (input), and / 3 is the output produced when this transition is traversed. The transition's source and target are subtuples of a state configuration (in our example, (E, SI3) is a subtuple of (E, SI3, SI,)).
Now, consider the FSM emulation method of Fig. 1 . Here, there is an isomorphism between the FSM state set and the code words stored in the state register. Also, the FSM transition function is emulated by the combinational logic. Obviously, if we try to emulate a state chart in the same way, we might need an enormous combinational-logic block to emulate an exponential number of global transitions between state configurations induced by concurrency. Hence, we shall emulate all state chart transitions in the combinational-logic block in a way which will preserve the global behavior of the state chart. Our implementation scheme is illustrated in Fig. 7 . The combinational logic implements the original state chart transitions in the conventional way; a point on the LSI grid which represents a transition (i.e., the value at this point is 1 iff the transition is traversed) is called a term. For example, in the PLA of Fig. 8(a) , rows 1 through 24 represent 24 terms.
Based on Fig. 7 , a state chart implementation consists in implementing all terms for all state chart transitions, in a way that is similar to the conventional FSM implementation. For a transition ( X , Y, a, P), the source block contains a conjunction of the code words that represent the basic states within X (and of a, the inputs). For a given basic state (or a given output), the target block contains a union of all terms that represent transitions whose target tuples contain this basic state.
State assignment is the crucial step within this implementation if one wants to implement both hierarchy and concurrency correctly. We shall further elaborate this point through a sequence of three abstractions. First, consider the extremely restricted case where state charts are restricted to be FSM's only (i.e., no concurrency or hierarchy is permitted). In this case, any log n bit unambiguous code assignment suffices for a correct implementation of the model of only state charts that consist precisely of m concurrent FSM's. In this case we can simply derive a state assignment for each FSM individually, and then for every state configuration, concatenate the appropriate individual code words into one wider code word. Note that in this case, an unambiguous individual state assignment for each FSM will suffice. Finally, we end our abstraction process in the general case where both hierarchy and concurrency exist. Now, not every such individual state assignment is correct. Consider a state assignment where 0 and 1 are assigned to Yo to represent S2 and &, respectively, whereas 00, 01, and 10 are assigned to Yl and Y, to represent SI5, SI6, and SI,, respectively, in Fig. 6 . Now, when the transition SI -+ (S,, S,, B ) is traversed, Y, and Yz are not affected by the target block, and might be assigned an erroneous code. In fact, when a PLA is used, this code will be 00, so SlS will become part of the next configuration. Hence, as discussed in the introduction, the main cause of difficulty is that because of hierarchy, state chart concurrency is flexible, so one cannot naively assign a fixed portion of a code word to each process. For this reason, state chart state assignment is not a naive extension of an FSM state assignment, as in the restricted case discussed earlier.
We shall distinguish between state assignments and conjiguration assignments. A configuration assignment is a code that represents state configurations, whereas a state assignment is a code that maps basic states to binary strings. Our goal is to find a state assignment such that the induced n,-bit configuration assignment will be unambiguous. We define the U operator to extend the binary "or" operator with VUa = alJ9 = a for a = 0, 1, g. Now, for a well-dejned state assignment, we require the following state chart state assignment conditions: S7 + D <E,S13> + S7 S13 + 512 S12 -+ S14 S14 + S12 S12 + SI3 D -+ D S8 + s11 s11 4 s10 s10 -+ s 9 S9 -+ S8 S6 --t S7 S15 -+ S16 S16 + S17 S17 4 S15 S17 -+ C term no. The first condition describes the nonambiguous state assignment we are looking for, whereas the second condition guarantees that the configuration assignment is nonambiguous and (because it per-mits binary strings only) guarantees that code words, in their entire length, describe legal configurations in a precise way. Hence, these conditions ensure that inductively, as computation proceeds, the binary code words of length nb represent, in a unique and precise way, only legal state configurations. It is the responsibility of the combinational logic to implement the induction correctly with the correct implementation of terms as described earlier. Stated informally, a well-defined state assignment p is such that when code words of elements of a state configuration are summed (U), the resulting configuration assignment T is well defined (i.e., binary) and nonambiguous. Hence, in the previous example, the 3-bit prefixes of p(S2), p(S4), and p@), are m, m, and m, respectively; thus the 3-bit prefix of r (&, S4, B) is w h i c h is not well defined.
A trivial well-defined state assignment resembles the well-known FSM 1-hot state assignment. Here, nb = n, where n is the number of basic states in the state chart (e.g., in Fig. 6 , n = 21). We assign to p(Si) a tuple of nb trinary symbols, where 1 is assigned to the i th variable, 0 is assigned to thejth variable for every j # i such that Sj and Si are exclusive, and 9 is assigned to every other symbol.
For our example, the 1-hot state assignment for SI3 and SII are, respectively, (where we consider B, C, D, E to be SI8, S19, S20, and S,,, respectively).
Theorem I: The 1-hot state assignment satisfies the state chart state assignment conditions. Sketch ofProofi Clearly p is 1-1. That r maps configurations to (0, l}nb follows from the fact that for every configuration c = (SI, -* , Sk) and every i 5 nb, the ith symbol of the code word p(Sj) for some Sj, j 5 k , is not $4. This is because either Si is in c or it is exclusive to some such Sj in c; otherwise the configuration is not "complete" (some concurrent process is not described in c). Consider any two different configurations cI, c2. Clearly, there must be two exclusive basic states Si in cI and Sj in c,; hence ~( C I ) and T(c,) differ in the ith andjth symbols. Thus, is 1-1.
Q.E.D.
IV. EFFICIENT STATE ASSIGNMENTS
As with conventional FSM state assignment, where the total implementation area is optimized (e.g., [2] ), we wish to optimize the total area consumed by our implementation. This includes two main aspects: minimizing the number of terms and minimizing the number of state variables. Term minimization is outside the scope of this paper mainly because it seems to be tightly coupled with state chart minimization, which has not yet been s01ved.~ Hence, we shall concentrate on the minimum state-encoding problem, where a state assignment is sought that is of minimum width (i.e., nb is minimum). This state assignment should be constrained to satisfy the state assignment conditions described earlier. 4 We do not know how to solve this problem in general. Instead, we shall investigate one approach, the exclusivity encoding state assignment. Generally speaking, it is identical to the 1-hot assignment, except that sets of basic states, called exclusivity sets, are coded with a logarithmic number of bits. Formally, we split the set of basic states into m pairwise disjoint exclusivity sets R I , , R,, where each such set consists of states that are pairwise exclusive. For the example of Fig. 6 , we can divide the basic sets into {s1, s,, s3, S6, S7, Sg, S,, Slo, SI,), {S4, S5, S12, SI39 S14}, {SI,, SI6, SI,}, {a, E}, {D}, and {C}. Now we encode all states that belong to a common set R, with [log !Rill bits instead of the (Ril bits used by the 1-hot assignment. This reduction is possible because all states in each such set Ri, i E 1, . . . , n, are pairwise exclusive. These groups of bits are then concatenated to form a state configuration code word. Hence, in the above example, we use 4, 3, 2, 1, 0, and 0 bits, respectively. However, a subtle point now emerges. Consider, in the above example, states Sls and S2.
They are exclusive, and in the above decomposition they belong to two different exclusivity sets. As a result, when the present configuration is (S,, S4) , the pair of bits representing the exclusivity set {SI5, S16, SI,} should represent a dummy state so that, say, ( S 2 , S,, SI,) will not be represented. Formally, for the two exclusivity sets Ri and Rj, when there is a basic state in Ri which is part of a state configuration c in which no basic state of Rj is included, then Rj needs a dummy state to ensure its exclusivity from other elements of c . Hence, when the present configuration is ( S2, S,) the pair of bits representing {SI5, Sl6, SI7} should hold the code for a dummy state.
Once such exclusivity is guaranteed, T becomes 1-1, and we can conclude.
. . .
Theorem 2:
The exclusivity encoding state assignment method satisfies the state chart state assignment conditions.
The exclusivity encoding optimization problem (EEOP) is the appropriate optimization problem, namely, the problem of splitting the basic states into pairwise disjoint exclusivity classes R , , * * , R,, such that is minimum, taking into account the need for dummy states as well. We have investigated a simpler where the need for dummy states is not considered. Note that the solutions for both problems do not necessarily unify. For example, the decomposition given above is EEOPl optimal for our example, although not EEOP optimal. This is evident from the state assignment in Fig. 8(a) , where we have moved SI and S,, from the first set to the second and forth sets, respectively, and thus saved one state variable.
We say a state chart is m-concurrent if m is the maximal dimension consumed by any configuration. For example, the state chart of state chart consists of m exclusivity sets.
Sketch of Proof: Clearly, at least m exclusivity sets are required, one for each basic state in the configuration that has m processes. If more than m sets are used, then one set R is redundant in the sense that for every basic state S E R there is an exclusivity set Rj to which S can be appended (otherwise the system is not m-concurrent but has a higher degree of concurrency, thus reducing the total cost).
Q . E . D .
As a consequence of Lemma 3, given an m-concurrent state chart, an algorithm for solving EEOPl consists of finding m, and then finding the m largest exclusivity sets.
Given a state chart AND/OR tree A , we define a trace to be a subtree A' that contains A's root, and satisfies the following: = 0 0, which means an illegal next state in C,. This is because there is no specified transition for the input 1 v4 when in SIs of C,.
The problematic issue here is that detecting such a situation of unspecified transitions in an m-concurrent state chart is NP-complete in the number of input channels, and PSPACE-complete in m [3], [4] . This is a general problem in concurrent systems and is discussed in detail in [4] . It is outside the scope of this paper, and we would like to mention only that the solutions suggested in [4] are applicable here too. Yet another interesting phenomenon is nondeterminism. In Fig. 3 , for example, the configuration ( S 2 , S 5 ) has two consecutive configurations for the input tuple ( 1 v l , U*, u3,  U,) . The I-hot state assignment can emulate this nondeterminism as a form of concurrency similar in behavior to PN concurrency. As described in [3] and [4] , nondeterminism in state charts is also extremely difficult to detect, because of concurrency.
1) If S iS an
Although asymptotically more area-efficient (in fact, almost Optimal), the method of [7] might generate an implementation that is
The following lemma follows.
Lemma 4: Given a state chart S, the set of leaves of a trace of S forms an exclusivity set, and every exclusivity set within S is a subset of the set of leaves of some trace of S.
Sketch of Proof: First, note that the least common ancestor of any two basic states (i.e., leaves) which belong to a common trace is always an OR state, and that the least common ancestor of any two basic states which do not belong to a common trace is always an AND state, so the first part of the claim follows. For the second part, given an exclusivity set E, consider the subtree T of the state chart tree whose set of leaves is precisely E. Clearly, an AND state vertex in T can have only one substate because elements of E are pairwise exclusive. Hence T is a subtree of a trace of S, with a common root, so the second claim holds as well. Q . E . D .
Consequently, finding a maximum-cardinality exclusivity set is equivalent to finding a trace with a maximum number of leaves. Such an algorithm is induced by the following recursion for the computation of a trace T ( S ) , with a maximum number of leaves,
N( T(S )) .
For a basic state S, T ( S ) = S, and N ( T ( S ) ) = 1.
For an OR state S with substates SI, . . . , SI, T ( S ) is the tree whose root is S with substates SI, . . . , SI, which in tum are roots of the subtrees T(SI), . , T(SI), respectively; N ( T ( S ) ) =
N(T(Si)).
For an AND state S with substates SI, . . . , SI, T(S) is the tree whose root is S with one substate Si such that the subtree T(Si) rooted at Si is the subtree with a maximum N(T(Si)); N ( T ( S ) ) = maxi=l...IN(T(Si)). Fig. 8(a) is a PLA implementation of the state chart of Fig. 3 , according to the decomposition described previously. Fig. 8(b) shows the mapping between state chart transitions in Fig. 3 and the terms of the PLA. The PLA consumes an area of (16 x 2 + 15) X 24 = 1128 transistors, and the state register consumes an approximate area of 30 X 12 = 360 transistors. Hence the total implementation area is 1488 transistors. The 1-hot implementation consumes a total of (26 X 2 + 25) x 24 + 30 x 22 = 2508
V. DISCUSSION
transistors.
An interesting aspect of the implementation has to do with unspecified inputs. Assume, for example, that in Fig. 3 the present configuration is <S6, SIs) and that the input ( U , , v2, 1 u 3 , 1 u 4 ) has been received. Clearly one would expect the next configuration to be (S,, Sls). All terms of Fig. 8(a) , however, will specify Y6 Y, inferior to the proposed implementation, as exemplified by the state chart of Fig. 3 , for which an implementation according to the old method consumes a total PLA area of between 850 and 1450 transistors, and a total state register area of 30 X 16 = 480 transistors. The range depends on the type of implementation with respect to unspecified transitions for the problem discussed above, where the smaller implementation is not entirely correct in this respect and the larger one is. In addition, this implementation includes extensive inter-FSM communication wires, two to four wires between each pair of machines. The cost of this overhead depends on the layout and is estimated to be area equivalent to more than 400 transistors. Moreover, if one wishes to reduce inter-FSM synchronization problems, more flip-flops are required to latch inter-FSM communication signals. Obviously, when these FSM's are encoded with a 1-hot state-assignment, the consumed area is larger.
VI. CONCLUSION
We have presented a simple, single-block implementation methodology for state charts, together with an appropriate optimization technique. Our methodology does not require communication and synchronization between FSM's, which makes it easy to implement and verify and allows the use of existing CAD tools (e.g., PLA optimization for the combinational logic generated by this method). Although asymptotically inferior, our scheme is expected to be superior to the previously known method for the synthesis of state charts with a state chart tree out-degree that is not constant with respect to the number of basic states, namely, for small and medium sized state charts and even some large ones. Also, its efficiency does not depend on the number of edges that cross state boundaries.
I. INTRODUCTION
Logic synthesis of sequential finite state machines (FSM's) is a well-developed field of knowledge. There is a massive body of research in this area, originating in the fundamental research of Steams and Hartmanis [7] , [8] . See, for example, [ l ] and [6] .
In this paper we examine logic synthesis of concurrent FSM's. It is our belief that concurrent FSM's will be increasingly used in the future, as exemplified by the following three scenarios:
Consider a real-time control system; here, the real-time constraints impose hardware concurrency of several, perhaps communicating, sequential controllers where, typically, each controller is modeled as an FSM. Consider a design task which is divided between design groups, where each group designs a designated subsystem. When several such subsystems are individually controlled by a finite state mechanism, the whole system is conceptually controlled by a system of communicating concurrent FSM's. Finally, consider the process of silicon compilation of a behavioral hardware description language (HDL). Conventionally, such a compilation output is an FSM for a control mechanism that generates the sequencing instructions for the controlled data path. Naturally, with the advent of higher level HDL's (e.g., VHDL), and for increasingly sophisticated designs, the controlled data path might be inherently concurrent and hierarchical, requiring a network of concurrent FSM's to control it efficiently. Also, it seems quite natural to expect a concurrent behavioral description to be implemented by many controlling FSM's, perhaps one for each sequential process in the high level specification.
Hence, it is important to examine existing, predominantly sequential logic synthesis methodologies in the concurrent realm. To date, this has not been thoroughly done.
In this paper we reexamine the well-known state assignment methodology in this context. In Section 11, we examine three implicit assumptions made by conventional state assignment tools and review them in the light of concurrency. We describe appropriate decision problems and analyze their complexity in Section 111.
MOTIVATION: STATE ASSIGNMENTS FOR SEQUENTIAL A N D INTERACTING FSM's
Consider the FSM of Fig. l(a) and its PLA implementation of Fig. l(b) . This PLA was generated by NOVA [13] , a well-known state assignment program, and embodies three typical assumptions made by such a program: 1) Determinism (DET): the FSM is assumed to be deterministic; i.e., for every state and every input there is at most one next state. This assumption enables the state assignment program to implement an n-state FSM with as few as log n state variables. 2) Unspecified input (UT): if at some state there is an input configuration that causes no next state (it triggers no transition; e.g., (a, 0) = (T, F ) in state s3 of Fig. l(a) ), then it is assumed to be an "impossible" input for this state. In other words, it is assumed to be the designer's responsibility to verify that such an incident does not occur. This enables the state assignment program to exploit the free space for optimization; hence, the PLA of Fig. l(b) generates s2 as the next state in the above case. Note that in the conventional FSM implementation scheme, where a state register is connected to a combinational logic block, the combinational logic always produces a (perhaps erroneous) next state. 3) Input and output orthogonality (100): the FSM is assumed to have orthogonal (independent) inputs and outputs, i.e., all combinations of signal values over the IO wires, except those found to be impossible earlier, are possible. Hence, in 
