J.L. Huertas
Introduction
Nowadays, technological advances lead to more and more sophisticated digital systems. It is impracticable to realise them without the help of CAD tools [1-3]. Systems for the design of finite state machines (FSM) have been implemented since the early 1960s, but few of them included state minimisation because of the inherent complexity of this process. In particular, it was shown that the reduction of completely specified finite automata can be achieved in O(n log n) steps [4]. The minimisation of incompletely specified finite automata is an NP-complete problem [5]. Nevertheless, FSM designers have recently pointed out the need for efficient algorithms for the minimisation of large machines [15, 16].
The minimisation of the number of states is an important task in the optimal design of sequential circuits. Reducing the number of states corresponds to decreasing the number of transitions of the sequencing functions and eventually to reducing the number of implicants in a two-level logic realisation or reducing the number of literals in a multilevel one. Moreover, a reduction of the number of states may correspond to a reduction of the number of bits needed for the state encoding [6].
Another area where state minimisation applies is the test generation for sequential machines. Many of the reported approximations are ineffective when the number of states in the circuit is large and the test demands long input sequences.
In this paper we deal with the state minimisation problem as a part of an automatic synthesis system. After giving some basic concepts to make the paper self-contained, we review existing methods for state minimisation of a FSM. Emphasis is placed on heuristic approaches, establishing a distinction between constructive heuristics and iterative improvement techniques. Two new methods for solving this problem are described, one from each of the two categories above. Experimental results are given and a detailed comparison is carried out, mainly in terms of silicon area and computer time.
Symbolic descriptions consist of a set of symbolic implicants [7]. Each symbolic implicant has four components, i, s, δ(i, s) and λ(i, s), and represents the state transition and outputs produced when input i is applied to the FSM in state s. The present state s and the next state δ(i, s) are symbolic representations (labels) which will be coded in the state assignment phase. Input and output functions can both be symbolic, in which case they will also be optimally encoded, or binary-valued owing to the constraints imposed by other components of the system being designed. If the next state or outputs are not specified for all (input, present state) pairs, the FSM is said to be incompletely specified.
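The definition of a symbolic description can be made concrete with a small sketch. The `Implicant` record, the use of `None` for an unspecified entry and the two-state machine below are illustrative assumptions, not part of the original description format.

```python
from itertools import product
from typing import NamedTuple, Optional


class Implicant(NamedTuple):
    """One symbolic implicant: (input, present state, next state, output)."""
    inp: str
    state: str
    next_state: Optional[str]  # None models an unspecified next state
    output: Optional[str]      # None models an unspecified output


def is_incompletely_specified(implicants, inputs, states):
    """True if some (input, present state) pair lacks a next state or output."""
    table = {(m.inp, m.state): m for m in implicants}
    for pair in product(inputs, states):
        m = table.get(pair)
        if m is None or m.next_state is None or m.output is None:
            return True
    return False


# Hypothetical two-state machine, used only for illustration.
machine = [
    Implicant("0", "s1", "s2", "0"),
    Implicant("1", "s1", None, "1"),   # next state left unspecified
    Implicant("0", "s2", "s1", "0"),
    Implicant("1", "s2", "s2", "1"),
]
print(is_incompletely_specified(machine, ["0", "1"], ["s1", "s2"]))
```

A missing row is treated the same way as an explicit unspecified entry, which is one possible convention among several.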
State
The class set Pi implied by the compatible Ci is the set of all compatibles Cij implied by Ci for all inputs ij, such that (i) Cij has more than one element
The allowed compatibility classes are those for which there is assurance that a minimum cardinality closed cover exists which is composed uniquely of them. There are drawbacks related to both steps. Concerning the first one, the number of those compatibility entities might be too large, precluding an efficient generation. The second step implies the solution of a minimum closed covering problem, which is known to belong to the class of NP-complete problems for incompletely specified finite state machines. Thus, the very nature of this problem precluded the inclusion of a state minimisation step in many FSM design systems during the 1970s and early 1980s.
However, advances both in computing power and in heuristic algorithms have convinced researchers that revisiting state minimisation concepts is worthwhile [13-16]. But nowadays FSM designers are no longer interested in symbolic descriptions with a minimum number of states; instead they approach the FSM synthesis process as a global optimisation task aimed at obtaining minimal cost implementations both in terms of silicon area and design time. It is well known that a symbolic description with a minimal number of states is not necessarily the appropriate starting point when minimal area implementations are looked for; however, a reduction in the number of states usually leads to a reduction in the complexity of the resulting FSM. Hence, there is a need for developing efficient algorithms which are able to produce FSMs optimised in terms of silicon area, speed, testability, etc.
(b) extraction of a minimum closed cover.
Heuristic minimisation approaches: previous
There are several styles of heuristic strategies for solving combinatorial optimisation problems. We classify these strategies into two categories: constructive approaches and iterative improvement methods. In constructive heuristics, a good solution to the problem being solved is built up piece by piece. This kind of approach to the state-reduction problem follows the structure of classical methods but, instead of attempting to determine minimum solutions, near-minimum (minimal) ones are heuristically selected through (a) generation of a set (usually, a complete set) of some kind of compatibles and (b) heuristic selection of a minimal closed cover.
Heuristic algorithms were developed as early as 1972 by Bennets [18]. In this pioneering work it was pointed out that the type of initial compatibles on which they are based is not so critical. Bennets proposed a reduction algorithm based on the maximal compatible sets, MCs. Using this subset of the prime compatible set reduces (in some cases drastically) the number of candidate compatibles. The procedure consists of selecting one of the essential (or quasi-essential) MCs and attempting to satisfy its closure requirements (generating one of the smallest sets of MCs that satisfies the violated closure requirements for the MC selected). The result will be a closed set of MCs that may or may not provide full covering of the initial set of states. The procedure is repeated until a full cover is achieved.
Very recently, several heuristic algorithms for FSM minimisation have been proposed. Kannan and Sarma [15] begin by generating the set of all the maximal compatibles. Then a minimal set of maximal compatibles which covers all the states is built up. Finally, if the FSM is incompletely specified and the minimal cover is not the complete set of maximal compatibles, this minimal cover is expanded to obtain an optimal closed cover. Two different heuristic algorithms are proposed for the last step. In the 'large set' approach, each compatible in the minimal cover is checked for closure and appropriate compatibles are added to fulfill closure requirements. At every step, an attempt is made to add extra states to compatibles in the cover. New sets are created only if absolutely necessary. In the 'lean set' approach, all the states that occur more than once in the set of maximal compatibles in the minimal cover are removed. Then closure requirements are satisfied by adding states to the existing compatibles or by adding new compatibles.
Hachtel et al. [16] stated that exact state minimisation is feasible for a large class of practical examples but have also pointed out the interest of heuristic techniques for the state minimisation of FSMs generated by sequential synthesis systems. Two heuristic approaches are presented to solve the problem. In both of them they attempt to reduce the time and memory requirements by generating only a subset of the prime class set for the subsequent closed covering problem. First, a closed cover composed uniquely of maximal compatibles is found. Then, only the prime classes contained in the maximal compatibles of the previous solution are computed and used to formulate a binate covering problem [9, 10] which is then solved exactly. There are two different heuristics reported to find a closed cover of maximal compatibles. One of them is suitable for machines with a high number of maximal compatibles and is based on the concept of isomorphic states, so that not all the maximal compatibles are taken into account in the process of building up a closed cover. The other one is for machines with a large number of prime classes. In this case, the complete set of maximal compatibles is generated and a minimum closed cover is found.
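Generating the maximal compatibles is the first step shared by the methods described above. The following is a minimal sketch of one textbook way of doing it, not the procedure of any of the cited references: pairs of states are discarded when their specified outputs conflict or when they imply an already discarded pair, and the maximal compatibles are then enumerated as maximal cliques of the pair-compatibility relation (Bron-Kerbosch enumeration is our choice here). The table layout and the four-state machine are hypothetical.

```python
from itertools import combinations


def compatible_pairs(table, inputs):
    """table[s][i] = (next_state, output); None marks an unspecified entry."""
    def outputs_conflict(s, t):
        for i in inputs:
            zs, zt = table[s][i][1], table[t][i][1]
            if zs is not None and zt is not None and zs != zt:
                return True
        return False

    pairs = {frozenset(p) for p in combinations(sorted(table), 2)
             if not outputs_conflict(*p)}
    changed = True
    while changed:  # drop pairs that imply an incompatible pair
        changed = False
        for p in list(pairs):
            s, t = sorted(p)
            for i in inputs:
                ns, nt = table[s][i][0], table[t][i][0]
                if ns is None or nt is None or ns == nt:
                    continue
                if frozenset((ns, nt)) not in pairs:
                    pairs.discard(p)
                    changed = True
                    break
    return pairs


def maximal_compatibles(table, inputs):
    """Maximal sets of pairwise compatible states (maximal cliques)."""
    pairs = compatible_pairs(table, inputs)
    adj = {s: {t for t in table if t != s and frozenset((s, t)) in pairs}
           for s in table}
    found = []

    def bron_kerbosch(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            bron_kerbosch(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    bron_kerbosch(set(), set(table), set())
    return found


# Hypothetical incompletely specified machine, for illustration only.
table = {
    "a": {"0": ("b", "0"), "1": (None, None)},
    "b": {"0": ("a", "0"), "1": ("c", "1")},
    "c": {"0": (None, None), "1": ("d", "1")},
    "d": {"0": ("a", "1"), "1": ("d", "1")},
}
print(maximal_compatibles(table, ["0", "1"]))
```

The sketch relies on the standard property that a set of states is compatible exactly when all of its pairs are compatible. The number of maximal compatibles can grow quickly with the number of states, which is the drawback the heuristics above try to contain.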
Although these recent references appear as very interesting, the field is worth exploring further.
In iterative improvement strategies an existing solution is perturbed in the direction of a lower cost one. These strategies have only been applied to the final steps of some methods and thus they may be considered as refinements [13, 16] of solutions obtained by constructive approaches. But in our opinion, iterative improvement strategies have not been explored in depth. There seems to be a parallelism between the history of logic minimisation and that of state reduction. Approaches to logic minimisation exist which simultaneously identify and select implicants for a cover. This is the case of the well known logic minimiser Espresso [19]. No similar approach exists for state reduction. So an interesting objective is developing state-reduction algorithms which simultaneously identify and select compatibles for the closed cover, and comparing them with existing approaches.
IEE PROCEEDINGS-E, Vol. 139, No. 6, NOVEMBER 1992
We have developed a state-reduction algorithm, Reduces [20], using a constructive heuristic based on maximal compatibles. The state reduction is achieved by processing the state table T, from which the set of maximal compatibles CCSS is obtained. The global process, REDUCES(T), is described using Pidgin-C in Fig. 3. Then, a table Q with the covering and closure constraints is derived by Getconstraints() in Fig. 3. Covering and closure constraints member of a closed cover set and ci = 0 if Mi is not selected. These covering and closure constraints are shown in 
Covering and closure constraints for FSM in Table 1
Fig. 5
It is clear that M , and/or M , will be members of any closed cover set, including that of minimum cardinality. Then, Cand = { M , , M , } , and C is computed for both MCs: 8 and 9 . N C ( M , ) = 3 because the consequence of selecting M , is the violation of inequalities 10, 11, 12 which express the closure requirements of MI. That is, if M , is selected as a member of a closed cover, MCs containing compatibles (s,, s 2 , sJ, (s2, s5) and (s,, s 5 , s,) will have to be included too.
M , is selected (CC = { M , } ) because it maximises the parameter C . Now there are some closure constraints violated (expr. 10, 11, 12) and the algorithm tries to satisfy them firstly, so procedure Selec-CC() is called for these constraints. Among this set, expr. 12 is the most restrictive one and Cand = { M , } . In this case, as there is only one candidate MC, we do not need to compute C . M , becomes a member of CC. Now, the set of violated closure constraints is {(ll), (31)) and 
Motivations
The strategy we propose herein is similar to that in Espresso [19] for the heuristic minimisation of combinational functions. That is, a set of basic operations is defined which transforms a symbolic description of the FSM into another one with a smaller number of states. 
The new algorithm
The primary objective of Arnes is to reduce the number of states in the symbolic description of a FSM being used as an input to other phases of the design process of sequential circuits. Moreover, once it obtains a solution with a reduced number of states, Arnes operates on it to maximise the number of 'don't cares' in excitation functions. This aims at simplifying the task of state assignment programs and logic minimisers [13, 16], so that better solutions are obtained.
In Fig. 6 the algorithm control block is described using Pidgin-C, where T is a symbolic description of a FSM using state tables, C is a closed cover for table T, that is, a closed set of compatibles which covers all internal states in T, and |C| is the number of compatibles in C (number of internal states in the description of the FSM). Procedure init(T) initialises C to the set of internal states. There are two main functions in Arnes, expand and reduction, which we describe in detail. Once C has been transformed by the application of the previous basic functions, procedure table(C, T) builds a symbolic description T for the FSM, defining an internal state for each compatible in C. If the cost of this new description is smaller than that of the initial one, the whole process is repeated for T.
Now, we explain the main procedures of Arnes. Procedure expand adds states to each compatible Ci in C, includes those compatibles needed to fulfill closure requirements of the expanded compatible, and eliminates those in C that are now covered. Given Ci to be expanded, an optimum expanded compatible Ci^opt is defined such that
where DIMP(CP") is the closure set 4"' of CYp eliminating those compatibles which are members of C, W,{Cppt) is the set of compatibles which can be eliminated from C, when substituting C i by Cpp' because they are included in C:p', and W,(CPpr) is the set of compatibles which can be eliminated from C, when substituting Ci by Cppi because they are included in one compatible in DIMP(CYp'). If all compatibles C 3 C i could be enumerated, we would select one satisfying eqn. 1. Instead, since this operation is computationally too expensive, the basic operation expandl obtains C: that is a good approximation to eqn. 1. In Fig. 7 and 8 procedures expand and expandl are described using Pidgin-C.
Procedure expandl begins by computing the set of states that are compatible with all states in Ci and storing them in set comp. Then, for each potential expanded compatible Ci+j = Ci ∪ sj, where sj is a state in set comp, the function in eqn. 1 is evaluated. One of the expanded compatibles which maximises this function is selected to be Ci and procedure expandl is repeated for this new Ci. The expansion of one compatible finishes when there are no states compatible with the current expanded compatible or the function is negative for all potential expanded compatibles. After expanding a compatible, C is still a closed cover for T. Moreover, expanding is only allowed if it does not mean an increment in the cardinality of C.
expandl(Ci)
/* given Ci, compatible in C, returns Ci' which is a good approximation to (1) */
Procedure reduction transforms C into a new C* where each Ci is sequentially substituted by a reduced compatible Ci' ⊆ Ci, such that {C - Ci} ∪ Ci' is a closed cover. For reducing each Ci, first the set H of states in Ci covered by at least another compatible in C is derived. All states in H could be removed from Ci and C would still be a cover of T, but it might not be a closed set of compatibles. Procedure MCEE selects the largest subset of H that can be eliminated from Ci such that C is still a closed cover of T.
Procedure reduction makes it possible to move from one solution to another of lower cost, as the application of the whole procedure to state table T might lead to less costly solutions. Moreover, reduction eliminates states that are covered by more than one compatible. Minimising the number of such states leads to the maximisation of 'don't cares' in excitation functions [13]. In Fig. 9 we describe the procedure. Fig. 10 contains a description of the basic function in reduction (procedure MCEE). MCEE complexity increases exponentially with the number of states in the set H. From our experience we conclude this is not a problem, even for large machines.
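The reduction of a single compatible can be illustrated with a brute-force sketch. It is not the MCEE procedure of Fig. 10, which is not reproduced here; it only enumerates subsets of H from largest to smallest, which makes the exponential dependence on the size of H explicit. The helper names, the next-state table layout and the example machine are assumptions made for illustration.

```python
from itertools import combinations


def is_closed_cover(cover, table, inputs):
    """cover: list of frozensets; table[s][i] = next state, or None."""
    if set().union(*cover) != set(table):
        return False  # some state is not covered
    for comp in cover:
        for i in inputs:
            implied = {table[s][i] for s in comp if table[s][i] is not None}
            # closure: every implied set must lie inside some compatible
            if implied and not any(implied <= c for c in cover):
                return False
    return True


def reduce_compatible(cover, k, table, inputs):
    """Remove from cover[k] a largest subset of H keeping a closed cover."""
    target = cover[k]
    others = cover[:k] + cover[k + 1:]
    # H: states of the target that another compatible also covers
    h = sorted(target & set().union(*others))
    for size in range(len(h), 0, -1):
        for removed in combinations(h, size):
            reduced = target - set(removed)
            candidate = others + ([reduced] if reduced else [])
            if is_closed_cover(candidate, table, inputs):
                return reduced
    return target  # nothing can be removed


# Hypothetical next-state table and closed cover, for illustration only.
next_state = {
    "a": {"0": "b", "1": None},
    "b": {"0": "a", "1": "c"},
    "c": {"0": None, "1": "d"},
    "d": {"0": "a", "1": "d"},
}
cover = [frozenset("abc"), frozenset("cd")]
print(reduce_compatible(cover, 0, next_state, ["0", "1"]))
print(reduce_compatible(cover, 1, next_state, ["0", "1"]))
```

In this example state c can be dropped from the first compatible, because the second one still covers it and satisfies the closure requirement of the reduced set, whereas dropping c from the second compatible would break closure.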
We again use the machine described in Table 1 to show how the algorithm works. As seen from the foregoing description, there are three main blocks in the algorithm: initialisation, expansion and reduction. This means, expanding s1 with s2 leads to a closed cover with the same number of compatibles. 
Example

@ = S
There are not compatible states to C, = {(s,szsJs8)}, and so the expansion of this compatible is not continued. Now the algorithm tries to expand compatible C, in C.
The expansion of C, is not continued because z is negative for all states in comp. The number of states in closed cover C would increase if we used s, or s9 to expand C, = {(s2s5s6)}.
The algorithm goes on with C3:
I(s1s6s2) < 0, I(s1s6s3) < 0, I(s1s6s5) < 0
The function is negative for each potential expanded compatible and C3 is not expanded.
Reduction: Procedure reduction sequentially removes from the compatibles in C those states which are necessary to fulfill neither covering constraints nor closure ones. In this example no state is eliminated.
As @ = 4 is less than @* = 9 the whole procedure is repeated for the new state table derived from the final closed cover. No improvement is achieved in this case.
Experimental results
Experiments have been carried out on a large set of FSMs to test the algorithms previously described. Both of them have been coded in C. The program implementing the constructive one is called Reduces and the iterative improvement one, Arnes. Results both for machines from the literature and for machines in the MCNC benchmark set [24] are shown.
Figures of merit
We focus on three figures of merit: the final number of states in the FSM representation, the area occupied by the combinational component in a PLA-based implementation of the FSM, and the time which is required for design. We introduce them in more detail:
(a) the number of states in the symbolic description after state reduction. This is an important parameter to evaluate how well the state reduction algorithms work. It is the main concern of all the algorithms described in classic papers on the subject. If state assignment programs which use minimum length codes are employed, the number of states determines the number of memory elements.
(b) the area of the combinational component. We use a PLA to implement the combinational component of the FSM. The area in this case is
$\mathrm{area} = Kp\,(2n_i + 3n_s + n_o)$
where $n_i$ is the number of input variables, $n_s$ is the number of state variables, and $n_o$ is the number of output variables.
(c) design time. This parameter reports the time needed for passing from the initial symbolic description to the minimised Boolean representation of next state and output functions. It is composed of three partial times: 
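The first two figures of merit can be evaluated with a few lines of code. This is a sketch under stated assumptions: the prefactor of the area expression is read here as a constant k times the number of product terms p, which is our interpretation, and the numeric values are hypothetical. The column count itself follows from the PLA structure: each primary input and present-state variable takes two columns in the AND plane, and each next-state and output variable takes one column in the OR plane.

```python
def state_bits(num_states):
    """Memory elements needed by a minimum length state code."""
    return max(1, (num_states - 1).bit_length())  # ceil(log2(num_states))


def pla_area(p, n_i, n_s, n_o, k=1):
    """PLA size: k * p * (2*n_i + 3*n_s + n_o).

    Assumption: p is the number of product terms (PLA rows) and k is a
    technology-dependent constant; neither reading is confirmed here.
    """
    return k * p * (2 * n_i + 3 * n_s + n_o)


# Hypothetical figures: 9 states reduced to 4, with fewer product terms.
before = pla_area(p=10, n_i=2, n_s=state_bits(9), n_o=1)
after = pla_area(p=6, n_i=2, n_s=state_bits(4), n_o=1)
print(state_bits(9), state_bits(4), before, after)
```

The example shows why a smaller number of states can pay twice: fewer state variables shrink the column count, and fewer product terms shrink the row count.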
Results on machines from the literature
We have tried many examples of machines from switching circuit textbooks and journal papers about minimisation [26] . In many of them the gains were significant. Table 2 shows the number of states (ns), the number of product terms ( t p ) and the size of the PLAs implementing the combinational part of the machines after state assignment and logic minimisation for four different initial descriptions: state tables as they appear in the literature, state tables reduced by the application of an implementation of a classic state reduction algorithm (Bennets algorithm [18] ), state tables reduced by the application of Reduces, and state tables obtained by Arnes. This parameter size has been obtained with a K , = 1 in eqn. 3. Reported times in the last three include the time invested in the state reduction step. In all cases, the optimal state assignment algorithm we used was the 'i-hybrid complement' algorithm in Nova [25] .
From Table 2, Arnes obtains minimum cardinality solutions in 14 of 15 cases, including the well known (and difficult) 22-state machine (FSM6) in Reference 17. Also, from this table, it should be clear that the size of the final circuit when Arnes is applied is lower than the size when starting with the original FSMs, except for FSM10 (equal size). In ten machines Arnes gave smaller-area realisations than the solutions supplied by the well-known Bennets algorithm. Only for FSM1 and FSM10 did the implementation of Bennets algorithm achieve better size results than Arnes. Summarising the comparison between Reduces and Arnes, in nine machines Arnes gave area savings while Reduces obtained better results for three machines.
For each of the machines in Table 2 two ratios have been calculated:
$$A = \frac{\mathrm{size}_A}{\mathrm{size}_o} \quad \text{and} \quad T = \frac{\mathrm{time}_A}{\mathrm{time}_o}$$
This is the area occupied by the PLA, obtained using Arnes, divided by the area resulting when the original description is used for state assignment. In the same way, the design time with state reduction using Arnes is divided by the design time without state reduction step.
We represent the pair (A, T) for each machine in a design space whose Y-axis represents the time ratio and whose X-axis represents the area ratio. There are two significant lines in this space, defined as A = 1 and T = 1, respectively. For any machine FSMi with coordinates (Ai, Ti) below the line defined by T = 1, the design time decreases when state reduction is included in the synthesis process. For any machine FSMi with coordinates (Ai, Ti) on the left of the line defined by A = 1, the state-reduction step is advantageous in terms of silicon area.
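The reading of the design space can be summarised in a short sketch. The function names and the verdict strings are hypothetical; only the two ratios and the role of the lines A = 1 and T = 1 come from the text.

```python
def design_ratios(size_reduced, size_original, time_reduced, time_original):
    """Area ratio A and time ratio T of a design with state reduction."""
    return size_reduced / size_original, time_reduced / time_original


def classify(a, t):
    """Locate a point (A, T) relative to the lines A = 1 and T = 1."""
    if a < 1:
        area = "area saved"
    elif a == 1:
        area = "area unchanged"
    else:
        area = "area increased"
    if t < 1:
        time = "time saved"
    elif t == 1:
        time = "time unchanged"
    else:
        time = "time increased"
    return area, time


# A point left of A = 1 and below T = 1 gains on both coordinates.
print(classify(0.45, 0.22))
# A machine that cannot be reduced keeps its area but pays the reducer time.
print(classify(1.0, 1.3))
```

Points with A < 1 and T < 1 fall in the region where including the state-reduction step is advantageous on both counts.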
Advantages in both area and time correspond to machines inside the region limited by those lines. Its borders also represent favourable situations, that is, cases where state reduction achieves a decrease in one of the design coordinates without changing the other one. In Fig. 11, pairs (A, T) for the machines in Table 2 have been drawn. The point labelled [FSM] (0.45, 0.22) represents the average for the machines in Table 2.
than that resulting from applying Nova to the original machine (timeo column) for all of them. Concerning the state assignment and minimisation steps, there are three machines (bbsse, sse, mark1) for which the Nova time when using Reduces (t2 + t3 in Reduces column) is higher than the Nova time when the original description is used. This is due to the symbolic descriptions generated by the state minimisers, which are then used as input to the state assignment phase. In some cases, they are descriptions with a large number of symbolic implicants and the logic minimisation step slows down. However, this is a feature of the algorithm's implementation which can be corrected. In Fig. 12, pairs (A, T) for the machines in Table 3 have been drawn.
Nevertheless, the discussion of results would be incomplete if we did not consider what happens with machines for which state reduction does not achieve any reduction. When provided with a description of a machine to be implemented, we do not know beforehand whether there are compatible states or not. This means the state reducer will be invoked and a certain amount of time invested in checking whether there are compatible states. If so, a description with fewer states will be sought, but the state reducer can fail to find one. Thus, the result of this processing is the same initial description, and the coordinates for this machine are (A = 1, T > 1). It is difficult to estimate how much this increase in computer time may affect the design. 
To give a rough idea, we have evaluated coordinates Figure) .
For the set of machines from the literature, the superiority of Arnes over Reduces in area is evident. They are similar in time. The superior performance of Arnes is not so clear for the machines in the MCNC benchmark. The key point in understanding the differences is whether the selection of maximal compatibles as variables to formulate the integer LP problem (Reduces) limits the quality of the solutions it can obtain. This is the case for the literature machines but, evidently, it is not for the MCNC benchmark.
Since two other approaches have been reported recently [15, 16], it is convenient to compare our results with them. From the published results, we have elaborated Table 4 where the minimisation achieved on MCNC 
Conclusions
The experiments carried out suggest that fast state reduction heuristics should be included within FSM automatic synthesis systems to achieve both area and time savings.
Probably new improvements of all of these methods will come about in the future, but they are promising in terms of augmenting the present capabilities of design automation for FSM. In that sense, Arnes opens the door to a new way of solution compared with traditional approaches like Reduces or other, more recently reported, contributions. 
