INTRODUCTION
The synthesis of a finite state machine (FSM) plays a very important role in digital circuit design. The controllers of such circuits are often specified in the form of an FSM consisting of several states and transitions between them. The problem of FSM realization consists of a number of subproblems, the most important one being state assignment. In this phase, binary codes are assigned to the individual states. The resulting combinational logic depends heavily on the codes assigned to the states. Thus, the problem has received a great deal of attention from the research community. Villa et al. [1] have shown that, given a symbolic representation of an FSM M and positive integers k and l, the problem 'is there an encoding e that produces an encoded cover M_e that has a sum-of-product representation with k or fewer product terms and l or fewer literals?' is in Σ_2^p. With the advancement in multilevel logic minimization, Micheli et al. [2, 3] proposed symbolic representations of the combinational component of the FSM followed by a Boolean subcube mapping. Micheli [4] also proposed a symbolic minimization to take into account the effect of encoding on the next state part. In [5] the problem of encoding the states of a synchronous FSM has been addressed such that the area of a two-level implementation of the combinational logic is minimized. The resulting tool, NOVA, takes a more efficient and flexible approach to constraint satisfaction, representing it as a graph embedding problem that is solved using several heuristic strategies, producing superior results and offering quality/runtime trade-offs. Symbolic minimization [6], i.e. the extension of multivalued minimization to handle input and output encoding simultaneously, has also been presented there. In [7] the authors presented a genetic algorithm (GA)-based approach integrating state assignment with the selection of the flip-flops that store the state bits.
A judicious choice selects a combination of D and T flip-flops to realize the resulting circuitry. However, T flip-flops are rarely available in package logic (e.g. field-programmable gate arrays); they are formed from J-K flip-flops. Each J-K flip-flop requires two inputs, which could lead to more complex wiring. In VLSI technologies, where the size of the wiring area is typically a greater concern than gate area, D flip-flops are used almost universally [8]. It is also shown in Section 3 that in many cases the present approach produces better results.
Recent investigations have shown that GAs can find good state assignments [7, 9, 10, 11, 12, 13]. Xia and Almaini [10] have used GAs to optimize both area and power. In [14] I have used GAs to obtain power-optimized two- and multilevel FSM realizations. In [12, 13] I have shown the effectiveness of GAs in performing state assignment, flip-flop selection (between D and T) and flip-flop polarity selection in an integrated fashion to obtain better area-optimized circuitry. Owing to the wide acceptability of D flip-flops, this paper considers D flip-flops to store the state bits and investigates the use of GAs to solve the following three problems:
• assignment of codes to the states;
• polarity assignment to the flip-flops;
• polarity assignment for primary outputs.
In many cases we have seen that, if the complemented output of the D flip-flop is taken for a current state bit, the combinational area of the resulting circuit reduces. In addition, proper selection of the polarities for the primary outputs reduces the area overhead. Next, we present an example showing area reduction under flip-flop and output polarity selection. This is followed by the GA formulation in Section 2 and experimental results in Section 3.
Example 1. Consider the three-state FSM shown in Figure 1. The FSM has a single input (I) and two outputs. The three states are represented as S1, S2 and S3. Let the states be coded as 01, 10 and 11 respectively. Table 1 shows the truth table for the QQ and Q̄Q̄ implementations of the FSM. The QQ implementation uses positive polarity for both state bits (Figure 2a), and the Q̄Q̄ implementation uses negative polarity for both state bits (Figure 2b). The combinational functions for the state transition logic are as follows:
[Figure 1. FSM. Figure 2. (a) QQ realization; (b) Q̄Q̄ realization.]
In the QQ implementation, a two-level realization of the next-state logic requires four product terms, as shown in Equations (1) and (2). The Q̄Q̄ implementation, however, requires only three product terms, as shown in Equations (3) and (4). The output functions and their complements are as follows:
Hence, selecting negative polarity for O2 requires fewer product terms. O1 needs the same number of product terms in both the true and the complemented form.
However, it should be noted that the entire combinational logic (i.e. the state transition function and the output function) is implemented by a single programmable logic array (PLA). Thus, area reduction can be achieved if there exists a good amount of sharing of product terms between the functions. An integrated approach to solve the three subproblems together is definitely more capable of identifying such situations than a step-by-step approach that solves the three problems in sequence.
GA PROBLEM FORMULATION
GAs [15] are stochastic optimization search algorithms based on the mechanics of natural selection and natural genetics. A GA starts with an initial population consisting of a set of randomly generated solutions, called chromosomes. Based on a reproductive plan, chromosomes undergo evolution for a number of generations. The reproductive plan consists of applying genetic operators, the most common being crossover and mutation. The crossover operator creates new chromosomes by crossing two parent chromosomes participating in the operation. This helps in exploring the search space. In order to ensure that the search space explored is not closed under crossover, another genetic operator called mutation is applied on the working population to perturb one or more solutions. This operator brings about variety within a finite population. For each generation, the chromosomes are evaluated using some fitness criteria. Based on the selection policy and the fitness values, the set of chromosomes for the next generation is selected. This evolution process is continued for a number of generations. Finally, based on some terminating criteria (e.g. the maximum number of generations the GA is run or the maximum number of successive generations without cost improvement), the algorithm terminates. The best solution at that generation is accepted as the solution produced by the GA.
The genetic formulation of any problem involves the careful and efficient choice of the following:
• a proper encoding of the solutions to form chromosomes;
• a suitable crossover operator;
• a proper mutation operator;
• a cost function measuring the fitness of the chromosomes in a population.
Solution representation
A solution to the problem of state encoding along with flip-flop and output polarity selection consists of three parts: the first corresponding to the codes assigned to the different FSM states, the second identifying the polarity of the flip-flops to be used (i.e. Q or Q̄) for each of the state bits and the last identifying the polarities used for the output functions. Naturally, we have selected a structure having three components: the state-code component, the flip-flop polarity component and the output polarity component.
State-code component. This is an array of size equal to the number of states in the FSM. Each entry in this array is an integer between 1 and the number of FSM states. If the ith entry in the state-code array is j, then state i is assigned the jth unassigned code from the set of codes. For example, let us consider the state-code array ⟨1, 2, 2, 4⟩. The possible set of codes is {0, 1, 2, 3}. Since the first entry in the state-code array is 1, state 1 is assigned the code 0 (the first unassigned code). The second entry in the state-code array is 2. Now, the second unassigned code from the set of possible codes is 2. Thus state 2 is assigned the code 2. The third entry in the state-code array is again 2. At present, the second unassigned code is 3 (since codes 0 and 2 are already allotted). Thus, the code for state 3 is 3. Finally, for state 4, though the state-code array entry is 4, only the code 1 is available. Hence, state 4 of the FSM is assigned the code 1. Thus, the state-code array ⟨1, 2, 2, 4⟩ assigns codes 0, 2, 3 and 1 to state 1, state 2, state 3 and state 4 respectively.
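The decoding rule can be sketched in a few lines of Python. This is only an illustration: the function name is ours, the pool of codes is assumed to be 0 to m − 1 for an m-state FSM (as in the example), and an entry larger than the number of remaining codes is assumed to select the last remaining code, which is what the treatment of state 4 suggests.

```python
def decode_state_codes(state_code):
    """Map a state-code array (1-based entries) to the code given to each state."""
    # Assumption: the available codes are 0 .. m-1 for an m-state FSM.
    unassigned = list(range(len(state_code)))
    codes = []
    for j in state_code:
        # Take the j-th unassigned code; if fewer than j codes remain,
        # fall back to the last one (inferred from the worked example).
        index = min(j, len(unassigned)) - 1
        codes.append(unassigned.pop(index))
    return codes


print(decode_state_codes([1, 2, 2, 4]))  # [0, 2, 3, 1]
```

Under these assumptions every array with entries between 1 and m decodes to a valid assignment with distinct codes, so no repair step is needed after the genetic operators.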
Flip-flop polarity component. This is an array of size equal to the number of state bits used to code the states. Each entry in this array is either 0 or 1. If the ith entry of this array is 1, the polarity of the flip-flop holding the ith state bit is positive (i.e. the Q output of the flip-flop feeds the combinational logic of the circuit). On the other hand, a 0 in this array signifies that the corresponding flip-flop's polarity is negative (i.e. the Q̄ output is fed).
Output polarity component. This is also represented as an array of bits of size equal to the number of primary outputs. If the j th entry in this array is equal to 1, the polarity of the function is taken as positive, else the polarity is taken as negative.
Thus for a six-state FSM with 3-bit encoding and two primary outputs, the chromosome ⟨4, 4, 4, 3, 6, 2⟩ ⟨1, 0, 1⟩ ⟨0, 1⟩ signifies the codes of states 1-6 to be 3, 4, 5, 2, 1 and 0 respectively. The flip-flop polarities corresponding to the least and the most significant state bits are Q, while that for the middle bit is Q̄. For the primary outputs, the first is implemented in the complemented form, while the second is implemented in the uncomplemented form.
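One possible in-memory form of this three-component chromosome is sketched below. The class and method names are hypothetical and chosen only for illustration; the text does not prescribe a particular data structure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chromosome:
    state_code: List[int]    # one entry per state, each between 1 and the number of states
    ff_polarity: List[int]   # one bit per state bit: 1 = Q feeds the logic, 0 = complemented Q
    out_polarity: List[int]  # one bit per primary output: 1 = true form, 0 = complemented form

    def is_valid(self, n_states, n_state_bits, n_outputs):
        """Check component sizes and value ranges against the FSM dimensions."""
        return (
            len(self.state_code) == n_states
            and len(self.ff_polarity) == n_state_bits
            and len(self.out_polarity) == n_outputs
            and all(1 <= j <= n_states for j in self.state_code)
            and all(bit in (0, 1) for bit in self.ff_polarity + self.out_polarity)
        )

    def ff_outputs(self):
        """Name the flip-flop output used for each state bit (Q' denotes the complement)."""
        return ["Q" if bit == 1 else "Q'" for bit in self.ff_polarity]


example = Chromosome([4, 4, 4, 3, 6, 2], [1, 0, 1], [0, 1])
print(example.is_valid(6, 3, 2), example.ff_outputs())  # True ['Q', "Q'", 'Q']
```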
Initial population generation
The initial population is generated randomly as shown in Algorithm 1. 
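A straightforward way to generate such a population is sketched below. It is not a transcription of Algorithm 1; the function names and the tuple layout (state-code, flip-flop polarity, output polarity) are assumptions made for illustration.

```python
import random


def random_chromosome(n_states, n_state_bits, n_outputs, rng):
    """Draw one chromosome uniformly at random, component by component."""
    state_code = [rng.randint(1, n_states) for _ in range(n_states)]
    ff_polarity = [rng.randint(0, 1) for _ in range(n_state_bits)]
    out_polarity = [rng.randint(0, 1) for _ in range(n_outputs)]
    return (state_code, ff_polarity, out_polarity)


def initial_population(size, n_states, n_state_bits, n_outputs, seed=None):
    """Build a population of `size` independent random chromosomes."""
    rng = random.Random(seed)  # a seed makes experiments repeatable
    return [
        random_chromosome(n_states, n_state_bits, n_outputs, rng)
        for _ in range(size)
    ]


population = initial_population(size=20, n_states=6, n_state_bits=3, n_outputs=2, seed=1)
```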
Crossover operator
Our GA formulation has been biased towards selecting chromosomes with better fitness to participate in the crossover. For this purpose, the whole population is sorted based on their fitness values. A certain percentage of population with a better fitness value is defined to be the 'best class'. To select a chromosome participating in crossover, first, a uniform random number between 0 and 1 is generated. If the number is >0.5, a chromosome from the 'best class' is selected randomly. Otherwise, a chromosome from the entire population is selected. Let the population size be n and the cardinality of the 'best class' be m. Then, the probability of a chromosome getting selected for crossover is 0.5/m + 0.5/n, for chromosomes belonging to the 'best class', whereas for a chromosome not belonging to the 'best class', the corresponding probability is only 0.5/n. Since m is much less than n, the probability of a chromosome belonging to the 'best class' being selected is higher than that of a chromosome not belonging to it. This approach of selecting more fit chromosomes to participate in crossover leads to the generation of better offspring as compared with the truly random one. This is readily reflected in the high rate of convergence of the algorithm on the benchmark results reported in Table 2 .
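The biased selection and the probabilities quoted above can be checked with a short sketch. The function names are ours, and the population is assumed to be sorted with the fittest chromosome first.

```python
import random


def select_parent(sorted_population, best_class_size, rng=random):
    """Pick one parent; `sorted_population` is ordered best-first."""
    if rng.random() > 0.5:
        # Draw uniformly from the 'best class' (the first m chromosomes).
        return rng.choice(sorted_population[:best_class_size])
    # Otherwise draw uniformly from the entire population.
    return rng.choice(sorted_population)


def selection_probability(in_best_class, n, m):
    """Probability that one particular chromosome is selected in a single draw."""
    return 0.5 / n + (0.5 / m if in_best_class else 0.0)
```

For n = 100 and m = 10, for instance, a best-class chromosome is selected with probability 0.055 against 0.005 for any other chromosome, and the probabilities over the whole population sum to 1.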
After selecting two candidate chromosomes (g1 and g2) to participate in crossover, we randomly select three points: the first (p1) from the state-code component, the second (p2) from the flip-flop polarity component and the third (p3) from the output polarity component. The two new chromosomes G1 and G2 are created through the following procedure:
for i = p2 + 1 to number-of-state-bits do
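Since only a fragment of the procedure survives here, the following is one plausible reading rather than the original listing: within each component, G1 keeps the entries of g1 up to the cut point and takes the remaining entries from g2, and G2 is formed the other way round. The function name and the tuple layout are assumptions.

```python
def crossover(g1, g2, p1, p2, p3):
    """Three-point crossover with one cut point per component.

    Each chromosome is a tuple (state_code, ff_polarity, out_polarity) of lists.
    A cut point p keeps the first p entries of one parent and takes the
    remaining entries from the other parent.
    """
    cuts = (p1, p2, p3)
    child1 = tuple(a[:p] + b[p:] for a, b, p in zip(g1, g2, cuts))
    child2 = tuple(b[:p] + a[p:] for a, b, p in zip(g1, g2, cuts))
    return child1, child2


g1 = ([1, 2, 2, 4], [1, 0], [0, 1])
g2 = ([4, 3, 1, 1], [0, 1], [1, 0])
G1, G2 = crossover(g1, g2, p1=2, p2=1, p3=1)
print(G1)  # ([1, 2, 1, 1], [1, 1], [0, 0])
print(G2)  # ([4, 3, 2, 4], [0, 0], [1, 1])
```

Because any state-code array with entries between 1 and the number of states is a legal chromosome, offspring produced this way need no repair.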
Mutation operator
Mutation is a very important operator for bringing variety into the population. As the population size is finite, the crossover operator alone cannot bring about enough variation in the population, so the solution quality and rate of convergence suffer. The mutation operator brings about more effective variations in the chromosomes, introducing newer search options. In our case, the mutation operator first selects a chromosome from the population randomly. It then modifies one or more of the three components (i.e. the state-code component, the flip-flop polarity component and the output polarity component), depending on the value of a random number generated. To modify the state-code component, a position (p1) on it is selected randomly. Let the value at position p1 be b, and the number of states be m. The entry at position p1 is then changed to (m − b + 1). However, selection of the position p1 is not fully random; rather, it is biased towards the positions in the first half of the component. This is because in the later part of the state-code component there exists very little choice for codes. For example, consider the state-code component ⟨1, 2, 2, 4⟩ corresponding to a four-state FSM. The codes assigned to the states are 0 to state 1, 2 to state 2, 3 to state 3 and 1 to state 4. Now, if a mutation modifies the third entry of the state-code component to (4 − 2 + 1) = 3, it does not change the codes assigned to the states. Similarly, mutating the fourth entry to 1 also does not modify the assignment, because in these cases the code selection procedure presented earlier is forced to take the last unassigned code, making the mutation ineffective. In contrast, a modification in the earlier part is expected to affect the codes of several later states as well. This leads to better variation in the population.
To introduce a mutation in the flip-flop polarity component, the value of a randomly selected position on it is changed from 1 to 0 (or 0 to 1). The output polarity component is also mutated similarly.
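The mutation steps can be sketched as follows. The strength of the bias towards the first half is not stated in the text, so the value 0.75 used for `first_half_bias` is purely hypothetical, as are the function names.

```python
import random


def pick_state_code_position(m, rng=random, first_half_bias=0.75):
    """Pick a 0-based position, favouring the first half of the component.

    The bias value 0.75 is an assumed placeholder, not a reported parameter.
    """
    half = max(1, m // 2)
    if half == m or rng.random() < first_half_bias:
        return rng.randrange(half)
    return rng.randrange(half, m)


def mutate_state_code(state_code, position):
    """Replace the entry b at `position` by m - b + 1, where m is the number of states."""
    m = len(state_code)
    mutated = list(state_code)
    mutated[position] = m - mutated[position] + 1
    return mutated


def flip_bit(bits, position):
    """Toggle one entry of a polarity component (1 -> 0 or 0 -> 1)."""
    flipped = list(bits)
    flipped[position] = 1 - flipped[position]
    return flipped


print(mutate_state_code([1, 2, 2, 4], 2))  # [1, 2, 3, 4]
print(flip_bit([1, 0, 1], 1))              # [1, 1, 1]
```

Note that when m is odd and b = (m + 1)/2, the entry m − b + 1 equals b, so that particular mutation leaves the chromosome unchanged.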
Measure of fitness
Next, we introduce the cost function which measures the fitness of a chromosome. A particular state assignment leads to a certain combinational logic. To compare the suitability of a particular state assignment to an FSM, we estimate the number of product terms in the combinational logic block after performing a two-level minimization. In doing so, we have considered the primary input lines and the present state bits as inputs to the combinational logic, whereas the primary output lines and the next state bits are taken as the outputs. We then calculate the number of product terms by running ESPRESSO [16] over the PLA representation of the circuit. This number of product terms has been used to represent the fitness of a chromosome. A chromosome with a smaller number of product terms is considered to be a better one.
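Before minimization, each transition of the FSM has to be turned into a row of the two-level description under the chosen polarities. The sketch below shows one way to do this; it is our reading, not the tool's code. It assumes that a flip-flop with negative polarity presents the complemented present-state bit to the logic while the next-state bit is unchanged, that index 0 of the polarity arrays is the most significant state bit, and that complementing an output column row by row is legitimate because the rows are taken to be disjoint and to cover every specified case. The ESPRESSO run that counts product terms is not shown.

```python
def logic_row(primary_inputs, present_code, next_code, outputs,
              n_state_bits, ff_polarity, out_polarity):
    """Return (input_bits, output_bits) of one row as seen by the logic block."""

    def to_bits(code):
        # Fixed-width binary string, most significant bit first.
        return format(code, "0{}b".format(n_state_bits))

    def flip(bit):
        # Complement 0/1; a don't-care '-' stays a don't-care.
        return {"0": "1", "1": "0"}.get(bit, bit)

    present = "".join(
        bit if pol == 1 else flip(bit)
        for bit, pol in zip(to_bits(present_code), ff_polarity)
    )
    outs = "".join(
        bit if pol == 1 else flip(bit)
        for bit, pol in zip(outputs, out_polarity)
    )
    # Inputs: primary inputs then present-state bits.
    # Outputs: next-state bits then primary outputs.
    return primary_inputs + present, to_bits(next_code) + outs


print(logic_row("1", 1, 3, "01", 2, [0, 0], [1, 0]))  # ('110', '1100')
```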
Terminating criteria
As the GA evolves through generations, improving the quality of the solutions produced, we need some criteria to terminate the process. We have used the following two criteria for termination.
(i) There is no reduction in the FSM area for the best chromosome in the population for the last MAXGEN-WITHOUT-IMPROVEMENT number of generations.
(ii) The GA has already run for MAXGEN number of generations.
The values of these two parameters were set at 50 and 1000 respectively in our experiments.

[Table: number of product terms per benchmark circuit; '-' marks an unavailable result. The column headings and some rows are not recoverable.]
…          23   25   25   22    -   22   24    -   23
bbtas       9   12   11    -    9    9    9    8    9
bbsse      26   30   29   27   29    -   30   29   33
beecount   12   12   10    -    -    -   16    -   16
cse        45   45   45   43   45    -    -   45   54
dk14       26   27   26    -    -    -   36    -   33
dk15       16   18   18    -   18   18    -   18   20
dk16       68   72   71    -   70    -   63   70   73
dk17       15   19   17    -   17   17   17   16    -
…
mark1      11   20   20    -   12    -    -   12    -
modulo12   10   12   11   12   10    7   11    6   11
opus       12   16   16   15   15    -    -   15    -
planet     84   89   86   86    -   94   91    -    -
planet1    84   89   86    -    -    -    -    -    -
s1         68   87   87   66   74    -    -   74    -
s1488      94  141  102    -    -    -    -    -    -
s1494      95  149  106    -    -    -    -    -    -
s1a        66   80   80    -    -    -    -    -    -
s208       24   20   20    -    -    -    -    -    -
s27         7   15   15    -   11    -    -   11    -
s510       66   65   65    -    -    -    -    -    -
s820       64   85   62    -    -    -    -    -    -
s832       66   90   67    -    -    -    -    -    -
sand       96  102  101   89   96    -  108   86  100
shiftreg    4    8    8    -    4    5    -    4    -
sse        26   30   29    -   29    -   28   29   35
styr       78  103  101   88   99    -    -   94    -
tav        10   11   11    -   10    9    -    9    -
train11    12   12   12   10    -    -    -    -
train4      5   …

The salient features of the GA formulation are as follows.
• The chromosome structure matches exactly with the solution representation.
• The fitness measure chosen closely reflects the actual cost.
• The crossover operator was biased towards selecting highly fit chromosomes.
• A natural mutation operator was selected to bring about variety within the finite population size. The operator was also biased towards selecting mutation points in the first half of the state-code component, thereby resulting in better variation.
• Finally, the basic GA was modified so that the solution does not degrade between the generations. For this purpose, the best 20% of the chromosomes at any generation is directly transferred to the next generation. This ensures that the best chromosomes are always maintained between the generations and do not inadvertently get degraded by crossover or mutation.
All these factors lead to a well-organized search that is directed towards near-optimum solutions within a small number of generations, as shown by the solutions obtained and the generations required for the benchmark FSMs noted in Section 3.
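Putting these features together, the overall loop might look like the sketch below. It is an outline under stated assumptions, not the program used for the experiments: `cost`, `crossover` and `mutate` are supplied by the caller, the 20%/70%/10% split follows the rates used in the experiments, and `best_class_frac` is a hypothetical value because the size of the 'best class' is not stated.

```python
import random


def run_ga(population, cost, crossover, mutate, rng,
           elite_frac=0.2, crossover_frac=0.7, best_class_frac=0.2,
           max_gen=1000, max_gen_without_improvement=50):
    """Elitist GA loop; a lower cost means a fitter chromosome."""
    n = len(population)
    n_elite = max(1, int(elite_frac * n))
    n_cross = int(crossover_frac * n)
    m = max(1, int(best_class_frac * n))  # assumed size of the 'best class'

    population = sorted(population, key=cost)
    best_cost, stale = cost(population[0]), 0

    def pick():
        # Biased parent selection: half of the draws come from the best class.
        pool = population[:m] if rng.random() > 0.5 else population
        return rng.choice(pool)

    for _ in range(max_gen):                      # criterion (ii)
        if stale >= max_gen_without_improvement:  # criterion (i)
            break
        next_pop = population[:n_elite]           # elite copied unchanged
        while len(next_pop) < n_elite + n_cross:
            next_pop.extend(crossover(pick(), pick()))
        next_pop = next_pop[:n_elite + n_cross]
        while len(next_pop) < n:                  # remainder via mutation
            next_pop.append(mutate(rng.choice(population)))
        population = sorted(next_pop, key=cost)
        current = cost(population[0])
        if current < best_cost:
            best_cost, stale = current, 0
        else:
            stale += 1
    return population[0]
```

Because the elite is copied unchanged, the best cost in the population can never increase from one generation to the next.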
EXPERIMENTAL RESULTS
In this section, we present the results of our experiments with 41 benchmark circuits from the LGSynth93 benchmark suite. The various parameters used for the GA are as follows:
(i) population size: as mentioned in Table 2 for different circuits
(ii) mutation rate: 10%
(iii) crossover rate: 70%
(iv) MAXGEN: 1000
(v) MAXGEN-WITHOUT-IMPROVEMENT: 50.
It may also be noted that the best 20% of the chromosomes of a generation are directly copied to the next generation. Of the remaining 80%, 70% is created via crossover and 10% via mutation. Table 2 presents the results of NOVA [5], the area-oriented heuristic FSM encoding strategy, and our approach (denoted as GA in the table). NOVA has been run with the greedy encoding option '-e ig'. The combinational logic resulting from both tools has been minimized using ESPRESSO [16] and the results are noted in the #Products columns for both tools. Both NOVA and GA have been run on a multiuser PARAM system with a 433 MHz Ultra-250 Sparc processor, 1 GB RAM and 512 KB cache. The Generation column of GA lists the number of generations needed to achieve the result. It may be noted that the GA actually terminates 50 generations after this, as MAXGEN-WITHOUT-IMPROVEMENT has been set to 50. The column CPU sec for NOVA lists the actual time required by NOVA for different benchmarks. For GA, the column CPU sec lists the time needed to complete the number of generations specified in the Generation column.
Table 3 compares our GA-based approach with other existing encoding strategies in terms of the number of product terms resulting from each technique. As noted earlier, NOVA is a heuristic encoding strategy targeting a two-level implementation of the logic. It requires on average 22.87% more area than our approach. NOVA with the '-r' option tries out all possible polarities for the state bits. The results presented in Table 3 show that this strategy requires 15.98% extra product terms, on average. In [10] both area and power minimization were performed using GAs. Two different cost functions were used. The column [10] lists product terms for the better of the two cost functions. As noted at the end of the table, [10] needs 5.56% more product terms than our technique. However, for many of the circuits, results are not available in [10].
In [12] only flip-flop polarity assignment was considered; primary outputs are always taken to be of positive polarity. As shown in the table, it needs on average 9.47% more product terms than our approach. Both [7] and [13] use D and T flip-flops. Flip-flop polarity was also used in [13]. As far as the product terms are concerned, [7] needs 8.15% extra product terms, while [13] needs 6.05%. It may be noted that there is also an associated increase in the area due to the usage of T flip-flops. Thus, in reality, the area overhead will be still higher. In [14] a GA was used to obtain the state encoding targeting power minimization. Since the area part is ignored there, the technique needs on average 15.56% extra product terms. In [11] state encoding was performed using a mix of D and T flip-flops targeting low power. As expected, it requires on average 20.05% extra product terms.
CONCLUSION
The synthesis of an FSM consists of three subproblems: state encoding, flip-flop selection and polarity selection for state bits and primary outputs. Although some of the existing works use a combination of D and T flip-flops, from the VLSI implementation point of view the D flip-flop is better suited. This paper has presented an integrated approach to perform state encoding and select polarities for the flip-flops and primary outputs. It has established that a proper choice of state codes and polarities can result in significant improvement in the area requirement of the FSM. The tool outperforms the existing FSM state-encoding strategies targeting two-level realization and is therefore more suitable for area-optimized realization of FSMs.
