Abstract-Evolutionary computation presents a paradigm shift in hardware design and synthesis. According to this paradigm, hardware design is pursued by deriving inspiration from biological organisms. The new paradigm is expected to radically change synthesis procedures in a way that can help discover novel designs and/or more efficient circuits. In this paper, a multiobjective optimization of logic circuits based on a modified Ant Colony Optimization (ACO) algorithm is presented. The performance of the proposed algorithm is evaluated using a set of randomly generated circuits. The results obtained using the proposed algorithm are compared to those obtained using existing ACO-based techniques. It is shown that the circuits designed using the proposed algorithm outperform those of the existing techniques.
Introduction
Conventional logic design techniques tend to depend on domain-specific knowledge, which is limited by both the training and the experience of the designer. Iterative heuristics, in contrast, require little domain knowledge and allow the search for solutions in a much larger, and often richer, design space beyond the realm of conventional design techniques.
The Ant Colony Optimization (ACO) algorithm is a new meta-heuristic that combines distributed computation, auto-catalysis (positive feedback), and a constructive greedy heuristic to find optimal solutions for combinatorial optimization problems [1]. Unlike Genetic Algorithms (GAs), which are blind, ACO involves cooperating agents (ants).
In ACO, the interaction between components of a designed system can be easily analyzed. Some daemon actions or other heuristics can also be easily incorporated to further improve the quality of solutions. In this context, it is possible to use some rules from the logic synthesis domain to guide the search process and obtain not only better quality results, but also faster ones.
In the early 1990s, Hugo de Garis suggested the establishment of a new field of research called Evolvable Hardware (EHW) [2]. The first work in evolutionary design of digital circuits, Designer Genetic Algorithms (DGA), was proposed by Louis [7]. Later, the work of Thompson [4], which produced a tone discriminator circuit without an input clock, showed the emergence of a new way of designing circuits.
Recently [5, 6], much attention has been given to the evolutionary design of arithmetic circuits, as they provide the essential building blocks needed for larger DSP applications. Such effort has resulted in the development of arithmetic circuits that range from a simple sequential adder to the more complex 3-bit multiplier. Although most of the existing techniques in evolutionary design were able to arrive at solutions that are difficult to obtain using conventional methods, many open problems have not yet been addressed. A number of these problems are described below.
Circuit representations: Most of the published work in evolutionary logic design uses a two-dimensional n × m matrix to represent a circuit. The circuit's output is most likely placed at cell(0, m - 1). However, it may happen that the best solution is found at cell(i, j), 0 < i < n, 0 < j < m. In that case, redundant gates between this cell and the output cell may degrade the quality of the solution. The problem becomes even more complicated when the circuit has more than one output. Figure 1 illustrates the problem that arises from circuit representation.
Functional fitness calculation:
The value of functional fitness depends on the number of correct matches between the output pattern of the obtained solution and the truth table of the intended circuit. The higher the number of hits achieved, the higher the value of the functional fitness. This argument is not always true in logic design. A solution that has low functional fitness can be inverted to have a high functional fitness (see Figure 2).
Objectives of the optimization:
Most of the existing techniques use gate count as their objective for optimization. With the increasing need for high performance and low
This paper is organized as follows: first, the problem and cost function formulation are presented. Then, the modified ACO for logic design is discussed. Finally, performance evaluation and comparison with existing techniques are given.
Problem and Cost Function Formulation
Evolutionary computation views the problem of logic design as a search task. The methodology explores a solution space larger than that of the desired function, but gradually pulls the specification of the circuit towards the target truth table. However, the design space of digital circuits is huge.
There are $2^n$ (that is, $\binom{2^n}{2^n - 1}$) possible solutions that satisfy $2^n - 1$ out of the $2^n$ truth table patterns of an n-input, single-output function. In addition, each of these solutions can be represented by many possible structures. These different structures represent different design objectives and/or constraints. Exploring the whole search space is impractical. Therefore, the size of the search space sampled by the algorithm must be limited.
In this paper, we use the structure proposed in [3]. Each cell of the n × m matrix contains the information of the gate type and its corresponding inputs. However, unlike the fixed interconnection rules used in [3], we allow the output of each cell in column j to be connected to any of the cells in column j + 1 (j > 0, j + 1 < m). Thus, it is possible
any of the cells in column k + 1.
It is known that each type of gate has different characteristics in different technologies. These characteristics include the area, base delay, and input (output) capacitance of the gate.
Although we can build any logic circuit using AND, OR and NOT gates, we need a rich (but limited) gate library to be able to obtain different structures of the circuits. The best circuit can then be chosen based on the multiobjective criteria applied in the algorithm. Therefore, ten types of gates are considered. Table 1 shows these gates.
2.1 Fitness Calculation
The fitness of a solution consists of two parts: functional fitness and objective fitness. These are explained below.
Functional Fitness:
The functional fitness deals with the functionality of the solution, i.e., how well the solution satisfies the truth table of the intended Boolean function.
Several functional fitness formulations are reported in the literature [13]. The most commonly used one is the ratio of the number of correct hits to the length of the truth table. If $FF$ denotes the functional fitness, then the formulation below is applied.
$$FF = \frac{\text{Number of hits}}{\text{Length of the truth table}} \tag{1}$$
The solution has to be 'inverted' if the value of $FF$ is less than 0.5. Therefore, the formulation of normalized $FF$ ($FF_n$) below is applied:
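The functional fitness calculation can be illustrated with a minimal Python sketch. It assumes that inverting a solution turns $FF$ into $1 - FF$, so that the normalized value is $FF_n = \max(FF, 1 - FF)$; the function names and the string encoding of the truth table are illustrative choices, not taken from the paper's implementation.

```python
def functional_fitness(outputs, truth_table):
    """FF: fraction of truth-table rows matched by the candidate's outputs."""
    if len(outputs) != len(truth_table) or not truth_table:
        raise ValueError("patterns must be non-empty and of equal length")
    hits = sum(1 for o, t in zip(outputs, truth_table) if o == t)
    return hits / len(truth_table)


def normalized_functional_fitness(outputs, truth_table):
    """Return (FF_n, inverted).

    Assumed form: FF_n = FF when FF >= 0.5, otherwise 1 - FF, which
    corresponds to placing an inverter at the output of the solution.
    """
    ff = functional_fitness(outputs, truth_table)
    if ff < 0.5:
        return 1.0 - ff, True
    return ff, False


target = "0110"     # XOR truth table as a string of zeros and ones
candidate = "1001"  # XNOR: matches no row of the target
print(functional_fitness(candidate, target))
print(normalized_functional_fitness(candidate, target))
```

In this example the candidate has the lowest possible $FF$, yet a single inversion makes it functionally correct, which is the case the normalization is meant to capture.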
Objective Fitness: The objective fitness ($OF$) is a measure of the quality of a solution in terms of optimization objectives such as area, delay, gate count, and power consumption. The cost functions used to estimate these values are formulated as follows.
If $G$ is the set of possible gate types and $g_i \in G$, the cost for gate count is formalized as follows:
The cost for area of VLSI circuits is stated as follows.
where $A(g_i)$ is the area of gate $g_i$.
The propagation delay of signals in a VLSI circuit consists of two elements: the switching delay of gates and the interconnect delay. If a path $\pi$ consists of $n$ gates $\{v_1, v_2, \ldots, v_n\}$, then the delay $T_\pi$ along $\pi$ is expressed by the following equation:
where $CD_i$ is the switching delay of the cell driving gate $v_i$, $LF_i$ is the load factor of the driving block, $R_i$ is the interconnect resistance of net $v_i$, and $C_i$ is the load capacitance of cell $i$ given by Equation 5. Since the value of $R_i$ is constant, it can be neglected. The overall circuit delay is determined by the delay along the longest path (the most critical path).
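As an illustration of the delay model, the following Python sketch computes the delay of each path and takes the maximum as the circuit delay. It assumes that each gate on a path contributes its switching delay plus a load-dependent term $LF_i \cdot C_i$, with $R_i$ dropped as a constant; this form, the dictionary keys, and the function names are assumptions made for the example.

```python
def path_delay(path):
    """Delay along one path.

    Assumed form: T = sum over gates of (CD_i + LF_i * C_i).
    Each gate is a dict with keys 'cd' (switching delay), 'lf' (load
    factor) and 'c' (load capacitance); these keys are hypothetical.
    """
    return sum(g["cd"] + g["lf"] * g["c"] for g in path)


def circuit_delay(paths):
    """Overall delay: the delay along the longest (most critical) path."""
    if not paths:
        raise ValueError("at least one path is required")
    return max(path_delay(p) for p in paths)


long_path = [
    {"cd": 1.0, "lf": 0.5, "c": 2.0},
    {"cd": 2.0, "lf": 0.5, "c": 4.0},
]
short_path = [{"cd": 1.0, "lf": 1.0, "c": 1.0}]
print(circuit_delay([long_path, short_path]))
```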
The total capacitance $C_i$ of gate $i$ consists of the interconnect capacitance at the output node of gate $i$ and the sum of the capacitances of the input nodes of the gates driven by gate $i$:
$$C_i = C_i^{r} + \sum_{j \in M_i} C_j^{g} \tag{5}$$
where $C_j^{g}$ is the capacitance of the input node of a gate $j$ driven by gate $i$, $C_i^{r}$ represents the interconnect capacitance at the output node of cell $i$, and $M_i$ is the set of gates driven by gate $i$.
The total power consumption can be approximated by the following equation [15].
where $P_t$ is the total power consumption, $V_{DD}$ is the supply voltage, $S_i$ is the switching probability at the output node of cell $i$, i.e., the average number of transitions per clock cycle at the output of gate $i$, $f$ is the clock frequency, and $\beta$ is a technology-dependent constant.
The cost of the overall power consumption in VLSI circuits can then be estimated as follows.
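The quantities defined above can be combined in a short Python sketch. It assumes the usual dynamic power approximation $P_t \approx \sum_i \frac{1}{2} C_i V_{DD}^2 f S_i \beta$ and, because $V_{DD}$, $f$ and $\beta$ are the same for every gate, a power cost proportional to $\sum_i S_i C_i$. Both forms, and all names used here, are assumptions made for illustration.

```python
def load_capacitance(interconnect_cap, driven_input_caps):
    """C_i: interconnect capacitance at the output node of gate i plus
    the input capacitances of all gates driven by gate i."""
    return interconnect_cap + sum(driven_input_caps)


def total_power(gates, vdd, freq, beta):
    """Assumed form: P_t ~ sum_i 0.5 * C_i * VDD^2 * f * S_i * beta.
    Each gate is a dict with hypothetical keys 'c' (C_i) and 's' (S_i)."""
    return sum(0.5 * g["c"] * vdd ** 2 * freq * g["s"] * beta for g in gates)


def power_cost(gates):
    """Assumed cost: sum_i S_i * C_i, i.e. P_t with the constants dropped."""
    return sum(g["s"] * g["c"] for g in gates)


gates = [
    {"c": load_capacitance(1.0, [2.0, 3.0]), "s": 0.5},
    {"c": load_capacitance(1.0, [1.0]), "s": 0.25},
]
print(power_cost(gates))
print(total_power(gates, vdd=2.0, freq=1.0, beta=1.0))
```

Under these assumptions, ranking circuits by `power_cost` gives the same order as ranking them by `total_power`, which is why the cheaper expression is sufficient inside the search loop.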
In order to indicate whether a given solution satisfies a certain constraint, the objective fitness is formulated as follows. Note that the constraint value states the upper bound for a specific objective.
The matrix M is filled with randomly generated cells. Then, each ant traverses the matrix. The ants originate from a dummy cell called the nest (see Figure 4), and each ant traverses one state (a cell in a column) after another until it reaches the last column or a cell that has no successor. After the ants finish traversing the matrix, all cells are checked to decide whether they are kept or not. Each cell can assume one of two statuses: 'l' (locked) or 'r' (removed). The cells that are included in the best path(s) assume the status 'l' (locked), and the cells that feed locked cells are locked as well. All other cells assume the status 'r' (removed). The cells with status 'r' are removed at the end of each iteration. These empty cells are then filled again at the beginning of the next iteration. Figure 5 shows the pseudocode of the proposed algorithm.
Algorithm Modified ACO
    For MAXITER number of iterations do
        Fill the matrix
        ACO algorithm
            Ant activity
            Pheromone update
        Remove unfit cells
    End For
    Return the best path
End Algorithm
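The cell locking rule used in the "Remove unfit cells" step can be sketched in Python as follows. The sketch assumes that the matrix is stored as a dictionary mapping each cell position (row, column) to the list of cells that feed it; this data structure and the function name are illustrative assumptions.

```python
def cell_statuses(cells, best_path):
    """Assign 'l' (locked) or 'r' (removed) to every cell.

    cells: dict mapping (row, col) -> list of (row, col) cells feeding it
           (hypothetical representation of the matrix).
    best_path: iterable of (row, col) cells on the best path(s).
    Cells on the best path are locked, and so is every cell that feeds
    a locked cell, directly or through other locked cells.
    """
    locked = set()
    stack = [cell for cell in best_path if cell in cells]
    while stack:
        cell = stack.pop()
        if cell in locked:
            continue
        locked.add(cell)
        stack.extend(src for src in cells[cell] if src in cells)
    return {cell: ("l" if cell in locked else "r") for cell in cells}


cells = {
    (0, 0): [], (1, 0): [],
    (0, 1): [(0, 0), (1, 0)], (1, 1): [(1, 0)],
    (0, 2): [(0, 1)], (1, 2): [(1, 1)],
}
status = cell_statuses(cells, best_path=[(0, 0), (0, 1), (0, 2)])
print(status)
```

Here cell (1, 0) is not on the best path but is locked because it feeds the locked cell (0, 1), while cells (1, 1) and (1, 2) are removed and would be refilled at the start of the next iteration.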
Pheromone Trail Calculation and Update
The selection of which edge to traverse is determined by a stochastic process, e.g., roulette wheel selection. Therefore, the probability of choosing each edge must be calculated in advance. This probability depends on the pheromone value ($\tau$) and the heuristic value ($\eta$) of the corresponding edge (or the next cell), and is formulated below.
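A minimal Python sketch of this selection step is given below. It assumes the standard ACO transition rule, in which the probability of an edge is proportional to $\tau^{\alpha} \eta^{\beta}$; the default $\alpha = 1$ and the function names are assumptions, while $\beta = 2$ follows the value used in the experiments.

```python
import random


def transition_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Assumed rule: p_k = tau_k^alpha * eta_k^beta / sum_j(tau_j^alpha * eta_j^beta)."""
    if not tau or len(tau) != len(eta):
        raise ValueError("tau and eta must be non-empty and of equal length")
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(tau, eta)]
    total = sum(weights)
    if total == 0:
        # No information on any edge: fall back to a uniform choice.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def roulette_wheel(probabilities, rng=random):
    """Pick an index with probability proportional to its slice of the wheel."""
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return index
    return len(probabilities) - 1  # guard against floating-point round-off


probs = transition_probabilities(tau=[1.0, 1.0], eta=[0.5, 1.0])
next_edge = roulette_wheel(probs)
print(probs, next_edge)
```

With equal pheromone on both edges, the edge with the larger heuristic value receives the larger share of the wheel, and raising $\beta$ widens that gap.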
The values of $\alpha$ and $\beta$ determine the preference of the search, i.e., whether it depends more on the pheromone value or on the heuristic value. Every newly created cell is given a small initial pheromone value. This value is updated in every iteration by the ants. The addition of 0.5 in the calculation of $\eta$ is meant to normalize the value of $\eta$ into [0, 1]. A decrease of the functional fitness means that the value of $\eta$ is in the range [0, 0.5), while an increase of the functional fitness puts the value of $\eta$ in the range (0.5, 1].
While traversing the matrix, every ant carries the information of the path taken so far, e.g., the row indices of all cells that have been visited. If an ant reaches a cell that has no successor, the overall fitness of the solution built by the ant is evaluated.
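The ranges quoted for the heuristic value are consistent with a simple form, sketched here in Python: the change in normalized functional fitness between the current cell and the candidate next cell, shifted by 0.5. This form is an assumption made for illustration; it relies on the normalized functional fitness lying in [0.5, 1], so that the change lies in [-0.5, 0.5].

```python
def heuristic_value(ff_current, ff_next):
    """Assumed form: eta = (FF_n(next) - FF_n(current)) + 0.5.

    Both arguments are normalized functional fitness values in [0.5, 1],
    so the result always falls in [0, 1].
    """
    for ff in (ff_current, ff_next):
        if not 0.5 <= ff <= 1.0:
            raise ValueError("normalized functional fitness must be in [0.5, 1]")
    return (ff_next - ff_current) + 0.5


print(heuristic_value(0.75, 1.0))   # fitness increases: eta above 0.5
print(heuristic_value(1.0, 0.5))    # fitness decreases: eta below 0.5
print(heuristic_value(0.75, 0.75))  # no change: eta equals 0.5
```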
When all ants finish their tour, the pheromone update is performed. The pheromone update consists of two procedures: pheromone addition and pheromone evaporation. However, as has been shown in [16], it is better to limit the number of ants that can deposit additional pheromone. Thus, only a certain number of the best ants can track their path(s) back and put some additional pheromone on them. The pheromone addition is performed using the following equation:
$$\tau(t) \leftarrow \tau(t) + \Delta\tau \tag{15}$$
where $OvF(b)$ denotes the overall fitness of the solution that the ants built, $\Delta\tau$ is the additional pheromone and $A$ is a constant.
Next, pheromone evaporation takes place using the following formula.
While traversing the matrix, the ants memorize the best cell visited along the path. The path from the nest to the best cell is returned. The remaining part of the path is discarded by the ant.
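The two-step pheromone update (addition by the best ants, followed by evaporation) can be sketched in Python as below. The sketch assumes that the additional pheromone is $\Delta\tau = A \cdot OvF(b)$ and that evaporation follows the common ACO rule $\tau \leftarrow (1 - \rho)\tau$ with an evaporation rate $\rho$; both forms, the parameter `rho`, and the edge labels are assumptions made for the example.

```python
def update_pheromone(tau, best_paths, fitnesses, a=1.0, rho=0.1):
    """Return a new pheromone table after addition and evaporation.

    tau: dict mapping an edge label to its pheromone value.
    best_paths: paths (lists of edge labels) of the best ants only.
    fitnesses: overall fitness OvF(b) of each of those paths.
    Assumed forms: delta_tau = a * OvF(b); tau <- (1 - rho) * tau.
    """
    new_tau = dict(tau)
    # Pheromone addition: only the best ants reinforce their paths.
    for path, ovf in zip(best_paths, fitnesses):
        delta_tau = a * ovf
        for edge in path:
            new_tau[edge] = new_tau.get(edge, 0.0) + delta_tau
    # Pheromone evaporation: applied to every edge.
    for edge in new_tau:
        new_tau[edge] *= 1.0 - rho
    return new_tau


tau = {"nest->c00": 1.0, "nest->c10": 1.0}
updated = update_pheromone(tau, best_paths=[["nest->c00"]], fitnesses=[1.0], rho=0.5)
print(updated)
```

After the update, the edge used by the best ant holds more pheromone than the unused edge, which is the positive feedback that biases later ants towards good partial circuits.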
When the maximum number of iterations is reached, the best solution is returned. In the case of multiple-output circuits, multiple colonies of ants are used. In this context, each colony of ants is assigned to find a specific output of the circuit. All colonies share the same matrix. The possibility of using the same sub-functions is established by sharing the pheromone value among different colonies.
Experimental Setup
The technology parameters are obtained using the 0.25-micron CMOS library from MOSIS [17]. The parameters used for the experiments are as follows:
It should be noted that we avoid using gate count as the measure of the objective for several reasons. First, the term 'gate', or basic module, in evolutionary logic design depends on the definition of the gate library that is used. One may use NAND gates, or a set of AND, OR and XOR gates, or MUXes, or a combination of all these. Second, each of the aforementioned gates has different characteristics. We can regard an XOR gate as an atomic gate. However, this may not be the case for all target implementations. For example, in standard cell design, an XOR gate requires more area than an AND gate, and an AND gate requires more area than a NAND gate. This is in contrast with an FPGA, in which all types of gates can fit into one cell. Nevertheless, if the target implementation is an FPGA, the area is proportional to the gate count. In this context, the proposed algorithm is more general than the existing techniques. However, since most of the published work in evolutionary logic design uses the gate count as a measure of quality, we provide a comparison of our results in terms of gate count as well.
Performance Evaluation and Comparison
Several circuits of different complexity have been used to test the proposed algorithm. For the sake of simplicity, the truth table of each circuit is represented as a string of zeros and ones. Table 2 shows the circuits used for performance evaluation. Note that these circuits represent single-output Boolean functions.
Table 2: Circuits used to test the performance of the proposed approach.
Figure 6 shows the behavior of the proposed algorithm for area optimization of Circuit1 for the first 100 iterations. In this figure, the area, delay and power consumption are normalized so that the behavior of the proposed approach can be observed easily.
Table 3: Results for the delay and power optimization, normalized with respect to the results of area optimization.
The results for the delay-optimized and power-optimized circuits are given in Table 3. Note that these values are normalized with respect to the results obtained from the area-optimized circuits. The normalized area for each circuit in Table ? should be less than or equal to one. As can be seen, except for Circuit5, the delay optimization scheme produced better circuits in terms of delay (normalized delay ≤ 1), at the expense of larger area and/or power consumption. These results show the effectiveness of the proposed algorithm.
In order to compare our algorithm with known published work, some circuits are tested and compared to the results reported in [9, 10]. Some selected circuits from Table 2, with the addition of the ?-bit multiplier and the 2-bit adder with carry, are used for comparison. An overall comparison is shown in Table 4.
For multiple-output circuits, multiple colonies of ants are used. In a 2-bit adder circuit, for example, we need three colonies of ants to find all three outputs of the circuit. Each colony finds its assigned circuit output. The sharing of the pheromone value between colonies makes the sharing of sub-functions possible. However, this sharing can bias the search, since the second colony of ants 'smells' the pheromone that was updated by the first colony of ants, the third colony 'smells' the pheromone that was updated by the second colony, and so on. This is the reason why the value of $\beta$ in Equation 11 is set equal to 2 during the course of the experiment. With a higher value of $\beta$, the search is more dependent on the heuristic value $\eta$, which emphasizes finding functionally correct circuits. Figure 1 shows the functional fitness value of each output of a 2-bit adder circuit. Notice that the ants find the outputs of the circuit one by one. In analogy with the process of ants finding food, the simplest function (the closest) is obtained first and the most complex function (the furthest) is most likely obtained last. In this context, for the 2-bit adder circuit, the first sum is obtained first while the carry out is obtained last.
Conclusion
In this paper, we have presented a modified ant colony algorithm for evolutionary logic design. The modification is performed in order to suit the problem instance and to handle some of the problems that are not addressed by existing techniques. The performance of the proposed approach and a comparison with the existing techniques are shown. In all cases, the results obtained using the proposed approach outperformed those obtained using existing techniques.
