GATS: A Novel Hybrid Algorithm for Multiobjective Cell Placement in VLSI Circuit Design by Sait, Sadiq M. & Minhas, Mahmood R.
Sadiq M. Sait, Department of Computer Engineering
Mahmood R. Minhas, Department of Information & Computer Science
King Fahd University of Petroleum and Minerals
Dhahran 31261, Saudi Arabia
e-mail: {sadiq,minhas}@ccse.kfupm.edu.sa
Abstract
This paper addresses the optimization of the cell placement step in VLSI circuit design [1]. A novel hybrid algorithm
is proposed for performance and low power driven VLSI standard cell placement. The problem is multiobjective
in nature, since three possibly conflicting objectives are optimized subject to a constraint on layout width.
These objectives are power dissipation, timing performance, and interconnect wire length. It is well
known that optimizing cell placement for even a single objective, namely total wire length, is a hard problem.
Because the objective values are imprecise, fuzzy logic is incorporated in the design of the aggregating function. The
proposed technique is applied to the placement of ISCAS-89 benchmark circuits, and the results are compared with those
obtained by applying Genetic Algorithm (GA) and Tabu Search (TS) individually to this problem.
1 Introduction
Hybrid algorithms combine features from various
heuristics in an effort to develop efficient techniques for
solving hard optimization problems [2]. In the scope
of the present work, hybridization refers to mixing
good characteristics from several iterative algorithms.
The choice of iterative algorithms to be hybridized
requires careful analysis of the underlying problem
and of the characteristics of the individual candidate
algorithms. Applying an individual algorithm to
the problem may give insight into its suitability for the
problem. If the results of applying the individual
algorithms are encouraging, then their hybridization is
expected to produce even better performance in many
cases.
Genetic Algorithm (GA) and Tabu Search (TS) are
two well-known iterative heuristics that have been applied
to a range of combinatorial optimization
problems in various disciplines of science and
engineering [1, 2]. Both of these algorithms have exhibited
good performance on the present problem [3, 4].
Therefore, it seems reasonable to hybridize GA and TS
with a view to developing an efficient hybrid technique
for timing and low power driven placement. A
good property of GA is its implicit parallel nature, which
helps in exploring the search space efficiently. This
parallelism is due to the fact that GA processes a population
of solutions instead of a single solution. On
the other hand, TS showed better results than GA,
and this better performance can be attributed to the
search mechanism of TS. These facts lead us to the
proposition that an efficient hybrid technique can be developed
by combining features from these two iterative
algorithms. The experimental results support our
proposition, as the proposed GATS obtained
results that are better than those obtained by using
either TS or GA alone.
This paper is organized as follows. In the next section,
we formulate our problem and cost functions.
Section 3 presents the details of our proposed approach,
and the experimental results are presented and
discussed in Section 4.
2 Problem Formulation and
Cost Functions
In this section, we formulate our problem and the cost
function used in our optimization process.
VLSI design is a complex process and is carried
out at several abstraction levels [1]. We address
the problem at the cell placement level with the
objectives of optimizing power consumption, timing
performance (delay), and wire length while treating
layout width as a constraint. Formally, the
problem can be stated as follows. A set of cells or
modules $M = \{m_1, m_2, \ldots, m_n\}$ and a set of signals
$S = \{s_1, s_2, \ldots, s_k\}$ are given. Moreover, a set of signals
$S_{m_i}$, where $S_{m_i} \subseteq S$, is associated with each module
$m_i \in M$. Similarly, a set of modules $M_{s_j}$, where
$M_{s_j} = \{m_i \mid s_j \in S_{m_i}\}$, called a signal net, is associated
with each signal $s_j \in S$. Also, a set of locations
$L = \{L_1, L_2, \ldots, L_p\}$, where $p \ge n$, is given. The problem
is to assign each $m_i \in M$ to a unique location $L_j$
such that all of our objectives are optimized subject to
our constraint.
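As a small illustration of the definition of a signal net, the following Python sketch derives each net $M_{s_j}$ from the per-module signal sets $S_{m_i}$. The function name `signal_nets` and the three-module example are hypothetical and are not taken from the benchmark circuits.

```python
from collections import defaultdict


def signal_nets(module_signals):
    """Map each signal s_j to the set of modules m_i whose signal set contains it."""
    nets = defaultdict(set)
    for module, signals in module_signals.items():
        for signal in signals:
            nets[signal].add(module)
    return dict(nets)


# Hypothetical example: three modules and three signals.
module_signals = {"m1": {"s1", "s2"}, "m2": {"s2"}, "m3": {"s1", "s3"}}
nets = signal_nets(module_signals)
print(sorted(nets["s1"]))  # modules connected by signal s1
```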
2.1 Cost Functions
Now we formulate cost functions for our three said
objectives and for the width constraint.
Wire length Cost: The interconnect wire length of
each net in the circuit is estimated, and the total wire
length is computed by adding all these individual
estimates:
$$\mathrm{Cost}_{wire} = \sum_{i \in M} l_i \tag{1}$$
where $l_i$ is the wire length estimate for net $i$ and $M$
denotes the set of nets in the circuit (the number of nets is the
same as the number of modules for single-output cells).
Power Cost: The power consumption $p_i$ of a net $i$ in a
circuit can be given as:
$$p_i \simeq \frac{1}{2} \cdot C_i \cdot V_{DD}^2 \cdot f \cdot S_i \cdot \beta \tag{2}$$
where $C_i$ is the total capacitance of net $i$, $V_{DD}$ is the
supply voltage, $f$ is the clock frequency, $S_i$ is the switching
probability of net $i$, and $\beta$ is a technology dependent
constant.
Assuming a fixed supply voltage and clock frequency,
the above equation reduces to the following:
$$p_i \simeq C_i \cdot S_i \tag{3}$$
The capacitance $C_i$ of cell $i$ is given as:
$$C_i = C_i^r + \sum_{j \in M_i} C_j^g \tag{4}$$
where $C_j^g$ is the input capacitance of gate $j$ and $C_i^r$
is the interconnect capacitance at the output node of
cell $i$.
At the placement phase, only the interconnect
capacitance $C_i^r$ can be manipulated, while $C_j^g$ comes from
the properties of the cell library used and is thus
independent of placement. Moreover, $C_i^r$ depends on the wire
length of net $i$, so Equation 3 can be written as:
$$p_i \simeq l_i \cdot S_i \tag{5}$$
The cost function for total power consumption in
the circuit can be given as:
$$\mathrm{Cost}_{power} = \sum_{i \in M} p_i = \sum_{i \in M} (l_i \cdot S_i) \tag{6}$$
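A minimal Python sketch of Equations 1 and 6 follows. It assumes the per-net wire length estimates and switching probabilities are already available; the function names and the three-net data are made up for illustration.

```python
def wire_cost(lengths):
    """Equation 1: total wire length is the sum of the per-net estimates l_i."""
    return sum(lengths.values())


def power_cost(lengths, switching):
    """Equation 6: sum over all nets of l_i * S_i."""
    return sum(lengths[net] * switching[net] for net in lengths)


# Hypothetical per-net wire length estimates and switching probabilities.
lengths = {"n1": 120.0, "n2": 80.0, "n3": 200.0}
switching = {"n1": 0.5, "n2": 0.25, "n3": 0.125}

print(wire_cost(lengths))              # 120 + 80 + 200 = 400.0
print(power_cost(lengths, switching))  # 60 + 20 + 25 = 105.0
```

Because both costs depend on placement only through $l_i$, a move that shortens a frequently switching net reduces the power cost more than the same reduction on a rarely switching net.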
Delay Cost: The delay cost is determined by the delay
along the longest path in a circuit. The delay $T_{\pi}$ of a
path $\pi$ consisting of nets $\{v_1, v_2, \ldots, v_k\}$ is expressed
as:
$$T_{\pi} = \sum_{i=1}^{k-1} (CD_i + ID_i) \tag{7}$$
where $CD_i$ is the switching delay of the cell driving
net $v_i$ and $ID_i$ is the interconnect delay of net $v_i$. The
placement phase affects only $ID_i$, because $CD_i$ is a technology
dependent parameter and is independent of placement.
The delay cost function can be written as:
$$\mathrm{Cost}_{delay} = \max\{T_{\pi}\} \tag{8}$$
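Equations 7 and 8 can be sketched in Python as follows, assuming each path is supplied as a pair of equally long lists holding the switching and interconnect delays of its nets. The function names and delay values are hypothetical, and the sketch simply sums over whatever entries it is given rather than fixing the summation limit.

```python
def path_delay(cell_delays, interconnect_delays):
    """Equation 7: sum of switching delay CD_i plus interconnect delay ID_i along a path."""
    return sum(cd + idl for cd, idl in zip(cell_delays, interconnect_delays))


def delay_cost(paths):
    """Equation 8: the delay of the slowest path."""
    return max(path_delay(cd, idl) for cd, idl in paths)


# Hypothetical paths given as (switching delays, interconnect delays).
paths = [
    ([10.0, 12.0, 9.0], [3.0, 5.0, 2.0]),  # path delay 41.0
    ([11.0, 14.0], [6.0, 8.0]),            # path delay 39.0
]
print(delay_cost(paths))  # 41.0
```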
Width Cost: The width cost is given by the maximum
of all the row widths in the layout. We constrain
the layout width so that its excess over the average row width
$w_{avg}$ does not exceed a positive fraction $\alpha$ of $w_{avg}$, where $w_{avg}$ is
the minimum possible layout width, obtained by dividing
the total width of all the cells in the layout by the
number of rows in the layout. Formally, we can express
the width constraint as:
$$Width - w_{avg} \le \alpha \times w_{avg} \tag{9}$$
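The width constraint of Equation 9 can be checked as in the sketch below. The function name `width_ok` and the row and cell widths are illustrative assumptions.

```python
def width_ok(row_widths, total_cell_width, alpha):
    """Equation 9: Width - w_avg <= alpha * w_avg, where Width is the widest row."""
    w_avg = total_cell_width / len(row_widths)  # minimum possible layout width
    width = max(row_widths)
    return width - w_avg <= alpha * w_avg


# Hypothetical layout: four rows, total cell width 400, so w_avg = 100.
print(width_ok([110, 90, 100, 100], 400, alpha=0.2))  # True: excess 10 <= 20
print(width_ok([130, 90, 90, 90], 400, alpha=0.2))    # False: excess 30 > 20
```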
Overall Fuzzy Cost Function: Since we are
optimizing three objectives simultaneously, we need
a cost function that represents the effect of all
three objectives in the form of a single quantity. We
propose the use of fuzzy logic to integrate these multiple,
possibly conflicting objectives into a scalar cost
function. Fuzzy logic allows us to describe the objectives
in terms of linguistic variables. Fuzzy rules are then
used to find the overall cost of a placement solution.
In this work, we have used the following fuzzy rule:
IF a solution has
SMALL wire length AND
LOW power consumption AND
SHORT delay
THEN it is a GOOD solution.
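[Figure 1: Membership functions. Panel (a): membership $\mu_i^c$ versus $C_i/O_i$, with axis marks at 1.0, $g_i^*$, and $g_i$. Panel (b): membership $\mu_{width}^c$ versus $C_{width}/O_{width}$, with axis marks at 1.0 and $g_{width}$.]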
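The above rule is translated to an and-like OWA fuzzy
operator [5], and the membership $\mu(x)$ of a solution $x$
in the fuzzy set GOOD solution is given as:
$$\mu(x) = \begin{cases} \beta \cdot \min\limits_{j=p,d,l}\{\mu_j(x)\} + (1-\beta) \cdot \dfrac{1}{3} \sum\limits_{j=p,d,l} \mu_j(x), & \text{if } Width - w_{avg} \le \alpha \cdot w_{avg}, \\ 0, & \text{otherwise.} \end{cases} \tag{10}$$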
Here $\mu_j(x)$ for $j = p, d, l$ are the membership
values in the fuzzy sets LOW power consumption,
SHORT delay, and SMALL wire length respectively. $\beta$
is a constant in the range [0, 1]. The solution that
results in the maximum value of $\mu(x)$ is reported as the
best solution found by the search heuristic.
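The aggregation in Equation 10 can be sketched in a few lines of Python. The function name and the numeric membership values are illustrative assumptions; the three memberships are taken as already computed.

```python
def overall_membership(mu_p, mu_d, mu_l, width, w_avg, alpha, beta):
    """Equation 10: and-like OWA of the three memberships, 0 if the width constraint fails."""
    if width - w_avg > alpha * w_avg:
        return 0.0
    memberships = (mu_p, mu_d, mu_l)
    return beta * min(memberships) + (1 - beta) * sum(memberships) / 3


# Hypothetical memberships for power, delay, and wire length.
print(overall_membership(0.5, 0.75, 0.25, width=110, w_avg=100, alpha=0.2, beta=0.5))
# 0.5 * 0.25 + 0.5 * 0.5 = 0.375
print(overall_membership(0.5, 0.75, 0.25, width=130, w_avg=100, alpha=0.2, beta=0.5))
# 0.0, because the width constraint is violated
```

With $\beta = 1$ the operator reduces to a pure minimum (strict AND), and with $\beta = 0$ it reduces to the plain average of the three memberships.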
The membership functions for the fuzzy sets LOW
power consumption, SHORT delay, and SMALL wire
length are shown in Figure 1. We can vary the preference
given to an objective $j$ in the overall membership function
by changing the value of $g_j$. The lower bounds $O_j$ for
the different objectives are computed as given in
Equations 11-14:
$$O_l = \sum_{i=1}^{n} l_i^* \quad \forall v_i \in \{v_1, v_2, \ldots, v_n\} \tag{11}$$
$$O_p = \sum_{i=1}^{n} S_i \, l_i^* \quad \forall v_i \in \{v_1, v_2, \ldots, v_n\} \tag{12}$$
$$O_d = \sum_{j=1}^{k} (CD_j + ID_j^*) \quad \forall v_j \in \{v_1, v_2, \ldots, v_k\} \text{ in path } \pi_c \tag{13}$$
$$O_{width} = \frac{\sum_{i=1}^{n} Width_i}{\text{number of rows in layout}} \tag{14}$$
where $O_j$ for $j \in \{l, p, d, width\}$ are the optimal costs
for wire length, power, delay, and layout width
respectively, $n$ is the number of nets in the layout, $l_i^*$ is the
optimal wire length of net $v_i$, $CD_i$ is the switching delay
of the cell $i$ driving net $v_i$, $ID_i^*$ is the optimal
interconnect delay of net $v_i$ calculated with the help of $l_i^*$,
$S_i$ is the switching probability of net $v_i$, $\pi_c$ is the most
critical path with respect to optimal interconnect
delays, $k$ is the number of nets in $\pi_c$, and $Width_i$ is the
width of the individual cell driving net $v_i$.
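The sketch below shows one way a membership value could be derived from a cost $C_j$, its lower bound $O_j$, and the goal $g_j$. The exact shape of the curves in Figure 1 cannot be recovered from the text, so the piecewise-linear form used here (1 at $C_j/O_j \le 1$, falling linearly to 0 at $C_j/O_j = g_j$) is an assumption, as are the function name and the numbers.

```python
def membership(cost, lower_bound, goal):
    """Assumed piecewise-linear membership: 1 at ratio <= 1, 0 at ratio >= goal."""
    ratio = cost / lower_bound  # C_j / O_j
    if ratio <= 1.0:
        return 1.0
    if ratio >= goal:
        return 0.0
    return (goal - ratio) / (goal - 1.0)


# Hypothetical lower bound O_j = 100 and goal g_j = 3.
for cost in (100, 200, 300, 400):
    print(cost, membership(cost, lower_bound=100, goal=3.0))
```

Under this assumed shape, raising $g_j$ makes the membership fall more slowly with cost, which relaxes the corresponding objective relative to the others.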
3 GATS for Performance
and Low Power Driven VLSI
Placement
In this section, we first briefly describe the GATS
algorithm and then discuss the implementation details of
GATS for multiobjective VLSI placement.
Figure 2 illustrates the proposed hybrid algorithm
GATS. An interesting novel idea is the introduction of
a population of solutions, instead of a single solution, in
TS. This is likely to enhance the power of TS by
allowing it to visit the search space in a parallel fashion.

Algorithm GATS
    S    : Current solutions
    S*   : Best solutions
    N(S) : Neighborhood of S ∈ Ω
    V*   : Sample of neighborhood solutions
    T    : Tabu lists
    AL   : Aspiration levels
    NG   : Population size for GA portion
    NT   : Population size for TS portion
    No   : Number of offspring

Begin
    For fixed number of times Do
        Start with random initial population NT
        For fixed number of iterations Do
            For j = 1 To NT
                Generate neighbor solutions V*(j) ⊂ N(S(j)) by random swap
                Find best S*(j) ∈ V*(j)
                If move S(j) to S*(j) is not in T(j) Then
                    Accept move and update T(j)
                Else
                    If Cost(S*(j)) < AL(j) Then
                        Accept move and update T(j) and AL
                    End If
                End If
            End For
        End For
        Pass best or current solutions to GA
        For fixed number of generations Do
            For j = 1 To No
                (x, y) ← Choose parents
                Offspring[j] ← Crossover(x, y)
                Evaluate Fitness(Offspring[j])
            End For
            Population ← Select(Population, Offspring, NG)
            For j = 1 To NG
                Apply Mutation(chromosome[j])
                Evaluate Fitness(chromosome[j])
            End For
        End For
        Pass best or current solutions to TS
    End For
End.

Figure 2: Outline of GATS: A Hybrid of GA and TS
for Multiobjective VLSI Cell Placement.
The algorithm starts by taking a random initial
population of solutions. Then, for each individual in the
population, a certain number of neighbor solutions are
generated and the best neighbor is found. A characteristic
of the move leading to the best neighbor solution
is stored in a tabu list. There are as many tabu lists
as there are solutions in the population, i.e., NT.
The reason for keeping NT tabu lists is that the
series of moves for each individual in the population
is different. Therefore, each series is stored in
a separate list so that a tabu list restricts cyclic
moves on its corresponding individual only. However,
a single aspiration level (AL) is shared by all the
individuals. The purpose is that a tabu move on an
individual solution is allowed only if it results in a solution
that is better than the overall best solution.
The above process continues for a certain number
of iterations, and a record is kept of the NT individual
best solutions obtained by perturbing the NT
individual initial solutions. Then either these best
solutions or the current solutions are passed to GA for
further optimization. These semi-optimized individuals
are likely to produce good offspring by mating
with one another. GA is then run for a given number
of generations on the passed solutions, and a record
of the best individuals is kept. Again, either the current
individuals or the best ones are passed back to TS.
The switching between TS and GA is repeated a
given number of times.
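The alternation described above can be sketched in Python on a toy permutation problem. This is only an illustration of the control flow, not the implementation used in the experiments: the toy `cost` function, the order crossover, the decision to leave the best chromosome unmutated, the use of the same population size in both phases, and all parameter values are assumptions made for this example.

```python
import random
from collections import deque


def cost(perm):
    """Toy stand-in for the placement cost (lower is better)."""
    return sum(abs(i - v) for i, v in enumerate(perm))


def ts_phase(population, iterations, neighbors, tabu_size, rng):
    tabu = [deque(maxlen=tabu_size) for _ in population]  # one tabu list per individual
    best = [list(s) for s in population]
    aspiration = min(cost(s) for s in population)  # single shared aspiration level
    for _ in range(iterations):
        for j, current in enumerate(population):
            candidates = []
            for _ in range(neighbors):
                a, b = rng.sample(range(len(current)), 2)
                neighbor = list(current)
                neighbor[a], neighbor[b] = neighbor[b], neighbor[a]
                move = tuple(sorted((current[a], current[b])))  # cells swapped
                candidates.append((cost(neighbor), move, neighbor))
            c, move, neighbor = min(candidates, key=lambda t: t[0])
            if move not in tabu[j] or c < aspiration:  # aspiration overrides tabu
                population[j] = neighbor
                tabu[j].append(move)
                aspiration = min(aspiration, c)
                if c < cost(best[j]):
                    best[j] = neighbor
    return best


def order_crossover(x, y, rng):
    """Keep a slice of x and fill the remaining positions in the order of y."""
    a, b = sorted(rng.sample(range(len(x)), 2))
    middle = x[a:b]
    rest = [g for g in y if g not in middle]
    return rest[:a] + middle + rest[a:]


def ga_phase(population, generations, n_offspring, mutation_rate, rng):
    population = [list(p) for p in population]
    size = len(population)
    for _ in range(generations):
        offspring = []
        for _ in range(n_offspring):
            x, y = rng.sample(population, 2)
            offspring.append(order_crossover(x, y, rng))
        population = sorted(population + offspring, key=cost)[:size]
        for chrom in population[1:]:  # assumption: best chromosome is not mutated
            if rng.random() < mutation_rate:
                a, b = rng.sample(range(len(chrom)), 2)
                chrom[a], chrom[b] = chrom[b], chrom[a]
    return population


def gats(n_cells=12, pop_size=4, switches=2, seed=0):
    rng = random.Random(seed)
    population = [rng.sample(range(n_cells), n_cells) for _ in range(pop_size)]
    for _ in range(switches):
        population = ts_phase(population, 30, 10, 3, rng)  # TS, then pass best to GA
        population = ga_phase(population, 20, 4, 0.2, rng)  # GA, then pass back to TS
    return min(population, key=cost)


best = gats()
print(best, cost(best))
```

In this sketch the TS phase hands its per-individual best solutions to the GA phase, which corresponds to the "pass best" option in Figure 2; passing the current solutions instead would only require returning `population` from `ts_phase`.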
3.1 Solution Representation and Ini-
tialization
A placement solution is an arrangement of cells on a
two-dimensional layout surface, so we represent a
solution in the form of a 2-D grid. Because the
widths of the cells in a circuit vary, all the rows cannot
have an equal number of cells. This fact disturbs our
two-dimensional representation. For instance, consider a
circuit comprising 11 cells 1, 2, 3, . . . , 11. A possible
layout may be as below:

    3   5   8   6
    9  10
    7  11   1
    4   2

The above layout is constructed by first computing the
average row width, as explained in the discussion of the width
cost in the cost functions section. We then
divide the average row width by the smallest cell width
to compute the maximum number of locations in a
row. Assume that we have 4 locations and that we
know from the min-cut placer information that there
are 4 rows in the layout. We start constructing the
initial solution by randomly selecting a cell from the 11
cells and placing it in the first row. Before a cell is placed,
it is checked whether adding it will violate the
width constraint; if it does, the cell is placed at
the start of the next row. In the example above, assume
that the sum of the widths of cells 3, 5, 8, 6 was within the allowed
width constraint, but adding cell 9 would violate the
width constraint, so it was placed in the second row.
Similarly, all the cells were placed on the layout. As
a result, we have five empty locations: two in the second
row, one in the third row, and two in the last row. In order to
make it a perfect grid, we fill the empty locations with
dummy cells represented by distinct negative integers,
as shown below:

    3   5   8   6
    9  10  -1  -2
    7  11   1  -3
    4   2  -4  -5
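The construction just described can be sketched as follows. The cell widths are invented so that the row breaks match the example, and the function takes the cell order as an argument (a random shuffle in practice) so that the output is reproducible; `build_grid` and its parameters are hypothetical names.

```python
def build_grid(order, widths, num_rows, width_limit):
    """Fill rows left to right, start a new row on a width violation, pad with dummies."""
    avg_row_width = sum(widths.values()) / num_rows
    slots = int(avg_row_width // min(widths.values()))  # locations per row
    rows, used = [[]], 0
    for cell in order:
        if rows[-1] and used + widths[cell] > width_limit:
            rows.append([])  # cell would violate the width limit: next row
            used = 0
        rows[-1].append(cell)
        used += widths[cell]
    if len(rows) > num_rows or any(len(r) > slots for r in rows):
        raise ValueError("cells do not fit the grid")
    rows += [[] for _ in range(num_rows - len(rows))]
    dummy = 0
    for row in rows:
        while len(row) < slots:  # distinct negative integers mark empty locations
            dummy -= 1
            row.append(dummy)
    return rows


# Hypothetical widths chosen so that each row of the example sums to 20.
widths = {3: 5, 5: 5, 8: 5, 6: 5, 9: 10, 10: 10, 7: 5, 11: 7, 1: 8, 4: 10, 2: 10}
order = [3, 5, 8, 6, 9, 10, 7, 11, 1, 4, 2]
for row in build_grid(order, widths, num_rows=4, width_limit=20):
    print(row)
```

With these widths the average row width is 20 and the smallest cell width is 5, giving 4 locations per row, and the printed grid matches the padded layout shown above.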
3.2 Cost Evaluation
Since we are addressing a multiobjective optimization
problem in which we try to minimize three mutually
conflicting objectives, we need
a measure that quantifies the overall quality of
a solution with respect to all three objectives
collectively. Fuzzy logic provides a convenient approach and
is hence used in this research. In this scheme, each
solution is assigned a fitness value between 0 and 1 that
is equal to its membership value in the fuzzy set of
acceptable solutions. This membership value is
computed using Equation 10. The fitness of a solution is a
measure of its proximity to the optimal solution: the
higher the fitness value of a solution, the closer it is
to the optimal solution. In our implementation, the initial
random solution is assigned a fitness value of 0 and the
optimal solution is assigned a fitness value of 1. This
implies that any solution has a fitness value in the
range 0.0-1.0.
3.3 Neighbor Solutions Generation
In each iteration, we generate a number of neighbor
solutions by making perturbations as follows: two cells
are selected randomly, with the condition that they
must not both be dummy cells, and
their locations are then interchanged. The neighborhood
size, i.e., the number of neighbor solutions generated
in each iteration, is chosen according to the circuit
size, i.e., the number of cells in the circuit. The
neighborhood size is varied from 20 solutions for small
circuits to 100 solutions for large circuits.
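A minimal sketch of this perturbation on the padded grid follows, assuming real cells are positive integers and dummy cells are negative. The function names `random_swap` and `neighborhood` and the seed are illustrative.

```python
import random


def random_swap(grid, rng):
    """Return a copy of grid with two locations interchanged, never two dummy cells."""
    positions = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    while True:
        (r1, c1), (r2, c2) = rng.sample(positions, 2)
        if grid[r1][c1] > 0 or grid[r2][c2] > 0:  # at least one real cell
            break
    neighbor = [row[:] for row in grid]
    neighbor[r1][c1], neighbor[r2][c2] = neighbor[r2][c2], neighbor[r1][c1]
    return neighbor


def neighborhood(grid, size, rng):
    """Generate `size` neighbor solutions of the current grid."""
    return [random_swap(grid, rng) for _ in range(size)]


grid = [[3, 5, 8, 6], [9, 10, -1, -2], [7, 11, 1, -3], [4, 2, -4, -5]]
neighbors = neighborhood(grid, 20, random.Random(1))
print(len(neighbors))  # 20
```

Swapping a real cell with a dummy cell moves the cell to an empty location, which is why only dummy-dummy swaps (which would leave the placement unchanged) are excluded.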
3.4 Tabu List and Aspiration Level
The characteristic of the move that we keep in the tabu list
is the pair of indices of the cells involved in the interchange. The
size of the tabu list also depends on the circuit
size: it is set to 5% of the total number of cells. We have used
a short-term memory element in our TS implementation.
The aspiration criterion used is as follows: if the current
best solution is the best seen so far, i.e., better than
the global best, then the current solution is accepted as the
new best solution by overriding the tabu restriction,
and the tabu list is updated.
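The tabu list sizing and the aspiration criterion can be sketched as below. The sketch treats cost as lower-is-better, as in Figure 2, and the helper names, the rounding of the 5% size, and the example moves are assumptions.

```python
from collections import deque


def make_tabu_list(num_cells, fraction=0.05):
    """Short-term memory whose size is 5% of the number of cells (at least 1)."""
    return deque(maxlen=max(1, round(num_cells * fraction)))


def accept_move(move, cost, tabu, global_best_cost):
    """Accept a swap unless it is tabu; a tabu swap passes only if it beats the global best."""
    move = tuple(sorted(move))  # the pair of swapped cell indices, order-independent
    if move in tabu and cost >= global_best_cost:
        return False
    tabu.append(move)  # the oldest entry drops out once the list is full
    return True


tabu = make_tabu_list(200)                 # 10 entries for a 200-cell circuit
print(accept_move((3, 7), 50, tabu, 40))   # True: move is not tabu
print(accept_move((7, 3), 45, tabu, 40))   # False: tabu and not better than the best
print(accept_move((7, 3), 35, tabu, 40))   # True: aspiration overrides the tabu status
```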
4 Experimental Results and
Discussion
Here, the performance of the proposed hybrid algorithm
GATS is compared with that of TS. Since TS
outperformed GA in terms of the quality of the final solution
obtained, the comparison with TS is the one of
greatest interest.
The costs of the best solutions generated by TS and
GATS are listed in Table 1. Here "L", "P" and "D"
represent the wire length, power and delay costs
respectively, and "T" represents execution time in seconds.
Layout width was constrained not to exceed
1.2 times the average row width by fixing
the value of α in Equation 9 to 0.2. This
constraint is satisfied in all the results shown
here.

                           TS                               GATS
Circuit   L (µm)   P       D (ps)   T (s)    L (µm)   P       D (ps)   T (s)
s2081     2323     379     111      298      2162     356     110      426
s298      3579     635     127      212      3454     631     125      362
s386      6643     1595    190      524      6329     1486    191      656
s641      12620    2868    656      1505     12433    2882    664      2143
s832      18760    4311    349      981      18451    4257    351      1929
s953      27287    4230    214      1036     25967    4239    212      1846
s1196     39054    11700   332      1138     38574    11526   334      3276
s1238     39186    11594   353      1124     39065    11342   351      3667
s1488     56888    13867   662      2256     56148    14135   669      5157
s1494     54710    13533   674      2499     54914    13763   668      5643

Table 1: Comparison between TS and GATS.
The results of TS are obtained with the best settings of its
parameters, as described above. The settings of the GATS
parameters used for achieving these results are as
follows. The total number of iterations run is 5000, which
comprises 2000 TS iterations and 3000 GA generations.
The switch from TS to GA is made only once.
The population size NT used in the TS part is 4, while in
the GA part the population size is 16 chromosomes. These
parameters were tuned after a careful study
of the results obtained with different settings.
The population size for TS was reduced after
observing that a large population size increases the run time
of the TS part without providing any significant performance
gain. This step shortens the run time of GATS
very significantly. The platform used is an
IBM compatible PC with an Intel Pentium-III 600 MHz
CPU and 256 MB RAM.
It can be observed from the results that in most
cases GATS produced solutions of better
quality than those obtained from TS.
Although the execution time of GATS is higher than
that of TS, this is tolerable considering the better quality
of the solutions. The overall performance of GATS is
comparable to that of TS and much better than that of GA.
5 Conclusions
In this work, we have hybridized GA and TS for the hard
multiobjective optimization problem of VLSI standard
cell placement. An effort is made to simultaneously
optimize three objectives, namely power dissipation,
performance, and interconnect wire length. The
incorporation of fuzzy logic is suggested to integrate the cost
values of the three objectives in an aggregating cost
function. The experimental results for ISCAS-89
benchmarks indicate the improvement made by our
GATS approach in terms of the quality of the final
solution obtained.
Acknowledgment:
The authors thank King Fahd University of Petroleum &
Minerals, Dhahran, Saudi Arabia, for support under
project # COE/ITERATE/221.
References
[1] Sadiq M. Sait and Habib Youssef. VLSI Physical Design Automation: Theory and Practice. McGraw-Hill Book Company, Europe, 1995.
[2] Sadiq M. Sait and Habib Youssef. Iterative Computer Algorithms with Applications in Engineering: Solving Combinatorial Optimization Problems. IEEE Computer Society Press, California, December 1999.
[3] Sadiq M. Sait, Habib Youssef, Aiman Al-Maleh, and Mahmood R. Minhas. Iterative Heuristics for Multiobjective VLSI Standard Cell Placement. INNS-IEEE International Joint Conference on Neural Networks, IJCNN2001, July 2001.
[4] Sadiq M. Sait, Mahmood R. Minhas, and Junaid A. Khan. Performance and low power driven VLSI standard cell placement using Tabu search. In Proceedings of the IEEE Congress on Evolutionary Computation, CEC'2002, 1:372–377, 2002.
[5] Ronald R. Yager. On ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), January 1988.
