Index Terms: Force-directed method, macrocell placement, net clustering, net length minimization, soft macrocells.
Wong and Liu [14] proposed a simulated annealing approach for floorplanning. A slicing floorplan, represented as a tree, is randomly picked from the solution space and the cost is evaluated. The advantage is the possibility to avoid local minima, but the solution space does not include nonslicing floorplans and the running time may be very long for large designs. Murata et al. [15] introduced sequence pair representation of a floorplan which included the nonslicing representations. The solution space is extended and better cell packing can be found. Guo et al. [16] have proposed an ordered tree structure (referred to as O-tree) to represent nonslicing floorplans. The main advantage is the reduced solution space, thus resulting in improved time complexity of a floorplanning algorithm. B*-tree representation [17] , [18] , based on ordered binary trees, improves on O-tree representation, in terms of efficiently performing operations that manipulate the nonslicing floorplan representation.
Pillage and Rohrer [19] modeled nets as points and used a quadratic cost metric that is minimized by quadratic programming. Although this is one of the early techniques that focuses on nets, the model is too simple to completely characterize a net. Mo, Tabbara, and Brayton [20] proposed a macrocell placer that uses a force-directed method. Each net is modeled by a star, an extra point that can be considered the center of the net. Besides attractive forces, they use a filling force (proposed by Eisenmann and Johannes in [21]) which reduces the overlaps.
We propose a hierarchical register transfer level (RTL) macrocell placement approach that optimizes the net lengths by employing net clustering and a force-directed method. In our approach, the net placement (i.e., placement of terminals of nets) determines the cell placement. The approach starts at a higher level of abstraction (cluster level), goes down to the net level, and then to the terminal level. At the lowest level, terminals of a net can move freely in the quest for an optimal wire length solution for the net.
The main contributions of this work are as follows.
• Net clustering based approach, which enables prioritizing interconnects.
• The pins of a net are allowed to move freely under the influence of forces and asymptotically approach their cell boundary. The initial cell placement is derived from the net placement as defined by the positions of the terminals of nets.
• The sinks and the source of the net are treated differently when forces are defined; such a distinction enables delay optimization.
• Traditional force-directed methods use forces that are only attractive in nature and that obey Hooke's law. In addition, a repulsive force that is electrostatic in nature is used. The force is inversely proportional to the square of the distance between the points, so it is strong at near distances and weak at farther distances.
• Unlike in other force-directed methods, the cells can "jump" out of local minima by ignoring the rejection forces.
The rest of the paper is organized as follows: Section II presents an overview of the proposed approach. Section III presents the underlying models used by our approach and the force-directed terminology. Sections IV-VII describe the four phases of the proposed approach in detail. Section VIII presents and discusses the experimental results. Finally, Section IX draws conclusions.
1063-8210/02$17.00 © 2002 IEEE
II. OVERVIEW OF THE PROPOSED APPROACH
The proposed design methodology is shown in Fig. 1 . The input is an RT-level netlist that connects a set of module instances of varying sizes. Each module in the module library is precharacterized for area, dimensions, and pin positions. The proposed approach uses hard macrocells (cells having fixed aspect ratio and fixed pin positions), as well as soft macrocells.
Given an RTL netlist, a net dependency graph (a weighted undirected graph) is built. A node represents a net and an edge between two nodes exists if and only if the represented nets share one or more cells. The weight of an edge is equal to the number of shared cells. This graph is used to cluster strongly interdependent nets. Each net cluster identifies a set of nets that should be routed in the same region, so that all nets in question are optimized. Nets are clustered using clique partitioning followed by a net-cluster based floorplanning by simulated annealing.
The net clusters are used to derive an area-optimal floorplan (high level placement) used as a seed for a coarse net placement (intermediate level placement). At this level, the nets are modeled as circles. Using a force-directed procedure, a coarse net placement is derived.
Detailed macrocell placement is then achieved using a force-directed method. The key idea is that the forces on the terminals of nets determine the final macrocell placement. At the lowest level, a model derived from the star net topology is used. The terminals are free to move and are not locked on the cell boundary. Forces on the net terminals drive the net length optimization, while forces between the pins of each macrocell and between a pin and the corresponding cell ensure a feasible pin position. During the process, cells with a high net cost (the length of the nets connected to the cell) are chosen for net minimization by ignoring overlap constraints. Cell overlap is a natural consequence of the force-directed approach. Two means for cell overlap removal are employed: 1) a repulsive force (electrostatic in nature) that is strong at close distances but weak at long distances; and 2) an overlap removal heuristic. Input/output (I/O) pin placement is carried out using a bipartite minimum-weight matching algorithm. The final placement is then fed into CADENCE Silicon Ensemble to perform global and detailed routing.
III. MODELS AND FORCE-DIRECTED TERMINOLOGY

A. Models
The objects handled during the iterative process have a complex structure, and a good model is necessary to characterize their properties. For example, a net was modeled by Pillage [19] as a point, but this model cannot provide any net delay information and does not consider the net dimensions. Mo et al. [20] model the net as a star rather than a complete graph: all the terminals are connected to an extra point ("star").
The interconnect model used is the star model [Fig. 2(a)], which conforms to a star net topology [Fig. 2(b)]. The source is at the center and the sinks are on the periphery. This model is suitable for the force-directed approach as the source and sinks can attract each other in an attempt to minimize the wire length. Ideally, the source is at the center of mass of the sinks as shown in Fig. 2(a). If this is not so, then the attractive force between the current source location and the center of mass of the sinks tries to minimize the difference. Only single-source nets are considered in this paper. This model can be easily extended to multi-source nets. For example, to handle two-source nets, an elliptic model, with the two sources as the foci of an ellipse, can be used. The motivation for a distinction between source and sink in the model is that, for delay optimization, the delay estimate between the source and any sink can be used to derive appropriate forces between them. A bus is treated as a single net with higher priority in order to reduce the problem size with respect to the number of nets.
Unlike the models used by other authors, in this approach the macrocell pins are not fixed on the boundary. At the beginning of the algorithm they are placed randomly within a determined area and, during the iteration process, they try to find a position close to the cell boundary. Because the electrostatic rejection force has a circular symmetry, the cells are modeled as circles whose radius is derived from the macrocell dimensions by (1).
B. Forces
Our approach uses forces to find an equilibrium position for the objects described in Section III-A. This position corresponds to a low potential placement, i.e., short net lengths. The main type of force used is an attractive force that obeys Hooke's law (2), which states that the force $F$ applied by a spring on an object is directly proportional to the spring deformation $x$:
$$F = -kx \qquad (2)$$
where $k$ is a constant called the spring constant. The use of attractive forces in cell placement was initially proposed in [26].
Unlike in the case of standard cells, macrocells have very different sizes which can lead to overlaps. In order to reduce them, we use a repulsive force.
The basic notation used throughout this paper is shown in Table I. In the notation used to denote a force, a subscript (if present) classifies the force as a net-level force or a cell-level force, and a superscript (if present) classifies it as an attractive or a repulsive force. The force is applied on one entity (which can be a net terminal or a cell center) and is due to a second entity. For example, an attractive cell-level force on a pin is applied on the pin and is due to its cell.
Two types of forces are used during the intermediate and low level placement: attractive and repulsive. In general, an attractive force [26] on object $i$ due to object $j$ is computed as follows:
$$\vec{F}_{ij} = k\,(\vec{r}_j - \vec{r}_i) \qquad (3)$$
where $k$ is an analogous spring constant and $\vec{r}_i$ and $\vec{r}_j$ are the position vectors of objects $i$ and $j$.
In some cases, the value of the force coefficient may be increased during the iterative process in order to reduce the distance between the objects attracting each other. Initially, this coefficient has the value of unity, and the new value of the spring constant is computed in each iteration using (4). The update depends on a user-chosen constant between 0 and 1 (0.5 in our approach) and on the estimated maximal distance between the objects that are attracting one another.
In order to obtain a feasible placement, when two objects give rise to overlaps, a repulsive force must be introduced. This force must have a high value when the overlaps are big and a small value otherwise. We use a repulsive force that is electrostatic in nature. This force is felt strongly at near distances and weakly at long distances. The force is determined by the equation
$$F = k\,\frac{Q}{d^{2}} \qquad (5)$$
where $k$ is a constant (unity unless otherwise specified) and $d$ is the distance between the two objects in question.
$Q$ is a constant that depends on the size or the distance between the objects that repel one another.
1) In the case of a pin being rejected by the center of the cell, $Q$ depends on the average distance between the cell center and the cell boundary (the cell radius).
2) In the case of the rejection between two nets, $Q$ depends on the radii of the two nets.
3) In the case of the rejection between two cells, $Q$ depends on the radii of the two cells.
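The two force types described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the 2-D tuple representation, and the parameter `q` (standing in for the size-dependent constant of the repulsive force) are assumptions introduced here.

```python
import math

def attractive_force(p_i, p_j, k=1.0):
    """Hooke-type pull on object i toward object j: F = k * (r_j - r_i)."""
    return (k * (p_j[0] - p_i[0]), k * (p_j[1] - p_i[1]))

def repulsive_force(p_i, p_j, q, k=1.0, eps=1e-9):
    """Inverse-square push on object i away from object j, magnitude k*q/d^2.

    q is a placeholder for the size-dependent constant (assumption).
    Coincident points have no defined direction, so the result is (0, 0).
    """
    dx, dy = p_i[0] - p_j[0], p_i[1] - p_j[1]
    d = max(math.hypot(dx, dy), eps)  # guard against division by zero
    mag = k * q / (d * d)
    return (mag * dx / d, mag * dy / d)

# The repulsion is strong at near distances and weak at far distances.
near = repulsive_force((1.0, 0.0), (0.0, 0.0), q=4.0)
far = repulsive_force((4.0, 0.0), (0.0, 0.0), q=4.0)
```

With `q = 4`, the push at distance 1 is sixteen times the push at distance 4, while the attractive pull grows linearly with distance.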
C. Finding the Equilibrium Positions
Consider a set of $N$ objects $o_i$, $i = 1, \ldots, N$, with positions $x_i$. When a force $F_i$ is applied on each object, the equilibrium position of $o_i$ is given by the value $x_i$ for which the total force $F_i = 0$. Given the current positions of the objects, the equations $F_i = 0$, $i = 1, \ldots, N$, generate a system of $N$ nonlinear equations with $N$ unknowns. The system is solved using the Newton-Raphson method [27]. Consider the $i$th equation in the system. We want to find the value $x_i$ such that $F_i(x_i) = 0$ when all the other variables $x_j$, $j \neq i$, are considered constants. The new position of the object is given by the following formula:
$$x_i^{\mathrm{new}} = x_i - \alpha\,\frac{F_i(x_i)}{F_i'(x_i)} \qquad (6)$$
where $\alpha$ is a constant. The procedure is repeated until the difference $|x_i^{\mathrm{new}} - x_i| < \varepsilon$, where $\varepsilon$ is the maximum admissible error.
This method is applied successively (see Fig. 3) on each equation until the positions of all objects are stable, i.e., the movement of each object is smaller than the maximum admissible error $\varepsilon$. The constant $\alpha$ in (6) is usually 0.5. In our approach, $\alpha$ is a random variable uniformly distributed in [0.1, 0.5]. This helps to avoid slow convergence. Consider the situation presented in Fig. 4. Starting from the current position, the Newton-Raphson method jumps to the point where the tangent to the force curve intersects the axis; from there, the next step jumps back very close to the starting position. This oscillation can continue for a large number of iterations without approaching the solution. By choosing a different value of $\alpha$ for each iteration, the length of the jump differs, leading to a faster convergence. The experiments show a fast convergence for the net terminals.
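The randomized step described above can be illustrated with a one-dimensional sketch. It assumes a damped Newton-Raphson update (the step is scaled by a factor drawn uniformly from [0.1, 0.5] in every iteration); the function name, the numerical derivative, and the example force are assumptions made for illustration only.

```python
import random

def damped_newton(force, x0, eps=1e-6, max_iter=1000, rng=None, h=1e-6):
    """Find x with force(x) = 0 using Newton steps scaled by a random factor."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    x = x0
    for _ in range(max_iter):
        # Central-difference estimate of the derivative of the force.
        slope = (force(x + h) - force(x - h)) / (2 * h)
        if slope == 0.0:
            break  # flat force curve: no Newton step is defined
        alpha = rng.uniform(0.1, 0.5)  # new damping factor each iteration
        x_new = x - alpha * force(x) / slope
        if abs(x_new - x) < eps:  # movement below the admissible error
            return x_new
        x = x_new
    return x

# Example: a spring pulling toward d = 0 plus an inverse-square repulsion.
# The net force -d + 8/d^2 vanishes at d = 2.
equilibrium = damped_newton(lambda d: -d + 8.0 / d**2, x0=1.0)
```

Because the factor changes in every iteration, two successive jumps rarely have the same length, which is the property the text relies on to break oscillations.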
IV. HIGH-LEVEL PLACEMENT: NET CLUSTERING
This section presents the first phase of the proposed placement approach, where a high-level net placement is performed. The input is an RT-level netlist that connects a set of module instances of varying sizes. Nets are clustered using clique partitioning followed by a net-cluster based floorplanning by simulated annealing.
A. Net Clustering
Given an RTL netlist, a weighted undirected graph, $G = (V, E)$, is formed. Each vertex $v_i \in V$ denotes a net $n_i$, and an edge $(v_i, v_j) \in E$ exists if and only if the nets represented by vertices $v_i$ and $v_j$ share one or more macrocells. The weight of a vertex (net) is given by the number of net terminals. The weight of an edge $(v_i, v_j)$ is given by the number of cells that the nets $n_i$ and $n_j$ have in common. $G$ captures the dependencies between the nets and is known as the net dependency graph.
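As a concrete illustration of this construction, the following sketch builds the vertex and edge weights from a small netlist. The input format (a dictionary from net names to lists of `(cell, pin)` terminals) and all names are assumptions chosen for the example.

```python
from itertools import combinations

def net_dependency_graph(netlist):
    """Build vertex and edge weights of the net dependency graph.

    netlist: dict mapping a net name to a list of (cell, pin) terminals.
    Returns (vertex_weight, edge_weight); an edge exists only between
    nets that share at least one cell.
    """
    # Vertex weight: number of terminals of the net.
    vertex_weight = {net: len(terms) for net, terms in netlist.items()}
    cells = {net: {cell for cell, _pin in terms} for net, terms in netlist.items()}
    edge_weight = {}
    for a, b in combinations(sorted(cells), 2):
        shared = len(cells[a] & cells[b])  # number of common cells
        if shared > 0:
            edge_weight[(a, b)] = shared
    return vertex_weight, edge_weight

example = {
    "n1": [("A", "out"), ("B", "in"), ("C", "in")],
    "n2": [("B", "out"), ("C", "in2")],
    "n3": [("D", "out"), ("E", "in")],
}
vertex_weight, edge_weight = net_dependency_graph(example)
```

In this example `n1` and `n2` share cells B and C, so they are joined by an edge of weight 2, while `n3` is isolated.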
In order to find sets of nets that have to be routed in the same region, clique partitioning is carried out on the net dependency graph $G$. A clique is defined as a complete subgraph of the net dependency graph. We employed a variation of the clique partitioning heuristic proposed in [28]. The modification enables the heuristic to form maximum weighted cliques. Fig. 5 shows an example of a circuit and the net cliques obtained by using the clique partitioning heuristic. Due to the nature of $G$, a net belongs to only one cluster, but a cell may belong to more than one cluster. Thus, the cells can be classified as cells that belong to a cluster and common cells. A macrocell belongs to a cluster if all the nets connecting it are included in that cluster. A macrocell is a common cell to a set of clusters if each cluster in the set includes at least one net connecting the cell.
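The classification of cells into cluster-owned and common cells follows directly from the two definitions above. A minimal sketch, assuming the clustering is already available as a net-to-cluster mapping (the data layout and names are hypothetical):

```python
def classify_cells(net_cells, net_cluster):
    """Split cells into cluster-owned cells and common cells.

    net_cells: dict net -> set of cells connected by that net.
    net_cluster: dict net -> cluster id (each net is in exactly one cluster).
    Returns (owned, common): owned maps a cell to its single cluster,
    common maps a cell to the set of clusters that share it.
    """
    clusters_of_cell = {}
    for net, cells in net_cells.items():
        for cell in cells:
            clusters_of_cell.setdefault(cell, set()).add(net_cluster[net])
    owned = {c: next(iter(cl)) for c, cl in clusters_of_cell.items() if len(cl) == 1}
    common = {c: cl for c, cl in clusters_of_cell.items() if len(cl) > 1}
    return owned, common

owned, common = classify_cells(
    {"n1": {"A", "B"}, "n2": {"B", "C"}, "n3": {"C", "D"}},
    {"n1": 0, "n2": 0, "n3": 1},
)
```

Here cell C is connected by `n2` (cluster 0) and `n3` (cluster 1), so it is a common cell of both clusters; all other cells belong to a single cluster.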
B. Net Cluster Floorplanning
After the net clusters are generated, they have to be placed such that the chip area is minimized. Also, because there are cells connected by nets which belong to different clusters, the distance between these clusters must be minimized. These two objectives are realized by performing a cluster-level floorplanning. Given a net cluster, its area is a sum of: 1) the area of the macrocells that solely belong to the cluster; and 2) the area of the common macrocells that are shared with other clusters. The area of the shared cells is distributed equally amongst the clusters. The area is estimated by (7), which uses the area of each cell, the set of macrocells connected to each net, and the set of common macrocells between each pair of net clusters. An overhead factor in (7) accounts for the area needed for routing. The overhead factor can be adjusted depending on the number of wiring layers.
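The area estimate described above can be sketched as follows. The sketch assumes that the routing overhead enters as a multiplicative factor and that a shared cell contributes an equal fraction of its area to every cluster that shares it; the parameter name `overhead` and its example value are placeholders, not values taken from the paper.

```python
def cluster_areas(cell_area, clusters_of_cell, overhead=1.0):
    """Estimate the area of each net cluster.

    cell_area: dict cell -> area.
    clusters_of_cell: dict cell -> set of clusters the cell is connected to
        (one cluster for an owned cell, several for a common cell).
    overhead: multiplicative routing-overhead factor (assumption).
    """
    areas = {}
    for cell, clusters in clusters_of_cell.items():
        share = cell_area[cell] / len(clusters)  # equal split of shared cells
        for cluster in clusters:
            areas[cluster] = areas.get(cluster, 0.0) + share
    return {cluster: overhead * area for cluster, area in areas.items()}

areas = cluster_areas(
    {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0},
    {"A": {0}, "B": {0}, "C": {0, 1}, "D": {1}},
    overhead=1.2,
)
```

Cluster 0 receives the full area of A and B plus half of C, and cluster 1 receives half of C plus the full area of D, each scaled by the overhead factor.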
The cluster-level floorplan is found using the simulated annealing approach and a sequence-pair representation [15]. The floorplan is represented by two sequences of the design modules, $\Gamma_{+}$ and $\Gamma_{-}$, which define the relation between two modules (module1 and module2) as: module1 is above, below, right of, or left of module2. These relations can be transformed into the vertical and horizontal constraint graphs, which determine a feasible module placement. Since the number of clusters is smaller than the number of cells, the time required to find the floorplan is much smaller than the time needed to find a placement using simulated annealing at the cell level. The initial seed is derived by a random placement of the clusters. The valid moves are: exchange two modules in one sequence, exchange two modules in both sequences, and change the cluster aspect ratio. The cost function (8) is a weighted sum of the distance between clusters (which share one or more cells) and the bounding box area of the floorplan. The distance is measured between the geometrical centers of the bounding boxes of two clusters. Note that the distance is computed only if the two clusters share at least one macrocell.
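To make the sequence-pair evaluation concrete, the sketch below turns a pair of sequences into lower-left coordinates and evaluates a weighted cost. It uses the usual sequence-pair convention (a module that precedes another in both sequences lies to its left; a module that follows it in the first sequence and precedes it in the second lies below it). The weights `w_area` and `w_dist`, the Euclidean distance, and all names are assumptions, since the exact form of the cost function is not reproduced here.

```python
import math

def place_sequence_pair(gamma_pos, gamma_neg, sizes):
    """Return lower-left coordinates for each module of a sequence pair."""
    rank_pos = {m: i for i, m in enumerate(gamma_pos)}
    x, y = {}, {}
    for i, b in enumerate(gamma_neg):
        xb = yb = 0.0
        for a in gamma_neg[:i]:            # a precedes b in gamma_neg
            if rank_pos[a] < rank_pos[b]:  # a precedes b in both: a is left of b
                xb = max(xb, x[a] + sizes[a][0])
            else:                          # a follows b in gamma_pos: a is below b
                yb = max(yb, y[a] + sizes[a][1])
        x[b], y[b] = xb, yb
    return {m: (x[m], y[m]) for m in gamma_neg}

def floorplan_cost(placement, sizes, sharing_pairs, w_area=1.0, w_dist=1.0):
    """Weighted sum of bounding-box area and center distances of sharing pairs."""
    width = max(placement[m][0] + sizes[m][0] for m in placement)
    height = max(placement[m][1] + sizes[m][1] for m in placement)

    def center(m):
        return (placement[m][0] + sizes[m][0] / 2, placement[m][1] + sizes[m][1] / 2)

    dist = 0.0
    for a, b in sharing_pairs:  # only clusters that share a macrocell
        (ax, ay), (bx, by) = center(a), center(b)
        dist += math.hypot(ax - bx, ay - by)
    return w_area * width * height + w_dist * dist

sizes = {"A": (2.0, 1.0), "B": (3.0, 2.0)}
side_by_side = place_sequence_pair(["A", "B"], ["A", "B"], sizes)
stacked = place_sequence_pair(["B", "A"], ["A", "B"], sizes)
cost = floorplan_cost(side_by_side, sizes, [("A", "B")])
```

Swapping the two modules in only the first sequence changes the relation from "A left of B" to "A below B", which is how the annealing moves explore different floorplans.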
V. INTERMEDIATE LEVEL PLACEMENT: COARSE NET PLACEMENT
After the net clustering, a coarse net placement is derived based on the force-directed method. A net is modeled as a circle whose radius is the average estimated distance from source to sink, which is defined as a fraction of the net cluster size or may be given as a constraint. The force on a net in a given cluster is the sum of three terms, given by (9).
The first term is an attractive force between the net and all nets that have cells in common with it (i.e., nets joined to it by an edge in the net dependency graph). The second term is a rejection force between nets in the same cluster, used to avoid the trivial solution where all the nets overlap; its constants are chosen according to (5). The last term is an attraction force between the net and the center of its cluster.
The last term involves the number of nets that have at least one cell in common with the net and are not in the same cluster. This force keeps the net in the area allocated for the corresponding net cluster. Initially, each net is placed randomly within its cluster area and the rejection forces between nets in the same cluster are reduced by a scaling factor, so that the nets are not trapped in a high energy equilibrium position. When the equilibrium is attained, the process is repeated without reducing the rejection forces. The equilibrium positions are found using the method presented in Section III-C. The method is applied in a cluster-by-cluster manner rather than for all the nets at the same time.
VI. LOW-LEVEL PLACEMENT: NET TERMINAL AND CELL PLACEMENT
Detailed macrocell placement is achieved using a force-directed method. Fig. 6 shows a high-level view of the iterative improvement approach to derive a detailed placement from the coarse net placement. The nets are modeled as presented in Section III-A. Initially, the terminals of a net are randomly placed inside the area determined for the net in the previous phase. An iterative force-directed process is performed on the net terminals and cell centers until a stable solution is found. If the number of iterations exceeds 100, the process is stopped. I/O pin placement is carried out using a bipartite minimum-weight matching algorithm. A detailed description of the procedure is shown in Fig. 7. The values of the two interval constants in the procedure are determined empirically. In this work, the constant on line 16 has a value of 5 for the first 30 iterations and a value of 10 later on. The constant on line 23 has a fixed value of 5.
A. Net Terminal Placement
Two categories of forces are applied on net terminals. The forces in the first category try to optimize the length of the nets, while the forces in the second category try to find a feasible placement of the terminal on the cell boundary.
1) Net Length Optimization Forces on a Net Terminal (Net-Level Forces): These forces depend on the type of terminal (source or sink) on which they are applied. Consider a net in a cluster, as shown in Fig. 8, with its source terminal at the center of the net. The center of the cluster and the center of mass of the terminals are also shown. The force on the source terminal is given by (10). The first term is an attractive force that pulls the source pin toward the position derived for the net during the coarse net placement. This force is responsible for keeping the net close to its initial placement. The second term is an attractive force that pulls the source pin toward the center of mass of the sinks of the net.
Consider a sink terminal of the net in Fig. 8. The force on a sink terminal is given by (11). The first term is an attractive force between the sink and the source of the net. The spring coefficient of this attractive force is updated using (4); the estimated maximal distance in this case is the radius of the net determined in the previous phase. The second term is a small cumulative attractive force between the sink and the rest of the sinks in the net, which tries to reduce the sink dispersion.
2) Cell Feasibility Forces on a Pin (Cell-Level Forces):
Because the net terminals can move freely and are not locked on a macrocell boundary, the pins of a cell may not lie on the periphery of the cell. In order to find a feasible placement, the cell feasibility forces are introduced.
Consider the pins of a cell, as shown in Fig. 9(a). The force acting on a pin of a cell is given by (12). The first term is a force that causes the center of the cell to repel the pin. This kind of force is dominant for the pins which fall inside the cell boundary. The second term is a force that causes the center of the cell to attract the pin. This component is dominant if the pin position falls outside the cell boundary. The third term is a cumulative repulsive force exerted by the rest of the pins of the cell on the pin. This force avoids the collapse of multiple pins to the same position. Fig. 9(b) depicts the two types of forces. At closer distances, the repulsive force dominates, and at farther distances, the attractive force dominates. Consider the forces on the pins of a cell as shown in Fig. 9(a). The objective here is to force the pins to be on the cell boundary. A pin inside the boundary experiences a resultant repulsive force, while pins outside the boundary experience a resultant attractive force. A pin on the boundary experiences no force. In Fig. 9(b), the equilibrium distance is the distance between the points at which the resultant force is zero. As illustrated by the example in Fig. 9(a), a judicious mixture of attractive and repulsive forces can be used to achieve good positions of pins. Note that the cell is modeled as a circle and the equilibrium position is on the circle boundary.
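The equilibrium between the two center-to-pin forces can be worked out explicitly. The following is a sketch under stated assumptions: the attraction is taken to be linear in the distance $d$ with spring constant $k_a$ (Hooke type), the repulsion is taken to be inverse-square with combined constant $k_r Q$, and the pin-to-pin repulsion and the net-level forces are ignored. The symbols $k_a$, $k_r$, $Q$, $d_0$, and $r$ are introduced here for illustration; the constants used in the paper may differ.

```latex
% Resultant force along the line from the cell center to the pin
% (positive values push the pin away from the center):
F(d) = \frac{k_r Q}{d^{2}} - k_a d

% Equilibrium distance d_0, where the resultant force is zero:
\frac{k_r Q}{d_0^{2}} = k_a d_0
\quad\Longrightarrow\quad
d_0 = \left(\frac{k_r Q}{k_a}\right)^{1/3}

% For d < d_0 the repulsion dominates (F > 0); for d > d_0 the
% attraction dominates (F < 0), so d_0 is a stable equilibrium.
% Placing the equilibrium on a circle of radius r requires
k_r Q = k_a r^{3}
```

Under these assumptions, the requirement that the equilibrium lie on the circle boundary ties the repulsive constant to the cube of the cell radius.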
The total force on a net terminal is given by the sum of the net-level force and the cell-level force. Using the resultant force, a new position is derived for the terminals using the method presented in Section III-C.
B. Cell Placement
The terminals, which can move freely, determine the cell positions. This net-based method reduces the net dimensions, but can lead to a placement with cell overlaps. The cell overlaps must be eliminated while keeping the cells in position so that the wire lengths do not increase significantly. Rejection forces are used to achieve this objective.
The overall force on the center of a cell is given by (13), which sums over the pins of the cell and over the other cells. The first term is the attraction force between the center of the cell and the net terminals that should be placed on the boundary of the cell. It also includes the attraction force toward the I/O pins which are placed on the chip boundary.
The second term is a repulsive force between cells and is proportional to the amount of the overlap as described by (5). In this case, the size-dependent coefficient is computed from the radii of the two cells. This force has a big value when two cells overlap and a small value otherwise.
An inertial coefficient is used so that the big cells move less than the small cells when they overlap.
The new positions of the cell centers are found using a different method, which allows limited movement and improves the overall convergence rate. The center of the cell moves in the direction of the force according to (14), which uses a constant that depends on the circuit size, the sum of forces applied on the center of the cell, and a unit conversion constant used to adhere to the proper dimension units.
A halo is added around each macrocell that helps in reducing the overlaps and providing routing space. On each side, the width of the halo is proportional to the cell connectivity, i.e., the number of pins on that side, plus a constant value given by the number of nets of the design. The size of each macrocell is increased by the halo value.
C. "Jump" Cells Procedure
The main drawback of a force-directed approach is the fact that an equilibrium position is not always the minimum energy placement. In order to reduce this effect the method presented in Fig. 10 is used.
The procedure allows the cells that are connected to long nets to ignore the overlap reduction forces and move to a position where the sum of the wire lengths of these nets is minimized. The constant used on line 9 in Fig. 10 is empirically determined. These steps are applied once in every fixed number of iterations (see Fig. 7); this interval has a small value (5) at the beginning of the iterative process and a high value (10) after 30 iterations. Each cell is allowed to "jump" only a predetermined number of times, which can be specified by the user (three times in our approach). The procedure helps in avoiding local minima for all the macrocells and reduces the net lengths by up to 20%.
D. I/O Pin Placement
The optimal I/O pin placement is carried out by bipartite minimum-weight matching. The bipartite graph $G_b = (P \cup S, E_b)$ is built such that a vertex $p \in P$ represents a cell pin that needs to be connected by an I/O net and a vertex $s \in S$ represents a valid I/O pin slot. An edge $(p, s) \in E_b$ represents a possibility of mapping the I/O net corresponding to pin $p$ to slot $s$, and its weight is determined by the distance between $p$ and $s$. The I/O pin placement is not performed in every iteration. It is performed at regular intervals as determined by a constant set by the user.
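The matching problem can be illustrated on a tiny instance. The sketch below is not a production matching algorithm: it enumerates all assignments by brute force, which is only practical for a handful of pins, and it assumes Manhattan distance as the edge weight. A polynomial-time method such as the Hungarian algorithm would be used for realistic sizes. All names are hypothetical.

```python
from itertools import permutations

def assign_io_pins(pins, slots):
    """Minimum-weight assignment of cell pins to I/O slots (brute force).

    pins, slots: dict name -> (x, y); requires len(slots) >= len(pins).
    Returns (assignment, total_cost) with Manhattan distance as the weight.
    """
    pin_names = sorted(pins)

    def dist(p, s):
        return abs(p[0] - s[0]) + abs(p[1] - s[1])

    best_cost, best = float("inf"), None
    # Try every injective mapping of pins to slots and keep the cheapest.
    for chosen in permutations(sorted(slots), len(pin_names)):
        cost = sum(dist(pins[p], slots[s]) for p, s in zip(pin_names, chosen))
        if cost < best_cost:
            best_cost, best = cost, dict(zip(pin_names, chosen))
    return best, best_cost

assignment, total = assign_io_pins(
    {"p1": (0, 1), "p2": (0, 9)},
    {"s1": (0, 0), "s2": (0, 10), "s3": (10, 5)},
)
```

Each pin is matched to a distinct slot, and unused slots (here `s3`) simply remain free.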
VII. OVERLAP REMOVAL
The net and cell placement phase generates a placement with overlaps. This phase tries to reduce the overlap and generate a feasible placement.
A two-dimensional grid that covers the entire chip area is created. The bin size is determined by the minimum cell size. The density of each bin of the grid is given by
$$D(x, y) = \frac{1}{A_{\mathrm{bin}}} \sum_{c} A_{x,y}(c) \qquad (15)$$
where $A_{x,y}(c)$ is the area of the bin $(x, y)$ that is covered by the cell $c$ and $A_{\mathrm{bin}}$ is the bin area. The steps performed in this phase are shown in Fig. 11. The overlap cost for a cell is given by (16), which combines the density of the bins covered by the cell, the Manhattan lengths of the nets connected to the cell, and a last term that is the rejection force between the cell and the other cells. The costs are calculated at the current position of the cell on the grid and at all of its neighboring grid positions (Fig. 12). The position with the smallest cost is selected as the new cell position. At the end, a final overlap removal procedure must be used to eliminate the small remaining overlaps due to the finite bin dimensions.
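The bin density can be computed with a direct rectangle-overlap calculation. The sketch below assumes square bins and axis-aligned rectangular cells given by their lower-left corner, width, and height; the function name and data layout are assumptions made for the example.

```python
def bin_density(cells, bin_size, nx, ny):
    """Density of each bin: covered cell area divided by the bin area.

    cells: list of (x, y, w, h) rectangles with lower-left corner (x, y).
    Returns a ny-by-nx list of lists indexed as density[row][column].
    A value above 1.0 means that cells overlap inside that bin.
    """
    density = [[0.0] * nx for _ in range(ny)]
    bin_area = bin_size * bin_size
    for x, y, w, h in cells:
        for j in range(ny):
            for i in range(nx):
                # Overlap of the cell with bin (i, j) along each axis.
                ox = min(x + w, (i + 1) * bin_size) - max(x, i * bin_size)
                oy = min(y + h, (j + 1) * bin_size) - max(y, j * bin_size)
                if ox > 0 and oy > 0:
                    density[j][i] += ox * oy / bin_area
    return density

# Two 2x2 cells that overlap in a 1x1 region, on a 2x2 grid of 2x2 bins.
grid = bin_density([(0, 0, 2, 2), (1, 1, 2, 2)], bin_size=2, nx=2, ny=2)
```

The lower-left bin is fully covered by the first cell and partly by the second, so its density exceeds one, which flags the overlap.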
During this phase, cell orientation is also performed in order to reduce the length of the nets. All eight orientations (arising due to cell rotation and cell flip) are considered for each cell independently. Initially, the terminals are placed on the cell boundary using cell-level forces. Consider a macrocell. In order to find the optimal cell orientation, a cost function is minimized that measures, for each terminal of the cell, the distance between the terminal and a reference point that depends on the type of terminal. When the terminal is a sink of a net, the reference point is the source of the net. If the terminal is the source, then the reference point is the center of mass of all the sinks of the net. Note that the terminal position is the real position of the pin on the cell boundary for a given orientation, and the reference point is the position determined for the source or the center of mass during the iterative process for the floating terminals. While this method does not always give the optimal orientation, it has the advantage that the orientation of one cell does not depend on those of the other cells. Therefore, the aforementioned procedure is applied to all cells, one cell at a time, and the order in which the cells are considered is not important. The cell orientation procedure is applied more often in the first part of the overlap removal phase and less often in the second part.
The user can choose between hard macrocells (aspect ratio and terminal positions fixed) and soft macrocells. When soft cells are used, the initial aspect ratio of all macrocells is unity, such that all the cells have a square shape, the closest shape to the circular model considered for cells. During the iterative process of overlap removal, the cells are periodically evaluated with a different aspect ratio and the one with the smallest overlap is chosen. The candidate aspect ratios are the current aspect ratio of the cell and values that differ from it by a constant. In order to avoid an increase in overall cell overlap, only a limited number of cells is tested in each step. At the limit, only one cell may change its aspect ratio. At the end, the pin positions are determined by projecting the floating terminals on the cell boundary. This procedure can be easily extended to allow the user to select a list of soft cells.
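The per-cell orientation search can be sketched as an exhaustive evaluation of the eight orientations. This sketch assumes that pin positions are stored as offsets from the cell center and that the cost is the sum of Manhattan distances between each pin and its reference point; the exact cost function of the paper is not reproduced here, and all names are hypothetical.

```python
def orientations(offset):
    """Yield the eight images of a pin offset (dx, dy) from the cell center."""
    dx, dy = offset
    for fx in (dx, -dx):        # unflipped, then mirrored about the vertical axis
        x, y = fx, dy
        for _ in range(4):      # four successive rotations by 90 degrees
            yield (x, y)
            x, y = -y, x

def best_orientation(center, pin_offsets, targets):
    """Return (orientation index, cost) minimizing pin-to-target distance.

    pin_offsets: dict pin -> (dx, dy) offset from the cell center.
    targets: dict pin -> (x, y) reference point (the net source for a sink,
        the center of mass of the sinks for a source).
    """
    best = None
    for idx in range(8):
        cost = 0.0
        for pin, off in pin_offsets.items():
            ox, oy = list(orientations(off))[idx]  # same transform for all pins
            tx, ty = targets[pin]
            cost += abs(center[0] + ox - tx) + abs(center[1] + oy - ty)
        if best is None or cost < best[1]:
            best = (idx, cost)
    return best

choice = best_orientation((0, 0), {"a": (1, 0)}, {"a": (0, 5)})
```

Because the cost of a cell depends only on reference points that were fixed during the iterative process, each cell can be evaluated independently and in any order, as noted in the text.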
VIII. EXPERIMENTAL RESULTS
We tested the approach on two sets of benchmarks: 1) a set of five RT-level designs that are synthesized by a behavioral synthesis system [22] , [23] from high-level specifications [24] and 2) a set of MCNC benchmark designs [25] . Note that for the second set there is no information about the source and the sinks of the nets, therefore they are randomly assigned.
The description of the designs of the first set is as follows.
• Compress: Implements a look-up table based compression algorithm.
• Find: A sort-and-search chip.
• FIFO: A first-in-first-out queue implementation.
• Elliptic Wave Filter: Implementation of a fifth-order elliptic wave filter.
• Shuffle Exchange Network: Implementation of "forward pass" functionality of a high-speed reconfigurable shuffle-exchange network architecture.
• DCT4×4: Implementation of the discrete cosine transform.
Pertinent design data of the benchmark examples is shown in Table II. For each design, the table shows the number of macrocells, the number of nets, the number of net clusters, the maximum number of nets in a cluster, and the number of I/O pins. For all the designs, a bus is treated as a single net in order to reduce the problem size in terms of number of nets. The designs are implemented in 0.35-μm technology with three (first set) and four (second set) wiring layers and over-the-cell routing for metal layers higher than metal2. The results were obtained on a SUN ULTRASPARC 30 Workstation with 296-MHz processor and 128-MB RAM. The placement program is written in C++. In all cases, the global and detailed routing is performed by CADENCE Silicon Ensemble (version 5.3). Special nets such as clock, reset, power, and ground nets are not included in the above results. Since these nets are connected to a majority of the cells in the design, they need to be handled separately.
The optimization results for the longest wire length, the total wire length, and the bounding box area are shown in Table III for the first benchmark set. The results are compared with those produced by the CADENCE Silicon Ensemble (SE) version 5.3 and the results produced by the O-tree deterministic algorithm [16] . SE employs quadratic programming for placement. We implemented the O-tree algorithm presented in [16] . In order to have a routable placement, we added around the cells the same halo we used in our approach (in [16] , just an estimate of the wire lengths is given and no routing space is provided). Because the algorithm depends on the order the cells are considered, the best result of ten runs is taken. Also, cell orientation, which was not included in the original paper, is introduced. The cost function focuses on net minimization by increasing the net length weight. In all cases, the routing is performed by SE. Table IV shows the results for the second benchmark set, which are compared with those generated by CADENCE SE, the results reported by Mo, Tabbara, and Brayton [20] ( [20] may be considered a classical force-directed approach for macrocell placement), and the O-tree algorithm. In all cases, the resultant placements are routed by SE using four metal layers. It can be observed that the proposed approach improved both the total wire length and the longest wire length for all designs. The largest savings are obtained in the case of Playout design, a design with a large number of internal nets. The improvements for all benchmarks are presented in Table V . In the case of the total wire length, our approach has better results than those obtained by SE, with an average reduction of 18.8%. O-tree algorithm gives better total wire lengths for one benchmark, but the overall savings of our approach compared with O-tree are 14.2%. 
The better total wire lengths of O-tree are obtained at the cost of a longer longest wire; for the longest wire length, our approach has savings of 16.8% compared with SE and 28.5% compared with the O-tree algorithm. The O-tree algorithm does a good packing of the cells, but cannot minimize the net lengths very well. The chip area of the designs (not considering any I/O pads) is reduced, on average, by 5.8% compared with SE. In the case of FIFO, the chip area increased by 5.9%. This is a reasonable penalty paid for a reduction of 30.9% in the longest wire length and 43.3% in the total wire length. Also, for Ami33 there is an area penalty of 3.8%, while the longest wire length decreased by 36.5% and the total wire length decreased by 9.1%. Compared with the O-tree algorithm, our approach has smaller chip area for five benchmarks and bigger for the other four, with an average penalty of 5.6%.
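The percentages quoted above are relative changes against a baseline placer. The text does not spell out the formula, so the following is only a minimal sketch under two assumptions: that a reduction is computed as (baseline - ours) / baseline, and that an "average" is the arithmetic mean of the per-design percentages. The function names and the numeric values are hypothetical and are not taken from Tables III-V.

```python
def percent_reduction(baseline, ours):
    """Relative reduction of `ours` against `baseline`, in percent.

    A positive value is a saving; a negative value is a penalty
    (for example, an area increase). Assumed formula, not the paper's.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - ours) / baseline


def mean_reduction(pairs):
    """Arithmetic mean of per-design reductions (assumed averaging rule)."""
    reductions = [percent_reduction(b, o) for b, o in pairs]
    return sum(reductions) / len(reductions)


# Hypothetical (baseline, ours) values for two designs, for illustration only.
wire_saving = percent_reduction(200.0, 150.0)   # wire length shrinks
area_penalty = percent_reduction(100.0, 105.0)  # area grows, so negative
average = mean_reduction([(200.0, 150.0), (100.0, 105.0)])
```

Under this convention, an area penalty such as the one reported for FIFO appears as a negative reduction, which makes the trade-off against wire length savings directly comparable.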
Table VI compares the longest wire length and the total wire length for the proposed approach without cell resynthesis and with cell resynthesis. When the cells are resynthesized (different aspect ratio and terminal positions), the results presented in Tables III and IV are additionally improved by an average of 19.8% for the total wire length and 16.6% for the longest wire length. Table VII shows the execution times (user CPU time) for both benchmark sets and all the methods. Our approach has a longer running time than SE and the procedure of Mo et al. for bigger circuits. This is mainly due to the simulated annealing procedure used in the first phase. For example, in the case of the DCT4×4 circuit, the simulated annealing execution time was 2 h and 27 min, while the execution time of the net coarse placement, net and terminal placement, and overlap removal was less than half an hour. The execution time for the O-tree deterministic algorithm is high because of the increased number of combinations to be analyzed when cell orientation is introduced.
A. Net Prioritization
The user may identify and prioritize nets for optimization. Nets on the critical path of the design, in particular, need such prioritization. For each prioritized net, the coefficients of the forces between the source and the sinks of the net are scaled up to reflect the net priority. To demonstrate net prioritization, the following experiment was performed: for the Compress design, four nets (n1-n4) whose lengths are above the average length (when no nets are prioritized) were chosen. Then, the priority of some of the nets was increased and the results were compared. The second column of Table VIII shows the wire lengths without any prioritization. The third column shows the wire lengths when n1 and n2 are prioritized. We observe that n1 is reduced by 29.7% and n2 by 89.3%. The fourth column shows the wire lengths when n3 and n4 are prioritized. We observe that n3 is reduced by 39.0% and n4 by 72.9%. When some nets are prioritized, the length of the others may increase, e.g., net n1 when n3 and n4 have higher priority.
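The scaling of force coefficients by net priority can be illustrated with a short sketch. This is not the placer's actual force formulation: it assumes a simple linear (spring-like) attractive force between a source and a sink terminal, and the names `Terminal`, `attractive_force`, `base_coeff`, and `priority` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Terminal:
    x: float
    y: float


def attractive_force(source, sink, base_coeff=1.0, priority=1.0):
    """Force acting on `sink`, pulling it toward `source`.

    Assumed model: force = coefficient * displacement, where the
    coefficient is the base value scaled up by the net priority.
    """
    k = base_coeff * priority
    return (k * (source.x - sink.x), k * (source.y - sink.y))


src = Terminal(0.0, 0.0)
snk = Terminal(4.0, 3.0)

# Same geometry, with and without a raised priority on the net.
normal = attractive_force(src, snk)
boosted = attractive_force(src, snk, priority=2.0)
```

With a priority of 2, the pull on the sink doubles for the same displacement, so the terminals of a prioritized net are drawn together more strongly than those of other nets. This is consistent with the observation that non-prioritized nets may become longer.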
IX. CONCLUSION
In this work, a net-clustering-based macrocell placement approach has been proposed. The novelty of the approach lies in the new net model and in the use of cluster information to derive a rough floorplan. The floorplan further helps in placing the terminals of nets so that net lengths are optimized. The cell placement is derived from this "net placement" information.
The significant wire length reductions across the benchmark set (with reasonable area penalty) may be attributed to the following factors.
1) Clustering of interdependent nets and floorplanning, which give a very good starting point for the force-directed net terminal placement and subsequent macrocell placement.
2) The pin-level force formulation, which is very effective in optimizing the net lengths.
3) The possibility for cells to "jump" out of local minima.
4) The bipartite minimum weight matching, which is effective in reducing the lengths of nets involving I/O pins.
5) User prioritization of nets, which can help optimize the nets on the critical path of the design and hence the overall chip delay characteristics.
