Abstract
Congestion minimization is the least understood of the placement objectives, yet it models routability most accurately. In this paper, a new incremental placement algorithm, C-ECOP, is presented for standard cell layout to reduce routing congestion. Congestion estimation is based on a new routing model and a more accurate cost function. An integer linear programming (ILP) problem is formulated to determine the cell flow direction and avoid conflicts between adjacent congestion areas. Experimental results show that the algorithm can considerably reduce routing congestion and preserve the performance of the initial placement with high speed.
Introduction
As VLSI technology advances, system complexity continues to increase and physical design becomes more and more difficult. With the advent of over-cell routing, the goal of every place and route methodology has been to utilize all available active area to prevent spilling of routes into channels. It is the overflow of routes that accounts for an increase in area. Further heuristic methods should be applied in placement to manage local congestion and improve the subsequent routability.
Traditional placement objectives involve reducing net-cut costs or minimizing wirelength [1-2]. Because of their constructive nature, min-cut based strategies minimize the number of net crossings but fail to distribute them uniformly [3]. For the same reason, traditional placement schemes that are based mainly on wirelength minimization cannot adequately account for congestion. Reducing net-cut and minimizing wirelength only help reduce the routing demand globally; they do not prevent local routing congestion. How to estimate and reduce congestion in placement is not well studied. Congestion-driven placement based on multi-partitioning was proposed in [4]. It uses the actual congestion cost calculated from precomputed Steiner trees to minimize the congestion of the chip. However, the number of partitions is limited due to the excessive computational load. Moreover, reducing congestion while maintaining the metrics of the initial placement is very difficult.
In this paper, an incremental placement algorithm, C-ECOP, is proposed for improving local congestion. It first estimates the routing congestion through a new routing model. Then it formulates an integer linear programming (ILP) problem to move cells and reduce congestion. Finally, it adjusts the positions of cells to resolve overlap. The rest of this paper is organized as follows. Section 2 describes the routing estimation and congestion measurement used in this work. The algorithm C-ECOP is presented in Section 3. Section 4 gives the experimental results to show the effectiveness of our algorithm. Section 5 is the conclusion.
Congestion Estimation in Placement

Congestion Cost
The congestion cost is defined based on the global bin concept. We partition a given chip into several rectangular regions, each of which is called a global bin. The boundaries of global bins are called global bin edges, as shown in Fig. 1. The congestion is related to the number of crossings between routed nets and global bin edges. Estimating the congestion of a global bin can then be replaced by computing the total overflow of the global edges as in (2), with the cost function defined as in (3). Most algorithms for reducing congestion estimate congestion in this way.
This method of estimating congestion is ill-behaved. It can only detect a congestion area that is crossed by a global edge. In fact, a congestion area can be anywhere on the chip and may lie entirely inside a global bin. Table 1 gives the result of the congestion estimation on a circuit CNTIOO with the method above. The chip is partitioned into 4x4, 5x5, 6x6 and 7x7 global bins. The total number of congested areas (TNC.) and the total demand (TD.) change with the partition. The max demand (MD.) even increases when the partition is refined from 4x4 to 5x5, although one would expect the routing demand crossing an edge to decrease because the global edges become shorter.
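The edge-based estimate described above can be illustrated with a minimal sketch. Equations (2) and (3) are not reproduced here, so the form below (overflow as demand beyond capacity, clipped at zero and summed over the edges of a bin) is an assumption, and the function names and numbers are hypothetical.

```python
def edge_overflow(demand, capacity):
    """Overflow of one global bin edge: crossings beyond capacity, never negative."""
    return max(0, demand - capacity)


def bin_overflow(edge_demands, edge_capacities):
    """Total overflow over the global edges of a single bin (assumed form)."""
    return sum(edge_overflow(d, c) for d, c in zip(edge_demands, edge_capacities))


# Hypothetical bin: net crossings on its left, right, bottom and top edges,
# with an assumed capacity of 10 tracks per edge.
demands = [12, 7, 10, 15]
capacities = [10, 10, 10, 10]
total = bin_overflow(demands, capacities)  # 2 + 0 + 0 + 5 = 7
```

Only the left and top edges contribute here; an edge loaded exactly to capacity adds nothing.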
*H: Horizontal; V: Vertical. **MO.: Max Overflow
From the table we can see that the total demand increases as the partition becomes finer. This is because nets that lie inside a larger global bin may cross a global edge when the chip is partitioned into smaller bins. The vertical and horizontal congestion estimation of a global bin in our algorithm is therefore defined as follows:
where $r_v$ and $r_h$ are the vertical and horizontal routing demand inside bin $b_{ij}$. They can be obtained by the routing model described in 2.2. The two $\omega$ coefficients are the weights.
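The partition dependence of the edge-based estimate, observed from Table 1, can be seen on a toy case. The sketch below counts how many vertical global-bin boundaries fall strictly between the two pins of a net; the chip width, pin coordinates, and function name are hypothetical and are not taken from the experiments.

```python
def vertical_edge_crossings(x1, x2, chip_width, n):
    """Count vertical bin boundaries strictly between x1 and x2 for n columns."""
    lo, hi = sorted((x1, x2))
    # Interior boundaries of a uniform n-column partition.
    lines = [chip_width * k / n for k in range(1, n)]
    return sum(1 for line in lines if lo < line < hi)


# A two-pin net spanning x = 30 to x = 45 on a chip of width 100.
crossings_4 = vertical_edge_crossings(30, 45, 100, 4)  # boundaries at 25, 50, 75
crossings_5 = vertical_edge_crossings(30, 45, 100, 5)  # boundaries at 20, 40, 60, 80
```

With four columns the net lies inside a single bin and is invisible to the edge-based estimate; with five columns it crosses the boundary at 40, so the measured demand changes although the placement is the same.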
Routing Estimation Model
When reducing congestion, we need to estimate the congestion of a placement incrementally. A global router is needed here. Obviously, the more accurate the router, the more accurate the estimation at the placement stage. Routing with a real global router provides an accurate congestion estimation, but it is very time consuming and cannot be applied in an incremental placement algorithm. Routing with a simple routing model such as the bounding box model is very fast, but the bounding box model may differ greatly from the behavior of the detailed router and therefore give a poor estimation. Wang et al. evaluate the bounding box model in [8] and show that it does not correlate with the real router and cannot be applied. It is therefore critical that the estimation used in this application is accurate while remaining computationally efficient.
A new star model proposed in [6] is used here. It first computes and adjusts the coordinates of the net center. The possible vertical and horizontal route paths connecting each cell to the center lie on the edges of the rectangle whose two opposite vertices are located at the cell and the center.
The route probability on each path is 0.5. All route probabilities on the same path are then added up, and the sum cannot exceed 1, as shown in Fig. 2. Experimental results show that the new star model is very close to real routing in practice [6]. The probabilities of crossing the top and bottom edges, denoted $p_t(k)$ and $p_b(k)$, are computed in the symmetric form. The vertical and horizontal routing demand inside a global bin can be obtained easily with this approach. The running time of congestion estimation on some circuits through this approach is listed in Table 2. It shows that this approach is fast enough to be used in our algorithm.
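The star model described above can be sketched for a single net as follows. This is an illustrative reading, not the implementation of [6]: the net center is taken as the plain centroid of the pins (the adjustment step is omitted), the cap of 1 is applied per bin and per direction, and the grid parameters and function names are hypothetical.

```python
from collections import defaultdict


def bin_index(coord, size, count):
    """Index of the bin containing coord, clamped to the grid."""
    return min(max(int(coord // size), 0), count - 1)


def star_net_demand(pins, bin_w, bin_h, nx, ny):
    """Return (horizontal, vertical) demand per bin {(i, j): probability}."""
    # Assumption: the net center is the centroid of the pins.
    cx = sum(x for x, _ in pins) / len(pins)
    cy = sum(y for _, y in pins) / len(pins)
    horiz, vert = defaultdict(float), defaultdict(float)
    for px, py in pins:
        if px != cx:
            # Horizontal sides of the pin/center rectangle, at y = py and y = cy.
            lo, hi = sorted((bin_index(px, bin_w, nx), bin_index(cx, bin_w, nx)))
            for y in (py, cy):
                j = bin_index(y, bin_h, ny)
                for i in range(lo, hi + 1):
                    horiz[(i, j)] += 0.5
        if py != cy:
            # Vertical sides of the rectangle, at x = px and x = cx.
            lo, hi = sorted((bin_index(py, bin_h, ny), bin_index(cy, bin_h, ny)))
            for x in (px, cx):
                i = bin_index(x, bin_w, nx)
                for j in range(lo, hi + 1):
                    vert[(i, j)] += 0.5
    # Probabilities accumulated on the same path cannot exceed 1.
    cap = lambda d: {b: min(1.0, p) for b, p in d.items()}
    return cap(horiz), cap(vert)


# Three collinear pins on a 3x3 grid of 10x10 bins.
h, v = star_net_demand([(5, 15), (25, 15), (15, 15)], 10, 10, 3, 3)
# h == {(0, 1): 1.0, (1, 1): 1.0, (2, 1): 1.0}, v == {}
```

In the collinear example both paths of each pin coincide, so the middle bin accumulates a raw value of 2.0 that the cap brings back to 1.0.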
Congestion Reducing through ILP
Generally, minimizing congestion and minimizing wirelength conflict with each other in local regions. Reducing congestion means sacrificing wirelength. An incremental placement algorithm should achieve a trade-off between reducing congestion and preserving the metrics, e.g. wirelength, of the initial placement. The design flow is shown in Fig. 4. The flow tendency of each cell is computed based on the force exerted by its nets. Cells can then move according to this tendency to reduce congestion. An integer linear programming problem is formulated to deal with the conflicts between multiple congested regions. After that, a post process is carried out to place the moved cells and resolve overlap. The iteration of the ECO flow stops when the congestion result is acceptable.
Fig. 4. C-ECOP algorithm
Cell Flow Tendency Computation
Based on routing estimation, we can identify the congested bins on the chip. A global bin is considered congested if its congestion cost defined by (4) is greater than a certain threshold value or if at least one of its global edges is overflowed. Cells in congested bins should move out to decrease the routing demand and free more routing resources. The key problem is which cells can be moved out so that the perturbation to the initial placement is minimized. We compute the flow tendency of each cell to deal with this.
The horizontal flow tendency of a cell C1 in bin(i, j) is computed as follows:
Moreover, the flow tendency of cells can be regarded as the net-cuts crossing the global edges. Generally speaking, a reduction in net-cuts is consistent with a reduction in wirelength, so moving cells according to the flow tendency tends to decrease wirelength.
A coefficient equal to 1 means that the routing demand in bin(i_k, j_k) is reduced after moving C_k out. A coefficient equal to -1 means that the routing demand in the destination bin increases by the same amount. The perturbation to the route probability in other bins is ignored. Experimental results show that this simplification only slightly affects the final results but considerably reduces the running time.
For any given bin, the number of related cells is limited. In fact, only the movable cells inside the bin and in its four adjacent bins may be included in the inequalities. The ILP problem can therefore be solved optimally. The solution determines the total number and the destination bins of the moved cells. A post process is then carried out to place the cells and resolve overlap inside each bin.
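The congested-bin test described above can be sketched as a small predicate. Equation (4) is not reproduced here, so the weighted sum of the vertical and horizontal demand used below is an assumed form, and the function name, default weights, and threshold are hypothetical.

```python
def is_congested(r_v, r_h, edge_demands, edge_capacities, threshold,
                 w_v=1.0, w_h=1.0):
    """Flag a global bin as congested (illustrative sketch)."""
    # Assumed form of the bin cost: weighted vertical plus horizontal demand.
    cost = w_v * r_v + w_h * r_h
    # The bin is also congested if any of its global edges is overflowed.
    overflowed = any(d > c for d, c in zip(edge_demands, edge_capacities))
    return cost > threshold or overflowed


caps = [10, 10, 10, 10]
quiet = is_congested(3.0, 4.0, [5, 5, 5, 5], caps, threshold=10)       # False
dense_inside = is_congested(6.0, 5.0, [5, 5, 5, 5], caps, threshold=10)  # True
edge_hot = is_congested(1.0, 1.0, [12, 5, 5, 5], caps, threshold=10)     # True
```

The second case is the one an edge-only estimate would miss: no edge is overflowed, but the demand inside the bin exceeds the threshold.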
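The role of the +1/-1 coefficients can be illustrated with a toy selection problem. This is not the formulation used by C-ECOP: the objective (fewest moved cells), the capacity constraint per bin, the exhaustive enumeration in place of an ILP solver, and all names and numbers are assumptions made for illustration.

```python
from itertools import product


def solve_moves(demand, capacity, cells):
    """Pick which cells to move so that no bin exceeds its capacity.

    cells is a list of (source_bin, destination_bin, amount) tuples.
    Returns a tuple of 0/1 decisions with the fewest moves, or None.
    """
    best = None
    for choice in product((0, 1), repeat=len(cells)):
        load = dict(demand)
        for moved, (src, dst, amount) in zip(choice, cells):
            if moved:
                load[src] -= amount  # demand leaves the source bin
                load[dst] += amount  # the same amount enters the destination
        feasible = all(load[b] <= capacity[b] for b in load)
        if feasible and (best is None or sum(choice) < sum(best)):
            best = choice
    return best


# Hypothetical neighbourhood: bin A is over capacity, B and C are adjacent.
demand = {"A": 12, "B": 8, "C": 9}
capacity = {"A": 10, "B": 10, "C": 10}
cells = [("A", "B", 2), ("A", "C", 2), ("A", "B", 3)]
plan = solve_moves(demand, capacity, cells)  # (1, 0, 0)
```

Moving the second or third cell alone would relieve A but overload C or B, which is the kind of conflict between adjacent regions that the constraints are meant to exclude. Enumeration is exponential in the number of cells; it is workable here only because few cells relate to any one bin.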
Post Processing
After the congestion reducing process, some cells are redistributed among different global bins. These cells should be placed without overlap. An efficient algorithm is needed to adjust the positions of cells with minimal perturbation to the initial placement. We use the W-ECOP algorithm [5] to accomplish this. A cell that needs to be placed is inserted into the adjacent row, and an optimal scheme to rearrange the cells in the row is found. If the free space in the row cannot accept the cell, a shifting path search is carried out to keep cells within their neighboring area so that the performance of the circuit is preserved.
Experimental Results
The algorithm has been implemented in C. All the experiments were done on a Sun E450 workstation with 4GB memory. To show the effectiveness and utility of our algorithm, part of the experimental circuits are chosen from the IBM-PLACE benchmark [13], placed with a wirelength-driven placer, Dragon [14]. The other set of circuits is from industry (Ultima Company). We compare the global routing results (overflow and wirelength) for the design before and after incremental placement.
Table 3 shows the results on the circuits from industry. As one can see, the congestion reducing approach considerably reduces the total overflow, by 50 percent on average. The total wirelength increases by less than 0.1 percent compared to the initial placement. Some circuits even show a decrease in wirelength after placement. This indicates that wirelength is not sacrificed much in reducing congestion, which is due to the wirelength optimization approach in the W-ECOP algorithm.
Table 4 shows the results on the IBM-PLACE benchmarks placed with Dragon. Dragon performs wirelength and routability optimization by combining a powerful hypergraph partitioning package with the simulated annealing technique [14].
From the results we can see that the total overflow can be reduced further through our method, and that the optimization in wirelength is preserved. The algorithm is much faster on the benchmarks than on the industry circuits. The short running time shows that our method can scale well to large circuits.
Conclusion
A new incremental placement algorithm for congestion alleviation is presented in this paper. The proposed algorithm automatically evaluates the routing congestion of a detailed placement with a fast and accurate routing estimation model. Congestion areas on the chip are relieved through cell moving. An integer linear programming (ILP) problem is formulated to resolve conflicts among multiple congested areas and avoid creating unexpected congested areas. After that, an efficient algorithm for resolving overlap is used to ensure that the initial placement is perturbed as little as possible. Experimental results demonstrate the effectiveness of the new approach.
Table 3. Experimental results on industry circuits
*V/H Cap: The vertical/horizontal capacity of each grid. **BIP/AIP: overflow before and after incremental placement
