Abstract-In the early stage of floorplan design, many modules have large flexibilities in shape (soft modules). Handling soft modules in general nonslicing floorplans is a complicated problem. Many previous works have attempted to tackle this problem using heuristics or numerical methods, but none of them can solve it both optimally and efficiently. In this paper, we show how this problem can be solved optimally by geometric programming using the Lagrangian relaxation technique. The resulting Lagrangian relaxation subproblem is so simple that the optimal size of each module can be computed in linear time. We implemented this method in a simulated annealing framework based on the sequence pair representation. The geometric program is invoked in every iteration of the annealing process to compute the optimal size of each module to give the best packing. Our method is much faster (at least 15 times faster for data sets with more than 50 modules) than the most recent previous work by Murata and Kuh (1998). For a benchmark data set with 49 modules, we take 3.7 h in total for the whole annealing process using a 600-MHz Pentium III processor, while the convex programming approach described by Murata and Kuh needs seven days using a 250-MHz DEC Alpha. Our technique is also applicable to other floorplanning algorithms that use constraint graphs to find module positions in the final packing.
I. INTRODUCTION
FLOORPLANNING has become increasingly important in the physical design of very large scale integrated circuits due to advances in deep submicrometer technology. Many floorplanning algorithms have been proposed in recent years and many of them make use of constraint graphs to compute module positions in the final packing. Unfortunately, it is not known how shape flexibilities of soft modules can be handled efficiently using constraint graphs. This is an important problem since soft modules are common in the floorplanning stage, when many designs are not yet done in detail. Some previous works [4], [8], [9], [12] have attempted to tackle this problem but none of them succeeded in obtaining the optimal solution efficiently.
There are two types of floorplans: slicing and nonslicing. A slicing floorplan is a floorplan that can be obtained by recursively cutting rectangles horizontally or vertically. A nonslicing floorplan is one that is not restricted to be slicing. Fig. 1 shows an example of each. Nonslicing floorplans are a more general representation that can describe all kinds of packings. However, slicing floorplans have an important advantage over nonslicing ones: there are efficient algorithms to handle soft modules in slicing floorplans optimally. A well-known approach by Wong et al. [13] uses the shape curve representation. A shape curve can describe all possible shapes of a module, and these shape curves can be added up horizontally or vertically to produce new shape curves for supermodules containing more than one basic module. Moh et al. [5] and Wang et al. [11] use numerical optimization methods. Moh et al. [5] formulate the problem as a geometric program and find its global minimum using standard convex optimization techniques. However, all these methods are limited to placement topologies of rectangular dissection only, i.e., slicing. The problem of handling soft modules becomes more complicated in nonslicing floorplans. Both Pan et al. [9] and Wang et al. [12] try to generalize Stockmeyer's algorithm [10] to nonslicing floorplans. Kang et al. [4] extend the bounded sliceline grid (BSG) method [8] to handle soft modules using heuristics. These methods are either suboptimal or applicable to some specific nonslicing structures only. Murata et al. [7] follow the framework of [5] and try to reduce the number of variables and functions when formulating the problem so as to improve the efficiency. However, the execution time of their method to find an exact solution is still very long. It takes seven days to pack a benchmark data set with 49 modules.
In this paper, we will present an efficient method to handle shape flexibilities of soft modules in general nonslicing floorplans optimally. The problem is formulated as a geometric program, but we use the Lagrangian relaxation technique [6], a general technique for constrained nonlinear optimization, to solve the problem efficiently. This technique transforms the problem into a sequence of subproblems called Lagrangian relaxation subproblems. Each subproblem can be significantly simplified by the Kuhn-Tucker conditions. The resulting subproblem is so simple that the size of each module can be computed in linear time. This complexity can be further reduced to a constant on average by using a different representation for nonslicing floorplans that supports planar constraint graphs.
We implemented this method in a simulated annealing framework using the sequence pair representation. The objective of the annealing process is to minimize the total packing area and interconnect cost. To evaluate the area in each iteration of the annealing process, we use the geometric program to compute the optimal packing area taking into account the shape flexibilities of all the soft modules simultaneously. Our floorplanner can pack much faster than the most recent previous work [7]. For the benchmark data set with 49 modules, we take only 3.7 h in total for the whole annealing process using a 600-MHz Pentium III processor, while the convex programming approach in [7] needs seven days using a 250-MHz DEC Alpha. Our method is also applicable to other floorplanning algorithms that make use of constraint graphs to compute module positions in the final packing.
The rest of this paper is organized as follows. We will formulate the problem in Section II. Section III briefly describes the sequence pair representation. We will formulate the geometric program in Section IV. In Section V, we will explain the Lagrangian relaxation technique in detail. Experimental results will be shown in Section VI and some remarks will be given in the last section.
II. PROBLEM FORMULATION
We consider two kinds of modules: hard modules and soft modules. A hard module is a module whose dimensions are fixed. A soft module is one whose area is fixed, but whose dimensions can be changed as long as its aspect ratio, i.e., the ratio of height to width, is within a given range between a minimum and a maximum value. In the case of a hard module, the maximum and minimum aspect ratios are the same.
A packing of a set of modules is a nonoverlapping placement of the modules. A feasible packing is a packing such that the widths and heights of the modules are consistent with their aspect ratio constraints and area constraints. We measure the area of a packing as the area of the smallest rectangle enclosing all the modules.
We are also given the netlist information: $net_1, net_2, \ldots, net_m$ and the relative positions of the input-output (I/O) pins $p_1, p_2, \ldots, p_q$ along the boundary of the chip. For each net $net_i$, where $1 \le i \le m$, we are given its weight, the I/O pin, and the set of modules to which it is connected. Our objective is to obtain a feasible packing minimizing the total packing area and interconnect cost. We use the simulated annealing technique (based on the sequence pair representation) to search the solution space. For each intermediate solution in the annealing process, we evaluate the packing by computing a linear function of its area and interconnect cost. However, there can be many realizations of the same packing due to the shape flexibilities of the soft modules. The most important contribution of our paper is that we devised an efficient method to compute the shapes of the soft modules to give the optimal packing. The problem is formulated as follows.
Problem Floorplan Area Minimization (FP/AM):
Given a set of hard and soft modules with area and aspect ratio constraints, and a specific packing topology of these modules described by a pair of vertical and horizontal constraint graphs, find the optimal shape of each module so as to produce the smallest possible feasible packing taking into consideration the shape flexibilities of all the soft modules simultaneously.
III. SEQUENCE PAIR AND CONSTRAINT GRAPH
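We use the sequence pair to represent a general floorplan in the annealing process. A sequence pair of a set of modules is a pair of permutations of the module names. For example, $s = (abcd, bacd)$ is a sequence pair of the module set $\{a, b, c, d\}$. We can derive the relative positions between the modules from a sequence pair $s$ by the following rules.
1) H-constraint: if $s = (\cdots a \cdots b \cdots, \cdots a \cdots b \cdots)$, then module $b$ is on the right-hand side of module $a$.
2) V-constraint: if $s = (\cdots a \cdots b \cdots, \cdots b \cdots a \cdots)$, then module $b$ is below module $a$.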
We can use constraint graphs to represent these horizontal and vertical placement relationships. A horizontal (vertical) constraint graph $G_h$ ($G_v$) for a set of $n$ modules is a graph of $n$ vertices, with the vertices representing the modules and the edges representing the horizontal (vertical) placement constraints. For example, if module $b$ is on the right-hand side of module $a$, we add an edge from $a$ to $b$ in the horizontal constraint graph with a weight equal to the width of $a$. The reason is that if $b$ is on the right-hand side of $a$, its lower left corner (we always refer to the position of a module by the coordinates of its lower left corner) should be at a distance of at least the width of $a$ from the lower left corner of $a$. Similarly, if module $b$ is above module $a$, we add an edge from $a$ to $b$ in the vertical constraint graph with a weight equal to the height of $a$.
We can build these graphs directly from a sequence-pair representation. For example, in the sequence pair $s = (abcd, bacd)$, modules $a$ and $b$ appear in opposite orders in the two sequences $(\cdots a \cdots b \cdots, \cdots b \cdots a \cdots)$, so $a$ is above $b$ and there is an edge from $b$ to $a$ labeled $h_b$ in the vertical constraint graph. For modules $a$ and $c$, their orders are the same in both sequences $(\cdots a \cdots c \cdots, \cdots a \cdots c \cdots)$, so $c$ is on the right-hand side of $a$ and there is an edge from $a$ to $c$ labeled $w_a$ in the horizontal constraint graph. In this way, we can construct the horizontal and vertical constraint graphs by looking at the orders of every pair of modules in the two sequences.
In the annealing process, we can modify a sequence pair by two kinds of moves:
1) M1: exchange two modules in the first sequence only;
2) M2: exchange two modules in both sequences.
These two moves are sufficient to transform any sequence pair to any other arbitrary sequence pair in one or more steps.
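The pairwise rules and the two moves can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `constraint_graphs`, `move_m1`, and `move_m2` are hypothetical, and edges are stored without weights (an edge from a to b stands for a weight equal to the width of a in the horizontal graph and the height of a in the vertical graph).

```python
from itertools import combinations


def constraint_graphs(seq1, seq2):
    """Return (h_edges, v_edges) as sets of (from, to) pairs of module names."""
    pos2 = {m: k for k, m in enumerate(seq2)}
    h_edges, v_edges = set(), set()
    # combinations() yields pairs (a, b) with a before b in the first sequence.
    for a, b in combinations(seq1, 2):
        if pos2[a] < pos2[b]:
            # Same order in both sequences: b is on the right-hand side of a.
            h_edges.add((a, b))
        else:
            # Opposite orders: a is above b, so the edge goes from b to a.
            v_edges.add((b, a))
    return h_edges, v_edges


def _swap(seq, a, b):
    """Return a copy of seq (as a list) with modules a and b exchanged."""
    table = {a: b, b: a}
    return [table.get(m, m) for m in seq]


def move_m1(seq1, seq2, a, b):
    """M1: exchange two modules in the first sequence only."""
    return _swap(seq1, a, b), list(seq2)


def move_m2(seq1, seq2, a, b):
    """M2: exchange two modules in both sequences."""
    return _swap(seq1, a, b), _swap(seq2, a, b)


h_edges, v_edges = constraint_graphs("abcd", "bacd")
```

For the example sequence pair used in this section, the only vertical edge is the one from b to a, and every other pair of modules is related horizontally.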
IV. FORMULATION OF THE GEOMETRIC PROGRAM
We are given $n$ modules $M_1, M_2, \ldots, M_n$ of areas $A_1, A_2, \ldots, A_n$. For each module $M_i$, where $1 \le i \le n$, its minimum and maximum aspect ratios are $r_{i,\min}$ and $r_{i,\max}$, respectively. Since the aspect ratio is the ratio of height to width and the area is fixed, the minimum and maximum widths of $M_i$ are, thus, $L_i = \sqrt{A_i / r_{i,\max}}$ and $U_i = \sqrt{A_i / r_{i,\min}}$, respectively. We are also given the topology of the packing described by a pair of horizontal and vertical constraint graphs. Let $x_i$ denote the smallest $x$ position of the lower left corner of module $i$ satisfying all the horizontal constraints in the horizontal constraint graph $G_h$. Similarly, $y_i$ denotes the smallest $y$ position of the lower left corner of module $i$ satisfying all the vertical constraints in the vertical constraint graph $G_v$. Then, for each edge $e(i, j)$ from module $i$ to module $j$ in $G_h$, we have the following constraint:
$$x_i + w_i \le x_j$$
where $w_i$ is the width of module $i$. Similarly, for each edge $e(i, j)$ from module $i$ to module $j$ in $G_v$, we have the following constraint:
$$y_i + \frac{A_i}{w_i} \le y_j$$
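To make the width bounds and the edge constraints concrete, the following Python sketch computes the bounds of a soft module and the smallest positions satisfying all constraints of one graph (a longest-path computation on an acyclic graph). It is an illustration only: the helper names `width_bounds`, `smallest_positions`, and `packing_area` are hypothetical, and the code assumes that the aspect ratio is height divided by width, so that a module of area A and aspect ratio r has width sqrt(A / r) and height A / w.

```python
import math


def width_bounds(area, r_min, r_max):
    """Minimum and maximum width of a module with fixed area.

    Assumes aspect ratio = height / width, hence width = sqrt(area / ratio).
    """
    return math.sqrt(area / r_max), math.sqrt(area / r_min)


def smallest_positions(modules, edges, size):
    """Smallest coordinate of each module satisfying pos[i] + size[i] <= pos[j]
    for every edge (i, j); sources are placed at 0. The graph must be acyclic."""
    preds = {m: [] for m in modules}
    for i, j in edges:
        preds[j].append(i)
    pos = {}

    def visit(m):
        if m not in pos:
            pos[m] = max((visit(p) + size[p] for p in preds[m]), default=0.0)
        return pos[m]

    for m in modules:
        visit(m)
    return pos


def packing_area(modules, h_edges, v_edges, width, area):
    """Area of the smallest rectangle enclosing the packing for given widths."""
    height = {m: area[m] / width[m] for m in modules}
    x = smallest_positions(modules, h_edges, width)
    y = smallest_positions(modules, v_edges, height)
    chip_width = max(x[m] + width[m] for m in modules)
    chip_height = max(y[m] + height[m] for m in modules)
    return chip_width * chip_height
```

Evaluating `packing_area` for fixed widths is the quantity that the sizing procedure tries to minimize by choosing each width within its bounds.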
In the horizontal constraint graph $G_h$, we denote the sets of sources and sinks by $s_h$ and $t_h$, respectively, where a source is a vertex without incoming edges and a sink is a vertex without outgoing edges. Similarly, we denote the sets of sources and sinks in $G_v$ by $s_v$ and $t_v$, respectively. For simplicity, we add one dummy vertex labeled $n + 1$ to each of $G_h$ and $G_v$.
Let $Q(\lambda, \mu)$ denote the optimal value of the problem $LRS/(\lambda, \mu)$. We define the Lagrangian dual problem ($LDP$) of $PP$ as follows: maximize $Q(\lambda, \mu)$ subject to $\lambda \ge 0$ and $\mu \ge 0$.
Since $PP$ can be transformed into a convex problem [7], we can apply [6, Theorem 6.2.4], which implies that if $(\lambda, \mu)$ is the optimal solution to $LDP$, the optimal solution of $LRS/(\lambda, \mu)$ will also optimize $PP$.
A. Simplification of the Lagrangian Relaxation Subproblem
The Lagrangian relaxation subproblem $LRS/(\lambda, \mu)$ can be greatly simplified by the Kuhn-Tucker conditions. Consider the Lagrangian $\mathcal{L}$ of $PP$ [6]. The Kuhn-Tucker conditions imply that $\partial \mathcal{L} / \partial x_i = 0$ and $\partial \mathcal{L} / \partial y_i = 0$ for all $1 \le i \le n + 1$ at the optimal solution of $PP$. Therefore, in searching for the $\lambda$ and $\mu$ that optimize $LDP$, we only need to consider those multipliers for which these conditions are satisfied. Therefore, for
We use $\Omega$ to denote the set of $(\lambda, \mu)$ satisfying the above relationships, where $\left(\sum_{e(i, n+1) \in G_h} \lambda_{i, n+1}\right)\left(\sum_{e(i, n+1) \in G_v} \mu_{i, n+1}\right)$ is a constant for a fixed $(\lambda, \mu)$.
B. Solving $LRS/(\lambda, \mu)$
In this section, we consider solving the Lagrangian relaxation subproblem $LRS/(\lambda, \mu)$ when $(\lambda, \mu) \in \Omega$, i.e., computing $w_i$ for $1 \le i \le n$. $F$ can be written as .
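One way to see why each width can be computed in time linear in the number of outgoing edges is the following sketch. It rests on an assumption rather than on a formula quoted from the paper: relaxing the edge constraints of Section IV with nonnegative multipliers (called `lam` for edges of the horizontal graph and `mu` for edges of the vertical graph here) leaves, for each module i, an objective term of the form (sum of `lam` over outgoing horizontal edges) * w_i + (sum of `mu` over outgoing vertical edges) * A_i / w_i. Minimizing such a term over the interval [L_i, U_i] has a closed form. The function name `optimal_width` is hypothetical.

```python
import math


def optimal_width(area, lam_out, mu_out, lower, upper):
    """Minimize a*w + b*area/w over lower <= w <= upper, where
    a = sum(lam_out) and b = sum(mu_out) are sums of nonnegative
    multipliers on the outgoing edges of one module (assumed form)."""
    a = sum(lam_out)
    b = sum(mu_out)
    if a == 0:
        # The term is nonincreasing in w, so the largest width is optimal.
        return upper
    if b == 0:
        # The term is increasing in w, so the smallest width is optimal.
        return lower
    # Stationary point of the convex term, clamped to the feasible interval.
    w = math.sqrt(area * b / a)
    return min(max(w, lower), upper)
```

The cost of one call is dominated by the two sums, which is consistent with the linear dependence on the number of outgoing edges noted in Section VII.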
C. Solving LDP
As explained above, we only need to consider those $(\lambda, \mu) \in \Omega$ in order to maximize $Q(\lambda, \mu)$ in the $LDP$ problem. We use a subgradient optimization method to search for the optimal $(\lambda, \mu)$. Starting from an arbitrary $(\lambda, \mu) \in \Omega$ in step $k$, we move to a new pair $(\lambda', \mu')$ by following the subgradient direction. After updating $\lambda$ and $\mu$, we project $(\lambda', \mu')$ back to the nearest point $(\lambda^*, \mu^*)$ in $\Omega$ and solve the Lagrangian relaxation subproblem $LRS/(\lambda^*, \mu^*)$ using the method described in Section V-B. This procedure is repeated until the solution converges. The following algorithm summarizes the steps to solve $LDP$.
Algorithm Solve-LDP
/* This algorithm solves the problem optimally. Given the placement topology described by a pair of constraint graphs, it computes the optimal values for the widths of the modules to minimize the total packing area. */
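The multiplier update inside one iteration can be sketched as follows. This is the textbook subgradient step for relaxed inequality constraints, shown under stated assumptions and not necessarily the exact update used by the authors: each multiplier moves by a step size times the violation of its own constraint and is then clipped at zero. The names `subgradient_step`, `lam`, `mu`, and `step` are hypothetical, and the projection back onto the feasible multiplier set is a separate step that is not shown here.

```python
def subgradient_step(lam, mu, x, y, width, area, step):
    """One subgradient update of the edge multipliers (assumed textbook form).

    lam maps horizontal edges (i, j) to multipliers of x[i] + width[i] <= x[j];
    mu maps vertical edges (i, j) to multipliers of
    y[i] + area[i] / width[i] <= y[j].
    """
    new_lam = {
        (i, j): max(0.0, value + step * (x[i] + width[i] - x[j]))
        for (i, j), value in lam.items()
    }
    new_mu = {
        (i, j): max(0.0, value + step * (y[i] + area[i] / width[i] - y[j]))
        for (i, j), value in mu.items()
    }
    return new_lam, new_mu
```

A multiplier grows when its constraint is violated (positive violation) and shrinks toward zero when the constraint is slack, which is what drives the dual objective upward over the iterations.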
D. Projection
As described above, we use subgradient optimization to search for the optimal $(\lambda, \mu)$. Starting from an arbitrary $(\lambda, \mu) \in \Omega$, we move to a new pair $(\lambda', \mu')$ by following the subgradient direction. $(\lambda', \mu')$ is then projected back to the nearest point $(\lambda^*, \mu^*)$ in $\Omega$ based on the two-norm measure. This projection step is done by finding an orthonormal basis of $\Omega$, consisting of $p$ vectors for the $\lambda$ component and $q$ vectors for the $\mu$ component, and projecting $(\lambda', \mu')$ onto it.
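Projection onto a subspace with a known orthonormal basis reduces to a sum of inner products, as the following sketch shows. It illustrates the general linear-algebra fact only: the function name `project` is hypothetical, vectors are plain Python lists, and the basis is assumed to be orthonormal already.

```python
def project(vec, basis):
    """Nearest point (in the two-norm) to vec in the span of an orthonormal basis.

    The result is the sum over basis vectors b of (vec . b) * b.
    """
    result = [0.0] * len(vec)
    for b in basis:
        coeff = sum(v * c for v, c in zip(vec, b))
        for k, c in enumerate(b):
            result[k] += coeff * c
    return result


# Projecting onto the plane spanned by the first two coordinate axes
# simply drops the third component.
projected = project([3.0, 4.0, 5.0], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
```

Each projection costs one inner product per basis vector, so once the basis is available the step is cheap compared with computing the basis itself.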
To find the orthonormal basis spanning $\Omega$, we first find a set $I$ of independent vectors spanning $\Omega$ using QR decomposition. For simplicity, we consider the $\lambda$'s only in the following discussion. Let $\Omega_\lambda$ denote the set of $\lambda$'s satisfying the relationships (1) and (3), and let $Q\lambda = \tilde{y}$ be the system of equations described by (1) and (3). By QR decomposition, we can write each dependent variable $\lambda_i$ in $\Omega_\lambda$ as a linear combination of the other independent variables $\lambda_j$ in $\Omega_\lambda$.
From these formulae, we can obtain a set of independent vectors $I$ spanning $\Omega_\lambda$. Notice that in (1)-(4)
, where $|E|$ is the number of edges in the constraint graph. Fortunately, we only need to do the QR decomposition and Gram-Schmidt process once for each sequence pair. After finding an orthonormal set of vectors, we can repeatedly use this set to do the projection in searching for an optimal $(\lambda, \mu) \in \Omega$ according to (5) and (6). Another useful incremental technique to improve the efficiency comes from the observation that the structures of the constraint graphs are unchanged if we just exchange two modules in both sequences in a move of the annealing process (M2), so we do not need to recompute the orthonormal basis in almost half of the iterations.
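The Gram-Schmidt step that turns a set of spanning vectors into an orthonormal set can be sketched as below. This is a generic (modified) Gram-Schmidt routine given for illustration: the function name `gram_schmidt` and the tolerance `tol` are hypothetical, and the input is assumed to be a list of equal-length Python lists.

```python
import math


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis of the span of the given vectors.

    Vectors that are (numerically) dependent on earlier ones are dropped.
    """
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components along the basis vectors found so far.
        for b in basis:
            coeff = sum(x * y for x, y in zip(w, b))
            w = [x - coeff * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > tol:
            basis.append([x / norm for x in w])
    return basis
```

Because the result depends only on the structure of the constraint graphs, it can be cached and reused for every projection performed for the same sequence pair.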
VI. EXPERIMENTAL RESULTS
We tested our floorplanner with the MCNC benchmarks and some randomly generated data sets using a 600-MHz Pentium III processor. In all the experiments, the weightings between the area term and the wirelength term in the cost function of the annealing process are approximately balanced. We did three sets of experiments.
In the first set, we want to know the speed and quality of sizing all the modules once by the Lagrangian relaxation method. We randomly generated six data sets with 10 to 500 modules each. The aspect ratio of each module can range from 0.1 to 10.0 and the areas of the modules are randomly generated in the range between 0 and 500 000. The sizing procedure is applied only once at the end of the annealing process and the chip aspect ratio can range from 0.5 to 2.0. The results are shown in Table I. Notice that the result for each data set is obtained by repeating the experiment six times and picking the best one. Fig. 3 shows the packings for the data set with 100 modules before and after the sizing procedure. Murata and Kuh [7] have also reported the speed and quality of their method on data sets with module sizes randomly generated in the range between $100^2$ and $10\,000^2$, running on a 250-MHz DEC Alpha, and their results are shown in Table II.
In the second set of experiments, we apply the sizing procedure in every iteration of the annealing process. We use the same set of parameters as in [7]: the initial temperature is decided such that the acceptance ratio is 95%, the temperature is exponentially lowered in four decades by 20 steps, the number of iterations in one temperature step is ten times the number of modules, and the aspect ratio of the whole chip is approximately one. The temperature drops until it is below a certain threshold ($1 \times 10^{-10}$). We test our method using the benchmark data sets, and the aspect ratio of the modules can range from 0.1 to 10.0. The results are shown in Table III. Note that our experiments are performed on a 600-MHz Pentium III processor while [7] used a 250-MHz DEC Alpha processor. Fig. 4 shows a resulting packing for ami33.
In the last set of experiments, we also use the benchmark data sets and invoke the sizing procedure in every iteration of the annealing process. However, we allow the aspect ratio of each module to range from 0.5 to 2.0. This is a more reasonable range and it can better demonstrate the speed and quality of the sizing method in practice. In this set of experiments, the initial temperature is decided such that the acceptance ratio is 95%. The aspect ratio of the whole chip is also approximately one. The temperature is lowered at a constant rate of 0.95 until it is below a certain threshold ($1 \times 10^{-10}$) and the number of iterations at each temperature step is a constant of 30. The results are shown in Table IV.
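The arithmetic behind the two cooling schedules can be checked with a short sketch. Several points are assumptions made only for illustration: reading "four decades by 20 steps" as a geometric factor of 10^(-4/20) per temperature step, and using an initial temperature of 1.0 (the experiments choose it from the 95% acceptance ratio, which is not reproduced here). The function names are hypothetical.

```python
def per_step_factor(decades, steps):
    """Geometric cooling factor that lowers the temperature by the given
    number of decades over the given number of steps (assumed reading)."""
    return 10.0 ** (-decades / steps)


def count_temperature_steps(t0, factor, threshold=1e-10):
    """Number of cooling steps until the temperature falls below threshold."""
    steps = 0
    t = t0
    while t >= threshold:
        t *= factor
        steps += 1
    return steps


def iterations_per_step(num_modules, multiplier=10):
    """Iterations per temperature step when it is a multiple of the module count."""
    return multiplier * num_modules


# Second experiment: factor per step under the assumed reading.
second_factor = per_step_factor(4, 20)
# Third experiment: constant rate 0.95, starting from the assumed t0 = 1.0.
third_steps = count_temperature_steps(1.0, 0.95)
```

Multiplying the number of temperature steps by the iterations per step gives a rough count of how many times the sizing procedure is invoked in one annealing run.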
VII. REMARKS
Our method can also be used in the presence of hard rectilinear blocks. This can be done by partitioning a rectilinear hard block into several rectangular submodules and keeping them together as one piece by inserting additional edges in the constraint graphs. In this way, we can still shape the soft modules optimally in the presence of hard blocks.
In our current implementation, the time taken to compute the width of a module $i$ is linear in the total number of outgoing edges from $i$ in the two constraint graphs. This is $O(n)$ on average for constraint graphs constructed from the sequence pair representation. However, this can be reduced to $O(1)$ by using another representation, e.g., O-tree [1] or B*-tree [3], which supports planar constraint graphs.
