Existing 3D placement techniques are mainly used for standard-cell circuits, while mixed-size placement is needed to support high-level functional units and intellectual property (IP) blocks. In this paper we present an analytical 3D placement method that is capable of placing mixed-size circuits. A multiple-stepsize scheme for the analytical solver is proposed to handle standard cells and macros differently for stability and efficiency. To relieve the difficulty of legalization, 3D floorplan-based initial solutions are used to guide the analytical solver. As far as we know, this is the first work that reports 3D placement results for mixed-size circuits. Our experiments show that the multiple-stepsize scheme is better than single-stepsize schemes in both quality and runtime. The experimental results on the ICCAD'04 mixed-size benchmarks show that the 4-tier 3D mixed-size placement can reduce the wirelength by 27% on average compared to 2D placement. The results also show that the 3D mixed-size placement achieves 5.3% shorter wirelength on average than the pseudo 3D placement with a similar number of through-silicon vias (TS vias).
INTRODUCTION
Three-dimensional (3D) IC technologies can offer the potential to significantly reduce interconnect delays and improve system performance. Furthermore, the shortened wirelength, especially that of the clock net, also lessens the power consumption of the circuit. 3D IC technologies also provide a flexible way to carry out the heterogeneous system-on-chip (SoC) design by integrating disparate technologies, such as memory and logic circuits, radio frequency (RF) and mixed signal components, optoelectronic devices, etc., onto different tiers of a 3D IC.
Physically, a 3D IC can be viewed as a stack of multiple 2D ICs, where a single 2D IC is called a tier. Tiers in a 3D IC are connected using through-silicon vias (TS vias). However, TS vias are usually etched or drilled through tiers by special techniques and are costly to fabricate. A large number of TS vias will degrade the yield of the final chip. Also, under current technologies, TS via pitches, usually around 5-10μm, are very large compared to the sizes of regular metal wires. In 3D IC structures, TS vias are usually placed in the whitespace between the macro blocks or cells, so the TS vias affect both the routing resources and the overall chip area.
In recent years, 3D IC physical design has attracted more and more attention. Along with the technology updates, there are several published works that target the 3D placement problem. A thermal-driven force-directed 3D placement method [14] was proposed, where the temperature profile is interpreted as a thermal force to guide the cell placement. A transformation-based 3D placement [10] was proposed to reuse the 2D placement information by the folding/stacking heuristics to construct a 3D placement. A partitioning-based approach [15] was also applied to the 3D context, where the temperature and the TS via number are modeled in the min-cut objective together with the total wirelength. A quadratic programming approach [19] for 3D placement was also proposed, which finds an overlap-free placement by modeling the cell distribution with a discrete cosine transformation-based cost function.
However, none of these 3D placement methods consider mixed-size circuits, or at least none of these works report 3D placement results on mixed-size examples. As pointed out in [4][7], 3D physical design tools are required to work on a higher level of abstraction, which involves the floorplanning and placement of large functional units in addition to standard cells. Also, like 2D ICs, 3D designs will make widespread use of embedded memories, IP blocks, and other hard modules for physical design reuse, which makes mixed-size 3D placement necessary and important.
In this paper we propose several techniques to add large-macro placement support to a multilevel analytical placer. In particular, we make the following contributions for mixed-size 3D placement:
• A multiple-stepsize scheme is proposed to improve the analytical placer for mixed-size 3D placement. This scheme distinguishes the stepsizes for macros and standard cells, which allows larger stepsizes than the single-stepsize scheme. In turn, allowance of large stepsizes improves the stability and efficiency of the analytical 3D placer.
• We analyze cases that are difficult for legalization, from which we propose to use 3D floorplan-based initial solutions to guide the analytical 3D placer. A few large macros are fixed before 3D placement to guarantee legalized solutions.
• Experimental results on the ICCAD'04 mixed-size benchmarks are reported, which show that the 3D placement is able to reduce the wirelength by 27% compared to 2D placements. The results also show that the full 3D mixed-size placement achieves 5.3% shorter wirelength than the pseudo 3D placement with a similar number of TS vias.
The remainder of this paper is organized as follows. Section 2 describes the 3D placement problem, and the techniques for handling large macros in the analytical 3D placer are proposed in Section 3. Section 4 presents the experimental results. Finally, Section 5 concludes our work.
3D PLACEMENT PROBLEM FORMULATION
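Given a circuit represented as a hypergraph $H = (V, E)$, the 3D placement problem is formulated as
$$\min \sum_{e \in E} \big( l(e) + \alpha \, v(e) \big) \quad \text{subject to} \quad D_k(u, v) \le 1 \ \text{ for every tier } k \text{ and every point } (u, v) \tag{1}$$
where $l(e)$ and $v(e)$ are the wirelength and TS via number estimations, respectively, and $D_k(u, v)$ is the density map on tier $k$.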
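We use the traditional half-perimeter wirelength (HPWL) model for wirelength estimation, where the wirelength $l(e)$ of a net $e$, whose cells $v_i$ are located at $(x_i, y_i, z_i)$, is approximated by the log-sum-exp function [16] as follows:
$$l(e) = \eta \left( \log \sum_{v_i \in e} \exp(x_i / \eta) + \log \sum_{v_i \in e} \exp(-x_i / \eta) + \log \sum_{v_i \in e} \exp(y_i / \eta) + \log \sum_{v_i \in e} \exp(-y_i / \eta) \right) \tag{2}$$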
where η is set to 0.01 in our implementation to balance the approximation accuracy and the numerical stability.
Because the routing information is unknown during the placement process, the TS via number is also estimated through a similar model. The TS via number of a net is calculated as the height of the bounding cuboid of the cells belonging to that net. The TS via number $v(e)$ of a net $e$ is approximated as follows [10]:
$$v(e) = \eta \left( \log \sum_{v_i \in e} \exp(z_i / \eta) + \log \sum_{v_i \in e} \exp(-z_i / \eta) \right) \tag{3}$$
The weighting factor $\alpha$ is used to achieve the trade-off between the wirelength $l(e)$ and the TS via number $v(e)$.
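To make the trade-off concrete, the following Python sketch evaluates a weighted cost of one net that combines the smooth wirelength with the smooth TS via estimate. It is a sketch under our own assumptions: the helper names are hypothetical, the same $\eta$ is used for all three directions, and the value of $\alpha$ in the example is arbitrary.

```python
import numpy as np

def smooth_span(values, eta):
    """Log-sum-exp approximation of max(values) - min(values)."""
    v = np.asarray(values, dtype=float) / eta
    hi = v.max() + np.log(np.exp(v - v.max()).sum())          # ~ max(values) / eta
    lo = (-v).max() + np.log(np.exp(-v - (-v).max()).sum())   # ~ -min(values) / eta
    return eta * (hi + lo)

def net_cost(xs, ys, zs, alpha, eta=0.01):
    """Weighted cost l(e) + alpha * v(e) of a single net."""
    wirelength = smooth_span(xs, eta) + smooth_span(ys, eta)  # l(e)
    ts_vias = smooth_span(zs, eta)                            # v(e), bounding-cuboid height
    return wirelength + alpha * ts_vias

# Hypothetical two-pin net spanning tiers 1 and 3; alpha = 10 is an arbitrary choice.
cost = net_cost(xs=[0.0, 1.0], ys=[0.0, 2.0], zs=[1.0, 3.0], alpha=10.0)
print(cost)
```

A larger $\alpha$ makes tier crossings more expensive relative to planar wirelength, which pushes the cells of a net toward the same tier.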
For any placement with $z_i \in \{1, 2, \ldots, K\}$, the density map $D_k(u, v)$ on tier $k$ is defined so that the density value at point $(u, v)$ equals the total number of cells and macros on that tier covering that point. Thus the non-overlap constraints are transformed to the density constraints $D_k(u, v) \le 1$, which means that the total number of cells and macros at any point is not greater than 1. These inequality constraints can be further transformed to equality constraints $D_k(u, v) = 1$ by adding dummy cells [6], which are pseudo cells without connection to the netlist, to fill the white space.
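The density constraint can be illustrated with a small Python sketch that rasterizes rectangles onto a bin grid, one map per tier. The bin-averaged discretization, the tuple layout of the objects, and the function name are assumptions made for this example; the formulation above is stated pointwise.

```python
import numpy as np

def density_maps(objects, num_tiers, region_w, region_h, nx, ny):
    """Bin-averaged density maps D[k-1, i, j] for objects given as
    (x_low, y_low, width, height, tier) with tier in 1..num_tiers."""
    x_edges = np.linspace(0.0, region_w, nx + 1)
    y_edges = np.linspace(0.0, region_h, ny + 1)
    bin_area = (region_w / nx) * (region_h / ny)
    density = np.zeros((num_tiers, nx, ny))
    for x, y, w, h, tier in objects:
        # Overlap length between the object and every bin column / row.
        ox = np.clip(np.minimum(x + w, x_edges[1:]) - np.maximum(x, x_edges[:-1]), 0.0, None)
        oy = np.clip(np.minimum(y + h, y_edges[1:]) - np.maximum(y, y_edges[:-1]), 0.0, None)
        density[tier - 1] += np.outer(ox, oy) / bin_area
    return density

# Two overlapping objects on tier 1 and one object filling tier 2 (hypothetical data).
objects = [(0, 0, 2, 2, 1), (1, 1, 2, 2, 1), (0, 0, 4, 4, 2)]
D = density_maps(objects, num_tiers=2, region_w=4.0, region_h=4.0, nx=4, ny=4)
violations = np.argwhere(D > 1.0 + 1e-9)  # bins where the density constraint fails
print(violations)
```

In this example the only violating bin is the one covered by both tier-1 objects, while tier 2 is exactly full and therefore satisfies the equality form of the constraint without any dummy cells.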
MIXED-SIZE 3D PLACEMENT FLOW
In order to solve the mixed-size 3D placement problem, we apply the flow in Figure 1, which mainly consists of a 3D floorplanner and our analytical 3D placer for mixed-size designs.
Although our analytical 3D placer is capable of handling mixed-size designs, large-macro legalization is still a problem. Thus, we use a 3D floorplanner to provide an initial 3D placement to guide the analytical placer, which will be discussed in Section 3.1.
Our analytical 3D placer works on a given 3D placement. A multiple-stepsize scheme is developed to enhance the 3D placer [8] for mixed-size designs, which will be described in Section 3.2.
The 3D global placement solution is first rounded in the z-direction to snap the movable objects to a tier, and then the movable objects are legalized tier-by-tier using the legalization and detailed placement algorithm in [9].
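The rounding step in the z-direction can be sketched as below. Rounding to the nearest tier (with halves rounded up) and clamping to the valid tier range are our assumptions; the text only states that the solution is rounded.

```python
import numpy as np

def snap_to_tiers(z, num_tiers):
    """Round continuous tier coordinates to integer tiers in 1..num_tiers."""
    tiers = np.floor(np.asarray(z, dtype=float) + 0.5).astype(int)  # round half up
    return np.clip(tiers, 1, num_tiers)

# Hypothetical continuous tier coordinates from a 4-tier global placement.
tiers = snap_to_tiers([1.2, 2.5, 3.7, 4.4, 0.6], num_tiers=4)
print(tiers)
```

After this step every movable object has a fixed tier, so the remaining legalization can be handled one tier at a time by a 2D procedure.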
3D Floorplan-Based Initial Solution
Before describing the use of the 3D floorplanner, we first analyze the cases where large-macro legalization fails for 3D global placements, as in Figure 2. In each case, the 3D placement is shown from the bottom tier (tier 1) to the topmost tier (tier 4), where large boxes represent macros and small dots represent standard cells. From these cases, we see that some global placements are difficult to legalize, even by hand. For example, there are two overlapping macros on tier 1 of the ibm03 case. Although there is enough white space to hold the overlapping area, these two hard macros cannot be legalized unless one of them is moved to another tier, resulting in a large displacement of the overall solution.
Thus, a 3D global placement that roughly satisfies the area density constraints may still be difficult to legalize. We would like to start the analytical 3D placer with fixed large macros to avoid solutions that cannot be legalized. We perform 3D floorplanning [12] on a coarsened netlist after partitioning [17] (e.g., partitioning into 100 parts). These initial solutions guide the analytical 3D placer to a better placement of the large macros. Furthermore, to prevent very large macros from being placed on the same tier, we may fix the tier assignment of very large macros (e.g., macros with width or height greater than 20% of the chip width or chip height, respectively). As demonstrated in Figure 3, with these initial solutions, we gain a higher probability of obtaining a solution that is easy to legalize.
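The selection of very large macros can be sketched in a few lines of Python. The 20% threshold comes from the example above; the data layout, the function names, and the macro dimensions are hypothetical.

```python
def is_very_large(width, height, chip_width, chip_height, ratio=0.2):
    """True if the macro exceeds `ratio` of the chip width or of the chip height."""
    return width > ratio * chip_width or height > ratio * chip_height

def split_macros(macros, chip_width, chip_height, ratio=0.2):
    """Split {name: (width, height)} into (fixed, movable) lists of names."""
    fixed, movable = [], []
    for name, (w, h) in macros.items():
        target = fixed if is_very_large(w, h, chip_width, chip_height, ratio) else movable
        target.append(name)
    return fixed, movable

# Hypothetical macros on a 1000 x 1000 chip.
macros = {"ram0": (300, 80), "rom1": (50, 60), "ip2": (90, 250)}
fixed, movable = split_macros(macros, chip_width=1000, chip_height=1000)
print(fixed, movable)
```

Only the macros in the first list would have their tier (or position) fixed from the floorplan; the remaining macros stay movable in the analytical placer.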
Analytical Solver with Multiple-Stepsize Scheme
The multilevel analytical placement solver [8] is used for 3D global placement of standard-cell circuits. We adopt this analytical solver and propose the multiple-stepsize scheme to handle mixed-size designs.
In an analytical solver, the range of the tier assignment $z_i$ for cell $v_i$ has to be relaxed from the discrete set $\{1, 2, \ldots, K\}$ to the continuous interval $[1, K]$; thus the previous definition of the density map in formulation (1) is no longer valid. To model the non-overlap constraints in terms of the continuous tier assignment, the techniques of area projection and pseudo tiers in [8] are applied, which redefine the density maps by area projection and use additional pseudo tiers between every two neighboring actual tiers to ensure the equivalence to the non-overlap constraints. Then the 3D global placement problem is formulated as a nonlinear programming problem (4), which minimizes the objective $OBJ(x, y, z)$ subject to the density constraints on the actual tiers and the pseudo tiers.
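This problem can be solved by the quadratic penalty method [18]:
$$\min_{x, y, z} \; OBJ(x, y, z) + \mu \, Penalty(x, y, z) \tag{5}$$
where $\mu$ is the penalty factor and the penalty function $Penalty(x, y, z)$ in (6) penalizes the squared violation of the density constraints on the actual tiers and the pseudo tiers.
The non-differentiable function $Penalty(x, y, z)$ is approximated by a differentiable function before running an analytical solver. The density functions on both actual tiers and pseudo tiers are replaced by the smoothed density functions [11].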
The solution of the unconstrained minimization problem in (5) is equivalent to the steady-state solution of the following ordinary differential equation (ODE), which is similar to [5]:
$$\frac{d}{dt}(x, y, z) = -\nabla Q_\mu(x, y, z) \tag{7}$$
where $Q_\mu(x, y, z) = OBJ(x, y, z) + \mu \, Penalty(x, y, z)$.
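This ODE can be solved by the explicit Euler method, which gives the following iterative scheme:
$$(x, y, z)^{(n+1)} = (x, y, z)^{(n)} - \tau \, \nabla Q_\mu\big((x, y, z)^{(n)}\big), \quad n = 0, 1, 2, \ldots \tag{8}$$
where $(x, y, z)^{(0)}$ is a given initial placement.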
The stepsize $\tau$ has to be small enough to guarantee convergence. The analytical upper bound for $\tau$ depends on the Hessian of $Q_\mu(x, y, z)$, which is difficult to determine. In practice, the value of $\tau$ is determined adaptively: an initial stepsize $\tau$ is tried and then the convergence is checked; if the iteration does not converge, the stepsize is scaled down by a ratio between 0 and 1 (e.g., 0.6) and the trial-and-error process is repeated. This scheme works well for standard-cell cases [8]. However, it may cause trouble in mixed-size cases. We observe that if the same stepsize is used for all the variables, the stepsize has to be very small to guarantee convergence, and it causes instability if it is set too large. Thus, we introduce a scaling factor for every cell according to its area, such that the stepsize ratio $\tau_i / \tau_j$ of any two cells $v_i$ and $v_j$ is equal to the inverse of their area ratio.
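The effect of area-scaled stepsizes can be illustrated on a toy problem. The Python sketch below runs the explicit Euler iteration with the trial-and-error stepsize rule, once with a single stepsize and once with stepsizes inversely proportional to the cell areas. The quadratic toy objective, whose curvature grows with the cell area, the divergence test, and all names are assumptions for illustration; this is not the placer's objective or implementation.

```python
import numpy as np

def solve(grad, x0, areas, tau, multi_stepsize, shrink=0.6, tol=1e-8,
          max_iter=10000, max_trials=60):
    """Explicit Euler iteration x <- x - tau_i * grad_i with adaptive tau.
    Returns (solution, accepted tau, iterations including failed trials)."""
    areas = np.asarray(areas, dtype=float)
    # Per-cell factor: 1 for all cells, or inverse area normalized to the smallest cell.
    scale = areas.min() / areas if multi_stepsize else np.ones_like(areas)
    total_iters = 0
    for trial in range(max_trials):
        x = np.array(x0, dtype=float)
        g0 = np.linalg.norm(grad(x))
        for step in range(max_iter):
            g = grad(x)
            gnorm = np.linalg.norm(g)
            if gnorm <= tol:
                return x, tau, total_iters
            if gnorm > 1e6 * (g0 + 1.0):   # treat a blown-up gradient as divergence
                break
            x = x - tau * scale * g
            total_iters += 1
        tau *= shrink                      # stepsize too large: scale down and retry
    raise RuntimeError("no convergent stepsize found")

# Toy objective 0.5 * sum(area_i * (x_i - target_i)^2): two cells and one macro.
areas = np.array([1.0, 1.0, 100.0])
target = np.array([0.2, 0.8, 0.5])
grad = lambda x: areas * (x - target)

x_s, tau_s, it_s = solve(grad, np.zeros(3), areas, tau=1.0, multi_stepsize=False)
x_m, tau_m, it_m = solve(grad, np.zeros(3), areas, tau=1.0, multi_stepsize=True)
print(tau_s, it_s, tau_m, it_m)
```

In this toy setting the single-stepsize run is forced to shrink its stepsize until the macro is stable, which slows down the small cells, whereas the area-scaled run accepts the initial stepsize.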
Here we justify the multiple-stepsize scheme by showing its equivalence to the gradient projection method for mixed-size placement problems. The following analysis focuses only on a small example of mixed-size linear placement, but it can be extended to a rigorous proof for general mixed-size placement problems.
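In the mixed-size linear placement example, there are only two cells: $v_1$ with width $w$ and $v_2$ with width $2w$. The placement problem is
$$\min_{x_1, x_2} \; OBJ(x_1, x_2) + \mu \, Penalty(x_1, x_2), \quad \mu \text{ increased until convergence} \tag{9}$$
where $x_1$ and $x_2$ are the centers of the two cells, $OBJ(x_1, x_2)$ is the total wirelength, and $Penalty(x_1, x_2)$ is the density penalty function [11].
The mixed-size problem (9) can be transformed to a uniform-size problem (10) by decomposing $v_2$ equally into two cells $v_3$ and $v_4$ with an additional constraint $x_3 + w = x_4$, as shown in Figure 5:
$$\min_{x_1, x_3, x_4} \; OBJ(x_1, x_3, x_4) + \mu \, Penalty(x_1, x_3, x_4) \quad \text{subject to} \quad x_3 + w = x_4 \tag{10}$$
where $x_3$ and $x_4$ are the centers of $v_3$ and $v_4$.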
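The gradient projection method [18] can be used to solve the constrained optimization problem in (10). Each iterative step consists of a descent step followed by a projection step. With $Q_\mu(x_1, x_3, x_4) = OBJ(x_1, x_3, x_4) + \mu \, Penalty(x_1, x_3, x_4)$ denoting the penalized objective of (10), the descent step is as follows:
$$(x_1'', x_3'', x_4'') = (x_1, x_3, x_4) - \tau \, \nabla Q_\mu(x_1, x_3, x_4) \tag{11}$$
The projection step is to find a feasible solution $(x_1', x_3', x_4')$ such that it is the feasible point with minimal distance to the point $(x_1'', x_3'', x_4'')$. Formally, it is the solution of the following optimization problem:
$$\min \; (x_1' - x_1'')^2 + (x_3' - x_3'')^2 + (x_4' - x_4'')^2 \quad \text{subject to} \quad x_3' + w = x_4' \tag{12}$$
The Lagrangian function for problem (12) is
$$L(x_1', x_3', x_4', \lambda) = (x_1' - x_1'')^2 + (x_3' - x_3'')^2 + (x_4' - x_4'')^2 + \lambda \, (x_3' + w - x_4') \tag{13}$$
whose stationary point is
$$x_1' = x_1'', \quad x_3' = \frac{x_3'' + x_4'' - w}{2}, \quad x_4' = \frac{x_3'' + x_4'' + w}{2} \tag{14}$$
Combining the descent step (11) and the projection step (12), and using $x_3 + w = x_4$ for the current feasible point, we obtain
$$x_1' = x_1 - \tau \frac{\partial Q_\mu}{\partial x_1}, \quad x_3' = x_3 - \frac{\tau}{2} \left( \frac{\partial Q_\mu}{\partial x_3} + \frac{\partial Q_\mu}{\partial x_4} \right), \quad x_4' = x_4 - \frac{\tau}{2} \left( \frac{\partial Q_\mu}{\partial x_3} + \frac{\partial Q_\mu}{\partial x_4} \right) \tag{15}$$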
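According to the property of the density penalty functions [11], for the specific placements $(x_1, x_2)$ and $(x_1, x_3, x_4)$ where $x_3 = x_2 - w/2$ and $x_4 = x_2 + w/2$, the gradients of the penalty functions in (9) and (10) are related (equations (16)-(19)): the partial derivative with respect to $x_1$ is identical in the two problems, and the partial derivative with respect to $x_2$ in (9) equals the sum of the partial derivatives with respect to $x_3$ and $x_4$ in (10).
Based on the equations (16)(17)(18)(19), we can transform (15) into
$$x_1' = x_1 - \tau \frac{\partial Q_\mu(x_1, x_2)}{\partial x_1}, \quad x_2' = x_2 - \frac{\tau}{2} \frac{\partial Q_\mu(x_1, x_2)}{\partial x_2} \tag{20}$$
where $Q_\mu(x_1, x_2) = OBJ(x_1, x_2) + \mu \, Penalty(x_1, x_2)$ is the penalized objective of (9).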
This can be viewed as a descent step in the gradient descent method for the unconstrained optimization problem in (9). It indicates that the stepsize ratio between $v_1$ and $v_2$ is $2:1$, which is inversely proportional to their area ratio $w : 2w = 1 : 2$.
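The equivalence in the linear placement example can be checked numerically. The Python sketch below compares one gradient projection step on the split problem with one descent step on the two-cell problem using stepsizes $\tau$ and $\tau/2$. The smooth cost function is an arbitrary stand-in chosen for this example (it is neither the wirelength nor the density penalty), and all names are hypothetical.

```python
import numpy as np

W = 1.0  # width w of v1; v2 has width 2w and is split into v3 and v4

def grad_q3(p):
    """Gradient of a stand-in cost Q3(x1, x3, x4) =
    (x1 - x3)^2 + (x4 - 5)^2 + 0.5 * x1^2 + 0.25 * x3^4."""
    x1, x3, x4 = p
    return np.array([2 * (x1 - x3) + x1, -2 * (x1 - x3) + x3 ** 3, 2 * (x4 - 5)])

def grad_q2(p):
    """Gradient of Q2(x1, x2) = Q3(x1, x2 - w/2, x2 + w/2), by the chain rule."""
    x1, x2 = p
    g1, g3, g4 = grad_q3(np.array([x1, x2 - W / 2, x2 + W / 2]))
    return np.array([g1, g3 + g4])

def projected_step(p, tau):
    """One gradient projection step on the split problem with x3 + w = x4."""
    x1, x3, x4 = p - tau * grad_q3(p)   # descent step
    mid = 0.5 * (x3 + x4)               # the projection keeps the midpoint of x3, x4
    return np.array([x1, mid - W / 2, mid + W / 2])

def scaled_step(p, tau):
    """One descent step on the two-cell problem with stepsizes tau and tau / 2."""
    return p - np.array([tau, tau / 2]) * grad_q2(p)

p2 = np.array([0.3, 2.0])        # (x1, x2)
p3 = np.array([0.3, 1.5, 2.5])   # the same placement with v2 split into v3, v4
q = projected_step(p3, tau=0.1)
r = scaled_step(p2, tau=0.1)
print(q, r)
```

The center of the recombined cell, $(x_3' + x_4')/2$, coincides with $x_2'$ from the scaled descent step, which matches the $2:1$ stepsize ratio derived above.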
Theorem. Assume in a mixed-size placement problem, the cell area The proof can be obtained by the same idea as the previous analysis of the linear placement example, and thus is omitted.
As a result, the stepsize need not be very small for convergence. Instead, the requirement on the stepsize is less strict, which helps to implement a stable solver in practice. The effect of the multiple-stepsize scheme is reported in TABLE II.
EXPERIMENTAL RESULTS
To evaluate the quality of our analytical 3D placer for mixed-size circuits, experiments are performed on our modified version of the ICCAD'04 mixed-size placement benchmarks [20]. The netlists and the cell sizes remain the same, and the 3D placement regions are scaled down from the 2D regions by a factor of $\sqrt{K}$ on each side, where $K$ is the number of tiers. Thus the total white space is not changed. The I/O port locations are also scaled linearly with the placement regions, and the I/O ports are assumed to be open at the topmost tier (the tier with the largest tier number). We set $K = 4$ for all the experiments in this section.
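The benchmark modification can be sketched as follows, reading the scaling as one that keeps the total placement area over all tiers equal to the 2D area, i.e., each side is multiplied by $1/\sqrt{K}$. This reading, the function names, and the example dimensions are our assumptions.

```python
import math

def scale_region(width_2d, height_2d, num_tiers):
    """Per-tier region size such that num_tiers * area_3d equals the 2D area."""
    s = 1.0 / math.sqrt(num_tiers)
    return width_2d * s, height_2d * s

def scale_port(x, y, num_tiers):
    """Scale an I/O port linearly with the region and place it on the topmost tier."""
    s = 1.0 / math.sqrt(num_tiers)
    return x * s, y * s, num_tiers

# Hypothetical 2D region of 1000 x 800 units mapped onto K = 4 tiers.
w3d, h3d = scale_region(1000.0, 800.0, num_tiers=4)
port = scale_port(200.0, 100.0, num_tiers=4)
print(w3d, h3d, port)
```

Because the cell sizes are unchanged and the summed tier area equals the 2D area, the total white space stays the same as in the 2D benchmark.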
In this suite of benchmarks, the number of standard cells and nets ranges from 10k to 50k, and there are hundreds of macros in each circuit. The readers may refer to [1][2] for these numbers in detail. Here we only show the statistics of areas and wirelengths in TABLE I. The first sub-column under "2D HPWL" is the half-perimeter wirelength produced by the placer mPL6 [6], where the geometric mean is computed in the last row for comparison with 3D placement results. A breakdown of the wirelengths is found in the following sub-columns: on average, 76% of the wires connect standard cells only, 4% connect macros only, and 20% connect standard cells and macros. The area breakdown is shown afterwards, where the standard cells consume 41% of the total area, macros consume 39%, and there is 20% white space on average.
Before we show the experimental results, we first present the data in TABLE II on the effect of the multiple-stepsize scheme discussed in Section 3.2. We compare the multiple-stepsize scheme to the single-stepsize scheme on 2D placements ($K = 1$). The different schemes are tested as an additional run on the same given placements, and we expect that the final placement will have the same or a slightly better wirelength (the normalized quality in TABLE II). Three implementations are tested: the first one is a single-stepsize scheme with a moderately increasing penalty factor $\mu$; the second one is a single-stepsize scheme with an aggressively increasing penalty factor; and the last one is a multiple-stepsize scheme with an aggressively increasing penalty factor. The quality of the results is normalized as the final HPWL divided by the given HPWL, and the number of iterations spent is also reported. The results show that if only a single stepsize is used, either it takes more time for the adaptive scheme to search for a small enough stepsize, or the quality degrades significantly. The multiple-stepsize scheme helps the stability and the runtime, and maintains the best quality with the fewest iterations to converge, compared to the two single-stepsize schemes.
Experimental results for the two modes of our 3D placer, "pseudo 3D" and "3D (mac fixed)", are summarized in TABLE III, as well as the folding method in [12]. As pointed out in Section 3.1, large macros cause trouble for the legalization. Thus, 3D floorplanning is performed on the coarsened netlist (100 nodes) before 3D placement. The "large macros" in the experiments are the macros whose width or height is greater than 20% of the chip width or 20% of the chip height, respectively. Both modes start with 3D floorplan-based initial solutions. The "pseudo 3D" mode fixes the large macros, disables the movement in the z-direction, and runs the 3D placement only for standard cells and small macros in the (x,y)-direction. The "3D (mac fixed)" mode fixes the large macros, but allows the movement of standard cells and small macros in both the (x,y)-direction and the z-direction. The detailed placement is completed tier-by-tier using the 2D detailed placer [9].
The wirelength after global placement (gp-WL), the wirelength after detailed placement (dp-WL), the number of TS vias (TSV), and the total runtime (RT) are all reported in TABLE III. Both modes produce a similar number of TS vias. Compared to the "pseudo 3D" mode, the "3D (mac fixed)" mode reduces wirelength by 5.3% on average by allowing the movement of small objects in the z-direction. Compared to the folding method, the "3D (mac fixed)" mode reduces wirelength by 18% on average with 35% more TS vias. Compared to the 2D placement, the "3D (mac fixed)" mode provides a 27% wirelength reduction on average for these mixed-size benchmarks.
CONCLUSIONS
In this paper, we proposed several techniques to enable an analytical 3D placer to support mixed-size circuits. The multiple-stepsize method gains efficiency by allowing as large a stepsize as possible for each standard cell, while allowing large macros to be updated with small stepsizes for stability. The 3D floorplanning is used to generate initial solutions for very large macros and gives a higher probability of obtaining a legalized solution. The experimental results show that the 3D placement is able to reduce the wirelength by 27% compared to 2D placements. The results also show that the 3D mixed-size placement achieves 5.3% shorter wirelength than the pseudo 3D placement with a similar number of TS vias.
ACKNOWLEDGMENTS
This research is partially supported by the National Science Foundation under CCF-0430077 and CCF-0528583. The authors would like to thank Prof. Lieven Vandenberghe and John Lee for the inspiring discussions.
