Thermal management is considered one of the most challenging problems in 3D integrated circuits (3D ICs). Vertically stacking multiple layers of active devices causes a rapid increase in power density, and the thermal conductivity of the dielectric layers inserted between device layers for insulation is quite low compared to that of silicon and metal. Both effects raise the peak temperature of 3D ICs and degrade performance. In this paper, instead of inserting Thermal Through Silicon Vias (TTSVs) to reduce the peak temperature, a temperature-aware floorplanning algorithm based on simulated annealing for fixed-outline 3D ICs is proposed. The concept of a ''hot'' block is introduced: by placing the ''hot'' blocks of the 3D IC on the bottom layer of the chip (near the radiator) and applying reasonable intra-layer and inter-layer heat constraints, the peak temperature of the 3D IC is minimized. The number of TSVs, the chip area and the wirelength are also considered in this paper. The results show that the proposed temperature-aware 3D IC floorplanning can effectively reduce the chip peak temperature and the number of TSVs with reasonable area, wirelength and time overhead.
I. INTRODUCTION
Three-dimensional integrated circuits (3D ICs), which use through-silicon vias (TSVs) to stack multiple dies vertically, provide many benefits, such as high density, high bandwidth, and low power [1]-[6]. Fig. 1 presents an example of a 3D IC consisting of two silicon layers, where inter-layer connections are made of TSVs while intra-layer connections consist of metal wires. Although TSVs can greatly shorten the total wirelength of the chip, the area of a TSV is tens to hundreds of times larger than that of a single logic gate, which creates obstacles for the floorplanning of 3D ICs [7]. In addition, the thermal problem poses a serious challenge to 3D IC design: the vertically stacked layers of active devices cause a rapid increase in power density, and the thermal conductivity of the dielectric layers inserted between device layers for insulation is quite low compared to that of silicon and metal [8], [9]. Both effects raise the peak temperature of 3D ICs and degrade performance. It is therefore very important to consider both the number of TSVs and the peak temperature during floorplanning in order to design 3D ICs with superior performance.

The associate editor coordinating the review of this manuscript and approving it for publication was Cihun-Siyong Gong. VOLUME 7, 2019. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/

Inserting Thermal Through Silicon Vias (TTSVs) is one of the mainstream approaches to temperature-aware floorplanning for fixed-outline 3D ICs [10]-[13]. Some blank areas are reserved for inserting TTSVs during floorplanning, and the heat inside the 3D IC can be transferred vertically to the external heat sink through the TTSVs. Wang et al. [10] proposed a heat-dissipation system using a redistribution layer (RDL) and a TTSV array in a 3D IC, which can rapidly reduce the hot spot temperature.
Wong and Sung Kyu [11] presented a heat sink TTSV insertion algorithm for 3D floorplanning, which effectively reduced the peak temperature of the chip using the smallest number of heat sinks. Goplen and Sapatnekar [12] used finite element analysis (FEA) to model the temperature, and their algorithm also effectively reduces the peak temperature of 3D ICs while minimizing the number of TTSVs. In [13], the authors proposed a congestion minimization method to plan TTSVs, and their two-stage floorplan design reduced the complexity of the solution space, achieving a lower chip temperature and a shorter run time. TTSV insertion is the most direct method of heat dissipation, but it does not solve the problem of local overheating on the silicon wafer. In addition, inserted TTSVs occupy extensive wiring space, which dramatically increases the design complexity of 3D floorplanning.
In this work, instead of inserting TTSVs to reduce the peak temperature, a temperature-aware floorplanning algorithm based on simulated annealing for fixed-outline 3D ICs is proposed. The peak temperature of the 3D IC is minimized by setting intra-layer and inter-layer heat constraints. The number of TSVs, the chip area and the wirelength are also considered in this paper. The results show that the proposed temperature-aware 3D IC floorplanning can effectively reduce the chip peak temperature and the number of TSVs with reasonable area, wirelength and time overhead.
The remainder of this paper is organized as follows. Section II introduces the floorplanning design of 3D ICs. Section III presents the proposed algorithm. Experimental results are shown in Section IV. Finally, conclusions are given in Section V.
II. FLOORPLANNING FOR 3D IC
Floorplanning is a key step in the physical design of VLSI circuits, and its results have an important impact on the size, global interconnect structure and performance of the final IC. According to the geometric partition, floorplans can be divided into two kinds [14]: slicing floorplans, which can be recursively subdivided into single modules by horizontal and vertical cuts, and non-slicing floorplans, which cannot. Traditional 2D ICs are floorplanned within a single device layer, while 3D ICs are floorplanned simultaneously across multiple device layers.
A. PROBLEM FORMULATION
The problem formulation of temperature-aware 3D floorplanning is as follows. Input information of floorplanning for 3D IC: 1) a set of $n$ modules $\{m_1, m_2, \ldots, m_n\}$, the set of module areas $\{a_1, a_2, \ldots, a_n\}$, and an initial width $w_i$ and height $h_i$ for each module; 2) a set of nets $E = \{e_1, e_2, \ldots\}$, where each net in $E$ describes the interconnections between modules; 3) the set of power densities $\{p_1, p_2, \ldots, p_n\}$ of the $n$ modules; 4) the number of layers $K$, where $K$ is a positive integer.
Output information of floorplanning for 3D IC: for each module $m_i$, $1 \le i \le n$, a tuple $(x_i, y_i, w_i, h_i, z_i)$,
where $(x_i, y_i)$ is the coordinate of the lower left corner of module $i$, $w_i$ is the width of module $i$, $h_i$ is the height of module $i$, and $z_i$ is the chip layer where the module is located.
The constraints of floorplanning for 3D IC: 1) modules in the same layer must not overlap; 2) the fixed-outline constraint, which requires the floorplan to fit within a predetermined area; 3) the aspect ratio constraint: this paper considers soft modules, so the width and height of each module may vary within a certain range.
B. DATA STRUCTURE
Non-slicing floorplans are more general and more common, so this paper targets non-slicing floorplans. There are many data structures for representing a floorplan. This paper uses the B*-tree [15] to describe the floorplan, as shown in Fig. 2. Fig. 2(a) shows the floorplan of each layer of the 3D IC, and Fig. 2(b) shows the B*-tree corresponding to each layer. In this description, each B*-tree represents the floorplan of one layer, and the root node represents the module at the bottom left corner of the floorplan. For each node, the left child is the module adjacent to its right side, and the right child is the nearest module above it with the same x coordinate. Applied recursively, this rule transforms a floorplan into a uniquely corresponding B*-tree. Conversely, a B*-tree can be transformed into a uniquely corresponding floorplan.
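The conversion from a B*-tree to module coordinates can be illustrated with a minimal sketch. The `Node` class and `pack` function below are hypothetical names introduced only for illustration; the sketch assumes the usual B*-tree convention (left child to the right of its parent, right child above its parent at the same x) and uses a simple quadratic scan over placed modules instead of an optimized contour structure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    w: float
    h: float
    left: Optional["Node"] = None   # module placed immediately to the right
    right: Optional["Node"] = None  # module placed above, at the same x
    x: float = 0.0
    y: float = 0.0


def pack(root: Node) -> List[Node]:
    """Assign lower-left coordinates to every module of one layer."""
    placed: List[Node] = []

    def place(node: Node, x: float) -> None:
        node.x = x
        # Rest the module on the highest already-placed module whose
        # x-interval overlaps [x, x + w); this plays the role of the contour.
        node.y = max(
            (m.y + m.h for m in placed if m.x < x + node.w and x < m.x + m.w),
            default=0.0,
        )
        placed.append(node)
        if node.left is not None:
            place(node.left, x + node.w)
        if node.right is not None:
            place(node.right, x)

    place(root, 0.0)
    return placed
```

For a two-layer 3D IC, one such tree would be packed per layer, and the layer index of each module is given by the tree it belongs to.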
C. PERTURBATION OPERATION
3D IC floorplanning is a process of searching for the best solution by repeatedly perturbing a random initial solution described by B*-trees. Six types of perturbation operations are used in this paper: intra-layer module exchange, module rotation, module movement, aspect ratio modification, inter-layer module exchange, and inter-layer module movement. Traditional 2D IC floorplanning only performs intra-layer operations, i.e. perturbations within the same B*-tree, so it only includes the first four kinds of perturbation. A 3D IC, however, is a vertical stack of several 2D ICs, so its description consists of several sub-B*-trees, each corresponding to one layer of the 3D IC. The perturbation operations are therefore extended with inter-layer module exchange and movement, which correspond to exchanging and moving nodes between different sub-B*-trees.
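As a rough illustration of how intra-layer and inter-layer perturbations differ, the following sketch operates on a simplified state: each layer is a plain list of module records, and the list order stands in for the position in the sub-B*-tree. This representation, the function name `perturb` and the `locked` parameter (modules that must stay in their layer) are assumptions made for the example, not the data structure used in the paper; only four of the six operations are shown.

```python
import random
from copy import deepcopy


def perturb(layers, rng, locked=frozenset()):
    """Return a perturbed copy of `layers` (a list of lists of module dicts)."""
    new = deepcopy(layers)
    op = rng.choice(("rotate", "swap_intra", "swap_inter", "move_inter"))
    src = rng.randrange(len(new))
    if not new[src]:
        return new  # nothing to perturb in an empty layer
    i = rng.randrange(len(new[src]))
    if op == "rotate":
        # Module rotation: swap width and height.
        m = new[src][i]
        m["w"], m["h"] = m["h"], m["w"]
    elif op == "swap_intra":
        # Intra-layer exchange: swap two modules of the same layer.
        j = rng.randrange(len(new[src]))
        new[src][i], new[src][j] = new[src][j], new[src][i]
    else:
        dst = rng.randrange(len(new))
        if dst == src or new[src][i]["name"] in locked:
            return new  # locked modules never change layer
        if op == "move_inter":
            # Inter-layer movement: move one module to another layer.
            new[dst].append(new[src].pop(i))
        elif new[dst]:
            # Inter-layer exchange: swap modules of two different layers.
            j = rng.randrange(len(new[dst]))
            if new[dst][j]["name"] not in locked:
                new[src][i], new[dst][j] = new[dst][j], new[src][i]
    return new
```

Passing the names of the ''hot'' modules as `locked` would keep them in their layer while all other modules remain free to move between layers.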
D. TEMPERATURE ESTIMATION
The temperature simulation tool HotSpot [16] can estimate temperature accurately, but its computation time is too long. If the temperature simulated by HotSpot were used in each iteration of simulated annealing to guide the temperature-aware floorplanning of the chip, a more accurate solution could be obtained, but the time overhead would be unacceptable. According to the derivation in [13]:

$$T = P \cdot R = P \cdot \frac{t}{k \cdot A} = p \cdot \frac{t}{k} \tag{1}$$

where $T$ denotes the temperature rise, $P$ is the power, $R$ is the thermal resistance of the chip, $t$ is the thickness of the chip, $k$ is the thermal conductivity of the material, $A$ is the area of the module, and $p = P/A$ is the power density of the module. It can be seen from formula (1) that once the material and chip thickness are determined, $t/k$ is a fixed value, so the temperature rise across the chip is proportional to the power density. The power density can therefore be used instead of the steady-state temperature of the chip to guide floorplanning, which reduces the time overhead. This heat model assumes that the chip material is an isotropic continuous medium that conducts heat steadily, from which the 1D heat conduction equation is obtained. In the algorithm, this formula is only used to obtain an approximate temperature value. The temperature of the final layout can then be evaluated with an accurate 3D temperature simulation tool such as HotSpot.
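A minimal numerical sketch of the 1D estimate follows. The function name `temperature_rise` is hypothetical, and the power density value in the usage note is an arbitrary illustrative number; the thickness and conductivity are the silicon parameters listed in the experiment setup.

```python
def temperature_rise(power_density, thickness, conductivity):
    """1D steady-state estimate T = p * t / k.

    power_density in W/m^2, thickness in m, conductivity in W/(m*K);
    the result is the temperature rise in K.
    """
    return power_density * thickness / conductivity


# Illustrative values: 100 um of silicon with k = 149 W/(m*K).
SILICON_THICKNESS = 100e-6
SILICON_CONDUCTIVITY = 149.0

rise_low = temperature_rise(1.0e6, SILICON_THICKNESS, SILICON_CONDUCTIVITY)
rise_high = temperature_rise(2.0e6, SILICON_THICKNESS, SILICON_CONDUCTIVITY)
```

Because $t/k$ is fixed, doubling the power density doubles the estimated rise, which is why ranking modules by power density is equivalent to ranking them by this temperature estimate.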
E. INTRA-LAYER AND INTER-LAYER THERMAL CONSTRAINTS
In this paper, the concept of a ''hot module'' is proposed. Given the set of modules $M = \{m_1, m_2, \ldots, m_n\}$, where the modules have power densities $\{p_1, p_2, \ldots, p_n\}$, the average power density of all modules in the set is obtained as:

$$p_{avg} = \frac{\sum_{i=1}^{n} p_i a_i}{\sum_{i=1}^{n} a_i} \tag{2}$$

where $a_i$ is the area of the $i$-th module and $n$ is the total number of modules. A module whose power density $p_i$ is greater than $p_{avg}$ is defined as a ''hot'' module of the module set. This paper uses two different notions: the ''hot'' modules of the 3D IC and the ''hot'' modules within a layer of the 3D IC. For the former, the module set is the whole 3D IC, while for the latter, the module set is a single layer of the 3D IC.
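The ''hot'' module test can be sketched in a few lines. The sketch assumes that the average power density is area-weighted (total power divided by total area), which is how the definition above is read here; the function names are hypothetical.

```python
def average_power_density(areas, densities):
    """Area-weighted average power density (assumed form of p_avg)."""
    total_power = sum(a * p for a, p in zip(areas, densities))
    return total_power / sum(areas)


def hot_modules(areas, densities):
    """Indices of modules whose power density exceeds the average."""
    p_avg = average_power_density(areas, densities)
    return [i for i, p in enumerate(densities) if p > p_avg]
```

Calling `hot_modules` with the modules of the whole chip gives the ''hot'' modules of the 3D IC, while calling it with the modules of one layer gives the ''hot'' modules within that layer.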
Research on 2D ICs shows that the temperature of a module is related not only to its own power density, but also to the power density of adjacent modules. Modules with high power density may reach high temperatures, while modules with lower power density generally do not. Thus, hotspots with high peak temperature can be avoided in the floorplanning of 3D ICs by treating the ''hot'' modules as the objects of observation and arranging them carefully.
In each layer of the 3D IC, ''hot'' modules should be kept as far apart from each other as possible [13]. Fig. 3 shows a top view of the relationship between intra-layer modules of a 3D IC. The lateral heat transfer from module 1 to module 2 is as follows:

$$Heat_{1 \to 2} = c \cdot (T_1 - T_2) \cdot L \tag{3}$$

where $c$ is the proportionality factor, $T_1$ and $T_2$ are the temperatures of module 1 and module 2 respectively, and $L$ is the length of the boundary shared by the two modules. It shows that the greater the temperature difference and the shared boundary length between adjacent modules, the greater the heat transfer between them. Module 1 transfers heat to module 2 when $T_1 > T_2$; otherwise, module 2 transfers heat to module 1. According to the previous reasoning, the temperature of a module can be approximated by its power density, so with the constant factor $t/k$ absorbed into $c$, equation (3) is equivalent to:

$$Heat_{1 \to 2} = c \cdot (p_1 - p_2) \cdot L \tag{4}$$

If module 1 is adjacent to multiple modules, the total heat transferred by module 1 to all surrounding adjacent modules is:

$$Heat_{1}^{intra} = \sum_{m_i} c \cdot (p_1 - p_i) \cdot L_i \tag{5}$$

where $m_i$ ranges over all modules adjacent to module 1 and $L_i$ is the length of the boundary shared by module 1 and $m_i$. It is not necessary to calculate the total heat transferred by every module to its surrounding modules during floorplanning for 3D ICs. Only the ''hot'' modules may reach a high temperature, so it is only necessary to calculate the total heat transferred by the intra-layer ''hot'' modules of the 3D IC:

$$Heat_{intralayer} = \sum_{m_j \in m_{hot}} Heat_{j}^{intra} \tag{6}$$
where $m_{hot}$ is the set of intra-layer ''hot'' modules of the 3D IC. A 3D IC should also avoid stacking modules with high power density on top of each other in vertically adjacent layers. When two high power density modules in different layers overlap, hot spots with higher temperature may be generated. Fig. 4 shows a top view of the relationship between two modules in different wafer layers of a 3D chip. The heat generated by the two modules in the overlapping area is as follows:

$$Heat_{overlap} = c \cdot (p_1 + p_2) \cdot overlap \tag{7}$$

where $c$ is the proportionality factor, $p_1$ and $p_2$ are the power densities of module 1 and module 2, respectively, and $overlap$ is the overlapping area of the two modules. When module 1 overlaps with multiple modules in different layers, the total heat generated in the overlapping areas is:

$$Heat_{1}^{inter} = \sum_{m_i} c \cdot (p_1 + p_i) \cdot overlap_i \tag{8}$$

where $m_i$ ranges over all modules that overlap vertically with module 1 and $overlap_i$ is the overlapping area of module 1 and $m_i$. The total heat in the overlap areas of all high power density modules across the layers of the 3D IC is:

$$Heat_{interlayer} = \sum_{m_j \in m_{hot}} Heat_{j}^{inter} \tag{9}$$

where $m_{hot}$ here is the set of high power density modules of the 3D IC. For the inter-layer heat constraint, this paper selects these modules in the same way as the 3DFP implementation in [17]: only the module with the highest power density in each layer is selected.
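The two thermal terms of this subsection can be sketched for axis-aligned rectangular modules as follows. The `Block` class and all function names are hypothetical. The sketch assumes that the lateral term is proportional to the power density difference times the shared boundary length, and that the overlap term is proportional to the sum of the two power densities times the overlapping area; the second form in particular is an assumption about how the overlap heat is computed.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x: float
    y: float
    w: float
    h: float
    p: float    # power density
    layer: int


def _span(a0, a1, b0, b1):
    """Length of the overlap of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def shared_edge(a, b):
    """Length of the boundary shared by two abutting blocks (exact compare)."""
    if a.x + a.w == b.x or b.x + b.w == a.x:
        return _span(a.y, a.y + a.h, b.y, b.y + b.h)
    if a.y + a.h == b.y or b.y + b.h == a.y:
        return _span(a.x, a.x + a.w, b.x, b.x + b.w)
    return 0.0


def overlap_area(a, b):
    """Overlapping area of two blocks seen from the top."""
    return (_span(a.x, a.x + a.w, b.x, b.x + b.w)
            * _span(a.y, a.y + a.h, b.y, b.y + b.h))


def heat_intralayer(blocks, hot, c=1.0):
    """Heat passed by the hot blocks to same-layer neighbours."""
    return sum(c * (blocks[i].p - m.p) * shared_edge(blocks[i], m)
               for i in hot for j, m in enumerate(blocks)
               if j != i and m.layer == blocks[i].layer)


def heat_interlayer(blocks, hot, c=1.0):
    """Heat in the regions where hot blocks overlap blocks of other layers."""
    return sum(c * (blocks[i].p + m.p) * overlap_area(blocks[i], m)
               for i in hot for m in blocks
               if m.layer != blocks[i].layer)
```

Here `hot` is a list of block indices: the intra-layer ''hot'' modules for the first function, and, following the selection strategy described above, the highest power density module of each layer for the second. If two selected blocks overlap each other, the pair is counted once from each side, exactly as a literal sum over the selected modules would do.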
F. AREA, WIRE LENGTH AND TSV NUMBERS
The area of the 3D IC is the product of the maximum length and the maximum width over all wafer layers. Wire length is estimated by the widely used half-perimeter wire length (HPWL) for the Manhattan interconnection structure [18]: the HPWL of a net is equal to the half perimeter of its bounding box. If a wire spans multiple layers, the number of TSVs needed to interconnect the wire is the difference between the highest and the lowest layer the wire reaches. When a wire lies in only one layer, no TSV is needed, and the sum of the TSVs required by all wires is the TSV count of the whole 3D IC.
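Both estimates are simple enough to state as a short sketch. The function names are hypothetical, and a net is assumed to be given as a list of pin coordinates (for HPWL) or as the list of layer indices of its pins (for the TSV count); how pin positions are derived from module positions is not specified here.

```python
def hpwl(pins):
    """Half-perimeter wire length of one net from its (x, y) pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def tsv_count(pin_layers):
    """TSVs needed by one net: highest layer minus lowest layer it reaches."""
    return max(pin_layers) - min(pin_layers)


def total_tsvs(nets_layers):
    """TSVs of the whole 3D IC: sum over all nets."""
    return sum(tsv_count(layers) for layers in nets_layers)
```

For example, a net with pins on layers 0 and 1 of a two-layer chip needs one TSV, while a net confined to a single layer needs none.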
III. PROPOSED ALGORITHM
The temperature-aware floorplanning algorithm proposed in this paper is divided into three stages. 1) The first stage is the initialization phase, which reads the basic circuit information, marks the ''hot'' modules of the 3D IC (those whose power density is larger than the average power density of all modules), sorts the circuit modules by area, and performs the initial circuit partition by placing all ''hot'' modules of the 3D IC on the bottom layer of the chip (near the radiator). 2) The second stage partitions the circuit while considering chip area, total wire length and TSVs simultaneously. The circuit modules are assigned to different layers, and at the end of this stage the boundary of the 3D IC floorplan is determined to constrain the next stage. 3) The third stage carries out temperature-aware floorplanning. After the partition of the second stage, the layer of each module is fixed. This stage mainly considers temperature: it searches for a floorplan in which the ''hot'' modules in the same layer are as far apart as possible and the modules with higher power density in different layers (in this paper, the module with the highest power density in each layer) overlap as little as possible, while further optimizing the area and total wire length.
The algorithm flow is as follows:
1) Stage 1: Initialization stage
Step1: Enter the number of layers to be divided of 3D ICs (this paper only discusses two-layer 3D ICs);
Step2: Read the relevant information of a given module: area, power density, wire;
Step3: Find the average power density $p_{avg}$ of all modules, and mark each module with power density $p_i > p_{avg}$ as a ''hot'' module of the 3D IC;
Step4: Sort the circuit modules by area from large to small;
Step5: Place the ''hot'' modules of the 3D IC in different layers in a certain proportion (in this paper, all ''hot'' modules are placed on the bottom layer of the two-layer 3D IC); the remaining non-''hot'' modules are placed one by one into the currently smallest layer according to the greedy selection strategy, and the algorithm enters the next stage;
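The initialization stage can be sketched as follows. This is an illustrative reading of Steps 3 to 5, not the authors' implementation: the function name is hypothetical, layer index 0 is assumed to be the bottom layer near the radiator, the average power density is assumed to be area-weighted, and ''smallest layer'' is interpreted as the layer with the smallest accumulated module area.

```python
def initial_partition(areas, densities, num_layers=2):
    """Greedy initial layer assignment; returns a list of module-index lists."""
    total_power = sum(a * p for a, p in zip(areas, densities))
    p_avg = total_power / sum(areas)

    layers = [[] for _ in range(num_layers)]
    layer_area = [0.0] * num_layers

    # Step 4: visit modules from the largest to the smallest area.
    order = sorted(range(len(areas)), key=lambda i: areas[i], reverse=True)

    # Step 5a: all hot modules go to the bottom layer (index 0, assumed).
    for i in order:
        if densities[i] > p_avg:
            layers[0].append(i)
            layer_area[0] += areas[i]

    # Step 5b: each remaining module goes to the layer with the least area.
    for i in order:
        if densities[i] <= p_avg:
            k = min(range(num_layers), key=lambda j: layer_area[j])
            layers[k].append(i)
            layer_area[k] += areas[i]
    return layers
```

Processing the non-''hot'' modules in descending area order lets the large modules balance the layers first, and the small ones then fill the remaining imbalance.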
2) Stage 2: First simulated annealing (Chip area, total wire length and TSVs are considered simultaneously)
Step1: Initialize the simulated annealing temperature;
Step2: Random floorplanning, save the initial floorplanning results to F and F best , calculate the cost function:
Here, dev(F) is the area deviation.
Step3: Calculate the cost function Cost of floorplanning F, and use the perturbation operations to perturb the solution F; the ''hot'' modules of the 3D IC are not allowed to move or exchange to a different layer;
3) Stage 3: Second simulated annealing (Chip area, total wire length and temperature are considered simultaneously)
After the second stage of floorplanning for 3D IC, a better solution is obtained. At this stage, the temperature of the chip is taken into account, and the algorithm flow is basically the same as the simulated annealing in the previous stage. The difference is that different cost functions are used at this stage:
$Heat_{intralayer}$ is the total heat transferred from the ''hot'' modules in all layers to their surrounding modules. The larger this value, the more heat is transferred from modules with high power density to modules with low power density, the lower the temperature of the ''hot'' modules themselves, and the lower the cost, so the minus sign is used in the formula. $Heat_{interlayer}$ is the total heat in the overlapping areas of ''hot'' modules in different layers. The larger this value, the greater the heat generated in the overlapping areas, the greater the possibility of high temperature, and the higher the cost, so the plus sign is used in the formula. After each iteration, the cost is calculated from the thermal profile of the floorplan to guide the search. The iteration terminates once the SA convergence condition is satisfied or the maximum number of iteration steps is reached.
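The overall shape of one annealing stage can be sketched as below. This is a generic simulated annealing loop with Metropolis acceptance and geometric cooling, not the authors' exact procedure. The weights in `stage3_cost`, the linear form of the cost, and all parameter values (`t0`, `t_min`, `cooling`, `moves_per_t`) are hypothetical; only the signs of the two heat terms follow the description above.

```python
import math
import random


def stage3_cost(area, wirelength, heat_intra, heat_inter,
                weights=(1.0, 1.0, 1.0, 1.0)):
    """Illustrative stage-3 cost: intra-layer heat lowers it, overlap heat raises it."""
    wa, ww, wi, wo = weights
    return wa * area + ww * wirelength - wi * heat_intra + wo * heat_inter


def anneal(initial, cost, perturb, t0=1.0, t_min=1e-3, cooling=0.95,
           moves_per_t=50, rng=None):
    """Generic simulated annealing; returns the best solution and its cost."""
    rng = rng or random.Random(0)
    current = best = initial
    c_cur = c_best = cost(initial)
    t = t0
    while t > t_min:
        for _ in range(moves_per_t):
            candidate = perturb(current, rng)
            c_new = cost(candidate)
            delta = c_new - c_cur
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, c_cur = candidate, c_new
                if c_cur < c_best:
                    best, c_best = current, c_cur
        t *= cooling  # geometric cooling schedule (assumed)
    return best, c_best
```

In a floorplanner, `initial` would be the set of B*-trees produced by the previous stage, `perturb` would apply one of the perturbation operations, and `cost` would pack the trees and evaluate a function such as `stage3_cost`.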
IV. EXPERIMENTAL RESULTS AND ANALYSIS
A. EXPERIMENT SETUP
To evaluate the proposed thermal management mechanism, a typical 3-D floorplanner (3DFP) [17] , [9] that aims at reducing the average on-chip temperature is employed as our baseline.
We used the existing 3DFP tool for temperature estimation, which in turn uses the 3D extension of HotSpot [16], [19] to estimate the 3D on-chip temperature. MCNC benchmarks are used in the block-level evaluations. The algorithm proposed in this paper is implemented in C++ on the basis of 3DFP. The hardware platform is a Linux server with a four-core Intel(R) Xeon E5506 CPU at 2.13 GHz, 2 GB of memory, the RHEL 5.4 operating system and the gcc/g++ 4.1.2 (20080704) compiler.
The circuits used in the experiments are the MCNC standard circuits. To verify the effectiveness of the algorithm, all experiments are carried out on the same hardware platform and with the same power densities. To reduce the effect of the randomness of the simulated annealing algorithm, each circuit is run independently ten times, and the average value is reported.
In order to perform a fair comparison with the baseline, we adopt the same thermal parameters in the experiment, as follows: silicon thickness is 100 µm, silicon thermal conductivity is 149 W/(m·K), silicon heat capacity is 1.75 × 10^6 J/(m^3·K), thermal interface material (TIM) thickness is 20 µm, TIM thermal conductivity is 4 W/(m·K), TIM heat capacity is 4 × 10^6 J/(m^3·K), TSV lateral thermal conductivity is 100 W/(m·K), heat sink thermal conductivity is 200 W/(m·K), and TSV diameter is 10 µm. For simplicity, the elastic mismatch between silicon and copper is neglected.
B. COMPARISON OF WIRE LENGTH, AREA, PEAK TEMPERATURE AND TSV NUMBERS
The results of dividing the MCNC standard circuits into two-layer 3D chips using the proposed algorithm are shown in Table 1. The peak temperature is calculated by the HotSpot [16] tool integrated in 3DFP after floorplanning. It can be seen from Table 1 that, compared with 3DFP, the wire length and area of this scheme increase by 5.175% and 6.85%, respectively, while the peak temperature decreases by 46.4% and the number of TSVs decreases by 39.45%. The heat constraint of 3DFP divides the modules with high power density evenly among the wafer layers, but it cannot guarantee that these modules are not adjacent to each other, so hot spots are still generated when modules with higher power density are in close proximity. In this scheme, the concept of a ''hot'' block is proposed, and all the ''hot'' blocks of the 3D IC are placed on the bottom layer of the chip. This greedy placement strategy does not lead to overheating of the bottom layer, because the bottom layer is close to the radiator and has the best heat dissipation performance, and because the heat transfer between the ''hot'' blocks and their surrounding modules is considered in each layer. This demonstrates the validity of the greedy placement strategy for the ''hot'' blocks and of the intra-layer lateral heat constraints for 3D ICs. Fig. 5 and Fig. 6 show the floorplanning results of ami49. The red parts mark the ''hot'' blocks in each layer of ami49. It can be seen that the ''hot'' blocks in each layer are distributed fairly evenly, and relatively few ''hot'' blocks are adjacent to each other.
3DFP does not consider the impact of TSV numbers on the cost of 3D ICs, but the number of TSVs is considered simultaneously in this scheme. The floorplanning results show that the proposed scheme greatly reduces the number of TSVs.
C. COMPARISON OF TIME OVERHEAD
The time overhead of 3DFP, the scheme using the initial solution of 3DFP and the scheme presented in this paper are compared as shown in Table 2 .
As shown in the fourth column of Table 2, the time overhead of this scheme is 29.85% higher than that of 3DFP, which is acceptable. In each iteration of the simulated annealing algorithm, the lateral heat transfer of the ''hot'' blocks in each layer and the number of TSVs are calculated, which increases the time overhead; since 3DFP takes neither factor into account, this step takes longer than in 3DFP. However, the modules are sorted by area in the initialization stage, and the initial partition is carried out with the greedy selection strategy, so a better initial solution is obtained, which accelerates the convergence of the search. Moreover, this algorithm fixes the ''hot'' blocks of the 3D IC to the bottom layer, which reduces the solution space to be searched by simulated annealing. The initial partition of 3DFP, in contrast, is random. The third column of Table 2 is the time overhead of the simulated annealing search when it starts from the initial solution of 3DFP, i.e. the result of random partition. It can be seen from the third column that the time overhead with the random initial solution is larger, 59.3% higher than that of 3DFP. The initial solution of this scheme therefore reduces the time overhead by 29.45 percentage points compared with the initial solution of 3DFP. Table 2 thus demonstrates the validity of the area sorting and greedy selection strategy used to obtain the initial solution in the proposed scheme.
V. CONCLUSION
This paper proposes a temperature-aware 3D IC floorplanning algorithm and introduces the concept of a ''hot'' block. By placing the ''hot'' blocks of the 3D IC on the bottom layer of the chip (near the radiator) and applying reasonable intra-layer and inter-layer heat constraints, the peak temperature of the 3D IC is effectively reduced. In addition, the number of TSVs is greatly reduced.

He is also the Dean with the School of Electronic Science and Applied Physics, and the School of Microelectronics, HFUT. He has taken charge of many projects (e.g. DFG, National Natural Science Foundation, Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry). He has published a book in Germany and more than 100 journal articles. His research interests include built-in-self-test, design automation of digital systems, ATPG algorithms, and distributed control.
ZHENGFENG HUANG received the Ph.D. degree in computer engineering from the Hefei University of Technology, in 2009. He joined the Hefei University of Technology as an Assistant Professor, in 2004, and became an Associate Professor in 2010. He was a Visiting Scholar with the University of Paderborn, Germany, from 2014 to 2015. He has been a Professor since 2018. His current research interests include design for soft error tolerance/mitigation. He is a member of the Technical Committee on Fault Tolerant Computing of the China Computer Federation. He served on the organizing committee of the IEEE European Test Symposium, in 2014.
