Abstract
Introduction
Thermal issue is of paramount importance in chip performance and reliability modeling. Chip temperature variation may cause timing uncertainty and related failures by changing interconnect resistivity, may increase leakage power by super-linearly increasing gate sub-threshold current, and may shorten the mean time to fail of chip by worsening interconnect electromigration. Accurate thermal analysis is essential in a thermal-aware chip design flow [1] .
Grid-based methods, e.g. finite-difference and finiteelement methods, were successfully used in chip thermal analysis [2, 3] . They can be accelerated by model order reduction techniques [4, 5, 6] ; however, grid-based methods are inefficient in an inner loop, such as in chip floorplanning and placement, compared to Green's function based approaches, which reduce thermal simulation time by orders of magnitude [7, 8, 9] .
For thermal analysis, [10] considered Green's function for various boundary conditions (BCs) and uniform materials; [7] considered the free-space Green's function; [8] , the multi-layer Green's function; and recently [9] , the chip sidewalls. Green's function has frequently occurred in chip parasitics exaction (CPE) [11, 12, 13] . However thermal analysis differs from CPE. For example, in CPE, chip was often assumed horizontally infinite and vertically imposed with Neumann's BCs [13, 14] ; however, in thermal analysis, chip need be considered horizontally finite and vertically imposed with heat convection BCs. Additionally, numerical stability problem can happen in calculating the same Green's function if different formulas are used [12, 15] .
To accurately model a chip, horizontally finite and vertically consisting of layered materials, the paper derives the multi-layer heat conduction Green's function, by integrating the eigen-expansion technique and the classic transmission line theories. The paper offers a logarithmic algorithm, which significantly accelerates the full-chip thermal analysis, compared to traditional Green's function based methods, which are of quadratic complexity and resemble a matrix-vector product procedure [7, 8, 9] .
The rest of the paper is organized as follows. Section 2 mathematically describes the chip thermal analysis issue. Section 3 derives the multi-layer heat conduction Green's function. Section 4 introduces a logarithmic thermal analysis algorithm. Section 5 experimentally demonstrates the accuracy and scalability of the algorithm and discusses the limitations of the traditional single-layer thermal model.
Full-chip thermal analysis based on
Green's function Given a chip, described by the multi-layer thermal model (MLTM) in Fig.1(a) , its steady-state temperature distribution is determined by the 3-D heat conduction equation, written in Cartesian coordinates as
where T denotes temperature (in kelvin, or K); f , heat source power density (in W/m 3 ); and k, material thermal conductivity (in W/(m K)). k is only z-axis dependent, with
In addition, there are two sets of heat conduction boundary conditions (BCs) specified for the chip: sidewall BCs and inter-layer BCs. ; and at the chip top and bottom faces, heat convection BCs are specified:
where h (orh) is the heat transfer rate from the bottom (or top) face z 0 (or z n ) to the ambient, with units W/(m 2 K). The paper chooses the ambient temperature as the reference. Heat conduction Green's function G(x, y, z|x , y , z ), i.e. the Green's function of (1), corresponds to the chip temperature distribution caused by a Dirac delta source
and the sidewall and inter-layer BCs, in which T need be replaced by G. Given G(x, y, z|x , y , z ), the chip temperature distribution can be represented by a spatial convolution:
where V denotes the entire volume space of the given chip.
The following section will detail the derivation of the heat conduction Green's function for 3-D layered chips.
Deriving heat conduction Green's function
After extended from space V to the entire 3-D space as an even and periodic function of x of period 2X, as well as an even and periodic function of y of period 2Y , heat conduction Green's function has an eigen-expansion:
where eigenfunction φ i j (x, y) = cos iπx X cos jπy Y . Eigenexpansion (4) ensures G satisfies the sidewall BCs.
Similarly, the 3-D delta function has an eigen-expansion:
where
for the three cases that both i and j are zero, only one of them is zero, and none of them is zero.
Insert (4) and (5) into (2). Then G i j satisfies
To solve G i j from (6), the paper employs the classic transmission line theories.
Eigen-expansion coefficient
Consider (6) . When i = j = 0, γ i j becomes 0. To compute G 00 , the paper constructs a circuit consisting of an n-section line conductor of per unit-length (PUL) conductance k(z). The line has a current source input I s of intensity c 00 φ 00 (x , y ) at location z and has its two ends terminated by two resistors of resistances R = 1/h andR = 1/h, as shown in Fig.1(b) .
Compare (6) with the line conductor circuit equations
and
δ(z − z ), and also heat conduction BCs with circuit equations at line section boundaries. It is clear that G 00 corresponds to the voltage at location z in the line conductor. Hence,
where H 00 is the line transfer impedance from source z to target z. According to Fig.1(b) , assuming that source z is in layer p and target z is in layer q, H 00 is derived:
In the paper, for the m-th line section, Z m denotes the input impedance seen from its bottom boundary toward the bottom end of the entire circuit,Z m denotes the input impedance seen from its top boundary toward the top end of the entire circuit, Z m denotes its characteristic impedance with Z m = 1/k m , and l m is its length. 
, where γ is the TL propagation constant given by γ = (R + sL)(G + sC), and Z is the characteristic impedance given by Z = (R + sL)/(G + sC). Evidently, G i j corresponds to the voltage at location z in an n-section nonuniform TL of γ = γ i j and Z = 1/k(z), with a current source input I s = c i j γ i j φ i j (x , y ) at location z . For the m-th TL section, its PUL parameters satisfy
The two ends of the TL are terminated by two resistors: R = γ/h andR = γ/h. The equivalent circuit is illustrated by Fig.1(b) as well.
Based on the figure, G i j is derived and given by
where H i j is the normalized TL transfer impedance from location z to location z by propagation constant γ, with
Assuming that source z is in layer p and target z is in layer q, H i j is derived as the following:
Input impedances Z 's orZ's have recurrence formulas: for one TL section, its input impedance at its one boundary, Z in is given by Z in = ZC ZL+ZC tanh γL ZC+ZL tanh γL , where ZL is the load impedance at its other boundary; ZC, the section characteristic impedance; and L, the section length.
The paper has derived closed formulas for the multilayer heat conduction Green's function, by integrating the eigen-expansion technique and the classic transmission line theories. For other types of sidewall BCs than Neumann's BCs, the previous derivation can still be applied. For example, if Dirichlet BCs are imposed on chip sidewalls, i.e. assume the sidewall temperatures are same as the ambient, G i j can be obtained by following the previous derivation, after changing the eigenfunction to φ i j (x, y) = sin iπx X sin jπy Y . In addition, relating G i j to circuit transfer functions in the closed formulas can overcome numerical problems. For example, when heat transfer rate h orh is zero or closed to zero, the loading impedances for the equivalent circuits become infinite or extremely large; therefore, instead of using impedance formulations, using admittance formulations for G i j can avoid numerical overflows. 
A logarithmic thermal analysis algorithm
Based on the derived multi-layer heat conduction Green's function, this section introduces a logarithmic algorithm for full-chip thermal analysis.
Consider the heat source model. The paper discretizes the heat sources in a layer, e.g. the p-th layer, into A × B uniform cells, each being X A × Y B × (z p2 − z p1 ) and having a uniform power density, as shown in Fig.2(a) and (b) . In a given layer, the paper names cell (a, b) after the cell that is the For cell (a, b) , its power density is denoted by f ab , and its temperature is denoted by T ab .
The temperature distribution of a given target layer, e.g. layer q, can be obtained by superposing the temperature distribution raised by each layer of heat sources. Therefore, the paper considers the evaluation of temperature distribution at layer q, raised by the heat sources at a single layer, e.g. layer p. Target layer q is illustrated in Fig.2(a) and (c). To compute the temperature distribution in layer q caused by heat source layer p, traditional Green's function based methods used a simple matrix-vector product like procedure and required O A 2 B 2 computations [7, 8, 9] . Compared to traditional methods, the algorithm presented in the following is of logarithmic complexity O(ABlog(AB)).
Temperature distribution caused by layer p
Refer to Fig.2(b) . Insert eigen-expansion (4) into (3) and integrate with heat source power densities for layer p. Then the temperature at location (x, y, z), T (x, y, z) is derived:
where 
A logarithmic thermal analysis algorithm
Refer to Fig.2(c) . For cell (a, b) in target layer q, its average temperature T ab can be obtained by integrating T (x, y, z) in (11) over cell (a, b):
, (13) where IH i j is given by
Then the series representation of T ab in (13) is truncated:
, (15) which is the two-dimensional inverse discrete cosine transform (2-D IDCT) of F i j IH i j . Note that to improve accuracy, higher order coefficients F i j IH i j can be added to their lower order companions, by exploiting the periodicity of
; however, later experiments show that (15) is already sufficiently accurate.
Based on (12) and (15), a logarithmic algorithm is proposed for computing the temperature at layer q caused by heat sources in layer p. The algorithm is in O(ABlog(AB)) and shown in Fig.3. 
The pre-characterization of IH i j
For a given chip, IH i j 's need be pre-characterized only once. The following details the computing of IH i j . LetĤ i j denote the integral term in (14), i.e.
z p1 H i j (z|z )dz dz. Note that for H i j given in (8) and (10) , the paper has assumed that either layer p is lower than layer q, or both p = q and z < z. When the assumption is not satisfied, H i j can be obtained by exploiting the reciprocity of transfer impedances: if p = q and z > z, H i j can be obtained by exchanging z and z in (8) and (10); otherwise, if p > q, H i j can be obtained by exchanging subscripts p and q, as well as z and z in (8) and (10) . Therefore the paper considers three cases.
a). The case p = q: From (8) and (10),Ĥ i j is obtained as follows:
, l pv = z p2 − z p1 , and l qv = z q2 − z q1 . b). The case p < q: From (8) and (10),Ĥ i j is obtained: 
According to Fig.1(b) , D i j is the reflection coefficient of the p-th TL section, seen from its bottom boundary toward the bottom end of the entire circuit, andD i j is the reflection coefficient of the q-th TL section, seen from its top boundary toward the top end of the entire circuit. Since the coefficients D i j ,D i j , and E i j depend upon only a single parameter γ, they can be pre-characterized into one-dimensional lookup tables to facilitate the pre-characterization of IH i j .
Experimental results
The logarithmic algorithm shown in Fig.3 has been verified by comparisons with a computational fluid dynamics tool (FLUENT). Fig.4(a) shows a simulated chip example of three layers. The top heat transfer rateh = 0, and the bottom heat transfer rate h = 9000 W/(m 2 K). Six heat sources are in the 5 µm thick heat source region, and the position and the power of each heat source are in Fig.4(b) .
In using the algorithm, A and B were set to 40. However it is recommended that A and B be the powers of 2, to facilitate the DCT/IDCT. The results are shown in Fig.5 , where the left graphs give the temperature distributions obtained from the algorithm, and the right graphs show the relative temperature differences from FLUENT in percentages. The temperature deviations between the two methods are within 1.13% for the heat source region and within 0.07% for the chip bottom. The pre-characterization of IH i j took 1.265 s, and temperature evaluation after pre-characterization took only 8 ms; however, using FLUENT took 112 s to obtain the temperature distribution. The CPU usages were taken from a SUN Blade 1500 machine. Note that in practice, A and B can be doubled to verify the convergence of the algorithm. 
Scalability of the proposed algorithm
Fig .6 shows the calculated heat source region temperature distribution by the proposed algorithm, for the chip in Fig.4(a) , under a randomly generated power density function f . For comparison, a simple matrix-vector product procedure was implemented to simulate the traditional methods [7, 8, 9] , The table in Fig.6 shows the CPU usages of the proposed algorithm (excluding the time for precharacterization, as it need be done only once) and the traditional methods, when A × B was increased from 40 × 40 to 120 × 120. Evidently, the proposed algorithm has superior speed due to its logarithmic complexity, compared to the traditional methods, whose complexities are quadratic. 
Limitations of the single-layer thermal model
Traditionally, a single-layer thermal model (SLTM) was used to describe a multi-layer chip [7, 9] . Fig.7(a) shows such a single-layer thermal model, where h e is the effective heat transfer rate that can be determined by the approach in [2] , and ET T is named the effective thermal thickness.
Using the chip in Fig.4 , the paper employed the proposed algorithm to analyze the average temperature distribution of the heat source region, by using the SLTM. Fig.7(b) plots the maximum errors of the calculated temperature distribution based on the SLTM, when ET T was varied. The figure shows that the accuracy of the SLTM was very sensitive to ET T . For the chip in Fig.4 , to ensure the temperature errors within 1.97%, both the bulk and solder regions need be modeled, which is, however, beyond the capability of the SLTM. Hence, there are obstacles in using the SLTM for 3-D ICs [16] .
Conclusions
By integrating the eigen-expansion technique and the transmission line theories, the paper derived the multi-layer heat conduction Green's function, based on which a logarithmic full-chip thermal analysis algorithm was introduced. Experiments demonstrated that the algorithm was very accurate and well-scalable, compared to FLUENT and traditional Green's function based methods. In addition, the paper discussed the limitations of the traditional single-layer thermal model.
