Thermal modeling and analysis of advanced 3D stacked structures  by Vaddina, Kameswar Rao et al.
Procedia Engineering 30 (2012) 248 – 257
1877-7058 © 2011 Published by Elsevier Ltd.
doi:10.1016/j.proeng.2012.01.858





 
International Conference on Communication Technology and System Design 2011 
Thermal modeling and analysis of advanced 3D stacked 
structures 
Kameswar Rao Vaddina^a, Amir-Mohammad Rahmani^a, Khalid Latif^a, Pasi Liljeberg^b, Juha Plosila^b, a*
^a Turku Center for Computer Science (TUCS), Finland
^b Department of Information Technology, University of Turku
Abstract 
Emerging three-dimensional integrated circuits (3D ICs) offer a promising solution to mitigate the barriers of
interconnect scaling in modern systems. They also provide greater design flexibility by allowing heterogeneous
integration. However, 3D technology exacerbates on-chip thermal issues and increases packaging and cooling
costs. In this work, a 3D thermal model of a stacked system is developed and thermal analysis is performed
under different workload conditions using finite element simulations. Steady-state heat transfer analysis of
the 3D stacked structure has been performed in order to analyze the effect of variation of die power
consumption, with and without hotspots, on the temperature in different layers of the stack. We have
also investigated the effect that the interaction of hotspots has on peak temperature.
 
© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of ICCTSD 2011 
 
Keywords: Thermal analysis; thermal modeling; 3D networks-on-chip; hotspots; thermal management; stacked ICs.
1. Introduction 
As technology scales down and power density increases, factors such as power dissipation, leakage,
data activity and electromigration contribute to higher temperatures, larger temperature cycles
and increased thermal gradients, all of which impact multiple failure mechanisms [1]. This increase in
temperature increases interconnect delay because electrical resistivity rises linearly with temperature.
These delay variations pose significant reliability problems for already dense interconnect structures.
In order to overcome the problems associated with interconnects and the limits posed by traditional CMOS
scaling, three-dimensional (3D) integrated circuits have been proposed. 3D integrated circuits take
 
* Pasi Liljeberg:
E-mail address: vadrao@utu.fi. 
Open access under CC BY-NC-ND license.
advantage of a dimensional scaling approach and are seen as a natural progression towards future large and
complex systems. They increase device density, bandwidth and speed. On the other hand, due to
increased integration, the amount of heat per unit footprint increases, resulting in higher on-chip
temperatures and thereby degrading the performance and reliability of the system. In this case, heat sinks
need to be very efficient in transferring the internally generated heat to the ambient. Although there is a
dearth of design and layout tools for 3D technology, a significant amount of effort is going in that
direction.
The ever-expanding market for consumer electronics is driving innovation in packaging technology,
leading to newer packages that are smaller, more thermally efficient and cost-effective at the same time.
The technology related to wafer level packaging and 3D integration has recently outpaced ITRS roadmap
forecasts [1]. One of the fastest growing packaging architectures is wafer level packaging (WLP). It offers
lower cost, improved electrical performance, lower power requirements and smaller size. Although several
architectural variations are available, in this paper we discuss only flip-chip packaging. The ITRS report
projects that the power density for the 14 nm technology node will be greater than 100 W/cm² and that the
junction-to-ambient thermal resistance will be less than 0.2 °C/W. It is very important to keep the thermal
resistance low, as a high thermal resistance may increase the package cost and the overall cost of the product.
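
To make the role of the junction-to-ambient thermal resistance concrete, the steady-state junction temperature can be estimated with the standard lumped relation T_j = T_a + R_ja · P. The short Python sketch below is only illustrative: the function name and the 45 °C ambient temperature are assumptions and do not come from this paper, while 0.2 °C/W is the ITRS bound quoted above and 200 W is the total system power used later in the simulations.

```python
def junction_temperature(ambient_c, r_ja_c_per_w, power_w):
    """Lumped steady-state estimate: T_j = T_a + R_ja * P."""
    return ambient_c + r_ja_c_per_w * power_w


# Assumed (hypothetical) ambient of 45 C; R_ja and power are taken from the text.
t_j = junction_temperature(ambient_c=45.0, r_ja_c_per_w=0.2, power_w=200.0)
print(f"Estimated junction temperature: {t_j:.1f} C")
```

Under these assumptions the temperature rise above ambient is the product of the thermal resistance and the dissipated power, which is why a small change in thermal resistance matters at high power levels.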
 
Xu et al. [3][4] have performed thermal modeling of multicore systems and investigated the
effects of CPU power level, local hotspot power density, hotspot location and hotspot size on thermal
performance. However, they stopped short of extending their work to 3D multicore systems. Jain et al. [6]
have proposed analytical and numerical models of the thermal performance of three-dimensional integrated
circuits. In this paper we have chosen to model a 3D multicore system in a modern flip-chip package,
which is used mostly for high-performance processors. We started our study with thermal modeling
of a multicore processor and investigated the effects of hotspots and their locations on the thermal
performance of the package. We then proceeded to 3D multicore systems. Due to lack of
space, only results pertaining to the 3D modeling are presented in this paper.
 
We provide a brief description of a modern flip-chip package in Section 2, describe different workload
conditions for 3D stacked systems in Section 3, introduce our thermal model and analysis in Section 4,
describe the modeling of the interlayer material in Section 5 and provide simulation results in Section 6.
2. Flip-Chip Package 
Although IBM's Ball Grid Array packages have been in use since the 1970s, recent advances in
packaging technology have led to Flip-Chip Ball Grid Array (FCBGA) packages being extensively used.
FCBGA allows for a much higher pin count than other package types by distributing the input-output
signals across the entire die rather than confining them to the chip periphery. In an FCBGA the die is
mounted upside-down (flipped) and connects to the package balls (lead-free solder bumps) via a package
substrate.
The cross-sectional view of a modern 3D flip-chip package is shown in Fig.1. The primary consideration
for such a package is its ability to transfer heat from the silicon die to the ambient. Unlike traditional
wire-bonding technology, the electrical connection of a face-down (or flipped) integrated circuit to the
substrate is made with conductive bumps on the chip bond pads. The conductive bumps are initially
deposited on the top side of the die during the fabrication process. The die is then flipped
over so that its top side faces down, and aligned with the matching pads on the substrate. The solder is
then reflowed to complete the interconnection. The advantages of flip-chip interconnect include reduced
signal inductance, reduced power/ground inductance and a smaller package footprint, along with higher
signal density [8].
3. Different workload conditions for 3D stacked systems 
In this work we have studied the thermal behavior of three example workload conditions for 3D stacked
network-on-chip (NoC) processors. They are as follows:
a) Static workload (Static): Assuming that the total system power is 200W, each individual die
consumes around 66.66W, which is one third of the total power consumption. That is, all the dies
in this 3D stacked chip setup consume an equal amount of power.
b) Adaptive workload (Adaptive): In a typical 3D stacked system, the maximum thermal
conduction usually takes place from the die which is closer to the heat sink. That particular die
also has lower junction temperature and thermal resistance. Hence, we take advantage of these
features and analyze a setup in which we assume that most of the switching/routing activity is
herded towards the die closer to the heat sink. In [9], Chao et al. have proposed a traffic- and
thermal-aware run-time thermal management scheme using proactive routing towards the die
closer to the heat sink in order to ensure thermal safety. Rahmani et al. [10] have proposed a
hybrid NoC-bus architecture and a hybrid routing strategy along similar lines in order to
mitigate thermal issues. Apart from assuming that most of the routing takes place in the die
closer to the heat sink, we also assume that most of the tasks are migrated to the cores
present on this die, thereby keeping it busy and active all the time. This is one of the
thermal-aware job allocation and scheduling schemes. Because all this routing and task
migration happens in the die closer to the heat sink, that die consumes more power than the
other two dies. In this thermal model we assume that DIE-3 (the die closer to the heat sink)
consumes the most power, with DIE-2 consuming around 40% less and DIE-1 around 60% less
power than DIE-3. So, assuming that the total system power is 200W, DIE-3 consumes around
100W, DIE-2 around 60W and DIE-1 around 40W.
c) Adaptive workload with a hotspot (Adaptive_hotspot): This thermal model is similar to the
above adaptive routing with task migration (Adaptive) model. Here, however, we analyze the
effect of a hotspot which we assume is created in the die closer to the heat sink due to the
adaptive routing and task migration.
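
The power budgets of the workload conditions above can be written down as a small Python sketch. The dictionary and function names are illustrative assumptions; only the 200W total and the 66.66W / 100W / 60W / 40W splits come from the text. The Adaptive_hotspot case is listed with the same per-die split as Adaptive, since it differs only in the presence of a hotspot on DIE-3.

```python
TOTAL_POWER_W = 200.0

# Fraction of the total power assigned to (DIE-1, DIE-2, DIE-3) in each case.
WORKLOADS = {
    "Static": (1 / 3, 1 / 3, 1 / 3),
    "Adaptive": (0.2, 0.3, 0.5),
    "Adaptive_hotspot": (0.2, 0.3, 0.5),
}


def die_powers(total_w, fractions):
    """Return the power in watts of DIE-1, DIE-2 and DIE-3."""
    return tuple(total_w * f for f in fractions)


for name, fractions in WORKLOADS.items():
    p1, p2, p3 = die_powers(TOTAL_POWER_W, fractions)
    print(f"{name}: DIE-1 = {p1:.2f} W, DIE-2 = {p2:.2f} W, DIE-3 = {p3:.2f} W")

# Relative power of DIE-2 and DIE-1 with respect to DIE-3 in the Adaptive case.
adaptive = die_powers(TOTAL_POWER_W, WORKLOADS["Adaptive"])
reduction_die2 = 1 - adaptive[1] / adaptive[2]
reduction_die1 = 1 - adaptive[0] / adaptive[2]
```

In the Adaptive split, `reduction_die2` and `reduction_die1` express how much less power DIE-2 and DIE-1 draw relative to DIE-3.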
Fig. 1. Cross-sectional view of a modern flip-chip package.
4. Thermal Modeling and Analysis 
The high operating temperature of a semiconductor device, caused by the combination of device power
density and ambient conditions, is an important reliability concern. Sudden high temperature rises in
the devices can cause catastrophic failure, as well as long-term degradation of the chip and
package materials, both of which may eventually lead to system failure [8]. Most modern flip-chip
devices are designed to operate reliably with a junction temperature within a certain range. To
ensure that the package performs well thermally within this range, a thermal model is simulated and
tested. This thermal model can then be used to gauge the reliability of the package. This shortens the
package development time and also provides an important analytical tool to evaluate its performance
under different operating conditions.

We have developed a thermal model of the modern flip-chip package using a commercial tool called
COMSOL, a finite element based multiphysics modeling and simulation software. Our simulations are
based on the heat transfer module of the COMSOL Multiphysics package. Silicon dies 1, 2 and 3 each
measure 20 mm x 20 mm x 0.6 mm and are mounted on a substrate of size 50 mm x 50 mm x 1.44 mm.
The layers of silicon die are separated by an interlayer material whose thickness is around 0.02 mm.
The cup lid, which acts as the heat spreader and has a very high thermal conductivity, is placed on
top of the silicon die. The thermal interface material (TIM1), a thermal grease with very good adhesive
properties, is used as the filler material between the heat spreader and the silicon die. The heat sink
base has a size of 100 mm x 100 mm x 5 mm. A vapor chamber is used as the heat sink base; the detailed
assumptions can be found in [3]. Instead of including the heat sink fins in our computational model,
we have used an effective heat transfer coefficient (h_eff) as a boundary condition on the heat sink [4].

Other assumptions related to the geometry of the package and its components, the material properties
(such as thermal conductivity, density and specific heat capacity) and the boundary conditions are
obtained from the literature [1][2][3][4]. Some important model configuration parameters are listed in
Table 1. The parameter Q, which is the heat generated per unit volume, is applied to the silicon
die. The boundary condition for the substrate layer is assumed to be convective and the sides of the
package are assumed to be adiabatic.
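
As a rough illustration of how the heat source Q relates to the die power, the following Python sketch divides the die power by the die volume. It assumes, as a simplification that is not stated in the paper, that the power is dissipated uniformly throughout the full 20 mm x 20 mm x 0.6 mm die volume; the function and constant names are hypothetical.

```python
def volumetric_heat_source(power_w, length_m, width_m, thickness_m):
    """Heat generated per unit volume, Q = P / V, in W/m^3."""
    return power_w / (length_m * width_m * thickness_m)


# Die dimensions quoted in the text, converted to metres.
DIE_LENGTH_M = 20e-3
DIE_WIDTH_M = 20e-3
DIE_THICKNESS_M = 0.6e-3

# Static case: each die dissipates one third of the 200 W total.
q_static = volumetric_heat_source(200.0 / 3, DIE_LENGTH_M, DIE_WIDTH_M, DIE_THICKNESS_M)
print(f"Q per die in the Static case: {q_static:.3e} W/m^3")
```

With these assumptions the die volume is 2.4e-7 m³, so Q for a die in the Static case is on the order of 10^8 W/m³.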
5. Modelling interlayer material   
Three effective thermal conductivities are used, for the lead solder bumps/underfill layer, the substrate
layer and the interlayer material (ILM) respectively. The interlayer material between the silicon dies is
modeled as a homogeneous layer in our thermal model. Usually, TSVs have much lower thermal
resistance than the silicon dies, which helps immensely in heat conduction. We assumed a uniform
through-silicon-via (TSV) distribution on the die and obtained the effective interlayer material resistivity
based on the TSV density (d_TSV) values [2], where d_TSV is the ratio of the total TSV area overhead to
the total layer area. Coskun et al. [2] have observed that even when the TSV density reaches 1-2%, the
temperature profile of the silicon die changes by only a few degrees, thus justifying the use of a
homogeneous TSV density in our thermal model. According to current TSV technology [7], the diameter
of each via is 10µm, and the spacing required around the TSVs is assumed to be around 10µm [2]. For our
experiments we have assumed around 8 vias/mm², that is around 3200 vias spread across the 400 mm² area
of the silicon die. Hence the TSV density is around 0.062% and the resistivity of the interlayer material
is around 0.249 m·K/W (i.e. thermal conductivity = 4.016 W/(m·K)) [2].
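
The numbers in this paragraph can be checked with a few lines of Python. The sketch assumes that d_TSV counts only the circular cross-section of each via and excludes the surrounding spacing; this is an interpretation that happens to reproduce a value close to the quoted one, not something the text states. The mapping from TSV density to resistivity is taken from [2] and is not reproduced here; only the reciprocal relation between resistivity and conductivity is computed.

```python
import math

VIA_DIAMETER_MM = 10e-3      # 10 um via diameter
VIAS_PER_MM2 = 8             # assumed via count per mm^2 (from the text)
DIE_AREA_MM2 = 20.0 * 20.0   # 400 mm^2 die footprint

n_vias = VIAS_PER_MM2 * DIE_AREA_MM2
via_area_mm2 = math.pi * (VIA_DIAMETER_MM / 2) ** 2

# TSV density: total via area divided by the total layer area.
d_tsv = n_vias * via_area_mm2 / DIE_AREA_MM2

# Thermal conductivity is the reciprocal of thermal resistivity.
ILM_RESISTIVITY_MK_PER_W = 0.249
k_ilm = 1.0 / ILM_RESISTIVITY_MK_PER_W

print(f"Number of vias: {n_vias:.0f}")
print(f"TSV density: {d_tsv * 100:.4f} %")
print(f"ILM thermal conductivity: {k_ilm:.3f} W/(m.K)")
```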
 
6. Simulation results 
We have built a generic three-die stack in a flip-chip package using COMSOL and simulated three
different scenarios (Static, Adaptive and Adaptive_hotspot) as described in Section 3. In the Static case
all three dies in the flip-chip package consume an equal amount of power. In both the Adaptive and
Adaptive_hotspot cases, DIE-2 consumes around 40% less power and DIE-1 around 60% less
power than DIE-3. So, assuming that the total power consumption of the system is 200W, in
the Static case each die consumes around 66.66W, whereas in both the Adaptive and Adaptive_hotspot
cases DIE-3 consumes around 100W, DIE-2 around 60W and DIE-1 around 40W.
  
Fig. 2. Slice plot of the thermal model in the Static case. P = 200W, P_die1 = P_die2 = P_die3 = 66.66W.
Fig. 3. Subdomain plot of the thermal model in the Static case. P = 200W, P_die1 = P_die2 = P_die3 = 66.66W.
 
Due to the adaptive routing and task migration happening in DIE-3, we assume that a hotspot is created at
the center of the die, and we analyze the thermal behavior of the system in the Adaptive_hotspot case.
Guoping Xu [4] varied the size of the hotspot from 0.5 mm to 2 mm in his work on the thermal
modeling of multicore systems. In our work the power density of the hotspot generated at
the center of DIE-3 in the Adaptive_hotspot case is fixed at 100 W/cm² and its dimensions are fixed at
1mm x 1mm x 0.6mm. We have performed steady-state heat transfer analysis on the flip-chip
package. In the steady state, the heat generated by the three dies is equal to the heat leaving the flip-chip
package. For the analysis we have assumed that power is gradually applied to the chip until
the chip reaches its maximum working temperature (i.e. steady state).
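
For orientation, the hotspot parameters can be converted into an absolute power and compared with the average power density of DIE-3. The Python sketch below uses only the figures quoted above (100 W/cm², a 1 mm x 1 mm footprint, 100W on a 20 mm x 20 mm die); the variable names are illustrative, and whether the hotspot power is counted inside or on top of the 100W die budget is not stated in the text, so it is left open here.

```python
HOTSPOT_POWER_DENSITY_W_CM2 = 100.0
HOTSPOT_SIDE_CM = 0.1            # 1 mm
DIE_SIDE_CM = 2.0                # 20 mm
DIE3_POWER_W = 100.0             # DIE-3 power in the Adaptive cases

# Power dissipated inside the hotspot footprint.
hotspot_power_w = HOTSPOT_POWER_DENSITY_W_CM2 * HOTSPOT_SIDE_CM ** 2

# Average power density of DIE-3 over its whole footprint.
die3_avg_density_w_cm2 = DIE3_POWER_W / DIE_SIDE_CM ** 2

density_ratio = HOTSPOT_POWER_DENSITY_W_CM2 / die3_avg_density_w_cm2

print(f"Hotspot power: {hotspot_power_w:.2f} W")
print(f"DIE-3 average power density: {die3_avg_density_w_cm2:.1f} W/cm^2")
print(f"Hotspot to average density ratio: {density_ratio:.1f}")
```

The hotspot is therefore small in absolute power compared with the die budget but several times denser than the die average, which is consistent with it producing a localized temperature increase.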
 
Slice and subdomain plots of the simulated thermal model for the Static case, in which the total system
power consumption is 200W, are shown in Fig.2 and Fig.3 respectively. For the sake of brevity we do not
present the slice and subdomain plots for the other cases. The peak temperatures on all three
dies at steady state for the three cases are shown in Fig.4, Fig.5 and Fig.6 respectively and tabulated
in Table II. The peak temperature curves are plotted along the X-axis of the dies. It can be observed from
these curves that the temperature is highest at the center of the die and decreases towards the edges due
to convection. We have also tabulated the steady-state peak temperatures on all three dies for the
cases where the total power consumption of the system is 100W. They are shown in Table III. The
hotspot parameters in the 100W system are the same as in the 200W system.
 
1. Static case analysis: In the Static case, all the dies consume an equal amount of power, generate
an equal amount of heat at the same time and have almost the same thermal resistance, so the
only direction in which the heat can flow is towards the heat sink and the ambient. The
proximity of DIE-3 to the heat sink makes it dissipate more heat than the other two dies. Hence
the die closest to the heat sink (DIE-3) is the coolest, the die farthest from the heat sink
(DIE-1) is the hottest, and the die sandwiched between them (DIE-2) has a temperature
somewhere in between. This phenomenon can be observed in all three simulation runs we have
conducted.
 
2. Adaptive case analysis: In the Adaptive case it can be clearly seen that the peak temperature on
DIE-3 is the same as in the Static case despite it consuming around 50% more power (100W
instead of 66.66W). DIE-2 and DIE-1 in this case consume around 40% and 60% less power
than DIE-3. Despite the dramatic power reductions on DIE-1 and DIE-2 and the herding of tasks
towards DIE-3, there is minimal impact on the steady-state peak temperatures of the three
dies when compared to the Static case, where all the dies consume an equal
amount of power. This is because DIE-3 consumes more power and therefore generates more heat
than the other two dies. Hence heat flows not only towards
the heat sink, but also towards the dies which are cooler than DIE-3 at any given time.
By the time steady state is reached, the system attains thermal equilibrium by
dissipating heat from the die generating more to the dies generating less and to the ambient via
the heat sink. Hence one does not see the anticipated reduction in peak temperatures on DIE-2
and DIE-1. This is a significant result, as researchers nowadays are considering moving
tasks and routing more data in the layer closer to the heat sink as a means to
address the thermal challenges of 3D stacked systems.
     
  
Fig. 4. Peak temperatures on all three dies in the Static case. P = 200W, P_die1 = P_die2 = P_die3 = 66.66W.
Fig. 5. Peak temperatures on all three dies in the Adaptive case. P = 200W, P_die1 = 40W, P_die2 = 60W, P_die3 = 100W.
Fig. 6. Peak temperatures on all three dies in the Adaptive_hotspot case. P = 200W, P_die1 = 40W, P_die2 = 60W, P_die3 = 100W, Pd_hotspot = 100 W/cm².
 
3. Adaptive_hotspot case analysis: Since the Adaptive case does not show much reduction in peak
temperatures when compared to the Static case, we have experimented further with the presence
of a hotspot in DIE-3, which we assume is created by the excessive routing and herding of
tasks towards it. Even then, the peak temperatures on DIE-2 and DIE-1 are
not very different from those in the Static and Adaptive cases. On DIE-3 itself we have noticed a
slight increase in peak temperature, which in this case is the temperature of the hotspot.
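
A back-of-the-envelope estimate gives some feel for why redistributing power among the dies changes the inter-die temperature differences so little. The Python sketch below is a hypothetical one-dimensional model, not the finite element model used in this work: it assumes that all heat flows upward to the heat sink, and it ignores the thermal resistance of the silicon itself, lateral spreading and the path through the substrate. It uses only the interlayer thickness (0.02 mm), the effective conductivity (4.016 W/(m·K)) and the die footprint quoted earlier.

```python
# Values quoted in this paper.
ILM_THICKNESS_M = 0.02e-3        # interlayer thickness, 0.02 mm
ILM_CONDUCTIVITY_W_MK = 4.016    # effective interlayer conductivity
DIE_AREA_M2 = 20e-3 * 20e-3      # 20 mm x 20 mm die footprint

# One-dimensional conduction resistance of one interlayer, R = t / (k * A).
r_ilm = ILM_THICKNESS_M / (ILM_CONDUCTIVITY_W_MK * DIE_AREA_M2)


def interlayer_drops(p_die1_w, p_die2_w):
    """Temperature drops (K) across the DIE-1/DIE-2 and DIE-2/DIE-3 interlayers,
    assuming (hypothetically) that all heat flows upward to the heat sink."""
    drop_12 = p_die1_w * r_ilm
    drop_23 = (p_die1_w + p_die2_w) * r_ilm
    return drop_12, drop_23


static_drops = interlayer_drops(200.0 / 3, 200.0 / 3)
adaptive_drops = interlayer_drops(40.0, 60.0)

for name, drops in (("Static", static_drops), ("Adaptive", adaptive_drops)):
    print(f"{name}: total interlayer drop from DIE-1 to DIE-3 = {sum(drops):.2f} K")
```

In this simplified picture the total power reaching the heat sink is 200W in both cases, and the interlayer contribution to the DIE-1 to DIE-3 temperature difference changes by less than one kelvin between the Static and Adaptive splits. The sketch is only meant to illustrate the trend; the simulated temperatures reported in this section come from the full three-dimensional model.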
 
A. Interaction between Hotspots 
 
It is not unusual for typical chip stacks to have more than one hotspot active at the same time.
Those hotspots could be active in the same die or in different dies simultaneously.
Hence, exploring the interaction between those hotspots is of utmost importance and can lead to
interesting conclusions. Fig.7 shows the interaction of two hotspots on the same die (DIE-3). It has been
obtained by fixing one hotspot at the center of the die and varying the location of the other. The
variable 'd' in the plot is the distance between the centers of the two hotspots. For the sake of
comparison and clarity, we have also included a temperature plot with a single hotspot at the center of the
die in Fig.7. In this study, we have modeled a 3D stacked system whose overall power consumption is
200W, with each die consuming around 66.66W. The two hotspots have the same dimensions of 1mm x
1mm x 0.6mm and their power density is fixed at 100 W/cm². It can be seen from Fig.7 that the
maximum temperature on the die depends on the distance between the two hotspots. When the
two hotspots are close to each other (d = 2mm), there is an increase of about 0.5°C compared to the case
where only a single hotspot is present. This value increases further as the hotspots come closer to
each other and culminates in a temperature of 80.5°C (d = 0mm), which is almost 2.2°C more







 
 
thermal interaction between them and the peak temperature on the die is almost equal to the case when a 
single hotspot is present. 
 
 
Fig. 7. Interaction of two hotspots located on the same die (DIE-3). The plot is obtained by fixing the
location of one hotspot at the center of the die and varying the location of the other. The distance 'd'
in the plot is the distance between the centers of the two hotspots.
Fig. 8. Interaction of hotspots located in different vertically stacked layers. Each hotspot is located
at the center of its die edge.
We have also studied two special cases in order to understand the interaction of hotspots in
different vertical layers of the stack. In the first case, we have analyzed the interaction of
hotspots when each hotspot is located at the center of its die edge. In the second case, we have
analyzed the interaction of hotspots when they are spread evenly across the different dies. That is,
a hotspot is present at the center of the rightmost edge of DIE-1, at the center of DIE-2 and at the
leftmost edge of DIE-3. Comparing Fig.8 and Fig.9, it can be observed that the peak temperature on
each die can be reduced by placing the thermally risky blocks far from each other, so that their
thermal fields do not interact. In this analysis, the maximum temperature on the hottest die (DIE-1)
is reduced from 85.5°C to 83.5°C.
7. Conclusions 
A thermal model of a 3D stacked system in a modern flip-chip package has been developed and thermal
analysis has been performed in order to investigate different job allocation and scheduling schemes from
the thermal perspective. We have used a finite-element based method to run steady-state analysis on the
3D flip-chip package we built. The analysis aimed at understanding the impact that various job allocation
and scheduling schemes have on the peak temperature of the stacked dies. We have also analyzed the effect
that the interaction of hotspots has on peak temperatures at the same-die and die-to-die level.
 
References 
[1] The International Technology Roadmap for Semiconductors, 2007.
[2] Ayse K. Coskun, José L. Ayala, David Atienza, Tajana Simunic Rosing, and Yusuf Leblebici, “Dynamic Thermal
Management in 3D Multicore Architectures,” in Proc. of Design Automation and Test in Europe, 2009.
[3] Guoping Xu, Bruce Guenin, Marlin Vogel, “Extension of Air Cooling for High Power Processors,” in Proceedings
of 9th Inter Society Thermal Phenomena in Electronics Systems (ITherm) Conference, 2004, pp. 186-193.
[4] Guoping Xu, “Thermal Modeling of Multi-core Systems,” in Proceedings of 10th Inter Society Thermal
Phenomena in Electronics Systems (ITherm) Conference, 2006, pp. 96-100.
[5] Ravi Kandasamy and A.S. Mujumdar, “Interface Thermal Characteristics of flip-chip packages - A numerical
study,” Applied Thermal Engineering, April 2009, Volume 29, Issues 5-6, pp. 822-829.
[6] Ankur Jain et al., “Analytical and Numerical Modeling of the Thermal Performance of Three-Dimensional Integrated
Circuits,” IEEE Transactions on Components and Packaging Technologies, March 2010, Volume 33, No. 1.
[7] C. Zhu, Z. Gu, L. Shang, R.P. Dick, and R. Joseph, “Three-Dimensional Chip-Multiprocessor Run-Time Thermal
Management,” IEEE Transactions on CAD, August 2008, Volume 27, No. 8, pp. 1479-1492.
[8] Texas Instruments, “Flip Chip Ball Grid Array Package Ref. Guide,” Literature Number: SPRU811A, May 2005.
[9] C.-H. Chao et al., “Traffic- and Thermal-Aware Run-Time Thermal Management Scheme for 3D NoC
Systems,” in Proc. of NOCS 2010, pp. 223-230.
[10] Amir-Mohammad Rahmani, Khalid Latif, Kameswar Rao Vaddina, Pasi Liljeberg, Juha Plosila, Hannu Tenhunen,
“Congestion Aware, Fault Tolerant, and Thermally Efficient Inter-Layer Communication Scheme for Hybrid
NoC-Bus 3D Architectures,” in Proc. of NOCS 2011, 2011, USA.
Fig. 9. Interaction of hotspots located in different vertically stacked layers, but distributed efficiently
so that their thermal fields do not interact with each other.
