As the workload on a single-core processor increases, its power density and die temperature increase as well. The rise in die temperature degrades performance and reliability and increases leakage currents and cooling cost. In addition, non-uniform power distribution across the die produces hot spots. To reduce the workload and cooling cost of a single-core processor, multi-core processors have been implemented. Multi-core processors, also known as chip multiprocessors (CMPs), contain two or more independent cores on a chip. In CMPs, if one core reaches its critical temperature, its workload is transferred to another core. This technique is termed core hopping. Core hopping distributes the workload more uniformly among the cores and improves performance and reliability. The demand for greater performance in compute-intensive applications has resulted in many cores being put on a single chip. Each succeeding processor generation is predicted to hold twice as many cores as the previous one. In this study, core hopping for CMPs is analyzed and the thermal analysis of the chip with core hopping is performed using ANSYS Fluent. The hop sequence is analyzed as a function of chip temperature distribution, and a numerical methodology to analyze the coupled thermal and structural integrity of CMPs is demonstrated.
Introduction
According to Moore's law, the number of transistors on a chip doubles every 18 months [1]. CMPs are processors that contain two or more independent cores on a chip. This kind of architecture was introduced by Intel with the Intel Core Duo and has continued with AMD, NVIDIA and others. As continued technology scaling pushes transistor counts into the billions, CMPs are one of the effective strategies for integrating such ICs [2]. Figure 1 shows the Moore's law projection.
978-1-4673-1111-3/12/$31.00 ©2012 IEEE
As the workload on a single-core processor increases, its power density and die temperature increase. The rise in die temperature degrades performance and reliability and increases leakage currents and cooling costs. To reduce the workload and cooling cost of a single-core processor, multi-core processors have been implemented. As technology scaling shrinks chip geometry into the sub-100 nm realm, transistor density and leakage current both increase, leading to excessive power consumption and heat generation, which is a major challenge for future CMPs. The scaling down of silicon technology also leads to significant thermal coupling between neighboring cores [2]. Figure 2 shows temperature versus power for two different AMD processors [3]. With the shrinkage of chip geometry and the transition to billion-transistor microprocessors, the power budget of CMPs must be addressed at the design level. Processor power dissipation, current and future, increases with clock frequency and transistor count [4, 5]. Figure 3 shows the power consumption of Intel processors from 1970 to 2005 [6]. Huang et al. [7] derived expressions for the power consumed by a core and for its power density, for a fixed architecture across technology generations, as shown in equations 1 and 2, respectively.
Power consumed: $P_{n+1} = \left(\frac{V_{dd,n+1}}{V_{dd,n}}\right)^{2} P_{n}$ (1)
Power density: ... (2)
In equations 1 and 2, $V_{dd}$ is the supply voltage, n and n+1 denote technology generations, and s is the scaling factor. Many-core designs pose a thermal problem: the primary cores consume more power than the simpler cores, which results in localized hot spots [7]. Cho et al. [8] demonstrated that at any given point in time not all cores in a CMP are functioning, i.e., different cores at different locations are active at different times, resulting in non-uniform power consumption. They used a proactive spatiotemporal power multiplexing method to achieve a lower peak temperature and a uniform thermal field on the chip. Spatiotemporal power multiplexing is time based: it changes the location of power dissipation after a fixed time interval while maintaining throughput during the redistribution. Borkar et al. [9] discussed fine-grain power management and system design for many-core systems. Their work shows that multiprocessors have several benefits: individual cores can be turned "ON" or "OFF", thereby saving power, and a lower die temperature can be maintained, thus improving reliability. Tasks can be distributed among the many cores such that an overall lower temperature is achieved. Pollack's rule governs the performance increase obtainable from micro-architecture alone: it states that the performance increase is roughly proportional to the square root of the increase in complexity, as illustrated in Figure 4.
Figure 4 Pollack's Rule: Performance ~ Sqrt(Area), plotted against Area (X)
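The two scaling relations discussed above can be tried numerically. The sketch below is illustrative only: it assumes equation 1 has the form P(n+1) = (Vdd(n+1) / Vdd(n))^2 * P(n) and takes Pollack's rule as performance proportional to the square root of complexity. The function names and input values are hypothetical and do not come from the cited works.

```python
import math


def scaled_power(p_n: float, vdd_n: float, vdd_next: float) -> float:
    """Core power in generation n+1 for a fixed architecture (assumed form of equation 1)."""
    if vdd_n <= 0:
        raise ValueError("vdd_n must be positive")
    return (vdd_next / vdd_n) ** 2 * p_n


def pollack_gain(complexity_ratio: float) -> float:
    """Relative performance gain under Pollack's rule: sqrt of the complexity (area) ratio."""
    if complexity_ratio < 0:
        raise ValueError("complexity_ratio must be non-negative")
    return math.sqrt(complexity_ratio)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the two formulas.
    print(scaled_power(p_n=100.0, vdd_n=1.2, vdd_next=1.0))
    print(pollack_gain(2.0))
```

Under this reading of Pollack's rule, doubling the complexity of a single core yields only about a 1.41x performance gain.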
Huang et al. [10] state that an asymmetric many-core architecture creates a significant thermal problem, in which the primary or more complex cores create localized hot spots due to their higher power consumption. They also state that, because of the thermal spatial low-pass filtering effect, the equivalent thermal resistance is lower for smaller cores, i.e., for the same power density small cores heat up less than large cores. They further note that some techniques used by Intel to improve performance, such as Intel "turbo mode", which boosts processing speed by raising the supply voltage and frequency of the active cores, aggravate hot spots. Shayesteh et al. [11] studied a thermally triggered core swapping technique on a dual micro-core architecture. Swapping is done with a helper engine that reduces the swapping overhead by buffering the core state during the swap. The authors concluded that core swapping keeps temperatures below the threshold.
In this paper, core hopping for CMPs is analyzed and thermal analysis of the chip is performed using ANSYS. The hop sequence is studied as a function of chip temperature distribution, and a thermo-mechanical analysis of the chip is carried out to estimate its structural integrity.
Modeling and Methodology
A typical flip chip package is represented in Figure 5. The test vehicle (TV) consists of a heat sink, thermal interface material (TIM-2), heat spreader, TIM-1, die, C4, underfill, substrate, copper pads, solder balls and the printed circuit board (PCB). Thermal analysis was performed with a natural convection boundary condition at the PCB (heat transfer coefficient of 10 W/m²K) and forced convection (1200 W/m²K) applied at the top surface of the heat sink. This high heat transfer coefficient was used to compensate for the heat sink fins, since the heat sink is modeled as a block. The dimensions and thermal properties of the components used in the study are listed in Table 1. Mechanical properties are given in Table 2. The geometry is based on the Intel Pentium processor used in [12]. Figure 5 shows the full model of the module in ANSYS Workbench.
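One common way to justify a large heat transfer coefficient on a finless block is to match the thermal conductance of the finned sink it replaces: h_eff * A_base = eta * h_fin * A_fin. The source does not state how the 1200 W/m²K value was obtained, so the sketch below is an assumption, and the fin-side coefficient, area ratio and fin efficiency used in it are hypothetical values.

```python
def effective_htc(h_fin: float, area_fin: float, area_base: float,
                  fin_efficiency: float = 1.0) -> float:
    """Equivalent coefficient on a flat base giving the same h*A as the finned surface.

    h_fin          -- convection coefficient on the real fin surfaces, W/m^2K (hypothetical)
    area_fin       -- total wetted fin area, m^2 (hypothetical)
    area_base      -- top area of the block that replaces the fins, m^2
    fin_efficiency -- overall surface efficiency, between 0 and 1 (assumed)
    """
    if area_base <= 0:
        raise ValueError("area_base must be positive")
    return fin_efficiency * h_fin * area_fin / area_base


if __name__ == "__main__":
    # Hypothetical case: fins with 20x the base area and 60 W/m^2K on the fin surfaces.
    print(effective_htc(h_fin=60.0, area_fin=20.0, area_base=1.0))
```

With these made-up inputs the equivalent coefficient comes out at the same order as the value used in the model, which is the point of the block approximation.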
Figure 5 Full Model
The package shown in Figure 5 was first modeled in Icepak 13.0.2 as simple blocks. The heat sink is modeled as a block without fins, and a higher heat transfer coefficient is used to compensate for the fins. The surface of the die was divided into 16 equal areas, each modeled as a heat source representing one core. The copper pads on the top and bottom of the solder ball and the solder ball itself were modeled as three separate blocks in order to decrease the computing time. The isometric view of the model created in Icepak is shown in Figure 7.
Figure 7 Icepak Model
After the model was created, it was meshed in Icepak 13.0.2. A Fluent case file was written using the Icepak solver and then imported into Fluent 13.0. A user-defined function (UDF) was written in Fluent to assign the boundary conditions, i.e., the maximum and minimum temperatures at which the cores start hopping, and these conditions were attached to the respective zones in the model. Figure 8 shows the arrangement of all the cores on the chip.
Figure 8 Arrangements and Numbering of Cores
The UDF code was written such that at any given point in time four of the sixteen cores are active and generating heat. The hop sequence simulated through the UDF is given in Table 3, where T_min is the minimum threshold temperature, i.e., a core is activated only once it has cooled down to this temperature, and T_max is the maximum threshold temperature, i.e., the temperature at which the core has to be deactivated.
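The two thresholds act as a hysteresis band on each core. The function below is a minimal sketch of that rule as described in the text, not the UDF used in the study. The function name, the returned labels and the T_min value in the usage example are hypothetical.

```python
def core_action(is_active: bool, temp_k: float, t_min: float, t_max: float) -> str:
    """Classify one core against the T_min / T_max hysteresis band.

    An active core must be deactivated once it reaches t_max.
    An idle core may be activated again only once it has cooled to t_min.
    """
    if t_min >= t_max:
        raise ValueError("t_min must be below t_max")
    if is_active:
        return "deactivate" if temp_k >= t_max else "stay_active"
    return "eligible" if temp_k <= t_min else "keep_cooling"


if __name__ == "__main__":
    # 305 K echoes the threshold quoted in the thermal analysis; 300 K is a made-up T_min.
    for active, temp in [(True, 304.0), (True, 305.0), (False, 302.0), (False, 299.5)]:
        print(active, temp, core_action(active, temp, t_min=300.0, t_max=305.0))
```

Keeping the two thresholds apart prevents a core from being switched back on immediately after it has been switched off.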
Results

Thermal Analysis
Thermal analysis of the cores is carried out based on temperature. Hopping occurs among the cores according to the temperature conditions given in the UDF code. At any instant four cores are active, and if the temperature of any core reaches the threshold (~305 K), that core is switched "OFF" and its operation is transferred to a different core. Among all the core hopping sequences investigated, the cases shown in Table 3 had a uniform temperature distribution. In this work, a heat flux of 1e6 W/m² is applied on each core and the solution was run for 10 time steps with a step size of 0.5 seconds each, i.e., 5 seconds in total. For example, if core 1 reaches the threshold, it is replaced by core 6; similarly, core hopping takes place between cores 4 and 7, 13 and 10, and 16 and 11. The hopping always took place among these 8 cores, and cores 2, 3, 5, 8, 9, 12, 14 and 15 never came "ON" within the 5 seconds. However, when the heat flux is increased from 1e6 W/m² to 6e6 W/m², core hopping occurred among all 16 cores because the condition for activation was satisfied at some point within the 5 second interval. Figures 9a and 9b show the hop sequence among the 8 cores mentioned above.
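The pairwise hopping described above can be written as a small lookup. This is a sketch under stated assumptions rather than the study's UDF: the partner pairs are taken from the text, while the starting active set (cores 1, 4, 13 and 16), the sample temperatures and all names are hypothetical. For brevity it checks only the upper threshold and does not test whether the incoming core has cooled to T_min.

```python
from typing import Dict, List, Set

# Hop partners as listed in the text: 1<->6, 4<->7, 13<->10, 16<->11.
HOP_PAIRS = [(1, 6), (4, 7), (13, 10), (16, 11)]
PARTNER: Dict[int, int] = {}
for first, second in HOP_PAIRS:
    PARTNER[first] = second
    PARTNER[second] = first


def hop(active: Set[int], temps_k: Dict[int, float], t_max: float) -> Set[int]:
    """Replace every active core at or above t_max with its partner; the rest stay on."""
    return {PARTNER[core] if temps_k[core] >= t_max else core for core in active}


def never_used(n_cores: int = 16) -> List[int]:
    """Cores that take part in no hop pair and therefore never switch on."""
    return sorted(set(range(1, n_cores + 1)) - set(PARTNER))


if __name__ == "__main__":
    active = {1, 4, 13, 16}  # assumed starting set
    temps = {1: 305.2, 4: 303.0, 13: 304.1, 16: 306.0}  # made-up temperatures in K
    print(sorted(hop(active, temps, t_max=305.0)))
    print(never_used())
```

The cores returned by `never_used()` are the same eight cores the text reports as never coming "ON" at the lower heat flux.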
Structural Analysis
Thermo-mechanical analysis is carried out using ANSYS Workbench 13.0. The package geometry, written out from ANSYS Icepak, is imported into Workbench in .IGES format, and the model is then brought into a transient structural analysis. Transient structural analysis is a time-based analysis in which the loading changes with time. In this study the loads were thermal (temperature distribution). Results from the thermal analysis done in Fluent were imported as input loads to Workbench. Figure 10 shows the transfer of thermal loads from Fluent to the Workbench transient structural analysis. The aim was to investigate the thermo-mechanical stresses in the chip region induced by the thermal gradients within the chip. Compared to the chip region, the other components were essentially stress free at all instants. The von Mises stress results for the entire package are shown in Figure 11. Maximum stress is seen in the chip region, with a magnitude of 47.9 MPa. The stress distribution on the chip changed with the core hopping, and the maximum stress was always observed at the chip edges. Maximum warpage in the chip was 13 µm, as shown in Figure 12.
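The von Mises stress reported by the solver is a scalar derived from the six stress components at a point. As a reference for reading the stress plots, the standard definition can be written as below. The function name and the sample values are illustrative and are not taken from the ANSYS model.

```python
import math


def von_mises(sx: float, sy: float, sz: float,
              txy: float = 0.0, tyz: float = 0.0, tzx: float = 0.0) -> float:
    """Von Mises equivalent stress from normal (sx, sy, sz) and shear (txy, tyz, tzx) components."""
    normal_part = (sx - sy) ** 2 + (sy - sz) ** 2 + (sz - sx) ** 2
    shear_part = txy ** 2 + tyz ** 2 + tzx ** 2
    return math.sqrt(0.5 * normal_part + 3.0 * shear_part)


if __name__ == "__main__":
    # A purely uniaxial state returns the applied stress itself (value in MPa).
    print(von_mises(47.9, 0.0, 0.0))
    # A purely hydrostatic state has zero von Mises stress.
    print(von_mises(10.0, 10.0, 10.0))
```

Because the measure ignores the hydrostatic part of the stress state, it is the usual quantity to compare against a yield strength.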
Conclusion
Core hopping has been studied based on the chip temperature distribution. Temperature-based core hopping is more realistic than its time-based counterpart because, in time-based core hopping, a core is activated for a fixed period of time irrespective of its temperature, which could result in exceeding the safe temperature limit for which the chip was designed. This may lead to premature chip failures. A methodology has been demonstrated to analyze the coupled thermo-mechanical integrity of CMPs by integrating ANSYS Fluent with Workbench. For the stress analysis, the temperature-based hopping results were imported to Workbench and a transient structural analysis was performed. Maximum stress, as expected, was seen in the chip region (die and C4), and the other components show very little stress throughout the hop sequence. The maximum stress in the chip region is around 50 MPa, which is almost 2 orders of magnitude less than the yield strength of silicon. It is concluded that in core hopping the key parameter is temperature management, and that mechanical stresses do not play much of a role in causing failures.
