## Thermal Characterization and Thermal Management in Processor-Based Systems<sup>\*</sup>

José L. Ayala<sup>1</sup>, Anya Apavatjrut<sup>2</sup>, David Atienza<sup>3</sup>, Marisa López-Vallejo<sup>1</sup> and Carlos A. López-Barrio<sup>1</sup>

 <sup>1</sup> Universidad Politécnica de Madrid, Departamento de Ingeniería Electrónica Ciudad Universitaria s/n, 28040 Madrid, Spain {jayala, marisa, barrio}@die.upm.es
<sup>2</sup> Department of Telecommunication Services and Usage, INSA Albert Einstein, 69621 Villeurbanne Cedex, France anya.apavatjrut@insa-lyon.fr
<sup>3</sup> Complutense University of Madrid, Dpt. of Computer Architecture and Systems Prof. José García Santesmases s/n, 28040 Madrid, Spain datienza@dacya.ucm.es

Abstract. The register file is one of the hottest devices in processor-based systems. Leakage reduction techniques and dynamic thermal management (DTM) mechanisms require a thermal characterization of the hardware. This paper presents a thermal model to analyze the temperature evolution in the shared register files found in VLIW systems. The use of this model allows the analysis of several factors that have a strong impact on heat transfer. The results obtained can be used in the design of temperature-aware compilers and place&route tools.

Keywords. Thermal characterization, thermal model, register file.

## 1 Introduction

The doubling of microprocessor performance every 18 months can be explained by two facts: more transistors integrated per chip and superlinear scaling of the processor clock with technology generation [1]. However, as CMOS technology is scaled into the sub-100 nm region, the power density of microelectronic designs increases steadily. For example, the power density of high-performance microprocessors is found to be $50\,W/cm^2$ for the 100 nm technology, while it reaches $100\,W/cm^2$ for the 50 nm technology [2]. This trend is becoming a key limiting factor for the performance of current state-of-the-art microprocessors, due to the increase of the average chip temperature and of local hot spots.

System performance is affected by both temperature and supply voltage. As is widely known, power consumption includes dynamic and leakage power. Leakage power consumption grows significantly as technology scales down because of:

Dagstuhl Seminar Proceedings 07041

Power-aware Computing Systems

http://drops.dagstuhl.de/opus/volltexte/2007/1110

<sup>\*</sup> This work is partially supported by the Spanish Government Research Grant TEC2006-00739.


the increase of device leakage current due to reductions in threshold voltage, channel length, and gate oxide thickness [3]; and the increasing number of idle modules in a highly integrated system.

Moreover, the reliability of electronic devices is well known to depend exponentially on the operating temperature due to the acceleration of several failure mechanisms in hotter environments [4]. Even small differences in the operating temperature (10-15 °C) can result in a 2x difference in the lifespan of the devices [5]. Additionally, the higher the operating temperature, the more aggressive the cooling solution must be, which leads to a further increase in power consumption [6].

From the above, it can be seen how important it is to estimate temperature at different design stages, especially in the very early stages of the design flow. The estimated temperature can be used to perform power, performance, and reliability analysis, together with placement, packaging design, etc. As a result, all the decisions use temperature as a guideline and the design is intrinsically thermally optimized and free from thermal limitations.

Existing power simulators [7] calculate leakage power by assuming a fixed ratio between dynamic and leakage power. This assumption is not accurate because dynamic power and leakage power scale differently as a function of Vdd and temperature. Furthermore, leakage power is sensitive to temperature while dynamic power is independent of temperature. High-level leakage power modeling has also been studied: several references [8] present high-level leakage power models, but without temperature scaling. Therefore, none of these models is sufficient to study the interaction between power and temperature at the microarchitecture level. Microarchitecture-level thermal modeling has also been studied. Brooks and Martonosi [9] model the on-chip temperature as the average power consumption within a fixed time window. Dhodapkar et al. [10] propose a simple thermal calculation, applying a one-segment lumped thermal resistance and capacitance circuit to model the entire chip and package. HotSpot [11] provides a detailed thermal model based on an equivalent distributed circuit of resistances and capacitances. However, this model does not consider the temperature and voltage dependence of leakage power.

It is known that the register file is the hottest block in a modern microprocessor chip [12]. Some DTM methods specifically targeted toward temperature control in the register file were presented in [13] and [14]. Therefore, the thermal characterization of this device, and the effect on temperature of high-level factors, is needed.

This paper presents current work on the thermal characterization of the register file of VLIW architectures. Our work proposes a thermal model to estimate the evolution in time of the temperature and analyzes some factors that influence the heat distribution.

## 2 Thermal Model

For the development of the thermal model, the well-known analogy between electrical circuits and thermal systems is exploited. The silicon die and heat spreader are decomposed into elementary cells of cubic shape. The temperature of every cell is computed using an RC model. The cell size trades off simulation speed against thermal accuracy.

Each cell is associated with a thermal capacitance and five thermal resistances. Four resistances are used to model the horizontal thermal spreading, whereas the fifth is used to model the vertical thermal behavior. The thermal conductance (horizontal and vertical) and the thermal capacitance of each elementary cell are computed as follows:

$$G_{hor} = K_{G(Si/Cu)} \times \left(\frac{h \times w}{l}\right)$$
$$G_{ver} = K_{G(Si/Cu)} \times \left(\frac{l \times w}{h}\right)$$
$$C = K_{C(Si/Cu)} \times l \times h \times w$$

where $K_{G(Si/Cu)}$ is the thermal conductivity of silicon or copper, $K_{C(Si/Cu)}$ is the thermal capacitance per unit volume of silicon or copper, and $l$, $w$ and $h$ are the cell length, width and height, respectively.
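The three expressions translate directly into code. The following is a minimal sketch, not the simulator's implementation: the names `Material` and `cell_rc` are hypothetical, and the silicon constants are approximate room-temperature textbook values assumed only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Material:
    k_g: float  # thermal conductivity K_G, in W/(m*K)
    k_c: float  # thermal capacitance per unit volume K_C, in J/(m^3*K)


# Assumed, approximate values for silicon (not taken from the paper).
SILICON = Material(k_g=150.0, k_c=1.63e6)


def cell_rc(material, l, w, h):
    """Return (G_hor, G_ver, C) for one elementary cell of size l x w x h."""
    g_hor = material.k_g * (h * w) / l   # horizontal conductance
    g_ver = material.k_g * (l * w) / h   # vertical conductance
    c = material.k_c * l * h * w         # thermal capacitance
    return g_hor, g_ver, c


# Example: a cubic silicon cell with a 100 um edge (illustrative size).
g_hor, g_ver, c = cell_rc(SILICON, l=100e-6, w=100e-6, h=100e-6)
print(g_hor, g_ver, c)
```

For a cubic cell the horizontal and vertical conductances coincide, so the five resistances of a cell share a single value.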

With this RC characterization, every cell is connected to its neighboring cells. The heat dissipation of each block is modeled as a source connected to the node of that block. The resulting thermal circuit, which is analogous to an electrical circuit, can be solved by node voltage analysis. As a result, the temperature of each block is obtained.
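To make the node analysis concrete, the sketch below solves the steady-state case (capacitances drop out) for a small one-dimensional chain of cells. It is an illustration under stated assumptions rather than the model used in this work: the chain topology, the function names `solve_linear` and `chain_temperatures`, and the unit conductances are all hypothetical. Temperatures are expressed as a rise above ambient.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def chain_temperatures(power, g_link, g_amb):
    """Nodal analysis of a chain of cells.

    power  : heat injected at each node (the analogue of a current source)
    g_link : conductance between adjacent cells (horizontal path)
    g_amb  : conductance from each cell to ambient (vertical path)
    """
    n = len(power)
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        g[i][i] += g_amb
        if i + 1 < n:
            g[i][i] += g_link
            g[i + 1][i + 1] += g_link
            g[i][i + 1] -= g_link
            g[i + 1][i] -= g_link
    return solve_linear(g, power)


temps = chain_temperatures([3.0, 0.0, 0.0], g_link=1.0, g_amb=1.0)
print([round(t, 3) for t in temps])  # [1.875, 0.75, 0.375]
```

The heat leaving through the ambient conductances equals the injected power, which is a convenient sanity check for any conductance matrix built this way.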

### 2.1 Register File Modeling

As mentioned before, one of the goals of this work is to increase the model granularity by focusing the analysis on the temperature behavior of the registers inside the register file. To accomplish this goal, the register file is represented as an $N \times M$ matrix in which every register corresponds to one elementary cell. Therefore, the thermal resistance and capacitance (R and C) of every elementary cell have to be calculated for every specific floorplan.

**Elementary resistances calculation** Since the total resistance and total capacitance of the device are known in advance, the register file can be decomposed into smaller units. Each unit is associated with its own resistance and capacitance, as shown in Figure 1.

From Figure 1, the total resistance for a cell is

$$R_{cell} = R + \frac{R}{3} = \frac{4R}{3}$$

Now, from Figure 3, the circuit can be decomposed into a matrix of N rows by M columns of registers, and the resistance of every row can be calculated as follows.



Fig. 1. Equivalent RC circuit of a cell.



Fig. 2. Resistances of the register unit.

$$\begin{split} R_{cell,M} &= \left[ (R_{cell,M-1} + R) \,||\, \frac{R}{2} \right] + R \\ &= \left[ \frac{(R_{cell,M-1} + R) \times \frac{R}{2}}{R_{cell,M-1} + \frac{3R}{2}} \right] + R \end{split}$$

Supposing that  $R_{cell,M-1} = S_{M-1}R$  and  $R_{cell,M} = S_M R$ , then

$$S_M = \left\{ \left[ \frac{S_{M-1} + 1}{S_{M-1} + 1.5} \right] \times 0.5 \right\} + 1$$

Considering that each row is in parallel with the others, the total resistance of the device can be calculated by dividing by the number of rows.

$$R_{tot} = \frac{R_{cell,M}}{N}$$

The resistance of each register can be computed as

$$R = \frac{NR_{tot}}{S_M}$$
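The recurrence for $S_M$ is cheap to evaluate for any number of columns. A minimal sketch follows, assuming the base case $S_0 = 0$, which reproduces $R_{cell} = 4R/3$ for a single cell; this base case, the function names and the sample values are illustrative assumptions, not part of the simulator.

```python
def scale_factor(m_cols):
    """Evaluate S_M from S_M = 0.5 * (S_{M-1} + 1) / (S_{M-1} + 1.5) + 1."""
    s = 0.0  # assumed base case S_0 = 0, so that S_1 = 4/3
    for _ in range(m_cols):
        s = 0.5 * (s + 1.0) / (s + 1.5) + 1.0
    return s


def register_resistance(r_tot, n_rows, m_cols):
    """Per-register resistance R = N * R_tot / S_M."""
    return n_rows * r_tot / scale_factor(m_cols)


print(round(scale_factor(1), 4))   # 1.3333
print(round(scale_factor(2), 4))   # 1.4118
print(round(scale_factor(20), 4))  # 1.4142
print(register_resistance(r_tot=1.0, n_rows=4, m_cols=8))
```

For large M the factor approaches $\sqrt{2}$, the fixed point of the recurrence, so the per-register resistance becomes almost independent of the number of columns.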



Fig. 3. Equivalent resistance circuit in a 2D map.

**Elementary capacitances calculation** The total capacitance of the circuit can be calculated by considering each elementary capacitance to be in parallel with the others (see Figure 4).

The total capacitance can be computed for N rows and M columns in parallel as

$$C_{tot} = C \times N \times M$$

The capacitance of every register can be computed as

$$C = \frac{C_{tot}}{N \times M}$$

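Once the resistance and capacitance of each elementary cell are known, the size of the elementary cell can be calculated by assuming that it is a cube. For a cube of side $size$, the conductance expressions above reduce to $G = K_{G(Si/Cu)} \times size$, and since $R = 1/G$:

$$size = \frac{1}{R \times K_{G(Si/Cu)}}$$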

These last expressions are integrated in the VLSI simulator in order to retrieve the thermal behavior for every register in the register file for different placements and topologies.

Following [15], we employ the same expression to compute the temperature evolution once the technology factors have been calculated and the activities have been obtained from the simulator.




Fig. 4. Equivalent capacitance circuit.

| Layout file  | Second configuration file |
|--------------|---------------------------|
| ddiiddiimmmm | layer 1                   |
| ddiiddiimmmm | p 1 1 1.0e-14             |
| xppxxppxtttt | x 1 1 0.0                 |
| nssxnssxnssx | s 2 1 1.0e-14             |
| xssnxssnxssn | i 2 1 1.0e-14             |
| mmmrrrrxppx  | -1 2 ÷ 1.00 14            |
| mmmmrrrriidd | α 4 Т 1.0e-14             |
| tttttttiidd  | n 1 1 1.0e-14             |
| mmmnssnmmm   | m 2 1 1.0e-14             |
| mmmxssnmmm   | t 1 1 1.0e-14             |
| ttttxppxtttt | w 2 1 0.0                 |
| wwwwiiddwwww | r 1 1 1.0e-14             |
| wwwwiiddwwww |                           |

Fig. 5. Configuration files

$$\begin{split} T_c(n+1) \times 2^{36} &= T_c(n) + ((cap \times EC \times 2^{62}) \times act) + \\ &+ ((A \times 2^{26} - B \times 2^{26} \times (T_c(n)) \times \\ &\times (T_n(n) \times 2^{36} - T_c(n) \times 2^{36})))/(2^{26}) \end{split}$$

where  $T_c(n)$  is the temperature at step n,  $T_c(n+1)$  is the temperature at step n+1,  $T_n(n)$  is the neighbor cell temperature,  $cap \times EC$  is the temperature difference due to the activities, *act* is the activity factor, A is the linear coefficient and B is the quadratic one.
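As an illustration of how one update step might be evaluated in software, the following floating-point sketch drops the power-of-two scale factors and reads the conduction term as $(A - B \times T_c(n)) \times (T_n(n) - T_c(n))$. That grouping, the function name `step_temperature` and the numeric values in the usage lines are assumptions made for the example; they are not taken from [15] or from the simulator.

```python
def step_temperature(t_cell, t_neighbor, act, cap_ec, a, b):
    """One explicit update T_c(n) -> T_c(n+1), in floating point.

    Assumed reading of the update rule, with scale factors removed:
        T_c(n+1) = T_c(n) + cap_ec * act
                   + (a - b * T_c(n)) * (T_n(n) - T_c(n))
    """
    heating = cap_ec * act                        # rise caused by accesses
    coupling = a - b * t_cell                     # temperature-dependent factor
    exchange = coupling * (t_neighbor - t_cell)   # exchange with the neighbor
    return t_cell + heating + exchange


# Hypothetical values, chosen only to keep the arithmetic easy to follow.
idle = step_temperature(t_cell=300.0, t_neighbor=308.0, act=0.0,
                        cap_ec=0.5, a=0.25, b=0.0)
busy = step_temperature(t_cell=300.0, t_neighbor=300.0, act=1.0,
                        cap_ec=0.5, a=0.25, b=0.0)
print(idle, busy)  # 302.0 300.5
```

An idle cell next to a hotter neighbor warms up through the exchange term alone, while an accessed cell at the same temperature as its neighbor warms up only through the activity term.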

## 3 Experimental Results

Once the thermal model for the register file had been developed, it was integrated into the functional and thermal simulator of the VLIW system. In order to perform the thermal simulation of the system, a specific layout of the architecture has to be designed.

Fig. 6. Thermal map for a condensed access pattern.

The baseline architecture devised for the set of experiments resembles a common VLIW system with four processing cores, a shared memory subsystem, a shared register file and a communication network. The layout of the system is configured in a text file where the placement and size of these modules are coded with letters (see Figure 5).
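A letter-coded layout of this kind is straightforward to process. The sketch below uses the first five rows of the layout file in Figure 5, counts how many grid cells each letter occupies, and records its bounding box. The function name `module_extents` is hypothetical, and since the text does not state which letter denotes which module, no such mapping is assumed.

```python
def module_extents(rows):
    """Return {letter: (cell_count, (min_row, min_col, max_row, max_col))}."""
    extents = {}
    for r, row in enumerate(rows):
        for c, letter in enumerate(row):
            if letter not in extents:
                extents[letter] = (1, (r, c, r, c))
            else:
                count, (r0, c0, r1, c1) = extents[letter]
                box = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
                extents[letter] = (count + 1, box)
    return extents


# First five rows of the layout file shown in Figure 5.
layout = [
    "ddiiddiimmmm",
    "ddiiddiimmmm",
    "xppxxppxtttt",
    "nssxnssxnssx",
    "xssnxssnxssn",
]

for letter, (count, box) in sorted(module_extents(layout).items()):
    print(letter, count, box)
```

The cell count gives the relative area of each letter's region, and the bounding box gives its placement, which are the two properties the text file is said to encode.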

Before starting the simulation, the thermal parameters (the heat transfer coefficient and the thermal conductivity for silicon and copper) must be set. Also, the floorplan must be loaded.

During the simulation, the thermal coefficients and power coefficients for each cell are computed. Since every cell is surrounded by other cells, a heat distribution from hotter cells to colder cells takes place. At the end of the simulation, temperatures of all the cells are stored in a file.

The set of experiments presented here analyzes the effect of the access pattern on the temperature of the register file. This analysis makes it possible to define temperature-aware access policies that reduce the temperature of the device as well as its power consumption [16]. The experiments have been performed for every placement of the layout (positions 1, 2, 3 and 4) and three different access patterns: registers accessed from a bank placed on the right-hand side of the register file, accessed registers randomly placed in several spots of the device, and registers accessed in a homogeneous manner, as on a chess board.




Fig. 7. Thermal map for a random access pattern.

The following graphs show the results for the three different accesses when the register file is placed in position 4.

Figure 6 shows the evolution in time of the thermal map for the register file when the registers are accessed from a bank located at the right-hand side of the device. As can be seen, the bank where the registers are accessed from is increasingly heated as time advances. At the end, a large hot spot appears in the register file, which can severely damage the device.

Figure 7 shows the evolution in time of the thermal map for the register file when the registers are randomly accessed from several spots in the device. As can be seen, the spots where the registers are accessed from are increasingly heated as time advances. At the end, several hot spots appear on the register file surface, increasing the probability of chip damage. Therefore, an access pattern that homogenizes the thermal map on the silicon surface must be found.

Figure 8 shows the evolution in time of the thermal map for the register file when the registers are accessed in a "chess board" manner. As can be seen, this access pattern homogenizes the temperature on the silicon because the accesses are distributed across a larger surface. Moreover, the probability of hotspots is minimized and the reliability of the system is not compromised.





Fig. 8. Thermal maps for a "chess board" access pattern.
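The qualitative difference between the condensed and the chess-board patterns can be reproduced with a very small toy model. The sketch below is not the simulator used for the figures: the 4 × 4 grid, the explicit update rule, the coefficients `heat`, `k` and `loss`, and all function names are assumptions chosen for illustration, and the printed numbers are outputs of this toy model in arbitrary units, not measured results.

```python
def simulate(active, steps=200, heat=1.0, k=0.1, loss=0.1):
    """Explicit time stepping of a toy thermal grid.

    active : 2D list of 0/1 flags marking the accessed registers.
    Returns the temperature rise above ambient of every cell.
    """
    n, m = len(active), len(active[0])
    temp = [[0.0] * m for _ in range(n)]
    for _ in range(steps):
        new = [[0.0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                t = temp[i][j]
                flow = 0.0
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < m:
                        flow += k * (temp[a][b] - t)  # lateral exchange
                # self-heating, lateral exchange, and loss to ambient
                new[i][j] = t + heat * active[i][j] + flow - loss * t
        temp = new
    return temp


def peak(temp):
    """Highest temperature rise in the grid."""
    return max(max(row) for row in temp)


N, M = 4, 4
# Same number of accessed registers (8) in both patterns.
right_bank = [[1 if j >= M // 2 else 0 for j in range(M)] for i in range(N)]
chess = [[1 if (i + j) % 2 == 0 else 0 for j in range(M)] for i in range(N)]

print(round(peak(simulate(right_bank)), 3))  # 8.571
print(round(peak(simulate(chess)), 3))       # 6.235
```

With the same total activity, the chess-board pattern yields a lower peak in this toy model because every accessed cell is surrounded by idle cells that absorb part of its heat, which matches the trend described for Figures 6 and 8.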

## 4 Conclusions

Leakage reduction and thermal management are among the key issues in current architectures. The proposed methodologies that optimize these metrics require an accurate thermal characterization of the hardware modules.

This paper has presented an efficient analytical model to analyze the temperature evolution in the register file of a VLIW architecture, one of the hottest devices of these systems. Moreover, we have also evaluated some of the factors that can modify this thermal map.

The results obtained can be used in the design of temperature-aware compilers and place&route tools.

## References

1. Agarwal, V., Hrishikesh, M.S., Keckler, S.W., Burger, D.: Clock Rate versus IPC: The End of the Road for Conventional Microarchitectures. In: ISCA. (2000)
2. ITRS: The international technology roadmap for semiconductors (2006)
3. Taur, Y., Ning, T.H.: Fundamentals of Modern VLSI Devices. Cambridge Univ. Press (1998)
4. Mukherjee, R., Memik, S.O., Memik, G.: Temperature-aware resource allocation and binding in high-level synthesis. In: Design Automation Conference. (2005)
5. Viswanath, R., Wakharkar, V., Watwe, A., Lebonheur, V.: Thermal performance challenges from silicon to systems. Intel Technology Journal 4 (2000)
6. Genossar, D., Shamir, N.: Power estimation, budgeting, optimization and validation. Intel Technology Journal (2003)
7. Liao, W., Basile, J.M., He, L.: Leakage power modeling and reduction with data retention. In: ICCAD. (2002)
8. Jiang, W., Tiwari, V., de la Iglesia, E., Sinha, A.: Topological analysis for leakage prediction on digital circuits. In: ASP-DAC. (2002)
9. Brooks, D., Martonosi, M.: Dynamic thermal management for high-performance microprocessors. In: HPCA. (2001)
10. Dhodapkar, A., Lim, C., Cai, G.: Tempest: A thermal enabled multimodel power/performance estimator. In: PACS. (2000)
11. Skadron, K., Stan, M., Huang, W., Velusamy, S., Sankaranarayanan, K., Tarjan, D.: Temperature-aware microarchitecture. In: ISCA. (2003)
12. Srinivasan, J., Adve, S.V.: Predictive dynamic thermal management for multimedia applications. In: ICS. (2003)
13. Heo, S., Barr, K., Asanovic, K.: Reducing power density through active migration. In: ISLPED. (2003)
14. Apavatjrut, A., Ayala, J.L., López-Vallejo, M.: Thermal analysis of the shared register file in VLIW architectures. In: PCI. (2006)
15. Paci, G., Marchal, P., Poletti, F., Benini, L.: Exploring Temperature Aware Design in Low-Power MPSoCs. In: DATE. (2006)
16. Atienza, D., Raghavan, P., Ayala, J.L., de Micheli, G., Catthoor, F., Verkest, D., López-Vallejo, M.: Compiler-driven leakage energy reduction in banked register files. In: International Workshop on Power and Timing Modeling, Optimization and Simulation. (2006)