This paper presents a probabilistic method for analyzing the temperature and the maximum frequency of multicore processors that accounts for workload variation. First, at the microarchitecture level, dynamic power is modeled as a linear function of IPC (instructions per cycle), and leakage power is approximated as a linear function of temperature. Second, the microarchitecture-level hotspot temperatures of both active and inactive cores are derived as linear functions of the IPCs. Under the assumption that the IPCs of all cores follow the same normal distribution, the hotspot temperatures are shown to be normally distributed. Third, the probabilistic distribution over the set of discrete frequencies is determined. The experimental results show that the hotspot temperatures of multicore processors are not deterministic and vary significantly, and that the number of active cores and the running frequency jointly determine the probabilistic distribution of hotspot temperatures. The number of active cores not only changes the probabilistic distribution of frequencies but also changes the probability of triggering DFS (dynamic frequency scaling).
Introduction
Continuous technology scaling and miniaturization have escalated the power density and temperature of multicore processors. To reduce manufacturing costs, the packages of multicore processors are mostly designed for average rather than maximum power dissipation, and temperature is controlled with dynamic thermal management (DTM) techniques such as dynamic voltage and frequency scaling (DVFS) and dynamic frequency scaling (DFS) [1]. When the processor temperature reaches or approaches the critical point, DVFS or DFS is invoked to enforce the thermal constraint at the cost of processor speed. It is therefore crucial for design space exploration to analyze temperature and running frequency quickly and accurately at the early design stage.
Motivation.
To explore the design space of thermal-aware multicore processors at the early stage, several thermal models have been proposed to estimate the temperature and performance of processors [2] [3] [4] [5], and most estimation approaches are based on transient analysis [6] [7] [8] [9] [10] [11] [12] [13] [14]. Transient analysis traces the temporal variations of temperature and performance under given workloads, which yields high estimation accuracy. However, transient analysis is time-consuming, and for multicore processors in particular its time complexity is unacceptable at the early design stage. Accordingly, to speed up the estimation of temperature and performance of multicore processors, researchers resort to steady-state analysis [15, 16]. Nevertheless, to the best of our knowledge, all previous work on steady-state analysis assumes that every workload has the same thermal contribution, which greatly limits the estimation accuracy. In fact, the temperature of multicore processors varies greatly between workloads [10, 17]. Our preliminary work has demonstrated that the dynamic power of processors is highly correlated with IPC (instructions per cycle) and that, within a small temperature range, leakage power depends linearly on temperature [18]. According to the HotSpot thermal model, temperature can be derived given the power of processors [3]. Thus processor temperature is highly correlated with IPC. According to the CLT (central limit theorem), when a large number of instructions are executed in the processor, the probabilistic distribution of IPC tends to follow the normal distribution. Accordingly, given both the probabilistic distribution of IPC and the relationship between temperature and IPC, the probabilistic distribution of processor temperature can be derived and analyzed.
Subsequently, the probabilistic distribution of the maximum running frequency can be inferred given that the zero-slack DTM policy is used by the processor, which means that the speed of the processor is set to a value which makes the temperature of the hotspot be the maximum threshold allowed by the processor [7] .
Contributions.
In this paper, a probabilistic method is proposed to analyze the steady-state temperature and frequency of multicore processors, taking into account the variation of workloads. To simplify the analysis, the DFS technique rather than the DVFS technique is adopted to manage processor temperature, so the voltage is constant and only the frequency is adjusted. The dynamic power can therefore be modeled as a linear function of the frequency [12, 16]. The main contributions of this work are as follows:
(i) At the microarchitecture level, the dynamic power of processors is modeled as the linear function of IPC and running frequency, and the leakage power of processor is approximated as the linear model of temperature.
(ii) The microarchitecture-level hotspot temperatures of both active cores and inactive cores are derived as the linear functions of IPCs of all active cores.
(iii) It is inferred that the hotspot temperatures of both active cores and inactive cores follow the normal probabilistic distribution, based on the assumption that IPCs of all active cores follow the same normal distribution.
(iv) The probabilistic distribution of the set of frequencies is determined given the zero-slack DTM policy [7] .
The remainder of this paper is organized as follows. Section 2 reviews related work. In Section 3, the microarchitecture-level steady-state temperature of a core is formulated as a linear function of its powers based on the HotSpot thermal model. In Section 4, at the microarchitecture level, the dynamic power of processors is modeled as a linear function of IPC and the running frequency, and the leakage power is approximated by a linear model of temperature. In Section 5, the microarchitecture-level hotspot temperatures of both active cores and inactive cores are derived as linear functions of the IPCs of all active cores. It is inferred that the hotspot temperatures of both active cores and inactive cores follow the normal probabilistic distribution, based on the assumption that the IPCs of all active cores follow the same normal probabilistic distribution. In Section 6, the probabilistic distribution of the set of frequencies is determined given the zero-slack DTM policy. Section 7 presents the experimental results, and Section 8 concludes the paper.
Related Work
Approaches for estimating the temperature and performance of processors can be classified into transient analysis and steady-state analysis, and so far most research has been based on transient analysis.
Transient Analysis.
In order to transiently analyze the temperature and explore the design space of processors at the early stage, several thermal models for processors have been proposed. Skadron et al. and Huang et al. [2, 3] presented a compact thermal modeling methodology based on the analogy between thermal and electrical phenomena, namely, HotSpot. Using HotSpot, the spatial and temporal variations of processor temperature can be obtained through transient analysis. To improve the accuracy of thermal simulation, Jang et al. [11] made an extension to the thermal model for HotSpot by taking into account the different ambient temperature owing to workload variations. To accelerate thermal analysis of multicore processors at the architecture level, Wang et al. [5] presented a composite thermal model, termed ThermComp, to optimize the model for different large processors. Li et al. [4] proposed a parameterized architecture-level dynamic thermal model, namely, ParThermPOF, in which many parameters can be set such as the location of thermal sensors and the conductivity of different components.
To improve the performance of thermal-aware multicore processors, various DTM techniques have been investigated based on thermal models such as HotSpot and analyzed transiently to estimate performance. Hanumaiah et al. [7, 19] presented an online thermal management algorithm for thermal-aware multicore processors, in which DVFS and task allocation techniques are adopted simultaneously. In the context of hard real-time systems, the time-varying voltage and frequency of multicores are computed to satisfy not only the thermal constraint but also the deadline constraint [6]. Shi et al. [12] presented a DTM policy under a soft thermal constraint, in which the temperature constraint can occasionally be exceeded.
Several studies have aimed to simulate the thermal behavior of multicore processors quickly and accurately. Wojciechowski et al. [10, 20] analyzed the transient characteristics of workloads based on a finite Fourier series expansion to accurately predict the thermal behavior of multicore processors, and presented a new DVFS approach. Liu et al. [13] proposed a transient analysis method for the temperature of multicore processors based on moment matching, which is used to guide task migration. To account for the nondeterministic behavior of tasks in terms of execution times and decision branches, Das et al. [14] formulated thermal analysis as a hybrid automata reachability verification problem and provided an algorithm for constructing the automata.
For transient analysis, temporal variations of temperature and performance depending on workload are traced and then high estimation accuracy is obtained. However, transient analysis is time-consuming, and in particular for multicore processors time complexity is unacceptable at the early design stage.
Steady-State Analysis.
To speed up the estimation of temperature and performance at the early design stage of multicore processors, researchers have turned to steady-state analysis. Based on Amdahl's Law, Lee and Kim [15, 21] introduced variations of process and workload parallelism into the analysis model and optimized the throughput of thermal-aware multicore processors by exploiting DVFS and per-core power-gating (PCPG). Based on HotSpot, Rao et al. [16, 22] described an approximate thermal model for homogeneous multicore processors to predict the maximum steady-state throughput under thermal constraints quickly and accurately. In the context of a hard real-time system on a single-core processor, Mohaqeqi et al. [23] studied the stochastic behavior of the system, for example, performance, temperature, and reliability, based on a Markovian view.
To the best of our knowledge, all previous work on steady-state analysis assumes that every workload has the same thermal contribution to processors, which results in inaccurate temperature and performance estimates. This is the focus of our work. In this paper, the variation of workloads is taken into account to model temperature and frequency more accurately.
Thermal Model
In this paper, a microarchitecture-level thermal model for a multicore processor is created by replicating a single-core processor model based on HotSpot [3]. The multicore processor is divided into four layers, that is, chip, thermal interface material (TIM), heat spreader, and heat sink. The chip and TIM layers are divided into thermal blocks core by core, and there are five and nine thermal blocks in the heat spreader and heat sink, respectively. In total, the number of thermal blocks in the multicore processor is the number of chip and TIM blocks plus 14, and it therefore grows with the number of cores.
The microarchitecture-level thermal model can be represented by the following state-space differential equation [16]:
$$\frac{d\mathbf{T}}{dt} = \mathbf{A}\mathbf{T} + \mathbf{B}\mathbf{P}, \quad (1)$$
where T and P are vectors with one entry per thermal block, denoting, respectively, the temperature and power of the multicore processor, and A and B are constant square matrices of matching dimension that depend on the thermal conductance and capacitance of the processor.
Figure 1: Thermal conduction matrix G of dual-core processor [16].
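When the temperature of the processor is in the steady state, $d\mathbf{T}/dt = 0$, and then
$$\mathbf{G}\mathbf{T} = \mathbf{P}, \quad (2)$$
where $\mathbf{G} = -\mathbf{B}^{-1}\mathbf{A}$ is the thermal conductance matrix of the HotSpot model.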
The thermal conductance matrix G can be obtained from the HotSpot simulation tool. The thermal conductance matrix of a dual-core processor is as shown in Figure 1 , where the submatrices G die , G int , and G pkg along the diagonal are lateral thermal conductance of the chip, TIM, and package, respectively, the submatrices G die-int and G int-die are the vertical conductance between die and TIM, and the vectors G int-spr and G spr-int are the vertical conductance between the TIM and the spreader. G die-int is equal to G int-die , and G int-spr is equal to G spr-int . The detailed conductance matrix of G pkg is as shown in Figure 2 .
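As a minimal sketch of the steady-state relation between conductance, temperature, and power, the snippet below solves G T = P for a toy three-block system. The matrix `G` and the power vector `P` are hypothetical values chosen only for illustration (they are not taken from HotSpot), and T is interpreted here as the temperature rise above ambient, which is an assumption of this sketch.

```python
import numpy as np

# Hypothetical 3-block conductance matrix (W/K), for illustration only.
# Off-diagonal entries couple neighbouring blocks; the larger diagonal
# entries include a path to ambient, which makes G invertible.
G = np.array([
    [ 3.0, -1.0,  0.0],
    [-1.0,  3.0, -1.0],
    [ 0.0, -1.0,  3.0],
])

# Hypothetical block powers in watts.
P = np.array([10.0, 4.0, 1.0])

# Steady state: G T = P, so T follows from one linear solve instead of
# integrating the transient state-space equation over time.
T = np.linalg.solve(G, P)  # assumed to be the rise above ambient, in kelvin

print(T)
```

Using a linear solve rather than forming the inverse of G explicitly is the usual numerical choice; it is what makes steady-state analysis cheap compared with transient simulation.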
Lemma 1.
According to HotSpot thermal model, when the temperature of the multicore processor is in the steady-state, there exist -dimension matrices R, Q, and Z, such that
where T and P are -dimension vectors, representing, respectively, temperature and power of the th core of the processor.
Proof. According to the arrangement of elements in the conductance matrix G in the HotSpot thermal model, (2) can be decomposed into the following equations:
Figure 2: Thermal conduction matrix of the package [16].
where T int, is -dimension vector representing the temperature of TIM layer of the th core, spr is a scalar representing the temperature of the central block of the spreader, and T other is 13-dimension vector representing the temperature of other blocks of the spreader and the sink. According to (6) and (7),
Set
Equation (8) can be converted into
Substitute (10) into (5) and get
According to (4) and (11), get
Then, (12) can be transformed into
According to (14) , get
According to (15) , get
Substitute (16) into (14) and get
Mathematical Problems in Engineering
This means that in the steady-state of temperature of a multicore processor, given the powers of all cores, the temperature of any core can be calculated according to Lemma 1 at the microarchitecture level.
Power Model
It is assumed that multicore processors have two power states, active mode and inactive mode, and that the power state of each core can be set separately. A global dynamic frequency scaling (DFS) technique is used, in which the frequencies of all cores are scaled uniformly.
Active Mode.
When a core is in the active mode, workloads are executed and the core dissipates both dynamic power and leakage power. At the microarchitecture level, the power of the active core is defined as
where P a, , P dyn,a, , and P lea,a, are, respectively, the total power, dynamic power, and leakage power of the th active core of size × 1.
For a processor using the DVFS technique, the dynamic power is proportional to the product of the square of the voltage and the frequency [7]; that is, $P_{dyn} \propto \text{voltage}^2 \times \text{frequency}$. In this paper, the primary purpose is to analyze the impact of workload variation on the temperatures of processors. So, to simplify the analysis, the DFS technique rather than the DVFS technique is used to manage processor temperature, where the voltage is constant and only the frequency is adjusted. Hence, the dynamic power can be modeled as a linear function of the frequency [12, 16]. In addition, the dynamic power caused by workload execution has a close linear relationship with IPC.
Let a, be the IPC of the th core when workloads are running on it and let be the normalized frequency between 0 and 1, and then the microarchitecture-level dynamic power P dyn,a, of the th active core is defined as
where G dyn and G 0 are the linear regression coefficients when is set to maximum 1 and E dyn is the regression residual which follows the normal distribution with mean of zero: namely, E dyn ∼ (0, 2 ). G dyn , G 0 , E dyn , and are vectors of size × 1, and 2 is the element-by-element squares of . At the microarchitecture level, in order to simplify the analyzing procedure, the relationship between the leakage power P lea,a, and the temperature T a, of the th active core is approximated by the linear model as follows:
where G lea,a and G 0,a are the regression coefficients and E lea,a is the regression residual which follows the normal distribution with mean of zero: namely, E lea,a ∼ (0, 2 a ). G lea,a is a diagonal matrix of size × . G 0,a , E lea,a , a , and a are vectors of size × 1. According to (20) , (21), and (22), the power of the active core P a, is represented by
Inactive Mode.
When a core is in the inactive mode, it is powered off using power-gating technique. The inactive core only dissipates leakage power, which is much lower than that of the active core. Hence, the power of the th inactive core P ina, is the leakage power P lea,ina, , which depends on the core's temperature T ina, and it can be approximated by the linear model at the microarchitecture level as follows:
where G lea,ina and G 0,ina are regression coefficients and E lea,ina is the regression residuals which follow normal distribution with mean of zero: namely, E lea,ina ∼ (0, 2 ina ). G lea,ina is a diagonal matrix of size × . G 0,ina , E lea,ina , and ina are vectors of size × 1. 
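The coupling between the power model and the thermal model can be illustrated for a single thermal block. This is a sketch under stated assumptions, not the paper's vector model: dynamic power is assumed to take the form f · (g_dyn · IPC + g_0) with the regression residual omitted, leakage is g_lea · T + g_lea0, and the block has one thermal resistance `r` to an ambient temperature `t_amb`. All coefficient values and function names are hypothetical.

```python
def dynamic_power(ipc, freq, g_dyn=4.0, g_0=1.0):
    """Dynamic power (W), assumed linear in IPC and in normalized frequency."""
    return freq * (g_dyn * ipc + g_0)


def leakage_power(temp, g_lea=0.02, g_lea0=0.5):
    """Leakage power (W), approximated as linear in temperature (deg C)."""
    return g_lea * temp + g_lea0


def steady_temperature(ipc, freq, r=5.0, t_amb=45.0, g_lea=0.02, g_lea0=0.5):
    """Solve T = t_amb + r * (P_dyn + g_lea * T + g_lea0) for T.

    Because leakage depends on T, the temperature appears on both sides;
    the linear leakage model gives the closed form below.
    """
    p_dyn = dynamic_power(ipc, freq)
    return (t_amb + r * (p_dyn + g_lea0)) / (1.0 - r * g_lea)


t_full = steady_temperature(ipc=1.0, freq=1.0)
t_half = steady_temperature(ipc=1.0, freq=0.5)
print(t_full, t_half)
```

With both power components linear, the resulting steady-state temperature is itself linear in IPC for a fixed frequency, which is the property the thermal analysis relies on.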
Thermal Analysis
Lemma 2. When the temperature of a multicore processor is in the steady state, the temperatures and the powers of different inactive cores are the same; that is, $\mathbf{T}_{ina,i} = \mathbf{T}_{ina,j}$ and $\mathbf{P}_{ina,i} = \mathbf{P}_{ina,j}$ for all $i, j$ ($i \neq j$), where $\mathbf{T}_{ina,i}$ and $\mathbf{P}_{ina,i}$ are the temperature and the power of the $i$th inactive core, respectively.
Proof. According to (3), for any inactive cores $i$ and $j$,
Substitute (24) into (25), and get
According to (26) , get
The parameters R and G lea,ina are invertible matrices, so E − RG lea,ina is an invertible matrix. Therefore, $\mathbf{T}_{ina,i} - \mathbf{T}_{ina,j} = 0$; that is, $\mathbf{T}_{ina,i} = \mathbf{T}_{ina,j}$. The equality $\mathbf{P}_{ina,i} = \mathbf{P}_{ina,j}$ then follows from (24).
For convenience, let T ina and P ina denote the common temperature and power of every inactive core; (24) can then be simplified into
Proof. According to (3), (23) , and (28), the temperature of the inactive core T ina can be derived as
Let
Then, (31) can be transformed into
According to (3), (23), and (28), the temperature of the th active core T a, can be derived as
Then, (34) can be transformed into
Substitute (33) into (36), and get
Then, (37) is transformed into
According to (39), get
Substitute (40) into (39), and get
The hotspot temperature hot,a, of the active core can be given by
Then, (42) is transformed into
Substitute (40) into (33), and get
The hotspot temperature hot,ina of the inactive core is given by
Then, (46) is transformed into 
The normally distributed IPCs of different active cores are mutually independent, so their linear combination in the hotspot temperature, whose coefficients depend on H, the number of active cores, and the frequency, still follows the normal distribution.
All elements of the random vector E dyn follow the normal distribution and are independent, so the linear combination of the elements of E dyn that appears in the hotspot temperature follows the normal distribution. In the same way, the corresponding linear combinations of E lea,a and E lea,ina also follow the normal distribution. Consequently, the sum of these independent normal terms is itself normally distributed.
According to Theorem 4, it can be known that the hotspot temperature of any core follows the normal probabilistic distribution.
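The step from "linear in IPC" to "normally distributed" can be sketched numerically. The helper below assumes the hotspot temperature has the form offset + Σ c_i · IPC_i + residual, with independent IPCs that share one normal distribution and an independent zero-mean normal residual. The function name, the coefficient list, and the numbers in the usage line are hypothetical stand-ins for the coefficients derived in the text.

```python
import math


def hotspot_distribution(coeffs, offset, ipc_mean, ipc_std, resid_std=0.0):
    """Mean and standard deviation of offset + sum(c_i * IPC_i) + residual.

    A linear combination of independent normal variables is normal, with
    mean sum(c_i) * mu + offset and variance sum(c_i**2) * sigma**2 plus
    the variance of the residual term.
    """
    mean = offset + sum(coeffs) * ipc_mean
    var = sum(c * c for c in coeffs) * ipc_std ** 2 + resid_std ** 2
    return mean, math.sqrt(var)


# Two active cores with hypothetical temperature sensitivities of 3 and 4
# degrees per unit IPC, and IPC ~ N(1.0, 0.1**2).
mean, std = hotspot_distribution([3.0, 4.0], offset=50.0, ipc_mean=1.0, ipc_std=0.1)
print(mean, std)
```

Note that the variance depends on the squared coefficients, so a core that contributes strongly to the hotspot dominates the spread of the temperature as well as its mean.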
Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to the value at which the hotspot temperature equals the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases, no frequency in the set makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let
= { 1 , . . . , , +1 , . . . , } be the set of frequencies of multicore processors, where
then the probabilistic distribution of the frequency f follows
where ℎ , ( , ) is the function of the number of active cores and the frequency , representing the hotspot temperature of the active core, and V V is the temperature threshold of the processor.
Proof. When = count , according to the zero-slack DTM policy, obviously,
When < count , then the probabilities of = can be broken into two cases:
(a) if hot,a ( a , ) > hot,a ( a , +1 ), according to the zero-slack DTM strategy, the probability of = is 0; that is,
, the probability of = is given by
The right-hand side of (59) can be derived by
According to (59) and (60), get
Hence, when < count , according to (58) and (61), get
Therefore, according to Theorem 5 the probabilistic distribution of the set of frequencies can be obtained based on the assumption that the zero-slack policy is used.
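One way to picture the discrete zero-slack policy is as choosing the highest frequency whose hotspot temperature stays at or below the threshold. The sketch below is an illustration under assumptions, not a transcription of Theorem 5: the hotspot temperature at each frequency is taken to be normal with a given (mean, std), and it is assumed to increase with frequency for every workload, so that the "safe at frequency k" events are nested. The function names and all numeric parameters are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """P(X <= x) for X ~ N(mean, std**2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def frequency_distribution(freqs, temp_params, threshold):
    """Probability that each frequency is the highest admissible one.

    freqs       -- frequencies in ascending order
    temp_params -- dict: frequency -> (mean, std) of the hotspot temperature
    Returns (probs, residual); residual is the probability that even the
    lowest frequency exceeds the threshold.
    """
    # p_ok[k] = P(hotspot temperature at freqs[k] <= threshold)
    p_ok = [normal_cdf(threshold, *temp_params[f]) for f in freqs]
    probs = {}
    for k, f in enumerate(freqs):
        if k == len(freqs) - 1:
            probs[f] = p_ok[k]  # the top frequency is itself safe
        else:
            # safe at f_k but not at f_{k+1} (nested events assumed)
            probs[f] = max(p_ok[k] - p_ok[k + 1], 0.0)
    return probs, 1.0 - p_ok[0]


freqs = [0.5, 0.67, 0.83, 1.0]
params = {0.5: (70.0, 5.0), 0.67: (80.0, 5.0), 0.83: (90.0, 5.0), 1.0: (100.0, 5.0)}
probs, residual = frequency_distribution(freqs, params, threshold=100.0)
dfs_trigger_probability = 1.0 - probs[1.0]
print(probs, residual, dfs_trigger_probability)
```

The probability of triggering DFS falls out as one minus the probability of the maximum frequency, which is how the trigger probability is read off the frequency distribution.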
Given the probabilistic distribution of the frequency and the mean of hotspot temperature of the active core for a certain frequency, the average hotspot temperature of the active core aver hot,a can be obtained by
where Set = { 1 , . . . , , +1 , . . . , count } denotes the set of frequencies of multicore processors and ( ) denotes the probability that the frequency is ; ,a (H, , a , ) denotes the mean of hotspot temperature of the active core given the frequency .
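The average hotspot temperature is a probability-weighted sum over the frequency set, which a few lines can make concrete. The function name and the numbers below are hypothetical; the two dictionaries stand in for the frequency distribution and the per-frequency mean hotspot temperatures described above.

```python
def average_hotspot_temperature(freq_probs, mean_temps):
    """Sum over frequencies of P(f) * mean hotspot temperature at f."""
    return sum(p * mean_temps[f] for f, p in freq_probs.items())


# Hypothetical case: the processor runs at full speed 75% of the time
# and at the next lower frequency otherwise.
freq_probs = {0.83: 0.25, 1.0: 0.75}
mean_temps = {0.83: 90.0, 1.0: 98.0}
print(average_hotspot_temperature(freq_probs, mean_temps))  # 96.0
```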
Experimental Results

Experimental Methodology.
A multicore version of the Alpha 21264 processor with eight cores is used as the processor model in our experiment [24]. The cores have two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores are scaled uniformly. The processor uses four discrete frequencies: 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1], so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector H, corresponding to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100 °C. HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. To construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean and variance of the normally distributed IPC are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27] are selected as the benchmarks. These selected tasks are mutually independent; that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor.
A task is not fixed on a core, and tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients G dyn and G 0 in (21), and the variance of the regression residuals E dyn is obtained. The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]
where the parameters depend on the topology, size, technology, and design of processors. To construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients G lea,a and G 0,a in (22) and the coefficients G lea,ina and G 0,ina in (24), as well as the variances of the regression residuals E lea,a and E lea,ina. The parameters of (64) are set to the default values of PTScalar. The smaller the temperature range, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage power is performed over the temperature interval between 60 °C and 100 °C, and the regression results are used to estimate the leakage power over the whole temperature interval.
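The three preparation steps described in the methodology (frequency normalization, maximum likelihood estimation of the IPC distribution, and linear regression of leakage over 60 °C to 100 °C) can be sketched as follows. The IPC samples are invented, and the exponential curve is only a hypothetical stand-in for a nonlinear, monotonically increasing leakage function; it is not the PTScalar leakage model.

```python
import numpy as np

# 1. Normalize the discrete frequencies by the maximum frequency.
freqs_ghz = np.array([1.5, 2.0, 2.5, 3.0])
freqs_norm = np.round(freqs_ghz / freqs_ghz.max(), 2)

# 2. Maximum likelihood estimates for a normal distribution: the sample
#    mean and the biased sample variance (ddof=0, NumPy's default).
ipc_profile = np.array([0.9, 1.1, 1.0, 1.2, 0.8])  # hypothetical IPC samples
ipc_mean = ipc_profile.mean()
ipc_var = ipc_profile.var()

# 3. Fit a straight line to a convex leakage curve on [60, 100] deg C.
temps = np.linspace(60.0, 100.0, 41)
leak = 0.5 * np.exp(0.02 * (temps - 60.0))  # stand-in curve, illustration only
slope, intercept = np.polyfit(temps, leak, 1)
resid_var = np.var(leak - (slope * temps + intercept))

print(freqs_norm, ipc_mean, ipc_var, slope, intercept, resid_var)
```

Restricting the fit to the interval of interest keeps the residual variance small there, at the price of a poorer fit outside it.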
Estimated Accuracy.
For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 shows the same comparison for inactive cores. A higher estimation accuracy of leakage power is obtained over the temperature interval between 60 °C and 100 °C at the cost of lower accuracy over the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60 °C and 100 °C. It can be seen that the error rates of the dynamic power vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, whereas it is 10.14% for the floating point register (FPReg). The reason is that the linear correlation between the IPC and the dynamic power differs across functional units, and a lower error rate results from a higher correlation. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and the dynamic power of FPReg have the lowest. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of various functional units have similar linear correlations. To account for the impact of estimation errors on the analysis of temperature and frequency, the error terms E dyn, E lea,a, and E lea,ina are introduced into the linear models of dynamic power and leakage power as expressed in (21), (22), and (24).
Probabilistic Distribution of Temperature.
Under the assumption that no DTM technique is used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency jointly determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency yields a higher temperature, and vice versa. For a normal distribution, the shape of the probability density curve is determined by the standard deviation of the random variable: a curve with a higher peak implies a smaller standard deviation, that is, a lower variation, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature. Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7.
When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and number of active cores.
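The link between peak height and spread used when reading the density curves follows directly from the normal density, whose maximum value is 1 / (σ√(2π)). A small sketch, with a hypothetical function name and illustrative standard deviations:

```python
import math


def normal_peak(std):
    """Maximum value of the N(mean, std**2) density, reached at the mean."""
    return 1.0 / (std * math.sqrt(2.0 * math.pi))


# Doubling the standard deviation halves the peak of the density curve.
print(normal_peak(1.0), normal_peak(2.0))
```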
Probabilistic Distribution of Frequencies.
If the power-gating and DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency less than 1 implies that the hotspot temperature would otherwise surpass the threshold, so the DFS technique is triggered to reduce the frequency of the processor. The probability of triggering DFS can therefore be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability of triggering DFS when the number of active cores is six, seven, and eight, respectively. If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1, and the probability of triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9. It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability of triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain level, the running frequency is scaled down to keep the temperature under the threshold.
As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases, and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the saved power by shutting off more than two cores makes it deterministic for the active cores to run at the full speed, so the probability that the processor runs at the maximum frequency is 100%, and the probability for triggering DFS is 0%.
Comparisons of Temperatures with and without DFS.
If the DFS technique is not used, the processor always runs at full speed; that is, the running frequency is constantly 1. The average hotspot temperature of the active core without DFS can then be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature does not exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is 0%. If the DFS technique is not used by the processor, the running frequency is always the maximum, and the hotspot temperature may exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is larger than 0%. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS. As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores can execute at full speed, and the cooling effect of DFS weakens until it disappears. Therefore, the average hotspot temperatures of the processors with and without DFS become closer as the number of active cores decreases, as shown in Table 3. Even when DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0% as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold at any speed, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0%, and the average hotspot temperatures of the processor with and without DFS are the same. Because DFS makes no difference when the number of active cores is lower than 6, the comparisons for this situation are not given in Table 3.
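For the case without DFS, the probability that the hotspot temperature exceeds the threshold is the upper tail of its normal distribution at full speed. The sketch below assumes that distribution is given by a mean and a standard deviation; the function name and the example values are hypothetical and are not taken from Table 3.

```python
import math


def exceed_probability(mean, std, threshold=100.0):
    """P(T > threshold) for T ~ N(mean, std**2), via the complementary error function."""
    z = (threshold - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# A mean exactly at the threshold gives a 50% chance of exceeding it;
# a mean four standard deviations below makes the event very unlikely.
print(exceed_probability(100.0, 5.0), exceed_probability(80.0, 5.0))
```

As cores are powered off and the mean moves further below the threshold, this tail probability shrinks toward zero, which matches the trend described for fewer active cores.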
Conclusions
In this paper, a probabilistic analysis method of the temperature and frequency of multicore processors is presented taking the variation of workloads into account. It is proved theoretically in this paper that (1) the hotspot temperatures of both active cores and inactive cores are the linear functions of the IPC; (2) the hotspot temperature follows the normal probabilistic distribution based on the assumption that IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.
From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers for different functional units have significant variations, indicating that the linear correlations between the IPC and the dynamic powers for various functional units are different; (2) the estimation error rates of leakage powers for both active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across various functional units; (3) a higher estimation accuracy of leakage powers can be obtained over the temperature interval between 60 °C and 100 °C at the cost of lower accuracy over other intervals; (4) the hotspot temperature of the processor is not deterministic and has significant variation for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
