Abstract: Dynamic Thermal Management (DTM) emerged as a solution to the reliability challenges posed by thermal hotspots and unbalanced temperatures. DTM efficiency depends strongly on the accuracy of the temperature information presented to the DTM manager. This work investigates how inaccuracy caused by deep sub-micron (DSM) noise during the transmission of temperature information to the manager affects DTM efficiency. A simulation framework has been developed, and results show up to 62% DTM performance degradation under DSM noise. This finding highlights the importance of further research on reliable on-chip data transmission in DTM applications.
I. INTRODUCTION
In deep sub-micron (DSM) technologies with high power and temperature densities, dynamic thermal management (DTM) techniques are needed to maintain safe temperature levels during execution [1, 2]. DTM techniques usually rely on temperature information obtained either directly from on-chip thermal monitors or indirectly from techniques that estimate the temperature based on the power consumption of functional units [3, 4]. The DTM manager regulates the operating temperature based on the temperature information provided by the thermal monitors, which is transmitted over a bus interconnect in single-core systems or a network on chip (NoC) in multi-core systems [5, 6].
Accuracy of thermal data is a key feature of DTM and directly affects its efficiency [7]. A reported temperature profile lower than the actual temperature can result in late activation of DTM techniques, which may cause physical damage. Conversely, a reported temperature profile higher than the actual temperature can result in early activation of DTM, which degrades system performance [4].
Temperature sensing inaccuracies are caused by a variety of factors, including monitor placement, monitor device imprecision, and interconnection DSM noise. Optimized allocation of thermal monitors is an open problem that aims to assign monitors efficiently so that they cover the different hotspots [8].
Monitor device imprecision is due to inaccurate calibration, supply voltage fluctuations, process variation, and other factors [9].
The last factor that impacts thermal data accuracy is interconnection DSM noise. DSM noise can considerably affect the thermal data transmitted from the monitors to the DTM manager over a bus interconnect or NoC. Sources of interconnection DSM noise include IR drops, capacitive and inductive crosstalk, charge sharing, charge leakage, and process variations [10]. DSM noise increases with technology scaling, and achieving reliable transmission is a major challenge in future technology nodes.
This work proposes a simulator framework to investigate the impact of interconnection DSM noise on temperature data accuracy and DTM efficiency. To the best of our knowledge, this issue has not been investigated yet. Section II describes the investigation methodology and introduces the proposed simulator. Section III discusses the results and findings. Section IV summarizes the paper and draws conclusions.
II. INVESTIGATION METHODOLOGY
Increasing noise in DSM technology results in unreliable data transmission over on-chip interconnections. Hence, thermal data that are accurate when collected from the monitors are prone to faults by the time they reach the DTM manager. DTM efficiency is highly dependent on accurate input thermal data, because unnecessary DTM invocations increase the system performance slowdown. In addition, an inaccurate temperature profile may wrongly report a unit at emergency temperature as a normal one, which leaves that unit unattended for several cycles.
To investigate the effects of DSM noise on DTM efficiency, we propose a simulator framework named the DSM-DTM simulator, which is composed of power, temperature, interconnection (with DSM noise), and DTM simulators, as illustrated in Fig. 1.
The interconnection simulator transports the thermal data from the thermal monitors to the DTM manager via a parallel interconnection bus, and the DSM noise block injects noise that follows a Gaussian noise model onto the bus [11]. The inaccuracy of thermal data caused by DSM noise is quantified through the Bit Error Rate (BER), which is computed using the following equations:
(1)
(2)
where V_SW and σ_N are the voltage swing and the noise deviation, respectively [12]. The DTM simulator block implements the DTM manager, which dynamically controls the chip temperature according to the DTM policy and the input thermal data. Different DTM policies have been proposed, such as dynamic voltage and frequency scaling (DVFS) [13], task scheduling [14], fetch toggling [15], clock gating [16], and task migration [17].
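As an illustration of how a BER can be obtained from a voltage swing and a noise deviation under a Gaussian noise model, the following minimal sketch may help. It assumes the commonly used form BER = Q(V_SW / (2 σ_N)), where Q is the Gaussian tail function; this form is an assumption and is not confirmed by the text above. The function names and the 1.0 V swing in the usage lines are hypothetical.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bit_error_rate(v_swing: float, sigma_n: float) -> float:
    """Assumed BER model: Q(V_SW / (2 * sigma_N)). Not taken from the paper."""
    if sigma_n <= 0:
        raise ValueError("sigma_n must be positive")
    return q_function(v_swing / (2.0 * sigma_n))


def word_error_probability(ber: float, n_bits: int = 12) -> float:
    """Probability that at least one of n_bits independent bits is flipped."""
    return 1.0 - (1.0 - ber) ** n_bits


if __name__ == "__main__":
    # Hypothetical voltage swing of 1.0 V; 0.18 V is one noise deviation
    # that appears in the experiments.
    ber = bit_error_rate(v_swing=1.0, sigma_n=0.18)
    print(f"BER = {ber:.3e}, 12-bit word error = {word_error_probability(ber):.3e}")
```

Under this assumed model, the BER grows monotonically with the noise deviation for a fixed swing, and a 12-bit thermal word is corrupted more often than any single bit.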
The temperature simulator block models the thermal data from the monitors based on the chip floorplan and the power consumption of each unit inside the chip. In our proposed framework, HotSpot version 2.0 is used to generate the chip's temperature profile [6, 19]. The Wattch simulator [20] inside the power simulator provides cycle-accurate dynamic power based on the chip's activity.
To evaluate the impact of DSM noise on DTM efficiency, two thermal profiles are considered: the temperature profile without DSM noise (i.e., the reference) and the temperature profile under the effect of DSM noise. Two metrics are used, namely the system performance slowdown, SP_SL, and the percentage of unattended cycles in emergency temperature, defined as follows:
(3)
System performance slowdown is the increase in execution time that occurs when temperatures below the threshold are reported as emergency temperatures, which leads to unnecessary DTM invocations. Meanwhile, unattended cycles in emergency temperature are caused by late activation of DTM techniques when the reported temperature profile is lower than the actual temperature, which may result in physical damage.
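The comparison between the reference and noisy profiles can be sketched as follows. This is only an illustration: the exact metric definitions are not reproduced here, so the normalizations below (unattended cycles as a share of all sampled cycles, slowdown as relative increase in execution time) are assumptions, as are all function names. The 85 °C default mirrors the emergency temperature used in the experiments.

```python
from typing import Sequence, Tuple


def dtm_error_counts(reference: Sequence[float],
                     noisy: Sequence[float],
                     threshold: float = 85.0) -> Tuple[int, int]:
    """Return (false_alarms, missed_emergencies) over paired samples.

    false alarm: reference below threshold, noisy reading at or above it.
    missed:      reference at or above threshold, noisy reading below it.
    """
    false_alarms = 0
    missed = 0
    for ref, obs in zip(reference, noisy):
        if ref < threshold <= obs:
            false_alarms += 1
        elif obs < threshold <= ref:
            missed += 1
    return false_alarms, missed


def percent_unattended(reference: Sequence[float],
                       noisy: Sequence[float],
                       threshold: float = 85.0) -> float:
    """Assumed definition: missed emergency samples as a % of all samples."""
    if len(reference) == 0:
        raise ValueError("empty profile")
    _, missed = dtm_error_counts(reference, noisy, threshold)
    return 100.0 * missed / len(reference)


def slowdown_percent(time_noisy: float, time_reference: float) -> float:
    """Assumed definition: relative increase in execution time, in percent."""
    return 100.0 * (time_noisy - time_reference) / time_reference
```

False alarms drive the slowdown metric (each one triggers an unnecessary DTM response), while missed emergencies drive the unattended-cycle metric.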
III. EXPERIMENTAL RESULTS
In this work, our experimental results are based on a single-core processor with the configuration parameters shown in Table I, similar to [18, 22]. Standard benchmarks from the Wattch simulator are executed on the processor chip [23].
A simple dynamic frequency scaling (DFS) scheme has been chosen as the DTM policy. DFS is a reactive technique that is activated only when an emergency temperature is reached [2]. We choose 85°C as the emergency temperature, similar to [1, 18]. To implement DFS, we assume two built-in frequency settings [1]. All jobs run at full speed unless an emergency temperature value is observed. If a unit's temperature reaches the critical value, the frequency of that unit is reduced to the lower setting until the current task terminates. In our experiments, the lower frequency is half of the full-speed frequency. In addition, a transition penalty of 10 µs is applied for every change in frequency [3].
[25] is used to transport the thermal data, with 12 bits for data and 12 bits for address and control signals.
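The timing effect of the reactive two-level DFS policy described above can be sketched for a single task. This is a simplified model under stated assumptions: one frequency switch per task, a constant clock rate on each side of the switch, and no work done during the transition. The function name, its parameters, and the numbers in the usage lines are hypothetical, and the penalty is passed in explicitly rather than fixed.

```python
from typing import Optional


def dfs_execution_time(total_cycles: float,
                       f_full_hz: float,
                       penalty_s: float,
                       trigger_time_s: Optional[float] = None,
                       scale: float = 0.5) -> float:
    """Execution time of one task under a two-level reactive DFS policy.

    trigger_time_s is when an emergency temperature is observed (None if
    never). From that point the unit runs at scale * f_full_hz until the
    task ends, after paying one transition penalty.
    """
    full_speed_time = total_cycles / f_full_hz
    if trigger_time_s is None or trigger_time_s >= full_speed_time:
        return full_speed_time  # DFS never activated during this task
    cycles_done = trigger_time_s * f_full_hz
    remaining = total_cycles - cycles_done
    return trigger_time_s + penalty_s + remaining / (scale * f_full_hz)


if __name__ == "__main__":
    # Hypothetical task: 1e9 cycles on a 1 GHz clock, emergency observed
    # after 0.5 s, penalty read as 10 microseconds.
    base = dfs_execution_time(1e9, 1e9, penalty_s=10e-6)
    slowed = dfs_execution_time(1e9, 1e9, penalty_s=10e-6, trigger_time_s=0.5)
    print(f"slowdown = {100.0 * (slowed - base) / base:.2f}%")
```

In this model, a false emergency reported early in a task is more costly than one reported late, because the reduced frequency is held until the task terminates.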
The results present the system performance slowdown and the percentage of unattended cycles in emergency temperature for the "hot" and "warm" benchmarks under the DFS technique. "Anagram" and "test-arg" are considered hot benchmarks and the rest are warm benchmarks. No cold benchmarks are used since they are unaffected by DTM [7].
As illustrated in Fig. 2 and Fig. 3, the system performance slowdown is up to 33% higher in warm benchmarks than in hot benchmarks, while the percentage of unattended cycles in emergency temperature is up to 39% higher in hot benchmarks than in warm ones. Hot benchmarks spend more cycles in emergency temperature, so under DSM noise more of their cycles are in emergency temperature but reported as normal. Warm benchmarks, in contrast, spend more cycles in non-emergency temperature, so under DSM noise more of their cycles are wrongly reported as emergency temperature, which leads to more unnecessary DTM invocations than in hot benchmarks.
Figure annotations: noise deviation = 0.18 V, fscale = 50%.
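The asymmetry between hot and warm benchmarks can be reproduced qualitatively with a small Monte Carlo sketch. Everything below is an illustrative assumption rather than the simulator used in this work: the 12-bit fixed-point encoding with 1/16 °C resolution, the independent per-bit flips, the constant temperature traces, and all function names.

```python
import random
from typing import Sequence, Tuple

N_BITS = 12            # data width of the thermal word
LSB_C = 1.0 / 16.0     # hypothetical resolution: 1/16 degree C per code


def encode(temp_c: float) -> int:
    """Quantize a temperature to a 12-bit code, saturating at the limits."""
    return max(0, min((1 << N_BITS) - 1, round(temp_c / LSB_C)))


def decode(word: int) -> float:
    return word * LSB_C


def transmit(word: int, ber: float, rng: random.Random) -> int:
    """Flip each bit independently with probability ber."""
    for bit in range(N_BITS):
        if rng.random() < ber:
            word ^= 1 << bit
    return word


def count_errors(trace: Sequence[float], ber: float,
                 threshold: float = 85.0, seed: int = 0) -> Tuple[int, int]:
    """Return (false_alarms, missed_emergencies) for a temperature trace."""
    rng = random.Random(seed)
    false_alarms = missed = 0
    for t in trace:
        seen = decode(transmit(encode(t), ber, rng))
        if t < threshold <= seen:
            false_alarms += 1
        elif seen < threshold <= t:
            missed += 1
    return false_alarms, missed


if __name__ == "__main__":
    hot = [90.0] * 2000    # always above the emergency threshold
    warm = [80.0] * 2000   # always below the emergency threshold
    print("hot  (false alarms, missed):", count_errors(hot, ber=0.01))
    print("warm (false alarms, missed):", count_errors(warm, ber=0.01))
```

A trace that stays above the threshold can only produce missed emergencies, and a trace that stays below it can only produce false alarms, which mirrors the trend reported for the hot and warm benchmarks.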
Figure 5. Comparison of unattended cycles in emergency temperature for a processor chip with 8 and 16 thermal monitors (x-axis: number of sensors).
IV. CONCLUSION
This work investigates the effect of DSM noise on DTM efficiency in terms of the percentage of unattended cycles in emergency temperature and the system performance slowdown. In general, as the noise deviation increases, the percentage of unattended cycles in emergency temperature increases and system performance slows down further. DTM efficiency deteriorates further as the number of monitors on the processor chip increases. This highlights the significant effect of interconnection DSM noise in degrading DTM efficiency, an effect that is expected to worsen in multicore systems-on-chip at advanced technology nodes.
