Temperature Sensor Placement Including Routing Overhead and Sampling Inaccuracies by Ituero Herrero, Pablo et al.
2012 International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design 
(SMACD) 
Temperature Sensor Placement Including Routing 
Overhead and Sampling Inaccuracies 
Pablo Ituero Fernando Garcia-Redondo Marisa Lopez-Vallejo 
Abstract—Dynamic thermal management techniques require a 
collection of on-chip thermal sensors that imply a significant 
area and power overhead. Finding the optimum number of 
temperature monitors and their location on the chip surface 
to optimize accuracy is an NP-hard problem. In this work 
we improve the modeling of the problem by including area, 
power and networking constraints along with the consideration of 
three inaccuracy terms: spatial errors, sampling rate errors and 
monitor-inherent errors. The problem is solved by the simulated 
annealing algorithm. We apply the algorithm to a test case 
employing three different types of monitors to highlight the 
importance of the different metrics. Finally we present a case 
study of the Alpha 21364 processor under two different constraint 
scenarios. 
I. INTRODUCTION 
Temperature is a growing issue for electronics designers.
High temperature peaks in specific locations, known as
hotspots, lead to unreliable circuit operation and degrade the
performance of the system. Dynamic thermal management (DTM)
techniques adapt the circuit to its particular thermal
conditions at run time to increase lifetime and reliability.
In this scenario, built-in temperature monitoring systems play
a decisive role. However, allocating an arbitrarily large number
of temperature monitors entails a significant area and power
overhead. Therefore, the number and location of thermal monitors
must be optimized based on the characterization of the thermal
behavior of the IC for a given application [1].
Given a set of known hotspots, finding the location of the 
smallest set of temperature sensors which minimizes the worst 
reading error is an NP-hard problem [2], and several solutions
have been proposed to solve it in a reasonable computation
time.
In [1], the first work in the field, a uniform and a k-means
approach are presented. The former can be used if no
thermal profile is available, but it generally requires a very
fine-grained grid of monitors to achieve an acceptable level of
accuracy. The k-means approach works with a fixed number k of
monitors: it forms k clusters of hotspots and places each
monitor in the thermal center of each cluster. It does not
directly optimize the sensor error, but rather a combined metric
of location proximity and the average temperature error. Another
proposal, the RCN (Reda-Cochran-Nowroz) algorithm [2], is a
heuristic algorithm composed of two phases. The first phase takes
the thermal characterization data and determines the initial
location of all the sensors. The next phase iteratively improves
the locations of the thermal sensors. A broader solution space
is explored, achieving better results. Sensor placement has
also been formulated as an integer linear program (ILP) [3].
The inputs of the program are the set of potential hotspots
along with a set of the desired accuracies for these points
and a set of potential sensor locations. The algorithm finds
the minimum set of sensors (and their locations) that fulfills
the accuracy requirements. All these proposals only target
accuracy optimization.
Fig. 1. Overview of the environment. Blocks: Hotspots
Information; Area, Power and Accuracy Constraints; Simulated
Annealing (Design Space Exploration, Cost Function Evaluation);
Annealing Conf; Correction Terms Weights; outputs: Monitors
Locations and Sampling Rate.
In this paper we improve the modeling of the problem with 
the following elements: 
• Consideration of the area and power consumption of the 
monitors as a design constraint. 
• Consideration of the interconnection costs. 
• Consideration of three inaccuracy terms: spatial errors, 
sampling rate errors and monitor-inherent errors. 
This modeling is used by the simulated annealing algorithm
to approach the problem of temperature sensor allocation. This
algorithm, albeit computationally intensive, performs an
extensive design space exploration and finds near-optimal
solutions. The algorithm yields the optimum number and location
of the monitors along with the maximum allowable sampling rate.
II. APPROACH DESCRIPTION 
Figure 1 depicts an overview of the proposed temperature
sensor placement and allocation environment. As shown, the
algorithm receives the system layout information, the results of
the thermal characterization (i.e., the hotspots information)
and the area, power and accuracy constraints. The algorithm
also retrieves the monitor and network characteristics and is
configured by means of a set of correction-term weights.
The algorithm yields the location of the monitors together with
the maximum allowable sampling frequency that fulfills the
power budget. The next sections describe these concepts in detail.
A. Simulated Annealing 
Given a set of known hotspots, finding the location of all
the elements of the smallest set of temperature sensors which
minimizes the worst reading error is an NP-hard combinatorial
optimization problem. This kind of problem can be solved by
heuristic algorithms and by general optimization procedures.
Simulated annealing [4] is a well-known general optimization
procedure that provides good solutions at the price of long
computation times, because a vast solution space is explored.
This drawback can be overcome by making use of sophisticated
cooling schedules, as is the case of the one used here [5].
Moreover, the use of incremental cost functions becomes a
must to further reduce the computation time. In our case, since
the allocation of sensors is a task that is carried out off-line,
the focus is not on the execution time of the algorithm, but on
the quality of the resulting solution. Thus, a proper modeling of
the whole sensor allocation task becomes the decisive part
of this problem. This modeling, described next, includes the
definition of adequate cost metrics and the formulation of a
suitable cost function.
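As an illustration of the general procedure, the following is a minimal
sketch of a simulated annealing loop with Metropolis acceptance. The
geometric cooling factor `alpha`, the parameter names and the toy
one-dimensional problem are assumptions made for brevity; the cooling
schedule of [5] used in this work is not reproduced here.

```python
import math
import random


def anneal(initial, neighbor, cost, t_start=1.0, t_end=1e-3,
           alpha=0.95, moves_per_temp=50, seed=0):
    """Generic simulated annealing with Metropolis acceptance.

    Geometric cooling (t <- alpha * t) is an assumption for this sketch;
    it is NOT the cooling schedule of [5].
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept a worse candidate with
            # probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha
    return best, best_cost


# Toy usage (hypothetical problem): minimize (x - 3)^2 over the reals.
best_x, best_f = anneal(
    initial=0.0,
    neighbor=lambda x, rng: x + rng.uniform(-0.5, 0.5),
    cost=lambda x: (x - 3.0) ** 2,
)
```

For the sensor allocation problem, `neighbor` would apply one of the
allowed moves and `cost` would evaluate the complete cost function.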
B. Problem Modeling 
In the temperature sensor placement and allocation problem 
we establish the quality of a certain solution by means of four 
different metrics, namely the accuracy, the area, the power 
consumption and the interconnection costs. 
The accuracy of a measurement taken in a monitor
network is affected by three main error sources. First of all,
monitors display a certain 3σ error margin, referred to as
$\varepsilon_{sensor}$. Secondly, the distance from the sensors to the hotspots
is another source of inaccuracies. Ideally, each hotspot should 
have a thermal sensor placed as close to it as possible to track 
its temperature and deliver it to the DTM system. However, the 
power and area costs of temperature sensors along with their 
calibration overheads restrict the number of instances that can 
be allocated in the system. From the heat diffusion equation,
we know that the difference in temperature reading of a sensor
that is located at a certain distance r from a hotspot with
temperature $T_{hotspot}$ is given by:
$$\varepsilon_{spatial} = T_{hotspot} \left( 1 - e^{-2r/k} \right) \qquad (1)$$
where k denotes the thickness of the processor package (die,
heat spreader, and thermal interface material) in terms
of silicon [6]. The third source of inaccuracies is the effect of a
finite sampling rate. In an ideal scenario, the thermal controller 
would have the readings of all the monitors constantly updated. 
However, due to power budget restrictions, the sampling rate 
of the network must be limited. As a result, the stored
readings of the monitors accumulate an error in the
presence of temporary thermal gradients. Again from the heat
diffusion equation, we can establish the following expression
for the error produced by this effect:
$$\varepsilon_{sampling} = \frac{1}{f_{sampling}} H_{max} \qquad (2)$$
where $f_{sampling}$ refers to the sampling frequency and $H_{max}$
denotes the maximum heat flow expected on the chip. Therefore,
the expression of the total deviation from the hotspot
actual temperature is given by:
$$\varepsilon_{cost} = \varepsilon_{sensor} + \varepsilon_{spatial} + \varepsilon_{sampling} \qquad (3)$$
The optimization of the system accuracy entails the mini-
mization of the error committed by each monitor along with 
the minimization of the average error. 
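The three inaccuracy terms can be combined as in the following sketch.
It assumes an exponential-decay form for the spatial term of Eq. (1),
with the exponent coefficient exposed as the hypothetical parameter
`decay`, and it assumes that the maximum heat flow is expressed in
degrees Celsius per second so that Eq. (2) yields a temperature. The
function names and the numeric values are illustrative only.

```python
import math


def spatial_error(t_hotspot, r, k, decay=2.0):
    """Eq. (1)-style spatial error: T_hotspot * (1 - exp(-decay * r / k)).

    `decay` is an assumed exponent coefficient; r and k share one unit.
    """
    return t_hotspot * (1.0 - math.exp(-decay * r / k))


def sampling_error(h_max, f_sampling):
    """Eq. (2): error accumulated between two samples.

    h_max is assumed to be in degC/s and f_sampling in samples/s.
    """
    return h_max / f_sampling


def total_error(eps_sensor, t_hotspot, r, k, h_max, f_sampling):
    """Eq. (3): sum of sensor, spatial and sampling errors (degC)."""
    return (eps_sensor
            + spatial_error(t_hotspot, r, k)
            + sampling_error(h_max, f_sampling))


# Hypothetical example: a sensor with a 1.8 degC inherent error placed
# exactly on an 85 degC hotspot (r = 0), sampled 1000 times per second
# with an assumed maximum heat flow of 100 degC/s.
eps = total_error(eps_sensor=1.8, t_hotspot=85.0, r=0.0, k=1.0,
                  h_max=100.0, f_sampling=1000.0)
```

With r = 0 the spatial term vanishes, so only the sensor and sampling
terms contribute; moving the sensor away from the hotspot or lowering
the sampling rate can only increase the result.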
Concerning the area and the power consumption, the calculations
are straightforward once the number of monitors, n,
has been set:
$$A_{cost} = n A_{monitor} \qquad (4)$$
$$P_{cost} = n f_{sampling} E_{meas} \qquad (5)$$
where $A_{monitor}$ is the area of a monitor and $E_{meas}$ is
the energy needed by a sensor to take a measurement.
These parameters are inputs of the system and depend on the
characteristics of the chosen type of monitor.
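A short sketch of Eqs. (4) and (5) follows. The helper
`max_sampling_rate`, which inverts Eq. (5) for a given power budget, is
an addition made here for illustration. The example uses the M1 figures
reported in Table I (0.032 mm2 and 0.41 nJ per measurement) and assumes
that the power budget of the power-driven scenarios is 5 microwatts.

```python
def area_cost(n_monitors, area_monitor_mm2):
    """Eq. (4): total monitor area in mm^2."""
    return n_monitors * area_monitor_mm2


def power_cost(n_monitors, f_sampling_hz, e_meas_joule):
    """Eq. (5): average power in watts (energy per sample times rate)."""
    return n_monitors * f_sampling_hz * e_meas_joule


def max_sampling_rate(power_budget_w, n_monitors, e_meas_joule):
    """Largest sampling rate (samples/s) that fits the power budget."""
    return power_budget_w / (n_monitors * e_meas_joule)


# 20 monitors of type M1 sampled at 1000 samples/s.
area = area_cost(20, 0.032)                 # about 0.64 mm^2
power = power_cost(20, 1000.0, 0.41e-9)     # about 8.2e-6 W

# 9 monitors of type M1 under an assumed 5 uW budget.
f_max = max_sampling_rate(5e-6, 9, 0.41e-9)  # about 1355 samples/s
```

These values agree with the area, power and sampling rate listed for
the corresponding M1 rows of Table II.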
For the modeling of the interconnection costs, we have
considered two cases. In the first, we target a point-to-point
network architecture in which each monitor connects to
a central controller located at the temperature-weighted
average of the hotspot positions:
$$(x_c, y_c) = \left( \frac{\sum_i x_{hi} t_{hi}}{\sum_i t_{hi}}, \frac{\sum_i y_{hi} t_{hi}}{\sum_i t_{hi}} \right) \qquad (6)$$
where $x_c$ and $y_c$ denote the coordinates of the controller and
$x_{hi}$, $y_{hi}$, $t_{hi}$ refer to the coordinates and the temperature of
the i-th hotspot, respectively. The cost of the interconnection
is modeled as the sum of the Manhattan distances from each
monitor to the central controller:
$$I_{cost} = \sum_i \left( |x_c - x_{mi}| + |y_c - y_{mi}| \right) \qquad (7)$$
where $x_{mi}$ and $y_{mi}$ refer to the coordinates of the i-th monitor.
The second case for the interconnection cost models a 
JTAG-style architecture in which the monitors are connected 
forming a chain of serial connections. In this case, we employ 
the semiperimeter method to estimate the cost: 
$$I_{cost} = \left( \max_i\{x_{mi}\} - \min_i\{x_{mi}\} \right) + \left( \max_i\{y_{mi}\} - \min_i\{y_{mi}\} \right) \qquad (8)$$
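Both interconnection models can be sketched in a few lines. The
function names and the data layout (hotspots as `(x, y, t)` tuples,
monitors as `(x, y)` tuples) are assumptions made for this example.

```python
def controller_location(hotspots):
    """Eq. (6): temperature-weighted centroid of the hotspots.

    hotspots: iterable of (x, y, t) tuples.
    """
    hotspots = list(hotspots)
    total_t = sum(t for _, _, t in hotspots)
    xc = sum(x * t for x, _, t in hotspots) / total_t
    yc = sum(y * t for _, y, t in hotspots) / total_t
    return xc, yc


def point_to_point_cost(monitors, controller):
    """Eq. (7): sum of Manhattan distances from monitors to controller."""
    xc, yc = controller
    return sum(abs(xc - x) + abs(yc - y) for x, y in monitors)


def chain_cost(monitors):
    """Eq. (8): semiperimeter of the bounding box of the monitors."""
    xs = [x for x, _ in monitors]
    ys = [y for _, y in monitors]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical layout: four hotspots and two monitors.
hotspots = [(0, 0, 85), (10, 0, 85), (0, 10, 65), (10, 10, 65)]
monitors = [(0, 0), (10, 10)]
ctrl = controller_location(hotspots)        # pulled toward the hot side
p2p = point_to_point_cost(monitors, ctrl)
chain = chain_cost(monitors)
```

The controller ends up closer to the hotter hotspots, which is the
intended effect of weighting the positions by temperature.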
C. Complete Cost Function Formulation 
For our temperature monitor placement and allocation al-
gorithm, the important information related to the process is: 
• The global cost associated to the solution, provided by 
the area, power consumption and interconnection costs. 
• The design constraints (maximum available area, power 
consumption and acceptable error) and goals (target av-
erage error). 
This information should be considered not only in a quan-
titative but also in a qualitative way, taking into account 
its nature. In this sense, design constraints must define the 
search space and cost issues must be used to characterize the 
quality of the solution. The type of cost related to the quality 
attributes should also be taken into account: fixed costs must 
be considered in a different way than variable costs. 
To build the expression of the cost function, we follow [7],
which proposes the use of different correction terms (one
per constraint) to guide the search in the design space.
$$\mathcal{F}(S) = \sum_i \frac{C_i(S)}{\bar{C}_i} + \sum_i k_{C_i} \mathcal{F}_c(C_i(S)) \qquad (9)$$
where S is the solution under evaluation, $C_i(S)$ is the value of
a particular objective or cost, $\bar{C}_i$ is the i-th design constraint
applied to the quality attribute $C_i(S)$ in a given partition S,
the correction terms are denoted by $\mathcal{F}_c(C_i(S))$ and $k_{C_i}$ is the
weight factor for the i-th correction term.
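The idea of a cost function with weighted correction terms can be
illustrated as follows. This is only a sketch under stated assumptions:
the normalization of each cost by its constraint and the quadratic
penalty used as correction term are choices made here for illustration,
not the expressions of [7]. All names and values are hypothetical.

```python
def correction_term(value, limit):
    """Assumed correction term: zero while the constraint is met,
    quadratic in the relative excess once it is violated."""
    excess = max(0.0, value / limit - 1.0)
    return excess ** 2


def global_cost(costs, limits, weights):
    """Normalized costs plus weighted correction terms.

    costs, limits and weights are dicts keyed by metric name.
    """
    base = sum(costs[m] / limits[m] for m in costs)
    penalty = sum(weights[m] * correction_term(costs[m], limits[m])
                  for m in costs)
    return base + penalty


limits = {"area_mm2": 2.1, "power_uw": 5.0}
weights = {"area_mm2": 10.0, "power_uw": 10.0}

feasible = global_cost({"area_mm2": 1.98, "power_uw": 5.0},
                       limits, weights)
infeasible = global_cost({"area_mm2": 1.98, "power_uw": 6.0},
                         limits, weights)
```

A solution that violates a constraint is not rejected outright; it is
penalized in proportion to its weight, which lets the annealing cross
infeasible regions while still being guided back to the search space
defined by the constraints.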
D. Design Space Exploration 
To cover the spectrum of possible solutions as much as 
possible, we have employed the following set of allowed 
moves. During the first phase of the annealing the only change 
that is allowed is the addition of new sensors at random places. 
Once this initial solution is set up, these are the allowed 
movements: 
• Add a new sensor at a random place. 
• Remove a random sensor. 
• Move a random sensor to a new random position. 
• Randomly change the sampling rate.
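The move set listed above can be sketched as a neighbor function. The
floorplan dimensions, the sampling rate bounds, the uniform choice
among moves and the `Solution` container are assumptions made for this
example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Solution:
    sensors: list = field(default_factory=list)  # [(x, y), ...] in mm
    f_sampling: float = 1000.0                   # samples per second


def random_point(rng, width, height):
    """Uniformly random position on the floorplan."""
    return (rng.uniform(0.0, width), rng.uniform(0.0, height))


def initial_solution(rng, n_sensors, width=50.0, height=25.0):
    """First phase: only additions of sensors at random places."""
    sensors = [random_point(rng, width, height) for _ in range(n_sensors)]
    return Solution(sensors)


def propose_move(sol, rng, width=50.0, height=25.0,
                 f_min=100.0, f_max=10000.0):
    """Return a new Solution produced by one of the four allowed moves."""
    sensors = list(sol.sensors)      # copy, the input is left untouched
    f_sampling = sol.f_sampling
    moves = ["add", "sampling"]
    if sensors:                      # remove/move need at least one sensor
        moves += ["remove", "move"]
    move = rng.choice(moves)
    if move == "add":
        sensors.append(random_point(rng, width, height))
    elif move == "remove":
        sensors.pop(rng.randrange(len(sensors)))
    elif move == "move":
        idx = rng.randrange(len(sensors))
        sensors[idx] = random_point(rng, width, height)
    else:
        f_sampling = rng.uniform(f_min, f_max)
    return Solution(sensors, f_sampling)


rng = random.Random(1)
start = initial_solution(rng, n_sensors=5)
candidate = propose_move(start, rng)
```

Returning a fresh `Solution` keeps the current state intact, so a
rejected candidate can simply be discarded by the annealing loop.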
III. RESULTS AND DISCUSSION 
To emphasize the importance of the different metrics
and provide a deeper understanding of the trade-offs inherent
to the algorithm, in this section we first apply it to a test
floorplan of 25x50 mm² containing 20 hotspots, 10 of which are
spread out homogeneously in the leftmost half and reach 85°C,
whereas the other 10 are spread likewise in the rightmost half
and reach 65°C. For this floorplan we apply two different
sets of constraints to three types of monitors. In this test case
the type of monitor is fixed, so we can analyze the
implications of having different monitor characteristics in the
resulting allocation. In summary, we apply each constraint
set to the three monitors, making a total of six test scenarios.
In the first three scenarios we target minimizing (no $k_{C_i}$
terms) the average and maximum error and set an area
constraint of 2.1 mm², employing a penalty function. In the
second three scenarios, we also target minimizing the errors,
but the constraint is set on the power consumption, with a
budget of 5 µW. The three monitors have been selected from [11];
they are all fabricated in a 0.18 µm technology node and
feature different trade-offs between area, precision and power
consumption, as shown in Table I.
TABLE I
SELECTED MONITORS CHARACTERISTICS.
Monitor #  Source  Area (mm²)  PP Err (°C)  Energy (nJ)
M1         [8]     0.032       1.8          4.1E-01
M2         [9]     1.0         0.2          3.5E+03
M3         [10]    0.18        1.0          3.0E-01
TABLE II
SUMMARY OF RESULTS.
Case             εcost (°C)  #Mon.  Area (mm²)  Power (µW)  Samples/sec
Area-driven M1   1.90        20     0.64        8.2         1000
Area-driven M2   6.46        2      2.0         3.7E4       5345
Area-driven M3   2.47        11     1.98        3.3         10000
Power-driven M1  4.73        9      0.28        5.0         1355
Power-driven M3  3.43        10     1.8         5.0         1666
Alpha w/o net    1.69        11     1.98        3.3         10000
Alpha with net   1.80        10     1.8         3.0         10000
The summary of the results of these placements and allocations
is shown in Table II. Although we prepared six
scenarios, just five feasible solutions were produced by the
algorithm: the power-driven constraints were too tight for the
second monitor, which is very energy hungry, and no solution
was produced for it. Figures 2 and 3 show the resulting
distributions of monitors. Figure 2 has the first three scenarios
superposed, whereas Figure 3 has the remaining two. To the left
of each hotspot there is a list with the maximum measurement
error for that hotspot corresponding to each type of monitor
distribution. Errors corresponding to type one are at the top
and the ones corresponding to type three are at the bottom.
Fig. 2. Sensor distributions. Area-driven cases. Legend: hotspots,
type 1 sensors, type 2 sensors, type 3 sensors.
Fig. 3. Sensor distributions. Power-driven cases. Legend: hotspots,
type 1 sensors, type 3 sensors.
As shown, in the first three scenarios, as the power consumption
was left as a secondary metric, the limit in area
established the maximum number of monitors, and thus the
algorithm found the best placements to minimize the error.
The placements in the 85°C and the 65°C halves are different
because the spatial error diminishes as the hotspot
temperature decreases. The solution found for the first type of
monitors is what we can call the obvious solution, which places
one monitor at each hotspot; this solution is reached because
the area constraint is relatively loose for this type of monitor.
In the other two cases, the constraint limits the number of
monitors and the algorithm finds the best placement from an
error viewpoint. Interestingly, there are different allocations
and placements for the rightmost and leftmost halves, as
they display different temperatures. As we mentioned, for the
power-driven cases, just monitors of types one and three have
been considered. As shown in Figure 3, the algorithm allocates
nine monitors of type one and ten monitors of type three. In
this case, the maximum error of the monitors, along with their
power consumption, determines the behavior of the algorithm,
which seeks the trade-off between number of monitors,
sampling rate and maximum error. The placement of type three
monitors achieves a significantly lower $\varepsilon_{cost}$ while
maintaining an equal power consumption by adding one additional
monitor and increasing the sampling rate.
To validate our placement algorithm in a
realistic environment, we have targeted a single-core Alpha 21364
processor. The features of this processor were presented in
[12]: it was fabricated in a 0.18 µm process, occupying an area
of 21.1x18.8 mm² and containing 152M transistors. In order
to obtain the hotspots information, we have taken the results
provided by [1] employing the combination of the SPEC2000
benchmark suite [13], SimpleScalar [14], Wattch [15] and
Hotspot [16]. For each point of the grid, we have found the
maximum temperature throughout the 25 SPEC2000 benchmarks.
A point is considered a hotspot if its maximum
temperature is above those of the points surrounding it. The L2
caches of the processor are not considered in our study. Only
monitors of the third type are used because we believe they
represent the best trade-off between area, power and accuracy.
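The hotspot criterion described above can be sketched as a local-maximum
search over a grid of per-point maximum temperatures. The use of the
8-neighbourhood, the strict inequality and the data layout (one 2D list
per benchmark) are assumptions of this example; the temperature values
are made up.

```python
def max_over_benchmarks(maps):
    """Per-point maximum temperature across several thermal maps."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[max(m[r][c] for m in maps) for c in range(cols)]
            for r in range(rows)]


def find_hotspots(max_temp):
    """Return (row, col, temperature) for every grid point that is
    strictly hotter than all of its (up to 8) neighbours."""
    rows, cols = len(max_temp), len(max_temp[0])
    hotspots = []
    for r in range(rows):
        for c in range(cols):
            t = max_temp[r][c]
            neighbours = [
                max_temp[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(t > n for n in neighbours):
                hotspots.append((r, c, t))
    return hotspots


# Two hypothetical benchmark maps on a 3x4 grid (degrees Celsius).
map_a = [[70, 60, 60, 60],
         [60, 60, 60, 80],
         [60, 60, 60, 60]]
map_b = [[65, 60, 60, 60],
         [60, 60, 60, 85],
         [60, 60, 60, 60]]
hotspots = find_hotspots(max_over_benchmarks([map_a, map_b]))
```

In this toy grid the two hotspots are the corner point at 70 degrees and
the edge point at 85 degrees, each taking its maximum from a different
benchmark.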
The first results are displayed in Figure 4.a. Circles represent
hotspots: they go from smaller to bigger as their measurement
error increases, and their color goes from light blue to
dark red as their temperature rises. We have employed a set
of restrictions entailing a mixture of metrics, but giving a
bigger weight to the maximum error. As shown, 11 monitors
are allocated, achieving an accuracy of 1.69°C. For the second
scenario, Figure 4.b, we have included a significant weight
for the network metric; the triangle represents the controller
location. All the monitors are attracted to the controller,
consistently with the metric requirements; this translates
into a certain precision loss, which in turn makes the number
of monitors decrease to 6, achieving an accuracy of 1.93°C.
The numerical results are displayed at the bottom of Table II.
IV. CONCLUSIONS 
This work has presented a complete modeling of the problem
of temperature sensor placement and allocation including
accuracy, area, power and interconnection constraints. For the
error modeling, we have employed spatial and finite sampling
inaccuracies along with the errors inherent to the monitors.
The modeling has been implemented with the Simulated
Annealing algorithm. The trade-off between the different
constraints has been analyzed for a test floorplan. A complete case
study for the Alpha 21364 processor including two different
allocations has been presented.
Fig. 4. Sensor distributions for the Alpha case studies, panels
(a) and (b), drawn over the core floorplan with the hotspots
marked.
REFERENCES 
[1] S. Memik, R. Mukherjee, M. Ni, and J. Long, "Optimizing Thermal
Sensor Allocation for Microprocessors," IEEE Trans. on Computer-Aided
Design of Integrated Circuits, vol. 27, no. 3, pp. 516-527, 2008.
[2] S. Reda, R. Cochran, and A. Nowroz, "Improved Thermal Tracking for
Processors Using Hard and Soft Sensor Allocation Techniques," IEEE
Trans. on Computers, 2011.
[3] S. Sharifi and T. Rosing, "Accurate direct and indirect on-chip
temperature sensing for efficient dynamic thermal management," IEEE
Trans. on Computer-Aided Design of Integrated Circuits, vol. 29,
no. 10, pp. 1586-1599, 2010.
[4] S. Kirkpatrick, C. Gelatt, and M. Vecchi, "Optimization by simulated
annealing," Science, vol. 220, no. 4598, pp. 671-680, 1983.
[5] M. D. Huang, F. Romeo, and A. Sangiovanni-Vincentelli, "An Efficient
General Cooling Schedule for Simulated Annealing," 1986, pp. 381-384.
[6] K. Lee, K. Skadron, and W. Huang, "Analytical model for sensor 
placement on microprocessors," 2005. 
[7] M. López-Vallejo, J. Grajal, and J. López, "Constraint-driven system
partitioning," in Design, Automation and Test in Europe Conference and
Exhibition 2000. Proceedings. IEEE, 2000, pp. 411-416.
[8] M. Law and A. Bermak, "A 405-nw cmos temperature sensor based on 
linear mos operation," Circuits and Systems II: Express Briefs, IEEE 
Transactions on, vol. 56, no. 12, pp. 891-895, 2009. 
[9] L. Ho-Yin, C. Shih-Lun, and L. Ching-Hsing, "A cmos smart thermal 
sensor for biomedical application," IEICE transactions on electronics, 
vol. 91, no. 1, pp. 96-104, 2008. 
[10] C. Wu, W. Chan, and T. Lin, "A 80ks/s 36µw resistor-based temperature
sensor using bgr-free sar adc with a unevenly-weighted resistor string in
0.18µm cmos," in VLSI Circuits (VLSIC), 2011 Symposium on. IEEE,
2011, pp. 222-223.
[11] K. Makinwa, "Temperature Sensor Performance Survey," [Online].
Available: http://ei.ewi.tudelft.nl/docs/TSensor_survey.xls, Access date
April 2012.
[12] A. Jain et al., "A 1.2 ghz alpha microprocessor with 44.8 gb/s chip
pin bandwidth," in Solid-State Circuits Conf. 2001. IEEE, 2001, pp.
240-241.
[13] J. Henning, "Spec cpu2000: Measuring cpu performance in the new 
millennium," Computer, vol. 33, no. 7, pp. 28-35, 2000. 
[14] D. Burger and T. Austin, "The simplescalar tool set, version 2.0," ACM 
SIGARCH Computer Architecture News, vol. 25, no. 3, pp. 13-25, 1997. 
[15] D. Brooks, V. Tiwari, and M. Martonosi, "Wattch: a framework for 
architectural-level power analysis and optimizations," in SIGARCH, 
vol. 28, no. 2. ACM, 2000, pp. 83-94. 
[16] K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, 
and D. Tarjan, "Temperature-aware microarchitecture," in Proceedings 
of the 30th International Symposium on Computer Architecture, 2003. 
