Unified Datacenter Power Management Considering On-Chip and Air Temperature Constraints by Shi, Bing & Srivastava, Ankur
The Institute for Systems Research
ISR develops, applies and teaches advanced methodologies of design and
analysis to solve complex, hierarchical, heterogeneous and dynamic
problems of engineering technology and systems for industry and government.
ISR is a permanent institute of the University of Maryland, within the
A. James Clark School of Engineering. It is a graduated National Science
Foundation Engineering Research Center.
www.isr.umd.edu
ISR Technical Report 2010-8
Unified Datacenter Power Management Considering On-Chip
and Air Temperature Constraints
Bing Shi and Ankur Srivastava
Department of Electrical and Computer Engineering
University of Maryland, College Park, MD 20742, USA
e-mail:{bingshi,ankurs}@umd.edu
Abstract
Current approaches to datacenter power management (workload
scheduling, CPU speed control, etc.) focus primarily on keeping
the air temperature surrounding servers within the manufacturer
specified constraint. This is problematic since several CPUs may
still violate the on-chip thermal constraint, thereby leading to
a loss of reliability. The primary objective of this
work is to develop a unified approach for datacenter power op-
timization (by controlling the CPU speeds) which accounts for
both the silicon level temperature of the VLSI components such
as CPUs and the air temperature that directly impacts the reli-
ability of other devices such as disks, and also the performance
delivered. Our algorithm follows a two step approach: optimally
solving a convex approximation that assigns continuous frequency
values to all CPUs and a discretization step for legalization of the
assigned frequencies. The experimental results indicate that our
method guarantees that both the on-chip CPU temperature and the
off-chip air temperature stay within their constraints. In contrast,
the traditional approach of constraining only the air temperature
results in on-chip temperature violations on about 40% of the CPUs,
or requires about 42% more power to pull the CPU temperature back
within the constraint by increasing the HVAC cooling.
1 Introduction
Datacenters are centralized facilities that house a large number
of high performance servers along with several petabytes of
storage distributed across multiple racks. These facilities form the
backbone of online services and serve millions of users every day.
For example, YouTube serves up to 100 million videos a day [1];
Facebook has 400 million active users and 3 billion photos up-
loaded each month [2]. These videos and images are stored and
accessed from datacenters. Growing complexity and utilization of
datacenters for performing computational and data accessing tasks
has resulted in a significant increase in their power consumption
levels. Modern datacenters roughly use 1.5% of the US electricity
consumption according to recent EPA estimates [3]. With grow-
ing computing and storage capabilities, increasing connectivity,
online services, advent of cloud computing, this energy footprint
is slated to increase in the coming years. Energy consumption in
datacenters comes from two broad sources: 1) power dissipation in
CPUs and the support circuitry (AC-DC converters, etc.), and 2)
power dissipation in the HVAC system. Energy dissipated in
electronic circuitry increases its operating temperature, which in
turn reduces the reliability and increases the failure rates of the
devices. Therefore, manufacturers of datacenter servers and racks
provide a maximum constraint on the air temperature surround-
ing the equipment. In order to maintain the air temperature, the
HVAC system supplies cold air through vents and expends a
significant amount of energy in doing so. Several approaches have
been investigated that attempt to constrain the growing power
demands in datacenters, including load balancing and the more
recent CPU Vdd/speed control [4] [5] [6]. Other approaches that
attempt to improve the efficiency of the HVAC system have also
been investigated [7]. In this paper, we deal with the problem
of CPU speed control in datacenters such that the overall power
utilization is minimized while meeting the performance and
temperature constraints. The current approaches for datacenter thermal
management (workload scheduling, CPU speed control, etc.) focus
primarily on maintaining the air temperature surrounding servers
within the manufacturer specified constraint. VLSI components
such as CPUs and memory have a maximum temperature
constraint at the silicon level as well. Constraining the air
temperature certainly helps in ensuring the reliability of disks, AC-DC
adapters, etc., but not of VLSI components such as CPUs. As we
will show later, several CPUs might still violate the silicon level
thermal constraint even though the air temperature is not very
high. The primary objective of this work is to develop a unified
approach for datacenter power optimization which accounts for
both the silicon level temperature of the VLSI components such
as CPUs and the air temperature that directly impacts the reli-
ability of other devices such as disks. Thermal management in
multi-core CPUs is an active topic of research where several mod-
els and optimization schemes have been developed for estimating
and managing the chip level thermal profile. In this paper we at-
tempt to develop a unified modeling and optimization approach for
managing the overall datacenter power dissipation (by controlling
the CPU speeds) while considering silicon level temperature con-
straint of CPUs, the traditional air temperature constraints and
also the performance constraints. Our approach would result in a
more reliable and lower power operation compared to traditional
approaches for the same output performance.
Our algorithm follows a two step approach: first we approxi-
mate the CPU speed allocation problem as a continuous convex
program which generates the frequency policy assuming it can be
continuously controlled; then we discretize the frequency to dis-
crete legal levels that the CPUs can run on. We account for both
dynamic and leakage power and also model the leakage thermal in-
terdependence. By exploiting the mathematical properties of con-
vex programs, the convex approximation step can generate high
quality solutions (which are then discretized) quickly. The convex
program can have a large number of unknowns since we are
interested in simultaneously controlling both the chip level and air
temperature while minimizing power. We describe ways of simplifying
the problem without much impact on the quality of the solution.
Experimental results show that power consumption estimated by
our approximation is very close to the actual power consumption
of the datacenter. Also, our method guarantees that both the on-chip
and air temperatures stay within their constraints, while simply
ignoring leakage or constraining only the air temperature leads to
overheating in about 40% to 60% of the CPUs. To pull the on-chip
CPU temperature back within acceptable levels in this case,
the datacenter needs to consume about 42% more power than with our
method. Our optimization framework, implemented in MATLAB,
took about 4-5 minutes to execute for a 1000 server datacenter.
The paper is organized as follows. In section 2, we introduce the
overall datacenter thermal/power model. In section 3, we explain
the thermal/power model of multi-core CPUs and our optimiza-
tion problem formulation. We develop a convex approximation
approach to assign continuous frequency values to all CPUs in
section 4. In section 5, we illustrate the frequency discretization
approach, while section 6 discusses some extensions of our problem.
The experimental results are given in section 7.
2 Datacenter Power Management
Energy consumption in datacenters comes from two broad sources
1) power dissipation in CPUs and the support circuitry (DC con-
verters etc.), and 2) the power dissipation in the HVAC system.
Datacenters consist of high performance servers packed together
in hundreds of server racks. Scheduling of high performance tasks
on servers results in excessive power dissipation in CPUs, memory
chips, disks and also in the server support circuitry such as
AC-DC converters, fans, etc. All this dissipated energy results in
a higher operating temperature of the electronic circuitry. A higher
operating temperature at the silicon level results in a higher
probability of error and in reduced lifetime and reliability. Higher
temperatures in datacenters also increase the chance of failure of
other circuits such as fans, adapters, etc. Therefore, manufacturers
of datacenter servers and racks provide a maximum constraint on the air
temperature surrounding the equipment. In order to maintain the
air temperature, the HVAC system supplies cold air through vents
and expends a significant amount of energy in doing so. Recent ap-
proaches try to perform task scheduling and/or CPU speed control
such that a given amount of workload is completed without vio-
lating the air temperature constraint while minimizing the overall
power utilized (electronic circuitry and HVAC system) [6]. Now
we describe some basic equations that tie server power dissipation
to the surrounding air temperature.
Datacenter racks incorporate several chassis, each of which comprises
several server slots for housing servers (see figure 1 [6]). Servers
comprise several multi-core CPUs, RAM, disks, etc., all of which
dissipate power when used. Each chassis also has support circuitry
such as power adapters for maintaining the servers. If all
the servers on a chassis are off, then this circuitry could be shut
off, else it must be turned on. This dissipates around γ units of
power (≈ 820W as reported in [6] [8]). The server has components
such as memory, disks, etc. which dissipate α ≈ 60W to 120W of
power [6] [8]. The CPUs in servers also dissipate power to the
tune of 50-100W depending on their speed. The overall power
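dissipation in chassis n is given by equation 1 [6].
\[ P_n = \gamma X_n + \alpha Y_n + \sum_{s=1}^{M} P_{n,s} \qquad (1) \]
\[ X_n = \begin{cases} 1, & \text{if at least 1 server on chassis } n \text{ is on} \\ 0, & \text{otherwise} \end{cases} \qquad (2) \]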
Here Yn is the number of servers which are turned on in chassis
n. Also, Pn,s is the power dissipated in the multi-core CPU of the
s-th server in the n-th chassis. Our model can trivially be extended
to the case where servers have several multi-core CPUs. For the
sake of simplicity in exposition we assume that the server has one
multi-core CPU. The power consumed in servers and chassis is a
strong function of the task scheduling and the CPU speed states.
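The chassis power model referenced as equation 1 can be sketched as follows. This is a minimal illustration, assuming the chassis power is the fixed overhead γ (paid once if any server is on), plus the per-server overhead α for every server that is on, plus the CPU powers. The function name `chassis_power` and the use of `None` to mark a server that is off are hypothetical choices made for this example.

```python
def chassis_power(cpu_powers, gamma=820.0, alpha=60.0):
    """Power (W) of one chassis under the assumed form of equation 1.

    cpu_powers: one entry per server slot; None marks a server that is
    turned off, a float is the CPU power (W) of a server that is on.
    gamma: chassis support-circuitry overhead (W), paid if any server is on.
    alpha: per-server non-CPU overhead (W) for memory, disks, etc.
    """
    on = [p for p in cpu_powers if p is not None]
    y_n = len(on)                # Y_n: number of servers turned on
    x_n = 1 if y_n > 0 else 0    # X_n: 1 if at least one server is on
    return gamma * x_n + alpha * y_n + sum(on)


# Three servers on (80 W CPUs each) and one off: 820 + 3*60 + 240 = 1240 W
print(chassis_power([80.0, 80.0, 80.0, None]))
# A chassis with all servers off draws no power in this model
print(chassis_power([None, None]))
```

The default values for γ and α are the figures quoted in the text; the 80 W CPU power is only an example within the 50-100W range mentioned above.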
Figure 1: Datacenter model
The basic datacenter organization is such that cool air coming from
HVAC vents blows over the servers and heats up due to the
server power dissipation. Let $T_{in}^n$ be the temperature of the cool
air entering the n-th chassis. The temperature of the air exiting
the chassis, $T_{out}^n$, is given by [9] [10]:
\[ P_n = K_n (T_{out}^n - T_{in}^n) \qquad (3) \]
where $K_n$ is a known constant based on the specific heat of air,
rate of air flow, etc. The input air temperature $T_{in}^n$ is a function of
the cool air temperature $T_{sup}$ supplied by the HVAC vents. The
close proximity of racks also results in intermixing of the hot air
coming out of different chassis (see figure 2). This re-circulation
causes the cool air entering a chassis to intermix with the hot air from
other chassis. The resulting input air temperature into a chassis
n is given by equation 4.












Here, $T_{out}^m$ is the temperature of the hot air coming from the m-th chassis and $a_{mn}$
is the re-circulation factor or cross-interference coefficient between
chassis m and n. These parameters depend on the design of the
datacenter racks, rack placement, vent configuration, etc. and can
be learnt using the existing methodologies presented in [9]. Excessive
power dissipation in chassis and cross circulation of hot air
result in an increase in datacenter air temperature, which needs to
be maintained within manufacturer specified constraints. Therefore
the HVAC system needs to reduce the supply air temperature
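$T_{sup}$, thereby leading to increased energy consumption. The HVAC
power consumption is given by:
\[ P_{AC} = \frac{\sum_{n=1}^{N} P_n}{COP} \qquad (5) \]
where COP is the coefficient of performance and $\sum_{n=1}^{N} P_n$ is
the total power consumed by all the N chassis. COP is given by
$COP = 0.0068\,T_{sup}^2 + 0.0008\,T_{sup} + 0.458$ [7]. The total power used
by the datacenter is given by:
\[ P_{total} = P_{AC} + \sum_{n=1}^{N} P_n = \left(1 + \frac{1}{COP}\right) \sum_{n=1}^{N} P_n \qquad (6) \]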
Existing approaches try to minimize this overall power such that
$T_{out}^n \le T_{cons}$, ∀ chassis n, while maintaining acceptable performance
levels. This can be achieved using a combination of task scheduling,
CPU speed and $T_{sup}$ control [6].
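The relations of this section can be put together in a short numerical sketch. It uses the COP polynomial quoted above and equation 3 solved for the outlet temperature, and it assumes that the HVAC power equals the total chassis power divided by COP (consistent with the factor 1 + 1/COP that appears in section 4.1). The function names and the example values of $P_n$ and $K_n$ are hypothetical.

```python
def cop(t_sup):
    """Coefficient of performance at supply temperature t_sup (deg C)."""
    return 0.0068 * t_sup ** 2 + 0.0008 * t_sup + 0.458


def outlet_temperature(p_n, k_n, t_in):
    """Solve equation 3, P_n = K_n (T_out - T_in), for T_out."""
    return t_in + p_n / k_n


def total_power(chassis_powers, t_sup):
    """Chassis power plus HVAC power, with HVAC power assumed to be
    the total chassis power divided by COP."""
    it_power = sum(chassis_powers)
    return it_power + it_power / cop(t_sup)


# Example with made-up numbers: a 2000 W chassis with K_n = 100 W/degC
print(outlet_temperature(2000.0, 100.0, 20.0))   # 40.0
# Lowering the supply temperature lowers COP and raises the total power
print(total_power([2000.0, 1500.0], 10.0))
print(total_power([2000.0, 1500.0], 5.0))
```

The last two lines illustrate the trade-off discussed in the text: colder supply air makes the temperature constraints easier to meet but costs more HVAC power for the same chassis load.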
3 Datacenter Power Management:
From Micro-scale to Mega-scale
In this paper, we deal with the problem of CPU speed control in
datacenters such that the overall power utilization is minimized
while maintaining the performance and temperature. We do not
consider the problem of workload scheduling. A similar approach
was investigated in [6]. The current approaches for datacenter
thermal management (workload scheduling, CPU speed control,
etc.) focus primarily on maintaining the air temperature surrounding
chassis within the manufacturer specified constraint.
VLSI components such as CPUs and memory have a maximum
temperature constraint at the silicon level as well. Usually such
components should not be heated beyond a certain temperature at the
silicon level in order to maintain reliable operation [11]. Constraining
the air temperature certainly helps in ensuring the reliability of disks, AC-DC
adapters, etc., but not of VLSI components such as CPUs. Figure
3 illustrates the silicon level temperature of different server CPUs
in a datacenter. This data was obtained by using the approach
in [6] to assign CPU speed/Vdd such that overall power utiliza-
tion is minimized while the air temperature is constrained to be
≤ 35℃. The total datacenter performance was also constrained to
be higher than a certain value. The figure highlights the fact that
constraining the air temperature does not necessarily ensure the
CPU silicon temperature to be less than the manufacturer speci-
fied constraint. This would result in loss of reliability and higher
device failure rates. This could certainly be fixed by reducing the
supplied air temperature $T_{sup}$ from the HVAC system. But this
would be accompanied by a decrease in COP, thereby resulting
in an increase in the overall power dissipation.
















Figure 3: Silicon temperature of different datacenter CPUs (Y-axis: °C)
Figure 4: Multi-core CPU thermal model: (a) CPU dynamic
thermal model, (b) impact of air temperature on CPU, (c)
CPU steady state thermal model
The primary objective of this work is to develop a unified approach
for datacenter power optimization which accounts for both
the silicon level temperature of the VLSI components such as
CPUs and the air temperature that directly impacts the reliability
of other devices such as disks. Thermal management in
multi-core CPUs is an active topic of research where several mod-
els and optimization schemes have been developed for estimating
and managing the chip level thermal profile. In this paper we at-
tempt to develop a unified modeling and optimization approach for
managing the overall datacenter power dissipation while consider-
ing silicon level temperature constraint for CPUs, the traditional
air temperature constraints and also the performance constraints.
Our approach would result in a more reliable and lower power op-
eration compared to traditional approaches for the same output
performance. There are several challenges that we need to address
in this regard.
1. Different time-scales: A unified approach needs to address
the fact that on-chip silicon temperature roughly takes mil-
liseconds to change while datacenter air temperature can take
several minutes to change. A combined approach must ad-
dress this lack of synchrony in the two events.
2. Impact of CPU leakage: Leakage power shows a strong ther-
mal dependence. Increase in air temperature around CPUs
will also indirectly increase the silicon temperature resulting
in higher leakage. This would increase the overall power dissi-
pation and impact both the silicon temperature and also the
air temperature.
3. Complex optimization problem: CPUs demonstrate signifi-
cant variation in temperature across the die. Constraining
the maximum silicon temperature and also the air tempera-
ture would force us to formulate and solve a highly complex
optimization problem with millions of unknown variables and
therefore may not be a feasible option.
3.1 CPU Power-Thermal Model
Many researchers have developed thermal models that capture the
on-chip temperature dynamics using a distributed RC circuit (see
figure 4(a)) [11]. The individual current sources represent the
power dissipated in those areas and the voltages at the nodes
represent temperature. This power is a function of the CPU operating
frequency and also the silicon temperature (due to the leakage thermal
interdependence). In this paper we assume that each core in a
particular CPU runs at the same frequency, although our methods
extend trivially to the case where each CPU core has an
independent frequency. The thermal dynamics of the system
shown in figure 4 are as follows:
dTi
dt




















Here $T_{in}$ is the temperature of the air surrounding the CPU.
This is the same as the input air temperature that the chassis
takes in (see figure 4(b)). $T_i$ is the temperature of the i-th
node in the thermal model, and $T_p$ is the package temperature.
NEI(i) refers to all the neighbor nodes of the i-th node. Since the
time scales of temperature change in CPUs and in the surrounding air are
significantly different, we ignore the transient behavior and focus
primarily on the steady state silicon temperature (see figure 4(c)).
In the steady state, the silicon temperature Ti at all nodes i is


















By eliminating the variable Tp, we can represent the silicon tem-





wijTj + wiTin = Pi (9)
Here the parameters $w_{ii}$, $w_i$, $w_{ij}$ can be derived from equation
8. Power dissipated at location i ($P_i$) depends on the average
switching activity at location i which is a function of the operating
frequency f . Leakage power which has strong thermal dependence









Ti + di + βif (10)
Here bi, ci are device dependent constants which control the
leakage thermal interdependence [12], βi is the amount of capaci-
tance that we switch at location i and di is a constant that depends
on other circuit parameters. This model captures the steady state
temperature profile of CPUs at silicon level as a function of the
power dissipation profile and also the ambient air temperature Tin.
As indicated in equation 4, the temperature around the CPU Tin
would be a function of both Tsup and Tout of other chassis.
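The leakage thermal interdependence described above can be illustrated with a single lumped thermal node. The sketch below is not the model of figure 4: it assumes one thermal resistance `r_th` between silicon and the surrounding air, and an exponential leakage term `b * exp(c * T)` added to a constant term `d` and a dynamic term `beta * f`. All parameter values and the function name are hypothetical and chosen only to make the iteration converge.

```python
import math


def steady_state_temperature(f_ghz, t_in, r_th=0.5, beta=20.0, d=5.0,
                             b=1.0, c=0.02, tol=1e-9, max_iter=1000):
    """Fixed-point solve of T = T_in + R * P(T, f) for one lumped node.

    Assumed power model: P(T, f) = b * exp(c * T) + d + beta * f,
    i.e. temperature-dependent leakage plus constant and dynamic terms.
    Returns the steady state temperature (deg C) and the power (W).
    """
    t = t_in
    for _ in range(max_iter):
        power = b * math.exp(c * t) + d + beta * f_ghz
        t_next = t_in + r_th * power
        if abs(t_next - t) < tol:
            return t_next, power
        t = t_next
    raise RuntimeError("fixed-point iteration did not converge")


t_cool, p_cool = steady_state_temperature(3.0, 25.0)
t_warm, p_warm = steady_state_temperature(3.0, 35.0)
print(t_cool, p_cool)
print(t_warm, p_warm)
```

With these example parameters, raising the ambient air by 10℃ raises the silicon temperature by slightly more than 10℃, because the hotter silicon also leaks more power. This is the coupling between air temperature, leakage and silicon temperature that the formulation has to capture.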
3.2 Optimization Formulation
In this paper, we develop formulations that synthesize the optimal
frequency policy for all CPUs in the servers such that the overall
power utilization is minimized and 1) the silicon temperature at
all CPUs is less than a constraint $T_{max}$, 2) the air temperature $T_{out}^n$
at all chassis n is less than $T_{max}^{chassis}$, and 3) the total frequency of
all CPUs is greater than a specified constraint. For the sake of
exposition, we will assume that the HVAC supplied temperature
$T_{sup}$ is given to us. The formulation extends easily to the case where
$T_{sup}$ is a controllable parameter. We shall also assume that the
air temperature inside chassis n is $T_{in}^n$. Therefore the ambient
temperature for all server CPUs inside chassis n is $T_{in}^n$.
Objective:








Since Tsup is assumed to be known, COP is a known constant
(see discussion in section 2).
Constraints:
We assume that all CPUs can be in a discrete frequency state
belonging to the set $\{0, f_1, f_2, \ldots, f_K\}$. The problem constraints can













in = Pn,s,i, ∀n, s, i




Tn,s,i + dn,s,i + βn,s,ifn,s, ∀n, s, i




















8. $T_{out}^n \le T_{max}^{chassis}, \ \forall n$
9. $f_{n,s} \in \{f_0, f_1, f_2, \ldots, f_K\}, \ \forall n, s$
(12)
There are a total of M servers per chassis and a total of N chassis
in the datacenter. Here $f_{n,s}$ is the frequency of the s-th server
CPU on the n-th chassis. $T_{n,s,i}$ is the temperature of the i-th
on-chip node (see figure 4(c)) on the s-th server CPU of the n-th
chassis. $P_{n,s,i}$ is the power (leakage and dynamic) of the i-th
on-chip node of the s-th server CPU of the n-th chassis. $T_{in}^n$ and $T_{out}^n$
are the intake and exhaust air temperatures for the n-th chassis.
The first constraint ensures that the total frequency delivered by
the datacenter is at least F. The second constraint
guarantees that the on-chip silicon CPU temperature is ≤ $T_{max}$. As
illustrated in equations 9 and 10, the third and fourth constraints
establish the interdependence between temperature and power of
the i-th on-chip location of the s-th server CPU on the n-th chassis.
Similar to equation 1, the fifth constraint specifies the total power
dissipated in the n-th chassis. In this constraint, $X_n$ indicates
whether the power consumption overhead of chassis n is incurred, which
is the case when at least one server of this chassis is on, while $Y_n$ is
the number of servers that are turned on in chassis n (see section 2).
The sixth constraint is similar to equation 3 and establishes the
relationship between the intake air temperature and the exhaust air
temperature for the n-th chassis. Here we have assumed that the ambient
temperature inside the n-th chassis for all the server CPUs is $T_{in}^n$. The
seventh constraint establishes the relationship between the input
air temperature, the output air temperature of all the chassis and
$T_{sup}$ (see equation 4 for details). The eighth constraint limits the
output air temperature at all chassis to be within $T_{max}^{chassis}$.
This formulation is highly nonlinear and the integer constraints
significantly increase the complexity. Now we present a two step
algorithmic approach for finding the best frequency policy. First
we relax the integer constraint imposed on the CPU frequency to
find a reasonable continuous frequency policy. This is followed by
a discretization step that legalizes the frequency policy.
4 Continuous Convex Approximation
The problem formulation in equations 11 and 12 is highly complex
and cannot be solved optimally in polynomial time. Even if
the frequencies could be controlled continuously in the range
$0 \le f \le f_{max}$, the non-linear leakage-thermal interdependence
leads to a set of nonlinear constraints. Also, variables $X_n$ and $Y_n$
must be integer values even if the frequencies are continuous (see
section 2 for a discussion on the nature of $X_n$ and $Y_n$). Now we
develop a continuous approximation of the original formulation that
addresses these challenges systematically. We begin by assuming
that all the CPU frequencies $f_{n,s}$ are continuous values that lie in
the range $[0, f_{max}]$. Now we make the following transformation.
\[ f_{n,s} = e^{\log(f_{max}+1)\,\eta_{n,s}} - 1, \qquad \text{if } 0 \le f_{n,s} \le f_{max}, \text{ then } 0 \le \eta_{n,s} \le 1 \qquad (13) \]
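The change of variable in equation 13 and its inverse can be checked numerically. The sketch below assumes that frequencies are expressed in GHz (the unit is an assumption, and it matters because of the +1 inside the logarithm); the function names `eta_to_freq` and `freq_to_eta` are hypothetical.

```python
import math


def eta_to_freq(eta, f_max):
    """Equation 13: f = exp(log(f_max + 1) * eta) - 1."""
    return math.exp(math.log(f_max + 1.0) * eta) - 1.0


def freq_to_eta(f, f_max):
    """Inverse of equation 13: eta = log(f + 1) / log(f_max + 1)."""
    return math.log(f + 1.0) / math.log(f_max + 1.0)


f_max = 5.0  # GHz, the highest level used in the experiments
for eta in (0.0, 0.25, 0.5, 1.0):
    f = eta_to_freq(eta, f_max)
    print(eta, f, freq_to_eta(f, f_max))
```

The endpoints map as stated in equation 13: η = 0 gives f = 0 and η = 1 gives f = f_max, with f increasing monotonically in η in between.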
Nonlinear Leakage Power-Thermal Interdependence:
Constraints 3 and 4 in equation 12 indicate the power dissipated
Pn,s,i in the i-th on-chip node of the s-th server CPU of the n-th
chassis is a function of the frequency fn,s and temperature Tn,s,i.











wijTn,s,j − wiT nin = 0 (14)
Theorem 1: The left-hand side of equation 14 is a convex function











∀j∈NEI(i) wijTn,s,j − wiT nin can be shown to
be convex functions. A positive linear combination of convex
functions is a convex function. Hence proved.
We shall exploit this convexity property later.
Accounting for Xn and Yn:
Even if we approximate the frequency to be a continuous variable,
parameters $X_n$ and $Y_n$ must be discrete. As discussed earlier,
$X_n = 1$ if even one server on chassis n is turned on (see equations
1 and 2). Also, $Y_n$ is the total number of servers that are turned on,
regardless of the frequency. Let us define a new parameter $Y_{n,s}$
where $Y_{n,s} = 1$ indicates that the s-th server on the n-th chassis is
turned on (that is, $Y_{n,s} = 1$ if $f_{n,s} > 0$, or equivalently $\eta_{n,s} > 0$) and 0 otherwise.








Now we develop a way of approximating the function Yn,s (which
is a function of fn,s or ηn,s) and thereby develop a way of approx-
imating the discrete nature of variables Xn and Yn.
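Figure 5 indicates the variable $Y_{n,s}$ as a function of $f_{n,s}$. It also
shows its approximation $Y'_{n,s}$:
\[ Y'_{n,s} = \frac{\log(f_{n,s}+1)}{\log(f_{max}+1)} \qquad (16) \]
Figure 5: $Y_{n,s}$ and its approximation $Y'_{n,s}$
\[ Y'_{n,s} = \begin{cases} 1, & \text{when } f_{n,s} = f_{max} \\ 0, & \text{when } f_{n,s} = 0 \end{cases} \qquad (17) \]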
Basically $Y'_{n,s}$ is a concave function that approximates the actual
function as closely as possible. $Y'_{n,s}$ is represented using the
function in equation 16. It can clearly be seen that $Y'_{n,s} = \eta_{n,s}$.
This transformation is very important since $Y'_{n,s}$ becomes a linear
function of $\eta_{n,s}$. Now $Y_n$ and $X_n$ can be approximated as follows.











Theorem 2: Yn is approximated as a linear function of ηn,s and
Xn approximated as a convex function of ηn,s
Proof: Omitted for brevity.
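The approximation of the on/off indicator can be sketched numerically. Inverting equation 13 gives η = log(f + 1) / log(f_max + 1), so the statement that the approximation equals η means it takes this form as a function of f. The sketch assumes frequencies in GHz and, for the chassis count, assumes that $Y_n$ is approximated by summing the per-server values (the text states only that the approximation of $Y_n$ is linear in η). Function names are hypothetical.

```python
import math


def y_exact(f):
    """Exact indicator: 1 if the server is on (f > 0), else 0."""
    return 1 if f > 0 else 0


def y_approx(f, f_max):
    """Concave approximation log(f + 1) / log(f_max + 1), which equals eta."""
    return math.log(f + 1.0) / math.log(f_max + 1.0)


def servers_on_approx(freqs, f_max):
    """Assumed approximation of Y_n: sum of the per-server approximations."""
    return sum(y_approx(f, f_max) for f in freqs)


f_max = 5.0
for f in (0.0, 1.0, 2.5, 5.0):
    print(f, y_exact(f), y_approx(f, f_max))
print(servers_on_approx([0.0, 5.0, 5.0], f_max))
```

The approximation matches the exact indicator at f = 0 and f = f_max, lies above the straight line between those points (it is concave in f), and underestimates the indicator for intermediate frequencies.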
4.1 Overall Continuous Convex Formulation
The overall formulation that approximates the original problem is
as follows. Firstly, we make the observation that the total power
dissipated $\left(1 + \frac{1}{COP}\right)\sum_{n=1}^{N} P_n$








since Pn = Kn(T
n










out − T nin) (19)
The new set of constraints obtained by transforming fn,s to








$\left(e^{\log(f_{max}+1)\,\eta_{n,s}} - 1\right) \ge F$






































6. $T_{out}^n \le T_{max}^{chassis}, \ \forall n$
7. $0 \le \eta_{n,s} \le 1, \ \forall n, s$
(20)
Note that constraint 3 above is a combination of constraint 3
and 4 in equation 12. Also constraint 4 above is the same as the
combined set of constraints 4,5,6 in equation 12. Now we make the
following transformations. Consider constraint 1. Clearly it is not














log(fmax+1)ηn,s is a monotonically increasing





log(fmax+1)ηn,s) is a monotoni-
cally decreasing function in ηn,s. Hence the constraint in equation
21, which is equivalent to the first constraint in equation 20, be-
comes quasiconvex [13]. Quasiconvex constraints can be treated
just as convex constraints for all practical purposes [13].
Now consider constraints 3 and 4 in equation 20. Instead of



























in)−Kn(T nout − T nin) ≤ 0, ∀n
(22)
Using theorems 1 and 2, we can easily see that these represent a
set of convex constraints.
Hence we can represent the approximate optimization problem
highlighted in equations 19 and 20 as a convex optimization problem.
This is because the modifications in constraints 1, 3 and
4 result in convex constraints. All the other constraints are linear
and the objective is linear as well. Hence it can be solved
optimally in polynomial time. One might argue that the optimal
solution of such a formulation might be problematic due to the
inequalities in equation 22. These equations establish the
interdependence between CPU silicon temperature, frequency and the
input air temperature. Therefore they must really be represented
as equalities to 0 rather than inequalities. The following theorem
addresses this problem.
Theorem 3: In the optimal solution of the convex formulation
described above, the inequalities of equation 22 become equalities
to 0.
Proof: Omitted for brevity.
This completes our continuous formulation that assigns the fre-
quencies such that performance and thermal constraints are satis-
fied and overall power is minimized.
4.2 Computational Complexity
Even though convex optimization problems can be optimally
solved in polynomial time, the scale of the problem in our case is
very large. A typical datacenter can have hundreds of racks comprising
thousands of servers. Accounting for the CPU silicon level
temperature constraint significantly increases the number of unknown
variables and could therefore make solving the convex optimization
formulation practically infeasible. In this context there are several
simplifications we can make to reduce the size of the optimization
problem. The thermal inter-coupling between racks would only exist,
in general, between neighboring racks. Hence the constraints
that represent the interdependence between $T_{in}$ and $T_{out}$ of racks
could be made more sparse, thereby simplifying the
optimization process.
Also, the quality of the solution and the runtime are functions of how
complex the on-chip thermal model is. In many cases we might
be interested in simply constraining the package temperature of
CPUs rather than the temperature at all internal regions of
interest on silicon. This would simplify the CPU thermal model to
a simple RC circuit. This would add only a few extra unknown
variables over the formulation in [6], which could be handled easily
by the convex optimization tool. In many cases we must constrain
the silicon temperature at different points of interest on chip as
well. In such scenarios we could have a simpler RC thermal model
rather than the complex model in [11]. For example, consider the
thermal model in figure 4(c). Here each node represents an on-chip
CPU core in a multi-core chip. A homogeneous multi-core design
would imply the individual RC parameters for each core to be the
same. Hence the on-chip temperature of each node i (on-chip core)
could be assumed to be the same as well. Therefore for each CPU
we have only one temperature variable. This would significantly
reduce the overall complexity. Note that similar RC thermal mod-
els for multi-cores were presented by [14]. Assuming homogeneous
CPUs in all servers further reduces the overall problem complexity.
These techniques help in solving the complex optimization prob-
lem that combines the chip level and datacenter level abstraction
in a unified framework quickly and efficiently. Although such
approximations would result in a reduction in accuracy, the level of
granularity in controlling the on-chip temperature does not need
to be very high since we are considering the problem at the level
of datacenters. We implemented many of these techniques to
improve the runtime, but in this paper we do not investigate the
full scope of applying these techniques for runtime improvement.
5 Frequency Discretization
In general, most CPUs are constrained to operate on a pre-decided
set of discrete frequencies. Hence, the frequency should be selected
from some pre-defined set of discrete levels, and we need
to discretize the continuous frequency into these levels. The
discretization rounds each frequency up to the lowest
discrete level that is no less than the original continuous value.
This preserves the performance, since the total frequency
delivered by the datacenter remains at least as large as
the system requirement. However, this discretization may result
in violation of the maximum on-chip silicon CPU temperature or
the chassis output air temperature constraint. If this occurs, we
reduce $T_{sup}$ to pull the CPU and air temperature back within
the constraints at the expense of increased HVAC power consumption.
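The rounding step described in this section can be sketched as follows. This is a minimal illustration using the discrete frequency set from the experiments (in GHz); the function name `discretize_up` is hypothetical, and the follow-up adjustment of the supply temperature is not modeled here.

```python
import bisect


def discretize_up(freqs, levels):
    """Round each continuous frequency up to the lowest legal level >= it.

    levels must be sorted in increasing order and include 0 (server off).
    """
    result = []
    for f in freqs:
        idx = bisect.bisect_left(levels, f)   # index of first level >= f
        if idx == len(levels):
            raise ValueError(f"frequency {f} exceeds the highest legal level")
        result.append(levels[idx])
    return result


levels_ghz = [0.0, 2.0, 3.0, 4.0, 5.0]
continuous = [0.0, 0.7, 2.0, 3.4, 5.0]
discrete = discretize_up(continuous, levels_ghz)
print(discrete)  # [0.0, 2.0, 2.0, 4.0, 5.0]
```

Because every frequency is rounded up, the sum of the discrete frequencies is never smaller than the sum of the continuous ones, so the total frequency constraint remains satisfied. The increase in frequency is also why the temperature constraints have to be rechecked afterwards.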
6 Extensions
Several extensions of the basic formulation presented above are
possible. Firstly, combining the task scheduling and CPU speed
control techniques would improve the quality of the solution.
Development of such a combined optimization technique is out of the scope
of this work. Also, our optimization problem can be easily
extended to other specifications. The formulation in [6] performs
CPU speed assignment such that the speeds follow a specific service level
agreement. Specifically, the following constraints are imposed on









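\[ g_{n,s}^{x} = \begin{cases} 1, & \text{if the } s\text{-th server on the } n\text{-th chassis runs at frequency } f \ge f_x \\ 0, & \text{otherwise} \end{cases} \qquad (24) \]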
Basically, the frequencies need to be assigned such that at least
$Q_x$ CPUs have a frequency greater than or equal to a
frequency level $f_x$. For example, one might want 60% of the CPUs to
run at $f_{max}$ and 30% at $f_{max}/2$. This integer performance
constraint $g_{n,s}^{x}$ can be approximated by a continuous convex constraint
using a method similar to that used for approximating $X_n$ and $Y_n$ (as described
in section 4). We do not describe further details of such techniques
for the sake of brevity.
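The service level requirement described above (at least $Q_x$ CPUs at or above each frequency level $f_x$) can be checked for a given assignment with a short sketch. The function name and the list-of-pairs representation of the requirements are hypothetical. Note that under the "greater than or equal" reading, CPUs running at $f_{max}$ also count toward the lower threshold, so the example asks for 9 of 10 CPUs at or above $f_{max}/2$.

```python
def meets_service_levels(freqs, requirements):
    """Return True if, for each (f_x, q_x), at least q_x CPUs run at f >= f_x.

    freqs: assigned CPU frequencies, one per server.
    requirements: list of (f_x, q_x) pairs.
    """
    for f_x, q_x in requirements:
        # Count the servers whose indicator g would be 1 for this level
        count = sum(1 for f in freqs if f >= f_x)
        if count < q_x:
            return False
    return True


f_max = 5.0
freqs = [f_max] * 6 + [f_max / 2] * 3 + [0.0]       # 10 CPUs
requirements = [(f_max, 6), (f_max / 2, 9)]
print(meets_service_levels(freqs, requirements))    # True
```

This only verifies a candidate assignment; the convex approximation of the indicator itself is not shown, since the text does not give its details.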
7 Experimental Results
In our experiments, we use a small-scale datacenter similar to the
ones in [6] and [8]. The datacenter has two rows, each consisting
of 5 racks. Each rack consists of 5 chassis and each chassis
contains 20 servers. Each server contains a dual-core processor, so
the datacenter has a total of 1000 dual-core server CPUs. We assume
the two cores of each CPU are homogeneous, so they have the same
temperature profile. The chassis power overhead γ is 820 W and the
non-core power overhead of each server α is 60 W [6]. The discrete
frequency set is {0, 2 GHz, 3 GHz, 4 GHz, 5 GHz}.
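The sizes quoted above follow from simple multiplication, as the short sketch below shows. The constant names are hypothetical, and the overhead figure is only an upper bound that assumes every chassis and every server is powered on, which the optimization does not require.

```python
# Datacenter layout from the experimental setup.
ROWS = 2
RACKS_PER_ROW = 5
CHASSIS_PER_RACK = 5
SERVERS_PER_CHASSIS = 20
CORES_PER_CPU = 2

GAMMA_W = 820.0  # chassis power overhead in watts
ALPHA_W = 60.0   # non-core power overhead per server in watts

num_chassis = ROWS * RACKS_PER_ROW * CHASSIS_PER_RACK
num_servers = num_chassis * SERVERS_PER_CHASSIS
num_cores = num_servers * CORES_PER_CPU

# Assumption: all chassis and servers are on (upper bound on the overhead).
max_overhead_w = num_chassis * GAMMA_W + num_servers * ALPHA_W

print(num_chassis, num_servers, num_cores)  # 50 1000 2000
print(max_overhead_w)                       # 101000.0
```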
7.1 Comparison of our method with a purely
air-temperature-constrained method
We first compare our method with the method that constrains only
the output air temperature [6] (we call it the ‘off-chip’ method).
In our experiment, the supply temperature is Tsup = 10℃ and the
chassis output air temperature constraint is
T^{chassis}_{max} = 35℃. The total frequency constraint is
F = 5 × 10^12 Hz (note that we have 1000 server CPUs and 2000 CPU
cores, so this is equivalent to an average frequency constraint of
2.5 GHz per core). Figure 6(a) shows the on-chip temperature
distribution over all the server CPU cores in the datacenter
(Treal) achieved by the off-chip method. Since we assume each CPU
is a homogeneous dual-core processor, the temperatures of its two
cores are the same, and we therefore plot the temperature of only
one core for each dual-core processor.
Assuming a maximum on-chip silicon temperature constraint of
Tmax = 80℃, the on-chip silicon temperature of more than 40% of
the CPUs violates the constraint. Some cores even heat up to about
130℃.
Figure 6: (a) On-chip temperature profile achieved by constraining
only the output air temperature, (b) Temperature profile achieved
by our method
However, as figure 6(b) shows, with our method the on-chip silicon
temperature Tn,s,i stays within the temperature constraint, since
we impose constraints on both the on-chip silicon temperature and
the air temperature.
In the off-chip method, one can also try to pull the on-chip
temperature below the maximum temperature constraint by reducing
the HVAC supply temperature Tsup. However, this increases the HVAC
power consumption and therefore the total power consumption of the
datacenter. Table 1 compares the power consumption achieved by our
method and by the off-chip method when Tsup = 10℃ is used in the
optimization. Pold is the total power consumption without fixing
the on-chip temperature, while Pnew is the total power consumption
after fixing the on-chip temperature by reducing Tsup. Since our
method does not lead to on-chip temperature constraint violations,
we do not need to reduce Tsup. For the off-chip method, however,
Tsup must be reduced to pull the on-chip temperature down, which
increases the power consumption by about 42%.
Table 1: Total power consumption of our method and the off-chip
method
Method      Pold (W)         Pnew (W)
Our         5.5964 × 10^5    5.5964 × 10^5
Off-chip    5.6435 × 10^5    8.0434 × 10^5
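The percentages implied by Table 1 can be reproduced with the short sketch below. The dictionary and function names are hypothetical; the only inputs are the four table entries.

```python
# Total power in watts, copied from Table 1.
P_OLD = {"our": 5.5964e5, "off-chip": 5.6435e5}
P_NEW = {"our": 5.5964e5, "off-chip": 8.0434e5}


def percent_increase(old, new):
    """Relative increase of new over old, in percent."""
    return 100.0 * (new - old) / old


for method in P_OLD:
    delta = percent_increase(P_OLD[method], P_NEW[method])
    print(f"{method}: {delta:.1f}% increase after fixing on-chip temperature")

# Extra power the corrected off-chip solution needs relative to our method.
gap = percent_increase(P_NEW["our"], P_NEW["off-chip"])
print(f"off-chip vs. our method: {gap:.1f}% more power")
```

The off-chip increase evaluates to roughly 42.5%, and the corrected off-chip solution draws roughly 43.7% more power than our method.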
7.2 Comparison of our method with a method
ignoring leakage
We next examine the temperature profile achieved by a method that
ignores leakage power consumption (called no-leak). We use the same
Tsup, Tmax and T^{chassis}_{max} settings as in section 7.1. With
the total frequency constraint F = 5 × 10^12 Hz, we calculate the
optimal frequency assignment using the method ignoring leakage
power, and then estimate the actual on-chip silicon temperature
profile considering leakage. The resulting temperature profile is
shown in figure 7. The frequency assignment produced by the no-leak
method violates the maximum temperature constraint in about 60% of
the CPUs, and the on-chip temperature of some cores reaches as high
as 105℃. Compared with the off-chip method, more CPU cores violate
the on-chip temperature constraint, but the degree of violation is
smaller.
Figure 7: Real temperature profile achieved by the no-leak method
7.3 Comparison of our approximated power
with real power consumption
In our model, we approximate the integer constraints Xn and Yn with
continuous functions, as shown in figure 5 and equations 16 and 18.
We test the quality of this approximation by comparing the power
consumption estimated by our method with the real power
consumption, which is computed using the actual formulas for Xn and
Yn. We calculate the frequency assignment that minimizes the total
datacenter power consumption using our method, and then compare the
approximated power consumption with the real power consumption of
the datacenter under this frequency assignment. Figure 8(a) shows
both quantities for different total frequency constraints. In this
figure, Papprox is the datacenter power consumption approximated by
our method and Preal is the actual power consumption under the same
frequency assignment. The approximated power consumption is very
close to the real power consumption, underestimating it by only 4%
on average. Moreover, the approximation improves as the system
performance constraint (total frequency constraint) increases: when
the total frequency is about 5.6 × 10^12 Hz (that is, an average
frequency of 2.8 GHz per core), our approximation is only 0.5%
lower.
Figure 8: (a) Comparison of approximated and real power
consumption, (b) Comparison of power consumption after frequency
discretization and power consumption under continuous frequency
assignment
In the discretization step, we round each frequency up to the
nearest discrete frequency value that is not below the continuous
value in order to guarantee performance. Figure 8(b) compares the
power consumption after frequency discretization with that of the
continuous frequency assignment for different total frequency
constraints. On average, the power consumption is about 12% higher
after discretization.
Our optimization framework, implemented in MATLAB, took about 4.8
minutes to execute for a datacenter with 1000 dual-core servers and
21.6 minutes for a datacenter with 2000 dual-core servers.
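The two reported runtimes give a rough indication of how the framework scales. The sketch below assumes a power-law relation t ~ n^k, which is our assumption and not a claim made by the measurements; with only two data points, the resulting exponent is an indication and not an established trend. The function name is hypothetical.

```python
import math


def scaling_exponent(n_small, t_small, n_large, t_large):
    """Exponent k in t ~ n**k implied by two (size, runtime) measurements."""
    return math.log(t_large / t_small) / math.log(n_large / n_small)


# Reported runtimes in minutes for 1000 and 2000 dual-core servers.
k = scaling_exponent(1000, 4.8, 2000, 21.6)
print(f"runtime ratio: {21.6 / 4.8:.2f}, implied exponent: {k:.2f}")
```

Doubling the number of servers increases the runtime by a factor of 4.5, which corresponds to an exponent of about 2.17 under the power-law assumption.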
8 Conclusion
In this paper, we develop a unified approach for datacenter power
optimization that accounts for the silicon-level temperature of the
VLSI components, the air temperature, the performance delivered,
and the interdependence between leakage and temperature. We solve
the problem with a two-step approach: 1) optimally solving a convex
approximation that assigns continuous frequency values to all CPUs,
and 2) discretizing the assigned frequencies. By exploiting the
mathematical properties of convex programs, the convex
approximation step generates high-quality solutions quickly.
Acknowledgement
This research work was partly supported by NSF grant CCF
0937865.
References
[1] “Youtube serves up 100 million videos a day online,” USA
Today, 2006.
[2] “Facebook statistics,” http://www.facebook.com.
[3] “Report to congress on server and data center energy
efficiency,” U.S. Environmental Protection Agency, 2007.
[4] V. Cardellini, M. Colajanni, and P. S. Yu, “Dynamic load
balancing on web-server systems,” IEEE Internet Computing
Magazine, vol. 3, pp. 28–39, 1999.
[5] D. M. Dias, W. Kish, R. Mukherjee, and R. Tewari, “A scalable
and highly available web server,” in Proceedings of IEEE Computer
Society International Conference, pp. 85–92, 1996.
[6] E. Pakbaznia and M. Pedram, “Minimizing data center cooling
and server power costs,” in Proceedings of the 2009 International
Symposium on Low Power Electronics and Design (ISLPED’09),
pp. 145–150, 2009.
[7] J. Moore, J. S. Chase, P. Ranganathan, and R. Sharma, “Making
scheduling ‘cool’: Temperature-aware resource assignment in data
centers,” in Usenix Annual Technical Conference, 2005.
[8] Q. Tang, S. K. S. Gupta, and G. Varsamopoulos,
“Energy-efficient, thermal-aware task scheduling for homogeneous,
high performance computing data centers: A cyber-physical
approach,” in IEEE Transactions on Parallel and Distributed
Systems, pp. 1458–1472, 2008.
[9] Q. Tang, T. Mukherjee, S. K. S. Gupta, and P. Cayton,
“Sensor-based fast thermal evaluation model for energy efficient
high-performance datacenters,” in International Conference on
Intelligent Sensing and Information (ICISIP2006), pp. 203–208,
2006.
[10] Q. Tang, S. K. S. Gupta, D. Stanzione, and P. Cayton,
“Thermal-aware task scheduling to minimize energy usage of blade
server based datacenters,” in Proceedings of the 2nd IEEE
International Symposium on Dependable, Autonomic and Secure
Computing, pp. 195–202, 2006.
[11] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang,
S. Velusamy, and D. Tarjan, “Temperature-aware
microarchitecture: Modeling and implementation,” ACM Trans. on
Architecture and Code Optimization, vol. 1, pp. 94–125, 3.
[12] L. He, W. Liao, and M. R. Stan, “System level leakage
reduction considering the interdependence of temperature and
leakage,” in Design Automation Conference (DAC’04).
[13] S. Boyd and L. Vandenberghe, “Convex optimization,” Cambridge
University Press, New York, NY, 2004.
[14] R. Rao, S. Vrudhula, and N. Chang, “An optimal analytical
solution for processor speed control with thermal constraints,”
in Proc. of Intl. Symp. on Low Power Electronics and Design
(ISLPED’06).
