Power Aware Real Time Scheduling in Constrained Devices by Kemppainen, Teemu
Power Aware Real Time Scheduling in Constrained Devices
Teemu Kemppainen
Helsinki August 30, 2007
Master's Thesis
UNIVERSITY OF HELSINKI
Department of Computer Science
Faculty: Faculty of Science
Department: Department of Computer Science
Author: Teemu Kemppainen
Title: Power Aware Real Time Scheduling in Constrained Devices
Subject: Computer Science
Level: Master's Thesis
Month and year: August 30, 2007
Number of pages: 59 + 1 pages
Keywords: scheduling, real-time systems, power-awareness
Where deposited: Kumpula Science Library, serial number C-
Abstract:
Real-time scheduling algorithms, such as Rate Monotonic and Earliest Deadline First,
guarantee that calculations are performed within a pre-defined time. As many
real-time systems operate on limited battery power, these algorithms have been enhanced
with power-aware properties. In this thesis, 13 power-aware real-time scheduling
algorithms for processor, device and system-level use are explored.
ACM Computing Classification System (CCS):
D.4.1 [Operating systems],
J.7 [Computers in other systems]
Contents
1 Introduction
2 Real Time Scheduling
2.1 System Model
2.2 Hard Real Time Scheduling
2.2.1 Rate Monotonic
2.2.2 Earliest Deadline First
2.3 Soft Real Time Scheduling
3 Power Awareness in Constrained Devices
3.1 Dynamic Voltage Scaling
3.2 Advanced Configuration and Power Interface
4 Power Aware Processor Scheduling
4.1 Hard Real Time Scheduling
4.1.1 The Low Power Fixed Priority Scheduling Algorithm
4.1.2 Low-Energy EDF and Extended Low-Energy EDF
4.1.3 Feedback DVS-EDF
4.1.4 Cycle-Conserving DVS for EDF Schedulers
4.1.5 Comparing the Presented Algorithms
4.2 Soft Real Time Scheduling
4.2.1 The EScheduler Algorithm
4.2.2 The ReUA Algorithm
4.2.3 Comparing the Presented Algorithms
5 Power Aware Device Scheduling
5.1 Hard Real Time Scheduling
5.1.1 Low Energy Device Scheduler
5.1.2 The Multi-State Constrained Low-Energy Scheduler
5.1.3 The Energy-Efficient Device Scheduling Algorithm
5.1.4 The Energy-Optimal Device Scheduler
5.1.5 Comparing the Presented Algorithms
6 System-Level Power Aware Scheduling
6.1 duSYS: A System-Level EDF Algorithm
6.2 The Critical Speed DVS Algorithm
6.3 Comparing the Presented Algorithms
7 Summary
References
Appendices
1 The entire Feedback DVS-EDF algorithm
1 Introduction
In a conventional computer system, the correctness of calculations is defined by
their logical correctness. A real-time system has been defined as a system where
the calculations need not only be correct, but also be finished within a pre-defined
time [RaS94]. Real-time systems are today used in a wide variety of computing
devices: in medical systems, in the anti-lock braking systems (ABS) of vehicles, in
Global Positioning System (GPS) devices, in multimedia devices like DVD and MP3
players, and in mobile phones, among others. Many of these systems are constrained
devices functioning on limited battery power. Here, the usability of the device is
greatly dependent upon the operational lifetime of the battery.
A scheduler is an operating system component responsible for sharing a resource
among multiple users. A processor scheduler decides which process may use the
processor at a particular moment. Common scheduling algorithms include Round-Robin,
where processes are ordered in a circular queue, and CPU time is given to each
process in turn [Sta05, page 791]. Another approach is First In First Out (FIFO)
scheduling, where the process that has been in the queue for the longest time
is given CPU time first.
In real-time systems, these commonplace scheduling methods cannot be used since
they do not guarantee meeting the time boundaries of real-time processes. Therefore,
real-time systems need special schedulers that take deadlines into account.
Research in real-time scheduling began in earnest in the early 1970s. In 1973, Liu
and Layland published their two famous real-time scheduling algorithms, Earliest
Deadline First (EDF) and Rate-Monotonic (RM) [LiL73]. EDF is based upon dynamic
priorities, while in RM processes have fixed priorities. Basically all of today's
real-time implementations are based upon one of these two algorithms.
In a hard real-time system, a task must always finish before its deadline. The most
demanding hard real-time systems are those where human lives are at
stake. Examples include medical systems like pacemakers, military systems, and
nuclear power plants. Here, bulletproof evidence that the system will meet
its deadlines is required. Missing even a single deadline is unacceptable. In
contrast, in a soft real-time system (Section 2.3) the deadline is of a somewhat more
relative nature. In a multimedia system, for instance in a video decoder, it might
be sufficient to guarantee that 95 percent of frames are decoded in time. Occasional
out-of-sync frames are acceptable in an application of this kind.
Many processors and devices designed for portable use provide several different
operational states. Besides its high power and speed state, the processor can be run
at a lower speed, which provides lower throughput, but consumes less energy. At
times when the processor is not needed at all, it can be put into a sleep mode which
consumes virtually no energy. Techniques for adjusting device throughput
and power consumption include Dynamic Voltage Scaling (DVS) [VeF05,
PLS01, VBH03] and the Advanced Configuration & Power Interface (ACPI) [HIM06].
These techniques allow the operating system to change the operating frequency
and voltage of the processor and other devices at run-time in order to save energy.
The introduction of run-time voltage scaling has opened new possibilities even for
real-time systems in constrained devices.
The most straightforward energy saving solution is to set the processor and/or disk
into sleep mode after a period of user inactivity [BBC98]. Information from previous
process invocations can be used to estimate the length of the sleep interval [HwA00].
Even more complex statistical methods based on use history can be used to estimate
when the device will be needed next [IGS02]. As such, none of these methods
are usable in real-time systems with hard deadlines [SwC05]. The implementation
of energy awareness in real-time systems is a more complex task. Waking up
the processor, disk or other device from sleep mode always introduces a certain time
penalty. The device is not instantly usable but requires some time to restart. In
real-time systems this wake-up delay risks missing deadlines and, therefore, needs
special attention from the scheduler.
The solution is to implement energy-conserving properties into EDF and RM based
real-time schedulers. Reports indicate that such techniques have provided energy
savings of up to 50% [SwC03] while still guaranteeing that real-time process
deadlines are met.
This thesis describes 13 power-aware scheduling algorithms usable in constrained
devices with limited battery resources. The theoretical background and terminology
of real-time scheduling with RM and EDF are described in Section 2, and power-aware
properties in constrained devices are discussed in Section 3. Recent energy
conserving processor scheduling algorithms are presented in Section 4, device
scheduling algorithms in Section 5, and system-level algorithms in Section 6.
The thesis is summarized in Section 7.
2 Real Time Scheduling
Real-time scheduling algorithms are responsible for sharing resources among users
while guaranteeing timely execution of real-time processes. In order to present
real-time scheduling algorithms, we will first introduce a system model used throughout
the rest of the thesis.
2.1 System Model
A task is a process, a piece of independently running software code. We use the
notation Ti to indicate a task, where i is the task's distinctive number. One instance
of a task is called a job. In real-time systems, tasks typically have a period, the time
interval at which individual jobs of the task are released for execution. We
denote the period by Pi. A job of task Ti in period k is denoted Ji,k. By release
time we mean the time at which Ji,k becomes ready for execution.
By deadline we mean the time by which a job needs to be completed. We denote this
time by Di. A deadline relative to the current time is denoted di. For instance, if Ji
has Di = 20 and the current time is 15, then di = 5.
By the execution time, denoted Ei, we mean the worst case execution time of Ji:
the amount of processor time needed by the job to complete. In reality, execution
times of individual jobs Ji,k vary greatly. Consider, for example, a real-time system
controlling a robotic arm that is removing faulty products from an assembly line.
When there are no faulty products, jobs will complete extremely fast as the arm
does not need to move at all. But for scheduling purposes, we must assume the worst
case execution time. In the case of the robotic arm, this would mean the (hopefully
rare) event when all products within the arm's range are faulty, and need to be
removed from the line.
Let ei be the actual execution time of one job Ji, where ei ≤ Ei. By slack time we
mean the time Ei − ei, i.e. time allocated for process execution that is not actually
needed because the job finishes earlier than budgeted. This time can be utilized for
energy savings. We will return to this later.
The utilization of a task is calculated as Ei/Pi. The utilization of the entire
task set is calculated using Equation 1:
\[ U = \sum_{i=1}^{n} \frac{E_i}{P_i} \tag{1} \]
where n is the number of tasks. In later parts of the thesis we may describe a task
as a pair (Pi, Ei). For instance (6, 3) means a task with period 6 and execution time 3,
and (6, 1) indicates a task with period 6 and execution time 1. According to
Equation 1, the utilization of a task set consisting of these two tasks would be
3/6 + 1/6 = 4/6 = 2/3. If no deadline is explicitly mentioned, then di = Pi at the
release of each job, meaning that the deadline of the task equals its period.
Intuitively this means that a job of the task must be
completed before the release of the next job.
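As a quick check of Equation 1, the following minimal Python sketch computes the utilization of a task set given as (Pi, Ei) pairs. The function name utilization and the use of exact fractions are illustrative choices made here, not part of the thesis.

```python
from fractions import Fraction


def utilization(tasks):
    """Total utilization U = sum(E_i / P_i) for tasks given as (P_i, E_i) pairs."""
    return sum(Fraction(e, p) for p, e in tasks)


# The two-task example from the text: (6, 3) and (6, 1).
u = utilization([(6, 3), (6, 1)])
print(u, float(u))  # 2/3 and its decimal approximation
```

Exact fractions avoid rounding when the result is later compared against a schedulability bound.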
2.2 Hard Real Time Scheduling
The fundamental real-time scheduling algorithms are Rate Monotonic (RM) and
Earliest Deadline First (EDF) [LiL73]. Neither of these algorithms provides
power-awareness, but all of the energy conscious scheduling solutions presented later in this
thesis are enhancements of either RM or EDF. Therefore, familiarity with RM and
EDF is essential for understanding this thesis. Both RM and EDF are presented
in this section.
2.2.1 Rate Monotonic
input: list of tasks
1 repeat on task set:
2   perform RM schedulability test;
3   if fail alarm OS;
4   else
5     sort jobs in ascending order according to period;
6     while (jobs left):
7       schedule first job from list;
8       remove finished job from list;
Figure 1: Pseudo code of the Rate Monotonic algorithm.
In the Rate Monotonic scheduling algorithm, the task with the shortest period Pi
gets the highest priority, and is scheduled first. Because the periods of tasks are
constant, RM is a fixed-priority scheduler. Liu and Layland [LiL73] have shown that
a sufficient schedulability condition for RM is that of Equation 2:
\[ U \le n(2^{1/n} - 1) \tag{2} \]
where n is the number of tasks. For instance, when n = 2, i.e. with two tasks, RM
is able to schedule the tasks if their U ≤ 0.83. A task set consisting of 4 tasks is
schedulable with RM if U ≤ 0.76. For large n:
\[ \lim_{n \to \infty} n(2^{1/n} - 1) = \ln 2 \tag{3} \]
The idea in Equation 3 is that with large task sets, the RM schedulability bound
approaches the value ln 2, i.e. approximately 0.69. The theoretical maximal
utilization, which the Earliest Deadline First algorithm accomplishes, is U = 1.
In other words, RM as such cannot be considered very efficient.
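The bound of Equation 2 is easy to tabulate. The sketch below evaluates it for a few task counts and compares it with the limit of Equation 3; the helper name rm_bound is an assumption made for illustration.

```python
import math


def rm_bound(n):
    """Liu and Layland utilization bound n * (2^(1/n) - 1) for n tasks."""
    return n * (2 ** (1 / n) - 1)


for n in (1, 2, 4, 10, 100):
    print(n, round(rm_bound(n), 3))

# The bound decreases towards ln 2 as n grows.
print("ln 2 =", round(math.log(2), 3))
```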
Let us consider a sample RM schedule using a task set consisting of two tasks:
τ1=(5,2) and τ2=(7,4). First, RM considers the schedulability of this task set.
According to Equation 1, U of this task set is 2/5 + 4/7 = 34/35, i.e. approximately 0.97.
According to Equation 2, the utilization that RM is guaranteed to be
able to schedule when n = 2 is U ≤ 0.83. Therefore it seems that this task set is
not schedulable with RM. The scheduler might alert the operating system of this
according to line 3 in the pseudo code in Figure 1. Let us, however, more closely
consider the functionality of RM by simulating lines 5-8 of the RM algorithm on
the aforementioned task set. The results are shown in Figure 2.
Figure 2: Tasks τ1=(5,2) and τ2=(7,4) scheduled using Rate Monotonic
[But05].
The period of τ1 is 5 and the period of τ2 is 7. In RM the task with the shortest
period gets the highest priority. Therefore, τ1 is scheduled first. According to the
pseudo code in Figure 1, this is done by sorting the processes in a list according
to their periods, as seen on line 5. The while condition on line 6 is true so the
algorithm advances to line 7. The first job on the list is τ1, so it is scheduled first.
Every 5 time units, τ1 is scheduled for 2 units of time. This can be seen in Figure 2.
Having scheduled the highest priority task and removed it from the list (line 8), RM
now proceeds to schedule the next task, since the while condition on line 6 is true.
Here, τ2 requires 4 units of CPU time every 7 units. However, in its first period there
are only 3 units of time available in the interval [0,7]. The time interval [2,5] is
allocated to τ2. At time 5 a context switch occurs, and the higher priority process τ1
gets the CPU. This is indicated by an up-arrow in Figure 2. Because τ1 has the processor
during [5,7], τ2 does not get a chance to finish its one remaining execution time unit,
and J2,1 misses its deadline at time 7. This simulation hence confirms the failed RM
schedulability condition: this task set is not schedulable using RM.
2.2.2 Earliest Deadline First
input: list of tasks
1 repeat:
2   perform EDF schedulability test;
3   if fail alarm OS;
4   else do while (jobs left AND no new task released):
5     put job with closest deadline first in list;
6     schedule first job;
7     remove finished job from list;
Figure 3: Pseudo code of the Earliest Deadline First algorithm.
In the Earliest Deadline First algorithm the process with the deadline closest to
the current time gets scheduled first. Because the process with the closest deadline
changes as execution progresses, the EDF method leads to dynamic priorities. In
EDF, the schedulability condition is:
\[ U \le 1 \tag{4} \]
This means that EDF accomplishes full resource utilization while guaranteeing
timeliness. The pseudo code of the EDF algorithm can be seen in Figure 3.
Let us consider the tasks τ1=(5,2) and τ2=(7,4) scheduled using Earliest Deadline
First according to the pseudo code in Figure 3. The resulting schedule is shown in
Figure 4. On line 2, the EDF schedulability test is performed. According to
Equation 1, U = 34/35. Because the EDF schedulability condition (Equation 4)
guarantees schedulability when U ≤ 1, this task set is
schedulable using EDF. The while condition on line 4 is true. EDF orders the jobs
according to their relative deadlines. At time 0, the job with the closest deadline
belongs to τ1, so it gets scheduled first. When it finishes at time 2, τ2 gets
scheduled. Once τ2 is finished at 6, the second job of τ1 has been released, and is
scheduled. After the execution of the third job of τ1, at time 14, τ2 with deadline 21
gets scheduled for one unit of time, but is switched out at time 15: here, the fourth
job of τ1 is released, and since its deadline is 20 < 21, τ1 gets a higher priority
than τ2. Once a job is finished, it is removed from the list of jobs.
Figure 4: Tasks τ1=(5,2) and τ2=(7,4) scheduled using Earliest Deadline
First [But05].
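The EDF walk-through can be checked the same way. The sketch below is a unit-step simulation under stated assumptions: integer task parameters, deadlines equal to periods, and ties between equal deadlines broken in favour of the lower task index (a choice made here, not taken from the thesis). The name simulate_edf is hypothetical.

```python
def simulate_edf(tasks, horizon):
    """Simulate preemptive Earliest Deadline First scheduling in unit steps.

    tasks: list of (period, execution_time); the deadline equals the period.
    Returns (timeline, misses) in the same form as a Gantt chart row:
    timeline[t] is the running task index in [t, t+1), or None when idle.
    """
    n = len(tasks)
    remaining = [0] * n   # work left in the current job of each task
    deadline = [0] * n    # absolute deadline of the current job
    timeline, misses = [], []
    for t in range(horizon):
        for i, (period, exec_time) in enumerate(tasks):
            if t % period == 0:
                if remaining[i] > 0:
                    misses.append((i, t))
                remaining[i] = exec_time
                deadline[i] = t + period
        ready = [i for i in range(n) if remaining[i] > 0]
        # Dynamic priority: the closest absolute deadline wins.
        running = min(ready, key=lambda i: (deadline[i], i)) if ready else None
        if running is not None:
            remaining[running] -= 1
        timeline.append(running)
    return timeline, misses


# One hyperperiod (5 * 7 = 35 time units) of the example task set.
timeline, misses = simulate_edf([(5, 2), (7, 4)], horizon=35)
print(timeline[:17])
print(misses)
```

With U = 34/35, exactly one time unit per hyperperiod stays idle, which the simulation reproduces.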
2.3 Soft Real Time Scheduling
In a soft real-time system the timing constraints are somewhat more relaxed than in
a hard real-time system. A soft real-time application usually provides a probabilistic
guarantee that p% of jobs meet their deadlines. For instance a telephone network
might be considered a soft real-time application. It will be considered usable if 95%
of calls are connected within 10 seconds, and 99.95% of calls within 20 seconds
[Liu00, page 31].
The video viewing experience or the enjoyability of a computer game is not spoiled if
one or two frames per minute miss their deadline. Multimedia is a very common area
for soft real-time systems. Consider for instance the EScheduler [YuN06] algorithm,
presented in Section 4.2.1. It records the actual CPU time demand of the n most recent
jobs of task Ti. Based on this usage history, it uses as Ei (Equation 1) a value at or
below which p% of the considered jobs remain. Hence, it allocates enough CPU time so
that p% of jobs will complete in time (assuming that the CPU demand distribution of
the task is fairly stable). This is a very typical real-time guarantee that suffices for
a soft real-time application. The use of a soft real-time scheduler instead of a hard
one might be motivated if, for instance, the response time of the system improves
when real-time constraints are relaxed.
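The idea of budgeting for p% of the observed jobs can be sketched with a nearest-rank percentile over a demand history. This only illustrates the principle; it is not the EScheduler implementation, and the function name, the sample history and the nearest-rank rule are assumptions made here.

```python
import math


def statistical_budget(demands, p):
    """Smallest observed demand that covers at least p percent of the samples.

    demands: CPU demands of recent jobs (any consistent time unit).
    p: required percentage of jobs that should fit within the budget.
    """
    if not demands or not 0 < p <= 100:
        raise ValueError("need a non-empty history and 0 < p <= 100")
    ordered = sorted(demands)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # nearest-rank percentile
    return ordered[rank - 1]


history = [3, 4, 4, 5, 5, 5, 6, 6, 7, 12]   # hypothetical demands of 10 jobs
print(statistical_budget(history, 90))      # budget covering 9 of the 10 jobs
print(statistical_budget(history, 100))     # worst observed demand
```

The gap between the 90% budget and the worst observed demand is the time a soft real-time scheduler does not have to reserve.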
3 Power Awareness in Constrained Devices
By energy, measured in joules, we mean the total amount of work done during a
period of time, and by power we mean the rate at which the work is done. Power is
measured in watts [VeF05].
Consider a task that takes 5 seconds to finish with a CPU running at 100 MHz.
Lowering the CPU speed to 50 MHz will decrease the power consumption of the
processor, as lower frequencies need less power. However, the total energy needed to
complete the task will not be reduced, as the task will take a longer time to finish,
perhaps even twice the time. In fact, lowering only the speed of the CPU may often
increase the total energy consumed by the entire system, as for instance hard
disks, network adapters and other components need to stay powered up for longer
periods of time. This aspect is considered more closely in Section 6.
In some cases, for instance to cool down a processor, it is desirable to lower the
power consumption without considering the total energy need [VeF05]. This
kind of power reduction is, however, hardly what we wish to accomplish when using
battery powered constrained devices: here, minimizing the total energy need is what
matters.
Calculating and minimizing the system's total energy consumption depends on the
actual system configuration. This question has been researched by, for instance, Zhuo
and Chakrabarti [ZhC05]. In Sections 4 and 5 of this thesis, we focus on minimizing
the power consumption of distinct components. The reader should note that this
chosen view is a simplified one, as in reality systems are composed of multiple
components.
3.1 Dynamic Voltage Scaling
Contemporary microchips are based on CMOS (complementary metal-oxide-semiconductor)
technology. Chips using this technology consume energy both dynamically
and statically [VeF05]. The static power consumption is caused by current
flowing through the transistors even when they are turned off. As this form of
energy consumption cannot be altered during run-time by the scheduler, it is not of
interest in this thesis.
The dynamic power consumption consists of two parts. About one tenth of a chip's
power consumption is caused by instantaneous short-circuiting of transistors as they
switch states [VeF05]. Currently it is unknown how to combat this energy waste, and
so we will disregard this form of dynamic power consumption. Most of the processor's
dynamic power consumption can, however, be adjusted during run-time, and
this is where we will focus our attention. Let P be the dynamic power consumption
of a processor. The following equation indicates how it is formed [PLS01]:
\[ P = C \times f \times V^2 \tag{5} \]
Here, C is the capacitance of the transistors. This is a fixed value determined by the
physical structure of the processor. The value f is the operating frequency of the
processor. It is usually measured in megahertz or gigahertz. Adjusting the operating
frequency of the processor affects power consumption linearly. The operating voltage
of the chip is indicated by V. As seen in Equation 5, adjusting the voltage affects
power consumption quadratically.
From Equation 5 it follows that the processor's power consumption can be regulated
during run-time by adjusting its operating frequency f, voltage V, or both.
Technology for accomplishing this is called Dynamic Voltage Scaling (DVS). The
abbreviations DFS (dynamic frequency scaling) and DVFS (dynamic voltage-frequency
scaling) are also used [VBH03].
Notie, however, that adjusting only f but not V linearly dereases the power on-
sumed by the proessor, but not the total energy needed to omplete the task: a
CPU operating at m MHz that takes s seonds to nish a task will probably take
2s seonds to nish the task at m/2 MHz.
Lowering only V might seem tempting, but a lower V generally annot support
a high f , so usually lowering the supply voltage also requires the lowering of the
operational frequeny. So in DVS both V and f are adjusted: the proessor is made
both slower and less onsuming.
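The difference between saving power and saving energy can be made concrete with Equation 5. The sketch below is a rough illustration only: the capacitance and the two supply voltages are hypothetical values, and the workload is the 5-second task at 100 MHz used earlier in this section.

```python
import math


def dynamic_power(c, f, v):
    """Equation 5: P = C * f * V^2."""
    return c * f * v ** 2


def energy(cycles, c, f, v):
    """Energy = power * time, where the task takes cycles / f seconds."""
    return dynamic_power(c, f, v) * (cycles / f)


C = 1e-9         # hypothetical effective capacitance
CYCLES = 500e6   # 5 seconds of work at 100 MHz

full = energy(CYCLES, C, 100e6, 1.0)           # full speed, hypothetical 1.0 V
freq_only = energy(CYCLES, C, 50e6, 1.0)       # half frequency, same voltage
freq_and_volt = energy(CYCLES, C, 50e6, 0.5)   # half frequency, hypothetical 0.5 V

print(full, freq_only, freq_and_volt)
```

Halving only the frequency halves the power but doubles the run time, so the energy is unchanged; the saving comes from the voltage term.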
For an example of a real life DVS solution consider the performance states of the 1.6
GHz Pentium M processor presented in Table 1. At the maximum speed, 1.6 GHz,
the power consumption of the processor according to Equation 5 is
C × 1.6 GHz × (1.484 V)^2 and at the lowest speed C × 600 MHz × (0.956 V)^2.
At the lowest frequency and voltage the processor consumes less than one sixth of
its maximum power,
\[ \frac{C \times 600\,\mathrm{MHz} \times (0.956\,\mathrm{V})^2}{C \times 1.6\,\mathrm{GHz} \times (1.484\,\mathrm{V})^2} \approx 0.16, \]
while still providing 38% of the maximum computing performance
(600 MHz / 1.6 GHz = 0.375). The early Transmeta Crusoe processor provided even
more impressive power savings, as seen in Table 2. The Crusoe provided 29% of the
maximum throughput (200 out of 700 MHz) while consuming less than 13% of the
maximum power.
Table 1: DVS performance states of the 1.6 GHz Intel Pentium M processor [Int04].
Table 2: DVS states of the Transmeta TM5400 Crusoe processor [PLS01].
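Applying Equation 5 with the voltage squared to the two Pentium M operating points quoted in the text (1.6 GHz at 1.484 V and 600 MHz at 0.956 V) can be checked in a few lines. The capacitance C cancels in the ratio; the helper name relative_power is an assumption made for illustration.

```python
def relative_power(f_low, v_low, f_high, v_high):
    """Dynamic power at the low state divided by that at the high state.

    Uses P = C * f * V^2; the capacitance C cancels out of the ratio.
    """
    return (f_low * v_low ** 2) / (f_high * v_high ** 2)


power_ratio = relative_power(600e6, 0.956, 1.6e9, 1.484)
speed_ratio = 600e6 / 1.6e9
print(f"power: {power_ratio:.3f}, speed: {speed_ratio:.3f}")
```

The lowest state delivers 37.5% of the throughput for roughly 16% of the dynamic power.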
For sheduling needs, DVS an be utilized basially in three dierent ways. These are
ompared in table 3. The simplest method is the interval-based approah [VeF05℄, in
whih the CPU frequeny and voltage are adjusted downwards if the CPU utilization
during the past t time units has been low, and upwards if the CPU utilization has
been high. The value of t is ritial. If t is too short, the CPU fequeny and
voltage may be adjusted bak and forth ausing high overhead. On the other hand,
large t values may ompromise eieny as DVS adjustments are made very seldom.
The interval-based method an be enhaned by onsidering a window of intervals.
However, the interval-based method is not suitable for use in real-time systems as
it does not take into onsideration the deadlines of individual tasks.
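A minimal sketch of an interval-based policy follows. The thresholds, the frequency levels and the one-step-per-interval rule are all hypothetical choices made here for illustration; real governors differ in detail. Note that nothing in the function refers to deadlines, which is why the approach is unsuitable for real-time use.

```python
def interval_governor(utilizations, levels, low=0.5, high=0.9):
    """Choose a speed level for each interval from the previous utilization.

    utilizations: measured CPU utilization (0..1) of consecutive intervals.
    levels: available frequency levels in ascending order.
    low, high: hypothetical thresholds for stepping down or up one level.
    Returns the level selected after each interval.
    """
    idx = len(levels) - 1          # start at full speed
    chosen = []
    for u in utilizations:
        if u > high and idx < len(levels) - 1:
            idx += 1               # busy interval: step up
        elif u < low and idx > 0:
            idx -= 1               # mostly idle interval: step down
        chosen.append(levels[idx])
    return chosen


# Hypothetical frequency levels in MHz and a short utilization trace.
print(interval_governor([0.2, 0.2, 0.95, 0.7], [600, 1000, 1600]))
```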
The inter-task approach [VeF05] considers a distinct DVS value for each task and,
therefore, suits the needs of real-time applications well. Voltage and frequency
settings are altered at context switches and remain fixed during the execution of the
entire task. The advantage of the inter-task approach over the interval-based one is
that each task may receive an individually suitable DVS setting. However, the
execution time allocated for a task generally is much higher than the actual execution
time. Using the inter-task approach, the entire task is run with the same DVS value,
which in most cases is unnecessarily high. Therefore, the power savings achieved by
this method often are not optimal.
Method name      DVS occasions                                Real-time suitable   Complexity
Interval-based   At threshold time intervals                  No                   Low
Inter-task       Context switches                             Yes                  Medium
Intra-task       Context switches and during task execution   Yes                  High
Table 3: Comparison of fundamental DVS techniques.
The most advanced DVS method used in real-time systems is the intra-task
approach [VeF05]. Here DVS values may be changed even during the execution of a task.
Algorithms utilizing this method are, for instance, Feedback DVS-EDF [DMZ02]
and EScheduler [YuN06], presented in Sections 4.1.3 and 4.2.1, respectively.
The Feedback DVS-EDF algorithm utilizes DVS aggressively. It divides
a task's execution time Ei into two parts, Ca and Cb. During Ca the processor is
run at a lowered speed, and only at the start of Cb is the CPU speed increased.
Jobs finishing sooner than their budgeted execution time will never reach Cb and
the system is spared this high power execution interval. In EScheduler, the
speed schedule is divided into several phases, each with a slightly different
DVS value. The task is initially executed at a low speed, and as execution
progresses, the speed is gradually increased.
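The timing constraint behind such a two-part split can be written down directly. The sketch below is not the Feedback DVS-EDF algorithm itself; it only computes, under the assumption that Ca and Cb are expressed in full-speed time units and that the second part runs at full speed, the slowest relative speed for the first part that still lets the worst case fit into a given time slot. The function and parameter names are hypothetical.

```python
def low_phase_speed(c_a, c_b, slot):
    """Slowest relative speed s (0 < s <= 1) for the first part of a job.

    c_a, c_b: worst-case lengths of the two parts at full speed.
    slot: time available to the job before its deadline.
    The worst case must still fit: c_a / s + c_b <= slot.
    """
    if c_a <= 0 or c_b < 0 or slot < c_a + c_b:
        raise ValueError("need c_a > 0, c_b >= 0 and slot >= c_a + c_b")
    return c_a / (slot - c_b)


# A job with worst-case demand 4 (split 2 + 2) and 8 time units available.
s = low_phase_speed(2, 2, 8)
print(s)              # relative speed of the first part
print(2 / s + 2)      # worst-case completion time, equal to the slot length
```

A job that finishes within its first part never pays for the full-speed second part, which is where the energy saving comes from.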
3.2 Advanced Configuration and Power Interface
Processor manufacturers have different implementations of their voltage scaling
technologies. AMD's technology is named PowerNow, Intel's SpeedStep, and
Transmeta's LongRun [PLS01], or more recently, LongRun2. ACPI, first introduced by
Intel, Microsoft and Toshiba in 1996 [Gro03], is a standardized interface between the
hardware and the operating system. The general architecture of ACPI is depicted
in Figure 5.
Figure 5: The ACPI provides a standard interface between the operating
system and the firmware [Gro03].
The main advantage of ACPI is that both hardware and operating system (OS)
components may evolve independently of each other while letting the OS fully
control the system's power management. The OS may, for instance, choose to bundle
disk writes to be executed in batches in order to improve system response times.
This kind of functionality is not possible when power management is controlled by
hardware alone.
ACPI provides standardized mechanisms for switching between different power
conscious states of processors, disk drives, screens, modems, and other components that
are used in today's portable computers. Both Windows and Linux platforms support
ACPI for CPU frequency scaling. The ACPI design is based on ASL (ACPI Source
Language) and AML (ACPI Machine Language), an arrangement that closely resembles
the Java programming language [Gro03]. The human readable uncompiled Java source code
corresponds to ASL in ACPI, whereas Java bytecode corresponds to AML, which
is the compiled version of ASL. The idea here is that AML abstracts the
platform-specific details from the operating system so that the OS may use standard
operation names to access platform-specific features.
The current version of the ACPI specification is 3.0b. This 631 page document was
released in October 2006, and is available for download at http://www.acpi.info.
4 Power Aware Processor Scheduling
Real-time scheduling algorithms can be divided into processor and device scheduling
algorithms. This section covers power-aware real-time scheduling algorithms for
CPU scheduling, while device scheduling is covered in Section 5. In this section we
focus on uniprocessor systems. The scheduler is responsible for sharing this single
CPU between all tasks while guaranteeing that time boundaries are met.
Energy saving is achieved by running the processor at a lower speed whenever this
speed is sufficient to meet the deadlines. Because the processor's dynamic power
consumption grows linearly with the clock frequency and quadratically with the
voltage (Equation 5, Section 3.1), and thus roughly cubically when both are scaled
together, significant energy consumption reductions can be achieved by lowering the
processor's frequency and voltage on occasions when maximum throughput is not needed.
Some scheduling algorithms even utilize the sleep state of the processor when the system
is idle, if such a state is available. For instance, if the scheduler knows that the
next periodic job will not be released until time t, it will set a timer to wake up the
processor at time t and put the processor into sleep mode.
Lowering the processor speed to save energy works as follows. Suppose that the
current job needs to finish by time t. When run at full speed, the processor will
finish the job at time t/2. Hence, it suffices to run the processor at half of the
maximum speed in order to guarantee timely execution.
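The slowdown rule in the example above amounts to dividing the remaining work, measured at full speed, by the time available. The sketch below also maps the result onto a set of discrete frequency levels, since real processors offer only a few operating points. The level values and both function names are hypothetical.

```python
def min_speed_factor(work_at_full_speed, time_available):
    """Smallest fraction of full speed that still finishes the work in time."""
    if time_available <= 0:
        raise ValueError("time_available must be positive")
    return min(1.0, work_at_full_speed / time_available)


def pick_level(factor, levels):
    """Lowest available frequency that is at least factor * max(levels)."""
    needed = factor * max(levels)
    return min(f for f in levels if f >= needed)


# A job needing t/2 at full speed with t available (here t = 10).
factor = min_speed_factor(5, 10)
print(factor)                                    # half of the maximum speed
print(pick_level(factor, [600, 800, 1000, 1600]))  # hypothetical levels in MHz
```

Rounding up to the next available level keeps the deadline guarantee at the cost of some unused slack.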
Proessor sheduling algorithms an be divided into two ategories, hard and soft
real time sheduling algorithms. We will rst study algorithms that provide hard
real-time guarantees. These are the stritest type of real-time algorithms: they
guarantee that all deadlines are met. All algorithms presented in Setion 4.1 are
enhanements of either the Rate Monotoni or Earliest Deadline First [LiL73℄ al-
gorithm. In soft real-time algorithms, oasional deadline misses are allowed. Soft
real-time proessor sheduling algorithms are explored in Setion 4.2.
4.1 Hard Real Time Scheduling
The Rate Monotonic and Earliest Deadline First algorithms as such form an
excellent starting point when engineering energy aware real-time schedulers. Most
contemporary hard real-time schedulers with energy conserving properties are in fact
relatively small enhancements to the RM and EDF techniques. As examples of
such algorithms, we will in this subsection explore a number of pseudo codes. The
LPFPS algorithm enhances the Rate Monotonic algorithm, and provides a guaranteed
U of ln 2 as indicated by Equation 3. As examples of energy conscious Earliest
Deadline First based schedulers, guaranteeing U ≤ 1, the LEDF and Extended
LEDF algorithms are presented. The most ambitious algorithm that will be considered
is Feedback DVS-EDF, which even utilizes a basic form of intra-task DVS, and
slack-time passing between jobs. In general, EDF based schedulers are much more
common in research papers than their RM based counterparts. This is due to EDF
providing full utilization of the processor. RM is, however, simpler to implement in
some operating system kernels that do not provide explicit support for the timeliness
properties that real-time tasks require [But05].
4.1.1 The Low Power Fixed Priority Sheduling Algorithm
The Low Power Fixed Priority Sheduling (LPFPS) [ShC99℄ algorithm, published
in 1999, is one of the earliest energy onsious sheduling algorithms. It enhanes
the Rate Monotoni algorithm by taking into aount energy onserving properties.
LPFPS exploits two different opportunities for energy savings. Firstly, in an RM
based schedule, there usually are idle times in the schedule. Recall the RM
schedulability condition $U \le n(2^{1/n} - 1)$ of Equation 2: the maximal CPU
utilization U of an RM based schedule with large task numbers n approaches the
value ln 2 ≈ 0.69. So with high task numbers the maximal RM utilization degree
leaves the CPU idle for roughly 30 percent of the time, and LPFPS utilizes this
time for energy savings. Secondly, jobs often actually execute faster than
budgeted. In other words, jobs rarely use all of the time that has been allocated
to them. When a job executes faster than budgeted, the remaining time is used by
LPFPS to save energy.
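To make the first source of savings concrete, the following minimal Python sketch evaluates the RM bound of Equation 2 for a few task counts. The function name rm_utilization_bound is our own and purely illustrative.

```python
import math


def rm_utilization_bound(n: int) -> float:
    """Liu and Layland bound n(2^(1/n) - 1) for n periodic tasks under RM."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (2 ** (1 / n) - 1)


# The bound decreases with n and approaches ln 2, roughly 0.69.
for n in (1, 2, 5, 10, 100):
    print(n, round(rm_utilization_bound(n), 4))
print("limit:", round(math.log(2), 4))
```

The gap between this bound and 1 is the idle time that an RM schedule at the guaranteed utilization leaves available for power management.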
Both voltage and frequeny saling and the powering down of the CPU are supported
by LPFPS. When the system is idle, i.e., there are no jobs ready to run, LPFPS
plaes the CPU in a power down mode, and initiates a timer to wake up the proessor
so that it will be ready for use when it, aording to the shedule, is needed next
time. When there is only one job left ready to run, LPFPS will alulate an energy
onserving voltage and frequeny setting for the job, and exeute it if possible at a
lower CPU speed.
The LPFPS algorithm utilizes two data structures of the type queue. Jobs that are
ready for execution and wait for processor time are placed in the run queue. The
job with the highest RM priority (the shortest period) is at the head of the queue.
In the delay queue LPFPS holds tasks whose current jobs are completed, i.e. tasks
waiting for the arrival of their next jobs in the next period. The job with the closest
arrival time is placed at the head of the delay queue. The job that is currently
scheduled for execution is called the active task. Conceptually, this task is present
in neither of the queues.
Figure 6: Pseudo code of the LPFPS scheduling algorithm [ShC99]. Lines
L5-L11 correspond to the conventional Rate Monotonic functionality.
The LPFPS pseudo code
We are now ready to consider the LPFPS pseudo code of Figure 6. Let us begin by
considering lines L5-L11, where the functionality of a conventional RM scheduler
is present. On line 5 it is checked whether the current time exceeds or equals the
release time of job(s) at the head of the delay queue. If so, the jobs are moved to
the run queue (line 6). If the job now at the head of the run queue has a greater
priority, i.e., shorter period, than the active task has (line 8), then a context
switch occurs on line 10. This implies that the information belonging to the current
active task in the CPU registers and operating system control structures is stored
in main memory, and replaced by the information of the new active task. Prior to
the context switch, on line 9, LPFPS also stores the amount of time the job has
been executed. This value is later used when calculating voltage and frequency
scaling parameters.
In addition to the conventional RM scheduler functionality, LPFPS provides energy
saving properties. Energy savings will be sought when the run queue is empty, i.e.,
when there are no jobs waiting for processor time. This condition is checked on
line 12. If the run queue is empty, and there is no active task (line 13), i.e., the
processor is idle, then the CPU will be put to power down mode. On line 14 a timer
is set to activate the processor so it will be ready for use at the arrival of the
next job. In setting the timer LPFPS takes into account the processor wakeup delay
time. On line 15 the CPU is put to power down mode.
If the run queue is empty but there is one active task (line 16), LPFPS will calculate
an energy saving DVS setting for it and, when possible, execute the task at lower
speed and voltage. The new speed ratio is calculated by the Compute_speed_ratio()
procedure called on line 17. The formula used by LPFPS in calculating the speed
ratio is [ShC99]:
\[ \mathit{speed\_ratio} = \frac{C_i - E_i}{t_a - t_c} \]
where $C_i$ is the budgeted execution time, $E_i$ the time that has already been spent
executing the job, $t_a$ is the arrival time of the next job, and $t_c$ is the current time.
In essence, the remaining execution time is divided by the time available before the
arrival of the next job. Among the available CPU clock frequencies the lowest one
guaranteeing timely execution is located on line 18. The processor frequency and
voltage are adjusted on line 19. It should be noted that it is implicitly assumed that
$D_i \ge t_a$, where $D_i$ is the absolute deadline of the active job.
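The energy saving branch described above can be sketched in Python as follows. This is a minimal illustration and not the code of [ShC99]: the function names, the tuple-based return values, the capping of the ratio to the range [0, 1] and the frequency list in the usage example are all our own assumptions.

```python
def compute_speed_ratio(c_i, e_i, t_a, t_c):
    """Remaining budget divided by the time left before the next arrival.

    Capping the result to [0, 1] is an assumption of this sketch.
    """
    window = t_a - t_c
    if window <= 0:
        return 1.0
    return min(1.0, max(0.0, (c_i - e_i) / window))


def lowest_safe_frequency(ratio, frequencies, f_max):
    """Lowest available frequency f whose relative speed f / f_max covers ratio."""
    for f in sorted(frequencies):
        if f / f_max >= ratio:
            return f
    return f_max


def energy_action(run_queue, active, t_a, t_c, wakeup_delay, frequencies):
    """Decide what to do once the conventional RM part has run.

    active is None or a (C_i, E_i) pair; t_a is the next arrival time.
    """
    f_max = max(frequencies)
    if run_queue:                      # other jobs are waiting: run at full speed
        return ("run", f_max)
    if active is None:                 # idle: sleep, wake up early enough
        return ("power_down", t_a - wakeup_delay)
    c_i, e_i = active                  # exactly one job left: slow it down
    ratio = compute_speed_ratio(c_i, e_i, t_a, t_c)
    return ("scale", lowest_safe_frequency(ratio, frequencies, f_max))


# Hypothetical platform with four clock frequencies (MHz).
print(energy_action([], (10, 4), 20, 8, 2, [25, 50, 75, 100]))
```

In the usage example 6 time units of budget remain and 12 time units are available, so the ratio is 0.5 and the lowest frequency that still guarantees timely completion is chosen.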
4.1.2 Low-Energy EDF and Extended Low-Energy EDF
Where LPFPS, desribed in the previous subsetion, is based on the Rate Monotoni
algorithm, we will from here on fous on Earliest Deadline First shedulers. The
18
pseudo ode of an energy onserving EDF based proessor sheduling algorithm
alled Low-Energy EDF (LEDF) is given in Figure 7. This algorithm was published
by Swaminathan and Chakrabarty in 2000 [SwC00℄. It only supports two distint
CPU speeds, low and high speed. Due to its simpliity, it is an exellent entry point
into more omplex shedulers.
Figure 7: The LEDF pseudo code [SwC00].
On line 7 of Figure 7, the jobs currently present are sorted according to their
deadlines, and on line 8 the job with the closest deadline is scheduled according to
the EDF principle. On line 9, LEDF checks whether or not the job would make its
deadline if scheduled at a lower speed and voltage. If so, the job is scheduled at the
lower speed. If the job cannot meet its deadline at the lower speed, LEDF checks on
line 11 if it can make it with the higher speed, and schedules the task at the higher
speed on line 12. If the deadline cannot be met even at higher speed, the exception
handler (line 13) is called. It is then up to the operating system to decide what to
do with this task.
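The decision logic just described can be sketched in Python as follows. The Job record, the DeadlineMiss exception and the function name ledf_select are hypothetical names chosen for this illustration; the pseudo code of [SwC00] is not reproduced here.

```python
from dataclasses import dataclass


class DeadlineMiss(Exception):
    """Stands in for the call to the operating system exception handler."""


@dataclass
class Job:
    name: str
    deadline: float   # absolute deadline
    t_low: float      # execution time at low speed
    t_high: float     # execution time at high speed


def ledf_select(ready, now):
    """Pick the earliest-deadline job and the lowest speed that meets its deadline."""
    if not ready:
        return None
    job = min(ready, key=lambda j: j.deadline)   # EDF principle
    if now + job.t_low <= job.deadline:
        return job.name, "low"
    if now + job.t_high <= job.deadline:
        return job.name, "high"
    raise DeadlineMiss(job.name)


jobs = [Job("a", 10, 8, 4), Job("b", 6, 8, 4)]
print(ledf_select(jobs, now=0))   # job b cannot finish by 6 at low speed
```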
Extended LEDF
The authors of LEDF have improved their algorithm [SwC01]. The Extended LEDF
(E-LEDF) algorithm given in Figure 8 considers the CPU transition delay when
making scheduling decisions. A switch between the high and low speed states
always introduces a certain time and energy penalty. The switch in itself consumes
some energy and takes some time. Very short switches from the high speed state to
the low speed state are not worthwhile as the state transition cost would exceed the
net gain.
Figure 8: Pseudo code of the E-LEDF scheduler enhancing LEDF
[SwC01]. Syntax: $t_{low}$ and $t_{hi}$: execution time with low / high CPU
speed, respectively; $t_s$: state transition delay; $d_i$: deadline; $E_{low}$ and $E_{hi}$:
energy consumption with low / high speed, respectively.
Let us now explore the E-LEDF pseudo code. On line 6 in the pseudo code of Figure
8 tasks are sorted according to their deadlines, and the task with the closest deadline
is chosen for execution according to the EDF principle. When scheduling the very
first task of the session (line 7), we want to check if we can schedule the task at
low speed. This is done on line 8: if the execution time with low speed $t_{low}$ added
to the processor transition delay $t_s$ is less than or equal to the task's deadline $d_i$,
the task is scheduled using low speed. Otherwise, it is checked if the task will meet
its deadline with high speed (line 9). If the deadline cannot be met even at high
speed, the operating system exception handler is called (line 10). The operating
system might, for instance, alert the application whose time constraints cannot be
met.
The sheduling of the following tasks begins on line 12. If the previous task was
run at high speed, then E-LEDF will ompute the task's total energy onsumption
using both low and high speeds (Elow and Ehi) on line 13. In these alulations,
the proessor state transition energy osts are taken into onsideration. If the task
is not shedulable even at high speed (line 14) the operating exeption handler is
alled (line 15). If the task is shedulable, E-LEDF will need to onsider whether
it is worthwhile to swith to low speed. If the task will meet its deadline at low
speed inluding transition delays (line 17), and the total energy onsumption at low
speed Elow doesn't exeed energy onsumption at high speed Ehi, then the task is
sheduled at low CPU speed (line 19). Otherwise, the task is sheduled at high
speed (line 21 and 23).
A similar pattern to the one described in the previous paragraph is followed if the
previous task was scheduled at low speed (line 24). The total energy consumption
at both speeds is calculated (line 25), and in the sum $E_{hi}$ also the transition cost
is included. The transition to the higher CPU speed is made only if the total
energy consumption at high speed would be smaller than using the low speed. This
condition is checked on line 30.
We believe the E-LEDF code contains redundancies and at least one error. Notice
that the if statement on line 19 is redundant: the condition $t_{low} + t_s \le d_i$ has
already been checked on line 17. In fact, the if on line 21 and the entire lines 22
and 23 are redundant as well. The error we believe we have found is also quite
obvious. Consider a situation where the previous task has been run at low speed,
and $t_{hi} + t_s \le d_i$, but $E_{hi} \ge E_{low}$. This would bring us to line 33 in the pseudo
code. Now assume that $t_{low} + t_s > d_i$. This could very well be possible, since the
task is schedulable at high speed ($t_{hi} + t_s \le d_i$), and the schedulability test on
line 26 would hence have been passed. In this situation, the if condition on line 33
would be false, and the task would never be scheduled. The pseudo code would
hence need some rewriting to support tasks that need to be run at high speed, even
though they wouldn't spend less energy at that speed. The required modification is
quite trivial. It suffices to add to line 33 the following: else schedule at high speed.
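The decision for all tasks after the first one, including the else branch proposed above, can be sketched as follows. This is our own Python reading of the description in the text, not the pseudo code of [SwC01]: the function and parameter names are hypothetical, times are taken relative to the current time, and the transition energy e_s is passed separately and added wherever a speed switch would be needed.

```python
class NotSchedulable(Exception):
    """Raised where the pseudo code would call the OS exception handler."""


def eledf_next_speed(prev, t_low, t_hi, t_s, d_i, e_low, e_hi, e_s):
    """Choose the speed of the next task given the speed of the previous one."""
    if prev == "high":
        if t_hi > d_i:                      # not schedulable even at high speed
            raise NotSchedulable
        # Switch down only if the deadline holds and no energy is lost.
        if t_low + t_s <= d_i and e_low + e_s <= e_hi:
            return "low"
        return "high"

    # The previous task ran at low speed.
    if t_hi + t_s > d_i:                    # schedulability test
        raise NotSchedulable
    if e_hi + e_s < e_low:                  # switching up saves energy
        return "high"
    if t_low + t_s <= d_i:                  # condition quoted for line 33
        return "low"
    return "high"                           # the added else branch


# High speed costs more energy here, but low speed would miss the deadline.
print(eledf_next_speed("low", t_low=9, t_hi=5, t_s=2, d_i=10,
                       e_low=3, e_hi=4, e_s=1))
```

Without the final return statement the usage example would fall through without scheduling the task, which is exactly the situation described above.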
A more fundamental problem with E-LEDF is that the algorithm does not explicitly
handle situations when the CPU is idle. If the previous task has left the CPU in
its high speed state when the job queue becomes empty, E-LEDF will still keep the
CPU running at full speed and hence waste energy although the processor is not
needed. Currently, E-LEDF supports only two distinct CPU speeds, and no
power-off state. Frequency and voltage scaling decisions are made only at the
beginning of each task, which limits achieved energy savings.
4.1.3 Feedbak DVS-EDF
One of the more ambitious power-aware hard real-time CPU scheduling algorithms
is also based on EDF and is called Feedback DVS-EDF. It was published by Dudani,
Mueller and Zhu in 2002 [DMZ02]. The interesting parts of the pseudo code are
presented in Figure 9. The code for initializing variables, pre-emption handling and
setting of clock frequency are excluded, since they are of little interest to the topic
of this thesis. The interested reader may, however, view the entire algorithm in
appendix 1.
The idea in Feedbak DVS-EDF is to utilize DVS aggressively. The algorithm is
based upon the assumption that most atual task instanes (jobs) will need less CPU
time than sheduled to them. Therefore, Feedbak DVS-EDF begins the exeution
of a job with a very slow CPU speed. Only if the job isn't nished after a ertain
time, is the CPU speed inreased. In real-life situations, jobs rarely use all of the
CPU time alloated to them. Therefore, for most jobs, the CPU will never need to
run at its highest speed, and energy is saved.
In order to be able to alulate a statistially optimal initial speed, the Feedbak
DVS-EDF algorithm maintains statistial information on the exeution times of a
task's previous instanes. Tasks are also able to pass unused slak time on to the
next job. Say, for instane, that a job Ji has exeuted 2 time units faster than
budgeted and nishes at t. Further assume, that the next job Ji+1 has been released
before t. In this ase, using Feedbak DVS-EDF, Ji will pass the two unused slak
time units on to Ji+1. Now, Ji+1 will have in its exeution time budget two more
time units more than usually. This extra time may be used to further slow down the
proessor in order to onserve energy. Information on unused slak time is stored
in the variable slack, and by reading this variable the sheduler will know of these
two superuous time units when it goes on to shedule Ji+1. This inreased time
budget is, of ourse, usable only if it won't jeopardize nishing Ji+1 within its time
boundaries.
Figure 9: The central parts of the Feedback DVS-EDF algorithm
[DMZ02].
These are the main energy conserving properties of Feedback DVS-EDF. Let us
now study the pseudo code of Figure 9 in closer detail. In order to do this, a number
of notations need to be explained. By $T_{ij}$ we mean an instance, i.e. a job, $j$ of task
$T_i$, and with $d_i$ we mean its deadline. The variable slack stores information on
unused slack time, and $left_{ij}$ holds the remaining execution time of job $T_{ij}$. By Tab
we denote the set of idle tasks (tasks that currently have no jobs waiting for processor
time), and by pk the previous, by nj the next and by ij the current job. The symbol
$\alpha'$ denotes the lowered processor speed as a ratio of the processor's maximal speed,
and $f_i$ is the clock frequency of the processor. By $r_{ij}$ we denote the release time of
job $T_{ij}$, i.e., the time when the job is ready to be scheduled, and starts waiting for
processor time. The job's actual execution time is denoted by $c_{ij}$, and $C_i$ is the
budgeted worst-case execution time. The variable $C_{avg_i}$ denotes the average
execution time of $T_i$. The execution time $C_i$ of a job is divided into two parts, $C_A$
and $C_B$, where $C_A$ is the time interval that the job is executed at a slower and less
consuming CPU speed, and $C_B$ denotes the time interval when the job is executed
at high speed. Therefore, $C_i = C_A + C_B$. Taking the unused slack time into account,
this implies:
\[ \frac{C_A}{\alpha'} + C_B = C_i + slack \]
By $\alpha'$ we mean the ratio of the lowered clock frequency to the maximal clock
frequency, and this value in turn is calculated using the formula
\[ \alpha' = \frac{C_A}{C_A + slack} \]
where slack denotes the unused slack time that emerges when a job is executed
faster than budgeted.
This funtionality is presented in the algorithm of Figure 9 beginning on line 7, in the
proedure TaskActivation. (Notie that Feedbak DVS-EDF uses the term task
in the proedure names when referring both to a task, and an instane of a task.
Elsewhere in this thesis, the term job is used for the latter.) On line 7 the value
α′, i.e., the ratio of the lowered CPU speed from the highest speed, is alulated.
In order to nd the optimal value for α′ Feedbak DVS-EDF utilizes statistis from
previous instanes of the task. It is from here that the word Feedbak in the
algorithms name is originated. Statistis is maintained in the variable Cavg, whih
indiates the average exeution time of this task's previous jobs. When a job is
nished, its Cavg is updated on line 21 in the proedure TaskCompletion. Here we
take into onsideration the exeution time of the urrent instane cij and alulate a
weighted average between cij and the previous value of Cavg. The value Cavg is then
utilized when alulating an optimal value for α′ at job ativation. The variable
slack is alulated and utilized at similar oasions. The value is alulated when a
job nishes, in the proedure TaskCompletion on line 20, and is later utilized when
alulating α′ in TaskActivation on line 7. Information on unused time and CPU
utilization statistis of previous jobs is thus passed between jobs using these two
variables.
On lines 9 and 11 the variable $C_A$ is calculated and set. This variable indicates the
length of the time period from the beginning of a job that the job is to be executed
with the lower speed. This speed is indicated as the ratio to the maximum speed
by $\alpha'$. If $\alpha'$ is calculated to equal 1 (line 8), then the task must be executed at
the highest clock frequency, and the length of the lower speed interval $C_A$ is set to 0
(line 9). If the value of $\alpha'$ is not equal to 1, then $C_A$ is calculated on line 11. On
line 12, a timer interrupt is set to activate the scheduler after $C_A$ units of time have
passed. This is done by the procedure SetInterrupt. On line 13, the processor is
adjusted to the new clock frequency. If the job isn't finished within $C_A$ units of time,
the scheduler is reactivated by the timer. The reactivated scheduler will adjust the
CPU to run at full speed, and the rest of the job will be executed at the highest
clock frequency. This will guarantee timely finishing of the job.
4.1.4 Cyle-Conserving DVS for EDF Shedulers
Feedbak DVS-EDF presented in the previous subsetion utilizes DVS aggressively.
For the sake of omparison, let's onsider the Cyle-Conserving DVS for EDF shed-
ulers (EDF) algorithm [PiS01℄ presented in Figure 10. This illustrative algorithm
utilizes DVS onservatively: jobs are initially run at a higher CPU speed, and when-
ever jobs nish before spending their entire time budget, the proessor is slowed
down.
Figure 10: The Cyle-onserving DVS for EDF Shedulers (EDF) algo-
rithm [PiS01℄. Ci budgeted CPU yles to task Ti; cci atual spent yles;
fi proessor frequeny; fm maximal proessor frequeny; Ui utilization
degree.
Now consider the pseudo code in Figure 10. Upon task completion, on line 8, the
utilization degree $U_i$ of $T_i$ is set to $cc_i / P_i$, i.e., to reflect the eventual time left
unused by the task. Then, on line 10, the procedure select_frequency() is called.
Here, ccEDF chooses from among all discrete CPU speeds $\{f_1, \ldots, f_m\}$ the lowest
one that will guarantee schedulability of the tasks with the newly calculated $U_i$.
The schedulability criterion is based on the EDF schedulability condition
$\sum U \le 1$ (Equation 4) [LiL73], but on the right side of the inequality we now have
$f_i / f_m$ instead of 1 to represent the lowered CPU speed. When new tasks are
released, ccEDF will in task_release($T_i$) on line 5 calculate the utilization for the
new task, and then on line 6 call select_frequency(), which now may want to raise
the CPU speed to reflect the increased workload. No explicit transition delay
considerations, nor explicit schedulability failure handling, are present in ccEDF.
Its purpose here is solely to illustrate the functionality of conservative DVS as
opposed to the aggressive technique implemented in Feedback DVS-EDF. The
authors of ccEDF have also presented ccRM, an energy conserving Rate Monotonic
based algorithm with conservative DVS support, and laEDF (Look Ahead EDF), an
EDF based power-aware scheduler with aggressive DVS support [PiS01].
4.1.5 Comparing the Presented Algorithms
We now have considered five different algorithms for power-aware processor
scheduling. The one based on the Rate Monotonic method is called LPFPS. This
algorithm is pre-emptive and seeks energy savings in two different ways: if only one
job remains to be scheduled, it is run at a lower clock frequency. If no jobs are left
waiting for processor time, then the processor is put to sleep, and is later awoken
with a timer. Because the Rate Monotonic method guarantees a utilization degree of
approximately 0.69, in an RM schedule there most often is plenty of idle time. The
LPFPS algorithm also considers the processor wakeup delay when making power
down decisions.
The other four algorithms are based on the Earliest Deadline First method. The first
one presented is called LEDF and supports only two different CPU speeds. At the
beginning of each job the scheduler calculates whether the job will meet its deadline
if scheduled at the lower speed. The higher speed is used only when needed. This
simple algorithm has later been enhanced by the same authors with E-LEDF. Here
also CPU state transition costs in time delays and energy waste are considered. A
state transition is made only if it is worthwhile. Very short transitions are not
always worthwhile. E-LEDF, too, supports only two different speeds.
Out of the presented algorithms the most versatile is Feedback DVS-EDF. This
algorithm aggressively seeks energy savings by starting the execution of each job with
a low speed. Only when needed to guarantee timely execution does the scheduler
run the job at high speed. The idea here is the finding that most real-time jobs
execute significantly faster than their budgeted worst-case execution times. In order
to find an optimal starting speed, Feedback DVS-EDF uses statistical information
from previous instances of the task. Jobs may pass unused execution time on to the
next job.
Even though Feedbak DVS-EDF is advaned even it ould be further improved. For
instane the algorithm divides the task's exeution times into two piees, CA and CB,
where the time CA is spent running at the lower speed, and CB with highest speed.
By further dividing the exeution time into smaller fragments, where eah fragment
is exeuted slightly faster than the previous one, even greater energy savings ould
be found. This would, however, add to the algorithm's omplexity. The usefulness
would depend on the amount of DVS states the used proessor platform supports.
We ended our review of energy saving hard real-time scheduling algorithms by
presenting ccEDF, a simple algorithm that utilizes DVS conservatively. Where
Feedback DVS-EDF begins execution of tasks with low speed, ccEDF initially runs
tasks at high speed, and once slack time is accrued, forthcoming tasks are run at
slower speeds, if possible. This algorithm makes voltage and frequency scaling
decisions only at the end of and upon release of tasks, but is significantly less
complex than Feedback DVS-EDF.
4.2 Soft Real Time Sheduling
We will in this subsetion explore two soft-real time CPU sheduling algorithms.
Soft real-time shedulers provide a statistial performane guarantee. A ertain
perentage, say p, of the sheduled jobs will nish within a ertain time period.
Oasional misses of jobs are allowed. Therefore one might believe that the soft
real-time shedulers would be more simple than their hard real-time ounterparts.
That is, however, not the ase. As will be revealed, these algorithms are far more
omplex than their hard real-time ounterparts. Their system model onepts and
patterns of design are original, whereas the hard real-time shedulers evidently were
ospring of the original EDF and RM algorithms published by Liu and Layland in
1973 [LiL73℄.
Presently, the most common implementation environment for soft real-time
schedulers is multimedia systems. For instance MPEG video or audio compression
decoders are considered fully usable even when they occasionally do miss a frame
or sound sample. Because such a relaxation of the strict hard real-time requirements
might provide significantly better system throughput or response times to interactive
systems, soft real-time schedulers are increasingly popular.
4.2.1 The ESheduler Algorithm
The ESheduler [YuN06℄ is based upon work done in the GRACE projet [YuN03℄.
The algorithm gives a statistial probability guarantee that sheduled tasks (EShed-
uler uses the term proess) meet their deadlines. This is usually suient for
multimedia appliations, where it sues to know that p % (where p might be for
instane 95) of video frames are timely deoded. ESheduler onserves energy by
utilizing DVS aggressively. It is based on the EDF algorithm.
ESheduler has two main tasks to perform: rstly, task sheduling, i.e. to shedule
instanes of tasks guaranteeing that they meet their deadline with probability p %,
and seondly, speed saling, i.e. to run these sheduled proesses onserving as muh
battery power as possible. These funtions will be desribed next.
Sheduling tasks The fundamental assumption in the design of ESheduler is,
that while the atual CPU demand of a task's individual jobs varies greatly, the yle
demand distribution of the task is pretty stable. ESheduler maintains statistis of
the atual CPU yles needed by the last n jobs of a task.
Figure 11: ESheduler ounts the yle demand of tasks [YuN06℄.
ESheduler alulates the yle demand of a job as depited in Figure 11. The
ounter is implemented as an extra eld in the Proess Control Blok (PCB) of the
operating system. Eah time the task is swithed out the CPU yle ounter of
the job is updated, and when the job nishes, its entire yle ount is added up to
the statistis. Based upon this statistis, aurate estimations of forthoming CPU
yle demand an be made, and the task an be sheduled an appropriate amount
of CPU time. Sheduling too little CPU time will result in low quality of servie
as for instane video frames aren't deoded timely, while sheduling too muh time
will waste CPU resoures and onsume energy superuously.
Figure 12: The cumulative cycle demand distribution in EScheduler
[YuN06].
The graph in Figure 12 depicts the cumulative cycle demand of one task's ($T_i$)
elapsed jobs ($J_i$). The cumulative distribution function is based on Equation 6.
\[ F(x) = P[X \le x] \tag{6} \]
Firstly remember that jobs are instances of one task. Now let's consider this
equation. It gives the probability that the CPU cycle demand $X$ of a job of one
particular task is equal to or less than $x$. In Figure 12, $C_{min}$ is the smallest
cycle demand among the task's considered jobs, and $C_{max}$ the largest. The interval
$[C_{min}, C_{max}]$ is divided into $r$ sections. Each section forms an area in the histogram.
The height of a section area indicates the probability that the job needs at most $b_k$
cycles, where $b_k$ is the upper boundary of the section. From this histogram, it is
possible to extract the cycle boundary $b_k$ below which p percent of the jobs of the
task remain.
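Extracting the cycle boundary from recorded job cycle counts can be sketched in Python as follows. The function name cycle_budget and the sample values are hypothetical; the sketch assumes equally wide sections and expresses p as a fraction between 0 and 1.

```python
def cycle_budget(samples, r, p):
    """Smallest section boundary b_k with F(b_k) >= p.

    samples: cycle counts of the task's recorded jobs
    r:       number of equally wide sections between C_min and C_max
    p:       required fraction of jobs, e.g. 0.95
    """
    c_min, c_max = min(samples), max(samples)
    width = (c_max - c_min) / r
    n = len(samples)
    for k in range(1, r + 1):
        # Upper boundary of section k; the last one is exactly C_max.
        b_k = c_min + k * width if k < r else c_max
        f_bk = sum(1 for x in samples if x <= b_k) / n   # empirical F(b_k)
        if f_bk >= p:
            return b_k
    return c_max


history = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]   # made-up cycle counts
print(cycle_budget(history, r=9, p=0.5))
print(cycle_budget(history, r=9, p=0.95))
```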
In soft real-time appliations, it sues to provide a statistial guarantee that p
perent of jobs meet their deadline. Before the task is aepted into the set of
shedulable tasks, a shedulability test needs to be performed. The task is shedu-
lable if the ondition in Equation 7 is fullled.
n∑
i=1
Ci/SK
Pi
≤ 1 (7)
In this equation, Ci is the estimated yle demand below of whih the yle demand
of p perent of job instanes of task i remain; SK is the maximum number of yles
29
the CPU an ahieve at full speed, and Pi is the period of task i. The ondition ≤ 1
originates from the EDF shedulability ondition (Equation 4) [LiL73℄.
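The admission test of Equation 7 is simple enough to be written down directly. In the following Python sketch the function name is_schedulable, the task values and the two maximal CPU speeds are illustrative assumptions.

```python
def is_schedulable(tasks, s_k):
    """Equation 7: sum over all tasks of (C_i / S_K) / P_i must not exceed 1.

    tasks: iterable of (C_i in cycles, P_i in seconds)
    s_k:   cycles per second at full CPU speed
    """
    return sum((c_i / s_k) / p_i for c_i, p_i in tasks) <= 1.0


task_set = [(12e6, 0.040), (1e6, 0.020)]   # made-up cycle budgets and periods
print(is_schedulable(task_set, s_k=400e6))
print(is_schedulable(task_set, s_k=300e6))
```

A new task would be accepted only if the test still holds with the task added to the set.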
Adjusting the CPU speed for a task
After jobs are scheduled, it is up to EScheduler to execute them at optimal CPU
speed to minimize power consumption. Here, its function resembles that of Feedback
DVS-EDF (see Section 4.1.3). EScheduler utilizes DVS aggressively. It starts job
execution at a low CPU speed and increases speed as needed. EScheduler is,
however, a little more complex in its speed scaling technique than was Feedback
DVS-EDF.
ESheduler begins by alulating an aggregate CPU speed requirement for the ur-
rent task set. This speed is alulated with the equation
∑n
i=1
Ci
Pi
where the unit is
yles per seond (or hertz). As an example, onsider a task set of two tasks, where
the rst one is alloated 12 ∗ 106 yles every 40 ms and the other 106 yles every
20 ms. The aggregate CPU speed would then be
12∗106
40
+ 10
6
20
= 350MHz [YuN06℄.
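The arithmetic of this example can be checked with a few lines of Python. The function name aggregate_speed_hz is our own; periods are given in milliseconds to keep the computation exact.

```python
def aggregate_speed_hz(tasks):
    """Sum of C_i / P_i with C_i in cycles and P_i in milliseconds."""
    return sum(cycles * 1000 / period_ms for cycles, period_ms in tasks)


tasks = [(12_000_000, 40), (1_000_000, 20)]
print(aggregate_speed_hz(tasks) / 1e6, "MHz")   # 300 MHz + 50 MHz
```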
The straightforward solution would be to run the tasks at this aggregate speed.
This would, however, waste energy. The estimated cycle demand $C_i$ is the value
below which the cycle demand of p percent of jobs remains. If p is for instance
95, then 95 percent of the jobs require less than $C_i$ cycles. The cycle demand of
individual jobs varies greatly. Jobs are therefore initially run at a low speed, and as
the job cycle count increases, CPU speed is gradually increased according to a speed
schedule that EScheduler calculates for every task.
The speed shedule of a task onsists of oordinates (x, y) in an ordered list. At
x or more spent yles the CPU is aelerated to speed y. An example of a speed
shedule might be: (0, 100MHz), (1 ∗ 106, 120MHz), (2 ∗ 106, 180MHz). Here, the
task would be started at CPU speed 100 MHz, and after 1∗106 yles, the proessor
would be aelerated to 120 MHz. After 2 ∗ 106 yles, if the job would still not be
ompleted, the proessor speed would be inreased to 180 MHz.
With high p values most jobs consume less than $C_i$ CPU cycles. They will hence
complete before ever reaching the highest CPU speed points, and therefore avoid
these most energy consuming phases. Notice that every task in the set has its own
speed schedule. Therefore, processor speed changes occur, besides at scaling points,
also at context switches. The EScheduler algorithm [YuN06] does not explicitly
consider processor state transition delays when calculating a speed schedule.
Calulating the speed shedule The approah taken by ESheduler in alu-
lating a speed shedule is based upon the yle demand histogram (see Figure 12).
Eah area in the histogram, starting with a yle demand of bi, is issued a spei
CPU speed. The speed shedule of any task will onsist of m oordinates, (bi, s(bi)),
where the CPU speed, s(bi), is alulated using Equation 8 [YuN06℄.
s(bi) =
∑m
j=1 gj
3
√
1− F (bj)
T 3
√
1− F (bi)
, i = 1, . . . , m. (8)
where $g_j$ is the size of the j:th cycle group (the width of the area in the histogram),
and $T$ represents the time budget of a task. This variable represents the available
time distributed among tasks according to their cycle demand. It is calculated using
the following formula:
\[ T = \frac{C_i}{\sum_{i=1}^{n} \frac{C_i}{P_i}} \]
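Equation 8 can be evaluated with the following Python sketch. The function name speed_schedule and the input values are our own. We assume that $F(b_i)$ is the cumulative probability at the start of each cycle group and that it is strictly below 1, since otherwise the cube root in the denominator vanishes.

```python
def speed_schedule(boundaries, widths, cdf_at, time_budget):
    """Continuous speed schedule [(b_i, s(b_i))] according to Equation 8.

    boundaries:  starting cycle count b_i of each cycle group
    widths:      group sizes g_i in cycles
    cdf_at:      F(b_i) for each group, each strictly below 1
    time_budget: T in seconds
    """
    roots = [(1.0 - f) ** (1.0 / 3.0) for f in cdf_at]
    if any(root <= 0 for root in roots):
        raise ValueError("F(b_i) must be strictly below 1")
    numerator = sum(g * root for g, root in zip(widths, roots))
    return [(b, numerator / (time_budget * root))
            for b, root in zip(boundaries, roots)]


groups = [1_000_000, 1_000_000, 1_000_000]          # made-up group sizes
schedule = speed_schedule([0, 1_000_000, 2_000_000], groups,
                          [0.0, 0.5, 0.9], time_budget=0.02)
for b, s in schedule:
    print(b, round(s / 1e6, 1), "MHz")

# Running every group at its speed uses exactly the time budget T.
print(sum(g / s for g, (_, s) in zip(groups, schedule)))
```

The speeds increase from group to group because $1 - F(b_i)$ shrinks: later groups are reached by fewer jobs, so the expensive high speeds are rarely needed. A job that does run through all groups consumes exactly the time budget $T$.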
This alulation of optimal proessor speeds is based on the theoretial alternative,
where CPU speed an be adjusted linearly. Real-world proessors provide only
disrete speed alternatives. For instane, the StrongArm SA-110 provide 11 dierent
CPU speed alternatives [YuN06℄. A straightforward approah to deal with this real-
world limitation is to alulate the optimal speed using formula 8, and then round
s(bi) to the nearest upper disrete speed. This is, however, not energy optimal,
sine the provided speed might exeute the job unneessarily fast and waste energy.
On the other hand, rounding s(bi) downwards might jeopardize timely exeution.
Therefore, ESheduler expliitly onsiders all available proessor speeds, and hooses
from among them the most eient ombination for the speed shedule. Here, it
even takes into onsideration the proessor's transition delay from ative to sleep
state.
The problem of choosing the optimal CPU speed schedule is NP hard [YuN06].
EScheduler uses an approximation algorithm for selecting the best speed
combination. It should also be noted that these speed options are processor specific.
Therefore, in order to be efficient, EScheduler needs to be rewritten for each
particular hardware platform it is implemented on.
Implementing ESheduler ESheduler has been implemented into the Linux
2.6.5 kernel with 2605 lines of C ode. In order to implement the yle demand
ounter, the Linux Proess Control Blok is modied aording to Figure 13. The
31
Figure 13: The modied Linux Proess Control Blok [YuN06℄.
long integer job_yles reords the number of yles used by the job; *speed_shedule
is a pointer to the speed shedule list of the task, and urrent_dvsPnt points to
the presently used speed setting.
The Linux sheduler has been revised to (1) update the PCB elds at sheduling
oasions and (2) sale the proessor frequeny using DVS aording to the pro-
ess' speed shedule. A higher resolution timer has been hooked to the standard
Linux sheduler [YuN06℄ to allow invoking of the ESheduler every 500 µs, whih
enables periodi sheduling deisions to be made at a rate suient for soft real-time
appliations.
The ESheduler provides statistial real-time guarantees for multimedia applia-
tions. Tasks are sheduled CPU time aording to their historial CPU demand.
While exeuting tasks, ESheduler saves energy by adjusting the CPU speed a-
ording to a speed shedule it has alulated. Tasks are initially run at slow CPU
speeds, and the speed is aelerated as exeution progresses.
4.2.2 The ReUA Algorithm
This subsetion presents ReUA (Resoure-onstrained energy-eient utility arual
algorithm) [WRJ06℄. It is an ambitious proessor sheduling algorithm that onsid-
ers system-wide energy savings, and replaes deadlines by a onept that provides
higher delity.
The Time Utility Funtion replaes deadlines The lassial onept of dead-
lines an be argued to be artiial. Consider, for instane, a missile ontrol system.
32
In the traditional deadline-based approah, the missile must hit its target no later
than at time D. However, in a real world situation, the hit might be onsidered
to be useful even when missing D by a hair, although a perfet miss is preferred.
This kind of argumentation has lead to the development of a onept of Time Utility
Funtion (TUF), whih replaes deadlines.
Figure 14: Example Time Utility Funtions (TUF) [WRJ06℄.
Some example TUFs can be seen in Figure 14. The utility of finishing a job is
depicted as a function of the completion time. In Figure 14 (a) and (b) non-increasing
TUFs can be seen. Here, the utility of completing the task decreases or stays the
same as time goes by. In (c) a TUF of a missile application is depicted. Here, the
utility increases as the missile approaches its target, and then quickly decreases. A
traditional deadline as a TUF is shown in Figure 14 (d). The utility of the completion
of the task stays the same until the task's deadline, after which the utility drops to
zero. A scheduling algorithm that tries to maximize the sum of TUFs in the system
is called Utility Accrual.
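
The TUF shapes described above can be illustrated with a short Python sketch. The
names step_tuf, linear_tuf and accrued_utility are hypothetical and do not come
from [WRJ06]; the linear shape is only one possible non-increasing TUF, chosen here
for simplicity.

```python
def step_tuf(deadline, max_utility=1.0):
    """Classical deadline as a TUF: full utility up to the deadline, zero after."""
    def utility(t):
        return max_utility if t <= deadline else 0.0
    return utility


def linear_tuf(zero_time, max_utility=1.0):
    """Non-increasing TUF: utility falls linearly to zero at zero_time (assumed shape)."""
    def utility(t):
        if t <= 0:
            return max_utility
        return max(0.0, max_utility * (1.0 - t / zero_time))
    return utility


def accrued_utility(completions):
    """Sum of utilities over (tuf, completion_time) pairs, the quantity a
    utility accrual scheduler tries to maximize."""
    return sum(tuf(t) for tuf, t in completions)
```

With these definitions, a job that completes just after a step deadline contributes
nothing, whereas a job with a linear TUF still contributes part of its utility.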
The TUF of task Ti is denoted by Ui, and the TUF of job Jk is denoted by U_Jk. The
utility when Jk is completed at time t is denoted U_Jk(t). When scheduling tasks,
the aim of ReUA is to maximize the utility while minimizing energy consumption
[WRJ06]. In order to achieve this, ReUA uses a unit called UER (Utility-Energy
Ratio). The system's UER is defined as follows:
\[ \mathrm{UER} = \frac{\sum_{i=1}^{n} U_i}{\sum_{i=1}^{n} E_i} \]
where Ui denotes the TUF of task Ti, and Ei the energy (described hereafter)
consumed by task Ti. Hence, UER is an indicator of system-wide energy efficiency:
utility achieved per energy unit.
System wide energy considerations. Reducing the CPU power requirement
will lead to longer task execution times. If hardware components such as displays,
hard drives or memory chips need to be powered up during this time, reducing CPU
speed might in the worst case even increase the system-wide energy requirement, as
other components need to be powered up longer. When making scheduling decisions,
ReUA considers the system-wide power consumption instead of only the CPU power
consumption. While the CPU power consumption is calculated using the formula
P = C × f × V^2 (Equation 5), the system-wide power consumption is estimated
using Equation 9 [WRJ06]:
\[ P = S_3 \times f^3 + S_2 \times f^2 + S_1 \times f + S_0 \qquad (9) \]
where f is the operating frequency; S3 is the CPU power requirement; S2 is caused
by CMOS power leakage; S1 represents the power requirement caused by components
such as memory chips operating at a fixed voltage independent of frequency, and S0
is a constant representing components such as displays, whose power requirement is
independent of both operation frequency and voltage [WRJ06]. Dividing Equation 9
by f, the number of cycles per unit of time, gives the following equation for the
energy consumed per processor cycle:
\[ E(f) = S_3 \times f^2 + S_2 \times f + S_1 + S_0 / f \qquad (10) \]
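
The effect of the constant term S0 can be made concrete with a small Python sketch
of Equations 9 and 10. The coefficient values and the frequency list used in the
usage note are hypothetical and chosen only for illustration; the function names
are not from [WRJ06].

```python
def system_power(f, s3, s2, s1, s0):
    """System-wide power at frequency f (Equation 9)."""
    return s3 * f**3 + s2 * f**2 + s1 * f + s0


def energy_per_cycle(f, s3, s2, s1, s0):
    """Energy per processor cycle (Equation 10): power divided by frequency."""
    return system_power(f, s3, s2, s1, s0) / f


def best_frequency(frequencies, s3, s2, s1, s0):
    """Pick the available frequency with the lowest energy per cycle."""
    return min(frequencies, key=lambda f: energy_per_cycle(f, s3, s2, s1, s0))
```

For example, with S3 = 1 and S0 = 2 (other coefficients zero) and the candidate
frequencies 0.5, 1.0, 1.5 and 2.0, the minimum lies at 1.0 rather than at the
slowest setting, because the S0/f term grows as the frequency is lowered. With
S0 = 0 the slowest frequency wins.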
Calulating proessor yle demand When alulating the proessor yle de-
mand to be alloated to a task ReUA, like ESheduler (Setion 4.2.1), uses statistial
information. But unlike ESheduler, ReUA does not expliitly present a mehanism
for olleting and proessing statistial information: the CPU yle demand mean
and variane are assumed to be given. To alulate a task Ti's yle demand Ci,
ReUA uses Equation 11 whih provides a statistial performane guarantee:
Ci = E(Yi) +
√
[pi × V ar(Yi)]/(1− pi) (11)
where Yi is the yle demand distribution; E(Yi) is the expeted yle demand, and
V ar(Yi) is the statistial variane of yle demand distribution. The variable pi is
a probability. In ReUA, a pair {vi, pi} is used to indiate that vi of the maximal
utility (TUF) should be ahieved with probability pi.
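
Equation 11 has the form of the one-sided Chebyshev (Cantelli) bound: allocating
the mean plus sqrt(p × Var / (1 − p)) cycles covers the actual demand with
probability at least p, whatever the shape of the distribution. A minimal Python
sketch follows; the function name cycle_allocation and the numbers in the usage
note are illustrative assumptions.

```python
import math


def cycle_allocation(mean_cycles, variance, p):
    """Cycles to allocate so that demand is covered with probability >= p (Equation 11)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return mean_cycles + math.sqrt(p * variance / (1.0 - p))
```

For a mean demand of 100 cycles with variance 16, a guarantee level of p = 0.5
yields an allocation of 104 cycles; as p approaches 1 the allocation grows without
bound, which reflects the cost of stronger guarantees.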
This statistial performane guarantee an be presented as Pr(U(si,j) ≥ vi×U
max
i ) ≥
pi [WRJ06℄, where si,j is the sojourn time of Ji,j . To alulate the upper bound for
Ti's sojourn time, ReUA uses a variableDi and alls it ritial time. To ensure that
vi of the maximal utility is ahieved with probability pi, ReUA needs to guarantee
that
34
Di = U
−1
i (vi × U
max
i ) (12)
where U_i^{-1} is the TUF's inverse function. The values Ci and Di are calculated in ReUA's
offlineComputing() procedure that can be seen in Figure 15. Equation 12 is used
on line 3 to calculate Di. On line 4 Equation 11 is used to calculate the amount of
CPU cycles to be allocated to Ti, and this number is placed in the variable Ci. This
procedure also calculates f^o_Ti, the optimal speed (frequency) at which to execute Ti.
Figure 15: The offlineComputing() procedure of the ReUA algorithm [WRJ06].
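
For a non-increasing TUF, the critical time of Equation 12 is the latest completion
time that still yields the fraction vi of the maximal utility. The following Python
sketch inverts such a TUF numerically by bisection. It assumes that the maximal
utility is attained at t = 0 and that the TUF is non-increasing on [0, t_max];
the function name critical_time and its parameters are hypothetical.

```python
def critical_time(tuf, v, t_max, tol=1e-9):
    """Latest t in [0, t_max] with tuf(t) >= v * tuf(0), for a non-increasing tuf."""
    target = v * tuf(0.0)
    if tuf(t_max) >= target:
        return t_max
    lo, hi = 0.0, t_max  # invariant: tuf(lo) >= target > tuf(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tuf(mid) >= target:
            lo = mid
        else:
            hi = mid
    return lo
```

For a TUF that falls linearly from 1 to 0 over 10 time units, requesting v = 0.8
gives a critical time of about 2; for a step TUF the critical time coincides with
the deadline.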
The ReUA main pseudo code. The algorithm for ReUA can be seen in Figure
16. As input ReUA receives the current task set T = {T1, ..., Tn} and the current
unscheduled job set Jr. From these, ReUA will calculate its output, i.e. the job to
be executed, Jexe, and its execution speed, fexe.
On line 3 the offlineComputing(T) procedure is called, and Ci, Di and the optimal
frequency f^o_Ti of each task are calculated. (On line 4, the current time tcur is placed
in t.) The switch statement on lines 5-8 manages the variable C^r_i which holds the
remaining CPU cycles allocated to the current job. Upon task release (line 6), the
entire allocated cycle amount is placed in this variable; upon task completion (line 7)
the variable is set to zero, and on other scheduling occasions (line 8) C^r_i is updated
to reflect the number of remaining cycles.
In the for loop starting on line 9, a feasibility check is performed on all unscheduled
jobs. The expected calculation time of any job may not exceed its termination time
at highest CPU speed. If a job is not feasible, it is aborted (lines 10-11). Otherwise,
on line 13, ReUA calculates the resource dependencies of the job using the procedure
buildDep().
The for loop on lines 14-15 calculates the UER (Utility-Energy Ratio) for each
unscheduled job. This figure indicates how much utility per unit of energy would be
achieved if this job were to be executed starting at this moment. The calculateUER()
procedure also considers the job dependencies calculated by buildDep(): if Ji is
dependent on the jobs Ji.Dep = {JDep1, ..., JDepn}, then the jobs in Ji.Dep are
included when calculating the UER for Ji.
Figure 16: The ReUA main pseudo code [WRJ06]. Symbols: Jr denotes the current
unscheduled job set; Ci the CPU cycles allocated to Ji; C^r_i the remaining cycles of
the current job.
On line 16, the jobs are sorted in non-increasing order according to their UER. In
the for loop starting on line 17, the jobs which are meaningful to run, i.e. the ones
whose UER is larger than zero (line 18), are inserted into a list σ in order of their
critical times. This is done by the procedure insertByECF() (line 19). Critical
times are moments when, at the latest, the job needs to be finished in order to
guarantee the desired performance level defined by {vi, pi}. The ECF value of a job
Ji is not necessarily the critical time of Ji alone: if another job is dependent on Ji,
the actual ECF of Ji might be earlier than its tentative ECF. The EDF principle
is followed by insertByECF(). In essence, on lines 16-21, the jobs are first sorted
according to their UERs, and then according to their ECFs. The resulting ordered
list is placed in σ.
On line 22, the job at the head of σ is chosen for execution. On line 23 in procedure
decideFreq(), ReUA calculates the optimal execution speed for the job considering
available DVS parameters. On line 24 the algorithm returns the job to be scheduled,
Jexe, and its execution frequency fexe.
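
The selection steps on lines 16-22 can be sketched in Python as follows. This is a
strongly simplified illustration, not the pseudo code of [WRJ06]: resource
dependencies and frequency selection are left out, each job carries a precomputed
utility, energy and execution time, and the feasibility test applied when a job is
inserted into the list is an assumption of this sketch. All names (Job, uer,
feasible, select_job) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    utility: float        # utility expected if the job completes
    energy: float         # energy expected to be spent (must be positive)
    exec_time: float      # remaining execution time at the chosen speed
    critical_time: float  # absolute critical time D_i


def uer(job):
    """Utility-energy ratio of a single job."""
    return job.utility / job.energy


def feasible(schedule, now):
    """Assumed check: run jobs back to back and test every critical time."""
    t = now
    for job in schedule:
        t += job.exec_time
        if t > job.critical_time:
            return False
    return True


def select_job(jobs, now=0.0):
    """Consider jobs in non-increasing UER order, keep the list ordered by
    critical time, and return the job at its head (or None)."""
    schedule = []
    for job in sorted(jobs, key=uer, reverse=True):
        if uer(job) <= 0:
            break
        tentative = sorted(schedule + [job], key=lambda j: j.critical_time)
        if feasible(tentative, now):
            schedule = tentative
    return schedule[0] if schedule else None
```

In this sketch the UER order decides which jobs are admitted to the list, while the
critical time order decides which admitted job runs first.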
4.2.3 Comparing the Presented Algorithms
Presented in this section were EScheduler [YuN06] and ReUA [WRJ06], two recent
algorithms for CPU scheduling in a soft real-time environment. Both algorithms
provide a statistical guarantee that jobs meet the desired level of performance with
probability p. In EScheduler, the process of collecting and analyzing the
accumulated CPU cycle demand statistics is explicit; in ReUA, the mean and variance of
CPU cycle demand are considered to be given.
ESheduler is a traditional energy onserving CPU sheduling algorithm: it only
onsiders the power requirements and savings of the CPU (Equation 5) and ignores
the power properties of the rest of the system. The approah hosen in ReUA
is more realisti, as it estimates system-wide energy savings (Equation 9). How
superior as the latter approah may seem, one should note that, in essene, the
dierene is just whether we hoose to onsider the CMOS power onsumption
equation P = C×f×V 2 or the system-wide equation P = S3×f
3+S2×f
2+S1×f+S0
when estimating task power requirements.
Where ReUA stands out in comparison to EScheduler is in its consideration of
resource dependencies, and its introduction of the TUF concept that has been
argued to provide higher fidelity than deadlines. Neither of the algorithms explicitly
takes into consideration transition delays when making DVS frequency and voltage
adjustment decisions.
5 Power Aware Device Scheduling
The main problem with device scheduling is the same as with processor scheduling.
We have one resource with multiple users, and wish to share the resource between
these multiple users in a purposeful way. In real-time systems especially, deadlines
must be met. The major difference between processor and device scheduling is that
the device scheduler needs to calculate a distinct schedule for each device. Systems
may contain multiple devices, and each task may use several or none of them. The
situation is hence not the same as with processor schedulers, which we considered
in Section 4: the processor schedulers were all aimed at uniprocessor systems, and
every task naturally utilizes this single processor.
Devies onsidered in this setion have at least two power states: a sleep state and
an operating or awake state. In the sleep state, the devie is not able to provide its
servie, like disk or network I/O, but in this state the devie onsumes less energy
than in its operating state. Some devies may have several power states, where eah
state psi+1 onsumes less energy than state psi, but takes a longer time to wake
up from. The transition between states is ontrolled by the operating system. A
transition between states always inludes a ertain penalty in terms of time and
energy ost. A transition takes a ertain amount of time, and requires a ertain
amount of energy. A proper power-aware real-time sheduler needs to onsider
these time and energy osts when making sheduling deisions in order to guarantee
meeting of deadlines.
5.1 Hard Real Time Scheduling
The problem of power-aware real-time device scheduling has in recent research been
tackled in at least two different ways. The aim in, for instance, the EEDS algorithm
[ChG06] is to enhance the system's EDF based task scheduler with an energy aware
device scheduler. One can also entirely separate the device scheduler from the
processor scheduler, as has been done in MUSCLES and LEDES [SwC03]. A completely
different approach is chosen in the EDS [SwC05] algorithm, which due to its CPU
time and memory requirements operates offline. In the following subsections we will
explore each of these algorithms.
Figure 17: The pseudo code of the LEDES scheduler [SwC03]. Notations:
kj a device, τi task, Li the set of devices needed by τi, si start time of task
i, ci execution time of task i, t0,j transition time of device j.
5.1.1 Low Energy Device Scheduler
The basic assumption in Low Energy Device Scheduler (LEDES) [SwC03], Figure
17, is that the transition time, the time needed for the device to switch from sleep
state to the powered-up state (or vice versa), is shorter than the execution time of
any task instance. If we accept this assumption, then it suffices to schedule only one
task into the future at a time. This is enough to guarantee that no deadlines will be
missed. In other words, if the current task instance is τi we need only consider the
device schedule up to and including τi+1. This will be enough for us to wake up all
devices so that they will be ready for use when needed. This assumption implies that
no matter how many tasks there are in the system, LEDES need only consider
two, the current and the next one, in its device schedule calculations. This is why
the workload LEDES adds to the system is acceptable. LEDES supports, however,
devices with only two different states: sleep and powered-up.
As input parameters LEDES (Figure 17) receives a pointer to a device kj and the
scheduling information of the current and next tasks, τi and τi+1. LEDES is activated at
either the start (line 1) or the end (line 16) of a task. If the device kj is switched on
(line 2) while not being needed by the current or the next task (line 3), the device is
switched off (line 4). If kj is needed by the next task (line 6), but will make it back
online if we power it down for the remainder of the execution of the current task
and power it up when finishing the current task (line 7), we power kj down (line 8).
On line 7, LEDES also considers the device state transition time t0,j. If kj is needed by
the next job, but kj would not make it back online on time if we initiated its
wakeup as late as at the end of the current task (line 12), then kj is immediately
woken up (line 13). These cases cover all the situations we need to take
into account at the beginning of a task.
The other sheduling instane of LEDES is at the end of tasks (line 16). If the
devie kj is powered up while not being needed by the next task (line 18) it an be
powered o (line 19). In addition, we must on line 18 hek that the powering down
of the devie will be nished by the start time of the next task, as the devie an
be needed at that oasion. (In LEDES, the powering up and powering down state
transition times are assumed to equal eah other, and both are notated by t0,j.) In
other ases, the devie is powered up (line 22).
We believe that one if statement is missing from the LEDES pseudo code. On line
22, before waking up kj, we would want to check that kj actually is needed by τi+1.
It is, of course, unnecessary to wake up the device if it isn't needed by the next task.
Beause LEDES makes sheduling deisions only in the beginning and at the end of
tasks, its implementation into the operating system's proessor sheduler should be
pretty straightforward: we just all the LEDES proedure at the end and beginning
of tasks. The omputational omplexity of LEDES is O(n), where n is the size of
the set of devies attahed to the system.
With LEDES, implemented into a Rate Monotonic based scheduler, device energy
savings of up to 40 percent have been reported [SwC03]. As the algorithm shows,
LEDES supports only two distinct power states.
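
The decisions described for LEDES can be sketched in Python for a single device.
This is a reading of the prose description only, not a transcription of the pseudo
code in Figure 17: the function and parameter names are hypothetical, the gap test
is one possible form of the timing check on line 7, and the end-of-task function
includes the additional "needed by the next task" check suggested above.

```python
def ledes_at_task_start(device_on, needed_now, needed_next,
                        end_current, start_next, transition):
    """Decision at the start of the current task: 'sleep', 'wake' or 'keep'."""
    gap = start_next - end_current  # time available for a wakeup after the current task
    if device_on:
        if not needed_now and not needed_next:
            return "sleep"
        if not needed_now and needed_next and gap >= transition:
            return "sleep"  # a wakeup started at the end of the current task is in time
    elif needed_next and gap < transition:
        return "wake"       # waiting until the end of the current task would be too late
    return "keep"


def ledes_at_task_end(device_on, needed_next, now, start_next, transition):
    """Decision at the end of the current task: 'sleep', 'wake' or 'keep'."""
    if device_on and not needed_next and now + transition <= start_next:
        return "sleep"
    if not device_on and needed_next:
        return "wake"
    return "keep"
```

Since powering down is assumed to take no longer than the execution time of any
task, the sketch does not separately check that a power-down started at the
beginning of a task completes before that task ends.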
Figure 18: The MUSCLES scheduler [SwC03]. Notations: S the task
schedule; PS set of power states; ki device; sm start time of task m; cm
execution time of task m; ps_{i,j} power state j of device i; ps_{i,0} the powered
up state.
5.1.2 The Multi-State Constrained Low-Energy Scheduler
Several contemporary devices and peripherals, like flash memories, hard drives and
network adapters, support multiple power states for energy conservation. For these
purposes, the authors of LEDES have presented an algorithm called MUSCLES
(multi-state constrained low-energy scheduler) [SwC03]. In MUSCLES, devices are
moved between states one step at a time. Let ki be a device, and ps_{i,j} an arbitrary
power state of this device. From this state, it is possible to switch to state ps_{i,j+1}
or ps_{i,j-1} in one step. In MUSCLES, the state ps_{i,0} is the operating state of the
device; the other states are power saving states, where the device doesn't provide
operational functionality. State ps_{i,j+1} requires less power than state ps_{i,j}, but takes
longer to wake up from.
If we accept these assumptions, we can no longer build upon the idea of LEDES,
where the wakeup transition time never exceeds the execution time of the task.
MUSCLES still relies on the assumption that a transition from state ps_{i,j} to ps_{i,j+1}
or ps_{i,j-1} never exceeds the execution time ci of any task. However, if we are in
state ps_{i,j}, the wakeup, i.e. the transition to state ps_{i,0}, may take up to j × ci
time units. When j ≥ 2, the wakeup time may exceed the bound we built upon
in LEDES. Therefore, in order to reliably schedule devices in MUSCLES, we need
to calculate the schedule further into the future. Whereas the time requirement of
LEDES is O(n), where n is the number of devices in the system, the time requirement
of MUSCLES is O(np), where p is the size of the task set [SwC03].
Let us now study the pseudo code of MUSCLES, presented in Figure 18. First it
is worth noticing that MUSCLES is activated at either the start time of a job,
indicated by sm in the pseudo code, or at the end of the job, indicated by sm + cm,
where cm is the job's execution time. As input parameters, MUSCLES receives S,
the task schedule of the system; P, a list of devices each task uses, and a device
pointer ki. The job of MUSCLES is to calculate whether to switch ki to a less
power-consuming state, to switch the device closer to the wakeup state, or to leave
the device in its current state.
On line 1, we find the first task τL that will need device ki, and on line 2 we calculate
the number of scheduling instances before τL and denote this figure by X. Let
the current power saving state be ps_{i,j}. If X ≥ j + 1, ki may safely be switched
to a lower power state (line 3), and there will still remain a sufficient number of
scheduling occasions to put ki back online on time. If there are as many scheduling
occasions as there are power states between the current one and the operating state,
i.e. X = j, then ki is switched one state towards the wakeup state, i.e. from ps_{i,j} to
ps_{i,j-1} (line 4). This will guarantee that the device will be woken up in time when
it is needed.
The other scheduling instance is at the end of the job, at time sm + cm. Here, we
proceed in the same way as at the beginning of the job. It is resolved which task
first needs device ki (line 5). Then we determine how many scheduling occasions there
are before the start of this task (line 6). If there are more scheduling occasions than
there are power states between the current state and the wakeup state, the device is
put into a lower power state (line 7). Otherwise, if the number of states equals the
number of scheduling occasions, the device is switched one state towards the wakeup
state (lines 8 and 9). On other occasions, the device is left in its current state.
5.1.3 The Energy-Efficient Device Scheduling Algorithm
A state transition, as such, always requires a certain amount of energy and time.
Therefore very short transitions into the sleep state and back do not actually add
up to net energy savings. We will now discuss an algorithm called Energy-efficient
Device Scheduling or EEDS [ChG06]. The pseudo code for the algorithm can be
seen in Figure 19. The algorithm supports devices with two power states, sleep and
active. EEDS calculates the breakeven time for each device. This is the minimum
length of the time period for which it is worthwhile to put the device in sleep mode.
For shorter periods than this, the state transition costs will exceed the energy saved
by sleeping. On line 2 in the pseudo code, EEDS calculates the breakeven time BE
of each device. The length of the
device's breakeven time depends on the properties of the device: how long a time
the transition from active to sleep state (and vice versa) takes, how much energy
the transition(s) require, and how much energy the device spends in active vs. sleep
state.
Figure 19: The EEDS scheduler [ChG06]. Notations: λk indicates device
k; BE is the breakeven time; Jrun the job currently being executed;
Dev(Jrun) the set of devices Jrun needs; DS(λk, t) the device slack time of
λk at time t; Up(λk) the wakeup time of λk; twu(λk) the transition delay
time of λk.
EEDS utilizes a queue data structure where the active jobs are ordered
according to the EDF principle, with the job with the closest deadline at the head of
the queue. This job is scheduled (line 6). We call device slack time the length of the
time period until device λk is needed next time. On line 8 we resolve whether there
is a woken up device λk whose device slack DS (the length of the time period when
the device is not needed) is greater than its breakeven time BE. Such devices may
be put to sleep, which is done on line 9. In order to wake up these devices so that
they will be ready when needed next time, EEDS sets a timer on line 11.
The timer value is set to the current time t plus the device's slack time
DS(λk, t) minus the wakeup time twu(λk). Due to the dynamic properties
of jobs, the device slack time may increase even during the sleep time. Therefore
the timer of the device may be updated on lines 14 and 15. On line 18 we check
whether the timer of a device has expired, and if so, wake up the device on line 19.
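
The breakeven test and the timer value can be sketched in Python. The breakeven
formula below is an assumption of this sketch rather than the formula of [ChG06]:
it is derived from the energy balance "staying active for a period T costs as much
as a sleep and wakeup transition plus sleeping for the remainder of T", and is
bounded from below by the total transition time. The parameter names are
hypothetical.

```python
def breakeven_time(p_active, p_sleep, e_transition, t_transition):
    """Shortest idle period for which sleeping does not cost extra energy.

    e_transition and t_transition are the total energy and time of going to
    sleep and waking up again (assumed model)."""
    if p_active <= p_sleep:
        raise ValueError("sleep state must consume less power than active state")
    balance = (e_transition - p_sleep * t_transition) / (p_active - p_sleep)
    return max(t_transition, balance)


def should_sleep(device_slack, breakeven):
    """A device is put to sleep only if its slack exceeds its breakeven time."""
    return device_slack > breakeven


def wakeup_timer(now, device_slack, wakeup_delay):
    """Timer value: current time plus slack minus the wakeup delay."""
    return now + device_slack - wakeup_delay
```

With an active power of 2, a sleep power of 0.5, a transition energy of 6 and a
transition time of 2, the breakeven time is about 3.33 time units; a device with a
slack of 5 would be put to sleep, and at time 10 with a wakeup delay of 1 its timer
would be set to 14.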
5.1.4 The Energy-Optimal Device Scheduler
The shedulers desribed earlier are all online shedulers. Swaminathan and Chak-
rabarty [SwC05℄ have in 2005 published a real-time devie sheduler aimed at oine
use. It diers from all previous algorithms desribed in this thesis also in the sense
that it ompletely rejets both EDF and RM and implements a sheduling meh-
anism of its own. This algorithm is alled Energy-optimal devie sheduler (EDS).
In order to nd an energy optimal devie shedule this algorithm builds a deision
tree using an iterative algorithm. To limit memory spae requirements, EDS prunes
branhes from the tree when possible.
Table 4: The EDS example job set [SwC05], where ai indicates the arrival
time; ci the execution time, and di the deadline of a job. The odd-numbered
jobs belong to task τ1 and use device k1, and the even-numbered
jobs belong to task τ2 and use device k2.
Let us start our study of the EDS algorithm by considering an example. In Table
4 we have a set of jobs from two tasks, τ1 (the odd-numbered jobs) and τ2 (the
even-numbered jobs). τ1 uses the device k1 and τ2 the device k2. The mission
of EDS is to find such start times for all of these jobs that device energy use is
minimized while deadlines are met. EDS solves this problem by building a schedule
tree. The beginning of the schedule tree built using the task set of Table 4 can be
seen in Figure 20.
Figure 20: The EDS scheduling tree after jobs j1 and j2 have been
scheduled [SwC05]. Syntax: (ji, time, Ei), where ji is the job number, time the
start time of ji and Ei the device energy consumption up to time.
The shedule tree onsists of verties, where eah vertex is represented as a 3-tuple
(ji, time, Ei). In this tuple ji indiates the job number (from Table 4), time is a valid
start time for ji aording to this shedule, and Ei indiates the amount of energy
spent by the devie i aording to this shedule up to time. Verties (x1, x2, x3) and
(y1, y2, y3) are onneted by an edge if y1 an be sheduled at y2 when x1 has been
sheduled at x2 [SwC05℄.
Calulating the energy onsumption Assume that eah devie has two states,
a low power sleep state psl,i and a high power working state psh,i. Let t0,i be the tran-
sition time between these states, and P0,i be the transition power requirement. Let
Ps,i and Pw,i indiate the power spent when in sleep and working states, respetively.
The energy requirement is alulated using the formula
Ei = Pw,itw,i + Ps,its,i +mP0,it0,i (13)
where m is the amount of state transitions; ts,i is the time spent in sleep state, and
tw,i is the time spent in working state [SwC05℄.
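
Equation 13 translates directly into a few lines of Python. The function name
device_energy and the numeric values in the usage note are hypothetical and serve
only to illustrate the three terms.

```python
def device_energy(p_work, t_work, p_sleep, t_sleep, transitions, p_trans, t_trans):
    """Device energy according to Equation 13: working energy, sleeping energy
    and the energy of all state transitions."""
    return p_work * t_work + p_sleep * t_sleep + transitions * p_trans * t_trans
```

For instance, a device that works for 3 time units at power 2, sleeps for 4 time
units at power 0.5 and performs 2 transitions of 1 time unit each at power 1.5
consumes 6 + 2 + 3 = 11 energy units.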
Building the shedule tree The building of the shedule tree is started with a
dummy vertex (0, 0, 0). Aording to Table 4, jobs j1 and j2 have been released at
time 0, and will hene be added to the tree. Let's begin with j1. The ompletion
45
(exeution) time of j1 is 1 and its deadline is 3 (Table 4). Therefore, j1 may be
sheduled at time 0, 1 and 2. We therefore add three verties, (1, 0, e1), (1, 1, e2)
and (1, 2, e3) to the tree, and onnet these with an edge to the root vertex. The
energy onsumption value ei for eah vertex is alulated using Equation 13, and
the orret values for e1, e2 and e3 are 0, 8 and 10, respetively (we will here exlude
the details of energy onsumption alulation). We add these verties to the tree, as
an be seen in Figure 20. In a similar fashion, we add to the tree the verties of j2
onneting it to the root vertex, beause even j2 was released at time 0. Aording
to Table 4, the ompletion time of j2 is 2 and its deadline is 4. Therefore, it an
be sheduled at times 0, 1 and 2. The orresponding values for ei (alulated using
Equation 13) are 0, 8 and 10, respetively. Hene, we add the verties (2, 0, 0),
(2, 1, 8) and (2, 2, 10) to the tree, as an be seen in Figure 20.
Pruning the shedule tree EDS performs both temporal and energy pruning.
This way it will redue the size of the shedule tree in order to ease memory spae
and proessor time requirements. Continuing with our example, as the next step,
EDS performs temporal pruning. Consider the vertex (1, 2, 10) in Figure 20. If j1 is
sheduled at time 2, it will nish at time 3, beause its ompletion time is 1 (Table
4). However, nishing j1 at time 3 would mean that the exeution of j2 would start
no earlier than at 3, and beause the ompletion time of j2 is 2, j2 would miss its
deadline at 3. Therefore, this shedule is unfeasible, and the branh of the tree
starting with node (1, 2, 10) an be pruned. This is indiated by the ross in Figure
20. By similar reasoning, we will also be able to prune the branhes starting with
verties (2, 1, 8) and (2, 2, 10). Let us rst onsider (2, 1, 8). If the rst sheduled
job is j2 at 1, it will nish at 3 but then j1 would ertainly miss its deadline at 3,
and hene this shedule is unfeasible, and this branh an be pruned. Similarily,
onsidering vertex (2, 2, 10), if j2 at 2 is the rst sheduled, it will nish at 4, but
then j1 would have missed its deadline at 3, so also this branh an be pruned.
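
The generation of candidate start times and the temporal pruning of the example
can be reproduced with a short Python sketch. Only the parameters of j1 and j2
quoted in the text are used. The pruning test checks each pending job individually
against the finish time of the scheduled job, which is sufficient for this example;
the test used by EDS in [SwC05] may be stricter. The names Job, candidate_starts and
survives_temporal_pruning are hypothetical.

```python
from typing import NamedTuple


class Job(NamedTuple):
    name: str
    arrival: int
    exec_time: int
    deadline: int


def candidate_starts(job, earliest=0):
    """Integer start times at which the job alone would meet its deadline."""
    first = max(job.arrival, earliest)
    return list(range(first, job.deadline - job.exec_time + 1))


def survives_temporal_pruning(job, start, pending):
    """False if starting `job` at `start` makes some pending job miss its deadline."""
    finish = start + job.exec_time
    return all(max(finish, other.arrival) + other.exec_time <= other.deadline
               for other in pending)
```

For j1 = (arrival 0, execution time 1, deadline 3) and j2 = (0, 2, 4), both jobs
have the candidate start times 0, 1 and 2. After pruning, j1 keeps the start times
0 and 1, and j2 keeps only the start time 0, which matches the three pruned
branches of Figure 20.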
The seond form of pruning utilized by EDS is energy pruning. In Figure 21, whih
displays the entire sheduling tree, onsider the verties (2, 2, 14) and (2, 2, 16) lo-
ated two edges away from the root vertex. These verties indiate two shedules of
the same job, 2, at exatly the same point in time, also 2. Also, in both branhes,
exatly the same job have been previously sheduled. However, the latter of the
shedules onsume 16 units of energy in omparison to 14 of the rst one. Beause
our aim is to minimize energy onsumption we may here utilize energy pruning,
and disard the rest of the branh with the higher energy onsumption. Energy
46
Figure 21: The omplete EDS sheduling tree [SwC05℄. The least energy
onsuming shedule of the 7 jobs has been found.
pruning an always be made when two jobs are sheduled at the same time, and the
order of the previously sheduled jobs among both branhes are idential [SwC05℄.
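
Energy pruning can be sketched in Python as keeping, among vertices that agree on
the job, the start time and the order of previously scheduled jobs, only the one
with the lowest energy. The Vertex type and its history field are hypothetical
representations chosen for this sketch; [SwC05] describes vertices as 3-tuples.

```python
from typing import NamedTuple, Tuple


class Vertex(NamedTuple):
    job: int
    time: int
    energy: int
    history: Tuple[int, ...]  # previously scheduled jobs, in order (assumed field)


def energy_prune(vertices):
    """Keep the lowest-energy vertex for each (job, time, history) combination."""
    best = {}
    for v in vertices:
        key = (v.job, v.time, v.history)
        if key not in best or v.energy < best[key].energy:
            best[key] = v
    return list(best.values())
```

Applied to the two vertices (2, 2, 14) and (2, 2, 16) of the example, both reached
after scheduling job 1, the sketch keeps only the vertex with energy 14.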
One we have nished the nal sheduling tree, i.e. inluded all the leaf verties,
we hoose from among the leaf verties the node onsuming the least energy (68)
by eliminating higher-energy verties. The path from the dummy vertex (0, 0, 0) to
this lowest-energy leaf vertex (6, 10, 68) indiates an energy-optimal shedule of the
job set of Table 4.
The EDS pseudo code. The pseudo code of the iterative EDS algorithm can be
seen in Figure 22. As initialization, on line 2, the dummy vertex (0, 0, 0) is put into
the openList. In the for loop starting on line 3 all vertices in the openList are
processed. On line 5, a set τ′ is generated out of the jobs that have been released up
to the time stamp of the current vertex. Out of these jobs we generate new vertices,
and prune those that would be unfeasible. On lines 15-22 we compare all pairs
of vertices on the current height of the tree, and if two with identical scheduling
occasions are found, we prune the one with the higher energy requirement. The
EDS algorithm is finished on lines 25-27 when all jobs have been scheduled, i.e.
when the height of the tree equals the number of jobs.
Figure 22: The EDS pseudo code [SwC05].
Despite the pruning technique, the memory and computation time requirements of
EDS may be excessive [SwC05]. EDS is aimed at offline use, meaning that the
schedule is computed before run-time. Also, the schedule calculated by EDS is
non pre-emptive. Jobs are executed from start to finish without context switches.
Therefore, jobs may have to wait for long times while large jobs are being processed.
5.1.5 Comparing the Presented Algorithms
We have now presented four algorithms for power-aware device scheduling. Out of
these schedulers, LEDES and MUSCLES are add-ons to the system's task scheduler.
They also have shortcomings. For instance, the basic assumption in LEDES is that
the transition time of a device may never exceed the execution time of any job. As
a process in a real-time system may consist of just a few lines of machine code that,
for instance, reads a sensor measurement, while a hard drive may
take several seconds to wake up from sleep state, we cannot always build upon this
assumption.
The bigger brother to LEDES is MUSCLES, which supports several sleep states.
However, it does not support several operational states. Recall from our discussion
of processors that many contemporary CPUs provide several operational states,
where lesser throughput is provided for less energy cost. MUSCLES does not support
any similar functionality on devices.
Neither LEDES nor MUSCLES calculates the net gain of state transitions. This is,
however, done by EEDS which, essentially, is an enhanced EDF scheduler. Devices
that are currently not needed, and which in spite of transition costs are beneficial to
put to sleep, are put to sleep and woken up with a timer.
Our final algorithm, EDS, calculates an energy-optimal schedule using a decision
tree. Due to its complexity, this algorithm is intended for offline use. The authors
of EDS have also published a heuristic algorithm, Maximum Device Overlap (MDO),
which seeks an approximate solution to the same problem and operates in polynomial
time [SwC05].
6 System-Level Power Aware Scheduling
By using Dynamic Voltage Scaling, the processor's operating frequency and voltage
may be regulated during run-time. Since the processor's energy consumption
depends cubically on frequency and voltage, impressive CPU energy reductions may be
achieved using this technique. There is, however, a downside complicating the
matter. Besides the processor, computer systems consist of other components, such as
memory and cache memory chips, graphic adapters, network cards, bus controllers,
graphics processors, modems, wireless network adapters, and so forth. Performing
a calculation takes a longer time when the processor speed has been lowered. When
considering the CPU energy consumption in isolation, a frequency and voltage
reduction using DVS indeed results in energy savings. However, as the processing
time increases, all the other components need to stay longer in the standby state.
Components such as memory chips generally require a fixed power supply regardless
of the DVS setting of the CPU. Hence, when system-level energy reductions are the
aim, considering the CPU power requirement in isolation is not sufficient. Most
early DVS based CPU scheduling algorithms have chosen to overlook this fact in
their basic assumptions [FEL04]. This is also the case with the algorithms described
in Section 4.
Figure 23: The effect of the processor scaling factor s on system-level
energy consumption [ZhC05].
Consider Figure 23. The X axis indicates the scaling factor $s$ of a StrongARM
SA 1100 processor, defined as $s = \frac{max\_frequency}{current\_frequency}$,
and the Y axis indicates the power consumption of a task, in watts. In the graph,
the crossed line $Q_{proc}(s)$ depicts the power consumption of the SA 1100
processor alone. The possible $s$ values are the discrete scaling factors provided
by this processor. The energy-optimal $s$ value $\Theta = 2.8$ is marked in the
graph. Next, consider the dotted line $Q_{dev}(s)$. This monotonically rising line
indicates the power consumption of the device set needed by the task, excluding
the processor. As the scaling factor $s$ increases, the CPU speed decreases and
processing times increase, so the aggregate power consumption of the device set
increases. The line with circles shows the combined processor and device power
requirement when the static device power requirement is taken to be 0.2 W, which
is often the case with, for instance, Synchronous Dynamic Random Access Memory
(SDRAM) chips [ZhC05]. The optimal scaling factor, when considering both the
power consumption of the processor and the device set, is 1.39, and this value is
marked in the graph with $\theta[k]$. The line with squares shows the combined
requirement when the static power requirement of the device set is taken to be
0.4 W. This is the case with many flash drives. With this power requirement, the
energy-optimal scaling factor is $\theta[k] = 1.07$. Comparing this value to the
CPU energy-optimal value of 2.8 and the 0.2 W optimal value of 1.39 clearly
illustrates how the net gain of aggressive DVS values decreases as
processor-independent energy consumption increases. It has in fact been shown
[ZhC05] that when device energy consumption is large compared to CPU energy
consumption, DVS implementations can spend more energy than non-DVS approaches.
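The shift of the optimum seen in Figure 23 can be reproduced with a small numeric
sketch. The power model below (a cubic dynamic CPU term plus a small static CPU
term), its constants and the list of scaling factors are illustrative assumptions,
not the SA 1100 measurements of [ZhC05]; only the qualitative trend is meant to
carry over.

```python
# Hypothetical power model: NOT the SA 1100 data from [ZhC05].
P_DYNAMIC = 1.0   # assumed CPU dynamic power at full speed (W)
P_STATIC = 0.05   # assumed CPU static power, independent of speed (W)

# Assumed discrete scaling factors s = max_frequency / current_frequency.
SCALING_FACTORS = [1.0 + 0.25 * k for k in range(11)]  # 1.0, 1.25, ..., 3.5


def cpu_power(s: float) -> float:
    """CPU power at scaling factor s; the dynamic part falls cubically with s."""
    return P_DYNAMIC / s**3 + P_STATIC


def task_energy(s: float, device_power: float, full_speed_time: float = 1.0) -> float:
    """Energy of one task: it runs s times longer while the devices stay powered."""
    return full_speed_time * s * (cpu_power(s) + device_power)


def best_scaling_factor(device_power: float) -> float:
    """Evaluate every available factor and return the one with the lowest energy."""
    return min(SCALING_FACTORS, key=lambda s: task_energy(s, device_power))


if __name__ == "__main__":
    for device_power in (0.0, 0.2, 0.4):
        print(device_power, best_scaling_factor(device_power))
```

Under these assumed numbers the selected factor drops from 3.5 to 2.0 to 1.75 as
the static device power grows from 0 W to 0.2 W to 0.4 W, which mirrors the trend
in the figure without reproducing its values.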
As the processor takes a longer time to perform calculations, the standby energy
requirement of the device set rises. An energy-efficient scheduling algorithm,
therefore, needs to consider system-wide energy consumption when calculating an
optimal scaling factor for the processor. In the next subsections, two recent
algorithms will be explored.
6.1 duSYS: A System-Level EDF Algorithm
Zhuo and Chakrabarti [ZhC05] have published an EDF based system-level power-
aware real-time scheduling algorithm called duSYS. Its high-level pseudo code is
given in Figure 24. What makes this algorithm different from the processor
scheduling algorithms explored in Section 4 is the calculation of the
energy-optimal DVS scaling factor. The idea behind duSYS is that the system-level
energy consumption can be written as a function of the processor's scaling
factor $s$.
Let $P_{proc}$ be the processor operating power consumption, and $P_d^{[i]}$ be
the standby power consumption of the device set needed by task $i$. Now, the
energy consumption of task $i$ can be written as $Q(s) = Q_{proc}(s) + Q_{dev}(s)$.
Here, $Q_{proc}(s) = s \times P_{proc}$ and $Q_{dev}(s) = s \times P_d^{[i]}$
[ZhC05]. Because processors typically only have a handful of available speed
scaling modes (values for $s$), for instance the SA 1100 has 11, it is possible
to evaluate each of them numerically for every task [ZhC05] and choose the one
that yields the lowest aggregate power consumption. This optimal value is denoted
by $\theta_i$ in duSYS. The mission of duSYS is to find an optimal scaling factor
$s_{act}$ for the scheduled active job $J_{act}$. The duSYS algorithm calculates
the scaling factor using Equation 14 [ZhC05]:
$$s_{act} = \min\left(\frac{D_{act} - t}{E_{act}},\ \theta_{act},\ du(t)\right) \qquad (14)$$
where $D_{act}$ is the active job's absolute deadline, $t$ is the current time,
$E_{act}$ is the job's worst-case execution time (the execution time that has been
budgeted to the task), and $\theta_{act}$ is the optimal voltage scaling factor
for the task based on the task's static execution parameters. In duSYS,
$\theta_{act}$ is computed offline. Due to the dynamic nature of jobs, real
execution times vary greatly, and are generally shorter than the budgeted static
ones. In order to utilize emerging slack times for energy savings, duSYS also
calculates and considers the dynamic utilization, $du(t)$, when selecting the
appropriate scaling factor. The value $du(t)$ is calculated using Equation 15
[ZhC05].
$$du(t) = \frac{H - t - U^{-1} \times (W - E_{act})}{E_{act}}, \quad (0 \le t \le H) \qquad (15)$$
where $H$ is the hyper period, i.e., the least common multiple (LCM) of the
periods of the scheduled tasks, $W$ is the estimated remaining workload and $U$ is
the utilization degree of the system. Using the value $du(t)$ for processor
frequency scaling, all slack available at time $t$ may safely be granted to the
active job, while timely execution of the rest of the jobs is also guaranteed.
The term $\frac{D_{act} - t}{E_{act}}$ in Equation 14 ensures that deadlines are
not violated [ZhC05].
To summarize, when selecting the optimal scaling factor $s_{act}$ for the active
job, duSYS chooses the smallest of three different candidates according to
Equation 14. Out of these three candidates, $\theta_{act}$ is calculated offline
and is based on static information (period $P_i$, worst-case execution time $E_i$)
about the task, whereas the purpose of $du(t)$ is to utilize slack emerging when
jobs execute faster than their budgeted worst-case execution times.
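Equations 14 and 15 translate almost directly into code. The following is a
minimal sketch, not the implementation of [ZhC05]: the function and parameter
names are invented for illustration, and the numbers in the usage example are
assumed values. Mapping the result onto one of the processor's discrete scaling
factors is left out.

```python
def dynamic_utilization(t, hyperperiod, utilization, workload, e_act):
    """Equation 15: du(t) = (H - t - (W - E_act) / U) / E_act for 0 <= t <= H."""
    if not 0 <= t <= hyperperiod:
        raise ValueError("t must lie within the hyperperiod")
    return (hyperperiod - t - (workload - e_act) / utilization) / e_act


def select_scaling_factor(deadline, t, e_act, theta_act, du_t):
    """Equation 14: the smallest of the deadline bound, theta_act and du(t)."""
    deadline_bound = (deadline - t) / e_act
    return min(deadline_bound, theta_act, du_t)


if __name__ == "__main__":
    H, U = 20.0, 0.5   # assumed hyperperiod and utilization degree
    W = H * U          # initial workload estimate, as on line 1 of Figure 24
    du0 = dynamic_utilization(t=0.0, hyperperiod=H, utilization=U,
                              workload=W, e_act=2.0)
    # With theta_act = 1.39 the offline optimum is the binding candidate.
    print(select_scaling_factor(deadline=5.0, t=0.0, e_act=2.0,
                                theta_act=1.39, du_t=du0))
```

In this example the deadline bound is 2.5 and $du(0)$ is 2.0, so a task with
$\theta_{act} = 1.39$ runs at its offline optimum, whereas a task with
$\theta_{act} = 2.8$ would be limited to 2.0 by the dynamic utilization.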
1  W = hyperperiod × U
2  while time() < hyperperiod do
3      determine s_act and execute J_act using s_act;
4      if J_act is not finished then
5          ExecutedPart = current_duration / s_act;
6          W = W - ExecutedPart;
7          E_act = E_act - ExecutedPart;
8          ActualExecutionTime_act = ActualExecutionTime_act - ExecutedPart;
9      else
10         W = W - E_act;
11     end if
12 end while
Figure 24: The high-level pseudo code of the duSYS algorithm [ZhC05].
W denotes the estimated remaining workload, E_act the budgeted execution
time, and U the system utilization degree.
The pseudo code of duSYS can be seen in Figure 24. Released jobs are considered
to be sorted in a queue with the job with the highest EDF priority at the head
of the queue. On line 1, the estimated workload of the system is calculated. On
line 3, the highest priority job is scheduled using the scaling factor $s_{act}$,
which has been calculated using Equation 14. During the execution of $J_{act}$,
dynamic runtime information is maintained on lines 5–8. This information is used
when calculating $du(t)$, which seeks to utilize slack times for power savings.
When choosing the optimal scaling factor, duSYS considers the combined processor
and device power consumption in order to minimize system-wide power requirements.
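The runtime bookkeeping on lines 4–10 of Figure 24 can be sketched as a single
update step. The class and function names below are hypothetical, the update of
ActualExecutionTime on line 8 is omitted for brevity, and the remark on slack is
an interpretation of how the update interacts with Equation 15.

```python
from dataclasses import dataclass


@dataclass
class DuSysState:
    workload: float  # W: estimated remaining workload in the hyperperiod
    e_act: float     # E_act: remaining budgeted execution time of the active job


def account_execution(state: DuSysState, ran_for: float, s_act: float,
                      finished: bool) -> DuSysState:
    """Update W and E_act after the active job ran `ran_for` time units at s_act."""
    if finished:
        # Line 10: the whole remaining budget leaves the workload estimate, so
        # an unused part of the budget can surface as slack in later du(t) values.
        return DuSysState(state.workload - state.e_act, 0.0)
    # Lines 5-7: convert time spent at reduced speed into full-speed time.
    executed_part = ran_for / s_act
    return DuSysState(state.workload - executed_part, state.e_act - executed_part)


if __name__ == "__main__":
    state = DuSysState(workload=10.0, e_act=2.0)
    # Preempted after 3 time units at s_act = 2: 1.5 units of budget consumed.
    state = account_execution(state, ran_for=3.0, s_act=2.0, finished=False)
    print(state)
    # The job later completes; its remaining 0.5 units of budget are released.
    state = account_execution(state, ran_for=1.0, s_act=2.0, finished=True)
    print(state)
```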
6.2 The Critical Speed DVS Algorithm
Next we will consider an earlier EDF based power-aware system-wide real-time
scheduling algorithm [JeG04]. We call this algorithm Critical Speed DVS (CS-DVS).
Like duSYS, CS-DVS considers both CPU and device energy consumption when
calculating an energy-optimal DVS setting. In CS-DVS, the energy consumption $E_i$
of a task $\tau_i$ is given by Equation 16 [JeG04]:
$$E_i(\eta) = \frac{C_i}{\eta} P(CPU, \eta) + \sum_{j=1}^{n} \frac{C_i^{R_j}}{\eta} P(R_j) \qquad (16)$$
where $\eta \in [0, 1]$ represents the processor slowdown factor [JeG04]. This
value indicates the fraction of the maximum CPU speed at which the processor is
being run ($\eta = 1$ meaning the maximum speed), and plays the same role as the
scaling factor $s$ used in duSYS, of which it is the reciprocal ($\eta = 1/s$).
In Equation 16, $C_i$ indicates the number of processor cycles budgeted to the
task $\tau_i$, and $C_i^{R_j}$ the number of cycles that device $R_j$ spends in
the standby state during the execution of the task $\tau_i$. The notation
$P(CPU, \eta)$ represents the power consumption of the CPU at slowdown factor
$\eta$, and $P(R_j)$ indicates the power consumption of the device $R_j$. In
essence, the first term in Equation 16 represents the CPU energy usage at slowdown
factor $\eta$, and the second term represents the sum of the standby energies
consumed by the set of devices $R_j$ that task $\tau_i$ uses at slowdown factor
$\eta$. Naturally, even components such as system memory may be modeled as a
device.
CS-DVS needs to minimize the energy consumption given by Equation 16, that is, to
find the $\eta$ that yields the lowest total energy consumption for the task.
Possible $\eta$ values are the discrete speed settings provided by the underlying
processor architecture. CS-DVS finds the $\eta$ giving the lowest total energy by
evaluating Equation 16 for each available $\eta$ value [JeG04] and then choosing
the optimal one. As visualized by Figure 23, this value need not be the one that
minimizes the CPU energy consumption alone. The $\eta$ value that yields the
lowest total energy consumption is called the critical speed of the task. Because
each task may have different execution times and use a different set of devices,
their critical speeds need not be the same.
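The exhaustive evaluation of Equation 16 can be sketched as follows. This is an
illustration only: the function names, the CPU power model `example_cpu_power`,
the list of slowdown factors and all numeric values are assumptions rather than
data from [JeG04].

```python
from typing import Callable, Sequence


def task_energy(eta: float, cpu_cycles: float,
                cpu_power: Callable[[float], float],
                device_cycles: Sequence[float],
                device_powers: Sequence[float]) -> float:
    """Equation 16: CPU term plus the summed device standby terms at slowdown eta."""
    cpu_term = cpu_cycles / eta * cpu_power(eta)
    device_term = sum(c / eta * p for c, p in zip(device_cycles, device_powers))
    return cpu_term + device_term


def critical_speed(available_etas: Sequence[float], cpu_cycles: float,
                   cpu_power: Callable[[float], float],
                   device_cycles: Sequence[float],
                   device_powers: Sequence[float]) -> float:
    """Evaluate Equation 16 for every available eta and keep the cheapest one."""
    return min(available_etas,
               key=lambda eta: task_energy(eta, cpu_cycles, cpu_power,
                                           device_cycles, device_powers))


def example_cpu_power(eta: float) -> float:
    """Hypothetical CPU model: cubic dynamic part plus a constant static part."""
    return eta**3 + 0.05


if __name__ == "__main__":
    etas = [0.25, 0.5, 0.75, 1.0]  # assumed discrete slowdown factors
    # Task without devices: the slowest setting is the cheapest.
    print(critical_speed(etas, 1.0, example_cpu_power, [], []))
    # Same task using two devices: the critical speed moves up.
    print(critical_speed(etas, 1.0, example_cpu_power, [1.0, 0.5], [0.1, 0.2]))
```

With these assumed values the task without devices has a critical speed of 0.25,
while the same task with two standby devices has a critical speed of 0.5.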
The pseudo code of the CS-DVS algorithm is given in Figure 25. On line 1, the
critical speed for each task is calculated, and on line 2 each task $\tau_i$ is
initialized to its individual critical speed $\eta_i$. Energy-optimal slowdown
factors might cause the task set to become infeasible, i.e., EDF timeliness
guarantees would be violated. Hence, CS-DVS might need to increase the speed of
some task(s). This is done in the while-loop on lines 3–8. A candidate task
$\tau_m$ for a speed increase fulfills two conditions (line 4). Firstly, the
task's current slowdown factor $\eta_m$ is not the maximum speed (line 5). The
second condition (line 6) is more complicated. We wish to choose the task for
which a speed increase from its current factor to the next higher one causes as
small an energy consumption increase per time unit as possible. Here, $\Delta E_m$
represents the energy consumption increase between the two factors, and
$\Delta t_m$ the time gained by the speed-up [JeG04]. From among the candidates,
the task with the lowest $\Delta E_m / \Delta t_m$ value is chosen, and this
task's $\eta$ is increased. This process is repeated (line 3) until the task set
becomes feasible according to the EDF principle.
1 Compute the critical speed for each task;
2 Initialize η_i to critical speed of τ_i;
3 while (not feasible) do
4     Let τ_m be task satisfying:
5         (a) η_m is not the maximum speed;
6         (b) ΔE_m / Δt_m is minimum;
7     Increase speed of task τ_m;
8 end while
9 return slowdown factors η_i;
Figure 25: The Critical Speed DVS (CS-DVS) Algorithm in pseudo code
[JeG04].
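The loop of Figure 25 can be made concrete with a short sketch. Several details
below are assumptions that are not stated in the text: the `Task` fields, the
discrete speed list `SPEEDS`, the CPU power model, the feasibility test (the EDF
utilization bound for periodic tasks whose deadlines equal their periods), and
the reading of $\Delta t_m$ as the execution time saved per job.

```python
from dataclasses import dataclass
from typing import List

SPEEDS = [0.25, 0.5, 0.75, 1.0]  # assumed discrete slowdown factors


@dataclass
class Task:
    wcet: float          # worst-case execution time at full speed
    period: float        # period, assumed equal to the relative deadline
    device_power: float  # assumed aggregate standby power of the devices used


def cpu_power(eta: float) -> float:
    """Hypothetical CPU model: cubic dynamic part plus a constant static part."""
    return eta**3 + 0.05


def energy(task: Task, eta: float) -> float:
    """Per-job energy: the job lasts wcet / eta with CPU and devices powered."""
    return task.wcet / eta * (cpu_power(eta) + task.device_power)


def feasible(tasks: List[Task], etas: List[float]) -> bool:
    """EDF utilization test with execution times stretched by 1 / eta."""
    return sum(t.wcet / (eta * t.period) for t, eta in zip(tasks, etas)) <= 1.0


def cs_dvs(tasks: List[Task]) -> List[float]:
    # Lines 1-2: start every task at its critical speed.
    idx = [min(range(len(SPEEDS)), key=lambda k, t=t: energy(t, SPEEDS[k]))
           for t in tasks]
    # Lines 3-8: speed tasks up until the task set is feasible.
    while not feasible(tasks, [SPEEDS[k] for k in idx]):
        candidates = [m for m, k in enumerate(idx) if k < len(SPEEDS) - 1]
        if not candidates:
            raise ValueError("task set is infeasible even at maximum speed")

        def cost(m: int) -> float:
            task, cur, nxt = tasks[m], SPEEDS[idx[m]], SPEEDS[idx[m] + 1]
            delta_e = energy(task, nxt) - energy(task, cur)
            delta_t = task.wcet / cur - task.wcet / nxt
            return delta_e / delta_t

        idx[min(candidates, key=cost)] += 1
    # Line 9: return the slowdown factors.
    return [SPEEDS[k] for k in idx]


if __name__ == "__main__":
    tasks = [Task(wcet=2.0, period=5.0, device_power=0.0),
             Task(wcet=3.0, period=10.0, device_power=0.2)]
    print(cs_dvs(tasks))
```

For this assumed task set the critical speeds are 0.25 and 0.5, which gives a
utilization of 2.2. Three speed-up steps (first task, second task, first task)
bring the utilization below 1, and both tasks end up at 0.75.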
6.3 Comparing the presented algorithms
In this section we explored two power-aware real-time scheduling algorithms that
consider system-wide energy consumption when choosing the optimal DVS setting
for the processor. Both algorithms model a real-time task's energy consumption as
the sum of CPU and device set energy consumptions. The slower the processor is
run, the more standby energy the devices require. A power-aware real-time scheduler
needs to consider this when making DVS setting decisions.
The considered algorithms were duSYS [ZhC05] and CS-DVS [JeG04]. Both
algorithms are based on the EDF principle and provide a hard real-time timeliness
guarantee. The main difference between the algorithms is that duSYS is able to
utilize dynamically emerging job slack, whereas CS-DVS operates on static
pre-runtime task information only. It is well known that real-time jobs hardly
ever consume all the processor time that has been allocated to them, but execute
faster than budgeted. Hence, duSYS is potentially more energy-efficient than
CS-DVS.
7 Summary
In a real-time system, calculations must not only be correct but also be finished
within a pre-defined deadline. The first serious real-time scheduling algorithms,
presented in Section 2, were Rate Monotonic and Earliest Deadline First [LiL73].
In a hard real-time system, for instance in a pacemaker, the meeting of every single
deadline is crucial. In a soft real-time system, for instance a video player,
occasional deadline misses are tolerated.
Many contemporary real-time systems operate on constrained devices with limited
battery power. Power awareness in constrained devices is discussed in Section 3.
Extensive energy savings can be achieved by utilizing Dynamic Voltage Scaling (DVS)
[Gro03, VeF05] to change the operating frequency and voltage of the processor during
run-time. Using the Advanced Configuration and Power Interface (ACPI) [HIM06],
the operating system may shut down devices, such as disk drives, for time periods
when the devices are not needed.
When using low-power techniques, the challenge for the real-time scheduler is to
maximize energy savings while guaranteeing that jobs meet their real-time
deadlines. Due to device wakeup delay times, the scheduler needs to initiate the
wakeup procedure of a sleeping device before the device is actually needed. If the
device is not awoken early enough, the job needing it risks missing a deadline.
Advanced scheduling algorithms such as Feedback DVS-EDF [DMZ02] and duSYS
[ZhC05] are also able to dynamically utilize emerging slack times for energy savings.
Once one job finishes earlier than budgeted, the next job may have extra execution
time at its disposal. The scheduler may use this slack time to conserve processor
energy by executing the job slower.
Considerable research has been done in the field of power-aware real-time scheduling.
The Rate Monotonic and Earliest Deadline First algorithms have been enhanced
with power-aware properties. Power aware real-time algorithms for uniprocessor,
device, and system-level scheduling are explored in Sections 4, 5 and 6, respectively.
References
BBC98 Benini, L., Bogliolo, A., Cavallucci, S. and Riccó, B., Monitoring
system activity for OS-directed dynamic power management. ISLPED
'98: Proceedings of the 1998 international symposium on Low power
electronics and design, New York, NY, USA, 1998, ACM Press, pages
185–190.
But05 Buttazzo, G. C., Rate monotonic vs. EDF: judgment day. Real-Time
Systems, 29,1(2005), pages 5–26.
ChG06 Cheng, H. and Goddard, S., Online energy-aware I/O device
scheduling for hard real-time systems. DATE '06: Proceedings of the
conference on Design, automation and test in Europe, 3001 Leuven,
Belgium, 2006, European Design and Automation Association, pages
1055–1060.
DMZ02 Dudani, A., Mueller, F. and Zhu, Y., Energy-conserving feedback
EDF scheduling for embedded systems with real-time constraints.
LCTES/SCOPES '02: Proceedings of the joint conference on Languages,
compilers and tools for embedded systems, New York, NY, USA,
2002, ACM Press, pages 213–222.
FEL04 Fan, X., Ellis, C. S. and Lebeck, A. R., The synergy between power-
aware memory systems and processor voltage scaling. Power-Aware
Computer Systems, 3164(2004), pages 164–179.
Gro03 Grover, A., Modern system power management. ACM Queue,
1,7(2003), pages 66–72.
HIM06 Hewlett-Packard, Intel, Microsoft, Phoenix and Toshiba, Advanced
configuration and power interface specification, revision 3.0b, 2006.
http://www.acpi.info/DOWNLOADS/ACPIspec30b.pdf. [3.4.2007]
HwA00 Hwang, C.-H. and Wu, A. C.-H., A predictive system shutdown method
for energy saving of event-driven computation. ACM Transactions on
Design Automation of Electronic Systems, 5,2(2000), pages 226–241.
IGS02 Irani, S., Gupta, R. and Shukla, S., Competitive analysis of dynamic
power management strategies for systems with multiple power savings
states. DATE '02: Proceedings of the conference on Design, automation
and test in Europe, Washington, DC, USA, 2002, IEEE Computer
Society, pages 117–123.
Int04 Intel Corporation, Enhanced Intel SpeedStep Technology for the Intel
Pentium M Processor, 2004. http://www.intel.com/design/
intarch/papers/30117401.pdf. [3.4.2007]
JeG04 Jejurikar, R. and Gupta, R., Dynamic voltage scaling for systemwide
energy minimization in real-time embedded systems. ISLPED '04:
Proceedings of the 2004 international symposium on Low power electronics
and design, New York, NY, USA, 2004, ACM Press, pages 78–81.
Liu00 Liu, J. W. S., Real-Time Systems. Prentice Hall, New Jersey, USA,
2000.
LiL73 Liu, C. L. and Layland, J. W., Scheduling algorithms for
multiprogramming in a hard-real-time environment. Journal of the ACM,
20,1(1973), pages 46–61.
PLS01 Pouwelse, J., Langendoen, K. and Sips, H., Dynamic voltage scaling
on a low-power microprocessor. MobiCom '01: Proceedings of the 7th
annual international conference on Mobile computing and networking,
New York, NY, USA, 2001, ACM Press, pages 251–259.
PiS01 Pillai, P. and Shin, K. G., Real-time dynamic voltage scaling for low-
power embedded operating systems. SOSP '01: Proceedings of the
eighteenth ACM symposium on Operating systems principles, New York,
NY, USA, 2001, ACM Press, pages 89–102.
RaS94 Ramamritham, K. and Stankovic, J., Scheduling algorithms and
operating systems support for real-time systems. Proceedings of the IEEE,
82,1(1994), pages 55–67.
ShC99 Shin, Y. and Choi, K., Power conscious fixed priority scheduling for
hard real-time systems. DAC '99: Proceedings of the 36th ACM/IEEE
conference on Design automation, New York, NY, USA, 1999, ACM
Press, pages 134–139.
SwC00 Swaminathan, V. and Chakrabarty, K., Real-time task scheduling for
energy-aware embedded systems, 2000. http://www.ee.duke.edu/
~krish/wip.pdf. [27.6.2007]
SwC01 Swaminathan, V. and Chakrabarty, K., Investigating the effect of
voltage-switching on low-energy task scheduling in hard real-time
systems. ASP-DAC '01: Proceedings of the 2001 conference on Asia South
Pacific design automation, New York, NY, USA, 2001, ACM Press,
pages 251–254.
SwC03 Swaminathan, V. and Chakrabarty, K., Energy-conscious, deterministic
I/O device scheduling in hard real-time systems. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 22,7(2003),
pages 847–858.
SwC05 Swaminathan, V. and Chakrabarty, K., Pruning-based, energy-optimal,
deterministic I/O device scheduling for hard real-time systems. ACM
Transactions on Embedded Computing Systems, 4,1(2005), pages
141–167.
Sta05 Stallings, W., Operating Systems: Internals and Design Principles.
Pearson Prentice Hall, Upper Saddle River, NJ, USA, 2005.
VBH03 Viredaz, M. A., Brakmo, L. S. and Hamburgen, W. R., Energy
management on handheld devices, 2003. http://www.acmqueue.org/
modules.php?name=Content&pa=showpage&pid=79. [27.3.2007]
VeF05 Venkatachalam, V. and Franz, M., Power reduction techniques for
microprocessor systems. ACM Computing Surveys, 37,3(2005), pages
195–237.
WRJ06 Wu, H., Ravindran, B., Jensen, E. D. and Li, P., Energy-efficient, utility
accrual scheduling under resource constraints for mobile embedded
systems. ACM Transactions on Embedded Computing Systems, 5,3(2006),
pages 513–542.
YuN03 Yuan, W. and Nahrstedt, K., Energy-efficient soft real-time CPU
scheduling for mobile multimedia systems. SOSP '03: Proceedings of
the nineteenth ACM symposium on Operating systems principles, New
York, NY, USA, 2003, ACM Press, pages 149–163.
YuN06 Yuan, W. and Nahrstedt, K., Energy-efficient CPU scheduling for
multimedia applications. ACM Transactions on Computer Systems,
24,3(2006), pages 292–331.
ZhC05 Zhuo, J. and Chakrabarti, C., System-level energy-efficient dynamic
task scheduling. DAC '05: Proceedings of the 42nd annual conference
on Design automation, New York, NY, USA, 2005, ACM Press, pages
628–631.
Appendix 1. The entire Feedback DVS-EDF algorithm
This is the entire Feedback DVS-EDF algorithm [DMZ02] presented in Section 4.1.3.
