Scheduling of real time embedded systems for resource and energy minimization by voltage scaling by Anne, Naveen Babu
UNLV Retrospective Theses & Dissertations 
1-1-2005 
Scheduling of real time embedded systems for resource and 
energy minimization by voltage scaling 
Naveen Babu Anne 
University of Nevada, Las Vegas 
Follow this and additional works at: https://digitalscholarship.unlv.edu/rtds 
Repository Citation 
Anne, Naveen Babu, "Scheduling of real time embedded systems for resource and energy minimization by 
voltage scaling" (2005). UNLV Retrospective Theses & Dissertations. 1819. 
https://digitalscholarship.unlv.edu/rtds/1819 
This Thesis is protected by copyright and/or related rights. It has been brought to you by Digital Scholarship@UNLV 
with permission from the rights-holder(s). You are free to use this Thesis in any way that is permitted by the 
copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from 
the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/
or on the work itself. 
 
This Thesis has been accepted for inclusion in UNLV Retrospective Theses & Dissertations by an authorized 
administrator of Digital Scholarship@UNLV. For more information, please contact digitalscholarship@unlv.edu. 
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
SCHEDULING OF REAL TIME EMBEDDED SYSTEMS FOR RESOURCE AND
ENERGY MINIMIZATION BY VOLTAGE SCALING
by
Naveen Babu Anne
Bachelor of Engineering 
Malaviya National Institute of Technology 
University of Rajasthan 
2002
A thesis submitted in partial fulfillment of the 
requirements for the
Master of Science Degree in Electrical Engineering 
Department of Electrical and Computer Engineering 
Howard R. Hughes College of Engineering
Graduate College 
University of Nevada Las Vegas 
December 2005
UMI Number: 1429693
INFORMATION TO USERS
The quality of this reproduction is dependent upon the quality of the copy 
submitted. Broken or indistinct print, colored or poor quality illustrations and 
photographs, print bleed-through, substandard margins, and improper 
alignment can adversely affect reproduction.
In the unlikely event that the author did not send a complete manuscript 
and there are missing pages, these will be noted. Also, if unauthorized 
copyright material had to be removed, a note will indicate the deletion.
UMI
UMI Microform 1429693 
Copyright 2006 by ProQuest Information and Learning Company. 
All rights reserved. This microform edition is protected against 
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company 
300 North Zeeb Road 
P.O. Box 1346 
Ann Arbor, MI 48106-1346
Thesis Approval
The Graduate College
University of Nevada, Las Vegas
August 10, 2005
The Thesis prepared by
Naveen Babu Anne
Entitled
"Scheduling of Real Time Embedded Systems for Resource and
Energy Minimization by Voltage Scaling"
is approved in partial fulfillment of the requirements for the degree of
Master of Science in Electrical Engineering
Examination Committee Chair
Examination Committee Member
Examination Committee Member
Graduate College Faculty Representative
Dean of the Graduate College
ABSTRACT
Scheduling of Real Time Embedded Systems for Resource and Energy 
Minimization by Voltage Scaling
by
Naveen Babu Anne
Dr. Venkatesan Muthukumar, Examination Committee Chair 
Assistant Professor of Electrical and Computer Engineering 
University of Nevada, Las Vegas
The aspects of real-time embedded computing are explored with the focus on novel 
real-time scheduling policies, which would be appropriate for low-power devices. To 
consider real-time deadlines with pre-emptive scheduling policies will require the 
investigation of intelligent scheduling heuristics. These aspects for various other RTES 
models like multiple processor systems, Dynamic Voltage Scaling and dynamic
scheduling are the focus of this thesis. Deadline based scheduling of task graphs
representative of real time systems is performed on a multiprocessor system.
A set of aperiodic, dependent tasks in the form of a task graph are taken as the input 
and all the required task parameters are calculated. All the tasks are then partitioned into 
two or more clusters, allowing them to be run at different voltages. Scaling the voltage of
each cluster in this way minimizes the overall power utilized by the system.
With the mapping of each task to a particular voltage done, the tasks are scheduled on a 
multiprocessor system consisting of processors that can run at different voltages and 
frequencies, in such a way that all the timing constraints are satisfied.
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGEMENTS
CHAPTER 1 INTRODUCTION
    Real Time Embedded System (RTES)
    Real Time Applications
    Representing a Real Time Embedded System
    Design of Real Time Embedded System
    Performance characterization of RTES
CHAPTER 2 SCHEDULING IN REAL TIME SYSTEMS
    Definitions
    The Need for Scheduling: Scheduler and Scheduling
    Commonly used approaches to Real-Time Scheduling
        Clock Driven Scheduling
        Weighted Round-Robin Approach
        Priority Driven Scheduling Approach
    Scheduling Policies in Real Time Systems: Classification
        Off-line/On-line (Static/Dynamic)
        Preemptive/Non-preemptive
        Centralized/Distributed
CHAPTER 3 REVIEW OF EXISTING LITERATURE
    Scheduling Algorithms: The Big Picture
    Real Time Task Model
    Voltage Scaling and Energy Minimization
    Standard Task Graph (STG)
    Description of the Existing Scheduling Algorithms
CHAPTER 4 DESCRIPTION OF THE SCHEDULING ALGORITHMS
    Mathematical Modeling of the Scheduling Problem
    Algorithm Description
        The Scheduling after Scaling (SaS) Algorithm
        Scheduling before Scaling Algorithm (SbS)
        Probability Based Scheduling (PbS) Algorithm
        Energy - Probability Based Scheduling (E-PbS) Algorithm
    An Example illustrating the algorithms
        SaS
        SbS
        Probability based Scheduling
        E-PbS
CHAPTER 5 RESULTS, CONCLUSION AND FUTURE WORK
    Results
    Conclusion
        Benchmarks
    Assumptions made
    Simulation Results
    Directions for future work
REFERENCES
VITA
LIST OF FIGURES
Fig. 1. Design flow of real time embedded system
Fig. 2. Classification of RTES Scheduling Algorithms
Fig. 3. Classification of Scheduling Algorithms into Fixed and Dynamic Priority Algos.
Fig. 4. Standard Task Graph
Fig. 5. Rate Monotonic Schedule of tasks T1, T2, T3
Fig. 6. Deadline Monotonic Schedule of Tasks T1, T2, T3 in 8.1.2
Fig. 7. Earliest Deadline First Schedule of Tasks T1, T2, T3 in 8.2.1
Fig. 8. Least Slack Time First Scheduling of Tasks T1, T2, T3 in 8.2.2
Fig. 9. Standard Task Graph Example
Fig. 10. Scheduling after Scaling
Fig. 11. Scheduling before Scaling
Fig. 12. Scaled processors in SbS algorithm
Fig. 13. ASAP and ALAP schedules for 10.stg
Fig. 14. Final PbS Schedule for 10.stg
Fig. 15. Final E-PbS Schedule for 10.stg
Fig. 16. Comparison chart depicting No. of processors vs. No. of tasks
Fig. 17. Division of total number of processors for SaS Algorithm by Voltage
Fig. 18. Division of total number of processors for SbS Algorithm by Voltage
Fig. 19. Division of total number of processors for PbS Algorithm by Voltage
Fig. 20. Division of total number of processors for E-PbS Algorithm by Voltage
Fig. 21. Energy Comparison of Number of Tasks vs. Energy Consumed
Fig. 22. Energy consumed per each voltage in SaS Algorithm
Fig. 23. Energy consumed per each voltage in SbS Algorithm
Fig. 24. Energy consumed per each voltage in PbS Algorithm
Fig. 25. Energy consumed per each voltage in E-PbS Algorithm
Fig. 26. Comparison of Processor utilization vs. Number of tasks
Fig. 27. Comparison of Processor utilization at 5.0V for each algorithm
Fig. 28. Comparison of Processor utilization at 3.3V for each algorithm
Fig. 29. Comparison of Processor utilization at 2.4V for each algorithm
Fig. 30. Average Utilization of the processors expressed in percentage
Fig. 31. Comparison chart for Energy consumed per processor
Fig. 32. Bar graph showing the energy per processor consumed by each algorithm
Fig. 33. Graph showing the product of Energy and Utilization
Fig. 34. Bar graph depicting the efficiency of each algorithm
LIST OF TABLES
Table 1 Tabular form of STG Graph shown in Figure 9
Table 2 Voltage scaling of each node
Table 3 Scaling in SbS algorithm
Table 4 Table showing ASAP and ALAP start times and Mobility of each task
Table 5 Probabilities for task #1 from ASAPbegin till ALAPend
Table 6 Distribution Graph for Time Step-1
Table 7 Table showing number of processors required
Table 8 Energy consumed by the processors while implementing the algorithms
Table 9 Average utilization of the processors
ACKNOWLEDGEMENTS
First and foremost, I would like to thank my advisor Dr. Venkatesan Muthukumar
whose insights and suggestions for possible approaches helped find a solution that led to 
success. He has been a continuous source of inspiration. I benefited a lot from working 
with him and as his teaching assistant.
My special thanks go to Ms. Shruti Patil who gave my research work a valuable 
impetus by providing crucial directions for the solution of the problem I had been 
working on.
I wish to thank Dr. Ajoy K. Datta and Dr. Henry Selvaraj for sparing their valuable
time and consenting to be part of my examination committee. A wonderful thanks
goes to Dr. Emma Regentova who, apart from being on my examination committee, has
been very cordial and encouraging at every stage of my master's program. I would
especially like to thank my father Dr. Anne Subbarao and my mother Dr. Madala
Ramadevi who encouraged me to take up graduate studies.
Finally, I would like to extend thanks to all my colleagues and faculty members in the
Digital Design Group for creating a conducive environment for research and
brainstorming. To all friends and other people close to me goes out my biggest “Thank
you” of all.
CHAPTER 1
INTRODUCTION
1.1 Real Time Embedded System (RTES)
Definition: Any system where a timely response by that system to external stimuli is 
vital is known as a Real Time System.
Definition of Components: Components here refer to individual processing elements 
like processors.
A real time embedded system is a system whose behavior depends on the accuracy of 
its logic as well as its timing. It is a collection of components working sequentially or in 
parallel, each with a specialized functionality that is designed to perform a specific 
operation. These components interact with each other either by exchanging information 
or by sharing resources. The degree of this interaction may vary over time scales ranging
from microseconds to hours and days.
A Real Time Operating System (RTOS) is an operating system that works in a real time
computing environment, that is, an environment with power, energy and timing
constraints.
1.2 Real Time Applications 
Typical real time embedded system applications are: washing machines, central
heating systems, automatic gate keepers, cashier machines at parking areas. More complex
and sophisticated ones are: flight control avionics, process control in industries, nuclear 
plant monitoring, scientific experiment guidance in laboratories, air traffic control, 
robotics, remote exploration of underwater, space and high risk environments, surgical 
operation and patient monitoring, command and control in defense and virtual reality 
systems.
1.3 Representing a Real Time Embedded System
The use of Computer-Aided Design (CAD) tools in RTES design streamlines the
design process and facilitates component reuse in future designs. Component reuse
means saving design templates or designs and using them in future RTES models,
which reduces the cost and the time-to-market of the product. CAD tools also
provide performance-accurate modeling and simulation, further reducing the
time-to-market of the real time application.
A Real Time Embedded System can be modeled as a graph consisting of one or more 
nodes and edges connecting the nodes. A node represents a task within the RTES and the 
edge between two nodes represents the dependency or exchange of information involved 
between two tasks. Depending on the nature and utility of the system, the tasks of a 
system can be made to run either on a single processor or a multiprocessor system.
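As an illustration of this graph model, the following is a minimal sketch in Python, assuming a hypothetical four-task system in which each node maps to the list of tasks that depend on it. The task names and the helper `topological_order` are illustrative only and are not taken from the thesis. The function returns one execution order that respects every dependency edge.

```python
from collections import deque

# Hypothetical task graph: node -> list of successor nodes.
# An edge T1 -> T2 means T2 depends on (receives data from) T1.
task_graph = {
    "T1": ["T2", "T3"],
    "T2": ["T4"],
    "T3": ["T4"],
    "T4": [],
}

def topological_order(graph):
    """Return the tasks in an order that respects every dependency edge."""
    # Count the unfinished predecessors of every node.
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            indegree[succ] += 1
    # Tasks without predecessors can run first.
    ready = deque(node for node, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for succ in graph[node]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)
    if len(order) != len(graph):
        raise ValueError("task graph contains a cycle")
    return order

order = topological_order(task_graph)
```

Any order produced this way can be executed on a single processor; on a multiprocessor system, tasks that are ready at the same time (here T2 and T3) may run in parallel.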
1.4 Design of Real Time Embedded System
Figure 1 shows the design methodology of a real time embedded system. The design
of an RTES mainly consists of the following areas:
• System Specification
• Hardware-software partitioning
• Software design
• Hardware Synthesis
• Software-Hardware Interface Synthesis
• Hardware implementation
The hardware-software partitioning is performed based on the requirements and the 
specifications of the system. The hardware synthesis of the system gives the optimized
design in terms of the number of resources required. The software design step ensures
that the system meets all its real time constraints, such as power and timing. The system
is then scheduled to be executed on the resources selected. The scheduling of the system 
task is done by the scheduler which is controlled by the operating system used. Finally, 
the whole real time system is implemented in the actual hardware.
1.5 Performance characterization of RTES
A Real Time Embedded System is characterized by its ability to meet the timing 
constraints. Power and resource constraints are the performance metrics for the RTES. 
The power consumed and the resources (e.g. Processors) used indicate the efficiency of 
the system. An efficient RTES consumes minimum power and minimum resources. The
main design problem, and the focus of this research, is minimizing the power consumed.
[Figure 1 (block diagram). Recoverable block labels: Real Time System Specification;
Hardware-Software Partitioning; Software Design (Scheduler, OS Image); Hardware Synthesis
(i. Allocation, ii. Scheduling, iii. Binding); Interface of Processing Elements; Simulation
and Synthesis; Hardware Implementation; RTES design by task scheduling to meet
(i) Power, (ii) Timing, (iii) Resource constraints.]
Fig. 1 Design flow of real time embedded system
Power has two components: a static component and a dynamic component.
Static power comes from circuit design techniques that include voltage bias generators
and any DC paths through active devices. Leakage power is due primarily to
sub-threshold leakage currents that result from reduced threshold voltages that prevent
transistors from turning completely off. Dynamic power is dissipated when a device is
switching. Switching power comes from charging and discharging capacitive loads. Short
circuit power is dissipated due to the current that conducts when both the n-channel and
p-channel transistors are momentarily on at the same time, and it depends on the
switching frequency, the input slew rate and the difference between the operating and
threshold voltages. Standby power is dissipated when a device is not switching. The
power consumption in a CMOS circuit is given by the following formula:

\[ P_{total} = P_{static} + P_{short} + C_{sw} \cdot f \cdot V_{dd}^{2} + P_{glitching} \]

The static power, $P_{static}$, is the power consumed through leakage currents and it occurs
even when the circuit does not operate. This power is very small for CMOS circuits,
almost negligible. $P_{short}$ occurs with every gate output switching, when the two output
transistors of a CMOS gate are on at the same time. With a good design and
technology this power can be kept under 10% of the dynamic power. The third term in
the equation is the switching power and it is dependent on the clock frequency, the supply
voltage and the switching capacitance. The last term is the power dissipation due to
glitching. In this thesis the minimization of power due to switching has been prioritized.
The term $C_{sw} \cdot f \cdot V_{dd}^{2}$ states that power consumption depends on the frequency of
operation, the switching capacitance, which depends on the size of the load (wire
capacitance, output capacitance of the driver, and input capacitance of the driven cells),
and the square of the operating voltage. It is clear from the equation that reducing the
supply voltage, clock frequency, switching capacitance or switching activity in the circuit
reduces the dynamic component of the power. A variety of optimization methods targeting
each of these four factors have been explored.
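The quadratic dependence of the switching term on the supply voltage can be illustrated with a short numerical sketch. The capacitance and frequency values below are assumed purely for illustration and are not taken from the thesis; the two supply voltages (5.0 V and 3.3 V) are among those used later in the results. The helper name `switching_power` is hypothetical.

```python
def switching_power(c_sw, f, v_dd):
    """Switching term of the power equation: C_sw * f * V_dd^2, in watts."""
    return c_sw * f * v_dd ** 2

# Assumed illustrative values, not taken from the thesis.
C_SW = 1e-9    # switched capacitance in farads
FREQ = 100e6   # clock frequency in hertz

p_high = switching_power(C_SW, FREQ, 5.0)   # supply at 5.0 V
p_low = switching_power(C_SW, FREQ, 3.3)    # supply at 3.3 V

# With capacitance and frequency held fixed, the ratio depends only on
# the voltages: (3.3 / 5.0) ** 2.
ratio = p_low / p_high
```

Under these assumptions, lowering the supply from 5.0 V to 3.3 V at a fixed frequency reduces the switching power to roughly 44% of its original value, which is why voltage scaling is an attractive lever for power minimization.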
This thesis proposes four different algorithms for power minimization in RTES. 
These algorithms proceed by reducing the voltages of operation of the processors and 
distributing tasks running on the processors. This involves scheduling the tasks on the 
processors and scaling the voltage of operation which is explained in Chapter 4. Chapter 
2 introduces the reader to the definitions and terminologies used in this thesis. An
extensive survey of the literature on scheduling algorithms aimed at power and energy 
minimization in RTES is discussed in Chapter 3. The results and conclusion are 
explained in Chapter 5 with directions for future work.
CHAPTER 2
SCHEDULING IN REAL TIME SYSTEMS
2.1 Definitions
Task Set: A real time application is specified by means of a set of tasks.
Task: A real time task is an executable entity of work which, at a minimum, is
characterized by a worst case execution time and a time constraint.
There are three types of real-time tasks: periodic, aperiodic, and sporadic.
Periodic Tasks: Periodic tasks are real-time tasks which are activated (released)
regularly at a fixed period. The time constraint for a periodic task is a deadline ‘d’ that can
be less than, equal to or greater than the period.
Aperiodic Tasks: Aperiodic tasks are real-time tasks which are activated irregularly at
some unknown and possibly unbounded rate. The time constraint is usually a deadline ‘d’.
Sporadic Tasks: Sporadic tasks are real-time tasks which are activated irregularly
with some known bounded rate. The bounded rate is characterized by a minimum
interarrival period, that is, a minimum interval of time between two successive
activations. The time constraint is usually a deadline ‘d’.
Deadline: A deadline ‘d’ is a point in time by which the task must complete its
execution. Usually a deadline ‘d’ is an absolute time. Sometimes ‘d’ instead denotes
a relative deadline, measured from the release time of the task. The deadline can be
hard, soft, or firm.
Hard Deadline: A hard deadline means that it is vital for the safety of the system that 
this deadline is always met.
Soft Deadline: A soft deadline means that it is desirable to finish executing the task 
by the deadline, but no catastrophe occurs if there is a late completion.
Firm Deadline: A firm deadline means that a task should complete by the deadline, or 
not execute at all. There is no value to completing the task after its deadline.
Release Time: The release time of a task is the instant of time at which the task 
becomes available for execution. The task can be scheduled and executed at any time at 
or after its release time whenever its data and control dependency conditions are met.
Relative Deadline: The maximum allowable response time of a job is its relative 
deadline.
Absolute Deadline: This is equal to the task’s release time plus its relative deadline.
Period: The difference in time between the arrivals of two consecutive instances of a 
periodic task is fixed and is referred to as the period of the task.
Hyperperiod: The Hyperperiod of a set of tasks is the time interval of a fixed length 
which is equal to the least common multiple of the periods of all the tasks.
Execution Time: This is the computation time of the task.
Slack: The slack of a task is the difference between its deadline and its execution time.
Precedence constraint: If a certain task can execute only after the completion of its
predecessor task(s), then such a task is called a precedence constrained task.
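Three of the definitions above reduce to simple arithmetic, sketched below in Python. The function names and the sample values are hypothetical. The slack helper assumes the deadline is given relative to the release time and that the task has not yet started executing.

```python
from functools import reduce
from math import gcd

def absolute_deadline(release_time, relative_deadline):
    """Absolute deadline = release time + relative deadline."""
    return release_time + relative_deadline

def hyperperiod(periods):
    """Least common multiple of the periods of all the tasks."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

def slack(relative_deadline, execution_time):
    """Deadline minus execution time, for a task that has not yet started."""
    return relative_deadline - execution_time

# Hypothetical values for illustration.
d_abs = absolute_deadline(release_time=3, relative_deadline=7)
h = hyperperiod([4, 6, 10])
s = slack(relative_deadline=7, execution_time=2)
```

For the sample values, a task released at time 3 with a relative deadline of 7 must finish by time 10, tasks with periods 4, 6 and 10 repeat their joint pattern every 60 time units, and a task with a deadline of 7 and an execution time of 2 can be delayed by up to 5 time units.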
2.2 The Need for Scheduling: Scheduler and Scheduling 
Scheduling a real time system is necessary to satisfy the following requirements:
• All the tasks in the system meet their timing constraints and the system as a unit
operates successfully.
• Resources (processors) are allocated to all the tasks.
• The resource, power and time utilization of the system is optimized.
Tasks are scheduled according to a chosen set of scheduling algorithms and resource 
access-control protocols. The module which implements these algorithms is called a 
scheduler. Specifically, the scheduler assigns the processor to tasks, or equivalently, 
assigns tasks to processors. A task is scheduled in a time interval on a processor if the 
processor is assigned to the task and hence the task executes on the processor, in the 
interval. The total amount of time assigned to a task according to a schedule is the total 
length of all the time intervals during which the task is scheduled on some processor.
A schedule is an assignment, produced by the scheduler, of all the tasks in the system
to the available processors. A schedule is valid if it satisfies
the following conditions:
• Every processor is assigned to at most one task at any time.
• Every task is assigned at most one processor at any time.
• No task is scheduled before its release time.
• Depending on the scheduling algorithms used, the total amount of processor 
time assigned to every task is equal to its maximum or actual execution time.
• All the precedence and resource usage constraints are satisfied.
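The validity conditions listed above can be checked mechanically. The following is a minimal sketch, assuming a schedule is represented as a list of time slots, each a dictionary with hypothetical keys `task`, `proc`, `start` and `end`. Resource usage constraints other than processors are not modeled, and every task is assumed to receive exactly its stated execution time.

```python
from itertools import combinations

def overlaps(a, b):
    """True if two half-open intervals [start, end) intersect."""
    return a["start"] < b["end"] and b["start"] < a["end"]

def is_valid_schedule(slots, tasks, precedence):
    # Conditions 1 and 2: one task per processor and one processor per
    # task at any instant.
    for a, b in combinations(slots, 2):
        if overlaps(a, b) and (a["proc"] == b["proc"] or a["task"] == b["task"]):
            return False
    for name, info in tasks.items():
        mine = [s for s in slots if s["task"] == name]
        # Condition 3: no task is scheduled before its release time.
        if any(s["start"] < info["release"] for s in mine):
            return False
        # Condition 4: total assigned time equals the execution time.
        if sum(s["end"] - s["start"] for s in mine) != info["exec"]:
            return False
    # Condition 5 (precedence only): a successor starts after its
    # predecessor has finished.
    for pred, succ in precedence:
        pred_end = max(s["end"] for s in slots if s["task"] == pred)
        succ_start = min(s["start"] for s in slots if s["task"] == succ)
        if succ_start < pred_end:
            return False
    return True

# Hypothetical two-task example: B depends on A.
tasks = {"A": {"release": 0, "exec": 2}, "B": {"release": 0, "exec": 3}}
precedence = [("A", "B")]
good = [{"task": "A", "proc": "P0", "start": 0, "end": 2},
        {"task": "B", "proc": "P1", "start": 2, "end": 5}]
bad = [{"task": "A", "proc": "P0", "start": 0, "end": 2},
       {"task": "B", "proc": "P1", "start": 1, "end": 4}]
```

In this example `good` passes every check, while `bad` is rejected because task B starts before its predecessor A has completed.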
2.3 Commonly used approaches to Real-Time Scheduling
The subsequent sections briefly explain the clock-driven and weighted round-robin
approaches and give a detailed explanation of priority driven scheduling.
2.3.1 Clock Driven Scheduling [1]
As the name implies, when scheduling is clock-driven (also called time driven), 
decisions on what tasks can execute at what times are made at specific time instants. 
These instants are chosen a priori before the system begins execution. Typically, in a 
system that uses clock-driven scheduling, all the parameters of hard real-time tasks are
fixed and known. A schedule of the tasks is computed off-line and is stored for use at run 
time. The scheduler schedules the jobs according to this schedule at each scheduling 
decision time. In this way, scheduling overhead during run time is minimized.
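A minimal sketch of this idea is a table lookup: the schedule is computed off-line and the run-time scheduler only consults it. The table contents, the frame length and the assumption that the table repeats cyclically are hypothetical choices made for illustration.

```python
import bisect

# Hypothetical precomputed table: (decision time, task to run), sorted by
# time. None marks an idle interval.
SCHEDULE_TABLE = [(0, "T1"), (2, "T2"), (5, "T3"), (7, None)]
FRAME_LENGTH = 10  # assumed: the table repeats after this many time units

def task_at(t):
    """Return the task that the stored schedule assigns at time t."""
    decision_times = [entry[0] for entry in SCHEDULE_TABLE]
    # Find the last decision time that is not later than t within the frame.
    index = bisect.bisect_right(decision_times, t % FRAME_LENGTH) - 1
    return SCHEDULE_TABLE[index][1]
```

Because the run-time work is a single lookup, the scheduling overhead is small and predictable, at the cost of requiring all task parameters to be known in advance.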
2.3.2 Weighted Round-Robin Approach [1]
The round-robin approach is commonly used for scheduling applications which have
to share the processor's time for completing their execution. When tasks are scheduled on
a round-robin basis, every task joins a first-in first-out queue when it becomes ready for 
execution. The task at the head of the queue executes for at most one time slice. (A time 
slice is the basic granule of time that is allocated to tasks. In a time-shared environment, a 
time slice is typically in the order of tens of milliseconds.) If the task does not complete 
by the end of the time slice, it is preempted and placed at the end of the queue to wait for 
its next turn. When there are n ready tasks in the queue, each task gets one time slice 
every n time slices, that is, every round. Because the length of the time slice is relatively 
short, the execution of every task begins almost immediately after it becomes ready. In 
essence, each task gets a 1/n share of the processor when there are n tasks ready for
execution. This is why the round-robin algorithm is also called the processor sharing 
algorithm.
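The queueing behaviour described above can be sketched as follows. This is a minimal simulation of plain (unweighted) round-robin, assuming that all tasks are ready at time 0; the function name round_robin and the example task set are illustrative assumptions.

```python
from collections import deque


def round_robin(exec_times, time_slice=1):
    """Simulate round-robin for tasks that are all ready at time 0.

    exec_times: dict mapping task name -> required execution time.
    Returns a dict mapping task name -> completion time.
    """
    queue = deque(exec_times.items())        # FIFO queue of (name, remaining time)
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)     # run for at most one time slice
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # preempted: back to the end of the queue
        else:
            finish[name] = clock
    return finish


if __name__ == "__main__":
    print(round_robin({"A": 3, "B": 1, "C": 2}))
```

With a time slice of 1, tasks A, B and C needing 3, 1 and 2 time units complete at times 6, 2 and 5. Run back to back in queue order, A would complete at time 3, which shows how sharing the processor postpones the completion of the longer tasks.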
By giving each task a fraction of the processor, a round-robin scheduler delays the 
completion of every task. If it is used to schedule precedence constrained tasks, the 
response time of a chain of tasks can be very large. For this reason, the weighted round- 
robin algorithm is not suitable for scheduling such tasks. On the other hand, a successor 
task may be able to incrementally consume what is produced by a predecessor. In this 
case, weighted round-robin scheduling is a reasonable approach, since a task and its 
successors can execute concurrently in a pipelined fashion.
2.3.3 Priority Driven Scheduling Approach
The scheduling algorithms that fall under this approach assign priorities to tasks 
based on one or more of their temporal parameters. These priority driven 
scheduling algorithms are divided into two categories: fixed-priority and dynamic priority 
algorithms. This section introduces four basic and widely used scheduling algorithms that 
fall into one of the above categories.
Fixed Priority Scheduling (FPS): The priority of the tasks being scheduled remains 
constant. The schedule is first made and then is implemented on the processors. 
The following two algorithms use fixed priority scheduling:
1. Rate Monotonic Scheduling
2. Deadline Monotonic (Inverse Deadline) Scheduling
Dynamic Priority Scheduling (DPS): The priority of the tasks being scheduled 
changes with the change in the parameter in consideration. The schedule changes 
dynamically as the priorities of the tasks change. The following algorithms are based on 
dynamic priority:
1. Earliest Deadline First
2. Least Slack Time First 
These algorithms are further explained in detail in Chapter 3.
2.4 Scheduling Policies in Real Time Systems: Classification
Scheduling algorithms can be classified according to the following criteria:
• Off-line/On-line scheduling
• Preemptive/Non-preemptive scheduling
• Centralized/Distributed scheduling
Figure 2 shows a possible classification of algorithms.
[Figure: classification tree with the following levels, top to bottom: RTES; Soft / Hard; 
Periodic / Aperiodic; Preemptive / Non-Preemptive; Static / Dynamic]
Fig 2 Classification of RTES Scheduling Algorithms
2.4.1 Off-line/On-line (Static/Dynamic)
Off-line Scheduling (Static Scheduling): A scheduling algorithm is used offline if it is 
executed on the entire task set before actual task activation. The schedule generated in 
this way is stored in a table and later executed on the processors. The task set has to be
fixed and known a priori, so that all task activations can be calculated off-line. The main 
advantage is that the run-time overhead is low and it does not depend on the complexity 
of the scheduling algorithms used to build the schedule. However, the system is quite 
inflexible to environmental changes. Environmental changes here refer to changes in 
the operating environment of the system and processors, such as power supply and voltage 
fluctuations, which lead to changes in the task parameters.
On-line Scheduling (Dynamic Scheduling): A scheduling algorithm is used on-line if 
scheduling decisions are taken at run-time every time a new task enters the system or 
when a running task terminates. With on-line scheduling algorithms, each task is assigned 
a priority, according to one of its temporal parameters. These priorities can be either fixed 
priorities, based on fixed parameters and assigned to the tasks before their activation, or 
dynamic priorities, based on dynamic parameters like voltage, frequency of operation and 
execution time that may change during system evolution. Because this dynamic approach 
works with less information than the static approach, its scheduling decisions are less 
precise, and it has higher implementation overhead. However, it manages 
the unpredictable arrival of tasks and allows progressive creation of the planning 
sequence. Thus, on-line scheduling is used to cope with aperiodic tasks and abnormal 
overloading.
2.4.2 Preemptive/Non-preemptive
Preemptive Scheduling: In preemptive scheduling, an elected task may be preempted 
and the processor allocated to a more urgent task or one with higher priority; the 
preempted task is moved to the ready state, awaiting later election on some processor. 
Preemptive scheduling is usable only with preemptive tasks.
Non-preemptive Scheduling: Non-preemptive scheduling does not stop task 
execution. One of the drawbacks of non-preemptive scheduling is that it may result in 
timing faults that a preemptive algorithm can easily avoid. In a uniprocessor architecture, 
critical resource sharing is easier with non-preemptive scheduling since it does not 
require any concurrent access mechanism for mutual exclusion and task queuing. 
However, this simplification is not valid in a multiprocessor architecture.
2.4.3 Centralized/Distributed
Centralized: Scheduling is centralized when it is implemented on a centralized 
architecture that records the parameters of all the tasks of a distributed architecture.
Distributed: Scheduling is distributed when each site defines a local scheduling
policy. Distributed scheduling can be of various types. For example, every processor 
executing a set of tasks can have its own scheduling policy or a group of processors can 
have the same scheduling policy while the others have their own scheduling policies. In this 
context some tasks may be assigned to a site and migrate later.
CHAPTER 3
REVIEW OF EXISTING LITERATURE
3.1 Scheduling Algorithms: The Big Picture
With the vast amount of literature available on scheduling of real time systems, it is 
necessary for the reader to understand the hierarchy of the existing scheduling 
algorithms. This section captures the literature available on scheduling algorithms 
focused on minimizing the power and energy consumption in RTES. Our aim is to 
include all possible variations in the existing algorithms.
[Figure: tree diagram. Scheduling Algorithms divide into Fixed Priority Scheduling Algorithms 
(RMS, DMS) and Dynamic Priority Scheduling Algorithms (EDF, LST)]
Fig 3 Classification of Scheduling Algorithms into Fixed and Dynamic Priority Algorithms
The algorithms are classified by the type of priority they use in scheduling the tasks: each 
belongs either to the fixed priority category or to the dynamic priority category, and within 
its category to one of the base scheduling algorithms (RMS, DMS, EDF or LST). Algorithms 
which do not use any of these base algorithms but implement a different priority scheme 
are discussed separately. The algorithms are also categorized according to the objective 
function they minimize. Figure 3 shows the classification. Before studying the literature, 
it is necessary to understand the task model and the concept of power and energy minimization.
3.2 Real Time Task Model
A task model is required as a basis for discussing scheduling. A real time task is a 
basic executable entity which can be scheduled; it can be either periodic or aperiodic with 
soft or hard timing constraint. A task is best defined with its main timing parameters. 
This model includes the following primary parameters:
ri - Release time of the task 
ei - Worst case execution time of the task 
Di - Relative deadline of the task 
di - Absolute deadline of the task 
Pi - Period of the task (valid only for periodic tasks) 
g - Start time of the task 
h - End time of the task
3.3 Voltage Scaling and Energy Minimization
Power and energy are the foremost objective functions that scientists around the 
world are trying to optimize while developing scheduling policies for real time systems.
The basic concept of power reduction in the variable voltage processors is a technique 
called Voltage-Clock scaling in CMOS circuit technology.
Assuming that a processor P runs at a supply voltage V and a frequency f, and a task 
T takes n clock cycles on the processor to complete its execution, the power consumption 
in the CMOS digital circuit is given by
$P_{cmos} = C_L \cdot N_{sw} \cdot V^2 \cdot f$
where $C_L$ is the output capacitance and $N_{sw}$ is the number of switches per clock.
Energy consumption can be computed as
$E = (n \cdot P_{cmos}) / f$
This can be written as
$E = n \cdot C_L \cdot N_{sw} \cdot V^2$
From the above equations, we can conclude that lowering the supply voltage 
drastically reduces the power and energy consumption of the particular processor.
The supply voltage also affects the circuit delay. The circuit delay is given by
$T_d = k \cdot V / (V - V_t)^2$
where k is a constant dependent on output capacitance and $V_t$ is the threshold voltage.
Frequency of operation is inversely proportional to the delay. Hence, lowering the 
supply voltage increases the delay of operation, which in turn leads to lower clock 
frequencies.
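The trade-off expressed by these equations can be checked numerically. The sketch below simply evaluates the three formulas; the capacitance, switching count, threshold voltage and constant k used in the example are arbitrary assumed values, chosen only for illustration.

```python
def cmos_power(c_l, n_sw, v, f):
    """Power: P = C_L * N_sw * V^2 * f."""
    return c_l * n_sw * v ** 2 * f


def task_energy(n_cycles, c_l, n_sw, v):
    """Energy of a task of n clock cycles: E = n * C_L * N_sw * V^2."""
    return n_cycles * c_l * n_sw * v ** 2


def circuit_delay(k, v, v_t):
    """Circuit delay: T_d = k * V / (V - V_t)^2, valid for V > V_t."""
    return k * v / (v - v_t) ** 2


if __name__ == "__main__":
    # Assumed illustrative values, not taken from any data sheet.
    C_L, N_SW, N_CYCLES, K, V_T = 1e-9, 1000, 1_000_000, 1.0, 0.8
    for v in (5.0, 2.5):
        print(v, task_energy(N_CYCLES, C_L, N_SW, v), circuit_delay(K, v, V_T))
```

Halving the supply voltage reduces the energy of the task to one quarter, since the energy depends on the square of V and not on f, while the circuit delay grows, so the task takes longer to complete.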
Processors supporting several supply voltages are available. Various supply voltages 
result in different energy consumption levels for a given task T. The voltage at which a 
task is run can be decided before the execution of the task or sometimes when the task is 
executing. In the former method, a particular voltage is assigned to a task and it is run at
that voltage on the processor. In the latter case, the switching of the voltage while the 
tasks are running on the processor is controlled by the OS.
These variable voltage processors operate at different voltage ranges to achieve 
different levels of energy efficiency. Some processors which can operate at different 
supply voltages are:
• ARM7D - runs at 33MHz, 5V and 20MHz, 3.3V; [11]
• Motorola’s PowerPC860 - can be operated at 50MHz, 3.3V and in a low 
power mode of 25MHz and 2.4V; [11]
• PowerPC603 - has four power modes which can be selected by setting the 
appropriate control bits. [6]
     1  10
     2  0 0 0
     3  1 9 1 0
     4  2 4 1 1
     5  3 3 1 1
     6  4 6 1 0
     7  5 4 1 0
     8  6 3 2 5
     9  7 3 1 6
    10  8 8 1 0
    11  9 9 2 3
    12  10 2 2 7
    13  11 0 2 2
    14  # Standard Task Graph Set Project
    15  # Random Task Graph 50//tmp/50/rand0000.stg
    16  # Precedence constraints generator : sameprob
    17  # Random Seed : 28547
    18  # Tasks : 10 (+dummy tasks : 2)
    19  # Edges : 15 / 35 (+dummy edges : 6)
    20  # Max. Predecessors : 2
    21  # Min. Predecessors : 0
    22  # Ave. Predecessors : 1.5
    23  # Probability : 0.067548 (Real : 0.078367)
    24  # Task processing time generator : unifproc
    25  # Random Seed : 28547
    26  # Max. Proc. Time : 10
    27  # Min. Proc. Time : 1
    28  # Ave. Proc. Time : 5.000000 (Real : 5.240000)
    29  # CP Length : 55
    30  # Parallelism : 4.763637
Fig 4 Standard Task Graph
3.4 Standard Task Graph (STG)
An STG file consists of a task graph part and an information part.
Task graph part:
Line #1 gives the number of tasks. Line #2 holds the information for the dummy 
source node, which is represented as task number 0. All subsequent lines hold information 
about the rest of the nodes in the graph. The first column gives the task number, the second 
column its execution time, the third column the number of predecessors, and the 
remaining columns the node numbers of the predecessors. Line #13 holds the 
information for the dummy sink node.
Information part:
The information part consists of other information about the graph and the program 
that generated the STG. This part is composed of four different parts: a common part 
(task graph file name, etc.), precedence constraints form, task processing time, and other 
information such as critical path length and task graph parallelism. The common 
part is shown in line #15; the other parts follow it in the same manner. 
These lines usually serve documentation purposes. Each line in the 
information part starts with a '#'.
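The column layout described above can be read with a few lines of code. The following is a minimal sketch of a reader for the task graph part only; the function name parse_stg and the three-task sample graph are hypothetical and are not part of the STG project.

```python
# Hypothetical three-task graph (plus dummy source 0 and dummy sink 4).
SAMPLE_STG = """\
3
0 0 0
1 4 1 0
2 2 1 0
3 5 2 1 2
4 0 1 3
# Hypothetical example, for illustration only
"""


def parse_stg(text):
    """Parse the task graph part of an STG file.

    Returns (n_tasks, tasks), where tasks maps
    task number -> (execution time, [predecessor task numbers]).
    """
    # Lines of the information part start with '#' and are skipped.
    rows = [line.split() for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]
    n_tasks = int(rows[0][0])
    tasks = {}
    for fields in rows[1:]:
        task_id, exec_time, n_pred = (int(x) for x in fields[:3])
        preds = [int(x) for x in fields[3:3 + n_pred]]
        if len(preds) != n_pred:
            raise ValueError(f"task {task_id}: expected {n_pred} predecessors")
        tasks[task_id] = (exec_time, preds)
    return n_tasks, tasks


if __name__ == "__main__":
    print(parse_stg(SAMPLE_STG))
```

The reader rejects rows that list fewer predecessors than the third column announces, which is a cheap consistency check on a hand-edited file.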
3.5 Description of the Existing Scheduling Algorithms
3.5.1 Fixed Priority Algorithms
A scheduling algorithm is said to be a fixed priority algorithm if the parameter 
on which the priority assignment is based is fixed.
3.5.1.1 Rate Monotonic Scheduling [5]
A set of tasks is said to be scheduled by Rate Monotonic Scheduling if 
priorities are assigned to the tasks according to their periods: the shorter the period, the 
higher the priority of the task.
Liu and Layland proposed the Rate Monotonic Scheduling approach with the following 
assumptions:
• Tasks are periodic with constant interval between two successive requests.
• Task deadlines are equal to their periods i.e. the tasks must be completed 
before their next instance.
• Tasks are independent in the sense that they do not depend on completion of 
other tasks.
• Execution time of each task is constant and does not vary with time.
• Tasks are preemptable and are run on one processor.
This algorithm is best explained by the following example: A set of three tasks is 
given: T1(4,1), T2(5,2), T3(20,5). All three tasks are scheduled on one processor P1. The 
first parameter is the period and the second is the execution time of each task. The priority 
ordering of the tasks is T1, T2, T3 since p1 < p2 < p3. The RMS schedule for these tasks is 
shown below:
[Figure: Gantt chart of the schedule on processor P1, one row per task:
T1: Pe= 4; C= 1; D= 4; S= 0; Pr= 1; Cpu=P1
T2: Pe= 5; C= 2; D= 5; S= 0; Pr= 1; Cpu=P1
T3: Pe= 20; C= 5; D= 20; S= 0; Pr= 1; Cpu=P1]
Fig 5 Rate Monotonic Schedule of tasks T1, T2, T3
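The schedule of the example can be reproduced with a small simulation. This is a minimal sketch of preemptive fixed priority scheduling in unit time steps, assuming that all tasks are released at time 0 and that deadlines equal periods; the function name simulate_fixed_priority is an illustrative assumption.

```python
def simulate_fixed_priority(tasks, horizon):
    """Preemptive fixed priority simulation in unit time steps.

    tasks: list of (name, period, exec_time), highest priority first;
           deadlines equal periods and all tasks are released at time 0.
    horizon: simulated length, assumed to be a multiple of every period.
    Returns (timeline, missed): the task run in each time unit (None = idle)
    and the (name, release time) of every job that missed its deadline.
    """
    remaining = {name: 0 for name, _, _ in tasks}
    timeline, missed = [], []
    for t in range(horizon):
        for name, period, exec_time in tasks:
            if t % period == 0:
                if remaining[name] > 0:        # previous job unfinished at its deadline
                    missed.append((name, t - period))
                remaining[name] = exec_time    # release the next job (old one is dropped)
        running = next((n for n, _, _ in tasks if remaining[n] > 0), None)
        if running is not None:
            remaining[running] -= 1
        timeline.append(running)
    # Jobs whose deadline coincides with the end of the simulated interval.
    missed += [(n, horizon - p) for n, p, _ in tasks if remaining[n] > 0]
    return timeline, missed


# Rate monotonic order: the shorter the period, the higher the priority.
RMS_TASKS = sorted([("T1", 4, 1), ("T2", 5, 2), ("T3", 20, 5)], key=lambda x: x[1])

if __name__ == "__main__":
    timeline, missed = simulate_fixed_priority(RMS_TASKS, 20)
    print(timeline, missed)
```

Over one hyperperiod of 20 time units no deadline is missed, T3 completes at time 15 and the processor is idle for 2 time units, which agrees with the total utilization 1/4 + 2/5 + 5/20 = 0.9.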
3.5.1.1.1 Power Conscious Fixed Priority Scheduling for Hard Real-Time Systems [6]
In this paper, Low Power Fixed Priority Scheduling (LPFPS), a power efficient
version of the fixed priority scheduling algorithm, is proposed. Tasks are scheduled by 
the fixed priority RMS algorithm. Power reduction is obtained by exploiting the slack times 
present in the system and those arising from variations in the execution times of the tasks, 
while ensuring that all tasks meet their deadlines. These slack times are used 
to change the frequency and voltage of the processor. A heuristic methodology is presented 
to compute the ratio of the processor’s speed. The tasks are assumed to be periodic, 
independent and preemptable. The analysis of the algorithm was done as given below: 
Microprocessor Core ARM8 - Max clock frequency: 100MHz; Supply Voltage: 3.3V 
Benchmark Applications:
• Avionics
• INS
• Flight Control
• CNC
Among all the benchmark applications, LPFPS obtains the highest power gain, up to 
62%, for INS.
3.5.1.1.2 Power optimization of Real Time Embedded Systems on variable speed 
processors [7]
The authors of this paper proposed a power optimization method for real-time 
embedded applications on a Variable Speed Processor (VSP) with a power-down mode. 
This method consists of two components: an off-line component based on real time analysis 
of a task set and an on-line component based on priority based real time scheduling
(RMS). Specifically, for a given real-time task set, the lowest possible maximum processor 
speed is computed such that at least one of the deadlines is violated if the processor 
runs below that speed. With the maximum speed of the VSP set to the computed 
value, the speed of the VSP is then dynamically varied, or the VSP is brought into a 
power-down mode, to exploit execution time variation of each task and idle intervals 
present in the schedule. The analysis is performed with specifications similar to those of 
[6], and the CNC application benchmark shows a maximum of 50% power reduction with 
each method compared to conventional priority based scheduling.
3.5.1.1.3 Energy efficient Fixed-priority Scheduling of Real-Time Systems on 
Variable Voltage Processors [8]
The problem of determining the optimal voltage schedule for a real time system with 
fixed priority jobs implemented on a variable voltage processor is discussed in this paper. 
This technique is based on the assumption that the timing parameters of each job are known 
off-line. Two algorithms are presented in the paper. The first one takes O(N^2) time (N is 
the number of jobs) to find the minimum constant speed needed to complete each job, 
since constant voltage tends to result in lower power consumption. The second 
algorithm, with O(N^3) time complexity, builds on the first one and gives two results: (i) 
the minimum constant voltage needed to complete a set of jobs, and (ii) a voltage 
schedule which always results in lower energy consumption compared to using the 
minimum constant voltage and shutting down the system when it is idle.
The type of real time system considered here consists of jobs with predefined 
release times, deadlines and required numbers of CPU cycles. These jobs can be aperiodic 
or be instances of periodic tasks and are scheduled by a preemptive scheduler following
the RMS policy. The performance of this approach has been compared with [6] and [7], and 
the results show that [8] has 60% more energy savings than [6] and almost equivalent, and 
sometimes greater, energy savings than [7].
3.5.1.1.4 Scheduling and Assignment for Real-time Embedded Systems with 
Resource Contention [9]
This paper attempts to minimize the schedulability loss due to blocking. Blocking 
occurs when a priority inversion forces a higher priority task to wait for the processing 
of a lower priority task. The authors propose a method which deals with resource 
constraints for heterogeneous hard real-time systems and which tries to minimize the 
blocking time.
The scheduling and the assignment problems are interdependent. Hence the 
complexity is reduced by first assigning, for each task, the sub tasks to each Processing 
Unit (PU) in order to bind resources on PUs and to minimize the schedulability loss due to 
blocking. Then the Priority Ceiling Protocol (PCP) along with RMS is applied at the task 
level in order to reduce the blocking time and avoid deadlock. This algorithm reduces the 
schedulability loss by an average of 12%. In more than 10% of the cases, the algorithm 
handles cases that other methods could not deal with.
3.5.1.2 Deadline Monotonic (Inverse Deadline) Scheduling [10]
Tasks with the shortest relative deadline get the highest priority. This algorithm was 
first proposed by Leung and Merrill in 1980. This algorithm is valid even when the 
relative deadline is less than the task period. When the relative deadline of each task is 
equal to its period, the Rate Monotonic and Deadline Monotonic algorithms behave in the same 
manner. The following schedule illustrates the DMS algorithm: A task is represented as 
T{ri, Ci, Di, Pi}. The system consists of three tasks T1{0, 3, 7, 20}, T2{0, 2, 4, 5}, 
T3{0, 2, 9, 10}.
[Figure: Gantt chart of the schedule on processor P1, one row per task:
T1: Pe= 20; C= 3; D= 7; S= 0; Pr= 1; Cpu=P1
T2: Pe= 5; C= 2; D= 4; S= 0; Pr= 1; Cpu=P1
T3: Pe= 10; C= 2; D= 9; S= 0; Pr= 1; Cpu=P1]
Fig 6 Deadline Monotonic Schedule of Tasks T1, T2, T3 in 8.1.2
The DM algorithm can sometimes produce a feasible schedule when the RM algorithm 
fails, but the RM algorithm always fails when DM fails.
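The three-task system used in the DMS example illustrates this difference. The sketch below is a minimal unit-step simulation of preemptive fixed priority scheduling, assuming release times of 0 and relative deadlines no larger than the periods; the function name first_miss and the two key functions are illustrative assumptions.

```python
def first_miss(tasks, priority_key, horizon):
    """Simulate preemptive fixed priority scheduling in unit time steps.

    tasks: list of (name, C, D, P), all released at time 0, with D <= P.
    priority_key: function of a task tuple; smaller value = higher priority.
    Returns (name, time) of the first missed deadline up to horizon, or None.
    """
    order = sorted(tasks, key=priority_key)
    remaining = {name: 0 for name, _, _, _ in tasks}
    for t in range(horizon + 1):
        for name, c, d, p in order:
            # t is an absolute deadline of this task when t = k*P + D.
            if t >= d and (t - d) % p == 0 and remaining[name] > 0:
                return (name, t)
        for name, c, d, p in order:
            if t % p == 0:
                remaining[name] = c            # release the next job
        running = next((n for n, *_ in order if remaining[n] > 0), None)
        if running is not None:
            remaining[running] -= 1
    return None


TASKS = [("T1", 3, 7, 20), ("T2", 2, 4, 5), ("T3", 2, 9, 10)]


def dm_key(task):
    return task[2]      # deadline monotonic: shorter relative deadline first


def rm_key(task):
    return task[3]      # rate monotonic: shorter period first


if __name__ == "__main__":
    print("DM:", first_miss(TASKS, dm_key, 20))
    print("RM:", first_miss(TASKS, rm_key, 20))
```

Under DM the priority order is T2, T1, T3 and no deadline is missed over the hyperperiod of 20 time units. Under RM the order is T2, T3, T1, and T1 has received only one of its three units of execution when its deadline at time 7 arrives.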
3.5.2 Dynamic Priority Algorithms
A scheduling algorithm is said to be a dynamic priority algorithm if the priorities are 
assigned to tasks based on parameters that change during task execution.
3.5.2.1 Earliest Deadline First (EDF) [5]: This dynamic scheduling algorithm was 
first proposed by Liu and Layland along with the RMS policy. The same assumptions made 
for RMS also hold for EDF. Priorities are assigned to tasks according to their absolute 
deadlines: the task with the earliest deadline has the highest priority. This algorithm is 
important because it is optimal when used to schedule tasks on a processor, as long as 
preemption is allowed and tasks do not contend for resources. It is optimal in the sense 
of feasibility: if there exists a feasible schedule for a task set, then the EDF algorithm is 
able to find it.
Figure 7 shows an example of an EDF schedule for a set of three periodic tasks
T1{0,3,7,20}, T2{0,2,4,5} and T3{0,1,8,10}.
[Figure: Gantt chart of the schedule on processor P1, one row per task:
T1: Pe= 20; C= 3; D= 7; S= 0; Pr= 1; Cpu=P1
T2: Pe= 5; C= 2; D= 4; S= 0; Pr= 1; Cpu=P1
T3: Pe= 10; C= 1; D= 8; S= 0; Pr= 1; Cpu=P1]
Fig 7 Earliest Deadline First Schedule of Tasks T1, T2, T3 in 8.2.1
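The EDF schedule of this task set can be reproduced with a short simulation. This is a minimal sketch in unit time steps, assuming that all tasks are released at time 0 and that relative deadlines do not exceed the periods; the function name edf_schedule is an illustrative assumption.

```python
def edf_schedule(tasks, horizon):
    """Preemptive EDF simulation in unit time steps.

    tasks: list of (name, C, D, P), all released at time 0, with D <= P.
    Returns the task run in each time unit (None = idle).
    Raises RuntimeError if a job is unfinished at its absolute deadline.
    """
    jobs = {}                                    # name -> [remaining time, absolute deadline]
    timeline = []
    for t in range(horizon):
        for name, (rem, deadline) in jobs.items():
            if rem > 0 and deadline <= t:
                raise RuntimeError(f"{name} missed its deadline at t={deadline}")
        for name, c, d, p in tasks:
            if t % p == 0:
                jobs[name] = [c, t + d]          # release a job with deadline t + D
        ready = [(deadline, name) for name, (rem, deadline) in jobs.items() if rem > 0]
        if ready:
            _, name = min(ready)                 # earliest absolute deadline runs first
            jobs[name][0] -= 1
            timeline.append(name)
        else:
            timeline.append(None)
    return timeline


EDF_TASKS = [("T1", 3, 7, 20), ("T2", 2, 4, 5), ("T3", 1, 8, 10)]

if __name__ == "__main__":
    print(edf_schedule(EDF_TASKS, 20))
```

At time 5 the second job of T2 is released with absolute deadline 9, but the pending job of T3 has absolute deadline 8 and therefore runs first. This shows that the priority of a task under EDF is not fixed but depends on the deadlines of the jobs currently in the system.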
3.5.2.1.1 EDF scheduling using Two-Mode Voltage-Clock-Scaling for Hard Real- 
Time Systems [11]
The authors here make several assumptions to apply clock scaling with EDF 
scheduling -
• Voltage switching consumes negligible overhead.
• Tasks are independent.
• The worst case execution time of each task is known.
• The overhead of the scheduling algorithm is negligible when compared to the 
execution time of the application.
• The system operates at two different voltage levels.
This algorithm is implemented in two phases: (i) mode assignment and (ii) resource 
reclaiming. Mode assignment picks the voltage setting for each task, i.e. the High 
or Low voltage mode, that will minimize the total energy consumed and ensure 
schedulability under EDF. This voltage assignment is based on the amount of slack 
left by the task that previously ran on the system. The slack available is computed after
the completion of execution of a task, and the assignment is done in the order of the tasks’ 
first arrival instances during each busy cycle. Any available slack is kept in a separate 
queue, which is checked every time a new task instance is released to see if there is any 
slack that can be reclaimed. The results have been published for three types of voltage 
assignment: fixed, static and dynamic. For dynamic mode assignment, the subsets of 
high and low voltage tasks are determined for each busy cycle, i.e. each interval during 
which the processor is continuously busy without any idle intervals, while static mode 
assignment assigns the voltage at the start of the task instance, as the task timings are 
known a priori. Fixed voltage assignment follows the static 
assignment but with no slack reclamation. In the extreme case, more than 25% of the 
energy can be saved by the dynamic approach over the fixed assignment.
3.5.2.1.2 Real-Time Task Scheduling for a Variable Voltage Processor [12]
This work addresses the problem of reducing energy consumption by simultaneously 
assigning CPU time and a supply voltage to each task. 
The algorithm uses a Variable Voltage Processor core to schedule the tasks. Three 
methodologies are presented for voltage scheduling of the tasks. (i) Static voltage 
scheduling first assigns CPU time to all the tasks based on an EDF schedule, and 
then a supply voltage is assigned to each task to minimize the total energy consumption 
without violating the real time constraints. (ii) Dynamic voltage scheduling assigns a 
supply voltage only to the task which is going to run next. An occupation period is 
defined as the maximum period that the next executed task can use without 
violating the real time constraints of the future tasks. Two algorithms are proposed to 
calculate the occupation period: the SD and DD algorithms. The start point is determined as
the actual finishing time of the current task or the arrival of the next task. In the DD algorithm, 
the end point is determined as the finishing time of the next task on the assumption that 
all tasks are assigned the maximum supply voltage and complete at their worst case 
execution cycles. In the SD algorithm, the end point is obtained by adding the minimum 
remaining time of the not yet executed tasks to the end point of the DD 
algorithm. These are two optimal algorithms which maximize the length of the 
occupation period. The algorithms have been tested on a set of tasks run on a 
processor that can operate in three different modes.
The modes of the processor are 5V, 50MHz; 4V, 40MHz; and 2.5V, 25MHz. The five 
tasks T1, T2, T3, T4, T5 are described with certain execution cycles and load capacitances. 
Two scenarios of task execution are assumed, wherein the deadlines of the tasks in 
scenario 1 are earlier than in scenario 2. For scenario 1, the energy reduction rates of SS, SD 
and DD are 27%, 38% and 16% respectively, compared with the normal case. In scenario 2, the 
energy reduction rates of SS, SD and DD are 56%, 58% and 16% respectively.
3.5.2.1.3 Energy Aware EDF Scheduling in Distributed Hard Real Time Systems [13]
This paper presents an online energy aware algorithm for distributed heterogeneous hard 
real time systems, based on a modification of the Earliest Deadline First algorithm. The 
distributed hard real time system consists of a set of independent periodic tasks, where each 
task is specified by its period, worst case execution time and deadline. An important 
assumption made here is that all tasks are statically allocated to a specific host during the 
system design phase. The energy aware algorithm is called Low Power Distributed EDF 
(LPDEDF). The energy reduction is obtained by reducing the speed of the processor when 
the only ready task in the system has no successor or when the previous task has not
exhausted its WCET. For performance analysis, the number of hosts has been randomly 
chosen between 2 and 4. The number of end to end deadlines has been chosen as 2 * 
number of hosts. The number of tasks has been chosen between number of hosts and 
6 * number of hosts. Task periods have been randomly chosen between 100 and 1000. The 
load of each task is determined by randomly dividing the fixed total load of the system. 
The end to end deadlines are equal to their periods. The proposed algorithm reduces 
energy consumption by 10% to 16% when the global load of the system is 65%. In the 
most loaded case of 95%, the energy savings are between 2% and 10%.
3.5.2.1.4 Energy Conserving Feedback EDF Scheduling for Embedded Systems with 
Real-Time Constraints [14]
This work attempts to enhance the EDF scheduling to exploit slack time generated by 
the invocation of the task at multiple frequency levels within the same invocation. 
Initially the overall system utilization is determined to configure the idle task. At each 
scheduling point for task activation, the scaling level is calculated and the scaled task 
portion is scheduled. If a task was preempted, the newly released task receives its 
recalculated slack prior to scaling. Slots from idle jobs increase the slack. In the absence 
of slack, there is no scaled portion and the task proceeds to execute at the highest 
frequency. Otherwise, a timer interrupt is set at the end of the scaled portion. The 
resulting energy savings exceed those of previously published work by up to 34%.
3.5.2.1.5 How to Integrate Precedence Constraints and Shared Resources in Real-Time 
Scheduling [15]: Previously the Priority Ceiling Protocol (PCP) and the Stack 
Resource Policy (SRP) have been used without precedence constraints. In this paper, Dr. 
Stankovic and Dr. Spuri extended these protocols to work with arbitrarily timed tasks with
precedence constraints. They characterize the EDF scheduling policy to correctly 
schedule precedence constrained tasks and show how preemptive algorithms that 
deal with shared resources can be extended to deal with precedence constraints. They 
prove these results mathematically using a set of theorems.
3.5.2.1.6 Characteristics of EDF Schedulability on Uniform Multiprocessor [16]
The EDF algorithm has traditionally been used to schedule real time systems on a
single processor system. The authors of this paper develop tests for determining whether 
the EDF algorithm can successfully schedule a given real-time task system to meet all 
deadlines upon a specified uniform multiprocessor system. They attempt to efficiently 
identify all those uniform multiprocessor platforms such that any real-time instance 
feasible upon these platforms is guaranteed to be EDF schedulable upon the platform 
under consideration. EDF schedulability upon the given platform is then determined by 
ascertaining whether the real time system is feasible upon any of these platforms.
3.5.2.1.7 A Scheduling Model for Reduced CPU Energy [17]
As seen from the title of the paper, the authors here present a novel scheduling model 
for minimizing the energy consumed by the CPU. One very important assumption the 
authors make is that the energy consumption of a schedule is a convex function of the 
processor speed. They present an offline scheduling methodology to minimize the energy 
function. In the offline scheduling algorithm, a critical interval is defined which is an 
interval in which a set of jobs must be scheduled at maximum, constant speed in any 
optimal schedule. The algorithm proceeds by identifying such critical intervals, 
scheduling those critical jobs by following the EDF policy, then constructing a sub 
problem for the remaining jobs and solving it recursively. In the online scheduling
category, the authors identify two heuristics: (i) the Average Rate (AVR) heuristic and (ii) the 
Optimal Available heuristic. The Average Rate heuristic associates with each job an average 
rate requirement, or density, which is the job's required work divided by the length of the 
interval between its arrival and its deadline. At any time t, AVR sets the processor speed to 
the sum of the densities of the jobs available at t and uses the EDF 
policy to choose among the available jobs. In the Optimal Available heuristic, the 
optimal schedule is recalculated after each arrival of a new job, for the problem instance 
consisting of the new job and the remaining portions of all other available jobs.
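The speed chosen by the Average Rate heuristic can be sketched as follows. The sketch assumes the usual statement of the heuristic, in which the density of a job is its required work divided by the length of the interval between its arrival and its deadline, and the speed at time t is the sum of the densities of the jobs whose interval contains t; the function name avr_speed and the example jobs are illustrative assumptions.

```python
def avr_speed(jobs, t):
    """Processor speed set by the Average Rate heuristic at time t.

    jobs: list of (arrival, deadline, work) with arrival < deadline.
    The density of a job is work / (deadline - arrival); the speed is the
    sum of the densities of the jobs with arrival <= t < deadline.
    """
    return sum(work / (deadline - arrival)
               for arrival, deadline, work in jobs
               if arrival <= t < deadline)


# Two hypothetical jobs with densities 0.5 and 1.0.
JOBS = [(0, 4, 2), (2, 6, 4)]

if __name__ == "__main__":
    for t in range(7):
        print(t, avr_speed(JOBS, t))
```

In the interval where both jobs are available the speed is 1.5, and it drops back to the density of the single remaining job once the first deadline has passed. The choice of which available job to run at that speed is left to the EDF policy.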
3.5.2.1.8 Online Scheduling of Hard Real-Time Tasks on Variable Voltage Processor 
[18]
This paper concentrates on scheduling a mixed set of tasks containing periodic tasks 
as well as sporadic tasks to optimize power consumption. The authors first discuss the 
scheduling of sporadic tasks and then they discuss the scheduling of mixed task sets. 
When only sporadic tasks arrive in the system, first an acceptance test using the processor 
utilization as the parameter is performed to check whether the task can be scheduled. The 
task is accepted only if it passes the acceptance test. All the tasks are maintained by EDF 
priority. These tasks are then scheduled on a variable voltage processor by using Sporadic 
Task Scheduling (STS) algorithm. For scheduling the set of mixed tasks, the authors 
came up with two algorithms: (i) OPASTS -  Optimal Periodic And Sporadic Task 
Scheduling (ii) HPASTS -  Heuristic Periodic And Sporadic Task Scheduling. The first is 
optimal when there is no knowledge of the arrivals of sporadic tasks. It has a time 
complexity of O (N+m) where N is the total number of requests in each hyper period of 
the n periodic tasks in the system and m is the number of sporadic tasks that have been 
accepted. The time complexity of the HPASTS is O (m). This algorithm is not optimal, 
but efficient and effective. The proposed efficient algorithms result in scheduling
solutions for various scheduling scenarios and workloads, which are within 20% of the
minimum bound achievable with the dynamically variable voltage approach.
3.5.2.1.9 Real Time Task Scheduling for Energy-Aware Embedded Systems [19]
The authors of this paper propose two methodologies to schedule periodic, non-preemptable 
tasks in a real time system to minimize the energy consumption. To develop 
these policies, they assume that they are given a set of n periodic tasks with each task 
having a release time, deadline, execution length and period. They assume that the CPU 
can operate at two voltage levels and the supply voltage is controlled by the OS. The first 
scheduling model is (i) MILP - Mixed Integer Linear Programming Model: Here the 
objective function of the MILP is the energy consumed by the set of n tasks, subject to the 
following constraints -
• CPU speed is limited to two values s1 and s2.
• The deadline for each task must be met.
• Tasks are non-preemptable.
• A task may start only after it has been released.
This model is computationally intensive and cannot be used for large task sets. 
The second scheduling policy is (ii) LEDF -  Low Energy Earliest Deadline First
Heuristic. This algorithm maintains a list of all released tasks based on EDF scheduling 
policy. When tasks are released, the task with the earliest deadline is chosen first to be 
executed. Initially, a check is performed to see whether the task meets its deadlines when 
it is executed at a lower speed. If the task passes the test, the task is assigned the lower 
voltage and it begins execution. Any task that enters the system during this period is put 
in the ready list. LEDF recursively selects the task with the earliest deadline for
execution. As long as there are tasks for execution, the processor is not idle and this 
procedure is repeated until all the tasks are scheduled. The LEDF algorithm can be used 
for large real time task sets. As per the results, LEDF and MILP perform the same up to 
14 tasks, after which LEDF consumes slightly more power than 
MILP. This is because LEDF does not know the release times of the tasks a priori. However, 
LEDF takes less time than MILP to generate a schedule 
when scheduling a large set of tasks.
3.5.2.2 Least Slack Time First (LST) [20, 21]
This LST algorithm first proposed by Dhall and Sorenson assigns priority to tasks 
according to their slack (laxity): the smaller the slack, the higher the priority. If the slack 
is calculated only at the arrival times, the LST is equivalent to the EDF schedule.
[Figure: LST timeline for three tasks on processor P1. Task parameters shown in the figure:
Pe= 5; C= 2; D= 4; S= 0; Pr= 1; Cpu=P1
Pe= 10; C= 1; D= 8; S= 0; Pr= 1; Cpu=P1
Pe= 20; C= 3; D= 7; S= 0; Pr= 1; Cpu=P1]
Fig 8 Least Slack Time First Scheduling of Tasks T1, T2, T3 in 8.2.2
CHAPTER 4
DESCRIPTION OF THE SCHEDULING ALGORITHMS
4.1 Mathematical Modeling of the Scheduling Problem
For the problem of scheduling, we consider a task set T = {T_1, T_2, T_3, ..., T_n} where 
each task T_i is an aperiodic task with a hard deadline d_i and Worst Case Execution Time 
(WCET) of e_i. These tasks execute on a multiprocessor system consisting of processing 
elements {PE_1, PE_2, PE_3, ..., PE_m} which can operate at a prespecified set of voltages 
V = {V_1, V_2, V_3, ..., V_p} where i < j implies V_i > V_j. We denote a task T_i running at voltage 
V_k as 'T_ik'. The WCET of a task T_i is its execution time at maximum voltage V_1, denoted 
by e_i1. Thus, its execution time at voltage V_k is given by
$$e_{ik} = e_{i1} \cdot (V_1 / V_k)$$
T is subject to precedence constraints, and a partial order < is defined on T where 
T_i < T_j implies T_i precedes T_j. The problem of scheduling may be mathematically defined as 
the mappings
$$\sigma : T \to \{PE_1, PE_2, \ldots, PE_m\}$$
$$S : PE \to \{V_1, V_2, \ldots, V_p\}$$
$$\tau : T \to t, \quad \text{where } t \text{ denotes the time axis},\ 0 < t < D$$
such that
$$\forall\, T_i \in T,\ \tau(T_i) < d_i.$$
D is the absolute deadline of the last task.
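The execution-time relation above can be illustrated with a minimal sketch. The function name `scaled_execution_time` is hypothetical, and the voltage set {5V, 3.3V, 2.4V} is borrowed from the worked example in Section 4.3.1; the code only evaluates e_ik = e_i1 (V_1 / V_k).

```python
def scaled_execution_time(e_at_vmax, v_max, v_k):
    """Execution time at voltage v_k, given the WCET e_at_vmax measured at v_max."""
    if not 0 < v_k <= v_max:
        raise ValueError("v_k must lie in (0, v_max]")
    return e_at_vmax * (v_max / v_k)

# Voltage levels ordered from highest (V_1) to lowest (V_p); values are from Section 4.3.1.
VOLTAGES = [5.0, 3.3, 2.4]

# A task with WCET 9 at 5V takes longer at each lower voltage level.
for v in VOLTAGES:
    print(f"{v} V -> {scaled_execution_time(9, VOLTAGES[0], v):.2f} time units")
```

Lowering the voltage stretches the task, so a lower level is only usable while the stretched task and everything that depends on it still meet their deadlines.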
4.2 Algorithm Description
A real time system is modeled as a task graph where each task is represented by a 
node and the dependencies between the nodes are represented by edges between the 
nodes. Henceforth in this thesis, the terms ‘task’ and ‘node’ mean the same.
4.2.1 The Scheduling after Scaling (SaS) Algorithm
This algorithm proceeds by scaling the voltage at which each task should execute and 
progressively scheduling the task on to the available processor. The Standard Task Graph 
(STG) input benchmarks used during the algorithm development provide only the 
execution time and the predecessors of a task. Since the tasks in consideration have 
arbitrary hard deadlines, we first proceed by determining the deadlines of all the tasks. 
The following steps illustrate how the algorithm proceeds -
Step 1: Finding the path with highest execution time
The first step is to find the path from the start node to the sink node in the task graph 
with highest execution time. This is achieved by exhaustive depth first search. The 
highest execution time is termed as Eg.
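As an illustration of Step 1, the sketch below computes the highest path execution time for the example task graph of Table 1 (Section 4.3). It uses a memoized depth first search rather than a literally exhaustive one, which is a substitution made here for brevity; the dictionary names are hypothetical.

```python
from functools import lru_cache

# Example task graph from Table 1: execution times and predecessor lists.
EXEC_TIME = {0: 0, 1: 9, 2: 4, 3: 3, 4: 6, 5: 4, 6: 3, 7: 3, 8: 8, 9: 9, 10: 2, 11: 0}
PREDECESSORS = {0: [], 1: [0], 2: [1], 3: [1], 4: [0], 5: [0], 6: [4, 5],
                7: [6], 8: [0], 9: [3, 8], 10: [7, 9], 11: [2, 10]}
SINK = 11

@lru_cache(maxsize=None)
def earliest_end(node):
    """Length of the longest path from the start node up to and including `node`."""
    start = max((earliest_end(p) for p in PREDECESSORS[node]), default=0)
    return start + EXEC_TIME[node]

# The longest start-to-sink path gives the highest execution time of Step 1.
highest_execution_time = earliest_end(SINK)
print(highest_execution_time)
```

The per-node values returned by `earliest_end` coincide with the End-Time column of Table 1.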
Step 2: Finding the deadlines of the tasks by backtracking
The deadlines of all the tasks are determined once the highest execution time is 
found. This is accomplished by backtracking to each node from the sink node. The 
deadline of the last node is found by multiplying Eg with an arbitrary factor ‘x’. ‘x’ is 
selected such that the tasks have sufficient slack allowing them to be scaled.
$$d(\text{sink node}) = E_g \times x$$
The deadlines of all other tasks are found by backtracking from the last node. The 
procedure of finding the deadlines is represented by the following equations
$$d_{i-1} = d_i - e_i$$
If a task has more than one successor, then the deadline is obtained by choosing the 
minimum of the differences of the deadlines and execution times of the successors. 
Suppose that a task i-1 has successors i, i+1, i+2, then d_{i-1} is given by
$$d_{i-1} = \min \{ (d_i - e_i),\ (d_{i+1} - e_{i+1}),\ (d_{i+2} - e_{i+2}) \}$$
Slack of each node is found from the difference of deadline and execution time of 
each task.
slack time of each node
$$S_i = d_i - e_i$$
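The backtracking of Step 2 can be sketched for the example graph of Table 1 as follows. Two assumptions are made here: the factor x = 1.5 is inferred from Table 1 (sink deadline 34.5 for a longest path of 23), and slack is computed as deadline minus earliest end time, because that reading reproduces the Slack column of Table 1 (for example 34.5 - 13 = 21.5 for node 2). All names are hypothetical.

```python
EXEC_TIME = {0: 0, 1: 9, 2: 4, 3: 3, 4: 6, 5: 4, 6: 3, 7: 3, 8: 8, 9: 9, 10: 2, 11: 0}
PREDECESSORS = {0: [], 1: [0], 2: [1], 3: [1], 4: [0], 5: [0], 6: [4, 5],
                7: [6], 8: [0], 9: [3, 8], 10: [7, 9], 11: [2, 10]}
SINK = 11
X = 1.5  # assumed deadline factor, inferred from Table 1 (34.5 / 23)

# Invert the predecessor lists to obtain successor lists.
SUCCESSORS = {n: [] for n in EXEC_TIME}
for node, preds in PREDECESSORS.items():
    for p in preds:
        SUCCESSORS[p].append(node)

# Forward pass: earliest end times (node ids are a topological order in this graph).
end = {}
for n in sorted(EXEC_TIME):
    end[n] = max((end[p] for p in PREDECESSORS[n]), default=0) + EXEC_TIME[n]

# Backward pass: d(sink) = E x X, then d(i-1) = min over successors i of (d(i) - e(i)).
deadline = {SINK: end[SINK] * X}
for n in sorted(EXEC_TIME, reverse=True):
    if n != SINK:
        deadline[n] = min(deadline[s] - EXEC_TIME[s] for s in SUCCESSORS[n])

# Slack as it appears in Table 1: deadline minus earliest end time.
slack = {n: deadline[n] - end[n] for n in EXEC_TIME}
for n in range(1, 12):
    print(n, deadline[n], slack[n])
```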
Step 3: Scaling the voltage of each task
Once all the parameters of a task are determined, the next step is to scale the voltage 
of each task. It is assumed that the tasks can only execute at a prespecified set of 
voltages. These are the voltages at which the processors can operate.
First the tasks are sorted into set S with respect to their slack. Then the task with least 
slack is made the current node and its execution time is modified by a scaling factor. 
Then the start times and end times of all the successors of the current node are modified 
till the sink. If any node misses its deadline, then the scaling factor is reduced. Again, the 
start and end times of the successors are modified to see if any of the nodes misses its 
deadline. This procedure is repeated until none of the nodes misses its deadline, and the 
voltage at which the current node executes is then fixed. The scaling is performed on all the 
nodes in the graph. This scheme assures that all the nodes meet their deadlines and results 
in lowered voltages. This scheme is best explained by the pseudo code below:
Let the given set of voltages V = {V_1, V_2, V_3, ..., V_p} with V_1 being the highest and V_p 
being the lowest voltage at which a processing element can operate.
for (a task C in S)
{
    k = p; // p is the number of voltage levels available for processors
    Scale: modified execution time e_C' = e_C × (V_1 / V_k);
    modified h_C' = g_C + e_C';
    if (h_C' < d_C)
    {
        for (all successor nodes N of C)
        {
            g_N' = max h' of all predecessors;
            h_N' = g_N' + e_N;
            if (h_N' > d_N and k > 1)
            {
                k = k-1; // voltage level at which task executes increased
                goto Scale;
            }
        }
    }
    else
    {
        k = k-1; // increase voltage level
        goto Scale;
    }
}
V_k obtained at the end of this code is the voltage at which the task C can run without any 
successor task missing its deadline. When run for the entire task set S, the mapping 
T → {V_1, V_2, ..., V_p} is obtained.
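A runnable sketch of the scaling step is given below for the example of Section 4.3, with the zero-time start and sink nodes removed. It is a simplification under stated assumptions: instead of updating only the successors of the current node, it recomputes the start and end times of the whole graph for each trial voltage, which gives the same answer because nodes that do not depend on the current node are unaffected. Deadlines and slacks are copied from Table 1; the function names are hypothetical.

```python
VOLTAGES = [5.0, 3.3, 2.4]  # V_1 (highest) to V_p (lowest), as in Section 4.3.1
EXEC_TIME = {1: 9, 2: 4, 3: 3, 4: 6, 5: 4, 6: 3, 7: 3, 8: 8, 9: 9, 10: 2}
PREDECESSORS = {1: [], 2: [1], 3: [1], 4: [], 5: [], 6: [4, 5],
                7: [6], 8: [], 9: [3, 8], 10: [7, 9]}
DEADLINE = {1: 20.5, 2: 34.5, 3: 23.5, 4: 26.5, 5: 26.5,
            6: 29.5, 7: 32.5, 8: 23.5, 9: 32.5, 10: 34.5}
SLACK = {1: 11.5, 2: 21.5, 3: 11.5, 4: 20.5, 5: 22.5,
         6: 20.5, 7: 20.5, 8: 15.5, 9: 11.5, 10: 11.5}

def end_times(exec_time):
    """End time h of every node when each starts as soon as its predecessors end."""
    end = {}
    for n in sorted(exec_time):  # node ids form a topological order in this example
        start = max((end[p] for p in PREDECESSORS[n]), default=0.0)
        end[n] = start + exec_time[n]
    return end

def scale_voltages():
    """Visit nodes by increasing slack; give each the lowest voltage that misses no deadline."""
    current = dict(EXEC_TIME)  # execution times with the voltages fixed so far
    chosen = {}
    for node in sorted(EXEC_TIME, key=lambda n: SLACK[n]):
        for v in sorted(VOLTAGES):  # try the lowest voltage first
            trial = dict(current)
            trial[node] = EXEC_TIME[node] * VOLTAGES[0] / v
            end = end_times(trial)
            if all(end[n] <= DEADLINE[n] for n in end):
                current, chosen[node] = trial, v
                break
    return chosen

voltage_of = scale_voltages()
print(voltage_of)
```

Under these assumptions the sketch assigns node 3 to 3.3V, nodes 9 and 10 to 5V and every other node to 2.4V, which agrees with the Voltage column of Table 2.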
Step 4: Scheduling the nodes on Unlimited Number of Processors
As Step 3 ends, all the nodes are divided into voltage partitions with each node 
existing in exactly one partition. Scheduling is performed separately for each partition. 
By the end of the loop, Vk is the voltage to which the current node’s execution time is 
scaled. At every instant of time, whenever start time of a node is encountered, a check is 
performed to see if any processor is idle. If any processor is idle, the node is scheduled on 
that processor, else a new processor is initialized. The pseudo code for scheduling the 
nodes on the processors is given below:
Let S be the set of tasks obtained by sorting all tasks with respect to their start times;
Assign first task to first processor PE_0
Let P denote the set of processors that have been allotted at least one task 
for (every task C in S)
{
    if (g_C > h_i of all tasks T_i allotted to processor P_j ∈ P) 
        allot C to P_j 
    else
        allot C to new processor P_k where k = |P| + 1
}
This algorithm finally returns the number of processors required for each voltage 
partition. This minimizes power at the expense of using the largest number of processors.
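The allocation loop can be sketched as a first-fit pass over tasks sorted by start time. The intervals below are the scaled start and end times from Table 2, grouped by voltage partition. One assumption differs from the pseudo code: a task is allowed to start at exactly the instant the previous task on a processor ends (>= rather than >), since that reading reproduces the total of 6 processors reported in Section 4.3.1. The function name is hypothetical.

```python
def count_processors(intervals):
    """First-fit allocation: number of processors needed for (start, end) intervals."""
    busy_until = []  # busy_until[j] is the end time of the last task on processor j
    for start, end in sorted(intervals, key=lambda iv: iv[0]):
        for j, free_at in enumerate(busy_until):
            if start >= free_at:      # processor j is idle when the task starts
                busy_until[j] = end
                break
        else:                         # no idle processor: initialize a new one
            busy_until.append(end)
    return len(busy_until)

# Scaled (start, end) times from Table 2, one list per voltage partition.
PARTITIONS = {
    2.4: [(0, 18.75), (18.75, 27.08), (0, 12.5), (0, 8.33),
          (12.5, 18.75), (18.75, 25), (0, 16.67)],
    3.3: [(18.75, 23.29)],
    5.0: [(23.29, 32.29), (32.29, 34.29)],
}

required = {v: count_processors(iv) for v, iv in PARTITIONS.items()}
total = sum(required.values())
print(required, total)
```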
4.2.2 Scheduling before Scaling Algorithm (SbS)
This algorithm performs scheduling of the nodes on the available processor before 
scaling the node voltages. It is similar to the SaS algorithm with the only difference
being that the nodes are initially sorted with respect to their deadlines following Earliest 
Deadline First policy. The nodes are then scheduled on the processors. At every instance 
of a start time of a node, a check is done to see if any processor is idle. The node is 
scheduled on the idle processor if any; else a new processor is initialized. Assuming that 
a processor can operate at only one voltage, all the nodes on a single processor are then 
scaled such that none of them miss their deadlines.
The pseudo code for the algorithm is given below:
Let S be the set of tasks obtained by sorting all tasks with respect to their deadline; 
Assign first task to first processor PE_0
Let P denote the set of processors that have been allotted at least one task 
for (every task C in S)
{
    if (g_C > h_i of all tasks T_i allotted to processor P_j ∈ P) 
        allot C to P_j 
    else
        allot C to new processor P_k where k = |P| + 1
}
The scheduling till this stage corresponds to worst case scheduling and gives the 
minimum number of processors required to execute a task set. After scheduling, the 
voltages of the nodes are scaled. The pseudo code for scaling of the nodes on the 
processor is given below:
Let S be the set of processors on which the nodes are scheduled.
Sort the processors in ascending order based on the number of nodes on each
processor.
S = {PE_0, PE_1 ... PE_n} 
for (each processor PE)
{
    for (a task C in PE)
    {
        k = p; // p is the number of voltage levels available for processors
        Scale: modified execution time e_C' = e_C × (V_1 / V_k);
        modified h_C' = g_C + e_C';
        if (h_C' < d_C)
        {
            for (all successor nodes N of C)
            {
                g_N' = max h' of all predecessors;
                h_N' = g_N' + e_N;
                if (h_N' > d_N and k > 1)
                {
                    k = k-1; // voltage level at which task executes increased
                    goto Scale;
                }
            }
            for (all nodes on processor PE executing after C)
            {
                g_{C+1}' = h_C';
                h_{C+1}' = g_{C+1}' + e_{C+1};
                if (h_{C+1}' > d_{C+1})
                {
                    k = k-1; // voltage level at which task executes increased
                    goto Scale;
                }
            }
            for (all successor nodes N of nodes on processor PE)
            {
                g_N' = max h' of all predecessors;
                h_N' = g_N' + e_N;
                if (h_N' > d_N and k > 1)
                {
                    k = k-1; // voltage level at which task executes increased
                    goto Scale;
                }
            }
        }
        else
        {
            k = k-1; // increase voltage level
            goto Scale;
        }
    }
}
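The following is a simplified sketch of the SbS scaling phase for the example of Section 4.3, not a transcription of the pseudo code. It assumes the processor assignment shown in Table 3 (P1: 1, 3, 9, 10; P2: 8, 7; P3: 4, 6, 2; P4: 5), tries one common voltage per processor starting from the lowest, and accepts it only if every node still meets its deadline when both precedence and same-processor ordering are respected. All names are hypothetical.

```python
VOLTAGES = [5.0, 3.3, 2.4]
EXEC = {1: 9, 2: 4, 3: 3, 4: 6, 5: 4, 6: 3, 7: 3, 8: 8, 9: 9, 10: 2}
PREDS = {1: [], 2: [1], 3: [1], 4: [], 5: [], 6: [4, 5],
         7: [6], 8: [], 9: [3, 8], 10: [7, 9]}
DEADLINE = {1: 20.5, 2: 34.5, 3: 23.5, 4: 26.5, 5: 26.5,
            6: 29.5, 7: 32.5, 8: 23.5, 9: 32.5, 10: 34.5}
# Assumed assignment of nodes to processors, in execution order (from Table 3).
PROCESSORS = {"P1": [1, 3, 9, 10], "P2": [8, 7], "P3": [4, 6, 2], "P4": [5]}

def end_times(proc_voltage):
    """End times when a node waits for its predecessors and for the node before it on its processor."""
    volt_of = {n: proc_voltage[p] for p, nodes in PROCESSORS.items() for n in nodes}
    prev_on_proc = {b: a for nodes in PROCESSORS.values() for a, b in zip(nodes, nodes[1:])}
    end = {}

    def finish(n):
        if n not in end:
            waits = list(PREDS[n])
            if n in prev_on_proc:
                waits.append(prev_on_proc[n])
            start = max((finish(m) for m in waits), default=0.0)
            end[n] = start + EXEC[n] * VOLTAGES[0] / volt_of[n]
        return end[n]

    for n in EXEC:
        finish(n)
    return end

def scale_processors():
    """Give each processor the lowest single voltage at which no node misses its deadline."""
    proc_voltage = {p: VOLTAGES[0] for p in PROCESSORS}
    # Processors are visited in ascending order of the number of nodes they hold.
    for p in sorted(PROCESSORS, key=lambda q: len(PROCESSORS[q])):
        for v in sorted(VOLTAGES):
            trial = {**proc_voltage, p: v}
            end = end_times(trial)
            if all(end[n] <= DEADLINE[n] for n in EXEC):
                proc_voltage[p] = v
                break
    return proc_voltage

result = scale_processors()
print(result)
```

With these assumptions the sketch leaves P1 at 5V and lowers P2, P3 and P4 to 2.4V, matching the outcome described for Figure 12.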
4.2.3 Probability Based Scheduling (PbS) Algorithm
The previous two algorithms have shown how to reduce the voltage at which each node 
executes on the processors, thereby reducing the energy consumed. The next algorithm, 
known as ‘Exhaustive Probability Based Scheduling’, tries to increase the utilization of 
the processors while optimizing the resources and the energy consumed by the system by 
including voltage scaling.
Step 1: Determining the time steps
All the tasks in the task set are assumed to be executing at their WCET, i.e. all the 
tasks are currently running at 5V. Divide the final deadline of the graph into time-steps 
with the smallest execution time of all tasks as the unit of time steps.
Step 2: As Soon As Possible Scheduling
The next step in the algorithm is to perform As Soon As Possible Algorithm (ASAP). 
The ASAP Algorithm starts with the highest nodes (that have no parents) in the task 
graph and assigns time steps in increasing order as it proceeds downwards. It follows the 
simple rule that a successor node can execute only after its parent has executed. This 
algorithm clearly gives the fastest schedule possible. In other words, it schedules in least 
number of time steps but never takes into account the resource constraints.
Step 3: As Late As Possible (ALAP) Scheduling
The next step to follow in the algorithm is the As Late As Possible scheduling. The 
ALAP algorithm works exactly in the same way as the ASAP algorithm except that it 
starts at the bottom of the task graph and proceeds upwards. This algorithm gives the 
slowest possible schedule that takes the maximum number of time steps. However this 
doesn't necessarily reduce the number of functional units used.
Step 4: Determining the mobility
The algorithm proceeds by finding the mobility of each task. Mobility of a task T_i is 
defined below:
Mobility μ = ALAP_begintime{T_i} − ASAP_begintime{T_i};
μ + 1 represents the number of different time steps in which the current task can begin 
execution.
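Steps 2 to 4 can be sketched for the example graph of Section 4.3 as follows. The sketch assumes time steps numbered from 1, a unit time step, and a horizon of 34 time steps as used in Section 4.3.3; the variable names are hypothetical.

```python
EXEC = {1: 9, 2: 4, 3: 3, 4: 6, 5: 4, 6: 3, 7: 3, 8: 8, 9: 9, 10: 2}
PREDS = {1: [], 2: [1], 3: [1], 4: [], 5: [], 6: [4, 5],
         7: [6], 8: [], 9: [3, 8], 10: [7, 9]}
TOTAL_STEPS = 34  # final deadline 34.5 divided into unit time steps

succs = {n: [] for n in EXEC}
for n, ps in PREDS.items():
    for p in ps:
        succs[p].append(n)

# ASAP: a node begins in the first step after all of its parents have finished.
asap = {}
for n in sorted(EXEC):  # node ids form a topological order in this example
    asap[n] = max((asap[p] + EXEC[p] for p in PREDS[n]), default=1)

# ALAP: a node must finish in the step before its earliest-starting successor begins.
alap = {}
for n in sorted(EXEC, reverse=True):
    latest_end = min((alap[s] - 1 for s in succs[n]), default=TOTAL_STEPS)
    alap[n] = latest_end - EXEC[n] + 1

# Mobility mu = ALAP begin time - ASAP begin time.
mobility = {n: alap[n] - asap[n] for n in EXEC}
for n in EXEC:
    print(n, asap[n], alap[n], mobility[n] + 1)
```

The printed ASAP, ALAP and μ + 1 values agree with Table 4.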
Step 5: Determine the probability
The next step in the algorithm is to find the probability of occurrence of each task in 
each time step covered by the mobility of the task.
The probability of a task C in a time step t is represented as P(C_t). 
y = Total time steps which includes all possibilities of occurrences of the task in its 
mobility.
x = min {execution time units e, μ+1}
z = y − 2(x−1); Here z represents the number of times x is repeated. 
function probability_of_tasks(C, P(C_t))
{
    μ+1 = ALAP_begintime{T_i} − ASAP_begintime{T_i} + 1; 
    k = 1;
    for (i = 1; i <= x−1; i++)
    {
        P(C_t=k) = i(1/(μ+1)); 
        k++;
    }
    for (i = 1; i <= z; i++)
    {
        P(C_t=k) = x(1/(μ+1)); 
        k++;
    }
    for (i = x−1; i >= 1; i--)
    {
        P(C_t=k) = i(1/(μ+1)); 
        k++;
    }
}
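A minimal sketch of the probability calculation follows. It builds the rising ramp, the plateau of length z and the falling ramp, using exact fractions; the function name and argument names are hypothetical, and y is taken as the number of time steps from the ASAP begin time to the ALAP end time.

```python
from fractions import Fraction

def task_probabilities(exec_units, asap_begin, alap_begin):
    """Map each time step in the task's range to its probability of occupying that step."""
    slots = alap_begin - asap_begin + 1        # mu + 1 possible begin times
    y = alap_begin + exec_units - asap_begin   # steps from ASAP begin to ALAP end
    x = min(exec_units, slots)
    z = y - 2 * (x - 1)                        # number of steps at the plateau value
    unit = Fraction(1, slots)
    ramp = [i * unit for i in range(1, x)]     # i/(mu+1) for i = 1 .. x-1
    values = ramp + [x * unit] * z + ramp[::-1]
    return {asap_begin + k: p for k, p in enumerate(values)}

# Task 1 of the example: 9 execution units, ASAP begin 1, ALAP begin 12 (Table 4).
probs = task_probabilities(9, 1, 12)
for step, p in probs.items():
    print(step, float(p))
```

The probabilities of a task add up to its execution time in time steps, which is a quick consistency check on the three loops.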
Step 6: Distribution Graph
A distribution graph is the summation of probabilities of each task in each time step. 
It signifies the portion of a task that will be executed in a time step.
Distribution graph DG(t_i) for a time step t_i is given as
$$DG(t_i) = \sum_j P(C_j, t_i)$$
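The distribution graph can be sketched for the example of Section 4.3 as shown below. For brevity the probability of a task in a time step is computed by counting how many of its μ + 1 possible begin times would make it occupy that step, which is an equivalent formulation chosen here and yields the same ramp, plateau, ramp profile. ASAP and ALAP begin times are copied from Table 4; the function names are hypothetical.

```python
from fractions import Fraction

EXEC = {1: 9, 2: 4, 3: 3, 4: 6, 5: 4, 6: 3, 7: 3, 8: 8, 9: 9, 10: 2}
ASAP = {1: 1, 2: 10, 3: 10, 4: 1, 5: 1, 6: 7, 7: 10, 8: 1, 9: 13, 10: 22}
ALAP = {1: 12, 2: 31, 3: 21, 4: 21, 5: 23, 6: 27, 7: 30, 8: 16, 9: 24, 10: 33}

def probability(node, step):
    """Fraction of the node's possible begin times for which it occupies `step`."""
    begins = range(ASAP[node], ALAP[node] + 1)
    covering = sum(1 for s in begins if s <= step < s + EXEC[node])
    return Fraction(covering, len(begins))

def distribution_graph(step):
    """DG(t): sum of the probabilities of all tasks in time step t."""
    return sum(probability(n, step) for n in EXEC)

dg1 = distribution_graph(1)
print(float(dg1))
```

Summed over all 34 time steps, the DG values add up to the total execution time of the task set.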
Step 7: Schedule the task in a time step
At this point, the current task has to be scheduled starting at a particular time step and 
ending at another. This is necessary in order to determine the change in probability of the 
task in each time step when scheduled.
Change in probability of a task C_j is denoted by δP(C_j).
If the current task is scheduled to execute in time step t_i, then the change in probability is
$$\delta P(C_j, t_i) = 1 - P(C_j, t_i)$$
If the current task is not scheduled to execute in time step t_i, then the change in 
probability is
$$\delta P(C_j, t_i) = 0 - P(C_j, t_i)$$
Step 8: Self Effect of the nodes
Here the algorithm tries to balance the distribution graph by calculating the effect of 
each task to time step assignment and then selects the smallest effect.
Self effect of task C_j in a time step t_i is
$$SE(C_j, t_i) = DG(t_i) \times \delta P(C_j, t_i)$$
The total self effect for a task is given by the summation of its self effects over all 
time steps:
$$SE(C_j) = \sum_i SE(C_j, t_i)$$
Step 9: Successor Effect
Scheduling the current task directly affects the scheduling of the successor tasks. 
Hence the successor effect is also considered while determining the total effect (also 
referred to as the total force) of a task. The successor effect is found in the same manner 
as the self effect.
Step 10: Total Effect
The total effect of a task determines the feasibility of scheduling the task in that 
particular time step. The least effect determines the step in which the task has to be 
executed. This effect signifies that the current task will use fewer resources if scheduled 
in a time step where the self effect is least and vice versa.
Total effect of a task C_j is
$$F(C_j, t_i) = SE(C_j, t_i) + \text{Successor Effect}$$
After finding the total effect, the task is scheduled in the time step where the effect is 
least. Steps 7, 8, 9 and 10 are repeated for all the tasks in the task set and a final schedule 
is obtained as a result.
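Steps 7 and 8 can be illustrated with a small sketch that evaluates the total self effect of each candidate begin time and picks the least. The DG and probability values below are made-up toy numbers, not data from the thesis, and the successor effect of Step 9 is left out; the function name is hypothetical.

```python
from fractions import Fraction as F

def self_effects(dg, prob, exec_units, asap_begin, alap_begin):
    """Total self effect of the task for every candidate begin time in its mobility range."""
    effects = {}
    for begin in range(asap_begin, alap_begin + 1):
        occupied = range(begin, begin + exec_units)
        total = F(0)
        for step, p in prob.items():
            target = 1 if step in occupied else 0   # probability after fixing the task
            total += dg[step] * (target - p)        # DG(t) x change in probability
        effects[begin] = total
    return effects

# Toy data (hypothetical): a one-step task that may run in step 1, 2 or 3.
DG = {1: F(4, 3), 2: F(1, 3), 3: F(5, 6)}
PROB = {1: F(1, 3), 2: F(1, 3), 3: F(1, 3)}

effects = self_effects(DG, PROB, exec_units=1, asap_begin=1, alap_begin=3)
best_begin = min(effects, key=effects.get)
print(effects, best_begin)
```

In this toy case the least effect falls on step 2, the step with the lowest DG value, which is the balancing behavior Step 8 describes.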
Step 11: Voltage Scaling
This step is a very important step in this algorithm as this incorporates voltage scaling 
into the schedule obtained so far.
V = {V_1, V_2, V_3, ..., V_k} are the prespecified set of voltages at which a task can 
execute.
for (all tasks in the existing schedule) 
{
    k = p; // p is the number of voltage levels available for processors 
    modified execution time e_C' = e_C × (V_1 / V_k);
    modified h_C' = g_C + e_C'; 
    if (h_C' < d_C)
    {
        repeat steps 5, 6, 7, 8, 9 and 10;
        In step 9, the successors are assumed to be operating at the highest voltage 
        possible;
    }
    else
        k = k-1;
    k = k-1;
}
The pseudo code shown above performs the voltage scaling of each task and the total 
effect is calculated for different voltages. The least effect determines the voltage at which 
the task will be executed. After a task is fixed in a voltage, the corresponding DG is 
updated. This process is repeated for all the tasks and each one is scheduled for execution 
at a particular voltage and the DG is updated dynamically to get an optimized schedule.
Step 12: Number of processors
The final step in this algorithm is to find the number of processors required to execute 
the scheduled task set. By the end of scheduling, all the tasks are ready to be executed at 
a specified voltage. The pseudo code to find the number of processors is given below. 
for (each voltage)
{
    allot node with earliest start time to one processor 
    for (all tasks running at that voltage)
    {
        if (g_C > h_i of all tasks T_i allotted to processor P_j ∈ P) 
            allot C to P_j 
        else
            allot C to new processor P_k where k = |P| + 1
    }
}
This algorithm finally returns a schedule in which the task voltages are scaled making 
sure none of the tasks miss their deadlines and these tasks are executed on processors in 
such a way that the utilization of the processors is high.
4.2.4 Energy - Probability Based Scheduling (E-PbS) Algorithm
The PbS algorithm discussed so far results in a schedule that has reduced concurrency 
of tasks in a timestep in order to minimize the number of resources required. But this 
does not involve any component that tries to reduce the energy. To achieve this, an 
algorithm Energy-Probability based Scheduling (E-PbS) has been developed in this 
thesis. This is a result of slight variation of PbS algorithm where in a factor of voltage is
included in calculating the total effect of a task. Steps 1 to 10 are followed as described in 
section 4.2.3.
Step 11: Energy factor
From step 10, the total effect of the task at the highest voltage among the prespecified 
set of voltages is determined. Voltage scaling is performed on all the tasks in the schedule
to determine the total effect of the task at different voltages. The pseudo code for this is
similar to that given in Step 11 of Section 4.2.3.
The total effects so determined are at different voltages. For example, effect F_1 is at
voltage V_1 and effect F_2 is at voltage V_2. Comparing these effects directly would be difficult
because they are at different voltages. To compare and find the least effect, these effects are 
divided by the voltage at which they operate.
F_1 → Total effect at voltage V_1
F_2 → Total effect at voltage V_2
F_1/V_1 → Total effect per volt at V_1 
F_2/V_2 → Total effect per volt at V_2 
Now F_1/V_1 and F_2/V_2 are comparable. Normalizing the fractions, we get the 
following:
F_1(V_2/V_1) → Normalized effect at voltage V_1
F_2 → Normalized effect at voltage V_2 
The least effect determines the voltage at which the task will be executed. After a task 
is fixed in a voltage, the corresponding DG is updated. This process is repeated for all the 
tasks and each one is scheduled for execution at a particular voltage and the DG is 
updated dynamically to get an optimized schedule.
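The normalization of Step 11 can be sketched as below. The effect values and the two voltage levels are made-up numbers chosen only to show the arithmetic, and generalizing the reference voltage to an argument is an assumption; the text above only gives the two-voltage case, in which V_2 is the reference.

```python
def normalized_effects(effects_by_voltage, reference_voltage):
    """Scale each total effect F_k by reference_voltage / V_k so the values are comparable."""
    return {v: f * reference_voltage / v for v, f in effects_by_voltage.items()}

# Hypothetical total effects of one task at two voltage levels (V_1 = 5.0, V_2 = 2.5).
effects = {5.0: 3.0, 2.5: 1.0}

normalized = normalized_effects(effects, reference_voltage=2.5)
chosen_voltage = min(normalized, key=normalized.get)   # least normalized effect wins
print(normalized, chosen_voltage)
```

Here F_1(V_2/V_1) = 3.0 × 2.5/5.0 = 1.5 and F_2 = 1.0, so the task would be fixed at V_2 in this toy case.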
Step 12: This is the final step in E-PbS algorithm and is similar to step 12 in PbS 
algorithm.
4.3 An Example illustrating the algorithms
The example here explains the methodology of the algorithm. Figure 9 is an example 
task graph consisting of ten nodes with edges between them depicting the dependencies 
between the tasks. The start and sink nodes have null execution time. This task graph is in 
the Standard Task Graph (STG) format. This is represented in the form of a table as in 
Table 1. The deadline and slack of each node are calculated by backtracking as described in 
Step 2 of Section 4.2.1.
Fig 9 Standard Task Graph Example
Table 1 Tabular form of STG Graph shown in Figure 9
Node#  Exe-Time  #of Predec  Predec.No  Start-Time  End-Time  Deadline  Slack
0      0         0           0          0           0         0         11.5
1      9         1           0          0           9         20.5      11.5
2      4         1           1          9           13        34.5      21.5
3      3         1           1          9           12        23.5      11.5
4      6         1           0          0           6         26.5      20.5
5      4         1           0          0           4         26.5      22.5
6      3         2           4,5        6           9         29.5      20.5
7      3         1           6          9           12        32.5      20.5
8      8         1           0          0           8         23.5      15.5
9      9         2           3,8        12          21        32.5      11.5
10     2         2           7,9        21          23        34.5      11.5
11     0         2           2,10       23          23        34.5      11.5
4.3.1 SaS
All the processors are assumed to be operating at one of the voltages in the voltage set 
{5V; 3.3V; 2.4V}. Eliminating the nodes with execution time of zero and applying the 
voltage scaling to the nodes sorted with respect to their slack, we get Table 2. Node 1 has 
an initial execution time of 9. It is then multiplied by a factor of (5/2.4) and then the start 
and end times of all the successor nodes of 1, in this case {2; 3; 9; 10}, are changed and 
checked to see if they meet their respective deadlines. In this case, none of the nodes 
misses its deadline; hence node 1 is allotted a voltage of 2.4V. If any node had missed its 
deadline, we would go back to node 1, change the scaling factor to (5/3.3) and update all 
the start and end times again to see if they meet their deadlines. If again any node 
missed its deadline, the original node would be set to the default voltage of 5V. The algorithm proceeds in
this manner and is repeated for each node.
Table 2 Voltage scaling of each node
Node#  Slack  ExeTime  Mod ExeTime         Mod StartTime  Mod EndTime  Comment               Voltage
1      11.5   9        9x(5/2.4) = 18.75   0              18.75                              2.4
3      11.5   3        3x(5/2.4) = 6.25    18.75          25           Misses deadline
                       3x(5/3.3) = 4.54    18.75          23.29                              3.3
9      11.5   9                            23.29          32.29        Cannot scale further  5
10     11.5   2                            32.29          34.29        Cannot scale further  5
8      15.5   8        8x(5/2.4) = 16.67   0              16.67                              2.4
4      20.5   6        6x(5/2.4) = 12.5    0              12.5                               2.4
6      20.5   3        3x(5/2.4) = 6.25    12.5           18.75                              2.4
7      20.5   3        3x(5/2.4) = 6.25    18.75          25                                 2.4
2      21.5   4        4x(5/2.4) = 8.33    18.75          27.08                              2.4
5      22.5   4        4x(5/2.4) = 8.33    0              8.33                               2.4
Voltage scaling of all the nodes results in three partitions. In this case, the first 
partition operates at 5V, the next one operates at 3.3V and the last one operates at 2.4V. 
Scheduling algorithm is then applied to these partitions as shown in figure 10. By scaling 
the voltages, we make sure all the tasks meet their deadlines.
SaS scheme is applied to the example task set and the scheduling can be seen in 
figure 10. The total number of processors required with this scheme is 6. An assumption 
here is that each processor runs at a preset voltage. This leads to increase in the number 
of processors and also large idle time of the multiprocessor system. The total number of 
processors can be reduced by using processors that have voltage shift ability.
[Figure: processor timelines for each voltage partition. Panel annotations:
Scheduling the nodes in 2.4V partition, total number of processors required = 4;
Scheduling the nodes in 3.3V partition, total number of processors required = 1;
Scheduling the nodes in 5.0V partition, total number of processors required = 1]
Fig 10 Scheduling after Scaling
4.3.2 SbS
The SbS scheme is shown in Figure 11. The total number of processors required by 
this scheme is 4. The scaling here proceeds by scaling each node on each processor and 
checking that neither the current node, nor any of its successors, nor any of the nodes 
executing on the same processor as the current node misses its deadline. One 
important thing to notice in this scaling is that all the nodes on one processor are scaled to 
the same voltage, because it is assumed that a processor can operate at only one voltage 
and has no voltage shift ability.
In Figure 11, first start by choosing processor P1 and node 1. Scale node 1 by
multiplying it with a factor of (5/2.4) and check to see if it is meeting its deadline. And 
then modify the start and end times of all the successor nodes continuously making sure 
they meet their deadlines. The start and end times of the nodes on P1 that execute after 
node 1, i.e. 3, 9 and 10, are also modified. If any node in this process misses its 
deadline, we trace back to the original node 1 and decrease the scaling factor to (5/3.3). 
This procedure is repeated for the new scaling factor and the node is fixed to that voltage. 
This procedure is repeated on all the nodes and the nodes in one processor are fixed to 
execute at only one voltage.
[Figure: schedule of the task graph on the processors before scaling. Panel annotation: 
Scheduling before Scaling, Number of Processors = 4]
Fig 11 Scheduling before Scaling
Table 3 shows the tabular form of the procedure for scaling the nodes in SbS 
algorithm.
Table 3 Scaling in SbS algorithm
Processor  Node  Exe time  Mod exe time  Mod start time  Mod end time  Voltage  Comment
1          1     9         18.75         0               18.75         2.4      Nodes 2,3,9,10 meet their deadlines after modifying start and end times
           3     3         6.25          18.75           25            2.4      Misses deadline
           1     9         13.59         0               13.59         3.3      Node# 10 misses its deadline after modifying start and end times
           1     9         9             0               9             5        Nodes 2,3,9,10 meet their deadlines after modifying start and end times
2          8     8         16.64         0               16.64         2.4      Nodes 9,10 and 7 meet their deadlines after modifying start and end times
           7     3         6.24          16.64           22.88         2.4      Node 10 meets its deadline
3          4     6         12.48         0               12.48         2.4      Nodes 2,6,7,10 meet their deadlines after modifying the start and end times
           6     3         6.24          12.48           18.72         2.4      Nodes 2,7,10 meet their deadlines after modifying the start and end times
           2     4         8.32          18.72           27.04         2.4      Meets its deadline
4          5     4         8.32          0               8.32          2.4      All the successor nodes meet their deadlines
[Figure: final SbS schedule. Panel annotations: Processor P1 running at 5.0V; 
Processors P2, P3, P4 running at 2.4V]
Fig 12 Scaled processors in SbS algorithm
Figure 12 shows the final schedule obtained through SbS algorithm. As can be seen, 
processor P1 runs at 5V and all other processors are scaled to run at 2.4V.
4.3.3 Probability based Scheduling
The first two steps in this algorithm are to determine the ASAP and ALAP schedules 
of the task graph. Here the final deadline of the graph is 34.5 and hence, the whole graph 
is divided into 34 time steps with each time step equal to 1 time unit of execution. Figure 
13 shows the ASAP and ALAP schedules for the example benchmark 10.stg. These 
schedules are required to calculate the mobility of each task. Mobility is calculated by the 
formula given in Step 4 of Section 4.2.3. Table 4 gives the mobility of each task along 
with the ASAP and ALAP start times of each task.
Table 4 Table showing ASAP and ALAP start times and Mobility of each task
Node  Execution units  ALAP  ASAP  Mobility (μ+1)  Deadline
[1]   9                12    1     12              20.5
[2]   4                31    10    22              34.5
[3]   3                21    10    12              23.5
[4]   6                21    1     21              26.5
[5]   4                23    1     23              26.5
[6]   3                27    7     21              29.5
[7]   3                30    10    21              32.5
[8]   8                16    1     16              23.5
[9]   9                24    13    12              32.5
[10]  2                33    22    12              34.5
The next step is to determine the probabilities of occurrence of each task in each time 
step covered by the mobility of the task. These probabilities are determined by using the 
formula given in Step 5 in Section 4.2.3. As an example, the probabilities for task 1 are 
given in Table 5.
[Figure: two panels, ASAP SCHEDULE 5.0V and ALAP SCHEDULE 5.0V, drawn against a vertical 
Time Step axis running from 1 to 34]
Fig 13 ASAP and ALAP schedules for 10.stg
After determining the probabilities of each task in each control step, the next step is to 
find the distribution graph (DG) for each time step, which is done by summing the 
probabilities of all task occurrences in that time step. Table 6 gives one such DG for 
time step 1. Similarly, the DGs for each time step are determined.
Then the self force of the task is determined by scheduling the task in a control step 
and finding the variation in probability as given by the formula in Steps 7 and 8 of Section 
4.2.3. The total self force is again the summation of all self forces of a task. Due to the 
large amount of data, the self forces of the tasks are not shown here. In the next step, the 
successor forces are calculated in a similar manner to the self force. The total force of a
task is then the sum of its self force and its successor forces. Throughout this 
procedure, the tasks are assumed to be running at 5V.
Table 5 Probabilities for task 1 from ASAP begin till ALAP end
Time Step   Probability
1           0.0833333
2           0.166667
3           0.25
4           0.333333
5           0.416667
6           0.5
7           0.583333
8           0.666667
9           0.75
10          0.75
11          0.75
12          0.75
13          0.666667
14          0.583333
15          0.5
16          0.416667
17          0.333333
18          0.25
19          0.166667
20          0.0833333
Table 6 Distribution Graph for Time Step 1
Node#                     Probability
[1]                       0.0833333
[4]                       0.047619
[5]                       0.0434783
[8]                       0.0625
Total DG in Time Step 1   0.2369306
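The values in Tables 5 and 6 can be reproduced with a short Python sketch. It assumes that every feasible start step between ASAP and ALAP is equally likely, so that the probability of a task occupying a time step is the number of start positions covering that step divided by the mobility. This is the usual force-directed formulation and appears consistent with the tabulated values, but the Step 5 formula itself is not restated in this section. The helper name `step_probabilities` is illustrative.

```python
def step_probabilities(asap, alap, exec_units):
    """Map each time step to the probability that the task occupies it,
    assuming a uniform distribution over start steps asap..alap."""
    mobility = alap - asap + 1
    probs = {}
    for step in range(asap, alap + exec_units):
        first_start = max(asap, step - exec_units + 1)  # earliest start covering step
        last_start = min(alap, step)                    # latest start covering step
        probs[step] = (last_start - first_start + 1) / mobility
    return probs

# (ASAP, ALAP, execution units) from Table 4 for the tasks whose ASAP start is 1.
tasks = {1: (1, 12, 9), 4: (1, 21, 6), 5: (1, 23, 4), 8: (1, 16, 8)}

task1 = step_probabilities(*tasks[1])  # profile of task 1 (compare Table 5)

# Distribution graph value for time step 1: sum over tasks that can occupy it.
dg_step1 = sum(step_probabilities(*t)[1] for t in tasks.values())
print(round(task1[13], 6), round(dg_step1, 7))
```

Only tasks 1, 4, 5 and 8 can occupy time step 1, which is why Table 6 lists exactly those four nodes.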
The next major step is to scale the voltages of the tasks in such a way that neither the 
task nor its successors miss their respective deadlines. The schedule obtained so far is at 
5V. Each task in this schedule is then scaled by a factor of (5/3.3) and (5/2.4) and the
total force for each task is determined for both the scaling factors. The least force among 
the three forces at 5V, 3.3V and 2.4V decides the voltage at which the task is to be executed. 
While scaling a task, all the successors of the current task are assumed to be operating at 
5V. After fixing a task at a certain voltage, the DG for each time step is updated and the 
procedure is repeated for the next available node. The result of this dynamically updated 
algorithm is an optimized schedule which uses a reduced number of resources. On the 
other hand, the complexity of this algorithm is 3n^. The last and final step in this 
algorithm is to determine the number of processors required to accommodate all the tasks 
in the schedule. The processors here are assumed to operate at only one particular 
voltage, without the ability to shift voltages. This is accomplished as explained in Step 12 
of Section 4.2.3. The final schedule for the benchmark program 10.stg is shown in 
Figure 14.
4.3.4 E-PbS
The total forces of a task at 5.0V, 3.3V and 2.4V are represented by f5, f3.3 and f2.4 
respectively. After determining the total forces, the next step is to normalize the forces 
with the energy factor. The normalized forces are f5/(5.0/2.4), f3.3/(3.3/2.4) and f2.4. The 
least force among these normalized forces is selected and the task is scheduled to execute 
at the corresponding voltage. The final schedule obtained by E-PbS is shown in Figure 15.
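The difference between the PbS and E-PbS selection rules can be sketched in Python as follows. The force values used here are hypothetical numbers chosen only to show that the two rules can pick different voltages; the function names are illustrative and the computation of the forces themselves (Steps 7 and 8 of Section 4.2.3) is not reproduced.

```python
VOLTAGES = (5.0, 3.3, 2.4)

def pick_voltage_pbs(forces):
    """PbS rule: choose the voltage whose total force is the least."""
    return min(VOLTAGES, key=lambda v: forces[v])

def pick_voltage_epbs(forces, v_min=2.4):
    """E-PbS rule: divide each force by its energy factor (V / 2.4)
    and choose the voltage with the least normalized force."""
    return min(VOLTAGES, key=lambda v: forces[v] / (v / v_min))

# Hypothetical total forces of one task at each voltage (not thesis data).
forces = {5.0: 4.0, 3.3: 3.0, 2.4: 2.5}

print(pick_voltage_pbs(forces))   # least raw force
print(pick_voltage_epbs(forces))  # least normalized force
```

With these hypothetical forces the raw minimum is at 2.4V, while the normalized forces are about 1.92, 2.18 and 2.5, so the E-PbS rule selects 5.0V instead.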
[Figure: final PbS schedule of tasks T1 to T10 over time steps 1 to 34, each task labeled with its operating voltage]
Fig 14 Final PbS Schedule for 10.stg
[Figure: final E-PbS schedule of tasks T1 to T10 over time steps 1 to 34, each task labeled with its operating voltage]
Fig 15 Final E-PbS Schedule for 10.stg
CHAPTER 5
RESULTS, CONCLUSION AND FUTURE WORK 
This work has focused primarily on generating optimal schedules for task sets 
representative of RTES on a multiprocessor architecture. The optimal schedules obtained in 
turn reduce the resources required by the system and the total energy consumed by the 
system. The work presented in this thesis is a small step in what we believe is the right 
direction; however, there is much more to be done. In the next section, a 
summary of what was done is presented. This thesis concludes with future directions 
for RTES scheduling with communication costs between the tasks and with processors 
that have voltage shift ability.
5.1 Results
This chapter summarizes the results obtained by running the proposed algorithms on 
standard benchmarks. The four proposed algorithms produce four different schedules for 
a given scheduling problem. The main problem addressed here is the 
number of processors available for scheduling a system. Initially, the number of 
processors is unknown, and all four algorithms progressively determine the required 
number of processors. A comparison of the number of processors and their utilization in 
each of the algorithms is given in the later part of this chapter.
The other point of focus in this chapter is the energy consumed by the system, 
which is an important factor in the design and implementation of RTES. The energy 
consumed by the four algorithms while running the standard benchmarks is also reported 
in this chapter. The slack of each processor is also determined to find the efficiency of 
each algorithm. What follows is a brief discussion of the whole thesis, followed by a 
discussion of the input benchmarks, the results and directions for future work.
5.2 Conclusion
Four different algorithms for scheduling and energy minimization of RTES by 
voltage scaling have been proposed in this thesis. The thesis starts with an 
introduction to RTES in chapter 1. All relevant definitions and a classification of existing 
scheduling techniques are given in chapter 2. An extensive survey of scheduling 
algorithms proposed to reduce energy consumption in RTES is presented in chapter 3. 
The four algorithms, Scheduling after Scaling (SaS), Scheduling before Scaling (SbS), 
Probability based Scheduling (PbS) and Energy-Probability based Scheduling (E-PbS), 
are described in detail in steps, followed by the explanation of an example benchmark, in 
chapter 4. E-PbS is a variation of PbS. The worked example helps the reader to 
understand the algorithms clearly. The 
pseudo code for parts of the algorithms is provided wherever necessary. The current 
chapter forms the final chapter of this thesis, with detailed results, conclusion and 
directions for future work provided in the following subsections.
5.2.1 Benchmarks
The Standard Task Graph (STG) template has been used as the input benchmark
throughout the course of development of the scheduling algorithms.
5.3 Assumptions made
Following is a list of assumptions made during the course of development of the 
proposed SaS, SbS, PbS and E-PbS algorithms:
• All the processors operate at a prespecified set of voltages (5.0V, 3.3V and 2.4V 
in this thesis).
• All the processors operate at the same constant frequency.
• Processors cannot shift from one voltage level to another while executing 
a task set.
• The switching capacitance is constant and the same for all the 
processors.
• All the tasks are assumed to be aperiodic, dependent tasks.
5.4 Simulation Results
The simulation results obtained by running STG benchmarks are presented in this 
section. All four scheduling algorithms are completely implemented in C++ on a Linux 
platform. The benchmarks used vary in size; the number of nodes varies from 7 to 500.
Table 7 shows the number of processors required by the system when the four 
algorithms are implemented separately.
Table 7 Number of processors required
                    SaS                     SbS                     PbS                     E-PbS
Benchmark   Nodes   5.0V 3.3V 2.4V Total    5.0V 3.3V 2.4V Total    5.0V 3.3V 2.4V Total    5.0V 3.3V 2.4V Total
7.stg           7      1    1    3     5       1    0    3     4       2    2    2     6       2    1    3     6
10.stg         10      1    1    4     6       1    0    3     4       2    1    2     5       2    0    3     5
50.stg         50      2    1   13    16       1    0   14    15       7    3    6    16       6    3    7    16
sparse.stg     96      1    1   62    64       1    0   61    62      12    3   44    59      14    0   47    61
100.stg       100      3    5   46    54       2    3   41    46      19    7   15    41      20    3   18    41
300.stg       300      1    0  120   121       1    0  122   123      49   15   57   121      53    8   64   125
500.stg       500      6    4  205   215       2    2  201   205      79   33   87   199      91   20   96   207
The first column gives the names of the STG benchmarks used for the purpose of 
comparison. The second column gives the number of nodes in each benchmark. In the 
STG benchmarks, as the number of nodes increases, the complexity of the graph increases. 
The rest of the table is divided into four main columns, each one representing one scheduling 
algorithm. Each of these main columns is then divided into four sub columns. The first of 
these sub columns gives the number of processors running at 5.0V. The second sub 
column gives the number of processors running at 3.3V and the third sub column gives 
the number of processors running at 2.4V. Finally, the fourth sub column gives the total 
number of processors required by the system to execute the benchmark program.
[Figure: Number of Processors Required. X-axis: Number of Tasks (7, 10, 50, 96, 100, 300, 500). Series: SaS, SbS, PbS, E-PbS]
Fig 16 Comparison chart depicting No. of processors vs. No. of tasks
Figure 16 is a comparison of the total number of processors required to execute the 
standard benchmarks. Here, we can observe that all scheduling policies use a similar number 
of resources, with PbS using the least among them on the larger benchmarks. The number of 
processors required to execute the tasks increases linearly with the number of tasks.
[Figure: No. of Processors by Voltage Partitions for SaS. X-axis: Number of Tasks. Series: 5, 3.3, 2.4]
Fig 17 Division of total number of processors for SaS Algorithm by Voltage
Figure 17 shows the number of processors running at the 5V, 3.3V and 2.4V 
voltage levels. As can be seen, the number of processors is high for the 2.4V partition, which 
tells us that more nodes have been scaled to execute at the lowest voltage. 
Figure 18 shows a similar trend for the SbS algorithm.
[Figure: Number of Processors by Voltage Partition for SbS. X-axis: Number of Tasks (7, 10, 50, 96, 100, 300, 500). Series: 5, 3.3, 2.4]
Fig 18 Division of total number of processors for SbS Algorithm by Voltage
[Figure: Number of Processors by Voltage Partitions for PbS. X-axis: Number of Tasks (7, 10, 50, 96, 100, 300, 500). Series: 5, 3.3, 2.4]
Fig 19 Division of total number of processors for PbS Algorithm by Voltage
The number of processors running at the three different voltages under the PbS 
and E-PbS algorithms is shown in Figure 19 and Figure 20 respectively. A very important 
observation that can be made is that the total number of processors is distributed more evenly 
across all the voltage levels, unlike the SaS and SbS algorithms, where more 
processors were scheduled to run at the 2.4V level.
[Figure: No. of Processors by Voltage Partitions for E-PbS. X-axis: Number of Tasks. Series: 5, 3.3, 2.4]
Fig 20 Division of total number of processors for E-PbS Algorithm by Voltage
Table 8 shows the energy consumed by the multiprocessor system when the proposed 
algorithms are implemented. The unit of energy here depends on the unit of execution 
time of the tasks in the graph. A typical unit for execution time in an RTES would be 
nanoseconds, in which case the unit for energy would be nanojoules.
Table 8 Energy consumed by the processors while implementing the algorithms
                    SaS                                      SbS
Benchmark   Nodes   5.0V    3.3V     2.4V      Total         5.0V    3.3V      2.4V      Total
7.stg           7   125     181.17   204       510.17        450     0         192       642
10.stg         10   275     49.5     406.56    731.06        575     0         336       911
50.stg         50   975     372.9    2033.28   3381.18       1375    0         2484      3859
sparse.stg     96   1500    370.26   21438.7   23308.96      3050    0         21768     24818
100.stg       100   1000    326.7    5338.56   6665.26       1200    1044.78   5184      7428.78
300.stg       300   1325    0        31188.5   32513.5       2350    0         35520     37870
500.stg       500   3600    773.19   52069.4   56442.59      4150    2231.79   59220     65601.79

                    PbS                                      E-PbS
Benchmark   Nodes   5.0V    3.3V     2.4V      Total         5.0V    3.3V      2.4V      Total
7.stg           7   300     261.4    92.16     653.52        300     152.46    167.04    619.5
10.stg         10   650     141.6    213.12    1004.69       500     0         385.92    885.92
50.stg         50   3300    762.3    1094.4    5156.7        2850    642.51    1411.2    4903.71
sparse.stg     96   16650   1416     14331     32396.6       15500   0         15897.6   31397.6
100.stg       100   6850    1906     2056.3    10812.07      6775    359.37    3202.56   10336.93
300.stg       300   41100   4421     14106     59627.54      35150   1611.7    19100.2   55861.92
500.stg       500   71600   11151    21174     103925.2      62725   4192.7    30585.6   97503.25
Column 1 lists the names of the benchmark programs used to implement the 
algorithms. The second column in the table above gives the number of nodes in the 
benchmarks. Each algorithm is marked by a different color and each of them is divided 
into four columns. The first column under each algorithm gives the energy consumed by 
the processors running at 5.0V, the second column gives the energy consumed by the 
processors running at 3.3V and the third column gives the energy consumed by the 
processors running at 2.4V. The fourth column under each algorithm, headed ‘Total’, 
gives the total energy consumed by the system to execute a particular benchmark program.
Figure 21 illustrates the total energy consumed by the multiprocessor system to 
execute the benchmarks with varying number of nodes. From the figure, we can conclude 
that the PbS algorithm consumes the highest energy, followed by the E-PbS algorithm. The SaS 
and SbS algorithms consume less energy, with SbS consuming more than SaS.
[Figure: Energy Comparison. X-axis: Number of Tasks. Series: SaS, SbS, PbS, E-PbS]
Fig 21 Energy Comparison of Number of Tasks vs. Energy Consumed
The energy consumed by the processors at different voltages for the SaS and SbS 
algorithms is shown in the charts in Figures 22 and 23. Both algorithms follow a similar 
trend for most of the graph. The energy consumed by the processors running at 
2.4V is higher than that of the processors running at 5V and 3.3V. Of these two algorithms, 
SaS consumes less energy, and also the least among all the proposed 
algorithms. From the graphs, we can conclude that these two algorithms constantly try to 
execute as many nodes as possible at a lower voltage.
[Figure: Energy Division by Voltage in SaS. X-axis: Number of Tasks. Series: 5.0V, 3.3V, 2.4V]
Fig 22 Energy consumed per each voltage in SaS Algorithm
[Figure: Energy Division by Voltage in SbS. X-axis: Number of Tasks]
Fig 23 Energy consumed per each voltage in SbS Algorithm
Unlike the SaS and SbS algorithms, the PbS and E-PbS algorithms consume high energy. 
This can be seen in the energy vs. number of tasks graphs depicted in Figure 24 and Figure 25.
The reason for the higher energy consumption of the system is that more 
processors are running at 5.0V. Since the energy consumed is proportional to the square 
of the voltage of execution, the energy consumed increases.
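The effect of voltage on energy can be illustrated with a small Python sketch. It is a simplified model resting on assumptions that are not restated in this section: energy is taken as capacitance times voltage squared times execution time, the switching capacitance is set to 1, and a task's execution time at a lower voltage is stretched by the factor 5/V used when scaling tasks in Section 4.3.3. The function names are illustrative.

```python
V_REF = 5.0  # voltage at which the nominal execution units are specified

def scaled_time(exec_units, voltage):
    """Execution time after scaling from 5V to the given voltage (factor 5/V)."""
    return exec_units * (V_REF / voltage)

def energy(exec_units, voltage, capacitance=1.0):
    """Assumed model: E = C * V^2 * t, with t the scaled execution time."""
    return capacitance * voltage ** 2 * scaled_time(exec_units, voltage)

# Energy for one nominal execution unit at each supported voltage.
for v in (5.0, 3.3, 2.4):
    print(v, round(energy(1, v), 2))
```

Under this model one nominal execution unit costs 25 energy units at 5.0V, about 16.5 at 3.3V and about 12 at 2.4V, so a schedule that keeps more work at 5.0V consumes more energy even though the tasks finish sooner.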
[Figure: Energy Division by Voltage in PbS. X-axis: Number of Tasks. Series: 5.0V, 3.3V, 2.4V]
Fig 24 Energy consumed per each voltage in PbS Algorithm
[Figure: Energy Division by Voltage in E-PbS. X-axis: Number of Tasks. Series: 5.0V, 3.3V, 2.4V]
Fig 25 Energy consumed per each voltage in E-PbS Algorithm
Also, the main objective of PbS and E-PbS is to reduce the number of resources being 
utilized, and there is a trade-off between energy and resources in the process.
The utilization of each processor implementing the algorithms is tabulated in Table 9.
Table 9 Average utilization of the processors
B e n c h  m ark
N o d e s S a S
S.OV Util S .S V Util 2 .4 V Util
7 . s t g 7 1 0 .9 9 S 9 7 6 1 0 .6 1 6 1 6 2 S 0 .4 S 7 2 4 S
lO .s tg 1 0 1 0 .9 8 1 7 4 S 1 0 .2 8 8 6 4 0 .S 1 1 4 7 4
S O .stg SO 2 0 .7 6 S 8 6 4 1 0 .6 9 6 4 S 6 IS 0 .S 2 9 2 9 7
s p a r s e . s t g 9 6 1 0 .6 S 4 1 4 6 1 0 .S 1 4 8 1 S 6 2 0 .S 2 8 0 4 S
lO O .stg 1 0 0 S 0 .8 1 6 9 9 2 S 0 .S 1 6 S 8 8 4 6 0 .4 S S S 0 2
SO O .stg SOO 1 0 .9 S S S 2 S 0 0 1 2 0 0 .S 2 0 0 1 6
SO O .stg SOO 6 0 .S 9 7 0 2 6 4 0 .2 9 8 2 1 2 2 0 5 0 .S 0 9 4 S 1
S b S
S.OV Util S .S V Util 2 .4 V Util
7 . s t g 7 1 0 .6 6 6 6 6 7 0 0 S 0 .4 1 1 5 2 3
lO .s tg 10 1 0 .6 6 6 6 6 7 0 0 S 0 .S 6 S 6 0 7
S O .stg SO 1 0 .6 6 6 6 6 7 0 0 14 0 .S 7 S 1 S 1
s p a r s e . s t g 9 6 1 0 .6 6 6 6 6 7 0 0 61 0 .S S 8 S 4 4
lO O .stg 1 0 0 2 0 .S 1 6 1 2 9 S 0 .6 8 7 7 S 8 41 0 .4 7 2 0 6 9
SO O .stg SOO 1 0 .6 6 6 6 6 7 0 0 1 2 2 0 .S S 8 S 4 9
SO O .stg SOO 2 0 .S 8 2 4 S 7 2 0 .7 1 9 0 8 6 201 0 .S S 8 9 S 1
P b S
S.OV Util S .S V Util 2 .4 V Util
7 . s t g 7 2 1 2 0 .8 0 4 S 4 8 2 0 .6 6 6 6 6 7
lO .s tg 1 0 2 0 .6 S 1 4 4 8 S 1 0 .S 8 8 0 6 2 0 .6 7 S 0 S 4
S O .stg SO 7 0 .4 7 S 1 0 4 S 0 .S 2 0 1 4 S 6 6 7 6 0 .4 1  S S 8 9
s p a r s e . s t g 9 6 12 0 .S 4 2 1 2 4 4 1 7 S 0 .S 0 9 S S 9 4 4 0 .8 9 9 9 8 2
lO O .stg 1 0 0 19 0 .S 7 1 8 9 9 7 S 7 7 0 .6 S 0 7 9 4 2 8 6 I S 0 .7 S 7 6 1 6
SO O .stg SOO 4 9 0 .S 7 2 8 1 0 0 6 1 1 5 0 .4 1 8 2 S 9 2 5 7 0 .7 S 1 1 9 2
SO O .stg SOO 7 9 0 .6 1 4 S I 8 S 0 6 S S 0 .4 6 9 1 7 S 8 4 8 8 7 0 .6 8 7 7 7 1
E -P b S
S.OV Util S .S V Util 2 .4 V Util
7 . s t g 7 2 1 1 0 .6 0 8 6 9 6 S 0 .7 7 7 7 7 8
tO .s tg 1 0 2 0 .8 S S 7 0 0 S 0 .6 6 6 6 6 7
SO .stg SO 6 0 .7 1 6 4 4 1 s 0 .7 9 4 S S 1 7 0 .4 2 9 4 4 8
s p a r s e . s t g 9 6 14 0 .9 S 4 7 S 4 0 0 4 7 0 .8 9 9 S 4 7
lO O .stg 1 0 0 2 0 0 .9 1 8 6 6 7 s 0 .9 S S 0 8 6 1 8 0 .8 4 8 8 7 4
SO O .stg SOO SS 0 .9 8 6 4 6 S 8 0 .9 S 0 0 4 7 6 4 0 .7 6 1 9 3 1
SOO.stg SOO 91 0 .9 4 1  S 8 S 2 0 0 .8 8 7 2 1 9 9 6 0 .7 8 2 7 4 9
One of the main parameters that signify the efficiency of the scheduling algorithms is 
the utilization of the resources. The energy table and graphs indicate that the PbS and E-PbS 
algorithms consume very high energy compared to the SaS and SbS algorithms. The average 
utilization of the processors running at the various voltage levels while executing the standard 
benchmarks used in this thesis is tabulated in Table 9. The first column in the table gives 
the names of the benchmarks used to simulate the program. The second column gives the 
number of nodes in each benchmark program. The four algorithms are shown in four 
different colors, starting with SaS and SbS and followed by PbS and E-PbS. The table for each 
algorithm is divided into six columns. The first column shows the number of processors 
working at 5.0V. The second column gives the average utilization of all those processors. 
The third column gives the number of processors running at 3.3V and their average 
utilization is given in column four. Similarly, the fifth and sixth columns give the 
number of processors and their average utilization at 2.4V. From this table, we can 
conclude that processors in the E-PbS algorithm have a higher utilization factor. This shows 
that the algorithm employs all the processors more efficiently to find a feasible 
schedule. This is again seen in Figure 26, in which E-PbS clearly utilizes 
the processors efficiently.
[Figure: Processor Utilization Comparison. X-axis: Number of Tasks (7, 10, 50, 96, 100, 300, 500). Series: SaS, SbS, PbS, E-PbS]
Fig 26 Comparison of Processor utilization vs. Number of tasks
[Figure: Processor Utilization at 5.0V. X-axis: Number of tasks (7, 10, 50, 96, 100, 300, 500). Series: SaS, SbS, PbS, E-PbS]
Fig 27 Comparison of Processor utilization at 5.0V for each algorithm
[Figure: Processor Utilization at 3.3V. X-axis: Number of tasks]
Fig 28 Comparison of Processor utilization at 3.3V for each algorithm
Figures 27, 28 and 29 compare the utilization of the processors by the SaS, SbS, PbS and 
E-PbS algorithms at 5.0V, 3.3V and 2.4V. In all the cases, it can be easily concluded that 
E-PbS and PbS have a higher utilization of the processors than the SaS and SbS algorithms.
[Figure: Processor Utilization at 2.4V. X-axis: Number of Tasks. Series: SaS, SbS, PbS, E-PbS]
Fig 29 Comparison of Processor utilization at 2.4V for each algorithm
[Figure: Comparing Average Utilization. X-axis: Algorithms (SaS, SbS, PbS, E-PbS)]
Fig 30 Average Utilization of the processors expressed in percentage
By comparing the number of processors required and the energy consumed by the 
processors, we can conclude that SaS and SbS consume less energy, while PbS and 
E-PbS use a reduced number of resources to schedule the task sets.
[Figure: Comparison of Energy / Processor. X-axis: Number of Tasks (7, 10, 50, 96, 100, 300, 500). Series: SaS, SbS, PbS, E-PbS]
Fig 31 Comparison chart for Energy consumed per processor
[Figure: Comparison of Energy/Processor. X-axis: Algorithms (SaS, SbS, PbS, E-PbS). Legend: Comparison of Power/Processor]
Fig 32 Bar graph showing the energy per processor consumed by each algorithm
The graphs studied till now compare the scheduling algorithms on parameters that do 
not depict the efficiency of the algorithm. The processor utilization graphs show that the 
E-PbS algorithm is highly successful in utilizing the processors. Figure 31 and Figure 33 
compare the four algorithms on the basis of a parameter which shows the efficiency of 
the algorithm. A comparison of energy per processor tells us that SaS consumes the least 
energy per processor while executing the task set. The final graph compares the product 
of the energy and 1/utilization for all the algorithms. An algorithm is efficient if the energy 
consumed by it is low and its utilization is high. So, the smaller the product of energy and 
1/utilization, the better the algorithm. From Figure 33, we can observe that PbS and 
E-PbS fare well on this front.
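The two comparison parameters can be written out as a short Python sketch. The energy totals and processor counts below are the 10.stg entries of Tables 8 and 7; the function names are illustrative, the efficiency parameter is taken as (energy per processor) x (1/utilization), and the utilization value passed in the last call is a hypothetical placeholder rather than a figure from Table 9.

```python
def energy_per_processor(total_energy, processors):
    """Total energy of a schedule divided by the number of processors it uses."""
    return total_energy / processors

def efficiency_metric(total_energy, processors, utilization):
    """(Energy per processor) * (1 / utilization); smaller is better."""
    return energy_per_processor(total_energy, processors) / utilization

# (total energy, total processors) for 10.stg, taken from Table 8 and Table 7.
results_10stg = {
    "SaS": (731.06, 6),
    "SbS": (911.0, 4),
    "PbS": (1004.69, 5),
    "E-PbS": (885.92, 5),
}

per_proc = {name: energy_per_processor(e, p) for name, (e, p) in results_10stg.items()}
lowest = min(per_proc, key=per_proc.get)
print(per_proc, lowest)

# Hypothetical utilization of 0.5, used only to show how the metric is formed.
print(efficiency_metric(100.0, 4, 0.5))
```

For 10.stg this gives the lowest energy per processor for SaS, in line with the observation above.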
[Figure: Comparison of Energy - Utilization. X-axis: Number of Tasks (7, 10, 50, 96, 100, 300, 500). Series: SaS, SbS, PbS, E-PbS]
Fig 33 Graph showing the product of Energy and Utilization
[Figure: Comparison of Energy * Deadline. X-axis: Algorithms (SaS, SbS, PbS, E-PbS). Legend: (Power/Processor)*(1/Utilization)]
Fig 34 Bar graph depicting the efficiency of each algorithm
5.5 Directions for future work
With scheduling in RTES being one of the challenging problems, the work done as 
part of this thesis leaves us with many unexplored areas. The algorithms implemented 
work for systems with as many as 1000 nodes. These algorithms can be extended to 
systems with an even higher number of tasks.
We assume that the processors implementing the SaS, SbS, PbS and E-PbS algorithms 
can operate at one particular voltage and frequency. It will be interesting to see how the 
algorithms perform on processors which can operate at different voltage levels and 
frequencies. Processors with voltage shift ability are available in the market and can be 
used to simulate the algorithms.
The four algorithms developed in this thesis minimize the energy and resources 
by scaling the voltage of operation. Including frequency scaling in the algorithms 
would be an interesting area of research. We expect the energy consumed to reduce 
further by scaling the frequencies of operation.
Further, these scheduling algorithms can be implemented on a multiprocessor 
architecture to observe their real time performance. Another area of interest is to schedule 
task sets that have communication cost and delay between the tasks. Also, the tasks are 
assumed to be aperiodic and dependent. Further research is possible in the area of 
implementing the algorithms proposed in this thesis for periodic task sets.
VITA
Graduate College 
University of Nevada, Las Vegas
Naveen Babu Anne
Local Address:
969 East Flamingo Road
Apartment 122
Las Vegas, Nevada 89119
Home Address:
Quarter 4 Agricultural Research Station
Ragolu, Srikakulam
Andhra Pradesh, India 532484
Degrees:
Bachelor of Engineering, Electrical Engineering, 2002 
Malaviya National Institute of Technology, Jaipur 
University of Rajasthan
Publications:
1. Three and Four-dimensional Parity-check Codes for Correction and Detection 
of Multiple Errors, ITCC 2004, Las Vegas
2. Branch Prediction by Checking Loop Terminal Conditions, ISNG 2005,
Las Vegas
Thesis Title: Scheduling of real time embedded systems for resource and energy 
minimization by voltage scaling
Thesis Examination Committee:
Chairperson, Dr. Venkatesan Muthukumar, Ph.D.
Committee Member, Dr. Henry Selvaraj, Ph.D.
Committee Member, Dr. Emma Regentova, Ph.D.
Graduate Faculty Representative, Dr. Ajoy K. Datta, Ph.D
