Task scheduling for a real time multiprocessor by Jordan, J. W.
NASA TECHNICAL NOTE 
NASA TN D-5786

TASK SCHEDULING FOR A REAL TIME MULTIPROCESSOR

by John W. Jordan
Electronics Research Center 
Cambridge, Mass. 02139

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION, WASHINGTON, D. C., MAY 1970
https://ntrs.nasa.gov/search.jsp?R=19700017975 2020-03-23T19:42:19+00:00Z
1. Report No.: NASA TN D-5786
2. Government Accession No.:
3. Recipient's Catalog No.:
4. Title and Subtitle: Task Scheduling for a Real Time Multiprocessor
5. Report Date: May 1970
6. Performing Organization Code:
C-111
7. Author(s): John W. Jordan
9. Performing Organization Name and Address: Electronics Research Center, Cambridge, Mass.
10. Work Unit No.: 125-23-07-06
12. Sponsoring Agency Name and Address: National Aeronautics and Space Administration
Technical Note
14. Sponsoring Agency Code:
15. Supplementary Notes:
This report describes an algorithm for scheduling real-time tasks in a
multiprocessor system. The algorithm guarantees that the deadlines of
all scheduled tasks will be met. If the number of active tasks exceeds
the capability of the multiprocessor, then only the highest priority
tasks will be scheduled. The algorithm is suboptimal in that it may
fail to schedule one or more low-priority tasks which could be
accommodated. A hardware implementation of the algorithm is discussed.

17. Key Words: Real-Time Tasks Scheduling; Multiprocessor System; Algorithm; Hardware Implementation
18. Distribution Statement: Unclassified - Unlimited
19. Security Classif. (of this report): Unclassified
20. Security Classif. (of this page): Unclassified

TASK SCHEDULING FOR A REAL TIME MULTIPROCESSOR 
By John W. Jordan
Electronics Research Center
Cambridge, Massachusetts

SUMMARY

This note presents an algorithm for scheduling real time tasks in a
multiprocessor. The algorithm guarantees that the deadlines of all
scheduled tasks will be met. If the number of active tasks exceeds the
capability of the multiprocessor, then only the highest priority tasks
will be scheduled.

First, a simple scheduling algorithm is developed which guarantees
real time deadlines for periodic tasks. A second algorithm is then
derived from the first which also guarantees task deadlines but
requires fewer processor interruptions. A system load parameter is
defined and used to divide the tasks into two categories: high
priority tasks which can be scheduled with guaranteed deadlines, and
low priority tasks for which current resources are insufficient to
allow execution. At this point, the division of the total system load
among the individual processors is considered. This is an optimization
problem, but at the present time a sub-optimal solution seems
sufficient. The allocation of the total system load among the
individual processors is in terms of a "task load" parameter and does
not necessitate a consideration of real time task deadlines, thus
considerably simplifying the scheduling problem.

Nonperiodic tasks, interrupts, and a hardware implementation of the
scheduling algorithm are discussed. An appendix considers memory
access conflicts.

INTRODUCTION 
A multiprocessor is a computer system consisting of a number of
processors which can communicate through a common memory bank. Since
the organization of these units may influence the choice of scheduling
algorithms, the structure of Figure 1 will be assumed. The most
important characteristic of this system is that it is "fully
distributed" in that each processor can access any memory module.

The number of memory units and processors is not important, and the
system may be modular with a variable number of units. The procedures
(programs) which are executed by the multiprocessor are divided into
tasks. When a processor completes a task it executes a special
procedure called the executive, which determines which task will be
processed next. This note is concerned with the algorithm used by the
executive to select the next task.

Figure 1.- A Multiprocessor

The most interesting aspect of the multiprocessor configuration is the
possibility of "fail soft" operation; that is, the failure of a memory
module or processor will result in loss or degradation of some
functions while still maintaining other, more important, functions.
Achieving this "fail soft" operation is complicated by the possibility
that the failure of a processor, for example, will not only reduce the
computational resources of the system but may also impose an
additional load on the system, since special diagnostic routines must
be executed to isolate the faulty unit and one or more tasks executed
by the faulty processor may have to be "rolled back" or corrected for
possible errors. For this reason it is desirable to design an
algorithm which requires a minimum of reconfiguration of the task
structure in the event of a hardware failure. Ideally, the algorithm
would ensure the continued execution of the most important tasks and
forego execution of less important tasks. It is then possible to
introduce a degraded mode of operation with a modified task structure.

PROBLEM DEFINITION 
Each real time task must be completed within a specified time frame as
shown in Figure 2.

Figure 2.- Real Time Task

The times involved are as follows. t_a is the activation time, after
which the task can be executed. It is often dependent upon some system
event such as an external interrupt, I/O completion, or a processor
timer. t_d is the task deadline. t_p is the task period if the task is
periodic. t_s is the time when the task execution starts and t_e is
the time when it ends. For the time being, only independent periodic
tasks will be considered. These are characteristic of a real time
sampled data system. For simplicity the time frame t_F during which
the task must be processed will be taken as equal to the task period,
as shown in Figure 3. The activation time t_a is controlled by a
processor timer and is precisely known.

Figure 3.- Simplified Periodic Task

A solution of the scheduling problem requires

(1) the development of an algorithm such that those tasks chosen for
execution are completed within the specified time frame, and

(2) a selection of which of the active tasks are to be executed, since
the current number of operative processors may be incapable of
processing all the active tasks.

Since the "importance" of a task is only meaningful in the context of
the task's function, it is up to the application programmer to assign
to each task an integer variable called its priority index. Since
mission conditions may change, the priority index may also change.
Normally, however, it can be expected to be a relatively constant
value.

It is important to note that this assignment of priority is not a
determination of which task is to be executed next; it simply
represents the relative value (at the present time) of eventually
executing task A rather than task B, provided that the multiprocessor
is unable to execute both within their specified time frames.

It is easy to show an example (Appendix A) where executing tasks in
order of their priority will result in missed deadlines when the
multiprocessor is actually capable of executing all tasks
satisfactorily. Conversely, an algorithm which considers only task
deadlines will inevitably result in the execution of tasks of low
priority while tasks of higher priority miss their deadlines when the
multiprocessor is operating under an overload condition. Actually, the
constraints imposed by real time deadlines may require that tasks of
low priority be scheduled before tasks of higher priority; but the
algorithm must also consider priority so that if all the deadlines
cannot be met the tasks of highest priority will be successful. The
resolution of this potential conflict between task deadlines and task
priorities is the fundamental design problem.

It should be noted that priority is considered to be a variable
independent of other task parameters such as length, period, etc.
Correlations between task parameters may permit the use of simpler
scheduling algorithms.

IDEAL MULTIPROCESSOR 

In order to simplify the discussion of scheduling algorithms, it is
helpful to define an ideal (but unrealizable) multiprocessor. An ideal
multiprocessor behaves like a single processor and memory unit. The
addition of each further processor is equivalent to increasing the
speed of the ideal processor by a unit amount. The scheduling of an
ideal multiprocessor is then similar to multiprogramming with a single
processor. In a later section the restriction of an ideal
multiprocessor will be removed.

EXECUTIVE IMPLEMENTATION 

The executive has at least two lists -- a timer list and the 
Active Task List (ATL). The timer list is controlled by a real 
time clock in each processor as described in Ref. 1. When the 
processor clock reaches a specified time the processor moves any
tasks on the timer list which are due to be activated at that 
time to the ATL. The processor clock is then reset. As the 
processors become idle, they take the topmost task from the ATL. 
The scheduling algorithm is responsible for ordering the ATL. 
Not all the tasks in the system will have real time requirements.
These "background" tasks may be kept in a separate list ordered by
priority. They are executed after the real time tasks have been
processed.
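
The interplay of the timer list and the ATL can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Executive` and its methods are hypothetical, a single clock value stands in for the per-processor timers, and the ATL is kept in arrival order as a placeholder, since the note leaves the ordering of the ATL to the scheduling algorithm.

```python
import heapq


class Executive:
    """Toy model of the timer list and the Active Task List (ATL)."""

    def __init__(self):
        self.timer_list = []  # min-heap of (activation_time, task_name)
        self.atl = []         # active tasks, topmost task first

    def schedule(self, activation_time, task_name):
        """Put a task on the timer list until its activation time."""
        heapq.heappush(self.timer_list, (activation_time, task_name))

    def on_clock(self, now):
        """Move every task whose activation time has arrived to the ATL."""
        while self.timer_list and self.timer_list[0][0] <= now:
            _, task_name = heapq.heappop(self.timer_list)
            self.atl.append(task_name)  # placeholder ordering: arrival order

    def next_task(self):
        """An idle processor takes the topmost task, if there is one."""
        return self.atl.pop(0) if self.atl else None


executive = Executive()
executive.schedule(10, "A")
executive.schedule(5, "B")
executive.schedule(20, "C")
executive.on_clock(10)           # B and A are due, C is not
first = executive.next_task()
second = executive.next_task()
third = executive.next_task()    # the ATL is empty until C is activated
print(first, second, third)
```

A heap is used for the timer list only because it keeps the earliest activation time at the front; any ordered structure would serve equally well.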

SCHEDULING ALGORITHM ONE 

Associate with each task a number t_m which is the maximum time
required to execute that task. Consider the following group of tasks:

TABLE I

Task    Period    t_m
A       T         T/3
B       2T        [illegible in source]
C       3T        [illegible in source]

With one processor the following schedule will meet the periodic
deadlines. The reader will note that it is not a unique solution.

Figure 4.- Task Schedule

The letters below the lines indicate the task deadlines. The minimum
task period T is a fundamental parameter and will be called a "cycle."
The scheduling algorithm is

(1) Since A must be executed every cycle, schedule A.
(2) Since B must be completed in two cycles, do one-half of B each cycle.
(3) Since C must be completed in three cycles, do one-third of C each cycle.

In general, if t_m is the maximum run time for task k and its period
is t_p, then

$$\alpha(k) = \frac{T}{t_p}\, t_m \qquad (1)$$

is the amount of time which must be devoted to that task during each
cycle in order to meet its deadline. For N_P processors and N_T tasks

$$\sum_{k=1}^{N_T} \alpha(k) \le N_P\, T \qquad (2)$$

A multiprocessor loading parameter may be defined by

$$L = \frac{1}{T N_P} \sum_{k=1}^{N_T} \alpha(k) \qquad (3)$$

Computer loading is a basic parameter since it represents the
fractional part of the computer's processing capability necessary to
process the N_T tasks. A value of L greater than one means that all
the tasks cannot be processed. The value 100 × L is the percentage of
the multiprocessor time required to process the tasks.
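
As a worked illustration of Eqs. (1) and (3), the short Python sketch below evaluates the per-cycle share of each task and the resulting loading. The function names and the task values are hypothetical and chosen only for the arithmetic; times are expressed in units of the cycle T, and exact fractions are used so that the comparison with one is not disturbed by rounding.

```python
from fractions import Fraction


def cycle_share(t_m, t_p, T):
    """Eq. (1): time per cycle needed by a task, alpha(k) = t_m * T / t_p."""
    return t_m * T / t_p


def loading(tasks, T, n_processors):
    """Eq. (3): L = sum of alpha(k) divided by (T * N_P)."""
    total = sum(cycle_share(t_m, t_p, T) for t_m, t_p in tasks)
    return total / (T * n_processors)


# Hypothetical task set, given as (t_m, t_p) with the cycle T = 1.
T = Fraction(1)
tasks = [
    (Fraction(1, 3), Fraction(1)),  # due every cycle
    (Fraction(1, 2), Fraction(2)),  # due every two cycles
    (Fraction(3, 4), Fraction(3)),  # due every three cycles
]

alphas = [cycle_share(t_m, t_p, T) for t_m, t_p in tasks]
load_one = loading(tasks, T, 1)   # one processor
load_two = loading(tasks, T, 2)   # two processors
print(alphas, load_one, load_two)
```

With these values the shares are 1/3, 1/4 and 1/4 of a cycle, so L is 5/6 on one processor and 5/12 on two; both are below one, so all three tasks can be processed.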

Each cycle, α(k) microseconds are spent executing the kth task. Each
cycle is exactly like all other cycles. This means that the processors
must switch from task to task during a cycle without completing the
tasks. For example, if there were 20 active tasks and two processors,
each processor would make about nine task changes per cycle. Although
α(k) would be different for each task, if the basic cycle time T were
20 ms an average time of about 2 ms would be spent on each task per
cycle. The task switching can be accomplished by a hardware interrupt
or by subdividing the tasks into segments requiring a time α(k) or
less to execute. A special switching instruction would bound each
segment.

A suitable organization of the ATL is shown in Figure 5.

Figure 5.- Active Task List (ATL)

The tasks form a circular chain. There is a fixed pointer (TOP) which
designates one task as the top of the chain. Another pointer (NEXT)
points to the next task to be executed. An idle processor will pick up
the NEXT task and advance the NEXT pointer. The processor timer will
be set to cause an interrupt after α(k) microseconds, at which time
the NEXT task will be executed. Every T microseconds the NEXT pointer
will be reset to the TOP pointer and a new cycle initiated. Since each
α(k) is based upon maximum task run time and the actual run times will
be somewhat less, the NEXT pointer will, in all likelihood, complete
more than one full circle per cycle. However, this additional time can
also be used to process background (nonreal time) tasks.

Since the order in which the tasks are executed does not affect the
deadline requirements, they can be arranged in order of priority with
the TOP pointer pointing to the task of highest priority. If one or
more processors fail, the NEXT pointer will complete less than one
full circle per cycle. However, those tasks with the greatest priority
will be executed. It is interesting to note that this algorithm does
not need any explicit information about the number of operating
processors; that is, Eq. (3) does not have to be evaluated since the
algorithm automatically adjusts to the number of operating processors.
On the other hand, by observing the position of the NEXT pointer at
the cycle interrupt every T microseconds, some information about the
number of operating processors can be obtained.
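
The behavior of the NEXT pointer when a processor is lost can be illustrated with a rough simulation of a single cycle. The sketch below rests on simplifying assumptions that are not part of the note: the function name `run_cycle` and the slice values are hypothetical, every slice runs for its full α(k), and a slice that would be cut off by the cycle interrupt is counted as not served and ends the walk around the chain.

```python
def run_cycle(slices, n_processors, T):
    """Return the tasks whose slice completes within one cycle of length T.

    slices is a list of (name, alpha) in priority order, TOP first.
    Times are integers (microseconds).
    """
    free_at = [0] * n_processors  # time at which each processor becomes idle
    served = []
    for name, alpha in slices:    # the NEXT pointer walks the chain from TOP
        # The processor that becomes idle first picks up the NEXT task.
        p = min(range(n_processors), key=lambda i: free_at[i])
        if free_at[p] + alpha > T:
            break                 # the cycle interrupt arrives first
        free_at[p] += alpha
        served.append(name)
    return served


T = 20000  # cycle time in microseconds
slices = [("A", 6000), ("B", 6000), ("C", 5000),
          ("D", 5000), ("E", 8000), ("F", 8000)]

both_working = run_cycle(slices, 2, T)
one_failed = run_cycle(slices, 1, T)
print(both_working)
print(one_failed)
```

With two processors the pointer reaches the end of the chain; with one processor it stops after the three tasks nearest TOP. The position at which it stops is the information about the number of operating processors mentioned above.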

A disadvantage of this algorithm is that if the number of active tasks
is large there will be a corresponding increase in the number of
interruptions per cycle. The overhead involved in changing the
processor state may become excessive if there are a large number of
active registers to be stored. In the next section, another scheduling
algorithm which circumvents this difficulty will be presented.

SCHEDULING ALGORITHM TWO 
For scheduling algorithm one a typical schedule might appear as in
Figure 6. In this example L = 1 and task priority has been ignored.

Figure 6.- Real Time Schedule

Once again, the letters under the graph indicate task deadlines. Since
task A is due every cycle it must be executed each cycle. However,
task B is due every two cycles and it makes little difference whether
it is executed during cycle one or two. Since α_C is greater than α_B
it is possible to move α_B from cycle two to cycle one and replace it
with an equivalent amount of α_C from cycle one. Even if α_B were
greater than α_C it would still be possible to move part of α_B into
cycle one. Clearly, it is possible to rearrange the time of execution
of a task as long as the execution is not delayed past the task
deadline. One such rearrangement (not the only one) is to consolidate
the tasks such that those tasks with the earliest deadlines are
executed first. This is scheduling algorithm two. Since, insofar as
deadlines are concerned, there is no essential difference between
algorithm one and algorithm two, it follows that algorithm two
(earliest deadline) also guarantees that all the real time deadlines
will be met. However, the number of task interruptions necessary will
be reduced to one every T microseconds and only the cycle interrupt
will be required. During the cycle the processors process the tasks to
completion in order of their deadlines.
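
The earliest-deadline rule can be checked on a small example. The following Python sketch simulates a single processor over several cycles; it is an illustration only, and the function name, the integer time units, and the two task sets are assumptions introduced here. Each task is activated at the start of its period, its deadline is the end of that period, and the only interruption is the cycle interrupt every T units.

```python
def simulate_earliest_deadline(tasks, T, n_cycles):
    """Run one processor for n_cycles and return the missed deadlines.

    tasks maps a task name to (t_m, period_in_cycles). Times are integers.
    """
    jobs = {}     # name -> [deadline, time_left] for the current activation
    missed = []
    for cycle in range(n_cycles):
        now = cycle * T
        # Cycle interrupt: activate the tasks whose period starts now.
        for name, (t_m, period) in tasks.items():
            if cycle % period == 0:
                if name in jobs and jobs[name][1] > 0:
                    missed.append((name, now))  # previous activation unfinished
                jobs[name] = [now + period * T, t_m]
        # Within the cycle, work through the jobs in order of deadline.
        budget = T
        for name in sorted(jobs, key=lambda n: jobs[n][0]):
            used = min(jobs[name][1], budget)
            jobs[name][1] -= used
            budget -= used
            if budget == 0:
                break
    end = n_cycles * T
    missed += [(n, d) for n, (d, left) in jobs.items() if left > 0 and d <= end]
    return missed


T = 6
feasible = {"A": (2, 1), "B": (4, 2), "C": (6, 3)}    # L = 1 on one processor
overloaded = {"A": (2, 1), "B": (6, 2), "C": (6, 3)}  # L = 7/6
feasible_misses = simulate_earliest_deadline(feasible, T, 6)
overload_misses = simulate_earliest_deadline(overloaded, T, 6)
print(feasible_misses)
print(overload_misses)
```

For the first task set, with L = 1, no deadline is missed over the six cycles simulated. For the second, with L greater than one, deadlines are missed, and which task misses is decided by the deadlines alone without regard to priority.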

However, it is clear that as it presently stands algorithm two is
sensitive to resource variations. For example, if a processor were to
fail the tasks would continue to be executed according to their
deadlines and without regard to their priority. To correct this
condition, the implementation of Figure 7 can be used.

Figure 7.- Implementation of Algorithm Two

A third list called the priority list (PL) is added. All active tasks
appear on the PL, ordered by priority. Only a certain number of the
tasks on the PL also appear on the ATL, where they are ordered by
deadline. In order to determine how many of the PL tasks may also
appear on the ATL, the computer load parameter of Eq. (3) is
calculated for the tasks on the PL starting with the highest priority
task. When the load parameter equals or exceeds one, no further tasks
may appear on the ATL. Thus modified, algorithm two guarantees that
the real time deadlines of scheduled tasks will be met, and if the
multiprocessor cannot do all tasks only those of the highest priority
will be done. It has an advantage over algorithm one in that the
processors must be interrupted only at the end of every cycle no
matter how many tasks are active.
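
The selection of ATL entries from the PL can be sketched as follows. The code is a minimal illustration under assumptions that go beyond the note: the function name and the task values are hypothetical, a larger priority index is taken to mean a more important task, and a task is admitted only if the cumulative load stays at or below one, which is one reading of the stopping rule above.

```python
from fractions import Fraction


def select_for_atl(tasks, n_processors):
    """Split the active tasks into the PL and the ATL.

    Each task is a dict with "name", "priority", "t_m", "t_p" and "deadline".
    """
    # Priority list: every active task, most important first.
    pl = sorted(tasks, key=lambda t: t["priority"], reverse=True)
    atl, load = [], Fraction(0)
    for task in pl:
        # Contribution of this task to L in Eq. (3); the cycle time T cancels.
        share = Fraction(task["t_m"], task["t_p"]) / n_processors
        if load + share > 1:
            break  # this task and all lower priorities stay off the ATL
        load += share
        atl.append(task)
    atl.sort(key=lambda t: t["deadline"])  # the ATL is ordered by deadline
    return [t["name"] for t in pl], [t["name"] for t in atl], load


tasks = [
    {"name": "display", "priority": 2, "t_m": 4, "t_p": 40, "deadline": 40},
    {"name": "nav", "priority": 7, "t_m": 2, "t_p": 10, "deadline": 10},
    {"name": "attitude", "priority": 9, "t_m": 6, "t_p": 20, "deadline": 20},
    {"name": "telemetry", "priority": 5, "t_m": 15, "t_p": 30, "deadline": 30},
]
pl_names, atl_names, total_load = select_for_atl(tasks, 1)
print(pl_names, atl_names, total_load)
```

In this example the three most important tasks bring the load to exactly one, so the lowest priority task remains on the PL only, and the ATL lists the admitted tasks in deadline order rather than priority order.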
MULTIPROCESSOR ANOMALIES 
The restriction of an "ideal" multiprocessor will now be removed. The
occurrence of so-called "anomalies" in a realistic multiprocessor
presents a serious difficulty in scheduling real time tasks. The
literature (Refs. 2, 3, 4) contains numerous examples where shortening
the execution time of one or more tasks results in an increase in the
overall execution time of a string of tasks. This counterintuitive
response can result when the shortened run time of a task alters the
sequence in which subsequent tasks are executed, thus producing a
complex reordering of the execution time history of the entire task
string. Ref. 2 gives examples where decreasing the actual task run
time, relaxing precedence relations between tasks, and adding more
processors can actually increase the overall time required to process
a task string. In terms of the algorithms of this note the problem is
illustrated by Figure 8.

Figure 8.- Multiprocessor Anomaly

Figure 8a represents a multiprocessor with one processing unit. If a
second processor were added to an "ideal" multiprocessor, the task
list would be processed in half the time. However, for an actual
multiprocessor, Figures 8b and 8c clearly show that the time required
is a function of the order in which the tasks are processed. What is
required is to take the single task string of Figure 8a of length T
and subdivide it into N separate task strings of length T/N. In
general, there is no procedure for doing this, optimal or otherwise.
For some task strings it may not be possible; for example, if Figure
8a consisted of one single task which could not be executed in
parallel. However, Figures 8b and 8c represent two attempts at a
subdivision of Figure 8a, one of which is successful and the other of
which is not. Quite possibly there is an algorithm which is optimal in
the sense that it subdivides a string of length T into N strings of
length T/N in such a way that a minimum number of the original tasks
must be discarded. Graham (Ref. 2) points out that the optimal
solution (for an equivalent problem) can be found by trying all
possible combinations but that this is practical for only a limited
number of tasks. He suggests ordering the single task list by task
length. This procedure cannot be applied to the algorithms of this
note since algorithm one orders the tasks by priority and algorithm
two by their deadline. Graham also provides a number of bounds for
multiprocessor anomalies. For the case where T is the time required to
process the single task list and T* is the time required to process
the multiple lists

$$T^{*} \le T$$

which simply says that the addition of more processors may provide no
improvement, which is indeed the case if the system contains one long
task. Although this bound is of little help in this situation, it
indicates that processor anomalies must be considered in any
multiprocessor scheduling philosophy.
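
The dependence of the processing time on the order of the tasks is easy to reproduce. The Python sketch below is an illustration with hypothetical task lengths, not a reproduction of Figure 8: each task in the list is handed to whichever processor becomes idle first, and the time at which the last processor finishes is returned.

```python
def finishing_time(lengths, n_processors):
    """Time needed to process the tasks, taken in the given order."""
    free_at = [0] * n_processors
    for length in lengths:
        p = free_at.index(min(free_at))  # the first processor to become idle
        free_at[p] += length
    return max(free_at)


# The same five tasks, with total length T = 12, in two different orders.
order_one = [3, 2, 2, 3, 2]
order_two = [3, 3, 2, 2, 2]

single = finishing_time(order_one, 1)      # one processor: T
good_split = finishing_time(order_one, 2)  # two strings of length T/2
poor_split = finishing_time(order_two, 2)  # longer than T/2
one_long_task = finishing_time([12], 2)    # no improvement at all
print(single, good_split, poor_split, one_long_task)
```

The first order divides the string of length 12 into two strings of length 6; the second order needs 7. A single task of length 12 takes 12 on two processors as well, which is the case of equality in the bound above.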


The problem of converting an ideal multiprocessor schedule into an
actual multiprocessor schedule can be considered from another
viewpoint. Define a task load parameter by

$$\beta(k) = t_m / t_F \qquad (4)$$

so that β(k) is the fractional processor capacity required to process
task k. From Eq. (3)

$$\sum_{k=1}^{N_T} \beta(k) \le N_P \qquad (5)$$

where N_P is the number of processors and N_T is the number of tasks
which can be processed by an ideal multiprocessor. The scheduling
problem is now transformed from the time domain to a "computer
loading" domain. The ideal multiprocessor can accommodate a load of
N_P but each real processor can accommodate a maximum load of only 1.
Figure 9 illustrates this for N_P = 2.

The scheduling problem becomes one of allocating the total
computational load of N_P into N_P separate lists, none of which can
exceed unity. Of course, this is exactly the problem considered by
Graham, only now in terms of computer load rather than task run times.
Since the β(k)'s are discrete, it may not be possible to decompose the
list exactly and some β(k)'s may be left over. However, an additional
degree of freedom has been introduced since the actual division into
the separate lists can be done in any fashion, including Graham's
method of assigning the longest β(k)'s first. There is a constraint,
however, in that if some β(k)'s (and their corresponding tasks) are
discarded they should not have a priority higher than any task left.
The "best" solution can be found by trying all possibilities, although
this soon becomes impractical if the number of active tasks is large.
The simplest sub-optimal solution is to assign the β(k)'s to the
processors in a round-robin fashion starting at the top of the
priority list. Appendix B discusses another allocation strategy.
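
One possible form of the round-robin allocation is sketched below in Python. Several details are assumptions made for the illustration and are not specified in the note: the function name and the β values are hypothetical, a task that does not fit on the processor whose turn it is will be offered to the remaining processors in turn, and the allocation stops at the first task that fits nowhere, so that no discarded task has a higher priority than a task that was kept.

```python
from fractions import Fraction


def round_robin_allocate(betas, n_processors):
    """Assign task loads to processors, highest priority first.

    betas is a list of (name, beta) in priority order.
    Returns the per-processor lists, their loads, and the discarded tasks.
    """
    lists = [[] for _ in range(n_processors)]
    loads = [Fraction(0)] * n_processors
    discarded = []
    turn = 0
    for index, (name, beta) in enumerate(betas):
        for offset in range(n_processors):
            p = (turn + offset) % n_processors
            if loads[p] + beta <= 1:        # no list may exceed unity
                lists[p].append(name)
                loads[p] += beta
                turn = (p + 1) % n_processors
                break
        else:
            # Nothing can take this task: discard it and all lower priorities.
            discarded = [n for n, _ in betas[index:]]
            break
    return lists, loads, discarded


betas = [("A", Fraction(1, 2)), ("B", Fraction(1, 2)), ("C", Fraction(1, 3)),
         ("D", Fraction(1, 4)), ("E", Fraction(1, 4)), ("F", Fraction(1, 3))]
lists, loads, discarded = round_robin_allocate(betas, 2)
print(lists, loads, discarded)
```

Here the total load is 13/6, which exceeds the capacity of two processors, and the lowest priority task F is left over. The first processor ends with an unused capacity of 1/6, which illustrates the inefficiency caused by the discrete nature of the β(k)'s.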

Figure 9.- Processor Loading

The assignment problem actually consists of decomposing the overall
task list into separate lists for each processor. Eventually, of
course, this assignment must be made. The inefficiency results from
the discrete nature of the tasks and hence the β(k)'s. It is easy to
find an upper bound to the inefficiency which results. From Eq. (3)
the loading of the "ideal" multiprocessor is

$$L = \frac{1}{N_P} \sum_{k=1}^{N_T} \beta(k) \qquad (6)$$

For the individual processors the loading is

[Eq. (7) is illegible in the source; only the upper summation limit R survives]

where R is the last task on the kth processor's list. The percent
inefficiency can be defined as

$$\eta = 100\, \frac{L - L^{*}}{L}$$

where

[the definition of L* is illegible in the source]

It is reasonable to establish a limit on the ratio of a task's maximum
run time to its execution time frame

$$t_m / t_F \le \Delta$$

or

$$\Delta = \max \{ \beta(k) \}$$

From Eq. (6) and Eq. (7) it follows that

$$\eta \le 100\, \Delta$$

If Δ is .5, the maximum inefficiency is 50 percent; and if Δ is .2,
the maximum inefficiency is 20 percent. This shows the value of
dividing the total system load into many smaller tasks. However, the
division criterion is not the length of the task but rather the ratio
of the task length to the time frame in which the task must be
executed.

In summary, scheduling a realistic multiprocessor requires the
allocation of the total multiprocessor load among the individual
processors. This allocation can be regarded as an optimization
problem; however, a simple sub-optimal solution would seem to be
adequate. When all the real time tasks can be scheduled, any solution
can be regarded as "optimal." By allocating the task loading it is no
longer necessary to consider the real time constraints on the
individual tasks, thus considerably simplifying the scheduling
problem.

Up to now, it has been assumed that all the tasks were periodic and
independent. When the problem of scheduling these tasks is transformed
from the time domain to the computer loading domain, it becomes a
static problem of decomposing one long list into N_P shorter lists.
However, the tasks are being periodically activated, run, and
terminated, so the computer loading is actually a dynamic rather than
static parameter. But it is much simpler to consider it as a static
quantity. One method of accomplishing this is as follows: when each
periodic task is assigned to a processor it is tagged with that
processor's number. When the executive timer activates the task it
puts it directly onto an Active Task List (ATL) for the proper
processor. The task load is always assigned to that processor, thus
assuring the capacity to meet the task deadlines. If the number of
available processors changes, the task lists must be rescheduled.

NONPERIODIC TASKS 

As mentioned before, the total multiprocessor load is 

divided into real-time tasks which have associated deadlines and 

background tasks without deadlines. The real-time tasks are 

processed first. However, it is to be expected that in an 

actual system there would be additional real time tasks which 

do not meet the restrictions which have been assumed up to this 

point. Some examples are real-time tasks which are activated 

after the completion of other tasks and tasks controlled by

external interrupts. If the tasks are activated after the pre­

requisite conditions have been met, then all active tasks can 

still be considered independent tasks. 

If new, nonperiodic tasks are activated, then the computer

loading and task scheduling must become dynamic since no prior

allowance for the task's additional loading has been made. If 

the multiprocessor is processing a mixture of real-time and 

background tasks, this may be quite simple, since the multipro­

cessor has additional capacity beyond that necessary to handle 

the real-time tasks. The new task can simply be added to any

processor list which will accommodate it, leaving less time

remaining for background work. It is implicitly assumed that the

background tasks have lower priorities than the real-time tasks. 

However, it is also possible for the executive to maintain a 

balance between real-time and background tasks, based upon their 

priorities. A problem arises when a real-time task cannot fit 

on any processor list but its priority is greater than that of a 

currently scheduled task. In this case, the tasks must be re­

scheduled so that lower priority tasks are removed. 
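
The handling of a newly activated nonperiodic task described above can be sketched as follows. This is a minimal illustration, not the executive of this note: the function name `add_task`, the tuple layout (name, priority, load), the unit capacity per processor, and the convention that a smaller priority number means a higher priority are all assumptions made for the example.

```python
def add_task(processors, task, capacity=1.0):
    """Place `task` = (name, priority, load) on one of the processor lists.

    Returns the list of displaced tasks (empty if none were displaced),
    or None if the task cannot be scheduled at all.
    Assumption: a smaller priority number means a higher priority.
    """
    name, prio, load = task
    # Simple case: some processor still has spare (background) capacity.
    for plist in processors:
        if sum(l for _, _, l in plist) + load <= capacity:
            plist.append(task)
            return []
    # Otherwise try to free room by removing lower-priority tasks,
    # lowest priority first, from a single processor list.
    for plist in processors:
        free = capacity - sum(l for _, _, l in plist)
        victims = sorted((t for t in plist if t[1] > prio),
                         key=lambda t: t[1], reverse=True)
        removed = []
        for v in victims:
            if free >= load:
                break
            removed.append(v)
            free += v[2]
        if free >= load:
            for v in removed:
                plist.remove(v)
            plist.append(task)
            return removed
    return None


processors = [[("a", 1, 0.5), ("b", 3, 0.5)], [("c", 2, 0.75)]]
displaced = add_task(processors, ("x", 2, 0.5))
```

In this example neither processor list has room for task "x", so the lower-priority task "b" is removed from the first list to make room. What happens to a displaced task (rescheduling elsewhere or dropping it) is left open in the sketch.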

TASK ACTIVATION TIMES 

The computer loading of any task, periodic or not, is

given by

    loading = tm / (tD - t)

where

tm = Maximum task run time
tD = Task deadline
t = Time the task is assigned to a processor

However, if a task can be assigned to a processor when it is 

activated, then 

tD = t + tF 
and tF can be considered to be the "response time" required

from the task. 
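
As a small numerical illustration, the sketch below assumes that the loading of a task is its maximum run time divided by the time remaining between assignment and deadline, which is consistent with the definitions of tm, tD, and t above. The function name `task_loading` and the sample values are hypothetical.

```python
def task_loading(t_m, t_d, t):
    """Fraction of one processor needed to finish a task by its deadline.

    t_m: maximum task run time
    t_d: task deadline
    t:   time the task is assigned to a processor
    Assumes loading = t_m / (t_d - t).
    """
    window = t_d - t
    if window <= 0:
        raise ValueError("deadline must lie after the assignment time")
    return t_m / window


# A task assigned at activation (t = 10) with response time tF = 8,
# so that tD = t + tF = 18, and a maximum run time of 2.
t, t_f = 10.0, 8.0
load = task_loading(2.0, t + t_f, t)
```

When the task is assigned at activation, the denominator reduces to the response time tF, so the loading in this example is 2/8 of one processor.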

A HARDWARE EXECUTIVE 

The task schedule must be recalculated when: (1) a real-time
task is added but cannot be accommodated by any processor
while tasks of lower priority are scheduled, in which case a
partial reconfiguration involving the tasks of lower priority is
necessary; and (2) a processor fails. Since it is desirable to
accomplish the reconfiguration in a minimum amount of time,
especially in the event of a processor failure, it is reasonable
to seek a hardware solution even if this may present an
additional reliability problem.
The hardware executive functions like an associative memory

and a 16-bit adder, although it may be built using a small, 

conventional, high-speed memory. A typical memory word appears 

as in Figure 10. 

| FLAG | PROC | TASK | PRTY | P | TD |
Figure 10.- Task word

It would contain its own microprogram control (perhaps in the 

same memory). When a task is activated, its first Flag bit is 

set at 1. The hardware executive searches all active tasks 

which have the highest priority index. Assume that the micro­

program control is such that the executive will divide the total 

system load among the processors by a simple round-robin sequence.

In this case, all tasks having the highest priority index will be 

assigned to the processors in sequence and the load parameters

for each processor will be incremented by the field of the task 

word. The priority index is then incremented and the search

repeated. The PROC field of the task word is set equal to the

number of the processor to which the task is assigned. A second 

flag bit indicates that the task is assigned. When a processor 

requests a task, it is given the task with the nearest due time 

(TD). 
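
The round-robin allocation and the nearest-due-time dispatch can be sketched in software as follows. This is only an illustration of the search order, not the microprogram itself. The class `TaskWord` loosely mirrors the task word of Figure 10, but the field name `load`, the unit capacity per processor, the rule of skipping a processor that cannot accept the task, and the convention that a smaller priority index means a higher priority are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TaskWord:
    task: str
    prty: int              # priority index; smaller = higher (assumption)
    load: float            # load parameter (hypothetical field name)
    td: float              # due time
    active: bool = True    # first flag bit
    assigned: bool = False # second flag bit
    proc: int = -1         # processor number once assigned


def allocate(words, n_proc, capacity=1.0):
    """Assign active tasks to processors, one priority level at a time."""
    loads = [0.0] * n_proc
    nxt = 0  # next processor in the round-robin sequence
    for prty in sorted({w.prty for w in words if w.active}):
        for w in words:
            if not w.active or w.prty != prty:
                continue
            # Try each processor once, starting at the round-robin pointer.
            for k in range(n_proc):
                p = (nxt + k) % n_proc
                if loads[p] + w.load <= capacity:
                    w.proc, w.assigned = p, True
                    loads[p] += w.load
                    nxt = (p + 1) % n_proc
                    break
    return loads


def next_task(words, proc):
    """Return the assigned task with the nearest due time for `proc`."""
    ready = [w for w in words if w.active and w.assigned and w.proc == proc]
    return min(ready, key=lambda w: w.td, default=None)


words = [TaskWord("A", 0, 0.5, 10), TaskWord("B", 0, 0.5, 4),
         TaskWord("C", 1, 0.5, 6), TaskWord("D", 1, 0.75, 8)]
loads = allocate(words, n_proc=2)
```

In this example tasks A and B (highest priority) go to processors 0 and 1 in turn, C then fills processor 0, and D is left unassigned because no processor can accept its load. A request from processor 0 returns C, since its due time is nearer than that of A.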

INTERRUPTS 

Interrupts may be handled by the hardware executive in a 

manner similar to that suggested in references 1 and 6. The 

executive can compare the next due time of the task currently

being processed to the due time of the task at the head of the 

corresponding processor's ATL and generate a processor interrupt,

if necessary. Note that this involves a comparison between 

a processor and its ATL and, unlike reference 6, it is not 

necessary to make a comparison between processors, or to select 

a processor to be interrupted. In this way, it would not be 

necessary to interrupt each processor every T microseconds and

the cycle interrupt is no longer required. 
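
The comparison between a processor and its own ATL can be sketched as below. The sketch assumes that each ATL is kept ordered by due time, here with a heap of (due time, task name) pairs, and that an interrupt is warranted only when the task at the head of the list is due earlier than the task being processed. The function name `should_interrupt` and the task names are hypothetical.

```python
import heapq


def should_interrupt(current_td, atl):
    """True if the head of this processor's ATL is due before the running task.

    current_td: due time of the task currently being processed
    atl:        heap of (due_time, task_name) pairs for the same processor
    """
    return bool(atl) and atl[0][0] < current_td


atl = []
heapq.heappush(atl, (30.0, "telemetry"))
heapq.heappush(atl, (12.0, "guidance"))
```

With this list, a processor working on a task due at time 20 would be interrupted, while one working on a task due at time 10 would not. No other processor or ATL is consulted.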

Reference 1 points out that external interrupts can be 

handled by simply activating their corresponding task. All that 

is necessary is to provide a mechanism in the hardware executive 

for setting the active bit of the correct task word. The opera­

tion of this interrupt system would be quite different from con­

ventional systems, since an external interrupt which activates 

a task of the highest priority may not cause a processor interrupt 

even if all processors are working on tasks of lesser priority.

Instead, the executive treats the interrupt task exactly like 

any other task. If it has the resources to guarantee the task's

execution within the required time frame, it may defer execution

of the task. If its resources are insufficient, then the task 

is added to a system-wide queue with the winners selected by

priority. 

Another flag bit is used to indicate that a task is a 

member of the "timer" list. In this case, the TD field contains 

the time at which the task should be activated. The nearest 

activation time can be placed in a hardware register (labeled

timer in Figure 11). When the nearest time arrives, the associa­

tive processor is interrupted and all tasks with their timer 

bits on and a TD field corresponding to the interrupt time are 

activated. 
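
A software sketch of the timer list follows. It assumes each task word is represented as a dictionary with hypothetical keys `timer`, `td`, and `active`, and it omits the step of replacing the TD field with the task's due time after activation, which the text does not spell out.

```python
def fire_timer(words, now):
    """Activate timer-list tasks whose TD equals `now`.

    Returns the nearest remaining activation time, which would be
    loaded into the timer register, or None if the list is empty.
    """
    for w in words:
        if w["timer"] and w["td"] == now:
            w["timer"] = False   # task leaves the timer list
            w["active"] = True   # and becomes an active task
    pending = [w["td"] for w in words if w["timer"]]
    return min(pending, default=None)


words = [
    {"task": "A", "timer": True, "td": 5, "active": False},
    {"task": "B", "timer": True, "td": 9, "active": False},
    {"task": "C", "timer": True, "td": 5, "active": False},
]
next_time = fire_timer(words, now=5)
```

Here tasks A and C are activated together at time 5, and the timer register would next be loaded with 9, the activation time of task B.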

CONCLUSION 

Algorithm two of this note, modified for a realistic

multiprocessor, is proposed as a reasonable solution to the real-time

multiprocessor scheduling problem.

On a system basis, the algorithm does not insist that the 

active task of greatest priority or even the nearest deadline 

be processed first. It simply allocates the total task load 

among the individual processors in such a way that the deadlines 

of all allocated tasks are guaranteed. If the system resources 

are insufficient to allow all of the real time tasks to be 

executed, then those of highest priority will be scheduled. 

Figure 11.- Hardware executive (block diagram: associative processor with TD field, timer, control, external interrupts, processor interrupt)

The algorithm is sub-optimal in that the allocation of the 

total workload to the individual processors may not be the "best" 

possible. However, the allocation strategy is independent of 

the overall scheduling algorithm and, therefore, subject to 

further improvement. 

The algorithm is also sub-optimal in that it considers only

the currently active tasks and does not attempt a global optimi­

zation using knowledge of future task loads. However, it is 

assumed that there is a class of tasks activated by interrupts

for which the information required for the global optimization

is incomplete and not available. If the executive does have 

information on the future activity of a task (for example,

periodic tasks), it can use this to do static rather than dynamic

scheduling, thus decreasing executive overhead and improving 

response time. 


APPENDIX A 

SCHEDULING BY PRIORITIES 

Consider the following four periodic tasks: 

TASK
A       T       T/4
C       3T      3T/4
D       4T      T

Assume task priorities are assigned as follows: 

> pD 
A successful task schedule is shown in Figure A-1. The tasks 

are scheduled by deadlines. Note that this requires scheduling

C before D during cycle 2 and D before C during cycle 4. 

Scheduling by priority would result in task D missing its dead­

line, although the multiprocessor is capable of executing all 

tasks successfully. 

Figure A-1.- Task schedule (timeline of tasks A through D over successive cycles)


APPENDIX B 
MEMORY CONFLICTS 

Another reason for the departure of a realistic multi­

processor from an "ideal" multiprocessor is the presence of 

hardware and software conflicts. Software conflicts are due 

to exclusive data areas which can be modified by only one

processor at a time. Hardware conflicts can result during I/O

operations or, more probably, from multiple accesses of a single 

memory module. Conflicts complicate scheduling because they may

extend the execution time of a task beyond its "maximum run time" 

parameter. 

In order to estimate the effects of memory conflicts, a 

simulation was conducted using the multiprocessor configuration

of Figure 1. The number of memory modules was assumed to be

equal to the number of processors. The memory conflict depends 

on the cycle time of the memory and the processor operating

speed which determines the rate of memory accesses. Processor 

timing was assumed to be as in Figure B-1.  
tfI = time to fetch an instruction from the memory

tD = time to decode instruction address

tfO = time to fetch an operand

tE = time to execute an instruction

Figure B-1.- Processor cycle
The processors competed for each individual memory on a 

round-robin basis. A memory was attached to a particular pro­

cessor for a memory cycle time (tmc). If the required memory 

was unavailable, the processor cycle of Figure B-1  would be 
lengthened accordingly. Values used in the simulation were: 

tD = 1 microsecond (μs)
tE = 1 μs (60%), 9 μs (40%)
Both the instructions and data were considered to be 

randomly distributed in pages among the memory modules. A total 

of 256 accesses was made from each page and then a new page 

was chosen at random. The resultant system through-put and 

percentage of a processor's time lost to a memory conflict 

appear in Figures B-2a and B-2b. For the processors assumed, 

the memory conflicts do not appear critical for 1 and 2 micro­

second memories and the task maximum run-time parameter can be 

modified to take memory conflicts into account. 

Since the scheduling algorithm of this note allows some 

measure of freedom in allocating the total system workload 

among the individual processors, one possible algorithm is to 

assign individual tasks to the processors so as to minimize 

potential memory conflicts. In this case, there would be a 

complex interaction between the task scheduling and memory

allocation functions of the Executive. 


Figure B-2.- Memory conflicts: (a) system through-put and (b) percentage of processor time lost to memory conflicts, each plotted against the number of processors (1 to 8) for several memory cycle times
REFERENCES 

1. Lampson, Butler W.: A Scheduling Philosophy for Multiprocessing Systems. Communications of the ACM, vol. 11, no. 5, May 1968.

2. Graham, R. L.: Bounds on Multiprocessor Timing Anomalies. SIAM J. Appl. Math., vol. 17, no. 2, March 1969.

3. Manacher, G. K.: Production and Stabilization of Real-Time Task Schedules. Journal of the Association for Computing Machinery, vol. 14, no. 3, July 1967.

4. Ochsner, B. P.: Controlling a Multiprocessor System. Bell Laboratories Record, February 1966.

5. Richards, Paul I.: Parallel Programming. Report under contract No. AF33(600)-35190 at Tech. Operations, Inc., Burlington, Mass.

6. Gountanis, R. J., and Viss, N. L.: A Method of Processor Selection for Interrupt Handling in a Multiprocessor System. Proceedings of the IEEE, vol. 54, no. 12, December 1966.
