Event driven executive by Cornwell, Smith et al.
United States Patent [19] [11] Patent Number: 4,980,824
Tulpule et al. [45] Date of Patent: Dec. 25, 1990
EVENT DRIVEN EXECUTIVE 
Inventors: Bhalchandra R. Tulpule, Vernon;
Robert E. Collins, East Hartford;
John Cheetham, Bristol; Smith
Cornwell, East Granby, all of Conn.
Assignee: United Technologies Corporation,
Hartford, Conn.
Appl. No.: 298,291 
Filed: Jan. 17,1989 
Related U.S. Application Data 
Continuation of Ser. No. 924,542, Oct. 29, 1986, aban- 
doned. 
Int. Cl.5 ................................................ G06F 9/00
U.S. Cl. ................................. 364/200; 364/965.4; 
364/948.3 
Field of Search ................................ 364/200, 900
References Cited 
U.S. PATENT DOCUMENTS 
4,152,761 VI979 Louie .................................. 364/200 
4,153,932 5/1979 Dennis et al. ....................... 364/200 
4,286,322 8/1981 Hoffman et al. .................... 364/200 
4,320,451 3/1982 Bachman et al. ................... 364/200
4,320,455 3/1982 Woods et al. ....................... 364/200
4,333,144 6/1982 Whiteside et al. .................. 364/200 
4,369,494 VI983 Bienyenu et al. ................... 364/200 
4,394,727 7/1983 Hoffman et al. .................... 364/200 
4,394,730 7/1983 Suzuki et al. ....................... 364/200 
4,413,318 11/1983 Herrington ......................... 364/200
4,414,624 11/1983 Summer, Jr. et al. .............. 364/200 
4,447,874 5/1984 Bradley et al. ..................... 364/200 
4,466,736 8/1984 De Santis et al. .................. 364/200 
4,494,188 1/1985 Nakane et al. ...................... 364/200 
4,525,780 6/1985 Bratt et al. .......................... 368/200 
4,590,555 5/1986 Bourrez ............................... 364/200 
4,594,655 6/1986 Hao et al. ............................ 364/200 
4,615,001 9/1986 Hudeins, Jr. ........................ 364/200 
4,658,351 4/1987 Teng ................................... 364/200 
4,675,806 6/1987 Uchida ................................ 364/200 
4,736,318 4/1988 Delyani et al. ..................... 364/200 
OTHER PUBLICATIONS 
IBM Corporation Programming Publications,
"OS/VS2 MVS Overview," Second Edition (May
1980). Chapters 5-6.
Primary Examiner-Gareth D. Shaw 
Assistant Examiner-John G. Mills
Attorney, Agent, or Firm-Francis J. Maguire, Jr. 
[57] ABSTRACT
Tasks may be planned for execution on a single proces- 
sor or are split up by the designer for execution among 
a plurality of signal processors. The tasks are modeled 
using a design aid called a precedence graph, from 
which a dependency table and a prerequisite table are 
established for reference within each processor. During 
execution, at the completion of a given task, an end of 
task interrupt is provided from any processor which has 
completed a task to any and all other processors includ- 
ing itself in which completion of that task is a 
prerequisite for commencement of any dependent tasks. 
The relevant updated data may be transferred by the 
processor either before or after signalling task comple- 
tion to the processors needing the updated data prior to 
commencing execution of the dependent tasks. Coherency
may be ensured, however, by sending the data
before the interrupt. When the end of task interrupt is
received in a processor, its dependency table is con- 
sulted to determine those tasks dependent upon comple- 
tion of the task which has just been signalled as com- 
pleted, and task dependency signals indicative thereof 
are provided and stored in a current status list of a 
prerequisite table. The current status of all current pre- 
requisites are compared to the complete prerequisites 
listed for all affected tasks and those tasks for which the 
comparison indicates that all prerequisites have been 
met are queued for execution in a selected order. 
3 Claims, 8 Drawing Sheets 
https://ntrs.nasa.gov/search.jsp?R=20080008257 2019-08-30T03:34:00+00:00Z
[Drawing Sheets 1 to 8 (FIGS. 1 to 11) not reproduced.]
The invention described herein was made in the performance
of work under NASA Contract No. NAS2-11771
and is subject to the provisions of Section 305 of
the National Aeronautics and Space Act of 1958 (72
Stat. 435; 42 U.S.C. 2457).
This application is a continuation of Ser. No. 924,542,
filed Oct. 26, 1986, and now abandoned.
CROSS REFERENCE TO RELATED 
APPLICATION 
The invention described herein may employ some of
the teachings disclosed and claimed in commonly
owned co-pending application filed on even date herewith
by Tulpule et al., Ser. No. 06/924,646, now abandoned
and refiled as Ser. No. 07/355,070, entitled
n-DIMENSIONAL MODULAR MULTIPROCESSOR
LATTICE ARCHITECTURE, which is hereby
expressly incorporated by reference.
1. Technical Field
This invention relates to event driven executives for
signal processors.
2. Background Art
In recent years, there has been an increase in the
demand for high performance, real-time digital computer
systems capable of solving complex control problems
demanding high throughput. The designers of high
performance digital computer systems have resorted to
multiprocessor architectures such as systolic, processor
array systems, pipelined systems, or multiprocessor
networks in an attempt to meet the demand. In most of
these systems, the arrays of processors share in the total
workload. Each processor performs the same set of
tasks and operates on the corresponding data sets under
the direction of a system controller. In many systems,
such as network processors, each processing element
controls and operates on its own internal data and communicates
with other processors for data and execution
flow and control purposes.
In most real-time critical multiprocessor systems,
there is usually a concurrent need for minimizing the
overall computational delay. The computational delay
in a multiprocessor system depends on the worst case,
critical path task times in the processors, as well as the
interprocessor data handling delays. The need for minimizing
transport delay, therefore, translates to the need
for an operating system or task executive that can efficiently
interface with many tasks, both internal and
external to the local processing element, and minimize
the intertask handling of data and control signals.
In the prior art, the operating systems implemented
for real-time control applications were based on a real-time
executive in which real-time events were carefully
DISCLOSURE OF THE INVENTION
An object of the present invention is to provide a
scheme for an event driven executive for a signal processor.
Another object of the present invention is to provide
an efficient task executive which fulfills the need to
balance, partition and repartition tasks between processors
in a multiprocessor system in order to balance the
critical parameters such as path times, transport delays
and throughput throughout the multiprocessor system.
Still another object of the present invention is to
provide a task executive for starting, suspending and/or
stopping tasks and initiating new tasks after determining
their priority and precedence.
Still another object of the present invention is to
provide a task executive in a multiprocessor system
which, in taking account of task dependencies and prerequisites,
manages data and control flow signals in
order to timely and coherently provide required input
data for a task to the processor which requires that data
in order to properly execute the task.
Another object of the present invention is to provide
a task executive for a multiprocessor system which
takes into account an architecture in which a given
dependent task may require several prerequisite tasks to
be completed in local or any other processors before
being executed.
Another object of the present invention is to provide
a task executive for a multiprocessor system which is
flexible enough to be changed around either during the
design process or dynamically in response to changes in
the execution times of tasks which can change significantly
during execution.
Another object of the present invention is to provide
a simple, low overhead task executive for a multiprocessor
system.
Another object of the present invention is to provide
a task executive for a multiprocessor system in which
interprocessor interrupts and data blocks are efficiently
handled.
Another object of the present invention is to provide
a task executive for a multiprocessor system which
avoids log jams and hidden transport delays endemic to
Another object of the present invention is to provide
a task executive for a multiprocessor system which
optimizes time critical paths.
Another object of the present invention is to provide
for ease of relocatability of tasks in a multiprocessor
system, as between processors.
Another object of the present invention is to provide
for efficient handling of pass-through data and control
signals between several processors.
According to the present invention, an event driven
task executive for a signal processor determines
whether an end of task signal has been generated and
then consults a dependency table in order to determine
those tasks which depend upon completion of the com-
graph; thus, the tasks are illustrated interdependently in 
terms of completion of one task as being a prerequisite 
to execution of a subsequent task. The executive is then 
designed to operate in conformance with the preced- 
ences and interdependencies laid out in the precedence 
graph. When a task is completed, an end of task signal is 
triggered and provided to the executive in order to 
indicate a completed task which is a prerequisite to 
commencement of execution of another, dependent 
task. Any updated data, resulting from the completion 
of the task is provided for use by the subsequent task, if 
applicable. The executive determines from a depen- 
dency table those tasks which depend upon completion 
of the task represented by the end of task interrupt 
signal. Current status signals are generated according to 
this determination for the purpose of updating the cur- 
rent status of the prerequisites for each task. The cur- 
rent status is stored in a current status list of a task 
prerequisite table. Thus, all tasks yet to be executed 
which are dependent on the completion of the task 
represented by the end of task interrupt signal have the 
current status of their prerequisites updated, with re- 
spect to that task, in the current status list of the 
prerequisite table. Tasks for which all  prerequisites 
have been met are queued for execution in a selected 
order. 
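The single-processor bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-task graph, the class name `Executive`, and the first-in, first-out "selected order" are hypothetical and do not come from the patent figures.

```python
from collections import deque

# Hypothetical precedence graph: each task maps to its set of prerequisites.
PREREQUISITES = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def build_dependency_table(prerequisites):
    """Invert the prerequisite table: task -> tasks that depend on it."""
    table = {task: set() for task in prerequisites}
    for task, prereqs in prerequisites.items():
        for prereq in prereqs:
            table[prereq].add(task)
    return table


class Executive:
    def __init__(self, prerequisites):
        self.prerequisites = prerequisites
        self.dependents = build_dependency_table(prerequisites)
        # Current status list: which prerequisites of each task are met so far.
        self.current_status = {task: set() for task in prerequisites}
        # Tasks with no prerequisites are ready from the start.
        self.ready = deque(sorted(t for t, p in prerequisites.items() if not p))

    def end_of_task(self, done):
        """Handle an end of task signal for the task named `done`."""
        for task in sorted(self.dependents[done]):
            self.current_status[task].add(done)
            # Compare current status with the complete prerequisite list.
            if self.current_status[task] == self.prerequisites[task]:
                self.ready.append(task)  # assumed order: first in, first out


def run(executive):
    """Execute queued tasks until none are ready; return the execution order."""
    order = []
    while executive.ready:
        task = executive.ready.popleft()
        order.append(task)           # stands in for actually running the task
        executive.end_of_task(task)  # completion triggers the end of task signal
    return order
```

In this sketch, task D is queued only after both B and C have signalled completion, regardless of how long either one takes.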
In still further accord with the present invention, task
precedences and signal dependencies in a multiproces- 
sor system in which tasks are partitioned between pro- 
cessors may be expressed graphically in terms of a de- 
sign aid called a precedence graph; thus, the assigned 
tasks are illustrated interdependently in terms of tasks 
being assigned among various signal processors in the 
multiprocessor system and in terms of interrupts and 
transfer of data between processors at the proper time. 
The executive is then designed to operate in confor- 
mance with the precedences and interdependencies 
laid-out in the precedence graph. When a task com- 
pletes, an end of task signal is triggered and provided to 
the executive which in turn provides an end of task 
interrupt signal to another processor, the completed 
task being a prerequisite to commencement of execution 
of another, dependent task in the other processor. Updated
data, resulting from the completion of the task in
the processor providing the interrupt signal is trans- 
ferred to the other processor at the time of completion 
of the task. Coherency of data transferred may be ensured
by sending the data prior to generating the interrupt.
When the executive in each processor receives the
end of task interrupt signal either from one of its own 
tasks or from another processor in the multiprocessor 
system, it determines from a dependency table those 
tasks which depend upon completion of the task repre- 
sented by the end of task interrupt signal. Current status 
signals are generated according to this determination 
for the purpose of updating the current status of prere- 
quisites for each task. The current status signals are 
stored in memory as a current status list of a task 
prerequisite table. Thus, all tasks yet to be executed
which are dependent on the completion of the task and 
the associated end of task interrupt signal have the cur- 
rent status of their prerequisites updated, with respect 
to that task, in the current status list of the prerequisite 
table. Tasks for which all prerequisites have been met 
are queued for execution in a selected order. 
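A minimal sketch of the multiprocessor case follows, assuming two processors and three tasks that are invented for illustration. The `Processor` class and `signal_completion` function are hypothetical names; the only behaviour taken from the text is that each processor keeps its own tables and that the data is delivered before the end of task interrupt.

```python
class Processor:
    """One processing element with its own prerequisite and dependency tables."""

    def __init__(self, name, prerequisites):
        self.name = name
        self.prerequisites = prerequisites  # local task -> prerequisite tasks
        # Dependency table: completed task -> local tasks that depend on it.
        self.dependents = {}
        for task, prereqs in prerequisites.items():
            for prereq in prereqs:
                self.dependents.setdefault(prereq, []).append(task)
        self.current_status = {task: set() for task in prerequisites}
        self.data = {}    # updated data received from completed tasks
        self.ready = []   # tasks whose prerequisites have all been met
        self.log = []     # order in which data and interrupts arrived

    def receive_data(self, task, value):
        self.data[task] = value
        self.log.append(("data", task))

    def end_of_task_interrupt(self, task):
        self.log.append(("interrupt", task))
        for dependent in self.dependents.get(task, []):
            self.current_status[dependent].add(task)
            if self.current_status[dependent] == self.prerequisites[dependent]:
                self.ready.append(dependent)


def signal_completion(task, value, processors):
    """Notify every processor (including the sender) that hosts a dependent task."""
    for proc in processors:
        if task in proc.dependents:
            proc.receive_data(task, value)    # data first, for coherency
            proc.end_of_task_interrupt(task)  # then the interrupt


# Hypothetical partition: A and B run on P1, C runs on P2 and needs both.
p1 = Processor("P1", {"A": set(), "B": {"A"}})
p2 = Processor("P2", {"C": {"A", "B"}})
signal_completion("A", 1.0, [p1, p2])
signal_completion("B", 2.0, [p1, p2])
```

Because the data always precedes the interrupt in `p2.log`, task C can never be queued before its inputs are present.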
In further accord with the present invention, in a 
multiprocessor system, the architecture may be such 
that data cannot be transferred directly from one processor
to another either due to lack of a direct path or
failure thereof; in such a case, according further to the
present invention, the data must instead first pass
through one or more other processors or associated
memory devices. In such an architecture, the intermediary
processor or processors or their associated memory
devices will serve as intermediaries for the reception of
a task interrupt signal and its associated updated data
relating to the completion of the task from the source
processor to the destination processor. In such a case,
the source processor will send an interrupt which is
received by the intermediary, which also receives
the updated data. After reception of the data, the intermediary
sends the task interrupt signal and data to the
destination processor which then receives the interrupt
and the data. Such “handoffs” of interrupts and data
may be chained in cases where several processor boundaries
must be crossed.
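The chained handoff can be illustrated with a short Python sketch. Everything specific here is an assumption: the `LINKS` table is a hypothetical 2x2 lattice that merely reuses the element numbers of FIG. 1 as labels, the breadth-first route search is one possible way to pick intermediaries, and sending the data before the interrupt on every hop is borrowed from the coherency remark made earlier rather than stated for handoffs.

```python
from collections import deque

# Hypothetical direct links between processing elements (no 12-18 link).
LINKS = {
    12: [14, 16],
    14: [12, 18],
    16: [12, 18],
    18: [14, 16],
}


def find_path(links, source, destination):
    """Breadth-first search for a shortest chain of directly linked processors."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == destination:
            return path
        for neighbor in links[path[-1]]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route exists


def handoff(path, task):
    """List the transfers needed to carry a task's data and interrupt along path."""
    transfers = []
    for sender, receiver in zip(path, path[1:]):
        transfers.append((sender, receiver, "data", task))       # data first
        transfers.append((sender, receiver, "interrupt", task))  # then interrupt
    return transfers


route = find_path(LINKS, 12, 18)
transfers = handoff(route, "A")
```

With these assumed links, element 14 acts as the intermediary between 12 and 18, and a longer route simply produces more data and interrupt pairs.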
In still further accord with the present invention, the
tasks scheduled for execution, for which all prerequisites
have been met, may be scheduled in a plurality of
task execution queues. The number of execution queues
will be greater than or equal to the number of different
task rates for the control system. In other words, there
may be several layers of tasks being accomplished at
different rates within the control system. Each control
rate may have one or more queues associated with it.
The reason for the additional queues within a given task
rate is that in many cases, one set of tasks is considered
more time critical and, therefore, its overall transport
delay must be minimized. Of course, the order of execution
of queued tasks may be selected according to other
types of criteria or as dictated by other priorities.
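One possible arrangement of the execution queues is sketched below. The rates (80, 40 and 20 Hz), the choice of exactly two queues per rate, and the "fastest rate first, time critical queue first" selection rule are all assumptions made for illustration; the text leaves the selection order open.

```python
from collections import deque


class RateQueues:
    """Ready-task queues grouped by task rate, two queues per rate (assumed)."""

    def __init__(self, rates_hz):
        # Index 0 holds time critical tasks, index 1 holds the remaining tasks.
        self.queues = {rate: (deque(), deque()) for rate in rates_hz}

    def queue_count(self):
        return sum(len(pair) for pair in self.queues.values())

    def enqueue(self, task, rate_hz, critical=False):
        self.queues[rate_hz][0 if critical else 1].append(task)

    def next_task(self):
        """Return the next task to run, or None if every queue is empty."""
        for rate in sorted(self.queues, reverse=True):  # fastest rate first
            for queue in self.queues[rate]:             # critical queue first
                if queue:
                    return queue.popleft()
        return None


queues = RateQueues([80, 40, 20])
queues.enqueue("slow", 20)
queues.enqueue("fast_normal", 80)
queues.enqueue("fast_critical", 80, critical=True)
queues.enqueue("mid", 40, critical=True)
order = [queues.next_task() for _ in range(5)]
```

Here three rates give six queues, which satisfies the condition that the number of queues is at least the number of task rates.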
In order to effectively utilize the possible growth and
to achieve the flexibility and other desirable capabilities
of multiprocessor architectures, such as the architectures
pictured without limitation in FIGS. 1 and 2 below,
a new approach, according to the present invention,
is required for the design of the executive.
This is particularly true in a particular class of problems
where the computational tasks are irregular and
each processor operates differently on a different data
base; in other words, where non-homogeneous data
bases are present within a heterogeneous multiprocessor
architecture. That class of problems requires real-time,
sequential computations which are capable of making
data dependent decisions and branching off in non-regular
patterns. Therefore, there is a need for a versatile
multiprocessor system architecture and task executive
that can meet the changing, real-time applications for
such problems by efficiently performing large and ever-changing
complex computations in a sequential manner.
The throughput requirements of these irregular, real-time
computational applications are very large and
complex and can change drastically from application to
application. The full range of arithmetic and data manipulation,
as well as input-output signal handling capabilities
required, can also change drastically, according
to application. In many cases, the computational complexities
are due to the presence of intertwining, looping
and mixing of data flow paths between functions. The
data flow paths and task executions depend on the mode
of operation and serial, data driven decisions.
The need for high throughput is synonymous with
the need for performing a given task within a given time
with a minimum waiting time. For example, in avionic
real-time control systems applications, the computa-




gent since they determine the performance and capabili- 
ties of the system in terms of bandwidth, as well as the 
failure management and reliability qualities of the over- 
all system. The use of multiprocessors stretches the data 
and execution flow across processor boundaries and 
becomes an added factor contributing to the overall 
transport delay. The need for reducing this additional 
transport delay is thus closely associated with the re- 
quirement of efficient and high bandwidth communica- 
tion between the interprocessor data elements. A high 
communication bandwidth capable of rapidly transferring
a large number of signals is particularly necessary
because of the presence of irregular and unpredictable
data and execution flows spread across the multiprocessors.
A given computational task to be executed in multi- 
processor architectures, e.g., such as are illustrated, 
without limitation, in FIGS. 1 and 2, can be approached 
using a number of different methods. A straightforward 
approach would consist of using one or two processors
for the management of input data and using several
other processors for most of the computational tasks.
Output voting planes and built-in-test tasks could then
be performed by the input/output processors. The
problem with this approach is that it does not efficiently
utilize all of the processors all of the time. Some processors
may be underutilized while some others may run
out of real-time.
Further improvement in effective throughput re- 
quires a different scheme in which tasks can be selected 
to be performed in parallel without significant software 
overhead in the executive. Such an approach to the 
design of the task executive involves splitting and merg- 
ing of critical, interdependent tasks for the purpose of 
balancing the overall computational burden. However, 
this calls for a fair amount of sophistication in the execu- 
tive requiring a potentially significant overhead. 
Another, perhaps more important, reason for requiring
a sophisticated executive is the problem of log jam
in which the data and control dependencies can force 
processors to wait for each other. This is a particularly 
difficult situation to predict, test or simulate for in a 
system consisting of more than two processors. If al- 
lowed to develop, it could lead to catastrophic results. 
Other, more subtle forms of log jams can lead to unnec- 
essary and hidden transport delays in the execution of 
critical timing paths. This problem is caused by ineffi- 
cient techniques of scheduling tasks which have met 
their prerequisites, i.e., which are ready to go. Another 
source of large transport delay, is the lack of efficient 
techniques for passing data between processors. 
The event driven executive for a multiprocessor sys- 
tem, according to the present invention, has the very 
important advantage of being unaffected by design 
changes which might in turn affect the execution times 
of tasks. An event driven executive remains unaffected 
by these changes because its execution sequence de- 
pends only on the task dependency specified by the 
precedence graph. 
The problem of obtaining a high overall throughput 
in a multiprocessor system is solved, according to the 
present invention, by using a flexible, event driven exec- 
utive that utilizes a precedence graph for outlining task 
definition for efficient execution of the workload. 
Each modular processing element (e.g., 12) includes a
signal processing entity 24 (referred to as “SP”), having
data lines 26, address lines 28, and control lines 30 connected
to a ring bus 32.
An event driven executive for a multiprocessor sys- 
tem, according to the present invention, provides the 
flexibility of implementation lacking in real-time execu- 
tives and is a key element essential for the effective 
In still further accord with the present invention, the
occurrence of each event suspends the present task for
a review of the relative priorities of the currently suspended
task and the new task(s) for which the event is
a prerequisite. A task of the highest priority, which has
also met all its prerequisites, is then searched for and, if
found, it is then invoked for execution. If not found, the
currently suspended task is re-entered. Thus, dynamic
changes in the relative timings of tasks do not affect the
executive. The executive can also be easily changed
during the design process to reflect a new precedence
graph by simply changing the prerequisite and dependency
tables.
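The review that follows each event might look like the sketch below. It is an interpretation rather than the patented logic: the task names and priorities are hypothetical, a larger number is assumed to mean higher priority, and a new task is assumed to displace the suspended one only if its priority is strictly higher.

```python
def on_event(event, suspended, priority, prerequisites, completed):
    """Return the task to run after `event` has suspended the task `suspended`."""
    completed.add(event)
    # New tasks for which the event is a prerequisite and which are now fully met.
    candidates = [
        task
        for task, prereqs in prerequisites.items()
        if event in prereqs and prereqs <= completed
    ]
    best = max(candidates, key=lambda task: priority[task], default=None)
    if best is not None and priority[best] > priority[suspended]:
        return best        # invoke the higher priority task
    return suspended       # otherwise re-enter the suspended task


# Hypothetical tables; larger number = higher priority (assumption).
priority = {"background": 1, "control": 5, "logging": 2, "fusion": 9}
prerequisites = {
    "control": {"sensor_done"},
    "logging": {"sensor_done"},
    "fusion": {"sensor_done", "nav_done"},
}
completed = set()
first = on_event("sensor_done", "background", priority, prerequisites, completed)
second = on_event("nav_done", first, priority, prerequisites, completed)
third = on_event("unrelated", second, priority, prerequisites, completed)
```

Reflecting a new precedence graph only requires editing the `prerequisites` table in this sketch; the function itself does not change.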
The present invention provides a generic executive
for all configurations and requirements which is driven
by tables of precedences and dependencies based on a
precedence graph of tasks and signals. The executive is
dynamically independent of task timings. It provides
the flexibility needed for design changes which often, in
the design process of the prior art, resulted in architectural
upheavals at very high cost. The present invention
provides the ability to easily optimize any and all critical
paths. Moreover, efficient handling of interprocessor
interrupts is provided. Data signals between processors
are transferred in a coherent manner simply by
sending the data before the interrupt, at the same
time eliminating the need for polling and its associated
inefficiencies; the potential for lock-ups is also
thereby eliminated. Pass through tasks are also efficiently
handled. Traceability and monitoring of normal
task completion events is assured. Fault tolerance for
abnormal events is an additional feature of the present
invention.
These and other objects, features and advantages of 
the present invention will become more apparent in 
light of the detailed description of a best mode embodiment
thereof, as illustrated in the accompanying drawing.
BRIEF DESCRIPTION OF THE DRAWING 
FIG. 1 is a pictorial representation of a two dimen- 
sional multiprocessor lattice architecture in which a 
multiprocessor task executive according to the present 
invention may be utilized; 
FIG. 2 is a pictorial representation of a three dimen- 
sional multiprocessor lattice architecture in which a 
multiprocessor executive according to the present invention
may be utilized;
FIG. 3 is a simplified block diagram illustration of a 
precedence graph, showing a number of tasks to be 
executed in a number of processors and showing the 
interdependencies between the tasks; 
FIG. 4 is a pictorial representation of a dependency 
table showing each of the tasks of FIG. 3 and each of 
the dependent tasks relating to each; 
FIG. 5 is a pictorial representation of a prerequisite 
table showing a prerequisite list for each of the tasks of 
FIG. 3 and also showing a current status list for each of 
the prerequisites for each task; 
FIG. 6 is a pictorial representation of a task identifier 
associated with each of the real time interrupts as well 
as the interprocessor interrupts associated with the ex-




FIG. 7 is a pictorial representation of the operation of 
a multi-tasking hierarchical executive in which several 
task rates are operating at the same time;
FIG. 8 is a pictorial representation of an execution 
sequence illustrating the execution of the tasks illus- 
trated in FIG. 3; 
FIG. 9 is an illustration of a second precedence graph 
for a second multiprocessor system;
FIG. 10 illustrates a dependency table and a 
prerequisite table for the precedence graph of FIG. 9; 
and 
FIG. 11 is a simplified flow chart illustration of a 
plementing a task executive for a multiprocessor sys- 
tem, according to the present invention. 
BEST MODE FOR CARRYING OUT THE 
INVENTION 
FIG. 1 is a pictorial representation of a two-dimen- 
sional multiprocessor lattice architecture 10. A number 
of two-dimensional modular processing elements 12, 14,
16, 18 are illustrated connected to one another in a 
manner to be described in more detail below. The num- 
ber of processing elements is at least two but may be any 
number. 
It should be understood that the architectures depicted
in both FIGS. 1 and 2 are not presented by way
of limitation since the event driven multiprocessor task 
executive disclosed herein is broadly applicable to a 
wide range of different entities, from a mere individual 
“uniprocessor” to a general multiprocessor system. 
A two-dimensional modular input/output controller 
(IOC) 20, as shown in FIG. 1, may be used in the two- 
dimensional multiprocessor lattice architecture 10. 
Such an IOC serves the purpose of communicating data 
and control signals between the outside world and the 
multiprocessor architecture. Additional IOCs may be 
utilized as is indicated by an additional IOC 22, which 
helps to share the input/output task load. It may be 
advantageous from the point of view of modularity to 
have both modular processing elements and modular 
IOCs for use as symmetrical building blocks in the lattice
architecture 10. This does not necessarily imply,
however, that such building blocks would be used, or if
used, that they would operate identically. In other
words, a heterogeneous multiprocessor system is contemplated
to be within the scope of the present invention.
As mentioned above, the task executive of the present
invention may be used in an architecture such as shown
in FIG. 1, but the present invention is not restricted
thereto, although it is particularly advantageous
therein, as will be discussed in greater detail below.
In a two-dimensional architecture each two-dimensional
modular processing element 12, 14, 16, 18 should,
optimally, have four ports. Such are shown in FIG. 1 as
emanating from, e.g., the ring bus 32 and exiting the
modular processing element 12, through each of the
four sides of the dashed lines which indicate the boundaries
of the modular processing element. It will be understood
that an actual circuit implementation of the
two-dimensional multiprocessor lattice architecture (or,
for that matter, an architecture of any dimension) need not
have any relation to the square shapes shown in FIG. 1
since the circuits can be mounted on printed circuit
boards inserted into a chassis with other circuit boards.
The interconnections in such a case will not be so simple
or symmetrical as illustrated here. Thus, these Figures
will, for many cases, merely be pictorial and functional
representations which aid in the presentation of
the concepts involved.
The two-dimensional lattice architecture pictured in 
FIG. 1 relies on a dedicated memory storage area be- 
tween each modular entity and every other modular 
entity with which it communicates in the lattice. This 
dedicated function can most effectively be implemented 
by a dual port random access memory (DPR). Of 
course, a DPR is not absolutely essential since memory 
arbitration using more traditional memory devices 
could be accomplished in lieu thereof. 
If modularity is desired for each of the two-dimensional
modular processing elements 12, 14, 16, 18, it will
be best to provide two dual port RAMs per modular
processing element. The other two ports in each element
will not have a dual port RAM since they will be
interfacing with other modular processing elements
which do. The symmetry of processing elements constructed
in this manner is most advantageous, as may
be illustrated in FIG. 1. There, it will be observed that
modular processing element 12 has a “South” port with
a DPR 34 which interfaces with a “North” port of
modular processing element 16, which does not have a
DPR associated with it. Similarly, the “Eastern” port of
modular processing element 12 does not have a DPR
associated with it but the “Western” port of modular
processing element 14 does have a DPR 36 associated
with it. In this way, the symmetry of the modular processing
elements 12, 14, 16, 18 enhances the facility with
which a multiprocessor lattice may be constructed, in
which each modular processing element communicates
with another modular entity, in general, through a dedicated
DPR. Of course, the symmetry of the individual
processing elements could be different than shown.
The “Northern” port of modular processing element
12 contains a DPR 38 having data and address lines 40
emanating therefrom for connection to another modular
entity (not shown). Of course, it will be understood
that the data and address lines 40 need not necessarily
be connected to another modular entity since the
boundaries of the architecture must end somewhere.
Control lines 42 also emanate from the ring bus 32 for
communication across the “Northern” boundary for the
modular processing element 12. Such lines are not absolutely
necessary but would normally consist of hard
wired interrupts. Such interrupts can also pass through
the DPR rather than being routed separately.
The “Eastern” boundary of the modular processing
element 12 is shown having data and address lines 44
and control lines 46 emanating from the ring bus 32 for
connection to the “Western” boundary of processing
element 14, including DPR 36.
Similarly, the “Western” boundary of entity 12 is
illustrated having data and address lines 48 and control
lines 50 emanating from the ring bus 32.
The “Southern” boundary of the modular processing
element 12 has a port which interfaces with data and
address lines 52, which interface with the ring bus 32 via
the DPR 34. Control lines 54 provide the hard wired
interrupts to the adjacent modular processing element
16.
It will be observed that the modular symmetry of the 
modular IOC 20, with respect to the number of DPRs 
contained therein, is different from that of the modular 
IOC 22. This showing is merely illustrative, however, 
as it will be realized that once a particular symmetry is 




incentive to have another symmetry available. This is
not to say, however, that one or more different symmetries
of either IOCs or SPs cannot be used in the same
architecture. For example, two types of SPs could be
used, one having three DPRs and another having one
only. Furthermore, the processing entities themselves
may all have different processors or processor structures
in them with interfaces that are uniform across the
system.
The modular IOC 22 of FIG. 1 comprises a central
input/output controller (IOC) 60 surrounded by a ring
bus 62 which communicates with data lines 64, address
lines 66, and control lines 68 emanating from the IOC
60. It will be observed that the ring bus 62 for the IOC
22 is slightly different from the ring bus 32 in that it
comprises a “broken circle” with a gap through which
a pair of data lines 70 and control lines 72 emanate at the
“Western” port of the modular IOC 22 for communicating
with I/O devices in the outside world.
At the “Northern” and “Southern” boundaries of the
modular IOC 22 there exist ports having dedicated
memories 74, 76 which may be DPRs, and which may
be used to communicate with other modular entities in
the lattice architecture via data and address bus lines 78,
80 and control lines 82, 84, respectively. The “Northern”
boundary communicates with IOC 20. The modular
entity, if any, communicating with its “Southern”
boundary is not shown but may be an empty slot, another
modular IOC, or a modular processing element.
At the “Eastern” boundary of the modular IOC 22
there is shown a port having data and address lines 86
and control lines 88 for communicating with an adjacent
modular entity. There is no dedicated memory
associated with the “Eastern” port of this particular
modular IOC since, as shown in FIG. 1, it is used in an
application in which the adjacent modular processing
element 16 already has a dedicated memory 90.
FIG. 2 illustrates a three-dimensional lattice architecture
using several three-dimensional modular processing
elements 120, 122, 124, 126 and a three-dimensional
modular IOC 128. The four modular entities 120, 124,
126, 128 can be pictured as lying in the same plane while
the modular entity 122 can be pictured as lying in another
plane, parallel to and behind the front plane.
Other modular entities can be imagined lying in the
same plane with entity 122 but are not shown for the
sake of simplicity. Each of the modular entities in the
three-dimensional lattice is connected to one or more
adjacent modular entities via dual port RAMs (DPRs).
These are shown as cubes in FIG. 2 and are interconnected
between modular entities with dedicated address,
data and control lines. Each of the entities is
illustrated as being surrounded by a “ribbon” bus for
address, data and control lines. It will be observed that
the IOC 128 has its data, address and control “ribbon”
lines broken at one point to permit communication with
the outside world via lines 130 which would be similar
in function to lines 70, 72 of the two-dimensional case
shown in FIG. 1. The three-dimensional lattice architecture
of FIG. 2 is also similar to that of FIG. 1 except
for the added dimension. Of course, it will be realized
that the lattice architecture may be extended to any
number of dimensions which will not be pictured here
because of the difficulty of pictorially showing more
than three dimensions.
As mentioned above, the architectures illustrated in
FIGS. 1 and 2 are presented not by way of limitation
but merely as an aid to the reader in understanding the
context in which the task executive of the present invention
may be utilized. Thus, it will be understood that the
task executive presented and claimed herein may simply
be used on a single processor and, furthermore, is not
restricted in application to the types of architectures
shown in FIGS. 1 and 2 but is broadly applicable to
other architectures as well.
In breaking up a computational job into small units,
the smallest individual unit of software module(s) plus
data and control blocks which may be located in a selected
processor is defined as a task. For example, in
avionics control systems, signal management of a sensor
set would be defined as a task; a triplex signal selection
subroutine may not be defined as a task but would instead
be defined as a component or subtask to be joined
with other subtasks to make up a task. It should be
noted that the definition of a task is not necessarily a
firm one. It requires the tradeoff of modularity and
executive overhead for processing. Since the executive
overhead directly depends on the number of tasks in the
precedence graph, a “small” number is usually desirable.
A precedence graph shows the interrelation of a job
subdivided into a set of tasks. In other words, a precedence
graph specifies the dependencies and prerequisites
of each task. An example of a precedence graph is
provided in FIG. 3. In this Figure, a task 142, labelled
“A”, is started by an “external” event, not specified, but
which may generally be indicated by an ENTER step
140. Tasks 143, 144, 146, respectively labelled “B”,
“C”, and “D”, depend on task A. However, only tasks B
and C can be started by task A because task D also
depends on task B. Similarly, the final task 148, labelled
“E”, depends on tasks D and C. Tasks B and C are to be
performed by processors P2 and P3, respectively, with
processor P1 handling the rest. The overall task precedence
can be represented by one graph for all of the
tasks to be completed by all the processors in a given
time frame. Thus, at the end of executing the task E
shown in FIG. 3, a step 150 will be executed in which an
exit is made. In the normal course of events, the step 140
would be reentered at some point, at which time all of
the tasks A, B, C, D, and E would be re-executed. This
process could go on ad infinitum. It will be understood
that the broadest claims of the present invention are not
restricted to a task executive for a multiprocessor system.
Thus, for the single processor case, the tasks of
FIG. 3 would not be split between three processors but
would be executed, according to the present invention,
using a task executive operating with one processor.
In any multiprocessor architecture, such as are illustrated
in FIGS. 1 and 2, there will normally be various
types of interrupts which must be handled. Such interrupts
might include a macrosync (MS) type of interrupt
which indicates the beginning (or end) of a repetitive
time frame for purposes of synchronization, a real-time
(RT) type of interrupt, as well as interprocessor interrupts
for indicating an end of task or a request to start a
task if prerequisites have been met.
A typical task identifier (ID) is shown in FIG. 6 and
such an identification signal would be transmitted over
the data lines to a processor in conjunction with an
interrupt. First, the processor number, i.e., the processor
designated for performing the task, would be identified
as indicated in a block 160 which may be any number
of bits wide (parallel) or long (serial). Each task may
be assigned a unique alphanumeric identifier as indicated
in a block 162. A task queue number will also be
assigned in a case where there is more than one queue,
e.g., for either different task rates or different queues
within a rate. This is indicated by a block 164 in FIG. 6.
The task type will also be indicated in a block 166 in
which the type of task to be accomplished is identified.
The task types may include a pass-through for a data
block, a request to start a task (if prerequisites are met),
or an end of task signal.
FIG. 4 illustrates a dependency table 152 generated
from the precedence graph of FIG. 3. Entries in the
table contain the sets of task IDs, such as shown in FIG.
6, pertaining to those tasks that depend on a given task.
The table is organized in such a way that the ID of a
task points to the beginning of the set of dependent
tasks. It can be seen that the completion of task A, denoted
by “A” at the left of the table, leads to dependency
table task ID entries for tasks B, C, and D at 154, 156,
158. Similar task ID entries are made for the other tasks
in the precedence graph.
Referring now to FIG. 5, a prerequisite table 160 is
there illustrated. For each executable task listed in a
column of executable tasks designated by a capital letter
at the left of the table, the prerequisite table contains an
entry for both a prerequisite list 162 and a current status
list 164. The list of prerequisites for each executable task
contains all of the other tasks which must be completed
before the task in question can be initiated. This list may
be generated at compile time and is based on the precedence
graph of FIG. 3. A rule may be made that it
cannot be changed during execution. Thus, for example,
task D requires that tasks A and B must be completed
first. The current status list is used to keep abreast of the
status of prerequisites for any given task. In the illustration
of FIG. 5, the current status list indicates that task
A is completed, as indicated by entries 166, 168, 170
corresponding to tasks B, C and D, which depend on
task A and for which task A is a prerequisite, but that
task B is not yet completed, as indicated by the entry
170. Thus, this list represents those prerequisites which
have been met in the current task frame associated with
the task. This list is reinitialized using the list of
prerequisites in the prerequisite table at the task rate.
There may be a number of task rates associated with
a multi-tasking executive. Thus, a task which must be
completed within a relatively short period of time, e.g.,
12.5 milliseconds, will be repeated at an 80 Hertz rate.
Tasks which do not have to be completed so quickly,
e.g., at a 40 Hertz rate, will be repeated every 25 milliseconds.
As shown in FIG. 7, for a multi-tasking executive
in which five different rates are going on at the
same time there will be, in addition, for example, a 20
Hertz rate in which tasks associated with that rate are
accomplished repetitively every 50 milliseconds as
shown in FIG. 7(c). Similarly, at a 10 Hertz rate tasks
are repeated every 100 milliseconds as shown in FIG.
7(d). For a 5 Hertz rate, as shown in FIG. 7(e), there
will be a spacing of 200 milliseconds between repetitions
of those tasks. For each of the rates there will be at least
one execution queue.
The five different task rates of FIG. 7 are each shown
being synchronized by macrosync pulses 172 which are
transmitted throughout the multiprocessor architecture
to establish synchronism. For the five rates shown in
FIG. 7, there will be sixteen repetitions of a 12.5 ms
macrosync before the entire 5-rate task is completed
once.
A task is entered into an execution queue when it
completes its prerequisites. The number of execution
queues will be greater than or equal to the number of
different task rates. The reason for any additional
queues within a given task rate is that in many cases, one
set of tasks, e.g., the pitch axis computations for an
avionic application, will be considered more time critical
and, therefore, their overall transport delay must be
minimized. The additional task queues will, therefore,
be provided for parallel execution.
FIG. 8 illustrates the execution sequence for the precedence
graph of FIG. 3 in relation to the times for
executing each task. As shown, tasks 143 (B) and 144
(C) are performed in processors P2 and P3 and the
remaining tasks are performed in processor P1. The
shaded areas indicate time unused or used by other
processor tasks. Notice that if task 144 (C) takes too
long, as shown by a dashed end of task interrupt line
200, task 148 (E) would be significantly delayed, as
shown by dashed lines 203, as would the earlier end of
task interrupt 202.
Additional interrupts 204, 206 signify to adjacent
processors the end of task “A” while another interrupt
208 signifies the end of task B to processor P1.
The operation of the task executive can be described
as “event” or “interrupt” driven. Only the following
three basic types of events need to be considered:
(1) End of task interrupts,
(2) Pass through interrupts, and
(3) Start request interrupts.
When a processor receives an end of task interrupt, it
uses the task ID as shown in FIG. 6 to locate the set of
dependent tasks in the dependency table as shown in
FIG. 4. Each dependent task ID and its associated
prerequisite criteria are then used to update the current
status of prerequisites in the prerequisite table as shown
in FIG. 5. If all prerequisites for a task are met, the task
is placed on the appropriate execution queue using its
task queue number block in the task ID. The set of all
dependent tasks is processed by the executive in this
manner before exiting from this overhead work. For the
example of FIGS. 3, 4, 5, and 8, the end of task interrupt
202 issued by processor P3 to processor P1 at the completion
of task 144 (C) would result in the updating of
the prerequisite table’s current status list for task E. If
some task were directly dependent upon the completion
of task C, and only task C, then the end of task interrupt
issued by task C would result in the scheduling of that
task in the appropriate processor’s execution queue.
There will be cases where an interrupt will have to
cross more than one processor boundary. For example,
a task in processor P3 could be a prerequisite for a task
in processor P2. In that event, the interrupt from P3
would have to “pass through” P1. A pass through interrupt
and updated data are provided to P1 for relay to P2.
P1 would respond to this interrupt and data by using the
associated task ID to determine the source and destination
of the data block. The end of task interrupt and data
would then be provided to P2 for execution. The dependency
table may or may not include an entry of the pass
through task(s). The dependency tables shown in FIG.
4 do not include such an entry because it is directly and
most rapidly handled by the interrupt service routine
itself.
In case of data blocks which may be used locally, as
well as passed through to another processor, two possible
approaches need to be traded off. The first involves
not classifying the task as a pass-through, but as an end
of task signal and operating as described above. The
alternate involves performing the pass-through task as
described above and then setting an event flag so that 
the data block can be used locally using the dependency 
and prerequisite tables. The latter approach may be 
preferred since the requesting processor cannot always 
determine whether or not a data block is only being 
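As an aid to the reader, the dependency table of FIG. 4 and the prerequisite table of FIG. 5 discussed earlier can be sketched for the precedence graph of FIG. 3. This is a minimal sketch only: the dictionary layout and the names `PREREQUISITES`, `PROCESSOR`, `build_dependency_table` and `new_status_list` are assumptions made for illustration, since the disclosure does not prescribe a storage format.

```python
# Hypothetical encoding of the FIG. 3 precedence graph. All names are
# illustrative assumptions, not taken from the disclosure.
PREREQUISITES = {
    "A": [],            # started by an external event (ENTER step 140)
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "B"],    # "task D requires that tasks A and B must be completed"
    "E": ["C", "D"],
}

# Processor assignment stated in the description of FIG. 3.
PROCESSOR = {"A": "P1", "B": "P2", "C": "P3", "D": "P1", "E": "P1"}


def build_dependency_table(prerequisites):
    """Invert the prerequisite lists: task -> tasks that depend on it."""
    table = {task: [] for task in prerequisites}
    for task, needed in prerequisites.items():
        for prerequisite in needed:
            table[prerequisite].append(task)
    return table


def new_status_list(prerequisites):
    """Fresh current status list: one 'completed' flag per prerequisite."""
    return {task: {p: False for p in needed}
            for task, needed in prerequisites.items()}


DEPENDENCY_TABLE = build_dependency_table(PREREQUISITES)
print(DEPENDENCY_TABLE["A"])  # ['B', 'C', 'D']
```

Deriving the dependency table from the prerequisite lists keeps the two tables consistent by construction, which fits the remark that the prerequisite list may be generated at compile time from the precedence graph.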
A start request interrupt may be used to request a 
processor to start a task, specified by the task ID, re- 
gardless of its prerequisites. This interrupt may be used 
to initiate tasks that have no prerequisites, e.g., real time 
and macrosync (MS) interrupts. These interrupts can be 
handled as end of task interrupts as well. However, a 
mechanism is sometimes needed to start a task in an- 
other processor regardless of what it was doing. 
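The relation between start requests and the task rates of FIG. 7 can be illustrated with a short sketch. It assumes, purely for illustration, that the first task of each rate group has no prerequisites and is released by a start request on the macrosync pulse that begins its frame; the names `MACROSYNC_MS`, `RATES_HZ`, `period_in_macrosyncs` and `rate_groups_to_start` are hypothetical.

```python
MACROSYNC_MS = 12.5               # period of the fastest (80 Hz) rate
RATES_HZ = [80, 40, 20, 10, 5]    # the five rates described for FIG. 7


def period_in_macrosyncs(rate_hz):
    """Number of 12.5 ms macrosync periods in one frame of the given rate."""
    return round((1000.0 / rate_hz) / MACROSYNC_MS)


def rate_groups_to_start(pulse_index):
    """Rates whose frame begins on this macrosync pulse (pulse 0 starts all)."""
    return [rate for rate in RATES_HZ
            if pulse_index % period_in_macrosyncs(rate) == 0]


# Assumed usage: list the rate groups released on the first few pulses.
for pulse in range(4):
    print(pulse, rate_groups_to_start(pulse))
```

Under these assumptions the 5 Hertz frame spans sixteen macrosync periods, which agrees with the sixteen repetitions of the 12.5 ms macrosync mentioned for FIG. 7.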
Referring now to FIG. 11, a simplified flow chart 
illustration shows a series of logical steps which may be 
implemented in carrying out the tasks illustrated in 
FIGS. 3, 4, 5 and 8. 
After entering at a step 210, a decision step 212 is next 
executed in which a determination is made as to 
whether an internal end of task signal has been gener- 
ated. If so, a decision step 214 is next executed in which 
a determination is made as to whether or not there are 
any external dependencies depending on the completion 
of the indicated task. If so, a step 216 is next executed in 
which data relating to the completion of the task is 
transferred to any and all other processors dependent 
on completion of the task. An end of task interrupt 
signal may then be provided, as indicated in a step 218, 
to any and all other processors dependent on completion
of the task. Steps 218 and 216 could be interchanged,
but the transfer of data first is the preferred
technique since coherency can be ensured if the end of 
task interrupt is sent only after data transfer is complete. 
Such an approach would be based on not permitting the 
destination processor to access data until it has received 
the end of task interrupt. 
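The preferred ordering of steps 216 and 218 can be sketched as follows. The sketch assumes a shared buffer standing in for a dual port RAM and a boolean flag standing in for the end of task interrupt; the class `DualPortBuffer` and its method names are hypothetical and are not part of the disclosure.

```python
class DualPortBuffer:
    """Stand-in for a DPR shared by a source and a destination processor."""

    def __init__(self):
        self._data = None
        self._end_of_task = False   # models the end of task interrupt

    def transfer_data(self, block):
        """Step 216: the source processor writes the data block first."""
        self._data = list(block)

    def send_end_of_task(self):
        """Step 218: the interrupt is raised only after the transfer."""
        if self._data is None:
            raise RuntimeError("interrupt raised before data transfer")
        self._end_of_task = True

    def read(self):
        """The destination may read only after receiving the interrupt."""
        if not self._end_of_task:
            raise PermissionError("no end of task interrupt received yet")
        return self._data
```

Because the destination refuses to read until the flag is set, and the flag cannot be set before the data is written, the destination never observes a partially updated block in this model.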
If it had been determined in step 212 that there had 
been no internal end of task signal generated, then a step 
220 would next have been executed in which a determi- 
nation is made as to whether or not an end of task inter- 
rupt signal has been received from another processor. If 
so, a step 222 is next executed in which a determination 
is made as to whether or not the end of task signal repre- 
sents a pass-through of data intended for another pro- 
cessor. If it is a pass-through, then a step 224 is next 
executed in which the pass-through data is received and 
forwarded to the target processor. This of course may 
be by way of a “chain” of processors and memory stor- 
age areas, much like a “bucket brigade.” 
Of course, the end of task interrupt must also be trans- 
mitted to the target processor or to the intermediary 
processor, as indicated in a step 226. 
At the conclusion of step 226 or, if it had been deter- 
mined in step 222 that there had been no request for a 
pass-through, then a step 228 is next executed in which 
updated data from another processor is received and 
stored. 
After step 228 is completed or, after step 218 is com- 
pleted or, if it had been determined in step 214 that there 
were no external dependencies, then a step 230 is next 
executed in which a dependency table is consulted to 
determine those internal tasks which depend upon com- 
pletion of the completed task as represented by the just 
received end of task interrupt signal. The current status 
list of prerequisites completed is then updated for each 
such task. The current status list is then compared to the
prerequisite list, as indicated in a step
232. Those tasks for which all prerequisites are met are
then queued for execution, in a selected order, as indi- 
cated in a step 234. 
After completion of step 234 or, if it had been determined
in step 220 that there had been no end of task
interrupt signal received from another processor, then 
an exit is made as indicated in a step 236. 
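The logical steps of FIG. 11 described above can be sketched as a single pass of a per-processor executive, applied here to the precedence graph of FIG. 3. This is a minimal sketch under stated assumptions: the class `Executive`, its attributes, and the `source` and `forward_to` parameters are hypothetical names, the tables are plain dictionaries, and the actual data transfers of steps 216, 224 and 228 are not modelled.

```python
from collections import deque

# FIG. 3 graph and the processor assignment stated in the description.
PREREQS = {"A": [], "B": ["A"], "C": ["A"], "D": ["A", "B"], "E": ["C", "D"]}
HOME = {"A": "P1", "B": "P2", "C": "P3", "D": "P1", "E": "P1"}


class Executive:
    """Hypothetical per-processor executive following the steps of FIG. 11."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()    # execution queue
        self.sent = []          # (destination, task) interrupts issued
        # Current status list for the tasks executed on this processor.
        self.status = {t: set() for t in PREREQS if HOME[t] == name}

    def handle(self, task, source, forward_to=None):
        """Process one event; source is 'internal', 'external' or None.

        Returns the list of flow chart step numbers visited.
        """
        steps = [210, 212]
        if source == "internal":
            steps.append(214)
            dependents = [t for t, needed in PREREQS.items() if task in needed]
            remote = sorted({HOME[t] for t in dependents} - {self.name})
            if remote:
                steps += [216, 218]            # data first, then interrupt
                self.sent += [(dest, task) for dest in remote]
        else:
            steps.append(220)
            if source != "external":
                return steps + [236]           # nothing received: exit
            steps.append(222)
            if forward_to is not None:         # pass-through request
                steps += [224, 226]
                self.sent.append((forward_to, task))
            steps.append(228)                  # receive and store data
        steps += [230, 232, 234, 236]
        for t, completed in self.status.items():
            if task in PREREQS[t]:
                completed.add(task)                  # step 230: update status
                if completed == set(PREREQS[t]):     # step 232: compare lists
                    self.queue.append(t)             # step 234: queue task
        return steps


# Assumed usage: task A ends on P1, and P2 receives the resulting interrupt.
p1, p2 = Executive("P1"), Executive("P2")
print(p1.handle("A", "internal"))
print(p2.handle("A", "external"))
print(list(p2.queue))
```

Returning the visited step numbers is only a device for comparing the sketch with the flow chart; a real executive would perform the transfers and interrupts instead of recording them.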
Another example of a precedence graph for a task 
executive is shown in FIG. 9. This example is slightly 
more complex than the example shown in FIG. 3. The
tasks in FIG. 9 are distributed among four processors, 
P3, P1, P2, P4. The tasks are illustrated, as in FIG. 3, as 
being vertically partitioned between the four proces- 
sors. This method of pictorial representation has no 
special significance other than to indicate a separation 
of processors into separate and distinct signal processing
elements. Dependency and prerequisite tables 211a,
211b corresponding to the graph of FIG. 9 are shown in
FIG. 10. 
As with FIG. 3, when a processor receives an end of 
task interrupt it uses the task ID to locate the set of 
dependent tasks in the dependency table. Each depen- 
dent task ID and its associated prerequisite criteria is 
used to update the current status list of prerequisites in
the prerequisite table. If all prerequisites are met, the 
task is placed on the appropriate execution queue giving 
its task ID. The set of all dependent tasks are processed 
in this manner before exiting from this task. For the 
example of FIGS. 9 and 10, the dependency and
prerequisite tables indicate that the end of task interrupt 
issued by task C would result in the scheduling of tasks
F and G in the appropriate processor execution queues
and the updating of the prerequisite status of task H. 
As before, with regard to interrupts and/or data 
which must cross processor boundaries, a pass-through 
interrupt is provided. Again, a processor will respond to 
this interrupt by using the associated task ID to deter- 
mine the source and destination of the data block. The 
task is performed within an interrupt service routine in
order to achieve the highest throughput rate for pass-through tasks.
For a more detailed example of a pass-through than 
given before, as seen in the precedence graph of FIG. 9, 
the completion of task E in processor P4 requires a
pass-through interrupt to processor P2 in order to complete
the prerequisites of task J in processor P1. The
task completion interrupt and updated data are provided
to P2 by P4 and result in the scheduling of the
pass-through task. P2 interrupts processor P1 and transfers
the necessary data to P1. Processor P1 uses this interrupt
from P2 to update the prerequisite table’s current
status list for task J. Again, note that the dependency
table does not include an entry of the pass-through
task(s) because these tasks are more efficiently handled
in the interrupts via a look-up table, not shown. 
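Since the look-up table is not shown, the following is only a guess at how such a table might be consulted inside an interrupt service routine, using the P4 to P2 to P1 relay of task E from the example above. The table `PASS_THROUGH`, the function `service_interrupt` and the returned tuples are all hypothetical.

```python
# Hypothetical look-up table for an intermediary processor:
# (this processor, completed task) -> next processor in the chain.
PASS_THROUGH = {
    ("P2", "E"): "P1",   # FIG. 9 example: P4 -> P2 -> P1 for task J
}


def service_interrupt(processor, task, data):
    """Relay a data block if this processor is only an intermediary."""
    next_hop = PASS_THROUGH.get((processor, task))
    if next_hop is None:
        # Not a pass-through here: handled through the dependency and
        # prerequisite tables instead.
        return ("local", processor, task, data)
    # Pass-through: forward the data, then the interrupt, to the next hop.
    return ("forward", next_hop, task, data)
```

A single dictionary lookup keeps the relay decision inside the interrupt service routine, in line with the stated aim of handling pass-throughs without consulting the dependency table.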
Again, the comments with respect to data blocks
which may be used locally, as well as passed through to
another processor, as made previously with respect to
FIG. 3, apply here as well.
The disclosure made previously with respect to FIG.
3 concerning start request interrupts is also applicable
with regard to FIG. 9.
Although the invention has been shown and described
with respect to a best mode embodiment
thereof, it should be understood by those skilled in the
art that the foregoing and various other changes, omissions,
and additions in the form and detail thereof may
be made therein without departing from the spirit and
scope of the invention.
We claim:
1. A method of controlling the execution of a plurality
of data-interdependent tasks in at least one signal
processor, comprising:
determining the order in which said tasks may be
executed so that any one of said tasks dependent on
data to be provided by any other ones of said tasks
will be executed after completion of said other
ones of said tasks, by establishing a dependency
table indicative, for each of said other tasks, of any
one of said tasks dependent on such other tasks, and
establishing a stored prerequisite table including a
prerequisite list indicative, for any one of said tasks,
of any of said other tasks on which said one task is
dependent and a current status list indicative
of whether or not each of said other
tasks in said prerequisite list has been completed,
any of said tasks which is not dependent on any of
said other ones of said tasks having an immediate
enter status associated therewith in said dependency
table and in both lists of said prerequisite
table;
executing, first, any of said tasks which is not dependent
on any of said other ones of said tasks, as
indicated by said immediate enter status, and to
said dependency table a corresponding completion
of execution of each such task, issuing an end of
task signal;
in response to each of said end of task signals, determining
from said dependency table each of said
tasks dependent on the task issuing said end of task
signal and, for each dependent task so determined,
entering into the corresponding portion of said
current status list, as determined by said
prerequisite list, an indication that the task issuing
said end of task signal has been completed, and
queuing, for execution in a selected order, each task
for which said status list indicated completion of
every corresponding task in said prerequisite list.
2. A method according to claim 1 for controlling the
execution of a plurality of data-independent tasks in a
plurality of signal processors, comprising:
establishing in each given one of said signal processors,
a stored table of task identifiers indicative, for
each task dependent on any of said other tasks to be
executed in said given signal processor, of the identity
of said dependent task and the specific one of
said signal processors within which said dependent
task is to be executed;
said end of task signal comprising an end of task interrupt
signal issued from said given signal processor
and received by said specific signal processor; and
in response to an end of task interrupt signal relating
to any one of said other tasks in said given signal
processor, transferring, from said given signal processor
to said specific signal processor, the data
resulting from completion of such one of said other
tasks in said given signal processor related to said
dependent task.
3. A method according to claim 2 for controlling the
execution of a plurality of data-interdependent tasks in
at least three processors, comprising:
establishing, in one of said tables of task identifiers in
at least one of said given signal processors, an indication
of the fact that one of said tasks, to be executed
in a certain one of said specific signal processors
other than said given signal processor, is a data
block pass-through task, execution of which will
pass a block of data from said given signal processor
through said certain specific signal processor to
a third one of said signal processors;
establishing, in one of said tables of task identifiers in
said certain specific signal processor, an indication
that a task related to said data block pass-through
task is to be executed in said third signal processor;
queuing said data block pass-through task for execution
in said certain specific signal processor in response
to receipt by said certain specific signal
processor of said end of task interrupt related to
said data block pass-through task from said given
signal processor; and
issuing from said certain specific signal processor an
end of task interrupt signal to said third signal processor
in response to completion of said data block
pass-through task in said certain specific signal
processor.
* * * * *
