Spaceborne multiprocessing seminar by unknown
https://ntrs.nasa.gov/search.jsp?R=19670007772 2020-03-24T02:27:22+00:00Z
SPACEBORNE 
MULTIPROCESSING SEMINAR 
MUSEUM OF SCIENCE 
Boston, Massachusetts 
October 31, 1966 
Sponsored by 
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION 
ELECTRONICS RESEARCH CENTER
Cambridge, Massachusetts
FOREWORD 
As the Nation's space program evolves, increased emphasis is being placed on developing the computer technology to support an expanding class of on-board computational tasks. Within this framework, various multiprocessing approaches appear to offer strong promise of accommodating such complex requirements in future missions.

We have been most fortunate in arranging an outstanding group of speakers, each of whom was selected on the basis of important contributions to computer technology. Capsule summaries of the presentations have been compiled in this document so that all attendees may have a background reference for the papers to be delivered.

I am especially grateful to all those present, both for their interest in the proceedings and for taking time out from their busy schedules to participate in the Seminar.

THOMAS E. BURKE
General Chairman 
CONTENTS

REQUIREMENTS/ORGANIZATION

Multiprocessor Organization for Manned Mars Mission
    A. Williman, L. Koczela, and G. Burnett
    Autonetics, Anaheim, California

Functional Requirements of Spaceborne Computers on Advanced Manned Missions
    P. S. Schaenman and E. L. Gruman
    Bellcomm, Inc., Washington, D. C.

Design Criteria for a Spacecraft Computer
    R. Alonso, A. L. Hopkins, and H. A. Thaler
    M.I.T. Instrumentation Laboratory, Cambridge, Mass.

Executive Program Control for Spaceborne Multiprocessors
    R. Hokom
    Autonetics, Anaheim, California

ERROR CONTROL

Self-Repair: Fault Detection and Automatic Reconfiguration
    E. C. Joseph
    Univac, St. Paul, Minn.

Arithmetic Error Correction
    H. L. Garner
    University of Michigan, Ann Arbor, Michigan

System Organization of the JPL Self-Testing and -Repairing Computer and Its Extension to a Multiprocessor Configuration
    A. Avizienis
    NASA Jet Propulsion Laboratory, Pasadena, California

COMPUTER AIDS

Computer Design Assistance for the Evolving Large Scale Integrated Circuit Technology
    J. S. Merritt
    Honeywell Aerospace Division, St. Petersburg, Florida

Essential Features of On-Line Systems
    H. Huskey
    University of California, Berkeley, California

On-Line Simulation in the OPS System
    M. Greenberger and M. M. Jones
    M.I.T. Project MAC, Cambridge, Mass.
REQUIREMENTS/ORGANIZATION
MULTIPROCESSOR ORGANIZATION 
FOR MANNED MARS MISSION 
A. O. WILLIMAN

Mr. Williman is Chief, Advanced Digital Systems Group, Data Systems Division, Autonetics, a division of North American Aviation. Mr. Williman has been with Autonetics since 1959. Since then he has worked on Minuteman Systems Engineering, determination of computer requirements for future applications, and advanced systems analysis. Prior to joining Autonetics, he worked in the Missile Division of North American Aviation on advanced missile design projects and at Hallamore Electronics Co. on the F106 aircraft projects. He obtained his M.S.M.E. degree in 1957 from the University of Southern California and his B.S.E. degree in 1952 from U.C.L.A.

L. J. KOCZELA

Mr. Koczela is a Senior Research Engineer, Advanced Digital Systems Group, Data Systems Division, Autonetics, a division of North American Aviation. Mr. Koczela has been with Autonetics since 1963, primarily engaged in determining computer requirements and computer systems definition for numerous spaceborne projects. Currently he is assigned as Principal Investigator of the NASA Spaceborne Multiprocessing Study. Prior to joining Autonetics, he worked at the Missile and Space Division of General Electric on electronic systems for space applications. He obtained the M.S.E.E. degree in 1963 from the University of Pennsylvania and the B.S.E.E. degree in 1961 from Rutgers University.

GERALD J. BURNETT

Mr. Burnett is a Senior Research Engineer, Advanced Digital Systems Group, Data Systems Division, Autonetics, a division of North American Aviation. Mr. Burnett has been with Autonetics since September 1965, working primarily with multiprocessing organizations and silicon sapphire technology. He is currently investigating computer designs for a NASA Spaceborne Multiprocessing Study. Prior to joining Autonetics, Mr. Burnett worked at Project MAC, M.I.T., while obtaining his M.S.E.E. degree from M.I.T. (1965). He received his B.S.E.E. degree from M.I.T. in 1964.
MULTIPROCESSOR ORGANIZATION FOR MANNED MARS MISSION 
By A. Williman, L. Koczela, G. Burnett 
Autonetics, A Division of North American Aviation, Inc., Anaheim, California
SUMMARY 
Three different approaches to computer organization for a long duration manned space mission are given. Flexibility in meeting computational requirements is an important factor in the computer system design. The organizations are currently being evaluated, and preliminary evaluations indicate that a Multiprocessor or Distributed Logic approach offers the most potential for future space missions.
INTRODUCTION

The purpose of this paper is to review some of the considerations in the application of digital computers to long duration space missions. The space missions to be considered are the extended manned missions which are typical of advanced earth orbital and manned planetary missions. A manned Mars lander mission would typically have a duration of 420 days, and an earth orbiting space station might have a mission duration of 1 year. To develop rather specific requirements for computer design, a manned Mars mission was considered in detail. Although some of the processing tasks are unique to this mission, the resulting computer configurations will in general be applicable to other space missions.

REQUIREMENTS

The desirability of some form of digital computation aboard the vehicle has been demonstrated in the on-board control of Gemini reentry and will be further shown in Apollo. The Mars mission computational tasks which can be mechanized on a digital computer system are shown in Figure 1. They are basically divided into two groups: Command and Control, and Mission Data Processing. The computational requirements vary considerably from phase to phase during a mission, as shown in Figures 2 and 3. The reliability requirements for the Mars mission were defined as a 0.997 probability of success and a 0.997 availability for approximately 10,000 hours.
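To give a feel for how demanding the 0.997 figure is, the sketch below converts it into an equivalent mean time between failures. It assumes a single non-redundant computer with a constant failure rate (an exponential failure model); that model and the function names are illustrative assumptions, not something stated in the paper.

```python
import math


def required_mtbf(p_success: float, mission_hours: float) -> float:
    """MTBF (hours) one unit needs so that exp(-t / MTBF) equals p_success.

    Assumes a constant failure rate, which is an assumption of this sketch.
    """
    return -mission_hours / math.log(p_success)


def survival_probability(mtbf_hours: float, mission_hours: float) -> float:
    """Probability that one unit with the given MTBF survives the mission."""
    return math.exp(-mission_hours / mtbf_hours)


if __name__ == "__main__":
    # Requirement quoted in the text: 0.997 over roughly 10,000 hours.
    print(f"{required_mtbf(0.997, 10_000):,.0f} hours")
    # 25,000 hours is the best computer MTBF used in the paper's simulations.
    print(f"{survival_probability(25_000, 10_000):.3f}")
```

Under these assumptions a lone computer would need an MTBF of roughly 3.3 million hours, while a single 25,000-hour unit would survive the mission with a probability of about 0.67. This is consistent with the paper's emphasis on redundancy and reconfiguration rather than on a single highly reliable machine.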
Command and Control
    Guidance and Navigation
        Navigation
        Targeting
        Required Velocity
        Velocity-to-be-Gained
        Flight Sequencing
        Steering
        G & N Controls and Displays
        Optical Sensors Orientation
    Vehicle Attitude Control
        Angular Rate Stabilization & Control
        Translation Control
        Thrust Vector Control
        Attitude Controls and Displays

Mission Data Processing
    Telecommunications
        Antenna Orientation
        Data Processing
        Communications Controls and Displays
    Image Sensor Data Processing
        Image Sensors Orientation & Sequencing
        Image Sensors Data Correlation/Analysis
        Image Sensors Data Compression
        Image Sensors Controls & Displays
    Scientific Sensor Data Processing
        Scientific Sensors Orientation & (remainder illegible)
        Scientific Sensors Data Correlation
        Scientific Sensors Data Compression
        Scientific Sensors Controls & Displays
    System Self-Test Operations
        Automatic Self-Test Operations
        Self-Test Controls and Displays
    System Performance Monitor
        Performance Data Compression
        Monitor Controls and Displays

Figure 1. Computational and data processing functions
Figure 2. Memory requirements per manned Mars mission phase

Figure 3. Computational speed requirements per manned Mars mission phase
MODULAR COMPUTATION SYSTEMS 
A single computer is quite efficient for a specific requirement; however, as soon as that requirement is altered (for example, if the speed requirement is doubled), difficulty is encountered. For this reason, and others as pointed out below, the potential of a modular computational approach was investigated. What are the advantages expected from varying the computational capability of the computer system? Power (which is very important for long duration missions) can be saved by being able to turn modules on and off. Reliability is increased, given that failure rates of dormant equipment are lower than those of operating equipment (which appears to be the case from preliminary data). Probability of mission success and computer availability are greatly enhanced due to the capability of withstanding failures by reconfiguration at the module level. Only a portion of the computational capability may be lost during a failure, thereby giving the possibility of "graceful degradation."
The computational requirements have been shown to vary considerably during a typical long duration space mission. If the computer system were designed in terms of modules, the potential of turning off some of the modules during various phases would exist. As an example in the Mars mission considered, if two computer modules are designed to share the load during the Mars orbital phase, then only one of these modules may be required during the long duration coast and cruise phases; this capability results in the advantages given above, most notably in power and reliability.
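The phase-by-phase switching described above can be sketched as a simple capacity calculation. The 200,000 operations per second module capacity is the processor speed quoted later in the paper; the per-phase loads below are hypothetical placeholders chosen only for illustration (the actual values are in Figure 3 and are not reproduced here), and the function names are likewise assumptions.

```python
import math

# Processor speed quoted in the paper for each computer module (ops/sec).
MODULE_CAPACITY_OPS = 200_000

# HYPOTHETICAL per-phase loads in ops/sec; placeholders, not Figure 3 data.
PHASE_LOAD_OPS = {
    "coast": 120_000,
    "cruise": 150_000,
    "mars_orbit": 350_000,
}


def modules_needed(load_ops: int, capacity_ops: int = MODULE_CAPACITY_OPS) -> int:
    """Smallest number of powered modules that covers the load (at least one)."""
    return max(1, math.ceil(load_ops / capacity_ops))


def power_plan(loads: dict) -> dict:
    """Map each mission phase to the number of modules left switched on."""
    return {phase: modules_needed(load) for phase, load in loads.items()}


if __name__ == "__main__":
    for phase, count in power_plan(PHASE_LOAD_OPS).items():
        print(phase, count)
```

With these placeholder loads, both modules are powered only in the Mars orbital phase and one can be switched off during coast and cruise, which is the situation the paragraph describes.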
In addition to the above, the computational requirements are expected to vary substantially from mission to mission (for example, the manned Mars Lander vs the unmanned Mars flyby) and, in fact, from the Mission Module of a spacecraft to the Lander Module. A modular concept then enables the setting up of a baseline set of computing modules upon which many computer systems can be built. This enables a common sparing philosophy on any particular mission as well as common sparing, development, production, and testing for a range of missions.
Another point worth mentioning is the use of the modules to back up each other during critical mission phases such as Mars Orbit Injection and Earth Reentry. During these phases, 5 seconds is typical of the maximum allowable time for switching to a backup or redundant system in case of a failure. This requires fault detection, isolation, and reconfiguration to an on-line backup within 5 seconds. The critical computations occur in nonheavily loaded phases and as a result the extra modules necessary only in the heavily loaded Mars Orbital phase can be used as on-line backup; however, even if the extra modules were not available, modules not carrying out critical computations could be used as backups in order to be able to withstand multiple failures. This shows that all the modules in the system can be used in order to obtain a very high probability of carrying out critical computations (probability of success).

The discussion so far has indicated some advantages if so-called modularity is introduced into the computational system. The question which now must be answered is how to attain this modularity. It is then possible to estimate what the modularity costs so that an evaluation can be made to determine the most effective approach. Three computer system organizations have been investigated to attain modularity. These are Multiple Computer, Multiprocessor and Distributed Logic Organizations.

Multiple Computer

The Multiple Computer system consists of two independent computers, each with a 200,000 operation per second processor, a 24K-word, 18-bit memory, and a program-controlled I/O section. The individual computers perform self-checks and are interconnected so that critical system outputs can be automatically switched from a failed computer to a correctly operating computer through an output switch. A block diagram of the system is shown in Figure 4. The processor features shown in this figure were developed from an analysis of the computational requirements. The software considerations for this organization and the other two organizations are given in a following paper by R. Hokom¹ entitled "Executive Program Control for Spaceborne Multiprocessors."

Figure 4. Multiple computer organization

Multiprocessor

The Multiprocessor, shown in Figure 5, consists of two 200,000 operation per second processors, three 12K-word, 18-bit memories and two program-controlled I/O units with full intercommunication between the processors and the other modules in the system. The requirements for this system are the same as those for the Multiple Computer; as a result the processor features are the same. It should be noted, however, that only three 12K memory modules were necessary instead of the four in the Multiple Computer. This more efficient use of the memory was obtained due to the flexible processor-memory communication; however, approximately 2.5 times as many lines will be required for the Multiprocessor as for the Multiple Computer. The memories and the input/output units will have lock-out features which will permit the Multiprocessor to operate essentially as a Multiple Computer system during critical mission phases where redundant calculations are required. This prevents the failure of one processor-memory system from annihilating information in memories or disturbing the operation of the other processing system.

Figure 5. Multiprocessor organization (same features and instruction word as the Multiple Computer)

Figure 6. Example of parallelism
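The memory comparison between the Multiple Computer and Multiprocessor organizations described above can be checked with a few lines of arithmetic. This is a minimal sketch: it assumes "K" means 1024 words (the paper does not say), and the helper names are hypothetical.

```python
WORD_BITS = 18  # word length quoted in the paper
K = 1024        # ASSUMPTION: "K" is read as 1024 words


def total_words(modules: int, k_words_per_module: int) -> int:
    """Total memory words across identical modules."""
    return modules * k_words_per_module * K


def total_bits(words: int) -> int:
    """Total storage in bits for the given number of 18-bit words."""
    return words * WORD_BITS


# Multiple Computer: two computers, each with its own 24K memory
# (equivalent to four 12K modules).
multiple_computer_words = total_words(2, 24)
# Multiprocessor: three shared 12K modules.
multiprocessor_words = total_words(3, 12)

memory_saving = 1 - multiprocessor_words / multiple_computer_words

if __name__ == "__main__":
    print(multiple_computer_words, multiprocessor_words, memory_saving)
```

The shared-memory organization needs one quarter less memory for the same requirements; the price quoted in the paper is roughly 2.5 times as many interconnection lines.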
Distributed Processor

The third type of computer organization considered is a distributed logic organization. This type of computer contains a decentralization of the logic elements on an array basis. Each element or cell communicates with a number of other cells, and each cell has some memory associated with it. The complexity of a cell can vary from the execution of a single instruction to a small computer. The control of the execution of a program in the array can either be distributed among the cells or can be centrally controlled. The array type construction as depicted in the Holland machine² is typical of a local control type configuration. The Solomon computer³ is typical of the global control type configuration. Determination of which approach to follow was one of the first requirements of the distributed logic study. To aid in this determination it was necessary to define the parallelism associated with computational problems. Two types of parallelism have been defined: the first, natural parallelism, is the property of carrying out a number of operations on distinct data bases or on the same data base simultaneously and independently; the second, applied parallelism, is where a number of exactly the same operations on distinct data bases or on the same data base are carried out simultaneously.

Figure 6 is an example of applied and natural parallelism for a sum of products type computation. The sequential operation is shown at the top of the figure. The applied parallelism example shows the identical operations a/x and b/z being computed simultaneously. The computation c × y is done sequentially and the sum is made sequentially. At the bottom of the figure the use of natural parallelism is shown by the parallel solution of c × y. The saving in time by the use of parallelism is shown at the right hand side of the figure, with the solution time for applied parallelism being 2/3 of that for the sequential computation and the solution time with both applied and natural parallelism being 1/2 that of the sequential solution time.

Computational requirements for the Mars mission were examined to determine the amount of applied parallelism and natural parallelism that could be mechanized. Figure 7 shows the improvement in computational speed as a function of the number of cells in applied parallelism. It can be seen from the curve that after 25 cells, there is little further improvement in computational speed. Figure 8 shows the effect of natural parallelism on the computational task speed; this figure shows that after six computational centers the further improvement in computational speed is negligible.

Figure 7. Applied parallelism degree of complexity vs computation speed
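The 2/3 and 1/2 ratios quoted for the Figure 6 example can be reproduced with a small timing model. The paper does not give operation times; the sketch below assumes a division takes two time units while a multiplication and the final summation take one unit each. These costs were chosen here only because they reproduce the quoted ratios, and all names are hypothetical.

```python
from fractions import Fraction

# ASSUMED relative operation times (not given in the paper).
DIV, MUL, SUM = 2, 1, 1


def sequential_time() -> int:
    """a/x, then b/z, then c*y, then the summation, one after another."""
    return DIV + DIV + MUL + SUM


def applied_parallel_time() -> int:
    """The two identical divisions run together; c*y and the sum stay serial."""
    return DIV + MUL + SUM


def applied_and_natural_time() -> int:
    """c*y also overlaps the divisions; only the summation remains serial."""
    return max(DIV, MUL) + SUM


def ratio_to_sequential(time_units: int) -> Fraction:
    """Solution time as a fraction of the purely sequential time."""
    return Fraction(time_units, sequential_time())


if __name__ == "__main__":
    print(ratio_to_sequential(applied_parallel_time()))
    print(ratio_to_sequential(applied_and_natural_time()))
```

Other cost assignments would give different ratios, so this is one consistent reading of the figure rather than a reconstruction of the authors' timing data.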
The chosen Distributed Processor was developed to take advantage of both natural and applied parallelism and also to take maximum advantage of the technology assumed for 1980. The organization of the Distributed Processor is shown in Figure 9 in block diagram form. It consists of a number of groups of cells all interconnected through an inter-group bus. Each cell executes macro instructions from storage or from a controller cell and can communicate with its neighboring cells. It is seen that the cells are organized into fixed sized groups, each of which can perform a computational task. This alleviates the optimization problems for program reconfiguration and also makes executive monitoring simpler. The number of cells in the group is chosen to best meet the applied and natural parallelism inherent in the Mars Lander Mission computations. Therefore a structure containing approximately 25 cells per group and having up to 25 groups was specified.

Each cell within a group, shown in Figure 10, can operate either as a controller, an operating cell, or a storage cell. The controller cell provides the global control for a group by placing macro instructions on the inter-cell bus. The operating cells receive these macros from the controller or from their own storage registers, decode them and use them to read out a sequence of operations from the microprogram storage in a cell. The sequence of instructions from the microprogram storage causes storage registers and control registers to be added, exchanged, transferred, etc. Group switches are provided which act as lock-out switches for the particular group during critical phases. During critical phases the switch is set such that any given group will only accept commands and communicate over one of the two inter-group busses. This enables isolation of failures and reconfiguration within the 5-second time constraint.

Figure 8. Natural parallelism degree of complexity vs computation speed

Figure 9. Distributed processor organization
Figure 10. Distributed processor cell

EVALUATION

The following are some of the considerations in evaluating the three computer mechanization approaches.

The Multiple Computer approach provides minimum components and communication lines and provides a reasonably good match of hardware to the requirements. Fault detection to the computer level is relatively simple; however, fault detection to a level lower than the computer is difficult to achieve. With changing computational requirements, it is also necessary to vary computational capability in terms of computer modules.

The Multiprocessor approach again gives a good match to the requirements and permits localization of failures to modules due to the full intercommunication capability. It also presents the possibility of providing spares at the module level. Down time during reconfiguration after failure is less with this approach and it is possible to withstand certain multiple failures. Expansion of this system in memory and computational speed can be done in smaller hardware increments than with the Multiple Computer approach. Some of the problem areas are that the expansion is limited and must be accounted for in the communication area during the design, and that the number of communication lines is somewhat greater than with the Multiple Computer approach.

The Distributed Processor approach provides the possibility of very high reliability and low power consumption due partly to the elimination of the main memory. It is possible to take advantage of the hardware by providing many levels of graceful degradation. Array type technology will also be taken advantage of in the Distributed Processor approach. It is possible to expand the computer in small hardware increments if the increments have been anticipated and if the packaging is adjusted. The problems with the distributed approach are that it is relatively complex to define optimum macro and micro instructions and that the programming and executive are relatively more complex.
Figure 11 shows a block diagram of the Monte Carlo approach used in this study for performing a reliability simulation. It should be noticed that the probability of fault detection was included in the Monte Carlo approach as well as the probabilities of failure for the various components of the system. Figure 12 shows a typical set of curves that resulted from the simulations for the multiple computer system. The curves show three failure rates, or MTBF's, of the computers: 8,000, 16,000 and 25,000 hours; a probability of detection of faults of 0.99 was assumed in the runs and an on/off failure rate ratio of 10 was used. Curves are also included which show the effect of having the full memory on at all times during the mission and of being able to turn off portions of the memory when not in use. It is seen that a significant increase in probability of success of the computer system is achieved by turning off memory modules when not in use. Other curves have been generated to assess the value of failure detection, ratio of on/off failure rates, and computer availability during the mission.

Figure 11. Monte Carlo simulation
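A reliability simulation of the kind described above might be sketched as follows. This is not the authors' program: it assumes exponentially distributed failure times, one powered computer with dormant spares whose failure rate is lower by the on/off ratio, perfect recognition of a spare that has already failed while dormant, and instantaneous switchover. The MTBF values, the 0.99 detection probability, the ratio of 10 and the 10,000-hour mission come from the text; every function and parameter name is hypothetical.

```python
import random


def mission_succeeds(rng, n_computers, mtbf_hours, mission_hours=10_000.0,
                     p_detect=0.99, on_off_ratio=10.0):
    """Simulate one mission with a single active computer and dormant spares."""
    t = 0.0  # mission time at which the next computer would be switched on
    for i in range(n_computers):
        if i > 0:
            # A spare has been dormant since launch; its dormant MTBF is
            # on_off_ratio times longer than its powered MTBF.
            dormant_life = rng.expovariate(1.0 / (mtbf_hours * on_off_ratio))
            if dormant_life < t:
                continue  # spare already failed while dormant; try the next
        # Powered lifetime of the computer now carrying the load.
        t += rng.expovariate(1.0 / mtbf_hours)
        if t >= mission_hours:
            return True
        if rng.random() > p_detect:
            return False  # failure went undetected, so no switchover occurs
    return False  # every computer has failed before the end of the mission


def probability_of_success(n_computers, mtbf_hours, trials=20_000, seed=1, **kw):
    """Fraction of simulated missions that reach the end with a working computer."""
    rng = random.Random(seed)
    wins = sum(mission_succeeds(rng, n_computers, mtbf_hours, **kw)
               for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    for mtbf in (8_000, 16_000, 25_000):
        for n in (1, 2, 3):
            print(mtbf, n, probability_of_success(n, mtbf))
```

For a single computer the estimate should approach exp(-T/MTBF), which gives a quick sanity check on the model. The curves in Figure 12 also reflect memory switching and other details that this sketch leaves out.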
Figure 12. Probability of success vs number of computers

REFERENCES

1. Hokom, R.: "Executive Program Control for Spaceborne Multiprocessors," Spaceborne Multiprocessing Seminar, Boston, Mass., October 1966.
2. Holland, J.: "A Universal Computer Capable of Executing an Arbitrary Number of Sub-programs Simultaneously," Proc. E.J.C.C., 1959.
3. Slotnick, D., et al.: "The Solomon Computer," Proc. F.J.C.C., 1962.
4. "Study of Spaceborne Multiprocessing, 1st and 2nd Quarterly Report," Autonetics, Anaheim, California, Sept. 1966.
FUNCTIONAL REQUIREMENTS OF SPACEBORNE COMPUTERS
ON ADVANCED MANNED MISSIONS

E. L. GRUMAN

Mr. Gruman received his B.S.E.E. degree in 1960 from the University of Maryland and his M.E.E. degree in 1962 from New York University. Mr. Gruman was with the Bell Telephone Laboratories from 1960-1963 and has been with Bellcomm, Inc., since 1963. With Bellcomm's Computer Technology Department Mr. Gruman has been engaged in on-board checkout and data transmission studies. He is a member of IEEE, Tau Beta Pi, Eta Kappa Nu, and Phi Kappa Phi.

P. S. SCHAENMAN

Mr. Schaenman received his B.S. degree in 1961 from Columbia University; another B.S. degree, also in 1961, from Queens College; an M.S. degree in 1962 from Stanford University; and an E.E. degree in 1963 from Columbia University. Mr. Schaenman has been with Bellcomm, Inc., since 1963 where he has been engaged in the study of overall data flows and spaceborne computers for manned spaceflight. He is presently supervisor of the Computer Systems Studies Group. He is also a member of Phi Beta Kappa, Tau Beta Pi, Eta Kappa Nu, and ACM.
FUNCTIONAL REQUIREMENTS OF SPACEBORNE
COMPUTERS ON ADVANCED MANNED MISSIONS

By E. L. Gruman and P. S. Schaenman
Members of the Technical Staff
Bellcomm, Inc.
Washington, D. C.

SUMMARY

This paper discusses functions which will require support from the on-board computer system during advanced manned missions. In addition to present day functions, such as guidance, navigation, attitude control, etc., the computer system would provide capability for: 1) monitoring, confidence testing, and diagnostic testing for spacecraft subsystems and experiments; 2) inflight crew training with simulations; 3) control and data management for experiments; 4) displays for flight and experiment operations; and 5) G&N of unmanned probes launched from the mother craft.

It is concluded that: 1) The functions (G&N, attitude control) which originally justified using on-board computers are no longer the pacing factors in determining many system characteristics; 2) Mission complexity will force the crew to make extensive use of computer system support; 3) The growth of computer usage in spaceborne scientific experimentation will parallel the historical surge evident in ground-based experimentation; 4) Increased functional requirements will result in a greatly increased number of I/O channels, increased high speed memory, the addition of off-line bulk storage and more powerful processing capability, regardless of the specific system configuration; and 5) The amount of on-board software required for a manned flyby mission will be large relative to manned missions heretofore.

Introduction

In the Apollo program the spacecraft computers are used for the functions of: guidance, navigation, attitude control, operation of simple displays, astronaut-computer communication, and computer-ground communications. They also run tests on themselves and the G&N system. Beyond Apollo, the increasing complexity of missions, and advances in computer technology, will undoubtedly result in a lengthening of the list of functions.

This paper discusses various functional requirements on spacecraft computer systems for advanced manned missions. A planetary flyby mission shall be used for the purpose of this discussion. Nevertheless the discussion will be applicable, to varying degrees, to long duration earth orbital, planetary landing, double flyby, and other manned missions.

Emphasis is placed on requirements which are new, i.e., not expected to be found on missions through Apollo. It is assumed that the spacecraft must be capable of entirely independent operation, regardless of whether the spacecraft or ground has prime control for the various mission operations. No attempt is made to delineate a specific computer system configuration, although certain gross system characteristics can be inferred.

EXAMPLE MISSION
The p l ane ta ry  f lyby  mission used as an exam- 
p l e  h e r e  i s  assumed t o  begin  wi th  assembly and 
checkout of a spacec ra f t  and i n j e c t i o n  veh ic l e  i n  
e a r t h  o r b i t .  The spacec ra f t  inc ludes  a l a r g e  
Manned Module (MM) i n  which t h e  a s t ronau t s  nor- 
mally ca r ry  on t h e i r  a c t i v i t i e s  dur ing  t h e  t r i p ,  
and a small Ear th  Entry Module ( E m )  f o r  t h e  f i n a l  
r e t u r n  t o  Ear th .  A f t e r  i n j e c t i o n  toward t h e  plan- 
e t ,  a few midcourse co r rec t ions  are made. I n  
transit,  experiments i n  space phys ics ,  behavior ,  
and physiology are conducted. Astronomical obser- 
va t ions  a r e  made us ing  a l a r g e  t e l e scope .  A f ew 
days before  p l ane ta ry  encounter s e v e r a l  (about 
s i x )  unmanned probes are e j e c t e d  from t h e  space- 
c r a f t  and guided toward t h e  p l ane t .  The probes 
may inc lude  o r b i t e r s ,  slow-descent atmospheric 
probes ,  and s o f t  l ande r s .  The probes communicate 
a t  h igh  d a t a  r a t e s  wi th  t h e  mother s p a c e c r a f t  f o r  
a s h o r t  t i m e  before  and a f t e r  p e r i a p s i s .  
On board the spacecraft, remote measurements of the
planet are made using various portions of the
electromagnetic spectrum. A large number of high
resolution pictures are taken using the large
telescope. Data is transmitted to Earth at rates up
to one megabit per second from injection until a few
weeks after encounter. The maximum rate then
diminishes to a low of approximately seventy kilobits
per second. This low rate lasts for about a month,
after which the maximum rate returns to one megabit
per second. There may be a period on the return leg
when the sun lies between the spacecraft and Earth;
this would cut off communications with Earth for as
long as two months. The return leg of the mission is
used to transmit data collected during planetary
encounter to Earth and to perform experiments similar
to those on the outgoing leg. Earth entry will occur
some one and one-half to two years after injection.
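The effect of the reduced link rate on data return can be illustrated with a short calculation. This is only a sketch: the 30-day length of "about a month" and the assumption of continuous transmission at the maximum rate are ours, not the authors'.

```python
# Rough data-volume comparison for the two link rates quoted in the text.
# Assumptions (not from the paper): a 30-day month and continuous
# transmission at the maximum available rate.
SECONDS_PER_DAY = 24 * 60 * 60

HIGH_RATE_BPS = 1_000_000  # one megabit per second
LOW_RATE_BPS = 70_000      # approximately seventy kilobits per second


def bits_transmitted(rate_bits_per_s, days):
    """Total bits sent at a constant rate over the given number of days."""
    return rate_bits_per_s * days * SECONDS_PER_DAY


low_month = bits_transmitted(LOW_RATE_BPS, 30)
high_month = bits_transmitted(HIGH_RATE_BPS, 30)
print(f"low-rate month:  {low_month:.3e} bits")
print(f"high-rate month: {high_month:.3e} bits")
```

Under these assumptions the low-rate month returns roughly one fourteenth of the data that a month at the full rate would return.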
REQUIREMENTS FOR FLIGHT OPERATIONS 
Monitoring and Testing On-board Systems
Three levels of testing will be required for
conducting flight operations: monitoring - an
essentially continuous check of certain system
parameters; confidence testing - a more detailed
check of system parameters before certain crucial
events; and diagnostic testing - a still more
detailed check of a system whenever it is found to
be faulty.
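The three levels can be read as a simple selection rule. The following is a minimal sketch of that rule; the names `TestLevel` and `select_test_level`, and the two boolean inputs, are hypothetical and serve only to illustrate the distinction drawn above.

```python
from enum import Enum


class TestLevel(Enum):
    MONITORING = 1   # essentially continuous check of certain parameters
    CONFIDENCE = 2   # more detailed check before certain crucial events
    DIAGNOSTIC = 3   # still more detailed check of a faulty system


def select_test_level(system_faulty, crucial_event_pending):
    """Pick the testing level implied by the current situation.

    A fault takes priority over an upcoming crucial event; this ordering
    is an assumption made for the sketch.
    """
    if system_faulty:
        return TestLevel.DIAGNOSTIC
    if crucial_event_pending:
        return TestLevel.CONFIDENCE
    return TestLevel.MONITORING
```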
Checkout at these levels is not restricted to
inflight needs but is required as well in certain
phases prior to launch. Use of an integrated testing
concept (1) in which certain prelaunch and inflight
tests are carried out in a common manner by
essentially the same automatic equipment appears
desirable, both for economic reasons and to maintain
continuity of testing. The on-board computer system
is the natural candidate for the job.
There are other arguments for using the on-board
computer system for checkout. Monitoring system
status for long periods of time is a job done poorly
by humans and, even worse, is a waste of precious
resources.(2) Confidence and diagnostic testing,
though higher order tasks than monitoring, are also
candidates for automation in order to obtain faster
testing with less chance of human error. This is
particularly true at hectic times in the mission or
if multiple failures occur.
The use of a computer for testing provides the
storage media, computational capability, and logical
capability for making comparisons and indicating
trends with both accuracy and repeatability.
Furthermore, the test points and much of the checkout
software will already exist from prelaunch
requirements. It will also be desirable to have an
automated system on board which can assist the ground
in determining the status of the spacecraft,
especially if part of the crew becomes incapacitated.
Astronauts will control the automated checkout system
via a checkout station which will have a keyboard,
displays, and communication link with the central
computer system. Ordinarily, only summaries of system
status will be presented on the displays. Upon
request, the astronaut will be presented with more
detailed information on any system. He will be able
to ask for present, former, or nominal values. He
will also be able to initiate confidence and
diagnostic tests. His overall capability will be
somewhat like that at launch system consoles during
an Apollo countdown. The MADAR (Malfunction Detection
and Recording) System, an automated inflight checkout
and maintenance system being developed for the C-5A
transport,(3) is another precursor of the type of
system envisioned.
In Apollo, the ACE (Acceptance Checkout Equipment)
spacecraft test points are automatically checked on
the ground. Only a restricted set of these points is
used on board because of the very limited use of
inflight maintenance. In contrast, an interplanetary
spacecraft will probably have all its "ACE" points
available in flight as well as preflight. Some of the
points used for diagnostics and all of those used for
monitoring and confidence testing will probably be
wired into the automated checkout system. The
rest--those of improbable use due to limited system
usage or lack of criticality--will be accessible by
being plugged into a portable interface with the
computer.
How many test points will there be? On one hand, the
increases in system size and sophistication and the
addition of new services will tend to increase the
checkout requirements over Apollo. On the other hand,
the increasing capability per unit size of electronic
devices and other system building blocks will tend to
reduce the overall number of points to be tested. The
authors speculate that for a mission such as the 70's
flyby example, one can expect a factor of two to five
increase over the number of Apollo ACE CSM test
points. This implies approximately 2000 - 4000 test
points for the MM systems. Diagnostic test points
would make up about three fourths of the total
number.
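The split implied by these figures can be worked out directly. The sketch below applies the three-fourths diagnostic share to the two ends of the 2000 - 4000 estimate; treating the remainder as the monitoring and confidence points is our reading of the text, not a figure the authors give.

```python
from fractions import Fraction

DIAGNOSTIC_SHARE = Fraction(3, 4)  # "about three fourths" of all test points


def split_test_points(total):
    """Return (diagnostic points, remaining points) for a given total."""
    diagnostic = int(total * DIAGNOSTIC_SHARE)
    return diagnostic, total - diagnostic


low_estimate = split_test_points(2000)
high_estimate = split_test_points(4000)
print(low_estimate, high_estimate)
```

On this reading, roughly 1500 to 3000 points would serve diagnostics, leaving 500 to 1000 for monitoring and confidence testing.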
Three broad classes of diagnostic approaches are
foreseen: 1) those for digital systems (automated);
2) those for analog systems (automated); and 3) those
for basic building blocks (semi-automated). Just as
they require more test points, digital equipments
generally require more complex diagnostic routines
than do analog equipments since they are generally
capable of many more operational states. Thus,
elaborate diagnostic routines are foreseen for
systems such as up and down data links, the computer
system itself, and the computer interface equipments.
Certain digital/analog hybrid subsystems will also
require rather elaborate diagnostic routines.
The totality of diagnostic programs will require a
significant amount of bulk memory space. For example,
the Apollo ACE programs use tens of thousands of
words for checkout of a lunar landing mission
spacecraft. A planetary flyby spacecraft would
require somewhat greater storage for the totality of
its checkout programs.
The discussion thus far applies primarily to an MM
where the crew normally carries on its activities.
It is recognized that an EEM would be checked out
prior to interplanetary injection and prior to Earth
landing. Its checkout equipment may be partially
self-contained, partially contained in the MM. The MM
computer system may therefore have to bear part of
the load of EEM checkout as well as its own. However,
for most of the mission, EEM checkout will not
ordinarily be of concern.
G&N, Abort Guidance, and Digital Autopilot
Guidance, navigation, and abort requirements
obviously depend heavily on the type of mission
involved. For planetary flyby missions, the G&N
computation requirements would not be substantially
higher than those required for a lunar flyby, except
in the vicinity of the planet, where probe guidance
(briefly discussed later) is necessary. There would
be opportunity for early return by abort only while
the spacecraft is significantly influenced by the
earth's gravitational field, which is less than one
percent of the mission duration. Abort G&N algorithms
would be of the same order of complexity as the abort
algorithms used for a lunar flyby.
A significant impact upon advanced computer systems
could result from the use of strap-down inertial
measurement units (IMU's), which are candidates for
planetary mission use because of reliability per unit
weight and power considerations. There are
indications that strap-down units could increase the
on-board G&N computation load by as much as a factor
of five to ten with respect to the calculations
required using a gimbaled IMU, although use of a
digital autopilot tends to lower this factor
slightly. Both gimbaled and optical platforms are
also possibilities. Either could well be used, if not
in a prime role for the entire mission, then as prime
for a particular mission phase or as backup.
Displays
Currently, the interior of a spacecraft resembles an
airplane cockpit: a profusion of dials, lights, and
switches, each with a unique function. For the most
part they are connected to sensors with little or no
information processing en route. Pilots eventually
learn to live with this display jungle, though
non-pilots are usually staggered by it. The situation
could get worse with the more numerous and complex
systems expected on advanced missions.
One source of relief would be to display less
subsystem data with the aid of the previously
discussed automated checkout system. Another approach
would be to combine information from various sensors
into integrated situation displays like those
recently developed by Army-Navy research for aircraft
use.(4) In one such display system, data is collected
from the gyros, radars, air data computer, compass,
instrument landing system, and fuel flowmeters. The
central digital computer system processes the
information and provides the outputs to run a
vertical situation display and a horizontal situation
display.
A version of the vertical situation display, made by
Kaiser Aerospace and Electronics Corporation, is
currently operational in the Grumman A-6A Intruder.
This "contact analog" display shows the command
flight path as a highway in the sky. The highway is
in proper perspective as viewed from the current
position of the aircraft. The pilot flies his command
course simply by trying to stay on the highway. Other
features include a distance scale, the aircraft
attitude in three dimensions, and symbols for a
target and a weapon release point.
In a second mode, the display projects a synthetic
3-D view of the terrain ahead of the plane, using
range, altitude and azimuth information from the
radars. The terrain is shown as ten vertical slices
at various ranges (1/4 mile ahead, 1/2 mile ahead,
etc.). Each slice shows terrain height vs. azimuth at
that range, so that the overall effect is one of
looking at a three dimensional contour map. Tests
show that pilots can follow terrain contours of the
radar display more accurately and with more
confidence than with visual references in clear
weather. The radar sharpens terrain features, and
accurately measures range which the eye only
estimates. Pilots like these displays.
The flight display puts a light incremental
computation load on the central computer, since most
of the displayed data must be calculated regardless
of the type of display used. The amount of additional
computer memory space which may be charged to this
particular display is estimated at less than 1000
words.
It seems reasonable to anticipate variations of the
above integrated situation displays which would
assist rendezvous, earth entry, attitude control, and
virtually all other piloting functions aboard a
manned spacecraft. For example, consider a manually
controlled rendezvous with another vehicle which has
an extremely unfavorable lighting background. A
display showing the vehicle in perspective, some
range markers, the desired rendezvous trajectory, and
appropriate command information would be of
considerable aid to the pilot and thereby increase
reliability in a critical situation. Other integrated
displays might be used for projecting entry corridors
as three dimensional paths, or in simulations used
for on-board crew training.
In addition to these somewhat exotic displays, there
will be more mundane CRT or electroluminescent (EL)
displays for showing such things as X-Y plots and
alphanumerics. For example, the automated checkout
system will use these displays. Others will be
associated with experiment control and data
management. The Apollo spacecraft uses numeric EL
displays for showing selected outputs from the AGC.
The role of the computer in all of these various
on-board display systems is obvious: it collects and
formats information from various sensors; stores and
fetches data; performs necessary computations; and
composes appropriate data into the various
presentations by commanding the appearance and
positioning of symbols, waveforms, and other types of
graphics. In spite of the inference one might draw
from the vertical situation display example presented
here, the load on the computer for driving displays
may vary over a very wide range.
The MTBF's of present integrated situation displays
are estimated at several days to several weeks--too
low for planetary missions. Anticipated improvement
in CRT and/or EL technology will [illegible] increase
these MTBF's. The fact remains that most of the
elements of the present display jungle have the
inherent reliability advantage of the simple over the
complex and of not putting all the eggs in one
basket. Until the more sophisticated displays are
proven reliable, they will probably be backed up by a
reduced set of "simple" displays.
Astronaut-Computer Communications 
The Apollo astronauts communicate with the Apollo
Guidance Computer (AGC) using a keyboard with buttons
for: numbers 0 - 9, +, -, VERB, NOUN, CLEAR, STANDBY,
KEY RELEASE, ENTER, and RESET. The astronauts consult
a code book and punch in digits representing verbs,
nouns, or data. The verbs are simple commands to the
computer, such as "display (noun)", "monitor (noun)",
"load (noun)". The nouns are various parameters such
as velocities, angles, rates, positions, time, etc.
The computer communicates with the astronauts via a
simple numeric EL display and a set of status lights.
Almost the entire computer-astronaut dialogue centers
around guidance and navigation.
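The code-book scheme described above amounts to a pair of lookup tables. The sketch below shows the idea only; the numeric codes and the function name `decode` are hypothetical and are not the actual AGC verb and noun assignments.

```python
# Hypothetical code book. These numbers are illustrative placeholders,
# NOT the real Apollo Guidance Computer verb/noun codes.
VERBS = {1: "display", 2: "monitor", 3: "load"}
NOUNS = {10: "velocity", 11: "angle", 12: "time"}


def decode(verb_code, noun_code):
    """Translate a keyed-in VERB/NOUN digit pair into a readable command."""
    if verb_code not in VERBS:
        raise KeyError(f"unknown verb code {verb_code}")
    if noun_code not in NOUNS:
        raise KeyError(f"unknown noun code {noun_code}")
    return f"{VERBS[verb_code]} {NOUNS[noun_code]}"


print(decode(2, 10))
```

The burden on the operator is visible even in this toy form: every command requires recalling or looking up two numeric codes.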
The expanded set of computer functions on advanced
missions will require more frequent, more inclusive,
and more sophisticated man-machine communications
than in Apollo. A premium will be placed on speed and
accuracy of communications and on minimizing the
tremendous learning burden of the astronauts. The
improved displays suggested in the preceding section
will be one source of aid. Another will be higher
level input languages. Research is needed to define
either a general spacecraft command and control
language or a group of function-oriented languages
for the astronaut-computer dialogue. Reasonably
detailed diagnostics of input programs should also be
provided. A high level language package plus
increased diagnostic capability imply the
availability of additional high speed memory space.
Another improvement will be to enlarge the keyboard
and have individual keys for frequently used words,
with less frequently used words inserted via a
general set of alphanumeric keys. The choice of words
to be considered "most frequent" might be left to the
user and allowed to vary from person to person and
station to station. All commands could still be
initiated from any keyboard.
Optional hard copy of inputs or outputs would be
another desirable feature. Also, punched or magnetic
cards might be used for storing frequently used
programs. The card would be inserted into a computer
input device, similar to card dialers used with
telephones, in lieu of punching buttons whenever the
program is desired. Voice input devices might also
become feasible for some portion of the input
repertoire.
Inflight Crew Training
"During a several month interplanetary voyage, crew
members could lose some of the skill they have
developed in such maneuvers as earth atmosphere
reentry."(5)
Alan B. Shepard
February 19, 1964
Planetary flybys will put at least one to two years
between the time an astronaut last flew (or practiced
in a full scale simulator) an earth entry and the
time he must do so again. The intervening time plus
the physiological and psychological demands of the
mission will tend to degrade his ability to perform
the task. Various other on-board tasks will have
smaller but still significant intervals between
practices. Examples are planetary photography at
close range, targeting and guidance of probes, and
on-board planetary encounter experiments.
The astronauts must somehow maintain their skills in
these tasks, either by live practice runs or
simulations. Ideally, they should use the same
controls, displays and systems for practicing as in
actual operational usage. However, there is a school
of thought that one should not take flight-critical
controls and displays off line for simulations during
a mission. Moreover, introducing simulation modes
(with switches and additional input and output paths)
may lower system reliability.
We therefore anticipate the existence of an on-board
training, simulation and behavioral research station.
It would have displays and controls which can assume
different configurations to simulate different crew
stations. It would also use the computer system for
controlling real time simulations, storing norms,
simulating certain systems, evaluating results, and
compiling subject profiles. The facility would have
the following uses:
1. Training. This is required to maintain crew
   skills which are not frequently used.
2. Crew reassignments. Each astronaut will be a
   specialist in some areas and cross-trained in
   others. At some point in the mission, perhaps due
   to the incapacitation of some other crew member,
   it may be desirable or necessary to reassign an
   astronaut from his original specialty to another.
   The equipment and software used for "routine"
   training might suffice, though some additional
   "teaching machine" capability may also prove to be
   desirable.
3. Checkout of new procedures. These may be
   established by the flight crew or ground during
   the course of the mission. [The mission will be of
   such duration that even state-of-the-art advances
   are possible.] This function requires the ability
   to insert large new programs into the computer
   from ground or smaller ones from on-board.
4. Behavioral research. In addition to the biomedical
   monitoring of the astronauts, certain behavioral
   studies will take place. These will consist of
   tests of reaction-time, decision-making and
   problem solving. The results of these tests will
   be correlated with biomedical data to indicate the
   "condition" of the astronauts at various points in
   the mission. Since these behavioral studies will
   require the use of displays and controls, it
   should be possible to use the facility for
   behavioral experiments as well as for simulation
   and training.
The simulations used in conjunction with this
facility would be major users of computational time
when running. They could be among the largest
programs on board. The Apollo Mission Simulator and
LM Mission Simulator programs each run greater than
100K words. Though not necessarily representative,
they indicate how large these simulations can become.
EXPERIMENT REQUIREMENTS
On-board Experiments
"Nuclear instrumentation is undergoing revolutionary
changes because of [the] rapidly increasing use of
stored-program computers by experimentalists in
nuclear-structure laboratories."
John V. Kane
"In the high energy physics laboratory the most
remarkable development that has occurred in the last
five years has been the introduction of the digital
computer as an active part of the experimental
apparatus."
George W. Tautfest
Both of these quotes were taken from the July, 1966
issue of "Physics Today" and indicate the effect
computers have had on ground-based experimentation.
It is likely that spaceborne computers will have a
similar effect on space experimentation within the
next decade when one considers that they have been
virtually unused thus far and that the number and
complexity of experiments are increasing. For
example, the particles experiment on Explorer I
measured omnidirectional intensity of particles of
any type. On OGO-E, the particles experiment will
measure directional characteristics and intensity as
a function of particle energy and type. Several uses
of computers in on-board experimentation are
suggested in the following sections.
Experiment Checkout and Calibration -- It is
estimated that there will be about 30-40 major pieces
of experimental equipment on board a flyby spacecraft
in addition to about forty carried in the unmanned
scientific probes. There would also be a large (40")
telescope with its own attitude control, photographic
and TV systems.
About one third of the on-board experiments and the
telescope system must be monitored and occasionally
tested or calibrated throughout the mission, much in
the manner of spacecraft systems. Another third of
the on-board experiments, the forty experiments
carried in the probes, the various subsystems of the
probes, and the flyby photography and TV systems must
all be tested, and calibrated if necessary, shortly
before planetary encounter. This may involve on the
order of one to two thousand test points.
The checkout and calibration tasks should be
automated to the fullest extent possible for reasons
similar to those given for automating spacecraft
systems checkout (obtain speed and accuracy, reduce
crew workload, avoid human error, etc.). The
precision possible with a computer system will be of
even more importance for these tasks than for testing
of spacecraft systems because of the generally
greater precision required by scientific measurements
compared to operational engineering measurements (G&N
systems excepted).
As in testing, the use of the computer system for
calibration will permit the use of complicated or
exhaustive schemes which might not otherwise be used.
This, plus the accuracy and repeatability of the
computer, the ability to record steps in the
calibration process, and the presence of man, will
result in greater confidence in the calibration of
the experiments--an important advantage over present
experimentation.
Experiment Control -- At various times in the mission
experiments must be turned on, have their sensors
exposed, be run through a warm-up sequence, be
coordinated with other experiments, undergo cyclic
changing of operational modes, etc. Although most of
these functions could be implemented with simple
programmers, they are candidates for computer control
in order to allow flexibility in flight. Building a
programmer with enough flexibility to arbitrarily
change the timing and sequencing of experiments may
be less desirable than building a wired interface
with the computer system, using a modest amount of
computer time and memory space, and keeping the
flexibility in the software.
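Keeping the flexibility in software can be pictured as a table of timed steps that the computer consults and that can be edited in flight. The following sketch is illustrative only; the step names, the timings, and the helper functions `due_actions` and `retime` are hypothetical and do not come from the paper.

```python
# Each step is (time offset in seconds, action name). Names and timings
# are hypothetical placeholders chosen for illustration.
sequence = [
    (0, "power_on"),
    (60, "expose_sensor"),
    (120, "start_warm_up"),
    (600, "begin_data_mode"),
]


def due_actions(steps, elapsed_s, already_done):
    """Return actions whose time has come and that have not yet been run."""
    return [name for t, name in steps
            if t <= elapsed_s and name not in already_done]


def retime(steps, name, new_time_s):
    """Change one step's timing and re-sort so the table stays in time order."""
    updated = [(new_time_s if n == name else t, n) for t, n in steps]
    return sorted(updated)
```

Changing the timing of an experiment is then a matter of editing one table entry, for example `retime(sequence, "begin_data_mode", 300)`, rather than rewiring a dedicated programmer.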
A more complex type of control than sequencing may be
needed for experiments such as patrolling for solar
flares. In this experiment, the sun will be monitored
for about half of the planetary mission using
telescopes, X-ray, UV, visible and other
electromagnetic sensors, cosmic ray and solar proton
sensors, etc. While the crew may occasionally or even
regularly monitor the sensors, it does not seem
reasonable to spend a man-year for this purpose,
since the flares occur only about 0 - 20 times per
year. Nor can one rely on earth to warn that a flare
is occurring. The visible portion of a flare lasts
from several minutes to one hour, whereas the
communication time for transmitting the warning from
earth to spacecraft will be in the order of 0 - 30
minutes; the flare or its beginning might be missed
entirely.
It would be economical in film, bit storage, and
man-hours to have a system which samples the sensors
at a low rate until something unusual occurs, then
alerts the astronauts, increases the sampling rates
and photographic repetition rates, and brings on-line
any sensors which may have been inactive. The
computer system would be used to discover the
"something unusual," perhaps using pattern
recognition techniques to determine an unusually
bright area on a TV picture of the sun. The computer
system would command the initial response. This
overall scheme represents one form of data compaction
by computers.
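The patrol scheme just described is essentially a two-state controller. The sketch below stands in a simple brightness threshold for the pattern recognition step; the rates, the threshold, the 0-255 pixel scale, and all names are assumptions made for illustration, not values from the paper.

```python
LOW_RATE_HZ = 0.1           # assumed patrol sampling rate
HIGH_RATE_HZ = 10.0         # assumed sampling rate after detection
BRIGHTNESS_THRESHOLD = 200  # assumed cutoff on a 0-255 pixel scale


def is_unusual(frame):
    """Flag a frame (list of pixel rows) if any pixel exceeds the threshold."""
    return max(max(row) for row in frame) > BRIGHTNESS_THRESHOLD


def patrol_step(frame, state):
    """Process one TV frame and return the (possibly updated) system state."""
    if state["rate_hz"] == LOW_RATE_HZ and is_unusual(frame):
        # Initial response commanded by the computer: alert the crew,
        # raise the sampling rate, and bring inactive sensors on-line.
        return {"rate_hz": HIGH_RATE_HZ, "alert": True,
                "inactive_sensors_online": True}
    return state


quiet_state = {"rate_hz": LOW_RATE_HZ, "alert": False,
               "inactive_sensors_online": False}
```

Data is recorded sparsely while the state stays quiet and densely only after a detection, which is the sense in which the scheme compacts data.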
Several experiments, including the above, will
require accurate pointing and holding to targets on
various heavenly bodies. Calculations for these
functions may require data from the on-board
autopilot and G&N system, the telescope attitude
control system and ephemeris tables.
Experiment Data Management and Displays -- The
astronauts themselves will have an important role in
the checkout, calibration and routine control of the
experiments. In addition, they should have some
capability to take advantage of discoveries, explore
anomalies, redirect experiments in case of failures
or mistakes, and select data for transmission to the
ground.
To perform these tasks the astronauts must be able to sample, in real or
near-real time, data from any of the experiments being conducted. Though
this may not always be possible, it should serve as a goal. They must
then be able to process the data with the aid of the computer system if
necessary. Processing may involve curve-fitting, computation of
statistics, statistical filtering, correlating data from several
experiments, solving systems of equations, and an extremely wide range
of other possibilities.
Both raw and processed data should be capable of being displayed in a
variety of alphanumeric and graphic formats. Display options should
include symbols, histograms, X-Y plots, waveforms, and scatter diagrams.
The astronauts should also be able to display data from several related
experiments simultaneously and to request a priori expected results to
be displayed alongside actual results.
Probes 
In our example mission, the crew of the flyby spacecraft must check out
and count down the approximately six scientific probes about one week
prior to encounter, using the computer system for automated testing and
sequencing. About three of the probes will be of the complexity of Lunar
Orbiter or Surveyor; the other three will be relatively simple
atmospheric probes. All would be launched within a period of a few days.
After injection toward the planet, the probes will be tracked from the
spacecraft by radar and optical techniques. Before landing (or going
into orbit), the probes receive one or two midcourse corrections from
the spacecraft. The corrections are based on the continually improving
knowledge of the spacecraft's trajectory relative to the planet. The
trajectory improvement is based on inputs from the on-board sextant and
telescopes and from earth-based continuous tracking.
Thus, the spacecraft must act as a spaceborne tracking and flight
control facility while at the same time navigating for itself. It has
been estimated that the spaceborne computation load for probe guidance
will be several times that of the Apollo LM descent guidance, the most
demanding of the Apollo guidance programs.
It is not clear to what extent guidance computations for the various
probes would overlap in time with one another and with other tasks. It
is clear, however, that the probe guidance tasks will be of extreme
importance and will be demanding attention at one of the busiest, most
critical times in the mission.
RELIABILITY REQUIREMENTS 
The heavy dependence upon the computer system suggested in this paper
presumes extremely reliable hardware and software. In Apollo, the
guidance computer is required to have a certain MTBF. A computer system
for interplanetary missions will also have MTBF requirements, but in
addition will be required to absorb (at various levels) certain
malfunctions without hindering performance, and certain other
malfunctions without complete loss of performance. Considerable study is
being given to hardware reliability, and techniques are rapidly becoming
available to meet these enhanced requirements on hardware.
The AGC software is required to produce appropriate outputs under
allowable input conditions. This general software "quality" requirement,
already difficult to achieve, will tend to become even more elusive in
future missions. The housekeeping functions introduced by redundancy and
switchable configurations as well as increased I/O will significantly
complicate the package of programs relative to the AGC software.
In studying the software quality control problem of Project Apollo, it
was concluded that the use of strong management procedures including
tightly controlled documentation was the most valuable approach
available (7). Producing detailed software documentation has a most
desirable by-product, that of forcing the thinking out of program
possibilities. This tends to eliminate problems due to logical
inconsistencies. It appears that strong management control will remain a
valuable approach for interplanetary missions. The general problem of
software reliability is ripe for new approaches.
CONCLUSION 
The following inferences can be drawn from the above discussion:

1. The functions which originally justified bringing a computer on
   board a space vehicle (G&N, attitude control) will no longer be the
   prime factors in determining characteristics of the computer system.
   The new functions, automated checkout, flight crew training,
   expanded displays, and on-board experiment control and data
   processing, will force the future on-board computer system to have
   the capability to efficiently perform character manipulations as
   well as mathematical calculations.
2. The complexity of spacecraft flight operations and scientific
   activities will alter previous mission management concepts. The crew
   will make extensive use of the automated system centered around the
   computers. This will require highly efficient astronaut/computer
   communications. More sophisticated displays and input languages than
   those being used in Apollo will be required to accomplish this.
3. The growth of computer usage in on-board experimentation will
   parallel the rapidly expanding usage in ground-based
   experimentation. This trend will impose significant requirements on
   both the memory and processing capability of future on-board
   computer systems.
4. It is estimated that the increased functional requirements will
   result in a greatly increased number of I/O channels, an increase in
   high speed memory of an order of magnitude, the addition of off-line
   bulk storage memory capability, and a more powerful processing
   capability than is presently available in Apollo. These
   characteristics will require either a very fast central processor or
   a multiprocessor system.
There are various functions in addition to those discussed in this paper
which may be candidates for use of the on-board computer system. For
example:

- Housekeeping, such as automatic balancing of solar heat loads by
  attitude control.
- Communications management, such as routing messages between
  spacecraft systems, experiments, probes and ground, and pointing
  spacecraft antennas.
- Medical diagnosis.
- Data reduction and compression, in addition to controlling experiment
  sampling rates as in solar patrols. (This is not foreseen as a vital
  function because of the expected 10^6 bits/sec. transmission
  capability on board.)

As the old saying goes, further study is needed.
ACKNOWLEDGMENT 
The authors wish to thank the many members of the technical staff of
Bellcomm who participated in discussions on this paper.
REFERENCES 
(1) Greene, D. W. and Wood, E. C.: On-board Checkout System Concept.
    Stepping Stones to Mars, AIAA/AAS Volume of Technical Papers,
    March, 1966, pp. 263-268.
(2) Chase, W. P.: Integrating Crew Performance into Space Vehicle
    System Design for Optimum Reliability. Manned Space Reliability
    Symposium, AAS, June 9, 1964, p. 46.
(3) Stambler, I.: The Big New Transports. Space and Aeronautics,
    Vol. 46, No. 3, 1966, pp. 54-62.
(4) Evanzia, W. J.: "A View from the Cockpit". Electronics,
    August 22, 1966, pp. 145-148.
(5) Shepard, Alan B.: "Training by Simulation". Smithsonian
    Institution, Edwin A. Link Lecture, First, Washington, D. C.,
    February 19, 1964. Published in International Aerospace Abstracts.
(6) Bostrom, C. O., and Ludwig, G. H.: "Instrumentation for Space
    Physics". Physics Today, Vol. 19, No. 7, July, 1966, pp. 43-56.
(7) Liebowitz, B. H., Parker, E. B. III, and Sherrerd, C. S.:
    Procedures for Management Control of Computer Programming in
    Apollo. TR 66-320-2, Bellcomm, Inc., September 28, 1966.
DESIGN CRITERIA FOR A SPACECRAFT COMPUTER
DR. RAMON L. ALONSO
Dr. Alonso is Assistant Director of the Instrumentation Laboratory and
Lecturer in the Aeronautics and Astronautics Department at Massachusetts
Institute of Technology. He is a designer of digital systems used in
guidance and navigation equipment; his present concern is the design of
compact memories for airborne computers, and mechanization of logical
and electrical design procedures.

Dr. Alonso was born in Buenos Aires, Argentina. He came to the United
States in 1947 and attended Harvard University where he received his
A. B. degree in Physics in 1951, his M. S. degree in Mechanical
Engineering in 1952, and his Ph. D. degree in Applied Mathematics in
1957. He was at Bell Telephone Laboratories during 1952 and 1953.
Dr. Alonso joined the Digital Development Group of M.I.T.'s
Instrumentation Laboratory in 1957 and was appointed Lecturer in 1964.
N67-17104
DESIGN CRITERIA FOR A
SPACECRAFT COMPUTER
By Ramon L. Alonso
Albert L. Hopkins, Jr.
Herbert A. Thaler
M.I.T. Instrumentation Laboratory
Cambridge, Massachusetts
SUMMARY
Extrapolation of Apollo experience to spacecraft computers of the next
generation indicates a need for digital systems of greater computing and
interface activity, and of greater reliability, than has been realized
to date.
An idealized collaborative multiprocessor structure in which a number of
processing elements are tied together by means of a single multiplexed
data bus is explored. At least one job assignment procedure is possible
for which no one processor has to act as 'master', and which can survive
processor malfunctions or the deletion or addition of processors to the
bus, thus accomplishing 'graceful degradation' and 'reconfiguration' of
sorts. The single bus structure as used here implies things about
compilers for it, and also certain bandwidth relationships between
processors, bus and common memory. Rough estimates based on short
extrapolations of circuit technology show that the structure is probably
realistic.
DESIGN TRENDS
Introduction
In manned spacecraft to date, more uses have been identified for
on-board data processing than could be provided by the computers
therein. Computer designers are inclined to anticipate this sort of
problem by their natural tendency to supply greater performance than the
application seems to require, but have been inhibited in the spacecraft
area by apparently inelastic size, power and reliability constraints.
These constraints are relaxed when it is discovered that mission success
is imperiled by lack of adequate computer performance. This very likely
arises at a time which is too late to reconfigure the computer within
the mission schedule. Instead, mission objectives are apt to be
restricted and a large software effort is mounted to prepare and verify
programs which squeeze out maximum performance. A lesson for the next
spacecraft generation is that graceful expandability should be a
fundamental requirement for the data processor and other systems. This
can result in the ability to profit from lessons learned in the
development phases of a mission by reconfiguring the on-board systems
with a minimum of impact upon the spacecraft.
In this paper, we review some general requirements for the next
spacecraft computer generation and the forecast for hardware available
in the coming years. In the absence of the development of a suitable
self-organizing automaton, the multiprocessor structure appears to be
best suited to both the requirements and the hardware available. We
describe an idealized multiprocessor organization and examine its
performance in terms of the performance of its components.
Multiprocessors
Extrapolating the Apollo mission to a planetary mission has many
pitfalls, as entirely new problems and solutions are involved. From the
computer's point of view however, the requirements can be expressed
independent of many of the attributes of the total spacecraft. Size and
power constraints should not be expected to be much different from what
they are today. However, reliability over a period of several years adds
a new dimension to the problem, for in a system of perhaps millions of
solid-state electronic elements, it must be assumed that several,
perhaps many, will become inoperative either due to poor quality or to
severity of environment. What is needed is a system whose performance
will not be reduced below the minimum required for survival of the
spacecraft, unless failures of calamitous proportions occur. A new
concept of graceful degradation has arisen to supplement the old notion
of redundancy in which elements may fail, but the circuits which contain
them continue to function with no degradation. If more elements fail
than the redundancy can cope with, the circuit will fail, and with it,
the system. Graceful degradation implies an organization in which
circuit failure reduces, but does not suppress, the machine's
throughput. The brain has this characteristic, but neuron-based automata
have not yet exhibited promise for miniature control computer
applications.
In a multiprocessor organization, graceful degradation and graceful
expansion are related properties, both made possible by the independence
of the constituent functional units: processors and memories. A
multiprocessor is more complex and expensive than a like sized array of
independent computers. Its value is greater, for its performance depends
on the number of units functioning at any time. To increase the power of
the machine, processors and memories can be added without affecting
parts previously present and, at least equally important, without
affecting existing programs. Each processor may be made as powerful as
the technology allows, but in the face of the reliability problem, it
appears more desirable to build simple, reliable processors in greater
quantity so as to minimize the impact of a single processor's loss.
The multiprocessor structure is compatible with several of the
requirements of the spacecraft application beside that of reliability.
For one thing, communication between the multiprocessor and all other
spacecraft systems can be handled in the same fashion as communication
among the processors, thus affording a unified treatment of the problem
of input-output involving perhaps hundreds of external functions. In a
time-multiplexed serial
transmission structure, for example, a new system can be added to the
multiprocessor's interface with virtually no more changes beside the
addition of access lines for the new system to the coaxial cable (or
waveguide) run. Today, multiwire cable and connector problems probably
constitute half the battle in making spacecraft systems work.
Another example of the multiprocessor's well-suitedness is the natural
division of many spacecraft data processing tasks into short jobs of
fractional second duration. This is a result of the multiplicity of
independent programs serving the many systems involved, and also of the
sampled nature of control computations. Each program typically has a low
duty cycle, requiring brief service several times per second. Each
instance of service can be treated as a separate job to be handled by
any available and competent processor. In the Apollo spacecraft,
repetition rates for jobs vary from a few tens per second down, with no
more than eight jobs running at a time. In future we can expect on the
order of a hundred programs running at once and tens or hundreds of
samples per second per program.
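
As a rough check on the load these figures imply, the short calculation
below multiplies the number of programs by the sampling rate. The text
gives only orders of magnitude ("on the order of a hundred" programs,
"tens or hundreds" of samples per second); treating 100, 10 and 100 as
exact values is an assumption made for illustration, as are the function
names job_rate and mean_job_budget and the processor counts used.

```python
def job_rate(programs, samples_per_second):
    """Job instances per second if each sample of each program is one job."""
    return programs * samples_per_second


def mean_job_budget(job_instances_per_second, processors):
    """Average processor time available per job instance, in seconds.

    Assumes the load is spread evenly over identical processors
    (a simplification, not a claim from the paper).
    """
    return processors / job_instances_per_second


if __name__ == "__main__":
    low = job_rate(100, 10)     # 100 programs at 10 samples/sec each
    high = job_rate(100, 100)   # 100 programs at 100 samples/sec each
    print(low, high)
    print(mean_job_budget(low, 1), mean_job_budget(high, 4))
```

Under these assumed values the multiprocessor would see between one
thousand and ten thousand job instances per second, which is why each
job must be short.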
Hardware 
Regardless of what organization may be used, increased performance
without increased size can be obtained only with smaller and/or faster
components. Size is the key to speed by virtue of the finite velocity of
information transmission and of the power (hence size) of an element
which drives a long (hence reactive) line. The first effect moreover
requires characteristic impedance termination to avoid reflections,
which further aggravates the power problem. Efforts to shrink components
are hampered by the difficulty of interconnecting components reliably in
a small volume with adequate yield.
An area in which great progress is being made, with promise of
improvement, is in the creation and interconnection of large numbers of
semiconductor elements on a single wafer. Within a wafer, signals can be
transmitted at a higher rate than from wafer to wafer. Likewise, the
propagation delay of an element no larger than required to drive an
internal interconnection will be less than that of an element large
enough to drive an external line. The designer is challenged by this
technology to organize his equipment into local high speed areas,
interconnected by as few lines as possible. How to do it depends on the
number of elements per wafer that can be realized. If it is hundreds,
then we think in terms of arithmetic and error detection circuits,
multiplexers, digital-analog converters, sequence generators, scalers,
and small scratch pad memories. If it is thousands, then small
processors, medium sized scratch pad memories and small associative
memories could be made. If tens of thousands or more, possibilities of
rather elegant processors come to mind.
In any event, logic is becoming inexpensive, indeed virtually
expendable, to a point where using wire, cable and connections to save
it is uneconomical. Thus it is anticipated that all spacecraft systems
will have local digital circuitry for encoding, decoding, and
multiplexing information for transmission in a common language to the
computer and elsewhere. One of the outstanding jobs of the computer
designer is to coordinate with the manufacturers of large integrated
semiconductor circuits to best exploit this new technology.
Memory will be of several types to serve the various functions of
scratch pad, data storage, and program storage, either in a common area
or associated with a given processor, or both. Enough separate memories
with separate driving circuits must be supplied to meet the graceful
degradation criterion, and enough words must be supplied in each memory
to do the job. Scratch pad memories might be from 2' to 2 3 words;
common erasable storage will perhaps require 50 words per program, or
more than ten thousand words in all. Program memory would have on the
order of a thousand words per program, hence hundreds of thousands of
words in all. All three sizes are an order of magnitude beyond Apollo
without even considering the additional cost of redundancy. In the light
of the growth of computer sizes and requirements in the last ten years
these estimates may be somewhat conservative.
IDEALIZED MULTIPROCESSOR STRUCTURE 
System Structure
As a model upon which to base our size and performance estimates we use
an organization which is simple, yet which contains the elements of a
general class of multiprocessors. Starting with a group of processing
elements (roughly computers) each of which has its own program and
scratch pad data memories, we create a combination in which there is no
one supervisory element or processor, but which is truly collaborative.
The first item needed in addition to the processors is an infallible
data distributor by which information is transferred among units. A
simple form of distributor is a time-multiplexed bus. Every unit having
access to this bus can receive all data which appears thereon. Every
such unit can also transmit data upon the bus by means of a multiplexer
circuit, associated with the unit, which emits the data at an
appropriate time. The problem of scheduling time is handled by making
each multiplexer enable the next in line as soon as it is through
sending data. The next multiplexer will then send its data unless it has
nothing to send, in which case it will skip the enable on to the
following multiplexer (see Figure 1).
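
The enable-passing discipline described above can be sketched as a
simple round-robin loop. This is a software illustration under stated
assumptions, not the hardware mechanism: the names Unit and
run_bus_round are hypothetical, and a unit holding the enable is assumed
to send everything it has queued before passing the enable on.

```python
from collections import deque


class Unit:
    """A bus unit with its own multiplexer and outgoing message queue."""

    def __init__(self, name):
        self.name = name
        self.outbox = deque()   # messages waiting to be sent
        self.heard = []         # every message seen on the bus

    def queue(self, message):
        self.outbox.append(message)


def run_bus_round(units, start=0):
    """Pass the enable once around the ring, starting at units[start].

    A unit holding the enable sends its queued data and then enables the
    next unit; a unit with nothing to send skips the enable straight on.
    Returns the messages in the order they appeared on the bus.
    """
    traffic = []
    n = len(units)
    for step in range(n):
        sender = units[(start + step) % n]
        while sender.outbox:
            message = (sender.name, sender.outbox.popleft())
            traffic.append(message)
            for unit in units:      # every unit receives all bus data
                unit.heard.append(message)
    return traffic


if __name__ == "__main__":
    p1, p2, p3 = Unit("P1"), Unit("P2"), Unit("P3")
    p1.queue("x")
    p3.queue("y")
    print(run_bus_round([p1, p2, p3]))
```

No unit acts as a scheduler here; the order of transmission follows only
from the position of each unit in the ring.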
The next item needed is a common erasable memory in which to store data
needed to start jobs. This memory must either be infallible, or else
have graceful degradation properties of a sort which will be left
unexplored in this paper. The memory has access to the bus as do the
other units of the multiprocessor. It is interrogated by means of a
message sent from a processor specifying its own identity and that the
contents of memory address k is desired. Upon receipt of this message,
the memory places it in a waiting stack. When its turn comes, the
message causes a memory cycle to be executed, and both address and
content to be delivered to another waiting stack for transmission on the
bus. The requesting processor will recognize its answer as it appears on
the bus.

Figure 1. Collaborative multiprocessor model.
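
The interrogation sequence for the common erasable memory can be
sketched as two queues around a memory cycle. The class and method names
below are hypothetical. Serving requests first-in first-out and tagging
each answer with the requester's identity are assumptions; the text says
only that the address and content are delivered to the second waiting
stack.

```python
from collections import deque


class CommonMemory:
    """Common erasable memory served through two waiting stacks."""

    def __init__(self, contents):
        self.contents = dict(contents)   # address -> stored word
        self.requests = deque()          # waiting stack of incoming requests
        self.replies = deque()           # waiting stack of answers for the bus

    def receive(self, processor_id, address):
        """Accept a request naming the requester and the address k."""
        self.requests.append((processor_id, address))

    def cycle(self):
        """Execute one memory cycle for the oldest waiting request, if any."""
        if self.requests:
            processor_id, address = self.requests.popleft()
            word = self.contents[address]   # raises KeyError if unknown
            self.replies.append((processor_id, address, word))

    def transmit(self):
        """Return the oldest answer for the bus, or None if none is waiting."""
        return self.replies.popleft() if self.replies else None


if __name__ == "__main__":
    memory = CommonMemory({7: 42})
    memory.receive("P1", 7)
    memory.cycle()
    print(memory.transmit())
```

Because the answer carries the address (and, in this sketch, the
requester's identity), the requesting processor can pick it out of the
other traffic on the bus.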
The last item in the multiprocessor is an input-output buffer unit,
capable of relaying messages between multiprocessor units and external
system data terminals. Although it is possible in principle simply to
extend the multiprocessor bus out to the external units, it is probably
preferable to accommodate the external data transfers on a separate bus
system. This not only isolates the multiprocessor from its environment
for conceptual analysis, but as a practical matter permits the use of
different sequencing techniques for the mutually distant remote
multiplexers than for the internal, closely packaged ones. Except for
this, the remote systems may be considered to be specialized processors,
and treated accordingly in the analysis.
Processing Element Properties
The processing elements P are thought of as small general purpose
computers with a number of features not normally presumed in connection
with processing elements. These are:
Program Storage -- Each processor has its own copy of all programs. The
programs are written as pure procedure. This redundant program storage
can be dispensed with by having one or several memories which the
various processors can interrogate, but it simplifies discussion to have
it. In particular, each processor has a list of jobs it can undertake,
plus any additional information required by each job, such as starting
address, data locations, etc.
Message Transmitter and Receiver -- A processor is connected to the data
bus multiplexer by way of a transmitter and receiver section. This
section may have a job request stack, as discussed below, and does have
means for discriminating among or originating various messages, such as
common memory transfers, job requests, job acceptances (see below). An
important property is that this section be "infallible", meaning as
reliable as we can make it; more to the point, it cannot fail in such a
way as to disable the data bus.
Self Error Detection -- Each processor must be capable of diagnosis at
least to the extent of detecting any errors within itself. The result of
an error in a processor must be a special job request message put on the
data bus so as to have each processor inform all others when it
malfunctions or when it becomes inactive (e.g., power failure); this is
the reason for requiring an "infallible" message transmitter and
receiver. Error detection need not be instantaneous; it is probably
sufficient to detect errors within a job execution interval and not
issue false job results. The detection of certain kinds of errors such
as inactivity, or programs becoming "lost", requires either a certain
minimum time or else an uneconomical amount of equipment. The area of
error detection and/or correction may be one of the more difficult ones
in multiprocessor element design.
In addition each processing element has a scratch pad storage, an
arithmetic unit and rudimentary interrupt system which will enable
single memory cycles out of sequence. The latter should permit a check
within several memory cycles to see if a job requested is available in
this processor's repertoire of procedures.
OPERATION 
Job Assignment
A view of the detailed process of job assignment is
important in ascertaining if the single data bus structure
is either possible or desirable, and if graceful degradation
will occur.
Definitions
P             Processor
P1, or Pi     A specific processor
J             Job
J1, or Ji     A specific job
Y             Priority
Y1, Yi, Yi+1  A specific priority
T             Time
R(J,Y,T)      Job request message
A(J,P)        Job acceptance message
E(J,P)        End of job message

The general job assignment can be as follows:
1.  R(J,Y,T) appears on the data bus, issued by either a
    processor or an input-output unit. This is a request to
    do job J, which has priority Y, and to do it at time T.
    The time at which the job is to be done can be 'now', or
    'as soon as possible', or some specified time in the
    future.
2.  Each P capable of doing J records R, whether busy or
    not, in a stack with certain associative properties. The
    messages may be retrieved by keying on J, on T, or on
    the maximum value of Y.
    Processors are either free or not. If not, they are
    doing a job J of a certain priority Y.
3a. Suppose J=Ji; when R(Ji,Y,T) appears on the bus the free
    processors P1, P2 ... Pj each compose a response message
    A(Ji,P1), A(Ji,P2), ... A(Ji,Pj). Some one of the free
    processors will have first turn at the data bus (because
    the bus is time multiplexed) and will issue an
    A-message. All P, free or not, then elide R(Ji,Y,T) and
    also any redundant A(Ji,P) they may have prepared, and
    which is waiting its P's turn on the bus. After A is
    issued by a processor P, that P must bring all pertinent
    information about Ji from the common memory into itself.
3b. If there had been no free P, then R(Ji,Y,T) would remain
    outstanding in all P. All those P doing jobs with lower
    priority than that of the job requested also prepare
    response messages A(Ji,P). Again, some processor will be
    first to issue A(Ji,P) because of the bus multiplexing,
    and all other R(Ji,Y,T) and A(Ji,P) are annihilated.
    The P that undertakes a new J2 of priority Y2 higher
    than the priority Y1 of J1 has a choice: it may take on
    the new job J2 while keeping all the information about
    J1 within itself, if it knows J2 to be short (such
    information can be part of the job name itself, or of
    its priority measure). Or, if J2 is not short, P must,
    after issuing A(J2,P) but before actually doing any
    work, transfer all pertinent information used by and
    about J1 to the common memory and issue R(J1,Y1,T). In
    this way another P can undertake J1. Common practice is
    to program jobs with "bump points", which minimize the
    information that must be sent to or brought from common
    memory in the event of interruption. The value of
    knowing when J2 is short enough to allow the same P that
    was doing J1 to resume J1 after doing J2 is in the
    saving of common memory transfers.
4.  The end of a job, or the interruption of a job, also
    requires a message E(J,P).
5.  Each A(J,P) issued is recorded in common memory, and
    annihilated by the subsequent E(J,P) with matching J. In
    this way there is at all times a record of which J are
    being executed and by which P. This information permits
    restarts in the event of a P failure, as will be
    discussed below.
6.  Jobs to be executed at appointed times are of importance
    in sampled data systems such as spacecraft. The same
    stack used for storing unsatisfied job requests can be
    used to solve the problem. The outstanding job requests
    R(J,Y,T) may be sorted (or retrieved associatively) by
    T ≤ T0, where T0 is the present time, and further sorted
    by priority. For each new T0 the stack is interrogated
    to see if one or more jobs are outstanding. If so, an
    A-message is prepared, as in 3.
Job Stack
The stack which contains the job requests is a potential
problem area. On the basis of estimations of system size and
speed, and of future integrated circuit sizes, we have
guessed the stack size to be 100 words of 50 bits each. The
required associative properties might be simulated by
circulating the contents of the entire memory in between job
requests, and for each increment of time. A recirculation
time of the order of a few microseconds looks reasonable
from the point of view of circuit technology (10 nsec per
bit, for word-parallel shifting). This access time is
consistent with a time granularity and a job request
interval of the order of ten microseconds, which appear
adequate. It is not yet clear, however, whether room for 100
outstanding job requests is enough.
The job assignment and interrupt structure which has been
defined previously assumes that every processor contains a
job request stack with associative and comparative
properties. In order to avoid the N-tuplication of this
potentially expensive stack, the structure can be modified
slightly. One "infallible" copy of the stack is maintained
in common memory, and is capable of initiating jobs in any
processor. The primary difference in the message traffic
flow is that a Bump Message [B(Pi)] must be defined and
transmitted at bump points. Additionally, the bumping option
available to the distributed table system, which eliminates
unnecessary common memory transfers, is unavailable to the
single table system.
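The job assignment steps 1 through 5 above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class names `Processor` and `Bus`, the use of list order to stand in for "first turn" on the time-multiplexed bus, and the rule that a bumped job is always returned to common memory (the "short job" option of step 3b is ignored) are illustrative choices, not part of the design described here. A larger Y is taken to mean higher priority, and T is carried but not used.

```python
from dataclasses import dataclass, field

@dataclass
class Processor:
    name: str
    repertoire: set                 # jobs this processor is able to do
    current: object = None          # (job, priority) in execution, None if free
    stack: list = field(default_factory=list)   # outstanding R messages

class Bus:
    """Hypothetical single data bus: every message reaches every processor."""
    def __init__(self, processors):
        self.processors = processors
        self.active = {}            # common-memory record of A(J,P): job -> P

    def request(self, job, priority, time=0):
        """R(J,Y,T): every capable P records the request (step 2)."""
        r = (job, priority, time)
        for p in self.processors:
            if job in p.repertoire:
                p.stack.append(r)
        return self.assign(r)

    def assign(self, r):
        job, priority, _ = r
        capable = [p for p in self.processors if r in p.stack]
        free = [p for p in capable if p.current is None]            # step 3a
        bumpable = [p for p in capable
                    if p.current and p.current[1] < priority]       # step 3b
        candidates = free or bumpable
        if not candidates:
            return None             # R stays outstanding in the stacks
        winner = candidates[0]      # stands in for first turn on the bus
        for p in self.processors:   # all P elide R once an A is issued
            if r in p.stack:
                p.stack.remove(r)
        bumped = winner.current
        if bumped:
            self.end(bumped[0], winner)     # E(J,P) for the interrupted job
        winner.current = (job, priority)
        self.active[job] = winner.name      # A(J,P) recorded (step 5)
        if bumped:
            self.request(bumped[0], bumped[1])   # reissue R(J1,Y1,T)
        return winner.name

    def end(self, job, p):
        """E(J,P): annihilate the matching A(J,P) record (steps 4 and 5)."""
        self.active.pop(job, None)
        p.current = None
```

With two processors, a request that finds no free capable processor bumps the lower-priority job, whose request then waits in the stack until a capable processor is available.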
Degradation 
The multiprocessor can degrade gracefully if, together with
the postulated infallible common memory, the message bus and
the part of each processor concerned with message handling
are also infallible. It is necessary that a processor
failure generate a message, i.e., a job request. The job
undertaken by some other processor is to reissue all job
requests shown outstanding for the failed processor. Since
the input information (the list of outstanding A-messages)
is still available in common memory, recovery can be
effected by having other P's do the jobs over again.
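The recovery step described above can be sketched as follows. The function name `recover` and the data layout (a dictionary standing in for the common-memory record of outstanding A-messages, and a list standing in for the bus) are assumptions made for illustration; priorities are omitted for brevity.

```python
def recover(active, failed, outstanding):
    """Reissue every job recorded as accepted by the failed processor.

    active      -- dict job -> processor name (outstanding A-messages)
    failed      -- name of the processor that reported failure
    outstanding -- list receiving the job requests to be reissued
    """
    lost = [job for job, proc in active.items() if proc == failed]
    for job in lost:
        del active[job]             # the old acceptance no longer holds
        outstanding.append(job)     # the job is requested again
    return lost
```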
There are other interesting degraded conditions. One of
these is when there is one processor. The message bus then
has only one occupant, P1. When P1 issues R, P1 receives it,
stores it, computes A(J,P1), issues it, annihilates R, and
gets on with the job. Hence the bus structure must be such
as to allow message sending processors to receive their own
messages. The single processor will also behave
appropriately in the event of a higher priority J2 appearing
while it is doing a job J1.
General system overload is another case of interest. Suppose
the number of job requests becomes large for the system, and
the list of R messages stored in each P increases to the
point of taxing that stack. If by "graceful degradation" we
mean that jobs of higher Y get done first, and that jobs of
lower Y get postponed, but done eventually, then we must
provide means for making room in the "pending R" stacks.
Other strategies are possible, such as proportioned
processor occupancy. One way to do this is to have each
processor store in common memory (or in its own scratch pad,
if it has one big enough) the job requests of lower priority
and later time of execution. One interesting point is that,
if a processor has many unserviced R's in its stack, other
processors are apt to have the same messages in their
stacks. Hence, as the lower priority job requests are stored
in common memory, a message must be issued for annihilating
the same requests stacked in other processors. After making
room in the stack the original processor must issue a job
request that the demoted job requests now in common memory
be reissued.
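One way to picture the "making room" step is the following sketch. The function `demote`, the tuple layout (job, Y, T), and the ranking rule (higher Y first, then earlier T) are assumptions for illustration only; the text does not fix a particular ordering.

```python
def demote(stack, capacity):
    """Split a pending-R stack into requests kept and requests demoted.

    Each request is a (job, priority, time) tuple. The `capacity` most
    urgent requests stay in the stack; the rest would be stored in
    common memory and annihilated from the other processors' stacks.
    """
    ranked = sorted(stack, key=lambda r: (-r[1], r[2]))
    return ranked[:capacity], ranked[capacity:]
```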
IMPLICATIONS 
Software Considerations
Despite the fact that most of the calculations for a
spacecraft are sampled by nature, there exists a substantial
programming burden in sectioning programs into jobs of
proper length and establishing the packages of data required
to shelve and resume the program for interruption and
restart. This burden cannot be placed on the programmer
because, as a practical matter, computer users do not (and
should not have to) know very much about the computer they
use. The onus clearly falls upon a compiler. Programs
written as a single job must be segmented automatically so
as to be able to restart and permit efficient interruption.
Writing such a compiler probably represents a task of the
same order of magnitude as the design of the multiprocessor
itself, and also represents an advance over present
compilers. The above multiprocessor design (and very likely,
any other) would not be attractive without either the prior
existence of a suitable compiler, or knowledge that one can
be written.
An interesting extreme form of program segmentation into
jobs consists of letting each job be an instruction of an
elementary type such as multiply (or perhaps as complicated
as a floating point vector operation). The job name must in
this case contain data addresses and a next instruction
address; or else the job name can be simply the address of
an instruction. This would undoubtedly result in inefficient
processor usage, but it might lead to useful segmentation
techniques.
Estimates of Performance
An order of magnitude estimate of performance requirements
for this ideal multiprocessor can be derived from an
extrapolation of Apollo experience. Within a few years' time
we shall desire a machine which can handle on the order of a
hundred programs at a time on a sampled basis, out of a
total program assembly of hundreds of programs. Each program
would periodically receive a sample update; an average
sample rate of about 50 samples per second per program would
probably be adequate. This means that some 5,000 samples, or
jobs, would be executed every second. The overall bit
transfer rate for common memory, input-output, and messages
is estimated as follows. An average of 25 words must be
brought from common memory and 25 words stored there per
job. This number is based on experience with the executive
program structure of the Apollo Guidance Computer. Assume 50
bits per word for address and data. Assume an average of one
input and one output message and four job assignment
messages of 50 bits each per job. The minimum bit rate which
could possibly serve this system is
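\[
\left(50\ \frac{\text{words}}{\text{job}} + 6\ \frac{\text{messages}}{\text{job}}\right)
\times 5000\ \frac{\text{jobs}}{\text{sec}}
\times 50\ \frac{\text{bits}}{\text{word or message}}
= 14\ \text{megabits/sec}
\]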
This rate takes no account of delays occa- 
sioned by stacked up requests or other access 
times, but is well within reach of today's tech- 
nology for memory and transmission systems. 
The instruction execution rate is estimated 
by assuming an average number, again borrowing 
from Apollo experience, of the order of a thousand 
instructions executed per job, and an average job 
duration of a millisecond. The latter figure is 
chosen on the basis of wanting the multiprocessor 
to react to an input event or job request within 
that space of time. This yields a figure of one 
microsecond per average instruction, and also 
implies that at least five processors need to be 
on line to handle the 5,000 jobs per second. Both 
of these figures seem extremely reasonable in the 
light of our expectations of the technologies 
involved; indeed, we expect that the technologies 
will soon substantially surpass these levels. 
This, added to the fact that we have been describ- 
ing a somewhat primitive form of system organiza- 
tion, suggests that we may expect to have more 
powerful spacecraft data processors in a decade 
than there are on the ground today. 
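The estimates in this section can be checked with a few lines of arithmetic. The variable names below are illustrative; every figure is taken from the text above (25 words fetched and 25 stored per job, one input, one output and four job assignment messages per job, 50 bits per word or message, about a hundred programs at 50 samples per second, a thousand instructions and one millisecond per job).

```python
words_per_job = 25 + 25            # fetched from plus stored to common memory
messages_per_job = 1 + 1 + 4       # input, output, job assignment
jobs_per_sec = 100 * 50            # programs x samples per second per program
bits_per_unit = 50                 # bits per word or message

# Minimum bit rate for common memory, input-output, and messages
bit_rate = (words_per_job + messages_per_job) * jobs_per_sec * bits_per_unit

instructions_per_job = 1000
job_duration_s = 1e-3

# Average time available per instruction, and processors needed on line
seconds_per_instruction = job_duration_s / instructions_per_job
min_processors = jobs_per_sec * job_duration_s
```

The results agree with the figures quoted: 14 megabits per second, one microsecond per average instruction, and five processors.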
EXECUTIVE PROGRAM CONTROL FOR
SPACEBORNE MULTIPROCESSING
ROBERT A. HOKOM
Mr. Hokom is a Senior Computing Engineer with the Computer
Systems Group, Navigation Systems Division, Autonetics, a division of
North American Aviation. Mr. Hokom was awarded a B.A. degree in
Mathematics from the University of Southern California in 1959.
Mr. Hokom has been with NAA's Autonetics Division since 1962,
working primarily in engineering application and systems programming.
He is currently investigating software techniques for a NASA spaceborne
multiprocessing study and is also directing a team in the development
of a problem-oriented language and compiler for the on-board Minuteman
computer.
EXECUTIVE PROGRAM CONTROL FOR SPACEBORNE 
MULTIPROCESSORS 
By: Robert A. Hokom, Senior Engineer, Computing
Autonetics, A Division of North American Aviation, Inc.,
Anaheim, California
N67-17105
SUMMARY 
The necessity for executive program control, a
description of executive functions, and executive soft-
ware implementation are discussed. The frame of
reference is a spaceborne computer for a manned 
Mars mission. 
The reasons for having executive program control 
include requirements for reconfiguration to handle 
failures, computational loading, and unanticipated 
processing as well as the general questions of efficiency
and timing.
The executive functions described are program 
scheduling, inter-program communication, reconfigura- 
tion control, request processing, Input/Output super- 
vision, and computer self-test. 
The application of these functions for a multi- 
module computer, which is representative of many 
multiprocessors, illustrates their characteristics. 
Differences in executive implementation for other
configurations are briefly discussed.
Although this paper is concerned with a particular
mission for a single NASA study, the approaches and
problems examined are applicable to a broad range of
computer systems.
INTRODUCTION 
The combination of a new class of space mission 
requirements and hardware concepts new to onboard 
computer systems requires a reorientation of software 
techniques. 
Software design for current ICBM, manned spacecraft
and unmanned space probe computers is primarily
concerned with maximizing utilization within a framework
of strict timing constraints. Reconfiguration, if
planned for at all, is limited to switching in a backup for
handling failures and to setting logic to enter alternate
program paths for changing mission phases.
For long-duration manned space missions the emphasis
on optimum utilization is balanced by the need
for flexibility. The dynamic mission environment
requires computer response, both hardware and software,
to multiple failures, widely varying computational
loads, and unanticipated requirements. Although timing
constraints still exist, they are not as severe and a large
portion of the computations are unconstrained.
A manned Mars mission (1980 time frame) is used
in this paper as representative of this class of missions.
After demonstrating the necessity for executive control
in general, the various executive functions are described.
Finally, an example of executive software for
a particular configuration and a discussion of application
to other configurations are presented.
NECESSITY FOR EXECUTIVE PROGRAM CONTROL 
The Computations 
Mars Mission Characteristics - The manned 
Mars mission used as the framework for this study 
consists of the following major flight segments: 
1. Launch and injection into Earth orbit.
2. Escape and trans-Mars coast with midcourse
   corrections.
3. Injection into Mars orbit.
4. Escape and trans-Earth coast with midcourse
   corrections.
5. Earth re-entry and recovery.
The purpose of the mission is to perform extensive 
scientific measurements and experimentation in the Mars 
area, and a limited amount of the same during coast 
periods. It is representative of other long-duration 
manned space flights, such as orbiting stations, Mars 
landing, and other planetary missions, since their 
operations are computationally related.
Computational Characteristics - There are two
distinct types of computations: control/monitoring and
batch data processing. The first is necessary for
guidance/navigation, status monitoring and communications,
the second for scientific data handling.
Computationally, the mission consists of a sequential
series of 20 unique phases which are classified as
follows:
1. Mars orbital phase - Characterized by the
   mission's peak loading of 30,000 words with
   380,000 operations/second speed, and
   0.5 hour recovery requirements.
2. Non-Mars, critical phases - Loading of less
   than 12,000 words with less than 200,000
   operations/second, and 5 second recovery.
3. Non-Mars, non-critical phases - Loading of
   less than 24,000 words with 250,000 operations/
   second, and 0.5 hour recovery.
The loading estimates include certain functions
that are non-continuous. For instance, during the
trans-Mars and the trans-Earth coast phases over half of the
load is executed on command only at relatively infrequent
intervals. Of course, other functions, such as
status monitoring and attitude determination, are continuously
performed during the entire mission, although
the mechanizations may differ from phase to phase.
Computational Requirements - The computer 
system must be flexible enough to functionally configure 
itself to efficiently process a variety of computational 
loads. This includes the capability for processing
unanticipated programs (i.e., existing programs not
scheduled for the current phase, or new programs
developed through in-flight programming).
Certain portions of many functions must be exe- 
cuted periodically. The frequency of these cycles 
must not vary. 
In addition to providing necessary error detection
and fault isolation, the system must have a means for
smooth transition to backup configurations.
Since in-flight programming might be necessary, 
software development must not be hindered by an 
excessive number of restrictions. 
Dynamic Versus Static Approach
Static Programs - In general, given enough time, 
a group of skilled programmers can mechanize any 
reasonable set of requirements for a given computer 
without any special support software. In this case, in
order to keep the computer size within reason, it is
necessary to program phase-defined "load modules,"
which are permanently stored on a mass storage device.
Therefore, at least some means for sequencing load
modules into the computer as new mission phases occur
must be implemented.
These load modules could be mechanized as separate 
static programs, each with a high degree of optimality. 
This would mean that 20 unique load modules of between 
12,000 and 30,000 words each would have to reside on 
the mass storage. 
In order to reconfigure for a single failure, another 
complete set of load modules would be required. The 
handling of multiple failures would require multiple load 
modules for each phase. Reconfiguration to handle 
unanticipated processing would involve in-flight construc- 
tion of an entire load-module. 
Assuming that at least 100 unique load modules 
would be required, the mass storage would have to hold 
approximately 1,200,000 words and the logic to select 
them would not be insignificant. In addition, the thought
of recoding functions on the order of 100 times is
especially repugnant to the programmer.
Dynamic Programs - With appropriate executive
program control, each function can be mechanized just
once and the load modules can be constructed during the
flight from "load profile" tables. The mass storage
would have to accommodate only about 60,000 words.
These programs will have to be mechanized so that
inter-program data flow, timing, and Input/Output are
performed with support software, and each program
must be otherwise self-contained. There is some loss
in utilization since there is executive overhead and
modularization by nature implies some inefficiency, but
the inherent flexibility is essential to achieving true
effectiveness of the computer system.
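The storage comparison between the static and dynamic approaches follows from simple arithmetic. The sketch below uses only figures quoted in the text; the variable names are illustrative, and the quoted 1,200,000 words corresponds to taking every load module at the 12,000-word lower bound.

```python
modules = 100                        # assumed number of unique load modules
min_words, max_words = 12_000, 30_000

static_low = modules * min_words     # lower bound on static mass storage
static_high = modules * max_words    # upper bound if every module were largest
dynamic_words = 60_000               # mass storage with executive control

saving_factor = static_low / dynamic_words
```

On these figures the dynamic approach needs about one twentieth of the lower-bound static storage.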
An area often overlooked in computer system
design is the effect of program modifications. As
requirements are altered, deleted, or added, the associated
programs must be reworked. Programs coded
statically must be completely redone for each modifi- 
cation. Dynamic programming permits rework to be 
confined to directly affected functions. 
Accepting the dynamic approach means that certain 
executive and support facilities must be provided, and 
that programming procedures must be established. 
EXECUTIVE FUNCTIONS 
Program Sequencing 
Periodic Programs - There are a number of processes 
within each computational program that must be repeat- 
edly executed at a fixed frequency. For the mission 
under consideration, the highest rate function is attitude 
determination which must be executed 20 times per sec- 
ond. Almost all others must be cycled once per second. 
Background Programs - Other processes are to be
executed continuously during a phase, but have no
timing constraints. An example of this is status monitoring
and testing during trans-Mars coast of vehicle
systems to be used in Mars orbit.
Request Programs - There are many processes that
are executed on command only; in other words, they are
non-continuous. An example of this is the guidance and
navigation function during coast phases; it will be executed
about once a day and will take only 0.5 hour. In
terms of sequencing, unanticipated programs are considered
as request programs.
Program Scheduling Routine - The executive must
have a routine for scheduling the execution of these
programs. The scheme must satisfy the strict frequency
requirement of all periodic programs, sequence
the background programs, and permit execution of request
programs with some priority control.
This routine must minimize dead time and be able
to perform its job for any of the load modules. This
can only be done if the load profiles include information
concerning the classification of each program in the
load module so that scheduling tables can be constructed
and updated.
Inter-Program Communication 
Even though the various programs are functionally
independent, there are a number of parameters that can
be considered as "global." These include data referenced
by more than one program, interface data for
separate programs that jointly represent a single function,
executive-computational programs' control and
information parameters, and data necessary for initialization
of programs.
There must be a means for global parameters to be 
transmitted to and from any configuration of computational 
programs. 
Mars orbital phase - All connections are unblocked
except that P1 is able to temporarily
block the P2 - M1 and P2 - I/O1 connections,
and P2 can do the same to the P1 - M2 and
P1 - I/O2 connections.
Non-Mars, critical phases - A "processing
group" consisting of P1, M1, and I/O1 is
formed by blocking the connections P2 - M1,
P2 - I/O1, P1 - M2, P1 - I/O2, and P2 - M3.
The P1 - M3 connection is also blocked but may
be unblocked in order to expand the processing
group's memory capacity. The corollary
processing group P2, M2, and I/O2, which is
formed by the same blockings, is used as a
backup to the other in case of failure.
Non-Mars, non-critical phases - The processing
group of P1, M1, and I/O1 is formed
and the other modules are turned off. The M3
memory module is brought into the group when
required.
Multi-Module Executive and Support Software 
Program Scheduling - The key area is the scheduling
of the periodic programs. A requirement is established
that all frequencies must be integer multiples of faster
rates. This permits the usage of a fixed "time interval"
cycle, which equals the period of the highest frequency
(0.05 seconds), as the base for scheduling. An interrupt
system returns control to the scheduler at this frequency.
The highest rate program is always executed first 
and then the next highest. A Periodic Table is main- 
tained which contains frequency and status information. 
Figure 2 shows the execution sequence of a set of periodic
programs.
[Figure 2: timing chart listing each periodic program's
Frequency and Time-to-Perform against a time axis running
from 0 to 0.7 seconds in 0.05-second intervals]
Figure 2. Example of Periodic Program
Scheduling
The shaded areas in Figure 2 represent time
periods when either request or background programs
may be executed. A Request Queue, which contains
requested programs and is ordered by priority, is exhausted
before the Background Table is used to cycle
the background programs.
During the Mars orbital phase a separate Periodic
Table is assigned for each processing group. When the
interrupt occurs, each processor scans its schedule.
However, a single Request Queue, residing in M3, is
accessed by both. Separate Background Tables are used,
though, when all requests have been satisfied.
When a program is interrupted, its register values
and the program location counter are saved. A status
indicator for each program is kept in the tables so that
resumption can be scheduled.
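The scheduling order described above (periodic programs first, highest rate first; then the Request Queue by priority; then the Background Table in rotation) can be sketched as follows. This is a simplified illustration, not the scheduler of the study: the class `Scheduler`, the `spare_slots` parameter standing in for the shaded idle time of Figure 2, and the program names in the usage note are all assumptions, and interruption and resumption of programs are not modelled.

```python
import heapq
from itertools import count

TICK = 0.05   # seconds; the fixed time interval quoted in the text

class Scheduler:
    def __init__(self, periodic, background):
        # periodic: {name: period in ticks}, kept with the highest rate first
        self.periodic = sorted(periodic.items(), key=lambda kv: kv[1])
        self.background = list(background)    # Background Table
        self.requests = []                    # Request Queue (a heap)
        self._seq = count()                   # keeps equal priorities in order
        self._bg = 0
        self.tick_no = 0

    def request(self, name, priority):
        heapq.heappush(self.requests, (-priority, next(self._seq), name))

    def tick(self, spare_slots=1):
        """Return the names of the programs run during one time interval."""
        ran = [n for n, period in self.periodic if self.tick_no % period == 0]
        self.tick_no += 1
        for _ in range(spare_slots):
            if self.requests:                 # requests are exhausted first
                ran.append(heapq.heappop(self.requests)[2])
            elif self.background:             # then background programs cycle
                ran.append(self.background[self._bg % len(self.background)])
                self._bg += 1
        return ran
```

For example, with attitude determination every tick (20 times per second) and a once-per-second program every 20 ticks, pending requests are served in priority order before the background programs begin to cycle.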
Inter-Program Communication - Global parameters
are assigned fixed locations. The various programs
merely perform fetch and store operations directly.
Reconfiguration Program - Each load module is organized
so that periodic, request, and background programs
are separated into blocks. The memory is
organized as follows:
1. Central data area and Executive programs.
2. Executive tables.
3. Periodic programs.
4. Background programs.
5. Request programs.
For the Mars orbital phase M1 and M2 would each
be organized this way and M3 would contain additional
request programs.
To load a new load module for a new mission phase,
the current periodic block is moved to upper memory,
overlaying the request block. The loader is executed as
a top priority request, so this move is safe. The new
periodic block, when completely loaded, is initiated;
the old one has continued operating until this time. When
the new periodic programs are operating smoothly, the
other blocks are loaded.
A cold restart is accomplished by direct load. This
is required if failure causes computer shutdown.
Unanticipated request programs are brought in by
overlaying lower priority request programs.
Request Processor - Requests are made by setting
logic flags in a Request Board that contains pointers to
all programs. Console messages are interpreted by a
console message processor routine to set this board.
The scheduler will check this board for possible
additions to the Request Queue each time it is ready to
handle request execution.
I/O Supervisor - Two special routines, PUT and GET,
are used by the computational programs to perform
I/O. The logical name of the sensor, which is supplied
by the user, is used to scan an I/O Configuration Status
Board for device selection. This board is updated by
console inputs whenever manual I/O reconfiguration
occurs.
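The indirection provided by PUT and GET can be sketched as below. Only the routine names PUT and GET and the idea of a status board keyed by logical sensor name come from the text; the class `IOSupervisor`, the `reconfigure` method, and the dictionary representation of devices are hypothetical.

```python
class IOSupervisor:
    """Routes I/O by logical name through a configuration status board."""

    def __init__(self, status_board, devices):
        self.status_board = status_board   # logical name -> device id
        self.devices = devices             # device id -> {"value": ...}

    def reconfigure(self, logical_name, device_id):
        """Console update after a sensor line is moved to another device."""
        self.status_board[logical_name] = device_id

    def get(self, logical_name):
        """GET: read from whichever device currently serves the name."""
        return self.devices[self.status_board[logical_name]]["value"]

    def put(self, logical_name, value):
        """PUT: write to whichever device currently serves the name."""
        self.devices[self.status_board[logical_name]]["value"] = value
```

A computational program keeps calling `get("star_tracker")` unchanged while the board entry is switched from one I/O unit to another.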
Reconfiguration Control 
An executive routine must be provided which can,
in response to appropriate commands, produce a load
module containing all currently required programs.
This naturally leads to the requirement that all programs
be relocatable so that memory resources are
efficiently utilized. The scheme employed must be
capable of reconfiguring in response to several
situations.
Mission Phasing - When a new mission phase is to
begin, the associated load module must be loaded in
such a manner that a smooth transition of the computations
and outputs is obtained. This is especially
important for periodic calculations that will continue
in the new phase.
Failure - The load module must be reloaded into
either an equivalent backup or reduced computer system
after fault isolation determines what errors have
occurred.
Unanticipated Programs - When an unanticipated program
is requested for execution, the current load
module must be altered to accommodate the new
program and, when necessary, the computer configuration
must be expanded.
In all cases, the job not only is reassignment of
resources (storage and processing time), but involves
alteration of the executive itself so that adequate control
can continue to be exercised.
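The relocatability requirement can be pictured with a sketch of load module construction. The function `build_load_module`, the list-valued load profile, and the word counts in the usage note are assumptions made for illustration; the text does not describe the actual table formats.

```python
def build_load_module(profile, sizes, base=0):
    """Assign consecutive base addresses to the programs in a load profile.

    profile -- ordered list of program names required in this configuration
    sizes   -- dict program name -> size in words
    Returns the address layout and the first free address after the module.
    """
    layout, addr = {}, base
    for name in profile:
        layout[name] = addr      # the program is relocated to this address
        addr += sizes[name]
    return layout, addr
```

Because each program is placed wherever the preceding one ends, dropping or adding a program for a new phase leaves no gaps in memory.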
Request Processing 
Request programs can only be placed in the scheduling
tables when a specific command is issued. There
are two primary sources for request commands.
Programmed Requests - A program may wish some
other program to begin execution when certain conditions
are detected. The decision to issue the request can be
dependent on parameter values, input quantities, or
program sequencing logic. The priority of the request
can be either fixed or a computed value.
Console Requests - The astronaut may want to initiate
a request via the computer console. This can be accomplished
by having a support package that will interrogate
and interpret console messages and issue the appropriate
request commands.
A routine to accept these requests and set up the
appropriate logic in the scheduling scheme must be
available.
Input/Output Supervision 
The functional configuration of the Input/Output (I/O)
system, which includes computer I/O units, sensors, and
their interconnections, is dynamic also. Thus, reconfiguration
is concerned with this area too.
Since the recovery requirement during 90+% of the
mission is 0.5 hour, the main means for accomplishing
reconfiguration will be manual replugging of sensor
lines into I/O conditioners. During critical phases the
critical I/O devices will be redundantly available and
automatic switching can be used well within the 5-second
limit.
The computational programs could not keep up with
these changes. Therefore, a central I/O supervision
routine must be provided. Input/Output will be performed
by appropriate calls on this supervisor from
the various computational programs. The current
status of the I/O system must be updated via the console
when manual alterations are performed.
Self-Test 
Whatever error detection and fault isolation techniques
are employed, the software to support them comprises
another executive function.
Application to Specific Computers 
The logical design of the computer system will have
a direct bearing on the relative difficulty and importance
of implementing these executive functions. In fact, the
hardware design must be concerned with providing
features that will facilitate the executive software
system. Thus, the total computer system design effort
involves considerable feedback and trade-off evaluation
between the hardware and software areas.
A REPRESENTATIVE SYSTEM 
Multi-Module Computer Description 
The software design for a multi-module computer
capable of performing the manned Mars mission is
demonstrative of an executive implementation.
Hardware - The computer is composed of a number of
modularized components as represented in Figure 1.
There are two processor modules (P1 and P2), three
memory modules (M1, M2, and M3), two I/O modules
(I/O1 and I/O2), and a variable number of I/O conditioners
with attached sensors. As can be seen, there
are memory-processor connections, processor-I/O
connections, and I/O-conditioner-sensor attachments.
Figure 1. Multi-Module Computer Hardware Representation
(legible figure labels: 12,000-word memory modules; processor module, operations/second)
Functional Configurations - The functional configuration
is changed by blocking and unblocking interconnections
depending upon the type of phase being executed; these are:
On input the sensor is sampled, an appropriate 
reasonableness test may be used to check the data, and 
the value is passed to the user. On output the value is 
picked up, transmitted, and verified via hardware 
feedback. 
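A minimal sketch of the GET and PUT behavior just described follows. The board contents, device identifiers, limits, and function names are all assumptions made for illustration, and the hardware is simulated by a dictionary.

```python
# Hypothetical I/O Configuration Status Board: logical sensor name -> device.
config_board = {"CABIN_TEMP": "COND_3", "FUEL_VALVE": "COND_7"}

# Simulated hardware, keyed by device id (stand-in for real conditioners).
device_values = {"COND_3": 21.5, "COND_7": 0.0}

# Assumed reasonableness limits per logical name (illustrative values only).
limits = {"CABIN_TEMP": (-50.0, 80.0)}


def get(logical_name):
    """Sample a sensor by logical name and apply a reasonableness test."""
    device = config_board[logical_name]   # device selection via the board
    value = device_values[device]         # sample the sensor
    low, high = limits.get(logical_name, (float("-inf"), float("inf")))
    if not low <= value <= high:
        raise ValueError(f"{logical_name}: unreasonable value {value}")
    return value


def put(logical_name, value):
    """Transmit a value and verify it through (simulated) hardware feedback."""
    device = config_board[logical_name]
    device_values[device] = value         # transmit
    feedback = device_values[device]      # read back
    if feedback != value:
        raise IOError(f"{logical_name}: feedback mismatch")
    return True


def replug(logical_name, new_device):
    """Console update after manual I/O reconfiguration."""
    config_board[logical_name] = new_device
```

Because the computational programs refer only to logical names, a replugged sensor line requires a board update but no change to the calling programs.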
Self-Test Program - For this system a number of
error detection techniques are employed. Fault isolation,
except for I/O errors, is performed after failure
has caused a backup configuration (another processing
group) to be brought up. The means used are the
following:
1. Pulse stream - A special flip-flop, which
will be hardware monitored, is complemented
at the beginning of each time-interval cycle
within the scheduler. Processor logic errors,
interrupt errors, and some memory failures
are detected.
Failure notification will automatically be given
when the pulse stream test is failed. Errors
detected by other means perform notification
by forcing pulse stream failure.
2. Arithmetic unit test - One of the periodic
programs is a routine that performs a complete
check of the processor's arithmetic
logic.
3. Check-sum - The computational programs
are internally organized so that code and constants
are blocked separately from variables.
Check-sum information is contained in the
load profiles and is kept in the scheduling
tables. When the scheduler selects a program
for execution, it first performs a check-sum
on it.
4. I/O tests - Another periodic routine issues
special test parameters through I/O conditioners
and checks built-in feedbacks. The
PUT and GET tests are also used to detect
I/O errors.
A Conditioner Status Table is used to collect
information on which I/O errors have been
detected. This is used to isolate to an I/O
unit, a conditioner, or a sensor. Only the first
will cause computer failure notification.
Other Configurations
Multi-Computer - The software for a multi-computer
system, which also is able to perform the mission,
closely parallels the multi-module's. The primary
difference is that during the Mars orbital phase each
computer operates under completely separate executive
control.
Distributed Logic - The distributed logic configuration
considered best suited for the mission is one with a
number of separate cell groups connected only by an
Inter-Group Buss.
Task modules, which are blocks of code that fit
in one cell group, are scheduled only for requests.
Periodic scheduling is performed locally within the
task modules. Reconfiguration consists of reassigning
cell groups to new task modules. Errors are detected
and isolated at the cell group level.
The inter-program communication function is vastly
increased in scope. An Inter-Group Communication
System is used to schedule the buss on a time-shared
basis for transmission of inter-task data, global parameters,
I/O values, and certain test indications.
Unexamined Systems - Although these three designs
are fairly representative, there are innumerable computer
designs that might be considered for spaceborne
missions. The executive functions that were discussed
must be incorporated, to some degree, in any design.
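As a hedged illustration of the check-sum step of the self-test program, the sketch below checks a program's fixed block (code and constants) against a stored value before the program is selected. The additive check-sum, the 16-bit modulus, and all names are assumptions; the paper does not specify the check-sum form.

```python
def check_sum(words, modulus=2**16):
    """Simple additive check-sum over a block of memory words (assumed form)."""
    return sum(words) % modulus


def select_for_execution(program, scheduling_table):
    """Return True only if the fixed block still matches its stored check-sum."""
    expected = scheduling_table[program["name"]]
    return check_sum(program["code_and_constants"]) == expected


program = {
    "name": "GUIDANCE",                               # hypothetical program
    "code_and_constants": [0x1A2B, 0x0003, 0x7FFF],   # never written at run time
    "variables": [0, 0],                              # excluded from the sum
}
# Check-sum information is taken from the load profile at load time.
scheduling_table = {"GUIDANCE": check_sum(program["code_and_constants"])}
```

Blocking variables separately matters here: they change during normal execution, so including them would make the stored check-sum useless.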
ERROR CONTROL 
SELF-REPAIR: FAULT DETECTION AND
AUTOMATIC RECONFIGURATION
EARL C. JOSEPH 
Earl C. Joseph is a Staff Scientist at UNIVAC Defense Systems Division 
(St. Paul, Minnesota), Sperry Rand Corporation. Mr. Joseph joined UNIVAC 
in 1951 after receiving a B.A. degree in Mathematics from the University of
Minnesota.
Until 1955, he held positions of Mathematician, Programmer, and
Applications Analyst in UNIVAC's Arlington, Virginia, 1101 Computation
Center. Since then he has held various supervisory and managerial positions
on the Nike-Zeus and Nike-X efforts. In this capacity, he managed the systems
design, logic design, programming, and application of four generations
of large-scale computer systems, including: (1) the Target Intercept Computer
(TIC), (2) the General-Purpose Digital Computer (GPDC), (3) the MAR
(Multifunction Array Radar) Computer (a GPDC) combined with a centralized digital
controller for distributing parallel control commands to the array radar, and
(4) the Nike-X Multiprocessor.
In his present position as a Staff Scientist, to which he was appointed
in 1963, he performs advanced systems design research for ground base and
aerospace computers.
SELF-REPAIR: FAULT DETECTION AND 
AUTOMATIC RECONFIGURATION 
By Earl C. Joseph 
UNIVAC DEFENSE SYSTEMS DIVISION 
SPERRY RAND CORPORATION 
SUMMARY 
The reliability and flexibility of next generation 
spaceborne computers will not be gained primarily 
from the application of new and more reliable elec- 
tronic devices, but rather through system organization 
to meet the reliability requirements of the space age. 
The multiprocessor is an example of such a computer 
organization. Multiprocessors are capable of parallel
processing and can be configured and reconfigured for 
general-purpose applications meeting advanced require- 
ments for reliability and adaptability. 
With the advent of large scale integrated circuits
(LSI) we are told that we soon will have "computers on
a chip". With such LSI chips containing full or partial
systems, it becomes practical and economical to implement
self-repair. That is, the addition of spare redundancy
and diagnostic logic on a chip is possible without
materially increasing the cost and size of the system.
In addition, by including the "spares" on the chip ready
to be "fused" into usage, automatically under program
control, self-repair is accomplished without additional
connections and complex switching logic.
Included in this paper is a description of the fault
detection, isolation, and location techniques required
to recognize a system failure, to make the necessary
real-time self-repair adjustments to the hardware configuration,
and to recover from errors generated. This
multiprocessor organization with nonmanual self-repair
features allows for reconfiguration so that a level of
capability is continuously maintained and 100 percent
systems availability can be virtually assured.
INTRODUCTION 
Next generation computers will be required to have
available a maximum capability 100 percent of the time
for many applications. With the advent of modular
multiprocessors,¹ the design of systems capable of
achieving self-repair is possible, so that 100 percent
availability becomes possible.
If parts fail, the redundancy of a multiprocessor
organization can be used to obtain reliable operation;
that is, a multiprocessor is inherently a space redundant
system.
Increasing the scope of applications of Large Scale
Integrated Circuits (LSI) is simply a matter of time;
for its usage as the principal ingredient in spaceborne
and aerospace computers (and for that matter, any
computer) is a certainty in this decade. The first
computers that will be made completely from LSI will
be for aerospace applications. Of course, the major
incentive leading to the incorporation of LSI is the
promise of substantial system cost reduction and a
considerable increase in reliability.
The ushering in of LSI puts us at the threshold of
fourth generation computers. Like previous generations,
the current one is characterized by a dramatic
breakthrough in component/device technology. The
four generations and their associated state-of-the-art
technology are:
1st generation computers - Vacuum tubes (1950-1957)
2nd generation computers - Transistors (1956-1966)
3rd generation computers - Integrated Circuits (1962-197?)
4th generation computers - LSI (1967-?)
The continuing reductions in size, cost, and
power consumption of logic circuit elements through
the use of LSI encourage and facilitate the utilization
of more complex logic networks in digital computers
for spaceborne applications and allow for practical
self-repair. This LSI-provoked revolution occurring
in the electronics industry is drastically changing computer
components and is causing an upheaval touching
all levels of space technology.
Before LSI, the designer was forced to be concerned
about the amount of logic going into the make-up of
the spaceborne computers. In the near future, that
will not be the case; for doubling or tripling the amount
of logic per system on a wafer makes only a small difference
in cost, size, and power consumption. Thus,
a new era of highly capable and extremely reliable
computers can now be considered by the space planner.
Historically in the computer industry great advances
have been made in designing for reliability. In 1951
the early computers had a mean-time-between-failures
(MTBF) which was less than one hour. Today we are
using a few computers which exhibit an MTBF of thousands
of hours and are designing computers which
should have an MTBF of 10,000 hours or more. This
paper describes design methods to increase the reliability
by another order of magnitude, to the range of
100,000 hours MTBF or more.
LSI means more logic per component to the system
designer. Projections made at the recent (1966)
IEEE Lake Arrowhead Workshop, "The Impact of
Large Scale Integration on Information Processing
Systems", by LSI component manufacturers indicate
that one can expect hundreds and thousands of logic
gates per component and in the not too far distant future,
in the 1970's, it will be possible to design computers
using ten thousand gates per component. Computers
built today use integrated circuit components with two
to four or at most ten gates per component. So even
with a few hundred gates each, an order of magnitude
breakthrough is occurring and the future promises
breakthroughs of far greater magnitude. These revolutionary
changes mean higher speed and smaller future
systems and are of such magnitude that a revolution is
occurring in aerospace computer design.
Since the component size is relatively the same
throughout this revolutionary change, higher speeds
and smaller systems are possible. This results from
smaller and closer gates driving shorter wires. For
supporting long-term missions demanding increased
data processing capability, these features inherent with
LSI mean that the computer needs of future high-capability,
post-Apollo spacecraft can be met. Of equal
and perhaps greater importance to the spacecraft is
that these LSI features also require less power than
present day circuits and systems.
On-board spacecraft reliability requirements posed
a formidable problem to the computer designer of yesterday.
With LSI, where orders of magnitude fewer
individual components, connections, and process steps
are required to implement a computer, orders of magnitude
greater reliability can be achieved. This improvement
in component reliability coupled with a modular
multiprocessor organization capable of reconfiguring
itself and self-repair to accommodate both equipment
failure and mission changes will allow orders of
magnitude improvements in computer systems reliability
and availability.
In order to meet the spaceborne requirements of:
1) a self-repairing system capable of diagnosing itself,
2) a spaceborne computer system capable of simultaneously
performing a wide variety of either or both command
and control or mission data processing tasks, and
3) a system capable of reconfiguring itself into a functional
system,
a multiprocessor computer system organization is optimum.
Self-Repair: A Definition
A self-repairable digital computer is a reliable
automaton which has the capability of automatically
detecting and isolating a failure to a functional subsystem,
then automatically causing a program (or
hardware) to switch a spare functioning subsystem into
the system to replace and repair the failure. This
paper describes methods of designing computers for
continuous operation through self-repair where manual
repair is not possible.
System Reliability and Self-Repair 
Obviously if the total computer system is demolished,
it is not capable of self-repair. What then is
the minimum subsystem configuration which must function
to allow the computer system to be self-repairable?
This paper describes a self-repairable system requiring
a minimum of operating parts of the system together
with spare submodules that can be switched into the system
to replace failing subsystems.
The reliability of a group of modules in series
presents a serious problem in maintaining the system
in operation; for the failure of one module will disable
the entire complex. The solution is to use parallel redundant
units (as in a multiprocessor) and interchangeable
standby spares which can replace failing modules.
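The series problem and the parallel remedy can be made concrete with the standard reliability formulas for independent modules. The sketch below is illustrative only: the module reliability of 0.9, the chain length of five, and the assumption of statistically independent failures are not taken from the paper.

```python
from math import prod


def series_reliability(module_reliabilities):
    """All modules must work: R = product of the individual reliabilities."""
    return prod(module_reliabilities)


def parallel_reliability(module_reliabilities):
    """At least one must work: R = 1 - product of the failure probabilities."""
    return 1.0 - prod(1.0 - r for r in module_reliabilities)


r = 0.9  # assumed reliability of one module over the mission

# Five modules in series: any single failure disables the complex.
chain = series_reliability([r] * 5)

# The same five functions, each provided by two parallel redundant modules.
redundant_stage = parallel_reliability([r, r])
redundant_chain = series_reliability([redundant_stage] * 5)
```

Under these assumptions the plain series chain has a reliability of about 0.59, while the chain built from duplicated stages is about 0.95.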
Because the multiprocessor is made up of many
modules that perform each function, it is less vulnerable,
in military applications, than contemporary unit
computer organizations; for if one or more modules are
destroyed, by one means or another, and a module of
each function still exists, the multiprocessor can still
accomplish its job. Thus, future computers for military
applications will be organized with parallel operating
functions like a multiprocessor. To further minimize
the vulnerability of the computer, the system designer
has two choices: 1) distribution of the functional
modules throughout the physical system; or 2) centralizing
the functional modules into a well protected area.
For several years UNIVAC has examined various
computer configurations in a continuing search for a
more reliable computer system. These studies conclude
that a modular system, which can adapt itself to
the specific task, is needed. A system which can recognize
the failure of a functional unit and take corrective
action is essential. Future applications require a system
design which allows not only graceful degradation
at the module level, but which also permits adjusting
the tasks to be performed to the remaining hardware
capabilities. That is, as errors occur, software recovery
techniques must enable and assist the system
in recovering from malfunctions. At the full operational
level, all of the hardware consisting of many similar
functional modules in a multiprocessor is needed and
used in a nonredundant manner. As a functional unit
fails, it is electronically removed from the system,
new data paths are created by switching, and the system
continues at a temporarily reduced capability or
capacity. With the advent of multifunction integrated
circuits, manufactured using batch processing, consisting
of hundreds and thousands of logic functions per
component, this type of automatic electronic self-repair
becomes feasible. The functional units are selected
(during design) in such a manner that it is statistically
improbable for a combination of failures to reduce the
capability below the predetermined level needed for
minimal system operation. The hardware and software
diagnostic system must further locate the fault in the
failed functional unit to a replaceable unit without interfering
with the system operation. The failed unit can
then be automatically repaired and the system returned
to full operational capability.
System organization techniques such as these enable
the computer system to achieve system availability
and reliability several orders of magnitude greater than
that of the functional units involved. Summarized, this
concept is one of graceful degradation down to a predetermined
activity level. The probability of failures
which reduce the system to a lower level is so small
that it is realistic to guarantee operation at this predetermined
level. Thus, extreme system reliability can
be achieved through system organization and design
rather than from circuit improvements only. In the
past, achieving a system reliability greater than the
reliability of the components was only possible in the
neuronal systems encountered in biological systems.
With the advent of parallel systems like the multiprocessor,
however, man can now achieve a similar level
of reliability in the hardware systems he builds.
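The claim that degradation below the predetermined level is improbable can be illustrated with a k-of-n calculation. This is a sketch under stated assumptions: identical modules, independent failures, and an invented example of six modules with reliability 0.95, of which three are needed for minimal operation.

```python
from math import comb


def prob_at_least(k, n, r):
    """Probability that at least k of n independent modules, each with
    reliability r, are still working (binomial sum)."""
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


n, r = 6, 0.95                              # assumed module count and reliability
full_capability = prob_at_least(6, n, r)    # every module working
minimum_level = prob_at_least(3, n, r)      # predetermined minimum: 3 of 6
```

With these numbers, full capability is available with probability of roughly 0.74, while the predetermined minimum level is held with probability better than 0.9999, which is the sense in which operation at that level can realistically be guaranteed.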
Any computer system at some point in time will
have subsystems that fail. By advance planning and
using a multiprocessor organization, the system designer
can design a computer system to perform its
task at, perhaps, reduced speed even while containing
failed subsystems. Both the hardware and software,
however, must be designed to work together to achieve
total system reliability and continuous availability.
Today's individual solid state components are extremely
reliable, in the neighborhood of one failure per
10 billion component hours. Since it is prohibitively
costly to obtain sufficient information about failure
mechanisms to improve component reliability beyond
this point, there is little likelihood that the components
of the future will be more reliable. Today's component,
however, represents one or only a few logic
functions, whereas tomorrow's multifunction, batch fabricated,
LSI circuits will contain hundreds, thousands,
and even tens of thousands of logic functions. In this
fashion, with the reliability of the component no better
than today's, the actual reliability of each logic node
will be increased by one, two, three, and even four
orders of magnitude.
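The arithmetic behind this statement can be written out as a short sketch. It assumes, for illustration, that a component's failures are attributed evenly to the logic functions it contains and that today's baseline is one function per component; the function names are hypothetical.

```python
import math

# "one failure per 10 billion component hours"
COMPONENT_HOURS_PER_FAILURE = 10e9


def node_hours_per_failure(functions_per_component):
    """Hours per failure attributed to each logic function on a component."""
    return COMPONENT_HOURS_PER_FAILURE * functions_per_component


def orders_of_magnitude_gain(functions_per_component):
    """Gain relative to a component holding a single logic function."""
    return math.log10(functions_per_component)


for n in (10, 100, 1_000, 10_000):
    print(n, node_hours_per_failure(n), orders_of_magnitude_gain(n))
```

Ten functions per component correspond to a gain of one order of magnitude per logic node, and ten thousand functions to four.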
Further, since there is no observable or predictable
deterioration with time associated with solid
state components, the system designer of a self-repair
system using replacement modules need not be concerned
with routine component replacement schedules.
That is, there exist no algorithms to tell the designer
when a component is about to fail because there are no
known wearout mechanisms, other than random failures.
Thus, the designer is faced with designing the self-repair
system to replace failures only after a failure
occurs. In addition, redundant systems gain no added
reliability from standby modules over active redundant
modules. If the redundant modules are active, an
additional capability is achieved. This capability gain
can be used in a fashion to allow the system to be
smaller, when designed as a multiprocessor, than a
system with an inactive standby and to achieve more
reliability by using fewer components in the total system.
SELF-DIAGNOSTICS 
In a computer system consisting of many computers,
such as a multiprocessor, it is possible to design the
system to permit a functioning processor to diagnose
and repair other parts of the failing system.
For example, consider a worst-case situation where
all diagnostic aids fail to isolate the failed subsystem.
Then, under program control, a functioning processor
need only, by trial and error, switch in and out functioning
subsystems in the failing system until a functioning
and repaired computer is achieved.
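The worst-case trial-and-error procedure can be sketched as follows. The slot names, module identifiers, and the test callable are hypothetical, and the sketch assumes a single faulty module, so that one swap is enough to restore operation.

```python
def repair_by_trial(installed, spares, system_works):
    """Swap spares into each slot in turn until system_works(config) is True.

    installed    : dict slot -> module id currently switched in
    spares       : dict slot -> list of spare module ids for that slot
    system_works : test run by the functioning processor on a configuration
    Returns the working configuration, or None if no single swap repairs it.
    """
    if system_works(installed):
        return dict(installed)
    for slot, candidates in spares.items():
        for spare in candidates:
            trial = dict(installed)
            trial[slot] = spare       # switch the spare in
            if system_works(trial):
                return trial          # repaired configuration found
            # otherwise the trial is discarded (original module switched back)
    return None


# Hypothetical failing system: memory module "M2" is the faulty part.
faulty = {"M2"}
works = lambda config: not (set(config.values()) & faulty)
installed = {"processor": "P1", "memory": "M2", "io": "IO1"}
spares = {"processor": ["P2"], "memory": ["M3"], "io": ["IO2"]}
repaired = repair_by_trial(installed, spares, works)
```

The search cost grows with the number of switchable modules, which is one reason to prefer a small number of large replaceable units.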
Briefly, the techniques of self-diagnostics, to be
discussed, assume that a processor has access to the
various internal or controlling registers of the other
modules in the total system, thus allowing a processor
to actively diagnose faults in itself and other subunits in
the system in real time. In this manner, a processor
can actively detect and control errors as they occur and
reassign tasks within and among these devices to compensate
and adjust the system's work load among the
remaining modules until the failure is repaired.
During the latter part of the last decade and continuing
through to the present time, UNIVAC has been
active in the development of software⁴ which has the
capability of recovering from transient computer and
system errors. Executive control and error control
routines were developed which, in conjunction with
hardware aids, recognized large classes of computer
and system errors. When errors were detected,
control was transferred to the appropriate self-analysis
routine so that appropriate emergency corrective action
could be taken. In demonstrated cases, useful results
were produced even with several errors occurring per
second.
With the arrival of LSI circuits containing many
logic nodes and even complete functions, component
reliability has been and will be greatly enhanced. The demands
upon reliability at the system level, however,
have become so great that mere attention to circuit
design, component selection, and manufacturing techniques
will no longer suffice.
In designing computer systems with self-repair 
features for extreme reliability, the designer needs to 
consider many things, many more than one can possibly 
discuss in a paper of this scope. Some old assumptions, 
definitions, and new considerations presented without 
proof are: 
• Failures are malfunctioning hardware or software
that may or may not cause an erroneous
calculation. They may be either intermittent or
catastrophic, whereas errors always result in
an erroneous calculation. Errors are caused by
either malfunctioning hardware or partially debugged
programs; either type of malfunction may
or may not generate an error; for example, consider
a component failure in the multiply algorithm:
if the multiply instruction is not called
upon and thus not executed, no error will exist.
• In the design of a self-repairable system it is
not just good enough to repair malfunctioning
hardware; the system must also be designed, as
described herein, to recover from all errors,
whether caused by intermittents or catastrophic
failure. The method of error recovery to be
described uses both hardware and software,
working together, to self-repair the damage
caused by an erroneous calculation. Such damage
is manifested as unfinished calculations, loss of
inputs that require reconstitution, and so forth.
• Malfunctions which do not generate errors are not
detected until they cause an error.
• Errors and failures occur infrequently in today's
debugged systems using highly reliable components;
therefore, a computer system should be
designed so that little or no extra system time
(additional time to perform calculations) is required
for detection and error control during
normal operation. That is, the computer can
be designed to take extra time (and use additional
logic) on an emergency basis at the time errors
occur.
• In a self-repair system, spares that would otherwise
be lying on a shelf until they are manually
inserted can be utilized for automatic replacement.
The number of spare modules that can be
handled automatically, however, is small because
the switching logic increases rapidly as the number
of modules switched increases. In a practical
self-repairing system, switchable spares
must contain a lot of logic in order to keep the
number to be switched small.
• If error control is preplanned such that the programs
pre-condition the system for error control,
it is possible to recover and self-repair the
damage resulting from errors. (Refer to section
on adaptive error control.)
• The number of connections between modules is a
minimum when a total function is included in the
module. That is, the density of connections within
a function is high whereas the density of connections
between complete functions is low. Thus
for practical self-repair a functional break up of
the logic is desirable to keep the number of lines
to be switched at a minimum.
• Assuming that the individual components in a system
offer maximum reliability, it is a matter of
system organization to achieve a greater system
reliability than the individual components. In
general, to achieve this higher reliability the designer
must use one form or another of redundancy;
however, a multiprocessor is already a redundant
system by definition. This paper describes
a method of achieving extreme reliability
by making use of the redundancy of a multiprocessor
and by using a computer's spares without resorting
to total system triplication. The method
to be described uses many techniques: fault-masking
hardware networks where they are required,
auxiliary coding detection schemes (parity and the
like), and both software and hardware aided detection
and correction schemes together with self-repair
by spare replacement. In general, triplicated
"faultproof" combinational networks are not
used for the sake of economy. Such a system
combining methods of fault masking and replacement
strategies for achieving reliability, using
the method best suited to the case at hand, leads
to a more economical and more reliable system -
a system having the widest range of tolerance.
The most reliable computer in the world would be
unreliable and useless if the programs it is executing
were not completely debugged and designed for reliable
operation - designed to recover from errors. To this
end, an adaptive error control program which adapts to
a malfunctioning environment is required.
In order for a computer to repair itself and recover
from errors the following steps must be accomplished:
• Error Detection
• Fault Location
• Fault Isolation
• Error Control and Recovery
• Repair Replacement.
The following tabulation is a sample list of the type
of error detection circuitry that may be included in a
self-repairing computer system:
• Parity checking for data words
• Parity checking for memory address words
• Parity checking at functional unit interfaces
• Integrity checker on the program address counter
• Over-write and over-read checkers for list memories
• Illegal operation detectors
• Arithmetic error detectors
• Power and temperature (environment) fault detection
• Real-time checks
• Critical command operation checks.
In addition to the circuitry required to detect the
above failures, there are fault registers which freeze
information about errors as they occur together with
associated fault interrupt generation control.
E r r o r s  and failures in computations a r e  detected 
by either hardware o r  by programmed tests .  
tem which incorporates considerable e r r o r  checking 
hardware, all programs can become fault checking 
programs. Such a system would consist of some o r  
all of the following e r r o r  checking hardware: 
In a sys- 
Data parity checking - all register-to-register 
t ransfers  and all data  read from memory. 
Address parity checking - all t ransfers  of ad- 
d r e s s e s  between reg is te rs  and read or  write 
references to memory up to the input of the mem- 
ory dr ivers .  It i s  far more  undesirable to  jump 
to a wrong address, read the wrong word, o r  
write into a wrong memory location than to  read 
a data word that has  lost  a bit. Thus, this  type 
of parity check is more important than the data 
parity check found in  most computers. Yet, it  
is amazing that address  parity checking i s  seldom 
implemented in  contemporary computers. 
Program address  counter parity check - before 
advancing the counter, the parity of its value plus 
one i s  predicted and then, after advancement, i t s  
parity is checked against the predicted value. 
Again, this is an extremely important check, be- 
cause if the program address  counter is not oper- 
ating properly, no believable program execution 
is possible. 
cr i t ical  commands. Critical commands are de- 
fined as that logic which, when not functioning, 
does not allsow programs t o  be executed. 
Sequence t ime checking - all instructions preload 
a countdown t imer  before execution and, if the 
t imer  reaches zero  before the instruction i s  
completed, a hang-up e r r o r  exists. 
interrupt occurs  to  re lease  the computer f rom 
this condition. 
Arithmetic checks - checking for  overflow and 
the like. 
Illegal operation checks - such as nonexistent 
address  checks. 
Operations like this are termed 
An e r r o r  
-44- 
Environmental checks - examining power and temperature. To
further reduce the degree of system exposure to failures,
redundant nodes and modules are powered from individual and
separate power sources which have individual turn on/off systems,
controlled by the power checking system, so that one of the nodes
is always operable.
Memory lockouts - controlled by the executive program for
protecting data and programs being executed concurrently.
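The program address counter check listed above can be illustrated with a short sketch. The 16-bit width and all function names are assumptions made for illustration only; the paper does not describe the circuit. The sketch relies on the fact that incrementing a binary counter clears its trailing run of 1 bits and sets the next 0 bit, so the parity of the advanced value can be predicted from the current value without performing the addition.

```python
WIDTH = 16  # assumed counter width; the paper does not give one
MASK = (1 << WIDTH) - 1


def parity(x):
    """Return 1 if x has an odd number of 1 bits, else 0."""
    return bin(x & MASK).count("1") & 1


def trailing_ones(x):
    """Count consecutive 1 bits at the low end of x (at most WIDTH)."""
    count = 0
    while count < WIDTH and (x >> count) & 1:
        count += 1
    return count


def predict_parity_after_increment(x):
    """Predict the parity of (x + 1) mod 2**WIDTH from x alone."""
    t = trailing_ones(x)
    # t trailing ones are cleared; one zero is set unless the counter wraps
    flipped = t if t == WIDTH else t + 1
    return parity(x) ^ (flipped & 1)


def advance_counter(x, fault_mask=0):
    """Advance the counter, optionally inject a fault, and check parity."""
    predicted = predict_parity_after_increment(x)
    new_value = ((x + 1) & MASK) ^ fault_mask  # fault_mask flips bits
    return new_value, parity(new_value) == predicted
```

A single flipped bit in the advanced counter changes its parity and is therefore caught; a fault that flips an even number of bits would escape this particular check.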
FAULT RECOVERY
The main objective of a self-repairable system is to maintain
the system at its maximum possible capability. This goal is
accomplished by detecting a failure as soon as possible after it
occurs, isolating the failure to a functional subunit before
system contamination can occur, diagnosing the failure to the
replaceable unit, replacing the failing unit, and re-establishing
the function within the system. To accomplish the above operations
in real time, all operations must be manipulated and controlled by
the programs and hardware without manual intervention.
Another objective of such a system is to maintain a continuous
level of computer capability within the system. This means that
the removal and replacement of a faulty unit should not interfere
with the operational capability of the remainder of the system.
These objectives are not singularly limited to the computer, but
rather include the total system: the computer, its software, and
peripherals. The self-repairing system discussed herein is for the
total system.
In general, errors will be detected by hardware and software.
Detected errors will then be recorded in a processor, a memory, or
a status unit. The error detected will react in the system with
the initiation of a task through an interrupt at one of the
functioning processors. The processor interrupted will analyze the
trouble, determine corrective action, and decide whether to
continue, do further diagnosis, or isolate and replace the
troubled component. In most cases, the program will decide how
many times an error may occur before isolation is necessary.
Electronic isolation and surgery will be accomplished by changing
the status of the defective unit at a status unit. Functional
isolation and system reconfiguration are accomplished by programs,
switching submodules, and using memory lockouts. Subsystem
switching (replacement repair) will be done without power shutdown
and confidence checks will be made before returning the
electronically repaired equipment to operational status.
In all these cases, if an error is detected, the following
sequence occurs:
An interrupt occurs to the appropriate executive error control
program which further resolves the error condition. This error
control program, if it is executable, will determine if the error
is intermittent or catastrophic. If a failure has occurred, it
will indicate this condition and initiate repair. If the failure
is such that the executive error control program is not
executable, then special hardware or another processor is required
to isolate the error to the functional subsection that has failed,
and the error control hardware initiates the repair by initiating
a functioning processor.
Prior to an error interrupt and instantly upon detection of an
error, information about the error is captured and frozen in
registers. Two types of data are captured and saved: 1) the type
of error and 2) an address associated with the error. This
information is used by the error control program or hardware to
determine which functional subsystem has failed and for recovery
from the error.
Additionally, real-time control programs and certain on-line
multiuser systems require that all outputs be checked for validity
and that arithmetic operations be checked. In either case, it is
usually sufficient to execute periodically (for example, before a
control output) a confidence reliability test program to determine
the state of the system and, if no errors are detected, to assume
that the system was also all right when the critical computations
were performed. This program also checks logic which does not have
associated error checking hardware. In other cases, additional
logic is required to check all operations.
To assist in the error detection, location, isolation, and
recovery process, the following registers are required in each
processor:
Error Type Detected - Holds type of error
Error Address - Holds address associated with error
Error Recovery Location - Program pre-loaded for recovery (to
be described)
Memory Lockouts - Holds lockout information
Status - Holds operable status of all subsystems.
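The register set listed above can be pictured as a small data structure. The following is only a sketch: the class and field names, the particular error types enumerated, and the rule that the first captured error is held until released are assumptions introduced here, not details given in the paper.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ErrorType(Enum):
    """Hypothetical encoding of the error checks listed earlier."""
    NONE = auto()
    DATA_PARITY = auto()
    ADDRESS_PARITY = auto()
    SEQUENCE_TIMEOUT = auto()
    ARITHMETIC = auto()
    ILLEGAL_OPERATION = auto()


@dataclass
class ErrorRegisters:
    error_type: ErrorType = ErrorType.NONE       # Error Type Detected
    error_address: int = 0                       # Error Address
    recovery_location: int = 0                   # Error Recovery Location
    memory_lockouts: set = field(default_factory=set)
    status: dict = field(default_factory=dict)   # subsystem -> operable?
    frozen: bool = False

    def capture(self, error_type, address):
        """Freeze the first error; later errors do not overwrite it
        (an assumed policy)."""
        if not self.frozen:
            self.error_type = error_type
            self.error_address = address
            self.frozen = True

    def release(self):
        """Clear the captured error once the control program has read it."""
        self.error_type = ErrorType.NONE
        self.error_address = 0
        self.frozen = False
```

Holding the first error until it is explicitly released is one way to model information that is "captured and frozen"; a real design might instead queue or count subsequent errors.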
In contemporary computers using discrete components for
achieving a logic node, parity checking is a useful method of
detecting single errors occurring in transfers of data. Parity
checking has found widespread use in the computer field as an
economical method of checking for errors. With the advent of the
LSI circuit containing many logic nodes, however, a multiple logic
error becomes very likely. Since there are many logic functions
per integrated circuit, when a failure occurs, such as a crack
propagating through the single component circuit, the probability
of the failure affecting many of the logic functions, rather than
just one of these functions, is high.
With the probability of multiple logic switching function
failures occurring when a component fails, the possibility of an
even-odd parity check detecting the error is considerably reduced
over the days when discrete components were used in systems. This
means the value of parity needs to be reanalyzed in the light of
this new problem encountered through the use of batch fabricated
integrated circuits. In the meantime, while other economical
methods of detecting errors are determined, the designer is
burdened with arranging his logic so that more than one logic
failure in a single component does not invalidate the parity
checks. As the number of logic nodes per component increases, the
number of external connections to the component for performing the
logic functions goes down rapidly with a corresponding increase in
reliability. That is, the number of connections between natural
logic functions is considerably less dense than the connectivity
required within the function (within the integrated circuit). When
the logic designer designs around this parity problem, the number
of connections to the integrated circuit goes up somewhat in order
to isolate and achieve valid parity checking and, in turn,
reliability goes down because of the extra connections; however,
the increase in connections is less than those deleted through the
use of multifunction integrated circuits.
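The weakness of a single even-odd parity bit against multiple errors can be checked by enumeration. The sketch below is illustrative only; the word width and function names are assumptions. It counts, for a given number of simultaneously flipped bits, what fraction of error patterns change the parity of a word.

```python
from itertools import combinations


def parity_bit(word):
    """Return 1 if word has an odd number of 1 bits, else 0."""
    return bin(word).count("1") & 1


def detected_fraction(width, weight):
    """Fraction of error patterns with `weight` flipped bits that a
    single parity bit detects in a `width`-bit word."""
    word = 0  # detection depends only on the error pattern, not the word
    stored = parity_bit(word)
    total = detected = 0
    for positions in combinations(range(width), weight):
        error = sum(1 << p for p in positions)
        total += 1
        if parity_bit(word ^ error) != stored:
            detected += 1
    return detected / total
```

Every pattern with an odd number of flipped bits is detected and every pattern with an even number is missed, which is why logic sharing one component must be arranged so that a single component failure cannot disturb two bits covered by the same parity check.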
To determine what portion of the diagnostics is to be performed
by the hardware and what part by the software, the designer must
determine, for the various applications of the equipment being
designed, what period of time is tolerable for system
interruption. In general, error detection must keep pace with the
computations and thus hardware is required for error detection;
whereas fault location, isolation, replacement repair, and process
restoration need not keep pace with the computations and these
functions can be performed primarily by the software. The question
becomes one of determining how much hardware to put in the system
for error diagnostics. This becomes an important question when the
designer realizes that the requirement ranges from only a very
small percentage of the total system for a coupled error detection
and self-repair scheme on up to three or four times as much
hardware for a triplicated system. Studies to date indicate that
replacement systems require a percentage increase in hardware
rather than the many-factor increase required when massive
redundancy is used.
PROGRAMMED SELF-REPAIR
The pool of submodules, a spare parts bank, is used by the
self-repair program in reconstructing those portions of modules
that have failed by switching (see Figure 1). In effect, the
program switches out the failed module. The error control program
then determines which submodule has failed in the failing module
in order to switch it out and to switch in a good submodule from
the pool of spare submodules. Total system capability is regained
by switching this "repaired" module back into the system.
[Figure 1: block diagram showing the operating system, switching,
and spare parts, with the self-repair subsystem (hardware and
programs).]
Figure 1. A Self-Repairing Machine
Critical programs, constants, and computed variables are
double-stored in separate and distinct memory modules. In
particular, the self-repair and error control programs are double
stored. Thus, if a memory module fails, the alternate module
containing identical data is referenced. If a processor (or
input/output) module fails, a functioning module takes over the
task of the failing module. In either case, the total task is
performable, at a possibly reduced speed, until the failing module
is repaired.
ADAPTIVE ERROR CONTROL
There are many methods for a program, which has been
interrupted because of a fault, to recover (that is, to adapt to
the error). One would be to return to some previous point in the
program where all computational values are present and repeat a
section of the program that has failed. Another method would be to
use an acceptable substitute for the desired output value and
continue the program without the information from the failing
section. Other methods would be to perform entirely different
calculations or reconstitute the inputs and reperform the
computations. Each recovery method is unique to the program which
has been error interrupted. The programmer of such a system must
be able to select the best method of recovery for each critical
portion of his program.
One method of achieving failsafe adaptive error control is to
preplan for the occurrence of errors. In this method the system is
preconditioned during the execution of the program, as each
critical subprogram is initiated, by logging a recovery address
(where to recover to) into each processor's fault recovery address
register. After an error interrupt (detection of an error), the
address in this register is used by the executive error control
program to determine where to recover to in the program
interrupted. This method allows all faults, including transients,
to be recovered from and corrected.
Examples of fault recovery initiation points are denoted in
Figure 2 by the encircled letters A, C, and D. As the program
reaches each of these recovery points, the location of the
recovery point is sent (retained in memory) to the error control
subroutine in the executive control program. Only one recovery
point (address location) is retained per processor at a time.
Whenever a fault is detected, control is transferred to the
executive error-control program (usually through an interrupt via
a task list), which uses the address thus retained (A, C, or D) to
cause a jump to where the appropriate remedial action will be
initiated. For example, assume an error occurs in the middle of
Subtask 2 in Figure 2. Control is transferred to the executive
error-control task, which, after initiating a diagnostic task,
transfers control to subtask C (the fault recovery point), which
performs the specified remedial task before any erroneous external
effects occur.
Figure 2. Adaptive Error Control and Recovery
By incorporating adaptive error control in this fashion, a
processor becomes a self-recovery system - a time redundant
system.
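The recovery-point mechanism described above can be sketched in a few lines. This is a toy model under stated assumptions: a Python exception stands in for the hardware error interrupt, a callable stands in for the recovery address, and the names Processor, FaultDetected, make_flaky_subtask, and substitute_subtask are hypothetical.

```python
class FaultDetected(Exception):
    """Stands in for the hardware error interrupt."""


class Processor:
    def __init__(self):
        # only one recovery point is retained per processor at a time
        self.recovery_point = None

    def log_recovery_point(self, routine):
        """Precondition the system as a critical subprogram is initiated."""
        self.recovery_point = routine

    def run(self, task):
        """Run a task; on a fault, transfer to the logged recovery routine."""
        try:
            return task(self)
        except FaultDetected:
            # executive error-control program: jump to the retained address
            return self.recovery_point(self)


def make_flaky_subtask(failures):
    """Build a subtask that faults `failures` times, then succeeds."""
    state = {"remaining": failures}

    def subtask(cpu):
        # recovery method: return to the start and repeat the section
        cpu.log_recovery_point(lambda c: c.run(subtask))
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise FaultDetected()
        return "output"

    return subtask


def substitute_subtask(cpu):
    # recovery method: continue with an acceptable substitute value
    cpu.log_recovery_point(lambda c: 0.0)
    raise FaultDetected()
```

The two subtasks correspond to two of the recovery methods named in the text, repetition of the failed section and use of a substitute output value; in both cases the choice is made by the programmer of the subtask, not by the executive.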
The following examples are some of the criteria for choosing
the recovery points:
A remedial routine must be performed
Information must be read again
The problem must be computed again
A set of computations must be aborted
Data must be reconstituted.
This error control philosophy is the result of studying
programmed systems using this technique successfully. The
following real example describes the kind of results possible with
this method of error control. In one system built by UNIVAC, some
calculations and all errors were event-recorded. During many runs
involving a real-time control problem, a memory short developed
which electrically coupled the computer to other equipment in the
metal shielded room. Consequently, an error occurred whenever a
telephone relay clicked, whenever an electric typewriter was
operated, and so forth. The event-recorded tapes indicated a
mean-time-between-errors of 30 milliseconds. Even with errors
occurring at this rate, the real-time control task was
accomplished satisfactorily.
Thus, all programs, through adaptive error control and error
detecting hardware, become fault checking and correcting programs.
Briefly, the end goal of adaptive error control is not to
achieve internal fault free operation but rather to achieve error
free system operation as viewed from the outside.
RELIABILITY AND AVAILABILITY 
The multiprocessor system described herein was simulated using
pertinent calculated reliability parameters. The purpose of this
simulation analysis was to show how levels of organizational
redundancy, repair philosophy, and component reliability interact
and affect the reliability of a self-repairing multiprocessor
system.
All units in the system were assumed to have exponentially
distributed failure times. See Figure 3. The repair rate using a
pool of submodules for replacement was assumed to be the same for
all units in the system. It was also assumed that the system was
to operate continuously for a one-year period without manual
maintenance and upon failure was to be serviced automatically by
the error control programs and self-repair system from a pool of
spare submodules so that failing functions could be restored to
the system; however, each submodule was lost from the system upon
failure. The mean-time-to-restore a failed unit to service was
assumed to be lengthy compared to the actual milliseconds required
by the programs.
[Figure 3: block diagram of the multiprocessor: processor, memory,
and input/output modules served by a distributed pool of
submodules; the self-repair programs (executive error control,
error recovery, reliability test, symptom-fault catalogue) are
held in distinct memory modules.]
Figure 3. Self-Repairable Multiprocessor
The system was assumed to have failed when: 1) all modules of a
function failed simultaneously (e.g., all processors); or 2) all
spare submodules of a type were exhausted.
Each processor was assumed to consist of 10,000 logic nodes,
contained in 250 integrated circuit packages. A typical component
failure rate for each package was selected as 25 failures per
billion hours or, in other words, a mean-time-between-failures per
component of about 5000 years.
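The arithmetic behind these figures can be checked directly. The sketch below assumes a year of 8766 hours and, for the processor figure, a simple series model in which any one package failure disables an unrepaired processor; that series model is an assumption added here for illustration and is not the repairable-system analysis used in the paper.

```python
HOURS_PER_YEAR = 8766  # 365.25 days of 24 hours


def mtbf_years(failures_per_billion_hours, units=1):
    """MTBF in years for `units` identical parts whose constant
    failure rates add (series model, no repair)."""
    rate_per_hour = units * failures_per_billion_hours / 1e9
    return 1.0 / rate_per_hour / HOURS_PER_YEAR


package_mtbf = mtbf_years(25)               # one package
processor_mtbf = mtbf_years(25, units=250)  # 250 packages, no repair
nodes_per_package = 10_000 // 250           # logic nodes per package
```

One package comes out at roughly 4,560 years, which the text rounds to about 5000 years, while an unrepaired 250-package processor would be expected to last only about 18 years. The gap between that figure and the system result quoted below is what replacement repair from the spare pool is meant to close.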
The input/output units were assumed to contain as many packages
of the same type as the processors. The memory modules were
assumed to consist of less than half the number of packages as a
processor. In all cases it was assumed that the switching logic
and power were included in the modules. A two-processor, two
input/output, three-memory module multiprocessor, with spares
consisting of 2000 integrated circuit packages (80,000 logic
nodes), was analyzed.
A pessimistic estimation for the mean-time-between-failures was
calculated using the method of [name illegible in source]. The
result was a mean-time-between-failures of more than 100,000
years.
That is, a system consisting of 2000 packages using the methods
outlined in this paper for obtaining reliability is 20 times more
reliable than its individual components.
SPARE REPLACEMENT SWITCHING 
A major problem in achieving practical self-repair is in
performing the electronic surgery (the logic) for switching out a
failed submodule and switching in a functioning spare submodule
(replacement repair). One method of solving this problem is to:
Place spares on the same LSI wafer as the logic being spared in
order to reduce the number of external connections.
Use logic to perform an operation analogous to fusing for the
interconnection of submodules, including spares.
Place the fuses in series for each interconnection so that a
malfunctioning submodule can be removed both functionally and
electrically from the system - the blowing of these fuses deletes
the submodule from the system. The reverse operation of the fuse
(making the circuit) allows the spares, which are bussed in
parallel to the interconnection path, to be added, both
electrically and functionally, to the system.
Allow a computer to repair itself when critical control logic
fails. The logic required simply tries connecting and
disconnecting (by fusing) submodules until a functioning set of
submodules, to make up a complete system, is located.
Use an error status register to address (determine) which
interconnection lines and control logic to effect the fusing
operation of switching the failure out and switching a good
submodule in, when malfunctions occur. This register is loaded
automatically on detection of critical control errors, which also
automatically energizes the control logic for the fusing
operation. For noncritical failures (failures which allow a
program to be executed) a program is required for loading the
register and initiating the fusing operation (for the purpose of
economy of hardware).
Use multiple fuses for spares that are usable in more than one
section of the system. When a submodule is connected into the
system for operation in one section, the other fuses are
interlocked so that the same spare cannot be fused into a circuit
for a different use in the system at a later time. This method of
fused switching allows submodules together with their fuses to be
added modularly.
Provide each submodule with multiple fused interconnection
paths, including power, to give redundant pathways. Therefore,
even if multiple failures occur, the submodule can be switched out
so that it does not affect system operation.
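The fuse interlock for spares that are usable in more than one section can be modeled as follows. This is a software analogy only; the class and function names are hypothetical, and the real mechanism described above is logic on the wafer, not a program.

```python
class Spare:
    """A spare submodule with one fuse per section it could serve."""

    def __init__(self, name, sections):
        self.name = name
        self.fuses = {section: False for section in sections}  # False = open
        self.interlocked = False

    def fuse_into(self, section):
        """Close the fuse for one section and interlock all the others."""
        if self.interlocked:
            raise RuntimeError(f"{self.name} is already committed")
        if section not in self.fuses:
            raise ValueError(f"{self.name} cannot serve {section}")
        self.fuses[section] = True
        self.interlocked = True


def replace_failed(section, pool):
    """Fuse the first free spare usable in `section` into place."""
    for spare in pool:
        if not spare.interlocked and section in spare.fuses:
            spare.fuse_into(section)
            return spare
    return None  # spares of this type are exhausted
```

Once a spare has been committed to one section it is no longer offered to any other, and a return value of None corresponds to the second system failure condition given earlier, exhaustion of the spares of a type.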
Since it is desirable to design computer systems using a
minimum number of spares in order to keep the switching logic
within practical limits, an investigation was performed using
quorum probabilities. It was assumed, since it is not possible to
predict in which submodule a failure would occur, that the
self-repairing system would require at least one of each different
type of submodule as a spare. Further, in order to achieve the
desired high level of reliability, additional spare submodules
would be required. By using Einhorn's equations for the
calculation of MTBF and solving for the number of spares
(redundant submodules) it can be shown that the number of spare
submodules of each type is small.
UNSOLVED PROBLEMS 
Not all problems for achieving self-repair have been solved.
The problem areas associated with self-repair techniques requiring
solution are:
What self-repair techniques, applied singly or in combination,
provide the greatest improvement in reliability?
What methods are optimum for automating and initiating:
  Fault diagnosis
  Fault location
  Fault isolation
  Self-repair by replacement
  Error repair (process restoration)?
What constitutes a complete (closed) set of fault diagnostics
and self-repair techniques and what theory can be formulated to
show that the set is complete?
What is the effect of self-repair on the total system relative
to design, manufacturability, maintenance, application, etc.?
What ground rules must be followed to achieve total
self-repair?
What are the implications of self-repair on programming?
What different diagnostic and self-repair techniques are
required by various functional logic circuitry, such as control
and timing logic, hard core logic, critical command logic, and
memory?
CONCLUSION 
With micro-miniaturization techniques growing into standard
usage, producing equipment that is difficult and time consuming to
repair manually, automatic self-repair techniques are becoming a
necessity. Self-repair by automatic replacement puts the spares
associated with any computer installation to use in effecting
extreme system reliability rather than having them sit idle on a
shelf waiting to be manually put into use.
In computing systems of the future, the entire task of fault
detecting, identifying, locating, isolating, repairing, and
process restoration can be automated by replacement switching with
present day state-of-the-art techniques. Future computing systems
will undoubtedly incorporate some form of self-repair because many
applications require continuous error-free operation.
The many-sided advantages of self-repairing automata for the
computing field include:
Continuous system operation
100 percent system availability
Long term remote system operation (operation for years)
Reduced maintenance costs.
A computer, through a system design encompassing both hardware
and software techniques, can be designed as a self-repairable
space and time redundant system. Since the technology now exists
to achieve a self-repairable system and such a system is
realizable, 100 percent system availability can be virtually
assured.
REFERENCES
1. Joseph, E. C., "Multiprocessing for Information Systems",
Sperry Engineering Review, vol. 17, no. 1, Summer 1964, pp. 39-43.
2. Burke, T. E., and Wang, G. Y., "Spaceborne Multiprocessing
Organizations", WESCON/66, Session 9, "Advanced Spaceborne
Computer Concepts", August 23-26, 1966.
3. Cubert, J. S., Simmons, G. T., and others, "Impact of Batch
Fabrication on Future Computers", Proceedings of the National
Symposium sponsored by the Computer Group of the IEEE, April 6-8,
1965, Los Angeles, California.
4. Champine, G. A., and Griffith, G. M., "Automatic Error
Recovery in the Nike-Zeus Guidance Computer", Fall Meeting of ACM,
1962.
5. Pierce, W. H., "Failure-Tolerant Computer Design", Academic
Press, New York and London, 1965.
6. "On-Line Computing Systems: A Summary", Proceedings of the
Symposium On-Line Computing Systems, Los Angeles, California,
February 2-4, 1965.
7. Kneale, S. G., "Reliability of Parallel Systems with Repair
and Switching", Proc. Seventh National Symposium on Reliability
and Quality Control, IRE, Philadelphia, Pennsylvania, January
9-11, 1961, pp. 129-133.
8. Einhorn, S. J., "Reliability Prediction for Repairable
Redundant Systems", Proceedings of the IEEE, February 1963, pp.
312-317.
ACKNOWLEDGEMENTS 
In presenting this paper, the author wishes to express his
appreciation for the assistance and cooperation offered by fellow
colleagues at UNIVAC. Certainly, an effort as complex as that
discussed herein is the result of many individual contributions. A
special expression of gratitude, however, is owed to Mr. D. R.
Lewis for providing many comments which served to refine this
work.
In addition, acknowledgement is due to Mr. H. J. Corning for
his reliability analysis and simulations which aided and confirmed
the results reported. Finally, the author wishes to express his
debt of gratitude to the other staff members whose untiring
efforts made this paper possible.
ARITHMETIC ERROR CORRECTION
HARVEY L. GARNER 
Dr. Garner has been a Professor of Electrical Engineering at the
University of Michigan since 1963. He received his B. S. and M. S.
(Physics) degrees from the University of Denver in 1948 and 1951,
respectively, and his Ph. D. (Electrical Engineering) degree from
the University of Michigan in 1958.
Dr. Garner was a Research Associate, Cosmic Ray Research
Program, University of Denver, and the Inter-University
High-Altitude Research Laboratory from 1949 to 1951. He was Chief
Engineer, Engineering Research Institute, University of Michigan,
where he was involved in the development and operation of the
MIDSAC and MIDAC computers from 1951-55.
He was an Instructor in Electrical Engineering at Michigan
(1955-58) and Chairman of the intensive summer computer courses
(1955 to present). He became an Assistant Professor in 1958, an
Associate Professor in 1960, and a full Professor in 1963 at the
University of Michigan. In addition, he was Director, Information
Systems Laboratory, at the University from 1960 to 1964.
Dr. Garner is a member of the Communication Sciences Program
Committee (1960 to present), was an Organizer for the Computer
Sessions, 1962 IRE Convention, is a member of the Board of
Directors of the Ann Arbor Computer Company, 1965-1966, a Director
of the Systems Engineering Laboratory at the University
(1965-1966), and Acting Chairman for the Program in Communication
Sciences at the University from September 1965 to present. In
addition, he has been a consultant to IBM, Lockheed, Bell
Telephone Labs, Ford Instrument, Strand Engineering, Holley
Carburetor, Sylvania, Cincinnati Milling Company, ESSO Production
Research, and the U. S. Air Force.
ARITHMETIC ERROR CORRECTION 
by Harvey L. Garner 
Professor of Communication Sciences 
and Electrical Engineering 
The University of Michigan
Ann Arbor, Michigan 
SUMMARY 
In this paper the classification and properties of arithmetic
codes are briefly reviewed. The theory of error checking and
correction is well developed. However, the application of this
theory to practical computers has been limited because of the
effects of error checking on computation rate or because of the
relative complexity of error control circuits in comparison with
arithmetic circuitry. It is possible that some applications of
error codes have led to a less reliable overall unit because of
the complexity of the error control equipment, which is also
subject to malfunction. It is well known that a modulo three
residue check suffices for the detection of all single arithmetic
errors. In this paper a new logical circuit for the determination
of the modulo 3 residue is presented and the expected performance
of this circuit is analyzed, using the assumption that the error
events follow a binomial distribution. The modulo three check
circuit operates faster than the conventional adder carry logic
and the statistical analysis indicates that the circuit is
practical.
Introduction
Error correcting codes for arithmetic operations have received
considerable attention. Peterson has shown that all separate
checking codes are residue codes [11]. Brown has introduced the AN
codes [1] and Henderson has given examples of systematic codes
[6,7]. Since Henderson does not consider the code arithmetic, it
is impossible to determine whether these codes as presented by
Henderson were meant to be separate or nonseparate. Recently
Garner [3] has shown that both the Brown codes and the Henderson
codes are members of the same general class of nonseparate codes
which are ideals contained in rings of integers.
Error codes are classified according to three characteristics:
(1) Parity or Residue check, (2) Separate or Non-separate
arithmetic for the check digits and the number digits, (3)
Systematic.
A single malfunction in the conventional adder logic produces
errors represented by burst error patterns for parity codes. The
burst length or weight may have any value from zero (no error) to
n+1 (all digits of the sum in error) [5]. All burst errors due to
single component malfunctions in a standard binary adder are
obtained if + is used to combine the error patterns of the type
$e = \pm 2^{j-1}$, $j = 1, 2, \ldots, n$, with the correct sums
[1]. Thus, error analyses for single component malfunctions can be
simplified if + is used to combine the error patterns since only
2n patterns need be considered.
A parity check checks only the digitwise modulo addition in the
addition process. Specifically, it does not check the carry
generation process. Errors in carry generation will not be
detected. Any of the transmission type of error correcting codes
can be used, rather than the simple parity check. However, such
codes will still only obtain error correction for errors in
digitwise modulo addition [13]. The absence of a check on the
carry generation process plus the burst nature of the error
patterns tend to render the parity check useless. Garner [4]
compares the effectiveness of a two bit parity check against a
modulo three residue check. The modulo three check is shown to
detect all errors due to single component malfunction while the
two bit parity check detects at most 92% of these errors.
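The reason the modulo three check catches every single arithmetic error can be demonstrated numerically: a power of two is never divisible by three, so adding or subtracting one always changes the residue of the sum. The sketch below is illustrative only; the function names and operands are arbitrary, and the exponent is indexed from 0 to n-1, which is equivalent to the range of error magnitudes given in the text.

```python
def residue3(x):
    """Modulo three residue (non-negative, also for negative x)."""
    return x % 3


def checked_add(a, b, error=0):
    """Add a and b, inject an arithmetic error, run the modulo 3 check."""
    result = a + b + error
    expected = (residue3(a) + residue3(b)) % 3  # from the check digits
    return result, residue3(result) == expected


def all_single_errors_detected(n):
    """Check every error of the form +/- 2**j for j = 0 .. n-1."""
    a, b = 1234, 4321  # arbitrary operands
    for j in range(n):
        for error in (1 << j, -(1 << j)):
            _, ok = checked_add(a, b, error)
            if ok:  # the check passed despite the injected error
                return False
    return True
```

An error whose magnitude is a multiple of three, such as an injected error of 3, passes the check undetected; the guarantee covers only the single-malfunction error patterns described above.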
A separate code for a nonredundant number system N is $\gamma$,
the single-valued mapping defined by the set of ordered pairs
$(n, \gamma(n))$ such that each $n \in N$ occurs in one and only
one ordered pair. This mapping is indicated by
$\gamma: N \rightarrow R$. $\gamma(n) \in R$ is the check digit
for $n \in N$. The separate code is characterized by the absence
of any arithmetic interaction between N and R. Different
arithmetic is defined for N and R. A theorem due to Peterson [11]
states that every separate check code is a residue code or is
isomorphic to a residue code. Separate residue codes, for binary
numbers, have the undesirable requirement of sign correction if
twos complement coding is used. If ones complement coding is used,
then sign correction is not required.
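One way to see the difference between the two complement systems for a modulo three check is to compare the residue of the stored bit pattern of a negative number with the residue of the number itself. The sketch below assumes an even word length n, for which 2**n - 1 is divisible by three; the function names are hypothetical and the code is an illustration, not the analysis given in the cited references.

```python
def negation_needs_correction(x, n, twos_complement):
    """True if the residue of the stored pattern for -x differs
    from the residue of -x itself (modulo 3)."""
    modulus = (1 << n) if twos_complement else (1 << n) - 1
    pattern = (modulus - x) % modulus  # n-bit representation of -x
    return pattern % 3 != (-x) % 3


def ones_complement_add(a, b, n):
    """Add two n-bit patterns with end-around carry."""
    mask = (1 << n) - 1
    total = a + b
    while total > mask:
        total = (total & mask) + (total >> n)
    return total


def residue_consistent(a, b, n):
    """True if the modulo 3 check digits track the n-bit sum."""
    s = ones_complement_add(a, b, n)
    return s % 3 == (a % 3 + b % 3) % 3
```

With n = 8 the ones complement pattern of -x always has the same residue as -x, and the end-around carry leaves the residue of a sum unchanged, whereas every nonzero twos complement negative needs a correction to its check digit. With an odd word length such as n = 3 the end-around carry no longer preserves the residue.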
Nonseparate codes and transmission codes have the same basic
structure. Let Q be the set of all distinct n tuples over {0, 1}.
Then K, the nonseparate code, is a subset of Q. A single
arithmetic unit processes the coded elements of K. Thus check
arithmetic and operand arithmetic are not separated. The general
class of nonseparate codes includes the AN codes but is not
restricted to a diminished radix complement interpretation for K.
This is desirable since other complement interpretations of K can
be easily realized and these avoid complete end around carry
correction.
A nonseparate code is systematic if for each $n \in N$ there exists a unique $k \in K$ such that $n$ can be identified in $k$. Separate codes are trivially systematic. An example of a systematic code was given by Henderson [6, 7]. The code arithmetic was not discussed. Henderson indicated that the code should be a concatenation of $|n|_g$ and $n$. This description leads to the use of the term "additive" to describe this class of codes. The term additive is not appropriate since the Henderson code is a member of the general class of nonseparate codes. All codes in this class are multiplicatively generated. An important property of the systematic, separate codes is that multiplication requires no correction.

The smallest value of a check base for single-error detection for binary arithmetic is three. The separate code structure is preferred for a one's complement code if $n$, the number of bits, is even. If $n$ is odd, a separate code does not exist for $g = 3$ but a systematic, nonseparate code exists. For the two's complement code, the nonseparate systematic code structure exists for all code lengths if $g = 3$; and the nonseparate structure is preferred over the separate code structure since sign and multiplication correction are not required [3].

The basic difficulty relative to the application of error detecting or correcting codes of the separate or nonseparate type is the number of components or the time required to determine the check and effect the correction. The check or correction computation must be accomplished in about the same time required for addition. Some solutions to this problem can be obtained by using properly coded stored tables. The actual effectiveness of check realizations has received little attention. Major research efforts have been devoted to the structure of the codes. A preliminary study of the utility of a modulo three checker is considered in this paper.
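As an illustration of why a check base of three suffices for single-error detection, the following minimal sketch models a separate residue code in Python. It assumes nonnegative operands (so that no sign correction arises), and the names `encode`, `checked_add`, and `fault_bit` are hypothetical, introduced only for this example. A single faulty bit changes the sum by $\pm 2^i$, and no power of two is divisible by three, so the residues disagree.

```python
CHECK_BASE = 3  # smallest check base for single-error detection in binary arithmetic


def encode(n):
    """Return (operand, check digit) for a separate residue code (n >= 0 assumed)."""
    return n, n % CHECK_BASE


def checked_add(a, b, fault_bit=None):
    """Add two coded operands and compare the result against the check arithmetic.

    fault_bit optionally flips one bit of the sum to model a single malfunction.
    Returns (sum, check digit, ok).
    """
    (na, ra), (nb, rb) = a, b
    total = na + nb
    if fault_bit is not None:
        total ^= 1 << fault_bit          # single-bit error of magnitude 2**fault_bit
    check = (ra + rb) % CHECK_BASE       # check arithmetic, separate from operand arithmetic
    ok = (total % CHECK_BASE) == check
    return total, check, ok


if __name__ == "__main__":
    a, b = encode(1234), encode(4321)
    print(checked_add(a, b))                 # fault-free addition passes the check
    print(checked_add(a, b, fault_bit=5))    # a single flipped bit fails the check
```

The check digit is computed without reference to the operand adder, which is the separation between $N$ and $R$ described earlier.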
Bounds on Arithmetic 
We shall consider the following as basic time units relative to the computational period of a logical circuit:

    $\tau_f$ = minimum period between successive input pulses resolvable by a flip-flop
    $\tau_g$ = the delay associated with a single unit (i.e., a transistor in a logical gate)

Technology is such that $2\tau_g \le \tau_f \le 4\tau_g$ [9]. In the following discussion, let $k\tau_g = \tau_f$. $\tau_A$, the period of one addition operation for an accumulator type adder, can be defined as the period between the set of the input register and the set of the accumulator. For a conventional ripple carry adder $\tau_A = 2n\tau_g$. Thus

    $\tau_f < \tau_A \le 2n\tau_g = \frac{2n\tau_f}{k} = \tau_{AR}$    (1)
A faster adder is obtained with an "exclusive OR carry" [3, 14]. The upper bound is reduced to at least $n\tau_f / k$. In fact, $\tau_{A\oplus}$ should approach $\tau_f$. Lehman [10] has made a detailed comparative study of the cost and computation time for all known adder configurations. His results show the adder with exclusive OR carry generation to be as much as seven times faster than the conventional ripple carry adder. Specifically

    $\frac{\tau_{AR}}{7} \le \tau_{A\oplus} \le \frac{\tau_{AR}}{3}$    (2)

An adder with a modulo four exclusive OR carry generation is as much as fourteen times faster than the ripple carry adder.

    $\frac{\tau_{AR}}{14} \le \tau_{A\oplus 4} \le \frac{\tau_{AR}}{6}$    (3)

These substantial reductions in $\tau_A$ are associated with only fractional increases in hardware.
TABLE 1
Components per Stage for Different Carry Schemes (Lehman [10])
Rows: Ripple carry; Exclusive OR (Mod 2); Exclusive OR (Mod 4). Columns: Components; Total Semi-Conductor. [Table entries are not legible in the source.]
The count in Table 1 does not include the two flip-flops per stage in the input and output registers. Each flip-flop consists of eight semi-conductor devices. Thus the different adder configurations require approximately 24, 28, and 31 semi-conductors per stage. Thus, modest increases in the number of components (17% to 25%) yield an adder such that

    $\tau_f \le \tau_{A\oplus} \le \frac{2n\tau_f}{kq}$    (4)

where $3 \le q \le 14$ and $6 \le kq \le 56$. The upper bound is valid only if

    [condition not legible in the source]    (5)

The preceding discussion shows the existence of adder designs, requiring a reasonable number of components per stage, with an addition period,

    $\tau_A = \alpha\tau_f$,    (6)

where $\alpha$ is small, but $\alpha > 1$. Multiplication and division are obtained by a sequence of add-shift operations in the conventional parallel arithmetic unit. Even if the add is deleted for zero multiplier digits and multiplier recoding is employed, the multiplication period $\tau_M$ for n-bit operands has a lower bound given by $n\tau_f$. This bound can be lowered further only by using multiple stage shift logic in the accumulator register, which is costly. It should be possible to approach the lower bound using a carry store adder. A multiplier using multiplier coding, deletion of add for zero multiplier digits, and single stage shifting in conjunction with an adder using an exclusive OR carry circuit will have an average lower bound for the multiplication period equal to

    $\frac{2}{3}n\tau_f + \frac{n}{3}\alpha\tau_f = n\tau_f\left(\frac{2+\alpha}{3}\right)$    (7)

SRT division can be considered to have approximately the same lower limits as those given for the above multiplication periods.
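The bounds of this section can be tabulated numerically. The short sketch below evaluates equations (1), (4), and (7) with all times expressed in units of $\tau_f$; the function names and the sample values of n, k, q, and $\alpha$ are assumptions chosen for illustration and are not taken from the paper.

```python
def ripple_add_period(n, k, tau_f=1.0):
    """tau_AR = 2 n tau_g = 2 n tau_f / k, eq. (1), using k * tau_g = tau_f."""
    if not 2 <= k <= 4:
        raise ValueError("the text assumes 2 <= k <= 4")
    return 2 * n * tau_f / k


def xor_carry_add_bounds(n, k, q, tau_f=1.0):
    """(lower, upper) bounds of eq. (4): tau_f <= tau <= 2 n tau_f / (k q)."""
    return tau_f, 2 * n * tau_f / (k * q)


def mean_multiply_period(n, alpha, tau_f=1.0):
    """Average lower bound of eq. (7): n tau_f (2 + alpha) / 3."""
    return n * tau_f * (2 + alpha) / 3


if __name__ == "__main__":
    n = 32  # illustrative word length, not specified in the text
    print("ripple adder, k = 4:", ripple_add_period(n, 4))
    print("exclusive OR carry, k = 4, q = 7:", xor_carry_add_bounds(n, 4, 7))
    print("multiplication, alpha = 2:", mean_multiply_period(n, 2.0))
```

With $\alpha = 1$ the multiplication estimate collapses to the lower bound $n\tau_f$ quoted in the text.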
Properties of a Switched Mod 3 Adder

The simplest circuit for checking is a modulo 3 adder. The input of the modulo 3 adder is obtained from a switch which samples pairs of adjacent accumulator digits in sequence. Such a circuit requires two flip-flops. The total number of semi-conductor devices required for n stages is $x + 2n$ where $30 < x < 40$. This count includes the commutator part of the switch but assumes the pulses required for commutation are available. Let $\tau_{cs}$ be the time period required for a check of n stages. Then

    $\tau_{cs} \ge \frac{n}{2}\tau_f$    (8)

The checker should almost obtain the lower time bound. The performance of this checker relative to the different adders previously discussed is as follows. Assume that

    $\tau_{cs} = \frac{n}{2}\tau_f$.

Previously, we have shown

    $\tau_{AR} = \frac{2n}{k}\tau_f, \quad 2 \le k \le 4$

so

    $\frac{n}{2}\tau_f \le \tau_{AR} \le n\tau_f$.    (9)

Thus, under the most favorable circumstances, the switched checker will not add to the addition time of a ripple carry adder. For example, a 10 ns carry propagation time per stage requires flip-flops in the checker clocked at 50 Mc. It has been estimated that

    [estimate not legible in the source]

Thus the switched checker will require a substantial period of time over and above the addition period of an adder using an exclusive OR carry circuit. If the required check period can be overlapped with some period when the adder is idle, then the switched checker can be used. However, when this cannot be done, a parallel checker is required.
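The behavior of the switched checker can be sketched in software. The example below is a functional model only, not a description of the circuit: it assumes an even number of stages, and it assumes that the carry propagation time per stage of the ripple adder equals $2\tau_g$, which is the reading under which the 50 Mc figure in the text is reproduced. The function names are hypothetical.

```python
def switched_mod3(value, n):
    """Accumulate the residue mod 3 of an n-bit value by sampling adjacent bit pairs.

    Returns (residue, steps); one step per pair, so steps == n / 2.
    """
    if n % 2:
        raise ValueError("an even number of stages is assumed")
    residue, steps = 0, 0
    for i in range(0, n, 2):
        pair = (value >> i) & 0b11        # two adjacent accumulator digits
        residue = (residue + pair) % 3    # 4**j % 3 == 1, so the pair values simply add
        steps += 1
    return residue, steps


def max_flip_flop_period_ns(carry_ns_per_stage):
    """Largest tau_f (ns) with tau_cs = (n/2) tau_f <= tau_AR = n * carry_ns_per_stage."""
    return 2.0 * carry_ns_per_stage


if __name__ == "__main__":
    print(switched_mod3(0b101101, 6))                          # residue and step count
    print(1000.0 / max_flip_flop_period_ns(10.0), "Mc")        # clock rate for 10 ns per stage
```

The step count of n/2 corresponds to the lower bound of equation (8), one flip-flop period per sampled pair.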
A Parallel Modulo 3 Check Circuit

Several parallel designs are possible. The design presented here utilizes the same principles used to obtain fast carry propagation for the exclusive OR carry type of adder. The proposed parallel modulo three checker should require a check period no longer than the maximum carry propagation period of the modulo 4 exclusive OR carry circuits, and the hardware realization of both circuits will be subject to identical considerations or restrictions. Thus

    $\tau_{cp} = \tau_{A\oplus 4}$    (14)

Approximately $\frac{n}{2}(15)$ transistors are required to realize this parallel modulo three checker.

Basically the checker consists of a chain of $\frac{n}{2}$ switch units and switch control units as shown in Figure 1. Each switch unit is a 3P3T switch as shown in Figure 2. Each 3P3T switch requires nine transistors and each switch control logic unit requires six transistors. The checker requires a relatively short period for the check because all switches are set simultaneously shortly after the accumulator is set.
Figure 1
Block Diagram of Parallel Mod 3 Checker

Figure 2
Connections Used in the Parallel Modulo Three 3P3T Switch Checker
(each switch connects input lines 0, 1, 2 to output lines 0, 1, 2 and is actuated by the switch control logic)
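One way to picture the chain of 3P3T switches is as a sequence of rotations applied to three residue lines. The sketch below is an interpretation under stated assumptions, not a transcription of Figures 1 and 2: it assumes each switch control unit examines one pair of adjacent accumulator bits and selects a rotation equal to the pair value mod 3, and that a signal injected on line 0 emerges on the line whose index is the residue. The names `control`, `switch_3p3t`, and `parallel_mod3` are hypothetical.

```python
def control(pair):
    """Switch control logic (assumed): rotation chosen by a pair of accumulator bits."""
    return pair % 3      # a pair value of 3 is congruent to 0 mod 3


def switch_3p3t(lines, rotation):
    """Route three one-hot residue lines through a three-pole, three-throw switch."""
    return tuple(lines[(j - rotation) % 3] for j in range(3))


def parallel_mod3(value, n):
    """Residue mod 3 of an n-bit value via a chain of n/2 switches (n assumed even)."""
    # All switch settings depend only on the accumulator bits, so they can be
    # established simultaneously; the signal then passes through the chain.
    rotations = [control((value >> i) & 0b11) for i in range(0, n, 2)]
    lines = (1, 0, 0)    # signal injected on the residue-0 line
    for rotation in rotations:
        lines = switch_3p3t(lines, rotation)
    return lines.index(1)


if __name__ == "__main__":
    print(parallel_mod3(0b110101, 6), 0b110101 % 3)
```

In this model the check time is set by signal propagation through the chain rather than by n/2 clocked steps, which is the distinction the text draws between the parallel and the switched checker.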
The parallel checker should be capable of checking a sequence of additions without requiring excess time for checking. However, on the average, the period for multiplication with multiplier coding and suppressed addition for zero digits is

    $\tau_M = n\tau_f\left(\frac{2+\alpha}{3}\right)$.    (15)

The checker will require

    $\tau_{CM} = n\alpha\tau_f$    (16)

since a check should occur after each add-shift operation and after each shift operation in the multiplication process, unless reliability requirements permit checking after the final product. The characteristics of the non-separate, systematic class of residue check codes facilitate the direct checking of products.

Three alternatives are available relative to the check of multiplication using the parallel checker. (1) Check every step; an increase in the multiplication period occurs due to checking unless $\alpha = 1$. In this case there is no need to employ multiplier coding or suppressed addition for zero multiplier digits since

    $\tau_M = \tau_{CM} = n\tau_A$    (17)

(2) Check the final product with no check at any intermediate step. No excess time is required. (3) Check each add-shift, shift sequence during the next add-shift step. A shift operation requires a period approximately equal to $\alpha\tau_f$, the same as an addition check. It is expected that this scheme is optimum since the average number of shifts between add-shift operations is two and the shift operation is more reliable than addition since less time and fewer components are required.

A checker for both the high order and the low order part of the accumulator is required unless only rounded products are required.
Evaluation of the Checked Adder 
Assume each semi-conductor has a probability of failure $p = 1 - q$ in the time interval $\tau_f$. Successive time intervals are assumed independent. The probability of at least one failure in an unchecked arithmetic unit containing $r$ semi-conductors in $\tau_f$ is

    $P_1(F, \tau_f) = 1 - q^r \approx rp, \quad p \ll 1$    (18)

A checked arithmetic unit requires $r + m$ components. Let $P_2(F, \tau_f)$ denote the probability of at least one failure in the checked arithmetic unit in $\tau_f$; then

    $P_2(F, \tau_f) \approx (r + m)p$    (19)

The probability of at least one failure of a check circuit semi-conductor in $\tau_f$ is

    [equation not legible in the source]

Thus an upper bound on the fractional increase in errors due to the checker is given by

    [equation not legible in the source]

Using the component counts for the various adders and the parallel checker fixes $f(e)$ between 1/4 and 1/3. This upper bound is not realistic because the checker for the adder also checks other components in the computer; i.e., memory and data transmission between memory and the arithmetic unit.

The modulo three checker will correct all errors due to a single component malfunction in $\tau_f$ and 1/2 the errors due to two component failures in $\tau_f$. So, with checking, the probability of an undetected failure in $\tau_f$ is

    [equation not legible in the source]

where $t = r + m$. The ratio of the probability of an undetected failure in $\tau_f$ with and without checking is

    [equation not legible in the source]

Two complete adders with parallel checking and a comparator between the accumulators can be realized with $2t + 2n$ semi-conductors. All errors are detected unless there is a component malfunction, and the error is correctable except when an error not detectable by the parity checker occurs or when a detectable error occurs in both adders. Erroneous correction can occur because of the possibility of an uncheckable error in one adder coupled with a checkable error in the second adder. Evaluation of this configuration is in process.

(This research was partially supported by Air Force Contract AF 30(602)-3546.)
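The component counts quoted earlier allow a rough numerical check of the stated range for the fractional increase in errors. The sketch below assumes that the bound takes the form $m/r$ (checker components over adder components), which is an inference and not a formula legible in the source; it uses about 15/2 checker transistors per stage and the 24, 28, and 31 semi-conductors per stage given for the three adders. The function names are hypothetical.

```python
def p_at_least_one_failure(components, p):
    """Return (exact, approximate) probability of at least one failure.

    exact = 1 - q**components with q = 1 - p; approximate = components * p for small p.
    """
    q = 1.0 - p
    return 1.0 - q ** components, components * p


def fractional_increase(adder_per_stage, checker_per_stage=7.5):
    """Assumed form m / r of the upper bound on the fractional increase in errors."""
    return checker_per_stage / adder_per_stage


if __name__ == "__main__":
    for per_stage in (24, 28, 31):
        print(per_stage, round(fractional_increase(per_stage), 3))
    print(p_at_least_one_failure(1000, 1e-9))
```

Under this assumption the three adders give values of roughly 0.31, 0.27, and 0.24, close to the interval from 1/4 to 1/3 stated in the text.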
Bibliography 
[1] D. T. Brown, "Error detecting and correcting binary codes for arithmetic operations," IRE Trans. on Electronic Computers, vol. EC-9, pp. 333-337, September 1960.
[2] C. V. Freeman, "Statistical analysis of certain binary division techniques," Proc. IRE, vol. 49, no. 1, pp. 91-103, September 1958.
[3] H. L. Garner, "Error codes for arithmetic operations," IEEE Trans. on Electronic Computers, vol. EC-15, pp. 51-57, October 1966.
[4] H. L. Garner, "Generalized parity checking," IRE Trans. on Electronic Computers, vol. EC-7, pp. 207-213, September 1958.
[5] H. L. Garner, "Error checking and the structure of binary addition," Ph.D. dissertation, The University of Michigan, Ann Arbor, Michigan (1958).
[6] D. S. Henderson, "Logical designs for arithmetic units," Ph.D. dissertation, Harvard University, Cambridge, Massachusetts (1960).
[7] D. S. Henderson, "Residue class error checking codes," Proc. 16th National Meeting of the Assoc. for Computing Machinery, Los Angeles, California (1961).
[8] T. Kilburn, et al., "A parallel arithmetic unit using a saturated-transistor fast-carry circuit," Proc. IEE, vol. 107, pt. B, no. 36, pp. 573-584 (1960).
[9] M. Lehman, et al., "Serial arithmetic techniques," AFIPS Conf. Proc. of the Fall Joint Computer Conference, vol. 27, Part 1, pp. 715-725 (1965).
[10] M. Lehman, "A comparative study of propagation speed-up circuits in binary arithmetic units," IFIP Congress Proc., Munich, Germany, pp. [illegible in source] (1962).
[11] W. W. Peterson, "On checking an adder," IBM J. Res. and Dev., vol. 2, pp. 166-168, April 1958.
[12] J. E. Robertson, "A new class of digital division methods," IRE Trans. on Electronic Computers, vol. EC-7, pp. 218-222, September 1958.
[13] J. E. Robertson, "Error detection and correction in binary parallel digital computers," in Electronic Digital Computer, Internal Rept. 37, Digital Computer Lab., University of Illinois, Urbana, pp. 70-81 (1952).
[14] F. Salter, "High-speed transistorised adder for a digital computer," Trans. IRE, vol. EC-9, no. 4, pp. 461-464 (1960).
SYSTEM ORGANIZATION OF THE JPL SELF-TESTING
AND -REPAIRING COMPUTER AND ITS EXTENSION
TO A MULTIPROCESSOR CONFIGURATION
ALGIRDAS AVIŽIENIS

Dr. Avižienis received B.S., M.S., and Ph.D. degrees in Electrical Engineering from the University of Illinois, Urbana, in 1954, 1955, and 1960, respectively. In 1955-56 he was a research engineer at the Jet Propulsion Laboratory, Pasadena, California. During graduate studies at the Digital Computer Laboratory, University of Illinois, he was a Fellow in 1954-55, 1956-57 and 1957-58, and a research assistant in 1959-60, participating in the design of the ILLIAC II system. He also was a staff engineer at Barnes and Reinecke, Inc., Chicago, Illinois, in 1958-59.

In 1960 he rejoined the Jet Propulsion Laboratory as a Senior Engineer in computer systems research and initiated the JPL Self-Testing and Repairing (JPL-STAR) Guidance Computer research project. Since 1962 he has been an Assistant Professor of Engineering at the University of California, Los Angeles, conducting research in computer arithmetic and digital system design in connection with the Variable Structure Computer project. He has also remained associated with the Jet Propulsion Laboratory as Principal Investigator of the JPL-STAR Computer Research project. In the spring of 1966 he was a visiting professor at the National Computing Center of the National Polytechnic Institute of Mexico, Mexico City, where he assisted in the organization of a graduate program in computer sciences.

Dr. Avižienis is a member of Sigma Xi, Tau Beta Pi, Eta Kappa Nu, the ACM, and of the Technical Committee on Switching and Automata Theory of the IEEE Computer Group.

SYSTEM ORGANIZATION OF THE JPL SELF-TESTING AND -REPAIRING COMPUTER
AND ITS EXTENSION TO A MULTIPROCESSOR CONFIGURATION
By Algirdas Avižienis
NASA Jet Propulsion Laboratory, Pasadena, California
and
University of California, Los Angeles, California
SUMMARY 
The techniques for the application of protective redundancy in digital systems are reviewed and compared. The choice of a replacement system as protective redundancy for a spacecraft guidance and control computer is reached by consideration of its computing requirements. The system organization of an experimental replacement system is described, with emphasis on the method of concurrent fault diagnosis by means of arithmetical encoding. An extension of the system to a multiprocessor configuration is considered as a means to provide on-board data processing after arrival at a remote destination.
INTRODUCTION: RELIABILITY BY
MEANS OF PROTECTIVE REDUNDANCY

Reliable performance of digital systems is usually attained by the systematic application of two techniques. The first is the selection of highly reliable components and the use of proven methods for their interconnection and packaging. The second technique is an extensive verification of the design and of the programs, first by simulation and later by diagnostic and functional tests under expected environmental conditions. Despite these reliability assurance techniques, the system may still fail during use because of uncontrollable or undetected faults. These include undetected design errors, random failures of components or connections, and externally induced failures due to the environment (nuclear radiation, sparks, mechanical damage, etc.).

The effects of these faults can be controlled by the third reliability technique - the introduction of protective redundancy into the system. A computer system contains protective redundancy if the effects of component failures or program errors can be tolerated because of the use of additional components or programs, or the use of more time for the computational tasks. These additional components, programs, and time are not required by the system in order to execute the specified tasks as long as no failures or transient malfunctions occur. The techniques of protective redundancy may be divided into two major categories: massive (also called masking) redundancy and selective (also called stand-by) redundancy.

In the massive redundancy approach the effect of a faulty component, circuit, signal, subsystem, program, or system is masked instantaneously by permanently connected and concurrently operating replicas of the faulty element. The level at which replication occurs ranges from individual circuit components to entire systems. The principal techniques of massive redundancy are:

1. Replication of circuit components; e.g., "quadded" diodes, resistors, transistors; duplicated connections, etc. (Refs. 1, 2).
2. Replication of logic signals: use of multiple channels and voting elements, recursive nets, interwoven logic, variation-tolerant threshold element nets. (Refs. 3, 4, 5, 6, 7).
3. Adaptive logic elements, e.g., voters with variable-weight inputs. (Ref. 6)
4. Replication of entire systems with comparison and voting or diagnosis at system level.

In the selective redundancy approach the presence of a faulty element is detected by observing a symptom of the failure; subsequently the fault is made harmless by a corrective action. The principal techniques of selective redundancy are:

1. Error detection and correction using error-correcting circuits for coded words. (Refs. 8, 9)
2. Replacement of the faulty element or system by a stand-by spare (self-repair).
3. Reorganization of the system into a different computer configuration. (Multiprocessors and other "degradable" systems)

The last two methods presuppose the existence of a diagnosis procedure which will recognize the symptoms of a fault (Refs. 10, 11), and of a switch which implements the replacement or reconfiguration. (Refs. 12, 13). Error correction is attained by recomputation, possibly retracing several steps in the program to a "rollback" point.

APPLICATION OF PROTECTIVE REDUNDANCY
IN A SPACECRAFT GUIDANCE COMPUTER

The choice of a method or of a combination of methods from the preceding list for a particular computing system is influenced by the intended application. The present paper considers the application of protective redundancy to a guidance and control computer for an unmanned spacecraft which may also be employed for the on-board processing of scientific data when guidance computation is not in progress. The guidance computer is required to survive space voyages to other planets which range up to several years in length and to perform approach guidance and control computation at the end of the voyage. Continued control of the spacecraft after arrival may also be required. Course corrections are to be computed one or more times during the voyage; considerable time is available for this task. The computing at launch and in early stages of the voyage may be performed or supported by computers on the ground and in the launch vehicle. The extreme distance and the potential occultation make ground support less effective at approach to the planet; therefore the approach presents the most severe problems to the guidance and control computer.
The computer design must also be performed within the constraints of the available power, weight, and volume. The existence of these constraints indicates an advantage for selective redundancy, which does not necessarily require power for the spare replicas and which offers protection with the minimum of one spare for each operating element. On the other hand, the principal advantages offered by the massive approach are:

1. The corrective action is immediate and "wired-in"; it is delayed and requires switching in selective redundancy.
2. During operation there is no need for diagnosis, which is essential in selective redundancy.
3. All parts of the system are equally protected; unprotected "hard core" elements may exist only at interfaces with other systems. In selective redundancy schemes a "hard core" always exists in the system.
4. The conversion of a non-redundant design to a massively redundant one is relatively straightforward; more novel design techniques are demanded by the introduction of selective redundancy.

Compared to massive redundancy, the selective form requires several additional features: a system ability to tolerate interruptions for repair and to execute a "rollback" for error correction, sophisticated diagnosis methods, protection for the "hard core", and trade-off studies between time, program, and hardware replication. The advantages of selective redundancy over the massive form are, however, also very significant in our application:

1. Power is required by only one copy of each replaceable item in a replacement system; all copies require power in the massive form.
2. The replacement switch provides fault isolation between subsystems; such isolation is essential in the case of catastrophic failures. Massive redundancy usually assumes independent failures of logic elements; such independence requires isolation which is difficult to provide for integrated circuit packages which are batch-fabricated and contain many logic circuits in close proximity. The entire batch may possess the same defect; also, mechanical or thermal damage is likely to affect an entire package, rather than single logic circuits.
3. All spares can be utilized in selective redundancy; in the massive form a majority of faulty elements in a given region leads to system failure.
4. The designs of individual replaceable blocks may be altered, and the number of spares may be adjusted to a given mission without changes in the system design in the case of selective redundancy; such changes are more difficult in the massive case.
5. The replication in massive redundancy frequently leads to increased fan-out and fan-in requirements for logic elements, or to increased tolerance limits in circuit design; such problems are avoided in the selective case.
6. Permanent connection of the redundant elements makes the pre-mission check-out more difficult to implement in systems with massive redundancy; special circuits and system outputs are necessary.
7. Massively redundant systems with voting require synchronization of the separate channels at the voting elements; they also are susceptible to transient external influences (e.g., sparks) which alter logic signals in a majority of channels without leaving permanent damage. The delayed occurrence of diagnosis in the selective case allows detection of such transient changes in signals.
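The contrast drawn in the third item of the list above (all spares usable in the selective form, majority failure fatal in the massive form) can be made concrete with a small sketch. This is an illustrative model only: units are plain Python callables, and the `is_valid` argument is a hypothetical stand-in for whatever diagnosis procedure the system provides.

```python
from collections import Counter


def majority_vote(outputs):
    """Massive (masking) redundancy: the voter hides a minority of faulty channels."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority: the fault is not masked")
    return value


class StandbySystem:
    """Selective (stand-by) redundancy: one active unit, remaining copies held as spares."""

    def __init__(self, units):
        self.units = list(units)
        self.active = 0

    def run(self, x, is_valid):
        """Compute on the active unit; on a detected fault, switch in a spare and recompute."""
        while self.active < len(self.units):
            result = self.units[self.active](x)
            if is_valid(x, result):
                return result
            self.active += 1
        raise RuntimeError("spares exhausted")


if __name__ == "__main__":
    good = lambda x: x + 1
    bad = lambda x: 0                      # a permanently faulty unit
    check = lambda x, r: r == x + 1        # hypothetical diagnosis

    print(majority_vote([good(1), good(1), bad(1)]))    # one fault is masked
    print(majority_vote([bad(1), bad(1), good(1)]))     # two agreeing faults outvote the good unit
    print(StandbySystem([bad, bad, good]).run(1, check))  # the last spare still delivers
```

With the same three units, two faults defeat the voter while the replacement system continues to operate, at the price of requiring a trustworthy diagnosis.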
CHOICE OF REDUNDANCY TECHNIQUES
FOR THE JPL-STAR COMPUTER

Evaluation of the differences between the massive and selective approaches led to the choice of selective redundancy for the protection of an experimental prototype for a spacecraft guidance computer, which will be called the "JPL Self-Testing and -Repairing" (abbreviated JPL-STAR) computer in this paper. The requirement for approach guidance demands a certain computing capacity at the end of a long voyage, and there is no anticipated requirement for a higher capacity at an earlier time. Under these conditions, a replacement system possessing the required capacity is preferred over a reorganizable or "degradable" system which has a minimal configuration of the same capacity. The replacement system avoids the programs, switches, and control hardware which perform the reconfiguration and resulting rescheduling of programs.
The diagnosis, or self-test, is an essential function of a replacement system. The most common approach - periodic diagnosis - utilizes a diagnostic program which is stored in the memory. Computation is periodically interrupted and the diagnostic program is executed. Detection of a fault initiates the replacement procedure; the program is "rolled back" to a point preceding the previous (successful) diagnosis period. Errors which have been induced by transient fault conditions remain undetected. The cost of diagnosis consists of the storage used for the diagnostic program, of the time consumed by its execution, and of the time needed for repair and repeated execution of the program segment which was run after the last diagnosis. Such time costs are very severe in approach and re-entry guidance and control programs, which require real-time computing.
The alternate diagnosis method is concurrent diagnosis in which error-detecting codes are employed to show the presence of faults. The execution of every instruction is checked immediately; instead of a stored diagnostic program, the cost includes the logic circuits which perform the code check. Errors due to transient faults are detectable, and the immediate detection of a fault permits a very short rollback in the program. For these reasons concurrent diagnosis is preferable in the JPL-STAR computer.
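The short rollback that concurrent diagnosis makes possible can be sketched as follows. This is a toy model under stated assumptions: the program is a list of additions, the check is a residue comparison with check base 3 (chosen here only for illustration), and transient faults are injected as single-bit masks that affect one attempt each. None of the names below come from the JPL-STAR design.

```python
CHECK = 3  # illustrative check base; the text does not fix a value at this point


def add_unit(a, b, fault_mask=0):
    """Adder model; a nonzero fault_mask flips bits of the result (transient fault)."""
    return (a + b) ^ fault_mask


def run_concurrent(program, transient_faults):
    """Execute additions with an immediate code check after every instruction.

    program: list of (a, b) pairs.
    transient_faults: dict mapping step index to a list of fault masks, one per attempt.
    Returns (results, retries). A failed check repeats only the current instruction.
    """
    results, retries = [], 0
    for step, (a, b) in enumerate(program):
        pending = list(transient_faults.get(step, []))
        while True:
            mask = pending.pop(0) if pending else 0
            total = add_unit(a, b, mask)
            if total % CHECK == (a % CHECK + b % CHECK) % CHECK:
                break
            retries += 1               # rollback spans a single instruction
        results.append(total)
    return results, retries


if __name__ == "__main__":
    program = [(1, 2), (10, 20), (7, 8)]
    print(run_concurrent(program, {1: [0b100]}))   # one transient fault during step 1
```

In a periodic scheme the same transient would either pass unnoticed or force re-execution of the whole segment since the last diagnosis; here only the affected instruction is repeated.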
The simplest and most costly code (100% redundancy) is the complete duplication of program and data words. Errors are indicated by the disagreement of two words; diagnosis is needed to pinpoint the faulty source. Parity and many classes of more complex codes which detect errors in the transmission of digital data have much lower redundancy, but are not suitable for the checking of arithmetic operations. In order to have a uniform code for the entire system, arithmetical error detecting codes were selected as a means of diagnosis for the JPL-STAR system. An extensive theoretical investigation of the effectiveness, cost, and applicability of arithmetic codes was conducted prior to the system design of the JPL-STAR computer (Refs. 14, 15). The results showed the existence of a class of low-cost codes with sufficient effectiveness of error detection.
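The contrast between parity and arithmetic codes drawn above can be illustrated with a minimal sketch. The function names are illustrative assumptions, not part of the JPL-STAR design; the sketch only shows that a modulo 15 residue is carried through addition, while a parity bit is not.

```python
def parity(x: int) -> int:
    """Parity bit (number of ones modulo 2) of a non-negative integer."""
    return bin(x).count("1") % 2


def residue15(x: int) -> int:
    """Modulo 15 residue of a non-negative integer."""
    return x % 15


def parity_predicts_sum(a: int, b: int) -> bool:
    """True if the parity of a + b can be obtained by combining the parities of a and b."""
    return (parity(a) ^ parity(b)) == parity(a + b)


def residue_predicts_sum(a: int, b: int) -> bool:
    """True if the residue of a + b equals the modulo 15 sum of the operand residues."""
    return (residue15(a) + residue15(b)) % 15 == residue15(a + b)
```

For example, `parity_predicts_sum(1, 1)` is false because the carry changes the number of ones, whereas `residue_predicts_sum` holds for every pair of operands.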
SYSTEM DESIGN OF THE JPL-STAR COMPUTER 
The JPL-STAR computer is a replacement system, which is intended to serve as a prototype for spacecraft guidance computers in very long missions of several years duration. The system consists of several autonomous functional units, including:
1. an arithmetic processor;
2. an index arithmetic processor;
3. a read-only memory;
4. a read-write memory;
5. input/output buffer registers.
The functional units are interconnected and controlled by the central control unit (CCU). One or more replicas of each operating functional unit are included in the system as standby replacements. Replacement of a functional unit is initiated by the CCU and implemented by a replacement switch, which selects the spares in cyclic order. To facilitate checkout, the switch returns to the original operating unit when the spares are exhausted. In order to reduce the size of the switch, all words (instructions and numeric data) are transmitted between the functional units in bytes of four binary digits each.
Word Formats 
The arithmetic coding which is most effective in the case of transmission and computing by four-bit bytes employs the check constant 15 (Ref. 15). Any single determinate fault (logic value "stuck on zero", or "stuck on one") will be detected for word lengths up to 14 bytes (56 bits), even if every byte is separately affected by the fault. Binary numerical operands X (28 bits long) are encoded in the product Z = 15X, yielding code words 32 bits long. The checking algorithm computes the modulo 15 residues (designated as |Z|_15) of operands and results which are transmitted between the functional units. Error detection is implemented by the checker: a four-bit adder with an end-around carry which sums the bytes of the word being transmitted to obtain the modulo 15 residue. The checkers are located in the CCU; their operation is verified by complete duplication. A non-zero residue is the symptom of a fault in the unit which delivered the operand.
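The product coding and the checker described above can be sketched as follows. This is a minimal software model under stated assumptions, not the JPL-STAR logic design: the function names and the `stuck_at_one` fault model (one bus line stuck on one in every byte) are illustrative.

```python
CHECK_CONSTANT = 15  # check constant named in the text


def encode(x: int) -> int:
    """Product-code a 28-bit operand X as the 32-bit word Z = 15X."""
    if not 0 <= x < (1 << 28):
        raise ValueError("operand must fit in 28 bits")
    return CHECK_CONSTANT * x


def to_nibbles(word: int, count: int = 8) -> list[int]:
    """Split a word into four-bit bytes, least significant first."""
    return [(word >> (4 * i)) & 0xF for i in range(count)]


def checker(nibbles: list[int]) -> int:
    """Sum four-bit bytes in a four-bit adder with end-around carry."""
    acc = 0
    for nib in nibbles:
        acc += nib
        if acc > 0xF:              # carry out of the four-bit adder
            acc = (acc & 0xF) + 1  # feed the carry back into the lowest bit
    return acc


def residue_is_zero(check_result: int) -> bool:
    """A zero residue appears as 1111 (or as 0000 for an all-zero word)."""
    return check_result in (0b0000, 0b1111)


def stuck_at_one(nibbles: list[int], bit: int) -> list[int]:
    """Assumed fault model: one of the four bus lines is stuck on one."""
    return [nib | (1 << bit) for nib in nibbles]
```

Because 16 leaves the remainder 1 when divided by 15, summing the four-bit bytes with end-around carry gives the modulo 15 residue of the whole word; a correctly coded non-zero word therefore produces 1111, and a stuck line that alters between one and eight bytes produces some other value.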
Instructions of the JPL-STAR computer consist of two four-bit operation codes and a 24-bit address part. The address part is also subject to arithmetic operations (addition and subtraction) during indexing and during incrementing of the address. In the selection of a memory location the address is usually divided into two or three segments, which serve as inputs to selection tree networks. Product coded numbers cannot be separated in this fashion into properly coded segments; therefore residue coding with the check constant 15 is used for the address part. In the residue coding, the 20-bit binary address A carries along a 4-bit check symbol c(A), which is the 15's complement of the modulo 15 residue of the address A:
c(A) = 15 - |A|_15
Passing both A and c(A) through the checker should yield the check result 1111, which represents the zero residue of a product-coded operand. It is very important to note that the residue |A|_15 as the check symbol will not give the same error-detecting effectiveness as the product code 15X in the case of four-bit bytes, while 15 - |A|_15 offers the same effectiveness as the coding 15X.
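The residue coding of the address part can be sketched in the same way. The helper names below are illustrative assumptions; the checker is the four-bit end-around-carry adder described for product-coded operands.

```python
def check_symbol(address: int) -> int:
    """c(A) = 15 - |A|_15 for a 20-bit address A; the result lies in 1..15."""
    if not 0 <= address < (1 << 20):
        raise ValueError("address must fit in 20 bits")
    return 15 - (address % 15)


def address_nibbles(address: int) -> list[int]:
    """Split a 20-bit address into five four-bit bytes, least significant first."""
    return [(address >> (4 * i)) & 0xF for i in range(5)]


def checker(nibbles: list[int]) -> int:
    """Four-bit adder with end-around carry, as used for operands."""
    acc = 0
    for nib in nibbles:
        acc += nib
        if acc > 0xF:
            acc = (acc & 0xF) + 1
    return acc


def coded_address_ok(address: int, symbol: int) -> bool:
    """True when the address bytes and the check symbol together sum to 1111."""
    return checker(address_nibbles(address) + [symbol]) == 0b1111
```

Since c(A) is never zero, a correctly coded address always yields 1111 and never 0000, so the same test serves for addresses and for non-zero product-coded operands.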
Two operation codes of four bits each are used in order to have maximal autonomy of the function units. The first code is the control code, which remains in the CCU and serves to define the path for the second code, the function code, which is delivered to a function unit designated by the control code. The operation codes are protected by a two-out-of-four encoding, which leaves six valid words in a four-bit code. Such coding is most efficient for short words and is acceptable because operation codes are not subjected to arithmetic operations. It is evident that their validity test is made by a separate circuit, since it cannot be verified by the checker (which is bypassed by the op. codes).
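The two-out-of-four encoding of the operation codes can be illustrated with a short sketch. The names are illustrative; the text does not say which of the six valid words is assigned to which operation, so no assignment is shown.

```python
from itertools import combinations

# All four-bit words with exactly two ones: the six valid code words.
VALID_CODES = sorted(sum(1 << bit for bit in pair)
                     for pair in combinations(range(4), 2))


def is_valid_code(nibble: int) -> bool:
    """Validity test performed by the separate check network: exactly two ones."""
    return bin(nibble & 0xF).count("1") == 2
```

Any fault that changes a single bit of a valid code word leaves one or three ones, so the validity test rejects it.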
Arithmetic Processors
The arithmetic processor (MAP) of the JPL-STAR system accepts six function codes: Clear Add, Add, Subtract, Multiply, Divide, and No Operation (Ref. 16). The operands and results are 32-bit product-coded binary numbers. All arithmetic control is contained in the MAP; an input consists of a function code followed by a coded operand, and the output is a coded result followed by a non-numerical 2-out-of-4 code byte, indicating either one of three singularities (sum overflow, quotient overflow, zero divisor) or the type of a good result (positive, zero, negative). The good result codes are stored in the central control unit (CCU) and are used as data for conditional jump instructions. All partial and final results are delivered to the CCU checker and also stored in a Duplicate Accumulator register in the scratchpad (read-write) memory. A Store instruction is therefore not needed for the MAP. There are four data input lines, four data output lines, and four control lines between the MAP and the CCU. The control lines are a clock input and three outputs: "perform check", "end of algorithm" and "internal fault". The "end of algorithm" serves as a work request; the "internal fault" is obtained from internal monitor circuits which detect catastrophic failures and internal control faults. A breadboard model of the MAP has been constructed and is undergoing functional tests. Residue coding is also applicable to arithmetical operands. An alternate MAP design for residue coded operands is being prepared for a comparison to the present design.
The index arithmetic processor (IAP) contains the Index Register (IR), the Sequence Register (SR) and an adder. When the 20-bit index word B from the IR is added to an address A, its 4-bit check symbol c(B) is added modulo 15 to c(A). The indexed address and the new check symbol go through a checker to the input lines of the appropriate memory unit. The incrementing (by one) of the current address in SR is performed in exactly the same manner, with 1 being added to A and c(1) = 14 being added modulo 15 to c(A). The incremented address returns to SR through a checker. The input and output lines of the IAP are similar to those of the MAP.
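The indexing and incrementing rule above can be checked with a small sketch. It assumes that the sum A + B still fits in 20 bits (the text does not describe the handling of a carry out of the address adder), and it models the modulo 15 addition of check symbols with an end-around carry so that the zero residue stays in the form 1111; the function names are illustrative.

```python
def check_symbol(address: int) -> int:
    """c(A) = 15 - |A|_15, a value in 1..15."""
    return 15 - (address % 15)


def add_check_symbols(ca: int, cb: int) -> int:
    """Add two check symbols modulo 15 using a four-bit end-around-carry adder."""
    total = ca + cb
    if total > 0xF:
        total = (total & 0xF) + 1
    return total


def index_address(a: int, b: int) -> tuple[int, int]:
    """Return the indexed address A + B and its check symbol, derived from c(A) and c(B) only."""
    return a + b, add_check_symbols(check_symbol(a), check_symbol(b))


def increment_address(a: int) -> tuple[int, int]:
    """Incrementing is indexing with B = 1, whose check symbol is c(1) = 14."""
    return a + 1, add_check_symbols(check_symbol(a), 14)
```

For instance, the address 5 carries the check symbol 10; adding c(1) = 14 gives 24, which the end-around carry reduces to 9, the check symbol of the address 6.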
Storage
The read-only memory (ROM) contains the programs and the associated constants for the given mission. The address part allows direct addressing of 2^20 locations; the experimental model provides 2^[exponent illegible] words of 32 bits each, using an assembly of magnetic cores and wires for the permanent storage of binary information. The ROM also contains all necessary peripheral electronics: the address register, access circuits, drivers, sequence control, and the output register. Proper operation is verified by the monitoring of driver currents and by the independent readout of a four-bit check symbol of the address designating the location accessed, which is compared to the check symbol in the address register. This comparison verifies that the storage cell which was specified has actually been accessed. All output words from the ROM are delivered byte-by-byte through the appropriate checker. The present model of the JPL-STAR computer includes complete replicas of the ROM as replacements; the replacement of peripheral electronics without discarding the core and wire assembly is now being explored. Integrated-circuit ROM's which are presently being developed by several manufacturers promise a considerable reduction in the size and weight of the ROM. The cost of replacement of entire ROM's will be decreased by such miniaturization.
The read-write memory (RWM) and the buffer registers are protected by complete duplication. The RWM provides storage for various intermediate results and inputs; it consists of replaceable core memory modules which contain all peripheral electronics. In case of a permanent fault in one member of a pair, the contents of the good module are copied into a replacement and the faulty module is disconnected. The size of modules and their number is controlled by the total RWM requirements of a mission; the prototype will use a single pair of 128 word modules. It is expected that large-scale integration will provide replaceable RWM modules on one or a few chips. The internal fault monitoring of the RWM modules is similar to the method used in the ROM. It is to be noted that because of complete duplication the copying and replacement may be postponed until a critical computation is completed.
Central Control
The central control unit (CCU) contains the hard core of the replacement system. It serves as the bus connecting all functional units and performs the functions of synchronization (clocking), transferring information, executing the checking algorithm, and implementing replacement. The transfer of coded words between the functional units occurs on a one-byte (four-bit) bus; it is controlled by the control code of the current instruction. All bytes entering the CCU are directed to a checker. The two-out-of-four code bytes are checked individually by a check network. The bytes of an operand, a result, or an address are summed modulo 15 in the checker and the residue is tested for zero value (represented by four ones) when the transmission is completed. The CCU receives "internal fault" signals from monitoring circuits inside the replaceable functional units, as well as from its own checkers. Two checkers are employed in the present JPL-STAR system configuration: one for outputs of memory units, and one for outputs of the processors.
In the case of a fault signal, the CCU interrupts the current program and executes an emergency sequence. First, the current instruction is repeated in order to correct a transient error; if the fault persists, the replacement switch is advanced. After replacement the program is "rolled back", i.e., resumed at a designated instruction. The address of this instruction is stored in a special CCU register; its updating is a function of the program.
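The emergency sequence can be summarized in a short control-flow sketch. Everything named here (`Unit`, `step`, the `execute` callback returning True when no fault signal is raised) is a hypothetical software model; the actual CCU performs these actions in hardware.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Unit:
    """A functional unit with standby spares behind a cyclic replacement switch."""
    name: str
    spares: int
    active: int = 0  # 0 is the original operating unit

    def advance_switch(self) -> None:
        """Select the next spare; return to the original unit when spares are exhausted."""
        self.active = (self.active + 1) % (self.spares + 1)


def step(execute: Callable[[int, int], bool], unit: Unit,
         pc: int, rollback_pc: int) -> int:
    """Run one instruction and return the address of the next one to execute."""
    if execute(pc, unit.active):   # no fault signal
        return pc + 1
    if execute(pc, unit.active):   # repeat once: a transient error is corrected
        return pc + 1
    unit.advance_switch()          # fault persists: switch in the next spare
    return rollback_pc             # resume at the designated instruction
```

The rollback address is passed in as a parameter because, as stated above, keeping it up to date is a function of the program rather than of the CCU.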
The CCU itself is vulnerable to faults and requires protection. Massive logic or component redundancy (voting, quadding, etc.), complete operating duplication (Ref. 17), periodic self-diagnosis (Ref. 18), and external monitoring are all applicable to this task. Studies are presently being conducted to determine optimal or near-optimal balances of these methods in the CCU. Operational duplication of the checkers and similar functional parts of the CCU permits their replacement and is presently considered as the preferred method of reducing the extent of the hard core.
A highly reliable replacement switch which also provides adequate isolation in the case of catastrophic failures is an essential part of the CCU. A design study which considers magnetic and semiconductor implementations of the switch is in progress. Performance of the switch will be extensively tested under expected environmental conditions.
AN EXTENSION TO MULTIPROCESSING 
It was observed in the preceding discussion that the most severe tasks for the spacecraft guidance and control computer occur during approach and re-entry to a planet after a long period of comparative idleness. As a consequence, there is no apparent need to utilize the spares for an extension of computing power by multiprocessing during the earlier phases of the mission, although the spares are available. In general, it is expected that a thorough application of conventional reliability practices will yield a design which is already highly reliable (the longevity of the Mariner IV spacecraft serves as an illustration of this point). The purpose of the replacement system is to provide insurance against overlooked design weaknesses, human errors in production and checkout, and externally induced faults; all of these failures may be catastrophic with respect to an entire functional unit of the system and require the isolation provided by replacement.
Under normal conditions such faults will be avoided, and the replacement system will reach the destination with all or most of its spares still intact. After the execution of the approach and re-entry, the functions of the guidance computer are largely completed; however, there remains a large computational task of processing the scientific data which are acquired during and after the arrival. At this point all surviving spares of the replacement system can be utilized in the new task of on-board data processing, and a multiprocessor configuration becomes desirable.
The conversion of the ordinary replacement system to a multiprocessor requires several additional features. A considerably more complicated bus and switching arrangement is needed to accommodate parallel operation and reconfiguration in case of fault detection. The number of checkers is increased for parallel diagnosis. A more elaborate control unit is needed for the scheduling and coordination of the parallel events. A set of new programs (a new ROM) is also to be provided by the conversion. Design studies of the conversion problem have been initiated with the objective of holding the additional system elements inactive and isolated until the conversion is commanded by the guidance program. Such isolation minimizes the possibility of early system failure in the switch and CCU caused by the additional multiprocessing hardware. After a complete functional checkout the conversion features will be incorporated into the JPL-STAR computer experimental model, which is presently being constructed.
ACKNOWLEDGMENT 
The research described in this paper has been carried out at the Jet Propulsion Laboratory, Pasadena, California, under Contract NAS7-100, sponsored by the National Aeronautics and Space Administration. The author wishes to acknowledge the full support and encouragement of W. F. Scott and discussions with J. J. Wedel and G. R. Hansen. The logic design of the main arithmetic processor was performed by A. D. Weeks and D. A. Rennels, and the construction was carried out by J. Buchok, all of the Flight Computers and Sequencers Section, Guidance and Control Division, JPL.
REFERENCES 
1. Creveling, C. J.: Increasing the Reliability of Electronic Equipment by the Use of Redundant Circuits. Proceedings of the IRE, vol. 44, pp. 509-515, April 1956.
2. Lewis, T. B.: Primary Processor and Data Storage Equipment for the Orbiting Astronomical Observatory. IEEE Transactions on Electronic Computers, vol. EC-12, No. 5, pp. 677-686, December 1963.
3. Dickinson, M. M., Jackson, J. B., and Randa, G. C.: Saturn V Launch Vehicle Digital Computer and Data Adapter. AFIPS Conference Proceedings, vol. 26, (1964 FJCC), pp. 501-516.
4. Tryon, J. G.: Quadded Logic. Redundancy Techniques for Computing Systems, pp. 205-228, Spartan Press, Inc., Washington, D.C., 1962.
5. Pierce, W. H.: Interwoven Redundant Logic. Journal of the Franklin Institute, vol. 277, No. 1, pp. 55-85, January 1964.
6. Pierce, W. H.: Failure-Tolerant Computer Design. Academic Press, Inc., New York, 1965.
7. Winograd, S., and Cowan, J. D.: Reliable Computation in the Presence of Noise. The M.I.T. Press, Cambridge, Mass., 1963.
8. Peterson, W. W.: Error Correcting Codes. The M.I.T. Press and John Wiley & Sons, Inc., New York, 1961.
9. Kautz, W. H.: Codes and Coding Circuitry for Automatic Error Correction Within Digital Systems. Redundancy Techniques for Computing Systems, pp. 152-195, Spartan Press, Inc., Washington, D.C., 1962.
10. Seshu, S., and Freeman, D. N.: The Diagnosis of Asynchronous Sequential Switching Systems. IRE Transactions on Electronic Computers, vol. EC-11, No. 4, pp. 459-465, August 1962.
11. Roth, J. P.: Diagnosis of Automata Failures: A Calculus and a Method. IBM Journal of Research and Development, vol. 10, No. 4, pp. 278-291, July 1966.
12. Griesmer, J. E., Miller, R. E., and Roth, J. P.: The Design of Digital Circuits to Eliminate Catastrophic Failures. Redundancy Techniques for Computing Systems, pp. 328-348, Spartan Press, Inc., Washington, D.C., 1962.
13. Avizienis, A.: Coding of Information for a Guidance Computer with Active Redundancy. JPL Space Programs Summary No. 37-22, pp. 9-12, 1963.
14. Avizienis, A.: A Set of Algorithms for a Diagnosable Arithmetic Unit. JPL Technical Report No. 32-546, March 1, 1964.
15. Avizienis, A.: A Study of the Effectiveness of Fault-Detecting Codes for Binary Arithmetic. JPL Technical Report No. 32-711, September 1, 1965.
16. Avizienis, A.: The Diagnosable Arithmetic Processor. JPL Space Programs Summary No. 37-37, Vol. IV, pp. 76-80, 1966.
17. Downing, R. W., Nowak, J. S., and Tuomenoksa, L. S.: No. 1 ESS Maintenance Plan. The Bell System Technical Journal, vol. 43, No. 5, part 1, pp. 1961-2019, September 1964.
18. Forbes, R. E., Rutherford, D. H., Stieglitz, C. B., and Tung, L. H.: A Self-Diagnosable Computer. AFIPS Conference Proceedings, vol. 27, part 1, (1965 Fall JCC), pp. 1073-1086.
COMPUTER AIDS 
COMPUTER DESIGN ASSISTANCE FOR THE EVOLVING
LARGE SCALE INTEGRATED CIRCUIT TECHNOLOGY
JOHN S. MERRITT 
Mr. Merritt is a Development Engineer, Computer Systems, Honeywell Corporation. He received a B. S. degree in Electronic Engineering from Rutgers University in 1958, and studied Programming and Theory of Automatic Computation at U. C. L. A.
At Honeywell, Mr. Merritt was responsible for the development of computer design aids using existing computers. Since December 1965, he has been engaged in advanced computer development for an aerospace multi-processor computer system.
Mr. Merritt joined Aero-Florida in 1962 as an electrical engineer, assigned to airborne computer programming for the SAINT project. He was responsible for work on inertial navigation, platform calibration, and assembly integration testing. Later he was involved in simulation work on the H-800 ground computer. Since September 1963 he has programmed definition compiler studies and airborne computer design.
Mr. Merritt was employed as an electrical engineer for Remington Rand (1958-1961), working on such projects as Titan and Nike-Zeus, and was involved in the application of logical design techniques to the design of the logical circuitry between the paper tape reader and the magnetic drum.
Mr. Merritt is a member of Pi Mu Epsilon (Mathematics Honor Society) and Delta Phi Alpha (German Honor Society). His professional writings include "The Analog Computer," Rutgers Engineer, March 1957.
COMPUTER DESIGN ASSISTANCE FOR THE 
EVOLVING LARGE SCALE INTEGRATED 
CIRCUIT TECHNOLOGY 
By John S. Merritt
Electrical Engineer
Advanced Computer Development
Honeywell Inc., Aeronautical Division
St. Petersburg, Florida
SUMMARY 
The design of a multiprocessor computer system will be developed with the aid of an existing computer. A tool box of programs is presented which have in common the fact that they all work on the same reel of magnetic tape. This tape contains files each of which specifies the design of a particular unit. The operational steps in execution of selected program tools for a particular equipment design fall into four general categories: 1) Formulation of logic, 2) Simulation of operation, 3) Placement of components, 4) Preparation of wire-run lists.
Each operational step may be thought of as a different shelf in a tool box. On each shelf are several program tools which may be selected by the designer. Tools presently available and planned are described. The objective of these tools is to provide a cooperative man-machine interactive design system which frees the designer from routine bookkeeping tasks so that he may devote more of his time to the actual system design. The elimination of breadboarding and manual layout techniques not only reduces to a few hours of computer time the many man-years this takes by hand but, with the advent of Large Scale Integrated Circuit Technology, provides the most practical method of implementation.
INTRODUCTION 
The design of a multiprocessor computer system will be developed with the aid of an existing computer. A tool box of programs has in common the design data to be processed. The evolving design is kept on a reel of magnetic tape. This tape contains files each of which specifies the design of a particular unit. The operational steps in execution of selected program tools for a particular equipment design fall into four general categories.
Phase I - Formulation of Logic
Phase II - Simulation of Operation
Phase III - Placement of Components
Phase IV - Preparation of Wire-Run Lists
Each phase may be thought of as a different shelf in a tool box. On each shelf are several tools which may be selected by the designer. Different tools are independent of one another, thus allowing them to be modified or added to without affecting the others. Different types of tools on a shelf are used by the designer to shape the work in the design files. The sequence and use of these tools is determined by the designer according to how the work is developing in the file. Different versions of the same type of tool may be available for processing different integrated circuit building blocks. After each tool has obtained from the file that portion of the design data it works on, its function is performed, and the updated work is returned to the file. Use of tools on subsequent shelves depends upon data processed by tools on the first shelf or shelves.
However, the same tool can be reused to rework the data after the tool has once been used and data in the file has once reached its level of update. Several tools are never seen by the designer but are used by the tool designer (programmer) in bootstrap development of the program tools and to service and run them. At the same time, the designer uses the tools he requires in bootstrap development of the design data.
Designs which use the same technology use the same program set, but new technology designs require program modification. This is done either by making a copy (new version) of an existing program with modifications or by adding a new special purpose program. Thus, the system is expanded without affecting what has already been accomplished.
The program tools now being used for integrated circuit technology are being converted to machine independent programs so that they may run on any computer having the proper capability. The tool box is being expanded to include additional tools for use in processing various large scale integrated circuit approaches.
Following is a description of each computer program of each phase. All programs are run under control of the engineer design group using them. Computer runs are supervised by the design executive program (see "Executive" under Service Programs). One computer run may execute any useful combination of programs. More than one unit design may also be processed on one run.
PHASE I - FORMULATION OF LOGIC 
A design in the form of equations specifying both logic and interconnections must be loaded into a file on the magnetic tape. Modifications of the file are made until the logic design has been formulated. Load lists are prepared. Various sections of the design are merged, substituting signal symbolic names where necessary to maintain consistency.
The logic design is manually partitioned into tentative sets of logic for LSI building blocks. Composite multi-function truth tables are generated. An automatic commonality analysis is performed with the objective of minimizing the total number of minterms for all functions in the set. The resulting new equation set is then reduced one equation at a time by Boolean simplification techniques. Full use is made of don't care conditions in both the commonality analysis and subsequent Boolean simplification. Checks are made wherever possible in all program applications. Errors are listed but do not stop the design process. Defective data is returned to the file in its original form and is not updated.
Load Logic Equations
This is a basic building block of the computer design
assistance system. It loads equations into a new file via
punched cards or corrects an existing file held on a previous
magnetic tape. A control card specifies input and output
options and the name of the design file. Corrections, whether
changes, additions or deletions, automatically modify all
affected data in the file.
Single valued functions are represented by single equations
where the subject symbol represents the output signal and the
input symbols represent the input signals. Logic operator
connectives and function name indicate the logic function to
be performed between inputs and output. Multi-valued
functions are represented by sets of single equations. All
the subject symbols of a set are identified by a set-name
much as a programmer names a subroutine. Characteristics of
the set, such as load lists, are identified. Equations which
only feed other equations within the set will not appear as
outputs from the set and such equations' load lists will be
internal to the set. Only those load lists external to the
set will be identified as set loads. Set loads are separated
by the subjects of set output signals.
A load list is generated for a particular equation by
obtaining the subjects of all other equations which have the
particular equation's subject as an input. This process is
left until the entire file has been updated with additions,
changes and deletions. Changes replace an equation's inputs
but not its subject, and deletions not only completely remove
an equation from the file, but also remove the equation's
subject symbol wherever it may appear as an input in other
equations. After updating the file, load lists are
regenerated for the entire file. Those inputs which are not
represented in the file by equations cause extensions to be
generated, which are equations with subject and loads, but no
inputs.
Output options include complete listings of equations
including loads or any portion thereof, load lists, boundary
items (such as extensions) listing signals entering and
leaving the design, listings of equations without loads, and
punched cards of equations suitable for reloading into a file
of a new unit which may be a redesigned version of the
existing file.
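The load list and extension rules described above can be illustrated with a short sketch. It is written in Python purely for illustration (the original programs were not); the dictionary layout of the design file and the names generate_load_lists and delete_equation are assumptions, not part of the system described.

```python
def generate_load_lists(equations):
    """equations maps each subject symbol to its list of input symbols
    (hypothetical in-memory stand-in for the design file).
    Returns (loads, extensions)."""
    loads = {subject: [] for subject in equations}
    extensions = {}
    for subject, inputs in equations.items():
        for symbol in inputs:
            if symbol in equations:
                target = loads[symbol]
            else:
                # Input with no equation in the file: an extension,
                # which has a subject and loads but no inputs.
                target = extensions.setdefault(symbol, [])
            if subject not in target:
                target.append(subject)
    return loads, extensions


def delete_equation(equations, subject):
    """Remove an equation and every use of its subject as an input."""
    del equations[subject]
    for inputs in equations.values():
        while subject in inputs:
            inputs.remove(subject)


equations = {"A": ["X", "Y"], "B": ["A", "Y"], "C": ["A", "B"]}
loads, extensions = generate_load_lists(equations)
```

Here A feeds B and C, so its load list is B, C, while X and Y have no equations of their own and become extensions. As in the text, load lists would be regenerated for the whole file only after all changes and deletions have been applied.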
Generate Truth Tables
After the logic designer has loaded equations, he identifies
multiple equation sets which he determines may form a useful
LSI building block function. After this manual partitioning,
composite, multi-function truth tables are generated for each
set. This is done by a logic simulation program which
simulates each set for all combinations of inputs which the
logic designer indicates. Input and output binary word pairs
are thus generated for each set regardless of the logic
contained within the set (see Simulate Logic Equations). All
serial logic (that is, equations feeding other equations)
within the set will have been transformed into parallel logic
because only input and output values will be given in the
composite truth table.
Practical limitations will be observed because of LSI pin-out
restrictions. The truth tables will go into the file and not
be printed out unless requested. Manual inputting of truth
tables is also possible, and in this case equations need not
be in the file for such functions. A future design aid would
be a problem oriented compiler to generate truth tables by
programming instead of simulation.
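As a rough illustration of how simulation turns serial logic into input and output word pairs, the following Python sketch evaluates a small set for every input combination. The equation representation (ordered pairs of subject and a function over signal values) and the name truth_table are assumptions made for this example only.

```python
from itertools import product


def truth_table(equations, inputs, outputs):
    """Simulate a set for every input combination.
    equations: list of (subject, function) ordered so that feeding
    equations come before the equations they feed (an assumption)."""
    table = []
    for bits in product((0, 1), repeat=len(inputs)):
        signals = dict(zip(inputs, bits))
        for subject, function in equations:
            signals[subject] = int(function(signals))
        # Only set inputs and set outputs reach the composite table;
        # internal signals such as T disappear.
        table.append((bits, tuple(signals[name] for name in outputs)))
    return table


equations = [
    ("T", lambda s: s["A"] and s["B"]),      # internal, feeds P and Q
    ("P", lambda s: s["T"] or s["C"]),
    ("Q", lambda s: s["T"] and not s["C"]),
]
table = truth_table(equations, ["A", "B", "C"], ["P", "Q"])
```

The internal signal T never appears in the result, which is the sense in which serial logic has been transformed into parallel logic.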
Commonality Analysis
Functions within the set which for the same combination of
input values obtain the same output value have minterms in
common. A minterm is defined as a Boolean product of all
input variables, with each variable present in either its
true or complemented form depending upon whether the
corresponding input value is a 1 or 0. Since a function may
be represented as a Boolean sum of all minterms for which the
function is true (that is, 1), only such minterms will be
considered together with don't cares.
The objective of commonality analysis is the minimizing of
the total number of minterms for all functions in the set. A
trial and error approach can be followed to generate a new
set of logic equations in expanded minterm form. New
functions are generated of each minterm for which all N
outputs are true, for which all N-1 outputs are true, all
N-2, etc. The resulting new equation set represents the
original function when common logic outputs are ORed together
with the logic unique to each function.
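One way to picture the grouping step is the Python sketch below, which collects minterms by the exact subset of outputs for which they are true, most widely shared first. It is a simplified illustration only: don't cares and the trial and error search are not modelled, and the name commonality and the data layout are assumptions.

```python
from collections import defaultdict


def commonality(functions):
    """functions maps a function name to the set of minterms (as
    integers) for which its output is 1. Returns a list of
    (sharing functions, minterms), widest sharing first."""
    groups = defaultdict(set)
    for minterm in set().union(*functions.values()):
        sharers = frozenset(
            name for name, terms in functions.items() if minterm in terms
        )
        groups[sharers].add(minterm)
    return sorted(
        groups.items(), key=lambda item: (-len(item[0]), sorted(item[0]))
    )


functions = {"F": {1, 3, 7}, "G": {3, 5, 7}, "H": {2, 7}}
shared = commonality(functions)
```

In this example the three functions list eight minterms between them, but only five distinct minterms need to be generated: minterm 7 is common to all three outputs and minterm 3 to two of them. Each original function is recovered by ORing together every group in which it appears.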
Boolean Simplification 
After the multi-valued function minterm reduction is
performed by commonality analysis, the resulting new equation
set is then reduced one equation at a time by Boolean
simplification techniques. Each equation is represented by
its truth table, which is a listing of all minterms for which
the function's output is true. This minterm form is also
called the first canonical form or the disjunctive normal
form.
Each minterm is represented by a binary word of N bits where
N is the number of equation inputs. The value of the word is
determined by those combinations of bits for which the
corresponding combinations of input signals obtain a true
output. The words of this truth table are then grouped
according to the number of 1's in each word. Words of
adjacent groups are compared for a match in all but one bit
position. Such matches produce a new word of N-1 variables
with an X marking the deleted variable. This is equivalent to
applying the theorem $xy + \bar{x}y = y$. After N passes,
prime implicants will remain. The simplest sum-of-products
representation is obtained by the Quine method of Boolean
simplification.
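The word-matching passes can be sketched as follows. This Python fragment is an illustrative reconstruction, not the Cobol program mentioned below: it compares every pair of words in a pass rather than only words of adjacent groups (which gives the same matches, since words differing in one bit always lie in adjacent groups), and it stops at the prime implicants without selecting a minimal cover.

```python
from itertools import combinations


def combine(a, b):
    """Merge two words that differ in exactly one 0/1 position."""
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "X" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "X" + a[i + 1:]   # X marks the deleted variable
    return None


def prime_implicants(minterms, n):
    """Repeat the matching pass until no further words combine."""
    words = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while words:
        merged, used = set(), set()
        for a, b in combinations(sorted(words), 2):
            c = combine(a, b)
            if c:
                merged.add(c)
                used.update((a, b))
        primes |= words - used    # words that never matched are prime
        words = merged
    return primes
```

For example, the minterms 4, 5, 6 and 7 of a three-input function reduce to the single word 1XX, that is, the function depends only on its first input.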
This program has been written in Cobol and run for 12
variables in reasonable time on a 131K character H-2200
computer. Any program which takes more than an hour to run is
considered unreasonable in execution time.
Substitute Equation Symbols 
Input consists of match symbols denoting the equation symbols
to be modified and substitute symbols specifying the
modification. This program is useful as a clerical aid in
changing signal names. If two design files are to be merged,
common signals must have the same names and one file may have
to substitute symbols. All occurrences of the same symbol in
the file are automatically modified. If a character in the
match symbol is a hyphen, that character position is omitted
in the comparison and substitution. Deletion of symbols is
another option.
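The hyphen rule can be illustrated with a small Python sketch. It assumes that the match symbol, the substitute symbol and the file symbol all have the same length, and that a hyphen position keeps whatever character the file symbol already has; both points are interpretations of the text, and the function names are hypothetical.

```python
def substitute(symbol, match, replacement):
    """Return the substituted symbol, or the symbol unchanged if it
    does not match. Hyphen positions in `match` are ignored in the
    comparison and left untouched in the result."""
    if not (len(symbol) == len(match) == len(replacement)):
        return symbol
    for have, want in zip(symbol, match):
        if want != "-" and have != want:
            return symbol
    return "".join(
        have if want == "-" else new
        for have, want, new in zip(symbol, match, replacement)
    )


def substitute_file(equations, match, replacement):
    """Apply the substitution to every subject and input in the file."""
    return {
        substitute(subject, match, replacement): [
            substitute(s, match, replacement) for s in inputs
        ]
        for subject, inputs in equations.items()
    }
```

With the match symbol CLK-A and the substitute symbol CLK-B, both CLK1A and CLK2A are renamed (to CLK1B and CLK2B) in a single pass, while unrelated symbols are left alone.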
PHASE II - SIMULATION OF OPERATION
Phase II programs depend upon data processed by Phase I
programs. After a design has been formulated, it is tested
and checked by Phase II programs. Any errors found can be
corrected by rerunning Phase I programs. Circuits are
associated with equations by the circuit assignment program
in order to allow detailed circuit timing checks to be
performed.
Functional Complexity Check 
After the reductions of Phase I, multi-valued function sets
must be checked to see if the commonality analysis and
simplification were enough to allow the function to meet
various LSI pin-out limitations and logic functional
complexity limitations. Output of this program consists of
error listings. The file is not modified in any way.
Simulate Logic Functions 
Besides the logic equations in a design file, the program
simulate logic equations also accepts as input a card deck
which controls the simulation timing, inputting of test data,
output format, and output of selected circuits at selected
times. Output is to the high speed printer. The design file
is not modified in any way.
The program will handle up to 24,000 logic equations as
presently written in assembly language on the H-800 Computer.
Run time depends upon number, size and type of equations;
average number of unclocked circuits in a chain; number of
clock phases in a clock cycle; amount of test data input and
amount of output. Although intended for synchronous logic,
the program will accept asynchronous chains logically
separated from other chains. Unclocked logic feedback is
accepted, and logic loops and oscillatory conditions are
identified if they exceed maximum allowable iterations. A
subroutine exists for each allowable logic equation type.
This subroutine library can be updated for new circuit types.
Truth tables of multi-valued functions are also available to
speed simulation by table look up.
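The treatment of unclocked feedback can be pictured with the following Python sketch, in which equations are re-evaluated until no signal changes or an iteration limit is reached. The function name settle, the limit of 10 and the dictionary representation are assumptions for illustration; the H-800 program itself is not described at this level of detail.

```python
def settle(equations, signals, max_iterations=10):
    """Re-evaluate unclocked equations until the signals stop changing.
    Returns (signals, stable); stable is False when the iteration
    limit is exceeded, which flags a loop or oscillatory condition."""
    signals = dict(signals)
    for _ in range(max_iterations):
        changed = False
        for subject, function in equations.items():
            value = int(function(signals))
            if signals.get(subject) != value:
                signals[subject] = value
                changed = True
        if not changed:
            return signals, True
    return signals, False


# Cross-coupled NOR latch: feedback that settles.
latch = {
    "Q": lambda s: not (s["R"] or s["QN"]),
    "QN": lambda s: not (s["S"] or s["Q"]),
}
state, stable = settle(latch, {"S": 1, "R": 0, "Q": 0, "QN": 1})

# A signal fed back through a single inversion: never settles.
oscillator = {"X": lambda s: not s["X"]}
_, oscillator_stable = settle(oscillator, {"X": 0})
```

The latch reaches a steady state after a few passes, whereas the single inverting loop is reported as unstable once the allowed number of iterations is used up.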
Circuit Type Assignment
The only changes made in the design file by this program will
be circuit types for single-valued functions (for example,
cell type or fixed array LSIs with internal interconnect
needed). If a circuit type is already specified for an
equation, it is checked; if not, a circuit type is assigned.
Manual assignments can be made through load equations or on
input cards to circuit assignment. Circuits are assigned or
checked according to an equation's logic, fan-in and fan-out.
Manual assignment is made by specifying a circuit type for a
subject symbol on an input card. The subject symbol may
contain dashes as in the substitute symbols program to allow
assignment of this same circuit type to all equations which
have the same subject symbol except for the character
positions which contain a dash.
Assignment or checking is done by means of a table in the
program which specifies circuit types by logic function,
maximum fan-out, and maximum fan-in per term (sum of products
form). The table is ordered so that the minimal circuit type
will be assigned to each equation.
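A table-driven assignment of this kind might look like the Python sketch below. The circuit names and limits in CIRCUIT_TABLE are invented for illustration and do not come from the paper; only the idea of an ordered table searched from the smallest circuit upward is taken from the text.

```python
# Hypothetical table, ordered from minimal to largest circuit:
# (circuit type, logic function, max fan-in per term, max fan-out)
CIRCUIT_TABLE = [
    ("NAND2", "NAND", 2, 4),
    ("NAND4", "NAND", 4, 4),
    ("NAND4P", "NAND", 4, 10),
]


def assign_circuit(function, fan_in, fan_out, table=CIRCUIT_TABLE):
    """Return the first (minimal) circuit type able to implement the
    equation, or None if no entry in the table fits."""
    for circuit, logic, max_in, max_out in table:
        if logic == function and fan_in <= max_in and fan_out <= max_out:
            return circuit
    return None


def check_circuit(circuit, function, fan_in, fan_out, table=CIRCUIT_TABLE):
    """Check an already specified circuit type against the table."""
    return any(
        name == circuit and logic == function
        and fan_in <= max_in and fan_out <= max_out
        for name, logic, max_in, max_out in table
    )
```

Because the table is searched in order, an equation with two inputs and three loads receives the smallest entry, while one with eight loads falls through to the entry with the larger fan-out.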
Circuit Timing Check
After circuit types have been assigned to all equations in
the file, this program may be run. The design file will not
be modified in any way. Various worst case tests are made
through all chains and associated sub-chains under test,
where a chain is defined as starting and ending at clocked
circuits. Turn-on and turn-off circuit delays are accumulated
for various tests.
A table contains turn-on and turn-off times for each loading
condition of each circuit type for each test condition. Tests
are made at various temperatures and for either minimum or
maximum delays. Turn-on and turn-off delays are alternated
whenever circuits invert pulses. Checking criteria specify
maximum and minimum acceptable chain delays as accumulated at
the top of each chain. Different criteria can check chains
between different clock phases. Set up, skew and margin are
included within these criteria.
Any violations found are listed giving maximum/minimum test
at temperature, subject symbol of the circuit at the top of
chain, and accumulated turn-off and turn-on delay time. A
complete listing of all chains for all test conditions may
also be obtained as an option. Errors can be corrected by
reassigning circuit types and/or reloading logic equation
corrections.
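The accumulation of alternating turn-on and turn-off delays along a chain can be sketched in Python as follows. All delay figures and circuit names are invented; temperature and loading conditions are omitted, and the choice to apply an inverting circuit's delay for the edge at its output is an assumption about a detail the text leaves open.

```python
# Hypothetical delays: circuit type -> test -> (turn-on, turn-off)
DELAYS = {
    "INV": {"min": (2, 3), "max": (5, 7)},
    "BUF": {"min": (3, 3), "max": (6, 6)},
}
INVERTING = {"INV"}


def chain_delay(chain, bound, start_edge="on"):
    """Accumulate delay along a chain of circuit types, switching
    between turn-on and turn-off delays at each inverting circuit."""
    edge, total = start_edge, 0
    for circuit in chain:
        if circuit in INVERTING:
            edge = "off" if edge == "on" else "on"
        on, off = DELAYS[circuit][bound]
        total += on if edge == "on" else off
    return total


def check_chain(chain, min_allowed, max_allowed):
    """List (test, starting edge, delay) for every violated criterion."""
    violations = []
    for bound in ("min", "max"):
        for edge in ("on", "off"):
            delay = chain_delay(chain, bound, edge)
            too_fast = bound == "min" and delay < min_allowed
            too_slow = bound == "max" and delay > max_allowed
            if too_fast or too_slow:
                violations.append((bound, edge, delay))
    return violations


chain = ["INV", "BUF", "INV"]
violations = check_chain(chain, min_allowed=5, max_allowed=15)
```

With these made-up figures the maximum-delay test accumulates 18 units for either starting edge and so violates a 15 unit limit, while the minimum-delay test (8 units) passes.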
PHASE III - PLACEMENT OF COMPONENTS
The automatic placement of the selected integrated circuits
in the right places on each board or LSI minimizes printed or
etched wire lengths. The automatic routing of such wires
minimizes cross-over points. Thus the number of LSI or board
levels is kept to a minimum, reducing to a few hours the many
man-years this takes by hand. The elimination of
breadboarding and manual layout techniques provides, with the
advent of large scale integrated circuit technology, the only
practical method of implementation.
Phase III programs depend upon data processed by Phase I and
II programs. Various parts of Phase III programs are
applicable only to variable array type LSIs. Other parts are
applicable to cell type or fixed array LSIs with internal
interconnect wire-routing needed. Note that discretionary
interconnect is not covered in the design phases except as
various data (for example, logic equations) may feed into
other programming systems such as Computer Production
Assistance and Computer Test Assistance.
Circuit Placement
Manual placement is done by specifying subject symbols and
board placement coordinate locations. Possible errors are:
1) Circuit type not in file, 2) Subject not in file,
3) Duplicate subject symbols, 4) Placement location
overfilled. Automatic placement will generate board locations
for each subject. Circuits within the same flat-pack for
fixed array type LSIs or within the same cell for cell type
LSIs will be automatically assigned. Thus, placement will
update the file with placement locations and placed circuit
types giving circuit allocation within cells and flat-packs.
Location will also specify any expander gate(s) located
relative to the expanded circuits.
Violations will list subjects or board locations. Output
options include listings of spares, unplaced signals, and
placed equations. Main output is placement diagrams showing
circuits and subjects at coordinate placement locations by
board. Variable array type LSIs will not require any internal
placement or circuit allocation.
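The four manual placement errors listed above lend themselves to a simple validation pass, sketched here in Python. The file layout (subject mapped to circuit type, with None meaning no type has been assigned), the capacity argument and the function name are all assumptions made for the example.

```python
def place_manually(design_file, requests, capacity):
    """design_file maps subject -> circuit type (None if unassigned).
    requests is a list of (subject, location) pairs.
    capacity is the number of circuits one location can hold.
    Returns (placed, errors)."""
    placed, fill, errors = {}, {}, []
    for subject, location in requests:
        if subject not in design_file:
            errors.append((subject, "subject not in file"))
        elif design_file[subject] is None:
            errors.append((subject, "circuit type not in file"))
        elif subject in placed:
            errors.append((subject, "duplicate subject symbol"))
        elif fill.get(location, 0) >= capacity:
            errors.append((subject, "placement location overfilled"))
        else:
            placed[subject] = location
            fill[location] = fill.get(location, 0) + 1
    return placed, errors


design_file = {"A": "NAND2", "B": "NAND2", "C": None, "D": "NAND4"}
requests = [("A", (1, 1)), ("B", (1, 1)), ("C", (1, 2)),
            ("A", (2, 2)), ("D", (1, 1)), ("E", (3, 3))]
placed, errors = place_manually(design_file, requests, capacity=2)
```

In keeping with the rest of the system, errors are collected and listed rather than stopping the run, and only valid placements update the file.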
Generate Pin Groups
After placement this program will compute and add to a design
file for each equation record an input pin group matching the
logic equation's inputs and an output pin group matching the
equation's load list.
Equations may be reordered within a set or the inputs of
single equations may be reordered to facilitate pin
assignment such as interior connections brought out to common
pins. Expander gate pins are automatically assigned. Generate
pin groups will only specify flat-pack pins, not connectors
or feed-throughs. These are either generated by special
programs or can be added by the manual pin assignments
program (see Manual Pin Assignments).
Several listings are generated: maintenance listing with all
data (equations, input pins, load list, output pins, circuit
type, placement location) ordered by subject; assembly point
to point wiring lists internal to each LSI; assembly wiring
lists external to each LSI.
Manual Pin Assignments 
LSI and/or printed board connector pins and two-sided printed
board feed-through pins as well as other pins may be entered
into the output pin groups by this program, which is also
able to modify existing input pin groups. This is a general
program for use with any design technology, while generate
pin groups is a more specialized program with many versions.
Normally, connector pins will be known even before listing
pin groups using only the placement diagrams. Thus, a card
deck specifying connector pins may be submitted to this
program either before or after running generate pin groups
but after placement. Unassigned connectors will be listed as
will signals which still require feed-through. Output options
are the same as given for the generate pin groups program.
Pin designations must not be duplicated. This and other
possible errors will be checked.
LSI Board Etch 
For cell type or fixed array LSIs with internal interconnect
wire-routing needed, this program will provide layer
separation for cross over and wire routing layout according
to the specific groundrules of the design. This is a special
purpose program which will exist in many different versions.
For variable array type LSIs it will only be necessary to
specify the array mask for the word pairs of a multi-valued
function as developed in Phase I. For internal interconnect
routing, internal pin groups must have previously been
specified. Output is routed wire lists by layer or array
connection points.
Printed Board Etch 
This program separates board layers, indicating where
plated-through holes are needed, and routes printed circuit
board wiring. Its function is similar to that of LSI board
etch except that the wire routing is between pins external to
the LSIs and different groundrules will be followed.
PHASE IV - PREPARATION OF WIRE-RUN LISTS 
Phase IV depends upon data processed by Phases III, II and I
programs. Programs of this phase provide the wiring output in
various forms for production assembly. Wire-listings are
reordered to run a scribing machine to prepare masks for LSIs
and printed boards. The massive data transmitted from
Engineering to Production is literally 100,000's of wire
segments, necessitating masks automatically prepared by
computer controlled scribing.
LSI Board Wiring Lists
Either a routed wire list or an array mask list is made for
each LSI in the file.
Printed Board Wiring Lists
A routed wiring list is made for each board connecting
flat-packs.
Mother Board Wiring Lists
A routed wiring list is made for each board connecting other
boards.
Listings Ordered for Scribing
Cards, paper-tape or magnetic tape is produced to run a
scribing machine.
SERVICE PROGRAMS 
These programs are used by the programmer to run and service
the computer design assistance system. The library program
forms a basic package from which all other programs are built
up. The executive provides continuity of running between all
the other design programs and communication with the computer
operator. The file edit works on design files as a whole,
copying, deleting or renaming them. In addition, various
parameter update programs may exist to modify tables in other
programs (logic subroutines in simulation program, circuit
types in circuit assignment program, time delays in timing
check program, etc.). Service programs are:
Subroutine Library 
In addition to a library of logical, input/output editing and
formatting, sorting, scanning, searching and square root
subroutines, constant and data format pools are specified.
All are designed to simplify the programming of the type of
programs found in the computer design assistance system. The
use of a compiling system simplifies library maintenance.
Programs are constructed by copying common portions from the
library. Presently the Cobol Compiler, Update and Library
system is being used for bootstrap development and
maintenance of all programs.
Executive 
This program provides continuity between all the other
programs. The executive consists of two parts: start-up and
run. The start-up reads the first control card and checks
magnetic tape file mountings. The run executive is then
called.
The run executive reads a single program batch control card
which precedes each design program input card deck. Any tape
reassignments necessary, rerun points, comments to the
operator, etc. are made and the next program to be run is
called.
Thus, a batched system card deck is submitted for each run.
The run executive calls all other programs, which return to
it upon exit. Normally there will be no computer stops for a
run. Note that the same program may be run more than once
with different data. Also, more than one design file may be
updated.
File Edits 
This program will: 1) copy the complete CDA system of files
and change the sequence number; 2) delete selected records of
selected files; 3) copy selected file records of selected
files; 4) copy selected files with new titles; 5) rename
selected file titles; 6) list all file titles; etc.
Parameter Updates
Various special programs may be provided to update tables in
other programs.
CONCLUSION 
The objective has been to show various computer design aid
tools applicable to multiprocessor LSI technology and to
provide a cooperative man-machine interactive design system
to free the designer from routine bookkeeping tasks so that
he may devote more of his time to the actual system design.
Since all programs work on the same file, corrections can be
made at any point in the design by simply rerunning
appropriate programs. Also, the fact that the programs are
independent, with data separated from the programs into a
common file, allows for easy expansion of the system by
adding new programs to take advantage of new construction
techniques as they come along.
Since each phase is complete and dependent only on the
preceding phases, development of programs can proceed in
parallel with the actual equipment design using the lead time
of completed phases.
ESSENTIAL FEATURES OF 
ON-LINE SYSTEMS 
HARRY D. HUSKEY 
Dr. Huskey is a Professor of Mathematics and Electrical Engineering,
Department of Mathematics, University of California, Berkeley, California.
He received his B.S. degree from Idaho in 1937, and his M.S. and Ph.D.
degrees (Mathematics) from Ohio State in 1940 and 1943, respectively.
He was an Assistant Mathematics Instructor at Ohio State from 1937-38;
an instructor at Ohio State from 1941-43; at the University of Pennsylvania
from 1943-46; temporary Principal Science Officer at the National Physical
Laboratory, England, 1947-48; with the National Bureau of Standards, Wash.,
D.C., 1948-49; Assistant Director and Instructor in Numerical Analysis,
University of California (Berkeley), 1949-54; Associate Professor, Mathematics
and Electrical Engineering, University of California, 1954-58, and was made
a full professor in 1958, his present position.
He was Technical Director of the Computer Laboratory, Wayne University,
1952-53. His specialties include the following: design and use of electronic
digital computing machines and accessories, mathematical area of surfaces,
and the solution of algebraic linear simultaneous equations.
He is a member of MAS, Mathematical Society; Industrial Mathematics
Society; Mathematical Association; the Association for Computing Machinery;
IEEE, and the British Computer Society.
ESSENTIAL FEATURES OF ON-LINE SYSTEMS 
By Harry D. Huskey 
Professor of Electrical Engineering
Massachusetts Institute of Technology and
University of California, Berkeley
The exciting interest in time-sharing of computers that is
sweeping the profession raises some questions. Why is
time-sharing so good? If it is so good, why didn't we do it
years ago?
On the first question: time-sharing is not so good! What is
good is the on-line use of computers wherein the user works
for continuous periods with sustained attention. Thus,
relative to batch processing techniques the user can
concentrate on his most significant problems. There is no
need to have a number of secondary problems active in order
to occupy his time while waiting out turn-around on a batch
system.
Even though the above is sufficient economic justification,
of much greater significance is the fact that the user may
"explore" his way through a problem. In other words, the user
may start on his problem without knowing a complete solution
algorithm. If certain cases never occur they need not be
accounted for in the program.
This on-line use contrasted with batch use can be compared
with communication with a distant person. Letters represent
batching of information and must be sufficiently complete so
as to avoid misunderstanding. In comparison a telephone
communication gives (1) the possibility of a briefer text
assuming the receiver understands the ambiguities or the
omissions, (2) if the text is too brief immediate feed-back
occurs, and (3) immediate response (answers) can occur. Note
that the turn-around time for a telegram compares with that
for batch processing computers.
... the system (hardware, software, buffer areas, etc.)
dedicated to the individual user must be small. Individual
start-up and close-down costs must be modest with respect to
the system and to the user. Minimum price terminals are low
data-rate devices and this requires system storage of user
files. Although magnetic tape represents a possible means of
storing such files between user-sessions, it is much too slow
for the storage of the files of a user when he is active.
In the above discussion the consideration of display scopes
is being by-passed because of their high per user cost. ...
may change this cost. Of perhaps greater importance: with a
proper supporting system (yet to come) a scope by pictorial
techniques may transmit to the user so much more information
than is possible by a typewriter, and pointing techniques
(light-pen, Rand tablet, etc.) may be so much easier than,
say, typing coordinates, that scope consoles may then compete
economically with typewriters for some applications. Summary
of the desirable characteristics of an on-line system:
1) The terminal (and all of the system dedicated to the
individual user) must be low-cost, and in the current
state-of-the-art this implies low data-rate equipment.
2) Low data-rate terminals imply substantial amounts of
system storage per user (active or not).
3) The active files of active users must be in quickly
accessible memory. A "lively" on-line system must respond to
each active user in seconds. For general engineering and
scientific computation this means (1) the user's files are in
fast core (which may conflict with the minimum cost
requirements mentioned above), or (2) the user can be
"swapped-in" from a lower-cost memory without excessive
overhead.
The system of Project Genie at the University of California,
Berkeley, represents our approach to the above problems. User
terminals are teletype machines. User [storage] is a large
capacity disc (30 million words in the initial version); low
cost storage for active users (and system programs) is a
million word drum which can swap with the high speed core at
core-rate. The 32K memory is divided into two frames and
priority memory
Now consider the second question: The first
e l e c t r o n i c  computers were used on-line. And, even is provided to the processor 
t o  run during drum and 110 t r a n s f e r s .  The v a r i -  
a b l e  p r i o r i t y  reduces memory access  c o n f l i c t s  from 
about 40% t o  below 10% . When a device (drum, 
CPU, d a t a  channel, e t c . )  r e q u i r e s  a memory cycle ,  
ii cciies i n  ~ i t h  ZAELF!E~ p r i o r i t y  and f o r  each 
i n  r e c e n t  t imes,  groups doing l a r g e  system pro- 
grams, wher6 t o t a l  e lapsed t i m e  t o  completion w a s  
of prime importance, have been given exc lus ive  on- 
i i n e  access t u  large coq:.;-ti.=.g s y s t e m s .  
memory cyc le  t h a t  i t  f a i l s  t o  g e t  i t s  p r i o r i t y  i s  
upgraded, However, most computer systems have developed i n t o  w e l l -  tuned batch systems wi th  input  and out- 
p u t  queuing organized t o  keep t h e  c e n t r a l  proces- 
s o r  busy ( t h i s ,  genera l ly ,  maximizes t h e  amount of 
computing t h a t  can be done). 
The hardware conf igura t ion  i s  less than h a l f  
t h e  s tory .  Batch processing software is  not  s u i t -  
a b l e  f o r  on-l ine systems because i t  i s  not  designed 
t o  provide i n t e r a c t i v e  communication with the  user. A s u c c e s s f u l  on- l ine  system must se rv ice  a 
number of  u s e r s  comparable ko (or  more than) the 
number t h a t  can be serv iced  wi th  an equiva len t  
(measured, say ,  i n  terms of  c o s t )  batch system. 
I n  order  f o r  t h i s  t o  be p o s s i b l e  the  por t ion  of 
A number of so-cal led on- l ine  systems have 
been promoted which e i t h e r  (1) provide l imited lan- 
guage c a p a b i l i t y  requi r ing  r e s t r i c t e d  input  and 
-79- 
c 
output  formats or  ( 2 )  which permits  the  user  ( v i a  
h i s  console) t o  obta in  a p o s i t i o n  i n  the batch 
processing queue. 
with batch processing t h e  on- l ine  system must pro- 
v ide  e s s e n t i a l l y  the same computing f a c i l i t i e s .  
I n  order  t o  compare a t  a l l  
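The variable-priority memory access described above for the Berkeley hardware can be pictured with a small sketch. It is only an illustration under stated assumptions, not the Project Genie logic: a request is assumed to enter at the lowest priority, the highest-priority requester is assumed to win each memory cycle, and ties are assumed to go to the earliest request. The names `Request` and `arbitrate` are hypothetical.

```python
from dataclasses import dataclass

MIN_PRIORITY = 0  # assumed entry priority for a new request


@dataclass
class Request:
    device: str
    priority: int = MIN_PRIORITY


def arbitrate(pending):
    """Grant one memory cycle and return the winning device name.

    The winner is removed from `pending`. Every request that fails to
    get the cycle has its priority upgraded by one, so a device cannot
    be locked out indefinitely. Ties go to the earliest request in the
    list (an assumption made for this sketch).
    """
    winner = max(pending, key=lambda r: r.priority)
    pending.remove(winner)
    for request in pending:
        request.priority += 1
    return winner.device


# The CPU loses the first cycle to the drum, is upgraded, and then
# beats a data channel request that arrives later.
pending = [Request("drum"), Request("cpu")]
first = arbitrate(pending)
pending.append(Request("channel"))
second = arbitrate(pending)
print(first, second)  # drum cpu
```

The point of the upgrade rule is that a loser gains standing with every cycle it misses, which is one plausible way to keep any single device from monopolizing memory.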
The on-line system must cater to at least two types of user. One is
the neophyte or occasional user. To him the system should be
"forgiving" and it should "lead" him through the computational
process. At the user language level, JOSS (our version at Berkeley is
called CAL) is an example of a good "fail-safe" language. This
tolerance and helpfulness must exist at all levels such as system
communication, text generating and editing, and in each user
language.

At the other extreme is the professional user. For him flexibility is
of prime importance. He should be able to maximize his communication
rate with his problem relative to his physical and mental effort.
Thus, system procedures should all be available to him with a
relatively uniform method of calling.

All the above must occur in a system which gives almost all hardware
and software facilities to each of its users (varying perhaps from 20
to 100) for his time-slice out of an interval which is at most a few
seconds long.

The Berkeley system uses memory paging to protect programs from one
another, and to extend the apparent memory size from the user's
viewpoint.

People are realizing, and some computer companies have apparently not
yet realized, that the proper software is at least half of the cost
of a good batch processing system. This is even more true of software
systems.

ON-LINE SIMULATION
MARTIN GREENBERGER 
Dr. Greenberger is Associate Professor at the Sloan School of
Management, Massachusetts Institute of Technology. An applied
mathematician, his interests are in the application of computers and
quantitative methods to decision-making and economic behavior. He has
been engaged at M.I.T.'s Project MAC in the development and use of
interactive computer systems.

Professor Greenberger received A.B. (summa cum laude), A.M., and
Ph.D. degrees from Harvard University in Applied Mathematics. While
at Harvard he was a National Science Foundation Fellow, a Teaching
Fellow in Mathematics, and staff member of the Harvard Computation
Laboratory.

Before joining the M.I.T. faculty, Professor Greenberger formed and
managed Applied Science Cambridge, the IBM group that cooperated with
M.I.T. in the establishment and operation of the M.I.T. Computation
Center. This group assisted the Smithsonian Astrophysical Observatory
in tracking the first Russian and American satellites.

Dr. Greenberger became Assistant Professor at the Sloan School in
1958 and Associate Professor and Head of the Quantitative Section in
1961. In 1962 he edited a book of essays entitled Management and the
Computer of the Future. Professor Greenberger is co-author of the
books Microanalysis of Socioeconomic Systems: A Simulation Study, and
On-Line Computation and Simulation: the OPS-3 System. He has written
numerous articles for technical journals and magazines. During the
1965-1966 academic year, he was a Guggenheim Fellow at Berkeley.

ON-LINE SIMULATION IN THE OPS SYSTEM
By Martin Greenberger
and Malcolm M. Jones
Associate Professor of Management
and Instructor of Management
Massachusetts Institute of Technology
Cambridge, Massachusetts

SUMMARY 
The OPS system, an interactive system designed for use in a
time-sharing environment, includes an on-line simulation capability.
A simulation activity, thought of as a series of events, is
scheduled, canceled, or rescheduled dynamically on the AGENDA, either
at a specified time, or when a prescribed condition is met. The
activity can be made to consume simulated time by means of an
internal delay for a certain period, or a wait until given conditions
are satisfied. The AGENDA is a time-ordered list of all conditionally
and unconditionally scheduled activities. The user may inspect it at
any point in a simulation, and personally modify or restructure it.
He may base his strategy on data and partial results examined and
analyzed with the help of the OPS system during interruption of the
run. Extensive tracing facilities permit the user to follow the flow
of control during a simulation to any level of detail.

Introduction

OPS is an interactive system designed for general use in a
time-sharing environment.* It includes an on-line capability for
building models and running simulations. Simulation activities are
scheduled, canceled, or rescheduled dynamically on an AGENDA either
at a specified time or when a prescribed condition is met. Activities
can be made to consume simulated time by means of a delay for a
certain period or a wait until given conditions are satisfied. The
AGENDA is a time-ordered list of conditionally and unconditionally
scheduled activities.

*For those interested in the origin of the name, OPS stands for
On-Line Process Synthesizer. The system could be adapted for a small
stand-alone computer. It is fully documented in the manual On-Line
Computation and Simulation: the OPS-3 System, M. Greenberger,
M. M. Jones, J. H. Morris, Jr., and D. N. Ness, MIT Press, 1965.

Working within the multi-purpose framework of the OPS system, the
user may inspect the AGENDA or some index of performance without
stopping the simulation. He can also interrupt the run to make
unprogrammed inspections and alterations. Before resuming, he can
roll the simulation back to an earlier state that has been preserved,
or perturb it in some other manner. Reference to data and activities
is symbolic.

Extensive tracing facilities permit the user to follow the flow of
control during a simulation to any desired level of detail. He may
modify his experimental design as he views partial results, as well
as conduct interim statistical analyses, without relinquishing title
to the computer or losing his place in the simulation. By running
independent components of his model singly or in selected
combinations from standard initial conditions, he is able to examine
different aspects of his simulation in a controlled way. This
flexible mode of operation encourages him to build and validate his
model incrementally, thus giving him a measure of protection against
the problem of initial overcomplexity that can plague a monolithic
simulation.

The OPS System

The OPS system provides a multi-purpose facility for on-line
computation, programming, and model-building. It is an open system
and it is modular. The user can enlarge and reshape it to suit his
own requirements by adding individually tailored subroutines, known
as operators. An operator may simply be a subroutine with a fixed
number of arguments of fixed connotations. Or it may have a variable
number of parameters whose interpretation is sensitive to context.
These parameters may be read in literally, symbolically, or with
conversion to any of several modes. Some 60 to 70 standard operators
come with the system. Additional operators may be written in any of a
variety of programming languages, such as FORTRAN, MAD, or FAP; or
they may be recruited from a wide assortment of existing subroutines
without modification.

New operators may be written in terms of old operators. An ordered
set of operators is known as a compound operator or KOP (pronounced
K-Op). A KOP may be executed as it is being constructed, since its
execution is interpretive. After it has been debugged, it may be
compiled into a conventional subroutine. A KOP has a fixed number of
arguments of fixed connotation. Whether or not it is compiled, it is
referred to by name as though it were a subroutine. Compound
operators (KOP's) may themselves be compounded. They may call
themselves and each other to any depth. Flow of control between KOP's
is similar to flow of control between subroutines.

There are standard operators for input and output, testing,
branching, and repeating within a KOP. Thus, a KOP is analogous to an
ordinary program, except that its components can be of arbitrary
complexity and tailored to an individual need. Operators are the
building blocks and KOP's are the structures.

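The relation between operators and KOP's can be sketched in a few lines of Python. This is an illustration only, not OPS code: the operator names `SET` and `ADD`, the registry dictionaries, and the `call` function are hypothetical, and argument passing to KOP's is left out for brevity.

```python
# Symbol table shared by all operators; variables are referenced by name.
symbols = {}


def op_set(name, value):
    """Hypothetical SET operator: store a value under a symbolic name."""
    symbols[name] = value


def op_add(target, a, b):
    """Hypothetical ADD operator: target = a + b, all referenced by name."""
    symbols[target] = symbols[a] + symbols[b]


# Compiled operators are plain functions. A KOP is stored as an ordered
# list of (name, arguments) lines and is interpreted line by line.
operators = {"SET": op_set, "ADD": op_add}
kops = {}


def call(name, *args):
    """Run an operator, or interpret a KOP; both are referred to by name."""
    if name in kops:
        for op_name, op_args in kops[name]:
            call(op_name, *op_args)  # a KOP line may itself name a KOP
    else:
        operators[name](*args)


kops["INIT"] = [("SET", ("A", 2)), ("SET", ("B", 3))]
kops["TOTAL"] = [("INIT", ()), ("ADD", ("X", "A", "B"))]

call("TOTAL")
print(symbols["X"])  # 5
```

Because `call` treats operators and KOP's alike, replacing an interpreted KOP with a compiled function of the same name would not change any caller, which mirrors the statement that a KOP is referred to by name whether or not it is compiled.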
In the OPS system, all variables are referred to symbolically through
a symbol table maintained during execution. Changes in the symbol
table can be made at any time without disturbing the definitions of
activities. Arrays of up to 3 dimensions can be addressed by implicit
indexing. Thus, the multiplication

    SET X = A * B

applies whether X, A and B are single cells or arrays. If A and B are
compatible arrays, the multiplication is carried out
element-by-element over all of their elements. Infix symbols are
available for matrix multiplication (.M.), matrix transposition
(.T.) and the differencing of elements of vectors (.D.). If either A
or B is a cell, the designated scalar operation is performed.

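Implicit indexing of the kind just described can be imitated with nested Python lists. The sketch below is not the OPS implementation; the function names are hypothetical, "compatible" is taken to mean equal lengths at every level, and `difference` is only one plausible reading of the .D. symbol (successive differences of a vector).

```python
def elementwise(op, a, b):
    """Apply op to cells or to nested-list arrays with implicit indexing."""
    a_is_array = isinstance(a, list)
    b_is_array = isinstance(b, list)
    if a_is_array and b_is_array:
        if len(a) != len(b):
            raise ValueError("arrays are not compatible")
        return [elementwise(op, x, y) for x, y in zip(a, b)]
    if a_is_array:  # b is a single cell: scalar operation on each element
        return [elementwise(op, x, b) for x in a]
    if b_is_array:  # a is a single cell
        return [elementwise(op, a, y) for y in b]
    return op(a, b)  # both are single cells


def multiply(a, b):
    """Counterpart of SET X = A * B for cells or arrays."""
    return elementwise(lambda x, y: x * y, a, b)


def difference(v):
    """Successive differences of a vector (an assumed meaning of .D.)."""
    return [v[i + 1] - v[i] for i in range(len(v) - 1)]


print(multiply(2, 3))                   # 6
print(multiply([1, 2, 3], [4, 5, 6]))   # [4, 10, 18]
print(multiply([[1, 2], [3, 4]], 10))   # [[10, 20], [30, 40]]
print(difference([1, 4, 9, 16]))        # [3, 5, 7]
```

The same statement works for cells, vectors, and matrices because the dispatch on shape happens inside the operator rather than in the user's program.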
The current version of OPS does not include a general list processing
capability, although several list processing operators have been used
experimentally. It also appears feasible to add the SLIP primitives
to OPS, subject to core space limitations.

Simulation and Model Building

A simulation model may be constructed from operators and KOP's by
using them to represent activities. An activity is an ordered
sequence of one or more events, or more precisely, a list of
operators defining how these events take place. Since the order of
execution of activities is not known in advance, the flow of control
between activities is not handled in the normal style of subroutine
calls. A special KOP, known as the AGENDA, is introduced to permit
the dynamic scheduling of activities. After an activity is completed,
control returns to the AGENDA which specifies the next activity to be
executed. Activities are scheduled on the AGENDA for execution either
at a specified time or when a specified condition is met. Activities
can schedule, cancel, and reschedule themselves and other activities
during execution. They may also consume simulated time by delaying
for a certain period or waiting for a condition to be met. Delays and
waits are used to string events together as activities. These
features are in the spirit of simulation languages such as SOL and
SIMSCRIPT, although SIMSCRIPT works only with events and does not
schedule conditionally.

In designing the simulation facility, primary emphasis was placed on
providing the researcher with a flexible framework for building his
model on-line in an interactive manner. The building phase of the
simulation process was considered more critical than the running
phase, and the subject of running efficiency received secondary
status. This philosophy led to a combined interpretive and
compilative system. Operators are compiled programs. They run at full
efficiency. KOP's, however, are executed interpretively and may be
traced in detail, a feature which is particularly effective in an
on-line environment. Once a model is ready for production runs, its
running efficiency can be improved by compiling all activities
written as KOP's into operators.

In building a simulation model on-line, there is great advantage in
structuring it so that preliminary pieces can be tested before they
become embedded in a larger whole. This sometimes is best
accomplished by building from the outside in, as in the construction
of a house. Other times, it can be achieved by assembling parts in
hierarchical combinations, as in the formation of an organization.
Either way, relatively independent parts should be isolated into
separate segments. This allows the computer to aid in weighing
alternative formulations of components, and helps build an
understanding of the model as the model itself is built.

In the OPS system, the parts of the model are out in the open and
easily modified. The AGENDA or schedule of activities also is out in
the open. Through the use of system switches, the user can indicate
where in the AGENDA he wishes to start the simulation, the exact
duration of the run, or a condition for termination. He can insert
himself into the simulation, and modify its course from the console
by altering the AGENDA or adding to it. He can interrupt a simulation
to examine some data, make a calculation, transform a variable, or
estimate a coefficient; then insert a change and resume the run. This
type of interaction is facilitated by the openness of the OPS system.
All the simulation variables are available for examination, and they
may be operated on by any of a wide variety of statistical operators.
Since all KOP's are executed interpretively, it is straightforward to
modify a KOP and then restart the simulation. No intermediate
compilation or reloading of the system is required.

Extensive tracing facilities are available. For example, the
following may be traced: the names of KOP's executed; the line
numbers executed; the parameters and results of operators executed;
the movements of simulated time; and the values of any variables
referenced symbolically. This tracing is controlled by system
switches which may be set at the console or dynamically from within a
KOP.

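Switch-controlled tracing of this kind is easy to picture in an interpreter. The sketch below is hypothetical throughout: the switch names, the `run_kop` function, and the representation of a KOP line as a (target, function) pair are assumptions made for illustration and are not taken from OPS.

```python
# Tracing switches; they may be flipped at any time, including by a
# line of the KOP that is currently running.
trace_switches = {"kop_names": False, "line_numbers": False, "variables": False}
trace_log = []


def run_kop(name, lines, symbols):
    """Interpret a KOP given as an ordered list of (target, function) lines."""
    if trace_switches["kop_names"]:
        trace_log.append(f"KOP {name}")
    for number, (target, fn) in enumerate(lines, start=1):
        if trace_switches["line_numbers"]:
            trace_log.append(f"{name} line {number}")
        symbols[target] = fn(symbols)
        if trace_switches["variables"]:
            trace_log.append(f"{target} = {symbols[target]}")


# Set two switches "at the console", then run a two-line KOP.
trace_switches["kop_names"] = True
trace_switches["variables"] = True
symbols = {}
run_kop("DEMO", [("A", lambda s: 2), ("B", lambda s: s["A"] * 5)], symbols)
print(trace_log)  # ['KOP DEMO', 'A = 2', 'B = 10']
```

Tracing of this sort comes almost for free when execution is interpretive, since the interpreter already visits every line and every symbolic reference.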
The man-machine interaction possible with this type of on-line
simulation facility offers new possibilities for better understanding
of complex systems. For example, those aspects of a system that are
well understood may be programmed as operators or KOP's. The less
understood components of the system may be modelled by the researcher
at his console. The entire system may then be made to interact under
different controlled conditions. The simulation may be stopped by the
user at any point, by pressing a special interrupt button, and
detailed validation analysis performed. It is not necessary to
specify the desired stop point in advance, although this alternative
is available. Alternate simulation strategies may be compared by
restarting the simulation from a given point with different decision
rules. It is only necessary to edit the appropriate KOP's, reload the
AGENDA and common storage (which has been saved on disk), make
appropriate modifications to the simulation variables, and enter run
mode.

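Comparing strategies by restarting from a saved point can be sketched as follows. The toy inventory model, the function names, and the use of an in-memory copy in place of a disk file are all assumptions made for this illustration; only the idea of reloading the AGENDA and common storage and then rerunning under a different decision rule comes from the text.

```python
import copy


def save_state(agenda, common):
    """Return an independent snapshot of the agenda and common storage."""
    return copy.deepcopy({"agenda": agenda, "common": common})


def restore_state(snapshot):
    """Return fresh copies so the same snapshot can seed several runs."""
    state = copy.deepcopy(snapshot)
    return state["agenda"], state["common"]


def run(agenda, common, reorder_point):
    """Toy model: each agenda entry is a (time, demand) pair."""
    for _time, demand in agenda:
        common["stock"] -= demand
        if common["stock"] < reorder_point:  # decision rule under study
            common["stock"] += 10
            common["orders"] += 1
    return common


agenda = [(1, 4), (2, 4), (3, 4), (4, 4)]
common = {"stock": 10, "orders": 0}
snapshot = save_state(agenda, common)

# Restart twice from the same saved point with different decision rules.
result_a = run(*restore_state(snapshot), reorder_point=3)
result_b = run(*restore_state(snapshot), reorder_point=7)
print(result_a)  # {'stock': 4, 'orders': 1}
print(result_b)  # {'stock': 14, 'orders': 2}
```

Because each run starts from its own copy of the snapshot, the two results differ only through the decision rule, which is what makes the comparison meaningful.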
Activities

A simulation system must have a way of representing events or
activities. In SIMSCRIPT and SOL this is the subroutine. In OPS, it
is an operator or sequence of operators, called a KOP. An activity
may affect a simulation by changing the values of state variables in
common storage. It may also alter the course of the simulation by
scheduling the execution of activities, including itself, at future
times, and by canceling activities previously scheduled. An activity
may also advance simulated time by means of DELAY and WAIT operators.

The Agenda

A discrete simulation system must also have a scheduler that drives
its clock. In SIMSCRIPT, this is the events list. In OPS, it is a
special KOP called the AGENDA. The AGENDA normally contains calls to
activities. The entries in the AGENDA are ordered by their line
numbers, which are equivalent to simulated time.

At the top of the AGENDA are conditional calls to activities, calls
that depend upon some relation among state variables. Following the
conditional calls are unconditional calls. Normally the AGENDA is
entered from the top and the conditional calls are examined to see if
any of them is satisfied. If none is satisfied, the first
unconditional call is executed, and the system variable TIME is
advanced to the line number of that call. The variable TIME always
contains the current value of the simulated clock. It is not changed
when a conditional call is executed.

All the call operators in the AGENDA delete themselves from the
AGENDA when they are executed. Thus, an activity is not called more
than once unless it is scheduled more than once.

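The behavior of the AGENDA can be approximated by a short scheduler sketch. It is not the OPS implementation: the class and method names are hypothetical, a heap stands in for line-number ordering, and the loop assumes that an unconditional call runs only when no conditional call is satisfied.

```python
import heapq
import itertools


class Agenda:
    def __init__(self):
        self.time = 0          # TIME: current value of the simulated clock
        self.conditional = []  # (condition, activity) pairs, kept at the top
        self.timed = []        # heap of (time, sequence, activity)
        self._seq = itertools.count()  # tie-breaker for equal times

    def schedule_at(self, time, activity):
        heapq.heappush(self.timed, (time, next(self._seq), activity))

    def schedule_when(self, condition, activity):
        self.conditional.append((condition, activity))

    def step(self, state):
        """Execute one call; return False when nothing is left to run."""
        for i, (condition, activity) in enumerate(self.conditional):
            if condition(state):
                del self.conditional[i]  # a call deletes itself when executed
                activity(self, state)    # TIME is not changed
                return True
        if self.timed:
            time, _, activity = heapq.heappop(self.timed)
            self.time = time             # advance the simulated clock
            activity(self, state)
            return True
        return False

    def run(self, state):
        while self.step(state):
            pass


log = []


def arrive(agenda, state):
    state["queue"] += 1
    log.append((agenda.time, "arrive"))


def serve(agenda, state):
    state["queue"] -= 1
    log.append((agenda.time, "serve"))


agenda = Agenda()
state = {"queue": 0}
agenda.schedule_at(5, arrive)
agenda.schedule_at(2, arrive)
agenda.schedule_when(lambda s: s["queue"] >= 2, serve)
agenda.run(state)
print(log)  # [(2, 'arrive'), (5, 'arrive'), (5, 'serve')]
```

Activities receive the agenda as an argument, so an activity could schedule or reschedule itself and others during execution, as the text describes.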
Note: The preceding is the first part of a talk presented at the 21st
National meeting of the Association for Computing Machinery, Los
Angeles, August 30, 1966. The full text appears in the Proceedings of
the conference, published by the Thompson Book Company, Washington,
D.C., 1966.
