Adaptive computers  by Breuer, M.A.
INFORMATION AND CONTROL 11, 402-422 (1967) 
Adaptive Computers* 
M. A. BREUER 
Electrical Engineering Department, University of Southern California, Los Angeles, 
California 90007 
In this paper we shall consider some design aspects of a computer 
which has two new modes of operation, which we call adaptive. One 
new mode of operation is the computer's ability to carry out useful 
computation even when component failures are present. This operation
may be achieved at the expense of computational rate or accuracy.
The ability to achieve this mode of operation is called graceful
degradation, and its implementation differs from the redundancy 
techniques normally used to increase the reliability of a computer. 
The second mode of operation consists of the computer's ability to 
automatically increase the throughput at the expense of computation 
time and accuracy. Both hardware and software procedures for
accomplishing these goals will be outlined.
The results of this work are applicable to the design of those space- 
borne computers which need be operational for long periods of time, 
such as a year or two. Normally, the effective life of a piece of
equipment is measured in terms of its mean time between failure (MTBF).
However, for the computer system to be described, we are more
interested in how long the system can carry out its overall functional
goal, rather than the MTBF of its hardware.
I. BACKGROUND AND MOTIVATION 
The work reported here was motivated by the work of Tomovic (1965, 
1966) on finite control of locomotive automata. Locomotive devices, 
such as a robot, have been constructed in the past. Their control has 
either been (a) passive, based on their own momentum, (b) active, such 
as motor driven, but where the locomotive device is statically stable at 
all times, (c) finite state, where a digital computer is employed, or (d) 
human.
Tomovic envisioned the control to be implemented by a small finite
* This work was supported wholly by the Joint Services Electronics Program 
(U. S. Army, U. S. Navy, and U. S. Air Force) under Government Contract No. 
AF-AFOSR-496-67. 
state device, completely independent from human assistance except for 
the choice of gait. Here at the University of Southern California, a 
quadruped locomotive device has been constructed which is controlled 
by a finite state automaton [McGhee (1966)]. This mechanical quadruped 
was able to walk with either a synchronous or asynchronous control.
There are many major unsolved problems associated with the design 
of the finite state control. One such problem deals with keeping the 
locomotion machine dynamically stable under normal operation. The 
second problem is how to keep the locomotion machine stable, and possi- 
bly even moving, under the condition of some machine malfunction. 
That is, if a component failure occurs in either the locomotion machine 
or the automaton, the automaton should automatically enter a failure 
recovery mode which will control the locomotion machine in, at worst, a
degraded form of operation. This is related to the problem of embedding 
one automaton within another. In trying to solve this latter problem, it 
may be necessary to employ an adaptive automaton. Such an autom- 
aton would be capable of automatically detecting component failures 
and of restructuring itself accordingly so that some basic operations of 
the locomotion machine could still be controlled.
Unfortunately, the problems just mentioned have not yet been
solved. In this paper we wish to describe the extension of some of the
concepts concerning adaptive automata to the more familiar and
practical area of computer design.
The realization of such an adaptive digital computer appears to be 
quite useful for many applications where high reliability is required, 
such as in the aerospace industry. Throughout this paper it will be seen 
that most of the problems and solutions considered will be motivated 
and influenced by various control techniques which human beings em- 
ploy. 
II. INTRODUCTION 
For many computer applications, high reliability or, analogously, 
long operational life are required. There are many classical techniques 
for attempting to achieve this end result. For example, one can employ 
a pool of redundant modules, such as arithmetic units, memories, I/O
devices, etc. When one unit becomes nonoperational it is replaced by
another unit, and the execution of the program, which was temporarily
interrupted, is continued. No significant degradation of system
performance occurs. The implementation of such a system is a nontrivial
problem and there are many interesting related problems.
In this paper we choose to look at the problem of what to do if no 
redundant modules are available. 
Consider a digital computer in which, if component failures occur, it 
is desired that the system continue to carry out useful computation, 
though not necessarily at the same computational rate or accuracy as 
under no failure conditions. Such a mode of operation is called graceful 
degradation. If, for instance, the computer is used in an aerospace 
application, then it may be required to process many different jobs. As 
the throughput decreases with component failures, the number of jobs 
it can process in a given amount of time decreases and the manner in 
which it executes each job may need to be revised. Both hardware and 
software techniques can be used to solve these problems. 
Loosely, we say that a computer is adaptive if it can be made to 
operate under subnormal circumstances such as described in the previous
paragraph. It is self-adaptive if it can be made to do so without any
external information. In this paper we will consider various techniques 
for achieving adaptive computer operation. 
Analogies with human behavior, which partly motivated this work, 
will be stressed. 
III. REDUNDANCY AND GRACEFUL DEGRADATION 
A. Adaptation Under Component Failure
The primary purpose of man's legs is to allow him to be mobile, i.e.,
to walk. Even if one's knee and/or hip is restricted from rotating and is
held in a stiff vertical position the person is still able to walk, though
not as smoothly, swiftly or efficiently as under normal conditions. One
can see that man is built with a certain amount of redundancy so that
walking is still possible, though degraded in quality, even if certain
malfunctions occur. Note that this redundancy is not of the form where one
replaces a bad leg with a good new replacement or just forgets about the
malfunction and acts as if nothing has happened. This capability to
adapt to malfunctions is common to many body organisms and in fact
to most living organisms in general.
In a digital computer, redundancy is sometimes included in order to
increase the mean-time-between-failures (MTBF) of the system. Three
common techniques for increasing the mean-time-to-failure are the use of
error-correcting codes [Garner (1966)], triple redundancy and voters
[Teoste (1962)], and duplicate systems [Keister, Ketchledge, Vaughan
(1964)]. With error-correcting codes, machines can be built so that, for
example, if a single error occurs during some arithmetic computation the
error can be corrected. The cause of the error can be either a permanent
or intermittent component failure. In triple redundant systems each
basic module is triplicated and the outputs are combined in vote-taking
majority logic gates. Hence two copies of the same module must fail
before an error is observed in the output of the system. For the case of
dual redundant systems we are given two identical machines $M_i$ and
$M_j$ where component failures in each machine are detectable; i.e.,
component failures lead to computational error. If $M_i$ can detect an error
within itself we say it is self-testing. If $M_i$ can test for an error in $M_j$,
$i \neq j$, then $M_j$ is said to be externally testable. In any event, $M_i$ and $M_j$
are used simultaneously in running the same programs. As soon as their
outputs disagree the machines are put into a testing or component
failure detection mode. The machine with the malfunction is then
determined and shut off, while the good machine continues with the
computational tasks.
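The voting and comparison schemes just described can be sketched in a few lines. This is only an illustration: the function names `majority` and `dual_compare` and the 4-bit sample words are assumptions made for the example, not part of the cited designs.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three vote over the outputs of three module copies."""
    return (a & b) | (a & c) | (b & c)


def dual_compare(out_i: int, out_j: int) -> bool:
    """Return True while the two duplicate machines agree.

    A False result is the event that would put both machines into the
    testing (failure detection) mode.
    """
    return out_i == out_j


correct = 0b1011  # hypothetical correct module output
faulty = 0b0011   # hypothetical output of a failed copy

one_failure = majority(correct, correct, faulty)   # the bad copy is outvoted
two_failures = majority(correct, faulty, faulty)   # the voter now passes the error
agree = dual_compare(correct, faulty)              # disagreement is detected
```

With one failed copy the voted word equals the correct word; with two failed copies the error reaches the output, which matches the statement that two copies must fail before an error is observed.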
The amount of hardware associated with these redundancy techniques 
is from two to four times the amount of hardware associated with the 
basic nonredundant system. 
In all three cases it appears to an external observer that computation
continues to proceed normally until a catastrophic failure occurs; i.e., 
until the results are evidently in error, and hence worthless, and nothing 
can be done to obtain correct results. This type of redundancy is not 
analogous to the redundancy in the leg described previously. We will 
now look at the problem of adding a small amount of redundancy to a 
computer in order to obtain graceful degradation of its computational 
ability due to one or more component failures. 
One practical motivation for considering this problem is associated 
with the problem of producing a computer for use aboard a satellite or 
space vehicle. With projects such as a manned orbital research labora- 
tory, the operational life may be in the range of from one to five years. 
Hence it is necessary to produce extremely reliable computers while not 
appreciably increasing their weight, size, or power requirements. Such a 
satellite computer would have many functions, such as guidance and 
control, re-entry control, monitoring equipment and crew conditions, 
communication with earth, etc. Each of these functions can be assigned 
a priority such that if the computer is forced to execute in a degraded
mode of operation, the higher priority tasks would continue to be 
handled. 
Consider a multiprocessing system where modules are assigned tasks 
under the control of a supervisory system. Then graceful degradation of
the system (that is, a reduction in the throughput) is not too difficult to
achieve. The typical operation of such a system when given a task to 
perform is to have the supervisory program select one of the possibly 
many operating modules which can perform the given task and then 
place the task on a queue of tasks waiting to be executed by the selected 
module. The supervisor assigns tasks only to properly working units; 
hence even under component failures the job gets done as long as at least
one of each basic type of unit remains operational. The design and 
application of a system similar to this is discussed in the paper by Hassett 
and Miller (1966). The problem becomes more complex when the condi- 
tion that at least one of each basic type of unit remain operational is no 
longer satisfied. For this case it now becomes necessary to locate the
component failure and to see if it is possible to carry out useful computa- 
tion without using the malfunctioning components. To accomplish this, 
alternate procedures for carrying out basic operations must be found. 
These techniques will be the subject of this section. 
There are certain basic properties, concerning the machine, which are 
being assumed. First, the machine is self-diagnosable [Book and Toth 
(1965)]. To execute a diagnostic routine requires that a certain basic 
portion of the computer be operational. This portion, referred to as the 
hard core, may constitute about 15-20% of the central processing unit
plus memory. The hard core consists of a major part of the control unit 
and a minor part of the arithmetic unit. Secondly, to self-diagnose a
component failure down to the level of a group of gates or flip-flops
requires approximately 15% additional hardware. Finally, for the 
computer to potentially be capable of continuing to operate usefully 
under a component failure mode, at least the hard core must be opera- 
tional. 
In order to increase the mean-time-to-failure, the hard core should be
made very reliable either by the selection of components or by the use of
redundancy.
When a component failure occurs it may be still possible to carry out 
useful computation, depending on the type of failure and its logical 
function. 
If this is the case, we say the computer enters a failure recovery
mode of operation. Three possible techniques which can be used to
enter a failure recovery mode of operation are reprogramming,
microprogramming, and restructuring the computer. The concept behind
employing reprogramming and microprogramming is based upon the fact
that the basic operations in a computer (required in order to carry out
arithmetic operations) are subtraction, set and reset of registers, test for
a specific condition, and branch.
If a component failure occurs in the logic of the control or arithmetic
units, then it is sometimes possible to carry out the intent of the normal 
computer instructions by using a new set of instructions or micro-opera- 
tions which do not activate the malfunctioning logic. 
1. Reprogramming 
A few descriptive examples of subroutines for carrying out the intent
of an instruction are given below. Assume numbers are stored as magni- 
tude and sign. 
a. Addition (a + b). The normal mode of operation is to use the
addition logic and control. Assume that the add control line is inoperative
and the arithmetic unit contains special logic to carry out subtraction.
Then addition can be carried out by just complementing one operand
and then subtracting; i.e., $a + b = a - (2^{n+1} - b)$, where $n$ is the
number of magnitude bits in a computer word and the equality holds
modulo $2^{n+1}$.
If the add control or parallel adder logic is inoperative, the failure
recovery mode can also be achieved by carrying out addition serially.
That is, at a small cost in hardware, one extra stage of a full adder can
be included in the hardware structure of the computer. A counter is
required to keep track of the number of additions to be made. This can
be the same counter used to control the multiplication operation.
Finally, the error-recovery mode for addition can be carried out by a
table look-up process.
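The two addition substitutes can be sketched as follows. This is a minimal illustration under stated assumptions: words are treated as unsigned values of n + 1 bits, n = 7 is an arbitrary choice, and the names `add_by_subtract` and `add_serial` are hypothetical.

```python
N = 7                 # assumed number of magnitude bits
MOD = 1 << (N + 1)    # 2^(n+1), the modulus of the complement identity


def add_by_subtract(a: int, b: int) -> int:
    """Form a + b using only complementing and subtraction (mod 2^(n+1))."""
    complement = (MOD - b) % MOD      # complement one operand
    return (a - complement) % MOD     # then subtract


def add_serial(a: int, b: int, width: int = N + 1):
    """Add bit by bit through a single full-adder stage.

    The loop variable plays the role of the counter mentioned in the text.
    Returns (sum, final carry).
    """
    carry = 0
    total = 0
    for count in range(width):
        x = (a >> count) & 1
        y = (b >> count) & 1
        s = x ^ y ^ carry                        # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))      # full-adder carry out
        total |= s << count
    return total, carry


via_subtract = add_by_subtract(23, 42)
via_serial = add_serial(23, 42)
```

Both routes give 65 for the operands 23 and 42; the serial version takes one pass through the adder stage per bit, which is the cost in computational rate that graceful degradation accepts.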
b. Multiplication (a × b). The normal mode of operation is to use the
multiply control and adder logic. If the sequencing control for this
operation fails, then the failure recovery mode can be carried out by employing
a small subroutine consisting of add, shift, and branching instructions.
Alternatively, multiplication can be carried out by using two division
instructions; i.e., $a \times b = a/(1/b)$.
c. Division (a/b). If the sequencing control for this operation fails,
then a small subroutine similar to the multiply subroutine can be
employed in order to achieve a failure recovery mode. A second technique
is to set $c = 1 - b$, where we assume $|b| \le 1$. Now $a/b = a/(1 - c)
= a(1 + c + c^2 + c^3 + \cdots)$, which converges for $|c| < 1$; hence $a/b$
can be computed by a subroutine consisting of multiplication, addition,
and branching instructions.
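A short sketch of the multiplication and division substitutes follows. It assumes non-negative integer operands for the shift-and-add routine and 0 < b <= 1 for the series routine, and it uses the standard geometric series $1/(1-c) = 1 + c + c^2 + \cdots$, valid for $|c| < 1$. The function names and the term count are illustrative choices.

```python
def multiply_shift_add(a: int, b: int) -> int:
    """Multiply two non-negative integers with add, shift, and branch only."""
    product = 0
    while b:                 # branch: stop when no multiplier bits remain
        if b & 1:
            product += a     # add the shifted multiplicand
        a <<= 1              # shift multiplicand left
        b >>= 1              # shift multiplier right
    return product


def divide_by_series(a: float, b: float, terms: int = 60) -> float:
    """Approximate a / b as a * (1 + c + c^2 + ...) with c = 1 - b.

    Only multiplication and addition are used; accuracy depends on the
    number of terms and on how close b is to 1.
    """
    c = 1.0 - b
    total = 0.0
    power = 1.0
    for _ in range(terms):
        total += power
        power *= c
    return a * total


product = multiply_shift_add(13, 11)
quotient = divide_by_series(3.0, 0.5)
```

Truncating the series earlier trades accuracy for time, which is the same trade the paper later exploits deliberately.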
d. Transfer on negative. The normal mode of operation is to execute
a transfer-on-minus (TMI) command. One possible failure recovery
mode of operation which can be used when only the TMI instruction is
inoperative is to use a subroutine consisting of transfer-on-plus (TPL),
transfer-on-zero (TZE), and transfer (TRA) instructions, such as
TPL * + 3
TZE * + 2
TRA -
where * refers to the address of the instruction in which it appears.
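The three-instruction substitute can be traced with a toy interpreter. This is a sketch under assumptions: the accumulator is a Python integer, TPL is taken to branch on a strictly positive value and TZE on zero, and the branch target `target=100` is a hypothetical address, since the operand of TRA is not given in the source.

```python
def run_tmi_substitute(acc: int, target: int = 100) -> int:
    """Return the address reached after the TPL/TZE/TRA sequence.

    The sequence occupies addresses 0..2; address 3 is the instruction
    that follows it (the fall-through path).
    """
    program = {
        0: ("TPL", 3),       # TPL * + 3
        1: ("TZE", 3),       # TZE * + 2
        2: ("TRA", target),  # unconditional transfer (target is assumed)
    }
    pc = 0
    while pc in program:
        op, address = program[pc]
        if op == "TPL" and acc > 0:
            pc = address
        elif op == "TZE" and acc == 0:
            pc = address
        elif op == "TRA":
            pc = address
        else:
            pc += 1          # condition not met: go to the next instruction
    return pc


negative_path = run_tmi_substitute(-5)
zero_path = run_tmi_substitute(0)
positive_path = run_tmi_substitute(7)
```

Only a negative accumulator reaches the transfer; positive and zero values skip past it, so the sequence carries out the intent of TMI.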
It is evident that depending on the logical function of the component
failure it is possible to find subroutines to carry out the intent of many 
computer instructions. 
There are various levels at which these error recovery type routines 
can be employed. One procedure is to reprogram all of the routines, 
using only instructions which are currently operational. This could be 
done automatically on a ground base computer and the results
transmitted to the satellite computer. Using this non-self-adaptive procedure
requires no basic hardware modifications in the satellite computer.
A second approach is to supply to the satellite computer a closed
subroutine associated with each instruction which is not currently
operational. When any program is being executed each operation code is
checked to see whether or not it is operational. If it is not operational,
the instruction is trapped and an automatic transfer takes place to the
subroutine associated with the nonoperational operation code. The
purpose of this subroutine is to carry out the intent of the inoperative
operation code. If these subroutines use instructions which themselves
are inoperative, these instructions are also trapped. It is important that
each subroutine be nondestructive. That is, at the completion of a
subroutine the contents of all registers used for intermediate storage must
be reinstated to the value they had when the subroutine was first entered.
To accomplish this each subroutine can have its own small segment of
memory for storing the initial state of the computer or one large
pushdown stack can be employed by all subroutines. This mechanism is very
similar to the way interrupts are handled in conventional computers.
As an example, assume the square root instruction SQR is inoperative.
Then an SQR instruction would be trapped and a square root subroutine
would be entered. If this subroutine uses a DIV command which is also
inoperative, then it too is trapped. At the completion of the DIV
subroutine we re-enter the SQR subroutine and continue computing until
the square root is determined, at which time we re-enter the main
program.
If too many component failures occur, then this procedure breaks 
down. For example, while executing an error-recovery subroutine A we 
cannot call on the execution of a routine B which itself may call on A. 
More specifically, if the multiply operation is inoperative, we may call 
on a subroutine to implement $a \times b = a/(1/b)$. If the division operation
is also inoperative and we call on a subroutine to implement $a/b = a/(1 - c)$,
then we would again require a multiplication and hence would
be in a loop. Hence there are restrictions on the degree of adaption that 
can take place. 
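Nested trapping and the loop condition can be illustrated with a small model. Everything here is an assumption made for the sketch: the `Machine` class, the `RecoveryLoop` exception, the three-operation instruction set, and the use of a Python list as the pushdown stack of recovery subroutines in progress.

```python
class RecoveryLoop(Exception):
    """Raised when a recovery subroutine needs an operation already being recovered."""


class Machine:
    def __init__(self, inoperative):
        self.inoperative = set(inoperative)
        self.active = []  # pushdown stack of recovery subroutines in progress
        self.hardware = {
            "ADD": lambda a, b: a + b,
            "MUL": lambda a, b: a * b,
            "DIV": lambda a, b: a / b,
        }
        self.recovery = {"MUL": self._mul_by_div, "DIV": self._div_by_series}

    def _mul_by_div(self, a, b):
        # a x b = a / (1 / b), using two divisions
        return self.execute("DIV", a, self.execute("DIV", 1.0, b))

    def _div_by_series(self, a, b, terms=40):
        # a / b = a * (1 + c + c^2 + ...), c = 1 - b, assuming 0 < b <= 1
        c = 1.0 - b
        total, power = 0.0, 1.0
        for _ in range(terms):
            total = self.execute("ADD", total, power)
            power = self.execute("MUL", power, c)
        return self.execute("MUL", a, total)

    def execute(self, op, a, b):
        if op not in self.inoperative:
            return self.hardware[op](a, b)      # normal execution
        if op in self.active:
            raise RecoveryLoop(f"{op} is needed by its own recovery routine")
        self.active.append(op)                  # trap: enter recovery subroutine
        try:
            return self.recovery[op](a, b)
        finally:
            self.active.pop()                   # re-enter the calling routine


only_mul_failed = Machine({"MUL"}).execute("MUL", 3.0, 0.5)
only_div_failed = Machine({"DIV"}).execute("DIV", 3.0, 0.5)
```

With either operation alone inoperative the trap succeeds; with both inoperative, `execute("MUL", ...)` raises `RecoveryLoop`, which is the restriction on the degree of adaption noted above.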
The trapping of an instruction can be accomplished either through 
hardware or software techniques. One software approach is to branch to 
a special comparison routine just prior to the execution of each new 
instruction. The execution of this routine would not be interruptable. 
This routine would compare the new operation field against a table of 
inoperative operations. If there is no match, the new instruction is 
executed; otherwise, a transfer to an appropriate subroutine for executing
the intent of the new instruction takes place. The table consists of
ordered pairs $(\alpha, \beta)$, where $\alpha$ is an operation code and $\beta$ an address of a
subroutine. This method of trapping is very slow.
In order not to lose time, the table of pairs $(\alpha, \beta)$ can be implemented
in a small associative memory. Now the hardware will automatically
trap an inoperative instruction $\alpha$ if $\alpha$ is in the associative memory. The
memory need only consist of say 20 pairs of the form $(\alpha, \beta)$. If more than
20 instructions are inoperative, we can assume that the computer is
completely nonoperational.
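The table of pairs can be modeled with a dictionary, which gives content-addressed lookup similar in spirit to an associative memory. The class name `TrapTable`, its methods, and the octal sample values are assumptions for illustration; only the capacity of 20 pairs comes from the text.

```python
CAPACITY = 20  # number of (operation code, subroutine address) pairs, as in the text


class TrapTable:
    def __init__(self):
        self.pairs = {}  # operation code -> address of its recovery subroutine

    def mark_inoperative(self, opcode: int, subroutine_address: int) -> bool:
        """Record a pair; return False when the table is already full.

        A False result corresponds to treating the computer as
        completely nonoperational.
        """
        if opcode not in self.pairs and len(self.pairs) >= CAPACITY:
            return False
        self.pairs[opcode] = subroutine_address
        return True

    def next_address(self, opcode: int, normal_address: int) -> int:
        """Return the recovery subroutine address if the op code is trapped,
        otherwise the address where normal execution proceeds."""
        return self.pairs.get(opcode, normal_address)


table = TrapTable()
table.mark_inoperative(0o12, 0o4000)
trapped = table.next_address(0o12, normal_address=7)
untrapped = table.next_address(0o13, normal_address=7)
```

The lookup cost no longer depends on scanning a table before every instruction, which is the slowness of the software method that the associative memory is meant to remove.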
An alternate hardware technique for trapping an instruction is to 
allocate one bit of every instruction word as a flag indicator. Then the 
instruction is operational if and only if the flag bit is set to zero. Once 
a component failure is diagnosed, it is the function of the supervisory 
programs to search through all instructions in memory and flag the 
appropriate inoperative instructions. One difficulty with this procedure
is that it requires the supervisory program to be able to distinguish
instructions from data. To alleviate this problem programs and data
can be in separate portions of the memory.
FIG. 1. Address scheme for microprograms. [Figure: a memory whose words,
at addresses 00...00 through 11...1, hold the addresses A1, A2, A3, ...
of microprograms stored in R/O memory. The operation code for ADD selects
the word holding A3; the seven microinstructions starting in address A3
constitute the microprogram for ADD.]
2. Microprogramming 
The second technique to be described for implementing the error-recovery
mode is associated with the concept of microprogramming. A
microinstruction consists of a group of bits which, when placed in the
appropriate register, are capable of controlling the transfer of information
between registers during a single clock time.
In the microprogrammed computer to be described the operation field
associated with a machine instruction consists of a sequence of
programmable microinstructions. This sequence is called a microprogram
and is stored in memory. Let $I_i$ be the $i$th machine operation code which,
for example, may be associated with the mnemonic ADD. Let
$I_i = (I_{i1}, I_{i2}, \ldots, I_{in(i)})$, where the $I_{ij}$ are microinstructions.
Assume there are $2^n$ distinct operation codes in the computer and let word $w$
in memory contain the address $A_w$ of the word in memory containing $I_{w1}$, where
$0 \le w \le 2^n - 1$. The memory address structure is illustrated in Fig. 1.
In this figure we have shown that the microinstructions are stored in
read-only (R/O) memory, three instructions to a word. In general, the
number of microinstructions per word is a function of the word size and
the number of control lines that may possibly be activated by a
microinstruction. The symbol $\phi$ indicates the end of a microprogram. If one
or more component failures occur such that one or more microprograms
become inoperative, the corrective action for each program is as follows:
First, construct a new microprogram which will not be affected by the
component failures, and store this program in main memory. Say this is
the $i$th microprogram.
Secondly, change the contents of word $i$ from $A_i$ to $A_i'$, where $A_i'$ is
the address of the first microinstruction of this new program. The use of
microprogramming is a very powerful, natural, and useful tool for
extending the operation of a computer containing component failures.
The price that is paid is mainly the additional memory required to store
the microprograms. The advantage is that it is now possible to construct
many different programs for carrying out the same functional
computation.
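The two corrective steps amount to one level of indirection through the address word. The sketch below is an illustration only: the `MicroStore` class, the `END` marker standing in for the end-of-microprogram symbol, and the storage of one microinstruction per address (the text packs three to a word) are all simplifying assumptions, and the microinstruction strings are placeholders.

```python
END = "END"  # stands in for the end-of-microprogram symbol


class MicroStore:
    def __init__(self):
        self.address_table = {}  # word i -> address of first microinstruction
        self.store = {}          # address -> microinstruction
        self.next_free = 0

    def load(self, microprogram):
        """Store a microprogram followed by END; return its start address."""
        start = self.next_free
        for offset, micro in enumerate(list(microprogram) + [END]):
            self.store[start + offset] = micro
        self.next_free = start + len(microprogram) + 1
        return start

    def install(self, opcode, microprogram):
        """Step 1: store the microprogram. Step 2: repoint word `opcode`."""
        self.address_table[opcode] = self.load(microprogram)

    def fetch(self, opcode):
        """Follow the address word and read microinstructions up to END."""
        address = self.address_table[opcode]
        sequence = []
        while self.store[address] != END:
            sequence.append(self.store[address])
            address += 1
        return sequence


ms = MicroStore()
ms.install(3, ["(Y)->B", "(B)->D", "(A)+(D)->A"])   # normal ADD (placeholder text)
old_address = ms.address_table[3]
ms.install(3, ["alt-1", "alt-2", "alt-3", "alt-4"])  # hypothetical recovery version
current = ms.fetch(3)
```

The original microprogram is left untouched in its store; only the address word changes, so machine programs that use the operation code need no modification.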
3. Restructuring 
So far we have considered component failures occurring in the
arithmetic and control logic. Failure in storage devices such as flip-flops
must sometimes be handled differently. For example, if a flip-flop in the
arithmetic unit is stuck in one particular state, it may be impossible to
carry out any worthwhile computations. To determine a machine
structure that would possibly circumvent this difficulty we can again draw an
analogy with man. First, man is "biphysical", that is he has two arms,
two legs, two ears, etc. Secondly, externally he is symmetrically built
about a vertical plane. Note that the "biphysical" property does not lead
to redundancy, since with only one eye one does not have depth
perception, with only one ear one does not have sound directional capabilities,
etc. With the loss of one of these dual entities, man's capabilities are
decreased. Some of these decreased attributes, however, can be partially
regained. For example, with only one eye a person can still estimate
distances by making two separate sightings of an object from two
different locations. Similarly for the ear and sound direction. One can type,
either with one hand or two hands. From these elementary observations
we propose the following computer structure:
The arithmetic unit and information transmission lines (bus) will be 
partitioned into half-word segments with some information interchange 
capabilities between these half-words. For simplicity we will consider a 
single accumulator and I/O buffer, as shown in Fig. 2.
FIG. 2. Partitioned arithmetic unit. [Figure: half-registers A1, A2, D1, D2,
B1, B2 and the added half-register E, with a logic block between each half
of A and D, and register B connected to and from memory.]
In Fig. 2, register A is shown partitioned into half-word segments
A1, A2. The symbol → indicates a transmission line for a half-word of
information. Dotted lines indicate added hardware required for adaptive
capabilities. The block between registers A and D consists of add and shift
logic. No control lines are shown. C is a single flip-flop used to store a
carry bit. The normal sequence of microinstruction functions, pertaining
only to the portion of the computer shown, is given below for three
different instructions.
Instruction   Microsequence Functions   Comments
CLA X         (X) → B                   CLA is the mnemonic for the instruction
              (B) → D                   clear and add (X) into register A, where
              (D) → A                   (X) refers to the contents of memory
                                        location X. (X) → B is a transfer of
                                        information from X into B.
ADD Y         (Y) → B
              (B) → D
              (A) + (B) → A
STO Z         (A) → D
              (D) → B
              (B) → Z
Assume a component failure occurs in the left half of the accumulator,
i.e., either in A1, D1, or the associated logic or transmission lines. We
must now separate the most significant half (m.s.h.) and least significant
half (l.s.h.) of each word and operate on them separately. The
corresponding error-recovery microprograms for the instructions just
examined are given below. By convention, if an instruction expects to find
an operand already in the accumulator, the m.s.h. of this operand will be
in E.
Instruction   Microsequence                        Comments
CLA X         1. (X) → B
              2. (B1) → E; (B) → D                 2. m.s.h. of (X) → E; l.s.h. of (X) → D2.
              3. (D) → A                           3. l.s.h. of (X) → A2.
ADD Y         1. (Y) → B
              2. (B) → D                           2. l.s.h. of (Y) → D
              3. (A) + (D) → A; carry from         3. l.s.h. of (X) + (Y) → A2
                 (A2) + (D2) → C
              4. (A) → D
              5. (E) → B2; (D2) → B1; (B1) → D2    5. m.s.h. of (X) → B2; l.s.h. of
                                                      (X) + (Y) → B1; m.s.h. of (Y) → D2.
              6. (D) → A; (B) → D; (B1) → E        6. m.s.h. of (Y) → A2; m.s.h. of (X) → D2;
                                                      l.s.h. of (X) + (Y) → E.
              7. (A) + (D) + (C) → A               7. m.s.h. of (X) + (Y) → A2
              8. (A) → D; (E) → B2
              9. (D2) → B1; (B) → D
             10. (B1) → E; (D) → A                10. m.s.h. of (X) + (Y) → E;
                                                      l.s.h. of (X) + (Y) → A2.
STO Z         1. (E) → B1; (A) → D
              2. (D2) → B2
              3. (B) → Z                           3. (X) + (Y) → Z
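The error-recovery microsequences can be checked by simulating only the half-registers that still work. This is a sketch under assumptions: a half-word width of 4 bits, registers held in a dictionary, A1 and D1 simply omitted because they are taken to have failed, and all transfers within one microinstruction read the old register values before any are written. Overflow out of the most significant half is ignored.

```python
H = 4                   # assumed half-word width in bits
MASK = (1 << H) - 1


def split(word):
    """Return (m.s.h., l.s.h.) of a full word."""
    return (word >> H) & MASK, word & MASK


def step(state, **updates):
    """One microinstruction: the arguments were read from the old state,
    so the transfers take effect simultaneously."""
    new = dict(state)
    new.update(updates)
    return new


def cla(mem, x):
    s = dict(A2=0, D2=0, B1=0, B2=0, E=0, C=0)
    b1, b2 = split(mem[x])
    s = step(s, B1=b1, B2=b2)                       # 1. (X) -> B
    s = step(s, E=s["B1"], D2=s["B2"])              # 2. (B1) -> E; (B) -> D
    return step(s, A2=s["D2"])                      # 3. (D) -> A


def add(s, mem, y):
    b1, b2 = split(mem[y])
    s = step(s, B1=b1, B2=b2)                       # 1. (Y) -> B
    s = step(s, D2=s["B2"])                         # 2. (B) -> D
    total = s["A2"] + s["D2"]
    s = step(s, A2=total & MASK, C=total >> H)      # 3. add l.s.h., carry -> C
    s = step(s, D2=s["A2"])                         # 4. (A) -> D
    s = step(s, B2=s["E"], B1=s["D2"], D2=s["B1"])  # 5. three-way exchange
    s = step(s, A2=s["D2"], D2=s["B2"], E=s["B1"])  # 6.
    total = s["A2"] + s["D2"] + s["C"]
    s = step(s, A2=total & MASK)                    # 7. add m.s.h. plus carry
    s = step(s, D2=s["A2"], B2=s["E"])              # 8.
    s = step(s, B1=s["D2"], D2=s["B2"])             # 9.
    return step(s, E=s["B1"], A2=s["D2"])           # 10. m.s.h. -> E, l.s.h. -> A2


def sto(s, mem, z):
    s = step(s, B1=s["E"], D2=s["A2"])              # 1. (E) -> B1; (A) -> D
    s = step(s, B2=s["D2"])                         # 2. (D2) -> B2
    mem[z] = (s["B1"] << H) | s["B2"]               # 3. (B) -> Z
    return s


mem = {"X": 0x5A, "Y": 0x37, "Z": 0}
state = cla(mem, "X")
state = add(state, mem, "Y")
sto(state, mem, "Z")
```

After the three instructions `mem["Z"]` holds 0x91, the full-word sum of 0x5A and 0x37, even though only half of the accumulator was used. The ten-step ADD replaces a three-step one, which shows the loss in computational rate.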
Note that register E is used only during the error-recovery mode. The
amount of additional storage capacity is then just one-half of one register
or about 25% of the arithmetic unit. Since any part of A1 or D1, or
both, can fail and we can still carry out arithmetic operations, only a
fraction of the storage devices need be operational for computation to
proceed. If the component failure occurs in the l.s.h. of the arithmetic
unit, a similar sequence of microinstructions can be derived so that useful
computations can continue. The same E register is used in both cases. Now, if this
computer had more than one accumulator, then E would not be a special
register but would in fact be half of one of the registers in some
accumulator, say D2 of accumulator 3. Then, the additional hardware required to
implement this scheme would consist mainly of transmission lines and
associated control logic. Hence it appears that one good system
organization would consist of an arithmetic unit containing more than one
accumulator. Most registers would then have dual roles. Their first role
would be to carry out their normal function, e.g., to store the most
significant half of the addend. Their second role would be to aid in the
operation of the system under component failure. It is then possible to
associate two E registers with each accumulator so that if one develops
a fault, the second one can be employed in its place.
If the microprograms are modified by some external means whenever
a component failure is detected, we say that the system is
non-self-adaptive. The next evolution of adaptive computers will automatically
generate their modified microprograms and hence will be truly
self-adaptive. At this time it is not clear how this process can or should be
carried out. One possible technique is to have a pool of hardware, say
registers, addition logic, etc. When a component failure occurs in a
register, the control logic is modified so that one register is substituted
for another. This is similar to the variable structured computer described
by Estrin (1963). In a self-adaptive system one could first execute a
component failure detection and diagnostic routine from which the functional
location of the component failure would be ascertained. From this
information a special set of registers, called the adaptive control, would be
set to indicate the nature of the control. Let the adaptive control register
X consist of N flip-flops, and be denoted by $X = (X_1, X_2, \ldots, X_N)$,
where $X_i = 0$ for all $i$ if no component failure is present. A typical
microinstruction is of the form
$I_{ij} = (a_1, a_2, \ldots, a_t)$,
where $a_s \in \{0, 1\}$, for $1 \le s \le t$. Assume $a_1 = 1$ allows the transfer of
information from the D register to the A register under the conditions of
no component failure. Then one term in the set equation for the $i$th
flip-flop of A would be $D_i \cdot a_1$, where $D_i$ is the output of the $i$th flip-flop
in the D register and is also the name of the flip-flop. If the A register
is inoperative, we may wish $a_1 = 1$ to allow the transfer of information
from D to A'. To accomplish this, set $X_1 = \bar{X}_2 = 1$. The modified set
equation for $A_i$ is now $D_i \cdot a_1 \cdot \bar{X}_1 \cdot \bar{X}_2$ and an added term to the $A_i'$ set
equation is $D_i \cdot a_1 \cdot X_1 \cdot \bar{X}_2$. If the D register is inoperative, we may
desire $a_1 = 1$ to allow the transfer of information from D' to A. To
accomplish this, set $\bar{X}_1 = X_2 = 1$, and add the term $D_i' \cdot a_1 \cdot \bar{X}_1 \cdot X_2$ to the
set equation for $A_i$. For this system, the microinstructions are not
changed, but rather are decoded differently depending on the location of
the component failure. We refer to this technique as adaptive logic.
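The adaptive-logic decoding can be written out for a single bit position. The encoding used here is an assumption reconstructed from the passage: (X1, X2) = (0, 0) for no failure, (1, 0) when A is inoperative, and (0, 1) when D is inoperative. The function name `set_terms` and its argument names are hypothetical.

```python
def set_terms(d: int, d_alt: int, a1: int, x1: int, x2: int):
    """Return (set input of A_i, set input of A'_i) for one bit position.

    d     : output of flip-flop D_i
    d_alt : output of the substitute flip-flop D'_i
    a1    : microinstruction bit requesting the transfer D -> A
    x1,x2 : adaptive control bits (assumed encoding, see lead-in)
    """
    not_x1, not_x2 = 1 - x1, 1 - x2
    # A_i is set from D_i in the normal case, or from D'_i when D has failed.
    set_a = (d & a1 & not_x1 & not_x2) | (d_alt & a1 & not_x1 & x2)
    # A'_i is set from D_i only when A has failed.
    set_a_alt = d & a1 & x1 & not_x2
    return set_a, set_a_alt


no_failure = set_terms(d=1, d_alt=0, a1=1, x1=0, x2=0)
a_failed = set_terms(d=1, d_alt=0, a1=1, x1=1, x2=0)
d_failed = set_terms(d=0, d_alt=1, a1=1, x1=0, x2=1)
```

The same microinstruction bit `a1` drives all three cases; only the adaptive control bits change which flip-flop receives the data, which is the sense in which the microinstructions are decoded differently rather than rewritten.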
B. Trading Time and Accuracy for Throughput 
Man is also versatile in the sense that given a job J to perform, he can
repeatedly perform this job perfectly at a rate R or less, but as the rate
increases above R, his efficiency decreases. We define efficiency as the
ratio of jobs completed perfectly to jobs undertaken. Consider a job J
consisting of an ordered sequence of sub-jobs $j_1, j_2, \ldots, j_n$. For J to be
performed correctly requires that all $j_i$ be completed correctly, and the
correct execution of $j_{i+1}$ may require the correct completion of $j_i$ for some
or all $i$, $1 \le i \le n - 1$. It is sometimes necessary and desirable that a
person perform a job at a rate higher than R. In these cases, it may occur
that some sub-job, say $j_i$, is performed imperfectly, yet the entire job
is completed in an acceptable manner, though not perfectly. Some of
these concepts can be carried over to the design of an adaptive digital
computer.
For example, consider a real-time computational task J, where at a 
certain instant of time it is also required to service a new real-time task 
K. Assume there is no time to solve jobs J and K perfectly in the sched- 
uled time allowed. This may be due to the fact that a component failure 
has occurred and hence the computer must operate in a degraded mode. 
Then it may be possible to determine a $j_i$ for at least one $i$ and/or a $k_l$
for at least one $l$, such that sub-jobs $j_i$ or $k_l$ can be executed in a shorter
than normal time period, hence allowing tasks J and K to be completed
in the allotted time. The gain in speed is realized at a sacrifice in
accuracy.
There are at least two techniques available for reducing the computa- 
tion time of a process; one is hardware adaption, the second is software 
adaption. We first motivate the problem with a real example. 
Consider a spaceborne digital computer used in some missile
or orbiting laboratory [Lewis (1963)]. The computer must carry out
guidance and control calculations, act as a digital filter, do coordinate
transformations, and possibly decode and encode telemetry information.
Since the sampling rate may be as high as 50 times a second, the computational
requirements on the computer are quite demanding. We define the
execution interval to be the time between two consecutive sampling 
times. 
We will first illustrate how software adaption may be employed to in- 
crease throughput. 
In solving a given problem there are usually many different approaches
that can be taken. For example, a differential equation may be
solved using Runge-Kutta integration. However, one can use fifth-order
or third-order Runge-Kutta formulas, the latter being easier to
compute but producing a less accurate result. A typical assembly
language calling sequence to the integration subroutine may be of the 
form 
    SKP  ADAPT
    TSX  RK5
    TSX  RK3
    TSX  LI
    (DATA)
Here the word ADAPT contains either a 0, 1, or 2 and the SKP instruc- 
tion skips i instructions, where i equals the contents of ADAPT. Larger 
values of ADAPT correspond to faster, but possibly less accurate,
subroutines. The instruction TSX is a transfer and set index. The address
RK5 is the address of the fifth-order Runge-Kutta subroutine, RK3 that of
the third-order Runge-Kutta subroutine, and LI that of a linear
interpolation subroutine.
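The selection mechanism of the calling sequence can be sketched in Python as a table lookup indexed by ADAPT. This is only an illustration under stated assumptions: the classical fourth-order Runge-Kutta step stands in for the fifth-order routine RK5, a linear extrapolation along the current slope stands in for LI, and the names `STEPPERS` and `integrate` are hypothetical.

```python
def rk4_step(f, t, y, h):
    """Classical fourth-order Runge-Kutta step (stand-in for RK5)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk3_step(f, t, y, h):
    """Kutta's third-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h, y - h * k1 + 2 * h * k2)
    return y + h * (k1 + 4 * k2 + k3) / 6


def linear_step(f, t, y, h):
    """Linear extrapolation along the current slope (stand-in for LI)."""
    return y + h * f(t, y)


# Position in the table plays the role of the skip count in SKP ADAPT:
# a larger ADAPT selects a cheaper, less accurate routine.
STEPPERS = (rk4_step, rk3_step, linear_step)


def integrate(f, t0, y0, h, n_steps, adapt):
    """Advance y' = f(t, y) by n_steps steps using the routine chosen by adapt."""
    step = STEPPERS[adapt]
    t, y = t0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y
```

Calling `integrate` with `adapt` equal to 0, 1, or 2 on the same problem gives progressively cheaper and less accurate results, which is the trade the calling sequence is meant to expose.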
In the general case, many different computational tasks may be 
associated with one or more closed subroutines which can effectively 
carry out the requirements of the task. A calling sequence governed by 
the variable ADAPT, which itself may be changing dynamically, is used
to select which of the alternative subroutines should be executed at any
specific instant of time. Each alternative subroutine has a different
trade-off between execution time and accuracy. We now consider the problem of how to
determine the value of ADAPT. 
The variable ADAPT can be set through programming or by direct 
hardware implementation. The adaptive procedure to be described here 
uses both a hardware pushdown stack containing the names of the jobs to be
performed and a counter which indicates how much time is left in the
current computation interval. With this scheme the value of ADAPT
may change during the computation interval.
Consider the following hypothetical case: Let there exist jobs (programs)
$J_1, J_2, \ldots, J_n$. At the beginning of each computation interval
the programming system scheduler can assign any subset of ten or fewer
of these jobs to be executed during the current computation interval.
The names of these jobs are placed in the pushdown stack. The jobs are
executed in the reverse of the order in which they were placed in the stack,
and as each job is executed, its name is popped off the top of the stack. At
any instant of time during a computation interval let Y equal the number
of jobs that have not yet been completely executed. Y can be determined
directly from the address of the top or bottom of the stack,
depending on how the stack is implemented.
Let the time required to execute each job be uniformly distributed
from 0.75 ms to 1.25 ms. Let the computation interval be 10 ms.
Associate with each program $J_i$ a second (primed) program $J_i'$ which
computes the same quantity that $J_i$ computes, but faster and with less
accuracy. Let the execution time for each $J_i'$ be uniformly distributed
between 0.4 ms and 0.6 ms. At the beginning of each computation interval
a counter T is set to 10 and is decremented by one at 1 ms intervals.
When ADAPT = 0, $J_i$ is to be executed and when ADAPT = 1, $J_i'$ is
to be executed, where $J_i$ is the job at the top of the stack.
ADAPT has the value 1 only under the following conditions:
(a) $T = 1$ and $Y = 1$
(b) $T = 2$ and $Y = 2$ or $3$
(c) $T = 3$ and $Y = 3$ or $4$.
To simplify the logic used to implement ADAPT, the conditions can be
modified to:
(a') $T = 1$ and $Y \ge 1$
(b') $T = 2$ and $Y \ge 2$
(c') $T = 3$ and $Y \ge 3$.
With this algorithm only the last four jobs can ever be executed in
their primed versions. This is advantageous since it allows one to
establish a priority system among jobs where the higher the priority, the
smaller the probability that a job will ever be required to run in its
"speed-up mode". The system scheduler would then load the pushdown
stack in order of increasing priority. If, for example, the total number of
jobs n is 10, then only four programs need have associated primed
programs.
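The scheduling rule can be exercised with a small simulation. The sketch below is not the hardware described above; it assumes that ADAPT is evaluated each time a job is dispatched and that the counter reads T = 10 - floor(elapsed ms). The names `adapt_value` and `run_interval` are hypothetical.

```python
import random

INTERVAL_MS = 10  # length of one computation interval


def adapt_value(t_counter, y_pending):
    """Simplified conditions (a')-(c'): ADAPT = 1 when T is 1, 2, or 3 and Y >= T."""
    return int(1 <= t_counter <= 3 and y_pending >= t_counter)


def run_interval(n_jobs, rng):
    """Simulate one interval; return (finish time in ms, ADAPT value used per job)."""
    stack = list(range(n_jobs))  # job names; the last name pushed is run first
    clock = 0.0
    modes = []
    while stack:
        # Counter T starts at 10 and drops by one every millisecond.
        t_counter = max(0, INTERVAL_MS - int(clock))
        adapt = adapt_value(t_counter, len(stack))  # Y = jobs still pending
        if adapt:
            clock += rng.uniform(0.4, 0.6)    # primed (fast, less accurate) program
        else:
            clock += rng.uniform(0.75, 1.25)  # normal program
        modes.append(adapt)
        stack.pop()
    return clock, modes
```

Under these assumptions the first six of ten jobs are always dispatched before 7 ms have elapsed, so only the last four can run in their primed versions, in agreement with the observation above.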
In some cases the value of ADAPT can be used not only to select programs
but also to select constants within a single program. This selection
will then affect the computation time of the subroutine. For example,
assume the square root of x is calculated by employing the following
iterative technique: Let

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{x}{x_n}\right)$$

and let $x_1 = 1$ and $(x)^{1/2} = x_{m+1}$, where m is the first value of n such that
$|x_{n+1} - x_n| < \epsilon$ for some sufficiently small value of $\epsilon$. The square root
program can be written so that $\epsilon = \epsilon_i$ if and only if the value of ADAPT
is i, where $\epsilon_0 \le \epsilon_1 \le \cdots$. The larger the value of $\epsilon$, the sooner the
iteration terminates. Similar techniques can be used in computing the
value of trigonometric functions such as sin x. Here the value of ADAPT
can be used to control the number of terms which are to be computed in
a given series.

FIG. 3. Curve a: probability distribution of work load. Curve b: work-load
rate. (Horizontal axis: work load w, marked at 0, W, and $w_{\max}$.)
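A minimal sketch of the tolerance selection, in Python. The particular tolerance values in `EPSILONS` and the function name `adaptive_sqrt` are assumptions chosen for illustration; the source only requires that the tolerances be nondecreasing in ADAPT.

```python
# Hypothetical tolerances eps_0 <= eps_1 <= eps_2, indexed by ADAPT.
EPSILONS = (1e-12, 1e-6, 1e-2)


def adaptive_sqrt(x, adapt):
    """Approximate sqrt(x) for x > 0; return (approximation, iterations used)."""
    eps = EPSILONS[adapt]
    x_n = 1.0  # starting value x_1 = 1
    iterations = 0
    while True:
        x_next = 0.5 * (x_n + x / x_n)
        iterations += 1
        if abs(x_next - x_n) < eps:
            return x_next, iterations
        x_n = x_next
```

A larger value of ADAPT stops the iteration earlier, so the routine returns sooner with a coarser result.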
The concept of self-adaptation in the control of computational accuracy
already exists in many procedures. For example, in numerical
integration, in order to reduce the integration error, the integration interval
is sometimes made a function of the second derivative of the function
being integrated. Here again one is trading time for accuracy.
The preceding discussion has been primarily concerned with software 
adaptive computing procedures used to increase throughput. We now
consider a second type of adaptive computing technique. Here, the hard- 
ware is modified in order to increase computation speed. 
Fig. 3 indicates the work-load probability distribution and the work 
rate as a function of work load for a real time computational task. 
The computer is built so that the work-load rate r varies as shown in 
Fig. 3. The work load is determined from the number of demands for 
computation made by the various users or tasks. For $w > W$ the computer
is said to operate in a speed-up mode. The problem is how to make
the computer function properly during the speed-up phase at the expense
of accuracy. If this can be accomplished, then the worst-case load
conditions can be relaxed, making it possible to design a relatively
slower, simpler, and less expensive computer. 
Two basic assumptions are being made concerning the use and design 
of the hypothetical computer being discussed. First, it is assumed that 
the process being controlled will continue to operate satisfactorily 
during the short time the computer is operating in the speed-up mode, 
and hence is calculating at a lower degree of accuracy. Second, it is
assumed that address arithmetic is either carried out in a special register or,
if carried out in the main accumulator, is protected by a control which
overrides the speed-up mode, so that no loss in accuracy is encountered.
1. Serial Adaptive Adder 
We now sketch the design of an n-bit serial adaptive adder which can
be operated in a speed-up mode. Fig. 4 indicates the time allowed for 
addition as a function of the work load. 
Assume the load is quantized into m levels,
$w_1 = W < w_2 < \cdots < w_m = w_{\max}$.
Then the load indicator can consist of m flip-flops, $D_1, D_2, \ldots, D_m$,
where $D_j = 1$ if and only if $w_{j-1} \le w < w_j$.
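The load indicator can be modeled as a one-hot encoding of the quantized load. The following Python sketch rests on two assumptions not stated in the text: $w_0$ is taken to be 0, and any load at or above $w_m$ is clamped to $D_m$. The function name `load_indicator` is hypothetical.

```python
from bisect import bisect_right


def load_indicator(w, levels):
    """Return flip-flop states [D_1, ..., D_m] for work load w.

    levels holds the increasing thresholds w_1 < w_2 < ... < w_m.
    D_j = 1 exactly when w_{j-1} <= w < w_j (with w_0 assumed to be 0).
    """
    m = len(levels)
    # bisect_right counts thresholds <= w; add 1 for 1-based j, clamp at m (assumption).
    j = min(bisect_right(levels, w) + 1, m)
    return [1 if i == j else 0 for i in range(1, m + 1)]
```

With thresholds `[4, 6, 8, 10]`, a load below 4 sets only $D_1$ (normal mode), and each higher band sets the next flip-flop.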
The adder circuit, which operates at a clock frequency $f_0$, is shown in
Fig. 5. For $w > W$ the number of significant digits in the sum decreases
monotonically with w. For $w = w_{\max}$ there are only n' significant
digits in the sum. We require $m \le n - n' + 1$.

FIG. 4. Time allowed for add operation. (Vertical axis: time allowed for
add operation (accuracy), with the level $n'/f_0$, $n' < n$, marked; horizontal
axis: work load w, marked at $w_1 = W, w_2, w_3, \ldots, w_m = w_{\max}$.)

FIG. 5. Adder circuit.
At time t, which is the beginning of the sequential add operation, the
following operations take place: (1) some $D_j$ is set to one, and all others
are reset. This $D_j$ will not be reset until the end of the entire add operation;
(2) an end-of-operation time counter is loaded. The timing information
is obtained from Fig. 4.
At time $t + 1$ the addition begins. If $D_j = 1$, information enters the
full adder from $A_{k_j}$ and $B_{k_j}$, where $k_1 = 1$, $k_m = n - n' + 1$, and
$k_{j-1} < k_j < k_{j+1}$. At every clock time the information in registers A
and B is shifted right one place. The time required to execute the add
instruction is then $n - k_j + 1$ clock periods.
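The effect of starting the serial addition at bit k rather than bit 1 can be modeled bit by bit. In this Python sketch, bit 1 is the least significant bit, the skipped low-order result bits are left at zero, and the initial carry is zero; these conventions and the name `serial_adaptive_add` are assumptions made for illustration.

```python
def serial_adaptive_add(a, b, n, k):
    """Serially add two n-bit words, feeding the full adder from bit k upward.

    Returns (sum, clock periods used). Bits 1..k-1 never reach the adder.
    """
    carry = 0
    total = 0
    clocks = 0
    for pos in range(k, n + 1):
        bit_a = (a >> (pos - 1)) & 1
        bit_b = (b >> (pos - 1)) & 1
        s = bit_a ^ bit_b ^ carry                            # full-adder sum bit
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))  # full-adder carry out
        total |= s << (pos - 1)
        clocks += 1
    return total, clocks
```

For example, with n = 8, adding 182 and 45 from bit 1 takes 8 clock periods and gives 227, while starting at bit 4 takes 5 clock periods and gives 216, an error smaller than $2^k$.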
One can implement the same speed-up mode using M flip-flops, rather
than m, where $2^M \ge m > 2^{M-1}$. The most elementary speed-up system
has just two modes, hence $M = 1$, $k_1 = 1$, and $k_2 = n - n' + 1$.
The input equation to the adder from the A register would be
$A_1 \cdot \bar{D} + A_{n-n'+1} \cdot D$ rather than just $A_1$.
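The number of flip-flops needed for the encoded load indicator follows directly from the inequality $2^M \ge m > 2^{M-1}$. A short Python sketch, with the hypothetical name `flip_flops_needed`:

```python
def flip_flops_needed(m):
    """Smallest M with 2**M >= m > 2**(M - 1), for m >= 1 load levels."""
    if m < 1:
        raise ValueError("m must be at least 1")
    # (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1.
    return (m - 1).bit_length()
```

Two modes need a single flip-flop, and up to eight load levels can be encoded in three.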
2. Adaptive Multiplier
For the operation of multiplication the trade-off between speed and 
accuracy can be achieved with a very small cost in additional hardware. 
Assume the multiplier resides in the X register, where $X = (X_{32}, \ldots, X_1)$.
One simple multiply algorithm is:
(a) Set the operation time counter to 32.
(b) If $X_1 = 1$, add the multiplicand to the partial product; otherwise, do
nothing.
(c) Shift the partial product and the multiplier right one position.
(d) Decrement the counter by one. Exit if the counter is zero; otherwise,
go to step (b).
To speed up this computation by 25% all that is required is to set the
counter to 24 rather than 32, and to replace $X_1$ by $X_9$ in step (b). The
product is now accurate to 24 rather than 32 bits. Only one flip-flop is
required to control whether the normal operating mode or the speed-up 
mode should be selected. 
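The two multiplier modes can be compared with a small software model of the shift-and-add loop. The sketch below assumes a double-length partial product whose upper half receives the multiplicand, and that the speed-up mode examines multiplier bit 9 with a counter of 24; the name `shift_add_multiply` and the register layout are illustrative assumptions, not the circuit itself.

```python
WORD = 32  # multiplier word length in bits


def shift_add_multiply(multiplicand, multiplier, speed_up=False):
    """Shift-and-add multiply; return (product, add/shift steps used)."""
    steps = 24 if speed_up else 32  # step (a): initial counter value
    tap = 9 if speed_up else 1      # multiplier bit examined in step (b)
    x = multiplier
    partial = 0                     # double-length partial product
    counter = steps
    while counter:
        if (x >> (tap - 1)) & 1:
            partial += multiplicand << WORD  # (b) add into the upper half
        partial >>= 1                        # (c) shift partial product right
        x >>= 1                              # (c) shift multiplier right
        counter -= 1                         # (d) decrement the counter
    return partial, steps
```

In this model the speed-up result equals the multiplicand times the multiplier with its eight low-order bits ignored, obtained in 24 steps instead of 32.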
A procedure with an effect similar to that of this multiplication procedure is
available with the floating-point option on the IBM 360 computers. On
this system, at run time, one can set a dial indicating the length of the
mantissa to be used in all floating-point operations. A shorter mantissa
length leads to less computation time.
III. SUMMARY
In this paper we have discussed some of the concepts dealing with 
adaptive computers and adaptive computing techniques. We first
introduced the notion of graceful degradation of a computer and
outlined a few procedures for achieving this type of operation. We feel
that the microprogramming approach is the most advantageous one to
investigate in greater detail. In the future we plan to carry out a
statistical study of the relationship between redundancy, reliability
(MTBF), and throughput. Most previous studies have been concerned
only with the first two of these parameters. These results are, of course, a
function of the system structure of the computer being analyzed. Hence,
various system structures need to be considered. The technique
of partitioning registers into halves and allowing for the transfer of
information between various sections will also be investigated in terms of
redundancy vs. reliability. 
Adaptation can also take place in the form of a trade between through- 
put on the one hand, and computation time and accuracy on the other. 
This adaptation can be achieved through the selection of appropriate 
subroutines to carry out various tasks or by direct modification of the 
hardware. It is possible to construct a real-time monitor system which 
will dynamically determine when the computer system must operate in 
a speed-up mode. 
We see that two levels of adaption exist. The first level occurs when
a component failure is present. In this case, for example, the system may 
switch from a parallel to a serial mode of operation. The second level 
occurs when a trade-off between speed and accuracy is required in order 
to increase throughput. 
RECEIVED: September 9, 1965; revised July 25, 1967 
REFERENCES 
BOOK, R. V., AND TOTH, A. P. (1965). Hardware and software for maintenance in the
B5500 processor. Proc. Intern. Conv. IEEE, New York. 13, 21-27.
ESTRIN, G. (1963). Parallel processing in a restructurable computer system. IEEE
Trans. Electronic Comput. EC-12, 747-755.
GARNER, H. L. (1966). Error codes for arithmetic operations. IEEE Trans.
Electronic Comput. EC-15, 763-770.
HASSETT, R. P., AND MILLER, E. H. (1966). Multithreading design of a reliable
aerospace computer. Supplement to IEEE Trans. Aerospace Electronic Syst.
AES-2, 147-158.
KEISTER, W., KETCHLEDGE, R. W., AND VAUGHAN, H. E. (1964). No. 1 ESS: System
organization and objectives. Bell System Tech. J. 43, 1831-1844.
LEWIS, T. B. (1963). Primary processor and data storage equipment for the orbiting
astronomical observatory. IEEE Trans. Electronic Comput. EC-12, 677-687.
McGHEE, R. B. (1966). "Finite State Control of Quadruped Locomotion," University
of Southern California Technical Report No. 186.
TEOSTE, R. (1962). Design of a repairable redundant computer. IRE Trans.
Electronic Comput. EC-11, 643-649.
TOMOVIC, R. (1965). On the synthesis of self-moving automata. Automation and
Remote Control 26, No. 2 (English Translation), 297-304.
TOMOVIC, R., AND McGHEE, R. B. (1966). A finite state approach to the synthesis of
bioengineering control systems. IEEE Trans. Human Factors Electronics
HFE-7, 65-69.
